# Skill Authoring Best Practices

Based on Anthropic's latest agent skills documentation (January 2025).
## Core Principles

### Concise is Key

> "The context window is a public good. Default assumption: Claude is already very smart."
Only add context Claude doesn't already have.
Challenge each piece of information:
- "Does Claude really need this explanation?"
- "Can I assume Claude knows this?"
- "Does this paragraph justify its token cost?"
**Good example (concise):**

````markdown
## Extract PDF text

Use pdfplumber:

```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```
````
**Bad example (verbose):**

```markdown
## Extract PDF text

PDF (Portable Document Format) files are a common file format that contains text,
images, and other content. To extract text from a PDF, you'll need to use a library.
There are many libraries available for PDF processing, but we recommend pdfplumber
because it's easy to use and handles most cases well. First, you'll need to install
it using pip. Then you can use the code below...
```
The concise version assumes Claude knows what PDFs are and how libraries work.
### Set Appropriate Degrees of Freedom

Match the level of specificity to the task's fragility and variability.

#### High Freedom (Text-Based Instructions)

Use when multiple approaches are valid:
```markdown
## Code Review Process

1. Analyze code structure and organization
2. Check for potential bugs or edge cases
3. Suggest improvements for readability
4. Verify adherence to project conventions
```
#### Medium Freedom (Templates/Pseudocode)

Use when there's a preferred pattern but variation is acceptable:
````markdown
## Generate Report

Use this template and customize as needed:

```python
def generate_report(data, format="markdown", include_charts=True):
    # Process data
    # Generate output in specified format
    # Optionally include visualizations
    ...
```
````
#### Low Freedom (Exact Scripts)

Use when operations are fragile and error-prone:

````markdown
## Database Migration

Run exactly this script:

```bash
python scripts/migrate.py --verify --backup
```

Do not modify the command or add additional flags.
````
Analogy: Think of Claude as a robot exploring a path:
- Narrow bridge with cliffs: One safe way forward. Provide specific guardrails (low freedom)
- Open field: Many paths lead to success. Give general direction (high freedom)
### Progressive Disclosure

Split large skills into layers that load on-demand.

#### Three Levels of Loading
| Level | When Loaded | Token Cost | Content |
|---|---|---|---|
| Level 1: Metadata | Always (at startup) | ~100 tokens | name and description from frontmatter |
| Level 2: Instructions | When skill is triggered | Under 5k tokens | SKILL.md body with instructions |
| Level 3: Resources | As needed | Unlimited | Referenced files, scripts |
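
Level 1 costs almost nothing because only the frontmatter is read. A minimal sketch of that metadata; the name and description values here are illustrative:

```yaml
---
name: pdf-processing
description: Extract text and tables from PDF files. Use when the user asks to read, parse, or summarize a PDF.
---
```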
#### Organizing Large Skills

**Pattern 1: High-level guide with references**
````markdown
# PDF Processing

## Quick Start

```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```

## Advanced Features

**Form filling**: See [FORMS.md](FORMS.md)
**API reference**: See [REFERENCE.md](REFERENCE.md)
**Examples**: See [EXAMPLES.md](EXAMPLES.md)
````
Claude loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed.
**Pattern 2: Domain-specific organization**

For skills with multiple domains:

```
bigquery-skill/
├── SKILL.md (overview and navigation)
└── reference/
    ├── finance.md (revenue, billing metrics)
    ├── sales.md (opportunities, pipeline)
    ├── product.md (API usage, features)
    └── marketing.md (campaigns, attribution)
```
When user asks about revenue, Claude reads only reference/finance.md.
**Pattern 3: Conditional details**

```markdown
# DOCX Processing

## Creating Documents

Use docx-js. See [DOCX-JS.md](DOCX-JS.md).

## Editing Documents

For simple edits, modify XML directly.

**For tracked changes**: See [REDLINING.md](REDLINING.md)
**For OOXML details**: See [OOXML.md](OOXML.md)
```
#### Avoid Deeply Nested References

Keep references one level deep from SKILL.md.

**Bad (too deep):**

```
SKILL.md → advanced.md → details.md → actual info
```

**Good (one level):**

```
SKILL.md → {advanced.md, reference.md, examples.md}
```
#### Structure Longer Files with TOC

For reference files over 100 lines, include a table of contents:

```markdown
# API Reference

## Contents

- Authentication and setup
- Core methods (create, read, update, delete)
- Advanced features (batch operations, webhooks)
- Error handling patterns
- Code examples

## Authentication and Setup
...
```
This ensures Claude can see the full scope even with partial reads.
### Script Bundling

#### When to Bundle Scripts

Bundle scripts for:
- **Error-prone operations**: Complex bash with retry logic
- **Fragile sequences**: Operations requiring exact order
- **Validation steps**: Checking conditions before proceeding
- **Reusable utilities**: Operations used in multiple steps
Benefits of bundled scripts:
- More reliable than generated code
- Save tokens (no code in context)
- Save time (no code generation)
- Ensure consistency
#### Script Structure

```bash
#!/bin/bash
# script-name.sh - Brief description
#
# Usage: script-name.sh <param1> <param2>
#
# Example: script-name.sh issue-42 "Fix bug"

set -e  # Exit on error

# Input validation
if [ $# -lt 2 ]; then
    echo "Usage: $0 <param1> <param2>"
    exit 1
fi

param1=$1
param2=$2

# Main logic with error handling
if ! some_command; then
    echo "ERROR: Command failed"
    exit 1
fi

# Success output
echo "SUCCESS: Operation completed"
```
#### Referencing Scripts in Skills

Make clear whether to execute or read:

**Execute (most common):**

```markdown
7. **Create PR**: `./scripts/create-pr.sh $1 "$title"`
```

**Read as reference (for understanding complex logic):**

```markdown
See `./scripts/analyze-form.py` for the field extraction algorithm
```
#### Solving, Not Punting

Scripts should handle error conditions, not punt to Claude.

**Good (handles errors):**

```python
def process_file(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        print(f"File {path} not found, creating default")
        with open(path, 'w') as f:
            f.write('')
        return ''
    except PermissionError:
        print(f"Cannot access {path}, using default")
        return ''
```

**Bad (punts to Claude):**

```python
def process_file(path):
    return open(path).read()  # Fails, Claude has to figure it out
```
## Workflow Patterns

### Plan-Validate-Execute

Add verification checkpoints to catch errors early.

**Example: Workflow with validation**
````markdown
## PDF Form Filling

Copy this checklist:

```
Progress:
- [ ] Step 1: Analyze form (run analyze_form.py)
- [ ] Step 2: Create field mapping (edit fields.json)
- [ ] Step 3: Validate mapping (run validate_fields.py)
- [ ] Step 4: Fill form (run fill_form.py)
- [ ] Step 5: Verify output (run verify_output.py)
```

**Step 1: Analyze**
Run: `python scripts/analyze_form.py input.pdf`

**Step 2: Create Mapping**
Edit `fields.json`

**Step 3: Validate**
Run: `python scripts/validate_fields.py fields.json`
Fix any errors before continuing.

**Step 4: Fill**
Run: `python scripts/fill_form.py input.pdf fields.json output.pdf`

**Step 5: Verify**
Run: `python scripts/verify_output.py output.pdf`
If verification fails, return to Step 2.
````
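
A minimal driver sketch of the same gated workflow in Python, assuming each bundled script exits non-zero on failure:

```python
import subprocess
import sys

def run_step(script, *args):
    """Run one workflow script; check=True halts the pipeline at a failed checkpoint."""
    subprocess.run([sys.executable, f"scripts/{script}", *args], check=True)

run_step("analyze_form.py", "input.pdf")
# ... create fields.json from the analysis ...
run_step("validate_fields.py", "fields.json")                       # gate: must pass
run_step("fill_form.py", "input.pdf", "fields.json", "output.pdf")
run_step("verify_output.py", "output.pdf")                          # final verification
```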
### Feedback Loops

Pattern: run validator → fix errors → repeat.

**Example: Document editing**
1. Make edits to `word/document.xml`
2. **Validate**: `python scripts/validate.py unpacked_dir/`
3. If validation fails:
   - Review the error message
   - Fix the issues
   - Run validation again
4. **Only proceed when validation passes**
5. Rebuild: `python scripts/pack.py unpacked_dir/ output.docx`
6. Test the output document
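
A sketch of that loop, assuming `scripts/validate.py` prints its errors and exits non-zero on failure:

```python
import subprocess
import sys

def validate_until_clean(unpacked_dir, max_attempts=5):
    """Re-run the validator after each round of fixes; stop once it passes."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(
            [sys.executable, "scripts/validate.py", unpacked_dir],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return True  # validation passed; safe to rebuild and test
        print(f"Attempt {attempt} failed:\n{result.stdout}{result.stderr}")
        # ... fix the reported issues in word/document.xml, then loop ...
    return False
```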
## Model Selection

### Decision Framework

```
Start with Haiku
       |
       v
Test on 3-5 representative tasks
       |
       +-- Success rate ≥80%? --> Use Haiku ✓
       |
       +-- Success rate <80%? --> Try Sonnet
                                     |
                                     v
                            Test on the same tasks
                                     |
                                     +-- Success ≥80%?  --> Use Sonnet
                                     |
                                     +-- Still failing? --> Opus or redesign task
```
### Haiku Works Well When
- Steps are simple and validated
- Instructions are concise (no verbose explanations)
- Error-prone operations use scripts (deterministic)
- Outputs have structured templates
- Checklists replace open-ended judgment
### Testing with Multiple Models

Test skills with every model you plan to use:

1. **Create test cases**: 3-5 representative scenarios
2. **Run with Haiku**: Measure success rate and response quality
3. **Run with Sonnet**: Compare results
4. **Adjust instructions**: If Haiku struggles, add clarity or scripts
What works for Opus might need more detail for Haiku.
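
A model-agnostic harness sketch for the ≥80% decision rule; `run_task` and `judge` are hypothetical callables you supply per skill:

```python
def success_rate(run_task, test_cases, judge):
    """Fraction of representative cases a model passes.

    run_task(case) -> the model's output for one scenario (hypothetical callable).
    judge(case, output) -> True if the output meets that case's criteria.
    """
    passed = sum(judge(case, run_task(case)) for case in test_cases)
    return passed / len(test_cases)

# Decision rule from the framework: >= 0.8 with Haiku means ship on Haiku;
# otherwise re-run the same cases with Sonnet, then Opus or redesign the task.
```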
## Common Anti-Patterns

### Offering Too Many Options

**Bad (confusing):**

```
You can use pypdf, or pdfplumber, or PyMuPDF, or pdf2image, or...
```

**Good (provide a default):**
````markdown
Use pdfplumber for text extraction:

```python
import pdfplumber
```

For scanned PDFs requiring OCR, use pdf2image with pytesseract instead.
````
### Time-Sensitive Information

**Bad (will become wrong):**

```
If you're doing this before August 2025, use the old API.
After August 2025, use the new API.
```
Good (use "old patterns" section):
## Current Method
Use the v2 API: `api.example.com/v2/messages`
## Old Patterns
<details>
<summary>Legacy v1 API (deprecated 2025-08)</summary>
The v1 API used: `api.example.com/v1/messages`
This endpoint is no longer supported.
</details>
### Inconsistent Terminology

**Good (consistent):**
- Always "API endpoint"
- Always "field"
- Always "extract"
**Bad (inconsistent):**
- Mix "API endpoint", "URL", "API route", "path"
- Mix "field", "box", "element", "control"
- Mix "extract", "pull", "get", "retrieve"
### Windows-Style Paths

Always use forward slashes:

- ✓ Good: `scripts/helper.py`, `reference/guide.md`
- ✗ Bad: `scripts\helper.py`, `reference\guide.md`
Unix-style paths work cross-platform.
## Iterative Development

### Build Evaluations First

Create test cases BEFORE extensive documentation:

1. **Identify gaps**: Run Claude on tasks without the skill and document the failures
2. **Create evaluations**: Build 3-5 test scenarios (see the sketch below)
3. **Establish a baseline**: Measure Claude's performance without the skill
4. **Write minimal instructions**: Just enough to pass the evaluations
5. **Iterate**: Execute the evaluations and refine
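
One way such evaluation cases might look; the schema here is illustrative, not a required format:

```python
# Hypothetical evaluation cases for a PDF skill: each pairs a prompt with a
# crude pass/fail check on the model's output.
EVALS = [
    {
        "prompt": "Extract the text of page 1 from report.pdf",
        "passed": lambda output: "pdfplumber" in output,
    },
    {
        "prompt": "Summarize the tables in report.pdf",
        "passed": lambda output: len(output) > 0,
    },
    # ... 3-5 representative scenarios total ...
]
```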
### Develop Iteratively with Claude

Use Claude to help write skills:

1. **Complete a task without the skill**: Work through the problem and note what context you provide
2. **Identify the reusable pattern**: What context is useful for similar tasks?
3. **Ask Claude to create the skill**: "Create a skill that captures this pattern"
4. **Review for conciseness**: Remove unnecessary explanations
5. **Test on similar tasks**: Use the skill with a fresh Claude instance
6. **Iterate based on observation**: Where does Claude struggle?

Claude understands the skill format natively; no special prompts are needed.
## Checklist for Effective Skills

Before publishing:

### Core Quality

- [ ] Description is specific and includes key terms
- [ ] Description includes what skill does AND when to use it
- [ ] SKILL.md body under 500 lines
- [ ] Additional details in separate files (if needed)
- [ ] No time-sensitive information
- [ ] Consistent terminology throughout
- [ ] Examples are concrete, not abstract
- [ ] File references are one level deep
- [ ] Progressive disclosure used appropriately
- [ ] Workflows have clear steps
### Code and Scripts

- [ ] Scripts solve problems, don't punt to Claude
- [ ] Error handling is explicit and helpful
- [ ] No "magic numbers" (all values justified)
- [ ] Required packages listed and verified
- [ ] Scripts have clear documentation
- [ ] No Windows-style paths (all forward slashes)
- [ ] Validation steps for critical operations
- [ ] Feedback loops for quality-critical tasks
### Testing

- [ ] At least 3 test cases created
- [ ] Tested with Haiku (if that's the target)
- [ ] Tested with real usage scenarios
- [ ] Team feedback incorporated (if applicable)