feat(skills): modernize capability-writing with Anthropic best practices

Updates capability-writing skill with progressive disclosure structure based on
Anthropic's January 2025 documentation. Implements Haiku-first approach (12x
cheaper, 2-5x faster than Sonnet).

Key changes:
- Add 5 core principles: conciseness, progressive disclosure, script bundling,
  degrees of freedom, and Haiku-first model selection
- Restructure with best-practices.md, templates/, examples/, and reference/
- Create 4 templates: user-invocable skill, background skill, agent, helper script
- Add 3 examples: simple workflow, progressive disclosure, with scripts
- Add 3 reference docs: frontmatter fields, model selection, anti-patterns
- Update create-capability to analyze complexity and recommend structures
- Default all new skills/agents to Haiku unless justified

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
commit f424a7f992, parent 7406517cd9, 2026-01-11 18:10:53 +01:00
13 changed files with 2612 additions and 229 deletions

# Anti-Patterns to Avoid
Common mistakes when creating skills and agents.
## Skill Design Anti-Patterns
### 1. Overly Broad Components
**Bad:** One skill that does everything
```yaml
---
name: project-management
description: Handles issues, PRs, releases, documentation, deployment, testing, CI/CD...
---
# Project Management
This skill does:
- Issue management
- Pull request reviews
- Release planning
- Documentation
- Deployment
- Testing
- CI/CD configuration
...
```
**Why it's bad:**
- Huge context window usage
- Hard to maintain
- Unclear when to trigger
- Tries to do too much
**Good:** Focused components
```yaml
---
name: issue-writing
description: How to write clear, actionable issues with acceptance criteria.
---
```
**Separate skills for:**
- `issue-writing` - Issue quality
- `review-pr` - PR reviews
- `gitea` - CLI reference
Each does one thing well.
---
### 2. Vague Instructions
**Bad:**
```markdown
1. Handle the issue
2. Do the work
3. Finish up
4. Let me know when done
```
**Why it's bad:**
- No clear actions
- Claude has to guess
- Inconsistent results
- Hard to validate
**Good:**
```markdown
1. **View issue**: `tea issues $1 --comments`
2. **Create branch**: `git checkout -b issue-$1-<title>`
3. **Plan work**: Use TodoWrite to break down steps
4. **Implement**: Make necessary changes
5. **Commit**: `git commit -m "feat: ..."`
6. **Create PR**: `tea pulls create --title "..." --description "..."`
```
---
### 3. Missing Skill References
**Bad:**
```markdown
Use the gitea skill to create an issue.
```
**Why it's bad:**
- Skills have ~20% auto-activation rate
- Claude might not load the skill
- Inconsistent results
**Good:**
```markdown
@~/.claude/skills/gitea/SKILL.md
Use `tea issues create --title "..." --description "..."`
```
**The `@` reference guarantees the skill content is loaded.**
---
### 4. God Skills
**Bad:** Single 1500-line skill covering everything
```
skills/database/SKILL.md (1500 lines)
- PostgreSQL
- MySQL
- MongoDB
- Redis
- All queries
- All optimization tips
- All schemas
```
**Why it's bad:**
- Exceeds recommended 500 lines
- Loads everything even if you need one thing
- Hard to maintain
- Wastes tokens
**Good:** Progressive disclosure
```
skills/database/
├── SKILL.md (200 lines - overview)
├── reference/
│   ├── postgres.md
│   ├── mysql.md
│   ├── mongodb.md
│   └── redis.md
└── schemas/
    ├── users.md
    ├── products.md
    └── orders.md
```
Claude loads only what's needed.
---
### 5. Premature Agent Creation
**Bad:** Creating an agent for every task
```
agents/
├── issue-viewer/
├── branch-creator/
├── commit-maker/
├── pr-creator/
└── readme-updater/
```
**Why it's bad:**
- Overhead of spawning agents
- Most tasks don't need isolation
- Harder to follow workflow
- Slower execution
**Good:** Use agents only when needed:
- Context isolation (parallel work)
- Skill composition (multiple skills together)
- Specialist persona (architecture review)
**Simple tasks → Skills**
**Complex isolated work → Agents**
---
### 6. Verbose Explanations
**Bad:**
```markdown
Git is a distributed version control system that was created by Linus Torvalds in 2005. It allows multiple developers to work on the same codebase simultaneously while maintaining a complete history of all changes. When you want to save your changes, you use the git commit command, which creates a snapshot of your current working directory...
```
**Why it's bad:**
- Wastes tokens
- Claude already knows git
- Slows down loading
- Adds no value
**Good:**
```markdown
`git commit -m 'feat: add feature'`
```
**Assume Claude is smart. Only add domain-specific context.**
---
## Instruction Anti-Patterns
### 7. Offering Too Many Options
**Bad:**
```markdown
You can use pypdf, or pdfplumber, or PyMuPDF, or pdf2image, or camelot, or tabula, or...
```
**Why it's bad:**
- Decision paralysis
- Inconsistent choices
- No clear default
**Good:**
```markdown
Use pdfplumber for text extraction:
\`\`\`python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
\`\`\`
For scanned PDFs requiring OCR, use pdf2image + pytesseract instead.
```
**Provide default, mention alternative only when needed.**
---
### 8. Time-Sensitive Information
**Bad:**
```markdown
If you're doing this before August 2025, use the old API.
After August 2025, use the new API.
```
**Why it's bad:**
- Will become wrong
- Requires maintenance
- Confusing after the date
**Good:**
```markdown
## Current Method
Use v2 API: `api.example.com/v2/messages`
## Old Patterns
<details>
<summary>Legacy v1 API (deprecated 2025-08)</summary>
The v1 API: `api.example.com/v1/messages`
No longer supported.
</details>
```
---
### 9. Inconsistent Terminology
**Bad:** Mixing terms for the same thing
```markdown
1. Get the API endpoint
2. Call the URL
3. Hit the API route
4. Query the path
```
**Why it's bad:**
- Confusing
- Looks like different things
- Harder to search
**Good:** Pick one term and stick with it
```markdown
1. Get the API endpoint
2. Call the API endpoint
3. Check the API endpoint response
4. Retry the API endpoint if needed
```
---
### 10. Windows-Style Paths
**Bad:**
```markdown
Run: `scripts\helper.py`
See: `reference\guide.md`
```
**Why it's bad:**
- Fails on Unix systems
- Causes errors on Mac/Linux
**Good:**
```markdown
Run: `scripts/helper.py`
See: `reference/guide.md`
```
**Always use forward slashes. They work everywhere.**
---
## Script Anti-Patterns
### 11. Punting to Claude
**Bad script:**
```python
def process_file(path):
    return open(path).read()  # Let Claude handle errors
```
**Why it's bad:**
- Script fails with no helpful message
- Claude has to guess what happened
- Inconsistent error handling
**Good script:**
```python
def process_file(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        print(f"ERROR: File {path} not found")
        print("Creating default file...")
        with open(path, 'w') as f:
            f.write('')
        return ''
    except PermissionError:
        print(f"ERROR: Cannot access {path}")
        print("Using default value")
        return ''
```
**Scripts should solve problems, not punt to Claude.**
---
### 12. Magic Numbers
**Bad:**
```bash
TIMEOUT=47 # Why 47?
RETRIES=5 # Why 5?
DELAY=3.7 # Why 3.7?
```
**Why it's bad:**
- No one knows why these values
- Hard to adjust
- "Voodoo constants"
**Good:**
```bash
# HTTP requests typically complete in <30s
# Extra buffer for slow connections
TIMEOUT=30
# Three retries balances reliability vs speed
# Most intermittent failures resolve by retry 2
RETRIES=3
# Exponential backoff: 1s, 2s, 4s
INITIAL_DELAY=1
```
**Document why each value is what it is.**
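The same documented-constants idea carries over to Python helpers. A minimal sketch; `fetch_with_retry` and its defaults are illustrative, not part of any existing tooling:

```python
import time

# Three retries balances reliability vs speed; most intermittent
# failures resolve by the second attempt
MAX_RETRIES = 3
# Exponential backoff: 1s, 2s, 4s
INITIAL_DELAY_SECONDS = 1


def fetch_with_retry(fetch, *args, initial_delay=INITIAL_DELAY_SECONDS):
    """Call fetch(*args), retrying failed calls with exponential backoff."""
    delay = initial_delay
    last_error = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return fetch(*args)
        except ConnectionError as err:
            last_error = err
            print(f"Attempt {attempt}/{MAX_RETRIES} failed: {err}")
            if attempt < MAX_RETRIES:
                time.sleep(delay)
                delay *= 2  # 1s -> 2s -> 4s
    raise last_error
```

Every constant states its rationale in a comment, so the next maintainer can adjust it with confidence.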
---
## Model Selection Anti-Patterns
### 13. Always Using Sonnet/Opus
**Bad:**
```yaml
---
name: dashboard
model: opus # "Just to be safe"
---
```
**Why it's bad:**
- 60x more expensive than Haiku
- 5x slower
- Wasted cost for simple task
**Good:**
```yaml
---
name: dashboard
model: haiku # Tested: 5/5 tests passed
---
```
**Test with Haiku first. Only upgrade if needed.**
---
### 14. Never Testing Haiku
**Bad:**
```yaml
---
name: review-pr
model: sonnet # Assumed it needs Sonnet, never tested Haiku
---
```
**Why it's bad:**
- Might work fine with Haiku
- Missing 12x cost savings
- Missing 2.5x speed improvement
**Good:**
```yaml
---
name: review-pr
model: haiku # Tested: Haiku 4/5 (80%), good enough!
---
```
Or:
```yaml
---
name: review-pr
model: sonnet # Tested: Haiku 2/5 (40%), Sonnet 4/5 (80%)
---
```
**Always test Haiku first, document results.**
---
## Progressive Disclosure Anti-Patterns
### 15. Deeply Nested References
**Bad:**
```
SKILL.md → advanced.md → details.md → actual-info.md
```
**Why it's bad:**
- Claude may partially read nested files
- Information might be incomplete
- Hard to navigate
**Good:**
```
SKILL.md → {advanced.md, reference.md, examples.md}
```
**Keep references one level deep from SKILL.md.**
---
### 16. No Table of Contents for Long Files
**Bad:** 500-line reference file with no structure
```markdown
# Reference
(500 lines of content with no navigation)
```
**Why it's bad:**
- Hard to preview
- Claude might miss sections
- User can't navigate
**Good:**
```markdown
# Reference
## Contents
- Authentication and setup
- Core methods
- Advanced features
- Error handling
- Examples
## Authentication and Setup
...
```
**Files longer than 100 lines should have a TOC.**
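Building a TOC for an existing file can be scripted. A minimal sketch that lists a file's `##` headings; the helper name is illustrative:

```python
def build_toc(markdown_text):
    """Return a '## Contents' section listing the file's ## headings."""
    lines = ["## Contents"]
    for line in markdown_text.splitlines():
        # Collect level-2 headings, skipping the Contents section itself
        if line.startswith("## ") and line != "## Contents":
            lines.append(f"- {line[3:].strip()}")
    return "\n".join(lines)
```

Run it over a reference file and paste the output under the title whenever sections change.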
---
## Checklist to Avoid Anti-Patterns
Before publishing a skill:
- [ ] Not overly broad (does one thing well)
- [ ] Instructions are specific (not vague)
- [ ] Skill references use `@` syntax
- [ ] Under 500 lines (or uses progressive disclosure)
- [ ] Only creates agents when needed
- [ ] Concise (assumes Claude knows basics)
- [ ] Provides default, not 10 options
- [ ] No time-sensitive information
- [ ] Consistent terminology
- [ ] Forward slashes for paths
- [ ] Scripts handle errors, don't punt
- [ ] No magic numbers in scripts
- [ ] Tested with Haiku first
- [ ] References are one level deep
- [ ] Long files have table of contents

# Frontmatter Fields Reference
Complete documentation of all available frontmatter fields for skills and agents.
## Skill Frontmatter
### Required Fields
#### `name`
- **Type:** string
- **Required:** Yes
- **Format:** Lowercase, hyphens only, no spaces
- **Max length:** 64 characters
- **Must match:** Directory name
- **Cannot contain:** XML tags, reserved words ("anthropic", "claude")
- **Example:** `work-issue`, `code-review`, `gitea`
#### `description`
- **Type:** string (multiline supported with `>`)
- **Required:** Yes
- **Max length:** 1024 characters
- **Cannot contain:** XML tags
- **Should include:**
- What the skill does
- When to use it
- Trigger conditions
- **Example:**
```yaml
description: >
  View, create, and manage Gitea issues and pull requests.
  Use when working with issues, PRs, or when user mentions tea, gitea, issue numbers.
```
#### `user-invocable`
- **Type:** boolean
- **Required:** Yes
- **Values:** `true` or `false`
- **Usage:**
- `true`: User can trigger with `/skill-name`
- `false`: Background skill, auto-loaded when needed
### Optional Fields
#### `model`
- **Type:** string
- **Required:** No
- **Values:** `haiku`, `sonnet`, `opus`
- **Default:** Inherits from parent (usually haiku)
- **Guidance:** Default to `haiku`, only upgrade if needed
- **Example:**
```yaml
model: haiku # 12x cheaper than sonnet
```
#### `argument-hint`
- **Type:** string
- **Required:** No (only for user-invocable skills)
- **Format:** `<required>` for required params, `[optional]` for optional
- **Shows in UI:** Helps users know what arguments to provide
- **Example:**
```yaml
argument-hint: <issue-number>
argument-hint: <issue-number> [optional-title]
```
#### `context`
- **Type:** string
- **Required:** No
- **Values:** `fork`
- **Usage:** Set to `fork` for skills needing isolated context
- **When to use:** Heavy exploration tasks that would pollute main context
- **Example:**
```yaml
context: fork # For arch-review-repo, deep exploration
```
#### `allowed-tools`
- **Type:** list of strings
- **Required:** No
- **Usage:** Restrict which tools the skill can use
- **Example:**
```yaml
allowed-tools:
- Read
- Bash
- Grep
```
- **Note:** Rarely used, most skills have all tools
## Agent Frontmatter
### Required Fields
#### `name`
- **Type:** string
- **Required:** Yes
- **Same rules as skill name**
#### `description`
- **Type:** string
- **Required:** Yes
- **Should include:**
- What the agent does
- When to spawn it
- **Example:**
```yaml
description: >
  Automated code review of pull requests for quality, bugs, security, and style.
  Spawn when reviewing PRs or checking code quality.
```
### Optional Fields
#### `model`
- **Type:** string
- **Required:** No
- **Values:** `haiku`, `sonnet`, `opus`, `inherit`
- **Default:** `inherit` (uses parent's model)
- **Guidance:**
- Default to `haiku` for simple agents
- Use `sonnet` for balanced performance
- Reserve `opus` for deep reasoning
- **Example:**
```yaml
model: haiku # Fast and cheap for code review checklist
```
#### `skills`
- **Type:** comma-separated list of skill names (not paths)
- **Required:** No
- **Usage:** Auto-load these skills when agent spawns
- **Format:** Just skill names, not paths
- **Example:**
```yaml
skills: gitea, issue-writing, code-review
```
- **Note:** Agent runtime loads skills automatically
#### `disallowedTools`
- **Type:** list of tool names
- **Required:** No
- **Common use:** Make agents read-only
- **Example:**
```yaml
disallowedTools:
- Edit
- Write
```
- **When to use:** Analysis agents that shouldn't modify code
#### `permissionMode`
- **Type:** string
- **Required:** No
- **Values:** `default`, `bypassPermissions`
- **Usage:** Rarely used, for agents that need to bypass permission prompts
- **Example:**
```yaml
permissionMode: bypassPermissions
```
## Examples
### Minimal User-Invocable Skill
```yaml
---
name: dashboard
description: Show open issues, PRs, and CI status.
user-invocable: true
---
```
### Full-Featured Skill
```yaml
---
name: work-issue
description: >
  Implement a Gitea issue with full workflow: branch, plan, code, PR, review.
  Use when implementing issues or when user says /work-issue.
model: haiku
argument-hint: <issue-number>
user-invocable: true
---
```
### Background Skill
```yaml
---
name: gitea
description: >
  View, create, and manage Gitea issues and PRs using tea CLI.
  Use when working with issues, PRs, viewing issue details, or when user mentions tea, gitea, issue numbers.
user-invocable: false
---
```
### Read-Only Agent
```yaml
---
name: code-reviewer
description: >
  Automated code review of pull requests for quality, bugs, security, style, and test coverage.
model: sonnet
skills: gitea, code-review
disallowedTools:
- Edit
- Write
---
```
### Implementation Agent
```yaml
---
name: issue-worker
description: >
  Autonomously implements a single issue in an isolated git worktree.
model: haiku
skills: gitea, issue-writing, software-architecture
---
```
## Validation Rules
### Name Validation
- Must be lowercase
- Must use hyphens (not underscores or spaces)
- Cannot contain: `anthropic`, `claude`
- Cannot contain XML tags `<`, `>`
- Max 64 characters
- Must match directory name exactly
### Description Validation
- Cannot be empty
- Max 1024 characters
- Cannot contain XML tags
- Should end with period
### Model Validation
- Must be one of: `haiku`, `sonnet`, `opus`, `inherit`
- Case-sensitive (must be lowercase)
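As an illustration, these rules sketched as a validator; the function is hypothetical, not official tooling:

```python
import re

VALID_MODELS = {"haiku", "sonnet", "opus", "inherit"}
RESERVED_WORDS = ("anthropic", "claude")


def validate_frontmatter(name, description, model=None):
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    # Lowercase words joined by hyphens; excludes spaces, underscores, XML tags
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name or ""):
        errors.append("name must be lowercase with hyphens only")
    if len(name or "") > 64:
        errors.append("name exceeds 64 characters")
    if any(word in (name or "") for word in RESERVED_WORDS):
        errors.append("name contains a reserved word")
    if not description:
        errors.append("description cannot be empty")
    elif len(description) > 1024:
        errors.append("description exceeds 1024 characters")
    if description and ("<" in description or ">" in description):
        errors.append("description cannot contain XML tags")
    if model is not None and model not in VALID_MODELS:
        errors.append("model must be haiku, sonnet, opus, or inherit")
    return errors
```

Checking name, description, and model in one pass catches most of the common mistakes below before a skill ships.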
## Common Mistakes
**Bad: Using paths in skills field**
```yaml
skills: ~/.claude/skills/gitea/SKILL.md # Wrong!
```
**Good: Just skill names**
```yaml
skills: gitea, issue-writing
```
**Bad: Reserved word in name**
```yaml
name: claude-helper # Contains "claude"
```
**Good: Descriptive name**
```yaml
name: code-helper
```
**Bad: Vague description**
```yaml
description: Helps with stuff
```
**Good: Specific description**
```yaml
description: >
  Analyze Excel spreadsheets, create pivot tables, generate charts.
  Use when analyzing Excel files, spreadsheets, or .xlsx files.
```

# Model Selection Guide
Detailed guidance on choosing the right model for skills and agents.
## Cost Comparison
| Model | Input (per MTok) | Output (per MTok) | vs Haiku |
|-------|------------------|-------------------|----------|
| **Haiku** | $0.25 | $1.25 | Baseline |
| **Sonnet** | $3.00 | $15.00 | 12x more expensive |
| **Opus** | $15.00 | $75.00 | 60x more expensive |
**Example cost for typical skill call (2K input, 1K output):**
- Haiku: $0.00175
- Sonnet: $0.021 (12x more)
- Opus: $0.105 (60x more)
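The per-call figures can be reproduced with a short helper. A sketch; the prices are the per-MTok rates from the table above:

```python
# Dollars per million tokens (input, output), as listed in the table above
PRICES = {
    "haiku": (0.25, 1.25),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def call_cost(model, input_tokens, output_tokens):
    """Cost in dollars for one call at the listed per-MTok rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, `call_cost("sonnet", 2000, 1000)` gives the $0.021 figure quoted above.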
## Speed Comparison
| Model | Tokens/Second | vs Haiku |
|-------|---------------|----------|
| **Haiku** | ~100 | Baseline |
| **Sonnet** | ~40 | 2.5x slower |
| **Opus** | ~20 | 5x slower |
## Decision Framework
```
Start with Haiku by default
  |
  v
Test on 3-5 representative tasks
  |
  +-- Success rate ≥80%? ---------> ✓ Use Haiku
  |                                   (12x cheaper, 2-5x faster)
  |
  +-- Success rate <80%? ---------> Try Sonnet
  |                                   |
  |                                   v
  |                                 Test on same tasks
  |                                   |
  |                                   +-- Success ≥80%? --> Use Sonnet
  |                                   |
  |                                   +-- Still failing? --> Opus or redesign
  |
  v
Document why you chose the model
```
## When Haiku Works Well
### ✓ Ideal for Haiku
**Simple sequential workflows:**
- `/dashboard` - Fetch and display
- `/roadmap` - List and format
- `/commit` - Generate message from diff
**Workflows with scripts:**
- Error-prone operations in scripts
- Skills just orchestrate script calls
- Validation is deterministic
**Structured outputs:**
- Tasks with clear templates
- Format is defined upfront
- No ambiguous formatting
**Reference/knowledge skills:**
- `gitea` - CLI reference
- `issue-writing` - Patterns and templates
- `software-architecture` - Best practices
### Examples of Haiku Success
**work-issue skill:**
- Sequential steps (view → branch → plan → implement → PR)
- Each step has clear validation
- Scripts handle error-prone operations
- Success rate: ~90%
**dashboard skill:**
- Fetch data (tea commands)
- Format as table
- Clear, structured output
- Success rate: ~95%
## When to Use Sonnet
### Use Sonnet When
**Haiku fails 20%+ of the time**
- Test with Haiku first
- If success rate <80%, upgrade to Sonnet
**Complex judgment required:**
- Code review (quality assessment)
- Issue grooming (clarity evaluation)
- Architecture decisions
**Nuanced reasoning:**
- Understanding implicit requirements
- Making trade-off decisions
- Applying context-dependent rules
### Examples of Sonnet Success
**review-pr skill:**
- Requires code understanding
- Judgment about quality/bugs
- Context-dependent feedback
- Originally tried Haiku: 65% success → Sonnet: 85%
**issue-worker agent:**
- Autonomous implementation
- Pattern matching
- Architectural decisions
- Originally tried Haiku: 70% success → Sonnet: 82%
## When to Use Opus
### Reserve Opus For
**Deep architectural reasoning:**
- `software-architect` agent
- Pattern recognition across large codebases
- Identifying subtle anti-patterns
- Trade-off analysis
**High-stakes decisions:**
- Breaking changes analysis
- System-wide refactoring plans
- Security architecture review
**Complex pattern recognition:**
- Requires sophisticated understanding
- Multiple layers of abstraction
- Long-term implications
### Examples of Opus Success
**software-architect agent:**
- Analyzes entire codebase
- Identifies 8 different anti-patterns
- Provides prioritized recommendations
- Sonnet: 68% success → Opus: 88%
**arch-review-repo skill:**
- Comprehensive architecture audit
- Cross-cutting concerns
- System-wide patterns
- Opus justified for depth
## Making Haiku More Effective
If Haiku is struggling, try these improvements **before** upgrading to Sonnet:
### 1. Add Validation Steps
**Instead of:**
```markdown
3. Implement changes and create PR
```
**Try:**
```markdown
3. Implement changes
4. Validate: Run `./scripts/validate.sh` (tests pass, linter clean)
5. Create PR: `./scripts/create-pr.sh`
```
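The `./scripts/validate.sh` above is a placeholder. As an illustration, here is what such a script might look like in Python; the pytest and ruff commands are assumptions about the project's stack:

```python
import subprocess
import sys

# Commands are project-specific assumptions; adjust to your stack
CHECKS = [
    ("tests", ["pytest", "-q"]),
    ("linter", ["ruff", "check", "."]),
]


def run_checks(checks, run=subprocess.run):
    """Run each check; return the label of the first failure, or None."""
    for label, cmd in checks:
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Surface the failure clearly instead of punting to Claude
            print(f"ERROR: {label} failed")
            print(result.stdout + result.stderr)
            return label
        print(f"OK: {label} passed")
    return None


if __name__ == "__main__":
    sys.exit(1 if run_checks(CHECKS) else 0)
```

A clear pass/fail per check gives Haiku a deterministic signal to act on, instead of a raw stack trace to interpret.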
### 2. Bundle Error-Prone Operations in Scripts
**Instead of:**
```markdown
5. Create PR: `tea pulls create --title "..." --description "..."`
```
**Try:**
```markdown
5. Create PR: `./scripts/create-pr.sh $issue "$title"`
```
### 3. Add Structured Output Templates
**Instead of:**
```markdown
Show the results
```
**Try:**
```markdown
Format results as:
| Issue | Status | Link |
|-------|--------|------|
| ... | ... | ... |
```
### 4. Add Explicit Checklists
**Instead of:**
```markdown
Review the code for quality
```
**Try:**
```markdown
Check:
- [ ] Code quality (readability, naming)
- [ ] Bugs (edge cases, null checks)
- [ ] Tests (coverage, assertions)
```
### 5. Make Instructions More Concise
**Instead of:**
```markdown
Git is a version control system. When you want to commit changes, you use the git commit command which saves your changes to the repository...
```
**Try:**
```markdown
`git commit -m 'feat: add feature'`
```
## Testing Methodology
### Create Test Suite
For each skill, create 3-5 test cases:
**Example: work-issue skill tests**
1. Simple bug fix issue
2. New feature with acceptance criteria
3. Issue missing acceptance criteria
4. Issue with tests that fail
5. Complex refactoring task
### Test with Haiku
```yaml
# Set skill to Haiku
model: haiku
# Run all 5 tests
# Document success/failure for each
```
### Measure Success Rate
```
Success rate = (Successful tests / Total tests) × 100
```
**Decision:**
- ≥80% → Keep Haiku
- <80% → Try Sonnet
- <50% → Likely need Opus or redesign
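These thresholds can be written down as a tiny helper; the function name is illustrative:

```python
def recommend_model(passed, total):
    """Map a Haiku test run to a recommendation using the thresholds above."""
    rate = passed / total * 100
    if rate >= 80:
        return "haiku"
    if rate >= 50:
        return "sonnet"
    return "opus-or-redesign"
```

So 4/5 passes keeps Haiku, 3/5 suggests trying Sonnet, and 2/5 means Opus or a redesign.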
### Test with Sonnet (if needed)
```yaml
# Upgrade to Sonnet
model: sonnet
# Run same 5 tests
# Compare results
```
### Document Decision
```yaml
---
name: work-issue
model: haiku # Tested: 4/5 tests passed with Haiku (80%)
---
```
Or:
```yaml
---
name: review-pr
model: sonnet # Tested: Haiku 3/5 (60%), Sonnet 4/5 (80%)
---
```
## Common Patterns
### Pattern: Start Haiku, Upgrade if Needed
**Issue-worker agent evolution:**
1. **V1 (Haiku):** 70% success - struggled with pattern matching
2. **Analysis:** Added more examples, still 72%
3. **V2 (Sonnet):** 82% success - better code understanding
4. **Decision:** Keep Sonnet, document why
### Pattern: Haiku for Most, Sonnet for Complex
**Review-pr skill:**
- Static analysis steps: Haiku could handle
- Manual code review: Needs Sonnet judgment
- **Decision:** Use Sonnet for whole skill (simplicity)
### Pattern: Split Complex Skills
**Instead of:** One complex skill using Opus
**Try:** Split into:
- Haiku skill for orchestration
- Sonnet agent for complex subtask
- Saves cost (most work in Haiku)
## Model Selection Checklist
Before choosing a model:
- [ ] Tested with Haiku first
- [ ] Measured success rate on 3-5 test cases
- [ ] Tried improvements (scripts, validation, checklists)
- [ ] Documented why this model is needed
- [ ] Considered cost implications (12x/60x)
- [ ] Considered speed implications (2.5x/5x slower)
- [ ] Will re-test if Claude models improve
## Future-Proofing
**Models improve over time.**
Periodically re-test Sonnet/Opus skills with Haiku:
- Haiku v2 might handle what Haiku v1 couldn't
- Cost savings compound over time
- Speed improvements are valuable
**Set a reminder:** Test Haiku again in 3-6 months.