feat(skills): modernize capability-writing with Anthropic best practices

Updates capability-writing skill with progressive disclosure structure based on Anthropic's January 2025 documentation. Implements Haiku-first approach (12x cheaper, 2-5x faster than Sonnet).

Key changes:
- Add 5 core principles: conciseness, progressive disclosure, script bundling, degrees of freedom, and Haiku-first model selection
- Restructure with best-practices.md, templates/, examples/, and reference/
- Create 4 templates: user-invocable skill, background skill, agent, helper script
- Add 3 examples: simple workflow, progressive disclosure, with scripts
- Add 3 reference docs: frontmatter fields, model selection, anti-patterns
- Update create-capability to analyze complexity and recommend structures
- Default all new skills/agents to Haiku unless justified

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
536
skills/capability-writing/reference/anti-patterns.md
Normal file
@@ -0,0 +1,536 @@
|
||||
# Anti-Patterns to Avoid
|
||||
|
||||
Common mistakes when creating skills and agents.
|
||||
|
||||
## Skill Design Anti-Patterns
|
||||
|
||||
### 1. Overly Broad Components
|
||||
|
||||
**Bad:** One skill that does everything
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: project-management
|
||||
description: Handles issues, PRs, releases, documentation, deployment, testing, CI/CD...
|
||||
---
|
||||
|
||||
# Project Management
|
||||
|
||||
This skill does:
|
||||
- Issue management
|
||||
- Pull request reviews
|
||||
- Release planning
|
||||
- Documentation
|
||||
- Deployment
|
||||
- Testing
|
||||
- CI/CD configuration
|
||||
...
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Huge context window usage
|
||||
- Hard to maintain
|
||||
- Unclear when to trigger
|
||||
- Tries to do too much
|
||||
|
||||
**Good:** Focused components
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: issue-writing
|
||||
description: How to write clear, actionable issues with acceptance criteria.
|
||||
---
|
||||
```
|
||||
|
||||
**Separate skills for:**
|
||||
- `issue-writing` - Issue quality
|
||||
- `review-pr` - PR reviews
|
||||
- `gitea` - CLI reference
|
||||
- Each does one thing well
|
||||
|
||||
---
|
||||
|
||||
### 2. Vague Instructions
|
||||
|
||||
**Bad:**
|
||||
|
||||
```markdown
|
||||
1. Handle the issue
|
||||
2. Do the work
|
||||
3. Finish up
|
||||
4. Let me know when done
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- No clear actions
|
||||
- Claude has to guess
|
||||
- Inconsistent results
|
||||
- Hard to validate
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
1. **View issue**: `tea issues $1 --comments`
|
||||
2. **Create branch**: `git checkout -b issue-$1-<title>`
|
||||
3. **Plan work**: Use TodoWrite to break down steps
|
||||
4. **Implement**: Make necessary changes
|
||||
5. **Commit**: `git commit -m "feat: ..."`
|
||||
6. **Create PR**: `tea pulls create --title "..." --description "..."`
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3. Missing Skill References
|
||||
|
||||
**Bad:**
|
||||
|
||||
```markdown
|
||||
Use the gitea skill to create an issue.
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Skills auto-activate only ~20% of the time
|
||||
- Claude might not load the skill
|
||||
- Inconsistent results
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
@~/.claude/skills/gitea/SKILL.md
|
||||
|
||||
Use `tea issues create --title "..." --description "..."`
|
||||
```
|
||||
|
||||
**The `@` reference guarantees the skill content is loaded.**
|
||||
|
||||
---
|
||||
|
||||
### 4. God Skills
|
||||
|
||||
**Bad:** Single 1500-line skill covering everything
|
||||
|
||||
```
|
||||
skills/database/SKILL.md (1500 lines)
|
||||
- PostgreSQL
|
||||
- MySQL
|
||||
- MongoDB
|
||||
- Redis
|
||||
- All queries
|
||||
- All optimization tips
|
||||
- All schemas
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Exceeds recommended 500 lines
|
||||
- Loads everything even if you need one thing
|
||||
- Hard to maintain
|
||||
- Wastes tokens
|
||||
|
||||
**Good:** Progressive disclosure
|
||||
|
||||
```
|
||||
skills/database/
├── SKILL.md (200 lines - overview)
├── reference/
│   ├── postgres.md
│   ├── mysql.md
│   ├── mongodb.md
│   └── redis.md
└── schemas/
    ├── users.md
    ├── products.md
    └── orders.md
|
||||
```
|
||||
|
||||
Claude loads only what's needed.
|
||||
|
||||
---
|
||||
|
||||
### 5. Premature Agent Creation
|
||||
|
||||
**Bad:** Creating an agent for every task
|
||||
|
||||
```
|
||||
agents/
|
||||
├── issue-viewer/
|
||||
├── branch-creator/
|
||||
├── commit-maker/
|
||||
├── pr-creator/
|
||||
└── readme-updater/
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Overhead of spawning agents
|
||||
- Most tasks don't need isolation
|
||||
- Harder to follow workflow
|
||||
- Slower execution
|
||||
|
||||
**Good:** Use agents only when needed:
|
||||
- Context isolation (parallel work)
|
||||
- Skill composition (multiple skills together)
|
||||
- Specialist persona (architecture review)
|
||||
|
||||
**Simple tasks → Skills**
|
||||
**Complex isolated work → Agents**
|
||||
|
||||
---
|
||||
|
||||
### 6. Verbose Explanations
|
||||
|
||||
**Bad:**
|
||||
|
||||
```markdown
|
||||
Git is a distributed version control system that was created by Linus Torvalds in 2005. It allows multiple developers to work on the same codebase simultaneously while maintaining a complete history of all changes. When you want to save your changes, you use the git commit command, which creates a snapshot of your current working directory...
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Wastes tokens
|
||||
- Claude already knows git
|
||||
- Slows down loading
|
||||
- Adds no value
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
`git commit -m 'feat: add feature'`
|
||||
```
|
||||
|
||||
**Assume Claude is smart. Only add domain-specific context.**
|
||||
|
||||
---
|
||||
|
||||
## Instruction Anti-Patterns
|
||||
|
||||
### 7. Offering Too Many Options
|
||||
|
||||
**Bad:**
|
||||
|
||||
```markdown
|
||||
You can use pypdf, or pdfplumber, or PyMuPDF, or pdf2image, or camelot, or tabula, or...
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Decision paralysis
|
||||
- Inconsistent choices
|
||||
- No clear default
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
Use pdfplumber for text extraction:
|
||||
|
||||
\`\`\`python
|
||||
import pdfplumber

with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
|
||||
\`\`\`
|
||||
|
||||
For scanned PDFs requiring OCR, use pdf2image + pytesseract instead.
|
||||
```
|
||||
|
||||
**Provide a default; mention an alternative only when it is needed.**
|
||||
|
||||
---
|
||||
|
||||
### 8. Time-Sensitive Information
|
||||
|
||||
**Bad:**
|
||||
|
||||
```markdown
|
||||
If you're doing this before August 2025, use the old API.
|
||||
After August 2025, use the new API.
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Will become wrong
|
||||
- Requires maintenance
|
||||
- Confusing after the date
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
## Current Method
|
||||
Use v2 API: `api.example.com/v2/messages`
|
||||
|
||||
## Old Patterns
|
||||
<details>
|
||||
<summary>Legacy v1 API (deprecated 2025-08)</summary>
|
||||
The v1 API: `api.example.com/v1/messages`
|
||||
No longer supported.
|
||||
</details>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 9. Inconsistent Terminology
|
||||
|
||||
**Bad:** Mixing terms for the same thing
|
||||
|
||||
```markdown
|
||||
1. Get the API endpoint
|
||||
2. Call the URL
|
||||
3. Hit the API route
|
||||
4. Query the path
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Confusing
|
||||
- Looks like different things
|
||||
- Harder to search
|
||||
|
||||
**Good:** Pick one term and stick with it
|
||||
|
||||
```markdown
|
||||
1. Get the API endpoint
|
||||
2. Call the API endpoint
|
||||
3. Check the API endpoint response
|
||||
4. Retry the API endpoint if needed
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 10. Windows-Style Paths
|
||||
|
||||
**Bad:**
|
||||
|
||||
```markdown
|
||||
Run: `scripts\helper.py`
|
||||
See: `reference\guide.md`
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Backslashes are not path separators on Unix systems (macOS, Linux), so these paths fail to resolve
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
Run: `scripts/helper.py`
|
||||
See: `reference/guide.md`
|
||||
```
|
||||
|
||||
**Always use forward slashes. They work everywhere.**
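
A check like this can be scripted. The following is a minimal sketch, not existing tooling: the helper names `find_backslash_paths` and `to_forward_slashes` and the regular expression are assumptions made for illustration, and the pattern is a heuristic that only looks inside inline code spans.

```python
import re

# Heuristic: an inline code span whose path-like segments are joined by
# backslashes, e.g. `scripts\helper.py`. (Hypothetical pattern.)
BACKSLASH_PATH = re.compile(r"`([\w.-]+(?:\\[\w.-]+)+)`")


def find_backslash_paths(markdown: str) -> list[str]:
    """Return Windows-style paths found inside inline code spans."""
    return BACKSLASH_PATH.findall(markdown)


def to_forward_slashes(path: str) -> str:
    """Rewrite a Windows-style relative path with forward slashes."""
    return path.replace("\\", "/")
```

Running `find_backslash_paths` over a skill file before publishing gives a list of paths to rewrite; an empty list means no backslash paths were detected by this heuristic.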
|
||||
|
||||
---
|
||||
|
||||
## Script Anti-Patterns
|
||||
|
||||
### 11. Punting to Claude
|
||||
|
||||
**Bad script:**
|
||||
|
||||
```python
|
||||
def process_file(path):
    return open(path).read()  # Let Claude handle errors
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Script fails with no helpful message
|
||||
- Claude has to guess what happened
|
||||
- Inconsistent error handling
|
||||
|
||||
**Good script:**
|
||||
|
||||
```python
|
||||
def process_file(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        print(f"ERROR: File {path} not found")
        print("Creating default file...")
        with open(path, 'w') as f:
            f.write('')
        return ''
    except PermissionError:
        print(f"ERROR: Cannot access {path}")
        print("Using default value")
        return ''
|
||||
```
|
||||
|
||||
**Scripts should solve problems, not punt to Claude.**
|
||||
|
||||
---
|
||||
|
||||
### 12. Magic Numbers
|
||||
|
||||
**Bad:**
|
||||
|
||||
```bash
|
||||
TIMEOUT=47 # Why 47?
|
||||
RETRIES=5 # Why 5?
|
||||
DELAY=3.7 # Why 3.7?
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- No one knows why these values were chosen
|
||||
- Hard to adjust
|
||||
- "Voodoo constants"
|
||||
|
||||
**Good:**
|
||||
|
||||
```bash
|
||||
# HTTP requests typically complete in <30s
|
||||
# Extra buffer for slow connections
|
||||
TIMEOUT=30
|
||||
|
||||
# Three retries balances reliability vs speed
|
||||
# Most intermittent failures resolve by retry 2
|
||||
RETRIES=3
|
||||
|
||||
# Exponential backoff: 1s, 2s, 4s
|
||||
INITIAL_DELAY=1
|
||||
```
|
||||
|
||||
**Document why each value is what it is.**
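
The same idea carries over to Python helper scripts. This is a sketch under stated assumptions: the constants mirror the documented values above, while `backoff_delays` and `call_with_retries` are hypothetical helper names, and catching every `Exception` is a simplification that a real script would likely narrow.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Three retries balances reliability vs speed.
RETRIES = 3
# Exponential backoff: 1s, 2s, 4s.
INITIAL_DELAY = 1.0


def backoff_delays(retries: int = RETRIES, initial: float = INITIAL_DELAY) -> list[float]:
    """Delay before each retry, doubling every time."""
    return [initial * 2**attempt for attempt in range(retries)]


def call_with_retries(func: Callable[[], T], sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func once, then retry up to RETRIES times with backoff."""
    for delay in backoff_delays():
        try:
            return func()
        except Exception:  # simplification: retry on any error
            sleep(delay)
    return func()  # final attempt; any error propagates to the caller
```

Passing `sleep` as a parameter keeps the function testable without real waiting.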
|
||||
|
||||
---
|
||||
|
||||
## Model Selection Anti-Patterns
|
||||
|
||||
### 13. Always Using Sonnet/Opus
|
||||
|
||||
**Bad:**
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: dashboard
|
||||
model: opus # "Just to be safe"
|
||||
---
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- 60x more expensive than Haiku
|
||||
- 5x slower
|
||||
- Wasted cost for simple task
|
||||
|
||||
**Good:**
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: dashboard
|
||||
model: haiku # Tested: 5/5 tests passed
|
||||
---
|
||||
```
|
||||
|
||||
**Test with Haiku first. Only upgrade if needed.**
|
||||
|
||||
---
|
||||
|
||||
### 14. Never Testing Haiku
|
||||
|
||||
**Bad:**
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: review-pr
|
||||
model: sonnet # Assumed it needs Sonnet, never tested Haiku
|
||||
---
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Might work fine with Haiku
|
||||
- Missing 12x cost savings
|
||||
- Missing 2.5x speed improvement
|
||||
|
||||
**Good:**
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: review-pr
|
||||
model: haiku # Tested: Haiku 4/5 (80%), good enough!
|
||||
---
|
||||
```
|
||||
|
||||
Or:
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: review-pr
|
||||
model: sonnet # Tested: Haiku 2/5 (40%), Sonnet 4/5 (80%)
|
||||
---
|
||||
```
|
||||
|
||||
**Always test Haiku first and document the results.**
|
||||
|
||||
---
|
||||
|
||||
## Progressive Disclosure Anti-Patterns
|
||||
|
||||
### 15. Deeply Nested References
|
||||
|
||||
**Bad:**
|
||||
|
||||
```
|
||||
SKILL.md → advanced.md → details.md → actual-info.md
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Claude may partially read nested files
|
||||
- Information might be incomplete
|
||||
- Hard to navigate
|
||||
|
||||
**Good:**
|
||||
|
||||
```
|
||||
SKILL.md → {advanced.md, reference.md, examples.md}
|
||||
```
|
||||
|
||||
**Keep references one level deep from SKILL.md.**
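
One way to check this rule is to treat references as a graph and measure each file's distance from `SKILL.md`. The sketch below assumes the links have already been extracted into a dictionary (file name to referenced file names); the function names `reference_depths` and `too_deep` are hypothetical.

```python
from collections import deque


def reference_depths(links: dict[str, list[str]], root: str = "SKILL.md") -> dict[str, int]:
    """Breadth-first search giving each reachable file's distance from root."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[current] + 1
                queue.append(target)
    return depths


def too_deep(links: dict[str, list[str]], root: str = "SKILL.md") -> list[str]:
    """Files that sit more than one reference away from root."""
    depths = reference_depths(links, root)
    return sorted(name for name, depth in depths.items() if depth > 1)
```

For the nested chain in the bad example, `too_deep` reports `details.md` and `actual-info.md`; for the flat layout it reports nothing.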
|
||||
|
||||
---
|
||||
|
||||
### 16. No Table of Contents for Long Files
|
||||
|
||||
**Bad:** 500-line reference file with no structure
|
||||
|
||||
```markdown
|
||||
# Reference
|
||||
|
||||
(500 lines of content with no navigation)
|
||||
```
|
||||
|
||||
**Why it's bad:**
|
||||
- Hard to preview
|
||||
- Claude might miss sections
|
||||
- User can't navigate
|
||||
|
||||
**Good:**
|
||||
|
||||
```markdown
|
||||
# Reference
|
||||
|
||||
## Contents
|
||||
- Authentication and setup
|
||||
- Core methods
|
||||
- Advanced features
|
||||
- Error handling
|
||||
- Examples
|
||||
|
||||
## Authentication and Setup
|
||||
...
|
||||
```
|
||||
|
||||
**Files over 100 lines should have a table of contents.**
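
A contents section can be generated from the headings rather than maintained by hand. This is a minimal sketch, assuming level-2 (`## `) headings are the entries worth listing; `build_contents` is a hypothetical helper name.

```python
def build_contents(markdown: str) -> str:
    """Return a '## Contents' section listing every level-2 heading."""
    entries = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore headings inside code fences
            continue
        if in_fence:
            continue
        if line.startswith("## ") and line.strip() != "## Contents":
            entries.append(f"- {line[3:].strip()}")
    return "\n".join(["## Contents", *entries])
```

The output can be pasted under the file's top-level heading and regenerated whenever sections change.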
|
||||
|
||||
---
|
||||
|
||||
## Checklist to Avoid Anti-Patterns
|
||||
|
||||
Before publishing a skill:
|
||||
|
||||
- [ ] Not overly broad (does one thing well)
|
||||
- [ ] Instructions are specific (not vague)
|
||||
- [ ] Skill references use `@` syntax
|
||||
- [ ] Under 500 lines (or uses progressive disclosure)
|
||||
- [ ] Only creates agents when needed
|
||||
- [ ] Concise (assumes Claude knows basics)
|
||||
- [ ] Provides default, not 10 options
|
||||
- [ ] No time-sensitive information
|
||||
- [ ] Consistent terminology
|
||||
- [ ] Forward slashes for paths
|
||||
- [ ] Scripts handle errors, don't punt
|
||||
- [ ] No magic numbers in scripts
|
||||
- [ ] Tested with Haiku first
|
||||
- [ ] References are one level deep
|
||||
- [ ] Long files have table of contents
|
||||
278
skills/capability-writing/reference/frontmatter-fields.md
Normal file
@@ -0,0 +1,278 @@
|
||||
# Frontmatter Fields Reference
|
||||
|
||||
Complete documentation of all available frontmatter fields for skills and agents.
|
||||
|
||||
## Skill Frontmatter
|
||||
|
||||
### Required Fields
|
||||
|
||||
#### `name`
|
||||
- **Type:** string
|
||||
- **Required:** Yes
|
||||
- **Format:** Lowercase, hyphens only, no spaces
|
||||
- **Max length:** 64 characters
|
||||
- **Must match:** Directory name
|
||||
- **Cannot contain:** XML tags, reserved words ("anthropic", "claude")
|
||||
- **Example:** `work-issue`, `code-review`, `gitea`
|
||||
|
||||
#### `description`
|
||||
- **Type:** string (multiline supported with `>`)
|
||||
- **Required:** Yes
|
||||
- **Max length:** 1024 characters
|
||||
- **Cannot contain:** XML tags
|
||||
- **Should include:**
|
||||
- What the skill does
|
||||
- When to use it
|
||||
- Trigger conditions
|
||||
- **Example:**
|
||||
```yaml
|
||||
description: >
  View, create, and manage Gitea issues and pull requests.
  Use when working with issues, PRs, or when user mentions tea, gitea, issue numbers.
|
||||
```
|
||||
|
||||
#### `user-invocable`
|
||||
- **Type:** boolean
|
||||
- **Required:** Yes
|
||||
- **Values:** `true` or `false`
|
||||
- **Usage:**
|
||||
- `true`: User can trigger with `/skill-name`
|
||||
- `false`: Background skill, auto-loaded when needed
|
||||
|
||||
### Optional Fields
|
||||
|
||||
#### `model`
|
||||
- **Type:** string
|
||||
- **Required:** No
|
||||
- **Values:** `haiku`, `sonnet`, `opus`
|
||||
- **Default:** Inherits from parent (usually haiku)
|
||||
- **Guidance:** Default to `haiku`, only upgrade if needed
|
||||
- **Example:**
|
||||
```yaml
|
||||
model: haiku # 12x cheaper than sonnet
|
||||
```
|
||||
|
||||
#### `argument-hint`
|
||||
- **Type:** string
|
||||
- **Required:** No (only for user-invocable skills)
|
||||
- **Format:** `<required>` for required params, `[optional]` for optional
|
||||
- **Shows in UI:** Helps users know what arguments to provide
|
||||
- **Example:**
|
||||
```yaml
|
||||
argument-hint: <issue-number>
|
||||
argument-hint: <issue-number> [optional-title]
|
||||
```
|
||||
|
||||
#### `context`
|
||||
- **Type:** string
|
||||
- **Required:** No
|
||||
- **Values:** `fork`
|
||||
- **Usage:** Set to `fork` for skills needing isolated context
|
||||
- **When to use:** Heavy exploration tasks that would pollute main context
|
||||
- **Example:**
|
||||
```yaml
|
||||
context: fork # For arch-review-repo, deep exploration
|
||||
```
|
||||
|
||||
#### `allowed-tools`
|
||||
- **Type:** list of strings
|
||||
- **Required:** No
|
||||
- **Usage:** Restrict which tools the skill can use
|
||||
- **Example:**
|
||||
```yaml
|
||||
allowed-tools:
|
||||
- Read
|
||||
- Bash
|
||||
- Grep
|
||||
```
|
||||
- **Note:** Rarely used, most skills have all tools
|
||||
|
||||
## Agent Frontmatter
|
||||
|
||||
### Required Fields
|
||||
|
||||
#### `name`
|
||||
- **Type:** string
|
||||
- **Required:** Yes
|
||||
- **Same rules as skill name**
|
||||
|
||||
#### `description`
|
||||
- **Type:** string
|
||||
- **Required:** Yes
|
||||
- **Should include:**
|
||||
- What the agent does
|
||||
- When to spawn it
|
||||
- **Example:**
|
||||
```yaml
|
||||
description: >
  Automated code review of pull requests for quality, bugs, security, and style.
  Spawn when reviewing PRs or checking code quality.
|
||||
```
|
||||
|
||||
### Optional Fields
|
||||
|
||||
#### `model`
|
||||
- **Type:** string
|
||||
- **Required:** No
|
||||
- **Values:** `haiku`, `sonnet`, `opus`, `inherit`
|
||||
- **Default:** `inherit` (uses parent's model)
|
||||
- **Guidance:**
|
||||
- Default to `haiku` for simple agents
|
||||
- Use `sonnet` for balanced performance
|
||||
- Reserve `opus` for deep reasoning
|
||||
- **Example:**
|
||||
```yaml
|
||||
model: haiku # Fast and cheap for code review checklist
|
||||
```
|
||||
|
||||
#### `skills`
|
||||
- **Type:** comma-separated list of skill names (not paths)
|
||||
- **Required:** No
|
||||
- **Usage:** Auto-load these skills when agent spawns
|
||||
- **Format:** Just skill names, not paths
|
||||
- **Example:**
|
||||
```yaml
|
||||
skills: gitea, issue-writing, code-review
|
||||
```
|
||||
- **Note:** Agent runtime loads skills automatically
|
||||
|
||||
#### `disallowedTools`
|
||||
- **Type:** list of tool names
|
||||
- **Required:** No
|
||||
- **Common use:** Make agents read-only
|
||||
- **Example:**
|
||||
```yaml
|
||||
disallowedTools:
|
||||
- Edit
|
||||
- Write
|
||||
```
|
||||
- **When to use:** Analysis agents that shouldn't modify code
|
||||
|
||||
#### `permissionMode`
|
||||
- **Type:** string
|
||||
- **Required:** No
|
||||
- **Values:** `default`, `bypassPermissions`
|
||||
- **Usage:** Rarely used, for agents that need to bypass permission prompts
|
||||
- **Example:**
|
||||
```yaml
|
||||
permissionMode: bypassPermissions
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### Minimal User-Invocable Skill
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: dashboard
|
||||
description: Show open issues, PRs, and CI status.
|
||||
user-invocable: true
|
||||
---
|
||||
```
|
||||
|
||||
### Full-Featured Skill
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: work-issue
|
||||
description: >
  Implement a Gitea issue with full workflow: branch, plan, code, PR, review.
  Use when implementing issues or when user says /work-issue.
|
||||
model: haiku
|
||||
argument-hint: <issue-number>
|
||||
user-invocable: true
|
||||
---
|
||||
```
|
||||
|
||||
### Background Skill
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: gitea
|
||||
description: >
  View, create, and manage Gitea issues and PRs using tea CLI.
  Use when working with issues, PRs, viewing issue details, or when user mentions tea, gitea, issue numbers.
|
||||
user-invocable: false
|
||||
---
|
||||
```
|
||||
|
||||
### Read-Only Agent
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: code-reviewer
|
||||
description: >
  Automated code review of pull requests for quality, bugs, security, style, and test coverage.
|
||||
model: sonnet
|
||||
skills: gitea, code-review
|
||||
disallowedTools:
|
||||
- Edit
|
||||
- Write
|
||||
---
|
||||
```
|
||||
|
||||
### Implementation Agent
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: issue-worker
|
||||
description: >
  Autonomously implements a single issue in an isolated git worktree.
|
||||
model: haiku
|
||||
skills: gitea, issue-writing, software-architecture
|
||||
---
|
||||
```
|
||||
|
||||
## Validation Rules
|
||||
|
||||
### Name Validation
|
||||
- Must be lowercase
|
||||
- Must use hyphens (not underscores or spaces)
|
||||
- Cannot contain: `anthropic`, `claude`
|
||||
- Cannot contain XML tags `<`, `>`
|
||||
- Max 64 characters
|
||||
- Must match directory name exactly
|
||||
|
||||
### Description Validation
|
||||
- Cannot be empty
|
||||
- Max 1024 characters
|
||||
- Cannot contain XML tags
|
||||
- Should end with period
|
||||
|
||||
### Model Validation
|
||||
- Must be one of: `haiku`, `sonnet`, `opus`, `inherit` (`inherit` is listed for agents only)
|
||||
- Case-sensitive (must be lowercase)
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
**Bad: Using paths in skills field**
|
||||
```yaml
|
||||
skills: ~/.claude/skills/gitea/SKILL.md # Wrong!
|
||||
```
|
||||
|
||||
**Good: Just skill names**
|
||||
```yaml
|
||||
skills: gitea, issue-writing
|
||||
```
|
||||
|
||||
**Bad: Reserved word in name**
|
||||
```yaml
|
||||
name: claude-helper # Contains "claude"
|
||||
```
|
||||
|
||||
**Good: Descriptive name**
|
||||
```yaml
|
||||
name: code-helper
|
||||
```
|
||||
|
||||
**Bad: Vague description**
|
||||
```yaml
|
||||
description: Helps with stuff
|
||||
```
|
||||
|
||||
**Good: Specific description**
|
||||
```yaml
|
||||
description: >
|
||||
Analyze Excel spreadsheets, create pivot tables, generate charts.
|
||||
Use when analyzing Excel files, spreadsheets, or .xlsx files.
|
||||
```
|
||||
336
skills/capability-writing/reference/model-selection.md
Normal file
@@ -0,0 +1,336 @@
|
||||
# Model Selection Guide
|
||||
|
||||
Detailed guidance on choosing the right model for skills and agents.
|
||||
|
||||
## Cost Comparison
|
||||
|
||||
| Model | Input (per MTok) | Output (per MTok) | vs Haiku |
|-------|------------------|-------------------|----------|
| **Haiku** | $0.25 | $1.25 | Baseline |
| **Sonnet** | $3.00 | $15.00 | 12x more expensive |
| **Opus** | $15.00 | $75.00 | 60x more expensive |
|
||||
|
||||
**Example cost for typical skill call (2K input, 1K output):**
|
||||
- Haiku: $0.00175
|
||||
- Sonnet: $0.021 (12x more)
|
||||
- Opus: $0.105 (60x more)
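
The per-call figures follow directly from the per-MTok prices. As a sketch (the `PRICES` dictionary simply restates the table in this guide and may not match current pricing; `call_cost` is a hypothetical helper):

```python
# USD per million tokens (MTok): (input, output), as listed in this guide.
PRICES = {
    "haiku": (0.25, 1.25),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one call with the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For the 2K-input, 1K-output call, Haiku works out to (2000 × 0.25 + 1000 × 1.25) / 1,000,000 = $0.00175, and the same arithmetic gives the Sonnet and Opus figures.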
|
||||
|
||||
## Speed Comparison
|
||||
|
||||
| Model | Tokens/Second | vs Haiku |
|-------|---------------|----------|
| **Haiku** | ~100 | Baseline |
| **Sonnet** | ~40 | 2.5x slower |
| **Opus** | ~20 | 5x slower |
|
||||
|
||||
## Decision Framework
|
||||
|
||||
```
|
||||
Start with Haiku by default
    |
    v
Test on 3-5 representative tasks
    |
    +-- Success rate ≥80%? ---------> ✓ Use Haiku
    |                                   (12x cheaper, 2-5x faster)
    |
    +-- Success rate <80%? --------> Try Sonnet
    |                                   |
    |                                   v
    |                              Test on same tasks
    |                                   |
    |                                   +-- Success ≥80%? --> Use Sonnet
    |                                   |
    |                                   +-- Still failing? --> Opus or redesign
    |
    v
Document why you chose the model
|
||||
```
|
||||
|
||||
## When Haiku Works Well
|
||||
|
||||
### ✓ Ideal for Haiku
|
||||
|
||||
**Simple sequential workflows:**
|
||||
- `/dashboard` - Fetch and display
|
||||
- `/roadmap` - List and format
|
||||
- `/commit` - Generate message from diff
|
||||
|
||||
**Workflows with scripts:**
|
||||
- Error-prone operations in scripts
|
||||
- Skills just orchestrate script calls
|
||||
- Validation is deterministic
|
||||
|
||||
**Structured outputs:**
|
||||
- Tasks with clear templates
|
||||
- Format is defined upfront
|
||||
- No ambiguous formatting
|
||||
|
||||
**Reference/knowledge skills:**
|
||||
- `gitea` - CLI reference
|
||||
- `issue-writing` - Patterns and templates
|
||||
- `software-architecture` - Best practices
|
||||
|
||||
### Examples of Haiku Success
|
||||
|
||||
**work-issue skill:**
|
||||
- Sequential steps (view → branch → plan → implement → PR)
|
||||
- Each step has clear validation
|
||||
- Scripts handle error-prone operations
|
||||
- Success rate: ~90%
|
||||
|
||||
**dashboard skill:**
|
||||
- Fetch data (tea commands)
|
||||
- Format as table
|
||||
- Clear, structured output
|
||||
- Success rate: ~95%
|
||||
|
||||
## When to Use Sonnet
|
||||
|
||||
### Use Sonnet When
|
||||
|
||||
**Haiku fails more than 20% of the time:**
|
||||
- Test with Haiku first
|
||||
- If success rate <80%, upgrade to Sonnet
|
||||
|
||||
**Complex judgment required:**
|
||||
- Code review (quality assessment)
|
||||
- Issue grooming (clarity evaluation)
|
||||
- Architecture decisions
|
||||
|
||||
**Nuanced reasoning:**
|
||||
- Understanding implicit requirements
|
||||
- Making trade-off decisions
|
||||
- Applying context-dependent rules
|
||||
|
||||
### Examples of Sonnet Success
|
||||
|
||||
**review-pr skill:**
|
||||
- Requires code understanding
|
||||
- Judgment about quality/bugs
|
||||
- Context-dependent feedback
|
||||
- Originally tried Haiku: 65% success → Sonnet: 85%
|
||||
|
||||
**issue-worker agent:**
|
||||
- Autonomous implementation
|
||||
- Pattern matching
|
||||
- Architectural decisions
|
||||
- Originally tried Haiku: 70% success → Sonnet: 82%
|
||||
|
||||
## When to Use Opus
|
||||
|
||||
### Reserve Opus For
|
||||
|
||||
**Deep architectural reasoning:**
|
||||
- `software-architect` agent
|
||||
- Pattern recognition across large codebases
|
||||
- Identifying subtle anti-patterns
|
||||
- Trade-off analysis
|
||||
|
||||
**High-stakes decisions:**
|
||||
- Breaking changes analysis
|
||||
- System-wide refactoring plans
|
||||
- Security architecture review
|
||||
|
||||
**Complex pattern recognition:**
|
||||
- Requires sophisticated understanding
|
||||
- Multiple layers of abstraction
|
||||
- Long-term implications
|
||||
|
||||
### Examples of Opus Success
|
||||
|
||||
**software-architect agent:**
|
||||
- Analyzes entire codebase
|
||||
- Identifies 8 different anti-patterns
|
||||
- Provides prioritized recommendations
|
||||
- Sonnet: 68% success → Opus: 88%
|
||||
|
||||
**arch-review-repo skill:**
|
||||
- Comprehensive architecture audit
|
||||
- Cross-cutting concerns
|
||||
- System-wide patterns
|
||||
- Opus justified for depth
|
||||
|
||||
## Making Haiku More Effective
|
||||
|
||||
If Haiku is struggling, try these improvements **before** upgrading to Sonnet:
|
||||
|
||||
### 1. Add Validation Steps
|
||||
|
||||
**Instead of:**
|
||||
```markdown
|
||||
3. Implement changes and create PR
|
||||
```
|
||||
|
||||
**Try:**
|
||||
```markdown
|
||||
3. Implement changes
|
||||
4. Validate: Run `./scripts/validate.sh` (tests pass, linter clean)
|
||||
5. Create PR: `./scripts/create-pr.sh`
|
||||
```
|
||||
|
||||
### 2. Bundle Error-Prone Operations in Scripts
|
||||
|
||||
**Instead of:**
|
||||
```markdown
|
||||
5. Create PR: `tea pulls create --title "..." --description "..."`
|
||||
```
|
||||
|
||||
**Try:**
|
||||
```markdown
|
||||
5. Create PR: `./scripts/create-pr.sh $issue "$title"`
|
||||
```
|
||||
|
||||
### 3. Add Structured Output Templates
|
||||
|
||||
**Instead of:**
|
||||
```markdown
|
||||
Show the results
|
||||
```
|
||||
|
||||
**Try:**
|
||||
```markdown
|
||||
Format results as:
|
||||
|
||||
| Issue | Status | Link |
|
||||
|-------|--------|------|
|
||||
| ... | ... | ... |
|
||||
```
|
||||
|
||||
### 4. Add Explicit Checklists
|
||||
|
||||
**Instead of:**
|
||||
```markdown
|
||||
Review the code for quality
|
||||
```
|
||||
|
||||
**Try:**
|
||||
```markdown
|
||||
Check:
|
||||
- [ ] Code quality (readability, naming)
|
||||
- [ ] Bugs (edge cases, null checks)
|
||||
- [ ] Tests (coverage, assertions)
|
||||
```
|
||||
|
||||
### 5. Make Instructions More Concise
|
||||
|
||||
**Instead of:**
|
||||
```markdown
|
||||
Git is a version control system. When you want to commit changes, you use the git commit command which saves your changes to the repository...
|
||||
```
|
||||
|
||||
**Try:**
|
||||
```markdown
|
||||
`git commit -m 'feat: add feature'`
|
||||
```
|
||||
|
||||
## Testing Methodology
|
||||
|
||||
### Create Test Suite
|
||||
|
||||
For each skill, create 3-5 test cases:
|
||||
|
||||
**Example: work-issue skill tests**
|
||||
1. Simple bug fix issue
|
||||
2. New feature with acceptance criteria
|
||||
3. Issue missing acceptance criteria
|
||||
4. Issue with tests that fail
|
||||
5. Complex refactoring task
|
||||
|
||||
### Test with Haiku
|
||||
|
||||
```yaml
|
||||
# Set skill to Haiku
|
||||
model: haiku
|
||||
|
||||
# Run all 5 tests
|
||||
# Document success/failure for each
|
||||
```
|
||||
|
||||
### Measure Success Rate
|
||||
|
||||
```
|
||||
Success rate = (Successful tests / Total tests) × 100
|
||||
```
|
||||
|
||||
**Decision:**
|
||||
- ≥80% → Keep Haiku
|
||||
- 50-79% → Try Sonnet
- <50% → Likely need Opus or redesign
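
The formula and thresholds can be sketched in a few lines. The function names are hypothetical, and reading the rules as three non-overlapping bands (80 and above, 50 up to 80, below 50) is an interpretation of the list above.

```python
def success_rate(passed: int, total: int) -> float:
    """Success rate as a percentage of tests passed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed * 100 / total


def next_step(rate: float) -> str:
    """Map a Haiku success rate to the recommended next step."""
    if rate >= 80:
        return "keep haiku"
    if rate < 50:
        return "likely need opus or redesign"
    return "try sonnet"
```

With five test cases, 4 passes gives 80% and keeps Haiku, 3 passes gives 60% and suggests trying Sonnet, and 2 passes gives 40%, which points to Opus or a redesign.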
|
||||
|
||||
### Test with Sonnet (if needed)
|
||||
|
||||
```yaml
|
||||
# Upgrade to Sonnet
|
||||
model: sonnet
|
||||
|
||||
# Run same 5 tests
|
||||
# Compare results
|
||||
```
|
||||
|
||||
### Document Decision
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: work-issue
|
||||
model: haiku # Tested: 4/5 tests passed with Haiku (80%)
|
||||
---
|
||||
```
|
||||
|
||||
Or:
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: review-pr
|
||||
model: sonnet # Tested: Haiku 3/5 (60%), Sonnet 4/5 (80%)
|
||||
---
|
||||
```
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Pattern: Start Haiku, Upgrade if Needed
|
||||
|
||||
**Issue-worker agent evolution:**
|
||||
1. **V1 (Haiku):** 70% success - struggled with pattern matching
|
||||
2. **Analysis:** Added more examples, still 72%
|
||||
3. **V2 (Sonnet):** 82% success - better code understanding
|
||||
4. **Decision:** Keep Sonnet, document why
|
||||
|
||||
### Pattern: Haiku for Most, Sonnet for Complex
|
||||
|
||||
**Review-pr skill:**
|
||||
- Static analysis steps: Haiku could handle
|
||||
- Manual code review: Needs Sonnet judgment
|
||||
- **Decision:** Use Sonnet for whole skill (simplicity)
|
||||
|
||||
### Pattern: Split Complex Skills
|
||||
|
||||
**Instead of:** One complex skill using Opus
|
||||
|
||||
**Try:** Split into:
|
||||
- Haiku skill for orchestration
|
||||
- Sonnet agent for complex subtask
|
||||
- Saves cost (most work in Haiku)
|
||||
|
||||
## Model Selection Checklist
|
||||
|
||||
Before choosing a model:
|
||||
|
||||
- [ ] Tested with Haiku first
|
||||
- [ ] Measured success rate on 3-5 test cases
|
||||
- [ ] Tried improvements (scripts, validation, checklists)
|
||||
- [ ] Documented why this model is needed
|
||||
- [ ] Considered cost implications (12x/60x)
|
||||
- [ ] Considered speed implications (2.5x/5x slower)
|
||||
- [ ] Will re-test if Claude models improve
|
||||
|
||||
## Future-Proofing
|
||||
|
||||
**Models improve over time.**
|
||||
|
||||
Periodically re-test Sonnet/Opus skills with Haiku:
|
||||
- Haiku v2 might handle what Haiku v1 couldn't
|
||||
- Cost savings compound over time
|
||||
- Speed improvements are valuable
|
||||
|
||||
**Set a reminder:** Test Haiku again in 3-6 months.
|
||||