docs: Add VersionConflictError retry pattern examples

Add comprehensive examples demonstrating standard retry patterns for handling
version conflicts during optimistic concurrency control:

- Pattern 1: Simple exponential backoff (recommended for most cases)
- Pattern 2: State reload and merge (deterministic, idempotent updates)
- Pattern 3: Circuit breaker (cascading failure prevention)
- Pattern 4: Jittered backoff (thundering herd prevention)
- Pattern 5: Conflict analysis and monitoring

Includes complete, runnable examples and a guide to choosing the right pattern
for different scenarios. Documents best practices for monitoring and debugging
version conflicts.

Closes #62

Co-Authored-By: Claude Code <noreply@anthropic.com>

examples/README.md
# Aether Examples

Standard patterns and best practices for building with Aether.

## Version Conflict Retry Patterns

When using optimistic concurrency control with Aether's event store, version conflicts can occur when multiple writers attempt to save events for the same actor. The `VersionConflictError` provides full context about the conflict, enabling intelligent retry strategies.

### Understanding Version Conflicts

A version conflict occurs when:
- You attempt to save an event with version `N`
- But the actor already has a version >= `N`

Example:
```go
// Actor "order-123" currently has version 5
// Writer A reads version 5, creates version 6, saves successfully
// Writer B also read version 5, creates version 6, attempts save
// -> VersionConflictError: current=6, attempted=6
```
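
The scenario above can be reproduced with a tiny in-memory store. This is only a sketch: `memStore` and `conflictError` are hypothetical stand-ins that mirror the fields described in this guide, not Aether's real types, and the real store may apply additional checks.

```go
package main

import (
	"errors"
	"fmt"
)

// conflictError is a hypothetical stand-in for aether.VersionConflictError.
type conflictError struct {
	ActorID          string
	CurrentVersion   int64
	AttemptedVersion int64
}

func (e *conflictError) Error() string {
	return fmt.Sprintf("version conflict for %q: current=%d, attempted=%d",
		e.ActorID, e.CurrentVersion, e.AttemptedVersion)
}

// memStore tracks only the latest version per actor (illustration only).
type memStore struct {
	versions map[string]int64
}

// save rejects any version that is not strictly greater than the current one.
func (s *memStore) save(actorID string, version int64) error {
	if current := s.versions[actorID]; version <= current {
		return &conflictError{ActorID: actorID, CurrentVersion: current, AttemptedVersion: version}
	}
	s.versions[actorID] = version
	return nil
}

func main() {
	store := &memStore{versions: map[string]int64{"order-123": 5}}

	// Writers A and B both read version 5 and both try to write version 6.
	errA := store.save("order-123", 6)
	errB := store.save("order-123", 6)
	fmt.Println("writer A:", errA)

	var ce *conflictError
	if errors.As(errB, &ce) {
		fmt.Println("writer B:", ce)
	}
}
```

Writer A's save succeeds, while writer B's is rejected because the stored version has already reached the version it tried to write.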

### Working with VersionConflictError

The `VersionConflictError` provides:
- `ActorID` - The actor that had the conflict
- `CurrentVersion` - The actual current version in the store
- `AttemptedVersion` - The version you tried to save

Example usage:
```go
err := eventStore.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
	var versionErr *aether.VersionConflictError
	if errors.As(err, &versionErr) {
		fmt.Printf("Conflict for actor %q: current=%d, attempted=%d\n",
			versionErr.ActorID, versionErr.CurrentVersion, versionErr.AttemptedVersion)

		// Base the retry on CurrentVersion, not on the stale version read earlier
		event.Version = versionErr.CurrentVersion + 1
		// ... rebuild the event if its payload depends on state, then save again
	}
}
```

### Recommended Patterns

#### Pattern 1: Simple Exponential Backoff (Recommended for Most Cases)

```go
const maxRetries = 5
const baseDelay = 10 * time.Millisecond

for attempt := 0; attempt < maxRetries; attempt++ {
	currentVersion, err := eventStore.GetLatestVersion(actorID)
	if err != nil {
		return err // Can't determine the next version
	}

	event := &aether.Event{
		ActorID: actorID,
		Version: currentVersion + 1,
		// ...
	}

	err = eventStore.SaveEvent(event)
	if err == nil {
		return nil // Success!
	}

	if !errors.Is(err, aether.ErrVersionConflict) {
		return err // Different error, don't retry
	}

	// Exponential backoff: 10ms, 20ms, 40ms, 80ms, 160ms
	delay := baseDelay << attempt
	time.Sleep(delay)
}
return fmt.Errorf("max retries (%d) exceeded", maxRetries)
```

**Pros:**
- Simple to understand and implement
- Backs off progressively, easing pressure on the store
- Good for most scenarios

**Cons:**
- Can cause thundering herd in high-concurrency scenarios (see Pattern 4)
- Won't help if conflicts stem from a logic error, such as always writing a stale version
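
The backoff loop can also be factored into a reusable helper so that call sites only supply the read-and-save step. The following is a minimal sketch, not part of Aether: `retryOnConflict` and `errConflict` are hypothetical names, with `errConflict` standing in for `aether.ErrVersionConflict`, and `maxRetries` is assumed to be at least 1.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConflict is a hypothetical stand-in for aether.ErrVersionConflict.
var errConflict = errors.New("version conflict")

// retryOnConflict calls op until it succeeds, returns a non-conflict error,
// or maxRetries attempts have been made. The delay doubles after each conflict.
func retryOnConflict(maxRetries int, baseDelay time.Duration, op func() error) error {
	var err error
	for attempt := 0; attempt < maxRetries; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err // not a conflict: retrying will not help
		}
		if attempt < maxRetries-1 {
			time.Sleep(baseDelay << attempt) // no sleep after the final attempt
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxRetries, err)
}

func main() {
	calls := 0
	err := retryOnConflict(5, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errConflict // simulate two conflicts before success
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

In real use, `op` would re-read the latest version and build a fresh event on every call, as the loop in Pattern 1 does. Wrapping the last error with `%w` keeps `errors.Is` working for callers that want to distinguish exhausted retries from other failures.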

#### Pattern 2: State Reload and Merge

Use this pattern when you can merge concurrent changes:

```go
const maxRetries = 3

for attempt := 0; attempt < maxRetries; attempt++ {
	// Reload current state
	events, err := eventStore.GetEvents(actorID, 0)
	if err != nil {
		return err
	}
	aggregate := rebuildFromEvents(events)

	// Apply your update
	aggregate.Status = "shipped"

	// Attempt save with new version
	event := &aether.Event{
		ActorID: actorID,
		Version: aggregate.Version + 1,
		Data:    map[string]interface{}{"status": aggregate.Status},
	}

	err = eventStore.SaveEvent(event)
	if err == nil {
		return nil // Success!
	}

	if !errors.Is(err, aether.ErrVersionConflict) {
		return err
	}

	// Reload and retry (loop continues)
}
return fmt.Errorf("max retries (%d) exceeded", maxRetries)
```

**Pros:**
- Deterministic: each attempt applies the update to the latest stored state
- Can merge concurrent updates
- Good for business logic that's idempotent

**Cons:**
- More expensive (replaying events each attempt)
- Only works if updates can be safely retried

#### Pattern 3: Circuit Breaker for Cascading Failures

Use when you want to avoid hammering a saturated store:

```go
type CircuitBreaker struct {
	state            string // "closed", "open", "half-open"
	failureCount     int
	failureThreshold int
	lastFailureTime  time.Time
	cooldownTime     time.Duration
}

// ... implement circuit breaker logic ...

// Usage:
if !cb.canAttempt() {
	return fmt.Errorf("circuit breaker open")
}

err := eventStore.SaveEvent(event)
if err == nil {
	cb.recordSuccess()
} else if errors.Is(err, aether.ErrVersionConflict) {
	cb.recordFailure()
	if cb.failureCount >= cb.failureThreshold {
		cb.open()
	}
}
```

**Pros:**
- Prevents cascading failures
- Allows store recovery time
- Good for distributed systems

**Cons:**
- More complex implementation
- May reject valid requests temporarily
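
The breaker logic elided in the snippet above could look roughly like the following. This is one possible sketch, not the implementation shipped with the examples: the type and constructor names are hypothetical, the threshold check is folded into `recordFailure` instead of a separate `open()` call, and the type is not safe for concurrent use without a mutex.

```go
package main

import (
	"fmt"
	"time"
)

// circuitBreaker is a hypothetical, single-goroutine sketch; guard it with a
// sync.Mutex before sharing it between goroutines.
type circuitBreaker struct {
	state            string // "closed", "open", or "half-open"
	failureCount     int
	failureThreshold int
	lastFailureTime  time.Time
	cooldownTime     time.Duration
}

func newCircuitBreaker(threshold int, cooldown time.Duration) *circuitBreaker {
	return &circuitBreaker{state: "closed", failureThreshold: threshold, cooldownTime: cooldown}
}

// canAttempt reports whether a save may be tried. An open breaker moves to
// half-open once the cooldown has elapsed, letting a probe request through.
func (cb *circuitBreaker) canAttempt() bool {
	if cb.state == "open" {
		if time.Since(cb.lastFailureTime) < cb.cooldownTime {
			return false
		}
		cb.state = "half-open"
	}
	return true
}

// recordSuccess closes the breaker and clears the failure count.
func (cb *circuitBreaker) recordSuccess() {
	cb.state = "closed"
	cb.failureCount = 0
}

// recordFailure counts a failure and opens the breaker when the threshold is
// reached, or immediately if a half-open probe failed.
func (cb *circuitBreaker) recordFailure() {
	cb.failureCount++
	cb.lastFailureTime = time.Now()
	if cb.state == "half-open" || cb.failureCount >= cb.failureThreshold {
		cb.state = "open"
	}
}

func main() {
	cb := newCircuitBreaker(2, 50*time.Millisecond)
	cb.recordFailure()
	cb.recordFailure()
	fmt.Println("after two failures:", cb.state)

	allowed := cb.canAttempt()
	fmt.Println("attempt allowed during cooldown:", allowed)

	time.Sleep(60 * time.Millisecond)
	allowed = cb.canAttempt()
	fmt.Println("attempt allowed after cooldown:", allowed, "state:", cb.state)

	cb.recordSuccess()
	fmt.Println("after successful probe:", cb.state)
}
```

A failed probe in the half-open state reopens the breaker straight away, so a store that is still saturated gets another full cooldown before the next attempt.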

#### Pattern 4: Jittered Backoff for High Concurrency

Add randomness to prevent thundering herd:

```go
// Reuses baseDelay and attempt from the Pattern 1 loop
exponentialDelay := baseDelay << attempt
jitter := time.Duration(rand.Int63n(int64(exponentialDelay)))
delay := exponentialDelay + jitter
time.Sleep(delay)
```

**Pros:**
- Prevents synchronized retries
- Good for high-concurrency scenarios

**Cons:**
- Slightly more complex
- Can nearly double each delay, increasing total retry time
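
For loops with many attempts it can help to cap the exponential part so the delay stops growing. The helper below is a sketch under assumptions: `jitteredDelay` and the `maxDelay` parameter are hypothetical and do not appear in the examples file, and `base` must be greater than zero because `rand.Int63n` panics on a non-positive argument.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredDelay returns a delay in [d, 2d), where d is base doubled once per
// attempt and capped at maxDelay. base must be greater than zero.
func jitteredDelay(attempt int, base, maxDelay time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2 // stop doubling as soon as the cap is reached
	}
	if d > maxDelay {
		d = maxDelay
	}
	// Same jitter as above: a random extra wait in [0, d).
	return d + time.Duration(rand.Int63n(int64(d)))
}

func main() {
	for attempt := 0; attempt < 6; attempt++ {
		delay := jitteredDelay(attempt, 10*time.Millisecond, 100*time.Millisecond)
		fmt.Printf("attempt %d: sleep %v\n", attempt, delay)
	}
}
```

With a 10ms base and a 100ms cap, the exponential part grows 10, 20, 40, 80 and then stays at 100ms, so no single wait reaches 200ms.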

### Complete Example

See `version_conflict_retry.go` for complete, runnable examples of all patterns.

### When to Use Each Pattern

| Pattern | Use When | Avoid When |
|---------|----------|------------|
| Exponential Backoff | Default choice for most apps | Store is consistently overloaded |
| State Reload | Updates can be safely replayed | Event replay is expensive |
| Circuit Breaker | Store is frequently saturated | You need immediate feedback |
| Jittered Backoff | Many concurrent writers | Single-threaded app |

### Monitoring Version Conflicts

Log and monitor version conflicts to understand contention patterns:

```go
var versionErr *aether.VersionConflictError
if errors.As(err, &versionErr) {
	// How far the store had moved past the version we tried to write.
	// A conflict implies CurrentVersion >= AttemptedVersion, so this is never negative.
	versionGap := versionErr.CurrentVersion - versionErr.AttemptedVersion

	log.WithFields(log.Fields{
		"actor_id":          versionErr.ActorID,
		"current_version":   versionErr.CurrentVersion,
		"attempted_version": versionErr.AttemptedVersion,
		"version_gap":       versionGap,
	}).Warn("Version conflict")

	// Alert if gap is too large (indicates stale read)
	if versionGap > 5 {
		metrics.versionConflictLargeGap.Inc()
	}
}
```
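
Beyond logging individual conflicts, counting them per actor makes hotspots easy to spot. The tracker below is a hypothetical helper, not part of Aether or of the examples file; a production setup would more likely export these counts through its existing metrics library.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// conflictTracker counts version conflicts per actor (hypothetical helper).
type conflictTracker struct {
	mu     sync.Mutex
	counts map[string]int
}

func newConflictTracker() *conflictTracker {
	return &conflictTracker{counts: make(map[string]int)}
}

// record notes one conflict for actorID; it is safe for concurrent use.
func (t *conflictTracker) record(actorID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counts[actorID]++
}

// hotspots returns the actors with at least minConflicts conflicts,
// most contended first, with ties broken by actor ID.
func (t *conflictTracker) hotspots(minConflicts int) []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	var ids []string
	for id, n := range t.counts {
		if n >= minConflicts {
			ids = append(ids, id)
		}
	}
	sort.Slice(ids, func(i, j int) bool {
		if t.counts[ids[i]] != t.counts[ids[j]] {
			return t.counts[ids[i]] > t.counts[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	tracker := newConflictTracker()
	for _, id := range []string{"order-123", "order-7", "order-123", "order-123", "order-7", "cart-1"} {
		tracker.record(id)
	}
	fmt.Println(tracker.hotspots(2)) // prints [order-123 order-7]
}
```

Calling `record(versionErr.ActorID)` next to the log statement would feed the tracker from the same code path.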

### Best Practices

1. **Always check the error type** - Not all errors are version conflicts
2. **Use CurrentVersion for retries** - Derive the next version from the error instead of guessing or hardcoding it
3. **Set reasonable retry limits** - Prevent infinite loops
4. **Monitor contention** - Track version conflicts to identify hotspots
5. **Consider your domain** - Some updates can be safely retried, others cannot
6. **Test concurrent scenarios** - Version conflicts rarely appear in single-threaded tests, so exercise multiple concurrent writers
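
One way to exercise concurrent writers is to start several goroutines against a small in-memory store and check that every write eventually lands. This is a sketch with hypothetical names (`memStore`, `runWriters`, `errConflict`); it does not use Aether's store, and the retry loop is unbounded only because the check needs every writer to succeed.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errConflict is a hypothetical stand-in for aether.ErrVersionConflict.
var errConflict = errors.New("version conflict")

// memStore is a hypothetical in-memory store holding one actor's version.
type memStore struct {
	mu      sync.Mutex
	version int64
}

func (s *memStore) latest() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.version
}

// save rejects any version that is not greater than the stored one.
func (s *memStore) save(version int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if version <= s.version {
		return errConflict
	}
	s.version = version
	return nil
}

// runWriters starts n goroutines that each append one event, retrying on
// conflict. It returns the final version and the number of conflicts seen.
func runWriters(n int) (int64, int) {
	store := &memStore{}
	var wg sync.WaitGroup
	var mu sync.Mutex
	conflicts := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for { // unbounded for the test; production code should cap retries
				if err := store.save(store.latest() + 1); err == nil {
					return
				}
				mu.Lock()
				conflicts++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return store.latest(), conflicts
}

func main() {
	version, conflicts := runWriters(50)
	fmt.Printf("final version=%d, conflicts=%d\n", version, conflicts)
}
```

The final version should equal the number of writers, while the conflict count varies from run to run depending on scheduling. Running such a test with Go's race detector (`go test -race`) also catches unsynchronized access in the retry code.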

### References

- [CLAUDE.md](../CLAUDE.md) - Architecture and event versioning semantics
- [Event Sourcing Patterns](../vision.md) - Domain-driven design approach