docs: Add VersionConflictError retry pattern examples

Add comprehensive examples demonstrating standard retry patterns for
handling version conflicts during optimistic concurrency control:

- Pattern 1: Simple exponential backoff (recommended for most cases)
- Pattern 2: State reload and merge (deterministic, idempotent updates)
- Pattern 3: Circuit breaker (cascading failure prevention)
- Pattern 4: Jittered backoff (thundering herd prevention)
- Pattern 5: Conflict analysis and monitoring

Includes complete, runnable examples and a guide to choosing the right
pattern for different scenarios. Documents best practices for monitoring
and debugging version conflicts.

Closes #62

Co-Authored-By: Claude Code <noreply@anthropic.com>
Claude Code
2026-01-13 21:26:05 +01:00
parent bcbec9ab94
commit f16a7c6237
2 changed files with 591 additions and 0 deletions

examples/README.md (new file, 235 lines)

@@ -0,0 +1,235 @@
# Aether Examples
Standard patterns and best practices for building with Aether.
## Version Conflict Retry Patterns
When using optimistic concurrency control with Aether's event store, version conflicts can occur when multiple writers attempt to save events for the same actor. The `VersionConflictError` provides full context about the conflict, enabling intelligent retry strategies.
### Understanding Version Conflicts
A version conflict occurs when:
- You attempt to save an event with version `N`
- But the actor already has a version >= `N`
Example:
```go
// Actor "order-123" currently has version 5
// Writer A reads version 5, creates version 6, saves successfully
// Writer B also reads version 5, creates version 6, attempts save
// -> VersionConflictError: current=6, attempted=6
```
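The same sequence, sketched as calls (using the `SaveEvent` API shown in the patterns below):
```go
// Writer A read version 5 and appends version 6; this save succeeds.
_ = eventStore.SaveEvent(&aether.Event{ActorID: "order-123", Version: 6})

// Writer B also read version 5 and appends version 6; this save conflicts.
err := eventStore.SaveEvent(&aether.Event{ActorID: "order-123", Version: 6})
// err wraps aether.ErrVersionConflict with CurrentVersion=6, AttemptedVersion=6.
```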
### Working with VersionConflictError
The `VersionConflictError` provides:
- `ActorID` - The actor that had the conflict
- `CurrentVersion` - The actual current version in the store
- `AttemptedVersion` - The version you tried to save
Example usage:
```go
err := eventStore.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
	var versionErr *aether.VersionConflictError
	if errors.As(err, &versionErr) {
		fmt.Printf("Conflict for actor %q: current=%d, attempted=%d\n",
			versionErr.ActorID, versionErr.CurrentVersion, versionErr.AttemptedVersion)

		// Retry with the store's current version rather than the stale one you read
		nextVersion := versionErr.CurrentVersion + 1
		event.Version = nextVersion
	}
}
```
### Recommended Patterns
#### Pattern 1: Simple Exponential Backoff (Recommended for Most Cases)
```go
const maxRetries = 5
const baseDelay = 10 * time.Millisecond

for attempt := 0; attempt < maxRetries; attempt++ {
	currentVersion, _ := eventStore.GetLatestVersion(actorID)

	event := &aether.Event{
		ActorID: actorID,
		Version: currentVersion + 1,
		// ...
	}

	err := eventStore.SaveEvent(event)
	if err == nil {
		return nil // Success!
	}
	if !errors.Is(err, aether.ErrVersionConflict) {
		return err // Different error, don't retry
	}

	// Exponential backoff: 10ms, 20ms, 40ms, 80ms, 160ms
	delay := baseDelay * time.Duration(1<<attempt)
	time.Sleep(delay)
}
return fmt.Errorf("max retries exceeded")
```
**Pros:**
- Simple to understand and implement
- Respects store capacity
- Good for most scenarios
**Cons:**
- Can cause thundering herd in high-concurrency scenarios
- May not work well if conflicts are due to logical issues
#### Pattern 2: State Reload and Merge
Use this pattern when you can merge concurrent changes:
```go
const maxRetries = 3

for attempt := 0; attempt < maxRetries; attempt++ {
	// Reload current state
	events, _ := eventStore.GetEvents(actorID, 0)
	aggregate := rebuildFromEvents(events)

	// Apply your update
	aggregate.Status = "shipped"

	// Attempt save with new version
	event := &aether.Event{
		ActorID: actorID,
		Version: aggregate.Version + 1,
		Data:    map[string]interface{}{"status": aggregate.Status},
	}

	err := eventStore.SaveEvent(event)
	if err == nil {
		return nil // Success!
	}
	if !errors.Is(err, aether.ErrVersionConflict) {
		return err
	}
	// Reload and retry (loop continues)
}
return fmt.Errorf("max retries exceeded")
```
**Pros:**
- Deterministic - will eventually succeed
- Can merge concurrent updates
- Good for business logic that's idempotent
**Cons:**
- More expensive (replaying events each attempt)
- Only works if updates can be safely retried
#### Pattern 3: Circuit Breaker for Cascading Failures
Use when you want to avoid hammering a saturated store:
```go
type CircuitBreaker struct {
	state            string // "closed", "open", "half-open"
	failures         int
	failureThreshold int
	lastFailureTime  time.Time
	cooldownTime     time.Duration
}

// ... circuit breaker methods sketched below ...

// Usage:
if !cb.canAttempt() {
	return fmt.Errorf("circuit breaker open")
}

err := eventStore.SaveEvent(event)
if err == nil {
	cb.recordSuccess()
} else if errors.Is(err, aether.ErrVersionConflict) {
	cb.recordFailure()
	if cb.failures >= cb.failureThreshold {
		cb.open()
	}
}
```
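A minimal sketch of the elided methods, using the fields declared above (simplified, and not safe for concurrent use without a mutex):
```go
func (cb *CircuitBreaker) canAttempt() bool {
	if cb.state == "open" {
		// After the cooldown has passed, allow a single probe attempt.
		if time.Since(cb.lastFailureTime) > cb.cooldownTime {
			cb.state = "half-open"
			return true
		}
		return false
	}
	return true
}

func (cb *CircuitBreaker) recordSuccess() {
	cb.state = "closed"
	cb.failures = 0
}

func (cb *CircuitBreaker) recordFailure() {
	cb.failures++
	cb.lastFailureTime = time.Now()
}

func (cb *CircuitBreaker) open() {
	cb.state = "open"
}
```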
**Pros:**
- Prevents cascading failures
- Allows store recovery time
- Good for distributed systems
**Cons:**
- More complex implementation
- May reject valid requests temporarily
#### Pattern 4: Jittered Backoff for High Concurrency
Add randomness to prevent thundering herd:
```go
exponentialDelay := baseDelay * time.Duration(1<<attempt)
jitter := time.Duration(rand.Int63n(int64(exponentialDelay)))
delay := exponentialDelay + jitter
time.Sleep(delay)
```
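Packaged as a helper, the same computation drops into the Pattern 1 loop in place of the fixed backoff (a sketch; `jitteredDelay` is not part of Aether):
```go
// jitteredDelay returns the exponential backoff for this attempt plus a
// random jitter of up to the same amount, so concurrent writers spread out.
func jitteredDelay(attempt int, baseDelay time.Duration) time.Duration {
	exponential := baseDelay * time.Duration(1<<attempt)
	jitter := time.Duration(rand.Int63n(int64(exponential)))
	return exponential + jitter
}
```
In the Pattern 1 loop, `time.Sleep(jitteredDelay(attempt, baseDelay))` replaces the fixed delay.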
**Pros:**
- Prevents synchronized retries
- Good for high-concurrency scenarios
**Cons:**
- Slightly more complex
- May increase total retry time
### Complete Example
See `version_conflict_retry.go` for complete, runnable examples of all patterns.
### When to Use Each Pattern
| Pattern | Use When | Avoid When |
|---------|----------|-----------|
| Exponential Backoff | Default choice for most apps | Store is consistently overloaded |
| State Reload | Updates can be safely replayed | Event replay is expensive |
| Circuit Breaker | Store is frequently saturated | You need immediate feedback |
| Jittered Backoff | Many concurrent writers | Single-threaded app |
### Monitoring Version Conflicts
Log and monitor version conflicts to understand contention patterns:
```go
var versionErr *aether.VersionConflictError
if errors.As(err, &versionErr) {
	log.WithFields(log.Fields{
		"actor_id":          versionErr.ActorID,
		"current_version":   versionErr.CurrentVersion,
		"attempted_version": versionErr.AttemptedVersion,
		"version_gap":       versionErr.CurrentVersion - versionErr.AttemptedVersion,
	}).Warn("Version conflict")

	// Alert if the gap is large: the store is far ahead of the version
	// we tried to write, which indicates a stale read.
	if versionErr.CurrentVersion-versionErr.AttemptedVersion > 5 {
		metrics.versionConflictLargeGap.Inc()
	}
}
```
### Best Practices
1. **Always check the error type** - Not all errors are version conflicts
2. **Use CurrentVersion for retries** - Compute the next version from the error instead of guessing
3. **Set reasonable retry limits** - Prevent infinite loops
4. **Monitor contention** - Track version conflicts to identify hotspots
5. **Consider your domain** - Some updates can be safely retried, others cannot
6. **Test concurrent scenarios** - Version conflicts are rare in single-threaded apps; see the test sketch below
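For the last point, a hypothetical test can race several goroutines against the same actor and check that conflicts surface (the test name and setup are illustrative; `eventStore` is assumed to be wired up elsewhere in the test package):
```go
func TestConcurrentWritersHitVersionConflict(t *testing.T) {
	const writers = 8
	var wg sync.WaitGroup
	var conflicts atomic.Int64

	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Every writer reads the latest version and tries to append the next
			// one, so at most one of them can win each round.
			current, _ := eventStore.GetLatestVersion("order-123")
			err := eventStore.SaveEvent(&aether.Event{ActorID: "order-123", Version: current + 1})
			if errors.Is(err, aether.ErrVersionConflict) {
				conflicts.Add(1)
			}
		}()
	}
	wg.Wait()

	if conflicts.Load() == 0 {
		t.Log("no conflicts observed; rerun with more writers to exercise the retry path")
	}
}
```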
### References
- [CLAUDE.md](../CLAUDE.md) - Architecture and event versioning semantics
- [Event Sourcing Patterns](../vision.md) - Domain-driven design approach