Compare commits
39 Commits
d33477c02a
...
update-age
| SHA1 |
|---|
| 0b44f6664a |
| b481dae0b6 |
| 6041479286 |
| 7487a5f3af |
| b67417ac68 |
| 5b5083dcf8 |
| 6549125f3d |
| 464fed67ec |
| 46e1c44017 |
| bcbec9ab94 |
| de30e1ef1b |
| b9e641c2aa |
| ec3db5668f |
| 20d688f2a2 |
| fd1938672e |
| 6de897ef60 |
| 271f5db444 |
| 18ea677585 |
| aae0f2413d |
| dd5deb7944 |
| f966f01dd3 |
| 7085c682c3 |
| e66fa40b3a |
| ef73fb6bfd |
| e3dbe3d52d |
| 9e238c5e70 |
| adead7e980 |
| f0f8978079 |
| b6de82c8ee |
| 655ee0ac49 |
| f62964bf3b |
| 484e3ced2e |
| 2bf699909b |
| 200dd5d551 |
| 4666bb6503 |
| 8df36cac7a |
| b759c7fb97 |
| eaff315782 |
| c757bb76f3 |
64
.product-strategy/ISSUE_MIGRATION.md
Normal file
@@ -0,0 +1,64 @@
# Issue: Implement Actor Migration Between Cluster Nodes

## Problem

When nodes join or leave the cluster, actors need to be migrated to maintain even distribution. Currently:

- `handleRebalanceRequest` in `cluster/manager.go:150` is empty
- `handleMigrationRequest` in `cluster/manager.go:167` is empty
- `RebalanceShards` in `cluster/shard.go:211` returns unchanged map
- `SendMessage` in `cluster/distributed.go:139` ignores sharding

## Required Implementation

### 1. Rebalance Algorithm (cluster/shard.go)
Implement `ConsistentHashPlacement.RebalanceShards` to:
- Calculate new shard assignments based on active nodes
- Identify actors needing migration
- Generate migration plan with source/dest nodes

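The plan-generation step can be sketched as a diff between the current and the desired shard assignment. This is only an illustration under stated assumptions: `Migration` and `planMigrations` are hypothetical names that do not come from `cluster/shard.go`, and the desired map is taken as given rather than computed from the hash ring.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration describes moving one shard from a source node to a destination
// node. Hypothetical type for illustration; not taken from cluster/types.go.
type Migration struct {
	ShardID int
	From    string
	To      string
}

// planMigrations compares the current and desired shard assignments and
// returns one Migration per shard whose owner changes. Shards that have no
// current owner are skipped because there is nothing to move.
func planMigrations(current, desired map[int]string) []Migration {
	var plan []Migration
	for shardID, to := range desired {
		from, ok := current[shardID]
		if ok && from != to {
			plan = append(plan, Migration{ShardID: shardID, From: from, To: to})
		}
	}
	// Sort for a deterministic plan; Go map iteration order is unspecified.
	sort.Slice(plan, func(i, j int) bool { return plan[i].ShardID < plan[j].ShardID })
	return plan
}

func main() {
	current := map[int]string{0: "node-a", 1: "node-a", 2: "node-b"}
	desired := map[int]string{0: "node-a", 1: "node-c", 2: "node-b"}
	for _, m := range planMigrations(current, desired) {
		fmt.Printf("shard %d: %s -> %s\n", m.ShardID, m.From, m.To)
	}
}
```

Keeping the plan a pure function of two maps makes it easy to test without a running cluster; mapping shards back to the actors they contain is left to the caller.
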
### 2. Migration Coordinator (cluster/manager.go)
Implement `handleRebalanceRequest` to:
- Accept migration plan from leader
- For each actor in plan:
  1. Pause incoming messages
  2. Capture actor state (replay events up to current version)
  3. Serialize state
  4. Send migration request to destination node
  5. Wait for ack
  6. Delete actor from current node
- Track migration status via `ActorMigration.Status`

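The per-actor phases above could be driven by a small function that deletes the local actor only after the destination has acknowledged, and resumes it on any earlier failure. This is a sketch, not the project's coordinator: the `migrationSteps` interface, the `fakeSteps` stand-in, and the `"failed"`/`"completed"` status strings are all assumptions made for illustration, since the values of `ActorMigration.Status` are not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// migrationSteps abstracts the phases of a single actor migration.
// Hypothetical interface for illustration only.
type migrationSteps interface {
	Pause(actorID string) error
	Capture(actorID string) ([]byte, error) // capture and serialize state
	SendAndAwaitAck(actorID, dest string, state []byte) error
	Delete(actorID string) error
	Resume(actorID string) error
}

// migrateActor runs the phases in order. The actor is deleted locally only
// after the destination acknowledged; on an earlier failure it is resumed.
func migrateActor(s migrationSteps, actorID, dest string) (string, error) {
	if err := s.Pause(actorID); err != nil {
		return "failed", fmt.Errorf("pause %s: %w", actorID, err)
	}
	state, err := s.Capture(actorID)
	if err == nil {
		err = s.SendAndAwaitAck(actorID, dest, state)
	}
	if err != nil {
		// Roll back: the source node keeps serving the actor.
		if resumeErr := s.Resume(actorID); resumeErr != nil {
			err = errors.Join(err, resumeErr)
		}
		return "failed", fmt.Errorf("migrate %s to %s: %w", actorID, dest, err)
	}
	if err := s.Delete(actorID); err != nil {
		return "failed", fmt.Errorf("delete %s: %w", actorID, err)
	}
	return "completed", nil
}

// fakeSteps is an in-memory stand-in used to exercise the sketch.
type fakeSteps struct {
	failSend         bool
	resumed, deleted bool
}

func (f *fakeSteps) Pause(string) error             { return nil }
func (f *fakeSteps) Capture(string) ([]byte, error) { return []byte("{}"), nil }
func (f *fakeSteps) SendAndAwaitAck(string, string, []byte) error {
	if f.failSend {
		return errors.New("destination unreachable")
	}
	return nil
}
func (f *fakeSteps) Delete(string) error { f.deleted = true; return nil }
func (f *fakeSteps) Resume(string) error { f.resumed = true; return nil }

func main() {
	status, err := migrateActor(&fakeSteps{failSend: true}, "actor-123", "node-b")
	fmt.Println(status, err)
}
```

Ordering the delete last means a crash mid-migration leaves at most a duplicate on the destination rather than a lost actor, which is usually the easier failure to clean up.
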
### 3. Cross-Node Message Routing (cluster/distributed.go)
Implement proper routing in `SendMessage`:
- Use `GetActorNode(actorID)` to determine target node
- If remote: marshal message, send via NATS to target node
- If local: send to local runtime
- Route response back to caller if needed

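The routing decision itself is small and can be sketched independently of NATS. The `router` type and its function fields below are hypothetical stand-ins: `getActorNode` plays the role of `GetActorNode`, `sendLocal` the local runtime, and `sendRemote` whatever publishes to the target node.

```go
package main

import "fmt"

// router holds the pieces the routing decision needs. All fields are
// hypothetical stand-ins for the real cluster components.
type router struct {
	localNode    string
	getActorNode func(actorID string) string
	sendLocal    func(actorID string, payload []byte) error
	sendRemote   func(node, actorID string, payload []byte) error
}

// SendMessage delivers locally when this node owns the actor and forwards
// to the owning node otherwise.
func (r *router) SendMessage(actorID string, payload []byte) error {
	target := r.getActorNode(actorID)
	if target == "" {
		return fmt.Errorf("no node owns actor %q", actorID)
	}
	if target == r.localNode {
		return r.sendLocal(actorID, payload)
	}
	return r.sendRemote(target, actorID, payload)
}

func main() {
	owners := map[string]string{"actor-1": "node-a", "actor-2": "node-b"}
	r := &router{
		localNode:    "node-a",
		getActorNode: func(id string) string { return owners[id] },
		sendLocal: func(id string, _ []byte) error {
			fmt.Println("local delivery:", id)
			return nil
		},
		sendRemote: func(node, id string, _ []byte) error {
			fmt.Println("remote delivery:", id, "via", node)
			return nil
		},
	}
	for _, id := range []string{"actor-1", "actor-2", "actor-3"} {
		if err := r.SendMessage(id, []byte("ping")); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```

Routing a response back to the caller is not covered by this sketch.
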
## Suggested Approach

1. **Define message types** for actor migration requests/responses in `cluster/types.go`
2. **Implement state capture** - replay events to get current state
3. **Implement state restore** - deserialize and restore actor state
4. **Implement coordinator** - manage migration phases
5. **Add error handling** - handle failed migrations, retries, cleanup
6. **Add tests** - test migration with mock NATS

## Related Files

- `cluster/manager.go:150` - handleRebalanceRequest (empty)
- `cluster/manager.go:167` - handleMigrationRequest (empty)
- `cluster/shard.go:211` - RebalanceShards (stub)
- `cluster/distributed.go:139` - SendMessage (simplified)
- `cluster/types.go:108` - ActorMigration struct

## Acceptance Criteria

- [ ] `RebalanceShards` returns new shard map with actor assignments
- [ ] `handleRebalanceRequest` processes migration plan
- [ ] `handleMigrationRequest` accepts actor migrations
- [ ] `SendMessage` routes to correct node
- [ ] Actors can be migrated with state preserved
- [ ] Failed migrations are handled gracefully
- [ ] Integration test with multi-node cluster
117
.product-strategy/ISSUE_SNAPSHOTS.md
Normal file
@@ -0,0 +1,117 @@
# Issue: Add Snapshot Support to Event Sourcing Workflow

## Problem

`SnapshotStore` interface is defined but snapshots are not integrated into the event sourcing workflow. This means:
- Actors with many events must replay entire history
- No performance optimization for long-lived actors
- Snapshots exist as API but are not used

## Current State

- `EventStoreWithErrors` in `event.go:235` - no snapshot methods
- `SnapshotStore` interface in `event.go:245` - defined but not widely used
- `JetStreamEventStore.GetLatestSnapshot` and `SaveSnapshot` implemented but not called automatically
- `InMemoryEventStore` has snapshot methods but no lifecycle management

## Required Implementation

### 1. Snapshot Strategy
Define when to create snapshots:
- Fixed interval (e.g., every 100 events)
- Version-based (e.g., every 50 versions)
- Hybrid: version-based with min/max bounds

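A version-based trigger can be sketched as a small policy type. `SnapshotPolicy` and `ShouldSnapshot` are hypothetical names, not part of the existing API. Note that a version delta equals an event count only when versions are consecutive.

```go
package main

import "fmt"

// SnapshotPolicy decides when a new snapshot is due.
// Hypothetical type for illustration only.
type SnapshotPolicy struct {
	// Interval is the number of versions that must accumulate since the
	// last snapshot before a new one is taken. Zero or less disables it.
	Interval int64
}

// ShouldSnapshot reports whether enough versions have passed since the
// last snapshot.
func (p SnapshotPolicy) ShouldSnapshot(lastSnapshotVersion, currentVersion int64) bool {
	if p.Interval <= 0 {
		return false // snapshots disabled
	}
	return currentVersion-lastSnapshotVersion >= p.Interval
}

func main() {
	p := SnapshotPolicy{Interval: 100}
	fmt.Println(p.ShouldSnapshot(0, 99))
	fmt.Println(p.ShouldSnapshot(0, 100))
	fmt.Println(p.ShouldSnapshot(100, 150))
}
```
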
### 2. State Capture
Add method to capture actor state:
```go
// CaptureState rebuilds actor state by replaying events and returns it
CaptureState(actorID string, fromVersion int64) (map[string]interface{}, error)
```

### 3. Snapshot Store Extension
Extend `EventStoreWithErrors` to include snapshots:
```go
type EventStoreWithSnapshots interface {
	EventStoreWithErrors
	GetLatestSnapshot(actorID string) (*ActorSnapshot, error)
	SaveSnapshot(snapshot *ActorSnapshot) error
}
```

### 4. Snapshot Workflow
Modify event retrieval to use snapshots:
```go
GetEvents(actorID string, fromVersion int64) ([]*Event, error) {
	// 1. Try to get latest snapshot
	snapshot, _ := store.GetLatestSnapshot(actorID)

	// 2. If snapshot exists and version <= fromVersion:
	//    - Return events from snapshot version + 1
	// 3. Else:
	//    - Replay all events from version 0
}
```

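One common way to combine a snapshot with replay, offered here as a possible reading rather than the intended design, is to start from the snapshot's state and apply only events newer than the snapshot's version. The `Event` and `Snapshot` types below are local simplifications, and the key-by-key merge stands in for application-specific logic.

```go
package main

import "fmt"

// Event and Snapshot are simplified local types for this sketch; they are
// not the aether.Event and aether.ActorSnapshot definitions.
type Event struct {
	Version int64
	Data    map[string]interface{}
}

type Snapshot struct {
	Version int64
	State   map[string]interface{}
}

// rebuildState starts from the snapshot (if any) and applies every event
// with a higher version. Events are assumed to be sorted by version.
func rebuildState(snap *Snapshot, events []Event) (map[string]interface{}, int64) {
	state := make(map[string]interface{})
	var version int64
	if snap != nil {
		for k, v := range snap.State {
			state[k] = v
		}
		version = snap.Version
	}
	for _, e := range events {
		if e.Version <= version {
			continue // already reflected in the snapshot
		}
		// Placeholder for application logic: last write wins per key.
		for k, v := range e.Data {
			state[k] = v
		}
		version = e.Version
	}
	return state, version
}

func main() {
	snap := &Snapshot{Version: 2, State: map[string]interface{}{"status": "placed"}}
	events := []Event{
		{Version: 1, Data: map[string]interface{}{"status": "draft"}},
		{Version: 3, Data: map[string]interface{}{"status": "shipped"}},
	}
	state, version := rebuildState(snap, events)
	fmt.Println(state["status"], version)
}
```
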
## Suggested Implementation

### 1. Add CaptureState to EventStore interface
In `event.go`, extend `EventStore` or create `StateStore` interface:
```go
type StateStore interface {
	EventStore
	CaptureState(actorID string, fromVersion int64) (map[string]interface{}, error)
}
```

### 2. Implement CaptureState
In `store/jetstream.go`:
```go
func (jes *JetStreamEventStore) CaptureState(actorID string, fromVersion int64) (map[string]interface{}, error) {
	// Replay events and build state (application logic needed here)
	events, err := jes.GetEvents(actorID, fromVersion)
	if err != nil {
		return nil, err
	}

	state := make(map[string]interface{})
	for _, event := range events {
		// Need application logic to convert events to state
		_ = event
	}
	return state, nil
}
```

### 3. Add Snapshot Helper
Create snapshot utilities:
```go
// CreateSnapshot creates snapshot from state
func CreateSnapshot(actorID string, version int64, state map[string]interface{}) *ActorSnapshot {
	return &ActorSnapshot{
		ActorID:   actorID,
		Version:   version,
		State:     state,
		Timestamp: time.Now(),
	}
}
```

### 4. Modify GetEvents
Update `GetEvents` in both stores to use snapshots when beneficial.

## Snapshots Workflow Example

```
1. Actor has 1000 events
2. Every 100 events, create snapshot
3. Actor reaches version 1000, snapshot at version 1000
4. Request events from version 900:
   - Get snapshot at version 1000? No (version too high)
   - Replay 900->1000 events (only 100 events)
5. Request events from version 50:
   - Get latest snapshot at version 1000? Yes (version > 50)
   - Use snapshot as base
   - Replay 1000->1000 events (none)
```

## Acceptance Criteria

- [ ] `CaptureState` method added to event store
- [ ] Snapshots created at configured intervals
- [ ] `GetEvents` uses snapshots to optimize replay
- [ ] Snapshot workflow tested with long-lived actors
- [ ] Configuration for snapshot interval/version
- [ ] Metrics: snapshot count, average replay size
100
.product-strategy/ISSUE_VM_RUNTIME.md
Normal file
@@ -0,0 +1,100 @@
# Issue: Implement VM/Runtime for Actors

## Problem

Only interfaces exist for `Runtime` and `VirtualMachine` in `cluster/types.go` and `cluster/distributed.go`, but no actual implementation. Actors cannot be created, started, stopped, or have their state managed.

## Required Components

### 1. VM Implementation (cluster/vm.go - new)
```go
type VirtualMachine struct {
	actorID    string
	eventStore aether.EventStore
	state      map[string]interface{}
	version    int64
}
```

Methods needed:
- `GetID()`, `GetActorID()`, `GetState()` - already in interface
- `Start()` - replay events to rebuild state
- `ProcessEvent(event *aether.Event)` - apply event to state
- `Stop()` - persist final state
- `GetVersion()` - current event version

### 2. Runtime Implementation (cluster/runtime.go - new)
```go
type Runtime struct {
	natsConn   *nats.Conn
	eventStore aether.EventStore
	vmRegistry VMRegistry // map[actorID]*VirtualMachine
	config     RuntimeConfig
}
```

Methods needed:
- `Start()` - initialize and start processing
- `LoadModel(model eventstorming.Model)` - register domain types
- `SendMessage(message RuntimeMessage)` - route to appropriate VM
- `GetActiveVMs()` - return map of active VMs
- `CreateVM(actorID string)` - create new VM instance
- `StopVM(actorID string)` - persist and stop VM

### 3. Event Processing
- Subscribe to actor's event stream
- Replay events to build initial state
- Apply new events as they arrive
- Handle event versions and conflicts

## Suggested Design

### VM Lifecycle
```
1. Actor message arrives for actor-123
2. Runtime checks if VM exists for actor-123
3. If not, create VM:
   - Replay events from event store
   - Rebuild state
4. Route message to VM
5. VM processes message -> creates new events
6. Events persisted to event store
7. VM state updated
```

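Steps 2 and 3 of the lifecycle amount to a get-or-create lookup guarded by a lock. The sketch below assumes a hypothetical `vmRuntime` type whose `replay` function stands in for reading the event store; none of these names come from the cluster package.

```go
package main

import (
	"fmt"
	"sync"
)

// vm is a minimal stand-in for a VirtualMachine.
type vm struct {
	actorID string
	version int64
}

// vmRuntime keeps one VM per actor. The replay field is a hypothetical hook
// that rebuilds state from the event store and returns the latest version.
type vmRuntime struct {
	mu     sync.Mutex
	vms    map[string]*vm
	replay func(actorID string) (int64, error)
}

// getOrCreateVM returns the existing VM for the actor, or replays its
// events and registers a new VM. The lock prevents two concurrent messages
// from creating duplicate VMs for the same actor.
func (r *vmRuntime) getOrCreateVM(actorID string) (*vm, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if existing, ok := r.vms[actorID]; ok {
		return existing, nil
	}
	version, err := r.replay(actorID)
	if err != nil {
		return nil, fmt.Errorf("replay %s: %w", actorID, err)
	}
	created := &vm{actorID: actorID, version: version}
	r.vms[actorID] = created
	return created, nil
}

func main() {
	replays := 0
	r := &vmRuntime{
		vms:    make(map[string]*vm),
		replay: func(string) (int64, error) { replays++; return 42, nil },
	}
	first, _ := r.getOrCreateVM("actor-123")
	second, _ := r.getOrCreateVM("actor-123")
	fmt.Println(first == second, first.version, replays)
}
```

Holding a single lock across the replay keeps the sketch short but serializes VM creation for all actors; a per-actor lock would be a natural refinement.
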
### State Management
- State derived from event replay
- No separate state store needed
- Can snapshot periodically for performance
- Version conflict handling using existing EventStore

## Implementation Steps

1. **Create VM struct** in `cluster/vm.go`
2. **Implement event replay** to rebuild state
3. **Create Runtime** in `cluster/runtime.go`
4. **Register Runtime with cluster** via `SetVMProvider`
5. **Implement message processing** - validate against model
6. **Add version conflict handling** using existing EventStore
7. **Write tests** - mock event store, test state transitions

## File Structure

```
cluster/
├── vm.go                # VirtualMachine implementation
├── runtime.go           # Runtime implementation
├── vm_test.go           # VM tests
├── runtime_test.go      # Runtime tests
└── integration_test.go  # Integration tests
```

## Acceptance Criteria

- [ ] VM can be created with actor ID
- [ ] VM replays events to build state
- [ ] VM processes events and updates state
- [ ] VM persists current version
- [ ] Runtime can create/stop VMs
- [ ] Runtime manages VM registry
- [ ] Integration test with NATS and JetStream
106
AGENTS.md
Normal file
@@ -0,0 +1,106 @@
# Aether

**Distributed event sourcing primitives for Go, powered by NATS.**

---

## Development Commands

```bash
make build    # go build ./...
make test     # go test ./...
make lint     # golangci-lint run
make clean    # go clean
```

## NATS Server Requirement

Integration tests require NATS with JetStream enabled:

```bash
brew install nats-server
nats-server -js
```

Run tests in a separate terminal after starting NATS.

## Project Structure

```
aether/
├── event.go              # Event, ActorSnapshot, EventStore interface
├── eventbus.go           # EventBus, EventBroadcaster interface
├── nats_eventbus.go      # NATSEventBus implementation
├── metrics*.go           # Prometheus metrics
├── store/                # EventStore implementations
│   ├── memory.go         # InMemoryEventStore (testing)
│   └── jetstream.go      # JetStreamEventStore (production)
├── cluster/              # Cluster management
│   ├── manager.go        # ClusterManager
│   ├── discovery.go      # NodeDiscovery
│   ├── hashring.go       # ConsistentHashRing
│   ├── shard.go          # ShardManager
│   ├── leader.go         # LeaderElection
│   └── types.go          # Cluster types
├── examples/             # Usage examples
└── eventstorming/        # Domain modeling reference
```

## Core Patterns

### Event Versioning

Events for each actor must have monotonically increasing versions:

```go
currentVersion, _ := store.GetLatestVersion(actorID)
event := &aether.Event{
	ActorID: actorID,
	Version: currentVersion + 1,
	// ...
}
err := store.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
	// Reload and retry
}
```

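The "reload and retry" step can be written as a bounded loop. The sketch below is self-contained and makes several assumptions: `errVersionConflict` stands in for `aether.ErrVersionConflict`, `versionedStore` is a hypothetical two-method subset of the store, and `racyStore` simulates a competing writer.

```go
package main

import (
	"errors"
	"fmt"
)

// errVersionConflict stands in for aether.ErrVersionConflict so the sketch
// compiles on its own.
var errVersionConflict = errors.New("version conflict")

// versionedStore is the subset of store behavior the loop needs
// (hypothetical; the real SaveEvent takes an *aether.Event).
type versionedStore interface {
	GetLatestVersion(actorID string) (int64, error)
	SaveEvent(actorID string, version int64) error
}

// saveWithRetry reloads the latest version and retries on conflict, up to
// maxAttempts times. Other errors are returned immediately.
func saveWithRetry(s versionedStore, actorID string, maxAttempts int) (int64, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		current, err := s.GetLatestVersion(actorID)
		if err != nil {
			return 0, err
		}
		next := current + 1
		err = s.SaveEvent(actorID, next)
		if err == nil {
			return next, nil
		}
		if !errors.Is(err, errVersionConflict) {
			return 0, err
		}
	}
	return 0, fmt.Errorf("gave up after %d attempts: %w", maxAttempts, errVersionConflict)
}

// racyStore simulates a competing writer that wins the first `races` saves.
type racyStore struct {
	latest int64
	races  int
}

func (s *racyStore) GetLatestVersion(string) (int64, error) { return s.latest, nil }

func (s *racyStore) SaveEvent(_ string, version int64) error {
	if s.races > 0 {
		s.races--
		s.latest++ // the competing writer took this version
		return errVersionConflict
	}
	if version <= s.latest {
		return errVersionConflict
	}
	s.latest = version
	return nil
}

func main() {
	s := &racyStore{latest: 4, races: 2}
	version, err := saveWithRetry(s, "order-123", 5)
	fmt.Println(version, err)
}
```

Bounding the attempts avoids spinning forever under sustained contention; whether a retry is appropriate at all depends on whether the command is still valid against the reloaded state.
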
### Namespace Isolation

Namespaces provide logical boundaries for events:

```go
// Event bus namespace
ch := eventBus.Subscribe("tenant-abc")
eventBus.Publish("tenant-abc", event)

// Store namespace
eventStore, _ := store.NewJetStreamEventStoreWithNamespace(natsConn, "events", "tenant-abc")
```

Namespaces sanitize special characters and prefix stream names for complete data isolation.

### JetStream Cache Behavior

`JetStreamEventStore` caches actor versions for performance. The cache is invalidated when `GetLatestVersion` detects a newer version from external writes.

## Testing

- Unit tests: `go test -v ./...`
- Single test: `go test -v -run TestName`
- Single package: `go test -v ./store/...`
- Benchmarks: `go test -bench=. -benchmem`

Integration tests require a running NATS server.

## Linting

```bash
golangci-lint run
golangci-lint run --fix
```

## References

- [vision.md](./vision.md) - Product vision and principles
- [examples/README.md](./examples/README.md) - Example patterns
160
CLAUDE.md
@@ -1,160 +0,0 @@
# Aether

Distributed actor system with event sourcing for Go, powered by NATS.

## Organization Context

This repo is part of Flowmade. See:
- [Organization manifesto](https://git.flowmade.one/flowmade-one/architecture/src/branch/main/manifesto.md) - who we are, what we believe
- [Repository map](https://git.flowmade.one/flowmade-one/architecture/src/branch/main/repos.md) - how this fits in the bigger picture
- [Vision](./vision.md) - what this specific product does

## Setup

```bash
git clone git@git.flowmade.one:flowmade-one/aether.git
cd aether
go mod download
```

Requires NATS server for integration tests:
```bash
# Install NATS
brew install nats-server

# Run with JetStream enabled
nats-server -js
```

## Project Structure

```
aether/
├── event.go              # Event, ActorSnapshot, EventStore interface
├── eventbus.go           # EventBus, EventBroadcaster interface
├── nats_eventbus.go      # NATSEventBus - cross-node event broadcasting
├── store/
│   ├── memory.go         # InMemoryEventStore (testing)
│   └── jetstream.go      # JetStreamEventStore (production)
├── cluster/
│   ├── manager.go        # ClusterManager
│   ├── discovery.go      # NodeDiscovery
│   ├── hashring.go       # ConsistentHashRing
│   ├── shard.go          # ShardManager
│   ├── leader.go         # LeaderElection
│   └── types.go          # Cluster types
└── model/
    └── model.go          # EventStorming model types
```

## Development

```bash
make build    # Build the library
make test     # Run tests
make lint     # Run linters
```

## Architecture

### Event Sourcing

Events are the source of truth. State is derived by replaying events.

```go
// Create an event
event := &aether.Event{
	ID:        uuid.New().String(),
	EventType: "OrderPlaced",
	ActorID:   "order-123",
	Version:   1,
	Data:      map[string]interface{}{"total": 100.00},
	Timestamp: time.Now(),
}

// Persist to event store
store.SaveEvent(event)

// Replay events to rebuild state
events, _ := store.GetEvents("order-123", 0)
```

### Event Versioning

Events for each actor must have **monotonically increasing versions**. This ensures event stream integrity and enables optimistic concurrency control.

#### Version Semantics

- Each actor has an independent version sequence
- Version must be strictly greater than the current latest version
- For new actors (no events), the first event must have version > 0
- Non-consecutive versions are allowed (gaps are permitted)

#### Optimistic Concurrency Pattern

```go
// 1. Get current version
currentVersion, _ := store.GetLatestVersion("order-123")

// 2. Create event with next version
event := &aether.Event{
	ID:        uuid.New().String(),
	EventType: "OrderUpdated",
	ActorID:   "order-123",
	Version:   currentVersion + 1,
	Data:      map[string]interface{}{"status": "shipped"},
	Timestamp: time.Now(),
}

// 3. Attempt to save
err := store.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
	// Another writer won - reload and retry if appropriate
	var versionErr *aether.VersionConflictError
	errors.As(err, &versionErr)
	log.Printf("Conflict: actor %s has version %d, attempted %d",
		versionErr.ActorID, versionErr.CurrentVersion, versionErr.AttemptedVersion)
}
```

#### Error Types

- `ErrVersionConflict` - Sentinel error for version conflicts (use with `errors.Is`)
- `VersionConflictError` - Detailed error with ActorID, CurrentVersion, and AttemptedVersion

### Namespace Isolation

Namespaces provide logical boundaries for events and subscriptions:

```go
// Subscribe to events in a namespace
ch := eventBus.Subscribe("tenant-abc")

// Events are isolated per namespace
eventBus.Publish("tenant-abc", event) // Only tenant-abc subscribers see this
```

### Clustering

Aether handles node discovery, leader election, and shard distribution:

```go
// Create cluster manager
manager := cluster.NewClusterManager(natsConn, nodeID)

// Join cluster
manager.Start()

// Leader election happens automatically
if manager.IsLeader() {
	// Coordinate shard assignments
}
```

## Key Patterns

- **Events are immutable** - Never modify, only append
- **Versions are monotonic** - Each event must have version > previous for same actor
- **Snapshots for performance** - Periodically snapshot state to avoid full replay
- **Namespaces for isolation** - Not multi-tenancy, just logical boundaries
- **NATS for everything** - Events, pub/sub, clustering all use NATS
169
README.md
Normal file
@@ -0,0 +1,169 @@
# Aether

[](https://git.flowmade.one/flowmade-one/aether/actions/workflows/ci.yml)

Event sourcing primitives for Go, powered by NATS.

Aether provides composable building blocks for distributed, event-sourced systems without imposing framework opinions on your domain.

## Why Aether?

Building distributed, event-sourced systems in Go requires assembling many pieces: event storage, pub/sub, clustering, leader election. Existing solutions are either too heavy (full frameworks with opinions about your domain), too light (just pub/sub), or not NATS-native.

Aether provides clear primitives that compose well:

- **Event sourcing primitives** - Event, EventStore interface, snapshots
- **Event stores** - In-memory (testing) and JetStream (production)
- **Event bus** - Local and NATS-backed pub/sub with namespace isolation
- **Cluster management** - Node discovery, leader election, shard distribution

Built for JetStream from the ground up, not bolted on.

## Installation

```bash
go get git.flowmade.one/flowmade-one/aether
```

Requires Go 1.23 or later.

## Quick Start

Here is a minimal example showing event sourcing fundamentals: creating events, saving them to a store, and replaying to rebuild state.

```go
package main

import (
	"fmt"
	"time"

	"github.com/google/uuid"
	"git.flowmade.one/flowmade-one/aether"
	"git.flowmade.one/flowmade-one/aether/store"
)

func main() {
	// Create an in-memory event store (use JetStream for production)
	eventStore := store.NewInMemoryEventStore()

	// Create and save events
	// Error handling omitted for brevity
	orderID := "order-123"

	orderPlaced := &aether.Event{
		ID:        uuid.New().String(),
		EventType: "OrderPlaced",
		ActorID:   orderID,
		Version:   1,
		Data:      map[string]interface{}{"total": 99.99, "items": 3},
		Timestamp: time.Now(),
	}
	eventStore.SaveEvent(orderPlaced)

	orderShipped := &aether.Event{
		ID:        uuid.New().String(),
		EventType: "OrderShipped",
		ActorID:   orderID,
		Version:   2,
		Data:      map[string]interface{}{"carrier": "FastShip", "tracking": "FS123456"},
		Timestamp: time.Now(),
	}
	eventStore.SaveEvent(orderShipped)

	// Replay events to rebuild state
	events, _ := eventStore.GetEvents(orderID, 0)

	state := make(map[string]interface{})
	for _, event := range events {
		switch event.EventType {
		case "OrderPlaced":
			state["total"] = event.Data["total"]
			state["items"] = event.Data["items"]
			state["status"] = "placed"
		case "OrderShipped":
			state["status"] = "shipped"
			state["carrier"] = event.Data["carrier"]
			state["tracking"] = event.Data["tracking"]
		}
	}

	fmt.Printf("Order state after replaying %d events:\n", len(events))
	fmt.Printf(" Status: %s\n", state["status"])
	fmt.Printf(" Total: $%.2f\n", state["total"])
	fmt.Printf(" Tracking: %s\n", state["tracking"])
}
```

Output:
```
Order state after replaying 2 events:
 Status: shipped
 Total: $99.99
 Tracking: FS123456
```

## Key Concepts

### Events are immutable

Events represent facts about what happened. Once saved, they are never modified or deleted - you only append new events. This immutability guarantee is enforced at multiple levels:

**Interface Design**: The `EventStore` interface provides no Update or Delete methods. Only `SaveEvent` (append), `GetEvents` (read), and `GetLatestVersion` (read) are available.

**JetStream Storage**: When using `JetStreamEventStore`, events are stored in a NATS JetStream stream configured with:
- File-based storage (durable)
- Limits-based retention policy (events expire after configured duration, not before)
- No mechanism to modify or delete individual events during their lifetime

**Audit Trail Guarantee**: Because events are immutable once persisted, they serve as a trustworthy audit trail. You can rely on the fact that historical events won't change, enabling compliance and forensics.

To correct a mistake, append a new event that expresses the correction rather than modifying history:

```go
// Wrong: Cannot update an event
// store.UpdateEvent(eventID, newData) // This method doesn't exist

// Right: Append a new event that corrects the record
correctionEvent := &aether.Event{
	ID:        uuid.New().String(),
	EventType: "OrderCorrected",
	ActorID:   orderID,
	Version:   currentVersion + 1,
	Data:      map[string]interface{}{"reason": "price adjustment"},
	Timestamp: time.Now(),
}
err := store.SaveEvent(correctionEvent)
```

### State is derived

Current state is always derived by replaying events. This gives you a complete audit trail and the ability to rebuild state at any point in time.

### Versions ensure consistency

Each event for an actor must have a strictly increasing version number. This enables optimistic concurrency control:

```go
currentVersion, _ := eventStore.GetLatestVersion(actorID)

event := &aether.Event{
	ActorID: actorID,
	Version: currentVersion + 1,
	// ...
}

err := eventStore.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
	// Another writer saved first - reload and retry
}
```

## Documentation

- [Vision](./vision.md) - Product vision and design principles
- [AGENTS.md](./AGENTS.md) - Development guide and core patterns

## License

See [LICENSE](./LICENSE) for details.
@@ -44,5 +44,4 @@
// - Leader election ensures coordination continues despite node failures
// - Actor migration allows rebalancing when cluster topology changes
// - Graceful shutdown with proper resource cleanup
//
package cluster
125
cluster/config_test.go
Normal file
@@ -0,0 +1,125 @@
package cluster

import (
	"strconv"
	"testing"
)

func TestDefaultHashRingConfig(t *testing.T) {
	config := DefaultHashRingConfig()

	if config.VirtualNodes != DefaultVirtualNodes {
		t.Errorf("expected VirtualNodes=%d, got %d", DefaultVirtualNodes, config.VirtualNodes)
	}
}

func TestDefaultShardConfig(t *testing.T) {
	config := DefaultShardConfig()

	if config.ShardCount != DefaultNumShards {
		t.Errorf("expected ShardCount=%d, got %d", DefaultNumShards, config.ShardCount)
	}
	if config.ReplicationFactor != 1 {
		t.Errorf("expected ReplicationFactor=1, got %d", config.ReplicationFactor)
	}
}

func TestNewConsistentHashRingWithConfig(t *testing.T) {
	t.Run("custom virtual nodes", func(t *testing.T) {
		config := HashRingConfig{VirtualNodes: 50}
		ring := NewConsistentHashRingWithConfig(config)

		ring.AddNode("test-node")

		if len(ring.sortedHashes) != 50 {
			t.Errorf("expected 50 virtual nodes, got %d", len(ring.sortedHashes))
		}
		if ring.GetVirtualNodes() != 50 {
			t.Errorf("expected GetVirtualNodes()=50, got %d", ring.GetVirtualNodes())
		}
	})

	t.Run("zero value uses default", func(t *testing.T) {
		config := HashRingConfig{VirtualNodes: 0}
		ring := NewConsistentHashRingWithConfig(config)

		ring.AddNode("test-node")

		if len(ring.sortedHashes) != DefaultVirtualNodes {
			t.Errorf("expected %d virtual nodes, got %d", DefaultVirtualNodes, len(ring.sortedHashes))
		}
	})

	t.Run("default constructor uses default config", func(t *testing.T) {
		ring := NewConsistentHashRing()

		ring.AddNode("test-node")

		if len(ring.sortedHashes) != DefaultVirtualNodes {
			t.Errorf("expected %d virtual nodes, got %d", DefaultVirtualNodes, len(ring.sortedHashes))
		}
	})
}

func TestNewShardManagerWithConfig(t *testing.T) {
	t.Run("custom shard count", func(t *testing.T) {
		config := ShardConfig{ShardCount: 256, ReplicationFactor: 2}
		sm := NewShardManagerWithConfig(config)

		if sm.GetShardCount() != 256 {
			t.Errorf("expected shard count 256, got %d", sm.GetShardCount())
		}
		if sm.GetReplicationFactor() != 2 {
			t.Errorf("expected replication factor 2, got %d", sm.GetReplicationFactor())
		}
	})

	t.Run("zero values use defaults", func(t *testing.T) {
		config := ShardConfig{ShardCount: 0, ReplicationFactor: 0}
		sm := NewShardManagerWithConfig(config)

		if sm.GetShardCount() != DefaultNumShards {
			t.Errorf("expected shard count %d, got %d", DefaultNumShards, sm.GetShardCount())
		}
		if sm.GetReplicationFactor() != 1 {
			t.Errorf("expected replication factor 1, got %d", sm.GetReplicationFactor())
		}
	})

	t.Run("legacy constructor still works", func(t *testing.T) {
		sm := NewShardManager(512, 3)

		if sm.GetShardCount() != 512 {
			t.Errorf("expected shard count 512, got %d", sm.GetShardCount())
		}
		if sm.GetReplicationFactor() != 3 {
			t.Errorf("expected replication factor 3, got %d", sm.GetReplicationFactor())
		}
	})
}

func TestShardManagerGetShard_DifferentShardCounts(t *testing.T) {
	testCases := []struct {
		shardCount int
	}{
		{shardCount: 16},
		{shardCount: 64},
		{shardCount: 256},
		{shardCount: 1024},
		{shardCount: 4096},
	}

	for _, tc := range testCases {
		// strconv.Itoa yields decimal names such as "shardCount=16";
		// string(rune(n)) would produce the Unicode code point instead.
		t.Run("shardCount="+strconv.Itoa(tc.shardCount), func(t *testing.T) {
			sm := NewShardManagerWithConfig(ShardConfig{ShardCount: tc.shardCount})

			// Verify all actor IDs map to valid shard range
			for i := 0; i < 1000; i++ {
				actorID := "actor-" + strconv.Itoa(i)
				shard := sm.GetShard(actorID)
				if shard < 0 || shard >= tc.shardCount {
					t.Errorf("shard %d out of range [0, %d)", shard, tc.shardCount)
				}
			}
		})
	}
}
@@ -12,7 +12,7 @@ import (
 type DistributedVM struct {
 	nodeID       string
 	cluster      *ClusterManager
-	localRuntime Runtime // Interface to avoid import cycles
+	localRuntime Runtime
 	sharding     *ShardManager
 	discovery    *NodeDiscovery
 	natsConn     *nats.Conn
@@ -20,17 +20,29 @@ type DistributedVM struct {
 	cancel       context.CancelFunc
 }

-// Runtime interface to avoid import cycles with main aether package
+// Runtime defines the interface for a local runtime that executes actors.
+// This interface decouples the cluster package from specific runtime implementations.
 type Runtime interface {
+	// Start initializes and starts the runtime
 	Start() error
-	LoadModel(model interface{}) error
-	SendMessage(message interface{}) error
+	// LoadModel loads an EventStorming model into the runtime
+	LoadModel(model RuntimeModel) error
+	// SendMessage sends a message to an actor in the runtime
+	SendMessage(message RuntimeMessage) error
 }

-// DistributedVMRegistry implements VMRegistry using DistributedVM's local runtime and sharding
+// DistributedVMRegistry implements VMRegistry using DistributedVM's local runtime and sharding.
+// It provides the cluster manager with access to VM information without import cycles.
 type DistributedVMRegistry struct {
-	runtime  interface{} // Runtime interface to avoid import cycles
-	sharding *ShardManager
+	vmProvider VMProvider
+	sharding   *ShardManager
 }

+// VMProvider defines an interface for accessing VMs from a runtime.
+// This is used by DistributedVMRegistry to get VM information.
+type VMProvider interface {
+	// GetActiveVMs returns a map of actor IDs to their VirtualMachine instances
+	GetActiveVMs() map[string]VirtualMachine
+}
+
 // NewDistributedVM creates a distributed VM runtime cluster node
@@ -67,16 +79,19 @@ func NewDistributedVM(nodeID string, natsURLs []string, localRuntime Runtime) (*
|
||||
cancel: cancel,
|
||||
}
|
||||
|
||||
// Create VM registry and connect it to cluster manager
|
||||
vmRegistry := &DistributedVMRegistry{
|
||||
runtime: localRuntime,
|
||||
sharding: sharding,
|
||||
}
|
||||
cluster.SetVMRegistry(vmRegistry)
|
||||
|
||||
return dvm, nil
|
||||
}
|
||||
|
||||
// SetVMProvider sets the VM provider for the distributed VM registry.
|
||||
// This should be called after the runtime is fully initialized.
|
||||
func (dvm *DistributedVM) SetVMProvider(provider VMProvider) {
|
||||
vmRegistry := &DistributedVMRegistry{
|
||||
vmProvider: provider,
|
||||
sharding: dvm.sharding,
|
||||
}
|
||||
dvm.cluster.SetVMRegistry(vmRegistry)
|
||||
}
|
||||
|
||||
// Start begins the distributed VM cluster node
|
||||
func (dvm *DistributedVM) Start() error {
|
||||
// Start local runtime
|
||||
@@ -103,7 +118,7 @@ func (dvm *DistributedVM) Stop() {
|
||||
}
|
||||
|
||||
// LoadModel distributes EventStorming model across the cluster with VM templates
|
||||
func (dvm *DistributedVM) LoadModel(model interface{}) error {
|
||||
func (dvm *DistributedVM) LoadModel(model RuntimeModel) error {
|
||||
// Load model locally first
|
||||
if err := dvm.localRuntime.LoadModel(model); err != nil {
|
||||
return fmt.Errorf("failed to load model locally: %w", err)
|
||||
@@ -121,7 +136,7 @@ func (dvm *DistributedVM) LoadModel(model interface{}) error {
|
||||
}
|
||||
|
||||
// SendMessage routes messages across the distributed cluster
|
||||
func (dvm *DistributedVM) SendMessage(message interface{}) error {
|
||||
func (dvm *DistributedVM) SendMessage(message RuntimeMessage) error {
|
||||
// This is a simplified implementation
|
||||
// In practice, this would determine the target node based on sharding
|
||||
// and route the message appropriately
|
||||
@@ -162,15 +177,29 @@ func (dvm *DistributedVM) handleClusterMessage(msg *nats.Msg) {
|
||||
switch clusterMsg.Type {
|
||||
case "load_model":
|
||||
// Handle model loading from other nodes
|
||||
if model := clusterMsg.Payload; model != nil {
|
||||
dvm.localRuntime.LoadModel(model)
|
||||
// Re-marshal and unmarshal to convert map[string]interface{} to concrete type
|
||||
payloadBytes, err := json.Marshal(clusterMsg.Payload)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
var model ModelPayload
|
||||
if err := json.Unmarshal(payloadBytes, &model); err != nil {
|
||||
return
|
||||
}
|
||||
dvm.localRuntime.LoadModel(&model)
|
||||
|
||||
case "route_message":
|
||||
// Handle message routing from other nodes
|
||||
if message := clusterMsg.Payload; message != nil {
|
||||
dvm.localRuntime.SendMessage(message)
|
||||
// Re-marshal and unmarshal to convert map[string]interface{} to concrete type
|
||||
payloadBytes, err := json.Marshal(clusterMsg.Payload)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
var message MessagePayload
|
||||
if err := json.Unmarshal(payloadBytes, &message); err != nil {
|
||||
return
|
||||
}
|
||||
dvm.localRuntime.SendMessage(&message)
|
||||
|
||||
case "rebalance":
|
||||
// Handle shard rebalancing requests
|
||||
@@ -200,19 +229,20 @@ func (dvm *DistributedVM) GetClusterInfo() map[string]interface{} {
|
||||
nodes := dvm.cluster.GetNodes()
|
||||
|
||||
return map[string]interface{}{
|
||||
"nodeId": dvm.nodeID,
|
||||
"isLeader": dvm.cluster.IsLeader(),
|
||||
"leader": dvm.cluster.GetLeader(),
|
||||
"nodeCount": len(nodes),
|
||||
"nodes": nodes,
|
||||
"nodeId": dvm.nodeID,
|
||||
"isLeader": dvm.cluster.IsLeader(),
|
||||
"leader": dvm.cluster.GetLeader(),
|
||||
"nodeCount": len(nodes),
|
||||
"nodes": nodes,
|
||||
}
|
||||
}
|
||||
|
||||
// GetActiveVMs returns a map of active VMs (implementation depends on runtime)
|
||||
func (dvr *DistributedVMRegistry) GetActiveVMs() map[string]interface{} {
|
||||
// This would need to access the actual runtime's VM registry
|
||||
// For now, return empty map to avoid import cycles
|
||||
return make(map[string]interface{})
|
||||
// GetActiveVMs returns a map of active VMs from the VM provider
|
||||
func (dvr *DistributedVMRegistry) GetActiveVMs() map[string]VirtualMachine {
|
||||
if dvr.vmProvider == nil {
|
||||
return make(map[string]VirtualMachine)
|
||||
}
|
||||
return dvr.vmProvider.GetActiveVMs()
|
||||
}
|
||||
|
||||
// GetShard returns the shard number for the given actor ID
|
||||
|
||||
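The `load_model` and `route_message` cases in `handleClusterMessage` above re-marshal `clusterMsg.Payload` and unmarshal it into a concrete type, because a payload decoded into `interface{}` arrives as `map[string]interface{}`. The sketch below isolates that round trip. The `envelope` and `messagePayload` types and their field names are assumptions made for the example; the real `ClusterMessage` and `MessagePayload` definitions are not shown in this diff. Unlike the handlers above, the sketch returns the error instead of dropping it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mimics a cluster message whose payload is decoded generically.
// Assumed shape, for illustration only.
type envelope struct {
	Type    string      `json:"type"`
	Payload interface{} `json:"payload"`
}

// messagePayload is a hypothetical concrete payload type.
type messagePayload struct {
	ActorID string `json:"actorId"`
	Body    string `json:"body"`
}

// decodePayload converts a generically decoded payload into the concrete
// value pointed to by out by encoding it back to JSON and decoding again.
func decodePayload(payload interface{}, out interface{}) error {
	raw, err := json.Marshal(payload)
	if err != nil {
		return fmt.Errorf("re-marshal payload: %w", err)
	}
	if err := json.Unmarshal(raw, out); err != nil {
		return fmt.Errorf("decode payload: %w", err)
	}
	return nil
}

func main() {
	wire := []byte(`{"type":"route_message","payload":{"actorId":"actor-1","body":"hello"}}`)

	var env envelope
	if err := json.Unmarshal(wire, &env); err != nil {
		panic(err)
	}

	// env.Payload is a map[string]interface{} at this point.
	var msg messagePayload
	if err := decodePayload(env.Payload, &msg); err != nil {
		panic(err)
	}
	fmt.Println(msg.ActorID, msg.Body) // prints "actor-1 hello"
}
```

If the envelope type can be changed, declaring the payload field as `json.RawMessage` would defer decoding and avoid the second encode step.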
@@ -5,25 +5,41 @@ import (
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"sort"
|
||||
"sync"
|
||||
)
|
||||
|
||||
// ConsistentHashRing implements a consistent hash ring for shard distribution
|
||||
type ConsistentHashRing struct {
|
||||
mu sync.RWMutex
|
||||
ring map[uint32]string // hash -> node ID
|
||||
sortedHashes []uint32 // sorted hash keys
|
||||
nodes map[string]bool // active nodes
|
||||
virtualNodes int // number of virtual nodes per physical node
|
||||
}
|
||||
|
||||
// NewConsistentHashRing creates a new consistent hash ring
|
||||
// NewConsistentHashRing creates a new consistent hash ring with default configuration
|
||||
func NewConsistentHashRing() *ConsistentHashRing {
|
||||
return NewConsistentHashRingWithConfig(DefaultHashRingConfig())
|
||||
}
|
||||
|
||||
// NewConsistentHashRingWithConfig creates a new consistent hash ring with custom configuration
|
||||
func NewConsistentHashRingWithConfig(config HashRingConfig) *ConsistentHashRing {
|
||||
virtualNodes := config.VirtualNodes
|
||||
if virtualNodes == 0 {
|
||||
virtualNodes = DefaultVirtualNodes
|
||||
}
|
||||
return &ConsistentHashRing{
|
||||
ring: make(map[uint32]string),
|
||||
nodes: make(map[string]bool),
|
||||
ring: make(map[uint32]string),
|
||||
nodes: make(map[string]bool),
|
||||
virtualNodes: virtualNodes,
|
||||
}
|
||||
}
|
||||
|
||||
// AddNode adds a node to the hash ring
|
||||
func (chr *ConsistentHashRing) AddNode(nodeID string) {
|
||||
chr.mu.Lock()
|
||||
defer chr.mu.Unlock()
|
||||
|
||||
if chr.nodes[nodeID] {
|
||||
return // Node already exists
|
||||
}
|
||||
@@ -31,7 +47,7 @@ func (chr *ConsistentHashRing) AddNode(nodeID string) {
|
||||
chr.nodes[nodeID] = true
|
||||
|
||||
// Add virtual nodes for better distribution
|
||||
for i := 0; i < VirtualNodes; i++ {
|
||||
for i := 0; i < chr.virtualNodes; i++ {
|
||||
virtualKey := fmt.Sprintf("%s:%d", nodeID, i)
|
||||
hash := chr.hash(virtualKey)
|
||||
chr.ring[hash] = nodeID
|
||||
@@ -45,6 +61,9 @@ func (chr *ConsistentHashRing) AddNode(nodeID string) {
|
||||
|
||||
// RemoveNode removes a node from the hash ring
|
||||
func (chr *ConsistentHashRing) RemoveNode(nodeID string) {
|
||||
chr.mu.Lock()
|
||||
defer chr.mu.Unlock()
|
||||
|
||||
if !chr.nodes[nodeID] {
|
||||
return // Node doesn't exist
|
||||
}
|
||||
@@ -65,6 +84,9 @@ func (chr *ConsistentHashRing) RemoveNode(nodeID string) {
|
||||
|
||||
// GetNode returns the node responsible for a given key
|
||||
func (chr *ConsistentHashRing) GetNode(key string) string {
|
||||
chr.mu.RLock()
|
||||
defer chr.mu.RUnlock()
|
||||
|
||||
if len(chr.sortedHashes) == 0 {
|
||||
return ""
|
||||
}
|
||||
@@ -92,6 +114,9 @@ func (chr *ConsistentHashRing) hash(key string) uint32 {
|
||||
|
||||
// GetNodes returns all active nodes in the ring
|
||||
func (chr *ConsistentHashRing) GetNodes() []string {
|
||||
chr.mu.RLock()
|
||||
defer chr.mu.RUnlock()
|
||||
|
||||
nodes := make([]string, 0, len(chr.nodes))
|
||||
for nodeID := range chr.nodes {
|
||||
nodes = append(nodes, nodeID)
|
||||
@@ -101,5 +126,13 @@ func (chr *ConsistentHashRing) GetNodes() []string {
|
||||
|
||||
// IsEmpty returns true if the ring has no nodes
|
||||
func (chr *ConsistentHashRing) IsEmpty() bool {
|
||||
chr.mu.RLock()
|
||||
defer chr.mu.RUnlock()
|
||||
|
||||
return len(chr.nodes) == 0
|
||||
}
|
||||
|
||||
// GetVirtualNodes returns the number of virtual nodes per physical node
|
||||
func (chr *ConsistentHashRing) GetVirtualNodes() int {
|
||||
return chr.virtualNodes
|
||||
}
|
||||
|
||||
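The hash ring changes above make the virtual node count a per-ring setting and guard ring access with a `sync.RWMutex`. A minimal, self-contained sketch of that shape follows. It is not the repository's implementation: the FNV-1a hash, the lowercase type and method names, and the collision handling are assumptions chosen to keep the example short. The default of 150 comes from the test comment elsewhere in this diff.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"sync"
)

// ring is a simplified consistent hash ring with per-instance virtual nodes.
type ring struct {
	mu           sync.RWMutex
	owners       map[uint32]string // hash -> node ID
	sorted       []uint32          // sorted hash keys
	virtualNodes int
}

// newRing falls back to 150 virtual nodes when none are configured.
func newRing(virtualNodes int) *ring {
	if virtualNodes <= 0 {
		virtualNodes = 150
	}
	return &ring{owners: make(map[uint32]string), virtualNodes: virtualNodes}
}

// hashKey uses FNV-1a; the real ring may use a different hash (assumption).
func hashKey(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

// addNode places virtualNodes points for nodeID on the ring under the write lock.
func (r *ring) addNode(nodeID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i := 0; i < r.virtualNodes; i++ {
		h := hashKey(fmt.Sprintf("%s:%d", nodeID, i))
		if _, taken := r.owners[h]; !taken {
			r.sorted = append(r.sorted, h)
		}
		r.owners[h] = nodeID // on a hash collision the later node wins
	}
	sort.Slice(r.sorted, func(a, b int) bool { return r.sorted[a] < r.sorted[b] })
}

// getNode returns the owner of the first ring point at or after the key's hash.
func (r *ring) getNode(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if len(r.sorted) == 0 {
		return ""
	}
	h := hashKey(key)
	i := sort.Search(len(r.sorted), func(i int) bool { return r.sorted[i] >= h })
	if i == len(r.sorted) {
		i = 0 // wrap around to the start of the ring
	}
	return r.owners[r.sorted[i]]
}

func main() {
	r := newRing(0)
	r.addNode("node-1")
	r.addNode("node-2")
	// The same key always resolves to the same node.
	fmt.Println(r.getNode("actor-123") == r.getNode("actor-123")) // prints "true"
}
```

Readers take the read lock, so concurrent lookups do not block each other; only membership changes take the write lock.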
@@ -42,7 +42,7 @@ func TestAddNode(t *testing.T) {
|
||||
}
|
||||
|
||||
// Verify virtual nodes were added
|
||||
expectedVirtualNodes := VirtualNodes
|
||||
expectedVirtualNodes := DefaultVirtualNodes
|
||||
if len(ring.sortedHashes) != expectedVirtualNodes {
|
||||
t.Errorf("expected %d virtual nodes, got %d", expectedVirtualNodes, len(ring.sortedHashes))
|
||||
}
|
||||
@@ -86,7 +86,7 @@ func TestAddNode_MultipleNodes(t *testing.T) {
|
||||
t.Errorf("expected 3 nodes, got %d", len(nodes))
|
||||
}
|
||||
|
||||
expectedHashes := VirtualNodes * 3
|
||||
expectedHashes := DefaultVirtualNodes * 3
|
||||
if len(ring.sortedHashes) != expectedHashes {
|
||||
t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes))
|
||||
}
|
||||
@@ -118,7 +118,7 @@ func TestRemoveNode(t *testing.T) {
|
||||
}
|
||||
|
||||
// Verify virtual nodes were removed
|
||||
expectedHashes := VirtualNodes
|
||||
expectedHashes := DefaultVirtualNodes
|
||||
if len(ring.sortedHashes) != expectedHashes {
|
||||
t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes))
|
||||
}
|
||||
@@ -321,7 +321,7 @@ func TestRingBehavior_ManyNodes(t *testing.T) {
|
||||
}
|
||||
|
||||
// Verify virtual nodes count
|
||||
expectedHashes := numNodes * VirtualNodes
|
||||
expectedHashes := numNodes * DefaultVirtualNodes
|
||||
if len(ring.sortedHashes) != expectedHashes {
|
||||
t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes))
|
||||
}
|
||||
@@ -355,7 +355,7 @@ func TestRingBehavior_ManyNodes(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestVirtualNodes_ImproveDistribution(t *testing.T) {
|
||||
func TestDefaultVirtualNodes_ImproveDistribution(t *testing.T) {
|
||||
// Test that virtual nodes actually improve distribution
|
||||
// by comparing with a theoretical single-hash-per-node scenario
|
||||
|
||||
@@ -386,7 +386,7 @@ func TestVirtualNodes_ImproveDistribution(t *testing.T) {
|
||||
stdDev := math.Sqrt(sumSquaredDiff / float64(numNodes))
|
||||
coefficientOfVariation := stdDev / expectedPerNode
|
||||
|
||||
// With VirtualNodes=150, we expect good distribution
|
||||
// With DefaultVirtualNodes=150, we expect good distribution
|
||||
// Coefficient of variation should be low (< 15%)
|
||||
if coefficientOfVariation > 0.15 {
|
||||
t.Errorf("distribution has high coefficient of variation: %.2f%% (expected < 15%%)",
|
||||
@@ -394,8 +394,8 @@ func TestVirtualNodes_ImproveDistribution(t *testing.T) {
|
||||
}
|
||||
|
||||
// Verify that the actual number of virtual nodes matches expected
|
||||
if len(ring.sortedHashes) != numNodes*VirtualNodes {
|
||||
t.Errorf("expected %d virtual node hashes, got %d", numNodes*VirtualNodes, len(ring.sortedHashes))
|
||||
if len(ring.sortedHashes) != numNodes*DefaultVirtualNodes {
|
||||
t.Errorf("expected %d virtual node hashes, got %d", numNodes*DefaultVirtualNodes, len(ring.sortedHashes))
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -44,8 +44,8 @@ func NewLeaderElection(nodeID string, natsConn *nats.Conn, callbacks LeaderElect
|
||||
Bucket: "aether-leader-election",
|
||||
Description: "Aether cluster leader election coordination",
|
||||
TTL: LeaderLeaseTimeout * 2, // Auto-cleanup expired leases
|
||||
MaxBytes: 1024 * 1024, // 1MB max
|
||||
Replicas: 1, // Single replica for simplicity
|
||||
MaxBytes: 1024 * 1024, // 1MB max
|
||||
Replicas: 1, // Single replica for simplicity
|
||||
})
|
||||
if err != nil {
|
||||
// Try to get existing KV store
|
||||
|
||||
@@ -12,25 +12,28 @@ import (
|
||||
"github.com/nats-io/nats.go"
|
||||
)
|
||||
|
||||
// VMRegistry provides access to local VM information for cluster operations
|
||||
// VMRegistry provides access to local VM information for cluster operations.
|
||||
// Implementations must provide thread-safe access to VM data.
|
||||
type VMRegistry interface {
|
||||
GetActiveVMs() map[string]interface{} // VirtualMachine interface to avoid import cycles
|
||||
// GetActiveVMs returns a map of actor IDs to their VirtualMachine instances
|
||||
GetActiveVMs() map[string]VirtualMachine
|
||||
// GetShard returns the shard number for a given actor ID
|
||||
GetShard(actorID string) int
|
||||
}
|
||||
|
||||
// ClusterManager coordinates distributed VM operations across the cluster
|
||||
type ClusterManager struct {
|
||||
nodeID string
|
||||
nodes map[string]*NodeInfo
|
||||
nodeUpdates chan NodeUpdate
|
||||
shardMap *ShardMap
|
||||
hashRing *ConsistentHashRing
|
||||
election *LeaderElection
|
||||
natsConn *nats.Conn
|
||||
ctx context.Context
|
||||
mutex sync.RWMutex
|
||||
logger *log.Logger
|
||||
vmRegistry VMRegistry // Interface to access local VMs
|
||||
nodeID string
|
||||
nodes map[string]*NodeInfo
|
||||
nodeUpdates chan NodeUpdate
|
||||
shardMap *ShardMap
|
||||
hashRing *ConsistentHashRing
|
||||
election *LeaderElection
|
||||
natsConn *nats.Conn
|
||||
ctx context.Context
|
||||
mutex sync.RWMutex
|
||||
logger *log.Logger
|
||||
vmRegistry VMRegistry // Interface to access local VMs
|
||||
}
|
||||
|
||||
// NewClusterManager creates a cluster coordination manager
|
||||
@@ -50,13 +53,13 @@ func NewClusterManager(nodeID string, natsConn *nats.Conn, ctx context.Context)
|
||||
// Create leadership election with callbacks
|
||||
callbacks := LeaderElectionCallbacks{
|
||||
OnBecameLeader: func() {
|
||||
cm.logger.Printf("👑 This node became the cluster leader - can initiate rebalancing")
|
||||
cm.logger.Printf("This node became the cluster leader - can initiate rebalancing")
|
||||
},
|
||||
OnLostLeader: func() {
|
||||
cm.logger.Printf("📉 This node lost cluster leadership")
|
||||
cm.logger.Printf("This node lost cluster leadership")
|
||||
},
|
||||
OnNewLeader: func(leaderID string) {
|
||||
cm.logger.Printf("🔄 Cluster leadership changed to: %s", leaderID)
|
||||
cm.logger.Printf("Cluster leadership changed to: %s", leaderID)
|
||||
},
|
||||
}
|
||||
|
||||
@@ -71,7 +74,7 @@ func NewClusterManager(nodeID string, natsConn *nats.Conn, ctx context.Context)
|
||||
|
||||
// Start begins cluster management operations
|
||||
func (cm *ClusterManager) Start() {
|
||||
cm.logger.Printf("🚀 Starting cluster manager")
|
||||
cm.logger.Printf("Starting cluster manager")
|
||||
|
||||
// Start leader election
|
||||
cm.election.Start()
|
||||
@@ -88,7 +91,7 @@ func (cm *ClusterManager) Start() {
|
||||
|
||||
// Stop gracefully stops the cluster manager
|
||||
func (cm *ClusterManager) Stop() {
|
||||
cm.logger.Printf("🛑 Stopping cluster manager")
|
||||
cm.logger.Printf("Stopping cluster manager")
|
||||
|
||||
if cm.election != nil {
|
||||
cm.election.Stop()
|
||||
@@ -138,7 +141,7 @@ func (cm *ClusterManager) GetActorsInShard(shardID int) []string {
|
||||
func (cm *ClusterManager) handleClusterMessage(msg *nats.Msg) {
|
||||
var clusterMsg ClusterMessage
|
||||
if err := json.Unmarshal(msg.Data, &clusterMsg); err != nil {
|
||||
cm.logger.Printf("⚠️ Invalid cluster message: %v", err)
|
||||
cm.logger.Printf("Invalid cluster message: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -152,7 +155,7 @@ func (cm *ClusterManager) handleClusterMessage(msg *nats.Msg) {
|
||||
cm.handleNodeUpdate(update)
|
||||
}
|
||||
default:
|
||||
cm.logger.Printf("⚠️ Unknown cluster message type: %s", clusterMsg.Type)
|
||||
cm.logger.Printf("Unknown cluster message type: %s", clusterMsg.Type)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -165,12 +168,12 @@ func (cm *ClusterManager) handleNodeUpdate(update NodeUpdate) {
|
||||
case NodeJoined:
|
||||
cm.nodes[update.Node.ID] = update.Node
|
||||
cm.hashRing.AddNode(update.Node.ID)
|
||||
cm.logger.Printf("➕ Node joined: %s", update.Node.ID)
|
||||
cm.logger.Printf("Node joined: %s", update.Node.ID)
|
||||
|
||||
case NodeLeft:
|
||||
delete(cm.nodes, update.Node.ID)
|
||||
cm.hashRing.RemoveNode(update.Node.ID)
|
||||
cm.logger.Printf("➖ Node left: %s", update.Node.ID)
|
||||
cm.logger.Printf("Node left: %s", update.Node.ID)
|
||||
|
||||
case NodeUpdated:
|
||||
if node, exists := cm.nodes[update.Node.ID]; exists {
|
||||
@@ -188,7 +191,7 @@ func (cm *ClusterManager) handleNodeUpdate(update NodeUpdate) {
|
||||
for _, node := range cm.nodes {
|
||||
if now.Sub(node.LastSeen) > 90*time.Second && node.Status != NodeStatusFailed {
|
||||
node.Status = NodeStatusFailed
|
||||
cm.logger.Printf("❌ Node marked as failed: %s (last seen: %s)",
|
||||
cm.logger.Printf("Node marked as failed: %s (last seen: %s)",
|
||||
node.ID, node.LastSeen.Format(time.RFC3339))
|
||||
}
|
||||
}
|
||||
@@ -212,7 +215,7 @@ func (cm *ClusterManager) handleNodeUpdate(update NodeUpdate) {
|
||||
|
||||
// handleRebalanceRequest processes cluster rebalancing requests
|
||||
func (cm *ClusterManager) handleRebalanceRequest(msg ClusterMessage) {
|
||||
cm.logger.Printf("🔄 Handling rebalance request from %s", msg.From)
|
||||
cm.logger.Printf("Handling rebalance request from %s", msg.From)
|
||||
|
||||
// Implementation would handle the specific rebalancing logic
|
||||
// This is a simplified version
|
||||
@@ -220,7 +223,7 @@ func (cm *ClusterManager) handleRebalanceRequest(msg ClusterMessage) {
|
||||
|
||||
// handleMigrationRequest processes actor migration requests
|
||||
func (cm *ClusterManager) handleMigrationRequest(msg ClusterMessage) {
|
||||
cm.logger.Printf("🚚 Handling migration request from %s", msg.From)
|
||||
cm.logger.Printf("Handling migration request from %s", msg.From)
|
||||
|
||||
// Implementation would handle the specific migration logic
|
||||
// This is a simplified version
|
||||
@@ -232,7 +235,7 @@ func (cm *ClusterManager) triggerShardRebalancing(reason string) {
|
||||
return // Only leader can initiate rebalancing
|
||||
}
|
||||
|
||||
cm.logger.Printf("⚖️ Triggering shard rebalancing: %s", reason)
|
||||
cm.logger.Printf("Triggering shard rebalancing: %s", reason)
|
||||
|
||||
// Get active nodes
|
||||
var activeNodes []*NodeInfo
|
||||
@@ -245,12 +248,12 @@ func (cm *ClusterManager) triggerShardRebalancing(reason string) {
|
||||
cm.mutex.RUnlock()
|
||||
|
||||
if len(activeNodes) == 0 {
|
||||
cm.logger.Printf("⚠️ No active nodes available for rebalancing")
|
||||
cm.logger.Printf("No active nodes available for rebalancing")
|
||||
return
|
||||
}
|
||||
|
||||
// This would implement the actual rebalancing logic
|
||||
cm.logger.Printf("🎯 Would rebalance across %d active nodes", len(activeNodes))
|
||||
cm.logger.Printf("Would rebalance across %d active nodes", len(activeNodes))
|
||||
}
|
||||
|
||||
// monitorNodes periodically checks node health and updates
|
||||
@@ -279,7 +282,7 @@ func (cm *ClusterManager) checkNodeHealth() {
|
||||
for _, node := range cm.nodes {
|
||||
if now.Sub(node.LastSeen) > 90*time.Second && node.Status == NodeStatusActive {
|
||||
node.Status = NodeStatusFailed
|
||||
cm.logger.Printf("💔 Node failed: %s", node.ID)
|
||||
cm.logger.Printf("Node failed: %s", node.ID)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -33,8 +33,26 @@ type ShardManager struct {
|
||||
replication int
|
||||
}
|
||||
|
||||
// NewShardManager creates a new shard manager
|
||||
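// NewShardManager creates a new shard manager with the given shard count and replication factor.
// Zero values fall back to the defaults applied by NewShardManagerWithConfig.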
|
||||
func NewShardManager(shardCount, replication int) *ShardManager {
|
||||
return NewShardManagerWithConfig(ShardConfig{
|
||||
ShardCount: shardCount,
|
||||
ReplicationFactor: replication,
|
||||
})
|
||||
}
|
||||
|
||||
// NewShardManagerWithConfig creates a new shard manager with custom configuration
|
||||
func NewShardManagerWithConfig(config ShardConfig) *ShardManager {
|
||||
// Apply defaults for zero values
|
||||
shardCount := config.ShardCount
|
||||
if shardCount == 0 {
|
||||
shardCount = DefaultNumShards
|
||||
}
|
||||
replication := config.ReplicationFactor
|
||||
if replication == 0 {
|
||||
replication = 1
|
||||
}
|
||||
|
||||
return &ShardManager{
|
||||
shardCount: shardCount,
|
||||
shardMap: &ShardMap{Shards: make(map[int][]string), Nodes: make(map[string]NodeInfo)},
|
||||
@@ -149,6 +167,15 @@ func (sm *ShardManager) GetActorsInShard(shardID int, nodeID string, vmRegistry
|
||||
return actors
|
||||
}
|
||||
|
||||
// GetShardCount returns the total number of shards
|
||||
func (sm *ShardManager) GetShardCount() int {
|
||||
return sm.shardCount
|
||||
}
|
||||
|
||||
// GetReplicationFactor returns the replication factor
|
||||
func (sm *ShardManager) GetReplicationFactor() int {
|
||||
return sm.replication
|
||||
}
|
||||
|
||||
// ConsistentHashPlacement implements PlacementStrategy using consistent hashing
|
||||
type ConsistentHashPlacement struct{}
|
||||
|
||||
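`NewShardManagerWithConfig` above treats zero-valued fields as "use the default". The sketch below shows the same pattern in isolation. The lowercase names and the value 1024 are placeholders, since the real `DefaultNumShards` constant is defined outside this diff. One deliberate difference: the sketch uses `<= 0`, so negative values also fall back to defaults, whereas the `== 0` checks in the diff would pass a negative count through unchanged.

```go
package main

import "fmt"

// Placeholder defaults for this sketch; the real constants live elsewhere
// in the cluster package and may differ.
const (
	defaultNumShards   = 1024
	defaultReplication = 1
)

type shardConfig struct {
	ShardCount        int
	ReplicationFactor int
}

// withDefaults returns a copy of c with non-positive fields replaced by defaults.
func (c shardConfig) withDefaults() shardConfig {
	if c.ShardCount <= 0 {
		c.ShardCount = defaultNumShards
	}
	if c.ReplicationFactor <= 0 {
		c.ReplicationFactor = defaultReplication
	}
	return c
}

func main() {
	fmt.Println(shardConfig{}.withDefaults())                // prints "{1024 1}"
	fmt.Println(shardConfig{ShardCount: 256}.withDefaults()) // prints "{256 1}"
	fmt.Println(shardConfig{ShardCount: -4}.withDefaults())  // prints "{1024 1}"
}
```

Because the method has a value receiver, the caller's config is left untouched and the normalized copy can be stored on the manager.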
713
cluster/shard_test.go
Normal file
@@ -0,0 +1,713 @@
|
||||
package cluster
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestNewShardManager(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
if sm == nil {
|
||||
t.Fatal("NewShardManager returned nil")
|
||||
}
|
||||
if sm.shardCount != 16 {
|
||||
t.Errorf("expected shardCount 16, got %d", sm.shardCount)
|
||||
}
|
||||
if sm.replication != 3 {
|
||||
t.Errorf("expected replication 3, got %d", sm.replication)
|
||||
}
|
||||
if sm.shardMap == nil {
|
||||
t.Error("shardMap is nil")
|
||||
}
|
||||
if sm.placement == nil {
|
||||
t.Error("placement strategy is nil")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewShardManager_DefaultsForZeroValues(t *testing.T) {
|
||||
sm := NewShardManagerWithConfig(ShardConfig{})
|
||||
|
||||
if sm.shardCount != DefaultNumShards {
|
||||
t.Errorf("expected default shardCount %d, got %d", DefaultNumShards, sm.shardCount)
|
||||
}
|
||||
if sm.replication != 1 {
|
||||
t.Errorf("expected default replication 1, got %d", sm.replication)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewShardManagerWithConfig_CustomValues(t *testing.T) {
|
||||
config := ShardConfig{
|
||||
ShardCount: 256,
|
||||
ReplicationFactor: 2,
|
||||
}
|
||||
sm := NewShardManagerWithConfig(config)
|
||||
|
||||
if sm.shardCount != 256 {
|
||||
t.Errorf("expected shardCount 256, got %d", sm.shardCount)
|
||||
}
|
||||
if sm.replication != 2 {
|
||||
t.Errorf("expected replication 2, got %d", sm.replication)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShard_ReturnsCorrectShardForActor(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
// Test that GetShard returns consistent results
|
||||
actorID := "actor-123"
|
||||
shard1 := sm.GetShard(actorID)
|
||||
shard2 := sm.GetShard(actorID)
|
||||
|
||||
if shard1 != shard2 {
|
||||
t.Errorf("GetShard not consistent: got %d and %d for same actor", shard1, shard2)
|
||||
}
|
||||
|
||||
// Verify shard is within valid range
|
||||
if shard1 < 0 || shard1 >= 16 {
|
||||
t.Errorf("shard %d is out of range [0, 16)", shard1)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShard_DifferentActorsCanMapToDifferentShards(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
// With enough actors, we should see different shards
|
||||
shardsSeen := make(map[int]bool)
|
||||
for i := 0; i < 100; i++ {
|
||||
actorID := fmt.Sprintf("actor-%d", i)
|
||||
shard := sm.GetShard(actorID)
|
||||
shardsSeen[shard] = true
|
||||
}
|
||||
|
||||
// We should see multiple different shards
|
||||
if len(shardsSeen) < 2 {
|
||||
t.Errorf("expected multiple different shards, got %d unique shards", len(shardsSeen))
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShard_DistributesActorsAcrossShards(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
distribution := make(map[int]int)
|
||||
numActors := 1000
|
||||
|
||||
for i := 0; i < numActors; i++ {
|
||||
actorID := fmt.Sprintf("actor-%d", i)
|
||||
shard := sm.GetShard(actorID)
|
||||
distribution[shard]++
|
||||
}
|
||||
|
||||
// Verify all shards are within valid range
|
||||
for shard := range distribution {
|
||||
if shard < 0 || shard >= 16 {
|
||||
t.Errorf("shard %d is out of range [0, 16)", shard)
|
||||
}
|
||||
}
|
||||
|
||||
// With good hashing the distribution should be fairly even; shards more than 50% off the expected count are logged, not failed
|
||||
expectedPerShard := numActors / 16
|
||||
for shard, count := range distribution {
|
||||
deviation := float64(count-expectedPerShard) / float64(expectedPerShard)
|
||||
if deviation > 0.5 || deviation < -0.5 {
|
||||
t.Logf("shard %d has %d actors (%.1f%% deviation)", shard, count, deviation*100)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShardNodes_EmptyShard(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
|
||||
if nodes == nil {
|
||||
t.Error("GetShardNodes returned nil, expected empty slice")
|
||||
}
|
||||
if len(nodes) != 0 {
|
||||
t.Errorf("expected empty slice for unassigned shard, got %v", nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShardNodes_ReturnsAssignedNodes(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
// Assign nodes to shard
|
||||
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
|
||||
if len(nodes) != 3 {
|
||||
t.Fatalf("expected 3 nodes, got %d", len(nodes))
|
||||
}
|
||||
if nodes[0] != "node-1" || nodes[1] != "node-2" || nodes[2] != "node-3" {
|
||||
t.Errorf("unexpected nodes: %v", nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShardNodes_NonExistentShard(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
// Query a shard ID outside the configured range (16 shards); it has no assignment
|
||||
nodes := sm.GetShardNodes(999)
|
||||
|
||||
if len(nodes) != 0 {
|
||||
t.Errorf("expected empty slice for non-existent shard, got %v", nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssignShard_CreatesNewAssignment(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
sm.AssignShard(5, []string{"node-a"})
|
||||
|
||||
nodes := sm.GetShardNodes(5)
|
||||
if len(nodes) != 1 || nodes[0] != "node-a" {
|
||||
t.Errorf("expected [node-a], got %v", nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssignShard_UpdatesExistingAssignment(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
sm.AssignShard(5, []string{"node-a"})
|
||||
sm.AssignShard(5, []string{"node-b", "node-c"})
|
||||
|
||||
nodes := sm.GetShardNodes(5)
|
||||
if len(nodes) != 2 {
|
||||
t.Fatalf("expected 2 nodes, got %d", len(nodes))
|
||||
}
|
||||
if nodes[0] != "node-b" || nodes[1] != "node-c" {
|
||||
t.Errorf("expected [node-b, node-c], got %v", nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssignShard_MultipleShards(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
sm.AssignShard(0, []string{"node-1"})
|
||||
sm.AssignShard(1, []string{"node-2"})
|
||||
sm.AssignShard(2, []string{"node-3"})
|
||||
|
||||
if nodes := sm.GetShardNodes(0); len(nodes) != 1 || nodes[0] != "node-1" {
|
||||
t.Errorf("shard 0: expected [node-1], got %v", nodes)
|
||||
}
|
||||
if nodes := sm.GetShardNodes(1); len(nodes) != 1 || nodes[0] != "node-2" {
|
||||
t.Errorf("shard 1: expected [node-2], got %v", nodes)
|
||||
}
|
||||
if nodes := sm.GetShardNodes(2); len(nodes) != 1 || nodes[0] != "node-3" {
|
||||
t.Errorf("shard 2: expected [node-3], got %v", nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetPrimaryNode(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
sm.AssignShard(0, []string{"primary", "replica1", "replica2"})
|
||||
|
||||
primary := sm.GetPrimaryNode(0)
|
||||
if primary != "primary" {
|
||||
t.Errorf("expected 'primary', got %q", primary)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetPrimaryNode_EmptyShard(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
primary := sm.GetPrimaryNode(0)
|
||||
if primary != "" {
|
||||
t.Errorf("expected empty string for unassigned shard, got %q", primary)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetReplicaNodes(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
sm.AssignShard(0, []string{"primary", "replica1", "replica2"})
|
||||
|
||||
replicas := sm.GetReplicaNodes(0)
|
||||
if len(replicas) != 2 {
|
||||
t.Fatalf("expected 2 replicas, got %d", len(replicas))
|
||||
}
|
||||
if replicas[0] != "replica1" || replicas[1] != "replica2" {
|
||||
t.Errorf("expected [replica1, replica2], got %v", replicas)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetReplicaNodes_SingleNode(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
sm.AssignShard(0, []string{"only-node"})
|
||||
|
||||
replicas := sm.GetReplicaNodes(0)
|
||||
if len(replicas) != 0 {
|
||||
t.Errorf("expected no replicas for single-node shard, got %v", replicas)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetReplicaNodes_EmptyShard(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
replicas := sm.GetReplicaNodes(0)
|
||||
if len(replicas) != 0 {
|
||||
t.Errorf("expected empty slice for unassigned shard, got %v", replicas)
|
||||
}
|
||||
}
|
||||
|
||||
func TestPlaceActor_NoNodes(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
_, err := sm.PlaceActor("actor-1", map[string]*NodeInfo{})
|
||||
|
||||
if err == nil {
|
||||
t.Error("expected error when no nodes available")
|
||||
}
|
||||
}
|
||||
|
||||
func TestPlaceActor_SingleNode(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1", Status: NodeStatusActive},
|
||||
}
|
||||
|
||||
nodeID, err := sm.PlaceActor("actor-1", nodes)
|
||||
|
||||
if err != nil {
|
||||
t.Errorf("unexpected error: %v", err)
|
||||
}
|
||||
if nodeID != "node-1" {
|
||||
t.Errorf("expected node-1, got %q", nodeID)
|
||||
}
|
||||
}
|
||||
|
||||
func TestPlaceActor_ReturnsValidNode(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1", Status: NodeStatusActive},
|
||||
"node-2": {ID: "node-2", Status: NodeStatusActive},
|
||||
"node-3": {ID: "node-3", Status: NodeStatusActive},
|
||||
}
|
||||
|
||||
// PlaceActor should always return one of the available nodes
|
||||
for i := 0; i < 100; i++ {
|
||||
nodeID, err := sm.PlaceActor(fmt.Sprintf("actor-%d", i), nodes)
|
||||
if err != nil {
|
||||
t.Errorf("unexpected error: %v", err)
|
||||
}
|
||||
if _, exists := nodes[nodeID]; !exists {
|
||||
t.Errorf("PlaceActor returned invalid node: %q", nodeID)
|
||||
}
|
||||
}
|
||||
}
|
||||
func TestPlaceActor_DistributesAcrossNodes(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1", Status: NodeStatusActive},
|
||||
"node-2": {ID: "node-2", Status: NodeStatusActive},
|
||||
"node-3": {ID: "node-3", Status: NodeStatusActive},
|
||||
}
|
||||
|
||||
distribution := make(map[string]int)
|
||||
	for i := 0; i < 100; i++ {
		nodeID, err := sm.PlaceActor(fmt.Sprintf("actor-%d", i), nodes)
		if err != nil {
			t.Fatalf("unexpected error: %v", err)
		}
		distribution[nodeID]++
	}
|
||||
|
||||
// Should use multiple nodes
|
||||
if len(distribution) < 2 {
|
||||
t.Errorf("expected distribution across multiple nodes, got %v", distribution)
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateShardMap(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
newMap := &ShardMap{
|
||||
Version: 5,
|
||||
Shards: map[int][]string{
|
||||
0: {"node-a", "node-b"},
|
||||
1: {"node-c"},
|
||||
},
|
||||
Nodes: map[string]NodeInfo{
|
||||
"node-a": {ID: "node-a"},
|
||||
"node-b": {ID: "node-b"},
|
||||
"node-c": {ID: "node-c"},
|
||||
},
|
||||
}
|
||||
|
||||
sm.UpdateShardMap(newMap)
|
||||
|
||||
result := sm.GetShardMap()
|
||||
if result.Version != 5 {
|
||||
t.Errorf("expected version 5, got %d", result.Version)
|
||||
}
|
||||
if len(result.Shards[0]) != 2 {
|
||||
t.Errorf("expected 2 nodes for shard 0, got %d", len(result.Shards[0]))
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShardMap_ReturnsDeepCopy(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
sm.AssignShard(0, []string{"node-1", "node-2"})
|
||||
|
||||
copy1 := sm.GetShardMap()
|
||||
copy2 := sm.GetShardMap()
|
||||
|
||||
// Modify copy1
|
||||
copy1.Shards[0][0] = "modified"
|
||||
copy1.Version = 999
|
||||
|
||||
// copy2 should be unaffected
|
||||
if copy2.Shards[0][0] == "modified" {
|
||||
t.Error("GetShardMap did not return a deep copy (shard nodes modified)")
|
||||
}
|
||||
if copy2.Version == 999 {
|
||||
t.Error("GetShardMap did not return a deep copy (version modified)")
|
||||
}
|
||||
|
||||
// Original should be unaffected
|
||||
nodes := sm.GetShardNodes(0)
|
||||
if nodes[0] == "modified" {
|
||||
t.Error("original shard map was modified through copy")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetShardCount(t *testing.T) {
|
||||
sm := NewShardManager(64, 1)
|
||||
|
||||
if sm.GetShardCount() != 64 {
|
||||
t.Errorf("expected 64, got %d", sm.GetShardCount())
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetReplicationFactor(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
if sm.GetReplicationFactor() != 3 {
|
||||
t.Errorf("expected 3, got %d", sm.GetReplicationFactor())
|
||||
}
|
||||
}
|
||||
|
||||
func TestRebalanceShards_NoPlacementStrategy(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
sm.placement = nil // Remove placement strategy
|
||||
|
||||
_, err := sm.RebalanceShards(map[string]*NodeInfo{})
|
||||
|
||||
if err == nil {
|
||||
t.Error("expected error when no placement strategy configured")
|
||||
}
|
||||
}
|
||||
|
||||
func TestRebalanceShards_WithNodes(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1", Status: NodeStatusActive},
|
||||
"node-2": {ID: "node-2", Status: NodeStatusActive},
|
||||
}
|
||||
|
||||
result, err := sm.RebalanceShards(nodes)
|
||||
|
||||
if err != nil {
|
||||
t.Errorf("unexpected error: %v", err)
|
||||
}
|
||||
if result == nil {
|
||||
t.Error("expected non-nil result")
|
||||
}
|
||||
}
|
||||
|
||||
// Test shard assignment with node failures
|
||||
func TestShardAssignment_NodeFailure(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
// Initial assignment with 3 replicas
|
||||
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
|
||||
|
||||
// Simulate node failure by reassigning without the failed node
|
||||
sm.AssignShard(0, []string{"node-1", "node-3"})
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
if len(nodes) != 2 {
|
||||
t.Errorf("expected 2 nodes after failure, got %d", len(nodes))
|
||||
}
|
||||
|
||||
// Verify primary is still correct
|
||||
primary := sm.GetPrimaryNode(0)
|
||||
if primary != "node-1" {
|
||||
t.Errorf("expected node-1 as primary, got %q", primary)
|
||||
}
|
||||
|
||||
// Verify replica count
|
||||
replicas := sm.GetReplicaNodes(0)
|
||||
if len(replicas) != 1 || replicas[0] != "node-3" {
|
||||
t.Errorf("expected [node-3] as replicas, got %v", replicas)
|
||||
}
|
||||
}
|
||||
|
||||
func TestShardAssignment_AllNodesFailExceptOne(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
|
||||
|
||||
// Simulate all but one node failing
|
||||
sm.AssignShard(0, []string{"node-3"})
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
if len(nodes) != 1 || nodes[0] != "node-3" {
|
||||
t.Errorf("expected [node-3], got %v", nodes)
|
||||
}
|
||||
|
||||
primary := sm.GetPrimaryNode(0)
|
||||
if primary != "node-3" {
|
||||
t.Errorf("expected node-3 as primary, got %q", primary)
|
||||
}
|
||||
|
||||
replicas := sm.GetReplicaNodes(0)
|
||||
if len(replicas) != 0 {
|
||||
t.Errorf("expected no replicas, got %v", replicas)
|
||||
}
|
||||
}
|
||||
|
||||
// Test replication factor is respected
|
||||
func TestReplicationFactor_Respected(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
if sm.GetReplicationFactor() != 3 {
|
||||
t.Errorf("expected replication factor 3, got %d", sm.GetReplicationFactor())
|
||||
}
|
||||
|
||||
// Assign with exactly the replication factor
|
||||
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
if len(nodes) != 3 {
|
||||
t.Errorf("expected 3 nodes matching replication factor, got %d", len(nodes))
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplicationFactor_CanExceed(t *testing.T) {
|
||||
// Note: ShardManager doesn't enforce max replication, it just tracks what's assigned
|
||||
sm := NewShardManager(16, 2)
|
||||
|
||||
// Assign more nodes than replication factor
|
||||
sm.AssignShard(0, []string{"node-1", "node-2", "node-3", "node-4"})
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
if len(nodes) != 4 {
|
||||
t.Errorf("expected 4 nodes, got %d", len(nodes))
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplicationFactor_LessThanFactor(t *testing.T) {
|
||||
sm := NewShardManager(16, 3)
|
||||
|
||||
// Assign fewer nodes than replication factor (possible during degraded state)
|
||||
sm.AssignShard(0, []string{"node-1"})
|
||||
|
||||
nodes := sm.GetShardNodes(0)
|
||||
if len(nodes) != 1 {
|
||||
t.Errorf("expected 1 node, got %d", len(nodes))
|
||||
}
|
||||
|
||||
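	// The shard is now under-replicated: fewer nodes than the replication factor.
	// ShardManager only records the assignment; in practice the cluster manager would repair it.
	if len(nodes) >= sm.GetReplicationFactor() {
		t.Errorf("expected under-replicated shard, got %d nodes for factor %d", len(nodes), sm.GetReplicationFactor())
	}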
|
||||
}
|
||||
|
||||
// Mock VM registry for testing GetActorsInShard
|
||||
type mockVMRegistry struct {
|
||||
activeVMs map[string]VirtualMachine
|
||||
}
|
||||
|
||||
func (m *mockVMRegistry) GetActiveVMs() map[string]VirtualMachine {
|
||||
return m.activeVMs
|
||||
}
|
||||
|
||||
func (m *mockVMRegistry) GetShard(actorID string) int {
|
||||
	// Stub: always reports shard 0. A real registry would use the same hashing as ShardManager.
|
||||
return 0
|
||||
}
|
||||
|
||||
type mockVM struct {
|
||||
id string
|
||||
actorID string
|
||||
state VMState
|
||||
}
|
||||
|
||||
func (m *mockVM) GetID() string { return m.id }
|
||||
func (m *mockVM) GetActorID() string { return m.actorID }
|
||||
func (m *mockVM) GetState() VMState { return m.state }
|
||||
|
||||
func TestGetActorsInShard_NilRegistry(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
actors := sm.GetActorsInShard(0, "node-1", nil)
|
||||
|
||||
if len(actors) != 0 {
|
||||
t.Errorf("expected empty slice for nil registry, got %v", actors)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetActorsInShard_WithActors(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
// Create mock VMs - need to find actors that map to the same shard
|
||||
// First, find some actor IDs that map to shard 0
|
||||
var actorsInShard0 []string
|
||||
for i := 0; i < 100; i++ {
|
||||
actorID := fmt.Sprintf("actor-%d", i)
|
||||
if sm.GetShard(actorID) == 0 {
|
||||
actorsInShard0 = append(actorsInShard0, actorID)
|
||||
if len(actorsInShard0) >= 3 {
|
||||
break
|
||||
}
|
||||
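		}
	}
	if len(actorsInShard0) == 0 {
		t.Fatal("no actor IDs hashed to shard 0; cannot exercise GetActorsInShard")
	}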
|
||||
|
||||
activeVMs := make(map[string]VirtualMachine)
|
||||
for _, actorID := range actorsInShard0 {
|
||||
activeVMs[actorID] = &mockVM{
|
||||
id: "vm-" + actorID,
|
||||
actorID: actorID,
|
||||
state: VMStateRunning,
|
||||
}
|
||||
}
|
||||
|
||||
registry := &mockVMRegistry{activeVMs: activeVMs}
|
||||
|
||||
actors := sm.GetActorsInShard(0, "node-1", registry)
|
||||
|
||||
if len(actors) != len(actorsInShard0) {
|
||||
t.Errorf("expected %d actors in shard 0, got %d", len(actorsInShard0), len(actors))
|
||||
}
|
||||
}
|
||||
|
||||
func TestGetActorsInShard_EmptyRegistry(t *testing.T) {
|
||||
sm := NewShardManager(16, 1)
|
||||
|
||||
registry := &mockVMRegistry{activeVMs: make(map[string]VirtualMachine)}
|
||||
|
||||
actors := sm.GetActorsInShard(0, "node-1", registry)
|
||||
|
||||
if len(actors) != 0 {
|
||||
t.Errorf("expected empty slice for empty registry, got %v", actors)
|
||||
}
|
||||
}
|
||||
|
||||
// Tests for ConsistentHashPlacement
|
||||
func TestConsistentHashPlacement_PlaceActor_NoNodes(t *testing.T) {
|
||||
placement := &ConsistentHashPlacement{}
|
||||
shardMap := &ShardMap{}
|
||||
|
||||
_, err := placement.PlaceActor("actor-1", shardMap, map[string]*NodeInfo{})
|
||||
|
||||
if err == nil {
|
||||
t.Error("expected error when no nodes available")
|
||||
}
|
||||
}
|
||||
|
||||
func TestConsistentHashPlacement_PlaceActor_SingleNode(t *testing.T) {
|
||||
placement := &ConsistentHashPlacement{}
|
||||
shardMap := &ShardMap{}
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1"},
|
||||
}
|
||||
|
||||
nodeID, err := placement.PlaceActor("actor-1", shardMap, nodes)
|
||||
|
||||
if err != nil {
|
||||
t.Errorf("unexpected error: %v", err)
|
||||
}
|
||||
if nodeID != "node-1" {
|
||||
t.Errorf("expected node-1, got %q", nodeID)
|
||||
}
|
||||
}
|
||||
|
||||
func TestConsistentHashPlacement_PlaceActor_ReturnsValidNode(t *testing.T) {
|
||||
placement := &ConsistentHashPlacement{}
|
||||
shardMap := &ShardMap{}
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1"},
|
||||
"node-2": {ID: "node-2"},
|
||||
"node-3": {ID: "node-3"},
|
||||
}
|
||||
|
||||
// PlaceActor should always return one of the available nodes
|
||||
for i := 0; i < 100; i++ {
|
||||
nodeID, err := placement.PlaceActor(fmt.Sprintf("actor-%d", i), shardMap, nodes)
|
||||
if err != nil {
|
||||
t.Errorf("unexpected error: %v", err)
|
||||
}
|
||||
if _, exists := nodes[nodeID]; !exists {
|
||||
t.Errorf("PlaceActor returned invalid node: %q", nodeID)
|
||||
}
|
||||
}
|
||||
}
|
||||
func TestConsistentHashPlacement_RebalanceShards(t *testing.T) {
|
||||
placement := &ConsistentHashPlacement{}
|
||||
currentMap := &ShardMap{
|
||||
Version: 1,
|
||||
Shards: map[int][]string{0: {"node-1"}},
|
||||
}
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1"},
|
||||
"node-2": {ID: "node-2"},
|
||||
}
|
||||
|
||||
result, err := placement.RebalanceShards(currentMap, nodes)
|
||||
|
||||
if err != nil {
|
||||
t.Errorf("unexpected error: %v", err)
|
||||
}
|
||||
// Current implementation returns unchanged map
|
||||
if result != currentMap {
|
||||
t.Error("expected same map returned (simplified implementation)")
|
||||
}
|
||||
}
|
||||
|
||||
// Benchmark tests
|
||||
func BenchmarkGetShard(b *testing.B) {
|
||||
sm := NewShardManager(1024, 1)
|
||||
|
||||
actorIDs := make([]string, 1000)
|
||||
for i := range actorIDs {
|
||||
actorIDs[i] = fmt.Sprintf("actor-%d", i)
|
||||
}
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
sm.GetShard(actorIDs[i%len(actorIDs)])
|
||||
}
|
||||
}
|
||||
|
||||
func BenchmarkAssignShard(b *testing.B) {
|
||||
sm := NewShardManager(1024, 1)
|
||||
nodes := []string{"node-1", "node-2", "node-3"}
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
sm.AssignShard(i%1024, nodes)
|
||||
}
|
||||
}
|
||||
|
||||
func BenchmarkPlaceActor(b *testing.B) {
|
||||
sm := NewShardManager(1024, 1)
|
||||
nodes := map[string]*NodeInfo{
|
||||
"node-1": {ID: "node-1"},
|
||||
"node-2": {ID: "node-2"},
|
||||
"node-3": {ID: "node-3"},
|
||||
}
|
||||
|
||||
actorIDs := make([]string, 1000)
|
||||
for i := range actorIDs {
|
||||
actorIDs[i] = fmt.Sprintf("actor-%d", i)
|
||||
}
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
sm.PlaceActor(actorIDs[i%len(actorIDs)], nodes)
|
||||
}
|
||||
}
|
||||
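The shard tests above rely on two conventions: an actor ID always maps to the same shard, and the first node in a shard's list is the primary while the rest are replicas. The following is a minimal sketch of both, not the ShardManager implementation. The names `shardFor` and `splitAssignment` are hypothetical, and the FNV-1a hash is an assumption, since the hash behind `GetShard` does not appear in this diff.

```go
package main

import "hash/fnv"

// shardFor maps an actor ID onto one of shardCount shards.
// FNV-1a is an assumption chosen for illustration; the hash that
// ShardManager.GetShard actually uses is not shown in this diff.
func shardFor(actorID string, shardCount int) int {
	if shardCount <= 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(actorID)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(shardCount))
}

// splitAssignment follows the convention the tests check:
// the first node in a shard's list is the primary, the rest are replicas.
// Returning "" for an empty list is an assumption of this sketch.
func splitAssignment(nodes []string) (primary string, replicas []string) {
	if len(nodes) == 0 {
		return "", nil
	}
	// Copy the tail so callers cannot mutate the original assignment.
	return nodes[0], append([]string(nil), nodes[1:]...)
}
```

With this convention, reassigning a shard from `[node-1, node-2, node-3]` to `[node-1, node-3]` keeps `node-1` as primary and leaves `node-3` as the only replica, which is what `TestShardAssignment_NodeFailure` asserts.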
145
cluster/types.go
@@ -4,17 +4,47 @@ import (
|
||||
"time"
|
||||
)
|
||||
|
||||
// Default configuration values
|
||||
const (
|
||||
-	// NumShards defines the total number of shards in the cluster
-	NumShards = 1024
-	// VirtualNodes defines the number of virtual nodes per physical node for consistent hashing
-	VirtualNodes = 150
+	// DefaultNumShards defines the default total number of shards in the cluster
+	DefaultNumShards = 1024
+	// DefaultVirtualNodes defines the default number of virtual nodes per physical node
+	DefaultVirtualNodes = 150
|
||||
// Leadership election constants
|
||||
LeaderLeaseTimeout = 10 * time.Second // How long a leader lease lasts
|
||||
HeartbeatInterval = 3 * time.Second // How often leader sends heartbeats
|
||||
ElectionTimeout = 2 * time.Second // How long to wait for election
|
||||
)
|
||||
|
||||
// HashRingConfig holds configuration for the consistent hash ring
|
||||
type HashRingConfig struct {
|
||||
// VirtualNodes is the number of virtual nodes per physical node (default: 150)
|
||||
VirtualNodes int
|
||||
}
|
||||
|
||||
// DefaultHashRingConfig returns the default hash ring configuration
|
||||
func DefaultHashRingConfig() HashRingConfig {
|
||||
return HashRingConfig{
|
||||
VirtualNodes: DefaultVirtualNodes,
|
||||
}
|
||||
}
|
||||
|
||||
// ShardConfig holds configuration for shard management
|
||||
type ShardConfig struct {
|
||||
// ShardCount is the total number of shards (default: 1024)
|
||||
ShardCount int
|
||||
// ReplicationFactor is the number of replicas per shard (default: 1)
|
||||
ReplicationFactor int
|
||||
}
|
||||
|
||||
// DefaultShardConfig returns the default shard configuration
|
||||
func DefaultShardConfig() ShardConfig {
|
||||
return ShardConfig{
|
||||
ShardCount: DefaultNumShards,
|
||||
ReplicationFactor: 1,
|
||||
}
|
||||
}
|
||||
|
||||
// NodeStatus represents the health status of a node
|
||||
type NodeStatus string
|
||||
|
||||
@@ -30,14 +60,14 @@ type NodeInfo struct {
|
||||
Address string `json:"address"`
|
||||
Port int `json:"port"`
|
||||
Status NodeStatus `json:"status"`
|
||||
-	Capacity float64 `json:"capacity"` // Maximum load capacity
-	Load float64 `json:"load"` // Current CPU/memory load
-	LastSeen time.Time `json:"lastSeen"` // Last heartbeat timestamp
+	Capacity float64 `json:"capacity"` // Maximum load capacity
+	Load float64 `json:"load"` // Current CPU/memory load
+	LastSeen time.Time `json:"lastSeen"` // Last heartbeat timestamp
 	Timestamp time.Time `json:"timestamp"`
 	Metadata map[string]string `json:"metadata"`
 	IsLeader bool `json:"isLeader"`
-	VMCount int `json:"vmCount"` // Number of VMs on this node
-	ShardIDs []int `json:"shardIds"` // Shards assigned to this node
+	VMCount int `json:"vmCount"` // Number of VMs on this node
+	ShardIDs []int `json:"shardIds"` // Shards assigned to this node
|
||||
}
|
||||
|
||||
// NodeUpdateType represents the type of node update
|
||||
@@ -57,9 +87,9 @@ type NodeUpdate struct {
|
||||
|
||||
// ShardMap represents the distribution of shards across cluster nodes
|
||||
type ShardMap struct {
|
||||
-	Version uint64 `json:"version"` // Incremented on each change
-	Shards map[int][]string `json:"shards"` // shard ID -> [primary, replica1, replica2]
-	Nodes map[string]NodeInfo `json:"nodes"` // node ID -> node info
+	Version uint64 `json:"version"` // Incremented on each change
+	Shards map[int][]string `json:"shards"` // shard ID -> [primary, replica1, replica2]
+	Nodes map[string]NodeInfo `json:"nodes"` // node ID -> node info
|
||||
UpdateTime time.Time `json:"updateTime"`
|
||||
}
|
||||
|
||||
@@ -74,23 +104,23 @@ type ClusterMessage struct {
|
||||
|
||||
// RebalanceRequest represents a request to rebalance shards
|
||||
type RebalanceRequest struct {
|
||||
-	RequestID string `json:"requestId"`
-	FromNode string `json:"fromNode"`
-	ToNode string `json:"toNode"`
-	ShardIDs []int `json:"shardIds"`
-	Reason string `json:"reason"`
-	Migrations []ActorMigration `json:"migrations"`
+	RequestID string `json:"requestId"`
+	FromNode string `json:"fromNode"`
+	ToNode string `json:"toNode"`
+	ShardIDs []int `json:"shardIds"`
+	Reason string `json:"reason"`
+	Migrations []ActorMigration `json:"migrations"`
|
||||
}
|
||||
|
||||
// ActorMigration represents the migration of an actor between nodes
|
||||
type ActorMigration struct {
|
||||
-	ActorID string `json:"actorId"`
-	FromNode string `json:"fromNode"`
-	ToNode string `json:"toNode"`
-	ShardID int `json:"shardId"`
-	State map[string]interface{} `json:"state"`
-	Version int64 `json:"version"`
-	Status string `json:"status"` // "pending", "in_progress", "completed", "failed"
+	ActorID string `json:"actorId"`
+	FromNode string `json:"fromNode"`
+	ToNode string `json:"toNode"`
+	ShardID int `json:"shardId"`
+	State map[string]interface{} `json:"state"`
+	Version int64 `json:"version"`
+	Status string `json:"status"` // "pending", "in_progress", "completed", "failed"
|
||||
}
|
||||
|
||||
// LeaderElectionCallbacks defines callbacks for leadership changes
|
||||
@@ -108,3 +138,68 @@ type LeadershipLease struct {
|
||||
StartedAt time.Time `json:"startedAt"`
|
||||
}
|
||||
|
||||
// VirtualMachine defines the interface for a virtual machine instance.
|
||||
// This interface provides the minimal contract needed by the cluster package
|
||||
// to interact with VMs without creating import cycles with the runtime package.
|
||||
type VirtualMachine interface {
|
||||
// GetID returns the unique identifier for this VM
|
||||
GetID() string
|
||||
// GetActorID returns the actor ID this VM represents
|
||||
GetActorID() string
|
||||
// GetState returns the current state of the VM
|
||||
GetState() VMState
|
||||
}
|
||||
|
||||
// VMState represents the state of a virtual machine
|
||||
type VMState string
|
||||
|
||||
const (
|
||||
VMStateIdle VMState = "idle"
|
||||
VMStateRunning VMState = "running"
|
||||
VMStatePaused VMState = "paused"
|
||||
VMStateStopped VMState = "stopped"
|
||||
)
|
||||
|
||||
// RuntimeModel defines the interface for an EventStorming model that can be loaded into a runtime.
|
||||
// This decouples the cluster package from the specific eventstorming package.
|
||||
type RuntimeModel interface {
|
||||
// GetID returns the unique identifier for this model
|
||||
GetID() string
|
||||
// GetName returns the name of this model
|
||||
GetName() string
|
||||
}
|
||||
|
||||
// RuntimeMessage defines the interface for messages that can be sent through the runtime.
|
||||
// This provides type safety for inter-actor communication without creating import cycles.
|
||||
type RuntimeMessage interface {
|
||||
// GetTargetActorID returns the ID of the actor this message is addressed to
|
||||
GetTargetActorID() string
|
||||
// GetType returns the message type identifier
|
||||
GetType() string
|
||||
}
|
||||
|
||||
// ModelPayload is a concrete type for JSON-unmarshaling RuntimeModel payloads.
|
||||
// Use this when receiving model data over the network.
|
||||
type ModelPayload struct {
|
||||
ID string `json:"id"`
|
||||
Name string `json:"name"`
|
||||
}
|
||||
|
||||
// GetID implements RuntimeModel
|
||||
func (m *ModelPayload) GetID() string { return m.ID }
|
||||
|
||||
// GetName implements RuntimeModel
|
||||
func (m *ModelPayload) GetName() string { return m.Name }
|
||||
|
||||
// MessagePayload is a concrete type for JSON-unmarshaling RuntimeMessage payloads.
|
||||
// Use this when receiving message data over the network.
|
||||
type MessagePayload struct {
|
||||
TargetActorID string `json:"targetActorId"`
|
||||
Type string `json:"type"`
|
||||
}
|
||||
|
||||
// GetTargetActorID implements RuntimeMessage
|
||||
func (m *MessagePayload) GetTargetActorID() string { return m.TargetActorID }
|
||||
|
||||
// GetType implements RuntimeMessage
|
||||
func (m *MessagePayload) GetType() string { return m.Type }
|
||||
|
||||
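The doc comments on `MessagePayload` say it is the concrete type to unmarshal into when a `RuntimeMessage` arrives over the network. A minimal sketch of that decode step might look like the following. The interface and struct are copied from the diff so the block compiles on its own. The helper `decodeRuntimeMessage` and its rejection of an empty target actor ID are assumptions for illustration and not part of the change.

```go
package main

import (
	"encoding/json"
	"errors"
)

// RuntimeMessage and MessagePayload mirror the definitions added in cluster/types.go.
type RuntimeMessage interface {
	GetTargetActorID() string
	GetType() string
}

type MessagePayload struct {
	TargetActorID string `json:"targetActorId"`
	Type          string `json:"type"`
}

func (m *MessagePayload) GetTargetActorID() string { return m.TargetActorID }
func (m *MessagePayload) GetType() string          { return m.Type }

// decodeRuntimeMessage is a hypothetical helper: it unmarshals raw JSON into
// the concrete payload and hands it back behind the interface.
func decodeRuntimeMessage(data []byte) (RuntimeMessage, error) {
	var p MessagePayload
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	// Rejecting an empty target is an assumption of this sketch.
	if p.TargetActorID == "" {
		return nil, errors.New("message payload has no target actor ID")
	}
	return &p, nil
}
```

Because the getters use pointer receivers, the helper returns `&p`; a plain `MessagePayload` value would not satisfy `RuntimeMessage`.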
67
event.go
@@ -28,6 +28,39 @@ func (e *VersionConflictError) Unwrap() error {
|
||||
return ErrVersionConflict
|
||||
}
|
||||
|
||||
// ReplayError captures information about a malformed event encountered during replay.
|
||||
// This allows callers to inspect and handle corrupted data without losing context.
|
||||
type ReplayError struct {
|
||||
// SequenceNumber is the sequence number of the message in the stream (if available)
|
||||
SequenceNumber uint64
|
||||
// RawData is the raw bytes that could not be unmarshaled
|
||||
RawData []byte
|
||||
// Err is the underlying unmarshal error
|
||||
Err error
|
||||
}
|
||||
|
||||
func (e *ReplayError) Error() string {
|
||||
return fmt.Sprintf("failed to unmarshal event at sequence %d: %v", e.SequenceNumber, e.Err)
|
||||
}
|
||||
|
||||
func (e *ReplayError) Unwrap() error {
|
||||
return e.Err
|
||||
}
|
||||
|
||||
// ReplayResult contains the results of replaying events, including any errors encountered.
|
||||
// This allows callers to decide how to handle malformed events rather than silently skipping them.
|
||||
type ReplayResult struct {
|
||||
// Events contains the successfully unmarshaled events
|
||||
Events []*Event
|
||||
// Errors contains information about any malformed events encountered
|
||||
Errors []ReplayError
|
||||
}
|
||||
|
||||
// HasErrors returns true if any malformed events were encountered during replay
|
||||
func (r *ReplayResult) HasErrors() bool {
|
||||
return len(r.Errors) > 0
|
||||
}
|
||||
|
||||
// Event represents a domain event in the system
|
||||
type Event struct {
|
||||
ID string `json:"id"`
|
||||
@@ -40,6 +73,14 @@ type Event struct {
|
||||
Timestamp time.Time `json:"timestamp"`
|
||||
}
|
||||
|
||||
// Common event types for Aether infrastructure
|
||||
const (
|
||||
// EventTypeEventStored is an internal event published when an event is successfully persisted.
|
||||
// This event allows observability components (metrics, projections, audit systems) to react
|
||||
// to persisted events without coupling to application code.
|
||||
EventTypeEventStored = "EventStored"
|
||||
)
|
||||
|
||||
// Common metadata keys for distributed tracing and auditing
|
||||
const (
|
||||
// MetadataKeyCorrelationID identifies related events across services
|
||||
@@ -143,6 +184,17 @@ type ActorSnapshot struct {
|
||||
|
||||
// EventStore defines the interface for event persistence.
|
||||
//
|
||||
// # Immutability Guarantee
|
||||
//
|
||||
// EventStore is append-only. Once an event is persisted via SaveEvent, it is never
|
||||
// modified or deleted. The interface intentionally provides no Update or Delete methods.
|
||||
// This ensures:
|
||||
// - Events serve as an immutable audit trail
|
||||
// - State can be safely derived by replaying events
|
||||
// - Concurrent reads are always safe (events never change)
|
||||
//
|
||||
// To correct a mistake, append a new event that expresses the correction.
|
||||
//
|
||||
// # Version Semantics
|
||||
//
|
||||
// Events for an actor must have monotonically increasing versions. When SaveEvent
|
||||
@@ -163,10 +215,13 @@ type EventStore interface {
|
||||
// SaveEvent persists an event to the store. The event's Version must be
|
||||
// strictly greater than the current latest version for the actor.
|
||||
// Returns VersionConflictError if version <= current latest version.
|
||||
// Once saved, the event is immutable and can never be modified or deleted.
|
||||
SaveEvent(event *Event) error
|
||||
|
||||
// GetEvents retrieves events for an actor from a specific version (inclusive).
|
||||
// Returns an empty slice if no events exist for the actor.
|
||||
// The returned events are guaranteed to be immutable - they will never be
|
||||
// modified or deleted from the store.
|
||||
GetEvents(actorID string, fromVersion int64) ([]*Event, error)
|
||||
|
||||
// GetLatestVersion returns the latest version for an actor.
|
||||
@@ -174,6 +229,18 @@ type EventStore interface {
|
||||
GetLatestVersion(actorID string) (int64, error)
|
||||
}
|
||||
|
||||
// EventStoreWithErrors extends EventStore with methods that report malformed events.
|
||||
// Stores that may encounter corrupted data during replay (e.g., JetStream) should
|
||||
// implement this interface to give callers visibility into data quality issues.
|
||||
type EventStoreWithErrors interface {
|
||||
EventStore
|
||||
|
||||
// GetEventsWithErrors retrieves events for an actor and reports any malformed
|
||||
// events encountered. This method allows callers to decide how to handle
|
||||
// corrupted data rather than silently skipping it.
|
||||
GetEventsWithErrors(actorID string, fromVersion int64) (*ReplayResult, error)
|
||||
}
|
||||
|
||||
// SnapshotStore extends EventStore with snapshot capabilities
|
||||
type SnapshotStore interface {
|
||||
EventStore
|
||||
|
||||
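`EventStoreWithErrors` is described above as an optional extension that lets callers decide what to do with malformed events. One way a caller could use it is sketched below, under stated assumptions: the types are trimmed copies of those in event.go (only the fields this sketch needs), and `replayStrict` and `stubStore` are hypothetical names. The policy shown, failing the whole replay on the first malformed event, is only one of several a caller might choose.

```go
package main

import "fmt"

// Trimmed copies of the types from event.go so the sketch compiles on its own.
type Event struct {
	ID      string
	Version int64
}

type ReplayError struct {
	SequenceNumber uint64
	RawData        []byte
	Err            error
}

func (e *ReplayError) Error() string {
	return fmt.Sprintf("failed to unmarshal event at sequence %d: %v", e.SequenceNumber, e.Err)
}

type ReplayResult struct {
	Events []*Event
	Errors []ReplayError
}

func (r *ReplayResult) HasErrors() bool { return len(r.Errors) > 0 }

type EventStore interface {
	SaveEvent(event *Event) error
	GetEvents(actorID string, fromVersion int64) ([]*Event, error)
	GetLatestVersion(actorID string) (int64, error)
}

type EventStoreWithErrors interface {
	EventStore
	GetEventsWithErrors(actorID string, fromVersion int64) (*ReplayResult, error)
}

// replayStrict is a hypothetical caller-side policy: use the error-reporting
// method when the store offers it and refuse to continue past malformed events.
func replayStrict(store EventStore, actorID string, fromVersion int64) ([]*Event, error) {
	withErrs, ok := store.(EventStoreWithErrors)
	if !ok {
		// Plain stores give no visibility into malformed events.
		return store.GetEvents(actorID, fromVersion)
	}
	result, err := withErrs.GetEventsWithErrors(actorID, fromVersion)
	if err != nil {
		return nil, err
	}
	if result.HasErrors() {
		first := &result.Errors[0]
		return nil, fmt.Errorf("replay of %q hit %d malformed event(s): %w", actorID, len(result.Errors), first)
	}
	return result.Events, nil
}

// stubStore is a hypothetical in-memory stand-in used only to exercise replayStrict.
type stubStore struct{ result *ReplayResult }

func (s *stubStore) SaveEvent(*Event) error                    { return nil }
func (s *stubStore) GetEvents(string, int64) ([]*Event, error) { return s.result.Events, nil }
func (s *stubStore) GetLatestVersion(string) (int64, error)    { return 0, nil }
func (s *stubStore) GetEventsWithErrors(string, int64) (*ReplayResult, error) {
	return s.result, nil
}
```

A more lenient caller could instead log `result.Errors` and proceed with `result.Events`; the interface leaves that choice open.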
316
event_test.go
@@ -2,6 +2,8 @@ package aether
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
@@ -1208,3 +1210,317 @@ func TestEvent_MetadataAllHelpersRoundTrip(t *testing.T) {
|
||||
t.Errorf("GetSpanID mismatch: got %q", decoded.GetSpanID())
|
||||
}
|
||||
}
|
||||
|
||||
// Tests for ReplayError and ReplayResult types
|
||||
|
||||
func TestReplayError_Error(t *testing.T) {
|
||||
err := &ReplayError{
|
||||
SequenceNumber: 42,
|
||||
RawData: []byte(`invalid json`),
|
||||
Err: json.Unmarshal([]byte(`{`), &struct{}{}),
|
||||
}
|
||||
|
||||
errMsg := err.Error()
|
||||
if !strings.Contains(errMsg, "42") {
|
||||
t.Errorf("expected error message to contain sequence number, got: %s", errMsg)
|
||||
}
|
||||
if !strings.Contains(errMsg, "unmarshal") || !strings.Contains(errMsg, "failed") {
|
||||
t.Errorf("expected error message to contain 'failed' and 'unmarshal', got: %s", errMsg)
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayError_Unwrap(t *testing.T) {
|
||||
innerErr := json.Unmarshal([]byte(`{`), &struct{}{})
|
||||
err := &ReplayError{
|
||||
SequenceNumber: 1,
|
||||
RawData: []byte(`{`),
|
||||
Err: innerErr,
|
||||
}
|
||||
|
||||
unwrapped := err.Unwrap()
|
||||
if unwrapped != innerErr {
|
||||
t.Errorf("expected Unwrap to return inner error")
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayResult_HasErrors(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
result *ReplayResult
|
||||
expected bool
|
||||
}{
|
||||
{
|
||||
name: "no errors",
|
||||
result: &ReplayResult{Events: []*Event{}, Errors: []ReplayError{}},
|
||||
expected: false,
|
||||
},
|
||||
{
|
||||
name: "nil errors slice",
|
||||
result: &ReplayResult{Events: []*Event{}, Errors: nil},
|
||||
expected: false,
|
||||
},
|
||||
{
|
||||
name: "has errors",
|
||||
result: &ReplayResult{
|
||||
Events: []*Event{},
|
||||
Errors: []ReplayError{
|
||||
{SequenceNumber: 1, RawData: []byte(`bad`), Err: nil},
|
||||
},
|
||||
},
|
||||
expected: true,
|
||||
},
|
||||
{
|
||||
name: "has events and errors",
|
||||
result: &ReplayResult{
|
||||
Events: []*Event{{ID: "evt-1"}},
|
||||
Errors: []ReplayError{
|
||||
{SequenceNumber: 2, RawData: []byte(`bad`), Err: nil},
|
||||
},
|
||||
},
|
||||
expected: true,
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
if got := tt.result.HasErrors(); got != tt.expected {
|
||||
t.Errorf("HasErrors() = %v, want %v", got, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayResult_EmptyResult(t *testing.T) {
|
||||
result := &ReplayResult{
|
||||
Events: []*Event{},
|
||||
Errors: []ReplayError{},
|
||||
}
|
||||
|
||||
if result.HasErrors() {
|
||||
t.Error("expected HasErrors() to return false for empty result")
|
||||
}
|
||||
if len(result.Events) != 0 {
|
||||
t.Errorf("expected 0 events, got %d", len(result.Events))
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayError_WithZeroSequence(t *testing.T) {
|
||||
err := &ReplayError{
|
||||
SequenceNumber: 0,
|
||||
RawData: []byte(`corrupted`),
|
||||
Err: json.Unmarshal([]byte(`not-json`), &struct{}{}),
|
||||
}
|
||||
|
||||
errMsg := err.Error()
|
||||
if !strings.Contains(errMsg, "sequence 0") {
|
||||
t.Errorf("expected error message to contain 'sequence 0', got: %s", errMsg)
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayError_WithLargeRawData(t *testing.T) {
|
||||
largeData := make([]byte, 1024*1024) // 1MB
|
||||
for i := range largeData {
|
||||
largeData[i] = 'x'
|
||||
}
|
||||
|
||||
err := &ReplayError{
|
||||
SequenceNumber: 999,
|
||||
RawData: largeData,
|
||||
Err: json.Unmarshal(largeData, &struct{}{}),
|
||||
}
|
||||
|
||||
// Should be able to create the error without issues
|
||||
if len(err.RawData) != 1024*1024 {
|
||||
t.Errorf("expected RawData to be preserved, got length %d", len(err.RawData))
|
||||
}
|
||||
|
||||
// Error() should still work
|
||||
_ = err.Error()
|
||||
}
|
||||
|
||||
// Tests for VersionConflictError
|
||||
|
||||
func TestVersionConflictError_Error(t *testing.T) {
|
||||
	err := &VersionConflictError{
		ActorID:          "order-abc",
		AttemptedVersion: 3,
		CurrentVersion:   5,
	}

	errMsg := err.Error()

	// Verify error message contains all context.
	// The ActorID is digit-free so the version checks cannot match it by accident.
	if !strings.Contains(errMsg, "order-abc") {
		t.Errorf("error message should contain ActorID, got: %s", errMsg)
	}
	if !strings.Contains(errMsg, "3") {
		t.Errorf("error message should contain AttemptedVersion, got: %s", errMsg)
	}
	if !strings.Contains(errMsg, "5") {
		t.Errorf("error message should contain CurrentVersion, got: %s", errMsg)
	}
|
||||
if !strings.Contains(errMsg, "version conflict") {
|
||||
t.Errorf("error message should contain 'version conflict', got: %s", errMsg)
|
||||
}
|
||||
}
|
||||
|
||||
func TestVersionConflictError_Fields(t *testing.T) {
|
||||
err := &VersionConflictError{
|
||||
ActorID: "actor-456",
|
||||
AttemptedVersion: 10,
|
||||
CurrentVersion: 8,
|
||||
}
|
||||
|
||||
if err.ActorID != "actor-456" {
|
||||
t.Errorf("ActorID mismatch: got %q, want %q", err.ActorID, "actor-456")
|
||||
}
|
||||
if err.AttemptedVersion != 10 {
|
||||
t.Errorf("AttemptedVersion mismatch: got %d, want %d", err.AttemptedVersion, 10)
|
||||
}
|
||||
if err.CurrentVersion != 8 {
|
||||
t.Errorf("CurrentVersion mismatch: got %d, want %d", err.CurrentVersion, 8)
|
||||
}
|
||||
}
|
||||
|
||||
func TestVersionConflictError_Unwrap(t *testing.T) {
|
||||
err := &VersionConflictError{
|
||||
ActorID: "actor-789",
|
||||
AttemptedVersion: 2,
|
||||
CurrentVersion: 1,
|
||||
}
|
||||
|
||||
unwrapped := err.Unwrap()
|
||||
if unwrapped != ErrVersionConflict {
|
||||
t.Errorf("Unwrap should return ErrVersionConflict sentinel")
|
||||
}
|
||||
}
|
||||
|
||||
func TestVersionConflictError_ErrorsIs(t *testing.T) {
|
||||
err := &VersionConflictError{
|
||||
ActorID: "test-actor",
|
||||
AttemptedVersion: 5,
|
||||
CurrentVersion: 4,
|
||||
}
|
||||
|
||||
// Test that errors.Is works with sentinel
|
||||
if !errors.Is(err, ErrVersionConflict) {
|
||||
t.Error("errors.Is(err, ErrVersionConflict) should return true")
|
||||
}
|
||||
|
||||
// Test that other errors don't match
|
||||
if errors.Is(err, errors.New("other error")) {
|
||||
t.Error("errors.Is should not match unrelated errors")
|
||||
}
|
||||
}
|
||||
|
||||
func TestVersionConflictError_ErrorsAs(t *testing.T) {
|
||||
originalErr := &VersionConflictError{
|
||||
ActorID: "actor-unwrap",
|
||||
AttemptedVersion: 7,
|
||||
CurrentVersion: 6,
|
||||
}
|
||||
|
||||
var versionErr *VersionConflictError
|
||||
if !errors.As(originalErr, &versionErr) {
|
||||
t.Fatalf("errors.As should succeed with VersionConflictError")
|
||||
}
|
||||
|
||||
// Verify fields are accessible through the error extracted by errors.As
|
||||
if versionErr.ActorID != "actor-unwrap" {
|
||||
t.Errorf("ActorID mismatch after As: got %q", versionErr.ActorID)
|
||||
}
|
||||
if versionErr.AttemptedVersion != 7 {
|
||||
t.Errorf("AttemptedVersion mismatch after As: got %d", versionErr.AttemptedVersion)
|
||||
}
|
||||
if versionErr.CurrentVersion != 6 {
|
||||
t.Errorf("CurrentVersion mismatch after As: got %d", versionErr.CurrentVersion)
|
||||
}
|
||||
}
|
||||
|
||||
func TestVersionConflictError_CanReadCurrentVersion(t *testing.T) {
|
||||
// This test verifies that applications can read CurrentVersion for retry strategies
|
||||
err := &VersionConflictError{
|
||||
ActorID: "order-abc",
|
||||
AttemptedVersion: 2,
|
||||
CurrentVersion: 10,
|
||||
}
|
||||
|
||||
var versionErr *VersionConflictError
|
||||
if !errors.As(err, &versionErr) {
|
||||
t.Fatal("failed to unwrap VersionConflictError")
|
||||
}
|
||||
|
||||
// Application can use CurrentVersion to decide retry strategy
|
||||
nextVersion := versionErr.CurrentVersion + 1
|
||||
|
||||
if nextVersion != 11 {
|
||||
t.Errorf("application should be able to compute next version: got %d, want 11", nextVersion)
|
||||
}
|
||||
|
||||
// Application can log detailed context
|
||||
logMsg := fmt.Sprintf("Version conflict for actor %q: attempted %d, current %d, will retry with %d",
|
||||
versionErr.ActorID, versionErr.AttemptedVersion, versionErr.CurrentVersion, nextVersion)
|
||||
|
||||
if !strings.Contains(logMsg, "order-abc") {
|
||||
t.Errorf("application context logging failed: %s", logMsg)
|
||||
}
|
||||
}
|
||||
|
||||
func TestVersionConflictError_EdgeCases(t *testing.T) {
|
||||
testCases := []struct {
|
||||
name string
|
||||
actorID string
|
||||
attemp int64
|
||||
current int64
|
||||
}{
|
||||
{"zero current", "actor-1", 1, 0},
|
||||
{"large numbers", "actor-2", 1000000, 999999},
|
||||
{"max int64", "actor-3", 9223372036854775807, 9223372036854775806},
|
||||
{"negative attempt", "actor-4", -1, -2},
|
||||
{"empty actor id", "", 1, 0},
|
||||
{"special chars in actor id", "actor@#$%", 2, 1},
|
||||
}
|
||||
|
||||
for _, tc := range testCases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
err := &VersionConflictError{
|
||||
ActorID: tc.actorID,
|
||||
AttemptedVersion: tc.attemp,
|
||||
CurrentVersion: tc.current,
|
||||
}
|
||||
|
||||
// Should not panic
|
||||
msg := err.Error()
|
||||
if msg == "" {
|
||||
t.Error("Error() should return non-empty string")
|
||||
}
|
||||
|
||||
// Should be wrapped correctly
|
||||
if err.Unwrap() != ErrVersionConflict {
|
||||
t.Error("Unwrap should return ErrVersionConflict")
|
||||
}
|
||||
|
||||
// errors.Is should work
|
||||
if !errors.Is(err, ErrVersionConflict) {
|
||||
t.Error("errors.Is should work for edge case")
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestErrVersionConflict_Sentinel(t *testing.T) {
|
||||
// Verify the sentinel error is correctly defined
|
||||
if ErrVersionConflict == nil {
|
||||
t.Fatal("ErrVersionConflict should not be nil")
|
||||
}
|
||||
|
||||
expectedMsg := "version conflict"
|
||||
if ErrVersionConflict.Error() != expectedMsg {
|
||||
t.Errorf("ErrVersionConflict message mismatch: got %q, want %q", ErrVersionConflict.Error(), expectedMsg)
|
||||
}
|
||||
|
||||
// Test that it's usable with errors.Is
|
||||
if !errors.Is(ErrVersionConflict, ErrVersionConflict) {
|
||||
t.Error("ErrVersionConflict should match itself with errors.Is")
|
||||
}
|
||||
}
|
||||
|
||||
243
eventbus.go
@@ -5,82 +5,225 @@ import (
|
||||
"sync"
|
||||
)
|
||||
|
||||
// EventBroadcaster defines the interface for publishing and subscribing to events
|
||||
// EventBroadcaster defines the interface for publishing and subscribing to events.
|
||||
//
|
||||
// Subscribe accepts namespace patterns following NATS subject matching conventions:
|
||||
// - Exact match: "tenant-a" matches only "tenant-a"
|
||||
// - Single wildcard: "*" matches any single token, "tenant-*" matches "tenant-a", "tenant-b"
|
||||
// - Multi-token wildcard: ">" matches one or more tokens (only at end of pattern)
|
||||
//
|
||||
// Security Warning: Wildcard subscriptions bypass namespace isolation.
|
||||
// Only grant wildcard access to trusted system components.
|
||||
type EventBroadcaster interface {
|
||||
Subscribe(namespaceID string) <-chan *Event
|
||||
Unsubscribe(namespaceID string, ch <-chan *Event)
|
||||
// Subscribe creates a channel that receives events matching the namespace pattern.
|
||||
// Pattern syntax follows NATS conventions: "*" matches single token, ">" matches multiple.
|
||||
Subscribe(namespacePattern string) <-chan *Event
|
||||
|
||||
// SubscribeWithFilter creates a filtered subscription channel for a namespace pattern.
|
||||
// Events are filtered by the provided SubscriptionFilter before delivery.
|
||||
// Filters are applied with AND logic - events must match all specified criteria.
|
||||
//
|
||||
// Example: Subscribe to "orders" namespace, only receiving "OrderPlaced" events for "order-*" actors:
|
||||
// filter := &SubscriptionFilter{
|
||||
// EventTypes: []string{"OrderPlaced"},
|
||||
// ActorPattern: "order-*",
|
||||
// }
|
||||
// ch := bus.SubscribeWithFilter("orders", filter)
|
||||
SubscribeWithFilter(namespacePattern string, filter *SubscriptionFilter) <-chan *Event
|
||||
|
||||
Unsubscribe(namespacePattern string, ch <-chan *Event)
|
||||
Publish(namespaceID string, event *Event)
|
||||
Stop()
|
||||
SubscriberCount(namespaceID string) int
|
||||
}
|
||||
|
||||
// EventBus broadcasts events to multiple subscribers within a namespace
|
||||
// MetricsProvider is an optional interface that EventBroadcaster implementations
|
||||
// can implement to expose metrics.
|
||||
type MetricsProvider interface {
|
||||
// Metrics returns the metrics collector for this broadcaster.
|
||||
Metrics() BroadcasterMetrics
|
||||
}
|
||||
|
||||
// subscription represents a single subscriber channel with its pattern
|
||||
type subscription struct {
|
||||
pattern string
|
||||
ch chan *Event
|
||||
}
|
||||
|
||||
// filteredSubscription represents a subscriber with an optional filter
|
||||
type filteredSubscription struct {
|
||||
pattern string
|
||||
ch chan *Event
|
||||
filter *SubscriptionFilter
|
||||
}
|
||||
|
||||
// EventBus broadcasts events to multiple subscribers within a namespace.
|
||||
// Supports wildcard patterns for cross-namespace subscriptions.
|
||||
//
|
||||
// Security Considerations:
|
||||
// Wildcard subscriptions (using "*" or ">") receive events from multiple namespaces.
|
||||
// This is intentional for cross-cutting concerns like logging, monitoring, and auditing.
|
||||
// However, it bypasses namespace isolation - use with appropriate access controls.
|
||||
type EventBus struct {
|
||||
subscribers map[string][]chan *Event // namespaceID -> channels
|
||||
mutex sync.RWMutex
|
||||
ctx context.Context
|
||||
cancel context.CancelFunc
|
||||
// exactSubscribers holds subscribers for exact namespace matches (no wildcards)
|
||||
exactSubscribers map[string][]*filteredSubscription
|
||||
// wildcardSubscribers holds subscribers with wildcard patterns
|
||||
wildcardSubscribers []*filteredSubscription
|
||||
mutex sync.RWMutex
|
||||
ctx context.Context
|
||||
cancel context.CancelFunc
|
||||
metrics *DefaultMetricsCollector
|
||||
}
|
||||
|
||||
// NewEventBus creates a new event bus
|
||||
func NewEventBus() *EventBus {
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
return &EventBus{
|
||||
subscribers: make(map[string][]chan *Event),
|
||||
ctx: ctx,
|
||||
cancel: cancel,
|
||||
exactSubscribers: make(map[string][]*filteredSubscription),
|
||||
wildcardSubscribers: make([]*filteredSubscription, 0),
|
||||
ctx: ctx,
|
||||
cancel: cancel,
|
||||
metrics: NewMetricsCollector(),
|
||||
}
|
||||
}
|
||||
|
||||
// Subscribe creates a new subscription channel for a namespace
|
||||
func (eb *EventBus) Subscribe(namespaceID string) <-chan *Event {
|
||||
// Metrics returns the metrics collector for this event bus.
|
||||
func (eb *EventBus) Metrics() BroadcasterMetrics {
|
||||
return eb.metrics
|
||||
}
|
||||
|
||||
// Subscribe creates a new subscription channel for a namespace pattern.
|
||||
// Patterns follow NATS subject matching conventions:
|
||||
// - "*" matches a single token (any sequence without ".")
|
||||
// - ">" matches one or more tokens (only valid at the end)
|
||||
// - Exact strings match exactly
|
||||
//
|
||||
// Security Warning: Wildcard patterns receive events from all matching namespaces,
|
||||
// bypassing namespace isolation. Only use for trusted system components.
|
||||
func (eb *EventBus) Subscribe(namespacePattern string) <-chan *Event {
|
||||
return eb.SubscribeWithFilter(namespacePattern, nil)
|
||||
}
|
||||
|
||||
// SubscribeWithFilter creates a filtered subscription channel for a namespace pattern.
|
||||
// Events are filtered by the provided SubscriptionFilter before delivery.
|
||||
// If filter is nil or empty, all events matching the namespace pattern are delivered.
|
||||
//
|
||||
// Filtering is applied in-process at publish time, before an event is queued on the subscriber's channel:
|
||||
// - EventTypes: Only events with matching event types are delivered
|
||||
// - ActorPattern: Only events from matching actors are delivered
|
||||
//
|
||||
// Both namespace pattern wildcards and event filters work together:
|
||||
// - Namespace pattern determines which namespaces to subscribe to
|
||||
// - Filter determines which events within those namespaces to receive
|
||||
func (eb *EventBus) SubscribeWithFilter(namespacePattern string, filter *SubscriptionFilter) <-chan *Event {
|
||||
eb.mutex.Lock()
|
||||
defer eb.mutex.Unlock()
|
||||
|
||||
// Create buffered channel to prevent blocking publishers
|
||||
ch := make(chan *Event, 100)
|
||||
eb.subscribers[namespaceID] = append(eb.subscribers[namespaceID], ch)
|
||||
|
||||
sub := &filteredSubscription{
|
||||
pattern: namespacePattern,
|
||||
ch: ch,
|
||||
filter: filter,
|
||||
}
|
||||
|
||||
if IsWildcardPattern(namespacePattern) {
|
||||
// Store wildcard subscription separately
|
||||
eb.wildcardSubscribers = append(eb.wildcardSubscribers, sub)
|
||||
} else {
|
||||
// Exact match subscription
|
||||
eb.exactSubscribers[namespacePattern] = append(eb.exactSubscribers[namespacePattern], sub)
|
||||
}
|
||||
|
||||
// Record subscription metric
|
||||
eb.metrics.RecordSubscribe(namespacePattern)
|
||||
|
||||
return ch
|
||||
}
|
||||
|
||||
// Unsubscribe removes a subscription channel
|
||||
func (eb *EventBus) Unsubscribe(namespaceID string, ch <-chan *Event) {
|
||||
func (eb *EventBus) Unsubscribe(namespacePattern string, ch <-chan *Event) {
|
||||
eb.mutex.Lock()
|
||||
defer eb.mutex.Unlock()
|
||||
|
||||
subs := eb.subscribers[namespaceID]
|
||||
for i, subscriber := range subs {
|
||||
if subscriber == ch {
|
||||
// Remove channel from slice
|
||||
eb.subscribers[namespaceID] = append(subs[:i], subs[i+1:]...)
|
||||
close(subscriber)
|
||||
break
|
||||
if IsWildcardPattern(namespacePattern) {
|
||||
// Remove from wildcard subscribers
|
||||
for i, sub := range eb.wildcardSubscribers {
|
||||
if sub.ch == ch {
|
||||
eb.wildcardSubscribers = append(eb.wildcardSubscribers[:i], eb.wildcardSubscribers[i+1:]...)
|
||||
close(sub.ch)
|
||||
// Record unsubscription metric
|
||||
eb.metrics.RecordUnsubscribe(namespacePattern)
|
||||
break
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Remove from exact subscribers
|
||||
subs := eb.exactSubscribers[namespacePattern]
|
||||
for i, sub := range subs {
|
||||
if sub.ch == ch {
|
||||
// Remove subscription from slice
|
||||
eb.exactSubscribers[namespacePattern] = append(subs[:i], subs[i+1:]...)
|
||||
close(sub.ch)
|
||||
// Record unsubscription metric
|
||||
eb.metrics.RecordUnsubscribe(namespacePattern)
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Clean up empty namespace entries
|
||||
if len(eb.subscribers[namespaceID]) == 0 {
|
||||
delete(eb.subscribers, namespaceID)
|
||||
// Clean up empty namespace entries
|
||||
if len(eb.exactSubscribers[namespacePattern]) == 0 {
|
||||
delete(eb.exactSubscribers, namespacePattern)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Publish sends an event to all subscribers of a namespace
|
||||
// Publish sends an event to all subscribers of a namespace.
|
||||
// Events are delivered to:
|
||||
// - All exact subscribers for the namespace (after filter matching)
|
||||
// - All wildcard subscribers whose pattern matches the namespace (after filter matching)
|
||||
func (eb *EventBus) Publish(namespaceID string, event *Event) {
|
||||
eb.mutex.RLock()
|
||||
defer eb.mutex.RUnlock()
|
||||
|
||||
subscribers := eb.subscribers[namespaceID]
|
||||
for _, ch := range subscribers {
|
||||
select {
|
||||
case ch <- event:
|
||||
// Event delivered
|
||||
default:
|
||||
// Channel full, skip this subscriber (non-blocking)
|
||||
// Record publish metric
|
||||
eb.metrics.RecordPublish(namespaceID)
|
||||
|
||||
// Deliver to exact subscribers
|
||||
subscribers := eb.exactSubscribers[namespaceID]
|
||||
for _, sub := range subscribers {
|
||||
eb.deliverToSubscriber(sub, event, namespaceID)
|
||||
}
|
||||
|
||||
// Deliver to matching wildcard subscribers
|
||||
for _, sub := range eb.wildcardSubscribers {
|
||||
if MatchNamespacePattern(sub.pattern, namespaceID) {
|
||||
eb.deliverToSubscriber(sub, event, namespaceID)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// deliverToSubscriber delivers an event to a subscriber if it matches the filter
|
||||
func (eb *EventBus) deliverToSubscriber(sub *filteredSubscription, event *Event, namespaceID string) {
|
||||
// Apply filter if present
|
||||
if sub.filter != nil && !sub.filter.IsEmpty() {
|
||||
if !sub.filter.Matches(event) {
|
||||
// Event doesn't match filter, skip delivery
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
select {
|
||||
case sub.ch <- event:
|
||||
// Event delivered
|
||||
eb.metrics.RecordReceive(namespaceID)
|
||||
default:
|
||||
// Channel full, skip this subscriber (non-blocking)
|
||||
eb.metrics.RecordDroppedEvent(namespaceID)
|
||||
}
|
||||
}
|
||||
|
||||
// Stop closes the event bus
|
||||
func (eb *EventBus) Stop() {
|
||||
eb.mutex.Lock()
|
||||
@@ -88,19 +231,37 @@ func (eb *EventBus) Stop() {
|
||||
|
||||
eb.cancel()
|
||||
|
||||
// Close all subscriber channels
|
||||
for _, subs := range eb.subscribers {
|
||||
for _, ch := range subs {
|
||||
close(ch)
|
||||
// Close all exact subscriber channels and update metrics
|
||||
for namespaceID, subs := range eb.exactSubscribers {
|
||||
for _, sub := range subs {
|
||||
close(sub.ch)
|
||||
eb.metrics.RecordUnsubscribe(namespaceID)
|
||||
}
|
||||
}
|
||||
|
||||
eb.subscribers = make(map[string][]chan *Event)
|
||||
// Close all wildcard subscriber channels and update metrics
|
||||
for _, sub := range eb.wildcardSubscribers {
|
||||
close(sub.ch)
|
||||
eb.metrics.RecordUnsubscribe(sub.pattern)
|
||||
}
|
||||
|
||||
eb.exactSubscribers = make(map[string][]*filteredSubscription)
|
||||
eb.wildcardSubscribers = make([]*filteredSubscription, 0)
|
||||
}
|
||||
|
||||
// SubscriberCount returns the number of subscribers for a namespace
|
||||
// SubscriberCount returns the number of subscribers for a namespace.
|
||||
// This counts only exact match subscribers, not wildcard subscribers that may match.
|
||||
func (eb *EventBus) SubscriberCount(namespaceID string) int {
|
||||
eb.mutex.RLock()
|
||||
defer eb.mutex.RUnlock()
|
||||
return len(eb.subscribers[namespaceID])
|
||||
return len(eb.exactSubscribers[namespaceID])
|
||||
}
|
||||
|
||||
// WildcardSubscriberCount returns the number of wildcard subscribers.
|
||||
// These are subscribers using "*" or ">" patterns that may receive events
|
||||
// from multiple namespaces.
|
||||
func (eb *EventBus) WildcardSubscriberCount() int {
|
||||
eb.mutex.RLock()
|
||||
defer eb.mutex.RUnlock()
|
||||
return len(eb.wildcardSubscribers)
|
||||
}
|
||||
|
||||
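`Publish` relies on `IsWildcardPattern` and `MatchNamespacePattern`, which are defined outside this file. A minimal sketch of NATS-style token matching is shown below to make the documented semantics concrete. The names `isWildcard` and `matchPattern` are hypothetical, and the real helpers may behave differently. In particular, this sketch treats a partial-token pattern such as `tenant-*` as a literal, as NATS does, whereas the interface comment suggests the bus may accept it as a wildcard.

```go
package main

import "strings"

// isWildcard reports whether any dot-separated token of pattern is "*" or ">".
func isWildcard(pattern string) bool {
	for _, tok := range strings.Split(pattern, ".") {
		if tok == "*" || tok == ">" {
			return true
		}
	}
	return false
}

// matchPattern applies NATS-style matching: "*" matches exactly one token,
// ">" matches one or more trailing tokens and is only valid as the last token.
func matchPattern(pattern, namespace string) bool {
	pt := strings.Split(pattern, ".")
	nt := strings.Split(namespace, ".")
	for i, p := range pt {
		if p == ">" {
			// Must be the final pattern token and must consume at least one token.
			return i == len(pt)-1 && len(nt) > i
		}
		if i >= len(nt) {
			return false // pattern is longer than the namespace
		}
		if p != "*" && p != nt[i] {
			return false // literal token mismatch
		}
	}
	// Without a trailing ">", token counts must agree.
	return len(pt) == len(nt)
}
```

Under these rules `prod.*` matches `prod.tenant` but not `prod.tenant.orders`, while `prod.>` and `>` match both, which is the behavior the wildcard tests in `eventbus_test.go` expect.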
822
eventbus_test.go
Normal file
@@ -0,0 +1,822 @@
|
||||
package aether
|
||||
|
||||
import (
|
||||
"sync"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func TestEventBus_ExactSubscription(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
ch := eb.Subscribe("tenant-a")
|
||||
|
||||
event := &Event{
|
||||
ID: "evt-1",
|
||||
EventType: "TestEvent",
|
||||
ActorID: "actor-1",
|
||||
}
|
||||
|
||||
eb.Publish("tenant-a", event)
|
||||
|
||||
select {
|
||||
case received := <-ch:
|
||||
if received.ID != event.ID {
|
||||
t.Errorf("expected event ID %s, got %s", event.ID, received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("timed out waiting for event")
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_WildcardStarSubscription(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe to all single-token namespaces
|
||||
ch := eb.Subscribe("*")
|
||||
|
||||
event := &Event{
|
||||
ID: "evt-1",
|
||||
EventType: "TestEvent",
|
||||
ActorID: "actor-1",
|
||||
}
|
||||
|
||||
eb.Publish("tenant-a", event)
|
||||
|
||||
select {
|
||||
case received := <-ch:
|
||||
if received.ID != event.ID {
|
||||
t.Errorf("expected event ID %s, got %s", event.ID, received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("timed out waiting for event")
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_WildcardGreaterSubscription(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe to all namespaces
|
||||
ch := eb.Subscribe(">")
|
||||
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "Test1", ActorID: "actor-1"},
|
||||
{ID: "evt-2", EventType: "Test2", ActorID: "actor-2"},
|
||||
{ID: "evt-3", EventType: "Test3", ActorID: "actor-3"},
|
||||
}
|
||||
|
||||
namespaces := []string{"tenant-a", "tenant-b", "prod.tenant.orders"}
|
||||
|
||||
for i, ns := range namespaces {
|
||||
eb.Publish(ns, events[i])
|
||||
}
|
||||
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
|
||||
for i := 0; i < len(events); i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after receiving %d of %d events", i, len(events))
|
||||
}
|
||||
}
|
||||
|
||||
for _, evt := range events {
|
||||
if !received[evt.ID] {
|
||||
t.Errorf("did not receive event %s", evt.ID)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_PrefixWildcard(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe to prod.*
|
||||
ch := eb.Subscribe("prod.*")
|
||||
|
||||
event1 := &Event{ID: "evt-1", EventType: "Test", ActorID: "actor-1"}
|
||||
event2 := &Event{ID: "evt-2", EventType: "Test", ActorID: "actor-2"}
|
||||
event3 := &Event{ID: "evt-3", EventType: "Test", ActorID: "actor-3"}
|
||||
|
||||
// Should match
|
||||
eb.Publish("prod.tenant", event1)
|
||||
eb.Publish("prod.orders", event2)
|
||||
// Should not match (different prefix)
|
||||
eb.Publish("staging.tenant", event3)
|
||||
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
|
||||
// Should receive exactly 2 events
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after receiving %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
// Verify we got the right ones
|
||||
if !received["evt-1"] || !received["evt-2"] {
|
||||
t.Errorf("expected evt-1 and evt-2, got %v", received)
|
||||
}
|
||||
|
||||
// Verify no third event arrives
|
||||
select {
|
||||
case evt := <-ch:
|
||||
t.Errorf("unexpected event received: %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected - no more events
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_MultipleWildcardSubscribers(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
ch1 := eb.Subscribe("prod.*")
|
||||
ch2 := eb.Subscribe("prod.>")
|
||||
ch3 := eb.Subscribe(">")
|
||||
|
||||
event := &Event{ID: "evt-1", EventType: "Test", ActorID: "actor-1"}
|
||||
|
||||
eb.Publish("prod.tenant.orders", event)
|
||||
|
||||
// ch1 (prod.*) should NOT receive - doesn't match 3 tokens
|
||||
select {
|
||||
case <-ch1:
|
||||
t.Error("prod.* should not match prod.tenant.orders")
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
|
||||
// ch2 (prod.>) should receive
|
||||
select {
|
||||
case received := <-ch2:
|
||||
if received.ID != event.ID {
|
||||
t.Errorf("expected %s, got %s", event.ID, received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("prod.> should match prod.tenant.orders")
|
||||
}
|
||||
|
||||
// ch3 (>) should receive
|
||||
select {
|
||||
case received := <-ch3:
|
||||
if received.ID != event.ID {
|
||||
t.Errorf("expected %s, got %s", event.ID, received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("> should match prod.tenant.orders")
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_ExactAndWildcardCoexist(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
chExact := eb.Subscribe("tenant-a")
|
||||
chWildcard := eb.Subscribe("*")
|
||||
|
||||
event := &Event{ID: "evt-1", EventType: "Test", ActorID: "actor-1"}
|
||||
|
||||
eb.Publish("tenant-a", event)
|
||||
|
||||
// Both should receive the event
|
||||
var wg sync.WaitGroup
|
||||
wg.Add(2)
|
||||
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
select {
|
||||
case received := <-chExact:
|
||||
if received.ID != event.ID {
|
||||
t.Errorf("exact: expected %s, got %s", event.ID, received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("exact subscriber timed out")
|
||||
}
|
||||
}()
|
||||
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
select {
|
||||
case received := <-chWildcard:
|
||||
if received.ID != event.ID {
|
||||
t.Errorf("wildcard: expected %s, got %s", event.ID, received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("wildcard subscriber timed out")
|
||||
}
|
||||
}()
|
||||
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
func TestEventBus_WildcardUnsubscribe(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
ch := eb.Subscribe("prod.*")
|
||||
|
||||
// Verify it's counted
|
||||
if eb.WildcardSubscriberCount() != 1 {
|
||||
t.Errorf("expected 1 wildcard subscriber, got %d", eb.WildcardSubscriberCount())
|
||||
}
|
||||
|
||||
eb.Unsubscribe("prod.*", ch)
|
||||
|
||||
// Verify it's removed
|
||||
if eb.WildcardSubscriberCount() != 0 {
|
||||
t.Errorf("expected 0 wildcard subscribers, got %d", eb.WildcardSubscriberCount())
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscriberCount(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Add exact subscribers
|
||||
ch1 := eb.Subscribe("tenant-a")
|
||||
ch2 := eb.Subscribe("tenant-a")
|
||||
|
||||
if eb.SubscriberCount("tenant-a") != 2 {
|
||||
t.Errorf("expected 2 exact subscribers, got %d", eb.SubscriberCount("tenant-a"))
|
||||
}
|
||||
|
||||
// Add wildcard subscriber - should not affect exact count
|
||||
eb.Subscribe("*")
|
||||
|
||||
if eb.SubscriberCount("tenant-a") != 2 {
|
||||
t.Errorf("expected 2 exact subscribers after wildcard add, got %d", eb.SubscriberCount("tenant-a"))
|
||||
}
|
||||
if eb.WildcardSubscriberCount() != 1 {
|
||||
t.Errorf("expected 1 wildcard subscriber, got %d", eb.WildcardSubscriberCount())
|
||||
}
|
||||
|
||||
// Unsubscribe exact
|
||||
eb.Unsubscribe("tenant-a", ch1)
|
||||
if eb.SubscriberCount("tenant-a") != 1 {
|
||||
t.Errorf("expected 1 exact subscriber after unsubscribe, got %d", eb.SubscriberCount("tenant-a"))
|
||||
}
|
||||
|
||||
eb.Unsubscribe("tenant-a", ch2)
|
||||
if eb.SubscriberCount("tenant-a") != 0 {
|
||||
t.Errorf("expected 0 exact subscribers after unsubscribe, got %d", eb.SubscriberCount("tenant-a"))
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_StopClosesAllChannels(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
|
||||
chExact := eb.Subscribe("tenant-a")
|
||||
chWildcard := eb.Subscribe("*")
|
||||
|
||||
eb.Stop()
|
||||
|
||||
// Both channels should be closed
|
||||
select {
|
||||
case _, ok := <-chExact:
|
||||
if ok {
|
||||
t.Error("expected exact channel to be closed")
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("timed out waiting for exact channel close")
|
||||
}
|
||||
|
||||
select {
|
||||
case _, ok := <-chWildcard:
|
||||
if ok {
|
||||
t.Error("expected wildcard channel to be closed")
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("timed out waiting for wildcard channel close")
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_NamespaceIsolation(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
chA := eb.Subscribe("tenant-a")
|
||||
chB := eb.Subscribe("tenant-b")
|
||||
|
||||
eventA := &Event{ID: "evt-a", EventType: "Test", ActorID: "actor-1"}
|
||||
eventB := &Event{ID: "evt-b", EventType: "Test", ActorID: "actor-2"}
|
||||
|
||||
eb.Publish("tenant-a", eventA)
|
||||
eb.Publish("tenant-b", eventB)
|
||||
|
||||
// Verify tenant-a receives only its event
|
||||
select {
|
||||
case received := <-chA:
|
||||
if received.ID != "evt-a" {
|
||||
t.Errorf("tenant-a received wrong event: %s", received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("tenant-a timed out")
|
||||
}
|
||||
|
||||
select {
|
||||
case <-chA:
|
||||
t.Error("tenant-a received extra event")
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
|
||||
// Verify tenant-b receives only its event
|
||||
select {
|
||||
case received := <-chB:
|
||||
if received.ID != "evt-b" {
|
||||
t.Errorf("tenant-b received wrong event: %s", received.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Error("tenant-b timed out")
|
||||
}
|
||||
|
||||
select {
|
||||
case <-chB:
|
||||
t.Error("tenant-b received extra event")
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_NonBlockingPublish(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Create subscriber but don't read from channel
|
||||
_ = eb.Subscribe("tenant-a")
|
||||
|
||||
// Publish 150 events to overflow the 100-event channel buffer
|
||||
for i := 0; i < 150; i++ {
|
||||
event := &Event{
|
||||
ID: "evt",
|
||||
EventType: "Test",
|
||||
ActorID: "actor-1",
|
||||
}
|
||||
// Should not block even when channel is full
|
||||
eb.Publish("tenant-a", event)
|
||||
}
|
||||
|
||||
// If we got here without blocking, test passes
|
||||
}
|
||||
|
||||
func TestEventBus_ConcurrentOperations(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
var wg sync.WaitGroup
|
||||
|
||||
// Concurrent subscriptions
|
||||
for i := 0; i < 10; i++ {
|
||||
wg.Add(1)
|
||||
go func(n int) {
|
||||
defer wg.Done()
|
||||
ch := eb.Subscribe("tenant-a")
|
||||
time.Sleep(10 * time.Millisecond)
|
||||
eb.Unsubscribe("tenant-a", ch)
|
||||
}(i)
|
||||
}
|
||||
|
||||
// Concurrent wildcard subscriptions
|
||||
for i := 0; i < 10; i++ {
|
||||
wg.Add(1)
|
||||
go func(n int) {
|
||||
defer wg.Done()
|
||||
ch := eb.Subscribe("*")
|
||||
time.Sleep(10 * time.Millisecond)
|
||||
eb.Unsubscribe("*", ch)
|
||||
}(i)
|
||||
}
|
||||
|
||||
// Concurrent publishes
|
||||
for i := 0; i < 10; i++ {
|
||||
wg.Add(1)
|
||||
go func(n int) {
|
||||
defer wg.Done()
|
||||
event := &Event{
|
||||
ID: "evt",
|
||||
EventType: "Test",
|
||||
ActorID: "actor-1",
|
||||
}
|
||||
eb.Publish("tenant-a", event)
|
||||
}(i)
|
||||
}
|
||||
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
// Tests for SubscribeWithFilter functionality
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_EventTypes(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe with filter for specific event types
|
||||
filter := &SubscriptionFilter{
|
||||
EventTypes: []string{"OrderPlaced", "OrderShipped"},
|
||||
}
|
||||
ch := eb.SubscribeWithFilter("orders", filter)
|
||||
|
||||
// Publish events of different types
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"},
|
||||
{ID: "evt-2", EventType: "OrderCancelled", ActorID: "order-2"}, // Should not be received
|
||||
{ID: "evt-3", EventType: "OrderShipped", ActorID: "order-3"},
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("orders", e)
|
||||
}
|
||||
|
||||
// Should receive evt-1 and evt-3, but not evt-2
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after receiving %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
if !received["evt-1"] || !received["evt-3"] {
|
||||
t.Errorf("expected to receive evt-1 and evt-3, got %v", received)
|
||||
}
|
||||
|
||||
// Verify evt-2 was not received
|
||||
select {
|
||||
case evt := <-ch:
|
||||
t.Errorf("unexpected event received: %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_ActorPattern(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe with filter for specific actor pattern
|
||||
filter := &SubscriptionFilter{
|
||||
ActorPattern: "order-*",
|
||||
}
|
||||
ch := eb.SubscribeWithFilter("events", filter)
|
||||
|
||||
// Publish events from different actors
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "Test", ActorID: "order-123"},
|
||||
{ID: "evt-2", EventType: "Test", ActorID: "user-456"}, // Should not be received
|
||||
{ID: "evt-3", EventType: "Test", ActorID: "order-789"},
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("events", e)
|
||||
}
|
||||
|
||||
// Should receive evt-1 and evt-3, but not evt-2
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after receiving %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
if !received["evt-1"] || !received["evt-3"] {
|
||||
t.Errorf("expected to receive evt-1 and evt-3, got %v", received)
|
||||
}
|
||||
|
||||
// Verify evt-2 was not received
|
||||
select {
|
||||
case evt := <-ch:
|
||||
t.Errorf("unexpected event received: %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_Combined(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe with filter for both event type AND actor pattern
|
||||
filter := &SubscriptionFilter{
|
||||
EventTypes: []string{"OrderPlaced"},
|
||||
ActorPattern: "order-*",
|
||||
}
|
||||
ch := eb.SubscribeWithFilter("orders", filter)
|
||||
|
||||
// Publish events with various combinations
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-123"}, // Should be received
|
||||
{ID: "evt-2", EventType: "OrderPlaced", ActorID: "user-456"}, // Wrong actor
|
||||
{ID: "evt-3", EventType: "OrderCancelled", ActorID: "order-789"}, // Wrong type
|
||||
{ID: "evt-4", EventType: "OrderCancelled", ActorID: "user-000"}, // Wrong both
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("orders", e)
|
||||
}
|
||||
|
||||
// Should only receive evt-1
|
||||
select {
|
||||
case evt := <-ch:
|
||||
if evt.ID != "evt-1" {
|
||||
t.Errorf("expected evt-1, got %s", evt.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("timed out waiting for event")
|
||||
}
|
||||
|
||||
// Verify no more events arrive
|
||||
select {
|
||||
case evt := <-ch:
|
||||
t.Errorf("unexpected event received: %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_NilFilter(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe with nil filter - should receive all events
|
||||
ch := eb.SubscribeWithFilter("events", nil)
|
||||
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "TypeA", ActorID: "actor-1"},
|
||||
{ID: "evt-2", EventType: "TypeB", ActorID: "actor-2"},
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("events", e)
|
||||
}
|
||||
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after receiving %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
if !received["evt-1"] || !received["evt-2"] {
|
||||
t.Errorf("expected all events, got %v", received)
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_EmptyFilter(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe with empty filter - should receive all events
|
||||
ch := eb.SubscribeWithFilter("events", &SubscriptionFilter{})
|
||||
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "TypeA", ActorID: "actor-1"},
|
||||
{ID: "evt-2", EventType: "TypeB", ActorID: "actor-2"},
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("events", e)
|
||||
}
|
||||
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after receiving %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
if !received["evt-1"] || !received["evt-2"] {
|
||||
t.Errorf("expected all events, got %v", received)
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_WildcardNamespaceAndFilter(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Subscribe to wildcard namespace pattern with event type filter
|
||||
filter := &SubscriptionFilter{
|
||||
EventTypes: []string{"OrderPlaced"},
|
||||
}
|
||||
ch := eb.SubscribeWithFilter("prod.*", filter)
|
||||
|
||||
// Publish events to different namespaces
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"}, // prod.orders - should match
|
||||
{ID: "evt-2", EventType: "OrderShipped", ActorID: "order-2"}, // prod.orders - wrong type
|
||||
{ID: "evt-3", EventType: "OrderPlaced", ActorID: "order-3"}, // staging.orders - wrong namespace
|
||||
}
|
||||
|
||||
eb.Publish("prod.orders", events[0])
|
||||
eb.Publish("prod.orders", events[1])
|
||||
eb.Publish("staging.orders", events[2])
|
||||
|
||||
// Should only receive evt-1
|
||||
select {
|
||||
case evt := <-ch:
|
||||
if evt.ID != "evt-1" {
|
||||
t.Errorf("expected evt-1, got %s", evt.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("timed out waiting for event")
|
||||
}
|
||||
|
||||
// Verify no more events arrive
|
||||
select {
|
||||
case evt := <-ch:
|
||||
t.Errorf("unexpected event received: %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_MultipleSubscribersWithDifferentFilters(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Two subscribers with different filters on same namespace
|
||||
filter1 := &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}
|
||||
filter2 := &SubscriptionFilter{EventTypes: []string{"OrderShipped"}}
|
||||
|
||||
ch1 := eb.SubscribeWithFilter("orders", filter1)
|
||||
ch2 := eb.SubscribeWithFilter("orders", filter2)
|
||||
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"},
|
||||
{ID: "evt-2", EventType: "OrderShipped", ActorID: "order-2"},
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("orders", e)
|
||||
}
|
||||
|
||||
// ch1 should only receive evt-1
|
||||
select {
|
||||
case evt := <-ch1:
|
||||
if evt.ID != "evt-1" {
|
||||
t.Errorf("ch1: expected evt-1, got %s", evt.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("ch1 timed out")
|
||||
}
|
||||
|
||||
// ch2 should only receive evt-2
|
||||
select {
|
||||
case evt := <-ch2:
|
||||
if evt.ID != "evt-2" {
|
||||
t.Errorf("ch2: expected evt-2, got %s", evt.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("ch2 timed out")
|
||||
}
|
||||
|
||||
// Verify no extra events
|
||||
select {
|
||||
case evt := <-ch1:
|
||||
t.Errorf("ch1: unexpected event %s", evt.ID)
|
||||
case evt := <-ch2:
|
||||
t.Errorf("ch2: unexpected event %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_UnsubscribeFiltered(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
filter := &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}
|
||||
ch := eb.SubscribeWithFilter("orders", filter)
|
||||
|
||||
// Verify subscription count
|
||||
if eb.SubscriberCount("orders") != 1 {
|
||||
t.Errorf("expected 1 subscriber, got %d", eb.SubscriberCount("orders"))
|
||||
}
|
||||
|
||||
eb.Unsubscribe("orders", ch)
|
||||
|
||||
// Verify unsubscribed
|
||||
if eb.SubscriberCount("orders") != 0 {
|
||||
t.Errorf("expected 0 subscribers, got %d", eb.SubscriberCount("orders"))
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_FilteredAndUnfilteredCoexist(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// One subscriber with filter, one without
|
||||
filter := &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}
|
||||
chFiltered := eb.SubscribeWithFilter("orders", filter)
|
||||
chUnfiltered := eb.Subscribe("orders")
|
||||
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"},
|
||||
{ID: "evt-2", EventType: "OrderShipped", ActorID: "order-2"},
|
||||
}
|
||||
|
||||
for _, e := range events {
|
||||
eb.Publish("orders", e)
|
||||
}
|
||||
|
||||
// Filtered subscriber should only receive evt-1
|
||||
select {
|
||||
case evt := <-chFiltered:
|
||||
if evt.ID != "evt-1" {
|
||||
t.Errorf("filtered: expected evt-1, got %s", evt.ID)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("filtered subscriber timed out")
|
||||
}
|
||||
|
||||
// Unfiltered subscriber should receive both
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-chUnfiltered:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("unfiltered timed out after %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
if !received["evt-1"] || !received["evt-2"] {
|
||||
t.Errorf("unfiltered expected both events, got %v", received)
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventBus_SubscribeWithFilter_WildcardGreaterWithFilter(t *testing.T) {
|
||||
eb := NewEventBus()
|
||||
defer eb.Stop()
|
||||
|
||||
// Use > wildcard (matches one or more tokens) with filter
|
||||
filter := &SubscriptionFilter{
|
||||
ActorPattern: "order-*",
|
||||
}
|
||||
ch := eb.SubscribeWithFilter(">", filter)
|
||||
|
||||
events := []*Event{
|
||||
{ID: "evt-1", EventType: "Test", ActorID: "order-123"},
|
||||
{ID: "evt-2", EventType: "Test", ActorID: "user-456"},
|
||||
{ID: "evt-3", EventType: "Test", ActorID: "order-789"},
|
||||
}
|
||||
|
||||
// Publish to different namespaces
|
||||
eb.Publish("tenant-a", events[0])
|
||||
eb.Publish("tenant-b", events[1])
|
||||
eb.Publish("prod.orders", events[2])
|
||||
|
||||
// Should receive evt-1 and evt-3, but not evt-2
|
||||
received := make(map[string]bool)
|
||||
timeout := time.After(100 * time.Millisecond)
|
||||
for i := 0; i < 2; i++ {
|
||||
select {
|
||||
case evt := <-ch:
|
||||
received[evt.ID] = true
|
||||
case <-timeout:
|
||||
t.Fatalf("timed out after %d events", len(received))
|
||||
}
|
||||
}
|
||||
|
||||
if !received["evt-1"] || !received["evt-3"] {
|
||||
t.Errorf("expected evt-1 and evt-3, got %v", received)
|
||||
}
|
||||
|
||||
// Verify no evt-2
|
||||
select {
|
||||
case evt := <-ch:
|
||||
t.Errorf("unexpected event: %s", evt.ID)
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected
|
||||
}
|
||||
}
|
||||
189
examples/README.md
Normal file
@@ -0,0 +1,189 @@
|
||||
# Aether Examples
|
||||
|
||||
This directory contains examples demonstrating common patterns for using Aether.
|
||||
|
||||
## Retry Patterns (`retry_patterns.go`)
|
||||
|
||||
When saving events with optimistic concurrency control, your application may encounter `VersionConflictError` when multiple writers attempt to update the same actor concurrently. This file demonstrates several retry strategies.
|
||||
|
||||
### Pattern Overview
|
||||
|
||||
All retry patterns work with `VersionConflictError`, which provides three fields:

- **ActorID**: The actor that experienced the conflict
- **CurrentVersion**: The latest version in the store
- **AttemptedVersion**: The version you tried to save
|
||||
|
||||
Your application can read these fields to make intelligent retry decisions.
|
||||
|
||||
### Available Patterns
|
||||
|
||||
#### SimpleRetryPattern
|
||||
|
||||
The most basic pattern - just retry with exponential backoff:
|
||||
|
||||
```go
|
||||
// Automatically retries up to 3 times with exponential backoff
|
||||
err := SimpleRetryPattern(store, "order-123", "OrderUpdated")
|
||||
```
|
||||
|
||||
**Use when**: You want a straightforward retry mechanism without complex logic.
|
||||
|
||||
#### ConflictDetailedRetryPattern
|
||||
|
||||
Extracts detailed information from the conflict error to make smarter decisions:
|
||||
|
||||
```go
|
||||
// Detects thrashing (multiple conflicts at same version)
|
||||
// and can implement circuit-breaker logic
|
||||
err := ConflictDetailedRetryPattern(store, "order-123", "OrderUpdated")
|
||||
```
|
||||
|
||||
**Use when**: You need visibility into conflict patterns and want to detect system issues like thrashing.
|
||||
|
||||
#### JitterRetryPattern
|
||||
|
||||
Adds randomized jitter to prevent "thundering herd" when multiple writers retry:
|
||||
|
||||
```go
|
||||
// Exponential backoff with jitter prevents synchronized retries
|
||||
err := JitterRetryPattern(store, "order-123", "OrderUpdated")
|
||||
```
|
||||
|
||||
**Use when**: You have high concurrency and want to prevent all writers from retrying at the same time.
|
||||
|
||||
#### AdaptiveRetryPattern
|
||||
|
||||
Adjusts backoff duration based on version distance (indicator of contention):
|
||||
|
||||
```go
|
||||
// Light contention (gap=1): 50ms backoff
|
||||
// Moderate contention (gap=3-10): proportional backoff
|
||||
// High contention (gap>10): aggressive backoff
|
||||
err := AdaptiveRetryPattern(store, "order-123", "OrderUpdated")
|
||||
```
|
||||
|
||||
**Use when**: You want backoff strategy to respond to actual system load.
|
||||
|
||||
#### EventualConsistencyPattern
|
||||
|
||||
Instead of blocking on retry, queues the event for asynchronous retry:
|
||||
|
||||
```go
|
||||
// Returns immediately, event is queued for background retry
|
||||
EventualConsistencyPattern(store, retryQueue, event)
|
||||
|
||||
// Background worker processes the queue
|
||||
for item := range retryQueue {
|
||||
// Implement your own retry logic here
|
||||
}
|
||||
```
|
||||
|
||||
**Use when**: You can't afford to block the request, and background retry is acceptable.
|
||||
|
||||
#### CircuitBreakerPattern
|
||||
|
||||
Implements a circuit breaker to prevent cascading failures:
|
||||
|
||||
```go
|
||||
cb := NewCircuitBreaker()
|
||||
|
||||
// Fails fast when circuit is open
|
||||
err := CircuitBreakerRetryPattern(store, cb, "order-123", "OrderUpdated")
|
||||
if err != nil && !cb.CanRetry() {
|
||||
return ErrCircuitBreakerOpen
|
||||
}
|
||||
```
|
||||
|
||||
**Use when**: You have a distributed system and want to prevent retry storms during outages.
|
||||
|
||||
## Common Pattern: Extract and Log Context
|
||||
|
||||
All patterns can read context from `VersionConflictError`:
|
||||
|
||||
```go
var versionErr *aether.VersionConflictError
if errors.As(err, &versionErr) {
	log.Printf(
		"Conflict for actor %q: attempted %d, current %d",
		versionErr.ActorID,
		versionErr.AttemptedVersion,
		versionErr.CurrentVersion,
	)
}
```
|
||||
|
||||
## Sentinel Error Check
|
||||
|
||||
Check if an error is a version conflict without examining the struct:
|
||||
|
||||
```go
|
||||
if errors.Is(err, aether.ErrVersionConflict) {
|
||||
// This is a version conflict - retry is appropriate
|
||||
}
|
||||
```
|
||||
|
||||
## Implementing Your Own Pattern
|
||||
|
||||
Basic template:
|
||||
|
||||
```go
for attempt := 0; attempt < maxRetries; attempt++ {
	// 1. Get current version
	currentVersion, err := store.GetLatestVersion(actorID)
	if err != nil {
		return err
	}

	// 2. Create event with next version
	event := &aether.Event{
		ActorID: actorID,
		Version: currentVersion + 1,
		// ... other fields
	}

	// 3. Attempt save
	err = store.SaveEvent(event)
	if err == nil {
		return nil // Success
	}

	// 4. Check if it's a conflict
	if !errors.Is(err, aether.ErrVersionConflict) {
		return err // Some other error
	}

	// 5. Implement your retry strategy
	time.Sleep(yourBackoff(attempt))
}
```
|
||||
|
||||
## Choosing a Pattern
|
||||
|
||||
| Pattern | Latency | Throughput | Complexity | Use Case |
|---------|---------|------------|------------|----------|
| Simple | Low | Low | Very Low | Single writer, testing |
| ConflictDetailed | Low | Medium | Medium | Debugging, monitoring |
| Jitter | Low-Medium | High | Low | Multi-writer concurrency |
| Adaptive | Low-Medium | High | Medium | Variable load scenarios |
| EventualConsistency | Very Low | Very High | High | High-volume, async-OK workloads |
| CircuitBreaker | Variable | Stable | High | Distributed, failure-resilient systems |
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
1. **Backoff timing**: Shorter backoffs waste CPU on retries; longer backoffs increase latency
2. **Retry limits**: Too few retries give up too early; too many waste resources
3. **Jitter**: Essential for preventing synchronized retries in high-concurrency scenarios
4. **Monitoring**: Track retry rates and conflict patterns to detect system issues
|
||||
|
||||
## Testing
|
||||
|
||||
Use `store.NewInMemoryEventStore()` in tests:
|
||||
|
||||
```go
|
||||
store := store.NewInMemoryEventStore()
|
||||
err := SimpleRetryPattern(store, "test-actor", "TestEvent")
|
||||
if err != nil {
|
||||
t.Fatalf("retry pattern failed: %v", err)
|
||||
}
|
||||
```
|
||||
168
examples/cross_node_broadcasting.go
Normal file
@@ -0,0 +1,168 @@
|
||||
// Package main demonstrates cross-node event broadcasting using NATSEventBus
|
||||
// and JetStreamEventStore for cluster synchronization.
|
||||
//
|
||||
// This example shows:
|
||||
// 1. Setting up NATSEventBus with JetStreamEventStore
|
||||
// 2. Broadcasting events across NATS for cross-node distribution
|
||||
// 3. Subscribing to EventStored events for version cache synchronization
|
||||
// 4. Properly handling EventStored events from other cluster nodes
|
||||
//
|
||||
// Prerequisites:
|
||||
// - NATS server running with JetStream enabled (nats-server -js)
|
||||
// - Events stream created in JetStream
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"log"
|
||||
"os"
|
||||
"os/signal"
|
||||
"syscall"
|
||||
"time"
|
||||
|
||||
"git.flowmade.one/flowmade-one/aether"
|
||||
"git.flowmade.one/flowmade-one/aether/store"
|
||||
"github.com/google/uuid"
|
||||
"github.com/nats-io/nats.go"
|
||||
)
|
||||
|
||||
func main() {
|
||||
natsURL := getEnv("NATS_URL", "nats://localhost:4222")
|
||||
|
||||
nc, err := nats.Connect(natsURL)
|
||||
if err != nil {
|
||||
log.Fatalf("Failed to connect to NATS: %v", err)
|
||||
}
|
||||
defer nc.Close()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store1, err := store.NewJetStreamEventStore(nc, "events")
|
||||
if err != nil {
|
||||
log.Fatalf("Failed to create event store: %v", err)
|
||||
}
|
||||
|
||||
eventBus1 := aether.NewNATSEventBusWithBroadcaster(nc, store1, "")
|
||||
defer eventBus1.Stop()
|
||||
|
||||
store2, err := store.NewJetStreamEventStore(nc, "events")
|
||||
if err != nil {
|
||||
log.Fatalf("Failed to create event store: %v", err)
|
||||
}
|
||||
|
||||
eventBus2 := aether.NewNATSEventBusWithBroadcaster(nc, store2, "")
|
||||
defer eventBus2.Stop()
|
||||
|
||||
eventStoredCh1 := eventBus1.SubscribeToEventStored("*")
|
||||
eventStoredCh2 := eventBus2.SubscribeToEventStored("*")
|
||||
|
||||
done := make(chan struct{})
|
||||
|
||||
go processEvents(ctx, eventStoredCh1, store1, done)
|
||||
go processEvents(ctx, eventStoredCh2, store2, done)
|
||||
|
||||
go func() {
|
||||
time.Sleep(2 * time.Second)
|
||||
|
||||
actorID := "demo-actor"
|
||||
|
||||
event1 := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{
|
||||
"total": 99.99,
|
||||
"status": "pending",
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
log.Printf("Node 1 publishing event: %s", event1.EventType)
|
||||
eventBus1.Publish("", event1)
|
||||
|
||||
time.Sleep(500 * time.Millisecond)
|
||||
|
||||
event2 := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "OrderPaid",
|
||||
ActorID: actorID,
|
||||
Version: 2,
|
||||
Data: map[string]interface{}{
|
||||
"total": 99.99,
|
||||
"status": "paid",
|
||||
"method": "credit_card",
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
log.Printf("Node 2 publishing event: %s", event2.EventType)
|
||||
eventBus2.Publish("", event2)
|
||||
|
||||
time.Sleep(2 * time.Second)
|
||||
|
||||
close(done)
|
||||
|
||||
log.Println("Cross-node broadcasting demo complete")
|
||||
}()
|
||||
|
||||
sigCh := make(chan os.Signal, 1)
|
||||
signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
|
||||
|
||||
select {
|
||||
case <-sigCh:
|
||||
log.Println("Shutting down...")
|
||||
case <-done:
|
||||
}
|
||||
}
|
||||
|
||||
func processEvents(ctx context.Context, eventStoredCh <-chan *aether.Event, eventStore *store.JetStreamEventStore, done chan struct{}) {
|
||||
for {
|
||||
select {
|
||||
case <-done:
|
||||
return
|
||||
case <-ctx.Done():
|
||||
return
|
||||
case event, ok := <-eventStoredCh:
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
|
||||
if event == nil {
|
||||
continue
|
||||
}
|
||||
|
||||
if event.EventType != aether.EventTypeEventStored {
|
||||
continue
|
||||
}
|
||||
|
||||
actorID, ok := event.Data["actorId"].(string)
|
||||
if !ok {
|
||||
log.Printf("Warning: EventStored missing actorId")
|
||||
continue
|
||||
}
|
||||
|
||||
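// Assumption: events that crossed NATS were JSON-decoded, in which case
// numbers arrive as float64. Accept both representations.
var version int64
switch v := event.Data["version"].(type) {
case int64:
	version = v
case float64:
	version = int64(v)
default:
	ok = false
}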
|
||||
if !ok {
|
||||
log.Printf("Warning: EventStored missing version")
|
||||
continue
|
||||
}
|
||||
|
||||
eventID, _ := event.Data["eventId"].(string)
|
||||
|
||||
log.Printf("Received EventStored: actor=%s, version=%d, eventId=%s", actorID, version, eventID)
|
||||
|
||||
eventStore.UpdateVersionCache(actorID, version)
|
||||
|
||||
currentVersion, _ := eventStore.GetLatestVersion(actorID)
|
||||
log.Printf("Updated cache: %s now has version %d (cached: %d)", actorID, version, currentVersion)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func getEnv(key, defaultValue string) string {
|
||||
if value := os.Getenv(key); value != "" {
|
||||
return value
|
||||
}
|
||||
return defaultValue
|
||||
}
|
||||
16
go.mod
@@ -1,16 +1,26 @@
|
||||
module git.flowmade.one/flowmade-one/aether
|
||||
|
||||
go 1.23
|
||||
go 1.23.0
|
||||
|
||||
require (
|
||||
github.com/google/uuid v1.6.0
|
||||
github.com/nats-io/nats.go v1.37.0
|
||||
github.com/prometheus/client_golang v1.23.2
|
||||
)
|
||||
|
||||
require (
|
||||
github.com/klauspost/compress v1.17.2 // indirect
|
||||
github.com/beorn7/perks v1.0.1 // indirect
|
||||
github.com/cespare/xxhash/v2 v2.3.0 // indirect
|
||||
github.com/klauspost/compress v1.18.0 // indirect
|
||||
github.com/kr/text v0.2.0 // indirect
|
||||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
|
||||
github.com/nats-io/nkeys v0.4.7 // indirect
|
||||
github.com/nats-io/nuid v1.0.1 // indirect
|
||||
github.com/prometheus/client_model v0.6.2 // indirect
|
||||
github.com/prometheus/common v0.66.1 // indirect
|
||||
github.com/prometheus/procfs v0.16.1 // indirect
|
||||
go.yaml.in/yaml/v2 v2.4.2 // indirect
|
||||
golang.org/x/crypto v0.18.0 // indirect
|
||||
golang.org/x/sys v0.16.0 // indirect
|
||||
golang.org/x/sys v0.35.0 // indirect
|
||||
google.golang.org/protobuf v1.36.8 // indirect
|
||||
)
|
||||
|
||||
48
go.sum
@@ -1,14 +1,54 @@
|
||||
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
|
||||
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
|
||||
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
|
||||
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
|
||||
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
|
||||
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
|
||||
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
|
||||
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
|
||||
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
|
||||
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
|
||||
github.com/klauspost/compress v1.17.2 h1:RlWWUY/Dr4fL8qk9YG7DTZ7PDgME2V4csBXA8L/ixi4=
|
||||
github.com/klauspost/compress v1.17.2/go.mod h1:ntbaceVETuRiXiv4DpjP66DpAtAGkEQskQzEyD//IeE=
|
||||
github.com/klauspost/compress v1.18.0 h1:c/Cqfb0r+Yi+JtIEq73FWXVkRonBlf0CRNYc8Zttxdo=
|
||||
github.com/klauspost/compress v1.18.0/go.mod h1:2Pp+KzxcywXVXMr50+X0Q/Lsb43OQHYWRCY2AiWywWQ=
|
||||
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
|
||||
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
|
||||
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
|
||||
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
|
||||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
|
||||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
|
||||
github.com/nats-io/nats.go v1.37.0 h1:07rauXbVnnJvv1gfIyghFEo6lUcYRY0WXc3x7x0vUxE=
|
||||
github.com/nats-io/nats.go v1.37.0/go.mod h1:Ubdu4Nh9exXdSz0RVWRFBbRfrbSxOYd26oF0wkWclB8=
|
||||
github.com/nats-io/nkeys v0.4.7 h1:RwNJbbIdYCoClSDNY7QVKZlyb/wfT6ugvFCiKy6vDvI=
|
||||
github.com/nats-io/nkeys v0.4.7/go.mod h1:kqXRgRDPlGy7nGaEDMuYzmiJCIAAWDK0IMBtDmGD0nc=
|
||||
github.com/nats-io/nuid v1.0.1 h1:5iA8DT8V7q8WK2EScv2padNa/rTESc1KdnPw4TC2paw=
|
||||
github.com/nats-io/nuid v1.0.1/go.mod h1:19wcPz3Ph3q0Jbyiqsd0kePYG7A95tJPxeL+1OSON2c=
|
||||
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
|
||||
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
||||
github.com/prometheus/client_golang v1.23.2 h1:Je96obch5RDVy3FDMndoUsjAhG5Edi49h0RJWRi/o0o=
|
||||
github.com/prometheus/client_golang v1.23.2/go.mod h1:Tb1a6LWHB3/SPIzCoaDXI4I8UHKeFTEQ1YCr+0Gyqmg=
|
||||
github.com/prometheus/client_model v0.6.2 h1:oBsgwpGs7iVziMvrGhE53c/GrLUsZdHnqNwqPLxwZyk=
|
||||
github.com/prometheus/client_model v0.6.2/go.mod h1:y3m2F6Gdpfy6Ut/GBsUqTWZqCUvMVzSfMLjcu6wAwpE=
|
||||
github.com/prometheus/common v0.66.1 h1:h5E0h5/Y8niHc5DlaLlWLArTQI7tMrsfQjHV+d9ZoGs=
|
||||
github.com/prometheus/common v0.66.1/go.mod h1:gcaUsgf3KfRSwHY4dIMXLPV0K/Wg1oZ8+SbZk/HH/dA=
|
||||
github.com/prometheus/procfs v0.16.1 h1:hZ15bTNuirocR6u0JZ6BAHHmwS1p8B4P6MRqxtzMyRg=
|
||||
github.com/prometheus/procfs v0.16.1/go.mod h1:teAbpZRB1iIAJYREa1LsoWUXykVXA1KlTmWl8x/U+Is=
|
||||
github.com/rogpeppe/go-internal v1.10.0 h1:TMyTOH3F/DB16zRVcYyreMH6GnZZrwQVAoYjRBZyWFQ=
|
||||
github.com/rogpeppe/go-internal v1.10.0/go.mod h1:UQnix2H7Ngw/k4C5ijL5+65zddjncjaFoBhdsK/akog=
|
||||
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
|
||||
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
|
||||
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
|
||||
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
|
||||
go.yaml.in/yaml/v2 v2.4.2 h1:DzmwEr2rDGHl7lsFgAHxmNz/1NlQ7xLIrlN2h5d1eGI=
|
||||
go.yaml.in/yaml/v2 v2.4.2/go.mod h1:081UH+NErpNdqlCXm3TtEran0rJZGxAYx9hb/ELlsPU=
|
||||
golang.org/x/crypto v0.18.0 h1:PGVlW0xEltQnzFZ55hkuX5+KLyrMYhHld1YHO4AKcdc=
|
||||
golang.org/x/crypto v0.18.0/go.mod h1:R0j02AL6hcrfOiy9T4ZYp/rcWeMxM3L6QYxlOuEG1mg=
|
||||
golang.org/x/sys v0.16.0 h1:xWw16ngr6ZMtmxDyKyIgsE93KNKz5HKmMa3b8ALHidU=
|
||||
golang.org/x/sys v0.16.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
|
||||
golang.org/x/sys v0.35.0 h1:vz1N37gP5bs89s7He8XuIYXpyY0+QlsKmzipCbUtyxI=
|
||||
golang.org/x/sys v0.35.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
|
||||
google.golang.org/protobuf v1.36.8 h1:xHScyCOEuuwZEc6UtSOvPbAT4zRh0xcNRYekJwfqyMc=
|
||||
google.golang.org/protobuf v1.36.8/go.mod h1:fuxRtAxBytpl4zzqUh6/eyUujkJdNiuEkXntxiD/uRU=
|
||||
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
|
||||
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
|
||||
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
|
||||
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
||||
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
|
||||
258
metrics.go
Normal file
@@ -0,0 +1,258 @@
|
||||
package aether
|
||||
|
||||
import (
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
)
|
||||
|
||||
// BroadcasterMetrics provides observability metrics for EventBroadcaster implementations.
|
||||
// All methods are safe for concurrent use.
|
||||
type BroadcasterMetrics interface {
|
||||
// EventsPublished returns the total number of events published per namespace.
|
||||
EventsPublished(namespaceID string) int64
|
||||
|
||||
// EventsReceived returns the total number of events received per namespace.
|
||||
// For EventBus this equals events delivered to subscribers.
|
||||
// For NATSEventBus this includes events received from NATS.
|
||||
EventsReceived(namespaceID string) int64
|
||||
|
||||
// ActiveSubscriptions returns the current number of active subscriptions per namespace.
|
||||
ActiveSubscriptions(namespaceID string) int64
|
||||
|
||||
// TotalActiveSubscriptions returns the total number of active subscriptions across all namespaces.
|
||||
TotalActiveSubscriptions() int64
|
||||
|
||||
// PublishErrors returns the total number of publish errors per namespace.
|
||||
PublishErrors(namespaceID string) int64
|
||||
|
||||
// SubscribeErrors returns the total number of subscribe errors per namespace.
|
||||
SubscribeErrors(namespaceID string) int64
|
||||
|
||||
// DroppedEvents returns the total number of events dropped (e.g., full channel) per namespace.
|
||||
DroppedEvents(namespaceID string) int64
|
||||
|
||||
// Namespaces returns a list of all namespaces that have metrics.
|
||||
Namespaces() []string
|
||||
|
||||
// Reset resets all metrics. Useful for testing.
|
||||
Reset()
|
||||
}
|
||||
|
||||
// MetricsCollector provides methods for collecting metrics.
|
||||
// This interface is implemented internally and used by EventBus implementations.
|
||||
type MetricsCollector interface {
|
||||
BroadcasterMetrics
|
||||
|
||||
	// RecordPublish records a successful publish event.
	RecordPublish(namespaceID string)

	// RecordReceive records a received event.
	RecordReceive(namespaceID string)

	// RecordSubscribe records a new subscription.
	RecordSubscribe(namespaceID string)

	// RecordUnsubscribe records a removed subscription.
	RecordUnsubscribe(namespaceID string)

	// RecordPublishError records a publish error.
	RecordPublishError(namespaceID string)

	// RecordSubscribeError records a subscribe error.
	RecordSubscribeError(namespaceID string)

	// RecordDroppedEvent records a dropped event (e.g., channel full).
	RecordDroppedEvent(namespaceID string)
}

// namespaceMetrics holds counters for a single namespace.
type namespaceMetrics struct {
	eventsPublished     int64
	eventsReceived      int64
	activeSubscriptions int64
	publishErrors       int64
	subscribeErrors     int64
	droppedEvents       int64
}

// DefaultMetricsCollector is the default implementation of MetricsCollector.
// It uses atomic operations for thread-safe counter updates.
type DefaultMetricsCollector struct {
	mu         sync.RWMutex
	namespaces map[string]*namespaceMetrics
}

// NewMetricsCollector creates a new DefaultMetricsCollector.
func NewMetricsCollector() *DefaultMetricsCollector {
	return &DefaultMetricsCollector{
		namespaces: make(map[string]*namespaceMetrics),
	}
}

// getOrCreateNamespace returns metrics for a namespace, creating if needed.
func (m *DefaultMetricsCollector) getOrCreateNamespace(namespaceID string) *namespaceMetrics {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if exists {
		return ns
	}

	m.mu.Lock()
	defer m.mu.Unlock()

	// Double-check after acquiring write lock
	if ns, exists = m.namespaces[namespaceID]; exists {
		return ns
	}

	ns = &namespaceMetrics{}
	m.namespaces[namespaceID] = ns
	return ns
}
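The get-or-create step above combines a read-locked fast path with a re-check under the write lock, while the counters themselves are updated atomically. The following is a minimal standalone sketch of that pattern. The `registry` and `counters` types and the `countConcurrently` helper are hypothetical names introduced only for illustration and are not part of the aether package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counters is a hypothetical stand-in for namespaceMetrics with one field.
type counters struct {
	published int64
}

// registry mirrors the RWMutex-guarded map used by the collector.
type registry struct {
	mu sync.RWMutex
	m  map[string]*counters
}

func newRegistry() *registry {
	return &registry{m: make(map[string]*counters)}
}

// getOrCreate uses the read lock on the fast path and re-checks under the
// write lock so that two racing callers never install different entries.
func (r *registry) getOrCreate(key string) *counters {
	r.mu.RLock()
	c, ok := r.m[key]
	r.mu.RUnlock()
	if ok {
		return c
	}

	r.mu.Lock()
	defer r.mu.Unlock()
	if c, ok = r.m[key]; ok {
		return c
	}
	c = &counters{}
	r.m[key] = c
	return c
}

// countConcurrently increments one key from several goroutines and returns
// the final counter value.
func countConcurrently(goroutines, iterations int) int64 {
	r := newRegistry()
	var wg sync.WaitGroup
	wg.Add(goroutines)
	for i := 0; i < goroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < iterations; j++ {
				atomic.AddInt64(&r.getOrCreate("ns").published, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&r.getOrCreate("ns").published)
}

func main() {
	fmt.Println(countConcurrently(8, 1000)) // 8000
}
```

Without the second lookup under the write lock, two goroutines that both miss on the read path could each install their own entry, and increments made through the overwritten entry would be lost.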

// EventsPublished returns the total number of events published for a namespace.
func (m *DefaultMetricsCollector) EventsPublished(namespaceID string) int64 {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if !exists {
		return 0
	}
	return atomic.LoadInt64(&ns.eventsPublished)
}

// EventsReceived returns the total number of events received for a namespace.
func (m *DefaultMetricsCollector) EventsReceived(namespaceID string) int64 {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if !exists {
		return 0
	}
	return atomic.LoadInt64(&ns.eventsReceived)
}

// ActiveSubscriptions returns the current number of active subscriptions for a namespace.
func (m *DefaultMetricsCollector) ActiveSubscriptions(namespaceID string) int64 {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if !exists {
		return 0
	}
	return atomic.LoadInt64(&ns.activeSubscriptions)
}

// TotalActiveSubscriptions returns the total number of active subscriptions across all namespaces.
func (m *DefaultMetricsCollector) TotalActiveSubscriptions() int64 {
	m.mu.RLock()
	defer m.mu.RUnlock()

	var total int64
	for _, ns := range m.namespaces {
		total += atomic.LoadInt64(&ns.activeSubscriptions)
	}
	return total
}

// PublishErrors returns the total number of publish errors for a namespace.
func (m *DefaultMetricsCollector) PublishErrors(namespaceID string) int64 {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if !exists {
		return 0
	}
	return atomic.LoadInt64(&ns.publishErrors)
}

// SubscribeErrors returns the total number of subscribe errors for a namespace.
func (m *DefaultMetricsCollector) SubscribeErrors(namespaceID string) int64 {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if !exists {
		return 0
	}
	return atomic.LoadInt64(&ns.subscribeErrors)
}

// DroppedEvents returns the total number of dropped events for a namespace.
func (m *DefaultMetricsCollector) DroppedEvents(namespaceID string) int64 {
	m.mu.RLock()
	ns, exists := m.namespaces[namespaceID]
	m.mu.RUnlock()

	if !exists {
		return 0
	}
	return atomic.LoadInt64(&ns.droppedEvents)
}

// Namespaces returns a list of all namespaces that have metrics.
func (m *DefaultMetricsCollector) Namespaces() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()

	namespaces := make([]string, 0, len(m.namespaces))
	for ns := range m.namespaces {
		namespaces = append(namespaces, ns)
	}
	return namespaces
}

// Reset resets all metrics.
func (m *DefaultMetricsCollector) Reset() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.namespaces = make(map[string]*namespaceMetrics)
}

// RecordPublish records a successful publish event.
func (m *DefaultMetricsCollector) RecordPublish(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.eventsPublished, 1)
}

// RecordReceive records a received event.
func (m *DefaultMetricsCollector) RecordReceive(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.eventsReceived, 1)
}

// RecordSubscribe records a new subscription.
func (m *DefaultMetricsCollector) RecordSubscribe(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.activeSubscriptions, 1)
}

// RecordUnsubscribe records a removed subscription.
func (m *DefaultMetricsCollector) RecordUnsubscribe(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.activeSubscriptions, -1)
}

// RecordPublishError records a publish error.
func (m *DefaultMetricsCollector) RecordPublishError(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.publishErrors, 1)
}

// RecordSubscribeError records a subscribe error.
func (m *DefaultMetricsCollector) RecordSubscribeError(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.subscribeErrors, 1)
}

// RecordDroppedEvent records a dropped event.
func (m *DefaultMetricsCollector) RecordDroppedEvent(namespaceID string) {
	ns := m.getOrCreateNamespace(namespaceID)
	atomic.AddInt64(&ns.droppedEvents, 1)
}
123
metrics_prometheus.go
Normal file
@@ -0,0 +1,123 @@
package aether

import (
	"github.com/prometheus/client_golang/prometheus"
)

// PrometheusMetricsAdapter exposes BroadcasterMetrics as Prometheus metrics.
// It implements prometheus.Collector and can be registered with a Prometheus registry.
type PrometheusMetricsAdapter struct {
	metrics BroadcasterMetrics

	eventsPublishedDesc     *prometheus.Desc
	eventsReceivedDesc      *prometheus.Desc
	activeSubscriptionsDesc *prometheus.Desc
	publishErrorsDesc       *prometheus.Desc
	subscribeErrorsDesc     *prometheus.Desc
	droppedEventsDesc       *prometheus.Desc
}

// NewPrometheusMetricsAdapter creates a new PrometheusMetricsAdapter that wraps
// a BroadcasterMetrics implementation and exposes it as Prometheus metrics.
//
// The adapter implements prometheus.Collector and should be registered with
// a Prometheus registry:
//
//	eb := aether.NewEventBus()
//	adapter := aether.NewPrometheusMetricsAdapter(eb.Metrics())
//	prometheus.MustRegister(adapter)
func NewPrometheusMetricsAdapter(metrics BroadcasterMetrics) *PrometheusMetricsAdapter {
	return &PrometheusMetricsAdapter{
		metrics: metrics,
		eventsPublishedDesc: prometheus.NewDesc(
			"aether_events_published_total",
			"Total number of events published per namespace",
			[]string{"namespace"},
			nil,
		),
		eventsReceivedDesc: prometheus.NewDesc(
			"aether_events_received_total",
			"Total number of events received per namespace",
			[]string{"namespace"},
			nil,
		),
		activeSubscriptionsDesc: prometheus.NewDesc(
			"aether_active_subscriptions",
			"Number of active subscriptions per namespace",
			[]string{"namespace"},
			nil,
		),
		publishErrorsDesc: prometheus.NewDesc(
			"aether_publish_errors_total",
			"Total number of publish errors per namespace",
			[]string{"namespace"},
			nil,
		),
		subscribeErrorsDesc: prometheus.NewDesc(
			"aether_subscribe_errors_total",
			"Total number of subscribe errors per namespace",
			[]string{"namespace"},
			nil,
		),
		droppedEventsDesc: prometheus.NewDesc(
			"aether_dropped_events_total",
			"Total number of dropped events per namespace",
			[]string{"namespace"},
			nil,
		),
	}
}

// Describe implements prometheus.Collector.
func (a *PrometheusMetricsAdapter) Describe(ch chan<- *prometheus.Desc) {
	ch <- a.eventsPublishedDesc
	ch <- a.eventsReceivedDesc
	ch <- a.activeSubscriptionsDesc
	ch <- a.publishErrorsDesc
	ch <- a.subscribeErrorsDesc
	ch <- a.droppedEventsDesc
}

// Collect implements prometheus.Collector.
func (a *PrometheusMetricsAdapter) Collect(ch chan<- prometheus.Metric) {
	namespaces := a.metrics.Namespaces()

	for _, ns := range namespaces {
		ch <- prometheus.MustNewConstMetric(
			a.eventsPublishedDesc,
			prometheus.CounterValue,
			float64(a.metrics.EventsPublished(ns)),
			ns,
		)
		ch <- prometheus.MustNewConstMetric(
			a.eventsReceivedDesc,
			prometheus.CounterValue,
			float64(a.metrics.EventsReceived(ns)),
			ns,
		)
		ch <- prometheus.MustNewConstMetric(
			a.activeSubscriptionsDesc,
			prometheus.GaugeValue,
			float64(a.metrics.ActiveSubscriptions(ns)),
			ns,
		)
		ch <- prometheus.MustNewConstMetric(
			a.publishErrorsDesc,
			prometheus.CounterValue,
			float64(a.metrics.PublishErrors(ns)),
			ns,
		)
		ch <- prometheus.MustNewConstMetric(
			a.subscribeErrorsDesc,
			prometheus.CounterValue,
			float64(a.metrics.SubscribeErrors(ns)),
			ns,
		)
		ch <- prometheus.MustNewConstMetric(
			a.droppedEventsDesc,
			prometheus.CounterValue,
			float64(a.metrics.DroppedEvents(ns)),
			ns,
		)
	}
}
304
metrics_test.go
Normal file
@@ -0,0 +1,304 @@
package aether_test

import (
	"sync"
	"testing"
	"time"

	"git.flowmade.one/flowmade-one/aether"
)

func TestMetricsCollector_InitialState(t *testing.T) {
	mc := aether.NewMetricsCollector()

	if got := mc.EventsPublished("test-ns"); got != 0 {
		t.Errorf("EventsPublished() = %d, want 0", got)
	}
	if got := mc.EventsReceived("test-ns"); got != 0 {
		t.Errorf("EventsReceived() = %d, want 0", got)
	}
	if got := mc.ActiveSubscriptions("test-ns"); got != 0 {
		t.Errorf("ActiveSubscriptions() = %d, want 0", got)
	}
	if got := mc.TotalActiveSubscriptions(); got != 0 {
		t.Errorf("TotalActiveSubscriptions() = %d, want 0", got)
	}
	if got := mc.PublishErrors("test-ns"); got != 0 {
		t.Errorf("PublishErrors() = %d, want 0", got)
	}
	if got := mc.SubscribeErrors("test-ns"); got != 0 {
		t.Errorf("SubscribeErrors() = %d, want 0", got)
	}
	if got := mc.DroppedEvents("test-ns"); got != 0 {
		t.Errorf("DroppedEvents() = %d, want 0", got)
	}
	if got := len(mc.Namespaces()); got != 0 {
		t.Errorf("Namespaces() = %d, want 0", got)
	}
}

func TestMetricsCollector_RecordPublish(t *testing.T) {
	mc := aether.NewMetricsCollector()

	mc.RecordPublish("ns1")
	mc.RecordPublish("ns1")
	mc.RecordPublish("ns2")

	if got := mc.EventsPublished("ns1"); got != 2 {
		t.Errorf("EventsPublished(ns1) = %d, want 2", got)
	}
	if got := mc.EventsPublished("ns2"); got != 1 {
		t.Errorf("EventsPublished(ns2) = %d, want 1", got)
	}
}

func TestMetricsCollector_RecordReceive(t *testing.T) {
	mc := aether.NewMetricsCollector()

	mc.RecordReceive("ns1")
	mc.RecordReceive("ns1")
	mc.RecordReceive("ns1")

	if got := mc.EventsReceived("ns1"); got != 3 {
		t.Errorf("EventsReceived(ns1) = %d, want 3", got)
	}
}

func TestMetricsCollector_Subscriptions(t *testing.T) {
	mc := aether.NewMetricsCollector()

	mc.RecordSubscribe("ns1")
	mc.RecordSubscribe("ns1")
	mc.RecordSubscribe("ns2")

	if got := mc.ActiveSubscriptions("ns1"); got != 2 {
		t.Errorf("ActiveSubscriptions(ns1) = %d, want 2", got)
	}
	if got := mc.ActiveSubscriptions("ns2"); got != 1 {
		t.Errorf("ActiveSubscriptions(ns2) = %d, want 1", got)
	}
	if got := mc.TotalActiveSubscriptions(); got != 3 {
		t.Errorf("TotalActiveSubscriptions() = %d, want 3", got)
	}

	mc.RecordUnsubscribe("ns1")

	if got := mc.ActiveSubscriptions("ns1"); got != 1 {
		t.Errorf("ActiveSubscriptions(ns1) after unsubscribe = %d, want 1", got)
	}
	if got := mc.TotalActiveSubscriptions(); got != 2 {
		t.Errorf("TotalActiveSubscriptions() after unsubscribe = %d, want 2", got)
	}
}

func TestMetricsCollector_Errors(t *testing.T) {
	mc := aether.NewMetricsCollector()

	mc.RecordPublishError("ns1")
	mc.RecordPublishError("ns1")
	mc.RecordSubscribeError("ns1")
	mc.RecordDroppedEvent("ns1")
	mc.RecordDroppedEvent("ns1")
	mc.RecordDroppedEvent("ns1")

	if got := mc.PublishErrors("ns1"); got != 2 {
		t.Errorf("PublishErrors(ns1) = %d, want 2", got)
	}
	if got := mc.SubscribeErrors("ns1"); got != 1 {
		t.Errorf("SubscribeErrors(ns1) = %d, want 1", got)
	}
	if got := mc.DroppedEvents("ns1"); got != 3 {
		t.Errorf("DroppedEvents(ns1) = %d, want 3", got)
	}
}

func TestMetricsCollector_Namespaces(t *testing.T) {
	mc := aether.NewMetricsCollector()

	mc.RecordPublish("ns1")
	mc.RecordReceive("ns2")
	mc.RecordSubscribe("ns3")

	namespaces := mc.Namespaces()
	if len(namespaces) != 3 {
		t.Errorf("Namespaces() length = %d, want 3", len(namespaces))
	}

	nsMap := make(map[string]bool)
	for _, ns := range namespaces {
		nsMap[ns] = true
	}

	for _, expected := range []string{"ns1", "ns2", "ns3"} {
		if !nsMap[expected] {
			t.Errorf("Namespaces() missing %q", expected)
		}
	}
}

func TestMetricsCollector_Reset(t *testing.T) {
	mc := aether.NewMetricsCollector()

	mc.RecordPublish("ns1")
	mc.RecordReceive("ns1")
	mc.RecordSubscribe("ns1")

	mc.Reset()

	if got := mc.EventsPublished("ns1"); got != 0 {
		t.Errorf("EventsPublished() after reset = %d, want 0", got)
	}
	if got := len(mc.Namespaces()); got != 0 {
		t.Errorf("Namespaces() after reset = %d, want 0", got)
	}
}

func TestMetricsCollector_ConcurrentAccess(t *testing.T) {
	mc := aether.NewMetricsCollector()
	const goroutines = 10
	const iterations = 100

	var wg sync.WaitGroup
	wg.Add(goroutines)

	for i := 0; i < goroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < iterations; j++ {
				mc.RecordPublish("concurrent-ns")
				mc.RecordReceive("concurrent-ns")
				mc.RecordSubscribe("concurrent-ns")
				mc.RecordUnsubscribe("concurrent-ns")
				mc.RecordPublishError("concurrent-ns")
				mc.RecordSubscribeError("concurrent-ns")
				mc.RecordDroppedEvent("concurrent-ns")
			}
		}()
	}

	wg.Wait()

	expected := int64(goroutines * iterations)

	if got := mc.EventsPublished("concurrent-ns"); got != expected {
		t.Errorf("EventsPublished() = %d, want %d", got, expected)
	}
	if got := mc.EventsReceived("concurrent-ns"); got != expected {
		t.Errorf("EventsReceived() = %d, want %d", got, expected)
	}
	if got := mc.ActiveSubscriptions("concurrent-ns"); got != 0 {
		t.Errorf("ActiveSubscriptions() = %d, want 0 (subscribed and unsubscribed same amount)", got)
	}
	if got := mc.PublishErrors("concurrent-ns"); got != expected {
		t.Errorf("PublishErrors() = %d, want %d", got, expected)
	}
	if got := mc.SubscribeErrors("concurrent-ns"); got != expected {
		t.Errorf("SubscribeErrors() = %d, want %d", got, expected)
	}
	if got := mc.DroppedEvents("concurrent-ns"); got != expected {
		t.Errorf("DroppedEvents() = %d, want %d", got, expected)
	}
}

func TestEventBus_Metrics(t *testing.T) {
	eb := aether.NewEventBus()
	defer eb.Stop()

	metrics := eb.Metrics()
	if metrics == nil {
		t.Fatal("Metrics() returned nil")
	}

	// Subscribe and verify metrics
	ch := eb.Subscribe("test-ns")
	if got := metrics.ActiveSubscriptions("test-ns"); got != 1 {
		t.Errorf("ActiveSubscriptions() after subscribe = %d, want 1", got)
	}

	// Publish and verify metrics
	event := &aether.Event{
		ID:        "test-1",
		EventType: "TestEvent",
		ActorID:   "actor-1",
		Version:   1,
	}
	eb.Publish("test-ns", event)

	// Wait for event delivery
	select {
	case <-ch:
	case <-time.After(100 * time.Millisecond):
		t.Fatal("timeout waiting for event")
	}

	if got := metrics.EventsPublished("test-ns"); got != 1 {
		t.Errorf("EventsPublished() after publish = %d, want 1", got)
	}
	if got := metrics.EventsReceived("test-ns"); got != 1 {
		t.Errorf("EventsReceived() after publish = %d, want 1", got)
	}

	// Unsubscribe and verify metrics
	eb.Unsubscribe("test-ns", ch)
	if got := metrics.ActiveSubscriptions("test-ns"); got != 0 {
		t.Errorf("ActiveSubscriptions() after unsubscribe = %d, want 0", got)
	}
}

func TestEventBus_DroppedEvents(t *testing.T) {
	eb := aether.NewEventBus()
	defer eb.Stop()

	metrics := eb.Metrics()

	// Subscribe but don't read from channel
	_ = eb.Subscribe("test-ns")

	// Fill the channel buffer (default is 100)
	for i := 0; i < 100; i++ {
		eb.Publish("test-ns", &aether.Event{
			ID:        "fill-" + string(rune(i)),
			EventType: "FillEvent",
		})
	}

	// Next publish should be dropped
	eb.Publish("test-ns", &aether.Event{
		ID:        "dropped",
		EventType: "DroppedEvent",
	})

	if got := metrics.DroppedEvents("test-ns"); got != 1 {
		t.Errorf("DroppedEvents() = %d, want 1", got)
	}
}
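One detail of the test above is easy to misread: `string(rune(i))` converts the integer to the Unicode code point with that value, not to its decimal digits. The IDs are still distinct, so the test behaves as intended, but many of them contain control characters. A small sketch of the difference follows. The helper names `runeID` and `decimalID` are hypothetical, and `strconv.Itoa` is shown only as one possible alternative.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeID reproduces the ID construction used in TestEventBus_DroppedEvents.
func runeID(i int) string {
	return "fill-" + string(rune(i))
}

// decimalID is a hypothetical alternative that formats i as decimal digits.
func decimalID(i int) string {
	return "fill-" + strconv.Itoa(i)
}

func main() {
	// Code point 65 is the letter "A"; Itoa yields the digits "65".
	fmt.Printf("%q %q\n", runeID(65), decimalID(65)) // "fill-A" "fill-65"
}
```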

func TestEventBus_MetricsProvider(t *testing.T) {
	eb := aether.NewEventBus()
	defer eb.Stop()

	// Verify EventBus implements MetricsProvider
	var mp aether.MetricsProvider = eb
	if mp.Metrics() == nil {
		t.Error("EventBus.Metrics() returned nil")
	}
}

func TestEventBus_StopClearsSubscriptionMetrics(t *testing.T) {
	eb := aether.NewEventBus()
	metrics := eb.Metrics()

	_ = eb.Subscribe("ns1")
	_ = eb.Subscribe("ns1")
	_ = eb.Subscribe("ns2")

	if got := metrics.TotalActiveSubscriptions(); got != 3 {
		t.Errorf("TotalActiveSubscriptions() before stop = %d, want 3", got)
	}

	eb.Stop()

	if got := metrics.TotalActiveSubscriptions(); got != 0 {
		t.Errorf("TotalActiveSubscriptions() after stop = %d, want 0", got)
	}
}
282
nats_eventbus.go
@@ -5,22 +5,31 @@ import (
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"log"
|
||||
"strings"
|
||||
"sync"
|
||||
|
||||
"github.com/google/uuid"
|
||||
"github.com/nats-io/nats.go"
|
||||
)
|
||||
|
||||
// NATSEventBus is an EventBus that broadcasts events across all cluster nodes using NATS
|
||||
// NATSEventBus is an EventBus that broadcasts events across all cluster nodes using NATS.
|
||||
// Supports wildcard patterns for cross-namespace subscriptions using NATS native wildcards.
|
||||
//
|
||||
// Security Considerations:
|
||||
// Wildcard subscriptions (using "*" or ">") receive events from multiple namespaces.
|
||||
// This bypasses namespace isolation at the NATS level. Ensure proper access controls
|
||||
// are in place at the application layer before granting wildcard subscription access.
|
||||
type NATSEventBus struct {
|
||||
*EventBus // Embed base EventBus for local subscriptions
|
||||
nc *nats.Conn // NATS connection
|
||||
subscriptions []*nats.Subscription
|
||||
namespaceSubscribers map[string]int // Track number of subscribers per namespace
|
||||
nodeID string // Unique ID for this node
|
||||
mutex sync.Mutex
|
||||
ctx context.Context
|
||||
cancel context.CancelFunc
|
||||
*EventBus // Embed base EventBus for local subscriptions
|
||||
nc *nats.Conn // NATS connection
|
||||
subscriptions []*nats.Subscription
|
||||
patternSubscribers map[string]int // Track number of subscribers per pattern (includes wildcards)
|
||||
nodeID string // Unique ID for this node
|
||||
streamPrefix string // NATS subject prefix for events
|
||||
eventStore interface{} // Optional event store for version cache sync (jetstream.JetStreamEventStore)
|
||||
mutex sync.Mutex
|
||||
ctx context.Context
|
||||
cancel context.CancelFunc
|
||||
}
|
||||
|
||||
// eventMessage is the wire format for events sent over NATS
|
||||
@@ -35,69 +44,134 @@ func NewNATSEventBus(nc *nats.Conn) (*NATSEventBus, error) {
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
|
||||
neb := &NATSEventBus{
|
||||
EventBus: NewEventBus(),
|
||||
nc: nc,
|
||||
nodeID: uuid.New().String(),
|
||||
subscriptions: make([]*nats.Subscription, 0),
|
||||
namespaceSubscribers: make(map[string]int),
|
||||
ctx: ctx,
|
||||
cancel: cancel,
|
||||
EventBus: NewEventBus(),
|
||||
nc: nc,
|
||||
nodeID: uuid.New().String(),
|
||||
subscriptions: make([]*nats.Subscription, 0),
|
||||
patternSubscribers: make(map[string]int),
|
||||
streamPrefix: "aether",
|
||||
ctx: ctx,
|
||||
cancel: cancel,
|
||||
}
|
||||
|
||||
return neb, nil
|
||||
}
|
||||
|
||||
// Subscribe creates a local subscription and ensures NATS subscription exists for the namespace
|
||||
func (neb *NATSEventBus) Subscribe(namespaceID string) <-chan *Event {
|
||||
// NewNATSEventBusWithBroadcaster creates a new NATS-backed event bus with JetStreamEventStore integration.
|
||||
// The event store is used to automatically update version cache when EventStored events are received
|
||||
// from other cluster nodes via NATS. This ensures cross-node version consistency.
|
||||
//
|
||||
// Example:
|
||||
//
|
||||
// eventBus := aether.NewNATSEventBusWithBroadcaster(natsConn, store, "tenant-abc")
|
||||
// ch := eventBus.SubscribeToEventStored("tenant-*")
|
||||
// for event := range ch {
|
||||
// actorID := event.Data["actorId"].(string)
|
||||
// version := event.Data["version"].(int64)
|
||||
// store.UpdateVersionCache(actorID, version)
|
||||
// }
|
||||
//
|
||||
// The namespace parameter is used as a prefix for EventStored event filtering.
|
||||
// If empty, EventStored events from all namespaces will be received (requires wildcard pattern).
|
||||
func NewNATSEventBusWithBroadcaster(nc *nats.Conn, store interface{}, namespace string) *NATSEventBus {
|
||||
streamPrefix := "aether"
|
||||
if namespace != "" {
|
||||
streamPrefix = fmt.Sprintf("aether.%s", sanitizeSubject(namespace))
|
||||
}
|
||||
|
||||
neb := &NATSEventBus{
|
||||
EventBus: NewEventBus(),
|
||||
nc: nc,
|
||||
nodeID: uuid.New().String(),
|
||||
subscriptions: make([]*nats.Subscription, 0),
|
||||
patternSubscribers: make(map[string]int),
|
||||
streamPrefix: streamPrefix,
|
||||
eventStore: store,
|
||||
ctx: context.Background(),
|
||||
cancel: func() {},
|
||||
}
|
||||
|
||||
return neb
|
||||
}
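The doc-comment example above asserts `event.Data["version"].(int64)`. Events that cross NATS are decoded with `encoding/json`, and when that package decodes a number into an `interface{}` value it produces a `float64`, so a bare `int64` assertion on such a payload fails. The sketch below illustrates this under the assumption that `Event.Data` is a `map[string]interface{}`. The helpers `versionFromData` and `decode` are hypothetical and not part of the aether API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// versionFromData extracts a version from an event payload. It accepts
// float64 (what encoding/json yields for numbers held in interface{} values)
// as well as int64 (what an in-process publisher might store directly).
func versionFromData(data map[string]interface{}) (int64, bool) {
	switch v := data["version"].(type) {
	case int64:
		return v, true
	case float64:
		return int64(v), true
	default:
		return 0, false
	}
}

// decode unmarshals a raw JSON object into a generic map.
func decode(raw string) (map[string]interface{}, error) {
	var data map[string]interface{}
	err := json.Unmarshal([]byte(raw), &data)
	return data, err
}

func main() {
	data, err := decode(`{"actorId":"order-1","version":7}`)
	if err != nil {
		panic(err)
	}
	_, isInt64 := data["version"].(int64)
	v, ok := versionFromData(data)
	fmt.Println(isInt64, v, ok) // false 7 true
}
```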
|
||||
|
||||
// Subscribe creates a local subscription and ensures NATS subscription exists for the pattern.
|
||||
// Supports NATS subject patterns:
|
||||
// - "*" matches a single token
|
||||
// - ">" matches one or more tokens (only at the end)
|
||||
//
|
||||
// Security Warning: Wildcard patterns receive events from all matching namespaces,
|
||||
// bypassing namespace isolation. Only use for trusted system components.
|
||||
func (neb *NATSEventBus) Subscribe(namespacePattern string) <-chan *Event {
|
||||
return neb.SubscribeWithFilter(namespacePattern, nil)
|
||||
}
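The wildcard rules listed in the comment above can be sketched as a small token matcher. This is an illustrative approximation of NATS subject matching, not the server's implementation, and `matchSubject` is a hypothetical helper. Note that, as the NATS documentation describes it, a wildcard must occupy a whole token, so a pattern such as `tenant-*` is compared literally.

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubject reports whether subject matches pattern under NATS-style
// rules: tokens are separated by ".", "*" matches exactly one token, and ">"
// as the final token matches one or more remaining tokens.
func matchSubject(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" is a wildcard only in the last position and needs at
			// least one token to consume; this sketch rejects it elsewhere.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(p) == len(s)
}

func main() {
	fmt.Println(matchSubject("aether.events.*", "aether.events.tenant-a"))        // true
	fmt.Println(matchSubject("aether.events.*", "aether.events.a.b"))             // false
	fmt.Println(matchSubject("aether.events.>", "aether.events.a.b"))             // true
	fmt.Println(matchSubject("aether.events.tenant-*", "aether.events.tenant-a")) // false
}
```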
|
||||
|
||||
// SubscribeWithFilter creates a filtered subscription channel for a namespace pattern.
|
||||
// Events are filtered by the provided SubscriptionFilter before delivery.
|
||||
// If filter is nil or empty, all events matching the namespace pattern are delivered.
|
||||
//
|
||||
// For NATSEventBus:
|
||||
// - Namespace pattern filtering is applied at the NATS level using native wildcards
|
||||
// - EventTypes and ActorPattern filters are applied client-side after receiving messages
|
||||
//
|
||||
// This allows efficient server-side filtering for namespaces while providing
|
||||
// flexible client-side filtering for event types and actors.
|
||||
func (neb *NATSEventBus) SubscribeWithFilter(namespacePattern string, filter *SubscriptionFilter) <-chan *Event {
|
||||
neb.mutex.Lock()
|
||||
defer neb.mutex.Unlock()
|
||||
|
||||
// Create local subscription first
|
||||
ch := neb.EventBus.Subscribe(namespaceID)
|
||||
// Create local subscription first (with filter)
|
||||
ch := neb.EventBus.SubscribeWithFilter(namespacePattern, filter)
|
||||
|
||||
// Check if this is the first subscriber for this namespace
|
||||
count := neb.namespaceSubscribers[namespaceID]
|
||||
// Check if this is the first subscriber for this pattern
|
||||
count := neb.patternSubscribers[namespacePattern]
|
||||
if count == 0 {
|
||||
// First subscriber - create NATS subscription
|
||||
subject := fmt.Sprintf("aether.events.%s", namespaceID)
|
||||
// NATS natively supports wildcards, so we can use the pattern directly
|
||||
subject := fmt.Sprintf("aether.events.%s", namespacePattern)
|
||||
|
||||
sub, err := neb.nc.Subscribe(subject, func(msg *nats.Msg) {
|
||||
neb.handleNATSEvent(msg)
|
||||
neb.handleNATSEvent(msg, namespacePattern)
|
||||
})
|
||||
if err != nil {
|
||||
log.Printf("[NATSEventBus] Failed to subscribe to NATS subject %s: %v", subject, err)
|
||||
// Record subscription error
|
||||
neb.metrics.RecordSubscribeError(namespacePattern)
|
||||
} else {
|
||||
neb.subscriptions = append(neb.subscriptions, sub)
|
||||
log.Printf("[NATSEventBus] Node %s subscribed to %s", neb.nodeID, subject)
|
||||
if IsWildcardPattern(namespacePattern) {
|
||||
log.Printf("[NATSEventBus] Node %s subscribed to wildcard pattern %s", neb.nodeID, subject)
|
||||
} else {
|
||||
log.Printf("[NATSEventBus] Node %s subscribed to %s", neb.nodeID, subject)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
neb.namespaceSubscribers[namespaceID] = count + 1
|
||||
neb.patternSubscribers[namespacePattern] = count + 1
|
||||
|
||||
return ch
|
||||
}
|
||||
|
||||
// Unsubscribe removes a local subscription and cleans up the NATS subscription when no subscribers remain.
|
||||
func (neb *NATSEventBus) Unsubscribe(namespaceID string, ch <-chan *Event) {
|
||||
func (neb *NATSEventBus) Unsubscribe(namespacePattern string, ch <-chan *Event) {
|
||||
neb.mutex.Lock()
|
||||
defer neb.mutex.Unlock()
|
||||
|
||||
neb.EventBus.Unsubscribe(namespaceID, ch)
|
||||
neb.EventBus.Unsubscribe(namespacePattern, ch)
|
||||
|
||||
count := neb.namespaceSubscribers[namespaceID]
|
||||
count := neb.patternSubscribers[namespacePattern]
|
||||
if count > 0 {
|
||||
count--
|
||||
neb.namespaceSubscribers[namespaceID] = count
|
||||
neb.patternSubscribers[namespacePattern] = count
|
||||
|
||||
if count == 0 {
|
||||
delete(neb.namespaceSubscribers, namespaceID)
|
||||
log.Printf("[NATSEventBus] No more subscribers for namespace %s on node %s", namespaceID, neb.nodeID)
|
||||
delete(neb.patternSubscribers, namespacePattern)
|
||||
log.Printf("[NATSEventBus] No more subscribers for pattern %s on node %s", namespacePattern, neb.nodeID)
|
||||
}
|
||||
}
|
||||
}
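Subscribe and Unsubscribe together implement reference counting per pattern: the first subscriber triggers the NATS subscription and the count is removed when the last one leaves. A minimal sketch of that bookkeeping follows. The `refCounter` type is hypothetical and omits the mutex that the real bus holds around its map.

```go
package main

import "fmt"

// refCounter tracks how many local subscribers share one upstream
// subscription. It is a single-goroutine simplification of the
// patternSubscribers map.
type refCounter struct {
	counts map[string]int
}

func newRefCounter() *refCounter {
	return &refCounter{counts: make(map[string]int)}
}

// acquire increments the count and reports whether this was the first
// subscriber, which is the moment to open the upstream subscription.
func (r *refCounter) acquire(pattern string) bool {
	r.counts[pattern]++
	return r.counts[pattern] == 1
}

// release decrements the count and reports whether the last subscriber
// left, which is the moment the upstream subscription could be closed.
func (r *refCounter) release(pattern string) bool {
	n, ok := r.counts[pattern]
	if !ok {
		return false
	}
	if n <= 1 {
		delete(r.counts, pattern)
		return true
	}
	r.counts[pattern] = n - 1
	return false
}

func main() {
	r := newRefCounter()
	fmt.Println(r.acquire("tenant-a"), r.acquire("tenant-a")) // true false
	fmt.Println(r.release("tenant-a"), r.release("tenant-a")) // false true
	fmt.Println(r.release("tenant-a"))                        // false
}
```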
|
||||
|
||||
// handleNATSEvent processes events received from NATS
|
||||
func (neb *NATSEventBus) handleNATSEvent(msg *nats.Msg) {
|
||||
func (neb *NATSEventBus) handleNATSEvent(msg *nats.Msg, subscribedPattern string) {
|
||||
var eventMsg eventMessage
|
||||
if err := json.Unmarshal(msg.Data, &eventMsg); err != nil {
|
||||
log.Printf("[NATSEventBus] Failed to unmarshal event: %v", err)
|
||||
@@ -109,8 +183,44 @@ func (neb *NATSEventBus) handleNATSEvent(msg *nats.Msg) {
|
||||
return
|
||||
}
|
||||
|
||||
// Forward to local EventBus subscribers
|
||||
neb.EventBus.Publish(eventMsg.NamespaceID, eventMsg.Event)
|
||||
// For wildcard subscriptions, we need to deliver to the EventBus using
|
||||
// the subscribed pattern so it reaches the correct wildcard subscriber.
|
||||
// For exact subscriptions, use the actual namespace.
|
||||
if IsWildcardPattern(subscribedPattern) {
|
||||
// Deliver using the pattern - the EventBus will route to wildcard subscribers
|
||||
neb.deliverToWildcardSubscribers(subscribedPattern, eventMsg.Event)
|
||||
} else {
|
||||
// Forward to local EventBus subscribers with actual namespace
|
||||
neb.EventBus.Publish(eventMsg.NamespaceID, eventMsg.Event)
|
||||
}
|
||||
}
|
||||
|
||||
// deliverToWildcardSubscribers delivers an event to subscribers of a specific wildcard pattern.
|
||||
// Applies filters before delivery.
|
||||
func (neb *NATSEventBus) deliverToWildcardSubscribers(pattern string, event *Event) {
|
||||
neb.EventBus.mutex.RLock()
|
||||
defer neb.EventBus.mutex.RUnlock()
|
||||
|
||||
for _, sub := range neb.EventBus.wildcardSubscribers {
|
||||
if sub.pattern == pattern {
|
||||
// Apply filter if present
|
||||
if sub.filter != nil && !sub.filter.IsEmpty() {
|
||||
if !sub.filter.Matches(event) {
|
||||
// Event doesn't match filter, skip delivery
|
||||
continue
|
||||
}
|
||||
}
|
||||
|
||||
select {
|
||||
case sub.ch <- event:
|
||||
// Event delivered from NATS
|
||||
neb.metrics.RecordReceive(pattern)
|
||||
default:
|
||||
// Channel full, skip this subscriber (non-blocking)
|
||||
neb.metrics.RecordDroppedEvent(pattern)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
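The delivery loop above never blocks on a slow subscriber: a `select` with a `default` branch either queues the event or counts it as dropped. The sketch below isolates that technique with plain integers. `trySend` and `publishAll` are hypothetical helpers used only for illustration.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the value was
// queued on the channel.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// Buffer full and no receiver ready: give up instead of blocking.
		return false
	}
}

// publishAll sends n values and returns how many were delivered and how
// many were dropped.
func publishAll(ch chan<- int, n int) (delivered, dropped int) {
	for i := 0; i < n; i++ {
		if trySend(ch, i) {
			delivered++
		} else {
			dropped++
		}
	}
	return delivered, dropped
}

func main() {
	ch := make(chan int, 3) // nothing reads from ch
	delivered, dropped := publishAll(ch, 5)
	fmt.Println(delivered, dropped) // 3 2
}
```

The trade-off is that a subscriber with a full buffer silently loses events, which is why the bus records them in a dropped-events counter.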
|
||||
|
||||
// Publish publishes an event both locally and to NATS for cross-node broadcasting
|
||||
@@ -130,11 +240,13 @@ func (neb *NATSEventBus) Publish(namespaceID string, event *Event) {
|
||||
data, err := json.Marshal(eventMsg)
|
||||
if err != nil {
|
||||
log.Printf("[NATSEventBus] Failed to marshal event for NATS: %v", err)
|
||||
neb.metrics.RecordPublishError(namespaceID)
|
||||
return
|
||||
}
|
||||
|
||||
if err := neb.nc.Publish(subject, data); err != nil {
|
||||
log.Printf("[NATSEventBus] Failed to publish event to NATS: %v", err)
|
||||
neb.metrics.RecordPublishError(namespaceID)
|
||||
return
|
||||
}
|
||||
}
|
||||
@@ -157,3 +269,103 @@ func (neb *NATSEventBus) Stop() {
|
||||
|
||||
log.Printf("[NATSEventBus] Node %s stopped", neb.nodeID)
|
||||
}
|
||||
|
||||
// sanitizeSubject sanitizes a string for use in NATS subjects
|
||||
func sanitizeSubject(s string) string {
|
||||
s = strings.ReplaceAll(s, " ", "_")
|
||||
s = strings.ReplaceAll(s, ".", "_")
|
||||
s = strings.ReplaceAll(s, "*", "_")
|
||||
s = strings.ReplaceAll(s, ">", "_")
|
||||
return s
|
||||
}
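To make the effect of `sanitizeSubject` concrete, the function is reproduced below in a standalone program with a few inputs. One consequence worth noting is that the mapping is not injective: distinct namespaces such as `a.b` and `a_b` sanitize to the same subject token.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeSubject is copied from the diff above: it replaces spaces, token
// separators, and wildcard characters with underscores.
func sanitizeSubject(s string) string {
	s = strings.ReplaceAll(s, " ", "_")
	s = strings.ReplaceAll(s, ".", "_")
	s = strings.ReplaceAll(s, "*", "_")
	s = strings.ReplaceAll(s, ">", "_")
	return s
}

func main() {
	for _, in := range []string{"tenant abc", "a.b", "a_b", "tenant-*", ">"} {
		fmt.Printf("%q -> %q\n", in, sanitizeSubject(in))
	}
}
```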
|
||||
|
||||
// extractActorType extracts the actor type from an actor ID
|
||||
func extractActorType(actorID string) string {
|
||||
for i, c := range actorID {
|
||||
if c == '-' && i > 0 {
|
||||
return actorID[:i]
|
||||
}
|
||||
}
|
||||
return "unknown"
|
||||
}
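The behavior of `extractActorType` at its edges is easiest to see by running it. The function is reproduced below unchanged: it returns everything before the first hyphen that is not at index 0, and falls back to `"unknown"` when no such hyphen exists.

```go
package main

import "fmt"

// extractActorType is copied from the diff above.
func extractActorType(actorID string) string {
	for i, c := range actorID {
		if c == '-' && i > 0 {
			return actorID[:i]
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(extractActorType("order-123")) // order
	fmt.Println(extractActorType("order"))     // unknown
	fmt.Println(extractActorType("-123"))      // unknown
	fmt.Println(extractActorType("a-b-c"))     // a
}
```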
|
||||
|
||||
// SubscribeToEventStored creates a subscription to EventStored events for a namespace pattern.
|
||||
// EventStored events are published by JetStreamEventStore when events are successfully saved.
|
||||
// This is useful for cross-node event synchronization and version cache consistency.
|
||||
//
|
||||
// The returned channel receives EventStored events matching the pattern.
|
||||
// The EventStored event schema:
|
||||
// - EventType: "EventStored"
|
||||
// - ActorID: ID of the actor that the original event was about
|
||||
// - Version: version of the stored event
|
||||
// - Data:
|
||||
// - eventId: (string) ID of the stored event
|
||||
// - actorId: (string) ID of the actor
|
||||
// - version: (int64) version of the event
|
||||
// - timestamp: (int64) Unix timestamp of when the event was stored
|
||||
//
|
||||
// The namespacePattern supports NATS wildcards:
|
||||
// - "*" matches a single token
|
||||
// - ">" matches one or more tokens (only at the end)
|
||||
//
|
||||
// Example:
|
||||
//
|
||||
// // NATS wildcards must occupy a whole token, so "tenant-*" would be matched literally.
// ch := eventBus.SubscribeToEventStored("*")
|
||||
// for event := range ch {
|
||||
// if event.EventType != aether.EventTypeEventStored {
|
||||
// continue
|
||||
// }
|
||||
// actorID, _ := event.Data["actorId"].(string)
// // After a JSON round trip the version arrives as float64, not int64.
// version, _ := event.Data["version"].(float64)
// store.UpdateVersionCache(actorID, int64(version))
|
||||
// }
|
||||
//
|
||||
// Security Warning: Using wildcard patterns like ">" will receive EventStored events
|
||||
// from all namespaces. Ensure your application handles this appropriately.
|
||||
func (neb *NATSEventBus) SubscribeToEventStored(namespacePattern string) <-chan *Event {
|
||||
neb.mutex.Lock()
|
||||
defer neb.mutex.Unlock()
|
||||
|
||||
subject := fmt.Sprintf("%s.%s.%s", neb.streamPrefix, namespacePattern, "events.>")
|
||||
|
||||
ch := make(chan *Event, 100)
|
||||
|
||||
sub, err := neb.nc.Subscribe(subject, func(msg *nats.Msg) {
|
||||
var eventMsg eventMessage
|
||||
if err := json.Unmarshal(msg.Data, &eventMsg); err != nil {
|
||||
log.Printf("[NATSEventBus] Failed to unmarshal EventStored event: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
if eventMsg.NodeID == neb.nodeID {
|
||||
return
|
||||
}
|
||||
|
||||
if eventMsg.Event.EventType == EventTypeEventStored && neb.eventStore != nil {
|
||||
actorID, ok := eventMsg.Event.Data["actorId"].(string)
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
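			// encoding/json decodes numbers inside map[string]interface{} as float64,
			// so accept float64 (decoded from NATS) as well as int64 (set in-process).
			var version int64
			switch v := eventMsg.Event.Data["version"].(type) {
			case float64:
				version = int64(v)
			case int64:
				version = v
			default:
				return
			}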
|
||||
// Use type assertion to call UpdateVersionCache
|
||||
if es, ok := neb.eventStore.(interface{ UpdateVersionCache(string, int64) }); ok {
|
||||
es.UpdateVersionCache(actorID, version)
|
||||
}
|
||||
}
|
||||
|
||||
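		neb.EventBus.Publish(eventMsg.NamespaceID, eventMsg.Event)

		// Deliver EventStored events on the returned channel; drop rather than
		// block the NATS callback when the subscriber is not keeping up.
		if eventMsg.Event.EventType == EventTypeEventStored {
			select {
			case ch <- eventMsg.Event:
			default:
				log.Printf("[NATSEventBus] EventStored channel full, dropping event for namespace %s", eventMsg.NamespaceID)
			}
		}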
|
||||
})
|
||||
|
||||
if err != nil {
|
||||
log.Printf("[NATSEventBus] Failed to subscribe to EventStored: %v", err)
|
||||
close(ch)
|
||||
return ch
|
||||
}
|
||||
|
||||
neb.subscriptions = append(neb.subscriptions, sub)
|
||||
|
||||
return ch
|
||||
}
|
||||
|
||||
197
pattern.go
Normal file
@@ -0,0 +1,197 @@
|
||||
package aether
|
||||
|
||||
import "strings"
|
||||
|
||||
// MatchNamespacePattern checks if a namespace matches a pattern.
|
||||
// Patterns follow NATS subject matching conventions where tokens are separated by dots:
|
||||
// - "*" matches exactly one token (any sequence without ".")
|
||||
// - ">" matches one or more tokens (only valid at the end of a pattern)
|
||||
// - Exact strings match exactly
|
||||
//
|
||||
// Examples:
|
||||
// - "tenant-a" matches "tenant-a" (exact match)
|
||||
// - "*" matches any single-token namespace like "tenant-a" or "production"
|
||||
// - ">" matches any namespace with one or more tokens
|
||||
// - "prod.*" matches "prod.tenant", "prod.orders" (but not "prod.tenant.orders")
|
||||
// - "prod.>" matches "prod.tenant", "prod.tenant.orders", "prod.a.b.c"
|
||||
// - "*.tenant.*" matches "prod.tenant.orders", "staging.tenant.events"
|
||||
//
|
||||
// Security Considerations:
|
||||
// Wildcard subscriptions provide cross-namespace visibility. Use with caution:
|
||||
// - "*" or ">" patterns receive events from ALL matching namespaces
|
||||
// - This bypasses namespace isolation for the subscriber
|
||||
// - Only grant wildcard subscription access to trusted system components
|
||||
// - Consider auditing wildcard subscription usage
|
||||
// - For multi-tenant systems, wildcard access should be restricted to admin/ops
|
||||
// - Use the most specific pattern possible to minimize exposure
|
||||
func MatchNamespacePattern(pattern, namespace string) bool {
|
||||
// Empty pattern matches nothing
|
||||
if pattern == "" {
|
||||
return false
|
||||
}
|
||||
|
||||
	// ">" used alone matches any non-empty namespace
|
||||
if pattern == ">" {
|
||||
return namespace != ""
|
||||
}
|
||||
|
||||
patternTokens := strings.Split(pattern, ".")
|
||||
namespaceTokens := strings.Split(namespace, ".")
|
||||
|
||||
return matchTokens(patternTokens, namespaceTokens)
|
||||
}
|
||||
|
||||
// matchTokens recursively matches pattern tokens against namespace tokens
|
||||
func matchTokens(patternTokens, namespaceTokens []string) bool {
|
||||
// If pattern is exhausted, namespace must also be exhausted
|
||||
if len(patternTokens) == 0 {
|
||||
return len(namespaceTokens) == 0
|
||||
}
|
||||
|
||||
patternToken := patternTokens[0]
|
||||
|
||||
// ">" matches one or more remaining tokens (must be last pattern token)
|
||||
if patternToken == ">" {
|
||||
		// ">" must be the final pattern token and requires at least one namespace token
		return len(patternTokens) == 1 && len(namespaceTokens) >= 1
|
||||
}
|
||||
|
||||
// If namespace is exhausted but pattern has more tokens, no match
|
||||
if len(namespaceTokens) == 0 {
|
||||
return false
|
||||
}
|
||||
|
||||
namespaceToken := namespaceTokens[0]
|
||||
|
||||
// "*" matches exactly one token
|
||||
if patternToken == "*" {
|
||||
return matchTokens(patternTokens[1:], namespaceTokens[1:])
|
||||
}
|
||||
|
||||
// Exact match required
|
||||
if patternToken == namespaceToken {
|
||||
return matchTokens(patternTokens[1:], namespaceTokens[1:])
|
||||
}
|
||||
|
||||
return false
|
||||
}
|
||||
|
||||
// IsWildcardPattern returns true if the pattern contains wildcards (* or >).
// Wildcard patterns can match multiple namespaces and bypass namespace isolation.
func IsWildcardPattern(pattern string) bool {
	return strings.Contains(pattern, "*") || strings.Contains(pattern, ">")
}
|
||||
|
||||
// SubscriptionFilter defines optional filters for event subscriptions.
|
||||
// All configured filters are combined with AND logic - an event must match
|
||||
// all specified criteria to be delivered to the subscriber.
|
||||
//
|
||||
// Filter Processing:
|
||||
// - EventTypes: Event must have an EventType matching at least one in the list (OR within types)
|
||||
// - ActorPattern: Event's ActorID must match the pattern (supports * and > wildcards)
|
||||
//
|
||||
// Filtering is applied client-side in the EventBus. For NATSEventBus, namespace-level
|
||||
// filtering uses NATS subject patterns, while EventTypes and ActorPattern filtering
|
||||
// happens after message receipt.
|
||||
type SubscriptionFilter struct {
|
||||
// EventTypes filters events by type. Empty slice means all event types.
|
||||
// If specified, only events with an EventType in this list are delivered.
|
||||
// Example: []string{"OrderPlaced", "OrderShipped"} receives only those event types.
|
||||
EventTypes []string
|
||||
|
||||
// ActorPattern filters events by actor ID pattern. Empty string means all actors.
|
||||
// Supports NATS-style wildcards:
|
||||
// - "*" matches a single token (e.g., "order-*" matches "order-123", "order-456")
|
||||
// - ">" matches one or more tokens (e.g., "order.>" matches "order.us.123", "order.eu.456")
|
||||
// Example: "order-*" receives events only for actors starting with "order-"
|
||||
ActorPattern string
|
||||
}
|
||||
|
||||
// IsEmpty returns true if no filters are configured.
func (f *SubscriptionFilter) IsEmpty() bool {
	return len(f.EventTypes) == 0 && f.ActorPattern == ""
}
|
||||
|
||||
// Matches returns true if the event matches all configured filters.
|
||||
// An empty filter matches all events.
|
||||
func (f *SubscriptionFilter) Matches(event *Event) bool {
|
||||
if event == nil {
|
||||
return false
|
||||
}
|
||||
|
||||
// Check event type filter
|
||||
if len(f.EventTypes) > 0 {
|
||||
typeMatch := false
|
||||
for _, et := range f.EventTypes {
|
||||
if event.EventType == et {
|
||||
typeMatch = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if !typeMatch {
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
// Check actor pattern filter
|
||||
if f.ActorPattern != "" {
|
||||
if !MatchActorPattern(f.ActorPattern, event.ActorID) {
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
return true
|
||||
}
|
||||
|
||||
// MatchActorPattern checks if an actor ID matches a pattern.
|
||||
// Uses the same matching logic as MatchNamespacePattern for consistency.
|
||||
//
|
||||
// Patterns:
|
||||
// - "*" matches a single token (e.g., "order-*" matches "order-123")
|
||||
// - ">" matches one or more tokens (e.g., "order.>" matches "order.us.east")
|
||||
// - Exact strings match exactly (e.g., "order-123" matches only "order-123")
|
||||
//
|
||||
// Note: When neither the pattern nor the actor ID contains a dot, a trailing "*" or ">"
// acts as a prefix wildcard (e.g., "order-*" matches "order-123" and "order-456-xyz").
|
||||
func MatchActorPattern(pattern, actorID string) bool {
|
||||
// Empty pattern matches nothing
|
||||
if pattern == "" {
|
||||
return false
|
||||
}
|
||||
|
||||
	// Empty actor ID matches nothing
|
||||
if actorID == "" {
|
||||
return false
|
||||
}
|
||||
|
||||
	// If the pattern or the actor ID contains dots, use token-based matching (same as namespace)
|
||||
if strings.Contains(pattern, ".") || strings.Contains(actorID, ".") {
|
||||
return MatchNamespacePattern(pattern, actorID)
|
||||
}
|
||||
|
||||
// Simple matching for non-tokenized patterns
|
||||
// ">" matches any non-empty actor ID
|
||||
if pattern == ">" {
|
||||
return true
|
||||
}
|
||||
|
||||
// "*" matches any single-token actor ID (no dots)
|
||||
if pattern == "*" {
|
||||
return true
|
||||
}
|
||||
|
||||
// Check for suffix wildcard (e.g., "order-*")
|
||||
if strings.HasSuffix(pattern, "*") {
|
||||
prefix := strings.TrimSuffix(pattern, "*")
|
||||
return strings.HasPrefix(actorID, prefix)
|
||||
}
|
||||
|
||||
// Check for suffix multi-match (e.g., "order->")
|
||||
if strings.HasSuffix(pattern, ">") {
|
||||
prefix := strings.TrimSuffix(pattern, ">")
|
||||
return strings.HasPrefix(actorID, prefix)
|
||||
}
|
||||
|
||||
// Exact match
|
||||
return pattern == actorID
|
||||
}
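The interaction between the token-based branch and the prefix shortcut in `MatchActorPattern` is easy to misread. The sketch below is a compact, iterative re-implementation written only for illustration; `matchTokens` and `matchActor` here are local stand-ins under the assumption that they mirror the behavior described above, not the code in this diff. It shows that `order-*` matches `order-123` but stops matching once the actor ID contains a dot, because dotted input switches to whole-token comparison.

```go
package main

import (
	"fmt"
	"strings"
)

// matchTokens is an iterative sketch of NATS-style token matching:
// "*" consumes exactly one token, ">" consumes one or more and must be last.
func matchTokens(pattern, subject []string) bool {
	for i, p := range pattern {
		if p == ">" {
			return i == len(pattern)-1 && len(subject) > i
		}
		if i >= len(subject) {
			return false
		}
		if p != "*" && p != subject[i] {
			return false
		}
	}
	return len(pattern) == len(subject)
}

// matchActor mirrors the two-branch structure: dotted input uses token
// matching, undotted input treats a trailing "*" or ">" as a prefix wildcard.
func matchActor(pattern, actorID string) bool {
	if pattern == "" || actorID == "" {
		return false
	}
	if strings.Contains(pattern, ".") || strings.Contains(actorID, ".") {
		return matchTokens(strings.Split(pattern, "."), strings.Split(actorID, "."))
	}
	if strings.HasSuffix(pattern, "*") || strings.HasSuffix(pattern, ">") {
		return strings.HasPrefix(actorID, pattern[:len(pattern)-1])
	}
	return pattern == actorID
}

func main() {
	cases := []struct{ pattern, id string }{
		{"order-*", "order-123"},    // prefix branch
		{"order-*", "order-1.2"},    // dotted ID: "order-*" is compared as a literal token
		{"order.*", "order.us.123"}, // "*" covers one token only
		{"order.>", "order.us.123"}, // ">" covers the remaining tokens
	}
	for _, c := range cases {
		fmt.Printf("%-8s %-13s %v\n", c.pattern, c.id, matchActor(c.pattern, c.id))
	}
}
```

If actor IDs may contain dots, callers would probably want to use dotted patterns such as `order.*` consistently instead of relying on the prefix shortcut.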
|
||||
242
pattern_test.go
Normal file
@@ -0,0 +1,242 @@
|
||||
package aether
|
||||
|
||||
import "testing"
|
||||
|
||||
func TestMatchNamespacePattern(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
pattern string
|
||||
namespace string
|
||||
expected bool
|
||||
}{
|
||||
// Exact matches
|
||||
{"exact match", "tenant-a", "tenant-a", true},
|
||||
{"exact mismatch", "tenant-a", "tenant-b", false},
|
||||
{"exact match with dots", "prod.tenant.a", "prod.tenant.a", true},
|
||||
{"exact mismatch with dots", "prod.tenant.a", "prod.tenant.b", false},
|
||||
|
||||
// Empty cases
|
||||
{"empty pattern", "", "tenant-a", false},
|
||||
{"empty namespace exact", "tenant-a", "", false},
|
||||
{"empty namespace catch-all", ">", "", false},
|
||||
{"both empty", "", "", false},
|
||||
|
||||
// Single wildcard (*) - matches one token (NATS semantics: tokens are dot-separated)
|
||||
{"star matches any single token", "*", "tenant-a", true},
|
||||
{"star matches any single token 2", "*", "anything", true},
|
||||
{"star does not match multi-token", "*", "prod.tenant", false},
|
||||
{"prefix with star", "prod.*", "prod.tenant", true},
|
||||
{"prefix with star 2", "prod.*", "prod.orders", true},
|
||||
{"prefix with star no match extra tokens", "prod.*", "prod.tenant.orders", false},
|
||||
{"prefix with star no match wrong prefix", "prod.*", "staging.tenant", false},
|
||||
{"middle wildcard", "prod.*.orders", "prod.tenant.orders", true},
|
||||
{"middle wildcard no match", "prod.*.orders", "prod.tenant.events", false},
|
||||
{"multiple stars", "*.tenant.*", "prod.tenant.orders", true},
|
||||
{"multiple stars 2", "*.*.orders", "prod.tenant.orders", true},
|
||||
{"multiple stars no match", "*.*.orders", "prod.orders", false},
|
||||
|
||||
// Multi-token wildcard (>) - matches one or more tokens
|
||||
{"greater matches one", ">", "tenant", true},
|
||||
{"greater matches multi", ">", "prod.tenant.orders", true},
|
||||
{"prefix greater", "prod.>", "prod.tenant", true},
|
||||
{"prefix greater multi", "prod.>", "prod.tenant.orders.items", true},
|
||||
{"prefix greater no match different prefix", "prod.>", "staging.tenant", false},
|
||||
{"prefix greater requires at least one", "prod.>", "prod", false},
|
||||
{"deep prefix greater", "prod.tenant.>", "prod.tenant.orders", true},
|
||||
|
||||
// Combined wildcards
|
||||
{"star then greater", "*.>", "prod.tenant", true},
|
||||
{"star then greater multi", "*.>", "prod.tenant.orders", true},
|
||||
{"star then greater no match single", "*.>", "prod", false},
|
||||
|
||||
// Edge cases
|
||||
{"trailing dot in pattern", "tenant.", "tenant.", true},
|
||||
{"just dots", "..", "..", true},
|
||||
{"star at end", "prod.tenant.*", "prod.tenant.a", true},
|
||||
{"star at end no match", "prod.tenant.*", "prod.other.a", false},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
result := MatchNamespacePattern(tt.pattern, tt.namespace)
|
||||
if result != tt.expected {
|
||||
t.Errorf("MatchNamespacePattern(%q, %q) = %v, want %v",
|
||||
tt.pattern, tt.namespace, result, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestIsWildcardPattern(t *testing.T) {
|
||||
tests := []struct {
|
||||
pattern string
|
||||
expected bool
|
||||
}{
|
||||
{"tenant-a", false},
|
||||
{"prod.tenant.orders", false},
|
||||
{"*", true},
|
||||
{"prod.*", true},
|
||||
{"*.orders", true},
|
||||
{">", true},
|
||||
{"prod.>", true},
|
||||
{"*.>", true},
|
||||
{"prod.*.orders", true},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.pattern, func(t *testing.T) {
|
||||
result := IsWildcardPattern(tt.pattern)
|
||||
if result != tt.expected {
|
||||
t.Errorf("IsWildcardPattern(%q) = %v, want %v",
|
||||
tt.pattern, result, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func BenchmarkMatchNamespacePattern(b *testing.B) {
|
||||
benchmarks := []struct {
|
||||
name string
|
||||
pattern string
|
||||
namespace string
|
||||
}{
|
||||
{"exact", "tenant-a", "tenant-a"},
|
||||
{"star", "*", "tenant-a"},
|
||||
{"prefix_star", "prod.*", "prod.tenant"},
|
||||
{"greater", ">", "prod.tenant.orders"},
|
||||
{"complex", "prod.*.>", "prod.tenant.orders.items"},
|
||||
}
|
||||
|
||||
for _, bm := range benchmarks {
|
||||
b.Run(bm.name, func(b *testing.B) {
|
||||
for i := 0; i < b.N; i++ {
|
||||
MatchNamespacePattern(bm.pattern, bm.namespace)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestMatchActorPattern(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
pattern string
|
||||
actorID string
|
||||
expected bool
|
||||
}{
|
||||
// Empty cases
|
||||
{"empty pattern", "", "actor-123", false},
|
||||
{"empty actorID", "actor-*", "", false},
|
||||
{"both empty", "", "", false},
|
||||
|
||||
// Exact matches (no dots)
|
||||
{"exact match", "actor-123", "actor-123", true},
|
||||
{"exact mismatch", "actor-123", "actor-456", false},
|
||||
|
||||
// Suffix wildcard with * (simple, no dots)
|
||||
{"prefix with star", "order-*", "order-123", true},
|
||||
{"prefix with star 2", "order-*", "order-456-xyz", true},
|
||||
{"prefix with star mismatch", "order-*", "user-123", false},
|
||||
{"star alone", "*", "anything", true},
|
||||
|
||||
// Suffix wildcard with > (simple, no dots)
|
||||
{"prefix with greater", "order->", "order-123", true},
|
||||
{"greater alone", ">", "anything", true},
|
||||
|
||||
// Dot-separated actor IDs (uses MatchNamespacePattern)
|
||||
{"dotted exact match", "order.us.123", "order.us.123", true},
|
||||
{"dotted exact mismatch", "order.us.123", "order.eu.123", false},
|
||||
{"dotted star", "order.*", "order.123", true},
|
||||
{"dotted star deep", "order.*.*", "order.us.123", true},
|
||||
{"dotted greater", "order.>", "order.us.123.456", true},
|
||||
{"dotted star mismatch depth", "order.*", "order.us.123", false},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
result := MatchActorPattern(tt.pattern, tt.actorID)
|
||||
if result != tt.expected {
|
||||
t.Errorf("MatchActorPattern(%q, %q) = %v, want %v",
|
||||
tt.pattern, tt.actorID, result, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestSubscriptionFilter_IsEmpty(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
filter *SubscriptionFilter
|
||||
expected bool
|
||||
}{
|
||||
{"nil fields", &SubscriptionFilter{}, true},
|
||||
{"empty slice", &SubscriptionFilter{EventTypes: []string{}}, true},
|
||||
{"has event types", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}, false},
|
||||
{"has actor pattern", &SubscriptionFilter{ActorPattern: "order-*"}, false},
|
||||
{"has both", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"}, false},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
result := tt.filter.IsEmpty()
|
||||
if result != tt.expected {
|
||||
t.Errorf("SubscriptionFilter.IsEmpty() = %v, want %v", result, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestSubscriptionFilter_Matches(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
filter *SubscriptionFilter
|
||||
event *Event
|
||||
expected bool
|
||||
}{
|
||||
// Nil event
|
||||
{"nil event", &SubscriptionFilter{}, nil, false},
|
||||
|
||||
// Empty filter matches all
|
||||
{"empty filter", &SubscriptionFilter{}, &Event{EventType: "Test", ActorID: "actor-1"}, true},
|
||||
|
||||
// Event type filtering
|
||||
{"event type match", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}},
|
||||
&Event{EventType: "OrderPlaced", ActorID: "order-1"}, true},
|
||||
{"event type mismatch", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}},
|
||||
&Event{EventType: "OrderShipped", ActorID: "order-1"}, false},
|
||||
{"event type multiple match first", &SubscriptionFilter{EventTypes: []string{"OrderPlaced", "OrderShipped"}},
|
||||
&Event{EventType: "OrderPlaced", ActorID: "order-1"}, true},
|
||||
{"event type multiple match second", &SubscriptionFilter{EventTypes: []string{"OrderPlaced", "OrderShipped"}},
|
||||
&Event{EventType: "OrderShipped", ActorID: "order-1"}, true},
|
||||
{"event type multiple no match", &SubscriptionFilter{EventTypes: []string{"OrderPlaced", "OrderShipped"}},
|
||||
&Event{EventType: "OrderCancelled", ActorID: "order-1"}, false},
|
||||
|
||||
// Actor pattern filtering
|
||||
{"actor pattern exact match", &SubscriptionFilter{ActorPattern: "order-123"},
|
||||
&Event{EventType: "Test", ActorID: "order-123"}, true},
|
||||
{"actor pattern exact mismatch", &SubscriptionFilter{ActorPattern: "order-123"},
|
||||
&Event{EventType: "Test", ActorID: "order-456"}, false},
|
||||
{"actor pattern wildcard match", &SubscriptionFilter{ActorPattern: "order-*"},
|
||||
&Event{EventType: "Test", ActorID: "order-123"}, true},
|
||||
{"actor pattern wildcard mismatch", &SubscriptionFilter{ActorPattern: "order-*"},
|
||||
&Event{EventType: "Test", ActorID: "user-123"}, false},
|
||||
|
||||
// Combined filters (AND logic)
|
||||
{"combined both match", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
|
||||
&Event{EventType: "OrderPlaced", ActorID: "order-123"}, true},
|
||||
{"combined event matches actor does not", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
|
||||
&Event{EventType: "OrderPlaced", ActorID: "user-123"}, false},
|
||||
{"combined actor matches event does not", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
|
||||
&Event{EventType: "OrderShipped", ActorID: "order-123"}, false},
|
||||
{"combined neither matches", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
|
||||
&Event{EventType: "OrderShipped", ActorID: "user-123"}, false},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
result := tt.filter.Matches(tt.event)
|
||||
if result != tt.expected {
|
||||
t.Errorf("SubscriptionFilter.Matches() = %v, want %v", result, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
6
renovate.json
Normal file
@@ -0,0 +1,6 @@
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": [
    "config:recommended"
  ]
}
46
store/config_test.go
Normal file
@@ -0,0 +1,46 @@
|
||||
package store
|
||||
|
||||
import (
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func TestDefaultJetStreamConfig(t *testing.T) {
|
||||
config := DefaultJetStreamConfig()
|
||||
|
||||
if config.StreamRetention != DefaultStreamRetention {
|
||||
t.Errorf("expected StreamRetention=%v, got %v", DefaultStreamRetention, config.StreamRetention)
|
||||
}
|
||||
if config.ReplicaCount != DefaultReplicaCount {
|
||||
t.Errorf("expected ReplicaCount=%d, got %d", DefaultReplicaCount, config.ReplicaCount)
|
||||
}
|
||||
}
|
||||
|
||||
func TestJetStreamConfigDefaults(t *testing.T) {
|
||||
t.Run("default stream retention is 1 year", func(t *testing.T) {
|
||||
expected := 365 * 24 * time.Hour
|
||||
if DefaultStreamRetention != expected {
|
||||
t.Errorf("expected DefaultStreamRetention=%v, got %v", expected, DefaultStreamRetention)
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("default replica count is 1", func(t *testing.T) {
|
||||
if DefaultReplicaCount != 1 {
|
||||
t.Errorf("expected DefaultReplicaCount=1, got %d", DefaultReplicaCount)
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
func TestJetStreamConfigCustomValues(t *testing.T) {
|
||||
config := JetStreamConfig{
|
||||
StreamRetention: 30 * 24 * time.Hour, // 30 days
|
||||
ReplicaCount: 3,
|
||||
}
|
||||
|
||||
if config.StreamRetention != 30*24*time.Hour {
|
||||
t.Errorf("expected StreamRetention=30 days, got %v", config.StreamRetention)
|
||||
}
|
||||
if config.ReplicaCount != 3 {
|
||||
t.Errorf("expected ReplicaCount=3, got %d", config.ReplicaCount)
|
||||
}
|
||||
}
|
||||
215
store/immutability_test.go
Normal file
@@ -0,0 +1,215 @@
|
||||
package store
|
||||
|
||||
import (
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"git.flowmade.one/flowmade-one/aether"
|
||||
)
|
||||
|
||||
// TestEventImmutability_MemoryStore verifies that an event read back from the in-memory
// event store carries exactly the values it was saved with, reflecting the append-only
// nature of event sourcing.
|
||||
func TestEventImmutability_MemoryStore(t *testing.T) {
|
||||
store := NewInMemoryEventStore()
|
||||
actorID := "test-actor-123"
|
||||
|
||||
// Create and save an event
|
||||
originalEvent := &aether.Event{
|
||||
ID: "evt-immutable-1",
|
||||
EventType: "TestEvent",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{
|
||||
"value": "original",
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err := store.SaveEvent(originalEvent)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent failed: %v", err)
|
||||
}
|
||||
|
||||
// Retrieve the event from the store
|
||||
events, err := store.GetEvents(actorID, 0)
|
||||
if err != nil {
|
||||
t.Fatalf("GetEvents failed: %v", err)
|
||||
}
|
||||
|
||||
	if len(events) != 1 {
		t.Fatalf("expected 1 event, got %d", len(events))
|
||||
}
|
||||
|
||||
retrievedEvent := events[0]
|
||||
|
||||
// Verify the stored event has the correct values
|
||||
if retrievedEvent.Data["value"] != "original" {
|
||||
t.Errorf("Data value mismatch: got %v, want %v", retrievedEvent.Data["value"], "original")
|
||||
}
|
||||
|
||||
if retrievedEvent.EventType != "TestEvent" {
|
||||
t.Errorf("EventType mismatch: got %q, want %q", retrievedEvent.EventType, "TestEvent")
|
||||
}
|
||||
|
||||
// Verify ID is correct
|
||||
if retrievedEvent.ID != "evt-immutable-1" {
|
||||
t.Errorf("Event ID mismatch: got %q, want %q", retrievedEvent.ID, "evt-immutable-1")
|
||||
}
|
||||
}
|
||||
|
||||
// TestEventImmutability_NoUpdateMethod verifies that the EventStore interface
// exposes only append and read methods, with no Update or Delete methods.
|
||||
func TestEventImmutability_NoUpdateMethod(t *testing.T) {
|
||||
// This test documents that the EventStore interface is append-only.
|
||||
// The interface intentionally provides:
|
||||
// - SaveEvent: append only
|
||||
// - GetEvents: read only
|
||||
// - GetLatestVersion: read only
|
||||
//
|
||||
// To verify this, we demonstrate that any attempt to call non-existent
|
||||
// update/delete methods would be caught at compile time (not runtime).
|
||||
// This is enforced by the interface definition in event.go which does
|
||||
// not include Update, Delete, or Modify methods.
|
||||
|
||||
store := NewInMemoryEventStore()
|
||||
|
||||
// Compile-time check: these would not compile if we tried them:
|
||||
// store.Update(event) // compile error: no such method
|
||||
// store.Delete(eventID) // compile error: no such method
|
||||
// store.Modify(eventID, newData) // compile error: no such method
|
||||
|
||||
// Only these methods exist:
|
||||
var eventStore aether.EventStore = store
|
||||
if eventStore == nil {
|
||||
t.Fatal("eventStore is nil")
|
||||
}
|
||||
// If we got here, the compile-time checks passed
|
||||
t.Log("EventStore interface enforces append-only semantics by design")
|
||||
}
|
||||
|
||||
// TestEventImmutability_VersionOnlyGoesUp verifies that versions are monotonically
|
||||
// increasing and attempting to save with a non-increasing version fails.
|
||||
func TestEventImmutability_VersionOnlyGoesUp(t *testing.T) {
|
||||
store := NewInMemoryEventStore()
|
||||
actorID := "actor-version-check"
|
||||
|
||||
// Save first event with version 1
|
||||
event1 := &aether.Event{
|
||||
ID: "evt-v1",
|
||||
EventType: "Event1",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err := store.SaveEvent(event1)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent(v1) failed: %v", err)
|
||||
}
|
||||
|
||||
// Try to save with same version - should fail
|
||||
event2Same := &aether.Event{
|
||||
ID: "evt-v1-again",
|
||||
EventType: "Event2",
|
||||
ActorID: actorID,
|
||||
Version: 1, // Same version
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err = store.SaveEvent(event2Same)
|
||||
if err == nil {
|
||||
t.Error("expected SaveEvent(same version) to fail, but it succeeded")
|
||||
}
|
||||
|
||||
// Try to save with lower version - should fail
|
||||
event3Lower := &aether.Event{
|
||||
ID: "evt-v0",
|
||||
EventType: "Event3",
|
||||
ActorID: actorID,
|
||||
Version: 0, // Lower version
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err = store.SaveEvent(event3Lower)
|
||||
if err == nil {
|
||||
t.Error("expected SaveEvent(lower version) to fail, but it succeeded")
|
||||
}
|
||||
|
||||
// Save with next version - should succeed
|
||||
event4Next := &aether.Event{
|
||||
ID: "evt-v2",
|
||||
EventType: "Event4",
|
||||
ActorID: actorID,
|
||||
Version: 2,
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err = store.SaveEvent(event4Next)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent(v2) failed: %v", err)
|
||||
}
|
||||
|
||||
// Verify we have exactly 2 events
|
||||
events, err := store.GetEvents(actorID, 0)
|
||||
if err != nil {
|
||||
t.Fatalf("GetEvents failed: %v", err)
|
||||
}
|
||||
|
||||
if len(events) != 2 {
|
||||
t.Errorf("expected 2 events, got %d", len(events))
|
||||
}
|
||||
}
|
||||
|
||||
// TestEventImmutability_EventCannotBeDeleted verifies that there is no way to delete
|
||||
// events from the store through the EventStore interface.
|
||||
func TestEventImmutability_EventCannotBeDeleted(t *testing.T) {
|
||||
store := NewInMemoryEventStore()
|
||||
actorID := "actor-nodelete"
|
||||
|
||||
// Save an event
|
||||
event := &aether.Event{
|
||||
ID: "evt-nodelete",
|
||||
EventType: "ImportantEvent",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{"critical": true},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err := store.SaveEvent(event)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent failed: %v", err)
|
||||
}
|
||||
|
||||
// Retrieve it
|
||||
events1, err := store.GetEvents(actorID, 0)
|
||||
if err != nil {
|
||||
t.Fatalf("GetEvents (1) failed: %v", err)
|
||||
}
|
||||
|
||||
if len(events1) != 1 {
|
||||
t.Fatal("expected 1 event after save")
|
||||
}
|
||||
|
||||
// Try to delete through interface - this method doesn't exist
|
||||
// store.Delete("evt-nodelete") // compile error: no such method
|
||||
// store.DeleteByActorID(actorID) // compile error: no such method
|
||||
|
||||
// Verify the event is still there (we can't delete it)
|
||||
events2, err := store.GetEvents(actorID, 0)
|
||||
if err != nil {
|
||||
t.Fatalf("GetEvents (2) failed: %v", err)
|
||||
}
|
||||
|
||||
if len(events2) != 1 {
|
||||
t.Errorf("expected 1 event (should not be deletable), got %d", len(events2))
|
||||
}
|
||||
|
||||
if events2[0].ID != "evt-nodelete" {
|
||||
t.Errorf("event ID changed: got %q, want %q", events2[0].ID, "evt-nodelete")
|
||||
}
|
||||
}
|
||||
431
store/integration_test.go
Normal file
@@ -0,0 +1,431 @@
|
||||
//go:build integration
|
||||
|
||||
package store
|
||||
|
||||
import (
|
||||
"context"
|
||||
"log"
|
||||
"os"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"git.flowmade.one/flowmade-one/aether"
|
||||
"github.com/nats-io/nats.go"
|
||||
"github.com/nats-io/nats-server/v2/server"
|
||||
)
|
||||
|
||||
func setupNatsServer() (*server.Server, *nats.Conn, func()) {
|
||||
opts := &server.Options{
|
||||
Port: -1,
|
||||
JetStream: true,
|
||||
StoreDir: "/tmp/nats-test-" + time.Now().Format("20060102150405"),
|
||||
}
|
||||
|
||||
s, err := server.NewServer(opts)
|
||||
if err != nil {
|
||||
log.Fatal("Failed to create NATS server:", err)
|
||||
}
|
||||
|
||||
go s.Start()
|
||||
if !s.ReadyForConnections(4 * time.Second) {
|
||||
log.Fatal("NATS server failed to start")
|
||||
}
|
||||
|
||||
nc, err := nats.Connect(s.ClientURL())
|
||||
if err != nil {
|
||||
s.Shutdown()
|
||||
log.Fatal("Failed to connect to NATS:", err)
|
||||
}
|
||||
|
||||
return s, nc, func() {
|
||||
nc.Close()
|
||||
s.Shutdown()
|
||||
os.RemoveAll(opts.StoreDir)
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateVersionCache(t *testing.T) {
|
||||
	_, nc, cleanup := setupNatsServer()
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, "test_update_cache")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store: %v", err)
|
||||
}
|
||||
defer store.Close(ctx)
|
||||
|
||||
actorID := "test-actor-1"
|
||||
|
||||
tests := []struct {
|
||||
name string
|
||||
cachedVersion int64
|
||||
newVersion int64
|
||||
expectUpdate bool
|
||||
expectVersion int64
|
||||
}{
|
||||
{
|
||||
name: "update when new version is greater",
|
||||
cachedVersion: 5,
|
||||
newVersion: 10,
|
||||
expectUpdate: true,
|
||||
expectVersion: 10,
|
||||
},
|
||||
{
|
||||
name: "do not update when new version is equal",
|
||||
cachedVersion: 5,
|
||||
newVersion: 5,
|
||||
expectUpdate: false,
|
||||
expectVersion: 5,
|
||||
},
|
||||
{
|
||||
name: "do not update when new version is less",
|
||||
cachedVersion: 10,
|
||||
newVersion: 5,
|
||||
expectUpdate: false,
|
||||
expectVersion: 10,
|
||||
},
|
||||
{
|
||||
name: "update when no cached version exists",
|
||||
cachedVersion: 0,
|
||||
newVersion: 1,
|
||||
expectUpdate: true,
|
||||
expectVersion: 1,
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
// Set up cached version
|
||||
store.versions = make(map[string]int64)
|
||||
store.versions[actorID] = tt.cachedVersion
|
||||
|
||||
// Call UpdateVersionCache
|
||||
store.UpdateVersionCache(actorID, tt.newVersion)
|
||||
|
||||
// Verify result
|
||||
if tt.expectUpdate {
|
||||
if version, ok := store.versions[actorID]; !ok {
|
||||
t.Error("Expected version to be updated but it wasn't cached")
|
||||
} else if version != tt.expectVersion {
|
||||
t.Errorf("Expected version %d, got %d", tt.expectVersion, version)
|
||||
}
|
||||
} else {
|
||||
if version, ok := store.versions[actorID]; !ok {
|
||||
t.Error("Expected version to remain cached")
|
||||
} else if version != tt.expectVersion {
|
||||
t.Errorf("Expected version to remain %d, got %d", tt.expectVersion, version)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateVersionCache_Concurrent(t *testing.T) {
|
||||
_, nc, cleanup := setupNatsServer()
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, "test_update_cache_concurrent")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store: %v", err)
|
||||
}
|
||||
defer store.Close(ctx)
|
||||
|
||||
actorID := "concurrent-actor"
|
||||
store.versions[actorID] = 1
|
||||
|
||||
const numGoroutines = 50
|
||||
const maxVersion = 100
|
||||
|
||||
// Buffered so goroutines never block when signalling completion.
done := make(chan struct{}, numGoroutines)

for i := 0; i < numGoroutines; i++ {
	version := int64(1 + (i % maxVersion))
	go func(v int64) {
		store.UpdateVersionCache(actorID, v)
		done <- struct{}{}
	}(version)
}

// Wait for every goroutine to finish. The channel receives also order the
// final read after all cache updates, so no sleep is needed.
for i := 0; i < numGoroutines; i++ {
	<-done
}

finalVersion := store.versions[actorID]
|
||||
if finalVersion > maxVersion {
|
||||
t.Errorf("Expected version to be at most %d, got %d", maxVersion, finalVersion)
|
||||
}
|
||||
}
|
||||
|
||||
func TestSubscribeToEventStored(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping integration test")
|
||||
}
|
||||
|
||||
_, nc, cleanup := setupNatsServer()
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, "test_subscribe_event_stored")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store: %v", err)
|
||||
}
|
||||
defer store.Close(ctx)
|
||||
|
||||
eventBusWithStore := NewNATSEventBusWithBroadcaster(nc, store, "")
|
||||
if eventBusWithStore == nil {
|
||||
t.Fatalf("Failed to create event bus with broadcaster")
|
||||
}
|
||||
defer eventBusWithStore.Stop()
|
||||
|
||||
ch := eventBusWithStore.SubscribeToEventStored("*")
|
||||
if ch == nil {
|
||||
t.Fatal("SubscribeToEventStored returned nil channel")
|
||||
}
|
||||
|
||||
actorID := "subscribe-test-actor"
|
||||
event := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "TestEvent",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{"key": "value"},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
eventBusWithStore.Publish("", event)
|
||||
|
||||
select {
|
||||
case receivedEvent := <-ch:
|
||||
if receivedEvent.EventType != aether.EventTypeEventStored {
|
||||
t.Errorf("Expected EventTypeEventStored, got %s", receivedEvent.EventType)
|
||||
}
|
||||
if receivedEvent.ActorID != actorID {
|
||||
t.Errorf("Expected actorID %s, got %s", actorID, receivedEvent.ActorID)
|
||||
}
|
||||
data, ok := receivedEvent.Data["actorId"].(string)
|
||||
if !ok || data != actorID {
|
||||
t.Errorf("Expected actorId in data to be %s", actorID)
|
||||
}
|
||||
case <-time.After(2 * time.Second):
|
||||
t.Fatal("Timeout waiting for EventStored event")
|
||||
}
|
||||
}
|
||||
|
||||
func TestCrossNodeBroadcasting_SingleNode(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping integration test")
|
||||
}
|
||||
|
||||
_, nc, cleanup := setupNatsServer()
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, "test_single_node_broadcast")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store: %v", err)
|
||||
}
|
||||
defer store.Close(ctx)
|
||||
|
||||
eventBus := NewNATSEventBusWithBroadcaster(nc, store, "")
|
||||
defer eventBus.Stop()
|
||||
|
||||
actorID := "broadcast-test-actor-1"
|
||||
localCh := eventBus.Subscribe("")
|
||||
|
||||
event := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{"total": 99.99},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
eventBus.Publish("", event)
|
||||
|
||||
select {
|
||||
case receivedEvent := <-localCh:
|
||||
if receivedEvent.EventType != "OrderPlaced" {
|
||||
t.Errorf("Expected OrderPlaced, got %s", receivedEvent.EventType)
|
||||
}
|
||||
if receivedEvent.ActorID != actorID {
|
||||
t.Errorf("Expected actorID %s, got %s", actorID, receivedEvent.ActorID)
|
||||
}
|
||||
case <-time.After(2 * time.Second):
|
||||
t.Fatal("Timeout waiting for broadcast event")
|
||||
}
|
||||
}
|
||||
|
||||
func TestCrossNodeBroadcasting_MultiNode(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping integration test")
|
||||
}
|
||||
|
||||
_, nc1, cleanup1 := setupNatsServer()
|
||||
defer cleanup1()
|
||||
|
||||
_, nc2, cleanup2 := setupNatsServer()
|
||||
defer cleanup2()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store1, err := NewJetStreamEventStore(nc1, "test_multi_node_1")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store 1: %v", err)
|
||||
}
defer store1.Close(ctx)
|
||||
|
||||
store2, err := NewJetStreamEventStore(nc2, "test_multi_node_2")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store 2: %v", err)
|
||||
}
defer store2.Close(ctx)
|
||||
|
||||
eventBus1 := NewNATSEventBusWithBroadcaster(nc1, store1, "")
|
||||
eventBus2 := NewNATSEventBusWithBroadcaster(nc2, store2, "")
|
||||
defer eventBus1.Stop()
|
||||
defer eventBus2.Stop()
|
||||
|
||||
actorID := "multi-node-actor"
|
||||
receiverCh := eventBus2.Subscribe("")
|
||||
|
||||
event := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "InventoryReserved",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{"quantity": 5},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
eventBus1.Publish("", event)
|
||||
|
||||
select {
|
||||
case receivedEvent := <-receiverCh:
|
||||
if receivedEvent.EventType != "InventoryReserved" {
|
||||
t.Errorf("Expected InventoryReserved, got %s", receivedEvent.EventType)
|
||||
}
|
||||
if receivedEvent.ActorID != actorID {
|
||||
t.Errorf("Expected actorID %s, got %s", actorID, receivedEvent.ActorID)
|
||||
}
|
||||
case <-time.After(3 * time.Second):
|
||||
t.Fatal("Timeout waiting for cross-node event")
|
||||
}
|
||||
}
|
||||
|
||||
func TestCrossNodeBroadcasting_NamespaceIsolation(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping integration test")
|
||||
}
|
||||
|
||||
_, nc, cleanup := setupNatsServer()
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
tenantAStore, err := NewJetStreamEventStoreWithNamespace(nc, "events", "tenant-a")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create tenant A store: %v", err)
|
||||
}
defer tenantAStore.Close(ctx)
|
||||
|
||||
tenantBStore, err := NewJetStreamEventStoreWithNamespace(nc, "events", "tenant-b")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create tenant B store: %v", err)
|
||||
}
defer tenantBStore.Close(ctx)
|
||||
|
||||
tenantAEventBus := NewNATSEventBusWithBroadcaster(nc, tenantAStore, "tenant-a")
|
||||
tenantBEventBus := NewNATSEventBusWithBroadcaster(nc, tenantBStore, "tenant-b")
|
||||
defer tenantAEventBus.Stop()
|
||||
defer tenantBEventBus.Stop()
|
||||
|
||||
tenantACh := tenantAEventBus.Subscribe("tenant-a")
|
||||
tenantBCh := tenantBEventBus.Subscribe("tenant-b")
|
||||
|
||||
actorID := "tenant-actor"
|
||||
event := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "TenantEvent",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{"data": "tenant-a"},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
tenantAEventBus.Publish("tenant-a", event)
|
||||
|
||||
select {
|
||||
case receivedEvent := <-tenantACh:
|
||||
if receivedEvent.EventType != "TenantEvent" {
|
||||
t.Errorf("Expected TenantEvent in tenant A, got %s", receivedEvent.EventType)
|
||||
}
|
||||
case <-time.After(2 * time.Second):
|
||||
t.Error("Timeout waiting for tenant A to receive event")
|
||||
}
|
||||
|
||||
select {
|
||||
case <-tenantBCh:
|
||||
t.Error("Tenant B should not receive tenant A's events")
|
||||
case <-time.After(1 * time.Second):
|
||||
// Expected - tenant B should not receive events from tenant A
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateVersionCache_EventStored(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping integration test")
|
||||
}
|
||||
|
||||
_, nc, cleanup := setupNatsServer()
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, "test_version_cache_eventstored")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to create store: %v", err)
|
||||
}
defer store.Close(ctx)
|
||||
|
||||
eventBus := NewNATSEventBusWithBroadcaster(nc, store, "")
|
||||
defer eventBus.Stop()
|
||||
|
||||
actorID := "version-cache-actor"
|
||||
store.UpdateVersionCache(actorID, 5)
|
||||
|
||||
event := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: "TestEvent",
|
||||
ActorID: actorID,
|
||||
Version: 10,
|
||||
Data: map[string]interface{}{"test": true},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
eventBus.Publish("", event)
|
||||
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
|
||||
storedVersion, err := store.GetLatestVersion(actorID)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to get latest version: %v", err)
|
||||
}
|
||||
|
||||
if storedVersion != 10 {
|
||||
t.Errorf("Expected version 10, got %d", storedVersion)
|
||||
}
|
||||
|
||||
cacheVersion, ok := store.GetCachedVersion(actorID)
|
||||
if !ok {
|
||||
t.Error("Expected version to be in cache")
|
||||
} else if cacheVersion != 10 {
|
||||
t.Errorf("Expected cached version 10, got %d", cacheVersion)
|
||||
}
|
||||
}
|
||||
@@ -1,6 +1,7 @@
|
||||
package store
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"strings"
|
||||
@@ -9,31 +10,134 @@ import (
|
||||
|
||||
"git.flowmade.one/flowmade-one/aether"
|
||||
"github.com/nats-io/nats.go"
|
||||
"github.com/google/uuid"
|
||||
)
|
||||
|
||||
// JetStreamEventStore implements EventStore using NATS JetStream for persistence
|
||||
// Default configuration values for JetStream event store
|
||||
const (
|
||||
DefaultStreamRetention = 365 * 24 * time.Hour // 1 year
|
||||
DefaultReplicaCount = 1
|
||||
)
|
||||
|
||||
// JetStreamConfig holds configuration options for JetStreamEventStore
|
||||
type JetStreamConfig struct {
|
||||
// StreamRetention is how long to keep events (default: 1 year).
|
||||
// JetStream enforces this retention policy at the storage level using a limits-based policy:
|
||||
// - MaxAge: Events older than this duration are automatically deleted
|
||||
// - Storage is file-based (nats.FileStorage) for durability
|
||||
// - Once the retention period expires, events are permanently removed from the stream
|
||||
// This ensures that old events do not consume storage indefinitely.
|
||||
// To keep events indefinitely, set StreamRetention to a very large value or configure
|
||||
// a custom retention policy in the JetStream stream configuration.
|
||||
StreamRetention time.Duration
|
||||
// ReplicaCount is the number of replicas for high availability (default: 1)
|
||||
ReplicaCount int
|
||||
// Namespace is an optional prefix for stream names to provide storage isolation.
|
||||
// When set, the actual stream name becomes "{namespace}_{streamName}".
|
||||
// Events in namespaced stores are completely isolated from other namespaces.
|
||||
// Leave empty for backward-compatible non-namespaced behavior.
|
||||
Namespace string
|
||||
}
|
||||
|
||||
// DefaultJetStreamConfig returns the default configuration
|
||||
func DefaultJetStreamConfig() JetStreamConfig {
|
||||
return JetStreamConfig{
|
||||
StreamRetention: DefaultStreamRetention,
|
||||
ReplicaCount: DefaultReplicaCount,
|
||||
}
|
||||
}
|
||||
|
||||
// JetStreamEventStore implements EventStore using NATS JetStream for persistence.
|
||||
// It also implements EventStoreWithErrors to report malformed events during replay.
|
||||
//
|
||||
// ## Immutability Guarantee
|
||||
//
|
||||
// JetStreamEventStore is append-only. Events are stored in a JetStream stream that
|
||||
// is configured with file-based storage (nats.FileStorage) and a retention policy
|
||||
// (nats.LimitsPolicy). The configured MaxAge retention policy ensures that old events
|
||||
// eventually expire, but during their lifetime, events are never modified or deleted
|
||||
// through the EventStore API. Once an event is published to the stream:
|
||||
// - It cannot be updated
|
||||
// - It cannot be deleted before expiration
|
||||
// - It can only be read
|
||||
//
|
||||
// This architectural guarantee, combined with the EventStore interface providing
|
||||
// no Update or Delete methods, ensures events are immutable and suitable as an
|
||||
// audit trail.
|
||||
//
|
||||
// ## Version Cache Invalidation Strategy
|
||||
//
|
||||
// JetStreamEventStore maintains an in-memory cache of actor versions for optimistic
|
||||
// concurrency control. The cache is invalidated on any miss (GetLatestVersion call
|
||||
// that finds a newer version in JetStream) to ensure consistency even when external
|
||||
// processes write to the same JetStream stream.
|
||||
//
|
||||
// If only Aether owns the stream (single-writer assumption), the cache provides
|
||||
// excellent performance for repeated version checks. If external writers modify
|
||||
// the stream, the cache will remain consistent because:
|
||||
//
|
||||
// 1. On SaveEvent: getLatestVersionLocked() checks JetStream on cache miss
|
||||
// 2. On GetLatestVersion: If actual version > cached version, cache is invalidated
|
||||
// 3. Subsequent checks for that actor will fetch fresh data from JetStream
|
||||
//
|
||||
// This strategy prevents data corruption from stale cache while maintaining
|
||||
// performance for the single-writer case.
|
||||
type JetStreamEventStore struct {
|
||||
js nats.JetStreamContext
|
||||
streamName string
|
||||
mu sync.Mutex // Protects version checks during SaveEvent
|
||||
versions map[string]int64 // actorID -> latest version cache
|
||||
config JetStreamConfig
|
||||
mu sync.Mutex // Protects version checks during SaveEvent
|
||||
versions map[string]int64 // actorID -> latest version cache
|
||||
broadcaster aether.EventBroadcaster // Optional broadcaster for EventStored events
|
||||
namespace string // Optional namespace for event publishing
|
||||
}
|
||||
|
||||
// NewJetStreamEventStore creates a new JetStream-based event store
|
||||
|
||||
|
||||
|
||||
// NewJetStreamEventStore creates a new JetStream-based event store with default configuration
|
||||
func NewJetStreamEventStore(natsConn *nats.Conn, streamName string) (*JetStreamEventStore, error) {
|
||||
return NewJetStreamEventStoreWithConfig(natsConn, streamName, DefaultJetStreamConfig())
|
||||
}
|
||||
|
||||
// NewJetStreamEventStoreWithNamespace creates a new JetStream-based event store with namespace isolation.
|
||||
// The namespace is prefixed to the stream name to ensure complete isolation at the storage level.
|
||||
// This is a convenience function; the same can be achieved by setting Namespace in JetStreamConfig.
|
||||
func NewJetStreamEventStoreWithNamespace(natsConn *nats.Conn, streamName string, namespace string) (*JetStreamEventStore, error) {
|
||||
config := DefaultJetStreamConfig()
|
||||
config.Namespace = namespace
|
||||
return NewJetStreamEventStoreWithConfig(natsConn, streamName, config)
|
||||
}
|
||||
|
||||
// NewJetStreamEventStoreWithConfig creates a new JetStream-based event store with custom configuration
|
||||
func NewJetStreamEventStoreWithConfig(natsConn *nats.Conn, streamName string, config JetStreamConfig) (*JetStreamEventStore, error) {
|
||||
js, err := natsConn.JetStream()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to get JetStream context: %w", err)
|
||||
}
|
||||
|
||||
// Apply defaults for zero values
|
||||
if config.StreamRetention == 0 {
|
||||
config.StreamRetention = DefaultStreamRetention
|
||||
}
|
||||
if config.ReplicaCount == 0 {
|
||||
config.ReplicaCount = DefaultReplicaCount
|
||||
}
|
||||
|
||||
// Apply namespace prefix to stream name if provided
|
||||
effectiveStreamName := streamName
|
||||
if config.Namespace != "" {
|
||||
effectiveStreamName = fmt.Sprintf("%s_%s", sanitizeSubject(config.Namespace), streamName)
|
||||
}
|
||||
|
||||
// Create or update the stream
|
||||
stream := &nats.StreamConfig{
|
||||
Name: streamName,
|
||||
Subjects: []string{fmt.Sprintf("%s.events.>", streamName), fmt.Sprintf("%s.snapshots.>", streamName)},
|
||||
Name: effectiveStreamName,
|
||||
Subjects: []string{fmt.Sprintf("%s.events.>", effectiveStreamName), fmt.Sprintf("%s.snapshots.>", effectiveStreamName)},
|
||||
Storage: nats.FileStorage,
|
||||
Retention: nats.LimitsPolicy,
|
||||
MaxAge: 365 * 24 * time.Hour, // Keep events for 1 year
|
||||
Replicas: 1, // Can be increased for HA
|
||||
MaxAge: config.StreamRetention,
|
||||
Replicas: config.ReplicaCount,
|
||||
}
|
||||
|
||||
_, err = js.AddStream(stream)
|
||||
@@ -43,8 +147,73 @@ func NewJetStreamEventStore(natsConn *nats.Conn, streamName string) (*JetStreamE
|
||||
|
||||
return &JetStreamEventStore{
|
||||
js: js,
|
||||
streamName: streamName,
|
||||
streamName: effectiveStreamName,
|
||||
config: config,
|
||||
versions: make(map[string]int64),
|
||||
broadcaster: nil,
|
||||
namespace: "",
|
||||
}, nil
|
||||
}
|
||||
|
||||
// GetNamespace returns the namespace configured for this store, or empty string if not namespaced.
|
||||
func (jes *JetStreamEventStore) GetNamespace() string {
|
||||
return jes.config.Namespace
|
||||
}
|
||||
|
||||
// GetStreamName returns the effective stream name (including namespace prefix if applicable).
|
||||
func (jes *JetStreamEventStore) GetStreamName() string {
|
||||
return jes.streamName
|
||||
}
|
||||
|
||||
// NewJetStreamEventStoreWithBroadcaster creates a new JetStream-based event store with broadcaster support.
|
||||
// The broadcaster receives EventStored events when events are successfully saved.
|
||||
func NewJetStreamEventStoreWithBroadcaster(natsConn *nats.Conn, streamName string, broadcaster aether.EventBroadcaster, namespace string) (*JetStreamEventStore, error) {
|
||||
config := DefaultJetStreamConfig()
|
||||
if namespace != "" {
|
||||
config.Namespace = namespace
|
||||
}
|
||||
|
||||
js, err := natsConn.JetStream()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to get JetStream context: %w", err)
|
||||
}
|
||||
|
||||
// Apply defaults for zero values
|
||||
if config.StreamRetention == 0 {
|
||||
config.StreamRetention = DefaultStreamRetention
|
||||
}
|
||||
if config.ReplicaCount == 0 {
|
||||
config.ReplicaCount = DefaultReplicaCount
|
||||
}
|
||||
|
||||
// Apply namespace prefix to stream name if provided
|
||||
effectiveStreamName := streamName
|
||||
if config.Namespace != "" {
|
||||
effectiveStreamName = fmt.Sprintf("%s_%s", sanitizeSubject(config.Namespace), streamName)
|
||||
}
|
||||
|
||||
// Create or update the stream
|
||||
stream := &nats.StreamConfig{
|
||||
Name: effectiveStreamName,
|
||||
Subjects: []string{fmt.Sprintf("%s.events.>", effectiveStreamName), fmt.Sprintf("%s.snapshots.>", effectiveStreamName)},
|
||||
Storage: nats.FileStorage,
|
||||
Retention: nats.LimitsPolicy,
|
||||
MaxAge: config.StreamRetention,
|
||||
Replicas: config.ReplicaCount,
|
||||
}
|
||||
|
||||
_, err = js.AddStream(stream)
|
||||
if err != nil && !strings.Contains(err.Error(), "already exists") {
|
||||
return nil, fmt.Errorf("failed to create stream: %w", err)
|
||||
}
|
||||
|
||||
return &JetStreamEventStore{
|
||||
js: js,
|
||||
streamName: effectiveStreamName,
|
||||
config: config,
|
||||
versions: make(map[string]int64),
|
||||
broadcaster: broadcaster,
|
||||
namespace: namespace,
|
||||
}, nil
|
||||
}
|
||||
|
||||
@@ -55,19 +224,36 @@ func (jes *JetStreamEventStore) SaveEvent(event *aether.Event) error {
|
||||
jes.mu.Lock()
|
||||
defer jes.mu.Unlock()
|
||||
|
||||
// Get current latest version for this actor
|
||||
currentVersion, err := jes.getLatestVersionLocked(event.ActorID)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to get latest version: %w", err)
|
||||
}
|
||||
|
||||
// Validate version is strictly greater than current
|
||||
if event.Version <= currentVersion {
|
||||
return &aether.VersionConflictError{
|
||||
ActorID: event.ActorID,
|
||||
AttemptedVersion: event.Version,
|
||||
CurrentVersion: currentVersion,
|
||||
// Check cache first
|
||||
if version, ok := jes.versions[event.ActorID]; ok {
|
||||
// Validate version against cached version
|
||||
if event.Version <= version {
|
||||
return &aether.VersionConflictError{
|
||||
ActorID: event.ActorID,
|
||||
AttemptedVersion: event.Version,
|
||||
CurrentVersion: version,
|
||||
}
|
||||
}
|
||||
// Version check passed, proceed with publish while holding lock
|
||||
} else {
|
||||
// Cache miss - need to check actual stream
|
||||
// Get current latest version while holding lock to prevent TOCTOU race
|
||||
currentVersion, err := jes.getLatestVersionLocked(event.ActorID)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to get latest version: %w", err)
|
||||
}
|
||||
|
||||
// Validate version is strictly greater than current
|
||||
if event.Version <= currentVersion {
|
||||
return &aether.VersionConflictError{
|
||||
ActorID: event.ActorID,
|
||||
AttemptedVersion: event.Version,
|
||||
CurrentVersion: currentVersion,
|
||||
}
|
||||
}
|
||||
|
||||
// Update cache with current version
|
||||
jes.versions[event.ActorID] = currentVersion
|
||||
}
|
||||
|
||||
// Serialize event to JSON
|
||||
@@ -88,50 +274,80 @@ func (jes *JetStreamEventStore) SaveEvent(event *aether.Event) error {
|
||||
return fmt.Errorf("failed to publish event to JetStream: %w", err)
|
||||
}
|
||||
|
||||
// Update version cache
|
||||
// Update version cache after successful publish
|
||||
jes.versions[event.ActorID] = event.Version
|
||||
|
||||
// Publish EventStored event after successful save (if broadcaster is configured)
|
||||
if jes.broadcaster != nil {
|
||||
jes.publishEventStored(event)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// getLatestVersionLocked returns the latest version for an actor.
|
||||
// Caller must hold jes.mu.
|
||||
func (jes *JetStreamEventStore) getLatestVersionLocked(actorID string) (int64, error) {
|
||||
// Check cache first
|
||||
if version, ok := jes.versions[actorID]; ok {
|
||||
return version, nil
|
||||
// publishEventStored publishes an EventStored event to the broadcaster.
|
||||
// This is called after a successful SaveEvent to notify subscribers.
|
||||
//
|
||||
// EventStored Event Schema:
|
||||
// - EventType: "EventStored" (aether.EventTypeEventStored)
|
||||
// - ActorID: ID of the actor that the original event was about
|
||||
// - Version: version of the stored event
|
||||
// - Data:
|
||||
// - eventId: (string) ID of the stored event
|
||||
// - actorId: (string) ID of the actor
|
||||
// - version: (int64) version of the event
|
||||
// - timestamp: (int64) Unix timestamp in seconds, taken from the original event's Timestamp field
|
||||
//
|
||||
// Example usage with NATSEventBus:
//
//	eventBus := aether.NewNATSEventBus(natsConn)
//	es, err := store.NewJetStreamEventStoreWithBroadcaster(natsConn, "events", eventBus, "")
//	if err != nil {
//		return err
//	}
//	ch := eventBus.SubscribeToEventStored("*")
//
//	for event := range ch {
//		actorID, _ := event.Data["actorId"].(string)
//		// Numeric fields arrive as float64 once the event has been through
//		// a JSON round trip, so accept both representations.
//		switch v := event.Data["version"].(type) {
//		case int64:
//			es.UpdateVersionCache(actorID, v)
//		case float64:
//			es.UpdateVersionCache(actorID, int64(v))
//		}
//	}
|
||||
func (jes *JetStreamEventStore) publishEventStored(originalEvent *aether.Event) {
|
||||
eventStored := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: aether.EventTypeEventStored,
|
||||
ActorID: originalEvent.ActorID, // EventStored is about the original actor
|
||||
Version: originalEvent.Version, // Preserve the version of the stored event
|
||||
Data: map[string]interface{}{
|
||||
"eventId": originalEvent.ID,
|
||||
"actorId": originalEvent.ActorID,
|
||||
"version": originalEvent.Version,
|
||||
"timestamp": originalEvent.Timestamp.Unix(),
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
// Fetch from JetStream
|
||||
events, err := jes.getEventsInternal(actorID, 0)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
|
||||
if len(events) == 0 {
|
||||
return 0, nil
|
||||
}
|
||||
|
||||
latestVersion := int64(0)
|
||||
for _, event := range events {
|
||||
if event.Version > latestVersion {
|
||||
latestVersion = event.Version
|
||||
}
|
||||
}
|
||||
|
||||
// Update cache
|
||||
jes.versions[actorID] = latestVersion
|
||||
|
||||
return latestVersion, nil
|
||||
jes.broadcaster.Publish(jes.namespace, eventStored)
|
||||
}
|
||||
|
||||
// GetEvents retrieves all events for an actor since a version
|
||||
// GetEvents retrieves all events for an actor since a version.
|
||||
// Note: This method silently skips malformed events for backward compatibility.
|
||||
// Use GetEventsWithErrors to receive information about malformed events.
|
||||
func (jes *JetStreamEventStore) GetEvents(actorID string, fromVersion int64) ([]*aether.Event, error) {
|
||||
return jes.getEventsInternal(actorID, fromVersion)
|
||||
result, err := jes.getEventsWithErrorsInternal(actorID, fromVersion)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return result.Events, nil
|
||||
}
|
||||
|
||||
// getEventsInternal is the internal implementation of GetEvents
|
||||
func (jes *JetStreamEventStore) getEventsInternal(actorID string, fromVersion int64) ([]*aether.Event, error) {
|
||||
// GetEventsWithErrors retrieves events for an actor and reports any malformed
|
||||
// events encountered. This method allows callers to decide how to handle
|
||||
// corrupted data rather than silently skipping it.
|
||||
func (jes *JetStreamEventStore) GetEventsWithErrors(actorID string, fromVersion int64) (*aether.ReplayResult, error) {
|
||||
return jes.getEventsWithErrorsInternal(actorID, fromVersion)
|
||||
}
|
||||
|
||||
// getEventsWithErrorsInternal is the internal implementation that tracks both
|
||||
// successfully parsed events and errors for malformed events.
|
||||
func (jes *JetStreamEventStore) getEventsWithErrorsInternal(actorID string, fromVersion int64) (*aether.ReplayResult, error) {
|
||||
// Create subject filter for this actor
|
||||
subject := fmt.Sprintf("%s.events.%s.%s",
|
||||
jes.streamName,
|
||||
@@ -145,7 +361,10 @@ func (jes *JetStreamEventStore) getEventsInternal(actorID string, fromVersion in
|
||||
}
|
||||
defer consumer.Unsubscribe()
|
||||
|
||||
var events []*aether.Event
|
||||
result := &aether.ReplayResult{
|
||||
Events: make([]*aether.Event, 0),
|
||||
Errors: make([]aether.ReplayError, 0),
|
||||
}
|
||||
|
||||
// Fetch messages in batches
|
||||
for {
|
||||
@@ -160,12 +379,24 @@ func (jes *JetStreamEventStore) getEventsInternal(actorID string, fromVersion in
|
||||
for _, msg := range msgs {
|
||||
var event aether.Event
|
||||
if err := json.Unmarshal(msg.Data, &event); err != nil {
|
||||
continue // Skip malformed events
|
||||
// Record the error with context instead of silently skipping
|
||||
metadata, _ := msg.Metadata()
|
||||
seqNum := uint64(0)
|
||||
if metadata != nil {
|
||||
seqNum = metadata.Sequence.Stream
|
||||
}
|
||||
result.Errors = append(result.Errors, aether.ReplayError{
|
||||
SequenceNumber: seqNum,
|
||||
RawData: msg.Data,
|
||||
Err: err,
|
||||
})
|
||||
msg.Ack() // Still ack to prevent redelivery
|
||||
continue
|
||||
}
|
||||
|
||||
// Filter by version
|
||||
if event.Version > fromVersion {
|
||||
events = append(events, &event)
|
||||
result.Events = append(result.Events, &event)
|
||||
}
|
||||
|
||||
msg.Ack()
|
||||
@@ -176,31 +407,99 @@ func (jes *JetStreamEventStore) getEventsInternal(actorID string, fromVersion in
|
||||
}
|
||||
}
|
||||
|
||||
return events, nil
|
||||
return result, nil
|
||||
}
|
||||
|
||||
// GetLatestVersion returns the latest version for an actor
|
||||
// GetLatestVersion returns the latest version for an actor in O(1) time.
|
||||
// It uses JetStream's DeliverLast() option to fetch only the last message
|
||||
// instead of scanning all events, making this O(1) instead of O(n).
|
||||
func (jes *JetStreamEventStore) GetLatestVersion(actorID string) (int64, error) {
|
||||
events, err := jes.GetEvents(actorID, 0)
|
||||
// Create subject filter for this actor
|
||||
subject := fmt.Sprintf("%s.events.%s.%s",
|
||||
jes.streamName,
|
||||
sanitizeSubject(extractActorType(actorID)),
|
||||
sanitizeSubject(actorID))
|
||||
|
||||
// Create consumer to read only the last message
|
||||
consumer, err := jes.js.PullSubscribe(subject, "", nats.DeliverLast())
|
||||
if err != nil {
|
||||
return 0, err
|
||||
return 0, fmt.Errorf("failed to create consumer: %w", err)
|
||||
}
|
||||
defer consumer.Unsubscribe()
|
||||
|
||||
// Fetch only the last message
|
||||
msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second))
|
||||
if err != nil {
|
||||
if err == nats.ErrTimeout {
|
||||
// No messages for this actor, return 0
|
||||
return 0, nil
|
||||
}
|
||||
return 0, fmt.Errorf("failed to fetch last message: %w", err)
|
||||
}
|
||||
|
||||
if len(events) == 0 {
|
||||
if len(msgs) == 0 {
|
||||
// No events for this actor
|
||||
return 0, nil
|
||||
}
|
||||
|
||||
latestVersion := int64(0)
|
||||
for _, event := range events {
|
||||
if event.Version > latestVersion {
|
||||
latestVersion = event.Version
|
||||
}
|
||||
// Parse the last message to get the version
|
||||
var event aether.Event
|
||||
if err := json.Unmarshal(msgs[0].Data, &event); err != nil {
|
||||
return 0, fmt.Errorf("failed to unmarshal last event: %w", err)
|
||||
}
|
||||
|
||||
return latestVersion, nil
|
||||
msgs[0].Ack()
|
||||
return event.Version, nil
|
||||
}
|
||||
|
||||
// GetLatestSnapshot gets the most recent snapshot for an actor
|
||||
// getLatestVersionLocked is like GetLatestVersion but assumes the caller already holds jes.mu.
|
||||
// This is used internally to avoid releasing and reacquiring the lock during SaveEvent,
|
||||
// which would create a TOCTOU race condition.
|
||||
func (jes *JetStreamEventStore) getLatestVersionLocked(actorID string) (int64, error) {
|
||||
// Create subject filter for this actor
|
||||
subject := fmt.Sprintf("%s.events.%s.%s",
|
||||
jes.streamName,
|
||||
sanitizeSubject(extractActorType(actorID)),
|
||||
sanitizeSubject(actorID))
|
||||
|
||||
// Create consumer to read only the last message
|
||||
consumer, err := jes.js.PullSubscribe(subject, "", nats.DeliverLast())
|
||||
if err != nil {
|
||||
return 0, fmt.Errorf("failed to create consumer: %w", err)
|
||||
}
|
||||
defer consumer.Unsubscribe()
|
||||
|
||||
// Fetch only the last message
|
||||
msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second))
|
||||
if err != nil {
|
||||
if err == nats.ErrTimeout {
|
||||
// No messages for this actor, return 0
|
||||
return 0, nil
|
||||
}
|
||||
return 0, fmt.Errorf("failed to fetch last message: %w", err)
|
||||
}
|
||||
|
||||
if len(msgs) == 0 {
|
||||
// No events for this actor
|
||||
return 0, nil
|
||||
}
|
||||
|
||||
// Parse the last message to get the version
|
||||
var event aether.Event
|
||||
if err := json.Unmarshal(msgs[0].Data, &event); err != nil {
|
||||
return 0, fmt.Errorf("failed to unmarshal last event: %w", err)
|
||||
}
|
||||
|
||||
msgs[0].Ack()
|
||||
return event.Version, nil
|
||||
}
|
||||
|
||||
// GetLatestSnapshot gets the most recent snapshot for an actor.
|
||||
// Returns an error if no snapshot exists for the actor (unlike GetLatestVersion which returns 0).
|
||||
// This is intentional: a missing snapshot is different from a missing event stream.
|
||||
// If an actor has no events, that's a normal state (use version 0).
|
||||
// If an actor has no snapshot, that could indicate an error or it could be normal
|
||||
// depending on the use case, so we let the caller decide how to handle it.
|
||||
func (jes *JetStreamEventStore) GetLatestSnapshot(actorID string) (*aether.ActorSnapshot, error) {
|
||||
// Create subject for snapshots
|
||||
subject := fmt.Sprintf("%s.snapshots.%s.%s",
|
||||
@@ -218,12 +517,14 @@ func (jes *JetStreamEventStore) GetLatestSnapshot(actorID string) (*aether.Actor
|
||||
msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second))
|
||||
if err != nil {
|
||||
if err == nats.ErrTimeout {
|
||||
// No snapshot found - return error to distinguish from successful nil result
|
||||
return nil, fmt.Errorf("no snapshot found for actor %s", actorID)
|
||||
}
|
||||
return nil, fmt.Errorf("failed to fetch snapshot: %w", err)
|
||||
}
|
||||
|
||||
if len(msgs) == 0 {
|
||||
// No snapshot exists for this actor
|
||||
return nil, fmt.Errorf("no snapshot found for actor %s", actorID)
|
||||
}
|
||||
|
||||
@@ -279,3 +580,44 @@ func sanitizeSubject(s string) string {
|
||||
s = strings.ReplaceAll(s, ">", "_")
|
||||
return s
|
||||
}
|
||||
|
||||
// UpdateVersionCache updates the version cache for a specific actor.
|
||||
// This is used when receiving events from other nodes via NATS to keep
|
||||
// the version cache consistent across cluster nodes.
|
||||
//
|
||||
// Only updates if the new version is greater than the cached version to prevent
|
||||
// stale cache entries from causing version conflicts.
|
||||
func (jes *JetStreamEventStore) UpdateVersionCache(actorID string, version int64) {
|
||||
jes.mu.Lock()
|
||||
defer jes.mu.Unlock()
|
||||
|
||||
// Only update if the new version is greater than cached version
|
||||
if currentVersion, ok := jes.versions[actorID]; !ok || version > currentVersion {
|
||||
jes.versions[actorID] = version
|
||||
}
|
||||
}
|
||||
|
||||
// GetCachedVersion returns the cached version for an actor, if available.
|
||||
func (jes *JetStreamEventStore) GetCachedVersion(actorID string) (int64, bool) {
|
||||
jes.mu.Lock()
|
||||
defer jes.mu.Unlock()
|
||||
|
||||
version, ok := jes.versions[actorID]
|
||||
return version, ok
|
||||
}
|
||||
|
||||
// SetBroadcaster sets the event broadcaster for this store.
|
||||
// The broadcaster is used to publish EventStored events when events are saved.
|
||||
func (jes *JetStreamEventStore) SetBroadcaster(broadcaster aether.EventBroadcaster) {
|
||||
jes.mu.Lock()
|
||||
defer jes.mu.Unlock()
|
||||
jes.broadcaster = broadcaster
|
||||
}
|
||||
|
||||
// Close closes the JetStream event store and cleans up resources.
|
||||
func (jes *JetStreamEventStore) Close(ctx context.Context) error {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Compile-time check that JetStreamEventStore implements EventStoreWithErrors
|
||||
var _ aether.EventStoreWithErrors = (*JetStreamEventStore)(nil)
|
||||
|
||||
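The monotonic update rule used by `UpdateVersionCache` and `GetCachedVersion` above can be sketched in isolation. This is a minimal illustration, not the repository's code: `versionCache`, `newVersionCache`, `Update`, and `Get` are hypothetical names standing in for the `jes.versions` map guarded by `jes.mu`.

```go
package main

import (
	"fmt"
	"sync"
)

// versionCache is a hypothetical stand-in for the store's version map
// and the mutex that guards it.
type versionCache struct {
	mu       sync.Mutex
	versions map[string]int64
}

func newVersionCache() *versionCache {
	return &versionCache{versions: make(map[string]int64)}
}

// Update records version for actorID only if it is greater than the
// cached value, so a late-arriving older event cannot move the cache backwards.
func (c *versionCache) Update(actorID string, version int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if current, ok := c.versions[actorID]; !ok || version > current {
		c.versions[actorID] = version
	}
}

// Get returns the cached version and whether an entry exists.
func (c *versionCache) Get(actorID string) (int64, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.versions[actorID]
	return v, ok
}

func main() {
	c := newVersionCache()
	c.Update("order-1", 5)
	c.Update("order-1", 3) // stale update, ignored
	v, ok := c.Get("order-1")
	fmt.Println(v, ok) // prints "5 true"
}
```

Because the cache only ever moves forward, applying the same remote event twice or out of order leaves it unchanged.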
147
store/jetstream_benchmark_test.go
Normal file
@@ -0,0 +1,147 @@
|
||||
//go:build integration
|
||||
|
||||
package store
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"git.flowmade.one/flowmade-one/aether"
|
||||
)
|
||||
|
||||
// BenchmarkGetLatestVersion_WithManyEvents benchmarks GetLatestVersion performance
|
||||
// with a large number of events per actor.
|
||||
// A single run does not establish O(1) behavior on its own; compare the result
|
||||
// against BenchmarkGetLatestVersion_SingleEvent to check that time per call stays roughly flat.
|
||||
func BenchmarkGetLatestVersion_WithManyEvents(b *testing.B) {
|
||||
nc := getTestNATSConnection(&testing.T{})
|
||||
if nc == nil {
|
||||
b.Skip("NATS not available")
|
||||
return
|
||||
}
|
||||
defer nc.Close()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, fmt.Sprintf("bench-getversion-%d", time.Now().UnixNano()))
|
||||
if err != nil {
|
||||
b.Fatalf("failed to create store: %v", err)
|
||||
}
|
||||
|
||||
actorID := "actor-bench-test"
|
||||
|
||||
// Populate with 1000 events
|
||||
for i := 1; i <= 1000; i++ {
|
||||
event := &aether.Event{
|
||||
ID: fmt.Sprintf("evt-%d", i),
|
||||
EventType: "BenchEvent",
|
||||
ActorID: actorID,
|
||||
Version: int64(i),
|
||||
Data: map[string]interface{}{"index": i},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
err := store.SaveEvent(event)
|
||||
if err != nil {
|
||||
b.Fatalf("SaveEvent failed for event %d: %v", i, err)
|
||||
}
|
||||
}
|
||||
|
||||
// Benchmark GetLatestVersion
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
_, err := store.GetLatestVersion(actorID)
|
||||
if err != nil {
|
||||
b.Fatalf("GetLatestVersion failed: %v", err)
|
||||
}
|
||||
}
|
||||
b.StopTimer()
|
||||
}
|
||||
|
||||
// BenchmarkGetLatestVersion_NoCache benchmarks GetLatestVersion without cache
|
||||
// to show that even uncached lookups are very fast due to DeliverLast optimization.
|
||||
// A new store instance is created before timing to bypass the version cache.
|
||||
func BenchmarkGetLatestVersion_NoCache(b *testing.B) {
|
||||
nc := getTestNATSConnection(&testing.T{})
|
||||
if nc == nil {
|
||||
b.Skip("NATS not available")
|
||||
return
|
||||
}
|
||||
defer nc.Close()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, fmt.Sprintf("bench-nocache-%d", time.Now().UnixNano()))
|
||||
if err != nil {
|
||||
b.Fatalf("failed to create store: %v", err)
|
||||
}
|
||||
|
||||
actorID := "actor-bench-nocache"
|
||||
|
||||
// Populate with 1000 events
|
||||
for i := 1; i <= 1000; i++ {
|
||||
event := &aether.Event{
|
||||
ID: fmt.Sprintf("evt-%d", i),
|
||||
EventType: "BenchEvent",
|
||||
ActorID: actorID,
|
||||
Version: int64(i),
|
||||
Data: map[string]interface{}{"index": i},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
err := store.SaveEvent(event)
|
||||
if err != nil {
|
||||
b.Fatalf("SaveEvent failed for event %d: %v", i, err)
|
||||
}
|
||||
}
|
||||
|
||||
// Create a new store instance to bypass version cache
|
||||
uncachedStore, err := NewJetStreamEventStore(nc, store.GetStreamName())
|
||||
if err != nil {
|
||||
b.Fatalf("failed to create uncached store: %v", err)
|
||||
}
|
||||
|
||||
// Benchmark GetLatestVersion without using cache
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
_, err := uncachedStore.GetLatestVersion(actorID)
|
||||
if err != nil {
|
||||
b.Fatalf("GetLatestVersion failed: %v", err)
|
||||
}
|
||||
}
|
||||
b.StopTimer()
|
||||
}
|
||||
|
||||
// BenchmarkGetLatestVersion_SingleEvent benchmarks with minimal data
|
||||
func BenchmarkGetLatestVersion_SingleEvent(b *testing.B) {
|
||||
nc := getTestNATSConnection(&testing.T{})
|
||||
if nc == nil {
|
||||
b.Skip("NATS not available")
|
||||
return
|
||||
}
|
||||
defer nc.Close()
|
||||
|
||||
store, err := NewJetStreamEventStore(nc, fmt.Sprintf("bench-single-%d", time.Now().UnixNano()))
|
||||
if err != nil {
|
||||
b.Fatalf("failed to create store: %v", err)
|
||||
}
|
||||
|
||||
actorID := "actor-single"
|
||||
|
||||
event := &aether.Event{
|
||||
ID: "evt-1",
|
||||
EventType: "TestEvent",
|
||||
ActorID: actorID,
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
err = store.SaveEvent(event)
|
||||
if err != nil {
|
||||
b.Fatalf("SaveEvent failed: %v", err)
|
||||
}
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
_, err := store.GetLatestVersion(actorID)
|
||||
if err != nil {
|
||||
b.Fatalf("GetLatestVersion failed: %v", err)
|
||||
}
|
||||
}
|
||||
b.StopTimer()
|
||||
}
|
||||
@@ -2,15 +2,19 @@ package store
|
||||
|
||||
import (
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"git.flowmade.one/flowmade-one/aether"
|
||||
"github.com/google/uuid"
|
||||
)
|
||||
|
||||
// InMemoryEventStore provides a simple in-memory event store for testing
|
||||
type InMemoryEventStore struct {
|
||||
mu sync.RWMutex
|
||||
events map[string][]*aether.Event // actorID -> events
|
||||
snapshots map[string][]*aether.ActorSnapshot // actorID -> snapshots (sorted by version)
|
||||
mu sync.RWMutex
|
||||
events map[string][]*aether.Event // actorID -> events
|
||||
snapshots map[string][]*aether.ActorSnapshot // actorID -> snapshots (sorted by version)
|
||||
broadcaster aether.EventBroadcaster // optional broadcaster for EventStored events
|
||||
namespace string // optional namespace for event publishing
|
||||
}
|
||||
|
||||
// NewInMemoryEventStore creates a new in-memory event store
|
||||
@@ -21,9 +25,21 @@ func NewInMemoryEventStore() *InMemoryEventStore {
|
||||
}
|
||||
}
|
||||
|
||||
// NewInMemoryEventStoreWithBroadcaster creates a new in-memory event store with an event broadcaster.
|
||||
// The broadcaster receives EventStored events when events are successfully saved.
|
||||
func NewInMemoryEventStoreWithBroadcaster(broadcaster aether.EventBroadcaster, namespace string) *InMemoryEventStore {
|
||||
return &InMemoryEventStore{
|
||||
events: make(map[string][]*aether.Event),
|
||||
snapshots: make(map[string][]*aether.ActorSnapshot),
|
||||
broadcaster: broadcaster,
|
||||
namespace: namespace,
|
||||
}
|
||||
}
|
||||
|
||||
// SaveEvent saves an event to the in-memory store.
|
||||
// Returns VersionConflictError if the event's version is not strictly greater
|
||||
// than the current latest version for the actor.
|
||||
// If a broadcaster is configured, publishes an EventStored event on success.
|
||||
func (es *InMemoryEventStore) SaveEvent(event *aether.Event) error {
|
||||
es.mu.Lock()
|
||||
defer es.mu.Unlock()
|
||||
@@ -51,9 +67,35 @@ func (es *InMemoryEventStore) SaveEvent(event *aether.Event) error {
|
||||
es.events[event.ActorID] = make([]*aether.Event, 0)
|
||||
}
|
||||
es.events[event.ActorID] = append(es.events[event.ActorID], event)
|
||||
|
||||
// Publish EventStored event after successful save (if broadcaster is configured)
|
||||
if es.broadcaster != nil {
|
||||
es.publishEventStored(event)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// publishEventStored publishes an EventStored event to the broadcaster.
|
||||
// This is called after a successful SaveEvent to notify subscribers.
|
||||
func (es *InMemoryEventStore) publishEventStored(originalEvent *aether.Event) {
|
||||
eventStored := &aether.Event{
|
||||
ID: uuid.New().String(),
|
||||
EventType: aether.EventTypeEventStored,
|
||||
ActorID: originalEvent.ActorID, // EventStored is about the original actor
|
||||
Version: originalEvent.Version, // Preserve the version of the stored event
|
||||
Data: map[string]interface{}{
|
||||
"eventId": originalEvent.ID,
|
||||
"actorId": originalEvent.ActorID,
|
||||
"version": originalEvent.Version,
|
||||
"timestamp": originalEvent.Timestamp.Unix(),
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
es.broadcaster.Publish(es.namespace, eventStored)
|
||||
}
|
||||
|
||||
// GetEvents retrieves events for an actor from a specific version
|
||||
func (es *InMemoryEventStore) GetEvents(actorID string, fromVersion int64) ([]*aether.Event, error) {
|
||||
es.mu.RLock()
|
||||
|
||||
@@ -1905,3 +1905,181 @@ func TestSaveEvent_MetadataPreservedAcrossMultipleEvents(t *testing.T) {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// === EventStored Publishing Tests ===
|
||||
|
||||
func TestSaveEvent_WithBroadcaster_PublishesEventStored(t *testing.T) {
|
||||
// Create an event bus to capture published events
|
||||
broadcaster := aether.NewEventBus()
|
||||
store := NewInMemoryEventStoreWithBroadcaster(broadcaster, "test-namespace")
|
||||
|
||||
// Subscribe to EventStored events
|
||||
ch := broadcaster.Subscribe("test-namespace")
|
||||
defer broadcaster.Unsubscribe("test-namespace", ch)
|
||||
|
||||
event := &aether.Event{
|
||||
ID: "evt-123",
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: "order-456",
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{
|
||||
"total": 100.50,
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
// Save event
|
||||
err := store.SaveEvent(event)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent failed: %v", err)
|
||||
}
|
||||
|
||||
// Check if EventStored was published
|
||||
select {
|
||||
case publishedEvent := <-ch:
|
||||
if publishedEvent == nil {
|
||||
t.Fatal("received nil event from broadcaster")
|
||||
}
|
||||
if publishedEvent.EventType != aether.EventTypeEventStored {
|
||||
t.Errorf("expected EventType %q, got %q", aether.EventTypeEventStored, publishedEvent.EventType)
|
||||
}
|
||||
if publishedEvent.ActorID != "order-456" {
|
||||
t.Errorf("expected ActorID %q, got %q", "order-456", publishedEvent.ActorID)
|
||||
}
|
||||
if publishedEvent.Version != 1 {
|
||||
t.Errorf("expected Version 1, got %d", publishedEvent.Version)
|
||||
}
|
||||
// Check data contains original event info
|
||||
if publishedEvent.Data["eventId"] != "evt-123" {
|
||||
t.Errorf("expected eventId %q, got %q", "evt-123", publishedEvent.Data["eventId"])
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("timeout waiting for EventStored event")
|
||||
}
|
||||
}
|
||||
|
||||
func TestSaveEvent_VersionConflict_NoEventStored(t *testing.T) {
|
||||
broadcaster := aether.NewEventBus()
|
||||
store := NewInMemoryEventStoreWithBroadcaster(broadcaster, "test-namespace")
|
||||
|
||||
// Subscribe to EventStored events
|
||||
ch := broadcaster.Subscribe("test-namespace")
|
||||
defer broadcaster.Unsubscribe("test-namespace", ch)
|
||||
|
||||
// Save first event
|
||||
event1 := &aether.Event{
|
||||
ID: "evt-1",
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: "order-456",
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err := store.SaveEvent(event1)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent(event1) failed: %v", err)
|
||||
}
|
||||
|
||||
// Drain the first EventStored event
|
||||
select {
|
||||
case <-ch:
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatal("timeout waiting for first EventStored event")
|
||||
}
|
||||
|
||||
// Try to save event with non-increasing version (should fail)
|
||||
event2 := &aether.Event{
|
||||
ID: "evt-2",
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: "order-456",
|
||||
Version: 1, // Same version, should conflict
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err = store.SaveEvent(event2)
|
||||
if !errors.Is(err, aether.ErrVersionConflict) {
|
||||
t.Fatalf("expected ErrVersionConflict, got %v", err)
|
||||
}
|
||||
|
||||
// Verify no EventStored event was published
|
||||
select {
|
||||
case <-ch:
|
||||
t.Fatal("expected no EventStored event, but received one")
|
||||
case <-time.After(50 * time.Millisecond):
|
||||
// Expected - no event published
|
||||
}
|
||||
}
|
||||
|
||||
func TestSaveEvent_MultipleEvents_PublishesMultipleEventStored(t *testing.T) {
|
||||
broadcaster := aether.NewEventBus()
|
||||
store := NewInMemoryEventStoreWithBroadcaster(broadcaster, "test-namespace")
|
||||
|
||||
// Subscribe to EventStored events
|
||||
ch := broadcaster.Subscribe("test-namespace")
|
||||
defer broadcaster.Unsubscribe("test-namespace", ch)
|
||||
|
||||
// Save multiple events
|
||||
for i := int64(1); i <= 3; i++ {
|
||||
event := &aether.Event{
|
||||
ID: fmt.Sprintf("evt-%d", i),
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: "order-456",
|
||||
Version: i,
|
||||
Data: map[string]interface{}{},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
err := store.SaveEvent(event)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent failed: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Verify we received 3 EventStored events in order
|
||||
for i := int64(1); i <= 3; i++ {
|
||||
select {
|
||||
case publishedEvent := <-ch:
|
||||
if publishedEvent == nil {
|
||||
t.Fatal("received nil event from broadcaster")
|
||||
}
|
||||
if publishedEvent.Version != i {
|
||||
t.Errorf("expected Version %d, got %d", i, publishedEvent.Version)
|
||||
}
|
||||
case <-time.After(100 * time.Millisecond):
|
||||
t.Fatalf("timeout waiting for EventStored event %d", i)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestSaveEvent_WithoutBroadcaster_NoPanic(t *testing.T) {
|
||||
// Test that SaveEvent works without a broadcaster (nil broadcaster)
|
||||
store := NewInMemoryEventStore()
|
||||
|
||||
event := &aether.Event{
|
||||
ID: "evt-123",
|
||||
EventType: "OrderPlaced",
|
||||
ActorID: "order-456",
|
||||
Version: 1,
|
||||
Data: map[string]interface{}{
|
||||
"total": 100.50,
|
||||
},
|
||||
Timestamp: time.Now(),
|
||||
}
|
||||
|
||||
// This should not panic even though broadcaster is nil
|
||||
err := store.SaveEvent(event)
|
||||
if err != nil {
|
||||
t.Fatalf("SaveEvent failed: %v", err)
|
||||
}
|
||||
|
||||
// Verify event was saved
|
||||
events, err := store.GetEvents("order-456", 0)
|
||||
if err != nil {
|
||||
t.Fatalf("GetEvents failed: %v", err)
|
||||
}
|
||||
if len(events) != 1 {
|
||||
t.Fatalf("expected 1 event, got %d", len(events))
|
||||
}
|
||||
}
|
||||
|
||||
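The behavior these tests exercise (a save succeeds only when the version is strictly greater than the actor's latest, and a notification is emitted only after a successful save) can be sketched as follows. This is a simplified, single-goroutine illustration under stated assumptions: `miniStore`, `event`, and `errVersionConflict` are hypothetical stand-ins for `InMemoryEventStore`, `aether.Event`, and `aether.ErrVersionConflict`, the `published` slice replaces a real broadcaster, and an actor with no events is assumed to be at version 0.

```go
package main

import (
	"errors"
	"fmt"
)

// errVersionConflict is a hypothetical stand-in for aether.ErrVersionConflict.
var errVersionConflict = errors.New("version conflict")

// event carries only the fields this sketch needs.
type event struct {
	ActorID string
	Version int64
}

// miniStore tracks the latest version per actor and records every
// notification that would be handed to a broadcaster. It does no locking.
type miniStore struct {
	latest    map[string]int64
	published []event
}

func newMiniStore() *miniStore {
	return &miniStore{latest: make(map[string]int64)}
}

// Save rejects any version that is not strictly greater than the latest
// one, and records a notification only after the save has succeeded.
func (s *miniStore) Save(e event) error {
	if current := s.latest[e.ActorID]; e.Version <= current {
		return fmt.Errorf("%w: actor %s is at version %d, got %d",
			errVersionConflict, e.ActorID, current, e.Version)
	}
	s.latest[e.ActorID] = e.Version
	s.published = append(s.published, e)
	return nil
}

func main() {
	s := newMiniStore()
	fmt.Println(s.Save(event{ActorID: "order-456", Version: 1})) // prints "<nil>"
	err := s.Save(event{ActorID: "order-456", Version: 1})
	fmt.Println(errors.Is(err, errVersionConflict)) // prints "true"
	fmt.Println(len(s.published))                   // prints "1"
}
```

Returning before the notification is recorded is what makes the conflict case silent, which is the property `TestSaveEvent_VersionConflict_NoEventStored` checks.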
124
store/namespace_test.go
Normal file
@@ -0,0 +1,124 @@
|
||||
package store
|
||||
|
||||
import (
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestJetStreamConfigNamespace(t *testing.T) {
|
||||
t.Run("default config has empty namespace", func(t *testing.T) {
|
||||
config := DefaultJetStreamConfig()
|
||||
if config.Namespace != "" {
|
||||
t.Errorf("expected empty namespace in default config, got %q", config.Namespace)
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("namespace can be set in config", func(t *testing.T) {
|
||||
config := JetStreamConfig{
|
||||
Namespace: "tenant-abc",
|
||||
}
|
||||
if config.Namespace != "tenant-abc" {
|
||||
t.Errorf("expected namespace tenant-abc, got %q", config.Namespace)
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
func TestNamespacedStreamName(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
baseStreamName string
|
||||
namespace string
|
||||
expectedStreamName string
|
||||
}{
|
||||
{
|
||||
name: "no namespace - stream name unchanged",
|
||||
baseStreamName: "events",
|
||||
namespace: "",
|
||||
expectedStreamName: "events",
|
||||
},
|
||||
{
|
||||
name: "with namespace - prefixed stream name",
|
||||
baseStreamName: "events",
|
||||
namespace: "tenant-abc",
|
||||
expectedStreamName: "tenant-abc_events",
|
||||
},
|
||||
{
|
||||
name: "namespace with dots - sanitized",
|
||||
baseStreamName: "events",
|
||||
namespace: "tenant.abc",
|
||||
expectedStreamName: "tenant_abc_events",
|
||||
},
|
||||
{
|
||||
name: "namespace with spaces - sanitized",
|
||||
baseStreamName: "events",
|
||||
namespace: "tenant abc",
|
||||
expectedStreamName: "tenant_abc_events",
|
||||
},
|
||||
{
|
||||
name: "namespace with special chars - sanitized",
|
||||
baseStreamName: "events",
|
||||
namespace: "tenant*abc>def",
|
||||
expectedStreamName: "tenant_abc_def_events",
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
// We can't create a real JetStreamEventStore without NATS,
|
||||
// but we can test the stream name logic by examining the expected format
|
||||
effectiveStreamName := tt.baseStreamName
|
||||
if tt.namespace != "" {
|
||||
effectiveStreamName = sanitizeSubject(tt.namespace) + "_" + tt.baseStreamName
|
||||
}
|
||||
|
||||
if effectiveStreamName != tt.expectedStreamName {
|
||||
t.Errorf("expected stream name %q, got %q", tt.expectedStreamName, effectiveStreamName)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestSanitizeSubject(t *testing.T) {
|
||||
tests := []struct {
|
||||
input string
|
||||
expected string
|
||||
}{
|
||||
{"simple", "simple"},
|
||||
{"with spaces", "with_spaces"},
|
||||
{"with.dots", "with_dots"},
|
||||
{"with*stars", "with_stars"},
|
||||
{"with>greater", "with_greater"},
|
||||
{"complex.name with*special>chars", "complex_name_with_special_chars"},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.input, func(t *testing.T) {
|
||||
result := sanitizeSubject(tt.input)
|
||||
if result != tt.expected {
|
||||
t.Errorf("sanitizeSubject(%q) = %q, want %q", tt.input, result, tt.expected)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestExtractActorType(t *testing.T) {
|
||||
tests := []struct {
|
||||
actorID string
|
||||
expectedType string
|
||||
}{
|
||||
{"order-123", "order"},
|
||||
{"user-abc-def", "user"},
|
||||
{"nodelimiter", "unknown"},
|
||||
{"", "unknown"},
|
||||
{"-leadingdash", "unknown"},
|
||||
{"a-b", "a"},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.actorID, func(t *testing.T) {
|
||||
result := extractActorType(tt.actorID)
|
||||
if result != tt.expectedType {
|
||||
t.Errorf("extractActorType(%q) = %q, want %q", tt.actorID, result, tt.expectedType)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
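The test tables above pin down the expected behavior of the naming helpers closely enough to sketch them. The following is a reconstruction from those tables only, not the repository's implementation: `sanitize`, `streamName`, and `actorType` are hypothetical names, and the set of replaced characters (space, `.`, `*`, `>`) is taken from the test cases.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitize replaces characters that have special meaning in NATS subjects
// (or are not allowed in them) with underscores.
func sanitize(s string) string {
	return strings.NewReplacer(" ", "_", ".", "_", "*", "_", ">", "_").Replace(s)
}

// streamName prefixes the base stream name with the sanitized namespace.
// An empty namespace leaves the base name unchanged.
func streamName(namespace, base string) string {
	if namespace == "" {
		return base
	}
	return sanitize(namespace) + "_" + base
}

// actorType returns the text before the first dash in actorID, or
// "unknown" when there is no dash or nothing precedes it.
func actorType(actorID string) string {
	i := strings.Index(actorID, "-")
	if i <= 0 {
		return "unknown"
	}
	return actorID[:i]
}

func main() {
	fmt.Println(streamName("tenant.abc", "events")) // prints "tenant_abc_events"
	fmt.Println(actorType("user-abc-def"))          // prints "user"
}
```

Sanitizing the namespace before joining keeps a tenant name such as `tenant.abc` from introducing an extra subject token.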