d33477c02a Add mutex protection to ConsistentHashRing for thread safety
- Add sync.RWMutex to ConsistentHashRing struct
- Use Lock/Unlock for write operations (AddNode, RemoveNode)
- Use RLock/RUnlock for read operations (GetNode, GetNodes, IsEmpty)

This allows concurrent reads (the common case) while serializing writes,
preventing race conditions when multiple goroutines access the hash ring.

Closes #35

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:30:58 +01:00
40 changed files with 416 additions and 6460 deletions
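
The locking scheme described in the commit message can be sketched as follows. This is a minimal illustration, not the repository's code: the `Ring` type, the `virtualNodes` constant and the use of CRC32 as the hash function are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"sync"
)

// virtualNodes is an assumed number of virtual nodes per physical node.
const virtualNodes = 150

// Ring is a hypothetical stand-in for ConsistentHashRing.
type Ring struct {
	mu           sync.RWMutex
	ring         map[uint32]string // hash -> node ID
	sortedHashes []uint32
	nodes        map[string]bool
}

func NewRing() *Ring {
	return &Ring{ring: make(map[uint32]string), nodes: make(map[string]bool)}
}

// AddNode mutates the ring, so it takes the exclusive write lock.
func (r *Ring) AddNode(nodeID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.nodes[nodeID] {
		return
	}
	r.nodes[nodeID] = true
	for i := 0; i < virtualNodes; i++ {
		h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s:%d", nodeID, i)))
		r.ring[h] = nodeID
		r.sortedHashes = append(r.sortedHashes, h)
	}
	sort.Slice(r.sortedHashes, func(i, j int) bool { return r.sortedHashes[i] < r.sortedHashes[j] })
}

// GetNode only reads, so it takes the shared read lock and can run
// concurrently with other readers.
func (r *Ring) GetNode(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if len(r.sortedHashes) == 0 {
		return ""
	}
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.sortedHashes), func(i int) bool { return r.sortedHashes[i] >= h })
	if i == len(r.sortedHashes) {
		i = 0 // wrap around the ring
	}
	return r.ring[r.sortedHashes[i]]
}

func main() {
	r := NewRing()
	var wg sync.WaitGroup
	for n := 0; n < 4; n++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			r.AddNode(fmt.Sprintf("node-%d", n))
			_ = r.GetNode("actor-123")
		}(n)
	}
	wg.Wait()
	fmt.Println(r.GetNode("actor-123") != "")
}
```

Running a program like this with `go run -race` is one way to check that concurrent `AddNode` and `GetNode` calls no longer race.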


@@ -1,64 +0,0 @@
# Issue: Implement Actor Migration Between Cluster Nodes
## Problem
When nodes join or leave the cluster, actors need to be migrated to maintain even distribution. Currently:
- `handleRebalanceRequest` in `cluster/manager.go:150` is empty
- `handleMigrationRequest` in `cluster/manager.go:167` is empty
- `RebalanceShards` in `cluster/shard.go:211` returns unchanged map
- `SendMessage` in `cluster/distributed.go:139` ignores sharding
## Required Implementation
### 1. Rebalance Algorithm (cluster/shard.go)
Implement `ConsistentHashPlacement.RebalanceShards` to:
- Calculate new shard assignments based on active nodes
- Identify actors needing migration
- Generate migration plan with source/dest nodes
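
One way to derive a migration plan is to diff the shard ownership before and after a topology change. The sketch below assumes a shard-to-node map on both sides; the `Migration` type and `planMigrations` function are hypothetical names, not part of the existing `cluster` package.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration is a hypothetical plan entry: move one shard between nodes.
type Migration struct {
	ShardID  int
	FromNode string
	ToNode   string
}

// planMigrations lists every shard whose owner differs between the
// current and the desired assignment.
func planMigrations(current, desired map[int]string) []Migration {
	var plan []Migration
	for shard, to := range desired {
		if from, ok := current[shard]; ok && from != to {
			plan = append(plan, Migration{ShardID: shard, FromNode: from, ToNode: to})
		}
	}
	// Sort for a deterministic plan; map iteration order is random.
	sort.Slice(plan, func(i, j int) bool { return plan[i].ShardID < plan[j].ShardID })
	return plan
}

func main() {
	current := map[int]string{0: "node-a", 1: "node-a", 2: "node-b"}
	desired := map[int]string{0: "node-a", 1: "node-c", 2: "node-b"}
	for _, m := range planMigrations(current, desired) {
		fmt.Printf("shard %d: %s -> %s\n", m.ShardID, m.FromNode, m.ToNode)
	}
}
```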
### 2. Migration Coordinator (cluster/manager.go)
Implement `handleRebalanceRequest` to:
- Accept migration plan from leader
- For each actor in plan:
1. Pause incoming messages
2. Capture actor state (replay events up to current version)
3. Serialize state
4. Send migration request to destination node
5. Wait for ack
6. Delete actor from current node
- Track migration status via `ActorMigration.Status`
### 3. Cross-Node Message Routing (cluster/distributed.go)
Implement proper routing in `SendMessage`:
- Use `GetActorNode(actorID)` to determine target node
- If remote: marshal message, send via NATS to target node
- If local: send to local runtime
- Route response back to caller if needed
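
The routing decision can be sketched with the dependencies passed in as plain functions. Everything here is an assumption for illustration: the `Message` envelope, the `router` type and the `aether.node.<id>` subject naming are hypothetical, and `publish` stands in for a NATS publish call.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical envelope.
type Message struct {
	ActorID string `json:"actorId"`
	Body    string `json:"body"`
}

// router holds assumed dependencies as plain functions.
type router struct {
	nodeID       string
	getActorNode func(actorID string) string
	sendLocal    func(Message) error
	publish      func(subject string, data []byte) error
}

// SendMessage delivers locally when this node owns the actor and
// otherwise marshals the message and publishes it to the owning node.
func (r *router) SendMessage(m Message) error {
	target := r.getActorNode(m.ActorID)
	if target == r.nodeID {
		return r.sendLocal(m)
	}
	data, err := json.Marshal(m)
	if err != nil {
		return fmt.Errorf("marshal message: %w", err)
	}
	return r.publish("aether.node."+target, data) // subject name is an assumption
}

func main() {
	r := &router{
		nodeID: "node-a",
		getActorNode: func(id string) string {
			if id == "actor-1" {
				return "node-a"
			}
			return "node-b"
		},
		sendLocal: func(m Message) error { fmt.Println("local:", m.ActorID); return nil },
		publish:   func(s string, d []byte) error { fmt.Println("remote:", s, string(d)); return nil },
	}
	_ = r.SendMessage(Message{ActorID: "actor-1", Body: "hi"})
	_ = r.SendMessage(Message{ActorID: "actor-2", Body: "hi"})
}
```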
## Suggested Approach
1. **Define message types** for actor migration requests/responses in `cluster/types.go`
2. **Implement state capture** - replay events to get current state
3. **Implement state restore** - deserialize and restore actor state
4. **Implement coordinator** - manage migration phases
5. **Add error handling** - handle failed migrations, retries, cleanup
6. **Add tests** - test migration with mock NATS
## Related Files
- `cluster/manager.go:150` - handleRebalanceRequest (empty)
- `cluster/manager.go:167` - handleMigrationRequest (empty)
- `cluster/shard.go:211` - RebalanceShards (stub)
- `cluster/distributed.go:139` - SendMessage (simplified)
- `cluster/types.go:108` - ActorMigration struct
## Acceptance Criteria
- [ ] `RebalanceShards` returns new shard map with actor assignments
- [ ] `handleRebalanceRequest` processes migration plan
- [ ] `handleMigrationRequest` accepts actor migrations
- [ ] `SendMessage` routes to correct node
- [ ] Actors can be migrated with state preserved
- [ ] Failed migrations are handled gracefully
- [ ] Integration test with multi-node cluster


@@ -1,117 +0,0 @@
# Issue: Add Snapshot Support to Event Sourcing Workflow
## Problem
`SnapshotStore` interface is defined but snapshots are not integrated into the event sourcing workflow. This means:
- Actors with many events must replay entire history
- No performance optimization for long-lived actors
- Snapshots exist as API but are not used
## Current State
- `EventStoreWithErrors` in `event.go:235` - no snapshot methods
- `SnapshotStore` interface in `event.go:245` - defined but not widely used
- `JetStreamEventStore.GetLatestSnapshot` and `SaveSnapshot` implemented but not called automatically
- `InMemoryEventStore` has snapshot methods but no lifecycle management
## Required Implementation
### 1. Snapshot Strategy
Define when to create snapshots:
- Fixed interval (e.g., every 100 events)
- Version-based (e.g., every 50 versions)
- Hybrid: version-based with min/max bounds
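
A hybrid policy could be expressed as a small predicate. This is only a sketch: `SnapshotPolicy` and its fields are hypothetical names, and `Interval` is assumed to be greater than zero.

```go
package main

import "fmt"

// SnapshotPolicy is a hypothetical configuration type.
type SnapshotPolicy struct {
	Interval int64 // snapshot when the version is a multiple of Interval (must be > 0)
	MinGap   int64 // never snapshot if fewer versions than this have passed
	MaxGap   int64 // always snapshot once this many versions have passed
}

// ShouldSnapshot applies the min/max bounds first and falls back to the interval.
func (p SnapshotPolicy) ShouldSnapshot(lastSnapshot, current int64) bool {
	gap := current - lastSnapshot
	switch {
	case gap < p.MinGap:
		return false
	case gap >= p.MaxGap:
		return true
	default:
		return current%p.Interval == 0
	}
}

func main() {
	p := SnapshotPolicy{Interval: 50, MinGap: 10, MaxGap: 120}
	fmt.Println(p.ShouldSnapshot(0, 5))     // gap below MinGap
	fmt.Println(p.ShouldSnapshot(0, 50))    // on the interval
	fmt.Println(p.ShouldSnapshot(100, 149)) // off the interval, below MaxGap
	fmt.Println(p.ShouldSnapshot(10, 131))  // gap reached MaxGap
}
```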
### 2. State Capture
Add method to capture actor state:
```go
// CaptureState rebuilds actor state by replaying events and returns it
CaptureState(actorID string, fromVersion int64) (map[string]interface{}, error)
```
### 3. Snapshot Store Extension
Extend `EventStoreWithErrors` to include snapshots:
```go
type EventStoreWithSnapshots interface {
EventStoreWithErrors
GetLatestSnapshot(actorID string) (*ActorSnapshot, error)
SaveSnapshot(snapshot *ActorSnapshot) error
}
```
### 4. Snapshot Workflow
Modify event retrieval to use snapshots:
```go
GetEvents(actorID string, fromVersion int64) ([]*Event, error) {
// 1. Try to get latest snapshot
snapshot, _ := store.GetLatestSnapshot(actorID)
// 2. If snapshot exists and version <= fromVersion:
// - Return events from snapshot version + 1
// 3. Else:
// - Replay all events from version 0
}
```
## Suggested Implementation
### 1. Add CaptureState to EventStore interface
In `event.go`, extend `EventStore` or create `StateStore` interface:
```go
type StateStore interface {
EventStore
CaptureState(actorID string, fromVersion int64) (map[string]interface{}, error)
}
```
### 2. Implement CaptureState
In `store/jetstream.go`:
```go
func (jes *JetStreamEventStore) CaptureState(actorID string, fromVersion int64) (map[string]interface{}, error) {
	// Replay events starting at fromVersion.
	events, err := jes.GetEvents(actorID, fromVersion)
	if err != nil {
		return nil, err
	}
	state := make(map[string]interface{})
	for range events {
		// Application logic is needed here to fold each event into state.
	}
	return state, nil
}
```
### 3. Add Snapshot Helper
Create snapshot utilities:
```go
// CreateSnapshot creates snapshot from state
func CreateSnapshot(actorID string, version int64, state map[string]interface{}) *ActorSnapshot {
return &ActorSnapshot{
ActorID: actorID,
Version: version,
State: state,
Timestamp: time.Now(),
}
}
```
### 4. Modify GetEvents
Update `GetEvents` in both stores to use snapshots when beneficial.
## Snapshots Workflow Example
```
1. Actor has 1000 events
2. Every 100 events, create snapshot
3. Actor reaches version 1000, snapshot at version 1000
4. Request events from version 900:
- Get snapshot at version 1000? No (version too high)
- Replay 900->1000 events (only 100 events)
5. Request events from version 50:
- Get latest snapshot at version 1000? Yes (version > 50)
- Use snapshot as base
- Replay 1000->1000 events (none)
```
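
For comparison, a common way to combine snapshots with replay is to start from the latest snapshot and apply only events with a higher version. The sketch below assumes that approach and uses simplified, hypothetical `Event` and `Snapshot` types (a running total instead of a state map); it is not taken from the repository.

```go
package main

import "fmt"

// Event and Snapshot are simplified stand-ins for the aether types.
type Event struct {
	Version int64
	Delta   int
}

type Snapshot struct {
	Version int64
	Total   int
}

// rebuild starts from the snapshot (if any) and applies only newer events.
// Events are assumed to be ordered by version.
func rebuild(snap *Snapshot, events []Event) (total int, version int64, replayed int) {
	if snap != nil {
		total, version = snap.Total, snap.Version
	}
	for _, e := range events {
		if e.Version <= version {
			continue // already folded into the snapshot
		}
		total += e.Delta
		version = e.Version
		replayed++
	}
	return total, version, replayed
}

func main() {
	var events []Event
	for v := int64(1); v <= 1050; v++ {
		events = append(events, Event{Version: v, Delta: 1})
	}
	snap := &Snapshot{Version: 1000, Total: 1000}

	total, version, replayed := rebuild(snap, events)
	fmt.Println(total, version, replayed) // with snapshot: only 50 events applied

	total, version, replayed = rebuild(nil, events)
	fmt.Println(total, version, replayed) // without snapshot: full replay
}
```

In a real store the events before the snapshot version would not be fetched at all; the loop filters them here only to keep the example self-contained.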
## Acceptance Criteria
- [ ] `CaptureState` method added to event store
- [ ] Snapshots created at configured intervals
- [ ] `GetEvents` uses snapshots to optimize replay
- [ ] Snapshot workflow tested with long-lived actors
- [ ] Configuration for snapshot interval/version
- [ ] Metrics: snapshot count, average replay size


@@ -1,100 +0,0 @@
# Issue: Implement VM/Runtime for Actors
## Problem
Only interfaces exist for `Runtime` and `VirtualMachine` in `cluster/types.go` and `cluster/distributed.go`, but no actual implementation. Actors cannot be created, started, stopped, or have their state managed.
## Required Components
### 1. VM Implementation (cluster/vm.go - new)
```go
type VirtualMachine struct {
actorID string
eventStore aether.EventStore
state map[string]interface{}
version int64
}
```
Methods needed:
- `GetID()`, `GetActorID()`, `GetState()` - already in interface
- `Start()` - replay events to rebuild state
- `ProcessEvent(event *aether.Event)` - apply event to state
- `Stop()` - persist final state
- `GetVersion()` - current event version
### 2. Runtime Implementation (cluster/runtime.go - new)
```go
type Runtime struct {
natsConn *nats.Conn
eventStore aether.EventStore
vmRegistry VMRegistry // map[actorID]*VirtualMachine
config RuntimeConfig
}
```
Methods needed:
- `Start()` - initialize and start processing
- `LoadModel(model eventstorming.Model)` - register domain types
- `SendMessage(message RuntimeMessage)` - route to appropriate VM
- `GetActiveVMs()` - return map of active VMs
- `CreateVM(actorID string)` - create new VM instance
- `StopVM(actorID string)` - persist and stop VM
### 3. Event Processing
- Subscribe to actor's event stream
- Replay events to build initial state
- Apply new events as they arrive
- Handle event versions and conflicts
## Suggested Design
### VM Lifecycle
```
1. Actor message arrives for actor-123
2. Runtime checks if VM exists for actor-123
3. If not, create VM:
- Replay events from event store
- Rebuild state
4. Route message to VM
5. VM processes message -> creates new events
6. Events persisted to event store
7. VM state updated
```
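
Steps 2 and 3 of the lifecycle amount to a get-or-create lookup that replays stored events on first access. A minimal sketch, using hypothetical `Runtime`, `VM` and `eventSource` types rather than the interfaces in `cluster/types.go`:

```go
package main

import (
	"fmt"
	"sync"
)

type Event struct {
	Version int64
	Type    string
}

// eventSource is a hypothetical subset of an event store.
type eventSource interface {
	GetEvents(actorID string, fromVersion int64) ([]Event, error)
}

type VM struct {
	ActorID string
	Version int64
	Applied int
}

// apply folds one event into the VM; real state handling is application-specific.
func (vm *VM) apply(e Event) {
	vm.Version = e.Version
	vm.Applied++
}

type Runtime struct {
	mu    sync.Mutex
	store eventSource
	vms   map[string]*VM
}

// vmFor returns the running VM or creates one by replaying stored events.
func (r *Runtime) vmFor(actorID string) (*VM, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if vm, ok := r.vms[actorID]; ok {
		return vm, nil
	}
	events, err := r.store.GetEvents(actorID, 0)
	if err != nil {
		return nil, fmt.Errorf("replay %s: %w", actorID, err)
	}
	vm := &VM{ActorID: actorID}
	for _, e := range events {
		vm.apply(e)
	}
	r.vms[actorID] = vm
	return vm, nil
}

// memStore is an in-memory eventSource used only for this example.
type memStore map[string][]Event

func (m memStore) GetEvents(actorID string, fromVersion int64) ([]Event, error) {
	var out []Event
	for _, e := range m[actorID] {
		if e.Version >= fromVersion {
			out = append(out, e)
		}
	}
	return out, nil
}

func main() {
	store := memStore{"actor-123": {{Version: 1, Type: "Created"}, {Version: 2, Type: "Updated"}}}
	rt := &Runtime{store: store, vms: make(map[string]*VM)}
	vm, _ := rt.vmFor("actor-123")
	again, _ := rt.vmFor("actor-123")
	fmt.Println(vm.Version, vm.Applied, vm == again)
}
```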
### State Management
- State derived from event replay
- No separate state store needed
- Can snapshot periodically for performance
- Version conflict handling using existing EventStore
## Implementation Steps
1. **Create VM struct** in `cluster/vm.go`
2. **Implement event replay** to rebuild state
3. **Create Runtime** in `cluster/runtime.go`
4. **Register Runtime with cluster** via `SetVMProvider`
5. **Implement message processing** - validate against model
6. **Add version conflict handling** using existing EventStore
7. **Write tests** - mock event store, test state transitions
## File Structure
```
cluster/
├── vm.go # VirtualMachine implementation
├── runtime.go # Runtime implementation
├── vm_test.go # VM tests
├── runtime_test.go # Runtime tests
└── integration_test.go # Integration tests
```
## Acceptance Criteria
- [ ] VM can be created with actor ID
- [ ] VM replays events to build state
- [ ] VM processes events and updates state
- [ ] VM persists current version
- [ ] Runtime can create/stop VMs
- [ ] Runtime manages VM registry
- [ ] Integration test with NATS and JetStream

106
AGENTS.md

@@ -1,106 +0,0 @@
# Aether
**Distributed event sourcing primitives for Go, powered by NATS.**
---
## Development Commands
```bash
make build # go build ./...
make test # go test ./...
make lint # golangci-lint run
make clean # go clean
```
## NATS Server Requirement
Integration tests require NATS with JetStream enabled:
```bash
brew install nats-server
nats-server -js
```
Run tests in a separate terminal after starting NATS.
## Project Structure
```
aether/
├── event.go # Event, ActorSnapshot, EventStore interface
├── eventbus.go # EventBus, EventBroadcaster interface
├── nats_eventbus.go # NATSEventBus implementation
├── metrics*.go # Prometheus metrics
├── store/ # EventStore implementations
│ ├── memory.go # InMemoryEventStore (testing)
│ └── jetstream.go # JetStreamEventStore (production)
├── cluster/ # Cluster management
│ ├── manager.go # ClusterManager
│ ├── discovery.go # NodeDiscovery
│ ├── hashring.go # ConsistentHashRing
│ ├── shard.go # ShardManager
│ ├── leader.go # LeaderElection
│ └── types.go # Cluster types
├── examples/ # Usage examples
└── eventstorming/ # Domain modeling reference
```
## Core Patterns
### Event Versioning
Events for each actor must have monotonically increasing versions:
```go
currentVersion, _ := store.GetLatestVersion(actorID)
event := &aether.Event{
ActorID: actorID,
Version: currentVersion + 1,
// ...
}
err := store.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
// Reload and retry
}
```
### Namespace Isolation
Namespaces provide logical boundaries for events:
```go
// Event bus namespace
ch := eventBus.Subscribe("tenant-abc")
eventBus.Publish("tenant-abc", event)
// Store namespace
store, _ := store.NewJetStreamEventStoreWithNamespace(natsConn, "events", "tenant-abc")
```
Namespaces sanitize special characters and prefix stream names for complete data isolation.
### JetStream Cache Behavior
`JetStreamEventStore` caches actor versions for performance. Cache is invalidated when `GetLatestVersion` detects a newer version from external writes.
## Testing
- Unit tests: `go test -v ./...`
- Single test: `go test -v -run TestName`
- Single package: `go test -v ./store/...`
- Benchmarks: `go test -bench=. -benchmem`
Integration tests require a running NATS server.
## Linting
```bash
golangci-lint run
golangci-lint run --fix
```
## References
- [vision.md](./vision.md) - Product vision and principles
- [examples/README.md](./examples/README.md) - Example patterns

160
CLAUDE.md

@@ -0,0 +1,160 @@
# Aether
Distributed actor system with event sourcing for Go, powered by NATS.
## Organization Context
This repo is part of Flowmade. See:
- [Organization manifesto](https://git.flowmade.one/flowmade-one/architecture/src/branch/main/manifesto.md) - who we are, what we believe
- [Repository map](https://git.flowmade.one/flowmade-one/architecture/src/branch/main/repos.md) - how this fits in the bigger picture
- [Vision](./vision.md) - what this specific product does
## Setup
```bash
git clone git@git.flowmade.one:flowmade-one/aether.git
cd aether
go mod download
```
Requires NATS server for integration tests:
```bash
# Install NATS
brew install nats-server
# Run with JetStream enabled
nats-server -js
```
## Project Structure
```
aether/
├── event.go # Event, ActorSnapshot, EventStore interface
├── eventbus.go # EventBus, EventBroadcaster interface
├── nats_eventbus.go # NATSEventBus - cross-node event broadcasting
├── store/
│ ├── memory.go # InMemoryEventStore (testing)
│ └── jetstream.go # JetStreamEventStore (production)
├── cluster/
│ ├── manager.go # ClusterManager
│ ├── discovery.go # NodeDiscovery
│ ├── hashring.go # ConsistentHashRing
│ ├── shard.go # ShardManager
│ ├── leader.go # LeaderElection
│ └── types.go # Cluster types
└── model/
└── model.go # EventStorming model types
```
## Development
```bash
make build # Build the library
make test # Run tests
make lint # Run linters
```
## Architecture
### Event Sourcing
Events are the source of truth. State is derived by replaying events.
```go
// Create an event
event := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderPlaced",
ActorID: "order-123",
Version: 1,
Data: map[string]interface{}{"total": 100.00},
Timestamp: time.Now(),
}
// Persist to event store
store.SaveEvent(event)
// Replay events to rebuild state
events, _ := store.GetEvents("order-123", 0)
```
### Event Versioning
Events for each actor must have **monotonically increasing versions**. This ensures event stream integrity and enables optimistic concurrency control.
#### Version Semantics
- Each actor has an independent version sequence
- Version must be strictly greater than the current latest version
- For new actors (no events), the first event must have version > 0
- Non-consecutive versions are allowed (gaps are permitted)
#### Optimistic Concurrency Pattern
```go
// 1. Get current version
currentVersion, _ := store.GetLatestVersion("order-123")
// 2. Create event with next version
event := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderUpdated",
ActorID: "order-123",
Version: currentVersion + 1,
Data: map[string]interface{}{"status": "shipped"},
Timestamp: time.Now(),
}
// 3. Attempt to save
err := store.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
	// Another writer won - reload and retry if appropriate
	var versionErr *aether.VersionConflictError
	if errors.As(err, &versionErr) {
		log.Printf("Conflict: actor %s has version %d, attempted %d",
			versionErr.ActorID, versionErr.CurrentVersion, versionErr.AttemptedVersion)
	}
}
```
#### Error Types
- `ErrVersionConflict` - Sentinel error for version conflicts (use with `errors.Is`)
- `VersionConflictError` - Detailed error with ActorID, CurrentVersion, and AttemptedVersion
### Namespace Isolation
Namespaces provide logical boundaries for events and subscriptions:
```go
// Subscribe to events in a namespace
ch := eventBus.Subscribe("tenant-abc")
// Events are isolated per namespace
eventBus.Publish("tenant-abc", event) // Only tenant-abc subscribers see this
```
### Clustering
Aether handles node discovery, leader election, and shard distribution:
```go
// Create cluster manager
manager := cluster.NewClusterManager(natsConn, nodeID)
// Join cluster
manager.Start()
// Leader election happens automatically
if manager.IsLeader() {
// Coordinate shard assignments
}
```
## Key Patterns
- **Events are immutable** - Never modify, only append
- **Versions are monotonic** - Each event must have version > previous for same actor
- **Snapshots for performance** - Periodically snapshot state to avoid full replay
- **Namespaces for isolation** - Not multi-tenancy, just logical boundaries
- **NATS for everything** - Events, pub/sub, clustering all use NATS

169
README.md

@@ -1,169 +0,0 @@
# Aether
[![CI](https://git.flowmade.one/flowmade-one/aether/actions/workflows/ci.yml/badge.svg)](https://git.flowmade.one/flowmade-one/aether/actions/workflows/ci.yml)
Event sourcing primitives for Go, powered by NATS.
Aether provides composable building blocks for distributed, event-sourced systems without imposing framework opinions on your domain.
## Why Aether?
Building distributed, event-sourced systems in Go requires assembling many pieces: event storage, pub/sub, clustering, leader election. Existing solutions are either too heavy (full frameworks with opinions about your domain), too light (just pub/sub), or not NATS-native.
Aether provides clear primitives that compose well:
- **Event sourcing primitives** - Event, EventStore interface, snapshots
- **Event stores** - In-memory (testing) and JetStream (production)
- **Event bus** - Local and NATS-backed pub/sub with namespace isolation
- **Cluster management** - Node discovery, leader election, shard distribution
Built for JetStream from the ground up, not bolted on.
## Installation
```bash
go get git.flowmade.one/flowmade-one/aether
```
Requires Go 1.23 or later.
## Quick Start
Here is a minimal example showing event sourcing fundamentals: creating events, saving them to a store, and replaying to rebuild state.
```go
package main
import (
"fmt"
"time"
"github.com/google/uuid"
"git.flowmade.one/flowmade-one/aether"
"git.flowmade.one/flowmade-one/aether/store"
)
func main() {
// Create an in-memory event store (use JetStream for production)
eventStore := store.NewInMemoryEventStore()
// Create and save events
// Error handling omitted for brevity
orderID := "order-123"
orderPlaced := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderPlaced",
ActorID: orderID,
Version: 1,
Data: map[string]interface{}{"total": 99.99, "items": 3},
Timestamp: time.Now(),
}
eventStore.SaveEvent(orderPlaced)
orderShipped := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderShipped",
ActorID: orderID,
Version: 2,
Data: map[string]interface{}{"carrier": "FastShip", "tracking": "FS123456"},
Timestamp: time.Now(),
}
eventStore.SaveEvent(orderShipped)
// Replay events to rebuild state
events, _ := eventStore.GetEvents(orderID, 0)
state := make(map[string]interface{})
for _, event := range events {
switch event.EventType {
case "OrderPlaced":
state["total"] = event.Data["total"]
state["items"] = event.Data["items"]
state["status"] = "placed"
case "OrderShipped":
state["status"] = "shipped"
state["carrier"] = event.Data["carrier"]
state["tracking"] = event.Data["tracking"]
}
}
fmt.Printf("Order state after replaying %d events:\n", len(events))
fmt.Printf(" Status: %s\n", state["status"])
fmt.Printf(" Total: $%.2f\n", state["total"])
fmt.Printf(" Tracking: %s\n", state["tracking"])
}
```
Output:
```
Order state after replaying 2 events:
Status: shipped
Total: $99.99
Tracking: FS123456
```
## Key Concepts
### Events are immutable
Events represent facts about what happened. Once saved, they are never modified or deleted - you only append new events. This immutability guarantee is enforced at multiple levels:
**Interface Design**: The `EventStore` interface provides no Update or Delete methods. Only `SaveEvent` (append), `GetEvents` (read), and `GetLatestVersion` (read) are available.
**JetStream Storage**: When using `JetStreamEventStore`, events are stored in a NATS JetStream stream configured with:
- File-based storage (durable)
- Limits-based retention policy (events expire after configured duration, not before)
- No mechanism to modify or delete individual events during their lifetime
**Audit Trail Guarantee**: Because events are immutable once persisted, they serve as a trustworthy audit trail. You can rely on the fact that historical events won't change, enabling compliance and forensics.
To correct a mistake, append a new event that expresses the correction rather than modifying history:
```go
// Wrong: Cannot update an event
// store.UpdateEvent(eventID, newData) // This method doesn't exist
// Right: Append a new event that corrects the record
correctionEvent := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderCorrected",
ActorID: orderID,
Version: currentVersion + 1,
Data: map[string]interface{}{"reason": "price adjustment"},
Timestamp: time.Now(),
}
err := store.SaveEvent(correctionEvent)
```
### State is derived
Current state is always derived by replaying events. This gives you a complete audit trail and the ability to rebuild state at any point in time.
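
Rebuilding state "at any point in time" can be done by stopping the replay at a chosen version. A minimal sketch, using a simplified local `Event` type and a hypothetical `stateAt` helper instead of the real `aether.Event`:

```go
package main

import "fmt"

// Event is a simplified stand-in for aether.Event.
type Event struct {
	Version   int64
	EventType string
	Data      map[string]interface{}
}

// stateAt folds events up to and including the given version.
// Events are assumed to be ordered by version.
func stateAt(events []Event, upTo int64) map[string]interface{} {
	state := make(map[string]interface{})
	for _, e := range events {
		if e.Version > upTo {
			break
		}
		switch e.EventType {
		case "OrderPlaced":
			state["status"] = "placed"
			state["total"] = e.Data["total"]
		case "OrderShipped":
			state["status"] = "shipped"
		}
	}
	return state
}

func main() {
	events := []Event{
		{Version: 1, EventType: "OrderPlaced", Data: map[string]interface{}{"total": 99.99}},
		{Version: 2, EventType: "OrderShipped", Data: map[string]interface{}{"carrier": "FastShip"}},
	}
	fmt.Println(stateAt(events, 1)["status"], stateAt(events, 2)["status"])
}
```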
### Versions ensure consistency
Each event for an actor must have a strictly increasing version number. This enables optimistic concurrency control:
```go
currentVersion, _ := eventStore.GetLatestVersion(actorID)
event := &aether.Event{
ActorID: actorID,
Version: currentVersion + 1,
// ...
}
err := eventStore.SaveEvent(event)
if errors.Is(err, aether.ErrVersionConflict) {
// Another writer saved first - reload and retry
}
```
## Documentation
- [Vision](./vision.md) - Product vision and design principles
- [CLAUDE.md](./CLAUDE.md) - Development guide and architecture details
## License
See [LICENSE](./LICENSE) for details.


@@ -44,4 +44,5 @@
 // - Leader election ensures coordination continues despite node failures
 // - Actor migration allows rebalancing when cluster topology changes
 // - Graceful shutdown with proper resource cleanup
-package cluster
+//
+package cluster


@@ -1,125 +0,0 @@
package cluster
import (
"testing"
)
func TestDefaultHashRingConfig(t *testing.T) {
config := DefaultHashRingConfig()
if config.VirtualNodes != DefaultVirtualNodes {
t.Errorf("expected VirtualNodes=%d, got %d", DefaultVirtualNodes, config.VirtualNodes)
}
}
func TestDefaultShardConfig(t *testing.T) {
config := DefaultShardConfig()
if config.ShardCount != DefaultNumShards {
t.Errorf("expected ShardCount=%d, got %d", DefaultNumShards, config.ShardCount)
}
if config.ReplicationFactor != 1 {
t.Errorf("expected ReplicationFactor=1, got %d", config.ReplicationFactor)
}
}
func TestNewConsistentHashRingWithConfig(t *testing.T) {
t.Run("custom virtual nodes", func(t *testing.T) {
config := HashRingConfig{VirtualNodes: 50}
ring := NewConsistentHashRingWithConfig(config)
ring.AddNode("test-node")
if len(ring.sortedHashes) != 50 {
t.Errorf("expected 50 virtual nodes, got %d", len(ring.sortedHashes))
}
if ring.GetVirtualNodes() != 50 {
t.Errorf("expected GetVirtualNodes()=50, got %d", ring.GetVirtualNodes())
}
})
t.Run("zero value uses default", func(t *testing.T) {
config := HashRingConfig{VirtualNodes: 0}
ring := NewConsistentHashRingWithConfig(config)
ring.AddNode("test-node")
if len(ring.sortedHashes) != DefaultVirtualNodes {
t.Errorf("expected %d virtual nodes, got %d", DefaultVirtualNodes, len(ring.sortedHashes))
}
})
t.Run("default constructor uses default config", func(t *testing.T) {
ring := NewConsistentHashRing()
ring.AddNode("test-node")
if len(ring.sortedHashes) != DefaultVirtualNodes {
t.Errorf("expected %d virtual nodes, got %d", DefaultVirtualNodes, len(ring.sortedHashes))
}
})
}
func TestNewShardManagerWithConfig(t *testing.T) {
t.Run("custom shard count", func(t *testing.T) {
config := ShardConfig{ShardCount: 256, ReplicationFactor: 2}
sm := NewShardManagerWithConfig(config)
if sm.GetShardCount() != 256 {
t.Errorf("expected shard count 256, got %d", sm.GetShardCount())
}
if sm.GetReplicationFactor() != 2 {
t.Errorf("expected replication factor 2, got %d", sm.GetReplicationFactor())
}
})
t.Run("zero values use defaults", func(t *testing.T) {
config := ShardConfig{ShardCount: 0, ReplicationFactor: 0}
sm := NewShardManagerWithConfig(config)
if sm.GetShardCount() != DefaultNumShards {
t.Errorf("expected shard count %d, got %d", DefaultNumShards, sm.GetShardCount())
}
if sm.GetReplicationFactor() != 1 {
t.Errorf("expected replication factor 1, got %d", sm.GetReplicationFactor())
}
})
t.Run("legacy constructor still works", func(t *testing.T) {
sm := NewShardManager(512, 3)
if sm.GetShardCount() != 512 {
t.Errorf("expected shard count 512, got %d", sm.GetShardCount())
}
if sm.GetReplicationFactor() != 3 {
t.Errorf("expected replication factor 3, got %d", sm.GetReplicationFactor())
}
})
}
func TestShardManagerGetShard_DifferentShardCounts(t *testing.T) {
testCases := []struct {
shardCount int
}{
{shardCount: 16},
{shardCount: 64},
{shardCount: 256},
{shardCount: 1024},
{shardCount: 4096},
}
for _, tc := range testCases {
t.Run("shardCount="+string(rune(tc.shardCount)), func(t *testing.T) {
sm := NewShardManagerWithConfig(ShardConfig{ShardCount: tc.shardCount})
// Verify all actor IDs map to valid shard range
for i := 0; i < 1000; i++ {
actorID := "actor-" + string(rune(i))
shard := sm.GetShard(actorID)
if shard < 0 || shard >= tc.shardCount {
t.Errorf("shard %d out of range [0, %d)", shard, tc.shardCount)
}
}
})
}
}
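
One detail in the removed test is worth noting: `string(rune(n))` yields the character with code point `n`, not its decimal digits, so the subtest names and actor IDs above contain arbitrary characters. The values stay distinct, so the test still passes, but the names are hard to read. A small sketch of the difference, with a hypothetical `subtestName` helper:

```go
package main

import (
	"fmt"
	"strconv"
)

// subtestName builds a readable name from the decimal form of n.
func subtestName(n int) string {
	return "shardCount=" + strconv.Itoa(n)
}

// runeName reproduces the conversion used in the removed test.
func runeName(n int) string {
	return "shardCount=" + string(rune(n))
}

func main() {
	fmt.Printf("%q\n", subtestName(64))
	fmt.Printf("%q\n", runeName(64))
}
```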


@@ -115,4 +115,4 @@ func (nd *NodeDiscovery) announceNode(updateType NodeUpdateType) {
 // Stop gracefully stops the node discovery service
 func (nd *NodeDiscovery) Stop() {
 	nd.announceNode(NodeLeft)
 }


@@ -12,7 +12,7 @@ import (
 type DistributedVM struct {
 	nodeID string
 	cluster *ClusterManager
-	localRuntime Runtime
+	localRuntime Runtime // Interface to avoid import cycles
 	sharding *ShardManager
 	discovery *NodeDiscovery
 	natsConn *nats.Conn
@@ -20,29 +20,17 @@ type DistributedVM struct {
 	cancel context.CancelFunc
 }
 
-// Runtime defines the interface for a local runtime that executes actors.
-// This interface decouples the cluster package from specific runtime implementations.
+// Runtime interface to avoid import cycles with main aether package
 type Runtime interface {
-	// Start initializes and starts the runtime
 	Start() error
-	// LoadModel loads an EventStorming model into the runtime
-	LoadModel(model RuntimeModel) error
-	// SendMessage sends a message to an actor in the runtime
-	SendMessage(message RuntimeMessage) error
+	LoadModel(model interface{}) error
+	SendMessage(message interface{}) error
 }
 
-// DistributedVMRegistry implements VMRegistry using DistributedVM's local runtime and sharding.
-// It provides the cluster manager with access to VM information without import cycles.
+// DistributedVMRegistry implements VMRegistry using DistributedVM's local runtime and sharding
 type DistributedVMRegistry struct {
-	vmProvider VMProvider
+	runtime interface{} // Runtime interface to avoid import cycles
 	sharding *ShardManager
 }
-
-// VMProvider defines an interface for accessing VMs from a runtime.
-// This is used by DistributedVMRegistry to get VM information.
-type VMProvider interface {
-	// GetActiveVMs returns a map of actor IDs to their VirtualMachine instances
-	GetActiveVMs() map[string]VirtualMachine
-}
 
 // NewDistributedVM creates a distributed VM runtime cluster node
@@ -79,17 +67,14 @@ func NewDistributedVM(nodeID string, natsURLs []string, localRuntime Runtime) (*
 		cancel: cancel,
 	}
 
-	return dvm, nil
-}
-
-// SetVMProvider sets the VM provider for the distributed VM registry.
-// This should be called after the runtime is fully initialized.
-func (dvm *DistributedVM) SetVMProvider(provider VMProvider) {
+	// Create VM registry and connect it to cluster manager
 	vmRegistry := &DistributedVMRegistry{
-		vmProvider: provider,
-		sharding: dvm.sharding,
+		runtime: localRuntime,
+		sharding: sharding,
 	}
-	dvm.cluster.SetVMRegistry(vmRegistry)
+	cluster.SetVMRegistry(vmRegistry)
+
+	return dvm, nil
 }
 
 // Start begins the distributed VM cluster node
@@ -118,7 +103,7 @@ func (dvm *DistributedVM) Stop() {
 }
 
 // LoadModel distributes EventStorming model across the cluster with VM templates
-func (dvm *DistributedVM) LoadModel(model RuntimeModel) error {
+func (dvm *DistributedVM) LoadModel(model interface{}) error {
 	// Load model locally first
 	if err := dvm.localRuntime.LoadModel(model); err != nil {
 		return fmt.Errorf("failed to load model locally: %w", err)
@@ -136,7 +121,7 @@ func (dvm *DistributedVM) LoadModel(model RuntimeModel) error {
 }
 
 // SendMessage routes messages across the distributed cluster
-func (dvm *DistributedVM) SendMessage(message RuntimeMessage) error {
+func (dvm *DistributedVM) SendMessage(message interface{}) error {
 	// This is a simplified implementation
 	// In practice, this would determine the target node based on sharding
 	// and route the message appropriately
@@ -177,29 +162,15 @@ func (dvm *DistributedVM) handleClusterMessage(msg *nats.Msg) {
 	switch clusterMsg.Type {
 	case "load_model":
 		// Handle model loading from other nodes
-		// Re-marshal and unmarshal to convert map[string]interface{} to concrete type
-		payloadBytes, err := json.Marshal(clusterMsg.Payload)
-		if err != nil {
-			return
+		if model := clusterMsg.Payload; model != nil {
+			dvm.localRuntime.LoadModel(model)
 		}
-		var model ModelPayload
-		if err := json.Unmarshal(payloadBytes, &model); err != nil {
-			return
-		}
-		dvm.localRuntime.LoadModel(&model)
 
 	case "route_message":
 		// Handle message routing from other nodes
-		// Re-marshal and unmarshal to convert map[string]interface{} to concrete type
-		payloadBytes, err := json.Marshal(clusterMsg.Payload)
-		if err != nil {
-			return
+		if message := clusterMsg.Payload; message != nil {
+			dvm.localRuntime.SendMessage(message)
 		}
-		var message MessagePayload
-		if err := json.Unmarshal(payloadBytes, &message); err != nil {
-			return
-		}
-		dvm.localRuntime.SendMessage(&message)
 
 	case "rebalance":
 		// Handle shard rebalancing requests
@@ -229,23 +200,22 @@ func (dvm *DistributedVM) GetClusterInfo() map[string]interface{} {
nodes := dvm.cluster.GetNodes() nodes := dvm.cluster.GetNodes()
return map[string]interface{}{ return map[string]interface{}{
"nodeId": dvm.nodeID, "nodeId": dvm.nodeID,
"isLeader": dvm.cluster.IsLeader(), "isLeader": dvm.cluster.IsLeader(),
"leader": dvm.cluster.GetLeader(), "leader": dvm.cluster.GetLeader(),
"nodeCount": len(nodes), "nodeCount": len(nodes),
"nodes": nodes, "nodes": nodes,
} }
} }
// GetActiveVMs returns a map of active VMs from the VM provider // GetActiveVMs returns a map of active VMs (implementation depends on runtime)
func (dvr *DistributedVMRegistry) GetActiveVMs() map[string]VirtualMachine { func (dvr *DistributedVMRegistry) GetActiveVMs() map[string]interface{} {
if dvr.vmProvider == nil { // This would need to access the actual runtime's VM registry
return make(map[string]VirtualMachine) // For now, return empty map to avoid import cycles
} return make(map[string]interface{})
return dvr.vmProvider.GetActiveVMs()
} }
// GetShard returns the shard number for the given actor ID // GetShard returns the shard number for the given actor ID
func (dvr *DistributedVMRegistry) GetShard(actorID string) int { func (dvr *DistributedVMRegistry) GetShard(actorID string) int {
return dvr.sharding.GetShard(actorID) return dvr.sharding.GetShard(actorID)
} }
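
Note on the payload handling above: the left-hand column re-marshals `clusterMsg.Payload` and unmarshals it into a concrete type, because `encoding/json` decodes an `interface{}` field into `map[string]interface{}`. The following is a minimal sketch of that round-trip, not the project's code. The `MessagePayload` fields and the `decodePayload` helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MessagePayload is a hypothetical stand-in for the concrete payload type;
// the real field set is not shown in the diff.
type MessagePayload struct {
	ActorID string `json:"actorId"`
	Body    string `json:"body"`
}

// decodePayload converts a loosely typed payload (typically a
// map[string]interface{}) into a MessagePayload via a JSON round-trip.
func decodePayload(payload interface{}) (*MessagePayload, error) {
	if payload == nil {
		return nil, fmt.Errorf("nil payload")
	}
	raw, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("marshal payload: %w", err)
	}
	var msg MessagePayload
	if err := json.Unmarshal(raw, &msg); err != nil {
		return nil, fmt.Errorf("unmarshal payload: %w", err)
	}
	return &msg, nil
}

func main() {
	// This is the shape encoding/json produces when decoding into interface{}.
	generic := map[string]interface{}{"actorId": "actor-123", "body": "hello"}
	msg, err := decodePayload(generic)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(msg.ActorID, msg.Body)
}
```

Unlike the handlers in the diff, which return silently on failure, the sketch returns the error so the caller can decide whether to log it.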


@@ -14,24 +14,13 @@ type ConsistentHashRing struct {
ring map[uint32]string // hash -> node ID ring map[uint32]string // hash -> node ID
sortedHashes []uint32 // sorted hash keys sortedHashes []uint32 // sorted hash keys
nodes map[string]bool // active nodes nodes map[string]bool // active nodes
virtualNodes int // number of virtual nodes per physical node
} }
// NewConsistentHashRing creates a new consistent hash ring with default configuration // NewConsistentHashRing creates a new consistent hash ring
func NewConsistentHashRing() *ConsistentHashRing { func NewConsistentHashRing() *ConsistentHashRing {
return NewConsistentHashRingWithConfig(DefaultHashRingConfig())
}
// NewConsistentHashRingWithConfig creates a new consistent hash ring with custom configuration
func NewConsistentHashRingWithConfig(config HashRingConfig) *ConsistentHashRing {
virtualNodes := config.VirtualNodes
if virtualNodes == 0 {
virtualNodes = DefaultVirtualNodes
}
return &ConsistentHashRing{ return &ConsistentHashRing{
ring: make(map[uint32]string), ring: make(map[uint32]string),
nodes: make(map[string]bool), nodes: make(map[string]bool),
virtualNodes: virtualNodes,
} }
} }
@@ -47,7 +36,7 @@ func (chr *ConsistentHashRing) AddNode(nodeID string) {
chr.nodes[nodeID] = true chr.nodes[nodeID] = true
// Add virtual nodes for better distribution // Add virtual nodes for better distribution
for i := 0; i < chr.virtualNodes; i++ { for i := 0; i < VirtualNodes; i++ {
virtualKey := fmt.Sprintf("%s:%d", nodeID, i) virtualKey := fmt.Sprintf("%s:%d", nodeID, i)
hash := chr.hash(virtualKey) hash := chr.hash(virtualKey)
chr.ring[hash] = nodeID chr.ring[hash] = nodeID
@@ -131,8 +120,3 @@ func (chr *ConsistentHashRing) IsEmpty() bool {
return len(chr.nodes) == 0 return len(chr.nodes) == 0
} }
// GetVirtualNodes returns the number of virtual nodes per physical node
func (chr *ConsistentHashRing) GetVirtualNodes() int {
return chr.virtualNodes
}
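
Note on the ring above: the commit message describes guarding the ring with a `sync.RWMutex`, taking the write lock in `AddNode`/`RemoveNode` and the read lock in lookups. The sketch below shows that locking pattern on a reduced ring with virtual nodes. It is an illustration under assumptions, not the repository's implementation: the `ring` type, the FNV-1a hash, and the lowercase `virtualNodes` constant are stand-ins (the value 150 comes from the test comment in this diff, and the real hash function is not shown).

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"sync"
)

const virtualNodes = 150 // value taken from the test comment; assumed here

// ring is a reduced, hypothetical version of a consistent hash ring.
type ring struct {
	mu     sync.RWMutex
	owners map[uint32]string // hash -> node ID
	sorted []uint32          // sorted hash keys
	nodes  map[string]bool   // active nodes
}

func newRing() *ring {
	return &ring{owners: make(map[uint32]string), nodes: make(map[string]bool)}
}

// hashKey uses FNV-1a as an assumed hash; the real ring may use another.
func hashKey(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

// AddNode takes the write lock because it mutates all three fields.
func (r *ring) AddNode(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.nodes[id] {
		return
	}
	r.nodes[id] = true
	for i := 0; i < virtualNodes; i++ {
		h := hashKey(fmt.Sprintf("%s:%d", id, i))
		if _, taken := r.owners[h]; !taken {
			r.sorted = append(r.sorted, h)
		}
		r.owners[h] = id
	}
	sort.Slice(r.sorted, func(a, b int) bool { return r.sorted[a] < r.sorted[b] })
}

// GetNode takes only the read lock, so lookups can run concurrently.
func (r *ring) GetNode(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if len(r.sorted) == 0 {
		return ""
	}
	h := hashKey(key)
	i := sort.Search(len(r.sorted), func(i int) bool { return r.sorted[i] >= h })
	if i == len(r.sorted) {
		i = 0 // wrap around to the first hash on the ring
	}
	return r.owners[r.sorted[i]]
}

func main() {
	r := newRing()
	r.AddNode("node-1")
	r.AddNode("node-2")

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			key := fmt.Sprintf("actor-%d", n)
			fmt.Println(key, "->", r.GetNode(key))
		}(i)
	}
	wg.Wait()
}
```

Running a variant of this under `go test -race` is one way to check that no unguarded access remains.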


@@ -42,7 +42,7 @@ func TestAddNode(t *testing.T) {
} }
// Verify virtual nodes were added // Verify virtual nodes were added
expectedVirtualNodes := DefaultVirtualNodes expectedVirtualNodes := VirtualNodes
if len(ring.sortedHashes) != expectedVirtualNodes { if len(ring.sortedHashes) != expectedVirtualNodes {
t.Errorf("expected %d virtual nodes, got %d", expectedVirtualNodes, len(ring.sortedHashes)) t.Errorf("expected %d virtual nodes, got %d", expectedVirtualNodes, len(ring.sortedHashes))
} }
@@ -86,7 +86,7 @@ func TestAddNode_MultipleNodes(t *testing.T) {
t.Errorf("expected 3 nodes, got %d", len(nodes)) t.Errorf("expected 3 nodes, got %d", len(nodes))
} }
expectedHashes := DefaultVirtualNodes * 3 expectedHashes := VirtualNodes * 3
if len(ring.sortedHashes) != expectedHashes { if len(ring.sortedHashes) != expectedHashes {
t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes)) t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes))
} }
@@ -118,7 +118,7 @@ func TestRemoveNode(t *testing.T) {
} }
// Verify virtual nodes were removed // Verify virtual nodes were removed
expectedHashes := DefaultVirtualNodes expectedHashes := VirtualNodes
if len(ring.sortedHashes) != expectedHashes { if len(ring.sortedHashes) != expectedHashes {
t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes)) t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes))
} }
@@ -321,7 +321,7 @@ func TestRingBehavior_ManyNodes(t *testing.T) {
} }
// Verify virtual nodes count // Verify virtual nodes count
expectedHashes := numNodes * DefaultVirtualNodes expectedHashes := numNodes * VirtualNodes
if len(ring.sortedHashes) != expectedHashes { if len(ring.sortedHashes) != expectedHashes {
t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes)) t.Errorf("expected %d virtual nodes, got %d", expectedHashes, len(ring.sortedHashes))
} }
@@ -355,7 +355,7 @@ func TestRingBehavior_ManyNodes(t *testing.T) {
} }
} }
func TestDefaultVirtualNodes_ImproveDistribution(t *testing.T) { func TestVirtualNodes_ImproveDistribution(t *testing.T) {
// Test that virtual nodes actually improve distribution // Test that virtual nodes actually improve distribution
// by comparing with a theoretical single-hash-per-node scenario // by comparing with a theoretical single-hash-per-node scenario
@@ -386,7 +386,7 @@ func TestDefaultVirtualNodes_ImproveDistribution(t *testing.T) {
stdDev := math.Sqrt(sumSquaredDiff / float64(numNodes)) stdDev := math.Sqrt(sumSquaredDiff / float64(numNodes))
coefficientOfVariation := stdDev / expectedPerNode coefficientOfVariation := stdDev / expectedPerNode
// With DefaultVirtualNodes=150, we expect good distribution // With VirtualNodes=150, we expect good distribution
// Coefficient of variation should be low (< 15%) // Coefficient of variation should be low (< 15%)
if coefficientOfVariation > 0.15 { if coefficientOfVariation > 0.15 {
t.Errorf("distribution has high coefficient of variation: %.2f%% (expected < 15%%)", t.Errorf("distribution has high coefficient of variation: %.2f%% (expected < 15%%)",
@@ -394,8 +394,8 @@ func TestDefaultVirtualNodes_ImproveDistribution(t *testing.T) {
} }
// Verify that the actual number of virtual nodes matches expected // Verify that the actual number of virtual nodes matches expected
if len(ring.sortedHashes) != numNodes*DefaultVirtualNodes { if len(ring.sortedHashes) != numNodes*VirtualNodes {
t.Errorf("expected %d virtual node hashes, got %d", numNodes*DefaultVirtualNodes, len(ring.sortedHashes)) t.Errorf("expected %d virtual node hashes, got %d", numNodes*VirtualNodes, len(ring.sortedHashes))
} }
} }
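
Note on the distribution test above: it divides the population standard deviation of per-node counts by the expected count per node and requires the result to stay below 0.15. A small standalone sketch of that calculation follows. The helper name `coefficientOfVariation` and the sample counts are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// coefficientOfVariation returns stdDev/mean for the given per-node counts,
// using the population standard deviation (dividing by the number of nodes).
func coefficientOfVariation(counts []int) float64 {
	if len(counts) == 0 {
		return 0
	}
	total := 0
	for _, c := range counts {
		total += c
	}
	mean := float64(total) / float64(len(counts))
	if mean == 0 {
		return 0
	}
	var sumSquaredDiff float64
	for _, c := range counts {
		d := float64(c) - mean
		sumSquaredDiff += d * d
	}
	return math.Sqrt(sumSquaredDiff/float64(len(counts))) / mean
}

func main() {
	// Hypothetical key counts for four nodes.
	counts := []int{2400, 2600, 2550, 2450}
	cv := coefficientOfVariation(counts)
	fmt.Printf("coefficient of variation: %.2f%%\n", cv*100)
	fmt.Println("within 15% threshold:", cv <= 0.15)
}
```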


@@ -44,8 +44,8 @@ func NewLeaderElection(nodeID string, natsConn *nats.Conn, callbacks LeaderElect
Bucket: "aether-leader-election", Bucket: "aether-leader-election",
Description: "Aether cluster leader election coordination", Description: "Aether cluster leader election coordination",
TTL: LeaderLeaseTimeout * 2, // Auto-cleanup expired leases TTL: LeaderLeaseTimeout * 2, // Auto-cleanup expired leases
MaxBytes: 1024 * 1024, // 1MB max MaxBytes: 1024 * 1024, // 1MB max
Replicas: 1, // Single replica for simplicity Replicas: 1, // Single replica for simplicity
}) })
if err != nil { if err != nil {
// Try to get existing KV store // Try to get existing KV store
@@ -411,4 +411,4 @@ func (le *LeaderElection) updateCurrentLeader(leaderID string, term uint64) {
le.callbacks.OnNewLeader(leaderID) le.callbacks.OnNewLeader(leaderID)
} }
} }
} }


@@ -12,28 +12,25 @@ import (
"github.com/nats-io/nats.go" "github.com/nats-io/nats.go"
) )
// VMRegistry provides access to local VM information for cluster operations. // VMRegistry provides access to local VM information for cluster operations
// Implementations must provide thread-safe access to VM data.
type VMRegistry interface { type VMRegistry interface {
// GetActiveVMs returns a map of actor IDs to their VirtualMachine instances GetActiveVMs() map[string]interface{} // VirtualMachine interface to avoid import cycles
GetActiveVMs() map[string]VirtualMachine
// GetShard returns the shard number for a given actor ID
GetShard(actorID string) int GetShard(actorID string) int
} }
// ClusterManager coordinates distributed VM operations across the cluster // ClusterManager coordinates distributed VM operations across the cluster
type ClusterManager struct { type ClusterManager struct {
nodeID string nodeID string
nodes map[string]*NodeInfo nodes map[string]*NodeInfo
nodeUpdates chan NodeUpdate nodeUpdates chan NodeUpdate
shardMap *ShardMap shardMap *ShardMap
hashRing *ConsistentHashRing hashRing *ConsistentHashRing
election *LeaderElection election *LeaderElection
natsConn *nats.Conn natsConn *nats.Conn
ctx context.Context ctx context.Context
mutex sync.RWMutex mutex sync.RWMutex
logger *log.Logger logger *log.Logger
vmRegistry VMRegistry // Interface to access local VMs vmRegistry VMRegistry // Interface to access local VMs
} }
// NewClusterManager creates a cluster coordination manager // NewClusterManager creates a cluster coordination manager
@@ -53,13 +50,13 @@ func NewClusterManager(nodeID string, natsConn *nats.Conn, ctx context.Context)
// Create leadership election with callbacks // Create leadership election with callbacks
callbacks := LeaderElectionCallbacks{ callbacks := LeaderElectionCallbacks{
OnBecameLeader: func() { OnBecameLeader: func() {
cm.logger.Printf("This node became the cluster leader - can initiate rebalancing") cm.logger.Printf("👑 This node became the cluster leader - can initiate rebalancing")
}, },
OnLostLeader: func() { OnLostLeader: func() {
cm.logger.Printf("This node lost cluster leadership") cm.logger.Printf("📉 This node lost cluster leadership")
}, },
OnNewLeader: func(leaderID string) { OnNewLeader: func(leaderID string) {
cm.logger.Printf("Cluster leadership changed to: %s", leaderID) cm.logger.Printf("🔄 Cluster leadership changed to: %s", leaderID)
}, },
} }
@@ -74,7 +71,7 @@ func NewClusterManager(nodeID string, natsConn *nats.Conn, ctx context.Context)
// Start begins cluster management operations // Start begins cluster management operations
func (cm *ClusterManager) Start() { func (cm *ClusterManager) Start() {
cm.logger.Printf("Starting cluster manager") cm.logger.Printf("🚀 Starting cluster manager")
// Start leader election // Start leader election
cm.election.Start() cm.election.Start()
@@ -91,7 +88,7 @@ func (cm *ClusterManager) Start() {
// Stop gracefully stops the cluster manager // Stop gracefully stops the cluster manager
func (cm *ClusterManager) Stop() { func (cm *ClusterManager) Stop() {
cm.logger.Printf("Stopping cluster manager") cm.logger.Printf("🛑 Stopping cluster manager")
if cm.election != nil { if cm.election != nil {
cm.election.Stop() cm.election.Stop()
@@ -141,7 +138,7 @@ func (cm *ClusterManager) GetActorsInShard(shardID int) []string {
func (cm *ClusterManager) handleClusterMessage(msg *nats.Msg) { func (cm *ClusterManager) handleClusterMessage(msg *nats.Msg) {
var clusterMsg ClusterMessage var clusterMsg ClusterMessage
if err := json.Unmarshal(msg.Data, &clusterMsg); err != nil { if err := json.Unmarshal(msg.Data, &clusterMsg); err != nil {
cm.logger.Printf("Invalid cluster message: %v", err) cm.logger.Printf("⚠️ Invalid cluster message: %v", err)
return return
} }
@@ -155,7 +152,7 @@ func (cm *ClusterManager) handleClusterMessage(msg *nats.Msg) {
cm.handleNodeUpdate(update) cm.handleNodeUpdate(update)
} }
default: default:
cm.logger.Printf("Unknown cluster message type: %s", clusterMsg.Type) cm.logger.Printf("⚠️ Unknown cluster message type: %s", clusterMsg.Type)
} }
} }
@@ -168,12 +165,12 @@ func (cm *ClusterManager) handleNodeUpdate(update NodeUpdate) {
case NodeJoined: case NodeJoined:
cm.nodes[update.Node.ID] = update.Node cm.nodes[update.Node.ID] = update.Node
cm.hashRing.AddNode(update.Node.ID) cm.hashRing.AddNode(update.Node.ID)
cm.logger.Printf("Node joined: %s", update.Node.ID) cm.logger.Printf(" Node joined: %s", update.Node.ID)
case NodeLeft: case NodeLeft:
delete(cm.nodes, update.Node.ID) delete(cm.nodes, update.Node.ID)
cm.hashRing.RemoveNode(update.Node.ID) cm.hashRing.RemoveNode(update.Node.ID)
cm.logger.Printf("Node left: %s", update.Node.ID) cm.logger.Printf(" Node left: %s", update.Node.ID)
case NodeUpdated: case NodeUpdated:
if node, exists := cm.nodes[update.Node.ID]; exists { if node, exists := cm.nodes[update.Node.ID]; exists {
@@ -191,7 +188,7 @@ func (cm *ClusterManager) handleNodeUpdate(update NodeUpdate) {
for _, node := range cm.nodes { for _, node := range cm.nodes {
if now.Sub(node.LastSeen) > 90*time.Second && node.Status != NodeStatusFailed { if now.Sub(node.LastSeen) > 90*time.Second && node.Status != NodeStatusFailed {
node.Status = NodeStatusFailed node.Status = NodeStatusFailed
cm.logger.Printf("Node marked as failed: %s (last seen: %s)", cm.logger.Printf("Node marked as failed: %s (last seen: %s)",
node.ID, node.LastSeen.Format(time.RFC3339)) node.ID, node.LastSeen.Format(time.RFC3339))
} }
} }
@@ -215,7 +212,7 @@ func (cm *ClusterManager) handleNodeUpdate(update NodeUpdate) {
// handleRebalanceRequest processes cluster rebalancing requests // handleRebalanceRequest processes cluster rebalancing requests
func (cm *ClusterManager) handleRebalanceRequest(msg ClusterMessage) { func (cm *ClusterManager) handleRebalanceRequest(msg ClusterMessage) {
cm.logger.Printf("Handling rebalance request from %s", msg.From) cm.logger.Printf("🔄 Handling rebalance request from %s", msg.From)
// Implementation would handle the specific rebalancing logic // Implementation would handle the specific rebalancing logic
// This is a simplified version // This is a simplified version
@@ -223,7 +220,7 @@ func (cm *ClusterManager) handleRebalanceRequest(msg ClusterMessage) {
// handleMigrationRequest processes actor migration requests // handleMigrationRequest processes actor migration requests
func (cm *ClusterManager) handleMigrationRequest(msg ClusterMessage) { func (cm *ClusterManager) handleMigrationRequest(msg ClusterMessage) {
cm.logger.Printf("Handling migration request from %s", msg.From) cm.logger.Printf("🚚 Handling migration request from %s", msg.From)
// Implementation would handle the specific migration logic // Implementation would handle the specific migration logic
// This is a simplified version // This is a simplified version
@@ -235,7 +232,7 @@ func (cm *ClusterManager) triggerShardRebalancing(reason string) {
return // Only leader can initiate rebalancing return // Only leader can initiate rebalancing
} }
cm.logger.Printf("Triggering shard rebalancing: %s", reason) cm.logger.Printf("⚖️ Triggering shard rebalancing: %s", reason)
// Get active nodes // Get active nodes
var activeNodes []*NodeInfo var activeNodes []*NodeInfo
@@ -248,12 +245,12 @@ func (cm *ClusterManager) triggerShardRebalancing(reason string) {
cm.mutex.RUnlock() cm.mutex.RUnlock()
if len(activeNodes) == 0 { if len(activeNodes) == 0 {
cm.logger.Printf("No active nodes available for rebalancing") cm.logger.Printf("⚠️ No active nodes available for rebalancing")
return return
} }
// This would implement the actual rebalancing logic // This would implement the actual rebalancing logic
cm.logger.Printf("Would rebalance across %d active nodes", len(activeNodes)) cm.logger.Printf("🎯 Would rebalance across %d active nodes", len(activeNodes))
} }
// monitorNodes periodically checks node health and updates // monitorNodes periodically checks node health and updates
@@ -282,7 +279,7 @@ func (cm *ClusterManager) checkNodeHealth() {
for _, node := range cm.nodes { for _, node := range cm.nodes {
if now.Sub(node.LastSeen) > 90*time.Second && node.Status == NodeStatusActive { if now.Sub(node.LastSeen) > 90*time.Second && node.Status == NodeStatusActive {
node.Status = NodeStatusFailed node.Status = NodeStatusFailed
cm.logger.Printf("Node failed: %s", node.ID) cm.logger.Printf("💔 Node failed: %s", node.ID)
} }
} }
} }
@@ -331,4 +328,4 @@ func (cm *ClusterManager) GetShardMap() *ShardMap {
Nodes: make(map[string]NodeInfo), Nodes: make(map[string]NodeInfo),
UpdateTime: cm.shardMap.UpdateTime, UpdateTime: cm.shardMap.UpdateTime,
} }
} }


@@ -33,26 +33,8 @@ type ShardManager struct {
replication int replication int
} }
// NewShardManager creates a new shard manager with default configuration // NewShardManager creates a new shard manager
func NewShardManager(shardCount, replication int) *ShardManager { func NewShardManager(shardCount, replication int) *ShardManager {
return NewShardManagerWithConfig(ShardConfig{
ShardCount: shardCount,
ReplicationFactor: replication,
})
}
// NewShardManagerWithConfig creates a new shard manager with custom configuration
func NewShardManagerWithConfig(config ShardConfig) *ShardManager {
// Apply defaults for zero values
shardCount := config.ShardCount
if shardCount == 0 {
shardCount = DefaultNumShards
}
replication := config.ReplicationFactor
if replication == 0 {
replication = 1
}
return &ShardManager{ return &ShardManager{
shardCount: shardCount, shardCount: shardCount,
shardMap: &ShardMap{Shards: make(map[int][]string), Nodes: make(map[string]NodeInfo)}, shardMap: &ShardMap{Shards: make(map[int][]string), Nodes: make(map[string]NodeInfo)},
@@ -167,15 +149,6 @@ func (sm *ShardManager) GetActorsInShard(shardID int, nodeID string, vmRegistry
return actors return actors
} }
// GetShardCount returns the total number of shards
func (sm *ShardManager) GetShardCount() int {
return sm.shardCount
}
// GetReplicationFactor returns the replication factor
func (sm *ShardManager) GetReplicationFactor() int {
return sm.replication
}
// ConsistentHashPlacement implements PlacementStrategy using consistent hashing // ConsistentHashPlacement implements PlacementStrategy using consistent hashing
type ConsistentHashPlacement struct{} type ConsistentHashPlacement struct{}
@@ -212,4 +185,4 @@ func (chp *ConsistentHashPlacement) RebalanceShards(currentMap *ShardMap, nodes
// This is a simplified implementation // This is a simplified implementation
// In practice, this would implement sophisticated rebalancing logic // In practice, this would implement sophisticated rebalancing logic
return currentMap, nil return currentMap, nil
} }
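
Note on the constructors above: the left-hand column's `NewShardManagerWithConfig` replaces zero values in the config with defaults before building the manager. A minimal sketch of that zero-value defaulting follows. The `withDefaults` method is a hypothetical helper, and the value of `defaultNumShards` is assumed because the real `DefaultNumShards` constant is not shown in this diff.

```go
package main

import "fmt"

// defaultNumShards is an assumed value; the real constant is not shown.
const defaultNumShards = 1024

// shardConfig mirrors the two fields used by the constructor in the diff.
type shardConfig struct {
	ShardCount        int
	ReplicationFactor int
}

// withDefaults returns a copy of the config with zero values replaced,
// following the same rule as the constructor: a zero shard count becomes
// the default and a zero replication factor becomes 1.
func (c shardConfig) withDefaults() shardConfig {
	if c.ShardCount == 0 {
		c.ShardCount = defaultNumShards
	}
	if c.ReplicationFactor == 0 {
		c.ReplicationFactor = 1
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", shardConfig{}.withDefaults())
	fmt.Printf("%+v\n", shardConfig{ShardCount: 256, ReplicationFactor: 2}.withDefaults())
}
```

One consequence of this pattern is that a caller cannot request zero explicitly; that is acceptable here because zero shards or zero replicas would not be a usable configuration.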


@@ -1,713 +0,0 @@
package cluster
import (
"fmt"
"testing"
)
func TestNewShardManager(t *testing.T) {
sm := NewShardManager(16, 3)
if sm == nil {
t.Fatal("NewShardManager returned nil")
}
if sm.shardCount != 16 {
t.Errorf("expected shardCount 16, got %d", sm.shardCount)
}
if sm.replication != 3 {
t.Errorf("expected replication 3, got %d", sm.replication)
}
if sm.shardMap == nil {
t.Error("shardMap is nil")
}
if sm.placement == nil {
t.Error("placement strategy is nil")
}
}
func TestNewShardManager_DefaultsForZeroValues(t *testing.T) {
sm := NewShardManagerWithConfig(ShardConfig{})
if sm.shardCount != DefaultNumShards {
t.Errorf("expected default shardCount %d, got %d", DefaultNumShards, sm.shardCount)
}
if sm.replication != 1 {
t.Errorf("expected default replication 1, got %d", sm.replication)
}
}
func TestNewShardManagerWithConfig_CustomValues(t *testing.T) {
config := ShardConfig{
ShardCount: 256,
ReplicationFactor: 2,
}
sm := NewShardManagerWithConfig(config)
if sm.shardCount != 256 {
t.Errorf("expected shardCount 256, got %d", sm.shardCount)
}
if sm.replication != 2 {
t.Errorf("expected replication 2, got %d", sm.replication)
}
}
func TestGetShard_ReturnsCorrectShardForActor(t *testing.T) {
sm := NewShardManager(16, 1)
// Test that GetShard returns consistent results
actorID := "actor-123"
shard1 := sm.GetShard(actorID)
shard2 := sm.GetShard(actorID)
if shard1 != shard2 {
t.Errorf("GetShard not consistent: got %d and %d for same actor", shard1, shard2)
}
// Verify shard is within valid range
if shard1 < 0 || shard1 >= 16 {
t.Errorf("shard %d is out of range [0, 16)", shard1)
}
}
func TestGetShard_DifferentActorsCanMapToDifferentShards(t *testing.T) {
sm := NewShardManager(16, 1)
// With enough actors, we should see different shards
shardsSeen := make(map[int]bool)
for i := 0; i < 100; i++ {
actorID := fmt.Sprintf("actor-%d", i)
shard := sm.GetShard(actorID)
shardsSeen[shard] = true
}
// We should see multiple different shards
if len(shardsSeen) < 2 {
t.Errorf("expected multiple different shards, got %d unique shards", len(shardsSeen))
}
}
func TestGetShard_DistributesActorsAcrossShards(t *testing.T) {
sm := NewShardManager(16, 1)
distribution := make(map[int]int)
numActors := 1000
for i := 0; i < numActors; i++ {
actorID := fmt.Sprintf("actor-%d", i)
shard := sm.GetShard(actorID)
distribution[shard]++
}
// Verify all shards are within valid range
for shard := range distribution {
if shard < 0 || shard >= 16 {
t.Errorf("shard %d is out of range [0, 16)", shard)
}
}
// With good hashing, we should see fairly even distribution
expectedPerShard := numActors / 16
for shard, count := range distribution {
deviation := float64(count-expectedPerShard) / float64(expectedPerShard)
if deviation > 0.5 || deviation < -0.5 {
t.Logf("shard %d has %d actors (%.1f%% deviation)", shard, count, deviation*100)
}
}
}
func TestGetShardNodes_EmptyShard(t *testing.T) {
sm := NewShardManager(16, 1)
nodes := sm.GetShardNodes(0)
if nodes == nil {
t.Error("GetShardNodes returned nil, expected empty slice")
}
if len(nodes) != 0 {
t.Errorf("expected empty slice for unassigned shard, got %v", nodes)
}
}
func TestGetShardNodes_ReturnsAssignedNodes(t *testing.T) {
sm := NewShardManager(16, 3)
// Assign nodes to shard
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
nodes := sm.GetShardNodes(0)
if len(nodes) != 3 {
t.Errorf("expected 3 nodes, got %d", len(nodes))
}
if nodes[0] != "node-1" || nodes[1] != "node-2" || nodes[2] != "node-3" {
t.Errorf("unexpected nodes: %v", nodes)
}
}
func TestGetShardNodes_NonExistentShard(t *testing.T) {
sm := NewShardManager(16, 1)
// Query a shard that has no assignments
nodes := sm.GetShardNodes(999)
if len(nodes) != 0 {
t.Errorf("expected empty slice for non-existent shard, got %v", nodes)
}
}
func TestAssignShard_CreatesNewAssignment(t *testing.T) {
sm := NewShardManager(16, 1)
sm.AssignShard(5, []string{"node-a"})
nodes := sm.GetShardNodes(5)
if len(nodes) != 1 || nodes[0] != "node-a" {
t.Errorf("expected [node-a], got %v", nodes)
}
}
func TestAssignShard_UpdatesExistingAssignment(t *testing.T) {
sm := NewShardManager(16, 1)
sm.AssignShard(5, []string{"node-a"})
sm.AssignShard(5, []string{"node-b", "node-c"})
nodes := sm.GetShardNodes(5)
if len(nodes) != 2 {
t.Errorf("expected 2 nodes, got %d", len(nodes))
}
if nodes[0] != "node-b" || nodes[1] != "node-c" {
t.Errorf("expected [node-b, node-c], got %v", nodes)
}
}
func TestAssignShard_MultipleShards(t *testing.T) {
sm := NewShardManager(16, 1)
sm.AssignShard(0, []string{"node-1"})
sm.AssignShard(1, []string{"node-2"})
sm.AssignShard(2, []string{"node-3"})
if nodes := sm.GetShardNodes(0); len(nodes) != 1 || nodes[0] != "node-1" {
t.Errorf("shard 0: expected [node-1], got %v", nodes)
}
if nodes := sm.GetShardNodes(1); len(nodes) != 1 || nodes[0] != "node-2" {
t.Errorf("shard 1: expected [node-2], got %v", nodes)
}
if nodes := sm.GetShardNodes(2); len(nodes) != 1 || nodes[0] != "node-3" {
t.Errorf("shard 2: expected [node-3], got %v", nodes)
}
}
func TestGetPrimaryNode(t *testing.T) {
sm := NewShardManager(16, 3)
sm.AssignShard(0, []string{"primary", "replica1", "replica2"})
primary := sm.GetPrimaryNode(0)
if primary != "primary" {
t.Errorf("expected 'primary', got %q", primary)
}
}
func TestGetPrimaryNode_EmptyShard(t *testing.T) {
sm := NewShardManager(16, 1)
primary := sm.GetPrimaryNode(0)
if primary != "" {
t.Errorf("expected empty string for unassigned shard, got %q", primary)
}
}
func TestGetReplicaNodes(t *testing.T) {
sm := NewShardManager(16, 3)
sm.AssignShard(0, []string{"primary", "replica1", "replica2"})
replicas := sm.GetReplicaNodes(0)
if len(replicas) != 2 {
t.Errorf("expected 2 replicas, got %d", len(replicas))
}
if replicas[0] != "replica1" || replicas[1] != "replica2" {
t.Errorf("expected [replica1, replica2], got %v", replicas)
}
}
func TestGetReplicaNodes_SingleNode(t *testing.T) {
sm := NewShardManager(16, 1)
sm.AssignShard(0, []string{"only-node"})
replicas := sm.GetReplicaNodes(0)
if len(replicas) != 0 {
t.Errorf("expected no replicas for single-node shard, got %v", replicas)
}
}
func TestGetReplicaNodes_EmptyShard(t *testing.T) {
sm := NewShardManager(16, 1)
replicas := sm.GetReplicaNodes(0)
if len(replicas) != 0 {
t.Errorf("expected empty slice for unassigned shard, got %v", replicas)
}
}
func TestPlaceActor_NoNodes(t *testing.T) {
sm := NewShardManager(16, 1)
_, err := sm.PlaceActor("actor-1", map[string]*NodeInfo{})
if err == nil {
t.Error("expected error when no nodes available")
}
}
func TestPlaceActor_SingleNode(t *testing.T) {
sm := NewShardManager(16, 1)
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1", Status: NodeStatusActive},
}
nodeID, err := sm.PlaceActor("actor-1", nodes)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if nodeID != "node-1" {
t.Errorf("expected node-1, got %q", nodeID)
}
}
func TestPlaceActor_ReturnsValidNode(t *testing.T) {
sm := NewShardManager(16, 1)
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1", Status: NodeStatusActive},
"node-2": {ID: "node-2", Status: NodeStatusActive},
"node-3": {ID: "node-3", Status: NodeStatusActive},
}
// PlaceActor should always return one of the available nodes
for i := 0; i < 100; i++ {
nodeID, err := sm.PlaceActor(fmt.Sprintf("actor-%d", i), nodes)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if _, exists := nodes[nodeID]; !exists {
t.Errorf("PlaceActor returned invalid node: %q", nodeID)
}
}
}
func TestPlaceActor_DistributesAcrossNodes(t *testing.T) {
sm := NewShardManager(16, 1)
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1", Status: NodeStatusActive},
"node-2": {ID: "node-2", Status: NodeStatusActive},
"node-3": {ID: "node-3", Status: NodeStatusActive},
}
distribution := make(map[string]int)
for i := 0; i < 100; i++ {
nodeID, _ := sm.PlaceActor(fmt.Sprintf("actor-%d", i), nodes)
distribution[nodeID]++
}
// Should use multiple nodes
if len(distribution) < 2 {
t.Errorf("expected distribution across multiple nodes, got %v", distribution)
}
}
func TestUpdateShardMap(t *testing.T) {
sm := NewShardManager(16, 1)
newMap := &ShardMap{
Version: 5,
Shards: map[int][]string{
0: {"node-a", "node-b"},
1: {"node-c"},
},
Nodes: map[string]NodeInfo{
"node-a": {ID: "node-a"},
"node-b": {ID: "node-b"},
"node-c": {ID: "node-c"},
},
}
sm.UpdateShardMap(newMap)
result := sm.GetShardMap()
if result.Version != 5 {
t.Errorf("expected version 5, got %d", result.Version)
}
if len(result.Shards[0]) != 2 {
t.Errorf("expected 2 nodes for shard 0, got %d", len(result.Shards[0]))
}
}
func TestGetShardMap_ReturnsDeepCopy(t *testing.T) {
sm := NewShardManager(16, 1)
sm.AssignShard(0, []string{"node-1", "node-2"})
copy1 := sm.GetShardMap()
copy2 := sm.GetShardMap()
// Modify copy1
copy1.Shards[0][0] = "modified"
copy1.Version = 999
// copy2 should be unaffected
if copy2.Shards[0][0] == "modified" {
t.Error("GetShardMap did not return a deep copy (shard nodes modified)")
}
if copy2.Version == 999 {
t.Error("GetShardMap did not return a deep copy (version modified)")
}
// Original should be unaffected
nodes := sm.GetShardNodes(0)
if nodes[0] == "modified" {
t.Error("original shard map was modified through copy")
}
}
func TestGetShardCount(t *testing.T) {
sm := NewShardManager(64, 1)
if sm.GetShardCount() != 64 {
t.Errorf("expected 64, got %d", sm.GetShardCount())
}
}
func TestGetReplicationFactor(t *testing.T) {
sm := NewShardManager(16, 3)
if sm.GetReplicationFactor() != 3 {
t.Errorf("expected 3, got %d", sm.GetReplicationFactor())
}
}
func TestRebalanceShards_NoPlacementStrategy(t *testing.T) {
sm := NewShardManager(16, 1)
sm.placement = nil // Remove placement strategy
_, err := sm.RebalanceShards(map[string]*NodeInfo{})
if err == nil {
t.Error("expected error when no placement strategy configured")
}
}
func TestRebalanceShards_WithNodes(t *testing.T) {
sm := NewShardManager(16, 1)
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1", Status: NodeStatusActive},
"node-2": {ID: "node-2", Status: NodeStatusActive},
}
result, err := sm.RebalanceShards(nodes)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if result == nil {
t.Error("expected non-nil result")
}
}
// Test shard assignment with node failures
func TestShardAssignment_NodeFailure(t *testing.T) {
sm := NewShardManager(16, 3)
// Initial assignment with 3 replicas
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
// Simulate node failure by reassigning without the failed node
sm.AssignShard(0, []string{"node-1", "node-3"})
nodes := sm.GetShardNodes(0)
if len(nodes) != 2 {
t.Errorf("expected 2 nodes after failure, got %d", len(nodes))
}
// Verify primary is still correct
primary := sm.GetPrimaryNode(0)
if primary != "node-1" {
t.Errorf("expected node-1 as primary, got %q", primary)
}
// Verify replica count
replicas := sm.GetReplicaNodes(0)
if len(replicas) != 1 || replicas[0] != "node-3" {
t.Errorf("expected [node-3] as replicas, got %v", replicas)
}
}
func TestShardAssignment_AllNodesFailExceptOne(t *testing.T) {
sm := NewShardManager(16, 3)
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
// Simulate all but one node failing
sm.AssignShard(0, []string{"node-3"})
nodes := sm.GetShardNodes(0)
if len(nodes) != 1 || nodes[0] != "node-3" {
t.Errorf("expected [node-3], got %v", nodes)
}
primary := sm.GetPrimaryNode(0)
if primary != "node-3" {
t.Errorf("expected node-3 as primary, got %q", primary)
}
replicas := sm.GetReplicaNodes(0)
if len(replicas) != 0 {
t.Errorf("expected no replicas, got %v", replicas)
}
}
// Test replication factor is respected
func TestReplicationFactor_Respected(t *testing.T) {
sm := NewShardManager(16, 3)
if sm.GetReplicationFactor() != 3 {
t.Errorf("expected replication factor 3, got %d", sm.GetReplicationFactor())
}
// Assign with exactly the replication factor
sm.AssignShard(0, []string{"node-1", "node-2", "node-3"})
nodes := sm.GetShardNodes(0)
if len(nodes) != 3 {
t.Errorf("expected 3 nodes matching replication factor, got %d", len(nodes))
}
}
func TestReplicationFactor_CanExceed(t *testing.T) {
// Note: ShardManager doesn't enforce max replication, it just tracks what's assigned
sm := NewShardManager(16, 2)
// Assign more nodes than replication factor
sm.AssignShard(0, []string{"node-1", "node-2", "node-3", "node-4"})
nodes := sm.GetShardNodes(0)
if len(nodes) != 4 {
t.Errorf("expected 4 nodes, got %d", len(nodes))
}
}
func TestReplicationFactor_LessThanFactor(t *testing.T) {
sm := NewShardManager(16, 3)
// Assign fewer nodes than replication factor (possible during degraded state)
sm.AssignShard(0, []string{"node-1"})
nodes := sm.GetShardNodes(0)
if len(nodes) != 1 {
t.Errorf("expected 1 node, got %d", len(nodes))
}
// System should track that we're under-replicated
// (in practice, cluster manager would handle this)
}
// Mock VM registry for testing GetActorsInShard
type mockVMRegistry struct {
activeVMs map[string]VirtualMachine
}
func (m *mockVMRegistry) GetActiveVMs() map[string]VirtualMachine {
return m.activeVMs
}
func (m *mockVMRegistry) GetShard(actorID string) int {
// This would use the same logic as ShardManager
return 0
}
type mockVM struct {
id string
actorID string
state VMState
}
func (m *mockVM) GetID() string { return m.id }
func (m *mockVM) GetActorID() string { return m.actorID }
func (m *mockVM) GetState() VMState { return m.state }
func TestGetActorsInShard_NilRegistry(t *testing.T) {
sm := NewShardManager(16, 1)
actors := sm.GetActorsInShard(0, "node-1", nil)
if len(actors) != 0 {
t.Errorf("expected empty slice for nil registry, got %v", actors)
}
}
func TestGetActorsInShard_WithActors(t *testing.T) {
sm := NewShardManager(16, 1)
// Create mock VMs - need to find actors that map to the same shard
// First, find some actor IDs that map to shard 0
var actorsInShard0 []string
for i := 0; i < 100; i++ {
actorID := fmt.Sprintf("actor-%d", i)
if sm.GetShard(actorID) == 0 {
actorsInShard0 = append(actorsInShard0, actorID)
if len(actorsInShard0) >= 3 {
break
}
}
}
activeVMs := make(map[string]VirtualMachine)
for _, actorID := range actorsInShard0 {
activeVMs[actorID] = &mockVM{
id: "vm-" + actorID,
actorID: actorID,
state: VMStateRunning,
}
}
registry := &mockVMRegistry{activeVMs: activeVMs}
actors := sm.GetActorsInShard(0, "node-1", registry)
if len(actors) != len(actorsInShard0) {
t.Errorf("expected %d actors in shard 0, got %d", len(actorsInShard0), len(actors))
}
}
func TestGetActorsInShard_EmptyRegistry(t *testing.T) {
sm := NewShardManager(16, 1)
registry := &mockVMRegistry{activeVMs: make(map[string]VirtualMachine)}
actors := sm.GetActorsInShard(0, "node-1", registry)
if len(actors) != 0 {
t.Errorf("expected empty slice for empty registry, got %v", actors)
}
}
// Tests for ConsistentHashPlacement
func TestConsistentHashPlacement_PlaceActor_NoNodes(t *testing.T) {
placement := &ConsistentHashPlacement{}
shardMap := &ShardMap{}
_, err := placement.PlaceActor("actor-1", shardMap, map[string]*NodeInfo{})
if err == nil {
t.Error("expected error when no nodes available")
}
}
func TestConsistentHashPlacement_PlaceActor_SingleNode(t *testing.T) {
placement := &ConsistentHashPlacement{}
shardMap := &ShardMap{}
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1"},
}
nodeID, err := placement.PlaceActor("actor-1", shardMap, nodes)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if nodeID != "node-1" {
t.Errorf("expected node-1, got %q", nodeID)
}
}
func TestConsistentHashPlacement_PlaceActor_ReturnsValidNode(t *testing.T) {
placement := &ConsistentHashPlacement{}
shardMap := &ShardMap{}
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1"},
"node-2": {ID: "node-2"},
"node-3": {ID: "node-3"},
}
// PlaceActor should always return one of the available nodes
for i := 0; i < 100; i++ {
nodeID, err := placement.PlaceActor(fmt.Sprintf("actor-%d", i), shardMap, nodes)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if _, exists := nodes[nodeID]; !exists {
t.Errorf("PlaceActor returned invalid node: %q", nodeID)
}
}
}
func TestConsistentHashPlacement_RebalanceShards(t *testing.T) {
placement := &ConsistentHashPlacement{}
currentMap := &ShardMap{
Version: 1,
Shards: map[int][]string{0: {"node-1"}},
}
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1"},
"node-2": {ID: "node-2"},
}
result, err := placement.RebalanceShards(currentMap, nodes)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
// Current implementation returns unchanged map
if result != currentMap {
t.Error("expected same map returned (simplified implementation)")
}
}
// Benchmark tests
func BenchmarkGetShard(b *testing.B) {
sm := NewShardManager(1024, 1)
actorIDs := make([]string, 1000)
for i := range actorIDs {
actorIDs[i] = fmt.Sprintf("actor-%d", i)
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
sm.GetShard(actorIDs[i%len(actorIDs)])
}
}
func BenchmarkAssignShard(b *testing.B) {
sm := NewShardManager(1024, 1)
nodes := []string{"node-1", "node-2", "node-3"}
b.ResetTimer()
for i := 0; i < b.N; i++ {
sm.AssignShard(i%1024, nodes)
}
}
func BenchmarkPlaceActor(b *testing.B) {
sm := NewShardManager(1024, 1)
nodes := map[string]*NodeInfo{
"node-1": {ID: "node-1"},
"node-2": {ID: "node-2"},
"node-3": {ID: "node-3"},
}
actorIDs := make([]string, 1000)
for i := range actorIDs {
actorIDs[i] = fmt.Sprintf("actor-%d", i)
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
sm.PlaceActor(actorIDs[i%len(actorIDs)], nodes)
}
}
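
For context on the removed `TestGetActorsInShard_WithActors` test: it scans candidate actor IDs until it has a few that land in shard 0. The sketch below shows one way such a lookup could work, assuming an FNV-1a hash reduced modulo the shard count. `shardFor` and `actorsInShard` are hypothetical helpers; the real `ShardManager.GetShard` is not shown in this diff and may hash differently.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps an actor ID to a shard index in [0, shardCount).
// Hypothetical helper; assumes shardCount > 0 and an FNV-1a hash.
func shardFor(actorID string, shardCount int) int {
	h := fnv.New32a()
	h.Write([]byte(actorID)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(shardCount))
}

// actorsInShard checks the IDs "actor-0" .. "actor-(candidates-1)" and returns
// at most limit of them that hash into the requested shard.
func actorsInShard(shard, shardCount, candidates, limit int) []string {
	var found []string
	for i := 0; i < candidates && len(found) < limit; i++ {
		id := fmt.Sprintf("actor-%d", i)
		if shardFor(id, shardCount) == shard {
			found = append(found, id)
		}
	}
	return found
}

func main() {
	ids := actorsInShard(0, 16, 100, 3)
	fmt.Printf("found %d actor(s) in shard 0: %v\n", len(ids), ids)
}
```

Because the scan is bounded, it can return fewer IDs than requested (or none), which is why the removed test compares against `len(actorsInShard0)` rather than a fixed count.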


@@ -4,47 +4,17 @@ import (
 	"time"
 )
-// Default configuration values
 const (
-	// DefaultNumShards defines the default total number of shards in the cluster
-	DefaultNumShards = 1024
-	// DefaultVirtualNodes defines the default number of virtual nodes per physical node
-	DefaultVirtualNodes = 150
+	// NumShards defines the total number of shards in the cluster
+	NumShards = 1024
+	// VirtualNodes defines the number of virtual nodes per physical node for consistent hashing
+	VirtualNodes = 150
 	// Leadership election constants
 	LeaderLeaseTimeout = 10 * time.Second // How long a leader lease lasts
 	HeartbeatInterval  = 3 * time.Second  // How often leader sends heartbeats
 	ElectionTimeout    = 2 * time.Second  // How long to wait for election
 )
-// HashRingConfig holds configuration for the consistent hash ring
-type HashRingConfig struct {
-	// VirtualNodes is the number of virtual nodes per physical node (default: 150)
-	VirtualNodes int
-}
-// DefaultHashRingConfig returns the default hash ring configuration
-func DefaultHashRingConfig() HashRingConfig {
-	return HashRingConfig{
-		VirtualNodes: DefaultVirtualNodes,
-	}
-}
-// ShardConfig holds configuration for shard management
-type ShardConfig struct {
-	// ShardCount is the total number of shards (default: 1024)
-	ShardCount int
-	// ReplicationFactor is the number of replicas per shard (default: 1)
-	ReplicationFactor int
-}
-// DefaultShardConfig returns the default shard configuration
-func DefaultShardConfig() ShardConfig {
-	return ShardConfig{
-		ShardCount:        DefaultNumShards,
-		ReplicationFactor: 1,
-	}
-}
 // NodeStatus represents the health status of a node
 type NodeStatus string
@@ -60,14 +30,14 @@ type NodeInfo struct {
 	Address   string            `json:"address"`
 	Port      int               `json:"port"`
 	Status    NodeStatus        `json:"status"`
 	Capacity  float64           `json:"capacity"` // Maximum load capacity
 	Load      float64           `json:"load"`     // Current CPU/memory load
 	LastSeen  time.Time         `json:"lastSeen"` // Last heartbeat timestamp
 	Timestamp time.Time         `json:"timestamp"`
 	Metadata  map[string]string `json:"metadata"`
 	IsLeader  bool              `json:"isLeader"`
 	VMCount   int               `json:"vmCount"`  // Number of VMs on this node
 	ShardIDs  []int             `json:"shardIds"` // Shards assigned to this node
 }
 // NodeUpdateType represents the type of node update
@@ -87,9 +57,9 @@ type NodeUpdate struct {
 // ShardMap represents the distribution of shards across cluster nodes
 type ShardMap struct {
 	Version    uint64              `json:"version"` // Incremented on each change
 	Shards     map[int][]string    `json:"shards"`  // shard ID -> [primary, replica1, replica2]
 	Nodes      map[string]NodeInfo `json:"nodes"`   // node ID -> node info
 	UpdateTime time.Time           `json:"updateTime"`
 }
@@ -104,23 +74,23 @@ type ClusterMessage struct {
 // RebalanceRequest represents a request to rebalance shards
 type RebalanceRequest struct {
 	RequestID  string           `json:"requestId"`
 	FromNode   string           `json:"fromNode"`
 	ToNode     string           `json:"toNode"`
 	ShardIDs   []int            `json:"shardIds"`
 	Reason     string           `json:"reason"`
 	Migrations []ActorMigration `json:"migrations"`
 }
 // ActorMigration represents the migration of an actor between nodes
 type ActorMigration struct {
 	ActorID  string                 `json:"actorId"`
 	FromNode string                 `json:"fromNode"`
 	ToNode   string                 `json:"toNode"`
 	ShardID  int                    `json:"shardId"`
 	State    map[string]interface{} `json:"state"`
 	Version  int64                  `json:"version"`
 	Status   string                 `json:"status"` // "pending", "in_progress", "completed", "failed"
 }
 // LeaderElectionCallbacks defines callbacks for leadership changes
@@ -138,68 +108,3 @@ type LeadershipLease struct {
 	StartedAt time.Time `json:"startedAt"`
 }
-// VirtualMachine defines the interface for a virtual machine instance.
-// This interface provides the minimal contract needed by the cluster package
-// to interact with VMs without creating import cycles with the runtime package.
-type VirtualMachine interface {
-	// GetID returns the unique identifier for this VM
-	GetID() string
-	// GetActorID returns the actor ID this VM represents
-	GetActorID() string
-	// GetState returns the current state of the VM
-	GetState() VMState
-}
-// VMState represents the state of a virtual machine
-type VMState string
-const (
-	VMStateIdle    VMState = "idle"
-	VMStateRunning VMState = "running"
-	VMStatePaused  VMState = "paused"
-	VMStateStopped VMState = "stopped"
-)
-// RuntimeModel defines the interface for an EventStorming model that can be loaded into a runtime.
-// This decouples the cluster package from the specific eventstorming package.
-type RuntimeModel interface {
-	// GetID returns the unique identifier for this model
-	GetID() string
-	// GetName returns the name of this model
-	GetName() string
-}
-// RuntimeMessage defines the interface for messages that can be sent through the runtime.
-// This provides type safety for inter-actor communication without creating import cycles.
-type RuntimeMessage interface {
-	// GetTargetActorID returns the ID of the actor this message is addressed to
-	GetTargetActorID() string
-	// GetType returns the message type identifier
-	GetType() string
-}
-// ModelPayload is a concrete type for JSON-unmarshaling RuntimeModel payloads.
-// Use this when receiving model data over the network.
-type ModelPayload struct {
-	ID   string `json:"id"`
-	Name string `json:"name"`
-}
-// GetID implements RuntimeModel
-func (m *ModelPayload) GetID() string { return m.ID }
-// GetName implements RuntimeModel
-func (m *ModelPayload) GetName() string { return m.Name }
-// MessagePayload is a concrete type for JSON-unmarshaling RuntimeMessage payloads.
-// Use this when receiving message data over the network.
-type MessagePayload struct {
-	TargetActorID string `json:"targetActorId"`
-	Type          string `json:"type"`
-}
-// GetTargetActorID implements RuntimeMessage
-func (m *MessagePayload) GetTargetActorID() string { return m.TargetActorID }
-// GetType implements RuntimeMessage
-func (m *MessagePayload) GetType() string { return m.Type }
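
The removed `MessagePayload` type exists so that a message received over the network can be unmarshaled into a concrete struct and then passed around behind the `RuntimeMessage` interface. A minimal sketch of that pattern follows; the two types are copied from the removed lines, while `decodeMessage` and the sample JSON are illustrative assumptions, not part of the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RuntimeMessage and MessagePayload mirror the definitions removed in this diff.
type RuntimeMessage interface {
	GetTargetActorID() string
	GetType() string
}

type MessagePayload struct {
	TargetActorID string `json:"targetActorId"`
	Type          string `json:"type"`
}

func (m *MessagePayload) GetTargetActorID() string { return m.TargetActorID }
func (m *MessagePayload) GetType() string          { return m.Type }

// decodeMessage is a hypothetical helper: it unmarshals raw bytes into the
// concrete payload and returns it as the interface type.
func decodeMessage(data []byte) (RuntimeMessage, error) {
	var p MessagePayload
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, fmt.Errorf("decode runtime message: %w", err)
	}
	return &p, nil
}

func main() {
	msg, err := decodeMessage([]byte(`{"targetActorId":"order-123","type":"OrderPlaced"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg.GetTargetActorID(), msg.GetType()) // prints "order-123 OrderPlaced"
}
```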


@@ -28,39 +28,6 @@ func (e *VersionConflictError) Unwrap() error {
 	return ErrVersionConflict
 }
-// ReplayError captures information about a malformed event encountered during replay.
-// This allows callers to inspect and handle corrupted data without losing context.
-type ReplayError struct {
-	// SequenceNumber is the sequence number of the message in the stream (if available)
-	SequenceNumber uint64
-	// RawData is the raw bytes that could not be unmarshaled
-	RawData []byte
-	// Err is the underlying unmarshal error
-	Err error
-}
-func (e *ReplayError) Error() string {
-	return fmt.Sprintf("failed to unmarshal event at sequence %d: %v", e.SequenceNumber, e.Err)
-}
-func (e *ReplayError) Unwrap() error {
-	return e.Err
-}
-// ReplayResult contains the results of replaying events, including any errors encountered.
-// This allows callers to decide how to handle malformed events rather than silently skipping them.
-type ReplayResult struct {
-	// Events contains the successfully unmarshaled events
-	Events []*Event
-	// Errors contains information about any malformed events encountered
-	Errors []ReplayError
-}
-// HasErrors returns true if any malformed events were encountered during replay
-func (r *ReplayResult) HasErrors() bool {
-	return len(r.Errors) > 0
-}
 // Event represents a domain event in the system
 type Event struct {
 	ID string `json:"id"`
@@ -73,14 +40,6 @@ type Event struct {
 	Timestamp time.Time `json:"timestamp"`
 }
-// Common event types for Aether infrastructure
-const (
-	// EventTypeEventStored is an internal event published when an event is successfully persisted.
-	// This event allows observability components (metrics, projections, audit systems) to react
-	// to persisted events without coupling to application code.
-	EventTypeEventStored = "EventStored"
-)
 // Common metadata keys for distributed tracing and auditing
 const (
 	// MetadataKeyCorrelationID identifies related events across services
@@ -184,17 +143,6 @@ type ActorSnapshot struct {
 // EventStore defines the interface for event persistence.
 //
-// # Immutability Guarantee
-//
-// EventStore is append-only. Once an event is persisted via SaveEvent, it is never
-// modified or deleted. The interface intentionally provides no Update or Delete methods.
-// This ensures:
-//   - Events serve as an immutable audit trail
-//   - State can be safely derived by replaying events
-//   - Concurrent reads are always safe (events never change)
-//
-// To correct a mistake, append a new event that expresses the correction.
-//
 // # Version Semantics
 //
 // Events for an actor must have monotonically increasing versions. When SaveEvent
@@ -215,13 +163,10 @@ type EventStore interface {
 	// SaveEvent persists an event to the store. The event's Version must be
 	// strictly greater than the current latest version for the actor.
 	// Returns VersionConflictError if version <= current latest version.
-	// Once saved, the event is immutable and can never be modified or deleted.
 	SaveEvent(event *Event) error
 	// GetEvents retrieves events for an actor from a specific version (inclusive).
 	// Returns an empty slice if no events exist for the actor.
-	// The returned events are guaranteed to be immutable - they will never be
-	// modified or deleted from the store.
 	GetEvents(actorID string, fromVersion int64) ([]*Event, error)
 	// GetLatestVersion returns the latest version for an actor.
@@ -229,18 +174,6 @@ type EventStore interface {
 	GetLatestVersion(actorID string) (int64, error)
 }
-// EventStoreWithErrors extends EventStore with methods that report malformed events.
-// Stores that may encounter corrupted data during replay (e.g., JetStream) should
-// implement this interface to give callers visibility into data quality issues.
-type EventStoreWithErrors interface {
-	EventStore
-	// GetEventsWithErrors retrieves events for an actor and reports any malformed
-	// events encountered. This method allows callers to decide how to handle
-	// corrupted data rather than silently skipping it.
-	GetEventsWithErrors(actorID string, fromVersion int64) (*ReplayResult, error)
-}
 // SnapshotStore extends EventStore with snapshot capabilities
 type SnapshotStore interface {
 	EventStore
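
The removed `ReplayResult` and `ReplayError` types describe a replay that collects malformed events instead of silently skipping them. A rough sketch of that idea follows. The `Event` struct is cut down to a single field and `replay` is a hypothetical stand-in for a store's `GetEventsWithErrors`; neither reflects the actual JetStream-backed implementation, which this diff does not show.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is reduced to one field for this sketch (assumption).
type Event struct {
	ID string `json:"id"`
}

// ReplayError and ReplayResult mirror the types removed in this diff.
type ReplayError struct {
	SequenceNumber uint64
	RawData        []byte
	Err            error
}

type ReplayResult struct {
	Events []*Event
	Errors []ReplayError
}

func (r *ReplayResult) HasErrors() bool { return len(r.Errors) > 0 }

// replay unmarshals each raw message in order. Messages that fail to decode
// are recorded with a 1-based sequence number and their raw bytes.
func replay(raw [][]byte) *ReplayResult {
	result := &ReplayResult{}
	for i, data := range raw {
		var ev Event
		if err := json.Unmarshal(data, &ev); err != nil {
			result.Errors = append(result.Errors, ReplayError{
				SequenceNumber: uint64(i + 1),
				RawData:        data,
				Err:            err,
			})
			continue
		}
		result.Events = append(result.Events, &ev)
	}
	return result
}

func main() {
	res := replay([][]byte{
		[]byte(`{"id":"evt-1"}`),
		[]byte(`{`), // truncated JSON
		[]byte(`{"id":"evt-2"}`),
	})
	fmt.Printf("%d events, %d malformed\n", len(res.Events), len(res.Errors)) // prints "2 events, 1 malformed"
}
```

A caller can then check `HasErrors()` and decide whether to log, alert, or abort, rather than having that policy fixed inside the store.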


@@ -2,8 +2,6 @@ package aether
 import (
 	"encoding/json"
-	"errors"
-	"fmt"
 	"strings"
 	"testing"
 	"time"
@@ -1210,317 +1208,3 @@ func TestEvent_MetadataAllHelpersRoundTrip(t *testing.T) {
 		t.Errorf("GetSpanID mismatch: got %q", decoded.GetSpanID())
 	}
 }
// Tests for ReplayError and ReplayResult types
func TestReplayError_Error(t *testing.T) {
err := &ReplayError{
SequenceNumber: 42,
RawData: []byte(`invalid json`),
Err: json.Unmarshal([]byte(`{`), &struct{}{}),
}
errMsg := err.Error()
if !strings.Contains(errMsg, "42") {
t.Errorf("expected error message to contain sequence number, got: %s", errMsg)
}
if !strings.Contains(errMsg, "unmarshal") || !strings.Contains(errMsg, "failed") {
t.Errorf("expected error message to contain 'failed' and 'unmarshal', got: %s", errMsg)
}
}
func TestReplayError_Unwrap(t *testing.T) {
innerErr := json.Unmarshal([]byte(`{`), &struct{}{})
err := &ReplayError{
SequenceNumber: 1,
RawData: []byte(`{`),
Err: innerErr,
}
unwrapped := err.Unwrap()
if unwrapped != innerErr {
t.Errorf("expected Unwrap to return inner error")
}
}
func TestReplayResult_HasErrors(t *testing.T) {
tests := []struct {
name string
result *ReplayResult
expected bool
}{
{
name: "no errors",
result: &ReplayResult{Events: []*Event{}, Errors: []ReplayError{}},
expected: false,
},
{
name: "nil errors slice",
result: &ReplayResult{Events: []*Event{}, Errors: nil},
expected: false,
},
{
name: "has errors",
result: &ReplayResult{
Events: []*Event{},
Errors: []ReplayError{
{SequenceNumber: 1, RawData: []byte(`bad`), Err: nil},
},
},
expected: true,
},
{
name: "has events and errors",
result: &ReplayResult{
Events: []*Event{{ID: "evt-1"}},
Errors: []ReplayError{
{SequenceNumber: 2, RawData: []byte(`bad`), Err: nil},
},
},
expected: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := tt.result.HasErrors(); got != tt.expected {
t.Errorf("HasErrors() = %v, want %v", got, tt.expected)
}
})
}
}
func TestReplayResult_EmptyResult(t *testing.T) {
result := &ReplayResult{
Events: []*Event{},
Errors: []ReplayError{},
}
if result.HasErrors() {
t.Error("expected HasErrors() to return false for empty result")
}
if len(result.Events) != 0 {
t.Errorf("expected 0 events, got %d", len(result.Events))
}
}
func TestReplayError_WithZeroSequence(t *testing.T) {
err := &ReplayError{
SequenceNumber: 0,
RawData: []byte(`corrupted`),
Err: json.Unmarshal([]byte(`not-json`), &struct{}{}),
}
errMsg := err.Error()
if !strings.Contains(errMsg, "sequence 0") {
t.Errorf("expected error message to contain 'sequence 0', got: %s", errMsg)
}
}
func TestReplayError_WithLargeRawData(t *testing.T) {
largeData := make([]byte, 1024*1024) // 1MB
for i := range largeData {
largeData[i] = 'x'
}
err := &ReplayError{
SequenceNumber: 999,
RawData: largeData,
Err: json.Unmarshal(largeData, &struct{}{}),
}
// Should be able to create the error without issues
if len(err.RawData) != 1024*1024 {
t.Errorf("expected RawData to be preserved, got length %d", len(err.RawData))
}
// Error() should still work
_ = err.Error()
}
// Tests for VersionConflictError
func TestVersionConflictError_Error(t *testing.T) {
err := &VersionConflictError{
ActorID: "order-123",
AttemptedVersion: 3,
CurrentVersion: 5,
}
errMsg := err.Error()
// Verify error message contains all context
if !strings.Contains(errMsg, "order-123") {
t.Errorf("error message should contain ActorID, got: %s", errMsg)
}
if !strings.Contains(errMsg, "3") {
t.Errorf("error message should contain AttemptedVersion, got: %s", errMsg)
}
if !strings.Contains(errMsg, "5") {
t.Errorf("error message should contain CurrentVersion, got: %s", errMsg)
}
if !strings.Contains(errMsg, "version conflict") {
t.Errorf("error message should contain 'version conflict', got: %s", errMsg)
}
}
func TestVersionConflictError_Fields(t *testing.T) {
err := &VersionConflictError{
ActorID: "actor-456",
AttemptedVersion: 10,
CurrentVersion: 8,
}
if err.ActorID != "actor-456" {
t.Errorf("ActorID mismatch: got %q, want %q", err.ActorID, "actor-456")
}
if err.AttemptedVersion != 10 {
t.Errorf("AttemptedVersion mismatch: got %d, want %d", err.AttemptedVersion, 10)
}
if err.CurrentVersion != 8 {
t.Errorf("CurrentVersion mismatch: got %d, want %d", err.CurrentVersion, 8)
}
}
func TestVersionConflictError_Unwrap(t *testing.T) {
err := &VersionConflictError{
ActorID: "actor-789",
AttemptedVersion: 2,
CurrentVersion: 1,
}
unwrapped := err.Unwrap()
if unwrapped != ErrVersionConflict {
t.Errorf("Unwrap should return ErrVersionConflict sentinel")
}
}
func TestVersionConflictError_ErrorsIs(t *testing.T) {
err := &VersionConflictError{
ActorID: "test-actor",
AttemptedVersion: 5,
CurrentVersion: 4,
}
// Test that errors.Is works with sentinel
if !errors.Is(err, ErrVersionConflict) {
t.Error("errors.Is(err, ErrVersionConflict) should return true")
}
// Test that other errors don't match
if errors.Is(err, errors.New("other error")) {
t.Error("errors.Is should not match unrelated errors")
}
}
func TestVersionConflictError_ErrorsAs(t *testing.T) {
originalErr := &VersionConflictError{
ActorID: "actor-unwrap",
AttemptedVersion: 7,
CurrentVersion: 6,
}
var versionErr *VersionConflictError
if !errors.As(originalErr, &versionErr) {
t.Fatalf("errors.As should succeed with VersionConflictError")
}
// Verify fields are accessible through unwrapped error
if versionErr.ActorID != "actor-unwrap" {
t.Errorf("ActorID mismatch after As: got %q", versionErr.ActorID)
}
if versionErr.AttemptedVersion != 7 {
t.Errorf("AttemptedVersion mismatch after As: got %d", versionErr.AttemptedVersion)
}
if versionErr.CurrentVersion != 6 {
t.Errorf("CurrentVersion mismatch after As: got %d", versionErr.CurrentVersion)
}
}
func TestVersionConflictError_CanReadCurrentVersion(t *testing.T) {
// This test verifies that applications can read CurrentVersion for retry strategies
err := &VersionConflictError{
ActorID: "order-abc",
AttemptedVersion: 2,
CurrentVersion: 10,
}
var versionErr *VersionConflictError
if !errors.As(err, &versionErr) {
t.Fatal("failed to unwrap VersionConflictError")
}
// Application can use CurrentVersion to decide retry strategy
nextVersion := versionErr.CurrentVersion + 1
if nextVersion != 11 {
t.Errorf("application should be able to compute next version: got %d, want 11", nextVersion)
}
// Application can log detailed context
logMsg := fmt.Sprintf("Version conflict for actor %q: attempted %d, current %d, will retry with %d",
versionErr.ActorID, versionErr.AttemptedVersion, versionErr.CurrentVersion, nextVersion)
if !strings.Contains(logMsg, "order-abc") {
t.Errorf("application context logging failed: %s", logMsg)
}
}
func TestVersionConflictError_EdgeCases(t *testing.T) {
testCases := []struct {
name string
actorID string
attemp int64
current int64
}{
{"zero current", "actor-1", 1, 0},
{"large numbers", "actor-2", 1000000, 999999},
{"max int64", "actor-3", 9223372036854775807, 9223372036854775806},
{"negative attempt", "actor-4", -1, -2},
{"empty actor id", "", 1, 0},
{"special chars in actor id", "actor@#$%", 2, 1},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
err := &VersionConflictError{
ActorID: tc.actorID,
AttemptedVersion: tc.attemp,
CurrentVersion: tc.current,
}
// Should not panic
msg := err.Error()
if msg == "" {
t.Error("Error() should return non-empty string")
}
// Should be wrapped correctly
if err.Unwrap() != ErrVersionConflict {
t.Error("Unwrap should return ErrVersionConflict")
}
// errors.Is should work
if !errors.Is(err, ErrVersionConflict) {
t.Error("errors.Is should work for edge case")
}
})
}
}
func TestErrVersionConflict_Sentinel(t *testing.T) {
// Verify the sentinel error is correctly defined
if ErrVersionConflict == nil {
t.Fatal("ErrVersionConflict should not be nil")
}
expectedMsg := "version conflict"
if ErrVersionConflict.Error() != expectedMsg {
t.Errorf("ErrVersionConflict message mismatch: got %q, want %q", ErrVersionConflict.Error(), expectedMsg)
}
// Test that it's usable with errors.Is
if !errors.Is(ErrVersionConflict, ErrVersionConflict) {
t.Error("ErrVersionConflict should match itself with errors.Is")
}
}


@@ -5,225 +5,82 @@ import (
 	"sync"
 )
-// EventBroadcaster defines the interface for publishing and subscribing to events.
-//
-// Subscribe accepts namespace patterns following NATS subject matching conventions:
-//   - Exact match: "tenant-a" matches only "tenant-a"
-//   - Single wildcard: "*" matches any single token, "tenant-*" matches "tenant-a", "tenant-b"
-//   - Multi-token wildcard: ">" matches one or more tokens (only at end of pattern)
-//
-// Security Warning: Wildcard subscriptions bypass namespace isolation.
-// Only grant wildcard access to trusted system components.
+// EventBroadcaster defines the interface for publishing and subscribing to events
 type EventBroadcaster interface {
-	// Subscribe creates a channel that receives events matching the namespace pattern.
-	// Pattern syntax follows NATS conventions: "*" matches single token, ">" matches multiple.
-	Subscribe(namespacePattern string) <-chan *Event
-	// SubscribeWithFilter creates a filtered subscription channel for a namespace pattern.
-	// Events are filtered by the provided SubscriptionFilter before delivery.
-	// Filters are applied with AND logic - events must match all specified criteria.
-	//
-	// Example: Subscribe to "orders" namespace, only receiving "OrderPlaced" events for "order-*" actors:
-	//   filter := &SubscriptionFilter{
-	//       EventTypes:   []string{"OrderPlaced"},
-	//       ActorPattern: "order-*",
-	//   }
-	//   ch := bus.SubscribeWithFilter("orders", filter)
-	SubscribeWithFilter(namespacePattern string, filter *SubscriptionFilter) <-chan *Event
-	Unsubscribe(namespacePattern string, ch <-chan *Event)
+	Subscribe(namespaceID string) <-chan *Event
+	Unsubscribe(namespaceID string, ch <-chan *Event)
 	Publish(namespaceID string, event *Event)
 	Stop()
 	SubscriberCount(namespaceID string) int
 }
-// MetricsProvider is an optional interface that EventBroadcaster implementations
-// can implement to expose metrics.
-type MetricsProvider interface {
-	// Metrics returns the metrics collector for this broadcaster.
-	Metrics() BroadcasterMetrics
-}
-// subscription represents a single subscriber channel with its pattern
-type subscription struct {
-	pattern string
-	ch      chan *Event
-}
-// filteredSubscription represents a subscriber with an optional filter
-type filteredSubscription struct {
-	pattern string
-	ch      chan *Event
-	filter  *SubscriptionFilter
-}
-// EventBus broadcasts events to multiple subscribers within a namespace.
-// Supports wildcard patterns for cross-namespace subscriptions.
-//
-// Security Considerations:
-// Wildcard subscriptions (using "*" or ">") receive events from multiple namespaces.
-// This is intentional for cross-cutting concerns like logging, monitoring, and auditing.
-// However, it bypasses namespace isolation - use with appropriate access controls.
+// EventBus broadcasts events to multiple subscribers within a namespace
 type EventBus struct {
-	// exactSubscribers holds subscribers for exact namespace matches (no wildcards)
-	exactSubscribers map[string][]*filteredSubscription
-	// wildcardSubscribers holds subscribers with wildcard patterns
-	wildcardSubscribers []*filteredSubscription
-	mutex               sync.RWMutex
-	ctx                 context.Context
-	cancel              context.CancelFunc
-	metrics             *DefaultMetricsCollector
+	subscribers map[string][]chan *Event // namespaceID -> channels
+	mutex       sync.RWMutex
+	ctx         context.Context
+	cancel      context.CancelFunc
 }
 // NewEventBus creates a new event bus
 func NewEventBus() *EventBus {
 	ctx, cancel := context.WithCancel(context.Background())
 	return &EventBus{
-		exactSubscribers:    make(map[string][]*filteredSubscription),
-		wildcardSubscribers: make([]*filteredSubscription, 0),
-		ctx:                 ctx,
-		cancel:              cancel,
-		metrics:             NewMetricsCollector(),
+		subscribers: make(map[string][]chan *Event),
+		ctx:         ctx,
+		cancel:      cancel,
 	}
 }
-// Metrics returns the metrics collector for this event bus.
-func (eb *EventBus) Metrics() BroadcasterMetrics {
-	return eb.metrics
-}
-// Subscribe creates a new subscription channel for a namespace pattern.
-// Patterns follow NATS subject matching conventions:
-//   - "*" matches a single token (any sequence without ".")
-//   - ">" matches one or more tokens (only valid at the end)
-//   - Exact strings match exactly
-//
-// Security Warning: Wildcard patterns receive events from all matching namespaces,
-// bypassing namespace isolation. Only use for trusted system components.
-func (eb *EventBus) Subscribe(namespacePattern string) <-chan *Event {
-	return eb.SubscribeWithFilter(namespacePattern, nil)
-}
-// SubscribeWithFilter creates a filtered subscription channel for a namespace pattern.
-// Events are filtered by the provided SubscriptionFilter before delivery.
-// If filter is nil or empty, all events matching the namespace pattern are delivered.
-//
-// Filtering is applied client-side for efficient processing:
-//   - EventTypes: Only events with matching event types are delivered
-//   - ActorPattern: Only events from matching actors are delivered
-//
-// Both namespace pattern wildcards and event filters work together:
-//   - Namespace pattern determines which namespaces to subscribe to
-//   - Filter determines which events within those namespaces to receive
-func (eb *EventBus) SubscribeWithFilter(namespacePattern string, filter *SubscriptionFilter) <-chan *Event {
+// Subscribe creates a new subscription channel for a namespace
+func (eb *EventBus) Subscribe(namespaceID string) <-chan *Event {
 	eb.mutex.Lock()
 	defer eb.mutex.Unlock()
 	// Create buffered channel to prevent blocking publishers
 	ch := make(chan *Event, 100)
+	eb.subscribers[namespaceID] = append(eb.subscribers[namespaceID], ch)
-	sub := &filteredSubscription{
-		pattern: namespacePattern,
-		ch:      ch,
-		filter:  filter,
-	}
-	if IsWildcardPattern(namespacePattern) {
-		// Store wildcard subscription separately
-		eb.wildcardSubscribers = append(eb.wildcardSubscribers, sub)
-	} else {
-		// Exact match subscription
-		eb.exactSubscribers[namespacePattern] = append(eb.exactSubscribers[namespacePattern], sub)
-	}
-	// Record subscription metric
-	eb.metrics.RecordSubscribe(namespacePattern)
 	return ch
 }
 // Unsubscribe removes a subscription channel
-func (eb *EventBus) Unsubscribe(namespacePattern string, ch <-chan *Event) {
+func (eb *EventBus) Unsubscribe(namespaceID string, ch <-chan *Event) {
 	eb.mutex.Lock()
 	defer eb.mutex.Unlock()
-	if IsWildcardPattern(namespacePattern) {
-		// Remove from wildcard subscribers
-		for i, sub := range eb.wildcardSubscribers {
-			if sub.ch == ch {
-				eb.wildcardSubscribers = append(eb.wildcardSubscribers[:i], eb.wildcardSubscribers[i+1:]...)
-				close(sub.ch)
-				// Record unsubscription metric
-				eb.metrics.RecordUnsubscribe(namespacePattern)
-				break
-			}
-		}
-	} else {
-		// Remove from exact subscribers
-		subs := eb.exactSubscribers[namespacePattern]
-		for i, sub := range subs {
-			if sub.ch == ch {
-				// Remove subscription from slice
-				eb.exactSubscribers[namespacePattern] = append(subs[:i], subs[i+1:]...)
-				close(sub.ch)
-				// Record unsubscription metric
-				eb.metrics.RecordUnsubscribe(namespacePattern)
-				break
-			}
+	subs := eb.subscribers[namespaceID]
+	for i, subscriber := range subs {
+		if subscriber == ch {
+			// Remove channel from slice
+			eb.subscribers[namespaceID] = append(subs[:i], subs[i+1:]...)
+			close(subscriber)
+			break
 		}
+	}
-		// Clean up empty namespace entries
-		if len(eb.exactSubscribers[namespacePattern]) == 0 {
-			delete(eb.exactSubscribers, namespacePattern)
-		}
+	// Clean up empty namespace entries
+	if len(eb.subscribers[namespaceID]) == 0 {
+		delete(eb.subscribers, namespaceID)
 	}
 }
// Publish sends an event to all subscribers of a namespace. // Publish sends an event to all subscribers of a namespace
// Events are delivered to:
// - All exact subscribers for the namespace (after filter matching)
// - All wildcard subscribers whose pattern matches the namespace (after filter matching)
func (eb *EventBus) Publish(namespaceID string, event *Event) { func (eb *EventBus) Publish(namespaceID string, event *Event) {
eb.mutex.RLock() eb.mutex.RLock()
defer eb.mutex.RUnlock() defer eb.mutex.RUnlock()
// Record publish metric subscribers := eb.subscribers[namespaceID]
eb.metrics.RecordPublish(namespaceID) for _, ch := range subscribers {
select {
// Deliver to exact subscribers case ch <- event:
subscribers := eb.exactSubscribers[namespaceID] // Event delivered
for _, sub := range subscribers { default:
eb.deliverToSubscriber(sub, event, namespaceID) // Channel full, skip this subscriber (non-blocking)
}
// Deliver to matching wildcard subscribers
for _, sub := range eb.wildcardSubscribers {
if MatchNamespacePattern(sub.pattern, namespaceID) {
eb.deliverToSubscriber(sub, event, namespaceID)
} }
} }
} }
// deliverToSubscriber delivers an event to a subscriber if it matches the filter
func (eb *EventBus) deliverToSubscriber(sub *filteredSubscription, event *Event, namespaceID string) {
// Apply filter if present
if sub.filter != nil && !sub.filter.IsEmpty() {
if !sub.filter.Matches(event) {
// Event doesn't match filter, skip delivery
return
}
}
select {
case sub.ch <- event:
// Event delivered
eb.metrics.RecordReceive(namespaceID)
default:
// Channel full, skip this subscriber (non-blocking)
eb.metrics.RecordDroppedEvent(namespaceID)
}
}
// Stop closes the event bus // Stop closes the event bus
func (eb *EventBus) Stop() { func (eb *EventBus) Stop() {
eb.mutex.Lock() eb.mutex.Lock()
@@ -231,37 +88,19 @@ func (eb *EventBus) Stop() {
eb.cancel() eb.cancel()
// Close all exact subscriber channels and update metrics // Close all subscriber channels
for namespaceID, subs := range eb.exactSubscribers { for _, subs := range eb.subscribers {
for _, sub := range subs { for _, ch := range subs {
close(sub.ch) close(ch)
eb.metrics.RecordUnsubscribe(namespaceID)
} }
} }
// Close all wildcard subscriber channels and update metrics eb.subscribers = make(map[string][]chan *Event)
for _, sub := range eb.wildcardSubscribers {
close(sub.ch)
eb.metrics.RecordUnsubscribe(sub.pattern)
}
eb.exactSubscribers = make(map[string][]*filteredSubscription)
eb.wildcardSubscribers = make([]*filteredSubscription, 0)
} }
// SubscriberCount returns the number of subscribers for a namespace. // SubscriberCount returns the number of subscribers for a namespace
// This counts only exact match subscribers, not wildcard subscribers that may match.
func (eb *EventBus) SubscriberCount(namespaceID string) int { func (eb *EventBus) SubscriberCount(namespaceID string) int {
eb.mutex.RLock() eb.mutex.RLock()
defer eb.mutex.RUnlock() defer eb.mutex.RUnlock()
return len(eb.exactSubscribers[namespaceID]) return len(eb.subscribers[namespaceID])
}
// WildcardSubscriberCount returns the number of wildcard subscribers.
// These are subscribers using "*" or ">" patterns that may receive events
// from multiple namespaces.
func (eb *EventBus) WildcardSubscriberCount() int {
eb.mutex.RLock()
defer eb.mutex.RUnlock()
return len(eb.wildcardSubscribers)
} }


@@ -1,822 +0,0 @@
package aether
import (
"sync"
"testing"
"time"
)
func TestEventBus_ExactSubscription(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
ch := eb.Subscribe("tenant-a")
event := &Event{
ID: "evt-1",
EventType: "TestEvent",
ActorID: "actor-1",
}
eb.Publish("tenant-a", event)
select {
case received := <-ch:
if received.ID != event.ID {
t.Errorf("expected event ID %s, got %s", event.ID, received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("timed out waiting for event")
}
}
func TestEventBus_WildcardStarSubscription(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe to all single-token namespaces
ch := eb.Subscribe("*")
event := &Event{
ID: "evt-1",
EventType: "TestEvent",
ActorID: "actor-1",
}
eb.Publish("tenant-a", event)
select {
case received := <-ch:
if received.ID != event.ID {
t.Errorf("expected event ID %s, got %s", event.ID, received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("timed out waiting for event")
}
}
func TestEventBus_WildcardGreaterSubscription(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe to all namespaces
ch := eb.Subscribe(">")
events := []*Event{
{ID: "evt-1", EventType: "Test1", ActorID: "actor-1"},
{ID: "evt-2", EventType: "Test2", ActorID: "actor-2"},
{ID: "evt-3", EventType: "Test3", ActorID: "actor-3"},
}
namespaces := []string{"tenant-a", "tenant-b", "prod.tenant.orders"}
for i, ns := range namespaces {
eb.Publish(ns, events[i])
}
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < len(events); i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after receiving %d of %d events", i, len(events))
}
}
for _, evt := range events {
if !received[evt.ID] {
t.Errorf("did not receive event %s", evt.ID)
}
}
}
func TestEventBus_PrefixWildcard(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe to prod.*
ch := eb.Subscribe("prod.*")
event1 := &Event{ID: "evt-1", EventType: "Test", ActorID: "actor-1"}
event2 := &Event{ID: "evt-2", EventType: "Test", ActorID: "actor-2"}
event3 := &Event{ID: "evt-3", EventType: "Test", ActorID: "actor-3"}
// Should match
eb.Publish("prod.tenant", event1)
eb.Publish("prod.orders", event2)
// Should not match (different prefix)
eb.Publish("staging.tenant", event3)
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
// Should receive exactly 2 events
for i := 0; i < 2; i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after receiving %d events", len(received))
}
}
// Verify we got the right ones
if !received["evt-1"] || !received["evt-2"] {
t.Errorf("expected evt-1 and evt-2, got %v", received)
}
// Verify no third event arrives
select {
case evt := <-ch:
t.Errorf("unexpected event received: %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected - no more events
}
}
func TestEventBus_MultipleWildcardSubscribers(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
ch1 := eb.Subscribe("prod.*")
ch2 := eb.Subscribe("prod.>")
ch3 := eb.Subscribe(">")
event := &Event{ID: "evt-1", EventType: "Test", ActorID: "actor-1"}
eb.Publish("prod.tenant.orders", event)
// ch1 (prod.*) should NOT receive - doesn't match 3 tokens
select {
case <-ch1:
t.Error("prod.* should not match prod.tenant.orders")
case <-time.After(50 * time.Millisecond):
// Expected
}
// ch2 (prod.>) should receive
select {
case received := <-ch2:
if received.ID != event.ID {
t.Errorf("expected %s, got %s", event.ID, received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Error("prod.> should match prod.tenant.orders")
}
// ch3 (>) should receive
select {
case received := <-ch3:
if received.ID != event.ID {
t.Errorf("expected %s, got %s", event.ID, received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Error("> should match prod.tenant.orders")
}
}
func TestEventBus_ExactAndWildcardCoexist(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
chExact := eb.Subscribe("tenant-a")
chWildcard := eb.Subscribe("*")
event := &Event{ID: "evt-1", EventType: "Test", ActorID: "actor-1"}
eb.Publish("tenant-a", event)
// Both should receive the event
var wg sync.WaitGroup
wg.Add(2)
go func() {
defer wg.Done()
select {
case received := <-chExact:
if received.ID != event.ID {
t.Errorf("exact: expected %s, got %s", event.ID, received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Error("exact subscriber timed out")
}
}()
go func() {
defer wg.Done()
select {
case received := <-chWildcard:
if received.ID != event.ID {
t.Errorf("wildcard: expected %s, got %s", event.ID, received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Error("wildcard subscriber timed out")
}
}()
wg.Wait()
}
func TestEventBus_WildcardUnsubscribe(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
ch := eb.Subscribe("prod.*")
// Verify it's counted
if eb.WildcardSubscriberCount() != 1 {
t.Errorf("expected 1 wildcard subscriber, got %d", eb.WildcardSubscriberCount())
}
eb.Unsubscribe("prod.*", ch)
// Verify it's removed
if eb.WildcardSubscriberCount() != 0 {
t.Errorf("expected 0 wildcard subscribers, got %d", eb.WildcardSubscriberCount())
}
}
func TestEventBus_SubscriberCount(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Add exact subscribers
ch1 := eb.Subscribe("tenant-a")
ch2 := eb.Subscribe("tenant-a")
if eb.SubscriberCount("tenant-a") != 2 {
t.Errorf("expected 2 exact subscribers, got %d", eb.SubscriberCount("tenant-a"))
}
// Add wildcard subscriber - should not affect exact count
eb.Subscribe("*")
if eb.SubscriberCount("tenant-a") != 2 {
t.Errorf("expected 2 exact subscribers after wildcard add, got %d", eb.SubscriberCount("tenant-a"))
}
if eb.WildcardSubscriberCount() != 1 {
t.Errorf("expected 1 wildcard subscriber, got %d", eb.WildcardSubscriberCount())
}
// Unsubscribe exact
eb.Unsubscribe("tenant-a", ch1)
if eb.SubscriberCount("tenant-a") != 1 {
t.Errorf("expected 1 exact subscriber after unsubscribe, got %d", eb.SubscriberCount("tenant-a"))
}
eb.Unsubscribe("tenant-a", ch2)
if eb.SubscriberCount("tenant-a") != 0 {
t.Errorf("expected 0 exact subscribers after unsubscribe, got %d", eb.SubscriberCount("tenant-a"))
}
}
func TestEventBus_StopClosesAllChannels(t *testing.T) {
eb := NewEventBus()
chExact := eb.Subscribe("tenant-a")
chWildcard := eb.Subscribe("*")
eb.Stop()
// Both channels should be closed
select {
case _, ok := <-chExact:
if ok {
t.Error("expected exact channel to be closed")
}
case <-time.After(100 * time.Millisecond):
t.Error("timed out waiting for exact channel close")
}
select {
case _, ok := <-chWildcard:
if ok {
t.Error("expected wildcard channel to be closed")
}
case <-time.After(100 * time.Millisecond):
t.Error("timed out waiting for wildcard channel close")
}
}
func TestEventBus_NamespaceIsolation(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
chA := eb.Subscribe("tenant-a")
chB := eb.Subscribe("tenant-b")
eventA := &Event{ID: "evt-a", EventType: "Test", ActorID: "actor-1"}
eventB := &Event{ID: "evt-b", EventType: "Test", ActorID: "actor-2"}
eb.Publish("tenant-a", eventA)
eb.Publish("tenant-b", eventB)
// Verify tenant-a receives only its event
select {
case received := <-chA:
if received.ID != "evt-a" {
t.Errorf("tenant-a received wrong event: %s", received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Error("tenant-a timed out")
}
select {
case <-chA:
t.Error("tenant-a received extra event")
case <-time.After(50 * time.Millisecond):
// Expected
}
// Verify tenant-b receives only its event
select {
case received := <-chB:
if received.ID != "evt-b" {
t.Errorf("tenant-b received wrong event: %s", received.ID)
}
case <-time.After(100 * time.Millisecond):
t.Error("tenant-b timed out")
}
select {
case <-chB:
t.Error("tenant-b received extra event")
case <-time.After(50 * time.Millisecond):
// Expected
}
}
func TestEventBus_NonBlockingPublish(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Create subscriber but don't read from channel
_ = eb.Subscribe("tenant-a")
// Publish 150 events to overflow the 100-event channel buffer
for i := 0; i < 150; i++ {
event := &Event{
ID: "evt",
EventType: "Test",
ActorID: "actor-1",
}
// Should not block even when channel is full
eb.Publish("tenant-a", event)
}
// If we got here without blocking, test passes
}
func TestEventBus_ConcurrentOperations(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
var wg sync.WaitGroup
// Concurrent subscriptions
for i := 0; i < 10; i++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
ch := eb.Subscribe("tenant-a")
time.Sleep(10 * time.Millisecond)
eb.Unsubscribe("tenant-a", ch)
}(i)
}
// Concurrent wildcard subscriptions
for i := 0; i < 10; i++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
ch := eb.Subscribe("*")
time.Sleep(10 * time.Millisecond)
eb.Unsubscribe("*", ch)
}(i)
}
// Concurrent publishes
for i := 0; i < 10; i++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
event := &Event{
ID: "evt",
EventType: "Test",
ActorID: "actor-1",
}
eb.Publish("tenant-a", event)
}(i)
}
wg.Wait()
}
// Tests for SubscribeWithFilter functionality
func TestEventBus_SubscribeWithFilter_EventTypes(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe with filter for specific event types
filter := &SubscriptionFilter{
EventTypes: []string{"OrderPlaced", "OrderShipped"},
}
ch := eb.SubscribeWithFilter("orders", filter)
// Publish events of different types
events := []*Event{
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"},
{ID: "evt-2", EventType: "OrderCancelled", ActorID: "order-2"}, // Should not be received
{ID: "evt-3", EventType: "OrderShipped", ActorID: "order-3"},
}
for _, e := range events {
eb.Publish("orders", e)
}
// Should receive evt-1 and evt-3, but not evt-2
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < 2; i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after receiving %d events", len(received))
}
}
if !received["evt-1"] || !received["evt-3"] {
t.Errorf("expected to receive evt-1 and evt-3, got %v", received)
}
// Verify evt-2 was not received
select {
case evt := <-ch:
t.Errorf("unexpected event received: %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected
}
}
func TestEventBus_SubscribeWithFilter_ActorPattern(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe with filter for specific actor pattern
filter := &SubscriptionFilter{
ActorPattern: "order-*",
}
ch := eb.SubscribeWithFilter("events", filter)
// Publish events from different actors
events := []*Event{
{ID: "evt-1", EventType: "Test", ActorID: "order-123"},
{ID: "evt-2", EventType: "Test", ActorID: "user-456"}, // Should not be received
{ID: "evt-3", EventType: "Test", ActorID: "order-789"},
}
for _, e := range events {
eb.Publish("events", e)
}
// Should receive evt-1 and evt-3, but not evt-2
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < 2; i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after receiving %d events", len(received))
}
}
if !received["evt-1"] || !received["evt-3"] {
t.Errorf("expected to receive evt-1 and evt-3, got %v", received)
}
// Verify evt-2 was not received
select {
case evt := <-ch:
t.Errorf("unexpected event received: %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected
}
}
func TestEventBus_SubscribeWithFilter_Combined(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe with filter for both event type AND actor pattern
filter := &SubscriptionFilter{
EventTypes: []string{"OrderPlaced"},
ActorPattern: "order-*",
}
ch := eb.SubscribeWithFilter("orders", filter)
// Publish events with various combinations
events := []*Event{
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-123"}, // Should be received
{ID: "evt-2", EventType: "OrderPlaced", ActorID: "user-456"}, // Wrong actor
{ID: "evt-3", EventType: "OrderCancelled", ActorID: "order-789"}, // Wrong type
{ID: "evt-4", EventType: "OrderCancelled", ActorID: "user-000"}, // Wrong both
}
for _, e := range events {
eb.Publish("orders", e)
}
// Should only receive evt-1
select {
case evt := <-ch:
if evt.ID != "evt-1" {
t.Errorf("expected evt-1, got %s", evt.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("timed out waiting for event")
}
// Verify no more events arrive
select {
case evt := <-ch:
t.Errorf("unexpected event received: %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected
}
}
func TestEventBus_SubscribeWithFilter_NilFilter(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe with nil filter - should receive all events
ch := eb.SubscribeWithFilter("events", nil)
events := []*Event{
{ID: "evt-1", EventType: "TypeA", ActorID: "actor-1"},
{ID: "evt-2", EventType: "TypeB", ActorID: "actor-2"},
}
for _, e := range events {
eb.Publish("events", e)
}
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < 2; i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after receiving %d events", len(received))
}
}
if !received["evt-1"] || !received["evt-2"] {
t.Errorf("expected all events, got %v", received)
}
}
func TestEventBus_SubscribeWithFilter_EmptyFilter(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe with empty filter - should receive all events
ch := eb.SubscribeWithFilter("events", &SubscriptionFilter{})
events := []*Event{
{ID: "evt-1", EventType: "TypeA", ActorID: "actor-1"},
{ID: "evt-2", EventType: "TypeB", ActorID: "actor-2"},
}
for _, e := range events {
eb.Publish("events", e)
}
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < 2; i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after receiving %d events", len(received))
}
}
if !received["evt-1"] || !received["evt-2"] {
t.Errorf("expected all events, got %v", received)
}
}
func TestEventBus_SubscribeWithFilter_WildcardNamespaceAndFilter(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Subscribe to wildcard namespace pattern with event type filter
filter := &SubscriptionFilter{
EventTypes: []string{"OrderPlaced"},
}
ch := eb.SubscribeWithFilter("prod.*", filter)
// Publish events to different namespaces
events := []*Event{
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"}, // prod.orders - should match
{ID: "evt-2", EventType: "OrderShipped", ActorID: "order-2"}, // prod.orders - wrong type
{ID: "evt-3", EventType: "OrderPlaced", ActorID: "order-3"}, // staging.orders - wrong namespace
}
eb.Publish("prod.orders", events[0])
eb.Publish("prod.orders", events[1])
eb.Publish("staging.orders", events[2])
// Should only receive evt-1
select {
case evt := <-ch:
if evt.ID != "evt-1" {
t.Errorf("expected evt-1, got %s", evt.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("timed out waiting for event")
}
// Verify no more events arrive
select {
case evt := <-ch:
t.Errorf("unexpected event received: %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected
}
}
func TestEventBus_SubscribeWithFilter_MultipleSubscribersWithDifferentFilters(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Two subscribers with different filters on same namespace
filter1 := &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}
filter2 := &SubscriptionFilter{EventTypes: []string{"OrderShipped"}}
ch1 := eb.SubscribeWithFilter("orders", filter1)
ch2 := eb.SubscribeWithFilter("orders", filter2)
events := []*Event{
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"},
{ID: "evt-2", EventType: "OrderShipped", ActorID: "order-2"},
}
for _, e := range events {
eb.Publish("orders", e)
}
// ch1 should only receive evt-1
select {
case evt := <-ch1:
if evt.ID != "evt-1" {
t.Errorf("ch1: expected evt-1, got %s", evt.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("ch1 timed out")
}
// ch2 should only receive evt-2
select {
case evt := <-ch2:
if evt.ID != "evt-2" {
t.Errorf("ch2: expected evt-2, got %s", evt.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("ch2 timed out")
}
// Verify no extra events
select {
case evt := <-ch1:
t.Errorf("ch1: unexpected event %s", evt.ID)
case evt := <-ch2:
t.Errorf("ch2: unexpected event %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected
}
}
func TestEventBus_SubscribeWithFilter_UnsubscribeFiltered(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
filter := &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}
ch := eb.SubscribeWithFilter("orders", filter)
// Verify subscription count
if eb.SubscriberCount("orders") != 1 {
t.Errorf("expected 1 subscriber, got %d", eb.SubscriberCount("orders"))
}
eb.Unsubscribe("orders", ch)
// Verify unsubscribed
if eb.SubscriberCount("orders") != 0 {
t.Errorf("expected 0 subscribers, got %d", eb.SubscriberCount("orders"))
}
}
func TestEventBus_SubscribeWithFilter_FilteredAndUnfilteredCoexist(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// One subscriber with filter, one without
filter := &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}
chFiltered := eb.SubscribeWithFilter("orders", filter)
chUnfiltered := eb.Subscribe("orders")
events := []*Event{
{ID: "evt-1", EventType: "OrderPlaced", ActorID: "order-1"},
{ID: "evt-2", EventType: "OrderShipped", ActorID: "order-2"},
}
for _, e := range events {
eb.Publish("orders", e)
}
// Filtered subscriber should only receive evt-1
select {
case evt := <-chFiltered:
if evt.ID != "evt-1" {
t.Errorf("filtered: expected evt-1, got %s", evt.ID)
}
case <-time.After(100 * time.Millisecond):
t.Fatal("filtered subscriber timed out")
}
// Unfiltered subscriber should receive both
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < 2; i++ {
select {
case evt := <-chUnfiltered:
received[evt.ID] = true
case <-timeout:
t.Fatalf("unfiltered timed out after %d events", len(received))
}
}
if !received["evt-1"] || !received["evt-2"] {
t.Errorf("unfiltered expected both events, got %v", received)
}
}
func TestEventBus_SubscribeWithFilter_WildcardGreaterWithFilter(t *testing.T) {
eb := NewEventBus()
defer eb.Stop()
// Use > wildcard (matches one or more tokens) with filter
filter := &SubscriptionFilter{
ActorPattern: "order-*",
}
ch := eb.SubscribeWithFilter(">", filter)
events := []*Event{
{ID: "evt-1", EventType: "Test", ActorID: "order-123"},
{ID: "evt-2", EventType: "Test", ActorID: "user-456"},
{ID: "evt-3", EventType: "Test", ActorID: "order-789"},
}
// Publish to different namespaces
eb.Publish("tenant-a", events[0])
eb.Publish("tenant-b", events[1])
eb.Publish("prod.orders", events[2])
// Should receive evt-1 and evt-3, but not evt-2
received := make(map[string]bool)
timeout := time.After(100 * time.Millisecond)
for i := 0; i < 2; i++ {
select {
case evt := <-ch:
received[evt.ID] = true
case <-timeout:
t.Fatalf("timed out after %d events", len(received))
}
}
if !received["evt-1"] || !received["evt-3"] {
t.Errorf("expected evt-1 and evt-3, got %v", received)
}
// Verify no evt-2
select {
case evt := <-ch:
t.Errorf("unexpected event: %s", evt.ID)
case <-time.After(50 * time.Millisecond):
// Expected
}
}


@@ -1,189 +0,0 @@
# Aether Examples
This directory contains examples demonstrating common patterns for using Aether.
## Retry Patterns (`retry_patterns.go`)
When saving events with optimistic concurrency control, your application may encounter `VersionConflictError` when multiple writers attempt to update the same actor concurrently. This file demonstrates several retry strategies.
### Pattern Overview
All retry patterns work with `VersionConflictError` which provides three critical fields:
- **ActorID**: The actor that experienced the conflict
- **CurrentVersion**: The latest version in the store
- **AttemptedVersion**: The version you tried to save
Your application can read these fields to make intelligent retry decisions.
### Available Patterns
#### SimpleRetryPattern
The most basic pattern - just retry with exponential backoff:
```go
// Automatically retries up to 3 times with exponential backoff
err := SimpleRetryPattern(store, "order-123", "OrderUpdated")
```
**Use when**: You want a straightforward retry mechanism without complex logic.
#### ConflictDetailedRetryPattern
Extracts detailed information from the conflict error to make smarter decisions:
```go
// Detects thrashing (multiple conflicts at same version)
// and can implement circuit-breaker logic
err := ConflictDetailedRetryPattern(store, "order-123", "OrderUpdated")
```
**Use when**: You need visibility into conflict patterns and want to detect system issues like thrashing.
#### JitterRetryPattern
Adds randomized jitter to prevent "thundering herd" when multiple writers retry:
```go
// Exponential backoff with jitter prevents synchronized retries
err := JitterRetryPattern(store, "order-123", "OrderUpdated")
```
**Use when**: You have high concurrency and want to prevent all writers from retrying at the same time.
#### AdaptiveRetryPattern
Adjusts the backoff duration based on version distance (an indicator of contention):
```go
// Light contention (gap=1): 50ms backoff
// Moderate contention (gap=3-10): proportional backoff
// High contention (gap>10): aggressive backoff
err := AdaptiveRetryPattern(store, "order-123", "OrderUpdated")
```
**Use when**: You want the backoff strategy to respond to actual system load.
#### EventualConsistencyPattern
Instead of blocking on retry, queues the event for asynchronous retry:
```go
// Returns immediately, event is queued for background retry
EventualConsistencyPattern(store, retryQueue, event)
// Background worker processes the queue
for item := range retryQueue {
// Implement your own retry logic here
}
```
**Use when**: You can't afford to block the request, and background retry is acceptable.
#### CircuitBreakerPattern
Implements a circuit breaker to prevent cascading failures:
```go
cb := NewCircuitBreaker()
// Fails fast when circuit is open
err := CircuitBreakerRetryPattern(store, cb, "order-123", "OrderUpdated")
if err != nil && !cb.CanRetry() {
return ErrCircuitBreakerOpen
}
```
**Use when**: You have a distributed system and want to prevent retry storms during outages.
## Common Pattern: Extract and Log Context
All patterns can read context from `VersionConflictError`:
```go
var versionErr *aether.VersionConflictError
if errors.As(err, &versionErr) {
log.Printf(
"Conflict for actor %q: attempted %d, current %d",
versionErr.ActorID,
versionErr.AttemptedVersion,
versionErr.CurrentVersion,
)
}
```
## Sentinel Error Check
Check if an error is a version conflict without examining the struct:
```go
if errors.Is(err, aether.ErrVersionConflict) {
// This is a version conflict - retry is appropriate
}
```
## Implementing Your Own Pattern
Basic template:
```go
for attempt := 0; attempt < maxRetries; attempt++ {
// 1. Get current version
currentVersion, err := store.GetLatestVersion(actorID)
if err != nil {
return err
}
// 2. Create event with next version
event := &aether.Event{
ActorID: actorID,
Version: currentVersion + 1,
// ... other fields
}
// 3. Attempt save
err = store.SaveEvent(event)
if err == nil {
return nil // Success
}
// 4. Check if it's a conflict
if !errors.Is(err, aether.ErrVersionConflict) {
return err // Some other error
}
// 5. Implement your retry strategy
time.Sleep(yourBackoff(attempt))
}
```
## Choosing a Pattern
| Pattern | Latency | Throughput | Complexity | Use Case |
|---------|---------|-----------|-----------|----------|
| Simple | Low | Low | Very Low | Single writer, testing |
| ConflictDetailed | Low | Medium | Medium | Debugging, monitoring |
| Jitter | Low-Medium | High | Low | Multi-writer concurrency |
| Adaptive | Low-Medium | High | Medium | Variable load scenarios |
| EventualConsistency | Very Low | Very High | High | High-volume, async-OK workloads |
| CircuitBreaker | Variable | Stable | High | Distributed, failure-resilient systems |
## Performance Considerations
1. **Backoff timing**: Shorter backoffs waste CPU on retries; longer backoffs increase latency
2. **Retry limits**: Too few retries give up too early; too many waste resources
3. **Jitter**: Essential for preventing synchronized retries in high-concurrency scenarios
4. **Monitoring**: Track retry rates and conflict patterns to detect system issues
## Testing
Use `store.NewInMemoryEventStore()` in tests:
```go
store := store.NewInMemoryEventStore()
err := SimpleRetryPattern(store, "test-actor", "TestEvent")
if err != nil {
t.Fatalf("retry pattern failed: %v", err)
}
```


@@ -1,168 +0,0 @@
// Package main demonstrates cross-node event broadcasting using NATSEventBus
// and JetStreamEventStore for cluster synchronization.
//
// This example shows:
// 1. Setting up NATSEventBus with JetStreamEventStore
// 2. Broadcasting events across NATS for cross-node distribution
// 3. Subscribing to EventStored events for version cache synchronization
// 4. Properly handling EventStored events from other cluster nodes
//
// Prerequisites:
// - NATS server running with JetStream enabled (nats-server -js)
// - Events stream created in JetStream
package main
import (
"context"
"log"
"os"
"os/signal"
"syscall"
"time"
"git.flowmade.one/flowmade-one/aether"
"git.flowmade.one/flowmade-one/aether/store"
"github.com/google/uuid"
"github.com/nats-io/nats.go"
)
func main() {
natsURL := getEnv("NATS_URL", "nats://localhost:4222")
nc, err := nats.Connect(natsURL)
if err != nil {
log.Fatal("Failed to connect to NATS:", err)
}
defer nc.Close()
ctx := context.Background()
store1, err := store.NewJetStreamEventStore(nc, "events")
if err != nil {
log.Fatal("Failed to create event store:", err)
}
eventBus1 := aether.NewNATSEventBusWithBroadcaster(nc, store1, "")
defer eventBus1.Stop()
store2, err := store.NewJetStreamEventStore(nc, "events")
if err != nil {
log.Fatal("Failed to create event store:", err)
}
eventBus2 := aether.NewNATSEventBusWithBroadcaster(nc, store2, "")
defer eventBus2.Stop()
eventStoredCh1 := eventBus1.SubscribeToEventStored("*")
eventStoredCh2 := eventBus2.SubscribeToEventStored("*")
done := make(chan struct{})
go processEvents(ctx, eventStoredCh1, store1, done)
go processEvents(ctx, eventStoredCh2, store2, done)
go func() {
time.Sleep(2 * time.Second)
actorID := "demo-actor"
event1 := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderPlaced",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{
"total": 99.99,
"status": "pending",
},
Timestamp: time.Now(),
}
log.Printf("Node 1 publishing event: %s", event1.EventType)
eventBus1.Publish("", event1)
time.Sleep(500 * time.Millisecond)
event2 := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderPaid",
ActorID: actorID,
Version: 2,
Data: map[string]interface{}{
"total": 99.99,
"status": "paid",
"method": "credit_card",
},
Timestamp: time.Now(),
}
log.Printf("Node 2 publishing event: %s", event2.EventType)
eventBus2.Publish("", event2)
time.Sleep(2 * time.Second)
close(done)
log.Println("Cross-node broadcasting demo complete")
}()
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
select {
case <-sigCh:
log.Println("Shutting down...")
case <-done:
}
}
func processEvents(ctx context.Context, eventStoredCh <-chan *aether.Event, eventStore *store.JetStreamEventStore, done chan struct{}) {
for {
select {
case <-done:
return
case <-ctx.Done():
return
case event, ok := <-eventStoredCh:
if !ok {
return
}
if event == nil {
continue
}
if event.EventType != aether.EventTypeEventStored {
continue
}
actorID, ok := event.Data["actorId"].(string)
if !ok {
log.Printf("Warning: EventStored missing actorId")
continue
}
version, ok := event.Data["version"].(int64)
if !ok {
log.Printf("Warning: EventStored missing version")
continue
}
eventID, _ := event.Data["eventId"].(string)
log.Printf("Received EventStored: actor=%s, version=%d, eventId=%s", actorID, version, eventID)
eventStore.UpdateVersionCache(actorID, version)
currentVersion, _ := eventStore.GetLatestVersion(actorID)
log.Printf("Updated cache: %s now has version %d (cached: %d)", actorID, version, currentVersion)
}
}
}
func getEnv(key, defaultValue string) string {
if value := os.Getenv(key); value != "" {
return value
}
return defaultValue
}

16
go.mod

@@ -1,26 +1,16 @@
 module git.flowmade.one/flowmade-one/aether
-go 1.23.0
+go 1.23
 require (
 	github.com/google/uuid v1.6.0
 	github.com/nats-io/nats.go v1.37.0
-	github.com/prometheus/client_golang v1.23.2
 )
 require (
-	github.com/beorn7/perks v1.0.1 // indirect
-	github.com/cespare/xxhash/v2 v2.3.0 // indirect
-	github.com/klauspost/compress v1.18.0 // indirect
-	github.com/kr/text v0.2.0 // indirect
-	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
+	github.com/klauspost/compress v1.17.2 // indirect
 	github.com/nats-io/nkeys v0.4.7 // indirect
 	github.com/nats-io/nuid v1.0.1 // indirect
-	github.com/prometheus/client_model v0.6.2 // indirect
-	github.com/prometheus/common v0.66.1 // indirect
-	github.com/prometheus/procfs v0.16.1 // indirect
-	go.yaml.in/yaml/v2 v2.4.2 // indirect
 	golang.org/x/crypto v0.18.0 // indirect
-	golang.org/x/sys v0.35.0 // indirect
-	google.golang.org/protobuf v1.36.8 // indirect
+	golang.org/x/sys v0.16.0 // indirect
 )

48
go.sum

@@ -1,54 +1,14 @@
-github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
-github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
-github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
-github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
-github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
-github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
-github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
-github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
-github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
 github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
-github.com/klauspost/compress v1.18.0 h1:c/Cqfb0r+Yi+JtIEq73FWXVkRonBlf0CRNYc8Zttxdo=
-github.com/klauspost/compress v1.18.0/go.mod h1:2Pp+KzxcywXVXMr50+X0Q/Lsb43OQHYWRCY2AiWywWQ=
+github.com/klauspost/compress v1.17.2 h1:RlWWUY/Dr4fL8qk9YG7DTZ7PDgME2V4csBXA8L/ixi4=
+github.com/klauspost/compress v1.17.2/go.mod h1:ntbaceVETuRiXiv4DpjP66DpAtAGkEQskQzEyD//IeE=
-github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
-github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
-github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
-github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
-github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
-github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
 github.com/nats-io/nats.go v1.37.0 h1:07rauXbVnnJvv1gfIyghFEo6lUcYRY0WXc3x7x0vUxE=
 github.com/nats-io/nats.go v1.37.0/go.mod h1:Ubdu4Nh9exXdSz0RVWRFBbRfrbSxOYd26oF0wkWclB8=
 github.com/nats-io/nkeys v0.4.7 h1:RwNJbbIdYCoClSDNY7QVKZlyb/wfT6ugvFCiKy6vDvI=
 github.com/nats-io/nkeys v0.4.7/go.mod h1:kqXRgRDPlGy7nGaEDMuYzmiJCIAAWDK0IMBtDmGD0nc=
 github.com/nats-io/nuid v1.0.1 h1:5iA8DT8V7q8WK2EScv2padNa/rTESc1KdnPw4TC2paw=
 github.com/nats-io/nuid v1.0.1/go.mod h1:19wcPz3Ph3q0Jbyiqsd0kePYG7A95tJPxeL+1OSON2c=
-github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
-github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
-github.com/prometheus/client_golang v1.23.2 h1:Je96obch5RDVy3FDMndoUsjAhG5Edi49h0RJWRi/o0o=
-github.com/prometheus/client_golang v1.23.2/go.mod h1:Tb1a6LWHB3/SPIzCoaDXI4I8UHKeFTEQ1YCr+0Gyqmg=
-github.com/prometheus/client_model v0.6.2 h1:oBsgwpGs7iVziMvrGhE53c/GrLUsZdHnqNwqPLxwZyk=
-github.com/prometheus/client_model v0.6.2/go.mod h1:y3m2F6Gdpfy6Ut/GBsUqTWZqCUvMVzSfMLjcu6wAwpE=
-github.com/prometheus/common v0.66.1 h1:h5E0h5/Y8niHc5DlaLlWLArTQI7tMrsfQjHV+d9ZoGs=
-github.com/prometheus/common v0.66.1/go.mod h1:gcaUsgf3KfRSwHY4dIMXLPV0K/Wg1oZ8+SbZk/HH/dA=
-github.com/prometheus/procfs v0.16.1 h1:hZ15bTNuirocR6u0JZ6BAHHmwS1p8B4P6MRqxtzMyRg=
-github.com/prometheus/procfs v0.16.1/go.mod h1:teAbpZRB1iIAJYREa1LsoWUXykVXA1KlTmWl8x/U+Is=
-github.com/rogpeppe/go-internal v1.10.0 h1:TMyTOH3F/DB16zRVcYyreMH6GnZZrwQVAoYjRBZyWFQ=
-github.com/rogpeppe/go-internal v1.10.0/go.mod h1:UQnix2H7Ngw/k4C5ijL5+65zddjncjaFoBhdsK/akog=
-github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
-github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
-go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
-go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
-go.yaml.in/yaml/v2 v2.4.2 h1:DzmwEr2rDGHl7lsFgAHxmNz/1NlQ7xLIrlN2h5d1eGI=
-go.yaml.in/yaml/v2 v2.4.2/go.mod h1:081UH+NErpNdqlCXm3TtEran0rJZGxAYx9hb/ELlsPU=
 golang.org/x/crypto v0.18.0 h1:PGVlW0xEltQnzFZ55hkuX5+KLyrMYhHld1YHO4AKcdc=
 golang.org/x/crypto v0.18.0/go.mod h1:R0j02AL6hcrfOiy9T4ZYp/rcWeMxM3L6QYxlOuEG1mg=
-golang.org/x/sys v0.35.0 h1:vz1N37gP5bs89s7He8XuIYXpyY0+QlsKmzipCbUtyxI=
-golang.org/x/sys v0.35.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
+golang.org/x/sys v0.16.0 h1:xWw16ngr6ZMtmxDyKyIgsE93KNKz5HKmMa3b8ALHidU=
+golang.org/x/sys v0.16.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
-google.golang.org/protobuf v1.36.8 h1:xHScyCOEuuwZEc6UtSOvPbAT4zRh0xcNRYekJwfqyMc=
-google.golang.org/protobuf v1.36.8/go.mod h1:fuxRtAxBytpl4zzqUh6/eyUujkJdNiuEkXntxiD/uRU=
-gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
-gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
-gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
-gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
-gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=


@@ -1,258 +0,0 @@
package aether
import (
"sync"
"sync/atomic"
)
// BroadcasterMetrics provides observability metrics for EventBroadcaster implementations.
// All methods are safe for concurrent use.
type BroadcasterMetrics interface {
// EventsPublished returns the total number of events published per namespace.
EventsPublished(namespaceID string) int64
// EventsReceived returns the total number of events received per namespace.
// For EventBus this equals events delivered to subscribers.
// For NATSEventBus this includes events received from NATS.
EventsReceived(namespaceID string) int64
// ActiveSubscriptions returns the current number of active subscriptions per namespace.
ActiveSubscriptions(namespaceID string) int64
// TotalActiveSubscriptions returns the total number of active subscriptions across all namespaces.
TotalActiveSubscriptions() int64
// PublishErrors returns the total number of publish errors per namespace.
PublishErrors(namespaceID string) int64
// SubscribeErrors returns the total number of subscribe errors per namespace.
SubscribeErrors(namespaceID string) int64
// DroppedEvents returns the total number of events dropped (e.g., full channel) per namespace.
DroppedEvents(namespaceID string) int64
// Namespaces returns a list of all namespaces that have metrics.
Namespaces() []string
// Reset resets all metrics. Useful for testing.
Reset()
}
// MetricsCollector provides methods for collecting metrics.
// This interface is implemented internally and used by EventBus implementations.
type MetricsCollector interface {
BroadcasterMetrics
// RecordPublish records a successful publish event.
RecordPublish(namespaceID string)
// RecordReceive records a received event.
RecordReceive(namespaceID string)
// RecordSubscribe records a new subscription.
RecordSubscribe(namespaceID string)
// RecordUnsubscribe records a removed subscription.
RecordUnsubscribe(namespaceID string)
// RecordPublishError records a publish error.
RecordPublishError(namespaceID string)
// RecordSubscribeError records a subscribe error.
RecordSubscribeError(namespaceID string)
// RecordDroppedEvent records a dropped event (e.g., channel full).
RecordDroppedEvent(namespaceID string)
}
// namespaceMetrics holds counters for a single namespace.
type namespaceMetrics struct {
eventsPublished int64
eventsReceived int64
activeSubscriptions int64
publishErrors int64
subscribeErrors int64
droppedEvents int64
}
// DefaultMetricsCollector is the default implementation of MetricsCollector.
// Counters are updated with atomic operations; the namespace map itself is guarded by mu.
type DefaultMetricsCollector struct {
mu sync.RWMutex
namespaces map[string]*namespaceMetrics
}
// NewMetricsCollector creates a new DefaultMetricsCollector.
func NewMetricsCollector() *DefaultMetricsCollector {
return &DefaultMetricsCollector{
namespaces: make(map[string]*namespaceMetrics),
}
}
// getOrCreateNamespace returns metrics for a namespace, creating if needed.
func (m *DefaultMetricsCollector) getOrCreateNamespace(namespaceID string) *namespaceMetrics {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if exists {
return ns
}
m.mu.Lock()
defer m.mu.Unlock()
// Double-check after acquiring write lock
if ns, exists = m.namespaces[namespaceID]; exists {
return ns
}
ns = &namespaceMetrics{}
m.namespaces[namespaceID] = ns
return ns
}
// EventsPublished returns the total number of events published for a namespace.
func (m *DefaultMetricsCollector) EventsPublished(namespaceID string) int64 {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if !exists {
return 0
}
return atomic.LoadInt64(&ns.eventsPublished)
}
// EventsReceived returns the total number of events received for a namespace.
func (m *DefaultMetricsCollector) EventsReceived(namespaceID string) int64 {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if !exists {
return 0
}
return atomic.LoadInt64(&ns.eventsReceived)
}
// ActiveSubscriptions returns the current number of active subscriptions for a namespace.
func (m *DefaultMetricsCollector) ActiveSubscriptions(namespaceID string) int64 {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if !exists {
return 0
}
return atomic.LoadInt64(&ns.activeSubscriptions)
}
// TotalActiveSubscriptions returns the total number of active subscriptions across all namespaces.
func (m *DefaultMetricsCollector) TotalActiveSubscriptions() int64 {
m.mu.RLock()
defer m.mu.RUnlock()
var total int64
for _, ns := range m.namespaces {
total += atomic.LoadInt64(&ns.activeSubscriptions)
}
return total
}
// PublishErrors returns the total number of publish errors for a namespace.
func (m *DefaultMetricsCollector) PublishErrors(namespaceID string) int64 {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if !exists {
return 0
}
return atomic.LoadInt64(&ns.publishErrors)
}
// SubscribeErrors returns the total number of subscribe errors for a namespace.
func (m *DefaultMetricsCollector) SubscribeErrors(namespaceID string) int64 {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if !exists {
return 0
}
return atomic.LoadInt64(&ns.subscribeErrors)
}
// DroppedEvents returns the total number of dropped events for a namespace.
func (m *DefaultMetricsCollector) DroppedEvents(namespaceID string) int64 {
m.mu.RLock()
ns, exists := m.namespaces[namespaceID]
m.mu.RUnlock()
if !exists {
return 0
}
return atomic.LoadInt64(&ns.droppedEvents)
}
// Namespaces returns a list of all namespaces that have metrics.
func (m *DefaultMetricsCollector) Namespaces() []string {
m.mu.RLock()
defer m.mu.RUnlock()
namespaces := make([]string, 0, len(m.namespaces))
for ns := range m.namespaces {
namespaces = append(namespaces, ns)
}
return namespaces
}
// Reset discards all per-namespace counters; Namespaces() is empty afterwards.
func (m *DefaultMetricsCollector) Reset() {
m.mu.Lock()
defer m.mu.Unlock()
m.namespaces = make(map[string]*namespaceMetrics)
}
// RecordPublish records a successful publish event.
func (m *DefaultMetricsCollector) RecordPublish(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.eventsPublished, 1)
}
// RecordReceive records a received event.
func (m *DefaultMetricsCollector) RecordReceive(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.eventsReceived, 1)
}
// RecordSubscribe records a new subscription.
func (m *DefaultMetricsCollector) RecordSubscribe(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.activeSubscriptions, 1)
}
// RecordUnsubscribe records a removed subscription.
func (m *DefaultMetricsCollector) RecordUnsubscribe(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.activeSubscriptions, -1)
}
// RecordPublishError records a publish error.
func (m *DefaultMetricsCollector) RecordPublishError(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.publishErrors, 1)
}
// RecordSubscribeError records a subscribe error.
func (m *DefaultMetricsCollector) RecordSubscribeError(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.subscribeErrors, 1)
}
// RecordDroppedEvent records a dropped event.
func (m *DefaultMetricsCollector) RecordDroppedEvent(namespaceID string) {
ns := m.getOrCreateNamespace(namespaceID)
atomic.AddInt64(&ns.droppedEvents, 1)
}


@@ -1,123 +0,0 @@
package aether
import (
"github.com/prometheus/client_golang/prometheus"
)
// PrometheusMetricsAdapter exposes BroadcasterMetrics as Prometheus metrics.
// It implements prometheus.Collector and can be registered with a Prometheus registry.
type PrometheusMetricsAdapter struct {
metrics BroadcasterMetrics
eventsPublishedDesc *prometheus.Desc
eventsReceivedDesc *prometheus.Desc
activeSubscriptionsDesc *prometheus.Desc
publishErrorsDesc *prometheus.Desc
subscribeErrorsDesc *prometheus.Desc
droppedEventsDesc *prometheus.Desc
}
// NewPrometheusMetricsAdapter creates a new PrometheusMetricsAdapter that wraps
// a BroadcasterMetrics implementation and exposes it as Prometheus metrics.
//
// The adapter implements prometheus.Collector and should be registered with
// a Prometheus registry:
//
// eb := aether.NewEventBus()
// adapter := aether.NewPrometheusMetricsAdapter(eb.Metrics())
// prometheus.MustRegister(adapter)
func NewPrometheusMetricsAdapter(metrics BroadcasterMetrics) *PrometheusMetricsAdapter {
return &PrometheusMetricsAdapter{
metrics: metrics,
eventsPublishedDesc: prometheus.NewDesc(
"aether_events_published_total",
"Total number of events published per namespace",
[]string{"namespace"},
nil,
),
eventsReceivedDesc: prometheus.NewDesc(
"aether_events_received_total",
"Total number of events received per namespace",
[]string{"namespace"},
nil,
),
activeSubscriptionsDesc: prometheus.NewDesc(
"aether_active_subscriptions",
"Number of active subscriptions per namespace",
[]string{"namespace"},
nil,
),
publishErrorsDesc: prometheus.NewDesc(
"aether_publish_errors_total",
"Total number of publish errors per namespace",
[]string{"namespace"},
nil,
),
subscribeErrorsDesc: prometheus.NewDesc(
"aether_subscribe_errors_total",
"Total number of subscribe errors per namespace",
[]string{"namespace"},
nil,
),
droppedEventsDesc: prometheus.NewDesc(
"aether_dropped_events_total",
"Total number of dropped events per namespace",
[]string{"namespace"},
nil,
),
}
}
// Describe implements prometheus.Collector.
func (a *PrometheusMetricsAdapter) Describe(ch chan<- *prometheus.Desc) {
ch <- a.eventsPublishedDesc
ch <- a.eventsReceivedDesc
ch <- a.activeSubscriptionsDesc
ch <- a.publishErrorsDesc
ch <- a.subscribeErrorsDesc
ch <- a.droppedEventsDesc
}
// Collect implements prometheus.Collector.
func (a *PrometheusMetricsAdapter) Collect(ch chan<- prometheus.Metric) {
namespaces := a.metrics.Namespaces()
for _, ns := range namespaces {
ch <- prometheus.MustNewConstMetric(
a.eventsPublishedDesc,
prometheus.CounterValue,
float64(a.metrics.EventsPublished(ns)),
ns,
)
ch <- prometheus.MustNewConstMetric(
a.eventsReceivedDesc,
prometheus.CounterValue,
float64(a.metrics.EventsReceived(ns)),
ns,
)
ch <- prometheus.MustNewConstMetric(
a.activeSubscriptionsDesc,
prometheus.GaugeValue,
float64(a.metrics.ActiveSubscriptions(ns)),
ns,
)
ch <- prometheus.MustNewConstMetric(
a.publishErrorsDesc,
prometheus.CounterValue,
float64(a.metrics.PublishErrors(ns)),
ns,
)
ch <- prometheus.MustNewConstMetric(
a.subscribeErrorsDesc,
prometheus.CounterValue,
float64(a.metrics.SubscribeErrors(ns)),
ns,
)
ch <- prometheus.MustNewConstMetric(
a.droppedEventsDesc,
prometheus.CounterValue,
float64(a.metrics.DroppedEvents(ns)),
ns,
)
}
}


@@ -1,304 +0,0 @@
package aether_test
import (
"sync"
"testing"
"time"
"git.flowmade.one/flowmade-one/aether"
)
func TestMetricsCollector_InitialState(t *testing.T) {
mc := aether.NewMetricsCollector()
if got := mc.EventsPublished("test-ns"); got != 0 {
t.Errorf("EventsPublished() = %d, want 0", got)
}
if got := mc.EventsReceived("test-ns"); got != 0 {
t.Errorf("EventsReceived() = %d, want 0", got)
}
if got := mc.ActiveSubscriptions("test-ns"); got != 0 {
t.Errorf("ActiveSubscriptions() = %d, want 0", got)
}
if got := mc.TotalActiveSubscriptions(); got != 0 {
t.Errorf("TotalActiveSubscriptions() = %d, want 0", got)
}
if got := mc.PublishErrors("test-ns"); got != 0 {
t.Errorf("PublishErrors() = %d, want 0", got)
}
if got := mc.SubscribeErrors("test-ns"); got != 0 {
t.Errorf("SubscribeErrors() = %d, want 0", got)
}
if got := mc.DroppedEvents("test-ns"); got != 0 {
t.Errorf("DroppedEvents() = %d, want 0", got)
}
if got := len(mc.Namespaces()); got != 0 {
t.Errorf("Namespaces() = %d, want 0", got)
}
}
func TestMetricsCollector_RecordPublish(t *testing.T) {
mc := aether.NewMetricsCollector()
mc.RecordPublish("ns1")
mc.RecordPublish("ns1")
mc.RecordPublish("ns2")
if got := mc.EventsPublished("ns1"); got != 2 {
t.Errorf("EventsPublished(ns1) = %d, want 2", got)
}
if got := mc.EventsPublished("ns2"); got != 1 {
t.Errorf("EventsPublished(ns2) = %d, want 1", got)
}
}
func TestMetricsCollector_RecordReceive(t *testing.T) {
mc := aether.NewMetricsCollector()
mc.RecordReceive("ns1")
mc.RecordReceive("ns1")
mc.RecordReceive("ns1")
if got := mc.EventsReceived("ns1"); got != 3 {
t.Errorf("EventsReceived(ns1) = %d, want 3", got)
}
}
func TestMetricsCollector_Subscriptions(t *testing.T) {
mc := aether.NewMetricsCollector()
mc.RecordSubscribe("ns1")
mc.RecordSubscribe("ns1")
mc.RecordSubscribe("ns2")
if got := mc.ActiveSubscriptions("ns1"); got != 2 {
t.Errorf("ActiveSubscriptions(ns1) = %d, want 2", got)
}
if got := mc.ActiveSubscriptions("ns2"); got != 1 {
t.Errorf("ActiveSubscriptions(ns2) = %d, want 1", got)
}
if got := mc.TotalActiveSubscriptions(); got != 3 {
t.Errorf("TotalActiveSubscriptions() = %d, want 3", got)
}
mc.RecordUnsubscribe("ns1")
if got := mc.ActiveSubscriptions("ns1"); got != 1 {
t.Errorf("ActiveSubscriptions(ns1) after unsubscribe = %d, want 1", got)
}
if got := mc.TotalActiveSubscriptions(); got != 2 {
t.Errorf("TotalActiveSubscriptions() after unsubscribe = %d, want 2", got)
}
}
func TestMetricsCollector_Errors(t *testing.T) {
mc := aether.NewMetricsCollector()
mc.RecordPublishError("ns1")
mc.RecordPublishError("ns1")
mc.RecordSubscribeError("ns1")
mc.RecordDroppedEvent("ns1")
mc.RecordDroppedEvent("ns1")
mc.RecordDroppedEvent("ns1")
if got := mc.PublishErrors("ns1"); got != 2 {
t.Errorf("PublishErrors(ns1) = %d, want 2", got)
}
if got := mc.SubscribeErrors("ns1"); got != 1 {
t.Errorf("SubscribeErrors(ns1) = %d, want 1", got)
}
if got := mc.DroppedEvents("ns1"); got != 3 {
t.Errorf("DroppedEvents(ns1) = %d, want 3", got)
}
}
func TestMetricsCollector_Namespaces(t *testing.T) {
mc := aether.NewMetricsCollector()
mc.RecordPublish("ns1")
mc.RecordReceive("ns2")
mc.RecordSubscribe("ns3")
namespaces := mc.Namespaces()
if len(namespaces) != 3 {
t.Errorf("Namespaces() length = %d, want 3", len(namespaces))
}
nsMap := make(map[string]bool)
for _, ns := range namespaces {
nsMap[ns] = true
}
for _, expected := range []string{"ns1", "ns2", "ns3"} {
if !nsMap[expected] {
t.Errorf("Namespaces() missing %q", expected)
}
}
}
func TestMetricsCollector_Reset(t *testing.T) {
mc := aether.NewMetricsCollector()
mc.RecordPublish("ns1")
mc.RecordReceive("ns1")
mc.RecordSubscribe("ns1")
mc.Reset()
if got := mc.EventsPublished("ns1"); got != 0 {
t.Errorf("EventsPublished() after reset = %d, want 0", got)
}
if got := len(mc.Namespaces()); got != 0 {
t.Errorf("Namespaces() after reset = %d, want 0", got)
}
}
func TestMetricsCollector_ConcurrentAccess(t *testing.T) {
mc := aether.NewMetricsCollector()
const goroutines = 10
const iterations = 100
var wg sync.WaitGroup
wg.Add(goroutines)
for i := 0; i < goroutines; i++ {
go func() {
defer wg.Done()
for j := 0; j < iterations; j++ {
mc.RecordPublish("concurrent-ns")
mc.RecordReceive("concurrent-ns")
mc.RecordSubscribe("concurrent-ns")
mc.RecordUnsubscribe("concurrent-ns")
mc.RecordPublishError("concurrent-ns")
mc.RecordSubscribeError("concurrent-ns")
mc.RecordDroppedEvent("concurrent-ns")
}
}()
}
wg.Wait()
expected := int64(goroutines * iterations)
if got := mc.EventsPublished("concurrent-ns"); got != expected {
t.Errorf("EventsPublished() = %d, want %d", got, expected)
}
if got := mc.EventsReceived("concurrent-ns"); got != expected {
t.Errorf("EventsReceived() = %d, want %d", got, expected)
}
if got := mc.ActiveSubscriptions("concurrent-ns"); got != 0 {
t.Errorf("ActiveSubscriptions() = %d, want 0 (subscribed and unsubscribed same amount)", got)
}
if got := mc.PublishErrors("concurrent-ns"); got != expected {
t.Errorf("PublishErrors() = %d, want %d", got, expected)
}
if got := mc.SubscribeErrors("concurrent-ns"); got != expected {
t.Errorf("SubscribeErrors() = %d, want %d", got, expected)
}
if got := mc.DroppedEvents("concurrent-ns"); got != expected {
t.Errorf("DroppedEvents() = %d, want %d", got, expected)
}
}
func TestEventBus_Metrics(t *testing.T) {
eb := aether.NewEventBus()
defer eb.Stop()
metrics := eb.Metrics()
if metrics == nil {
t.Fatal("Metrics() returned nil")
}
// Subscribe and verify metrics
ch := eb.Subscribe("test-ns")
if got := metrics.ActiveSubscriptions("test-ns"); got != 1 {
t.Errorf("ActiveSubscriptions() after subscribe = %d, want 1", got)
}
// Publish and verify metrics
event := &aether.Event{
ID: "test-1",
EventType: "TestEvent",
ActorID: "actor-1",
Version: 1,
}
eb.Publish("test-ns", event)
// Wait for event delivery
select {
case <-ch:
case <-time.After(100 * time.Millisecond):
t.Fatal("timeout waiting for event")
}
if got := metrics.EventsPublished("test-ns"); got != 1 {
t.Errorf("EventsPublished() after publish = %d, want 1", got)
}
if got := metrics.EventsReceived("test-ns"); got != 1 {
t.Errorf("EventsReceived() after publish = %d, want 1", got)
}
// Unsubscribe and verify metrics
eb.Unsubscribe("test-ns", ch)
if got := metrics.ActiveSubscriptions("test-ns"); got != 0 {
t.Errorf("ActiveSubscriptions() after unsubscribe = %d, want 0", got)
}
}
func TestEventBus_DroppedEvents(t *testing.T) {
eb := aether.NewEventBus()
defer eb.Stop()
metrics := eb.Metrics()
// Subscribe but don't read from channel
_ = eb.Subscribe("test-ns")
// Fill the channel buffer (default is 100)
for i := 0; i < 100; i++ {
eb.Publish("test-ns", &aether.Event{
// string(rune(i)) yields the code point i, not its decimal digits; the IDs are still unique.
ID: "fill-" + string(rune(i)),
EventType: "FillEvent",
})
}
// Next publish should be dropped
eb.Publish("test-ns", &aether.Event{
ID: "dropped",
EventType: "DroppedEvent",
})
if got := metrics.DroppedEvents("test-ns"); got != 1 {
t.Errorf("DroppedEvents() = %d, want 1", got)
}
}
func TestEventBus_MetricsProvider(t *testing.T) {
eb := aether.NewEventBus()
defer eb.Stop()
// Verify EventBus implements MetricsProvider
var mp aether.MetricsProvider = eb
if mp.Metrics() == nil {
t.Error("EventBus.Metrics() returned nil")
}
}
func TestEventBus_StopClearsSubscriptionMetrics(t *testing.T) {
eb := aether.NewEventBus()
metrics := eb.Metrics()
_ = eb.Subscribe("ns1")
_ = eb.Subscribe("ns1")
_ = eb.Subscribe("ns2")
if got := metrics.TotalActiveSubscriptions(); got != 3 {
t.Errorf("TotalActiveSubscriptions() before stop = %d, want 3", got)
}
eb.Stop()
if got := metrics.TotalActiveSubscriptions(); got != 0 {
t.Errorf("TotalActiveSubscriptions() after stop = %d, want 0", got)
}
}
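
`TestEventBus_DroppedEvents` depends on publishes to a full subscriber channel being dropped rather than blocking. A minimal sketch of that behaviour, assuming the bus uses a non-blocking send (`select` with a `default` branch). The `EventBus` internals are not shown in this diff, so `trySend` is a hypothetical stand-in.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a send and reports whether
// the value was accepted. It never blocks.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// Buffer full and no receiver ready: the value is dropped.
		return false
	}
}

func main() {
	ch := make(chan int, 3) // small buffer; the test above assumes 100
	dropped := 0
	for i := 0; i < 5; i++ {
		if !trySend(ch, i) {
			dropped++
		}
	}
	fmt.Println(len(ch), dropped) // 3 2
}
```

With this pattern a slow subscriber cannot stall the publisher, at the cost of silently losing events, which is why a dropped-events counter is worth exposing.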


@@ -5,31 +5,22 @@ import (
 "encoding/json"
 "fmt"
 "log"
-"strings"
 "sync"
 "github.com/google/uuid"
 "github.com/nats-io/nats.go"
 )
-// NATSEventBus is an EventBus that broadcasts events across all cluster nodes using NATS.
-// Supports wildcard patterns for cross-namespace subscriptions using NATS native wildcards.
-//
-// Security Considerations:
-// Wildcard subscriptions (using "*" or ">") receive events from multiple namespaces.
-// This bypasses namespace isolation at the NATS level. Ensure proper access controls
-// are in place at the application layer before granting wildcard subscription access.
+// NATSEventBus is an EventBus that broadcasts events across all cluster nodes using NATS
 type NATSEventBus struct {
 *EventBus // Embed base EventBus for local subscriptions
 nc *nats.Conn // NATS connection
 subscriptions []*nats.Subscription
-patternSubscribers map[string]int // Track number of subscribers per pattern (includes wildcards)
+namespaceSubscribers map[string]int // Track number of subscribers per namespace
 nodeID string // Unique ID for this node
-streamPrefix string // NATS subject prefix for events
-eventStore interface{} // Optional event store for version cache sync (jetstream.JetStreamEventStore)
 mutex sync.Mutex
 ctx context.Context
 cancel context.CancelFunc
 }
 // eventMessage is the wire format for events sent over NATS
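
For reference, NATS subjects are dot-separated tokens: `*` matches exactly one token, and `>` matches one or more trailing tokens and is only valid as the last token. The sketch below illustrates those rules for subjects of the form `aether.events.<namespace>`. `subjectMatches` is a hypothetical helper written for illustration. It is not part of nats.go or aether, and the NATS server performs this matching itself.

```go
package main

import (
	"fmt"
	"strings"
)

// subjectMatches reports whether subject matches pattern under NATS
// wildcard rules. Illustrative only; it does not validate empty tokens.
func subjectMatches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" must be last and needs at least one subject token to consume.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false // pattern is longer than the subject
		}
		if tok != "*" && tok != s[i] {
			return false // literal token mismatch
		}
	}
	// Without ">", both sides must have the same number of tokens.
	return len(p) == len(s)
}

func main() {
	fmt.Println(subjectMatches("aether.events.*", "aether.events.tenant-a"))      // true
	fmt.Println(subjectMatches("aether.events.*", "aether.events.tenant-a.prod")) // false
	fmt.Println(subjectMatches("aether.events.>", "aether.events.tenant-a.prod")) // true
	fmt.Println(subjectMatches("aether.events.>", "aether.events"))               // false
}
```

This is why a subscription on `aether.events.*` or `aether.events.>` sees events from every namespace: isolation by subject name only holds for exact subscriptions.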
@@ -44,134 +35,69 @@ func NewNATSEventBus(nc *nats.Conn) (*NATSEventBus, error) {
ctx, cancel := context.WithCancel(context.Background()) ctx, cancel := context.WithCancel(context.Background())
neb := &NATSEventBus{ neb := &NATSEventBus{
EventBus: NewEventBus(), EventBus: NewEventBus(),
nc: nc, nc: nc,
nodeID: uuid.New().String(), nodeID: uuid.New().String(),
subscriptions: make([]*nats.Subscription, 0), subscriptions: make([]*nats.Subscription, 0),
patternSubscribers: make(map[string]int), namespaceSubscribers: make(map[string]int),
streamPrefix: "aether", ctx: ctx,
ctx: ctx, cancel: cancel,
cancel: cancel,
} }
return neb, nil return neb, nil
} }
// NewNATSEventBusWithBroadcaster creates a new NATS-backed event bus with JetStreamEventStore integration. // Subscribe creates a local subscription and ensures NATS subscription exists for the namespace
// The event store is used to automatically update version cache when EventStored events are received func (neb *NATSEventBus) Subscribe(namespaceID string) <-chan *Event {
// from other cluster nodes via NATS. This ensures cross-node version consistency.
//
// Example:
//
// eventBus := aether.NewNATSEventBusWithBroadcaster(natsConn, store, "tenant-abc")
// ch := eventBus.SubscribeToEventStored("tenant-*")
// for event := range ch {
// actorID := event.Data["actorId"].(string)
// version := event.Data["version"].(int64)
// store.UpdateVersionCache(actorID, version)
// }
//
// The namespace parameter is used as a prefix for EventStored event filtering.
// If empty, EventStored events from all namespaces will be received (requires wildcard pattern).
func NewNATSEventBusWithBroadcaster(nc *nats.Conn, store interface{}, namespace string) *NATSEventBus {
streamPrefix := "aether"
if namespace != "" {
streamPrefix = fmt.Sprintf("aether.%s", sanitizeSubject(namespace))
}
neb := &NATSEventBus{
EventBus: NewEventBus(),
nc: nc,
nodeID: uuid.New().String(),
subscriptions: make([]*nats.Subscription, 0),
patternSubscribers: make(map[string]int),
streamPrefix: streamPrefix,
eventStore: store,
ctx: context.Background(),
cancel: func() {},
}
return neb
}
// Subscribe creates a local subscription and ensures NATS subscription exists for the pattern.
// Supports NATS subject patterns:
// - "*" matches a single token
// - ">" matches one or more tokens (only at the end)
//
// Security Warning: Wildcard patterns receive events from all matching namespaces,
// bypassing namespace isolation. Only use for trusted system components.
func (neb *NATSEventBus) Subscribe(namespacePattern string) <-chan *Event {
return neb.SubscribeWithFilter(namespacePattern, nil)
}
// SubscribeWithFilter creates a filtered subscription channel for a namespace pattern.
// Events are filtered by the provided SubscriptionFilter before delivery.
// If filter is nil or empty, all events matching the namespace pattern are delivered.
//
// For NATSEventBus:
// - Namespace pattern filtering is applied at the NATS level using native wildcards
// - EventTypes and ActorPattern filters are applied client-side after receiving messages
//
// This allows efficient server-side filtering for namespaces while providing
// flexible client-side filtering for event types and actors.
func (neb *NATSEventBus) SubscribeWithFilter(namespacePattern string, filter *SubscriptionFilter) <-chan *Event {
 neb.mutex.Lock()
 defer neb.mutex.Unlock()
-// Create local subscription first (with filter)
-ch := neb.EventBus.SubscribeWithFilter(namespacePattern, filter)
-// Check if this is the first subscriber for this pattern
-count := neb.patternSubscribers[namespacePattern]
+// Create local subscription first
+ch := neb.EventBus.Subscribe(namespaceID)
+// Check if this is the first subscriber for this namespace
+count := neb.namespaceSubscribers[namespaceID]
 if count == 0 {
 // First subscriber - create NATS subscription
-// NATS natively supports wildcards, so we can use the pattern directly
-subject := fmt.Sprintf("aether.events.%s", namespacePattern)
+subject := fmt.Sprintf("aether.events.%s", namespaceID)
 sub, err := neb.nc.Subscribe(subject, func(msg *nats.Msg) {
-neb.handleNATSEvent(msg, namespacePattern)
+neb.handleNATSEvent(msg)
 })
 if err != nil {
 log.Printf("[NATSEventBus] Failed to subscribe to NATS subject %s: %v", subject, err)
-// Record subscription error
-neb.metrics.RecordSubscribeError(namespacePattern)
 } else {
 neb.subscriptions = append(neb.subscriptions, sub)
-if IsWildcardPattern(namespacePattern) {
-log.Printf("[NATSEventBus] Node %s subscribed to wildcard pattern %s", neb.nodeID, subject)
-} else {
-log.Printf("[NATSEventBus] Node %s subscribed to %s", neb.nodeID, subject)
-}
+log.Printf("[NATSEventBus] Node %s subscribed to %s", neb.nodeID, subject)
 }
 }
-neb.patternSubscribers[namespacePattern] = count + 1
+neb.namespaceSubscribers[namespaceID] = count + 1
 return ch
 }
 // Unsubscribe removes a local subscription and cleans up NATS subscription if no more subscribers
-func (neb *NATSEventBus) Unsubscribe(namespacePattern string, ch <-chan *Event) {
+func (neb *NATSEventBus) Unsubscribe(namespaceID string, ch <-chan *Event) {
 neb.mutex.Lock()
 defer neb.mutex.Unlock()
-neb.EventBus.Unsubscribe(namespacePattern, ch)
-count := neb.patternSubscribers[namespacePattern]
+neb.EventBus.Unsubscribe(namespaceID, ch)
+count := neb.namespaceSubscribers[namespaceID]
 if count > 0 {
 count--
-neb.patternSubscribers[namespacePattern] = count
+neb.namespaceSubscribers[namespaceID] = count
 if count == 0 {
-delete(neb.patternSubscribers, namespacePattern)
-log.Printf("[NATSEventBus] No more subscribers for pattern %s on node %s", namespacePattern, neb.nodeID)
+delete(neb.namespaceSubscribers, namespaceID)
+log.Printf("[NATSEventBus] No more subscribers for namespace %s on node %s", namespaceID, neb.nodeID)
 }
 }
 }
 // handleNATSEvent processes events received from NATS
-func (neb *NATSEventBus) handleNATSEvent(msg *nats.Msg, subscribedPattern string) {
+func (neb *NATSEventBus) handleNATSEvent(msg *nats.Msg) {
 var eventMsg eventMessage
 if err := json.Unmarshal(msg.Data, &eventMsg); err != nil {
 log.Printf("[NATSEventBus] Failed to unmarshal event: %v", err)
@@ -183,44 +109,8 @@ func (neb *NATSEventBus) handleNATSEvent(msg *nats.Msg, subscribedPattern string
 return
 }
-// For wildcard subscriptions, we need to deliver to the EventBus using
-// the subscribed pattern so it reaches the correct wildcard subscriber.
-// For exact subscriptions, use the actual namespace.
-if IsWildcardPattern(subscribedPattern) {
-// Deliver using the pattern - the EventBus will route to wildcard subscribers
-neb.deliverToWildcardSubscribers(subscribedPattern, eventMsg.Event)
-} else {
-// Forward to local EventBus subscribers with actual namespace
-neb.EventBus.Publish(eventMsg.NamespaceID, eventMsg.Event)
-}
-}
-// deliverToWildcardSubscribers delivers an event to subscribers of a specific wildcard pattern
-// Applies filters before delivery.
-func (neb *NATSEventBus) deliverToWildcardSubscribers(pattern string, event *Event) {
-neb.EventBus.mutex.RLock()
-defer neb.EventBus.mutex.RUnlock()
-for _, sub := range neb.EventBus.wildcardSubscribers {
-if sub.pattern == pattern {
-// Apply filter if present
-if sub.filter != nil && !sub.filter.IsEmpty() {
-if !sub.filter.Matches(event) {
-// Event doesn't match filter, skip delivery
-continue
-}
-}
-select {
-case sub.ch <- event:
-// Event delivered from NATS
-neb.metrics.RecordReceive(pattern)
-default:
-// Channel full, skip this subscriber (non-blocking)
-neb.metrics.RecordDroppedEvent(pattern)
-}
-}
-}
+// Forward to local EventBus subscribers
+neb.EventBus.Publish(eventMsg.NamespaceID, eventMsg.Event)
 }
 // Publish publishes an event both locally and to NATS for cross-node broadcasting
@@ -240,13 +130,11 @@ func (neb *NATSEventBus) Publish(namespaceID string, event *Event) {
 data, err := json.Marshal(eventMsg)
 if err != nil {
 log.Printf("[NATSEventBus] Failed to marshal event for NATS: %v", err)
-neb.metrics.RecordPublishError(namespaceID)
 return
 }
 if err := neb.nc.Publish(subject, data); err != nil {
 log.Printf("[NATSEventBus] Failed to publish event to NATS: %v", err)
-neb.metrics.RecordPublishError(namespaceID)
 return
 }
 }
@@ -269,103 +157,3 @@ func (neb *NATSEventBus) Stop() {
 log.Printf("[NATSEventBus] Node %s stopped", neb.nodeID)
 }
// sanitizeSubject sanitizes a string for use in NATS subjects
func sanitizeSubject(s string) string {
s = strings.ReplaceAll(s, " ", "_")
s = strings.ReplaceAll(s, ".", "_")
s = strings.ReplaceAll(s, "*", "_")
s = strings.ReplaceAll(s, ">", "_")
return s
}
// extractActorType extracts the actor type from an actor ID
func extractActorType(actorID string) string {
for i, c := range actorID {
if c == '-' && i > 0 {
return actorID[:i]
}
}
return "unknown"
}
// SubscribeToEventStored creates a subscription to EventStored events for a namespace pattern.
// EventStored events are published by JetStreamEventStore when events are successfully saved.
// This is useful for cross-node event synchronization and version cache consistency.
//
// The returned channel receives EventStored events matching the pattern.
// The EventStored event schema:
// - EventType: "EventStored"
// - ActorID: ID of the actor that the original event was about
// - Version: version of the stored event
// - Data:
// - eventId: (string) ID of the stored event
// - actorId: (string) ID of the actor
// - version: (int64) version of the event
// - timestamp: (int64) Unix timestamp of when the event was stored
//
// The namespacePattern supports NATS wildcards:
// - "*" matches a single token
// - ">" matches one or more tokens (only at the end)
//
// Example:
//
// ch := eventBus.SubscribeToEventStored("*")
// for event := range ch {
// if event.EventType != aether.EventTypeEventStored {
// continue
// }
// actorID := event.Data["actorId"].(string)
// version, _ := event.Data["version"].(int64)
// store.UpdateVersionCache(actorID, version)
// }
//
// Security Warning: Using wildcard patterns like ">" will receive EventStored events
// from all namespaces. Ensure your application handles this appropriately.
func (neb *NATSEventBus) SubscribeToEventStored(namespacePattern string) <-chan *Event {
neb.mutex.Lock()
defer neb.mutex.Unlock()
subject := fmt.Sprintf("%s.%s.%s", neb.streamPrefix, namespacePattern, "events.>")
ch := make(chan *Event, 100)
sub, err := neb.nc.Subscribe(subject, func(msg *nats.Msg) {
var eventMsg eventMessage
if err := json.Unmarshal(msg.Data, &eventMsg); err != nil {
log.Printf("[NATSEventBus] Failed to unmarshal EventStored event: %v", err)
return
}
if eventMsg.NodeID == neb.nodeID {
return
}
if eventMsg.Event.EventType == EventTypeEventStored && neb.eventStore != nil {
actorID, ok := eventMsg.Event.Data["actorId"].(string)
if !ok {
return
}
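// encoding/json decodes JSON numbers into float64 when the target is
// interface{}, so accept float64 as well as int64 here.
var version int64
switch v := eventMsg.Event.Data["version"].(type) {
case int64:
version = v
case float64:
version = int64(v)
default:
return
}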
// Use type assertion to call UpdateVersionCache
if es, ok := neb.eventStore.(interface{ UpdateVersionCache(string, int64) }); ok {
es.UpdateVersionCache(actorID, version)
}
}
neb.EventBus.Publish(eventMsg.NamespaceID, eventMsg.Event)
})
if err != nil {
log.Printf("[NATSEventBus] Failed to subscribe to EventStored: %v", err)
close(ch)
return ch
}
neb.subscriptions = append(neb.subscriptions, sub)
return ch
}


@@ -1,197 +0,0 @@
package aether
import "strings"
// MatchNamespacePattern checks if a namespace matches a pattern.
// Patterns follow NATS subject matching conventions where tokens are separated by dots:
// - "*" matches exactly one token (any sequence without ".")
// - ">" matches one or more tokens (only valid at the end of a pattern)
// - Exact strings match exactly
//
// Examples:
// - "tenant-a" matches "tenant-a" (exact match)
// - "*" matches any single-token namespace like "tenant-a" or "production"
// - ">" matches any namespace with one or more tokens
// - "prod.*" matches "prod.tenant", "prod.orders" (but not "prod.tenant.orders")
// - "prod.>" matches "prod.tenant", "prod.tenant.orders", "prod.a.b.c"
// - "*.tenant.*" matches "prod.tenant.orders", "staging.tenant.events"
//
// Security Considerations:
// Wildcard subscriptions provide cross-namespace visibility. Use with caution:
// - "*" or ">" patterns receive events from ALL matching namespaces
// - This bypasses namespace isolation for the subscriber
// - Only grant wildcard subscription access to trusted system components
// - Consider auditing wildcard subscription usage
// - For multi-tenant systems, wildcard access should be restricted to admin/ops
// - Use the most specific pattern possible to minimize exposure
func MatchNamespacePattern(pattern, namespace string) bool {
// Empty pattern matches nothing
if pattern == "" {
return false
}
// ">" used alone matches any non-empty namespace
if pattern == ">" {
return namespace != ""
}
patternTokens := strings.Split(pattern, ".")
namespaceTokens := strings.Split(namespace, ".")
return matchTokens(patternTokens, namespaceTokens)
}
// matchTokens recursively matches pattern tokens against namespace tokens
func matchTokens(patternTokens, namespaceTokens []string) bool {
// If pattern is exhausted, namespace must also be exhausted
if len(patternTokens) == 0 {
return len(namespaceTokens) == 0
}
patternToken := patternTokens[0]
// ">" matches one or more remaining tokens (must be last pattern token)
if patternToken == ">" {
// ">" requires at least one token to match
return len(namespaceTokens) >= 1
}
// If namespace is exhausted but pattern has more tokens, no match
if len(namespaceTokens) == 0 {
return false
}
namespaceToken := namespaceTokens[0]
// "*" matches exactly one token
if patternToken == "*" {
return matchTokens(patternTokens[1:], namespaceTokens[1:])
}
// Exact match required
if patternToken == namespaceToken {
return matchTokens(patternTokens[1:], namespaceTokens[1:])
}
return false
}
// IsWildcardPattern returns true if the pattern contains wildcards (* or >).
// Wildcard patterns can match multiple namespaces and bypass namespace isolation.
func IsWildcardPattern(pattern string) bool {
return strings.Contains(pattern, "*") || strings.Contains(pattern, ">")
}
// SubscriptionFilter defines optional filters for event subscriptions.
// All configured filters are combined with AND logic - an event must match
// all specified criteria to be delivered to the subscriber.
//
// Filter Processing:
// - EventTypes: Event must have an EventType matching at least one in the list (OR within types)
// - ActorPattern: Event's ActorID must match the pattern (supports * and > wildcards)
//
// Filtering is applied client-side in the EventBus. For NATSEventBus, namespace-level
// filtering uses NATS subject patterns, while EventTypes and ActorPattern filtering
// happens after message receipt.
type SubscriptionFilter struct {
// EventTypes filters events by type. Empty slice means all event types.
// If specified, only events with an EventType in this list are delivered.
// Example: []string{"OrderPlaced", "OrderShipped"} receives only those event types.
EventTypes []string
// ActorPattern filters events by actor ID pattern. Empty string means all actors.
// Supports NATS-style wildcards:
// - "*" matches a single token (e.g., "order-*" matches "order-123", "order-456")
// - ">" matches one or more tokens (e.g., "order.>" matches "order.us.123", "order.eu.456")
// Example: "order-*" receives events only for actors starting with "order-"
ActorPattern string
}
// IsEmpty returns true if no filters are configured.
func (f *SubscriptionFilter) IsEmpty() bool {
return len(f.EventTypes) == 0 && f.ActorPattern == ""
}
// Matches returns true if the event matches all configured filters.
// An empty filter matches all events.
func (f *SubscriptionFilter) Matches(event *Event) bool {
if event == nil {
return false
}
// Check event type filter
if len(f.EventTypes) > 0 {
typeMatch := false
for _, et := range f.EventTypes {
if event.EventType == et {
typeMatch = true
break
}
}
if !typeMatch {
return false
}
}
// Check actor pattern filter
if f.ActorPattern != "" {
if !MatchActorPattern(f.ActorPattern, event.ActorID) {
return false
}
}
return true
}
// MatchActorPattern checks if an actor ID matches a pattern.
// Uses the same matching logic as MatchNamespacePattern for consistency.
//
// Patterns:
// - "*" matches a single token (e.g., "order-*" matches "order-123")
// - ">" matches one or more tokens (e.g., "order.>" matches "order.us.east")
// - Exact strings match exactly (e.g., "order-123" matches only "order-123")
//
// Note: For simple prefix matching without dots (e.g., "order-*" matching "order-123"),
// this uses simplified matching where "*" matches any remaining characters in a token.
func MatchActorPattern(pattern, actorID string) bool {
// Empty pattern matches nothing
if pattern == "" {
return false
}
// Empty actor ID matches nothing
if actorID == "" {
return false
}
// If pattern contains dots, use token-based matching (same as namespace)
if strings.Contains(pattern, ".") || strings.Contains(actorID, ".") {
return MatchNamespacePattern(pattern, actorID)
}
// Simple matching for non-tokenized patterns
// ">" matches any non-empty actor ID
if pattern == ">" {
return true
}
// "*" matches any single-token actor ID (no dots)
if pattern == "*" {
return true
}
// Check for suffix wildcard (e.g., "order-*")
if strings.HasSuffix(pattern, "*") {
prefix := strings.TrimSuffix(pattern, "*")
return strings.HasPrefix(actorID, prefix)
}
// Check for suffix multi-match (e.g., "order->")
if strings.HasSuffix(pattern, ">") {
prefix := strings.TrimSuffix(pattern, ">")
return strings.HasPrefix(actorID, prefix)
}
// Exact match
return pattern == actorID
}
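The dot-token semantics documented for `MatchNamespacePattern` can also be written without recursion. The following is a minimal sketch under that reading of the rules (`*` is exactly one token, `>` is one or more trailing tokens); `matchSubject` is a hypothetical name, it rejects empty inputs outright, and it is not the function removed in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubject is a hypothetical, iterative take on NATS-style matching:
// "*" matches exactly one dot-separated token, ">" matches one or more
// remaining tokens, and any other token must match literally.
func matchSubject(pattern, subject string) bool {
	if pattern == "" || subject == "" {
		return false
	}
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" needs at least one subject token left to consume.
			return len(s) > i
		}
		if i >= len(s) {
			return false // pattern is longer than the subject
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	// Without ">", both sides must have the same number of tokens.
	return len(p) == len(s)
}

func main() {
	fmt.Println(matchSubject("prod.*", "prod.tenant"))        // true
	fmt.Println(matchSubject("prod.*", "prod.tenant.orders")) // false
	fmt.Println(matchSubject("prod.>", "prod.tenant.orders")) // true
	fmt.Println(matchSubject("prod.>", "prod"))               // false
}
```

The four calls in `main` mirror rows of the table test for `MatchNamespacePattern`, so the sketch can be checked against the same expectations.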


@@ -1,242 +0,0 @@
package aether
import "testing"
func TestMatchNamespacePattern(t *testing.T) {
tests := []struct {
name string
pattern string
namespace string
expected bool
}{
// Exact matches
{"exact match", "tenant-a", "tenant-a", true},
{"exact mismatch", "tenant-a", "tenant-b", false},
{"exact match with dots", "prod.tenant.a", "prod.tenant.a", true},
{"exact mismatch with dots", "prod.tenant.a", "prod.tenant.b", false},
// Empty cases
{"empty pattern", "", "tenant-a", false},
{"empty namespace exact", "tenant-a", "", false},
{"empty namespace catch-all", ">", "", false},
{"both empty", "", "", false},
// Single wildcard (*) - matches one token (NATS semantics: tokens are dot-separated)
{"star matches any single token", "*", "tenant-a", true},
{"star matches any single token 2", "*", "anything", true},
{"star does not match multi-token", "*", "prod.tenant", false},
{"prefix with star", "prod.*", "prod.tenant", true},
{"prefix with star 2", "prod.*", "prod.orders", true},
{"prefix with star no match extra tokens", "prod.*", "prod.tenant.orders", false},
{"prefix with star no match wrong prefix", "prod.*", "staging.tenant", false},
{"middle wildcard", "prod.*.orders", "prod.tenant.orders", true},
{"middle wildcard no match", "prod.*.orders", "prod.tenant.events", false},
{"multiple stars", "*.tenant.*", "prod.tenant.orders", true},
{"multiple stars 2", "*.*.orders", "prod.tenant.orders", true},
{"multiple stars no match", "*.*.orders", "prod.orders", false},
// Multi-token wildcard (>) - matches one or more tokens
{"greater matches one", ">", "tenant", true},
{"greater matches multi", ">", "prod.tenant.orders", true},
{"prefix greater", "prod.>", "prod.tenant", true},
{"prefix greater multi", "prod.>", "prod.tenant.orders.items", true},
{"prefix greater no match different prefix", "prod.>", "staging.tenant", false},
{"prefix greater requires at least one", "prod.>", "prod", false},
{"deep prefix greater", "prod.tenant.>", "prod.tenant.orders", true},
// Combined wildcards
{"star then greater", "*.>", "prod.tenant", true},
{"star then greater multi", "*.>", "prod.tenant.orders", true},
{"star then greater no match single", "*.>", "prod", false},
// Edge cases
{"trailing dot in pattern", "tenant.", "tenant.", true},
{"just dots", "..", "..", true},
{"star at end", "prod.tenant.*", "prod.tenant.a", true},
{"star at end no match", "prod.tenant.*", "prod.other.a", false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := MatchNamespacePattern(tt.pattern, tt.namespace)
if result != tt.expected {
t.Errorf("MatchNamespacePattern(%q, %q) = %v, want %v",
tt.pattern, tt.namespace, result, tt.expected)
}
})
}
}
func TestIsWildcardPattern(t *testing.T) {
tests := []struct {
pattern string
expected bool
}{
{"tenant-a", false},
{"prod.tenant.orders", false},
{"*", true},
{"prod.*", true},
{"*.orders", true},
{">", true},
{"prod.>", true},
{"*.>", true},
{"prod.*.orders", true},
}
for _, tt := range tests {
t.Run(tt.pattern, func(t *testing.T) {
result := IsWildcardPattern(tt.pattern)
if result != tt.expected {
t.Errorf("IsWildcardPattern(%q) = %v, want %v",
tt.pattern, result, tt.expected)
}
})
}
}
func BenchmarkMatchNamespacePattern(b *testing.B) {
benchmarks := []struct {
name string
pattern string
namespace string
}{
{"exact", "tenant-a", "tenant-a"},
{"star", "*", "tenant-a"},
{"prefix_star", "prod.*", "prod.tenant"},
{"greater", ">", "prod.tenant.orders"},
{"complex", "prod.*.>", "prod.tenant.orders.items"},
}
for _, bm := range benchmarks {
b.Run(bm.name, func(b *testing.B) {
for i := 0; i < b.N; i++ {
MatchNamespacePattern(bm.pattern, bm.namespace)
}
})
}
}
func TestMatchActorPattern(t *testing.T) {
tests := []struct {
name string
pattern string
actorID string
expected bool
}{
// Empty cases
{"empty pattern", "", "actor-123", false},
{"empty actorID", "actor-*", "", false},
{"both empty", "", "", false},
// Exact matches (no dots)
{"exact match", "actor-123", "actor-123", true},
{"exact mismatch", "actor-123", "actor-456", false},
// Suffix wildcard with * (simple, no dots)
{"prefix with star", "order-*", "order-123", true},
{"prefix with star 2", "order-*", "order-456-xyz", true},
{"prefix with star mismatch", "order-*", "user-123", false},
{"star alone", "*", "anything", true},
// Suffix wildcard with > (simple, no dots)
{"prefix with greater", "order->", "order-123", true},
{"greater alone", ">", "anything", true},
// Dot-separated actor IDs (uses MatchNamespacePattern)
{"dotted exact match", "order.us.123", "order.us.123", true},
{"dotted exact mismatch", "order.us.123", "order.eu.123", false},
{"dotted star", "order.*", "order.123", true},
{"dotted star deep", "order.*.*", "order.us.123", true},
{"dotted greater", "order.>", "order.us.123.456", true},
{"dotted star mismatch depth", "order.*", "order.us.123", false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := MatchActorPattern(tt.pattern, tt.actorID)
if result != tt.expected {
t.Errorf("MatchActorPattern(%q, %q) = %v, want %v",
tt.pattern, tt.actorID, result, tt.expected)
}
})
}
}
func TestSubscriptionFilter_IsEmpty(t *testing.T) {
tests := []struct {
name string
filter *SubscriptionFilter
expected bool
}{
{"nil fields", &SubscriptionFilter{}, true},
{"empty slice", &SubscriptionFilter{EventTypes: []string{}}, true},
{"has event types", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}}, false},
{"has actor pattern", &SubscriptionFilter{ActorPattern: "order-*"}, false},
{"has both", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := tt.filter.IsEmpty()
if result != tt.expected {
t.Errorf("SubscriptionFilter.IsEmpty() = %v, want %v", result, tt.expected)
}
})
}
}
func TestSubscriptionFilter_Matches(t *testing.T) {
tests := []struct {
name string
filter *SubscriptionFilter
event *Event
expected bool
}{
// Nil event
{"nil event", &SubscriptionFilter{}, nil, false},
// Empty filter matches all
{"empty filter", &SubscriptionFilter{}, &Event{EventType: "Test", ActorID: "actor-1"}, true},
// Event type filtering
{"event type match", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}},
&Event{EventType: "OrderPlaced", ActorID: "order-1"}, true},
{"event type mismatch", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}},
&Event{EventType: "OrderShipped", ActorID: "order-1"}, false},
{"event type multiple match first", &SubscriptionFilter{EventTypes: []string{"OrderPlaced", "OrderShipped"}},
&Event{EventType: "OrderPlaced", ActorID: "order-1"}, true},
{"event type multiple match second", &SubscriptionFilter{EventTypes: []string{"OrderPlaced", "OrderShipped"}},
&Event{EventType: "OrderShipped", ActorID: "order-1"}, true},
{"event type multiple no match", &SubscriptionFilter{EventTypes: []string{"OrderPlaced", "OrderShipped"}},
&Event{EventType: "OrderCancelled", ActorID: "order-1"}, false},
// Actor pattern filtering
{"actor pattern exact match", &SubscriptionFilter{ActorPattern: "order-123"},
&Event{EventType: "Test", ActorID: "order-123"}, true},
{"actor pattern exact mismatch", &SubscriptionFilter{ActorPattern: "order-123"},
&Event{EventType: "Test", ActorID: "order-456"}, false},
{"actor pattern wildcard match", &SubscriptionFilter{ActorPattern: "order-*"},
&Event{EventType: "Test", ActorID: "order-123"}, true},
{"actor pattern wildcard mismatch", &SubscriptionFilter{ActorPattern: "order-*"},
&Event{EventType: "Test", ActorID: "user-123"}, false},
// Combined filters (AND logic)
{"combined both match", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
&Event{EventType: "OrderPlaced", ActorID: "order-123"}, true},
{"combined event matches actor does not", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
&Event{EventType: "OrderPlaced", ActorID: "user-123"}, false},
{"combined actor matches event does not", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
&Event{EventType: "OrderShipped", ActorID: "order-123"}, false},
{"combined neither matches", &SubscriptionFilter{EventTypes: []string{"OrderPlaced"}, ActorPattern: "order-*"},
&Event{EventType: "OrderShipped", ActorID: "user-123"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := tt.filter.Matches(tt.event)
if result != tt.expected {
t.Errorf("SubscriptionFilter.Matches() = %v, want %v", result, tt.expected)
}
})
}
}


@@ -1,6 +0,0 @@
{
"$schema": "https://docs.renovatebot.com/renovate-schema.json",
"extends": [
"config:recommended"
]
}


@@ -1,46 +0,0 @@
package store
import (
"testing"
"time"
)
func TestDefaultJetStreamConfig(t *testing.T) {
config := DefaultJetStreamConfig()
if config.StreamRetention != DefaultStreamRetention {
t.Errorf("expected StreamRetention=%v, got %v", DefaultStreamRetention, config.StreamRetention)
}
if config.ReplicaCount != DefaultReplicaCount {
t.Errorf("expected ReplicaCount=%d, got %d", DefaultReplicaCount, config.ReplicaCount)
}
}
func TestJetStreamConfigDefaults(t *testing.T) {
t.Run("default stream retention is 1 year", func(t *testing.T) {
expected := 365 * 24 * time.Hour
if DefaultStreamRetention != expected {
t.Errorf("expected DefaultStreamRetention=%v, got %v", expected, DefaultStreamRetention)
}
})
t.Run("default replica count is 1", func(t *testing.T) {
if DefaultReplicaCount != 1 {
t.Errorf("expected DefaultReplicaCount=1, got %d", DefaultReplicaCount)
}
})
}
func TestJetStreamConfigCustomValues(t *testing.T) {
config := JetStreamConfig{
StreamRetention: 30 * 24 * time.Hour, // 30 days
ReplicaCount: 3,
}
if config.StreamRetention != 30*24*time.Hour {
t.Errorf("expected StreamRetention=30 days, got %v", config.StreamRetention)
}
if config.ReplicaCount != 3 {
t.Errorf("expected ReplicaCount=3, got %d", config.ReplicaCount)
}
}


@@ -1,215 +0,0 @@
package store
import (
"testing"
"time"
"git.flowmade.one/flowmade-one/aether"
)
// TestEventImmutability_MemoryStore verifies that events cannot be modified after persistence
// in the in-memory event store. This demonstrates the append-only nature of event sourcing.
func TestEventImmutability_MemoryStore(t *testing.T) {
store := NewInMemoryEventStore()
actorID := "test-actor-123"
// Create and save an event
originalEvent := &aether.Event{
ID: "evt-immutable-1",
EventType: "TestEvent",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{
"value": "original",
},
Timestamp: time.Now(),
}
err := store.SaveEvent(originalEvent)
if err != nil {
t.Fatalf("SaveEvent failed: %v", err)
}
// Retrieve the event from the store
events, err := store.GetEvents(actorID, 0)
if err != nil {
t.Fatalf("GetEvents failed: %v", err)
}
if len(events) == 0 {
t.Fatal("expected 1 event, got 0")
}
retrievedEvent := events[0]
// Verify the stored event has the correct values
if retrievedEvent.Data["value"] != "original" {
t.Errorf("Data value mismatch: got %v, want %v", retrievedEvent.Data["value"], "original")
}
if retrievedEvent.EventType != "TestEvent" {
t.Errorf("EventType mismatch: got %q, want %q", retrievedEvent.EventType, "TestEvent")
}
// Verify ID is correct
if retrievedEvent.ID != "evt-immutable-1" {
t.Errorf("Event ID mismatch: got %q, want %q", retrievedEvent.ID, "evt-immutable-1")
}
}
// TestEventImmutability_NoUpdateMethod verifies that the EventStore interface
// has only append and read methods - no Update or Delete methods.
func TestEventImmutability_NoUpdateMethod(t *testing.T) {
// This test documents that the EventStore interface is append-only.
// The interface intentionally provides:
// - SaveEvent: append only
// - GetEvents: read only
// - GetLatestVersion: read only
//
// To verify this, we demonstrate that any attempt to call non-existent
// update/delete methods would be caught at compile time (not runtime).
// This is enforced by the interface definition in event.go which does
// not include Update, Delete, or Modify methods.
store := NewInMemoryEventStore()
// Compile-time check: these would not compile if we tried them:
// store.Update(event) // compile error: no such method
// store.Delete(eventID) // compile error: no such method
// store.Modify(eventID, newData) // compile error: no such method
// Only these methods exist:
var eventStore aether.EventStore = store
if eventStore == nil {
t.Fatal("eventStore is nil")
}
// If we got here, the compile-time checks passed
t.Log("EventStore interface enforces append-only semantics by design")
}
// TestEventImmutability_VersionOnlyGoesUp verifies that versions are monotonically
// increasing and attempting to save with a non-increasing version fails.
func TestEventImmutability_VersionOnlyGoesUp(t *testing.T) {
store := NewInMemoryEventStore()
actorID := "actor-version-check"
// Save first event with version 1
event1 := &aether.Event{
ID: "evt-v1",
EventType: "Event1",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err := store.SaveEvent(event1)
if err != nil {
t.Fatalf("SaveEvent(v1) failed: %v", err)
}
// Try to save with same version - should fail
event2Same := &aether.Event{
ID: "evt-v1-again",
EventType: "Event2",
ActorID: actorID,
Version: 1, // Same version
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err = store.SaveEvent(event2Same)
if err == nil {
t.Error("expected SaveEvent(same version) to fail, but it succeeded")
}
// Try to save with lower version - should fail
event3Lower := &aether.Event{
ID: "evt-v0",
EventType: "Event3",
ActorID: actorID,
Version: 0, // Lower version
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err = store.SaveEvent(event3Lower)
if err == nil {
t.Error("expected SaveEvent(lower version) to fail, but it succeeded")
}
// Save with next version - should succeed
event4Next := &aether.Event{
ID: "evt-v2",
EventType: "Event4",
ActorID: actorID,
Version: 2,
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err = store.SaveEvent(event4Next)
if err != nil {
t.Fatalf("SaveEvent(v2) failed: %v", err)
}
// Verify we have exactly 2 events
events, err := store.GetEvents(actorID, 0)
if err != nil {
t.Fatalf("GetEvents failed: %v", err)
}
if len(events) != 2 {
t.Errorf("expected 2 events, got %d", len(events))
}
}
// TestEventImmutability_EventCannotBeDeleted verifies that there is no way to delete
// events from the store through the EventStore interface.
func TestEventImmutability_EventCannotBeDeleted(t *testing.T) {
store := NewInMemoryEventStore()
actorID := "actor-nodelete"
// Save an event
event := &aether.Event{
ID: "evt-nodelete",
EventType: "ImportantEvent",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{"critical": true},
Timestamp: time.Now(),
}
err := store.SaveEvent(event)
if err != nil {
t.Fatalf("SaveEvent failed: %v", err)
}
// Retrieve it
events1, err := store.GetEvents(actorID, 0)
if err != nil {
t.Fatalf("GetEvents (1) failed: %v", err)
}
if len(events1) != 1 {
t.Fatal("expected 1 event after save")
}
// Try to delete through interface - this method doesn't exist
// store.Delete("evt-nodelete") // compile error: no such method
// store.DeleteByActorID(actorID) // compile error: no such method
// Verify the event is still there (we can't delete it)
events2, err := store.GetEvents(actorID, 0)
if err != nil {
t.Fatalf("GetEvents (2) failed: %v", err)
}
if len(events2) != 1 {
t.Errorf("expected 1 event (should not be deletable), got %d", len(events2))
}
if events2[0].ID != "evt-nodelete" {
t.Errorf("event ID changed: got %q, want %q", events2[0].ID, "evt-nodelete")
}
}
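The version rule exercised by `TestEventImmutability_VersionOnlyGoesUp` can be sketched as a small append-only log. This is an illustration only: `appendOnlyLog`, `event`, and `errVersionConflict` are hypothetical names, and the real `InMemoryEventStore` may be implemented differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errVersionConflict is a hypothetical error for non-increasing versions.
var errVersionConflict = errors.New("version must be greater than the latest stored version")

// event is a trimmed-down stand-in for aether.Event.
type event struct {
	ID      string
	ActorID string
	Version int64
}

// appendOnlyLog exposes only save and get: there is no update or delete.
type appendOnlyLog struct {
	mu     sync.RWMutex
	events map[string][]event
}

func newAppendOnlyLog() *appendOnlyLog {
	return &appendOnlyLog{events: make(map[string][]event)}
}

// save appends e only if its version is strictly greater than the actor's
// latest stored version (0 when nothing is stored yet).
func (l *appendOnlyLog) save(e event) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	stored := l.events[e.ActorID]
	var latest int64
	if n := len(stored); n > 0 {
		latest = stored[n-1].Version
	}
	if e.Version <= latest {
		return errVersionConflict
	}
	l.events[e.ActorID] = append(stored, e)
	return nil
}

// get returns a copy so callers cannot mutate the stored slice.
func (l *appendOnlyLog) get(actorID string) []event {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return append([]event(nil), l.events[actorID]...)
}

func main() {
	log := newAppendOnlyLog()
	for _, v := range []int64{1, 1, 0, 2} {
		err := log.save(event{ID: fmt.Sprintf("evt-v%d", v), ActorID: "actor", Version: v})
		fmt.Printf("save v%d: %v\n", v, err)
	}
	fmt.Println(len(log.get("actor"))) // prints 2
}
```

Returning a copy from `get` is one way to keep stored events from being modified through a retrieved slice, which the interface-level tests above do not check directly.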


@@ -1,431 +0,0 @@
//go:build integration
package store
import (
"context"
"log"
"os"
"testing"
"time"
"git.flowmade.one/flowmade-one/aether"
"github.com/google/uuid"
"github.com/nats-io/nats-server/v2/server"
"github.com/nats-io/nats.go"
)
func setupNatsServer() (*server.Server, *nats.Conn, func()) {
opts := &server.Options{
Port: -1,
JetStream: true,
StoreDir: "/tmp/nats-test-" + time.Now().Format("20060102150405"),
}
s, err := server.NewServer(opts)
if err != nil {
log.Fatal("Failed to create NATS server:", err)
}
go s.Start()
if !s.ReadyForConnections(4 * time.Second) {
log.Fatal("NATS server failed to start")
}
nc, err := nats.Connect(s.ClientURL())
if err != nil {
s.Shutdown()
log.Fatal("Failed to connect to NATS:", err)
}
return s, nc, func() {
nc.Close()
s.Shutdown()
os.RemoveAll(opts.StoreDir)
}
}
func TestUpdateVersionCache(t *testing.T) {
_, nc, cleanup := setupNatsServer()
defer cleanup()
ctx := context.Background()
store, err := NewJetStreamEventStore(nc, "test_update_cache")
if err != nil {
t.Fatalf("Failed to create store: %v", err)
}
defer store.Close(ctx)
actorID := "test-actor-1"
tests := []struct {
name string
cachedVersion int64
newVersion int64
expectUpdate bool
expectVersion int64
}{
{
name: "update when new version is greater",
cachedVersion: 5,
newVersion: 10,
expectUpdate: true,
expectVersion: 10,
},
{
name: "do not update when new version is equal",
cachedVersion: 5,
newVersion: 5,
expectUpdate: false,
expectVersion: 5,
},
{
name: "do not update when new version is less",
cachedVersion: 10,
newVersion: 5,
expectUpdate: false,
expectVersion: 10,
},
{
name: "update when no cached version exists",
cachedVersion: 0,
newVersion: 1,
expectUpdate: true,
expectVersion: 1,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Set up cached version
store.versions = make(map[string]int64)
store.versions[actorID] = tt.cachedVersion
// Call UpdateVersionCache
store.UpdateVersionCache(actorID, tt.newVersion)
// Verify result
if tt.expectUpdate {
if version, ok := store.versions[actorID]; !ok {
t.Error("Expected version to be updated but it wasn't cached")
} else if version != tt.expectVersion {
t.Errorf("Expected version %d, got %d", tt.expectVersion, version)
}
} else {
if version, ok := store.versions[actorID]; !ok {
t.Error("Expected version to remain cached")
} else if version != tt.expectVersion {
t.Errorf("Expected version to remain %d, got %d", tt.expectVersion, version)
}
}
})
}
}
func TestUpdateVersionCache_Concurrent(t *testing.T) {
_, nc, cleanup := setupNatsServer()
defer cleanup()
ctx := context.Background()
store, err := NewJetStreamEventStore(nc, "test_update_cache_concurrent")
if err != nil {
t.Fatalf("Failed to create store: %v", err)
}
defer store.Close(ctx)
actorID := "concurrent-actor"
store.versions[actorID] = 1
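const numGoroutines = 50
const maxVersion = 100
// Buffered so that no goroutine blocks when signalling completion.
done := make(chan struct{}, numGoroutines)
for i := 0; i < numGoroutines; i++ {
version := int64(1 + (i % maxVersion))
go func(v int64) {
store.UpdateVersionCache(actorID, v)
done <- struct{}{}
}(version)
}
// Wait for every goroutine to finish instead of sleeping for a fixed time.
for i := 0; i < numGoroutines; i++ {
<-done
}
// Read through the locked accessor rather than the map directly.
finalVersion, ok := store.GetCachedVersion(actorID)
if !ok {
t.Fatal("Expected version to be cached")
}
if finalVersion > maxVersion {
t.Errorf("Expected version to be at most %d, got %d", maxVersion, finalVersion)
}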
}
func TestSubscribeToEventStored(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test")
}
_, nc, cleanup := setupNatsServer()
defer cleanup()
ctx := context.Background()
store, err := NewJetStreamEventStore(nc, "test_subscribe_event_stored")
if err != nil {
t.Fatalf("Failed to create store: %v", err)
}
defer store.Close(ctx)
eventBusWithStore := NewNATSEventBusWithBroadcaster(nc, store, "")
if eventBusWithStore == nil {
t.Fatalf("Failed to create event bus with broadcaster")
}
defer eventBusWithStore.Stop()
ch := eventBusWithStore.SubscribeToEventStored("*")
if ch == nil {
t.Fatal("SubscribeToEventStored returned nil channel")
}
actorID := "subscribe-test-actor"
event := &aether.Event{
ID: uuid.New().String(),
EventType: "TestEvent",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{"key": "value"},
Timestamp: time.Now(),
}
eventBusWithStore.Publish("", event)
select {
case receivedEvent := <-ch:
if receivedEvent.EventType != aether.EventTypeEventStored {
t.Errorf("Expected EventTypeEventStored, got %s", receivedEvent.EventType)
}
if receivedEvent.ActorID != actorID {
t.Errorf("Expected actorID %s, got %s", actorID, receivedEvent.ActorID)
}
data, ok := receivedEvent.Data["actorId"].(string)
if !ok || data != actorID {
t.Errorf("Expected actorId in data to be %s", actorID)
}
case <-time.After(2 * time.Second):
t.Fatal("Timeout waiting for EventStored event")
}
}
func TestCrossNodeBroadcasting_SingleNode(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test")
}
_, nc, cleanup := setupNatsServer()
defer cleanup()
ctx := context.Background()
store, err := NewJetStreamEventStore(nc, "test_single_node_broadcast")
if err != nil {
t.Fatalf("Failed to create store: %v", err)
}
defer store.Close(ctx)
eventBus := NewNATSEventBusWithBroadcaster(nc, store, "")
defer eventBus.Stop()
actorID := "broadcast-test-actor-1"
localCh := eventBus.Subscribe("")
event := &aether.Event{
ID: uuid.New().String(),
EventType: "OrderPlaced",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{"total": 99.99},
Timestamp: time.Now(),
}
eventBus.Publish("", event)
select {
case receivedEvent := <-localCh:
if receivedEvent.EventType != "OrderPlaced" {
t.Errorf("Expected OrderPlaced, got %s", receivedEvent.EventType)
}
if receivedEvent.ActorID != actorID {
t.Errorf("Expected actorID %s, got %s", actorID, receivedEvent.ActorID)
}
case <-time.After(2 * time.Second):
t.Fatal("Timeout waiting for broadcast event")
}
}
func TestCrossNodeBroadcasting_MultiNode(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test")
}
_, nc1, cleanup1 := setupNatsServer()
defer cleanup1()
_, nc2, cleanup2 := setupNatsServer()
defer cleanup2()
ctx := context.Background()
store1, err := NewJetStreamEventStore(nc1, "test_multi_node_1")
if err != nil {
t.Fatalf("Failed to create store 1: %v", err)
}
defer store1.Close(ctx)
store2, err := NewJetStreamEventStore(nc2, "test_multi_node_2")
if err != nil {
t.Fatalf("Failed to create store 2: %v", err)
}
defer store2.Close(ctx)
eventBus1 := NewNATSEventBusWithBroadcaster(nc1, store1, "")
eventBus2 := NewNATSEventBusWithBroadcaster(nc2, store2, "")
defer eventBus1.Stop()
defer eventBus2.Stop()
actorID := "multi-node-actor"
receiverCh := eventBus2.Subscribe("")
event := &aether.Event{
ID: uuid.New().String(),
EventType: "InventoryReserved",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{"quantity": 5},
Timestamp: time.Now(),
}
eventBus1.Publish("", event)
select {
case receivedEvent := <-receiverCh:
if receivedEvent.EventType != "InventoryReserved" {
t.Errorf("Expected InventoryReserved, got %s", receivedEvent.EventType)
}
if receivedEvent.ActorID != actorID {
t.Errorf("Expected actorID %s, got %s", actorID, receivedEvent.ActorID)
}
case <-time.After(3 * time.Second):
t.Fatal("Timeout waiting for cross-node event")
}
}
func TestCrossNodeBroadcasting_NamespaceIsolation(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test")
}
_, nc, cleanup := setupNatsServer()
defer cleanup()
ctx := context.Background()
tenantAStore, err := NewJetStreamEventStoreWithNamespace(nc, "events", "tenant-a")
if err != nil {
t.Fatalf("Failed to create tenant A store: %v", err)
}
defer tenantAStore.Close(ctx)
tenantBStore, err := NewJetStreamEventStoreWithNamespace(nc, "events", "tenant-b")
if err != nil {
t.Fatalf("Failed to create tenant B store: %v", err)
}
defer tenantBStore.Close(ctx)
tenantAEventBus := NewNATSEventBusWithBroadcaster(nc, tenantAStore, "tenant-a")
tenantBEventBus := NewNATSEventBusWithBroadcaster(nc, tenantBStore, "tenant-b")
defer tenantAEventBus.Stop()
defer tenantBEventBus.Stop()
tenantACh := tenantAEventBus.Subscribe("tenant-a")
tenantBCh := tenantBEventBus.Subscribe("tenant-b")
actorID := "tenant-actor"
event := &aether.Event{
ID: uuid.New().String(),
EventType: "TenantEvent",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{"data": "tenant-a"},
Timestamp: time.Now(),
}
tenantAEventBus.Publish("tenant-a", event)
select {
case receivedEvent := <-tenantACh:
if receivedEvent.EventType != "TenantEvent" {
t.Errorf("Expected TenantEvent in tenant A, got %s", receivedEvent.EventType)
}
case <-time.After(2 * time.Second):
t.Error("Timeout waiting for tenant A to receive event")
}
select {
case <-tenantBCh:
t.Error("Tenant B should not receive tenant A's events")
case <-time.After(1 * time.Second):
// Expected - tenant B should not receive events from tenant A
}
}
func TestUpdateVersionCache_EventStored(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test")
}
_, nc, cleanup := setupNatsServer()
defer cleanup()
ctx := context.Background()
store, err := NewJetStreamEventStore(nc, "test_version_cache_eventstored")
if err != nil {
t.Fatalf("Failed to create store: %v", err)
}
defer store.Close(ctx)
eventBus := NewNATSEventBusWithBroadcaster(nc, store, "")
defer eventBus.Stop()
actorID := "version-cache-actor"
store.UpdateVersionCache(actorID, 5)
event := &aether.Event{
ID: uuid.New().String(),
EventType: "TestEvent",
ActorID: actorID,
Version: 10,
Data: map[string]interface{}{"test": true},
Timestamp: time.Now(),
}
eventBus.Publish("", event)
time.Sleep(100 * time.Millisecond)
storedVersion, err := store.GetLatestVersion(actorID)
if err != nil {
t.Fatalf("Failed to get latest version: %v", err)
}
if storedVersion != 10 {
t.Errorf("Expected version 10, got %d", storedVersion)
}
cacheVersion, ok := store.GetCachedVersion(actorID)
if !ok {
t.Error("Expected version to be in cache")
} else if cacheVersion != 10 {
t.Errorf("Expected cached version 10, got %d", cacheVersion)
}
}
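The tests above exercise one rule: a cached version only moves forward, even when many goroutines update it at once. The following is a minimal sketch of that rule, not the store's actual code. The `versionCache` type and its `Update` and `Get` methods are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// versionCache is a hypothetical stand-in for the store's
// actorID -> latest version map.
type versionCache struct {
	mu       sync.Mutex
	versions map[string]int64
}

func newVersionCache() *versionCache {
	return &versionCache{versions: make(map[string]int64)}
}

// Update stores version only if it is strictly greater than the cached
// value (or nothing is cached yet) and reports whether the cache changed.
func (c *versionCache) Update(actorID string, version int64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if current, ok := c.versions[actorID]; ok && version <= current {
		return false
	}
	c.versions[actorID] = version
	return true
}

// Get returns the cached version and whether an entry exists.
func (c *versionCache) Get(actorID string) (int64, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.versions[actorID]
	return v, ok
}

func main() {
	c := newVersionCache()
	var wg sync.WaitGroup
	// Fifty goroutines race to write versions 1..50.
	for v := int64(1); v <= 50; v++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			c.Update("concurrent-actor", v)
		}(v)
	}
	wg.Wait()
	final, _ := c.Get("concurrent-actor")
	// The highest version wins regardless of scheduling order.
	fmt.Println(final)
}
```

Because every write is a compare-then-set under one mutex, the final value is the maximum version written, which is a stronger property than the "at most maxVersion" bound the concurrent test checks.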


@@ -1,7 +1,6 @@
package store package store
import ( import (
"context"
"encoding/json" "encoding/json"
"fmt" "fmt"
"strings" "strings"
@@ -10,134 +9,31 @@ import (
"git.flowmade.one/flowmade-one/aether" "git.flowmade.one/flowmade-one/aether"
"github.com/nats-io/nats.go" "github.com/nats-io/nats.go"
"github.com/google/uuid"
) )
// Default configuration values for JetStream event store // JetStreamEventStore implements EventStore using NATS JetStream for persistence
const (
DefaultStreamRetention = 365 * 24 * time.Hour // 1 year
DefaultReplicaCount = 1
)
// JetStreamConfig holds configuration options for JetStreamEventStore
type JetStreamConfig struct {
// StreamRetention is how long to keep events (default: 1 year).
// JetStream enforces this retention policy at the storage level using a limits-based policy:
// - MaxAge: Events older than this duration are automatically deleted
// - Storage is file-based (nats.FileStorage) for durability
// - Once the retention period expires, events are permanently removed from the stream
// This ensures that old events do not consume storage indefinitely.
// To keep events indefinitely, set StreamRetention to a very large value or configure
// a custom retention policy in the JetStream stream configuration.
StreamRetention time.Duration
// ReplicaCount is the number of replicas for high availability (default: 1)
ReplicaCount int
// Namespace is an optional prefix for stream names to provide storage isolation.
// When set, the actual stream name becomes "{namespace}_{streamName}".
// Events in namespaced stores are completely isolated from other namespaces.
// Leave empty for backward-compatible non-namespaced behavior.
Namespace string
}
// DefaultJetStreamConfig returns the default configuration
func DefaultJetStreamConfig() JetStreamConfig {
return JetStreamConfig{
StreamRetention: DefaultStreamRetention,
ReplicaCount: DefaultReplicaCount,
}
}
// JetStreamEventStore implements EventStore using NATS JetStream for persistence.
// It also implements EventStoreWithErrors to report malformed events during replay.
//
// ## Immutability Guarantee
//
// JetStreamEventStore is append-only. Events are stored in a JetStream stream that
// is configured with file-based storage (nats.FileStorage) and a retention policy
// (nats.LimitsPolicy). The configured MaxAge retention policy ensures that old events
// eventually expire, but during their lifetime, events are never modified or deleted
// through the EventStore API. Once an event is published to the stream:
// - It cannot be updated
// - It cannot be deleted before expiration
// - It can only be read
//
// This architectural guarantee, combined with the EventStore interface providing
// no Update or Delete methods, ensures events are immutable and suitable as an
// audit trail.
//
// ## Version Cache Invalidation Strategy
//
// JetStreamEventStore maintains an in-memory cache of actor versions for optimistic
// concurrency control. The cache is invalidated on any miss (GetLatestVersion call
// that finds a newer version in JetStream) to ensure consistency even when external
// processes write to the same JetStream stream.
//
// If only Aether owns the stream (single-writer assumption), the cache provides
// excellent performance for repeated version checks. If external writers modify
// the stream, the cache will remain consistent because:
//
// 1. On SaveEvent: getLatestVersionLocked() checks JetStream on cache miss
// 2. On GetLatestVersion: If actual version > cached version, cache is invalidated
// 3. Subsequent checks for that actor will fetch fresh data from JetStream
//
// This strategy prevents data corruption from stale cache while maintaining
// performance for the single-writer case.
type JetStreamEventStore struct { type JetStreamEventStore struct {
js nats.JetStreamContext js nats.JetStreamContext
streamName string streamName string
config JetStreamConfig mu sync.Mutex // Protects version checks during SaveEvent
mu sync.Mutex // Protects version checks during SaveEvent versions map[string]int64 // actorID -> latest version cache
versions map[string]int64 // actorID -> latest version cache
broadcaster aether.EventBroadcaster // Optional broadcaster for EventStored events
namespace string // Optional namespace for event publishing
} }
// NewJetStreamEventStore creates a new JetStream-based event store
// NewJetStreamEventStore creates a new JetStream-based event store with default configuration
func NewJetStreamEventStore(natsConn *nats.Conn, streamName string) (*JetStreamEventStore, error) { func NewJetStreamEventStore(natsConn *nats.Conn, streamName string) (*JetStreamEventStore, error) {
return NewJetStreamEventStoreWithConfig(natsConn, streamName, DefaultJetStreamConfig())
}
// NewJetStreamEventStoreWithNamespace creates a new JetStream-based event store with namespace isolation.
// The namespace is prefixed to the stream name to ensure complete isolation at the storage level.
// This is a convenience function; the same can be achieved by setting Namespace in JetStreamConfig.
func NewJetStreamEventStoreWithNamespace(natsConn *nats.Conn, streamName string, namespace string) (*JetStreamEventStore, error) {
config := DefaultJetStreamConfig()
config.Namespace = namespace
return NewJetStreamEventStoreWithConfig(natsConn, streamName, config)
}
// NewJetStreamEventStoreWithConfig creates a new JetStream-based event store with custom configuration
func NewJetStreamEventStoreWithConfig(natsConn *nats.Conn, streamName string, config JetStreamConfig) (*JetStreamEventStore, error) {
js, err := natsConn.JetStream() js, err := natsConn.JetStream()
if err != nil { if err != nil {
return nil, fmt.Errorf("failed to get JetStream context: %w", err) return nil, fmt.Errorf("failed to get JetStream context: %w", err)
} }
// Apply defaults for zero values
if config.StreamRetention == 0 {
config.StreamRetention = DefaultStreamRetention
}
if config.ReplicaCount == 0 {
config.ReplicaCount = DefaultReplicaCount
}
// Apply namespace prefix to stream name if provided
effectiveStreamName := streamName
if config.Namespace != "" {
effectiveStreamName = fmt.Sprintf("%s_%s", sanitizeSubject(config.Namespace), streamName)
}
// Create or update the stream // Create or update the stream
stream := &nats.StreamConfig{ stream := &nats.StreamConfig{
Name: effectiveStreamName, Name: streamName,
Subjects: []string{fmt.Sprintf("%s.events.>", effectiveStreamName), fmt.Sprintf("%s.snapshots.>", effectiveStreamName)}, Subjects: []string{fmt.Sprintf("%s.events.>", streamName), fmt.Sprintf("%s.snapshots.>", streamName)},
Storage: nats.FileStorage, Storage: nats.FileStorage,
Retention: nats.LimitsPolicy, Retention: nats.LimitsPolicy,
MaxAge: config.StreamRetention, MaxAge: 365 * 24 * time.Hour, // Keep events for 1 year
Replicas: config.ReplicaCount, Replicas: 1, // Can be increased for HA
} }
_, err = js.AddStream(stream) _, err = js.AddStream(stream)
@@ -147,73 +43,8 @@ func NewJetStreamEventStoreWithConfig(natsConn *nats.Conn, streamName string, co
return &JetStreamEventStore{ return &JetStreamEventStore{
js: js, js: js,
streamName: effectiveStreamName, streamName: streamName,
config: config,
versions: make(map[string]int64), versions: make(map[string]int64),
broadcaster: nil,
namespace: "",
}, nil
}
// GetNamespace returns the namespace configured for this store, or empty string if not namespaced.
func (jes *JetStreamEventStore) GetNamespace() string {
return jes.config.Namespace
}
// GetStreamName returns the effective stream name (including namespace prefix if applicable).
func (jes *JetStreamEventStore) GetStreamName() string {
return jes.streamName
}
// NewJetStreamEventStoreWithBroadcaster creates a new JetStream-based event store with broadcaster support.
// The broadcaster receives EventStored events when events are successfully saved.
func NewJetStreamEventStoreWithBroadcaster(natsConn *nats.Conn, streamName string, broadcaster aether.EventBroadcaster, namespace string) (*JetStreamEventStore, error) {
config := DefaultJetStreamConfig()
if namespace != "" {
config.Namespace = namespace
}
js, err := natsConn.JetStream()
if err != nil {
return nil, fmt.Errorf("failed to get JetStream context: %w", err)
}
// Apply defaults for zero values
if config.StreamRetention == 0 {
config.StreamRetention = DefaultStreamRetention
}
if config.ReplicaCount == 0 {
config.ReplicaCount = DefaultReplicaCount
}
// Apply namespace prefix to stream name if provided
effectiveStreamName := streamName
if config.Namespace != "" {
effectiveStreamName = fmt.Sprintf("%s_%s", sanitizeSubject(config.Namespace), streamName)
}
// Create or update the stream
stream := &nats.StreamConfig{
Name: effectiveStreamName,
Subjects: []string{fmt.Sprintf("%s.events.>", effectiveStreamName), fmt.Sprintf("%s.snapshots.>", effectiveStreamName)},
Storage: nats.FileStorage,
Retention: nats.LimitsPolicy,
MaxAge: config.StreamRetention,
Replicas: config.ReplicaCount,
}
_, err = js.AddStream(stream)
if err != nil && !strings.Contains(err.Error(), "already exists") {
return nil, fmt.Errorf("failed to create stream: %w", err)
}
return &JetStreamEventStore{
js: js,
streamName: effectiveStreamName,
config: config,
versions: make(map[string]int64),
broadcaster: broadcaster,
namespace: namespace,
}, nil }, nil
} }
@@ -224,36 +55,19 @@ func (jes *JetStreamEventStore) SaveEvent(event *aether.Event) error {
jes.mu.Lock() jes.mu.Lock()
defer jes.mu.Unlock() defer jes.mu.Unlock()
// Check cache first // Get current latest version for this actor
if version, ok := jes.versions[event.ActorID]; ok { currentVersion, err := jes.getLatestVersionLocked(event.ActorID)
// Validate version against cached version if err != nil {
if event.Version <= version { return fmt.Errorf("failed to get latest version: %w", err)
return &aether.VersionConflictError{ }
ActorID: event.ActorID,
AttemptedVersion: event.Version,
CurrentVersion: version,
}
}
// Version check passed, proceed with publish while holding lock
} else {
// Cache miss - need to check actual stream
// Get current latest version while holding lock to prevent TOCTOU race
currentVersion, err := jes.getLatestVersionLocked(event.ActorID)
if err != nil {
return fmt.Errorf("failed to get latest version: %w", err)
}
// Validate version is strictly greater than current // Validate version is strictly greater than current
if event.Version <= currentVersion { if event.Version <= currentVersion {
return &aether.VersionConflictError{ return &aether.VersionConflictError{
ActorID: event.ActorID, ActorID: event.ActorID,
AttemptedVersion: event.Version, AttemptedVersion: event.Version,
CurrentVersion: currentVersion, CurrentVersion: currentVersion,
}
} }
// Update cache with current version
jes.versions[event.ActorID] = currentVersion
} }
// Serialize event to JSON // Serialize event to JSON
@@ -274,80 +88,50 @@ func (jes *JetStreamEventStore) SaveEvent(event *aether.Event) error {
return fmt.Errorf("failed to publish event to JetStream: %w", err) return fmt.Errorf("failed to publish event to JetStream: %w", err)
} }
// Update version cache after successful publish // Update version cache
jes.versions[event.ActorID] = event.Version jes.versions[event.ActorID] = event.Version
// Publish EventStored event after successful save (if broadcaster is configured)
if jes.broadcaster != nil {
jes.publishEventStored(event)
}
return nil return nil
} }
// publishEventStored publishes an EventStored event to the broadcaster. // getLatestVersionLocked returns the latest version for an actor.
// This is called after a successful SaveEvent to notify subscribers. // Caller must hold jes.mu.
// func (jes *JetStreamEventStore) getLatestVersionLocked(actorID string) (int64, error) {
// EventStored Event Schema: // Check cache first
// - EventType: "EventStored" (aether.EventTypeEventStored) if version, ok := jes.versions[actorID]; ok {
// - ActorID: ID of the actor that the original event was about return version, nil
// - Version: version of the stored event
// - Data:
// - eventId: (string) ID of the stored event
// - actorId: (string) ID of the actor
// - version: (int64) version of the event
// - timestamp: (int64) Unix timestamp of when the event was stored
//
// Example usage with NATSEventBus:
//
// eventBus := aether.NewNATSEventBus(natsConn)
// store := store.NewJetStreamEventStoreWithBroadcaster(natsConn, "events", eventBus, "")
// ch := eventBus.SubscribeToEventStored("*")
//
// for event := range ch {
// actorID := event.Data["actorId"].(string)
// version := event.Data["version"].(int64)
// store.UpdateVersionCache(actorID, version)
// }
func (jes *JetStreamEventStore) publishEventStored(originalEvent *aether.Event) {
eventStored := &aether.Event{
ID: uuid.New().String(),
EventType: aether.EventTypeEventStored,
ActorID: originalEvent.ActorID, // EventStored is about the original actor
Version: originalEvent.Version, // Preserve the version of the stored event
Data: map[string]interface{}{
"eventId": originalEvent.ID,
"actorId": originalEvent.ActorID,
"version": originalEvent.Version,
"timestamp": originalEvent.Timestamp.Unix(),
},
Timestamp: time.Now(),
} }
jes.broadcaster.Publish(jes.namespace, eventStored) // Fetch from JetStream
} events, err := jes.getEventsInternal(actorID, 0)
// GetEvents retrieves all events for an actor since a version.
// Note: This method silently skips malformed events for backward compatibility.
// Use GetEventsWithErrors to receive information about malformed events.
func (jes *JetStreamEventStore) GetEvents(actorID string, fromVersion int64) ([]*aether.Event, error) {
result, err := jes.getEventsWithErrorsInternal(actorID, fromVersion)
if err != nil { if err != nil {
return nil, err return 0, err
} }
return result.Events, nil
if len(events) == 0 {
return 0, nil
}
latestVersion := int64(0)
for _, event := range events {
if event.Version > latestVersion {
latestVersion = event.Version
}
}
// Update cache
jes.versions[actorID] = latestVersion
return latestVersion, nil
} }
// GetEventsWithErrors retrieves events for an actor and reports any malformed // GetEvents retrieves all events for an actor since a version
// events encountered. This method allows callers to decide how to handle func (jes *JetStreamEventStore) GetEvents(actorID string, fromVersion int64) ([]*aether.Event, error) {
// corrupted data rather than silently skipping it. return jes.getEventsInternal(actorID, fromVersion)
func (jes *JetStreamEventStore) GetEventsWithErrors(actorID string, fromVersion int64) (*aether.ReplayResult, error) {
return jes.getEventsWithErrorsInternal(actorID, fromVersion)
} }
// getEventsWithErrorsInternal is the internal implementation that tracks both // getEventsInternal is the internal implementation of GetEvents
// successfully parsed events and errors for malformed events. func (jes *JetStreamEventStore) getEventsInternal(actorID string, fromVersion int64) ([]*aether.Event, error) {
func (jes *JetStreamEventStore) getEventsWithErrorsInternal(actorID string, fromVersion int64) (*aether.ReplayResult, error) {
// Create subject filter for this actor // Create subject filter for this actor
subject := fmt.Sprintf("%s.events.%s.%s", subject := fmt.Sprintf("%s.events.%s.%s",
jes.streamName, jes.streamName,
@@ -361,10 +145,7 @@ func (jes *JetStreamEventStore) getEventsWithErrorsInternal(actorID string, from
} }
defer consumer.Unsubscribe() defer consumer.Unsubscribe()
result := &aether.ReplayResult{ var events []*aether.Event
Events: make([]*aether.Event, 0),
Errors: make([]aether.ReplayError, 0),
}
// Fetch messages in batches // Fetch messages in batches
for { for {
@@ -379,24 +160,12 @@ func (jes *JetStreamEventStore) getEventsWithErrorsInternal(actorID string, from
for _, msg := range msgs { for _, msg := range msgs {
var event aether.Event var event aether.Event
if err := json.Unmarshal(msg.Data, &event); err != nil { if err := json.Unmarshal(msg.Data, &event); err != nil {
// Record the error with context instead of silently skipping continue // Skip malformed events
metadata, _ := msg.Metadata()
seqNum := uint64(0)
if metadata != nil {
seqNum = metadata.Sequence.Stream
}
result.Errors = append(result.Errors, aether.ReplayError{
SequenceNumber: seqNum,
RawData: msg.Data,
Err: err,
})
msg.Ack() // Still ack to prevent redelivery
continue
} }
// Filter by version // Filter by version
if event.Version > fromVersion { if event.Version > fromVersion {
result.Events = append(result.Events, &event) events = append(events, &event)
} }
msg.Ack() msg.Ack()
@@ -407,99 +176,31 @@ func (jes *JetStreamEventStore) getEventsWithErrorsInternal(actorID string, from
} }
} }
return result, nil return events, nil
} }
// GetLatestVersion returns the latest version for an actor in O(1) time. // GetLatestVersion returns the latest version for an actor
// It uses JetStream's DeliverLast() option to fetch only the last message
// instead of scanning all events, making this O(1) instead of O(n).
func (jes *JetStreamEventStore) GetLatestVersion(actorID string) (int64, error) { func (jes *JetStreamEventStore) GetLatestVersion(actorID string) (int64, error) {
// Create subject filter for this actor events, err := jes.GetEvents(actorID, 0)
subject := fmt.Sprintf("%s.events.%s.%s",
jes.streamName,
sanitizeSubject(extractActorType(actorID)),
sanitizeSubject(actorID))
// Create consumer to read only the last message
consumer, err := jes.js.PullSubscribe(subject, "", nats.DeliverLast())
if err != nil { if err != nil {
return 0, fmt.Errorf("failed to create consumer: %w", err) return 0, err
}
defer consumer.Unsubscribe()
// Fetch only the last message
msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second))
if err != nil {
if err == nats.ErrTimeout {
// No messages for this actor, return 0
return 0, nil
}
return 0, fmt.Errorf("failed to fetch last message: %w", err)
} }
if len(msgs) == 0 { if len(events) == 0 {
// No events for this actor
return 0, nil return 0, nil
} }
// Parse the last message to get the version latestVersion := int64(0)
var event aether.Event for _, event := range events {
if err := json.Unmarshal(msgs[0].Data, &event); err != nil { if event.Version > latestVersion {
return 0, fmt.Errorf("failed to unmarshal last event: %w", err) latestVersion = event.Version
}
msgs[0].Ack()
return event.Version, nil
}
// getLatestVersionLocked is like GetLatestVersion but assumes the caller already holds jes.mu.
// This is used internally to avoid releasing and reacquiring the lock during SaveEvent,
// which would create a TOCTOU race condition.
func (jes *JetStreamEventStore) getLatestVersionLocked(actorID string) (int64, error) {
// Create subject filter for this actor
subject := fmt.Sprintf("%s.events.%s.%s",
jes.streamName,
sanitizeSubject(extractActorType(actorID)),
sanitizeSubject(actorID))
// Create consumer to read only the last message
consumer, err := jes.js.PullSubscribe(subject, "", nats.DeliverLast())
if err != nil {
return 0, fmt.Errorf("failed to create consumer: %w", err)
}
defer consumer.Unsubscribe()
// Fetch only the last message
msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second))
if err != nil {
if err == nats.ErrTimeout {
// No messages for this actor, return 0
return 0, nil
} }
return 0, fmt.Errorf("failed to fetch last message: %w", err)
} }
if len(msgs) == 0 { return latestVersion, nil
// No events for this actor
return 0, nil
}
// Parse the last message to get the version
var event aether.Event
if err := json.Unmarshal(msgs[0].Data, &event); err != nil {
return 0, fmt.Errorf("failed to unmarshal last event: %w", err)
}
msgs[0].Ack()
return event.Version, nil
} }
// GetLatestSnapshot gets the most recent snapshot for an actor. // GetLatestSnapshot gets the most recent snapshot for an actor
// Returns an error if no snapshot exists for the actor (unlike GetLatestVersion which returns 0).
// This is intentional: a missing snapshot is different from a missing event stream.
// If an actor has no events, that's a normal state (use version 0).
// If an actor has no snapshot, that could indicate an error or it could be normal
// depending on the use case, so we let the caller decide how to handle it.
func (jes *JetStreamEventStore) GetLatestSnapshot(actorID string) (*aether.ActorSnapshot, error) { func (jes *JetStreamEventStore) GetLatestSnapshot(actorID string) (*aether.ActorSnapshot, error) {
// Create subject for snapshots // Create subject for snapshots
subject := fmt.Sprintf("%s.snapshots.%s.%s", subject := fmt.Sprintf("%s.snapshots.%s.%s",
@@ -517,14 +218,12 @@ func (jes *JetStreamEventStore) GetLatestSnapshot(actorID string) (*aether.Actor
msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second)) msgs, err := consumer.Fetch(1, nats.MaxWait(time.Second))
if err != nil { if err != nil {
if err == nats.ErrTimeout { if err == nats.ErrTimeout {
// No snapshot found - return error to distinguish from successful nil result
return nil, fmt.Errorf("no snapshot found for actor %s", actorID) return nil, fmt.Errorf("no snapshot found for actor %s", actorID)
} }
return nil, fmt.Errorf("failed to fetch snapshot: %w", err) return nil, fmt.Errorf("failed to fetch snapshot: %w", err)
} }
if len(msgs) == 0 { if len(msgs) == 0 {
// No snapshot exists for this actor
return nil, fmt.Errorf("no snapshot found for actor %s", actorID) return nil, fmt.Errorf("no snapshot found for actor %s", actorID)
} }
@@ -580,44 +279,3 @@ func sanitizeSubject(s string) string {
s = strings.ReplaceAll(s, ">", "_") s = strings.ReplaceAll(s, ">", "_")
return s return s
} }
// UpdateVersionCache updates the version cache for a specific actor.
// This is used when receiving events from other nodes via NATS to keep
// the version cache consistent across cluster nodes.
//
// Only updates if the new version is greater than the cached version to prevent
// stale cache entries from causing version conflicts.
func (jes *JetStreamEventStore) UpdateVersionCache(actorID string, version int64) {
jes.mu.Lock()
defer jes.mu.Unlock()
// Only update if the new version is greater than cached version
if currentVersion, ok := jes.versions[actorID]; !ok || version > currentVersion {
jes.versions[actorID] = version
}
}
// GetCachedVersion returns the cached version for an actor, if available.
func (jes *JetStreamEventStore) GetCachedVersion(actorID string) (int64, bool) {
jes.mu.Lock()
defer jes.mu.Unlock()
version, ok := jes.versions[actorID]
return version, ok
}
// SetBroadcaster sets the event broadcaster for this store.
// The broadcaster is used to publish EventStored events when events are saved.
func (jes *JetStreamEventStore) SetBroadcaster(broadcaster aether.EventBroadcaster) {
jes.mu.Lock()
defer jes.mu.Unlock()
jes.broadcaster = broadcaster
}
// Close closes the JetStream event store and cleans up resources.
func (jes *JetStreamEventStore) Close(ctx context.Context) error {
return nil
}
// Compile-time check that JetStreamEventStore implements EventStoreWithErrors
var _ aether.EventStoreWithErrors = (*JetStreamEventStore)(nil)

View File

@@ -1,147 +0,0 @@
//go:build integration
package store
import (
"fmt"
"testing"
"time"
"git.flowmade.one/flowmade-one/aether"
)
// BenchmarkGetLatestVersion_WithManyEvents benchmarks GetLatestVersion performance
// with a large number of events per actor.
// This demonstrates the O(1) performance by showing that time doesn't increase
// significantly with more events.
func BenchmarkGetLatestVersion_WithManyEvents(b *testing.B) {
nc := getTestNATSConnection(&testing.T{})
if nc == nil {
b.Skip("NATS not available")
return
}
defer nc.Close()
store, err := NewJetStreamEventStore(nc, fmt.Sprintf("bench-getversion-%d", time.Now().UnixNano()))
if err != nil {
b.Fatalf("failed to create store: %v", err)
}
actorID := "actor-bench-test"
// Populate with 1000 events
for i := 1; i <= 1000; i++ {
event := &aether.Event{
ID: fmt.Sprintf("evt-%d", i),
EventType: "BenchEvent",
ActorID: actorID,
Version: int64(i),
Data: map[string]interface{}{"index": i},
Timestamp: time.Now(),
}
err := store.SaveEvent(event)
if err != nil {
b.Fatalf("SaveEvent failed for event %d: %v", i, err)
}
}
// Benchmark GetLatestVersion
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, err := store.GetLatestVersion(actorID)
if err != nil {
b.Fatalf("GetLatestVersion failed: %v", err)
}
}
b.StopTimer()
}
// BenchmarkGetLatestVersion_NoCache benchmarks GetLatestVersion without cache
// to show that even uncached lookups are very fast due to DeliverLast optimization.
// A new store instance is created before timing to bypass the version cache.
func BenchmarkGetLatestVersion_NoCache(b *testing.B) {
nc := getTestNATSConnection(&testing.T{})
if nc == nil {
b.Skip("NATS not available")
return
}
defer nc.Close()
store, err := NewJetStreamEventStore(nc, fmt.Sprintf("bench-nocache-%d", time.Now().UnixNano()))
if err != nil {
b.Fatalf("failed to create store: %v", err)
}
actorID := "actor-bench-nocache"
// Populate with 1000 events
for i := 1; i <= 1000; i++ {
event := &aether.Event{
ID: fmt.Sprintf("evt-%d", i),
EventType: "BenchEvent",
ActorID: actorID,
Version: int64(i),
Data: map[string]interface{}{"index": i},
Timestamp: time.Now(),
}
err := store.SaveEvent(event)
if err != nil {
b.Fatalf("SaveEvent failed for event %d: %v", i, err)
}
}
// Create a new store instance to bypass version cache
uncachedStore, err := NewJetStreamEventStore(nc, store.GetStreamName())
if err != nil {
b.Fatalf("failed to create uncached store: %v", err)
}
// Benchmark GetLatestVersion without using cache
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, err := uncachedStore.GetLatestVersion(actorID)
if err != nil {
b.Fatalf("GetLatestVersion failed: %v", err)
}
}
b.StopTimer()
}
// BenchmarkGetLatestVersion_SingleEvent benchmarks with minimal data
func BenchmarkGetLatestVersion_SingleEvent(b *testing.B) {
nc := getTestNATSConnection(&testing.T{})
if nc == nil {
b.Skip("NATS not available")
return
}
defer nc.Close()
store, err := NewJetStreamEventStore(nc, fmt.Sprintf("bench-single-%d", time.Now().UnixNano()))
if err != nil {
b.Fatalf("failed to create store: %v", err)
}
actorID := "actor-single"
event := &aether.Event{
ID: "evt-1",
EventType: "TestEvent",
ActorID: actorID,
Version: 1,
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err = store.SaveEvent(event)
if err != nil {
b.Fatalf("SaveEvent failed: %v", err)
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, err := store.GetLatestVersion(actorID)
if err != nil {
b.Fatalf("GetLatestVersion failed: %v", err)
}
}
b.StopTimer()
}


@@ -2,19 +2,15 @@ package store
import (
"sync"
-"time"
"git.flowmade.one/flowmade-one/aether"
-"github.com/google/uuid"
)
// InMemoryEventStore provides a simple in-memory event store for testing
type InMemoryEventStore struct {
mu sync.RWMutex
events map[string][]*aether.Event // actorID -> events
snapshots map[string][]*aether.ActorSnapshot // actorID -> snapshots (sorted by version)
-broadcaster aether.EventBroadcaster // optional broadcaster for EventStored events
-namespace string // optional namespace for event publishing
}
// NewInMemoryEventStore creates a new in-memory event store
@@ -25,21 +21,9 @@ func NewInMemoryEventStore() *InMemoryEventStore {
}
}
-// NewInMemoryEventStoreWithBroadcaster creates a new in-memory event store with an event broadcaster
-// The broadcaster receives EventStored events when events are successfully saved.
-func NewInMemoryEventStoreWithBroadcaster(broadcaster aether.EventBroadcaster, namespace string) *InMemoryEventStore {
-return &InMemoryEventStore{
-events: make(map[string][]*aether.Event),
-snapshots: make(map[string][]*aether.ActorSnapshot),
-broadcaster: broadcaster,
-namespace: namespace,
-}
-}
// SaveEvent saves an event to the in-memory store.
// Returns VersionConflictError if the event's version is not strictly greater
// than the current latest version for the actor.
-// If a broadcaster is configured, publishes an EventStored event on success.
func (es *InMemoryEventStore) SaveEvent(event *aether.Event) error {
es.mu.Lock()
defer es.mu.Unlock()
@@ -67,35 +51,9 @@ func (es *InMemoryEventStore) SaveEvent(event *aether.Event) error {
es.events[event.ActorID] = make([]*aether.Event, 0)
}
es.events[event.ActorID] = append(es.events[event.ActorID], event)
-// Publish EventStored event after successful save (if broadcaster is configured)
-if es.broadcaster != nil {
-es.publishEventStored(event)
-}
return nil
}
-// publishEventStored publishes an EventStored event to the broadcaster.
-// This is called after a successful SaveEvent to notify subscribers.
-func (es *InMemoryEventStore) publishEventStored(originalEvent *aether.Event) {
-eventStored := &aether.Event{
-ID: uuid.New().String(),
-EventType: aether.EventTypeEventStored,
-ActorID: originalEvent.ActorID, // EventStored is about the original actor
-Version: originalEvent.Version, // Preserve the version of the stored event
-Data: map[string]interface{}{
-"eventId": originalEvent.ID,
-"actorId": originalEvent.ActorID,
-"version": originalEvent.Version,
-"timestamp": originalEvent.Timestamp.Unix(),
-},
-Timestamp: time.Now(),
-}
-es.broadcaster.Publish(es.namespace, eventStored)
-}
// GetEvents retrieves events for an actor from a specific version
func (es *InMemoryEventStore) GetEvents(actorID string, fromVersion int64) ([]*aether.Event, error) {
es.mu.RLock()


@@ -1905,181 +1905,3 @@ func TestSaveEvent_MetadataPreservedAcrossMultipleEvents(t *testing.T) {
}
}
}
// === EventStored Publishing Tests ===
func TestSaveEvent_WithBroadcaster_PublishesEventStored(t *testing.T) {
// Create a mock broadcaster to capture published events
broadcaster := aether.NewEventBus()
store := NewInMemoryEventStoreWithBroadcaster(broadcaster, "test-namespace")
// Subscribe to EventStored events
ch := broadcaster.Subscribe("test-namespace")
defer broadcaster.Unsubscribe("test-namespace", ch)
event := &aether.Event{
ID: "evt-123",
EventType: "OrderPlaced",
ActorID: "order-456",
Version: 1,
Data: map[string]interface{}{
"total": 100.50,
},
Timestamp: time.Now(),
}
// Save event
err := store.SaveEvent(event)
if err != nil {
t.Fatalf("SaveEvent failed: %v", err)
}
// Check if EventStored was published
select {
case publishedEvent := <-ch:
if publishedEvent == nil {
t.Fatal("received nil event from broadcaster")
}
if publishedEvent.EventType != aether.EventTypeEventStored {
t.Errorf("expected EventType %q, got %q", aether.EventTypeEventStored, publishedEvent.EventType)
}
if publishedEvent.ActorID != "order-456" {
t.Errorf("expected ActorID %q, got %q", "order-456", publishedEvent.ActorID)
}
if publishedEvent.Version != 1 {
t.Errorf("expected Version 1, got %d", publishedEvent.Version)
}
// Check data contains original event info
if publishedEvent.Data["eventId"] != "evt-123" {
t.Errorf("expected eventId %q, got %q", "evt-123", publishedEvent.Data["eventId"])
}
case <-time.After(100 * time.Millisecond):
t.Fatal("timeout waiting for EventStored event")
}
}
func TestSaveEvent_VersionConflict_NoEventStored(t *testing.T) {
broadcaster := aether.NewEventBus()
store := NewInMemoryEventStoreWithBroadcaster(broadcaster, "test-namespace")
// Subscribe to EventStored events
ch := broadcaster.Subscribe("test-namespace")
defer broadcaster.Unsubscribe("test-namespace", ch)
// Save first event
event1 := &aether.Event{
ID: "evt-1",
EventType: "OrderPlaced",
ActorID: "order-456",
Version: 1,
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err := store.SaveEvent(event1)
if err != nil {
t.Fatalf("SaveEvent(event1) failed: %v", err)
}
// Drain the first EventStored event
select {
case <-ch:
case <-time.After(100 * time.Millisecond):
t.Fatal("timeout waiting for first EventStored event")
}
// Try to save event with non-increasing version (should fail)
event2 := &aether.Event{
ID: "evt-2",
EventType: "OrderPlaced",
ActorID: "order-456",
Version: 1, // Same version, should conflict
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err = store.SaveEvent(event2)
if !errors.Is(err, aether.ErrVersionConflict) {
t.Fatalf("expected ErrVersionConflict, got %v", err)
}
// Verify no EventStored event was published
select {
case <-ch:
t.Fatal("expected no EventStored event, but received one")
case <-time.After(50 * time.Millisecond):
// Expected - no event published
}
}
func TestSaveEvent_MultipleEvents_PublishesMultipleEventStored(t *testing.T) {
broadcaster := aether.NewEventBus()
store := NewInMemoryEventStoreWithBroadcaster(broadcaster, "test-namespace")
// Subscribe to EventStored events
ch := broadcaster.Subscribe("test-namespace")
defer broadcaster.Unsubscribe("test-namespace", ch)
// Save multiple events
for i := int64(1); i <= 3; i++ {
event := &aether.Event{
ID: fmt.Sprintf("evt-%d", i),
EventType: "OrderPlaced",
ActorID: "order-456",
Version: i,
Data: map[string]interface{}{},
Timestamp: time.Now(),
}
err := store.SaveEvent(event)
if err != nil {
t.Fatalf("SaveEvent failed: %v", err)
}
}
// Verify we received 3 EventStored events in order
for i := int64(1); i <= 3; i++ {
select {
case publishedEvent := <-ch:
if publishedEvent == nil {
t.Fatal("received nil event from broadcaster")
}
if publishedEvent.Version != i {
t.Errorf("expected Version %d, got %d", i, publishedEvent.Version)
}
case <-time.After(100 * time.Millisecond):
t.Fatalf("timeout waiting for EventStored event %d", i)
}
}
}
func TestSaveEvent_WithoutBroadcaster_NoPanic(t *testing.T) {
// Test that SaveEvent works without a broadcaster (nil broadcaster)
store := NewInMemoryEventStore()
event := &aether.Event{
ID: "evt-123",
EventType: "OrderPlaced",
ActorID: "order-456",
Version: 1,
Data: map[string]interface{}{
"total": 100.50,
},
Timestamp: time.Now(),
}
// This should not panic even though broadcaster is nil
err := store.SaveEvent(event)
if err != nil {
t.Fatalf("SaveEvent failed: %v", err)
}
// Verify event was saved
events, err := store.GetEvents("order-456", 0)
if err != nil {
t.Fatalf("GetEvents failed: %v", err)
}
if len(events) != 1 {
t.Fatalf("expected 1 event, got %d", len(events))
}
}


@@ -1,124 +0,0 @@
package store
import (
"testing"
)
func TestJetStreamConfigNamespace(t *testing.T) {
t.Run("default config has empty namespace", func(t *testing.T) {
config := DefaultJetStreamConfig()
if config.Namespace != "" {
t.Errorf("expected empty namespace in default config, got %q", config.Namespace)
}
})
t.Run("namespace can be set in config", func(t *testing.T) {
config := JetStreamConfig{
Namespace: "tenant-abc",
}
if config.Namespace != "tenant-abc" {
t.Errorf("expected namespace tenant-abc, got %q", config.Namespace)
}
})
}
func TestNamespacedStreamName(t *testing.T) {
tests := []struct {
name string
baseStreamName string
namespace string
expectedStreamName string
}{
{
name: "no namespace - stream name unchanged",
baseStreamName: "events",
namespace: "",
expectedStreamName: "events",
},
{
name: "with namespace - prefixed stream name",
baseStreamName: "events",
namespace: "tenant-abc",
expectedStreamName: "tenant-abc_events",
},
{
name: "namespace with dots - sanitized",
baseStreamName: "events",
namespace: "tenant.abc",
expectedStreamName: "tenant_abc_events",
},
{
name: "namespace with spaces - sanitized",
baseStreamName: "events",
namespace: "tenant abc",
expectedStreamName: "tenant_abc_events",
},
{
name: "namespace with special chars - sanitized",
baseStreamName: "events",
namespace: "tenant*abc>def",
expectedStreamName: "tenant_abc_def_events",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// We can't create a real JetStreamEventStore without NATS,
// but we can test the stream name logic by examining the expected format
effectiveStreamName := tt.baseStreamName
if tt.namespace != "" {
effectiveStreamName = sanitizeSubject(tt.namespace) + "_" + tt.baseStreamName
}
if effectiveStreamName != tt.expectedStreamName {
t.Errorf("expected stream name %q, got %q", tt.expectedStreamName, effectiveStreamName)
}
})
}
}
func TestSanitizeSubject(t *testing.T) {
tests := []struct {
input string
expected string
}{
{"simple", "simple"},
{"with spaces", "with_spaces"},
{"with.dots", "with_dots"},
{"with*stars", "with_stars"},
{"with>greater", "with_greater"},
{"complex.name with*special>chars", "complex_name_with_special_chars"},
}
for _, tt := range tests {
t.Run(tt.input, func(t *testing.T) {
result := sanitizeSubject(tt.input)
if result != tt.expected {
t.Errorf("sanitizeSubject(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}
func TestExtractActorType(t *testing.T) {
tests := []struct {
actorID string
expectedType string
}{
{"order-123", "order"},
{"user-abc-def", "user"},
{"nodelimiter", "unknown"},
{"", "unknown"},
{"-leadingdash", "unknown"},
{"a-b", "a"},
}
for _, tt := range tests {
t.Run(tt.actorID, func(t *testing.T) {
result := extractActorType(tt.actorID)
if result != tt.expectedType {
t.Errorf("extractActorType(%q) = %q, want %q", tt.actorID, result, tt.expectedType)
}
})
}
}
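
The table-driven tests above pin down the expected behavior of `sanitizeSubject` and `extractActorType` without showing their bodies. The following is one possible sketch that satisfies every case listed in those tables; it is inferred from the test expectations only and is not the repository's actual implementation. `namespacedStreamName` is a hypothetical helper name that wraps the prefixing logic the stream-name test spells out inline.

```go
package main

import (
	"fmt"
	"strings"
)

// subjectReplacer maps the characters exercised by the tests (space, dot,
// star, greater-than) to underscores. The real function may cover more.
var subjectReplacer = strings.NewReplacer(" ", "_", ".", "_", "*", "_", ">", "_")

// sanitizeSubject replaces characters that are unsafe in a subject or stream name.
func sanitizeSubject(s string) string {
	return subjectReplacer.Replace(s)
}

// namespacedStreamName (hypothetical name) prefixes the base stream name with
// the sanitized namespace, or returns the base name when no namespace is set.
func namespacedStreamName(base, namespace string) string {
	if namespace == "" {
		return base
	}
	return sanitizeSubject(namespace) + "_" + base
}

// extractActorType returns the text before the first dash, or "unknown" when
// there is no dash or the ID starts with one.
func extractActorType(actorID string) string {
	i := strings.Index(actorID, "-")
	if i <= 0 {
		return "unknown"
	}
	return actorID[:i]
}

func main() {
	fmt.Println(namespacedStreamName("events", "tenant*abc>def")) // tenant_abc_def_events
	fmt.Println(extractActorType("user-abc-def"))                 // user
	fmt.Println(extractActorType("-leadingdash"))                 // unknown
}
```

The `i <= 0` check covers three table rows at once: the empty ID, an ID without a delimiter (`strings.Index` returns -1), and an ID with a leading dash (index 0).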