The integration tests had timing issues causing intermittent failures on CI:
- TestNATSEventBus_HighThroughput: Added subscriber readiness synchronization using a barrier event before bulk publishing. This ensures the NATS subscription is fully established before events are sent rapidly. Extended timeout from 30s to 60s for CI environments.
- TestNATSEventBus_EventOrdering: Added readiness barrier event to synchronize subscriber setup before publishing ordered events. Extended timeout from 10s to 15s to account for CI timing variations.
- TestNATSEventBus_ConcurrentPublishSubscribe: Added readiness synchronization before concurrent publishers start. Extended timeout from 10s to 30s to handle the increased load under CI constraints.
Root causes:
- Subscriber channels were not fully ready to receive when bulk publishing started, causing message loss
- CI runners (especially ARM64) have different timing characteristics than local development
- Insufficient timeouts for high-volume event collection under shared CI resources
The fixes use a barrier pattern: publish a ready signal, wait to receive it, then proceed with the test. This is more reliable than fixed sleep durations.
Closes#57
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Gitea runner uses ARM64, not x86_64. Detect architecture
at runtime and download the appropriate NATS server binary.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Run nats-server directly from extracted location instead of
installing to /usr/local/bin, avoiding the need for sudo which
isn't available in the Gitea runner environment.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove unused services block that caused CI failure
(Gitea runner doesn't support --name/-p in options field)
- Update build tag to modern //go:build syntax (Go 1.17+)
The workflow already manually installs and starts NATS with JetStream,
making the services block redundant.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit adds comprehensive integration tests for JetStreamEventStore
that validate production event store behavior against a real NATS server.
Tests include:
- Stream creation and configuration
- SaveEvent persistence to JetStream
- GetEvents retrieval in correct order
- GetLatestVersion functionality
- Snapshot save/load operations
- Namespace isolation between stores
- Concurrent writes and version conflict handling
- Persistence across connection disconnects
- Multiple store instance coordination
Also updates CI workflow to run integration tests with a NATS server
enabled with JetStream.
Closes#10
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comprehensive unit tests for shard management functionality:
- GetShard returns correct shard for actor IDs consistently
- GetShardNodes returns nodes responsible for each shard
- AssignShard correctly updates shard assignments
- PlaceActor returns valid nodes from available set
- Shard assignment handles node failures gracefully
- Replication factor is properly tracked
Includes tests for edge cases (empty shards, nil registry, single node)
and benchmark tests for GetShard, AssignShard, and PlaceActor.
Closes#5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds support for filtering events by type or actor pattern within namespace
subscriptions. Key changes:
- Add SubscriptionFilter type with EventTypes and ActorPattern fields
- Add SubscribeWithFilter to EventBroadcaster interface
- Implement filtering in EventBus with full wildcard pattern support preserved
- Implement filtering in NATSEventBus (server-side namespace, client-side filters)
- Add MatchActorPattern function for actor ID pattern matching
- Add comprehensive unit tests for all filtering scenarios
Filter Processing:
- EventTypes: Event must match at least one type in the list (OR within types)
- ActorPattern: Event's ActorID must match the pattern (supports * and > wildcards)
- Multiple filters are combined with AND logic
This implementation works alongside the existing wildcard subscription support:
- Namespace wildcards (* and >) work with event filters
- Filters are applied after namespace pattern matching
- Metrics are properly recorded for filtered subscriptions
Closes#21
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive integration tests that verify NATSEventBus behavior
with a real NATS server. Tests cover:
- Cross-node event delivery (multiple NATSEventBus instances)
- Namespace isolation with single and multiple NATS connections
- High-throughput scenarios (1000 events)
- Event ordering within namespace
- No cross-namespace leakage verification
- Concurrent publish/subscribe operations
- Multiple subscribers to same namespace
- Event metadata preservation across NATS
- Large event payload handling (100KB)
- Subscribe/unsubscribe lifecycle
- Reconnection behavior
- Graceful degradation under load
- Benchmarks for publish and publish-receive
Tests require a running NATS server and are tagged with +build integration.
Run with: go test -tags=integration -v ./...
Closes#18
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Support NATS-style wildcard patterns ("*" and ">") for subscribing
to events across multiple namespaces. This enables cross-cutting
concerns like logging, monitoring, and auditing without requiring
separate subscriptions for each namespace.
- Add pattern.go with MatchNamespacePattern and IsWildcardPattern
- Update EventBus to track wildcard subscribers separately
- Update NATSEventBus to use NATS native wildcard support
- Add comprehensive tests for pattern matching and EventBus wildcards
- Document security implications in all relevant code comments
Closes#20
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a README.md that gives developers a quick understanding of what
Aether is and how to get started. Includes:
- Project description and why Aether exists
- Installation instructions
- Quick start code example showing event creation, persistence, and replay
- Key concepts (immutability, derived state, version consistency)
- Links to further documentation
- CI badge
Closes#44
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for optional namespace prefixes on JetStreamEventStore streams
to enable complete namespace isolation at the storage level:
- Add Namespace field to JetStreamConfig
- Add NewJetStreamEventStoreWithNamespace convenience constructor
- Prefix stream names with sanitized namespace when configured
- Add GetNamespace and GetStreamName accessor methods
- Add unit tests for namespace functionality
- Document namespace-scoped stores in CLAUDE.md
The namespace prefix is sanitized (spaces, dots, wildcards converted to
underscores) and prepended to the stream name, ensuring events from one
namespace cannot be read from another namespace's store while maintaining
full backward compatibility for non-namespaced stores.
Closes#19
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ReplayError and ReplayResult types to capture information about
malformed events encountered during replay. This allows callers to
inspect and handle corrupted data rather than having it silently skipped.
Key changes:
- Add ReplayError type with sequence number, raw data, and underlying error
- Add ReplayResult type containing both successfully parsed events and errors
- Add EventStoreWithErrors interface for stores that can report replay errors
- Implement GetEventsWithErrors on JetStreamEventStore
- Update GetEvents to maintain backward compatibility (still skips malformed)
- Add comprehensive unit tests for the new types
This addresses the issue of silent data loss during event-sourced replay
by giving callers visibility into data quality issues.
Closes#39
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add sync.RWMutex to ConsistentHashRing struct
- Use Lock/Unlock for write operations (AddNode, RemoveNode)
- Use RLock/RUnlock for read operations (GetNode, GetNodes, IsEmpty)
This allows concurrent reads (the common case) while serializing writes,
preventing race conditions when multiple goroutines access the hash ring.
Closes#35
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
JSON unmarshal produces map[string]interface{}, not concrete types.
Added ModelPayload and MessagePayload concrete types that implement
RuntimeModel and RuntimeMessage interfaces respectively.
The handleClusterMessage now re-marshals and unmarshals the payload
to convert from map[string]interface{} to the proper concrete type.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add VirtualMachine interface with GetID(), GetActorID(), GetState()
- Add VMState type with idle/running/paused/stopped states
- Add RuntimeModel interface for event storming model contracts
- Add RuntimeMessage interface for actor message contracts
- Add VMProvider interface for decoupled VM access
- Update VMRegistry.GetActiveVMs() to return map[string]VirtualMachine
- Update Runtime interface to use RuntimeModel and RuntimeMessage
- Update DistributedVMRegistry to use VMProvider instead of interface{}
- Add SetVMProvider method to DistributedVM for dependency injection
This improves type safety, makes contracts explicit, and enables better
IDE support while avoiding import cycles through interface segregation.
Closes#37
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add config structs with sensible defaults for tunable parameters:
- JetStreamConfig for stream retention (1 year) and replica count (1)
- HashRingConfig for virtual nodes per physical node (150)
- ShardConfig for shard count (1024) and replication factor (1)
Each component gets a new WithConfig constructor that accepts custom
configuration, while the original constructors continue to work with
defaults. Zero values in configs fall back to defaults for backward
compatibility.
Closes#38
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Metadata field (map[string]string) to Event struct with omitempty
- Add helper methods for common metadata: SetCorrelationID/GetCorrelationID,
SetCausationID/GetCausationID, SetUserID/GetUserID, SetTraceID/GetTraceID,
SetSpanID/GetSpanID
- Add WithMetadataFrom helper for copying metadata between events
- Add metadata key constants for standard fields
- Add comprehensive unit tests for metadata serialization and helpers
- Add store tests verifying metadata persistence
Closes#7
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ErrVersionConflict error type and VersionConflictError for detailed
conflict information
- Implement version validation in InMemoryEventStore.SaveEvent that rejects
events with version <= current latest version
- Implement version validation in JetStreamEventStore.SaveEvent with version
caching for performance
- Add comprehensive tests for version conflict detection including concurrent
writes to same actor
- Document versioning semantics in EventStore interface and CLAUDE.md
This ensures events have monotonically increasing versions per actor and
provides clear error messages for version conflicts, enabling optimistic
concurrency control patterns.
Closes#6
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Test SaveEvent persists events correctly (single, multiple, multi-actor)
- Test GetEvents retrieves events in insertion order
- Test GetEvents with fromVersion filtering
- Test GetLatestVersion returns correct version
- Test behavior with non-existent actor IDs (returns empty/zero)
- Test concurrent access safety with race detector
- Add mutex protection to InMemoryEventStore for thread safety
Closes#3
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Test all public methods with 100% coverage:
- AddNode: updates ring, is idempotent, handles multiple nodes
- RemoveNode: updates ring, handles non-existent nodes
- GetNode: returns consistent results, handles empty ring and single node
- GetNodes and IsEmpty helper methods
Distribution tests verify:
- Balanced key distribution across nodes (< 25% deviation)
- Minimal key movement when adding nodes (< 35% moved)
- Virtual nodes improve distribution (CV < 15%)
- Ring behavior with 100+ nodes
Includes benchmarks for GetNode, AddNode, and distribution operations.
Closes#2
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Test JSON serialization/deserialization, field names, omitempty behavior,
edge cases (empty/nil data, large payloads, special characters including
unicode and control chars), timestamp handling across timezones, nanosecond
precision, version edge cases, and nested data structures.
Closes#1
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>