From 7d75add5f5f8aa7c7f47b35f169950f9c7395455 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jo=C3=A3o=20Reis?=
Date: Tue, 22 Jul 2025 17:21:47 +0100
Subject: [PATCH] Update documentation for 2.0 (readme, upgrade guide, pkg.go.dev)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch updates the existing pkg.go.dev documentation and adds documentation for the new packages.

This patch also updates the README and adds a new UPGRADE_GUIDE.md with documentation for users upgrading from 1.x to 2.x.

Patch by João Reis; reviewed by TBD for CASSGO-79
---
 README.md           |  27 +-
 UPGRADE_GUIDE.md    | 748 ++++++++++++++++++++++++++++++++++++++++++++
 doc.go              | 513 +++++++++++++++++++++++++++---
 gocqlzap/doc.go     | 128 ++++++++
 gocqlzerolog/doc.go | 130 ++++++++
 hostpool/doc.go     | 101 ++++++
 lz4/doc.go          |  72 +++++
 snappy/doc.go       |  75 +++++
 8 files changed, 1745 insertions(+), 49 deletions(-)
 create mode 100644 UPGRADE_GUIDE.md
 create mode 100644 gocqlzap/doc.go
 create mode 100644 gocqlzerolog/doc.go
 create mode 100644 hostpool/doc.go
 create mode 100644 lz4/doc.go
 create mode 100644 snappy/doc.go

diff --git a/README.md b/README.md
index 5144caf29..1b6a26dcf 100644
--- a/README.md
+++ b/README.md
@@ -35,6 +35,8 @@ Installation

    go get github.com/apache/cassandra-gocql-driver/v2

+**Note:** Version `2.0.0` introduces breaking changes. See the [upgrade guide](https://github.com/apache/cassandra-gocql-driver/blob/trunk/UPGRADE_GUIDE.md) for upgrade instructions from `1.x`.
+
Features
--------

@@ -53,18 +55,29 @@ Features
* Each connection can execute up to n concurrent queries (whereby n is the limit set by the protocol version the client chooses to use)
* Optional automatic discovery of nodes
* Policy based connection pool with token aware and round-robin policy implementations
+* Support for host-targeted queries with Query.SetHostID()
* Support for password authentication
* Iteration over paged results with configurable page size
* Support for TLS/SSL
-* Optional frame compression (using snappy)
+* Optional frame compression (Snappy and LZ4 available in separate packages)
+* Structured logging support with dedicated packages for popular loggers (Zap, Zerolog)
* Automatic query preparation
* Support for query tracing
-* Support for Cassandra 2.1+ [binary protocol version 3](https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v3.spec)
-  * Support for up to 32768 streams
-  * Support for tuple types
-  * Support for client side timestamps by default
-  * Support for UDTs via a custom marshaller or struct tags
-* Support for Cassandra 3.0+ [binary protocol version 4](https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec)
+* Support for Cassandra 2.1+ through 5.0+ with native protocol versions 3, 4, and 5:
+  * **Protocol 3** (Cassandra 2.1+):
+    * Support for up to 32768 streams
+    * Support for tuple types
+    * Support for client side timestamps by default
+    * Support for UDTs via a custom marshaller or struct tags
+  * **Protocol 4** (Cassandra 3.0+):
+    * All Protocol 3 features
+    * Support for unset bound values via gocql.UnsetValue
+  * **Protocol 5** (Cassandra 4.0+):
+    * All previous protocol features
+    * Support for per-query keyspace override (Query.SetKeyspace(), Batch.SetKeyspace())
+    * Support for per-query current time override (Query.WithNowInSeconds(), Batch.WithNowInSeconds())
+  * **Cassandra 5.0+ specific**:
+    * Support for vector types, enabling vector search
* An API to access the schema metadata of a given keyspace

Performance
diff --git a/UPGRADE_GUIDE.md b/UPGRADE_GUIDE.md
new file mode 100644
index 000000000..12fd39190
--- /dev/null
+++ b/UPGRADE_GUIDE.md
@@ -0,0 +1,748 @@
+
+
+# GoCQL Major Version Upgrade Guide
+
+This guide helps you migrate between major versions of the GoCQL driver. Each major version introduces significant changes that may require code modifications.
+
+
+
+## Available Upgrade Paths
+
+- [v1.x → v2.x](#upgrading-from-v1x-to-v2x)
+- Future version upgrades will be documented here as they become available
+
+---
+
+## Upgrading from v1.x to v2.x
+
+Version 2.0.0 represents a major overhaul of the GoCQL driver with significant API changes, new features, and improvements. This migration requires careful planning and testing.
+
+**Important Prerequisites:**
+- **Minimum Go version**: Go 1.19+
+- **Minimum Cassandra version**: 2.1+ (recommended 4.1+ for full feature support)
+- **Supported protocol versions**: 3, 4, 5 (Cassandra 2.1+; versions 1 and 2 are no longer supported)
+
+### Table of Contents
+
+- [Protocol Version / Cassandra Version Support](#protocol-version--cassandra-version-support)
+- [Breaking Changes](#breaking-changes)
+  - [Module and Package Changes](#module-and-package-changes)
+  - [Removed Global Functions](#removed-global-functions)
+  - [Session Setter Methods Removed](#session-setter-methods-removed)
+  - [Methods Moved from Query/Batch to Iter](#methods-moved-from-querybatch-to-iter)
+  - [Logging System Overhaul](#logging-system-overhaul)
+  - [Advanced API Changes](#advanced-api-changes)
+    - [HostInfo Method Visibility Changes](#hostinfo-method-visibility-changes)
+    - [HostSelectionPolicy Interface Changes](#hostselectionpolicy-interface-changes)
+    - [ExecutableQuery Interface Deprecated](#executablequery-interface-deprecated)
+  - [TimeoutLimit Variable Removal](#timeoutlimit-variable-removal)
+- [Changes that might lead to runtime errors](#changes-that-might-lead-to-runtime-errors)
+  - [PasswordAuthenticator Behavior Change](#passwordauthenticator-behavior-change)
+  - [CQL to Go type mapping for inet columns changed in MapScan/SliceMap](#cql-to-go-type-mapping-for-inet-columns-changed-in-mapscanslicemap)
+  - [NULL Collections Now Return nil Instead of Empty Collections in MapScan/SliceMap](#null-collections-now-return-nil-instead-of-empty-collections-in-mapscanslicemap)
+- [Deprecation Notices](#deprecation-notices)
+  - [Example Migrations](#example-migrations)
+
+---
+
+### Protocol Version / Cassandra Version Support
+
+**Protocol versions 1 and 2 removed:**
+
+gocql v2.x drops support for protocol versions 1 and 2.
Here's the mapping of protocol versions to Cassandra versions: + +| Protocol Version | Cassandra Versions | Status in v2.x | +|------------------|-------------------|----------------| +| 1 | 1.2 - 2.0 | ❌ **REMOVED** | +| 2 | 2.0 - 2.2 | ❌ **REMOVED** | +| 3 | 2.1+ | ✅ **Minimum supported** | +| 4 | 2.2+ | ✅ | +| 5 | 4.0+ | ✅ **Latest** | + +```go +// OLD (v1.x) - NO LONGER SUPPORTED +cluster.ProtoVersion = 1 // ❌ Runtime error during connection +cluster.ProtoVersion = 2 // ❌ Runtime error during connection + +// NEW (v2.x) - Minimum version 3 +cluster.ProtoVersion = 3 // ✅ Minimum supported (Cassandra 2.1+) +cluster.ProtoVersion = 4 // ✅ (Cassandra 2.2+) +cluster.ProtoVersion = 5 // ✅ Latest (Cassandra 4.0+) +// OR omit for auto-negotiation (recommended) +``` + +**Runtime error you'll see:** +```bash +gocql: unsupported protocol response version: 1 +# OR +gocql: unsupported protocol response version: 2 +``` + +**Migration:** Update your cluster configuration to use protocol version 3 or higher, or remove the explicit ProtoVersion setting to use auto-negotiation: +```go +// Option 1: Explicit version (minimum 3) +cluster.ProtoVersion = 3 // For Cassandra 2.1+ + +// Option 2: Auto-negotiation (recommended) +cluster := gocql.NewCluster("127.0.0.1") +// ProtoVersion will be auto-negotiated to the highest supported version +// This is recommended as it works with any Cassandra 2.1+ version +``` + +**Note:** Since gocql v2.x requires Cassandra 2.1+ anyway, most users should use auto-negotiation for the best compatibility unless you have a cluster with nodes that have different Cassandra versions. + +--- + +### Breaking Changes + +#### Module and Package Changes + +**CRITICAL: All users must update import paths** + +The module has been moved to the Apache Software Foundation with a new import path: + +```go +// OLD (v1.x) +import "github.com/gocql/gocql" + +// NEW (v2.x) +import "github.com/apache/cassandra-gocql-driver/v2" +``` + +**Compressor modules converted to packages:** + +The Snappy and LZ4 compressors have been reorganized from separate modules into packages within the main driver: + +```go +// OLD (v1.x) - Snappy was part of main module +cluster.Compressor = &gocql.SnappyCompressor{} + +// OLD (v1.x) - LZ4 was a separate module +import "github.com/gocql/gocql/lz4" + +// NEW (v2.x) - Both are now packages within the main module +import "github.com/apache/cassandra-gocql-driver/v2/snappy" +import "github.com/apache/cassandra-gocql-driver/v2/lz4" + +cluster.Compressor = &snappy.SnappyCompressor{} // ✅ New package syntax +cluster.Compressor = &lz4.LZ4Compressor{} // ✅ New package syntax +``` + +**HostPoolHostPolicy moved to hostpool package:** + +The `HostPoolHostPolicy` function has been moved from the main gocql package to the hostpool package: + +```go +// OLD (v1.x) - COMPILATION ERROR in v2.x +cluster.PoolConfig.HostSelectionPolicy = gocql.HostPoolHostPolicy(hostpool.New(nil)) // ❌ undefined: gocql.HostPoolHostPolicy + +// NEW (v2.x) - Import from hostpool package +import "github.com/apache/cassandra-gocql-driver/v2/hostpool" +cluster.PoolConfig.HostSelectionPolicy = hostpool.HostPoolHostPolicy(hostpool.New(nil)) +``` + +**All import path changes:** +```go +// OLD (v1.x) +import "github.com/gocql/gocql" +import "github.com/gocql/gocql/lz4" // Was separate module + +// NEW (v2.x) +import "github.com/apache/cassandra-gocql-driver/v2" +import "github.com/apache/cassandra-gocql-driver/v2/snappy" // Now package +import "github.com/apache/cassandra-gocql-driver/v2/lz4" // Now package 
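+// The structured-logging adapter packages added in v2.x follow the same layout. These imports are
+// optional and only needed if you use the corresponding adapters (see the logging section below):
+import "github.com/apache/cassandra-gocql-driver/v2/gocqlzap"     // Zap adapter
+import "github.com/apache/cassandra-gocql-driver/v2/gocqlzerolog" // Zerolog adapter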
+import "github.com/apache/cassandra-gocql-driver/v2/hostpool" // For HostPoolHostPolicy +``` + +#### Removed Global Functions + +**`NewBatch()` function removed (was deprecated):** +```go +// OLD (v1.x) - COMPILATION ERROR in v2.x +batch := gocql.NewBatch(gocql.LoggedBatch) // ❌ undefined: gocql.NewBatch +``` + +**Compilation error you'll see:** +```bash +./main.go:42:10: undefined: gocql.NewBatch +``` + +**Migration:** +```go +// NEW (v2.x) - Use fluent API +batch := session.Batch(gocql.LoggedBatch) +``` + +**`MustParseConsistency()` function removed (was deprecated):** +```go +// OLD (v1.x) - COMPILATION ERROR in v2.x +cons, err := gocql.MustParseConsistency("quorum") // ❌ undefined: gocql.MustParseConsistency +``` + +**Compilation error you'll see:** +```bash +./main.go:45:11: undefined: gocql.MustParseConsistency +``` + +**Migration:** +```go +// NEW (v2.x) - Use ParseConsistency (panics on error instead of returning unused error) +cons := gocql.ParseConsistency("quorum") // ✅ Direct panic on invalid input +``` + +#### Session Setter Methods Removed + +**Session setter methods removed for immutability:** +```go +// OLD (v1.x) - COMPILATION ERRORS in v2.x +session.SetTrace(tracer) // ❌ session.SetTrace undefined +session.SetConsistency(gocql.Quorum) // ❌ session.SetConsistency undefined +session.SetPageSize(1000) // ❌ session.SetPageSize undefined +session.SetPrefetch(0.25) // ❌ session.SetPrefetch undefined +``` + +**Compilation errors you'll see:** +```bash +./main.go:45:9: session.SetTrace undefined (type *gocql.Session has no method SetTrace) +./main.go:46:9: session.SetConsistency undefined (type *gocql.Session has no method SetConsistency) +./main.go:47:9: session.SetPageSize undefined (type *gocql.Session has no method SetPageSize) +./main.go:48:9: session.SetPrefetch undefined (type *gocql.Session has no method SetPrefetch) +``` + +**Migration options:** +```go +// NEW (v2.x) - Option 1: Set defaults via ClusterConfig +cluster := gocql.NewCluster("127.0.0.1") +cluster.Consistency = gocql.Quorum +cluster.PageSize = 1000 +cluster.NextPagePrefetch = 0.25 +cluster.Tracer = tracer + +// NEW (v2.x) - Option 2: Configure per Query/Batch +query := session.Query("SELECT ..."). + Trace(tracer). + Consistency(gocql.Quorum). + PageSize(1000). + Prefetch(0.25) + +batch := session.Batch(gocql.LoggedBatch). + Trace(tracer). + Consistency(gocql.Quorum) +``` + +#### Methods Moved from Query/Batch to Iter + +**Execution-specific methods moved from Query/Batch objects to Iter objects:** +```go +// OLD (v1.x) - COMPILATION ERRORS in v2.x +query := session.Query("SELECT * FROM users WHERE id = ?", userID) +iter := query.Iter() +// ... process results +attempts := query.Attempts() // ❌ query.Attempts undefined +latency := query.Latency() // ❌ query.Latency undefined +host := query.Host() // ❌ query.Host undefined +``` + +**Compilation errors you'll see:** +```bash +./main.go:52:12: query.Attempts undefined (type *gocql.Query has no method Attempts) +./main.go:53:11: query.Latency undefined (type *gocql.Query has no method Latency) +./main.go:54:8: query.Host undefined (type *gocql.Query has no method Host) +``` + +**Migration:** +```go +// NEW (v2.x) - Methods available on Iter +query := session.Query("SELECT * FROM users WHERE id = ?", userID) +iter := query.Iter() +// ... 
process results
+attempts := iter.Attempts() // ✅ Now available on Iter
+latency := iter.Latency()   // ✅ Now available on Iter
+host := iter.Host()         // ✅ Now available on Iter
+defer iter.Close()
+```
+
+**Why this change?** Methods moved to Iter because they represent execution-specific data that only exists after a query is executed, not properties of the query definition itself.
+
+#### Logging System Overhaul
+
+**Complete replacement of logging interface - COMPILATION ERRORS:**
+```go
+// OLD (v1.x) - StdLogger interface - NO LONGER EXISTS
+type StdLogger interface { // ❌ Interface removed
+	Print(v ...interface{}) // ❌ Interface removed
+	Printf(format string, v ...interface{}) // ❌ Interface removed
+	Println(v ...interface{}) // ❌ Interface removed
+}
+
+// Trying to use old interface causes compilation error
+cluster.Logger = myOldLogger // ❌ cannot use myOldLogger (type StdLogger) as StructuredLogger
+```
+
+**Compilation error you'll see:**
+```bash
+./main.go:67:16: cannot use myOldLogger (type StdLogger) as type StructuredLogger in assignment:
+	StdLogger does not implement StructuredLogger (missing Debug method)
+```
+
+**Migration - NEW StructuredLogger interface:**
+```go
+// NEW (v2.x) - StructuredLogger interface
+type StructuredLogger interface {
+	Debug(msg string, fields ...Field)
+	Info(msg string, fields ...Field)
+	Warn(msg string, fields ...Field)
+	Error(msg string, fields ...Field)
+}
+```
+
+**Migration options:**
+```go
+// Option 1: Use built-in default logger
+cluster.Logger = gocql.NewLogger(gocql.LogLevelInfo)
+
+// Option 2: Use provided adapters
+cluster.Logger = gocqlzap.NewZapLogger(zapLogger)          // For Zap
+cluster.Logger = gocqlzerolog.NewZerologLogger(zeroLogger) // For Zerolog
+
+// Option 3: Implement StructuredLogger interface
+type MyStructuredLogger struct{}
+func (l MyStructuredLogger) Debug(msg string, fields ...gocql.Field) { /* ... */ }
+func (l MyStructuredLogger) Info(msg string, fields ...gocql.Field) { /* ... */ }
+func (l MyStructuredLogger) Warn(msg string, fields ...gocql.Field) { /* ... */ }
+func (l MyStructuredLogger) Error(msg string, fields ...gocql.Field) { /* ... */ }
+
+cluster.Logger = MyStructuredLogger{}
+```
+
+For comprehensive StructuredLogger documentation, see [pkg.go.dev/github.com/apache/cassandra-gocql-driver/v2#hdr-Structured_Logging](https://pkg.go.dev/github.com/apache/cassandra-gocql-driver/v2#hdr-Structured_Logging).
+
+#### Advanced API Changes
+
+**⚠️ This section only applies to advanced users who implement custom interfaces ⚠️**
+
+*Most users can skip this section.
These changes only affect you if you've implemented custom `HostSelectionPolicy`, `RetryPolicy`, or other advanced driver interfaces.* + +##### HostInfo Method Visibility Changes + +**HostInfo method visibility changes - COMPILATION ERRORS:** + +Several HostInfo methods have been removed or made private: + +```go +// OLD (v1.x) - COMPILATION ERRORS in v2.x +host.SetConnectAddress(addr) // ❌ method undefined +host.SetHostID(id) // ❌ method undefined (became private setHostID) + +// Runtime behavior changes: +addr := host.ConnectAddress() // ⚠️ No longer panics on invalid address, driver validates before creating the object +``` + +**Compilation errors you'll see:** +```bash +./main.go:45:9: host.SetConnectAddress undefined (type *gocql.HostInfo has no method SetConnectAddress) +./main.go:46:9: host.SetHostID undefined (type *gocql.HostInfo has no method SetHostID) +``` + +**Migration:** +```go +// OLD (v1.x) - Setting host connection address +host.SetConnectAddress(net.ParseIP("192.168.1.100")) + +// NEW (v2.x) - Use AddressTranslator instead +cluster.AddressTranslator = gocql.AddressTranslatorFunc(func(addr net.IP, port int) (net.IP, int) { + // Translate addresses here + return net.ParseIP("192.168.1.100"), port +}) +``` + +##### HostSelectionPolicy Interface Changes + +**HostSelectionPolicy.Pick() method signature changed:** +```go +// OLD (v1.x) - COMPILATION ERROR in v2.x +type HostSelectionPolicy interface { + Pick(qry ExecutableQuery) NextHost // ❌ ExecutableQuery no longer exists +} +``` + +**Compilation error you'll see:** +```bash +./main.go:25:17: undefined: ExecutableQuery +``` + +**Migration:** +```go +// NEW (v2.x) - Use ExecutableStatement interface +type HostSelectionPolicy interface { + Pick(stmt ExecutableStatement) NextHost // ✅ New interface +} +``` + +##### ExecutableQuery Interface Deprecated + +**ExecutableQuery interface deprecated and replaced:** +```go +// OLD (v1.x) - DEPRECATED in v2.x +type ExecutableQuery interface { + // ... methods +} + +// NEW (v2.x) - Replacement interfaces +type ExecutableStatement interface { + GetRoutingKey() ([]byte, error) + Keyspace() string + Table() string + IsIdempotent() bool + GetHostID() string + Statement() Statement +} + +// implemented by Query and Batch +type Statement interface { + Iter() *Iter + IterContext(ctx context.Context) *Iter + Exec() error + ExecContext(ctx context.Context) error +} +``` + +**Migration for custom HostSelectionPolicy implementations:** +```go +// OLD (v1.x) +func (p *MyCustomPolicy) Pick(qry ExecutableQuery) NextHost { + routingKey, _ := qry.GetRoutingKey() + keyspace := qry.Keyspace() + // Access to internal query properties... + // ... +} + +// NEW (v2.x) - Core methods available on ExecutableStatement +func (p *MyCustomPolicy) Pick(stmt ExecutableStatement) NextHost { + routingKey, _ := stmt.GetRoutingKey() // ✅ Same method + keyspace := stmt.Keyspace() // ✅ Same method + + // For additional properties, type cast the underlying Statement + switch s := stmt.Statement().(type) { + case *Query: + // Access Query-specific properties (READ-ONLY) + consistency := s.GetConsistency() + pageSize := s.PageSize() + // ... other Query methods + case *Batch: + // Access Batch-specific properties (READ-ONLY) + batchType := s.Type + consistency := s.GetConsistency() + // ... other Batch methods + } + + // ⚠️ WARNING: Only READ from the statement - do NOT modify it! 
+ // Modifying the statement in HostSelectionPolicy will not affect the current request execution, + // but it WILL modify the original Query/Batch object that the user has a reference to. + // Since Query/Batch are not thread-safe, this can cause race conditions and unexpected behavior. + + // ... +} +``` + +**Impact:** If you have implemented custom `HostSelectionPolicy`, `RetryPolicy`, or other interfaces that accept `ExecutableQuery`, you'll need to update the parameter type to `ExecutableStatement`. The available methods remain mostly the same. + + + +#### TimeoutLimit Variable Removal + +**TimeoutLimit variable removed - COMPILATION ERROR:** + +The deprecated `TimeoutLimit` global variable has been removed: + +```go +// OLD (v1.x) - COMPILATION ERROR in v2.x +gocql.TimeoutLimit = 5 // ❌ undefined: gocql.TimeoutLimit +``` + +**Compilation error you'll see:** +```bash +./main.go:45:9: undefined: gocql.TimeoutLimit +``` + +**Behavior after fix:** +- **v1.x default**: `TimeoutLimit = 0` meant timeouts never closed connections +- **v2.x**: Behavior will match v1.x default - timeouts never close connections +- **Impact**: Only affects users who explicitly set `TimeoutLimit > 0` + +**Migration:** +```go +// OLD (v1.x) - Setting timeout limit (deprecated approach) +gocql.TimeoutLimit = 5 // Close connection after 5 timeouts + +// NEW (v2.x) - No direct replacement recommended +// Remove any code that sets gocql.TimeoutLimit +``` + +**Recommended Approach:** +We do not recommend a specific code-level migration path for `TimeoutLimit` because the old approach was fundamentally flawed. Instead, focus on proper operational practices: + +**Better Solution: Infrastructure Monitoring & Management** +- **Monitor Cassandra node health** using proper metrics (CPU, memory, disk I/O, GC pauses) +- **Shut down unhealthy nodes** rather than trying to work around them at the client level +- **Use proper alerting** on node performance metrics +- **Address root causes** of node problems rather than masking symptoms + +**Why TimeoutLimit was deprecated:** +In real-world scenarios, if a Cassandra node is unhealthy enough to cause timeouts, simply closing and reopening connections won't solve the underlying problem. The node will likely continue causing latency issues and other performance problems. The correct solution is to identify and fix unhealthy nodes at the infrastructure level. + +--- + +### Changes that might lead to runtime errors + +#### PasswordAuthenticator Behavior Change + +**PasswordAuthenticator now allows any server authenticator by default - POTENTIAL SECURITY ISSUE:** + +In v1.x, PasswordAuthenticator had a hardcoded list of approved authenticators and would reject connections to servers using other authenticators. 
+
+In v2.x, PasswordAuthenticator will authenticate with **any** authenticator provided by the server unless you explicitly restrict it:
+
+```go
+// v1.x behavior: Only allowed specific authenticators by default
+// v2.x behavior: Allows ANY authenticator by default
+
+// OLD (v1.x) - Automatic rejection of non-standard authenticators
+cluster.Authenticator = gocql.PasswordAuthenticator{
+	Username: "user",
+	Password: "password",
+	// Automatically rejected servers with non-standard authenticators
+}
+
+// NEW (v2.x) - To maintain v1.x security behavior:
+cluster.Authenticator = gocql.PasswordAuthenticator{
+	Username: "user",
+	Password: "password",
+	AllowedAuthenticators: []string{
+		"org.apache.cassandra.auth.PasswordAuthenticator",
+		// Add other allowed authenticators here
+	},
+}
+
+// NEW (v2.x) - To allow any authenticator (new default behavior):
+cluster.Authenticator = gocql.PasswordAuthenticator{
+	Username: "user",
+	Password: "password",
+	// AllowedAuthenticators: nil, // or empty slice allows any
+}
+```
+
+**Security Impact:**
+- **v1.x**: Connections to servers with non-standard authenticators were automatically rejected
+- **v2.x**: Same code will now successfully authenticate with any server authenticator
+- **Risk**: May connect to servers with weaker or unexpected authentication mechanisms
+
+**Migration:** If security is a concern, explicitly set `AllowedAuthenticators` to maintain the restrictive v1.x behavior.
+
+#### CQL to Go type mapping for inet columns changed in MapScan/SliceMap
+
+**inet columns now return `net.IP` instead of `string` in MapScan/SliceMap - RUNTIME PANIC:**
+
+In v1.x, `MapScan()` and `SliceMap()` returned inet columns as `string` values. In v2.x, they now return `net.IP` values, causing runtime panics for existing type assertions.
+
+```go
+// v1.x: inet columns returned as string
+result, _ := session.Query("SELECT inet_col FROM table").Iter().SliceMap()
+ip := result[0]["inet_col"].(string) // ✅ Worked in v1.x
+
+// v2.x: Same code causes runtime panic
+result, _ := session.Query("SELECT inet_col FROM table").Iter().SliceMap()
+ip := result[0]["inet_col"].(string) // ❌ PANIC: interface conversion: interface {} is net.IP, not string
+```
+
+**Runtime panic you'll see:**
+```bash
+panic: interface conversion: interface {} is net.IP, not string
+```
+
+**Migration:**
+```go
+// NEW (v2.x) - Use net.IP type assertion
+result, _ := session.Query("SELECT inet_col FROM table").Iter().SliceMap()
+ip := result[0]["inet_col"].(net.IP) // ✅ Correct type for v2.x
+
+// Convert to string if needed
+ipString := ip.String()
+
+// Or use type switching for compatibility during migration
+switch v := result[0]["inet_col"].(type) {
+case net.IP:
+	ipString = v.String() // v2.x behavior
+case string:
+	ipString = v // v1.x behavior (shouldn't happen in v2.x)
+}
+```
+
+**Impact:** This only affects code using `MapScan()` and `SliceMap()` with inet columns. Direct `Scan()` calls are not affected since they require explicit type specification.
+
+#### NULL Collections Now Return nil Instead of Empty Collections in MapScan/SliceMap
+
+**NULL collections (lists, sets, maps) now return `nil` instead of empty collections - POTENTIAL RUNTIME PANIC:**
+
+In v1.x, `MapScan()` and `SliceMap()` returned NULL collections as empty slices/maps. In v2.x, they now return `nil` slices/maps, which can cause panics in code that assumes non-nil collections.
+
+```go
+// v1.x: NULL collections returned as empty
+result, _ := session.Query("SELECT null_list_col FROM table").Iter().SliceMap()
+list := result[0]["null_list_col"].([]string)
+// list was []string{} (empty slice with len=0, cap=0)
+fmt.Println(list == nil)    // false
+fmt.Println(len(list) == 0) // true
+
+// v2.x: NULL collections now return nil
+result, _ := session.Query("SELECT null_list_col FROM table").Iter().SliceMap()
+list := result[0]["null_list_col"].([]string)
+// list is []string(nil) (nil slice)
+fmt.Println(list == nil)    // true
+fmt.Println(len(list) == 0) // true (len of nil slice is 0)
+```
+
+**Code that may cause runtime panics:**
+```go
+// ❌ BREAKS: Code checking for non-nil before processing
+if myList != nil && len(myList) > 0 { // Condition now fails for NULL collections
+	processItems(myList) // Never called for NULL collections
+}
+
+// ❌ PANICS: Writing to a nil map (v1.x returned an empty, writable map)
+myMap["key"] = "value" // Panics: assignment to entry in nil map
+
+// ❌ PANICS: Index assignment on a nil slice
+myList[index] = newValue // Panics on a nil slice (and on the empty slice v1.x returned)
+```
+
+**Migration - Use nil-safe patterns:**
+```go
+// ✅ SAFE: Check length only (works for both nil and empty slices)
+if len(myList) > 0 { // len of nil slice is 0
+	processItems(myList)
+}
+
+// ✅ SAFE: Use append (handles nil slices)
+myList = append(myList, item) // append works with nil slices
+
+// ✅ SAFE: Range over slices (safe with nil)
+for _, item := range myList { // range over nil slice is safe (no iterations)
+	processItem(item)
+}
+
+// ✅ SAFE: Initialize before direct assignment
+if myList == nil {
+	myList = make([]string, desiredLength)
+}
+myList[index] = newValue
+
+if myMap == nil {
+	myMap = make(map[string]string)
+}
+myMap["key"] = "value"
+
+// ✅ SAFE: copy handles nil sources (it copies zero elements)
+copy(destination, myList)
+```
+
+**Why this change was made:**
+- **Semantic correctness**: NULL ≠ empty. A NULL collection should be `nil`, not an empty collection
+- **Memory efficiency**: returning `nil` avoids allocating empty collections
+- **Consistency**: Direct `Scan()` calls already behaved this way
+
+**Impact:** This affects code using `MapScan()` and `SliceMap()` with collection columns (lists, sets, maps) that explicitly checks for `nil` or performs operations that don't handle `nil` slices/maps safely.
+
+---
+
+### Deprecation Notices
+
+The following features are deprecated but still functional in v2.x. **These should not be used in new code.** Plan to migrate away from these before v3.x:
+
+**Session Methods:**
+1. **`Session.ExecuteBatch()`** → Use `Batch.Exec()` fluent API
+2. **`Session.ExecuteBatchCAS()`** → Use `Batch.ExecCAS()` fluent API
+3. **`Session.MapExecuteBatchCAS()`** → Use `Batch.MapExecCAS()` fluent API
+4. **`Session.NewBatch()`** → Use `Session.Batch()` fluent API
+
+**Query Methods:**
+5. **`Query.SetConsistency()`** → Use `Query.Consistency()` fluent API instead
+6. **`Query.Context()`** → Pass context directly to `ExecContext()`, `IterContext()`, `ScanContext()`, `ScanCASContext()`, `MapScanContext()`, or `MapScanCASContext()`
+7. **`Query.WithContext()`** → Use context methods like `ExecContext()`, `IterContext()`, `ScanContext()`, `ScanCASContext()`, `MapScanContext()`, or `MapScanCASContext()` instead
+
+**Batch Methods:**
+8. **`Batch.SetConsistency()`** → Use `Batch.Consistency()` fluent API instead
+9. **`Batch.Context()`** → Pass context directly to `ExecContext()`, `ExecCASContext()`, or `MapExecCASContext()`
+10.
**`Batch.WithContext()`** → Use context methods like `ExecContext()`, `ExecCASContext()`, or `MapExecCASContext()` instead + +**Host Filters:** +11. **`DataCentreHostFilter()`** → Use `DataCenterHostFilter()` (spelling consistency) + +**Type Aliases:** +12. **`SerialConsistency`** → Use `Consistency` instead +13. **`ExecutableQuery`** → Use `Statement` for Query/Batch objects or `ExecutableStatement` in HostSelectionPolicy implementations + +**Migration Timeline:** +- v2.x: Deprecated APIs work but may emit warnings +- v3.x: Deprecated APIs will be removed + +#### Example Migrations + +**Batch Operations:** +```go +// DEPRECATED (still works in v2.x) +batch := session.NewBatch(gocql.LoggedBatch) +batch.Query("INSERT INTO users (id, name) VALUES (?, ?)", id, name) +err := session.ExecuteBatch(batch) + +// RECOMMENDED (future-proof) +err := session.Batch(gocql.LoggedBatch). + Query("INSERT INTO users (id, name) VALUES (?, ?)", id, name). + Exec() +``` + +**Context Handling:** +```go +// DEPRECATED (still works in v2.x) +query := session.Query("SELECT * FROM users WHERE id = ?", userID) +query = query.WithContext(ctx) +iter := query.Iter() + +// RECOMMENDED (future-proof) - use appropriate context method: +iter := session.Query("SELECT * FROM users WHERE id = ?", userID).IterContext(ctx) +// OR for single row scans: +var user User +err := session.Query("SELECT * FROM users WHERE id = ?", userID).ScanContext(ctx, &user.ID, &user.Name) +// OR for execution without results: +err := session.Query("INSERT INTO users (id, name) VALUES (?, ?)", userID, name).ExecContext(ctx) +``` + +**Consistency Setting:** +```go +// DEPRECATED (still works in v2.x) +query := session.Query("SELECT * FROM users") +query.SetConsistency(gocql.Quorum) + +batch := session.Batch(gocql.LoggedBatch) +batch.SetConsistency(gocql.Quorum) + +// RECOMMENDED (future-proof) +query := session.Query("SELECT * FROM users").Consistency(gocql.Quorum) +batch := session.Batch(gocql.LoggedBatch).Consistency(gocql.Quorum) +``` \ No newline at end of file diff --git a/doc.go b/doc.go index 620386d09..10103f705 100644 --- a/doc.go +++ b/doc.go @@ -25,9 +25,13 @@ // Package gocql implements a fast and robust Cassandra driver for the // Go programming language. // +// # Upgrading to a new major version +// +// For detailed migration instructions between major versions, see the [upgrade guide]. +// // # Connecting to the cluster // -// Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: +// Pass a list of initial node IP addresses to [NewCluster] to create a new cluster configuration: // // cluster := gocql.NewCluster("192.168.1.1", "192.168.1.2", "192.168.1.3") // @@ -40,7 +44,7 @@ // address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address // then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. // -// Then you can customize more options (see ClusterConfig): +// Then you can customize more options (see [ClusterConfig]): // // cluster.Keyspace = "example" // cluster.Consistency = gocql.Quorum @@ -50,10 +54,13 @@ // protocol version explicitly, as it's not defined which version will be used in certain situations (for example // during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). // +// Native protocol versions 3, 4, and 5 are supported. 
+// For features like per-query keyspace setting and timestamp override, use native protocol version 5. +// // The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. // If you use replace directive in go.mod, the driver will send information about the replacement module instead. // -// When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: +// When ready, create a session from the configuration. Don't forget to [Session.Close] the session once you are done with it: // // session, err := cluster.CreateSession() // if err != nil { @@ -61,6 +68,208 @@ // } // defer session.Close() // +// # Reconnection and Host Recovery +// +// The driver provides robust reconnection mechanisms to handle network failures and host outages. +// Two main configuration settings control reconnection behavior: +// +// - ClusterConfig.ReconnectionPolicy: Controls retry behavior for immediate connection failures, query-driven reconnection, and background recovery +// - ClusterConfig.ReconnectInterval: Controls background recovery of DOWN hosts +// +// [ReconnectionPolicy] controls retry behavior for immediate connection failures, query-driven reconnection, and background recovery. +// +// [ConstantReconnectionPolicy] provides predictable fixed intervals (Default): +// +// cluster.ReconnectionPolicy = &gocql.ConstantReconnectionPolicy{ +// MaxRetries: 3, // Maximum retry attempts +// Interval: 1 * time.Second, // Fixed interval between retries +// } +// +// [ExponentialReconnectionPolicy] provides gentler backoff with capped intervals: +// +// cluster.ReconnectionPolicy = &gocql.ExponentialReconnectionPolicy{ +// MaxRetries: 5, // 6 total attempts: 0+1+2+4+8+15 = 30s total +// InitialInterval: 1 * time.Second, // Initial retry interval +// MaxInterval: 15 * time.Second, // Maximum retry interval (prevents excessive delays) +// } +// +// Note: Each reconnection attempt sequence starts fresh from InitialInterval. +// This applies both to immediate connection failures and each ClusterConfig.ReconnectInterval cycle. +// For example, if ClusterConfig.ReconnectInterval=60s, every 60 seconds the background process +// starts a new sequence beginning at InitialInterval, not continuing from where +// the previous 60-second cycle ended. +// +// ClusterConfig.ReconnectInterval controls background recovery of DOWN hosts. When a host is marked DOWN, this process periodically +// attempts reconnection using the same ReconnectionPolicy settings: +// +// cluster.ReconnectInterval = 60 * time.Second // Check DOWN hosts every 60 seconds (default) +// +// Setting ClusterConfig.ReconnectInterval to 0 disables background reconnection. +// +// The reconnection process involves several components working together in a specific sequence: +// +// 1. Individual Connection Reconnection - Immediate retry attempts for failed connections within UP hosts +// 2. Host State Management - Marking hosts DOWN when all connections fail and retries are exhausted +// 3. Background Recovery - Periodic reconnection attempts for DOWN hosts via ReconnectInterval +// +// Individual connection reconnection occurs when connections fail within a host's pool, and the driver immediately attempts +// reconnection using ReconnectionPolicy. 
For hosts that remain UP (with working connections), failed individual connections +// are reconnected on a query-driven basis - every query execution triggers asynchronous reconnection attempts for missing +// connections. Queries proceed immediately using available connections while reconnection happens asynchronously in the +// background. There is no query latency impact from reconnection attempts. Multiple concurrent queries to the same host +// will not trigger parallel reconnection attempts - the driver uses a "filling" flag to ensure only one reconnection process runs per host. +// +// Host state management determines when a host is marked DOWN. Only when ALL connections to a host fail and ReconnectionPolicy +// retries are exhausted does the host get marked DOWN. DOWN hosts are excluded from query routing. Since DOWN hosts don't +// receive queries, they cannot benefit from query-driven reconnection. This is why the background ClusterConfig.ReconnectInterval process +// is essential for DOWN host recovery. +// +// Background recovery through ClusterConfig.ReconnectInterval periodically attempts to reconnect DOWN hosts using ReconnectionPolicy settings. +// Event-driven recovery also triggers immediate reconnection when Cassandra sends STATUS_CHANGE UP events. +// +// The complete recovery process follows these steps: +// +// 1. Connection fails → ReconnectionPolicy immediate retry attempts +// 2. Query-driven recovery → Each query to partially-failed hosts triggers reconnection attempts +// 3. Host marked DOWN → All connections failed and retries exhausted +// 4. Background recovery → ClusterConfig.ReconnectInterval process attempts reconnection using ReconnectionPolicy +// 5. Event recovery → Cassandra events can trigger immediate reconnection +// +// Here's a practical example showing how the settings work together: +// +// cluster.ReconnectionPolicy = &gocql.ExponentialReconnectionPolicy{ +// MaxRetries: 8, // 9 total attempts (0s, 1s, 2s, 4s, 8s, 16s, 30s, 30s, 30s) +// InitialInterval: 1 * time.Second, // Starts at 1 second +// MaxInterval: 30 * time.Second, // Caps exponential growth at 30 seconds +// } +// +// cluster.ReconnectInterval = 60 * time.Second // Background checks every 60 seconds +// +// Timeline Example: With this configuration, when a host loses ALL connections: +// +// T=0:00 - Host has 2 connections, both fail +// T=0:00 - Immediate reconnection attempt 1: 0s delay +// T=0:01 - Immediate reconnection attempt 2: 1s delay +// T=0:03 - Immediate reconnection attempt 3: 2s delay +// T=0:07 - Immediate reconnection attempt 4: 4s delay +// T=0:15 - Immediate reconnection attempt 5: 8s delay +// T=0:31 - Immediate reconnection attempt 6: 16s delay +// T=1:01 - Immediate reconnection attempt 7: 30s delay (capped by MaxInterval) +// T=1:31 - Immediate reconnection attempt 8: 30s delay +// T=2:01 - Immediate reconnection attempt 9: 30s delay +// T=2:31 - All immediate attempts failed, host marked DOWN +// +// T=3:31 - Background recovery attempt 1 starts (60s after DOWN) +// ReconnectionPolicy sequence: 0s, 1s, 2s, 4s, 8s, 16s, 30s, 30s, 30s +// +// T=4:31 - ClusterConfig.ReconnectInterval timer fires, tick buffered (timer channel capacity=1) +// T=5:31 - ClusterConfig.ReconnectInterval timer fires again, there is already a tick buffered so ignore +// T=5:32 - Background recovery attempt 1 completes (after 2:01), immediately reads buffered tick +// T=5:32 - Background recovery attempt 2 starts (buffered timer from T=5:31) +// T=6:32 - ClusterConfig.ReconnectInterval 
timer fires, tick buffered +// T=7:32 - ClusterConfig.ReconnectInterval timer fires again, there is already a tick buffered so ignore +// T=7:33 - Background recovery attempt 2 completes (after 2:01), immediately reads buffered tick +// T=7:33 - Background recovery attempt 3 starts (buffered timer from T=7:32) +// +// Timer Behavior and Predictable Timing: +// +// Note: [time.Ticker].C has buffer capacity=1, but Go drops ticks for "slow receivers." +// The reconnection process is a slow receiver (taking 2+ minutes vs 60s interval). +// First missed tick gets buffered, subsequent ticks are dropped. When reconnection +// completes, it immediately reads the buffered tick and starts the next attempt. +// This causes attempts to run back-to-back at the ReconnectionPolicy duration interval +// (121s) instead of the intended ClusterConfig.ReconnectInterval (60s), but timing remains predictable. +// +// To avoid this buffering/dropping behavior, ensure ClusterConfig.ReconnectInterval is larger than the +// total ReconnectionPolicy duration. You can achieve this by either: +// +// 1. Increasing ClusterConfig.ReconnectInterval (e.g., 150s > 121s sequence duration) +// 2. Reducing ReconnectionPolicy duration (e.g., 30s sequence < 60s ClusterConfig.ReconnectInterval) +// +// This ensures predictable timing with each recovery attempt starting exactly ClusterConfig.ReconnectInterval apart. +// Approach #2 provides faster recovery while maintaining predictable timing. +// +// Individual failed connections within UP hosts are reconnected asynchronously without affecting query performance. +// +// Best Practices and Configuration Guidelines: +// +// - ReconnectionPolicy: Use ConstantReconnectionPolicy for predictable behavior or ExponentialReconnectionPolicy +// for gentler recovery. Aggressive settings affect background reconnection frequency but don't impact query latency +// - ClusterConfig.ReconnectInterval: Set to 30-60 seconds for most cases. Shorter intervals provide faster recovery but more traffic +// - Timing Predictability: For predictable background recovery timing, ensure ClusterConfig.ReconnectInterval exceeds the total +// ReconnectionPolicy sequence duration. This prevents Go's ticker from buffering/dropping ticks due to "slow receiver" +// behavior. You can achieve this by either increasing ClusterConfig.ReconnectInterval or reducing ReconnectionPolicy duration +// (fewer retries/shorter intervals). The latter approach provides faster recovery while maintaining predictable timing +// - Monitoring: Enable logging to observe reconnection behavior and tune settings +// +// # Compression +// +// The driver supports Snappy and LZ4 compression of protocol frames. +// +// For Snappy compression (via [github.com/apache/cassandra-gocql-driver/v2/snappy] package): +// +// import "github.com/apache/cassandra-gocql-driver/v2/snappy" +// +// cluster.Compressor = &snappy.SnappyCompressor{} +// +// For LZ4 compression (via [github.com/apache/cassandra-gocql-driver/v2/lz4] package): +// +// import "github.com/apache/cassandra-gocql-driver/v2/lz4" +// +// cluster.Compressor = &lz4.LZ4Compressor{} +// +// Both compressors use efficient append-like semantics for optimal performance and memory usage. +// +// # Structured Logging +// +// The driver provides structured logging through the [StructuredLogger] interface. 
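+// Any type that implements the four leveled methods (Debug, Info, Warn, Error, each taking a
+// message and variadic [Field] values) can be assigned to ClusterConfig.Logger. A minimal sketch
+// that forwards messages to the standard library log package, ignoring the structured fields for
+// brevity:
+//
+//	type stdlibLogger struct{}
+//
+//	func (stdlibLogger) Debug(msg string, fields ...gocql.Field) { log.Println("DEBUG", msg) }
+//	func (stdlibLogger) Info(msg string, fields ...gocql.Field)  { log.Println("INFO", msg) }
+//	func (stdlibLogger) Warn(msg string, fields ...gocql.Field)  { log.Println("WARN", msg) }
+//	func (stdlibLogger) Error(msg string, fields ...gocql.Field) { log.Println("ERROR", msg) }
+//
+//	cluster.Logger = stdlibLogger{}
+//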
+// Built-in integrations are available for popular logging libraries: +// +// For Zap logger (via [github.com/apache/cassandra-gocql-driver/v2/gocqlzap] package): +// +// import "github.com/apache/cassandra-gocql-driver/v2/gocqlzap" +// +// zapLogger, _ := zap.NewProduction() +// cluster.Logger = gocqlzap.NewZapLogger(zapLogger) +// +// For Zerolog (via [github.com/apache/cassandra-gocql-driver/v2/gocqlzerolog] package): +// +// import "github.com/apache/cassandra-gocql-driver/v2/gocqlzerolog" +// +// zerologLogger := zerolog.New(os.Stdout).With().Timestamp().Logger() +// cluster.Logger = gocqlzerolog.NewZerologLogger(&zerologLogger) +// +// You can also use the built-in standard library logger: +// +// cluster.Logger = gocql.NewLogger(gocql.LogLevelInfo) +// +// # Native Protocol Version 5 Features +// +// Native protocol version 5 provides several advanced capabilities: +// +// Set keyspace for individual queries (useful for multi-tenant applications): +// +// err := session.Query("SELECT * FROM table").SetKeyspace("tenant1").Exec() +// +// Target queries to specific nodes (useful for virtual tables in Cassandra 4.0+): +// +// err := session.Query("SELECT * FROM system_views.settings"). +// SetHostID("host-uuid").Exec() +// +// Use current timestamp override for testing and consistency: +// +// err := session.Query("INSERT INTO table (id, data) VALUES (?, ?)"). +// WithNowInSeconds(specificTimestamp). +// Bind(id, data).Exec() +// +// These features are also available on batch operations: +// +// err := session.Batch(LoggedBatch). +// Query("INSERT INTO table (id, data) VALUES (?, ?)", id, data). +// SetKeyspace("tenant1"). +// WithNowInSeconds(specificTimestamp). +// Exec() +// // # Authentication // // CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and @@ -68,7 +277,7 @@ // // To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. // -// PasswordAuthenticator is provided to use for username/password authentication: +// [PasswordAuthenticator] is provided to use for username/password authentication: // // cluster := gocql.NewCluster("192.168.1.1", "192.168.1.2", "192.168.1.3") // cluster.Authenticator = gocql.PasswordAuthenticator{ @@ -95,12 +304,12 @@ // // It is possible to secure traffic between the client and server with TLS. // -// To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. +// To use TLS, set the ClusterConfig.SslOpts field. [SslOptions] embeds *[crypto/tls.Config] so you can set that directly. // There are also helpers to load keys/certificates from files. // -// Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification -// to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. -// SslOptions and Config.InsecureSkipVerify interact as follows: +// Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set SslOptions.EnableHostVerification +// to true if no Config is set. Most users should set SslOptions.Config to a *crypto/tls.Config. +// SslOptions and crypto/tls.Config.InsecureSkipVerify interact as follows: // // Config.InsecureSkipVerify | EnableHostVerification | Result // Config is nil | false | do not verify host @@ -124,7 +333,7 @@ // // # Data-center awareness and query routing // -// To route queries to local DC first, use DCAwareRoundRobinPolicy. 
For example, if the datacenter you
+// want to primarily connect is called dc1 (as configured in the database):
+//
+// cluster := gocql.NewCluster("192.168.1.1", "192.168.1.2", "192.168.1.3")
+//
@@ -135,7 +344,7 @@
// cluster := gocql.NewCluster("192.168.1.1", "192.168.1.2", "192.168.1.3")
// cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(gocql.DCAwareRoundRobinPolicy("dc1"))
//
-// Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback.
+// Note that [TokenAwareHostPolicy] can take options such as [ShuffleReplicas] and [NonLocalReplicasFallback].
//
// We recommend running with a token aware host policy in production for maximum performance.
//
@@ -150,7 +359,7 @@
//
// # Rack-level awareness
//
-// The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack.
+// The DCAwareRoundRobinPolicy can be replaced with [RackAwareRoundRobinPolicy], which takes two parameters, datacenter and rack.
//
// Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three
// (the local rack, the rest of the local datacenter, and everything else).
@@ -159,20 +368,20 @@
//
// # Executing queries
//
-// Create queries with Session.Query. Query values must not be reused between different executions and must not be
+// Create queries with [Session.Query]. Query values must not be reused between different executions and must not be
// modified after starting execution of the query.
//
-// To execute a query without reading results, use Query.Exec:
+// To execute a query without reading results, use [Query.Exec]:
//
//	err := session.Query(`INSERT INTO tweet (timeline, id, text) VALUES (?, ?, ?)`,
//		"me", gocql.TimeUUID(), "hello world").WithContext(ctx).Exec()
//
-// Single row can be read by calling Query.Scan:
+// Single row can be read by calling [Query.Scan]:
//
//	err := session.Query(`SELECT id, text FROM tweet WHERE timeline = ? LIMIT 1`,
//		"me").WithContext(ctx).Consistency(gocql.One).Scan(&id, &text)
//
-// Multiple rows can be read using Iter.Scanner:
+// Multiple rows can be read using [Iter.Scanner]:
//
//	scanner := session.Query(`SELECT id, text FROM tweet WHERE timeline = ?`,
//		"me").WithContext(ctx).Iter().Scanner()
@@ -194,20 +403,44 @@
//
// See Example for complete example.
//
+// # Vector types (Cassandra 5.0+)
+//
+// The driver supports Cassandra 5.0 vector types, enabling powerful vector search capabilities:
+//
+//	// Create a table with vector column
+//	err := session.Query(`CREATE TABLE vectors (
+//		id int PRIMARY KEY,
+//		embedding vector<float, 128>
+//	)`).Exec()
+//
+//	// Insert vector data
+//	embedding := make([]float32, 128)
+//	// ... populate embedding values
+//	err = session.Query("INSERT INTO vectors (id, embedding) VALUES (?, ?)",
+//		1, embedding).Exec()
+//
+//	// Query vector data
+//	var retrievedEmbedding []float32
+//	err = session.Query("SELECT embedding FROM vectors WHERE id = ?", 1).
+//		Scan(&retrievedEmbedding)
+//
+// Vector types support various element types including basic types, collections, and user-defined types.
+// Vector search requires Cassandra 5.0 or later.
+//
// # Prepared statements
//
// The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache
// of prepared statements.
// CQL protocol does not support preparing other query types. // -// When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. +// When using native protocol >= 4, it is possible to use [UnsetValue] as the bound value of a column. // This will cause the database to ignore writing the column. // The main advantage is the ability to keep the same prepared statement even when you don't // want to update some fields, where before you needed to make another prepared statement. // // # Executing multiple queries concurrently // -// Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them +// [Session] is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them // from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries // are executed asynchronously at the protocol level. // @@ -272,29 +505,29 @@ // # Paging // // The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, -// Query.PageSize, and Query.Prefetch. +// [Query.PageSize], and [Query.Prefetch]. // // It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). // Manual paging is useful if you want to store the page state externally, for example in a URL to allow users // browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since // it contains data from primary keys. // -// Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that +// Paging state is specific to the native protocol version and the exact query used. It is meant as opaque state that // should not be modified. If you send paging state from different query or protocol version, then the behaviour // is not defined (you might get unexpected results or an error from the server). For example, do not send paging state // returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, -// paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). +// paging state between Cassandra 2.2 and 3.0 is incompatible (see [CASSANDRA-10880]). // // The driver does not check whether the paging state is from the same protocol version/statement. // You might want to validate yourself as this could be a problem if you store paging state externally. // For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. // // Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by -// Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned +// [Iter.PageState] to Query.PageState of a subsequent query to get the next page. If the length of slice returned // by Iter.PageState is zero, there are no more pages available (or an error occurred). // -// Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. -// While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors +// Using too low values of ClusterConfig.PageSize will negatively affect performance, a value below 100 is probably too low. 
+// While Cassandra returns exactly ClusterConfig.PageSize items (except for last page) in a page currently, the protocol authors // explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't // rely on the page having the exact count of items. // @@ -303,20 +536,20 @@ // # Dynamic list of columns // // There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied -// by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. +// by the user. [Iter.Columns], Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. // // See Example_dynamicColumns. // // # Batches // // The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. -// Use Session.Batch to create a new batch and then fill-in details of individual queries. -// Then execute the batch with Batch.Exec. +// Use [Session.Batch] to create a new batch and then fill-in details of individual queries. +// Then execute the batch with [Batch.Exec]. // // Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have // overhead to ensure this property. // Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. -// Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. +// Updates of counters are handled specially by Cassandra so batches of counter updates have to use [CounterBatch] type. // A counter batch can only contain statements to update counters. // // For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should @@ -326,7 +559,7 @@ // With single-partition batches you can send the batch directly to the node for the partition without incurring the // additional network hop. // -// It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. +// It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to [Query.Exec]. // There are differences how those are executed. // BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. // Batch.Exec prepares individual statements in the batch. @@ -334,16 +567,121 @@ // // See Example_batch for an example. // +// The [Batch] API provides a fluent interface for building and executing batch operations: +// +// // Create and execute a batch using fluent API +// err := session.Batch(LoggedBatch). +// Query("INSERT INTO table1 (id, name) VALUES (?, ?)", id1, name1). +// Query("INSERT INTO table2 (id, value) VALUES (?, ?)", id2, value2). +// Exec() +// +// // Lightweight transactions with batches +// applied, iter, err := session.Batch(LoggedBatch). +// Query("INSERT INTO users (id, name) VALUES (?, ?) IF NOT EXISTS", id, name). +// ExecCAS() +// if err != nil { +// // handle error +// } +// if !applied { +// // handle conditional failure +// } +// // # Lightweight transactions // -// Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an +// [Query.ScanCAS] or [Query.MapScanCAS] can be used to execute a single-statement lightweight transaction (an // INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. 
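+//
+// For illustration, a minimal [Query.ScanCAS] sketch (assumes a users table with id and name
+// columns; when the insert is not applied, the existing row is scanned into the destinations):
+//
+//	var existingID gocql.UUID
+//	var existingName string
+//	applied, err := session.Query(`INSERT INTO users (id, name) VALUES (?, ?) IF NOT EXISTS`,
+//		id, name).ScanCAS(&existingID, &existingName)
+//	if err != nil {
+//		// handle error
+//	}
+//	if !applied {
+//		// the row already existed; existingID and existingName hold its current values
+//	}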
// // Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional -// statement. All the conditions must return true for the batch to be applied. You can use Batch.ExecCAS and -// Batch.MapExecCAS when executing the batch to learn about the result of the LWT. See example for +// statement. All the conditions must return true for the batch to be applied. You can use [Batch.ExecCAS] and +// [Batch.MapExecCAS] when executing the batch to learn about the result of the LWT. See example for // Batch.MapExecCAS. // +// # SERIAL Consistency for Reads +// +// The driver supports SERIAL and LOCAL_SERIAL consistency levels on SELECT statements. +// These special consistency levels are designed for reading data that may have been written using +// lightweight transactions (LWT) with IF conditions, providing linearizable consistency guarantees. +// +// When to use SERIAL consistency levels: +// +// Use SERIAL or LOCAL_SERIAL consistency when you need to: +// - Read the most recent committed value after lightweight transactions +// - Ensure linearizable consistency (stronger than eventual consistency) +// - Read data that might have uncommitted lightweight transactions in progress +// +// Important considerations: +// +// - SERIAL reads have higher latency and resource usage than normal reads +// - Only use when you specifically need linearizable consistency +// - If a SERIAL read finds an uncommitted transaction, it will commit that transaction +// - Most applications should use regular consistency levels (ONE, QUORUM, etc.) +// +// # Immutable Execution +// +// [Query] and [Batch] objects follow an immutable execution model that enables safe reuse and concurrent +// execution without object mutation. +// +// Query and Batch Object Reusability: +// +// Query and Batch objects remain unchanged during execution, allowing for safe reuse and concurrent execution: +// +// // Create a query once +// query := session.Query("SELECT * FROM users WHERE id = ?", userID) +// +// // Safe to execute multiple times +// iter1 := query.Iter() +// defer iter1.Close() +// +// iter2 := query.Iter() // Same query, separate execution +// defer iter2.Close() +// +// // Safe to use from multiple goroutines +// go func() { +// iter := query.Iter() +// defer iter.Close() +// // ... process results +// }() +// +// The same applies to Batch objects: +// +// // Create batch once using fluent API +// batch := session.Batch(LoggedBatch). +// Query("INSERT INTO table1 (id, name) VALUES (?, ?)", id1, name1). +// Query("INSERT INTO table2 (id, value) VALUES (?, ?)", id2, value2) +// +// // Safe to execute multiple times +// err1 := batch.Exec() +// err2 := batch.Exec() // Same batch, separate execution +// +// Execution Metrics and Metadata: +// +// Execution-specific information such as metrics, attempts, and latency are available through the [Iter] object +// returned by execution. 
+//
+//	query := session.Query("SELECT * FROM users WHERE id = ?", userID)
+//	iter := query.Iter()
+//	defer iter.Close()
+//
+//	// Access execution metrics through Iter
+//	attempts := iter.Attempts() // Number of times this execution was attempted
+//	latency := iter.Latency()   // Average latency per attempt in nanoseconds
+//	keyspace := iter.Keyspace() // Keyspace the query was executed against
+//	table := iter.Table()       // Table the query was executed against (if determinable)
+//	host := iter.Host()         // Host that executed the query
+//
+// For batches, execution methods that return an Iter (like ExecCAS) provide the same metrics:
+//
+//	// Execute a CAS operation and get metrics through Iter using the fluent API
+//	applied, iter, err := session.Batch(LoggedBatch).
+//		Query("INSERT INTO users (id, name) VALUES (?, ?) IF NOT EXISTS", id, name).
+//		ExecCAS()
+//	defer iter.Close()
+//
+//	if err != nil {
+//		log.Printf("Batch failed after %d attempts with average latency %d ns",
+//			iter.Attempts(), iter.Latency())
+//	}
+//
 // # Retries and speculative execution
 //
 // Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed
@@ -360,14 +698,102 @@
 //
 // # User-defined types
 //
-// UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing
-// UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces).
+// Cassandra User-Defined Types (UDTs) are composite data types that group related fields together.
+// GoCQL provides several ways to work with UDTs in Go, from simple struct mapping to advanced custom marshaling.
 //
-// For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field:
+// Basic UDT Usage with Structs:
+//
+// The simplest way to work with UDTs is using Go structs with `cql` tags:
+//
+//	// Cassandra UDT definition:
+//	// CREATE TYPE address (street text, city text, zip_code int);
+//
+//	type Address struct {
+//		Street  string `cql:"street"`
+//		City    string `cql:"city"`
+//		ZipCode int    `cql:"zip_code"`
+//	}
+//
+//	// Usage in queries
+//	addr := Address{Street: "123 Main St", City: "Anytown", ZipCode: 12345}
+//	err := session.Query("INSERT INTO users (id, address) VALUES (?, ?)",
+//		userID, addr).Exec()
+//
+//	// Reading UDTs
+//	var readAddr Address
+//	err = session.Query("SELECT address FROM users WHERE id = ?",
+//		userID).Scan(&readAddr)
+//
+// Field Mapping:
+//
+// GoCQL maps struct fields to UDT fields using two strategies:
+//
+//  1. CQL tags: Use `cql:"field_name"` to explicitly map fields
+//  2. Name matching: If no tag is present, field names must match exactly (case-sensitive)
 //
 //	type MyUDT struct {
-//		FieldA int32  `cql:"a"`
-//		FieldB string `cql:"b"`
+//		FieldA int32  `cql:"field_a"` // Maps to UDT field "field_a"
+//		FieldB string `cql:"field_b"` // Maps to UDT field "field_b"
+//		FieldC string                 // Maps to UDT field "FieldC" (exact name match)
+//	}
+//
+// Working with Maps:
+//
+// UDTs can also be marshaled to/from `map[string]interface{}`:
+//
+//	// Read a UDT into a map
+//	var udtMap map[string]interface{}
+//	err := session.Query("SELECT address FROM users WHERE id = ?",
+//		userID).Scan(&udtMap)
+//
+//	// Access fields
+//	street := udtMap["street"].(string)
+//	zipCode := udtMap["zip_code"].(int)
+//
+// Advanced Custom Marshaling:
+//
+// For complex scenarios, implement the UDTMarshaler and UDTUnmarshaler interfaces:
+//
+//	type CustomUDT struct {
+//		fieldA string
+//		fieldB int32
+//	}
+//
+//	// UDTMarshaler for writing to Cassandra
+//	func (c CustomUDT) MarshalUDT(name string, info TypeInfo) ([]byte, error) {
+//		switch name {
+//		case "field_a":
+//			return Marshal(info, c.fieldA)
+//		case "field_b":
+//			return Marshal(info, c.fieldB)
+//		default:
+//			return nil, nil // Unknown fields are set to null
+//		}
+//	}
+//
+//	// UDTUnmarshaler for reading from Cassandra
+//	func (c *CustomUDT) UnmarshalUDT(name string, info TypeInfo, data []byte) error {
+//		switch name {
+//		case "field_a":
+//			return Unmarshal(info, data, &c.fieldA)
+//		case "field_b":
+//			return Unmarshal(info, data, &c.fieldB)
+//		default:
+//			return nil // Ignore unknown fields for forward compatibility
+//		}
+//	}
+//
+// Nested UDTs and Collections:
+//
+// UDTs can contain other UDTs and collection types:
+//
+//	// Cassandra definitions:
+//	// CREATE TYPE address (street text, city text);
+//	// CREATE TYPE person (name text, addresses list<frozen<address>>);
+//
+//	type Person struct {
+//		Name      string    `cql:"name"`
+//		Addresses []Address `cql:"addresses"`
 //	}
 //
 // See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler.
@@ -376,15 +802,18 @@
 //
 // It is possible to provide observer implementations that could be used to gather metrics:
 //
-//   - QueryObserver for monitoring individual queries.
-//   - BatchObserver for monitoring batch queries.
-//   - ConnectObserver for monitoring new connections from the driver to the database.
-//   - FrameHeaderObserver for monitoring individual protocol frames.
+//   - [QueryObserver] for monitoring individual queries.
+//   - [BatchObserver] for monitoring batch queries.
+//   - [ConnectObserver] for monitoring new connections from the driver to the database.
+//   - [FrameHeaderObserver] for monitoring individual protocol frames.
 //
 // CQL protocol also supports tracing of queries. When enabled, the database will write information about
-// internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive
+// internal events that happened during execution of the query. You can use [Query.Trace] to request tracing and receive
 // the session ID that the database used to store the trace information in system_traces.sessions and
-// system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer.
+// system_traces.events tables. [NewTraceWriter] returns an implementation of [Tracer] that writes the events to a writer.
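+//
+// For example, a minimal tracing sketch (the query and output destination are illustrative):
+//
+//	var buf bytes.Buffer
+//	tracer := NewTraceWriter(session, &buf)
+//	if err := session.Query("SELECT * FROM users").Trace(tracer).Exec(); err != nil {
+//		// handle error
+//	}
+//	fmt.Print(buf.String()) // trace events recorded by the database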
 // Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead,
 // so this feature should not be used on production systems with very high load unless you know what you are doing.
+//
+// [upgrade guide]: https://github.com/apache/cassandra-gocql-driver/blob/trunk/UPGRADE_GUIDE.md
+// [CASSANDRA-10880]: https://issues.apache.org/jira/browse/CASSANDRA-10880
 package gocql // import "github.com/apache/cassandra-gocql-driver/v2"
diff --git a/gocqlzap/doc.go b/gocqlzap/doc.go
new file mode 100644
index 000000000..f526d5472
--- /dev/null
+++ b/gocqlzap/doc.go
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Package gocqlzap provides Zap logger integration for the gocql Cassandra driver.
+//
+// # Overview
+//
+// This package integrates the popular Zap structured logging library with gocql,
+// allowing you to use Zap's high-performance logging features for database operations.
+// It implements the gocql.StructuredLogger interface and converts gocql log fields
+// to Zap fields automatically.
+//
+// # Basic Usage
+//
+// To use Zap logging with gocql, create a Zap logger and wrap it with NewZapLogger:
+//
+//	import (
+//		"go.uber.org/zap"
+//		"github.com/apache/cassandra-gocql-driver/v2"
+//		"github.com/apache/cassandra-gocql-driver/v2/gocqlzap"
+//	)
+//
+//	zapLogger, _ := zap.NewProduction()
+//	defer zapLogger.Sync()
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Logger = gocqlzap.NewZapLogger(zapLogger)
+//
+//	session, err := cluster.CreateSession()
+//	if err != nil {
+//		panic(err)
+//	}
+//	defer session.Close()
+//
+// # Named vs Unnamed Loggers
+//
+// The package provides two functions for creating logger instances:
+//
+//   - NewZapLogger: Creates a logger with a default name "gocql" using Zap's Named() method
+//   - NewUnnamedZapLogger: Creates a logger without setting a name, allowing you to control naming
+//
+// Example with named logger:
+//
+//	// This will add a "logger" field with value "gocql" to all log entries
+//	cluster.Logger = gocqlzap.NewZapLogger(zapLogger)
+//
+// Example with unnamed logger (custom naming):
+//
+//	// You control the logger name
+//	customLogger := zapLogger.Named("my-cassandra-client")
+//	cluster.Logger = gocqlzap.NewUnnamedZapLogger(customLogger)
+//
+// # Field Type Conversion
+//
+// The package automatically converts gocql log fields to appropriate Zap field types:
+//
+//   - Boolean fields → zap.Bool
+//   - Integer fields → zap.Int64
+//   - String fields → zap.String
+//   - Other types → zap.Any
+//
+// # Log Levels
+//
+// The gocql log levels are mapped to Zap log levels as follows:
+//
+//   - gocql Error → zap.Error
+//   - gocql Warning → zap.Warn
+//   - gocql Info → zap.Info
+//   - gocql Debug → zap.Debug
+//
+// # Configuration Examples
+//
+// # Recommended: Use Built-in Configurations
+//
+// For most use cases, use Zap's built-in configurations, which provide sensible defaults:
+//
+//	// For production
+//	zapLogger, _ := zap.NewProduction()
+//	defer zapLogger.Sync()
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Logger = gocqlzap.NewZapLogger(zapLogger)
+//
+//	// For development (includes caller info, console encoding)
+//	zapLogger, _ := zap.NewDevelopment()
+//	defer zapLogger.Sync()
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Logger = gocqlzap.NewZapLogger(zapLogger)
+//
+// # Custom Configuration
+//
+// For advanced configuration options, refer to the official Zap documentation:
+// https://pkg.go.dev/go.uber.org/zap
+//
+// Once you have configured your Zap logger, simply wrap it with gocqlzap:
+//
+//	// ... configure your custom Zap logger ...
+//	cluster.Logger = gocqlzap.NewZapLogger(zapLogger)
+//
+// # Performance Considerations
+//
+// This integration is designed to be high-performance:
+//
+//   - Uses Zap's WithLazy() for efficient field construction
+//   - Minimizes allocations by reusing field conversion logic
+//   - Leverages Zap's optimized structured logging capabilities
+//
+// # Thread Safety
+//
+// The logger implementation is thread-safe and can be used concurrently
+// across multiple goroutines, as guaranteed by the underlying Zap logger.
+package gocqlzap // import "github.com/apache/cassandra-gocql-driver/v2/gocqlzap"
diff --git a/gocqlzerolog/doc.go b/gocqlzerolog/doc.go
new file mode 100644
index 000000000..fa0841718
--- /dev/null
+++ b/gocqlzerolog/doc.go
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Package gocqlzerolog provides Zerolog logger integration for the gocql Cassandra driver.
+//
+// # Overview
+//
+// This package integrates the popular Zerolog structured logging library with gocql,
+// allowing you to use Zerolog's zero-allocation logging features for database operations.
+// It implements the gocql.StructuredLogger interface and converts gocql log fields
+// to Zerolog fields automatically.
+//
+// # Basic Usage
+//
+// To use Zerolog logging with gocql, create a Zerolog logger and wrap it with NewZerologLogger:
+//
+//	import (
+//		"os"
+//		"github.com/rs/zerolog"
+//		"github.com/apache/cassandra-gocql-driver/v2"
+//		"github.com/apache/cassandra-gocql-driver/v2/gocqlzerolog"
+//	)
+//
+//	zerologLogger := zerolog.New(os.Stdout).With().Timestamp().Logger()
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Logger = gocqlzerolog.NewZerologLogger(zerologLogger)
+//
+//	session, err := cluster.CreateSession()
+//	if err != nil {
+//		panic(err)
+//	}
+//	defer session.Close()
+//
+// # Named vs Unnamed Loggers
+//
+// The package provides two functions for creating logger instances:
+//
+//   - NewZerologLogger: Creates a logger with a global context containing a "logger" field set to "gocql"
+//   - NewUnnamedZerologLogger: Creates a logger without modifying the context, allowing you to control naming
+//
+// Example with named logger:
+//
+//	// This will add a "logger": "gocql" field to all log entries
+//	cluster.Logger = gocqlzerolog.NewZerologLogger(zerologLogger)
+//
+// Example with unnamed logger (custom naming):
+//
+//	// You control the logger context
+//	customLogger := zerologLogger.With().Str("component", "cassandra-client").Logger()
+//	cluster.Logger = gocqlzerolog.NewUnnamedZerologLogger(customLogger)
+//
+// # Field Type Conversion
+//
+// The package automatically converts gocql log fields to appropriate Zerolog field types:
+//
+//   - Boolean fields → zerolog.Event.Bool
+//   - Integer fields → zerolog.Event.Int64
+//   - String fields → zerolog.Event.Str
+//   - Other types → zerolog.Event.Any
+//
+// # Log Levels
+//
+// The gocql log levels are mapped to Zerolog log levels as follows:
+//
+//   - gocql Error → zerolog.Logger.Error()
+//   - gocql Warning → zerolog.Logger.Warn()
+//   - gocql Info → zerolog.Logger.Info()
+//   - gocql Debug → zerolog.Logger.Debug()
+//
+// # Configuration Examples
+//
+// # Recommended: Simple Setup
+//
+// For most use cases, create a basic zerolog logger:
+//
+//	import (
+//		"os"
+//		"github.com/rs/zerolog"
+//		"github.com/apache/cassandra-gocql-driver/v2"
+//		"github.com/apache/cassandra-gocql-driver/v2/gocqlzerolog"
+//	)
+//
+//	// Basic structured logging (JSON output)
+//	zerologLogger := zerolog.New(os.Stdout).With().Timestamp().Logger()
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Logger = gocqlzerolog.NewZerologLogger(zerologLogger)
+//
+//	// Human-readable console output for development
+//	zerologLogger := zerolog.New(zerolog.ConsoleWriter{Out: os.Stdout}).With().Timestamp().Logger()
+//	cluster.Logger = gocqlzerolog.NewZerologLogger(zerologLogger)
+//
+// # Advanced Configuration
+//
+// For advanced zerolog configuration options (sampling, custom outputs, global settings, etc.),
+// refer to the official Zerolog documentation: https://github.com/rs/zerolog
+//
+// Once you have configured your Zerolog logger, simply wrap it with gocqlzerolog:
+//
+//	// ... configure your custom Zerolog logger ...
+//	cluster.Logger = gocqlzerolog.NewZerologLogger(zerologLogger)
+//
+// # Performance Considerations
+//
+// This integration is designed to be high-performance and zero-allocation:
+//
+//   - Uses Zerolog's zero-allocation logging capabilities
+//   - Minimizes memory allocations through efficient field conversion
+//   - Leverages Zerolog's optimized structured logging
+//
+// # Thread Safety
+//
+// The logger implementation is thread-safe and can be used concurrently
+// across multiple goroutines, as guaranteed by the underlying Zerolog logger.
+package gocqlzerolog // import "github.com/apache/cassandra-gocql-driver/v2/gocqlzerolog"
diff --git a/hostpool/doc.go b/hostpool/doc.go
new file mode 100644
index 000000000..d3945a102
--- /dev/null
+++ b/hostpool/doc.go
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Package hostpool provides host selection policies for gocql that integrate
+// with the go-hostpool library for intelligent host pooling and load balancing.
+//
+// # Overview
+//
+// This package allows gocql to use go-hostpool's intelligent host selection
+// algorithms, including round-robin and epsilon greedy policies that can
+// automatically avoid problematic hosts and adapt to host performance.
+//
+// # Basic Usage
+//
+// To use host pool policies with gocql, create a host pool and set it as your
+// cluster's host selection policy. Note that the go-hostpool package is also
+// named hostpool, so it needs an import alias alongside this package:
+//
+//	import (
+//		gohostpool "github.com/hailocab/go-hostpool"
+//		"github.com/apache/cassandra-gocql-driver/v2"
+//		"github.com/apache/cassandra-gocql-driver/v2/hostpool"
+//	)
+//
+//	// Create an epsilon greedy pool for adaptive load balancing
+//	pool := gohostpool.NewEpsilonGreedy(
+//		nil, // Host list populated automatically by gocql
+//		0,   // Use default 5-minute decay duration
+//		&gohostpool.LinearEpsilonValueCalculator{}, // Example calculator
+//	)
+//
+//	cluster := gocql.NewCluster("127.0.0.1", "127.0.0.2", "127.0.0.3")
+//	cluster.PoolConfig.HostSelectionPolicy = hostpool.HostPoolHostPolicy(pool)
+//
+//	session, err := cluster.CreateSession()
+//	if err != nil {
+//		panic(err)
+//	}
+//	defer session.Close()
+//
+// # Host Pool Types
+//
+// # Simple Round Robin
+//
+// Basic round-robin selection suitable for testing and simple deployments:
+//
+//	pool := gohostpool.New(nil) // Hosts populated by gocql
+//	cluster.PoolConfig.HostSelectionPolicy = hostpool.HostPoolHostPolicy(pool)
+//
+// # Epsilon Greedy
+//
+// Adaptive selection that learns host performance and routes traffic accordingly:
+//
+//	// Example using LinearEpsilonValueCalculator
+//	pool := gohostpool.NewEpsilonGreedy(nil, 0, &gohostpool.LinearEpsilonValueCalculator{})
+//	cluster.PoolConfig.HostSelectionPolicy = hostpool.HostPoolHostPolicy(pool)
+//
+//	// Other epsilon value calculators are also available:
+//	// - LogEpsilonValueCalculator: Uses logarithmic scaling
+//	// - PolynomialEpsilonValueCalculator: Uses polynomial scaling
+//
+// The epsilon greedy algorithm automatically:
+//   - Routes more traffic to faster-responding hosts
+//   - Reduces load on slower or problematic hosts
+//   - Adapts to changing host performance over time
+//   - Provides automatic failure avoidance
+//
+// # Integration Details
+//
+// The hostpool policy integrates seamlessly with gocql's host management:
+//
+//   - Host list is automatically populated and updated by gocql
+//   - Host failures are automatically reported to the pool
+//   - The pool tracks response times and host performance
+//   - Works with gocql's reconnection and discovery mechanisms
+//
+// # Configuration Options
+//
+// For epsilon greedy pools, you can customize:
+//
+//   - Decay duration: How long to average response times over (default 5 minutes)
+//   - Value calculator: Algorithm for scoring hosts based on performance
+//
+// Choose the epsilon value calculator that best fits your performance characteristics
+// and load balancing requirements. See the go-hostpool documentation for detailed
+// configuration options and calculator behavior.
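+//
+// For example, a sketch of an epsilon greedy pool with a custom decay duration and the
+// logarithmic value calculator (the values are illustrative):
+//
+//	pool := gohostpool.NewEpsilonGreedy(
+//		nil,            // Host list populated by gocql
+//		10*time.Minute, // Custom decay duration
+//		&gohostpool.LogEpsilonValueCalculator{},
+//	)
+//	cluster.PoolConfig.HostSelectionPolicy = hostpool.HostPoolHostPolicy(pool)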
+package hostpool
diff --git a/lz4/doc.go b/lz4/doc.go
new file mode 100644
index 000000000..466e89595
--- /dev/null
+++ b/lz4/doc.go
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+/*
+ * Content before git sha 34fdeebefcbf183ed7f916f931aa0586fdaa1b40
+ * Copyright (c) 2016, The Gocql authors,
+ * provided under the BSD-3-Clause License.
+ * See the NOTICE file distributed with this work for additional information.
+ */
+
+// Package lz4 provides LZ4 compression for the Cassandra Native Protocol.
+//
+// LZ4 compresses Native Protocol frame payloads as defined in the Cassandra
+// Native Protocol specification. The protocol supports both compressed and
+// uncompressed frame formats, with compression applied to frame payloads
+// containing CQL envelopes.
+//
+// # Basic Usage
+//
+// To enable LZ4 compression:
+//
+//	import (
+//		"github.com/apache/cassandra-gocql-driver/v2"
+//		"github.com/apache/cassandra-gocql-driver/v2/lz4"
+//	)
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Compressor = &lz4.LZ4Compressor{}
+//
+// # Native Protocol Compression
+//
+// According to the Cassandra Native Protocol specification, compression operates
+// on frame payloads containing streams of CQL envelopes. Each frame payload is
+// compressed independently, with no compression context shared between frames.
+//
+// # Protocol and Cassandra Version Support
+//
+// LZ4 compression is available on every Native Protocol version that supports
+// compression, along with the corresponding Cassandra versions:
+//
+//   - Protocol v2 (Cassandra 2.0.x): LZ4 and Snappy supported
+//   - Protocol v3 (Cassandra 2.1.x): LZ4 and Snappy supported
+//   - Protocol v4 (Cassandra 2.2.x, 3.0.x, 3.x): LZ4 and Snappy supported
+//   - Protocol v5 (Cassandra 4.0+): Only LZ4 supported (Snappy removed)
+//
+// LZ4 is supported from Cassandra 2.0+ through current versions. In Cassandra 4.0+,
+// LZ4 became the only supported compression algorithm.
+//
+// # Performance Characteristics
+//
+// LZ4 generally provides a good balance of compression speed and compression ratio,
+// making it a solid default choice for most applications. The effectiveness of
+// compression depends on the specific CQL query patterns, result set sizes, and
+// frame payload characteristics in your application.
+//
+// For optimal performance, benchmark both LZ4 and Snappy with your specific
+// workload, though LZ4 is typically a good starting point.
+package lz4 // import "github.com/apache/cassandra-gocql-driver/v2/lz4"
diff --git a/snappy/doc.go b/snappy/doc.go
new file mode 100644
index 000000000..3d60ed3a2
--- /dev/null
+++ b/snappy/doc.go
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+/*
+ * Content before git sha 34fdeebefcbf183ed7f916f931aa0586fdaa1b40
+ * Copyright (c) 2012, The Gocql authors,
+ * provided under the BSD-3-Clause License.
+ * See the NOTICE file distributed with this work for additional information.
+ */
+
+// Package snappy provides Snappy compression for the Cassandra Native Protocol.
+//
+// Snappy compresses Native Protocol frame payloads as defined in the Cassandra
+// Native Protocol specification. The protocol supports both compressed and
+// uncompressed frame formats, with compression applied to frame payloads
+// containing CQL envelopes.
+//
+// # Basic Usage
+//
+// To enable Snappy compression:
+//
+//	import (
+//		"github.com/apache/cassandra-gocql-driver/v2"
+//		"github.com/apache/cassandra-gocql-driver/v2/snappy"
+//	)
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.Compressor = &snappy.SnappyCompressor{}
+//
+// # Native Protocol Compression
+//
+// According to the Cassandra Native Protocol specification, compression operates
+// on frame payloads containing streams of CQL envelopes. Each frame payload is
+// compressed independently, with no compression context shared between frames.
+//
+// # Protocol and Cassandra Version Support
+//
+// Snappy compression support varies by Cassandra version due to Native Protocol
+// changes:
+//
+//   - Protocol v2 (Cassandra 2.0.x): LZ4 and Snappy supported
+//   - Protocol v3 (Cassandra 2.1.x): LZ4 and Snappy supported
+//   - Protocol v4 (Cassandra 2.2.x, 3.0.x, 3.x): LZ4 and Snappy supported
+//   - Protocol v5 (Cassandra 4.0+): Only LZ4 supported (Snappy removed)
+//
+// Snappy is supported from Cassandra 2.0.x through 3.x, but is not available
+// in Cassandra 4.0+. For applications targeting Cassandra 4.0+, use LZ4 compression.
+//
+// # Compatibility Notes
+//
+// When connecting to Cassandra 4.0+ clusters, Snappy compression will not be
+// available because protocol v5 only supports LZ4. Applications should use LZ4
+// for maximum compatibility or implement version-specific compression selection.
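+//
+// For example, a sketch of version-specific selection (assuming the protocol version is
+// pinned up front via ClusterConfig.ProtoVersion):
+//
+//	cluster := gocql.NewCluster("127.0.0.1")
+//	cluster.ProtoVersion = 4
+//	if cluster.ProtoVersion >= 5 {
+//		cluster.Compressor = &lz4.LZ4Compressor{}
+//	} else {
+//		cluster.Compressor = &snappy.SnappyCompressor{}
+//	}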
+//
+// # Performance Characteristics
+//
+// LZ4 is generally recommended as the default choice for most applications.
+// Benchmark both compression algorithms with your specific workload, and choose
+// Snappy only if it measurably outperforms LZ4 for your use case.
+package snappy // import "github.com/apache/cassandra-gocql-driver/v2/snappy"