Confluent kafka support #61
- Replaced the existing Kafka implementation with Confluent Kafka Go client for improved functionality.
- Updated task configuration to include new fields for schema registry and idempotent producer support.
- Enhanced README documentation to reflect changes in task behavior and configuration options.
- Added example pipelines for reading and writing with schema support using Confluent Kafka.
Since the library depends on CGO, let's also test it with and without the CGO_ENABLED flag.
… task configuration
- Updated the build process in release.yaml and Dockerfile to enable CGO for the caterpillar binary.
- Added a new serializer implementation for Avro format with Schema Registry support.
- Enhanced README documentation to clarify configuration options and provide examples for using Avro serialization.
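For context on what an Avro serializer with Schema Registry support typically emits, here is a minimal sketch of the Confluent wire format: one magic byte (0), a 4-byte big-endian schema ID, then the Avro-encoded payload. It assumes the serializer in this PR follows that standard framing, which the conversation does not confirm. The helper names `frame` and `unframe` are hypothetical, and the Avro encoding itself is out of scope here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// magicByte is the leading byte of the Confluent Schema Registry wire format.
const magicByte byte = 0x0

// frame (hypothetical helper) prepends the 5-byte header, magic byte plus
// big-endian schema ID, to a payload that is already Avro-encoded.
func frame(schemaID uint32, avroPayload []byte) []byte {
	out := make([]byte, 5, 5+len(avroPayload))
	out[0] = magicByte
	binary.BigEndian.PutUint32(out[1:5], schemaID)
	return append(out, avroPayload...)
}

// unframe (hypothetical helper) splits a framed message back into the
// schema ID and the raw Avro payload.
func unframe(msg []byte) (uint32, []byte, error) {
	if len(msg) < 5 {
		return 0, nil, errors.New("message shorter than 5-byte header")
	}
	if msg[0] != magicByte {
		return 0, nil, fmt.Errorf("unexpected magic byte 0x%x", msg[0])
	}
	return binary.BigEndian.Uint32(msg[1:5]), msg[5:], nil
}

func main() {
	// "avro-bytes" stands in for a real Avro-encoded record.
	msg := frame(42, []byte("avro-bytes"))
	id, payload, err := unframe(msg)
	fmt.Println(id, string(payload), err)
}
```

A consumer that sees a different first byte can fall back to treating the message as unframed, which is one reason the header check returns an error rather than panicking.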
@@ -1,13 +1,13 @@
-FROM golang:1.24.7-alpine AS builder
+FROM golang:1.24.7 AS builder
Added this because confluent-kafka-go vendors librdkafka_glibc_linux, which cannot link against Alpine's musl libc.
 # build executable
-RUN go build -o caterpillar ./cmd/caterpillar/caterpillar.go
+RUN CGO_ENABLED=1 go build -o caterpillar ./cmd/caterpillar/caterpillar.go
confluent-kafka-go wraps the native librdkafka library via CGO, so CGO must be enabled for the build.
…rocess
- Changed base image from Debian to Alpine for a smaller footprint.
- Added necessary packages for building with CGO and librdkafka support.
- Updated build command to include dynamic tags for the caterpillar binary.
- Introduced a new codec format handler for Avro serialization in Kafka tasks, improving schema registry integration.
- Updated the `newCodecForFormat` function to utilize a map for codec format handling, enhancing maintainability.
- Added `schema_format` configuration option to Kafka pipeline YAML files for specifying the serialization format.
- Minor comment updates in the Dockerfile to clarify the build process.
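The map-based dispatch mentioned for `newCodecForFormat` could look roughly like the sketch below. Only the function name and the `schema_format` option come from the commit message. The `codec` interface, the `codecFactories` map, and the JSON stand-in codec are assumptions made for illustration, since the PR's actual types and its Avro handler are not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// codec is a hypothetical interface; the PR's real codec type is not shown.
type codec interface {
	Encode(v any) ([]byte, error)
}

// jsonCodec is a stand-in implementation used only for illustration.
// The PR registers an Avro handler, which would need a third-party library.
type jsonCodec struct{}

func (jsonCodec) Encode(v any) ([]byte, error) { return json.Marshal(v) }

// codecFactories (assumed name) maps a schema_format value to a constructor.
// Supporting a new format means adding one entry here.
var codecFactories = map[string]func() (codec, error){
	"json": func() (codec, error) { return jsonCodec{}, nil },
}

// newCodecForFormat looks up the constructor registered for the format name
// and returns an error for formats that have no entry.
func newCodecForFormat(format string) (codec, error) {
	factory, ok := codecFactories[format]
	if !ok {
		return nil, fmt.Errorf("unsupported schema_format %q", format)
	}
	return factory()
}

func main() {
	c, err := newCodecForFormat("json")
	if err != nil {
		panic(err)
	}
	b, _ := c.Encode(map[string]int{"id": 1})
	fmt.Println(string(b))

	// An unregistered format yields an error instead of a nil codec.
	_, err = newCodecForFormat("protobuf")
	fmt.Println(err)
}
```

Compared with a switch statement, the map keeps the lookup logic fixed while the set of formats grows, which matches the maintainability goal stated in the commit.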
WORKDIR /go/src/github.com/patterninc/caterpillar

# Alpine 3.20 ships librdkafka 2.4.0; confluent-kafka-go v2 requires 2.14.0+.
RUN apk add --no-cache gcc musl-dev pkgconf \
Build-time dependency.
RUN chmod 755 caterpillar

FROM alpine:3.20
RUN apk add --no-cache \
Runtime dependency.
Yash Shrivastava (alephys26)
left a comment
For the standalone reader, add a unique suffix on every run so that it always behaves as a new reader.
Otherwise, drop the standalone reader functionality completely.
- Updated the Kafka consumer group ID to include a unique UUID, ensuring better identification and management of consumer instances.
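As a rough illustration of the unique-suffix idea, the sketch below builds a group ID from a fixed prefix plus a random version 4 UUID, so each run joins a consumer group that has no committed offsets. The function names and the `caterpillar` prefix are hypothetical, and the UUID is generated with the standard library here, whereas the PR may well use a UUID package instead.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 returns a random (version 4) UUID string using only the
// standard library.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// standaloneGroupID (hypothetical helper) appends a fresh UUID to the prefix,
// so every call yields a group ID that has never been used before.
func standaloneGroupID(prefix string) (string, error) {
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	return prefix + "-" + id, nil
}

func main() {
	// "caterpillar" is an assumed prefix, not taken from the PR.
	groupID, err := standaloneGroupID("caterpillar")
	if err != nil {
		panic(err)
	}
	fmt.Println(groupID)
}
```

Because a brand-new group has no stored offsets, where such a reader starts is governed by the consumer's `auto.offset.reset` setting rather than by any earlier run.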
Yash Shrivastava (alephys26)
left a comment
LGTM.
Description
- … segmentio/kafka-go to confluent-kafka-go/v2 for better SCRAM-SHA-512, TLS, and Schema Registry support
- … schema_registry_url, schema_registry_username, schema_registry_password)
- … complement to idempotent: true on the producer
- kafkaTask struct: removed duplicate unexported fields (timeout, batchFlushInterval), dead fields (BatchSize, UserCert, UserCertPath) and unused constant (defaultBatchSize)
- … and retry logic
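To make the SCRAM-SHA-512, TLS, and idempotent-producer options more concrete, here is a minimal sketch of how task settings might translate into librdkafka property names. The `taskOptions` struct and `producerProperties` function are hypothetical and not taken from the PR. The property keys (`bootstrap.servers`, `security.protocol`, `sasl.mechanism`, `sasl.username`, `sasl.password`, `enable.idempotence`) are standard librdkafka settings. With the real client, a map like this would presumably be passed as a `kafka.ConfigMap`.

```go
package main

import "fmt"

// taskOptions is a hypothetical subset of the task's settings.
type taskOptions struct {
	BootstrapServers string
	Username         string
	Password         string
	Idempotent       bool
}

// producerProperties (hypothetical helper) translates task options into
// librdkafka property names.
func producerProperties(o taskOptions) map[string]any {
	props := map[string]any{
		"bootstrap.servers": o.BootstrapServers,
	}
	if o.Username != "" {
		// SASL over TLS with SCRAM-SHA-512 credentials.
		props["security.protocol"] = "SASL_SSL"
		props["sasl.mechanism"] = "SCRAM-SHA-512"
		props["sasl.username"] = o.Username
		props["sasl.password"] = o.Password
	}
	if o.Idempotent {
		// Lets the broker deduplicate retried produce requests.
		props["enable.idempotence"] = true
	}
	return props
}

func main() {
	props := producerProperties(taskOptions{
		BootstrapServers: "localhost:9092", // placeholder address
		Username:         "user",
		Password:         "secret",
		Idempotent:       true,
	})
	fmt.Println(props["security.protocol"], props["enable.idempotence"])
}
```

The Schema Registry fields (`schema_registry_url` and its credentials) are left out on purpose: they configure a separate HTTP client rather than librdkafka itself.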
Types of changes
Checklist