Prerequisites

Before installing the Kafka Connect gRPC connector, ensure your environment meets the following requirements.

Required Software

Java Development Kit (JDK)

  • Minimum Version: Java 11
  • Recommended: Java 17 (LTS)

java -version

Expected output:

openjdk version "17.0.9" 2023-10-17 LTS
OpenJDK Runtime Environment (build 17.0.9+9-LTS)

# Debian/Ubuntu
sudo apt update
sudo apt install openjdk-17-jdk

# macOS (Homebrew)
brew install openjdk@17

# RHEL/CentOS/Fedora
sudo yum install java-17-openjdk-devel
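The minimum-version requirement can also be checked in a script. A minimal sketch, assuming a POSIX shell with `sort -V` available; the `version_ge` helper is hypothetical, not part of the connector:

```shell
# Hypothetical helper: true when version $1 >= version $2 (field-by-field).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Pull the version string out of `java -version` and check it against 11.
java_ver=$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')
if version_ge "$java_ver" "11"; then
  echo "Java ${java_ver} meets the minimum"
else
  echo "Java 11+ required (found: ${java_ver:-none})"
fi
```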

Apache Kafka

  • Minimum Version: 3.9.0
  • Download: Apache Kafka Downloads

kafka-topics.sh --version

Expected output:

3.9.0 (Commit:...)

# Download Kafka
wget https://downloads.apache.org/kafka/3.9.0/kafka_2.13-3.9.0.tgz
tar -xzf kafka_2.13-3.9.0.tgz
cd kafka_2.13-3.9.0

# Start Kafka (KRaft mode)
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties
bin/kafka-server-start.sh config/kraft/server.properties

Maven (for building from source)

  • Minimum Version: 3.6+
  • Recommended: 3.9+

mvn --version

Expected output:

Apache Maven 3.9.5
Maven home: /usr/share/maven
Java version: 17.0.9

Protocol Buffers Compiler (protoc)

  • Required for: Generating .desc descriptor files
  • Minimum Version: 3.0+
  • Recommended: 3.25.0+

protoc --version

Expected output:

libprotoc 3.25.0

# Debian/Ubuntu
sudo apt update
sudo apt install protobuf-compiler

# macOS (Homebrew)
brew install protobuf

# Manual install from GitHub releases
PROTOC_VERSION=25.0
wget https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-x86_64.zip
sudo unzip protoc-${PROTOC_VERSION}-linux-x86_64.zip -d /usr/local

Network Requirements

gRPC Server Connectivity

The connector requires network access to gRPC servers:

Protocol           Default Port    Purpose
gRPC (plaintext)   9090            Unencrypted gRPC connections
gRPC (TLS)         443 or custom   TLS-encrypted gRPC connections

Test gRPC Connectivity

# Install grpcurl (gRPC testing tool)
go install github.com/fullstorydev/grpcurl/cmd/grpcurl@latest

# Test connection and list services
grpcurl -plaintext localhost:9090 list

# Test a server streaming method (pass the request message as JSON with -d)
grpcurl -plaintext -d '{}' localhost:9090 \
  com.example.MyService/StreamData

Firewall Configuration

If behind a corporate firewall, ensure HTTP/2 traffic is allowed:

# Test with curl (HTTP/2)
curl -v --http2 https://grpc-server:443

# Expected: Connection successful, HTTP/2 protocol negotiated

Proxy Configuration

If using a corporate proxy:

# Set proxy environment variables
export https_proxy=http://proxy.company.com:8080
export http_proxy=http://proxy.company.com:8080

# Configure Java proxy (add to Kafka Connect startup)
export KAFKA_OPTS="-Dhttps.proxyHost=proxy.company.com \
                   -Dhttps.proxyPort=8080 \
                   -Dhttp.proxyHost=proxy.company.com \
                   -Dhttp.proxyPort=8080"
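To keep the two settings in sync, the Java flags can be derived from the environment variable instead of being typed twice. A sketch using POSIX parameter expansion; the proxy value is the example one above:

```shell
# Derive Java proxy flags from the conventional https_proxy variable.
https_proxy="http://proxy.company.com:8080"

hostport=${https_proxy#*://}   # strip the scheme -> proxy.company.com:8080
proxy_host=${hostport%:*}      # everything before the last colon
proxy_port=${hostport##*:}     # everything after the last colon

export KAFKA_OPTS="-Dhttps.proxyHost=${proxy_host} -Dhttps.proxyPort=${proxy_port} -Dhttp.proxyHost=${proxy_host} -Dhttp.proxyPort=${proxy_port}"
echo "$KAFKA_OPTS"
```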

gRPC Server Requirements

Server Streaming Method

The connector requires a server streaming gRPC method:

service EventService {
  // ✅ Supported: Server streaming
  rpc StreamEvents(EventRequest) returns (stream Event);

  // ❌ Not supported: Unary
  rpc GetEvent(EventRequest) returns (Event);

  // ❌ Not supported: Client streaming
  rpc UploadEvents(stream Event) returns (UploadResponse);

  // ❌ Not supported: Bidirectional streaming
  rpc SyncEvents(stream Event) returns (stream Event);
}

Protocol Buffer Descriptor

Generate a descriptor file for your service:

# Generate .desc file with all dependencies
protoc --descriptor_set_out=service.desc \
  --include_imports \
  your_service.proto

# Verify descriptor contains your service
grpcurl -protoset service.desc list

Include Imports

Always use --include_imports when generating descriptors. Without it, the connector cannot resolve message types from imported proto files.
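As a sketch of why this matters, suppose `your_service.proto` pulls a message type from a second file (the file and message names here are hypothetical):

```protobuf
// events.proto (shared message definitions)
syntax = "proto3";
package com.example;

message Event {
  string id = 1;
  int64 timestamp = 2;
}
```

```protobuf
// your_service.proto (the service compiled into the descriptor)
syntax = "proto3";
package com.example;

import "events.proto";

message EventRequest {
  string topic_filter = 1;
}

service EventService {
  rpc StreamEvents(EventRequest) returns (stream Event);
}
```

Compiled without --include_imports, the descriptor records the service but omits the definition of Event, so type resolution fails when the connector loads it.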

Kafka Connect Setup

Ensure Kafka Connect is running in distributed mode:

# Check if Connect is running
curl http://localhost:8083/

Expected response:

{
  "version": "3.9.0",
  "commit": "...",
  "kafka_cluster_id": "..."
}
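If a script needs the worker version (for example, in a readiness check), it can be pulled out of that response without extra tooling. A sketch, using a stand-in string in place of the live `curl -s http://localhost:8083/` output:

```shell
# Stand-in for: response=$(curl -s http://localhost:8083/)
response='{"version":"3.9.0","commit":"abc123","kafka_cluster_id":"xyz"}'

# Extract the "version" field with sed (jq works too: jq -r .version).
version=$(printf '%s' "$response" | sed -n 's/.*"version":"\([^"]*\)".*/\1/p')
echo "Connect worker version: $version"
```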

Internal Topics Configuration

Kafka Connect in distributed mode requires three internal topics. These are typically auto-created, but for production you should pre-create them with proper replication:

# In connect-distributed.properties
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
offset.storage.partitions=25

config.storage.topic=connect-configs
config.storage.replication.factor=3
config.storage.partitions=1

status.storage.topic=connect-status
status.storage.replication.factor=3
status.storage.partitions=5

Topic Purpose

  • connect-offsets: Stores source connector offsets (sequence numbers for gRPC)
  • connect-configs: Stores connector and task configurations
  • connect-status: Stores connector and task status

Producer Configuration

Configure producer settings for source connectors in connect-distributed.properties:

# Optimize for throughput
producer.linger.ms=10
producer.batch.size=32768
producer.compression.type=lz4
# acks=1 favors throughput; use producer.acks=all if record loss is unacceptable
producer.acks=1

Plugin Directory

Verify the plugin path is configured:

# Check connect-distributed.properties
grep plugin.path $KAFKA_HOME/config/connect-distributed.properties

Expected output:

plugin.path=/usr/local/share/kafka/plugins

Plugin Path Must Exist

The plugin directory must exist and have proper permissions:

sudo mkdir -p /usr/local/share/kafka/plugins
sudo chown -R $USER:$USER /usr/local/share/kafka/plugins

Resource Requirements

Minimum Resources

For development/testing:

Resource   Requirement
CPU        1 core
Memory     512 MB for connector
Disk       100 MB for JAR files
Network    10 Mbps

Production Resources

For production deployments:

Resource   Requirement
CPU        2+ cores
Memory     2 GB for Kafka Connect worker
Disk       1 GB (for JARs + logs)
Network    100+ Mbps, low latency to gRPC server

Memory Configuration

Configure heap size for Kafka Connect:

# In connect-distributed.sh or systemd service
export KAFKA_HEAP_OPTS="-Xms2G -Xmx2G"

Security Configuration

TLS/mTLS Certificates

For secure gRPC connections, prepare certificate files:

# CA certificate (for server verification)
/path/to/ca.crt

# Client certificate (for mTLS)
/path/to/client.crt

# Client private key (for mTLS)
/path/to/client.key

Verify certificates:

# Check certificate validity
openssl x509 -in ca.crt -text -noout

# Test mTLS connection
grpcurl -cacert ca.crt \
  -cert client.crt \
  -key client.key \
  grpc-server:443 list
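Beyond inspecting contents, it is worth checking expiry ahead of time. A runnable sketch of `openssl x509 -checkend`, demonstrated on a throwaway self-signed certificate so the commands work end to end; substitute your real ca.crt path:

```shell
# Create a throwaway 30-day self-signed certificate for the demonstration.
tmpdir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -subj "/CN=demo-ca" \
  -keyout "$tmpdir/ca.key" -out "$tmpdir/ca.crt" 2>/dev/null

# -checkend fails if the certificate expires within N seconds (7 days here).
if openssl x509 -in "$tmpdir/ca.crt" -checkend 604800 >/dev/null; then
  cert_status=valid
else
  cert_status=expiring
fi
echo "certificate status: $cert_status"
rm -rf "$tmpdir"
```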

Secure JMX Access

For production, enable JMX authentication:

# Create password file
echo "admin changeit" > /etc/kafka/jmx.password
chmod 600 /etc/kafka/jmx.password

# Create access file
echo "admin readwrite" > /etc/kafka/jmx.access
chmod 644 /etc/kafka/jmx.access

# Configure Kafka Connect
# (ssl=true also requires javax.net.ssl keystore/truststore system properties)
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.authenticate=true \
  -Dcom.sun.management.jmxremote.password.file=/etc/kafka/jmx.password \
  -Dcom.sun.management.jmxremote.access.file=/etc/kafka/jmx.access \
  -Dcom.sun.management.jmxremote.ssl=true"

Never Run JMX Without Authentication in Production

Default JMX configuration with authenticate=false exposes your connector to unauthorized access.

Protect Secrets in Configuration

Do NOT store sensitive data in plaintext config files. Use Kafka Connect's ConfigProvider. Note that the provider itself is enabled in the worker's connect-distributed.properties, not in the connector JSON:

config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider

The connector configuration then references the secret instead of embedding it:

{
  "grpc.tls.client.key": "${file:/etc/kafka/secrets.properties:grpc.client.key}"
}

Create /etc/kafka/secrets.properties with restricted permissions:

echo "grpc.client.key=/secure/path/to/client.key" > /etc/kafka/secrets.properties
chmod 600 /etc/kafka/secrets.properties

Optional Tools

Monitoring Tools

For production deployments, consider installing:

  • JMX Monitoring: JConsole, VisualVM, or JMX Exporter
  • Prometheus: For metrics collection
  • Grafana: For dashboards
  • Loki/ELK: For log aggregation

Development Tools

For connector development:

  • Git: Version control
  • Docker: Containerized Kafka and gRPC server setup
  • grpcurl: gRPC command-line client for testing
  • curl/jq: API testing and JSON parsing
  • BloomRPC: GUI client for testing gRPC services (archived in 2023; grpcui is an actively maintained alternative)

Verification Checklist

Before proceeding to installation, verify:

  • Java 11+ is installed and java -version works
  • Kafka 3.9.0+ is running
  • Kafka Connect REST API is accessible at http://localhost:8083/
  • Plugin directory exists and is writable
  • Maven 3.6+ is installed (for building from source)
  • protoc is installed for generating descriptors
  • Network access to gRPC server is available
  • gRPC server has a server streaming method
  • Protocol Buffer descriptor file is generated

Troubleshooting Prerequisites

Java Version Issues

Problem: Wrong Java version

# Check all Java installations
ls -la /usr/lib/jvm/

# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH

Kafka Not Running

Problem: Kafka Connect not accessible

# Check Kafka Connect logs
tail -f $KAFKA_HOME/logs/connect.log

# Start Kafka Connect in distributed mode (stop the old process first to restart)
$KAFKA_HOME/bin/connect-distributed.sh config/connect-distributed.properties

Network Connectivity Issues

Problem: Cannot reach gRPC server

# Test DNS resolution
nslookup grpc-server.example.com

# Test port connectivity
telnet grpc-server.example.com 9090

# Test gRPC service
grpcurl -plaintext grpc-server.example.com:9090 list
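When telnet is unavailable, bash can open the TCP connection itself. A sketch using bash's `/dev/tcp` redirection; the host and port are the example values above, and `timeout` comes from GNU coreutils:

```shell
host=grpc-server.example.com
port=9090

# Bash resolves /dev/tcp/HOST/PORT into a TCP connect; timeout caps the wait.
if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
  reachable=yes
else
  reachable=no
fi
echo "${host}:${port} reachable: ${reachable}"
```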

Proto Descriptor Issues

Problem: Cannot generate descriptor file

# Ensure protoc is installed
which protoc

# Check proto file syntax by compiling to a throwaway descriptor
# (protoc has no standalone syntax-check flag)
protoc -o /dev/null your_service.proto

# Generate with verbose output
protoc --descriptor_set_out=service.desc \
  --include_imports \
  --error_format=gcc \
  your_service.proto

Next Steps

Once all prerequisites are met:

  1. Installation - Install the connector
  2. Configuration - Configure all connector parameters
  3. Quick Start - Deploy your first connector

Need help? Check our FAQ or open an issue.