# Kafka Consumer

TidePoolUI includes a PHP-based Kafka consumer that reads share events from SeaTidePool and stores them in Redis for display in the web UI.

## Overview

The consumer (`backend/bin/share-consumer.php`) is a long-running daemon that:

1. Connects to a Kafka cluster
2. Subscribes to the shares topic
3. Processes each share message
4. Updates Redis via the ShareStore class
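
The four steps above can be sketched with the php-rdkafka high-level consumer. This is an illustrative outline, not the contents of `share-consumer.php`: `runConsumer`, `$storeShare` (standing in for the ShareStore call, whose API is not shown here) and `$keepRunning` are hypothetical names, and the sketch silently ignores Kafka errors that the real consumer may log or handle.

```php
<?php

/**
 * Sketch of the consume loop. Requires the rdkafka extension when called.
 * $storeShare and $keepRunning are hypothetical callbacks.
 */
function runConsumer(
    string $brokers,
    string $topic,
    string $group,
    callable $storeShare,
    callable $keepRunning
): void {
    $conf = new RdKafka\Conf();
    $conf->set('bootstrap.servers', $brokers);
    $conf->set('group.id', $group);

    $consumer = new RdKafka\KafkaConsumer($conf);   // 1. connect
    $consumer->subscribe([$topic]);                 // 2. subscribe

    while ($keepRunning()) {
        $message = $consumer->consume(1000);        // wait up to 1 second
        if ($message->err === RD_KAFKA_RESP_ERR_NO_ERROR) {
            $share = json_decode($message->payload, true);  // 3. process
            if (is_array($share)) {
                $storeShare($share);                // 4. update Redis
            }
        }
        // Timeouts and partition EOF need no action: poll again.
    }

    $consumer->close();
}
```

Passing the store and the stop condition as callables keeps the loop independent of Redis and of signal handling, which makes it easier to exercise in isolation.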

## Requirements

- PHP 8.3 with the `rdkafka` extension
- Redis server
- Network access to Kafka broker(s)

### Installing rdkafka on FreeBSD

```sh
sudo pkg install php83-pecl-rdkafka
sudo cp /usr/local/etc/php/ext-20-rdkafka.ini.sample /usr/local/etc/php/ext-20-rdkafka.ini
```

Verify installation:
```sh
php -m | grep rdkafka
```

## Configuration

The consumer uses environment variables for configuration:

| Variable | Default | Description |
|----------|---------|-------------|
| `KAFKA_BROKERS` | `localhost:9092` | Comma-separated list of Kafka brokers |
| `KAFKA_TOPIC` | `tidepool.dev.shares` | Topic to consume from |
| `KAFKA_GROUP` | `tidepoolui-consumer` | Consumer group ID |
| `HEALTH_FILE` | `/tmp/tidepoolui-consumer.health` | Health check file path |
| `MAX_MEMORY_MB` | `128` | Memory limit in MB; the consumer exits when it is exceeded so a supervisor can restart it |
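
Resolving these variables with their documented defaults might look like the following. It is a minimal sketch: `loadConsumerConfig` and the array keys are hypothetical, and treating an empty value as unset is an assumption rather than documented behavior.

```php
<?php

/**
 * Resolve consumer settings from the environment, falling back to the
 * documented defaults. Function name and array keys are illustrative.
 */
function loadConsumerConfig(): array
{
    // getenv() returns false when a variable is unset; an empty string is
    // also treated as unset here (an assumption of this sketch).
    $env = static function (string $name, string $default): string {
        $value = getenv($name);
        return ($value === false || $value === '') ? $default : $value;
    };

    return [
        'brokers'       => $env('KAFKA_BROKERS', 'localhost:9092'),
        'topic'         => $env('KAFKA_TOPIC', 'tidepool.dev.shares'),
        'group'         => $env('KAFKA_GROUP', 'tidepoolui-consumer'),
        'health_file'   => $env('HEALTH_FILE', '/tmp/tidepoolui-consumer.health'),
        'max_memory_mb' => (int) $env('MAX_MEMORY_MB', '128'),
    ];
}
```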

### Example Configuration

```sh
export KAFKA_BROKERS="kafka01:9092,kafka02:9092,kafka03:9092"
export KAFKA_TOPIC="tidepool.prod.shares"
export KAFKA_GROUP="tidepoolui-prod"
./backend/bin/share-consumer.php
```

## Running the Consumer

### Foreground (Development)

```sh
./backend/bin/share-consumer.php
```

Output:
```
TidePoolUI Kafka Share Consumer
================================
Brokers: localhost:9092
Topic: tidepool.dev.shares
Group: tidepoolui-consumer

Subscribed to tidepool.dev.shares. Waiting for messages...

[05:16:16] testworker: valid (1.42)
[05:16:29] testworker: valid (1.17)
[05:16:32] testworker: valid (1.82)
```

### Background (Production)

Using `daemon`:
```sh
daemon -p /var/run/tidepoolui-consumer.pid ./backend/bin/share-consumer.php
```

Using `nohup`:
```sh
nohup ./backend/bin/share-consumer.php >> /var/log/tidepoolui-consumer.log 2>&1 &
```

### FreeBSD rc.d Service (Recommended)

Install the service script:
```sh
sudo cp etc/rc.d/tidepoolui_consumer /usr/local/etc/rc.d/
sudo chmod +x /usr/local/etc/rc.d/tidepoolui_consumer
```

Configure in `/etc/rc.conf`:
```sh
tidepoolui_consumer_enable="YES"
tidepoolui_consumer_user="www"
tidepoolui_consumer_dir="/usr/local/www/tidepoolui"
tidepoolui_consumer_kafka_brokers="localhost:9092"
tidepoolui_consumer_kafka_topic="tidepool.prod.shares"
```

Manage the service:
```sh
sudo service tidepoolui_consumer start
sudo service tidepoolui_consumer status
sudo service tidepoolui_consumer health   # Show health metrics
sudo service tidepoolui_consumer stop
```

### Systemd Service (Linux)

Create `/etc/systemd/system/tidepoolui-consumer.service`:

```ini
[Unit]
Description=TidePoolUI Kafka Consumer
After=network.target redis.service

[Service]
Type=simple
User=www
WorkingDirectory=/path/to/TidePoolUI
Environment=KAFKA_BROKERS=localhost:9092
Environment=KAFKA_TOPIC=tidepool.prod.shares
ExecStart=/usr/local/bin/php backend/bin/share-consumer.php
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

## Message Format

The consumer expects JSON messages with this structure:

```json
{
    "workinfoid": 12345,
    "clientid": 1,
    "diff": 1.0,
    "sdiff": 2.5,
    "result": true,
    "errn": 0,
    "username": "miner1",
    "workername": "miner1.rig1",
    "address": "192.168.1.100",
    "agent": "bfgminer/5.5.0"
}
```

**Fields:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `workinfoid` | integer | Yes | Work unit identifier |
| `clientid` | integer | Yes | Client connection ID |
| `diff` | float | Yes | Target difficulty |
| `sdiff` | float | Yes | Submitted share difficulty |
| `result` | boolean | Yes | true = valid, false = invalid |
| `errn` | integer | No | Error number (for invalid shares) |
| `username` | string | Yes | Pool username |
| `workername` | string | Yes | Full worker name |
| `address` | string | No | Client IP address |
| `agent` | string | No | Mining software identifier |
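
A check of the required fields could be written as below. This is a sketch under assumptions: `validateShare` is a hypothetical helper, and accepting an integer where a float is expected (JSON `1` versus `1.0`) is a choice made here, not something the consumer is known to do.

```php
<?php

/**
 * Decode a share message and verify the required fields from the table.
 * Returns the decoded array, or null when the message should be skipped.
 */
function validateShare(string $payload): ?array
{
    $share = json_decode($payload, true);
    if (!is_array($share)) {
        return null; // invalid JSON, or not a JSON object
    }

    // JSON numbers without a fraction decode to int, so accept both.
    $isNumber = static fn ($v): bool => is_int($v) || is_float($v);

    $required = [
        'workinfoid' => 'is_int',
        'clientid'   => 'is_int',
        'diff'       => $isNumber,
        'sdiff'      => $isNumber,
        'result'     => 'is_bool',
        'username'   => 'is_string',
        'workername' => 'is_string',
    ];

    foreach ($required as $field => $check) {
        if (!array_key_exists($field, $share) || !$check($share[$field])) {
            return null;
        }
    }

    return $share; // optional fields (errn, address, agent) pass through
}
```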

## Consumer Groups

The consumer uses Kafka consumer groups for:

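- **Offset tracking**: Kafka stores the group's committed offsets, so a restarted consumer resumes after the last committed message
- **Scalability**: Multiple consumers in the same group split the topic's partitions between them (at most one consumer per partition)
- **Fault tolerance**: If one consumer fails, its partitions are reassigned to the remaining consumers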

### Resetting Offsets

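To reprocess all messages from the beginning, stop every consumer in the group first (Kafka only deletes a group or resets its offsets while the group has no active members), then run one of: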

```sh
# Delete the consumer group (on restart, the starting position then
# depends on the consumer's auto.offset.reset setting)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
    --group tidepoolui-consumer --delete

# Or reset to earliest
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
    --group tidepoolui-consumer --reset-offsets --to-earliest \
    --topic tidepool.dev.shares --execute
```

## Monitoring

### Check Consumer Lag

```sh
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
    --group tidepoolui-consumer --describe
```

### Check Redis Data

```sh
# Share count
redis-cli llen tidepoolui:shares

# Worker list
redis-cli smembers tidepoolui:workers

# Worker details
redis-cli hgetall tidepoolui:worker:testworker
```

## Resilience Features

The hardened consumer includes:

### Signal Handling
- **SIGTERM/SIGINT/SIGHUP**: Graceful shutdown, commits offsets, cleans up
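
One common way to get this behavior in PHP is a flag that the signal handler clears and the poll loop checks. The sketch below assumes the `pcntl` extension and is not taken from the consumer itself; `$running` is a hypothetical variable name.

```php
<?php

// Flag checked by the poll loop; cleared by the signal handler.
$running = true;

if (function_exists('pcntl_async_signals')) {
    // Deliver signals asynchronously, without declare(ticks=1).
    pcntl_async_signals(true);

    foreach ([SIGTERM, SIGINT, SIGHUP] as $signal) {
        pcntl_signal($signal, function (int $signo) use (&$running): void {
            // Only flip the flag here; the loop finishes the current
            // message, then commits offsets and cleans up.
            $running = false;
        });
    }
}

// while ($running) { /* poll and process one message */ }
// After the loop: commit offsets, close the consumer, remove temp files.
```

Doing the cleanup after the loop rather than inside the handler avoids interrupting a half-processed message.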

### Redis Reconnection
- Automatic reconnection with exponential backoff (1s, 2s, 4s)
- Up to 3 retry attempts before failing
- Connection health check via ping before each operation
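
The retry schedule described above (three retries with 1s, 2s and 4s delays) can be expressed as a small helper. This is a sketch with a hypothetical name, `retryWithBackoff`; it catches any `Throwable`, whereas real code would more likely catch only the Redis client's connection exception.

```php
<?php

/**
 * Run $operation, retrying with exponential backoff (1s, 2s, 4s).
 * $sleep is injectable so the delays can be observed without waiting.
 */
function retryWithBackoff(callable $operation, int $maxRetries = 3, ?callable $sleep = null): mixed
{
    $sleep = $sleep ?? 'sleep';
    $attempt = 0;

    while (true) {
        try {
            return $operation();
        } catch (Throwable $e) {
            if ($attempt >= $maxRetries) {
                throw $e; // retries exhausted, give up
            }
            $sleep(2 ** $attempt); // 1, then 2, then 4 seconds
            $attempt++;
        }
    }
}
```

A caller could wrap the ping-and-reconnect step, for example `retryWithBackoff(fn () => $redis->ping())`, where `$redis` is whatever client object the application uses.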

### Memory Management
- Periodic garbage collection (every 5 minutes)
- Memory limit enforcement (default 128 MB, set via `MAX_MEMORY_MB`)
- Automatic exit when limit exceeded (supervisor restarts)
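
A periodic maintenance step along these lines would cover both the garbage collection and the limit check. The function names are hypothetical, and measuring with `memory_get_usage(true)` is an assumption about how the limit is evaluated.

```php
<?php

/**
 * True when memory usage exceeds the limit given in megabytes.
 * $usageBytes can be passed explicitly; by default the real usage is read.
 */
function memoryLimitExceeded(int $maxMemoryMb, ?int $usageBytes = null): bool
{
    $usageBytes = $usageBytes ?? memory_get_usage(true);
    return $usageBytes > $maxMemoryMb * 1024 * 1024;
}

/**
 * Call once per poll iteration. Runs garbage collection every 5 minutes
 * and returns true when the consumer should exit for a supervised restart.
 */
function maintenanceTick(int &$lastGcAt, int $maxMemoryMb, ?int $now = null): bool
{
    $now = $now ?? time();
    if ($now - $lastGcAt >= 300) {
        gc_collect_cycles();
        $lastGcAt = $now;
    }
    return memoryLimitExceeded($maxMemoryMb);
}
```

Returning a boolean instead of calling `exit` directly lets the caller commit offsets and shut down cleanly before the process ends.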

### Health Check
- Writes status to the health file (default `/tmp/tidepoolui-consumer.health`, set via `HEALTH_FILE`) every 30 seconds
- Includes: PID, memory usage, message counts, last message time
- Use for monitoring/alerting
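
Writing such a file could look like this. The JSON layout and key names are assumptions made for illustration; the actual health file format may differ. The write goes to a temporary file first and is then renamed, so a monitor never reads a half-written file.

```php
<?php

/**
 * Write a health snapshot as JSON and replace the target file atomically.
 * Key names are hypothetical; only the listed contents come from the docs.
 */
function writeHealthFile(string $path, int $processed, int $errors, ?int $lastMessageAt): bool
{
    $status = [
        'pid'             => getmypid(),
        'memory_bytes'    => memory_get_usage(true),
        'messages'        => $processed,
        'errors'          => $errors,
        'last_message_at' => $lastMessageAt, // Unix timestamp or null
        'updated_at'      => time(),
    ];

    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, json_encode($status)) === false) {
        return false;
    }

    return rename($tmp, $path); // atomic on the same filesystem
}
```

A monitor can then alert when `updated_at` is older than a few write intervals, which catches a hung process as well as a dead one.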

## Error Handling

The consumer handles these error conditions:

| Error | Behavior |
|-------|----------|
| Kafka connection lost | Automatic reconnection |
| Invalid JSON message | Logged to stderr, skipped, error counter incremented |
| Redis connection lost | Automatic reconnection with backoff (3 retries) |
| Partition EOF | Normal, continue polling |
| Timeout | Normal, continue polling |
| Memory limit exceeded | Graceful exit (supervisor restarts) |
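
The Kafka-side rows of this table map onto the `err` field of each consumed message. The sketch below classifies a message accordingly; `classifyMessage` and its return labels are hypothetical. The constants are normally provided by the rdkafka extension, and the fallback definitions (values taken from librdkafka's error enum) exist only so the sketch runs without it.

```php
<?php

// Fallbacks for running without the rdkafka extension loaded.
defined('RD_KAFKA_RESP_ERR_NO_ERROR') || define('RD_KAFKA_RESP_ERR_NO_ERROR', 0);
defined('RD_KAFKA_RESP_ERR__PARTITION_EOF') || define('RD_KAFKA_RESP_ERR__PARTITION_EOF', -191);
defined('RD_KAFKA_RESP_ERR__TIMED_OUT') || define('RD_KAFKA_RESP_ERR__TIMED_OUT', -185);

/**
 * Decide what to do with a consumed message.
 * Returns 'process', 'skip' (invalid JSON), 'continue', or 'error'.
 */
function classifyMessage(object $message): string
{
    return match ($message->err) {
        RD_KAFKA_RESP_ERR_NO_ERROR =>
            is_array(json_decode((string) $message->payload, true)) ? 'process' : 'skip',
        RD_KAFKA_RESP_ERR__PARTITION_EOF,
        RD_KAFKA_RESP_ERR__TIMED_OUT => 'continue', // normal, keep polling
        default => 'error', // handling of other codes is left to the caller
    };
}
```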

## Performance Tuning

For high-volume environments, adjust these settings in the consumer code:

```php
// Increase batch size
$conf->set('fetch.min.bytes', '1024');

// Reduce latency
$conf->set('fetch.wait.max.ms', '100');

// Note: compression is chosen by the producer. The consumer decompresses
// messages automatically, so `compression.codec` has no effect here.
```

## Troubleshooting

### "Broker transport failure"

- Check Kafka is running, e.g. `sudo podman ps | grep kafka` if it runs in a Podman container
- Verify broker address is correct
- Check firewall allows port 9092

### "Unknown topic or partition"

- Topic may not exist yet
- SeaTidePool may not have published any shares
- Check topic with: `kafka-topics.sh --list --bootstrap-server localhost:9092`

### Consumer not receiving messages

1. Check SeaTidePool has Kafka enabled
2. Verify topic name matches
3. Check consumer group offset
4. Try consuming from beginning:
   ```sh
   kafka-console-consumer.sh --bootstrap-server localhost:9092 \
       --topic tidepool.dev.shares --from-beginning --max-messages 5
   ```

### High Redis memory usage

The consumer stores only recent shares in Redis, capped at 1000 entries. If Redis memory usage is high, check that the cap is holding:
```sh
redis-cli info memory
redis-cli llen tidepoolui:shares  # Should be capped at 1000
```

## Related Documentation

- [ARCHITECTURE.md](ARCHITECTURE.md) - System overview
- [SeaTidePool KAFKA_DEV_SETUP.md](../../SeaTidePool/doc/KAFKA_DEV_SETUP.md) - Kafka cluster setup
