Kafka record extractor - User Guide
The Kafka record extractor deploys a kafkacat container on Docker and extracts SMF records, filtered by a configured time interval, from an Apache Kafka cluster. In some troubleshooting cases, support teams may request an extract of SMF records. The following instructions describe how to install, configure, and run the Kafka record extractor.
Step 1: Prerequisites
A Linux distribution with Docker installed is required.
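Before continuing, you can confirm that Docker is installed and that the daemon is running by using the standard Docker CLI (these are generic checks, not specific to the extractor):

docker --version   # prints the installed Docker version
docker info        # reports an error if the Docker daemon is not running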
Step 2: Installation
New users should contact technical support to receive their access credentials.
Download the kafka-record-extractor.tar archive from our repository.
Extract the archive:
tar -xvf kafka-record-extractor.tar
Step 3: Configuration
Customize the properties in the settings.config configuration file.
The date system command converts the strings in START_TIMEDATE and END_TIMEDATE into corresponding timestamps. See the date(1) manual page for the supported date string formats.
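For example, on a system with GNU coreutils, the following command shows how a value such as the START_TIMEDATE example below resolves to a Unix timestamp (the exact invocation the extractor uses internally may differ):

date -u -d "2021-04-12 20:52:00 UTC" +%s
# prints 1618260720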
KEY | EXAMPLE VALUE | DESCRIPTION
KAFKA_HOST_IP | 127.0.0.1 or localhost | The IP address or hostname of the Kafka cluster.
KAFKA_PORT | 9092 | The port on which the Kafka cluster listens for connections.
KAFKA_TOPIC | smf | The existing topic that contains the SMF records.
START_TIMEDATE | 2021-04-12 20:52:00 UTC | The earliest arrival time in the cluster of records to extract. Forms a time interval with END_TIMEDATE.
END_TIMEDATE | 2021-04-12 21:52:00 UTC | The latest arrival time in the cluster of records to extract. Forms a time interval with START_TIMEDATE.
MAX_RECORDS_TO_EXTRACT | 500 | The maximum number of records to extract.
ARCHIVE_RECORDS | true | Whether extracted records are archived into gzipped tarballs.
RECORDS_PER_ARCHIVE | 100 | Only used when ARCHIVE_RECORDS is true. The maximum number of records per archive. When set to 0, all extracted records are archived into a single gzipped tarball.
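For reference, a filled-in settings.config using the example values above might look like the following (the shell-style KEY=value syntax is an assumption; follow the layout of the file shipped in the archive):

# Example settings.config (illustrative; values taken from the table above)
KAFKA_HOST_IP=127.0.0.1
KAFKA_PORT=9092
KAFKA_TOPIC=smf
START_TIMEDATE="2021-04-12 20:52:00 UTC"
END_TIMEDATE="2021-04-12 21:52:00 UTC"
MAX_RECORDS_TO_EXTRACT=500
ARCHIVE_RECORDS=true
RECORDS_PER_ARCHIVE=100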
Step 4: Run
The properties must be configured correctly before running the Kafka record extractor.
Run the bash script extractor.sh located in the root folder of the extracted archive:
./extractor.sh
Extracted records or gzipped archives are stored in the subfolder /extract/records.
Naming Convention
Record:
<KAFKA_TOPIC>-<TOPIC_PARTITION>-<PARTITION_OFFSET>_<DATE>_<TIME>_UTC.smf
Archive:
records-<DATE>_<TIME>_UTC-x<ARCHIVE_NUMBER>.tar.gz
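As a purely illustrative example, using the smf topic from the configuration above with hypothetical partition, offset, and timestamp values (the exact DATE and TIME formatting may differ from what is shown here):

smf-0-10542_2021-04-12_20-55-31_UTC.smf
records-2021-04-12_21-53-02_UTC-x1.tar.gz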