jerryshao/kafka-input-format

kafka-input-format

kafka-input-format is a Hadoop input format for batch loading messages from Kafka. It has the following features:

  1. Automatically records the current consumed offset of each Kafka message queue in ZooKeeper, which avoids loading the same messages twice.
  2. Automatically distributes tasks according to Kafka's broker-partition locality, which avoids transferring data across the network.
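
The two features above can be illustrated with a minimal sketch. It is not taken from this project's source: `KafkaSplitSketch`, `PartitionSplit`, `OffsetStore`, `planSplits`, and `markConsumed` are hypothetical names, and the in-memory map only stands in for the offsets the project keeps in ZooKeeper. The one real API it mirrors is Hadoop's `InputSplit#getLocations()`, which lets a split name the hosts where its data lives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the kafka-input-format source.
final class KafkaSplitSketch {

    /** One unit of work: a single topic partition hosted on a single broker. */
    static final class PartitionSplit {
        final String topic;
        final int partition;
        final String brokerHost;
        final long startOffset; // inclusive
        final long endOffset;   // exclusive

        PartitionSplit(String topic, int partition, String brokerHost,
                       long startOffset, long endOffset) {
            this.topic = topic;
            this.partition = partition;
            this.brokerHost = brokerHost;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        /** Preferred host for the task, analogous to Hadoop's InputSplit#getLocations(). */
        String[] getLocations() {
            return new String[] { brokerHost };
        }

        String key() {
            return topic + "-" + partition;
        }
    }

    /** In-memory stand-in (an assumption) for offsets stored in ZooKeeper. */
    static final class OffsetStore {
        private final Map<String, Long> committed = new HashMap<>();

        long lastCommitted(String key) {
            return committed.getOrDefault(key, 0L);
        }

        void commit(String key, long offset) {
            committed.put(key, offset);
        }
    }

    /** Builds one split per partition, starting at the last committed offset. */
    static List<PartitionSplit> planSplits(String topic,
                                           Map<Integer, String> partitionToBroker,
                                           Map<Integer, Long> latestOffsets,
                                           OffsetStore store) {
        List<PartitionSplit> splits = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : partitionToBroker.entrySet()) {
            int partition = entry.getKey();
            long start = store.lastCommitted(topic + "-" + partition);
            long end = latestOffsets.getOrDefault(partition, start);
            if (end > start) { // skip partitions with nothing new to read
                splits.add(new PartitionSplit(topic, partition, entry.getValue(), start, end));
            }
        }
        return splits;
    }

    /** Records a split as consumed so the next batch starts where this one ended. */
    static void markConsumed(PartitionSplit split, OffsetStore store) {
        store.commit(split.key(), split.endOffset);
    }
}
```

In this sketch, a scheduler that honors `getLocations()` would run each task on the broker that holds the partition, and calling `markConsumed` after a successful read means a later `planSplits` call produces no split for offsets that were already loaded.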

This project is open sourced under the Apache License, Version 2.0.

About

A Kafka input format used in Hadoop or Spark for batch reading data from Kafka
