Docsp 15784 kafka connect introduction (#111)

biniona-mongodb · terakilobyte · Chris Cho · schmalliso · commit fb7ba95171ea · 2022-04-26T12:33:19.000-04:00
Co-authored-by: Nathan <nathan.leniz@mongodb.com>
Co-authored-by: Chris Cho <chris.cho@mongodb.com>
Co-authored-by: Kailie Yuan <kailie.yuan@mongodb.com>
Co-authored-by: Rea Rustagi <rea.rustagi@mongodb.com>
Co-authored-by: Robert Walters <robert.walters@mongodb.com>
diff --git a/source/includes/figures/connect-data-flow.drawio b/source/includes/figures/connect-data-flow.drawio
@@ -0,0 +1 @@
+<mxfile host="app.diagrams.net" modified="2021-07-19T02:23:48.741Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36" version="14.8.6" etag="Fl_y1J6sKEKLlZwr8ESv" type="device"><diagram id="NlAw9Xcqx87uc-Amdjaq" name="Page-1">7Vhdc6IwFP01Pm6HD0F8rB/bzmz75Oxu61uECLGBS2MsuL9+LxLAFLV2xhnZnT7JPbkJybnnJMGePY7zO0HS6BECynuWEeQ9e9KzLNMaevhTINsS8Vy3BELBApXUADP2hyrQUOiGBXStJUoALlmqgz4kCfWlhhEhINPTlsD1t6YkpC1g5hPeRn+zQEZqFY7R4PeUhVH1ZtNQLTGpkhWwjkgA2R5kT3v2WADI8inOx5QX5FW8lP2+H2mtJyZoIs/pQF4X99mvt9Vw/vNOzrPJfLB6/Wb2y2HeCN+oFavZym1FgYBNEtBiFKNnj7KISTpLiV+0Zlh0xCIZc4xMfFTDUSFpfnSiZr181A2FmEqxxRTVwfIUY0oyNYNZUwB7oLBon3zPUYVXRQ/rsRte8EFR8wmavBYpNECZqBCEjCCEhPBpg4502pqcB4BUkbWiUm6V5slGgk4lzZl82nt+Loa6cVQ0ydXIu2BbBQku92k/2OtVhE23XVT1K9dXLOp00ZAD2AifnuBKCUgSEVJ5Is8+LAJBOZHsTZ/HxQtqtWT/jOtC5DZNOfNxBpC0av6B9Mk6LXegJcuLul/CC65uBdtoW8E74IQKuzhvwy8jnG0E+0wj9K9pBPvA/u+SuJA2x0mPHiEJYYK0GByrVreFcseVu8tZoHHcI0jd//peMr2Omam643TfTRd0Rf9MVzjXdEX7VvSDLF8IQmN1z+yAnmv9dkbPZouWjuq5A6eDc6YP3Gv6wDnmg+urv285HVN/+8vpS/0nVX2G+gfXVL/7L5wCjtOxU2DwwY0yLm6EweJ/uFG6nbtRtjdsnfxb5CfChErIDwdK8I5XpEfqVK6lgBc6Bg4CkQSSYidbMs7fQYSzMMHQRzop4qOCbPy45reqIWZBsNsGD9VO3xovcV8aWjf6mWE57XqZfaddMOvzBcOw+W9v17b3D6k9/Qs=</diagram></mxfile>
diff --git a/source/includes/figures/connect-data-flow.png b/source/includes/figures/connect-data-flow.png
diff --git a/source/includes/figures/mongo-kafka-connect.png b/source/includes/figures/mongo-kafka-connect.png
diff --git a/source/introduction/kafka-connect.txt b/source/introduction/kafka-connect.txt
@@ -2,4 +2,90 @@
 Kafka and Kafka Connect
 =======================
 
-asdf
+.. default-domain:: mongodb
+
+.. contents:: On this page
+   :local:
+   :backlinks: none
+   :depth: 1
+   :class: singlecol
+
+Overview
+~~~~~~~~
+
+In this guide, you can learn the following foundational information about Apache
+Kafka and Kafka Connect: 
+
+- What Apache Kafka and Kafka Connect are
+- What problems Apache Kafka and Kafka Connect solve
+- Why Apache Kafka and Kafka Connect are useful
+- How data moves through an Apache Kafka and Kafka Connect pipeline
+
+Apache Kafka
+~~~~~~~~~~~~
+
+Apache Kafka is an open-source publish/subscribe messaging system. Apache Kafka
+provides a flexible, **fault tolerant**, and **horizontally scalable** system to
+move data between datastores and applications. A system is fault tolerant
+if it can continue operating even when some of its components stop
+working. A system is horizontally scalable if it can be
+expanded to handle larger workloads by adding more machines rather than by
+improving a machine's hardware.
+
+For more information on Apache Kafka, see the following resources:
+
+- `Confluent "What is Apache Kafka?" Page <https://www.confluent.io/what-is-apache-kafka/>`__
+- `Apache Kafka Official Documentation <https://kafka.apache.org/documentation/>`__
+
+Kafka Connect
+~~~~~~~~~~~~~
+
+Kafka Connect is a component of Apache Kafka that solves the problem of
+connecting Apache Kafka to datastores such as MongoDB. Kafka Connect solves this
+problem by providing the following resources:
+
+- A fault-tolerant runtime for transferring data to and from datastores.
+- A framework for the Apache Kafka community to share solutions for
+  connecting Apache Kafka to different datastores.
+
+The Kafka Connect framework defines an API for developers to write reusable
+**connectors**. Connectors enable Kafka Connect deployments
+to interact with a specific datastore as a data source or a data sink. The
+MongoDB Kafka Connector is one of these connectors.
+
+For more information on Kafka Connect, see the following resources:
+
+- `Confluent Kafka Connect Page <https://docs.confluent.io/platform/current/connect/index.html>`__
+- `Apache Kafka Official Documentation, Kafka Connect Guide <https://kafka.apache.org/documentation/#connect>`__
+- `Apache Foundation Video Walk-Through of the Kafka Connect Framework <https://www.youtube.com/watch?v=EXviLqXFoQI>`__
+
+.. tip:: Use Kafka Connect instead of Producer/Consumer Clients when Connecting to Datastores
+
+   While you could write your own application to connect Apache Kafka to a
+   specific datastore using producer and consumer clients, Kafka Connect may be
+   a better fit for you. Here are some reasons to use Kafka Connect:
+   
+   - Kafka Connect has a fault-tolerant, distributed architecture to ensure a
+     reliable pipeline.
+   - Many community-maintained connectors built on the Kafka Connect framework
+     connect Apache Kafka to popular datastores such as MongoDB, PostgreSQL, and
+     MySQL. Using one of them reduces the amount of boilerplate code you need to
+     write and maintain to manage database connections, error handling,
+     dead-letter queue integration, and other problems involved in connecting Apache Kafka
+     with a datastore.
+   - You have the option to use a managed Kafka Connect cluster from Confluent.
+
+Diagram
+~~~~~~~
+
+The following diagram shows how information flows through an example data pipeline
+built with Apache Kafka and Kafka Connect. The example pipeline uses a MongoDB
+cluster as a data source, and a MongoDB cluster as a data sink.
+
+<TODO: Update the image to version that has gone through design department>
+
+.. figure:: /includes/figures/connect-data-flow.png
+   :alt: Dataflow diagram of Kafka Connect deployment. 
+
+All connectors and datastores in the example pipeline are optional, and you can
+swap them out for whatever connectors and datastores you need for your deployment.
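
As an illustration of the example pipeline described in the Diagram section above (a MongoDB source and a MongoDB sink), and not part of the committed page, the following minimal sketch builds the two connector definitions as JSON request bodies of the kind the Kafka Connect REST API accepts. The connector names, URIs, database names, and collection names are hypothetical placeholders. The property names other than ``connection.uri`` and the ``<prefix>.<database>.<collection>`` topic naming are assumptions about the MongoDB Kafka Connector defaults, so check them against the configuration reference before relying on them.

```python
import json

# Hypothetical placeholder values, not taken from the documentation page.
SOURCE_URI = "mongodb://source-host:27017"
SINK_URI = "mongodb://sink-host:27017"


def source_connector_config(uri, database, collection, topic_prefix):
    """Build a request body describing a MongoDB source connector."""
    return {
        "name": "mongo-source-example",  # hypothetical connector name
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
            "connection.uri": uri,
            "database": database,
            "collection": collection,
            "topic.prefix": topic_prefix,
        },
    }


def sink_connector_config(uri, topic, database, collection):
    """Build a request body describing a MongoDB sink connector."""
    return {
        "name": "mongo-sink-example",  # hypothetical connector name
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "connection.uri": uri,
            "topics": topic,
            "database": database,
            "collection": collection,
        },
    }


def source_topic(topic_prefix, database, collection):
    """Topic the source connector is assumed to publish to by default."""
    return f"{topic_prefix}.{database}.{collection}"


if __name__ == "__main__":
    topic = source_topic("mongo", "inventory", "orders")
    source = source_connector_config(SOURCE_URI, "inventory", "orders", "mongo")
    sink = sink_connector_config(SINK_URI, topic, "reporting", "orders_copy")
    # Each body could be sent to a Kafka Connect worker with POST /connectors.
    print(json.dumps(source, indent=2))
    print(json.dumps(sink, indent=2))
```

The sink subscribes to the same topic the source is expected to write to, which is what ties the two halves of the pipeline together. Swapping either datastore means replacing only the corresponding connector definition.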
diff --git a/source/kafka-connection-mongodb.txt b/source/kafka-connection-mongodb.txt
@@ -81,7 +81,7 @@ connection URI in the ``connection.uri`` setting. Refer to the configuration
 guides for more detail:
 
 - :doc:`Sink Configuration Properties </kafka-sink-properties>`
-- :ref:`Source Connector Configuration Properties <source-connector-configuration-properties>`
+- (TODO Fix Broken Link) Source Connector Configuration Properties <source-connector-configuration-properties>
 
 For more information on how to build your connection URI, navigate
 to the :guilabel:`Authentication` section in the
diff --git a/source/kafka-sink-cdc.txt b/source/kafka-sink-cdc.txt
@@ -37,7 +37,7 @@ a Kafka topic, update your configuration to include the following:
 
 The ``ChangeStreamHandler`` class instructs the sink connector to process
 change events that are in the :manual:`change stream response document format </reference/change-events/#change-stream-output>`.
-You can use a :doc:`MongoDB Kafka source connector </kafka-source>` to
+You can use a (TODO: FIX LINK) (MongoDB Kafka source connector </kafka-source>) to
 configure the change stream data that you want to publish to specific topics.
 
 Remember to specify the topic and the destination in the following
