

OpenKnowledgeStream is a multi-module Java pipeline that continuously polls the Wikipedia Recent Changes API, publishes each edit event to a Kafka topic, and indexes the records into OpenSearch. The result is a live, searchable stream of Wikipedia page edits, ready for dashboards, analytics, and downstream integrations.

Introduction

Understand what OpenKnowledgeStream does and how the components fit together.

Quickstart

Stand up Kafka and OpenSearch, build the project, and see your first change indexed in under 10 minutes.

Architecture

Explore the three-module design: change stream, indexer, and shared common library.

Configuration

Configure Kafka brokers, OpenSearch endpoints, and polling intervals.

Guides

Step-by-step guides for running locally, deploying to production, and querying indexed data.

Reference

Full reference for all components, services, and data models.

How It Works

1. Poll Wikipedia

The wiki-change-stream module queries https://en.wikipedia.org/w/api.php every 5 seconds, fetching up to 100 recent page changes including title, page ID, change type, and tags.
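
To make the polling step concrete, here is a minimal sketch of one request against the Recent Changes API, using only the JDK's `java.net.http` client. The class and method names, the `rcprop` value, and the `User-Agent` string are assumptions for illustration; the module's actual request may differ.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: names are illustrative, not taken from the project.
class RecentChangesRequest {
    static final String API = "https://en.wikipedia.org/w/api.php";

    // Builds a MediaWiki "recentchanges" list query for up to `limit` changes.
    static URI buildUri(int limit) {
        // Assumption: these properties cover title, page ID, and tags.
        // The pipe separator must be percent-encoded in the query string.
        String props = URLEncoder.encode("title|ids|tags", StandardCharsets.UTF_8);
        return URI.create(API
                + "?action=query&list=recentchanges"
                + "&rcprop=" + props
                + "&rclimit=" + limit
                + "&format=json");
    }

    // Sends one poll and returns the raw JSON response body.
    static String fetch(HttpClient client, int limit) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(buildUri(limit))
                // Assumed value: a descriptive User-Agent identifying the client.
                .header("User-Agent", "OpenKnowledgeStream-example/0.1")
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Calling `RecentChangesRequest.fetch(HttpClient.newHttpClient(), 100)` from a `ScheduledExecutorService` task that runs every 5 seconds would approximate the cadence described above.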
2. Publish to Kafka

Each Change record is serialized as JSON and published to the recent_change_stream Kafka topic on localhost:9092. The producer uses Spring Kafka’s JsonSerializer.
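
The producer settings implied by this step can be sketched as a plain property map. The property keys are Kafka's standard configuration names and the serializer class names are real. The `ProducerSettings` class and the choice of a string key serializer are assumptions, since the text does not say how records are keyed.

```java
import java.util.Map;

// Hypothetical holder class: only the property keys and serializer class names are real.
class ProducerSettings {
    static final String TOPIC = "recent_change_stream";

    // Kafka producer properties keyed by their standard config names.
    static Map<String, Object> producerProps() {
        return Map.of(
                "bootstrap.servers", "localhost:9092",
                // Assumption: string keys; the source does not specify a key type.
                "key.serializer", "org.apache.kafka.common.serialization.StringSerializer",
                // Spring Kafka's JSON serializer, as named in the step above.
                "value.serializer", "org.springframework.kafka.support.serializer.JsonSerializer");
    }
}
```

With Spring Kafka on the classpath, a map like this can be passed to `DefaultKafkaProducerFactory`, which in turn backs a `KafkaTemplate` used to send each record to `ProducerSettings.TOPIC`.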
3. Consume and Index

The opensearch-wiki-indexer module polls the topic every 5 seconds, deserializes each message, and upserts it into the wiki-changes OpenSearch index using the page title as the document ID.
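
To illustrate what using the page title as the document ID means, the sketch below builds the equivalent request against OpenSearch's REST document API with the JDK HTTP client. This is not the indexer's own code, which may well use an OpenSearch client library. The endpoint `http://localhost:9200` and the class name `IndexRequestSketch` are assumptions.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: plain REST calls standing in for whatever client the indexer uses.
class IndexRequestSketch {
    // Assumption: OpenSearch on its default HTTP port; the source gives no endpoint.
    static final String OPENSEARCH = "http://localhost:9200";
    static final String INDEX = "wiki-changes";

    // Maps a page title to its document URI, percent-encoding it for use in a path.
    static URI documentUri(String title) {
        // URLEncoder emits '+' for spaces, which is only valid in query strings,
        // so swap it for %20. A literal '+' in the title is already %2B here.
        String id = URLEncoder.encode(title, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(OPENSEARCH + "/" + INDEX + "/_doc/" + id);
    }

    // PUT to /_doc/{id} creates the document or replaces an existing one with that ID.
    static HttpRequest indexRequest(String title, String json) {
        return HttpRequest.newBuilder(documentUri(title))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Because the ID is derived from the title, a later change to the same page overwrites the earlier document, so the index holds one document per page title rather than one per edit.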
4. Query and Analyze

With data flowing into OpenSearch, you can run full-text searches, filter by change type or tags, and build dashboards using OpenSearch Dashboards or any compatible tool.
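
As a rough example of such a query, the sketch below combines a full-text match with a change-type filter and targets the `_search` endpoint of the `wiki-changes` index. The field names `title` and `type`, the endpoint `http://localhost:9200`, and the class name are assumptions; the real field names depend on how the Change record is serialized and mapped.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: field names and endpoint are assumed, not taken from the project.
class SearchRequestSketch {
    // Builds a bool query: full-text match on the title, exact filter on the change type.
    // The inputs are inserted verbatim, so they must not contain quotes or backslashes.
    static String queryBody(String titleText, String changeType) {
        return """
                {
                  "query": {
                    "bool": {
                      "must":   [ { "match": { "title": "%s" } } ],
                      "filter": [ { "term":  { "type": "%s" } } ]
                    }
                  }
                }
                """.formatted(titleText, changeType);
    }

    // Wraps the query body in a POST to the index's _search endpoint.
    static HttpRequest searchRequest(String body) {
        return HttpRequest.newBuilder(URI.create("http://localhost:9200/wiki-changes/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

The same JSON body can be pasted into the OpenSearch Dashboards Dev Tools console. Whether the `term` filter matches depends on the index mapping, so a keyword sub-field may be needed in practice.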
Note: OpenKnowledgeStream requires Java 21+, Apache Kafka, and OpenSearch. Kafka and OpenSearch can run locally, or remotely if you update the connection configuration. See the Quickstart for setup instructions.
