OpenKnowledgeStream is structured as a Maven multi-module project with the root artifact
com.as:OpenKnowledgeStream (version 0.0.1-SNAPSHOT, Spring Boot 4.1.0, Java 21). It is composed of three modules, each with a distinct responsibility: polling the Wikipedia API, indexing events into OpenSearch, and providing the shared data model. The two runnable modules are packaged as independent Spring Boot applications, both annotated with @EnableScheduling to drive their respective timed polling loops.
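The timed polling loops mentioned above can be pictured with a plain-JDK stand-in. This is only a sketch: it uses `ScheduledExecutorService` instead of Spring's scheduling annotations so that it runs without Spring on the classpath. The class name `FixedRatePollingSketch` and the helper `pollAtFixedRate` are hypothetical and do not come from the project.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; the real modules rely on Spring scheduling instead.
public class FixedRatePollingSketch {

    /**
     * Runs the task at a fixed rate until it has executed the requested number
     * of times, then stops the scheduler. Returns true if all runs completed.
     */
    static boolean pollAtFixedRate(Runnable task, long periodMillis, int runs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(runs);
        try {
            scheduler.scheduleAtFixedRate(() -> {
                // Guard so the task never runs more often than requested.
                if (remaining.getCount() > 0) {
                    task.run();
                    remaining.countDown();
                }
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
            // Generous timeout: the expected duration plus a five-second margin.
            return remaining.await(periodMillis * runs + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A 5000 ms period mirrors the five-second polling interval used by the modules.
        boolean done = pollAtFixedRate(() -> System.out.println("poll tick"), 5_000, 2);
        System.out.println("completed: " + done);
    }
}
```

As in any fixed-rate schedule, the period is measured between start times, so a slow poll shortens the idle gap before the next one rather than delaying it.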

wiki-change-stream
Spring Boot application and pipeline entry point. The OpenStream service fires every 5 seconds via @Scheduled(fixedRate = 5000), calling WikipediaClient (a reactive Spring WebFlux WebClient) to fetch the latest 100 recent changes from the Wikipedia API (https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&format=json&rclimit=100&rcprop=title|tags|ids). Each Change object is then published individually to the Kafka topic recent_change_stream via KafkaPublish.

opensearch-wiki-indexer
Spring Boot application and pipeline sink. KafkaConsume polls the recent_change_stream topic every 5 seconds using a plain KafkaConsumer in the wiki-indexer consumer group. For each consumed Change, OpensearchIndexer.index() upserts the document into the wiki-changes OpenSearch index, using the page title as the document ID.

wiki-common
Shared library with no runnable main class. Provides the Change, Query, and RecentChanges Lombok @Data models used by both wiki-change-stream and opensearch-wiki-indexer. Packaged as a plain JAR (Spring Boot Maven plugin is skipped) and referenced as a local dependency by the other two modules.

Module dependency graph
The inter-module dependencies declared in the child pom.xml files are:

- wiki-change-stream depends on both opensearch-wiki-indexer and wiki-common. It pulls in the indexer module so that OpensearchIndexer and KafkaConsume are available on the classpath and component-scanned at runtime.
- opensearch-wiki-indexer depends on wiki-common. It references the shared Change model for deserialization and indexing.
- wiki-common has no intra-project dependencies. It is the base of the dependency tree.
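
The dependency list above implies a build order in which wiki-common comes first and wiki-change-stream last. The following sketch encodes the three modules as a small graph and derives that order with a depth-first walk. The class `ModuleGraphSketch` and its methods are hypothetical and exist only for illustration. The sketch assumes the graph is acyclic, which holds for the dependencies listed here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the module graph described on this page.
public class ModuleGraphSketch {

    /** Module name mapped to the modules it depends on. */
    static Map<String, List<String>> dependencies() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("wiki-change-stream", List.of("opensearch-wiki-indexer", "wiki-common"));
        deps.put("opensearch-wiki-indexer", List.of("wiki-common"));
        deps.put("wiki-common", List.of());
        return deps;
    }

    /** Depth-first post-order walk: each module is listed after its dependencies. */
    static List<String> buildOrder(Map<String, List<String>> deps) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String module : deps.keySet()) {
            visit(module, deps, ordered);
        }
        return new ArrayList<>(ordered);
    }

    private static void visit(String module, Map<String, List<String>> deps, Set<String> ordered) {
        if (ordered.contains(module)) {
            return; // already placed
        }
        for (String dependency : deps.getOrDefault(module, List.of())) {
            visit(dependency, deps, ordered);
        }
        ordered.add(module);
    }

    public static void main(String[] args) {
        // Prints [wiki-common, opensearch-wiki-indexer, wiki-change-stream]
        System.out.println(buildOrder(dependencies()));
    }
}
```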
opensearch-wiki-indexer is more than a compile-time dependency of wiki-change-stream: its beans (KafkaConsume, OpensearchIndexer) are also component-scanned by OpenKnowledgeStreamApplication via @ComponentScan({"com.as", "WikiIndexer", "WikiIndexer.models", "Wikicommon", "WikiChangeStream"}). Both modules therefore run inside a single JVM when wiki-change-stream is launched.

Shared infrastructure
The runnable modules share the following infrastructure. Kafka and OpenSearch are expected on localhost at their default ports:

| Service | Address | Used by |
|---|---|---|
| Apache Kafka | localhost:9092 | KafkaPublish, KafkaConsume |
| OpenSearch | localhost:9200 | OpensearchIndexer |
| Wikipedia API (MediaWiki Action API) | https://en.wikipedia.org | WikipediaClient |
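
To tie the components in the table together, the flow from publisher to index can be sketched with in-memory stand-ins. This is not the project's code: a queue plays the role of the recent_change_stream topic, a map plays the role of the wiki-changes index, and the `Change` record here is a reduced, hypothetical version of the shared model (the `revisionId` field is an assumption for illustration). The point it shows is the consequence of using the page title as the document ID: a later change to the same page replaces the earlier document.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical in-memory model of the publish, consume, and index steps.
public class PipelineSketch {

    /** Reduced stand-in for the shared Change model; fields are assumptions. */
    record Change(String title, long revisionId) {}

    /** Stand-in for the recent_change_stream Kafka topic. */
    private final Queue<Change> topic = new ArrayDeque<>();

    /** Stand-in for the wiki-changes index: document ID mapped to document. */
    private final Map<String, Change> index = new LinkedHashMap<>();

    /** Producer side: each polled change is published individually. */
    void publish(List<Change> polled) {
        polled.forEach(topic::add);
    }

    /** Consumer side: drain the topic and upsert each change by page title. */
    int consumeAndIndex() {
        int consumed = 0;
        Change change;
        while ((change = topic.poll()) != null) {
            index.put(change.title(), change); // same title overwrites the earlier entry
            consumed++;
        }
        return consumed;
    }

    Map<String, Change> index() {
        return index;
    }

    public static void main(String[] args) {
        PipelineSketch pipeline = new PipelineSketch();
        pipeline.publish(List.of(
                new Change("Java", 1L), new Change("Kafka", 2L), new Change("Java", 3L)));
        int consumed = pipeline.consumeAndIndex();
        // Three changes consumed, but only two documents remain in the index.
        System.out.println(consumed + " consumed, " + pipeline.index().size() + " indexed");
    }
}
```

Keying by title keeps the index at one document per page. If a full history of changes were wanted instead, a per-change identifier would be needed as the document ID.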