wiki-common is a shared Maven library that provides the data model classes used by both wiki-change-stream and opensearch-wiki-indexer. It contains no application logic and no entry point; its sole purpose is to define the Java types that carry data from the Wikipedia API through Kafka to OpenSearch.

Classes

Change

Represents a single Wikipedia page edit event. Fields: type, title, pageId, tags. Used as the Kafka message value and the OpenSearch document body.

Query

Top-level deserialization target for the Wikipedia API JSON response. Holds a single RecentChanges field named query.

RecentChanges

Holds the list of Change objects extracted from the recentchanges array in the Wikipedia API response.

Classes Reference

Change

Package: Wikicommon.models
Annotations: @Data (Lombok)
Represents a single Wikipedia page edit event returned by the Recent Changes API and carried through the entire pipeline.
Change.java
package Wikicommon.models;

import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.Data;

import java.util.List;

@Data
public class Change {

    private String type;
    private String title;
    @JsonProperty("pageid")
    private Long pageId;
    private List<String> tags;
}

Fields:
type (String): The type of change (e.g. edit, new).
title (String): The title of the Wikipedia page that was changed. Used as the OpenSearch document ID.
pageId (Long): The unique Wikipedia page ID. Deserialized from the pageid JSON field.
tags (List<String>): A list of tags applied to this change (e.g. mobile edit, visualeditor).
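
For readers less familiar with Lombok, the sketch below spells out by hand roughly what @Data contributes to a class shaped like Change: a getter and setter per field, value-based equals and hashCode, and a readable toString. PlainChange is a hypothetical name used only for illustration and is not part of wiki-common; the code Lombok actually generates differs in detail (for example, it computes hashCode differently and adds a canEqual method).

```java
import java.util.List;
import java.util.Objects;

// Hypothetical hand-written counterpart of Change, for illustration only.
// Lombok's @Data generates members along these lines at compile time.
class PlainChange {
    private String type;
    private String title;
    private Long pageId;
    private List<String> tags;

    public String getType() { return type; }
    public void setType(String type) { this.type = type; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public Long getPageId() { return pageId; }
    public void setPageId(Long pageId) { this.pageId = pageId; }
    public List<String> getTags() { return tags; }
    public void setTags(List<String> tags) { this.tags = tags; }

    // Value-based equality over all four fields.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainChange)) return false;
        PlainChange other = (PlainChange) o;
        return Objects.equals(type, other.type)
                && Objects.equals(title, other.title)
                && Objects.equals(pageId, other.pageId)
                && Objects.equals(tags, other.tags);
    }

    // Consistent with equals: equal objects produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(type, title, pageId, tags);
    }

    @Override
    public String toString() {
        return "PlainChange(type=" + type + ", title=" + title
                + ", pageId=" + pageId + ", tags=" + tags + ")";
    }
}
```

Because equality is value-based, two instances populated with the same type, title, pageId, and tags compare as equal, which is convenient when asserting on deserialized events in tests.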

Query

Package: Wikicommon.models
Annotations: @Data (Lombok)
Top-level wrapper that mirrors the outer JSON object returned by the Wikipedia API. WikipediaClient deserializes the full API response into this class.
Query.java
package Wikicommon.models;

import lombok.Data;

@Data
public class Query {
    private RecentChanges query;
}

Fields:
query (RecentChanges): Nested object containing the recentchanges array from the API response.

The response object hierarchy maps to the Wikipedia API JSON structure as follows:
Wikipedia API response (abbreviated)
{
  "query": {
    "recentchanges": [
      { "type": "edit", "title": "Example", "pageid": 12345, "tags": [] }
    ]
  }
}
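
The nesting above means that consuming code has to step through two wrapper levels before it reaches the individual changes. The sketch below walks that path defensively. It assumes that either wrapper level may be null (Jackson leaves a field null when the corresponding JSON property is absent), and it uses hypothetical stand-in classes (QueryStub, RecentChangesStub, ChangeStub) plus a hypothetical helper (ResponseWalker) so that it compiles without wiki-common, Lombok, or Jackson on the classpath. The getter names mirror the ones Lombok would generate for the real classes.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-ins mirroring the getter names Lombok would generate
// for Change, RecentChanges and Query; they are not the wiki-common classes.
class ChangeStub {
    private final String title;
    ChangeStub(String title) { this.title = title; }
    String getTitle() { return title; }
}

class RecentChangesStub {
    private final List<ChangeStub> recentChanges;
    RecentChangesStub(List<ChangeStub> recentChanges) { this.recentChanges = recentChanges; }
    List<ChangeStub> getRecentChanges() { return recentChanges; }
}

class QueryStub {
    private final RecentChangesStub query;
    QueryStub(RecentChangesStub query) { this.query = query; }
    RecentChangesStub getQuery() { return query; }
}

// Hypothetical helper, not part of the project.
class ResponseWalker {
    // Returns the titles found under query.recentchanges, or an empty list
    // when the response or either wrapper level is missing.
    static List<String> titles(QueryStub response) {
        if (response == null
                || response.getQuery() == null
                || response.getQuery().getRecentChanges() == null) {
            return Collections.emptyList();
        }
        return response.getQuery().getRecentChanges().stream()
                .map(ChangeStub::getTitle)
                .collect(Collectors.toList());
    }
}
```

With the real classes the same traversal would read response.getQuery().getRecentChanges(). Whether the null checks are needed in practice depends on how the caller handles error responses from the API.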

RecentChanges

Package: Wikicommon.models
Annotations: @Data (Lombok)
Intermediate wrapper that holds the list of Change objects deserialized from the recentchanges JSON array.
RecentChanges.java
package Wikicommon.models;

import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.Data;

import java.util.List;

@Data
public class RecentChanges {
    @JsonProperty("recentchanges")
    List<Change> recentChanges;
}

Fields:
recentChanges (List<Change>): The list of recent Wikipedia page change events. Deserialized from the recentchanges JSON array.

Maven Artifact

Group ID: com.as
Artifact ID: wiki-common
Version: 0.0.1-SNAPSHOT

Adding as a dependency

Add the following to your module’s pom.xml:
pom.xml
<dependency>
    <groupId>com.as</groupId>
    <artifactId>wiki-common</artifactId>
    <version>0.0.1-SNAPSHOT</version>
</dependency>
wiki-common sets <skip>true</skip> on the Spring Boot Maven plugin. This prevents the module from being repackaged as an executable JAR, so it remains a plain library JAR that other modules consume as a dependency.
wiki-common/pom.xml
<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
            <configuration>
                <skip>true</skip>
            </configuration>
        </plugin>
    </plugins>
</build>

wiki-change-stream

Polls the Wikipedia API and publishes Change objects to Kafka

opensearch-wiki-indexer

Consumes Change objects from Kafka and indexes them into OpenSearch
