Apache Wayang is distributed through Maven Central. You always need wayang-core plus one module per execution platform you want available. Everything else — the fluent API, the profiler, additional platform adapters — is opt-in. This page covers Maven Central dependencies, snapshot builds, building from source, runtime requirements, Java 17 JVM flags, and PATH setup.

Maven dependencies

Replace WAYANG_VERSION in the snippets below with the current stable release listed on Maven Central.
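If you paste the snippets from this page into a local POM unchanged, the placeholder can be filled in with a single substitution. The following is a minimal sketch under that assumption; `set_wayang_version` is a hypothetical helper name, not something Wayang ships.

```shell
# Hypothetical helper (not part of Wayang): replace every WAYANG_VERSION
# placeholder in a POM file with a concrete version string.
# Usage: set_wayang_version <version> [pom-file]
set_wayang_version() (
  version="$1"
  pom="${2:-pom.xml}"
  if [ -z "$version" ]; then
    echo "usage: set_wayang_version <version> [pom-file]" >&2
    return 1
  fi
  # Write to a sibling temp file first so a failed sed leaves the POM untouched.
  if sed "s/WAYANG_VERSION/${version}/g" "$pom" > "${pom}.tmp"; then
    mv "${pom}.tmp" "$pom"
  else
    rm -f "${pom}.tmp"
    return 1
  fi
)
```

Call it as `set_wayang_version "$VERSION" pom.xml`, with `VERSION` set to whatever Maven Central lists as the current release.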

Core and API modules

These three artifacts are the foundation for almost every Wayang project:
pom.xml
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-core</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-basic</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-api-scala-java</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

Platform adapter modules

Add one dependency per execution engine you want Wayang to consider. Each artifact ships the translator that maps Wayang’s logical operators onto that engine’s native API:
pom.xml
<!-- Local execution via Java Streams (development, small data) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-java</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

<!-- Apache Spark (large-scale batch processing) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-spark</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

<!-- Apache Flink (stream and batch processing) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-flink</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

<!-- PostgreSQL (SQL-capable relational data) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-postgres</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

<!-- SQLite (lightweight embedded SQL) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-sqlite3</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

<!-- Apache Giraph (graph processing) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-giraph</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

<!-- TensorFlow (machine learning workloads) -->
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-tensorflow</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

Optional: profiler module

The profiler learns operator and UDF cost functions from historical executions, improving the optimizer’s placement decisions over time:
pom.xml
<dependency>
  <groupId>org.apache.wayang</groupId>
  <artifactId>wayang-profiler</artifactId>
  <version>WAYANG_VERSION</version>
</dependency>

Module reference

| Module | Purpose | Required? |
| --- | --- | --- |
| wayang-core | Core data structures and the cost-based optimizer | Yes |
| wayang-basic | Common operators (flatMap, reduceByKey, etc.) and data types | Recommended |
| wayang-api-scala-java | Fluent Java/Scala builder API (JavaPlanBuilder, PlanBuilder) | Recommended |
| wayang-java | Java Streams platform adapter | One platform minimum |
| wayang-spark | Apache Spark platform adapter | Optional |
| wayang-flink | Apache Flink platform adapter | Optional |
| wayang-postgres | PostgreSQL platform adapter | Optional |
| wayang-sqlite3 | SQLite platform adapter | Optional |
| wayang-giraph | Apache Giraph adapter | Optional |
| wayang-tensorflow | TensorFlow platform adapter | Optional |
| wayang-profiler | Learns cost functions from execution history | Optional |

Snapshot builds

Pre-release (snapshot) builds are published to Apache’s snapshot repository. Add this repository block to your pom.xml to resolve them:
pom.xml
<repositories>
  <repository>
    <id>apache-snapshots</id>
    <name>Apache Foundation Snapshot Repository</name>
    <url>https://repository.apache.org/content/repositories/snapshots</url>
  </repository>
</repositories>
The current snapshot version is tracked in the root pom.xml of the repository.

Build from source

If you want to build Wayang yourself — to use the latest unreleased code or to contribute — clone the repository and build with the Maven wrapper:
git clone https://github.com/apache/wayang.git
cd wayang
./mvnw clean install -DskipTests
To run the full test suite:
./mvnw test
To build the distributable assembly (the tar.gz that contains wayang-submit):
./mvnw clean package -pl :wayang-assembly -Pdistribution
The archive appears under wayang-assembly/target/.

Runtime requirements

| Requirement | Version | Environment variable |
| --- | --- | --- |
| Java | 17 | JAVA_HOME |
| Apache Spark (with Scala 2.12) | 3.4.4 | SPARK_HOME |
| Apache Hadoop | 3+ | HADOOP_HOME |
Spark and Hadoop are needed only if you register the corresponding platform adapters. A project that uses only wayang-java needs nothing beyond Java 17.
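A quick way to confirm the Java requirement before running anything is to parse the output of `java -version`. The sketch below is illustrative only: both function names are hypothetical, and it assumes the JVM prints its version in double quotes on a line containing the word "version", as OpenJDK builds do.

```shell
# Extract the major version from a Java version string.
# Handles the modern scheme ("17.0.9" gives 17) and the legacy one
# ("1.8.0_292" gives 8), plus early-access strings such as "17-ea".
java_major_version() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Succeed only when the selected JVM reports major version 17.
check_java_17() (
  # Prefer the JVM under JAVA_HOME when it is set, else the one on PATH.
  java_bin="${JAVA_HOME:+$JAVA_HOME/bin/}java"
  raw="$("$java_bin" -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
  [ "$(java_major_version "$raw")" = "17" ]
)
```

For example, `check_java_17 || echo "Java 17 not found" >&2` reports a mismatch. When you plan to register the Spark adapter, a similar one-line test such as `[ -n "${SPARK_HOME:-}" ]` covers the environment variable from the table.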

Java 17 JVM flags

Running Wayang on Java 17 — especially with the Spark adapter — requires opening several internal Java modules. Without these flags you will hit IllegalAccessError at runtime. Edit your wayang-submit script (located at wayang-assembly/target/wayang-WAYANG_VERSION/bin/wayang-submit) so the runner invocation includes all of the following flags:
wayang-submit (excerpt)
eval "$RUNNER \
  --add-exports=java.base/sun.nio.ch=ALL-UNNAMED \
  --add-opens=java.base/java.nio=ALL-UNNAMED \
  --add-opens=java.base/java.lang=ALL-UNNAMED \
  --add-opens=java.base/java.util=ALL-UNNAMED \
  --add-opens=java.base/java.io=ALL-UNNAMED \
  --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
  --add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
  --add-opens=java.base/java.net=ALL-UNNAMED \
  --add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
  $FLAGS -cp \"${WAYANG_CLASSPATH}\" $CLASS ${ARGS}"
If you invoke Wayang directly from your own launcher (e.g., java -cp ...), pass the same flags on the java command line.
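For a custom launcher, one option is to keep the flag list in a single place and splice it into the `java` command. This is a sketch rather than the project's own tooling: the function names are hypothetical, and the classpath and main class are whatever your application uses. The flags themselves are copied from the wayang-submit excerpt above.

```shell
# Print the module flags from the wayang-submit excerpt, one per line.
wayang_jvm_flags() {
  cat <<'EOF'
--add-exports=java.base/sun.nio.ch=ALL-UNNAMED
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/java.io=ALL-UNNAMED
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED
--add-opens=java.base/java.net=ALL-UNNAMED
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED
EOF
}

# Hypothetical launcher: run a main class with the flags above.
# Usage: run_with_wayang_flags <classpath> <main-class> [args...]
run_with_wayang_flags() (
  classpath="$1"
  main_class="$2"
  shift 2
  # Word splitting on the flag list is intentional: no flag contains
  # spaces or glob characters.
  java $(wayang_jvm_flags) -cp "$classpath" "$main_class" "$@"
)
```

Keeping the list in one function means a flag added to wayang-submit later only has to be mirrored in one spot of your own launcher.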

PATH setup

After building or downloading the Wayang assembly, extract the archive and add its bin/ directory to your PATH so that wayang-submit can be run from any directory.
tar -xvf wayang-WAYANG_VERSION.tar.gz
cd wayang-WAYANG_VERSION
echo "export WAYANG_HOME=$(pwd)" >> ~/.bashrc
echo 'export PATH="${PATH}:${WAYANG_HOME}/bin"' >> ~/.bashrc
source ~/.bashrc
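Running the `echo` commands more than once leaves duplicate lines in ~/.bashrc. If that matters to you, a small guard can append each line only when it is missing. The helper below is a hypothetical sketch, not part of the Wayang distribution.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
# Usage: append_once <line> <file>
append_once() (
  line="$1"
  file="$2"
  # -x matches whole lines, -F treats the line as a fixed string.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
)

# Example, run from inside the extracted wayang-WAYANG_VERSION directory:
#   append_once "export WAYANG_HOME=$(pwd)" ~/.bashrc
#   append_once 'export PATH="${PATH}:${WAYANG_HOME}/bin"' ~/.bashrc
```

The single quotes around the PATH line keep `${WAYANG_HOME}` unexpanded until ~/.bashrc is sourced, at which point the variable has been defined by the preceding line. Afterwards, `command -v wayang-submit` should print the script's location.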

Validate the installation

Run the bundled WordCount application against the repository’s own README to confirm everything is wired up correctly:
bin/wayang-submit org.apache.wayang.apps.wordcount.Main java file://$(pwd)/README.md
A successful run prints word-frequency pairs to stdout, for example:
[(the,42), (a,31), (wayang,28), ...]
If you see IllegalAccessError, go back and apply the Java 17 JVM flags to your wayang-submit script.
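When the output is long, it can help to capture it in a file and search for the error instead of reading it by eye. A minimal sketch, assuming you redirect the run into a log file of your choosing; `has_illegal_access_error` is a hypothetical name.

```shell
# Hypothetical helper: succeed when a captured log mentions IllegalAccessError.
# Usage: has_illegal_access_error <log-file>
has_illegal_access_error() {
  grep -q 'IllegalAccessError' "$1"
}

# Example:
#   bin/wayang-submit org.apache.wayang.apps.wordcount.Main java \
#     "file://$(pwd)/README.md" > run.log 2>&1
#   has_illegal_access_error run.log && echo "apply the Java 17 JVM flags" >&2
```

The check only detects this one failure mode; a log without the error string does not by itself prove the run succeeded.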
