Apache Wayang is distributed through Maven Central. You always need wayang-core plus one module per execution platform you want available. Everything else — the fluent API, the profiler, additional platform adapters — is opt-in. This page covers Maven Central dependencies, snapshot builds, building from source, runtime requirements, Java 17 JVM flags, and PATH setup.
Maven dependencies
Replace WAYANG_VERSION with the latest release version. Check Maven Central for the current stable version.
Core and API modules
These three artifacts are the foundation for almost every Wayang project:
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-core</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-basic</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-api-scala-java</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
Platform modules
Add one dependency per execution engine you want Wayang to consider. Each artifact ships the translator that maps Wayang’s logical operators onto that engine’s native API:
<!-- Local execution via Java Streams (development, small data) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-java</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<!-- Apache Spark (large-scale batch processing) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-spark</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<!-- Apache Flink (stream and batch processing) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-flink</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<!-- PostgreSQL (SQL-capable relational data) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-postgres</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<!-- SQLite (lightweight embedded SQL) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-sqlite3</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<!-- Apache Giraph (graph processing) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-giraph</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
<!-- TensorFlow (machine learning workloads) -->
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-tensorflow</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
Optional: profiler module
The profiler learns operator and UDF cost functions from historical executions, improving the optimizer’s placement decisions over time:
<dependency>
<groupId>org.apache.wayang</groupId>
<artifactId>wayang-profiler</artifactId>
<version>WAYANG_VERSION</version>
</dependency>
Module reference
| Module | Purpose | Required? |
|---|---|---|
| wayang-core | Core data structures and the cost-based optimizer | Yes |
| wayang-basic | Common operators (flatMap, reduceByKey, etc.) and data types | Recommended |
| wayang-api-scala-java | Fluent Java/Scala builder API (JavaPlanBuilder, PlanBuilder) | Recommended |
| wayang-java | Java Streams platform adapter | One platform minimum |
| wayang-spark | Apache Spark platform adapter | Optional |
| wayang-flink | Apache Flink platform adapter | Optional |
| wayang-postgres | PostgreSQL platform adapter | Optional |
| wayang-sqlite3 | SQLite platform adapter | Optional |
| wayang-giraph | Apache Giraph adapter | Optional |
| wayang-tensorflow | TensorFlow platform adapter | Optional |
| wayang-profiler | Learns cost functions from execution history | Optional |
Snapshot builds
Pre-release (snapshot) builds are published to Apache’s snapshot repository. Add this repository block to your pom.xml to resolve them:
<repositories>
<repository>
<id>apache-snapshots</id>
<name>Apache Foundation Snapshot Repository</name>
<url>https://repository.apache.org/content/repositories/snapshots</url>
</repository>
</repositories>
The current snapshot version is tracked in the root pom.xml of the repository.
Build from source
If you want to build Wayang yourself — to use the latest unreleased code or to contribute — clone the repository and build with the Maven wrapper:
git clone https://github.com/apache/wayang.git
cd wayang
./mvnw clean install -DskipTests
To run the full test suite, run the same build without the -DskipTests flag.
To build the distributable assembly (the tar.gz that contains wayang-submit):
./mvnw clean package -pl :wayang-assembly -Pdistribution
The archive appears under wayang-assembly/target/.
Runtime requirements
| Requirement | Version | Environment variable |
|---|---|---|
| Java | 17 | JAVA_HOME |
| Apache Spark (with Scala 2.12) | 3.4.4 | SPARK_HOME |
| Apache Hadoop | 3+ | HADOOP_HOME |
Spark and Hadoop are needed only if you register the corresponding platform adapters. A project that uses only wayang-java needs nothing beyond Java 17.
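The requirements above can be checked before the first run. The following is a minimal sketch, not something Wayang ships: java_major_version and check_env_vars are hypothetical helper names, and the parsing assumes the usual `version "..."` format printed by `java -version`.

```shell
# Sketch: helpers for checking the runtime requirements listed above.
# Both function names are hypothetical; they are not part of Wayang.

# Extract the major version from the first line of `java -version` output,
# e.g. 'openjdk version "17.0.9" 2023-10-17' -> 17
#      'java version "1.8.0_392"'            -> 8
java_major_version() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.} ;;   # legacy scheme: 1.8.0_x means Java 8
  esac
  # Keep everything before the first '.', '_' or '-'
  printf '%s\n' "${ver%%[._-]*}"
}

# Report each named environment variable that is unset or empty.
# Returns non-zero if at least one is missing.
check_env_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible invocation is `java_major_version "$(java -version 2>&1 | head -n 1)"` followed by `check_env_vars JAVA_HOME SPARK_HOME HADOOP_HOME`. The redirect is needed because `java -version` writes to stderr. For a Java-only setup, pass just JAVA_HOME.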
Java 17 JVM flags
Running Wayang on Java 17 — especially with the Spark adapter — requires opening several internal Java modules. Without these flags you will hit IllegalAccessError at runtime. Edit your wayang-submit script (located at wayang-assembly/target/wayang-WAYANG_VERSION/bin/wayang-submit) so the runner invocation includes all of the following flags:
eval "$RUNNER \
--add-exports=java.base/sun.nio.ch=ALL-UNNAMED \
--add-opens=java.base/java.nio=ALL-UNNAMED \
--add-opens=java.base/java.lang=ALL-UNNAMED \
--add-opens=java.base/java.util=ALL-UNNAMED \
--add-opens=java.base/java.io=ALL-UNNAMED \
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
--add-opens=java.base/java.net=ALL-UNNAMED \
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
$FLAGS -cp \"${WAYANG_CLASSPATH}\" $CLASS ${ARGS}"
If you invoke Wayang directly from your own launcher (e.g., java -cp ...), pass the same flags on the java command line.
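For a custom launcher, one way to avoid retyping the flags is to generate them from a small function. This is a sketch under stated assumptions: wayang_jvm_flags is a hypothetical helper name, and the package list simply mirrors the flags shown above.

```shell
# Sketch: print the module flags listed above, one per line,
# so a custom launcher can reuse them.
# wayang_jvm_flags is a hypothetical helper, not part of Wayang.
wayang_jvm_flags() {
  echo "--add-exports=java.base/sun.nio.ch=ALL-UNNAMED"
  for pkg in java.nio java.lang java.util java.io java.lang.reflect \
             java.util.concurrent java.net java.lang.invoke; do
    echo "--add-opens=java.base/${pkg}=ALL-UNNAMED"
  done
}
```

It could then be used as `java $(wayang_jvm_flags) -cp "$APP_CLASSPATH" com.example.Main`, where APP_CLASSPATH and com.example.Main are placeholders for your own classpath and main class. The command substitution is left unquoted on purpose so that each flag becomes a separate argument.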
PATH setup
After building or downloading the Wayang assembly, extract the archive and add the bin/ directory to your PATH so wayang-submit is available system-wide.
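tar -xvf wayang-WAYANG_VERSION.tar.gz
cd wayang-WAYANG_VERSION
# Bash users: the single quotes defer ${PATH} and ${WAYANG_HOME} expansion until the profile is sourced
echo "export WAYANG_HOME=$(pwd)" >> ~/.bashrc
echo 'export PATH="${PATH}:${WAYANG_HOME}/bin"' >> ~/.bashrc
source ~/.bashrc
# Zsh users: run these instead of the Bash lines above
echo "export WAYANG_HOME=$(pwd)" >> ~/.zshrc
echo 'export PATH="${PATH}:${WAYANG_HOME}/bin"' >> ~/.zshrc
source ~/.zshrc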
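Running the echo commands more than once appends duplicate lines to the profile. A small guard avoids that. This is a minimal sketch: append_once is a hypothetical helper, not part of Wayang.

```shell
# Sketch: append a line to a shell profile only if it is not already there,
# so re-running the setup does not stack duplicate exports.
# append_once is a hypothetical helper, not something Wayang ships.
append_once() {
  line="$1"
  file="$2"
  touch "$file"   # make sure the profile exists before searching it
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, `append_once "export WAYANG_HOME=$(pwd)" ~/.bashrc` followed by `append_once 'export PATH="${PATH}:${WAYANG_HOME}/bin"' ~/.bashrc`. Afterwards, `command -v wayang-submit` in a new shell should print the script's location.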
Validate the installation
Run the bundled WordCount application against the repository’s own README to confirm everything is wired up correctly:
bin/wayang-submit org.apache.wayang.apps.wordcount.Main java file://$(pwd)/README.md
A successful run prints word-frequency pairs to stdout, for example:
[(the,42), (a,31), (wayang,28), ...]
If you see IllegalAccessError, go back and apply the Java 17 JVM flags to your wayang-submit script.