Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/HarvardPL/AbcDatalog/llms.txt

Use this file to discover all available pages before exploring further.

This quickstart walks you through the complete lifecycle of a Datalog evaluation using AbcDatalog’s Java API: writing a program that computes the transitive closure of a directed graph, parsing it, initialising an evaluation engine, and running queries against the derived facts. The example below is drawn directly from EngineExample.java in the AbcDatalog source repository.
1

Add the Maven dependency

Add AbcDatalog to your pom.xml:
pom.xml
<dependency>
    <groupId>io.github.harvardpl</groupId>
    <artifactId>AbcDatalog</artifactId>
    <version>0.8.0</version>
</dependency>
See Installation for Gradle and pre-built JAR options.
2

Write a Datalog program

A Datalog program consists of EDB facts (base facts known before evaluation) and IDB rules (logical rules that derive new facts). The example below defines a graph with four directed edges and two rules that derive the full transitive closure relation tc:
edge(a, b).
edge(b, c).
edge(c, d).
edge(d, c).
tc(X, Y) :- edge(X, Y).
tc(X, Y) :- tc(X, Z), tc(Z, Y).
  • edge(a, b). — a ground fact (EDB). Constants begin with lowercase letters.
  • tc(X, Y) :- edge(X, Y). — a rule: tc holds for every direct edge.
  • tc(X, Y) :- tc(X, Z), tc(Z, Y). — a recursive rule: tc is closed transitively. Variables begin with uppercase letters.
The file example/tc.dtlg in the repository contains a richer version of this program:
example/tc.dtlg
% A program that computes graph transitive closure.

% The `edge` predicate is an EDB. That is, it represents facts known before
% evaluation. Predicates and constants begin with lowercase letters.
edge(a,b).
edge(b,c).
edge(c,c).
edge(c,d).

% The `tc` predicate is an IDB. That is, it represents facts that are generated
% during evaluation. Variables begin with uppercase letters.
tc(X,Y) :- edge(X,Y).
tc(X,Y) :- edge(X,Z), tc(Z,Y).

% We could alternatively compute transitive closure this way, using non-linear
% recursion.
tc_nonlinear(X,Y) :- edge(X,Y).
tc_nonlinear(X,Y) :- tc_nonlinear(X,Z), tc_nonlinear(Z,Y).

% Compute every node that is part of a cycle.
in_cycle1(X) :- tc(X,Y), X=Y.
in_cycle2(X) :- tc(X,X).

% Compute pairs of nodes that are NOT in the tc relationship.
node(X) :- edge(X,_).
node(X) :- edge(_,X).
not_tc(X,Y) :- node(X), node(Y), not tc(X,Y).
3

Parse the program

Use DatalogTokenizer to tokenise your input and DatalogParser.parseProgram() to produce a Set<Clause>. In EngineExample.java the program is read from a StringReader, but in practice you would pass a FileReader pointing to a .dtlg file:
import edu.harvard.seas.pl.abcdatalog.ast.Clause;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParser;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogTokenizer;
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;

// Build the program text (in practice, use new FileReader("program.dtlg"))
String[] lines = {
    "edge(a, b).",
    "edge(b, c).",
    "edge(c, d).",
    "edge(d, c).",
    "tc(X, Y) :- edge(X, Y).",
    "tc(X, Y) :- tc(X, Z), tc(Z, Y)."
};
StringBuilder sb = new StringBuilder();
for (String line : lines) {
    sb.append(line);
}
Reader r = new StringReader(sb.toString());

DatalogTokenizer t = new DatalogTokenizer(r);
Set<Clause> prog = DatalogParser.parseProgram(t);
4

Create and initialize an engine

Instantiate a SemiNaiveEngine — a sequential, bottom-up evaluation engine — and call init() with the parsed program. The engine evaluates all derivable facts eagerly during initialisation.
import edu.harvard.seas.pl.abcdatalog.engine.DatalogEngine;
import edu.harvard.seas.pl.abcdatalog.engine.bottomup.sequential.SemiNaiveEngine;
import edu.harvard.seas.pl.abcdatalog.ast.validation.DatalogValidationException;

// You can choose what sort of engine you want here.
DatalogEngine e = SemiNaiveEngine.newEngine();
e.init(prog);
init() throws DatalogValidationException if the program violates Datalog safety or stratification constraints. Catch or declare it in your method signature.
5

Build a query

A query is a PositiveAtom with variables standing for the positions you want to enumerate. You can construct one by parsing a query string, or by building the AST programmatically:
import edu.harvard.seas.pl.abcdatalog.ast.PositiveAtom;
import edu.harvard.seas.pl.abcdatalog.ast.PredicateSym;
import edu.harvard.seas.pl.abcdatalog.ast.Term;
import edu.harvard.seas.pl.abcdatalog.ast.Variable;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParseException;

PositiveAtom q;

// Option 1: parse from text
String s = "tc(X, Y)?";
DatalogTokenizer qt = new DatalogTokenizer(new StringReader(s));
q = DatalogParser.parseQuery(qt);

// Option 2: build the AST programmatically
Variable x = Variable.create("X");
Variable y = Variable.create("Y");
PredicateSym p = PredicateSym.create("tc", 2);
q = PositiveAtom.create(p, new Term[] {x, y});
Both options produce an equivalent query. Use the AST approach when you are constructing queries dynamically at runtime.
6

Run the query and iterate over results

Call engine.query(q) to retrieve all facts that unify with the query atom. The result is a Set<PositiveAtom> — each element is a ground atom representing one derived fact.
Set<PositiveAtom> results = e.query(q);
for (PositiveAtom result : results) {
    System.out.println(result);
}
For the transitive closure program above, this prints every tc(_, _) pair reachable via the defined edges, for example:
tc(a,b)
tc(a,c)
tc(a,d)
tc(b,c)
tc(b,d)
tc(c,c)
tc(c,d)
tc(d,c)
tc(d,d)

Complete example

Here is the full main method from EngineExample.java, combining all steps above:
EngineExample.java
import edu.harvard.seas.pl.abcdatalog.ast.Clause;
import edu.harvard.seas.pl.abcdatalog.ast.PositiveAtom;
import edu.harvard.seas.pl.abcdatalog.ast.PredicateSym;
import edu.harvard.seas.pl.abcdatalog.ast.Term;
import edu.harvard.seas.pl.abcdatalog.ast.Variable;
import edu.harvard.seas.pl.abcdatalog.ast.validation.DatalogValidationException;
import edu.harvard.seas.pl.abcdatalog.engine.DatalogEngine;
import edu.harvard.seas.pl.abcdatalog.engine.bottomup.sequential.SemiNaiveEngine;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParseException;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParser;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogTokenizer;
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;

public static void main(String[] args) throws DatalogParseException, DatalogValidationException {
    Reader r = readInProgram(); // returns a Reader over the Datalog program text
    DatalogTokenizer t = new DatalogTokenizer(r);
    Set<Clause> prog = DatalogParser.parseProgram(t);

    // You can choose what sort of engine you want here.
    DatalogEngine e = SemiNaiveEngine.newEngine();
    e.init(prog);

    // Build query: tc(X, Y)?
    Variable x = Variable.create("X");
    Variable y = Variable.create("Y");
    PredicateSym p = PredicateSym.create("tc", 2);
    PositiveAtom q = PositiveAtom.create(p, new Term[] {x, y});

    Set<PositiveAtom> results = e.query(q);
    for (PositiveAtom result : results) {
        System.out.println(result);
    }
}
You can explore Datalog programs interactively without writing any Java code. Launch the GUI by running java -jar AbcDatalog-0.8.0-jar-with-dependencies.jar, load a .dtlg file, and issue queries directly in the interface. See Installation for how to obtain the JAR.
SemiNaiveEngine is a good default for most programs, but AbcDatalog offers concurrent bottom-up engines and top-down QSQ engines as well. See Evaluation engines for a comparison and guidance on when to choose each one.

Build docs developers (and LLMs) love