Get started with AbcDatalog

This quickstart walks you through the complete lifecycle of a Datalog evaluation using AbcDatalog’s Java API: writing a program that computes the transitive closure of a directed graph, parsing it, initialising an evaluation engine, and running queries against the derived facts. The example below is drawn directly from EngineExample.java in the AbcDatalog source repository.

Add the Maven dependency

Add AbcDatalog to your pom.xml:

pom.xml

<dependency>
    <groupId>io.github.harvardpl</groupId>
    <artifactId>AbcDatalog</artifactId>
    <version>0.8.0</version>
</dependency>

See Installation for Gradle and pre-built JAR options.

Write a Datalog program

A Datalog program consists of EDB facts (base facts known before evaluation) and IDB rules (logical rules that derive new facts). The example below defines a graph with four directed edges and two rules that derive the full transitive closure relation tc:

edge(a, b).
edge(b, c).
edge(c, d).
edge(d, c).
tc(X, Y) :- edge(X, Y).
tc(X, Y) :- tc(X, Z), tc(Z, Y).

edge(a, b). — a ground fact (EDB). Constants begin with lowercase letters.
tc(X, Y) :- edge(X, Y). — a rule: tc holds for every direct edge.
tc(X, Y) :- tc(X, Z), tc(Z, Y). — a recursive rule: tc is closed transitively. Variables begin with uppercase letters.

The file example/tc.dtlg in the repository contains a richer version of this program:

example/tc.dtlg

% A program that computes graph transitive closure.

% The `edge` predicate is an EDB. That is, it represents facts known before
% evaluation. Predicates and constants begin with lowercase letters.
edge(a,b).
edge(b,c).
edge(c,c).
edge(c,d).

% The `tc` predicate is an IDB. That is, it represents facts that are generated
% during evaluation. Variables begin with uppercase letters.
tc(X,Y) :- edge(X,Y).
tc(X,Y) :- edge(X,Z), tc(Z,Y).

% We could alternatively compute transitive closure this way, using non-linear
% recursion.
tc_nonlinear(X,Y) :- edge(X,Y).
tc_nonlinear(X,Y) :- tc_nonlinear(X,Z), tc_nonlinear(Z,Y).

% Compute every node that is part of a cycle.
in_cycle1(X) :- tc(X,Y), X=Y.
in_cycle2(X) :- tc(X,X).

% Compute pairs of nodes that are NOT in the tc relationship.
node(X) :- edge(X,_).
node(X) :- edge(_,X).
not_tc(X,Y) :- node(X), node(Y), not tc(X,Y).

Parse the program

Use DatalogTokenizer to tokenise your input and DatalogParser.parseProgram() to produce a Set<Clause>. In EngineExample.java the program is read from a StringReader, but in practice you would pass a FileReader pointing to a .dtlg file:

import edu.harvard.seas.pl.abcdatalog.ast.Clause;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParser;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogTokenizer;
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;

// Build the program text (in practice, use new FileReader("program.dtlg"))
String[] lines = {
    "edge(a, b).",
    "edge(b, c).",
    "edge(c, d).",
    "edge(d, c).",
    "tc(X, Y) :- edge(X, Y).",
    "tc(X, Y) :- tc(X, Z), tc(Z, Y)."
};
StringBuilder sb = new StringBuilder();
for (String line : lines) {
    sb.append(line);
}
Reader r = new StringReader(sb.toString());

DatalogTokenizer t = new DatalogTokenizer(r);
Set<Clause> prog = DatalogParser.parseProgram(t);

Create and initialize an engine

Instantiate a SemiNaiveEngine — a sequential, bottom-up evaluation engine — and call init() with the parsed program. The engine evaluates all derivable facts eagerly during initialisation.

import edu.harvard.seas.pl.abcdatalog.engine.DatalogEngine;
import edu.harvard.seas.pl.abcdatalog.engine.bottomup.sequential.SemiNaiveEngine;
import edu.harvard.seas.pl.abcdatalog.ast.validation.DatalogValidationException;

// You can choose what sort of engine you want here.
DatalogEngine e = SemiNaiveEngine.newEngine();
e.init(prog);

init() throws DatalogValidationException if the program violates Datalog safety or stratification constraints. Catch or declare it in your method signature.

Build a query

A query is a PositiveAtom with variables standing for the positions you want to enumerate. You can construct one by parsing a query string, or by building the AST programmatically:

import edu.harvard.seas.pl.abcdatalog.ast.PositiveAtom;
import edu.harvard.seas.pl.abcdatalog.ast.PredicateSym;
import edu.harvard.seas.pl.abcdatalog.ast.Term;
import edu.harvard.seas.pl.abcdatalog.ast.Variable;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParseException;

PositiveAtom q;

// Option 1: parse from text
String s = "tc(X, Y)?";
DatalogTokenizer qt = new DatalogTokenizer(new StringReader(s));
q = DatalogParser.parseQuery(qt);

// Option 2: build the AST programmatically
Variable x = Variable.create("X");
Variable y = Variable.create("Y");
PredicateSym p = PredicateSym.create("tc", 2);
q = PositiveAtom.create(p, new Term[] {x, y});

Both options produce an equivalent query. Use the AST approach when you are constructing queries dynamically at runtime.

Run the query and iterate over results

Call engine.query(q) to retrieve all facts that unify with the query atom. The result is a Set<PositiveAtom> — each element is a ground atom representing one derived fact.

Set<PositiveAtom> results = e.query(q);
for (PositiveAtom result : results) {
    System.out.println(result);
}

For the transitive closure program above, this prints every tc(_, _) pair reachable via the defined edges, for example:

tc(a,b)
tc(a,c)
tc(a,d)
tc(b,c)
tc(b,d)
tc(c,c)
tc(c,d)
tc(d,c)
tc(d,d)

Complete example

Here is the full main method from EngineExample.java, combining all steps above:

EngineExample.java

import edu.harvard.seas.pl.abcdatalog.ast.Clause;
import edu.harvard.seas.pl.abcdatalog.ast.PositiveAtom;
import edu.harvard.seas.pl.abcdatalog.ast.PredicateSym;
import edu.harvard.seas.pl.abcdatalog.ast.Term;
import edu.harvard.seas.pl.abcdatalog.ast.Variable;
import edu.harvard.seas.pl.abcdatalog.ast.validation.DatalogValidationException;
import edu.harvard.seas.pl.abcdatalog.engine.DatalogEngine;
import edu.harvard.seas.pl.abcdatalog.engine.bottomup.sequential.SemiNaiveEngine;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParseException;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogParser;
import edu.harvard.seas.pl.abcdatalog.parser.DatalogTokenizer;
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;

public static void main(String[] args) throws DatalogParseException, DatalogValidationException {
    Reader r = readInProgram(); // returns a Reader over the Datalog program text
    DatalogTokenizer t = new DatalogTokenizer(r);
    Set<Clause> prog = DatalogParser.parseProgram(t);

    // You can choose what sort of engine you want here.
    DatalogEngine e = SemiNaiveEngine.newEngine();
    e.init(prog);

    // Build query: tc(X, Y)?
    Variable x = Variable.create("X");
    Variable y = Variable.create("Y");
    PredicateSym p = PredicateSym.create("tc", 2);
    PositiveAtom q = PositiveAtom.create(p, new Term[] {x, y});

    Set<PositiveAtom> results = e.query(q);
    for (PositiveAtom result : results) {
        System.out.println(result);
    }
}

You can explore Datalog programs interactively without writing any Java code. Launch the GUI by running java -jar AbcDatalog-0.8.0-jar-with-dependencies.jar, load a .dtlg file, and issue queries directly in the interface. See Installation for how to obtain the JAR.

SemiNaiveEngine is a good default for most programs, but AbcDatalog offers concurrent bottom-up engines and top-down QSQ engines as well. See Evaluation engines for a comparison and guidance on when to choose each one.

Get Started

Language Guide

Evaluation Engines

Executor API

Extending AbcDatalog

Get started with AbcDatalog

Complete example

Build docs developers (and LLMs) love

Get Started

Language Guide

Evaluation Engines

Executor API

Extending AbcDatalog

Documentation Index

​Complete example

Build docs developers (and LLMs) love

Complete example