Neo4j is the graph database powering PentAGI’s Graphiti knowledge graph system. It provides high-performance storage and querying of complex relationships between entities, enabling semantic memory and contextual understanding.
Overview
Neo4j is a native graph database that stores and queries data as nodes and relationships. In PentAGI, it serves as:
- Knowledge Storage: Persistent graph database for entities and relationships
- Relationship Querying: Fast traversal of complex entity connections
- Pattern Matching: Cypher query language for graph patterns
- Temporal Tracking: Time-based relationship management
- Visualization: Built-in browser for graph exploration
Architecture
Neo4j in the PentAGI stack:
Setup
Configure Neo4j Credentials
Set Neo4j authentication in your .env file:
# Neo4j settings
NEO4J_USER=neo4j
NEO4J_PASSWORD=devpassword
NEO4J_DATABASE=neo4j
NEO4J_URI=bolt://neo4j:7687
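Security: Change NEO4J_PASSWORD to a strong password before deploying. The Neo4j Docker image rejects neo4j as the password in NEO4J_AUTH, and Neo4j 5 requires passwords of at least 8 characters by default.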
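As a quick sanity check before deployment, the settings above can be validated in a few lines of Python. This is a minimal sketch, not part of PentAGI: the helper name `load_neo4j_settings` is hypothetical, and the 16-character threshold is an assumption borrowed from the security recommendations later on this page.

```python
import os

# Variable names and defaults mirror the .env example above.
DEFAULTS = {
    "NEO4J_USER": "neo4j",
    "NEO4J_DATABASE": "neo4j",
    "NEO4J_URI": "bolt://neo4j:7687",
}


def load_neo4j_settings(env=None, min_password_length=16):
    """Return Neo4j settings from `env`, raising ValueError on a weak password.

    Hypothetical helper for illustration; it is not a PentAGI or Neo4j API.
    """
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    password = env.get("NEO4J_PASSWORD", "")
    # Reject an unset password and the two well-known defaults.
    if password in ("", "neo4j", "devpassword"):
        raise ValueError("NEO4J_PASSWORD is unset or still a known default")
    if len(password) < min_password_length:
        raise ValueError(f"NEO4J_PASSWORD is shorter than {min_password_length} characters")
    settings["NEO4J_PASSWORD"] = password
    return settings
```

Calling `load_neo4j_settings()` with no arguments reads from the process environment, so it can run as a pre-flight step in a deployment script.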
Deploy Neo4j with Graphiti
Neo4j is included in the Graphiti stack:
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-graphiti.yml
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml up -d
Verify Neo4j is Running
Check Neo4j service status:
# Check service health
docker compose ps neo4j
# Verify HTTP endpoint
curl http://localhost:7474
# Check Bolt connection
docker exec neo4j cypher-shell -u neo4j -p devpassword "RETURN 'Connected' as status;"
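The same check can be scripted, for example to block a deployment script until Neo4j accepts connections. The sketch below uses only the Python standard library; the function names are hypothetical, and the retry count and interval simply mirror the healthcheck values used in the compose file. It confirms only that something is listening on the ports, not that authentication succeeds.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_neo4j(host="localhost", ports=(7474, 7687), retries=10, interval=1.0):
    """Poll until every port accepts connections, or give up after `retries` attempts."""
    for _ in range(retries):
        if all(port_open(host, port) for port in ports):
            return True
        time.sleep(interval)
    return False
```

A script would call `wait_for_neo4j()` and abort when it returns `False`.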
Access Neo4j Browser
Open the Neo4j Browser interface at http://localhost:7474 and log in with:
- Username: neo4j
- Password: devpassword (or your configured password)
- Database: neo4j
Configuration
Docker Compose Settings
Neo4j service configuration:
docker-compose-graphiti.yml
neo4j:
  image: neo4j:5.26.2
  restart: unless-stopped
  container_name: neo4j
  hostname: neo4j
  ports:
    - "127.0.0.1:7474:7474" # HTTP (Browser)
    - "127.0.0.1:7687:7687" # Bolt (Protocol)
  volumes:
    - neo4j_data:/data # Database storage
  environment:
    - NEO4J_AUTH=neo4j/devpassword
  shm_size: 4g # Shared memory for transactions
  healthcheck:
    test: ["CMD-SHELL", "wget -qO- http://localhost:7474 || exit 1"]
    interval: 1s
    timeout: 10s
    retries: 10
Environment Variables
Key Neo4j configuration options:
| Variable | Description | Default |
|---|---|---|
| NEO4J_AUTH | Authentication (user/password) | neo4j/devpassword |
| NEO4J_dbms_memory_heap_initial__size | Initial heap size | 512m |
| NEO4J_dbms_memory_heap_max__size | Maximum heap size | 1G |
| NEO4J_dbms_memory_pagecache_size | Page cache size | 512m |
| NEO4J_dbms_security_procedures_unrestricted | Allowed procedures | gds.* |
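The double underscores in names such as NEO4J_dbms_memory_heap_initial__size follow the Neo4j Docker image's naming scheme: prefix the neo4j.conf setting with NEO4J_, write each underscore twice, and replace periods with single underscores. The sketch below illustrates that mapping; the function names are hypothetical, and NEO4J_AUTH is a special variable handled by the image rather than a translated setting.

```python
PREFIX = "NEO4J_"


def to_docker_env(setting):
    """Convert a neo4j.conf setting name to its Docker environment variable name."""
    # Double existing underscores first, then turn periods into underscores.
    return PREFIX + setting.replace("_", "__").replace(".", "_")


def from_docker_env(variable):
    """Convert a NEO4J_* environment variable back to the neo4j.conf setting name."""
    if not variable.startswith(PREFIX):
        raise ValueError(f"not a Neo4j variable: {variable!r}")
    name = variable[len(PREFIX):]
    # Protect doubled underscores with a placeholder before mapping "_" to ".".
    return name.replace("__", "\0").replace("_", ".").replace("\0", "_")


print(to_docker_env("dbms.memory.heap.initial_size"))
# NEO4J_dbms_memory_heap_initial__size
print(from_docker_env("NEO4J_dbms_memory_pagecache_size"))
# dbms.memory.pagecache.size
```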
For production deployments, increase memory limits:
docker-compose-graphiti.yml
neo4j:
  environment:
    - NEO4J_dbms_memory_heap_initial__size=2G
    - NEO4J_dbms_memory_heap_max__size=4G
    - NEO4J_dbms_memory_pagecache_size=2G
  shm_size: 8g
Cypher Query Language
Neo4j uses Cypher for querying graph data.
Basic Queries
Create a node:
CREATE (t:Target {name: "target.com", ip: "192.168.1.1"})
RETURN t
Create a relationship:
MATCH (t:Target {name: "target.com"})
CREATE (s:Service {name: "HTTP", port: 80})
CREATE (t)-[:HAS_SERVICE]->(s)
RETURN t, s
Find nodes:
MATCH (t:Target)
WHERE t.name CONTAINS "example"
RETURN t.name, t.ip
Pattern Matching
Find related entities:
// Find all services on a target
MATCH (t:Target)-[:HAS_SERVICE]->(s:Service)
WHERE t.name = "target.com"
RETURN s.name, s.port
// Find vulnerability chain
MATCH path = (tool:Tool)-[:DISCOVERS]->(vuln:Vulnerability)-[:AFFECTS]->(target:Target)
RETURN path
Aggregation
Count and aggregate:
// Count vulnerabilities by severity
MATCH (v:Vulnerability)
RETURN v.severity, count(*) as count
ORDER BY count DESC
// Most used tools
MATCH (a:Agent)-[:USED]->(t:Tool)
RETURN t.name, count(a) as usage_count
ORDER BY usage_count DESC
LIMIT 10
Graph Algorithms
Shortest path:
MATCH path = shortestPath(
  (start:Target {name: "entry.com"})-[*]-(end:Target {name: "internal.com"})
)
RETURN path
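Conceptually, a shortest-path query over an undirected, variable-length pattern returns a path with the fewest hops between the two nodes. The sketch below illustrates that idea with a plain breadth-first search over an in-memory edge list. It is an illustration of the concept only, not Neo4j's implementation, and the host names dmz.com and vpn.com are made up for the example.

```python
from collections import deque


def shortest_path(edges, start, end):
    """Return a fewest-hops undirected path from start to end as a list, or None."""
    neighbours = {}
    for a, b in edges:
        # The Cypher pattern -[*]- ignores direction, so store both directions.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


edges = [("entry.com", "dmz.com"), ("dmz.com", "internal.com"), ("entry.com", "vpn.com")]
print(shortest_path(edges, "entry.com", "internal.com"))
# ['entry.com', 'dmz.com', 'internal.com']
```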
Usage
Neo4j Browser
The built-in browser provides:
- Query Editor: Write and execute Cypher queries
- Graph Visualization: Interactive node and relationship display
- Data Browser: Explore database schema and contents
- Query History: Review previous queries
- Favorites: Save frequently-used queries
Command Line Access
Use cypher-shell for CLI queries:
# Connect to Neo4j
docker exec -it neo4j cypher-shell -u neo4j -p devpassword
# Run a query
neo4j@neo4j> MATCH (n) RETURN count(n);
# Exit
neo4j@neo4j> :exit
Python Client
Query Neo4j from Python:
from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    "bolt://localhost:7687",
    auth=("neo4j", "devpassword")
)

with driver.session(database="neo4j") as session:
    result = session.run("""
        MATCH (t:Target)-[:HAS_VULNERABILITY]->(v:Vulnerability)
        WHERE v.severity = 'HIGH'
        RETURN t.name, v.name, v.description
    """)
    for record in result:
        print(f"Target: {record['t.name']}")
        print(f"Vulnerability: {record['v.name']}")
        print(f"Description: {record['v.description']}")
        print()

driver.close()
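When the filter value comes from user input or another tool, passing it as a query parameter is safer than formatting it into the Cypher string. The following sketch assumes a recent 5.x release of the neo4j Python driver, where `Driver.execute_query` is available; the function name `findings_by_severity` is hypothetical.

```python
QUERY = """
MATCH (t:Target)-[:HAS_VULNERABILITY]->(v:Vulnerability)
WHERE v.severity = $severity
RETURN t.name AS target, v.name AS vulnerability
"""


def findings_by_severity(driver, severity, database="neo4j"):
    """Return (target, vulnerability) pairs; `severity` is sent as a query parameter."""
    # execute_query returns (records, summary, keys). Keyword arguments without a
    # trailing underscore are forwarded to Cypher as parameters.
    records, _summary, _keys = driver.execute_query(
        QUERY, severity=severity, database_=database
    )
    return [(record["target"], record["vulnerability"]) for record in records]


# Usage (requires the `neo4j` package and a running instance):
#   from neo4j import GraphDatabase
#   with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "devpassword")) as driver:
#       print(findings_by_severity(driver, "HIGH"))
```

Because the function only needs an object with an `execute_query` method, it can be exercised in unit tests with a small stub instead of a live database.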
Maintenance
Backup
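Back up Neo4j data. neo4j-admin database dump only works on an offline database, so stop the service and run the dump from a one-off container that shares the data volume:
# Stop Neo4j
docker compose stop neo4j
# Create backup
docker compose run --rm neo4j neo4j-admin database dump neo4j \
--to-path=/var/lib/neo4j/data/backups
# Copy backup to host
docker cp neo4j:/var/lib/neo4j/data/backups/neo4j.dump ./
# Start Neo4j
docker compose start neo4j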
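Each dump copied to the host is named neo4j.dump, so keeping a history means renaming the copies (for example with a date suffix) and pruning old ones. The sketch below shows one way to prune, assuming the dumps sit in a single host directory; the function name and the retention count of seven are illustrative choices, not PentAGI defaults.

```python
from pathlib import Path


def prune_dumps(directory, keep=7, pattern="*.dump"):
    """Delete all but the `keep` newest files matching `pattern`; return removed names."""
    dumps = sorted(
        Path(directory).glob(pattern),
        key=lambda path: path.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = []
    for old in dumps[keep:]:
        old.unlink()
        removed.append(old.name)
    return removed
```

Files are ranked by modification time, so the date suffix in the file name is only for human readers.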
Restore
Restore from backup:
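# Stop Neo4j (a dump can only be loaded while the database is offline)
docker compose stop neo4j
# Copy backup to container
docker cp neo4j.dump neo4j:/var/lib/neo4j/data/backups/
# Restore database from a one-off container, since docker exec needs a running container
docker compose run --rm neo4j neo4j-admin database load neo4j \
--from-path=/var/lib/neo4j/data/backups --overwrite-destination=true
# Start Neo4j
docker compose start neo4j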
Indexes
Create indexes for better performance:
// Index on Target name
CREATE INDEX target_name FOR (t:Target) ON (t.name)
// Index on Vulnerability type
CREATE INDEX vulnerability_type FOR (v:Vulnerability) ON (v.type)
// Composite index
CREATE INDEX target_composite FOR (t:Target) ON (t.name, t.ip)
// Full-text search index
CREATE FULLTEXT INDEX target_search FOR (t:Target) ON EACH [t.name, t.description]
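Running a plain CREATE INDEX a second time fails because the index already exists. Adding IF NOT EXISTS makes the statement safe to repeat, which helps when indexes are created from a setup script. The sketch below generates such statements; the helper name is hypothetical. Since identifiers cannot be passed as Cypher parameters, it validates them before formatting.

```python
def index_statement(name, label, properties):
    """Build a CREATE INDEX statement that is a no-op when the index already exists."""
    for identifier in (name, label, *properties):
        # Only plain identifiers are accepted, to keep the formatted Cypher safe.
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
    columns = ", ".join(f"n.{prop}" for prop in properties)
    return f"CREATE INDEX {name} IF NOT EXISTS FOR (n:{label}) ON ({columns})"


print(index_statement("target_name", "Target", ["name"]))
# CREATE INDEX target_name IF NOT EXISTS FOR (n:Target) ON (n.name)
print(index_statement("target_composite", "Target", ["name", "ip"]))
# CREATE INDEX target_composite IF NOT EXISTS FOR (n:Target) ON (n.name, n.ip)
```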
View existing indexes:
SHOW INDEXES
Constraints
Ensure data integrity:
// Unique constraint
CREATE CONSTRAINT target_unique FOR (t:Target) REQUIRE t.name IS UNIQUE
// Existence constraint (Enterprise only)
CREATE CONSTRAINT vulnerability_name FOR (v:Vulnerability) REQUIRE v.name IS NOT NULL
Monitoring
Database Metrics
Query database statistics:
// Database size
CALL dbms.queryJmx('org.neo4j:instance=kernel#0,name=Store file sizes')
YIELD attributes
RETURN attributes.TotalStoreSize.value as total_size
// Node and relationship counts
MATCH (n)
RETURN labels(n) as label, count(*) as count
UNION
MATCH ()-[r]->()
RETURN type(r) as label, count(*) as count
// Transaction statistics (SHOW TRANSACTIONS replaces dbms.listTransactions(), which was removed in Neo4j 5)
SHOW TRANSACTIONS
YIELD transactionId, currentQueryId, elapsedTime
RETURN *
Profile slow queries:
// Explain query plan
EXPLAIN
MATCH (t:Target)-[:HAS_SERVICE]->(s:Service)
WHERE t.name = "target.com"
RETURN s
// Profile query execution
PROFILE
MATCH (t:Target)-[:HAS_SERVICE]->(s:Service)
WHERE t.name = "target.com"
RETURN s
Logs
View Neo4j logs:
# Query log (the official Docker image keeps logs under /logs)
docker exec neo4j cat /logs/query.log
# Debug log
docker exec neo4j cat /logs/debug.log
# Follow logs
docker compose logs -f neo4j
Troubleshooting
Connection Issues
Verify Neo4j is accessible:
# Check if Neo4j is listening
docker exec neo4j netstat -tlnp | grep 7687
# Test Bolt connection
telnet localhost 7687
# Test from Graphiti container
docker exec graphiti nc -zv neo4j 7687
Authentication Errors
Reset password:
# Stop Neo4j
docker compose stop neo4j
# Set the initial password (Neo4j 5 syntax; only takes effect if the auth data has not been initialized yet)
docker compose run --rm neo4j neo4j-admin dbms set-initial-password newpassword
# Update .env file
NEO4J_PASSWORD=newpassword
# Restart
docker compose up -d neo4j
Diagnose slow queries:
- Enable query logging:
  NEO4J_dbms_logs_query_enabled=true
  NEO4J_dbms_logs_query_threshold=100ms
- Analyze query plans with PROFILE
- Add missing indexes
- Increase memory allocation
Data Corruption
Recover from corruption:
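# Stop Neo4j (the consistency check runs against an offline database)
docker compose stop neo4j
# Check database consistency (Neo4j 5 syntax)
docker compose run --rm neo4j neo4j-admin database check neo4j
# neo4j-admin has no repair command; if inconsistencies are reported, restore from a known-good backup (see Restore)
docker compose start neo4j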
Best Practices
Schema Design
- Use meaningful node labels and relationship types
- Normalize properties across similar nodes
- Avoid deeply nested queries (> 5 levels)
- Use indexes on frequently queried properties
- Model relationships as first-class entities
Query Optimization
- Always use indexes for lookups
- Limit result sets with LIMIT
- Use WITH to pipeline queries
- Avoid Cartesian products
- Profile queries before production
Security
- Change default password immediately
- Use strong passwords (16+ characters)
- Restrict network access to trusted IPs
- Enable TLS/SSL in production
- Regularly update Neo4j version
Data Management
- Regular backups (daily minimum)
- Monitor disk usage
- Archive old data periodically
- Clean up unused nodes and relationships
- Document schema and queries