The Research Agent skill pack enables intelligent web research workflows with semantic memory, privacy-focused search, and headless browser automation.
Included Services
Qdrant - Vector database for semantic memory
SearXNG - Privacy-focused metasearch engine
Browserless - Headless Chrome for web scraping
Skills Provided
Qdrant Memory
Capabilities:
Store and search vector embeddings
Semantic similarity search
Filter searches by metadata
Build RAG (Retrieval-Augmented Generation) systems
Manage multiple collections
Example Usage:
# Create a collection for research notes
curl -X PUT "http://qdrant:6333/collections/research_notes" \
-H "Content-Type: application/json" \
-d '{
"vectors": {"size": 1536, "distance": "Cosine"},
"optimizers_config": {"default_segment_number": 2}
}'
# Store a research finding with embedding
curl -X PUT "http://qdrant:6333/collections/research_notes/points" \
-H "Content-Type: application/json" \
-d '{
"points": [{
"id": 1,
"vector": [0.05, 0.61, 0.76, ...],
"payload": {
"source": "https://example.com/article",
"text": "Key findings from the research paper...",
"timestamp": "2025-01-15T10:30:00Z",
"tags": ["ai", "research"]
}
}]
}'
# Search for similar research
curl -X POST "http://qdrant:6333/collections/research_notes/points/search" \
-H "Content-Type: application/json" \
-d '{
"vector": [0.2, 0.1, 0.9, ...],
"limit": 5,
"with_payload": true
}'
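The capability list mentions filtering searches by metadata, which the requests above do not show. The following is a minimal sketch, assuming points carry a `tags` payload field like the one stored above. The helper names `tag_filter` and `search_by_tag` are hypothetical, the tag is assumed to need no JSON escaping, and the query vector must be supplied by the caller as a JSON array.

```shell
# Build a Qdrant filter clause matching points whose "tags" payload
# contains the given value. Assumes the tag has no quotes or backslashes.
tag_filter() {
  printf '{"must":[{"key":"tags","match":{"value":"%s"}}]}' "$1"
}

# Search research_notes near vector $1 (a JSON array), restricted to tag $2.
search_by_tag() {
  curl -s -X POST "http://qdrant:6333/collections/research_notes/points/search" \
    -H "Content-Type: application/json" \
    -d "{\"vector\": $1, \"limit\": 5, \"with_payload\": true, \"filter\": $(tag_filter "$2")}"
}
```

For example, `search_by_tag "$QUERY_VECTOR" ai` would return only points tagged `ai`, where `QUERY_VECTOR` is an embedding you generated for the question.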
SearXNG Search
Capabilities:
Privacy-focused web search
Aggregates results from multiple search engines
No tracking or profiling
JSON API for programmatic access
Filter by category (web, images, news, etc.)
Example Usage:
# Search the web
curl "http://searxng:8080/search?q=artificial+intelligence&format=json"
# Search for academic papers
curl "http://searxng:8080/search?q=quantum+computing&categories=science&format=json"
# Search for news articles
curl "http://searxng:8080/search?q=latest+ai+developments&categories=news&format=json"
# Image search
curl "http://searxng:8080/search?q=neural+networks&categories=images&format=json"
Response structure:
{
  "query": "artificial intelligence",
  "results": [
    {
      "title": "What is AI?",
      "url": "https://example.com/ai",
      "content": "Artificial intelligence is...",
      "engine": "google",
      "score": 0.95
    }
  ]
}
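The search examples above write `+` for spaces by hand, which stops working once a query contains characters such as `&` or `+`. A possible sketch of a query builder follows; `urlencode` and `searx_url` are hypothetical helpers, and the encoder is only meant for ASCII input.

```shell
# Percent-encode an ASCII string for a query parameter (spaces become "+").
# Runs in a subshell so its variables and locale do not leak.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9._~-]) out="$out$c" ;;
      ' ') out="$out+" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s' "$out"
)

# Build a SearXNG JSON search URL. $1 = query, $2 = optional category.
searx_url() {
  url="http://searxng:8080/search?q=$(urlencode "$1")&format=json"
  if [ -n "${2:-}" ]; then
    url="$url&categories=$2"
  fi
  printf '%s\n' "$url"
}
```

A call such as `curl -s "$(searx_url "quantum computing" science)"` then issues the same request as the academic search example above.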
Browserless Browse
Capabilities:
Headless Chrome automation
Render JavaScript-heavy pages
Take screenshots
Generate PDFs
Extract structured data
Handle dynamic content
Example Usage:
# Scrape a webpage with JavaScript rendering
curl -X POST "http://browserless:3000/content?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/article",
"waitForSelector": "#main-content"
}'
# Take a screenshot
curl -X POST "http://browserless:3000/screenshot?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}' \
--output screenshot.png
# Generate a PDF
curl -X POST "http://browserless:3000/pdf?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/report"}' \
--output report.pdf
# Execute custom Puppeteer script
# (a JSON string cannot contain raw newlines, so the script stays on one line)
curl -X POST "http://browserless:3000/function?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "code": "async ({ page }) => { await page.goto(\"https://example.com\"); const title = await page.title(); return { title }; }"
  }'
Use Cases
Intelligent Web Research
Build a research agent that:
Searches the web using SearXNG
Visits top results with Browserless
Extracts key information from page content
Generates embeddings using Ollama (from Local AI pack)
Stores the embeddings in Qdrant for semantic retrieval
Answers questions using RAG
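The final step, answering with RAG, amounts to placing the retrieved passages in front of the question before it goes to a model. One way to sketch that assembly is shown here; `build_rag_prompt` is a hypothetical helper and the prompt wording is only an assumption, not a format the pack prescribes.

```shell
# Assemble a RAG prompt from retrieved passages (one per line on stdin)
# and a question ($1). Passages are numbered so answers can cite them.
build_rag_prompt() {
  printf 'Answer the question using only the context below.\n\nContext:\n'
  n=0
  while IFS= read -r passage; do
    n=$((n + 1))
    printf '[%d] %s\n' "$n" "$passage"
  done
  printf '\nQuestion: %s\n' "$1"
}
```

The passages would typically be the `text` payloads returned by a Qdrant search, and the resulting prompt can be sent to an LLM such as one served by Ollama from the Local AI pack.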
Competitive Intelligence
Monitor competitor websites and track changes:
# 1. Search for competitor mentions and collect the result URLs
urls=$(curl -s "http://searxng:8080/search?q=competitor+product+launch&format=json" \
  | jq -r '.results[].url')
# 2. Visit each result and capture content (one file per result)
i=0
for url in $urls; do
  i=$((i + 1))
  curl -s -X POST "http://browserless:3000/content?token=TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"$url\"}" > "content_$i.html"
  # 3. Extract and store insights in Qdrant
  # (with embeddings)
done
Academic Research
Search academic sources and build a knowledge base:
# Search for research papers
mkdir -p papers
curl -s "http://searxng:8080/search?q=neural+networks&categories=science&format=json" \
  | jq -r '.results[] | .url' \
  | while read -r url; do
    # Download PDF or capture content
    curl -s -X POST "http://browserless:3000/pdf?token=TOKEN" \
      -H "Content-Type: application/json" \
      -d "{\"url\": \"$url\"}" \
      --output "papers/$(printf '%s' "$url" | md5sum | cut -d ' ' -f1).pdf"
  done
Content Monitoring
Track website changes and get alerts:
# Capture current state
curl -X POST "http://browserless:3000/screenshot?token=TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://target.com"}' \
  --output current.png
# Compare with previous state
# (use image diff tools)
# If changed, store in Qdrant with timestamp
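The comparison step above is left open. As a rough stand-in for an image diff tool, a byte-for-byte check with `cmp` is sketched below. The helper names and file names are hypothetical. The check flags any difference at all, so pages with rotating ads or timestamps will always register as changed.

```shell
# Succeed (exit 0) when the captures differ or no previous capture exists;
# fail (exit 1) when they are byte-for-byte identical.
# $1 = previous capture, $2 = current capture.
has_changed() {
  [ ! -f "$1" ] || ! cmp -s "$1" "$2"
}

# Promote the current capture to "previous" for the next run.
rotate_capture() {
  mv -f "$2" "$1"
}
```

A monitoring run could then call `has_changed previous.png current.png`, store a record when it succeeds, and finish with `rotate_capture previous.png current.png`.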
Example Research Workflow
Complete research pipeline:
#!/bin/bash
# Research Agent Workflow
# Assumes the "research" collection already exists and that its vector size
# matches the embedding model's output dimension.
# 1. Search for information
RESULTS=$(curl -s "http://searxng:8080/search?q=AI+trends+2025&format=json")
# 2. Extract top 5 URLs
URLS=$(echo "$RESULTS" | jq -r '.results[0:5] | .[].url')
# 3. Visit each URL and extract content
for URL in $URLS; do
  echo "Processing: $URL"
  # Scrape content with Browserless
  CONTENT=$(curl -s -X POST "http://browserless:3000/content?token=TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"$URL\", \"waitForSelector\": \"body\"}")
  # Generate embedding (using Ollama from Local AI pack).
  # jq builds the request body so quotes and newlines in the HTML are escaped.
  EMBEDDING=$(printf '%s' "$CONTENT" \
    | jq -Rs '{model: "nomic-embed-text", input: [.]}' \
    | curl -s -X POST "http://ollama:11434/api/embed" \
      -H "Content-Type: application/json" \
      --data-binary @- \
    | jq -c '.embeddings[0]')
  # Store in Qdrant (point IDs must be unsigned integers or UUIDs)
  printf '%s' "$CONTENT" \
    | jq -Rs --arg id "$(uuidgen)" --argjson vector "$EMBEDDING" \
      --arg url "$URL" --arg ts "$(date -Iseconds)" \
      '{points: [{id: $id, vector: $vector, payload: {url: $url, content: ., timestamp: $ts}}]}' \
    | curl -s -X PUT "http://qdrant:6333/collections/research/points" \
      -H "Content-Type: application/json" \
      --data-binary @-
done
echo "Research complete. Data stored in Qdrant."
Configuration
Environment Variables
# Qdrant
QDRANT_HOST=qdrant
QDRANT_PORT=6333
QDRANT_GRPC_PORT=6334
# SearXNG
SEARXNG_HOST=searxng
SEARXNG_PORT=8080
# Browserless
BROWSERLESS_HOST=browserless
BROWSERLESS_PORT=3000
BROWSERLESS_TOKEN=<generated>
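Scripts can read these variables instead of hard-coding hostnames. A small sketch follows, assuming the variables are exported into the script's environment. The function names are hypothetical, and the fallbacks are the default values listed above.

```shell
# Build service base URLs from the environment, falling back to the
# pack defaults when a variable is unset or empty.
qdrant_url() {
  printf 'http://%s:%s' "${QDRANT_HOST:-qdrant}" "${QDRANT_PORT:-6333}"
}

searxng_url() {
  printf 'http://%s:%s' "${SEARXNG_HOST:-searxng}" "${SEARXNG_PORT:-8080}"
}

# Browserless endpoints also need the token; abort if it is missing.
# $1 = endpoint name such as content, screenshot, or pdf.
browserless_url() {
  printf 'http://%s:%s/%s?token=%s' \
    "${BROWSERLESS_HOST:-browserless}" "${BROWSERLESS_PORT:-3000}" \
    "$1" "${BROWSERLESS_TOKEN:?BROWSERLESS_TOKEN is not set}"
}
```

For instance, `curl -s "$(qdrant_url)/collections"` lists collections on whichever host the environment points to.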
Collection Patterns
Recommended Qdrant collections:
research_notes - Manual research findings
web_scrapes - Automated scraping results
documents - Uploaded research documents
conversations - Chat history for RAG
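These collections could be created in one pass, as in the sketch below. The helper names are hypothetical. The vector size is passed in as an argument because it has to match whatever embedding model you use; check your model's output dimension rather than relying on a guessed value.

```shell
# Print a collection config for the given vector size, using cosine
# distance as in the research_notes example.
collection_config() {
  printf '{"vectors":{"size":%d,"distance":"Cosine"}}' "$1"
}

# Create the four recommended collections. $1 = vector size.
create_collections() {
  for name in research_notes web_scrapes documents conversations; do
    curl -s -X PUT "http://qdrant:6333/collections/$name" \
      -H "Content-Type: application/json" \
      -d "$(collection_config "$1")"
  done
}
```

Calling `create_collections 1536` would match the size used in the `research_notes` example earlier on this page.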
Memory Requirements
Qdrant: ~512 MB base + vector data
SearXNG: ~512 MB
Browserless: ~1.5 GB (Chrome + Node.js)
Total: ~2.5 GB minimum
Qdrant
Create payload indexes on frequently filtered fields
Use with_vector: false when only payloads are needed
Batch upsert operations for better performance
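As an illustration of the first tip, a payload index is created with a PUT to the collection's `index` endpoint. This is a sketch only: `index_body` and `create_payload_index` are hypothetical helpers, and indexing `tags` as a `keyword` field is an assumption based on the payload shown in the storage example.

```shell
# Print the request body for a payload index.
# $1 = payload field name, $2 = field schema (for example keyword).
index_body() {
  printf '{"field_name":"%s","field_schema":"%s"}' "$1" "$2"
}

# Create a payload index. $1 = collection, $2 = field, $3 = schema.
create_payload_index() {
  curl -s -X PUT "http://qdrant:6333/collections/$1/index" \
    -H "Content-Type: application/json" \
    -d "$(index_body "$2" "$3")"
}
```

For example, `create_payload_index research_notes tags keyword` indexes the field that tag filters read.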
SearXNG
Cache search results to reduce load
Use specific categories to narrow results
Respect rate limits from upstream engines
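Caching can be as simple as one file per query. The sketch below uses hypothetical names throughout (`CACHE_DIR`, `fetch_search`, `cached_search`), expects an already URL-encoded query, and never expires entries, so delete the directory to invalidate it. The `cksum` key is short and collisions are possible, which is acceptable for a sketch but not for a large cache.

```shell
CACHE_DIR="${CACHE_DIR:-/tmp/searxng-cache}"

# Fetch results from SearXNG. $1 = URL-encoded query.
fetch_search() {
  curl -s "http://searxng:8080/search?q=$1&format=json"
}

# Print results for a query, calling the fetcher only on a cache miss.
# $1 = URL-encoded query, $2 = fetcher function (defaults to fetch_search).
cached_search() {
  fetcher=${2:-fetch_search}
  key=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  file="$CACHE_DIR/$key.json"
  mkdir -p "$CACHE_DIR"
  if [ ! -s "$file" ]; then
    # Drop the file again if the fetch fails so errors are not cached.
    "$fetcher" "$1" > "$file" || { rm -f "$file"; return 1; }
  fi
  cat "$file"
}
```

Repeated calls such as `cached_search "ai+trends+2025"` hit SearXNG once and read from disk afterwards, which also reduces pressure on upstream engine rate limits.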
Browserless
Reuse browser contexts when possible
Use waitForSelector instead of arbitrary delays
Block ads to reduce page weight and speed up scraping: {"blockAds": true}
Increase timeout for slow-loading pages
Next Steps
Local AI Pack - Add Ollama for embeddings and LLM inference
Knowledge Base Pack - Add full-text search with Meilisearch