Session 27 of Going Meta, broadcast on April 2, 2024, introduces the reflection agent pattern as applied to knowledge graph construction. Jesus Barrasa builds a multi-actor LangGraph workflow in which three GPT-4-powered agents collaborate in a loop: a Modelling Expert generates an entity-relationship model from a dataset description, a Model Reviewer critiques it, and a Model Editor applies the suggested changes. The loop repeats until a stopping condition is met: a message-count limit in automated mode, or empty feedback in human-in-the-loop mode. The result is a self-improving KG construction pipeline that aims to produce higher-quality graph schemas than a single LLM call.

What You’ll Learn

  • How to define three specialised LLM actors (Generator, Critic, Editor) using LangChain prompt templates
  • How to wire them together into a LangGraph MessageGraph with conditional edges
  • How to use Kaggle’s Croissant metadata to automatically describe datasets for prompting
  • How to implement both automated iteration and human-in-the-loop feedback modes
  • How to visualise each iteration’s data model with Graphviz

The Three-Actor Architecture

Actor 1: Modelling Expert

Generates an entity-relationship model from dataset features. Outputs JSON with entities, relationships, attributes, and schema.org term mappings.
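The exact JSON layout is not shown on this page. As a purely illustrative sketch, the model might resemble the dictionary below. Every key name (`entities`, `attributes`, `feature`, `schema_org`, and so on) and every feature name is an assumption made for this example, not the notebook's actual output format. The small check looks for the one-to-one mapping between dataset features and entity attributes that the Modelling Expert is asked to maintain.

```python
from collections import Counter

# Hypothetical model layout; key names and features are made up for illustration.
example_model = {
    "entities": [
        {
            "name": "Crime",
            "schema_org": "https://schema.org/Event",
            "attributes": [
                {"name": "reportId", "feature": "report_id"},
                {"name": "occurredOn", "feature": "occurred_on"},
            ],
        },
        {
            "name": "Area",
            "schema_org": "https://schema.org/Place",
            "attributes": [{"name": "areaName", "feature": "area_name"}],
        },
    ],
    "relationships": [
        {"name": "OCCURRED_IN", "from": "Crime", "to": "Area"},
    ],
}


def check_feature_mapping(model, features):
    """Return (missing, duplicated) dataset features for a model in the layout above."""
    mapped = Counter(
        attr["feature"]
        for entity in model["entities"]
        for attr in entity["attributes"]
    )
    # Features that no attribute covers.
    missing = sorted(set(features) - set(mapped))
    # Features that more than one attribute claims.
    duplicated = sorted(f for f, n in mapped.items() if n > 1)
    return missing, duplicated


missing, duplicated = check_feature_mapping(
    example_model, ["report_id", "occurred_on", "area_name"]
)
print(missing, duplicated)
```

A check like this could run after each editing pass to confirm that no dataset feature was dropped or duplicated along the way.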

Actor 2: Model Reviewer

Critiques the generated model. Suggests at most two or three concise changes, such as flagging over- or under-normalisation or proposing better names, without rewriting the model itself.

Actor 3: Model Editor

Applies only the reviewer’s suggested changes to the current model and returns the updated JSON, making no other modifications.

LangGraph Orchestrator

A MessageGraph wires the three actors into a loop. Conditional edges determine whether to iterate further or stop, based on message count or human feedback.

LangGraph MessageGraph Pattern

The core of the session is the LangGraph MessageGraph that connects the three actors:

from typing import List, Sequence

from langgraph.graph import END, MessageGraph
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage

# mode of operation
human_in_the_loop_agent = False

async def model_generation_node(state: Sequence[BaseMessage]):
    return await model_generate.ainvoke({"messages": state})

async def model_reflection_node(state: Sequence[BaseMessage]) -> BaseMessage:
    if not human_in_the_loop_agent:
        current_model_msg = state[-1]
        res = await model_reflect.ainvoke({
            "messages": [HumanMessage(content=generate_prompt_for_model_critique(current_model_msg.content))]
        })
        content = res.content
    else:
        content = input("feedback please: \n")
    return HumanMessage(content=content)

async def model_change_node(state: Sequence[BaseMessage]) -> BaseMessage:
    changes_message = state[-1]
    current_model_msg = state[-2]
    res = await model_change.ainvoke({
        "messages": [HumanMessage(content=generate_prompt_for_model_change(
            current_model_msg.content, changes_message.content
        ))]
    })
    return AIMessage(content=res.content)

builder = MessageGraph()
builder.add_node("model_generate", model_generation_node)
builder.add_node("model_reflect", model_reflection_node)
builder.add_node("model_change", model_change_node)
builder.set_entry_point("model_generate")

def should_i_iterate(state: List[BaseMessage]):
    if human_in_the_loop_agent:
        feedback_message = state[-1]
        if len(feedback_message.content) > 0:
            return "model_change"
    else:
        if len(state) < 6:
            return "model_change"
    return END

builder.add_edge("model_generate", "model_reflect")
builder.add_conditional_edges("model_reflect", should_i_iterate)
builder.add_edge("model_change", "model_reflect")
graph = builder.compile()

Leaving human_in_the_loop_agent = False runs fully automated iterations. Setting it to True replaces the LLM critic with a human reviewer who types feedback at the prompt, which is useful for steering the model toward domain-specific requirements.

Prompts for Each Actor

Modelling Expert Prompt

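from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([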
    (
        "system",
        "You are a data modelling expert capable of creating high quality entity-relationship models from denormalised datasets. "
        "You always follow these modeling principles: "
        "You don't overnormalize the model. "
        "You don't use the same name for relationships connecting different types of entities. "
        "You make sure that all features in the dataset are included in the model. "
        "You make sure there is a one to one mapping between the attributes in the extracted entities and the features in the dataset provided as input. ",
    ),
    MessagesPlaceholder(variable_name="messages"),
])
llm = ChatOpenAI(model="gpt-4")
model_generate = prompt | llm

Model Reviewer Prompt

reflection_prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        "You are a data modelling expert capable of analysing entity-relationship models and suggest changes that can improve them. "
        "You are not supposed to generate a new model, just provide suggestions for changes when pertinent. "
        "You always pay extra attention at the following: "
        "Detect under-normalized in the model and recommend they are extracted as new entities connected to the existing ones through relevant relationships. "
        "Detect over-normalized entities in the model and recommend they are merged as part of existing ones. "
        "Suggest alternative names for terms (property names, entity names, relationship names) used in the model if the proposed ones are not adequate or expressive enough. "
        "You do not recommend combining or merging attributes into composite ones. "
        "You don't always need to propose changes, if a model is good as-is just do not propose changes. "
    ),
    MessagesPlaceholder(variable_name="messages"),
])
model_reflect = reflection_prompt | llm
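
The graph code calls two helpers, generate_prompt_for_model_critique and generate_prompt_for_model_change, whose definitions are not shown on this page. The sketch below is one hypothetical way to write them as plain string builders. The instruction wording is an assumption loosely based on the actor descriptions above, not the session's actual prompts.

```python
def generate_prompt_for_model_critique(model_json: str) -> str:
    """Wrap the current model in a review request (hypothetical wording)."""
    return (
        "Review the following entity-relationship model and suggest at most "
        "three concise changes. Do not rewrite the model.\n\n"
        f"MODEL:\n{model_json}"
    )


def generate_prompt_for_model_change(model_json: str, changes: str) -> str:
    """Combine the current model and the reviewer's notes (hypothetical wording)."""
    return (
        "Apply only the changes listed below to the model and return the "
        "complete updated model as JSON. Make no other modifications.\n\n"
        f"MODEL:\n{model_json}\n\n"
        f"CHANGES:\n{changes}"
    )


# Example with placeholder content.
critique_prompt = generate_prompt_for_model_critique('{"entities": []}')
change_prompt = generate_prompt_for_model_change(
    '{"entities": []}', "1. Rename entity X to Y."
)
print(change_prompt)
```

Keeping the model and the requested changes in clearly labelled sections of one message matches how model_change_node passes both pieces to the editor in a single HumanMessage.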

Running the Agent

The graph is invoked asynchronously over a dataset description. Here it runs over a crime dataset from Kaggle:
async for event in graph.astream(
    [
        HumanMessage(
            content=generate_prompt_for_kaggle_dataset(
                "sahirmaharajj/crime-data-from-2020-to-present-updated-monthly"
            )
        )
    ],
):
    print(event)
    print("---")
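
After the loop completes, event[END] holds the full message history. Because the graph always exits from the reflection step, the last message is the reviewer's final feedback and the most recent model is the second-to-last message, which can be visualised with Graphviz:

generate_graph_viz_and_render(event[END][-2].content, "model_viz")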
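
The helper generate_prompt_for_kaggle_dataset is not defined on this page. A minimal sketch of the idea follows, assuming the dataset's Croissant JSON-LD has already been downloaded and parsed into a dictionary, and that features are listed under the recordSet and field keys as in the Croissant format. The function names, the sample metadata, and the prompt wording are all hypothetical.

```python
from typing import Any, Dict, List


def describe_croissant_fields(metadata: Dict[str, Any]) -> List[str]:
    """Turn each Croissant field into a one-line description."""
    lines = []
    for record_set in metadata.get("recordSet", []):
        for field in record_set.get("field", []):
            name = field.get("name", "unnamed")
            data_type = field.get("dataType", "unknown type")
            # dataType may be a single value or a list of values.
            if isinstance(data_type, list):
                data_type = ", ".join(str(t) for t in data_type)
            line = f"- {name} ({data_type})"
            description = field.get("description", "")
            if description:
                line += f": {description}"
            lines.append(line)
    return lines


def build_dataset_prompt(metadata: Dict[str, Any]) -> str:
    """Assemble a prompt from the dataset name, description, and features."""
    header = f"Dataset: {metadata.get('name', 'unknown')}"
    description = metadata.get("description", "")
    features = "\n".join(describe_croissant_fields(metadata))
    return (
        f"{header}\n{description}\n\nFeatures:\n{features}\n\n"
        "Propose an entity-relationship model for this dataset."
    )


# Hand-written sample, not real Kaggle output.
sample_metadata = {
    "name": "example-dataset",
    "description": "A made-up dataset used only for this sketch.",
    "recordSet": [
        {
            "name": "main.csv",
            "field": [
                {
                    "name": "report_id",
                    "dataType": "sc:Integer",
                    "description": "Unique report number",
                },
                {"name": "area_name", "dataType": ["sc:Text"]},
            ],
        }
    ],
}

print(build_dataset_prompt(sample_metadata))
```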

Iteration Dynamics

Automated Stopping Criterion

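In automated mode, should_i_iterate keeps routing to model_change while len(state) < 6. Starting from the dataset prompt, that yields the initial generation, two editing passes, and three reviews before the graph stops. The threshold of 6 can be adjusted for your use case.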
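
To see how the threshold translates into iterations, the loop can be simulated without any LLM calls. The sketch below is a plain-Python stand-in for the graph that uses strings in place of messages. The simulate function and the max_messages parameter are illustrative names, not part of the session code.

```python
END = "__end__"  # stand-in for langgraph.graph.END in this simulation


def should_i_iterate(state, max_messages=6):
    """Automated-mode routing: keep editing while the history is short."""
    return "model_change" if len(state) < max_messages else END


def simulate(max_messages=6):
    """Replay generate -> reflect -> (change -> reflect)* with stub actors."""
    state = ["dataset prompt"]
    trace = ["model_generate"]
    version = 1
    state.append(f"model v{version}")
    while True:
        # The reviewer always runs before the routing decision.
        trace.append("model_reflect")
        state.append(f"critique of v{version}")
        if should_i_iterate(state, max_messages) == END:
            break
        # The editor produces the next version of the model.
        trace.append("model_change")
        version += 1
        state.append(f"model v{version}")
    return trace, state


trace, state = simulate()
print(trace)
print(len(state), state[-2])
```

With the default threshold of 6, the trace contains one generation, three reviews, and two edits. The final state holds seven messages, with the latest model second from last and the reviewer's final critique in the last position.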

Human-in-the-Loop Mode

In human-in-the-loop mode, the loop continues as long as the human provides non-empty feedback. An empty input terminates the loop, giving the human full control over convergence.

Graphviz Visualisation

After each iteration, the model is rendered to a PNG diagram using Graphviz’s Digraph, letting you watch the data model evolve across iterations.
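
The function generate_graph_viz_and_render is not shown on this page. As a rough sketch of the idea, the function below turns a model into Graphviz DOT text using only the standard library. The JSON layout (entities with named attributes, relationships with from and to keys) and the name model_to_dot are assumptions. The resulting string could then be passed to the graphviz Python package or the dot command-line tool to produce a PNG.

```python
import json


def model_to_dot(model_json: str) -> str:
    """Convert a model (in the assumed layout) to DOT source text.

    Names containing DOT record characters such as '|' or '{' are not
    escaped here; this sketch assumes simple identifiers.
    """
    model = json.loads(model_json)
    lines = ["digraph model {", "  node [shape=record];"]
    for entity in model.get("entities", []):
        name = entity["name"]
        # "\l" left-aligns each attribute line inside the record node.
        attrs = "\\l".join(a["name"] for a in entity.get("attributes", []))
        label = f"{{{name}|{attrs}\\l}}"
        lines.append(f'  "{name}" [label="{label}"];')
    for rel in model.get("relationships", []):
        lines.append(
            f'  "{rel["from"]}" -> "{rel["to"]}" [label="{rel["name"]}"];'
        )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical model used only to exercise the function.
example = json.dumps(
    {
        "entities": [
            {
                "name": "Crime",
                "attributes": [{"name": "report_id"}, {"name": "occurred_on"}],
            },
            {"name": "Area", "attributes": [{"name": "area_name"}]},
        ],
        "relationships": [{"name": "OCCURRED_IN", "from": "Crime", "to": "Area"}],
    }
)

dot = model_to_dot(example)
print(dot)
```

Writing one file per iteration, for example by adding the iteration number to the output filename, would make it easy to compare successive versions of the model side by side.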

Schema.org Alignment

Each entity, attribute, and relationship is annotated with the closest schema.org URI, grounding the generated model in a shared, machine-readable vocabulary.

Try the supply chain dataset (shashwatwork/dataco-smart-supply-chain-for-big-data-analysis) to see the agent tackle a complex model with 50+ features. The reviewer typically catches over-normalised intermediate entities in the first reflection pass.

Resources

Watch the Recording

Full session recording on YouTube — April 2, 2024.

Session Code (Colab)

LangGraph notebook and resources on GitHub.
