This chapter explores prompt engineering techniques that help you get better results from Large Language Models. You’ll learn how to structure prompts effectively, provide context, and use advanced strategies to improve model outputs.
Overview
Prompt engineering is the practice of designing and optimizing inputs to LLMs to achieve desired outputs. This chapter covers:
- Basic ingredients of effective prompts
- Advanced prompt engineering techniques
- Reasoning strategies for complex tasks
- Output verification and formatting
Setting Up
To run the examples in this chapter, you’ll need a GPU. In Google Colab, go to Runtime > Change runtime type > Hardware accelerator and select a T4 GPU.
# Install required packages
!pip install "langchain>=0.1.17" "openai>=1.13.3" "langchain_openai>=0.1.6" \
    "transformers>=4.40.1" "datasets>=2.18.0" "accelerate>=0.27.2" \
    "sentence-transformers>=2.5.1" "duckduckgo-search>=5.2.2"
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
Basic Prompt Engineering
Simple Prompts
The most basic form of prompting involves asking a direct question:
messages = [
    {"role": "user", "content": "Create a funny joke about chickens."}
]
output = pipe(messages)
print(output[0]["generated_text"])
# Output: Why don't chickens like to go to the gym?
# Because they can't crack the egg-sistence of it!
Understanding Chat Templates
Models use specific formatting for chat interactions. You can view the template being applied:
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
Output:
<s><|user|>
Create a funny joke about chickens.<|end|>
<|assistant|>
Temperature and Sampling
Control output randomness with temperature and top_p parameters:
High Temperature
# More creative and varied outputs
output = pipe(messages, do_sample=True, temperature=1)
print(output[0]["generated_text"])
# Output: Why don't chickens ever play hide and seek?
# Because good luck hiding when everyone always goes to the henhouse!
High Top-P
# Nucleus sampling for diverse outputs
output = pipe(messages, do_sample=True, top_p=1)
print(output[0]["generated_text"])
# Output: Why don't chickens like math class?
# Because they can't solve for "x" in their eggs!
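As a rough mental model (not from the chapter's code), top_p keeps only the smallest set of most-likely tokens whose cumulative probability reaches the threshold, then renormalizes over the survivors. The function and toy probabilities below are purely illustrative:

```python
# Illustrative sketch of top-p (nucleus) filtering; tokens and probabilities are made up.
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability >= top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    # Renormalize the surviving tokens so their probabilities sum to 1
    mass = sum(probs[t] for t in kept)
    return {t: probs[t] / mass for t in kept}

probs = {"the": 0.5, "a": 0.3, "an": 0.15, "ze": 0.05}
print(nucleus_filter(probs, top_p=0.8))  # keeps only "the" and "a"
```

Lowering top_p shrinks the candidate set toward the most likely token, which is why low values produce more deterministic output.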
Advanced Prompt Engineering
Complex Prompt Structure
Build comprehensive prompts with multiple components:
# Prompt components
persona = "You are an expert in Large Language models. You excel at breaking down complex papers into digestible summaries.\n"
instruction = "Summarize the key findings of the paper provided.\n"
context = "Your summary should extract the most crucial points that can help researchers quickly understand the most vital information of the paper.\n"
data_format = "Create a bullet-point summary that outlines the method. Follow this up with a concise paragraph that encapsulates the main results.\n"
audience = "The summary is designed for busy researchers who need to quickly grasp the newest trends in Large Language Models.\n"
tone = "The tone should be professional and clear.\n"
data = f"Text to summarize: {text}"
# Combine all components
query = persona + instruction + context + data_format + audience + tone + data
messages = [{"role": "user", "content": query}]
outputs = pipe(messages)
Experiment with removing or adding components to see their impact on generated outputs. Each element serves a specific purpose in guiding the model.
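One way to run that experiment systematically is a small helper that assembles only the components you pass in. The build_prompt function below is a hypothetical sketch, not part of the chapter's code:

```python
# Hypothetical helper for experimenting with prompt components:
# pass only the pieces you want and compare the resulting prompts.
def build_prompt(**components):
    """Join non-empty prompt components in a fixed, explicit order."""
    order = ["persona", "instruction", "context",
             "data_format", "audience", "tone", "data"]
    return "".join(components[key] for key in order if components.get(key))

# Drop the persona, context, format, and audience to test their impact
query = build_prompt(
    instruction="Summarize the key findings of the paper provided.\n",
    tone="The tone should be professional and clear.\n",
    data="Text to summarize: ...",
)
print(query)
```

Because the order is fixed, two runs differ only in the components you removed, making comparisons between generations easier to interpret.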
In-Context Learning: Few-Shot Prompting
Provide examples to guide the model’s behavior:
# One-shot prompt: provide a single example
one_shot_prompt = [
    {
        "role": "user",
        "content": "A 'Gigamuru' is a type of Japanese musical instrument. An example of a sentence that uses the word Gigamuru is:"
    },
    {
        "role": "assistant",
        "content": "I have a Gigamuru that my uncle gave me as a gift. I love to play it at home."
    },
    {
        "role": "user",
        "content": "To 'screeg' something is to swing a sword at it. An example of a sentence that uses the word screeg is:"
    }
]
outputs = pipe(one_shot_prompt)
print(outputs[0]["generated_text"])
# Output: During the intense duel, the knight skillfully screeged
# his opponent's shield, forcing him to defend himself.
Chain Prompting: Breaking Down Complex Tasks
Split complex tasks into smaller, manageable steps:
Create Product Name and Slogan
product_prompt = [
    {"role": "user", "content": "Create a name and slogan for a chatbot that leverages LLMs."}
]
outputs = pipe(product_prompt)
product_description = outputs[0]["generated_text"]
print(product_description)
# Output: Name: "MindMeld Messenger"
# Slogan: "Unleashing Intelligent Conversations, One Response at a Time"
Generate Sales Pitch from Product
sales_prompt = [
    {"role": "user", "content": f"Generate a very short sales pitch for the following product: '{product_description}'"}
]
outputs = pipe(sales_prompt)
sales_pitch = outputs[0]["generated_text"]
print(sales_pitch)
# Output: Introducing MindMeld Messenger - your ultimate communication partner!
# Unleash intelligent conversations with our innovative AI-powered messaging platform...
Chain prompting allows you to maintain quality at each step while building complex outputs incrementally. Each step’s output becomes input for the next.
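The pattern can be sketched as a loop that feeds each step's output into the next prompt. The generate() function below is a stub standing in for the pipe() call, so this sketch runs without a model:

```python
# Minimal sketch of chain prompting. generate() is a stand-in for an LLM call.
def generate(prompt):
    # Stub: echo a canned response keyed on the prompt (placeholder for pipe())
    return f"[model output for: {prompt[:30]}...]"

def run_chain(templates, seed=""):
    """Feed each step's output into the next step's {previous} slot."""
    previous = seed
    for template in templates:
        previous = generate(template.format(previous=previous))
    return previous

result = run_chain([
    "Create a name and slogan for a chatbot that leverages LLMs.{previous}",
    "Generate a very short sales pitch for: '{previous}'",
])
print(result)
```

Keeping the chain as a list of templates also makes it easy to inspect or validate intermediate outputs before they reach the next step.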
Reasoning with Generative Models
Chain-of-Thought (CoT) Prompting
Enable better reasoning by showing step-by-step thinking:
# Chain-of-Thought prompt with reasoning example
cot_prompt = [
    {
        "role": "user",
        "content": "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?"
    },
    {
        "role": "assistant",
        "content": "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11."
    },
    {
        "role": "user",
        "content": "The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?"
    }
]
outputs = pipe(cot_prompt)
print(outputs[0]["generated_text"])
# Output: The cafeteria started with 23 apples. They used 20 apples,
# so they had 23 - 20 = 3 apples left. Then they bought 6 more apples,
# so they now have 3 + 6 = 9 apples. The answer is 9.
Zero-Shot Chain-of-Thought
Trigger reasoning without examples by adding a simple trigger phrase:
zeroshot_cot_prompt = [
    {
        "role": "user",
        "content": "The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have? Let's think step-by-step."
    }
]
outputs = pipe(zeroshot_cot_prompt)
print(outputs[0]["generated_text"])
# Output: Step 1: Start with the initial number of apples, which is 23.
# Step 2: Subtract the number of apples used to make lunch, which is 20. So, 23 - 20 = 3 apples remaining.
# Step 3: Add the number of apples bought, which is 6. So, 3 + 6 = 9 apples.
# The cafeteria now has 9 apples.
The phrase “Let’s think step-by-step” is remarkably effective at triggering reasoning behavior in LLMs without requiring examples.
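A small helper (assumed here, not from the chapter) makes it easy to apply the trigger to any existing prompt:

```python
# Illustrative helper: append the zero-shot CoT trigger phrase
# to the final user message of any chat-style prompt.
def add_cot_trigger(messages, trigger="Let's think step-by-step."):
    messages = [dict(m) for m in messages]  # shallow-copy each message
    messages[-1]["content"] = f'{messages[-1]["content"]} {trigger}'
    return messages

prompt = [{"role": "user", "content": "How many apples are left?"}]
print(add_cot_trigger(prompt)[0]["content"])
# → How many apples are left? Let's think step-by-step.
```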
Tree-of-Thought: Multiple Reasoning Paths
Simulate multiple experts reasoning together:
zeroshot_tot_prompt = [
    {
        "role": "user",
        "content": "Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realises they're wrong at any point then they leave. The question is 'The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?' Make sure to discuss the results."
    }
]
outputs = pipe(zeroshot_tot_prompt)
print(outputs[0]["generated_text"])
# Output: Expert 1: Step 1 - Start with the initial number of apples: 23 apples.
# Expert 2: Step 1 - Subtract the apples used for lunch: 23 - 20 = 3 apples remaining.
# Expert 3: Step 1 - Add the newly bought apples: 3 + 6 = 9 apples.
# ...
# All experts agree on the final count: The cafeteria has 9 apples.
Output Verification
Structured Output with Examples
Guide the model to produce specific formats:
Zero-Shot
zeroshot_prompt = [
    {"role": "user", "content": "Create a character profile for an RPG game in JSON format."}
]
outputs = pipe(zeroshot_prompt)
print(outputs[0]["generated_text"])
# Produces verbose JSON with many fields
One-Shot Template
one_shot_template = """Create a short character profile for an RPG game. Make sure to only use this format:
{
"description": "A SHORT DESCRIPTION",
"name": "THE CHARACTER'S NAME",
"armor": "ONE PIECE OF ARMOR",
"weapon": "ONE OR MORE WEAPONS"
}
"""
one_shot_prompt = [{"role": "user", "content": one_shot_template}]
outputs = pipe(one_shot_prompt)
print(outputs[0]["generated_text"])
# Output:
# {
# "description": "A cunning rogue with a mysterious past, skilled in stealth and deception.",
# "name": "Lysandra Shadowstep",
# "armor": "Leather Cloak of the Night",
# "weapon": "Dagger of Whispers, Throwing Knives"
# }
Grammar: Constrained Sampling
Force valid JSON output using constrained sampling with llama-cpp-python:
import json
from llama_cpp.llama import Llama
# Load Phi-3 with llama.cpp
llm = Llama.from_pretrained(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="*fp16.gguf",
    n_gpu_layers=-1,
    n_ctx=2048,
    verbose=False
)
# Generate with JSON schema enforcement
output = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Create a warrior for an RPG in JSON format."},
    ],
    response_format={"type": "json_object"},
    temperature=0,
)["choices"][0]["message"]["content"]
print(json.dumps(json.loads(output), indent=4))
Constrained sampling guarantees valid JSON output but adds computational overhead. Use it when format compliance is critical.
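When constrained sampling isn't available, a common fallback is to parse the output and re-sample on failure. This sketch uses a stubbed generate() in place of a real model call:

```python
import json

# Validate-and-retry sketch. generate() is a stub that fails once,
# then returns valid JSON, standing in for an LLM call.
def generate(prompt, attempt):
    return "not json" if attempt == 0 else '{"name": "Warrior", "armor": "Plate"}'

def generate_json(prompt, max_retries=3):
    for attempt in range(max_retries):
        raw = generate(prompt, attempt)
        try:
            return json.loads(raw)  # succeed only on parseable JSON
        except json.JSONDecodeError:
            continue                # otherwise re-sample
    raise ValueError("No valid JSON after retries")

print(generate_json("Create a warrior for an RPG in JSON format."))
# → {'name': 'Warrior', 'armor': 'Plate'}
```

Unlike grammar-based constraints, retrying only guarantees syntactic validity after the fact and may waste generations, so prefer constrained sampling when the backend supports it.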
Best Practices
Start Simple
Begin with clear, direct prompts before adding complexity.
Add Context Gradually
Include persona, instructions, context, format requirements, and tone as needed.
Use Examples
Few-shot prompting is powerful for demonstrating desired behavior and output formats.
Break Down Complex Tasks
Use chain prompting to split multi-step problems into manageable pieces.
Enable Reasoning
For mathematical or logical problems, use Chain-of-Thought prompting with “Let’s think step-by-step.”
Constrain When Necessary
Use constrained sampling or detailed format examples when you need specific output structures.
Key Takeaways
- Prompt structure matters: Persona, instructions, context, format, audience, and tone all influence outputs
- Examples are powerful: Few-shot learning can dramatically improve results
- Chain complex tasks: Break down multi-step problems into sequential prompts
- Trigger reasoning: Use CoT prompting for mathematical and logical tasks
- Control output format: Use examples or constrained sampling for structured outputs
- Experiment iteratively: Test different approaches and refine based on results
Next Steps
Continue to Chapter 7: Advanced Text Generation Techniques to learn about chaining, memory, and agents that extend beyond prompt engineering.