
A Deep Dive into POML

Stanley Ulili
Updated on October 30, 2025

In the rapidly evolving world of Large Language Models (LLMs), the art and science of crafting effective prompts have become a cornerstone of building intelligent applications. While simple prompts in plain English work for basic tasks, creating complex, reliable, and maintainable prompts for production systems presents a significant challenge. As prompts grow with detailed instructions, few-shot examples, and dynamic data, they can quickly become unwieldy, difficult to debug, and nearly impossible to reuse.

Enter POML, the Prompt Orchestration Markup Language. Developed by prompt engineering researchers at Microsoft, POML is an open-source markup language designed to bring structure, maintainability, and powerful data integration capabilities to advanced prompt engineering. It addresses common challenges in prompt development, such as lack of structure, complex data integration, and inadequate tooling, by providing a systematic way to organize, compose, and manage AI prompts.

In this comprehensive tutorial, we will embark on a deep dive into the world of POML. We will learn not just the "how" but also the "why" behind this innovative language. You'll discover how to move beyond simple text files and start building sophisticated, dynamic, and reusable prompts that can power the next generation of LLM applications.

We will cover:

  • The fundamental concepts and philosophy behind POML
  • Setting up your environment and writing your first POML prompts
  • Mastering POML's syntax, core components, and data integration features
  • Leveraging the powerful template engine for creating dynamic and conditional prompts
  • A practical, step-by-step project to generate a comprehensive agent guideline file for a codebase

The Problem with Plain Text: Why Do We Need a Prompt Markup Language?

Before we dive into the syntax of POML, it's crucial to understand the problems it aims to solve. For many, a prompt is just a string of text fed to an AI. While technically true, this view breaks down when building real-world applications.

The Challenge of Complexity and Maintainability

Imagine you're building an AI agent that analyzes customer support tickets, categorizes them, and drafts a response. Your prompt might need to include:

  • A persona for the AI (e.g., "You are a friendly and helpful support agent")
  • The main task (e.g., "Analyze the following ticket...")
  • Specific formatting instructions for the output (e.g., "Respond in JSON format with keys: 'summary', 'category', 'priority', 'draft_response'")
  • Several examples of tickets and their desired JSON outputs (few-shot learning)
  • The actual customer ticket, which is dynamic data
  • Contextual information, like the customer's purchase history, pulled from a database

Trying to manage this as a single, massive block of text is a recipe for disaster. It's hard to read, modify, and debug. If you want to change the JSON schema, you have to update it in multiple places within the examples, risking inconsistency. If you want to reuse the "persona" section in another prompt, you're forced to copy and paste, leading to code duplication.

The Data Integration Dilemma

A significant part of advanced prompting involves providing the LLM with relevant, up-to-the-minute context. This is often called Retrieval-Augmented Generation (RAG). This context can come from various sources: text files, PDFs, database records, spreadsheets, or even live web pages.

Without a dedicated framework, developers must write significant boilerplate code to:

  1. Read data from these diverse sources
  2. Parse and clean the data
  3. Format it into a text representation that the LLM can understand
  4. Inject this formatted text into the correct position within the main prompt string

This process is tedious, error-prone, and mixes data retrieval logic with prompt construction logic, violating the principle of separation of concerns.
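
To make the cost concrete, here is a rough sketch of that manual glue code in Python. The file path, column names, and prompt wording are illustrative assumptions, not taken from any real system:

 
import csv

# Steps 1 and 2: read and parse data from a source (a CSV file here)
with open("data/customers.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Step 3: format the records into a text representation for the LLM
context_block = "\n".join(
    f"- {row['name']}: last purchase {row['last_purchase']}" for row in rows
)

# Step 4: inject the formatted text into the prompt string by hand
prompt = (
    "You are a friendly and helpful support agent.\n\n"
    "Customer purchase history:\n"
    f"{context_block}\n\n"
    "Analyze the following ticket..."
)

Every new data source means another block like this, and the formatting logic ends up tangled with the prompt text itself.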

How POML Provides a Solution

POML addresses these challenges by introducing a structured, component-based approach, much like how HTML brought structure to web documents.

  • Structure and Readability: By using descriptive tags like <role>, <task>, and <example>, POML makes prompts self-documenting and easier to understand
  • Modularity and Reusability: POML allows you to break down prompts into smaller, manageable components that can be reused and even included from other files
  • Seamless Data Integration: POML provides dedicated components like <document>, <table>, and <webpage> that handle the complexities of fetching, parsing, and formatting data from external sources directly within the prompt itself
  • Dynamic Content: A built-in template engine with support for variables, loops, and conditionals allows for the creation of highly dynamic and context-aware prompts

POML isn't for writing a quick question to ChatGPT. It's an engineering tool for professionals who build and maintain complex, data-driven AI systems.

Step 1 — Setting Up Your POML Environment

While POML also has a TypeScript SDK, it is not yet stable. For this tutorial, we will focus on the mature and stable Python SDK, which is the recommended approach for production use cases.

The Core POML Workflow

The fundamental process of using POML is straightforward. It involves writing your prompt logic in a file with a .poml extension and then using a compiler (in our case, the Python SDK) to process this file into a final, formatted string that can be sent to an LLM.

[Diagram: POML code flows through the compiler, which can output Markdown, JSON, or HTML.]

This architecture is powerful because it decouples your prompt definition from the final output format, allowing a single .poml file to generate prompts suitable for different models or APIs.

Installing the Python SDK

Let's get our hands dirty and set up a project.

First, open your terminal and create a new folder for our project:

 
mkdir poml_tutorial
cd poml_tutorial

It's a best practice in Python to use a virtual environment to manage project dependencies. This isolates your project's packages from your global Python installation.

Create the virtual environment:

 
python3 -m venv venv

Activate it on macOS or Linux:

 
source venv/bin/activate

On Windows, use the following instead:

 
.\venv\Scripts\activate

You'll know it's active because your terminal prompt will be prefixed with (venv).

Now, install the poml package using pip:

 
pip install poml

This command downloads and installs the POML Python SDK and all its necessary dependencies.

Running Your First POML Program

With the library installed, we can run our first "Hello, World" example. Create a new Python file named hello_world.py:

 
code hello_world.py

Open hello_world.py and add the following code, which compiles a simple POML string:

hello_world.py
from poml import poml

# Define a simple prompt using an inline POML string
poml_string = "<p>hello world</p>"

# Use the poml() function to compile the string
output = poml(poml_string)

# Print the compiled output
print(output)

Execute the file from your terminal:

 
python hello_world.py

You will see the following output in your console:

Output
[{'speaker': 'human', 'content': 'hello world'}]

Let's break this down. The poml() function has processed our simple <p> tag and converted it into a structured format. By default, it produces a list of message dictionaries, which is a common format for chat-based LLM APIs. Each dictionary has a speaker (who is providing the content) and the content itself. This simple example already showcases how POML adds a layer of structure even to the most basic prompts.
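
If your chat API expects role/content pairs instead, the conversion is mechanical. Here is a minimal sketch, assuming a 'human'-to-'user' mapping (POML can also produce API-ready output directly via its format argument, which we will use later in this tutorial):

 
# Continuing from hello_world.py: `output` is the compiled message list
output = [{'speaker': 'human', 'content': 'hello world'}]

# Hypothetical mapping from POML speakers to chat API roles
role_map = {'human': 'user', 'ai': 'assistant', 'system': 'system'}

messages = [
    {'role': role_map[m['speaker']], 'content': m['content']}
    for m in output
]
print(messages)  # [{'role': 'user', 'content': 'hello world'}]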

Step 2 — Mastering POML Syntax and Core Components

While inline strings are useful for simple cases, the real power of POML comes from using dedicated .poml files and its rich set of components (tags).

Understanding the Anatomy of a .poml File

Let's create a more structured prompt in its own file. Create a file named first_prompt.poml:

 
code first_prompt.poml

Add the following content:

first_prompt.poml
<poml>
  <role>You are a super clever AI assistant.</role>
  <task>Say hello world like a programmer.</task>
  <hint>Try not to talk about code.</hint>
</poml>

Here we see some of POML's foundational "intention" components:

  • <poml>: The root element that encapsulates the entire prompt
  • <role>: Defines the persona or system message for the AI. This sets the context for how the model should behave
  • <task>: Specifies the primary objective or instruction for the LLM to follow
  • <hint>: Provides additional, often subtle, guidance or constraints to nudge the AI's response in the right direction

Now, let's create a new Python script, run_prompt.py, to compile this file:

 
code run_prompt.py

Add the following code:

run_prompt.py
from poml import poml

# Compile the .poml file by passing its path
output = poml("first_prompt.poml")

# The output is a list, so we'll access the content of the first message
content = output[0]['content']

print(content)

Running the script will produce the following output:

 
python run_prompt.py
Output
# Role
You are a super clever AI assistant.

# Task
Say hello world like a programmer.

## Hint
Try not to talk about code.

By default, POML compiles to Markdown, which is excellent for readability and debugging. The component tags are transformed into clear headings.

Compiling to Different Formats

What if you need a different format for your API? This is where the syntax attribute comes in. You can add it to the root <poml> tag.

Let's modify first_prompt.poml to output JSON:

first_prompt.poml
<poml syntax="json">
  <role>You are a super clever AI assistant.</role>
  <task>Say hello world like a programmer.</task>
  <hint>Try not to talk about code.</hint>
</poml>

If you run the script again, the output will now be a clean JSON object:

 
python run_prompt.py
Output
{
  "role": "You are a super clever AI assistant.",
  "task": "Say hello world like a programmer.",
  "hint": "Try not to talk about code."
}

This is incredibly useful because the same prompt definition can be compiled into a human-readable format for documentation (Markdown) or a machine-readable format for an API call (JSON) with a single-word change. POML also supports html, yaml, and xml as syntax targets.

Step 3 — Integrating External Data with POML

One of POML's most compelling features is its ability to seamlessly integrate data from external sources. This is the key to building powerful RAG systems and context-aware agents without writing complex pre-processing code.

Injecting Files with the <document> Component

Imagine you have detailed coding standards for different frameworks stored in separate Markdown files. With POML, you can conditionally include them in a prompt.

Let's say you have a file code-styles/react_style.md. You can embed its entire content into your prompt like this:

 
<poml>
  <task>Generate a new component following these guidelines:</task>

  <document src="code-styles/react_style.md" />
</poml>

When this .poml file is compiled, the content of react_style.md is automatically read and inserted into the prompt at that location. This keeps your main prompt clean and allows you to manage large pieces of context in their own dedicated files. The parser attribute can be used to help POML understand file types like .txt, .pdf, and .docx.
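
For example, the snippet below compiles a prompt that embeds a PDF. The file path is a placeholder, and it assumes your installed POML version supports the pdf parser described above:

 
from poml import poml

# Inline POML that pulls an external document into the prompt
prompt_markup = """
<poml>
  <task>Summarize the attached report in three bullet points.</task>
  <document src="reports/q3_report.pdf" parser="pdf" />
</poml>
"""

output = poml(prompt_markup)
print(output[0]['content'])  # The document's text appears inline in the prompt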

Working with Tabular Data using <table>

This component is a game-changer for prompts that need to reason over structured data. You can point it to a CSV, an Excel file, or even a JSON file, and POML will parse it and format it for the LLM.

 
<poml>
  <task>Analyze the following sales data and provide a summary.</task>

  <table 
    src="data/sales_data.xlsx" 
    parser="excel" 
    maxRecords="5" 
    syntax="csv" 
  />
</poml>

In this example, POML will:

  1. Open the sales_data.xlsx file
  2. Use its built-in Excel parser to read the data
  3. Take only the first 5 records (thanks to maxRecords="5")
  4. Format those records as a clean CSV string (syntax="csv")
  5. Inject the resulting CSV into the prompt

This single component replaces what would otherwise be dozens of lines of Python code using libraries like Pandas or openpyxl.
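
For comparison, here is roughly what that one component replaces if written by hand, a sketch assuming pandas and openpyxl are installed:

 
import pandas as pd

# Read the spreadsheet and keep only the first 5 records
df = pd.read_excel("data/sales_data.xlsx").head(5)

# Format the records as CSV text and splice them into the prompt manually
table_text = df.to_csv(index=False)
prompt = (
    "Analyze the following sales data and provide a summary.\n\n"
    f"{table_text}"
)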

Scraping Live Web Content with <webpage>

Need to provide your AI with information from a live website? The <webpage> component can fetch and include content from a URL.

Get the entire page content:

 
<webpage url="https://makebettercontent.dev/blog/first-post" />

Get only the text content from the main article section:

 
<webpage 
  url="https://makebettercontent.dev/blog/first-post"
  selector="main"
  extractText="true"
/>

The selector attribute uses a CSS selector to pinpoint the exact part of the page you want, and extractText="true" strips out all the HTML, leaving you with clean text content for the LLM.
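
As with the other data components, you can compile a prompt containing <webpage> straight from Python. This sketch assumes network access, since the page is fetched at compile time:

 
from poml import poml

prompt_markup = """
<poml>
  <task>Summarize this article in three bullet points.</task>
  <webpage url="https://makebettercontent.dev/blog/first-post"
           selector="main"
           extractText="true" />
</poml>
"""

output = poml(prompt_markup)
print(output[0]['content'])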

Step 4 — Building Dynamic Prompts with the Template Engine

Static prompts are limited. To build truly intelligent applications, you need prompts that can adapt based on variables, conditions, and loops. POML's template engine provides all these capabilities.

Using Variables, Conditionals, and Loops

POML's template engine feels familiar to anyone who has worked with web templating languages like Jinja or Handlebars.

Variables (<let>): You can define variables and reference them using double curly braces:

 
<let name="framework" value="'React'" />
<p>The chosen framework is {{framework}}.</p>

Conditional Rendering (if): The if attribute allows you to include or exclude entire blocks of content based on a variable's value. Here, isReact could be defined with <let> or passed in from your Python code, as shown in the next section:

 
<p if="isReact">
  <document src="code-styles/react_style.md" />
</p>

Loops (for): You can iterate over lists to generate repetitive content:

 
<let name="fruits" value="['apple', 'banana', 'cherry']" />
<list listStyle="decimal">
  <item for="item in fruits">{{item}}</item>
</list>

This will compile to a numbered list:

Output
1. apple
2. banana
3. cherry
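
You can verify this yourself by compiling the snippet inline. A minimal sketch, reusing the poml() function from earlier:

 
from poml import poml

snippet = """
<poml>
  <let name="fruits" value="['apple', 'banana', 'cherry']" />
  <list listStyle="decimal">
    <item for="item in fruits">{{item}}</item>
  </list>
</poml>
"""

# Compiles to the numbered list shown above
print(poml(snippet)[0]['content'])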

Bridging Python and POML with context

The most powerful way to make prompts dynamic is by passing data from your application code directly into the POML template. This is done using the context argument of the poml() function.

Let's look at an example. In your Python script:

script.py
from poml import poml

# This data could come from a database, an API call, or user input
app_data = {
    'user_name': 'Clark Kent',
    'user_id': 1
}

# The context is a dictionary passed to the poml function
# The keys in the dictionary become available as variables inside the .poml file
context = {
    'user': app_data
}

output = poml("user_prompt.poml", context)
print(output[0]['content'])

And in your user_prompt.poml file:

user_prompt.poml
<poml>
  <p>User Name: {{user.user_name}}</p>
  <p>User ID: {{user.user_id}}</p>
</poml>

When you run the script, POML will inject the user object from the Python context into the template, resulting in the output:

Output
User Name: Clark Kent
User ID: 1

This mechanism is the key to connecting your application's logic with your prompt templates.
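
The two features combine naturally: data passed via context can drive the template engine's loops. Here is a sketch with hypothetical ticket records:

 
from poml import poml

# Hypothetical records, e.g. fetched from a ticketing system
context = {
    'tickets': [
        {'id': 101, 'subject': 'Login fails on mobile'},
        {'id': 102, 'subject': 'Billing question'},
    ]
}

snippet = """
<poml>
  <task>Triage the following support tickets.</task>
  <list>
    <item for="t in tickets">#{{t.id}}: {{t.subject}}</item>
  </list>
</poml>
"""

print(poml(snippet, context)[0]['content'])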

Step 5 — Building a Dynamic AGENTS.md Generator

Let's tie everything together with a practical use case: creating a script that generates a detailed AGENTS.md file. This file provides guidelines for an AI agent working on a specific codebase. The guidelines should change depending on the project's tech stack (e.g., React, TypeScript).

Creating the Code Style Snippets

First, create a folder named code-styles:

 
mkdir code-styles

Add markdown files for each technology. For example, react_style.md:

 
code code-styles/react_style.md
code-styles/react_style.md
### React Style Guide
- **Component Structure**: Use functional components with hooks. Do not use class components.
- **State Management**: Prefer the `useState` hook for local component state.
- **Props**: Destructure props in the function signature.

Create similar files for ts_style.md and nextjs_style.md.
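
As an illustration, ts_style.md might contain something like the following; the exact rules are up to you:

code-styles/ts_style.md
### TypeScript Style Guide
- **Types**: Prefer explicit return types on exported functions.
- **Interfaces**: Use `interface` for object shapes; reserve `type` for unions.
- **Strictness**: Keep `strict` mode enabled in `tsconfig.json`.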

Designing the Master POML Template

Next, create the main template, code_style.poml:

 
code code_style.poml

This file will use if conditions to conditionally include the style guides:

code_style.poml
<poml>
  <role speaker="system">You are an expert software architect creating comprehensive agent guidelines for a codebase.</role>
  <role speaker="human">
    Create a comprehensive AGENTS.md file that provides clear guidelines for AI agents working in this codebase.
  </role>

  <h>Code Style Guidelines</h>

  <p if="isReact">
    <document src="code-styles/react_style.md" />
  </p>

  <p if="isTypeScript">
    <document src="code-styles/ts_style.md" />
  </p>

  <p if="isNextJs">
    <document src="code-styles/nextjs_style.md" />
  </p>

  <h>Required AGENTS.md Structure</h>
  Please create a well-structured AGENTS.md file with the following sections:
  1. **Project Overview** - Brief description of the project architecture
  2. **Technology Stack** - List of key technologies based on context above
  3. **Code Style Guidelines** - Include the relevant style guides from above
</poml>

Writing the Python Orchestration Script

Now, create the Python script code_style.py that will parse command-line arguments and run the POML compiler:

 
code code_style.py
code_style.py
import sys
from poml import poml
# You would also import and configure your LLM client here, e.g., OpenAI
# from openai import OpenAI
# client = OpenAI()

# 1. Parse CLI arguments to determine the tech stack
args = [arg.lower() for arg in sys.argv[1:]]

# 2. Build the context dictionary with boolean flags
context = {
    'isReact': 'react' in args,
    'isTypeScript': 'ts' in args or 'typescript' in args,
    'isNextJs': 'nextjs' in args or 'next' in args
}

# 3. Generate the prompt using the POML template and the context
# We format it for the OpenAI chat API
final_prompt = poml("code_style.poml", context, format="openai_chat")

print("--- Generated Prompt for LLM ---")
print(final_prompt['messages'][1]['content']) # Print the human message for inspection

# 4. Call the LLM API with the generated prompt
# response = client.chat.completions.create(
#     model="gpt-4-turbo",
#     **final_prompt
# )
# agent_md_content = response.choices[0].message.content

# 5. Write the response to the AGENTS.md file
# with open("AGENTS.md", "w") as f:
#     f.write(agent_md_content)

# print("\n✅ AGENTS.md file generated successfully!")

The LLM call is commented out for demonstration purposes. You would uncomment and configure it with your API key to run it for real.

Running the Script

Now you can generate a tailored AGENTS.md prompt from your terminal. To generate guidelines for a React and TypeScript project, you would run:

 
python code_style.py react ts

The script will then print the fully assembled prompt, which includes the contents of both react_style.md and ts_style.md. This complete prompt can then be sent to an LLM to generate the final, comprehensive AGENTS.md file. This dynamic workflow ensures that your AI agent always gets the most relevant and specific instructions for the task at hand.
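
Other stacks work the same way; for a project that also uses Next.js, you would add the corresponding flag:

 
python code_style.py react ts nextjs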

Final thoughts

POML represents a significant step forward in the discipline of prompt engineering. By moving away from unstructured text blobs and embracing a structured, component-based, and data-aware paradigm, it provides the tools necessary to build, manage, and scale complex AI applications with confidence.

Throughout this tutorial, we've explored how POML's intuitive markup, seamless data integration, and powerful template engine can solve real-world problems. We've seen how to create reusable components, dynamically inject data from files and web pages, and orchestrate complex prompts with simple Python scripts. The ability to define a prompt's logic once and compile it to multiple formats for different APIs is a testament to its well-thought-out design.

While there is a learning curve, the benefits in maintainability, reusability, and reliability are undeniable for anyone working on serious LLM-powered projects. As AI systems become more integrated with diverse data sources and complex business logic, tools like POML will not just be helpful—they will be essential.


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.