
Token-Efficient LLM Workflows with TOON

Stanley Ulili
Updated on November 21, 2025

Interacting with Large Language Models (LLMs) like GPT-4 or Gemini has become essential for building intelligent applications. However, this power comes at a literal cost. Every request you send to an LLM API is measured and billed based on the number of "tokens" processed. If you're sending structured data using JSON format, you are almost certainly wasting money. Every bracket, quote, comma, and repeated key in your JSON payload becomes a token that you pay for, often without adding any informational value for the model.

A new, more efficient method for formatting data can slash those costs significantly. In this tutorial, you'll learn about TOON (Token-Oriented Object Notation), a data format designed specifically to be token-efficient for LLMs. By adopting TOON, you can reduce your token consumption by 30% to 60%, and in some cases, even more.

You'll explore why JSON is costly and what makes TOON efficient, then walk through a practical example of converting a real-world JSON dataset to TOON and measuring a token reduction of just over 60%.

Prerequisites

To follow this tutorial, you'll need:

  • Node.js (version 18 or later) and npm installed on your system
  • Basic familiarity with TypeScript and working with JSON data
  • A code editor like VS Code
  • Basic understanding of how LLM APIs work and their token-based pricing

Understanding the problem: Why JSON is expensive for LLM interactions

Before you can appreciate the solution, you need to understand the problem. The issue with using JSON for LLM interactions isn't that it's a bad format—it's excellent for web APIs and human readability—but its design principles are fundamentally at odds with the token-based pricing models of AI services.

What are tokens and why do they matter?

When you send text to an LLM, the model doesn't see it as a whole. Instead, it breaks the text down into smaller pieces called tokens. A token can be a word, part of a word, a number, or even a single character of punctuation. For example, the phrase "LLM efficiency is key" might be broken down into tokens like ["LL", "M", " efficiency", " is", " key"].

Crucially, every single character in your input counts towards this process. This includes all the structural "noise" in a JSON file:

  • Curly braces {}
  • Square brackets []
  • Quotation marks "
  • Commas ,
  • Colons :

API providers like OpenAI and Google charge you based on the total number of tokens in your input (the prompt) and the output (the model's response). Therefore, the more verbose your data format, the more tokens you use, and the higher your bill will be.
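To make this overhead concrete, here's a quick sketch. It counts characters rather than real tokens (actual token counts depend on the model's vocabulary), but it shows how much of a small JSON payload is pure structure:

```typescript
// Rough illustration (character counts, not a real tokenizer): measure how
// much of a small JSON payload is structural punctuation rather than data.
const payload = JSON.stringify({ id: 2001, level: "error" });
// payload is: {"id":2001,"level":"error"}

const structural = [...payload].filter((ch) => `{}[]",:`.includes(ch)).length;
const total = payload.length;

// 11 of the 27 characters are braces, quotes, colons, and commas.
console.log(`structural characters: ${structural} of ${total}`);
```

Roughly 40% of this tiny payload is punctuation that carries no information for the model.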

Examining a standard JSON log file

Let's examine a common use case: sending a batch of log data to an LLM for analysis, summarization, or anomaly detection. Here's what that data might look like in JSON format:

data.json
{
  "logs": [
    {
      "id": 2001,
      "timestamp": "2025-11-18T08:14:23Z",
      "level": "error",
      "service": "auth-api",
      "ip": "172.16.4.21",
      "message": "Authentication failed for user",
      "code": "AUTH_401"
    },
    {
      "id": 2002,
      "timestamp": "2025-11-18T08:15:12Z",
      "level": "warn",
      "service": "billing-worker",
      "ip": "172.16.4.88",
      "message": "Payment retry threshold approaching",
      "code": "BILL_209"
    },
    {
      "id": 2003,
      "timestamp": "2025-11-18T08:16:47Z",
      "level": "info",
      "service": "ingest-pipeline",
      "ip": "172.16.5.30",
      "message": "Batch processed successfully",
      "code": "INGEST_OK"
    }
  ]
}

Notice the incredible amount of repetition. For every single log entry, you're rewriting the keys: "id", "timestamp", "level", "service", and so on. All of these keys, along with their surrounding quotes and colons, are counted as tokens every single time they appear. This is pure structural overhead. The LLM doesn't need to be told that the second field is timestamp a hundred times in a hundred-log batch; it's smart enough to understand the pattern after seeing it once. JSON's structure forces this expensive redundancy.
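You can estimate this key overhead directly. The sketch below is illustrative arithmetic only; the field names come from the log example above, and the 100-entry batch size is a hypothetical:

```typescript
// Illustrative only: estimate how many characters of a serialized log batch
// are repeated key text ("id", "timestamp", ...) rather than actual values.
const keys = ["id", "timestamp", "level", "service", "ip", "message", "code"];
const entries = 100; // hypothetical batch size

// Each key appears once per entry as `"key":` -- the key, two quotes, a colon.
const perEntryKeyChars = keys.reduce((sum, k) => sum + k.length + 3, 0);
const totalKeyChars = perEntryKeyChars * entries;

console.log(`key overhead: ${totalKeyChars} characters across ${entries} entries`);
```

That is 5,700 characters of repeated key text in a 100-log batch, all of it information the model only needs to see once.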

How redundancy scales with your data

While this overhead might seem minor for a few log entries, imagine sending thousands or millions of these records per day. The cost of these redundant tokens begins to snowball dramatically. Chat histories, analytical data, product catalogs—any large, uniform dataset sent to an LLM will suffer from this same costly verbosity. Every token you can shave off without losing information translates directly into real dollars saved.

Introducing TOON: Token-Oriented Object Notation

TOON is a data format created with a single, primary goal: to represent structured data in a way that is maximally token-efficient for Large Language Models. It achieves this by stripping away the redundant syntax of formats like JSON and leveraging simpler structural cues that LLMs can easily understand.

The core principles of TOON

TOON's efficiency stems from two key design choices that cater directly to how LLMs process information:

  1. Indentation-based hierarchy: Much like Python or YAML, TOON uses indentation to define nested objects and data hierarchy. This eliminates the need for curly braces {} and reduces visual clutter.

  2. Tabular format for arrays: This is TOON's most powerful feature, especially for uniform arrays like log data. Instead of repeating keys for every object in an array, TOON defines the keys once as a "header" row. Each subsequent line in the array is then just a comma-separated list of values corresponding to that header. This single change eliminates the biggest source of token redundancy in JSON.
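To see how the tabular idea works mechanically, here is a deliberately simplified encoder sketch for a flat, uniform array. This is not the real @toon-format/toon implementation (the actual library also handles nesting, value quoting, and non-uniform data):

```typescript
// Simplified sketch of TOON's tabular encoding for a uniform, flat array.
// Skips the quoting rules the real format applies to values containing
// commas or colons; illustrative only.
function encodeTabular(name: string, rows: Record<string, unknown>[]): string {
  const keys = Object.keys(rows[0]);
  // Keys are written once in the header, with the row count in brackets.
  const header = `${name}[${rows.length}]{${keys.join(",")}}:`;
  // Each row is just the values, in header order, comma-separated.
  const lines = rows.map((row) => "  " + keys.map((k) => String(row[k])).join(","));
  return [header, ...lines].join("\n");
}

console.log(
  encodeTabular("logs", [
    { id: 2001, level: "error", service: "auth-api" },
    { id: 2002, level: "warn", service: "billing-worker" },
  ])
);
```

Even this toy version makes the key design visible: the per-row cost drops to values only, while the key names become a fixed, one-time header cost.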

Comparing JSON and TOON formats

The best way to understand TOON's impact is to see it in action. Here's what the log data looks like after being converted to TOON:

 
logs[3]{id,timestamp,level,service,ip,message,code}:
  2001,"2025-11-18T08:14:23Z",error,auth-api,172.16.4.21,Authentication failed for user,AUTH_401
  2002,"2025-11-18T08:15:12Z",warn,billing-worker,172.16.4.88,Payment retry threshold approaching,BILL_209
  2003,"2025-11-18T08:16:47Z",info,ingest-pipeline,172.16.5.30,Batch processed successfully,INGEST_OK

The difference is substantial. Let's break down this structure:

  • logs[3]{id,timestamp,level,service,ip,message,code}: This is the header.
    • logs: The name of the array.
    • [3]: Indicates there are three items in the array.
    • {id,timestamp,...}: This is the crucial part. The keys are defined only once as a comma-separated list inside curly braces.
    • :: A colon indicates the start of the data block.
  • Each subsequent indented line is a single record, containing only the values in the same order as the header. There are no more repeated keys, no curly braces, and far fewer quotes.

This format is not only dramatically shorter but also presents the data in a way that is highly intuitive for an LLM to parse, similar to reading a CSV file or a database table.
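Going the other way is just as mechanical. The sketch below parses a tabular block back into objects, assuming flat rows with no quoted or escaped commas (the real TOON spec defines fuller quoting rules):

```typescript
// Hedged sketch of parsing a TOON tabular block back into objects.
// Assumes flat rows and no commas inside values; illustrative only.
function decodeTabular(toon: string): { name: string; rows: Record<string, string>[] } {
  const [headerLine, ...dataLines] = toon.trim().split("\n");
  // Header shape: name[count]{key1,key2,...}:
  const match = headerLine.match(/^(\w+)\[(\d+)\]\{([^}]*)\}:$/);
  if (!match) throw new Error("unrecognized header");
  const [, name, , keyList] = match;
  const keys = keyList.split(",");
  // Zip each value row back together with the header keys.
  const rows = dataLines.map((line) => {
    const values = line.trim().split(",");
    return Object.fromEntries(keys.map((k, i) => [k, values[i]]));
  });
  return { name, rows };
}

const parsed = decodeTabular(`logs[2]{id,level}:\n  2001,error\n  2002,warn`);
console.log(parsed.rows[0]);
```

The round trip is lossless for uniform, flat data, which is exactly the shape where TOON pays off.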

Step 1 — Setting up your project environment

Before you begin converting JSON to TOON, you need to set up a project directory and create your sample data file.

Create a new project folder:

 
mkdir toon-test

Navigate into the folder:

 
cd toon-test

Initialize a new Node.js project:

 
npm init -y

Create the sample data file:

 
code data.json

Paste the following sample log data into data.json. This example uses three log entries, which is enough to demonstrate the token reduction effect:

data.json
{
  "logs": [
    {
      "id": 2001,
      "timestamp": "2025-11-18T08:14:23Z",
      "level": "error",
      "service": "auth-api",
      "ip": "172.16.4.21",
      "message": "Authentication failed for user",
      "code": "AUTH_401"
    },
    {
      "id": 2002,
      "timestamp": "2025-11-18T08:15:12Z",
      "level": "warn",
      "service": "billing-worker",
      "ip": "172.16.4.88",
      "message": "Payment retry threshold approaching",
      "code": "BILL_209"
    },
    {
      "id": 2003,
      "timestamp": "2025-11-18T08:16:47Z",
      "level": "info",
      "service": "ingest-pipeline",
      "ip": "172.16.5.30",
      "message": "Batch processed successfully",
      "code": "INGEST_OK"
    }
  ]
}

You now have a project directory with sample log data ready for conversion.

Step 2 — Installing the necessary dependencies

You need two key packages to build your conversion script: the TOON encoder and a tokenizer that matches OpenAI's token counting behavior.

Install the TOON package:

 
npm install @toon-format/toon

This library contains the core logic for encoding JavaScript objects into the TOON format. You can learn more about TOON at the official GitHub repository.

Install the GPT-3 encoder:

 
npm install gpt-3-encoder

The gpt-3-encoder package counts tokens using the GPT-3 BPE vocabulary. Newer models such as GPT-4 use slightly different tokenizers, but the counts are close enough to make a fair before-and-after comparison.

Install TypeScript and tsx for running TypeScript files:

 
npm install -D typescript tsx

Your dependencies are now installed and ready to use.

Step 3 — Writing the conversion script

Now you'll create a TypeScript script that converts your JSON data to TOON and compares the token counts.

Create the script file:

 
code convert.ts

Add the following code to convert.ts:

convert.ts
import fs from "fs";
import { encode } from "@toon-format/toon";
import { encode as tokenize } from "gpt-3-encoder";

// Read and parse the JSON file
const json = JSON.parse(fs.readFileSync("./data.json", "utf8"));

// Convert to TOON format
const toon = encode(json);

// Prepare JSON string for comparison
const jsonString = JSON.stringify(json, null, 2);

// Count tokens in both formats
const jsonTokens = tokenize(jsonString).length;
const toonTokens = tokenize(toon).length;

// Calculate savings
const savedTokens = jsonTokens - toonTokens;
const percentageReduction = ((savedTokens / jsonTokens) * 100).toFixed(2);

// Display results
console.log("=== JSON ===");
console.log(jsonString);
console.log(`\nJSON Tokens: ${jsonTokens}`);

console.log("\n=== TOON ===");
console.log(toon);
console.log(`\nTOON Tokens: ${toonTokens}`);

console.log("\n=== Token Comparison ===");
console.log(`JSON: ${jsonTokens} tokens`);
console.log(`TOON: ${toonTokens} tokens`);
console.log(`\n✅ Saved ${savedTokens} tokens (${percentageReduction}% reduction)`);

This script starts by importing Node.js’s fs module for file access, the encode function from the TOON library, and the tokenizer from gpt-3-encoder. It reads the data.json file from disk, parses it into a JavaScript object, and then converts that object into a TOON-formatted string.

Next, it prepares a pretty-printed JSON string using JSON.stringify so you can compare it directly to the TOON output. Both the JSON string and the TOON string are passed through the tokenizer to count how many tokens each format would consume.

The script then calculates how many tokens you save by using TOON instead of JSON, along with the percentage reduction. Finally, it logs the original JSON, the TOON representation, their respective token counts, and a summary line showing the total tokens saved and the percentage reduction.
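To translate saved tokens into money, you can extend this with some back-of-the-envelope arithmetic. The per-token price and call volume below are placeholder assumptions; check your provider's current pricing page for real rates:

```typescript
// Back-of-the-envelope cost impact of the measured savings.
// The price and call volume are hypothetical placeholders.
const jsonTokens = 379;
const toonTokens = 150;
const pricePerMillionInputTokens = 2.0; // assumed USD rate, not a real quote

const callsPerDay = 10_000;
const dailySavedTokens = (jsonTokens - toonTokens) * callsPerDay;
const dailySavedUSD = (dailySavedTokens / 1_000_000) * pricePerMillionInputTokens;

console.log(`${dailySavedTokens} tokens/day saved, about $${dailySavedUSD.toFixed(2)}/day`);
```

Even at modest volume, a 229-token saving per call compounds into millions of tokens per day.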

Step 4 — Running the comparison

Execute your TypeScript script to see the token reduction in action.

Run the script:

 
npx tsx convert.ts

You'll see output similar to this:

Output
=== JSON ===
{
  "logs": [
    {
      "id": 2001,
      "timestamp": "2025-11-18T08:14:23Z",
      "level": "error",
      "service": "auth-api",
      "ip": "172.16.4.21",
      "message": "Authentication failed for user",
      "code": "AUTH_401"
    },
    ...
    {
      "id": 2003,
      "timestamp": "2025-11-18T08:16:47Z",
      "level": "info",
      "service": "ingest-pipeline",
      "ip": "172.16.5.30",
      "message": "Batch processed successfully",
      "code": "INGEST_OK"
    }
  ]
}

JSON Tokens: 379

=== TOON ===
logs[3]{id,timestamp,level,service,ip,message,code}:
  2001,"2025-11-18T08:14:23Z",error,auth-api,172.16.4.21,Authentication failed for user,AUTH_401
  2002,"2025-11-18T08:15:12Z",warn,billing-worker,172.16.4.88,Payment retry threshold approaching,BILL_209
  2003,"2025-11-18T08:16:47Z",info,ingest-pipeline,172.16.5.30,Batch processed successfully,INGEST_OK

TOON Tokens: 150

=== Token Comparison ===
JSON: 379 tokens
TOON: 150 tokens

✅ Saved 229 tokens (60.42% reduction)

The results demonstrate a dramatic improvement. By changing the data format from JSON to TOON, you reduced the token count from 379 to 150, a 60.42% reduction. You are sending the exact same information to the LLM but using significantly fewer tokens, which directly translates into lower costs for this API call.

Understanding the token savings breakdown

The 60.42% reduction comes from shaving off structural noise that LLMs do not actually need to understand your data. TOON keeps the information but strips away most of the redundant syntax that JSON carries around.

Key contributors to the savings include:

  • Eliminated repetitive keys: JSON repeats the same field names for every log entry, while TOON defines them once in a single header line.
  • Reduced punctuation: JSON leans on curly braces, quotes, and colons everywhere; TOON mostly sticks to a concise header plus comma separated values.
  • Simplified structure: A flat, table-like layout is more compact than deeply nested objects while preserving the same semantics.
  • More efficient whitespace: TOON typically needs less indentation and spacing than nested JSON, which further cuts the token count.

Even for a three-entry dataset this adds up to a large reduction. As you scale to hundreds or thousands of rows, the one-time header cost stays fixed while the per-row savings continue to compound.
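You can sketch this compounding effect with rough per-row estimates. The three constants below are assumptions for illustration, not measured values:

```typescript
// Rough projection of how savings grow with row count. JSON pays the key
// overhead on every row; TOON pays for its header exactly once.
const jsonPerRow = 110; // assumed tokens per JSON log entry (keys + values)
const toonPerRow = 40;  // assumed tokens per TOON row (values only)
const toonHeader = 20;  // assumed one-time TOON header cost

const projections = [10, 100, 1000].map((rows) => {
  const json = jsonPerRow * rows;
  const toon = toonHeader + toonPerRow * rows;
  const pct = ((json - toon) / json) * 100;
  return { rows, json, toon, pct };
});

for (const p of projections) {
  console.log(`${p.rows} rows: JSON ${p.json} vs TOON ${p.toon} (${p.pct.toFixed(1)}% saved)`);
}
```

As the row count grows, the fixed header cost is amortized away and the percentage saved approaches the pure per-row ratio.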

When to use TOON for your LLM workflows

TOON is not a universal replacement for JSON, but it is extremely effective in the right contexts. It shines whenever you are sending large, uniform batches of structured data to an LLM.

Good candidates for TOON:

  • Large arrays of similar objects, such as logs, metrics, analytics events, user lists, or product catalogs.
  • High-volume or repeated API calls where the same structure is sent over and over again.
  • Workflows that are constrained by context window limits and need to pack in as many rows as possible.
  • Real-time or near-real-time processing pipelines where smaller payloads can improve responsiveness.

Cases where JSON may still be the better fit:

  • Small, one-off requests where the implementation overhead of TOON is not justified.
  • Deeply nested or highly heterogeneous data, where a tabular layout is harder to express.
  • Human-edited configuration files, where JSON’s familiarity and tooling support are more important than token efficiency.
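If you want to automate this decision, one possible heuristic (a sketch of ours, not part of the TOON spec) is to check that an array is reasonably large, flat, and uniform before switching formats:

```typescript
// Heuristic sketch: use TOON only when the payload is a reasonably large
// array of flat objects that all share the same keys. The minRows threshold
// is an arbitrary assumption.
function shouldUseToon(rows: Record<string, unknown>[], minRows = 5): boolean {
  if (rows.length < minRows) return false;
  const keys = JSON.stringify(Object.keys(rows[0]).sort());
  return rows.every(
    (row) =>
      JSON.stringify(Object.keys(row).sort()) === keys && // uniform keys
      Object.values(row).every((v) => typeof v !== "object" || v === null) // flat values
  );
}

const logs = Array.from({ length: 10 }, (_, i) => ({ id: i, level: "info" }));
console.log(shouldUseToon(logs)); // true
```

Batches that fail the check can simply fall back to JSON, so the decision stays invisible to the rest of your pipeline.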

Integrating TOON into your existing workflows

To adopt TOON in your production applications, you'll need to convert your data at the point where you prepare it for the LLM API call. Here's a simple pattern you can follow:

 
import { encode } from "@toon-format/toon";

async function analyzeLogsWithLLM(logs: any[]) {
  // Wrap your logs array in an object
  const data = { logs };

  // Convert to TOON
  const toonData = encode(data);

  // Send to your LLM API
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [
        {
          role: "user",
          content: `Analyze these logs and identify any security issues:\n\n${toonData}`,
        },
      ],
    }),
  });

  return response.json();
}

This pattern works for any scenario where you're sending structured data to an LLM. You can apply it to chat histories, analytical reports, database results, or any other uniform datasets.

Final thoughts

As AI applications scale, token efficiency becomes a core business necessity. JSON’s structural redundancy wastes tokens and inflates your LLM API costs, especially for large, uniform datasets.

In this tutorial, you explored TOON, a data format built for token-efficient LLM interactions. By replacing repetitive JSON syntax with a compact tabular layout for arrays, TOON sharply reduces the tokens needed to represent structured data. You also built a TypeScript script that achieved a 60.42% token reduction on a dataset of log entries.

Integrating TOON into your pipelines for logs, metrics, catalogs, and other uniform datasets unlocks several advantages. You can cut LLM API costs by 30–60% or more while fitting more meaningful data into the model’s limited context window. This also helps improve processing speed and overall reliability.

Licensed under CC-BY-NC-SA