Getting Started with Zclaw: AI on ESP32 Hardware
Most AI assistants today live in the cloud. They depend on powerful servers, complex frameworks, and serious computing resources. Zclaw takes the opposite approach.
Instead of scaling AI up, it scales it down. Zclaw runs an AI agent on a $5 ESP32 microcontroller. No GPU. No dedicated server. Just a small Wi-Fi-enabled chip that fits in your hand.
What makes this project interesting is not just the price. It is the shift in perspective. Zclaw connects large language models to real hardware. It allows a tiny embedded device to receive natural language instructions, send them to an LLM, interpret the structured response, and then take action through GPIO pins, sensors, or displays.
This is where things get practical. You are not just chatting with a model. You are controlling the physical world.
In this guide, you will learn what Zclaw is, how it works under tight hardware constraints, and how to install it on your own ESP32 board. You will flash the firmware, connect it to Telegram, and test real hardware interactions. Then you will extend it by building a custom tool, proving that even a minimal system can be flexible.
By the end, you will have built a fully functional AI assistant running on hardware that costs less than a coffee.
What is Zclaw? The power of AI in your palm
Ever since powerful AI agents emerged, the open-source community has produced a wave of "Claw-themed" projects such as PicoClaw, NanoClaw, and IronClaw, all of which replicate and build on the agentic AI concept. Zclaw stands out among them not for its complexity but for its simplicity and efficiency: it takes the core features of its larger counterparts and shrinks them down to run on an ESP32, a popular and inexpensive microcontroller.
The smallest AI assistant for ESP32
At its core, Zclaw is an open-source equivalent of OpenClaw, specifically architected to operate within the severe constraints of microcontrollers. Written entirely in C, it's designed for peak performance and minimal resource usage. While larger AI agents might run on powerful servers or high-end consumer hardware, Zclaw thrives on a simple $5 ESP32 chip.
This focus on minimalism is not just a novelty; it opens up a world of possibilities for smart devices, IoT applications, and interactive hardware projects that were previously too resource-intensive to be practical.
Impressive features in a tiny package
Zclaw's most striking feature is its all-in firmware budget of just 888 kilobytes. That cap covers not just the application code but the entire operational stack. Here is what fits into this footprint:
Zclaw app logic: the core intelligence and decision-making engine of the agent.
Wi-Fi and networking stack: lets the ESP32 connect to the internet to communicate with LLM APIs and messaging services.
TLS/crypto stack: provides the secure, encrypted communication layer that protects your API keys and data.
Certificate bundle and app metadata: the components needed to establish secure HTTPS connections to web endpoints.
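To make the budget concrete, here is a minimal sketch of a size check. It assumes "888 kilobytes" means 888 * 1024 bytes; the function name and the macro are hypothetical and are not taken from the Zclaw source.

```c
#include <stddef.h>

/* Assumption: the 888 KB cap is interpreted as 888 * 1024 = 909312 bytes. */
#define FIRMWARE_BUDGET_BYTES (888u * 1024u)

/* Hypothetical helper: returns the bytes left under the cap,
 * or -1 when the firmware image is larger than the budget. */
long budget_headroom(size_t image_bytes)
{
    if (image_bytes > FIRMWARE_BUDGET_BYTES) {
        return -1;
    }
    return (long)(FIRMWARE_BUDGET_BYTES - image_bytes);
}
```

A build step could call a check like this on the size of the compiled image and fail whenever the result is negative.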
This stack lets the ESP32 talk directly and securely to AI model APIs, without routing your keys or data through an unencrypted intermediary. Zclaw is built on top of ESP-IDF (Espressif IoT Development Framework), which makes it straightforward to extend. You can add drivers for IoT sensors, displays, and other custom firmware plugins to augment your assistant's capabilities, all while working within its tight resource budget.
Practical use cases
Zclaw's direct access to the hardware's General-Purpose Input/Output (GPIO) pins unlocks a range of useful and fun applications. You can command your agent using natural language through a messaging app like Telegram.
Here are some of the use cases highlighted in the official documentation: lab automation (run a staged startup script for your electronics lab, setting GPIO rails, waiting for components to stabilize, verifying sensor pins, and reporting on the overall health of your setup), smart greenhouse helper (schedule watering cycles, store moisture level notes from sensors, toggle relays for lights or fans, and report on timing), workshop assistant (set timed reminders for recurring tasks, like checking equipment or cleaning your workbench), and persistent memory (Zclaw can use the ESP32's non-volatile storage to remember information across reboots).
All of this is orchestrated through a simple chat interface, where the ESP32 acts as a client. It receives your command, sends it to the cloud-based LLM for processing, receives the structured response, and executes the corresponding hardware action locally on the chip.
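The last step of that loop, turning a structured response into a local action, can be sketched as a lookup in a table of tool handlers. This is an illustration only: the type names, the gpio_write tool, and the argument format are hypothetical, JSON parsing is left out, and the handler only formats a string where real firmware would touch a pin.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handler signature: receives the tool arguments as a string,
 * writes a human-readable result, and returns 0 on success. */
typedef int (*tool_handler_t)(const char *args, char *result, size_t result_len);

typedef struct {
    const char *name;
    tool_handler_t handler;
} tool_entry_t;

/* Stand-in for a real GPIO tool: on hardware this would drive a pin. */
static int gpio_write_stub(const char *args, char *result, size_t result_len)
{
    snprintf(result, result_len, "gpio_write(%s) ok", args);
    return 0;
}

static const tool_entry_t TOOLS[] = {
    { "gpio_write", gpio_write_stub },
};

/* Looks up the tool named in the LLM's structured response and runs it.
 * Returns the handler's status, or -1 when no tool has that name. */
int dispatch_tool_call(const char *tool_name, const char *args,
                       char *result, size_t result_len)
{
    for (size_t i = 0; i < sizeof(TOOLS) / sizeof(TOOLS[0]); i++) {
        if (strcmp(TOOLS[i].name, tool_name) == 0) {
            return TOOLS[i].handler(args, result, result_len);
        }
    }
    snprintf(result, result_len, "unknown tool: %s", tool_name);
    return -1;
}
```

The result string is what the device would send back to the chat, so an unknown tool name still produces a readable reply.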
Getting started: setting up Zclaw on your ESP32
Understanding what Zclaw is sets the foundation for getting hands-on. In this section, you'll flash the Zclaw firmware onto an ESP32-C3 board and configure it to communicate via Telegram.
Required hardware
For this project, you will need a few basic components: an ESP32-based microcontroller (any modern ESP32 board should work; the video uses a tiny ESP32-C3 SuperMini board), a USB cable to connect the ESP32 to your computer for flashing and power, a computer with a terminal or command-line interface, and a Wi-Fi network that gives the ESP32 internet access. The demos additionally use a breadboard, an LED, a 220-ohm resistor, and jumper wires; the advanced demo also uses a GC9A01 circular TFT display.
Installation and configuration
The Zclaw installation process is streamlined through a series of scripts. Follow these steps carefully to get your agent up and running.
Connect your hardware and clone the repository
First, connect your ESP32 board to your computer using the USB cable. A light on the board should illuminate, indicating it's receiving power.
Next, open your terminal and use git clone to download the official Zclaw GitHub repository.
This will download the entire project source code into a new directory named zclaw. Navigate into this directory with cd zclaw.
Run the installation script
Zclaw comes with a convenient installation script that sets up the entire development environment. Run it from inside the zclaw directory.
The script performs several actions: it detects your operating system, checks for and installs the ESP-IDF toolchain (required for building firmware for the ESP32), and checks for optional components like QEMU and cJSON. Once the environment is verified, it prompts you with "Build the firmware now? [Y/n]". Type Y and press Enter. The first build may take a minute or two as it compiles all the necessary components.
Flash the firmware to the device
After a successful build, the script will find your connected ESP32 board (it should appear as a serial port such as /dev/cu.usbmodem101 on macOS; the port name differs on Linux and Windows). It will then ask, "Flash firmware now? (mode: standard) [Y/n]".
Type Y and press Enter. This process, known as "flashing," overwrites the existing contents of the microcontroller's flash memory with the newly compiled Zclaw firmware. You'll see a progress bar in the terminal as the data is transferred.
Provision your Zclaw agent
This is the final and most important configuration step. After flashing, the script will ask, "Provision now (required before normal boot)? [Y/n]". Type Y and press Enter.
Provisioning writes your personal credentials (like Wi-Fi details and API keys) to the device's non-volatile storage (NVS). This allows the device to connect to the internet and authenticate with the necessary services on its own. The script will guide you through a series of prompts:
Wi-Fi SSID: enter the name of your Wi-Fi network.
LLM provider: choose from openai, anthropic, openrouter, or ollama.
LLM API key: paste your API key for the selected service.
Wi-Fi password: enter the password for your Wi-Fi network.
Telegram bot token: open Telegram and search for @BotFather, start a chat and send /newbot, follow the prompts, and copy the HTTP API token.
Telegram chat ID(s): search for and chat with @userinfobot to get your Chat ID.
After entering the last credential, the provisioning process will complete.
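As a rough illustration of what the provisioning step has to collect, the sketch below models the prompted values as a struct and reports the first one that is missing or invalid. The struct, field names, and validation rules are assumptions made for this example and are not Zclaw's actual provisioning code. Only the four provider names come from the prompt list above. The Wi-Fi password and API key are deliberately not checked here, on the assumption that open networks and keyless local providers may leave them empty.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the values entered during provisioning. */
typedef struct {
    const char *wifi_ssid;
    const char *wifi_password;
    const char *llm_provider;
    const char *llm_api_key;
    const char *telegram_bot_token;
    const char *telegram_chat_ids;
} provisioning_t;

static bool is_blank(const char *s)
{
    return s == NULL || s[0] == '\0';
}

/* True when the provider is one of the four names offered by the prompt. */
bool provider_is_supported(const char *provider)
{
    static const char *const SUPPORTED[] = {
        "openai", "anthropic", "openrouter", "ollama",
    };
    if (provider == NULL) {
        return false;
    }
    for (size_t i = 0; i < sizeof(SUPPORTED) / sizeof(SUPPORTED[0]); i++) {
        if (strcmp(provider, SUPPORTED[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Returns the name of the first missing or invalid field,
 * or NULL when the checked fields all look usable. */
const char *provisioning_first_problem(const provisioning_t *p)
{
    if (p == NULL) return "config";
    if (is_blank(p->wifi_ssid)) return "wifi_ssid";
    if (!provider_is_supported(p->llm_provider)) return "llm_provider";
    if (is_blank(p->telegram_bot_token)) return "telegram_bot_token";
    if (is_blank(p->telegram_chat_ids)) return "telegram_chat_ids";
    return NULL;
}
```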
Monitor and start chatting
The script's final prompt will be, "Open serial monitor to see output? [Y/n]". Type Y to start the monitor. This will show you the live log output from the ESP32. You'll see it connect to your Wi-Fi, initialize its components, and wait for commands.
Congratulations! Your Zclaw agent is now live. You can go to Telegram, find the bot you created, and start sending it commands.
Putting Zclaw to the test: practical demos
Theory is great, but seeing Zclaw in action reveals its true capabilities. The demos below start with a simple hardware "Hello World" and move on to a more advanced example involving a custom tool.
Demo 1: blinking an LED with natural language
This first demo will test Zclaw's ability to control a GPIO pin.
The circuit
Create a simple circuit on your breadboard. Connect a jumper wire from a GND (Ground) pin on the ESP32 to the ground rail of your breadboard, and another jumper wire from the 3V3 (3.3 Volts) pin to the power rail. Place the LED on the breadboard with its shorter leg connected to the ground rail. Connect one end of the 220-ohm resistor to the longer leg of the LED, then connect a jumper wire from the other end of the resistor to a GPIO pin on your ESP32 (in the video, GPIO2 is used). The LED is powered from the GPIO pin through the resistor, so the 3V3 rail is not part of its current path.
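To see why a 220-ohm resistor is a reasonable choice, the LED current can be estimated with Ohm's law. The sketch below is a plain calculation. The 2.0 V forward voltage used in the usage note is an assumed value for a typical red LED, not a figure from the Zclaw documentation, and the function name is hypothetical.

```c
/* Estimated LED current in milliamps for an LED in series with a resistor.
 * Returns 0.0 when the inputs cannot produce a forward current. */
double led_current_ma(double supply_v, double forward_v, double resistor_ohms)
{
    if (resistor_ohms <= 0.0 || supply_v <= forward_v) {
        return 0.0;
    }
    return (supply_v - forward_v) / resistor_ohms * 1000.0;
}
```

With a 3.3 V GPIO level, an assumed 2.0 V forward voltage, and the 220-ohm resistor, led_current_ma(3.3, 2.0, 220.0) gives roughly 5.9 mA.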
The interaction
Talk to the bot in Telegram. First, assign a name by sending "Treat GPIO2 pin as the main light". Zclaw will respond, confirming it has saved this information, which is now stored in the device's NVS.
Now, command the light using the name you just assigned by sending "Turn on the main light". Almost instantly, the LED on your breadboard should light up. Zclaw understood your natural language command, recalled that "main light" corresponds to GPIO2, and set that pin to HIGH, which drives current through the resistor and LED.
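The name-to-pin memory in this demo can be pictured as a small key-value table. The sketch below keeps the table in RAM so that it runs on any computer. On the device the same mapping would be kept in NVS so that it survives reboots. All names here are hypothetical and do not reflect how Zclaw stores its memory internally.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ALIASES 8
#define ALIAS_NAME_LEN 32

/* Hypothetical record linking a friendly name to a GPIO number. */
typedef struct {
    char name[ALIAS_NAME_LEN];
    int gpio;
} pin_alias_t;

/* In-memory stand-in for persistent storage. */
static pin_alias_t aliases[MAX_ALIASES];
static size_t alias_count = 0;

/* Stores or updates an alias. Returns 0 on success, -1 when the name is
 * empty, too long, or the table is full. */
int alias_set(const char *name, int gpio)
{
    if (name == NULL || name[0] == '\0' || strlen(name) >= ALIAS_NAME_LEN) {
        return -1;
    }
    for (size_t i = 0; i < alias_count; i++) {
        if (strcmp(aliases[i].name, name) == 0) {
            aliases[i].gpio = gpio;
            return 0;
        }
    }
    if (alias_count == MAX_ALIASES) {
        return -1;
    }
    strcpy(aliases[alias_count].name, name);
    aliases[alias_count].gpio = gpio;
    alias_count++;
    return 0;
}

/* Returns the GPIO number stored for a name, or -1 when it is unknown. */
int alias_lookup(const char *name)
{
    if (name == NULL) {
        return -1;
    }
    for (size_t i = 0; i < alias_count; i++) {
        if (strcmp(aliases[i].name, name) == 0) {
            return aliases[i].gpio;
        }
    }
    return -1;
}
```

After alias_set("main light", 2), a later alias_lookup("main light") returns 2, which is the pin a "turn on" request would then drive.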
Demo 2: creating a custom tool for a TFT display
The true power of a framework like Zclaw lies in its extensibility. This demo creates a new tool from scratch that lets Zclaw display text on a circular TFT screen.
Augmenting the code
To add this functionality, you need to modify the source code.
Define the new tool: In the tools.c file, a new tool definition is added as a C struct that tells Zclaw everything it needs to know about the new capability.
The definition names the tool display_text. It includes a description that helps the LLM understand the tool's purpose and an input schema defining the required (text) and optional (bg_color, fg_color) parameters.
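A tool definition of this kind might look roughly like the sketch below. The struct layout, field names, and description text are assumptions made for illustration, and the real definition in tools.c may be organized differently. Only the tool name and the three parameter names come from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical shape of a tool definition: a name, a description shown
 * to the LLM, and a JSON Schema string describing the accepted input. */
typedef struct {
    const char *name;
    const char *description;
    const char *input_schema_json;
} tool_def_t;

static const tool_def_t DISPLAY_TEXT_TOOL = {
    .name = "display_text",
    .description = "Show a short text string on the attached TFT display.",
    /* "text" is required; "bg_color" and "fg_color" are optional. */
    .input_schema_json =
        "{"
        "\"type\":\"object\","
        "\"properties\":{"
        "\"text\":{\"type\":\"string\"},"
        "\"bg_color\":{\"type\":\"string\"},"
        "\"fg_color\":{\"type\":\"string\"}"
        "},"
        "\"required\":[\"text\"]"
        "}",
};
```

Keeping the schema as a constant string means it can live in flash and be sent to the LLM unchanged, which suits a tight memory budget.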
Add dependencies: To control the display, you need its driver, so a dependency on the display driver is added to the main/idf_component.yml file.
Implement the handler: A new function, tools_display_text_handler, is created to contain the actual logic for initializing the display, parsing colors, and drawing the text.
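The color-parsing part of such a handler can be sketched on its own. The example below assumes that colors arrive as English names and that the display takes 16-bit RGB565 pixel values, which is common for small TFT panels. The function names and the list of supported colors are hypothetical and are not taken from tools_display_text_handler.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Packs 8-bit red, green, and blue into one 16-bit RGB565 value. */
uint16_t rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
}

/* Case-insensitive string comparison using only the C standard library. */
static bool equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return false;
        }
        a++;
        b++;
    }
    return *a == *b;
}

/* Translates a color name into RGB565. Returns false for unknown names,
 * in which case *out is left untouched. */
bool parse_color(const char *name, uint16_t *out)
{
    static const struct {
        const char *name;
        uint8_t r, g, b;
    } COLORS[] = {
        { "black", 0, 0, 0 },     { "white", 255, 255, 255 },
        { "red", 255, 0, 0 },     { "green", 0, 255, 0 },
        { "blue", 0, 0, 255 },    { "yellow", 255, 255, 0 },
    };
    if (name == NULL || out == NULL) {
        return false;
    }
    for (size_t i = 0; i < sizeof(COLORS) / sizeof(COLORS[0]); i++) {
        if (equals_ignore_case(name, COLORS[i].name)) {
            *out = rgb565(COLORS[i].r, COLORS[i].g, COLORS[i].b);
            return true;
        }
    }
    return false;
}
```

Returning false for an unknown name lets the handler fall back to a default color instead of failing the whole request.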
After making these changes, the project needs to be rebuilt, flashed, and provisioned again using the same scripts as before. This compiles the new tool and driver into the firmware.
The grand finale
With the new firmware loaded, the display can be driven from chat. Simple display: sending "Display text saying hello world" results in "hello world" appearing on the circular screen. Parameterized display: a more complex command like "Now please display Subscribe with white text on a red background" uses the optional color parameters defined in the tool's schema. The text and colors are extracted from the command as tool arguments, and Zclaw renders them on the display.
For the ultimate test, the query "Can you tell me what is the meaning of life and display it on screen" requires the LLM to answer the question and then call the display tool with the result. The number 42 appears on the screen, and the bot replies, "Done. '42' is now on screen (the classic answer from Douglas Adams' Hitchhiker's Guide to the Galaxy)!"
Final thoughts
Zclaw is not trying to replace enterprise AI platforms. It is proving that AI agents can exist far outside traditional environments.
You have seen an ESP32 connect securely to an LLM, interpret natural language, and execute real hardware actions. You installed the firmware, configured networking and API credentials, and extended the system with a custom tool. All of this fits within a firmware footprint of under one megabyte.
That is the real takeaway. Agentic AI does not have to be heavy. It does not have to be expensive. And it does not have to live only in the cloud.
Zclaw lowers the barrier to experimentation. It gives developers a hands-on way to understand function calling, tool execution, and hardware integration. More importantly, it makes AI tangible. You can see it blink an LED. You can watch it render text on a display. You can wire it to sensors and relays.
Projects like this expand what feels possible. They remind us that smart design often matters more than raw power.
If you want to explore embedded systems, experiment with AI agents, or simply build something different, Zclaw is a practical place to begin.