IBM Bob: Agentic Workflow, Code Review, and COBOL Modernization
IBM Bob is an IDE built on IBM's Granite AI models. Its design distinguishes between planning and code generation: before writing any code, Bob can analyze a codebase, produce a step-by-step implementation plan, and wait for developer approval. This separation is the core architectural choice that differentiates it from tools that generate code immediately on request.
Bob also includes an integrated code review system accessible via /review, automated fix application, and enough familiarity with legacy languages to audit COBOL codebases and explain their testing conventions.
Agentic workflow and modes
Bob's agentic workflow operates through selectable modes that assign a specific role to the AI for a given task.
- Ask handles quick questions, code explanations, and documentation lookup without initiating a code-writing task.
- Plan analyzes requirements and the existing codebase to produce a detailed implementation plan before any code is written. This is the phase where architectural alignment happens.
- Code handles actual file creation and modification, ideally after a plan has been established.
- Orchestrator coordinates complex multi-step tasks, acting as a project manager that sequences planning, execution, and verification.
- Custom modes can be defined for team-specific or project-specific workflows.
Code review with /review
The /review command runs a comprehensive audit of the codebase powered by the Granite models. It looks for security vulnerabilities (SQL injection, XSS), hard-coded secrets and API keys, OWASP-referenced weak practices, potential race conditions, null pointer risks, and logic errors.
Results appear in the Bob Findings panel, which provides the issue description, its location in the code, and its potential impact. Each finding has a light bulb icon that triggers autonomous remediation: Bob applies a fix, shows the diff for approval, and then offers to run targeted unit tests to verify the change did not introduce regressions.
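To make the SQL injection class of finding concrete, here is a small illustrative sketch, not taken from Bob's output: the function names and `accounts` table are hypothetical, but the vulnerable pattern and the parameterized fix are the standard before/after that this kind of review flags.

```python
import sqlite3

def find_account_unsafe(conn: sqlite3.Connection, name: str):
    # Vulnerable: user input is interpolated directly into the SQL string,
    # so a name like "x' OR '1'='1" rewrites the query's meaning.
    return conn.execute(
        f"SELECT id, balance FROM accounts WHERE name = '{name}'"
    ).fetchall()

def find_account_safe(conn: sqlite3.Connection, name: str):
    # Fixed: a parameterized query treats the input strictly as data.
    return conn.execute(
        "SELECT id, balance FROM accounts WHERE name = ?", (name,)
    ).fetchall()
```

The safe version returns no rows for an injection payload, while the unsafe version returns every account.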
Modernizing a COBOL application
To test Bob's agentic capabilities in a demanding scenario, the following example migrates zBANK, an open-source COBOL bank account management application, to a Python web application using Streamlit.
Prompt structure
The quality of the output depends on the specificity of the prompt. An effective prompt for this task includes several distinct parts:
- Persona: "You are a Python Developer specializing in rapid prototyping and legacy modernization."
- Goal: "Create a standalone ATM application with a web-based UI using Streamlit, launchable from the terminal."
- Core logic requirements: "Create a BankEngine class that mimics the zBANK logic (Balance, Deposit, Withdraw). Use SQLite for persistent storage to simulate the VSAM files from the original code."
- Frontend requirements: "Build a Streamlit UI with a sidebar for account selection, balance metric cards, deposit and withdrawal forms, and a transaction history table."
- Setup requirements: "Provide a requirements.txt and a run.sh script so the demo can be launched with a single command."
- Code style: "Use type hints and PEP 8-compliant code."
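For orientation, a minimal sketch of the kind of BankEngine class these requirements describe might look like the following. The schema, method names, and open_account helper are illustrative assumptions, not Bob's actual output; the point is SQLite standing in for the original VSAM files.

```python
import sqlite3

class BankEngine:
    """Illustrative core-logic class: SQLite simulates the VSAM files."""

    def __init__(self, db_path: str = "zbank.db") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts ("
            "account_id TEXT PRIMARY KEY, balance REAL NOT NULL)"
        )
        self.conn.commit()

    def open_account(self, account_id: str, balance: float = 0.0) -> None:
        # Hypothetical helper for seeding accounts in a demo.
        self.conn.execute(
            "INSERT OR IGNORE INTO accounts VALUES (?, ?)",
            (account_id, balance),
        )
        self.conn.commit()

    def balance(self, account_id: str) -> float:
        row = self.conn.execute(
            "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"Unknown account: {account_id}")
        return row[0]

    def deposit(self, account_id: str, amount: float) -> float:
        if amount <= 0:
            raise ValueError("Deposit must be positive")
        self.conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
            (amount, account_id),
        )
        self.conn.commit()
        return self.balance(account_id)

    def withdraw(self, account_id: str, amount: float) -> float:
        if amount <= 0 or amount > self.balance(account_id):
            raise ValueError("Invalid withdrawal amount")
        self.conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        self.conn.commit()
        return self.balance(account_id)
```

A Streamlit frontend would then call this class from its form handlers, keeping the business logic separate from the UI layer.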
Specifying the architecture, storage mechanism, UI components, and code style eliminates ambiguity and reduces the likelihood of the model making poor design decisions.
Auto-approval permissions
Before execution, Bob presents an auto-approval modal listing the action types it may need to perform: reading files, writing files, executing terminal commands, and switching modes. Permissions can be granted or denied per category, limiting the agent to only the access required for the task.
Result
The full task completes in approximately three minutes. Bob reads the COBOL source files, creates bank_engine.py and app.py, and generates the setup scripts.
The resulting application has a login screen, a balance dashboard, deposit and withdrawal forms, and a real-time transaction history table backed by SQLite.
Reviewing the generated Python code
Running /review on the newly generated Python application surfaces findings including "Database connection errors not handled": the SQLite operations lack try...except blocks, which could cause unhandled crashes.
Clicking the light bulb icon on that finding triggers Bob to wrap the relevant operations in error handling, present the changes for approval, and then ask whether to run targeted tests on the modified file. Selecting yes causes Bob to generate and execute a temporary unit test confirming no regressions were introduced. This plan-fix-verify cycle is the same mechanism used whether the code was written by Bob or by a human.
Auditing the original COBOL source
Running /review on the original, unmodified COBOL codebase surfaces eight critical issues.
The findings include: "PIN stored and transmitted in plaintext without encryption," "Withdrawal allows overdraft with no balance validation," and "Deposit accepts zero/negative amounts." Bob applies a fix to the withdrawal validation, adding checks to ensure the amount is positive and does not exceed the available balance.
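Expressed in Python rather than COBOL, the validation logic added to the withdrawal path amounts to a guard like the one below. This is a hedged paraphrase of the described fix, not the applied COBOL diff.

```python
def validate_withdrawal(balance: float, amount: float) -> None:
    """Reject withdrawals that are non-positive or would overdraw the account."""
    if amount <= 0:
        raise ValueError("Withdrawal amount must be positive")
    if amount > balance:
        raise ValueError("Insufficient funds: withdrawal exceeds balance")
```

The same guard would also close the "Deposit accepts zero/negative amounts" finding if mirrored on the deposit path.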
Contextual awareness in legacy environments
After applying the COBOL fix, Bob again offers to add tests. Rather than generating a test file, it explains why that would be inappropriate:
Testing Status: No test files or test configuration found in this COBOL/CICS mainframe project. This is typical for legacy mainframe applications that rely on manual testing or mainframe-specific testing tools not present in the repository. The fix has been applied successfully and follows the suggested validation logic.
This response reflects an understanding of legacy mainframe development culture rather than a naive attempt to apply modern testing conventions to an environment where they do not belong.
Pricing
IBM Bob offers a 30-day free trial with 40 Bobcoins, the unit used to measure usage. The full COBOL modernization task described here cost approximately 4 Bobcoins.
Final thoughts
Bob's structured separation of planning from execution addresses a real risk in AI-assisted development: generating code that is syntactically correct but architecturally inconsistent with the surrounding system. The Plan mode, combined with the developer approval step before Code mode begins, keeps the developer in control of architectural decisions.
The COBOL audit capability and the contextual explanation of why legacy test infrastructure differs from modern projects are the most distinctive aspects of the tool. For teams managing mainframe systems or facing legacy modernization, these are capabilities that are difficult to replicate with models trained primarily on modern codebases.
Bob is most clearly suited to enterprise environments where code quality, security, and architectural consistency take priority over generation speed. For exploratory or prototype work, the structured workflow may feel more deliberate than necessary.
Documentation and trial access are available at ibm.com/products/bob.