IBM Bob: Agentic Workflow, Code Review, and COBOL Modernization
IBM Bob is an IDE built on IBM's Granite AI models. Its design distinguishes between planning and code generation: before writing any code, Bob can analyze a codebase, produce a step-by-step implementation plan, and wait for developer approval. This separation is the core architectural choice that differentiates it from tools that generate code immediately on request.
Bob also includes an integrated code review system accessible via /review, automated fix application, and enough familiarity with legacy languages to audit COBOL codebases and explain their testing conventions.
Agentic workflow and modes
Bob's agentic workflow operates through selectable modes that assign a specific role to the AI for a given task.
- Ask handles quick questions, code explanations, and documentation lookup without initiating a code-writing task.
- Plan analyzes requirements and the existing codebase to produce a detailed implementation plan before any code is written. This is the phase where architectural alignment happens.
- Code handles actual file creation and modification, ideally after a plan has been established.
- Orchestrator coordinates complex multi-step tasks, acting as a project manager that sequences planning, execution, and verification.
- Custom modes can be defined for team-specific or project-specific workflows.
Code review with /review
The /review command runs a comprehensive audit of the codebase powered by the Granite models. It looks for security vulnerabilities (SQL injection, XSS), hard-coded secrets and API keys, OWASP-referenced weak practices, potential race conditions, null pointer risks, and logic errors.
Results appear in the Bob Findings panel, which provides the issue description, its location in the code, and its potential impact. Each finding has a light bulb icon that triggers autonomous remediation: Bob applies a fix, shows the diff for approval, and then offers to run targeted unit tests to verify the change did not introduce regressions.
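To make the SQL injection class of finding concrete, here is a small illustrative sketch, not taken from Bob's output: the function names and `accounts` table are hypothetical, but the vulnerable pattern and the parameterized fix are the standard before/after that this kind of review flags.

```python
import sqlite3

def find_account_unsafe(conn: sqlite3.Connection, name: str):
    # Vulnerable: user input is interpolated directly into the SQL string,
    # so a name like "x' OR '1'='1" rewrites the query's meaning.
    return conn.execute(
        f"SELECT id, balance FROM accounts WHERE name = '{name}'"
    ).fetchall()

def find_account_safe(conn: sqlite3.Connection, name: str):
    # Fixed: a parameterized query treats the input strictly as data.
    return conn.execute(
        "SELECT id, balance FROM accounts WHERE name = ?", (name,)
    ).fetchall()
```

The safe version returns no rows for an injection payload, while the unsafe version returns every account.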
Modernizing a COBOL application
To test Bob's agentic capabilities in a demanding scenario, the following example migrates zBANK, an open-source COBOL bank account management application, to a Python web application using Streamlit.
Prompt structure
The quality of the output depends on the specificity of the prompt. An effective prompt for this task includes several distinct parts:
- Persona: "You are a Python Developer specializing in rapid prototyping and legacy modernization."
- Goal: "Create a standalone ATM application with a web-based UI using Streamlit, launchable from the terminal."
- Core logic requirements: "Create a BankEngine class that mimics the zBANK logic (Balance, Deposit, Withdraw). Use SQLite for persistent storage to simulate the VSAM files from the original code."
- Frontend requirements: "Build a Streamlit UI with a sidebar for account selection, balance metric cards, deposit and withdrawal forms, and a transaction history table."
- Setup requirements: "Provide a requirements.txt and a run.sh script so the demo can be launched with a single command."
- Code style: "Use type hints and PEP 8-compliant code."
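For orientation, a minimal sketch of the kind of BankEngine class these requirements describe might look like the following. The schema, method names, and open_account helper are illustrative assumptions, not Bob's actual output; the point is SQLite standing in for the original VSAM files.

```python
import sqlite3

class BankEngine:
    """Illustrative core-logic class: SQLite simulates the VSAM files."""

    def __init__(self, db_path: str = "zbank.db") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts ("
            "account_id TEXT PRIMARY KEY, balance REAL NOT NULL)"
        )
        self.conn.commit()

    def open_account(self, account_id: str, balance: float = 0.0) -> None:
        # Hypothetical helper for seeding accounts in a demo.
        self.conn.execute(
            "INSERT OR IGNORE INTO accounts VALUES (?, ?)",
            (account_id, balance),
        )
        self.conn.commit()

    def balance(self, account_id: str) -> float:
        row = self.conn.execute(
            "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"Unknown account: {account_id}")
        return row[0]

    def deposit(self, account_id: str, amount: float) -> float:
        if amount <= 0:
            raise ValueError("Deposit must be positive")
        self.conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
            (amount, account_id),
        )
        self.conn.commit()
        return self.balance(account_id)

    def withdraw(self, account_id: str, amount: float) -> float:
        if amount <= 0 or amount > self.balance(account_id):
            raise ValueError("Invalid withdrawal amount")
        self.conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        self.conn.commit()
        return self.balance(account_id)
```

A Streamlit frontend would then call this class from its form handlers, keeping the business logic separate from the UI layer.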
Specifying the architecture, storage mechanism, UI components, and code style eliminates ambiguity and reduces the likelihood of the model making poor design decisions.
Auto-approval permissions
Before execution, Bob presents an auto-approval modal listing the action types it may need to perform: reading files, writing files, executing terminal commands, and switching modes. Permissions can be granted or denied per category, limiting the agent to only the access required for the task.
Result
The full task completes in approximately three minutes. Bob reads the COBOL source files, creates bank_engine.py and app.py, and generates the setup scripts.
The resulting application has a login screen, a balance dashboard, deposit and withdrawal forms, and a real-time transaction history table backed by SQLite.
Reviewing the generated Python code
Running /review on the newly generated Python application surfaces findings including "Database connection errors not handled": the SQLite operations lack try...except blocks, which could cause unhandled crashes.
Clicking the light bulb icon on that finding triggers Bob to wrap the relevant operations in error handling, present the changes for approval, and then ask whether to run targeted tests on the modified file. Selecting yes causes Bob to generate and execute a temporary unit test confirming no regressions were introduced. This plan-fix-verify cycle is the same mechanism used whether the code was written by Bob or by a human.
Auditing the original COBOL source
Running /review on the original, unmodified COBOL codebase surfaces eight critical issues.
The findings include: "PIN stored and transmitted in plaintext without encryption," "Withdrawal allows overdraft with no balance validation," and "Deposit accepts zero/negative amounts." Bob applies a fix to the withdrawal validation, adding checks to ensure the amount is positive and does not exceed the available balance.
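Expressed in Python rather than COBOL, the validation logic added to the withdrawal path amounts to a guard like the one below. This is a hedged paraphrase of the described fix, not the applied COBOL diff.

```python
def validate_withdrawal(balance: float, amount: float) -> None:
    """Reject withdrawals that are non-positive or would overdraw the account."""
    if amount <= 0:
        raise ValueError("Withdrawal amount must be positive")
    if amount > balance:
        raise ValueError("Insufficient funds: withdrawal exceeds balance")
```

The same guard would also close the "Deposit accepts zero/negative amounts" finding if mirrored on the deposit path.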
Contextual awareness in legacy environments
After applying the COBOL fix, Bob again offers to add tests. Rather than generating a test file, it explains why that would be inappropriate:
Testing Status: No test files or test configuration found in this COBOL/CICS mainframe project. This is typical for legacy mainframe applications that rely on manual testing or mainframe-specific testing tools not present in the repository. The fix has been applied successfully and follows the suggested validation logic.
This response reflects an understanding of legacy mainframe development culture rather than a naive attempt to apply modern testing conventions to an environment where they do not belong.
Pricing
IBM Bob offers a 30-day free trial with 40 Bobcoins, the unit used to measure usage. The full COBOL modernization task described here cost approximately 4 Bobcoins.
Final thoughts
Bob's structured separation of planning from execution addresses a real risk in AI-assisted development: generating code that is syntactically correct but architecturally inconsistent with the surrounding system. The Plan mode, combined with the developer approval step before Code mode begins, keeps the developer in control of architectural decisions.
The COBOL audit capability and the contextual explanation of why legacy test infrastructure differs from modern projects are the most distinctive aspects of the tool. For teams managing mainframe systems or facing legacy modernization, these are capabilities that are difficult to replicate with models trained primarily on modern codebases.
Bob is most clearly suited to enterprise environments where code quality, security, and architectural consistency take priority over generation speed. For exploratory or prototype work, the structured workflow may feel more deliberate than necessary.
Documentation and trial access are available at ibm.com/products/bob.