Bumblebee: Read-Only Endpoint Scanner for Developer Machine Supply-Chain Exposure

Stanley Ulili

Updated on June 1, 2026

The problem: developer machines as attack surface
What Bumblebee scans
Why read-only matters
Installation and usage
Scan profiles
How Bumblebee relates to other security tools
Final thoughts

Bumblebee is an open-source Go binary from Perplexity AI that inventories a developer machine's packages, editor extensions, browser extensions, and AI tool configurations by parsing metadata files directly. It never executes package managers (npm, pip, and so on) or project code, so it cannot trigger malicious lifecycle scripts in compromised packages. Output is NDJSON (newline-delimited JSON), structured for ingestion into security tooling or logging platforms.

The problem: developer machines as attack surface

Traditional security scanning focuses on repositories, container images, and production environments. A developer's local machine has its own attack surface, and none of those scans cover it.

Diagram showing the Universe of Tools on a developer laptop with connections to package managers, browser extensions, AI tools, and more

A typical developer machine runs multiple package managers (npm, pip, Go modules, Bun), browser extensions, editor plugins, AI coding assistants, and local MCP servers. Each is a potential supply-chain attack vector. When a malicious package is discovered, the question is not only whether it reached production but whether any developer already has it installed locally.

What Bumblebee scans

Bumblebee reads on-disk metadata (lock files, extension manifests, configuration files) to inventory:

Packages across Go, npm, pip, and other ecosystems
Editor extensions (VS Code, JetBrains)
Browser extensions
AI tool and MCP server configurations
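To make "reading on-disk metadata" concrete, here is a minimal Python sketch that lists packages from an npm package-lock.json without invoking npm. It is an illustration only, not Bumblebee's actual implementation (Bumblebee is written in Go). The function name packages_from_lockfile is hypothetical, and the sketch assumes lockfileVersion 2 or 3, where installed packages sit in a flat "packages" map keyed by install path.

```python
import json
from pathlib import Path


def packages_from_lockfile(lock_path):
    """Return sorted (name, version) pairs from an npm package-lock.json.

    Assumes lockfileVersion 2 or 3, which keep a flat "packages" map
    keyed by install path (for example "node_modules/left-pad").
    """
    data = json.loads(Path(lock_path).read_text(encoding="utf-8"))
    found = set()
    for install_path, meta in data.get("packages", {}).items():
        if not install_path:
            continue  # the "" key describes the root project itself
        # The package name is whatever follows the last "node_modules/".
        name = meta.get("name") or install_path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version")
        if version:
            found.add((name, version))
    return sorted(found)
```

The file is only read and parsed as JSON; nothing in it is executed. A real scanner would also need to handle older lock file versions, workspace links, and the other ecosystems listed above.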

Bumblebee scanning pipeline showing how threat intel feeds into an exposure catalog, which Bumblebee uses to scan devices and produce logs and inventories

The pipeline allows a security team to take a new threat signal (a public advisory about a malicious package), create an exposure catalog, and immediately query which developer machines are affected.

Why read-only matters

npm and other package managers support lifecycle scripts: shell commands that run automatically at install time (preinstall, postinstall). Malicious packages can use these hooks to execute attacker-controlled code.

Animation showing how a scanner executing package manager commands can trigger a malicious preinstall hook

Running npm ls or similar commands in project directories to inventory packages risks triggering these scripts. Bumblebee parses the metadata files instead of invoking the package manager, so it cannot trigger lifecycle scripts regardless of what they contain.
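The difference between running a package manager and parsing its metadata can be sketched in a few lines of Python. The example below reads a package.json as plain data and reports any install-time hooks it declares. The function name install_hooks and the choice of hooks to report are assumptions made for illustration; this is not how Bumblebee itself is implemented.

```python
import json
from pathlib import Path

# npm lifecycle hooks that run automatically during an install.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_path):
    """Return the install-time scripts declared in a package.json, as a dict."""
    manifest = json.loads(Path(package_json_path).read_text(encoding="utf-8"))
    scripts = manifest.get("scripts") or {}
    if not isinstance(scripts, dict):
        return {}  # malformed manifest: treat as declaring no hooks
    # The script strings are returned as data; they are never passed to a shell.
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}
```

Because the manifest is treated purely as JSON text, a hostile postinstall command is reported as a string rather than run.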

Installation and usage

Bumblebee is a single self-contained binary. It runs no daemon and has no dependencies outside the Go standard library.

go install github.com/perplexity-ai/bumblebee/cmd/bumblebee@latest

Self-test

bumblebee selftest

Expected output resembles selftest OK (3 findings in 4ms); the timing will vary by machine. This verifies that the binary works before you run a live scan.

Baseline scan

bumblebee scan --profile baseline > inventory.ndjson

The baseline profile scans common global and user-level locations for package roots, editor extensions, browser extensions, and AI tool configs. It completes in seconds.

Reading the output

Each line in the NDJSON file is a self-contained JSON record:

head -n 1 inventory.ndjson | jq

A sample record:

{
  "record_type": "package",
  "scanner_name": "bumblebee",
  "scan_time": "2026-05-25T00:39:17.808781Z",
  "endpoint": {
    "hostname": "Joshs-MacBook-Pro.local",
    "os": "darwin",
    "arch": "arm64",
    "username": "josh"
  },
  "profile": "baseline",
  "ecosystem": "go",
  "package_name": "github.com/davecgh/go-spew",
  "version": "v1.1.1",
  "project_path": "/Users/josh/go/pkg/mod/github.com/!i!b!m/sarama@v1.43.3/",
  "source_file": "/Users/josh/go/pkg/mod/github.com/!i!b!m/sarama@v1.43.3/go.mod",
  "od": {
    "direct_dependency": true,
    "has_lifecycle_scripts": false,
    "confidence": "medium"
  }
}

Each record includes the endpoint details, the package ecosystem and version, the exact source file where the dependency was found, and whether the package has lifecycle scripts.
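Since every line is an independent JSON object, the file can be processed one line at a time without loading it whole. The following Python sketch counts package records per ecosystem. It relies only on the record_type and ecosystem keys visible in the sample record; the function names are hypothetical.

```python
import json
from collections import Counter


def read_records(lines):
    """Yield one dict per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def packages_by_ecosystem(records):
    """Count records whose record_type is "package", grouped by ecosystem."""
    return Counter(
        r.get("ecosystem", "unknown")
        for r in records
        if r.get("record_type") == "package"
    )
```

An open file object iterates line by line, so passing the handle for inventory.ndjson to read_records and the result to packages_by_ecosystem gives a per-ecosystem summary of the scan.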

Scan profiles

baseline scans common global and user-level package roots. It is suitable for regular, lightweight inventory and runs in seconds.

project-root scans workspace directories where developers keep active code (such as ~/code or ~/src). Useful for checking dependencies in project lock files.

deep is the incident response profile. It recursively searches one or more explicit root directories for any evidence of packages. It is slower than the other profiles but more thorough.

bumblebee scan --profile deep \
  --root /Users/josh \
  --exposure-catalog ./catalog.json \
  --findings-only \
  --max-duration 5m > findings.ndjson

--exposure-catalog accepts a JSON file of known malicious packages. --findings-only suppresses records that do not match the catalog, producing a focused incident report. --max-duration ensures the scan completes in a bounded time window.

How Bumblebee relates to other security tools

SCA (Software Composition Analysis) analyzes a project's declared dependencies for known vulnerabilities. It covers what a team is building.

SBOM (Software Bill of Materials) creates a formal manifest of everything in a released artifact. It covers what a team ships.

EDR (Endpoint Detection and Response) monitors runtime process behavior. It covers what executed on a machine.

Bumblebee covers the local developer state: everything present on a developer's machine, including packages from old clones, globally installed tools, and extensions, regardless of whether any of it is part of an active project or ever shipped.

These categories are complementary. Bumblebee fills a gap that the others do not address.

Final thoughts

The practical workflow is: run baseline scans regularly and store the NDJSON output centrally. When a new advisory appears, query the stored inventory for the affected package names and versions to identify exposed machines without waiting for developers to self-report.
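As a rough illustration of that query step, the Python sketch below scans stored NDJSON lines and returns the hostnames that have an affected version installed. It uses only field names from the sample record (record_type, ecosystem, package_name, version, endpoint.hostname). The function name exposed_hosts and the idea of passing the affected versions as a set are assumptions made for this example, not part of Bumblebee.

```python
import json


def exposed_hosts(ndjson_lines, ecosystem, package_name, bad_versions):
    """Return sorted hostnames whose inventory lists an affected version."""
    hosts = set()
    for line in ndjson_lines:
        if not line.strip():
            continue  # skip blank lines between records
        rec = json.loads(line)
        if (
            rec.get("record_type") == "package"
            and rec.get("ecosystem") == ecosystem
            and rec.get("package_name") == package_name
            and rec.get("version") in bad_versions
        ):
            hosts.add(rec.get("endpoint", {}).get("hostname", "unknown"))
    return sorted(hosts)
```

In practice the stored inventory would more likely live in a logging platform and be queried there; the logic is the same exact-match filter on ecosystem, name, and version.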

The read-only design is what makes this safe to run during an active incident. A compromised machine should not be prodded with commands that execute code from its package directories. Bumblebee's metadata-only approach is appropriate for both routine hygiene and incident response.

Source code and documentation are at github.com/perplexity-ai/bumblebee.
