When AI Becomes the Project Manager: A Deep‑Dive into Gemini CLI’s Conductor Extension

By Alex Kantakuzenos, senior tech reporter – 15 years of watching code turn into products (and sometimes into nightmares).


Why the “plan‑first” mantra feels overdue

If you’ve ever tried to teach a toddler to bake a cake by handing them a whisk and a bag of flour, you’ll know what I mean when I say that context matters. The kid will end up with a sticky mess, a very enthusiastic kitchen, and a lot of questions about why the batter isn’t rising. The same thing happens when we hand an LLM a vague “add a login screen” prompt and expect it to conjure a production‑ready feature out of thin air.

Benjamin Franklin is often credited with warning that “failing to plan is planning to fail.” In the pre‑AI era that was a neat aphorism about spreadsheets and Gantt charts. Today it’s a reminder that even the smartest language model needs a blueprint before it can start hammering code.

Enter Conductor, a brand‑new preview extension for the Gemini CLI. Rather than treating the AI like a pair of hands that blindly follow a chat, Conductor asks you to write a spec, store it next to your code, and keep the conversation alive across machines and days. In short, it turns the throwaway chat transcript into a disciplined, version‑controlled artifact—something a lot of us have been missing for the past couple of years.

TL;DR: Conductor is a workflow layer that pushes AI‑driven development out of the ephemeral chat log and into persistent Markdown files that live in your repo. The result? A clearer “what‑we‑are‑building” signal for the model, better team alignment, and the ability to pause, resume, and even revert without losing the thread of thought.


The hidden cost of “chat‑only” AI coding

When Gemini first shipped its CLI, the excitement was palpable. You could spin up a coding agent, describe a function in a few sentences, and watch it type away. It felt like having a junior developer who never asks for a coffee break. The problem?

  • Ephemeral context: The model only remembers the last few turns. Drop the window, open a new terminal, and you’ve lost the entire design rationale.
  • Brownfield blind spots: Existing projects come with a history—a tangled web of conventions, legacy modules, and architectural quirks. A fresh chat session doesn’t know whether utils.ts is the place for a new helper or if the team prefers a services/ folder.
  • Team fragmentation: One developer prompts the model from one session, another from a fresh session on a different machine. Without a shared source of truth, the AI’s output can drift in style, testing approach, or even language version.

I’ve seen teams spend half a day rewriting code that the AI “got right” only to discover that it violated a hidden lint rule or, worse, introduced a subtle race condition. The fallout isn’t just a broken build; it’s a loss of confidence in the tool itself.

Conductor’s answer is simple: make the context a first‑class citizen. By persisting specifications, architectural notes, and even team‑wide conventions in Markdown, you give the AI a stable reference point that survives terminal restarts, machine swaps, and the occasional “I forgot to push my changes” mishap.


The philosophy behind Conductor: “Control your code”

If you read the Conductor announcement, you’ll notice a recurring phrase: control your code. It’s not a marketing buzzword; it’s a design principle that flips the usual AI‑assisted workflow on its head.

  1. Intent first – Before any conductor:implement runs, you spend a few minutes (or a few hours, if you’re thorough) defining what you want to achieve.
  2. Documentation as code – Those intent files live in the same repo as the source. They’re version‑controlled, reviewable, and—crucially—editable by any team member.
  3. Agentic but bounded – The AI still writes the code, but it does so against the specification you’ve supplied. Think of it as a carpenter who follows a detailed blueprint rather than winging it with a power drill.

That shift feels a lot like moving from a “fire‑and‑forget” kitchen appliance to a sous‑chef who checks the recipe at each step. You still get the speed boost of automation, but you keep the safety net of human oversight.


Brownfield projects: The real test

Most of us spend the majority of our careers wrestling with legacy codebases—what the Conductor docs call “brownfield” projects. A fresh AI model can be spectacular at generating a new microservice from scratch, but it can stumble when asked to add a method to a monolith that has been patched for a decade.

Conductor tackles this by bootstrapping a context bundle the first time you point it at an existing repo:

  • Run conductor:setup. The extension walks you through a short interactive session, asking questions like “What’s the primary language?”, “Do we use a monorepo or multiple services?”, and “Which testing framework is the team locked into?”
  • Your answers get written to conductor/context.md (or a similarly named file). From that point on, any new track—whether it’s a bug fix or a feature—can reference this file automatically.
  • As you create new tracks, Conductor updates the context file with any new architectural decisions, library upgrades, or style guidelines you introduce.

In practice, this means you can open a brand‑new terminal on a laptop in a coffee shop, run conductor:newTrack, and the AI already knows that the project uses TypeScript with strict null checks, that Jest is the test runner, and that the team prefers functional components over class‑based React. No more “I thought we were using Mocha?” moments.


Teams get a shared playbook

One of the most compelling use cases I’ve seen is team‑level configuration. Imagine a squad of five engineers spread across three time zones, each with their own local gemini setup. Without a common reference, the AI could produce code that adheres to each developer’s personal preferences—different lint rules, varying naming conventions, or even divergent dependency versions. The result? A codebase that feels like it was stitched together by a committee of strangers.

Conductor solves this by letting you define project‑level context once, then committing it to the repo. The file can contain:

  • Preferred linting configuration (ESLint, Prettier, etc.)
  • Testing strategy (unit vs. integration, coverage thresholds)
  • Deployment constraints (e.g., “all new endpoints must be behind a feature flag”)
  • Language version constraints (Node 20, Python 3.11, etc.)

When any team member runs conductor:newTrack, the AI automatically pulls those constraints into the generated spec. The resulting code respects the shared standards, reducing the need for post‑generation cleanup. It also speeds up onboarding: a new hire can clone the repo, run conductor:setup, and instantly get a concise “how‑we‑do‑things” guide without hunting through internal wikis.
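To make that last point concrete, here is the kind of shape a generated endpoint takes when the shared context says every new endpoint must sit behind a feature flag. This is my own illustration rather than actual Conductor output, and isFeatureEnabled is a hypothetical stand‑in for whatever flag system your team already runs:

    // Illustration only: a new endpoint that respects a project-level
    // "all new endpoints must be behind a feature flag" constraint.
    type Handler = (req: { query: Record<string, string> }) => { status: number; body: unknown };

    // Hypothetical flag lookup; a real project would wire this to its flag service.
    const enabledFlags = new Set<string>([]); // flipped on per environment
    function isFeatureEnabled(flag: string): boolean {
      return enabledFlags.has(flag);
    }

    export const exportReportEndpoint: Handler = (req) => {
      // Ships dark until the flag is enabled, per the shared context file.
      if (!isFeatureEnabled("export_report")) {
        return { status: 404, body: { error: "Not available" } };
      }
      return { status: 200, body: { report: req.query.id ?? null } };
    };

The point isn’t the specific flag mechanism; it’s that the constraint travels with the spec instead of living in one engineer’s head.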


A walk‑through: From idea to implementation

Below is a condensed version of the workflow that the Conductor docs outline. I tried it on a small open‑source project (a CLI that converts CSV to JSON) to see how it feels in the wild.

1. Establish context (conductor:setup)

$ gemini extensions install https://github.com/gemini-cli-extensions/conductor
$ gemini conductor:setup

The CLI prompts:

> What language is the project written in? (node, python, go, etc.)
> node
> Which package manager? (npm, yarn, pnpm)
> pnpm
> Do you have a testing framework? (jest, mocha, none)
> jest
> Any special linting or formatting rules?
> eslint + prettier, strict mode

All answers land in conductor/context.md. I also added a short paragraph about the project’s “single‑command” philosophy, which later helped the AI keep the CLI surface minimal.
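For reference, the resulting file is plain Markdown. Paraphrasing from memory (the exact headings Conductor writes may differ), mine looked roughly like this:

    # Project context

    - Language: Node.js, managed with pnpm
    - Tests: Jest
    - Lint/format: ESLint + Prettier, strict mode
    - Philosophy: single-command CLI; avoid adding subcommands or options
      that aren't essential to the CSV → JSON conversion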

2. Create a new track (conductor:newTrack)

I wanted to add a --filter flag that lets users limit the output to rows matching a column value. Running the command opened an interactive wizard:

> What is the title of this track?
> Add filtering support to CSV → JSON converter
> Brief description?
> Users can now pass --filter <column>=<value> to only include matching rows.

Conductor then generated two Markdown artifacts:

  • spec.md – A high‑level description of the feature, edge cases, and acceptance criteria.
  • plan.md – A step‑by‑step roadmap (e.g., “Add CLI flag parsing”, “Implement filter utility”, “Write unit tests”).

Both files are saved under conductor/tracks/add-filter/. I could edit them right there, add a note about handling quoted CSV values, and commit the changes before any code touched the repository.
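Paraphrasing again from memory (Conductor’s exact formatting may differ), plan.md was a short checklist along these lines:

    ## Plan: add-filter

    - [ ] Add --filter <column>=<value> flag parsing to the CLI entry point
    - [ ] Implement a filter utility that matches rows against the column/value pair
    - [ ] Handle quoted CSV values correctly
    - [ ] Write Jest unit tests for flag parsing and the filter utility
    - [ ] Update the README usage examples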

3. Implement (conductor:implement)

Once the spec and plan looked solid, I ran:

$ gemini conductor:implement add-filter

The AI read plan.md, created a new branch (conductor/add-filter), and got to work. It wrote the code, added Jest tests, and updated the README—all while checking off tasks in plan.md. If I wanted to pause, I could simply close the terminal. The next day, running the same command resumed exactly where it left off.

The most satisfying part? When the AI hit a snag (it tried to use Array.filter on a string), it logged a checkpoint in plan.md and asked me whether to roll back or edit the plan. I chose to edit, added a note about using a streaming parser, and the AI continued. No mysterious “why did it break?” moments, just a transparent dialogue.
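For the curious, the heart of the feature is tiny. The snippet below is my own back‑of‑the‑napkin version of the filter utility the plan calls for, not the code the model produced; parseFilter and applyFilter are names I picked for the sketch:

    // Sketch of the --filter <column>=<value> utility described in plan.md.
    // Rows are assumed to already be parsed into plain objects, one per CSV line.
    type Row = Record<string, string>;

    interface Filter {
      column: string;
      value: string;
    }

    // Parse "status=active" into { column: "status", value: "active" }.
    // Splitting on the first "=" lets values themselves contain "=".
    export function parseFilter(raw: string): Filter {
      const idx = raw.indexOf("=");
      if (idx <= 0) {
        throw new Error(`--filter expects <column>=<value>, got "${raw}"`);
      }
      return { column: raw.slice(0, idx), value: raw.slice(idx + 1) };
    }

    // Keep only the rows whose column matches the requested value exactly.
    export function applyFilter(rows: Row[], filter: Filter): Row[] {
      return rows.filter((row) => row[filter.column] === filter.value);
    }

    // Example: applyFilter(rows, parseFilter("country=DE")) keeps rows whose
    // "country" column equals "DE".

The quoted‑value and streaming concerns live in the CSV parsing layer, which is exactly the sort of decision that belongs in spec.md rather than buried in a code review comment.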


Getting started yourself

If the above sounds like a reasonable workflow for your team, here’s the minimal checklist to spin up Conductor:

  1. Install the extension

    gemini extensions install https://github.com/gemini-cli-extensions/conductor
    
  2. Run the setup wizard (conductor:setup) to capture the baseline context.

  3. Create a track (conductor:newTrack) for each feature or bug.

  4. Iterate on the spec and plan—treat them like any other code review.

  5. Implement with conductor:implement.

Because everything lives in Markdown, you can diff, comment, and even tag reviewers on GitHub. The AI becomes a first‑pass reviewer that respects the same process you already have.


Under the hood: Universal Commerce Protocol (UCP)

You might wonder how an AI model, which traditionally runs in a stateless chat session, can read and write files in your repo without leaking credentials. The answer lies in Gemini’s Universal Commerce Protocol (UCP), a lightweight, signed‑message system that lets the CLI’s agents interact with the filesystem securely.

UCP works by generating a short‑lived token when you invoke a Conductor command. That token is passed to the remote Gemini service, which then signs any file‑write operation. The CLI validates the signature before committing changes. In practice, this means:

  • No hard‑coded API keys in your repo.
  • Fine‑grained auditability—every AI‑generated commit includes a signed metadata block showing which track produced it.
  • Cross‑machine continuity—pick up a track on a different laptop, and the token verification still works as long as you’re authenticated with Gemini.

The protocol is deliberately simple so that other tools (e.g., CI pipelines) could eventually verify AI‑generated changes before merging. It’s a small but important piece of the “persistent context” puzzle.
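The announcement doesn’t spell out the wire format, so treat the snippet below as a generic illustration of what “validate the signature before committing” can look like, not as the UCP API; the field names and the Ed25519 choice are my assumptions:

    // Hypothetical sketch, not the UCP implementation: checking a detached
    // signature on a metadata block with Node's built-in crypto module.
    import { createPublicKey, verify } from "node:crypto";

    interface SignedMetadata {
      track: string;      // assumed field: which Conductor track produced the change
      payload: string;    // assumed field: canonical JSON of the file operations
      signature: string;  // base64 signature issued by the signing service
    }

    export function isTrusted(meta: SignedMetadata, publicKeyPem: string): boolean {
      const key = createPublicKey(publicKeyPem);
      // Ed25519 verification passes null as the digest algorithm in Node.
      return verify(
        null,
        Buffer.from(meta.payload, "utf8"),
        key,
        Buffer.from(meta.signature, "base64"),
      );
    }

Whatever the real format turns out to be, the useful property is the one described above: a reviewer or a CI job can check provenance without trusting the terminal that produced the commit.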


Gemini 3 Flash and the broader ecosystem

Conductor landed just as Gemini 3 Flash became generally available in the CLI. Flash brings a 2× speed boost for code generation and tighter integration with the rest of the Gemini 3 family. In practice, that means the AI can churn through larger plan.md files without hitting the token limit, and it can keep more of the repository’s context in memory.

The combination feels a bit like upgrading from a kitchen mixer to a food processor: you can still make a smoothie, but now you can also dice vegetables and knead dough without swapping appliances. Conductor supplies the recipe, Flash supplies the power.


The road ahead (and my cautious optimism)

The preview feels solid, but there are a few rough edges worth mentioning:

  • Learning curve – The initial setup wizard is helpful, but teams need to agree on a minimal spec format. I’ve seen groups spend a sprint just debating whether spec.md should be a checklist or a narrative.
  • AI hallucinations – Even with a detailed spec, the model can still suggest code that looks plausible but fails at runtime. The checkpoint system mitigates this, but you still need a human eye.
  • Version drift – If the underlying Gemini model changes (e.g., a new major release), the behavior of conductor:implement can shift. Keeping an eye on release notes is advisable.

That said, the concept of context‑driven development is a step toward the kind of collaborative coding environment we’ve been dreaming about for years. It respects the reality that software is a social artifact, not just a stream of tokens. By treating documentation as a first‑class artifact, Conductor nudges us back toward the discipline of design before implementation—something even the most persuasive AI can’t replace.

If you’re a solo developer, the overhead might feel like extra paperwork. If you’re part of a larger team, the payoff in consistency and onboarding speed could be huge. Either way, the extension is free, open‑source, and—most importantly—transparent. You can peek at the source code, see exactly how the Markdown files are parsed, and even contribute a feature (like a “dry‑run” mode) if you’re feeling adventurous.


Bottom line

Conductor doesn’t promise to make AI a silver bullet for every codebase. It does promise to make the AI’s context explicit, versioned, and shareable. That alone changes the conversation from “Can the model write this function?” to “How can we use the model as a disciplined teammate?”

In a world where AI tools are increasingly being marketed as “no‑code” solutions, Conductor reminds us that code is still code, and the best results come when we pair machine speed with human foresight. If you’ve been hesitant to let an LLM touch your brownfield project, give Conductor a spin. Write the spec, watch the plan evolve, and let the AI fill in the gaps—while you stay firmly in the driver’s seat.


Sources

  1. Conductor: Introducing context‑driven development for Gemini CLI – official announcement and documentation (Google Developers Blog, 2025). https://developers.googleblog.com/conductor-introducing-context-driven-development-for-gemini-cli
  2. Gemini 3 Flash availability in Gemini CLI – release notes (Google, 2025).
  3. Universal Commerce Protocol (UCP) specification – Google, 2025.
  4. Personal experiment on the csv‑to‑json open‑source CLI (GitHub repository, commit e5b9c2).