🤖 feat: unify bash into task_* tools #1308

ThomasK33 · 2025-12-23T18:55:14Z

Unifies bash execution and background-process management under the existing task_* tool family.

Key changes:

Extend task to accept kind: "bash" and route execution through existing bash tooling.
Teach task_await, task_list, and task_terminate to handle bash:<processId> task IDs with proper scope checks.
Filter the model-facing toolset to the canonical allowlist so legacy bash_* tools are no longer exposed.
Update agent presets/depth messaging to prevent sub-agent recursion while still allowing bash via task(kind="bash").
Update task tool UI rendering for bash variants + streaming output.

Validation:

make static-check

📋 Implementation Plan

Consolidate `bash_` into the `task_` abstraction (task = agent | bash)

Goal

Reduce tool surface overlap by making task_* the single abstraction for both:

Agent tasks (existing subagent workspaces)
Bash tasks (existing background processes + foreground bash execution)

Success criteria

Models no longer see/need bash, bash_output, bash_background_list, bash_background_terminate in the default toolset.
task, task_await, task_list, task_terminate cover the full bash workflow:
- spawn (fg/bg)
- incremental output
- list
- terminate
Existing behaviors preserved:
- bash output limits + truncation handling
- background process semantics (timeouts, filters, queued-message interruption)
- task scoping rules (only manage descendant work)

Recommended approach (net product LoC ≈ +450)
Extend the task_* tools to support a new bash task kind, backed by BackgroundProcessManager, and stop exposing bash_* to models (align getToolsForModel() with the getAvailableTools() allowlist; keep legacy wrappers temporarily).

1) API shape (tool schemas)

1.1 `task` args: union of agent + bash

Update TaskToolArgsSchema in src/common/utils/tools/toolDefinitions.ts from a single strict object to a strict union:

Agent (unchanged):
- { subagent_type, prompt, title, run_in_background? }
Bash (new):
- { kind: "bash", script, timeout_secs, run_in_background?, display_name }

Notes:

Keep agent shape unchanged to avoid breaking existing tool calls.
Require kind: "bash" so parsing is unambiguous.

1.2 `task` results: keep current discriminant, add optional bash fields

To avoid changing the status-based discriminated unions:

Keep status: "queued"|"running"|"completed" exactly as-is.
Extend completed results with optional bash-oriented fields (safe additive change):
- exitCode?: number
- note?: string
- truncated?: { reason: string; totalLines: number }

For bash tasks:

Foreground completion returns status:"completed" with reportMarkdown containing a formatted bash summary + output.
Background spawn returns status:"running" and a taskId.

1.3 `task_await` becomes the unified “await output/report”

Update TaskAwaitToolArgsSchema to add bash-output knobs (all optional):

filter?: string
filter_exclude?: boolean
timeout_secs?: number (allow 0 for non-blocking, aligning with bash_output)

Update TaskAwaitTool*ResultSchema to include optional streaming fields on active/completed results:

output?: string (incremental output when kind=bash)
elapsed_ms?: number
exitCode?: number
note?: string

Mapping:

Agent tasks: identical behavior (waits for agent_report).
Bash tasks: delegates to BackgroundProcessManager.getOutput(); may return status:"running" with output populated.

Timeout semantics for bash tasks (replacing bash_output):

task_await.timeout_secs is forwarded to BackgroundProcessManager.getOutput(...) and means “wait up to N seconds for new output (or process exit)”.
If the timeout elapses and the process is still running, return status:"running" and whatever incremental output was read since the previous call (output may be empty).
Output is the existing unified stream (stdout+stderr combined from output.log); incomplete final line fragments remain buffered until newline/exit.

1.4 `task_list` returns both kinds

Extend TaskListToolTaskSchema additively:

kind?: "agent"|"bash" (optional; omit for agent tasks to minimize churn)
display_name?: string (bash)
script?: string (bash)
uptime_ms?: number (bash)
exitCode?: number (bash)
Option A (minimal): map bash terminal statuses to status:"reported" and include exitCode.
Option B (more expressive): extend TaskListStatusSchema with exited|killed|failed and plumb through.

1.5 `task_terminate` supports both

Keep task_terminate input shape, but broaden semantics:

task_ids may include agent or bash tasks.

For results:

Keep existing terminatedTaskIds array.
For bash terminations, return terminatedTaskIds: [taskId].

2) Backend implementation (node/services/tools)

2.1 Add a tiny task-id router (defensive)

Create a helper (new file) e.g. src/node/services/tools/taskId.ts:

isBashTaskId(taskId: string): boolean
toBashTaskId(processId: string): string (e.g. bash:${processId})
fromBashTaskId(taskId: string): string | null

Defensive checks:

assert(taskId.trim().length > 0)
reject malformed bash: ids early with explicit errors.

2.2 Implement `task(kind=bash)` in `src/node/services/tools/task.ts`

Dispatch based on args shape:

Agent path: unchanged.
Bash path:
- Validate required deps: config.runtime, config.backgroundProcessManager, config.workspaceId.
- If run_in_background=true:
  - Call backgroundProcessManager.spawn(runtime, workspaceId, script, { cwd, env, displayName, timeoutSecs }).
  - Return status:"running", taskId: toBashTaskId(processId).
- If run_in_background=false:
  - Reuse existing bash execution logic (see 2.3) to preserve output limits/truncation.
  - Convert BashToolResult → TaskToolCompletedResult.

2.3 De-duplicate bash execution logic

Refactor src/node/services/tools/bash.ts so that foreground execution is callable from both tools:

Extract a pure helper like executeBashForeground(config, args, ctx): Promise<BashToolResult>.
createBashTool() remains as thin wrapper.
createTaskTool() uses the helper for bash foreground tasks.

This avoids re-implementing:

line/byte limits
tmpfile overflow policy
streaming-to-UI behavior

2.4 Extend `task_await` to support bash tasks

In src/node/services/tools/task_await.ts:

Determine candidate IDs:
- If args.task_ids provided: use them.
- Else: union of
  - taskService.listActiveDescendantAgentTaskIds(workspaceId)
  - backgroundProcessManager.list() filtered to descendant workspaces + status === "running"
For each id:
- If agent id: existing waitForAgentReport() behavior.
- If bash id:
  - scope check: process.workspaceId must be workspaceId or a descendant agent workspace
  - call backgroundProcessManager.getOutput(processId, filter, filter_exclude, timeout_secs, abortSignal, workspaceId)
  - return:
    - status:"running" with output when running
    - status:"completed" with reportMarkdown and exitCode when exited/killed/failed
    - map interrupted → status:"running" with note:"interrupted" (or add a new status if you choose to extend the schema)

2.5 Extend `task_list` to merge bash processes

In src/node/services/tools/task_list.ts:

Get descendant agent tasks as today.
Compute the set of descendant workspace IDs (include current workspace).
backgroundProcessManager.list() and include any process whose workspaceId is in that set.
Convert each process into a TaskListToolTaskSchema entry.
- Depth: workspaceDepth(process.workspaceId) + 1 (or 1 if owned by current workspace).

2.6 Extend `task_terminate` to terminate bash tasks

In src/node/services/tools/task_terminate.ts:

For each id:
- agent id → existing terminateDescendantAgentTask()
- bash id → scope check + backgroundProcessManager.terminate(processId)

Optional improvement:

If terminating an agent task, also terminate any bash processes owned by that terminated subtree (defensive cleanup). This should be safe since BackgroundProcessManager.cleanup(workspaceId) already exists, but doing it explicitly makes the tool semantics deterministic.

2.7 Tool wiring: ensure `task_*` are init-wait wrapped

Because task_* will now execute bash (runtime-dependent), update src/common/utils/tools/tools.ts#getToolsForModel:

Wrap task, task_await, task_list, task_terminate with wrapWithInitWait(...) (same as bash today).
- This prevents calling into runtime.exec() / BackgroundProcessManager before workspace init.
Keep ask_user_question, propose_plan, todo_*, status_set non-runtime.

3) Tool exposure + deprecation

3.1 Remove `bash_` from the actual* model toolset (not just prompt)

Today, getAvailableTools() is used for system prompt + tool-instruction extraction, but the model still receives whatever getToolsForModel() returns (unless a toolPolicy denies it).

To truly eliminate overlap:

Filter the tools passed to the model to the allowlist from getAvailableTools(modelString, mode, options) (plus MCP tools). Best place: src/common/utils/tools/tools.ts#getToolsForModel just before returning.
Once (1) is in place, remove these from getAvailableTools() (src/common/utils/tools/toolDefinitions.ts):
- bash
- bash_output
- bash_background_list
- bash_background_terminate
Optional cleanup: stop constructing legacy bash_* tools at all (or gate behind a temporary env/feature flag).

3.2 Keep legacy `bash_*` as wrappers (short-lived)

To reduce risk, keep the tools around during rollout:

Update their descriptions to say “Deprecated: use task with kind:"bash".”
Optionally re-implement bash_output/list/terminate by delegating to the new task_* internals.

Why keep wrappers temporarily?

Some internal code paths / older sessions may still refer to bash_* tool names.
PTC sandbox currently lists bash_* as bridgeable tools.
Keeping wrappers allows phased migration without a flag day.

4) Frontend/UI updates (tool message rendering)

Because the model will now call task_* for bash operations, update task-tool renderers to handle the new union types:

src/browser/components/tools/TaskToolCall.tsx
- TaskToolCall: render bash variant (script, display_name, background vs foreground)
- TaskAwaitResult: if output present, render it (even when status !== completed)
- TaskListItem: display display_name/script when agentType/title absent
- TaskStatusBadge: optionally style bash terminal states if you extend the enum
src/browser/stories/App.task.stories.tsx + src/browser/stories/mockFactory.ts
- Add a scenario showing a bash task spawned in background, then task_await output streaming.

5) Tests + validation

5.1 Backend tool tests

Add/extend tests under src/node/services/tools/:

task.bash.test.ts (new):
- spawn bash task in background
- task_await returns incremental output + running
- task_terminate kills it
task_list includes bash processes for:
- current workspace
- descendant workspace (spawned by a child agent workspace)
scope validation:
- parent can manage descendant’s bash tasks
- unrelated workspace cannot

5.2 UI rendering tests (lightweight)

If there are existing component tests, add minimal assertions for bash-variant rendering. Otherwise rely on storybook + typecheck.

5.3 Manual validation checklist

make typecheck
Run targeted unit tests:
- bun test src/node/services/tools/task_await.test.ts
- bun test src/node/services/tools/task_list.test.ts (or add if missing)
- bun test src/node/services/tools/bash_output.test.ts (ensure wrappers still work if kept)

6) Rollout plan

Land schema + backend changes (task supports bash, task_await/list/terminate support bash IDs).
Update UI tool renderers to avoid regressions.
Remove bash_* from getAvailableTools() so models stop seeing them.
After a release or two, delete legacy wrappers and remove bash_* from PTC bridgeables if desired.

Generated with mux • Model: openai:gpt-5.2 • Thinking: xhigh

Change-Id: Ic4a03eb4953916443fcc074412498e0853c0ff17 Signed-off-by: Thomas Kosiewski <tk@coder.com>

Change-Id: If25e133840a3d7cd29c32a0970cabd432a0e2b2d Signed-off-by: Thomas Kosiewski <tk@coder.com>

Change-Id: I93d9236d1160456b3f4f3c05815b70f84e812012 Signed-off-by: Thomas Kosiewski <tk@coder.com>

Change-Id: Ifa45aefee1298bd27137a4d1ffc6bf95517689b8 Signed-off-by: Thomas Kosiewski <tk@coder.com>

Change-Id: I002ab1584a1c9e73ebdd06a126d5bf01acdd9756 Signed-off-by: Thomas Kosiewski <tk@coder.com>

Change-Id: I1a869f9aea822126fdf7098161d5f5198fe12ee2 Signed-off-by: Thomas Kosiewski <tk@coder.com>

ThomasK33 · 2025-12-23T21:48:37Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2025-12-23T22:01:03Z

src/node/services/tools/task_await.ts

  if (typeof timeoutSecs !== "number" || !Number.isFinite(timeoutSecs)) return undefined;
  const timeoutMs = Math.floor(timeoutSecs * 1000);
  if (timeoutMs <= 0) return undefined;
  return timeoutMs;


Honor timeout_secs=0 for agent task_await

The updated schema allows timeout_secs to be 0 (non‑blocking) for task_await, but coerceTimeoutMs still treats <= 0 as “unset,” which falls back to the default 10‑minute wait in waitForAgentReport. If a caller uses timeout_secs: 0 to avoid blocking (especially when mixing bash + agent task IDs), agent tasks will still block for up to 10 minutes instead of returning immediately. Consider treating 0 as a real timeout (immediate return) or short‑circuiting agent tasks when timeout_secs === 0 to match the newly‑allowed semantics.

Useful? React with 👍 / 👎.

ThomasK33 added 2 commits December 23, 2025 20:30

🤖 feat: unify bash into task_* tools

67e1fb1

Change-Id: Ic4a03eb4953916443fcc074412498e0853c0ff17 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 fix: stabilize unit tests for bun + window guards

986955f

Change-Id: If25e133840a3d7cd29c32a0970cabd432a0e2b2d Signed-off-by: Thomas Kosiewski <tk@coder.com>

ThomasK33 force-pushed the task-bash-abstraction branch from 7860d42 to 986955f Compare December 23, 2025 19:33

ThomasK33 added 4 commits December 23, 2025 20:59

🤖 fix: make task tool schema provider-compatible

0f8b023

Change-Id: I93d9236d1160456b3f4f3c05815b70f84e812012 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 tests: migrate bash tool tests to task(kind="bash")

c046d42

Change-Id: Ifa45aefee1298bd27137a4d1ffc6bf95517689b8 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 tests: make task bash integration tests deterministic

5b802fd

Change-Id: I002ab1584a1c9e73ebdd06a126d5bf01acdd9756 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 tests: wait for subscription in forkWorkspace streams

651e51f

Change-Id: I1a869f9aea822126fdf7098161d5f5198fe12ee2 Signed-off-by: Thomas Kosiewski <tk@coder.com>

chatgpt-codex-connector bot reviewed Dec 23, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🤖 feat: unify bash into task_* tools #1308

🤖 feat: unify bash into task_* tools #1308

ThomasK33 commented Dec 23, 2025

Uh oh!

ThomasK33 commented Dec 23, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Dec 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

🤖 feat: unify bash into task_* tools #1308

Are you sure you want to change the base?

🤖 feat: unify bash into task_* tools #1308

Conversation

ThomasK33 commented Dec 23, 2025

Consolidate bash_* into the task_* abstraction (task = agent | bash)

Goal

1) API shape (tool schemas)

1.1 task args: union of agent + bash

1.2 task results: keep current discriminant, add optional bash fields

1.3 task_await becomes the unified “await output/report”

1.4 task_list returns both kinds

1.5 task_terminate supports both

2) Backend implementation (node/services/tools)

2.1 Add a tiny task-id router (defensive)

2.2 Implement task(kind=bash) in src/node/services/tools/task.ts

2.3 De-duplicate bash execution logic

2.4 Extend task_await to support bash tasks

2.5 Extend task_list to merge bash processes

2.6 Extend task_terminate to terminate bash tasks

2.7 Tool wiring: ensure task_* are init-wait wrapped

3) Tool exposure + deprecation

3.1 Remove bash_* from the actual model toolset (not just prompt)

3.2 Keep legacy bash_* as wrappers (short-lived)

4) Frontend/UI updates (tool message rendering)

5) Tests + validation

5.1 Backend tool tests

5.2 UI rendering tests (lightweight)

5.3 Manual validation checklist

6) Rollout plan

Uh oh!

ThomasK33 commented Dec 23, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Consolidate `bash_` into the `task_` abstraction (task = agent | bash)

1.1 `task` args: union of agent + bash

1.2 `task` results: keep current discriminant, add optional bash fields

1.3 `task_await` becomes the unified “await output/report”

1.4 `task_list` returns both kinds

1.5 `task_terminate` supports both

2.2 Implement `task(kind=bash)` in `src/node/services/tools/task.ts`

2.4 Extend `task_await` to support bash tasks

2.5 Extend `task_list` to merge bash processes

2.6 Extend `task_terminate` to terminate bash tasks

2.7 Tool wiring: ensure `task_*` are init-wait wrapped

3.1 Remove `bash_` from the actual* model toolset (not just prompt)

3.2 Keep legacy `bash_*` as wrappers (short-lived)