-
Notifications
You must be signed in to change notification settings - Fork 32
🤖 feat: unify bash into task_* tools #1308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Change-Id: Ic4a03eb4953916443fcc074412498e0853c0ff17 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: If25e133840a3d7cd29c32a0970cabd432a0e2b2d Signed-off-by: Thomas Kosiewski <tk@coder.com>
7860d42 to
986955f
Compare
Change-Id: I93d9236d1160456b3f4f3c05815b70f84e812012 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ifa45aefee1298bd27137a4d1ffc6bf95517689b8 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I002ab1584a1c9e73ebdd06a126d5bf01acdd9756 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I1a869f9aea822126fdf7098161d5f5198fe12ee2 Signed-off-by: Thomas Kosiewski <tk@coder.com>
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (typeof timeoutSecs !== "number" || !Number.isFinite(timeoutSecs)) return undefined; | ||
| const timeoutMs = Math.floor(timeoutSecs * 1000); | ||
| if (timeoutMs <= 0) return undefined; | ||
| return timeoutMs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honor timeout_secs=0 for agent task_await
The updated schema allows timeout_secs to be 0 (non‑blocking) for task_await, but coerceTimeoutMs still treats <= 0 as “unset,” which falls back to the default 10‑minute wait in waitForAgentReport. If a caller uses timeout_secs: 0 to avoid blocking (especially when mixing bash + agent task IDs), agent tasks will still block for up to 10 minutes instead of returning immediately. Consider treating 0 as a real timeout (immediate return) or short‑circuiting agent tasks when timeout_secs === 0 to match the newly‑allowed semantics.
Useful? React with 👍 / 👎.
Unifies bash execution and background-process management under the existing
task_*tool family.Key changes:
taskto acceptkind: "bash"and route execution through existing bash tooling.task_await,task_list, andtask_terminateto handlebash:<processId>task IDs with proper scope checks.bash_*tools are no longer exposed.task(kind="bash").Validation:
make static-check📋 Implementation Plan
Consolidate
bash_*into thetask_*abstraction (task = agent | bash)Goal
Reduce tool surface overlap by making
task_*the single abstraction for both:Success criteria
bash,bash_output,bash_background_list,bash_background_terminatein the default toolset.task,task_await,task_list,task_terminatecover the full bash workflow:Recommended approach (net product LoC ≈ +450)
Extend the
task_*tools to support a new bash task kind, backed byBackgroundProcessManager, and stop exposingbash_*to models (aligngetToolsForModel()with thegetAvailableTools()allowlist; keep legacy wrappers temporarily).1) API shape (tool schemas)
1.1
taskargs: union of agent + bashUpdate
TaskToolArgsSchemainsrc/common/utils/tools/toolDefinitions.tsfrom a single strict object to a strict union:{ subagent_type, prompt, title, run_in_background? }{ kind: "bash", script, timeout_secs, run_in_background?, display_name }Notes:
kind: "bash"so parsing is unambiguous.1.2
taskresults: keep current discriminant, add optional bash fieldsTo avoid changing the
status-based discriminated unions:status: "queued"|"running"|"completed"exactly as-is.exitCode?: numbernote?: stringtruncated?: { reason: string; totalLines: number }For bash tasks:
status:"completed"withreportMarkdowncontaining a formatted bash summary + output.status:"running"and ataskId.1.3
task_awaitbecomes the unified “await output/report”Update
TaskAwaitToolArgsSchemato add bash-output knobs (all optional):filter?: stringfilter_exclude?: booleantimeout_secs?: number(allow0for non-blocking, aligning withbash_output)Update
TaskAwaitTool*ResultSchemato include optional streaming fields on active/completed results:output?: string(incremental output when kind=bash)elapsed_ms?: numberexitCode?: numbernote?: stringMapping:
agent_report).BackgroundProcessManager.getOutput(); may returnstatus:"running"withoutputpopulated.Timeout semantics for bash tasks (replacing
bash_output):task_await.timeout_secsis forwarded toBackgroundProcessManager.getOutput(...)and means “wait up to N seconds for new output (or process exit)”.status:"running"and whatever incremental output was read since the previous call (outputmay be empty).output.log); incomplete final line fragments remain buffered until newline/exit.1.4
task_listreturns both kindsExtend
TaskListToolTaskSchemaadditively:kind?: "agent"|"bash"(optional; omit for agent tasks to minimize churn)display_name?: string(bash)script?: string(bash)uptime_ms?: number(bash)exitCode?: number(bash)status:"reported"and includeexitCode.TaskListStatusSchemawithexited|killed|failedand plumb through.1.5
task_terminatesupports bothKeep
task_terminateinput shape, but broaden semantics:task_idsmay include agent or bash tasks.For results:
terminatedTaskIdsarray.terminatedTaskIds: [taskId].2) Backend implementation (node/services/tools)
2.1 Add a tiny task-id router (defensive)
Create a helper (new file) e.g.
src/node/services/tools/taskId.ts:isBashTaskId(taskId: string): booleantoBashTaskId(processId: string): string(e.g.bash:${processId})fromBashTaskId(taskId: string): string | nullDefensive checks:
assert(taskId.trim().length > 0)bash:ids early with explicit errors.2.2 Implement
task(kind=bash)insrc/node/services/tools/task.tsDispatch based on args shape:
config.runtime,config.backgroundProcessManager,config.workspaceId.run_in_background=true:backgroundProcessManager.spawn(runtime, workspaceId, script, { cwd, env, displayName, timeoutSecs }).status:"running",taskId: toBashTaskId(processId).run_in_background=false:BashToolResult→TaskToolCompletedResult.2.3 De-duplicate bash execution logic
Refactor
src/node/services/tools/bash.tsso that foreground execution is callable from both tools:executeBashForeground(config, args, ctx): Promise<BashToolResult>.createBashTool()remains as thin wrapper.createTaskTool()uses the helper for bash foreground tasks.This avoids re-implementing:
2.4 Extend
task_awaitto support bash tasksIn
src/node/services/tools/task_await.ts:args.task_idsprovided: use them.taskService.listActiveDescendantAgentTaskIds(workspaceId)backgroundProcessManager.list()filtered to descendant workspaces +status === "running"waitForAgentReport()behavior.workspaceIdor a descendant agent workspacebackgroundProcessManager.getOutput(processId, filter, filter_exclude, timeout_secs, abortSignal, workspaceId)status:"running"withoutputwhen runningstatus:"completed"withreportMarkdownandexitCodewhen exited/killed/failedinterrupted→status:"running"withnote:"interrupted"(or add a new status if you choose to extend the schema)2.5 Extend
task_listto merge bash processesIn
src/node/services/tools/task_list.ts:backgroundProcessManager.list()and include any process whoseworkspaceIdis in that set.TaskListToolTaskSchemaentry.workspaceDepth(process.workspaceId) + 1(or1if owned by current workspace).2.6 Extend
task_terminateto terminate bash tasksIn
src/node/services/tools/task_terminate.ts:terminateDescendantAgentTask()backgroundProcessManager.terminate(processId)Optional improvement:
BackgroundProcessManager.cleanup(workspaceId)already exists, but doing it explicitly makes the tool semantics deterministic.2.7 Tool wiring: ensure
task_*are init-wait wrappedBecause
task_*will now execute bash (runtime-dependent), updatesrc/common/utils/tools/tools.ts#getToolsForModel:task,task_await,task_list,task_terminatewithwrapWithInitWait(...)(same asbashtoday).runtime.exec()/BackgroundProcessManagerbefore workspace init.ask_user_question,propose_plan,todo_*,status_setnon-runtime.3) Tool exposure + deprecation
3.1 Remove
bash_*from the actual model toolset (not just prompt)Today,
getAvailableTools()is used for system prompt + tool-instruction extraction, but the model still receives whatevergetToolsForModel()returns (unless atoolPolicydenies it).To truly eliminate overlap:
getAvailableTools(modelString, mode, options)(plus MCP tools). Best place:src/common/utils/tools/tools.ts#getToolsForModeljust before returning.getAvailableTools()(src/common/utils/tools/toolDefinitions.ts):bashbash_outputbash_background_listbash_background_terminatebash_*tools at all (or gate behind a temporary env/feature flag).3.2 Keep legacy
bash_*as wrappers (short-lived)To reduce risk, keep the tools around during rollout:
taskwithkind:"bash".”bash_output/list/terminateby delegating to the newtask_*internals.Why keep wrappers temporarily?
bash_*tool names.bash_*as bridgeable tools.4) Frontend/UI updates (tool message rendering)
Because the model will now call
task_*for bash operations, update task-tool renderers to handle the new union types:src/browser/components/tools/TaskToolCall.tsxTaskToolCall: render bash variant (script,display_name, background vs foreground)TaskAwaitResult: ifoutputpresent, render it (even whenstatus !== completed)TaskListItem: displaydisplay_name/scriptwhenagentType/titleabsentTaskStatusBadge: optionally style bash terminal states if you extend the enumsrc/browser/stories/App.task.stories.tsx+src/browser/stories/mockFactory.tstask_awaitoutput streaming.5) Tests + validation
5.1 Backend tool tests
Add/extend tests under
src/node/services/tools/:task.bash.test.ts(new):task_awaitreturns incremental output + runningtask_terminatekills ittask_listincludes bash processes for:5.2 UI rendering tests (lightweight)
If there are existing component tests, add minimal assertions for bash-variant rendering. Otherwise rely on storybook + typecheck.
5.3 Manual validation checklist
make typecheckbun test src/node/services/tools/task_await.test.tsbun test src/node/services/tools/task_list.test.ts(or add if missing)bun test src/node/services/tools/bash_output.test.ts(ensure wrappers still work if kept)6) Rollout plan
bash_*fromgetAvailableTools()so models stop seeing them.bash_*from PTC bridgeables if desired.Generated with
mux• Model:openai:gpt-5.2• Thinking:xhigh