fix(chatd): continue tool execution on FinishReasonLength #22858
Open
Conversation
When the LLM hits MaxOutputTokens it returns FinishReasonLength instead of FinishReasonToolCalls. If complete tool calls were already emitted, the loop silently exited without executing them and the chat went to "waiting" with no error. The user had to manually prompt it to continue. Accept FinishReasonLength alongside FinishReasonToolCalls in the shouldContinue check so that emitted tool calls are always executed regardless of why the stream ended.
Force-pushed 6f47c09 to a802d50
… calls Verify that when the model returns `FinishReasonLength` but emits no tool calls (a text-only truncated response), the loop stops after a single step. This complements `TestRun_ToolCallsWithFinishReasonLength`, which covers the positive case.
Bug
When the LLM hits `MaxOutputTokens` (32k default), it returns `FinishReasonLength` instead of `FinishReasonToolCalls`. If complete tool calls had already been emitted before the stream was truncated, `shouldContinue` was `false`: the tools were never executed and the loop exited with `nil`. The chat went to `waiting` status with no error visible to the user, who had to manually prompt it to continue.

This became much more frequent after compaction re-entry was introduced (#22640), which adds summary messages to the context, pushing the model closer to its output token limit during tool-heavy work.
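The condition at issue can be sketched as follows. This is a minimal illustration, not the real chatd source: the `FinishReason` type, its constants, and the `shouldContinue` helper are stand-ins assumed from the PR description.

```go
package main

import "fmt"

// FinishReason mirrors the finish reasons named in the PR description;
// the actual constants in the chatd codebase may differ.
type FinishReason string

const (
	FinishReasonStop      FinishReason = "stop"
	FinishReasonToolCalls FinishReason = "tool_calls"
	FinishReasonLength    FinishReason = "length"
)

// shouldContinue reports whether the agent loop should execute the emitted
// tool calls and run another step. Before the fix it only accepted
// FinishReasonToolCalls, so complete tool calls emitted before a truncated
// stream (FinishReasonLength) were silently dropped.
func shouldContinue(reason FinishReason, toolCalls int) bool {
	if toolCalls == 0 {
		return false // nothing to execute; stop after this step
	}
	return reason == FinishReasonToolCalls || reason == FinishReasonLength
}

func main() {
	fmt.Println(shouldContinue(FinishReasonToolCalls, 1)) // true
	fmt.Println(shouldContinue(FinishReasonLength, 1))    // true only after the fix
	fmt.Println(shouldContinue(FinishReasonLength, 0))    // false: text-only truncation
	fmt.Println(shouldContinue(FinishReasonStop, 0))      // false: normal completion
}
```

The guard on `toolCalls == 0` is what keeps a text-only truncated response from looping: truncation alone is not a reason to continue, only truncation with pending tool calls is.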
Fix
Accept
FinishReasonLengthalongsideFinishReasonToolCallsin theshouldContinuecheck so that emitted tool calls are always executed regardless of why the stream ended.Test
`TestRun_ToolCallsWithFinishReasonLength` mirrors the existing `TestRun_MultiStepToolExecution` test but uses `FinishReasonLength` on step 0.

RED/GREEN confirmed: the test fails without the fix (`expected: 2, actual: 1`) and passes with it.
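The cases exercised by the tests above, including the reviewer's suggested negative case (truncated text with no tool calls), can be sketched as a table-driven check. Everything here is illustrative: `shouldContinue`, the constants, and the step counting stand in for the real loop, and the expected step counts follow the PR's `expected: 2, actual: 1` failure mode.

```go
package main

import "fmt"

type FinishReason string

const (
	FinishReasonToolCalls FinishReason = "tool_calls"
	FinishReasonLength    FinishReason = "length"
)

// shouldContinue is a stand-in for the loop condition the PR changes.
func shouldContinue(reason FinishReason, toolCalls int) bool {
	return toolCalls > 0 &&
		(reason == FinishReasonToolCalls || reason == FinishReasonLength)
}

func main() {
	cases := []struct {
		name      string
		reason    FinishReason
		toolCalls int
		wantSteps int // 2 if the emitted tools run and a second step follows, 1 otherwise
	}{
		{"tool calls, normal finish", FinishReasonToolCalls, 1, 2},
		{"tool calls, truncated stream", FinishReasonLength, 1, 2}, // got 1 before the fix
		{"text-only truncated response", FinishReasonLength, 0, 1}, // negative case: stop after one step
	}
	for _, c := range cases {
		steps := 1
		if shouldContinue(c.reason, c.toolCalls) {
			steps = 2 // tools execute, loop re-enters for one more step
		}
		fmt.Printf("%s: steps=%d want=%d\n", c.name, steps, c.wantSteps)
	}
}
```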