
feat(context): add context manager #2547

Open
lizradway wants to merge 1 commit into strands-agents:main from lizradway:context-manager

Conversation

@lizradway
Member

@lizradway lizradway commented Jun 1, 2026

Description

Implements the v1 contextManager facade as designed in strands-agents/docs#831.

Adds a contextManager parameter to AgentConfig that pre-composes the SDK's context management primitives into a single configuration surface. An internal ContextManager plugin composes sub-plugins (ContextCompression, ContextOffloader) that handle the actual behavior.

Architecture

ContextManager (internal plugin)
├── ContextCompression (sub-plugin)
│   ├── Proactive compression (BeforeModelCallEvent)
│   ├── Reactive overflow recovery (AfterModelCallEvent)
│   └── Sliding window enforcement (AfterInvocationEvent)
└── ContextOffloader (sub-plugin)
    ├── Tool result caching (AfterToolCallEvent)
    └── retrieve_offloaded_content tool

Sub-plugins work independently when used standalone. User-provided plugins with matching names take precedence over managed sub-plugins. When contextManager is set, ContextCompression takes priority over conversationManager and NullConversationManager is used (same pattern as other dedicated-param plugins like retryStrategy, sessionManager).
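
The precedence rule above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `NamedPlugin`, `managedFactories`, and `composeSubPlugins` are hypothetical names, and the offloader's plugin name is assumed (only `strands:context-compression` appears in the diff).

```typescript
// Simplified stand-in for the SDK plugin shape (assumption: plugins expose a `name`).
interface NamedPlugin {
  name: string
}

// Hypothetical factory table: one entry per managed sub-plugin.
const managedFactories: Array<() => NamedPlugin> = [
  () => ({ name: 'strands:context-compression' }),
  () => ({ name: 'strands:context-offloader' }), // name assumed for illustration
]

// Build the managed sub-plugins, skipping any whose name the user already registered.
function composeSubPlugins(userPlugins: NamedPlugin[] = []): NamedPlugin[] {
  const userNames = new Set(userPlugins.map((p) => p.name))
  return managedFactories.map((make) => make()).filter((p) => !userNames.has(p.name))
}

// A user-supplied compression plugin suppresses the managed one; the offloader is still added.
const composed = composeSubPlugins([{ name: 'strands:context-compression' }])
```

With no user plugins, both managed sub-plugins would be composed.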

What ships

  • contextManager parameter on AgentConfig — accepts "auto" or a config object
  • ContextCompression plugin — proactive/reactive compression with own reduction logic (truncate or summarize)
  • ContextOffloader — stays in vended-plugins/context-offloader/, composed internally by ContextManager
  • Message pinning (context-manager/compression/protection.ts) — pinMessageTool for agent-controlled pinning at runtime. Internal utilities (pinMessage, unpinMessage, isPinned, isProtected) not exported; programmatic pinning API deferred.
  • protectFirst — number of messages at the start of the conversation to protect from eviction
  • estimateInputTokens() utility — shared token estimation in src/context-manager/token-estimation.ts
  • <summary> XML tags — summarized messages are wrapped in <summary> tags so the model can distinguish framework-injected summaries from user content
  • conversationManager marked as pending deprecation — still works, JSDoc-tagged
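
To illustrate how protectFirst might interact with a truncating window, here is a minimal sketch. It assumes the simplest reading (keep the first protectFirst messages plus the most recent ones up to windowSize) and ignores pinning and tool-pair boundaries, which the real plugin handles. `ChatMessage` and `truncateWithProtection` are hypothetical names.

```typescript
// Simplified message shape for illustration only.
interface ChatMessage {
  role: 'user' | 'assistant'
  text: string
}

// Keep the first `protectFirst` messages and fill the rest of the window
// with the most recent messages. Returns a new array; the input is not mutated.
function truncateWithProtection(
  messages: ChatMessage[],
  windowSize: number,
  protectFirst = 0
): ChatMessage[] {
  if (messages.length <= windowSize) return messages.slice()
  const head = messages.slice(0, protectFirst)
  // If the protected head already fills the window, no recent messages are kept.
  const tailCount = Math.max(windowSize - head.length, 0)
  const tail = tailCount > 0 ? messages.slice(-tailCount) : []
  return [...head, ...tail]
}
```

For ten messages with windowSize 4 and protectFirst 2, this keeps messages 0, 1, 8, and 9.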

Public API Surface

New on AgentConfig:

  • contextManager?: ContextManagerParam

New exports:

  • pinMessageTool (agent-invokable tool)
  • ContextManagerParam (type)

All classes (ContextManager, ContextCompression, ContextOffloader) are internal. ContextOffloader remains accessible via the vended-plugins/context-offloader sub-path for backward compat.

Configuration model

Two semantics depending on whether strategy: 'auto' is present:

Override (strategy: 'auto') — starts with everything enabled, you override specific settings:

contextManager: "auto"                                              // everything with defaults
contextManager: { strategy: 'auto', compression: { windowSize: 60 } }  // auto, tweak compression
contextManager: { strategy: 'auto', offloader: { threshold: 5000 } }   // auto, tweak offloader

Additive (no strategy) — starts with nothing, you enable what you want:

contextManager: { compression: true }                               // only compression
contextManager: { compression: 'summarize' }                        // only summarize compression
contextManager: { offloader: true }                                 // only offloading
contextManager: { compression: true, offloader: true }              // both (same as "auto")
contextManager: {}                                                  // nothing enabled
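
The two semantics can be sketched as a small resolver. This is an assumption-laden illustration rather than the PR's implementation: the types are trimmed to the fields shown above, and `resolveFeatures` is a hypothetical name.

```typescript
type CompressionSetting = true | 'truncate' | 'summarize' | { windowSize?: number }
type OffloaderSetting = true | { threshold?: number }

interface ManagerConfig {
  strategy?: 'auto'
  compression?: CompressionSetting
  offloader?: OffloaderSetting
}

type ManagerParam = 'auto' | ManagerConfig

interface ResolvedFeatures {
  compression: CompressionSetting | undefined
  offloader: OffloaderSetting | undefined
}

function resolveFeatures(param: ManagerParam): ResolvedFeatures {
  // The string shorthand "auto" is treated as { strategy: 'auto' }.
  const config: ManagerConfig = param === 'auto' ? { strategy: 'auto' } : param
  if (config.strategy === 'auto') {
    // Override semantics: everything starts enabled; explicit settings replace the default.
    return { compression: config.compression ?? true, offloader: config.offloader ?? true }
  }
  // Additive semantics: only what the caller listed is enabled.
  return { compression: config.compression, offloader: config.offloader }
}
```

Under this reading, `{ strategy: 'auto', offloader: { threshold: 5000 } }` still enables compression, while `{ offloader: { threshold: 5000 } }` alone does not.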

Compression config (discriminated union on method)

compression: true                                       // defaults
compression: 'truncate'                                 // method shorthand
compression: 'summarize'                                // method shorthand
compression: { method: 'truncate', windowSize: 30 }     // full config
compression: { method: 'summarize', summaryRatio: 0.5 } // full config
compression: { protectFirst: 2 }                        // protect first 2 messages
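
One way to normalize these shorthand forms into a full config is sketched below. The type and function names are hypothetical; the defaults for method and windowSize come from this description, the summaryRatio default of 0.3 comes from the TSDoc in the diff, and the protectFirst default of 0 is an assumption.

```typescript
interface TruncateConfig {
  method?: 'truncate' // optional so that { protectFirst: 2 } falls back to the default method
  windowSize?: number
  protectFirst?: number
}

interface SummarizeConfig {
  method: 'summarize'
  summaryRatio?: number
  protectFirst?: number
}

type CompressionInput = true | 'truncate' | 'summarize' | TruncateConfig | SummarizeConfig

type NormalizedCompression =
  | { method: 'truncate'; windowSize: number; protectFirst: number }
  | { method: 'summarize'; summaryRatio: number; protectFirst: number }

// Expand `true` and the string shorthands into object form.
function toObjectForm(input: CompressionInput): TruncateConfig | SummarizeConfig {
  if (input === true || input === 'truncate') return {}
  if (input === 'summarize') return { method: 'summarize' }
  return input
}

function normalizeCompression(input: CompressionInput): NormalizedCompression {
  const config = toObjectForm(input)
  const protectFirst = config.protectFirst ?? 0 // assumed default: protect nothing
  if (config.method === 'summarize') {
    return { method: 'summarize', summaryRatio: config.summaryRatio ?? 0.3, protectFirst }
  }
  return { method: 'truncate', windowSize: config.windowSize ?? 40, protectFirst }
}
```

Because the result is discriminated on `method`, downstream code can switch on it and get the method-specific fields without casts.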

Defaults (when enabled via "auto")

| Parameter | Default |
| --- | --- |
| offloader threshold | 2500 tokens |
| offloader previewTokens | 1500 tokens |
| compression method | "truncate" |
| compression windowSize | 40 |
| compression proactive | true (threshold 0.7) |
| storage | InMemoryStorage |
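
As a rough illustration of how the offloader defaults could play out, the sketch below decides whether a tool result is offloaded and how much of it stays inline. It assumes a crude heuristic of about 4 characters per token, which is not the SDK's estimator; `roughTokenCount` and `previewToolResult` are hypothetical names.

```typescript
// Crude heuristic (assumption): roughly 4 characters per token.
const CHARS_PER_TOKEN = 4

function roughTokenCount(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

interface OffloadDecision {
  offloaded: boolean
  inline: string // what stays in the conversation
}

// Results above `threshold` tokens are offloaded and only a preview of
// about `previewTokens` tokens is kept inline.
function previewToolResult(result: string, threshold = 2500, previewTokens = 1500): OffloadDecision {
  if (roughTokenCount(result) <= threshold) {
    return { offloaded: false, inline: result }
  }
  return { offloaded: true, inline: result.slice(0, previewTokens * CHARS_PER_TOKEN) }
}
```

In the real plugin the full content goes to the configured storage and the agent can fetch it back through the retrieve_offloaded_content tool; that part is omitted here.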

Plugin registration

contextManager must be passed via the dedicated parameter — same pattern as conversationManager, retryStrategy, and sessionManager. No guards for misuse in plugins[] (consistent with other special-cased plugins).

Deprecation Plan

The following are marked as pending deprecation in v1 and will be removed in v2:

  • AgentConfig.conversationManager → contextManager: { compression: ... }
  • Agent._estimateInputTokens() → shared estimateInputTokens() utility
  • BeforeModelCallEvent.projectedInputTokens → future contextManager budget API
  • ConversationManager, SlidingWindowConversationManager, SummarizingConversationManager, NullConversationManager → ContextCompression plugin
  • vended-plugins/context-offloader/ sub-path → import from @strands-agents/sdk directly

Breaking Changes

None. All changes are additive. Existing behavior is unchanged when contextManager is not set.

Related Issues

Documentation PR

strands-agents/docs#831

Type of Change

New feature

Testing

  • Type check passes
  • Lint passes
  • All 2915 tests pass (106 test files)
  • 60 new unit tests covering: token estimation, compression plugin, truncate/summarize strategies, protection logic, context manager resolution

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions github-actions Bot added the size/l label Jun 1, 2026
@lizradway lizradway changed the title from "feat(context): add context manager class" to "feat(context): add context manager" Jun 2, 2026
@github-actions github-actions Bot added size/xl and removed size/l labels Jun 2, 2026
@lizradway lizradway temporarily deployed to manual-approval June 2, 2026 16:04 — with GitHub Actions Inactive
@lizradway lizradway marked this pull request as ready for review June 2, 2026 16:38
// Tool-pair partner protection: if adjacent message is protected and they form a pair
const msg = messages[index]!
const hasToolResult = msg.content.some((b) => b.type === 'toolResultBlock')
if (hasToolResult && index > 0 && index - 1 < protectFirst) return true
Contributor

Issue: This does not verify that messages[index - 1] actually contains a matching toolUseBlock. It only checks that the current message has a toolResult and the previous message is within protectFirst. If message at index - 1 happens to be a regular text message at the boundary of protectFirst, this incorrectly marks the current message as protected.

Suggestion: Add a check that validates the previous message contains a toolUseBlock with a matching toolUseId:

if (hasToolResult && index > 0 && index - 1 < protectFirst) {
  const prev = messages[index - 1]!
  const resultIds = new Set(
    msg.content.filter((b): b is ToolResultBlock => b.type === 'toolResultBlock').map((b) => b.toolUseId)
  )
  if (prev.content.some((b) => b.type === 'toolUseBlock' && resultIds.has((b as ToolUseBlock).toolUseId))) {
    return true
  }
}

In practice, the LLM API ordering (toolUse always precedes toolResult) may prevent this from manifesting as a user-visible bug, but the validation keeps the function correct regardless of message arrangement.
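
For reference, a self-contained version of the suggested check is sketched below, with simplified stand-in types (`Block`, `PairMessage`) rather than the SDK's own. It makes the edge case reproducible: a plain text message at the protectFirst boundary followed by a tool result no longer counts as a pair.

```typescript
// Simplified content blocks for illustration; the SDK's real types carry more fields.
type Block =
  | { type: 'textBlock'; text: string }
  | { type: 'toolUseBlock'; toolUseId: string }
  | { type: 'toolResultBlock'; toolUseId: string }

interface PairMessage {
  role: 'user' | 'assistant'
  content: Block[]
}

// True only when messages[index] carries a tool result whose matching tool use
// is in the immediately preceding message, and that message is in the protected range.
function isPartnerOfProtectedToolUse(messages: PairMessage[], index: number, protectFirst: number): boolean {
  if (index <= 0 || index - 1 >= protectFirst) return false
  const current = messages[index]
  const prev = messages[index - 1]
  if (!current || !prev) return false

  const resultIds = new Set<string>()
  for (const block of current.content) {
    if (block.type === 'toolResultBlock') resultIds.add(block.toolUseId)
  }
  return prev.content.some((block) => block.type === 'toolUseBlock' && resultIds.has(block.toolUseId))
}
```

Using the discriminated `type` field for narrowing avoids the `as ToolUseBlock` cast in the suggestion above.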

export type OffloaderConfig = {
/** Token threshold above which tool results are offloaded. Defaults to 2500. */
threshold?: number
/** Number of tokens to keep as an inline preview. Defaults to 500. */
Contributor

Issue: previewTokens default is documented as 500 here (and repeated on line 46), but the actual fallback on line 159 is ?? 1500. The PR description's table also states 1500.

Suggestion: Update the TSDoc to say "Defaults to 1500" to match the implementation.

if (hasToolResult && index > 0 && index - 1 < protectFirst) return true

const hasToolUse = msg.content.some((b) => b.type === 'toolUseBlock')
if (hasToolUse && index + 1 < messages.length && index + 1 < protectFirst) return true
Contributor

Issue: This condition is unreachable. We only arrive here when index >= protectFirst (line 123 already returned true for index < protectFirst). For index + 1 < protectFirst to be true, we'd need index < protectFirst - 1, which contradicts index >= protectFirst.

Suggestion: Remove this dead branch or rewrite the tool-pair partner logic. If the intent is "protect a toolUse whose partner toolResult is in the protected range", note that toolResult always comes after toolUse in message ordering, so the toolUse is always at a lower index — meaning it would already be protected by line 123.
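
The unreachability argument can also be checked by brute force. The sketch below is a hypothetical helper, not part of the PR: it searches small values for any (index, protectFirst) pair that satisfies both index >= protectFirst and index + 1 < protectFirst.

```typescript
// Search for a counterexample: a pair that passes the early-return guard
// (index >= protectFirst) and still satisfies index + 1 < protectFirst.
function findReachableCase(maxLength: number): { index: number; protectFirst: number } | undefined {
  for (let protectFirst = 0; protectFirst <= maxLength; protectFirst++) {
    // Starting at protectFirst mirrors the early return for protected indices.
    for (let index = protectFirst; index < maxLength; index++) {
      if (index + 1 < protectFirst) return { index, protectFirst }
    }
  }
  return undefined
}
```

No such pair exists, which is consistent with the algebra above: index >= protectFirst implies index + 1 > protectFirst.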

continue
}

const hasToolUse = msg.content.some((b) => b.type === 'toolUseBlock')
Contributor

Issue: The findValidTrimPoint function checks for toolUseBlock on user-role messages (lines 73-80), but toolUseBlocks only appear in assistant messages. Since line 63 already skips non-user messages, this branch is dead code.

The same pattern appears in adjustSplitForToolPairs in summarize.ts.

Suggestion: Remove the dead hasToolUse check or restructure the logic to correctly handle the trim boundary. The actual concern is: don't start the "kept" portion at a toolResult (user message) that is the result of a toolUse (assistant message) immediately before it — which is already handled by the toolResultBlock check on line 68.

* ```typescript
* // Config shorthand (most users)
* const agent = new Agent({ contextManager: "auto" })
*
Contributor

Issue: The TSDoc @example (lines 82-88) shows passing a ContextManager class instance to Agent({ contextManager: cm }), but ContextManagerParam is typed as ContextStrategyValue | ContextManagerConfig — it doesn't accept a ContextManager instance. This example would fail type-checking.

Suggestion: Either update ContextManagerParam to also accept ContextManager instances (if that's the intended "power user" path), or fix the example to show the config-object approach:

const agent = new Agent({ contextManager: { storage: new S3Storage("bucket") } })

continue
}

const hasToolUse = msg.content.some((b) => b.type === 'toolUseBlock')
Contributor

Issue: The adjustSplitForToolPairs function has the same dead code pattern as findValidTrimPoint in truncate.ts — checking for toolUseBlock on messages that have already passed the role !== 'user' skip (line 128-129 skips non-user messages, so the message at idx is always user-role, which never contains toolUseBlock).

Suggestion: Same as the comment on truncate.ts — consider removing the dead branch or documenting why it exists as defensive coding.

)
}
this._conversationManager = new NullConversationManager()
} else if (contextManagerPlugin) {
Contributor

Issue: When a non-stateful model has both contextManager and conversationManager set, the conversationManager is silently ignored (line 365-366 takes priority). This could confuse users who set both accidentally.

Suggestion: Consider logging a warning when both are provided, e.g.:

} else if (contextManagerPlugin) {
  if (config?.conversationManager) {
    logger.warn('contextManager takes priority over conversationManager — conversationManager will be ignored')
  }
  this._conversationManager = new NullConversationManager()
}

@github-actions
Contributor

github-actions Bot commented Jun 2, 2026

Assessment: Comment

Well-architected feature with clear separation of concerns between the facade (ContextManager), sub-plugins (ContextCompression, ContextOffloader), and strategy functions. The additive vs override configuration semantics are well thought out.

Review Categories
  • Documentation/Implementation mismatch: The previewTokens default is documented as 500 but implemented as 1500. The TSDoc example shows a usage pattern that doesn't match the type system.
  • Dead/unreachable code: The isProtected function has an unreachable branch, and findValidTrimPoint/adjustSplitForToolPairs have dead code checking for toolUseBlocks on user-role messages.
  • Correctness: Tool-pair validation in isProtected doesn't verify the adjacent message actually contains a matching toolUseBlock — could incorrectly protect messages in edge cases.
  • API review: This introduces a new contextManager primitive on AgentConfig with a significant public surface. Per the API Bar Raising guidelines, this scope likely warrants a needs-api-review label and designated reviewer if not already done.

The overall design aligns well with SDK tenets — particularly composability and "provide both low-level and high-level APIs".

@lizradway lizradway mentioned this pull request Jun 2, 2026
13 tasks
* Conversation manager for handling message history and context overflow.
* Defaults to SlidingWindowConversationManager with windowSize of 40.
*
* @remarks Pending deprecation — use `contextManager` instead. The `contextManager` parameter
Contributor

Are we actually planning to deprecate this? Do we need this warning if we're only deprecating in 2.0?

Comment on lines +6 to +7
/** Positive: protect first N messages. Negative: protect last N messages. */
protectFirst?: number
Contributor

What if I want both first and last?

/** Ratio of messages to summarize (0.1–0.8). Defaults to 0.3. */
summaryRatio?: number
/** Minimum recent messages to preserve during summarization. Defaults to 10. */
preserveRecentMessages?: number
Contributor

protectLast would have more synergy with protectFirst. That said, I'd be in favor of exposing an object instead, but let's at least align the two for consistency.
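
To make the object idea concrete, here is one hypothetical shape. Nothing here is in the PR: `ProtectInput`, `normalizeProtect`, and `isInProtectedRange` are illustrative names, and the signed-number form follows the TSDoc on protectFirst (positive protects the first N, negative protects the last N).

```typescript
// Hypothetical input: the existing signed number, or an object naming both ends.
type ProtectInput = number | { first?: number; last?: number }

interface ProtectRange {
  first: number
  last: number
}

function normalizeProtect(input?: ProtectInput): ProtectRange {
  if (input === undefined) return { first: 0, last: 0 }
  if (typeof input === 'number') {
    // Positive protects the first N messages, negative protects the last N.
    return input >= 0 ? { first: input, last: 0 } : { first: 0, last: -input }
  }
  return { first: input.first ?? 0, last: input.last ?? 0 }
}

function isInProtectedRange(index: number, length: number, protect: ProtectRange): boolean {
  return index < protect.first || index >= length - protect.last
}
```

An object form like this would answer the "both first and last" question without overloading the sign of a single number.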


export type SummarizeCompressionConfig = SharedCompressionOptions & {
method: 'summarize'
/** Ratio of messages to summarize (0.1–0.8). Defaults to 0.3. */
Contributor

This is an unclear description to me. What does ratio of messages mean?
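
One plausible reading, offered as an assumption rather than a statement of what the PR does: the ratio is the fraction of the current message count, taken from the oldest end, that is replaced by a summary, clamped to the documented 0.1–0.8 range and limited so that preserveRecentMessages recent messages are never summarized. `countMessagesToSummarize` is a hypothetical helper.

```typescript
// How many of the oldest messages would be summarized under this reading.
function countMessagesToSummarize(total: number, summaryRatio = 0.3, preserveRecentMessages = 10): number {
  // Clamp the ratio to the documented 0.1 to 0.8 range.
  const ratio = Math.min(0.8, Math.max(0.1, summaryRatio))
  // Summarize at least one message when anything is summarized at all.
  const byRatio = Math.max(1, Math.floor(total * ratio))
  // Never reach into the most recent messages that must be preserved.
  const available = Math.max(0, total - preserveRecentMessages)
  return Math.min(byRatio, available)
}
```

With 100 messages and a ratio of 0.5, the oldest 50 would be summarized. With 12 messages and the defaults, only 2 are eligible because 10 recent messages are preserved.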

* Compression configuration.
* - `true`: enable with defaults (truncate, proactive at 0.7).
* - `'truncate'` / `'summarize'`: enable specific strategy with defaults.
* - `CompressionStrategy.Truncate(...)` / `CompressionStrategy.Summarize(...)`: full config.
Contributor

Do we actually expose these types?

storage?: Storage
/**
* Context offloader configuration.
* - `true`: enable with defaults (threshold=2500, previewTokens=500).
Contributor

previewTokens has a different default here than actual (1500)

Contributor

Also, are we purposefully drifting from the existing offloader's default of 1000?

* @param model - The model to use for token counting
* @returns Estimated token count, or undefined if estimation fails
*/
export async function estimateInputTokens(messages: Message[], model: Model): Promise<number | undefined> {
Contributor

In the current implementation we add the option to estimate systemPrompt and toolSpecs. Can we support this here to avoid regression? Then we can also forward Agent._estimateInputTokens() to use this function and avoid maintaining 2 copies.
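
A possible shape for that, sketched with a character-count heuristic in place of the model-backed counting the real function uses. The `EstimateOptions` fields mirror the names in the comment above; everything else (`EstimatableMessage`, `estimateTokensHeuristic`, the 4 characters per token ratio) is an assumption for illustration.

```typescript
// Simplified message shape; the SDK's Message carries structured content blocks.
interface EstimatableMessage {
  role: string
  text: string
}

interface EstimateOptions {
  systemPrompt?: string
  toolSpecs?: unknown[]
}

// Crude heuristic (assumption): roughly 4 characters per token.
const APPROX_CHARS_PER_TOKEN = 4

function estimateTokensHeuristic(messages: EstimatableMessage[], options: EstimateOptions = {}): number {
  let chars = 0
  for (const message of messages) chars += message.text.length
  // Optional inputs only contribute when the caller supplies them.
  if (options.systemPrompt !== undefined) chars += options.systemPrompt.length
  if (options.toolSpecs !== undefined) chars += JSON.stringify(options.toolSpecs).length
  return Math.ceil(chars / APPROX_CHARS_PER_TOKEN)
}
```

With an options argument like this, Agent._estimateInputTokens() could forward to the shared utility instead of keeping a second copy.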

*
* When set, takes priority over `conversationManager` — `NullConversationManager` is used.
*/
contextManager?: ContextManagerParam
Contributor

Can we rename this, maybe to ContextManagerInput, or inline the union? Param jumped out as confusing to me.

*/
export type CompressionConfig =
| true
| import('./compression/context-compression.js').CompressionMethod
Contributor

why are we not just importing top level? What does this buy us?

/** Strategy name. Only "auto" is supported currently. */
strategy?: ContextStrategyValue
/** Storage backend for cached tool results. Defaults to InMemoryStorage. */
storage?: Storage
Contributor

If it's just cached tool results should we narrow the parameter name?

Contributor

Or if it is also other things... Storage alone is too concise

if (config.compression) {
const userProvided = userPlugins?.some((p) => p.name === 'strands:context-compression')
if (!userProvided) {
let compressionConfig: import('./compression/context-compression.js').CompressionOptions | undefined
Contributor

same Q

const plugins: Plugin[] = []

if (config.compression) {
const userProvided = userPlugins?.some((p) => p.name === 'strands:context-compression')
Contributor

Can we do instanceof instead of this name matching pattern?
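
The trade-off between the two checks can be sketched with stand-in classes (`CompressionPluginSketch` and `CustomCompression` are hypothetical, not SDK classes). Name matching treats any plugin that reuses the name as a replacement, while instanceof only recognizes the managed class and its subclasses.

```typescript
interface BasePlugin {
  readonly name: string
}

// Stand-in for the managed compression plugin.
class CompressionPluginSketch implements BasePlugin {
  readonly name = 'strands:context-compression'
}

// An unrelated user class that deliberately reuses the managed plugin's name.
class CustomCompression implements BasePlugin {
  readonly name = 'strands:context-compression'
}

function providedByName(plugins: BasePlugin[]): boolean {
  return plugins.some((p) => p.name === 'strands:context-compression')
}

function providedByInstance(plugins: BasePlugin[]): boolean {
  return plugins.some((p) => p instanceof CompressionPluginSketch)
}
```

If user-written replacements are meant to take precedence over managed sub-plugins, name matching keeps that door open; instanceof would require them to extend the managed class.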

* A message is protected if it is pinned, within the protectFirst range,
* or is a tool-pair partner of a protected message.
*/
export function isProtected(messages: Message[], index: number, protectFirst?: number): boolean {
Contributor

Maybe a more descriptive function name

const SUMMARIZATION_PROMPT = `You are a conversation summarizer. Provide a concise summary of the conversation history.

Format Requirements:
- You MUST create a structured and concise summary in bullet-point format.
Contributor

confirming bullet-point format

* @param options - Summarization options
* @returns `true` if messages were summarized, `false` if not enough to summarize
*/
export async function summarize(messages: Message[], model: Model, options?: SummarizeOptions): Promise<boolean> {
Contributor

could expose sys prompt for the summarization

* @param message - The message to check
* @returns `true` if the message has `metadata.custom.pinned === true`
*/
export function isPinned(message: Message): boolean
Contributor

After offline discussion, it sounds like the "pin" verbiage is coming from tool result pairs, i.e. we don't say protect/ed directly because it can mean two messages.

I think this mismatch is confusing. I wouldn't mind just letting protected automatically include a pair.


3 participants