pa draft #1242

Draft
vgokhale wants to merge 1 commit into main from pa-draft-gfx1250-sink-20260617

@vgokhale

Summary

  • enable the PA persistent decode path for 256/1024 block sizes
  • add sink-aware gfx1250 PA decode handling via aiter.pa_decode_bf16_asm

Testing

  • Not run, per request.

from aiter import dtypes
from aiter.ops.triton.gluon.pa_decode_gluon import get_recommended_splits
from aiter.ops.triton.unified_attention import unified_attention
from atom.config import get_current_atom_config

⚠️ [ruff] <F401> reported by reviewdog 🐶
atom.config.get_current_atom_config imported but unused

reviewdog suggestion error: GitHub comment range and suggestion line range must be same. L14-L14 v.s. L14-L15

else:
    device = q.device
    total_s, nhead, v_head_dim = output.shape
    softmax_scale = self.scale if self.scale is not None else 1.0 / (v_head_dim**0.5)

⚠️ [ruff] <F841> reported by reviewdog 🐶
Local variable softmax_scale is assigned to but never used

Suggested change
- softmax_scale = self.scale if self.scale is not None else 1.0 / (v_head_dim**0.5)
+ self.scale if self.scale is not None else 1.0 / (v_head_dim**0.5)
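
Note that the automated suggestion only drops the assignment, which leaves a bare expression statement with no effect. A possible alternative, sketched here under assumptions, is to keep the fallback in one small helper and pass its result to whatever consumes the scale. The helper name `resolve_softmax_scale` is hypothetical and not part of this PR, and whether the decode call accepts a scale argument is not shown in the excerpt above.

```python
import math
from typing import Optional


def resolve_softmax_scale(scale: Optional[float], head_dim: int) -> float:
    """Return the explicit scale, or fall back to 1 / sqrt(head_dim).

    Hypothetical helper: it mirrors the expression flagged by F841,
    with head_dim standing in for v_head_dim from output.shape.
    """
    if scale is not None:
        return scale
    return 1.0 / math.sqrt(head_dim)


# Illustrative values only; in the PR these would come from
# self.scale and output.shape.
default_scale = resolve_softmax_scale(None, 64)
explicit_scale = resolve_softmax_scale(0.5, 64)
```

If the scale turns out not to be needed on this code path, deleting the line outright would resolve F841 more cleanly than keeping the unused expression.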
