Sync upstream fixes #1433

Merged: maxzhen merged 4 commits into amd:main from houlz0507:061926_1 on Jun 20, 2026
@houlz0507

Contributor

No description provided.

aie2_populate_range() and amdxdna_umap_release() access a saved VMA
pointer that may have already been freed, leading to a potential
use-after-free.

Remove the VMA accesses from these functions to avoid the race.

Fixes: e486147c912f ("accel/amdxdna: Add BO import and export")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>

amdxdna_umap_release() calls the blocking mmu_interval_notifier_remove()
before removing the object from abo->mem.umap_list. If
aie2_populate_range() runs concurrently, it may obtain a reference to an
amdxdna_umap that is being released, leading to a potential use-after-free.

Use refcount_inc_not_zero() in aie2_populate_range() when acquiring a
reference. If the reference count has already dropped to zero, release
is in progress and the entry is skipped.

Fixes: e486147c912f ("accel/amdxdna: Add BO import and export")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
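
The "take a reference only if the count is still nonzero" idea above can be sketched in userspace C11. This is a minimal illustration, not the driver code: `struct umap_entry`, `umap_get_unless_zero()`, `umap_put()` and `populate_live_entries()` are hypothetical names, and C11 atomics stand in for the kernel's `refcount_t` and `refcount_inc_not_zero()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdxdna_umap; only the refcount is modeled. */
struct umap_entry {
	atomic_uint refcnt;
};

/*
 * Userspace analogue of refcount_inc_not_zero(): take a reference only
 * while the count is still nonzero. A zero count means the last reference
 * is gone and release is in progress, so the entry must not be revived.
 */
static bool umap_get_unless_zero(struct umap_entry *e)
{
	unsigned int old = atomic_load(&e->refcnt);

	while (old != 0) {
		/* On failure the CAS reloads 'old', and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool umap_put(struct umap_entry *e)
{
	unsigned int prev = atomic_fetch_sub(&e->refcnt, 1);

	assert(prev != 0); /* catch underflow in this sketch */
	return prev == 1;
}

/* Walk the entries and count those that could be safely pinned. */
static int populate_live_entries(struct umap_entry *entries, int n)
{
	int populated = 0;

	for (int i = 0; i < n; i++) {
		if (!umap_get_unless_zero(&entries[i]))
			continue; /* release in progress: skip this entry */
		populated++; /* the real code would populate the range here */
		umap_put(&entries[i]);
	}
	return populated;
}
```

With entries whose counts are 1, 0 and 2, the walk touches only the first and third; the zero-count entry is never dereferenced beyond its counter.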

…failure

amdxdna_gem_obj_open() increments open_ref before attempting to set up
the DMA address mapping. When amdxdna_dma_map_bo() fails, the function
returns immediately without rolling back the changes made on the first
open (the open_ref == 1 path): the open_ref increment and the abo->client
assignment.

Fix it by decrementing open_ref and clearing abo->client on the error path.

Fixes: ece3e8980907 ("accel/amdxdna: Allow forcing IOVA-based DMA via module parameter")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
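
The rollback described above can be sketched as a small standalone C model. Everything here is an assumption made for illustration: `struct bo`, `struct client`, `dma_map_bo()` and `bo_open()` are hypothetical stand-ins for the driver's types and for amdxdna_gem_obj_open() / amdxdna_dma_map_bo(), the `fail_map` flag exists only to inject a failure, and locking is omitted.

```c
#include <errno.h>
#include <stddef.h>

struct client {
	int id;
};

/* Hypothetical stand-in for the buffer object state touched on open. */
struct bo {
	int open_ref;          /* number of opens */
	struct client *client; /* owner, set on the first open */
	int fail_map;          /* test knob: make the mapping step fail */
};

/* Stand-in for the DMA mapping step; fails when asked to. */
static int dma_map_bo(struct bo *bo)
{
	return bo->fail_map ? -ENOMEM : 0;
}

static int bo_open(struct bo *bo, struct client *client)
{
	int ret;

	bo->open_ref++;
	if (bo->open_ref > 1)
		return 0; /* not the first open: nothing else to set up */

	bo->client = client;

	ret = dma_map_bo(bo);
	if (ret) {
		/* Undo both first-open changes so a later open starts clean. */
		bo->open_ref--;
		bo->client = NULL;
		return ret;
	}
	return 0;
}
```

Without the two rollback lines, a failed first open would leave open_ref at 1, so the next open would skip the first-open path entirely and never retry the mapping.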

Both amdxdna_hwctx_sync_debug_bo() and amdxdna_drm_config_hwctx_ioctl()
hold xdna->dev_lock while invoking backend operations. If the hardware
hangs, aie2_cmd_wait() blocks waiting for a firmware response. When the
DRM scheduler timeout expires, aie2_sched_job_timedout() is invoked to
reset the hardware. However, the timeout handler also attempts to acquire
dev_lock, resulting in a deadlock.

Avoid this by releasing dev_lock before waiting for the firmware
response and reacquiring it after the wait completes. This allows the
timeout handler to proceed with device recovery when a debug BO command
times out.

Fixes: 7ea046838021 ("accel/amdxdna: Support firmware debug buffer")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
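
The lock ordering problem above can be shown with a single-threaded toy model in C. This is only a sketch under loud assumptions: a boolean flag stands in for xdna->dev_lock, the timeout handler is called synchronously from inside the wait instead of running on another thread, and a failed trylock stands in for "would block forever". All names (`struct dev`, `timeout_handler()`, `cmd_wait_finishes()`, `sync_debug_bo()`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device: a non-recursive lock flag plus hang/recovery state. */
struct dev {
	bool dev_locked;
	bool hw_hung;
	bool recovered;
};

static bool dev_trylock(struct dev *d)
{
	if (d->dev_locked)
		return false;
	d->dev_locked = true;
	return true;
}

static void dev_unlock(struct dev *d)
{
	assert(d->dev_locked);
	d->dev_locked = false;
}

/* Models the scheduler timeout handler: it needs the lock to reset the HW. */
static void timeout_handler(struct dev *d)
{
	if (!dev_trylock(d))
		return; /* a real mutex would block here: the deadlock */
	d->hw_hung = false;
	d->recovered = true;
	dev_unlock(d);
}

/* Models the firmware wait; on a hang the timeout fires during the wait. */
static bool cmd_wait_finishes(struct dev *d)
{
	if (d->hw_hung)
		timeout_handler(d);
	return !d->hw_hung;
}

/* Returns true if the wait could finish, false if it would be stuck. */
static bool sync_debug_bo(struct dev *d, bool drop_lock_while_waiting)
{
	bool finished;
	bool ok = dev_trylock(d);

	assert(ok);
	(void)ok;
	/* ... submit the command while holding the lock ... */
	if (drop_lock_while_waiting)
		dev_unlock(d);

	finished = cmd_wait_finishes(d);

	if (drop_lock_while_waiting) {
		ok = dev_trylock(d); /* reacquire after the wait */
		assert(ok);
	}
	dev_unlock(d);
	return finished;
}
```

On a hung device, `sync_debug_bo(&d, false)` reports a stuck wait because the handler cannot take the lock, while `sync_debug_bo(&d, true)` lets the handler recover the device and the wait ends.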

@houlz0507

Contributor Author

retest this please

@maxzhen merged commit 9c78bce into amd:main on Jun 20, 2026
1 check passed