Skip to content

Commit 1b12f8d

Browse files
mwajdeczgregkh
authored andcommitted
drm/xe: Process deferred GGTT node removals on device unwind
[ Upstream commit af2b588 ] While we are indirectly draining our dedicated workqueue ggtt->wq that we use to complete asynchronous removal of some GGTT nodes, this happends as part of the managed-drm unwinding (ggtt_fini_early), which could be later then manage-device unwinding, where we could already unmap our MMIO/GMS mapping (mmio_fini). This was recently observed during unsuccessful VF initialization: [ ] xe 0000:00:02.1: probe with driver xe failed with error -62 [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747340 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747540 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747240 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747040 tiles_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e746840 mmio_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747f40 xe_bo_pinned_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e746b40 devm_drm_dev_init_release (16 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] drmres release begin [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef81640 __fini_relay (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80d40 guc_ct_fini (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80040 __drmm_mutex_release (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80140 ggtt_fini_early (8 bytes) and this was leading to: [ ] BUG: unable to handle page fault for address: ffffc900058162a0 [ ] #PF: supervisor write access in kernel mode [ ] #PF: error_code(0x0002) - not-present page [ ] Oops: Oops: 0002 [#1] SMP NOPTI [ ] Tainted: [W]=WARN [ ] Workqueue: xe-ggtt-wq ggtt_node_remove_work_func [xe] [ ] RIP: 0010:xe_ggtt_set_pte+0x6d/0x350 [xe] [ ] Call Trace: [ ] <TASK> [ ] xe_ggtt_clear+0xb0/0x270 [xe] [ ] ggtt_node_remove+0xbb/0x120 [xe] [ ] ggtt_node_remove_work_func+0x30/0x50 [xe] [ ] process_one_work+0x22b/0x6f0 [ ] worker_thread+0x1e8/0x3d Add managed-device action that will explicitly drain the workqueue with all pending node removals prior to releasing MMIO/GSM mapping. Fixes: 919bb54 ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node") Signed-off-by: Michal Wajdeczko <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit 89d2835c3680ab1938e22ad81b1c9f8c686bd391) Signed-off-by: Thomas Hellström <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
1 parent 3549ad8 commit 1b12f8d

File tree

1 file changed

+11
-0
lines changed

1 file changed

+11
-0
lines changed

drivers/gpu/drm/xe/xe_ggtt.c

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -198,6 +198,13 @@ static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
198198
.ggtt_set_pte = xe_ggtt_set_pte_and_flush,
199199
};
200200

201+
static void dev_fini_ggtt(void *arg)
202+
{
203+
struct xe_ggtt *ggtt = arg;
204+
205+
drain_workqueue(ggtt->wq);
206+
}
207+
201208
/**
202209
* xe_ggtt_init_early - Early GGTT initialization
203210
* @ggtt: the &xe_ggtt to be initialized
@@ -254,6 +261,10 @@ int xe_ggtt_init_early(struct xe_ggtt *ggtt)
254261
if (err)
255262
return err;
256263

264+
err = devm_add_action_or_reset(xe->drm.dev, dev_fini_ggtt, ggtt);
265+
if (err)
266+
return err;
267+
257268
if (IS_SRIOV_VF(xe)) {
258269
err = xe_gt_sriov_vf_prepare_ggtt(xe_tile_get_gt(ggtt->tile, 0));
259270
if (err)

0 commit comments

Comments
 (0)