Commit dad945c

kwachows authored and jlawryno committed
accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW
Mark the context of a job that returned a HW context violation error as invalid, and queue the work that aborts jobs from the faulty context. Add an engine reset to the context abort thread handler so that it not only aborts currently executing jobs but also recovers the NPU from its invalid state.

Signed-off-by: Karol Wachowski <[email protected]>
Signed-off-by: Maciej Falkowski <[email protected]>
Reviewed-by: Jacek Lawrynowicz <[email protected]>
Signed-off-by: Jacek Lawrynowicz <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
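
The decision described in the message can be illustrated with a small userspace model. This is only a sketch, not driver code: every `model_*` name is hypothetical, the status enum values are made up, and the counters stand in for `queue_work()` and for the normal remove-and-destroy path. It shows that the first job reporting a context violation marks its context as faulty and requests the abort work, while later violations on the same context return early.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the firmware job status codes; values are made up. */
enum model_status {
	MODEL_STATUS_SUCCESS = 0,
	MODEL_STATUS_CONTEXT_VIOLATION_HW = 1,
};

/* Models the per-context flag the commit reuses (file_priv->has_mmu_faults). */
struct model_ctx {
	bool has_faults;
};

/* Counters replace real side effects so the behaviour can be observed. */
struct model_dev {
	int abort_work_requests; /* stands in for queue_work() calls */
	int jobs_destroyed;      /* stands in for the normal destroy path */
};

struct model_job {
	struct model_ctx *ctx;
};

/* Simplified model of the completion handler's new early-exit branch. */
int model_job_done(struct model_dev *dev, struct model_job *job,
		   enum model_status status)
{
	if (!job)
		return -ENOENT;

	if (status == MODEL_STATUS_CONTEXT_VIOLATION_HW) {
		/* Context already marked: the abort work was requested earlier. */
		if (job->ctx->has_faults)
			return 0;

		/* Mark the context and leave job cleanup to the abort handler. */
		job->ctx->has_faults = true;
		dev->abort_work_requests++;
		return 0;
	}

	/* Any other status takes the regular remove-and-destroy path. */
	dev->jobs_destroyed++;
	return 0;
}
```

In the model, two jobs from the same context that both report a violation produce a single abort request and no direct job destruction, which mirrors the deferral the commit message describes.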
1 parent ab680dc commit dad945c

1 file changed: +25 -0 lines changed

drivers/accel/ivpu/ivpu_job.c

Lines changed: 25 additions & 0 deletions
@@ -533,6 +533,26 @@ static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32
 
 	lockdep_assert_held(&vdev->submitted_jobs_lock);
 
+	job = xa_load(&vdev->submitted_jobs_xa, job_id);
+	if (!job)
+		return -ENOENT;
+
+	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW) {
+		guard(mutex)(&job->file_priv->lock);
+
+		if (job->file_priv->has_mmu_faults)
+			return 0;
+
+		/*
+		 * Mark context as faulty and defer destruction of the job to jobs abort thread
+		 * handler to synchronize between both faults and jobs returning context violation
+		 * status and ensure both are handled in the same way
+		 */
+		job->file_priv->has_mmu_faults = true;
+		queue_work(system_wq, &vdev->context_abort_work);
+		return 0;
+	}
+
 	job = ivpu_job_remove_from_submitted_jobs(vdev, job_id);
 	if (!job)
 		return -ENOENT;
@@ -946,6 +966,9 @@ void ivpu_context_abort_work_fn(struct work_struct *work)
 	unsigned long ctx_id;
 	unsigned long id;
 
+	if (vdev->fw->sched_mode == VPU_SCHEDULING_MODE_HW)
+		ivpu_jsm_reset_engine(vdev, 0);
+
 	mutex_lock(&vdev->context_list_lock);
 	xa_for_each(&vdev->context_xa, ctx_id, file_priv) {
 		if (!file_priv->has_mmu_faults || file_priv->aborted)
@@ -959,6 +982,8 @@ void ivpu_context_abort_work_fn(struct work_struct *work)
 
 	if (vdev->fw->sched_mode != VPU_SCHEDULING_MODE_HW)
 		return;
+
+	ivpu_jsm_hws_resume_engine(vdev, 0);
 	/*
 	 * In hardware scheduling mode NPU already has stopped processing jobs
 	 * and won't send us any further notifications, thus we have to free job related resources
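
The ordering that the last two hunks give the abort handler can be sketched as a small userspace model. It is an illustration under assumptions, not the driver's implementation: all `model_*` and `STEP_*` names are hypothetical, a recorded trace replaces the real `ivpu_jsm_reset_engine()` and `ivpu_jsm_hws_resume_engine()` calls, and because the diff does not show what the loop does with a selected context, the model simply marks it as aborted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the firmware scheduling modes. */
enum model_sched_mode { MODEL_SCHED_OS, MODEL_SCHED_HW };

/* Steps recorded by the model in place of real firmware calls. */
enum model_step { STEP_RESET_ENGINE, STEP_ABORT_CONTEXT, STEP_RESUME_ENGINE };

#define MODEL_MAX_STEPS 16

struct model_ctx {
	bool has_faults; /* mirrors file_priv->has_mmu_faults */
	bool aborted;    /* mirrors file_priv->aborted */
};

struct model_trace {
	enum model_step steps[MODEL_MAX_STEPS];
	size_t count;
};

static void model_record(struct model_trace *trace, enum model_step step)
{
	if (trace->count < MODEL_MAX_STEPS)
		trace->steps[trace->count++] = step;
}

/* Simplified model of the abort work handler after this commit. */
void model_context_abort_work(enum model_sched_mode mode,
			      struct model_ctx *ctxs, size_t n,
			      struct model_trace *trace)
{
	/* New in this commit: reset the engine first in HW scheduling mode. */
	if (mode == MODEL_SCHED_HW)
		model_record(trace, STEP_RESET_ENGINE);

	for (size_t i = 0; i < n; i++) {
		/* Same skip condition as the loop shown in the diff context. */
		if (!ctxs[i].has_faults || ctxs[i].aborted)
			continue;
		/* Assumption: the real per-context abort is outside the hunk. */
		ctxs[i].aborted = true;
		model_record(trace, STEP_ABORT_CONTEXT);
	}

	if (mode != MODEL_SCHED_HW)
		return;

	/* New in this commit: resume the engine before freeing job resources. */
	model_record(trace, STEP_RESUME_ENGINE);
	/* The driver then frees job-related resources; not modeled here. */
}
```

With one faulty, not-yet-aborted context in HW scheduling mode, the model records reset, abort, resume in that order; in the other mode it records only the abort step.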
