From: shaoyunl shaoyun.liu@amd.com
stable inclusion from stable-v5.10.84 commit 74aafe99efb68f15e50be9f7032c2168512f98a8 bugzilla: 186030 https://gitee.com/openeuler/kernel/issues/I4QV2F
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 2cf49e00d40d5132e3d067b5aa6d84791929ab15 ]
In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch already been called, the start_cpsch will not be called since there is no resume in this case. When reset been triggered again, driver should avoid to do uninitialization again.
Signed-off-by: shaoyunl shaoyun.liu@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Chen Jun chenjun102@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 352a32dc609b..2645ebc63a14 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1207,6 +1207,11 @@ static int stop_cpsch(struct device_queue_manager *dqm) bool hanging;
dqm_lock(dqm); + if (!dqm->sched_running) { + dqm_unlock(dqm); + return 0; + } + if (!dqm->is_hws_hang) unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0); hanging = dqm->is_hws_hang || dqm->is_resetting;