mailweb.openeuler.org
Kernel

kernel@openeuler.org

  • 48 participants
  • 18262 discussions
[PATCH OLK-6.6] soc: qcom: pdr: Fix the potential deadlock
by Qi Xi 17 Apr '25

From: Saranya R <quic_sarar(a)quicinc.com>

mainline inclusion
from mainline-v6.14
commit 2eeb03ad9f42dfece63051be2400af487ddb96d2
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBZH8C
CVE: CVE-2025-22014

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

When some client process A call pdr_add_lookup() to add the look up for
the service and does schedule locator work, later a process B got a new
server packet indicating locator is up and call pdr_locator_new_server()
which eventually sets pdr->locator_init_complete to true which process A
sees and takes list lock and queries domain list but it will timeout due
to deadlock as the response will queued to the same qmi->wq and it is
ordered workqueue and process B is not able to complete new server
request work due to deadlock on list lock.

Fix it by removing the unnecessary list iteration as the list iteration
is already being done inside locator work, so avoid it here and just
call schedule_work() here.

       Process A                        Process B

                                     process_scheduled_works()
pdr_add_lookup()                      qmi_data_ready_work()
 process_scheduled_works()             pdr_locator_new_server()
                                         pdr->locator_init_complete=true;
   pdr_locator_work()
    mutex_lock(&pdr->list_lock);

     pdr_locate_service()                  mutex_lock(&pdr->list_lock);

      pdr_get_domain_list()
       pr_err("PDR: %s get domain list txn wait failed: %d\n",
              req->service_name, ret);

Timeout error log due to deadlock:

"
 PDR: tms/servreg get domain list txn wait failed: -110
 PDR: service lookup for msm/adsp/sensor_pd:tms/servreg failed: -110
"

Thanks to Bjorn and Johan for letting me know that this commit also fixes
an audio regression when using the in-kernel pd-mapper as that makes it
easier to hit this race. [1]

Link: https://lore.kernel.org/lkml/Zqet8iInnDhnxkT9@hovoldconsulting.com/ # [1]
Fixes: fbe639b44a82 ("soc: qcom: Introduce Protection Domain Restart helpers")
CC: stable(a)vger.kernel.org
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
Tested-by: Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
Tested-by: Johan Hovold <johan+linaro(a)kernel.org>
Signed-off-by: Saranya R <quic_sarar(a)quicinc.com>
Co-developed-by: Mukesh Ojha <mukesh.ojha(a)oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha(a)oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250212163720.1577876-1-mukesh.ojha@oss.qualcomm…
Signed-off-by: Bjorn Andersson <andersson(a)kernel.org>
Signed-off-by: Qi Xi <xiqi2(a)huawei.com>
---
 drivers/soc/qcom/pdr_interface.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/soc/qcom/pdr_interface.c b/drivers/soc/qcom/pdr_interface.c
index c7cd4daa10b0..f83491a7510e 100644
--- a/drivers/soc/qcom/pdr_interface.c
+++ b/drivers/soc/qcom/pdr_interface.c
@@ -74,7 +74,6 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
 {
 	struct pdr_handle *pdr = container_of(qmi, struct pdr_handle,
 					      locator_hdl);
-	struct pdr_service *pds;

 	mutex_lock(&pdr->lock);
 	/* Create a local client port for QMI communication */
@@ -86,12 +85,7 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
 	mutex_unlock(&pdr->lock);

 	/* Service pending lookup requests */
-	mutex_lock(&pdr->list_lock);
-	list_for_each_entry(pds, &pdr->lookups, node) {
-		if (pds->need_locator_lookup)
-			schedule_work(&pdr->locator_work);
-	}
-	mutex_unlock(&pdr->list_lock);
+	schedule_work(&pdr->locator_work);

 	return 0;
 }
-- 
2.33.0
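Editor's sketch (not part of the patch): the removed loop called schedule_work() once per flagged lookup on the same work item. The kernel documents that schedule_work() returns false when the work is already queued, so repeated calls collapse into a single pending run. The user-space model below illustrates only that property; `toy_work`, `toy_schedule_work()`, `toy_run_pending()` and `schedule_per_lookup()` are hypothetical names, and nothing here models the real QMI workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel work item: only the "pending" bit is modeled. */
struct toy_work {
	bool pending;
	int runs;
};

/*
 * Mirrors the documented schedule_work() contract: returns false if the
 * work was already pending, true if it was newly queued.
 */
static bool toy_schedule_work(struct toy_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Worker side: run the handler once and clear the pending bit. */
static void toy_run_pending(struct toy_work *w)
{
	if (w->pending) {
		w->pending = false;
		w->runs++;
	}
}

/* Old shape: walk the lookups and schedule once per flagged entry. */
static int schedule_per_lookup(struct toy_work *w, const bool *need, size_t n)
{
	int newly_queued = 0;

	for (size_t i = 0; i < n; i++)
		if (need[i] && toy_schedule_work(w))
			newly_queued++;
	return newly_queued;
}
```

With two flagged lookups, `schedule_per_lookup()` queues the work once, the same as one direct `toy_schedule_work()` call. The one behavioral difference is that the unconditional call also queues the work when no lookup is flagged; per the commit message, the locator work does its own list iteration.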
[PATCH OLK-5.10] soc: qcom: pdr: Fix the potential deadlock
by Qi Xi 17 Apr '25

From: Saranya R <quic_sarar(a)quicinc.com>

mainline inclusion
from mainline-v6.14
commit 2eeb03ad9f42dfece63051be2400af487ddb96d2
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBZH8C
CVE: CVE-2025-22014

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

When some client process A call pdr_add_lookup() to add the look up for
the service and does schedule locator work, later a process B got a new
server packet indicating locator is up and call pdr_locator_new_server()
which eventually sets pdr->locator_init_complete to true which process A
sees and takes list lock and queries domain list but it will timeout due
to deadlock as the response will queued to the same qmi->wq and it is
ordered workqueue and process B is not able to complete new server
request work due to deadlock on list lock.

Fix it by removing the unnecessary list iteration as the list iteration
is already being done inside locator work, so avoid it here and just
call schedule_work() here.

       Process A                        Process B

                                     process_scheduled_works()
pdr_add_lookup()                      qmi_data_ready_work()
 process_scheduled_works()             pdr_locator_new_server()
                                         pdr->locator_init_complete=true;
   pdr_locator_work()
    mutex_lock(&pdr->list_lock);

     pdr_locate_service()                  mutex_lock(&pdr->list_lock);

      pdr_get_domain_list()
       pr_err("PDR: %s get domain list txn wait failed: %d\n",
              req->service_name, ret);

Timeout error log due to deadlock:

"
 PDR: tms/servreg get domain list txn wait failed: -110
 PDR: service lookup for msm/adsp/sensor_pd:tms/servreg failed: -110
"

Thanks to Bjorn and Johan for letting me know that this commit also fixes
an audio regression when using the in-kernel pd-mapper as that makes it
easier to hit this race. [1]

Link: https://lore.kernel.org/lkml/Zqet8iInnDhnxkT9@hovoldconsulting.com/ # [1]
Fixes: fbe639b44a82 ("soc: qcom: Introduce Protection Domain Restart helpers")
CC: stable(a)vger.kernel.org
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
Tested-by: Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
Tested-by: Johan Hovold <johan+linaro(a)kernel.org>
Signed-off-by: Saranya R <quic_sarar(a)quicinc.com>
Co-developed-by: Mukesh Ojha <mukesh.ojha(a)oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha(a)oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250212163720.1577876-1-mukesh.ojha@oss.qualcomm…
Signed-off-by: Bjorn Andersson <andersson(a)kernel.org>
Signed-off-by: Qi Xi <xiqi2(a)huawei.com>
---
 drivers/soc/qcom/pdr_interface.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/soc/qcom/pdr_interface.c b/drivers/soc/qcom/pdr_interface.c
index 373725b6d544..d2e10d13bc7a 100644
--- a/drivers/soc/qcom/pdr_interface.c
+++ b/drivers/soc/qcom/pdr_interface.c
@@ -74,7 +74,6 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
 {
 	struct pdr_handle *pdr = container_of(qmi, struct pdr_handle,
 					      locator_hdl);
-	struct pdr_service *pds;

 	mutex_lock(&pdr->lock);
 	/* Create a local client port for QMI communication */
@@ -86,12 +85,7 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
 	mutex_unlock(&pdr->lock);

 	/* Service pending lookup requests */
-	mutex_lock(&pdr->list_lock);
-	list_for_each_entry(pds, &pdr->lookups, node) {
-		if (pds->need_locator_lookup)
-			schedule_work(&pdr->locator_work);
-	}
-	mutex_unlock(&pdr->list_lock);
+	schedule_work(&pdr->locator_work);

 	return 0;
 }
-- 
2.33.0
[openeuler:OLK-5.10 2864/2864] kernel/sched/fair.c:4499:43: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
by kernel test robot 17 Apr '25

tree:   https://gitee.com/openeuler/kernel.git OLK-5.10
head:   90bc43e348ee5b80304a53353ee95bfae19e7bf9
commit: 433c0b72564239cf3086f563d5ca32a10e4ffd3f [2864/2864] sched/fair: Count the number of tasks marked as steal_task on cfs_rq
config: arm64-randconfig-004-20250417 (https://download.01.org/0day-ci/archive/20250417/202504171909.aMEHPXYz-lkp@…)
compiler: aarch64-linux-gcc (GCC) 9.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250417/202504171909.aMEHPXYz-lkp@…)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202504171909.aMEHPXYz-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/sched/fair.c: In function 'group_steal_enabled':
   kernel/sched/fair.c:4483:31: error: implicit declaration of function 'is_tg_steal' [-Werror=implicit-function-declaration]
    4483 | return group_steal_used() && is_tg_steal(steal_task);
         |                              ^~~~~~~~~~~
   kernel/sched/fair.c: In function 'overload_clear':
>> kernel/sched/fair.c:4499:43: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
    4499 | (rq->cfs.h_nr_running < 2 || rq->cfs.steal_h_nr_running == 0))
         |                                      ^~~~~~~~~~~~~~~~~~
         |                                      idle_h_nr_running
   kernel/sched/fair.c:4489:16: warning: variable 'time' set but not used [-Wunused-but-set-variable]
    4489 | unsigned long time;
         |               ^~~~
   kernel/sched/fair.c: In function 'overload_set':
   kernel/sched/fair.c:4525:36: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
    4525 | if (group_steal_used() && rq->cfs.steal_h_nr_running < 1)
         |                                   ^~~~~~~~~~~~~~~~~~
         |                                   idle_h_nr_running
   kernel/sched/fair.c:4517:16: warning: variable 'time' set but not used [-Wunused-but-set-variable]
    4517 | unsigned long time;
         |               ^~~~
   kernel/sched/fair.c: At top level:
   kernel/sched/fair.c:6013:6: warning: no previous prototype for 'init_cfs_bandwidth' [-Wmissing-prototypes]
    6013 | void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
         |      ^~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c: In function 'enqueue_task_fair':
   kernel/sched/fair.c:6670:12: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
    6670 | cfs_rq->steal_h_nr_running++;
         |         ^~~~~~~~~~~~~~~~~~
         |         idle_h_nr_running
   kernel/sched/fair.c:6694:12: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
    6694 | cfs_rq->steal_h_nr_running++;
         |         ^~~~~~~~~~~~~~~~~~
         |         idle_h_nr_running
   kernel/sched/fair.c: In function 'dequeue_task_fair':
   kernel/sched/fair.c:6795:12: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
    6795 | cfs_rq->steal_h_nr_running--;
         |         ^~~~~~~~~~~~~~~~~~
         |         idle_h_nr_running
   kernel/sched/fair.c:6831:12: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
    6831 | cfs_rq->steal_h_nr_running--;
         |         ^~~~~~~~~~~~~~~~~~
         |         idle_h_nr_running
   kernel/sched/fair.c: In function 'select_task_rq_fair':
   kernel/sched/fair.c:8191:16: warning: variable 'time' set but not used [-Wunused-but-set-variable]
    8191 | unsigned long time;
         |               ^~~~
   kernel/sched/fair.c: In function 'pick_next_task_fair':
   kernel/sched/fair.c:9185:16: warning: variable 'time' set but not used [-Wunused-but-set-variable]
    9185 | unsigned long time;
         |               ^~~~
   kernel/sched/fair.c: In function 'can_migrate_task_llc':
>> kernel/sched/fair.c:9944:43: error: dereferencing pointer to incomplete type 'struct task_group'
    9944 | if (group_steal_used() && !is_tg_steal(tg->steal_task))
         |                                          ^~
   kernel/sched/fair.c: In function 'steal_from':
   kernel/sched/fair.c:13229:29: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
   13229 | if (tg_used && src_rq->cfs.steal_h_nr_running < 1)
         |                            ^~~~~~~~~~~~~~~~~~
         |                            idle_h_nr_running
   kernel/sched/fair.c:13241:30: error: 'struct cfs_rq' has no member named 'steal_h_nr_running'; did you mean 'idle_h_nr_running'?
   13241 | (tg_used && src_rq->cfs.steal_h_nr_running < 1))
         |                         ^~~~~~~~~~~~~~~~~~
         |                         idle_h_nr_running
   kernel/sched/fair.c: At top level:
   kernel/sched/fair.c:13411:6: warning: no previous prototype for 'task_vruntime_update' [-Wmissing-prototypes]
   13411 | void task_vruntime_update(struct rq *rq, struct task_struct *p, bool in_fi)
         |      ^~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:13961:6: warning: no previous prototype for 'free_fair_sched_group' [-Wmissing-prototypes]
   13961 | void free_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:13963:5: warning: no previous prototype for 'alloc_fair_sched_group' [-Wmissing-prototypes]
   13963 | int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
         |     ^~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:13968:6: warning: no previous prototype for 'online_fair_sched_group' [-Wmissing-prototypes]
   13968 | void online_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:13970:6: warning: no previous prototype for 'unregister_fair_sched_group' [-Wmissing-prototypes]
   13970 | void unregister_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +4499 kernel/sched/fair.c

  4480	
  4481	static inline bool group_steal_enabled(int steal_task)
  4482	{
> 4483		return group_steal_used() && is_tg_steal(steal_task);
  4484	}
  4485	
  4486	static void overload_clear(struct rq *rq)
  4487	{
  4488		struct sparsemask *overload_cpus;
  4489		unsigned long time;
  4490		bool need_clear = false;
  4491	
  4492		if (!steal_enabled())
  4493			return;
  4494	
  4495		if (!group_steal_used() && rq->cfs.h_nr_running >= 2)
  4496			return;
  4497	
  4498		if (group_steal_used() &&
> 4499		    (rq->cfs.h_nr_running < 2 || rq->cfs.steal_h_nr_running == 0))
  4500			need_clear = true;
  4501	
  4502		if (!need_clear)
  4503			return;
  4504	
  4505		time = schedstat_start_time();
  4506		rcu_read_lock();
  4507		overload_cpus = rcu_dereference(rq->cfs_overload_cpus);
  4508		if (overload_cpus)
  4509			sparsemask_clear_elem(overload_cpus, rq->cpu);
  4510		rcu_read_unlock();
  4511		schedstat_end_time(rq, time);
  4512	}
  4513	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
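Editor's sketch (an assumption, not a diagnosis from the report): the report does not state a root cause. A randconfig build that hits a missing struct member, an implicit function declaration, and an incomplete struct type at once is consistent with declarations that exist only under a config option while some use sites are unguarded. The stand-alone example below shows that pattern and one common remedy, an accessor that compiles either way. `DEMO_GROUP_STEAL`, `demo_cfs_rq`, and both helpers are hypothetical names, not the kernel's.

```c
#include <assert.h>

/* Hypothetical switch standing in for a Kconfig option; not the real one. */
/* #define DEMO_GROUP_STEAL 1 */

struct demo_cfs_rq {
	unsigned int h_nr_running;
#ifdef DEMO_GROUP_STEAL
	unsigned int steal_h_nr_running;	/* exists only with the option on */
#endif
};

/* Accessor that compiles in both configurations. */
static inline unsigned int
demo_steal_h_nr_running(const struct demo_cfs_rq *cfs)
{
#ifdef DEMO_GROUP_STEAL
	return cfs->steal_h_nr_running;
#else
	(void)cfs;
	return 0;	/* feature compiled out: no steal tasks are counted */
#endif
}

/* Use site written against the accessor instead of the raw member. */
static int demo_needs_clear(const struct demo_cfs_rq *cfs)
{
	return cfs->h_nr_running < 2 || demo_steal_h_nr_running(cfs) == 0;
}
```

Accessing `cfs->steal_h_nr_running` directly in `demo_needs_clear()` with the option off would fail with the same kind of "has no member named" error shown in the report.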
[PATCH OLK-6.6 0/6] arm64: Add support for FEAT_{LS64, LS64_V}.
by Yushan Wang 17 Apr '25

From: Hongye Lin <linhongye(a)h-partners.com>

driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IC1F41

----------------------------------------------------------------------

Mark Brown (1):
  arm64: Support AT_HWCAP3

Yicong Yang (5):
  arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
  arm64: Add support for FEAT_{LS64, LS64_V}
  kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V}
  arm64: Add ESR.DFSC definition of unsupported exclusive or atomic
    access
  KVM: arm64: Handle DABT caused by LS64* instructions on unsupported
    memory

 Documentation/arch/arm64/booting.rst      | 12 +++
 Documentation/arch/arm64/elf_hwcaps.rst   | 12 ++-
 arch/arm64/include/asm/cpufeature.h       | 3 +-
 arch/arm64/include/asm/el2_setup.h        | 11 +++
 arch/arm64/include/asm/esr.h              | 8 ++
 arch/arm64/include/asm/hwcap.h            | 7 +-
 arch/arm64/include/asm/kvm_emulate.h      | 1 +
 arch/arm64/include/uapi/asm/hwcap.h       | 6 ++
 arch/arm64/kernel/cpufeature.c            | 57 ++++++++++++++
 arch/arm64/kernel/cpuinfo.c               | 2 +
 arch/arm64/kvm/inject_fault.c             | 35 +++++++++
 arch/arm64/kvm/mmu.c                      | 22 +++++-
 arch/arm64/tools/cpucaps                  | 4 +-
 tools/testing/selftests/arm64/abi/hwcap.c | 90 +++++++++++++++++++++++
 14 files changed, 262 insertions(+), 8 deletions(-)

-- 
2.33.0
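Editor's sketch (assumptions labeled): user space normally discovers such CPU features through the ELF auxiliary vector. The example assumes the new capability is reported via AT_HWCAP3, which the cover letter suggests but does not confirm. `DEMO_HWCAP3_LS64_BIT` is a placeholder bit position; the real bit is defined by the series' uapi header and is not shown here. The fallback value 29 for `AT_HWCAP3` is taken from the Linux uapi auxvec header and is only used if the libc headers lack the macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/auxv.h>

#ifndef AT_HWCAP3
#define AT_HWCAP3 29	/* assumed fallback for older libc headers */
#endif

/*
 * Placeholder bit position: the real HWCAP bit for LS64 comes from the
 * series' uapi header and is not visible in this cover letter.
 */
#define DEMO_HWCAP3_LS64_BIT 0u

/* Test a single bit in a hwcap word. */
static bool hwcap_bit_set(unsigned long caps, unsigned int bit)
{
	assert(bit < sizeof(caps) * 8);
	return (caps >> bit) & 1UL;
}

/* Query the running system; getauxval() returns 0 if the entry is absent. */
static bool demo_cpu_has_ls64(void)
{
	return hwcap_bit_set(getauxval(AT_HWCAP3), DEMO_HWCAP3_LS64_BIT);
}
```

A program would call `demo_cpu_has_ls64()` once at startup and take the LS64 code path only when it returns true; on kernels without AT_HWCAP3 the query yields 0 and the feature reads as absent.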
[PATCH OLK-6.6 v2] PCI: AER: fix deadlock in do_recovery
by Qi Xi 17 Apr '25

From: Govindarajulu Varadarajan <gvaradar(a)cisco.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IC1ERL

--------------------------------

CPU0                                    CPU1
---------------------------------------------------------------------
__driver_attach()
device_lock(&dev->mutex) <--- device mutex lock here
driver_probe_device()
pci_enable_sriov()
pci_iov_add_virtfn()
pci_device_add()
                                        aer_isr()               <--- pci aer error
                                        do_recovery()
                                        broadcast_error_message()
                                        pci_walk_bus()
                                        down_read(&pci_bus_sem) <--- rd sem
down_write(&pci_bus_sem) <-- stuck on wr sem
                                        report_error_detected()
                                        device_lock(&dev->mutex)<--- DEAD LOCK

This can also happen when aer error occurs while pci_dev->sriov_config() is
called.

This patch does a pci_bus_walk and adds all the devices to a list. After
unlocking (up_read) &pci_bus_sem, we go through the list and call
err_handler of the devices with devic_lock() held. This way, we dont try
to hold both locks at same time.

v2:
* Drop patch 1, 2 & 4.
* Instead of locking 50+ devices, do get_device() and add them to a list.
  After unlocking &pci_bus_sem, go through the list call err_handler.

v1:
* Previous discussion here: https://lkml.org/lkml/2017/9/27/720

[ 70.984091] pcieport 0000:00:02.0: AER: Uncorrected (Non-Fatal) error received: id=0010
[ 70.984112] pcieport 0000:00:02.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0010(Requester ID)
[ 70.984116] pcieport 0000:00:02.0: device [8086:3c04] error status/mask=00004000/00100000
[ 70.984120] pcieport 0000:00:02.0: [14] Completion Timeout (First)
...
[ 107.484190] INFO: task kworker/0:1:76 blocked for more than 30 seconds.
[ 107.563619] Not tainted 4.13.0+ #28
[ 107.611618] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 107.705368] kworker/0:1 D 0 76 2 0x80000000
[ 107.771050] Workqueue: events aer_isr
[ 107.814895] Call Trace:
[ 107.844181] __schedule+0x312/0xa40
[ 107.885928] schedule+0x3d/0x90
[ 107.923506] schedule_preempt_disabled+0x15/0x20
[ 107.978773] __mutex_lock+0x304/0xa30
[ 108.022594] ? dev_printk_emit+0x3b/0x50
[ 108.069534] ? report_error_detected+0xa6/0x210
[ 108.123770] mutex_lock_nested+0x1b/0x20
[ 108.170713] ? mutex_lock_nested+0x1b/0x20
[ 108.219730] report_error_detected+0xa6/0x210
[ 108.271881] ? aer_recover_queue+0xe0/0xe0
[ 108.320904] pci_walk_bus+0x46/0x90
[ 108.362645] ? aer_recover_queue+0xe0/0xe0
[ 108.411658] broadcast_error_message+0xc3/0xf0
[ 108.464835] do_recovery+0x34/0x220
[ 108.506569] ? get_device_error_info+0x92/0x130
[ 108.560785] aer_isr+0x28f/0x3b0
[ 108.599410] process_one_work+0x277/0x6c0
[ 108.647399] worker_thread+0x4d/0x3b0
[ 108.691218] kthread+0x171/0x190
[ 108.729830] ? process_one_work+0x6c0/0x6c0
[ 108.779888] ? kthread_create_on_node+0x40/0x40
[ 108.834110] ret_from_fork+0x2a/0x40
[ 108.876916] INFO: task kworker/0:2:205 blocked for more than 30 seconds.
[ 108.957129] Not tainted 4.13.0+ #28
[ 109.005114] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 109.098873] kworker/0:2 D 0 205 2 0x80000000
[ 109.164544] Workqueue: events work_for_cpu_fn
[ 109.216681] Call Trace:
[ 109.245943] __schedule+0x312/0xa40
[ 109.287687] ? rwsem_down_write_failed+0x308/0x4f0
[ 109.345021] schedule+0x3d/0x90
[ 109.382603] rwsem_down_write_failed+0x30d/0x4f0
[ 109.437869] ? __lock_acquire+0x75c/0x1410
[ 109.486910] call_rwsem_down_write_failed+0x17/0x30
[ 109.545287] ? call_rwsem_down_write_failed+0x17/0x30
[ 109.605752] down_write+0x88/0xb0
[ 109.645410] pci_device_add+0x158/0x240
[ 109.691313] pci_iov_add_virtfn+0x24f/0x340
[ 109.741375] pci_enable_sriov+0x32b/0x420
[ 109.789466] ? pci_read+0x2c/0x30
[ 109.829142] enic_probe+0x5d4/0xff0 [enic]
[ 109.878184] ? trace_hardirqs_on+0xd/0x10
[ 109.926180] local_pci_probe+0x42/0xa0
[ 109.971037] work_for_cpu_fn+0x14/0x20
[ 110.015898] process_one_work+0x277/0x6c0
[ 110.063884] worker_thread+0x1d6/0x3b0
[ 110.108750] kthread+0x171/0x190
[ 110.147363] ? process_one_work+0x6c0/0x6c0
[ 110.197426] ? kthread_create_on_node+0x40/0x40
[ 110.251642] ret_from_fork+0x2a/0x40
[ 110.294448] INFO: task systemd-udevd:492 blocked for more than 30 seconds.
[ 110.376742] Not tainted 4.13.0+ #28
[ 110.424715] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 110.518457] systemd-udevd D 0 492 444 0x80000180
[ 110.584116] Call Trace:
[ 110.613382] __schedule+0x312/0xa40
[ 110.655127] ? wait_for_completion+0x14a/0x1d0
[ 110.708302] schedule+0x3d/0x90
[ 110.745875] schedule_timeout+0x26e/0x5b0
[ 110.793887] ? wait_for_completion+0x14a/0x1d0
[ 110.847068] wait_for_completion+0x169/0x1d0
[ 110.898165] ? wait_for_completion+0x169/0x1d0
[ 110.951354] ? wake_up_q+0x80/0x80
[ 110.992060] flush_work+0x237/0x300
[ 111.033795] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 111.093224] ? wait_for_completion+0x5a/0x1d0
[ 111.145363] ? flush_work+0x237/0x300
[ 111.189189] work_on_cpu+0x94/0xb0
[ 111.229894] ? work_is_static_object+0x20/0x20
[ 111.283070] ? pci_device_shutdown+0x60/0x60
[ 111.334173] pci_device_probe+0x17a/0x190
[ 111.382163] driver_probe_device+0x2ff/0x450
[ 111.433260] __driver_attach+0x103/0x140
[ 111.480195] ? driver_probe_device+0x450/0x450
[ 111.533381] bus_for_each_dev+0x74/0xb0
[ 111.579276] driver_attach+0x1e/0x20
[ 111.622056] bus_add_driver+0x1ca/0x270
[ 111.667955] ? 0xffffffffc039c000
[ 111.707616] driver_register+0x60/0xe0
[ 111.752472] ? 0xffffffffc039c000
[ 111.792126] __pci_register_driver+0x6b/0x70
[ 111.843275] enic_init_module+0x38/0x1000 [enic]
[ 111.898533] do_one_initcall+0x50/0x192
[ 111.944428] ? trace_hardirqs_on+0xd/0x10
[ 111.992408] do_init_module+0x5f/0x1f2
[ 112.037274] load_module+0x1740/0x1f70
[ 112.082148] SYSC_finit_module+0xd7/0xf0
[ 112.129083] ? SYSC_finit_module+0xd7/0xf0
[ 112.178106] SyS_finit_module+0xe/0x10
[ 112.222972] do_syscall_64+0x69/0x180
[ 112.266793] entry_SYSCALL64_slow_path+0x25/0x25
[ 112.322047] RIP: 0033:0x7f3da098b559
[ 112.364826] RSP: 002b:00007ffeb3306a38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 112.455447] RAX: ffffffffffffffda RBX: 0000557fe41ed3d0 RCX: 00007f3da098b559
[ 112.540860] RDX: 0000000000000000 RSI: 00007f3da14c79c5 RDI: 0000000000000006
[ 112.626281] RBP: 00007f3da14c79c5 R08: 0000000000000000 R09: 00007ffeb3306b50
[ 112.711698] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
[ 112.797114] R13: 0000557fe420e210 R14: 0000000000020000 R15: 0000557fe2c1ef4a
[ 112.882568] Showing all locks held in the system:
[ 112.956545] 5 locks held by kworker/0:1/76:
[ 113.006616] #0: ("events"){+.+.}, at: [<ffffffffb00b10ed>] process_one_work+0x1ed/0x6c0
[ 113.104535] #1: ((&rpc->dpc_handler)){+.+.}, at: [<ffffffffb00b10ed>] process_one_work+0x1ed/0x6c0
[ 113.213894] #2: (&rpc->rpc_mutex){+.+.}, at: [<ffffffffb0505ca2>] aer_isr+0x32/0x3b0
[ 113.308711] #3: (pci_bus_sem){++++}, at: [<ffffffffb04ea18a>] pci_walk_bus+0x2a/0x90
[ 113.403501] #4: (&dev->mutex){....}, at: [<ffffffffb0505706>] report_error_detected+0xa6/0x210
[ 113.508715] 3 locks held by kworker/0:2/205:
[ 113.559808] #0: ("events"){+.+.}, at: [<ffffffffb00b10ed>] process_one_work+0x1ed/0x6c0
[ 113.657718] #1: ((&wfc.work)){+.+.}, at: [<ffffffffb00b10ed>] process_one_work+0x1ed/0x6c0
[ 113.758745] #2: (pci_bus_sem){++++}, at: [<ffffffffb04ec978>] pci_device_add+0x158/0x240
[ 113.857710] 1 lock held by khungtaskd/239:
[ 113.906729] #0: (tasklist_lock){.+.+}, at: [<ffffffffb00f07dd>] debug_show_all_locks+0x3d/0x1a0
[ 114.012972] 2 locks held by systemd-udevd/492:
[ 114.066148] #0: (&dev->mutex){....}, at: [<ffffffffb06254d5>] __driver_attach+0x55/0x140
[ 114.165107] #1: (&dev->mutex){....}, at: [<ffffffffb06254f2>] __driver_attach+0x72/0x140
[ 114.281879] =============================================

Signed-off-by: Govindarajulu Varadarajan <gvaradar(a)cisco.com>
Signed-off-by: Qi Xi <xiqi2(a)huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
---
 drivers/pci/pcie/err.c | 47 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 705893b5f7b0..90b7e748638e 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -171,6 +171,51 @@ static int report_resume(struct pci_dev *dev, void *data)
 	return 0;
 }

+struct aer_device_list {
+	struct device *dev;
+	struct list_head node;
+};
+
+static int aer_get_pci_dev(struct pci_dev *pdev, void *data)
+{
+	struct list_head *head = (struct list_head *)data;
+	struct device *dev = &pdev->dev;
+	struct aer_device_list *entry;
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		/* continue with other devices, lets not return error */
+		return 0;
+
+	entry->dev = get_device(dev);
+	list_add_tail(&entry->node, head);
+
+	return 0;
+}
+
+static void aer_pci_walk_bus(struct pci_bus *top,
+			     int (*cb)(struct pci_dev *, void *),
+			     void *result)
+{
+	LIST_HEAD(dev_list);
+	struct aer_device_list *entry, *tmp;
+
+	pci_walk_bus(top, aer_get_pci_dev, &dev_list);
+	list_for_each_entry_safe(entry, tmp, &dev_list, node) {
+		struct pci_dev *pdev = container_of(entry->dev, struct pci_dev,
+						    dev);
+		int err;
+
+		err = cb(pdev, result);
+		if (err)
+			dev_err(entry->dev, "AER: recovery handler failed: %d",
+				err);
+		put_device(entry->dev);
+		list_del(&entry->node);
+		kfree(entry);
+	}
+}
+
 /**
  * pci_walk_bridge - walk bridges potentially AER affected
  * @bridge:	bridge which may be a Port, an RCEC, or an RCiEP
@@ -189,7 +234,7 @@ static void pci_walk_bridge(struct pci_dev *bridge,
 				   void *userdata)
 {
 	if (bridge->subordinate)
-		pci_walk_bus(bridge->subordinate, cb, userdata);
+		aer_pci_walk_bus(bridge->subordinate, cb, userdata);
 	else
 		cb(bridge, userdata);
 }
-- 
2.33.0
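Editor's sketch (not part of the patch): the change follows a "snapshot under the outer lock, process after releasing it" pattern. The user-space model below restates that idea with toy locks that only record whether they are held, so the "never both locks at once" property can be asserted. All names (`toy_lock`, `toy_dev`, `snapshot_bus()`, `handle_snapshot()`) are hypothetical; the counter `refcount` merely stands in for get_device()/put_device().

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NDEV 3

/* Toy lock: records only whether it is held, so ordering can be asserted. */
struct toy_lock { int held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct toy_dev {
	struct toy_lock lock;	/* stands in for dev->mutex */
	int refcount;		/* stands in for get_device()/put_device() */
	int handled;
};

static struct toy_lock bus_sem;			/* stands in for pci_bus_sem */
static struct toy_dev bus_devs[DEMO_NDEV];

struct snap_entry {
	struct toy_dev *dev;
	struct snap_entry *next;
};

/* Phase 1: under the bus lock, only take references and remember devices. */
static struct snap_entry *snapshot_bus(void)
{
	struct snap_entry *head = NULL, **tail = &head;

	toy_acquire(&bus_sem);
	for (int i = 0; i < DEMO_NDEV; i++) {
		struct snap_entry *e = malloc(sizeof(*e));

		if (!e)
			continue;	/* skip this device, keep walking */
		bus_devs[i].refcount++;
		e->dev = &bus_devs[i];
		e->next = NULL;
		*tail = e;
		tail = &e->next;
	}
	toy_release(&bus_sem);
	return head;
}

/* Phase 2: bus lock released; take each device lock in turn. */
static int handle_snapshot(struct snap_entry *head)
{
	int handled = 0;

	while (head) {
		struct snap_entry *next = head->next;

		assert(!bus_sem.held);	/* never both locks at once */
		toy_acquire(&head->dev->lock);
		head->dev->handled++;
		toy_release(&head->dev->lock);

		head->dev->refcount--;	/* drop the snapshot reference */
		free(head);
		head = next;
		handled++;
	}
	return handled;
}
```

Calling `handle_snapshot(snapshot_bus())` visits each device exactly once. The references taken in phase 1 are what keep the devices valid after the bus lock is dropped, which is the role get_device() plays in the patch.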
[PATCH OLK-6.6 0/3] Backport mainline patches to avoid crash caused by rsize being 0
by Wang Zhaolong 17 Apr '25

Wang Zhaolong (3):
  smb:client: smb: client: Add reverse mapping from tcon to superblocks
  smb: client: Store original IO parameters and prevent zero IO sizes
  smb: client: Update IO sizes after reconnection

 fs/smb/client/cifs_fs_sb.h | 1 +
 fs/smb/client/cifsglob.h   | 3 ++-
 fs/smb/client/connect.c    | 15 +++++++++++++++
 fs/smb/client/fs_context.c | 2 ++
 fs/smb/client/fs_context.h | 3 +++
 fs/smb/client/misc.c       | 2 ++
 fs/smb/client/smb1ops.c    | 6 +++---
 fs/smb/client/smb2ops.c    | 27 +++++++++++++++++++--------
 fs/smb/client/smb2pdu.c    | 24 ++++++++++++++++++++++--
 fs/smb/common/smb2pdu.h    | 3 +++
 10 files changed, 72 insertions(+), 14 deletions(-)

-- 
2.39.2
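Editor's sketch (an illustration of the patch titles only): the cover letter lists no code, so the example below is not the cifs implementation. It shows one way to "store the original IO parameter and prevent a zero IO size": keep the mount-time request, recompute the effective size against the server limit after each (re)negotiation, and never let the result be zero. `demo_io_params`, `demo_negotiate_rsize()`, and `DEMO_DEFAULT_IO_SIZE` are hypothetical.

```c
#include <assert.h>

#define DEMO_DEFAULT_IO_SIZE 65536u	/* hypothetical fallback value */

struct demo_io_params {
	unsigned int requested_rsize;	/* mount-time value, 0 = not set */
	unsigned int rsize;		/* effective value used for reads */
};

/*
 * Recompute the effective read size from the stored request and the
 * server's current limit (0 = no limit reported). The result is never 0.
 */
static unsigned int demo_negotiate_rsize(struct demo_io_params *p,
					 unsigned int server_max)
{
	unsigned int size = p->requested_rsize ? p->requested_rsize
					       : DEMO_DEFAULT_IO_SIZE;

	if (server_max && size > server_max)
		size = server_max;

	assert(size != 0);	/* both inputs to the clamp are non-zero */
	p->rsize = size;
	return size;
}
```

Because the original request is kept separately from the effective value, a reconnect to a server with a larger limit can raise `rsize` back toward the requested size instead of staying at the earlier clamped value.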
[PATCH OLK-6.6 0/3] Backport mainline patches to avoid crash caused by rsize being 0
by Wang Zhaolong 17 Apr '25

Wang Zhaolong (3):
  smb:client: smb: client: Add reverse mapping from tcon to superblocks
  smb: client: Store original IO parameters and prevent zero IO sizes
  smb: client: Update IO sizes after reconnection

 fs/smb/client/cifs_fs_sb.h | 1 +
 fs/smb/client/cifsglob.h   | 3 ++-
 fs/smb/client/connect.c    | 15 +++++++++++++++
 fs/smb/client/fs_context.c | 2 ++
 fs/smb/client/fs_context.h | 3 +++
 fs/smb/client/misc.c       | 2 ++
 fs/smb/client/smb1ops.c    | 6 +++---
 fs/smb/client/smb2ops.c    | 27 +++++++++++++++++++--------
 fs/smb/client/smb2pdu.c    | 24 ++++++++++++++++++++++--
 fs/smb/common/smb2pdu.h    | 3 +++
 10 files changed, 72 insertions(+), 14 deletions(-)

-- 
2.39.2
[PATCH openEuler-1.0-LTS 0/4] Backport mainline patches to avoid crash caused by rsize being 0
by Wang Zhaolong 17 Apr '25

Wang Zhaolong (4):
  Revert "cifs: Prevent NULL pointer dereference caused by cifs_sb->rsize
    is 0"
  smb:client: smb: client: Add reverse mapping from tcon to superblocks
  smb: client: Store original IO parameters and prevent zero IO sizes
  smb: client: Update IO sizes after reconnection

 fs/cifs/cifs_fs_sb.h | 3 +++
 fs/cifs/cifsglob.h   | 12 +++++++++---
 fs/cifs/connect.c    | 27 ++++++++++++++++++++-------
 fs/cifs/misc.c       | 2 ++
 fs/cifs/smb1ops.c    | 10 +++++-----
 fs/cifs/smb2ops.c    | 23 +++++++++++++++++------
 fs/cifs/smb2pdu.c    | 24 ++++++++++++++++++++++--
 7 files changed, 78 insertions(+), 23 deletions(-)

-- 
2.39.2
[PATCH OLK-5.10 0/3] Backport mainline patches to avoid crash caused by rsize being 0
by Wang Zhaolong 17 Apr '25

Wang Zhaolong (3):
  smb:client: smb: client: Add reverse mapping from tcon to superblocks
  smb: client: Store original IO parameters and prevent zero IO sizes
  smb: client: Update IO sizes after reconnection

 fs/cifs/cifs_fs_sb.h | 3 +++
 fs/cifs/cifsglob.h   | 12 +++++++++---
 fs/cifs/connect.c    | 22 ++++++++++++++++++++--
 fs/cifs/misc.c       | 2 ++
 fs/cifs/smb1ops.c    | 10 +++++-----
 fs/cifs/smb2ops.c    | 35 +++++++++++++++++++++++------------
 fs/cifs/smb2pdu.c    | 24 ++++++++++++++++++++++--
 7 files changed, 84 insertions(+), 24 deletions(-)

-- 
2.39.2
[PATCH OLK-6.6 0/3] Backport mainline patches to avoid crash caused by rsize being 0
by Wang Zhaolong 17 Apr '25

Wang Zhaolong (3):
  smb:client: smb: client: Add reverse mapping from tcon to superblocks
  smb: client: Store original IO parameters and prevent zero IO sizes
  smb: client: Update IO sizes after reconnection

 fs/smb/client/cifs_fs_sb.h | 1 +
 fs/smb/client/cifsglob.h   | 3 ++-
 fs/smb/client/connect.c    | 15 +++++++++++++++
 fs/smb/client/fs_context.c | 2 ++
 fs/smb/client/fs_context.h | 3 +++
 fs/smb/client/misc.c       | 2 ++
 fs/smb/client/smb1ops.c    | 6 +++---
 fs/smb/client/smb2ops.c    | 27 +++++++++++++++++++--------
 fs/smb/client/smb2pdu.c    | 24 ++++++++++++++++++++++--
 fs/smb/common/smb2pdu.h    | 3 +++
 10 files changed, 72 insertions(+), 14 deletions(-)

-- 
2.39.2