Fix CVE-2024-35879
Herve Codina (2): driver core: Introduce device_link_wait_removal() of: dynamic: Synchronize of_changeset_destroy() with the devlink removals
Zhang Zekun (1): driver core: Fix kabi broken
drivers/base/core.c | 26 +++++++++++++++++++++++--- drivers/of/dynamic.c | 11 +++++++++++ include/linux/of.h | 2 ++ 3 files changed, 36 insertions(+), 3 deletions(-)
From: Herve Codina herve.codina@bootlin.com
stable inclusion from stable-v5.10.215 commit 7f62d985e94eba4b8327355982cfddf59de37296 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9QG3P CVE: CVE-2024-35879
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
------------------------------------------------------
commit 0462c56c290a99a7f03e817ae5b843116dfb575c upstream.
The commit 80dd33cf72d1 ("drivers: base: Fix device link removal") introduces a workqueue to release the consumer and supplier devices used in the devlink. In the job queued, devices are release and in turn, when all the references to these devices are dropped, the release function of the device itself is called.
Nothing is present to provide some synchronisation with this workqueue in order to ensure that all ongoing releasing operations are done and so, some other operations can be started safely.
For instance, in the following sequence: 1) of_platform_depopulate() 2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed (jobs pushed in the workqueue). During the step 2, OF nodes are destroyed but, without any synchronisation with devlink removal jobs, of_overlay_remove() can raise warnings related to missing of_node_put(): ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late, from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize operations waiting for the end of devlink removals (i.e. end of workqueue jobs). Also, as a flushing operation is done on the workqueue, the workqueue used is moved from a system-wide workqueue to a local one.
Cc: stable@vger.kernel.org Signed-off-by: Herve Codina herve.codina@bootlin.com Tested-by: Luca Ceresoli luca.ceresoli@bootlin.com Reviewed-by: Nuno Sa nuno.sa@analog.com Reviewed-by: Saravana Kannan saravanak@google.com Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Conflicts: drivers/base/core.c [There are only context conficts, no code logic conflicts] Signed-off-by: Zhang Zekun zhangzekun11@huawei.com --- drivers/base/core.c | 26 +++++++++++++++++++++++--- include/linux/device.h | 1 + 2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c index 6fab0005e880..82b50a89cedc 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -53,6 +53,7 @@ static unsigned int defer_sync_state_count = 1; static unsigned int defer_fw_devlink_count; static LIST_HEAD(deferred_fw_devlink); static DEFINE_MUTEX(defer_fw_devlink_lock); +static struct workqueue_struct *device_link_wq; static bool fw_devlink_is_permissive(void);
#ifdef CONFIG_SRCU @@ -364,12 +365,26 @@ static void devlink_dev_release(struct device *dev) /* * It may take a while to complete this work because of the SRCU * synchronization in device_link_release_fn() and if the consumer or - * supplier devices get deleted when it runs, so put it into the "long" - * workqueue. + * supplier devices get deleted when it runs, so put it into the + * dedicated workqueue. */ - queue_work(system_long_wq, &link->rm_work); + queue_work(device_link_wq, &link->rm_work); }
+/** + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate + */ +void device_link_wait_removal(void) +{ + /* + * devlink removal jobs are queued in the dedicated work queue. + * To be sure that all removal jobs are terminated, ensure that any + * scheduled work has run to completion. + */ + flush_workqueue(device_link_wq); +} +EXPORT_SYMBOL_GPL(device_link_wait_removal); + static struct class devlink_class = { .name = "devlink", .owner = THIS_MODULE, @@ -3420,9 +3435,14 @@ int __init devices_init(void) sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj); if (!sysfs_dev_char_kobj) goto char_kobj_err; + device_link_wq = alloc_workqueue("device_link_wq", 0, 0); + if (!device_link_wq) + goto wq_err;
return 0;
+ wq_err: + kobject_put(sysfs_dev_char_kobj); char_kobj_err: kobject_put(sysfs_dev_block_kobj); block_kobj_err: diff --git a/include/linux/device.h b/include/linux/device.h index 62b127bffdda..ceb02c0ac69c 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -962,6 +962,7 @@ void device_link_del(struct device_link *link); void device_link_remove(void *consumer, struct device *supplier); void device_links_supplier_sync_state_pause(void); void device_links_supplier_sync_state_resume(void); +void device_link_wait_removal(void);
extern __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...);
From: Herve Codina herve.codina@bootlin.com
stable inclusion from stable-v5.10.215 commit 3127b2ee50c424a96eb3559fbb7b43cf0b111c7a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9QG3P CVE: CVE-2024-35879
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
------------------------------------------------
commit 8917e7385346bd6584890ed362985c219fe6ae84 upstream.
In the following sequence: 1) of_platform_depopulate() 2) of_overlay_remove()
During the step 1, devices are destroyed and devlinks are removed. During the step 2, OF nodes are destroyed but __of_changeset_entry_destroy() can raise warnings related to missing of_node_put(): ERROR: memory leak, expected refcount 1 instead of 2 ...
Indeed, during the devlink removals performed at step 1, the removal itself releasing the device (and the attached of_node) is done by a job queued in a workqueue and so, it is done asynchronously with respect to function calls. When the warning is present, of_node_put() will be called but wrongly too late from the workqueue job.
In order to be sure that any ongoing devlink removals are done before the of_node destruction, synchronize the of_changeset_destroy() with the devlink removals.
Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina herve.codina@bootlin.com Reviewed-by: Saravana Kannan saravanak@google.com Tested-by: Luca Ceresoli luca.ceresoli@bootlin.com Reviewed-by: Nuno Sa nuno.sa@analog.com Link: https://lore.kernel.org/r/20240325152140.198219-3-herve.codina@bootlin.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zhang Zekun zhangzekun11@huawei.com --- drivers/of/dynamic.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index b6a3ee65437b..4d80167d39d4 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -9,6 +9,7 @@
#define pr_fmt(fmt) "OF: " fmt
+#include <linux/device.h> #include <linux/of.h> #include <linux/spinlock.h> #include <linux/slab.h> @@ -675,6 +676,17 @@ void of_changeset_destroy(struct of_changeset *ocs) { struct of_changeset_entry *ce, *cen;
+ /* + * When a device is deleted, the device links to/from it are also queued + * for deletion. Until these device links are freed, the devices + * themselves aren't freed. If the device being deleted is due to an + * overlay change, this device might be holding a reference to a device + * node that will be freed. So, wait until all already pending device + * links are deleted before freeing a device node. This ensures we don't + * free any device node that has a non-zero reference count. + */ + device_link_wait_removal(); + list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node) __of_changeset_entry_destroy(ce); }
hulk inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9QG3P
-----------------------------------
Move the device_link_wait_removal() to the of.h to avoid kabi broken.
Fixes: 6c72a794b8ae ("of: dynamic: Synchronize of_changeset_destroy() with the devlink removals") Signed-off-by: Zhang Zekun zhangzekun11@huawei.com --- drivers/of/dynamic.c | 1 - include/linux/device.h | 1 - include/linux/of.h | 2 ++ 3 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index 4d80167d39d4..e663445ce085 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -9,7 +9,6 @@
#define pr_fmt(fmt) "OF: " fmt
-#include <linux/device.h> #include <linux/of.h> #include <linux/spinlock.h> #include <linux/slab.h> diff --git a/include/linux/device.h b/include/linux/device.h index ceb02c0ac69c..62b127bffdda 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -962,7 +962,6 @@ void device_link_del(struct device_link *link); void device_link_remove(void *consumer, struct device *supplier); void device_links_supplier_sync_state_pause(void); void device_links_supplier_sync_state_resume(void); -void device_link_wait_removal(void);
extern __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); diff --git a/include/linux/of.h b/include/linux/of.h index e6b8e39f524c..1ebea14ad39c 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -1451,6 +1451,8 @@ static inline int of_reconfig_get_state_change(unsigned long action, } #endif /* CONFIG_OF_DYNAMIC */
+void device_link_wait_removal(void); + /** * of_device_is_system_power_controller - Tells if system-power-controller is found for device_node * @np: Pointer to the given device_node
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,转换为PR失败! 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/M... 失败原因:应用补丁/补丁集失败,Patch failed at 0001 driver core: Introduce device_link_wait_removal() 建议解决方法:请查看失败原因, 确认补丁是否可以应用在当前期望分支的最新代码上
FeedBack: The patch(es) which you have sent to kernel@openeuler.org has been converted to PR failed! Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/M... Failed Reason: apply patch(es) failed, Patch failed at 0001 driver core: Introduce device_link_wait_removal() Suggest Solution: please checkout if the failed patch(es) can work on the newest codes in expected branch