From: Carlos Llamas <cmllamas(a)google.com>
stable inclusion
from stable-v5.10.209
commit 7e7a0d86542b0ea903006d3f42f33c4f7ead6918
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I99JWJ
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 9a9ab0d963621d9d12199df9817e66982582d5a5 upstream.
Task A calls binder_update_page_range() to allocate and insert pages on
a remote address space from Task B. For this, Task A pins the remote mm
via mmget_not_zero() first. This can race with Task B do_exit() and the
final mmput() refcount decrement will come from Task A.
Task A | Task B
------------------+------------------
mmget_not_zero() |
| do_exit()
| exit_mm()
| mmput()
mmput() |
exit_mmap() |
remove_vma() |
fput() |
In this case, the work of ____fput() from Task B is queued up in Task A
as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup
work gets executed. However, Task A instead sleep, waiting for a reply
from Task B that never comes (it's dead).
This means the binder_deferred_release() is blocked until an unrelated
binder event forces Task A to go back to userspace. All the associated
death notifications will also be delayed until then.
In order to fix this use mmput_async() that will schedule the work in
the corresponding mm->async_put_work WQ instead of Task A.
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Reviewed-by: Alice Ryhl <aliceryhl(a)google.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-4-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
drivers/android/binder_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 3c93e6c05c4d..c534b332c8d3 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -271,7 +271,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
}
if (mm) {
mmap_write_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
}
return 0;
@@ -304,7 +304,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
err_no_vma:
if (mm) {
mmap_write_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
}
return vma ? -ENOMEM : -ESRCH;
}
--
2.17.1
From: Carlos Llamas <cmllamas(a)google.com>
stable inclusion
from stable-v5.10.209
commit 7e7a0d86542b0ea903006d3f42f33c4f7ead6918
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I99JWJ
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 9a9ab0d963621d9d12199df9817e66982582d5a5 upstream.
Task A calls binder_update_page_range() to allocate and insert pages on
a remote address space from Task B. For this, Task A pins the remote mm
via mmget_not_zero() first. This can race with Task B do_exit() and the
final mmput() refcount decrement will come from Task A.
Task A | Task B
------------------+------------------
mmget_not_zero() |
| do_exit()
| exit_mm()
| mmput()
mmput() |
exit_mmap() |
remove_vma() |
fput() |
In this case, the work of ____fput() from Task B is queued up in Task A
as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup
work gets executed. However, Task A instead sleep, waiting for a reply
from Task B that never comes (it's dead).
This means the binder_deferred_release() is blocked until an unrelated
binder event forces Task A to go back to userspace. All the associated
death notifications will also be delayed until then.
In order to fix this use mmput_async() that will schedule the work in
the corresponding mm->async_put_work WQ instead of Task A.
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Reviewed-by: Alice Ryhl <aliceryhl(a)google.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-4-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
drivers/android/binder_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 3c93e6c05c4d..c534b332c8d3 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -271,7 +271,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
}
if (mm) {
mmap_write_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
}
return 0;
@@ -304,7 +304,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
err_no_vma:
if (mm) {
mmap_write_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
}
return vma ? -ENOMEM : -ESRCH;
}
--
2.17.1
From: Carlos Llamas <cmllamas(a)google.com>
stable inclusion
from stable-v5.10.209
commit 7e7a0d86542b0ea903006d3f42f33c4f7ead6918
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I99JWJ
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 9a9ab0d963621d9d12199df9817e66982582d5a5 upstream.
Task A calls binder_update_page_range() to allocate and insert pages on
a remote address space from Task B. For this, Task A pins the remote mm
via mmget_not_zero() first. This can race with Task B do_exit() and the
final mmput() refcount decrement will come from Task A.
Task A | Task B
------------------+------------------
mmget_not_zero() |
| do_exit()
| exit_mm()
| mmput()
mmput() |
exit_mmap() |
remove_vma() |
fput() |
In this case, the work of ____fput() from Task B is queued up in Task A
as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup
work gets executed. However, Task A instead sleep, waiting for a reply
from Task B that never comes (it's dead).
This means the binder_deferred_release() is blocked until an unrelated
binder event forces Task A to go back to userspace. All the associated
death notifications will also be delayed until then.
In order to fix this use mmput_async() that will schedule the work in
the corresponding mm->async_put_work WQ instead of Task A.
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Reviewed-by: Alice Ryhl <aliceryhl(a)google.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-4-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
drivers/android/binder_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 3c93e6c05c4d..c534b332c8d3 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -271,7 +271,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
}
if (mm) {
mmap_write_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
}
return 0;
@@ -304,7 +304,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
err_no_vma:
if (mm) {
mmap_write_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
}
return vma ? -ENOMEM : -ESRCH;
}
--
2.17.1
From: Kees Cook <keescook(a)chromium.org>
mainline inclusion
from mainline-v6.8-rc1
commit 9e4bf6a08d1e127bcc4bd72557f2dfafc6bc7f41
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I99K1F
CVE: CVE-2023-52618
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Since "dev_search_path" can technically be as large as PATH_MAX,
there was a risk of truncation when copying it and a second string
into "full_path" since it was also PATH_MAX sized. The W=1 builds were
reporting this warning:
drivers/block/rnbd/rnbd-srv.c: In function 'process_msg_open.isra':
drivers/block/rnbd/rnbd-srv.c:616:51: warning: '%s' directive output may be truncated writing up to 254 bytes into a region of size between 0 and 4095 [-Wformat-truncation=]
616 | snprintf(full_path, PATH_MAX, "%s/%s",
| ^~
In function 'rnbd_srv_get_full_path',
inlined from 'process_msg_open.isra' at drivers/block/rnbd/rnbd-srv.c:721:14: drivers/block/rnbd/rnbd-srv.c:616:17: note: 'snprintf' output between 2 and 4351 bytes into a destination of size 4096
616 | snprintf(full_path, PATH_MAX, "%s/%s",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
617 | dev_search_path, dev_name);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
To fix this, unconditionally check for truncation (as was already done
for the case where "%SESSNAME%" was present).
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312100355.lHoJPgKy-lkp@intel.com/
Cc: Md. Haris Iqbal <haris.iqbal(a)ionos.com>
Cc: Jack Wang <jinpu.wang(a)ionos.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <linux-block(a)vger.kernel.org>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Guoqing Jiang <guoqing.jiang(a)linux.dev>
Acked-by: Jack Wang <jinpu.wang(a)ionos.com>
Link: https://lore.kernel.org/r/20231212214738.work.169-kees@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
drivers/block/rnbd/rnbd-srv.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index e1bc8b4cd592..9c5d52335e17 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -591,6 +591,7 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
{
char *full_path;
char *a, *b;
+ int len;
full_path = kmalloc(PATH_MAX, GFP_KERNEL);
if (!full_path)
@@ -602,19 +603,19 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
*/
a = strnstr(dev_search_path, "%SESSNAME%", sizeof(dev_search_path));
if (a) {
- int len = a - dev_search_path;
+ len = a - dev_search_path;
len = snprintf(full_path, PATH_MAX, "%.*s/%s/%s", len,
dev_search_path, srv_sess->sessname, dev_name);
- if (len >= PATH_MAX) {
- pr_err("Too long path: %s, %s, %s\n",
- dev_search_path, srv_sess->sessname, dev_name);
- kfree(full_path);
- return ERR_PTR(-EINVAL);
- }
} else {
- snprintf(full_path, PATH_MAX, "%s/%s",
- dev_search_path, dev_name);
+ len = snprintf(full_path, PATH_MAX, "%s/%s",
+ dev_search_path, dev_name);
+ }
+ if (len >= PATH_MAX) {
+ pr_err("Too long path: %s, %s, %s\n",
+ dev_search_path, srv_sess->sessname, dev_name);
+ kfree(full_path);
+ return ERR_PTR(-EINVAL);
}
/* eliminitate duplicated slashes */
--
2.17.1
From: Kees Cook <keescook(a)chromium.org>
mainline inclusion
from mainline-v6.8-rc1
commit 9e4bf6a08d1e127bcc4bd72557f2dfafc6bc7f41
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I99K1F
CVE: CVE-2023-52618
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Since "dev_search_path" can technically be as large as PATH_MAX,
there was a risk of truncation when copying it and a second string
into "full_path" since it was also PATH_MAX sized. The W=1 builds were
reporting this warning:
drivers/block/rnbd/rnbd-srv.c: In function 'process_msg_open.isra':
drivers/block/rnbd/rnbd-srv.c:616:51: warning: '%s' directive output may be truncated writing up to 254 bytes into a region of size between 0 and 4095 [-Wformat-truncation=]
616 | snprintf(full_path, PATH_MAX, "%s/%s",
| ^~
In function 'rnbd_srv_get_full_path',
inlined from 'process_msg_open.isra' at drivers/block/rnbd/rnbd-srv.c:721:14: drivers/block/rnbd/rnbd-srv.c:616:17: note: 'snprintf' output between 2 and 4351 bytes into a destination of size 4096
616 | snprintf(full_path, PATH_MAX, "%s/%s",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
617 | dev_search_path, dev_name);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
To fix this, unconditionally check for truncation (as was already done
for the case where "%SESSNAME%" was present).
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312100355.lHoJPgKy-lkp@intel.com/
Cc: Md. Haris Iqbal <haris.iqbal(a)ionos.com>
Cc: Jack Wang <jinpu.wang(a)ionos.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <linux-block(a)vger.kernel.org>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Guoqing Jiang <guoqing.jiang(a)linux.dev>
Acked-by: Jack Wang <jinpu.wang(a)ionos.com>
Link: https://lore.kernel.org/r/20231212214738.work.169-kees@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
drivers/block/rnbd/rnbd-srv.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index e1bc8b4cd592..9c5d52335e17 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -591,6 +591,7 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
{
char *full_path;
char *a, *b;
+ int len;
full_path = kmalloc(PATH_MAX, GFP_KERNEL);
if (!full_path)
@@ -602,19 +603,19 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
*/
a = strnstr(dev_search_path, "%SESSNAME%", sizeof(dev_search_path));
if (a) {
- int len = a - dev_search_path;
+ len = a - dev_search_path;
len = snprintf(full_path, PATH_MAX, "%.*s/%s/%s", len,
dev_search_path, srv_sess->sessname, dev_name);
- if (len >= PATH_MAX) {
- pr_err("Too long path: %s, %s, %s\n",
- dev_search_path, srv_sess->sessname, dev_name);
- kfree(full_path);
- return ERR_PTR(-EINVAL);
- }
} else {
- snprintf(full_path, PATH_MAX, "%s/%s",
- dev_search_path, dev_name);
+ len = snprintf(full_path, PATH_MAX, "%s/%s",
+ dev_search_path, dev_name);
+ }
+ if (len >= PATH_MAX) {
+ pr_err("Too long path: %s, %s, %s\n",
+ dev_search_path, srv_sess->sessname, dev_name);
+ kfree(full_path);
+ return ERR_PTR(-EINVAL);
}
/* eliminitate duplicated slashes */
--
2.17.1
From: Kees Cook <keescook(a)chromium.org>
mainline inclusion
from mainline-v6.8-rc1
commit 9e4bf6a08d1e127bcc4bd72557f2dfafc6bc7f41
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I99K1F
CVE: CVE-2023-52618
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Since "dev_search_path" can technically be as large as PATH_MAX,
there was a risk of truncation when copying it and a second string
into "full_path" since it was also PATH_MAX sized. The W=1 builds were
reporting this warning:
drivers/block/rnbd/rnbd-srv.c: In function 'process_msg_open.isra':
drivers/block/rnbd/rnbd-srv.c:616:51: warning: '%s' directive output may be truncated writing up to 254 bytes into a region of size between 0 and 4095 [-Wformat-truncation=]
616 | snprintf(full_path, PATH_MAX, "%s/%s",
| ^~
In function 'rnbd_srv_get_full_path',
inlined from 'process_msg_open.isra' at drivers/block/rnbd/rnbd-srv.c:721:14: drivers/block/rnbd/rnbd-srv.c:616:17: note: 'snprintf' output between 2 and 4351 bytes into a destination of size 4096
616 | snprintf(full_path, PATH_MAX, "%s/%s",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
617 | dev_search_path, dev_name);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
To fix this, unconditionally check for truncation (as was already done
for the case where "%SESSNAME%" was present).
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312100355.lHoJPgKy-lkp@intel.com/
Cc: Md. Haris Iqbal <haris.iqbal(a)ionos.com>
Cc: Jack Wang <jinpu.wang(a)ionos.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <linux-block(a)vger.kernel.org>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Guoqing Jiang <guoqing.jiang(a)linux.dev>
Acked-by: Jack Wang <jinpu.wang(a)ionos.com>
Link: https://lore.kernel.org/r/20231212214738.work.169-kees@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
drivers/block/rnbd/rnbd-srv.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index e1bc8b4cd592..9c5d52335e17 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -591,6 +591,7 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
{
char *full_path;
char *a, *b;
+ int len;
full_path = kmalloc(PATH_MAX, GFP_KERNEL);
if (!full_path)
@@ -602,19 +603,19 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
*/
a = strnstr(dev_search_path, "%SESSNAME%", sizeof(dev_search_path));
if (a) {
- int len = a - dev_search_path;
+ len = a - dev_search_path;
len = snprintf(full_path, PATH_MAX, "%.*s/%s/%s", len,
dev_search_path, srv_sess->sessname, dev_name);
- if (len >= PATH_MAX) {
- pr_err("Too long path: %s, %s, %s\n",
- dev_search_path, srv_sess->sessname, dev_name);
- kfree(full_path);
- return ERR_PTR(-EINVAL);
- }
} else {
- snprintf(full_path, PATH_MAX, "%s/%s",
- dev_search_path, dev_name);
+ len = snprintf(full_path, PATH_MAX, "%s/%s",
+ dev_search_path, dev_name);
+ }
+ if (len >= PATH_MAX) {
+ pr_err("Too long path: %s, %s, %s\n",
+ dev_search_path, srv_sess->sessname, dev_name);
+ kfree(full_path);
+ return ERR_PTR(-EINVAL);
}
/* eliminitate duplicated slashes */
--
2.17.1
From: Gui-Dong Han <2045gemini(a)gmail.com>
stable inclusion
from stable-v5.10.209
commit 394c6c0b6d9bdd7d6ebca35ca9cfbabf44c0c257
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I917MX
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit da9065caa594d19b26e1a030fd0cc27bd365d685 upstream.
In min_key_size_set():
if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
In max_key_size_set():
if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs.Consider a scenario where setmin writes a new, valid 'min'
value, and concurrently, setmax writes a value that is greater than the
old 'min' but smaller than the new 'min'. In this case, setmax might check
against the old 'min' value (before acquiring the lock) but write its
value after the 'min' has been updated by setmin. This leads to a
situation where the 'max' value ends up being smaller than the 'min'
value, which is an inconsistency.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug,
with the kernel configuration allyesconfig for x86_64. Due to the lack of
associated hardware, we cannot test the patch in runtime testing, and just
verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 18f81241b74f ("Bluetooth: Move {min,max}_key_size debugfs ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
net/bluetooth/hci_debugfs.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 338833f12365..d4efc4aa55af 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -994,10 +994,12 @@ static int min_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
+ hci_dev_lock(hdev);
+ if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
@@ -1022,10 +1024,12 @@ static int max_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
+ hci_dev_lock(hdev);
+ if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
--
2.17.1
From: Gui-Dong Han <2045gemini(a)gmail.com>
stable inclusion
from stable-v5.10.209
commit 394c6c0b6d9bdd7d6ebca35ca9cfbabf44c0c257
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I917MX
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit da9065caa594d19b26e1a030fd0cc27bd365d685 upstream.
In min_key_size_set():
if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
In max_key_size_set():
if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs.Consider a scenario where setmin writes a new, valid 'min'
value, and concurrently, setmax writes a value that is greater than the
old 'min' but smaller than the new 'min'. In this case, setmax might check
against the old 'min' value (before acquiring the lock) but write its
value after the 'min' has been updated by setmin. This leads to a
situation where the 'max' value ends up being smaller than the 'min'
value, which is an inconsistency.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug,
with the kernel configuration allyesconfig for x86_64. Due to the lack of
associated hardware, we cannot test the patch in runtime testing, and just
verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 18f81241b74f ("Bluetooth: Move {min,max}_key_size debugfs ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
net/bluetooth/hci_debugfs.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 338833f12365..d4efc4aa55af 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -994,10 +994,12 @@ static int min_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
+ hci_dev_lock(hdev);
+ if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
@@ -1022,10 +1024,12 @@ static int max_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
+ hci_dev_lock(hdev);
+ if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
--
2.17.1
From: Gui-Dong Han <2045gemini(a)gmail.com>
stable inclusion
from stable-v5.10.209
commit 394c6c0b6d9bdd7d6ebca35ca9cfbabf44c0c257
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I917MX
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit da9065caa594d19b26e1a030fd0cc27bd365d685 upstream.
In min_key_size_set():
if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
In max_key_size_set():
if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs.Consider a scenario where setmin writes a new, valid 'min'
value, and concurrently, setmax writes a value that is greater than the
old 'min' but smaller than the new 'min'. In this case, setmax might check
against the old 'min' value (before acquiring the lock) but write its
value after the 'min' has been updated by setmin. This leads to a
situation where the 'max' value ends up being smaller than the 'min'
value, which is an inconsistency.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug,
with the kernel configuration allyesconfig for x86_64. Due to the lack of
associated hardware, we cannot test the patch in runtime testing, and just
verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 18f81241b74f ("Bluetooth: Move {min,max}_key_size debugfs ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
net/bluetooth/hci_debugfs.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 338833f12365..d4efc4aa55af 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -994,10 +994,12 @@ static int min_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
+ hci_dev_lock(hdev);
+ if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
@@ -1022,10 +1024,12 @@ static int max_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
+ hci_dev_lock(hdev);
+ if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
--
2.17.1
From: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
mainline inclusion
from mainline-v6.8-rc1
commit 97566d09fd02d2ab329774bb89a2cdf2267e86d9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9BV4D
CVE: CVE-2024-26646
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The kernel allocates a memory buffer and provides its location to the
hardware, which uses it to update the HFI table. This allocation occurs
during boot and remains constant throughout runtime.
When resuming from hibernation, the restore kernel allocates a second
memory buffer and reprograms the HFI hardware with the new location as
part of a normal boot. The location of the second memory buffer may
differ from the one allocated by the image kernel.
When the restore kernel transfers control to the image kernel, its HFI
buffer becomes invalid, potentially leading to memory corruption if the
hardware writes to it (the hardware continues to use the buffer from the
restore kernel).
It is also possible that the hardware "forgets" the address of the memory
buffer when resuming from "deep" suspend. Memory corruption may also occur
in such a scenario.
To prevent the described memory corruption, disable HFI when preparing to
suspend or hibernate. Enable it when resuming.
Add syscore callbacks to handle the package of the boot CPU (packages of
non-boot CPUs are handled via CPU offline). Syscore ops always run on the
boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend
and hibernation. Syscore ops only run in these cases.
Cc: 6.1+ <stable(a)vger.kernel.org> # 6.1+
Signed-off-by: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
[ rjw: Comment adjustment, subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Zhao Wenhui <zhaowenhui8(a)huawei.com>
---
drivers/thermal/intel/intel_hfi.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 158a2b7c6b6f..d442d639a900 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -33,7 +33,9 @@
#include <linux/processor.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/suspend.h>
#include <linux/string.h>
+#include <linux/syscore_ops.h>
#include <linux/topology.h>
#include <linux/workqueue.h>
@@ -524,6 +526,30 @@ static __init int hfi_parse_features(void)
return 0;
}
+static void hfi_do_enable(void)
+{
+ /* This code runs only on the boot CPU. */
+ struct hfi_cpu_info *info = &per_cpu(hfi_cpu_info, 0);
+ struct hfi_instance *hfi_instance = info->hfi_instance;
+
+ /* No locking needed. There is no concurrency with CPU online. */
+ hfi_set_hw_table(hfi_instance);
+ hfi_enable();
+}
+
+static int hfi_do_disable(void)
+{
+ /* No locking needed. There is no concurrency with CPU offline. */
+ hfi_disable();
+
+ return 0;
+}
+
+static struct syscore_ops hfi_pm_ops = {
+ .resume = hfi_do_enable,
+ .suspend = hfi_do_disable,
+};
+
void __init intel_hfi_init(void)
{
struct hfi_instance *hfi_instance;
@@ -555,6 +581,8 @@ void __init intel_hfi_init(void)
if (!hfi_updates_wq)
goto err_nomem;
+ register_syscore_ops(&hfi_pm_ops);
+
return;
err_nomem:
--
2.17.1
From: Pablo Neira Ayuso <pablo(a)netfilter.org>
mainline inclusion
from mainline-v6.8
commit 552705a3650bbf46a22b1adedc1b04181490fc36
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9AK7L
CVE: CVE-2024-26643
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
While the rhashtable set gc runs asynchronously, a race allows it to
collect elements from anonymous sets with timeouts while it is being
released from the commit path.
Mingi Cho originally reported this issue in a different path in 6.1.x
with a pipapo set with low timeouts which is not possible upstream since
7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set
element timeout").
Fix this by setting on the dead flag for anonymous sets to skip async gc
in this case.
According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on
transaction abort"), Florian plans to accelerate abort path by releasing
objects via workqueue, therefore, this sets on the dead flag for abort
path too.
Cc: stable(a)vger.kernel.org
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho <mgcho.minic(a)gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
---
net/netfilter/nf_tables_api.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 0a7f7ed33406..07355fd7b2a2 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4674,6 +4674,7 @@ static void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set,
if (list_empty(&set->bindings) && nft_set_is_anonymous(set)) {
list_del_rcu(&set->list);
+ set->dead = 1;
if (event)
nf_tables_set_notify(ctx, set, NFT_MSG_DELSET,
GFP_KERNEL);
--
2.17.1
From: Pablo Neira Ayuso <pablo(a)netfilter.org>
mainline inclusion
from mainline-v6.8
commit 552705a3650bbf46a22b1adedc1b04181490fc36
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9AK7L
CVE: CVE-2024-26643
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
While the rhashtable set gc runs asynchronously, a race allows it to
collect elements from anonymous sets with timeouts while it is being
released from the commit path.
Mingi Cho originally reported this issue in a different path in 6.1.x
with a pipapo set with low timeouts which is not possible upstream since
7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set
element timeout").
Fix this by setting on the dead flag for anonymous sets to skip async gc
in this case.
According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on
transaction abort"), Florian plans to accelerate abort path by releasing
objects via workqueue, therefore, this sets on the dead flag for abort
path too.
Cc: stable(a)vger.kernel.org
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho <mgcho.minic(a)gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
---
net/netfilter/nf_tables_api.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3667b4daf1bb..566053c7dc2e 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4674,6 +4674,7 @@ static void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set,
if (list_empty(&set->bindings) && nft_set_is_anonymous(set)) {
list_del_rcu(&set->list);
+ set->dead = 1;
if (event)
nf_tables_set_notify(ctx, set, NFT_MSG_DELSET,
GFP_KERNEL);
--
2.17.1
From: Petr Pavlu <petr.pavlu(a)suse.com>
mainline inclusion
from mainline-v6.8-rc2
commit 2b44760609e9eaafc9d234a6883d042fc21132a7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9BV24
CVE: CVE-2024-26645
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Running the following two commands in parallel on a multi-processor
AArch64 machine can sporadically produce an unexpected warning about
duplicate histogram entries:
$ while true; do
echo hist:key=id.syscall:val=hitcount > \
/sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
sleep 0.001
done
$ stress-ng --sysbadaddr $(nproc)
The warning looks as follows:
[ 2911.172474] ------------[ cut here ]------------
[ 2911.173111] Duplicates detected: 1
[ 2911.173574] WARNING: CPU: 2 PID: 12247 at kernel/trace/tracing_map.c:983 tracing_map_sort_entries+0x3e0/0x408
[ 2911.174702] Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) af_packet(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ena(E) tiny_power_button(E) qemu_fw_cfg(E) button(E) fuse(E) efi_pstore(E) ip_tables(E) x_tables(E) xfs(E) libcrc32c(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) polyval_ce(E) polyval_generic(E) ghash_ce(E) gf128mul(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm4(E) sm3_ce(E) sm3(E) sha3_ce(E) sha512_ce(E) sha512_arm64(E) sha2_ce(E) sha256_arm64(E) nvme(E) sha1_ce(E) nvme_core(E) nvme_auth(E) t10_pi(E) sg(E) scsi_mod(E) scsi_common(E) efivarfs(E)
[ 2911.174738] Unloaded tainted modules: cppc_cpufreq(E):1
[ 2911.180985] CPU: 2 PID: 12247 Comm: cat Kdump: loaded Tainted: G E 6.7.0-default #2 1b58bbb22c97e4399dc09f92d309344f69c44a01
[ 2911.182398] Hardware name: Amazon EC2 c7g.8xlarge/, BIOS 1.0 11/1/2018
[ 2911.183208] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 2911.184038] pc : tracing_map_sort_entries+0x3e0/0x408
[ 2911.184667] lr : tracing_map_sort_entries+0x3e0/0x408
[ 2911.185310] sp : ffff8000a1513900
[ 2911.185750] x29: ffff8000a1513900 x28: ffff0003f272fe80 x27: 0000000000000001
[ 2911.186600] x26: ffff0003f272fe80 x25: 0000000000000030 x24: 0000000000000008
[ 2911.187458] x23: ffff0003c5788000 x22: ffff0003c16710c8 x21: ffff80008017f180
[ 2911.188310] x20: ffff80008017f000 x19: ffff80008017f180 x18: ffffffffffffffff
[ 2911.189160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000a15134b8
[ 2911.190015] x14: 0000000000000000 x13: 205d373432323154 x12: 5b5d313131333731
[ 2911.190844] x11: 00000000fffeffff x10: 00000000fffeffff x9 : ffffd1b78274a13c
[ 2911.191716] x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 000000000057ffa8
[ 2911.192554] x5 : ffff0012f6c24ec0 x4 : 0000000000000000 x3 : ffff2e5b72b5d000
[ 2911.193404] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0003ff254480
[ 2911.194259] Call trace:
[ 2911.194626] tracing_map_sort_entries+0x3e0/0x408
[ 2911.195220] hist_show+0x124/0x800
[ 2911.195692] seq_read_iter+0x1d4/0x4e8
[ 2911.196193] seq_read+0xe8/0x138
[ 2911.196638] vfs_read+0xc8/0x300
[ 2911.197078] ksys_read+0x70/0x108
[ 2911.197534] __arm64_sys_read+0x24/0x38
[ 2911.198046] invoke_syscall+0x78/0x108
[ 2911.198553] el0_svc_common.constprop.0+0xd0/0xf8
[ 2911.199157] do_el0_svc+0x28/0x40
[ 2911.199613] el0_svc+0x40/0x178
[ 2911.200048] el0t_64_sync_handler+0x13c/0x158
[ 2911.200621] el0t_64_sync+0x1a8/0x1b0
[ 2911.201115] ---[ end trace 0000000000000000 ]---
The problem appears to be caused by CPU reordering of writes issued from
__tracing_map_insert().
The check for the presence of an element with a given key in this
function is:
val = READ_ONCE(entry->val);
if (val && keys_match(key, val->key, map->key_size)) ...
The write of a new entry is:
elt = get_free_elt(map);
memcpy(elt->key, key, map->key_size);
entry->val = elt;
The "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;"
stores may become visible in the reversed order on another CPU. This
second CPU might then incorrectly determine that a new key doesn't match
an already present val->key and subsequently insert a new element,
resulting in a duplicate.
Fix the problem by adding a write barrier between
"memcpy(elt->key, key, map->key_size);" and "entry->val = elt;", and for
good measure, also use WRITE_ONCE(entry->val, elt) for publishing the
element. The sequence pairs with the mentioned "READ_ONCE(entry->val);"
and the "val->key" check which has an address dependency.
The barrier is placed on a path executed when adding an element for
a new key. Subsequent updates targeting the same key remain unaffected.
From the user's perspective, the issue was introduced by commit
c193707dde77 ("tracing: Remove code which merges duplicates"), which
followed commit cbf4100efb8f ("tracing: Add support to detect and avoid
duplicates"). The previous code operated differently; it inherently
expected potential races which result in duplicates but merged them
later when they occurred.
Link: https://lore.kernel.org/linux-trace-kernel/20240122150928.27725-1-petr.pavl…
Fixes: c193707dde77 ("tracing: Remove code which merges duplicates")
Signed-off-by: Petr Pavlu <petr.pavlu(a)suse.com>
Acked-by: Tom Zanussi <tom.zanussi(a)linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
---
kernel/trace/tracing_map.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c
index 51a9d1185033..d47641f9740b 100644
--- a/kernel/trace/tracing_map.c
+++ b/kernel/trace/tracing_map.c
@@ -574,7 +574,12 @@ __tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only)
}
memcpy(elt->key, key, map->key_size);
- entry->val = elt;
+ /*
+ * Ensure the initialization is visible and
+ * publish the elt.
+ */
+ smp_wmb();
+ WRITE_ONCE(entry->val, elt);
atomic64_inc(&map->hits);
return entry->val;
--
2.17.1
From: Pablo Neira Ayuso <pablo(a)netfilter.org>
mainline inclusion
from mainline-v6.8
commit 552705a3650bbf46a22b1adedc1b04181490fc36
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9AK7L
CVE: CVE-2024-26643
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
While the rhashtable set gc runs asynchronously, a race allows it to
collect elements from anonymous sets with timeouts while it is being
released from the commit path.
Mingi Cho originally reported this issue in a different path in 6.1.x
with a pipapo set with low timeouts which is not possible upstream since
7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set
element timeout").
Fix this by setting on the dead flag for anonymous sets to skip async gc
in this case.
According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on
transaction abort"), Florian plans to accelerate abort path by releasing
objects via workqueue, therefore, this sets on the dead flag for abort
path too.
Cc: stable(a)vger.kernel.org
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho <mgcho.minic(a)gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
---
net/netfilter/nf_tables_api.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 866a4ea8f4e7..c547d7beb6d1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4672,6 +4672,7 @@ static void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set,
if (list_empty(&set->bindings) && nft_set_is_anonymous(set)) {
list_del_rcu(&set->list);
+ set->dead = 1;
if (event)
nf_tables_set_notify(ctx, set, NFT_MSG_DELSET,
GFP_KERNEL);
--
2.17.1
From: Petr Pavlu <petr.pavlu(a)suse.com>
mainline inclusion
from mainline-v6.8-rc2
commit 2b44760609e9eaafc9d234a6883d042fc21132a7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9BV24
CVE: CVE-2024-26645
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Running the following two commands in parallel on a multi-processor
AArch64 machine can sporadically produce an unexpected warning about
duplicate histogram entries:
$ while true; do
echo hist:key=id.syscall:val=hitcount > \
/sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
sleep 0.001
done
$ stress-ng --sysbadaddr $(nproc)
The warning looks as follows:
[ 2911.172474] ------------[ cut here ]------------
[ 2911.173111] Duplicates detected: 1
[ 2911.173574] WARNING: CPU: 2 PID: 12247 at kernel/trace/tracing_map.c:983 tracing_map_sort_entries+0x3e0/0x408
[ 2911.174702] Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) af_packet(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ena(E) tiny_power_button(E) qemu_fw_cfg(E) button(E) fuse(E) efi_pstore(E) ip_tables(E) x_tables(E) xfs(E) libcrc32c(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) polyval_ce(E) polyval_generic(E) ghash_ce(E) gf128mul(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm4(E) sm3_ce(E) sm3(E) sha3_ce(E) sha512_ce(E) sha512_arm64(E) sha2_ce(E) sha256_arm64(E) nvme(E) sha1_ce(E) nvme_core(E) nvme_auth(E) t10_pi(E) sg(E) scsi_mod(E) scsi_common(E) efivarfs(E)
[ 2911.174738] Unloaded tainted modules: cppc_cpufreq(E):1
[ 2911.180985] CPU: 2 PID: 12247 Comm: cat Kdump: loaded Tainted: G E 6.7.0-default #2 1b58bbb22c97e4399dc09f92d309344f69c44a01
[ 2911.182398] Hardware name: Amazon EC2 c7g.8xlarge/, BIOS 1.0 11/1/2018
[ 2911.183208] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 2911.184038] pc : tracing_map_sort_entries+0x3e0/0x408
[ 2911.184667] lr : tracing_map_sort_entries+0x3e0/0x408
[ 2911.185310] sp : ffff8000a1513900
[ 2911.185750] x29: ffff8000a1513900 x28: ffff0003f272fe80 x27: 0000000000000001
[ 2911.186600] x26: ffff0003f272fe80 x25: 0000000000000030 x24: 0000000000000008
[ 2911.187458] x23: ffff0003c5788000 x22: ffff0003c16710c8 x21: ffff80008017f180
[ 2911.188310] x20: ffff80008017f000 x19: ffff80008017f180 x18: ffffffffffffffff
[ 2911.189160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000a15134b8
[ 2911.190015] x14: 0000000000000000 x13: 205d373432323154 x12: 5b5d313131333731
[ 2911.190844] x11: 00000000fffeffff x10: 00000000fffeffff x9 : ffffd1b78274a13c
[ 2911.191716] x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 000000000057ffa8
[ 2911.192554] x5 : ffff0012f6c24ec0 x4 : 0000000000000000 x3 : ffff2e5b72b5d000
[ 2911.193404] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0003ff254480
[ 2911.194259] Call trace:
[ 2911.194626] tracing_map_sort_entries+0x3e0/0x408
[ 2911.195220] hist_show+0x124/0x800
[ 2911.195692] seq_read_iter+0x1d4/0x4e8
[ 2911.196193] seq_read+0xe8/0x138
[ 2911.196638] vfs_read+0xc8/0x300
[ 2911.197078] ksys_read+0x70/0x108
[ 2911.197534] __arm64_sys_read+0x24/0x38
[ 2911.198046] invoke_syscall+0x78/0x108
[ 2911.198553] el0_svc_common.constprop.0+0xd0/0xf8
[ 2911.199157] do_el0_svc+0x28/0x40
[ 2911.199613] el0_svc+0x40/0x178
[ 2911.200048] el0t_64_sync_handler+0x13c/0x158
[ 2911.200621] el0t_64_sync+0x1a8/0x1b0
[ 2911.201115] ---[ end trace 0000000000000000 ]---
The problem appears to be caused by CPU reordering of writes issued from
__tracing_map_insert().
The check for the presence of an element with a given key in this
function is:
val = READ_ONCE(entry->val);
if (val && keys_match(key, val->key, map->key_size)) ...
The write of a new entry is:
elt = get_free_elt(map);
memcpy(elt->key, key, map->key_size);
entry->val = elt;
The "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;"
stores may become visible in the reversed order on another CPU. This
second CPU might then incorrectly determine that a new key doesn't match
an already present val->key and subsequently insert a new element,
resulting in a duplicate.
Fix the problem by adding a write barrier between
"memcpy(elt->key, key, map->key_size);" and "entry->val = elt;", and for
good measure, also use WRITE_ONCE(entry->val, elt) for publishing the
element. The sequence pairs with the mentioned "READ_ONCE(entry->val);"
and the "val->key" check which has an address dependency.
The barrier is placed on a path executed when adding an element for
a new key. Subsequent updates targeting the same key remain unaffected.
From the user's perspective, the issue was introduced by commit
c193707dde77 ("tracing: Remove code which merges duplicates"), which
followed commit cbf4100efb8f ("tracing: Add support to detect and avoid
duplicates"). The previous code operated differently; it inherently
expected potential races which result in duplicates but merged them
later when they occurred.
Link: https://lore.kernel.org/linux-trace-kernel/20240122150928.27725-1-petr.pavl…
Fixes: c193707dde77 ("tracing: Remove code which merges duplicates")
Signed-off-by: Petr Pavlu <petr.pavlu(a)suse.com>
Acked-by: Tom Zanussi <tom.zanussi(a)linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
---
kernel/trace/tracing_map.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c
index 51a9d1185033..d47641f9740b 100644
--- a/kernel/trace/tracing_map.c
+++ b/kernel/trace/tracing_map.c
@@ -574,7 +574,12 @@ __tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only)
}
memcpy(elt->key, key, map->key_size);
- entry->val = elt;
+ /*
+ * Ensure the initialization is visible and
+ * publish the elt.
+ */
+ smp_wmb();
+ WRITE_ONCE(entry->val, elt);
atomic64_inc(&map->hits);
return entry->val;
--
2.17.1
From: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
mainline inclusion
from mainline-v6.8-rc1
commit 97566d09fd02d2ab329774bb89a2cdf2267e86d9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9BV4D
CVE: CVE-2024-26646
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The kernel allocates a memory buffer and provides its location to the
hardware, which uses it to update the HFI table. This allocation occurs
during boot and remains constant throughout runtime.
When resuming from hibernation, the restore kernel allocates a second
memory buffer and reprograms the HFI hardware with the new location as
part of a normal boot. The location of the second memory buffer may
differ from the one allocated by the image kernel.
When the restore kernel transfers control to the image kernel, its HFI
buffer becomes invalid, potentially leading to memory corruption if the
hardware writes to it (the hardware continues to use the buffer from the
restore kernel).
It is also possible that the hardware "forgets" the address of the memory
buffer when resuming from "deep" suspend. Memory corruption may also occur
in such a scenario.
To prevent the described memory corruption, disable HFI when preparing to
suspend or hibernate. Enable it when resuming.
Add syscore callbacks to handle the package of the boot CPU (packages of
non-boot CPUs are handled via CPU offline). Syscore ops always run on the
boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend
and hibernation. Syscore ops only run in these cases.
Cc: 6.1+ <stable(a)vger.kernel.org> # 6.1+
Signed-off-by: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
[ rjw: Comment adjustment, subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Zhao Wenhui <zhaowenhui8(a)huawei.com>
---
drivers/thermal/intel/intel_hfi.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 158a2b7c6b6f..d442d639a900 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -33,7 +33,9 @@
#include <linux/processor.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/suspend.h>
#include <linux/string.h>
+#include <linux/syscore_ops.h>
#include <linux/topology.h>
#include <linux/workqueue.h>
@@ -524,6 +526,30 @@ static __init int hfi_parse_features(void)
return 0;
}
+static void hfi_do_enable(void)
+{
+ /* This code runs only on the boot CPU. */
+ struct hfi_cpu_info *info = &per_cpu(hfi_cpu_info, 0);
+ struct hfi_instance *hfi_instance = info->hfi_instance;
+
+ /* No locking needed. There is no concurrency with CPU online. */
+ hfi_set_hw_table(hfi_instance);
+ hfi_enable();
+}
+
+static int hfi_do_disable(void)
+{
+ /* No locking needed. There is no concurrency with CPU offline. */
+ hfi_disable();
+
+ return 0;
+}
+
+static struct syscore_ops hfi_pm_ops = {
+ .resume = hfi_do_enable,
+ .suspend = hfi_do_disable,
+};
+
void __init intel_hfi_init(void)
{
struct hfi_instance *hfi_instance;
@@ -555,6 +581,8 @@ void __init intel_hfi_init(void)
if (!hfi_updates_wq)
goto err_nomem;
+ register_syscore_ops(&hfi_pm_ops);
+
return;
err_nomem:
--
2.17.1
From: Petr Pavlu <petr.pavlu(a)suse.com>
mainline inclusion
from mainline-v6.8-rc2
commit 2b44760609e9eaafc9d234a6883d042fc21132a7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9BV24
CVE: CVE-2024-26645
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Running the following two commands in parallel on a multi-processor
AArch64 machine can sporadically produce an unexpected warning about
duplicate histogram entries:
$ while true; do
echo hist:key=id.syscall:val=hitcount > \
/sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
sleep 0.001
done
$ stress-ng --sysbadaddr $(nproc)
The warning looks as follows:
[ 2911.172474] ------------[ cut here ]------------
[ 2911.173111] Duplicates detected: 1
[ 2911.173574] WARNING: CPU: 2 PID: 12247 at kernel/trace/tracing_map.c:983 tracing_map_sort_entries+0x3e0/0x408
[ 2911.174702] Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) af_packet(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ena(E) tiny_power_button(E) qemu_fw_cfg(E) button(E) fuse(E) efi_pstore(E) ip_tables(E) x_tables(E) xfs(E) libcrc32c(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) polyval_ce(E) polyval_generic(E) ghash_ce(E) gf128mul(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm4(E) sm3_ce(E) sm3(E) sha3_ce(E) sha512_ce(E) sha512_arm64(E) sha2_ce(E) sha256_arm64(E) nvme(E) sha1_ce(E) nvme_core(E) nvme_auth(E) t10_pi(E) sg(E) scsi_mod(E) scsi_common(E) efivarfs(E)
[ 2911.174738] Unloaded tainted modules: cppc_cpufreq(E):1
[ 2911.180985] CPU: 2 PID: 12247 Comm: cat Kdump: loaded Tainted: G E 6.7.0-default #2 1b58bbb22c97e4399dc09f92d309344f69c44a01
[ 2911.182398] Hardware name: Amazon EC2 c7g.8xlarge/, BIOS 1.0 11/1/2018
[ 2911.183208] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 2911.184038] pc : tracing_map_sort_entries+0x3e0/0x408
[ 2911.184667] lr : tracing_map_sort_entries+0x3e0/0x408
[ 2911.185310] sp : ffff8000a1513900
[ 2911.185750] x29: ffff8000a1513900 x28: ffff0003f272fe80 x27: 0000000000000001
[ 2911.186600] x26: ffff0003f272fe80 x25: 0000000000000030 x24: 0000000000000008
[ 2911.187458] x23: ffff0003c5788000 x22: ffff0003c16710c8 x21: ffff80008017f180
[ 2911.188310] x20: ffff80008017f000 x19: ffff80008017f180 x18: ffffffffffffffff
[ 2911.189160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000a15134b8
[ 2911.190015] x14: 0000000000000000 x13: 205d373432323154 x12: 5b5d313131333731
[ 2911.190844] x11: 00000000fffeffff x10: 00000000fffeffff x9 : ffffd1b78274a13c
[ 2911.191716] x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 000000000057ffa8
[ 2911.192554] x5 : ffff0012f6c24ec0 x4 : 0000000000000000 x3 : ffff2e5b72b5d000
[ 2911.193404] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0003ff254480
[ 2911.194259] Call trace:
[ 2911.194626] tracing_map_sort_entries+0x3e0/0x408
[ 2911.195220] hist_show+0x124/0x800
[ 2911.195692] seq_read_iter+0x1d4/0x4e8
[ 2911.196193] seq_read+0xe8/0x138
[ 2911.196638] vfs_read+0xc8/0x300
[ 2911.197078] ksys_read+0x70/0x108
[ 2911.197534] __arm64_sys_read+0x24/0x38
[ 2911.198046] invoke_syscall+0x78/0x108
[ 2911.198553] el0_svc_common.constprop.0+0xd0/0xf8
[ 2911.199157] do_el0_svc+0x28/0x40
[ 2911.199613] el0_svc+0x40/0x178
[ 2911.200048] el0t_64_sync_handler+0x13c/0x158
[ 2911.200621] el0t_64_sync+0x1a8/0x1b0
[ 2911.201115] ---[ end trace 0000000000000000 ]---
The problem appears to be caused by CPU reordering of writes issued from
__tracing_map_insert().
The check for the presence of an element with a given key in this
function is:
val = READ_ONCE(entry->val);
if (val && keys_match(key, val->key, map->key_size)) ...
The write of a new entry is:
elt = get_free_elt(map);
memcpy(elt->key, key, map->key_size);
entry->val = elt;
The "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;"
stores may become visible in the reversed order on another CPU. This
second CPU might then incorrectly determine that a new key doesn't match
an already present val->key and subsequently insert a new element,
resulting in a duplicate.
Fix the problem by adding a write barrier between
"memcpy(elt->key, key, map->key_size);" and "entry->val = elt;", and for
good measure, also use WRITE_ONCE(entry->val, elt) for publishing the
element. The sequence pairs with the mentioned "READ_ONCE(entry->val);"
and the "val->key" check which has an address dependency.
The barrier is placed on a path executed when adding an element for
a new key. Subsequent updates targeting the same key remain unaffected.
From the user's perspective, the issue was introduced by commit
c193707dde77 ("tracing: Remove code which merges duplicates"), which
followed commit cbf4100efb8f ("tracing: Add support to detect and avoid
duplicates"). The previous code operated differently; it inherently
expected potential races which result in duplicates but merged them
later when they occurred.
Link: https://lore.kernel.org/linux-trace-kernel/20240122150928.27725-1-petr.pavl…
Fixes: c193707dde77 ("tracing: Remove code which merges duplicates")
Signed-off-by: Petr Pavlu <petr.pavlu(a)suse.com>
Acked-by: Tom Zanussi <tom.zanussi(a)linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
---
kernel/trace/tracing_map.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c
index 51a9d1185033..d47641f9740b 100644
--- a/kernel/trace/tracing_map.c
+++ b/kernel/trace/tracing_map.c
@@ -574,7 +574,12 @@ __tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only)
}
memcpy(elt->key, key, map->key_size);
- entry->val = elt;
+ /*
+ * Ensure the initialization is visible and
+ * publish the elt.
+ */
+ smp_wmb();
+ WRITE_ONCE(entry->val, elt);
atomic64_inc(&map->hits);
return entry->val;
--
2.17.1
From: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
mainline inclusion
from mainline-v6.8-rc1
commit 97566d09fd02d2ab329774bb89a2cdf2267e86d9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9BV4D
CVE: CVE-2024-26646
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The kernel allocates a memory buffer and provides its location to the
hardware, which uses it to update the HFI table. This allocation occurs
during boot and remains constant throughout runtime.
When resuming from hibernation, the restore kernel allocates a second
memory buffer and reprograms the HFI hardware with the new location as
part of a normal boot. The location of the second memory buffer may
differ from the one allocated by the image kernel.
When the restore kernel transfers control to the image kernel, its HFI
buffer becomes invalid, potentially leading to memory corruption if the
hardware writes to it (the hardware continues to use the buffer from the
restore kernel).
It is also possible that the hardware "forgets" the address of the memory
buffer when resuming from "deep" suspend. Memory corruption may also occur
in such a scenario.
To prevent the described memory corruption, disable HFI when preparing to
suspend or hibernate. Enable it when resuming.
Add syscore callbacks to handle the package of the boot CPU (packages of
non-boot CPUs are handled via CPU offline). Syscore ops always run on the
boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend
and hibernation. Syscore ops only run in these cases.
Cc: 6.1+ <stable(a)vger.kernel.org> # 6.1+
Signed-off-by: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
[ rjw: Comment adjustment, subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Zhao Wenhui <zhaowenhui8(a)huawei.com>
---
drivers/thermal/intel/intel_hfi.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 158a2b7c6b6f..d442d639a900 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -33,7 +33,9 @@
#include <linux/processor.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/suspend.h>
#include <linux/string.h>
+#include <linux/syscore_ops.h>
#include <linux/topology.h>
#include <linux/workqueue.h>
@@ -524,6 +526,30 @@ static __init int hfi_parse_features(void)
return 0;
}
+static void hfi_do_enable(void)
+{
+ /* This code runs only on the boot CPU. */
+ struct hfi_cpu_info *info = &per_cpu(hfi_cpu_info, 0);
+ struct hfi_instance *hfi_instance = info->hfi_instance;
+
+ /* No locking needed. There is no concurrency with CPU online. */
+ hfi_set_hw_table(hfi_instance);
+ hfi_enable();
+}
+
+static int hfi_do_disable(void)
+{
+ /* No locking needed. There is no concurrency with CPU offline. */
+ hfi_disable();
+
+ return 0;
+}
+
+static struct syscore_ops hfi_pm_ops = {
+ .resume = hfi_do_enable,
+ .suspend = hfi_do_disable,
+};
+
void __init intel_hfi_init(void)
{
struct hfi_instance *hfi_instance;
@@ -555,6 +581,8 @@ void __init intel_hfi_init(void)
if (!hfi_updates_wq)
goto err_nomem;
+ register_syscore_ops(&hfi_pm_ops);
+
return;
err_nomem:
--
2.17.1