Offering: HULK
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IAASLU
--------------------------------
When multiple processes or threads write to the same file concurrently,
if a network disruption occurs during the write operation, it may lead
to a deadlock situation as follows:
Process 1 (dd)            Process 2 (cifsd)         Process 3 (cifsiod)
cifs_writepages
  lock_page - [1]
  wait_on_page_writeback - [2] Waiting for writeback, blocked by [4]
    wait_on_page_bit
                          cifs_demultiplex_thread
                            cifs_read_from_socket
                              cifs_readv_from_socket
                              - If another process triggers reconnect at this point
                                cifs_reconnect
                                - mid->mid_state updated to MID_RETRY_NEEDED
                          smb2_writev_callback mid_entry->callback()
                          - mid_state leads to wdata->result = -EAGAIN
                            wdata->result = -EAGAIN
                            queue_work(cifsiod_wq, &wdata->work);
                                                    cifs_writev_complete - work function
                                                    - Condition satisfied
                                                    - wdata->result == -EAGAIN
                                                      cifs_writev_requeue
                                                        lock_page - [3] Blocked by [1]
                                                        end_page_writeback
                                                        - [4] Won't execute, blocked by [3]
                                                        unlock_page
Mainline refactoring patch d08089f649a0 ("cifs: Change the I/O paths to
use an iterator rather than a page list") unlocks the page while waiting
for the writeback to complete, thus avoiding potential deadlocks caused
by lock ordering issues during reconnection.
Because of the large refactor in mainline, that patch cannot be
backported directly. Therefore, this patch uses only part of the idea
of the mainline patch to fix the deadlock.
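The lock ordering can be illustrated with a small single-threaded
user-space model. This is only a sketch under stated assumptions:
fake_page, fake_requeue and fake_prepare_page are hypothetical names,
the cifsiod worker is simulated by a direct call at the point where the
writer would wait, and the trylock path and the other checks in
wdata_prepare_pages() are left out.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model; none of these are kernel types. */
struct fake_page {
	bool locked;    /* models the page lock */
	bool writeback; /* models PageWriteback() */
	bool dirty;     /* models the page dirty bit */
};

enum prep_result { PREP_READY, PREP_SKIPPED, PREP_DEADLOCK };

/*
 * Models cifs_writev_requeue() on the worker: it needs the page lock
 * before it can end writeback. Returns false if it would block on it.
 */
static bool fake_requeue(struct fake_page *p)
{
	if (p->locked)
		return false;
	p->writeback = false;
	return true;
}

/*
 * Models the writeback check for a caller that waits for writeback
 * (sync_mode != WB_SYNC_NONE). unlock_first selects the patched
 * ordering: drop the page lock, wait, then relock and recheck.
 */
static enum prep_result fake_prepare_page(struct fake_page *p, bool unlock_first)
{
relock_recheck:
	p->locked = true;
	if (p->writeback) {
		if (unlock_first)
			p->locked = false;
		/* "Waiting" here means letting the simulated worker run. */
		if (!fake_requeue(p))
			return PREP_DEADLOCK; /* waiter holds the lock the worker needs */
		goto relock_recheck;
	}
	if (!p->dirty) {
		p->locked = false;
		return PREP_SKIPPED;
	}
	p->dirty = false;
	return PREP_READY;
}
```

With unlock_first set to false (the old ordering) the model reports
PREP_DEADLOCK for a page under writeback. With it set to true (the
ordering this patch adopts) the simulated worker can end writeback and
the page is returned locked and ready for I/O.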
Fixes: c28c89fc43e3 ("cifs: add cifs_async_writev")
Signed-off-by: Wang Zhaolong <wangzhaolong1(a)huawei.com>
---
fs/cifs/file.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index e914f4f5cc83..dc7175b75c26 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2226,6 +2226,7 @@ wdata_prepare_pages(struct cifs_writedata *wdata, unsigned int found_pages,
* back from swapper_space to tmpfs file mapping
*/
+relock_recheck:
if (nr_pages == 0)
lock_page(page);
else if (!trylock_page(page))
@@ -2248,11 +2249,16 @@ wdata_prepare_pages(struct cifs_writedata *wdata, unsigned int found_pages,
break;
}
- if (wbc->sync_mode != WB_SYNC_NONE)
- wait_on_page_writeback(page);
+ if (PageWriteback(page)) {
+ unlock_page(page);
+ if (wbc->sync_mode != WB_SYNC_NONE) {
+ wait_on_page_writeback(page);
+ goto relock_recheck;
+ }
+ break;
+ }
- if (PageWriteback(page) ||
- !clear_page_dirty_for_io(page)) {
+ if (!clear_page_dirty_for_io(page)) {
unlock_page(page);
break;
}
--
2.39.2
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IAASLU
--------------------------------
When multiple processes or threads write to the same file concurrently,
if a network disruption occurs during the write operation, it may lead
to a deadlock situation as follows:
Process 1 (dd)            Process 2 (cifsd)         Process 3 (cifsiod)
cifs_writepages
  lock_page - [1]
  wait_on_page_writeback - [2] Waiting for writeback, blocked by [4]
    wait_on_page_bit
                          cifs_demultiplex_thread
                            cifs_read_from_socket
                              cifs_readv_from_socket
                              - If another process triggers reconnect at this point
                                cifs_reconnect
                                - mid->mid_state updated to MID_RETRY_NEEDED
                          smb2_writev_callback mid_entry->callback()
                          - mid_state leads to wdata->result = -EAGAIN
                            wdata->result = -EAGAIN
                            queue_work(cifsiod_wq, &wdata->work);
                                                    cifs_writev_complete - work function
                                                    - Condition satisfied
                                                    - wdata->result == -EAGAIN
                                                      cifs_writev_requeue
                                                        lock_page - [3] Blocked by [1]
                                                        end_page_writeback
                                                        - [4] Won't execute, blocked by [3]
                                                        unlock_page
Mainline refactoring patch d08089f649a0 ("cifs: Change the I/O paths to
use an iterator rather than a page list") unlocks the page while waiting
for the writeback to complete, thus avoiding potential deadlocks caused
by lock ordering issues during reconnection.
Because of the large refactor in mainline, that patch cannot be
backported directly. Therefore, this patch uses only part of the idea
of the mainline patch to fix the deadlock.
Fixes: c28c89fc43e3 ("cifs: add cifs_async_writev")
Signed-off-by: Wang Zhaolong <wangzhaolong1(a)huawei.com>
---
fs/cifs/file.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 875cb44ba573..e346e6c2227a 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2056,6 +2056,7 @@ wdata_prepare_pages(struct cifs_writedata *wdata, unsigned int found_pages,
* back from swapper_space to tmpfs file mapping
*/
+relock_recheck:
if (nr_pages == 0)
lock_page(page);
else if (!trylock_page(page))
@@ -2078,11 +2079,16 @@ wdata_prepare_pages(struct cifs_writedata *wdata, unsigned int found_pages,
break;
}
- if (wbc->sync_mode != WB_SYNC_NONE)
- wait_on_page_writeback(page);
+ if (PageWriteback(page)) {
+ unlock_page(page);
+ if (wbc->sync_mode != WB_SYNC_NONE) {
+ wait_on_page_writeback(page);
+ goto relock_recheck;
+ }
+ break;
+ }
- if (PageWriteback(page) ||
- !clear_page_dirty_for_io(page)) {
+ if (!clear_page_dirty_for_io(page)) {
unlock_page(page);
break;
}
--
2.39.2
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
mainline inclusion
from mainline-v5.8-rc7
commit 594cced14ad3903166c8b091ff96adac7552f0b3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IABZNI
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
khugepaged has to drop mmap lock several times while collapsing a page.
The situation can change while the lock is dropped and we need to
re-validate that the VMA is still in place and the PMD is still subject
for collapse.
But we miss one corner case: while collapsing anonymous pages, the VMA
could be replaced with a file VMA. If the file VMA doesn't have any
private pages, we get a NULL pointer dereference:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
anon_vma_lock_write include/linux/rmap.h:120 [inline]
collapse_huge_page mm/khugepaged.c:1110 [inline]
khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
khugepaged_do_scan mm/khugepaged.c:2193 [inline]
khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238
The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate(). The helper is only used for collapsing
anonymous pages.
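The added check can be sketched in user space as follows. The types
below are hypothetical, trimmed-down stand-ins for the kernel
structures, and FAKE_SCAN_OK / FAKE_SCAN_VMA_CHECK are illustrative
result codes rather than the kernel's scan results; only the condition
itself mirrors the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures are far larger. */
struct fake_anon_vma { int unused; };
struct fake_vm_ops { int unused; };

struct fake_vma {
	struct fake_anon_vma *anon_vma;   /* NULL if no private pages exist */
	const struct fake_vm_ops *vm_ops; /* non-NULL for file-backed mappings */
};

enum { FAKE_SCAN_OK = 0, FAKE_SCAN_VMA_CHECK = 1 };

/*
 * Mirrors the condition added to hugepage_vma_revalidate(): only a VMA
 * that has an anon_vma and no vm_ops is accepted as anonymous.
 */
static int fake_revalidate(const struct fake_vma *vma)
{
	if (!vma->anon_vma || vma->vm_ops)
		return FAKE_SCAN_VMA_CHECK;
	return FAKE_SCAN_OK;
}
```

A VMA that was swapped for a file mapping while the lock was dropped
fails this check in either form: with no private pages its anon_vma is
NULL, and in any case its vm_ops is set.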
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Reported-by: syzbot+ed318e8b790ca72c5ad0(a)syzkaller.appspotmail.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org>
Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel…
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[ The Fixes tag is incorrect, the problem can be triggered due to shmem_file. ]
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
---
mm/khugepaged.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 04d0a3ee006e..88badbed7f73 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -893,6 +893,9 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
return SCAN_ADDRESS_RANGE;
if (!hugepage_vma_check(vma, vma->vm_flags))
return SCAN_VMA_CHECK;
+ /* Anon VMA expected */
+ if (!vma->anon_vma || vma->vm_ops)
+ return SCAN_VMA_CHECK;
return 0;
}
--
2.25.1
From: Nikolay Borisov <nborisov(a)suse.com>
stable inclusion
from stable-v4.19.218
commit ed058d735a70f4b063323f1a7bb33cda0f987513
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9FNFN
CVE: CVE-2021-47189
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 45da9c1767ac31857df572f0a909fbe88fd5a7e9 upstream.
Ordered work functions aren't guaranteed to be handled by the same thread
which executed the normal work functions. The only way execution between
normal/ordered functions is synchronized is via the WORK_DONE_BIT;
unfortunately, the bitops used don't guarantee any ordering whatsoever.
This manifested as seemingly inexplicable crashes on ARM64, where
async_chunk::inode is seen as non-null in async_cow_submit, which causes
submit_compressed_extents to be called, and a crash occurs because
async_chunk::inode suddenly became NULL. The call trace was similar to:
pc : submit_compressed_extents+0x38/0x3d0
lr : async_cow_submit+0x50/0xd0
sp : ffff800015d4bc20
<registers omitted for brevity>
Call trace:
submit_compressed_extents+0x38/0x3d0
async_cow_submit+0x50/0xd0
run_ordered_work+0xc8/0x280
btrfs_work_helper+0x98/0x250
process_one_work+0x1f0/0x4ac
worker_thread+0x188/0x504
kthread+0x110/0x114
ret_from_fork+0x10/0x18
Fix this by adding respective barrier calls which ensure that all
accesses preceding setting of WORK_DONE_BIT are strictly ordered before
setting the flag. At the same time add a read barrier after reading of
WORK_DONE_BIT in run_ordered_work which ensures all subsequent loads
would be strictly ordered after reading the bit. This in turn ensures
that all accesses before WORK_DONE_BIT are going to be strictly ordered
before any access that can occur in ordered_func.
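The pairing can be sketched in user space with C11 atomics. This is an
analogy, not the kernel code: fake_work, FAKE_WORK_DONE, fake_mark_done
and fake_try_consume are hypothetical names, and a release fence is a
weaker stand-in for the full barrier that smp_mb__before_atomic()
provides, while the acquire fence plays the role of smp_rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FAKE_WORK_DONE 1u /* hypothetical stand-in for WORK_DONE_BIT */

struct fake_work {
	int result;        /* plain data written by the normal work function */
	atomic_uint flags; /* carries the done bit */
};

/* Producer: order the data write before the flag becomes visible. */
static void fake_mark_done(struct fake_work *w, int result)
{
	w->result = result;
	atomic_thread_fence(memory_order_release);
	atomic_fetch_or_explicit(&w->flags, FAKE_WORK_DONE,
				 memory_order_relaxed);
}

/* Consumer: order the data read after the flag has been observed. */
static bool fake_try_consume(struct fake_work *w, int *out)
{
	unsigned int flags;

	flags = atomic_load_explicit(&w->flags, memory_order_relaxed);
	if (!(flags & FAKE_WORK_DONE))
		return false;
	atomic_thread_fence(memory_order_acquire);
	*out = w->result;
	return true;
}
```

Without the two fences, a consumer on a weakly ordered CPU could see
the done bit set and still read a stale result, which is the class of
failure described above.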
Reported-by: Chris Murphy <lists(a)colorremedies.com>
Fixes: 08a9ff326418 ("btrfs: Added btrfs_workqueue_struct implemented ordered execution based on kernel workqueue")
CC: stable(a)vger.kernel.org # 4.4+
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2011928
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Tested-by: Chris Murphy <chris(a)colorremedies.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yifan Qiao <qiaoyifan4(a)huawei.com>
---
fs/btrfs/async-thread.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index f79c0cb7697a..21f8f475c894 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -270,6 +270,13 @@ static void run_ordered_work(struct __btrfs_workqueue *wq,
ordered_list);
if (!test_bit(WORK_DONE_BIT, &work->flags))
break;
+ /*
+ * Orders all subsequent loads after reading WORK_DONE_BIT,
+ * paired with the smp_mb__before_atomic in btrfs_work_helper
+ * this guarantees that the ordered function will see all
+ * updates from ordinary work function.
+ */
+ smp_rmb();
/*
* we are going to call the ordered done function, but
@@ -355,6 +362,13 @@ static void normal_work_helper(struct btrfs_work *work)
thresh_exec_hook(wq);
work->func(work);
if (need_order) {
+ /*
+ * Ensures all memory accesses done in the work function are
+ * ordered before setting the WORK_DONE_BIT. Ensuring the thread
+ * which is going to executed the ordered work sees them.
+ * Pairs with the smp_rmb in run_ordered_work.
+ */
+ smp_mb__before_atomic();
set_bit(WORK_DONE_BIT, &work->flags);
run_ordered_work(wq, work);
}
--
2.39.2