From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.141 commit 38267d266336a7fb9eae9be23567a44776c6e4ca category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64WCC CVE: CVE-2022-20566
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------
commit b840304fb46cdf7012722f456bce06f151b3e81b upstream.
This attempts to fix the follow errors:
In function 'memcmp', inlined from 'bacmp' at ./include/net/bluetooth/bluetooth.h:347:9, inlined from 'l2cap_global_chan_by_psm' at net/bluetooth/l2cap_core.c:2003:15: ./include/linux/fortify-string.h:44:33: error: '__builtin_memcmp' specified bound 6 exceeds source size 0 [-Werror=stringop-overread] 44 | #define __underlying_memcmp __builtin_memcmp | ^ ./include/linux/fortify-string.h:420:16: note: in expansion of macro '__underlying_memcmp' 420 | return __underlying_memcmp(p, q, size); | ^~~~~~~~~~~~~~~~~~~ In function 'memcmp', inlined from 'bacmp' at ./include/net/bluetooth/bluetooth.h:347:9, inlined from 'l2cap_global_chan_by_psm' at net/bluetooth/l2cap_core.c:2004:15: ./include/linux/fortify-string.h:44:33: error: '__builtin_memcmp' specified bound 6 exceeds source size 0 [-Werror=stringop-overread] 44 | #define __underlying_memcmp __builtin_memcmp | ^ ./include/linux/fortify-string.h:420:16: note: in expansion of macro '__underlying_memcmp' 420 | return __underlying_memcmp(p, q, size); | ^~~~~~~~~~~~~~~~~~~
Fixes: 332f1795ca20 ("Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Cc: Sudip Mukherjee sudipm.mukherjee@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Lu Jialin lujialin4@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/bluetooth/l2cap_core.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 70bfd9e8913e..f78ad8f536f7 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -1988,11 +1988,11 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm, src_match = !bacmp(&c->src, src); dst_match = !bacmp(&c->dst, dst); if (src_match && dst_match) { - c = l2cap_chan_hold_unless_zero(c); - if (c) { - read_unlock(&chan_list_lock); - return c; - } + if (!l2cap_chan_hold_unless_zero(c)) + continue; + + read_unlock(&chan_list_lock); + return c; }
/* Closest match */
From: Zhengchao Shao shaozhengchao@huawei.com
stable inclusion from stable-v5.10.154 commit d9ec6e2fbd4a565b2345d4852f586b7ae3ab41fd category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64WCC CVE: CVE-2022-20566
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------
[ Upstream commit 0d0e2d032811280b927650ff3c15fe5020e82533 ]
When l2cap_recv_frame() is invoked to receive data, and the cid is L2CAP_CID_A2MP, if the channel does not exist, it will create a channel. However, after a channel is created, the hold operation of the channel is not performed. In this case, the value of channel reference counting is 1. As a result, after hci_error_reset() is triggered, l2cap_conn_del() invokes the close hook function of A2MP to release the channel. Then l2cap_chan_unlock(chan) will trigger UAF issue.
The process is as follows: Receive data: l2cap_data_channel() a2mp_channel_create() --->channel ref is 2 l2cap_chan_put() --->channel ref is 1
Triger event: hci_error_reset() hci_dev_do_close() ... l2cap_disconn_cfm() l2cap_conn_del() l2cap_chan_hold() --->channel ref is 2 l2cap_chan_del() --->channel ref is 1 a2mp_chan_close_cb() --->channel ref is 0, release channel l2cap_chan_unlock() --->UAF of channel
The detailed Call Trace is as follows: BUG: KASAN: use-after-free in __mutex_unlock_slowpath+0xa6/0x5e0 Read of size 8 at addr ffff8880160664b8 by task kworker/u11:1/7593 Workqueue: hci0 hci_error_reset Call Trace: <TASK> dump_stack_lvl+0xcd/0x134 print_report.cold+0x2ba/0x719 kasan_report+0xb1/0x1e0 kasan_check_range+0x140/0x190 __mutex_unlock_slowpath+0xa6/0x5e0 l2cap_conn_del+0x404/0x7b0 l2cap_disconn_cfm+0x8c/0xc0 hci_conn_hash_flush+0x11f/0x260 hci_dev_close_sync+0x5f5/0x11f0 hci_dev_do_close+0x2d/0x70 hci_error_reset+0x9e/0x140 process_one_work+0x98a/0x1620 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30 </TASK>
Allocated by task 7593: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0xa9/0xd0 l2cap_chan_create+0x40/0x930 amp_mgr_create+0x96/0x990 a2mp_channel_create+0x7d/0x150 l2cap_recv_frame+0x51b8/0x9a70 l2cap_recv_acldata+0xaa3/0xc00 hci_rx_work+0x702/0x1220 process_one_work+0x98a/0x1620 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30
Freed by task 7593: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x167/0x1c0 slab_free_freelist_hook+0x89/0x1c0 kfree+0xe2/0x580 l2cap_chan_put+0x22a/0x2d0 l2cap_conn_del+0x3fc/0x7b0 l2cap_disconn_cfm+0x8c/0xc0 hci_conn_hash_flush+0x11f/0x260 hci_dev_close_sync+0x5f5/0x11f0 hci_dev_do_close+0x2d/0x70 hci_error_reset+0x9e/0x140 process_one_work+0x98a/0x1620 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30
Last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0xbe/0xd0 call_rcu+0x99/0x740 netlink_release+0xe6a/0x1cf0 __sock_release+0xcd/0x280 sock_close+0x18/0x20 __fput+0x27c/0xa90 task_work_run+0xdd/0x1a0 exit_to_user_mode_prepare+0x23c/0x250 syscall_exit_to_user_mode+0x19/0x50 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Second to last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0xbe/0xd0 call_rcu+0x99/0x740 netlink_release+0xe6a/0x1cf0 __sock_release+0xcd/0x280 sock_close+0x18/0x20 __fput+0x27c/0xa90 task_work_run+0xdd/0x1a0 exit_to_user_mode_prepare+0x23c/0x250 syscall_exit_to_user_mode+0x19/0x50 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: d0be8347c623 ("Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put") Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Lu Jialin lujialin4@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/bluetooth/l2cap_core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index f78ad8f536f7..377b3719054a 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -7622,6 +7622,7 @@ static void l2cap_data_channel(struct l2cap_conn *conn, u16 cid, return; }
+ l2cap_chan_hold(chan); l2cap_chan_lock(chan); } else { BT_DBG("unknown cid 0x%4.4x", cid);
From: Jiasheng Jiang jiasheng@iscas.ac.cn
mainline inclusion from mainline-v5.19-rc1 commit ed713e2bc093239ccd380c2ce8ae9e4162f5c037 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6698Y CVE: CVE-2022-3114
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
As the potential failure of the kcalloc(), it should be better to check it in order to avoid the dereference of the NULL pointer.
Fixes: 379c9a24cc23 ("clk: imx: Fix reparenting of UARTs not associated with stdout") Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn Reviewed-by: Abel Vesa abel.vesa@nxp.com Link: https://lore.kernel.org/r/20220310080257.1988412-1-jiasheng@iscas.ac.cn Signed-off-by: Abel Vesa abel.vesa@nxp.com Signed-off-by: Yi Yang yiyang13@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/clk/imx/clk.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/clk/imx/clk.c b/drivers/clk/imx/clk.c index 7cc669934253..99249ab361d2 100644 --- a/drivers/clk/imx/clk.c +++ b/drivers/clk/imx/clk.c @@ -173,6 +173,8 @@ void imx_register_uart_clocks(unsigned int clk_count) int i;
imx_uart_clocks = kcalloc(clk_count, sizeof(struct clk *), GFP_KERNEL); + if (!imx_uart_clocks) + return;
if (!of_stdout) return;
From: Yu Kuai yukuai3@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I65K8D CVE: NA
--------------------------------
Now that address of request_wrapper is caculated by address of request plus cmd_size, if cmd_size is not aligned to 8 bytes, request_wrapper will end up not aligned to 8 bytes as well, which will crash in arm64 because assembly instruction casal requires that operand address is aligned to 8 bytes:
Internal error: Oops: 96000021 [#1] SMP pc : blk_account_io_latency+0x54/0x134 Call trace: blk_account_io_latency+0x54/0x134 blk_account_io_done+0x3c/0x4c __blk_mq_end_request+0x78/0x134 scsi_end_request+0xcc/0x1f0 scsi_io_completion+0x88/0x240 scsi_finish_command+0x104/0x140 scsi_softirq_done+0x90/0x180 blk_mq_complete_request+0x5c/0x70 scsi_mq_done+0x4c/0x100
Fix the problem by declaring request_wrapper as aligned to cachline, and placing it before request.
Fixes: 9981c33db4da ("blk-mq: don't access request_wrapper if request is not allocated from block layer") Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/blk-flush.c | 8 +++++--- block/blk-mq.c | 2 +- block/blk-mq.h | 9 +++------ 3 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/block/blk-flush.c b/block/blk-flush.c index 65753f781c20..093c581a2651 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -470,6 +470,7 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size, gfp_t flags) { struct blk_flush_queue *fq; + struct request_wrapper *wrapper; int rq_sz = sizeof(struct request) + sizeof(struct request_wrapper);
fq = kzalloc_node(sizeof(*fq), flags, node); @@ -479,10 +480,11 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size, spin_lock_init(&fq->mq_flush_lock);
rq_sz = round_up(rq_sz + cmd_size, cache_line_size()); - fq->flush_rq = kzalloc_node(rq_sz, flags, node); - if (!fq->flush_rq) + wrapper = kzalloc_node(rq_sz, flags, node); + if (!wrapper) goto fail_rq;
+ fq->flush_rq = (struct request *)(wrapper + 1); INIT_LIST_HEAD(&fq->flush_queue[0]); INIT_LIST_HEAD(&fq->flush_queue[1]); INIT_LIST_HEAD(&fq->flush_data_in_flight); @@ -501,7 +503,7 @@ void blk_free_flush_queue(struct blk_flush_queue *fq) if (!fq) return;
- kfree(fq->flush_rq); + kfree(request_to_wrapper(fq->flush_rq)); kfree(fq); }
diff --git a/block/blk-mq.c b/block/blk-mq.c index a7e87451ce43..729cbf32842e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2634,7 +2634,7 @@ static int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, to_do = min(entries_per_page, depth - i); left -= to_do * rq_size; for (j = 0; j < to_do; j++) { - struct request *rq = p; + struct request *rq = p + sizeof(struct request_wrapper);
tags->static_rqs[i] = rq; if (blk_mq_init_request(set, rq, hctx_idx, node)) { diff --git a/block/blk-mq.h b/block/blk-mq.h index 6254abe9c112..dcb2077e4db6 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -40,14 +40,11 @@ struct blk_mq_ctx { struct request_wrapper { /* Time that I/O was counted in part_get_stat_info(). */ u64 stat_time_ns; -}; +} ____cacheline_aligned_in_smp;
-static inline struct request_wrapper *request_to_wrapper(struct request *rq) +static inline struct request_wrapper *request_to_wrapper(void *rq) { - unsigned long addr = (unsigned long)rq; - - addr += sizeof(*rq) + rq->q->tag_set->cmd_size; - return (struct request_wrapper *)addr; + return rq - sizeof(struct request_wrapper); }
void blk_mq_exit_queue(struct request_queue *q);
From: Yu Kuai yukuai3@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I65K8D CVE: NA
--------------------------------
____cacheline_aligned_in_smp will only be effective if CONFIG_SMP is enabled, use ____cacheline_aligned instead.
Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/blk-mq.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-mq.h b/block/blk-mq.h index dcb2077e4db6..358659fd3175 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -40,7 +40,7 @@ struct blk_mq_ctx { struct request_wrapper { /* Time that I/O was counted in part_get_stat_info(). */ u64 stat_time_ns; -} ____cacheline_aligned_in_smp; +} ____cacheline_aligned;
static inline struct request_wrapper *request_to_wrapper(void *rq) {
From: Pu Wen puwen@hygon.cn
mainline inclusion from mainline-v5.13-rc1 commit 59eca2fa1934de42d8aa44d3bef655c92ea69703 category: bugfix bugzilla: 188047, https://gitee.com/src-openeuler/kernel/issues/I64FW4 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Set the maximum DIE per package variable on Hygon using the nodes_per_socket value in order to do per-DIE manipulations for drivers such as powercap.
Signed-off-by: Pu Wen puwen@hygon.cn Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lkml.kernel.org/r/20210302020217.1827-1-puwen@hygon.cn Signed-off-by: Yuyao Lin linyuyao1@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/hygon.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c index 774ca6bfda9f..1f6de1d5ca84 100644 --- a/arch/x86/kernel/cpu/hygon.c +++ b/arch/x86/kernel/cpu/hygon.c @@ -235,12 +235,12 @@ static void bsp_init_hygon(struct cpuinfo_x86 *c) u32 ecx;
ecx = cpuid_ecx(0x8000001e); - nodes_per_socket = ((ecx >> 8) & 7) + 1; + __max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1; } else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) { u64 value;
rdmsrl(MSR_FAM10H_NODE_ID, value); - nodes_per_socket = ((value >> 3) & 7) + 1; + __max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1; }
if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
From: Liu Shixin liushixin2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I66OCA CVE: NA
--------------------------------
The percpu pool will be cleared by clear_percpu_pools(), and then check whether all pages are already freed. If some pages are not freed, we will firstly isolate the freed pages and then migrate the used pages. Since we missed to get lock of percpu_pool, the used pages can be free to percpu_pool while the isolation is going. In such case, the list operation will be unreliable.
To fix this problem, we need to get all related locks sequentially and clear the perpcu_pool again before isolate the freed pages.
Fixes: cdbeee51d044 ("mm/dynamic_hugetlb: add migration function") Signed-off-by: Liu Shixin liushixin2@huawei.com Reviewed-by: Tong Tiangen tongtiangen@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- mm/dynamic_hugetlb.c | 58 +++++++++++++++++++++++++++++--------------- 1 file changed, 38 insertions(+), 20 deletions(-)
diff --git a/mm/dynamic_hugetlb.c b/mm/dynamic_hugetlb.c index 8a985d816c07..82ad4cbbeabc 100644 --- a/mm/dynamic_hugetlb.c +++ b/mm/dynamic_hugetlb.c @@ -182,25 +182,6 @@ static void reclaim_pages_from_percpu_pool(struct dhugetlb_pool *hpool, } }
-static void clear_percpu_pools(struct dhugetlb_pool *hpool) -{ - struct percpu_pages_pool *percpu_pool; - int i; - - lockdep_assert_held(&hpool->lock); - - spin_unlock(&hpool->lock); - for (i = 0; i < NR_PERCPU_POOL; i++) - spin_lock(&hpool->percpu_pool[i].lock); - spin_lock(&hpool->lock); - for (i = 0; i < NR_PERCPU_POOL; i++) { - percpu_pool = &hpool->percpu_pool[i]; - reclaim_pages_from_percpu_pool(hpool, percpu_pool, percpu_pool->free_pages); - } - for (i = 0; i < NR_PERCPU_POOL; i++) - spin_unlock(&hpool->percpu_pool[i].lock); -} - /* We only try 5 times to reclaim pages */ #define HPOOL_RECLAIM_RETRIES 5
@@ -210,6 +191,7 @@ static int hpool_merge_page(struct dhugetlb_pool *hpool, int hpages_pool_idx, bo struct split_hugepage *split_page, *split_next; unsigned long nr_pages, block_size; struct page *page, *next, *p; + struct percpu_pages_pool *percpu_pool; bool need_migrate = false, need_initial = false; int i, try; LIST_HEAD(wait_page_list); @@ -241,7 +223,22 @@ static int hpool_merge_page(struct dhugetlb_pool *hpool, int hpages_pool_idx, bo try = 0;
merge: - clear_percpu_pools(hpool); + /* + * If we are merging 4K page to 2M page, we need to get + * lock of percpu pool sequentially and clear percpu pool. + */ + if (hpages_pool_idx == HUGE_PAGES_POOL_2M) { + spin_unlock(&hpool->lock); + for (i = 0; i < NR_PERCPU_POOL; i++) + spin_lock(&hpool->percpu_pool[i].lock); + spin_lock(&hpool->lock); + for (i = 0; i < NR_PERCPU_POOL; i++) { + percpu_pool = &hpool->percpu_pool[i]; + reclaim_pages_from_percpu_pool(hpool, percpu_pool, + percpu_pool->free_pages); + } + } + page = pfn_to_page(split_page->start_pfn); for (i = 0; i < nr_pages; i+= block_size) { p = pfn_to_page(split_page->start_pfn + i); @@ -252,6 +249,14 @@ static int hpool_merge_page(struct dhugetlb_pool *hpool, int hpages_pool_idx, bo goto migrate; } } + if (hpages_pool_idx == HUGE_PAGES_POOL_2M) { + /* + * All target 4K page are in src_hpages_pool, we + * can unlock percpu pool. + */ + for (i = 0; i < NR_PERCPU_POOL; i++) + spin_unlock(&hpool->percpu_pool[i].lock); + }
list_del(&split_page->head_pages); hpages_pool->split_normal_pages--; @@ -284,8 +289,14 @@ static int hpool_merge_page(struct dhugetlb_pool *hpool, int hpages_pool_idx, bo trace_dynamic_hugetlb_split_merge(hpool, page, DHUGETLB_MERGE, page_size(page)); return 0; next: + if (hpages_pool_idx == HUGE_PAGES_POOL_2M) { + /* Unlock percpu pool before try next */ + for (i = 0; i < NR_PERCPU_POOL; i++) + spin_unlock(&hpool->percpu_pool[i].lock); + } continue; migrate: + /* page migration only used for HUGE_PAGES_POOL_2M */ if (try++ >= HPOOL_RECLAIM_RETRIES) goto next;
@@ -300,7 +311,10 @@ static int hpool_merge_page(struct dhugetlb_pool *hpool, int hpages_pool_idx, bo }
/* Unlock and try migration. */ + for (i = 0; i < NR_PERCPU_POOL; i++) + spin_unlock(&hpool->percpu_pool[i].lock); spin_unlock(&hpool->lock); + for (i = 0; i < nr_pages; i+= block_size) { p = pfn_to_page(split_page->start_pfn + i); if (PagePool(p)) @@ -312,6 +326,10 @@ static int hpool_merge_page(struct dhugetlb_pool *hpool, int hpages_pool_idx, bo } spin_lock(&hpool->lock);
+ /* + * Move all isolate pages to src_hpages_pool and then try + * merge again. + */ list_for_each_entry_safe(page, next, &wait_page_list, lru) { list_move_tail(&page->lru, &src_hpages_pool->hugepage_freelists); src_hpages_pool->free_normal_pages++;
From: Liu Shixin liushixin2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I66OCA CVE: NA
--------------------------------
Since free_pages_prepare() will clear the PagePool without lock in free_page_to_dhugetlb_pool() and free_page_list_to_dhugetlb_pool(), it is unreliable to check whether a page is freed by PagePool in hpool_merge_page(). Move free_pages_prepare() after ClearPagePool(), which can guarantee all allocated page has PagePool flag.
Fixes: 71197c63bfe9 ("mm/dynamic_hugetlb: free pages to dhugetlb_pool") Signed-off-by: Liu Shixin liushixin2@huawei.com Reviewed-by: Tong Tiangen tongtiangen@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- mm/dynamic_hugetlb.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/mm/dynamic_hugetlb.c b/mm/dynamic_hugetlb.c index 82ad4cbbeabc..d64aba9ba3dc 100644 --- a/mm/dynamic_hugetlb.c +++ b/mm/dynamic_hugetlb.c @@ -577,6 +577,10 @@ static void __free_page_to_dhugetlb_pool(struct page *page) spin_lock_irqsave(&percpu_pool->lock, flags);
ClearPagePool(page); + if (!free_pages_prepare(page, 0, true)) { + SetPagePool(page); + goto out; + } list_add(&page->lru, &percpu_pool->head_page); percpu_pool->free_pages++; percpu_pool->used_pages--; @@ -585,7 +589,7 @@ static void __free_page_to_dhugetlb_pool(struct page *page) reclaim_pages_from_percpu_pool(hpool, percpu_pool, PERCPU_POOL_PAGE_BATCH); spin_unlock(&hpool->lock); } - +out: spin_unlock_irqrestore(&percpu_pool->lock, flags); put_hpool(hpool); } @@ -595,8 +599,7 @@ bool free_page_to_dhugetlb_pool(struct page *page) if (!dhugetlb_enabled || !PagePool(page)) return false;
- if (free_pages_prepare(page, 0, true)) - __free_page_to_dhugetlb_pool(page); + __free_page_to_dhugetlb_pool(page); return true; }
@@ -610,8 +613,7 @@ void free_page_list_to_dhugetlb_pool(struct list_head *list) list_for_each_entry_safe(page, next, list, lru) { if (PagePool(page)) { list_del(&page->lru); - if (free_pages_prepare(page, 0, true)) - __free_page_to_dhugetlb_pool(page); + __free_page_to_dhugetlb_pool(page); } } }
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 187921, https://gitee.com/openeuler/kernel/issues/I66VDB CVE: NA
--------------------------------
Enable CONFIG_BLK_RQ_ALLOC_TIME will cause kabi broken, use request wrapper to fix it.
Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/blk-iocost.c | 11 ++++++++--- block/blk-mq.c | 2 +- block/blk-mq.h | 4 ++++ include/linux/blkdev.h | 4 ---- 4 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 462dbb766ed1..2c30aae1a664 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -2747,8 +2747,13 @@ static void ioc_rqos_done(struct rq_qos *rqos, struct request *rq) struct ioc_pcpu_stat *ccs; u64 on_q_ns, rq_wait_ns, size_nsec; int pidx, rw; + struct request_wrapper *rq_wrapper;
- if (!ioc->enabled || !rq->alloc_time_ns || !rq->start_time_ns) + if (WARN_ON_ONCE(!(rq->rq_flags & RQF_FROM_BLOCK))) + return; + + rq_wrapper = request_to_wrapper(rq); + if (!ioc->enabled || !rq_wrapper->alloc_time_ns || !rq->start_time_ns) return;
switch (req_op(rq) & REQ_OP_MASK) { @@ -2764,8 +2769,8 @@ static void ioc_rqos_done(struct rq_qos *rqos, struct request *rq) return; }
- on_q_ns = ktime_get_ns() - rq->alloc_time_ns; - rq_wait_ns = rq->start_time_ns - rq->alloc_time_ns; + on_q_ns = ktime_get_ns() - rq_wrapper->alloc_time_ns; + rq_wait_ns = rq->start_time_ns - rq_wrapper->alloc_time_ns; size_nsec = div64_u64(calc_size_vtime_cost(rq, ioc), VTIME_PER_NSEC);
ccs = get_cpu_ptr(ioc->pcpu_stat); diff --git a/block/blk-mq.c b/block/blk-mq.c index 729cbf32842e..bcf529b40987 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -386,7 +386,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data, rq->rq_disk = NULL; rq->part = NULL; #ifdef CONFIG_BLK_RQ_ALLOC_TIME - rq->alloc_time_ns = alloc_time_ns; + request_to_wrapper(rq)->alloc_time_ns = alloc_time_ns; #endif request_to_wrapper(rq)->stat_time_ns = 0; if (blk_mq_need_time_stamp(rq)) diff --git a/block/blk-mq.h b/block/blk-mq.h index 358659fd3175..7bb0b82bfbe9 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -40,6 +40,10 @@ struct blk_mq_ctx { struct request_wrapper { /* Time that I/O was counted in part_get_stat_info(). */ u64 stat_time_ns; +#ifdef CONFIG_BLK_RQ_ALLOC_TIME + /* Time that the first bio started allocating this request. */ + u64 alloc_time_ns; +#endif } ____cacheline_aligned;
static inline struct request_wrapper *request_to_wrapper(void *rq) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index eed319e5d192..171884608cad 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -202,10 +202,6 @@ struct request {
struct gendisk *rq_disk; struct hd_struct *part; -#ifdef CONFIG_BLK_RQ_ALLOC_TIME - /* Time that the first bio started allocating this request. */ - u64 alloc_time_ns; -#endif /* Time that this request was allocated for this IO. */ u64 start_time_ns; /* Time that I/O was submitted to the device. */
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 187921, https://gitee.com/openeuler/kernel/issues/I66VDB CVE: NA
--------------------------------
Enable CONFIG_BLK_CGROUP_IOCOST will cause kabi broken, use reserved fields to fix it.
Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/blk_types.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index a0ec5ff216ff..b23db4257a24 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -234,9 +234,6 @@ struct bio { */ struct blkcg_gq *bi_blkg; struct bio_issue bi_issue; -#ifdef CONFIG_BLK_CGROUP_IOCOST - u64 bi_iocost_cost; -#endif #endif
#ifdef CONFIG_BLK_INLINE_ENCRYPTION @@ -263,7 +260,11 @@ struct bio {
struct bio_set *bi_pool;
+#ifdef CONFIG_BLK_CGROUP_IOCOST + KABI_USE(1, u64 bi_iocost_cost) +#else KABI_RESERVE(1) +#endif KABI_RESERVE(2) KABI_RESERVE(3) KABI_RESERVE(4)
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188176, https://gitee.com/openeuler/kernel/issues/I67294 CVE: NA
--------------------------------
This reverts commit 71e03f3cdaa9606135479c454d29b97c4e490456.
Drivers use tgt_dscvr will compile failed because API has changed.
Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/ulp/iser/iscsi_iser.c | 9 ++----- drivers/scsi/be2iscsi/be_main.c | 9 ++----- drivers/scsi/bnx2i/bnx2i_iscsi.c | 9 ++----- drivers/scsi/cxgbi/cxgb3i/cxgb3i.c | 9 ++----- drivers/scsi/cxgbi/cxgb4i/cxgb4i.c | 9 ++----- drivers/scsi/qedi/qedi_iscsi.c | 8 ++---- drivers/scsi/qla4xxx/ql4_os.c | 8 ++---- drivers/scsi/scsi_transport_iscsi.c | 31 ++++++------------------ include/scsi/iscsi_if.h | 1 - include/scsi/scsi_transport_iscsi.h | 17 +------------ 10 files changed, 23 insertions(+), 87 deletions(-)
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c index 4884b122e413..a16e066989fa 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.c +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c @@ -979,22 +979,17 @@ static struct scsi_host_template iscsi_iser_sht = { .track_queue_depth = 1, };
-static struct iscsi_transport_expand iscsi_iser_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - static struct iscsi_transport iscsi_iser_transport = { .owner = THIS_MODULE, .name = "iser", - .caps = CAP_RECOVERY_L0 | CAP_MULTI_R2T | CAP_TEXT_NEGO - | CAP_OPS_EXPAND, + .caps = CAP_RECOVERY_L0 | CAP_MULTI_R2T | CAP_TEXT_NEGO, /* session management */ .create_session = iscsi_iser_session_create, .destroy_session = iscsi_iser_session_destroy, /* connection management */ .create_conn = iscsi_iser_conn_create, .bind_conn = iscsi_iser_conn_bind, - .ops_expand = &iscsi_iser_expand, + .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_conn_teardown, .attr_is_visible = iser_attr_is_visible, .set_param = iscsi_iser_set_param, diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index 06f697bfc49f..b977e039bb78 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -5801,21 +5801,16 @@ static struct pci_error_handlers beiscsi_eeh_handlers = { .resume = beiscsi_eeh_resume, };
-struct iscsi_transport_expand beiscsi_iscsi_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - struct iscsi_transport beiscsi_iscsi_transport = { .owner = THIS_MODULE, .name = DRV_NAME, .caps = CAP_RECOVERY_L0 | CAP_HDRDGST | CAP_TEXT_NEGO | - CAP_MULTI_R2T | CAP_DATADGST | CAP_DATA_PATH_OFFLOAD | - CAP_OPS_EXPAND, + CAP_MULTI_R2T | CAP_DATADGST | CAP_DATA_PATH_OFFLOAD, .create_session = beiscsi_session_create, .destroy_session = beiscsi_session_destroy, .create_conn = beiscsi_conn_create, .bind_conn = beiscsi_conn_bind, - .ops_expand = &beiscsi_iscsi_expand, + .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_conn_teardown, .attr_is_visible = beiscsi_attr_is_visible, .set_iface_param = beiscsi_iface_set_param, diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c index e13c77a76150..8cf2f9a7cfdc 100644 --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c @@ -2274,23 +2274,18 @@ static struct scsi_host_template bnx2i_host_template = { .track_queue_depth = 1, };
- -static struct iscsi_transport_expand bnx2i_iscsi_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - struct iscsi_transport bnx2i_iscsi_transport = { .owner = THIS_MODULE, .name = "bnx2i", .caps = CAP_RECOVERY_L0 | CAP_HDRDGST | CAP_MULTI_R2T | CAP_DATADGST | CAP_DATA_PATH_OFFLOAD | - CAP_TEXT_NEGO | CAP_OPS_EXPAND, + CAP_TEXT_NEGO, .create_session = bnx2i_session_create, .destroy_session = bnx2i_session_destroy, .create_conn = bnx2i_conn_create, .bind_conn = bnx2i_conn_bind, - .ops_expand = &bnx2i_iscsi_expand, + .unbind_conn = iscsi_conn_unbind, .destroy_conn = bnx2i_conn_destroy, .attr_is_visible = bnx2i_attr_is_visible, .set_param = iscsi_set_param, diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c index 65eb6230d390..edcd3fab6973 100644 --- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c +++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c @@ -100,18 +100,13 @@ static struct scsi_host_template cxgb3i_host_template = { .track_queue_depth = 1, };
-static struct iscsi_transport_expand cxgb3i_iscsi_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - static struct iscsi_transport cxgb3i_iscsi_transport = { .owner = THIS_MODULE, .name = DRV_MODULE_NAME, /* owner and name should be set already */ .caps = CAP_RECOVERY_L0 | CAP_MULTI_R2T | CAP_HDRDGST | CAP_DATADGST | CAP_DIGEST_OFFLOAD | - CAP_PADDING_OFFLOAD | CAP_TEXT_NEGO | - CAP_OPS_EXPAND, + CAP_PADDING_OFFLOAD | CAP_TEXT_NEGO, .attr_is_visible = cxgbi_attr_is_visible, .get_host_param = cxgbi_get_host_param, .set_host_param = cxgbi_set_host_param, @@ -122,7 +117,7 @@ static struct iscsi_transport cxgb3i_iscsi_transport = { /* connection management */ .create_conn = cxgbi_create_conn, .bind_conn = cxgbi_bind_conn, - .ops_expand = &cxgb3i_iscsi_expand, + .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_tcp_conn_teardown, .start_conn = iscsi_conn_start, .stop_conn = iscsi_conn_stop, diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index 88f96964c356..efb3e2b3398e 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -118,17 +118,12 @@ static struct scsi_host_template cxgb4i_host_template = { .track_queue_depth = 1, };
-static struct iscsi_transport_expand cxgb4i_iscsi_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - static struct iscsi_transport cxgb4i_iscsi_transport = { .owner = THIS_MODULE, .name = DRV_MODULE_NAME, .caps = CAP_RECOVERY_L0 | CAP_MULTI_R2T | CAP_HDRDGST | CAP_DATADGST | CAP_DIGEST_OFFLOAD | - CAP_PADDING_OFFLOAD | CAP_TEXT_NEGO | - CAP_OPS_EXPAND, + CAP_PADDING_OFFLOAD | CAP_TEXT_NEGO, .attr_is_visible = cxgbi_attr_is_visible, .get_host_param = cxgbi_get_host_param, .set_host_param = cxgbi_set_host_param, @@ -139,7 +134,7 @@ static struct iscsi_transport cxgb4i_iscsi_transport = { /* connection management */ .create_conn = cxgbi_create_conn, .bind_conn = cxgbi_bind_conn, - .ops_expand = &cxgb4i_iscsi_expand, + .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_tcp_conn_teardown, .start_conn = iscsi_conn_start, .stop_conn = iscsi_conn_stop, diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c index 8003b3519b95..3bcadb3dd40d 100644 --- a/drivers/scsi/qedi/qedi_iscsi.c +++ b/drivers/scsi/qedi/qedi_iscsi.c @@ -1429,20 +1429,16 @@ static void qedi_cleanup_task(struct iscsi_task *task) cmd->scsi_cmd = NULL; }
-static struct iscsi_transport_expand qedi_iscsi_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - struct iscsi_transport qedi_iscsi_transport = { .owner = THIS_MODULE, .name = QEDI_MODULE_NAME, .caps = CAP_RECOVERY_L0 | CAP_HDRDGST | CAP_MULTI_R2T | CAP_DATADGST | - CAP_DATA_PATH_OFFLOAD | CAP_TEXT_NEGO | CAP_OPS_EXPAND, + CAP_DATA_PATH_OFFLOAD | CAP_TEXT_NEGO, .create_session = qedi_session_create, .destroy_session = qedi_session_destroy, .create_conn = qedi_conn_create, .bind_conn = qedi_conn_bind, - .ops_expand = &qedi_iscsi_expand, + .unbind_conn = iscsi_conn_unbind, .start_conn = qedi_conn_start, .stop_conn = iscsi_conn_stop, .destroy_conn = qedi_conn_destroy, diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c index 377d83762099..8d82d2a83059 100644 --- a/drivers/scsi/qla4xxx/ql4_os.c +++ b/drivers/scsi/qla4xxx/ql4_os.c @@ -246,24 +246,20 @@ static struct scsi_host_template qla4xxx_driver_template = { .vendor_id = SCSI_NL_VID_TYPE_PCI | PCI_VENDOR_ID_QLOGIC, };
-static struct iscsi_transport_expand qla4xxx_iscsi_expand = { - .unbind_conn = iscsi_conn_unbind, -}; - static struct iscsi_transport qla4xxx_iscsi_transport = { .owner = THIS_MODULE, .name = DRIVER_NAME, .caps = CAP_TEXT_NEGO | CAP_DATA_PATH_OFFLOAD | CAP_HDRDGST | CAP_DATADGST | CAP_LOGIN_OFFLOAD | - CAP_MULTI_R2T | CAP_OPS_EXPAND, + CAP_MULTI_R2T, .attr_is_visible = qla4_attr_is_visible, .create_session = qla4xxx_session_create, .destroy_session = qla4xxx_session_destroy, .start_conn = qla4xxx_conn_start, .create_conn = qla4xxx_conn_create, .bind_conn = qla4xxx_conn_bind, - .ops_expand = &qla4xxx_iscsi_expand, + .unbind_conn = iscsi_conn_unbind, .stop_conn = iscsi_conn_stop, .destroy_conn = qla4xxx_conn_destroy, .set_param = iscsi_set_param, diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index a213362524c9..517a3c4d879b 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2257,11 +2257,7 @@ static void iscsi_ep_disconnect(struct iscsi_cls_conn *conn, bool is_active) ep = conn->ep; conn->ep = NULL;
- if (session->transport->caps & CAP_OPS_EXPAND && - session->transport->ops_expand && - session->transport->ops_expand->unbind_conn) - session->transport->ops_expand->unbind_conn(conn, is_active); - + session->transport->unbind_conn(conn, is_active); session->transport->ep_disconnect(ep); ISCSI_DBG_TRANS_CONN(conn, "disconnect ep done.\n"); } @@ -3228,19 +3224,10 @@ iscsi_tgt_dscvr(struct iscsi_transport *transport, struct Scsi_Host *shost; struct sockaddr *dst_addr; int err; - int (*tgt_dscvr)(struct Scsi_Host *shost, enum iscsi_tgt_dscvr type, - uint32_t enable, struct sockaddr *dst_addr);
- if (transport->caps & CAP_OPS_EXPAND) { - if (!transport->ops_expand || !transport->ops_expand->tgt_dscvr) - return -EINVAL; - tgt_dscvr = transport->ops_expand->tgt_dscvr; - } else { - if (!transport->ops_expand) - return -EINVAL; - tgt_dscvr = (int (*)(struct Scsi_Host *, enum iscsi_tgt_dscvr, uint32_t, - struct sockaddr *))(transport->ops_expand); - } + if (!transport->tgt_dscvr) + return -EINVAL; + shost = scsi_host_lookup(ev->u.tgt_dscvr.host_no); if (!shost) { printk(KERN_ERR "target discovery could not find host no %u\n", @@ -3250,8 +3237,8 @@ iscsi_tgt_dscvr(struct iscsi_transport *transport,
dst_addr = (struct sockaddr *)((char*)ev + sizeof(*ev)); - err = tgt_dscvr(shost, ev->u.tgt_dscvr.type, - ev->u.tgt_dscvr.enable, dst_addr); + err = transport->tgt_dscvr(shost, ev->u.tgt_dscvr.type, + ev->u.tgt_dscvr.enable, dst_addr); scsi_host_put(shost); return err; } @@ -4904,10 +4891,8 @@ iscsi_register_transport(struct iscsi_transport *tt) int err;
BUG_ON(!tt); - if (tt->caps & CAP_OPS_EXPAND) { - BUG_ON(!tt->ops_expand); - WARN_ON(tt->ep_disconnect && !tt->ops_expand->unbind_conn); - } + WARN_ON(tt->ep_disconnect && !tt->unbind_conn); + priv = iscsi_if_transport_lookup(tt); if (priv) return NULL; diff --git a/include/scsi/iscsi_if.h b/include/scsi/iscsi_if.h index 5679b9fb2b1e..5225a23f2d0e 100644 --- a/include/scsi/iscsi_if.h +++ b/include/scsi/iscsi_if.h @@ -761,7 +761,6 @@ enum iscsi_ping_status_code { and verification */ #define CAP_LOGIN_OFFLOAD 0x4000 /* offload session login */
-#define CAP_OPS_EXPAND 0x8000 /* oiscsi_transport->ops_expand flag */ /* * These flags describes reason of stop_conn() call */ diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h index 1f7574d89822..b7fd8b738aa8 100644 --- a/include/scsi/scsi_transport_iscsi.h +++ b/include/scsi/scsi_transport_iscsi.h @@ -29,15 +29,6 @@ struct bsg_job; struct iscsi_bus_flash_session; struct iscsi_bus_flash_conn;
-/* - * The expansion of iscsi_transport to fix kabi while adding members. - */ -struct iscsi_transport_expand { - int (*tgt_dscvr)(struct Scsi_Host *shost, enum iscsi_tgt_dscvr type, - uint32_t enable, struct sockaddr *dst_addr); - void (*unbind_conn)(struct iscsi_cls_conn *conn, bool is_active); -}; - /** * struct iscsi_transport - iSCSI Transport template * @@ -91,6 +82,7 @@ struct iscsi_transport { void (*destroy_session) (struct iscsi_cls_session *session); struct iscsi_cls_conn *(*create_conn) (struct iscsi_cls_session *sess, uint32_t cid); + void (*unbind_conn) (struct iscsi_cls_conn *conn, bool is_active); int (*bind_conn) (struct iscsi_cls_session *session, struct iscsi_cls_conn *cls_conn, uint64_t transport_eph, int is_leading); @@ -132,15 +124,8 @@ struct iscsi_transport { int non_blocking); int (*ep_poll) (struct iscsi_endpoint *ep, int timeout_ms); void (*ep_disconnect) (struct iscsi_endpoint *ep); -#ifdef __GENKSYMS__ int (*tgt_dscvr) (struct Scsi_Host *shost, enum iscsi_tgt_dscvr type, uint32_t enable, struct sockaddr *dst_addr); -#else - /* - * onece ops_expand is used, caps must be set to CAP_OPS_EXPAND - */ - struct iscsi_transport_expand *ops_expand; -#endif int (*set_path) (struct Scsi_Host *shost, struct iscsi_path *params); int (*set_iface_param) (struct Scsi_Host *shost, void *data, uint32_t len);
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188176, https://gitee.com/openeuler/kernel/issues/I67294 CVE: NA
-------------------------------
Commit 891e2639deae ("scsi: iscsi: Stop queueing during ep_disconnect") introduces .unbind_conn to fix the race between __iscsi_conn_send_pdu() and .ep_disconnect, however it also introduces the KABI problem.
Considering the issue is only related with offload iscsi driver but not iscsi_tcp, so tried to revert it, however the above commit is just one patch in a patchset, the following patches depends on it and these patches fix problem related with iscsi_tcp.
So just reverting it manually by removing .unbind_conn from iscsi_transport.
Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/ulp/iser/iscsi_iser.c | 1 - drivers/scsi/be2iscsi/be_main.c | 1 - drivers/scsi/bnx2i/bnx2i_iscsi.c | 1 - drivers/scsi/cxgbi/cxgb3i/cxgb3i.c | 1 - drivers/scsi/cxgbi/cxgb4i/cxgb4i.c | 1 - drivers/scsi/qedi/qedi_iscsi.c | 1 - drivers/scsi/qla4xxx/ql4_os.c | 1 - drivers/scsi/scsi_transport_iscsi.c | 2 -- include/scsi/scsi_transport_iscsi.h | 1 - 9 files changed, 10 deletions(-)
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c index a16e066989fa..da3ee3cc2632 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.c +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c @@ -989,7 +989,6 @@ static struct iscsi_transport iscsi_iser_transport = { /* connection management */ .create_conn = iscsi_iser_conn_create, .bind_conn = iscsi_iser_conn_bind, - .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_conn_teardown, .attr_is_visible = iser_attr_is_visible, .set_param = iscsi_iser_set_param, diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index b977e039bb78..987dc8135a9b 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -5810,7 +5810,6 @@ struct iscsi_transport beiscsi_iscsi_transport = { .destroy_session = beiscsi_session_destroy, .create_conn = beiscsi_conn_create, .bind_conn = beiscsi_conn_bind, - .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_conn_teardown, .attr_is_visible = beiscsi_attr_is_visible, .set_iface_param = beiscsi_iface_set_param, diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c index 8cf2f9a7cfdc..872e2ebad800 100644 --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c @@ -2285,7 +2285,6 @@ struct iscsi_transport bnx2i_iscsi_transport = { .destroy_session = bnx2i_session_destroy, .create_conn = bnx2i_conn_create, .bind_conn = bnx2i_conn_bind, - .unbind_conn = iscsi_conn_unbind, .destroy_conn = bnx2i_conn_destroy, .attr_is_visible = bnx2i_attr_is_visible, .set_param = iscsi_set_param, diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c index edcd3fab6973..37d99357120f 100644 --- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c +++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c @@ -117,7 +117,6 @@ static struct iscsi_transport cxgb3i_iscsi_transport = { /* connection management */ .create_conn = cxgbi_create_conn, .bind_conn = cxgbi_bind_conn, - .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_tcp_conn_teardown, .start_conn = iscsi_conn_start, .stop_conn = iscsi_conn_stop, diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index efb3e2b3398e..2c3491528d42 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -134,7 +134,6 @@ static struct iscsi_transport cxgb4i_iscsi_transport = { /* connection management */ .create_conn = cxgbi_create_conn, .bind_conn = cxgbi_bind_conn, - .unbind_conn = iscsi_conn_unbind, .destroy_conn = iscsi_tcp_conn_teardown, .start_conn = iscsi_conn_start, .stop_conn = iscsi_conn_stop, diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c index 3bcadb3dd40d..c42123478795 100644 --- a/drivers/scsi/qedi/qedi_iscsi.c +++ b/drivers/scsi/qedi/qedi_iscsi.c @@ -1438,7 +1438,6 @@ struct iscsi_transport qedi_iscsi_transport = { .destroy_session = qedi_session_destroy, .create_conn = qedi_conn_create, .bind_conn = qedi_conn_bind, - .unbind_conn = iscsi_conn_unbind, .start_conn = qedi_conn_start, .stop_conn = iscsi_conn_stop, .destroy_conn = qedi_conn_destroy, diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c index 8d82d2a83059..dac4129885d7 100644 --- a/drivers/scsi/qla4xxx/ql4_os.c +++ b/drivers/scsi/qla4xxx/ql4_os.c @@ -259,7 +259,6 @@ static struct iscsi_transport qla4xxx_iscsi_transport = { .start_conn = qla4xxx_conn_start, .create_conn = qla4xxx_conn_create, .bind_conn = qla4xxx_conn_bind, - .unbind_conn = iscsi_conn_unbind, .stop_conn = iscsi_conn_stop, .destroy_conn = qla4xxx_conn_destroy, .set_param = iscsi_set_param, diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 517a3c4d879b..efb005288032 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2257,7 +2257,6 @@ static void iscsi_ep_disconnect(struct iscsi_cls_conn *conn, bool is_active) ep = conn->ep; conn->ep = NULL;
- session->transport->unbind_conn(conn, is_active); session->transport->ep_disconnect(ep); ISCSI_DBG_TRANS_CONN(conn, "disconnect ep done.\n"); } @@ -4891,7 +4890,6 @@ iscsi_register_transport(struct iscsi_transport *tt) int err;
BUG_ON(!tt); - WARN_ON(tt->ep_disconnect && !tt->unbind_conn);
priv = iscsi_if_transport_lookup(tt); if (priv) diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h index b7fd8b738aa8..77a59f391309 100644 --- a/include/scsi/scsi_transport_iscsi.h +++ b/include/scsi/scsi_transport_iscsi.h @@ -82,7 +82,6 @@ struct iscsi_transport { void (*destroy_session) (struct iscsi_cls_session *session); struct iscsi_cls_conn *(*create_conn) (struct iscsi_cls_session *sess, uint32_t cid); - void (*unbind_conn) (struct iscsi_cls_conn *conn, bool is_active); int (*bind_conn) (struct iscsi_cls_session *session, struct iscsi_cls_conn *cls_conn, uint64_t transport_eph, int is_leading);
From: Yu Kuai yukuai3@huawei.com
mainline inclusion from v6.1-rc5 commit f02be9002c480cd3ec0fcf184ad27cf531bd6ece category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I610OR CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Out test found a following problem in kernel 5.10, and the same problem should exist in mainline:
BUG: kernel NULL pointer dereference, address: 0000000000000094 PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 7 PID: 155 Comm: kworker/7:1 Not tainted 5.10.0-01932-g19e0ace2ca1d-dirty 4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-b4 Workqueue: kthrotld blk_throtl_dispatch_work_fn RIP: 0010:bfq_bio_bfqg+0x52/0xc0 Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b RSP: 0018:ffffc90001a1fba0 EFLAGS: 00010002 RAX: ffff888100d60400 RBX: ffff8881132e7000 RCX: 0000000000000000 RDX: 0000000000000017 RSI: ffff888103580a18 RDI: ffff888103580a18 RBP: ffff8881132e7000 R08: 0000000000000000 R09: ffffc90001a1fe10 R10: 0000000000000a20 R11: 0000000000034320 R12: 0000000000000000 R13: ffff888103580a18 R14: ffff888114447000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88881fdc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000094 CR3: 0000000100cdb000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: bfq_bic_update_cgroup+0x3c/0x350 ? ioc_create_icq+0x42/0x270 bfq_init_rq+0xfd/0x1060 bfq_insert_requests+0x20f/0x1cc0 ? ioc_create_icq+0x122/0x270 blk_mq_sched_insert_requests+0x86/0x1d0 blk_mq_flush_plug_list+0x193/0x2a0 blk_flush_plug_list+0x127/0x170 blk_finish_plug+0x31/0x50 blk_throtl_dispatch_work_fn+0x151/0x190 process_one_work+0x27c/0x5f0 worker_thread+0x28b/0x6b0 ? rescuer_thread+0x590/0x590 kthread+0x153/0x1b0 ? kthread_flush_work+0x170/0x170 ret_from_fork+0x1f/0x30 Modules linked in: CR2: 0000000000000094 ---[ end trace e2e59ac014314547 ]--- RIP: 0010:bfq_bio_bfqg+0x52/0xc0 Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b RSP: 0018:ffffc90001a1fba0 EFLAGS: 00010002 RAX: ffff888100d60400 RBX: ffff8881132e7000 RCX: 0000000000000000 RDX: 0000000000000017 RSI: ffff888103580a18 RDI: ffff888103580a18 RBP: ffff8881132e7000 R08: 0000000000000000 R09: ffffc90001a1fe10 R10: 0000000000000a20 R11: 0000000000034320 R12: 0000000000000000 R13: ffff888103580a18 R14: ffff888114447000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88881fdc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000094 CR3: 0000000100cdb000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Root cause is quite complex:
1) use bfq elevator for the test device. 2) create a cgroup CG 3) config blk throtl in CG
blkg_conf_prep blkg_create
4) create a thread T1 and issue async io in CG:
bio_init bio_associate_blkg ... submit_bio submit_bio_noacct blk_throtl_bio -> io is throttled // io submit is done
5) switch elevator:
bfq_exit_queue blkcg_deactivate_policy list_for_each_entry(blkg, &q->blkg_list, q_node) blkg->pd[] = NULL // bfq policy is removed
5) thread t1 exist, then remove the cgroup CG:
blkcg_unpin_online blkcg_destroy_blkgs blkg_destroy list_del_init(&blkg->q_node) // blkg is removed from queue list
6) switch elevator back to bfq
bfq_init_queue bfq_create_group_hierarchy blkcg_activate_policy list_for_each_entry_reverse(blkg, &q->blkg_list) // blkg is removed from list, hence bfq policy is still NULL
7) throttled io is dispatched to bfq:
bfq_insert_requests bfq_init_rq bfq_bic_update_cgroup bfq_bio_bfqg bfqg = blkg_to_bfqg(blkg) // bfqg is NULL because bfq policy is NULL
The problem is only possible in bfq because only bfq can be deactivated and activated while queue is online, while others can only be deactivated while the device is removed.
Fix the problem in bfq by checking if blkg is online before calling blkg_to_bfqg().
Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20221108103434.2853269-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe axboe@kernel.dk Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/bfq-cgroup.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index f99351017182..36ba7324f685 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -613,6 +613,10 @@ struct bfq_group *bfq_bio_bfqg(struct bfq_data *bfqd, struct bio *bio) struct bfq_group *bfqg;
while (blkg) { + if (!blkg->online) { + blkg = blkg->parent; + continue; + } bfqg = blkg_to_bfqg(blkg); if (bfqg->online) { bio_associate_blkg_from_css(bio, &blkg->blkcg->css);
From: Yu Kuai yukuai3@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I66VJO CVE: NA
--------------------------------
Our test report a uaf for 'bfqq->bic' in 5.10:
================================================================== BUG: KASAN: use-after-free in bfq_select_queue+0x378/0xa30 Read of size 8 at addr ffff88810efb42d8 by task fsstress/2318352
CPU: 6 PID: 2318352 Comm: fsstress Kdump: loaded Not tainted 5.10.0-60.18.0.50.h602.kasan.eulerosv2r11.x86_64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220320_160524-szxrtosci10000 04/01/2014 Call Trace: dump_stack+0x9c/0xd3 print_address_description.constprop.0+0x19/0x170 ? bfq_select_queue+0x378/0xa30 __kasan_report.cold+0x6c/0x84 ? bfq_select_queue+0x378/0xa30 kasan_report+0x3a/0x50 bfq_select_queue+0x378/0xa30 ? bfq_bfqq_expire+0x6c0/0x6c0 ? bfq_mark_bfqq_busy+0x1f/0x30 ? _raw_spin_lock_irq+0x7b/0xd0 __bfq_dispatch_request+0x1c4/0x220 bfq_dispatch_request+0xe8/0x130 __blk_mq_do_dispatch_sched+0x3f4/0x560 ? blk_mq_sched_mark_restart_hctx+0x50/0x50 ? bfq_init_rq+0x128/0x940 ? pvclock_clocksource_read+0xf6/0x1d0 blk_mq_do_dispatch_sched+0x62/0xb0 __blk_mq_sched_dispatch_requests+0x215/0x2a0 ? blk_mq_do_dispatch_ctx+0x3a0/0x3a0 ? bfq_insert_request+0x193/0x3f0 blk_mq_sched_dispatch_requests+0x8f/0xd0 __blk_mq_run_hw_queue+0x98/0x180 __blk_mq_delay_run_hw_queue+0x22b/0x240 ? bfq_asymmetric_scenario+0x160/0x160 blk_mq_run_hw_queue+0xe3/0x190 ? bfq_insert_request+0x3f0/0x3f0 blk_mq_sched_insert_requests+0x107/0x200 blk_mq_flush_plug_list+0x26e/0x3c0 ? blk_mq_insert_requests+0x250/0x250 ? blk_check_plugged+0x190/0x190 blk_finish_plug+0x63/0x90 __iomap_dio_rw+0x7b5/0x910 ? iomap_dio_actor+0x150/0x150 ? userns_put+0x70/0x70 ? userns_put+0x70/0x70 ? avc_has_perm_noaudit+0x1d0/0x1d0 ? down_read+0xd5/0x1a0 ? down_read_killable+0x1b0/0x1b0 ? from_kgid+0xa0/0xa0 iomap_dio_rw+0x36/0x80 ext4_dio_read_iter+0x146/0x190 [ext4] ext4_file_read_iter+0x1e2/0x230 [ext4] new_sync_read+0x29f/0x400 ? default_llseek+0x160/0x160 ? find_isec_ns+0x8d/0x2e0 ? avc_policy_seqno+0x27/0x40 ? selinux_file_permission+0x34/0x180 ? security_file_permission+0x135/0x2b0 vfs_read+0x24e/0x2d0 ksys_read+0xd5/0x1b0 ? __ia32_sys_pread64+0x160/0x160 ? __audit_syscall_entry+0x1cc/0x220 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x61/0xc6 RIP: 0033:0x7ff05f96fe62 Code: c0 e9 b2 fe ff ff 50 48 8d 3d 12 04 0c 00 e8 b5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 RSP: 002b:00007fffd30c0ff8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 00000000000000a5 RCX: 00007ff05f96fe62 RDX: 000000000001d000 RSI: 0000000001ffc000 RDI: 0000000000000003 RBP: 0000000000000003 R08: 0000000002019000 R09: 0000000000000000 R10: 00007ff05fa65290 R11: 0000000000000246 R12: 0000000000131800 R13: 000000000001d000 R14: 0000000000000000 R15: 0000000001ffc000
Allocated by task 2318348: kasan_save_stack+0x1b/0x40 __kasan_kmalloc.constprop.0+0xb5/0xe0 kmem_cache_alloc_node+0x15d/0x480 ioc_create_icq+0x68/0x2e0 blk_mq_sched_assign_ioc+0xbc/0xd0 blk_mq_rq_ctx_init+0x4b0/0x600 __blk_mq_alloc_request+0x21f/0x2e0 blk_mq_submit_bio+0x27a/0xd60 __submit_bio_noacct_mq+0x10b/0x270 submit_bio_noacct+0x13d/0x150 submit_bio+0xbf/0x280 iomap_dio_submit_bio+0x155/0x180 iomap_dio_bio_actor+0x2f0/0x770 iomap_dio_actor+0xd9/0x150 iomap_apply+0x1d2/0x4f0 __iomap_dio_rw+0x43a/0x910 iomap_dio_rw+0x36/0x80 ext4_dio_write_iter+0x46f/0x730 [ext4] ext4_file_write_iter+0xd8/0x100 [ext4] new_sync_write+0x2ac/0x3a0 vfs_write+0x365/0x430 ksys_write+0xd5/0x1b0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x61/0xc6
Freed by task 2320929: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x40 __kasan_slab_free+0x151/0x180 kmem_cache_free+0x9e/0x540 rcu_do_batch+0x292/0x700 rcu_core+0x270/0x2d0 __do_softirq+0xfd/0x402
Last call_rcu(): kasan_save_stack+0x1b/0x40 kasan_record_aux_stack+0xa8/0xf0 __call_rcu+0xa4/0x3a0 ioc_release_fn+0x45/0x120 process_one_work+0x3c5/0x730 worker_thread+0x93/0x650 kthread+0x1ba/0x210 ret_from_fork+0x22/0x30
Second to last call_rcu(): kasan_save_stack+0x1b/0x40 kasan_record_aux_stack+0xa8/0xf0 __call_rcu+0xa4/0x3a0 ioc_release_fn+0x45/0x120 process_one_work+0x3c5/0x730 worker_thread+0x93/0x650 kthread+0x1ba/0x210 ret_from_fork+0x22/0x30
The buggy address belongs to the object at ffff88810efb42a0 which belongs to the cache bfq_io_cq of size 160 The buggy address is located 56 bytes inside of 160-byte region [ffff88810efb42a0, ffff88810efb4340) The buggy address belongs to the page: page:00000000a519c14c refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88810efb4000 pfn:0x10efb4 head:00000000a519c14c order:1 compound_mapcount:0 flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff) raw: 0017ffffc0010200 0000000000000000 dead000000000122 ffff8881407c8600 raw: ffff88810efb4000 000000008024001a 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff88810efb4180: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ffff88810efb4200: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88810efb4280: fc fc fc fc fa fb fb fb fb fb fb fb fb fb fb fb
^ ffff88810efb4300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88810efb4380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================
Commit 3bc5e683c67d ("bfq: Split shared queues on move between cgroups") changes that move process to a new cgroup will allocate a new bfqq to use, however, the old bfqq and new bfqq can point to the same bic:
1) Initial state, two process with io in the same cgroup.
Process 1 Process 2 (BIC1) (BIC2) | Λ | Λ | | | | V | V | bfqq1 bfqq2
2) bfqq1 is merged to bfqq2.
Process 1 Process 2(cg1) (BIC1) (BIC2) | | -------------| V bfqq1 bfqq2(coop)
3) Process 1 exit, then issue new io(denoce IOA) from Process 2.
(BIC2) | Λ | | V | bfqq2(coop)
4) Before IOA is completed, move Process 2 to another cgroup and issue io.
Process 2 (BIC2) Λ |--------------\ | V bfqq2 bfqq3
Now that BIC2 points to bfqq3, while bfqq2 and bfqq3 both point to BIC2. If all the requests are completed, and Process 2 exit, BIC2 will be freed while there is no guarantee that bfqq2 will be freed before BIC2.
Fix the problem by clearing bfqq->bic if process references is decreased to zero, since that they are not related anymore.
Fixes: 3bc5e683c67d ("bfq: Split shared queues on move between cgroups") Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/bfq-iosched.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 6edc00da5b57..829e713639ad 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2775,6 +2775,15 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq) bfqq != bfqd->in_service_queue) bfq_del_bfqq_busy(bfqd, bfqq, false);
+ /* + * __bfq_bic_change_cgroup() just reset bic->bfqq so that a new bfqq + * will be created to handle new io, while old bfqq will stay around + * until all the requests are completed. It's unsafe to keep bfqq->bic + * since they are not related anymore. + */ + if (bfqq_process_refs(bfqq) == 1) + bfqq->bic = NULL; + bfq_put_queue(bfqq); }
From: Alan Stern stern@rowland.harvard.edu
mainline inclusion from mainline-v6.1-rc2 commit 41fd1cb6151439b205ac7611883d85ae14250172 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6068W CVE: CVE-2022-3903
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Automatic kernel fuzzing led to a WARN about invalid pipe direction in the mceusb driver:
------------[ cut here ]------------ usb 6-1: BOGUS control dir, pipe 80000380 doesn't match bRequestType 40 WARNING: CPU: 0 PID: 2465 at drivers/usb/core/urb.c:410 usb_submit_urb+0x1326/0x1820 drivers/usb/core/urb.c:410 Modules linked in: CPU: 0 PID: 2465 Comm: kworker/0:2 Not tainted 5.19.0-rc4-00208-g69cb6c6556ad #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: usb_hub_wq hub_event RIP: 0010:usb_submit_urb+0x1326/0x1820 drivers/usb/core/urb.c:410 Code: 7c 24 40 e8 ac 23 91 fd 48 8b 7c 24 40 e8 b2 70 1b ff 45 89 e8 44 89 f1 4c 89 e2 48 89 c6 48 c7 c7 a0 30 a9 86 e8 48 07 11 02 <0f> 0b e9 1c f0 ff ff e8 7e 23 91 fd 0f b6 1d 63 22 83 05 31 ff 41 RSP: 0018:ffffc900032becf0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8881100f3058 RCX: 0000000000000000 RDX: ffffc90004961000 RSI: ffff888114c6d580 RDI: fffff52000657d90 RBP: ffff888105ad90f0 R08: ffffffff812c3638 R09: 0000000000000000 R10: 0000000000000005 R11: ffffed1023504ef1 R12: ffff888105ad9000 R13: 0000000000000040 R14: 0000000080000380 R15: ffff88810ba96500 FS: 0000000000000000(0000) GS:ffff88811a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe810bda58 CR3: 000000010b720000 CR4: 0000000000350ef0 Call Trace: <TASK> usb_start_wait_urb+0x101/0x4c0 drivers/usb/core/message.c:58 usb_internal_control_msg drivers/usb/core/message.c:102 [inline] usb_control_msg+0x31c/0x4a0 drivers/usb/core/message.c:153 mceusb_gen1_init drivers/media/rc/mceusb.c:1431 [inline] mceusb_dev_probe+0x258e/0x33f0 drivers/media/rc/mceusb.c:1807
The reason for the warning is clear enough; the driver sends an unusual read request on endpoint 0 but does not set the USB_DIR_IN bit in the bRequestType field.
More importantly, the whole situation can be avoided and the driver simplified by converting it over to the relatively new usb_control_msg_recv() and usb_control_msg_send() routines. That's what this fix does.
Reported-and-tested-by: Rondreis linhaoguo86@gmail.com Link: https://lore.kernel.org/all/CAB7eexLLApHJwZfMQ=X-PtRhw0BgO+5KcSMS05FNUYejJXq...
Signed-off-by: Alan Stern stern@rowland.harvard.edu Cc: stable@vger.kernel.org Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Zhang Peng zhangpeng362@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/media/rc/mceusb.c | 35 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 21 deletions(-)
diff --git a/drivers/media/rc/mceusb.c b/drivers/media/rc/mceusb.c index dbb5a4f44bda..de4cf6eb5258 100644 --- a/drivers/media/rc/mceusb.c +++ b/drivers/media/rc/mceusb.c @@ -1416,42 +1416,37 @@ static void mceusb_gen1_init(struct mceusb_dev *ir) { int ret; struct device *dev = ir->dev; - char *data; - - data = kzalloc(USB_CTRL_MSG_SZ, GFP_KERNEL); - if (!data) { - dev_err(dev, "%s: memory allocation failed!", __func__); - return; - } + char data[USB_CTRL_MSG_SZ];
/* * This is a strange one. Windows issues a set address to the device * on the receive control pipe and expect a certain value pair back */ - ret = usb_control_msg(ir->usbdev, usb_rcvctrlpipe(ir->usbdev, 0), - USB_REQ_SET_ADDRESS, USB_TYPE_VENDOR, 0, 0, - data, USB_CTRL_MSG_SZ, 3000); + ret = usb_control_msg_recv(ir->usbdev, 0, USB_REQ_SET_ADDRESS, + USB_DIR_IN | USB_TYPE_VENDOR, + 0, 0, data, USB_CTRL_MSG_SZ, 3000, + GFP_KERNEL); dev_dbg(dev, "set address - ret = %d", ret); dev_dbg(dev, "set address - data[0] = %d, data[1] = %d", data[0], data[1]);
/* set feature: bit rate 38400 bps */ - ret = usb_control_msg(ir->usbdev, usb_sndctrlpipe(ir->usbdev, 0), - USB_REQ_SET_FEATURE, USB_TYPE_VENDOR, - 0xc04e, 0x0000, NULL, 0, 3000); + ret = usb_control_msg_send(ir->usbdev, 0, + USB_REQ_SET_FEATURE, USB_TYPE_VENDOR, + 0xc04e, 0x0000, NULL, 0, 3000, GFP_KERNEL);
dev_dbg(dev, "set feature - ret = %d", ret);
/* bRequest 4: set char length to 8 bits */ - ret = usb_control_msg(ir->usbdev, usb_sndctrlpipe(ir->usbdev, 0), - 4, USB_TYPE_VENDOR, - 0x0808, 0x0000, NULL, 0, 3000); + ret = usb_control_msg_send(ir->usbdev, 0, + 4, USB_TYPE_VENDOR, + 0x0808, 0x0000, NULL, 0, 3000, GFP_KERNEL); dev_dbg(dev, "set char length - retB = %d", ret);
/* bRequest 2: set handshaking to use DTR/DSR */ - ret = usb_control_msg(ir->usbdev, usb_sndctrlpipe(ir->usbdev, 0), - 2, USB_TYPE_VENDOR, - 0x0000, 0x0100, NULL, 0, 3000); + ret = usb_control_msg_send(ir->usbdev, 0, + 2, USB_TYPE_VENDOR, + 0x0000, 0x0100, NULL, 0, 3000, GFP_KERNEL); dev_dbg(dev, "set handshake - retC = %d", ret);
/* device resume */ @@ -1459,8 +1454,6 @@ static void mceusb_gen1_init(struct mceusb_dev *ir)
/* get hw/sw revision? */ mce_command_out(ir, GET_REVISION, sizeof(GET_REVISION)); - - kfree(data); }
static void mceusb_gen2_init(struct mceusb_dev *ir)
From: Zhihao Cheng chengzhihao1@huawei.com
mainline inclusion from mainline-v6.2-rc1 commit 7991dbff6849f67e823b7cc0c15e5a90b0549b9f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I65M32
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id...
--------------------------------
Recently we found a softlock up problem in dm thin pool btree lookup code due to corrupted metadata:
Kernel panic - not syncing: softlockup: hung tasks CPU: 7 PID: 2669225 Comm: kworker/u16:3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: dm-thin do_worker [dm_thin_pool] Call Trace: <IRQ> dump_stack+0x9c/0xd3 panic+0x35d/0x6b9 watchdog_timer_fn.cold+0x16/0x25 __run_hrtimer+0xa2/0x2d0 </IRQ> RIP: 0010:__relink_lru+0x102/0x220 [dm_bufio] __bufio_new+0x11f/0x4f0 [dm_bufio] new_read+0xa3/0x1e0 [dm_bufio] dm_bm_read_lock+0x33/0xd0 [dm_persistent_data] ro_step+0x63/0x100 [dm_persistent_data] btree_lookup_raw.constprop.0+0x44/0x220 [dm_persistent_data] dm_btree_lookup+0x16f/0x210 [dm_persistent_data] dm_thin_find_block+0x12c/0x210 [dm_thin_pool] __process_bio_read_only+0xc5/0x400 [dm_thin_pool] process_thin_deferred_bios+0x1a4/0x4a0 [dm_thin_pool] process_one_work+0x3c5/0x730
Following process may generate a broken btree mixed with fresh and stale btree nodes, which could get dm thin trapped in an infinite loop while looking up data block: Transaction 1: pmd->root = A, A->B->C // One path in btree pmd->root = X, X->Y->Z // Copy-up Transaction 2: X,Z is updated on disk, Y write failed. // Commit failed, dm thin becomes read-only. process_bio_read_only dm_thin_find_block __find_block dm_btree_lookup(pmd->root) The pmd->root points to a broken btree, Y may contain stale node pointing to any block, for example X, which gets dm thin trapped into a dead loop while looking up Z.
Fix this by setting pmd->root in __open_metadata(), so that dm thin will use the last transaction's pmd->root if commit failed.
Fetch a reproducer in [Link].
Linke: https://bugzilla.kernel.org/show_bug.cgi?id=216790 Cc: stable@vger.kernel.org Fixes: 991d9fa02da0 ("dm: add thin provisioning target") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Acked-by: Joe Thornber ejt@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/md/dm-thin-metadata.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c index d1e0ae06f04d..8f4d149bb99d 100644 --- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -701,6 +701,15 @@ static int __open_metadata(struct dm_pool_metadata *pmd) goto bad_cleanup_data_sm; }
+ /* + * For pool metadata opening process, root setting is redundant + * because it will be set again in __begin_transaction(). But dm + * pool aborting process really needs to get last transaction's + * root to avoid accessing broken btree. + */ + pmd->root = le64_to_cpu(disk_super->data_mapping_root); + pmd->details_root = le64_to_cpu(disk_super->device_details_root); + __setup_btree_details(pmd); dm_bm_unlock(sblock);
From: Ming Lei ming.lei@redhat.com
mainline inclusion from mainline-v5.13-rc1 commit 580dca8143d215977811bd2ff881e1e4f6ff39f0 category: performance bugzilla: 187991,https://gitee.com/openeuler/kernel/issues/I66VDN CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------
Yanhui found that write performance is degraded a lot after applying hctx shared tagset on one test machine with megaraid_sas. And turns out it is caused by none scheduler which becomes default elevator caused by hctx shared tagset patchset.
Given more scsi HBAs will apply hctx shared tagset, and the similar performance exists for them too.
So keep previous behavior by still using default mq-deadline for queues which apply hctx shared tagset, just like before.
Fixes: 32bc15afed04 ("blk-mq: Facilitate a shared sbitmap per tagset") Reported-by: Yanhui Ma yama@redhat.com Cc: John Garry john.garry@huawei.com Cc: Hannes Reinecke hare@suse.de Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Reviewed-by: Bart Van Assche bvanassche@acm.org Reviewed-by: John Garry john.garry@huawei.com Link: https://lore.kernel.org/r/20210406031933.767228-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk
Conflict: block/elevator.c
Signed-off-by: Zhong Jinghua zhongjinghua@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/elevator.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/elevator.c b/block/elevator.c index 76f70f679a1b..b66afb571e15 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -630,7 +630,8 @@ static struct elevator_type *elevator_get_default(struct request_queue *q) if (q->tag_set && q->tag_set->flags & BLK_MQ_F_NO_SCHED_BY_DEFAULT) return NULL;
- if (q->nr_hw_queues != 1) + if (q->nr_hw_queues != 1 && + !blk_mq_is_sbitmap_shared(q->tag_set->flags)) return NULL;
return elevator_get(q, "mq-deadline", false);
From: Zeng Jingxiang linuszeng@tencent.com
mainline inclusion from mainline-v6.1-rc1 commit 8b740c08eb8202817562c358e8d867db0f7d6565 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I664DZ CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... --------------------------------
Coverity complains of a possible NULL dereference:
in of_select_probe_type(): 1. returned_null: of_match_device() returns NULL. 2. var_assigned: match = NULL return value from of_match_device() 309 match = of_match_device(of_flash_match, &dev->dev);
3.dereference: Dereferencing the NULL pointer match. 310 probe_type = match->data;
Signed-off-by: Zeng Jingxiang linuszeng@tencent.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20220727060302.1560325-1-zengjx95@gmail.co... Signed-off-by: Xiang Yang xiangyang3@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: yiyang yiyang13@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/mtd/maps/physmap-core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/mtd/maps/physmap-core.c b/drivers/mtd/maps/physmap-core.c index 4f63b8430c71..69d0ab1f6f94 100644 --- a/drivers/mtd/maps/physmap-core.c +++ b/drivers/mtd/maps/physmap-core.c @@ -307,6 +307,9 @@ static const char *of_select_probe_type(struct platform_device *dev) const char *probe_type;
match = of_match_device(of_flash_match, &dev->dev); + if (!match) + return NULL; + probe_type = match->data; if (probe_type) return probe_type;
From: Zhang Qiao zhangqiao22@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I64OUS CVE: NA
-------------------------------
When a cfs_rq throttled by qos, mark cfs_rq->throttled as 1, and cfs bw will unthrottled this cfs_rq by mistake, it cause a list_del_valid warning. So add macro QOS_THROTTLED(=2), when a cfs_rq is throttled by qos, we mark the cfs_rq->throttled as QOS_THROTTLED, will check the value of cfs_rq->throttled before unthrottle a cfs_rq.
Signed-off-by: Zhang Qiao zhangqiao22@huawei.com Reviewed-by: zheng zucheng zhengzucheng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/sched/fair.c | 102 +++++++++++++++++++++++++++++--------------- 1 file changed, 68 insertions(+), 34 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 49818f4fc532..b2002d8527a3 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -123,6 +123,13 @@ int __weak arch_asym_cpu_priority(int cpu) #endif
#ifdef CONFIG_QOS_SCHED + +/* + * To distinguish cfs bw, use QOS_THROTTLED mark cfs_rq->throttled + * when qos throttled(and cfs bw throttle mark cfs_rq->throttled as 1). + */ +#define QOS_THROTTLED 2 + static DEFINE_PER_CPU_SHARED_ALIGNED(struct list_head, qos_throttled_cfs_rq); static DEFINE_PER_CPU_SHARED_ALIGNED(struct hrtimer, qos_overload_timer); static DEFINE_PER_CPU(int, qos_cpu_overload); @@ -4959,6 +4966,14 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
se = cfs_rq->tg->se[cpu_of(rq)];
+#ifdef CONFIG_QOS_SCHED + /* + * if this cfs_rq throttled by qos, not need unthrottle it. + */ + if (cfs_rq->throttled == QOS_THROTTLED) + return; +#endif + cfs_rq->throttled = 0;
update_rq_clock(rq); @@ -7182,26 +7197,6 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
static void start_qos_hrtimer(int cpu);
-static int qos_tg_unthrottle_up(struct task_group *tg, void *data) -{ - struct rq *rq = data; - struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)]; - - cfs_rq->throttle_count--; - - return 0; -} - -static int qos_tg_throttle_down(struct task_group *tg, void *data) -{ - struct rq *rq = data; - struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)]; - - cfs_rq->throttle_count++; - - return 0; -} - static void throttle_qos_cfs_rq(struct cfs_rq *cfs_rq) { struct rq *rq = rq_of(cfs_rq); @@ -7213,7 +7208,7 @@ static void throttle_qos_cfs_rq(struct cfs_rq *cfs_rq)
/* freeze hierarchy runnable averages while throttled */ rcu_read_lock(); - walk_tg_tree_from(cfs_rq->tg, qos_tg_throttle_down, tg_nop, (void *)rq); + walk_tg_tree_from(cfs_rq->tg, tg_throttle_down, tg_nop, (void *)rq); rcu_read_unlock();
task_delta = cfs_rq->h_nr_running; @@ -7224,8 +7219,13 @@ static void throttle_qos_cfs_rq(struct cfs_rq *cfs_rq) if (!se->on_rq) break;
- if (dequeue) + if (dequeue) { dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP); + } else { + update_load_avg(qcfs_rq, se, 0); + se_update_runnable(se); + } + qcfs_rq->h_nr_running -= task_delta; qcfs_rq->idle_h_nr_running -= idle_task_delta;
@@ -7243,7 +7243,7 @@ static void throttle_qos_cfs_rq(struct cfs_rq *cfs_rq) if (list_empty(&per_cpu(qos_throttled_cfs_rq, cpu_of(rq)))) start_qos_hrtimer(cpu_of(rq));
- cfs_rq->throttled = 1; + cfs_rq->throttled = QOS_THROTTLED;
list_add(&cfs_rq->qos_throttled_list, &per_cpu(qos_throttled_cfs_rq, cpu_of(rq))); @@ -7253,12 +7253,14 @@ static void unthrottle_qos_cfs_rq(struct cfs_rq *cfs_rq) { struct rq *rq = rq_of(cfs_rq); struct sched_entity *se; - int enqueue = 1; unsigned int prev_nr = cfs_rq->h_nr_running; long task_delta, idle_task_delta;
se = cfs_rq->tg->se[cpu_of(rq)];
+ if (cfs_rq->throttled != QOS_THROTTLED) + return; + cfs_rq->throttled = 0;
update_rq_clock(rq); @@ -7266,7 +7268,7 @@ static void unthrottle_qos_cfs_rq(struct cfs_rq *cfs_rq)
/* update hierarchical throttle state */ rcu_read_lock(); - walk_tg_tree_from(cfs_rq->tg, tg_nop, qos_tg_unthrottle_up, (void *)rq); + walk_tg_tree_from(cfs_rq->tg, tg_nop, tg_unthrottle_up, (void *)rq); rcu_read_unlock();
if (!cfs_rq->load.weight) @@ -7276,26 +7278,58 @@ static void unthrottle_qos_cfs_rq(struct cfs_rq *cfs_rq) idle_task_delta = cfs_rq->idle_h_nr_running; for_each_sched_entity(se) { if (se->on_rq) - enqueue = 0; + break;
cfs_rq = cfs_rq_of(se); - if (enqueue) - enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP); + enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP); + cfs_rq->h_nr_running += task_delta; cfs_rq->idle_h_nr_running += idle_task_delta;
if (cfs_rq_throttled(cfs_rq)) - break; + goto unthrottle_throttle; }
- assert_list_leaf_cfs_rq(rq); + for_each_sched_entity(se) { + cfs_rq = cfs_rq_of(se);
- if (!se) { - add_nr_running(rq, task_delta); - if (prev_nr < 2 && prev_nr + task_delta >= 2) - overload_set(rq); + update_load_avg(cfs_rq, se, UPDATE_TG); + se_update_runnable(se); + + cfs_rq->h_nr_running += task_delta; + cfs_rq->idle_h_nr_running += idle_task_delta; + + /* end evaluation on encountering a throttled cfs_rq */ + if (cfs_rq_throttled(cfs_rq)) + goto unthrottle_throttle; + + /* + * One parent has been throttled and cfs_rq removed from the + * list. Add it back to not break the leaf list. + */ + if (throttled_hierarchy(cfs_rq)) + list_add_leaf_cfs_rq(cfs_rq); + } + + add_nr_running(rq, task_delta); + if (prev_nr < 2 && prev_nr + task_delta >= 2) + overload_set(rq); + +unthrottle_throttle: + /* + * The cfs_rq_throttled() breaks in the above iteration can result in + * incomplete leaf list maintenance, resulting in triggering the + * assertion below. + */ + for_each_sched_entity(se) { + cfs_rq = cfs_rq_of(se); + + if (list_add_leaf_cfs_rq(cfs_rq)) + break; }
+ assert_list_leaf_cfs_rq(rq); + /* Determine whether we need to wake up potentially idle CPU: */ if (rq->curr == rq->idle && rq->cfs.nr_running) resched_curr(rq);
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit fdee117ee86479fd2644bcd9ac2b2469e55722d1 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
Current ll_rw_block() helper is fragile because it assumes that locked buffer means it's under IO which is submitted by some other who holds the lock, it skip buffer if it failed to get the lock, so it's only safe on the readahead path. Unfortunately, now that most filesystems still use this helper mistakenly on the sync metadata read path. There is no guarantee that the one who holds the buffer lock always submit IO (e.g. buffer_migrate_folio_norefs() after commit 88dbcbb3a484 ("blkdev: avoid migration stalls for blkdev pages"), it could lead to false positive -EIO when submitting reading IO.
This patch add some friendly buffer read helpers to prepare replacing ll_rw_block() and similar calls. We can only call bh_readahead_[] helpers for the readahead paths.
Link: https://lkml.kernel.org/r/20220901133505.2510834-3-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/buffer.c include/linux/buffer_head.h
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/buffer.c | 65 +++++++++++++++++++++++++++++++++++++ include/linux/buffer_head.h | 38 ++++++++++++++++++++++ 2 files changed, 103 insertions(+)
diff --git a/fs/buffer.c b/fs/buffer.c index 37a08026d3ef..a10dfa3a0f59 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -3391,6 +3391,71 @@ int bh_uptodate_or_lock(struct buffer_head *bh) } EXPORT_SYMBOL(bh_uptodate_or_lock);
+/** + * __bh_read - Submit read for a locked buffer + * @bh: struct buffer_head + * @op_flags: appending REQ_OP_* flags besides REQ_OP_READ + * @wait: wait until reading finish + * + * Returns zero on success or don't wait, and -EIO on error. + */ +int __bh_read(struct buffer_head *bh, unsigned int op_flags, bool wait) +{ + int ret = 0; + + BUG_ON(!buffer_locked(bh)); + + get_bh(bh); + bh->b_end_io = end_buffer_read_sync; + submit_bh(REQ_OP_READ, op_flags, bh); + if (wait) { + wait_on_buffer(bh); + if (!buffer_uptodate(bh)) + ret = -EIO; + } + return ret; +} +EXPORT_SYMBOL(__bh_read); + +/** + * __bh_read_batch - Submit read for a batch of unlocked buffers + * @nr: entry number of the buffer batch + * @bhs: a batch of struct buffer_head + * @op_flags: appending REQ_OP_* flags besides REQ_OP_READ + * @force_lock: force to get a lock on the buffer if set, otherwise drops any + * buffer that cannot lock. + * + * Returns zero on success or don't wait, and -EIO on error. + */ +void __bh_read_batch(int nr, struct buffer_head *bhs[], + unsigned int op_flags, bool force_lock) +{ + int i; + + for (i = 0; i < nr; i++) { + struct buffer_head *bh = bhs[i]; + + if (buffer_uptodate(bh)) + continue; + + if (force_lock) + lock_buffer(bh); + else + if (!trylock_buffer(bh)) + continue; + + if (buffer_uptodate(bh)) { + unlock_buffer(bh); + continue; + } + + bh->b_end_io = end_buffer_read_sync; + get_bh(bh); + submit_bh(REQ_OP_READ, op_flags, bh); + } +} +EXPORT_SYMBOL(__bh_read_batch); + /** * bh_submit_read - Submit a locked buffer for reading * @bh: struct buffer_head diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h index 6b47f94378c5..8577ab2ef446 100644 --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -207,6 +207,9 @@ void write_boundary_block(struct block_device *bdev, sector_t bblock, unsigned blocksize); int bh_uptodate_or_lock(struct buffer_head *bh); int bh_submit_read(struct buffer_head *bh); +int __bh_read(struct buffer_head *bh, unsigned int op_flags, bool wait); +void __bh_read_batch(int nr, struct buffer_head *bhs[], + unsigned int op_flags, bool force_lock);
extern int buffer_heads_over_limit;
@@ -380,6 +383,41 @@ static inline struct buffer_head *__getblk(struct block_device *bdev, return __getblk_gfp(bdev, block, size, __GFP_MOVABLE); }
+static inline void bh_readahead(struct buffer_head *bh, unsigned int op_flags) +{ + if (!buffer_uptodate(bh) && trylock_buffer(bh)) { + if (!buffer_uptodate(bh)) + __bh_read(bh, op_flags, false); + else + unlock_buffer(bh); + } +} + +static inline void bh_read_nowait(struct buffer_head *bh, unsigned int op_flags) +{ + if (!bh_uptodate_or_lock(bh)) + __bh_read(bh, op_flags, false); +} + +/* Returns 1 if buffer uptodated, 0 on success, and -EIO on error. */ +static inline int bh_read(struct buffer_head *bh, unsigned int op_flags) +{ + if (bh_uptodate_or_lock(bh)) + return 1; + return __bh_read(bh, op_flags, true); +} + +static inline void bh_read_batch(int nr, struct buffer_head *bhs[]) +{ + __bh_read_batch(nr, bhs, 0, true); +} + +static inline void bh_readahead_batch(int nr, struct buffer_head *bhs[], + unsigned int op_flags) +{ + __bh_read_batch(nr, bhs, op_flags, false); +} + /** * __bread() - reads a specified block and returns the bh * @bdev: the block_device to read from
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit e7ea1129afab0e63af2c2d0e6e9fb7651f0982b3 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync IO path because it skip buffers which has been locked by others, it could lead to false positive EIO when submitting read IO. So stop using ll_rw_block(), switch to use new helpers which could guarantee buffer locked and submit IO if needed.
Link: https://lkml.kernel.org/r/20220901133505.2510834-4-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/buffer.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/buffer.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c index a10dfa3a0f59..93324b06ecb4 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -562,7 +562,7 @@ void write_boundary_block(struct block_device *bdev, struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize); if (bh) { if (buffer_dirty(bh)) - ll_rw_block(REQ_OP_WRITE, 0, 1, &bh); + write_dirty_buffer(bh, 0); put_bh(bh); } } @@ -1363,7 +1363,7 @@ void __breadahead(struct block_device *bdev, sector_t block, unsigned size) { struct buffer_head *bh = __getblk(bdev, block, size); if (likely(bh)) { - ll_rw_block(REQ_OP_READ, REQ_RAHEAD, 1, &bh); + bh_readahead(bh, REQ_RAHEAD); brelse(bh); } } @@ -2038,7 +2038,7 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len, if (!buffer_uptodate(bh) && !buffer_delay(bh) && !buffer_unwritten(bh) && (block_start < from || block_end > to)) { - ll_rw_block(REQ_OP_READ, 0, 1, &bh); + bh_read_nowait(bh, 0); *wait_bh++=bh; } } @@ -2927,11 +2927,9 @@ int block_truncate_page(struct address_space *mapping, set_buffer_uptodate(bh);
if (!buffer_uptodate(bh) && !buffer_delay(bh) && !buffer_unwritten(bh)) { - err = -EIO; - ll_rw_block(REQ_OP_READ, 0, 1, &bh); - wait_on_buffer(bh); + err = bh_read(bh, 0); /* Uhhuh. Read error. Complain and punt. */ - if (!buffer_uptodate(bh)) + if (err < 0) goto unlock; }
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 86a020cc7232c3defad370852415876bbe4576dc category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that always submitting read IO if the buffer has been locked, so stop using it. We also switch to new bh_readahead() helper for the readahead path.
Link: https://lkml.kernel.org/r/20220901133505.2510834-5-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Andreas Gruenbacher agruenba@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/gfs2/meta_io.c fs/gfs2/quota.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/gfs2/meta_io.c | 8 ++------ fs/gfs2/quota.c | 8 ++------ 2 files changed, 4 insertions(+), 12 deletions(-)
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c index 2db573e31f78..1ce73794a1aa 100644 --- a/fs/gfs2/meta_io.c +++ b/fs/gfs2/meta_io.c @@ -521,8 +521,7 @@ struct buffer_head *gfs2_meta_ra(struct gfs2_glock *gl, u64 dblock, u32 extlen)
if (buffer_uptodate(first_bh)) goto out; - if (!buffer_locked(first_bh)) - ll_rw_block(REQ_OP_READ, REQ_META | REQ_PRIO, 1, &first_bh); + bh_read_nowait(first_bh, REQ_META | REQ_PRIO);
dblock++; extlen--; @@ -530,10 +529,7 @@ struct buffer_head *gfs2_meta_ra(struct gfs2_glock *gl, u64 dblock, u32 extlen) while (extlen) { bh = gfs2_getbuf(gl, dblock, CREATE);
- if (!buffer_uptodate(bh) && !buffer_locked(bh)) - ll_rw_block(REQ_OP_READ, - REQ_RAHEAD | REQ_META | REQ_PRIO, - 1, &bh); + bh_readahead(bh, REQ_RAHEAD | REQ_META | REQ_PRIO); brelse(bh); dblock++; extlen--; diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c index ad953ecb5853..065ddb8792f3 100644 --- a/fs/gfs2/quota.c +++ b/fs/gfs2/quota.c @@ -741,12 +741,8 @@ static int gfs2_write_buf_to_page(struct gfs2_inode *ip, unsigned long index, } if (PageUptodate(page)) set_buffer_uptodate(bh); - if (!buffer_uptodate(bh)) { - ll_rw_block(REQ_OP_READ, REQ_META | REQ_PRIO, 1, &bh); - wait_on_buffer(bh); - if (!buffer_uptodate(bh)) - goto unlock_out; - } + if (bh_read(bh, REQ_META | REQ_PRIO) < 0) + goto unlock_out; if (gfs2_is_jdata(ip)) gfs2_trans_add_data(ip->i_gl, bh); else
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 0ed48061887f603b33b7dcb9075cbfaaa8d02723 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO return from zisofs_uncompress_block() if he buffer has been locked by others. So stop using ll_rw_block(), switch to sync helper instead.
Link: https://lkml.kernel.org/r/20220901133505.2510834-6-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/isofs/compress.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/isofs/compress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/isofs/compress.c b/fs/isofs/compress.c index bc12ac7e2312..c9fea93b1ee7 100644 --- a/fs/isofs/compress.c +++ b/fs/isofs/compress.c @@ -82,7 +82,7 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start, return 0; } haveblocks = isofs_get_blocks(inode, blocknum, bhs, needblocks); - ll_rw_block(REQ_OP_READ, 0, haveblocks, bhs); + bh_read_batch(haveblocks, bhs);
curbh = 0; curpage = 0;
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 8c004d1fc1497d9a6d92ea968bd58230af59a492 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in journal_get_superblock(). We also switch to new bh_readahead_batch() for the buffer array readahead path.
Link: https://lkml.kernel.org/r/20220901133505.2510834-7-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Theodore Ts'o tytso@mit.edu Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/jbd2/journal.c fs/jbd2/recovery.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/jbd2/journal.c | 15 ++++++--------- fs/jbd2/recovery.c | 16 ++++++++++------ 2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index aae412b0bfae..40d2edd55f85 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -1793,19 +1793,16 @@ static int journal_get_superblock(journal_t *journal) { struct buffer_head *bh; journal_superblock_t *sb; - int err = -EIO; + int err;
bh = journal->j_sb_buffer;
J_ASSERT(bh != NULL); - if (!buffer_uptodate(bh)) { - ll_rw_block(REQ_OP_READ, 0, 1, &bh); - wait_on_buffer(bh); - if (!buffer_uptodate(bh)) { - printk(KERN_ERR - "JBD2: IO error reading journal superblock\n"); - goto out; - } + err = bh_read(bh, 0); + if (err < 0) { + printk(KERN_ERR + "JBD2: IO error reading journal superblock\n"); + goto out; }
if (buffer_verified(bh)) diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index 1e07dfac4d81..f5e3bb411953 100644 --- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -100,7 +100,7 @@ static int do_readahead(journal_t *journal, unsigned int start) if (!buffer_uptodate(bh) && !buffer_locked(bh)) { bufs[nbufs++] = bh; if (nbufs == MAXBUF) { - ll_rw_block(REQ_OP_READ, 0, nbufs, bufs); + bh_readahead_batch(nbufs, bufs, 0); journal_brelse_array(bufs, nbufs); nbufs = 0; } @@ -109,7 +109,7 @@ static int do_readahead(journal_t *journal, unsigned int start) }
if (nbufs) - ll_rw_block(REQ_OP_READ, 0, nbufs, bufs); + bh_readahead_batch(nbufs, bufs, 0); err = 0;
failed: @@ -152,9 +152,14 @@ static int jread(struct buffer_head **bhp, journal_t *journal, return -ENOMEM;
if (!buffer_uptodate(bh)) { - /* If this is a brand new buffer, start readahead. - Otherwise, we assume we are already reading it. */ - if (!buffer_req(bh)) + /* + * If this is a brand new buffer, start readahead. + * Otherwise, we assume we are already reading it. + */ + bool need_readahead = !buffer_req(bh); + + bh_read_nowait(bh, 0); + if (need_readahead) do_readahead(journal, offset); wait_on_buffer(bh); } @@ -687,7 +692,6 @@ static int do_one_pass(journal_t *journal, mark_buffer_dirty(nbh); BUFFER_TRACE(nbh, "marking uptodate"); ++info->nr_replays; - /* ll_rw_block(WRITE, 1, &nbh); */ unlock_buffer(nbh); brelse(obh); brelse(nbh);
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 6bf414a00ae7688fac1e03f63431a355068a1d79 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in ntfs_get_block_vbo().
Link: https://lkml.kernel.org/r/20220901133505.2510834-8-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/ntfs3/inode.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ntfs3/inode.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c index 866691331a36..eecf78e5d754 100644 --- a/fs/ntfs3/inode.c +++ b/fs/ntfs3/inode.c @@ -629,12 +629,9 @@ static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo, bh->b_size = block_size; off = vbo & (PAGE_SIZE - 1); set_bh_page(bh, page, off); - ll_rw_block(REQ_OP_READ, 0, 1, &bh); - wait_on_buffer(bh); - if (!buffer_uptodate(bh)) { - err = -EIO; + err = bh_read(bh, 0); + if (err < 0) goto out; - } zero_user_segment(page, off + voff, off + block_size); } }
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 54d9171d38d904f5afde76e51bed416aaf144975 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in ocfs2.
Link: https://lkml.kernel.org/r/20220901133505.2510834-9-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/ocfs2/aops.c fs/ocfs2/super.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ocfs2/aops.c | 2 +- fs/ocfs2/super.c | 4 +--- 2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index ad20403b383f..6b06de78f2af 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -640,7 +640,7 @@ int ocfs2_map_page_blocks(struct page *page, u64 *p_blkno, !buffer_new(bh) && ocfs2_should_read_blk(inode, page, block_start) && (block_start < from || block_end > to)) { - ll_rw_block(REQ_OP_READ, 0, 1, &bh); + bh_read_nowait(bh, 0); *wait_bh++=bh; }
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c index c0e5f1bad499..01ac71723859 100644 --- a/fs/ocfs2/super.c +++ b/fs/ocfs2/super.c @@ -1772,9 +1772,7 @@ static int ocfs2_get_sector(struct super_block *sb, if (!buffer_dirty(*bh)) clear_buffer_uptodate(*bh); unlock_buffer(*bh); - ll_rw_block(REQ_OP_READ, 0, 1, bh); - wait_on_buffer(*bh); - if (!buffer_uptodate(*bh)) { + if (bh_read(*bh, 0) < 0) { mlog_errno(-EIO); brelse(*bh); *bh = NULL;
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit d554822e82cc99db53b845f3e60dc13e56ad4575 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read/write path because it cannot guarantee that submitting read/write IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() in read path if the buffer has been locked by others. So stop using ll_rw_block() in reiserfs. We also switch to new bh_readahead_batch() helper for the buffer array readahead path.
Link: https://lkml.kernel.org/r/20220901133505.2510834-10-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/reiserfs/journal.c fs/reiserfs/stree.c fs/reiserfs/super.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/reiserfs/journal.c | 11 ++++++----- fs/reiserfs/stree.c | 4 ++-- fs/reiserfs/super.c | 4 +--- 3 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c index df5fc12a6cee..55df528e9876 100644 --- a/fs/reiserfs/journal.c +++ b/fs/reiserfs/journal.c @@ -870,7 +870,7 @@ static int write_ordered_buffers(spinlock_t * lock, */ if (buffer_dirty(bh) && unlikely(bh->b_page->mapping == NULL)) { spin_unlock(lock); - ll_rw_block(REQ_OP_WRITE, 0, 1, &bh); + write_dirty_buffer(bh, 0); spin_lock(lock); } put_bh(bh); @@ -1054,7 +1054,7 @@ static int flush_commit_list(struct super_block *s, if (tbh) { if (buffer_dirty(tbh)) { depth = reiserfs_write_unlock_nested(s); - ll_rw_block(REQ_OP_WRITE, 0, 1, &tbh); + write_dirty_buffer(tbh, 0); reiserfs_write_lock_nested(s, depth); } put_bh(tbh) ; @@ -2239,7 +2239,7 @@ static int journal_read_transaction(struct super_block *sb, } } /* read in the log blocks, memcpy to the corresponding real block */ - ll_rw_block(REQ_OP_READ, 0, get_desc_trans_len(desc), log_blocks); + bh_read_batch(get_desc_trans_len(desc), log_blocks); for (i = 0; i < get_desc_trans_len(desc); i++) {
wait_on_buffer(log_blocks[i]); @@ -2341,10 +2341,11 @@ static struct buffer_head *reiserfs_breada(struct block_device *dev, } else bhlist[j++] = bh; } - ll_rw_block(REQ_OP_READ, 0, j, bhlist); + bh = bhlist[0]; + bh_read_nowait(bh, 0); + bh_readahead_batch(j - 1, &bhlist[1], 0); for (i = 1; i < j; i++) brelse(bhlist[i]); - bh = bhlist[0]; wait_on_buffer(bh); if (buffer_uptodate(bh)) return bh; diff --git a/fs/reiserfs/stree.c b/fs/reiserfs/stree.c index ef42729216d1..84c12a1947b2 100644 --- a/fs/reiserfs/stree.c +++ b/fs/reiserfs/stree.c @@ -579,7 +579,7 @@ static int search_by_key_reada(struct super_block *s, if (!buffer_uptodate(bh[j])) { if (depth == -1) depth = reiserfs_write_unlock_nested(s); - ll_rw_block(REQ_OP_READ, REQ_RAHEAD, 1, bh + j); + bh_readahead(bh[j], REQ_RAHEAD); } brelse(bh[j]); } @@ -685,7 +685,7 @@ int search_by_key(struct super_block *sb, const struct cpu_key *key, if (!buffer_uptodate(bh) && depth == -1) depth = reiserfs_write_unlock_nested(sb);
- ll_rw_block(REQ_OP_READ, 0, 1, &bh); + bh_read_nowait(bh, 0); wait_on_buffer(bh);
if (depth != -1) diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c index 913f5af9bf24..e5fd44b4a8ea 100644 --- a/fs/reiserfs/super.c +++ b/fs/reiserfs/super.c @@ -1708,9 +1708,7 @@ static int read_super_block(struct super_block *s, int offset) /* after journal replay, reread all bitmap and super blocks */ static int reread_meta_blocks(struct super_block *s) { - ll_rw_block(REQ_OP_READ, 0, 1, &SB_BUFFER_WITH_SB(s)); - wait_on_buffer(SB_BUFFER_WITH_SB(s)); - if (!buffer_uptodate(SB_BUFFER_WITH_SB(s))) { + if (bh_read(SB_BUFFER_WITH_SB(s), 0) < 0) { reiserfs_warning(s, "reiserfs-2504", "error reading the super"); return 1; }
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 59a16786fa7a77dd383a62271e0102f1455bccea category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block(). We also switch to new bh_readahead_batch() helper for the buffer array readahead path.
Link: https://lkml.kernel.org/r/20220901133505.2510834-11-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/udf/dir.c fs/udf/directory.c fs/udf/inode.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/udf/dir.c | 2 +- fs/udf/directory.c | 2 +- fs/udf/inode.c | 8 +------- 3 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/fs/udf/dir.c b/fs/udf/dir.c index d0f92a52e3ba..02bf94e3e666 100644 --- a/fs/udf/dir.c +++ b/fs/udf/dir.c @@ -131,7 +131,7 @@ static int udf_readdir(struct file *file, struct dir_context *ctx) brelse(tmp); } if (num) { - ll_rw_block(REQ_OP_READ, REQ_RAHEAD, num, bha); + bh_readahead_batch(num, bha, REQ_RAHEAD); for (i = 0; i < num; i++) brelse(bha[i]); } diff --git a/fs/udf/directory.c b/fs/udf/directory.c index 73720320f0ab..16bcf2c6b8b3 100644 --- a/fs/udf/directory.c +++ b/fs/udf/directory.c @@ -89,7 +89,7 @@ struct fileIdentDesc *udf_fileident_read(struct inode *dir, loff_t *nf_pos, brelse(tmp); } if (num) { - ll_rw_block(REQ_OP_READ, REQ_RAHEAD, num, bha); + bh_readahead_batch(num, bha, REQ_RAHEAD); for (i = 0; i < num; i++) brelse(bha[i]); } diff --git a/fs/udf/inode.c b/fs/udf/inode.c index d32b836f6ca7..3ae9955c42b0 100644 --- a/fs/udf/inode.c +++ b/fs/udf/inode.c @@ -1210,13 +1210,7 @@ struct buffer_head *udf_bread(struct inode *inode, udf_pblk_t block, if (!bh) return NULL;
- if (buffer_uptodate(bh)) - return bh; - - ll_rw_block(REQ_OP_READ, 0, 1, &bh); - - wait_on_buffer(bh); - if (buffer_uptodate(bh)) + if (bh_read(bh, 0) >= 0) return bh;
brelse(bh);
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 6799b6983170c6dfdb2fcea8c97058557ea7b5b6 category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in ufs.
Link: https://lkml.kernel.org/r/20220901133505.2510834-12-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflict: fs/ufs/balloc.c
Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ufs/balloc.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c index 075d3d9114c8..2436e3f82147 100644 --- a/fs/ufs/balloc.c +++ b/fs/ufs/balloc.c @@ -295,14 +295,10 @@ static void ufs_change_blocknr(struct inode *inode, sector_t beg,
if (!buffer_mapped(bh)) map_bh(bh, inode->i_sb, oldb + pos); - if (!buffer_uptodate(bh)) { - ll_rw_block(REQ_OP_READ, 0, 1, &bh); - wait_on_buffer(bh); - if (!buffer_uptodate(bh)) { - ufs_error(inode->i_sb, __func__, - "read of block failed\n"); - break; - } + if (bh_read(bh, 0) < 0) { + ufs_error(inode->i_sb, __func__, + "read of block failed\n"); + break; }
UFSD(" change from %llu to %llu, pos %u\n",
From: Zhang Yi yi.zhang@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 28cf75591008eef5e1649de31c4ddee5bf20081d category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
--------------------------------
bh_submit_read() and the uptodate check logic in bh_uptodate_or_lock() has been integrated in bh_read() helper, so switch to use it directly.
Link: https://lkml.kernel.org/r/20220901133505.2510834-14-yi.zhang@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ext2/balloc.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c index c17ccc19b938..5dc0a31f4a08 100644 --- a/fs/ext2/balloc.c +++ b/fs/ext2/balloc.c @@ -126,6 +126,7 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group) struct ext2_group_desc * desc; struct buffer_head * bh = NULL; ext2_fsblk_t bitmap_blk; + int ret;
desc = ext2_get_group_desc(sb, block_group, NULL); if (!desc) @@ -139,10 +140,10 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group) block_group, le32_to_cpu(desc->bg_block_bitmap)); return NULL; } - if (likely(bh_uptodate_or_lock(bh))) + ret = bh_read(bh, 0); + if (ret > 0) return bh; - - if (bh_submit_read(bh) < 0) { + if (ret < 0) { brelse(bh); ext2_error(sb, __func__, "Cannot read block bitmap - "