From: Zhihao Cheng chengzhihao1@huawei.com
hulk inclusion category: bugfix bugzilla: 187685, https://gitee.com/openeuler/kernel/issues/I5VVZX CVE: NA
--------------------------------
Fix bad space budget when symlink file is encrypted. Bad space budget may let make_reservation() return with -ENOSPC, which could turn ubifs to read-only mode in do_writepage() process.
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216490 Fixes: ca7f85be8d6cf9 ("ubifs: Add support for encrypted symlinks") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ubifs/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index f5777f59a101..b73904c1fbf7 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1145,7 +1145,6 @@ static int ubifs_symlink(struct inode *dir, struct dentry *dentry, int err, sz_change, len = strlen(symname); struct fscrypt_str disk_link; struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1, - .new_ino_d = ALIGN(len, 8), .dirtied_ino = 1 }; struct fscrypt_name nm;
@@ -1161,6 +1160,7 @@ static int ubifs_symlink(struct inode *dir, struct dentry *dentry, * Budget request settings: new inode, new direntry and changing parent * directory inode. */ + req.new_ino_d = ALIGN(disk_link.len - 1, 8); err = ubifs_budget_space(c, &req); if (err) return err;
From: Zhihao Cheng chengzhihao1@huawei.com
hulk inclusion category: bugfix bugzilla: 187685, https://gitee.com/openeuler/kernel/issues/I5VVZX CVE: NA
--------------------------------
There is no space budget for ubifs_xrename(). It may let make_reservation() return with -ENOSPC, which could turn ubifs to read-only mode in do_writepage() process. Fix it by adding space budget for ubifs_xrename().
Fixes: 9ec64962afb170 ("ubifs: Implement RENAME_EXCHANGE") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ubifs/dir.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index b73904c1fbf7..1f13fd515ce7 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1570,6 +1570,10 @@ static int ubifs_xrename(struct inode *old_dir, struct dentry *old_dentry, return err; }
+ err = ubifs_budget_space(c, &req); + if (err) + goto out; + lock_4_inodes(old_dir, new_dir, NULL, NULL);
time = current_time(old_dir); @@ -1595,6 +1599,7 @@ static int ubifs_xrename(struct inode *old_dir, struct dentry *old_dentry, unlock_4_inodes(old_dir, new_dir, NULL, NULL); ubifs_release_budget(c, &req);
+out: fscrypt_free_filename(&fst_nm); fscrypt_free_filename(&snd_nm); return err;
From: Zhihao Cheng chengzhihao1@huawei.com
hulk inclusion category: bugfix bugzilla: 187685, https://gitee.com/openeuler/kernel/issues/I5VVZX CVE: NA
-------------------------------
Each dirty inode should reserve 'c->bi.inode_budget' bytes in space budget calculation. Currently, space budget for dirty inode reports more space than what UBIFS actually needs to write.
Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ubifs/budget.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ubifs/budget.c b/fs/ubifs/budget.c index c0b84e960b20..bdb79be6dc0e 100644 --- a/fs/ubifs/budget.c +++ b/fs/ubifs/budget.c @@ -403,7 +403,7 @@ static int calc_dd_growth(const struct ubifs_info *c, dd_growth = req->dirtied_page ? c->bi.page_budget : 0;
if (req->dirtied_ino) - dd_growth += c->bi.inode_budget << (req->dirtied_ino - 1); + dd_growth += c->bi.inode_budget * req->dirtied_ino; if (req->mod_dent) dd_growth += c->bi.dent_budget; dd_growth += req->dirtied_ino_d;
From: Zhihao Cheng chengzhihao1@huawei.com
hulk inclusion category: bugfix bugzilla: 187685, https://gitee.com/openeuler/kernel/issues/I5VVZX CVE: NA
--------------------------------
If target inode is a special file (eg. block/char device) with nlink count greater than 1, the inode with ui->data will be re-written on disk. However, UBIFS losts target inode's data_len while doing space budget. Bad space budget may let make_reservation() return with -ENOSPC, which could turn ubifs to read-only mode in do_writepage() process.
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216494 Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ubifs/dir.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 1f13fd515ce7..1686e7aea823 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1318,6 +1318,8 @@ static int do_rename(struct inode *old_dir, struct dentry *old_dentry, if (unlink) { ubifs_assert(c, inode_is_locked(new_inode));
+ /* Budget for old inode's data when its nlink > 1. */ + req.dirtied_ino_d = ALIGN(ubifs_inode(new_inode)->data_len, 8); err = ubifs_purge_xattrs(new_inode); if (err) return err;
From: Zhihao Cheng chengzhihao1@huawei.com
hulk inclusion category: bugfix bugzilla: 187685, https://gitee.com/openeuler/kernel/issues/I5VVZX CVE: NA
--------------------------------
UBIFS calculates available space by c->main_bytes - c->lst.total_used (which means non-index lebs' free and dirty space is accounted into total available), then index lebs and four lebs (one for gc_lnum, one for deletions, two for journal heads) are deducted. In following situation, ubifs may get -ENOSPC from make_reservation(): LEB 84: DATAHD free 122880 used 1920 dirty 2176 dark 6144 LEB 110:DELETION free 126976 used 0 dirty 0 dark 6144 (empty) LEB 201:gc_lnum free 126976 used 0 dirty 0 dark 6144 LEB 272:GCHD free 77824 used 47672 dirty 1480 dark 6144 LEB 356:BASEHD free 0 used 39776 dirty 87200 dark 6144 OTHERS: index lebs, zero-available non-index lebs
UBIFS calculates the available bytes is 6888 (How to calculate it: 126976 * 5[remain main bytes] - 1920[used] - 47672[used] - 39776[used] - 126976 * 1[deletions] - 126976 * 1[gc_lnum] - 126976 * 2[journal heads] - 6144 * 5[dark] = 6888) after doing budget, however UBIFS cannot use BASEHD's dirty space(87200), because UBIFS cannot find next BASEHD to reclaim current BASEHD. (c->bi.min_idx_lebs equals to c->lst.idx_lebs, the empty leb won't be found by ubifs_find_free_space(), and dirty index lebs won't be picked as gced lebs. All non-index lebs has dirty space less then c->dead_wm, non-index lebs won't be picked as gced lebs either. So new free lebs won't be produced.). See more details in Link.
To fix it, reserve one leb for each journal head while doing budget.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216562 Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ubifs/budget.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/ubifs/budget.c b/fs/ubifs/budget.c index bdb79be6dc0e..9cb05ef9b9dd 100644 --- a/fs/ubifs/budget.c +++ b/fs/ubifs/budget.c @@ -212,11 +212,10 @@ long long ubifs_calc_available(const struct ubifs_info *c, int min_idx_lebs) subtract_lebs += 1;
/* - * The GC journal head LEB is not really accessible. And since - * different write types go to different heads, we may count only on - * one head's space. + * Since different write types go to different heads, we should + * reserve one leb for each head. */ - subtract_lebs += c->jhead_cnt - 1; + subtract_lebs += c->jhead_cnt;
/* We also reserve one LEB for deletions, which bypass budgeting */ subtract_lebs += 1;
From: Eric Dumazet edumazet@google.com
mainline inclusion from mainline-v6.1-rc1 commit ec7eede369fe5b0d085ac51fdbb95184f87bfc6c category: bugfix bugzilla: 187823, https://gitee.com/src-openeuler/kernel/issues/I5VZ0N CVE: CVE-2022-3521
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
syzbot found that kcm_tx_work() could crash [1] in:
/* Primarily for SOCK_SEQPACKET sockets */ if (likely(sk->sk_socket) && test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) { <<*>> clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags); sk->sk_write_space(sk); }
I think the reason is that another thread might concurrently run in kcm_release() and call sock_orphan(sk) while sk is not locked. kcm_tx_work() find sk->sk_socket being NULL.
[1] BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:86 [inline] BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline] BUG: KASAN: null-ptr-deref in kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742 Write of size 8 at addr 0000000000000008 by task kworker/u4:3/53
CPU: 0 PID: 53 Comm: kworker/u4:3 Not tainted 5.19.0-rc3-next-20220621-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: kkcmd kcm_tx_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 kasan_report+0xbe/0x1f0 mm/kasan/report.c:495 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189 instrument_atomic_write include/linux/instrumented.h:86 [inline] clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline] kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK>
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Tom Herbert tom@herbertland.com Link: https://lore.kernel.org/r/20221012133412.519394-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Baisong Zhong zhongbaisong@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/kcm/kcmsock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 56dad9565bc9..977f2de89dce 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -1838,10 +1838,10 @@ static int kcm_release(struct socket *sock) kcm = kcm_sk(sk); mux = kcm->mux;
+ lock_sock(sk); sock_orphan(sk); kfree_skb(kcm->seq_skb);
- lock_sock(sk); /* Purge queue under lock to avoid race condition with tx_work trying * to act when queue is nonempty. If tx_work runs after this point * it will just return.
From: Kuniyuki Iwashima kuniyu@amazon.com
mainline inclusion from mainline-v6.1-rc1 commit 3c52c6bb831f6335c176a0fc7214e26f43adbd11 category: bugfix bugzilla: 187825, https://gitee.com/src-openeuler/kernel/issues/I5VZ0J CVE: CVE-2022-3524
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
syzbot reported a memory leak [0] related to IPV6_ADDRFORM.
The scenario is that while one thread is converting an IPv6 socket into IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and allocates memory to inet6_sk(sk)->XXX after conversion.
Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources, which inet6_destroy_sock() should have cleaned up.
setsockopt(IPV6_ADDRFORM) setsockopt(IPV6_DSTOPTS) +-----------------------+ +----------------------+ - do_ipv6_setsockopt(sk, ...) - sockopt_lock_sock(sk) - do_ipv6_setsockopt(sk, ...) - lock_sock(sk) ^._ called via tcpv6_prot - WRITE_ONCE(sk->sk_prot, &tcp_prot) before WRITE_ONCE() - xchg(&np->opt, NULL) - txopt_put(opt) - sockopt_release_sock(sk) - release_sock(sk) - sockopt_lock_sock(sk) - lock_sock(sk) - ipv6_set_opt_hdr(sk, ...) - ipv6_update_options(sk, opt) - xchg(&inet6_sk(sk)->opt, opt) ^._ opt is never freed.
- sockopt_release_sock(sk) - release_sock(sk)
Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after acquiring the lock.
This issue exists from the initial commit between IPV6_ADDRFORM and IPV6_PKTOPTIONS.
[0]: BUG: memory leak unreferenced object 0xffff888009ab9f80 (size 96): comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s) hex dump (first 32 bytes): 01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00 ....H........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline] [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566 [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318 [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline] [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668 [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021 [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789 [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252 [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline] [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline] [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260 [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org
Conflicts: net/ipv6/ipv6_sockglue.c
Signed-off-by: Xu Jia xujia39@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/ipv6/ipv6_sockglue.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index 1ae33a882b9a..16af0c700d05 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -417,6 +417,12 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, rtnl_lock(); lock_sock(sk);
+ /* Another thread has converted the socket into IPv4 with + * IPV6_ADDRFORM concurrently. + */ + if (unlikely(sk->sk_family != AF_INET6)) + goto unlock; + switch (optname) {
case IPV6_ADDRFORM: @@ -978,6 +984,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, break; }
+unlock: release_sock(sk); if (needs_rtnl) rtnl_unlock();
From: Thadeu Lima de Souza Cascardo cascardo@canonical.com
stable inclusion from stable-v5.10.144 commit 33015556a943d6cbb18c555925a54b8c0e46f521 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WL0J CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
This reverts commit 00b136bb6254e0abf6aaafe62c4da5f6c4fea4cb.
This temporarily reverts the backport of upstream commit 1f001e9da6bbf482311e45e48f53c2bd2179e59c. It was not correct to copy the ftrace stub as it would contain a relative jump to the return thunk which would not apply to the context where it was being copied to, leading to ftrace support to be broken.
Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Lin Yujun linyujun809@huawei.com Reviewed-by: Zhang Jianhua chris.zjh@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/ftrace.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index dca5cf82144c..fbcd144260ac 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -308,7 +308,7 @@ union ftrace_op_code_union { } __attribute__((packed)); };
-#define RET_SIZE (IS_ENABLED(CONFIG_RETPOLINE) ? 5 : 1 + IS_ENABLED(CONFIG_SLS)) +#define RET_SIZE 1 + IS_ENABLED(CONFIG_SLS)
static unsigned long create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) @@ -367,10 +367,7 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
/* The trampoline ends with ret(q) */ retq = (unsigned long)ftrace_stub; - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) - memcpy(ip, text_gen_insn(JMP32_INSN_OPCODE, ip, &__x86_return_thunk), JMP32_INSN_SIZE); - else - ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE); + ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE); if (WARN_ON(ret < 0)) goto fail;
From: Peter Zijlstra peterz@infradead.org
stable inclusion from stable-v5.10.144 commit 4586df06a02049f4315c25b947c6dde2627c0d18 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WL0J CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit e52fc2cf3f662828cc0d51c4b73bed73ad275fce upstream.
Return trampoline must not use indirect branch to return; while this preserves the RSB, it is fundamentally incompatible with IBT. Instead use a retpoline like ROP gadget that defeats IBT while not unbalancing the RSB.
And since ftrace_stub is no longer a plain RET, don't use it to copy from. Since RET is a trivial instruction, poke it directly.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Josh Poimboeuf jpoimboe@redhat.com Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org [cascardo: remove ENDBR] Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com [OP: adjusted context for 5.10-stable] Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Lin Yujun linyujun809@huawei.com Reviewed-by: Zhang Jianhua chris.zjh@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/ftrace.c | 9 ++------- arch/x86/kernel/ftrace_64.S | 19 +++++++++++++++---- 2 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index fbcd144260ac..5c7583449df7 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -321,12 +321,12 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) unsigned long offset; unsigned long npages; unsigned long size; - unsigned long retq; unsigned long *ptr; void *trampoline; void *ip; /* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */ unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 }; + unsigned const char retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE }; union ftrace_op_code_union op_ptr; int ret;
@@ -364,12 +364,7 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) goto fail;
ip = trampoline + size; - - /* The trampoline ends with ret(q) */ - retq = (unsigned long)ftrace_stub; - ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE); - if (WARN_ON(ret < 0)) - goto fail; + memcpy(ip, retq, RET_SIZE);
/* No need to test direct calls on created trampolines */ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) { diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S index e3a375185a1b..5b2dabedcf66 100644 --- a/arch/x86/kernel/ftrace_64.S +++ b/arch/x86/kernel/ftrace_64.S @@ -170,7 +170,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
/* * This is weak to keep gas from relaxing the jumps. - * It is also used to copy the RET for trampolines. */ SYM_INNER_LABEL_ALIGN(ftrace_stub, SYM_L_WEAK) UNWIND_HINT_FUNC @@ -325,7 +324,7 @@ SYM_FUNC_END(ftrace_graph_caller)
SYM_CODE_START(return_to_handler) UNWIND_HINT_EMPTY - subq $24, %rsp + subq $16, %rsp
/* Save the return values */ movq %rax, (%rsp) @@ -337,7 +336,19 @@ SYM_CODE_START(return_to_handler) movq %rax, %rdi movq 8(%rsp), %rdx movq (%rsp), %rax - addq $24, %rsp - JMP_NOSPEC rdi + + addq $16, %rsp + /* + * Jump back to the old return address. This cannot be JMP_NOSPEC rdi + * since IBT would demand that contain ENDBR, which simply isn't so for + * return addresses. Use a retpoline here to keep the RSB balanced. + */ + ANNOTATE_INTRA_FUNCTION_CALL + call .Ldo_rop + int3 +.Ldo_rop: + mov %rdi, (%rsp) + UNWIND_HINT_FUNC + RET SYM_CODE_END(return_to_handler) #endif
From: Peter Zijlstra peterz@infradead.org
stable inclusion from stable-v5.10.144 commit 35371fd68807f41a4072c01c166de5425a2a47e5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WL0J CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 1f001e9da6bbf482311e45e48f53c2bd2179e59c upstream.
Use the return thunk in ftrace trampolines, if needed.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Borislav Petkov bp@suse.de [cascardo: use memcpy(text_gen_insn) as there is no __text_gen_insn] Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Lin Yujun linyujun809@huawei.com Reviewed-by: Zhang Jianhua chris.zjh@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/ftrace.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index 5c7583449df7..e760f6b5f0b4 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -308,7 +308,7 @@ union ftrace_op_code_union { } __attribute__((packed)); };
-#define RET_SIZE 1 + IS_ENABLED(CONFIG_SLS) +#define RET_SIZE (IS_ENABLED(CONFIG_RETPOLINE) ? 5 : 1 + IS_ENABLED(CONFIG_SLS))
static unsigned long create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) @@ -364,7 +364,12 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) goto fail;
ip = trampoline + size; - memcpy(ip, retq, RET_SIZE); + + /* The trampoline ends with ret(q) */ + if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) + memcpy(ip, text_gen_insn(JMP32_INSN_OPCODE, ip, &__x86_return_thunk), JMP32_INSN_SIZE); + else + memcpy(ip, retq, sizeof(retq));
/* No need to test direct calls on created trampolines */ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.11-rc1 commit b6650dab404c701d7fe08a108b746542a934da84 category: bugfix bugzilla: 187815, https://gitee.com/openeuler/kernel/issues/I5WLBD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
In case the file cannot be opened with O_NOATIME because of lack of capabilities, then clear O_NOATIME instead of failing.
Remove WARN_ON(), since it would now trigger if O_NOATIME was cleared. Noticed by Amir Goldstein.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/overlayfs/file.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index b019f27c1360..71019814c0b5 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -54,9 +54,10 @@ static struct file *ovl_open_realfile(const struct file *file, err = inode_permission(realinode, MAY_OPEN | acc_mode); if (err) { realfile = ERR_PTR(err); - } else if (!inode_owner_or_capable(realinode)) { - realfile = ERR_PTR(-EPERM); } else { + if (!inode_owner_or_capable(realinode)) + flags &= ~O_NOATIME; + realfile = open_with_fake_path(&file->f_path, flags, realinode, current_cred()); } @@ -76,12 +77,6 @@ static int ovl_change_flags(struct file *file, unsigned int flags) struct inode *inode = file_inode(file); int err;
- flags |= OVL_OPEN_FLAGS; - - /* If some flag changed that cannot be changed then something's amiss */ - if (WARN_ON((file->f_flags ^ flags) & ~OVL_SETFL_MASK)) - return -EIO; - flags &= OVL_SETFL_MASK;
if (((flags ^ file->f_flags) & O_APPEND) && IS_APPEND(inode))
From: Johannes Berg johannes.berg@intel.com
stable inclusion from stable-v5.10.148 commit 58c0306d0bcd5f541714bea8765d23111c9af68a category: bugfix bugzilla: 187817, https://gitee.com/src-openeuler/kernel/issues/I5VMMW CVE: CVE-2022-42722
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit b2d03cabe2b2e150ff5a381731ea0355459be09f upstream.
If beacon protection is active but the beacon cannot be decrypted or is otherwise malformed, we call the cfg80211 API to report this to userspace, but that uses a netdev pointer, which isn't present for P2P-Device. Fix this to call it only conditionally to ensure cfg80211 won't crash in the case of P2P-Device.
This fixes CVE-2022-42722.
Reported-by: Sönke Huster shuster@seemoo.tu-darmstadt.de Fixes: 9eaf183af741 ("mac80211: Report beacon protection failures to user space") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.orgi Signed-off-by: Dong Chenchen dongchenchen2@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/mac80211/rx.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index e991abb45f68..97a63b940482 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -1975,10 +1975,11 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx)
if (mmie_keyidx < NUM_DEFAULT_KEYS + NUM_DEFAULT_MGMT_KEYS || mmie_keyidx >= NUM_DEFAULT_KEYS + NUM_DEFAULT_MGMT_KEYS + - NUM_DEFAULT_BEACON_KEYS) { - cfg80211_rx_unprot_mlme_mgmt(rx->sdata->dev, - skb->data, - skb->len); + NUM_DEFAULT_BEACON_KEYS) { + if (rx->sdata->dev) + cfg80211_rx_unprot_mlme_mgmt(rx->sdata->dev, + skb->data, + skb->len); return RX_DROP_MONITOR; /* unexpected BIP keyidx */ }
@@ -2126,7 +2127,8 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx) /* either the frame has been decrypted or will be dropped */ status->flag |= RX_FLAG_DECRYPTED;
- if (unlikely(ieee80211_is_beacon(fc) && result == RX_DROP_UNUSABLE)) + if (unlikely(ieee80211_is_beacon(fc) && result == RX_DROP_UNUSABLE && + rx->sdata->dev)) cfg80211_rx_unprot_mlme_mgmt(rx->sdata->dev, skb->data, skb->len);
From: Kuniyuki Iwashima kuniyu@amazon.com
mainline inclusion from mainline-v6.1-rc1 commit f49cd2f4d6170d27a2c61f1fecb03d8a70c91f57 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7ZF?from=project-issue CVE: CVE-2022-3566
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=...
--------------------------------
setsockopt(IPV6_ADDRFORM) and tcp_v6_connect() change icsk->icsk_af_ops under lock_sock(), but tcp_(get|set)sockopt() read it locklessly. To avoid load/store tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads and writes.
Thanks to Eric Dumazet for providing the syzbot report:
BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect
write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0: tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240 __inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660 inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724 __sys_connect_file net/socket.c:1976 [inline] __sys_connect+0x197/0x1b0 net/socket.c:1993 __do_sys_connect net/socket.c:2003 [inline] __se_sys_connect net/socket.c:2000 [inline] __x64_sys_connect+0x3d/0x50 net/socket.c:2000 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1: tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789 sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585 __sys_setsockopt+0x212/0x2b0 net/socket.c:2252 __do_sys_setsockopt net/socket.c:2263 [inline] __se_sys_setsockopt net/socket.c:2260 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted 6.0.0-rc4-syzkaller-00331-g4ed9c1e971b1-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot syzkaller@googlegroups.com Reported-by: Eric Dumazet edumazet@google.com Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Conflicts: net/ipv4/tcp.c net/ipv6/ipv6_sockglue.c Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/ipv4/tcp.c | 10 ++++++---- net/ipv6/ipv6_sockglue.c | 3 ++- net/ipv6/tcp_ipv6.c | 6 ++++-- 3 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 4d5280780a8e..e7183cc8e998 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3448,8 +3448,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval, const struct inet_connection_sock *icsk = inet_csk(sk);
if (level != SOL_TCP) - return icsk->icsk_af_ops->setsockopt(sk, level, optname, - optval, optlen); + /* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */ + return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname, + optval, optlen); return do_tcp_setsockopt(sk, level, optname, optval, optlen); } EXPORT_SYMBOL(tcp_setsockopt); @@ -3993,8 +3994,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval, struct inet_connection_sock *icsk = inet_csk(sk);
if (level != SOL_TCP) - return icsk->icsk_af_ops->getsockopt(sk, level, optname, - optval, optlen); + /* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */ + return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname, + optval, optlen); return do_tcp_getsockopt(sk, level, optname, optval, optlen); } EXPORT_SYMBOL(tcp_getsockopt); diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index 16af0c700d05..6cfb4042994e 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -481,7 +481,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, local_bh_enable(); /* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */ WRITE_ONCE(sk->sk_prot, &tcp_prot); - icsk->icsk_af_ops = &ipv4_specific; + /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */ + WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific); sk->sk_socket->ops = &inet_stream_ops; sk->sk_family = PF_INET; tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index ef36c34e435e..0862fe07cf06 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -237,7 +237,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, sin.sin_port = usin->sin6_port; sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
- icsk->icsk_af_ops = &ipv6_mapped; + /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */ + WRITE_ONCE(icsk->icsk_af_ops, &ipv6_mapped); if (sk_is_mptcp(sk)) mptcpv6_handle_mapped(sk, true); sk->sk_backlog_rcv = tcp_v4_do_rcv; @@ -249,7 +250,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
if (err) { icsk->icsk_ext_hdr_len = exthdrlen; - icsk->icsk_af_ops = &ipv6_specific; + /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */ + WRITE_ONCE(icsk->icsk_af_ops, &ipv6_specific); if (sk_is_mptcp(sk)) mptcpv6_handle_mapped(sk, false); sk->sk_backlog_rcv = tcp_v6_do_rcv;
From: Lee Jones lee@kernel.org
stable inclusion from stable-v5.10.134 commit 2ee0cab11f6626071f8a64c7792406dabdd94c8d category: bugfix bugzilla: 187845, https://gitee.com/src-openeuler/kernel/issues/I5UDNW CVE: CVE-2022-20409
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
This issue is conceptually identical to the one fixed in 29f077d07051 ("io_uring: always use original task when preparing req identity"), so rather than reinvent the wheel, I'm shamelessly quoting the commit message from that patch - thanks Jens:
"If the ring is setup with IORING_SETUP_IOPOLL and we have more than one task doing submissions on a ring, we can up in a situation where we assign the context from the current task rather than the request originator.
Always use req->task rather than assume it's the same as current.
No upstream patch exists for this issue, as only older kernels with the non-native workers have this problem."
Cc: Jens Axboe axboe@kernel.dk Cc: Pavel Begunkov asml.silence@gmail.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: io-uring@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Fixes: 5c3462cfd123b ("io_uring: store io_identity in io_uring_task") Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 5a88260a8dd8..4e790ae69e68 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1325,7 +1325,7 @@ static void io_req_clean_work(struct io_kiocb *req) */ static bool io_identity_cow(struct io_kiocb *req) { - struct io_uring_task *tctx = current->io_uring; + struct io_uring_task *tctx = req->task->io_uring; const struct cred *creds = NULL; struct io_identity *id;
From: Kuniyuki Iwashima kuniyu@amazon.com
mainline inclusion from mainline-6.1-rc1 commit 364f997b5cfe1db0d63a390fe7c801fa2b3115f6 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7YP CVE: CVE-2022-3567
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
---------------------------
Commit 086d49058cd8 ("ipv6: annotate some data-races around sk->sk_prot") fixed some data-races around sk->sk_prot but it was not enough.
Some functions in inet6_(stream|dgram)_ops still access sk->sk_prot without lock_sock() or rtnl_lock(), so they need READ_ONCE() to avoid load tearing.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Liu Jian liujian56@huawei.com
Conflicts: net/ipv6/ipv6_sockglue.c Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/core/sock.c | 6 ++++-- net/ipv4/af_inet.c | 23 ++++++++++++++++------- net/ipv6/ipv6_sockglue.c | 4 ++-- 3 files changed, 22 insertions(+), 11 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c index bee3c320dbfe..d75b3af50864 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3223,7 +3223,8 @@ int sock_common_getsockopt(struct socket *sock, int level, int optname, { struct sock *sk = sock->sk;
- return sk->sk_prot->getsockopt(sk, level, optname, optval, optlen); + /* IPV6_ADDRFORM can change sk->sk_prot under us. */ + return READ_ONCE(sk->sk_prot)->getsockopt(sk, level, optname, optval, optlen); } EXPORT_SYMBOL(sock_common_getsockopt);
@@ -3250,7 +3251,8 @@ int sock_common_setsockopt(struct socket *sock, int level, int optname, { struct sock *sk = sock->sk;
- return sk->sk_prot->setsockopt(sk, level, optname, optval, optlen); + /* IPV6_ADDRFORM can change sk->sk_prot under us. */ + return READ_ONCE(sk->sk_prot)->setsockopt(sk, level, optname, optval, optlen); } EXPORT_SYMBOL(sock_common_setsockopt);
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 911ad595dbb9..d95233b903a9 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -559,22 +559,27 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr, int addr_len, int flags) { struct sock *sk = sock->sk; + const struct proto *prot; int err;
if (addr_len < sizeof(uaddr->sa_family)) return -EINVAL; + + /* IPV6_ADDRFORM can change sk->sk_prot under us. */ + prot = READ_ONCE(sk->sk_prot); + if (uaddr->sa_family == AF_UNSPEC) - return sk->sk_prot->disconnect(sk, flags); + return prot->disconnect(sk, flags);
if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) { - err = sk->sk_prot->pre_connect(sk, uaddr, addr_len); + err = prot->pre_connect(sk, uaddr, addr_len); if (err) return err; }
if (data_race(!inet_sk(sk)->inet_num) && inet_autobind(sk)) return -EAGAIN; - return sk->sk_prot->connect(sk, uaddr, addr_len); + return prot->connect(sk, uaddr, addr_len); } EXPORT_SYMBOL(inet_dgram_connect);
@@ -735,10 +740,11 @@ EXPORT_SYMBOL(inet_stream_connect); int inet_accept(struct socket *sock, struct socket *newsock, int flags, bool kern) { - struct sock *sk1 = sock->sk; + struct sock *sk1 = sock->sk, *sk2; int err = -EINVAL; - struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err, kern);
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */ + sk2 = READ_ONCE(sk1->sk_prot)->accept(sk1, flags, &err, kern); if (!sk2) goto do_err;
@@ -824,12 +830,15 @@ ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) { struct sock *sk = sock->sk; + const struct proto *prot;
if (unlikely(inet_send_prepare(sk))) return -EAGAIN;
- if (sk->sk_prot->sendpage) - return sk->sk_prot->sendpage(sk, page, offset, size, flags); + /* IPV6_ADDRFORM can change sk->sk_prot under us. */ + prot = READ_ONCE(sk->sk_prot); + if (prot->sendpage) + return prot->sendpage(sk, page, offset, size, flags); return sock_no_sendpage(sock, page, offset, size, flags); } EXPORT_SYMBOL(inet_sendpage); diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index 6cfb4042994e..65f83ce728fa 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -479,7 +479,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, sock_prot_inuse_add(net, sk->sk_prot, -1); sock_prot_inuse_add(net, &tcp_prot, 1); local_bh_enable(); - /* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */ + /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */ WRITE_ONCE(sk->sk_prot, &tcp_prot); /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */ WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific); @@ -495,7 +495,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, sock_prot_inuse_add(net, sk->sk_prot, -1); sock_prot_inuse_add(net, prot, 1); local_bh_enable(); - /* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */ + /* Paired with READ_ONCE(sk->sk_prot) in inet6_dgram_ops */ WRITE_ONCE(sk->sk_prot, prot); sk->sk_socket->ops = &inet_dgram_ops; sk->sk_family = PF_INET;
From: Zhihao Cheng chengzhihao1@huawei.com
mainline inclusion from mainline-v6.1-rc1 commit 669d204469c46e91d99da24914130f78277a71d3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WQRM CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[1] suggests that fastmap is suitable for large flash devices. Module parameter 'fm_autoconvert' is a coarse grained switch to enable all ubi devices to generate fastmap, which may turn on fastmap even for small flash devices.
This patch imports a new field 'disable_fm' in struct 'ubi_attach_req' to support following situations by ioctl 'UBI_IOCATT'. [old functions] A. Disable 'fm_autoconvert': Disbable fastmap for all ubi devices B. Enable 'fm_autoconvert': Enable fastmap for all ubi devices [new function] C. Enable 'fm_autoconvert', set 'disable_fm' for given device: Don't create new fastmap and do full scan (existed fastmap will be destroyed) for the given ubi device.
A simple test case in [2].
[1] http://www.linux-mtd.infradead.org/doc/ubi.html#L_fastmap [2] https://bugzilla.kernel.org/show_bug.cgi?id=216278
Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Richard Weinberger richard@nod.at Reviewed-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/mtd/ubi/build.c | 14 ++++++++++---- drivers/mtd/ubi/cdev.c | 2 +- drivers/mtd/ubi/ubi.h | 3 ++- include/uapi/mtd/ubi-user.h | 8 +++++++- 4 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c index 4153e0d15c5f..ac97f6f9bcc2 100644 --- a/drivers/mtd/ubi/build.c +++ b/drivers/mtd/ubi/build.c @@ -808,6 +808,7 @@ static int autoresize(struct ubi_device *ubi, int vol_id) * @ubi_num: number to assign to the new UBI device * @vid_hdr_offset: VID header offset * @max_beb_per1024: maximum expected number of bad PEB per 1024 PEBs + * @disable_fm: whether disable fastmap * * This function attaches MTD device @mtd_dev to UBI and assign @ubi_num number * to the newly created UBI device, unless @ubi_num is %UBI_DEV_NUM_AUTO, in @@ -815,11 +816,15 @@ static int autoresize(struct ubi_device *ubi, int vol_id) * automatically. Returns the new UBI device number in case of success and a * negative error code in case of failure. * + * If @disable_fm is true, ubi doesn't create new fastmap even the module param + * 'fm_autoconvert' is set, and existed old fastmap will be destroyed after + * doing full scanning. + * * Note, the invocations of this function has to be serialized by the * @ubi_devices_mutex. */ int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num, - int vid_hdr_offset, int max_beb_per1024) + int vid_hdr_offset, int max_beb_per1024, bool disable_fm) { struct ubi_device *ubi; int i, err; @@ -922,7 +927,7 @@ int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num, UBI_FM_MIN_POOL_SIZE);
ubi->fm_wl_pool.max_size = ubi->fm_pool.max_size / 2; - ubi->fm_disabled = !fm_autoconvert; + ubi->fm_disabled = (!fm_autoconvert || disable_fm) ? 1 : 0; if (fm_debug) ubi_enable_dbg_chk_fastmap(ubi);
@@ -963,7 +968,7 @@ int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num, if (!ubi->fm_buf) goto out_free; #endif - err = ubi_attach(ubi, 0); + err = ubi_attach(ubi, disable_fm ? 1 : 0); if (err) { ubi_err(ubi, "failed to attach mtd%d, error %d", mtd->index, err); @@ -1243,7 +1248,8 @@ static int __init ubi_init(void)
mutex_lock(&ubi_devices_mutex); err = ubi_attach_mtd_dev(mtd, p->ubi_num, - p->vid_hdr_offs, p->max_beb_per1024); + p->vid_hdr_offs, p->max_beb_per1024, + false); mutex_unlock(&ubi_devices_mutex); if (err < 0) { pr_err("UBI error: cannot attach mtd%d\n", diff --git a/drivers/mtd/ubi/cdev.c b/drivers/mtd/ubi/cdev.c index cc9a28cf9d82..0680ba82255b 100644 --- a/drivers/mtd/ubi/cdev.c +++ b/drivers/mtd/ubi/cdev.c @@ -1041,7 +1041,7 @@ static long ctrl_cdev_ioctl(struct file *file, unsigned int cmd, */ mutex_lock(&ubi_devices_mutex); err = ubi_attach_mtd_dev(mtd, req.ubi_num, req.vid_hdr_offset, - req.max_beb_per1024); + req.max_beb_per1024, !!req.disable_fm); mutex_unlock(&ubi_devices_mutex); if (err < 0) put_mtd_device(mtd); diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h index da0bee13fe7f..4f84607dafb4 100644 --- a/drivers/mtd/ubi/ubi.h +++ b/drivers/mtd/ubi/ubi.h @@ -939,7 +939,8 @@ int ubi_io_write_vid_hdr(struct ubi_device *ubi, int pnum,
/* build.c */ int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num, - int vid_hdr_offset, int max_beb_per1024); + int vid_hdr_offset, int max_beb_per1024, + bool disable_fm); int ubi_detach_mtd_dev(int ubi_num, int anyway); struct ubi_device *ubi_get_device(int ubi_num); void ubi_put_device(struct ubi_device *ubi); diff --git a/include/uapi/mtd/ubi-user.h b/include/uapi/mtd/ubi-user.h index b69e9ba6742b..dcb179de4358 100644 --- a/include/uapi/mtd/ubi-user.h +++ b/include/uapi/mtd/ubi-user.h @@ -247,6 +247,7 @@ enum { * @vid_hdr_offset: VID header offset (use defaults if %0) * @max_beb_per1024: maximum expected number of bad PEB per 1024 PEBs * @padding: reserved for future, not used, has to be zeroed + * @disable_fm: whether disable fastmap * * This data structure is used to specify MTD device UBI has to attach and the * parameters it has to use. The number which should be assigned to the new UBI @@ -281,13 +282,18 @@ enum { * eraseblocks for new bad eraseblocks, but attempts to use available * eraseblocks (if any). The accepted range is 0-768. If 0 is given, the * default kernel value of %CONFIG_MTD_UBI_BEB_LIMIT will be used. + * + * If @disable_fm is not zero, ubi doesn't create new fastmap even the module + * param 'fm_autoconvert' is set, and existed old fastmap will be destroyed + * after doing full scanning. */ struct ubi_attach_req { __s32 ubi_num; __s32 mtd_num; __s32 vid_hdr_offset; __s16 max_beb_per1024; - __s8 padding[10]; + __s8 disable_fm; + __s8 padding[9]; };
/*
From: ZhaoLong Wang wangzhaolong1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WQRM CVE: NA
--------------------------------
This patch add 'enable_fm' as 5th module init parameters to control fastmap enable or not. When the value is 0, fastmap will not create and existed fastmap will destroyed for the given ubi device. Default value is 0.
To enable or disable fastmap during module loading, fm_autoconvert must be set to non-zero.
+-----------------+---------------+---------------------------+ | \ | enable_fm=0 | enable_fm=1 | +-----------------+---------------+---------------------------+ |fm_autoconvert=Y | disable fm | enable fm | +---------------------------------+---------------------------+ |fm_autoconvert=N | disable fm | Enable fastmap. If fastmap| | | | exists in the old image, | +------------------------------------------------------------ +
Example: # - Attach mtd1 to ubi1, disable fastmap, mtd2 to ubi2, enable fastmap. # modprobe ubi mtd=1,0,0,1,0 mtd=2,0,0,2,1 fm_autoconvert=1
# - If 5th parameter is not specified, the value is 0, fastmap is disable # modprobe ubi mtd=1 fm_autoconvert=1
Signed-off-by: ZhaoLong Wang wangzhaolong1@huawei.com Reviewed-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/mtd/ubi/build.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c index ac97f6f9bcc2..62cdbf5bf562 100644 --- a/drivers/mtd/ubi/build.c +++ b/drivers/mtd/ubi/build.c @@ -35,7 +35,7 @@ #define MTD_PARAM_LEN_MAX 64
/* Maximum number of comma-separated items in the 'mtd=' parameter */ -#define MTD_PARAM_MAX_COUNT 4 +#define MTD_PARAM_MAX_COUNT 5
/* Maximum value for the number of bad PEBs per 1024 PEBs */ #define MAX_MTD_UBI_BEB_LIMIT 768 @@ -52,12 +52,14 @@ * string * @vid_hdr_offs: VID header offset * @max_beb_per1024: maximum expected number of bad PEBs per 1024 PEBs + * @enable_fm: enable fastmap when value is non-zero */ struct mtd_dev_param { char name[MTD_PARAM_LEN_MAX]; int ubi_num; int vid_hdr_offs; int max_beb_per1024; + int enable_fm; };
/* Numbers of elements set in the @mtd_dev_param array */ @@ -1249,7 +1251,7 @@ static int __init ubi_init(void) mutex_lock(&ubi_devices_mutex); err = ubi_attach_mtd_dev(mtd, p->ubi_num, p->vid_hdr_offs, p->max_beb_per1024, - false); + p->enable_fm == 0 ? true : false); mutex_unlock(&ubi_devices_mutex); if (err < 0) { pr_err("UBI error: cannot attach mtd%d\n", @@ -1429,7 +1431,7 @@ static int ubi_mtd_param_parse(const char *val, const struct kernel_param *kp) int err = kstrtoint(token, 10, &p->max_beb_per1024);
if (err) { - pr_err("UBI error: bad value for max_beb_per1024 parameter: %s", + pr_err("UBI error: bad value for max_beb_per1024 parameter: %s\n", token); return -EINVAL; } @@ -1440,13 +1442,25 @@ static int ubi_mtd_param_parse(const char *val, const struct kernel_param *kp) int err = kstrtoint(token, 10, &p->ubi_num);
if (err) { - pr_err("UBI error: bad value for ubi_num parameter: %s", + pr_err("UBI error: bad value for ubi_num parameter: %s\n", token); return -EINVAL; } } else p->ubi_num = UBI_DEV_NUM_AUTO;
+ token = tokens[4]; + if (token) { + int err = kstrtoint(token, 10, &p->enable_fm); + + if (err) { + pr_err("UBI error: bad value for enable_fm parameter: %s\n", + token); + return -EINVAL; + } + } else + p->enable_fm = 0; + mtd_devs += 1; return 0; } @@ -1459,11 +1473,13 @@ MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: mtd=<name|num|pa "Optional "max_beb_per1024" parameter specifies the maximum expected bad eraseblock per 1024 eraseblocks. (default value (" __stringify(CONFIG_MTD_UBI_BEB_LIMIT) ") if 0)\n" "Optional "ubi_num" parameter specifies UBI device number which have to be assigned to the newly created UBI device (assigned automatically by default)\n" + "Optional "enable_fm" parameter determines whether to enable fastmap during attach. If the value is non-zero, fastmap is enabled. Default value is 0.\n" "\n" "Example 1: mtd=/dev/mtd0 - attach MTD device /dev/mtd0.\n" "Example 2: mtd=content,1984 mtd=4 - attach MTD device with name "content" using VID header offset 1984, and MTD device number 4 with default VID header offset.\n" "Example 3: mtd=/dev/mtd1,0,25 - attach MTD device /dev/mtd1 using default VID header offset and reserve 25*nand_size_in_blocks/1024 erase blocks for bad block handling.\n" "Example 4: mtd=/dev/mtd1,0,0,5 - attach MTD device /dev/mtd1 to UBI 5 and using default values for the other fields.\n" + "example 5: mtd=1,0,0,5 mtd=2,0,0,6,1 - attach MTD device /dev/mtd1 to UBI 5 and disable fastmap; attach MTD device /dev/mtd2 to UBI 6 and enable fastmap(If supported).\n" "\t(e.g. if the NAND *chipset* has 4096 PEB, 100 will be reserved for this UBI device)."); #ifdef CONFIG_MTD_UBI_FASTMAP module_param(fm_autoconvert, bool, 0644);
From: Duoming Zhou duoming@zju.edu.cn
stable inclusion from stable-v6.1-rc1 commit 2568a7e0832ee30b0a351016d03062ab4e0e0a3f category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7YH CVE: CVE-2022-3565
Reference: https://nvd.nist.gov/vuln/detail/CVE-2022-3565 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The l1oip_cleanup() traverses the l1oip_ilist and calls release_card() to cleanup module and stack. However, release_card() calls del_timer() to delete the timers such as keep_tl and timeout_tl. If the timer handler is running, the del_timer() will not stop it and result in UAF bugs. One of the processes is shown below:
(cleanup routine) | (timer handler) release_card() | l1oip_timeout() ... | del_timer() | ... ... | kfree(hc) //FREE | | hc->timeout_on = 0 //USE
Fix by calling del_timer_sync() in release_card(), which makes sure the timer handlers have finished before the resources, such as l1oip and so on, have been deallocated.
What's more, the hc->workq and hc->socket_thread can kick those timers right back in. We add a bool flag to show if card is released. Then, check this flag in hc->workq and hc->socket_thread.
Fixes: 3712b42d4b1b ("Add layer1 over IP support") Signed-off-by: Duoming Zhou duoming@zju.edu.cn Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: GONG Ruiqi gongruiqi1@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/isdn/mISDN/l1oip.h | 1 + drivers/isdn/mISDN/l1oip_core.c | 13 +++++++------ 2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/isdn/mISDN/l1oip.h b/drivers/isdn/mISDN/l1oip.h index 7ea10db20e3a..48133d022812 100644 --- a/drivers/isdn/mISDN/l1oip.h +++ b/drivers/isdn/mISDN/l1oip.h @@ -59,6 +59,7 @@ struct l1oip { int bundle; /* bundle channels in one frm */ int codec; /* codec to use for transmis. */ int limit; /* limit number of bchannels */ + bool shutdown; /* if card is released */
/* timer */ struct timer_list keep_tl; diff --git a/drivers/isdn/mISDN/l1oip_core.c b/drivers/isdn/mISDN/l1oip_core.c index b57dcb834594..aec4f2a69c3b 100644 --- a/drivers/isdn/mISDN/l1oip_core.c +++ b/drivers/isdn/mISDN/l1oip_core.c @@ -275,7 +275,7 @@ l1oip_socket_send(struct l1oip *hc, u8 localcodec, u8 channel, u32 chanmask, p = frame;
/* restart timer */ - if (time_before(hc->keep_tl.expires, jiffies + 5 * HZ)) + if (time_before(hc->keep_tl.expires, jiffies + 5 * HZ) && !hc->shutdown) mod_timer(&hc->keep_tl, jiffies + L1OIP_KEEPALIVE * HZ); else hc->keep_tl.expires = jiffies + L1OIP_KEEPALIVE * HZ; @@ -601,7 +601,9 @@ l1oip_socket_parse(struct l1oip *hc, struct sockaddr_in *sin, u8 *buf, int len) goto multiframe;
/* restart timer */ - if (time_before(hc->timeout_tl.expires, jiffies + 5 * HZ) || !hc->timeout_on) { + if ((time_before(hc->timeout_tl.expires, jiffies + 5 * HZ) || + !hc->timeout_on) && + !hc->shutdown) { hc->timeout_on = 1; mod_timer(&hc->timeout_tl, jiffies + L1OIP_TIMEOUT * HZ); } else /* only adjust timer */ @@ -1232,11 +1234,10 @@ release_card(struct l1oip *hc) { int ch;
- if (timer_pending(&hc->keep_tl)) - del_timer(&hc->keep_tl); + hc->shutdown = true;
- if (timer_pending(&hc->timeout_tl)) - del_timer(&hc->timeout_tl); + del_timer_sync(&hc->keep_tl); + del_timer_sync(&hc->timeout_tl);
cancel_work_sync(&hc->workq);
From: Pavel Begunkov asml.silence@gmail.com
mainline inclusion from mainline-v6.1-rc1 commit 0091bfc81741b8d3aeb3b7ab8636f911b2de6e80 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WFKI CVE: CVE-2022-2602
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?h=...
--------------------------------
Instead of putting io_uring's registered files in unix_gc() we want it to be done by io_uring itself. The trick here is to consider io_uring registered files for cycle detection but not actually putting them down. Because io_uring can't register other ring instances, this will remove all refs to the ring file triggering the ->release path and clean up with io_ring_ctx_free().
Cc: stable@vger.kernel.org Fixes: 6b06314c47e1 ("io_uring: add file set registration") Reported-and-tested-by: David Bouman dbouman03@gmail.com Signed-off-by: Pavel Begunkov asml.silence@gmail.com Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com [axboe: add kerneldoc comment to skb, fold in skb leak fix] Signed-off-by: Jens Axboe axboe@kernel.dk Conflicts: fs/io_uring.c include/linux/skbuff.h Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/io_uring.c | 1 + include/linux/skbuff.h | 2 ++ net/unix/garbage.c | 20 ++++++++++++++++++++ 3 files changed, 23 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 4e790ae69e68..0de12d2bc29f 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7335,6 +7335,7 @@ static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset) }
skb->sk = sk; + skb->scm_io_uring = 1;
nr_files = 0; fpl->user = get_uid(ctx->user); diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 6dae09f22c5a..a610975fca7f 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -706,6 +706,7 @@ typedef unsigned char *sk_buff_data_t; * @transport_header: Transport layer header * @network_header: Network layer header * @mac_header: Link layer header + * @scm_io_uring: SKB holds io_uring registered files * @tail: Tail pointer * @end: End pointer * @head: Head of buffer @@ -866,6 +867,7 @@ struct sk_buff { #ifdef CONFIG_TLS_DEVICE __u8 decrypted:1; #endif + __u8 scm_io_uring:1;
#ifdef CONFIG_NET_SCHED __u16 tc_index; /* traffic control index */ diff --git a/net/unix/garbage.c b/net/unix/garbage.c index d45d5366115a..dc2763540393 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -204,6 +204,7 @@ void wait_for_unix_gc(void) /* The external entry point: unix_gc() */ void unix_gc(void) { + struct sk_buff *next_skb, *skb; struct unix_sock *u; struct unix_sock *next; struct sk_buff_head hitlist; @@ -297,11 +298,30 @@ void unix_gc(void)
spin_unlock(&unix_gc_lock);
+ /* We need io_uring to clean its registered files, ignore all io_uring + * originated skbs. It's fine as io_uring doesn't keep references to + * other io_uring instances and so killing all other files in the cycle + * will put all io_uring references forcing it to go through normal + * release.path eventually putting registered files. + */ + skb_queue_walk_safe(&hitlist, skb, next_skb) { + if (skb->scm_io_uring) { + __skb_unlink(skb, &hitlist); + skb_queue_tail(&skb->sk->sk_receive_queue, skb); + } + } + /* Here we are. Hitlist is filled. Die. */ __skb_queue_purge(&hitlist);
spin_lock(&unix_gc_lock);
+ /* There could be io_uring registered files, just push them back to + * the inflight list + */ + list_for_each_entry_safe(u, next, &gc_candidates, link) + list_move_tail(&u->link, &gc_inflight_list); + /* All candidates should have been detached by now. */ BUG_ON(!list_empty(&gc_candidates));
From: Zhihao Cheng chengzhihao1@huawei.com
Offering: HULK hulk inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WFKI CVE: CVE-2022-2602
--------------------------------
In order to fix CVE-2022-2602, member 'scm_io_uring' is imported into sk_buff, fix broken kabi in sk_buff.
Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: zhongbaisong zhongbaisong@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/skbuff.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index a610975fca7f..fea290135399 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -867,7 +867,6 @@ struct sk_buff { #ifdef CONFIG_TLS_DEVICE __u8 decrypted:1; #endif - __u8 scm_io_uring:1;
#ifdef CONFIG_NET_SCHED __u16 tc_index; /* traffic control index */ @@ -918,7 +917,7 @@ struct sk_buff { __u32 headers_end[0]; /* public: */
- KABI_RESERVE(1) + KABI_USE(1, __u8 scm_io_uring:1) KABI_RESERVE(2) KABI_RESERVE(3) KABI_RESERVE(4)
From: Linus Torvalds torvalds@linux-foundation.org
stable inclusion from stable-v5.10.134 commit 0adf21eec59040b31af113e626efd85eb153c728 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X57Q CVE: CVE-2022-1882
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 353f7988dd8413c47718f7ca79c030b6fb62cfe5 upstream.
When the pipe is closed, we mark the associated watchqueue defunct by calling watch_queue_clear(). However, while that is protected by the watchqueue lock, new watchqueue entries aren't actually added under that lock at all: they use the pipe->rd_wait.lock instead, and looking up that pipe happens without any locking.
The watchqueue code uses the RCU read-side section to make sure that the wqueue entry itself hasn't disappeared, but that does not protect the pipe_info in any way.
So make sure to actually hold the wqueue lock when posting watch events, properly serializing against the pipe being torn down.
Reported-by: Noam Rathaus noamr@ssd-disclosure.com Cc: Greg KH gregkh@linuxfoundation.org Cc: David Howells dhowells@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Wang Hai wanghai38@huawei.com Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/watch_queue.c | 53 +++++++++++++++++++++++++++++++------------- 1 file changed, 37 insertions(+), 16 deletions(-)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c index 249ed3259144..dcf1e676797d 100644 --- a/kernel/watch_queue.c +++ b/kernel/watch_queue.c @@ -34,6 +34,27 @@ MODULE_LICENSE("GPL"); #define WATCH_QUEUE_NOTE_SIZE 128 #define WATCH_QUEUE_NOTES_PER_PAGE (PAGE_SIZE / WATCH_QUEUE_NOTE_SIZE)
+/* + * This must be called under the RCU read-lock, which makes + * sure that the wqueue still exists. It can then take the lock, + * and check that the wqueue hasn't been destroyed, which in + * turn makes sure that the notification pipe still exists. + */ +static inline bool lock_wqueue(struct watch_queue *wqueue) +{ + spin_lock_bh(&wqueue->lock); + if (unlikely(wqueue->defunct)) { + spin_unlock_bh(&wqueue->lock); + return false; + } + return true; +} + +static inline void unlock_wqueue(struct watch_queue *wqueue) +{ + spin_unlock_bh(&wqueue->lock); +} + static void watch_queue_pipe_buf_release(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { @@ -69,6 +90,10 @@ static const struct pipe_buf_operations watch_queue_pipe_buf_ops = {
/* * Post a notification to a watch queue. + * + * Must be called with the RCU lock for reading, and the + * watch_queue lock held, which guarantees that the pipe + * hasn't been released. */ static bool post_one_notification(struct watch_queue *wqueue, struct watch_notification *n) @@ -85,9 +110,6 @@ static bool post_one_notification(struct watch_queue *wqueue,
spin_lock_irq(&pipe->rd_wait.lock);
- if (wqueue->defunct) - goto out; - mask = pipe->ring_size - 1; head = pipe->head; tail = pipe->tail; @@ -203,7 +225,10 @@ void __post_watch_notification(struct watch_list *wlist, if (security_post_notification(watch->cred, cred, n) < 0) continue;
- post_one_notification(wqueue, n); + if (lock_wqueue(wqueue)) { + post_one_notification(wqueue, n); + unlock_wqueue(wqueue);; + } }
rcu_read_unlock(); @@ -465,11 +490,12 @@ int add_watch_to_object(struct watch *watch, struct watch_list *wlist) return -EAGAIN; }
- spin_lock_bh(&wqueue->lock); - kref_get(&wqueue->usage); - kref_get(&watch->usage); - hlist_add_head(&watch->queue_node, &wqueue->watches); - spin_unlock_bh(&wqueue->lock); + if (lock_wqueue(wqueue)) { + kref_get(&wqueue->usage); + kref_get(&watch->usage); + hlist_add_head(&watch->queue_node, &wqueue->watches); + unlock_wqueue(wqueue); + }
hlist_add_head(&watch->list_node, &wlist->watchers); return 0; @@ -523,20 +549,15 @@ int remove_watch_from_object(struct watch_list *wlist, struct watch_queue *wq,
wqueue = rcu_dereference(watch->queue);
- /* We don't need the watch list lock for the next bit as RCU is - * protecting *wqueue from deallocation. - */ - if (wqueue) { + if (lock_wqueue(wqueue)) { post_one_notification(wqueue, &n.watch);
- spin_lock_bh(&wqueue->lock); - if (!hlist_unhashed(&watch->queue_node)) { hlist_del_init_rcu(&watch->queue_node); put_watch(watch); }
- spin_unlock_bh(&wqueue->lock); + unlock_wqueue(wqueue); }
if (wlist->release_watch) {
From: Linus Torvalds torvalds@linux-foundation.org
stable inclusion from stable-v5.10.134 commit bb1990a3005e3655e84f9a57b3f30ce96d597dee category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X57Q CVE: CVE-2022-1882
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 44e29e64cf1ac0cffb152e0532227ea6d002aa28 upstream.
Sedat Dilek noticed that I had an extraneous semicolon at the end of a line in the previous patch.
It's harmless, but unintentional, and while compilers just treat it as an extra empty statement, for all I know some other tooling might warn about it. So clean it up before other people notice too ;)
Fixes: 353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly") Reported-by: Sedat Dilek sedat.dilek@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Reported-by: Sedat Dilek sedat.dilek@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/watch_queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c index dcf1e676797d..e5d22af43fa0 100644 --- a/kernel/watch_queue.c +++ b/kernel/watch_queue.c @@ -227,7 +227,7 @@ void __post_watch_notification(struct watch_list *wlist,
if (lock_wqueue(wqueue)) { post_one_notification(wqueue, n); - unlock_wqueue(wqueue);; + unlock_wqueue(wqueue); } }
From: Toke Høiland-Jørgensen toke@toke.dk
stable inclusion from stable-v5.10.143 commit 2ee85ac1b29dbd2ebd2d8e5ac1dd5793235d516b category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WF14 CVE: CVE-2022-3586
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/?id=2...
--------------------------------
[ Upstream commit 9efd23297cca530bb35e1848665805d3fcdd7889 ]
The sch_sfb enqueue() routine assumes the skb is still alive after it has been enqueued into a child qdisc, using the data in the skb cb field in the increment_qlen() routine after enqueue. However, the skb may in fact have been freed, causing a use-after-free in this case. In particular, this happens if sch_cake is used as a child of sfb, and the GSO splitting mode of CAKE is enabled (in which case the skb will be split into segments and the original skb freed).
Fix this by copying the sfb cb data to the stack before enqueueing the skb, and using this stack copy in increment_qlen() instead of the skb pointer itself.
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231 Fixes: e13e02a3c68d ("net_sched: SFB flow scheduler") Signed-off-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Guo Mengqi guomengqi3@huawei.com Reviewed-by: chenweilong chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/sched/sch_sfb.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index da047a37a3bf..f180cf95cfc9 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -135,15 +135,15 @@ static void increment_one_qlen(u32 sfbhash, u32 slot, struct sfb_sched_data *q) } }
-static void increment_qlen(const struct sk_buff *skb, struct sfb_sched_data *q) +static void increment_qlen(const struct sfb_skb_cb *cb, struct sfb_sched_data *q) { u32 sfbhash;
- sfbhash = sfb_hash(skb, 0); + sfbhash = cb->hashes[0]; if (sfbhash) increment_one_qlen(sfbhash, 0, q);
- sfbhash = sfb_hash(skb, 1); + sfbhash = cb->hashes[1]; if (sfbhash) increment_one_qlen(sfbhash, 1, q); } @@ -283,6 +283,7 @@ static int sfb_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sfb_sched_data *q = qdisc_priv(sch); struct Qdisc *child = q->qdisc; struct tcf_proto *fl; + struct sfb_skb_cb cb; int i; u32 p_min = ~0; u32 minqlen = ~0; @@ -399,11 +400,12 @@ static int sfb_enqueue(struct sk_buff *skb, struct Qdisc *sch, }
enqueue: + memcpy(&cb, sfb_skb_cb(skb), sizeof(cb)); ret = qdisc_enqueue(skb, child, to_free); if (likely(ret == NET_XMIT_SUCCESS)) { qdisc_qstats_backlog_inc(sch, skb); sch->q.qlen++; - increment_qlen(skb, q); + increment_qlen(&cb, q); } else if (net_xmit_drop_count(ret)) { q->stats.childdrop++; qdisc_qstats_drop(sch);
From: Toke Høiland-Jørgensen toke@toke.dk
stable inclusion from stable-v5.10.143 commit 2ead78fbe6b523e6232ad286e3c13d2a410de22a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WF14 CVE: CVE-2022-3586
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/?id=2...
--------------------------------
[ Upstream commit 2f09707d0c972120bf794cfe0f0c67e2c2ddb252 ]
Cong Wang noticed that the previous fix for sch_sfb accessing the queued skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue function was also calling qdisc_qstats_backlog_inc() after enqueue, which reads the pkt len from the skb cb field. Fix this by also storing the skb len, and using the stored value to increment the backlog after enqueueing.
Fixes: 9efd23297cca ("sch_sfb: Don't assume the skb is still around after enqueueing to child") Signed-off-by: Toke Høiland-Jørgensen toke@toke.dk Acked-by: Cong Wang cong.wang@bytedance.com Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Guo Mengqi guomengqi3@huawei.com Reviewed-by: chenweilong chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/sched/sch_sfb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index f180cf95cfc9..b2724057629f 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -281,6 +281,7 @@ static int sfb_enqueue(struct sk_buff *skb, struct Qdisc *sch, {
struct sfb_sched_data *q = qdisc_priv(sch); + unsigned int len = qdisc_pkt_len(skb); struct Qdisc *child = q->qdisc; struct tcf_proto *fl; struct sfb_skb_cb cb; @@ -403,7 +404,7 @@ static int sfb_enqueue(struct sk_buff *skb, struct Qdisc *sch, memcpy(&cb, sfb_skb_cb(skb), sizeof(cb)); ret = qdisc_enqueue(skb, child, to_free); if (likely(ret == NET_XMIT_SUCCESS)) { - qdisc_qstats_backlog_inc(sch, skb); + sch->qstats.backlog += len; sch->q.qlen++; increment_qlen(&cb, q); } else if (net_xmit_drop_count(ret)) {
From: Xu Kuohai xukuohai@huawei.com
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7AT CVE: CVE-2022-3534
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=...
--------------------------------
ASAN reports an use-after-free in btf_dump_name_dups:
ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928 READ of size 2 at 0xffff927006db thread T0 #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614) #1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127 #2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143 #3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212 #4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525 #5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552 #6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567 #7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912 #8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798 #9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282 #10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236 #11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #15 0xaaaab5d65990 (test_progs+0x185990)
0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0) freed by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 #6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032 #7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232 #8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #12 0xaaaab5d65990 (test_progs+0x185990)
previously allocated by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 #6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070 #7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102 #8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162 #9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #13 0xaaaab5d65990 (test_progs+0x185990)
The reason is that the key stored in hash table name_map is a string address, and the string memory is allocated by realloc() function, when the memory is resized by realloc() later, the old memory may be freed, so the address stored in name_map references to a freed memory, causing use-after-free.
Fix it by storing duplicated string address in name_map.
Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Xu Kuohai xukuohai@huawei.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Martin KaFai Lau martin.lau@kernel.org Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com Reviewed-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/lib/bpf/btf_dump.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index bd22853be4a6..9ab0964475c1 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -188,6 +188,17 @@ static int btf_dump_resize(struct btf_dump *d) return 0; }
+static void btf_dump_free_names(struct hashmap *map) +{ + size_t bkt; + struct hashmap_entry *cur; + + hashmap__for_each_entry(map, cur, bkt) + free((void *)cur->key); + + hashmap__free(map); +} + void btf_dump__free(struct btf_dump *d) { int i; @@ -206,8 +217,8 @@ void btf_dump__free(struct btf_dump *d) free(d->cached_names); free(d->emit_queue); free(d->decl_stack); - hashmap__free(d->type_names); - hashmap__free(d->ident_names); + btf_dump_free_names(d->type_names); + btf_dump_free_names(d->ident_names);
free(d); } @@ -1392,11 +1403,24 @@ static void btf_dump_emit_type_chain(struct btf_dump *d, static size_t btf_dump_name_dups(struct btf_dump *d, struct hashmap *name_map, const char *orig_name) { + int err; + char *old_name; + char *new_name; size_t dup_cnt = 0;
+ new_name = strdup(orig_name); + if (!new_name) + return 1; + hashmap__find(name_map, orig_name, (void **)&dup_cnt); dup_cnt++; - hashmap__set(name_map, orig_name, (void *)dup_cnt, NULL, NULL); + + err = hashmap__set(name_map, new_name, (void *)dup_cnt, + (const void **)&old_name, NULL); + if (err) + free(new_name); + + free(old_name);
return dup_cnt; }
From: Al Viro viro@zeniv.linux.org.uk
mainline inclusion from mainline-v6.1-rc1 commit 9d64d2405f7d30d49818f6682acd0392348f0fdb category: bugfix bugzilla: 187857, https://gitee.com/src-openeuler/kernel/issues/I5X9DT CVE: CVE-2022-3577
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
So far we relied on .put_super = binderfs_put_super() to destroy info we stashed in sb->s_fs_info. This gave us the required ordering between ->evict_inode() and sb->s_fs_info destruction.
But the current implementation of binderfs_fill_super() has a memory leak in the rare circumstance that d_make_root() fails because ->put_super() is only called when sb->s_root is initialized. Fix this by removing ->put_super() and simply do all that work in binderfs_kill_super().
Reported-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Link: https://lore.kernel.org/r/20220823095339.853371-1-brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Li Huafei lihuafei1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/android/binderfs.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c index 7b4f154f07e6..a3beb685748c 100644 --- a/drivers/android/binderfs.c +++ b/drivers/android/binderfs.c @@ -330,22 +330,10 @@ static int binderfs_show_options(struct seq_file *seq, struct dentry *root) return 0; }
-static void binderfs_put_super(struct super_block *sb) -{ - struct binderfs_info *info = sb->s_fs_info; - - if (info && info->ipc_ns) - put_ipc_ns(info->ipc_ns); - - kfree(info); - sb->s_fs_info = NULL; -} - static const struct super_operations binderfs_super_ops = { .evict_inode = binderfs_evict_inode, .show_options = binderfs_show_options, .statfs = simple_statfs, - .put_super = binderfs_put_super, };
static inline bool is_binderfs_control_device(const struct dentry *dentry) @@ -763,11 +751,27 @@ static int binderfs_init_fs_context(struct fs_context *fc) return 0; }
+static void binderfs_kill_super(struct super_block *sb) +{ + struct binderfs_info *info = sb->s_fs_info; + + /* + * During inode eviction struct binderfs_info is needed. + * So first wipe the super_block then free struct binderfs_info. + */ + kill_litter_super(sb); + + if (info && info->ipc_ns) + put_ipc_ns(info->ipc_ns); + + kfree(info); +} + static struct file_system_type binder_fs_type = { .name = "binder", .init_fs_context = binderfs_init_fs_context, .parameters = binderfs_fs_parameters, - .kill_sb = kill_litter_super, + .kill_sb = binderfs_kill_super, .fs_flags = FS_USERNS_MOUNT, };
From: Dongliang Mu mudongliangabcd@gmail.com
mainline inclusion from mainline-v6.1-rc1 commit 945a9a8e448b65bec055d37eba58f711b39f66f0 category: bugfix bugzilla: 187857, https://gitee.com/src-openeuler/kernel/issues/I5X9DT CVE: CVE-2022-3577
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The error handling code in pvr2_hdw_create forgets to unregister the v4l2 device. When pvr2_hdw_create returns back to pvr2_context_create, it calls pvr2_context_destroy to destroy context, but mp->hdw is NULL, which leads to that pvr2_hdw_destroy directly returns.
Fix this by adding v4l2_device_unregister to decrease the refcount of usb interface.
Reported-by: syzbot+77b432d57c4791183ed4@syzkaller.appspotmail.com Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Li Huafei lihuafei1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/media/usb/pvrusb2/pvrusb2-hdw.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c index fccd1798445d..d22ce328a279 100644 --- a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c +++ b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c @@ -2610,6 +2610,7 @@ struct pvr2_hdw *pvr2_hdw_create(struct usb_interface *intf, del_timer_sync(&hdw->encoder_run_timer); del_timer_sync(&hdw->encoder_wait_timer); flush_work(&hdw->workpoll); + v4l2_device_unregister(&hdw->v4l2_dev); usb_free_urb(hdw->ctl_read_urb); usb_free_urb(hdw->ctl_write_urb); kfree(hdw->ctl_read_buffer);
From: Maxim Mikityanskiy maxtram95@gmail.com
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7XW CVE: CVE-2022-3564
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git...
--------------------------------
Fix the race condition between the following two flows that run in parallel:
1. l2cap_reassemble_sdu -> chan->ops->recv (l2cap_sock_recv_cb) -> __sock_queue_rcv_skb.
2. bt_sock_recvmsg -> skb_recv_datagram, skb_free_datagram.
An SKB can be queued by the first flow and immediately dequeued and freed by the second flow, therefore the callers of l2cap_reassemble_sdu can't use the SKB after that function returns. However, some places continue accessing struct l2cap_ctrl that resides in the SKB's CB for a short time after l2cap_reassemble_sdu returns, leading to a use-after-free condition (the stack trace is below, line numbers for kernel 5.19.8).
Fix it by keeping a local copy of struct l2cap_ctrl.
BUG: KASAN: use-after-free in l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth Read of size 1 at addr ffff88812025f2f0 by task kworker/u17:3/43169
Workqueue: hci0 hci_rx_work [bluetooth] Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) print_report.cold (mm/kasan/report.c:314 mm/kasan/report.c:429) ? l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493) ? l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth l2cap_rx (net/bluetooth/l2cap_core.c:7236 net/bluetooth/l2cap_core.c:7271) bluetooth ret_from_fork (arch/x86/entry/entry_64.S:306) </TASK>
Allocated by task 43169: kasan_save_stack (mm/kasan/common.c:39) __kasan_slab_alloc (mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469) kmem_cache_alloc_node (mm/slab.h:750 mm/slub.c:3243 mm/slub.c:3293) __alloc_skb (net/core/skbuff.c:414) l2cap_recv_frag (./include/net/bluetooth/bluetooth.h:425 net/bluetooth/l2cap_core.c:8329) bluetooth l2cap_recv_acldata (net/bluetooth/l2cap_core.c:8442) bluetooth hci_rx_work (net/bluetooth/hci_core.c:3642 net/bluetooth/hci_core.c:3832) bluetooth process_one_work (kernel/workqueue.c:2289) worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2437) kthread (kernel/kthread.c:376) ret_from_fork (arch/x86/entry/entry_64.S:306)
Freed by task 27920: kasan_save_stack (mm/kasan/common.c:39) kasan_set_track (mm/kasan/common.c:45) kasan_set_free_info (mm/kasan/generic.c:372) ____kasan_slab_free (mm/kasan/common.c:368 mm/kasan/common.c:328) slab_free_freelist_hook (mm/slub.c:1780) kmem_cache_free (mm/slub.c:3536 mm/slub.c:3553) skb_free_datagram (./include/net/sock.h:1578 ./include/net/sock.h:1639 net/core/datagram.c:323) bt_sock_recvmsg (net/bluetooth/af_bluetooth.c:295) bluetooth l2cap_sock_recvmsg (net/bluetooth/l2cap_sock.c:1212) bluetooth sock_read_iter (net/socket.c:1087) new_sync_read (./include/linux/fs.h:2052 fs/read_write.c:401) vfs_read (fs/read_write.c:482) ksys_read (fs/read_write.c:620) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
Link: https://lore.kernel.org/linux-bluetooth/CAKErNvoqga1WcmoR3-0875esY6TVWFQDand... Fixes: d2a7ac5d5d3a ("Bluetooth: Add the ERTM receive state machine") Fixes: 4b51dae96731 ("Bluetooth: Add streaming mode receive and incoming packet classifier") Signed-off-by: Maxim Mikityanskiy maxtram95@gmail.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com
conflicts: net/bluetooth/l2cap_core.c
Signed-off-by: Xia Longlong xialonglong1@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/bluetooth/l2cap_core.c | 48 ++++++++++++++++++++++++++++++++------ 1 file changed, 41 insertions(+), 7 deletions(-)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 2557cd917f5e..e4464a35b419 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -6832,6 +6832,7 @@ static int l2cap_rx_state_recv(struct l2cap_chan *chan, struct l2cap_ctrl *control, struct sk_buff *skb, u8 event) { + struct l2cap_ctrl local_control; int err = 0; bool skb_in_use = false;
@@ -6856,15 +6857,32 @@ static int l2cap_rx_state_recv(struct l2cap_chan *chan, chan->buffer_seq = chan->expected_tx_seq; skb_in_use = true;
+ /* l2cap_reassemble_sdu may free skb, hence invalidate + * control, so make a copy in advance to use it after + * l2cap_reassemble_sdu returns and to avoid the race + * condition, for example: + * + * The current thread calls: + * l2cap_reassemble_sdu + * chan->ops->recv == l2cap_sock_recv_cb + * __sock_queue_rcv_skb + * Another thread calls: + * bt_sock_recvmsg + * skb_recv_datagram + * skb_free_datagram + * Then the current thread tries to access control, but + * it was freed by skb_free_datagram. + */ + local_control = *control; err = l2cap_reassemble_sdu(chan, skb, control); if (err) break;
- if (control->final) { + if (local_control.final) { if (!test_and_clear_bit(CONN_REJ_ACT, &chan->conn_state)) { - control->final = 0; - l2cap_retransmit_all(chan, control); + local_control.final = 0; + l2cap_retransmit_all(chan, &local_control); l2cap_ertm_send(chan); } } @@ -7244,11 +7262,27 @@ static int l2cap_rx(struct l2cap_chan *chan, struct l2cap_ctrl *control, static int l2cap_stream_rx(struct l2cap_chan *chan, struct l2cap_ctrl *control, struct sk_buff *skb) { + /* l2cap_reassemble_sdu may free skb, hence invalidate control, so store + * the txseq field in advance to use it after l2cap_reassemble_sdu + * returns and to avoid the race condition, for example: + * + * The current thread calls: + * l2cap_reassemble_sdu + * chan->ops->recv == l2cap_sock_recv_cb + * __sock_queue_rcv_skb + * Another thread calls: + * bt_sock_recvmsg + * skb_recv_datagram + * skb_free_datagram + * Then the current thread tries to access control, but it was freed by + * skb_free_datagram. + */ + u16 txseq = control->txseq; + BT_DBG("chan %p, control %p, skb %p, state %d", chan, control, skb, chan->rx_state);
- if (l2cap_classify_txseq(chan, control->txseq) == - L2CAP_TXSEQ_EXPECTED) { + if (l2cap_classify_txseq(chan, txseq) == L2CAP_TXSEQ_EXPECTED) { l2cap_pass_to_tx(chan, control);
BT_DBG("buffer_seq %d->%d", chan->buffer_seq, @@ -7271,8 +7305,8 @@ static int l2cap_stream_rx(struct l2cap_chan *chan, struct l2cap_ctrl *control, } }
- chan->last_acked_seq = control->txseq; - chan->expected_tx_seq = __next_seq(chan, control->txseq); + chan->last_acked_seq = txseq; + chan->expected_tx_seq = __next_seq(chan, txseq);
return 0; }
From: Jialiang Wang wangjialiang0806@163.com
mainline inclusion from mainline-v6.0-rc1 commit 02e1a114fdb71e59ee6770294166c30d437bf86a category: bugfix bugzilla: 187867, https://gitee.com/src-openeuler/kernel/issues/I5W7B5 CVE: CVE-2022-3545
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git/comm...
--------------------------------
area_cache_get() is used to distribute cache->area and set cache->id, and if cache->id is not 0 and cache->area->kref refcount is 0, it will release the cache->area by nfp_cpp_area_release(). area_cache_get() set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is is already set but the refcount is not increased as expected. At this time, calling the nfp_cpp_area_release() will cause use-after-free.
To avoid the use-after-free, set cache->id after area_init() and nfp_cpp_area_acquire() complete successfully.
Note: This vulnerability is triggerable by providing emulated device equipped with specified configuration.
BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760) Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1
Call Trace: <TASK> nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760) area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)
Allocated by task 1: nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303) nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802) nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230) nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215) nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)
Freed by task 1: kfree (mm/slub.c:4562) area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873) nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973) nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)
Signed-off-by: Jialiang Wang wangjialiang0806@163.com Reviewed-by: Yinjun Zhang yinjun.zhang@corigine.com Acked-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Yuyao Lin linyuyao1@huawei.com Reviewed-by: Wei Li liwei391@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c index 6ef48eb3a77d..b163489489e9 100644 --- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c +++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c @@ -874,7 +874,6 @@ area_cache_get(struct nfp_cpp *cpp, u32 id, }
/* Adjust the start address to be cache size aligned */ - cache->id = id; cache->addr = addr & ~(u64)(cache->size - 1);
/* Re-init to the new ID and address */ @@ -894,6 +893,8 @@ area_cache_get(struct nfp_cpp *cpp, u32 id, return NULL; }
+ cache->id = id; + exit: /* Adjust offset */ *offset = addr - cache->addr;
From: Andrew Gaul gaul@gaul.org
mainline inclusion from mainline-v6.1-rc1 commit 93e2be344a7db169b7119de21ac1bf253b8c6907 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WFKR CVE: CVE-2022-3594
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=...
--------------------------------
My system shows almost 10 million of these messages over a 24-hour period which pollutes my logs.
Signed-off-by: Andrew Gaul gaul@google.com Link: https://lore.kernel.org/r/20221002034128.2026653-1-gaul@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Lu Wei luwei32@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/net/usb/r8152.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 842f16815395..eaff8f83b5ac 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -1689,7 +1689,9 @@ static void intr_callback(struct urb *urb) "Stop submitting intr, status %d\n", status); return; case -EOVERFLOW: - netif_info(tp, intr, tp->netdev, "intr status -EOVERFLOW\n"); + if (net_ratelimit()) + netif_info(tp, intr, tp->netdev, + "intr status -EOVERFLOW\n"); goto resubmit; /* -EPIPE: should clear the halt */ default:
From: Fedor Pchelkin pchelkin@ispras.ru
stable inclusion from stable-v5.10.138 commit a220ff343396bae8d3b6abee72ab51f1f34b3027 category: bugfix bugzilla: 187869, https://gitee.com/src-openeuler/kernel/issues/I5X3MU CVE: CVE-2022-3633
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 8c21c54a53ab21842f5050fa090f26b03c0313d6 upstream.
We need to drop skb references taken in j1939_session_skb_queue() when destroying a session in j1939_session_destroy(). Otherwise those skbs would be lost.
Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
V1: https://lore.kernel.org/all/20220708175949.539064-1-pchelkin@ispras.ru
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Suggested-by: Oleksij Rempel o.rempel@pengutronix.de Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Alexey Khoroshilov khoroshilov@ispras.ru Acked-by: Oleksij Rempel o.rempel@pengutronix.de Link: https://lore.kernel.org/all/20220805150216.66313-1-pchelkin@ispras.ru Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Dong Chenchen dongchenchen2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- net/can/j1939/transport.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c index 9c39b0f5d6e0..2830a12a4dd1 100644 --- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -260,6 +260,8 @@ static void __j1939_session_drop(struct j1939_session *session)
static void j1939_session_destroy(struct j1939_session *session) { + struct sk_buff *skb; + if (session->err) j1939_sk_errqueue(session, J1939_ERRQUEUE_ABORT); else @@ -270,7 +272,11 @@ static void j1939_session_destroy(struct j1939_session *session) WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry)); WARN_ON_ONCE(!list_empty(&session->active_session_list_entry));
- skb_queue_purge(&session->skb_queue); + while ((skb = skb_dequeue(&session->skb_queue)) != NULL) { + /* drop ref taken in j1939_session_skb_queue() */ + skb_unref(skb); + kfree_skb(skb); + } __j1939_session_drop(session); j1939_priv_put(session->priv); kfree(session);