From: Dave Chinner dchinner@redhat.com
mainline inclusion from mainline-v5.19-rc1 commit c230a4a85bcdbfc1a7415deec6caf04e8fca1301 category: bugfix bugzilla: 187372, https://gitee.com/openeuler/kernel/issues/I5K0OM CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Ever since we added shadown format buffers to the log items, log items need to handle the item being released with shadow buffers attached. Due to the fact this requirement was added at the same time we added new rmap/reflink intents, we missed the cleanup of those items.
In theory, this means shadow buffers can be leaked in a very small window when a shutdown is initiated. Testing with KASAN shows this leak does not happen in practice - we haven't identified a single leak in several years of shutdown testing since ~v4.8 kernels.
However, the intent whiteout cleanup mechanism results in every cancelled intent in exactly the same state as this tiny race window creates and so if intents down clean up shadow buffers on final release we will leak the shadow buffer for just about every intent we create.
Hence we start with this patch to close this condition off and ensure that when whiteouts start to be used we don't leak lots of memory.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Allison Henderson allison.henderson@oracle.com Signed-off-by: Dave Chinner david@fromorbit.com
conflicts: fs/xfs/xfs_bmap_item.c fs/xfs/xfs_icreate_item.c fs/xfs/xfs_refcount_item.c fs/xfs/xfs_rmap_item.c
Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- fs/xfs/xfs_bmap_item.c | 2 ++ fs/xfs/xfs_icreate_item.c | 1 + fs/xfs/xfs_refcount_item.c | 2 ++ fs/xfs/xfs_rmap_item.c | 2 ++ 4 files changed, 7 insertions(+)
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 44ec0f2d5253..e6de8081451f 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -40,6 +40,7 @@ STATIC void xfs_bui_item_free( struct xfs_bui_log_item *buip) { + kmem_free(buip->bui_item.li_lv_shadow); kmem_cache_free(xfs_bui_zone, buip); }
@@ -199,6 +200,7 @@ xfs_bud_item_release( struct xfs_bud_log_item *budp = BUD_ITEM(lip);
xfs_bui_release(budp->bud_buip); + kmem_free(budp->bud_item.li_lv_shadow); kmem_cache_free(xfs_bud_zone, budp); }
diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c index 9b3994b9c716..aa8c7c261d24 100644 --- a/fs/xfs/xfs_icreate_item.c +++ b/fs/xfs/xfs_icreate_item.c @@ -63,6 +63,7 @@ STATIC void xfs_icreate_item_release( struct xfs_log_item *lip) { + kmem_free(ICR_ITEM(lip)->ic_item.li_lv_shadow); kmem_cache_free(xfs_icreate_zone, ICR_ITEM(lip)); }
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index 0dee316283a9..9f4ff45c7a93 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -35,6 +35,7 @@ STATIC void xfs_cui_item_free( struct xfs_cui_log_item *cuip) { + kmem_free(cuip->cui_item.li_lv_shadow); if (cuip->cui_format.cui_nextents > XFS_CUI_MAX_FAST_EXTENTS) kmem_free(cuip); else @@ -204,6 +205,7 @@ xfs_cud_item_release( struct xfs_cud_log_item *cudp = CUD_ITEM(lip);
xfs_cui_release(cudp->cud_cuip); + kmem_free(cudp->cud_item.li_lv_shadow); kmem_cache_free(xfs_cud_zone, cudp); }
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 20905953fe76..b5447ac7cb9b 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -35,6 +35,7 @@ STATIC void xfs_rui_item_free( struct xfs_rui_log_item *ruip) { + kmem_free(ruip->rui_item.li_lv_shadow); if (ruip->rui_format.rui_nextents > XFS_RUI_MAX_FAST_EXTENTS) kmem_free(ruip); else @@ -227,6 +228,7 @@ xfs_rud_item_release( struct xfs_rud_log_item *rudp = RUD_ITEM(lip);
xfs_rui_release(rudp->rud_ruip); + kmem_free(rudp->rud_item.li_lv_shadow); kmem_cache_free(xfs_rud_zone, rudp); }
From: Baokun Li libaokun1@huawei.com
stable inclusion from stable-v5.10.150 commit f34ab95162763cd7352f46df169296eec28b688d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6BJ5F CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit f9c1f248607d5546075d3f731e7607d5571f2b60 upstream.
I caught a null-ptr-deref bug as follows:
================================================================== KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f] CPU: 1 PID: 1589 Comm: umount Not tainted 5.10.0-02219-dirty #339 RIP: 0010:ext4_write_info+0x53/0x1b0 [...] Call Trace: dquot_writeback_dquots+0x341/0x9a0 ext4_sync_fs+0x19e/0x800 __sync_filesystem+0x83/0x100 sync_filesystem+0x89/0xf0 generic_shutdown_super+0x79/0x3e0 kill_block_super+0xa1/0x110 deactivate_locked_super+0xac/0x130 deactivate_super+0xb6/0xd0 cleanup_mnt+0x289/0x400 __cleanup_mnt+0x16/0x20 task_work_run+0x11c/0x1c0 exit_to_user_mode_prepare+0x203/0x210 syscall_exit_to_user_mode+0x5b/0x3a0 do_syscall_64+0x59/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ==================================================================
Above issue may happen as follows: ------------------------------------- exit_to_user_mode_prepare task_work_run __cleanup_mnt cleanup_mnt deactivate_super deactivate_locked_super kill_block_super generic_shutdown_super shrink_dcache_for_umount dentry = sb->s_root sb->s_root = NULL <--- Here set NULL sync_filesystem __sync_filesystem sb->s_op->sync_fs > ext4_sync_fs dquot_writeback_dquots sb->dq_op->write_info > ext4_write_info ext4_journal_start(d_inode(sb->s_root), EXT4_HT_QUOTA, 2) d_inode(sb->s_root) s_root->d_inode <--- Null pointer dereference
To solve this problem, we use ext4_journal_start_sb directly to avoid s_root being used.
Cc: stable@kernel.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20220805123947.565152-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- fs/ext4/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 5b98734f9f3d..dd3b72ba67e8 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -6370,7 +6370,7 @@ static int ext4_write_info(struct super_block *sb, int type) handle_t *handle;
/* Data block + inode block */ - handle = ext4_journal_start(d_inode(sb->s_root), EXT4_HT_QUOTA, 2); + handle = ext4_journal_start_sb(sb, EXT4_HT_QUOTA, 2); if (IS_ERR(handle)) return PTR_ERR(handle); ret = dquot_commit_info(sb, type);
From: Baokun Li libaokun1@huawei.com
stable inclusion from stable-v5.10.163 commit 7223d5e75f26352354ea2c0ccf8b579821b52adf category: bugfix bugzilla: 187904,https://gitee.com/openeuler/kernel/issues/I6BJAR CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit a71248b1accb2b42e4980afef4fa4a27fa0e36f5 upstream.
I caught a issue as follows:
================================================================== BUG: KASAN: use-after-free in __list_add_valid+0x28/0x1a0 Read of size 8 at addr ffff88814b13f378 by task mount/710
CPU: 1 PID: 710 Comm: mount Not tainted 6.1.0-rc3-next #370 Call Trace: <TASK> dump_stack_lvl+0x73/0x9f print_report+0x25d/0x759 kasan_report+0xc0/0x120 __asan_load8+0x99/0x140 __list_add_valid+0x28/0x1a0 ext4_orphan_cleanup+0x564/0x9d0 [ext4] __ext4_fill_super+0x48e2/0x5300 [ext4] ext4_fill_super+0x19f/0x3a0 [ext4] get_tree_bdev+0x27b/0x450 ext4_get_tree+0x19/0x30 [ext4] vfs_get_tree+0x49/0x150 path_mount+0xaae/0x1350 do_mount+0xe2/0x110 __x64_sys_mount+0xf0/0x190 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> [...] ==================================================================
Above issue may happen as follows: ------------------------------------- ext4_fill_super ext4_orphan_cleanup --- loop1: assume last_orphan is 12 --- list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan) ext4_truncate --> return 0 ext4_inode_attach_jinode --> return -ENOMEM iput(inode) --> free inode<12> --- loop2: last_orphan is still 12 --- list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan); // use inode<12> and trigger UAF
To solve this issue, we need to propagate the return value of ext4_inode_attach_jinode() appropriately.
Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20221102080633.1630225-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- fs/ext4/inode.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index e7eb5a5c5876..9034528a3fd0 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4217,7 +4217,8 @@ int ext4_truncate(struct inode *inode)
/* If we zero-out tail of the page, we have to create jinode for jbd2 */ if (inode->i_size & (inode->i_sb->s_blocksize - 1)) { - if (ext4_inode_attach_jinode(inode) < 0) + err = ext4_inode_attach_jinode(inode); + if (err) goto out_trace; }
From: Herbert Xu herbert@gondor.apana.org.au
stable inclusion from stable-v5.10.164 commit 6c9e2c11c33c35563d34d12b343d43b5c12200b5 category: bugfix bugzilla: 188291, https://gitee.com/src-openeuler/kernel/issues/I6B1V2 CVE: CVE-2023-0394
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit cb3e9864cdbe35ff6378966660edbcbac955fe17 upstream.
The total cork length created by ip6_append_data includes extension headers, so we must exclude them when comparing them against the IPV6_CHECKSUM offset which does not include extension headers.
Reported-by: Kyle Zeng zengyhkyle@gmail.com Fixes: 357b40a18b04 ("[IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory") Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Lu Wei luwei32@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- net/ipv6/raw.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 31eb54e92b3f..110254f44a46 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -539,6 +539,7 @@ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6, struct raw6_sock *rp) { + struct ipv6_txoptions *opt; struct sk_buff *skb; int err = 0; int offset; @@ -556,6 +557,9 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
offset = rp->offset; total_len = inet_sk(sk)->cork.base.length; + opt = inet6_sk(sk)->cork.opt; + total_len -= opt ? opt->opt_flen : 0; + if (offset >= total_len - 1) { err = -EINVAL; ip6_flush_pending_frames(sk);
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
mainline inclusion from mainline-v6.0-rc1 commit 68aaee147e597b495622b7c9038e5922c7c61f57 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6ADCF CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
syzbot is reporting GFP_KERNEL allocation with oom_lock held when reporting memcg OOM [1]. If this allocation triggers the global OOM situation then the system can livelock because the GFP_KERNEL allocation with oom_lock held cannot trigger the global OOM killer because __alloc_pages_may_oom() fails to hold oom_lock.
Fix this problem by removing the allocation from memory_stat_format() completely, and pass static buffer when calling from memcg OOM path.
Note that the caller holding filesystem lock was the trigger for syzbot to report this locking dependency. Doing GFP_KERNEL allocation with filesystem lock held can deadlock the system even without involving OOM situation.
Link: https://syzkaller.appspot.com/bug?extid=2d2aeadc6ce1e1f11d45 [1] Link: https://lkml.kernel.org/r/86afb39f-8c65-bec2-6cfc-c5e3cd600c0b@I-love.SAKURA... Fixes: c8713d0b23123759 ("mm: memcontrol: dump memory.stat during cgroup OOM") Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot syzbot+2d2aeadc6ce1e1f11d45@syzkaller.appspotmail.com Suggested-by: Michal Hocko mhocko@suse.com Acked-by: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Roman Gushchin roman.gushchin@linux.dev Cc: Shakeel Butt shakeelb@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org
conflicts: mm/memcontrol.c
Signed-off-by: Cai Xinchen caixinchen1@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- mm/memcontrol.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index fc69ceceeea9..4a069f9112e8 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1496,14 +1496,12 @@ static int __init memory_stats_init(void) } pure_initcall(memory_stats_init);
-static char *memory_stat_format(struct mem_cgroup *memcg) +static void memory_stat_format(struct mem_cgroup *memcg, char *buf, int bufsize) { struct seq_buf s; int i;
- seq_buf_init(&s, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE); - if (!s.buffer) - return NULL; + seq_buf_init(&s, buf, bufsize);
/* * Provide statistics on the state of the memory subsystem as @@ -1563,8 +1561,6 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
/* The above should easily fit into one page */ WARN_ON_ONCE(seq_buf_has_overflowed(&s)); - - return s.buffer; }
#define K(x) ((x) << (PAGE_SHIFT-10)) @@ -1600,7 +1596,10 @@ void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct * */ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) { - char *buf; + /* Use static buffer, for the caller is holding oom_lock. */ + static char buf[PAGE_SIZE]; + + lockdep_assert_held(&oom_lock);
pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n", K((u64)page_counter_read(&memcg->memory)), @@ -1621,11 +1620,8 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) pr_info("Memory cgroup stats for "); pr_cont_cgroup_path(memcg->css.cgroup); pr_cont(":"); - buf = memory_stat_format(memcg); - if (!buf) - return; + memory_stat_format(memcg, buf, sizeof(buf)); pr_info("%s", buf); - kfree(buf);
mem_cgroup_print_memfs_info(memcg, NULL); } @@ -6639,11 +6635,11 @@ static int memory_events_local_show(struct seq_file *m, void *v) static int memory_stat_show(struct seq_file *m, void *v) { struct mem_cgroup *memcg = mem_cgroup_from_seq(m); - char *buf; + char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
- buf = memory_stat_format(memcg); if (!buf) return -ENOMEM; + memory_stat_format(memcg, buf, PAGE_SIZE); seq_puts(m, buf); kfree(buf); return 0;
From: Liu Shixin liushixin2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6ADCF CVE: NA
--------------------------------
syzbot is reporting GFP_KERNEL allocation with oom_lock held when reporting memcg OOM [1]. If this allocation triggers the global OOM situation then the system can livelock because the GFP_KERNEL allocation with oom_lock held cannot trigger the global OOM killer because __alloc_pages_may_oom() fails to hold oom_lock.
The problem mentioned above has been fixed by patch[2]. The is the same problem in memcg_memfs_info feature too. Refer to the patch[2], fix it by removing the allocation from mem_cgroup_print_memfs_info() completely, and pass static buffer when calling from memcg OOM path.
Link: https://syzkaller.appspot.com/bug?extid=2d2aeadc6ce1e1f11d45 [1] Link: https://lkml.kernel.org/r/86afb39f-8c65-bec2-6cfc-c5e3cd600c0b@I-love.SAKURA... [2] Fixes: 6b1d4d3a3713 ("mm/memcg_memfs_info: show files that having pages charged in mem_cgroup") Signed-off-by: Liu Shixin liushixin2@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- include/linux/memcg_memfs_info.h | 4 +++- mm/memcg_memfs_info.c | 20 ++++++++++---------- mm/memcontrol.c | 3 ++- 3 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/include/linux/memcg_memfs_info.h b/include/linux/memcg_memfs_info.h index 658a91e22bd7..b5e3709baa9e 100644 --- a/include/linux/memcg_memfs_info.h +++ b/include/linux/memcg_memfs_info.h @@ -6,11 +6,13 @@ #include <linux/seq_file.h>
#ifdef CONFIG_MEMCG_MEMFS_INFO -void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, struct seq_file *m); +void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, char *pathbuf, + struct seq_file *m); int mem_cgroup_memfs_files_show(struct seq_file *m, void *v); void mem_cgroup_memfs_info_init(void); #else static inline void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, + char *pathbuf, struct seq_file *m) { } diff --git a/mm/memcg_memfs_info.c b/mm/memcg_memfs_info.c index f404367ad08c..db7b0aee8053 100644 --- a/mm/memcg_memfs_info.c +++ b/mm/memcg_memfs_info.c @@ -157,7 +157,8 @@ static void memfs_show_files_in_mem_cgroup(struct super_block *sb, void *data) mntput(pfc->vfsmnt); }
-void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, struct seq_file *m) +void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, char *pathbuf, + struct seq_file *m) { struct print_files_control pfc = { .memcg = memcg, @@ -165,17 +166,11 @@ void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, struct seq_file *m) .max_print_files = memfs_max_print_files, .size_threshold = memfs_size_threshold, }; - char *pathbuf; int i;
if (!memfs_enable || !memcg) return;
- pathbuf = kmalloc(PATH_MAX, GFP_KERNEL); - if (!pathbuf) { - SEQ_printf(m, "Show memfs failed due to OOM\n"); - return; - } pfc.pathbuf = pathbuf; pfc.pathbuf_size = PATH_MAX;
@@ -192,15 +187,20 @@ void mem_cgroup_print_memfs_info(struct mem_cgroup *memcg, struct seq_file *m) SEQ_printf(m, "total files: %lu, total memory-size: %lukB\n", pfc.total_print_files, pfc.total_files_size >> 10); } - - kfree(pfc.pathbuf); }
int mem_cgroup_memfs_files_show(struct seq_file *m, void *v) { struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m)); + char *pathbuf;
- mem_cgroup_print_memfs_info(memcg, m); + pathbuf = kmalloc(PATH_MAX, GFP_KERNEL); + if (!pathbuf) { + SEQ_printf(m, "Show memfs abort: failed to allocate memory\n"); + return 0; + } + mem_cgroup_print_memfs_info(memcg, pathbuf, m); + kfree(pathbuf); return 0; }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 4a069f9112e8..467c1f2fd6ae 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1598,6 +1598,7 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) { /* Use static buffer, for the caller is holding oom_lock. */ static char buf[PAGE_SIZE]; + static char pathbuf[PATH_MAX];
lockdep_assert_held(&oom_lock);
@@ -1623,7 +1624,7 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) memory_stat_format(memcg, buf, sizeof(buf)); pr_info("%s", buf);
- mem_cgroup_print_memfs_info(memcg, NULL); + mem_cgroup_print_memfs_info(memcg, pathbuf, NULL); }
/*
From: Zheng Wang zyytlz.wz@163.com
mainline inclusion from mainline-v6.2-rc3 commit 4a61648af68f5ba4884f0e3b494ee1cabc4b6620 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5XXFF CVE: CVE-2022-3707
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
If intel_gvt_dma_map_guest_page failed, it will call ppgtt_invalidate_spt, which will finally free the spt. But the caller function ppgtt_populate_spt_by_guest_entry does not notice that, it will free spt again in its error path.
Fix this by canceling the mapping of DMA address and freeing sub_spt. Besides, leave the handle of spt destroy to caller function instead of callee function when error occurs.
conflicts: drivers/gpu/drm/i915/gvt/gtt.c
Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support") Signed-off-by: Zheng Wang zyytlz.wz@163.com Reviewed-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20221229165641.1192455-1-zyytlz... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Reviewed-by: Wei Li liwei391@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- drivers/gpu/drm/i915/gvt/gtt.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c index a3a4305eda01..0201f9b5f87e 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.c +++ b/drivers/gpu/drm/i915/gvt/gtt.c @@ -1192,10 +1192,8 @@ static int split_2MB_gtt_entry(struct intel_vgpu *vgpu, for_each_shadow_entry(sub_spt, &sub_se, sub_index) { ret = intel_gvt_hypervisor_dma_map_guest_page(vgpu, start_gfn + sub_index, PAGE_SIZE, &dma_addr); - if (ret) { - ppgtt_invalidate_spt(spt); - return ret; - } + if (ret) + goto err; sub_se.val64 = se->val64;
/* Copy the PAT field from PDE. */ @@ -1214,6 +1212,17 @@ static int split_2MB_gtt_entry(struct intel_vgpu *vgpu, ops->set_pfn(se, sub_spt->shadow_page.mfn); ppgtt_set_shadow_entry(spt, se, index); return 0; +err: + /* Cancel the existing addess mappings of DMA addr. */ + for_each_present_shadow_entry(sub_spt, &sub_se, sub_index) { + gvt_vdbg_mm("invalidate 4K entry\n"); + ppgtt_invalidate_pte(sub_spt, &sub_se); + } + /* Release the new allocated spt. */ + trace_spt_change(sub_spt->vgpu->id, "release", sub_spt, + sub_spt->guest_page.gfn, sub_spt->shadow_page.type); + ppgtt_free_spt(sub_spt); + return ret; }
static int split_64KB_gtt_entry(struct intel_vgpu *vgpu,
From: Marco Elver elver@google.com
stable inclusion from stable-v5.10.163 commit 67349025f00d0749e36386cfcfc32c2887f47fdb category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6D0MR CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit fa69ee5aa48b5b52e8028c2eb486906e9998d081 ]
It turns out that usage of skb extensions can cause memory leaks. Ido Schimmel reported: "[...] there are instances that blindly overwrite 'skb->extensions' by invoking skb_copy_header() after __alloc_skb()."
Therefore, give up on using skb extensions for KCOV handle, and instead directly store kcov_handle in sk_buff.
Fixes: 6370cc3bbd8a ("net: add kcov handle to skb extensions") Fixes: 85ce50d337d1 ("net: kcov: don't select SKB_EXTENSIONS when there is no NET") Fixes: 97f53a08cba1 ("net: linux/skbuff.h: combine SKB_EXTENSIONS + KCOV handling") Link: https://lore.kernel.org/linux-wireless/20201121160941.GA485907@shredder.lan/ Reported-by: Ido Schimmel idosch@idosch.org Signed-off-by: Marco Elver elver@google.com Link: https://lore.kernel.org/r/20201125224840.2014773-1-elver@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: Sasha Levin sashal@kernel.org
Conflicts: include/linux/skbuff.h
Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- include/linux/skbuff.h | 37 +++++++++++++------------------------ lib/Kconfig.debug | 1 - net/core/skbuff.c | 6 ------ 3 files changed, 13 insertions(+), 31 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index fea290135399..1be241150b77 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -706,6 +706,7 @@ typedef unsigned char *sk_buff_data_t; * @transport_header: Transport layer header * @network_header: Network layer header * @mac_header: Link layer header + * @kcov_handle: KCOV remote handle for remote coverage collection * @scm_io_uring: SKB holds io_uring registered files * @tail: Tail pointer * @end: End pointer @@ -913,6 +914,10 @@ struct sk_buff { __u16 network_header; __u16 mac_header;
+#ifdef CONFIG_KCOV + u64 kcov_handle; +#endif + /* private: */ __u32 headers_end[0]; /* public: */ @@ -4212,9 +4217,6 @@ enum skb_ext_id { #endif #if IS_ENABLED(CONFIG_MPTCP) SKB_EXT_MPTCP, -#endif -#if IS_ENABLED(CONFIG_KCOV) - SKB_EXT_KCOV_HANDLE, #endif SKB_EXT_NUM, /* must be last */ }; @@ -4670,35 +4672,22 @@ static inline void skb_reset_redirect(struct sk_buff *skb) #endif }
-#if IS_ENABLED(CONFIG_KCOV) && IS_ENABLED(CONFIG_SKB_EXTENSIONS) static inline void skb_set_kcov_handle(struct sk_buff *skb, const u64 kcov_handle) { - /* Do not allocate skb extensions only to set kcov_handle to zero - * (as it is zero by default). However, if the extensions are - * already allocated, update kcov_handle anyway since - * skb_set_kcov_handle can be called to zero a previously set - * value. - */ - if (skb_has_extensions(skb) || kcov_handle) { - u64 *kcov_handle_ptr = skb_ext_add(skb, SKB_EXT_KCOV_HANDLE); - - if (kcov_handle_ptr) - *kcov_handle_ptr = kcov_handle; - } +#ifdef CONFIG_KCOV + skb->kcov_handle = kcov_handle; +#endif }
static inline u64 skb_get_kcov_handle(struct sk_buff *skb) { - u64 *kcov_handle = skb_ext_find(skb, SKB_EXT_KCOV_HANDLE); - - return kcov_handle ? *kcov_handle : 0; -} +#ifdef CONFIG_KCOV + return skb->kcov_handle; #else -static inline void skb_set_kcov_handle(struct sk_buff *skb, - const u64 kcov_handle) { } -static inline u64 skb_get_kcov_handle(struct sk_buff *skb) { return 0; } -#endif /* CONFIG_KCOV && CONFIG_SKB_EXTENSIONS */ + return 0; +#endif +}
static inline bool skb_csum_is_sctp(struct sk_buff *skb) { diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 10e425c30486..f53afec6f7ae 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1945,7 +1945,6 @@ config KCOV depends on CC_HAS_SANCOV_TRACE_PC || GCC_PLUGINS select DEBUG_FS select GCC_PLUGIN_SANCOV if !CC_HAS_SANCOV_TRACE_PC - select SKB_EXTENSIONS if NET help KCOV exposes kernel code coverage information in a form suitable for coverage-guided fuzzing (randomized testing). diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d88d131ed75d..c13cb0684cca 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4262,9 +4262,6 @@ static const u8 skb_ext_type_len[] = { #if IS_ENABLED(CONFIG_MPTCP) [SKB_EXT_MPTCP] = SKB_EXT_CHUNKSIZEOF(struct mptcp_ext), #endif -#if IS_ENABLED(CONFIG_KCOV) - [SKB_EXT_KCOV_HANDLE] = SKB_EXT_CHUNKSIZEOF(u64), -#endif };
static __always_inline unsigned int skb_ext_total_length(void) @@ -4281,9 +4278,6 @@ static __always_inline unsigned int skb_ext_total_length(void) #endif #if IS_ENABLED(CONFIG_MPTCP) skb_ext_type_len[SKB_EXT_MPTCP] + -#endif -#if IS_ENABLED(CONFIG_KCOV) - skb_ext_type_len[SKB_EXT_KCOV_HANDLE] + #endif 0; }
From: Eric Dumazet edumazet@google.com
stable inclusion from stable-v5.10.156 commit e929ec98c0c3b10d9c07f3776df0c1a02d7a763e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I67UR2 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit b64085b00044bdf3cd1c9825e9ef5b2e0feae91a upstream.
macvlan should enforce a minimal mtu of 68, even at link creation.
This patch avoids the current behavior (which could lead to crashes in ipv6 stack if the link is brought up)
$ ip link add macvlan1 link eno1 mtu 8 type macvlan # This should fail ! $ ip link sh dev macvlan1 5: macvlan1@eno1: <BROADCAST,MULTICAST> mtu 8 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 02:47:6c:24:74:82 brd ff:ff:ff:ff:ff:ff $ ip link set macvlan1 mtu 67 Error: mtu less than device minimum. $ ip link set macvlan1 mtu 68 $ ip link set macvlan1 mtu 8 Error: mtu less than device minimum.
Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- drivers/net/macvlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index c8d803d3616c..a421137e4ff8 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1176,7 +1176,7 @@ void macvlan_common_setup(struct net_device *dev) { ether_setup(dev);
- dev->min_mtu = 0; + /* ether_setup() has set dev->min_mtu to ETH_MIN_MTU. */ dev->max_mtu = ETH_MAX_MTU; dev->priv_flags &= ~IFF_TX_SKB_SHARING; netif_keep_dst(dev);
From: Eric Dumazet edumazet@google.com
stable inclusion from stable-v5.10.152 commit 7aa3d623c11b9ab60f86b7833666e5d55bac4be9 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I64L17 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit ebda44da44f6f309d302522b049f43d6f829f7aa ]
We had one syzbot report [1] in syzbot queue for a while. I was waiting for more occurrences and/or a repro but Dmitry Vyukov spotted the issue right away.
<quoting Dmitry> qdisc_graft() drops reference to qdisc in notify_and_destroy while it's still assigned to dev->qdisc </quoting>
Indeed, RCU rules are clear when replacing a data structure. The visible pointer (dev->qdisc in this case) must be updated to the new object _before_ RCU grace period is started (qdisc_put(old) in this case).
[1] BUG: KASAN: use-after-free in __tcf_qdisc_find.part.0+0xa3a/0xac0 net/sched/cls_api.c:1066 Read of size 4 at addr ffff88802065e038 by task syz-executor.4/21027
CPU: 0 PID: 21027 Comm: syz-executor.4 Not tainted 6.0.0-rc3-syzkaller-00363-g7726d4c3e60b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:317 [inline] print_report.cold+0x2ba/0x719 mm/kasan/report.c:433 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495 __tcf_qdisc_find.part.0+0xa3a/0xac0 net/sched/cls_api.c:1066 __tcf_qdisc_find net/sched/cls_api.c:1051 [inline] tc_new_tfilter+0x34f/0x2200 net/sched/cls_api.c:2018 rtnetlink_rcv_msg+0x955/0xca0 net/core/rtnetlink.c:6081 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2482 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f5efaa89279 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f5efbc31168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f5efab9bf80 RCX: 00007f5efaa89279 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000005 RBP: 00007f5efaae32e9 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f5efb0cfb1f R14: 00007f5efbc31300 R15: 0000000000022000 </TASK>
Allocated by task 21027: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:45 [inline] set_alloc_info mm/kasan/common.c:437 [inline] ____kasan_kmalloc mm/kasan/common.c:516 [inline] ____kasan_kmalloc mm/kasan/common.c:475 [inline] __kasan_kmalloc+0xa9/0xd0 mm/kasan/common.c:525 kmalloc_node include/linux/slab.h:623 [inline] kzalloc_node include/linux/slab.h:744 [inline] qdisc_alloc+0xb0/0xc50 net/sched/sch_generic.c:938 qdisc_create_dflt+0x71/0x4a0 net/sched/sch_generic.c:997 attach_one_default_qdisc net/sched/sch_generic.c:1152 [inline] netdev_for_each_tx_queue include/linux/netdevice.h:2437 [inline] attach_default_qdiscs net/sched/sch_generic.c:1170 [inline] dev_activate+0x760/0xcd0 net/sched/sch_generic.c:1229 __dev_open+0x393/0x4d0 net/core/dev.c:1441 __dev_change_flags+0x583/0x750 net/core/dev.c:8556 rtnl_configure_link+0xee/0x240 net/core/rtnetlink.c:3189 rtnl_newlink_create net/core/rtnetlink.c:3371 [inline] __rtnl_newlink+0x10b8/0x17e0 net/core/rtnetlink.c:3580 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3593 rtnetlink_rcv_msg+0x43a/0xca0 net/core/rtnetlink.c:6090 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2482 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 21020: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 kasan_set_track+0x21/0x30 mm/kasan/common.c:45 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370 ____kasan_slab_free mm/kasan/common.c:367 [inline] ____kasan_slab_free+0x166/0x1c0 mm/kasan/common.c:329 kasan_slab_free include/linux/kasan.h:200 [inline] slab_free_hook mm/slub.c:1754 [inline] slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1780 slab_free mm/slub.c:3534 [inline] kfree+0xe2/0x580 mm/slub.c:4562 rcu_do_batch kernel/rcu/tree.c:2245 [inline] rcu_core+0x7b5/0x1890 kernel/rcu/tree.c:2505 __do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
Last potentially related work creation: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 __kasan_record_aux_stack+0xbe/0xd0 mm/kasan/generic.c:348 call_rcu+0x99/0x790 kernel/rcu/tree.c:2793 qdisc_put+0xcd/0xe0 net/sched/sch_generic.c:1083 notify_and_destroy net/sched/sch_api.c:1012 [inline] qdisc_graft+0xeb1/0x1270 net/sched/sch_api.c:1084 tc_modify_qdisc+0xbb7/0x1a00 net/sched/sch_api.c:1671 rtnetlink_rcv_msg+0x43a/0xca0 net/core/rtnetlink.c:6090 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2482 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Second to last potentially related work creation: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 __kasan_record_aux_stack+0xbe/0xd0 mm/kasan/generic.c:348 kvfree_call_rcu+0x74/0x940 kernel/rcu/tree.c:3322 neigh_destroy+0x431/0x630 net/core/neighbour.c:912 neigh_release include/net/neighbour.h:454 [inline] neigh_cleanup_and_release+0x1f8/0x330 net/core/neighbour.c:103 neigh_del net/core/neighbour.c:225 [inline] neigh_remove_one+0x37d/0x460 net/core/neighbour.c:246 neigh_forced_gc net/core/neighbour.c:276 [inline] neigh_alloc net/core/neighbour.c:447 [inline] ___neigh_create+0x18b5/0x29a0 net/core/neighbour.c:642 ip6_finish_output2+0xfb8/0x1520 net/ipv6/ip6_output.c:125 __ip6_finish_output net/ipv6/ip6_output.c:195 [inline] ip6_finish_output+0x690/0x1160 net/ipv6/ip6_output.c:206 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x1ed/0x540 net/ipv6/ip6_output.c:227 dst_output include/net/dst.h:451 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] mld_sendpack+0xa09/0xe70 net/ipv6/mcast.c:1820 mld_send_cr net/ipv6/mcast.c:2121 [inline] mld_ifc_work+0x71c/0xdc0 net/ipv6/mcast.c:2653 process_one_work+0x991/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e4/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
The buggy address belongs to the object at ffff88802065e000 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 56 bytes inside of 1024-byte region [ffff88802065e000, ffff88802065e400)
The buggy address belongs to the physical page: page:ffffea0000819600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x20658 head:ffffea0000819600 order:3 compound_mapcount:0 compound_pincount:0 flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff) raw: 00fff00000010200 0000000000000000 dead000000000001 ffff888011841dc0 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 3523, tgid 3523 (sshd), ts 41495190986, free_ts 41417713212 prep_new_page mm/page_alloc.c:2532 [inline] get_page_from_freelist+0x109b/0x2ce0 mm/page_alloc.c:4283 __alloc_pages+0x1c7/0x510 mm/page_alloc.c:5515 alloc_pages+0x1a6/0x270 mm/mempolicy.c:2270 alloc_slab_page mm/slub.c:1824 [inline] allocate_slab+0x27e/0x3d0 mm/slub.c:1969 new_slab mm/slub.c:2029 [inline] ___slab_alloc+0x7f1/0xe10 mm/slub.c:3031 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3118 slab_alloc_node mm/slub.c:3209 [inline] __kmalloc_node_track_caller+0x2f2/0x380 mm/slub.c:4955 kmalloc_reserve net/core/skbuff.c:358 [inline] __alloc_skb+0xd9/0x2f0 net/core/skbuff.c:430 alloc_skb_fclone include/linux/skbuff.h:1307 [inline] tcp_stream_alloc_skb+0x38/0x580 net/ipv4/tcp.c:861 tcp_sendmsg_locked+0xc36/0x2f80 net/ipv4/tcp.c:1325 tcp_sendmsg+0x2b/0x40 net/ipv4/tcp.c:1483 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 sock_write_iter+0x291/0x3d0 net/socket.c:1108 call_write_iter include/linux/fs.h:2187 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x9e9/0xdd0 fs/read_write.c:578 ksys_write+0x1e8/0x250 fs/read_write.c:631 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1449 [inline] free_pcp_prepare+0x5e4/0xd20 mm/page_alloc.c:1499 free_unref_page_prepare mm/page_alloc.c:3380 [inline] free_unref_page+0x19/0x4d0 mm/page_alloc.c:3476 __unfreeze_partials+0x17c/0x1a0 mm/slub.c:2548 qlink_free mm/kasan/quarantine.c:168 [inline] qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187 kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:294 __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:447 kasan_slab_alloc include/linux/kasan.h:224 [inline] slab_post_alloc_hook mm/slab.h:727 [inline] slab_alloc_node mm/slub.c:3243 [inline] slab_alloc mm/slub.c:3251 [inline] __kmem_cache_alloc_lru mm/slub.c:3258 [inline] kmem_cache_alloc+0x267/0x3b0 mm/slub.c:3268 kmem_cache_zalloc include/linux/slab.h:723 [inline] alloc_buffer_head+0x20/0x140 fs/buffer.c:2974 alloc_page_buffers+0x280/0x790 fs/buffer.c:829 create_empty_buffers+0x2c/0xee0 fs/buffer.c:1558 ext4_block_write_begin+0x1004/0x1530 fs/ext4/inode.c:1074 ext4_da_write_begin+0x422/0xae0 fs/ext4/inode.c:2996 generic_perform_write+0x246/0x560 mm/filemap.c:3738 ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:270 ext4_file_write_iter+0x44a/0x1660 fs/ext4/file.c:679 call_write_iter include/linux/fs.h:2187 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x9e9/0xdd0 fs/read_write.c:578
Fixes: af356afa010f ("net_sched: reintroduce dev->qdisc for use by sch_api") Reported-by: syzbot syzkaller@googlegroups.com Diagnosed-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20221018203258.2793282-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- net/sched/sch_api.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index 01bec5e746d3..54e2309315eb 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -1081,12 +1081,13 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
skip: if (!ingress) { - notify_and_destroy(net, skb, n, classid, - rtnl_dereference(dev->qdisc), new); + old = rtnl_dereference(dev->qdisc); if (new && !new->ops->attach) qdisc_refcount_inc(new); rcu_assign_pointer(dev->qdisc, new ? : &noop_qdisc);
+ notify_and_destroy(net, skb, n, classid, old, new); + if (new && new->ops->attach) new->ops->attach(new); } else {