From: Shung-Hsi Yu <shung-hsi.yu(a)suse.com>
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WLXN
CVE: CVE-2022-3606
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/…
--------------------------------
When there are no program sections, obj->programs is left unallocated,
and find_prog_by_sec_insn()'s search lands on &obj->programs[0] == NULL,
which causes a null-pointer dereference in the subsequent access to
prog->sec_idx.
Guard the search with obj->nr_programs similar to what's being done in
__bpf_program__iter() to prevent null-pointer access from happening.
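
A minimal sketch of the guarded search, for illustration only. The types
and names below (struct prog, struct object, find_prog) are simplified,
hypothetical stand-ins and not libbpf's real structures. The sketch only
shows why the early return matters: with zero programs the array pointer
is NULL, and the access after the loop would dereference it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libbpf's internal structures. */
struct prog {
	size_t sec_idx;   /* section the program lives in */
	size_t sec_off;   /* first instruction index within the section */
	size_t insn_cnt;  /* number of instructions */
};

struct object {
	struct prog *programs; /* sorted by (sec_idx, sec_off); NULL when empty */
	size_t nr_programs;
};

/* Return the program containing insn_idx in section sec_idx, or NULL. */
static struct prog *find_prog(const struct object *obj, size_t sec_idx,
			      size_t insn_idx)
{
	size_t l = 0, r, m;
	struct prog *p;

	assert(obj != NULL);

	/* The guard: without it, &obj->programs[0] below would be NULL. */
	if (!obj->nr_programs)
		return NULL;
	r = obj->nr_programs - 1;

	/* Find the right-most program starting at or before the target. */
	while (l < r) {
		m = l + (r - l + 1) / 2;
		p = &obj->programs[m];
		if (p->sec_idx < sec_idx ||
		    (p->sec_idx == sec_idx && p->sec_off <= insn_idx))
			l = m;
		else
			r = m - 1;
	}

	/* Confirm the candidate really covers the requested instruction. */
	p = &obj->programs[l];
	if (p->sec_idx == sec_idx && insn_idx >= p->sec_off &&
	    insn_idx < p->sec_off + p->insn_cnt)
		return p;
	return NULL;
}
```

Calling find_prog() on an object with nr_programs == 0 returns NULL
before any element of the array is touched.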
Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu(a)suse.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20221012022353.7350-4-shung-hsi.yu@suse.com
Signed-off-by: Pu Lehui <pulehui(a)huawei.com>
Reviewed-by: Kuohai Xu <xukuohai(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
tools/lib/bpf/libbpf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 314fb1202d08..2b997a981052 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -3479,6 +3479,9 @@ static struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
int l = 0, r = obj->nr_programs - 1, m;
struct bpf_program *prog;
+ if (!obj->nr_programs)
+ return NULL;
+
while (l < r) {
m = l + (r - l + 1) / 2;
prog = &obj->programs[m];
--
2.20.1
From: Lu Wei <luwei32(a)huawei.com>
maillist inclusion
category: bugfix
bugzilla: 187792, https://gitee.com/openeuler/kernel/issues/I5ZG7O
Reference: https://www.spinics.net/lists/netdev/msg856902.html
--------------------------------
If setsockopt() is called with option name TCP_REPAIR_OPTIONS and
opt_code TCPOPT_SACK_PERM to enable SACK after data has been sent
and dupacks have been received, it triggers a warning in
tcp_verify_left_out() as follows:
============================================
WARNING: CPU: 8 PID: 0 at net/ipv4/tcp_input.c:2132
tcp_timeout_mark_lost+0x154/0x160
tcp_enter_loss+0x2b/0x290
tcp_retransmit_timer+0x50b/0x640
tcp_write_timer_handler+0x1c8/0x340
tcp_write_timer+0xe5/0x140
call_timer_fn+0x3a/0x1b0
__run_timers.part.0+0x1bf/0x2d0
run_timer_softirq+0x43/0xb0
__do_softirq+0xfd/0x373
__irq_exit_rcu+0xf6/0x140
The warning is triggered by the following steps:
1. a socket named socketA is created
2. socketA enters repair mode without building a connection
3. socketA calls connect() and its state is changed to TCP_ESTABLISHED
directly
4. socketA leaves repair mode
5. socketA calls sendmsg() to send data; packets_out and sacked_out
(dupacks received) increase
6. socketA enters repair mode again
7. socketA calls setsockopt() with TCPOPT_SACK_PERM to enable SACK
8. the retransmit timer expires and calls tcp_timeout_mark_lost();
lost_out increases
9. sacked_out + lost_out > packets_out triggers the warning, since
lost_out and sacked_out increase repeatedly
In tcp_timeout_mark_lost(), tp->sacked_out is cleared if step 7 does
not happen, so the warning is not triggered. As suggested by Denis and
Eric, TCP_REPAIR_OPTIONS should be prohibited if data was already
sent.
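
The counter arithmetic behind the steps above can be sketched as
follows. This is a toy model under stated assumptions, not kernel code:
struct counters, left_out_ok(), timeout_mark_lost() and
repair_options_allowed() are hypothetical names, and the timeout model
simply marks every outstanding segment lost, which matches the case
described here where no segment carries a per-segment SACK flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the counters involved; not struct tcp_sock. */
struct counters {
	unsigned int packets_out; /* segments sent and not yet acked */
	unsigned int sacked_out;  /* dupacks counted while SACK is off */
	unsigned int lost_out;    /* segments marked lost */
	bool sack_enabled;
};

/* The invariant whose violation produces the warning. */
static bool left_out_ok(const struct counters *c)
{
	return c->sacked_out + c->lost_out <= c->packets_out;
}

/*
 * Simplified retransmit timeout: without SACK the dupack count is
 * reset first, then every outstanding segment is marked lost.
 */
static void timeout_mark_lost(struct counters *c)
{
	assert(c != NULL);
	if (!c->sack_enabled)
		c->sacked_out = 0;
	c->lost_out = c->packets_out;
}

/* Model of the patched condition: only allowed before any data is sent. */
static bool repair_options_allowed(bool repair, bool established,
				   unsigned long bytes_sent)
{
	return repair && established && bytes_sent == 0;
}
```

In this model, enabling SACK between the dupacks and the timeout leaves
a stale sacked_out in place, so the sum exceeds packets_out after the
timeout; refusing the option once bytes_sent is non-zero avoids that
state.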
The socket-tcp tests in CRIU have been run as follows:
$ sudo ./test/zdtm.py run -t zdtm/static/socket-tcp* --keep-going \
--ignore-taint
socket-tcp* represents all socket-tcp tests in test/zdtm/static/.
Fixes: b139ba4e90dc ("tcp: Repair connection-time negotiated parameters")
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/ipv4/tcp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 094cd93e50c2..6f71b6cfc1b2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2916,7 +2916,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
case TCP_REPAIR_OPTIONS:
if (!tp->repair)
err = -EINVAL;
- else if (sk->sk_state == TCP_ESTABLISHED)
+ else if (sk->sk_state == TCP_ESTABLISHED && !tp->bytes_sent)
err = tcp_repair_options_est(sk,
(struct tcp_repair_opt __user *)optval,
optlen);
--
2.25.1
From: Baokun Li <libaokun1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 187935,https://gitee.com/src-openeuler/kernel/issues/I5ZR8Z
CVE: NA
--------------------------------
When online resizing is performed twice in a row, the second resize
reports the error message "Superblock checksum does not match
superblock". Here's the reproducer:
mkfs.ext4 -F /dev/sdb 100M
mount /dev/sdb /tmp/test
resize2fs /dev/sdb 5G
resize2fs /dev/sdb 6G
To solve this issue, move the checksum update to after
es->s_overhead_clusters is updated.
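
A minimal sketch of why the ordering matters. The structure and helpers
below (struct toy_super, toy_csum, resize_buggy, resize_fixed, csum_ok)
are hypothetical, and the checksum is a toy FNV-style hash rather than
the checksum ext4 actually uses. The point is only that a checksum
computed before the last covered field is written no longer matches the
final contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, tiny stand-in for the on-disk superblock. */
struct toy_super {
	uint64_t blocks_count;
	uint32_t overhead_clusters;
	uint32_t checksum; /* covers the two fields above */
};

/* Toy FNV-1a style hash over the covered fields (not ext4's checksum). */
static uint32_t toy_csum(const struct toy_super *es)
{
	uint32_t h = 2166136261u;

	h = (h ^ (uint32_t)es->blocks_count) * 16777619u;
	h = (h ^ (uint32_t)(es->blocks_count >> 32)) * 16777619u;
	h = (h ^ es->overhead_clusters) * 16777619u;
	return h;
}

static bool csum_ok(const struct toy_super *es)
{
	return es->checksum == toy_csum(es);
}

/* Old order: checksum is set, then a covered field changes afterwards. */
static void resize_buggy(struct toy_super *es, uint64_t blocks,
			 uint32_t overhead)
{
	assert(es != NULL);
	es->blocks_count = blocks;
	es->checksum = toy_csum(es);
	es->overhead_clusters = overhead; /* invalidates the checksum */
}

/* New order: every covered field is final before the checksum is set. */
static void resize_fixed(struct toy_super *es, uint64_t blocks,
			 uint32_t overhead)
{
	assert(es != NULL);
	es->blocks_count = blocks;
	es->overhead_clusters = overhead;
	es->checksum = toy_csum(es);
}
```

With the old order the stored checksum is stale as soon as the overhead
value changes, which is what a later verification would then report.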
Fixes: 026d0d27c488 ("ext4: reduce computation of overhead during resize")
Fixes: de394a86658f ("ext4: update s_overhead_clusters in the superblock during an on-line resize")
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/ext4/resize.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index bc870561e394..f2b881aaf0b1 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1442,8 +1442,6 @@ static void ext4_update_super(struct super_block *sb,
* active. */
ext4_r_blocks_count_set(es, ext4_r_blocks_count(es) +
reserved_blocks);
- ext4_superblock_csum_set(sb);
- unlock_buffer(sbi->s_sbh);
/* Update the free space counts */
percpu_counter_add(&sbi->s_freeclusters_counter,
@@ -1471,6 +1469,8 @@ static void ext4_update_super(struct super_block *sb,
ext4_calculate_overhead(sb);
es->s_overhead_clusters = cpu_to_le32(sbi->s_overhead);
+ ext4_superblock_csum_set(sb);
+ unlock_buffer(sbi->s_sbh);
if (test_opt(sb, DEBUG))
printk(KERN_DEBUG "EXT4-fs: added group %u:"
"%llu blocks(%llu free %llu reserved)\n", flex_gd->count,
--
2.25.1
From: Pavel Begunkov <asml.silence(a)gmail.com>
mainline inclusion
from mainline-v6.1-rc1
commit 0091bfc81741b8d3aeb3b7ab8636f911b2de6e80
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5WFKI
CVE: CVE-2022-2602
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?h…
--------------------------------
Instead of putting io_uring's registered files in unix_gc(), we want it
to be done by io_uring itself. The trick here is to consider io_uring
registered files for cycle detection but not actually put them down.
Because io_uring can't register other ring instances, this will remove
all refs to the ring file, triggering the ->release path, and clean up
with io_ring_ctx_free().
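
The filtering step described above can be sketched with plain arrays.
This is an illustration under assumptions, not the kernel's list
handling: struct toy_skb and split_hitlist() are hypothetical names,
and arrays stand in for the skb queues.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the new flag matters. */
struct toy_skb {
	int id;
	unsigned int scm_io_uring:1; /* skb holds io_uring registered files */
};

/*
 * Split the hitlist in place. Entries flagged scm_io_uring are copied
 * to requeue[] (to go back on their socket's receive queue); the rest
 * are compacted at the front of hitlist[] (to be purged). Returns the
 * number of entries left to purge and stores the requeued count in
 * *nr_requeue.
 */
static size_t split_hitlist(struct toy_skb *hitlist, size_t n,
			    struct toy_skb *requeue, size_t *nr_requeue)
{
	size_t keep = 0, moved = 0, i;

	assert(hitlist && requeue && nr_requeue);
	for (i = 0; i < n; i++) {
		if (hitlist[i].scm_io_uring)
			requeue[moved++] = hitlist[i];
		else
			hitlist[keep++] = hitlist[i];
	}
	*nr_requeue = moved;
	return keep;
}
```

Only the entries remaining in hitlist[] would be purged by the garbage
collector; the requeued ones are left for io_uring to release.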
Cc: stable(a)vger.kernel.org
Fixes: 6b06314c47e1 ("io_uring: add file set registration")
Reported-and-tested-by: David Bouman <dbouman03(a)gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)canonical.com>
[axboe: add kerneldoc comment to skb, fold in skb leak fix]
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Conflicts:
fs/io_uring.c
include/linux/skbuff.h
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/io_uring.c | 1 +
include/linux/skbuff.h | 3 +++
net/unix/garbage.c | 20 ++++++++++++++++++++
3 files changed, 24 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d4e430b51098..7d7af6a0ef96 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6835,6 +6835,7 @@ static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
}
skb->sk = sk;
+ skb->scm_io_uring = 1;
nr_files = 0;
fpl->user = get_uid(ctx->user);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index dbdb03ac557f..4524bef053b8 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -654,6 +654,7 @@ typedef unsigned char *sk_buff_data_t;
* @transport_header: Transport layer header
* @network_header: Network layer header
* @mac_header: Link layer header
+ * @scm_io_uring: SKB holds io_uring registered files
* @tail: Tail pointer
* @end: End pointer
* @head: Head of buffer
@@ -800,6 +801,8 @@ struct sk_buff {
__u8 decrypted:1;
#endif
+ __u8 scm_io_uring:1;
+
#ifdef CONFIG_NET_SCHED
__u16 tc_index; /* traffic control index */
#endif
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 4d283e26d816..5c9ff8df9136 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -209,6 +209,7 @@ void wait_for_unix_gc(void)
/* The external entry point: unix_gc() */
void unix_gc(void)
{
+ struct sk_buff *next_skb, *skb;
struct unix_sock *u;
struct unix_sock *next;
struct sk_buff_head hitlist;
@@ -302,11 +303,30 @@ void unix_gc(void)
spin_unlock(&unix_gc_lock);
+ /* We need io_uring to clean its registered files, ignore all io_uring
+ * originated skbs. It's fine as io_uring doesn't keep references to
+ * other io_uring instances and so killing all other files in the cycle
+ * will put all io_uring references forcing it to go through normal
+ * release.path eventually putting registered files.
+ */
+ skb_queue_walk_safe(&hitlist, skb, next_skb) {
+ if (skb->scm_io_uring) {
+ __skb_unlink(skb, &hitlist);
+ skb_queue_tail(&skb->sk->sk_receive_queue, skb);
+ }
+ }
+
/* Here we are. Hitlist is filled. Die. */
__skb_queue_purge(&hitlist);
spin_lock(&unix_gc_lock);
+ /* There could be io_uring registered files, just push them back to
+ * the inflight list
+ */
+ list_for_each_entry_safe(u, next, &gc_candidates, link)
+ list_move_tail(&u->link, &gc_inflight_list);
+
/* All candidates should have been detached by now. */
BUG_ON(!list_empty(&gc_candidates));
--
2.25.1
From: Luo Meng <luomeng12(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5WBID
CVE: NA
--------------------------------
When dm_resume() and dm_destroy() run concurrently, a use-after-free
(UAF) can occur.
One such concurrent UAF is shown below:
use                                 | free
do_resume                           |
__find_device_hash_cell             |
dm_get                              |
atomic_inc(&md->holders)            |
                                    | dm_destroy
                                    | __dm_destroy
                                    | if (!dm_suspended_md(md))
                                    | atomic_read(&md->holders)
                                    | msleep(1)
dm_resume                           |
__dm_resume                         |
dm_table_resume_targets             |
pool_resume                         |
do_waker #add delay work            |
                                    | dm_table_destroy
                                    | pool_dtr
                                    | __pool_dec
                                    | __pool_destroy
                                    | destroy_workqueue
                                    | kfree(pool) # free pool
time out
__do_softirq
run_timer_softirq # pool has already been freed
This can be easily reproduced with the following steps:
1. create thin-pool
2. dmsetup suspend pool
3. dmsetup resume pool
4. dmsetup remove_all # Concurrent with 3
The root cause of the UAF is that dm_resume() arms the timer after
dm_destroy() has skipped cancelling it because of the suspended
status. When the timer expires, run_timer_softirq() runs, but the pool
has already been freed, and the UAF occurs.
Therefore, cancelling the timer is moved to after md->holders drops to
zero.
Signed-off-by: Luo Meng <luomeng12(a)huawei.com>
Reviewed-by: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/md/dm.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4c46f030eed2..288dab0ab226 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2411,6 +2411,19 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
if (dm_request_based(md) && md->kworker_task)
kthread_flush_worker(&md->kworker);
+ /*
+ * Rare, but there may be I/O requests still going to complete,
+ * for example. Wait for all references to disappear.
+ * No one should increment the reference count of the mapped_device,
+ * after the mapped_device state becomes DMF_FREEING.
+ */
+ if (wait)
+ while (atomic_read(&md->holders))
+ msleep(1);
+ else if (atomic_read(&md->holders))
+ DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
+ dm_device_name(md), atomic_read(&md->holders));
+
/*
* Take suspend_lock so that presuspend and postsuspend methods
* do not race with internal suspend.
@@ -2427,19 +2440,6 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
dm_put_live_table(md, srcu_idx);
mutex_unlock(&md->suspend_lock);
- /*
- * Rare, but there may be I/O requests still going to complete,
- * for example. Wait for all references to disappear.
- * No one should increment the reference count of the mapped_device,
- * after the mapped_device state becomes DMF_FREEING.
- */
- if (wait)
- while (atomic_read(&md->holders))
- msleep(1);
- else if (atomic_read(&md->holders))
- DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
- dm_device_name(md), atomic_read(&md->holders));
-
dm_sysfs_exit(md);
dm_table_destroy(__unbind(md));
free_dev(md);
--
2.25.1