From: Baokun Li libaokun1@huawei.com
hulk inclusion category: bugfix bugzilla: 188500, https://gitee.com/openeuler/kernel/issues/I6RJ0V CVE: NA
--------------------------------
We got a WARNING in ext4_add_complete_io: ================================================================== WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4] [...] Call Trace: <TASK> ext4_end_bio+0xa8/0x240 [ext4] bio_endio+0x195/0x310 blk_update_request+0x184/0x770 scsi_end_request+0x2f/0x240 scsi_io_completion+0x75/0x450 scsi_finish_command+0xef/0x160 scsi_complete+0xa3/0x180 blk_complete_reqs+0x60/0x80 blk_done_softirq+0x25/0x40 __do_softirq+0x119/0x4c8 run_ksoftirqd+0x42/0x70 smpboot_thread_fn+0x136/0x3c0 kthread+0x140/0x1a0 ret_from_fork+0x2c/0x50 ==================================================================
Above issue may happen as follows:
cpu1 cpu2 ----------------------------|---------------------------- mount -o dioread_lock ext4_writepages ext4_do_writepages *if (ext4_should_dioread_nolock(inode))* // rsv_blocks is not assigned here mount -o remount,dioread_nolock ext4_journal_start_with_reserve __ext4_journal_start __ext4_journal_start_sb jbd2__journal_start *if (rsv_blocks)* // h_rsv_handle is not initialized here mpage_map_and_submit_extent mpage_map_one_extent dioread_nolock = ext4_should_dioread_nolock(inode) if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN)) mpd->io_submit.io_end->handle = handle->h_rsv_handle ext4_set_io_unwritten_flag io_end->flag |= EXT4_IO_END_UNWRITTEN // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag
scsi_finish_command scsi_io_completion scsi_io_completion_action scsi_end_request blk_update_request req_bio_endio bio_endio bio->bi_end_io > ext4_end_bio ext4_put_io_end_defer ext4_add_complete_io // trigger WARN_ON(!io_end->handle && sbi->s_journal);
The immediate cause of this problem is that ext4_should_dioread_nolock() function returns inconsistent values in the ext4_do_writepages() and mpage_map_one_extent(). There are four conditions in this function that can be changed at mount time to cause this problem. These four conditions can be divided into two categories:
(1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount
The two in the first category have been fixed by commit c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages") and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling EXT4_EXTENTS_FL") respectively.
Two cases in the other category have not yet been fixed, and the above issue is caused by this situation. We refer to the fix for the first category, when applying options during remount, we grab s_writepages_rwsem to avoid racing with writepages ops to trigger this problem.
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io") Cc: stable@vger.kernel.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- fs/ext4/ext4.h | 3 ++- fs/ext4/super.c | 13 +++++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 4c88e75180a2..6df919b154b4 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1530,7 +1530,8 @@ struct ext4_sb_info {
/* * Barrier between writepages ops and changing any inode's JOURNAL_DATA - * or EXTENTS flag. + * or EXTENTS flag or between writepages ops and changing DIOREAD_NOLOCK + * mount option on remount. */ struct percpu_rw_semaphore s_writepages_rwsem; struct dax_device *s_daxdev; diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 8029a6f6471c..df07222f1cc5 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5605,10 +5605,20 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) vfs_flags = SB_LAZYTIME | SB_I_VERSION; sb->s_flags = (sb->s_flags & ~vfs_flags) | (*flags & vfs_flags);
+ /* + * Changing the DIOREAD_NOLOCK mount option may cause two calls to + * ext4_should_dioread_nolock() to return inconsistent values, + * triggering WARN_ON in ext4_add_complete_io(). we grab here + * s_writepages_rwsem to avoid race between writepages ops and + * remount. + */ + percpu_down_write(&sbi->s_writepages_rwsem); if (!parse_options(data, sb, NULL, &journal_ioprio, 1)) { err = -EINVAL; + percpu_up_write(&sbi->s_writepages_rwsem); goto restore_opts; } + percpu_up_write(&sbi->s_writepages_rwsem);
if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^ test_opt(sb, JOURNAL_CHECKSUM)) { @@ -5833,6 +5843,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) return 0;
restore_opts: + percpu_down_write(&sbi->s_writepages_rwsem); sb->s_flags = old_sb_flags; sbi->s_mount_opt = old_opts.s_mount_opt; sbi->s_mount_opt2 = old_opts.s_mount_opt2; @@ -5841,6 +5852,8 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) sbi->s_commit_interval = old_opts.s_commit_interval; sbi->s_min_batch_time = old_opts.s_min_batch_time; sbi->s_max_batch_time = old_opts.s_max_batch_time; + percpu_up_write(&sbi->s_writepages_rwsem); + if (!test_opt(sb, BLOCK_VALIDITY) && sbi->system_blks) ext4_release_system_zone(sb); #ifdef CONFIG_QUOTA
From: Eric Dumazet edumazet@google.com
stable inclusion from stable-v4.19.131 commit 6c83456601e65a6dd371ddf595510183d198e9b1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6S5DO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
---------------------------
commit 7c6d2ecbda83150b2036a2b36b21381ad4667762 upstream.
Recent change in virtio_net_hdr_to_skb() broke some packetdrill tests.
When --mss=XXX option is set, packetdrill always provide gso_type & gso_size for its inbound packets, regardless of packet size.
if (packet->tcp && packet->mss) { if (packet->ipv4) gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV4; else gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV6; gso.gso_size = packet->mss; }
Since many other programs could do the same, relax virtio_net_hdr_to_skb() to no longer return an error, but instead ignore gso settings.
This keeps Willem intent to make sure no malicious packet could reach gso stack.
Note that TCP stack has a special logic in tcp_set_skb_tso_segs() to clear gso_size for small packets.
Fixes: 6dd912f82680 ("net: check untrusted gso_size at kernel entry") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Willem de Bruijn willemb@google.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Liu Jian liujian56@huawei.com Reviewed-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- include/linux/virtio_net.h | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h index a7c197299fc7..0b13d8693efa 100644 --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -118,16 +118,17 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) { u16 gso_size = __virtio16_to_cpu(little_endian, hdr->gso_size); + struct skb_shared_info *shinfo = skb_shinfo(skb);
- if (skb->len - p_off <= gso_size) - return -EINVAL; - - skb_shinfo(skb)->gso_size = gso_size; - skb_shinfo(skb)->gso_type = gso_type; + /* Too small packets are not really GSO ones. */ + if (skb->len - p_off > gso_size) { + shinfo->gso_size = gso_size; + shinfo->gso_type = gso_type;
- /* Header must be checked, and gso_segs computed. */ - skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY; - skb_shinfo(skb)->gso_segs = 0; + /* Header must be checked, and gso_segs computed. */ + shinfo->gso_type |= SKB_GSO_DODGY; + shinfo->gso_segs = 0; + } }
return 0;
From: Jonathan Davies jonathan.davies@nutanix.com
stable inclusion from stable-v4.19.218 commit 960b360ca7463921c1a6b72e7066a706d6406223 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6S5DO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
---------------------------
[ Upstream commit cf9acc90c80ecbee00334aa85d92f4e74014bcff ]
virtio_net_hdr_to_skb does not set the skb's gso_size and gso_type correctly for UFO packets received via virtio-net that are a little over the GSO size. This can lead to problems elsewhere in the networking stack, e.g. ovs_vport_send dropping over-sized packets if gso_size is not set.
This is due to the comparison
if (skb->len - p_off > gso_size)
not properly accounting for the transport layer header.
p_off includes the size of the transport layer header (thlen), so skb->len - p_off is the size of the TCP/UDP payload.
gso_size is read from the virtio-net header. For UFO, fragmentation happens at the IP level so does not need to include the UDP header.
Hence the calculation could be comparing a TCP/UDP payload length with an IP payload length, causing legitimate virtio-net packets to have lack gso_type/gso_size information.
Example: a UDP packet with payload size 1473 has IP payload size 1481. If the guest used UFO, it is not fragmented and the virtio-net header's flags indicate that it is a GSO frame (VIRTIO_NET_HDR_GSO_UDP), with gso_size = 1480 for an MTU of 1500. skb->len will be 1515 and p_off will be 42, so skb->len - p_off = 1473. Hence the comparison fails, and shinfo->gso_size and gso_type are not set as they should be.
Instead, add the UDP header length before comparing to gso_size when using UFO. In this way, it is the size of the IP payload that is compared to gso_size.
Fixes: 6dd912f82680 ("net: check untrusted gso_size at kernel entry") Signed-off-by: Jonathan Davies jonathan.davies@nutanix.com Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Liu Jian liujian56@huawei.com Reviewed-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- include/linux/virtio_net.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h index 0b13d8693efa..bd3ed134cc67 100644 --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -118,10 +118,15 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) { u16 gso_size = __virtio16_to_cpu(little_endian, hdr->gso_size); + unsigned int nh_off = p_off; struct skb_shared_info *shinfo = skb_shinfo(skb);
+ /* UFO may not include transport header in gso_size. */ + if (gso_type & SKB_GSO_UDP) + nh_off -= thlen; + /* Too small packets are not really GSO ones. */ - if (skb->len - p_off > gso_size) { + if (skb->len - nh_off > gso_size) { shinfo->gso_size = gso_size; shinfo->gso_type = gso_type;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
stable inclusion from stable-v4.19.273 commit 669c76e55de332fbcbce5b74fccef1b4698a8936 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6OOP3 CVE: CVE-2023-1513
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 2c10b61421a28e95a46ab489fd56c0f442ff6952 upstream.
When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some unitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it.
Bonus is that this reduces the lines of code as the explicit flag setting and reserved space zeroing out can be removed.
Cc: Sean Christopherson seanjc@google.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: x86@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: stable stable@kernel.org Reported-by: Xingyuan Mo hdthky0@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Message-Id: 20230214103304.3689213-1-gregkh@linuxfoundation.org Tested-by: Xingyuan Mo hdthky0@gmail.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Guo Mengqi guomengqi3@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- arch/x86/kvm/x86.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 36bc8a69ef5d..58aea3fb0f3b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3667,12 +3667,11 @@ static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu, { unsigned long val;
+ memset(dbgregs, 0, sizeof(*dbgregs)); memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db)); kvm_get_dr(vcpu, 6, &val); dbgregs->dr6 = val; dbgregs->dr7 = vcpu->arch.dr7; - dbgregs->flags = 0; - memset(&dbgregs->reserved, 0, sizeof(dbgregs->reserved)); }
static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,