Patchset for CVE-2021-47024.
Jia He (1):
  virtio_vsock: Fix race condition in virtio_transport_recv_pkt

Stefano Garzarella (2):
  vsock/virtio: discard packets only when socket is really closed
  vsock/virtio: free queued packets when closing socket

 net/vmw_vsock/virtio_transport_common.c | 38 +++++++++++++++++++------
 1 file changed, 29 insertions(+), 9 deletions(-)
From: Jia He <justin.he@arm.com>
mainline inclusion
from mainline-v5.7
commit 8692cefc433f282228fd44938dd4d26ed38254a2
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I94J63
CVE: CVE-2021-47024
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
When a client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the server on the guest, there will be a panic on a ThunderX2 (armv8a server):
[ 463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 463.718848] Mem abort info:
[ 463.718849]   ESR = 0x96000044
[ 463.718852]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 463.718853]   SET = 0, FnV = 0
[ 463.718854]   EA = 0, S1PTW = 0
[ 463.718855] Data abort info:
[ 463.718856]   ISV = 0, ISS = 0x00000044
[ 463.718857]   CM = 0, WnR = 1
[ 463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
[ 463.718861] [0000000000000000] pgd=0000000000000000
[ 463.718866] Internal error: Oops: 96000044 [#1] SMP
[...]
[ 463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G           O      5.7.0-rc7+ #139
[ 463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
[ 463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
[ 463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[ 463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
[ 463.719000] sp : ffff80002dbe3c40
[...]
[ 463.719025] Call trace:
[ 463.719030]  virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[ 463.719034]  vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
[ 463.719041]  vhost_worker+0x100/0x1a0 [vhost]
[ 463.719048]  kthread+0x128/0x130
[ 463.719052]  ret_from_fork+0x10/0x18
The race condition is as follows:

Task1                                Task2
=====                                =====
__sock_release                       virtio_transport_recv_pkt
  __vsock_release                      vsock_find_bound_socket (found sk)
    lock_sock_nested
    vsock_remove_sock
    sock_orphan
      sk_set_socket(sk, NULL)
    sk->sk_shutdown = SHUTDOWN_MASK
    ...
    release_sock
                                       lock_sock
                                       virtio_transport_recv_connecting
                                         sk->sk_socket->state (panic!)
The root cause is that vsock_find_bound_socket() cannot take lock_sock, so there is a small race window between vsock_find_bound_socket() and lock_sock(). If __vsock_release() runs in another task during that window, sk->sk_socket is set to NULL inadvertently.

This fixes it by checking sk->sk_shutdown (suggested by Stefano) after lock_sock, since sk->sk_shutdown is set to SHUTDOWN_MASK under the protection of lock_sock_nested.
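To make the locking hazard concrete, below is a minimal user-space sketch of the same pattern the fix applies: any state observed before taking the socket lock must be re-validated once the lock is held, because the release path may have run in between. This is only an illustration; the names used here (fake_sock, recv_pkt) are hypothetical stand-ins and not kernel code.

/* Minimal user-space illustration (not kernel code) of "re-check state
 * after acquiring the lock". All names are hypothetical.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_sock {
	pthread_mutex_t lock;
	bool shutdown;      /* stands in for sk->sk_shutdown == SHUTDOWN_MASK */
	void *socket;       /* stands in for sk->sk_socket, NULL after release */
};

static void recv_pkt(struct fake_sock *sk)
{
	/* the lookup happened without the lock, like vsock_find_bound_socket() */
	pthread_mutex_lock(&sk->lock);

	/* the fix: bail out if the socket was released before we got the lock */
	if (sk->shutdown) {
		pthread_mutex_unlock(&sk->lock);
		printf("socket already shut down, dropping packet\n");
		return;
	}

	/* only safe to dereference sk->socket after the check above */
	printf("processing packet, socket=%p\n", sk->socket);
	pthread_mutex_unlock(&sk->lock);
}

int main(void)
{
	struct fake_sock sk = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.shutdown = true,   /* simulate __vsock_release() having run */
		.socket = NULL,
	};

	recv_pkt(&sk);
	return 0;
}

Built with "cc -pthread" and run, this takes the "dropping packet" branch, mirroring what the added sk->sk_shutdown check does in virtio_transport_recv_pkt().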
Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
	net/vmw_vsock/virtio_transport_common.c
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 net/vmw_vsock/virtio_transport_common.c | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 52242a148c70..723ea42d7a40 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1037,6 +1037,14 @@ void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
 
 	lock_sock(sk);
 
+	/* Check if sk has been released before lock_sock */
+	if (sk->sk_shutdown == SHUTDOWN_MASK) {
+		(void)virtio_transport_reset_no_sock(pkt);
+		release_sock(sk);
+		sock_put(sk);
+		goto free_pkt;
+	}
+
 	/* Update CID in case it has changed after a transport reset event */
 	vsk->local_addr.svm_cid = dst.svm_cid;
From: Stefano Garzarella <sgarzare@redhat.com>
mainline inclusion
from mainline-v5.10-rc6
commit 3fe356d58efae54dade9ec94ea7c919ed20cf4db
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I94J63
CVE: CVE-2021-47024
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Starting from commit 8692cefc433f ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt"), we discard packets in virtio_transport_recv_pkt() if the socket has been released.
When the socket is connected, we schedule a delayed work to wait for the RST packet from the other peer, even if SHUTDOWN_MASK is set in sk->sk_shutdown. This is done to complete the virtio-vsock shutdown algorithm, releasing the port assigned to the socket definitively only when the other peer has consumed all the packets.

If we discard the received RST packet, the socket is closed only when VSOCK_CLOSE_TIMEOUT is reached.

Sergio discovered the issue while running the ab(1) HTTP benchmark using libkrun [1] and observing a latency increase with that commit.

To avoid this issue, we discard packets only if the socket is really closed (SOCK_DONE flag is set). We also set SOCK_DONE in virtio_transport_release() when we don't need to wait for any packets from the other peer (we didn't schedule the delayed work). In this case we remove the socket from the vsock lists, releasing the assigned port.
[1] https://github.com/containers/libkrun
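As an illustration of the behavioural difference between the two checks, the user-space sketch below contrasts the first fix (discard on SHUTDOWN_MASK) with this one (discard only on SOCK_DONE). The types and helpers (fake_sock, discard_v1, discard_v2) are hypothetical stand-ins, not kernel code.

/* User-space sketch (hypothetical names) of the refined check: a socket
 * with shutdown set may still be waiting for the peer's RST, so packets
 * should only be discarded once the socket is definitively done.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_sock {
	bool shutdown_mask;   /* SHUTDOWN_MASK: no more user I/O */
	bool sock_done;       /* SOCK_DONE: close handshake fully completed */
};

/* first fix (too strict): also drops the RST we are still waiting for */
static bool discard_v1(const struct fake_sock *sk)
{
	return sk->shutdown_mask;
}

/* this fix: only drop once the socket is really closed */
static bool discard_v2(const struct fake_sock *sk)
{
	return sk->sock_done;
}

int main(void)
{
	/* closed by the application, but still waiting for the peer's RST */
	struct fake_sock sk = { .shutdown_mask = true, .sock_done = false };

	printf("v1 drops the RST: %d (shutdown stalls until VSOCK_CLOSE_TIMEOUT)\n",
	       discard_v1(&sk));
	printf("v2 keeps the RST: %d (shutdown completes promptly)\n",
	       !discard_v2(&sk));
	return 0;
}

The point is that a connected socket with SHUTDOWN_MASK set may still need the peer's RST, so only SOCK_DONE reliably marks a socket whose packets can be thrown away.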
Fixes: 8692cefc433f ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt")
Cc: justin.he@arm.com
Reported-by: Sergio Lopez <slp@redhat.com>
Tested-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jia He <justin.he@arm.com>
Link: https://lore.kernel.org/r/20201120104736.73749-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Conflicts:
	net/vmw_vsock/virtio_transport_common.c
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 net/vmw_vsock/virtio_transport_common.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 723ea42d7a40..ff46c0fcc02b 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -810,8 +810,10 @@ void virtio_transport_release(struct vsock_sock *vsk)
 	}
 	release_sock(sk);
 
-	if (remove_sock)
+	if (remove_sock) {
+		sock_set_flag(sk, SOCK_DONE);
 		vsock_remove_sock(vsk);
+	}
 }
 EXPORT_SYMBOL_GPL(virtio_transport_release);
@@ -1037,8 +1039,8 @@ void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
 
 	lock_sock(sk);
 
-	/* Check if sk has been released before lock_sock */
-	if (sk->sk_shutdown == SHUTDOWN_MASK) {
+	/* Check if sk has been closed before lock_sock */
+	if (sock_flag(sk, SOCK_DONE)) {
 		(void)virtio_transport_reset_no_sock(pkt);
 		release_sock(sk);
 		sock_put(sk);
From: Stefano Garzarella <sgarzare@redhat.com>
mainline inclusion
from mainline-v5.13-rc1
commit 8432b8114957235f42e070a16118a7f750de9d39
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I94J63
CVE: CVE-2021-47024
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
As reported by syzbot [1], there is a memory leak while closing the socket. We partially solved this issue with commit ac03046ece2b ("vsock/virtio: free packets during the socket release"), but we forgot to drain the RX queue when the socket is definitely closed by the scheduled work.
To avoid future issues, let's use the new virtio_transport_remove_sock() to drain the RX queue before removing the socket from the af_vsock lists by calling vsock_remove_sock().
[1] https://syzkaller.appspot.com/bug?extid=24452624fc4c571eedd9
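The sketch below illustrates the idea of the fix in plain user-space C: every path that removes the socket first walks the RX queue and frees whatever is still queued, so none of the packets can leak. The names here (fake_pkt, fake_sock, remove_sock) are hypothetical; the real code does this in virtio_transport_remove_sock() with list_for_each_entry_safe() and virtio_transport_free_pkt(), as shown in the diff below.

/* User-space sketch (hypothetical types) of draining the RX queue before
 * removing the socket, so queued packets are not leaked at close time.
 */
#include <stdio.h>
#include <stdlib.h>

struct fake_pkt {
	int id;
	struct fake_pkt *next;
};

struct fake_sock {
	struct fake_pkt *rx_queue;   /* stands in for vvs->rx_queue */
};

/* analogue of virtio_transport_remove_sock(): drain the queue, then remove */
static void remove_sock(struct fake_sock *sk)
{
	struct fake_pkt *pkt = sk->rx_queue;

	while (pkt) {
		struct fake_pkt *next = pkt->next;

		printf("freeing queued pkt %d\n", pkt->id);
		free(pkt);                 /* like virtio_transport_free_pkt() */
		pkt = next;
	}
	sk->rx_queue = NULL;
	printf("socket removed\n");        /* like vsock_remove_sock() */
}

int main(void)
{
	struct fake_sock sk = { .rx_queue = NULL };

	/* queue two packets that would otherwise leak at close time */
	for (int i = 2; i >= 1; i--) {
		struct fake_pkt *pkt = malloc(sizeof(*pkt));

		pkt->id = i;
		pkt->next = sk.rx_queue;
		sk.rx_queue = pkt;
	}

	remove_sock(&sk);
	return 0;
}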
Fixes: ac03046ece2b ("vsock/virtio: free packets during the socket release")
Reported-and-tested-by: syzbot+24452624fc4c571eedd9@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
	net/vmw_vsock/virtio_transport_common.c
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 net/vmw_vsock/virtio_transport_common.c | 30 ++++++++++++++++---------
 1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index ff46c0fcc02b..6db562d9569d 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -700,6 +700,23 @@ static int virtio_transport_reset_no_sock(struct virtio_vsock_pkt *pkt)
 	return t->send_pkt(reply);
 }
 
+/* This function should be called with sk_lock held and SOCK_DONE set */
+static void virtio_transport_remove_sock(struct vsock_sock *vsk)
+{
+	struct virtio_vsock_sock *vvs = vsk->trans;
+	struct virtio_vsock_pkt *pkt, *tmp;
+
+	/* We don't need to take rx_lock, as the socket is closing and we are
+	 * removing it.
+	 */
+	list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
+		list_del(&pkt->list);
+		virtio_transport_free_pkt(pkt);
+	}
+
+	vsock_remove_sock(vsk);
+}
+
 static void virtio_transport_wait_close(struct sock *sk, long timeout)
 {
 	if (timeout) {
@@ -732,7 +749,7 @@ static void virtio_transport_do_close(struct vsock_sock *vsk,
 	    (!cancel_timeout || cancel_delayed_work(&vsk->close_work))) {
 		vsk->close_work_scheduled = false;
 
-		vsock_remove_sock(vsk);
+		virtio_transport_remove_sock(vsk);
 
 		/* Release refcnt obtained when we scheduled the timeout */
 		sock_put(sk);
@@ -795,8 +812,6 @@ static bool virtio_transport_close(struct vsock_sock *vsk)
 
 void virtio_transport_release(struct vsock_sock *vsk)
 {
-	struct virtio_vsock_sock *vvs = vsk->trans;
-	struct virtio_vsock_pkt *pkt, *tmp;
 	struct sock *sk = &vsk->sk;
 	bool remove_sock = true;
 
@@ -804,16 +819,11 @@ void virtio_transport_release(struct vsock_sock *vsk)
 	if (sk->sk_type == SOCK_STREAM)
 		remove_sock = virtio_transport_close(vsk);
 
-	list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
-		list_del(&pkt->list);
-		virtio_transport_free_pkt(pkt);
-	}
-	release_sock(sk);
-
 	if (remove_sock) {
 		sock_set_flag(sk, SOCK_DONE);
-		vsock_remove_sock(vsk);
+		virtio_transport_remove_sock(vsk);
 	}
+	release_sock(sk);
 }
 EXPORT_SYMBOL_GPL(virtio_transport_release);
FeedBack: The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/4949
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/M...