[PATCH openEuler-1.0-LTS 0/2] smb: client: Fix netns refcount imbalance

Enzo Matsumiya (1): smb: client: fix TCP timers deadlock after rmmod Wang Zhaolong (1): smb: client: Fix netns refcount imbalance causing leaks and use-after-free fs/cifs/connect.c | 33 ++++++++++++++++++++++++--------- 1 file changed, 24 insertions(+), 9 deletions(-) -- 2.39.2

From: Enzo Matsumiya <ematsumiya@suse.de> mainline inclusion from mainline-v6.10-rc2 commit e9f2517a3e18a54a3943c098d2226b245d488801 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBI5BZ CVE: CVE-2024-54680 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") fixed a netns UAF by manually enabled socket refcounting (sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)). The reason the patch worked for that bug was because we now hold references to the netns (get_net_track() gets a ref internally) and they're properly released (internally, on __sk_destruct()), but only because sk->sk_net_refcnt was set. Problem: (this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless if init_net or other) Setting sk->sk_net_refcnt=1 *manually* and *after* socket creation is not only out of cifs scope, but also technically wrong -- it's set conditionally based on user (=1) vs kernel (=0) sockets. And net/ implementations seem to base their user vs kernel space operations on it. e.g. upon TCP socket close, the TCP timers are not cleared because sk->sk_net_refcnt=1: (cf. commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets")) net/ipv4/tcp.c: void tcp_close(struct sock *sk, long timeout) { lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); if (!sk->sk_net_refcnt) inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); } Which will throw a lockdep warning and then, as expected, deadlock on tcp_write_timer(). A way to reproduce this is by running the reproducer from ef7134c7fc48 and then 'rmmod cifs'. A few seconds later, the deadlock/lockdep warning shows up. Fix: We shouldn't mess with socket internals ourselves, so do not set sk_net_refcnt manually. Also change __sock_create() to sock_create_kern() for explicitness. As for non-init_net network namespaces, we deal with it the best way we can -- hold an extra netns reference for server->ssocket and drop it when it's released. This ensures that the netns still exists whenever we need to create/destroy server->ssocket, but is not directly tied to it. Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com> Conflicts: fs/cifs/connect.c fs/smb/client/connect.c [Conflicts due to fs/cifs move to fs/smb/client] Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com> --- fs/cifs/connect.c | 35 +++++++++++++++++++++++++---------- 1 file changed, 25 insertions(+), 10 deletions(-) diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index bd9365d375f9..ee9bdc4ca569 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -811,6 +811,9 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server) if (server->ssocket) { sock_release(server->ssocket); server->ssocket = NULL; + + /* Release netns reference for the socket. */ + put_net(cifs_net_ns(server)); } if (!list_empty(&server->pending_mid_q)) { @@ -859,6 +862,7 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server) */ } + /* Release netns reference for this server. */ put_net(cifs_net_ns(server)); kfree(server->hostname); kfree(server); @@ -2526,6 +2530,8 @@ cifs_get_tcp_session(struct smb_vol *volume_info) tcp_ses->ops = volume_info->ops; tcp_ses->vals = volume_info->vals; + + /* Grab netns reference for this server. */ cifs_set_net_ns(tcp_ses, get_net(current->nsproxy->net_ns)); tcp_ses->hostname = extract_hostname(volume_info->UNC); if (IS_ERR(tcp_ses->hostname)) { @@ -2627,14 +2633,17 @@ cifs_get_tcp_session(struct smb_vol *volume_info) out_err_crypto_release: cifs_crypto_secmech_release(tcp_ses); + /* Release netns reference for this server. */ put_net(cifs_net_ns(tcp_ses)); out_err: if (tcp_ses) { if (!IS_ERR(tcp_ses->hostname)) kfree(tcp_ses->hostname); - if (tcp_ses->ssocket) + if (tcp_ses->ssocket) { sock_release(tcp_ses->ssocket); + put_net(cifs_net_ns(tcp_ses)); + } kfree(tcp_ses); } return ERR_PTR(rc); @@ -3629,22 +3638,22 @@ generic_ip_connect(struct TCP_Server_Info *server) if (socket == NULL) { struct net *net = cifs_net_ns(server); - struct sock *sk; - rc = __sock_create(net, sfamily, SOCK_STREAM, - IPPROTO_TCP, &socket, 1); + rc = sock_create_kern(net, sfamily, SOCK_STREAM, + IPPROTO_TCP, &socket); if (rc < 0) { cifs_dbg(VFS, "Error %d creating socket\n", rc); server->ssocket = NULL; return rc; } - sk = socket->sk; - sk->sk_net_refcnt = 1; + /* + * Grab netns reference for the socket. + * + * It'll be released here, on error, or in clean_demultiplex_info() upon server + * teardown. + */ get_net(net); -#ifdef CONFIG_PROC_FS - this_cpu_add(*net->core.sock_inuse, 1); -#endif /* BB other socket options to set KEEPALIVE, NODELAY? */ cifs_dbg(FYI, "Socket created\n"); @@ -3657,8 +3666,10 @@ generic_ip_connect(struct TCP_Server_Info *server) } rc = bind_socket(server); - if (rc < 0) + if (rc < 0) { + put_net(cifs_net_ns(server)); return rc; + } /* * Eventually check for other socket options to change from @@ -3692,6 +3703,7 @@ generic_ip_connect(struct TCP_Server_Info *server) rc = socket->ops->connect(socket, saddr, slen, 0); if (rc < 0) { cifs_dbg(FYI, "Error %d connecting to server\n", rc); + put_net(cifs_net_ns(server)); sock_release(socket); server->ssocket = NULL; return rc; @@ -3700,6 +3712,9 @@ generic_ip_connect(struct TCP_Server_Info *server) if (sport == htons(RFC1001_PORT)) rc = ip_rfc1001_connect(server); + if (rc < 0) + put_net(cifs_net_ns(server)); + return rc; } -- 2.39.2

maillist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBNCLL CVE: CVE-2024-54680 Reference: https://lore.kernel.org/linux-cifs/CAH2r5muqxgPy=n4Aks_k-2goEZmmD10Dx3u3jgOk... --------------------------------------------- Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") attempted to fix a netns use-after-free issue by manually adjusting reference counts via sk->sk_net_refcnt and sock_inuse_add(). However, a later commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod") pointed out that the approach of manually setting sk->sk_net_refcnt in the first commit was technically incorrect, as sk->sk_net_refcnt should only be set for user sockets. It led to issues like TCP timers not being cleared properly on close. The second commit moved to a model of just holding an extra netns reference for server->ssocket using get_net(), and dropping it when the server is torn down. But there remain some gaps in the get_net()/put_net() balancing added by these commits. The incomplete reference handling in these fixes results in two issues: 1. Netns refcount leaks[1] The problem process is as follows: ``` mount.cifs cifsd cifs_do_mount cifs_mount cifs_mount_get_session cifs_get_tcp_session get_net() /* First get net. */ ip_connect generic_ip_connect /* Try port 445 */ get_net() ->connect() /* Failed */ put_net() generic_ip_connect /* Try port 139 */ get_net() /* Missing matching put_net() for this get_net().*/ cifs_get_smb_ses cifs_negotiate_protocol smb2_negotiate SMB2_negotiate cifs_send_recv wait_for_response cifs_demultiplex_thread cifs_read_from_socket cifs_readv_from_socket cifs_reconnect cifs_abort_connection sock_release(); server->ssocket = NULL; /* Missing put_net() here. */ generic_ip_connect get_net() ->connect() /* Failed */ put_net() sock_release(); server->ssocket = NULL; free_rsp_buf ... clean_demultiplex_info /* It's only called once here. */ put_net() ``` When cifs_reconnect() is triggered, the server->ssocket is released without a corresponding put_net() for the reference acquired in generic_ip_connect() before. it ends up calling generic_ip_connect() again to retry get_net(). After that, server->ssocket is set to NULL in the error path of generic_ip_connect(), and the net count cannot be released in the final clean_demultiplex_info() function. 2. Potential use-after-free The current refcounting scheme can lead to a potential use-after-free issue in the following scenario: ``` cifs_do_mount cifs_mount cifs_mount_get_session cifs_get_tcp_session get_net() /* First get net */ ip_connect generic_ip_connect get_net() bind_socket kernel_bind /* failed */ put_net() /* after out_err_crypto_release label */ put_net() /* after out_err label */ put_net() ``` In the exception handling process where binding the socket fails, the get_net() and put_net() calls are unbalanced, which may cause the server->net reference count to drop to zero and be prematurely released. To address both issues, this patch ties the netns reference counting to the server->ssocket and server lifecycles. The extra reference is now acquired when the server or socket is created, and released when the socket is destroyed or the server is torn down. [1]: https://bugzilla.kernel.org/show_bug.cgi?id=219792 Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") Fixes: e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod") Conflicts: fs/cifs/connect.c fs/smb/client/connect.c [Conflicts due to fs/cifs move to fs/smb/client] Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com> --- fs/cifs/connect.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index ee9bdc4ca569..9543f59b7b25 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -429,6 +429,7 @@ cifs_reconnect(struct TCP_Server_Info *server) server->ssocket->state, server->ssocket->flags); sock_release(server->ssocket); server->ssocket = NULL; + put_net(cifs_net_ns(server)); } server->sequence_number = 0; server->session_estab = false; @@ -3650,8 +3651,12 @@ generic_ip_connect(struct TCP_Server_Info *server) /* * Grab netns reference for the socket. * - * It'll be released here, on error, or in clean_demultiplex_info() upon server - * teardown. + * This reference will be released in several situations: + * - In the failure path before the cifsd is started + * - In the all place where server->socket is released, it is + * also set to NULL. + * - Ultimately in clean_demultiplex_info(), during the final + * teardown. */ get_net(net); @@ -3666,10 +3671,8 @@ generic_ip_connect(struct TCP_Server_Info *server) } rc = bind_socket(server); - if (rc < 0) { - put_net(cifs_net_ns(server)); + if (rc < 0) return rc; - } /* * Eventually check for other socket options to change from @@ -3712,9 +3715,6 @@ generic_ip_connect(struct TCP_Server_Info *server) if (sport == htons(RFC1001_PORT)) rc = ip_rfc1001_connect(server); - if (rc < 0) - put_net(cifs_net_ns(server)); - return rc; } -- 2.39.2

反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/15240 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/ABU... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/15240 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/ABU...
participants (2)
-
patchwork bot
-
Wang Zhaolong