From: Jonathon Reinhart jonathon.reinhart@gmail.com
commit 8d432592f30fcc34ef5a10aac4887b4897884493 upstream.
tcp_set_default_congestion_control() is netns-safe in that it writes to &net->ipv4.tcp_congestion_control, but it also sets ca->flags |= TCP_CONG_NON_RESTRICTED which is not namespaced. This has the unintended side-effect of changing the global net.ipv4.tcp_allowed_congestion_control sysctl, despite the fact that it is read-only: 97684f0970f6 ("net: Make tcp_allowed_congestion_control readonly in non-init netns")
Resolve this netns "leak" by only allowing the init netns to set the default algorithm to one that is restricted. This restriction could be removed if tcp_allowed_congestion_control were namespace-ified in the future.
This bug was uncovered with https://github.com/JonathonReinhart/linux-netns-sysctl-verify
Fixes: 6670e1524477 ("tcp: Namespace-ify sysctl_tcp_default_congestion_control") Signed-off-by: Jonathon Reinhart jonathon.reinhart@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- net/ipv4/tcp_cong.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c index 00a7482b6fbd9..533f8d84d2f71 100644 --- a/net/ipv4/tcp_cong.c +++ b/net/ipv4/tcp_cong.c @@ -228,6 +228,10 @@ int tcp_set_default_congestion_control(struct net *net, const char *name) ret = -ENOENT; } else if (!try_module_get(ca->owner)) { ret = -EBUSY; + } else if (!net_eq(net, &init_net) && + !(ca->flags & TCP_CONG_NON_RESTRICTED)) { + /* Only init netns can set default to a restricted algorithm */ + ret = -EPERM; } else { prev = xchg(&net->ipv4.tcp_congestion_control, ca); if (prev)