mainline inclusion
from mainline-v6.5-rc6
commit 01f4fd27087078c90a0e22860d1dfa2cd0510791
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7T6MN
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
BUG_ON(!vlan_info) is triggered in unregister_vlan_dev() with the following testcase:

  # ip netns add ns1
  # ip netns exec ns1 ip link add bond0 type bond mode 0
  # ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
  # ip netns exec ns1 ip link set bond_slave_1 master bond0
  # ip netns exec ns1 ip link add link bond_slave_1 name vlan10 type vlan id 10 protocol 802.1ad
  # ip netns exec ns1 ip link add link bond0 name bond0_vlan10 type vlan id 10 protocol 802.1ad
  # ip netns exec ns1 ip link set bond_slave_1 nomaster
  # ip netns del ns1

The logical analysis of the problem is as follows:

1. create ETH_P_8021AD protocol vlan10 for bond_slave_1:
register_vlan_dev()
  vlan_vid_add()
    vlan_info_alloc()
    __vlan_vid_add() // add [ETH_P_8021AD, 10] vid to bond_slave_1

2. create ETH_P_8021AD protocol bond0_vlan10 for bond0:
register_vlan_dev()
  vlan_vid_add()
    __vlan_vid_add()
      vlan_add_rx_filter_info()
          if (!vlan_hw_filter_capable(dev, proto)) // condition established because bond0 without NETIF_F_HW_VLAN_STAG_FILTER
              return 0;

          if (netif_device_present(dev))
              return dev->netdev_ops->ndo_vlan_rx_add_vid(dev, proto, vid); // will be never called
              // The slaves of bond0 will not refer to the [ETH_P_8021AD, 10] vid.

3. detach bond_slave_1 from bond0:
__bond_release_one()
  vlan_vids_del_by_dev()
    list_for_each_entry(vid_info, &vlan_info->vid_list, list)
        vlan_vid_del(dev, vid_info->proto, vid_info->vid);
        // bond_slave_1 [ETH_P_8021AD, 10] vid will be deleted.
        // bond_slave_1->vlan_info will be assigned NULL.

4. delete vlan10 during delete ns1:
default_device_exit_batch()
  dev->rtnl_link_ops->dellink() // unregister_vlan_dev() for vlan10
    vlan_info = rtnl_dereference(real_dev->vlan_info); // real_dev of vlan10 is bond_slave_1
        BUG_ON(!vlan_info); // bond_slave_1->vlan_info is NULL now, bug is triggered!!!

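The four steps above come down to a reference-count imbalance on the slave. It can be replayed in a small userspace model. This is only a sketch: struct model_dev and the model_* helpers are hypothetical names invented for illustration, not kernel APIs, and the model tracks a single vid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	int vid_refs;        /* references held on the [ETH_P_8021AD, 10] vid */
	bool has_vlan_info;  /* stands in for dev->vlan_info != NULL */
	bool stag_filter;    /* stands in for NETIF_F_HW_VLAN_STAG_FILTER */
};

/* Models vlan_vid_add(): create the info on first use, take one reference. */
static void model_vid_add(struct model_dev *dev)
{
	dev->has_vlan_info = true;
	dev->vid_refs++;
}

/* Models vlan_vid_del(): drop one reference, free the info with the last one. */
static void model_vid_del(struct model_dev *dev)
{
	if (!dev->has_vlan_info)
		return;
	assert(dev->vid_refs > 0);
	if (--dev->vid_refs == 0)
		dev->has_vlan_info = false;
}

/*
 * Replays steps 1-4 and returns whether the slave still has its vlan_info
 * when vlan10 is unregistered. false corresponds to BUG_ON(!vlan_info).
 */
static bool model_replay(bool bond_has_stag_filter)
{
	struct model_dev slave = { 0 };
	struct model_dev bond = { 0 };

	bond.stag_filter = bond_has_stag_filter;

	/* Step 1: vlan10 on bond_slave_1 takes a reference on the slave. */
	model_vid_add(&slave);

	/* Step 2: bond0_vlan10 reaches the slave only if bond0 has the filter. */
	model_vid_add(&bond);
	if (bond.stag_filter)
		model_vid_add(&slave);

	/* Step 3: releasing the slave drops the bond's vid unconditionally. */
	model_vid_del(&slave);

	/* Step 4: unregister_vlan_dev() for vlan10 needs the slave's info. */
	return slave.has_vlan_info;
}
```

In this model, model_replay(false) returns false, which corresponds to the BUG_ON() firing, while model_replay(true) returns true because the add in step 2 balances the delete in step 3.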
Add support for the S-VLAN tag related features to the bond driver, so that the bond driver always propagates the VLAN info to its slaves.

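The effect of the extra feature bits on the filter check from step 2 can be sketched as follows. The MODEL_* bit positions and the model_* names are hypothetical and chosen only for illustration. The sketch also assumes that the bits set in hw_features end up in the device's active feature mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100u /* C-VLAN TPID */
#define ETH_P_8021AD 0x88A8u /* S-VLAN TPID */

/* Hypothetical bit positions; the kernel's NETIF_F_* values differ. */
#define MODEL_CTAG_RX     (1u << 0)
#define MODEL_CTAG_FILTER (1u << 1)
#define MODEL_STAG_RX     (1u << 2)
#define MODEL_STAG_FILTER (1u << 3)

/* VLAN-related feature bits of the bond before and after the change. */
static const uint32_t model_features_old =
	MODEL_CTAG_RX | MODEL_CTAG_FILTER;
static const uint32_t model_features_new =
	MODEL_CTAG_RX | MODEL_CTAG_FILTER |
	MODEL_STAG_RX | MODEL_STAG_FILTER;

/* Models the per-protocol filter capability check described in step 2. */
static bool model_hw_filter_capable(uint32_t features, uint16_t proto)
{
	if (proto == ETH_P_8021Q)
		return (features & MODEL_CTAG_FILTER) != 0;
	if (proto == ETH_P_8021AD)
		return (features & MODEL_STAG_FILTER) != 0;
	return false;
}
```

With model_features_old the check fails for ETH_P_8021AD, so the early return in step 2 is taken. With model_features_new it passes, so the vid is propagated to the slaves.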
Fixes: 8ad227ff89a7 ("net: vlan: add 802.1ad support")
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20230802114320.4156068-1-william.xuanziyang@huawei...
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 drivers/net/bonding/bond_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9bf010e88da7..be0c4d655cf9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5313,7 +5313,9 @@ void bond_setup(struct net_device *bond_dev)
 
 	bond_dev->hw_features = BOND_VLAN_FEATURES |
 				NETIF_F_HW_VLAN_CTAG_RX |
-				NETIF_F_HW_VLAN_CTAG_FILTER;
+				NETIF_F_HW_VLAN_CTAG_FILTER |
+				NETIF_F_HW_VLAN_STAG_RX |
+				NETIF_F_HW_VLAN_STAG_FILTER;
 
 	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
 #ifdef CONFIG_XFRM_OFFLOAD