From: Rosemarie O'Riorden roriorden@redhat.com
stable inclusion from stable-v5.10.127 commit 49e3e449bc4e1208e3db7f19a86246467b230672 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5XDDK
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 12378a5a75e33f34f8586706eb61cca9e6d4690c upstream.
When a packet enters the OVS datapath and does not match any existing flows installed in the kernel flow cache, the packet will be sent to userspace to be parsed, and a new flow will be created. The kernel and OVS rely on each other to parse packet fields in the same way so that packets will be handled properly.
As per the design document linked below, OVS expects all later IPv6 fragments to have nw_proto=44 in the flow key, so they can be correctly matched on OpenFlow rules. OpenFlow controllers create pipelines based on this design.
This behavior was changed by the commit in the Fixes tag so that nw_proto equals the next_header field of the last extension header. However, there is no counterpart for this change in OVS userspace, meaning that this field is parsed differently between OVS and the kernel. This is a problem because OVS creates actions based on what is parsed in userspace, but the kernel-provided flow key is used as a match criteria, as described in Documentation/networking/openvswitch.rst. This leads to issues such as packets incorrectly matching on a flow and thus the wrong list of actions being applied to the packet. Such changes in packet parsing cannot be implemented without breaking the userspace.
The offending commit is partially reverted to restore the expected behavior.
The change technically made sense and there is a good reason that it was implemented, but it does not comply with the original design of OVS. If in the future someone wants to implement such a change, then it must be user-configurable and disabled by default to preserve backwards compatibility with existing OVS versions.
Cc: stable@vger.kernel.org Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags") Link: https://docs.openvswitch.org/en/latest/topics/design/#fragments Signed-off-by: Rosemarie O'Riorden roriorden@redhat.com Acked-by: Eelco Chaudron echaudro@redhat.com Link: https://lore.kernel.org/r/20220621204845.9721-1-roriorden@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/openvswitch/flow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index b03d142ec82e..c9ba61413c98 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -265,7 +265,7 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct sw_flow_key *key) if (flags & IP6_FH_F_FRAG) { if (frag_off) { key->ip.frag = OVS_FRAG_TYPE_LATER; - key->ip.proto = nexthdr; + key->ip.proto = NEXTHDR_FRAGMENT; return 0; } key->ip.frag = OVS_FRAG_TYPE_FIRST;