Hi All,
I'm confused about the VLAN validation mechanism of RoCEv2.
Assume we have two nodes, each with an HCA that supports RoCEv2. We add a VLAN (id = 1) on each node, but the two VLAN interfaces are on different subnets. The IP and VLAN configuration is as follows:
             NODE_A                                     NODE_B
+--------+---------------+------+          +--------+---------------+------+
| device | IP            | VLAN |          | device | IP            | VLAN |
+--------+---------------+------+          +--------+---------------+------+
| eth0   | 192.168.97.1  | 0    |     /----| eth0   | 192.168.100.2 | 0    |
+--------+---------------+------+    /     +--------+---------------+------+
| eth0.1 | 192.168.100.3 | 1    |---/      | eth0.1 | 192.168.98.2  | 1    |
+--------+---------------+------+          +--------+---------------+------+
Now I try to ping eth0 on NODE_B from eth0.1 on NODE_A. Of course it fails, because these devices use different VLAN IDs.
Then I ran some tests on RoCE. The first one is a simple RC send test:
NODE_A: ib_send_bw -d mlx5_0 -x 5              (sgid 5 belongs to eth0.1)
NODE_B: ib_send_bw -d mlx5_0 -x 3 <server ip>  (sgid 3 belongs to eth0)
The result is as expected: the RoCEv2 packets with the mismatched VLAN ID are dropped. I think the reason is that, for the RC service, the VLAN information of a QP is recorded in the QPC (QP context), so the HCA can check it when it receives a packet.
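For anyone who wants to reproduce this: the mapping from GID index to netdev behind the -x values in these tests can be read from sysfs. Below is a minimal Python sketch, assuming the usual Linux layout under /sys/class/infiniband (gids/, gid_attrs/ndevs/ and gid_attrs/types/ per port). The function name gid_table and its parameters are only illustrative, and port 1 is an assumption.

```python
import os


def _read(path):
    """Return the stripped contents of a small sysfs attribute file."""
    with open(path) as f:
        return f.read().strip()


def gid_table(dev="mlx5_0", port=1, sysfs_root="/sys/class/infiniband"):
    """Map each populated GID index to (gid, netdev, gid_type).

    dev, port and sysfs_root are illustrative defaults (assumptions);
    adjust them for the system under test.
    """
    base = os.path.join(sysfs_root, dev, "ports", str(port))
    table = {}
    for name in sorted(os.listdir(os.path.join(base, "gids")), key=int):
        try:
            gid = _read(os.path.join(base, "gids", name))
            ndev = _read(os.path.join(base, "gid_attrs", "ndevs", name))
            gid_type = _read(os.path.join(base, "gid_attrs", "types", name))
        except OSError:
            # Unused GID slots have no netdev attribute to read; skip them.
            continue
        table[int(name)] = (gid, ndev, gid_type)
    return table
```

Calling gid_table() on each node and looking at the entries for index 5 and index 3 should show which netdev (and therefore which VLAN) each -x value refers to.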
But when I run a simple UD send test:
NODE_A: ib_send_bw -d mlx5_0 -x 5 -c UD              (sgid 5 belongs to eth0.1)
NODE_B: ib_send_bw -d mlx5_0 -x 3 -c UD <server ip>  (sgid 3 belongs to eth0)
The test completes successfully, without any error. So my question is: since RoCEv2 runs over Ethernet, shouldn't a RoCEv2 node check the VLAN ID of every incoming packet?
UD is connectionless, and a UD QP doesn't record VLAN info in its QPC, so how would this check be done? Or should a UD QP simply ignore a mismatched VLAN ID?
I couldn't find anything about this in the IB specification. I'd appreciate it if someone could help explain it.
Thanks,
Weihang