On 2021/6/8 16:47, Parav Pandit wrote:
From: Yunsheng Lin linyunsheng@huawei.com Sent: Tuesday, June 8, 2021 1:06 PM
On 2021/6/8 13:26, Parav Pandit wrote:
From: Yunsheng Lin linyunsheng@huawei.com Sent: Tuesday, June 8, 2021 8:58 AM
On 2021/6/7 19:12, Parav Pandit wrote:
From: Yunsheng Lin linyunsheng@huawei.com Sent: Monday, June 7, 2021 4:27 PM
[..]
>> 2. each PF's devlink instance has three types of port, which is
>> FLAVOUR_PHYSICAL, FLAVOUR_PCI_PF and FLAVOUR_PCI_VF (supposing I understand
>> port flavour correctly).
>>
> FLAVOUR_PCI_{PF,VF,SF} belongs to eswitch (representor) side on switchdev device.
If the devlink instance or eswitch is in DEVLINK_ESWITCH_MODE_LEGACY mode, the FLAVOUR_PCI_{PF,VF,SF} port instances do not need to
be created?
No. In eswitch legacy mode, there are no representor netdevices or devlink
ports.
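A minimal sketch of the rule stated above, assuming a toy model rather than any kernel API: the names `EswitchMode`, `representor_ports` and the `rep-` label prefix are made up for illustration only.

```python
from enum import Enum


class EswitchMode(Enum):
    """Toy stand-in for the two eswitch modes discussed in this thread."""
    LEGACY = "legacy"
    SWITCHDEV = "switchdev"


def representor_ports(mode, functions):
    """Return hypothetical representor port labels for the given functions.

    `functions` is a list of function labels such as "pf0vf0".
    In legacy mode no representor netdevices or devlink ports exist,
    so the result is an empty list.
    """
    if mode is EswitchMode.LEGACY:
        return []
    # In switchdev mode, one representor port per function (illustrative label).
    return [f"rep-{name}" for name in functions]
```

Calling `representor_ports(EswitchMode.LEGACY, ["pf0vf0"])` gives an empty list, while the same call with `EswitchMode.SWITCHDEV` gives one label per function.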
It seems each devlink port instance corresponds to a netdevice. More specifically, the devlink instance is created in the struct pci_driver's probe function of a pci function, and a devlink port instance is created and registered to that devlink instance when a netdev of that
pci function is created?
Yes.
As in diagram [1], the devlink port instance (flavour FLAVOUR_PHYSICAL) for ctrl-0-pf0 is created when the netdev of ctrl-0-pf0 is created on the smartNIC host, and the devlink port instance (flavour FLAVOUR_VIRTUAL) for ctrl-0-pf0vfN is created when the netdev of ctrl-0-pf0vfN is created on the smartNIC host, right?
The ctrl-0-pf0vfN and ctrl-0-pf0 ports are eswitch ports. They are created where
the eswitch is.
Usually that is in the smartNIC, where the eswitch is located.
Does the diagram in [1] correspond to the multi-host (two-host) setup as mentioned previously? H1.pf0.physical_port = p0. H1.pf1.physical_port = p1. H2.pf0.physical_port = p0. H2.pf1.physical_port = p1.
Yes.
Let's say H1 = server and H2 = smartNIC as the pci rc connected to below:
                 ---------------------------------------------------------
                 |                                                       |
                 |           --------- ---------         ------- ------- |
    -----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
    | server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
    | pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
    | connect |  | -------                       -------                 |
    -----------  |     | controller_num=1 (no eswitch)                   |
                 ------|--------------------------------------------------
                 (internal wire)
                       |
                 ---------------------------------------------------------
                 | devlink eswitch ports and reps                        |
                 | ----------------------------------------------------- |
                 | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
                 | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
                 | ----------------------------------------------------- |
                 | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
                 | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
                 | ----------------------------------------------------- |
                 |                                                       |
                 |                                                       |
    -----------  |           --------- ---------         ------- ------- |
    | smartNIC|  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
    | pci rc  |==| -------   ----/---- ---/----- ------- ---/--- ---/--- |
    | connect |  | | pf0 |______/________/       | pf1 |___/_______/     |
    -----------  | -------                       -------                 |
                 |                                                       |
                 |  local controller_num=0 (eswitch)                     |
                 ---------------------------------------------------------
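The "devlink eswitch ports and reps" table in the diagram can be enumerated with a short sketch. This only reproduces the diagram's own labels (ctrl-N-pfM, with vfN/sfN as placeholders); the function name `eswitch_port_labels` is hypothetical and this is not how the kernel builds port names.

```python
from itertools import product


def eswitch_port_labels(controllers=(0, 1), pfs=(0, 1)):
    """List the eswitch port labels shown in the diagram.

    For every (controller, pf) pair there is one PF port plus placeholder
    entries for its VF and SF ports ("vfN" / "sfN"), as in the diagram.
    """
    labels = []
    for ctrl, pf in product(controllers, pfs):
        base = f"ctrl-{ctrl}-pf{pf}"
        labels.extend([base, f"{base}vfN", f"{base}sfN"])
    return labels


if __name__ == "__main__":
    for label in eswitch_port_labels():
        print(label)
```

With the defaults this yields twelve labels, six for the local controller (ctrl-0) and six for the external one (ctrl-1), all living on the eswitch side in the smartNIC.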
A vanilla kernel can run on the smartNIC host, right?
Right.
What the smartNIC host sees is two PFs corresponding to ctrl-0-pf0 and ctrl-0-pf1 when the kernel has first booted and the mlx driver is not loaded yet, right?
I am not sure it is OK to leave out the VF and SF, but let's leave them out for simplicity for now. When the mlx driver is loaded, two devlink instances are created, which correspond to ctrl-0-pf0 and ctrl-0-pf1, and two devlink port instances (flavour FLAVOUR_PHYSICAL) are created and registered to the corresponding devlink instances just created, right?
As the eswitch mode is set per devlink instance, let's only set the mode of ctrl-0-pf0's devlink instance to DEVLINK_ESWITCH_MODE_SWITCHDEV. Is the representor netdev of ctrl-1-pf0 then created, and the devlink port instance of that representor netdev created and registered to the devlink instance corresponding to ctrl-0-pf0?
I think I am missing something here; the above does not seem right, because:
- For the single-host case: the PF is not passed through to the VM, so the devlink port instance of a VF's representor netdev can be registered to the devlink
instance corresponding to its PF, right?
Yes, if I understand your question right.
- But for the two-host case as above, do we need to create a devlink instance for the PF corresponding to ctrl-1-pf0 in the smartNIC host?
You can choose not to create a devlink instance in the external controller PF. It may not even be a Linux OS running there.
I read the questions a few more times, but I find it hard to understand what you really want to ask. I am not sure I understood you.
Trying again,
The model is really very straightforward, as is visible in the diagram.
There is one PF that has the eswitch. Eswitch contains representor ports.
I thought the representor ports of a PF's eswitch are decided by the functions under a specific PF (for example, the PF itself and the VFs under this PF)?
Each representor port represents either a PF, a VF or an SF. This PF, VF or SF can belong to the local controller residing on the eswitch device, or it can belong to an external controller. Here, the external controller = 1.
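To make the local/external distinction concrete, here is a small sketch under stated assumptions: the class `RepresentorPort`, its fields and the constant `LOCAL_CONTROLLER` are invented for illustration, and controller 0 is taken as the local (eswitch) controller only because the diagram says so.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption from the diagram: controller_num=0 is the local controller
# that hosts the eswitch; any other controller number is external.
LOCAL_CONTROLLER = 0


@dataclass(frozen=True)
class RepresentorPort:
    """Toy model of one eswitch representor port."""
    controller: int               # controller number the function belongs to
    pf: int                       # PF number
    flavour: str = "pf"           # "pf", "vf" or "sf"
    index: Optional[int] = None   # VF/SF number; unused for a PF port

    @property
    def external(self) -> bool:
        """True when the represented function sits on an external controller."""
        return self.controller != LOCAL_CONTROLLER

    def label(self) -> str:
        """Label in the diagram's ctrl-N-pfM[vfK|sfK] style."""
        suffix = "" if self.flavour == "pf" else f"{self.flavour}{self.index}"
        return f"ctrl-{self.controller}-pf{self.pf}{suffix}"
```

For example, `RepresentorPort(1, 0, "vf", 3)` is labelled `ctrl-1-pf0vf3` and reports `external` as true, while `RepresentorPort(0, 1)` is the local `ctrl-0-pf1`. Both ports would live on the same eswitch.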
If I understood the above correctly: the fw/hw decides which PF has the eswitch, and how many devlink/representor ports this eswitch has? Suppose PF0 of controller_num=0 has the eswitch; the eswitch may then have devlink/representor ports representing other PFs, like PF1 in controller_num=0, and even PF0/PF1 in controller_num=1?
Every single PF, VF and SF has a devlink instance, including the eswitch PF and the PF of the external controller (often called the external host). Why does such a devlink instance exist? -> I explained this to you before in [1].
[1] https://lore.kernel.org/netdev/PH0PR12MB5481FB8528A90E34FA3578C1DC389@PH0PR1...