mailweb.openeuler.org
Manage this list

Keyboard Shortcuts

Thread View

  • j: Next unread message
  • k: Previous unread message
  • j a: Jump to all threads
  • j l: Jump to MailingList overview

Linuxarm

Threads by month
  • ----- 2025 -----
  • May
  • April
  • March
  • February
  • January
  • ----- 2024 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2023 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2022 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2021 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2020 -----
  • December
linuxarm@openeuler.org

  • 797 discussions
Re: [PATCH V3 7/7] app/testpmd: remove redundant fwd streams initialization
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 7/7] app/testpmd: remove redundant fwd streams > initialization > > The fwd_config_setup() is called after init_fwd_streams(). > The fwd_config_setup() will reinitialize forwarding streams. > This patch removes init_fwd_streams() from init_config(). > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- Agree. Seems redundant. Every fwd setup will call init_fwd_streams() again. Acked-by: Xiaoyun Li <xiaoyun.li(a)intel.com>
1 0
0 0
Re: [PATCH V3 6/7] app/testpmd: add forwarding config in start port
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 6/7] app/testpmd: add forwarding config in start port > > Most operations in testpmd that need to update the forwarding streams in > testpmd call fwd_config_setup(). In some scenarios, eg, dev_configure is called > again, the forwarding streams may not be updated. As a result, the actual > forwarding streams cannot be queried by "show config fwd" cmd. I don't agree on this. Fwd config should be only changed after the user change something like nb-cores, queue number, eth-peer in non-dcb mode. These are already done in those commands. You should do fwd_config_setup at the end of cmd_config_dcb_parsed(), I agree on this. But doing it in start port seems to do redundant times of fwd_setup. It's not really needed. > > The procedure is as follows: > set nbcore 4 > port stop all > port config 0 dcb vt off 4 pfc on > port start all > show config fwd > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > app/test-pmd/testpmd.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index > abcbdaa..f8052b6 100644 > --- a/app/test-pmd/testpmd.c > +++ b/app/test-pmd/testpmd.c > @@ -2678,6 +2678,12 @@ start_port(portid_t pid) > } > } > } > + /* > + * In some scenarios, eg, dev_configure is called again, the forwarding > + * streams may not be updated. As a result, the actual forwarding > + * streams cannot be queried by "show config fwd" command. > + */ > + fwd_config_setup(); > > printf("Done\n"); > return 0; > -- > 2.7.4
1 0
0 0
Re: [PATCH V3 5/7] app/testpmd: move position of verifying DCB test
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 5/7] app/testpmd: move position of verifying DCB test > > Currently, the check for doing DCB test is assigned to start_packet_forwarding(), > which will be called when run "start" cmd. But fwd_config_setup() is used in > many scenarios, such as, "port config all rxq". > > This patch moves the check from start_packet_forwarding() to > fwd_config_setup(). > > Fixes: 7741e4cf16c0 ("app/testpmd: VMDq and DCB updates") > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > app/test-pmd/config.c | 23 +++++++++++++++++++++-- app/test- > pmd/testpmd.c | 19 ------------------- > 2 files changed, 21 insertions(+), 21 deletions(-) Acked-by: Xiaoyun Li <xiaoyun.li(a)intel.com>
1 0
0 0
Re: [PATCH V3 4/7] app/testpmd: add check for support of reporting DCB info
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 4/7] app/testpmd: add check for support of reporting DCB > info > > Currently, '.get_dcb_info' must be supported for the port doing DCB test, or all > information in 'rte_eth_dcb_info' are zero. It should be prevented when user run > cmd "port config 0 dcb vt off 4 pfc off". > > This patch adds the check for support of reporting dcb info. > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > V2->V3 > - fix the abnormal print information > > --- > app/test-pmd/cmdline.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > 2.7.4 Acked-by: Xiaoyun Li <xiaoyun.li(a)intel.com>
1 0
0 0
Re: [PATCH V3 3/7] app/testpmd: fix a segment fault when DCB test
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 3/7] app/testpmd: fix a segment fault when DCB test > > After DCB mode is configured, if we decrease the number of RX and TX queues, > fwd_config_setup() will be called to setup the DCB forwarding configuration. > And forwarding streams are updated based on new queue numbers in > fwd_config_setup(), but the mapping between the TC and queues obtained by > rte_eth_dev_get_dcb_info() is still old queue numbers (old queue numbers are > greater than new queue numbers). > In this case, the segment fault happens. So rte_eth_dev_configure() should be > called again to update the mapping between the TC and queues before > rte_eth_dev_get_dcb_info(). > > Like: > set nbcore 4 > port stop all > port config 0 dcb vt off 4 pfc on > port start all > port stop all > port config all rxq 8 > port config all txq 8 > > Fixes: 900550de04a7 ("app/testpmd: add dcb support") > Cc: stable(a)dpdk.org > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > app/test-pmd/config.c | 26 ++++++++++++++++++++++++++ > 1 file changed, 26 insertions(+) > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index > 03ee40c..18b197b 100644 > --- a/app/test-pmd/config.c > +++ b/app/test-pmd/config.c > @@ -2996,7 +2996,33 @@ dcb_fwd_config_setup(void) > uint16_t nb_rx_queue, nb_tx_queue; > uint16_t i, j, k, sm_id = 0; > uint16_t total_tc_num; > + struct rte_port *port; > uint8_t tc = 0; > + portid_t pid; > + int ret; > + > + /* > + * The fwd_config_setup() is called when the port is > RTE_PORT_STARTED When the port is RTE_PORT_STARTED or RTE_PORT_STARTED? Missing an 'or' here. Maybe something like the following will be better? (just a reference, you can put it in a better way) /* * Re-configure ports to get updated mapping between tc and queue in case the queue number of the port is changed. * Skip for started ports since configuring queue number needs to stop ports first. */ > + * RTE_PORT_STOPPED. When a port is RTE_PORT_STARTED, > dev_configure > + * cannot be called. > + * > + * re-configure the device after changing queue numbers of stopped > + * ports, so that the updated mapping between tc and queue can be > + * obtained. > + */ > + for (pid = 0; pid < nb_fwd_ports; pid++) { > + if (port_is_started(pid) == 1) > + continue; > + > + port = &ports[pid]; > + ret = rte_eth_dev_configure(pid, nb_rxq, nb_txq, > + &port->dev_conf); > + if (ret < 0) { > + printf("Failed to re-configure port %d, ret = %d.\n", > + pid, ret); > + return; > + } > + } > > cur_fwd_config.nb_fwd_lcores = (lcoreid_t) nb_fwd_lcores; > cur_fwd_config.nb_fwd_ports = nb_fwd_ports; > -- > 2.7.4
1 0
0 0
Re: [PATCH V3 2/7] app/testpmd: fix DCB forwarding configuration
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 2/7] app/testpmd: fix DCB forwarding configuration > > After DCB mode is configured, the operations of port stop and port start change > the value of the global variable "dcb_test", As a result, the forwarding > configuration from DCB to RSS mode, namely, “dcb_fwd_config_setup()” to > "rss_fwd_config_setup()". > > Currently, the 'dcb_flag' field in struct 'rte_port' indicates whether the port is > configured with DCB. And it is sufficient to have 'dcb_config' > as a global variable to control the DCB test status. So this patch deletes the > "dcb_test". > > In addition, the 'dcb_config' is first set to 1 first in init_port_dcb_config(), but > the function may fail. > So it should be moved to the end. Change this to the following will be better: In addition, setting 'dcb_config' at the end of init_port_dcb_config() in case that ports fail to enter DCB mode. > > Fixes: 900550de04a7 ("app/testpmd: add dcb support") > Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings") > Fixes: 7741e4cf16c0 ("app/testpmd: VMDq and DCB updates") > Cc: stable(a)dpdk.org > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > app/test-pmd/testpmd.c | 18 ++++-------------- app/test-pmd/testpmd.h | 1 - > 2 files changed, 4 insertions(+), 15 deletions(-) > > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index > d4be23f..a076b1d 100644 > --- a/app/test-pmd/testpmd.c > +++ b/app/test-pmd/testpmd.c > @@ -246,9 +246,6 @@ uint16_t mb_mempool_cache = DEF_MBUF_CACHE; > /**< Size of mbuf mempool cache. */ > /* current configuration is in DCB or not,0 means it is not in DCB mode */ > uint8_t dcb_config = 0; > > -/* Whether the dcb is in testing status */ -uint8_t dcb_test = 0; > - > /* > * Configurable number of RX/TX queues. > */ > @@ -2167,8 +2164,7 @@ start_packet_forwarding(int with_tx_first) > return; > } > > - > - if(dcb_test) { > + if (dcb_config) { > for (i = 0; i < nb_fwd_ports; i++) { > pt_id = fwd_ports_ids[i]; > port = &ports[pt_id]; > @@ -2476,8 +2472,6 @@ start_port(portid_t pid) > if (port_id_is_invalid(pid, ENABLED_WARN)) > return 0; > > - if(dcb_config) > - dcb_test = 1; > RTE_ETH_FOREACH_DEV(pi) { > if (pid != pi && pid != (portid_t)RTE_PORT_ALL) > continue; > @@ -2717,11 +2711,6 @@ stop_port(portid_t pid) > portid_t peer_pl[RTE_MAX_ETHPORTS]; > int peer_pi; > > - if (dcb_test) { > - dcb_test = 0; > - dcb_config = 0; > - } > - > if (port_id_is_invalid(pid, ENABLED_WARN)) > return; > > @@ -3625,8 +3614,6 @@ init_port_dcb_config(portid_t pid, > rte_port = &ports[pid]; > > memset(&port_conf, 0, sizeof(struct rte_eth_conf)); > - /* Enter DCB configuration status */ > - dcb_config = 1; > > port_conf.rxmode = rte_port->dev_conf.rxmode; > port_conf.txmode = rte_port->dev_conf.txmode; @@ -3694,6 +3681,9 > @@ init_port_dcb_config(portid_t pid, > > rte_port->dcb_flag = 1; > > + /* Enter DCB configuration status */ > + dcb_config = 1; > + > return 0; > } > > diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index > 9530ec5..432c66d 100644 > --- a/app/test-pmd/testpmd.h > +++ b/app/test-pmd/testpmd.h > @@ -425,7 +425,6 @@ extern uint64_t noisy_lkup_num_reads; extern uint64_t > noisy_lkup_num_reads_writes; > > extern uint8_t dcb_config; > -extern uint8_t dcb_test; > > extern uint32_t mbuf_data_size_n; > extern uint16_t mbuf_data_size[MAX_SEGS_BUFFER_SPLIT]; > -- > 2.7.4
1 0
0 0
Re: [PATCH V3 1/7] app/testpmd: fix forward lcores number when DCB test
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 17:01 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V3 1/7] app/testpmd: fix forward lcores number when DCB test > > Currently, 'nb_fwd_lcores' value are both adjusted based on 'nb_fwd_streams' in > rss/simple/icmp_echo_fwd_config_setup. > But the operation is missing in dcb_fwd_config_setup, which may lead to a bad > behavior in which multiple polling threads operate on the same queue. In this > case, the device sends and receives packets, causing unexpected results. The > procedure is as follows: > 1/ run testpmd with "--rxq=8 --txq=8" > 2/ port stop all > 3/ set nbcore 8 > 4/ port config 0 dcb vt off 4 pfc on > 5/ port config all rxq 16 > 6/ port config all txq 16 > 7/ port start all > 8/ set fwd mac > 9/ start > Same as v2 comment. Please check that. > For the DCB forwarding test, each core is assigned to each traffic class and each > core is assigned a multi-stream. > Therefore, 'nb_fwd_lcores' value needs to be adjusted based on 'total_tc_num' > in all forwarding ports. > > Fixes: 900550de04a7 ("app/testpmd: add dcb support") > Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings") > Cc: stable(a)dpdk.org > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > app/test-pmd/config.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index > ccb9bd3..03ee40c 100644 > --- a/app/test-pmd/config.c > +++ b/app/test-pmd/config.c > @@ -2961,6 +2961,21 @@ rss_fwd_config_setup(void) > } > } > > +static uint16_t > +get_fwd_port_total_tc_num(void) > +{ > + struct rte_eth_dcb_info dcb_info; > + uint16_t total_tc_num = 0; > + unsigned int i; > + > + for (i = 0; i < nb_fwd_ports; i++) { > + (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[i], &dcb_info); > + total_tc_num += dcb_info.nb_tcs; > + } > + > + return total_tc_num; > +} > + > /** > * For the DCB forwarding test, each core is assigned on each traffic class. > * > @@ -2980,12 +2995,16 @@ dcb_fwd_config_setup(void) > lcoreid_t lc_id; > uint16_t nb_rx_queue, nb_tx_queue; > uint16_t i, j, k, sm_id = 0; > + uint16_t total_tc_num; > uint8_t tc = 0; > > cur_fwd_config.nb_fwd_lcores = (lcoreid_t) nb_fwd_lcores; > cur_fwd_config.nb_fwd_ports = nb_fwd_ports; > cur_fwd_config.nb_fwd_streams = > (streamid_t) (nb_rxq * cur_fwd_config.nb_fwd_ports); > + total_tc_num = get_fwd_port_total_tc_num(); > + if (cur_fwd_config.nb_fwd_lcores > total_tc_num) > + cur_fwd_config.nb_fwd_lcores = total_tc_num; > > /* reinitialize forwarding streams */ > init_fwd_streams(); > -- > 2.7.4
1 0
0 0
Re: [PATCH V2 1/7] app/testpmd: fix forward lcores number when DCB test
by Li, Xiaoyun 27 Apr '21

27 Apr '21
> -----Original Message----- > From: Huisong Li <lihuisong(a)huawei.com> > Sent: Tuesday, April 20, 2021 15:32 > To: dev(a)dpdk.org > Cc: Yigit, Ferruh <ferruh.yigit(a)intel.com>; Li, Xiaoyun <xiaoyun.li(a)intel.com>; > linuxarm(a)openeuler.org; lihuisong(a)huawei.com > Subject: [PATCH V2 1/7] app/testpmd: fix forward lcores number when DCB test > > Currently, 'nb_fwd_lcores' value are both adjusted based on 'nb_fwd_streams' in > rss/simple/icmp_echo_fwd_config_setup. > But the operation is missing in dcb_fwd_config_setup, which may lead to a bad > behavior in which multiple polling threads operate on the same queue. This patch is OK. But commit log is redundant and confusing. The above is enough to explain what your patch is doing and can even be more simple. >In this > case, the device sends and receives packets, causing unexpected results. The > procedure is as follows: I don't understand what you're saying here. The commands you're showing is 8 nbcores dealing with 16 queues. So it's one thread dealing with multiple queues which doesn't have issues at all. Please remove the useless and confusing commands. > 1/ run testpmd with "--rxq=8 --txq=8" > 2/ port stop all > 3/ set nbcore 8 > 4/ port config 0 dcb vt off 4 pfc on > 5/ port config all rxq 16 > 6/ port config all txq 16 > 7/ port start all > 8/ set fwd mac > 9/ start > > For the DCB forwarding test, each core is assigned to each traffic class and each > core is assigned a multi-stream. > Therefore, 'nb_fwd_lcores' value needs to be adjusted based on 'total_tc_num' > in all forwarding ports. Please refer to the RSS fwd config fix patch to write your own commit log. Use simple and easy-understanding words to explain yourself. Below is the reference of RSS. commit 017d680a91fcf30da14a6d3a2f96d41f6dda3a0f Author: Pablo de Lara <pablo.de.lara.guarch(a)intel.com> Date: Mon Jun 27 23:35:19 2016 +0100 app/testpmd: limit number of forwarding cores Number of forwarding cores must be equal or less than number of forwarding streams, otherwise two cores would try to use a same queue on a port, which is not allowed. > > Fixes: 900550de04a7 ("app/testpmd: add dcb support") > Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings") > Cc: stable(a)dpdk.org > > Signed-off-by: Huisong Li <lihuisong(a)huawei.com> > Signed-off-by: Lijun Ou <oulijun(a)huawei.com> > --- > app/test-pmd/config.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index > ccb9bd3..03ee40c 100644 > --- a/app/test-pmd/config.c > +++ b/app/test-pmd/config.c > @@ -2961,6 +2961,21 @@ rss_fwd_config_setup(void) > } > } > > +static uint16_t > +get_fwd_port_total_tc_num(void) > +{ > + struct rte_eth_dcb_info dcb_info; > + uint16_t total_tc_num = 0; > + unsigned int i; > + > + for (i = 0; i < nb_fwd_ports; i++) { > + (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[i], &dcb_info); > + total_tc_num += dcb_info.nb_tcs; > + } > + > + return total_tc_num; > +} > + > /** > * For the DCB forwarding test, each core is assigned on each traffic class. > * > @@ -2980,12 +2995,16 @@ dcb_fwd_config_setup(void) > lcoreid_t lc_id; > uint16_t nb_rx_queue, nb_tx_queue; > uint16_t i, j, k, sm_id = 0; > + uint16_t total_tc_num; > uint8_t tc = 0; > > cur_fwd_config.nb_fwd_lcores = (lcoreid_t) nb_fwd_lcores; > cur_fwd_config.nb_fwd_ports = nb_fwd_ports; > cur_fwd_config.nb_fwd_streams = > (streamid_t) (nb_rxq * cur_fwd_config.nb_fwd_ports); > + total_tc_num = get_fwd_port_total_tc_num(); > + if (cur_fwd_config.nb_fwd_lcores > total_tc_num) > + cur_fwd_config.nb_fwd_lcores = total_tc_num; > > /* reinitialize forwarding streams */ > init_fwd_streams(); > -- > 2.7.4
1 0
0 0
Re: [PATCH] sched/fair: don't use waker's cpu if the waker of sync wake-up is interrupt
by Mike Galbraith 27 Apr '21

27 Apr '21
On Tue, 2021-04-27 at 14:37 +1200, Barry Song wrote: > > To fix this issue, this patch checks the waker of sync wake-up is a task > but not an interrupt. In this case, the waker will schedule out and give > CPU to wakee. That rash "the waker will schedule out" assumption, ie this really really is a synchronous wakeup, may be true in your particular case, but the sync hint is so overused as to be fairly meaningless. We've squabbled with it repeatedly over the years because of that. It really should either become more of a communication of intent than it currently is, or just go away. I'd argue for go away, simply because there is no way for the kernel to know that there isn't more work directly behind any particular wakeup. Take a pipe, does shoving some bits through a pipe mean you have no further use of your CPU? IFF you're doing nothing but playing ping- pong, sure it does, but how many real proggies have zero overlap among its threads of execution? The mere notion of threaded apps having no overlap *to be converted to throughput* is dainbramaged, which should be the death knell of the sync wakeup hint. Threaded apps can't do stuff like, oh, networking, which uses the sync hint heavily, without at least to some extent defeating the purpose of threading if we were to take the hint seriously. Heck, just look at the beauty (gak) of wake_wide(). It was born specifically to combat the pure-evil side of the sync wakeup hint. Bah, 'nuff "Danger Will Robinson, that thing will *eat you*!!" ;-) -Mike
2 4
0 0
Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio
by Alex Williamson 26 Apr '21

26 Apr '21
On Wed, 21 Apr 2021 17:59:02 +0800 liulongfang <liulongfang(a)huawei.com> wrote: > On 2021/4/21 6:04, Alex Williamson wrote: > > On Tue, 20 Apr 2021 09:59:57 -0300 > > Jason Gunthorpe <jgg(a)nvidia.com> wrote: > > > >> On Tue, Apr 20, 2021 at 08:50:12PM +0800, liulongfang wrote: > >>> On 2021/4/19 20:33, Jason Gunthorpe wrote: > >>>> On Mon, Apr 19, 2021 at 08:24:40PM +0800, liulongfang wrote: > >>>> > >>>>>> I'm also confused how this works securely at all, as a general rule a > >>>>>> VFIO PCI driver cannot access the MMIO memory of the function it is > >>>>>> planning to assign to the guest. There is a lot of danger that the > >>>>>> guest could access that MMIO space one way or another. > >>>>> > >>>>> VF's MMIO memory is divided into two parts, one is the guest part, > >>>>> and the other is the live migration part. They do not affect each other, > >>>>> so there is no security problem. > >>>> > >>>> AFAIK there are several scenarios where a guest can access this MMIO > >>>> memory using DMA even if it is not mapped into the guest for CPU > >>>> access. > >>>> > >>> The hardware divides VF's MMIO memory into two parts. The live migration > >>> driver in the host uses the live migration part, and the device driver in > >>> the guest uses the guest part. They obtain the address of VF's MMIO memory > >>> in their respective drivers, although these two parts The memory is > >>> continuous on the hardware device, but due to the needs of the drive function, > >>> they will not perform operations on another part of the memory, and the > >>> device hardware also independently responds to the operation commands of > >>> the two parts. > >> > >> It doesn't matter, the memory is still under the same PCI BDF and VFIO > >> supports scenarios where devices in the same IOMMU group are not > >> isolated from each other. > >> > >> This is why the granual of isolation is a PCI BDF - VFIO directly > >> blocks kernel drivers from attaching to PCI BDFs that are not > >> completely isolated from VFIO BDF. > >> > >> Bypassing this prevention and attaching a kernel driver directly to > >> the same BDF being exposed to the guest breaks that isolation model. > >> > >>> So, I still don't understand what the security risk you are talking about is, > >>> and what do you think the security design should look like? > >>> Can you elaborate on it? > >> > >> Each security domain must have its own PCI BDF. > >> > >> The migration control registers must be on a different VF from the VF > >> being plugged into a guest and the two VFs have to be in different > >> IOMMU groups to ensure they are isolated from each other. > > > > I think that's a solution, I don't know if it's the only solution. > > AIUI, the issue here is that we have a device specific kernel driver > > extending vfio-pci with migration support for this device by using an > > If the two parts of the MMIO region are split into different BAR spaces on > the device, the MMIO region of ​​the business function is still placed in BAR2, > and the MMIO region of ​​the live migration function is moved to BAR4. > Only BAR2 is mapped in the guest. only BAR4 is mapped in the host. > This can solve this security issue. The concern is really the "on the device" part rather than whether the resources are within the same BAR or not. We need to assume that a user driver can generate a DMA targeting any address in the system, including in this case the user driver could generate a DMA targeting this migration BAR. Ideally this would be routed upstream to the IOMMU where it would be blocked for lack of a translation entry. However, because this range resides on the same PCIe requester ID, it's logically more similar to a two-function device where the functions are considered non-isolated and are therefore exposed within the same IOMMU group. We would not allow a kernel driver for one of those functions and a userspace driver for the other. In this case those drivers are strongly related, but we still need to consider to what extent a malicious user driver can interfere with or exploit the kernel side driver. > > MMIO region of the same device. This is susceptible to DMA> manipulation by the user device. Whether that's a security issue or> not depends on how the user can break the device. If the scope is > > limited to breaking their own device, they can do that any number of > > ways and it's not very interesting. If the user can manipulate device > > state in order to trigger an exploit of the host-side kernel driver, > > that's obviously more of a problem. > > > > The other side of this is that if migration support can be implemented > > entirely within the VF using this portion of the device MMIO space, why > > do we need the host kernel to support this rather than implementing it > > in userspace? For example, QEMU could know about this device, > > manipulate the BAR size to expose only the operational portion of MMIO > > to the VM and use the remainder to support migration itself. I'm > > afraid that just like mdev, the vfio migration uAPI is going to be used > > as an excuse to create kernel drivers simply to be able to make use of > > that uAPI. I haven't looked at this driver to know if it has some > > When the accelerator device is designed to support the live migration > function, it is based on the uAPI of the migration region to realize the > live migration function, so the live migration function requires a driver > that connects to this uAPI. > Is this set of interfaces not open to us now? In your model, if both BARs are exposed to userspace and a device specific extension in QEMU claims the migration BAR rather than exposing it to the VM, could that driver mimic the migration region uAPI from userspace? For example, you don't need page pinning to interact with the IOMMU, you don't need resources beyond the scope of the endpoint device itself, and the migration info BAR is safe for userspace to manage? If so, then a kernel-based driver to expose a migration uAPI seems like it's only a risk for the kernel, ie. moving what could be a userspace driver into the kernel for the convenience of re-using a kernel uAPI. Thanks, Alex
2 1
0 0
  • ← Newer
  • 1
  • ...
  • 49
  • 50
  • 51
  • 52
  • 53
  • 54
  • 55
  • ...
  • 80
  • Older →

HyperKitty Powered by HyperKitty