> From: lipeng (Y)
> Sent: Wednesday, April 14, 2021 2:59 AM
>
>
> 在 2021/4/13 21:55, Salil Mehta 写道:
> >> From: lipeng (Y)
> >> Sent: Tuesday, April 13, 2021 1:38 PM
> >>
> >>
> >> 在 2021/4/13 19:17, Salil Mehta 写道:
> >>>> From: lipeng (Y)
> >>>> Sent: Tuesday, April 13, 2021 10:30 AM
> >>>> To: Salil Mehta <salil.mehta(a)huawei.com>; Linuxarm <linuxarm(a)huawei.com>;
> >>>> linuxarm(a)openeuler.org
> >>>> Cc: Zhuangyuzeng (Yisen) <yisen.zhuang(a)huawei.com>; huangdaode
> >>>> <huangdaode(a)huawei.com>; linyunsheng <linyunsheng(a)huawei.com>; shenjian
> (K)
> >>>> <shenjian15(a)huawei.com>; moyufeng <moyufeng(a)huawei.com>; zhangjiaran
> >>>> <zhangjiaran(a)huawei.com>; huangguangbin (A) <huangguangbin2(a)huawei.com>;
> >>>> Jonathan Cameron <jonathan.cameron(a)huawei.com>
> >>>> Subject: Re: [PATCH RFC net] net: hns3: fixes+refactors the broken set
> channel
> >>>> fallback logic
> >>>>
> >>>>
> >>>> 在 2021/4/13 17:12, Salil Mehta 写道:
> >>>>> The fallback logic of the set channels when set_channel() fails to configure
> >>>>> TX Sched/RSS H/W configuration or for the code which brings down/restores
> >>>>> client before/therafter is not handled properly.
> >>>>>
> >>>>> This patch fixes and refactors that code to improve the readibility.
> >>>>>
> >>>>> [INTERNAL ONLY]
> >>>>> Fixes: <Tag TODO> ("Comment TODO")
> >>>>> Signed-off-by: Salil Mehta <salil.mehta(a)huawei.com>
> >>>>> ---
> >>>>> .../net/ethernet/hisilicon/hns3/hns3_enet.c | 72
> +++++++++++--------
> >>>>> 1 file changed, 43 insertions(+), 29 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> >>>> b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> >>>>> index bf4302a5cf95..41d8bdb34f6e 100644
> >>>>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> >>>>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> >>>>> @@ -4692,23 +4692,56 @@ static int hns3_reset_notify(struct hnae3_handle
> >>>> *handle,
> >>>>> static int hns3_change_channels(struct hnae3_handle *handle, u32
> >>>> new_tqp_num,
> >>>>> bool rxfh_configured)
> >>>>> {
> >>>>> - int ret;
> >>>>> + const struct hnae3_ae_ops *ops = handle->ae_algo->ops;
> >>>>> + u32 org_tqp_num = handle->kinfo.num_tqps;
> >>>>> + struct device *dev = &handle->pdev->dev;
> >>>>> + u32 req_tqp_num = new_tqp_num;
> >>>>> + bool revert_old_config = false;
> >>>>> + int ret, retval = 0;
> >>>>> +
> >>>>> + /* bring down the client */
> >>>>> + ret = hns3_reset_notify(handle, HNAE3_DOWN_CLIENT);
> >>>>> + if (ret)
> >>>>> + return ret;
> >>>>>
> >>>>> - ret = handle->ae_algo->ops->set_channels(handle, new_tqp_num,
> >>>>> - rxfh_configured);
> >>>>> - if (ret) {
> >>>>> - dev_err(&handle->pdev->dev,
> >>>>> - "Change tqp num(%u) fail.\n", new_tqp_num);
> >>>>> + ret = hns3_reset_notify(handle, HNAE3_UNINIT_CLIENT);
> >>>>> + if (ret)
> >>>>> return ret;
> >>>>> +
> >>>>> +revert_old_tpqs_config:
> >>>>> + /* update the traffic sched and RSS config in the H/W as per new TPQ
> num
> >>>> */
> >>>>> + ret = ops->set_channels(handle, req_tqp_num, rxfh_configured);
> >>>>> + if (ret) {
> >>>>> + dev_err(dev, "failed(=%d) to update TX sched/RSS h/w cfg for %s
> >> TPQs\n",
> >>>>> + ret, revert_old_config?"old":"new");
> >>>>> +
> >>>>> + /* try reverting back h/w config with old TPQ num */
> >>>>> + if (!revert_old_config) {
> >>>>> + dev_warn(dev,
> >>>>> + "will try to revert TX sched/RSS h/w cfg with old TPQs\n");
> >>>>> + req_tqp_num = org_tqp_num;
> >>>>> + revert_old_config = true;
> >>>>> + retval = ret;
> >>>>> +
> >>>>> + goto revert_old_tpqs_config;
> >>>>> + }
> >>>>> +
> >>>>> + /* bad, we could not revert to h/w config with old TQP number */
> >>>>> + goto err_set_chan_fail;
> >>>> why you changelike this ?
> >>>>
> >>>>
> >>>> when setting fail, the port should work OK just like setting before.
> >>> The port can get into problematic state if the client remains in un-initialized
> >>> state after errors are encountered during the process of setting the channels
> >>> for new number of TPQs. Port cannot be said to be in sane state.
> >>>
> >>> Conditions(after desperate attempt to revert to old TPQ config fails):
> >>> 1. half un-initialized state of the client or
> >>> 2. client un-initialized but not re-initialized or
> >>> 3. client re-initialization fails later
> >>>
> >>>
> >>> This condition can only be recovered when you reset or re-load the driver.
> >>>
> >>>
> >> do you have any test case for it ?
> >
> > We need to simulate the TX Sched/RSS hardware config failure at firmware/IMP.
> > For the integration Testing, we can always create one but for system testing
> > we need ways in the firmware to be able to simulate these failures. Some
> > debugging feature is required in the firmware. Maybe somebody can take this
> > task and introduce this feature in the IMP firmware code if it does not exists
> > already? I would suggest to expose all these controls using 'devlink' interface
> > to the user which can be used by sys test team as well.
> >
> >
> >> when user change channel larger by ethtool command , and the
> >> memory not enough, it will fail.
> >>
> >> we should recovered too.
> > All of these should get conveyed by the hnae3 layer when you do set channels
> or when
> > we do client re-initialization(init+up)
> >
> This patch will make some test case fail, it is OK before this change.
Can you please list down the test case which is failing or will fail?