
[PATCH openEuler-1.0-LTS 1/3] blk-mq: factor out some helpers to quiesce/unquiesce queue
by Yang Yingliang 09 Dec '21
From: Yu Kuai <yukuai3(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 173974
CVE: NA
---------------------------
Prepare to support concurrent quiescing of queues between drivers and the
block layer; no functional changes.
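For context, drivers bracket their critical sections with the existing
exported API, which this refactor leaves unchanged; a minimal usage sketch
(hypothetical driver code, not part of this patch):

	#include <linux/blk-mq.h>

	/* Stop dispatch, reconfigure safely, then resume. */
	static void example_reconfigure(struct request_queue *q)
	{
		blk_mq_quiesce_queue(q);	/* waits for in-flight dispatches */
		/* ... no dispatch can run until we unquiesce ... */
		blk_mq_unquiesce_queue(q);	/* clears the flag, reruns hw queues */
	}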
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-mq.c | 58 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 40 insertions(+), 18 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ef62a83314a5d..f9b4b73a2f38d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -211,32 +211,29 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
}
EXPORT_SYMBOL_GPL(blk_mq_unfreeze_queue);
+static void __blk_mq_quiesce_queue_nowait(struct request_queue *q,
+ unsigned int flag)
+{
+ blk_queue_flag_set(flag, q);
+}
+
/*
* FIXME: replace the scsi_internal_device_*block_nowait() calls in the
* mpt3sas driver such that this function can be removed.
*/
void blk_mq_quiesce_queue_nowait(struct request_queue *q)
{
- blk_queue_flag_set(QUEUE_FLAG_QUIESCED, q);
+ __blk_mq_quiesce_queue_nowait(q, QUEUE_FLAG_QUIESCED);
}
EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_nowait);
-/**
- * blk_mq_quiesce_queue() - wait until all ongoing dispatches have finished
- * @q: request queue.
- *
- * Note: this function does not prevent that the struct request end_io()
- * callback function is invoked. Once this function is returned, we make
- * sure no dispatch can happen until the queue is unquiesced via
- * blk_mq_unquiesce_queue().
- */
-void blk_mq_quiesce_queue(struct request_queue *q)
+static void __blk_mq_quiesce_queue(struct request_queue *q, unsigned int flag)
{
struct blk_mq_hw_ctx *hctx;
unsigned int i;
bool rcu = false;
- blk_mq_quiesce_queue_nowait(q);
+ __blk_mq_quiesce_queue_nowait(q, flag);
queue_for_each_hw_ctx(q, hctx, i) {
if (hctx->flags & BLK_MQ_F_BLOCKING)
@@ -247,15 +244,30 @@ void blk_mq_quiesce_queue(struct request_queue *q)
if (rcu)
synchronize_rcu();
}
+
+/**
+ * blk_mq_quiesce_queue() - wait until all ongoing dispatches have finished
+ * @q: request queue.
+ *
+ * Note: this function does not prevent that the struct request end_io()
+ * callback function is invoked. Once this function is returned, we make
+ * sure no dispatch can happen until the queue is unquiesced via
+ * blk_mq_unquiesce_queue().
+ */
+void blk_mq_quiesce_queue(struct request_queue *q)
+{
+ __blk_mq_quiesce_queue(q, QUEUE_FLAG_QUIESCED);
+}
EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue);
-bool blk_mq_quiesce_queue_without_rcu(struct request_queue *q)
+static bool __blk_mq_quiesce_queue_without_rcu(struct request_queue *q,
+ unsigned int flag)
{
struct blk_mq_hw_ctx *hctx;
unsigned int i;
bool rcu = false;
- blk_mq_quiesce_queue_nowait(q);
+ __blk_mq_quiesce_queue_nowait(q, flag);
queue_for_each_hw_ctx(q, hctx, i) {
if (hctx->flags & BLK_MQ_F_BLOCKING)
@@ -265,8 +277,21 @@ bool blk_mq_quiesce_queue_without_rcu(struct request_queue *q)
}
return rcu;
}
+
+bool blk_mq_quiesce_queue_without_rcu(struct request_queue *q)
+{
+ return __blk_mq_quiesce_queue_without_rcu(q, QUEUE_FLAG_QUIESCED);
+}
EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_without_rcu);
+static void __blk_mq_unquiesce_queue(struct request_queue *q, unsigned int flag)
+{
+ blk_queue_flag_clear(flag, q);
+
+ /* dispatch requests which are inserted during quiescing */
+ blk_mq_run_hw_queues(q, true);
+}
+
/*
* blk_mq_unquiesce_queue() - counterpart of blk_mq_quiesce_queue()
* @q: request queue.
@@ -276,10 +301,7 @@ EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_without_rcu);
*/
void blk_mq_unquiesce_queue(struct request_queue *q)
{
- blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
-
- /* dispatch requests which are inserted during quiescing */
- blk_mq_run_hw_queues(q, true);
+ __blk_mq_unquiesce_queue(q, QUEUE_FLAG_QUIESCED);
}
EXPORT_SYMBOL_GPL(blk_mq_unquiesce_queue);
--
2.25.1

09 Dec '21
The address printed by %p in the kernel exposes kernel address information,
which is extremely unsafe.
So Linux v4.15 limited the information printed by %p: it now prints a hashed value.
This patchset adds a no_hash_pointers startup parameter which disables that
restriction, so that %p can print the actual kernel address.
I applied these patches along with the associated test modules, and the tests
passed after recompiling.
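As a minimal illustration (a hypothetical snippet, not part of the patchset),
the behaviour being toggled looks like this; %px has printed raw addresses
unconditionally since v4.15, while %p honours the new parameter:

	#include <linux/printk.h>

	static int obj;

	static void show_pointer_hashing(void)
	{
		/* Default: %p prints a per-boot hashed value. */
		pr_info("hashed:   %p\n", &obj);
		/* With no_hash_pointers on the kernel command line, %p prints
		 * the real address, matching the unconditional %px below.
		 */
		pr_info("unhashed: %px\n", &obj);
	}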
Tobin C. Harding (3):
lib/test_printf: Add empty module_exit function
kselftest: Add test module framework header
lib: Use new kselftest header
Timur Tabi (3):
kselftest: add support for skipped tests
lib: use KSTM_MODULE_GLOBALS macro in kselftest drivers
lib/vsprintf: no_hash_pointers prints all addresses as unhashed
.../admin-guide/kernel-parameters.txt | 15 +++
Documentation/dev-tools/kselftest.rst | 94 +++++++++++++++++-
lib/test_bitmap.c | 23 +----
lib/test_printf.c | 29 +++---
lib/vsprintf.c | 36 ++++++-
tools/testing/selftests/kselftest_module.h | 54 ++++++++++
6 files changed, 215 insertions(+), 36 deletions(-)
create mode 100644 tools/testing/selftests/kselftest_module.h
--
2.30.0

The current kernel requires the reboot mode to be provided as a boot parameter.
However, we cannot know the desired mode in advance. The kernel should have a
method to set the reboot options after the system has booted.
This patchset adds handles under <sysfs>/kernel/reboot to read the reboot
configuration and rewrite it, so users can change the reboot mode as they want.
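Such handles follow the usual kobj_attribute show/store pattern; roughly
(a simplified sketch, not the exact kernel/reboot.c code from the patchset):

	static ssize_t mode_show(struct kobject *kobj, struct kobj_attribute *attr,
				 char *buf)
	{
		return sprintf(buf, "%s\n",
			       reboot_mode == REBOOT_WARM ? "warm" : "cold");
	}

	static ssize_t mode_store(struct kobject *kobj, struct kobj_attribute *attr,
				  const char *buf, size_t count)
	{
		if (sysfs_streq(buf, "warm"))
			reboot_mode = REBOOT_WARM;
		else if (sysfs_streq(buf, "cold"))
			reboot_mode = REBOOT_COLD;
		else
			return -EINVAL;
		return count;
	}

	static struct kobj_attribute reboot_mode_attr =
		__ATTR(mode, 0644, mode_show, mode_store);

With such an attribute registered, reading /sys/kernel/reboot/mode returns
the current mode and writing "warm" to it changes it.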
Matteo Croce (1):
reboot: allow to specify reboot mode via sysfs
Nathan Chancellor (1):
reboot: Fix variable assignments in type_store
Documentation/ABI/testing/sysfs-kernel-reboot | 32 +++
kernel/reboot.c | 206 ++++++++++++++++++
2 files changed, 238 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-reboot
--
2.17.1

[PATCH openEuler-1.0-LTS 1/2] bfq: Remove merged request already in bfq_requests_merged()
by Yang Yingliang 09 Dec '21
From: Jan Kara <jack(a)suse.cz>
mainline inclusion
from mainline-v5.14-rc1
commit a921c655f2033dd1ce1379128efe881dda23ea37
category: bugfix
bugzilla: 185777, 185811
CVE: NA
Currently, bfq does very little in bfq_requests_merged() and handles all
the request cleanup in bfq_finish_requeue_request() called from
blk_mq_free_request(). That is currently safe only because
blk_mq_free_request() is called shortly after bfq_requests_merged()
while bfqd->lock is still held. However, to fix a lock inversion between
bfqd->lock and ioc->lock, we need to call blk_mq_free_request() after
dropping bfqd->lock. That would mean that an already merged request could
be seen by other processes inside bfq queues and possibly dispatched to
the device, which is wrong. So move cleanup of the request from
bfq_finish_requeue_request() to bfq_requests_merged().
Acked-by: Paolo Valente <paolo.valente(a)linaro.org>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20210623093634.27879-2-jack@suse.cz
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflict: in bfq_finish_requeue_request, 4.19 does not have the
bfq_update_inject_limit branch.
Signed-off-by: zhangwensheng <zhangwensheng5(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/bfq-iosched.c | 41 +++++++++++++----------------------------
1 file changed, 13 insertions(+), 28 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 7d77de9a0f5c0..5452d892480ba 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1937,7 +1937,7 @@ static void bfq_requests_merged(struct request_queue *q, struct request *rq,
*next_bfqq = bfq_init_rq(next);
if (!bfqq)
- return;
+ goto remove;
/*
* If next and rq belong to the same bfq_queue and next is older
@@ -1960,6 +1960,14 @@ static void bfq_requests_merged(struct request_queue *q, struct request *rq,
bfqq->next_rq = rq;
bfqg_stats_update_io_merged(bfqq_group(bfqq), next->cmd_flags);
+remove:
+ /* Merged request may be in the IO scheduler. Remove it. */
+ if (!RB_EMPTY_NODE(&next->rb_node)) {
+ bfq_remove_request(next->q, next);
+ if (next_bfqq)
+ bfqg_stats_update_io_remove(bfqq_group(next_bfqq),
+ next->cmd_flags);
+ }
}
/* Must be called with bfqq != NULL */
@@ -4876,6 +4884,7 @@ static void bfq_finish_requeue_request(struct request *rq)
{
struct bfq_queue *bfqq = RQ_BFQQ(rq);
struct bfq_data *bfqd;
+ unsigned long flags;
/*
* rq either is not associated with any icq, or is an already
@@ -4893,36 +4902,12 @@ static void bfq_finish_requeue_request(struct request *rq)
rq->io_start_time_ns,
rq->cmd_flags);
+ spin_lock_irqsave(&bfqd->lock, flags);
if (likely(rq->rq_flags & RQF_STARTED)) {
- unsigned long flags;
-
- spin_lock_irqsave(&bfqd->lock, flags);
-
bfq_completed_request(bfqq, bfqd);
- bfq_finish_requeue_request_body(bfqq);
-
- spin_unlock_irqrestore(&bfqd->lock, flags);
- } else {
- /*
- * Request rq may be still/already in the scheduler,
- * in which case we need to remove it (this should
- * never happen in case of requeue). And we cannot
- * defer such a check and removal, to avoid
- * inconsistencies in the time interval from the end
- * of this function to the start of the deferred work.
- * This situation seems to occur only in process
- * context, as a consequence of a merge. In the
- * current version of the code, this implies that the
- * lock is held.
- */
-
- if (!RB_EMPTY_NODE(&rq->rb_node)) {
- bfq_remove_request(rq->q, rq);
- bfqg_stats_update_io_remove(bfqq_group(bfqq),
- rq->cmd_flags);
- }
- bfq_finish_requeue_request_body(bfqq);
}
+ bfq_finish_requeue_request_body(bfqq);
+ spin_unlock_irqrestore(&bfqd->lock, flags);
/*
* Reset private fields. In case of a requeue, this allows
--
2.25.1

[PATCH openEuler-1.0-LTS] md: fix a warning caused by a race between concurrent md_ioctl()s
by Yang Yingliang 09 Dec '21
From: "Dae R. Jeong" <dae.r.jeong(a)kaist.ac.kr>
mainline inclusion
from mainline-v5.11-rc1
commit c731b84b51bf7fe83448bea8f56a6d55006b0615
category: bugfix
bugzilla: 185833
CVE: NA
-----------------------------------------------
Syzkaller reports a warning as below.
WARNING: CPU: 0 PID: 9647 at drivers/md/md.c:7169
...
Call Trace:
...
RIP: 0010:md_ioctl+0x4017/0x5980 drivers/md/md.c:7169
RSP: 0018:ffff888096027950 EFLAGS: 00010293
RAX: ffff88809322c380 RBX: 0000000000000932 RCX: ffffffff84e266f2
RDX: 0000000000000000 RSI: ffffffff84e299f7 RDI: 0000000000000007
RBP: ffff888096027bc0 R08: ffff88809322c380 R09: ffffed101341a482
R10: ffff888096027940 R11: ffff88809a0d240f R12: 0000000000000932
R13: ffff8880a2c14100 R14: ffff88809a0d2268 R15: ffff88809a0d2408
__blkdev_driver_ioctl block/ioctl.c:304 [inline]
blkdev_ioctl+0xece/0x1c10 block/ioctl.c:606
block_ioctl+0xee/0x130 fs/block_dev.c:1930
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0xd5f/0x1380 fs/ioctl.c:696
ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
This is caused by a race between two concurrent md_ioctl()s closing
the array.
CPU1 (md_ioctl()) CPU2 (md_ioctl())
------ ------
set_bit(MD_CLOSING, &mddev->flags);
did_set_md_closing = true;
WARN_ON_ONCE(test_bit(MD_CLOSING,
&mddev->flags));
if(did_set_md_closing)
clear_bit(MD_CLOSING, &mddev->flags);
Fix the warning by returning immediately if the MD_CLOSING bit is set
in &mddev->flags, which indicates that the array is being closed.
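The fix works because test_and_set_bit() checks and claims the bit in one
atomic step, so the window between the test and the set disappears;
schematically:

	/* Racy: another CPU can set MD_CLOSING between the check and the set. */
	WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
	set_bit(MD_CLOSING, &mddev->flags);

	/* Atomic: the first caller claims the bit, later callers back off. */
	if (test_and_set_bit(MD_CLOSING, &mddev->flags))
		return -EBUSY;	/* array is already being closed */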
Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop")
Reported-by: syzbot+1e46a0864c1a6e9bd3d8(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Dae R. Jeong <dae.r.jeong(a)kaist.ac.kr>
Signed-off-by: Song Liu <songliubraving(a)fb.com>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/md/md.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8a2656cf7127d..409ec5ffd28d3 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7306,8 +7306,11 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
err = -EBUSY;
goto out;
}
- WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
- set_bit(MD_CLOSING, &mddev->flags);
+ if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
+ mutex_unlock(&mddev->open_mutex);
+ err = -EBUSY;
+ goto out;
+ }
did_set_md_closing = true;
mutex_unlock(&mddev->open_mutex);
sync_blockdev(bdev);
--
2.25.1

09 Dec '21
From: Yonglong Liu <liuyonglong(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4LD5U
CVE: NA
----------------------------
When multiple users access debugfs at the same time, memory allocation
and release become disordered, causing a kernel crash like this:
[763845.759089] PC is at kfree+0x19c/0x1a0
[763845.759100] LR is at kvfree+0x3c/0x58
[763845.759103] pc : [<ffff00000828878c>] lr : [<ffff00000823432c>] pstate: 60400009
[763845.759105] sp : ffff00003744fc90
[763845.759108] x29: ffff00003744fc90 x28: ffff8027dc87b800
[763845.759115] x27: ffff0000088a1000 x26: ffff000002970f48
[763845.759121] x25: ffff802502600000 x24: 00000000000000af
[763845.759127] x23: 0000000000010000 x22: 0000000013dc0000
[763845.759133] x21: ffff00000823432c x20: ffff802502600000
[763845.759139] x19: ffff802502600000 x18: 0000ffffdaa06b10
[763845.759145] x17: 00000000004201c8 x16: ffff0000082b2b10
[763845.759151] x15: 000000000003013f x14: 0000ffffa462ffe0
[763845.759157] x13: ffffffffffffffff x12: 0433526ae61f3300
[763845.759163] x11: ffff000009694b30 x10: 0000000000000001
[763845.759169] x9 : 000000000007b224 x8 : ffff000009719edc
[763845.759175] x7 : ffff7fe009409800 x6 : 00000045757af8cf
[763845.759181] x5 : ffff8027fced69f0 x4 : 0000000000000000
[763845.759187] x3 : 0000000000000000 x2 : 0433526ae61f3300
[763845.759192] x1 : 0000000000000000 x0 : dead000000000100
[763845.759200] Process cat (pid: 57988, stack limit = 0xffff000037440000)
[763845.759203] Call trace:
[763845.759207] Exception stack(0xffff00003744fb50 to 0xffff00003744fc90)
[763845.759211] fb40: dead000000000100 0000000000000000
[768745.759215] fb60: 0433526ae61f3300 0000000000000000 0000000000000000 ffff8027fced69f0
[763845.759219] fb80: 00000045757af8cf ffff7fe009409800 ffff000009719edc 000000000007b224
[763845.759222] fba0: 0000000000000001 ffff000009694b30 0433526ae61f3300 ffffffffffffffff
[763845.759226] fbc0: 0000ffffa462ffe0 000000000003013f ffff0000082b2b10 00000000004201c8
[763845.759231] fbe0: 0000ffffdaa06b10 ffff802502600000 ffff802502600000 ffff00000823432c
[763845.759235] fc00: 0000000013dc0000 0000000000010000 00000000000000af ffff802502600000
[763845.759238] fc20: ffff000002970f48 ffff0000088a1000 ffff8027dc87b800 ffff00003744fc90
[763845.759243] fc40: ffff00000823432c ffff00003744fc90 ffff00000828878c 0000000060400009
[763845.759247] fc60: ffff00003744feb0 0000000013dc0000 0000ffffffffffff 0000000000000023
[763845.759250] fc80: ffff00003744fc90 ffff00000828878c
[763845.759259] [<ffff00000828878c>] kfree+0x19c/0x1a0
[763845.759263] [<ffff00000823432c>] kvfree+0x3c/0x58
[763845.759306] [<ffff00000295ab94>] hns3_dbg_read+0x94/0x240 [hns3]
[763845.759318] [<ffff000008359550>] full_proxy_read+0x60/0x90
[763845.759324] [<ffff0000082b22a4>] __vfs_read+0x58/0x178
[763845.759327] [<ffff0000082b2454>] vfs_read+0x90/0x14c
[763845.759332] [<ffff0000082b2b70>] SyS_read+0x60/0xc0
This patch adds a mutex lock to fix the race condition. It also needs to
call the hns3_dbg_read_cmd() function when the buffer is NULL to avoid
reading empty data.
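The shape of the fix is the classic mutex-protected lazy allocation; a
condensed sketch of the pattern (not the full hns3_dbg_read()):

	mutex_lock(&handle->dbgfs_lock);
	if (!*save_buf) {
		*save_buf = kvzalloc(buf_len, GFP_KERNEL);
		if (!*save_buf) {
			mutex_unlock(&handle->dbgfs_lock);
			return -ENOMEM;
		}
		/* fill the buffer once, on the first read */
	}
	/* ... copy out of the buffer while still holding the lock ... */
	mutex_unlock(&handle->dbgfs_lock);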
Fixes: c91910efc03a ("net: hns3: refactor the debugfs process")
Signed-off-by: Yonglong Liu <liuyonglong(a)huawei.com>
Reviewed-by: li yongxin <liyongxin1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 +
.../ethernet/hisilicon/hns3/hns3_debugfs.c | 19 +++++++++++++------
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index c9ac1e7cf4492..048de5b367c19 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -764,6 +764,7 @@ struct hnae3_handle {
u8 netdev_flags;
struct dentry *hnae3_dbgfs;
+ struct mutex dbgfs_lock;
/* Network interface message level enabled bits */
u32 msg_enable;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
index 7f3b7084e382f..c68e5f3d0ba52 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
@@ -807,6 +807,7 @@ static ssize_t hns3_dbg_read(struct file *filp, char __user *buffer,
if (ret)
return ret;
+ mutex_lock(&handle->dbgfs_lock);
save_buf = &hns3_dbg_cmd[index].buf;
if (!test_bit(HNS3_NIC_STATE_INITED, &priv->state) ||
@@ -819,15 +820,15 @@ static ssize_t hns3_dbg_read(struct file *filp, char __user *buffer,
read_buf = *save_buf;
} else {
read_buf = kvzalloc(hns3_dbg_cmd[index].buf_len, GFP_KERNEL);
- if (!read_buf)
- return -ENOMEM;
+ if (!read_buf) {
+ ret = -ENOMEM;
+ goto out;
+ }
/* save the buffer addr until the last read operation */
*save_buf = read_buf;
- }
- /* get data ready for the first time to read */
- if (!*ppos) {
+ /* get data ready for the first time to read */
ret = hns3_dbg_read_cmd(dbg_data, hns3_dbg_cmd[index].cmd,
read_buf, hns3_dbg_cmd[index].buf_len);
if (ret)
@@ -836,8 +837,10 @@ static ssize_t hns3_dbg_read(struct file *filp, char __user *buffer,
size = simple_read_from_buffer(buffer, count, ppos, read_buf,
strlen(read_buf));
- if (size > 0)
+ if (size > 0) {
+ mutex_unlock(&handle->dbgfs_lock);
return size;
+ }
out:
/* free the buffer for the last read operation */
@@ -846,6 +849,7 @@ static ssize_t hns3_dbg_read(struct file *filp, char __user *buffer,
*save_buf = NULL;
}
+ mutex_unlock(&handle->dbgfs_lock);
return ret;
}
@@ -916,6 +920,7 @@ int hns3_dbg_init(struct hnae3_handle *handle)
debugfs_create_dir(hns3_dbg_dentry[i].name,
handle->hnae3_dbgfs);
+ mutex_init(&handle->dbgfs_lock);
for (i = 0; i < ARRAY_SIZE(hns3_dbg_cmd); i++) {
if (!hns3_dbg_cmd[i].init) {
dev_err(&handle->pdev->dev,
@@ -936,6 +941,7 @@ int hns3_dbg_init(struct hnae3_handle *handle)
return 0;
out:
+ mutex_destroy(&handle->dbgfs_lock);
debugfs_remove_recursive(handle->hnae3_dbgfs);
handle->hnae3_dbgfs = NULL;
return ret;
@@ -951,6 +957,7 @@ void hns3_dbg_uninit(struct hnae3_handle *handle)
hns3_dbg_cmd[i].buf = NULL;
}
+ mutex_destroy(&handle->dbgfs_lock);
debugfs_remove_recursive(handle->hnae3_dbgfs);
handle->hnae3_dbgfs = NULL;
}
--
2.25.1

09 Dec '21
hulk inclusion
category: bugfix
bugzilla: NA
CVE: NA
---------------------------
If CONFIG_DEBUG_SPINLOCK or CONFIG_DEBUG_LOCK_ALLOC is enabled, there is
no need to fix the KABI breakage.
The breakage was introduced by 93c5c1d15abcd ("af_unix: fix races in sk_peer_pid and sk_peer_cred accesses").
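The pattern overlays the new field on a reserved slot so the struct size and
layout are unchanged; it only works because spinlock_t is no larger than
unsigned long when the debug configs are off, which is exactly why those
configs are excluded. Schematically (a simplified sketch of the diff below):

	#ifndef __GENKSYMS__
		union {
			spinlock_t sk_peer_lock;	/* new field ... */
			unsigned long kabi_reserve1;	/* ... shares the reserved slot */
		};
	#else
		KABI_RESERVE(1)		/* genksyms still sees the old layout */
	#endif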
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/net/sock.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/include/net/sock.h b/include/net/sock.h
index b90b92882b3b8..803464e66e02c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -472,7 +472,9 @@ struct sock {
u32 sk_ack_backlog;
u32 sk_max_ack_backlog;
kuid_t sk_uid;
+#if defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_DEBUG_LOCK_ALLOC)
spinlock_t sk_peer_lock;
+#endif
struct pid *sk_peer_pid;
const struct cred *sk_peer_cred;
@@ -513,7 +515,18 @@ struct sock {
struct sock_reuseport __rcu *sk_reuseport_cb;
struct rcu_head sk_rcu;
+#if !defined(CONFIG_DEBUG_SPINLOCK) && !defined(CONFIG_DEBUG_LOCK_ALLOC)
+#ifndef __GENKSYMS__
+ union {
+ spinlock_t sk_peer_lock;
+ unsigned long kabi_reserve1;
+ };
+#else
KABI_RESERVE(1)
+#endif
+#else
+ KABI_RESERVE(1)
+#endif
KABI_RESERVE(2)
KABI_RESERVE(3)
KABI_RESERVE(4)
--
2.25.1

[PATCH openEuler-1.0-LTS 01/31] cifs: fix incorrect check for null pointer in header_assemble
by Yang Yingliang 08 Dec '21
From: Steve French <stfrench(a)microsoft.com>
stable inclusion
from linux-4.19.209
commit 43d2e0fbc67f8bcfb069130f4028a04887ae76b6
--------------------------------
commit 9ed38fd4a15417cac83967360cf20b853bfab9b6 upstream.
Although it is very unlikely that the tlink pointer would be null in this case,
the get_next_mid function can in theory return null (but not an error),
so we need to check for null (not for IS_ERR, which cannot be returned
here).
Address warning:
fs/smbfs_client/connect.c:2392 cifs_match_super()
warn: 'tlink' isn't an ERR_PTR
Pointed out by Dan Carpenter via smatch code analysis tool
CC: stable(a)vger.kernel.org
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber(a)redhat.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Acked-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/cifs/connect.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 907be252c5d47..36104dd8eb4dd 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -3373,9 +3373,10 @@ cifs_match_super(struct super_block *sb, void *data)
spin_lock(&cifs_tcp_ses_lock);
cifs_sb = CIFS_SB(sb);
tlink = cifs_get_tlink(cifs_sb_master_tlink(cifs_sb));
- if (IS_ERR(tlink)) {
+ if (tlink == NULL) {
+ /* can not match superblock if tlink were ever null */
spin_unlock(&cifs_tcp_ses_lock);
- return rc;
+ return 0;
}
tcon = tlink_tcon(tlink);
ses = tcon->ses;
--
2.25.1

[PATCH OLK-5.10 107/107] fs/ntfs3: Add ntfs3 module in openeuler_defconfig
by Yin Xiujiang 08 Dec '21
kylin inclusion
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
-------------------------------------------------
This adds the ntfs3 module to openeuler_defconfig.
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
arch/arm64/configs/openeuler_defconfig | 4 ++++
arch/x86/configs/openeuler_defconfig | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 76d6a118330d..b1e8524eb5e6 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -6163,6 +6163,10 @@ CONFIG_EXFAT_DEFAULT_IOCHARSET="utf8"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
+CONFIG_NTFS3_FS=m
+CONFIG_NTFS3_64BIT_CLUSTER=y
+CONFIG_NTFS3_LZX_XPRESS=y
+CONFIG_NTFS3_FS_POSIX_ACL=y
# end of DOS/FAT/EXFAT/NT Filesystems
#
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index b25d908dc7a1..83e143d139fa 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -7515,6 +7515,10 @@ CONFIG_EXFAT_DEFAULT_IOCHARSET="utf8"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
+CONFIG_NTFS3_FS=m
+CONFIG_NTFS3_64BIT_CLUSTER=y
+CONFIG_NTFS3_LZX_XPRESS=y
+CONFIG_NTFS3_FS_POSIX_ACL=y
# end of DOS/FAT/EXFAT/NT Filesystems
#
--
2.30.0

08 Dec '21
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
mainline inclusion
from mainline-v5.15
commit 808bc0a82bcd2cbe32a139613325b1a3e03f35f1
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
There is already a 'u8 mask' defined at the top of the function.
There is no need to define a new one here.
Remove the useless and shadowing new 'mask' variable.
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/bitfunc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ntfs3/bitfunc.c b/fs/ntfs3/bitfunc.c
index bf10e2da5c6e..50d838093790 100644
--- a/fs/ntfs3/bitfunc.c
+++ b/fs/ntfs3/bitfunc.c
@@ -119,8 +119,7 @@ bool are_bits_set(const ulong *lmap, size_t bit, size_t nbits)
pos = nbits & 7;
if (pos) {
- u8 mask = fill_mask[pos];
-
+ mask = fill_mask[pos];
if ((*map & mask) != mask)
return false;
}
--
2.30.0

[PATCH OLK-5.10 080/107] fs/ntfs3: Remove a useless test in 'indx_find()'
by Yin Xiujiang 08 Dec '21
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
mainline inclusion
from mainline-v5.15
commit d2846bf33c1423ff872c7a7c2afde292ad502c04
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
'fnd' has been dereferenced several times before, so testing it here is
pointless.
Moreover, all callers of 'indx_find()' already have error handling
code that makes sure no NULL 'fnd' is passed.
So, remove the useless test.
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 4f71a91f07d9..6f81e3a49abf 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1072,9 +1072,7 @@ int indx_find(struct ntfs_index *indx, struct ntfs_inode *ni,
if (!e)
return -EINVAL;
- if (fnd)
- fnd->root_de = e;
-
+ fnd->root_de = e;
err = 0;
for (;;) {
--
2.30.0

08 Dec '21
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15
commit 56eaeb10e2619081cc383febf6740a4c3e806777
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
xfstest generic/041 works with 3003 hardlinks.
Because of this we raise the hardlink limit to 4000.
There are no drawbacks or regressions.
Theoretically we could raise it all the way up to 0xffff,
but there is no practical use for this.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/ntfs.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
index 303a162c3158..9cc396b117bf 100644
--- a/fs/ntfs3/ntfs.h
+++ b/fs/ntfs3/ntfs.h
@@ -26,9 +26,11 @@
#define NTFS_NAME_LEN 255
-/* ntfs.sys used 500 maximum links on-disk struct allows up to 0xffff. */
-#define NTFS_LINK_MAX 0x400
-//#define NTFS_LINK_MAX 0xffff
+/*
+ * ntfs.sys used 500 maximum links on-disk struct allows up to 0xffff.
+ * xfstest generic/041 creates 3003 hardlinks.
+ */
+#define NTFS_LINK_MAX 4000
/*
* Activate to use 64 bit clusters instead of 32 bits in ntfs.sys.
--
2.30.0

08 Dec '21
From: Colin Ian King <colin.king(a)canonical.com>
mainline inclusion
from mainline-v5.15
commit 880301bb313295a65523e79bc5666f5cf49eb3ed
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Currently a failed allocation of sbi->upcase will cause an exit via
the label free_sbi, causing a memory leak on the object opts. Fix this by
re-ordering the exit paths free_opts and free_sbi so that the kfree() calls
occur in reverse allocation order.
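The rule the fix restores is the usual goto-unwind idiom: labels run in
reverse allocation order, and each failure jumps to the label that frees
everything allocated so far. A minimal sketch (allocation sizes are
illustrative):

	opts = kzalloc(sizeof(*opts), GFP_NOFS);
	if (!opts)
		return -ENOMEM;

	sbi = kzalloc(sizeof(*sbi), GFP_NOFS);
	if (!sbi)
		goto free_opts;		/* only opts exists yet */

	sbi->upcase = kvmalloc(0x10000 * sizeof(short), GFP_KERNEL);
	if (!sbi->upcase)
		goto free_sbi;		/* free sbi first, then opts */
	return 0;

free_sbi:
	kfree(sbi);
free_opts:
	kfree(opts);
	return -ENOMEM;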
Addresses-Coverity: ("Resource leak")
Fixes: 27fac77707a1 ("fs/ntfs3: Init spi more in init_fs_context than fill_super")
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/super.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index cefb9ddaf4db..6a535b144ff9 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1393,10 +1393,10 @@ static int ntfs_init_fs_context(struct fs_context *fc)
fc->ops = &ntfs_context_ops;
return 0;
-free_opts:
- kfree(opts);
free_sbi:
kfree(sbi);
+free_opts:
+ kfree(opts);
return -ENOMEM;
}
--
2.30.0

[PATCH OLK-5.10 048/107] fs/ntfs3: Add missing header and guards to lib/ headers
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit b6ba81034b1b74cf426abcece4becda2611504a4
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
size_t needs a header. Also add the missing include guards so that the
compiler includes each of these headers only once.
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/lib/decompress_common.h | 5 +++++
fs/ntfs3/lib/lib.h | 6 ++++++
2 files changed, 11 insertions(+)
diff --git a/fs/ntfs3/lib/decompress_common.h b/fs/ntfs3/lib/decompress_common.h
index 2d70ae42f1b5..dd7ced000d0e 100644
--- a/fs/ntfs3/lib/decompress_common.h
+++ b/fs/ntfs3/lib/decompress_common.h
@@ -5,6 +5,9 @@
* Copyright (C) 2015 Eric Biggers
*/
+#ifndef _LINUX_NTFS3_LIB_DECOMPRESS_COMMON_H
+#define _LINUX_NTFS3_LIB_DECOMPRESS_COMMON_H
+
#include <linux/string.h>
#include <linux/compiler.h>
#include <linux/types.h>
@@ -336,3 +339,5 @@ static forceinline u8 *lz_copy(u8 *dst, u32 length, u32 offset, const u8 *bufend
return dst;
}
+
+#endif /* _LINUX_NTFS3_LIB_DECOMPRESS_COMMON_H */
diff --git a/fs/ntfs3/lib/lib.h b/fs/ntfs3/lib/lib.h
index f508fbad2e71..90309a5ae59c 100644
--- a/fs/ntfs3/lib/lib.h
+++ b/fs/ntfs3/lib/lib.h
@@ -7,6 +7,10 @@
* - linux kernel code style
*/
+#ifndef _LINUX_NTFS3_LIB_LIB_H
+#define _LINUX_NTFS3_LIB_LIB_H
+
+#include <linux/types.h>
/* globals from xpress_decompress.c */
struct xpress_decompressor *xpress_allocate_decompressor(void);
@@ -24,3 +28,5 @@ int lzx_decompress(struct lzx_decompressor *__restrict d,
const void *__restrict compressed_data,
size_t compressed_size, void *__restrict uncompressed_data,
size_t uncompressed_size);
+
+#endif /* _LINUX_NTFS3_LIB_LIB_H */
--
2.30.0

[PATCH OLK-5.10 047/107] fs/ntfs3: Add missing headers and forward declarations to ntfs_fs.h
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit f239b3a95dd4f7daba26ea17f339a5b19a7d40a1
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This file has no header includes at all. We should have them so that not
every .c file needs to include all of the stuff this file needs for building.
This way we can remove some headers from other files and get a better picture
of what is needed, which can save some compilation time. It can also help if
we ever want to split up this one big header.
Also use forward declarations for structs and enums when they are not pulled
in directly by an include and are only used in function declaration inputs.
This prevents the possible compiler warning:
This will prevent possible compiler warning:
xxx declared inside parameter list will not be visible
outside of this definition or declaration
Here is the list I made while parsing this. It does not necessarily contain
every example from this header file, but it proves each header is needed.
<linux/blkdev.h> SECTOR_SHIFT
<linux/buffer_head.h> sb_bread(), put_bh
<linux/cleancache.h> put_page()
<linux/fs.h> struct inode (Just struct ntfs_inode need it)
<linux/highmem.h> kunmap(), kmap()
<linux/kernel.h> cpu_to_leXX() ALIGN
<linux/mm.h> kvfree()
<linux/mutex.h> struct mutex, mutex_(un/try)lock()
<linux/page-flags.h> PageError()
<linux/pagemap.h> read_mapping_page()
<linux/rbtree.h> struct rb_root
<linux/rwsem.h> struct rw_semaphore
<linux/slab.h> krfree(), kzalloc()
<linux/string.h> memset()
<linux/time64.h> struct timespec64
<linux/types.h> uXX, __leXX
<linux/uidgid.h> kuid_t, kgid_t
<asm/div64.h> do_div()
<asm/page.h> PAGE_SIZE
"debug.h" ntfs_err() (Just one entry. Maybe we can drop this)
"ntfs.h" Do you even ask?
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/ntfs_fs.h | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index 372cda697dd4..dae6dd4ac619 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -9,6 +9,37 @@
#ifndef _LINUX_NTFS3_NTFS_FS_H
#define _LINUX_NTFS3_NTFS_FS_H
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/cleancache.h>
+#include <linux/fs.h>
+#include <linux/highmem.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/mutex.h>
+#include <linux/page-flags.h>
+#include <linux/pagemap.h>
+#include <linux/rbtree.h>
+#include <linux/rwsem.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/time64.h>
+#include <linux/types.h>
+#include <linux/uidgid.h>
+#include <asm/div64.h>
+#include <asm/page.h>
+
+#include "debug.h"
+#include "ntfs.h"
+
+struct dentry;
+struct fiemap_extent_info;
+struct user_namespace;
+struct page;
+struct writeback_control;
+enum utf16_endian;
+
+
#define MINUS_ONE_T ((size_t)(-1))
/* Biggest MFT / smallest cluster */
#define MAXIMUM_BYTES_PER_MFT 4096
--
2.30.0

08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 4dfe83320e1e9665b986840b426742ea764e08d7
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This file has no header includes at all. Add the following headers, each
with an explanation of why it was added. Note that an explanation might not
be complete, but it proves the header is needed.
<linux/blkdev.h> // SECTOR_SHIFT
<linux/build_bug.h> // static_assert()
<linux/kernel.h> // cpu_to_le64, cpu_to_le32, ALIGN
<linux/stddef.h> // offsetof()
<linux/string.h> // memcmp()
<linux/types.h> //__le32, __le16
"debug.h" // PtrOffset(), Add2Ptr()
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/ntfs.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
index 6bb3e595263b..695b684bce20 100644
--- a/fs/ntfs3/ntfs.h
+++ b/fs/ntfs3/ntfs.h
@@ -10,6 +10,15 @@
#ifndef _LINUX_NTFS3_NTFS_H
#define _LINUX_NTFS3_NTFS_H
+#include <linux/blkdev.h>
+#include <linux/build_bug.h>
+#include <linux/kernel.h>
+#include <linux/stddef.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+#include "debug.h"
+
/* TODO: Check 4K MFT record and 512 bytes cluster. */
/* Activate this define to use binary search in indexes. */
--
2.30.0

[PATCH OLK-5.10 044/107] fs/ntfs3: Remove redundant initialization of variable err
by Yin Xiujiang 08 Dec '21
From: Colin Ian King <colin.king(a)canonical.com>
mainline inclusion
from mainline-v5.15
commit 0327c6d01a97a3242cf10717819994aa6e095a1d
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
The variable err is being initialized with a value that is never read; it
is updated later on. The assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 0daca9adc54c..b1175542d854 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1401,7 +1401,7 @@ int indx_find_raw(struct ntfs_index *indx, struct ntfs_inode *ni,
static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
CLST *vbn)
{
- int err = -ENOMEM;
+ int err;
struct ntfs_sb_info *sbi = ni->mi.sbi;
struct ATTRIB *bitmap;
struct ATTRIB *alloc;
--
2.30.0

[PATCH OLK-5.10 042/107] fs/ntfs3: Rename mount option no_acs_rules > (no)acsrules
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 28a941ffc1404b66d67228cbe8392bbadb94af0d
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Rename the mount option no_acs_rules to (no)acsrules. This allows us to
mount with the options noacsrules or acsrules.
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
Documentation/filesystems/ntfs3.rst | 2 +-
fs/ntfs3/file.c | 2 +-
fs/ntfs3/ntfs_fs.h | 2 +-
fs/ntfs3/super.c | 12 ++++++------
fs/ntfs3/xattr.c | 2 +-
5 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/Documentation/filesystems/ntfs3.rst b/Documentation/filesystems/ntfs3.rst
index ded706474825..7b6afe452197 100644
--- a/Documentation/filesystems/ntfs3.rst
+++ b/Documentation/filesystems/ntfs3.rst
@@ -73,7 +73,7 @@ prealloc Preallocate space for files excessively when file size is
increasing on writes. Decreases fragmentation in case of
parallel write operations to different files.
-no_acs_rules "No access rules" mount option sets access rights for
+noacsrules "No access rules" mount option sets access rights for
files/folders to 777 and owner/group to root. This mount
option absorbs all other permissions:
- permissions change for files/folders will be reported
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index fef57141b161..0743d806c567 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -737,7 +737,7 @@ int ntfs3_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
umode_t mode = inode->i_mode;
int err;
- if (sbi->options->no_acs_rules) {
+ if (sbi->options->noacsrules) {
/* "No access rules" - Force any changes of time etc. */
attr->ia_valid |= ATTR_FORCE;
/* and disable for editing some attributes. */
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index aa18f12b7096..15bab48bc1ad 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -70,7 +70,7 @@ struct ntfs_mount_options {
showmeta : 1, /* Show meta files. */
nohidden : 1, /* Do not show hidden files. */
force : 1, /* Rw mount dirty volume. */
- no_acs_rules : 1, /*Exclude acs rules. */
+ noacsrules : 1, /*Exclude acs rules. */
prealloc : 1 /* Preallocate space when file is growing. */
;
};
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 503e2e23f711..0690e7e4f00d 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -228,7 +228,7 @@ enum Opt {
Opt_acl,
Opt_iocharset,
Opt_prealloc,
- Opt_no_acs_rules,
+ Opt_noacsrules,
Opt_err,
};
@@ -246,7 +246,7 @@ static const struct fs_parameter_spec ntfs_fs_parameters[] = {
fsparam_flag_no("acl", Opt_acl),
fsparam_flag_no("showmeta", Opt_showmeta),
fsparam_flag_no("prealloc", Opt_prealloc),
- fsparam_flag("no_acs_rules", Opt_no_acs_rules),
+ fsparam_flag_no("acsrules", Opt_noacsrules),
fsparam_string("iocharset", Opt_iocharset),
__fsparam(fs_param_is_string,
@@ -358,8 +358,8 @@ static int ntfs_fs_parse_param(struct fs_context *fc,
case Opt_prealloc:
opts->prealloc = result.negated ? 0 : 1;
break;
- case Opt_no_acs_rules:
- opts->no_acs_rules = 1;
+ case Opt_noacsrules:
+ opts->noacsrules = result.negated ? 1 : 0;
break;
default:
/* Should not be here unless we forget add case. */
@@ -547,8 +547,8 @@ static int ntfs_show_options(struct seq_file *m, struct dentry *root)
seq_puts(m, ",nohidden");
if (opts->force)
seq_puts(m, ",force");
- if (opts->no_acs_rules)
- seq_puts(m, ",no_acs_rules");
+ if (opts->noacsrules)
+ seq_puts(m, ",noacsrules");
if (opts->prealloc)
seq_puts(m, ",prealloc");
if (sb->s_flags & SB_POSIXACL)
diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
index ac4b37bf8832..6f88cb77a17f 100644
--- a/fs/ntfs3/xattr.c
+++ b/fs/ntfs3/xattr.c
@@ -769,7 +769,7 @@ int ntfs_acl_chmod(struct user_namespace *mnt_userns, struct inode *inode)
int ntfs_permission(struct user_namespace *mnt_userns, struct inode *inode,
int mask)
{
- if (ntfs_sb(inode->i_sb)->options->no_acs_rules) {
+ if (ntfs_sb(inode->i_sb)->options->noacsrules) {
/* "No access rules" mode - Allow all changes. */
return 0;
}
--
2.30.0

[PATCH OLK-5.10 041/107] fs/ntfs3: Add iocharset= mount option as alias for nls=
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit e274cde8c7550cac46eb7aba3a77aff44ae0b301
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Other fs drivers use the iocharset= mount option for specifying the charset.
So add it for ntfs3 as well, and mark the old nls= mount option as deprecated.
Reviewed-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
Documentation/filesystems/ntfs3.rst | 4 ++--
fs/ntfs3/super.c | 18 +++++++++++-------
2 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/Documentation/filesystems/ntfs3.rst b/Documentation/filesystems/ntfs3.rst
index af7158de6fde..ded706474825 100644
--- a/Documentation/filesystems/ntfs3.rst
+++ b/Documentation/filesystems/ntfs3.rst
@@ -32,12 +32,12 @@ generic ones.
===============================================================================
-nls=name This option informs the driver how to interpret path
+iocharset=name This option informs the driver how to interpret path
strings and translate them to Unicode and back. If
this option is not set, the default codepage will be
used (CONFIG_NLS_DEFAULT).
Examples:
- 'nls=utf8'
+ 'iocharset=utf8'
uid=
gid=
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 729ead6f2fac..503e2e23f711 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -226,7 +226,7 @@ enum Opt {
Opt_nohidden,
Opt_showmeta,
Opt_acl,
- Opt_nls,
+ Opt_iocharset,
Opt_prealloc,
Opt_no_acs_rules,
Opt_err,
@@ -245,9 +245,13 @@ static const struct fs_parameter_spec ntfs_fs_parameters[] = {
fsparam_flag_no("hidden", Opt_nohidden),
fsparam_flag_no("acl", Opt_acl),
fsparam_flag_no("showmeta", Opt_showmeta),
- fsparam_string("nls", Opt_nls),
fsparam_flag_no("prealloc", Opt_prealloc),
fsparam_flag("no_acs_rules", Opt_no_acs_rules),
+ fsparam_string("iocharset", Opt_iocharset),
+
+ __fsparam(fs_param_is_string,
+ "nls", Opt_iocharset,
+ fs_param_deprecated, NULL),
{}
};
@@ -346,7 +350,7 @@ static int ntfs_fs_parse_param(struct fs_context *fc,
case Opt_showmeta:
opts->showmeta = result.negated ? 0 : 1;
break;
- case Opt_nls:
+ case Opt_iocharset:
kfree(opts->nls_name);
opts->nls_name = param->string;
param->string = NULL;
@@ -380,11 +384,11 @@ static int ntfs_fs_reconfigure(struct fs_context *fc)
new_opts->nls = ntfs_load_nls(new_opts->nls_name);
if (IS_ERR(new_opts->nls)) {
new_opts->nls = NULL;
- errorf(fc, "ntfs3: Cannot load nls %s", new_opts->nls_name);
+ errorf(fc, "ntfs3: Cannot load iocharset %s", new_opts->nls_name);
return -EINVAL;
}
if (new_opts->nls != sbi->options->nls)
- return invalf(fc, "ntfs3: Cannot use different nls when remounting!");
+ return invalf(fc, "ntfs3: Cannot use different iocharset when remounting!");
sync_filesystem(sb);
@@ -528,9 +532,9 @@ static int ntfs_show_options(struct seq_file *m, struct dentry *root)
if (opts->dmask)
seq_printf(m, ",dmask=%04o", ~opts->fs_dmask_inv);
if (opts->nls)
- seq_printf(m, ",nls=%s", opts->nls->charset);
+ seq_printf(m, ",iocharset=%s", opts->nls->charset);
else
- seq_puts(m, ",nls=utf8");
+ seq_puts(m, ",iocharset=utf8");
if (opts->sys_immutable)
seq_puts(m, ",sys_immutable");
if (opts->discard)
--
2.30.0

[PATCH OLK-5.10 040/107] fs/ntfs3: Make mount option nohidden more universal
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 9d1939f4575f3fda70dd94542dbd4d775e104132
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
If we register Opt_nohidden with just the keyword hidden, then we can use
hidden/nohidden when mounting. We already use this method for almost all
other parameters, so it is only logical that this one uses the same method.
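fsparam_flag_no() registers both the plain keyword and its no- prefixed form,
and result.negated reports which one matched; condensed from the diff below:

	fsparam_flag_no("hidden", Opt_nohidden),  /* matches "hidden" and "nohidden" */

	/* in ntfs_fs_parse_param(): */
	case Opt_nohidden:
		opts->nohidden = result.negated ? 1 : 0;  /* set only by "nohidden" */
		break;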
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/super.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 420cd1409170..729ead6f2fac 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -242,7 +242,7 @@ static const struct fs_parameter_spec ntfs_fs_parameters[] = {
fsparam_flag_no("discard", Opt_discard),
fsparam_flag_no("force", Opt_force),
fsparam_flag_no("sparse", Opt_sparse),
- fsparam_flag("nohidden", Opt_nohidden),
+ fsparam_flag_no("hidden", Opt_nohidden),
fsparam_flag_no("acl", Opt_acl),
fsparam_flag_no("showmeta", Opt_showmeta),
fsparam_string("nls", Opt_nls),
@@ -331,7 +331,7 @@ static int ntfs_fs_parse_param(struct fs_context *fc,
opts->sparse = result.negated ? 0 : 1;
break;
case Opt_nohidden:
- opts->nohidden = 1;
+ opts->nohidden = result.negated ? 1 : 0;
break;
case Opt_acl:
if (!result.negated)
--
2.30.0

[PATCH OLK-5.10 035/107] fs/ntfs3: Remove unnecessary mount option noatime
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit b8a30b4171b9a3c22ef0605ed74a21544d00c680
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Remove the unnecessary mount option noatime because this is handled
by the VFS. Our option parser will never see an option like this.
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
Documentation/filesystems/ntfs3.rst | 4 ----
fs/ntfs3/super.c | 7 -------
2 files changed, 11 deletions(-)
diff --git a/Documentation/filesystems/ntfs3.rst b/Documentation/filesystems/ntfs3.rst
index ffe9ea0c1499..af7158de6fde 100644
--- a/Documentation/filesystems/ntfs3.rst
+++ b/Documentation/filesystems/ntfs3.rst
@@ -85,10 +85,6 @@ acl Support POSIX ACLs (Access Control Lists). Effective if
supported by Kernel. Not to be confused with NTFS ACLs.
The option specified as acl enables support for POSIX ACLs.
-noatime All files and directories will not update their last access
- time attribute if a partition is mounted with this parameter.
- This option can speed up file system operation.
-
===============================================================================
ToDo list
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 55bbc9200a10..a18b99a3e3b5 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -223,7 +223,6 @@ enum Opt {
Opt_nohidden,
Opt_showmeta,
Opt_acl,
- Opt_noatime,
Opt_nls,
Opt_prealloc,
Opt_no_acs_rules,
@@ -242,7 +241,6 @@ static const match_table_t ntfs_tokens = {
{ Opt_sparse, "sparse" },
{ Opt_nohidden, "nohidden" },
{ Opt_acl, "acl" },
- { Opt_noatime, "noatime" },
{ Opt_showmeta, "showmeta" },
{ Opt_nls, "nls=%s" },
{ Opt_prealloc, "prealloc" },
@@ -333,9 +331,6 @@ static noinline int ntfs_parse_options(struct super_block *sb, char *options,
ntfs_err(sb, "support for ACL not compiled in!");
return -EINVAL;
#endif
- case Opt_noatime:
- sb->s_flags |= SB_NOATIME;
- break;
case Opt_showmeta:
opts->showmeta = 1;
break;
@@ -587,8 +582,6 @@ static int ntfs_show_options(struct seq_file *m, struct dentry *root)
seq_puts(m, ",prealloc");
if (sb->s_flags & SB_POSIXACL)
seq_puts(m, ",acl");
- if (sb->s_flags & SB_NOATIME)
- seq_puts(m, ",noatime");
return 0;
}
--
2.30.0

[PATCH OLK-5.10 031/107] fs/ntfs3: Fix integer overflow in ni_fiemap with fiemap_prep()
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit d4e8e135a9af7d8d939bba1874ab314322fc2dc2
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Use fiemap_prep() to check for valid flags. It also shrinks the request scope
(@len) to what the fs can actually handle.
This addresses the following Smatch static checker warning:
fs/ntfs3/frecord.c:1894 ni_fiemap()
warn: potential integer overflow from user 'vbo + len'
Because fiemap_prep() shrinks @len, this can no longer happen.
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Link: lore.kernel.org/ntfs3/20210825080440.GA17407@kili/
Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/file.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 89557d60a9b0..f9c9a8c91b46 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -1212,8 +1212,9 @@ int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
int err;
struct ntfs_inode *ni = ntfs_i(inode);
- if (fieinfo->fi_flags & FIEMAP_FLAG_XATTR)
- return -EOPNOTSUPP;
+ err = fiemap_prep(inode, fieinfo, start, &len, ~FIEMAP_FLAG_XATTR);
+ if (err)
+ return err;
ni_lock(ni);
--
2.30.0

[PATCH OLK-5.10 030/107] fs/ntfs3: Restyle comments to better align with kernel-doc
by Yin Xiujiang 08 Dec '21
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15
commit d3624466b56dd5b1886c1dff500525b544c19c83
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/attrib.c | 18 +++++++++++-------
fs/ntfs3/bitmap.c | 5 ++---
fs/ntfs3/file.c | 18 +++++++++---------
fs/ntfs3/frecord.c | 27 +++++++++++++--------------
fs/ntfs3/fslog.c | 11 +++++++----
fs/ntfs3/fsntfs.c | 8 ++++----
fs/ntfs3/index.c | 8 +++++---
fs/ntfs3/inode.c | 20 ++++++++++----------
fs/ntfs3/lznt.c | 5 +++--
fs/ntfs3/ntfs.h | 2 +-
fs/ntfs3/ntfs_fs.h | 24 ++++++++++++------------
fs/ntfs3/record.c | 2 +-
fs/ntfs3/super.c | 2 +-
fs/ntfs3/upcase.c | 2 +-
fs/ntfs3/xattr.c | 7 ++++---
15 files changed, 84 insertions(+), 75 deletions(-)
diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c
index ffc323bacc9f..34c4cbf7e29b 100644
--- a/fs/ntfs3/attrib.c
+++ b/fs/ntfs3/attrib.c
@@ -199,6 +199,7 @@ int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run,
/* Add new fragment into run storage. */
if (!run_add_entry(run, vcn, lcn, flen, opt == ALLOCATE_MFT)) {
+ /* Undo last 'ntfs_look_for_free_space' */
down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
wnd_set_free(wnd, lcn, flen);
up_write(&wnd->rw_lock);
@@ -351,7 +352,6 @@ int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr,
run_close(run);
out1:
kfree(attr_s);
- /* Reinsert le. */
out:
return err;
}
@@ -1153,14 +1153,18 @@ int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
u16 ro;
attr = ni_find_attr(ni, NULL, NULL, type, name, name_len, &vcn, NULL);
- if (!attr)
+ if (!attr) {
+ /* Is record corrupted? */
return -ENOENT;
+ }
svcn = le64_to_cpu(attr->nres.svcn);
evcn = le64_to_cpu(attr->nres.evcn);
- if (evcn < vcn || vcn < svcn)
+ if (evcn < vcn || vcn < svcn) {
+ /* Is record corrupted? */
return -EINVAL;
+ }
ro = le16_to_cpu(attr->nres.run_off);
err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn, svcn,
@@ -1171,7 +1175,7 @@ int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
}
/*
- * attr_wof_load_runs_range - Load runs for given range [from to).
+ * attr_load_runs_range - Load runs for given range [from to).
*/
int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type,
const __le16 *name, u8 name_len, struct runs_tree *run,
@@ -1974,7 +1978,7 @@ int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes, u32 *frame_size)
total_size = le64_to_cpu(attr_b->nres.total_size);
if (vbo >= alloc_size) {
- // NOTE: It is allowed.
+ /* NOTE: It is allowed. */
return 0;
}
@@ -1986,9 +1990,9 @@ int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes, u32 *frame_size)
bytes -= vbo;
if ((vbo & mask) || (bytes & mask)) {
- /* We have to zero a range(s)*/
+ /* We have to zero a range(s). */
if (frame_size == NULL) {
- /* Caller insists range is aligned */
+ /* Caller insists range is aligned. */
return -EINVAL;
}
*frame_size = mask + 1;
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
index 06ae38adb8ad..831501555009 100644
--- a/fs/ntfs3/bitmap.c
+++ b/fs/ntfs3/bitmap.c
@@ -29,7 +29,6 @@ struct rb_node_key {
size_t key;
};
-/* Tree is sorted by start (key). */
struct e_node {
struct rb_node_key start; /* Tree sorted by start. */
struct rb_node_key count; /* Tree sorted by len. */
@@ -1117,7 +1116,7 @@ size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
sb = wnd->sb;
log2_bits = sb->s_blocksize_bits + 3;
- /* At most two ranges [hint, max_alloc) + [0, hint) */
+ /* At most two ranges [hint, max_alloc) + [0, hint). */
Again:
/* TODO: Optimize request for case nbits > wbits. */
@@ -1241,7 +1240,7 @@ size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
continue;
}
- /* Read window */
+ /* Read window. */
bh = wnd_map(wnd, iw);
if (IS_ERR(bh)) {
// TODO: Error.
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 62ebfa324bff..89557d60a9b0 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -190,7 +190,8 @@ static int ntfs_extend_initialized_size(struct file *file,
/*
* ntfs_zero_range - Helper function for punch_hole.
- * It zeroes a range [vbo, vbo_to)
+ *
+ * It zeroes a range [vbo, vbo_to).
*/
static int ntfs_zero_range(struct inode *inode, u64 vbo, u64 vbo_to)
{
@@ -231,12 +232,12 @@ static int ntfs_zero_range(struct inode *inode, u64 vbo, u64 vbo_to)
if (!buffer_mapped(bh)) {
ntfs_get_block(inode, iblock, bh, 0);
- /* unmapped? It's a hole - nothing to do */
+ /* Unmapped? It's a hole - nothing to do. */
if (!buffer_mapped(bh))
continue;
}
- /* Ok, it's mapped. Make sure it's up-to-date */
+ /* Ok, it's mapped. Make sure it's up-to-date. */
if (PageUptodate(page))
set_buffer_uptodate(bh);
@@ -272,9 +273,8 @@ static int ntfs_zero_range(struct inode *inode, u64 vbo, u64 vbo_to)
}
/*
- * ntfs_sparse_cluster
+ * ntfs_sparse_cluster - Helper function to zero a new allocated clusters.
*
- * Helper function to zero a new allocated clusters
* NOTE: 512 <= cluster size <= 2M
*/
void ntfs_sparse_cluster(struct inode *inode, struct page *page0, CLST vcn,
@@ -588,7 +588,7 @@ static long ntfs_fallocate(struct file *file, int mode, loff_t vbo, loff_t len)
truncate_pagecache(inode, vbo_down);
if (!is_sparsed(ni) && !is_compressed(ni)) {
- /* normal file */
+ /* Normal file. */
err = ntfs_zero_range(inode, vbo, end);
goto out;
}
@@ -599,7 +599,7 @@ static long ntfs_fallocate(struct file *file, int mode, loff_t vbo, loff_t len)
if (err != E_NTFS_NOTALIGNED)
goto out;
- /* process not aligned punch */
+ /* Process not aligned punch. */
mask = frame_size - 1;
vbo_a = (vbo + mask) & ~mask;
end_a = end & ~mask;
@@ -647,7 +647,7 @@ static long ntfs_fallocate(struct file *file, int mode, loff_t vbo, loff_t len)
if (err)
goto out;
- /* Wait for existing dio to complete */
+ /* Wait for existing dio to complete. */
inode_dio_wait(inode);
truncate_pagecache(inode, vbo_down);
@@ -1127,7 +1127,7 @@ static ssize_t ntfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
goto out;
if (WARN_ON(ni->ni_flags & NI_FLAG_COMPRESSED_MASK)) {
- /* Should never be here, see ntfs_file_open() */
+ /* Should never be here, see ntfs_file_open(). */
ret = -EOPNOTSUPP;
goto out;
}
diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
index 3f48b612ec96..938b12d56ca6 100644
--- a/fs/ntfs3/frecord.c
+++ b/fs/ntfs3/frecord.c
@@ -56,7 +56,7 @@ static struct mft_inode *ni_find_mi(struct ntfs_inode *ni, CLST rno)
/*
* ni_add_mi - Add new mft_inode into ntfs_inode.
-*/
+ */
static void ni_add_mi(struct ntfs_inode *ni, struct mft_inode *mi)
{
ni_ins_mi(ni, &ni->mi_tree, mi->rno, &mi->node);
@@ -70,9 +70,8 @@ void ni_remove_mi(struct ntfs_inode *ni, struct mft_inode *mi)
rb_erase(&mi->node, &ni->mi_tree);
}
-/* ni_std
- *
- * Return: Pointer into std_info from primary record.
+/*
+ * ni_std - Return: Pointer into std_info from primary record.
*/
struct ATTR_STD_INFO *ni_std(struct ntfs_inode *ni)
{
@@ -385,7 +384,7 @@ bool ni_add_subrecord(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi)
/*
* ni_remove_attr - Remove all attributes for the given type/name/id.
-*/
+ */
int ni_remove_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
const __le16 *name, size_t name_len, bool base_only,
const __le16 *id)
@@ -740,7 +739,7 @@ static int ni_try_remove_attr_list(struct ntfs_inode *ni)
/*
* ni_create_attr_list - Generates an attribute list for this primary record.
-*/
+ */
int ni_create_attr_list(struct ntfs_inode *ni)
{
struct ntfs_sb_info *sbi = ni->mi.sbi;
@@ -939,7 +938,7 @@ static int ni_ins_attr_ext(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le,
if (is_mft_data &&
(mi_enum_attr(mi, NULL) ||
vbo <= ((u64)mi->rno << sbi->record_bits))) {
- /* We can't accept this record 'case MFT's bootstrapping. */
+ /* We can't accept this record 'cause MFT's bootstrapping. */
continue;
}
if (is_mft &&
@@ -1078,7 +1077,7 @@ static int ni_insert_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
*/
max_free = free;
- /* Estimate the result of moving all possible attributes away.*/
+ /* Estimate the result of moving all possible attributes away. */
attr = NULL;
while ((attr = mi_enum_attr(&ni->mi, attr))) {
@@ -1095,7 +1094,7 @@ static int ni_insert_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
goto out;
}
- /* Start real attribute moving */
+ /* Start real attribute moving. */
attr = NULL;
for (;;) {
@@ -1542,7 +1541,7 @@ int ni_delete_all(struct ntfs_inode *ni)
node = next;
}
- /* Free base record */
+ /* Free base record. */
clear_rec_inuse(ni->mi.mrec);
ni->mi.dirty = true;
err = mi_write(&ni->mi, 0);
@@ -2243,7 +2242,7 @@ int ni_decompress_file(struct ntfs_inode *ni)
}
if (attr->non_res && is_attr_sparsed(attr)) {
- /* Sarsed attribute header is 8 bytes bigger than normal. */
+ /* Sparsed attribute header is 8 bytes bigger than normal. */
struct MFT_REC *rec = mi->mrec;
u32 used = le32_to_cpu(rec->used);
u32 asize = le32_to_cpu(attr->size);
@@ -2324,7 +2323,7 @@ static int decompress_lzx_xpress(struct ntfs_sb_info *sbi, const char *cmpr,
mutex_lock(&sbi->compress.mtx_xpress);
ctx = sbi->compress.xpress;
if (!ctx) {
- /* Lazy initialize Xpress decompress context */
+ /* Lazy initialize Xpress decompress context. */
ctx = xpress_allocate_decompressor();
if (!ctx) {
err = -ENOMEM;
@@ -2348,7 +2347,7 @@ static int decompress_lzx_xpress(struct ntfs_sb_info *sbi, const char *cmpr,
/*
* ni_read_frame
*
- * Pages - array of locked pages.
+ * Pages - Array of locked pages.
*/
int ni_read_frame(struct ntfs_inode *ni, u64 frame_vbo, struct page **pages,
u32 pages_per_frame)
@@ -2740,7 +2739,7 @@ int ni_write_frame(struct ntfs_inode *ni, struct page **pages,
lznt = NULL;
}
- /* Compress: frame_mem -> frame_ondisk. */
+ /* Compress: frame_mem -> frame_ondisk */
compr_size = compress_lznt(frame_mem, frame_size, frame_ondisk,
frame_size, sbi->compress.lznt);
mutex_unlock(&sbi->compress.mtx_lznt);
diff --git a/fs/ntfs3/fslog.c b/fs/ntfs3/fslog.c
index 6f6057129fdd..b5853aed0e25 100644
--- a/fs/ntfs3/fslog.c
+++ b/fs/ntfs3/fslog.c
@@ -1362,7 +1362,8 @@ static void log_create(struct ntfs_log *log, u32 l_size, const u64 last_lsn,
/* Compute the log page values. */
log->data_off = ALIGN(
offsetof(struct RECORD_PAGE_HDR, fixups) +
- sizeof(short) * ((log->page_size >> SECTOR_SHIFT) + 1), 8);
+ sizeof(short) * ((log->page_size >> SECTOR_SHIFT) + 1),
+ 8);
log->data_size = log->page_size - log->data_off;
log->record_header_len = sizeof(struct LFS_RECORD_HDR);
@@ -1372,7 +1373,9 @@ static void log_create(struct ntfs_log *log, u32 l_size, const u64 last_lsn,
/* Compute the restart page values. */
log->ra_off = ALIGN(
offsetof(struct RESTART_HDR, fixups) +
- sizeof(short) * ((log->sys_page_size >> SECTOR_SHIFT) + 1), 8);
+ sizeof(short) *
+ ((log->sys_page_size >> SECTOR_SHIFT) + 1),
+ 8);
log->restart_size = log->sys_page_size - log->ra_off;
log->ra_size = struct_size(log->ra, clients, 1);
log->current_openlog_count = open_log_count;
@@ -5132,8 +5135,8 @@ int log_replay(struct ntfs_inode *ni, bool *initialized)
rh->sys_page_size = cpu_to_le32(log->page_size);
rh->page_size = cpu_to_le32(log->page_size);
- t16 = ALIGN(offsetof(struct RESTART_HDR, fixups) +
- sizeof(short) * t16, 8);
+ t16 = ALIGN(offsetof(struct RESTART_HDR, fixups) + sizeof(short) * t16,
+ 8);
rh->ra_off = cpu_to_le16(t16);
rh->minor_ver = cpu_to_le16(1); // 0x1A:
rh->major_ver = cpu_to_le16(1); // 0x1C:
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
index 669249439217..91e3743e1442 100644
--- a/fs/ntfs3/fsntfs.c
+++ b/fs/ntfs3/fsntfs.c
@@ -312,7 +312,7 @@ int ntfs_loadlog_and_replay(struct ntfs_inode *ni, struct ntfs_sb_info *sbi)
if (sb_rdonly(sb) || !initialized)
goto out;
- /* Fill LogFile by '-1' if it is initialized.ssss */
+ /* Fill LogFile by '-1' if it is initialized. */
err = ntfs_bio_fill_1(sbi, &ni->file.run);
out:
@@ -960,10 +960,10 @@ int ntfs_set_state(struct ntfs_sb_info *sbi, enum NTFS_DIRTY_FLAGS dirty)
/* verify(!ntfs_update_mftmirr()); */
/*
- * if we used wait=1, sync_inode_metadata waits for the io for the
+ * If we used wait=1, sync_inode_metadata waits for the io for the
* inode to finish. It hangs when media is removed.
* So wait=0 is sent down to sync_inode_metadata
- * and filemap_fdatawrite is used for the data blocks
+ * and filemap_fdatawrite is used for the data blocks.
*/
err = sync_inode_metadata(&ni->vfs_inode, 0);
if (!err)
@@ -1917,7 +1917,7 @@ int ntfs_security_init(struct ntfs_sb_info *sbi)
sbi->security.next_id = SECURITY_ID_FIRST;
/* Always write new security at the end of bucket. */
sbi->security.next_off =
- ALIGN(sds_size - SecurityDescriptorsBlockSize, 16);
+ ALIGN(sds_size - SecurityDescriptorsBlockSize, 16);
off = 0;
ne = NULL;
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 1224b8e42b3e..0daca9adc54c 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -2624,17 +2624,19 @@ int indx_update_dup(struct ntfs_inode *ni, struct ntfs_sb_info *sbi,
e_fname = (struct ATTR_FILE_NAME *)(e + 1);
if (!memcmp(&e_fname->dup, dup, sizeof(*dup))) {
- /* Nothing to update in index! Try to avoid this call. */
+ /*
+ * Nothing to update in index! Try to avoid this call.
+ */
goto out;
}
memcpy(&e_fname->dup, dup, sizeof(*dup));
if (fnd->level) {
- /* directory entry in index */
+ /* Directory entry in index. */
err = indx_write(indx, ni, fnd->nodes[fnd->level - 1], sync);
} else {
- /* directory entry in directory MFT record */
+ /* Directory entry in directory MFT record. */
mi->dirty = true;
if (sync)
err = mi_write(mi, 1);
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index 8f72066b3229..db2a5a4c38e4 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -89,7 +89,7 @@ static struct inode *ntfs_read_mft(struct inode *inode,
}
if (le32_to_cpu(rec->total) != sbi->record_size) {
- // Bad inode?
+ /* Bad inode? */
err = -EINVAL;
goto out;
}
@@ -605,7 +605,7 @@ static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo,
if (vbo >= valid)
set_buffer_new(bh);
} else if (create) {
- /*normal write*/
+ /* Normal write. */
if (bytes > bh->b_size)
bytes = bh->b_size;
@@ -1091,7 +1091,7 @@ int inode_write_data(struct inode *inode, const void *data, size_t bytes)
/*
* ntfs_reparse_bytes
*
- * Number of bytes to for REPARSE_DATA_BUFFER(IO_REPARSE_TAG_SYMLINK)
+ * Number of bytes for REPARSE_DATA_BUFFER(IO_REPARSE_TAG_SYMLINK)
* for unicode string of @uni_len length.
*/
static inline u32 ntfs_reparse_bytes(u32 uni_len)
@@ -1205,13 +1205,13 @@ struct inode *ntfs_create_inode(struct user_namespace *mnt_userns,
return ERR_PTR(-EINVAL);
if (S_ISDIR(mode)) {
- /* use parent's directory attributes */
+ /* Use parent's directory attributes. */
fa = dir_ni->std_fa | FILE_ATTRIBUTE_DIRECTORY |
FILE_ATTRIBUTE_ARCHIVE;
/*
- * By default child directory inherits parent attributes
- * root directory is hidden + system
- * Make an exception for children in root
+ * By default child directory inherits parent attributes.
+ * Root directory is hidden + system.
+ * Make an exception for children in root.
*/
if (dir->i_ino == MFT_REC_ROOT)
fa &= ~(FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM);
@@ -1220,8 +1220,8 @@ struct inode *ntfs_create_inode(struct user_namespace *mnt_userns,
fa = FILE_ATTRIBUTE_REPARSE_POINT;
/*
- * linux: there are dir/file/symlink and so on.
- * NTFS: symlinks are "dir + reparse" or "file + reparse".
+ * Linux: there are dir/file/symlink and so on.
+ * NTFS: symlinks are "dir + reparse" or "file + reparse"
* It is good idea to create:
* dir + reparse if 'symname' points to directory
* or
@@ -1860,7 +1860,7 @@ static noinline int ntfs_readlink_hlp(struct inode *inode, char *buffer,
default:
if (IsReparseTagMicrosoft(rp->ReparseTag)) {
- /* unknown Microsoft Tag */
+ /* Unknown Microsoft Tag. */
goto out;
}
if (!IsReparseTagNameSurrogate(rp->ReparseTag) ||
diff --git a/fs/ntfs3/lznt.c b/fs/ntfs3/lznt.c
index 3acf0d9f0b15..f1f691a67cc4 100644
--- a/fs/ntfs3/lznt.c
+++ b/fs/ntfs3/lznt.c
@@ -296,8 +296,9 @@ static inline ssize_t decompress_chunk(u8 *unc, u8 *unc_end, const u8 *cmpr,
*/
struct lznt *get_lznt_ctx(int level)
{
- struct lznt *r = kzalloc(level ? offsetof(struct lznt, hash) :
- sizeof(struct lznt), GFP_NOFS);
+ struct lznt *r = kzalloc(level ? offsetof(struct lznt, hash)
+ : sizeof(struct lznt),
+ GFP_NOFS);
if (r)
r->std = !level;
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
index 0fd7bffb98d4..6bb3e595263b 100644
--- a/fs/ntfs3/ntfs.h
+++ b/fs/ntfs3/ntfs.h
@@ -262,7 +262,7 @@ enum RECORD_FLAG {
RECORD_FLAG_UNKNOWN = cpu_to_le16(0x0008),
};
-/* MFT Record structure, */
+/* MFT Record structure. */
struct MFT_REC {
struct NTFS_RECORD_HEADER rhdr; // 'FILE'
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index f9436cbbc347..97e682ebcfb9 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -59,18 +59,18 @@ struct ntfs_mount_options {
u16 fs_fmask_inv;
u16 fs_dmask_inv;
- unsigned uid : 1, /* uid was set. */
- gid : 1, /* gid was set. */
- fmask : 1, /* fmask was set. */
- dmask : 1, /* dmask was set. */
- sys_immutable : 1,/* Immutable system files. */
- discard : 1, /* Issue discard requests on deletions. */
- sparse : 1, /* Create sparse files. */
- showmeta : 1, /* Show meta files. */
- nohidden : 1, /* Do not show hidden files. */
- force : 1, /* Rw mount dirty volume. */
- no_acs_rules : 1,/*Exclude acs rules. */
- prealloc : 1 /* Preallocate space when file is growing. */
+ unsigned uid : 1, /* uid was set. */
+ gid : 1, /* gid was set. */
+ fmask : 1, /* fmask was set. */
+ dmask : 1, /* dmask was set. */
+ sys_immutable : 1, /* Immutable system files. */
+ discard : 1, /* Issue discard requests on deletions. */
+ sparse : 1, /* Create sparse files. */
+ showmeta : 1, /* Show meta files. */
+ nohidden : 1, /* Do not show hidden files. */
+ force : 1, /* Rw mount dirty volume. */
+ no_acs_rules : 1, /*Exclude acs rules. */
+ prealloc : 1 /* Preallocate space when file is growing. */
;
};
diff --git a/fs/ntfs3/record.c b/fs/ntfs3/record.c
index 61e3f2fb619f..103705c86772 100644
--- a/fs/ntfs3/record.c
+++ b/fs/ntfs3/record.c
@@ -219,7 +219,7 @@ struct ATTRIB *mi_enum_attr(struct mft_inode *mi, struct ATTRIB *attr)
asize = le32_to_cpu(attr->size);
if (asize < SIZEOF_RESIDENT) {
- /* Impossible 'cause we should not return such attribute */
+ /* Impossible 'cause we should not return such attribute. */
return NULL;
}
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 2fbab8a931ee..dbecf095da59 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1053,7 +1053,7 @@ static int ntfs_fill_super(struct super_block *sb, void *data, int silent)
iput(inode);
- /* Load $LogFile to replay. */
+ /* Load LogFile to replay. */
ref.low = cpu_to_le32(MFT_REC_LOG);
ref.seq = cpu_to_le16(MFT_REC_LOG);
inode = ntfs_iget5(sb, &ref, &NAME_LOGFILE);
diff --git a/fs/ntfs3/upcase.c b/fs/ntfs3/upcase.c
index eb65bbd939e8..bbeba778237e 100644
--- a/fs/ntfs3/upcase.c
+++ b/fs/ntfs3/upcase.c
@@ -34,7 +34,7 @@ static inline u16 upcase_unicode_char(const u16 *upcase, u16 chr)
* - Case insensitive
* - If name equals and 'bothcases' then
* - Case sensitive
- * 'Straigth way' code scans input names twice in worst case.
+ * 'Straight way' code scans input names twice in worst case.
* Optimized code scans input names only once.
*/
int ntfs_cmp_names(const __le16 *s1, size_t l1, const __le16 *s2, size_t l2,
diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
index 22fd5eb32c5b..b15d532e4a17 100644
--- a/fs/ntfs3/xattr.c
+++ b/fs/ntfs3/xattr.c
@@ -26,9 +26,10 @@
static inline size_t unpacked_ea_size(const struct EA_FULL *ea)
{
return ea->size ? le32_to_cpu(ea->size)
- : ALIGN(struct_size(
- ea, name,
- 1 + ea->name_len + le16_to_cpu(ea->elength)), 4);
+ : ALIGN(struct_size(ea, name,
+ 1 + ea->name_len +
+ le16_to_cpu(ea->elength)),
+ 4);
}
static inline size_t packed_ea_size(const struct EA_FULL *ea)
--
2.30.0
[PATCH OLK-5.10 026/107] fs/ntfs3: Fix error handling in indx_insert_into_root()
by Yin Xiujiang 08 Dec '21
From: Dan Carpenter <dan.carpenter(a)oracle.com>
mainline inclusion
from mainline-v5.15
commit b8155e95de38b25a69dfb03e4731fd6c5a28531e
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
There are three bugs in this code:
1) If indx_get_root() fails, then return -EINVAL instead of success.
2) The "/* make root external */" -EOPNOTSUPP error path should
free "re" but does not, so it leaks memory.
3) If indx_new() fails then it will lead to an error pointer dereference
when we call put_indx_node().
I've re-written the error handling to be more clear.
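The rewrite follows the usual kernel idiom: one descriptively named label
per acquired resource, unwound in reverse order of acquisition. A
self-contained userspace sketch of the pattern (names are illustrative,
not taken from the patch):

#include <errno.h>
#include <stdlib.h>

static int do_insert(int simulate_failure)
{
	char *a_root, *re;
	int err;

	a_root = malloc(32);
	if (!a_root)
		return -ENOMEM;		/* nothing to unwind yet */

	re = malloc(32);
	if (!re) {
		err = -ENOMEM;
		goto out_free_root;	/* only a_root needs freeing */
	}

	if (simulate_failure) {
		err = -EINVAL;
		goto out_free_re;	/* both buffers need freeing */
	}
	err = 0;

out_free_re:
	free(re);
out_free_root:
	free(a_root);
	return err;
}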
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 36 ++++++++++++++++--------------------
1 file changed, 16 insertions(+), 20 deletions(-)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index f4729aa50671..69c6c4e0b4d9 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1555,12 +1555,12 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
u32 root_size, new_root_size;
struct ntfs_sb_info *sbi;
int ds_root;
- struct INDEX_ROOT *root, *a_root = NULL;
+ struct INDEX_ROOT *root, *a_root;
/* Get the record this root placed in */
root = indx_get_root(indx, ni, &attr, &mi);
if (!root)
- goto out;
+ return -EINVAL;
/*
* Try easy case:
@@ -1592,10 +1592,8 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
/* Make a copy of root attribute to restore if error */
a_root = kmemdup(attr, asize, GFP_NOFS);
- if (!a_root) {
- err = -ENOMEM;
- goto out;
- }
+ if (!a_root)
+ return -ENOMEM;
/* copy all the non-end entries from the index root to the new buffer.*/
to_move = 0;
@@ -1605,7 +1603,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
for (e = e0;; e = hdr_next_de(hdr, e)) {
if (!e) {
err = -EINVAL;
- goto out;
+ goto out_free_root;
}
if (de_is_last(e))
@@ -1613,14 +1611,13 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
to_move += le16_to_cpu(e->size);
}
- n = NULL;
if (!to_move) {
re = NULL;
} else {
re = kmemdup(e0, to_move, GFP_NOFS);
if (!re) {
err = -ENOMEM;
- goto out;
+ goto out_free_root;
}
}
@@ -1637,7 +1634,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
if (ds_root > 0 && used + ds_root > sbi->max_bytes_per_attr) {
/* make root external */
err = -EOPNOTSUPP;
- goto out;
+ goto out_free_re;
}
if (ds_root)
@@ -1667,7 +1664,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
/* bug? */
ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
err = -EINVAL;
- goto out1;
+ goto out_free_re;
}
if (err) {
@@ -1678,7 +1675,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
/* bug? */
ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
}
- goto out1;
+ goto out_free_re;
}
e = (struct NTFS_DE *)(root + 1);
@@ -1689,7 +1686,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
n = indx_new(indx, ni, new_vbn, sub_vbn);
if (IS_ERR(n)) {
err = PTR_ERR(n);
- goto out1;
+ goto out_free_re;
}
hdr = &n->index->ihdr;
@@ -1716,7 +1713,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
put_indx_node(n);
fnd_clear(fnd);
err = indx_insert_entry(indx, ni, new_de, ctx, fnd);
- goto out;
+ goto out_free_root;
}
/*
@@ -1726,7 +1723,7 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
e = hdr_insert_de(indx, hdr, new_de, NULL, ctx);
if (!e) {
err = -EINVAL;
- goto out1;
+ goto out_put_n;
}
fnd_push(fnd, n, e);
@@ -1735,12 +1732,11 @@ static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
n = NULL;
-out1:
+out_put_n:
+ put_indx_node(n);
+out_free_re:
kfree(re);
- if (n)
- put_indx_node(n);
-
-out:
+out_free_root:
kfree(a_root);
return err;
}
--
2.30.0
[PATCH OLK-5.10 025/107] fs/ntfs3: Potential NULL dereference in hdr_find_split()
by Yin Xiujiang 08 Dec '21
From: Dan Carpenter <dan.carpenter(a)oracle.com>
mainline inclusion
from mainline-v5.15
commit 8c83a4851da1c7eda83098ade238665b15774da3
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
The "e" pointer is dereferenced before it has been checked for NULL.
Move the dereference after the NULL check to prevent an Oops.
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 5fb41c9c8910..f4729aa50671 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -557,11 +557,12 @@ static const struct NTFS_DE *hdr_find_split(const struct INDEX_HDR *hdr)
size_t o;
const struct NTFS_DE *e = hdr_first_de(hdr);
u32 used_2 = le32_to_cpu(hdr->used) >> 1;
- u16 esize = le16_to_cpu(e->size);
+ u16 esize;
if (!e || de_is_last(e))
return NULL;
+ esize = le16_to_cpu(e->size);
for (o = le32_to_cpu(hdr->de_off) + esize; o < used_2; o += esize) {
const struct NTFS_DE *p = e;
--
2.30.0
08 Dec '21
From: Dan Carpenter <dan.carpenter(a)oracle.com>
mainline inclusion
from mainline-v5.15
commit 04810f000afdbdd37825ca7f563f036119422cb7
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Return -EINVAL if ni_find_attr() fails. Don't return success.
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 334a3cef714b..5fb41c9c8910 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1500,6 +1500,7 @@ static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
alloc = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, in->name, in->name_len,
NULL, &mi);
if (!alloc) {
+ err = -EINVAL;
if (bmp)
goto out2;
goto out1;
--
2.30.0
[PATCH OLK-5.10 023/107] fs/ntfs3: fix an error code in ntfs_get_acl_ex()
by Yin Xiujiang 08 Dec '21
From: Dan Carpenter <dan.carpenter(a)oracle.com>
mainline inclusion
from mainline-v5.15
commit 2926e4297053c735ab65450192dfba32a4f47fa9
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
The ntfs_get_ea() function returns negative error codes or on success
it returns the length. In the original code a zero length return was
treated as -ENODATA and resulted in a NULL return. But it should be
treated as an invalid length and result in a PTR_ERR(-EINVAL) return.
Fixes: be71b5cba2e6 ("fs/ntfs3: Add attrib operations")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/xattr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
index af89e50f7b9f..d3d5b9d331d1 100644
--- a/fs/ntfs3/xattr.c
+++ b/fs/ntfs3/xattr.c
@@ -521,7 +521,7 @@ static struct posix_acl *ntfs_get_acl_ex(struct user_namespace *mnt_userns,
ni_unlock(ni);
/* Translate extended attribute to acl */
- if (err > 0) {
+ if (err >= 0) {
acl = posix_acl_from_xattr(mnt_userns, buf, err);
if (!IS_ERR(acl))
set_cached_acl(inode, type, acl);
--
2.30.0
08 Dec '21
From: Dan Carpenter <dan.carpenter(a)oracle.com>
mainline inclusion
from mainline-v5.15
commit a1b04d380ab64790a7b4a8eb52e14679e47065ab
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Add a check for when the kzalloc() in init_rsttbl() fails. Some of
the callers checked for NULL and some did not. I went down the call
tree and added NULL checks wherever they were missing.
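The shape of the fix, as a userspace sketch with hypothetical names (the
kernel code uses kzalloc()/GFP_NOFS): each allocator in the chain may
return NULL, and each caller checks before dereferencing:

#include <stdlib.h>
#include <string.h>

struct table {
	unsigned short esize;
	unsigned short used;
	char data[];
};

/* Analogue of init_rsttbl(): may return NULL on allocation failure. */
static struct table *table_new(unsigned short esize, unsigned short used)
{
	struct table *t = calloc(1, sizeof(*t) + (size_t)esize * used);

	if (!t)
		return NULL;		/* the check this patch adds */
	t->esize = esize;
	t->used = used;
	return t;
}

/* Analogue of extend_rsttbl(): propagates the NULL upward. */
static struct table *table_extend(struct table *old, unsigned short add)
{
	struct table *t = table_new(old->esize, old->used + add);

	if (!t)
		return NULL;
	memcpy(t->data, old->data, (size_t)old->esize * old->used);
	free(old);
	return t;
}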
Fixes: b46acd6a6a62 ("fs/ntfs3: Add NTFS journal")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/fslog.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/fs/ntfs3/fslog.c b/fs/ntfs3/fslog.c
index 2c213b55979e..7144ea8a9ab8 100644
--- a/fs/ntfs3/fslog.c
+++ b/fs/ntfs3/fslog.c
@@ -809,6 +809,9 @@ static inline struct RESTART_TABLE *init_rsttbl(u16 esize, u16 used)
u32 lf = sizeof(struct RESTART_TABLE) + (used - 1) * esize;
struct RESTART_TABLE *t = kzalloc(bytes, GFP_NOFS);
+ if (!t)
+ return NULL;
+
t->size = cpu_to_le16(esize);
t->used = cpu_to_le16(used);
t->free_goal = cpu_to_le32(~0u);
@@ -831,7 +834,11 @@ static inline struct RESTART_TABLE *extend_rsttbl(struct RESTART_TABLE *tbl,
u16 esize = le16_to_cpu(tbl->size);
__le32 osize = cpu_to_le32(bytes_per_rt(tbl));
u32 used = le16_to_cpu(tbl->used);
- struct RESTART_TABLE *rt = init_rsttbl(esize, used + add);
+ struct RESTART_TABLE *rt;
+
+ rt = init_rsttbl(esize, used + add);
+ if (!rt)
+ return NULL;
memcpy(rt + 1, tbl + 1, esize * used);
@@ -864,8 +871,11 @@ static inline void *alloc_rsttbl_idx(struct RESTART_TABLE **tbl)
__le32 *e;
struct RESTART_TABLE *t = *tbl;
- if (!t->first_free)
+ if (!t->first_free) {
*tbl = t = extend_rsttbl(t, 16, ~0u);
+ if (!t)
+ return NULL;
+ }
off = le32_to_cpu(t->first_free);
@@ -4482,6 +4492,10 @@ int log_replay(struct ntfs_inode *ni, bool *initialized)
}
dp = alloc_rsttbl_idx(&dptbl);
+ if (!dp) {
+ err = -ENOMEM;
+ goto out;
+ }
dp->target_attr = cpu_to_le32(t16);
dp->transfer_len = cpu_to_le32(t32 << sbi->cluster_bits);
dp->lcns_follow = cpu_to_le32(t32);
--
2.30.0
[PATCH OLK-5.10 021/107] fs/ntfs3: Use kcalloc/kmalloc_array over kzalloc/kmalloc
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 345482bc431f6492beb464696341626057f67771
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Use kcalloc/kmalloc_array over kzalloc/kmalloc when we allocate an
array. Checkpatch found these once we no longer used our own
allocation wrappers.
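The point of the array variants is that the element-count multiplication
is overflow-checked; on overflow they return NULL instead of allocating a
too-small buffer. A userspace analogue of that check (a sketch, not the
kernel implementation, which goes through check_mul_overflow()):

#include <stddef.h>
#include <stdlib.h>

/* Overflow-checked array allocation, analogous to kmalloc_array(). */
static void *malloc_array(size_t n, size_t size)
{
	if (size != 0 && n > (size_t)-1 / size)
		return NULL;	/* n * size would wrap around */
	return malloc(n * size);
}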
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/bitmap.c | 2 +-
fs/ntfs3/file.c | 2 +-
fs/ntfs3/frecord.c | 7 +++----
3 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
index d502bba323d0..2de05062c78b 100644
--- a/fs/ntfs3/bitmap.c
+++ b/fs/ntfs3/bitmap.c
@@ -683,7 +683,7 @@ int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
if (!wnd->bits_last)
wnd->bits_last = wbits;
- wnd->free_bits = kzalloc(wnd->nwnd * sizeof(u16), GFP_NOFS);
+ wnd->free_bits = kcalloc(wnd->nwnd, sizeof(u16), GFP_NOFS);
if (!wnd->free_bits)
return -ENOMEM;
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 8d27c520bec5..a959f6197c99 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -900,7 +900,7 @@ static ssize_t ntfs_compress_write(struct kiocb *iocb, struct iov_iter *from)
return -EOPNOTSUPP;
}
- pages = kmalloc(pages_per_frame * sizeof(struct page *), GFP_NOFS);
+ pages = kmalloc_array(pages_per_frame, sizeof(struct page *), GFP_NOFS);
if (!pages)
return -ENOMEM;
diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
index 2f7d16543530..329bc76dfb09 100644
--- a/fs/ntfs3/frecord.c
+++ b/fs/ntfs3/frecord.c
@@ -2054,7 +2054,7 @@ int ni_readpage_cmpr(struct ntfs_inode *ni, struct page *page)
idx = (vbo - frame_vbo) >> PAGE_SHIFT;
pages_per_frame = frame_size >> PAGE_SHIFT;
- pages = kzalloc(pages_per_frame * sizeof(struct page *), GFP_NOFS);
+ pages = kcalloc(pages_per_frame, sizeof(struct page *), GFP_NOFS);
if (!pages) {
err = -ENOMEM;
goto out;
@@ -2137,7 +2137,7 @@ int ni_decompress_file(struct ntfs_inode *ni)
frame_bits = ni_ext_compress_bits(ni);
frame_size = 1u << frame_bits;
pages_per_frame = frame_size >> PAGE_SHIFT;
- pages = kzalloc(pages_per_frame * sizeof(struct page *), GFP_NOFS);
+ pages = kcalloc(pages_per_frame, sizeof(struct page *), GFP_NOFS);
if (!pages) {
err = -ENOMEM;
goto out;
@@ -2709,8 +2709,7 @@ int ni_write_frame(struct ntfs_inode *ni, struct page **pages,
goto out;
}
- pages_disk = kzalloc(pages_per_frame * sizeof(struct page *),
- GFP_NOFS);
+ pages_disk = kcalloc(pages_per_frame, sizeof(struct page *), GFP_NOFS);
if (!pages_disk) {
err = -ENOMEM;
goto out;
--
2.30.0
[PATCH OLK-5.10 017/107] fs/ntfs3: Remove unused including <linux/version.h>
by Yin Xiujiang 08 Dec '21
From: Jiapeng Chong <jiapeng.chong(a)linux.alibaba.com>
mainline inclusion
from mainline-v5.15
commit 1263eddfea9988125a4b9608efecc8aff2c721f9
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Eliminate the following versioncheck warning:
./fs/ntfs3/inode.c: 16 linux/version.h not needed.
Reported-by: Abaci Robot <abaci(a)linux.alibaba.com>
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Jiapeng Chong <jiapeng.chong(a)linux.alibaba.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/inode.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index a573c6e98cb8..ed64489edf73 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -13,7 +13,6 @@
#include <linux/namei.h>
#include <linux/nls.h>
#include <linux/uio.h>
-#include <linux/version.h>
#include <linux/writeback.h>
#include "debug.h"
--
2.30.0
08 Dec '21
From: "Gustavo A. R. Silva" <gustavoars(a)kernel.org>
mainline inclusion
from mainline-v5.15
commit abfeb2ee2103f07dd93b9d7b32317e26d1c8ef79
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Fix the following fallthrough warnings:
fs/ntfs3/inode.c:1792:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/ntfs3/index.c:178:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
This helps with the ongoing efforts to globally enable
-Wimplicit-fallthrough for Clang.
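With -Wimplicit-fallthrough enabled, any case reached by falling off the
previous one must be marked explicitly; an unmarked fall-through is
assumed to be a missing break (the kernel marks the intentional ones with
the fallthrough; pseudo-keyword). A minimal illustration, not from the
patch; compile with -Wimplicit-fallthrough to see the warning:

int classify(int err)
{
	int severity = 0;

	switch (err) {
	case 0:
		severity = 1;
		break;		/* without this break, the compiler warns */
	case -1:
	case -2:
		severity = 2;
		break;
	default:
		break;
	}
	return severity;
}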
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars(a)kernel.org>
Reviewed-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 1 +
fs/ntfs3/inode.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 9386c551e208..189d46e2c38d 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -175,6 +175,7 @@ static inline NTFS_CMP_FUNC get_cmp_func(const struct INDEX_ROOT *root)
default:
break;
}
+ break;
default:
break;
}
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index bf51e294432e..a573c6e98cb8 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -1789,6 +1789,7 @@ int ntfs_unlink_inode(struct inode *dir, const struct dentry *dentry)
switch (err) {
case 0:
drop_nlink(inode);
+ break;
case -ENOTEMPTY:
case -ENOSPC:
case -EROFS:
--
2.30.0
08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit be87e821fdb5ec8c6d404f29e118130c7879ce5b
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
In one source file there is, for some reason, a non-UTF-8 char. But hey,
this is fs development, so this kind of thing might happen.
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/frecord.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
index c3121bf9c62f..e377d72477df 100644
--- a/fs/ntfs3/frecord.c
+++ b/fs/ntfs3/frecord.c
@@ -1784,7 +1784,7 @@ enum REPARSE_SIGN ni_parse_reparse(struct ntfs_inode *ni, struct ATTRIB *attr,
/*
* WOF - Windows Overlay Filter - used to compress files with lzx/xpress
* Unlike native NTFS file compression, the Windows Overlay Filter supports
- * only read operations. This means that it doesn�t need to sector-align each
+ * only read operations. This means that it doesn't need to sector-align each
* compressed chunk, so the compressed data can be packed more tightly together.
* If you open the file for writing, the Windows Overlay Filter just decompresses
* the entire file, turning it back into a plain file.
--
2.30.0
[PATCH OLK-5.10 014/107] fs/ntfs3: Remove unused variable cnt in ntfs_security_init()
by Yin Xiujiang 08 Dec '21
From: Nathan Chancellor <nathan(a)kernel.org>
mainline inclusion
from mainline-v5.15
commit 8c01308b6d6b2bc8e9163c6a3400856fb782dee6
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Clang warns:
fs/ntfs3/fsntfs.c:1874:9: warning: variable 'cnt' set but not used
[-Wunused-but-set-variable]
size_t cnt, off;
^
1 warning generated.
It is indeed unused, so remove it.
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/fsntfs.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
index 92140050fb6c..c6599c514acf 100644
--- a/fs/ntfs3/fsntfs.c
+++ b/fs/ntfs3/fsntfs.c
@@ -1871,7 +1871,7 @@ int ntfs_security_init(struct ntfs_sb_info *sbi)
struct ATTRIB *attr;
struct ATTR_LIST_ENTRY *le;
u64 sds_size;
- size_t cnt, off;
+ size_t off;
struct NTFS_DE *ne;
struct NTFS_DE_SII *sii_e;
struct ntfs_fnd *fnd_sii = NULL;
@@ -1946,7 +1946,6 @@ int ntfs_security_init(struct ntfs_sb_info *sbi)
sbi->security.next_off =
Quad2Align(sds_size - SecurityDescriptorsBlockSize);
- cnt = 0;
off = 0;
ne = NULL;
@@ -1964,8 +1963,6 @@ int ntfs_security_init(struct ntfs_sb_info *sbi)
next_id = le32_to_cpu(sii_e->sec_id) + 1;
if (next_id >= sbi->security.next_id)
sbi->security.next_id = next_id;
-
- cnt += 1;
}
sbi->security.ni = ni;
--
2.30.0
[PATCH OLK-5.10 013/107] fs/ntfs3: Fix integer overflow in multiplication
by Yin Xiujiang 08 Dec '21
From: Colin Ian King <colin.king(a)canonical.com>
mainline inclusion
from mainline-v5.15
commit 71eeb6ace80be7389d942b9647765417e5b039f7
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
The multiplication of the u32 data_size with an int is performed
using 32-bit arithmetic; however, the result is assigned to the
variable nbits, which is a size_t (64-bit) value. Fix a potential
integer overflow by casting the u32 value to a size_t before the
multiply, so that a size_t-sized multiply operation is used.
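The underlying C rule: both operands of a multiply are converted to a
common type before the operation, so u32 * int is computed in 32 bits
even when the result is stored in a 64-bit variable. A self-contained
demonstration (assumes a 64-bit size_t):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t data_size = 0x80000000u;	/* 2 GiB resident size */
	size_t nbits;

	nbits = data_size * 8;			/* 32-bit multiply: wraps to 0 */
	printf("wrapped: %zu\n", nbits);

	nbits = (size_t)data_size * 8;		/* 64-bit multiply: 0x400000000 */
	printf("correct: %zu\n", nbits);
	return 0;
}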
Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/index.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 6aa9540ece47..9386c551e208 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -2012,7 +2012,7 @@ static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
unsigned long pos;
const unsigned long *bm = resident_data(b);
- nbits = le32_to_cpu(b->res.data_size) * 8;
+ nbits = (size_t)le32_to_cpu(b->res.data_size) * 8;
if (bit >= nbits)
return 0;
--
2.30.0
[PATCH OLK-5.10 012/107] fs/ntfs3: Add ifndef + define to all header files
by Yin Xiujiang 08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 87790b65343932411af43bc9b218f086ecebd6a5
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
Add guards so that the compiler will only include each header file once.
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/debug.h | 5 +++++
fs/ntfs3/ntfs.h | 3 +++
fs/ntfs3/ntfs_fs.h | 5 +++++
3 files changed, 13 insertions(+)
diff --git a/fs/ntfs3/debug.h b/fs/ntfs3/debug.h
index 15ac42185e5b..357d9f4dfba7 100644
--- a/fs/ntfs3/debug.h
+++ b/fs/ntfs3/debug.h
@@ -7,6 +7,9 @@
*/
// clang-format off
+#ifndef _LINUX_NTFS3_DEBUG_H
+#define _LINUX_NTFS3_DEBUG_H
+
#ifndef Add2Ptr
#define Add2Ptr(P, I) ((void *)((u8 *)(P) + (I)))
#define PtrOffset(B, O) ((size_t)((size_t)(O) - (size_t)(B)))
@@ -61,4 +64,6 @@ void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
#define ntfs_free(p) kfree(p)
#define ntfs_vfree(p) kvfree(p)
#define ntfs_memdup(src, len) kmemdup(src, len, GFP_NOFS)
+
+#endif /* _LINUX_NTFS3_DEBUG_H */
// clang-format on
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
index 40398e6c39c9..16da514af124 100644
--- a/fs/ntfs3/ntfs.h
+++ b/fs/ntfs3/ntfs.h
@@ -7,6 +7,8 @@
*/
// clang-format off
+#ifndef _LINUX_NTFS3_NTFS_H
+#define _LINUX_NTFS3_NTFS_H
/* TODO:
* - Check 4K mft record and 512 bytes cluster
@@ -1235,4 +1237,5 @@ struct SID {
};
static_assert(offsetof(struct SID, SubAuthority) == 8);
+#endif /* _LINUX_NTFS3_NTFS_H */
// clang-format on
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index c8ea6dd38c21..b5da2f06f7cb 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -6,6 +6,9 @@
*/
// clang-format off
+#ifndef _LINUX_NTFS3_NTFS_FS_H
+#define _LINUX_NTFS3_NTFS_FS_H
+
#define MINUS_ONE_T ((size_t)(-1))
/* Biggest MFT / smallest cluster */
#define MAXIMUM_BYTES_PER_MFT 4096
@@ -1085,3 +1088,5 @@ static inline void le64_sub_cpu(__le64 *var, u64 val)
{
*var = cpu_to_le64(le64_to_cpu(*var) - val);
}
+
+#endif /* _LINUX_NTFS3_NTFS_FS_H */
--
2.30.0
08 Dec '21
From: Kari Argillander <kari.argillander(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 528c9b3d1edf291685151afecd741d176f527ddf
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
We do not need our own implementation for this function in this
driver. It is much better to use the generic one.
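For reference, the generic helper from <linux/log2.h> is the standard
bit trick, roughly equivalent to:

#include <stdbool.h>

/* A power of two has exactly one bit set, so n & (n - 1) clears it
 * and yields zero (sketch equivalent to the kernel's is_power_of_2()). */
static inline bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}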
Signed-off-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/ntfs_fs.h | 5 -----
fs/ntfs3/run.c | 3 ++-
fs/ntfs3/super.c | 9 +++++----
3 files changed, 7 insertions(+), 10 deletions(-)
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index 0c3ac89c3115..c8ea6dd38c21 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -972,11 +972,6 @@ static inline struct buffer_head *ntfs_bread(struct super_block *sb,
return NULL;
}
-static inline bool is_power_of2(size_t v)
-{
- return v && !(v & (v - 1));
-}
-
static inline struct ntfs_inode *ntfs_i(struct inode *inode)
{
return container_of(inode, struct ntfs_inode, vfs_inode);
diff --git a/fs/ntfs3/run.c b/fs/ntfs3/run.c
index f9c362ac672e..60c64deab738 100644
--- a/fs/ntfs3/run.c
+++ b/fs/ntfs3/run.c
@@ -9,6 +9,7 @@
#include <linux/blkdev.h>
#include <linux/buffer_head.h>
#include <linux/fs.h>
+#include <linux/log2.h>
#include <linux/nls.h>
#include "debug.h"
@@ -376,7 +377,7 @@ bool run_add_entry(struct runs_tree *run, CLST vcn, CLST lcn, CLST len,
if (!used) {
bytes = 64;
} else if (used <= 16 * PAGE_SIZE) {
- if (is_power_of2(run->allocated))
+ if (is_power_of_2(run->allocated))
bytes = run->allocated << 1;
else
bytes = (size_t)1
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 84d4f389f685..903975b7e832 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -29,6 +29,7 @@
#include <linux/exportfs.h>
#include <linux/fs.h>
#include <linux/iversion.h>
+#include <linux/log2.h>
#include <linux/module.h>
#include <linux/nls.h>
#include <linux/parser.h>
@@ -735,13 +736,13 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
boot_sector_size = (u32)boot->bytes_per_sector[1] << 8;
if (boot->bytes_per_sector[0] || boot_sector_size < SECTOR_SIZE ||
- !is_power_of2(boot_sector_size)) {
+ !is_power_of_2(boot_sector_size)) {
goto out;
}
/* cluster size: 512, 1K, 2K, 4K, ... 2M */
sct_per_clst = true_sectors_per_clst(boot);
- if (!is_power_of2(sct_per_clst))
+ if (!is_power_of_2(sct_per_clst))
goto out;
mlcn = le64_to_cpu(boot->mft_clst);
@@ -757,14 +758,14 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
/* Check MFT record size */
if ((boot->record_size < 0 &&
SECTOR_SIZE > (2U << (-boot->record_size))) ||
- (boot->record_size >= 0 && !is_power_of2(boot->record_size))) {
+ (boot->record_size >= 0 && !is_power_of_2(boot->record_size))) {
goto out;
}
/* Check index record size */
if ((boot->index_size < 0 &&
SECTOR_SIZE > (2U << (-boot->index_size))) ||
- (boot->index_size >= 0 && !is_power_of2(boot->index_size))) {
+ (boot->index_size >= 0 && !is_power_of_2(boot->index_size))) {
goto out;
}
--
2.30.0
08 Dec '21
From: Colin Ian King <colin.king(a)canonical.com>
mainline inclusion
from mainline-v5.15
commit f8d87ed9f0d546ac5b05e8e7d2b148d4b77599fa
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
There is a spelling mistake in an ntfs_err error message. Also
fix various spelling mistakes in comments.
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
Reviewed-by: Kari Argillander <kari.argillander(a)gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/debug.h | 2 +-
fs/ntfs3/lib/decompress_common.c | 2 +-
fs/ntfs3/run.c | 2 +-
fs/ntfs3/super.c | 4 ++--
fs/ntfs3/upcase.c | 2 +-
5 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/ntfs3/debug.h b/fs/ntfs3/debug.h
index dfaa4c79dc6d..15ac42185e5b 100644
--- a/fs/ntfs3/debug.h
+++ b/fs/ntfs3/debug.h
@@ -3,7 +3,7 @@
*
* Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
*
- * useful functions for debuging
+ * useful functions for debugging
*/
// clang-format off
diff --git a/fs/ntfs3/lib/decompress_common.c b/fs/ntfs3/lib/decompress_common.c
index 83c9e93aea77..850d8e8c8f1f 100644
--- a/fs/ntfs3/lib/decompress_common.c
+++ b/fs/ntfs3/lib/decompress_common.c
@@ -292,7 +292,7 @@ int make_huffman_decode_table(u16 decode_table[], const u32 num_syms,
* of as simply the root of the tree. The
* representation of these internal nodes is
* simply the index of the left child combined
- * with the special bits 0xC000 to distingush
+ * with the special bits 0xC000 to distinguish
* the entry from direct mapping and leaf node
* entries.
*/
diff --git a/fs/ntfs3/run.c b/fs/ntfs3/run.c
index 5cdf6efe67e0..f9c362ac672e 100644
--- a/fs/ntfs3/run.c
+++ b/fs/ntfs3/run.c
@@ -949,7 +949,7 @@ int run_unpack(struct runs_tree *run, struct ntfs_sb_info *sbi, CLST ino,
if (next_vcn > 0x100000000ull || (lcn + len) > 0x100000000ull) {
ntfs_err(
sbi->sb,
- "This driver is compiled whitout CONFIG_NTFS3_64BIT_CLUSTER (like windows driver).\n"
+ "This driver is compiled without CONFIG_NTFS3_64BIT_CLUSTER (like windows driver).\n"
"Volume contains 64 bits run: vcn %llx, lcn %llx, len %llx.\n"
"Activate CONFIG_NTFS3_64BIT_CLUSTER to process this case",
vcn64, lcn, len);
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 6be13e256c1a..84d4f389f685 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -124,7 +124,7 @@ void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
/*
* Shared memory struct.
*
- * on-disk ntfs's upcase table is created by ntfs formater
+ * on-disk ntfs's upcase table is created by ntfs formatter
* 'upcase' table is 128K bytes of memory
* we should read it into memory when mounting
* Several ntfs volumes likely use the same 'upcase' table
@@ -1208,7 +1208,7 @@ static int ntfs_fill_super(struct super_block *sb, void *data, int silent)
sbi->def_entries = 1;
done = sizeof(struct ATTR_DEF_ENTRY);
sbi->reparse.max_size = MAXIMUM_REPARSE_DATA_BUFFER_SIZE;
- sbi->ea_max_size = 0x10000; /* default formater value */
+ sbi->ea_max_size = 0x10000; /* default formatter value */
while (done + sizeof(struct ATTR_DEF_ENTRY) <= bytes) {
u32 t32 = le32_to_cpu(t->type);
diff --git a/fs/ntfs3/upcase.c b/fs/ntfs3/upcase.c
index 9617382aca64..b53943538f9f 100644
--- a/fs/ntfs3/upcase.c
+++ b/fs/ntfs3/upcase.c
@@ -27,7 +27,7 @@ static inline u16 upcase_unicode_char(const u16 *upcase, u16 chr)
/*
* Thanks Kari Argillander <kari.argillander(a)gmail.com> for idea and implementation 'bothcase'
*
- * Straigth way to compare names:
+ * Straight way to compare names:
* - case insensitive
* - if name equals and 'bothcases' then
* - case sensitive
--
2.30.0
[PATCH OLK-5.10 009/107] fs/ntfs3: Add NTFS3 in fs/Kconfig and fs/Makefile
by Yin Xiujiang 08 Dec '21
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15
commit 6e5be40d32fb1907285277c02e74493ed43d77fe
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds NTFS3 in fs/Kconfig and fs/Makefile
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/Kconfig | 1 +
fs/Makefile | 1 +
2 files changed, 2 insertions(+)
diff --git a/fs/Kconfig b/fs/Kconfig
index 3cc647e00f3c..225088d505f4 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -145,6 +145,7 @@ menu "DOS/FAT/EXFAT/NT Filesystems"
source "fs/fat/Kconfig"
source "fs/exfat/Kconfig"
source "fs/ntfs/Kconfig"
+source "fs/ntfs3/Kconfig"
endmenu
endif # BLOCK
diff --git a/fs/Makefile b/fs/Makefile
index fec76c1b4e06..73acb48ce6bc 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -102,6 +102,7 @@ obj-$(CONFIG_SYSV_FS) += sysv/
obj-$(CONFIG_CIFS) += cifs/
obj-$(CONFIG_HPFS_FS) += hpfs/
obj-$(CONFIG_NTFS_FS) += ntfs/
+obj-$(CONFIG_NTFS3_FS) += ntfs3/
obj-$(CONFIG_UFS_FS) += ufs/
obj-$(CONFIG_EFS_FS) += efs/
obj-$(CONFIG_JFFS2_FS) += jffs2/
--
2.30.0
08 Dec '21
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15
commit 12dad495eaab95e0bb784c43869073617c513ea4
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds Kconfig, Makefile and doc
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
Documentation/filesystems/index.rst | 1 +
Documentation/filesystems/ntfs3.rst | 106 ++++++++++++++++++++++++++++
fs/ntfs3/Kconfig | 46 ++++++++++++
fs/ntfs3/Makefile | 36 ++++++++++
4 files changed, 189 insertions(+)
create mode 100644 Documentation/filesystems/ntfs3.rst
create mode 100644 fs/ntfs3/Kconfig
create mode 100644 fs/ntfs3/Makefile
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 98f59a864242..757684537248 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -97,6 +97,7 @@ Documentation for filesystem implementations.
nilfs2
nfs/index
ntfs
+ ntfs3
ocfs2
ocfs2-online-filecheck
omfs
diff --git a/Documentation/filesystems/ntfs3.rst b/Documentation/filesystems/ntfs3.rst
new file mode 100644
index 000000000000..ffe9ea0c1499
--- /dev/null
+++ b/Documentation/filesystems/ntfs3.rst
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====
+NTFS3
+=====
+
+
+Summary and Features
+====================
+
+NTFS3 is fully functional NTFS Read-Write driver. The driver works with
+NTFS versions up to 3.1, normal/compressed/sparse files
+and journal replaying. File system type to use on mount is 'ntfs3'.
+
+- This driver implements NTFS read/write support for normal, sparse and
+ compressed files.
+- Supports native journal replaying;
+- Supports extended attributes
+ Predefined extended attributes:
+ - 'system.ntfs_security' gets/sets security
+ descriptor (SECURITY_DESCRIPTOR_RELATIVE)
+ - 'system.ntfs_attrib' gets/sets ntfs file/dir attributes.
+ Note: applied to empty files, this allows to switch type between
+ sparse(0x200), compressed(0x800) and normal;
+- Supports NFS export of mounted NTFS volumes.
+
+Mount Options
+=============
+
+The list below describes mount options supported by NTFS3 driver in addition to
+generic ones.
+
+===============================================================================
+
+nls=name This option informs the driver how to interpret path
+ strings and translate them to Unicode and back. If
+ this option is not set, the default codepage will be
+ used (CONFIG_NLS_DEFAULT).
+ Examples:
+ 'nls=utf8'
+
+uid=
+gid=
+umask= Controls the default permissions for files/directories created
+ after the NTFS volume is mounted.
+
+fmask=
+dmask= Instead of specifying umask which applies both to
+ files and directories, fmask applies only to files and
+ dmask only to directories.
+
+nohidden Files with the Windows-specific HIDDEN (FILE_ATTRIBUTE_HIDDEN)
+ attribute will not be shown under Linux.
+
+sys_immutable Files with the Windows-specific SYSTEM
+ (FILE_ATTRIBUTE_SYSTEM) attribute will be marked as system
+ immutable files.
+
+discard Enable support of the TRIM command for improved performance
+ on delete operations, which is recommended for use with the
+ solid-state drives (SSD).
+
+force Forces the driver to mount partitions even if 'dirty' flag
+ (volume dirty) is set. Not recommended for use.
+
+sparse Create new files as "sparse".
+
+showmeta Use this parameter to show all meta-files (System Files) on
+ a mounted NTFS partition.
+ By default, all meta-files are hidden.
+
+prealloc Preallocate space for files excessively when file size is
+ increasing on writes. Decreases fragmentation in case of
+ parallel write operations to different files.
+
+no_acs_rules "No access rules" mount option sets access rights for
+ files/folders to 777 and owner/group to root. This mount
+ option absorbs all other permissions:
+ - permissions change for files/folders will be reported
+ as successful, but they will remain 777;
+ - owner/group change will be reported as successful, but
+ they will stay as root
+
+acl Support POSIX ACLs (Access Control Lists). Effective if
+ supported by Kernel. Not to be confused with NTFS ACLs.
+ The option specified as acl enables support for POSIX ACLs.
+
+noatime All files and directories will not update their last access
+ time attribute if a partition is mounted with this parameter.
+ This option can speed up file system operation.
+
+===============================================================================
+
+ToDo list
+=========
+
+- Full journaling support (currently journal replaying is supported) over JBD.
+
+
+References
+==========
+https://www.paragon-software.com/home/ntfs-linux-professional/
+ - Commercial version of the NTFS driver for Linux.
+
+almaz.alexandrovich(a)paragon-software.com
+ - Direct e-mail address for feedback and requests on the NTFS3 implementation.
diff --git a/fs/ntfs3/Kconfig b/fs/ntfs3/Kconfig
new file mode 100644
index 000000000000..6e4cbc48ab8e
--- /dev/null
+++ b/fs/ntfs3/Kconfig
@@ -0,0 +1,46 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config NTFS3_FS
+ tristate "NTFS Read-Write file system support"
+ select NLS
+ help
+ Windows OS native file system (NTFS) support up to NTFS version 3.1.
+
+ Y or M enables the NTFS3 driver with full features enabled (read,
+ write, journal replaying, sparse/compressed files support).
+ File system type to use on mount is "ntfs3". Module name (M option)
+ is also "ntfs3".
+
+ Documentation: <file:Documentation/filesystems/ntfs3.rst>
+
+config NTFS3_64BIT_CLUSTER
+ bool "64 bits per NTFS clusters"
+ depends on NTFS3_FS && 64BIT
+ help
+ Windows implementation of ntfs.sys uses 32 bits per clusters.
+ If activated 64 bits per clusters you will be able to use 4k cluster
+ for 16T+ volumes. Windows will not be able to mount such volumes.
+
+ It is recommended to say N here.
+
+config NTFS3_LZX_XPRESS
+ bool "activate support of external compressions lzx/xpress"
+ depends on NTFS3_FS
+ help
+ In Windows 10 one can use command "compact" to compress any files.
+ 4 possible variants of compression are: xpress4k, xpress8k, xpress16k and lzx.
+ If activated you will be able to read such files correctly.
+
+ It is recommended to say Y here.
+
+config NTFS3_FS_POSIX_ACL
+ bool "NTFS POSIX Access Control Lists"
+ depends on NTFS3_FS
+ select FS_POSIX_ACL
+ help
+ POSIX Access Control Lists (ACLs) support additional access rights
+ for users and groups beyond the standard owner/group/world scheme,
+ and this option selects support for ACLs specifically for ntfs
+ filesystems.
+	  NOTE: this is a Linux-only feature. Windows will ignore these ACLs.
+
+ If you don't know what Access Control Lists are, say N.
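+
+# Illustrative .config fragment (not part of the Kconfig language above,
+# just an example using the options it defines): build the driver as a
+# module with compression and POSIX ACL support enabled:
+#   CONFIG_NTFS3_FS=m
+#   CONFIG_NTFS3_LZX_XPRESS=y
+#   CONFIG_NTFS3_FS_POSIX_ACL=y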
diff --git a/fs/ntfs3/Makefile b/fs/ntfs3/Makefile
new file mode 100644
index 000000000000..279701b62bbe
--- /dev/null
+++ b/fs/ntfs3/Makefile
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the ntfs3 filesystem support.
+#
+
+# to check robot warnings
+ccflags-y += -Wint-to-pointer-cast \
+ $(call cc-option,-Wunused-but-set-variable,-Wunused-const-variable) \
+ $(call cc-option,-Wold-style-declaration,-Wout-of-line-declaration)
+
+obj-$(CONFIG_NTFS3_FS) += ntfs3.o
+
+ntfs3-y := attrib.o \
+ attrlist.o \
+ bitfunc.o \
+ bitmap.o \
+ dir.o \
+ fsntfs.o \
+ frecord.o \
+ file.o \
+ fslog.o \
+ inode.o \
+ index.o \
+ lznt.o \
+ namei.o \
+ record.o \
+ run.o \
+ super.o \
+ upcase.o \
+ xattr.o
+
+ntfs3-$(CONFIG_NTFS3_LZX_XPRESS) += $(addprefix lib/,\
+ decompress_common.o \
+ lzx_decompress.o \
+ xpress_decompress.o \
+ )
\ No newline at end of file
--
2.30.0
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15
commit b46acd6a6a627d876898e1c84d3f84902264b445
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds NTFS journal
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/fslog.c | 5182 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 5182 insertions(+)
create mode 100644 fs/ntfs3/fslog.c
diff --git a/fs/ntfs3/fslog.c b/fs/ntfs3/fslog.c
new file mode 100644
index 000000000000..397ba6a956e7
--- /dev/null
+++ b/fs/ntfs3/fslog.c
@@ -0,0 +1,5182 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/nls.h>
+#include <linux/random.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * LOG FILE structs
+ */
+
+// clang-format off
+
+#define MaxLogFileSize 0x100000000ull
+#define DefaultLogPageSize 4096
+#define MinLogRecordPages 0x30
+
+struct RESTART_HDR {
+ struct NTFS_RECORD_HEADER rhdr; // 'RSTR'
+ __le32 sys_page_size; // 0x10: Page size of the system which initialized the log
+ __le32 page_size; // 0x14: Log page size used for this log file
+ __le16 ra_off; // 0x18:
+ __le16 minor_ver; // 0x1A:
+ __le16 major_ver; // 0x1C:
+ __le16 fixups[];
+};
+
+#define LFS_NO_CLIENT 0xffff
+#define LFS_NO_CLIENT_LE cpu_to_le16(0xffff)
+
+struct CLIENT_REC {
+ __le64 oldest_lsn;
+ __le64 restart_lsn; // 0x08:
+ __le16 prev_client; // 0x10:
+ __le16 next_client; // 0x12:
+ __le16 seq_num; // 0x14:
+ u8 align[6]; // 0x16
+ __le32 name_bytes; // 0x1C: in bytes
+ __le16 name[32]; // 0x20: name of client
+};
+
+static_assert(sizeof(struct CLIENT_REC) == 0x60);
+
+/* Two copies of these will exist at the beginning of the log file */
+struct RESTART_AREA {
+ __le64 current_lsn; // 0x00: Current logical end of log file
+ __le16 log_clients; // 0x08: Maximum number of clients
+ __le16 client_idx[2]; // 0x0A: free/use index into the client record arrays
+ __le16 flags; // 0x0E: See RESTART_SINGLE_PAGE_IO
+ __le32 seq_num_bits; // 0x10: the number of bits in sequence number.
+ __le16 ra_len; // 0x14:
+ __le16 client_off; // 0x16:
+ __le64 l_size; // 0x18: Usable log file size.
+ __le32 last_lsn_data_len; // 0x20:
+	__le16 rec_hdr_len; // 0x24: log record header length
+	__le16 data_off; // 0x26: log page data offset
+ __le32 open_log_count; // 0x28:
+ __le32 align[5]; // 0x2C:
+ struct CLIENT_REC clients[]; // 0x40:
+};
+
+struct LOG_REC_HDR {
+ __le16 redo_op; // 0x00: NTFS_LOG_OPERATION
+ __le16 undo_op; // 0x02: NTFS_LOG_OPERATION
+ __le16 redo_off; // 0x04: Offset to Redo record
+ __le16 redo_len; // 0x06: Redo length
+ __le16 undo_off; // 0x08: Offset to Undo record
+ __le16 undo_len; // 0x0A: Undo length
+ __le16 target_attr; // 0x0C:
+ __le16 lcns_follow; // 0x0E:
+ __le16 record_off; // 0x10:
+ __le16 attr_off; // 0x12:
+ __le16 cluster_off; // 0x14:
+ __le16 reserved; // 0x16:
+ __le64 target_vcn; // 0x18:
+ __le64 page_lcns[]; // 0x20:
+};
+
+static_assert(sizeof(struct LOG_REC_HDR) == 0x20);
+
+#define RESTART_ENTRY_ALLOCATED 0xFFFFFFFF
+#define RESTART_ENTRY_ALLOCATED_LE cpu_to_le32(0xFFFFFFFF)
+
+struct RESTART_TABLE {
+ __le16 size; // 0x00: In bytes
+ __le16 used; // 0x02: entries
+ __le16 total; // 0x04: entries
+ __le16 res[3]; // 0x06:
+ __le32 free_goal; // 0x0C:
+ __le32 first_free; // 0x10
+ __le32 last_free; // 0x14
+
+};
+
+static_assert(sizeof(struct RESTART_TABLE) == 0x18);
+
+struct ATTR_NAME_ENTRY {
+ __le16 off; // offset in the Open attribute Table
+ __le16 name_bytes;
+ __le16 name[];
+};
+
+struct OPEN_ATTR_ENRTY {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 bytes_per_index; // 0x04:
+ enum ATTR_TYPE type; // 0x08:
+	u8 is_dirty_pages; // 0x0C:
+	u8 is_attr_name; // 0x0D: Faked field to manage 'ptr'
+	u8 name_len; // 0x0E: Faked field to manage 'ptr'
+ u8 res;
+ struct MFT_REF ref; // 0x10: File Reference of file containing attribute
+ __le64 open_record_lsn; // 0x18:
+ void *ptr; // 0x20:
+};
+
+/* 32 bit version of 'struct OPEN_ATTR_ENRTY' */
+struct OPEN_ATTR_ENRTY_32 {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 ptr; // 0x04:
+ struct MFT_REF ref; // 0x08:
+ __le64 open_record_lsn; // 0x10:
+ u8 is_dirty_pages; // 0x18:
+ u8 is_attr_name; // 0x19
+ u8 res1[2];
+ enum ATTR_TYPE type; // 0x1C:
+ u8 name_len; // 0x20: in wchar
+ u8 res2[3];
+ __le32 AttributeName; // 0x24:
+ __le32 bytes_per_index; // 0x28:
+};
+
+#define SIZEOF_OPENATTRIBUTEENTRY0 0x2c
+// static_assert( 0x2C == sizeof(struct OPEN_ATTR_ENRTY_32) );
+static_assert(sizeof(struct OPEN_ATTR_ENRTY) < SIZEOF_OPENATTRIBUTEENTRY0);
+
+/*
+ * One entry exists in the Dirty Pages Table for each page which is dirty at the
+ * time the Restart Area is written
+ */
+struct DIR_PAGE_ENTRY {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 target_attr; // 0x04: Index into the Open attribute Table
+ __le32 transfer_len; // 0x08:
+ __le32 lcns_follow; // 0x0C:
+ __le64 vcn; // 0x10: Vcn of dirty page
+ __le64 oldest_lsn; // 0x18:
+ __le64 page_lcns[]; // 0x20:
+};
+
+static_assert(sizeof(struct DIR_PAGE_ENTRY) == 0x20);
+
+/* 32 bit version of 'struct DIR_PAGE_ENTRY' */
+struct DIR_PAGE_ENTRY_32 {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 target_attr; // 0x04: Index into the Open attribute Table
+ __le32 transfer_len; // 0x08:
+ __le32 lcns_follow; // 0x0C:
+ __le32 reserved; // 0x10:
+ __le32 vcn_low; // 0x14: Vcn of dirty page
+ __le32 vcn_hi; // 0x18: Vcn of dirty page
+ __le32 oldest_lsn_low; // 0x1C:
+	__le32 oldest_lsn_hi; // 0x20:
+	__le32 page_lcns_low; // 0x24:
+	__le32 page_lcns_hi; // 0x28:
+};
+
+static_assert(offsetof(struct DIR_PAGE_ENTRY_32, vcn_low) == 0x14);
+static_assert(sizeof(struct DIR_PAGE_ENTRY_32) == 0x2c);
+
+enum transact_state {
+ TransactionUninitialized = 0,
+ TransactionActive,
+ TransactionPrepared,
+ TransactionCommitted
+};
+
+struct TRANSACTION_ENTRY {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ u8 transact_state; // 0x04:
+ u8 reserved[3]; // 0x05:
+ __le64 first_lsn; // 0x08:
+ __le64 prev_lsn; // 0x10:
+ __le64 undo_next_lsn; // 0x18:
+ __le32 undo_records; // 0x20: Number of undo log records pending abort
+ __le32 undo_len; // 0x24: Total undo size
+};
+
+static_assert(sizeof(struct TRANSACTION_ENTRY) == 0x28);
+
+struct NTFS_RESTART {
+ __le32 major_ver; // 0x00:
+ __le32 minor_ver; // 0x04:
+ __le64 check_point_start; // 0x08:
+ __le64 open_attr_table_lsn; // 0x10:
+ __le64 attr_names_lsn; // 0x18:
+ __le64 dirty_pages_table_lsn; // 0x20:
+ __le64 transact_table_lsn; // 0x28:
+ __le32 open_attr_len; // 0x30: In bytes
+ __le32 attr_names_len; // 0x34: In bytes
+ __le32 dirty_pages_len; // 0x38: In bytes
+ __le32 transact_table_len; // 0x3C: In bytes
+};
+
+static_assert(sizeof(struct NTFS_RESTART) == 0x40);
+
+struct NEW_ATTRIBUTE_SIZES {
+ __le64 alloc_size;
+ __le64 valid_size;
+ __le64 data_size;
+ __le64 total_size;
+};
+
+struct BITMAP_RANGE {
+ __le32 bitmap_off;
+ __le32 bits;
+};
+
+struct LCN_RANGE {
+ __le64 lcn;
+ __le64 len;
+};
+
+/* The following type defines the different log record types */
+#define LfsClientRecord cpu_to_le32(1)
+#define LfsClientRestart cpu_to_le32(2)
+
+/* This is used to uniquely identify a client for a particular log file */
+struct CLIENT_ID {
+ __le16 seq_num;
+ __le16 client_idx;
+};
+
+/* This is the header that begins every Log Record in the log file */
+struct LFS_RECORD_HDR {
+ __le64 this_lsn; // 0x00:
+ __le64 client_prev_lsn; // 0x08:
+ __le64 client_undo_next_lsn; // 0x10:
+ __le32 client_data_len; // 0x18:
+ struct CLIENT_ID client; // 0x1C: Owner of this log record
+ __le32 record_type; // 0x20: LfsClientRecord or LfsClientRestart
+ __le32 transact_id; // 0x24:
+ __le16 flags; // 0x28: LOG_RECORD_MULTI_PAGE
+ u8 align[6]; // 0x2A:
+};
+
+#define LOG_RECORD_MULTI_PAGE cpu_to_le16(1)
+
+static_assert(sizeof(struct LFS_RECORD_HDR) == 0x30);
+
+struct LFS_RECORD {
+ __le16 next_record_off; // 0x00: Offset of the free space in the page
+ u8 align[6]; // 0x02:
+ __le64 last_end_lsn; // 0x08: lsn for the last log record which ends on the page
+};
+
+static_assert(sizeof(struct LFS_RECORD) == 0x10);
+
+struct RECORD_PAGE_HDR {
+ struct NTFS_RECORD_HEADER rhdr; // 'RCRD'
+ __le32 rflags; // 0x10: See LOG_PAGE_LOG_RECORD_END
+ __le16 page_count; // 0x14:
+ __le16 page_pos; // 0x16:
+ struct LFS_RECORD record_hdr; // 0x18
+ __le16 fixups[10]; // 0x28
+ __le32 file_off; // 0x3c: used when major version >= 2
+};
+
+// clang-format on
+
+// Page contains the end of a log record
+#define LOG_PAGE_LOG_RECORD_END cpu_to_le32(0x00000001)
+
+static inline bool is_log_record_end(const struct RECORD_PAGE_HDR *hdr)
+{
+ return hdr->rflags & LOG_PAGE_LOG_RECORD_END;
+}
+
+static_assert(offsetof(struct RECORD_PAGE_HDR, file_off) == 0x3c);
+
+/*
+ * END of NTFS LOG structures
+ */
+
+/* Define some tuning parameters to keep the restart tables a reasonable size */
+#define INITIAL_NUMBER_TRANSACTIONS 5
+
+enum NTFS_LOG_OPERATION {
+
+ Noop = 0x00,
+ CompensationLogRecord = 0x01,
+ InitializeFileRecordSegment = 0x02,
+ DeallocateFileRecordSegment = 0x03,
+ WriteEndOfFileRecordSegment = 0x04,
+ CreateAttribute = 0x05,
+ DeleteAttribute = 0x06,
+ UpdateResidentValue = 0x07,
+ UpdateNonresidentValue = 0x08,
+ UpdateMappingPairs = 0x09,
+ DeleteDirtyClusters = 0x0A,
+ SetNewAttributeSizes = 0x0B,
+ AddIndexEntryRoot = 0x0C,
+ DeleteIndexEntryRoot = 0x0D,
+ AddIndexEntryAllocation = 0x0E,
+ DeleteIndexEntryAllocation = 0x0F,
+ WriteEndOfIndexBuffer = 0x10,
+ SetIndexEntryVcnRoot = 0x11,
+ SetIndexEntryVcnAllocation = 0x12,
+ UpdateFileNameRoot = 0x13,
+ UpdateFileNameAllocation = 0x14,
+ SetBitsInNonresidentBitMap = 0x15,
+ ClearBitsInNonresidentBitMap = 0x16,
+ HotFix = 0x17,
+ EndTopLevelAction = 0x18,
+ PrepareTransaction = 0x19,
+ CommitTransaction = 0x1A,
+ ForgetTransaction = 0x1B,
+ OpenNonresidentAttribute = 0x1C,
+ OpenAttributeTableDump = 0x1D,
+ AttributeNamesDump = 0x1E,
+ DirtyPageTableDump = 0x1F,
+ TransactionTableDump = 0x20,
+ UpdateRecordDataRoot = 0x21,
+ UpdateRecordDataAllocation = 0x22,
+
+ UpdateRelativeDataInIndex =
+ 0x23, // NtOfsRestartUpdateRelativeDataInIndex
+ UpdateRelativeDataInIndex2 = 0x24,
+ ZeroEndOfFileRecord = 0x25,
+};
+
+/*
+ * Array for log records which require a target attribute
+ * A true indicates that the corresponding restart operation requires a target attribute
+ */
+static const u8 AttributeRequired[] = {
+ 0xFC, 0xFB, 0xFF, 0x10, 0x06,
+};
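+
+/*
+ * Worked example of the bitmap lookup (illustration only):
+ * op == InitializeFileRecordSegment (0x02) selects byte 0x02 >> 3 == 0
+ * and bit 0x02 & 7 == 2; AttributeRequired[0] == 0xFC == 0b11111100,
+ * bit 2 is set, so this operation requires a target attribute, while
+ * op == Noop (0x00) selects bit 0 of the same byte, which is clear.
+ */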
+
+static inline bool is_target_required(u16 op)
+{
+ bool ret = op <= UpdateRecordDataAllocation &&
+ (AttributeRequired[op >> 3] >> (op & 7) & 1);
+ return ret;
+}
+
+static inline bool can_skip_action(enum NTFS_LOG_OPERATION op)
+{
+ switch (op) {
+ case Noop:
+ case DeleteDirtyClusters:
+ case HotFix:
+ case EndTopLevelAction:
+ case PrepareTransaction:
+ case CommitTransaction:
+ case ForgetTransaction:
+ case CompensationLogRecord:
+ case OpenNonresidentAttribute:
+ case OpenAttributeTableDump:
+ case AttributeNamesDump:
+ case DirtyPageTableDump:
+ case TransactionTableDump:
+ return true;
+ default:
+ return false;
+ }
+}
+
+enum { lcb_ctx_undo_next, lcb_ctx_prev, lcb_ctx_next };
+
+/* bytes per restart table */
+static inline u32 bytes_per_rt(const struct RESTART_TABLE *rt)
+{
+ return le16_to_cpu(rt->used) * le16_to_cpu(rt->size) +
+ sizeof(struct RESTART_TABLE);
+}
+
+/* log record length */
+static inline u32 lrh_length(const struct LOG_REC_HDR *lr)
+{
+ u16 t16 = le16_to_cpu(lr->lcns_follow);
+
+ return struct_size(lr, page_lcns, max_t(u16, 1, t16));
+}
+
+struct lcb {
+ struct LFS_RECORD_HDR *lrh; // Log record header of the current lsn
+ struct LOG_REC_HDR *log_rec;
+ u32 ctx_mode; // lcb_ctx_undo_next/lcb_ctx_prev/lcb_ctx_next
+ struct CLIENT_ID client;
+	bool alloc; // if true then we should deallocate 'log_rec'
+};
+
+static void lcb_put(struct lcb *lcb)
+{
+ if (lcb->alloc)
+ ntfs_free(lcb->log_rec);
+ ntfs_free(lcb->lrh);
+ ntfs_free(lcb);
+}
+
+/*
+ * oldest_client_lsn
+ *
+ * find the oldest lsn from active clients.
+ */
+static inline void oldest_client_lsn(const struct CLIENT_REC *ca,
+ __le16 next_client, u64 *oldest_lsn)
+{
+ while (next_client != LFS_NO_CLIENT_LE) {
+ const struct CLIENT_REC *cr = ca + le16_to_cpu(next_client);
+ u64 lsn = le64_to_cpu(cr->oldest_lsn);
+
+		/* ignore this block if its oldest lsn is 0 */
+ if (lsn && lsn < *oldest_lsn)
+ *oldest_lsn = lsn;
+
+ next_client = cr->next_client;
+ }
+}
+
+static inline bool is_rst_page_hdr_valid(u32 file_off,
+ const struct RESTART_HDR *rhdr)
+{
+ u32 sys_page = le32_to_cpu(rhdr->sys_page_size);
+ u32 page_size = le32_to_cpu(rhdr->page_size);
+ u32 end_usa;
+ u16 ro;
+
+ if (sys_page < SECTOR_SIZE || page_size < SECTOR_SIZE ||
+ sys_page & (sys_page - 1) || page_size & (page_size - 1)) {
+ return false;
+ }
+
+ /* Check that if the file offset isn't 0, it is the system page size */
+ if (file_off && file_off != sys_page)
+ return false;
+
+ /* Check support version 1.1+ */
+ if (le16_to_cpu(rhdr->major_ver) <= 1 && !rhdr->minor_ver)
+ return false;
+
+ if (le16_to_cpu(rhdr->major_ver) > 2)
+ return false;
+
+ ro = le16_to_cpu(rhdr->ra_off);
+ if (!IsQuadAligned(ro) || ro > sys_page)
+ return false;
+
+ end_usa = ((sys_page >> SECTOR_SHIFT) + 1) * sizeof(short);
+ end_usa += le16_to_cpu(rhdr->rhdr.fix_off);
+
+ if (ro < end_usa)
+ return false;
+
+ return true;
+}
+
+static inline bool is_rst_area_valid(const struct RESTART_HDR *rhdr)
+{
+ const struct RESTART_AREA *ra;
+ u16 cl, fl, ul;
+ u32 off, l_size, file_dat_bits, file_size_round;
+ u16 ro = le16_to_cpu(rhdr->ra_off);
+ u32 sys_page = le32_to_cpu(rhdr->sys_page_size);
+
+ if (ro + offsetof(struct RESTART_AREA, l_size) >
+ SECTOR_SIZE - sizeof(short))
+ return false;
+
+ ra = Add2Ptr(rhdr, ro);
+ cl = le16_to_cpu(ra->log_clients);
+
+ if (cl > 1)
+ return false;
+
+ off = le16_to_cpu(ra->client_off);
+
+ if (!IsQuadAligned(off) || ro + off > SECTOR_SIZE - sizeof(short))
+ return false;
+
+ off += cl * sizeof(struct CLIENT_REC);
+
+ if (off > sys_page)
+ return false;
+
+ /*
+ * Check the restart length field and whether the entire
+	 * restart area is contained in that length
+ */
+ if (le16_to_cpu(rhdr->ra_off) + le16_to_cpu(ra->ra_len) > sys_page ||
+ off > le16_to_cpu(ra->ra_len)) {
+ return false;
+ }
+
+ /*
+ * As a final check make sure that the use list and the free list
+ * are either empty or point to a valid client
+ */
+ fl = le16_to_cpu(ra->client_idx[0]);
+ ul = le16_to_cpu(ra->client_idx[1]);
+ if ((fl != LFS_NO_CLIENT && fl >= cl) ||
+ (ul != LFS_NO_CLIENT && ul >= cl))
+ return false;
+
+ /* Make sure the sequence number bits match the log file size */
+ l_size = le64_to_cpu(ra->l_size);
+
+ file_dat_bits = sizeof(u64) * 8 - le32_to_cpu(ra->seq_num_bits);
+ file_size_round = 1u << (file_dat_bits + 3);
+ if (file_size_round != l_size &&
+ (file_size_round < l_size || (file_size_round / 2) > l_size)) {
+ return false;
+ }
+
+ /* The log page data offset and record header length must be quad-aligned */
+ if (!IsQuadAligned(le16_to_cpu(ra->data_off)) ||
+ !IsQuadAligned(le16_to_cpu(ra->rec_hdr_len)))
+ return false;
+
+ return true;
+}
+
+static inline bool is_client_area_valid(const struct RESTART_HDR *rhdr,
+ bool usa_error)
+{
+ u16 ro = le16_to_cpu(rhdr->ra_off);
+ const struct RESTART_AREA *ra = Add2Ptr(rhdr, ro);
+ u16 ra_len = le16_to_cpu(ra->ra_len);
+ const struct CLIENT_REC *ca;
+ u32 i;
+
+ if (usa_error && ra_len + ro > SECTOR_SIZE - sizeof(short))
+ return false;
+
+ /* Find the start of the client array */
+ ca = Add2Ptr(ra, le16_to_cpu(ra->client_off));
+
+ /*
+ * Start with the free list
+ * Check that all the clients are valid and that there isn't a cycle
+ * Do the in-use list on the second pass
+ */
+ for (i = 0; i < 2; i++) {
+ u16 client_idx = le16_to_cpu(ra->client_idx[i]);
+ bool first_client = true;
+ u16 clients = le16_to_cpu(ra->log_clients);
+
+ while (client_idx != LFS_NO_CLIENT) {
+ const struct CLIENT_REC *cr;
+
+ if (!clients ||
+ client_idx >= le16_to_cpu(ra->log_clients))
+ return false;
+
+ clients -= 1;
+ cr = ca + client_idx;
+
+ client_idx = le16_to_cpu(cr->next_client);
+
+ if (first_client) {
+ first_client = false;
+ if (cr->prev_client != LFS_NO_CLIENT_LE)
+ return false;
+ }
+ }
+ }
+
+ return true;
+}
+
+/*
+ * remove_client
+ *
+ * remove a client record from a client record list in a restart area
+ */
+static inline void remove_client(struct CLIENT_REC *ca,
+ const struct CLIENT_REC *cr, __le16 *head)
+{
+ if (cr->prev_client == LFS_NO_CLIENT_LE)
+ *head = cr->next_client;
+ else
+ ca[le16_to_cpu(cr->prev_client)].next_client = cr->next_client;
+
+ if (cr->next_client != LFS_NO_CLIENT_LE)
+ ca[le16_to_cpu(cr->next_client)].prev_client = cr->prev_client;
+}
+
+/*
+ * add_client
+ *
+ * add a client record to the start of a list
+ */
+static inline void add_client(struct CLIENT_REC *ca, u16 index, __le16 *head)
+{
+ struct CLIENT_REC *cr = ca + index;
+
+ cr->prev_client = LFS_NO_CLIENT_LE;
+ cr->next_client = *head;
+
+ if (*head != LFS_NO_CLIENT_LE)
+ ca[le16_to_cpu(*head)].prev_client = cpu_to_le16(index);
+
+ *head = cpu_to_le16(index);
+}
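+
+/*
+ * Illustration: with an empty list (*head == LFS_NO_CLIENT_LE),
+ * add_client(ca, 0, head) sets both links of record 0 to
+ * LFS_NO_CLIENT_LE and points *head at index 0; the lists are thus
+ * threaded through le16 indices into 'ca' rather than pointers.
+ */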
+
+/*
+ * enum_rstbl
+ *
+ * walks a restart table, returning the next allocated entry after 'c',
+ * or the first one when 'c' is NULL; returns NULL when none remain
+ */
+static inline void *enum_rstbl(struct RESTART_TABLE *t, void *c)
+{
+ __le32 *e;
+ u32 bprt;
+ u16 rsize = t ? le16_to_cpu(t->size) : 0;
+
+ if (!c) {
+ if (!t || !t->total)
+ return NULL;
+ e = Add2Ptr(t, sizeof(struct RESTART_TABLE));
+ } else {
+ e = Add2Ptr(c, rsize);
+ }
+
+ /* Loop until we hit the first one allocated, or the end of the list */
+ for (bprt = bytes_per_rt(t); PtrOffset(t, e) < bprt;
+ e = Add2Ptr(e, rsize)) {
+ if (*e == RESTART_ENTRY_ALLOCATED_LE)
+ return e;
+ }
+ return NULL;
+}
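+
+/*
+ * Typical usage (sketch): pass NULL to start and iterate until NULL,
+ * e.g. while ((e = enum_rstbl(tbl, e))) { ... }, as find_dp() below does.
+ */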
+
+/*
+ * find_dp
+ *
+ * searches for a 'vcn' in the Dirty Page Table
+ */
+static inline struct DIR_PAGE_ENTRY *find_dp(struct RESTART_TABLE *dptbl,
+ u32 target_attr, u64 vcn)
+{
+ __le32 ta = cpu_to_le32(target_attr);
+ struct DIR_PAGE_ENTRY *dp = NULL;
+
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ u64 dp_vcn = le64_to_cpu(dp->vcn);
+
+ if (dp->target_attr == ta && vcn >= dp_vcn &&
+ vcn < dp_vcn + le32_to_cpu(dp->lcns_follow)) {
+ return dp;
+ }
+ }
+ return NULL;
+}
+
+static inline u32 norm_file_page(u32 page_size, u32 *l_size, bool use_default)
+{
+ if (use_default)
+ page_size = DefaultLogPageSize;
+
+ /* Round the file size down to a system page boundary */
+ *l_size &= ~(page_size - 1);
+
+ /* File should contain at least 2 restart pages and MinLogRecordPages pages */
+ if (*l_size < (MinLogRecordPages + 2) * page_size)
+ return 0;
+
+ return page_size;
+}
+
+static bool check_log_rec(const struct LOG_REC_HDR *lr, u32 bytes, u32 tr,
+ u32 bytes_per_attr_entry)
+{
+ u16 t16;
+
+ if (bytes < sizeof(struct LOG_REC_HDR))
+ return false;
+ if (!tr)
+ return false;
+
+ if ((tr - sizeof(struct RESTART_TABLE)) %
+ sizeof(struct TRANSACTION_ENTRY))
+ return false;
+
+ if (le16_to_cpu(lr->redo_off) & 7)
+ return false;
+
+ if (le16_to_cpu(lr->undo_off) & 7)
+ return false;
+
+ if (lr->target_attr)
+ goto check_lcns;
+
+ if (is_target_required(le16_to_cpu(lr->redo_op)))
+ return false;
+
+ if (is_target_required(le16_to_cpu(lr->undo_op)))
+ return false;
+
+check_lcns:
+ if (!lr->lcns_follow)
+ goto check_length;
+
+ t16 = le16_to_cpu(lr->target_attr);
+ if ((t16 - sizeof(struct RESTART_TABLE)) % bytes_per_attr_entry)
+ return false;
+
+check_length:
+ if (bytes < lrh_length(lr))
+ return false;
+
+ return true;
+}
+
+static bool check_rstbl(const struct RESTART_TABLE *rt, size_t bytes)
+{
+ u32 ts;
+ u32 i, off;
+ u16 rsize = le16_to_cpu(rt->size);
+ u16 ne = le16_to_cpu(rt->used);
+ u32 ff = le32_to_cpu(rt->first_free);
+ u32 lf = le32_to_cpu(rt->last_free);
+
+ ts = rsize * ne + sizeof(struct RESTART_TABLE);
+
+ if (!rsize || rsize > bytes ||
+ rsize + sizeof(struct RESTART_TABLE) > bytes || bytes < ts ||
+ le16_to_cpu(rt->total) > ne || ff > ts || lf > ts ||
+ (ff && ff < sizeof(struct RESTART_TABLE)) ||
+ (lf && lf < sizeof(struct RESTART_TABLE))) {
+ return false;
+ }
+
+ /* Verify each entry is either allocated or points
+	 * to a valid offset in the table
+ */
+ for (i = 0; i < ne; i++) {
+ off = le32_to_cpu(*(__le32 *)Add2Ptr(
+ rt, i * rsize + sizeof(struct RESTART_TABLE)));
+
+ if (off != RESTART_ENTRY_ALLOCATED && off &&
+ (off < sizeof(struct RESTART_TABLE) ||
+ ((off - sizeof(struct RESTART_TABLE)) % rsize))) {
+ return false;
+ }
+ }
+
+ /* Walk through the list headed by the first entry to make
+ * sure none of the entries are currently being used
+ */
+ for (off = ff; off;) {
+ if (off == RESTART_ENTRY_ALLOCATED)
+ return false;
+
+ off = le32_to_cpu(*(__le32 *)Add2Ptr(rt, off));
+ }
+
+ return true;
+}
+
+/*
+ * free_rsttbl_idx
+ *
+ * frees a previously allocated index in a Restart Table.
+ */
+static inline void free_rsttbl_idx(struct RESTART_TABLE *rt, u32 off)
+{
+ __le32 *e;
+ u32 lf = le32_to_cpu(rt->last_free);
+ __le32 off_le = cpu_to_le32(off);
+
+ e = Add2Ptr(rt, off);
+
+ if (off < le32_to_cpu(rt->free_goal)) {
+ *e = rt->first_free;
+ rt->first_free = off_le;
+ if (!lf)
+ rt->last_free = off_le;
+ } else {
+ if (lf)
+ *(__le32 *)Add2Ptr(rt, lf) = off_le;
+ else
+ rt->first_free = off_le;
+
+ rt->last_free = off_le;
+ *e = 0;
+ }
+
+ le16_sub_cpu(&rt->total, 1);
+}
+
+static inline struct RESTART_TABLE *init_rsttbl(u16 esize, u16 used)
+{
+ __le32 *e, *last_free;
+ u32 off;
+ u32 bytes = esize * used + sizeof(struct RESTART_TABLE);
+ u32 lf = sizeof(struct RESTART_TABLE) + (used - 1) * esize;
+ struct RESTART_TABLE *t = ntfs_zalloc(bytes);
+
+ t->size = cpu_to_le16(esize);
+ t->used = cpu_to_le16(used);
+ t->free_goal = cpu_to_le32(~0u);
+ t->first_free = cpu_to_le32(sizeof(struct RESTART_TABLE));
+ t->last_free = cpu_to_le32(lf);
+
+ e = (__le32 *)(t + 1);
+ last_free = Add2Ptr(t, lf);
+
+ for (off = sizeof(struct RESTART_TABLE) + esize; e < last_free;
+ e = Add2Ptr(e, esize), off += esize) {
+ *e = cpu_to_le32(off);
+ }
+ return t;
+}
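+
+/*
+ * Layout illustration for init_rsttbl (example values): with esize == 0x28
+ * and used == 5 the table is a 0x18 byte header plus five 0x28 byte slots;
+ * each free slot stores the le32 offset of the next one, the last slot is
+ * left zero, so first_free == 0x18 and last_free == 0x18 + 4 * 0x28 == 0xb8.
+ */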
+
+static inline struct RESTART_TABLE *extend_rsttbl(struct RESTART_TABLE *tbl,
+ u32 add, u32 free_goal)
+{
+ u16 esize = le16_to_cpu(tbl->size);
+ __le32 osize = cpu_to_le32(bytes_per_rt(tbl));
+ u32 used = le16_to_cpu(tbl->used);
+ struct RESTART_TABLE *rt = init_rsttbl(esize, used + add);
+
+ memcpy(rt + 1, tbl + 1, esize * used);
+
+ rt->free_goal = free_goal == ~0u
+ ? cpu_to_le32(~0u)
+ : cpu_to_le32(sizeof(struct RESTART_TABLE) +
+ free_goal * esize);
+
+ if (tbl->first_free) {
+ rt->first_free = tbl->first_free;
+ *(__le32 *)Add2Ptr(rt, le32_to_cpu(tbl->last_free)) = osize;
+ } else {
+ rt->first_free = osize;
+ }
+
+ rt->total = tbl->total;
+
+ ntfs_free(tbl);
+ return rt;
+}
+
+/*
+ * alloc_rsttbl_idx
+ *
+ * allocates an index from within a previously initialized Restart Table
+ */
+static inline void *alloc_rsttbl_idx(struct RESTART_TABLE **tbl)
+{
+ u32 off;
+ __le32 *e;
+ struct RESTART_TABLE *t = *tbl;
+
+ if (!t->first_free)
+ *tbl = t = extend_rsttbl(t, 16, ~0u);
+
+ off = le32_to_cpu(t->first_free);
+
+ /* Dequeue this entry and zero it. */
+ e = Add2Ptr(t, off);
+
+ t->first_free = *e;
+
+ memset(e, 0, le16_to_cpu(t->size));
+
+ *e = RESTART_ENTRY_ALLOCATED_LE;
+
+ /* If list is going empty, then we fix the last_free as well. */
+ if (!t->first_free)
+ t->last_free = 0;
+
+ le16_add_cpu(&t->total, 1);
+
+ return Add2Ptr(t, off);
+}
+
+/*
+ * alloc_rsttbl_from_idx
+ *
+ * allocates a specific index from within a previously initialized Restart Table
+ */
+static inline void *alloc_rsttbl_from_idx(struct RESTART_TABLE **tbl, u32 vbo)
+{
+ u32 off;
+ __le32 *e;
+ struct RESTART_TABLE *rt = *tbl;
+ u32 bytes = bytes_per_rt(rt);
+ u16 esize = le16_to_cpu(rt->size);
+
+	/* If the entry is not in the table, we will have to extend the table */
+ if (vbo >= bytes) {
+ /*
+ * extend the size by computing the number of entries between
+ * the existing size and the desired index and adding
+ * 1 to that
+ */
+ u32 bytes2idx = vbo - bytes;
+
+ /* There should always be an integral number of entries being added */
+ /* Now extend the table */
+ *tbl = rt = extend_rsttbl(rt, bytes2idx / esize + 1, bytes);
+ if (!rt)
+ return NULL;
+ }
+
+ /* see if the entry is already allocated, and just return if it is. */
+ e = Add2Ptr(rt, vbo);
+
+ if (*e == RESTART_ENTRY_ALLOCATED_LE)
+ return e;
+
+ /*
+ * Walk through the table, looking for the entry we're
+	 * interested in and the previous entry
+ */
+ off = le32_to_cpu(rt->first_free);
+ e = Add2Ptr(rt, off);
+
+ if (off == vbo) {
+ /* this is a match */
+ rt->first_free = *e;
+ goto skip_looking;
+ }
+
+ /*
+ * need to walk through the list looking for the predecessor of our entry
+ */
+ for (;;) {
+ /* Remember the entry just found */
+ u32 last_off = off;
+ __le32 *last_e = e;
+
+		/* We should never run out of entries. */
+
+		/* Look up the next entry in the list */
+ off = le32_to_cpu(*last_e);
+ e = Add2Ptr(rt, off);
+
+ /* If this is our match we are done */
+ if (off == vbo) {
+ *last_e = *e;
+
+			/* If this was the last entry, we update the table as well */
+ if (le32_to_cpu(rt->last_free) == off)
+ rt->last_free = cpu_to_le32(last_off);
+ break;
+ }
+ }
+
+skip_looking:
+ /* If the list is now empty, we fix the last_free as well */
+ if (!rt->first_free)
+ rt->last_free = 0;
+
+ /* Zero this entry */
+ memset(e, 0, esize);
+ *e = RESTART_ENTRY_ALLOCATED_LE;
+
+ le16_add_cpu(&rt->total, 1);
+
+ return e;
+}
+
+#define RESTART_SINGLE_PAGE_IO cpu_to_le16(0x0001)
+
+#define NTFSLOG_WRAPPED 0x00000001
+#define NTFSLOG_MULTIPLE_PAGE_IO 0x00000002
+#define NTFSLOG_NO_LAST_LSN 0x00000004
+#define NTFSLOG_REUSE_TAIL 0x00000010
+#define NTFSLOG_NO_OLDEST_LSN 0x00000020
+
+/*
+ * Helper struct to work with NTFS LogFile
+ */
+struct ntfs_log {
+ struct ntfs_inode *ni;
+
+ u32 l_size;
+ u32 sys_page_size;
+ u32 sys_page_mask;
+ u32 page_size;
+ u32 page_mask; // page_size - 1
+ u8 page_bits;
+ struct RECORD_PAGE_HDR *one_page_buf;
+
+ struct RESTART_TABLE *open_attr_tbl;
+ u32 transaction_id;
+ u32 clst_per_page;
+
+ u32 first_page;
+ u32 next_page;
+ u32 ra_off;
+ u32 data_off;
+ u32 restart_size;
+ u32 data_size;
+ u16 record_header_len;
+ u64 seq_num;
+ u32 seq_num_bits;
+ u32 file_data_bits;
+ u32 seq_num_mask; /* (1 << file_data_bits) - 1 */
+
+ struct RESTART_AREA *ra; /* in-memory image of the next restart area */
+ u32 ra_size; /* the usable size of the restart area */
+
+ /*
+ * If true, then the in-memory restart area is to be written
+ * to the first position on the disk
+ */
+ bool init_ra;
+ bool set_dirty; /* true if we need to set dirty flag */
+
+ u64 oldest_lsn;
+
+ u32 oldest_lsn_off;
+ u64 last_lsn;
+
+ u32 total_avail;
+ u32 total_avail_pages;
+ u32 total_undo_commit;
+ u32 max_current_avail;
+ u32 current_avail;
+ u32 reserved;
+
+ short major_ver;
+ short minor_ver;
+
+ u32 l_flags; /* See NTFSLOG_XXX */
+ u32 current_openlog_count; /* On-disk value for open_log_count */
+
+ struct CLIENT_ID client_id;
+ u32 client_undo_commit;
+};
+
+static inline u32 lsn_to_vbo(struct ntfs_log *log, const u64 lsn)
+{
+ u32 vbo = (lsn << log->seq_num_bits) >> (log->seq_num_bits - 3);
+
+ return vbo;
+}
+
+/* compute the offset in the log file of the next log page */
+static inline u32 next_page_off(struct ntfs_log *log, u32 off)
+{
+ off = (off & ~log->sys_page_mask) + log->page_size;
+ return off >= log->l_size ? log->first_page : off;
+}
+
+static inline u32 lsn_to_page_off(struct ntfs_log *log, u64 lsn)
+{
+ return (((u32)lsn) << 3) & log->page_mask;
+}
+
+static inline u64 vbo_to_lsn(struct ntfs_log *log, u32 off, u64 Seq)
+{
+ return (off >> 3) + (Seq << log->file_data_bits);
+}
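+
+/*
+ * Worked example of the lsn <-> vbo mapping (illustrative values): with
+ * file_data_bits == 26 an lsn keeps (file offset >> 3) in bits 0..25 and
+ * the sequence number above them, so vbo_to_lsn(log, 0x2038, 5) returns
+ * (0x2038 >> 3) + (5 << 26) and lsn_to_vbo() recovers 0x2038 by shifting
+ * the sequence bits out and multiplying the remainder by 8.
+ */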
+
+static inline bool is_lsn_in_file(struct ntfs_log *log, u64 lsn)
+{
+ return lsn >= log->oldest_lsn &&
+ lsn <= le64_to_cpu(log->ra->current_lsn);
+}
+
+static inline u32 hdr_file_off(struct ntfs_log *log,
+ struct RECORD_PAGE_HDR *hdr)
+{
+ if (log->major_ver < 2)
+ return le64_to_cpu(hdr->rhdr.lsn);
+
+ return le32_to_cpu(hdr->file_off);
+}
+
+static inline u64 base_lsn(struct ntfs_log *log,
+ const struct RECORD_PAGE_HDR *hdr, u64 lsn)
+{
+ u64 h_lsn = le64_to_cpu(hdr->rhdr.lsn);
+ u64 ret = (((h_lsn >> log->file_data_bits) +
+ (lsn < (lsn_to_vbo(log, h_lsn) & ~log->page_mask) ? 1 : 0))
+ << log->file_data_bits) +
+ ((((is_log_record_end(hdr) &&
+ h_lsn <= le64_to_cpu(hdr->record_hdr.last_end_lsn))
+ ? le16_to_cpu(hdr->record_hdr.next_record_off)
+ : log->page_size) +
+ lsn) >>
+ 3);
+
+ return ret;
+}
+
+static inline bool verify_client_lsn(struct ntfs_log *log,
+ const struct CLIENT_REC *client, u64 lsn)
+{
+ return lsn >= le64_to_cpu(client->oldest_lsn) &&
+ lsn <= le64_to_cpu(log->ra->current_lsn) && lsn;
+}
+
+struct restart_info {
+ u64 last_lsn;
+ struct RESTART_HDR *r_page;
+ u32 vbo;
+ bool chkdsk_was_run;
+ bool valid_page;
+ bool initialized;
+ bool restart;
+};
+
+static int read_log_page(struct ntfs_log *log, u32 vbo,
+ struct RECORD_PAGE_HDR **buffer, bool *usa_error)
+{
+ int err = 0;
+ u32 page_idx = vbo >> log->page_bits;
+ u32 page_off = vbo & log->page_mask;
+ u32 bytes = log->page_size - page_off;
+ void *to_free = NULL;
+ u32 page_vbo = page_idx << log->page_bits;
+ struct RECORD_PAGE_HDR *page_buf;
+ struct ntfs_inode *ni = log->ni;
+ bool bBAAD;
+
+ if (vbo >= log->l_size)
+ return -EINVAL;
+
+ if (!*buffer) {
+ to_free = ntfs_malloc(bytes);
+ if (!to_free)
+ return -ENOMEM;
+ *buffer = to_free;
+ }
+
+ page_buf = page_off ? log->one_page_buf : *buffer;
+
+ err = ntfs_read_run_nb(ni->mi.sbi, &ni->file.run, page_vbo, page_buf,
+ log->page_size, NULL);
+ if (err)
+ goto out;
+
+ if (page_buf->rhdr.sign != NTFS_FFFF_SIGNATURE)
+ ntfs_fix_post_read(&page_buf->rhdr, PAGE_SIZE, false);
+
+ if (page_buf != *buffer)
+ memcpy(*buffer, Add2Ptr(page_buf, page_off), bytes);
+
+ bBAAD = page_buf->rhdr.sign == NTFS_BAAD_SIGNATURE;
+
+ if (usa_error)
+ *usa_error = bBAAD;
+ /* Check that the update sequence array for this page is valid */
+ /* If we don't allow errors, raise an error status */
+ else if (bBAAD)
+ err = -EINVAL;
+
+out:
+ if (err && to_free) {
+ ntfs_free(to_free);
+ *buffer = NULL;
+ }
+
+ return err;
+}
+
+/*
+ * log_read_rst
+ *
+ * It walks through 512 blocks of the file looking for a valid restart page header.
+ * It stops the first time we find a valid page header.
+ */
+static int log_read_rst(struct ntfs_log *log, u32 l_size, bool first,
+ struct restart_info *info)
+{
+ u32 skip, vbo;
+ struct RESTART_HDR *r_page = ntfs_malloc(DefaultLogPageSize);
+
+ if (!r_page)
+ return -ENOMEM;
+
+ memset(info, 0, sizeof(struct restart_info));
+
+ /* Determine which restart area we are looking for */
+ if (first) {
+ vbo = 0;
+ skip = 512;
+ } else {
+ vbo = 512;
+ skip = 0;
+ }
+
+ /* loop continuously until we succeed */
+ for (; vbo < l_size; vbo = 2 * vbo + skip, skip = 0) {
+ bool usa_error;
+ u32 sys_page_size;
+ bool brst, bchk;
+ struct RESTART_AREA *ra;
+
+ /* Read a page header at the current offset */
+ if (read_log_page(log, vbo, (struct RECORD_PAGE_HDR **)&r_page,
+ &usa_error)) {
+ /* ignore any errors */
+ continue;
+ }
+
+ /* exit if the signature is a log record page */
+ if (r_page->rhdr.sign == NTFS_RCRD_SIGNATURE) {
+ info->initialized = true;
+ break;
+ }
+
+ brst = r_page->rhdr.sign == NTFS_RSTR_SIGNATURE;
+ bchk = r_page->rhdr.sign == NTFS_CHKD_SIGNATURE;
+
+ if (!bchk && !brst) {
+ if (r_page->rhdr.sign != NTFS_FFFF_SIGNATURE) {
+ /*
+ * Remember if the signature does not
+ * indicate uninitialized file
+ */
+ info->initialized = true;
+ }
+ continue;
+ }
+
+ ra = NULL;
+ info->valid_page = false;
+ info->initialized = true;
+ info->vbo = vbo;
+
+ /* Let's check the restart area if this is a valid page */
+ if (!is_rst_page_hdr_valid(vbo, r_page))
+ goto check_result;
+ ra = Add2Ptr(r_page, le16_to_cpu(r_page->ra_off));
+
+ if (!is_rst_area_valid(r_page))
+ goto check_result;
+
+ /*
+ * We have a valid restart page header and restart area.
+ * If chkdsk was run or we have no clients then we have
+ * no more checking to do
+ */
+ if (bchk || ra->client_idx[1] == LFS_NO_CLIENT_LE) {
+ info->valid_page = true;
+ goto check_result;
+ }
+
+ /* Read the entire restart area */
+ sys_page_size = le32_to_cpu(r_page->sys_page_size);
+ if (DefaultLogPageSize != sys_page_size) {
+ ntfs_free(r_page);
+ r_page = ntfs_zalloc(sys_page_size);
+ if (!r_page)
+ return -ENOMEM;
+
+ if (read_log_page(log, vbo,
+ (struct RECORD_PAGE_HDR **)&r_page,
+ &usa_error)) {
+ /* ignore any errors */
+ ntfs_free(r_page);
+ r_page = NULL;
+ continue;
+ }
+ }
+
+ if (is_client_area_valid(r_page, usa_error)) {
+ info->valid_page = true;
+ ra = Add2Ptr(r_page, le16_to_cpu(r_page->ra_off));
+ }
+
+check_result:
+ /* If chkdsk was run then update the caller's values and return */
+ if (r_page->rhdr.sign == NTFS_CHKD_SIGNATURE) {
+ info->chkdsk_was_run = true;
+ info->last_lsn = le64_to_cpu(r_page->rhdr.lsn);
+ info->restart = true;
+ info->r_page = r_page;
+ return 0;
+ }
+
+ /* If we have a valid page then copy the values we need from it */
+ if (info->valid_page) {
+ info->last_lsn = le64_to_cpu(ra->current_lsn);
+ info->restart = true;
+ info->r_page = r_page;
+ return 0;
+ }
+ }
+
+ ntfs_free(r_page);
+
+ return 0;
+}
+
+/*
+ * log_init_pg_hdr
+ *
+ * init "log" from the restart page header
+ */
+static void log_init_pg_hdr(struct ntfs_log *log, u32 sys_page_size,
+ u32 page_size, u16 major_ver, u16 minor_ver)
+{
+ log->sys_page_size = sys_page_size;
+ log->sys_page_mask = sys_page_size - 1;
+ log->page_size = page_size;
+ log->page_mask = page_size - 1;
+ log->page_bits = blksize_bits(page_size);
+
+ log->clst_per_page = log->page_size >> log->ni->mi.sbi->cluster_bits;
+ if (!log->clst_per_page)
+ log->clst_per_page = 1;
+
+ log->first_page = major_ver >= 2
+ ? 0x22 * page_size
+ : ((sys_page_size << 1) + (page_size << 1));
+ log->major_ver = major_ver;
+ log->minor_ver = minor_ver;
+}
+
+/*
+ * log_create
+ *
+ * init "log" in cases when we don't have a restart area to use
+ */
+static void log_create(struct ntfs_log *log, u32 l_size, const u64 last_lsn,
+ u32 open_log_count, bool wrapped, bool use_multi_page)
+{
+ log->l_size = l_size;
+ /* All file offsets must be quadword aligned */
+ log->file_data_bits = blksize_bits(l_size) - 3;
+ log->seq_num_mask = (8 << log->file_data_bits) - 1;
+ log->seq_num_bits = sizeof(u64) * 8 - log->file_data_bits;
+ log->seq_num = (last_lsn >> log->file_data_bits) + 2;
+ log->next_page = log->first_page;
+ log->oldest_lsn = log->seq_num << log->file_data_bits;
+ log->oldest_lsn_off = 0;
+ log->last_lsn = log->oldest_lsn;
+
+ log->l_flags |= NTFSLOG_NO_LAST_LSN | NTFSLOG_NO_OLDEST_LSN;
+
+ /* Set the correct flags for the I/O and indicate if we have wrapped */
+ if (wrapped)
+ log->l_flags |= NTFSLOG_WRAPPED;
+
+ if (use_multi_page)
+ log->l_flags |= NTFSLOG_MULTIPLE_PAGE_IO;
+
+ /* Compute the log page values */
+ log->data_off = QuadAlign(
+ offsetof(struct RECORD_PAGE_HDR, fixups) +
+ sizeof(short) * ((log->page_size >> SECTOR_SHIFT) + 1));
+ log->data_size = log->page_size - log->data_off;
+ log->record_header_len = sizeof(struct LFS_RECORD_HDR);
+
+ /* Remember the different page sizes for reservation */
+ log->reserved = log->data_size - log->record_header_len;
+
+ /* Compute the restart page values. */
+ log->ra_off = QuadAlign(
+ offsetof(struct RESTART_HDR, fixups) +
+ sizeof(short) * ((log->sys_page_size >> SECTOR_SHIFT) + 1));
+ log->restart_size = log->sys_page_size - log->ra_off;
+ log->ra_size = struct_size(log->ra, clients, 1);
+ log->current_openlog_count = open_log_count;
+
+ /*
+ * The total available log file space is the number of
+ * log file pages times the space available on each page
+ */
+ log->total_avail_pages = log->l_size - log->first_page;
+ log->total_avail = log->total_avail_pages >> log->page_bits;
+
+ /*
+ * We assume that we can't use the end of the page less than
+ * the file record size
+ * Then we won't need to reserve more than the caller asks for
+ */
+ log->max_current_avail = log->total_avail * log->reserved;
+ log->total_avail = log->total_avail * log->data_size;
+ log->current_avail = log->max_current_avail;
+}
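+
+/*
+ * Worked example (illustrative): for a 4K log page, data_off above is
+ * QuadAlign(offsetof(struct RECORD_PAGE_HDR, fixups) +
+ * sizeof(short) * ((4096 >> SECTOR_SHIFT) + 1)) == QuadAlign(0x3a) == 0x40,
+ * leaving data_size == 4096 - 0x40 == 0xfc0 bytes per page for record data.
+ */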
+
+/*
+ * log_create_ra
+ *
+ * This routine is called to fill a restart area from the values stored in 'log'
+ */
+static struct RESTART_AREA *log_create_ra(struct ntfs_log *log)
+{
+ struct CLIENT_REC *cr;
+ struct RESTART_AREA *ra = ntfs_zalloc(log->restart_size);
+
+ if (!ra)
+ return NULL;
+
+ ra->current_lsn = cpu_to_le64(log->last_lsn);
+ ra->log_clients = cpu_to_le16(1);
+ ra->client_idx[1] = LFS_NO_CLIENT_LE;
+ if (log->l_flags & NTFSLOG_MULTIPLE_PAGE_IO)
+ ra->flags = RESTART_SINGLE_PAGE_IO;
+ ra->seq_num_bits = cpu_to_le32(log->seq_num_bits);
+ ra->ra_len = cpu_to_le16(log->ra_size);
+ ra->client_off = cpu_to_le16(offsetof(struct RESTART_AREA, clients));
+ ra->l_size = cpu_to_le64(log->l_size);
+ ra->rec_hdr_len = cpu_to_le16(log->record_header_len);
+ ra->data_off = cpu_to_le16(log->data_off);
+ ra->open_log_count = cpu_to_le32(log->current_openlog_count + 1);
+
+ cr = ra->clients;
+
+ cr->prev_client = LFS_NO_CLIENT_LE;
+ cr->next_client = LFS_NO_CLIENT_LE;
+
+ return ra;
+}
+
+static u32 final_log_off(struct ntfs_log *log, u64 lsn, u32 data_len)
+{
+ u32 base_vbo = lsn << 3;
+ u32 final_log_off = (base_vbo & log->seq_num_mask) & ~log->page_mask;
+ u32 page_off = base_vbo & log->page_mask;
+ u32 tail = log->page_size - page_off;
+
+ page_off -= 1;
+
+ /* Add the length of the header */
+ data_len += log->record_header_len;
+
+ /*
+	 * If this lsn is contained in this log page we are done
+ * Otherwise we need to walk through several log pages
+ */
+ if (data_len > tail) {
+ data_len -= tail;
+ tail = log->data_size;
+ page_off = log->data_off - 1;
+
+ for (;;) {
+ final_log_off = next_page_off(log, final_log_off);
+
+ /* We are done if the remaining bytes fit on this page */
+ if (data_len <= tail)
+ break;
+ data_len -= tail;
+ }
+ }
+
+ /*
+ * We add the remaining bytes to our starting position on this page
+ * and then add that value to the file offset of this log page
+ */
+ return final_log_off + data_len + page_off;
+}
+
+static int next_log_lsn(struct ntfs_log *log, const struct LFS_RECORD_HDR *rh,
+ u64 *lsn)
+{
+ int err;
+ u64 this_lsn = le64_to_cpu(rh->this_lsn);
+ u32 vbo = lsn_to_vbo(log, this_lsn);
+ u32 end =
+ final_log_off(log, this_lsn, le32_to_cpu(rh->client_data_len));
+ u32 hdr_off = end & ~log->sys_page_mask;
+ u64 seq = this_lsn >> log->file_data_bits;
+ struct RECORD_PAGE_HDR *page = NULL;
+
+ /* Remember if we wrapped */
+ if (end <= vbo)
+ seq += 1;
+
+ /* log page header for this page */
+ err = read_log_page(log, hdr_off, &page, NULL);
+ if (err)
+ return err;
+
+ /*
+ * If the lsn we were given was not the last lsn on this page,
+ * then the starting offset for the next lsn is on a quad word
+ * boundary following the last file offset for the current lsn
+ * Otherwise the file offset is the start of the data on the next page
+ */
+ if (this_lsn == le64_to_cpu(page->rhdr.lsn)) {
+ /* If we wrapped, we need to increment the sequence number */
+ hdr_off = next_page_off(log, hdr_off);
+ if (hdr_off == log->first_page)
+ seq += 1;
+
+ vbo = hdr_off + log->data_off;
+ } else {
+ vbo = QuadAlign(end);
+ }
+
+ /* Compute the lsn based on the file offset and the sequence count */
+ *lsn = vbo_to_lsn(log, vbo, seq);
+
+ /*
+	 * If this lsn is within the legal range for the file, we keep it
+	 * Otherwise *lsn is reset to 0 to indicate there are no more lsn's
+ */
+ if (!is_lsn_in_file(log, *lsn))
+ *lsn = 0;
+
+ ntfs_free(page);
+
+ return 0;
+}
+
+/*
+ * current_log_avail
+ *
+ * calculate the number of bytes available for log records
+ */
+static u32 current_log_avail(struct ntfs_log *log)
+{
+ u32 oldest_off, next_free_off, free_bytes;
+
+ if (log->l_flags & NTFSLOG_NO_LAST_LSN) {
+ /* The entire file is available */
+ return log->max_current_avail;
+ }
+
+ /*
+	 * If there is a last lsn in the restart area then we know that we will
+ * have to compute the free range
+ * If there is no oldest lsn then start at the first page of the file
+ */
+ oldest_off = (log->l_flags & NTFSLOG_NO_OLDEST_LSN)
+ ? log->first_page
+ : (log->oldest_lsn_off & ~log->sys_page_mask);
+
+ /*
+	 * We will use the next log page offset to compute the next free page.
+ * If we are going to reuse this page go to the next page
+ * If we are at the first page then use the end of the file
+ */
+ next_free_off = (log->l_flags & NTFSLOG_REUSE_TAIL)
+ ? log->next_page + log->page_size
+ : log->next_page == log->first_page
+ ? log->l_size
+ : log->next_page;
+
+ /* If the two offsets are the same then there is no available space */
+ if (oldest_off == next_free_off)
+ return 0;
+ /*
+ * If the free offset follows the oldest offset then subtract
+ * this range from the total available pages
+ */
+ free_bytes =
+ oldest_off < next_free_off
+ ? log->total_avail_pages - (next_free_off - oldest_off)
+ : oldest_off - next_free_off;
+
+ free_bytes >>= log->page_bits;
+ return free_bytes * log->reserved;
+}
+
+static bool check_subseq_log_page(struct ntfs_log *log,
+ const struct RECORD_PAGE_HDR *rp, u32 vbo,
+ u64 seq)
+{
+ u64 lsn_seq;
+ const struct NTFS_RECORD_HEADER *rhdr = &rp->rhdr;
+ u64 lsn = le64_to_cpu(rhdr->lsn);
+
+ if (rhdr->sign == NTFS_FFFF_SIGNATURE || !rhdr->sign)
+ return false;
+
+ /*
+	 * If the last lsn on the page was written after the page
+ * that caused the original error then we have a fatal error
+ */
+ lsn_seq = lsn >> log->file_data_bits;
+
+ /*
+	 * If the sequence number for the lsn on the page is equal to or greater
+	 * than the lsn we expect, then this is a subsequent write
+ */
+ return lsn_seq >= seq ||
+ (lsn_seq == seq - 1 && log->first_page == vbo &&
+ vbo != (lsn_to_vbo(log, lsn) & ~log->page_mask));
+}
+
+/*
+ * last_log_lsn
+ *
+ * This routine walks through the log pages for a file, searching for the
+ * last log page written to the file
+ */
+static int last_log_lsn(struct ntfs_log *log)
+{
+ int err;
+ bool usa_error = false;
+ bool replace_page = false;
+ bool reuse_page = log->l_flags & NTFSLOG_REUSE_TAIL;
+ bool wrapped_file, wrapped;
+
+ u32 page_cnt = 1, page_pos = 1;
+ u32 page_off = 0, page_off1 = 0, saved_off = 0;
+ u32 final_off, second_off, final_off_prev = 0, second_off_prev = 0;
+ u32 first_file_off = 0, second_file_off = 0;
+ u32 part_io_count = 0;
+ u32 tails = 0;
+ u32 this_off, curpage_off, nextpage_off, remain_pages;
+
+ u64 expected_seq, seq_base = 0, lsn_base = 0;
+ u64 best_lsn, best_lsn1, best_lsn2;
+ u64 lsn_cur, lsn1, lsn2;
+ u64 last_ok_lsn = reuse_page ? log->last_lsn : 0;
+
+ u16 cur_pos, best_page_pos;
+
+ struct RECORD_PAGE_HDR *page = NULL;
+ struct RECORD_PAGE_HDR *tst_page = NULL;
+ struct RECORD_PAGE_HDR *first_tail = NULL;
+ struct RECORD_PAGE_HDR *second_tail = NULL;
+ struct RECORD_PAGE_HDR *tail_page = NULL;
+ struct RECORD_PAGE_HDR *second_tail_prev = NULL;
+ struct RECORD_PAGE_HDR *first_tail_prev = NULL;
+ struct RECORD_PAGE_HDR *page_bufs = NULL;
+ struct RECORD_PAGE_HDR *best_page;
+
+ if (log->major_ver >= 2) {
+ final_off = 0x02 * log->page_size;
+ second_off = 0x12 * log->page_size;
+
+ // 0x10 == 0x12 - 0x2
+ page_bufs = ntfs_malloc(log->page_size * 0x10);
+ if (!page_bufs)
+ return -ENOMEM;
+ } else {
+ second_off = log->first_page - log->page_size;
+ final_off = second_off - log->page_size;
+ }
+
+next_tail:
+ /* Read second tail page (at pos 3/0x12000) */
+ if (read_log_page(log, second_off, &second_tail, &usa_error) ||
+ usa_error || second_tail->rhdr.sign != NTFS_RCRD_SIGNATURE) {
+ ntfs_free(second_tail);
+ second_tail = NULL;
+ second_file_off = 0;
+ lsn2 = 0;
+ } else {
+ second_file_off = hdr_file_off(log, second_tail);
+ lsn2 = le64_to_cpu(second_tail->record_hdr.last_end_lsn);
+ }
+
+	/* Read first tail page (at pos 2/0x2000) */
+ if (read_log_page(log, final_off, &first_tail, &usa_error) ||
+ usa_error || first_tail->rhdr.sign != NTFS_RCRD_SIGNATURE) {
+ ntfs_free(first_tail);
+ first_tail = NULL;
+ first_file_off = 0;
+ lsn1 = 0;
+ } else {
+ first_file_off = hdr_file_off(log, first_tail);
+ lsn1 = le64_to_cpu(first_tail->record_hdr.last_end_lsn);
+ }
+
+ if (log->major_ver < 2) {
+ int best_page;
+
+ first_tail_prev = first_tail;
+ final_off_prev = first_file_off;
+ second_tail_prev = second_tail;
+ second_off_prev = second_file_off;
+ tails = 1;
+
+ if (!first_tail && !second_tail)
+ goto tail_read;
+
+ if (first_tail && second_tail)
+ best_page = lsn1 < lsn2 ? 1 : 0;
+ else if (first_tail)
+ best_page = 0;
+ else
+ best_page = 1;
+
+ page_off = best_page ? second_file_off : first_file_off;
+ seq_base = (best_page ? lsn2 : lsn1) >> log->file_data_bits;
+ goto tail_read;
+ }
+
+ best_lsn1 = first_tail ? base_lsn(log, first_tail, first_file_off) : 0;
+ best_lsn2 =
+ second_tail ? base_lsn(log, second_tail, second_file_off) : 0;
+
+ if (first_tail && second_tail) {
+ if (best_lsn1 > best_lsn2) {
+ best_lsn = best_lsn1;
+ best_page = first_tail;
+ this_off = first_file_off;
+ } else {
+ best_lsn = best_lsn2;
+ best_page = second_tail;
+ this_off = second_file_off;
+ }
+ } else if (first_tail) {
+ best_lsn = best_lsn1;
+ best_page = first_tail;
+ this_off = first_file_off;
+ } else if (second_tail) {
+ best_lsn = best_lsn2;
+ best_page = second_tail;
+ this_off = second_file_off;
+ } else {
+ goto tail_read;
+ }
+
+ best_page_pos = le16_to_cpu(best_page->page_pos);
+
+ if (!tails) {
+ if (best_page_pos == page_pos) {
+ seq_base = best_lsn >> log->file_data_bits;
+ saved_off = page_off = le32_to_cpu(best_page->file_off);
+ lsn_base = best_lsn;
+
+ memmove(page_bufs, best_page, log->page_size);
+
+ page_cnt = le16_to_cpu(best_page->page_count);
+ if (page_cnt > 1)
+ page_pos += 1;
+
+ tails = 1;
+ }
+ } else if (seq_base == (best_lsn >> log->file_data_bits) &&
+ saved_off + log->page_size == this_off &&
+ lsn_base < best_lsn &&
+ (page_pos != page_cnt || best_page_pos == page_pos ||
+ best_page_pos == 1) &&
+ (page_pos >= page_cnt || best_page_pos == page_pos)) {
+ u16 bppc = le16_to_cpu(best_page->page_count);
+
+ saved_off += log->page_size;
+ lsn_base = best_lsn;
+
+ memmove(Add2Ptr(page_bufs, tails * log->page_size), best_page,
+ log->page_size);
+
+ tails += 1;
+
+ if (best_page_pos != bppc) {
+ page_cnt = bppc;
+ page_pos = best_page_pos;
+
+ if (page_cnt > 1)
+ page_pos += 1;
+ } else {
+ page_pos = page_cnt = 1;
+ }
+ } else {
+ ntfs_free(first_tail);
+ ntfs_free(second_tail);
+ goto tail_read;
+ }
+
+ ntfs_free(first_tail_prev);
+ first_tail_prev = first_tail;
+ final_off_prev = first_file_off;
+ first_tail = NULL;
+
+ ntfs_free(second_tail_prev);
+ second_tail_prev = second_tail;
+ second_off_prev = second_file_off;
+ second_tail = NULL;
+
+ final_off += log->page_size;
+ second_off += log->page_size;
+
+ if (tails < 0x10)
+ goto next_tail;
+tail_read:
+ first_tail = first_tail_prev;
+ final_off = final_off_prev;
+
+ second_tail = second_tail_prev;
+ second_off = second_off_prev;
+
+ page_cnt = page_pos = 1;
+
+ curpage_off = seq_base == log->seq_num ? min(log->next_page, page_off)
+ : log->next_page;
+
+ wrapped_file =
+ curpage_off == log->first_page &&
+ !(log->l_flags & (NTFSLOG_NO_LAST_LSN | NTFSLOG_REUSE_TAIL));
+
+ expected_seq = wrapped_file ? (log->seq_num + 1) : log->seq_num;
+
+ nextpage_off = curpage_off;
+
+next_page:
+ tail_page = NULL;
+ /* Read the next log page */
+ err = read_log_page(log, curpage_off, &page, &usa_error);
+
+	/* Compute the next log page offset in the file */
+ nextpage_off = next_page_off(log, curpage_off);
+ wrapped = nextpage_off == log->first_page;
+
+ if (tails > 1) {
+ struct RECORD_PAGE_HDR *cur_page =
+ Add2Ptr(page_bufs, curpage_off - page_off);
+
+ if (curpage_off == saved_off) {
+ tail_page = cur_page;
+ goto use_tail_page;
+ }
+
+ if (page_off > curpage_off || curpage_off >= saved_off)
+ goto use_tail_page;
+
+ if (page_off1)
+ goto use_cur_page;
+
+ if (!err && !usa_error &&
+ page->rhdr.sign == NTFS_RCRD_SIGNATURE &&
+ cur_page->rhdr.lsn == page->rhdr.lsn &&
+ cur_page->record_hdr.next_record_off ==
+ page->record_hdr.next_record_off &&
+ ((page_pos == page_cnt &&
+ le16_to_cpu(page->page_pos) == 1) ||
+ (page_pos != page_cnt &&
+ le16_to_cpu(page->page_pos) == page_pos + 1 &&
+ le16_to_cpu(page->page_count) == page_cnt))) {
+ cur_page = NULL;
+ goto use_tail_page;
+ }
+
+ page_off1 = page_off;
+
+use_cur_page:
+
+ lsn_cur = le64_to_cpu(cur_page->rhdr.lsn);
+
+ if (last_ok_lsn !=
+ le64_to_cpu(cur_page->record_hdr.last_end_lsn) &&
+ ((lsn_cur >> log->file_data_bits) +
+ ((curpage_off <
+ (lsn_to_vbo(log, lsn_cur) & ~log->page_mask))
+ ? 1
+ : 0)) != expected_seq) {
+ goto check_tail;
+ }
+
+ if (!is_log_record_end(cur_page)) {
+ tail_page = NULL;
+ last_ok_lsn = lsn_cur;
+ goto next_page_1;
+ }
+
+ log->seq_num = expected_seq;
+ log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+ log->last_lsn = le64_to_cpu(cur_page->record_hdr.last_end_lsn);
+ log->ra->current_lsn = cur_page->record_hdr.last_end_lsn;
+
+ if (log->record_header_len <=
+ log->page_size -
+ le16_to_cpu(cur_page->record_hdr.next_record_off)) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = curpage_off;
+ } else {
+ log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+ log->next_page = nextpage_off;
+ }
+
+ if (wrapped_file)
+ log->l_flags |= NTFSLOG_WRAPPED;
+
+ last_ok_lsn = le64_to_cpu(cur_page->record_hdr.last_end_lsn);
+ goto next_page_1;
+ }
+
+ /*
+ * If we are at the expected first page of a transfer check to see
+ * if either tail copy is at this offset
+ * If this page is the last page of a transfer, check if we wrote
+ * a subsequent tail copy
+ */
+ if (page_cnt == page_pos || page_cnt == page_pos + 1) {
+ /*
+ * Check if the offset matches either the first or second
+ * tail copy. It is possible it will match both
+ */
+ if (curpage_off == final_off)
+ tail_page = first_tail;
+
+ /*
+ * If we already matched on the first page then
+ * check the ending lsn's.
+ */
+ if (curpage_off == second_off) {
+ if (!tail_page ||
+ (second_tail &&
+ le64_to_cpu(second_tail->record_hdr.last_end_lsn) >
+ le64_to_cpu(first_tail->record_hdr
+ .last_end_lsn))) {
+ tail_page = second_tail;
+ }
+ }
+ }
+
+use_tail_page:
+ if (tail_page) {
+ /* we have a candidate for a tail copy */
+ lsn_cur = le64_to_cpu(tail_page->record_hdr.last_end_lsn);
+
+ if (last_ok_lsn < lsn_cur) {
+ /*
+ * If the sequence number is not expected,
+ * then don't use the tail copy
+ */
+ if (expected_seq != (lsn_cur >> log->file_data_bits))
+ tail_page = NULL;
+ } else if (last_ok_lsn > lsn_cur) {
+ /*
+ * If the last lsn is greater than the one on
+ * this page then forget this tail
+ */
+ tail_page = NULL;
+ }
+ }
+
+	/* If we have an error on the current page, we will break out of this loop */
+ if (err || usa_error)
+ goto check_tail;
+
+ /*
+ * Done if the last lsn on this page doesn't match the previous known
+ * last lsn or the sequence number is not expected
+ */
+ lsn_cur = le64_to_cpu(page->rhdr.lsn);
+ if (last_ok_lsn != lsn_cur &&
+ expected_seq != (lsn_cur >> log->file_data_bits)) {
+ goto check_tail;
+ }
+
+ /*
+ * Check that the page position and page count values are correct
+ * If this is the first page of a transfer the position must be 1
+ * and the count will be unknown
+ */
+ if (page_cnt == page_pos) {
+ if (page->page_pos != cpu_to_le16(1) &&
+ (!reuse_page || page->page_pos != page->page_count)) {
+ /*
+ * If the current page is the first page we are
+ * looking at and we are reusing this page then
+ * it can be either the first or last page of a
+ * transfer. Otherwise it can only be the first.
+ */
+ goto check_tail;
+ }
+ } else if (le16_to_cpu(page->page_count) != page_cnt ||
+ le16_to_cpu(page->page_pos) != page_pos + 1) {
+ /*
+ * The page position better be 1 more than the last page
+ * position and the page count better match
+ */
+ goto check_tail;
+ }
+
+ /*
+	 * We have a valid page in the file and may have a valid page
+	 * in the tail copy area
+	 * If the tail page was written after the page in the file then
+	 * break out of the loop
+ */
+ if (tail_page &&
+ le64_to_cpu(tail_page->record_hdr.last_end_lsn) > lsn_cur) {
+ /* Remember if we will replace the page */
+ replace_page = true;
+ goto check_tail;
+ }
+
+ tail_page = NULL;
+
+ if (is_log_record_end(page)) {
+ /*
+ * Since we have read this page we know the sequence number
+ * is the same as our expected value
+ */
+ log->seq_num = expected_seq;
+ log->last_lsn = le64_to_cpu(page->record_hdr.last_end_lsn);
+ log->ra->current_lsn = page->record_hdr.last_end_lsn;
+ log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+
+ /*
+ * If there is room on this page for another header then
+ * remember we want to reuse the page
+ */
+ if (log->record_header_len <=
+ log->page_size -
+ le16_to_cpu(page->record_hdr.next_record_off)) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = curpage_off;
+ } else {
+ log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+ log->next_page = nextpage_off;
+ }
+
+ /* Remember if we wrapped the log file */
+ if (wrapped_file)
+ log->l_flags |= NTFSLOG_WRAPPED;
+ }
+
+ /*
+ * Remember the last page count and position.
+ * Also remember the last known lsn
+ */
+ page_cnt = le16_to_cpu(page->page_count);
+ page_pos = le16_to_cpu(page->page_pos);
+ last_ok_lsn = le64_to_cpu(page->rhdr.lsn);
+
+next_page_1:
+
+ if (wrapped) {
+ expected_seq += 1;
+ wrapped_file = 1;
+ }
+
+ curpage_off = nextpage_off;
+ ntfs_free(page);
+ page = NULL;
+ reuse_page = 0;
+ goto next_page;
+
+check_tail:
+ if (tail_page) {
+ log->seq_num = expected_seq;
+ log->last_lsn = le64_to_cpu(tail_page->record_hdr.last_end_lsn);
+ log->ra->current_lsn = tail_page->record_hdr.last_end_lsn;
+ log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+
+ if (log->page_size -
+ le16_to_cpu(
+ tail_page->record_hdr.next_record_off) >=
+ log->record_header_len) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = curpage_off;
+ } else {
+ log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+ log->next_page = nextpage_off;
+ }
+
+ if (wrapped)
+ log->l_flags |= NTFSLOG_WRAPPED;
+ }
+
+ /* Remember that the partial IO will start at the next page */
+ second_off = nextpage_off;
+
+ /*
+ * If the next page is the first page of the file then update
+	 * the sequence number for log records which begin on the next page
+ */
+ if (wrapped)
+ expected_seq += 1;
+
+ /*
+ * If we have a tail copy or are performing single page I/O we can
+ * immediately look at the next page
+ */
+ if (replace_page || (log->ra->flags & RESTART_SINGLE_PAGE_IO)) {
+ page_cnt = 2;
+ page_pos = 1;
+ goto check_valid;
+ }
+
+ if (page_pos != page_cnt)
+ goto check_valid;
+ /*
+ * If the next page causes us to wrap to the beginning of the log
+ * file then we know which page to check next.
+ */
+ if (wrapped) {
+ page_cnt = 2;
+ page_pos = 1;
+ goto check_valid;
+ }
+
+ cur_pos = 2;
+
+next_test_page:
+ ntfs_free(tst_page);
+ tst_page = NULL;
+
+ /* Walk through the file, reading log pages */
+ err = read_log_page(log, nextpage_off, &tst_page, &usa_error);
+
+ /*
+ * If we get a USA error then assume that we correctly found
+ * the end of the original transfer
+ */
+ if (usa_error)
+ goto file_is_valid;
+
+ /*
+ * If we were able to read the page, we examine it to see if it
+	 * is the same or a different IO block
+ */
+ if (err)
+ goto next_test_page_1;
+
+ if (le16_to_cpu(tst_page->page_pos) == cur_pos &&
+ check_subseq_log_page(log, tst_page, nextpage_off, expected_seq)) {
+ page_cnt = le16_to_cpu(tst_page->page_count) + 1;
+ page_pos = le16_to_cpu(tst_page->page_pos);
+ goto check_valid;
+ } else {
+ goto file_is_valid;
+ }
+
+next_test_page_1:
+
+ nextpage_off = next_page_off(log, curpage_off);
+ wrapped = nextpage_off == log->first_page;
+
+ if (wrapped) {
+ expected_seq += 1;
+ page_cnt = 2;
+ page_pos = 1;
+ }
+
+ cur_pos += 1;
+ part_io_count += 1;
+ if (!wrapped)
+ goto next_test_page;
+
+check_valid:
+ /* Skip over the remaining pages in this transfer */
+ remain_pages = page_cnt - page_pos - 1;
+ part_io_count += remain_pages;
+
+ while (remain_pages--) {
+ nextpage_off = next_page_off(log, curpage_off);
+ wrapped = nextpage_off == log->first_page;
+
+ if (wrapped)
+ expected_seq += 1;
+ }
+
+ /* Call our routine to check this log page */
+ ntfs_free(tst_page);
+ tst_page = NULL;
+
+ err = read_log_page(log, nextpage_off, &tst_page, &usa_error);
+ if (!err && !usa_error &&
+ check_subseq_log_page(log, tst_page, nextpage_off, expected_seq)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+file_is_valid:
+
+ /* We have a valid file */
+ if (page_off1 || tail_page) {
+ struct RECORD_PAGE_HDR *tmp_page;
+
+ if (sb_rdonly(log->ni->mi.sbi->sb)) {
+ err = -EROFS;
+ goto out;
+ }
+
+ if (page_off1) {
+ tmp_page = Add2Ptr(page_bufs, page_off1 - page_off);
+ tails -= (page_off1 - page_off) / log->page_size;
+ if (!tail_page)
+ tails -= 1;
+ } else {
+ tmp_page = tail_page;
+ tails = 1;
+ }
+
+ while (tails--) {
+ u64 off = hdr_file_off(log, tmp_page);
+
+ if (!page) {
+ page = ntfs_malloc(log->page_size);
+ if (!page) {
+ err = -ENOMEM;
+ /* Use the common exit so the other buffers are freed, not leaked. */
+ goto out;
+ }
+ }
+
+ /*
+ * Build a corrected copy of this page and
+ * flush it to disk.
+ */
+ memcpy(page, tmp_page, log->page_size);
+
+ /* Fill in the last flushed lsn value and flush the page */
+ if (log->major_ver < 2)
+ page->rhdr.lsn = page->record_hdr.last_end_lsn;
+ else
+ page->file_off = 0;
+
+ page->page_pos = page->page_count = cpu_to_le16(1);
+
+ ntfs_fix_pre_write(&page->rhdr, log->page_size);
+
+ err = ntfs_sb_write_run(log->ni->mi.sbi,
+ &log->ni->file.run, off, page,
+ log->page_size);
+
+ if (err)
+ goto out;
+
+ if (part_io_count && second_off == off) {
+ second_off += log->page_size;
+ part_io_count -= 1;
+ }
+
+ tmp_page = Add2Ptr(tmp_page, log->page_size);
+ }
+ }
+
+ if (part_io_count) {
+ if (sb_rdonly(log->ni->mi.sbi->sb)) {
+ err = -EROFS;
+ goto out;
+ }
+ }
+
+out:
+ ntfs_free(second_tail);
+ ntfs_free(first_tail);
+ ntfs_free(page);
+ ntfs_free(tst_page);
+ ntfs_free(page_bufs);
+
+ return err;
+}
+
+/*
+ * read_log_rec_buf
+ *
+ * Copies a log record from the file to a buffer.
+ * The log record may span several log pages and may even wrap the file.
+ */
+static int read_log_rec_buf(struct ntfs_log *log,
+ const struct LFS_RECORD_HDR *rh, void *buffer)
+{
+ int err;
+ struct RECORD_PAGE_HDR *ph = NULL;
+ u64 lsn = le64_to_cpu(rh->this_lsn);
+ u32 vbo = lsn_to_vbo(log, lsn) & ~log->page_mask;
+ u32 off = lsn_to_page_off(log, lsn) + log->record_header_len;
+ u32 data_len = le32_to_cpu(rh->client_data_len);
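+
+ /*
+ * vbo addresses the log page containing the lsn; off points just past
+ * the record header, where the client data begins.
+ */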
+
+ /*
+ * While there are more bytes to transfer,
+ * we continue to attempt to perform the read
+ */
+ for (;;) {
+ bool usa_error;
+ u32 tail = log->page_size - off;
+
+ if (tail >= data_len)
+ tail = data_len;
+
+ data_len -= tail;
+
+ err = read_log_page(log, vbo, &ph, &usa_error);
+ if (err)
+ goto out;
+
+ /*
+ * The last lsn on this page had better be greater than or equal
+ * to the lsn we are copying
+ */
+ if (lsn > le64_to_cpu(ph->rhdr.lsn)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ memcpy(buffer, Add2Ptr(ph, off), tail);
+
+ /* If there are no more bytes to transfer, we exit the loop */
+ if (!data_len) {
+ if (!is_log_record_end(ph) ||
+ lsn > le64_to_cpu(ph->record_hdr.last_end_lsn)) {
+ err = -EINVAL;
+ goto out;
+ }
+ break;
+ }
+
+ if (ph->rhdr.lsn == ph->record_hdr.last_end_lsn ||
+ lsn > le64_to_cpu(ph->rhdr.lsn)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ vbo = next_page_off(log, vbo);
+ off = log->data_off;
+
+ /*
+ * Advance our pointer into the user's buffer to where
+ * the next block should be transferred
+ */
+ buffer = Add2Ptr(buffer, tail);
+ }
+
+out:
+ ntfs_free(ph);
+ return err;
+}
+
+static int read_rst_area(struct ntfs_log *log, struct NTFS_RESTART **rst_,
+ u64 *lsn)
+{
+ int err;
+ struct LFS_RECORD_HDR *rh = NULL;
+ const struct CLIENT_REC *cr =
+ Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off));
+ u64 lsnr, lsnc = le64_to_cpu(cr->restart_lsn);
+ u32 len;
+ struct NTFS_RESTART *rst;
+
+ *lsn = 0;
+ *rst_ = NULL;
+
+ /* If the client doesn't have a restart area, go ahead and exit now */
+ if (!lsnc)
+ return 0;
+
+ err = read_log_page(log, lsn_to_vbo(log, lsnc),
+ (struct RECORD_PAGE_HDR **)&rh, NULL);
+ if (err)
+ return err;
+
+ rst = NULL;
+ lsnr = le64_to_cpu(rh->this_lsn);
+
+ if (lsnc != lsnr) {
+ /* If the lsn values don't match, then the disk is corrupt */
+ err = -EINVAL;
+ goto out;
+ }
+
+ *lsn = lsnr;
+ len = le32_to_cpu(rh->client_data_len);
+
+ if (!len) {
+ err = 0;
+ goto out;
+ }
+
+ if (len < sizeof(struct NTFS_RESTART)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ rst = ntfs_malloc(len);
+ if (!rst) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Copy the data into the 'rst' buffer */
+ err = read_log_rec_buf(log, rh, rst);
+ if (err)
+ goto out;
+
+ *rst_ = rst;
+ rst = NULL;
+
+out:
+ ntfs_free(rh);
+ ntfs_free(rst);
+
+ return err;
+}
+
+static int find_log_rec(struct ntfs_log *log, u64 lsn, struct lcb *lcb)
+{
+ int err;
+ struct LFS_RECORD_HDR *rh = lcb->lrh;
+ u32 rec_len, len;
+
+ /* Read the record header for this lsn */
+ if (!rh) {
+ err = read_log_page(log, lsn_to_vbo(log, lsn),
+ (struct RECORD_PAGE_HDR **)&rh, NULL);
+
+ lcb->lrh = rh;
+ if (err)
+ return err;
+ }
+
+ /*
+ * If the lsn in the log record doesn't match the desired
+ * lsn then the disk is corrupt
+ */
+ if (lsn != le64_to_cpu(rh->this_lsn))
+ return -EINVAL;
+
+ len = le32_to_cpu(rh->client_data_len);
+
+ /*
+ * Check that the length field isn't greater than the total
+ * available space in the log file
+ */
+ rec_len = len + log->record_header_len;
+ if (rec_len >= log->total_avail)
+ return -EINVAL;
+
+ /*
+ * If the log record spans several pages, copy it into an
+ * allocated buffer; otherwise point into the page we hold
+ */
+ if (rh->flags & LOG_RECORD_MULTI_PAGE) {
+ void *lr = ntfs_malloc(len);
+
+ if (!lr)
+ return -ENOMEM;
+
+ lcb->log_rec = lr;
+ lcb->alloc = true;
+
+ /* Copy the data into the buffer returned */
+ err = read_log_rec_buf(log, rh, lr);
+ if (err)
+ return err;
+ } else {
+ /* If beyond the end of the current page -> an error */
+ u32 page_off = lsn_to_page_off(log, lsn);
+
+ if (page_off + len + log->record_header_len > log->page_size)
+ return -EINVAL;
+
+ lcb->log_rec = Add2Ptr(rh, sizeof(struct LFS_RECORD_HDR));
+ lcb->alloc = false;
+ }
+
+ return 0;
+}
+
+/*
+ * read_log_rec_lcb
+ *
+ * Initiates the query operation.
+ */
+static int read_log_rec_lcb(struct ntfs_log *log, u64 lsn, u32 ctx_mode,
+ struct lcb **lcb_)
+{
+ int err;
+ const struct CLIENT_REC *cr;
+ struct lcb *lcb;
+
+ switch (ctx_mode) {
+ case lcb_ctx_undo_next:
+ case lcb_ctx_prev:
+ case lcb_ctx_next:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Check that the given lsn is in the legal range for this client */
+ cr = Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off));
+
+ if (!verify_client_lsn(log, cr, lsn))
+ return -EINVAL;
+
+ lcb = ntfs_zalloc(sizeof(struct lcb));
+ if (!lcb)
+ return -ENOMEM;
+ lcb->client = log->client_id;
+ lcb->ctx_mode = ctx_mode;
+
+ /* Find the log record indicated by the given lsn */
+ err = find_log_rec(log, lsn, lcb);
+ if (err)
+ goto out;
+
+ *lcb_ = lcb;
+ return 0;
+
+out:
+ lcb_put(lcb);
+ *lcb_ = NULL;
+ return err;
+}
+
+/*
+ * find_client_next_lsn
+ *
+ * Attempt to find the next lsn to return to a client based on the context mode.
+ */
+static int find_client_next_lsn(struct ntfs_log *log, struct lcb *lcb, u64 *lsn)
+{
+ int err;
+ u64 next_lsn;
+ struct LFS_RECORD_HDR *hdr;
+
+ hdr = lcb->lrh;
+ *lsn = 0;
+
+ if (lcb_ctx_next != lcb->ctx_mode)
+ goto check_undo_next;
+
+ /* Loop as long as another lsn can be found */
+ for (;;) {
+ u64 current_lsn;
+
+ err = next_log_lsn(log, hdr, &current_lsn);
+ if (err)
+ goto out;
+
+ if (!current_lsn)
+ break;
+
+ if (hdr != lcb->lrh)
+ ntfs_free(hdr);
+
+ hdr = NULL;
+ err = read_log_page(log, lsn_to_vbo(log, current_lsn),
+ (struct RECORD_PAGE_HDR **)&hdr, NULL);
+ if (err)
+ goto out;
+
+ if (memcmp(&hdr->client, &lcb->client,
+ sizeof(struct CLIENT_ID))) {
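+ /* Record belongs to another client: skip it rather than fail. */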
+ /*err = -EINVAL; */
+ } else if (LfsClientRecord == hdr->record_type) {
+ ntfs_free(lcb->lrh);
+ lcb->lrh = hdr;
+ *lsn = current_lsn;
+ return 0;
+ }
+ }
+
+out:
+ if (hdr != lcb->lrh)
+ ntfs_free(hdr);
+ return err;
+
+check_undo_next:
+ if (lcb_ctx_undo_next == lcb->ctx_mode)
+ next_lsn = le64_to_cpu(hdr->client_undo_next_lsn);
+ else if (lcb_ctx_prev == lcb->ctx_mode)
+ next_lsn = le64_to_cpu(hdr->client_prev_lsn);
+ else
+ return 0;
+
+ if (!next_lsn)
+ return 0;
+
+ if (!verify_client_lsn(
+ log, Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off)),
+ next_lsn))
+ return 0;
+
+ hdr = NULL;
+ err = read_log_page(log, lsn_to_vbo(log, next_lsn),
+ (struct RECORD_PAGE_HDR **)&hdr, NULL);
+ if (err)
+ return err;
+ ntfs_free(lcb->lrh);
+ lcb->lrh = hdr;
+
+ *lsn = next_lsn;
+
+ return 0;
+}
+
+static int read_next_log_rec(struct ntfs_log *log, struct lcb *lcb, u64 *lsn)
+{
+ int err;
+
+ err = find_client_next_lsn(log, lcb, lsn);
+ if (err)
+ return err;
+
+ if (!*lsn)
+ return 0;
+
+ if (lcb->alloc)
+ ntfs_free(lcb->log_rec);
+
+ lcb->log_rec = NULL;
+ lcb->alloc = false;
+ ntfs_free(lcb->lrh);
+ lcb->lrh = NULL;
+
+ return find_log_rec(log, *lsn, lcb);
+}
+
+static inline bool check_index_header(const struct INDEX_HDR *hdr, size_t bytes)
+{
+ __le16 mask;
+ u32 min_de, de_off, used, total;
+ const struct NTFS_DE *e;
+
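+ /* Entries with subnodes carry a trailing u64 sub-block number, hence the larger minimum size. */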
+ if (hdr_has_subnode(hdr)) {
+ min_de = sizeof(struct NTFS_DE) + sizeof(u64);
+ mask = NTFS_IE_HAS_SUBNODES;
+ } else {
+ min_de = sizeof(struct NTFS_DE);
+ mask = 0;
+ }
+
+ de_off = le32_to_cpu(hdr->de_off);
+ used = le32_to_cpu(hdr->used);
+ total = le32_to_cpu(hdr->total);
+
+ if (de_off > bytes - min_de || used > bytes || total > bytes ||
+ de_off + min_de > used || used > total) {
+ return false;
+ }
+
+ e = Add2Ptr(hdr, de_off);
+ for (;;) {
+ u16 esize = le16_to_cpu(e->size);
+ struct NTFS_DE *next = Add2Ptr(e, esize);
+
+ if (esize < min_de || PtrOffset(hdr, next) > used ||
+ (e->flags & NTFS_IE_HAS_SUBNODES) != mask) {
+ return false;
+ }
+
+ if (de_is_last(e))
+ break;
+
+ e = next;
+ }
+
+ return true;
+}
+
+static inline bool check_index_buffer(const struct INDEX_BUFFER *ib, u32 bytes)
+{
+ u16 fo;
+ const struct NTFS_RECORD_HEADER *r = &ib->rhdr;
+
+ if (r->sign != NTFS_INDX_SIGNATURE)
+ return false;
+
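+ /* Highest offset at which the fixup array could start and still fit in the first sector. */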
+ fo = (SECTOR_SIZE - ((bytes >> SECTOR_SHIFT) + 1) * sizeof(short));
+
+ if (le16_to_cpu(r->fix_off) > fo)
+ return false;
+
+ if ((le16_to_cpu(r->fix_num) - 1) * SECTOR_SIZE != bytes)
+ return false;
+
+ return check_index_header(&ib->ihdr,
+ bytes - offsetof(struct INDEX_BUFFER, ihdr));
+}
+
+static inline bool check_index_root(const struct ATTRIB *attr,
+ struct ntfs_sb_info *sbi)
+{
+ bool ret;
+ const struct INDEX_ROOT *root = resident_data(attr);
+ u8 index_bits = le32_to_cpu(root->index_block_size) >= sbi->cluster_size
+ ? sbi->cluster_bits
+ : SECTOR_SHIFT;
+ u8 block_clst = root->index_block_clst;
+
+ if (le32_to_cpu(attr->res.data_size) < sizeof(struct INDEX_ROOT) ||
+ (root->type != ATTR_NAME && root->type != ATTR_ZERO) ||
+ (root->type == ATTR_NAME &&
+ root->rule != NTFS_COLLATION_TYPE_FILENAME) ||
+ (le32_to_cpu(root->index_block_size) !=
+ (block_clst << index_bits)) ||
+ (block_clst != 1 && block_clst != 2 && block_clst != 4 &&
+ block_clst != 8 && block_clst != 0x10 && block_clst != 0x20 &&
+ block_clst != 0x40 && block_clst != 0x80)) {
+ return false;
+ }
+
+ ret = check_index_header(&root->ihdr,
+ le32_to_cpu(attr->res.data_size) -
+ offsetof(struct INDEX_ROOT, ihdr));
+ return ret;
+}
+
+static inline bool check_attr(const struct MFT_REC *rec,
+ const struct ATTRIB *attr,
+ struct ntfs_sb_info *sbi)
+{
+ u32 asize = le32_to_cpu(attr->size);
+ u32 rsize = 0;
+ u64 dsize, svcn, evcn;
+ u16 run_off;
+
+ /* Check the fixed part of the attribute record header */
+ if (asize >= sbi->record_size ||
+ asize + PtrOffset(rec, attr) >= sbi->record_size ||
+ (attr->name_len &&
+ le16_to_cpu(attr->name_off) + attr->name_len * sizeof(short) >
+ asize)) {
+ return false;
+ }
+
+ /* Check the attribute fields */
+ switch (attr->non_res) {
+ case 0:
+ rsize = le32_to_cpu(attr->res.data_size);
+ if (rsize >= asize ||
+ le16_to_cpu(attr->res.data_off) + rsize > asize) {
+ return false;
+ }
+ break;
+
+ case 1:
+ dsize = le64_to_cpu(attr->nres.data_size);
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ run_off = le16_to_cpu(attr->nres.run_off);
+
+ if (svcn > evcn + 1 || run_off >= asize ||
+ le64_to_cpu(attr->nres.valid_size) > dsize ||
+ dsize > le64_to_cpu(attr->nres.alloc_size)) {
+ return false;
+ }
+
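+ /* Validate the mapping pairs without storing the runs (run == NULL). */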
+ if (run_unpack(NULL, sbi, 0, svcn, evcn, svcn,
+ Add2Ptr(attr, run_off), asize - run_off) < 0) {
+ return false;
+ }
+
+ return true;
+
+ default:
+ return false;
+ }
+
+ switch (attr->type) {
+ case ATTR_NAME:
+ if (fname_full_size(Add2Ptr(
+ attr, le16_to_cpu(attr->res.data_off))) > asize) {
+ return false;
+ }
+ break;
+
+ case ATTR_ROOT:
+ return check_index_root(attr, sbi);
+
+ case ATTR_STD:
+ if (rsize < sizeof(struct ATTR_STD_INFO5) &&
+ rsize != sizeof(struct ATTR_STD_INFO)) {
+ return false;
+ }
+ break;
+
+ case ATTR_LIST:
+ case ATTR_ID:
+ case ATTR_SECURE:
+ case ATTR_LABEL:
+ case ATTR_VOL_INFO:
+ case ATTR_DATA:
+ case ATTR_ALLOC:
+ case ATTR_BITMAP:
+ case ATTR_REPARSE:
+ case ATTR_EA_INFO:
+ case ATTR_EA:
+ case ATTR_PROPERTYSET:
+ case ATTR_LOGGED_UTILITY_STREAM:
+ break;
+
+ default:
+ return false;
+ }
+
+ return true;
+}
+
+static inline bool check_file_record(const struct MFT_REC *rec,
+ const struct MFT_REC *rec2,
+ struct ntfs_sb_info *sbi)
+{
+ const struct ATTRIB *attr;
+ u16 fo = le16_to_cpu(rec->rhdr.fix_off);
+ u16 fn = le16_to_cpu(rec->rhdr.fix_num);
+ u16 ao = le16_to_cpu(rec->attr_off);
+ u32 rs = sbi->record_size;
+
+ /* check the file record header for consistency */
+ if (rec->rhdr.sign != NTFS_FILE_SIGNATURE ||
+ fo > (SECTOR_SIZE - ((rs >> SECTOR_SHIFT) + 1) * sizeof(short)) ||
+ (fn - 1) * SECTOR_SIZE != rs || ao < MFTRECORD_FIXUP_OFFSET_1 ||
+ ao > sbi->record_size - SIZEOF_RESIDENT || !is_rec_inuse(rec) ||
+ le32_to_cpu(rec->total) != rs) {
+ return false;
+ }
+
+ /* Loop to check all of the attributes */
+ for (attr = Add2Ptr(rec, ao); attr->type != ATTR_END;
+ attr = Add2Ptr(attr, le32_to_cpu(attr->size))) {
+ if (check_attr(rec, attr, sbi))
+ continue;
+ return false;
+ }
+
+ return true;
+}
+
+static inline int check_lsn(const struct NTFS_RECORD_HEADER *hdr,
+ const u64 *rlsn)
+{
+ u64 lsn;
+
+ if (!rlsn)
+ return true;
+
+ lsn = le64_to_cpu(hdr->lsn);
+
+ if (hdr->sign == NTFS_HOLE_SIGNATURE)
+ return false;
+
+ if (*rlsn > lsn)
+ return true;
+
+ return false;
+}
+
+static inline bool check_if_attr(const struct MFT_REC *rec,
+ const struct LOG_REC_HDR *lrh)
+{
+ u16 ro = le16_to_cpu(lrh->record_off);
+ u16 o = le16_to_cpu(rec->attr_off);
+ const struct ATTRIB *attr = Add2Ptr(rec, o);
+
+ while (o < ro) {
+ u32 asize;
+
+ if (attr->type == ATTR_END)
+ break;
+
+ asize = le32_to_cpu(attr->size);
+ if (!asize)
+ break;
+
+ o += asize;
+ attr = Add2Ptr(attr, asize);
+ }
+
+ return o == ro;
+}
+
+static inline bool check_if_index_root(const struct MFT_REC *rec,
+ const struct LOG_REC_HDR *lrh)
+{
+ u16 ro = le16_to_cpu(lrh->record_off);
+ u16 o = le16_to_cpu(rec->attr_off);
+ const struct ATTRIB *attr = Add2Ptr(rec, o);
+
+ while (o < ro) {
+ u32 asize;
+
+ if (attr->type == ATTR_END)
+ break;
+
+ asize = le32_to_cpu(attr->size);
+ if (!asize)
+ break;
+
+ o += asize;
+ attr = Add2Ptr(attr, asize);
+ }
+
+ return o == ro && attr->type == ATTR_ROOT;
+}
+
+static inline bool check_if_root_index(const struct ATTRIB *attr,
+ const struct INDEX_HDR *hdr,
+ const struct LOG_REC_HDR *lrh)
+{
+ u16 ao = le16_to_cpu(lrh->attr_off);
+ u32 de_off = le32_to_cpu(hdr->de_off);
+ u32 o = PtrOffset(attr, hdr) + de_off;
+ const struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+ u32 asize = le32_to_cpu(attr->size);
+
+ while (o < ao) {
+ u16 esize;
+
+ if (o >= asize)
+ break;
+
+ esize = le16_to_cpu(e->size);
+ if (!esize)
+ break;
+
+ o += esize;
+ e = Add2Ptr(e, esize);
+ }
+
+ return o == ao;
+}
+
+static inline bool check_if_alloc_index(const struct INDEX_HDR *hdr,
+ u32 attr_off)
+{
+ u32 de_off = le32_to_cpu(hdr->de_off);
+ u32 o = offsetof(struct INDEX_BUFFER, ihdr) + de_off;
+ const struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+ u32 used = le32_to_cpu(hdr->used);
+
+ while (o < attr_off) {
+ u16 esize;
+
+ if (de_off >= used)
+ break;
+
+ esize = le16_to_cpu(e->size);
+ if (!esize)
+ break;
+
+ o += esize;
+ de_off += esize;
+ e = Add2Ptr(e, esize);
+ }
+
+ return o == attr_off;
+}
+
+static inline void change_attr_size(struct MFT_REC *rec, struct ATTRIB *attr,
+ u32 nsize)
+{
+ u32 asize = le32_to_cpu(attr->size);
+ int dsize = nsize - asize;
+ u8 *next = Add2Ptr(attr, asize);
+ u32 used = le32_to_cpu(rec->used);
+
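+ /* Shift the tail of the record so the attribute can grow or shrink in place. */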
+ memmove(Add2Ptr(attr, nsize), next, used - PtrOffset(rec, next));
+
+ rec->used = cpu_to_le32(used + dsize);
+ attr->size = cpu_to_le32(nsize);
+}
+
+struct OpenAttr {
+ struct ATTRIB *attr;
+ struct runs_tree *run1;
+ struct runs_tree run0;
+ struct ntfs_inode *ni;
+ // CLST rno;
+};
+
+/* Returns 0 if 'attr' has the same type and name */
+static inline int cmp_type_and_name(const struct ATTRIB *a1,
+ const struct ATTRIB *a2)
+{
+ return a1->type != a2->type || a1->name_len != a2->name_len ||
+ (a1->name_len && memcmp(attr_name(a1), attr_name(a2),
+ a1->name_len * sizeof(short)));
+}
+
+static struct OpenAttr *find_loaded_attr(struct ntfs_log *log,
+ const struct ATTRIB *attr, CLST rno)
+{
+ struct OPEN_ATTR_ENRTY *oe = NULL;
+
+ while ((oe = enum_rstbl(log->open_attr_tbl, oe))) {
+ struct OpenAttr *op_attr;
+
+ if (ino_get(&oe->ref) != rno)
+ continue;
+
+ op_attr = (struct OpenAttr *)oe->ptr;
+ if (!cmp_type_and_name(op_attr->attr, attr))
+ return op_attr;
+ }
+ return NULL;
+}
+
+static struct ATTRIB *attr_create_nonres_log(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE type, u64 size,
+ const u16 *name, size_t name_len,
+ __le16 flags)
+{
+ struct ATTRIB *attr;
+ u32 name_size = QuadAlign(name_len * sizeof(short));
+ bool is_ext = flags & (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED);
+ u32 asize = name_size +
+ (is_ext ? SIZEOF_NONRESIDENT_EX : SIZEOF_NONRESIDENT);
+
+ attr = ntfs_zalloc(asize);
+ if (!attr)
+ return NULL;
+
+ attr->type = type;
+ attr->size = cpu_to_le32(asize);
+ attr->flags = flags;
+ attr->non_res = 1;
+ attr->name_len = name_len;
+
+ attr->nres.evcn = cpu_to_le64((u64)bytes_to_cluster(sbi, size) - 1);
+ attr->nres.alloc_size = cpu_to_le64(ntfs_up_cluster(sbi, size));
+ attr->nres.data_size = cpu_to_le64(size);
+ attr->nres.valid_size = attr->nres.data_size;
+ if (is_ext) {
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ if (is_attr_compressed(attr))
+ attr->nres.c_unit = COMPRESSION_UNIT;
+
+ attr->nres.run_off =
+ cpu_to_le16(SIZEOF_NONRESIDENT_EX + name_size);
+ memcpy(Add2Ptr(attr, SIZEOF_NONRESIDENT_EX), name,
+ name_len * sizeof(short));
+ } else {
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ attr->nres.run_off =
+ cpu_to_le16(SIZEOF_NONRESIDENT + name_size);
+ memcpy(Add2Ptr(attr, SIZEOF_NONRESIDENT), name,
+ name_len * sizeof(short));
+ }
+
+ return attr;
+}
+
+/*
+ * do_action
+ *
+ * Common routine for the Redo and Undo Passes.
+ * If rlsn is NULL then undo.
+ */
+static int do_action(struct ntfs_log *log, struct OPEN_ATTR_ENRTY *oe,
+ const struct LOG_REC_HDR *lrh, u32 op, void *data,
+ u32 dlen, u32 rec_len, const u64 *rlsn)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = log->ni->mi.sbi;
+ struct inode *inode = NULL, *inode_parent;
+ struct mft_inode *mi = NULL, *mi2_child = NULL;
+ CLST rno = 0, rno_base = 0;
+ struct INDEX_BUFFER *ib = NULL;
+ struct MFT_REC *rec = NULL;
+ struct ATTRIB *attr = NULL, *attr2;
+ struct INDEX_HDR *hdr;
+ struct INDEX_ROOT *root;
+ struct NTFS_DE *e, *e1, *e2;
+ struct NEW_ATTRIBUTE_SIZES *new_sz;
+ struct ATTR_FILE_NAME *fname;
+ struct OpenAttr *oa, *oa2;
+ u32 nsize, t32, asize, used, esize, bmp_off, bmp_bits;
+ u16 id, id2;
+ u32 record_size = sbi->record_size;
+ u64 t64;
+ u16 roff = le16_to_cpu(lrh->record_off);
+ u16 aoff = le16_to_cpu(lrh->attr_off);
+ u64 lco = 0;
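+ /*
+ * cbo: byte offset of the target sector within the cluster;
+ * tvo: byte offset of the target vcn; vbo addresses the target byte.
+ */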
+ u64 cbo = (u64)le16_to_cpu(lrh->cluster_off) << SECTOR_SHIFT;
+ u64 tvo = le64_to_cpu(lrh->target_vcn) << sbi->cluster_bits;
+ u64 vbo = cbo + tvo;
+ void *buffer_le = NULL;
+ u32 bytes = 0;
+ bool a_dirty = false;
+ u16 data_off;
+
+ oa = oe->ptr;
+
+ /* Big switch to prepare */
+ switch (op) {
+ /* ============================================================
+ * Process MFT records, as described by the current log record
+ * ============================================================
+ */
+ case InitializeFileRecordSegment:
+ case DeallocateFileRecordSegment:
+ case WriteEndOfFileRecordSegment:
+ case CreateAttribute:
+ case DeleteAttribute:
+ case UpdateResidentValue:
+ case UpdateMappingPairs:
+ case SetNewAttributeSizes:
+ case AddIndexEntryRoot:
+ case DeleteIndexEntryRoot:
+ case SetIndexEntryVcnRoot:
+ case UpdateFileNameRoot:
+ case UpdateRecordDataRoot:
+ case ZeroEndOfFileRecord:
+ rno = vbo >> sbi->record_bits;
+ inode = ilookup(sbi->sb, rno);
+ if (inode) {
+ mi = &ntfs_i(inode)->mi;
+ } else if (op == InitializeFileRecordSegment) {
+ mi = ntfs_zalloc(sizeof(struct mft_inode));
+ if (!mi)
+ return -ENOMEM;
+ err = mi_format_new(mi, sbi, rno, 0, false);
+ if (err)
+ goto out;
+ } else {
+ /* read from disk */
+ err = mi_get(sbi, rno, &mi);
+ if (err)
+ return err;
+ }
+ rec = mi->mrec;
+
+ if (op == DeallocateFileRecordSegment)
+ goto skip_load_parent;
+
+ if (InitializeFileRecordSegment != op) {
+ if (rec->rhdr.sign == NTFS_BAAD_SIGNATURE)
+ goto dirty_vol;
+ if (!check_lsn(&rec->rhdr, rlsn))
+ goto out;
+ if (!check_file_record(rec, NULL, sbi))
+ goto dirty_vol;
+ attr = Add2Ptr(rec, roff);
+ }
+
+ if (is_rec_base(rec) || InitializeFileRecordSegment == op) {
+ rno_base = rno;
+ goto skip_load_parent;
+ }
+
+ rno_base = ino_get(&rec->parent_ref);
+ inode_parent = ntfs_iget5(sbi->sb, &rec->parent_ref, NULL);
+ if (IS_ERR(inode_parent))
+ goto skip_load_parent;
+
+ if (is_bad_inode(inode_parent)) {
+ iput(inode_parent);
+ goto skip_load_parent;
+ }
+
+ if (ni_load_mi_ex(ntfs_i(inode_parent), rno, &mi2_child)) {
+ iput(inode_parent);
+ } else {
+ if (mi2_child->mrec != mi->mrec)
+ memcpy(mi2_child->mrec, mi->mrec,
+ sbi->record_size);
+
+ if (inode)
+ iput(inode);
+ else if (mi)
+ mi_put(mi);
+
+ inode = inode_parent;
+ mi = mi2_child;
+ rec = mi2_child->mrec;
+ attr = Add2Ptr(rec, roff);
+ }
+
+skip_load_parent:
+ inode_parent = NULL;
+ break;
+
+ /* ============================================================
+ * Process attributes, as described by the current log record
+ * ============================================================
+ */
+ case UpdateNonresidentValue:
+ case AddIndexEntryAllocation:
+ case DeleteIndexEntryAllocation:
+ case WriteEndOfIndexBuffer:
+ case SetIndexEntryVcnAllocation:
+ case UpdateFileNameAllocation:
+ case SetBitsInNonresidentBitMap:
+ case ClearBitsInNonresidentBitMap:
+ case UpdateRecordDataAllocation:
+ attr = oa->attr;
+ bytes = UpdateNonresidentValue == op ? dlen : 0;
+ lco = (u64)le16_to_cpu(lrh->lcns_follow) << sbi->cluster_bits;
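+ /* lco: total bytes covered by the lcns that follow this log record. */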
+
+ if (attr->type == ATTR_ALLOC) {
+ t32 = le32_to_cpu(oe->bytes_per_index);
+ if (bytes < t32)
+ bytes = t32;
+ }
+
+ if (!bytes)
+ bytes = lco - cbo;
+
+ bytes += roff;
+ if (attr->type == ATTR_ALLOC)
+ bytes = (bytes + 511) & ~511; // align
+
+ buffer_le = ntfs_malloc(bytes);
+ if (!buffer_le)
+ return -ENOMEM;
+
+ err = ntfs_read_run_nb(sbi, oa->run1, vbo, buffer_le, bytes,
+ NULL);
+ if (err)
+ goto out;
+
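+ /* Undo the fixups only if the buffer holds data (non-zero signature). */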
+ if (attr->type == ATTR_ALLOC && *(int *)buffer_le)
+ ntfs_fix_post_read(buffer_le, bytes, false);
+ break;
+
+ default:
+ WARN_ON(1);
+ }
+
+ /* Big switch to do operation */
+ switch (op) {
+ case InitializeFileRecordSegment:
+ if (roff + dlen > record_size)
+ goto dirty_vol;
+
+ memcpy(Add2Ptr(rec, roff), data, dlen);
+ mi->dirty = true;
+ break;
+
+ case DeallocateFileRecordSegment:
+ clear_rec_inuse(rec);
+ le16_add_cpu(&rec->seq, 1);
+ mi->dirty = true;
+ break;
+
+ case WriteEndOfFileRecordSegment:
+ attr2 = (struct ATTRIB *)data;
+ if (!check_if_attr(rec, lrh) || roff + dlen > record_size)
+ goto dirty_vol;
+
+ memmove(attr, attr2, dlen);
+ rec->used = cpu_to_le32(QuadAlign(roff + dlen));
+
+ mi->dirty = true;
+ break;
+
+ case CreateAttribute:
+ attr2 = (struct ATTRIB *)data;
+ asize = le32_to_cpu(attr2->size);
+ used = le32_to_cpu(rec->used);
+
+ if (!check_if_attr(rec, lrh) || dlen < SIZEOF_RESIDENT ||
+ !IsQuadAligned(asize) ||
+ Add2Ptr(attr2, asize) > Add2Ptr(lrh, rec_len) ||
+ dlen > record_size - used) {
+ goto dirty_vol;
+ }
+
+ memmove(Add2Ptr(attr, asize), attr, used - roff);
+ memcpy(attr, attr2, asize);
+
+ rec->used = cpu_to_le32(used + asize);
+ id = le16_to_cpu(rec->next_attr_id);
+ id2 = le16_to_cpu(attr2->id);
+ if (id <= id2)
+ rec->next_attr_id = cpu_to_le16(id2 + 1);
+ if (is_attr_indexed(attr))
+ le16_add_cpu(&rec->hard_links, 1);
+
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2) {
+ void *p2 = ntfs_memdup(attr, le32_to_cpu(attr->size));
+
+ if (p2) {
+ // run_close(oa2->run1);
+ ntfs_free(oa2->attr);
+ oa2->attr = p2;
+ }
+ }
+
+ mi->dirty = true;
+ break;
+
+ case DeleteAttribute:
+ asize = le32_to_cpu(attr->size);
+ used = le32_to_cpu(rec->used);
+
+ if (!check_if_attr(rec, lrh))
+ goto dirty_vol;
+
+ rec->used = cpu_to_le32(used - asize);
+ if (is_attr_indexed(attr))
+ le16_add_cpu(&rec->hard_links, -1);
+
+ memmove(attr, Add2Ptr(attr, asize), used - asize - roff);
+
+ mi->dirty = true;
+ break;
+
+ case UpdateResidentValue:
+ nsize = aoff + dlen;
+
+ if (!check_if_attr(rec, lrh))
+ goto dirty_vol;
+
+ asize = le32_to_cpu(attr->size);
+ used = le32_to_cpu(rec->used);
+
+ if (lrh->redo_len == lrh->undo_len) {
+ if (nsize > asize)
+ goto dirty_vol;
+ goto move_data;
+ }
+
+ if (nsize > asize && nsize - asize > record_size - used)
+ goto dirty_vol;
+
+ nsize = QuadAlign(nsize);
+ data_off = le16_to_cpu(attr->res.data_off);
+
+ if (nsize < asize) {
+ memmove(Add2Ptr(attr, aoff), data, dlen);
+ data = NULL; // To skip below memmove
+ }
+
+ memmove(Add2Ptr(attr, nsize), Add2Ptr(attr, asize),
+ used - le16_to_cpu(lrh->record_off) - asize);
+
+ rec->used = cpu_to_le32(used + nsize - asize);
+ attr->size = cpu_to_le32(nsize);
+ attr->res.data_size = cpu_to_le32(aoff + dlen - data_off);
+
+move_data:
+ if (data)
+ memmove(Add2Ptr(attr, aoff), data, dlen);
+
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2) {
+ void *p2 = ntfs_memdup(attr, le32_to_cpu(attr->size));
+
+ if (p2) {
+ // run_close(&oa2->run0);
+ oa2->run1 = &oa2->run0;
+ ntfs_free(oa2->attr);
+ oa2->attr = p2;
+ }
+ }
+
+ mi->dirty = true;
+ break;
+
+ case UpdateMappingPairs:
+ nsize = aoff + dlen;
+ asize = le32_to_cpu(attr->size);
+ used = le32_to_cpu(rec->used);
+
+ if (!check_if_attr(rec, lrh) || !attr->non_res ||
+ aoff < le16_to_cpu(attr->nres.run_off) || aoff > asize ||
+ (nsize > asize && nsize - asize > record_size - used)) {
+ goto dirty_vol;
+ }
+
+ nsize = QuadAlign(nsize);
+
+ memmove(Add2Ptr(attr, nsize), Add2Ptr(attr, asize),
+ used - le16_to_cpu(lrh->record_off) - asize);
+ rec->used = cpu_to_le32(used + nsize - asize);
+ attr->size = cpu_to_le32(nsize);
+ memmove(Add2Ptr(attr, aoff), data, dlen);
+
+ if (run_get_highest_vcn(le64_to_cpu(attr->nres.svcn),
+ attr_run(attr), &t64)) {
+ goto dirty_vol;
+ }
+
+ attr->nres.evcn = cpu_to_le64(t64);
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2 && oa2->attr->non_res)
+ oa2->attr->nres.evcn = attr->nres.evcn;
+
+ mi->dirty = true;
+ break;
+
+ case SetNewAttributeSizes:
+ new_sz = data;
+ if (!check_if_attr(rec, lrh) || !attr->non_res)
+ goto dirty_vol;
+
+ attr->nres.alloc_size = new_sz->alloc_size;
+ attr->nres.data_size = new_sz->data_size;
+ attr->nres.valid_size = new_sz->valid_size;
+
+ if (dlen >= sizeof(struct NEW_ATTRIBUTE_SIZES))
+ attr->nres.total_size = new_sz->total_size;
+
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2) {
+ void *p2 = ntfs_memdup(attr, le32_to_cpu(attr->size));
+
+ if (p2) {
+ ntfs_free(oa2->attr);
+ oa2->attr = p2;
+ }
+ }
+ mi->dirty = true;
+ break;
+
+ case AddIndexEntryRoot:
+ e = (struct NTFS_DE *)data;
+ esize = le16_to_cpu(e->size);
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+ used = le32_to_cpu(hdr->used);
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh) ||
+ Add2Ptr(data, esize) > Add2Ptr(lrh, rec_len) ||
+ esize > le32_to_cpu(rec->total) - le32_to_cpu(rec->used)) {
+ goto dirty_vol;
+ }
+
+ e1 = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+ change_attr_size(rec, attr, le32_to_cpu(attr->size) + esize);
+
+ memmove(Add2Ptr(e1, esize), e1,
+ PtrOffset(e1, Add2Ptr(hdr, used)));
+ memmove(e1, e, esize);
+
+ le32_add_cpu(&attr->res.data_size, esize);
+ hdr->used = cpu_to_le32(used + esize);
+ le32_add_cpu(&hdr->total, esize);
+
+ mi->dirty = true;
+ break;
+
+ case DeleteIndexEntryRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+ used = le32_to_cpu(hdr->used);
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e1 = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+ esize = le16_to_cpu(e1->size);
+ e2 = Add2Ptr(e1, esize);
+
+ memmove(e1, e2, PtrOffset(e2, Add2Ptr(hdr, used)));
+
+ le32_sub_cpu(&attr->res.data_size, esize);
+ hdr->used = cpu_to_le32(used - esize);
+ le32_sub_cpu(&hdr->total, esize);
+
+ change_attr_size(rec, attr, le32_to_cpu(attr->size) - esize);
+
+ mi->dirty = true;
+ break;
+
+ case SetIndexEntryVcnRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+ de_set_vbn_le(e, *(__le64 *)data);
+ mi->dirty = true;
+ break;
+
+ case UpdateFileNameRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+ fname = (struct ATTR_FILE_NAME *)(e + 1);
+ memmove(&fname->dup, data, sizeof(fname->dup));
+ mi->dirty = true;
+ break;
+
+ case UpdateRecordDataRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+ memmove(Add2Ptr(e, le16_to_cpu(e->view.data_off)), data, dlen);
+
+ mi->dirty = true;
+ break;
+
+ case ZeroEndOfFileRecord:
+ if (roff + dlen > record_size)
+ goto dirty_vol;
+
+ memset(attr, 0, dlen);
+ mi->dirty = true;
+ break;
+
+ case UpdateNonresidentValue:
+ if (lco < cbo + roff + dlen)
+ goto dirty_vol;
+
+ memcpy(Add2Ptr(buffer_le, roff), data, dlen);
+
+ a_dirty = true;
+ if (attr->type == ATTR_ALLOC)
+ ntfs_fix_pre_write(buffer_le, bytes);
+ break;
+
+ case AddIndexEntryAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = data;
+ esize = le16_to_cpu(e->size);
+ e1 = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+
+ used = le32_to_cpu(hdr->used);
+
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff) ||
+ Add2Ptr(e, esize) > Add2Ptr(lrh, rec_len) ||
+ used + esize > le32_to_cpu(hdr->total)) {
+ goto dirty_vol;
+ }
+
+ memmove(Add2Ptr(e1, esize), e1,
+ PtrOffset(e1, Add2Ptr(hdr, used)));
+ memcpy(e1, e, esize);
+
+ hdr->used = cpu_to_le32(used + esize);
+
+ a_dirty = true;
+
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case DeleteIndexEntryAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+ esize = le16_to_cpu(e->size);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ e1 = Add2Ptr(e, esize);
+ nsize = esize;
+ used = le32_to_cpu(hdr->used);
+
+ memmove(e, e1, PtrOffset(e1, Add2Ptr(hdr, used)));
+
+ hdr->used = cpu_to_le32(used - nsize);
+
+ a_dirty = true;
+
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case WriteEndOfIndexBuffer:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff) ||
+ aoff + dlen > offsetof(struct INDEX_BUFFER, ihdr) +
+ le32_to_cpu(hdr->total)) {
+ goto dirty_vol;
+ }
+
+ hdr->used = cpu_to_le32(dlen + PtrOffset(hdr, e));
+ memmove(e, data, dlen);
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case SetIndexEntryVcnAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ de_set_vbn_le(e, *(__le64 *)data);
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case UpdateFileNameAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ fname = (struct ATTR_FILE_NAME *)(e + 1);
+ memmove(&fname->dup, data, sizeof(fname->dup));
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case SetBitsInNonresidentBitMap:
+ bmp_off =
+ le32_to_cpu(((struct BITMAP_RANGE *)data)->bitmap_off);
+ bmp_bits = le32_to_cpu(((struct BITMAP_RANGE *)data)->bits);
+
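+ /* Make sure the whole bit range lies within the clusters read for this record. */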
+ if (cbo + (bmp_off + 7) / 8 > lco ||
+ cbo + ((bmp_off + bmp_bits + 7) / 8) > lco) {
+ goto dirty_vol;
+ }
+
+ __bitmap_set(Add2Ptr(buffer_le, roff), bmp_off, bmp_bits);
+ a_dirty = true;
+ break;
+
+ case ClearBitsInNonresidentBitMap:
+ bmp_off =
+ le32_to_cpu(((struct BITMAP_RANGE *)data)->bitmap_off);
+ bmp_bits = le32_to_cpu(((struct BITMAP_RANGE *)data)->bits);
+
+ if (cbo + (bmp_off + 7) / 8 > lco ||
+ cbo + ((bmp_off + bmp_bits + 7) / 8) > lco) {
+ goto dirty_vol;
+ }
+
+ __bitmap_clear(Add2Ptr(buffer_le, roff), bmp_off, bmp_bits);
+ a_dirty = true;
+ break;
+
+ case UpdateRecordDataAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ memmove(Add2Ptr(e, le16_to_cpu(e->view.data_off)), data, dlen);
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ default:
+ WARN_ON(1);
+ }
+
+ if (rlsn) {
+ __le64 t64 = cpu_to_le64(*rlsn);
+
+ if (rec)
+ rec->rhdr.lsn = t64;
+ if (ib)
+ ib->rhdr.lsn = t64;
+ }
+
+ if (mi && mi->dirty) {
+ err = mi_write(mi, 0);
+ if (err)
+ goto out;
+ }
+
+ if (a_dirty) {
+ attr = oa->attr;
+ err = ntfs_sb_write_run(sbi, oa->run1, vbo, buffer_le, bytes);
+ if (err)
+ goto out;
+ }
+
+out:
+
+ if (inode)
+ iput(inode);
+ else if (mi != mi2_child)
+ mi_put(mi);
+
+ ntfs_free(buffer_le);
+
+ return err;
+
+dirty_vol:
+ log->set_dirty = true;
+ goto out;
+}
+
+/*
+ * log_replay
+ *
+ * This function is called during the mount operation.
+ * It replays the log and empties it.
+ * 'initialized' is set to false if the logfile contains '-1'.
+ */
+int log_replay(struct ntfs_inode *ni, bool *initialized)
+{
+ int err;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ntfs_log *log;
+
+ struct restart_info rst_info, rst_info2;
+ u64 rec_lsn, ra_lsn, checkpt_lsn = 0, rlsn = 0;
+ struct ATTR_NAME_ENTRY *attr_names = NULL;
+ struct ATTR_NAME_ENTRY *ane;
+ struct RESTART_TABLE *dptbl = NULL;
+ struct RESTART_TABLE *trtbl = NULL;
+ const struct RESTART_TABLE *rt;
+ struct RESTART_TABLE *oatbl = NULL;
+ struct inode *inode;
+ struct OpenAttr *oa;
+ struct ntfs_inode *ni_oe;
+ struct ATTRIB *attr = NULL;
+ u64 size, vcn, undo_next_lsn;
+ CLST rno, lcn, lcn0, len0, clen;
+ void *data;
+ struct NTFS_RESTART *rst = NULL;
+ struct lcb *lcb = NULL;
+ struct OPEN_ATTR_ENRTY *oe;
+ struct TRANSACTION_ENTRY *tr;
+ struct DIR_PAGE_ENTRY *dp;
+ u32 i, bytes_per_attr_entry;
+ u32 l_size = ni->vfs_inode.i_size;
+ u32 orig_file_size = l_size;
+ u32 page_size, vbo, tail, off, dlen;
+ u32 saved_len, rec_len, transact_id;
+ bool use_second_page;
+ struct RESTART_AREA *ra2, *ra = NULL;
+ struct CLIENT_REC *ca, *cr;
+ __le16 client;
+ struct RESTART_HDR *rh;
+ const struct LFS_RECORD_HDR *frh;
+ const struct LOG_REC_HDR *lrh;
+ bool is_mapped;
+ bool is_ro = sb_rdonly(sbi->sb);
+ u64 t64;
+ u16 t16;
+ u32 t32;
+
+ /* Get the page size. NOTE: To replay we can use the default page */
+#if PAGE_SIZE >= DefaultLogPageSize && PAGE_SIZE <= DefaultLogPageSize * 2
+ page_size = norm_file_page(PAGE_SIZE, &l_size, true);
+#else
+ page_size = norm_file_page(PAGE_SIZE, &l_size, false);
+#endif
+ if (!page_size)
+ return -EINVAL;
+
+ log = ntfs_zalloc(sizeof(struct ntfs_log));
+ if (!log)
+ return -ENOMEM;
+
+ log->ni = ni;
+ log->l_size = l_size;
+ log->one_page_buf = ntfs_malloc(page_size);
+
+ if (!log->one_page_buf) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ log->page_size = page_size;
+ log->page_mask = page_size - 1;
+ log->page_bits = blksize_bits(page_size);
+
+ /* Look for a restart area on the disk */
+ err = log_read_rst(log, l_size, true, &rst_info);
+ if (err)
+ goto out;
+
+ /* remember 'initialized' */
+ *initialized = rst_info.initialized;
+
+ if (!rst_info.restart) {
+ if (rst_info.initialized) {
+ /* No restart area but the file is initialized */
+ err = -EINVAL;
+ goto out;
+ }
+
+ log_init_pg_hdr(log, page_size, page_size, 1, 1);
+ log_create(log, l_size, 0, get_random_int(), false, false);
+
+ ra = log_create_ra(log);
+ if (!ra) {
+ err = -ENOMEM;
+ goto out;
+ }
+ log->ra = ra;
+ log->init_ra = true;
+
+ goto process_log;
+ }
+
+ /*
+ * If the restart offset above wasn't zero then we won't
+ * look for a second restart
+ */
+ if (rst_info.vbo)
+ goto check_restart_area;
+
+ err = log_read_rst(log, l_size, false, &rst_info2);
+
+ /* Determine which restart area to use */
+ if (!rst_info2.restart || rst_info2.last_lsn <= rst_info.last_lsn)
+ goto use_first_page;
+
+ use_second_page = true;
+
+ if (rst_info.chkdsk_was_run && page_size != rst_info.vbo) {
+ struct RECORD_PAGE_HDR *sp = NULL;
+ bool usa_error;
+
+ if (!read_log_page(log, page_size, &sp, &usa_error) &&
+ sp->rhdr.sign == NTFS_CHKD_SIGNATURE) {
+ use_second_page = false;
+ }
+ ntfs_free(sp);
+ }
+
+ if (use_second_page) {
+ ntfs_free(rst_info.r_page);
+ memcpy(&rst_info, &rst_info2, sizeof(struct restart_info));
+ rst_info2.r_page = NULL;
+ }
+
+use_first_page:
+ ntfs_free(rst_info2.r_page);
+
+check_restart_area:
+ /* If the restart area is at offset 0, we want to write the second restart area first */
+ log->init_ra = !!rst_info.vbo;
+
+ /* If we have a valid page then grab a pointer to the restart area */
+ ra2 = rst_info.valid_page
+ ? Add2Ptr(rst_info.r_page,
+ le16_to_cpu(rst_info.r_page->ra_off))
+ : NULL;
+
+ if (rst_info.chkdsk_was_run ||
+ (ra2 && ra2->client_idx[1] == LFS_NO_CLIENT_LE)) {
+ bool wrapped = false;
+ bool use_multi_page = false;
+ u32 open_log_count;
+
+ /* Do some checks based on whether we have a valid log page */
+ if (!rst_info.valid_page) {
+ open_log_count = get_random_int();
+ goto init_log_instance;
+ }
+ open_log_count = le32_to_cpu(ra2->open_log_count);
+
+ /*
+ * If the restart page size isn't changing then we want to
+ * check how much work we need to do
+ */
+ if (page_size != le32_to_cpu(rst_info.r_page->sys_page_size))
+ goto init_log_instance;
+
+init_log_instance:
+ log_init_pg_hdr(log, page_size, page_size, 1, 1);
+
+ log_create(log, l_size, rst_info.last_lsn, open_log_count,
+ wrapped, use_multi_page);
+
+ ra = log_create_ra(log);
+ if (!ra) {
+ err = -ENOMEM;
+ goto out;
+ }
+ log->ra = ra;
+
+ /* Put the restart areas and initialize the log file as required */
+ goto process_log;
+ }
+
+ if (!ra2) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * If the log page or the system page sizes have changed, we can't
+ * use the log file. We must use the system page size instead of the
+ * default size if there was not a clean shutdown.
+ */
+ t32 = le32_to_cpu(rst_info.r_page->sys_page_size);
+ if (page_size != t32) {
+ l_size = orig_file_size;
+ page_size =
+ norm_file_page(t32, &l_size, t32 == DefaultLogPageSize);
+ }
+
+ if (page_size != t32 ||
+ page_size != le32_to_cpu(rst_info.r_page->page_size)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* If the file size has shrunk then we won't mount it */
+ if (l_size < le64_to_cpu(ra2->l_size)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ log_init_pg_hdr(log, page_size, page_size,
+ le16_to_cpu(rst_info.r_page->major_ver),
+ le16_to_cpu(rst_info.r_page->minor_ver));
+
+ log->l_size = le64_to_cpu(ra2->l_size);
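+ /* An lsn is split into a sequence number (high bits) and a file data offset (low bits). */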
+ log->seq_num_bits = le32_to_cpu(ra2->seq_num_bits);
+ log->file_data_bits = sizeof(u64) * 8 - log->seq_num_bits;
+ log->seq_num_mask = (8 << log->file_data_bits) - 1;
+ log->last_lsn = le64_to_cpu(ra2->current_lsn);
+ log->seq_num = log->last_lsn >> log->file_data_bits;
+ log->ra_off = le16_to_cpu(rst_info.r_page->ra_off);
+ log->restart_size = log->sys_page_size - log->ra_off;
+ log->record_header_len = le16_to_cpu(ra2->rec_hdr_len);
+ log->ra_size = le16_to_cpu(ra2->ra_len);
+ log->data_off = le16_to_cpu(ra2->data_off);
+ log->data_size = log->page_size - log->data_off;
+ log->reserved = log->data_size - log->record_header_len;
+
+ vbo = lsn_to_vbo(log, log->last_lsn);
+
+ if (vbo < log->first_page) {
+ /* This is a pseudo lsn */
+ log->l_flags |= NTFSLOG_NO_LAST_LSN;
+ log->next_page = log->first_page;
+ goto find_oldest;
+ }
+
+ /* Find the end of this log record */
+ off = final_log_off(log, log->last_lsn,
+ le32_to_cpu(ra2->last_lsn_data_len));
+
+ /* If we wrapped the file then increment the sequence number */
+ if (off <= vbo) {
+ log->seq_num += 1;
+ log->l_flags |= NTFSLOG_WRAPPED;
+ }
+
+ /* Now compute the next log page to use */
+ vbo &= ~log->sys_page_mask;
+ tail = log->page_size - (off & log->page_mask) - 1;
+
+ /* If we can fit another log record on the page, move back a page in the log file */
+ if (tail >= log->record_header_len) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = vbo;
+ } else {
+ log->next_page = next_page_off(log, vbo);
+ }
+
+find_oldest:
+ /* Find the oldest client lsn. Use the last flushed lsn as a starting point */
+ log->oldest_lsn = log->last_lsn;
+ oldest_client_lsn(Add2Ptr(ra2, le16_to_cpu(ra2->client_off)),
+ ra2->client_idx[1], &log->oldest_lsn);
+ log->oldest_lsn_off = lsn_to_vbo(log, log->oldest_lsn);
+
+ if (log->oldest_lsn_off < log->first_page)
+ log->l_flags |= NTFSLOG_NO_OLDEST_LSN;
+
+ if (!(ra2->flags & RESTART_SINGLE_PAGE_IO))
+ log->l_flags |= NTFSLOG_WRAPPED | NTFSLOG_MULTIPLE_PAGE_IO;
+
+ log->current_openlog_count = le32_to_cpu(ra2->open_log_count);
+ log->total_avail_pages = log->l_size - log->first_page;
+ log->total_avail = log->total_avail_pages >> log->page_bits;
+ log->max_current_avail = log->total_avail * log->reserved;
+ log->total_avail = log->total_avail * log->data_size;
+
+ log->current_avail = current_log_avail(log);
+
+ ra = ntfs_zalloc(log->restart_size);
+ if (!ra) {
+ err = -ENOMEM;
+ goto out;
+ }
+ log->ra = ra;
+
+ t16 = le16_to_cpu(ra2->client_off);
+ if (t16 == offsetof(struct RESTART_AREA, clients)) {
+ memcpy(ra, ra2, log->ra_size);
+ } else {
+ memcpy(ra, ra2, offsetof(struct RESTART_AREA, clients));
+ memcpy(ra->clients, Add2Ptr(ra2, t16),
+ le16_to_cpu(ra2->ra_len) - t16);
+
+ log->current_openlog_count = get_random_int();
+ ra->open_log_count = cpu_to_le32(log->current_openlog_count);
+ log->ra_size = offsetof(struct RESTART_AREA, clients) +
+ sizeof(struct CLIENT_REC);
+ ra->client_off =
+ cpu_to_le16(offsetof(struct RESTART_AREA, clients));
+ ra->ra_len = cpu_to_le16(log->ra_size);
+ }
+
+ le32_add_cpu(&ra->open_log_count, 1);
+
+ /* Now we need to walk through looking for the last lsn */
+ err = last_log_lsn(log);
+ if (err)
+ goto out;
+
+ log->current_avail = current_log_avail(log);
+
+ /* Remember which restart area to write first */
+ log->init_ra = rst_info.vbo;
+
+process_log:
+ /* 1.0, 1.1, 2.0 log->major_ver/minor_ver - short values */
+ switch ((log->major_ver << 16) + log->minor_ver) {
+ case 0x10000:
+ case 0x10001:
+ case 0x20000:
+ break;
+ default:
+ ntfs_warn(sbi->sb, "\x24LogFile version %d.%d is not supported",
+ log->major_ver, log->minor_ver);
+ err = -EOPNOTSUPP;
+ log->set_dirty = true;
+ goto out;
+ }
+
+ /* One client "NTFS" per logfile */
+ ca = Add2Ptr(ra, le16_to_cpu(ra->client_off));
+
+ for (client = ra->client_idx[1];; client = cr->next_client) {
+ if (client == LFS_NO_CLIENT_LE) {
+ /* Insert "NTFS" client LogFile */
+ client = ra->client_idx[0];
+ if (client == LFS_NO_CLIENT_LE)
+ return -EINVAL;
+
+ t16 = le16_to_cpu(client);
+ cr = ca + t16;
+
+ remove_client(ca, cr, &ra->client_idx[0]);
+
+ cr->restart_lsn = 0;
+ cr->oldest_lsn = cpu_to_le64(log->oldest_lsn);
+ cr->name_bytes = cpu_to_le32(8);
+ cr->name[0] = cpu_to_le16('N');
+ cr->name[1] = cpu_to_le16('T');
+ cr->name[2] = cpu_to_le16('F');
+ cr->name[3] = cpu_to_le16('S');
+
+ add_client(ca, t16, &ra->client_idx[1]);
+ break;
+ }
+
+ cr = ca + le16_to_cpu(client);
+
+ if (cpu_to_le32(8) == cr->name_bytes &&
+ cpu_to_le16('N') == cr->name[0] &&
+ cpu_to_le16('T') == cr->name[1] &&
+ cpu_to_le16('F') == cr->name[2] &&
+ cpu_to_le16('S') == cr->name[3])
+ break;
+ }
+
+ /* Update the client handle with the client block information */
+ log->client_id.seq_num = cr->seq_num;
+ log->client_id.client_idx = client;
+
+ err = read_rst_area(log, &rst, &ra_lsn);
+ if (err)
+ goto out;
+
+ if (!rst)
+ goto out;
+
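+ /* Version 0 open attribute entries are 0x2C bytes, version 1 entries are 0x28 bytes. */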
+ bytes_per_attr_entry = !rst->major_ver ? 0x2C : 0x28;
+
+ checkpt_lsn = le64_to_cpu(rst->check_point_start);
+ if (!checkpt_lsn)
+ checkpt_lsn = ra_lsn;
+
+ /* Allocate and Read the Transaction Table */
+ if (!rst->transact_table_len)
+ goto check_dirty_page_table;
+
+ t64 = le64_to_cpu(rst->transact_table_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t16 = le16_to_cpu(lrh->redo_off);
+
+ rt = Add2Ptr(lrh, t16);
+ t32 = rec_len - t16;
+
+ /* Now check that this is a valid restart table */
+ if (!check_rstbl(rt, t32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ trtbl = ntfs_memdup(rt, t32);
+ if (!trtbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+check_dirty_page_table:
+ /* The next record back should be the Dirty Pages Table */
+ if (!rst->dirty_pages_len)
+ goto check_attribute_names;
+
+ t64 = le64_to_cpu(rst->dirty_pages_table_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t16 = le16_to_cpu(lrh->redo_off);
+
+ rt = Add2Ptr(lrh, t16);
+ t32 = rec_len - t16;
+
+ /* Now check that this is a valid restart table */
+ if (!check_rstbl(rt, t32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ dptbl = ntfs_memdup(rt, t32);
+ if (!dptbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Convert Ra version '0' into version '1' */
+ if (rst->major_ver)
+ goto end_conv_1;
+
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ struct DIR_PAGE_ENTRY_32 *dp0 = (struct DIR_PAGE_ENTRY_32 *)dp;
+ // NOTE: Danger. Check for out-of-boundary access.
+ memmove(&dp->vcn, &dp0->vcn_low,
+ 2 * sizeof(u64) +
+ le32_to_cpu(dp->lcns_follow) * sizeof(u64));
+ }
+
+end_conv_1:
+ lcb_put(lcb);
+ lcb = NULL;
+
+ /* Go through the table and remove the duplicates, remembering the oldest lsn values */
+ if (sbi->cluster_size <= log->page_size)
+ goto trace_dp_table;
+
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ struct DIR_PAGE_ENTRY *next = dp;
+
+ while ((next = enum_rstbl(dptbl, next))) {
+ if (next->target_attr == dp->target_attr &&
+ next->vcn == dp->vcn) {
+ if (le64_to_cpu(next->oldest_lsn) <
+ le64_to_cpu(dp->oldest_lsn)) {
+ dp->oldest_lsn = next->oldest_lsn;
+ }
+
+ free_rsttbl_idx(dptbl, PtrOffset(dptbl, next));
+ }
+ }
+ }
+trace_dp_table:
+check_attribute_names:
+ /* The next record should be the Attribute Names */
+ if (!rst->attr_names_len)
+ goto check_attr_table;
+
+ t64 = le64_to_cpu(rst->attr_names_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t32 = lrh_length(lrh);
+ rec_len -= t32;
+
+ attr_names = ntfs_memdup(Add2Ptr(lrh, t32), rec_len);
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+check_attr_table:
+ /* The next record should be the Attribute Table */
+ if (!rst->open_attr_len)
+ goto check_attribute_names2;
+
+ t64 = le64_to_cpu(rst->open_attr_table_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t16 = le16_to_cpu(lrh->redo_off);
+
+ rt = Add2Ptr(lrh, t16);
+ t32 = rec_len - t16;
+
+ if (!check_rstbl(rt, t32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oatbl = ntfs_memdup(rt, t32);
+ if (!oatbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ log->open_attr_tbl = oatbl;
+
+ /* Clear all of the Attr pointers */
+ oe = NULL;
+ while ((oe = enum_rstbl(oatbl, oe))) {
+ if (!rst->major_ver) {
+ struct OPEN_ATTR_ENRTY_32 oe0;
+
+ /* Really 'oe' points to OPEN_ATTR_ENRTY_32 */
+ memcpy(&oe0, oe, SIZEOF_OPENATTRIBUTEENTRY0);
+
+ oe->bytes_per_index = oe0.bytes_per_index;
+ oe->type = oe0.type;
+ oe->is_dirty_pages = oe0.is_dirty_pages;
+ oe->name_len = 0;
+ oe->ref = oe0.ref;
+ oe->open_record_lsn = oe0.open_record_lsn;
+ }
+
+ oe->is_attr_name = 0;
+ oe->ptr = NULL;
+ }
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+check_attribute_names2:
+ if (!rst->attr_names_len)
+ goto trace_attribute_table;
+
+ ane = attr_names;
+ if (!oatbl)
+ goto trace_attribute_table;
+ while (ane->off) {
+ /* TODO: Clear table on exit! */
+ oe = Add2Ptr(oatbl, le16_to_cpu(ane->off));
+ t16 = le16_to_cpu(ane->name_bytes);
+ oe->name_len = t16 / sizeof(short);
+ oe->ptr = ane->name;
+ oe->is_attr_name = 2;
+ ane = Add2Ptr(ane, sizeof(struct ATTR_NAME_ENTRY) + t16);
+ }
+
+trace_attribute_table:
+ /*
+ * If the checkpt_lsn is zero, then this is a freshly
+ * formatted disk and we have no work to do
+ */
+ if (!checkpt_lsn) {
+ err = 0;
+ goto out;
+ }
+
+ if (!oatbl) {
+ oatbl = init_rsttbl(bytes_per_attr_entry, 8);
+ if (!oatbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ log->open_attr_tbl = oatbl;
+
+ /* Start the analysis pass from the Checkpoint lsn. */
+ rec_lsn = checkpt_lsn;
+
+ /* Read the first lsn */
+ err = read_log_rec_lcb(log, checkpt_lsn, lcb_ctx_next, &lcb);
+ if (err)
+ goto out;
+
+ /* Loop to read all subsequent records to the end of the log file */
+next_log_record_analyze:
+ err = read_next_log_rec(log, lcb, &rec_lsn);
+ if (err)
+ goto out;
+
+ if (!rec_lsn)
+ goto end_log_records_enumerate;
+
+ frh = lcb->lrh;
+ transact_id = le32_to_cpu(frh->transact_id);
+ rec_len = le32_to_cpu(frh->client_data_len);
+ lrh = lcb->log_rec;
+
+ if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * The first lsn after the previous lsn remembered in
+ * the checkpoint is the first candidate for the rlsn
+ */
+ if (!rlsn)
+ rlsn = rec_lsn;
+
+ if (LfsClientRecord != frh->record_type)
+ goto next_log_record_analyze;
+
+ /*
+ * Now update the Transaction Table for this transaction.
+ * If there is no entry present or it is unallocated, we allocate the entry.
+ */
+ if (!trtbl) {
+ trtbl = init_rsttbl(sizeof(struct TRANSACTION_ENTRY),
+ INITIAL_NUMBER_TRANSACTIONS);
+ if (!trtbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ tr = Add2Ptr(trtbl, transact_id);
+
+ if (transact_id >= bytes_per_rt(trtbl) ||
+ tr->next != RESTART_ENTRY_ALLOCATED_LE) {
+ tr = alloc_rsttbl_from_idx(&trtbl, transact_id);
+ if (!tr) {
+ err = -ENOMEM;
+ goto out;
+ }
+ tr->transact_state = TransactionActive;
+ tr->first_lsn = cpu_to_le64(rec_lsn);
+ }
+
+ tr->prev_lsn = tr->undo_next_lsn = cpu_to_le64(rec_lsn);
+
+ /*
+ * If this is a compensation log record, then change
+ * the undo_next_lsn to be the undo_next_lsn of this record
+ */
+ if (lrh->undo_op == cpu_to_le16(CompensationLogRecord))
+ tr->undo_next_lsn = frh->client_undo_next_lsn;
+
+ /* Dispatch to handle log record depending on type */
+ switch (le16_to_cpu(lrh->redo_op)) {
+ case InitializeFileRecordSegment:
+ case DeallocateFileRecordSegment:
+ case WriteEndOfFileRecordSegment:
+ case CreateAttribute:
+ case DeleteAttribute:
+ case UpdateResidentValue:
+ case UpdateNonresidentValue:
+ case UpdateMappingPairs:
+ case SetNewAttributeSizes:
+ case AddIndexEntryRoot:
+ case DeleteIndexEntryRoot:
+ case AddIndexEntryAllocation:
+ case DeleteIndexEntryAllocation:
+ case WriteEndOfIndexBuffer:
+ case SetIndexEntryVcnRoot:
+ case SetIndexEntryVcnAllocation:
+ case UpdateFileNameRoot:
+ case UpdateFileNameAllocation:
+ case SetBitsInNonresidentBitMap:
+ case ClearBitsInNonresidentBitMap:
+ case UpdateRecordDataRoot:
+ case UpdateRecordDataAllocation:
+ case ZeroEndOfFileRecord:
+ t16 = le16_to_cpu(lrh->target_attr);
+ t64 = le64_to_cpu(lrh->target_vcn);
+ dp = find_dp(dptbl, t16, t64);
+
+ if (dp)
+ goto copy_lcns;
+
+ /*
+ * Calculate the number of clusters per page for the system
+ * which wrote the checkpoint, possibly creating the table
+ */
+ if (dptbl) {
+ t32 = (le16_to_cpu(dptbl->size) -
+ sizeof(struct DIR_PAGE_ENTRY)) /
+ sizeof(u64);
+ } else {
+ t32 = log->clst_per_page;
+ ntfs_free(dptbl);
+ dptbl = init_rsttbl(struct_size(dp, page_lcns, t32),
+ 32);
+ if (!dptbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ dp = alloc_rsttbl_idx(&dptbl);
+ if (!dp) {
+ /* The table could not be extended: bail out as the other callers do. */
+ err = -ENOMEM;
+ goto out;
+ }
+ dp->target_attr = cpu_to_le32(t16);
+ dp->transfer_len = cpu_to_le32(t32 << sbi->cluster_bits);
+ dp->lcns_follow = cpu_to_le32(t32);
+ dp->vcn = cpu_to_le64(t64 & ~((u64)t32 - 1));
+ dp->oldest_lsn = cpu_to_le64(rec_lsn);
+
+copy_lcns:
+ /*
+ * Copy the Lcns from the log record into the Dirty Page Entry.
+ * TODO: for different page size support, must somehow make
+ * whole routine a loop, in case the Lcns do not fit below.
+ */
+ t16 = le16_to_cpu(lrh->lcns_follow);
+ for (i = 0; i < t16; i++) {
+ size_t j = (size_t)(le64_to_cpu(lrh->target_vcn) -
+ le64_to_cpu(dp->vcn));
+ dp->page_lcns[j + i] = lrh->page_lcns[i];
+ }
+
+ goto next_log_record_analyze;
+
+ case DeleteDirtyClusters: {
+ u32 range_count =
+ le16_to_cpu(lrh->redo_len) / sizeof(struct LCN_RANGE);
+ const struct LCN_RANGE *r =
+ Add2Ptr(lrh, le16_to_cpu(lrh->redo_off));
+
+ /* Loop through all of the Lcn ranges in this log record */
+ for (i = 0; i < range_count; i++, r++) {
+ u64 lcn0 = le64_to_cpu(r->lcn);
+ u64 lcn_e = lcn0 + le64_to_cpu(r->len) - 1;
+
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ u32 j;
+
+ t32 = le32_to_cpu(dp->lcns_follow);
+ for (j = 0; j < t32; j++) {
+ t64 = le64_to_cpu(dp->page_lcns[j]);
+ if (t64 >= lcn0 && t64 <= lcn_e)
+ dp->page_lcns[j] = 0;
+ }
+ }
+ }
+ goto next_log_record_analyze;
+ }
+
+ case OpenNonresidentAttribute:
+ t16 = le16_to_cpu(lrh->target_attr);
+ if (t16 >= bytes_per_rt(oatbl)) {
+ /*
+ * Compute how big the table needs to be.
+ * Add 10 extra entries for some cushion
+ */
+ u32 new_e = t16 / le16_to_cpu(oatbl->size);
+
+ new_e += 10 - le16_to_cpu(oatbl->used);
+
+ oatbl = extend_rsttbl(oatbl, new_e, ~0u);
+ log->open_attr_tbl = oatbl;
+ if (!oatbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ /* Point to the entry being opened */
+ oe = alloc_rsttbl_from_idx(&oatbl, t16);
+ log->open_attr_tbl = oatbl;
+ if (!oe) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Initialize this entry from the log record */
+ t16 = le16_to_cpu(lrh->redo_off);
+ if (!rst->major_ver) {
+ /* Convert version '0' into version '1' */
+ struct OPEN_ATTR_ENRTY_32 *oe0 = Add2Ptr(lrh, t16);
+
+ oe->bytes_per_index = oe0->bytes_per_index;
+ oe->type = oe0->type;
+ oe->is_dirty_pages = oe0->is_dirty_pages;
+ oe->name_len = 0; //oe0.name_len;
+ oe->ref = oe0->ref;
+ oe->open_record_lsn = oe0->open_record_lsn;
+ } else {
+ memcpy(oe, Add2Ptr(lrh, t16), bytes_per_attr_entry);
+ }
+
+ t16 = le16_to_cpu(lrh->undo_len);
+ if (t16) {
+ oe->ptr = ntfs_malloc(t16);
+ if (!oe->ptr) {
+ err = -ENOMEM;
+ goto out;
+ }
+ oe->name_len = t16 / sizeof(short);
+ memcpy(oe->ptr,
+ Add2Ptr(lrh, le16_to_cpu(lrh->undo_off)), t16);
+ oe->is_attr_name = 1;
+ } else {
+ oe->ptr = NULL;
+ oe->is_attr_name = 0;
+ }
+
+ goto next_log_record_analyze;
+
+ case HotFix:
+ t16 = le16_to_cpu(lrh->target_attr);
+ t64 = le64_to_cpu(lrh->target_vcn);
+ dp = find_dp(dptbl, t16, t64);
+ if (dp) {
+ size_t j = le64_to_cpu(lrh->target_vcn) -
+ le64_to_cpu(dp->vcn);
+ if (dp->page_lcns[j])
+ dp->page_lcns[j] = lrh->page_lcns[0];
+ }
+ goto next_log_record_analyze;
+
+ case EndTopLevelAction:
+ tr = Add2Ptr(trtbl, transact_id);
+ tr->prev_lsn = cpu_to_le64(rec_lsn);
+ tr->undo_next_lsn = frh->client_undo_next_lsn;
+ goto next_log_record_analyze;
+
+ case PrepareTransaction:
+ tr = Add2Ptr(trtbl, transact_id);
+ tr->transact_state = TransactionPrepared;
+ goto next_log_record_analyze;
+
+ case CommitTransaction:
+ tr = Add2Ptr(trtbl, transact_id);
+ tr->transact_state = TransactionCommitted;
+ goto next_log_record_analyze;
+
+ case ForgetTransaction:
+ free_rsttbl_idx(trtbl, transact_id);
+ goto next_log_record_analyze;
+
+ case Noop:
+ case OpenAttributeTableDump:
+ case AttributeNamesDump:
+ case DirtyPageTableDump:
+ case TransactionTableDump:
+ /* The following cases require no action in the Analysis Pass */
+ goto next_log_record_analyze;
+
+ default:
+ /*
+ * All codes will be explicitly handled.
+ * If we see a code we do not expect, then we are in trouble
+ */
+ goto next_log_record_analyze;
+ }
+
+end_log_records_enumerate:
+ lcb_put(lcb);
+ lcb = NULL;
+
+ /*
+ * Scan the Dirty Page Table and Transaction Table for
+ * the lowest lsn, and return it as the Redo lsn
+ */
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ t64 = le64_to_cpu(dp->oldest_lsn);
+ if (t64 && t64 < rlsn)
+ rlsn = t64;
+ }
+
+ tr = NULL;
+ while ((tr = enum_rstbl(trtbl, tr))) {
+ t64 = le64_to_cpu(tr->first_lsn);
+ if (t64 && t64 < rlsn)
+ rlsn = t64;
+ }
+
+ /* Only proceed if the Dirty Page Table or Transaction Table is not empty */
+ if ((!dptbl || !dptbl->total) && (!trtbl || !trtbl->total))
+ goto end_reply;
+
+ sbi->flags |= NTFS_FLAGS_NEED_REPLAY;
+ if (is_ro)
+ goto out;
+
+ /* Reopen all of the attributes with dirty pages */
+ oe = NULL;
+next_open_attribute:
+
+ oe = enum_rstbl(oatbl, oe);
+ if (!oe) {
+ err = 0;
+ dp = NULL;
+ goto next_dirty_page;
+ }
+
+ oa = ntfs_zalloc(sizeof(struct OpenAttr));
+ if (!oa) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ inode = ntfs_iget5(sbi->sb, &oe->ref, NULL);
+ if (IS_ERR(inode))
+ goto fake_attr;
+
+ if (is_bad_inode(inode)) {
+ iput(inode);
+fake_attr:
+ if (oa->ni) {
+ iput(&oa->ni->vfs_inode);
+ oa->ni = NULL;
+ }
+
+ attr = attr_create_nonres_log(sbi, oe->type, 0, oe->ptr,
+ oe->name_len, 0);
+ if (!attr) {
+ ntfs_free(oa);
+ err = -ENOMEM;
+ goto out;
+ }
+ oa->attr = attr;
+ oa->run1 = &oa->run0;
+ goto final_oe;
+ }
+
+ ni_oe = ntfs_i(inode);
+ oa->ni = ni_oe;
+
+ attr = ni_find_attr(ni_oe, NULL, NULL, oe->type, oe->ptr, oe->name_len,
+ NULL, NULL);
+
+ if (!attr)
+ goto fake_attr;
+
+ t32 = le32_to_cpu(attr->size);
+ oa->attr = ntfs_memdup(attr, t32);
+ if (!oa->attr)
+ goto fake_attr;
+
+ if (!S_ISDIR(inode->i_mode)) {
+ if (attr->type == ATTR_DATA && !attr->name_len) {
+ oa->run1 = &ni_oe->file.run;
+ goto final_oe;
+ }
+ } else {
+ if (attr->type == ATTR_ALLOC &&
+ attr->name_len == ARRAY_SIZE(I30_NAME) &&
+ !memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME))) {
+ oa->run1 = &ni_oe->dir.alloc_run;
+ goto final_oe;
+ }
+ }
+
+ if (attr->non_res) {
+ u16 roff = le16_to_cpu(attr->nres.run_off);
+ CLST svcn = le64_to_cpu(attr->nres.svcn);
+
+ err = run_unpack(&oa->run0, sbi, inode->i_ino, svcn,
+ le64_to_cpu(attr->nres.evcn), svcn,
+ Add2Ptr(attr, roff), t32 - roff);
+ if (err < 0) {
+ ntfs_free(oa->attr);
+ oa->attr = NULL;
+ goto fake_attr;
+ }
+ err = 0;
+ }
+ oa->run1 = &oa->run0;
+ attr = oa->attr;
+
+final_oe:
+ if (oe->is_attr_name == 1)
+ ntfs_free(oe->ptr);
+ oe->is_attr_name = 0;
+ oe->ptr = oa;
+ oe->name_len = attr->name_len;
+
+ goto next_open_attribute;
+
+ /*
+ * Now loop through the dirty page table to extract all of the Vcn/Lcn
+ * mappings that we have, and insert them into the appropriate runs
+ */
+next_dirty_page:
+ dp = enum_rstbl(dptbl, dp);
+ if (!dp)
+ goto do_redo_1;
+
+ oe = Add2Ptr(oatbl, le32_to_cpu(dp->target_attr));
+
+ if (oe->next != RESTART_ENTRY_ALLOCATED_LE)
+ goto next_dirty_page;
+
+ oa = oe->ptr;
+ if (!oa)
+ goto next_dirty_page;
+
+ i = -1;
+next_dirty_page_vcn:
+ i += 1;
+ if (i >= le32_to_cpu(dp->lcns_follow))
+ goto next_dirty_page;
+
+ vcn = le64_to_cpu(dp->vcn) + i;
+ size = (vcn + 1) << sbi->cluster_bits;
+
+ if (!dp->page_lcns[i])
+ goto next_dirty_page_vcn;
+
+ rno = ino_get(&oe->ref);
+ if (rno <= MFT_REC_MIRR &&
+ size < (MFT_REC_VOL + 1) * sbi->record_size &&
+ oe->type == ATTR_DATA) {
+ goto next_dirty_page_vcn;
+ }
+
+ lcn = le64_to_cpu(dp->page_lcns[i]);
+
+ if ((!run_lookup_entry(oa->run1, vcn, &lcn0, &len0, NULL) ||
+ lcn0 != lcn) &&
+ !run_add_entry(oa->run1, vcn, lcn, 1, false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ attr = oa->attr;
+ t64 = le64_to_cpu(attr->nres.alloc_size);
+ if (size > t64) {
+ attr->nres.valid_size = attr->nres.data_size =
+ attr->nres.alloc_size = cpu_to_le64(size);
+ }
+ goto next_dirty_page_vcn;
+
+do_redo_1:
+ /*
+ * Perform the Redo Pass, to restore all of the dirty pages to the same
+ * contents that they had immediately before the crash.
+ * If the dirty page table is empty, then we can skip the entire Redo Pass
+ */
+ if (!dptbl || !dptbl->total)
+ goto do_undo_action;
+
+ rec_lsn = rlsn;
+
+ /*
+ * Read the record at the Redo lsn, before falling
+ * into common code to handle each record
+ */
+ err = read_log_rec_lcb(log, rlsn, lcb_ctx_next, &lcb);
+ if (err)
+ goto out;
+
+ /*
+ * Now loop to read all of our log records forwards,
+ * until we hit the end of the file, cleaning up at the end
+ */
+do_action_next:
+ frh = lcb->lrh;
+
+ if (LfsClientRecord != frh->record_type)
+ goto read_next_log_do_action;
+
+ transact_id = le32_to_cpu(frh->transact_id);
+ rec_len = le32_to_cpu(frh->client_data_len);
+ lrh = lcb->log_rec;
+
+ if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Ignore log records that do not update pages */
+ if (lrh->lcns_follow)
+ goto find_dirty_page;
+
+ goto read_next_log_do_action;
+
+find_dirty_page:
+ t16 = le16_to_cpu(lrh->target_attr);
+ t64 = le64_to_cpu(lrh->target_vcn);
+ dp = find_dp(dptbl, t16, t64);
+
+ if (!dp)
+ goto read_next_log_do_action;
+
+ if (rec_lsn < le64_to_cpu(dp->oldest_lsn))
+ goto read_next_log_do_action;
+
+ t16 = le16_to_cpu(lrh->target_attr);
+ if (t16 >= bytes_per_rt(oatbl)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oe = Add2Ptr(oatbl, t16);
+
+ if (oe->next != RESTART_ENTRY_ALLOCATED_LE) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oa = oe->ptr;
+
+ if (!oa) {
+ err = -EINVAL;
+ goto out;
+ }
+ attr = oa->attr;
+
+ vcn = le64_to_cpu(lrh->target_vcn);
+
+ if (!run_lookup_entry(oa->run1, vcn, &lcn, NULL, NULL) ||
+ lcn == SPARSE_LCN) {
+ goto read_next_log_do_action;
+ }
+
+ /* Point to the Redo data and get its length */
+ data = Add2Ptr(lrh, le16_to_cpu(lrh->redo_off));
+ dlen = le16_to_cpu(lrh->redo_len);
+
+ /* Shorten length by any Lcns which were deleted */
+ saved_len = dlen;
+
+ for (i = le16_to_cpu(lrh->lcns_follow); i; i--) {
+ size_t j;
+ u32 alen, voff;
+
+ voff = le16_to_cpu(lrh->record_off) +
+ le16_to_cpu(lrh->attr_off);
+ voff += le16_to_cpu(lrh->cluster_off) << SECTOR_SHIFT;
+
+ /* If the Vcn in question is allocated, we can just get out. */
+ j = le64_to_cpu(lrh->target_vcn) - le64_to_cpu(dp->vcn);
+ if (dp->page_lcns[j + i - 1])
+ break;
+
+ if (!saved_len)
+ saved_len = 1;
+
+ /*
+ * Calculate the allocated space left relative to the
+ * log record Vcn, after removing this unallocated Vcn
+ */
+ alen = (i - 1) << sbi->cluster_bits;
+
+ /*
+ * If the update described by this log record goes beyond
+ * the allocated space, then we will have to reduce the length
+ */
+ if (voff >= alen)
+ dlen = 0;
+ else if (voff + dlen > alen)
+ dlen = alen - voff;
+ }
+
+ /* If the resulting dlen from above is now zero, we can skip this log record */
+ if (!dlen && saved_len)
+ goto read_next_log_do_action;
+
+ t16 = le16_to_cpu(lrh->redo_op);
+ if (can_skip_action(t16))
+ goto read_next_log_do_action;
+
+ /* Apply the Redo operation in a common routine */
+ err = do_action(log, oe, lrh, t16, data, dlen, rec_len, &rec_lsn);
+ if (err)
+ goto out;
+
+ /* Keep reading and looping back until end of file */
+read_next_log_do_action:
+ err = read_next_log_rec(log, lcb, &rec_lsn);
+ if (!err && rec_lsn)
+ goto do_action_next;
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+do_undo_action:
+ /* Scan Transaction Table */
+ tr = NULL;
+transaction_table_next:
+ tr = enum_rstbl(trtbl, tr);
+ if (!tr)
+ goto undo_action_done;
+
+ if (TransactionActive != tr->transact_state || !tr->undo_next_lsn) {
+ free_rsttbl_idx(trtbl, PtrOffset(trtbl, tr));
+ goto transaction_table_next;
+ }
+
+ log->transaction_id = PtrOffset(trtbl, tr);
+ undo_next_lsn = le64_to_cpu(tr->undo_next_lsn);
+
+ /*
+ * We only have to do anything if the transaction has
+ * something in its undo_next_lsn field
+ */
+ if (!undo_next_lsn)
+ goto commit_undo;
+
+ /* Read the first record to be undone by this transaction */
+ err = read_log_rec_lcb(log, undo_next_lsn, lcb_ctx_undo_next, &lcb);
+ if (err)
+ goto out;
+
+ /*
+ * Now loop to read all of our log records forwards,
+ * until we hit the end of the file, cleaning up at the end
+ */
+undo_action_next:
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ transact_id = le32_to_cpu(frh->transact_id);
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (lrh->undo_op == cpu_to_le16(Noop))
+ goto read_next_log_undo_action;
+
+ oe = Add2Ptr(oatbl, le16_to_cpu(lrh->target_attr));
+ oa = oe->ptr;
+
+ t16 = le16_to_cpu(lrh->lcns_follow);
+ if (!t16)
+ goto add_allocated_vcns;
+
+ is_mapped = run_lookup_entry(oa->run1, le64_to_cpu(lrh->target_vcn),
+ &lcn, &clen, NULL);
+
+ /*
+ * If the mapping isn't already in the table, or the mapping
+ * corresponds to a hole, we need to make sure there is no
+ * partial page already in memory
+ */
+ if (is_mapped && lcn != SPARSE_LCN && clen >= t16)
+ goto add_allocated_vcns;
+
+ vcn = le64_to_cpu(lrh->target_vcn);
+ vcn &= ~(log->clst_per_page - 1);
+
+add_allocated_vcns:
+ for (i = 0, vcn = le64_to_cpu(lrh->target_vcn),
+ size = (vcn + 1) << sbi->cluster_bits;
+ i < t16; i++, vcn += 1, size += sbi->cluster_size) {
+ attr = oa->attr;
+ if (!attr->non_res) {
+ if (size > le32_to_cpu(attr->res.data_size))
+ attr->res.data_size = cpu_to_le32(size);
+ } else {
+ if (size > le64_to_cpu(attr->nres.data_size))
+ attr->nres.valid_size = attr->nres.data_size =
+ attr->nres.alloc_size =
+ cpu_to_le64(size);
+ }
+ }
+
+ t16 = le16_to_cpu(lrh->undo_op);
+ if (can_skip_action(t16))
+ goto read_next_log_undo_action;
+
+ /* Point to the Undo data and get its length */
+ data = Add2Ptr(lrh, le16_to_cpu(lrh->undo_off));
+ dlen = le16_to_cpu(lrh->undo_len);
+
+ /* It is time to apply the undo action */
+ err = do_action(log, oe, lrh, t16, data, dlen, rec_len, NULL);
+
+read_next_log_undo_action:
+ /*
+ * Keep reading and looping back until we have read the
+ * last record for this transaction
+ */
+ err = read_next_log_rec(log, lcb, &rec_lsn);
+ if (err)
+ goto out;
+
+ if (rec_lsn)
+ goto undo_action_next;
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+commit_undo:
+ free_rsttbl_idx(trtbl, log->transaction_id);
+
+ log->transaction_id = 0;
+
+ goto transaction_table_next;
+
+undo_action_done:
+
+ ntfs_update_mftmirr(sbi, 0);
+
+ sbi->flags &= ~NTFS_FLAGS_NEED_REPLAY;
+
+end_reply:
+
+ err = 0;
+ if (is_ro)
+ goto out;
+
+ rh = ntfs_zalloc(log->page_size);
+ if (!rh) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ rh->rhdr.sign = NTFS_RSTR_SIGNATURE;
+ rh->rhdr.fix_off = cpu_to_le16(offsetof(struct RESTART_HDR, fixups));
+ t16 = (log->page_size >> SECTOR_SHIFT) + 1;
+ rh->rhdr.fix_num = cpu_to_le16(t16);
+ rh->sys_page_size = cpu_to_le32(log->page_size);
+ rh->page_size = cpu_to_le32(log->page_size);
+
+ t16 = QuadAlign(offsetof(struct RESTART_HDR, fixups) +
+ sizeof(short) * t16);
+ rh->ra_off = cpu_to_le16(t16);
+ rh->minor_ver = cpu_to_le16(1); // 0x1A:
+ rh->major_ver = cpu_to_le16(1); // 0x1C:
+
+ ra2 = Add2Ptr(rh, t16);
+ memcpy(ra2, ra, sizeof(struct RESTART_AREA));
+
+ ra2->client_idx[0] = 0;
+ ra2->client_idx[1] = LFS_NO_CLIENT_LE;
+ ra2->flags = cpu_to_le16(2);
+
+ le32_add_cpu(&ra2->open_log_count, 1);
+
+ ntfs_fix_pre_write(&rh->rhdr, log->page_size);
+
+ err = ntfs_sb_write_run(sbi, &ni->file.run, 0, rh, log->page_size);
+ if (!err)
+ err = ntfs_sb_write_run(sbi, &log->ni->file.run, log->page_size,
+ rh, log->page_size);
+
+ ntfs_free(rh);
+ if (err)
+ goto out;
+
+out:
+ ntfs_free(rst);
+ if (lcb)
+ lcb_put(lcb);
+
+ /* Scan the Open Attribute Table to close all of the open attributes */
+ oe = NULL;
+ while ((oe = enum_rstbl(oatbl, oe))) {
+ rno = ino_get(&oe->ref);
+
+ if (oe->is_attr_name == 1) {
+ ntfs_free(oe->ptr);
+ oe->ptr = NULL;
+ continue;
+ }
+
+ if (oe->is_attr_name)
+ continue;
+
+ oa = oe->ptr;
+ if (!oa)
+ continue;
+
+ run_close(&oa->run0);
+ ntfs_free(oa->attr);
+ if (oa->ni)
+ iput(&oa->ni->vfs_inode);
+ ntfs_free(oa);
+ }
+
+ ntfs_free(trtbl);
+ ntfs_free(oatbl);
+ ntfs_free(dptbl);
+ ntfs_free(attr_names);
+ ntfs_free(rst_info.r_page);
+
+ ntfs_free(ra);
+ ntfs_free(log->one_page_buf);
+
+ if (err)
+ sbi->flags |= NTFS_FLAGS_NEED_REPLAY;
+
+ if (err == -EROFS)
+ err = 0;
+ else if (log->set_dirty)
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
+ ntfs_free(log);
+
+ return err;
+}
--
2.30.0
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15-rc1
commit 522e010b58379fbe19b38fdef5016bca0c3cf405
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This patch adds different types of NTFS-applicable compressions:
- lznt
- lzx
- xpress
The latter two (lzx, xpress) implement the Windows Compact OS feature and
were taken from the ntfs-3g system compression plugin authored by Eric Biggers
(https://github.com/ebiggers/ntfs-3g-system-compression),
which was ported to ntfs3 and adapted to the Linux kernel environment.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/lib/decompress_common.c | 332 +++++++++++++++
fs/ntfs3/lib/decompress_common.h | 352 ++++++++++++++++
fs/ntfs3/lib/lib.h | 26 ++
fs/ntfs3/lib/lzx_decompress.c | 683 +++++++++++++++++++++++++++++++
fs/ntfs3/lib/xpress_decompress.c | 155 +++++++
fs/ntfs3/lznt.c | 452 ++++++++++++++++++++
6 files changed, 2000 insertions(+)
create mode 100644 fs/ntfs3/lib/decompress_common.c
create mode 100644 fs/ntfs3/lib/decompress_common.h
create mode 100644 fs/ntfs3/lib/lib.h
create mode 100644 fs/ntfs3/lib/lzx_decompress.c
create mode 100644 fs/ntfs3/lib/xpress_decompress.c
create mode 100644 fs/ntfs3/lznt.c
diff --git a/fs/ntfs3/lib/decompress_common.c b/fs/ntfs3/lib/decompress_common.c
new file mode 100644
index 000000000000..83c9e93aea77
--- /dev/null
+++ b/fs/ntfs3/lib/decompress_common.c
@@ -0,0 +1,332 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * decompress_common.c - Code shared by the XPRESS and LZX decompressors
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "decompress_common.h"
+
+/*
+ * make_huffman_decode_table() -
+ *
+ * Build a decoding table for a canonical prefix code, or "Huffman code".
+ *
+ * This is an internal function, not part of the library API!
+ *
+ * This takes as input the length of the codeword for each symbol in the
+ * alphabet and produces as output a table that can be used for fast
+ * decoding of prefix-encoded symbols using read_huffsym().
+ *
+ * Strictly speaking, a canonical prefix code might not be a Huffman
+ * code. But this algorithm will work either way; and in fact, since
+ * Huffman codes are defined in terms of symbol frequencies, there is no
+ * way for the decompressor to know whether the code is a true Huffman
+ * code or not until all symbols have been decoded.
+ *
+ * Because the prefix code is assumed to be "canonical", it can be
+ * reconstructed directly from the codeword lengths. A prefix code is
+ * canonical if and only if a longer codeword never lexicographically
+ * precedes a shorter codeword, and the lexicographic ordering of
+ * codewords of the same length is the same as the lexicographic ordering
+ * of the corresponding symbols. Consequently, we can sort the symbols
+ * primarily by codeword length and secondarily by symbol value, then
+ * reconstruct the prefix code by generating codewords lexicographically
+ * in that order.
+ *
+ * This function does not, however, generate the prefix code explicitly.
+ * Instead, it directly builds a table for decoding symbols using the
+ * code. The basic idea is this: given the next 'max_codeword_len' bits
+ * in the input, we can look up the decoded symbol by indexing a table
+ * containing 2**max_codeword_len entries. A codeword with length
+ * 'max_codeword_len' will have exactly one entry in this table, whereas
+ * a codeword shorter than 'max_codeword_len' will have multiple entries
+ * in this table. Precisely, a codeword of length n will be represented
+ * by 2**(max_codeword_len - n) entries in this table. The 0-based index
+ * of each such entry will contain the corresponding codeword as a prefix
+ * when zero-padded on the left to 'max_codeword_len' binary digits.
+ *
+ * That's the basic idea, but we implement two optimizations regarding
+ * the format of the decode table itself:
+ *
+ * - For many compression formats, the maximum codeword length is too
+ * long for it to be efficient to build the full decoding table
+ * whenever a new prefix code is used. Instead, we can build the table
+ * using only 2**table_bits entries, where 'table_bits' is some number
+ * less than or equal to 'max_codeword_len'. Then, only codewords of
+ * length 'table_bits' and shorter can be directly looked up. For
+ * longer codewords, the direct lookup instead produces the root of a
+ * binary tree. Using this tree, the decoder can do traditional
+ * bit-by-bit decoding of the remainder of the codeword. Child nodes
+ * are allocated in extra entries at the end of the table; leaf nodes
+ * contain symbols. Note that the long-codeword case is, in general,
+ * not performance critical, since in Huffman codes the most frequently
+ * used symbols are assigned the shortest codeword lengths.
+ *
+ * - When we decode a symbol using a direct lookup of the table, we still
+ * need to know its length so that the bitstream can be advanced by the
+ * appropriate number of bits. The simple solution is to simply retain
+ * the 'lens' array and use the decoded symbol as an index into it.
+ * However, this requires two separate array accesses in the fast path.
+ * The optimization is to store the length directly in the decode
+ * table. We use the bottom 11 bits for the symbol and the top 5 bits
+ * for the length. In addition, to combine this optimization with the
+ * previous one, we introduce a special case where the top 2 bits of
+ * the length are both set if the entry is actually the root of a
+ * binary tree.
+ *
+ * @decode_table:
+ * The array in which to create the decoding table. This must have
+ * a length of at least ((2**table_bits) + 2 * num_syms) entries.
+ *
+ * @num_syms:
+ * The number of symbols in the alphabet; also, the length of the
+ * 'lens' array. Must be less than or equal to 2048.
+ *
+ * @table_bits:
+ * The order of the decode table size, as explained above. Must be
+ * less than or equal to 13.
+ *
+ * @lens:
+ * An array of length @num_syms, indexable by symbol, that gives the
+ * length of the codeword, in bits, for that symbol. The length can
+ * be 0, which means that the symbol does not have a codeword
+ * assigned.
+ *
+ * @max_codeword_len:
+ * The longest codeword length allowed in the compression format.
+ * All entries in 'lens' must be less than or equal to this value.
+ * This must be less than or equal to 23.
+ *
+ * @working_space:
+ * A temporary array of length '2 * (max_codeword_len + 1) +
+ * num_syms'.
+ *
+ * Returns 0 on success, or -1 if the lengths do not form a valid prefix
+ * code.
+ */
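+/*
+ * Illustrative example (not from the original code): for num_syms = 3,
+ * lens = {1, 2, 2} and table_bits = 2, the canonical codewords are
+ * 0, 10 and 11, and the direct-lookup part of the table is filled as:
+ *
+ *	index 00 -> (1 << 11) | 0	index 10 -> (2 << 11) | 1
+ *	index 01 -> (1 << 11) | 0	index 11 -> (2 << 11) | 2
+ *
+ * i.e. a codeword of length n occupies 2**(table_bits - n) entries,
+ * each storing the codeword length in the top 5 bits and the symbol
+ * in the bottom 11 bits.
+ */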
+int make_huffman_decode_table(u16 decode_table[], const u32 num_syms,
+ const u32 table_bits, const u8 lens[],
+ const u32 max_codeword_len,
+ u16 working_space[])
+{
+ const u32 table_num_entries = 1 << table_bits;
+ u16 * const len_counts = &working_space[0];
+ u16 * const offsets = &working_space[1 * (max_codeword_len + 1)];
+ u16 * const sorted_syms = &working_space[2 * (max_codeword_len + 1)];
+ int left;
+ void *decode_table_ptr;
+ u32 sym_idx;
+ u32 codeword_len;
+ u32 stores_per_loop;
+ u32 decode_table_pos;
+ u32 len;
+ u32 sym;
+
+ /* Count how many symbols have each possible codeword length.
+ * Note that a length of 0 indicates the corresponding symbol is not
+ * used in the code and therefore does not have a codeword.
+ */
+ for (len = 0; len <= max_codeword_len; len++)
+ len_counts[len] = 0;
+ for (sym = 0; sym < num_syms; sym++)
+ len_counts[lens[sym]]++;
+
+ /* We can assume all lengths are <= max_codeword_len, but we
+ * cannot assume they form a valid prefix code. A codeword of
+ * length n should require a proportion of the codespace equaling
+ * (1/2)^n. The code is valid if and only if the codespace is
+ * exactly filled by the lengths, by this measure.
+ */
+ left = 1;
+ for (len = 1; len <= max_codeword_len; len++) {
+ left <<= 1;
+ left -= len_counts[len];
+ if (left < 0) {
+ /* The lengths overflow the codespace; that is, the code
+ * is over-subscribed.
+ */
+ return -1;
+ }
+ }
+
+ if (left) {
+ /* The lengths do not fill the codespace; that is, they form an
+ * incomplete set.
+ */
+ if (left == (1 << max_codeword_len)) {
+ /* The code is completely empty. This is arguably
+ * invalid, but in fact it is valid in LZX and XPRESS,
+ * so we must allow it. By definition, no symbols can
+ * be decoded with an empty code. Consequently, we
+ * technically don't even need to fill in the decode
+ * table. However, to avoid accessing uninitialized
+ * memory if the algorithm nevertheless attempts to
+ * decode symbols using such a code, we zero out the
+ * decode table.
+ */
+ memset(decode_table, 0,
+ table_num_entries * sizeof(decode_table[0]));
+ return 0;
+ }
+ return -1;
+ }
+
+ /* Sort the symbols primarily by length and secondarily by symbol order.
+ */
+
+ /* Initialize 'offsets' so that offsets[len] for 1 <= len <=
+ * max_codeword_len is the number of codewords shorter than 'len' bits.
+ */
+ offsets[1] = 0;
+ for (len = 1; len < max_codeword_len; len++)
+ offsets[len + 1] = offsets[len] + len_counts[len];
+
+ /* Use the 'offsets' array to sort the symbols. Note that we do not
+ * include symbols that are not used in the code. Consequently, fewer
+ * than 'num_syms' entries in 'sorted_syms' may be filled.
+ */
+ for (sym = 0; sym < num_syms; sym++)
+ if (lens[sym])
+ sorted_syms[offsets[lens[sym]]++] = sym;
+
+ /* Fill entries for codewords with length <= table_bits
+ * --- that is, those short enough for a direct mapping.
+ *
+ * The table will start with entries for the shortest codeword(s), which
+ * have the most entries. From there, the number of entries per
+ * codeword will decrease.
+ */
+ decode_table_ptr = decode_table;
+ sym_idx = 0;
+ codeword_len = 1;
+ stores_per_loop = (1 << (table_bits - codeword_len));
+ for (; stores_per_loop != 0; codeword_len++, stores_per_loop >>= 1) {
+ u32 end_sym_idx = sym_idx + len_counts[codeword_len];
+
+ for (; sym_idx < end_sym_idx; sym_idx++) {
+ u16 entry;
+ u16 *p;
+ u32 n;
+
+ entry = ((u32)codeword_len << 11) | sorted_syms[sym_idx];
+ p = (u16 *)decode_table_ptr;
+ n = stores_per_loop;
+
+ do {
+ *p++ = entry;
+ } while (--n);
+
+ decode_table_ptr = p;
+ }
+ }
+
+ /* If we've filled in the entire table, we are done. Otherwise,
+ * there are codewords longer than table_bits for which we must
+ * generate binary trees.
+ */
+ decode_table_pos = (u16 *)decode_table_ptr - decode_table;
+ if (decode_table_pos != table_num_entries) {
+ u32 j;
+ u32 next_free_tree_slot;
+ u32 cur_codeword;
+
+ /* First, zero out the remaining entries. This is
+ * necessary so that these entries appear as
+ * "unallocated" in the next part. Each of these entries
+ * will eventually be filled with the representation of
+ * the root node of a binary tree.
+ */
+ j = decode_table_pos;
+ do {
+ decode_table[j] = 0;
+ } while (++j != table_num_entries);
+
+ /* We allocate child nodes starting at the end of the
+ * direct lookup table. Note that there should be
+ * 2*num_syms extra entries for this purpose, although
+ * fewer than this may actually be needed.
+ */
+ next_free_tree_slot = table_num_entries;
+
+ /* Iterate through each codeword with length greater than
+ * 'table_bits', primarily in order of codeword length
+ * and secondarily in order of symbol.
+ */
+ for (cur_codeword = decode_table_pos << 1;
+ codeword_len <= max_codeword_len;
+ codeword_len++, cur_codeword <<= 1) {
+ u32 end_sym_idx = sym_idx + len_counts[codeword_len];
+
+ for (; sym_idx < end_sym_idx; sym_idx++, cur_codeword++) {
+ /* 'sorted_sym' is the symbol represented by the
+ * codeword.
+ */
+ u32 sorted_sym = sorted_syms[sym_idx];
+ u32 extra_bits = codeword_len - table_bits;
+ u32 node_idx = cur_codeword >> extra_bits;
+
+ /* Go through each bit of the current codeword
+ * beyond the prefix of length @table_bits and
+ * walk the appropriate binary tree, allocating
+ * any slots that have not yet been allocated.
+ *
+ * Note that the 'pointer' entry to the binary
+ * tree, which is stored in the direct lookup
+ * portion of the table, is represented
+ * identically to other internal (non-leaf)
+ * nodes of the binary tree; it can be thought
+ * of as simply the root of the tree. The
+ * representation of these internal nodes is
+ * simply the index of the left child combined
+ * with the special bits 0xC000 to distinguish
+ * the entry from direct mapping and leaf node
+ * entries.
+ */
+ do {
+ /* At least one bit remains in the
+ * codeword, but the current node is an
+ * unallocated leaf. Change it to an
+ * internal node.
+ */
+ if (decode_table[node_idx] == 0) {
+ decode_table[node_idx] =
+ next_free_tree_slot | 0xC000;
+ decode_table[next_free_tree_slot++] = 0;
+ decode_table[next_free_tree_slot++] = 0;
+ }
+
+ /* Go to the left child if the next bit
+ * in the codeword is 0; otherwise go to
+ * the right child.
+ */
+ node_idx = decode_table[node_idx] & 0x3FFF;
+ --extra_bits;
+ node_idx += (cur_codeword >> extra_bits) & 1;
+ } while (extra_bits != 0);
+
+ /* We've traversed the tree using the entire
+ * codeword, and we're now at the entry where
+ * the actual symbol will be stored. This is
+ * distinguished from internal nodes by not
+ * having its high two bits set.
+ */
+ decode_table[node_idx] = sorted_sym;
+ }
+ }
+ }
+ return 0;
+}
diff --git a/fs/ntfs3/lib/decompress_common.h b/fs/ntfs3/lib/decompress_common.h
new file mode 100644
index 000000000000..66297f398403
--- /dev/null
+++ b/fs/ntfs3/lib/decompress_common.h
@@ -0,0 +1,352 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+/*
+ * decompress_common.h - Code shared by the XPRESS and LZX decompressors
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/string.h>
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <asm/unaligned.h>
+
+
+/* "Force inline" macro (not required, but helpful for performance) */
+#define forceinline __always_inline
+
+/* Enable whole-word match copying on selected architectures */
+#if defined(__i386__) || defined(__x86_64__) || defined(__ARM_FEATURE_UNALIGNED)
+# define FAST_UNALIGNED_ACCESS
+#endif
+
+/* Size of a machine word */
+#define WORDBYTES (sizeof(size_t))
+
+static forceinline void
+copy_unaligned_word(const void *src, void *dst)
+{
+ put_unaligned(get_unaligned((const size_t *)src), (size_t *)dst);
+}
+
+
+/* Generate a "word" with platform-dependent size whose bytes all contain the
+ * value 'b'.
+ */
+static forceinline size_t repeat_byte(u8 b)
+{
+ size_t v;
+
+ v = b;
+ v |= v << 8;
+ v |= v << 16;
+ v |= v << ((WORDBYTES == 8) ? 32 : 0);
+ return v;
+}
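+/*
+ * Illustrative example (not from the original code): on a 64-bit
+ * machine repeat_byte(0xE8) == 0xE8E8E8E8E8E8E8E8; lz_copy() uses this
+ * to expand offset-1 matches one word at a time.
+ */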
+
+/* Structure that encapsulates a block of in-memory data being interpreted as a
+ * stream of bits, optionally with interwoven literal bytes. Bits are assumed
+ * to be stored in little endian 16-bit coding units, with the bits ordered high
+ * to low.
+ */
+struct input_bitstream {
+
+ /* Bits that have been read from the input buffer. The bits are
+ * left-justified; the next bit is always bit 31.
+ */
+ u32 bitbuf;
+
+ /* Number of bits currently held in @bitbuf. */
+ u32 bitsleft;
+
+ /* Pointer to the next byte to be retrieved from the input buffer. */
+ const u8 *next;
+
+ /* Pointer to just past the end of the input buffer. */
+ const u8 *end;
+};
+
+/* Initialize a bitstream to read from the specified input buffer. */
+static forceinline void init_input_bitstream(struct input_bitstream *is,
+ const void *buffer, u32 size)
+{
+ is->bitbuf = 0;
+ is->bitsleft = 0;
+ is->next = buffer;
+ is->end = is->next + size;
+}
+
+/* Ensure the bit buffer variable for the bitstream contains at least @num_bits
+ * bits. Following this, bitstream_peek_bits() and/or bitstream_remove_bits()
+ * may be called on the bitstream to peek or remove up to @num_bits bits. Note
+ * that @num_bits must be <= 16.
+ */
+static forceinline void bitstream_ensure_bits(struct input_bitstream *is,
+ u32 num_bits)
+{
+ if (is->bitsleft < num_bits) {
+ if (is->end - is->next >= 2) {
+ is->bitbuf |= (u32)get_unaligned_le16(is->next)
+ << (16 - is->bitsleft);
+ is->next += 2;
+ }
+ is->bitsleft += 16;
+ }
+}
+
+/* Return the next @num_bits bits from the bitstream, without removing them.
+ * There must be at least @num_bits remaining in the buffer variable, from a
+ * previous call to bitstream_ensure_bits().
+ */
+static forceinline u32
+bitstream_peek_bits(const struct input_bitstream *is, const u32 num_bits)
+{
+ return (is->bitbuf >> 1) >> (sizeof(is->bitbuf) * 8 - num_bits - 1);
+}
+
+/* Remove @num_bits from the bitstream. There must be at least @num_bits
+ * remaining in the buffer variable, from a previous call to
+ * bitstream_ensure_bits().
+ */
+static forceinline void
+bitstream_remove_bits(struct input_bitstream *is, u32 num_bits)
+{
+ is->bitbuf <<= num_bits;
+ is->bitsleft -= num_bits;
+}
+
+/* Remove and return @num_bits bits from the bitstream. There must be at least
+ * @num_bits remaining in the buffer variable, from a previous call to
+ * bitstream_ensure_bits().
+ */
+static forceinline u32
+bitstream_pop_bits(struct input_bitstream *is, u32 num_bits)
+{
+ u32 bits = bitstream_peek_bits(is, num_bits);
+
+ bitstream_remove_bits(is, num_bits);
+ return bits;
+}
+
+/* Read and return the next @num_bits bits from the bitstream. */
+static forceinline u32
+bitstream_read_bits(struct input_bitstream *is, u32 num_bits)
+{
+ bitstream_ensure_bits(is, num_bits);
+ return bitstream_pop_bits(is, num_bits);
+}
+
+/* Read and return the next literal byte embedded in the bitstream. */
+static forceinline u8
+bitstream_read_byte(struct input_bitstream *is)
+{
+ if (unlikely(is->end == is->next))
+ return 0;
+ return *is->next++;
+}
+
+/* Read and return the next 16-bit integer embedded in the bitstream. */
+static forceinline u16
+bitstream_read_u16(struct input_bitstream *is)
+{
+ u16 v;
+
+ if (unlikely(is->end - is->next < 2))
+ return 0;
+ v = get_unaligned_le16(is->next);
+ is->next += 2;
+ return v;
+}
+
+/* Read and return the next 32-bit integer embedded in the bitstream. */
+static forceinline u32
+bitstream_read_u32(struct input_bitstream *is)
+{
+ u32 v;
+
+ if (unlikely(is->end - is->next < 4))
+ return 0;
+ v = get_unaligned_le32(is->next);
+ is->next += 4;
+ return v;
+}
+
+/* Read into @dst_buffer an array of literal bytes embedded in the bitstream.
+ * Return either a pointer to the byte past the last one written, or NULL if the
+ * read overflows the input buffer.
+ */
+static forceinline void *bitstream_read_bytes(struct input_bitstream *is,
+ void *dst_buffer, size_t count)
+{
+ if ((size_t)(is->end - is->next) < count)
+ return NULL;
+ memcpy(dst_buffer, is->next, count);
+ is->next += count;
+ return (u8 *)dst_buffer + count;
+}
+
+/* Align the input bitstream on a coding-unit boundary. */
+static forceinline void bitstream_align(struct input_bitstream *is)
+{
+ is->bitsleft = 0;
+ is->bitbuf = 0;
+}
+
+extern int make_huffman_decode_table(u16 decode_table[], const u32 num_syms,
+ const u32 num_bits, const u8 lens[],
+ const u32 max_codeword_len,
+ u16 working_space[]);
+
+
+/* Reads and returns the next Huffman-encoded symbol from a bitstream. If the
+ * input data is exhausted, the Huffman symbol is decoded as if the missing bits
+ * are all zeroes.
+ */
+static forceinline u32 read_huffsym(struct input_bitstream *istream,
+ const u16 decode_table[],
+ u32 table_bits,
+ u32 max_codeword_len)
+{
+ u32 entry;
+ u32 key_bits;
+
+ bitstream_ensure_bits(istream, max_codeword_len);
+
+ /* Index the decode table by the next table_bits bits of the input. */
+ key_bits = bitstream_peek_bits(istream, table_bits);
+ entry = decode_table[key_bits];
+ if (entry < 0xC000) {
+ /* Fast case: The decode table directly provided the
+ * symbol and codeword length. The low 11 bits are the
+ * symbol, and the high 5 bits are the codeword length.
+ */
+ bitstream_remove_bits(istream, entry >> 11);
+ return entry & 0x7FF;
+ }
+ /* Slow case: The codeword for the symbol is longer than
+ * table_bits, so the symbol does not have an entry
+ * directly in the first (1 << table_bits) entries of the
+ * decode table. Traverse the appropriate binary tree
+ * bit-by-bit to decode the symbol.
+ */
+ bitstream_remove_bits(istream, table_bits);
+ do {
+ key_bits = (entry & 0x3FFF) + bitstream_pop_bits(istream, 1);
+ } while ((entry = decode_table[key_bits]) >= 0xC000);
+ return entry;
+}
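+/*
+ * Illustrative example (not from the original code): a direct-lookup
+ * entry of 0x1805 decodes as codeword length 0x1805 >> 11 == 3 and
+ * symbol 0x1805 & 0x7FF == 5; any entry >= 0xC000 is instead a tree
+ * node whose low 14 bits index the left child.
+ */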
+
+/*
+ * Copy an LZ77 match at (dst - offset) to dst.
+ *
+ * The length and offset must be already validated --- that is, (dst - offset)
+ * can't underrun the output buffer, and (dst + length) can't overrun the output
+ * buffer. Also, the length cannot be 0.
+ *
+ * @bufend points to the byte past the end of the output buffer. This function
+ * won't write any data beyond this position.
+ *
+ * Returns dst + length.
+ */
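+/*
+ * Illustrative example (not from the original code):
+ * lz_copy(dst, 5, 1, bufend, 2) replicates the byte at dst[-1] five
+ * times, i.e. a run-length expansion of the previous byte.
+ */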
+static forceinline u8 *lz_copy(u8 *dst, u32 length, u32 offset, const u8 *bufend,
+ u32 min_length)
+{
+ const u8 *src = dst - offset;
+
+ /*
+ * Try to copy one machine word at a time. On i386 and x86_64 this is
+ * faster than copying one byte at a time, unless the data is
+ * near-random and all the matches have very short lengths. Note that
+ * since this requires unaligned memory accesses, it won't necessarily
+ * be faster on every architecture.
+ *
+ * Also note that we might copy more than the length of the match. For
+ * example, if a word is 8 bytes and the match is of length 5, then
+ * we'll simply copy 8 bytes. This is okay as long as we don't write
+ * beyond the end of the output buffer, hence the check for (bufend -
+ * end >= WORDBYTES - 1).
+ */
+#ifdef FAST_UNALIGNED_ACCESS
+ u8 * const end = dst + length;
+
+ if (bufend - end >= (ptrdiff_t)(WORDBYTES - 1)) {
+
+ if (offset >= WORDBYTES) {
+ /* The source and destination words don't overlap. */
+
+ /* To improve branch prediction, one iteration of this
+ * loop is unrolled. Most matches are short and will
+ * fail the first check. But if that check passes, then
+ * it becomes increasing likely that the match is long
+ * and we'll need to continue copying.
+ */
+
+ copy_unaligned_word(src, dst);
+ src += WORDBYTES;
+ dst += WORDBYTES;
+
+ if (dst < end) {
+ do {
+ copy_unaligned_word(src, dst);
+ src += WORDBYTES;
+ dst += WORDBYTES;
+ } while (dst < end);
+ }
+ return end;
+ } else if (offset == 1) {
+
+ /* Offset 1 matches are equivalent to run-length
+ * encoding of the previous byte. This case is common
+ * if the data contains many repeated bytes.
+ */
+ size_t v = repeat_byte(*(dst - 1));
+
+ do {
+ put_unaligned(v, (size_t *)dst);
+ src += WORDBYTES;
+ dst += WORDBYTES;
+ } while (dst < end);
+ return end;
+ }
+ /*
+ * We don't bother with special cases for other 'offset <
+ * WORDBYTES', which are usually rarer than 'offset == 1'. Extra
+ * checks will just slow things down. Actually, it's possible
+ * to handle all the 'offset < WORDBYTES' cases using the same
+ * code, but it becomes more complicated and doesn't seem any
+ * faster overall; it definitely slows down the more common
+ * 'offset == 1' case.
+ */
+ }
+#endif /* FAST_UNALIGNED_ACCESS */
+
+ /* Fall back to a bytewise copy. */
+
+ if (min_length >= 2) {
+ *dst++ = *src++;
+ length--;
+ }
+ if (min_length >= 3) {
+ *dst++ = *src++;
+ length--;
+ }
+ do {
+ *dst++ = *src++;
+ } while (--length);
+
+ return dst;
+}
diff --git a/fs/ntfs3/lib/lib.h b/fs/ntfs3/lib/lib.h
new file mode 100644
index 000000000000..f508fbad2e71
--- /dev/null
+++ b/fs/ntfs3/lib/lib.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Adapted for linux kernel by Alexander Mamaev:
+ * - remove implementations of get_unaligned_
+ * - assume GCC is always defined
+ * - ISO C90
+ * - linux kernel code style
+ */
+
+
+/* globals from xpress_decompress.c */
+struct xpress_decompressor *xpress_allocate_decompressor(void);
+void xpress_free_decompressor(struct xpress_decompressor *d);
+int xpress_decompress(struct xpress_decompressor *__restrict d,
+ const void *__restrict compressed_data,
+ size_t compressed_size,
+ void *__restrict uncompressed_data,
+ size_t uncompressed_size);
+
+/* globals from lzx_decompress.c */
+struct lzx_decompressor *lzx_allocate_decompressor(void);
+void lzx_free_decompressor(struct lzx_decompressor *d);
+int lzx_decompress(struct lzx_decompressor *__restrict d,
+ const void *__restrict compressed_data,
+ size_t compressed_size, void *__restrict uncompressed_data,
+ size_t uncompressed_size);
diff --git a/fs/ntfs3/lib/lzx_decompress.c b/fs/ntfs3/lib/lzx_decompress.c
new file mode 100644
index 000000000000..77a381a693d1
--- /dev/null
+++ b/fs/ntfs3/lib/lzx_decompress.c
@@ -0,0 +1,683 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * lzx_decompress.c - A decompressor for the LZX compression format, which can
+ * be used in "System Compressed" files. This is based on the code from wimlib.
+ * This code only supports a window size (dictionary size) of 32768 bytes, since
+ * this is the only size used in System Compression.
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "decompress_common.h"
+#include "lib.h"
+
+/* Number of literal byte values */
+#define LZX_NUM_CHARS 256
+
+/* The smallest and largest allowed match lengths */
+#define LZX_MIN_MATCH_LEN 2
+#define LZX_MAX_MATCH_LEN 257
+
+/* Number of distinct match lengths that can be represented */
+#define LZX_NUM_LENS (LZX_MAX_MATCH_LEN - LZX_MIN_MATCH_LEN + 1)
+
+/* Number of match lengths for which no length symbol is required */
+#define LZX_NUM_PRIMARY_LENS 7
+#define LZX_NUM_LEN_HEADERS (LZX_NUM_PRIMARY_LENS + 1)
+
+/* Valid values of the 3-bit block type field */
+#define LZX_BLOCKTYPE_VERBATIM 1
+#define LZX_BLOCKTYPE_ALIGNED 2
+#define LZX_BLOCKTYPE_UNCOMPRESSED 3
+
+/* Number of offset slots for a window size of 32768 */
+#define LZX_NUM_OFFSET_SLOTS 30
+
+/* Number of symbols in the main code for a window size of 32768 */
+#define LZX_MAINCODE_NUM_SYMBOLS \
+ (LZX_NUM_CHARS + (LZX_NUM_OFFSET_SLOTS * LZX_NUM_LEN_HEADERS))
+
+/* Number of symbols in the length code */
+#define LZX_LENCODE_NUM_SYMBOLS (LZX_NUM_LENS - LZX_NUM_PRIMARY_LENS)
+
+/* Number of symbols in the precode */
+#define LZX_PRECODE_NUM_SYMBOLS 20
+
+/* Number of bits in which each precode codeword length is represented */
+#define LZX_PRECODE_ELEMENT_SIZE 4
+
+/* Number of low-order bits of each match offset that are entropy-encoded in
+ * aligned offset blocks
+ */
+#define LZX_NUM_ALIGNED_OFFSET_BITS 3
+
+/* Number of symbols in the aligned offset code */
+#define LZX_ALIGNEDCODE_NUM_SYMBOLS (1 << LZX_NUM_ALIGNED_OFFSET_BITS)
+
+/* Mask for the match offset bits that are entropy-encoded in aligned offset
+ * blocks
+ */
+#define LZX_ALIGNED_OFFSET_BITMASK ((1 << LZX_NUM_ALIGNED_OFFSET_BITS) - 1)
+
+/* Number of bits in which each aligned offset codeword length is represented */
+#define LZX_ALIGNEDCODE_ELEMENT_SIZE 3
+
+/* Maximum lengths (in bits) of the codewords in each Huffman code */
+#define LZX_MAX_MAIN_CODEWORD_LEN 16
+#define LZX_MAX_LEN_CODEWORD_LEN 16
+#define LZX_MAX_PRE_CODEWORD_LEN ((1 << LZX_PRECODE_ELEMENT_SIZE) - 1)
+#define LZX_MAX_ALIGNED_CODEWORD_LEN ((1 << LZX_ALIGNEDCODE_ELEMENT_SIZE) - 1)
+
+/* The default "filesize" value used in pre/post-processing. In the LZX format
+ * used in cabinet files this value must be given to the decompressor, whereas
+ * in the LZX format used in WIM files and system-compressed files this value is
+ * fixed at 12000000.
+ */
+#define LZX_DEFAULT_FILESIZE 12000000
+
+/* Assumed block size when the encoded block size begins with a 0 bit. */
+#define LZX_DEFAULT_BLOCK_SIZE 32768
+
+/* Number of offsets in the recent (or "repeat") offsets queue. */
+#define LZX_NUM_RECENT_OFFSETS 3
+
+/* These values are chosen for fast decompression. */
+#define LZX_MAINCODE_TABLEBITS 11
+#define LZX_LENCODE_TABLEBITS 10
+#define LZX_PRECODE_TABLEBITS 6
+#define LZX_ALIGNEDCODE_TABLEBITS 7
+
+#define LZX_READ_LENS_MAX_OVERRUN 50
+
+/* Mapping: offset slot => first match offset that uses that offset slot.
+ */
+static const u32 lzx_offset_slot_base[LZX_NUM_OFFSET_SLOTS + 1] = {
+ 0, 1, 2, 3, 4, /* 0 --- 4 */
+ 6, 8, 12, 16, 24, /* 5 --- 9 */
+ 32, 48, 64, 96, 128, /* 10 --- 14 */
+ 192, 256, 384, 512, 768, /* 15 --- 19 */
+ 1024, 1536, 2048, 3072, 4096, /* 20 --- 24 */
+ 6144, 8192, 12288, 16384, 24576, /* 25 --- 29 */
+ 32768, /* extra */
+};
+
+/* Mapping: offset slot => how many extra bits must be read and added to the
+ * corresponding offset slot base to decode the match offset.
+ */
+static const u8 lzx_extra_offset_bits[LZX_NUM_OFFSET_SLOTS] = {
+ 0, 0, 0, 0, 1,
+ 1, 2, 2, 3, 3,
+ 4, 4, 5, 5, 6,
+ 6, 7, 7, 8, 8,
+ 9, 9, 10, 10, 11,
+ 11, 12, 12, 13, 13,
+};
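+/*
+ * Illustrative example (not from the original code): offset slot 9 has
+ * base 24 and 3 extra bits, so it encodes match offsets 24..31; slot 10
+ * then continues at base 32.
+ */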
+
+/* Reusable heap-allocated memory for LZX decompression */
+struct lzx_decompressor {
+
+ /* Huffman decoding tables, and arrays that map symbols to codeword
+ * lengths
+ */
+
+ u16 maincode_decode_table[(1 << LZX_MAINCODE_TABLEBITS) +
+ (LZX_MAINCODE_NUM_SYMBOLS * 2)];
+ u8 maincode_lens[LZX_MAINCODE_NUM_SYMBOLS + LZX_READ_LENS_MAX_OVERRUN];
+
+
+ u16 lencode_decode_table[(1 << LZX_LENCODE_TABLEBITS) +
+ (LZX_LENCODE_NUM_SYMBOLS * 2)];
+ u8 lencode_lens[LZX_LENCODE_NUM_SYMBOLS + LZX_READ_LENS_MAX_OVERRUN];
+
+
+ u16 alignedcode_decode_table[(1 << LZX_ALIGNEDCODE_TABLEBITS) +
+ (LZX_ALIGNEDCODE_NUM_SYMBOLS * 2)];
+ u8 alignedcode_lens[LZX_ALIGNEDCODE_NUM_SYMBOLS];
+
+ u16 precode_decode_table[(1 << LZX_PRECODE_TABLEBITS) +
+ (LZX_PRECODE_NUM_SYMBOLS * 2)];
+ u8 precode_lens[LZX_PRECODE_NUM_SYMBOLS];
+
+ /* Temporary space for make_huffman_decode_table() */
+ u16 working_space[2 * (1 + LZX_MAX_MAIN_CODEWORD_LEN) +
+ LZX_MAINCODE_NUM_SYMBOLS];
+};
+
+static void undo_e8_translation(void *target, s32 input_pos)
+{
+ s32 abs_offset, rel_offset;
+
+ abs_offset = get_unaligned_le32(target);
+ if (abs_offset >= 0) {
+ if (abs_offset < LZX_DEFAULT_FILESIZE) {
+ /* "good translation" */
+ rel_offset = abs_offset - input_pos;
+ put_unaligned_le32(rel_offset, target);
+ }
+ } else {
+ if (abs_offset >= -input_pos) {
+ /* "compensating translation" */
+ rel_offset = abs_offset + LZX_DEFAULT_FILESIZE;
+ put_unaligned_le32(rel_offset, target);
+ }
+ }
+}
+
+/*
+ * Undo the 'E8' preprocessing used in LZX. Before compression, the
+ * uncompressed data was preprocessed by changing the targets of suspected x86
+ * CALL instructions from relative offsets to absolute offsets. After
+ * match/literal decoding, the decompressor must undo the translation.
+ */
+static void lzx_postprocess(u8 *data, u32 size)
+{
+ /*
+ * A worthwhile optimization is to push the end-of-buffer check into the
+ * relatively rare E8 case. This is possible if we replace the last six
+ * bytes of data with E8 bytes; then we are guaranteed to hit an E8 byte
+ * before reaching end-of-buffer. In addition, this scheme guarantees
+ * that no translation can begin following an E8 byte in the last 10
+ * bytes because a 4-byte offset containing E8 as its high byte is a
+ * large negative number that is not valid for translation. That is
+ * exactly what we need.
+ */
+ u8 *tail;
+ u8 saved_bytes[6];
+ u8 *p;
+
+ if (size <= 10)
+ return;
+
+ tail = &data[size - 6];
+ memcpy(saved_bytes, tail, 6);
+ memset(tail, 0xE8, 6);
+ p = data;
+ for (;;) {
+ while (*p != 0xE8)
+ p++;
+ if (p >= tail)
+ break;
+ undo_e8_translation(p + 1, p - data);
+ p += 5;
+ }
+ memcpy(tail, saved_bytes, 6);
+}
+
+/* Read a Huffman-encoded symbol using the precode. */
+static forceinline u32 read_presym(const struct lzx_decompressor *d,
+ struct input_bitstream *is)
+{
+ return read_huffsym(is, d->precode_decode_table,
+ LZX_PRECODE_TABLEBITS, LZX_MAX_PRE_CODEWORD_LEN);
+}
+
+/* Read a Huffman-encoded symbol using the main code. */
+static forceinline u32 read_mainsym(const struct lzx_decompressor *d,
+ struct input_bitstream *is)
+{
+ return read_huffsym(is, d->maincode_decode_table,
+ LZX_MAINCODE_TABLEBITS, LZX_MAX_MAIN_CODEWORD_LEN);
+}
+
+/* Read a Huffman-encoded symbol using the length code. */
+static forceinline u32 read_lensym(const struct lzx_decompressor *d,
+ struct input_bitstream *is)
+{
+ return read_huffsym(is, d->lencode_decode_table,
+ LZX_LENCODE_TABLEBITS, LZX_MAX_LEN_CODEWORD_LEN);
+}
+
+/* Read a Huffman-encoded symbol using the aligned offset code. */
+static forceinline u32 read_alignedsym(const struct lzx_decompressor *d,
+ struct input_bitstream *is)
+{
+ return read_huffsym(is, d->alignedcode_decode_table,
+ LZX_ALIGNEDCODE_TABLEBITS,
+ LZX_MAX_ALIGNED_CODEWORD_LEN);
+}
+
+/*
+ * Read the precode from the compressed input bitstream, then use it to decode
+ * @num_lens codeword length values.
+ *
+ * @is: The input bitstream.
+ *
+ * @lens: An array that contains the length values from the previous time
+ * the codeword lengths for this Huffman code were read, or all 0's
+ * if this is the first time. This array must have at least
+ * (@num_lens + LZX_READ_LENS_MAX_OVERRUN) entries.
+ *
+ * @num_lens: Number of length values to decode.
+ *
+ * Returns 0 on success, or -1 if the data was invalid.
+ */
+static int lzx_read_codeword_lens(struct lzx_decompressor *d,
+ struct input_bitstream *is,
+ u8 *lens, u32 num_lens)
+{
+ u8 *len_ptr = lens;
+ u8 *lens_end = lens + num_lens;
+ int i;
+
+ /* Read the lengths of the precode codewords. These are given
+ * explicitly.
+ */
+ for (i = 0; i < LZX_PRECODE_NUM_SYMBOLS; i++) {
+ d->precode_lens[i] =
+ bitstream_read_bits(is, LZX_PRECODE_ELEMENT_SIZE);
+ }
+
+ /* Make the decoding table for the precode. */
+ if (make_huffman_decode_table(d->precode_decode_table,
+ LZX_PRECODE_NUM_SYMBOLS,
+ LZX_PRECODE_TABLEBITS,
+ d->precode_lens,
+ LZX_MAX_PRE_CODEWORD_LEN,
+ d->working_space))
+ return -1;
+
+ /* Decode the codeword lengths. */
+ do {
+ u32 presym;
+ u8 len;
+
+ /* Read the next precode symbol. */
+ presym = read_presym(d, is);
+ if (presym < 17) {
+ /* Difference from old length */
+ len = *len_ptr - presym;
+ if ((s8)len < 0)
+ len += 17;
+ *len_ptr++ = len;
+ } else {
+ /* Special RLE values */
+
+ u32 run_len;
+
+ if (presym == 17) {
+ /* Run of 0's */
+ run_len = 4 + bitstream_read_bits(is, 4);
+ len = 0;
+ } else if (presym == 18) {
+ /* Longer run of 0's */
+ run_len = 20 + bitstream_read_bits(is, 5);
+ len = 0;
+ } else {
+ /* Run of identical lengths */
+ run_len = 4 + bitstream_read_bits(is, 1);
+ presym = read_presym(d, is);
+ if (presym > 17)
+ return -1;
+ len = *len_ptr - presym;
+ if ((s8)len < 0)
+ len += 17;
+ }
+
+ do {
+ *len_ptr++ = len;
+ } while (--run_len);
+ /* Worst case overrun is when presym == 18,
+ * run_len == 20 + 31, and only 1 length was remaining.
+ * So LZX_READ_LENS_MAX_OVERRUN == 50.
+ *
+ * Overrun while reading the first half of maincode_lens
+ * can corrupt the previous values in the second half.
+ * This doesn't really matter because the resulting
+ * lengths will still be in range, and data that
+ * generates overruns is invalid anyway.
+ */
+ }
+ } while (len_ptr < lens_end);
+
+ return 0;
+}
+
+/*
+ * Read the header of an LZX block and save the block type and (uncompressed)
+ * size in *block_type_ret and *block_size_ret, respectively.
+ *
+ * If the block is compressed, also update the Huffman decode @tables with the
+ * new Huffman codes. If the block is uncompressed, also update the match
+ * offset queue @recent_offsets with the new match offsets.
+ *
+ * Return 0 on success, or -1 if the data was invalid.
+ */
+static int lzx_read_block_header(struct lzx_decompressor *d,
+ struct input_bitstream *is,
+ int *block_type_ret,
+ u32 *block_size_ret,
+ u32 recent_offsets[])
+{
+ int block_type;
+ u32 block_size;
+ int i;
+
+ bitstream_ensure_bits(is, 4);
+
+ /* The first three bits tell us what kind of block it is, and should be
+ * one of the LZX_BLOCKTYPE_* values.
+ */
+ block_type = bitstream_pop_bits(is, 3);
+
+ /* Read the block size. */
+ if (bitstream_pop_bits(is, 1)) {
+ block_size = LZX_DEFAULT_BLOCK_SIZE;
+ } else {
+ block_size = 0;
+ block_size |= bitstream_read_bits(is, 8);
+ block_size <<= 8;
+ block_size |= bitstream_read_bits(is, 8);
+ }
+
+ switch (block_type) {
+
+ case LZX_BLOCKTYPE_ALIGNED:
+
+ /* Read the aligned offset code and prepare its decode table.
+ */
+
+ for (i = 0; i < LZX_ALIGNEDCODE_NUM_SYMBOLS; i++) {
+ d->alignedcode_lens[i] =
+ bitstream_read_bits(is,
+ LZX_ALIGNEDCODE_ELEMENT_SIZE);
+ }
+
+ if (make_huffman_decode_table(d->alignedcode_decode_table,
+ LZX_ALIGNEDCODE_NUM_SYMBOLS,
+ LZX_ALIGNEDCODE_TABLEBITS,
+ d->alignedcode_lens,
+ LZX_MAX_ALIGNED_CODEWORD_LEN,
+ d->working_space))
+ return -1;
+
+ /* Fall through, since the rest of the header for aligned offset
+ * blocks is the same as that for verbatim blocks.
+ */
+ fallthrough;
+
+ case LZX_BLOCKTYPE_VERBATIM:
+
+ /* Read the main code and prepare its decode table.
+ *
+ * Note that the codeword lengths in the main code are encoded
+ * in two parts: one part for literal symbols, and one part for
+ * match symbols.
+ */
+
+ if (lzx_read_codeword_lens(d, is, d->maincode_lens,
+ LZX_NUM_CHARS))
+ return -1;
+
+ if (lzx_read_codeword_lens(d, is,
+ d->maincode_lens + LZX_NUM_CHARS,
+ LZX_MAINCODE_NUM_SYMBOLS - LZX_NUM_CHARS))
+ return -1;
+
+ if (make_huffman_decode_table(d->maincode_decode_table,
+ LZX_MAINCODE_NUM_SYMBOLS,
+ LZX_MAINCODE_TABLEBITS,
+ d->maincode_lens,
+ LZX_MAX_MAIN_CODEWORD_LEN,
+ d->working_space))
+ return -1;
+
+ /* Read the length code and prepare its decode table. */
+
+ if (lzx_read_codeword_lens(d, is, d->lencode_lens,
+ LZX_LENCODE_NUM_SYMBOLS))
+ return -1;
+
+ if (make_huffman_decode_table(d->lencode_decode_table,
+ LZX_LENCODE_NUM_SYMBOLS,
+ LZX_LENCODE_TABLEBITS,
+ d->lencode_lens,
+ LZX_MAX_LEN_CODEWORD_LEN,
+ d->working_space))
+ return -1;
+
+ break;
+
+ case LZX_BLOCKTYPE_UNCOMPRESSED:
+
+ /* Before reading the three recent offsets from the uncompressed
+ * block header, the stream must be aligned on a 16-bit
+ * boundary. But if the stream is *already* aligned, then the
+ * next 16 bits must be discarded.
+ */
+ bitstream_ensure_bits(is, 1);
+ bitstream_align(is);
+
+ recent_offsets[0] = bitstream_read_u32(is);
+ recent_offsets[1] = bitstream_read_u32(is);
+ recent_offsets[2] = bitstream_read_u32(is);
+
+ /* Offsets of 0 are invalid. */
+ if (recent_offsets[0] == 0 || recent_offsets[1] == 0 ||
+ recent_offsets[2] == 0)
+ return -1;
+ break;
+
+ default:
+ /* Unrecognized block type. */
+ return -1;
+ }
+
+ *block_type_ret = block_type;
+ *block_size_ret = block_size;
+ return 0;
+}
+
+/* Decompress a block of LZX-compressed data. */
+static int lzx_decompress_block(const struct lzx_decompressor *d,
+ struct input_bitstream *is,
+ int block_type, u32 block_size,
+ u8 * const out_begin, u8 *out_next,
+ u32 recent_offsets[])
+{
+ u8 * const block_end = out_next + block_size;
+ u32 ones_if_aligned = 0U - (block_type == LZX_BLOCKTYPE_ALIGNED);
+
+ do {
+ u32 mainsym;
+ u32 match_len;
+ u32 match_offset;
+ u32 offset_slot;
+ u32 num_extra_bits;
+
+ mainsym = read_mainsym(d, is);
+ if (mainsym < LZX_NUM_CHARS) {
+ /* Literal */
+ *out_next++ = mainsym;
+ continue;
+ }
+
+ /* Match */
+
+ /* Decode the length header and offset slot. */
+ mainsym -= LZX_NUM_CHARS;
+ match_len = mainsym % LZX_NUM_LEN_HEADERS;
+ offset_slot = mainsym / LZX_NUM_LEN_HEADERS;
+
+ /* If needed, read a length symbol to decode the full length. */
+ if (match_len == LZX_NUM_PRIMARY_LENS)
+ match_len += read_lensym(d, is);
+ match_len += LZX_MIN_MATCH_LEN;
+
+ if (offset_slot < LZX_NUM_RECENT_OFFSETS) {
+ /* Repeat offset */
+
+ /* Note: This isn't a real LRU queue, since using the R2
+ * offset doesn't bump the R1 offset down to R2. This
+ * quirk allows all 3 recent offsets to be handled by
+ * the same code. (For R0, the swap is a no-op.)
+ */
+ match_offset = recent_offsets[offset_slot];
+ recent_offsets[offset_slot] = recent_offsets[0];
+ recent_offsets[0] = match_offset;
+ } else {
+ /* Explicit offset */
+
+ /* Look up the number of extra bits that need to be read
+ * to decode offsets with this offset slot.
+ */
+ num_extra_bits = lzx_extra_offset_bits[offset_slot];
+
+ /* Start with the offset slot base value. */
+ match_offset = lzx_offset_slot_base[offset_slot];
+
+ /* In aligned offset blocks, the low-order 3 bits of
+ * each offset are encoded using the aligned offset
+ * code. Otherwise, all the extra bits are literal.
+ */
+
+ if ((num_extra_bits & ones_if_aligned) >= LZX_NUM_ALIGNED_OFFSET_BITS) {
+ match_offset +=
+ bitstream_read_bits(is, num_extra_bits -
+ LZX_NUM_ALIGNED_OFFSET_BITS)
+ << LZX_NUM_ALIGNED_OFFSET_BITS;
+ match_offset += read_alignedsym(d, is);
+ } else {
+ match_offset += bitstream_read_bits(is, num_extra_bits);
+ }
+
+ /* Adjust the offset. */
+ match_offset -= (LZX_NUM_RECENT_OFFSETS - 1);
+
+ /* Update the recent offsets. */
+ recent_offsets[2] = recent_offsets[1];
+ recent_offsets[1] = recent_offsets[0];
+ recent_offsets[0] = match_offset;
+ }
+
+ /* Validate the match, then copy it to the current position. */
+
+ if (match_len > (size_t)(block_end - out_next))
+ return -1;
+
+ if (match_offset > (size_t)(out_next - out_begin))
+ return -1;
+
+ out_next = lz_copy(out_next, match_len, match_offset,
+ block_end, LZX_MIN_MATCH_LEN);
+
+ } while (out_next != block_end);
+
+ return 0;
+}
+
+/*
+ * lzx_allocate_decompressor - Allocate an LZX decompressor
+ *
+ * Return a pointer to the decompressor on success, or NULL on allocation
+ * failure.
+ */
+struct lzx_decompressor *lzx_allocate_decompressor(void)
+{
+ return kmalloc(sizeof(struct lzx_decompressor), GFP_NOFS);
+}
+
+/*
+ * lzx_decompress - Decompress a buffer of LZX-compressed data
+ *
+ * @decompressor: A decompressor allocated with lzx_allocate_decompressor()
+ * @compressed_data: The buffer of data to decompress
+ * @compressed_size: Number of bytes of compressed data
+ * @uncompressed_data: The buffer in which to store the decompressed data
+ * @uncompressed_size: The number of bytes the data decompresses into
+ *
+ * Return 0 on success, or -1 on failure.
+ */
+int lzx_decompress(struct lzx_decompressor *decompressor,
+ const void *compressed_data, size_t compressed_size,
+ void *uncompressed_data, size_t uncompressed_size)
+{
+ struct lzx_decompressor *d = decompressor;
+ u8 * const out_begin = uncompressed_data;
+ u8 *out_next = out_begin;
+ u8 * const out_end = out_begin + uncompressed_size;
+ struct input_bitstream is;
+ u32 recent_offsets[LZX_NUM_RECENT_OFFSETS] = {1, 1, 1};
+ int e8_status = 0;
+
+ init_input_bitstream(&is, compressed_data, compressed_size);
+
+ /* Codeword lengths begin as all 0's for delta encoding purposes. */
+ memset(d->maincode_lens, 0, LZX_MAINCODE_NUM_SYMBOLS);
+ memset(d->lencode_lens, 0, LZX_LENCODE_NUM_SYMBOLS);
+
+ /* Decompress blocks until we have all the uncompressed data. */
+
+ while (out_next != out_end) {
+ int block_type;
+ u32 block_size;
+
+ if (lzx_read_block_header(d, &is, &block_type, &block_size,
+ recent_offsets))
+ goto invalid;
+
+ if (block_size < 1 || block_size > (size_t)(out_end - out_next))
+ goto invalid;
+
+ if (block_type != LZX_BLOCKTYPE_UNCOMPRESSED) {
+
+ /* Compressed block */
+
+ if (lzx_decompress_block(d,
+ &is,
+ block_type,
+ block_size,
+ out_begin,
+ out_next,
+ recent_offsets))
+ goto invalid;
+
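+ /* A nonzero codeword length for symbol 0xe8 means this
+ * block's main code can emit 0xe8 literals; matches only
+ * copy bytes that were already output.
+ */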
+ e8_status |= d->maincode_lens[0xe8];
+ out_next += block_size;
+ } else {
+ /* Uncompressed block */
+
+ out_next = bitstream_read_bytes(&is, out_next,
+ block_size);
+ if (!out_next)
+ goto invalid;
+
+ if (block_size & 1)
+ bitstream_read_byte(&is);
+
+ e8_status = 1;
+ }
+ }
+
+ /* Postprocess the data unless it cannot possibly contain 0xe8 bytes. */
+ if (e8_status)
+ lzx_postprocess(uncompressed_data, uncompressed_size);
+
+ return 0;
+
+invalid:
+ return -1;
+}
+
+/*
+ * lzx_free_decompressor - Free an LZX decompressor
+ *
+ * @decompressor: A decompressor that was allocated with
+ * lzx_allocate_decompressor(), or NULL.
+ */
+void lzx_free_decompressor(struct lzx_decompressor *decompressor)
+{
+ kfree(decompressor);
+}
diff --git a/fs/ntfs3/lib/xpress_decompress.c b/fs/ntfs3/lib/xpress_decompress.c
new file mode 100644
index 000000000000..3d98f36a981e
--- /dev/null
+++ b/fs/ntfs3/lib/xpress_decompress.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * xpress_decompress.c - A decompressor for the XPRESS compression format
+ * (Huffman variant), which can be used in "System Compressed" files. This is
+ * based on the code from wimlib.
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "decompress_common.h"
+#include "lib.h"
+
+#define XPRESS_NUM_SYMBOLS 512
+#define XPRESS_MAX_CODEWORD_LEN 15
+#define XPRESS_MIN_MATCH_LEN 3
+
+/* This value is chosen for fast decompression. */
+#define XPRESS_TABLEBITS 12
+
+/* Reusable heap-allocated memory for XPRESS decompression */
+struct xpress_decompressor {
+
+ /* The Huffman decoding table */
+ u16 decode_table[(1 << XPRESS_TABLEBITS) + 2 * XPRESS_NUM_SYMBOLS];
+
+ /* An array that maps symbols to codeword lengths */
+ u8 lens[XPRESS_NUM_SYMBOLS];
+
+ /* Temporary space for make_huffman_decode_table() */
+ u16 working_space[2 * (1 + XPRESS_MAX_CODEWORD_LEN) +
+ XPRESS_NUM_SYMBOLS];
+};
+
+/*
+ * xpress_allocate_decompressor - Allocate an XPRESS decompressor
+ *
+ * Return a pointer to the decompressor on success, or NULL on allocation
+ * failure.
+ */
+struct xpress_decompressor *xpress_allocate_decompressor(void)
+{
+ return kmalloc(sizeof(struct xpress_decompressor), GFP_NOFS);
+}
+
+/*
+ * xpress_decompress - Decompress a buffer of XPRESS-compressed data
+ *
+ * @decompressor: A decompressor that was allocated with
+ * xpress_allocate_decompressor()
+ * @compressed_data: The buffer of data to decompress
+ * @compressed_size: Number of bytes of compressed data
+ * @uncompressed_data: The buffer in which to store the decompressed data
+ * @uncompressed_size: The number of bytes the data decompresses into
+ *
+ * Return 0 on success, or -1 on failure.
+ */
+int xpress_decompress(struct xpress_decompressor *decompressor,
+ const void *compressed_data, size_t compressed_size,
+ void *uncompressed_data, size_t uncompressed_size)
+{
+ struct xpress_decompressor *d = decompressor;
+ const u8 * const in_begin = compressed_data;
+ u8 * const out_begin = uncompressed_data;
+ u8 *out_next = out_begin;
+ u8 * const out_end = out_begin + uncompressed_size;
+ struct input_bitstream is;
+ u32 i;
+
+ /* Read the Huffman codeword lengths. */
+ if (compressed_size < XPRESS_NUM_SYMBOLS / 2)
+ goto invalid;
+ for (i = 0; i < XPRESS_NUM_SYMBOLS / 2; i++) {
+ d->lens[i*2 + 0] = in_begin[i] & 0xF;
+ d->lens[i*2 + 1] = in_begin[i] >> 4;
+ }
+
+ /* Build a decoding table for the Huffman code. */
+ if (make_huffman_decode_table(d->decode_table, XPRESS_NUM_SYMBOLS,
+ XPRESS_TABLEBITS, d->lens,
+ XPRESS_MAX_CODEWORD_LEN,
+ d->working_space))
+ goto invalid;
+
+ /* Decode the matches and literals. */
+
+ init_input_bitstream(&is, in_begin + XPRESS_NUM_SYMBOLS / 2,
+ compressed_size - XPRESS_NUM_SYMBOLS / 2);
+
+ while (out_next != out_end) {
+ u32 sym;
+ u32 log2_offset;
+ u32 length;
+ u32 offset;
+
+ sym = read_huffsym(&is, d->decode_table,
+ XPRESS_TABLEBITS, XPRESS_MAX_CODEWORD_LEN);
+ if (sym < 256) {
+ /* Literal */
+ *out_next++ = sym;
+ } else {
+ /* Match */
+ length = sym & 0xf;
+ log2_offset = (sym >> 4) & 0xf;
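+ /* Symbols >= 256 encode a match: the low 4 bits hold the
+ * length header and the next 4 bits hold log2 of the offset.
+ * E.g. (illustrative) log2_offset == 3 with extra bits 0b101
+ * gives offset = (1 << 3) | 5 == 13.
+ */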
+
+ bitstream_ensure_bits(&is, 16);
+
+ offset = ((u32)1 << log2_offset) |
+ bitstream_pop_bits(&is, log2_offset);
+
+ if (length == 0xf) {
+ length += bitstream_read_byte(&is);
+ if (length == 0xf + 0xff)
+ length = bitstream_read_u16(&is);
+ }
+ length += XPRESS_MIN_MATCH_LEN;
+
+ if (offset > (size_t)(out_next - out_begin))
+ goto invalid;
+
+ if (length > (size_t)(out_end - out_next))
+ goto invalid;
+
+ out_next = lz_copy(out_next, length, offset, out_end,
+ XPRESS_MIN_MATCH_LEN);
+ }
+ }
+ return 0;
+
+invalid:
+ return -1;
+}
+
+/*
+ * xpress_free_decompressor - Free an XPRESS decompressor
+ *
+ * @decompressor: A decompressor that was allocated with
+ * xpress_allocate_decompressor(), or NULL.
+ */
+void xpress_free_decompressor(struct xpress_decompressor *decompressor)
+{
+ kfree(decompressor);
+}
diff --git a/fs/ntfs3/lznt.c b/fs/ntfs3/lznt.c
new file mode 100644
index 000000000000..ead9ab7d69b3
--- /dev/null
+++ b/fs/ntfs3/lznt.c
@@ -0,0 +1,452 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+/* The source buffer is all zeros. */
+#define LZNT_ERROR_ALL_ZEROS 1
+#define LZNT_CHUNK_SIZE 0x1000
+// clang-format on
+
+struct lznt_hash {
+ const u8 *p1;
+ const u8 *p2;
+};
+
+struct lznt {
+ const u8 *unc;
+ const u8 *unc_end;
+ const u8 *best_match;
+ size_t max_len;
+ bool std;
+
+ struct lznt_hash hash[LZNT_CHUNK_SIZE];
+};
+
+static inline size_t get_match_len(const u8 *ptr, const u8 *end, const u8 *prev,
+ size_t max_len)
+{
+ size_t len = 0;
+
+ while (ptr + len < end && ptr[len] == prev[len] && ++len < max_len)
+ ;
+ return len;
+}
+
+static size_t longest_match_std(const u8 *src, struct lznt *ctx)
+{
+ size_t hash_index;
+ size_t len1 = 0, len2 = 0;
+ const u8 **hash;
+
+ hash_index =
+ ((40543U * ((((src[0] << 4) ^ src[1]) << 4) ^ src[2])) >> 4) &
+ (LZNT_CHUNK_SIZE - 1);
+
+ hash = &(ctx->hash[hash_index].p1);
+
+ if (hash[0] >= ctx->unc && hash[0] < src && hash[0][0] == src[0] &&
+ hash[0][1] == src[1] && hash[0][2] == src[2]) {
+ len1 = 3;
+ if (ctx->max_len > 3)
+ len1 += get_match_len(src + 3, ctx->unc_end,
+ hash[0] + 3, ctx->max_len - 3);
+ }
+
+ if (hash[1] >= ctx->unc && hash[1] < src && hash[1][0] == src[0] &&
+ hash[1][1] == src[1] && hash[1][2] == src[2]) {
+ len2 = 3;
+ if (ctx->max_len > 3)
+ len2 += get_match_len(src + 3, ctx->unc_end,
+ hash[1] + 3, ctx->max_len - 3);
+ }
+
+ /* Compare two matches and select the best one */
+ if (len1 < len2) {
+ ctx->best_match = hash[1];
+ len1 = len2;
+ } else {
+ ctx->best_match = hash[0];
+ }
+
+ hash[1] = hash[0];
+ hash[0] = src;
+ return len1;
+}
+
+static size_t longest_match_best(const u8 *src, struct lznt *ctx)
+{
+ size_t max_len;
+ const u8 *ptr;
+
+ if (ctx->unc >= src || !ctx->max_len)
+ return 0;
+
+ max_len = 0;
+ for (ptr = ctx->unc; ptr < src; ++ptr) {
+ size_t len =
+ get_match_len(src, ctx->unc_end, ptr, ctx->max_len);
+ if (len >= max_len) {
+ max_len = len;
+ ctx->best_match = ptr;
+ }
+ }
+
+ return max_len >= 3 ? max_len : 0;
+}
+
+static const size_t s_max_len[] = {
+ 0x1002, 0x802, 0x402, 0x202, 0x102, 0x82, 0x42, 0x22, 0x12,
+};
+
+static const size_t s_max_off[] = {
+ 0x10, 0x20, 0x40, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000,
+};
+
+static inline u16 make_pair(size_t offset, size_t len, size_t index)
+{
+ return ((offset - 1) << (12 - index)) |
+ ((len - 3) & (((1 << (12 - index)) - 1)));
+}
+
+static inline size_t parse_pair(u16 pair, size_t *offset, size_t index)
+{
+ *offset = 1 + (pair >> (12 - index));
+ return 3 + (pair & ((1 << (12 - index)) - 1));
+}
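+
+/*
+ * Example (illustrative): with index 0 the pair splits into a 4-bit offset
+ * field and a 12-bit length field, so make_pair(4, 10, 0) yields
+ * (3 << 12) | 7 == 0x3007, and parse_pair(0x3007, &offset, 0) restores
+ * offset == 4 and returns length 10. As the position in the chunk
+ * advances, 'index' grows and bits move from the length field to the
+ * offset field.
+ */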
+
+/*
+ * compress_chunk
+ *
+ * Returns one of three values:
+ * 0 - ok, 'cmpr' contains 'cmpr_chunk_size' bytes of compressed data
+ * 1 - the input buffer is all zeros
+ * -2 - the compressed buffer is too small to hold the compressed data
+ */
+static inline int compress_chunk(size_t (*match)(const u8 *, struct lznt *),
+ const u8 *unc, const u8 *unc_end, u8 *cmpr,
+ u8 *cmpr_end, size_t *cmpr_chunk_size,
+ struct lznt *ctx)
+{
+ size_t cnt = 0;
+ size_t idx = 0;
+ const u8 *up = unc;
+ u8 *cp = cmpr + 3;
+ u8 *cp2 = cmpr + 2;
+ u8 not_zero = 0;
+ /* Control byte of 8 flag bits: 0 - copy the byte as is, 1 - a length/offset pair follows. */
+ u8 ohdr = 0;
+ u8 *last;
+ u16 t16;
+
+ if (unc + LZNT_CHUNK_SIZE < unc_end)
+ unc_end = unc + LZNT_CHUNK_SIZE;
+
+ last = min(cmpr + LZNT_CHUNK_SIZE + sizeof(short), cmpr_end);
+
+ ctx->unc = unc;
+ ctx->unc_end = unc_end;
+ ctx->max_len = s_max_len[0];
+
+ while (up < unc_end) {
+ size_t max_len;
+
+ while (unc + s_max_off[idx] < up)
+ ctx->max_len = s_max_len[++idx];
+
+ // Find match
+ max_len = up + 3 <= unc_end ? (*match)(up, ctx) : 0;
+
+ if (!max_len) {
+ if (cp >= last)
+ goto NotCompressed;
+ not_zero |= *cp++ = *up++;
+ } else if (cp + 1 >= last) {
+ goto NotCompressed;
+ } else {
+ t16 = make_pair(up - ctx->best_match, max_len, idx);
+ *cp++ = t16;
+ *cp++ = t16 >> 8;
+
+ ohdr |= 1 << cnt;
+ up += max_len;
+ }
+
+ cnt = (cnt + 1) & 7;
+ if (!cnt) {
+ *cp2 = ohdr;
+ ohdr = 0;
+ cp2 = cp;
+ cp += 1;
+ }
+ }
+
+ if (cp2 < last)
+ *cp2 = ohdr;
+ else
+ cp -= 1;
+
+ *cmpr_chunk_size = cp - cmpr;
+
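+ /*
+ * Chunk header layout: bit 15 is the "compressed" flag, bits 12-14
+ * are the signature (3) and bits 0-11 hold the total chunk size
+ * minus 3.
+ */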
+ t16 = (*cmpr_chunk_size - 3) | 0xB000;
+ cmpr[0] = t16;
+ cmpr[1] = t16 >> 8;
+
+ return not_zero ? 0 : LZNT_ERROR_ALL_ZEROS;
+
+NotCompressed:
+
+ if ((cmpr + LZNT_CHUNK_SIZE + sizeof(short)) > last)
+ return -2;
+
+ /*
+ * Store the chunk uncompressed.
+ * 0x3FFF == ((LZNT_CHUNK_SIZE + 2 - 3) | 0x3000)
+ */
+ cmpr[0] = 0xff;
+ cmpr[1] = 0x3f;
+
+ memcpy(cmpr + sizeof(short), unc, LZNT_CHUNK_SIZE);
+ *cmpr_chunk_size = LZNT_CHUNK_SIZE + sizeof(short);
+
+ return 0;
+}
+
+static inline ssize_t decompress_chunk(u8 *unc, u8 *unc_end, const u8 *cmpr,
+ const u8 *cmpr_end)
+{
+ u8 *up = unc;
+ u8 ch = *cmpr++;
+ size_t bit = 0;
+ size_t index = 0;
+ u16 pair;
+ size_t offset, length;
+
+ /* Decompress while both pointers are inside their ranges. */
+ while (up < unc_end && cmpr < cmpr_end) {
+ /* Correct index */
+ while (unc + s_max_off[index] < up)
+ index += 1;
+
+ /* Check the current flag for zero */
+ if (!(ch & (1 << bit))) {
+ /* Just copy byte */
+ *up++ = *cmpr++;
+ goto next;
+ }
+
+ /* Check for boundary */
+ if (cmpr + 1 >= cmpr_end)
+ return -EINVAL;
+
+ /* Read a short from little endian stream */
+ pair = cmpr[1];
+ pair <<= 8;
+ pair |= cmpr[0];
+
+ cmpr += 2;
+
+ /* Translate packed information into offset and length */
+ length = parse_pair(pair, &offset, index);
+
+ /* Check offset for boundary */
+ if (unc + offset > up)
+ return -EINVAL;
+
+ /* Truncate the length if necessary */
+ if (up + length >= unc_end)
+ length = unc_end - up;
+
+ /* Now we copy bytes. This is the heart of LZ algorithm. */
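+ /* The copy must be byte-by-byte: 'offset' may be smaller than
+ * 'length', e.g. offset == 1 replicates the previous byte.
+ */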
+ for (; length > 0; length--, up++)
+ *up = *(up - offset);
+
+next:
+ /* Advance flag bit value */
+ bit = (bit + 1) & 7;
+
+ if (!bit) {
+ if (cmpr >= cmpr_end)
+ break;
+
+ ch = *cmpr++;
+ }
+ }
+
+ /* return the size of uncompressed data */
+ return up - unc;
+}
+
+/*
+ * get_lznt_ctx
+ *
+ * level 0 - standard compression
+ * any other level - best compression, requires a lot of CPU
+ */
+struct lznt *get_lznt_ctx(int level)
+{
+ struct lznt *r = ntfs_zalloc(level ? offsetof(struct lznt, hash)
+ : sizeof(struct lznt));
+
+ if (r)
+ r->std = !level;
+ return r;
+}
+
+/*
+ * compress_lznt
+ *
+ * Compresses "unc" into "cmpr"
+ * +x - ok, 'cmpr' contains 'final_compressed_size' bytes of compressed data
+ * 0 - input buffer is full zero
+ */
+size_t compress_lznt(const void *unc, size_t unc_size, void *cmpr,
+ size_t cmpr_size, struct lznt *ctx)
+{
+ int err;
+ size_t (*match)(const u8 *src, struct lznt *ctx);
+ u8 *p = cmpr;
+ u8 *end = p + cmpr_size;
+ const u8 *unc_chunk = unc;
+ const u8 *unc_end = unc_chunk + unc_size;
+ bool is_zero = true;
+
+ if (ctx->std) {
+ match = &longest_match_std;
+ memset(ctx->hash, 0, sizeof(ctx->hash));
+ } else {
+ match = &longest_match_best;
+ }
+
+ /* compression cycle */
+ for (; unc_chunk < unc_end; unc_chunk += LZNT_CHUNK_SIZE) {
+ cmpr_size = 0;
+ err = compress_chunk(match, unc_chunk, unc_end, p, end,
+ &cmpr_size, ctx);
+ if (err < 0)
+ return unc_size;
+
+ if (is_zero && err != LZNT_ERROR_ALL_ZEROS)
+ is_zero = false;
+
+ p += cmpr_size;
+ }
+
+ if (p <= end - 2)
+ p[0] = p[1] = 0;
+
+ return is_zero ? 0 : PtrOffset(cmpr, p);
+}
+
+/*
+ * decompress_lznt
+ *
+ * decompresses "cmpr" into "unc"
+ */
+ssize_t decompress_lznt(const void *cmpr, size_t cmpr_size, void *unc,
+ size_t unc_size)
+{
+ const u8 *cmpr_chunk = cmpr;
+ const u8 *cmpr_end = cmpr_chunk + cmpr_size;
+ u8 *unc_chunk = unc;
+ u8 *unc_end = unc_chunk + unc_size;
+ u16 chunk_hdr;
+
+ if (cmpr_size < sizeof(short))
+ return -EINVAL;
+
+ /* read chunk header */
+ chunk_hdr = cmpr_chunk[1];
+ chunk_hdr <<= 8;
+ chunk_hdr |= cmpr_chunk[0];
+
+ /* loop through decompressing chunks */
+ for (;;) {
+ size_t chunk_size_saved;
+ size_t unc_use;
+ size_t cmpr_use = 3 + (chunk_hdr & (LZNT_CHUNK_SIZE - 1));
+
+ /* Check that the chunk actually fits the supplied buffer */
+ if (cmpr_chunk + cmpr_use > cmpr_end)
+ return -EINVAL;
+
+ /* First make sure the chunk contains compressed data */
+ if (chunk_hdr & 0x8000) {
+ /* Decompress a chunk and return if we get an error */
+ ssize_t err =
+ decompress_chunk(unc_chunk, unc_end,
+ cmpr_chunk + sizeof(chunk_hdr),
+ cmpr_chunk + cmpr_use);
+ if (err < 0)
+ return err;
+ unc_use = err;
+ } else {
+ /* This chunk does not contain compressed data */
+ unc_use = unc_chunk + LZNT_CHUNK_SIZE > unc_end
+ ? unc_end - unc_chunk
+ : LZNT_CHUNK_SIZE;
+
+ if (cmpr_chunk + sizeof(chunk_hdr) + unc_use >
+ cmpr_end) {
+ return -EINVAL;
+ }
+
+ memcpy(unc_chunk, cmpr_chunk + sizeof(chunk_hdr),
+ unc_use);
+ }
+
+ /* Advance pointers */
+ cmpr_chunk += cmpr_use;
+ unc_chunk += unc_use;
+
+ /* Check for the end of unc buffer */
+ if (unc_chunk >= unc_end)
+ break;
+
+ /* Proceed to the next chunk */
+ if (cmpr_chunk > cmpr_end - 2)
+ break;
+
+ chunk_size_saved = LZNT_CHUNK_SIZE;
+
+ /* read chunk header */
+ chunk_hdr = cmpr_chunk[1];
+ chunk_hdr <<= 8;
+ chunk_hdr |= cmpr_chunk[0];
+
+ if (!chunk_hdr)
+ break;
+
+ /* Check the size of unc buffer */
+ if (unc_use < chunk_size_saved) {
+ size_t t1 = chunk_size_saved - unc_use;
+ u8 *t2 = unc_chunk + t1;
+
+ /* 'Zero' memory */
+ if (t2 >= unc_end)
+ break;
+
+ memset(unc_chunk, 0, t1);
+ unc_chunk = t2;
+ }
+ }
+
+ /* Check compression boundary */
+ if (cmpr_chunk > cmpr_end)
+ return -EINVAL;
+
+ /*
+ * The uncompressed size is the difference between the
+ * current pointer and the original one.
+ */
+ return PtrOffset(unc, unc_chunk);
+}
--
2.30.0
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15-rc1
commit be71b5cba2e6485e8959da7a9f9a44461a1bb074
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds attrib operations
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/attrib.c | 2096 +++++++++++++++++++++++++++++++++++++++++++
fs/ntfs3/attrlist.c | 456 ++++++++++
fs/ntfs3/xattr.c | 1128 +++++++++++++++++++++++
3 files changed, 3680 insertions(+)
create mode 100644 fs/ntfs3/attrib.c
create mode 100644 fs/ntfs3/attrlist.c
create mode 100644 fs/ntfs3/xattr.c
diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c
new file mode 100644
index 000000000000..046dc57f75f2
--- /dev/null
+++ b/fs/ntfs3/attrib.c
@@ -0,0 +1,2096 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * TODO: merge attr_set_size/attr_data_get_block/attr_allocate_frame?
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/nls.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * NTFS_MIN_LOG2_OF_CLUMP and NTFS_MAX_LOG2_OF_CLUMP may be defined
+ * externally to tune the preallocation algorithm.
+ */
+#ifndef NTFS_MIN_LOG2_OF_CLUMP
+#define NTFS_MIN_LOG2_OF_CLUMP 16
+#endif
+
+#ifndef NTFS_MAX_LOG2_OF_CLUMP
+#define NTFS_MAX_LOG2_OF_CLUMP 26
+#endif
+
+// 16M
+#define NTFS_CLUMP_MIN (1 << (NTFS_MIN_LOG2_OF_CLUMP + 8))
+// 16G
+#define NTFS_CLUMP_MAX (1ull << (NTFS_MAX_LOG2_OF_CLUMP + 8))
+
+/*
+ * get_pre_allocated
+ *
+ * Return 'size' rounded up to a clump boundary that grows with the
+ * file size.
+ */
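+/*
+ * Example (illustrative): for size == 1 GiB, size >> 24 == 64, so
+ * align_shift == 15 + __ffs(64) == 21 and the size is rounded up to
+ * a 2 MiB boundary; small files round to 64 KiB, huge ones to 64 MiB.
+ */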
+static inline u64 get_pre_allocated(u64 size)
+{
+ u32 clump;
+ u8 align_shift;
+ u64 ret;
+
+ if (size <= NTFS_CLUMP_MIN) {
+ clump = 1 << NTFS_MIN_LOG2_OF_CLUMP;
+ align_shift = NTFS_MIN_LOG2_OF_CLUMP;
+ } else if (size >= NTFS_CLUMP_MAX) {
+ clump = 1 << NTFS_MAX_LOG2_OF_CLUMP;
+ align_shift = NTFS_MAX_LOG2_OF_CLUMP;
+ } else {
+ align_shift = NTFS_MIN_LOG2_OF_CLUMP - 1 +
+ __ffs(size >> (8 + NTFS_MIN_LOG2_OF_CLUMP));
+ clump = 1u << align_shift;
+ }
+
+ ret = (((size + clump - 1) >> align_shift)) << align_shift;
+
+ return ret;
+}
+
+/*
+ * attr_must_be_resident
+ *
+ * returns true if attribute must be resident
+ */
+static inline bool attr_must_be_resident(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE type)
+{
+ const struct ATTR_DEF_ENTRY *de;
+
+ switch (type) {
+ case ATTR_STD:
+ case ATTR_NAME:
+ case ATTR_ID:
+ case ATTR_LABEL:
+ case ATTR_VOL_INFO:
+ case ATTR_ROOT:
+ case ATTR_EA_INFO:
+ return true;
+ default:
+ de = ntfs_query_def(sbi, type);
+ if (de && (de->flags & NTFS_ATTR_MUST_BE_RESIDENT))
+ return true;
+ return false;
+ }
+}
+
+/*
+ * attr_load_runs
+ *
+ * load all runs stored in 'attr'
+ */
+int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni,
+ struct runs_tree *run, const CLST *vcn)
+{
+ int err;
+ CLST svcn = le64_to_cpu(attr->nres.svcn);
+ CLST evcn = le64_to_cpu(attr->nres.evcn);
+ u32 asize;
+ u16 run_off;
+
+ if (svcn >= evcn + 1 || run_is_mapped_full(run, svcn, evcn))
+ return 0;
+
+ if (vcn && (evcn < *vcn || *vcn < svcn))
+ return -EINVAL;
+
+ asize = le32_to_cpu(attr->size);
+ run_off = le16_to_cpu(attr->nres.run_off);
+ err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn,
+ vcn ? *vcn : svcn, Add2Ptr(attr, run_off),
+ asize - run_off);
+ if (err < 0)
+ return err;
+
+ return 0;
+}
+
+/*
+ * run_deallocate_ex
+ *
+ * Deallocate clusters
+ */
+static int run_deallocate_ex(struct ntfs_sb_info *sbi, struct runs_tree *run,
+ CLST vcn, CLST len, CLST *done, bool trim)
+{
+ int err = 0;
+ CLST vcn_next, vcn0 = vcn, lcn, clen, dn = 0;
+ size_t idx;
+
+ if (!len)
+ goto out;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+failed:
+ run_truncate(run, vcn0);
+ err = -EINVAL;
+ goto out;
+ }
+
+ for (;;) {
+ if (clen > len)
+ clen = len;
+
+ if (!clen) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (lcn != SPARSE_LCN) {
+ mark_as_free_ex(sbi, lcn, clen, trim);
+ dn += clen;
+ }
+
+ len -= clen;
+ if (!len)
+ break;
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ // save memory - don't load entire run
+ goto failed;
+ }
+ }
+
+out:
+ if (done)
+ *done += dn;
+
+ return err;
+}
+
+/*
+ * attr_allocate_clusters
+ *
+ * find free space, mark it as used and store in 'run'
+ */
+int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run,
+ CLST vcn, CLST lcn, CLST len, CLST *pre_alloc,
+ enum ALLOCATE_OPT opt, CLST *alen, const size_t fr,
+ CLST *new_lcn)
+{
+ int err;
+ CLST flen, vcn0 = vcn, pre = pre_alloc ? *pre_alloc : 0;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+ size_t cnt = run->count;
+
+ for (;;) {
+ err = ntfs_look_for_free_space(sbi, lcn, len + pre, &lcn, &flen,
+ opt);
+
+ if (err == -ENOSPC && pre) {
+ pre = 0;
+ if (*pre_alloc)
+ *pre_alloc = 0;
+ continue;
+ }
+
+ if (err)
+ goto out;
+
+ if (new_lcn && vcn == vcn0)
+ *new_lcn = lcn;
+
+ /* Add new fragment into run storage */
+ if (!run_add_entry(run, vcn, lcn, flen, opt == ALLOCATE_MFT)) {
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+ wnd_set_free(wnd, lcn, flen);
+ up_write(&wnd->rw_lock);
+ err = -ENOMEM;
+ goto out;
+ }
+
+ vcn += flen;
+
+ if (flen >= len || opt == ALLOCATE_MFT ||
+ (fr && run->count - cnt >= fr)) {
+ *alen = vcn - vcn0;
+ return 0;
+ }
+
+ len -= flen;
+ }
+
+out:
+ /* undo */
+ run_deallocate_ex(sbi, run, vcn0, vcn - vcn0, NULL, false);
+ run_truncate(run, vcn0);
+
+ return err;
+}
+
+/*
+ * If 'page' is not NULL, it already contains the resident data
+ * and is locked (called from ni_write_frame()).
+ */
+int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+ u64 new_size, struct runs_tree *run,
+ struct ATTRIB **ins_attr, struct page *page)
+{
+ struct ntfs_sb_info *sbi;
+ struct ATTRIB *attr_s;
+ struct MFT_REC *rec;
+ u32 used, asize, rsize, aoff, align;
+ bool is_data;
+ CLST len, alen;
+ char *next;
+ int err;
+
+ if (attr->non_res) {
+ *ins_attr = attr;
+ return 0;
+ }
+
+ sbi = mi->sbi;
+ rec = mi->mrec;
+ attr_s = NULL;
+ used = le32_to_cpu(rec->used);
+ asize = le32_to_cpu(attr->size);
+ next = Add2Ptr(attr, asize);
+ aoff = PtrOffset(rec, attr);
+ rsize = le32_to_cpu(attr->res.data_size);
+ is_data = attr->type == ATTR_DATA && !attr->name_len;
+
+ align = sbi->cluster_size;
+ if (is_attr_compressed(attr))
+ align <<= COMPRESSION_UNIT;
+ len = (rsize + align - 1) >> sbi->cluster_bits;
+
+ run_init(run);
+
+ /* make a copy of original attribute */
+ attr_s = ntfs_memdup(attr, asize);
+ if (!attr_s) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (!len) {
+ /* empty resident -> empty nonresident */
+ alen = 0;
+ } else {
+ const char *data = resident_data(attr);
+
+ err = attr_allocate_clusters(sbi, run, 0, 0, len, NULL,
+ ALLOCATE_DEF, &alen, 0, NULL);
+ if (err)
+ goto out1;
+
+ if (!rsize) {
+ /* empty resident -> non-empty nonresident */
+ } else if (!is_data) {
+ err = ntfs_sb_write_run(sbi, run, 0, data, rsize);
+ if (err)
+ goto out2;
+ } else if (!page) {
+ char *kaddr;
+
+ page = grab_cache_page(ni->vfs_inode.i_mapping, 0);
+ if (!page) {
+ err = -ENOMEM;
+ goto out2;
+ }
+ kaddr = kmap_atomic(page);
+ memcpy(kaddr, data, rsize);
+ memset(kaddr + rsize, 0, PAGE_SIZE - rsize);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ set_page_dirty(page);
+ unlock_page(page);
+ put_page(page);
+ }
+ }
+
+ /* remove original attribute */
+ used -= asize;
+ memmove(attr, Add2Ptr(attr, asize), used - aoff);
+ rec->used = cpu_to_le32(used);
+ mi->dirty = true;
+ if (le)
+ al_remove_le(ni, le);
+
+ err = ni_insert_nonresident(ni, attr_s->type, attr_name(attr_s),
+ attr_s->name_len, run, 0, alen,
+ attr_s->flags, &attr, NULL);
+ if (err)
+ goto out3;
+
+ ntfs_free(attr_s);
+ attr->nres.data_size = cpu_to_le64(rsize);
+ attr->nres.valid_size = attr->nres.data_size;
+
+ *ins_attr = attr;
+
+ if (is_data)
+ ni->ni_flags &= ~NI_FLAG_RESIDENT;
+
+ /* The resident attribute has become nonresident. */
+ return 0;
+
+out3:
+ attr = Add2Ptr(rec, aoff);
+ memmove(next, attr, used - aoff);
+ memcpy(attr, attr_s, asize);
+ rec->used = cpu_to_le32(used + asize);
+ mi->dirty = true;
+out2:
+ /* undo: do not trim new allocated clusters */
+ run_deallocate(sbi, run, false);
+ run_close(run);
+out1:
+ ntfs_free(attr_s);
+ /*reinsert le*/
+out:
+ return err;
+}
+
+/*
+ * attr_set_size_res
+ *
+ * helper for attr_set_size
+ */
+static int attr_set_size_res(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+ u64 new_size, struct runs_tree *run,
+ struct ATTRIB **ins_attr)
+{
+ struct ntfs_sb_info *sbi = mi->sbi;
+ struct MFT_REC *rec = mi->mrec;
+ u32 used = le32_to_cpu(rec->used);
+ u32 asize = le32_to_cpu(attr->size);
+ u32 aoff = PtrOffset(rec, attr);
+ u32 rsize = le32_to_cpu(attr->res.data_size);
+ u32 tail = used - aoff - asize;
+ char *next = Add2Ptr(attr, asize);
+ s64 dsize = QuadAlign(new_size) - QuadAlign(rsize);
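+ /* dsize is the change in the 8-byte-aligned payload: e.g. growing
+ * the resident data from 5 to 13 bytes gives 16 - 8 == 8.
+ */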
+
+ if (dsize < 0) {
+ memmove(next + dsize, next, tail);
+ } else if (dsize > 0) {
+ if (used + dsize > sbi->max_bytes_per_attr)
+ return attr_make_nonresident(ni, attr, le, mi, new_size,
+ run, ins_attr, NULL);
+
+ memmove(next + dsize, next, tail);
+ memset(next, 0, dsize);
+ }
+
+ if (new_size > rsize)
+ memset(Add2Ptr(resident_data(attr), rsize), 0,
+ new_size - rsize);
+
+ rec->used = cpu_to_le32(used + dsize);
+ attr->size = cpu_to_le32(asize + dsize);
+ attr->res.data_size = cpu_to_le32(new_size);
+ mi->dirty = true;
+ *ins_attr = attr;
+
+ return 0;
+}
+
+/*
+ * attr_set_size
+ *
+ * Change the size of an attribute.
+ * Extend:
+ * - sparse/compressed: no clusters are allocated
+ * - normal: append newly allocated and preallocated clusters
+ * Shrink:
+ * - do not deallocate if 'keep_prealloc' is set
+ */
+int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ u64 new_size, const u64 *new_valid, bool keep_prealloc,
+ struct ATTRIB **ret)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 cluster_bits = sbi->cluster_bits;
+ bool is_mft =
+ ni->mi.rno == MFT_REC_MFT && type == ATTR_DATA && !name_len;
+ u64 old_valid, old_size, old_alloc, new_alloc, new_alloc_tmp;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST alen, vcn, lcn, new_alen, old_alen, svcn, evcn;
+ CLST next_svcn, pre_alloc = -1, done = 0;
+ bool is_ext;
+ u32 align;
+ struct MFT_REC *rec;
+
+again:
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len, NULL,
+ &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (!attr_b->non_res) {
+ err = attr_set_size_res(ni, attr_b, le_b, mi_b, new_size, run,
+ &attr_b);
+ if (err || !attr_b->non_res)
+ goto out;
+
+ /* layout of records may be changed, so do a full search */
+ goto again;
+ }
+
+ is_ext = is_attr_ext(attr_b);
+
+again_1:
+ align = sbi->cluster_size;
+
+ if (is_ext) {
+ align <<= attr_b->nres.c_unit;
+ if (is_attr_sparsed(attr_b))
+ keep_prealloc = false;
+ }
+
+ old_valid = le64_to_cpu(attr_b->nres.valid_size);
+ old_size = le64_to_cpu(attr_b->nres.data_size);
+ old_alloc = le64_to_cpu(attr_b->nres.alloc_size);
+ old_alen = old_alloc >> cluster_bits;
+
+ new_alloc = (new_size + align - 1) & ~(u64)(align - 1);
+ new_alen = new_alloc >> cluster_bits;
+
+ if (keep_prealloc && is_ext)
+ keep_prealloc = false;
+
+ if (keep_prealloc && new_size < old_size) {
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ mi_b->dirty = true;
+ goto ok;
+ }
+
+ vcn = old_alen - 1;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn = le64_to_cpu(attr_b->nres.evcn);
+
+ if (svcn <= vcn && vcn <= evcn) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, type, name, name_len, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+next_le_1:
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ }
+
+next_le:
+ rec = mi->mrec;
+
+ err = attr_load_runs(attr, ni, run, NULL);
+ if (err)
+ goto out;
+
+ if (new_size > old_size) {
+ CLST to_allocate;
+ size_t free;
+
+ if (new_alloc <= old_alloc) {
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ mi_b->dirty = true;
+ goto ok;
+ }
+
+ to_allocate = new_alen - old_alen;
+add_alloc_in_same_attr_seg:
+ lcn = 0;
+ if (is_mft) {
+ /* mft allocates clusters from mftzone */
+ pre_alloc = 0;
+ } else if (is_ext) {
+ /* No preallocation for sparse/compressed attributes. */
+ pre_alloc = 0;
+ } else if (pre_alloc == -1) {
+ pre_alloc = 0;
+ if (type == ATTR_DATA && !name_len &&
+ sbi->options.prealloc) {
+ CLST new_alen2 = bytes_to_cluster(
+ sbi, get_pre_allocated(new_size));
+ pre_alloc = new_alen2 - new_alen;
+ }
+
+ /* Get the last lcn to allocate from */
+ if (old_alen &&
+ !run_lookup_entry(run, vcn, &lcn, NULL, NULL)) {
+ lcn = SPARSE_LCN;
+ }
+
+ if (lcn == SPARSE_LCN)
+ lcn = 0;
+ else if (lcn)
+ lcn += 1;
+
+ free = wnd_zeroes(&sbi->used.bitmap);
+ if (to_allocate > free) {
+ err = -ENOSPC;
+ goto out;
+ }
+
+ if (pre_alloc && to_allocate + pre_alloc > free)
+ pre_alloc = 0;
+ }
+
+ vcn = old_alen;
+
+ if (is_ext) {
+ if (!run_add_entry(run, vcn, SPARSE_LCN, to_allocate,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ alen = to_allocate;
+ } else {
+ /* ~3 bytes per fragment */
+ err = attr_allocate_clusters(
+ sbi, run, vcn, lcn, to_allocate, &pre_alloc,
+ is_mft ? ALLOCATE_MFT : 0, &alen,
+ is_mft ? 0
+ : (sbi->record_size -
+ le32_to_cpu(rec->used) + 8) /
+ 3 +
+ 1,
+ NULL);
+ if (err)
+ goto out;
+ }
+
+ done += alen;
+ vcn += alen;
+ if (to_allocate > alen)
+ to_allocate -= alen;
+ else
+ to_allocate = 0;
+
+pack_runs:
+ err = mi_pack_runs(mi, attr, run, vcn - svcn);
+ if (err)
+ goto out;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ new_alloc_tmp = (u64)next_svcn << cluster_bits;
+ attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp);
+ mi_b->dirty = true;
+
+ if (next_svcn >= vcn && !to_allocate) {
+ /* Normal case: update the attribute and exit. */
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ goto ok;
+ }
+
+ /* Keep at least two MFT records to avoid a recursive loop. */
+ if (is_mft && next_svcn == vcn &&
+ ((u64)done << sbi->cluster_bits) >= 2 * sbi->record_size) {
+ new_size = new_alloc_tmp;
+ attr_b->nres.data_size = attr_b->nres.alloc_size;
+ goto ok;
+ }
+
+ if (le32_to_cpu(rec->used) < sbi->record_size) {
+ old_alen = next_svcn;
+ evcn = old_alen - 1;
+ goto add_alloc_in_same_attr_seg;
+ }
+
+ attr_b->nres.data_size = attr_b->nres.alloc_size;
+ if (new_alloc_tmp < old_valid)
+ attr_b->nres.valid_size = attr_b->nres.data_size;
+
+ if (type == ATTR_LIST) {
+ err = ni_expand_list(ni);
+ if (err)
+ goto out;
+ if (next_svcn < vcn)
+ goto pack_runs;
+
+ /* layout of records is changed */
+ goto again;
+ }
+
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ /* layout of records is changed */
+ }
+
+ if (next_svcn >= vcn) {
+ /* this is mft data, repeat */
+ goto again;
+ }
+
+ /* insert new attribute segment */
+ err = ni_insert_nonresident(ni, type, name, name_len, run,
+ next_svcn, vcn - next_svcn,
+ attr_b->flags, &attr, &mi);
+ if (err)
+ goto out;
+
+ if (!is_mft)
+ run_truncate_head(run, evcn + 1);
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ le_b = NULL;
+ /* The layout of records may have changed, */
+ /* so find the base attribute to update. */
+ attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len,
+ NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ attr_b->nres.alloc_size = cpu_to_le64((u64)vcn << cluster_bits);
+ attr_b->nres.data_size = attr_b->nres.alloc_size;
+ attr_b->nres.valid_size = attr_b->nres.alloc_size;
+ mi_b->dirty = true;
+ goto again_1;
+ }
+
+ if (new_size != old_size ||
+ (new_alloc != old_alloc && !keep_prealloc)) {
+ vcn = max(svcn, new_alen);
+ new_alloc_tmp = (u64)vcn << cluster_bits;
+
+ alen = 0;
+ err = run_deallocate_ex(sbi, run, vcn, evcn - vcn + 1, &alen,
+ true);
+ if (err)
+ goto out;
+
+ run_truncate(run, vcn);
+
+ if (vcn > svcn) {
+ err = mi_pack_runs(mi, attr, run, vcn - svcn);
+ if (err)
+ goto out;
+ } else if (le && le->vcn) {
+ u16 le_sz = le16_to_cpu(le->size);
+
+ /*
+ * NOTE: List entries for one attribute are always
+ * the same size. We deal with the last entry (vcn==0)
+ * and it is not the first in the entries array
+ * (the list entry for the std attribute is always first),
+ * so it is safe to step back.
+ */
+ mi_remove_attr(mi, attr);
+
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz);
+ } else {
+ attr->nres.evcn = cpu_to_le64((u64)vcn - 1);
+ mi->dirty = true;
+ }
+
+ attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp);
+
+ if (vcn == new_alen) {
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ if (new_size < old_valid)
+ attr_b->nres.valid_size =
+ attr_b->nres.data_size;
+ } else {
+ if (new_alloc_tmp <=
+ le64_to_cpu(attr_b->nres.data_size))
+ attr_b->nres.data_size =
+ attr_b->nres.alloc_size;
+ if (new_alloc_tmp <
+ le64_to_cpu(attr_b->nres.valid_size))
+ attr_b->nres.valid_size =
+ attr_b->nres.alloc_size;
+ }
+
+ if (is_ext)
+ le64_sub_cpu(&attr_b->nres.total_size,
+ ((u64)alen << cluster_bits));
+
+ mi_b->dirty = true;
+
+ if (new_alloc_tmp <= new_alloc)
+ goto ok;
+
+ old_size = new_alloc_tmp;
+ vcn = svcn - 1;
+
+ if (le == le_b) {
+ attr = attr_b;
+ mi = mi_b;
+ evcn = svcn - 1;
+ svcn = 0;
+ goto next_le;
+ }
+
+ if (le->type != type || le->name_len != name_len ||
+ memcmp(le_name(le), name, name_len * sizeof(short))) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = ni_load_mi(ni, le, &mi);
+ if (err)
+ goto out;
+
+ attr = mi_find_attr(mi, NULL, type, name, name_len, &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ goto next_le_1;
+ }
+
+ok:
+ if (new_valid) {
+ __le64 valid = cpu_to_le64(min(*new_valid, new_size));
+
+ if (attr_b->nres.valid_size != valid) {
+ attr_b->nres.valid_size = valid;
+ mi_b->dirty = true;
+ }
+ }
+
+out:
+ if (!err && attr_b && ret)
+ *ret = attr_b;
+
+ /* update inode_set_bytes*/
+ if (!err && ((type == ATTR_DATA && !name_len) ||
+ (type == ATTR_ALLOC && name == I30_NAME))) {
+ bool dirty = false;
+
+ if (ni->vfs_inode.i_size != new_size) {
+ ni->vfs_inode.i_size = new_size;
+ dirty = true;
+ }
+
+ if (attr_b && attr_b->non_res) {
+ new_alloc = le64_to_cpu(attr_b->nres.alloc_size);
+ if (inode_get_bytes(&ni->vfs_inode) != new_alloc) {
+ inode_set_bytes(&ni->vfs_inode, new_alloc);
+ dirty = true;
+ }
+ }
+
+ if (dirty) {
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+ }
+ }
+
+ return err;
+}
+
+int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
+ CLST *len, bool *new)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi;
+ u8 cluster_bits;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST hint, svcn, to_alloc, evcn1, next_svcn, asize, end;
+ u64 total_size;
+ u32 clst_per_frame;
+ bool ok;
+
+ if (new)
+ *new = false;
+
+ down_read(&ni->file.run_lock);
+ ok = run_lookup_entry(run, vcn, lcn, len, NULL);
+ up_read(&ni->file.run_lock);
+
+ if (ok && (*lcn != SPARSE_LCN || !new)) {
+ /* normal way */
+ return 0;
+ }
+
+ if (!clen)
+ clen = 1;
+
+ if (ok && clen > *len)
+ clen = *len;
+
+ sbi = ni->mi.sbi;
+ cluster_bits = sbi->cluster_bits;
+
+ ni_lock(ni);
+ down_write(&ni->file.run_lock);
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (!attr_b->non_res) {
+ *lcn = RESIDENT_LCN;
+ *len = 1;
+ goto out;
+ }
+
+ asize = le64_to_cpu(attr_b->nres.alloc_size) >> sbi->cluster_bits;
+ if (vcn >= asize) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ clst_per_frame = 1u << attr_b->nres.c_unit;
+ to_alloc = (clen + clst_per_frame - 1) & ~(clst_per_frame - 1);
+
+ if (vcn + to_alloc > asize)
+ to_alloc = asize - vcn;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+
+ if (le_b && (vcn < svcn || evcn1 <= vcn)) {
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ err = attr_load_runs(attr, ni, run, NULL);
+ if (err)
+ goto out;
+
+ if (!ok) {
+ ok = run_lookup_entry(run, vcn, lcn, len, NULL);
+ if (ok && (*lcn != SPARSE_LCN || !new)) {
+ /* normal way */
+ err = 0;
+ goto ok;
+ }
+
+ if (!ok && !new) {
+ *len = 0;
+ err = 0;
+ goto ok;
+ }
+
+ if (ok && clen > *len) {
+ clen = *len;
+ to_alloc = (clen + clst_per_frame - 1) &
+ ~(clst_per_frame - 1);
+ }
+ }
+
+ if (!is_attr_ext(attr_b)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Get the last lcn to allocate from */
+ hint = 0;
+
+ if (vcn > evcn1) {
+ if (!run_add_entry(run, evcn1, SPARSE_LCN, vcn - evcn1,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ } else if (vcn && !run_lookup_entry(run, vcn - 1, &hint, NULL, NULL)) {
+ hint = -1;
+ }
+
+ err = attr_allocate_clusters(
+ sbi, run, vcn, hint + 1, to_alloc, NULL, 0, len,
+ (sbi->record_size - le32_to_cpu(mi->mrec->used) + 8) / 3 + 1,
+ lcn);
+ if (err)
+ goto out;
+ *new = true;
+
+ end = vcn + *len;
+
+ total_size = le64_to_cpu(attr_b->nres.total_size) +
+ ((u64)*len << cluster_bits);
+
+repack:
+ err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn);
+ if (err)
+ goto out;
+
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ inode_set_bytes(&ni->vfs_inode, total_size);
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+
+ mi_b->dirty = true;
+ mark_inode_dirty(&ni->vfs_inode);
+
+ /* stored [vcn : next_svcn) from [vcn : end) */
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+
+ if (end <= evcn1) {
+ if (next_svcn == evcn1) {
+ /* Normal case: update the attribute and exit. */
+ goto ok;
+ }
+ /* Add a new segment [next_svcn : evcn1). */
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ /* layout of records is changed */
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL,
+ 0, NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ goto repack;
+ }
+ }
+
+ svcn = evcn1;
+
+ /* Estimate next attribute */
+ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi);
+
+ if (attr) {
+ CLST alloc = bytes_to_cluster(
+ sbi, le64_to_cpu(attr_b->nres.alloc_size));
+ CLST evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (end < next_svcn)
+ end = next_svcn;
+ while (end > evcn) {
+ /* remove segment [svcn : evcn)*/
+ mi_remove_attr(mi, attr);
+
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (evcn + 1 >= alloc) {
+ /* last attribute segment */
+ evcn1 = evcn + 1;
+ goto ins_ext;
+ }
+
+ if (ni_load_mi(ni, le, &mi)) {
+ attr = NULL;
+ goto out;
+ }
+
+ attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0,
+ &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ }
+
+ if (end < svcn)
+ end = svcn;
+
+ err = attr_load_runs(attr, ni, run, &end);
+ if (err)
+ goto out;
+
+ evcn1 = evcn + 1;
+ attr->nres.svcn = cpu_to_le64(next_svcn);
+ err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn);
+ if (err)
+ goto out;
+
+ le->vcn = cpu_to_le64(next_svcn);
+ ni->attr_list.dirty = true;
+ mi->dirty = true;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+ins_ext:
+ if (evcn1 > next_svcn) {
+ err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run,
+ next_svcn, evcn1 - next_svcn,
+ attr_b->flags, &attr, &mi);
+ if (err)
+ goto out;
+ }
+ok:
+ run_truncate_around(run, vcn);
+out:
+ up_write(&ni->file.run_lock);
+ ni_unlock(ni);
+
+ return err;
+}
+
+int attr_data_read_resident(struct ntfs_inode *ni, struct page *page)
+{
+ u64 vbo;
+ struct ATTRIB *attr;
+ u32 data_size;
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, NULL);
+ if (!attr)
+ return -EINVAL;
+
+ if (attr->non_res)
+ return E_NTFS_NONRESIDENT;
+
+ vbo = page->index << PAGE_SHIFT;
+ data_size = le32_to_cpu(attr->res.data_size);
+ if (vbo < data_size) {
+ const char *data = resident_data(attr);
+ char *kaddr = kmap_atomic(page);
+ u32 use = data_size - vbo;
+
+ if (use > PAGE_SIZE)
+ use = PAGE_SIZE;
+
+ memcpy(kaddr, data + vbo, use);
+ memset(kaddr + use, 0, PAGE_SIZE - use);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ } else if (!PageUptodate(page)) {
+ zero_user_segment(page, 0, PAGE_SIZE);
+ SetPageUptodate(page);
+ }
+
+ return 0;
+}
+
+int attr_data_write_resident(struct ntfs_inode *ni, struct page *page)
+{
+ u64 vbo;
+ struct mft_inode *mi;
+ struct ATTRIB *attr;
+ u32 data_size;
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi);
+ if (!attr)
+ return -EINVAL;
+
+ if (attr->non_res) {
+ /* Return a special error code so the caller can check this case. */
+ return E_NTFS_NONRESIDENT;
+ }
+
+ vbo = page->index << PAGE_SHIFT;
+ data_size = le32_to_cpu(attr->res.data_size);
+ if (vbo < data_size) {
+ char *data = resident_data(attr);
+ char *kaddr = kmap_atomic(page);
+ u32 use = data_size - vbo;
+
+ if (use > PAGE_SIZE)
+ use = PAGE_SIZE;
+ memcpy(data + vbo, kaddr, use);
+ kunmap_atomic(kaddr);
+ mi->dirty = true;
+ }
+ ni->i_valid = data_size;
+
+ return 0;
+}
+
+/*
+ * attr_load_runs_vcn
+ *
+ * load runs with vcn
+ */
+int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ CLST vcn)
+{
+ struct ATTRIB *attr;
+ int err;
+ CLST svcn, evcn;
+ u16 ro;
+
+ attr = ni_find_attr(ni, NULL, NULL, type, name, name_len, &vcn, NULL);
+ if (!attr)
+ return -ENOENT;
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (evcn < vcn || vcn < svcn)
+ return -EINVAL;
+
+ ro = le16_to_cpu(attr->nres.run_off);
+ err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn, svcn,
+ Add2Ptr(attr, ro), le32_to_cpu(attr->size) - ro);
+ if (err < 0)
+ return err;
+ return 0;
+}
+
+/*
+ * load runs for given range [from to)
+ */
+int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ u64 from, u64 to)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST vcn = from >> cluster_bits;
+ CLST vcn_last = (to - 1) >> cluster_bits;
+ CLST lcn, clen;
+ int err;
+
+ for (vcn = from >> cluster_bits; vcn <= vcn_last; vcn += clen) {
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, NULL)) {
+ err = attr_load_runs_vcn(ni, type, name, name_len, run,
+ vcn);
+ if (err)
+ return err;
+ clen = 0; /* The next run_lookup_entry(vcn) must succeed. */
+ }
+ }
+
+ return 0;
+}
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+/*
+ * attr_wof_frame_info
+ *
+ * read header of xpress/lzx file to get info about frame
+ */
+int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct runs_tree *run, u64 frame, u64 frames,
+ u8 frame_bits, u32 *ondisk_size, u64 *vbo_data)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u64 vbo[2], off[2], wof_size;
+ u32 voff;
+ u8 bytes_per_off;
+ char *addr;
+ struct page *page;
+ int i, err;
+ __le32 *off32;
+ __le64 *off64;
+
+ if (ni->vfs_inode.i_size < 0x100000000ull) {
+ /* file starts with array of 32 bit offsets */
+ bytes_per_off = sizeof(__le32);
+ vbo[1] = frame << 2;
+ *vbo_data = frames << 2;
+ } else {
+ /* file starts with array of 64 bit offsets */
+ bytes_per_off = sizeof(__le64);
+ vbo[1] = frame << 3;
+ *vbo_data = frames << 3;
+ }
+
+ /*
+ * read 4/8 bytes at [vbo - 4(8)] == offset where compressed frame starts
+ * read 4/8 bytes at [vbo] == offset where compressed frame ends
+ */
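+ /* I.e. the file begins with a table of frame end offsets; a
+ * frame's start offset is the previous entry, or 0 for frame 0.
+ */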
+ if (!attr->non_res) {
+ if (vbo[1] + bytes_per_off > le32_to_cpu(attr->res.data_size)) {
+ ntfs_inode_err(&ni->vfs_inode, "is corrupted");
+ return -EINVAL;
+ }
+ addr = resident_data(attr);
+
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, vbo[1]);
+ off[0] = vbo[1] ? le32_to_cpu(off32[-1]) : 0;
+ off[1] = le32_to_cpu(off32[0]);
+ } else {
+ off64 = Add2Ptr(addr, vbo[1]);
+ off[0] = vbo[1] ? le64_to_cpu(off64[-1]) : 0;
+ off[1] = le64_to_cpu(off64[0]);
+ }
+
+ *vbo_data += off[0];
+ *ondisk_size = off[1] - off[0];
+ return 0;
+ }
+
+ wof_size = le64_to_cpu(attr->nres.data_size);
+ down_write(&ni->file.run_lock);
+ page = ni->file.offs_page;
+ if (!page) {
+ page = alloc_page(GFP_KERNEL);
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+ page->index = -1;
+ ni->file.offs_page = page;
+ }
+ lock_page(page);
+ addr = page_address(page);
+
+ if (vbo[1]) {
+ voff = vbo[1] & (PAGE_SIZE - 1);
+ vbo[0] = vbo[1] - bytes_per_off;
+ i = 0;
+ } else {
+ voff = 0;
+ vbo[0] = 0;
+ off[0] = 0;
+ i = 1;
+ }
+
+ do {
+ pgoff_t index = vbo[i] >> PAGE_SHIFT;
+
+ if (index != page->index) {
+ u64 from = vbo[i] & ~(u64)(PAGE_SIZE - 1);
+ u64 to = min(from + PAGE_SIZE, wof_size);
+
+ err = attr_load_runs_range(ni, ATTR_DATA, WOF_NAME,
+ ARRAY_SIZE(WOF_NAME), run,
+ from, to);
+ if (err)
+ goto out1;
+
+ err = ntfs_bio_pages(sbi, run, &page, 1, from,
+ to - from, REQ_OP_READ);
+ if (err) {
+ page->index = -1;
+ goto out1;
+ }
+ page->index = index;
+ }
+
+ if (i) {
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, voff);
+ off[1] = le32_to_cpu(*off32);
+ } else {
+ off64 = Add2Ptr(addr, voff);
+ off[1] = le64_to_cpu(*off64);
+ }
+ } else if (!voff) {
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, PAGE_SIZE - sizeof(u32));
+ off[0] = le32_to_cpu(*off32);
+ } else {
+ off64 = Add2Ptr(addr, PAGE_SIZE - sizeof(u64));
+ off[0] = le64_to_cpu(*off64);
+ }
+ } else {
+ /* two values in one page*/
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, voff);
+ off[0] = le32_to_cpu(off32[-1]);
+ off[1] = le32_to_cpu(off32[0]);
+ } else {
+ off64 = Add2Ptr(addr, voff);
+ off[0] = le64_to_cpu(off64[-1]);
+ off[1] = le64_to_cpu(off64[0]);
+ }
+ break;
+ }
+ } while (++i < 2);
+
+ *vbo_data += off[0];
+ *ondisk_size = off[1] - off[0];
+
+out1:
+ unlock_page(page);
+out:
+ up_write(&ni->file.run_lock);
+ return err;
+}
+#endif
+
+/*
+ * attr_is_frame_compressed
+ *
+ * Detect whether a frame is compressed.
+ */
+int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr,
+ CLST frame, CLST *clst_data)
+{
+ int err;
+ u32 clst_frame;
+ CLST clen, lcn, vcn, alen, slen, vcn_next;
+ size_t idx;
+ struct runs_tree *run;
+
+ *clst_data = 0;
+
+ if (!is_attr_compressed(attr))
+ return 0;
+
+ if (!attr->non_res)
+ return 0;
+
+ clst_frame = 1u << attr->nres.c_unit;
+ vcn = frame * clst_frame;
+ run = &ni->file.run;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+ err = attr_load_runs_vcn(ni, attr->type, attr_name(attr),
+ attr->name_len, run, vcn);
+ if (err)
+ return err;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+ return -EINVAL;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ /* sparse frame */
+ return 0;
+ }
+
+ if (clen >= clst_frame) {
+ /*
+ * The frame is not compressed because
+ * it does not contain any sparse clusters.
+ */
+ *clst_data = clst_frame;
+ return 0;
+ }
+
+ alen = bytes_to_cluster(ni->mi.sbi, le64_to_cpu(attr->nres.alloc_size));
+ slen = 0;
+ *clst_data = clen;
+
+ /*
+ * The frame is compressed if *clst_data + slen >= clst_frame
+ * Check next fragments
+ */
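+ /* Example (illustrative): with clst_frame == 16, a frame stored
+ * as 10 data clusters followed by 6 sparse clusters is compressed
+ * (clen == 10 < 16 and 10 + 6 >= 16).
+ */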
+ while ((vcn += clen) < alen) {
+ vcn_next = vcn;
+
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn_next != vcn) {
+ err = attr_load_runs_vcn(ni, attr->type,
+ attr_name(attr),
+ attr->name_len, run, vcn_next);
+ if (err)
+ return err;
+ vcn = vcn_next;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+ return -EINVAL;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ slen += clen;
+ } else {
+ if (slen) {
+ /*
+ * Data clusters follow sparse clusters but the
+ * frame is still not full: invalid layout.
+ */
+ return -EINVAL;
+ }
+ *clst_data += clen;
+ }
+
+ if (*clst_data + slen >= clst_frame) {
+ if (!slen) {
+ /*
+ * There are no sparse clusters in this frame,
+ * so it is not compressed.
+ */
+ *clst_data = clst_frame;
+ } else {
+ /*frame is compressed*/
+ }
+ break;
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * attr_allocate_frame
+ *
+ * allocate/free clusters for 'frame'
+ * assumed: down_write(&ni->file.run_lock);
+ */
+int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size,
+ u64 new_valid)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST svcn, evcn1, next_svcn, lcn, len;
+ CLST vcn, end, clst_data;
+ u64 total_size, valid_size, data_size;
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b)
+ return -ENOENT;
+
+ if (!is_attr_ext(attr_b))
+ return -EINVAL;
+
+ vcn = frame << NTFS_LZNT_CUNIT;
+ total_size = le64_to_cpu(attr_b->nres.total_size);
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+ data_size = le64_to_cpu(attr_b->nres.data_size);
+
+ if (svcn <= vcn && vcn < evcn1) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ err = attr_load_runs(attr, ni, run, NULL);
+ if (err)
+ goto out;
+
+ err = attr_is_frame_compressed(ni, attr_b, frame, &clst_data);
+ if (err)
+ goto out;
+
+ total_size -= (u64)clst_data << sbi->cluster_bits;
+
+ len = bytes_to_cluster(sbi, compr_size);
+
+ if (len == clst_data)
+ goto out;
+
+ if (len < clst_data) {
+ err = run_deallocate_ex(sbi, run, vcn + len, clst_data - len,
+ NULL, true);
+ if (err)
+ goto out;
+
+ if (!run_add_entry(run, vcn + len, SPARSE_LCN, clst_data - len,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ end = vcn + clst_data;
+ /* run contains updated range [vcn + len : end) */
+ } else {
+ CLST alen, hint = 0;
+ /* Get the last lcn to allocate from */
+ if (vcn + clst_data &&
+ !run_lookup_entry(run, vcn + clst_data - 1, &hint, NULL,
+ NULL)) {
+ hint = -1;
+ }
+
+ err = attr_allocate_clusters(sbi, run, vcn + clst_data,
+ hint + 1, len - clst_data, NULL, 0,
+ &alen, 0, &lcn);
+ if (err)
+ goto out;
+
+ end = vcn + len;
+ /* run contains updated range [vcn + clst_data : end) */
+ }
+
+ total_size += (u64)len << sbi->cluster_bits;
+
+repack:
+ err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn);
+ if (err)
+ goto out;
+
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ inode_set_bytes(&ni->vfs_inode, total_size);
+
+ mi_b->dirty = true;
+ mark_inode_dirty(&ni->vfs_inode);
+
+ /* stored [vcn : next_svcn) from [vcn : end) */
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+
+ if (end <= evcn1) {
+ if (next_svcn == evcn1) {
+ /* Normal case: update the attribute and exit. */
+ goto ok;
+ }
+ /* Add a new segment [next_svcn : evcn1). */
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ /* layout of records is changed */
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL,
+ 0, NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ goto repack;
+ }
+ }
+
+ svcn = evcn1;
+
+ /* Estimate next attribute */
+ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi);
+
+ if (attr) {
+ CLST alloc = bytes_to_cluster(
+ sbi, le64_to_cpu(attr_b->nres.alloc_size));
+ CLST evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (end < next_svcn)
+ end = next_svcn;
+ while (end > evcn) {
+ /* remove segment [svcn : evcn)*/
+ mi_remove_attr(mi, attr);
+
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (evcn + 1 >= alloc) {
+ /* last attribute segment */
+ evcn1 = evcn + 1;
+ goto ins_ext;
+ }
+
+ if (ni_load_mi(ni, le, &mi)) {
+ attr = NULL;
+ goto out;
+ }
+
+ attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0,
+ &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ }
+
+ if (end < svcn)
+ end = svcn;
+
+ err = attr_load_runs(attr, ni, run, &end);
+ if (err)
+ goto out;
+
+ evcn1 = evcn + 1;
+ attr->nres.svcn = cpu_to_le64(next_svcn);
+ err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn);
+ if (err)
+ goto out;
+
+ le->vcn = cpu_to_le64(next_svcn);
+ ni->attr_list.dirty = true;
+ mi->dirty = true;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+ins_ext:
+ if (evcn1 > next_svcn) {
+ err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run,
+ next_svcn, evcn1 - next_svcn,
+ attr_b->flags, &attr, &mi);
+ if (err)
+ goto out;
+ }
+ok:
+ run_truncate_around(run, vcn);
+out:
+ if (new_valid > data_size)
+ new_valid = data_size;
+
+ valid_size = le64_to_cpu(attr_b->nres.valid_size);
+ if (new_valid != valid_size) {
+ attr_b->nres.valid_size = cpu_to_le64(new_valid);
+ mi_b->dirty = true;
+ }
+
+ return err;
+}
+
+/* Collapse range in file */
+int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST svcn, evcn1, len, dealloc, alen;
+ CLST vcn, end;
+ u64 valid_size, data_size, alloc_size, total_size;
+ u32 mask;
+ __le16 a_flags;
+
+ if (!bytes)
+ return 0;
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b)
+ return -ENOENT;
+
+ if (!attr_b->non_res) {
+ /* Attribute is resident. Nothing to do? */
+ return 0;
+ }
+
+ data_size = le64_to_cpu(attr_b->nres.data_size);
+ alloc_size = le64_to_cpu(attr_b->nres.alloc_size);
+ a_flags = attr_b->flags;
+
+ if (is_attr_ext(attr_b)) {
+ total_size = le64_to_cpu(attr_b->nres.total_size);
+ mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1;
+ } else {
+ total_size = alloc_size;
+ mask = sbi->cluster_mask;
+ }
+
+ if ((vbo & mask) || (bytes & mask)) {
+ /* Only cluster-aligned ranges can be collapsed. */
+ return -EINVAL;
+ }
+
+ if (vbo > data_size)
+ return -EINVAL;
+
+ down_write(&ni->file.run_lock);
+
+ if (vbo + bytes >= data_size) {
+ u64 new_valid = min(ni->i_valid, vbo);
+
+ /* Simple truncate file at 'vbo' */
+ truncate_setsize(&ni->vfs_inode, vbo);
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, vbo,
+ &new_valid, true, NULL);
+
+ if (!err && new_valid < ni->i_valid)
+ ni->i_valid = new_valid;
+
+ goto out;
+ }
+
+ /*
+ * Enumerate all attribute segments and collapse
+ */
+ alen = alloc_size >> sbi->cluster_bits;
+ vcn = vbo >> sbi->cluster_bits;
+ len = bytes >> sbi->cluster_bits;
+ end = vcn + len;
+ dealloc = 0;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+ if (svcn <= vcn && vcn < evcn1) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ for (;;) {
+ if (svcn >= end) {
+ /* shift vcn */
+ attr->nres.svcn = cpu_to_le64(svcn - len);
+ attr->nres.evcn = cpu_to_le64(evcn1 - 1 - len);
+ if (le) {
+ le->vcn = attr->nres.svcn;
+ ni->attr_list.dirty = true;
+ }
+ mi->dirty = true;
+ } else if (svcn < vcn || end < evcn1) {
+ CLST vcn1, eat, next_svcn;
+
+ /* collapse a part of this attribute segment */
+ err = attr_load_runs(attr, ni, run, &svcn);
+ if (err)
+ goto out;
+ vcn1 = max(vcn, svcn);
+ eat = min(end, evcn1) - vcn1;
+
+ err = run_deallocate_ex(sbi, run, vcn1, eat, &dealloc,
+ true);
+ if (err)
+ goto out;
+
+ if (!run_collapse_range(run, vcn1, eat)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (svcn >= vcn) {
+ /* shift vcn */
+ attr->nres.svcn = cpu_to_le64(vcn);
+ if (le) {
+ le->vcn = attr->nres.svcn;
+ ni->attr_list.dirty = true;
+ }
+ }
+
+ err = mi_pack_runs(mi, attr, run, evcn1 - svcn - eat);
+ if (err)
+ goto out;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ if (next_svcn + eat < evcn1) {
+ err = ni_insert_nonresident(
+ ni, ATTR_DATA, NULL, 0, run, next_svcn,
+ evcn1 - eat - next_svcn, a_flags, &attr,
+ &mi);
+ if (err)
+ goto out;
+
+ /* Layout of records may be changed. */
+ attr_b = NULL;
+ le = al_find_ex(ni, NULL, ATTR_DATA, NULL, 0,
+ &next_svcn);
+ if (!le) {
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ /* free all allocated memory */
+ run_truncate(run, 0);
+ } else {
+ u16 le_sz;
+ u16 roff = le16_to_cpu(attr->nres.run_off);
+
+ /* 'run' == 1 (RUN_DEALLOCATE) means unpack and deallocate. */
+ run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn,
+ evcn1 - 1, svcn, Add2Ptr(attr, roff),
+ le32_to_cpu(attr->size) - roff);
+
+ /* delete this attribute segment */
+ mi_remove_attr(mi, attr);
+ if (!le)
+ break;
+
+ le_sz = le16_to_cpu(le->size);
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (evcn1 >= alen)
+ break;
+
+ if (!svcn) {
+ /* Load next record that contains this attribute */
+ if (ni_load_mi(ni, le, &mi)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Look for required attribute */
+ attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL,
+ 0, &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ goto next_attr;
+ }
+ le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz);
+ }
+
+ if (evcn1 >= alen)
+ break;
+
+ attr = ni_enum_attr_ex(ni, attr, &le, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+next_attr:
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ if (!attr_b) {
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL,
+ &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+ }
+
+ data_size -= bytes;
+ valid_size = ni->i_valid;
+ if (vbo + bytes <= valid_size)
+ valid_size -= bytes;
+ else if (vbo < valid_size)
+ valid_size = vbo;
+
+ attr_b->nres.alloc_size = cpu_to_le64(alloc_size - bytes);
+ attr_b->nres.data_size = cpu_to_le64(data_size);
+ attr_b->nres.valid_size = cpu_to_le64(min(valid_size, data_size));
+ total_size -= (u64)dealloc << sbi->cluster_bits;
+ if (is_attr_ext(attr_b))
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ mi_b->dirty = true;
+
+ /* Update inode size. */
+ ni->i_valid = valid_size;
+ ni->vfs_inode.i_size = data_size;
+ inode_set_bytes(&ni->vfs_inode, total_size);
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+
+out:
+ up_write(&ni->file.run_lock);
+ if (err)
+ make_bad_inode(&ni->vfs_inode);
+
+ return err;
+}
+
+/* Not for normal files: the attribute must be sparse or compressed. */
+int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes, u32 *frame_size)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST svcn, evcn1, vcn, len, end, alen, dealloc;
+ u64 total_size, alloc_size;
+ u32 mask;
+
+ if (!bytes)
+ return 0;
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b)
+ return -ENOENT;
+
+ if (!attr_b->non_res) {
+ u32 data_size = le32_to_cpu(attr_b->res.data_size);
+ u32 from, to;
+
+ if (vbo > data_size)
+ return 0;
+
+ from = vbo;
+ to = (vbo + bytes) < data_size ? (vbo + bytes) : data_size;
+ memset(Add2Ptr(resident_data(attr_b), from), 0, to - from);
+ return 0;
+ }
+
+ if (!is_attr_ext(attr_b))
+ return -EOPNOTSUPP;
+
+ alloc_size = le64_to_cpu(attr_b->nres.alloc_size);
+ total_size = le64_to_cpu(attr_b->nres.total_size);
+
+ if (vbo >= alloc_size) {
+ // NOTE: it is allowed
+ return 0;
+ }
+
+ mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1;
+
+ bytes += vbo;
+ if (bytes > alloc_size)
+ bytes = alloc_size;
+ bytes -= vbo;
+
+ if ((vbo & mask) || (bytes & mask)) {
+ /* We have to zero one or more ranges. */
+ if (frame_size == NULL) {
+ /* Caller insists range is aligned */
+ return -EINVAL;
+ }
+ *frame_size = mask + 1;
+ return E_NTFS_NOTALIGNED;
+ }
+
+ down_write(&ni->file.run_lock);
+ /*
+ * Enumerate all attribute segments and punch hole where necessary
+ */
+ alen = alloc_size >> sbi->cluster_bits;
+ vcn = vbo >> sbi->cluster_bits;
+ len = bytes >> sbi->cluster_bits;
+ end = vcn + len;
+ dealloc = 0;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+ if (svcn <= vcn && vcn < evcn1) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ while (svcn < end) {
+ CLST vcn1, zero, dealloc2;
+
+ err = attr_load_runs(attr, ni, run, &svcn);
+ if (err)
+ goto out;
+ vcn1 = max(vcn, svcn);
+ zero = min(end, evcn1) - vcn1;
+
+ dealloc2 = dealloc;
+ err = run_deallocate_ex(sbi, run, vcn1, zero, &dealloc, true);
+ if (err)
+ goto out;
+
+ if (dealloc2 == dealloc) {
+ /* Looks like the required range is already sparse. */
+ } else {
+ if (!run_add_entry(run, vcn1, SPARSE_LCN, zero,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = mi_pack_runs(mi, attr, run, evcn1 - svcn);
+ if (err)
+ goto out;
+ }
+ /* free all allocated memory */
+ run_truncate(run, 0);
+
+ if (evcn1 >= alen)
+ break;
+
+ attr = ni_enum_attr_ex(ni, attr, &le, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ total_size -= (u64)dealloc << sbi->cluster_bits;
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ mi_b->dirty = true;
+
+ /* Update inode size. */
+ inode_set_bytes(&ni->vfs_inode, total_size);
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+
+out:
+ up_write(&ni->file.run_lock);
+ if (err)
+ make_bad_inode(&ni->vfs_inode);
+
+ return err;
+}
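As an aside, the alignment rule that both attr_collapse_range() and attr_punch_hole() apply can be sketched in isolation: for extended (sparse/compressed) attributes the unit is one compression unit, cluster_size << c_unit, otherwise a single cluster. A minimal standalone model follows; the helper name range_is_aligned and the sample sizes are hypothetical, not part of the patch.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of the '(vbo & mask) || (bytes & mask)' test above. */
static int range_is_aligned(uint64_t vbo, uint64_t bytes,
			    uint32_t cluster_size, uint8_t c_unit, int is_ext)
{
	uint64_t mask = is_ext ? (((uint64_t)cluster_size << c_unit) - 1)
			       : ((uint64_t)cluster_size - 1);

	return !(vbo & mask) && !(bytes & mask);
}

int main(void)
{
	/* 4K clusters, 16-cluster compression unit (c_unit == 4). */
	printf("%d\n", range_is_aligned(65536, 65536, 4096, 4, 1)); /* 1 */
	printf("%d\n", range_is_aligned(4096, 65536, 4096, 4, 1));  /* 0 */
	return 0;
}

When this test fails, collapse returns -EINVAL, while punch-hole reports E_NTFS_NOTALIGNED and leaves sub-unit zeroing to the caller.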
diff --git a/fs/ntfs3/attrlist.c b/fs/ntfs3/attrlist.c
new file mode 100644
index 000000000000..ea561361b576
--- /dev/null
+++ b/fs/ntfs3/attrlist.c
@@ -0,0 +1,456 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/* Returns true if le is valid */
+static inline bool al_is_valid_le(const struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le)
+{
+ if (!le || !ni->attr_list.le || !ni->attr_list.size)
+ return false;
+
+ return PtrOffset(ni->attr_list.le, le) + le16_to_cpu(le->size) <=
+ ni->attr_list.size;
+}
+
+void al_destroy(struct ntfs_inode *ni)
+{
+ run_close(&ni->attr_list.run);
+ ntfs_free(ni->attr_list.le);
+ ni->attr_list.le = NULL;
+ ni->attr_list.size = 0;
+ ni->attr_list.dirty = false;
+}
+
+/*
+ * ntfs_load_attr_list
+ *
+ * This method makes sure that the ATTRIB list, if present,
+ * has been properly set up.
+ */
+int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr)
+{
+ int err;
+ size_t lsize;
+ void *le = NULL;
+
+ if (ni->attr_list.size)
+ return 0;
+
+ if (!attr->non_res) {
+ lsize = le32_to_cpu(attr->res.data_size);
+ le = ntfs_malloc(al_aligned(lsize));
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+ }
+ memcpy(le, resident_data(attr), lsize);
+ } else if (attr->nres.svcn) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ u16 run_off = le16_to_cpu(attr->nres.run_off);
+
+ lsize = le64_to_cpu(attr->nres.data_size);
+
+ run_init(&ni->attr_list.run);
+
+ err = run_unpack_ex(&ni->attr_list.run, ni->mi.sbi, ni->mi.rno,
+ 0, le64_to_cpu(attr->nres.evcn), 0,
+ Add2Ptr(attr, run_off),
+ le32_to_cpu(attr->size) - run_off);
+ if (err < 0)
+ goto out;
+
+ le = ntfs_malloc(al_aligned(lsize));
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = ntfs_read_run_nb(ni->mi.sbi, &ni->attr_list.run, 0, le,
+ lsize, NULL);
+ if (err)
+ goto out;
+ }
+
+ ni->attr_list.size = lsize;
+ ni->attr_list.le = le;
+
+ return 0;
+
+out:
+ ni->attr_list.le = le;
+ al_destroy(ni);
+
+ return err;
+}
+
+/*
+ * al_enumerate
+ *
+ * Returns the next list entry after 'le';
+ * if 'le' is NULL, returns the first entry.
+ */
+struct ATTR_LIST_ENTRY *al_enumerate(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le)
+{
+ size_t off;
+ u16 sz;
+
+ if (!le) {
+ le = ni->attr_list.le;
+ } else {
+ sz = le16_to_cpu(le->size);
+ if (sz < sizeof(struct ATTR_LIST_ENTRY)) {
+ /* Impossible, because we should never have returned such a 'le'. */
+ return NULL;
+ }
+ le = Add2Ptr(le, sz);
+ }
+
+ /* Check boundary */
+ off = PtrOffset(ni->attr_list.le, le);
+ if (off + sizeof(struct ATTR_LIST_ENTRY) > ni->attr_list.size) {
+ // The regular end of list
+ return NULL;
+ }
+
+ sz = le16_to_cpu(le->size);
+
+ /* Check 'le' for errors */
+ if (sz < sizeof(struct ATTR_LIST_ENTRY) ||
+ off + sz > ni->attr_list.size ||
+ sz < le->name_off + le->name_len * sizeof(short)) {
+ return NULL;
+ }
+
+ return le;
+}
+
+/*
+ * al_find_le
+ *
+ * finds the first 'le' in the list which matches type, name and vcn
+ * Returns NULL if not found
+ */
+struct ATTR_LIST_ENTRY *al_find_le(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le,
+ const struct ATTRIB *attr)
+{
+ CLST svcn = attr_svcn(attr);
+
+ return al_find_ex(ni, le, attr->type, attr_name(attr), attr->name_len,
+ &svcn);
+}
+
+/*
+ * al_find_ex
+ *
+ * finds the first 'le' in the list which matches type, name and vcn
+ * Returns NULL if not found
+ */
+struct ATTR_LIST_ENTRY *al_find_ex(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le,
+ enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, const CLST *vcn)
+{
+ struct ATTR_LIST_ENTRY *ret = NULL;
+ u32 type_in = le32_to_cpu(type);
+
+ while ((le = al_enumerate(ni, le))) {
+ u64 le_vcn;
+ int diff = le32_to_cpu(le->type) - type_in;
+
+ /* List entries are sorted by type, name and vcn */
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return ret;
+
+ if (le->name_len != name_len)
+ continue;
+
+ le_vcn = le64_to_cpu(le->vcn);
+ if (!le_vcn) {
+ /*
+ * compare entry names only for entry with vcn == 0
+ */
+ diff = ntfs_cmp_names(le_name(le), name_len, name,
+ name_len, ni->mi.sbi->upcase,
+ true);
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return ret;
+ }
+
+ if (!vcn)
+ return le;
+
+ if (*vcn == le_vcn)
+ return le;
+
+ if (*vcn < le_vcn)
+ return ret;
+
+ ret = le;
+ }
+
+ return ret;
+}
+
+/*
+ * al_find_le_to_insert
+ *
+ * finds the first list entry which matches type, name and vcn
+ */
+static struct ATTR_LIST_ENTRY *al_find_le_to_insert(struct ntfs_inode *ni,
+ enum ATTR_TYPE type,
+ const __le16 *name,
+ u8 name_len, CLST vcn)
+{
+ struct ATTR_LIST_ENTRY *le = NULL, *prev;
+ u32 type_in = le32_to_cpu(type);
+
+ /* List entries are sorted by type, name, vcn */
+ while ((le = al_enumerate(ni, prev = le))) {
+ int diff = le32_to_cpu(le->type) - type_in;
+
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return le;
+
+ if (!le->vcn) {
+ /*
+ * compare entry names only for entry with vcn == 0
+ */
+ diff = ntfs_cmp_names(le_name(le), le->name_len, name,
+ name_len, ni->mi.sbi->upcase,
+ true);
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return le;
+ }
+
+ if (le64_to_cpu(le->vcn) >= vcn)
+ return le;
+ }
+
+ return prev ? Add2Ptr(prev, le16_to_cpu(prev->size)) : ni->attr_list.le;
+}
+
+/*
+ * al_add_le
+ *
+ * adds an "attribute list entry" to the list.
+ */
+int al_add_le(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, CLST svcn, __le16 id, const struct MFT_REF *ref,
+ struct ATTR_LIST_ENTRY **new_le)
+{
+ int err;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ size_t off;
+ u16 sz;
+ size_t asize, new_asize;
+ u64 new_size;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ /*
+ * Compute the size of the new 'le'
+ */
+ sz = le_size(name_len);
+ new_size = al->size + sz;
+ asize = al_aligned(al->size);
+ new_asize = al_aligned(new_size);
+
+ /* Scan forward to the point at which the new 'le' should be inserted. */
+ le = al_find_le_to_insert(ni, type, name, name_len, svcn);
+ off = PtrOffset(al->le, le);
+
+ if (new_size > asize) {
+ void *ptr = ntfs_malloc(new_asize);
+
+ if (!ptr)
+ return -ENOMEM;
+
+ memcpy(ptr, al->le, off);
+ memcpy(Add2Ptr(ptr, off + sz), le, al->size - off);
+ le = Add2Ptr(ptr, off);
+ ntfs_free(al->le);
+ al->le = ptr;
+ } else {
+ memmove(Add2Ptr(le, sz), le, al->size - off);
+ }
+
+ al->size = new_size;
+
+ le->type = type;
+ le->size = cpu_to_le16(sz);
+ le->name_len = name_len;
+ le->name_off = offsetof(struct ATTR_LIST_ENTRY, name);
+ le->vcn = cpu_to_le64(svcn);
+ le->ref = *ref;
+ le->id = id;
+ memcpy(le->name, name, sizeof(short) * name_len);
+
+ al->dirty = true;
+
+ err = attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, new_size,
+ &new_size, true, &attr);
+ if (err)
+ return err;
+
+ if (attr && attr->non_res) {
+ err = ntfs_sb_write_run(ni->mi.sbi, &al->run, 0, al->le,
+ al->size);
+ if (err)
+ return err;
+ }
+
+ al->dirty = false;
+ *new_le = le;
+
+ return 0;
+}
+
+/*
+ * al_remove_le
+ *
+ * removes 'le' from attribute list
+ */
+bool al_remove_le(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le)
+{
+ u16 size;
+ size_t off;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ if (!al_is_valid_le(ni, le))
+ return false;
+
+ /* Save the size of 'le' on the stack. */
+ size = le16_to_cpu(le->size);
+ off = PtrOffset(al->le, le);
+
+ memmove(le, Add2Ptr(le, size), al->size - (off + size));
+
+ al->size -= size;
+ al->dirty = true;
+
+ return true;
+}
+
+/*
+ * al_delete_le
+ *
+ * deletes from the list the first 'le' which matches its parameters.
+ */
+bool al_delete_le(struct ntfs_inode *ni, enum ATTR_TYPE type, CLST vcn,
+ const __le16 *name, size_t name_len,
+ const struct MFT_REF *ref)
+{
+ u16 size;
+ struct ATTR_LIST_ENTRY *le;
+ size_t off;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ /* Scan forward to the first 'le' that matches the input */
+ le = al_find_ex(ni, NULL, type, name, name_len, &vcn);
+ if (!le)
+ return false;
+
+ off = PtrOffset(al->le, le);
+
+next:
+ if (off >= al->size)
+ return false;
+ if (le->type != type)
+ return false;
+ if (le->name_len != name_len)
+ return false;
+ if (name_len && ntfs_cmp_names(le_name(le), name_len, name, name_len,
+ ni->mi.sbi->upcase, true))
+ return false;
+ if (le64_to_cpu(le->vcn) != vcn)
+ return false;
+
+ /*
+ * The caller specified a segment reference, so we have to
+ * scan through the matching entries until we find that segment
+ * reference or we run out of matching entries.
+ */
+ if (ref && memcmp(ref, &le->ref, sizeof(*ref))) {
+ off += le16_to_cpu(le->size);
+ le = Add2Ptr(al->le, off);
+ goto next;
+ }
+
+ /* Save the size of 'le' on the stack. */
+ size = le16_to_cpu(le->size);
+ /* Delete 'le'. */
+ memmove(le, Add2Ptr(le, size), al->size - (off + size));
+
+ al->size -= size;
+ al->dirty = true;
+
+ return true;
+}
+
+/*
+ * al_update
+ */
+int al_update(struct ntfs_inode *ni)
+{
+ int err;
+ struct ATTRIB *attr;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ if (!al->dirty || !al->size)
+ return 0;
+
+ /*
+ * The attribute list is grown on demand in al_add_le;
+ * it is shrunk here.
+ */
+ err = attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, al->size, NULL,
+ false, &attr);
+ if (err)
+ goto out;
+
+ if (!attr->non_res) {
+ memcpy(resident_data(attr), al->le, al->size);
+ } else {
+ err = ntfs_sb_write_run(ni->mi.sbi, &al->run, 0, al->le,
+ al->size);
+ if (err)
+ goto out;
+
+ attr->nres.valid_size = attr->nres.data_size;
+ }
+
+ ni->mi.dirty = true;
+ al->dirty = false;
+
+out:
+ return err;
+}
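The buffer surgery al_add_le() performs is easier to see in a rough userspace sketch: either memmove the tail to open a gap in place, or rebuild a larger buffer from the two halves around the insertion offset. The helper packed_insert and its doubling growth policy below are hypothetical (the kernel sizes the buffer with al_aligned()), shown only to illustrate the memcpy/memmove pattern.

#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical model of al_add_le()'s insertion step: open a gap of
 * ent_sz bytes at 'off' inside a packed buffer of variable-length
 * entries, growing the allocation when needed.
 */
unsigned char *packed_insert(unsigned char *buf, size_t *size, size_t *alloc,
			     size_t off, const void *ent, size_t ent_sz)
{
	if (*size + ent_sz > *alloc) {
		size_t na = (*size + ent_sz) * 2;
		unsigned char *p = malloc(na);

		if (!p)
			return NULL;
		memcpy(p, buf, off);                      /* entries before 'off' */
		memcpy(p + off + ent_sz, buf + off, *size - off); /* entries after */
		free(buf);
		buf = p;
		*alloc = na;
	} else {
		memmove(buf + off + ent_sz, buf + off, *size - off);
	}
	memcpy(buf + off, ent, ent_sz);
	*size += ent_sz;
	return buf;
}

The kernel version additionally keeps the on-disk attribute list in sync, which is what the attr_set_size()/ntfs_sb_write_run() calls after the insertion are for.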
diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
new file mode 100644
index 000000000000..98871c895e77
--- /dev/null
+++ b/fs/ntfs3/xattr.c
@@ -0,0 +1,1128 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+#include <linux/posix_acl.h>
+#include <linux/posix_acl_xattr.h>
+#include <linux/xattr.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+#define SYSTEM_DOS_ATTRIB "system.dos_attrib"
+#define SYSTEM_NTFS_ATTRIB "system.ntfs_attrib"
+#define SYSTEM_NTFS_SECURITY "system.ntfs_security"
+// clang-format on
+
+static inline size_t unpacked_ea_size(const struct EA_FULL *ea)
+{
+ return ea->size ? le32_to_cpu(ea->size)
+ : DwordAlign(struct_size(
+ ea, name,
+ 1 + ea->name_len + le16_to_cpu(ea->elength)));
+}
+
+static inline size_t packed_ea_size(const struct EA_FULL *ea)
+{
+ return struct_size(ea, name,
+ 1 + ea->name_len + le16_to_cpu(ea->elength)) -
+ offsetof(struct EA_FULL, flags);
+}
+
+/*
+ * find_ea
+ *
+ * assume there is at least one xattr in the list
+ */
+static inline bool find_ea(const struct EA_FULL *ea_all, u32 bytes,
+ const char *name, u8 name_len, u32 *off)
+{
+ *off = 0;
+
+ if (!ea_all || !bytes)
+ return false;
+
+ for (;;) {
+ const struct EA_FULL *ea = Add2Ptr(ea_all, *off);
+ u32 next_off = *off + unpacked_ea_size(ea);
+
+ if (next_off > bytes)
+ return false;
+
+ if (ea->name_len == name_len &&
+ !memcmp(ea->name, name, name_len))
+ return true;
+
+ *off = next_off;
+ if (next_off >= bytes)
+ return false;
+ }
+}
+
+/*
+ * ntfs_read_ea
+ *
+ * reads all extended attributes
+ * ea - newly allocated memory
+ * info - pointer into resident data
+ */
+static int ntfs_read_ea(struct ntfs_inode *ni, struct EA_FULL **ea,
+ size_t add_bytes, const struct EA_INFO **info)
+{
+ int err;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct ATTRIB *attr_info, *attr_ea;
+ void *ea_p;
+ u32 size;
+
+ static_assert(le32_to_cpu(ATTR_EA_INFO) < le32_to_cpu(ATTR_EA));
+
+ *ea = NULL;
+ *info = NULL;
+
+ attr_info =
+ ni_find_attr(ni, NULL, &le, ATTR_EA_INFO, NULL, 0, NULL, NULL);
+ attr_ea =
+ ni_find_attr(ni, attr_info, &le, ATTR_EA, NULL, 0, NULL, NULL);
+
+ if (!attr_ea || !attr_info)
+ return 0;
+
+ *info = resident_data_ex(attr_info, sizeof(struct EA_INFO));
+ if (!*info)
+ return -EINVAL;
+
+ /* Check Ea limit */
+ size = le32_to_cpu((*info)->size);
+ if (size > ni->mi.sbi->ea_max_size)
+ return -EFBIG;
+
+ if (attr_size(attr_ea) > ni->mi.sbi->ea_max_size)
+ return -EFBIG;
+
+ /* Allocate memory for packed Ea */
+ ea_p = ntfs_malloc(size + add_bytes);
+ if (!ea_p)
+ return -ENOMEM;
+
+ if (attr_ea->non_res) {
+ struct runs_tree run;
+
+ run_init(&run);
+
+ err = attr_load_runs(attr_ea, ni, &run, NULL);
+ if (!err)
+ err = ntfs_read_run_nb(ni->mi.sbi, &run, 0, ea_p, size,
+ NULL);
+ run_close(&run);
+
+ if (err)
+ goto out;
+ } else {
+ void *p = resident_data_ex(attr_ea, size);
+
+ if (!p) {
+ err = -EINVAL;
+ goto out;
+ }
+ memcpy(ea_p, p, size);
+ }
+
+ memset(Add2Ptr(ea_p, size), 0, add_bytes);
+ *ea = ea_p;
+ return 0;
+
+out:
+ ntfs_free(ea_p);
+ *ea = NULL;
+ return err;
+}
+
+/*
+ * ntfs_list_ea
+ *
+ * copy a list of xattrs names into the buffer
+ * provided, or compute the buffer size required
+ *
+ * Returns a negative error number on failure, or the number of bytes
+ * used / required on success.
+ */
+static ssize_t ntfs_list_ea(struct ntfs_inode *ni, char *buffer,
+ size_t bytes_per_buffer)
+{
+ const struct EA_INFO *info;
+ struct EA_FULL *ea_all = NULL;
+ const struct EA_FULL *ea;
+ u32 off, size;
+ int err;
+ size_t ret;
+
+ err = ntfs_read_ea(ni, &ea_all, 0, &info);
+ if (err)
+ return err;
+
+ if (!info || !ea_all)
+ return 0;
+
+ size = le32_to_cpu(info->size);
+
+ /* Enumerate all xattrs */
+ for (ret = 0, off = 0; off < size; off += unpacked_ea_size(ea)) {
+ ea = Add2Ptr(ea_all, off);
+
+ if (buffer) {
+ if (ret + ea->name_len + 1 > bytes_per_buffer) {
+ err = -ERANGE;
+ goto out;
+ }
+
+ memcpy(buffer + ret, ea->name, ea->name_len);
+ buffer[ret + ea->name_len] = 0;
+ }
+
+ ret += ea->name_len + 1;
+ }
+
+out:
+ ntfs_free(ea_all);
+ return err ? err : ret;
+}
+
+static int ntfs_get_ea(struct inode *inode, const char *name, size_t name_len,
+ void *buffer, size_t size, size_t *required)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+ const struct EA_INFO *info;
+ struct EA_FULL *ea_all = NULL;
+ const struct EA_FULL *ea;
+ u32 off, len;
+ int err;
+
+ if (!(ni->ni_flags & NI_FLAG_EA))
+ return -ENODATA;
+
+ if (!required)
+ ni_lock(ni);
+
+ len = 0;
+
+ if (name_len > 255) {
+ err = -ENAMETOOLONG;
+ goto out;
+ }
+
+ err = ntfs_read_ea(ni, &ea_all, 0, &info);
+ if (err)
+ goto out;
+
+ if (!info)
+ goto out;
+
+ /* Enumerate all xattrs */
+ if (!find_ea(ea_all, le32_to_cpu(info->size), name, name_len, &off)) {
+ err = -ENODATA;
+ goto out;
+ }
+ ea = Add2Ptr(ea_all, off);
+
+ len = le16_to_cpu(ea->elength);
+ if (!buffer) {
+ err = 0;
+ goto out;
+ }
+
+ if (len > size) {
+ err = -ERANGE;
+ if (required)
+ *required = len;
+ goto out;
+ }
+
+ memcpy(buffer, ea->name + ea->name_len + 1, len);
+ err = 0;
+
+out:
+ ntfs_free(ea_all);
+ if (!required)
+ ni_unlock(ni);
+
+ return err ? err : len;
+}
+
+static noinline int ntfs_set_ea(struct inode *inode, const char *name,
+ size_t name_len, const void *value,
+ size_t val_size, int flags, int locked)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ int err;
+ struct EA_INFO ea_info;
+ const struct EA_INFO *info;
+ struct EA_FULL *new_ea;
+ struct EA_FULL *ea_all = NULL;
+ size_t add, new_pack;
+ u32 off, size;
+ __le16 size_pack;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ struct mft_inode *mi;
+ struct runs_tree ea_run;
+ u64 new_sz;
+ void *p;
+
+ if (!locked)
+ ni_lock(ni);
+
+ run_init(&ea_run);
+
+ if (name_len > 255) {
+ err = -ENAMETOOLONG;
+ goto out;
+ }
+
+ add = DwordAlign(struct_size(ea_all, name, 1 + name_len + val_size));
+
+ err = ntfs_read_ea(ni, &ea_all, add, &info);
+ if (err)
+ goto out;
+
+ if (!info) {
+ memset(&ea_info, 0, sizeof(ea_info));
+ size = 0;
+ size_pack = 0;
+ } else {
+ memcpy(&ea_info, info, sizeof(ea_info));
+ size = le32_to_cpu(ea_info.size);
+ size_pack = ea_info.size_pack;
+ }
+
+ if (info && find_ea(ea_all, size, name, name_len, &off)) {
+ struct EA_FULL *ea;
+ size_t ea_sz;
+
+ if (flags & XATTR_CREATE) {
+ err = -EEXIST;
+ goto out;
+ }
+
+ ea = Add2Ptr(ea_all, off);
+
+ /*
+ * Check the simple case where we try to insert an xattr with the
+ * same value, e.g. from ntfs_save_wsl_perm
+ */
+ if (val_size && le16_to_cpu(ea->elength) == val_size &&
+ !memcmp(ea->name + ea->name_len + 1, value, val_size)) {
+ /* xattr already contains the required value */
+ goto out;
+ }
+
+ /* Remove current xattr */
+ if (ea->flags & FILE_NEED_EA)
+ le16_add_cpu(&ea_info.count, -1);
+
+ ea_sz = unpacked_ea_size(ea);
+
+ le16_add_cpu(&ea_info.size_pack, 0 - packed_ea_size(ea));
+
+ memmove(ea, Add2Ptr(ea, ea_sz), size - off - ea_sz);
+
+ size -= ea_sz;
+ memset(Add2Ptr(ea_all, size), 0, ea_sz);
+
+ ea_info.size = cpu_to_le32(size);
+
+ if ((flags & XATTR_REPLACE) && !val_size) {
+ /* remove xattr */
+ goto update_ea;
+ }
+ } else {
+ if (flags & XATTR_REPLACE) {
+ err = -ENODATA;
+ goto out;
+ }
+
+ if (!ea_all) {
+ ea_all = ntfs_zalloc(add);
+ if (!ea_all) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+ }
+
+ /* append new xattr */
+ new_ea = Add2Ptr(ea_all, size);
+ new_ea->size = cpu_to_le32(add);
+ new_ea->flags = 0;
+ new_ea->name_len = name_len;
+ new_ea->elength = cpu_to_le16(val_size);
+ memcpy(new_ea->name, name, name_len);
+ new_ea->name[name_len] = 0;
+ memcpy(new_ea->name + name_len + 1, value, val_size);
+ new_pack = le16_to_cpu(ea_info.size_pack) + packed_ea_size(new_ea);
+
+ /* should fit into 16 bits */
+ if (new_pack > 0xffff) {
+ err = -EFBIG; // -EINVAL?
+ goto out;
+ }
+ ea_info.size_pack = cpu_to_le16(new_pack);
+
+ /* new size of ATTR_EA */
+ size += add;
+ if (size > sbi->ea_max_size) {
+ err = -EFBIG; // -EINVAL?
+ goto out;
+ }
+ ea_info.size = cpu_to_le32(size);
+
+update_ea:
+
+ if (!info) {
+ /* Create xattr */
+ if (!size) {
+ err = 0;
+ goto out;
+ }
+
+ err = ni_insert_resident(ni, sizeof(struct EA_INFO),
+ ATTR_EA_INFO, NULL, 0, NULL, NULL);
+ if (err)
+ goto out;
+
+ err = ni_insert_resident(ni, 0, ATTR_EA, NULL, 0, NULL, NULL);
+ if (err)
+ goto out;
+ }
+
+ new_sz = size;
+ err = attr_set_size(ni, ATTR_EA, NULL, 0, &ea_run, new_sz, &new_sz,
+ false, NULL);
+ if (err)
+ goto out;
+
+ le = NULL;
+ attr = ni_find_attr(ni, NULL, &le, ATTR_EA_INFO, NULL, 0, NULL, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!size) {
+ /* delete xattr, ATTR_EA_INFO */
+ err = ni_remove_attr_le(ni, attr, le);
+ if (err)
+ goto out;
+ } else {
+ p = resident_data_ex(attr, sizeof(struct EA_INFO));
+ if (!p) {
+ err = -EINVAL;
+ goto out;
+ }
+ memcpy(p, &ea_info, sizeof(struct EA_INFO));
+ mi->dirty = true;
+ }
+
+ le = NULL;
+ attr = ni_find_attr(ni, NULL, &le, ATTR_EA, NULL, 0, NULL, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!size) {
+ /* delete xattr, ATTR_EA */
+ err = ni_remove_attr_le(ni, attr, le);
+ if (err)
+ goto out;
+ } else if (attr->non_res) {
+ err = ntfs_sb_write_run(sbi, &ea_run, 0, ea_all, size);
+ if (err)
+ goto out;
+ } else {
+ p = resident_data_ex(attr, size);
+ if (!p) {
+ err = -EINVAL;
+ goto out;
+ }
+ memcpy(p, ea_all, size);
+ mi->dirty = true;
+ }
+
+ /* Check if we deleted the last xattr */
+ if (size)
+ ni->ni_flags |= NI_FLAG_EA;
+ else
+ ni->ni_flags &= ~NI_FLAG_EA;
+
+ if (ea_info.size_pack != size_pack)
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+
+out:
+ if (!locked)
+ ni_unlock(ni);
+
+ run_close(&ea_run);
+ ntfs_free(ea_all);
+
+ return err;
+}
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+static inline void ntfs_posix_acl_release(struct posix_acl *acl)
+{
+ if (acl && refcount_dec_and_test(&acl->a_refcount))
+ kfree(acl);
+}
+
+static struct posix_acl *ntfs_get_acl_ex(struct user_namespace *mnt_userns,
+ struct inode *inode, int type,
+ int locked)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+ const char *name;
+ size_t name_len;
+ struct posix_acl *acl;
+ size_t req;
+ int err;
+ void *buf;
+
+ /* allocate PATH_MAX bytes */
+ buf = __getname();
+ if (!buf)
+ return ERR_PTR(-ENOMEM);
+
+ /* Possible values of 'type' were already checked by the caller */
+ if (type == ACL_TYPE_ACCESS) {
+ name = XATTR_NAME_POSIX_ACL_ACCESS;
+ name_len = sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1;
+ } else {
+ name = XATTR_NAME_POSIX_ACL_DEFAULT;
+ name_len = sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1;
+ }
+
+ if (!locked)
+ ni_lock(ni);
+
+ err = ntfs_get_ea(inode, name, name_len, buf, PATH_MAX, &req);
+
+ if (!locked)
+ ni_unlock(ni);
+
+ /* Translate extended attribute to acl */
+ if (err > 0) {
+ acl = posix_acl_from_xattr(mnt_userns, buf, err);
+ if (!IS_ERR(acl))
+ set_cached_acl(inode, type, acl);
+ } else {
+ acl = err == -ENODATA ? NULL : ERR_PTR(err);
+ }
+
+ __putname(buf);
+
+ return acl;
+}
+
+/*
+ * ntfs_get_acl
+ *
+ * inode_operations::get_acl
+ */
+struct posix_acl *ntfs_get_acl(struct inode *inode, int type)
+{
+ /* TODO: init_user_ns? */
+ return ntfs_get_acl_ex(&init_user_ns, inode, type, 0);
+}
+
+static noinline int ntfs_set_acl_ex(struct user_namespace *mnt_userns,
+ struct inode *inode, struct posix_acl *acl,
+ int type, int locked)
+{
+ const char *name;
+ size_t size, name_len;
+ void *value = NULL;
+ int err = 0;
+
+ if (S_ISLNK(inode->i_mode))
+ return -EOPNOTSUPP;
+
+ switch (type) {
+ case ACL_TYPE_ACCESS:
+ if (acl) {
+ umode_t mode = inode->i_mode;
+
+ err = posix_acl_equiv_mode(acl, &mode);
+ if (err < 0)
+ return err;
+
+ if (inode->i_mode != mode) {
+ inode->i_mode = mode;
+ mark_inode_dirty(inode);
+ }
+
+ if (!err) {
+ /*
+ * acl can be exactly represented in the
+ * traditional file mode permission bits
+ */
+ acl = NULL;
+ }
+ }
+ name = XATTR_NAME_POSIX_ACL_ACCESS;
+ name_len = sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1;
+ break;
+
+ case ACL_TYPE_DEFAULT:
+ if (!S_ISDIR(inode->i_mode))
+ return acl ? -EACCES : 0;
+ name = XATTR_NAME_POSIX_ACL_DEFAULT;
+ name_len = sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ if (!acl) {
+ size = 0;
+ value = NULL;
+ } else {
+ size = posix_acl_xattr_size(acl->a_count);
+ value = ntfs_malloc(size);
+ if (!value)
+ return -ENOMEM;
+
+ err = posix_acl_to_xattr(mnt_userns, acl, value, size);
+ if (err < 0)
+ goto out;
+ }
+
+ err = ntfs_set_ea(inode, name, name_len, value, size,
+ acl ? 0 : XATTR_REPLACE, locked);
+ if (!err)
+ set_cached_acl(inode, type, acl);
+
+out:
+ ntfs_free(value);
+
+ return err;
+}
+
+/*
+ * ntfs_set_acl
+ *
+ * inode_operations::set_acl
+ */
+int ntfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
+ struct posix_acl *acl, int type)
+{
+ return ntfs_set_acl_ex(mnt_userns, inode, acl, type, 0);
+}
+
+static int ntfs_xattr_get_acl(struct user_namespace *mnt_userns,
+ struct inode *inode, int type, void *buffer,
+ size_t size)
+{
+ struct posix_acl *acl;
+ int err;
+
+ if (!(inode->i_sb->s_flags & SB_POSIXACL))
+ return -EOPNOTSUPP;
+
+ acl = ntfs_get_acl(inode, type);
+ if (IS_ERR(acl))
+ return PTR_ERR(acl);
+
+ if (!acl)
+ return -ENODATA;
+
+ err = posix_acl_to_xattr(mnt_userns, acl, buffer, size);
+ ntfs_posix_acl_release(acl);
+
+ return err;
+}
+
+static int ntfs_xattr_set_acl(struct user_namespace *mnt_userns,
+ struct inode *inode, int type, const void *value,
+ size_t size)
+{
+ struct posix_acl *acl;
+ int err;
+
+ if (!(inode->i_sb->s_flags & SB_POSIXACL))
+ return -EOPNOTSUPP;
+
+ if (!inode_owner_or_capable(mnt_userns, inode))
+ return -EPERM;
+
+ if (!value) {
+ acl = NULL;
+ } else {
+ acl = posix_acl_from_xattr(mnt_userns, value, size);
+ if (IS_ERR(acl))
+ return PTR_ERR(acl);
+
+ if (acl) {
+ err = posix_acl_valid(mnt_userns, acl);
+ if (err)
+ goto release_and_out;
+ }
+ }
+
+ err = ntfs_set_acl(mnt_userns, inode, acl, type);
+
+release_and_out:
+ ntfs_posix_acl_release(acl);
+ return err;
+}
+
+/*
+ * Initialize the ACLs of a new inode. Called from ntfs_create_inode.
+ */
+int ntfs_init_acl(struct user_namespace *mnt_userns, struct inode *inode,
+ struct inode *dir)
+{
+ struct posix_acl *default_acl, *acl;
+ int err;
+
+ /*
+ * TODO refactoring lock
+ * ni_lock(dir) ... -> posix_acl_create(dir,...) -> ntfs_get_acl -> ni_lock(dir)
+ */
+ inode->i_default_acl = NULL;
+
+ default_acl = ntfs_get_acl_ex(mnt_userns, dir, ACL_TYPE_DEFAULT, 1);
+
+ if (!default_acl || default_acl == ERR_PTR(-EOPNOTSUPP)) {
+ inode->i_mode &= ~current_umask();
+ err = 0;
+ goto out;
+ }
+
+ if (IS_ERR(default_acl)) {
+ err = PTR_ERR(default_acl);
+ goto out;
+ }
+
+ acl = default_acl;
+ err = __posix_acl_create(&acl, GFP_NOFS, &inode->i_mode);
+ if (err < 0)
+ goto out1;
+ if (!err) {
+ posix_acl_release(acl);
+ acl = NULL;
+ }
+
+ if (!S_ISDIR(inode->i_mode)) {
+ posix_acl_release(default_acl);
+ default_acl = NULL;
+ }
+
+ if (default_acl)
+ err = ntfs_set_acl_ex(mnt_userns, inode, default_acl,
+ ACL_TYPE_DEFAULT, 1);
+
+ if (!acl)
+ inode->i_acl = NULL;
+ else if (!err)
+ err = ntfs_set_acl_ex(mnt_userns, inode, acl, ACL_TYPE_ACCESS,
+ 1);
+
+ posix_acl_release(acl);
+out1:
+ posix_acl_release(default_acl);
+
+out:
+ return err;
+}
+#endif
+
+/*
+ * ntfs_acl_chmod
+ *
+ * helper for 'ntfs3_setattr'
+ */
+int ntfs_acl_chmod(struct user_namespace *mnt_userns, struct inode *inode)
+{
+ struct super_block *sb = inode->i_sb;
+
+ if (!(sb->s_flags & SB_POSIXACL))
+ return 0;
+
+ if (S_ISLNK(inode->i_mode))
+ return -EOPNOTSUPP;
+
+ return posix_acl_chmod(mnt_userns, inode, inode->i_mode);
+}
+
+/*
+ * ntfs_permission
+ *
+ * inode_operations::permission
+ */
+int ntfs_permission(struct user_namespace *mnt_userns, struct inode *inode,
+ int mask)
+{
+ if (ntfs_sb(inode->i_sb)->options.no_acs_rules) {
+ /* "no access rules" mode - allow all changes */
+ return 0;
+ }
+
+ return generic_permission(mnt_userns, inode, mask);
+}
+
+/*
+ * ntfs_listxattr
+ *
+ * inode_operations::listxattr
+ */
+ssize_t ntfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
+{
+ struct inode *inode = d_inode(dentry);
+ struct ntfs_inode *ni = ntfs_i(inode);
+ ssize_t ret;
+
+ if (!(ni->ni_flags & NI_FLAG_EA)) {
+ /* no xattr in file */
+ return 0;
+ }
+
+ ni_lock(ni);
+
+ ret = ntfs_list_ea(ni, buffer, size);
+
+ ni_unlock(ni);
+
+ return ret;
+}
+
+static int ntfs_getxattr(const struct xattr_handler *handler, struct dentry *de,
+ struct inode *inode, const char *name, void *buffer,
+ size_t size)
+{
+ int err;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ size_t name_len = strlen(name);
+
+ /* Dispatch request */
+ if (name_len == sizeof(SYSTEM_DOS_ATTRIB) - 1 &&
+ !memcmp(name, SYSTEM_DOS_ATTRIB, sizeof(SYSTEM_DOS_ATTRIB))) {
+ /* system.dos_attrib */
+ if (!buffer) {
+ err = sizeof(u8);
+ } else if (size < sizeof(u8)) {
+ err = -ENODATA;
+ } else {
+ err = sizeof(u8);
+ *(u8 *)buffer = le32_to_cpu(ni->std_fa);
+ }
+ goto out;
+ }
+
+ if (name_len == sizeof(SYSTEM_NTFS_ATTRIB) - 1 &&
+ !memcmp(name, SYSTEM_NTFS_ATTRIB, sizeof(SYSTEM_NTFS_ATTRIB))) {
+ /* system.ntfs_attrib */
+ if (!buffer) {
+ err = sizeof(u32);
+ } else if (size < sizeof(u32)) {
+ err = -ENODATA;
+ } else {
+ err = sizeof(u32);
+ *(u32 *)buffer = le32_to_cpu(ni->std_fa);
+ }
+ goto out;
+ }
+
+ if (name_len == sizeof(SYSTEM_NTFS_SECURITY) - 1 &&
+ !memcmp(name, SYSTEM_NTFS_SECURITY, sizeof(SYSTEM_NTFS_SECURITY))) {
+ /* system.ntfs_security */
+ struct SECURITY_DESCRIPTOR_RELATIVE *sd = NULL;
+ size_t sd_size = 0;
+
+ if (!is_ntfs3(ni->mi.sbi)) {
+ /* we should get nt4 security */
+ err = -EINVAL;
+ goto out;
+ } else if (le32_to_cpu(ni->std_security_id) <
+ SECURITY_ID_FIRST) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ err = ntfs_get_security_by_id(ni->mi.sbi, ni->std_security_id,
+ &sd, &sd_size);
+ if (err)
+ goto out;
+
+ if (!is_sd_valid(sd, sd_size)) {
+ ntfs_inode_warn(
+ inode,
+ "looks like you get incorrect security descriptor id=%u",
+ ni->std_security_id);
+ }
+
+ if (!buffer) {
+ err = sd_size;
+ } else if (size < sd_size) {
+ err = -ENODATA;
+ } else {
+ err = sd_size;
+ memcpy(buffer, sd, sd_size);
+ }
+ ntfs_free(sd);
+ goto out;
+ }
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+ if ((name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 &&
+ !memcmp(name, XATTR_NAME_POSIX_ACL_ACCESS,
+ sizeof(XATTR_NAME_POSIX_ACL_ACCESS))) ||
+ (name_len == sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1 &&
+ !memcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT,
+ sizeof(XATTR_NAME_POSIX_ACL_DEFAULT)))) {
+ /* TODO: init_user_ns? */
+ err = ntfs_xattr_get_acl(
+ &init_user_ns, inode,
+ name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1
+ ? ACL_TYPE_ACCESS
+ : ACL_TYPE_DEFAULT,
+ buffer, size);
+ goto out;
+ }
+#endif
+ /* deal with ntfs extended attribute */
+ err = ntfs_get_ea(inode, name, name_len, buffer, size, NULL);
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_setxattr
+ *
+ * inode_operations::setxattr
+ */
+static noinline int ntfs_setxattr(const struct xattr_handler *handler,
+ struct user_namespace *mnt_userns,
+ struct dentry *de, struct inode *inode,
+ const char *name, const void *value,
+ size_t size, int flags)
+{
+ int err = -EINVAL;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ size_t name_len = strlen(name);
+ enum FILE_ATTRIBUTE new_fa;
+
+ /* Dispatch request */
+ if (name_len == sizeof(SYSTEM_DOS_ATTRIB) - 1 &&
+ !memcmp(name, SYSTEM_DOS_ATTRIB, sizeof(SYSTEM_DOS_ATTRIB))) {
+ if (sizeof(u8) != size)
+ goto out;
+ new_fa = cpu_to_le32(*(u8 *)value);
+ goto set_new_fa;
+ }
+
+ if (name_len == sizeof(SYSTEM_NTFS_ATTRIB) - 1 &&
+ !memcmp(name, SYSTEM_NTFS_ATTRIB, sizeof(SYSTEM_NTFS_ATTRIB))) {
+ if (size != sizeof(u32))
+ goto out;
+ new_fa = cpu_to_le32(*(u32 *)value);
+
+ if (S_ISREG(inode->i_mode)) {
+ /* Process compressed/sparse attributes in a special way. */
+ ni_lock(ni);
+ err = ni_new_attr_flags(ni, new_fa);
+ ni_unlock(ni);
+ if (err)
+ goto out;
+ }
+set_new_fa:
+ /*
+ * Thanks Mark Harmstone:
+ * keep directory bit consistency
+ */
+ if (S_ISDIR(inode->i_mode))
+ new_fa |= FILE_ATTRIBUTE_DIRECTORY;
+ else
+ new_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+
+ if (ni->std_fa != new_fa) {
+ ni->std_fa = new_fa;
+ if (new_fa & FILE_ATTRIBUTE_READONLY)
+ inode->i_mode &= ~0222;
+ else
+ inode->i_mode |= 0222;
+ /* std attribute always in primary record */
+ ni->mi.dirty = true;
+ mark_inode_dirty(inode);
+ }
+ err = 0;
+
+ goto out;
+ }
+
+ if (name_len == sizeof(SYSTEM_NTFS_SECURITY) - 1 &&
+ !memcmp(name, SYSTEM_NTFS_SECURITY, sizeof(SYSTEM_NTFS_SECURITY))) {
+ /* system.ntfs_security */
+ __le32 security_id;
+ bool inserted;
+ struct ATTR_STD_INFO5 *std;
+
+ if (!is_ntfs3(ni->mi.sbi)) {
+ /*
+ * we should replace ATTR_SECURE
+ * Skip this path because it is an NT4 feature
+ */
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!is_sd_valid(value, size)) {
+ err = -EINVAL;
+ ntfs_inode_warn(
+ inode,
+ "you try to set invalid security descriptor");
+ goto out;
+ }
+
+ err = ntfs_insert_security(ni->mi.sbi, value, size,
+ &security_id, &inserted);
+ if (err)
+ goto out;
+
+ ni_lock(ni);
+ std = ni_std5(ni);
+ if (!std) {
+ err = -EINVAL;
+ } else if (std->security_id != security_id) {
+ std->security_id = ni->std_security_id = security_id;
+ /* std attribute always in primary record */
+ ni->mi.dirty = true;
+ mark_inode_dirty(&ni->vfs_inode);
+ }
+ ni_unlock(ni);
+ goto out;
+ }
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+ if ((name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 &&
+ !memcmp(name, XATTR_NAME_POSIX_ACL_ACCESS,
+ sizeof(XATTR_NAME_POSIX_ACL_ACCESS))) ||
+ (name_len == sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1 &&
+ !memcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT,
+ sizeof(XATTR_NAME_POSIX_ACL_DEFAULT)))) {
+ err = ntfs_xattr_set_acl(
+ mnt_userns, inode,
+ name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1
+ ? ACL_TYPE_ACCESS
+ : ACL_TYPE_DEFAULT,
+ value, size);
+ goto out;
+ }
+#endif
+ /* deal with ntfs extended attribute */
+ err = ntfs_set_ea(inode, name, name_len, value, size, flags, 0);
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_save_wsl_perm
+ *
+ * save uid/gid/mode in xattr
+ */
+int ntfs_save_wsl_perm(struct inode *inode)
+{
+ int err;
+ __le32 value;
+
+ value = cpu_to_le32(i_uid_read(inode));
+ err = ntfs_set_ea(inode, "$LXUID", sizeof("$LXUID") - 1, &value,
+ sizeof(value), 0, 0);
+ if (err)
+ goto out;
+
+ value = cpu_to_le32(i_gid_read(inode));
+ err = ntfs_set_ea(inode, "$LXGID", sizeof("$LXGID") - 1, &value,
+ sizeof(value), 0, 0);
+ if (err)
+ goto out;
+
+ value = cpu_to_le32(inode->i_mode);
+ err = ntfs_set_ea(inode, "$LXMOD", sizeof("$LXMOD") - 1, &value,
+ sizeof(value), 0, 0);
+ if (err)
+ goto out;
+
+ if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
+ value = cpu_to_le32(inode->i_rdev);
+ err = ntfs_set_ea(inode, "$LXDEV", sizeof("$LXDEV") - 1, &value,
+ sizeof(value), 0, 0);
+ if (err)
+ goto out;
+ }
+
+out:
+ /* In case of error should we delete all WSL xattr? */
+ return err;
+}
+
+/*
+ * ntfs_get_wsl_perm
+ *
+ * get uid/gid/mode from xattr
+ * it is called from ntfs_iget5->ntfs_read_mft
+ */
+void ntfs_get_wsl_perm(struct inode *inode)
+{
+ size_t sz;
+ __le32 value[3];
+
+ if (ntfs_get_ea(inode, "$LXUID", sizeof("$LXUID") - 1, &value[0],
+ sizeof(value[0]), &sz) == sizeof(value[0]) &&
+ ntfs_get_ea(inode, "$LXGID", sizeof("$LXGID") - 1, &value[1],
+ sizeof(value[1]), &sz) == sizeof(value[1]) &&
+ ntfs_get_ea(inode, "$LXMOD", sizeof("$LXMOD") - 1, &value[2],
+ sizeof(value[2]), &sz) == sizeof(value[2])) {
+ i_uid_write(inode, (uid_t)le32_to_cpu(value[0]));
+ i_gid_write(inode, (gid_t)le32_to_cpu(value[1]));
+ inode->i_mode = le32_to_cpu(value[2]);
+
+ if (ntfs_get_ea(inode, "$LXDEV", sizeof("$$LXDEV") - 1,
+ &value[0], sizeof(value),
+ &sz) == sizeof(value[0])) {
+ inode->i_rdev = le32_to_cpu(value[0]);
+ }
+ }
+}
+
+static bool ntfs_xattr_user_list(struct dentry *dentry)
+{
+ return true;
+}
+
+// clang-format off
+static const struct xattr_handler ntfs_xattr_handler = {
+ .prefix = "",
+ .get = ntfs_getxattr,
+ .set = ntfs_setxattr,
+ .list = ntfs_xattr_user_list,
+};
+
+const struct xattr_handler *ntfs_xattr_handlers[] = {
+ &ntfs_xattr_handler,
+ NULL,
+};
+// clang-format on
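As a usage note: once a volume is mounted with ntfs3, the special names handled above are reachable through the ordinary xattr syscalls. A small userspace sketch (the file path is an assumption) reads system.ntfs_attrib, which ntfs_getxattr() answers with a host-endian u32 of the NTFS attribute flags:

#include <stdio.h>
#include <stdint.h>
#include <sys/xattr.h>

int main(int argc, char **argv)
{
	uint32_t fa;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file on ntfs3 mount>\n", argv[0]);
		return 1;
	}
	/* Dispatched to ntfs_getxattr() -> the SYSTEM_NTFS_ATTRIB branch. */
	if (getxattr(argv[1], "system.ntfs_attrib", &fa, sizeof(fa)) !=
	    (ssize_t)sizeof(fa)) {
		perror("getxattr");
		return 1;
	}
	printf("NTFS attrib: 0x%08x\n", (unsigned int)fa);
	return 0;
}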
--
2.30.0
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15-rc1
commit 3f3b442b5ad2455507c9bfdacf39a3792eb3a6d0
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds bitmap support (bit-range helpers and the free-cluster bitmap)
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/bitfunc.c | 135 ++++
fs/ntfs3/bitmap.c | 1519 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 1654 insertions(+)
create mode 100644 fs/ntfs3/bitfunc.c
create mode 100644 fs/ntfs3/bitmap.c
diff --git a/fs/ntfs3/bitfunc.c b/fs/ntfs3/bitfunc.c
new file mode 100644
index 000000000000..2de5faef2721
--- /dev/null
+++ b/fs/ntfs3/bitfunc.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+#define BITS_IN_SIZE_T (sizeof(size_t) * 8)
+
+/*
+ * fill_mask[i] - first i bits are '1', i = 0,1,2,3,4,5,6,7,8
+ * fill_mask[i] = 0xFF >> (8-i)
+ */
+static const u8 fill_mask[] = { 0x00, 0x01, 0x03, 0x07, 0x0F,
+ 0x1F, 0x3F, 0x7F, 0xFF };
+
+/*
+ * zero_mask[i] - first i bits are '0', i = 0,1,2,3,4,5,6,7,8
+ * zero_mask[i] = 0xFF << i
+ */
+static const u8 zero_mask[] = { 0xFF, 0xFE, 0xFC, 0xF8, 0xF0,
+ 0xE0, 0xC0, 0x80, 0x00 };
+
+/*
+ * are_bits_clear
+ *
+ * Returns true if all bits [bit, bit+nbits) are zeros "0"
+ */
+bool are_bits_clear(const ulong *lmap, size_t bit, size_t nbits)
+{
+ size_t pos = bit & 7;
+ const u8 *map = (u8 *)lmap + (bit >> 3);
+
+ if (pos) {
+ if (8 - pos >= nbits)
+ return !nbits || !(*map & fill_mask[pos + nbits] &
+ zero_mask[pos]);
+
+ if (*map++ & zero_mask[pos])
+ return false;
+ nbits -= 8 - pos;
+ }
+
+ pos = ((size_t)map) & (sizeof(size_t) - 1);
+ if (pos) {
+ pos = sizeof(size_t) - pos;
+ if (nbits >= pos * 8) {
+ for (nbits -= pos * 8; pos; pos--, map++) {
+ if (*map)
+ return false;
+ }
+ }
+ }
+
+ for (pos = nbits / BITS_IN_SIZE_T; pos; pos--, map += sizeof(size_t)) {
+ if (*((size_t *)map))
+ return false;
+ }
+
+ for (pos = (nbits % BITS_IN_SIZE_T) >> 3; pos; pos--, map++) {
+ if (*map)
+ return false;
+ }
+
+ pos = nbits & 7;
+ if (pos && (*map & fill_mask[pos]))
+ return false;
+
+ // All bits are zero
+ return true;
+}
+
+/*
+ * are_bits_set
+ *
+ * Returns true if all bits [bit, bit+nbits) are ones "1"
+ */
+bool are_bits_set(const ulong *lmap, size_t bit, size_t nbits)
+{
+ u8 mask;
+ size_t pos = bit & 7;
+ const u8 *map = (u8 *)lmap + (bit >> 3);
+
+ if (pos) {
+ if (8 - pos >= nbits) {
+ mask = fill_mask[pos + nbits] & zero_mask[pos];
+ return !nbits || (*map & mask) == mask;
+ }
+
+ mask = zero_mask[pos];
+ if ((*map++ & mask) != mask)
+ return false;
+ nbits -= 8 - pos;
+ }
+
+ pos = ((size_t)map) & (sizeof(size_t) - 1);
+ if (pos) {
+ pos = sizeof(size_t) - pos;
+ if (nbits >= pos * 8) {
+ for (nbits -= pos * 8; pos; pos--, map++) {
+ if (*map != 0xFF)
+ return false;
+ }
+ }
+ }
+
+ for (pos = nbits / BITS_IN_SIZE_T; pos; pos--, map += sizeof(size_t)) {
+ if (*((size_t *)map) != MINUS_ONE_T)
+ return false;
+ }
+
+ for (pos = (nbits % BITS_IN_SIZE_T) >> 3; pos; pos--, map++) {
+ if (*map != 0xFF)
+ return false;
+ }
+
+ pos = nbits & 7;
+ if (pos) {
+ u8 mask = fill_mask[pos];
+
+ if ((*map & mask) != mask)
+ return false;
+ }
+
+ // All bits are ones
+ return true;
+}
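To see how the two mask tables compose, here is a tiny standalone model of only the sub-byte fast path of are_bits_clear(); the helper byte_bits_clear is hypothetical. fill_mask[pos + nbits] keeps the low pos + nbits bits and zero_mask[pos] discards the low pos bits, so their AND selects exactly bits [pos, pos + nbits) of one byte.

#include <stdint.h>
#include <stdio.h>

/* Model of the 'if (8 - pos >= nbits)' branch of are_bits_clear(). */
static int byte_bits_clear(uint8_t map, unsigned int pos, unsigned int nbits)
{
	uint8_t fill = 0xFF >> (8 - (pos + nbits)); /* low pos+nbits bits set */
	uint8_t zero = 0xFF << pos;                 /* low pos bits clear */

	return !nbits || !(map & fill & zero);
}

int main(void)
{
	printf("%d\n", byte_bits_clear(0x18, 0, 3)); /* bits 0..2 clear -> 1 */
	printf("%d\n", byte_bits_clear(0x18, 3, 2)); /* bits 3..4 set   -> 0 */
	return 0;
}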
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
new file mode 100644
index 000000000000..32aab0031221
--- /dev/null
+++ b/fs/ntfs3/bitmap.c
@@ -0,0 +1,1519 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * This code builds two trees of free clusters extents.
+ * Trees are sorted by start of extent and by length of extent.
+ * NTFS_MAX_WND_EXTENTS defines the maximum number of elements in trees.
+ * In extreme case code reads on-disk bitmap to find free clusters
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * Maximum number of extents in tree.
+ */
+#define NTFS_MAX_WND_EXTENTS (32u * 1024u)
+
+struct rb_node_key {
+ struct rb_node node;
+ size_t key;
+};
+
+/*
+ * Tree is sorted by start (key)
+ */
+struct e_node {
+ struct rb_node_key start; /* Tree sorted by start */
+ struct rb_node_key count; /* Tree sorted by len*/
+};
+
+static int wnd_rescan(struct wnd_bitmap *wnd);
+static struct buffer_head *wnd_map(struct wnd_bitmap *wnd, size_t iw);
+static bool wnd_is_free_hlp(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+
+static struct kmem_cache *ntfs_enode_cachep;
+
+int __init ntfs3_init_bitmap(void)
+{
+ ntfs_enode_cachep =
+ kmem_cache_create("ntfs3_enode_cache", sizeof(struct e_node), 0,
+ SLAB_RECLAIM_ACCOUNT, NULL);
+ return ntfs_enode_cachep ? 0 : -ENOMEM;
+}
+
+void ntfs3_exit_bitmap(void)
+{
+ kmem_cache_destroy(ntfs_enode_cachep);
+}
+
+static inline u32 wnd_bits(const struct wnd_bitmap *wnd, size_t i)
+{
+ return i + 1 == wnd->nwnd ? wnd->bits_last : wnd->sb->s_blocksize * 8;
+}
+
+/*
+ * b_pos + b_len - biggest free fragment found so far
+ * Scan range [wpos, wend) of window 'buf'
+ * Returns -1 if a run of 'to_alloc' free bits is not found
+ */
+static size_t wnd_scan(const ulong *buf, size_t wbit, u32 wpos, u32 wend,
+ size_t to_alloc, size_t *prev_tail, size_t *b_pos,
+ size_t *b_len)
+{
+ while (wpos < wend) {
+ size_t free_len;
+ u32 free_bits, end;
+ u32 used = find_next_zero_bit(buf, wend, wpos);
+
+ if (used >= wend) {
+ if (*b_len < *prev_tail) {
+ *b_pos = wbit - *prev_tail;
+ *b_len = *prev_tail;
+ }
+
+ *prev_tail = 0;
+ return -1;
+ }
+
+ if (used > wpos) {
+ wpos = used;
+ if (*b_len < *prev_tail) {
+ *b_pos = wbit - *prev_tail;
+ *b_len = *prev_tail;
+ }
+
+ *prev_tail = 0;
+ }
+
+ /*
+ * Now we have a fragment [wpos, wend) starting with 0
+ */
+ end = wpos + to_alloc - *prev_tail;
+ free_bits = find_next_bit(buf, min(end, wend), wpos);
+
+ free_len = *prev_tail + free_bits - wpos;
+
+ if (*b_len < free_len) {
+ *b_pos = wbit + wpos - *prev_tail;
+ *b_len = free_len;
+ }
+
+ if (free_len >= to_alloc)
+ return wbit + wpos - *prev_tail;
+
+ if (free_bits >= wend) {
+ *prev_tail += free_bits - wpos;
+ return -1;
+ }
+
+ wpos = free_bits + 1;
+
+ *prev_tail = 0;
+ }
+
+ return -1;
+}
+
+/*
+ * wnd_close
+ *
+ * Frees all resources
+ */
+void wnd_close(struct wnd_bitmap *wnd)
+{
+ struct rb_node *node, *next;
+
+ ntfs_free(wnd->free_bits);
+ run_close(&wnd->run);
+
+ node = rb_first(&wnd->start_tree);
+
+ while (node) {
+ next = rb_next(node);
+ rb_erase(node, &wnd->start_tree);
+ kmem_cache_free(ntfs_enode_cachep,
+ rb_entry(node, struct e_node, start.node));
+ node = next;
+ }
+}
+
+static struct rb_node *rb_lookup(struct rb_root *root, size_t v)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *r = NULL;
+
+ while (*p) {
+ struct rb_node_key *k;
+
+ k = rb_entry(*p, struct rb_node_key, node);
+ if (v < k->key) {
+ p = &(*p)->rb_left;
+ } else if (v > k->key) {
+ r = &k->node;
+ p = &(*p)->rb_right;
+ } else {
+ return &k->node;
+ }
+ }
+
+ return r;
+}
+
+/*
+ * rb_insert_count
+ *
+ * Helper function to insert into the special 'count' tree
+ */
+static inline bool rb_insert_count(struct rb_root *root, struct e_node *e)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *parent = NULL;
+ size_t e_ckey = e->count.key;
+ size_t e_skey = e->start.key;
+
+ while (*p) {
+ struct e_node *k =
+ rb_entry(parent = *p, struct e_node, count.node);
+
+ if (e_ckey > k->count.key) {
+ p = &(*p)->rb_left;
+ } else if (e_ckey < k->count.key) {
+ p = &(*p)->rb_right;
+ } else if (e_skey < k->start.key) {
+ p = &(*p)->rb_left;
+ } else if (e_skey > k->start.key) {
+ p = &(*p)->rb_right;
+ } else {
+ WARN_ON(1);
+ return false;
+ }
+ }
+
+ rb_link_node(&e->count.node, parent, p);
+ rb_insert_color(&e->count.node, root);
+ return true;
+}
+
+/*
+ * rb_insert_start
+ *
+ * Helper function to insert into the special 'start' tree
+ */
+static inline bool rb_insert_start(struct rb_root *root, struct e_node *e)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *parent = NULL;
+ size_t e_skey = e->start.key;
+
+ while (*p) {
+ struct e_node *k;
+
+ parent = *p;
+
+ k = rb_entry(parent, struct e_node, start.node);
+ if (e_skey < k->start.key) {
+ p = &(*p)->rb_left;
+ } else if (e_skey > k->start.key) {
+ p = &(*p)->rb_right;
+ } else {
+ WARN_ON(1);
+ return false;
+ }
+ }
+
+ rb_link_node(&e->start.node, parent, p);
+ rb_insert_color(&e->start.node, root);
+ return true;
+}
+
+/*
+ * wnd_add_free_ext
+ *
+ * adds a new extent of free space
+ * build = 1 when building tree
+ */
+static void wnd_add_free_ext(struct wnd_bitmap *wnd, size_t bit, size_t len,
+ bool build)
+{
+ struct e_node *e, *e0 = NULL;
+ size_t ib, end_in = bit + len;
+ struct rb_node *n;
+
+ if (build) {
+ /* Use extent_min to filter too short extents */
+ if (wnd->count >= NTFS_MAX_WND_EXTENTS &&
+ len <= wnd->extent_min) {
+ wnd->uptodated = -1;
+ return;
+ }
+ } else {
+ /* Try to find extent before 'bit' */
+ n = rb_lookup(&wnd->start_tree, bit);
+
+ if (!n) {
+ n = rb_first(&wnd->start_tree);
+ } else {
+ e = rb_entry(n, struct e_node, start.node);
+ n = rb_next(n);
+ if (e->start.key + e->count.key == bit) {
+ /* Remove left */
+ bit = e->start.key;
+ len += e->count.key;
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ e0 = e;
+ }
+ }
+
+ while (n) {
+ size_t next_end;
+
+ e = rb_entry(n, struct e_node, start.node);
+ next_end = e->start.key + e->count.key;
+ if (e->start.key > end_in)
+ break;
+
+ /* Remove right */
+ n = rb_next(n);
+ len += next_end - end_in;
+ end_in = next_end;
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+
+ if (!e0)
+ e0 = e;
+ else
+ kmem_cache_free(ntfs_enode_cachep, e);
+ }
+
+ if (wnd->uptodated != 1) {
+ /* Check bits before 'bit' */
+ ib = wnd->zone_bit == wnd->zone_end ||
+ bit < wnd->zone_end
+ ? 0
+ : wnd->zone_end;
+
+ while (bit > ib && wnd_is_free_hlp(wnd, bit - 1, 1)) {
+ bit -= 1;
+ len += 1;
+ }
+
+ /* Check bits after 'end_in' */
+ ib = wnd->zone_bit == wnd->zone_end ||
+ end_in > wnd->zone_bit
+ ? wnd->nbits
+ : wnd->zone_bit;
+
+ while (end_in < ib && wnd_is_free_hlp(wnd, end_in, 1)) {
+ end_in += 1;
+ len += 1;
+ }
+ }
+ }
+ /* Insert new fragment */
+ if (wnd->count >= NTFS_MAX_WND_EXTENTS) {
+ if (e0)
+ kmem_cache_free(ntfs_enode_cachep, e0);
+
+ wnd->uptodated = -1;
+
+ /* Compare with smallest fragment */
+ n = rb_last(&wnd->count_tree);
+ e = rb_entry(n, struct e_node, count.node);
+ if (len <= e->count.key)
+ goto out; /* Do not insert small fragments */
+
+ if (build) {
+ struct e_node *e2;
+
+ n = rb_prev(n);
+ e2 = rb_entry(n, struct e_node, count.node);
+ /* smallest fragment will be 'e2->count.key' */
+ wnd->extent_min = e2->count.key;
+ }
+
+ /* Replace smallest fragment by new one */
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ } else {
+ e = e0 ? e0 : kmem_cache_alloc(ntfs_enode_cachep, GFP_ATOMIC);
+ if (!e) {
+ wnd->uptodated = -1;
+ goto out;
+ }
+
+ if (build && len <= wnd->extent_min)
+ wnd->extent_min = len;
+ }
+ e->start.key = bit;
+ e->count.key = len;
+ if (len > wnd->extent_max)
+ wnd->extent_max = len;
+
+ rb_insert_start(&wnd->start_tree, e);
+ rb_insert_count(&wnd->count_tree, e);
+ wnd->count += 1;
+
+out:;
+}
+
+/*
+ * wnd_remove_free_ext
+ *
+ * removes a run from the cached free space
+ */
+static void wnd_remove_free_ext(struct wnd_bitmap *wnd, size_t bit, size_t len)
+{
+ struct rb_node *n, *n3;
+ struct e_node *e, *e3;
+ size_t end_in = bit + len;
+ size_t end3, end, new_key, new_len, max_new_len;
+
+ /* Try to find extent before 'bit' */
+ n = rb_lookup(&wnd->start_tree, bit);
+
+ if (!n)
+ return;
+
+ e = rb_entry(n, struct e_node, start.node);
+ end = e->start.key + e->count.key;
+
+ new_key = new_len = 0;
+ len = e->count.key;
+
+ /* Range [bit,end_in) must be inside 'e' or outside 'e' and 'n' */
+ if (e->start.key > bit)
+ ;
+ else if (end_in <= end) {
+ /* Range [bit,end_in) inside 'e' */
+ new_key = end_in;
+ new_len = end - end_in;
+ len = bit - e->start.key;
+ } else if (bit > end) {
+ bool bmax = false;
+
+ n3 = rb_next(n);
+
+ while (n3) {
+ e3 = rb_entry(n3, struct e_node, start.node);
+ if (e3->start.key >= end_in)
+ break;
+
+ if (e3->count.key == wnd->extent_max)
+ bmax = true;
+
+ end3 = e3->start.key + e3->count.key;
+ if (end3 > end_in) {
+ e3->start.key = end_in;
+ rb_erase(&e3->count.node, &wnd->count_tree);
+ e3->count.key = end3 - end_in;
+ rb_insert_count(&wnd->count_tree, e3);
+ break;
+ }
+
+ n3 = rb_next(n3);
+ rb_erase(&e3->start.node, &wnd->start_tree);
+ rb_erase(&e3->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ kmem_cache_free(ntfs_enode_cachep, e3);
+ }
+ if (!bmax)
+ return;
+ n3 = rb_first(&wnd->count_tree);
+ wnd->extent_max =
+ n3 ? rb_entry(n3, struct e_node, count.node)->count.key
+ : 0;
+ return;
+ }
+
+ if (e->count.key != wnd->extent_max) {
+ ;
+ } else if (rb_prev(&e->count.node)) {
+ ;
+ } else {
+ n3 = rb_next(&e->count.node);
+ max_new_len = len > new_len ? len : new_len;
+ if (!n3) {
+ wnd->extent_max = max_new_len;
+ } else {
+ e3 = rb_entry(n3, struct e_node, count.node);
+ wnd->extent_max = max(e3->count.key, max_new_len);
+ }
+ }
+
+ if (!len) {
+ if (new_len) {
+ e->start.key = new_key;
+ rb_erase(&e->count.node, &wnd->count_tree);
+ e->count.key = new_len;
+ rb_insert_count(&wnd->count_tree, e);
+ } else {
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ kmem_cache_free(ntfs_enode_cachep, e);
+ }
+ goto out;
+ }
+ rb_erase(&e->count.node, &wnd->count_tree);
+ e->count.key = len;
+ rb_insert_count(&wnd->count_tree, e);
+
+ if (!new_len)
+ goto out;
+
+ if (wnd->count >= NTFS_MAX_WND_EXTENTS) {
+ wnd->uptodated = -1;
+
+ /* Get minimal extent */
+ e = rb_entry(rb_last(&wnd->count_tree), struct e_node,
+ count.node);
+ if (e->count.key > new_len)
+ goto out;
+
+ /* Replace minimum */
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ } else {
+ e = kmem_cache_alloc(ntfs_enode_cachep, GFP_ATOMIC);
+ if (!e)
+ wnd->uptodated = -1;
+ }
+
+ if (e) {
+ e->start.key = new_key;
+ e->count.key = new_len;
+ rb_insert_start(&wnd->start_tree, e);
+ rb_insert_count(&wnd->count_tree, e);
+ wnd->count += 1;
+ }
+
+out:
+ if (!wnd->count && wnd->uptodated != 1)
+ wnd_rescan(wnd);
+}
+
+/*
+ * wnd_rescan
+ *
+ * Scan the whole bitmap; used during initialization.
+ */
+static int wnd_rescan(struct wnd_bitmap *wnd)
+{
+ int err = 0;
+ size_t prev_tail = 0;
+ struct super_block *sb = wnd->sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ u64 lbo, len = 0;
+ u32 blocksize = sb->s_blocksize;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 used, frb;
+ const ulong *buf;
+ size_t wpos, wbit, iw, vbo;
+ struct buffer_head *bh = NULL;
+ CLST lcn, clen;
+
+ wnd->uptodated = 0;
+ wnd->extent_max = 0;
+ wnd->extent_min = MINUS_ONE_T;
+ wnd->total_zeroes = 0;
+
+ vbo = 0;
+
+ for (iw = 0; iw < wnd->nwnd; iw++) {
+ if (iw + 1 == wnd->nwnd)
+ wbits = wnd->bits_last;
+
+ if (wnd->inited) {
+ if (!wnd->free_bits[iw]) {
+ /* all ones */
+ if (prev_tail) {
+ wnd_add_free_ext(wnd,
+ vbo * 8 - prev_tail,
+ prev_tail, true);
+ prev_tail = 0;
+ }
+ goto next_wnd;
+ }
+ if (wbits == wnd->free_bits[iw]) {
+ /* all zeroes */
+ prev_tail += wbits;
+ wnd->total_zeroes += wbits;
+ goto next_wnd;
+ }
+ }
+
+ if (!len) {
+ u32 off = vbo & sbi->cluster_mask;
+
+ if (!run_lookup_entry(&wnd->run, vbo >> cluster_bits,
+ &lcn, &clen, NULL)) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+ }
+
+ bh = ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+ if (!bh) {
+ err = -EIO;
+ goto out;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ used = __bitmap_weight(buf, wbits);
+ if (used < wbits) {
+ frb = wbits - used;
+ wnd->free_bits[iw] = frb;
+ wnd->total_zeroes += frb;
+ }
+
+ wpos = 0;
+ wbit = vbo * 8;
+
+ if (wbit + wbits > wnd->nbits)
+ wbits = wnd->nbits - wbit;
+
+ do {
+ used = find_next_zero_bit(buf, wbits, wpos);
+
+ if (used > wpos && prev_tail) {
+ wnd_add_free_ext(wnd, wbit + wpos - prev_tail,
+ prev_tail, true);
+ prev_tail = 0;
+ }
+
+ wpos = used;
+
+ if (wpos >= wbits) {
+ /* No free blocks */
+ prev_tail = 0;
+ break;
+ }
+
+ frb = find_next_bit(buf, wbits, wpos);
+ if (frb >= wbits) {
+ /* keep last free block */
+ prev_tail += frb - wpos;
+ break;
+ }
+
+ wnd_add_free_ext(wnd, wbit + wpos - prev_tail,
+ frb + prev_tail - wpos, true);
+
+ /* Skip free block and first '1' */
+ wpos = frb + 1;
+ /* Reset previous tail */
+ prev_tail = 0;
+ } while (wpos < wbits);
+
+next_wnd:
+
+ if (bh)
+ put_bh(bh);
+ bh = NULL;
+
+ vbo += blocksize;
+ if (len) {
+ len -= blocksize;
+ lbo += blocksize;
+ }
+ }
+
+ /* Add last block */
+ if (prev_tail)
+ wnd_add_free_ext(wnd, wnd->nbits - prev_tail, prev_tail, true);
+
+ /*
+ * Before the init cycle wnd->uptodated was 0.
+ * If any error occurred or a limit was hit during initialization,
+ * wnd->uptodated is now -1.
+ * If 'uptodated' is still 0, the tree is fully up to date.
+ */
+ if (!wnd->uptodated)
+ wnd->uptodated = 1;
+
+ if (wnd->zone_bit != wnd->zone_end) {
+ size_t zlen = wnd->zone_end - wnd->zone_bit;
+
+ wnd->zone_end = wnd->zone_bit;
+ wnd_zone_set(wnd, wnd->zone_bit, zlen);
+ }
+
+out:
+ return err;
+}
+
+/*
+ * wnd_init
+ */
+int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
+{
+ int err;
+ u32 blocksize = sb->s_blocksize;
+ u32 wbits = blocksize * 8;
+
+ init_rwsem(&wnd->rw_lock);
+
+ wnd->sb = sb;
+ wnd->nbits = nbits;
+ wnd->total_zeroes = nbits;
+ wnd->extent_max = MINUS_ONE_T;
+ wnd->zone_bit = wnd->zone_end = 0;
+ wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
+ wnd->bits_last = nbits & (wbits - 1);
+ if (!wnd->bits_last)
+ wnd->bits_last = wbits;
+
+ wnd->free_bits = ntfs_zalloc(wnd->nwnd * sizeof(u16));
+ if (!wnd->free_bits)
+ return -ENOMEM;
+
+ err = wnd_rescan(wnd);
+ if (err)
+ return err;
+
+ wnd->inited = true;
+
+ return 0;
+}
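+
+/*
+ * Illustrative usage sketch (an assumption, not mandated by this code):
+ * a typical caller initializes the bitmap once and then allocates under
+ * rw_lock, e.g.:
+ *
+ *	err = wnd_init(wnd, sb, nbits);
+ *	if (!err) {
+ *		size_t lcn, got;
+ *
+ *		down_write(&wnd->rw_lock);
+ *		got = wnd_find(wnd, 16, 0, BITMAP_FIND_MARK_AS_USED, &lcn);
+ *		up_write(&wnd->rw_lock);
+ *	}
+ */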
+
+/*
+ * wnd_map
+ *
+ * Calls ntfs_bread for the requested window.
+ */
+static struct buffer_head *wnd_map(struct wnd_bitmap *wnd, size_t iw)
+{
+ size_t vbo;
+ CLST lcn, clen;
+ struct super_block *sb = wnd->sb;
+ struct ntfs_sb_info *sbi;
+ struct buffer_head *bh;
+ u64 lbo;
+
+ sbi = sb->s_fs_info;
+ vbo = (u64)iw << sb->s_blocksize_bits;
+
+ if (!run_lookup_entry(&wnd->run, vbo >> sbi->cluster_bits, &lcn, &clen,
+ NULL)) {
+ return ERR_PTR(-ENOENT);
+ }
+
+ lbo = ((u64)lcn << sbi->cluster_bits) + (vbo & sbi->cluster_mask);
+
+ bh = ntfs_bread(wnd->sb, lbo >> sb->s_blocksize_bits);
+ if (!bh)
+ return ERR_PTR(-EIO);
+
+ return bh;
+}
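+
+/*
+ * Worked example for the mapping above (illustrative): with 4K blocks,
+ * window iw = 3 starts at byte offset vbo = 3 << 12 = 0x3000 inside the
+ * bitmap stream; run_lookup_entry() yields the 'lcn' of the containing
+ * cluster, and the on-disk byte offset is
+ *
+ *	lbo = ((u64)lcn << cluster_bits) + (vbo & cluster_mask);
+ */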
+
+/*
+ * wnd_set_free
+ *
+ * Marks the bit range [bit, bit + bits) as free.
+ */
+int wnd_set_free(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ int err = 0;
+ struct super_block *sb = wnd->sb;
+ size_t bits0 = bits;
+ u32 wbits = 8 * sb->s_blocksize;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbit = bit & (wbits - 1);
+ struct buffer_head *bh;
+
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+ ulong *buf;
+
+ if (iw + 1 == wnd->nwnd)
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ err = PTR_ERR(bh);
+ break;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ lock_buffer(bh);
+
+ __bitmap_clear(buf, wbit, op);
+
+ wnd->free_bits[iw] += op;
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ put_bh(bh);
+
+ wnd->total_zeroes += op;
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+
+ wnd_add_free_ext(wnd, bit, bits0, false);
+
+ return err;
+}
+
+/*
+ * wnd_set_used
+ *
+ * Marks the bit range [bit, bit + bits) as used.
+ */
+int wnd_set_used(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ int err = 0;
+ struct super_block *sb = wnd->sb;
+ size_t bits0 = bits;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 wbit = bit & (wbits - 1);
+ struct buffer_head *bh;
+
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+ ulong *buf;
+
+ if (unlikely(iw + 1 == wnd->nwnd))
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ err = PTR_ERR(bh);
+ break;
+ }
+ buf = (ulong *)bh->b_data;
+
+ lock_buffer(bh);
+
+ __bitmap_set(buf, wbit, op);
+ wnd->free_bits[iw] -= op;
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ put_bh(bh);
+
+ wnd->total_zeroes -= op;
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+
+ if (!RB_EMPTY_ROOT(&wnd->start_tree))
+ wnd_remove_free_ext(wnd, bit, bits0);
+
+ return err;
+}
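+
+/*
+ * Illustrative pairing (an assumption): wnd_set_free() and
+ * wnd_set_used() are inverses on the on-disk bits, so a failed
+ * multi-step allocation can be rolled back; 'later_step_failed' below
+ * is a hypothetical condition:
+ *
+ *	if (!wnd_set_used(wnd, bit, len) && later_step_failed)
+ *		wnd_set_free(wnd, bit, len);
+ *
+ * Only the in-memory extent trees may degrade (wnd->uptodated == -1);
+ * the bitmap itself stays consistent.
+ */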
+
+/*
+ * wnd_is_free_hlp
+ *
+ * Returns true if all clusters [bit, bit+bits) are free (bitmap only)
+ */
+static bool wnd_is_free_hlp(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ struct super_block *sb = wnd->sb;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 wbit = bit & (wbits - 1);
+
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+
+ if (unlikely(iw + 1 == wnd->nwnd))
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ if (wbits != wnd->free_bits[iw]) {
+ bool ret;
+ struct buffer_head *bh = wnd_map(wnd, iw);
+
+ if (IS_ERR(bh))
+ return false;
+
+ ret = are_bits_clear((ulong *)bh->b_data, wbit, op);
+
+ put_bh(bh);
+ if (!ret)
+ return false;
+ }
+
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+
+ return true;
+}
+
+/*
+ * wnd_is_free
+ *
+ * Returns true if all clusters [bit, bit+bits) are free
+ */
+bool wnd_is_free(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ bool ret;
+ struct rb_node *n;
+ size_t end;
+ struct e_node *e;
+
+ if (RB_EMPTY_ROOT(&wnd->start_tree))
+ goto use_wnd;
+
+ n = rb_lookup(&wnd->start_tree, bit);
+ if (!n)
+ goto use_wnd;
+
+ e = rb_entry(n, struct e_node, start.node);
+
+ end = e->start.key + e->count.key;
+
+ if (bit < end && bit + bits <= end)
+ return true;
+
+use_wnd:
+ ret = wnd_is_free_hlp(wnd, bit, bits);
+
+ return ret;
+}
+
+/*
+ * wnd_is_used
+ *
+ * Returns true if all clusters [bit, bit+bits) are used
+ */
+bool wnd_is_used(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ bool ret = false;
+ struct super_block *sb = wnd->sb;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 wbit = bit & (wbits - 1);
+ size_t end;
+ struct rb_node *n;
+ struct e_node *e;
+
+ if (RB_EMPTY_ROOT(&wnd->start_tree))
+ goto use_wnd;
+
+ end = bit + bits;
+ n = rb_lookup(&wnd->start_tree, end - 1);
+ if (!n)
+ goto use_wnd;
+
+ e = rb_entry(n, struct e_node, start.node);
+ if (e->start.key + e->count.key > bit)
+ return false;
+
+use_wnd:
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+
+ if (unlikely(iw + 1 == wnd->nwnd))
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ if (wnd->free_bits[iw]) {
+ bool ret;
+ struct buffer_head *bh = wnd_map(wnd, iw);
+
+ if (IS_ERR(bh))
+ goto out;
+
+ ret = are_bits_set((ulong *)bh->b_data, wbit, op);
+ put_bh(bh);
+ if (!ret)
+ goto out;
+ }
+
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+ ret = true;
+
+out:
+ return ret;
+}
+
+/*
+ * wnd_find
+ * - flags - BITMAP_FIND_XXX flags
+ *
+ * Looks for free space.
+ * Returns the number of bits found, or 0 if not found.
+ */
+size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
+ size_t flags, size_t *allocated)
+{
+ struct super_block *sb;
+ u32 wbits, wpos, wzbit, wzend;
+ size_t fnd, max_alloc, b_len, b_pos;
+ size_t iw, prev_tail, nwnd, wbit, ebit, zbit, zend;
+ size_t to_alloc0 = to_alloc;
+ const ulong *buf;
+ const struct e_node *e;
+ const struct rb_node *pr, *cr;
+ u8 log2_bits;
+ bool fbits_valid;
+ struct buffer_head *bh;
+
+ /* Fast check for available free space */
+ if (flags & BITMAP_FIND_FULL) {
+ size_t zeroes = wnd_zeroes(wnd);
+
+ zeroes -= wnd->zone_end - wnd->zone_bit;
+ if (zeroes < to_alloc0)
+ goto no_space;
+
+ if (to_alloc0 > wnd->extent_max)
+ goto no_space;
+ } else {
+ if (to_alloc > wnd->extent_max)
+ to_alloc = wnd->extent_max;
+ }
+
+ if (wnd->zone_bit <= hint && hint < wnd->zone_end)
+ hint = wnd->zone_end;
+
+ max_alloc = wnd->nbits;
+ b_len = b_pos = 0;
+
+ if (hint >= max_alloc)
+ hint = 0;
+
+ if (RB_EMPTY_ROOT(&wnd->start_tree)) {
+ if (wnd->uptodated == 1) {
+ /* extents tree is updated -> no free space */
+ goto no_space;
+ }
+ goto scan_bitmap;
+ }
+
+ e = NULL;
+ if (!hint)
+ goto allocate_biggest;
+
+ /* Use hint: enumerate extents by start >= hint */
+ pr = NULL;
+ cr = wnd->start_tree.rb_node;
+
+ for (;;) {
+ e = rb_entry(cr, struct e_node, start.node);
+
+ if (e->start.key == hint)
+ break;
+
+ if (e->start.key < hint) {
+ pr = cr;
+ cr = cr->rb_right;
+ if (!cr)
+ break;
+ continue;
+ }
+
+ cr = cr->rb_left;
+ if (!cr) {
+ e = pr ? rb_entry(pr, struct e_node, start.node) : NULL;
+ break;
+ }
+ }
+
+ if (!e)
+ goto allocate_biggest;
+
+ if (e->start.key + e->count.key > hint) {
+ /* We have found an extent that contains 'hint' */
+ size_t len = e->start.key + e->count.key - hint;
+
+ if (len >= to_alloc && hint + to_alloc <= max_alloc) {
+ fnd = hint;
+ goto found;
+ }
+
+ if (!(flags & BITMAP_FIND_FULL)) {
+ if (len > to_alloc)
+ len = to_alloc;
+
+ if (hint + len <= max_alloc) {
+ fnd = hint;
+ to_alloc = len;
+ goto found;
+ }
+ }
+ }
+
+allocate_biggest:
+ /* Allocate from biggest free extent */
+ e = rb_entry(rb_first(&wnd->count_tree), struct e_node, count.node);
+ if (e->count.key != wnd->extent_max)
+ wnd->extent_max = e->count.key;
+
+ if (e->count.key < max_alloc) {
+ if (e->count.key >= to_alloc) {
+ ;
+ } else if (flags & BITMAP_FIND_FULL) {
+ if (e->count.key < to_alloc0) {
+ /* Biggest free block is less than requested */
+ goto no_space;
+ }
+ to_alloc = e->count.key;
+ } else if (-1 != wnd->uptodated) {
+ to_alloc = e->count.key;
+ } else {
+ /* Check if we can use more bits */
+ size_t op, max_check;
+ struct rb_root start_tree;
+
+ memcpy(&start_tree, &wnd->start_tree,
+ sizeof(struct rb_root));
+ memset(&wnd->start_tree, 0, sizeof(struct rb_root));
+
+ max_check = e->start.key + to_alloc;
+ if (max_check > max_alloc)
+ max_check = max_alloc;
+ for (op = e->start.key + e->count.key; op < max_check;
+ op++) {
+ if (!wnd_is_free(wnd, op, 1))
+ break;
+ }
+ memcpy(&wnd->start_tree, &start_tree,
+ sizeof(struct rb_root));
+ to_alloc = op - e->start.key;
+ }
+
+ /* Prepare to return */
+ fnd = e->start.key;
+ if (e->start.key + to_alloc > max_alloc)
+ to_alloc = max_alloc - e->start.key;
+ goto found;
+ }
+
+ if (wnd->uptodated == 1) {
+ /* extents tree is updated -> no free space */
+ goto no_space;
+ }
+
+ b_len = e->count.key;
+ b_pos = e->start.key;
+
+scan_bitmap:
+ sb = wnd->sb;
+ log2_bits = sb->s_blocksize_bits + 3;
+
+ /* At most two ranges [hint, max_alloc) + [0, hint) */
+Again:
+
+ /* TODO: optimize request for case nbits > wbits */
+ iw = hint >> log2_bits;
+ wbits = sb->s_blocksize * 8;
+ wpos = hint & (wbits - 1);
+ prev_tail = 0;
+ fbits_valid = true;
+
+ if (max_alloc == wnd->nbits) {
+ nwnd = wnd->nwnd;
+ } else {
+ size_t t = max_alloc + wbits - 1;
+
+ nwnd = likely(t > max_alloc) ? (t >> log2_bits) : wnd->nwnd;
+ }
+
+ /* Enumerate all windows */
+ for (; iw < nwnd; iw++) {
+ wbit = iw << log2_bits;
+
+ if (!wnd->free_bits[iw]) {
+ if (prev_tail > b_len) {
+ b_pos = wbit - prev_tail;
+ b_len = prev_tail;
+ }
+
+ /* Skip fully used window */
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ if (unlikely(iw + 1 == nwnd)) {
+ if (max_alloc == wnd->nbits) {
+ wbits = wnd->bits_last;
+ } else {
+ size_t t = max_alloc & (wbits - 1);
+
+ if (t) {
+ wbits = t;
+ fbits_valid = false;
+ }
+ }
+ }
+
+ if (wnd->zone_end > wnd->zone_bit) {
+ ebit = wbit + wbits;
+ zbit = max(wnd->zone_bit, wbit);
+ zend = min(wnd->zone_end, ebit);
+
+ /* Here we have a window [wbit, ebit) and zone [zbit, zend) */
+ if (zend <= zbit) {
+ /* Zone does not overlap window */
+ } else {
+ wzbit = zbit - wbit;
+ wzend = zend - wbit;
+
+ /* Zone overlaps window */
+ if (wnd->free_bits[iw] == wzend - wzbit) {
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ /* Scan the window in two ranges: [wbit, zbit) and [zend, ebit) */
+ bh = wnd_map(wnd, iw);
+
+ if (IS_ERR(bh)) {
+ /* TODO: error */
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ /* Scan range [wbit, zbit) */
+ if (wpos < wzbit) {
+ /* Scan range [wpos, zbit) */
+ fnd = wnd_scan(buf, wbit, wpos, wzbit,
+ to_alloc, &prev_tail,
+ &b_pos, &b_len);
+ if (fnd != MINUS_ONE_T) {
+ put_bh(bh);
+ goto found;
+ }
+ }
+
+ prev_tail = 0;
+
+ /* Scan range [zend, ebit) */
+ if (wzend < wbits) {
+ fnd = wnd_scan(buf, wbit,
+ max(wzend, wpos), wbits,
+ to_alloc, &prev_tail,
+ &b_pos, &b_len);
+ if (fnd != MINUS_ONE_T) {
+ put_bh(bh);
+ goto found;
+ }
+ }
+
+ wpos = 0;
+ put_bh(bh);
+ continue;
+ }
+ }
+
+ /* Current window does not overlap zone */
+ if (!wpos && fbits_valid && wnd->free_bits[iw] == wbits) {
+ /* window is empty */
+ if (prev_tail + wbits >= to_alloc) {
+ fnd = wbit + wpos - prev_tail;
+ goto found;
+ }
+
+ /* Increase 'prev_tail' and process next window */
+ prev_tail += wbits;
+ wpos = 0;
+ continue;
+ }
+
+ /* read window */
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ // TODO: error
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ /* Scan range [wpos, wbits) */
+ fnd = wnd_scan(buf, wbit, wpos, wbits, to_alloc, &prev_tail,
+ &b_pos, &b_len);
+ put_bh(bh);
+ if (fnd != MINUS_ONE_T)
+ goto found;
+ }
+
+ if (b_len < prev_tail) {
+ /* The last fragment */
+ b_len = prev_tail;
+ b_pos = max_alloc - prev_tail;
+ }
+
+ if (hint) {
+ /*
+ * We have scanned range [hint, max_alloc).
+ * Prepare to scan range [0, hint + to_alloc).
+ */
+ size_t nextmax = hint + to_alloc;
+
+ if (likely(nextmax >= hint) && nextmax < max_alloc)
+ max_alloc = nextmax;
+ hint = 0;
+ goto Again;
+ }
+
+ if (!b_len)
+ goto no_space;
+
+ wnd->extent_max = b_len;
+
+ if (flags & BITMAP_FIND_FULL)
+ goto no_space;
+
+ fnd = b_pos;
+ to_alloc = b_len;
+
+found:
+ if (flags & BITMAP_FIND_MARK_AS_USED) {
+ /* TODO optimize remove extent (pass 'e'?) */
+ if (wnd_set_used(wnd, fnd, to_alloc))
+ goto no_space;
+ } else if (wnd->extent_max != MINUS_ONE_T &&
+ to_alloc > wnd->extent_max) {
+ wnd->extent_max = to_alloc;
+ }
+
+ *allocated = fnd;
+ return to_alloc;
+
+no_space:
+ return 0;
+}
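+
+/*
+ * Illustrative call (an assumption): ask for up to 8 contiguous free
+ * bits near 'hint' and mark them used in one step; a zero return means
+ * nothing was allocated:
+ *
+ *	size_t lcn;
+ *	size_t got = wnd_find(wnd, 8, hint,
+ *			      BITMAP_FIND_MARK_AS_USED, &lcn);
+ *
+ * Adding BITMAP_FIND_FULL makes the request all-or-nothing instead of
+ * best-effort.
+ */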
+
+/*
+ * wnd_extend
+ *
+ * Extend bitmap ($MFT bitmap)
+ */
+int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
+{
+ int err;
+ struct super_block *sb = wnd->sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ u32 blocksize = sb->s_blocksize;
+ u32 wbits = blocksize * 8;
+ u32 b0, new_last;
+ size_t bits, iw, new_wnd;
+ size_t old_bits = wnd->nbits;
+ u16 *new_free;
+
+ if (new_bits <= old_bits)
+ return -EINVAL;
+
+ /* align to 8 byte boundary */
+ new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
+ new_last = new_bits & (wbits - 1);
+ if (!new_last)
+ new_last = wbits;
+
+ if (new_wnd != wnd->nwnd) {
+ new_free = ntfs_malloc(new_wnd * sizeof(u16));
+ if (!new_free)
+ return -ENOMEM;
+
+ if (new_free != wnd->free_bits)
+ memcpy(new_free, wnd->free_bits,
+ wnd->nwnd * sizeof(short));
+ memset(new_free + wnd->nwnd, 0,
+ (new_wnd - wnd->nwnd) * sizeof(short));
+ ntfs_free(wnd->free_bits);
+ wnd->free_bits = new_free;
+ }
+
+ /* Zero bits [old_bits,new_bits) */
+ bits = new_bits - old_bits;
+ b0 = old_bits & (wbits - 1);
+
+ for (iw = old_bits >> (sb->s_blocksize_bits + 3); bits; iw += 1) {
+ u32 op;
+ size_t frb;
+ u64 vbo, lbo, bytes;
+ struct buffer_head *bh;
+ ulong *buf;
+
+ if (iw + 1 == new_wnd)
+ wbits = new_last;
+
+ op = b0 + bits > wbits ? wbits - b0 : bits;
+ vbo = (u64)iw * blocksize;
+
+ err = ntfs_vbo_to_lbo(sbi, &wnd->run, vbo, &lbo, &bytes);
+ if (err)
+ break;
+
+ bh = ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+ if (!bh)
+ return -EIO;
+
+ lock_buffer(bh);
+ buf = (ulong *)bh->b_data;
+
+ __bitmap_clear(buf, b0, blocksize * 8 - b0);
+ frb = wbits - __bitmap_weight(buf, wbits);
+ wnd->total_zeroes += frb - wnd->free_bits[iw];
+ wnd->free_bits[iw] = frb;
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ /*err = sync_dirty_buffer(bh);*/
+
+ b0 = 0;
+ bits -= op;
+ }
+
+ wnd->nbits = new_bits;
+ wnd->nwnd = new_wnd;
+ wnd->bits_last = new_last;
+
+ wnd_add_free_ext(wnd, old_bits, new_bits - old_bits, false);
+
+ return 0;
+}
+
+/*
+ * wnd_zone_set
+ */
+void wnd_zone_set(struct wnd_bitmap *wnd, size_t lcn, size_t len)
+{
+ size_t zlen;
+
+ zlen = wnd->zone_end - wnd->zone_bit;
+ if (zlen)
+ wnd_add_free_ext(wnd, wnd->zone_bit, zlen, false);
+
+ if (!RB_EMPTY_ROOT(&wnd->start_tree) && len)
+ wnd_remove_free_ext(wnd, lcn, len);
+
+ wnd->zone_bit = lcn;
+ wnd->zone_end = lcn + len;
+}
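+
+/*
+ * Example of the zone protocol (illustrative): reserving a range first
+ * returns any previous zone to the free-extent trees, then withholds
+ * the new range from wnd_find():
+ *
+ *	wnd_zone_set(wnd, zone_lcn, zone_len);	// reserve [lcn, lcn+len)
+ *	...
+ *	wnd_zone_set(wnd, 0, 0);		// drop the zone again
+ */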
+
+int ntfs_trim_fs(struct ntfs_sb_info *sbi, struct fstrim_range *range)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+ u32 wbits = 8 * sb->s_blocksize;
+ CLST len = 0, lcn = 0, done = 0;
+ CLST minlen = bytes_to_cluster(sbi, range->minlen);
+ CLST lcn_from = bytes_to_cluster(sbi, range->start);
+ size_t iw = lcn_from >> (sb->s_blocksize_bits + 3);
+ u32 wbit = lcn_from & (wbits - 1);
+ const ulong *buf;
+ CLST lcn_to;
+
+ if (!minlen)
+ minlen = 1;
+
+ if (range->len == (u64)-1)
+ lcn_to = wnd->nbits;
+ else
+ lcn_to = bytes_to_cluster(sbi, range->start + range->len);
+
+ down_read_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+
+ for (; iw < wnd->nbits; iw++, wbit = 0) {
+ CLST lcn_wnd = iw * wbits;
+ struct buffer_head *bh;
+
+ if (lcn_wnd > lcn_to)
+ break;
+
+ if (!wnd->free_bits[iw])
+ continue;
+
+ if (iw + 1 == wnd->nwnd)
+ wbits = wnd->bits_last;
+
+ if (lcn_wnd + wbits > lcn_to)
+ wbits = lcn_to - lcn_wnd;
+
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ err = PTR_ERR(bh);
+ break;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ for (; wbit < wbits; wbit++) {
+ if (!test_bit(wbit, buf)) {
+ if (!len)
+ lcn = lcn_wnd + wbit;
+ len += 1;
+ continue;
+ }
+ if (len >= minlen) {
+ err = ntfs_discard(sbi, lcn, len);
+ if (err)
+ goto out;
+ done += len;
+ }
+ len = 0;
+ }
+ put_bh(bh);
+ }
+
+ /* Process the last fragment */
+ if (len >= minlen) {
+ err = ntfs_discard(sbi, lcn, len);
+ if (err)
+ goto out;
+ done += len;
+ }
+
+out:
+ range->len = (u64)done << sbi->cluster_bits;
+
+ up_read(&wnd->rw_lock);
+
+ return err;
+}
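+
+/*
+ * Illustrative sketch (an assumption): this is presumably the backend
+ * of the FITRIM ioctl, so discarding the whole volume with 1 MB
+ * granularity would arrive roughly as:
+ *
+ *	struct fstrim_range range = {
+ *		.start = 0,
+ *		.len = (u64)-1,
+ *		.minlen = 1024 * 1024,
+ *	};
+ *	err = ntfs_trim_fs(sbi, &range);
+ *	// on return range.len holds the number of bytes discarded
+ */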
--
2.30.0
08 Dec '21
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15-rc1
commit 82cae269cfa953032fbb8980a7d554d60fb00b17
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds initialization of the super block
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/fsntfs.c | 2551 +++++++++++++++++++++++++++++++++++++++++++
fs/ntfs3/index.c | 2647 +++++++++++++++++++++++++++++++++++++++++++++
fs/ntfs3/inode.c | 2029 ++++++++++++++++++++++++++++++++++
fs/ntfs3/super.c | 1504 ++++++++++++++++++++++++++
4 files changed, 8731 insertions(+)
create mode 100644 fs/ntfs3/fsntfs.c
create mode 100644 fs/ntfs3/index.c
create mode 100644 fs/ntfs3/inode.c
create mode 100644 fs/ntfs3/super.c
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
new file mode 100644
index 000000000000..92140050fb6c
--- /dev/null
+++ b/fs/ntfs3/fsntfs.c
@@ -0,0 +1,2551 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+const struct cpu_str NAME_MFT = {
+ 4, 0, { '$', 'M', 'F', 'T' },
+};
+const struct cpu_str NAME_MIRROR = {
+ 8, 0, { '$', 'M', 'F', 'T', 'M', 'i', 'r', 'r' },
+};
+const struct cpu_str NAME_LOGFILE = {
+ 8, 0, { '$', 'L', 'o', 'g', 'F', 'i', 'l', 'e' },
+};
+const struct cpu_str NAME_VOLUME = {
+ 7, 0, { '$', 'V', 'o', 'l', 'u', 'm', 'e' },
+};
+const struct cpu_str NAME_ATTRDEF = {
+ 8, 0, { '$', 'A', 't', 't', 'r', 'D', 'e', 'f' },
+};
+const struct cpu_str NAME_ROOT = {
+ 1, 0, { '.' },
+};
+const struct cpu_str NAME_BITMAP = {
+ 7, 0, { '$', 'B', 'i', 't', 'm', 'a', 'p' },
+};
+const struct cpu_str NAME_BOOT = {
+ 5, 0, { '$', 'B', 'o', 'o', 't' },
+};
+const struct cpu_str NAME_BADCLUS = {
+ 8, 0, { '$', 'B', 'a', 'd', 'C', 'l', 'u', 's' },
+};
+const struct cpu_str NAME_QUOTA = {
+ 6, 0, { '$', 'Q', 'u', 'o', 't', 'a' },
+};
+const struct cpu_str NAME_SECURE = {
+ 7, 0, { '$', 'S', 'e', 'c', 'u', 'r', 'e' },
+};
+const struct cpu_str NAME_UPCASE = {
+ 7, 0, { '$', 'U', 'p', 'C', 'a', 's', 'e' },
+};
+const struct cpu_str NAME_EXTEND = {
+ 7, 0, { '$', 'E', 'x', 't', 'e', 'n', 'd' },
+};
+const struct cpu_str NAME_OBJID = {
+ 6, 0, { '$', 'O', 'b', 'j', 'I', 'd' },
+};
+const struct cpu_str NAME_REPARSE = {
+ 8, 0, { '$', 'R', 'e', 'p', 'a', 'r', 's', 'e' },
+};
+const struct cpu_str NAME_USNJRNL = {
+ 8, 0, { '$', 'U', 's', 'n', 'J', 'r', 'n', 'l' },
+};
+const __le16 BAD_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('B'), cpu_to_le16('a'), cpu_to_le16('d'),
+};
+const __le16 I30_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('I'), cpu_to_le16('3'), cpu_to_le16('0'),
+};
+const __le16 SII_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('I'), cpu_to_le16('I'),
+};
+const __le16 SDH_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('D'), cpu_to_le16('H'),
+};
+const __le16 SDS_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('D'), cpu_to_le16('S'),
+};
+const __le16 SO_NAME[2] = {
+ cpu_to_le16('$'), cpu_to_le16('O'),
+};
+const __le16 SQ_NAME[2] = {
+ cpu_to_le16('$'), cpu_to_le16('Q'),
+};
+const __le16 SR_NAME[2] = {
+ cpu_to_le16('$'), cpu_to_le16('R'),
+};
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+const __le16 WOF_NAME[17] = {
+ cpu_to_le16('W'), cpu_to_le16('o'), cpu_to_le16('f'), cpu_to_le16('C'),
+ cpu_to_le16('o'), cpu_to_le16('m'), cpu_to_le16('p'), cpu_to_le16('r'),
+ cpu_to_le16('e'), cpu_to_le16('s'), cpu_to_le16('s'), cpu_to_le16('e'),
+ cpu_to_le16('d'), cpu_to_le16('D'), cpu_to_le16('a'), cpu_to_le16('t'),
+ cpu_to_le16('a'),
+};
+#endif
+
+// clang-format on
+
+/*
+ * ntfs_fix_pre_write
+ *
+ * inserts fixups into 'rhdr' before writing to disk
+ */
+bool ntfs_fix_pre_write(struct NTFS_RECORD_HEADER *rhdr, size_t bytes)
+{
+ u16 *fixup, *ptr;
+ u16 sample;
+ u16 fo = le16_to_cpu(rhdr->fix_off);
+ u16 fn = le16_to_cpu(rhdr->fix_num);
+
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+ return false;
+ }
+
+ /* Get fixup pointer */
+ fixup = Add2Ptr(rhdr, fo);
+
+ if (*fixup >= 0x7FFF)
+ *fixup = 1;
+ else
+ *fixup += 1;
+
+ sample = *fixup;
+
+ ptr = Add2Ptr(rhdr, SECTOR_SIZE - sizeof(short));
+
+ while (fn--) {
+ *++fixup = *ptr;
+ *ptr = sample;
+ ptr += SECTOR_SIZE / sizeof(short);
+ }
+ return true;
+}
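+
+/*
+ * Worked example of the fixup scheme above (illustrative): for a 1K
+ * record (two 512-byte sectors) the fixup array holds the update
+ * sequence number followed by one saved u16 per sector. Before the
+ * write, the last u16 of each sector is saved into the array and
+ * overwritten with the (incremented) sequence number:
+ *
+ *	fixup[0] = seq;			// update sequence number
+ *	fixup[1] = sector0_last_u16;	// saved originals
+ *	fixup[2] = sector1_last_u16;
+ *
+ * A torn write leaves a sector whose tail no longer equals 'seq',
+ * which ntfs_fix_post_read() below reports as -E_NTFS_FIXUP.
+ */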
+
+/*
+ * ntfs_fix_post_read
+ *
+ * remove fixups after reading from disk
+ * Returns < 0 if error, 0 if ok, -E_NTFS_FIXUP if fixups need to be updated
+ */
+int ntfs_fix_post_read(struct NTFS_RECORD_HEADER *rhdr, size_t bytes,
+ bool simple)
+{
+ int ret;
+ u16 *fixup, *ptr;
+ u16 sample, fo, fn;
+
+ fo = le16_to_cpu(rhdr->fix_off);
+ fn = simple ? ((bytes >> SECTOR_SHIFT) + 1)
+ : le16_to_cpu(rhdr->fix_num);
+
+ /* Check errors */
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+ return -EINVAL; /* native chkntfs returns ok! */
+ }
+
+ /* Get fixup pointer */
+ fixup = Add2Ptr(rhdr, fo);
+ sample = *fixup;
+ ptr = Add2Ptr(rhdr, SECTOR_SIZE - sizeof(short));
+ ret = 0;
+
+ while (fn--) {
+ /* Test current word */
+ if (*ptr != sample) {
+ /* Fixup does not match! Is it serious error? */
+ ret = -E_NTFS_FIXUP;
+ }
+
+ /* Replace fixup */
+ *ptr = *++fixup;
+ ptr += SECTOR_SIZE / sizeof(short);
+ }
+
+ return ret;
+}
+
+/*
+ * ntfs_extend_init
+ *
+ * loads $Extend file
+ */
+int ntfs_extend_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ struct inode *inode, *inode2;
+ struct MFT_REF ref;
+
+ if (sbi->volume.major_ver < 3) {
+ ntfs_notice(sb, "Skip $Extend 'cause NTFS version");
+ return 0;
+ }
+
+ ref.low = cpu_to_le32(MFT_REC_EXTEND);
+ ref.high = 0;
+ ref.seq = cpu_to_le16(MFT_REC_EXTEND);
+ inode = ntfs_iget5(sb, &ref, &NAME_EXTEND);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $Extend.");
+ inode = NULL;
+ goto out;
+ }
+
+ /* if ntfs_iget5 reads from disk it never returns bad inode */
+ if (!S_ISDIR(inode->i_mode)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Try to find $ObjId */
+ inode2 = dir_search_u(inode, &NAME_OBJID, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ if (is_bad_inode(inode2)) {
+ iput(inode2);
+ } else {
+ sbi->objid.ni = ntfs_i(inode2);
+ sbi->objid_no = inode2->i_ino;
+ }
+ }
+
+ /* Try to find $Quota */
+ inode2 = dir_search_u(inode, &NAME_QUOTA, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ sbi->quota_no = inode2->i_ino;
+ iput(inode2);
+ }
+
+ /* Try to find $Reparse */
+ inode2 = dir_search_u(inode, &NAME_REPARSE, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ sbi->reparse.ni = ntfs_i(inode2);
+ sbi->reparse_no = inode2->i_ino;
+ }
+
+ /* Try to find $UsnJrnl */
+ inode2 = dir_search_u(inode, &NAME_USNJRNL, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ sbi->usn_jrnl_no = inode2->i_ino;
+ iput(inode2);
+ }
+
+ err = 0;
+out:
+ iput(inode);
+ return err;
+}
+
+int ntfs_loadlog_and_replay(struct ntfs_inode *ni, struct ntfs_sb_info *sbi)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ bool initialized = false;
+ struct MFT_REF ref;
+ struct inode *inode;
+
+ /* Check for 4GB */
+ if (ni->vfs_inode.i_size >= 0x100000000ull) {
+ ntfs_err(sb, "\x24LogFile is too big");
+ err = -EINVAL;
+ goto out;
+ }
+
+ sbi->flags |= NTFS_FLAGS_LOG_REPLAYING;
+
+ ref.low = cpu_to_le32(MFT_REC_MFT);
+ ref.high = 0;
+ ref.seq = cpu_to_le16(1);
+
+ inode = ntfs_iget5(sb, &ref, NULL);
+
+ if (IS_ERR(inode))
+ inode = NULL;
+
+ if (!inode) {
+ /* Try to use mft copy */
+ u64 t64 = sbi->mft.lbo;
+
+ sbi->mft.lbo = sbi->mft.lbo2;
+ inode = ntfs_iget5(sb, &ref, NULL);
+ sbi->mft.lbo = t64;
+ if (IS_ERR(inode))
+ inode = NULL;
+ }
+
+ if (!inode) {
+ err = -EINVAL;
+ ntfs_err(sb, "Failed to load $MFT.");
+ goto out;
+ }
+
+ sbi->mft.ni = ntfs_i(inode);
+
+ /* LogFile should not contain an attribute list */
+ err = ni_load_all_mi(sbi->mft.ni);
+ if (!err)
+ err = log_replay(ni, &initialized);
+
+ iput(inode);
+ sbi->mft.ni = NULL;
+
+ sync_blockdev(sb->s_bdev);
+ invalidate_bdev(sb->s_bdev);
+
+ if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+ err = 0;
+ goto out;
+ }
+
+ if (sb_rdonly(sb) || !initialized)
+ goto out;
+
+ /* Fill LogFile with -1 if it is initialized */
+ err = ntfs_bio_fill_1(sbi, &ni->file.run);
+
+out:
+ sbi->flags &= ~NTFS_FLAGS_LOG_REPLAYING;
+
+ return err;
+}
+
+/*
+ * ntfs_query_def
+ *
+ * returns current ATTR_DEF_ENTRY for given attribute type
+ */
+const struct ATTR_DEF_ENTRY *ntfs_query_def(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE type)
+{
+ int type_in = le32_to_cpu(type);
+ size_t min_idx = 0;
+ size_t max_idx = sbi->def_entries - 1;
+
+ while (min_idx <= max_idx) {
+ size_t i = min_idx + ((max_idx - min_idx) >> 1);
+ const struct ATTR_DEF_ENTRY *entry = sbi->def_table + i;
+ int diff = le32_to_cpu(entry->type) - type_in;
+
+ if (!diff)
+ return entry;
+ if (diff < 0)
+ min_idx = i + 1;
+ else if (i)
+ max_idx = i - 1;
+ else
+ return NULL;
+ }
+ return NULL;
+}
+
+/*
+ * ntfs_look_for_free_space
+ *
+ * Looks for free space in the bitmap.
+ */
+int ntfs_look_for_free_space(struct ntfs_sb_info *sbi, CLST lcn, CLST len,
+ CLST *new_lcn, CLST *new_len,
+ enum ALLOCATE_OPT opt)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ size_t a_lcn, zlen, zeroes, zlcn, zlen2, ztrim, new_zlen;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+ if (opt & ALLOCATE_MFT) {
+ CLST alen;
+
+ zlen = wnd_zone_len(wnd);
+
+ if (!zlen) {
+ err = ntfs_refresh_zone(sbi);
+ if (err)
+ goto out;
+
+ zlen = wnd_zone_len(wnd);
+
+ if (!zlen) {
+ ntfs_err(sbi->sb,
+ "no free space to extend mft");
+ err = -ENOSPC;
+ goto out;
+ }
+ }
+
+ lcn = wnd_zone_bit(wnd);
+ alen = zlen > len ? len : zlen;
+
+ wnd_zone_set(wnd, lcn + alen, zlen - alen);
+
+ err = wnd_set_used(wnd, lcn, alen);
+ if (err)
+ goto out;
+
+ *new_lcn = lcn;
+ *new_len = alen;
+ goto ok;
+ }
+
+ /*
+ * Because cluster 0 is always used, lcn == 0 means that we should use
+ * the cached value of 'next_free_lcn' to improve performance.
+ */
+ if (!lcn)
+ lcn = sbi->used.next_free_lcn;
+
+ if (lcn >= wnd->nbits)
+ lcn = 0;
+
+ *new_len = wnd_find(wnd, len, lcn, BITMAP_FIND_MARK_AS_USED, &a_lcn);
+ if (*new_len) {
+ *new_lcn = a_lcn;
+ goto ok;
+ }
+
+ /* Try to use clusters from MftZone */
+ zlen = wnd_zone_len(wnd);
+ zeroes = wnd_zeroes(wnd);
+
+ /* Reject a request that is too big */
+ if (len > zeroes + zlen)
+ goto no_space;
+
+ if (zlen <= NTFS_MIN_MFT_ZONE)
+ goto no_space;
+
+ /* How many clusters to cut from the zone */
+ zlcn = wnd_zone_bit(wnd);
+ zlen2 = zlen >> 1;
+ ztrim = len > zlen ? zlen : (len > zlen2 ? len : zlen2);
+ new_zlen = zlen - ztrim;
+
+ if (new_zlen < NTFS_MIN_MFT_ZONE) {
+ new_zlen = NTFS_MIN_MFT_ZONE;
+ if (new_zlen > zlen)
+ new_zlen = zlen;
+ }
+
+ wnd_zone_set(wnd, zlcn, new_zlen);
+
+ /* Allocate contiguous clusters */
+ *new_len =
+ wnd_find(wnd, len, 0,
+ BITMAP_FIND_MARK_AS_USED | BITMAP_FIND_FULL, &a_lcn);
+ if (*new_len) {
+ *new_lcn = a_lcn;
+ goto ok;
+ }
+
+no_space:
+ up_write(&wnd->rw_lock);
+
+ return -ENOSPC;
+
+ok:
+ err = 0;
+
+ ntfs_unmap_meta(sb, *new_lcn, *new_len);
+
+ if (opt & ALLOCATE_MFT)
+ goto out;
+
+ /* Set hint for next requests */
+ sbi->used.next_free_lcn = *new_lcn + *new_len;
+
+out:
+ up_write(&wnd->rw_lock);
+ return err;
+}
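+
+/*
+ * Worked example of the zone-trim arithmetic above (illustrative):
+ * with zlen = 1000 zone clusters and a request of len = 300,
+ *
+ *	zlen2    = 500;			// half of the zone
+ *	ztrim    = max(len, zlen2) = 500;
+ *	new_zlen = 1000 - 500 = 500;	// never below NTFS_MIN_MFT_ZONE
+ *
+ * so ordinary allocations shrink the MFT zone gradually instead of
+ * destroying it on the first shortage.
+ */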
+
+/*
+ * ntfs_extend_mft
+ *
+ * allocates additional MFT records
+ * sbi->mft.bitmap is locked for write
+ *
+ * NOTE: recursive:
+ * ntfs_look_free_mft ->
+ * ntfs_extend_mft ->
+ * attr_set_size ->
+ * ni_insert_nonresident ->
+ * ni_insert_attr ->
+ * ni_ins_attr_ext ->
+ * ntfs_look_free_mft ->
+ * ntfs_extend_mft
+ * To avoid recursion, always allocate space for two new mft records;
+ * see attrib.c: "at least two mft to avoid recursive loop"
+ */
+static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->mft.ni;
+ size_t new_mft_total;
+ u64 new_mft_bytes, new_bitmap_bytes;
+ struct ATTRIB *attr;
+ struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+
+ new_mft_total = (wnd->nbits + MFT_INCREASE_CHUNK + 127) & (CLST)~127;
+ new_mft_bytes = (u64)new_mft_total << sbi->record_bits;
+
+ /* Step 1: Resize $MFT::DATA */
+ down_write(&ni->file.run_lock);
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run,
+ new_mft_bytes, NULL, false, &attr);
+
+ if (err) {
+ up_write(&ni->file.run_lock);
+ goto out;
+ }
+
+ attr->nres.valid_size = attr->nres.data_size;
+ new_mft_total = le64_to_cpu(attr->nres.alloc_size) >> sbi->record_bits;
+ ni->mi.dirty = true;
+
+ /* Step 2: Resize $MFT::BITMAP */
+ new_bitmap_bytes = bitmap_size(new_mft_total);
+
+ err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
+ new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
+
+ /* Refresh Mft Zone if necessary */
+ down_write_nested(&sbi->used.bitmap.rw_lock, BITMAP_MUTEX_CLUSTERS);
+
+ ntfs_refresh_zone(sbi);
+
+ up_write(&sbi->used.bitmap.rw_lock);
+ up_write(&ni->file.run_lock);
+
+ if (err)
+ goto out;
+
+ err = wnd_extend(wnd, new_mft_total);
+
+ if (err)
+ goto out;
+
+ ntfs_clear_mft_tail(sbi, sbi->mft.used, new_mft_total);
+
+ err = _ni_write_inode(&ni->vfs_inode, 0);
+out:
+ return err;
+}
+
+/*
+ * ntfs_look_free_mft
+ *
+ * looks for a free MFT record
+ */
+int ntfs_look_free_mft(struct ntfs_sb_info *sbi, CLST *rno, bool mft,
+ struct ntfs_inode *ni, struct mft_inode **mi)
+{
+ int err = 0;
+ size_t zbit, zlen, from, to, fr;
+ size_t mft_total;
+ struct MFT_REF ref;
+ struct super_block *sb = sbi->sb;
+ struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+ u32 ir;
+
+ static_assert(sizeof(sbi->mft.reserved_bitmap) * 8 >=
+ MFT_REC_FREE - MFT_REC_RESERVED);
+
+ if (!mft)
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT);
+
+ zlen = wnd_zone_len(wnd);
+
+ /* Always reserve space for MFT */
+ if (zlen) {
+ if (mft) {
+ zbit = wnd_zone_bit(wnd);
+ *rno = zbit;
+ wnd_zone_set(wnd, zbit + 1, zlen - 1);
+ }
+ goto found;
+ }
+
+ /* No MFT zone. Find the free MFT record nearest to 0 */
+ if (!wnd_find(wnd, 1, MFT_REC_FREE, 0, &zbit)) {
+ /* Resize MFT */
+ mft_total = wnd->nbits;
+
+ err = ntfs_extend_mft(sbi);
+ if (!err) {
+ zbit = mft_total;
+ goto reserve_mft;
+ }
+
+ if (!mft || MFT_REC_FREE == sbi->mft.next_reserved)
+ goto out;
+
+ err = 0;
+
+ /*
+ * Look for a free record in the reserved area [11-16) ==
+ * [MFT_REC_RESERVED, MFT_REC_FREE). The MFT bitmap always
+ * marks it as used.
+ */
+ if (!sbi->mft.reserved_bitmap) {
+ /* Once per session create internal bitmap for 5 bits */
+ sbi->mft.reserved_bitmap = 0xFF;
+
+ ref.high = 0;
+ for (ir = MFT_REC_RESERVED; ir < MFT_REC_FREE; ir++) {
+ struct inode *i;
+ struct ntfs_inode *ni;
+ struct MFT_REC *mrec;
+
+ ref.low = cpu_to_le32(ir);
+ ref.seq = cpu_to_le16(ir);
+
+ i = ntfs_iget5(sb, &ref, NULL);
+ if (IS_ERR(i)) {
+next:
+ ntfs_notice(
+ sb,
+ "Invalid reserved record %x",
+ ref.low);
+ continue;
+ }
+ if (is_bad_inode(i)) {
+ iput(i);
+ goto next;
+ }
+
+ ni = ntfs_i(i);
+
+ mrec = ni->mi.mrec;
+
+ if (!is_rec_base(mrec))
+ goto next;
+
+ if (mrec->hard_links)
+ goto next;
+
+ if (!ni_std(ni))
+ goto next;
+
+ if (ni_find_attr(ni, NULL, NULL, ATTR_NAME,
+ NULL, 0, NULL, NULL))
+ goto next;
+
+ __clear_bit(ir - MFT_REC_RESERVED,
+ &sbi->mft.reserved_bitmap);
+ }
+ }
+
+ /* Scan 5 bits for zero. Bit 0 == MFT_REC_RESERVED */
+ zbit = find_next_zero_bit(&sbi->mft.reserved_bitmap,
+ MFT_REC_FREE, MFT_REC_RESERVED);
+ if (zbit >= MFT_REC_FREE) {
+ sbi->mft.next_reserved = MFT_REC_FREE;
+ goto out;
+ }
+
+ zlen = 1;
+ sbi->mft.next_reserved = zbit;
+ } else {
+reserve_mft:
+ zlen = zbit == MFT_REC_FREE ? (MFT_REC_USER - MFT_REC_FREE) : 4;
+ if (zbit + zlen > wnd->nbits)
+ zlen = wnd->nbits - zbit;
+
+ while (zlen > 1 && !wnd_is_free(wnd, zbit, zlen))
+ zlen -= 1;
+
+ /* [zbit, zbit + zlen) will be used for Mft itself */
+ from = sbi->mft.used;
+ if (from < zbit)
+ from = zbit;
+ to = zbit + zlen;
+ if (from < to) {
+ ntfs_clear_mft_tail(sbi, from, to);
+ sbi->mft.used = to;
+ }
+ }
+
+ if (mft) {
+ *rno = zbit;
+ zbit += 1;
+ zlen -= 1;
+ }
+
+ wnd_zone_set(wnd, zbit, zlen);
+
+found:
+ if (!mft) {
+ /* The request is to get a record for general purposes */
+ if (sbi->mft.next_free < MFT_REC_USER)
+ sbi->mft.next_free = MFT_REC_USER;
+
+ for (;;) {
+ if (sbi->mft.next_free >= sbi->mft.bitmap.nbits) {
+ } else if (!wnd_find(wnd, 1, MFT_REC_USER, 0, &fr)) {
+ sbi->mft.next_free = sbi->mft.bitmap.nbits;
+ } else {
+ *rno = fr;
+ sbi->mft.next_free = *rno + 1;
+ break;
+ }
+
+ err = ntfs_extend_mft(sbi);
+ if (err)
+ goto out;
+ }
+ }
+
+ if (ni && !ni_add_subrecord(ni, *rno, mi)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* We have found a record that is not reserved for the next MFT */
+ if (*rno >= MFT_REC_FREE)
+ wnd_set_used(wnd, *rno, 1);
+ else if (*rno >= MFT_REC_RESERVED && sbi->mft.reserved_bitmap_inited)
+ __set_bit(*rno - MFT_REC_RESERVED, &sbi->mft.reserved_bitmap);
+
+out:
+ if (!mft)
+ up_write(&wnd->rw_lock);
+
+ return err;
+}
+
+/*
+ * ntfs_mark_rec_free
+ *
+ * marks record as free
+ */
+void ntfs_mark_rec_free(struct ntfs_sb_info *sbi, CLST rno)
+{
+ struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT);
+ if (rno >= wnd->nbits)
+ goto out;
+
+ if (rno >= MFT_REC_FREE) {
+ if (!wnd_is_used(wnd, rno, 1))
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ else
+ wnd_set_free(wnd, rno, 1);
+ } else if (rno >= MFT_REC_RESERVED && sbi->mft.reserved_bitmap_inited) {
+ __clear_bit(rno - MFT_REC_RESERVED, &sbi->mft.reserved_bitmap);
+ }
+
+ if (rno < wnd_zone_bit(wnd))
+ wnd_zone_set(wnd, rno, 1);
+ else if (rno < sbi->mft.next_free && rno >= MFT_REC_USER)
+ sbi->mft.next_free = rno;
+
+out:
+ up_write(&wnd->rw_lock);
+}
+
+/*
+ * ntfs_clear_mft_tail
+ *
+ * formats empty records [from, to)
+ * sbi->mft.bitmap is locked for write
+ */
+int ntfs_clear_mft_tail(struct ntfs_sb_info *sbi, size_t from, size_t to)
+{
+ int err;
+ u32 rs;
+ u64 vbo;
+ struct runs_tree *run;
+ struct ntfs_inode *ni;
+
+ if (from >= to)
+ return 0;
+
+ rs = sbi->record_size;
+ ni = sbi->mft.ni;
+ run = &ni->file.run;
+
+ down_read(&ni->file.run_lock);
+ vbo = (u64)from * rs;
+ for (; from < to; from++, vbo += rs) {
+ struct ntfs_buffers nb;
+
+ err = ntfs_get_bh(sbi, run, vbo, rs, &nb);
+ if (err)
+ goto out;
+
+ err = ntfs_write_bh(sbi, &sbi->new_rec->rhdr, &nb, 0);
+ nb_put(&nb);
+ if (err)
+ goto out;
+ }
+
+out:
+ sbi->mft.used = from;
+ up_read(&ni->file.run_lock);
+ return err;
+}
+
+/*
+ * ntfs_refresh_zone
+ *
+ * refreshes Mft zone
+ * sbi->used.bitmap is locked for rw
+ * sbi->mft.bitmap is locked for write
+ * sbi->mft.ni->file.run_lock for write
+ */
+int ntfs_refresh_zone(struct ntfs_sb_info *sbi)
+{
+ CLST zone_limit, zone_max, lcn, vcn, len;
+ size_t lcn_s, zlen;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+ struct ntfs_inode *ni = sbi->mft.ni;
+
+ /* Do not change anything if the Mft zone is still non-empty */
+ if (wnd_zone_len(wnd))
+ return 0;
+
+ /*
+ * Compute the mft zone in two steps.
+ * It would be nice if we were able to allocate
+ * 1/8 of the total clusters for MFT, but not more than 512 MB.
+ */
+ zone_limit = (512 * 1024 * 1024) >> sbi->cluster_bits;
+ zone_max = wnd->nbits >> 3;
+ if (zone_max > zone_limit)
+ zone_max = zone_limit;
+
+ vcn = bytes_to_cluster(sbi,
+ (u64)sbi->mft.bitmap.nbits << sbi->record_bits);
+
+ if (!run_lookup_entry(&ni->file.run, vcn - 1, &lcn, &len, NULL))
+ lcn = SPARSE_LCN;
+
+ /* We should always find Last Lcn for MFT */
+ if (lcn == SPARSE_LCN)
+ return -EINVAL;
+
+ lcn_s = lcn + 1;
+
+ /* Try to allocate clusters after last MFT run */
+ zlen = wnd_find(wnd, zone_max, lcn_s, 0, &lcn_s);
+ if (!zlen) {
+ ntfs_notice(sbi->sb, "MftZone: unavailable");
+ return 0;
+ }
+
+ /* Truncate too large zone */
+ wnd_zone_set(wnd, lcn_s, zlen);
+
+ return 0;
+}
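+
+/*
+ * Worked example of the sizing above (illustrative): with 4K clusters,
+ * zone_limit = 512 MB >> 12 = 131072 clusters. A 1 TB volume has
+ * ~268M clusters, so zone_max = nbits >> 3 (~33.5M) is capped at
+ * 131072 clusters, i.e. 512 MB reserved after the last MFT run.
+ */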
+
+/*
+ * ntfs_update_mftmirr
+ *
+ * updates $MFTMirr data
+ */
+int ntfs_update_mftmirr(struct ntfs_sb_info *sbi, int wait)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ u32 blocksize = sb->s_blocksize;
+ sector_t block1, block2;
+ u32 bytes;
+
+ if (!(sbi->flags & NTFS_FLAGS_MFTMIRR))
+ return 0;
+
+ err = 0;
+ bytes = sbi->mft.recs_mirr << sbi->record_bits;
+ block1 = sbi->mft.lbo >> sb->s_blocksize_bits;
+ block2 = sbi->mft.lbo2 >> sb->s_blocksize_bits;
+
+ for (; bytes >= blocksize; bytes -= blocksize) {
+ struct buffer_head *bh1, *bh2;
+
+ bh1 = sb_bread(sb, block1++);
+ if (!bh1) {
+ err = -EIO;
+ goto out;
+ }
+
+ bh2 = sb_getblk(sb, block2++);
+ if (!bh2) {
+ put_bh(bh1);
+ err = -EIO;
+ goto out;
+ }
+
+ if (buffer_locked(bh2))
+ __wait_on_buffer(bh2);
+
+ lock_buffer(bh2);
+ memcpy(bh2->b_data, bh1->b_data, blocksize);
+ set_buffer_uptodate(bh2);
+ mark_buffer_dirty(bh2);
+ unlock_buffer(bh2);
+
+ put_bh(bh1);
+ bh1 = NULL;
+
+ if (wait)
+ err = sync_dirty_buffer(bh2);
+
+ put_bh(bh2);
+ if (err)
+ goto out;
+ }
+
+ sbi->flags &= ~NTFS_FLAGS_MFTMIRR;
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_set_state
+ *
+ * mount: ntfs_set_state(NTFS_DIRTY_DIRTY)
+ * umount: ntfs_set_state(NTFS_DIRTY_CLEAR)
+ * ntfs error: ntfs_set_state(NTFS_DIRTY_ERROR)
+ */
+int ntfs_set_state(struct ntfs_sb_info *sbi, enum NTFS_DIRTY_FLAGS dirty)
+{
+ int err;
+ struct ATTRIB *attr;
+ struct VOLUME_INFO *info;
+ struct mft_inode *mi;
+ struct ntfs_inode *ni;
+
+ /*
+ * Do not change state if fs was real_dirty.
+ * Do not change state if fs is already dirty(clear).
+ * Do not change anything if mounted read only.
+ */
+ if (sbi->volume.real_dirty || sb_rdonly(sbi->sb))
+ return 0;
+
+ /* Check cached value */
+ if ((dirty == NTFS_DIRTY_CLEAR ? 0 : VOLUME_FLAG_DIRTY) ==
+ (sbi->volume.flags & VOLUME_FLAG_DIRTY))
+ return 0;
+
+ ni = sbi->volume.ni;
+ if (!ni)
+ return -EINVAL;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_DIRTY);
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_VOL_INFO, NULL, 0, NULL, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ info = resident_data_ex(attr, SIZEOF_ATTRIBUTE_VOLUME_INFO);
+ if (!info) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ switch (dirty) {
+ case NTFS_DIRTY_ERROR:
+ ntfs_notice(sbi->sb, "Mark volume as dirty due to NTFS errors");
+ sbi->volume.real_dirty = true;
+ fallthrough;
+ case NTFS_DIRTY_DIRTY:
+ info->flags |= VOLUME_FLAG_DIRTY;
+ break;
+ case NTFS_DIRTY_CLEAR:
+ info->flags &= ~VOLUME_FLAG_DIRTY;
+ break;
+ }
+ /* Cache current volume flags */
+ sbi->volume.flags = info->flags;
+ mi->dirty = true;
+ err = 0;
+
+out:
+ ni_unlock(ni);
+ if (err)
+ return err;
+
+ mark_inode_dirty(&ni->vfs_inode);
+ /*verify(!ntfs_update_mftmirr()); */
+
+ /*
+ * If we used wait=1, sync_inode_metadata waits for the inode's I/O
+ * to finish. It hangs when the media is removed, so wait=0 is sent
+ * down to sync_inode_metadata and filemap_fdatawrite is used for
+ * the data blocks.
+ */
+ err = sync_inode_metadata(&ni->vfs_inode, 0);
+ if (!err)
+ err = filemap_fdatawrite(ni->vfs_inode.i_mapping);
+
+ return err;
+}
+
+/*
+ * security_hash
+ *
+ * Calculates a hash of the security descriptor.
+ */
+static inline __le32 security_hash(const void *sd, size_t bytes)
+{
+ u32 hash = 0;
+ const __le32 *ptr = sd;
+
+ bytes >>= 2;
+ while (bytes--)
+ hash = ((hash >> 0x1D) | (hash << 3)) + le32_to_cpu(*ptr++);
+ return cpu_to_le32(hash);
+}
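+
+/*
+ * Illustrative restatement (not asserted by the patch itself): the loop
+ * above is a rotate-and-add over the 32-bit little-endian words of the
+ * descriptor, i.e. for each word w:
+ *
+ *	hash = rol32(hash, 3) + le32_to_cpu(w);
+ *
+ * since ((hash >> 0x1D) | (hash << 3)) == rol32(hash, 3).
+ */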
+
+int ntfs_sb_read(struct super_block *sb, u64 lbo, size_t bytes, void *buffer)
+{
+ struct block_device *bdev = sb->s_bdev;
+ u32 blocksize = sb->s_blocksize;
+ u64 block = lbo >> sb->s_blocksize_bits;
+ u32 off = lbo & (blocksize - 1);
+ u32 op = blocksize - off;
+
+ for (; bytes; block += 1, off = 0, op = blocksize) {
+ struct buffer_head *bh = __bread(bdev, block, blocksize);
+
+ if (!bh)
+ return -EIO;
+
+ if (op > bytes)
+ op = bytes;
+
+ memcpy(buffer, bh->b_data + off, op);
+
+ put_bh(bh);
+
+ bytes -= op;
+ buffer = Add2Ptr(buffer, op);
+ }
+
+ return 0;
+}
+
+int ntfs_sb_write(struct super_block *sb, u64 lbo, size_t bytes,
+ const void *buf, int wait)
+{
+ u32 blocksize = sb->s_blocksize;
+ struct block_device *bdev = sb->s_bdev;
+ sector_t block = lbo >> sb->s_blocksize_bits;
+ u32 off = lbo & (blocksize - 1);
+ u32 op = blocksize - off;
+ struct buffer_head *bh;
+
+ if (!wait && (sb->s_flags & SB_SYNCHRONOUS))
+ wait = 1;
+
+ for (; bytes; block += 1, off = 0, op = blocksize) {
+ if (op > bytes)
+ op = bytes;
+
+ if (op < blocksize) {
+ bh = __bread(bdev, block, blocksize);
+ if (!bh) {
+ ntfs_err(sb, "failed to read block %llx",
+ (u64)block);
+ return -EIO;
+ }
+ } else {
+ bh = __getblk(bdev, block, blocksize);
+ if (!bh)
+ return -ENOMEM;
+ }
+
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+
+ lock_buffer(bh);
+ if (buf) {
+ memcpy(bh->b_data + off, buf, op);
+ buf = Add2Ptr(buf, op);
+ } else {
+ memset(bh->b_data + off, -1, op);
+ }
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+
+ if (wait) {
+ int err = sync_dirty_buffer(bh);
+
+ if (err) {
+ ntfs_err(
+ sb,
+ "failed to sync buffer at block %llx, error %d",
+ (u64)block, err);
+ put_bh(bh);
+ return err;
+ }
+ }
+
+ put_bh(bh);
+
+ bytes -= op;
+ }
+ return 0;
+}
+
+int ntfs_sb_write_run(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, const void *buf, size_t bytes)
+{
+ struct super_block *sb = sbi->sb;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 off = vbo & sbi->cluster_mask;
+ CLST lcn, clen, vcn = vbo >> cluster_bits, vcn_next;
+ u64 lbo, len;
+ size_t idx;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+ return -ENOENT;
+
+ if (lcn == SPARSE_LCN)
+ return -EINVAL;
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+
+ for (;;) {
+ u32 op = len < bytes ? len : bytes;
+ int err = ntfs_sb_write(sb, lbo, op, buf, 0);
+
+ if (err)
+ return err;
+
+ bytes -= op;
+ if (!bytes)
+ break;
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next)
+ return -ENOENT;
+
+ if (lcn == SPARSE_LCN)
+ return -EINVAL;
+
+ if (buf)
+ buf = Add2Ptr(buf, op);
+
+ lbo = ((u64)lcn << cluster_bits);
+ len = ((u64)clen << cluster_bits);
+ }
+
+ return 0;
+}
+
+struct buffer_head *ntfs_bread_run(struct ntfs_sb_info *sbi,
+ const struct runs_tree *run, u64 vbo)
+{
+ struct super_block *sb = sbi->sb;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST lcn;
+ u64 lbo;
+
+ if (!run_lookup_entry(run, vbo >> cluster_bits, &lcn, NULL, NULL))
+ return ERR_PTR(-ENOENT);
+
+ lbo = ((u64)lcn << cluster_bits) + (vbo & sbi->cluster_mask);
+
+ return ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+}
+
+int ntfs_read_run_nb(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, void *buf, u32 bytes, struct ntfs_buffers *nb)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ u32 blocksize = sb->s_blocksize;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 off = vbo & sbi->cluster_mask;
+ u32 nbh = 0;
+ CLST vcn_next, vcn = vbo >> cluster_bits;
+ CLST lcn, clen;
+ u64 lbo, len;
+ size_t idx;
+ struct buffer_head *bh;
+
+ if (!run) {
+ /* First reading of $Volume + $MFTMirr + LogFile goes here */
+ if (vbo > MFT_REC_VOL * sbi->record_size) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ /* Use the absolute 'MFTCluster' from the boot record to read the record */
+ lbo = vbo + sbi->mft.lbo;
+ len = sbi->record_size;
+ } else if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+ err = -ENOENT;
+ goto out;
+ } else {
+ if (lcn == SPARSE_LCN) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+ }
+
+ off = lbo & (blocksize - 1);
+ if (nb) {
+ nb->off = off;
+ nb->bytes = bytes;
+ }
+
+ for (;;) {
+ u32 len32 = len >= bytes ? bytes : len;
+ sector_t block = lbo >> sb->s_blocksize_bits;
+
+ do {
+ u32 op = blocksize - off;
+
+ if (op > len32)
+ op = len32;
+
+ bh = ntfs_bread(sb, block);
+ if (!bh) {
+ err = -EIO;
+ goto out;
+ }
+
+ if (buf) {
+ memcpy(buf, bh->b_data + off, op);
+ buf = Add2Ptr(buf, op);
+ }
+
+ if (!nb) {
+ put_bh(bh);
+ } else if (nbh >= ARRAY_SIZE(nb->bh)) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ nb->bh[nbh++] = bh;
+ nb->nbufs = nbh;
+ }
+
+ bytes -= op;
+ if (!bytes)
+ return 0;
+ len32 -= op;
+ block += 1;
+ off = 0;
+
+ } while (len32);
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits);
+ len = ((u64)clen << cluster_bits);
+ }
+
+out:
+ if (!nbh)
+ return err;
+
+ while (nbh) {
+ put_bh(nb->bh[--nbh]);
+ nb->bh[nbh] = NULL;
+ }
+
+ nb->nbufs = 0;
+ return err;
+}
+
+/* Returns < 0 if error, 0 if ok, '-E_NTFS_FIXUP' if need to update fixups */
+int ntfs_read_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+ struct NTFS_RECORD_HEADER *rhdr, u32 bytes,
+ struct ntfs_buffers *nb)
+{
+ int err = ntfs_read_run_nb(sbi, run, vbo, rhdr, bytes, nb);
+
+ if (err)
+ return err;
+ return ntfs_fix_post_read(rhdr, nb->bytes, true);
+}
+
+int ntfs_get_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+ u32 bytes, struct ntfs_buffers *nb)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ u32 blocksize = sb->s_blocksize;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST vcn_next, vcn = vbo >> cluster_bits;
+ u32 off;
+ u32 nbh = 0;
+ CLST lcn, clen;
+ u64 lbo, len;
+ size_t idx;
+
+ nb->bytes = bytes;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ off = vbo & sbi->cluster_mask;
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+
+ nb->off = off = lbo & (blocksize - 1);
+
+ for (;;) {
+ u32 len32 = len < bytes ? len : bytes;
+ sector_t block = lbo >> sb->s_blocksize_bits;
+
+ do {
+ u32 op;
+ struct buffer_head *bh;
+
+ if (nbh >= ARRAY_SIZE(nb->bh)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ op = blocksize - off;
+ if (op > len32)
+ op = len32;
+
+ if (op == blocksize) {
+ bh = sb_getblk(sb, block);
+ if (!bh) {
+ err = -ENOMEM;
+ goto out;
+ }
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+ set_buffer_uptodate(bh);
+ } else {
+ bh = ntfs_bread(sb, block);
+ if (!bh) {
+ err = -EIO;
+ goto out;
+ }
+ }
+
+ nb->bh[nbh++] = bh;
+ bytes -= op;
+ if (!bytes) {
+ nb->nbufs = nbh;
+ return 0;
+ }
+
+ block += 1;
+ len32 -= op;
+ off = 0;
+ } while (len32);
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits);
+ len = ((u64)clen << cluster_bits);
+ }
+
+out:
+ while (nbh) {
+ put_bh(nb->bh[--nbh]);
+ nb->bh[nbh] = NULL;
+ }
+
+ nb->nbufs = 0;
+
+ return err;
+}
+
+int ntfs_write_bh(struct ntfs_sb_info *sbi, struct NTFS_RECORD_HEADER *rhdr,
+ struct ntfs_buffers *nb, int sync)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ u32 block_size = sb->s_blocksize;
+ u32 bytes = nb->bytes;
+ u32 off = nb->off;
+ u16 fo = le16_to_cpu(rhdr->fix_off);
+ u16 fn = le16_to_cpu(rhdr->fix_num);
+ u32 idx;
+ __le16 *fixup;
+ __le16 sample;
+
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+ return -EINVAL;
+ }
+
+ for (idx = 0; bytes && idx < nb->nbufs; idx += 1, off = 0) {
+ u32 op = block_size - off;
+ char *bh_data;
+ struct buffer_head *bh = nb->bh[idx];
+ __le16 *ptr, *end_data;
+
+ if (op > bytes)
+ op = bytes;
+
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+
+ lock_buffer(nb->bh[idx]);
+
+ bh_data = bh->b_data + off;
+ end_data = Add2Ptr(bh_data, op);
+ memcpy(bh_data, rhdr, op);
+
+ if (!idx) {
+ u16 t16;
+
+ fixup = Add2Ptr(bh_data, fo);
+ sample = *fixup;
+ t16 = le16_to_cpu(sample);
+ if (t16 >= 0x7FFF) {
+ sample = *fixup = cpu_to_le16(1);
+ } else {
+ sample = cpu_to_le16(t16 + 1);
+ *fixup = sample;
+ }
+
+ *(__le16 *)Add2Ptr(rhdr, fo) = sample;
+ }
+
+ ptr = Add2Ptr(bh_data, SECTOR_SIZE - sizeof(short));
+
+ do {
+ *++fixup = *ptr;
+ *ptr = sample;
+ ptr += SECTOR_SIZE / sizeof(short);
+ } while (ptr < end_data);
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+
+ if (sync) {
+ int err2 = sync_dirty_buffer(bh);
+
+ if (!err && err2)
+ err = err2;
+ }
+
+ bytes -= op;
+ rhdr = Add2Ptr(rhdr, op);
+ }
+
+ return err;
+}
+
+static inline struct bio *ntfs_alloc_bio(u32 nr_vecs)
+{
+ struct bio *bio = bio_alloc(GFP_NOFS | __GFP_HIGH, nr_vecs);
+
+ if (!bio && (current->flags & PF_MEMALLOC)) {
+ while (!bio && (nr_vecs /= 2))
+ bio = bio_alloc(GFP_NOFS | __GFP_HIGH, nr_vecs);
+ }
+ return bio;
+}
+
+/* Read/write pages from/to disk */
+int ntfs_bio_pages(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ struct page **pages, u32 nr_pages, u64 vbo, u32 bytes,
+ u32 op)
+{
+ int err = 0;
+ struct bio *new, *bio = NULL;
+ struct super_block *sb = sbi->sb;
+ struct block_device *bdev = sb->s_bdev;
+ struct page *page;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST lcn, clen, vcn, vcn_next;
+ u32 add, off, page_idx;
+ u64 lbo, len;
+ size_t run_idx;
+ struct blk_plug plug;
+
+ if (!bytes)
+ return 0;
+
+ blk_start_plug(&plug);
+
+ /* Align vbo and bytes to be 512-byte aligned */
+ lbo = (vbo + bytes + 511) & ~511ull;
+ vbo = vbo & ~511ull;
+ bytes = lbo - vbo;
+
+ vcn = vbo >> cluster_bits;
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &run_idx)) {
+ err = -ENOENT;
+ goto out;
+ }
+ off = vbo & sbi->cluster_mask;
+ page_idx = 0;
+ page = pages[0];
+
+ for (;;) {
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+new_bio:
+ new = ntfs_alloc_bio(nr_pages - page_idx);
+ if (!new) {
+ err = -ENOMEM;
+ goto out;
+ }
+ if (bio) {
+ bio_chain(bio, new);
+ submit_bio(bio);
+ }
+ bio = new;
+ bio_set_dev(bio, bdev);
+ bio->bi_iter.bi_sector = lbo >> 9;
+ bio->bi_opf = op;
+
+ while (len) {
+ off = vbo & (PAGE_SIZE - 1);
+ add = off + len > PAGE_SIZE ? (PAGE_SIZE - off) : len;
+
+ if (bio_add_page(bio, page, add, off) < add)
+ goto new_bio;
+
+ if (bytes <= add)
+ goto out;
+ bytes -= add;
+ vbo += add;
+
+ if (add + off == PAGE_SIZE) {
+ page_idx += 1;
+ if (WARN_ON(page_idx >= nr_pages)) {
+ err = -EINVAL;
+ goto out;
+ }
+ page = pages[page_idx];
+ }
+
+ if (len <= add)
+ break;
+ len -= add;
+ lbo += add;
+ }
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++run_idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ err = -ENOENT;
+ goto out;
+ }
+ off = 0;
+ }
+out:
+ if (bio) {
+ if (!err)
+ err = submit_bio_wait(bio);
+ bio_put(bio);
+ }
+ blk_finish_plug(&plug);
+
+ return err;
+}
+
+/*
+ * Helper for ntfs_loadlog_and_replay:
+ * fills the on-disk logfile range with -1,
+ * which means an empty logfile.
+ */
+int ntfs_bio_fill_1(struct ntfs_sb_info *sbi, const struct runs_tree *run)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ struct block_device *bdev = sb->s_bdev;
+ u8 cluster_bits = sbi->cluster_bits;
+ struct bio *new, *bio = NULL;
+ CLST lcn, clen;
+ u64 lbo, len;
+ size_t run_idx;
+ struct page *fill;
+ void *kaddr;
+ struct blk_plug plug;
+
+ fill = alloc_page(GFP_KERNEL);
+ if (!fill)
+ return -ENOMEM;
+
+ kaddr = kmap_atomic(fill);
+ memset(kaddr, -1, PAGE_SIZE);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(fill);
+ lock_page(fill);
+
+ if (!run_lookup_entry(run, 0, &lcn, &clen, &run_idx)) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ /*
+ * TODO: try blkdev_issue_write_same
+ */
+ blk_start_plug(&plug);
+ do {
+ lbo = (u64)lcn << cluster_bits;
+ len = (u64)clen << cluster_bits;
+new_bio:
+ new = ntfs_alloc_bio(BIO_MAX_VECS);
+ if (!new) {
+ err = -ENOMEM;
+ break;
+ }
+ if (bio) {
+ bio_chain(bio, new);
+ submit_bio(bio);
+ }
+ bio = new;
+ bio_set_dev(bio, bdev);
+ bio->bi_opf = REQ_OP_WRITE;
+ bio->bi_iter.bi_sector = lbo >> 9;
+
+ for (;;) {
+ u32 add = len > PAGE_SIZE ? PAGE_SIZE : len;
+
+ if (bio_add_page(bio, fill, add, 0) < add)
+ goto new_bio;
+
+ lbo += add;
+ if (len <= add)
+ break;
+ len -= add;
+ }
+ } while (run_get_entry(run, ++run_idx, NULL, &lcn, &clen));
+
+ if (bio) {
+ if (!err)
+ err = submit_bio_wait(bio);
+ bio_put(bio);
+ }
+ blk_finish_plug(&plug);
+out:
+ unlock_page(fill);
+ put_page(fill);
+
+ return err;
+}
+
+int ntfs_vbo_to_lbo(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, u64 *lbo, u64 *bytes)
+{
+ u32 off;
+ CLST lcn, len;
+ u8 cluster_bits = sbi->cluster_bits;
+
+ if (!run_lookup_entry(run, vbo >> cluster_bits, &lcn, &len, NULL))
+ return -ENOENT;
+
+ off = vbo & sbi->cluster_mask;
+ *lbo = lcn == SPARSE_LCN ? -1 : (((u64)lcn << cluster_bits) + off);
+ *bytes = ((u64)len << cluster_bits) - off;
+
+ return 0;
+}
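+
+/*
+ * Worked example (illustrative): with 4K clusters (cluster_bits == 12)
+ * and vbo = 0x5010, the lookup is for vcn = 5; if that maps to
+ * lcn = 0x100 with len = 2 clusters, then
+ *
+ *	off    = 0x5010 & 0xFFF = 0x10;
+ *	*lbo   = (0x100 << 12) + 0x10 = 0x100010;
+ *	*bytes = (2 << 12) - 0x10 = 0x1FF0;
+ */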
+
+struct ntfs_inode *ntfs_new_inode(struct ntfs_sb_info *sbi, CLST rno, bool dir)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ struct inode *inode = new_inode(sb);
+ struct ntfs_inode *ni;
+
+ if (!inode)
+ return ERR_PTR(-ENOMEM);
+
+ ni = ntfs_i(inode);
+
+ err = mi_format_new(&ni->mi, sbi, rno, dir ? RECORD_FLAG_DIR : 0,
+ false);
+ if (err)
+ goto out;
+
+ inode->i_ino = rno;
+ if (insert_inode_locked(inode) < 0) {
+ err = -EIO;
+ goto out;
+ }
+
+out:
+ if (err) {
+ iput(inode);
+ ni = ERR_PTR(err);
+ }
+ return ni;
+}
+
+/*
+ * O:BAG:BAD:(A;OICI;FA;;;WD)
+ * owner S-1-5-32-544 (Administrators)
+ * group S-1-5-32-544 (Administrators)
+ * ACE: allow S-1-1-0 (Everyone) with FILE_ALL_ACCESS
+ */
+const u8 s_default_security[] __aligned(8) = {
+ 0x01, 0x00, 0x04, 0x80, 0x30, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x02, 0x00, 0x1C, 0x00,
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x03, 0x14, 0x00, 0xFF, 0x01, 0x1F, 0x00,
+ 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00,
+ 0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x20, 0x00, 0x00, 0x00,
+ 0x20, 0x02, 0x00, 0x00, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05,
+ 0x20, 0x00, 0x00, 0x00, 0x20, 0x02, 0x00, 0x00,
+};
+
+static_assert(sizeof(s_default_security) == 0x50);
+
+static inline u32 sid_length(const struct SID *sid)
+{
+ return struct_size(sid, SubAuthority, sid->SubAuthorityCount);
+}
+
+/*
+ * Thanks to Mark Harmstone for the idea.
+ */
+static bool is_acl_valid(const struct ACL *acl, u32 len)
+{
+ const struct ACE_HEADER *ace;
+ u32 i;
+ u16 ace_count, ace_size;
+
+ if (acl->AclRevision != ACL_REVISION &&
+ acl->AclRevision != ACL_REVISION_DS) {
+ /*
+ * This value should be ACL_REVISION, unless the ACL contains an
+ * object-specific ACE, in which case this value must be ACL_REVISION_DS.
+ * All ACEs in an ACL must be at the same revision level.
+ */
+ return false;
+ }
+
+ if (acl->Sbz1)
+ return false;
+
+ if (le16_to_cpu(acl->AclSize) > len)
+ return false;
+
+ if (acl->Sbz2)
+ return false;
+
+ len -= sizeof(struct ACL);
+ ace = (struct ACE_HEADER *)&acl[1];
+ ace_count = le16_to_cpu(acl->AceCount);
+
+ for (i = 0; i < ace_count; i++) {
+ if (len < sizeof(struct ACE_HEADER))
+ return false;
+
+ ace_size = le16_to_cpu(ace->AceSize);
+ if (len < ace_size)
+ return false;
+
+ len -= ace_size;
+ ace = Add2Ptr(ace, ace_size);
+ }
+
+ return true;
+}
+
+bool is_sd_valid(const struct SECURITY_DESCRIPTOR_RELATIVE *sd, u32 len)
+{
+ u32 sd_owner, sd_group, sd_sacl, sd_dacl;
+
+ if (len < sizeof(struct SECURITY_DESCRIPTOR_RELATIVE))
+ return false;
+
+ if (sd->Revision != 1)
+ return false;
+
+ if (sd->Sbz1)
+ return false;
+
+ if (!(sd->Control & SE_SELF_RELATIVE))
+ return false;
+
+ sd_owner = le32_to_cpu(sd->Owner);
+ if (sd_owner) {
+ const struct SID *owner = Add2Ptr(sd, sd_owner);
+
+ if (sd_owner + offsetof(struct SID, SubAuthority) > len)
+ return false;
+
+ if (owner->Revision != 1)
+ return false;
+
+ if (sd_owner + sid_length(owner) > len)
+ return false;
+ }
+
+ sd_group = le32_to_cpu(sd->Group);
+ if (sd_group) {
+ const struct SID *group = Add2Ptr(sd, sd_group);
+
+ if (sd_group + offsetof(struct SID, SubAuthority) > len)
+ return false;
+
+ if (group->Revision != 1)
+ return false;
+
+ if (sd_group + sid_length(group) > len)
+ return false;
+ }
+
+ sd_sacl = le32_to_cpu(sd->Sacl);
+ if (sd_sacl) {
+ const struct ACL *sacl = Add2Ptr(sd, sd_sacl);
+
+ if (sd_sacl + sizeof(struct ACL) > len)
+ return false;
+
+ if (!is_acl_valid(sacl, len - sd_sacl))
+ return false;
+ }
+
+ sd_dacl = le32_to_cpu(sd->Dacl);
+ if (sd_dacl) {
+ const struct ACL *dacl = Add2Ptr(sd, sd_dacl);
+
+ if (sd_dacl + sizeof(struct ACL) > len)
+ return false;
+
+ if (!is_acl_valid(dacl, len - sd_dacl))
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * ntfs_security_init
+ *
+ * loads and parses $Secure
+ */
+int ntfs_security_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ struct inode *inode;
+ struct ntfs_inode *ni;
+ struct MFT_REF ref;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ u64 sds_size;
+ size_t cnt, off;
+ struct NTFS_DE *ne;
+ struct NTFS_DE_SII *sii_e;
+ struct ntfs_fnd *fnd_sii = NULL;
+ const struct INDEX_ROOT *root_sii;
+ const struct INDEX_ROOT *root_sdh;
+ struct ntfs_index *indx_sdh = &sbi->security.index_sdh;
+ struct ntfs_index *indx_sii = &sbi->security.index_sii;
+
+ ref.low = cpu_to_le32(MFT_REC_SECURE);
+ ref.high = 0;
+ ref.seq = cpu_to_le16(MFT_REC_SECURE);
+
+ inode = ntfs_iget5(sb, &ref, &NAME_SECURE);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $Secure.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ le = NULL;
+
+ attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SDH_NAME,
+ ARRAY_SIZE(SDH_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_sdh = resident_data(attr);
+ if (root_sdh->type != ATTR_ZERO ||
+ root_sdh->rule != NTFS_COLLATION_TYPE_SECURITY_HASH) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx_sdh, sbi, attr, INDEX_MUTEX_SDH);
+ if (err)
+ goto out;
+
+ attr = ni_find_attr(ni, attr, &le, ATTR_ROOT, SII_NAME,
+ ARRAY_SIZE(SII_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_sii = resident_data(attr);
+ if (root_sii->type != ATTR_ZERO ||
+ root_sii->rule != NTFS_COLLATION_TYPE_UINT) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx_sii, sbi, attr, INDEX_MUTEX_SII);
+ if (err)
+ goto out;
+
+ fnd_sii = fnd_get();
+ if (!fnd_sii) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ sds_size = inode->i_size;
+
+ /* Find the last valid Id */
+ sbi->security.next_id = SECURITY_ID_FIRST;
+ /* Always write new security descriptors at the end of the current bucket. */
+ sbi->security.next_off =
+ Quad2Align(sds_size - SecurityDescriptorsBlockSize);
+
+ cnt = 0;
+ off = 0;
+ ne = NULL;
+
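+ /*
+ * Enumerate all $SII entries to find the largest security id in
+ * use; new descriptors are then assigned ids above it.
+ */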
+ for (;;) {
+ u32 next_id;
+
+ err = indx_find_raw(indx_sii, ni, root_sii, &ne, &off, fnd_sii);
+ if (err || !ne)
+ break;
+
+ sii_e = (struct NTFS_DE_SII *)ne;
+ if (le16_to_cpu(ne->view.data_size) < SIZEOF_SECURITY_HDR)
+ continue;
+
+ next_id = le32_to_cpu(sii_e->sec_id) + 1;
+ if (next_id >= sbi->security.next_id)
+ sbi->security.next_id = next_id;
+
+ cnt += 1;
+ }
+
+ sbi->security.ni = ni;
+ inode = NULL;
+out:
+ iput(inode);
+ fnd_put(fnd_sii);
+
+ return err;
+}
+
+/*
+ * ntfs_get_security_by_id
+ *
+ * reads a security descriptor by its id
+ */
+int ntfs_get_security_by_id(struct ntfs_sb_info *sbi, __le32 security_id,
+ struct SECURITY_DESCRIPTOR_RELATIVE **sd,
+ size_t *size)
+{
+ int err;
+ int diff;
+ struct ntfs_inode *ni = sbi->security.ni;
+ struct ntfs_index *indx = &sbi->security.index_sii;
+ void *p = NULL;
+ struct NTFS_DE_SII *sii_e;
+ struct ntfs_fnd *fnd_sii;
+ struct SECURITY_HDR d_security;
+ const struct INDEX_ROOT *root_sii;
+ u32 t32;
+
+ *sd = NULL;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_SECURITY);
+
+ fnd_sii = fnd_get();
+ if (!fnd_sii) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ root_sii = indx_get_root(indx, ni, NULL, NULL);
+ if (!root_sii) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Try to find this SECURITY descriptor in SII indexes */
+ err = indx_find(indx, ni, root_sii, &security_id, sizeof(security_id),
+ NULL, &diff, (struct NTFS_DE **)&sii_e, fnd_sii);
+ if (err)
+ goto out;
+
+ if (diff)
+ goto out;
+
+ t32 = le32_to_cpu(sii_e->sec_hdr.size);
+ if (t32 < SIZEOF_SECURITY_HDR) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (t32 > SIZEOF_SECURITY_HDR + 0x10000) {
+ /*
+ * Security descriptor looks too large;
+ * 0x10000 is an arbitrary upper bound.
+ */
+ err = -EFBIG;
+ goto out;
+ }
+
+ *size = t32 - SIZEOF_SECURITY_HDR;
+
+ p = ntfs_malloc(*size);
+ if (!p) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = ntfs_read_run_nb(sbi, &ni->file.run,
+ le64_to_cpu(sii_e->sec_hdr.off), &d_security,
+ sizeof(d_security), NULL);
+ if (err)
+ goto out;
+
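+ /* The security header on disk must match the copy kept in the $SII entry. */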
+ if (memcmp(&d_security, &sii_e->sec_hdr, SIZEOF_SECURITY_HDR)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = ntfs_read_run_nb(sbi, &ni->file.run,
+ le64_to_cpu(sii_e->sec_hdr.off) +
+ SIZEOF_SECURITY_HDR,
+ p, *size, NULL);
+ if (err)
+ goto out;
+
+ *sd = p;
+ p = NULL;
+
+out:
+ ntfs_free(p);
+ fnd_put(fnd_sii);
+ ni_unlock(ni);
+
+ return err;
+}
+
+/*
+ * ntfs_insert_security
+ *
+ * inserts security descriptor into $Secure::SDS
+ *
+ * The security descriptor stream ($SDS) is organized into chunks of
+ * 256K bytes and contains a mirror copy of each security descriptor.
+ * When writing a security descriptor at location X, another copy is
+ * written at location (X+256K). When a security descriptor would cross
+ * the 256K boundary, the write pointer is advanced by 256K to skip
+ * over the mirror portion.
+ */
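+/*
+ * Example (with 256K == SecurityDescriptorsBlockSize): a descriptor
+ * written at offset 0x10000 is mirrored at 0x50000, and the next block
+ * of real descriptors starts at 0x80000, so data blocks and mirror
+ * blocks alternate every 256K throughout the stream.
+ */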
+int ntfs_insert_security(struct ntfs_sb_info *sbi,
+ const struct SECURITY_DESCRIPTOR_RELATIVE *sd,
+ u32 size_sd, __le32 *security_id, bool *inserted)
+{
+ int err, diff;
+ struct ntfs_inode *ni = sbi->security.ni;
+ struct ntfs_index *indx_sdh = &sbi->security.index_sdh;
+ struct ntfs_index *indx_sii = &sbi->security.index_sii;
+ struct NTFS_DE_SDH *e;
+ struct NTFS_DE_SDH sdh_e;
+ struct NTFS_DE_SII sii_e;
+ struct SECURITY_HDR *d_security;
+ u32 new_sec_size = size_sd + SIZEOF_SECURITY_HDR;
+ u32 aligned_sec_size = Quad2Align(new_sec_size);
+ struct SECURITY_KEY hash_key;
+ struct ntfs_fnd *fnd_sdh = NULL;
+ const struct INDEX_ROOT *root_sdh;
+ const struct INDEX_ROOT *root_sii;
+ u64 mirr_off, new_sds_size;
+ u32 next, left;
+
+ static_assert((1 << Log2OfSecurityDescriptorsBlockSize) ==
+ SecurityDescriptorsBlockSize);
+
+ hash_key.hash = security_hash(sd, size_sd);
+ hash_key.sec_id = SECURITY_ID_INVALID;
+
+ if (inserted)
+ *inserted = false;
+ *security_id = SECURITY_ID_INVALID;
+
+ /* Allocate a temporary buffer. */
+ d_security = ntfs_zalloc(aligned_sec_size);
+ if (!d_security)
+ return -ENOMEM;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_SECURITY);
+
+ fnd_sdh = fnd_get();
+ if (!fnd_sdh) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ root_sdh = indx_get_root(indx_sdh, ni, NULL, NULL);
+ if (!root_sdh) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_sii = indx_get_root(indx_sii, ni, NULL, NULL);
+ if (!root_sii) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Check if such a security descriptor already exists:
+ * use the "SDH" index and the hash to get the offset in "SDS".
+ */
+ err = indx_find(indx_sdh, ni, root_sdh, &hash_key, sizeof(hash_key),
+ &d_security->key.sec_id, &diff, (struct NTFS_DE **)&e,
+ fnd_sdh);
+ if (err)
+ goto out;
+
+ while (e) {
+ if (le32_to_cpu(e->sec_hdr.size) == new_sec_size) {
+ err = ntfs_read_run_nb(sbi, &ni->file.run,
+ le64_to_cpu(e->sec_hdr.off),
+ d_security, new_sec_size, NULL);
+ if (err)
+ goto out;
+
+ if (le32_to_cpu(d_security->size) == new_sec_size &&
+ d_security->key.hash == hash_key.hash &&
+ !memcmp(d_security + 1, sd, size_sd)) {
+ *security_id = d_security->key.sec_id;
+ /* Such a security descriptor already exists. */
+ err = 0;
+ goto out;
+ }
+ }
+
+ err = indx_find_sort(indx_sdh, ni, root_sdh,
+ (struct NTFS_DE **)&e, fnd_sdh);
+ if (err)
+ goto out;
+
+ if (!e || e->key.hash != hash_key.hash)
+ break;
+ }
+
+ /* Zero unused space */
+ next = sbi->security.next_off & (SecurityDescriptorsBlockSize - 1);
+ left = SecurityDescriptorsBlockSize - next;
+
+ /* Zero gap until SecurityDescriptorsBlockSize */
+ if (left < new_sec_size) {
+ /* zero "left" bytes from sbi->security.next_off */
+ sbi->security.next_off += SecurityDescriptorsBlockSize + left;
+ }
+
+ /* Zero the tail of the previous security descriptor. */
+ //used = ni->vfs_inode.i_size & (SecurityDescriptorsBlockSize - 1);
+
+ /*
+ * Example:
+ * 0x40438 == ni->vfs_inode.i_size
+ * 0x00440 == sbi->security.next_off
+ * so the range [0x438, 0x440) needs zeroing:
+ * if (next > used) {
+ * u32 tozero = next - used;
+ * zero "tozero" bytes from sbi->security.next_off - tozero;
+ * }
+ */
+
+ /* format new security descriptor */
+ d_security->key.hash = hash_key.hash;
+ d_security->key.sec_id = cpu_to_le32(sbi->security.next_id);
+ d_security->off = cpu_to_le64(sbi->security.next_off);
+ d_security->size = cpu_to_le32(new_sec_size);
+ memcpy(d_security + 1, sd, size_sd);
+
+ /* Write main SDS bucket */
+ err = ntfs_sb_write_run(sbi, &ni->file.run, sbi->security.next_off,
+ d_security, aligned_sec_size);
+
+ if (err)
+ goto out;
+
+ mirr_off = sbi->security.next_off + SecurityDescriptorsBlockSize;
+ new_sds_size = mirr_off + aligned_sec_size;
+
+ if (new_sds_size > ni->vfs_inode.i_size) {
+ err = attr_set_size(ni, ATTR_DATA, SDS_NAME,
+ ARRAY_SIZE(SDS_NAME), &ni->file.run,
+ new_sds_size, &new_sds_size, false, NULL);
+ if (err)
+ goto out;
+ }
+
+ /* Write copy SDS bucket */
+ err = ntfs_sb_write_run(sbi, &ni->file.run, mirr_off, d_security,
+ aligned_sec_size);
+ if (err)
+ goto out;
+
+ /* Fill SII entry */
+ sii_e.de.view.data_off =
+ cpu_to_le16(offsetof(struct NTFS_DE_SII, sec_hdr));
+ sii_e.de.view.data_size = cpu_to_le16(SIZEOF_SECURITY_HDR);
+ sii_e.de.view.res = 0;
+ sii_e.de.size = cpu_to_le16(SIZEOF_SII_DIRENTRY);
+ sii_e.de.key_size = cpu_to_le16(sizeof(d_security->key.sec_id));
+ sii_e.de.flags = 0;
+ sii_e.de.res = 0;
+ sii_e.sec_id = d_security->key.sec_id;
+ memcpy(&sii_e.sec_hdr, d_security, SIZEOF_SECURITY_HDR);
+
+ err = indx_insert_entry(indx_sii, ni, &sii_e.de, NULL, NULL);
+ if (err)
+ goto out;
+
+ /* Fill SDH entry */
+ sdh_e.de.view.data_off =
+ cpu_to_le16(offsetof(struct NTFS_DE_SDH, sec_hdr));
+ sdh_e.de.view.data_size = cpu_to_le16(SIZEOF_SECURITY_HDR);
+ sdh_e.de.view.res = 0;
+ sdh_e.de.size = cpu_to_le16(SIZEOF_SDH_DIRENTRY);
+ sdh_e.de.key_size = cpu_to_le16(sizeof(sdh_e.key));
+ sdh_e.de.flags = 0;
+ sdh_e.de.res = 0;
+ sdh_e.key.hash = d_security->key.hash;
+ sdh_e.key.sec_id = d_security->key.sec_id;
+ memcpy(&sdh_e.sec_hdr, d_security, SIZEOF_SECURITY_HDR);
+ sdh_e.magic[0] = cpu_to_le16('I');
+ sdh_e.magic[1] = cpu_to_le16('I');
+
+ fnd_clear(fnd_sdh);
+ err = indx_insert_entry(indx_sdh, ni, &sdh_e.de, (void *)(size_t)1,
+ fnd_sdh);
+ if (err)
+ goto out;
+
+ *security_id = d_security->key.sec_id;
+ if (inserted)
+ *inserted = true;
+
+ /* Update Id and offset for next descriptor */
+ sbi->security.next_id += 1;
+ sbi->security.next_off += aligned_sec_size;
+
+out:
+ fnd_put(fnd_sdh);
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+ ntfs_free(d_security);
+
+ return err;
+}
+
+/*
+ * ntfs_reparse_init
+ *
+ * loads and parses $Extend/$Reparse
+ */
+int ntfs_reparse_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->reparse.ni;
+ struct ntfs_index *indx = &sbi->reparse.index_r;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ const struct INDEX_ROOT *root_r;
+
+ if (!ni)
+ return 0;
+
+ le = NULL;
+ attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SR_NAME,
+ ARRAY_SIZE(SR_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_r = resident_data(attr);
+ if (root_r->type != ATTR_ZERO ||
+ root_r->rule != NTFS_COLLATION_TYPE_UINTS) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx, sbi, attr, INDEX_MUTEX_SR);
+ if (err)
+ goto out;
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_objid_init
+ *
+ * loads and parses $Extend/$ObjId
+ */
+int ntfs_objid_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->objid.ni;
+ struct ntfs_index *indx = &sbi->objid.index_o;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ const struct INDEX_ROOT *root;
+
+ if (!ni)
+ return 0;
+
+ le = NULL;
+ attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SO_NAME,
+ ARRAY_SIZE(SO_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root = resident_data(attr);
+ if (root->type != ATTR_ZERO ||
+ root->rule != NTFS_COLLATION_TYPE_UINTS) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx, sbi, attr, INDEX_MUTEX_SO);
+ if (err)
+ goto out;
+
+out:
+ return err;
+}
+
+int ntfs_objid_remove(struct ntfs_sb_info *sbi, struct GUID *guid)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->objid.ni;
+ struct ntfs_index *indx = &sbi->objid.index_o;
+
+ if (!ni)
+ return -EINVAL;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_OBJID);
+
+ err = indx_delete_entry(indx, ni, guid, sizeof(*guid), NULL);
+
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+
+ return err;
+}
+
+int ntfs_insert_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+ const struct MFT_REF *ref)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->reparse.ni;
+ struct ntfs_index *indx = &sbi->reparse.index_r;
+ struct NTFS_DE_R re;
+
+ if (!ni)
+ return -EINVAL;
+
+ memset(&re, 0, sizeof(re));
+
+ re.de.view.data_off = cpu_to_le16(offsetof(struct NTFS_DE_R, zero));
+ re.de.size = cpu_to_le16(sizeof(struct NTFS_DE_R));
+ re.de.key_size = cpu_to_le16(sizeof(re.key));
+
+ re.key.ReparseTag = rtag;
+ memcpy(&re.key.ref, ref, sizeof(*ref));
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_REPARSE);
+
+ err = indx_insert_entry(indx, ni, &re.de, NULL, NULL);
+
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+
+ return err;
+}
+
+int ntfs_remove_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+ const struct MFT_REF *ref)
+{
+ int err, diff;
+ struct ntfs_inode *ni = sbi->reparse.ni;
+ struct ntfs_index *indx = &sbi->reparse.index_r;
+ struct ntfs_fnd *fnd = NULL;
+ struct REPARSE_KEY rkey;
+ struct NTFS_DE_R *re;
+ struct INDEX_ROOT *root_r;
+
+ if (!ni)
+ return -EINVAL;
+
+ rkey.ReparseTag = rtag;
+ rkey.ref = *ref;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_REPARSE);
+
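+ /*
+ * If the tag is known, the key can be deleted directly; otherwise
+ * find the entry by MFT reference alone (ignoring the tag),
+ * recover the full key, and then delete that.
+ */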
+ if (rtag) {
+ err = indx_delete_entry(indx, ni, &rkey, sizeof(rkey), NULL);
+ goto out1;
+ }
+
+ fnd = fnd_get();
+ if (!fnd) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ root_r = indx_get_root(indx, ni, NULL, NULL);
+ if (!root_r) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* The context value 1 forces the compare function to ignore rkey.ReparseTag when comparing keys. */
+ err = indx_find(indx, ni, root_r, &rkey, sizeof(rkey), (void *)1, &diff,
+ (struct NTFS_DE **)&re, fnd);
+ if (err)
+ goto out;
+
+ if (memcmp(&re->key.ref, ref, sizeof(*ref))) {
+ /* Impossible: the volume is likely corrupt. */
+ goto out;
+ }
+
+ memcpy(&rkey, &re->key, sizeof(rkey));
+
+ fnd_put(fnd);
+ fnd = NULL;
+
+ err = indx_delete_entry(indx, ni, &rkey, sizeof(rkey), NULL);
+ if (err)
+ goto out;
+
+out:
+ fnd_put(fnd);
+
+out1:
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+
+ return err;
+}
+
+static inline void ntfs_unmap_and_discard(struct ntfs_sb_info *sbi, CLST lcn,
+ CLST len)
+{
+ ntfs_unmap_meta(sbi->sb, lcn, len);
+ ntfs_discard(sbi, lcn, len);
+}
+
+void mark_as_free_ex(struct ntfs_sb_info *sbi, CLST lcn, CLST len, bool trim)
+{
+ CLST end, i;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+ if (!wnd_is_used(wnd, lcn, len)) {
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
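+ /*
+ * Not every cluster in [lcn, lcn + len) is marked used: walk the
+ * range and free only the used stretches, one sub-run at a time.
+ */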
+ end = lcn + len;
+ len = 0;
+ for (i = lcn; i < end; i++) {
+ if (wnd_is_used(wnd, i, 1)) {
+ if (!len)
+ lcn = i;
+ len += 1;
+ continue;
+ }
+
+ if (!len)
+ continue;
+
+ if (trim)
+ ntfs_unmap_and_discard(sbi, lcn, len);
+
+ wnd_set_free(wnd, lcn, len);
+ len = 0;
+ }
+
+ if (!len)
+ goto out;
+ }
+
+ if (trim)
+ ntfs_unmap_and_discard(sbi, lcn, len);
+ wnd_set_free(wnd, lcn, len);
+
+out:
+ up_write(&wnd->rw_lock);
+}
+
+/*
+ * run_deallocate
+ *
+ * deallocate clusters
+ */
+int run_deallocate(struct ntfs_sb_info *sbi, struct runs_tree *run, bool trim)
+{
+ CLST lcn, len;
+ size_t idx = 0;
+
+ while (run_get_entry(run, idx++, NULL, &lcn, &len)) {
+ if (lcn == SPARSE_LCN)
+ continue;
+
+ mark_as_free_ex(sbi, lcn, len, trim);
+ }
+
+ return 0;
+}
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
new file mode 100644
index 000000000000..6aa9540ece47
--- /dev/null
+++ b/fs/ntfs3/index.c
@@ -0,0 +1,2647 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+static const struct INDEX_NAMES {
+ const __le16 *name;
+ u8 name_len;
+} s_index_names[INDEX_MUTEX_TOTAL] = {
+ { I30_NAME, ARRAY_SIZE(I30_NAME) }, { SII_NAME, ARRAY_SIZE(SII_NAME) },
+ { SDH_NAME, ARRAY_SIZE(SDH_NAME) }, { SO_NAME, ARRAY_SIZE(SO_NAME) },
+ { SQ_NAME, ARRAY_SIZE(SQ_NAME) }, { SR_NAME, ARRAY_SIZE(SR_NAME) },
+};
+
+/*
+ * compare two names in an index
+ * if l1 != 0,
+ * both keys are little-endian on-disk ATTR_FILE_NAME structs;
+ * else
+ * key1 is a cpu_str and key2 is an ATTR_FILE_NAME
+ */
+static int cmp_fnames(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const struct ATTR_FILE_NAME *f2 = key2;
+ const struct ntfs_sb_info *sbi = data;
+ const struct ATTR_FILE_NAME *f1;
+ u16 fsize2;
+ bool both_case;
+
+ if (l2 <= offsetof(struct ATTR_FILE_NAME, name))
+ return -1;
+
+ fsize2 = fname_full_size(f2);
+ if (l2 < fsize2)
+ return -1;
+
+ both_case = f2->type != FILE_NAME_DOS /*&& !sbi->options.nocase*/;
+ if (!l1) {
+ const struct le_str *s2 = (struct le_str *)&f2->name_len;
+
+ /*
+ * If the names compare equal case-insensitively,
+ * fall back to a case-sensitive comparison.
+ */
+ return ntfs_cmp_names_cpu(key1, s2, sbi->upcase, both_case);
+ }
+
+ f1 = key1;
+ return ntfs_cmp_names(f1->name, f1->name_len, f2->name, f2->name_len,
+ sbi->upcase, both_case);
+}
+
+/* $SII of $Secure and $Q of Quota */
+static int cmp_uint(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const u32 *k1 = key1;
+ const u32 *k2 = key2;
+
+ if (l2 < sizeof(u32))
+ return -1;
+
+ if (*k1 < *k2)
+ return -1;
+ if (*k1 > *k2)
+ return 1;
+ return 0;
+}
+
+/* $SDH of $Secure */
+static int cmp_sdh(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const struct SECURITY_KEY *k1 = key1;
+ const struct SECURITY_KEY *k2 = key2;
+ u32 t1, t2;
+
+ if (l2 < sizeof(struct SECURITY_KEY))
+ return -1;
+
+ t1 = le32_to_cpu(k1->hash);
+ t2 = le32_to_cpu(k2->hash);
+
+ /* First value is a hash value itself */
+ if (t1 < t2)
+ return -1;
+ if (t1 > t2)
+ return 1;
+
+ /* Second value is security Id */
+ if (data) {
+ t1 = le32_to_cpu(k1->sec_id);
+ t2 = le32_to_cpu(k2->sec_id);
+ if (t1 < t2)
+ return -1;
+ if (t1 > t2)
+ return 1;
+ }
+
+ return 0;
+}
+
+/* $O of $ObjId and $R of $Reparse */
+static int cmp_uints(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const __le32 *k1 = key1;
+ const __le32 *k2 = key2;
+ size_t count;
+
+ if ((size_t)data == 1) {
+ /*
+ * ni_delete_all -> ntfs_remove_reparse -> delete all with this reference
+ * k1, k2 - pointers to REPARSE_KEY
+ */
+
+ k1 += 1; // skip REPARSE_KEY.ReparseTag
+ k2 += 1; // skip REPARSE_KEY.ReparseTag
+ if (l2 <= sizeof(int))
+ return -1;
+ l2 -= sizeof(int);
+ if (l1 <= sizeof(int))
+ return 1;
+ l1 -= sizeof(int);
+ }
+
+ if (l2 < sizeof(int))
+ return -1;
+
+ for (count = min(l1, l2) >> 2; count > 0; --count, ++k1, ++k2) {
+ u32 t1 = le32_to_cpu(*k1);
+ u32 t2 = le32_to_cpu(*k2);
+
+ if (t1 > t2)
+ return 1;
+ if (t1 < t2)
+ return -1;
+ }
+
+ if (l1 > l2)
+ return 1;
+ if (l1 < l2)
+ return -1;
+
+ return 0;
+}
+
+static inline NTFS_CMP_FUNC get_cmp_func(const struct INDEX_ROOT *root)
+{
+ switch (root->type) {
+ case ATTR_NAME:
+ if (root->rule == NTFS_COLLATION_TYPE_FILENAME)
+ return &cmp_fnames;
+ break;
+ case ATTR_ZERO:
+ switch (root->rule) {
+ case NTFS_COLLATION_TYPE_UINT:
+ return &cmp_uint;
+ case NTFS_COLLATION_TYPE_SECURITY_HASH:
+ return &cmp_sdh;
+ case NTFS_COLLATION_TYPE_UINTS:
+ return &cmp_uints;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+
+ return NULL;
+}
+
+struct bmp_buf {
+ struct ATTRIB *b;
+ struct mft_inode *mi;
+ struct buffer_head *bh;
+ ulong *buf;
+ size_t bit;
+ u32 nbits;
+ u64 new_valid;
+};
+
+static int bmp_buf_get(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit, struct bmp_buf *bbuf)
+{
+ struct ATTRIB *b;
+ size_t data_size, valid_size, vbo, off = bit >> 3;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ CLST vcn = off >> sbi->cluster_bits;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct buffer_head *bh;
+ struct super_block *sb;
+ u32 blocksize;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ bbuf->bh = NULL;
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ &vcn, &bbuf->mi);
+ bbuf->b = b;
+ if (!b)
+ return -EINVAL;
+
+ if (!b->non_res) {
+ data_size = le32_to_cpu(b->res.data_size);
+
+ if (off >= data_size)
+ return -EINVAL;
+
+ bbuf->buf = (ulong *)resident_data(b);
+ bbuf->bit = 0;
+ bbuf->nbits = data_size * 8;
+
+ return 0;
+ }
+
+ data_size = le64_to_cpu(b->nres.data_size);
+ if (WARN_ON(off >= data_size)) {
+ /* looks like filesystem error */
+ return -EINVAL;
+ }
+
+ valid_size = le64_to_cpu(b->nres.valid_size);
+
+ bh = ntfs_bread_run(sbi, &indx->bitmap_run, off);
+ if (!bh)
+ return -EIO;
+
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
+
+ bbuf->bh = bh;
+
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+
+ lock_buffer(bh);
+
+ sb = sbi->sb;
+ blocksize = sb->s_blocksize;
+
+ vbo = off & ~(size_t)sbi->block_mask;
+
+ bbuf->new_valid = vbo + blocksize;
+ if (bbuf->new_valid <= valid_size)
+ bbuf->new_valid = 0;
+ else if (bbuf->new_valid > data_size)
+ bbuf->new_valid = data_size;
+
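+ /*
+ * The part of the block beyond valid_size was never written;
+ * zero it in memory so stale on-disk data is not read as bits.
+ */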
+ if (vbo >= valid_size) {
+ memset(bh->b_data, 0, blocksize);
+ } else if (vbo + blocksize > valid_size) {
+ u32 voff = valid_size & sbi->block_mask;
+
+ memset(bh->b_data + voff, 0, blocksize - voff);
+ }
+
+ bbuf->buf = (ulong *)bh->b_data;
+ bbuf->bit = 8 * (off & ~(size_t)sbi->block_mask);
+ bbuf->nbits = 8 * blocksize;
+
+ return 0;
+}
+
+static void bmp_buf_put(struct bmp_buf *bbuf, bool dirty)
+{
+ struct buffer_head *bh = bbuf->bh;
+ struct ATTRIB *b = bbuf->b;
+
+ if (!bh) {
+ if (b && !b->non_res && dirty)
+ bbuf->mi->dirty = true;
+ return;
+ }
+
+ if (!dirty)
+ goto out;
+
+ if (bbuf->new_valid) {
+ b->nres.valid_size = cpu_to_le64(bbuf->new_valid);
+ bbuf->mi->dirty = true;
+ }
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+
+out:
+ unlock_buffer(bh);
+ put_bh(bh);
+}
+
+/*
+ * indx_mark_used
+ *
+ * marks the bit 'bit' as used
+ */
+static int indx_mark_used(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit)
+{
+ int err;
+ struct bmp_buf bbuf;
+
+ err = bmp_buf_get(indx, ni, bit, &bbuf);
+ if (err)
+ return err;
+
+ __set_bit(bit - bbuf.bit, bbuf.buf);
+
+ bmp_buf_put(&bbuf, true);
+
+ return 0;
+}
+
+/*
+ * indx_mark_free
+ *
+ * marks the bit 'bit' as free
+ */
+static int indx_mark_free(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit)
+{
+ int err;
+ struct bmp_buf bbuf;
+
+ err = bmp_buf_get(indx, ni, bit, &bbuf);
+ if (err)
+ return err;
+
+ __clear_bit(bit - bbuf.bit, bbuf.buf);
+
+ bmp_buf_put(&bbuf, true);
+
+ return 0;
+}
+
+/*
+ * If ntfs_readdir calls this function (indx_used_bit -> scan_nres_bitmap),
+ * the inode is share-locked and ni_lock is not held, so an rw_semaphore
+ * is used for read/write access to bitmap_run.
+ */
+static int scan_nres_bitmap(struct ntfs_inode *ni, struct ATTRIB *bitmap,
+ struct ntfs_index *indx, size_t from,
+ bool (*fn)(const ulong *buf, u32 bit, u32 bits,
+ size_t *ret),
+ size_t *ret)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct super_block *sb = sbi->sb;
+ struct runs_tree *run = &indx->bitmap_run;
+ struct rw_semaphore *lock = &indx->run_lock;
+ u32 nbits = sb->s_blocksize * 8;
+ u32 blocksize = sb->s_blocksize;
+ u64 valid_size = le64_to_cpu(bitmap->nres.valid_size);
+ u64 data_size = le64_to_cpu(bitmap->nres.data_size);
+ sector_t eblock = bytes_to_block(sb, data_size);
+ size_t vbo = from >> 3;
+ sector_t blk = (vbo & sbi->cluster_mask) >> sb->s_blocksize_bits;
+ sector_t vblock = vbo >> sb->s_blocksize_bits;
+ sector_t blen, block;
+ CLST lcn, clen, vcn, vcn_next;
+ size_t idx;
+ struct buffer_head *bh;
+ bool ok;
+
+ *ret = MINUS_ONE_T;
+
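+ /*
+ * Walk the bitmap block by block: map each vcn run to disk
+ * blocks, read them, zero any tail beyond valid_size, and let
+ * the callback scan the bits until it reports a hit or the end
+ * of the data is reached.
+ */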
+ if (vblock >= eblock)
+ return 0;
+
+ from &= nbits - 1;
+ vcn = vbo >> sbi->cluster_bits;
+
+ down_read(lock);
+ ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+ up_read(lock);
+
+next_run:
+ if (!ok) {
+ int err;
+ const struct INDEX_NAMES *name = &s_index_names[indx->type];
+
+ down_write(lock);
+ err = attr_load_runs_vcn(ni, ATTR_BITMAP, name->name,
+ name->name_len, run, vcn);
+ up_write(lock);
+ if (err)
+ return err;
+ down_read(lock);
+ ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+ up_read(lock);
+ if (!ok)
+ return -EINVAL;
+ }
+
+ blen = (sector_t)clen * sbi->blocks_per_cluster;
+ block = (sector_t)lcn * sbi->blocks_per_cluster;
+
+ for (; blk < blen; blk++, from = 0) {
+ bh = ntfs_bread(sb, block + blk);
+ if (!bh)
+ return -EIO;
+
+ vbo = (u64)vblock << sb->s_blocksize_bits;
+ if (vbo >= valid_size) {
+ memset(bh->b_data, 0, blocksize);
+ } else if (vbo + blocksize > valid_size) {
+ u32 voff = valid_size & sbi->block_mask;
+
+ memset(bh->b_data + voff, 0, blocksize - voff);
+ }
+
+ if (vbo + blocksize > data_size)
+ nbits = 8 * (data_size - vbo);
+
+ ok = nbits > from ? (*fn)((ulong *)bh->b_data, from, nbits, ret)
+ : false;
+ put_bh(bh);
+
+ if (ok) {
+ *ret += 8 * vbo;
+ return 0;
+ }
+
+ if (++vblock >= eblock) {
+ *ret = MINUS_ONE_T;
+ return 0;
+ }
+ }
+ blk = 0;
+ vcn_next = vcn + clen;
+ down_read(lock);
+ ok = run_get_entry(run, ++idx, &vcn, &lcn, &clen) && vcn == vcn_next;
+ if (!ok)
+ vcn = vcn_next;
+ up_read(lock);
+ goto next_run;
+}
+
+static bool scan_for_free(const ulong *buf, u32 bit, u32 bits, size_t *ret)
+{
+ size_t pos = find_next_zero_bit(buf, bits, bit);
+
+ if (pos >= bits)
+ return false;
+ *ret = pos;
+ return true;
+}
+
+/*
+ * indx_find_free
+ *
+ * looks for a free bit
+ * *bit is set to MINUS_ONE_T if there are no free bits
+ */
+static int indx_find_free(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t *bit, struct ATTRIB **bitmap)
+{
+ struct ATTRIB *b;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+ int err;
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ NULL, NULL);
+
+ if (!b)
+ return -ENOENT;
+
+ *bitmap = b;
+ *bit = MINUS_ONE_T;
+
+ if (!b->non_res) {
+ u32 nbits = 8 * le32_to_cpu(b->res.data_size);
+ size_t pos = find_next_zero_bit(resident_data(b), nbits, 0);
+
+ if (pos < nbits)
+ *bit = pos;
+ } else {
+ err = scan_nres_bitmap(ni, b, indx, 0, &scan_for_free, bit);
+
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static bool scan_for_used(const ulong *buf, u32 bit, u32 bits, size_t *ret)
+{
+ size_t pos = find_next_bit(buf, bits, bit);
+
+ if (pos >= bits)
+ return false;
+ *ret = pos;
+ return true;
+}
+
+/*
+ * indx_used_bit
+ *
+ * looks for a used bit
+ * *bit is set to MINUS_ONE_T if there are no used bits
+ */
+int indx_used_bit(struct ntfs_index *indx, struct ntfs_inode *ni, size_t *bit)
+{
+ struct ATTRIB *b;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ size_t from = *bit;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+ int err;
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ NULL, NULL);
+
+ if (!b)
+ return -ENOENT;
+
+ *bit = MINUS_ONE_T;
+
+ if (!b->non_res) {
+ u32 nbits = le32_to_cpu(b->res.data_size) * 8;
+ size_t pos = find_next_bit(resident_data(b), nbits, from);
+
+ if (pos < nbits)
+ *bit = pos;
+ } else {
+ err = scan_nres_bitmap(ni, b, indx, from, &scan_for_used, bit);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+/*
+ * hdr_find_split
+ *
+ * finds a point, near the middle of the used space, at which the index
+ * allocation buffer should be split
+ * NOTE: this function never returns the 'END' entry; NULL is returned on error
+ */
+static const struct NTFS_DE *hdr_find_split(const struct INDEX_HDR *hdr)
+{
+ size_t o;
+ const struct NTFS_DE *e = hdr_first_de(hdr);
+ u32 used_2 = le32_to_cpu(hdr->used) >> 1;
+ u16 esize;
+
+ if (!e || de_is_last(e))
+ return NULL;
+
+ /* Only dereference 'e' after the NULL check above. */
+ esize = le16_to_cpu(e->size);
+
+ for (o = le32_to_cpu(hdr->de_off) + esize; o < used_2; o += esize) {
+ const struct NTFS_DE *p = e;
+
+ e = Add2Ptr(hdr, o);
+
+ /* We must not return END entry */
+ if (de_is_last(e))
+ return p;
+
+ esize = le16_to_cpu(e->size);
+ }
+
+ return e;
+}
+
+/*
+ * hdr_insert_head
+ *
+ * inserts some entries at the beginning of the buffer.
+ * It is used to insert entries into a newly-created buffer.
+ */
+static const struct NTFS_DE *hdr_insert_head(struct INDEX_HDR *hdr,
+ const void *ins, u32 ins_bytes)
+{
+ u32 to_move;
+ struct NTFS_DE *e = hdr_first_de(hdr);
+ u32 used = le32_to_cpu(hdr->used);
+
+ if (!e)
+ return NULL;
+
+ /* Now we just make room for the inserted entries and jam it in. */
+ to_move = used - le32_to_cpu(hdr->de_off);
+ memmove(Add2Ptr(e, ins_bytes), e, to_move);
+ memcpy(e, ins, ins_bytes);
+ hdr->used = cpu_to_le32(used + ins_bytes);
+
+ return e;
+}
+
+void fnd_clear(struct ntfs_fnd *fnd)
+{
+ int i;
+
+ for (i = 0; i < fnd->level; i++) {
+ struct indx_node *n = fnd->nodes[i];
+
+ if (!n)
+ continue;
+
+ put_indx_node(n);
+ fnd->nodes[i] = NULL;
+ }
+ fnd->level = 0;
+ fnd->root_de = NULL;
+}
+
+static int fnd_push(struct ntfs_fnd *fnd, struct indx_node *n,
+ struct NTFS_DE *e)
+{
+ int i;
+
+ i = fnd->level;
+ if (i < 0 || i >= ARRAY_SIZE(fnd->nodes))
+ return -EINVAL;
+ fnd->nodes[i] = n;
+ fnd->de[i] = e;
+ fnd->level += 1;
+ return 0;
+}
+
+static struct indx_node *fnd_pop(struct ntfs_fnd *fnd)
+{
+ struct indx_node *n;
+ int i = fnd->level;
+
+ i -= 1;
+ n = fnd->nodes[i];
+ fnd->nodes[i] = NULL;
+ fnd->level = i;
+
+ return n;
+}
+
+static bool fnd_is_empty(struct ntfs_fnd *fnd)
+{
+ if (!fnd->level)
+ return !fnd->root_de;
+
+ return !fnd->de[fnd->level - 1];
+}
+
+/*
+ * hdr_find_e
+ *
+ * locates an entry in the index buffer.
+ * If no matching entry is found, it returns the first entry which is greater
+ * than the desired entry. If the search key is greater than all the entries
+ * in the buffer, it returns the 'end' entry. This function does a binary
+ * search of the current index buffer for the first entry the search key
+ * is <= to.
+ * Returns NULL on error.
+ */
+static struct NTFS_DE *hdr_find_e(const struct ntfs_index *indx,
+ const struct INDEX_HDR *hdr, const void *key,
+ size_t key_len, const void *ctx, int *diff)
+{
+ struct NTFS_DE *e;
+ NTFS_CMP_FUNC cmp = indx->cmp;
+ u32 e_size, e_key_len;
+ u32 end = le32_to_cpu(hdr->used);
+ u32 off = le32_to_cpu(hdr->de_off);
+
+#ifdef NTFS3_INDEX_BINARY_SEARCH
+ int max_idx = 0, fnd, min_idx;
+ int nslots = 64;
+ u16 *offs;
+
+ if (end > 0x10000)
+ goto next;
+
+ offs = ntfs_malloc(sizeof(u16) * nslots);
+ if (!offs)
+ goto next;
+
+ /* use binary search algorithm */
+next1:
+ if (off + sizeof(struct NTFS_DE) > end) {
+ e = NULL;
+ goto out1;
+ }
+ e = Add2Ptr(hdr, off);
+ e_size = le16_to_cpu(e->size);
+
+ if (e_size < sizeof(struct NTFS_DE) || off + e_size > end) {
+ e = NULL;
+ goto out1;
+ }
+
+ if (max_idx >= nslots) {
+ u16 *ptr;
+ int new_slots = QuadAlign(2 * nslots);
+
+ ptr = ntfs_malloc(sizeof(u16) * new_slots);
+ if (ptr)
+ memcpy(ptr, offs, sizeof(u16) * max_idx);
+ ntfs_free(offs);
+ offs = ptr;
+ nslots = new_slots;
+ if (!ptr)
+ goto next;
+ }
+
+ /* Store entry table */
+ offs[max_idx] = off;
+
+ if (!de_is_last(e)) {
+ off += e_size;
+ max_idx += 1;
+ goto next1;
+ }
+
+ /*
+ * The table of offsets is now complete.
+ * Use binary search to find the first entry the search key is <= to.
+ */
+ fnd = -1;
+ min_idx = 0;
+
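+ /*
+ * offs[0..max_idx] hold the offsets of all entries in key order;
+ * 'fnd' tracks the leftmost entry known to compare greater than
+ * the key, so if no exact match exists that entry is returned
+ * with *diff == -1.
+ */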
+ while (min_idx <= max_idx) {
+ int mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+ int diff2;
+
+ e = Add2Ptr(hdr, offs[mid_idx]);
+
+ e_key_len = le16_to_cpu(e->key_size);
+
+ diff2 = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
+
+ if (!diff2) {
+ *diff = 0;
+ goto out1;
+ }
+
+ if (diff2 < 0) {
+ max_idx = mid_idx - 1;
+ fnd = mid_idx;
+ if (!fnd)
+ break;
+ } else {
+ min_idx = mid_idx + 1;
+ }
+ }
+
+ if (fnd == -1) {
+ e = NULL;
+ goto out1;
+ }
+
+ *diff = -1;
+ e = Add2Ptr(hdr, offs[fnd]);
+
+out1:
+ ntfs_free(offs);
+
+ return e;
+#endif
+
+next:
+ /*
+ * Entries in the index are sorted.
+ * Enumerate all entries linearly until we find the first entry
+ * the search key is <= to.
+ */
+ if (off + sizeof(struct NTFS_DE) > end)
+ return NULL;
+
+ e = Add2Ptr(hdr, off);
+ e_size = le16_to_cpu(e->size);
+
+ if (e_size < sizeof(struct NTFS_DE) || off + e_size > end)
+ return NULL;
+
+ off += e_size;
+
+ e_key_len = le16_to_cpu(e->key_size);
+
+ *diff = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
+ if (!*diff)
+ return e;
+
+ if (*diff <= 0)
+ return e;
+
+ if (de_is_last(e)) {
+ *diff = 1;
+ return e;
+ }
+ goto next;
+}
+
+/*
+ * hdr_insert_de
+ *
+ * inserts an index entry into the buffer.
+ * 'before' should be a pointer previously returned from hdr_find_e
+ */
+static struct NTFS_DE *hdr_insert_de(const struct ntfs_index *indx,
+ struct INDEX_HDR *hdr,
+ const struct NTFS_DE *de,
+ struct NTFS_DE *before, const void *ctx)
+{
+ int diff;
+ size_t off = PtrOffset(hdr, before);
+ u32 used = le32_to_cpu(hdr->used);
+ u32 total = le32_to_cpu(hdr->total);
+ u16 de_size = le16_to_cpu(de->size);
+
+ /* First, check to see if there's enough room */
+ if (used + de_size > total)
+ return NULL;
+
+ /* We know there's enough space, so we know we'll succeed. */
+ if (before) {
+ /* Check that before is inside Index */
+ if (off >= used || off < le32_to_cpu(hdr->de_off) ||
+ off + le16_to_cpu(before->size) > total) {
+ return NULL;
+ }
+ goto ok;
+ }
+ /* No insert point was supplied; find it manually. */
+ before = hdr_find_e(indx, hdr, de + 1, le16_to_cpu(de->key_size), ctx,
+ &diff);
+ if (!before)
+ return NULL;
+ off = PtrOffset(hdr, before);
+
+ok:
+ /* Now we just make room for the entry and jam it in. */
+ memmove(Add2Ptr(before, de_size), before, used - off);
+
+ hdr->used = cpu_to_le32(used + de_size);
+ memcpy(before, de, de_size);
+
+ return before;
+}
+
+/*
+ * hdr_delete_de
+ *
+ * removes an entry from the index buffer
+ */
+static inline struct NTFS_DE *hdr_delete_de(struct INDEX_HDR *hdr,
+ struct NTFS_DE *re)
+{
+ u32 used = le32_to_cpu(hdr->used);
+ u16 esize = le16_to_cpu(re->size);
+ u32 off = PtrOffset(hdr, re);
+ int bytes = used - (off + esize);
+
+ if (off >= used || esize < sizeof(struct NTFS_DE) ||
+ bytes < sizeof(struct NTFS_DE))
+ return NULL;
+
+ hdr->used = cpu_to_le32(used - esize);
+ memmove(re, Add2Ptr(re, esize), bytes);
+
+ return re;
+}
+
+void indx_clear(struct ntfs_index *indx)
+{
+ run_close(&indx->alloc_run);
+ run_close(&indx->bitmap_run);
+}
+
+int indx_init(struct ntfs_index *indx, struct ntfs_sb_info *sbi,
+ const struct ATTRIB *attr, enum index_mutex_classed type)
+{
+ u32 t32;
+ const struct INDEX_ROOT *root = resident_data(attr);
+
+ /* Check root fields */
+ if (!root->index_block_clst)
+ return -EINVAL;
+
+ indx->type = type;
+ indx->idx2vbn_bits = __ffs(root->index_block_clst);
+
+ t32 = le32_to_cpu(root->index_block_size);
+ indx->index_bits = blksize_bits(t32);
+
+ /* Check index record size */
+ if (t32 < sbi->cluster_size) {
+ /* The index record is smaller than a cluster; use 512-byte blocks. */
+ if (t32 != root->index_block_clst * SECTOR_SIZE)
+ return -EINVAL;
+
+ /* Check alignment to a cluster */
+ if ((sbi->cluster_size >> SECTOR_SHIFT) &
+ (root->index_block_clst - 1)) {
+ return -EINVAL;
+ }
+
+ indx->vbn2vbo_bits = SECTOR_SHIFT;
+ } else {
+ /* index record must be a multiple of cluster size */
+ if (t32 != root->index_block_clst << sbi->cluster_bits)
+ return -EINVAL;
+
+ indx->vbn2vbo_bits = sbi->cluster_bits;
+ }
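+ /* vbn2vbo_bits converts an index block number (vbn) into a byte offset. */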
+
+ init_rwsem(&indx->run_lock);
+
+ indx->cmp = get_cmp_func(root);
+ return indx->cmp ? 0 : -EINVAL;
+}
+
+static struct indx_node *indx_new(struct ntfs_index *indx,
+ struct ntfs_inode *ni, CLST vbn,
+ const __le64 *sub_vbn)
+{
+ int err;
+ struct NTFS_DE *e;
+ struct indx_node *r;
+ struct INDEX_HDR *hdr;
+ struct INDEX_BUFFER *index;
+ u64 vbo = (u64)vbn << indx->vbn2vbo_bits;
+ u32 bytes = 1u << indx->index_bits;
+ u16 fn;
+ u32 eo;
+
+ r = ntfs_zalloc(sizeof(struct indx_node));
+ if (!r)
+ return ERR_PTR(-ENOMEM);
+
+ index = ntfs_zalloc(bytes);
+ if (!index) {
+ ntfs_free(r);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ err = ntfs_get_bh(ni->mi.sbi, &indx->alloc_run, vbo, bytes, &r->nb);
+
+ if (err) {
+ ntfs_free(index);
+ ntfs_free(r);
+ return ERR_PTR(err);
+ }
+
+ /* Create header */
+ index->rhdr.sign = NTFS_INDX_SIGNATURE;
+ index->rhdr.fix_off = cpu_to_le16(sizeof(struct INDEX_BUFFER)); // 0x28
+ fn = (bytes >> SECTOR_SHIFT) + 1; // 9
+ index->rhdr.fix_num = cpu_to_le16(fn);
+ index->vbn = cpu_to_le64(vbn);
+ hdr = &index->ihdr;
+ eo = QuadAlign(sizeof(struct INDEX_BUFFER) + fn * sizeof(short));
+ hdr->de_off = cpu_to_le32(eo);
+
+ e = Add2Ptr(hdr, eo);
+
+ if (sub_vbn) {
+ e->flags = NTFS_IE_LAST | NTFS_IE_HAS_SUBNODES;
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE) + sizeof(u64));
+ hdr->used =
+ cpu_to_le32(eo + sizeof(struct NTFS_DE) + sizeof(u64));
+ de_set_vbn_le(e, *sub_vbn);
+ hdr->flags = 1;
+ } else {
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+ hdr->used = cpu_to_le32(eo + sizeof(struct NTFS_DE));
+ e->flags = NTFS_IE_LAST;
+ }
+
+ hdr->total = cpu_to_le32(bytes - offsetof(struct INDEX_BUFFER, ihdr));
+
+ r->index = index;
+ return r;
+}
+
+struct INDEX_ROOT *indx_get_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct ATTRIB **attr, struct mft_inode **mi)
+{
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct ATTRIB *a;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ a = ni_find_attr(ni, NULL, &le, ATTR_ROOT, in->name, in->name_len, NULL,
+ mi);
+ if (!a)
+ return NULL;
+
+ if (attr)
+ *attr = a;
+
+ return resident_data_ex(a, sizeof(struct INDEX_ROOT));
+}
+
+static int indx_write(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct indx_node *node, int sync)
+{
+ struct INDEX_BUFFER *ib = node->index;
+
+ return ntfs_write_bh(ni->mi.sbi, &ib->rhdr, &node->nb, sync);
+}
+
+/*
+ * If ntfs_readdir calls this function, the inode is share-locked and
+ * ni_lock is not held, so an rw_semaphore is used for read/write access
+ * to alloc_run.
+ */
+int indx_read(struct ntfs_index *indx, struct ntfs_inode *ni, CLST vbn,
+ struct indx_node **node)
+{
+ int err;
+ struct INDEX_BUFFER *ib;
+ struct runs_tree *run = &indx->alloc_run;
+ struct rw_semaphore *lock = &indx->run_lock;
+ u64 vbo = (u64)vbn << indx->vbn2vbo_bits;
+ u32 bytes = 1u << indx->index_bits;
+ struct indx_node *in = *node;
+ const struct INDEX_NAMES *name;
+
+ if (!in) {
+ in = ntfs_zalloc(sizeof(struct indx_node));
+ if (!in)
+ return -ENOMEM;
+ } else {
+ nb_put(&in->nb);
+ }
+
+ ib = in->index;
+ if (!ib) {
+ ib = ntfs_malloc(bytes);
+ if (!ib) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ down_read(lock);
+ err = ntfs_read_bh(ni->mi.sbi, run, vbo, &ib->rhdr, bytes, &in->nb);
+ up_read(lock);
+ if (!err)
+ goto ok;
+
+ if (err == -E_NTFS_FIXUP)
+ goto ok;
+
+ if (err != -ENOENT)
+ goto out;
+
+ name = &s_index_names[indx->type];
+ down_write(lock);
+ err = attr_load_runs_range(ni, ATTR_ALLOC, name->name, name->name_len,
+ run, vbo, vbo + bytes);
+ up_write(lock);
+ if (err)
+ goto out;
+
+ down_read(lock);
+ err = ntfs_read_bh(ni->mi.sbi, run, vbo, &ib->rhdr, bytes, &in->nb);
+ up_read(lock);
+ if (err == -E_NTFS_FIXUP)
+ goto ok;
+
+ if (err)
+ goto out;
+
+ok:
+ if (err == -E_NTFS_FIXUP) {
+ ntfs_write_bh(ni->mi.sbi, &ib->rhdr, &in->nb, 0);
+ err = 0;
+ }
+
+ in->index = ib;
+ *node = in;
+
+out:
+ if (ib != in->index)
+ ntfs_free(ib);
+
+ if (*node != in) {
+ nb_put(&in->nb);
+ ntfs_free(in);
+ }
+
+ return err;
+}
+
+/*
+ * indx_find
+ *
+ * scans an NTFS directory for the given entry
+ */
+int indx_find(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, const void *key, size_t key_len,
+ const void *ctx, int *diff, struct NTFS_DE **entry,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ struct NTFS_DE *e;
+ const struct INDEX_HDR *hdr;
+ struct indx_node *node;
+
+ if (!root)
+ root = indx_get_root(&ni->dir, ni, NULL, NULL);
+
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ hdr = &root->ihdr;
+
+ /* Check cache */
+ e = fnd->level ? fnd->de[fnd->level - 1] : fnd->root_de;
+ if (e && !de_is_last(e) &&
+ !(*indx->cmp)(key, key_len, e + 1, le16_to_cpu(e->key_size), ctx)) {
+ *entry = e;
+ *diff = 0;
+ return 0;
+ }
+
+ /* Soft finder reset */
+ fnd_clear(fnd);
+
+ /* Lookup entry that is <= to the search value */
+ e = hdr_find_e(indx, hdr, key, key_len, ctx, diff);
+ if (!e)
+ return -EINVAL;
+
+ if (fnd)
+ fnd->root_de = e;
+
+ err = 0;
+
+ for (;;) {
+ node = NULL;
+ if (*diff >= 0 || !de_has_vcn_ex(e)) {
+ *entry = e;
+ goto out;
+ }
+
+ /* Read next level. */
+ err = indx_read(indx, ni, de_get_vbn(e), &node);
+ if (err)
+ goto out;
+
+ /* Lookup entry that is <= to the search value */
+ e = hdr_find_e(indx, &node->index->ihdr, key, key_len, ctx,
+ diff);
+ if (!e) {
+ err = -EINVAL;
+ put_indx_node(node);
+ goto out;
+ }
+
+ fnd_push(fnd, node, e);
+ }
+
+out:
+ return err;
+}
+
+int indx_find_sort(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ struct indx_node *n = NULL;
+ struct NTFS_DE *e;
+ size_t iter = 0;
+ int level = fnd->level;
+
+ if (!*entry) {
+ /* Start find */
+ e = hdr_first_de(&root->ihdr);
+ if (!e)
+ return 0;
+ fnd_clear(fnd);
+ fnd->root_de = e;
+ } else if (!level) {
+ if (de_is_last(fnd->root_de)) {
+ *entry = NULL;
+ return 0;
+ }
+
+ e = hdr_next_de(&root->ihdr, fnd->root_de);
+ if (!e)
+ return -EINVAL;
+ fnd->root_de = e;
+ } else {
+ n = fnd->nodes[level - 1];
+ e = fnd->de[level - 1];
+
+ if (de_is_last(e))
+ goto pop_level;
+
+ e = hdr_next_de(&n->index->ihdr, e);
+ if (!e)
+ return -EINVAL;
+
+ fnd->de[level - 1] = e;
+ }
+
+ /* Just to avoid tree cycle */
+next_iter:
+ if (iter++ >= 1000)
+ return -EINVAL;
+
+ while (de_has_vcn_ex(e)) {
+ if (le16_to_cpu(e->size) <
+ sizeof(struct NTFS_DE) + sizeof(u64)) {
+ if (n) {
+ fnd_pop(fnd);
+ ntfs_free(n);
+ }
+ return -EINVAL;
+ }
+
+ /* Read next level */
+ err = indx_read(indx, ni, de_get_vbn(e), &n);
+ if (err)
+ return err;
+
+ /* Try next level */
+ e = hdr_first_de(&n->index->ihdr);
+ if (!e) {
+ ntfs_free(n);
+ return -EINVAL;
+ }
+
+ fnd_push(fnd, n, e);
+ }
+
+ if (le16_to_cpu(e->size) > sizeof(struct NTFS_DE)) {
+ *entry = e;
+ return 0;
+ }
+
+pop_level:
+ for (;;) {
+ if (!de_is_last(e))
+ goto next_iter;
+
+ /* Pop one level */
+ if (n) {
+ fnd_pop(fnd);
+ ntfs_free(n);
+ }
+
+ level = fnd->level;
+
+ if (level) {
+ n = fnd->nodes[level - 1];
+ e = fnd->de[level - 1];
+ } else if (fnd->root_de) {
+ n = NULL;
+ e = fnd->root_de;
+ fnd->root_de = NULL;
+ } else {
+ *entry = NULL;
+ return 0;
+ }
+
+ if (le16_to_cpu(e->size) > sizeof(struct NTFS_DE)) {
+ *entry = e;
+ if (!fnd->root_de)
+ fnd->root_de = e;
+ return 0;
+ }
+ }
+}
+
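+/*
+ * indx_find_raw
+ *
+ * enumerates index entries without sorting, in allocation order rather
+ * than key order; *off is an opaque cursor (an offset into the root or
+ * into an index buffer) that allows the scan to be resumed later.
+ */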
+int indx_find_raw(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+ size_t *off, struct ntfs_fnd *fnd)
+{
+ int err;
+ struct indx_node *n = NULL;
+ struct NTFS_DE *e = NULL;
+ struct NTFS_DE *e2;
+ size_t bit;
+ CLST next_used_vbn;
+ CLST next_vbn;
+ u32 record_size = ni->mi.sbi->record_size;
+
+ /* Use the non-sorted enumeration algorithm. */
+ if (!*entry) {
+ /* This is the first call */
+ e = hdr_first_de(&root->ihdr);
+ if (!e)
+ return 0;
+ fnd_clear(fnd);
+ fnd->root_de = e;
+
+ /* First call with a saved cursor: resume the enumeration from it. */
+ if (*off >= record_size) {
+ next_vbn = (((*off - record_size) >> indx->index_bits))
+ << indx->idx2vbn_bits;
+ /* Jump into the middle of the 'for' loop below. */
+ goto next;
+ }
+
+ /* Start enumeration from root */
+ *off = 0;
+ } else if (!fnd->root_de)
+ return -EINVAL;
+
+ for (;;) {
+ /* Check if current entry can be used */
+ if (e && le16_to_cpu(e->size) > sizeof(struct NTFS_DE))
+ goto ok;
+
+ if (!fnd->level) {
+ /* Continue to enumerate root */
+ if (!de_is_last(fnd->root_de)) {
+ e = hdr_next_de(&root->ihdr, fnd->root_de);
+ if (!e)
+ return -EINVAL;
+ fnd->root_de = e;
+ continue;
+ }
+
+ /* Start to enumerate indexes from 0 */
+ next_vbn = 0;
+ } else {
+ /* Continue to enumerate indexes */
+ e2 = fnd->de[fnd->level - 1];
+
+ n = fnd->nodes[fnd->level - 1];
+
+ if (!de_is_last(e2)) {
+ e = hdr_next_de(&n->index->ihdr, e2);
+ if (!e)
+ return -EINVAL;
+ fnd->de[fnd->level - 1] = e;
+ continue;
+ }
+
+ /* Continue with next index */
+ next_vbn = le64_to_cpu(n->index->vbn) +
+ root->index_block_clst;
+ }
+
+next:
+ /* Release current index */
+ if (n) {
+ fnd_pop(fnd);
+ put_indx_node(n);
+ n = NULL;
+ }
+
+ /* Skip all free indexes */
+ bit = next_vbn >> indx->idx2vbn_bits;
+ err = indx_used_bit(indx, ni, &bit);
+ if (err == -ENOENT || bit == MINUS_ONE_T) {
+ /* No used indexes */
+ *entry = NULL;
+ return 0;
+ }
+
+ next_used_vbn = bit << indx->idx2vbn_bits;
+
+ /* Read buffer into memory */
+ err = indx_read(indx, ni, next_used_vbn, &n);
+ if (err)
+ return err;
+
+ e = hdr_first_de(&n->index->ihdr);
+ fnd_push(fnd, n, e);
+ if (!e)
+ return -EINVAL;
+ }
+
+ok:
+ /* return offset to restore enumerator if necessary */
+ if (!n) {
+ /* 'e' points in root */
+ *off = PtrOffset(&root->ihdr, e);
+ } else {
+ /* 'e' points in index */
+ *off = (le64_to_cpu(n->index->vbn) << indx->vbn2vbo_bits) +
+ record_size + PtrOffset(&n->index->ihdr, e);
+ }
+
+ *entry = e;
+ return 0;
+}
+
+/*
+ * indx_create_allocate
+ *
+ * create "Allocation + Bitmap" attributes
+ */
+static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+ CLST *vbn)
+{
+ int err = -ENOMEM;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *bitmap;
+ struct ATTRIB *alloc;
+ u32 data_size = 1u << indx->index_bits;
+ u32 alloc_size = ntfs_up_cluster(sbi, data_size);
+ CLST len = alloc_size >> sbi->cluster_bits;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+ CLST alen;
+ struct runs_tree run;
+
+ run_init(&run);
+
+ err = attr_allocate_clusters(sbi, &run, 0, 0, len, NULL, 0, &alen, 0,
+ NULL);
+ if (err)
+ goto out;
+
+ err = ni_insert_nonresident(ni, ATTR_ALLOC, in->name, in->name_len,
+ &run, 0, len, 0, &alloc, NULL);
+ if (err)
+ goto out1;
+
+ alloc->nres.valid_size = alloc->nres.data_size = cpu_to_le64(data_size);
+
+ err = ni_insert_resident(ni, bitmap_size(1), ATTR_BITMAP, in->name,
+ in->name_len, &bitmap, NULL);
+ if (err)
+ goto out2;
+
+ if (in->name == I30_NAME) {
+ ni->vfs_inode.i_size = data_size;
+ inode_set_bytes(&ni->vfs_inode, alloc_size);
+ }
+
+ memcpy(&indx->alloc_run, &run, sizeof(run));
+
+ *vbn = 0;
+
+ return 0;
+
+out2:
+ mi_remove_attr(&ni->mi, alloc);
+
+out1:
+ run_deallocate(sbi, &run, false);
+
+out:
+ return err;
+}
+
+/*
+ * indx_add_allocate
+ *
+ * add clusters to index
+ */
+static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+ CLST *vbn)
+{
+ int err;
+ size_t bit;
+ u64 data_size;
+ u64 bmp_size, bmp_size_v;
+ struct ATTRIB *bmp, *alloc;
+ struct mft_inode *mi;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ err = indx_find_free(indx, ni, &bit, &bmp);
+ if (err)
+ goto out1;
+
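+ /*
+ * If a free bit was found, reuse that index block; the bitmap
+ * keeps its size. Otherwise grow the bitmap by one bit and use
+ * the block just past the current end.
+ */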
+ if (bit != MINUS_ONE_T) {
+ bmp = NULL;
+ } else {
+ if (bmp->non_res) {
+ bmp_size = le64_to_cpu(bmp->nres.data_size);
+ bmp_size_v = le64_to_cpu(bmp->nres.valid_size);
+ } else {
+ bmp_size = bmp_size_v = le32_to_cpu(bmp->res.data_size);
+ }
+
+ bit = bmp_size << 3;
+ }
+
+ data_size = (u64)(bit + 1) << indx->index_bits;
+
+ if (bmp) {
+ /* Increase bitmap */
+ err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, bitmap_size(bit + 1),
+ NULL, true, NULL);
+ if (err)
+ goto out1;
+ }
+
+ alloc = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, in->name, in->name_len,
+ NULL, &mi);
+ if (!alloc) {
+ if (bmp)
+ goto out2;
+ goto out1;
+ }
+
+ /* Increase allocation */
+ err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+ &indx->alloc_run, data_size, &data_size, true,
+ NULL);
+ if (err) {
+ if (bmp)
+ goto out2;
+ goto out1;
+ }
+
+ *vbn = bit << indx->idx2vbn_bits;
+
+ return 0;
+
+out2:
+ /* Oops, probably out of space: roll back the bitmap resize. */
+ attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, bmp_size, &bmp_size_v, false, NULL);
+
+out1:
+ return err;
+}
+
+/*
+ * indx_insert_into_root
+ *
+ * attempts to insert an entry into the index root.
+ * If necessary, it will reshape the index b-tree.
+ */
+static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *new_de,
+ struct NTFS_DE *root_de, const void *ctx,
+ struct ntfs_fnd *fnd)
+{
+ int err = 0;
+ struct NTFS_DE *e, *e0, *re;
+ struct mft_inode *mi;
+ struct ATTRIB *attr;
+ struct MFT_REC *rec;
+ struct INDEX_HDR *hdr;
+ struct indx_node *n;
+ CLST new_vbn;
+ __le64 *sub_vbn, t_vbn;
+ u16 new_de_size;
+ u32 hdr_used, hdr_total, asize, used, to_move;
+ u32 root_size, new_root_size;
+ struct ntfs_sb_info *sbi;
+ int ds_root;
+ struct INDEX_ROOT *root, *a_root = NULL;
+
+ /* Get the record this root placed in */
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root)
+ goto out;
+
+ /*
+ * Try easy case:
+ * hdr_insert_de will succeed if there's room in the root for the new entry.
+ */
+ hdr = &root->ihdr;
+ sbi = ni->mi.sbi;
+ rec = mi->mrec;
+ used = le32_to_cpu(rec->used);
+ new_de_size = le16_to_cpu(new_de->size);
+ hdr_used = le32_to_cpu(hdr->used);
+ hdr_total = le32_to_cpu(hdr->total);
+ asize = le32_to_cpu(attr->size);
+ root_size = le32_to_cpu(attr->res.data_size);
+
+ ds_root = new_de_size + hdr_used - hdr_total;
+
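+ /*
+ * ds_root is how many extra bytes the root attribute needs to
+ * hold the new entry. If the mft record can absorb the growth,
+ * resize the attribute in place and insert directly.
+ */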
+ if (used + ds_root < sbi->max_bytes_per_attr) {
+ /* Make room for the new entry. */
+ mi_resize_attr(mi, attr, ds_root);
+ hdr->total = cpu_to_le32(hdr_total + ds_root);
+ e = hdr_insert_de(indx, hdr, new_de, root_de, ctx);
+ WARN_ON(!e);
+ fnd_clear(fnd);
+ fnd->root_de = e;
+
+ return 0;
+ }
+
+ /* Make a copy of the root attribute to restore on error. */
+ a_root = ntfs_memdup(attr, asize);
+ if (!a_root) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Copy all the non-end entries from the index root to the new buffer. */
+ to_move = 0;
+ e0 = hdr_first_de(hdr);
+
+ /* Calculate the size to copy */
+ for (e = e0;; e = hdr_next_de(hdr, e)) {
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_is_last(e))
+ break;
+ to_move += le16_to_cpu(e->size);
+ }
+
+ n = NULL;
+ if (!to_move) {
+ re = NULL;
+ } else {
+ re = ntfs_memdup(e0, to_move);
+ if (!re) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ sub_vbn = NULL;
+ if (de_has_vcn(e)) {
+ t_vbn = de_get_vbn_le(e);
+ sub_vbn = &t_vbn;
+ }
+
+ new_root_size = sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE) +
+ sizeof(u64);
+ ds_root = new_root_size - root_size;
+
+ if (ds_root > 0 && used + ds_root > sbi->max_bytes_per_attr) {
+ /* make root external */
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+
+ if (ds_root)
+ mi_resize_attr(mi, attr, ds_root);
+
+ /* Fill first entry (vcn will be set later) */
+ e = (struct NTFS_DE *)(root + 1);
+ memset(e, 0, sizeof(struct NTFS_DE));
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE) + sizeof(u64));
+ e->flags = NTFS_IE_HAS_SUBNODES | NTFS_IE_LAST;
+
+ hdr->flags = 1;
+ hdr->used = hdr->total =
+ cpu_to_le32(new_root_size - offsetof(struct INDEX_ROOT, ihdr));
+
+ fnd->root_de = hdr_first_de(hdr);
+ mi->dirty = true;
+
+ /* Create alloc and bitmap attributes (if not) */
+ err = run_is_empty(&indx->alloc_run)
+ ? indx_create_allocate(indx, ni, &new_vbn)
+ : indx_add_allocate(indx, ni, &new_vbn);
+
+ /* The record layout may have changed, so re-fetch the root. */
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ /* bug? */
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ err = -EINVAL;
+ goto out1;
+ }
+
+ if (err) {
+ /* restore root */
+ if (mi_resize_attr(mi, attr, -ds_root))
+ memcpy(attr, a_root, asize);
+ else {
+ /* bug? */
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ }
+ goto out1;
+ }
+
+ e = (struct NTFS_DE *)(root + 1);
+ *(__le64 *)(e + 1) = cpu_to_le64(new_vbn);
+ mi->dirty = true;
+
+ /* Now we can create/format the new buffer and copy the entries into it. */
+ n = indx_new(indx, ni, new_vbn, sub_vbn);
+ if (IS_ERR(n)) {
+ err = PTR_ERR(n);
+ goto out1;
+ }
+
+ hdr = &n->index->ihdr;
+ hdr_used = le32_to_cpu(hdr->used);
+ hdr_total = le32_to_cpu(hdr->total);
+
+ /* Copy root entries into new buffer */
+ hdr_insert_head(hdr, re, to_move);
+
+ /* Update bitmap attribute */
+ indx_mark_used(indx, ni, new_vbn >> indx->idx2vbn_bits);
+
+ /* Check if we can insert the new entry into the new index buffer. */
+ if (hdr_used + new_de_size > hdr_total) {
+ /*
+ * This occurs if the mft record is the same size as, or bigger
+ * than, the index buffer: the whole root was moved into the new
+ * index buffer and there is still no space for the new entry.
+ * In the classic case (1K mft record, 4K index buffer) this
+ * should not occur.
+ */
+ ntfs_free(re);
+ indx_write(indx, ni, n, 0);
+
+ put_indx_node(n);
+ fnd_clear(fnd);
+ err = indx_insert_entry(indx, ni, new_de, ctx, fnd);
+ goto out;
+ }
+
+ /*
+ * Now the root is a parent for the new index buffer.
+ * Insert the new entry into the new buffer.
+ */
+ e = hdr_insert_de(indx, hdr, new_de, NULL, ctx);
+ if (!e) {
+ err = -EINVAL;
+ goto out1;
+ }
+ fnd_push(fnd, n, e);
+
+ /* Just write the updated index to disk. */
+ indx_write(indx, ni, n, 0);
+
+ n = NULL;
+
+out1:
+ ntfs_free(re);
+ if (n)
+ put_indx_node(n);
+
+out:
+ ntfs_free(a_root);
+ return err;
+}
+
+/*
+ * indx_insert_into_buffer
+ *
+ * attempts to insert an entry into an Index Allocation Buffer.
+ * If necessary, it will split the buffer.
+ */
+static int
+indx_insert_into_buffer(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct INDEX_ROOT *root, const struct NTFS_DE *new_de,
+ const void *ctx, int level, struct ntfs_fnd *fnd)
+{
+ int err;
+ const struct NTFS_DE *sp;
+ struct NTFS_DE *e, *de_t, *up_e = NULL;
+ struct indx_node *n2 = NULL;
+ struct indx_node *n1 = fnd->nodes[level];
+ struct INDEX_HDR *hdr1 = &n1->index->ihdr;
+ struct INDEX_HDR *hdr2;
+ u32 to_copy, used;
+ CLST new_vbn;
+ __le64 t_vbn, *sub_vbn;
+ u16 sp_size;
+
+ /* Try the easiest case first. */
+ e = fnd->level - 1 == level ? fnd->de[level] : NULL;
+ e = hdr_insert_de(indx, hdr1, new_de, e, ctx);
+ fnd->de[level] = e;
+ if (e) {
+ /* Just write updated index into disk */
+ indx_write(indx, ni, n1, 0);
+ return 0;
+ }
+
+ /*
+ * No space to insert into buffer. Split it.
+ * To split we:
+ * - Save split point ('cause index buffers will be changed)
+ * - Allocate NewBuffer and copy all entries <= sp into new buffer
+ * - Remove all entries (sp including) from TargetBuffer
+ * - Insert NewEntry into left or right buffer (depending on sp <=>
+ * NewEntry)
+ * - Insert sp into parent buffer (or root)
+ * - Make sp a parent for new buffer
+ */
+ sp = hdr_find_split(hdr1);
+ if (!sp)
+ return -EINVAL;
+
+ sp_size = le16_to_cpu(sp->size);
+ up_e = ntfs_malloc(sp_size + sizeof(u64));
+ if (!up_e)
+ return -ENOMEM;
+ memcpy(up_e, sp, sp_size);
+
+ if (!hdr1->flags) {
+ up_e->flags |= NTFS_IE_HAS_SUBNODES;
+ up_e->size = cpu_to_le16(sp_size + sizeof(u64));
+ sub_vbn = NULL;
+ } else {
+ t_vbn = de_get_vbn_le(up_e);
+ sub_vbn = &t_vbn;
+ }
+
+ /* Allocate on disk a new index allocation buffer. */
+ err = indx_add_allocate(indx, ni, &new_vbn);
+ if (err)
+ goto out;
+
+ /* Allocate and format a new index buffer in memory. */
+ n2 = indx_new(indx, ni, new_vbn, sub_vbn);
+ if (IS_ERR(n2)) {
+ err = PTR_ERR(n2);
+ goto out;
+ }
+
+ hdr2 = &n2->index->ihdr;
+
+ /* Make sp a parent for new buffer */
+ de_set_vbn(up_e, new_vbn);
+
+ /* Copy all the entries <= sp into the new buffer. */
+ de_t = hdr_first_de(hdr1);
+ to_copy = PtrOffset(de_t, sp);
+ hdr_insert_head(hdr2, de_t, to_copy);
+
+ /* Remove all entries (sp included) from hdr1. */
+ used = le32_to_cpu(hdr1->used) - to_copy - sp_size;
+ memmove(de_t, Add2Ptr(sp, sp_size), used - le32_to_cpu(hdr1->de_off));
+ hdr1->used = cpu_to_le32(used);
+
+ /* Insert new entry into left or right buffer (depending on sp <=> new_de) */
+ hdr_insert_de(indx,
+ (*indx->cmp)(new_de + 1, le16_to_cpu(new_de->key_size),
+ up_e + 1, le16_to_cpu(up_e->key_size),
+ ctx) < 0
+ ? hdr2
+ : hdr1,
+ new_de, NULL, ctx);
+
+ indx_mark_used(indx, ni, new_vbn >> indx->idx2vbn_bits);
+
+ indx_write(indx, ni, n1, 0);
+ indx_write(indx, ni, n2, 0);
+
+ put_indx_node(n2);
+
+ /*
+ * We have finished splitting everybody, so we are ready to
+ * insert the promoted entry into the parent.
+ */
+ if (!level) {
+ /* Insert in root */
+ err = indx_insert_into_root(indx, ni, up_e, NULL, ctx, fnd);
+ if (err)
+ goto out;
+ } else {
+ /*
+ * The target buffer's parent is another index buffer
+ * TODO: Remove recursion
+ */
+ err = indx_insert_into_buffer(indx, ni, root, up_e, ctx,
+ level - 1, fnd);
+ if (err)
+ goto out;
+ }
+
+out:
+ ntfs_free(up_e);
+
+ return err;
+}
+
+/*
+ * indx_insert_entry
+ *
+ * inserts a new entry into the index.
+ */
+int indx_insert_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *new_de, const void *ctx,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ int diff;
+ struct NTFS_DE *e;
+ struct ntfs_fnd *fnd_a = NULL;
+ struct INDEX_ROOT *root;
+
+ if (!fnd) {
+ fnd_a = fnd_get();
+ if (!fnd_a) {
+ err = -ENOMEM;
+ goto out1;
+ }
+ fnd = fnd_a;
+ }
+
+ root = indx_get_root(indx, ni, NULL, NULL);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (fnd_is_empty(fnd)) {
+ /* Find the spot in the tree where we want to insert the new entry. */
+ err = indx_find(indx, ni, root, new_de + 1,
+ le16_to_cpu(new_de->key_size), ctx, &diff, &e,
+ fnd);
+ if (err)
+ goto out;
+
+ if (!diff) {
+ err = -EEXIST;
+ goto out;
+ }
+ }
+
+ if (!fnd->level) {
+ /* The root is also a leaf, so we'll insert the new entry into it. */
+ err = indx_insert_into_root(indx, ni, new_de, fnd->root_de, ctx,
+ fnd);
+ if (err)
+ goto out;
+ } else {
+ /* Found a leaf buffer, so we'll insert the new entry into it. */
+ err = indx_insert_into_buffer(indx, ni, root, new_de, ctx,
+ fnd->level - 1, fnd);
+ if (err)
+ goto out;
+ }
+
+out:
+ fnd_put(fnd_a);
+out1:
+ return err;
+}
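+
+/*
+ * Usage sketch (assumption: mirrors the calls made elsewhere in this
+ * patch): a directory entry built with fill_name_de() is inserted via
+ *
+ *	err = indx_insert_entry(&dir_ni->dir, dir_ni, new_de, sbi, NULL);
+ *
+ * Passing fnd == NULL makes the function allocate a temporary search
+ * context with fnd_get() and release it with fnd_put() on exit.
+ */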
+
+/*
+ * indx_find_buffer
+ *
+ * locates a buffer in the tree.
+ */
+static struct indx_node *indx_find_buffer(struct ntfs_index *indx,
+ struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root,
+ __le64 vbn, struct indx_node *n)
+{
+ int err;
+ const struct NTFS_DE *e;
+ struct indx_node *r;
+ const struct INDEX_HDR *hdr = n ? &n->index->ihdr : &root->ihdr;
+
+ /* Step 1: Scan one level */
+ for (e = hdr_first_de(hdr);; e = hdr_next_de(hdr, e)) {
+ if (!e)
+ return ERR_PTR(-EINVAL);
+
+ if (de_has_vcn(e) && vbn == de_get_vbn_le(e))
+ return n;
+
+ if (de_is_last(e))
+ break;
+ }
+
+ /* Step 2: Do recursion. */
+ e = Add2Ptr(hdr, le32_to_cpu(hdr->de_off));
+ for (;;) {
+ if (de_has_vcn_ex(e)) {
+ err = indx_read(indx, ni, de_get_vbn(e), &n);
+ if (err)
+ return ERR_PTR(err);
+
+ r = indx_find_buffer(indx, ni, root, vbn, n);
+ if (r)
+ return r;
+ }
+
+ if (de_is_last(e))
+ break;
+
+ e = Add2Ptr(e, le16_to_cpu(e->size));
+ }
+
+ return NULL;
+}
+
+/*
+ * indx_shrink
+ *
+ * deallocates unused tail indexes
+ */
+static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit)
+{
+ int err = 0;
+ u64 bpb, new_data;
+ size_t nbits;
+ struct ATTRIB *b;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ NULL, NULL);
+
+ if (!b)
+ return -ENOENT;
+
+ if (!b->non_res) {
+ unsigned long pos;
+ const unsigned long *bm = resident_data(b);
+
+ nbits = le32_to_cpu(b->res.data_size) * 8;
+
+ if (bit >= nbits)
+ return 0;
+
+ pos = find_next_bit(bm, nbits, bit);
+ if (pos < nbits)
+ return 0;
+ } else {
+ size_t used = MINUS_ONE_T;
+
+ nbits = le64_to_cpu(b->nres.data_size) * 8;
+
+ if (bit >= nbits)
+ return 0;
+
+ err = scan_nres_bitmap(ni, b, indx, bit, &scan_for_used, &used);
+ if (err)
+ return err;
+
+ if (used != MINUS_ONE_T)
+ return 0;
+ }
+
+ new_data = (u64)bit << indx->index_bits;
+
+ err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+ &indx->alloc_run, new_data, &new_data, false, NULL);
+ if (err)
+ return err;
+
+ bpb = bitmap_size(bit);
+ if (bpb * 8 == nbits)
+ return 0;
+
+ err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, bpb, &bpb, false, NULL);
+
+ return err;
+}
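+
+/*
+ * Example of the sizing above (a sketch, assuming bitmap_size() returns
+ * the number of bytes needed to hold 'bit' bits): if every index block
+ * from bit 8 onwards is free and index_bits == 12 (4K blocks), then
+ * ATTR_ALLOC is truncated to 8 << 12 = 32K and ATTR_BITMAP to
+ * bitmap_size(8) bytes.
+ */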
+
+static int indx_free_children(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *e, bool trim)
+{
+ int err;
+ struct indx_node *n;
+ struct INDEX_HDR *hdr;
+ CLST vbn = de_get_vbn(e);
+ size_t i;
+
+ err = indx_read(indx, ni, vbn, &n);
+ if (err)
+ return err;
+
+ hdr = &n->index->ihdr;
+ /* First, recurse into the children, if any. */
+ if (hdr_has_subnode(hdr)) {
+ for (e = hdr_first_de(hdr); e; e = hdr_next_de(hdr, e)) {
+ indx_free_children(indx, ni, e, false);
+ if (de_is_last(e))
+ break;
+ }
+ }
+
+ put_indx_node(n);
+
+ i = vbn >> indx->idx2vbn_bits;
+ /* We've gotten rid of the children; add this buffer to the free list. */
+ indx_mark_free(indx, ni, i);
+
+ if (!trim)
+ return 0;
+
+ /*
+ * If there are no used indexes after the current freed index,
+ * then we can truncate the allocation and bitmap attributes.
+ * Use the bitmap to check for this case.
+ */
+ indx_shrink(indx, ni, i + 1);
+ return 0;
+}
+
+/*
+ * indx_get_entry_to_replace
+ *
+ * finds a replacement entry for a deleted entry.
+ * It always returns a node entry:
+ * NTFS_IE_HAS_SUBNODES is set in the flags and the size includes the sub_vcn.
+ */
+static int indx_get_entry_to_replace(struct ntfs_index *indx,
+ struct ntfs_inode *ni,
+ const struct NTFS_DE *de_next,
+ struct NTFS_DE **de_to_replace,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ int level = -1;
+ CLST vbn;
+ struct NTFS_DE *e, *te, *re;
+ struct indx_node *n;
+ struct INDEX_BUFFER *ib;
+
+ *de_to_replace = NULL;
+
+ /* Find first leaf entry down from de_next */
+ vbn = de_get_vbn(de_next);
+ for (;;) {
+ n = NULL;
+ err = indx_read(indx, ni, vbn, &n);
+ if (err)
+ goto out;
+
+ e = hdr_first_de(&n->index->ihdr);
+ fnd_push(fnd, n, e);
+
+ if (!de_is_last(e)) {
+ /*
+ * This buffer is non-empty, so its first entry could be used as the
+ * replacement entry.
+ */
+ level = fnd->level - 1;
+ }
+
+ if (!de_has_vcn(e))
+ break;
+
+ /* This buffer is a node. Continue going down. */
+ vbn = de_get_vbn(e);
+ }
+
+ if (level == -1)
+ goto out;
+
+ n = fnd->nodes[level];
+ te = hdr_first_de(&n->index->ihdr);
+ /* Copy the candidate entry into the replacement entry buffer. */
+ re = ntfs_malloc(le16_to_cpu(te->size) + sizeof(u64));
+ if (!re) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ *de_to_replace = re;
+ memcpy(re, te, le16_to_cpu(te->size));
+
+ if (!de_has_vcn(re)) {
+ /*
+ * The replacement entry we found doesn't have a sub_vcn.
+ * Increase its size to hold one.
+ */
+ le16_add_cpu(&re->size, sizeof(u64));
+ re->flags |= NTFS_IE_HAS_SUBNODES;
+ } else {
+ /*
+ * The replacement entry we found was a node entry, which means that all
+ * its child buffers are empty. Return them to the free pool.
+ */
+ indx_free_children(indx, ni, te, true);
+ }
+
+ /*
+ * Expunge the replacement entry from its former location,
+ * and then write that buffer.
+ */
+ ib = n->index;
+ e = hdr_delete_de(&ib->ihdr, te);
+
+ fnd->de[level] = e;
+ indx_write(indx, ni, n, 0);
+
+ /* Check to see if this action created an empty leaf. */
+ if (ib_is_leaf(ib) && ib_is_empty(ib))
+ return 0;
+
+out:
+ fnd_clear(fnd);
+ return err;
+}
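+
+/*
+ * Note: the function above is the classic B-tree replacement trick:
+ * descend along the first entry of each child until a leaf is reached,
+ * take the leaf's first entry (the in-order successor) as the
+ * replacement, and expunge it from the leaf it came from.
+ */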
+
+/*
+ * indx_delete_entry
+ *
+ * deletes an entry from the index.
+ */
+int indx_delete_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const void *key, u32 key_len, const void *ctx)
+{
+ int err, diff;
+ struct INDEX_ROOT *root;
+ struct INDEX_HDR *hdr;
+ struct ntfs_fnd *fnd, *fnd2;
+ struct INDEX_BUFFER *ib;
+ struct NTFS_DE *e, *re, *next, *prev, *me;
+ struct indx_node *n, *n2d = NULL;
+ __le64 sub_vbn;
+ int level, level2;
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ u32 e_size, root_size, new_root_size;
+ size_t trim_bit;
+ const struct INDEX_NAMES *in;
+
+ fnd = fnd_get();
+ if (!fnd) {
+ err = -ENOMEM;
+ goto out2;
+ }
+
+ fnd2 = fnd_get();
+ if (!fnd2) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Locate the entry to remove. */
+ err = indx_find(indx, ni, root, key, key_len, ctx, &diff, &e, fnd);
+ if (err)
+ goto out;
+
+ if (!e || diff) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ level = fnd->level;
+
+ if (level) {
+ n = fnd->nodes[level - 1];
+ e = fnd->de[level - 1];
+ ib = n->index;
+ hdr = &ib->ihdr;
+ } else {
+ hdr = &root->ihdr;
+ e = fnd->root_de;
+ n = NULL;
+ }
+
+ e_size = le16_to_cpu(e->size);
+
+ if (!de_has_vcn_ex(e)) {
+ /* The entry to delete is a leaf, so we can just rip it out */
+ hdr_delete_de(hdr, e);
+
+ if (!level) {
+ hdr->total = hdr->used;
+
+ /* Shrink resident root attribute */
+ mi_resize_attr(mi, attr, 0 - e_size);
+ goto out;
+ }
+
+ indx_write(indx, ni, n, 0);
+
+ /*
+ * Check to see if removing that entry made
+ * the leaf empty.
+ */
+ if (ib_is_leaf(ib) && ib_is_empty(ib)) {
+ fnd_pop(fnd);
+ fnd_push(fnd2, n, e);
+ }
+ } else {
+ /*
+ * The entry we wish to delete is a node buffer, so we
+ * have to find a replacement for it.
+ */
+ next = de_get_next(e);
+
+ err = indx_get_entry_to_replace(indx, ni, next, &re, fnd2);
+ if (err)
+ goto out;
+
+ if (re) {
+ de_set_vbn_le(re, de_get_vbn_le(e));
+ hdr_delete_de(hdr, e);
+
+ err = level ? indx_insert_into_buffer(indx, ni, root,
+ re, ctx,
+ fnd->level - 1,
+ fnd)
+ : indx_insert_into_root(indx, ni, re, e,
+ ctx, fnd);
+ ntfs_free(re);
+
+ if (err)
+ goto out;
+ } else {
+ /*
+ * There is no replacement for the current entry.
+ * This means that the subtree rooted at its node is
+ * empty and can be deleted, which in turn means that
+ * the node can just inherit the deleted entry's sub_vcn.
+ */
+ indx_free_children(indx, ni, next, true);
+
+ de_set_vbn_le(next, de_get_vbn_le(e));
+ hdr_delete_de(hdr, e);
+ if (level) {
+ indx_write(indx, ni, n, 0);
+ } else {
+ hdr->total = hdr->used;
+
+ /* Shrink resident root attribute */
+ mi_resize_attr(mi, attr, 0 - e_size);
+ }
+ }
+ }
+
+ /* Delete a branch of the tree. */
+ if (!fnd2 || !fnd2->level)
+ goto out;
+
+ /* Re-read the root, since it may have changed. */
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ n2d = NULL;
+ sub_vbn = fnd2->nodes[0]->index->vbn;
+ level2 = 0;
+ level = fnd->level;
+
+ hdr = level ? &fnd->nodes[level - 1]->index->ihdr : &root->ihdr;
+
+ /* Scan current level */
+ for (e = hdr_first_de(hdr);; e = hdr_next_de(hdr, e)) {
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_has_vcn(e) && sub_vbn == de_get_vbn_le(e))
+ break;
+
+ if (de_is_last(e)) {
+ e = NULL;
+ break;
+ }
+ }
+
+ if (!e) {
+ /* Do slow search from root */
+ struct indx_node *in;
+
+ fnd_clear(fnd);
+
+ in = indx_find_buffer(indx, ni, root, sub_vbn, NULL);
+ if (IS_ERR(in)) {
+ err = PTR_ERR(in);
+ goto out;
+ }
+
+ if (in)
+ fnd_push(fnd, in, NULL);
+ }
+
+ /* Merge fnd2 -> fnd */
+ for (level = 0; level < fnd2->level; level++) {
+ fnd_push(fnd, fnd2->nodes[level], fnd2->de[level]);
+ fnd2->nodes[level] = NULL;
+ }
+ fnd2->level = 0;
+
+ hdr = NULL;
+ for (level = fnd->level; level; level--) {
+ struct indx_node *in = fnd->nodes[level - 1];
+
+ ib = in->index;
+ if (ib_is_empty(ib)) {
+ sub_vbn = ib->vbn;
+ } else {
+ hdr = &ib->ihdr;
+ n2d = in;
+ level2 = level;
+ break;
+ }
+ }
+
+ if (!hdr)
+ hdr = &root->ihdr;
+
+ e = hdr_first_de(hdr);
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (hdr != &root->ihdr || !de_is_last(e)) {
+ prev = NULL;
+ while (!de_is_last(e)) {
+ if (de_has_vcn(e) && sub_vbn == de_get_vbn_le(e))
+ break;
+ prev = e;
+ e = hdr_next_de(hdr, e);
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ if (sub_vbn != de_get_vbn_le(e)) {
+ /*
+ * Didn't find the parent entry, although this buffer is in the parent trail.
+ * Something is corrupt.
+ */
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_is_last(e)) {
+ /*
+ * Since we can't remove the end entry, we'll remove its
+ * predecessor instead. This means we have to transfer the
+ * predecessor's sub_vcn to the end entry.
+ * Note that this index block is not empty, so the
+ * predecessor must exist.
+ */
+ if (!prev) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_has_vcn(prev)) {
+ de_set_vbn_le(e, de_get_vbn_le(prev));
+ } else if (de_has_vcn(e)) {
+ le16_sub_cpu(&e->size, sizeof(u64));
+ e->flags &= ~NTFS_IE_HAS_SUBNODES;
+ le32_sub_cpu(&hdr->used, sizeof(u64));
+ }
+ e = prev;
+ }
+
+ /*
+ * Copy the current entry into a temporary buffer (stripping off its
+ * down-pointer, if any) and delete it from the current buffer or root,
+ * as appropriate.
+ */
+ e_size = le16_to_cpu(e->size);
+ me = ntfs_memdup(e, e_size);
+ if (!me) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (de_has_vcn(me)) {
+ me->flags &= ~NTFS_IE_HAS_SUBNODES;
+ le16_sub_cpu(&me->size, sizeof(u64));
+ }
+
+ hdr_delete_de(hdr, e);
+
+ if (hdr == &root->ihdr) {
+ level = 0;
+ hdr->total = hdr->used;
+
+ /* Shrink resident root attribute */
+ mi_resize_attr(mi, attr, 0 - e_size);
+ } else {
+ indx_write(indx, ni, n2d, 0);
+ level = level2;
+ }
+
+ /* Mark unused buffers as free */
+ trim_bit = -1;
+ for (; level < fnd->level; level++) {
+ ib = fnd->nodes[level]->index;
+ if (ib_is_empty(ib)) {
+ size_t k = le64_to_cpu(ib->vbn) >>
+ indx->idx2vbn_bits;
+
+ indx_mark_free(indx, ni, k);
+ if (k < trim_bit)
+ trim_bit = k;
+ }
+ }
+
+ fnd_clear(fnd);
+ /*fnd->root_de = NULL;*/
+
+ /*
+ * Re-insert the entry into the tree.
+ * Find the spot in the tree where we want to insert the new entry.
+ */
+ err = indx_insert_entry(indx, ni, me, ctx, fnd);
+ ntfs_free(me);
+ if (err)
+ goto out;
+
+ if (trim_bit != -1)
+ indx_shrink(indx, ni, trim_bit);
+ } else {
+ /*
+ * This tree needs to be collapsed down to an empty root.
+ * Recreate the index root as an empty leaf and free all the bits in
+ * the index allocation bitmap.
+ */
+ fnd_clear(fnd);
+ fnd_clear(fnd2);
+
+ in = &s_index_names[indx->type];
+
+ err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+ &indx->alloc_run, 0, NULL, false, NULL);
+ err = ni_remove_attr(ni, ATTR_ALLOC, in->name, in->name_len,
+ false, NULL);
+ run_close(&indx->alloc_run);
+
+ err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, 0, NULL, false, NULL);
+ err = ni_remove_attr(ni, ATTR_BITMAP, in->name, in->name_len,
+ false, NULL);
+ run_close(&indx->bitmap_run);
+
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_size = le32_to_cpu(attr->res.data_size);
+ new_root_size =
+ sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE);
+
+ if (new_root_size != root_size &&
+ !mi_resize_attr(mi, attr, new_root_size - root_size)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Fill first entry */
+ e = (struct NTFS_DE *)(root + 1);
+ e->ref.low = 0;
+ e->ref.high = 0;
+ e->ref.seq = 0;
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+ e->flags = NTFS_IE_LAST; // 0x02
+ e->key_size = 0;
+ e->res = 0;
+
+ hdr = &root->ihdr;
+ hdr->flags = 0;
+ hdr->used = hdr->total = cpu_to_le32(
+ new_root_size - offsetof(struct INDEX_ROOT, ihdr));
+ mi->dirty = true;
+ }
+
+out:
+ fnd_put(fnd2);
+out1:
+ fnd_put(fnd);
+out2:
+ return err;
+}
+
+/*
+ * Update the duplicated information in a directory entry.
+ * 'dup' - info from the MFT record.
+ */
+int indx_update_dup(struct ntfs_inode *ni, struct ntfs_sb_info *sbi,
+ const struct ATTR_FILE_NAME *fname,
+ const struct NTFS_DUP_INFO *dup, int sync)
+{
+ int err, diff;
+ struct NTFS_DE *e = NULL;
+ struct ATTR_FILE_NAME *e_fname;
+ struct ntfs_fnd *fnd;
+ struct INDEX_ROOT *root;
+ struct mft_inode *mi;
+ struct ntfs_index *indx = &ni->dir;
+
+ fnd = fnd_get();
+ if (!fnd) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ root = indx_get_root(indx, ni, NULL, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Find entry in directory */
+ err = indx_find(indx, ni, root, fname, fname_full_size(fname), sbi,
+ &diff, &e, fnd);
+ if (err)
+ goto out;
+
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (diff) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ e_fname = (struct ATTR_FILE_NAME *)(e + 1);
+
+ if (!memcmp(&e_fname->dup, dup, sizeof(*dup))) {
+ /* Nothing to update in the index! Try to avoid this call. */
+ goto out;
+ }
+
+ memcpy(&e_fname->dup, dup, sizeof(*dup));
+
+ if (fnd->level) {
+ /* directory entry in index */
+ err = indx_write(indx, ni, fnd->nodes[fnd->level - 1], sync);
+ } else {
+ /* directory entry in directory MFT record */
+ mi->dirty = true;
+ if (sync)
+ err = mi_write(mi, 1);
+ else
+ mark_inode_dirty(&ni->vfs_inode);
+ }
+
+out:
+ fnd_put(fnd);
+
+out1:
+ return err;
+}
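+
+/*
+ * Background note: NTFS duplicates a file's sizes, times and attributes
+ * (struct NTFS_DUP_INFO) inside every directory entry that names it, in
+ * addition to the MFT record itself. indx_update_dup() above is what
+ * keeps the directory-side copy in sync when the MFT side changes.
+ */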
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
new file mode 100644
index 000000000000..bf51e294432e
--- /dev/null
+++ b/fs/ntfs3/inode.c
@@ -0,0 +1,2029 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/iversion.h>
+#include <linux/mpage.h>
+#include <linux/namei.h>
+#include <linux/nls.h>
+#include <linux/uio.h>
+#include <linux/version.h>
+#include <linux/writeback.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * ntfs_read_mft
+ *
+ * reads the MFT record and parses it.
+ */
+static struct inode *ntfs_read_mft(struct inode *inode,
+ const struct cpu_str *name,
+ const struct MFT_REF *ref)
+{
+ int err = 0;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ mode_t mode = 0;
+ struct ATTR_STD_INFO5 *std5 = NULL;
+ struct ATTR_LIST_ENTRY *le;
+ struct ATTRIB *attr;
+ bool is_match = false;
+ bool is_root = false;
+ bool is_dir;
+ unsigned long ino = inode->i_ino;
+ u32 rp_fa = 0, asize, t32;
+ u16 roff, rsize, names = 0;
+ const struct ATTR_FILE_NAME *fname = NULL;
+ const struct INDEX_ROOT *root;
+ struct REPARSE_DATA_BUFFER rp; // 0x18 bytes
+ u64 t64;
+ struct MFT_REC *rec;
+ struct runs_tree *run;
+
+ inode->i_op = NULL;
+ /* Set up 'uid' and 'gid'. */
+ inode->i_uid = sbi->options.fs_uid;
+ inode->i_gid = sbi->options.fs_gid;
+
+ err = mi_init(&ni->mi, sbi, ino);
+ if (err)
+ goto out;
+
+ if (!sbi->mft.ni && ino == MFT_REC_MFT && !sb->s_root) {
+ t64 = sbi->mft.lbo >> sbi->cluster_bits;
+ t32 = bytes_to_cluster(sbi, MFT_REC_VOL * sbi->record_size);
+ sbi->mft.ni = ni;
+ init_rwsem(&ni->file.run_lock);
+
+ if (!run_add_entry(&ni->file.run, 0, t64, t32, true)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ err = mi_read(&ni->mi, ino == MFT_REC_MFT);
+
+ if (err)
+ goto out;
+
+ rec = ni->mi.mrec;
+
+ if (sbi->flags & NTFS_FLAGS_LOG_REPLAYING) {
+ ;
+ } else if (ref->seq != rec->seq) {
+ err = -EINVAL;
+ ntfs_err(sb, "MFT: r=%lx, expect seq=%x instead of %x!", ino,
+ le16_to_cpu(ref->seq), le16_to_cpu(rec->seq));
+ goto out;
+ } else if (!is_rec_inuse(rec)) {
+ err = -EINVAL;
+ ntfs_err(sb, "Inode r=%x is not in use!", (u32)ino);
+ goto out;
+ }
+
+ if (le32_to_cpu(rec->total) != sbi->record_size) {
+ // bad inode?
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!is_rec_base(rec))
+ goto Ok;
+
+ /* The record should contain the $I30 root. */
+ is_dir = rec->flags & RECORD_FLAG_DIR;
+
+ inode->i_generation = le16_to_cpu(rec->seq);
+
+ /* Enumerate all attributes in the MFT record. */
+ le = NULL;
+ attr = NULL;
+
+ /*
+ * To reduce indentation, use goto instead of
+ * while ((attr = ni_enum_attr_ex(ni, attr, &le, NULL)))
+ */
+next_attr:
+ run = NULL;
+ err = -EINVAL;
+ attr = ni_enum_attr_ex(ni, attr, &le, NULL);
+ if (!attr)
+ goto end_enum;
+
+ if (le && le->vcn) {
+ /* This is a non-primary attribute segment. Ignore it unless this is the MFT. */
+ if (ino != MFT_REC_MFT || attr->type != ATTR_DATA)
+ goto next_attr;
+
+ run = &ni->file.run;
+ asize = le32_to_cpu(attr->size);
+ goto attr_unpack_run;
+ }
+
+ roff = attr->non_res ? 0 : le16_to_cpu(attr->res.data_off);
+ rsize = attr->non_res ? 0 : le32_to_cpu(attr->res.data_size);
+ asize = le32_to_cpu(attr->size);
+
+ switch (attr->type) {
+ case ATTR_STD:
+ if (attr->non_res ||
+ asize < sizeof(struct ATTR_STD_INFO) + roff ||
+ rsize < sizeof(struct ATTR_STD_INFO))
+ goto out;
+
+ if (std5)
+ goto next_attr;
+
+ std5 = Add2Ptr(attr, roff);
+
+#ifdef STATX_BTIME
+ nt2kernel(std5->cr_time, &ni->i_crtime);
+#endif
+ nt2kernel(std5->a_time, &inode->i_atime);
+ nt2kernel(std5->c_time, &inode->i_ctime);
+ nt2kernel(std5->m_time, &inode->i_mtime);
+
+ ni->std_fa = std5->fa;
+
+ if (asize >= sizeof(struct ATTR_STD_INFO5) + roff &&
+ rsize >= sizeof(struct ATTR_STD_INFO5))
+ ni->std_security_id = std5->security_id;
+ goto next_attr;
+
+ case ATTR_LIST:
+ if (attr->name_len || le || ino == MFT_REC_LOG)
+ goto out;
+
+ err = ntfs_load_attr_list(ni, attr);
+ if (err)
+ goto out;
+
+ le = NULL;
+ attr = NULL;
+ goto next_attr;
+
+ case ATTR_NAME:
+ if (attr->non_res || asize < SIZEOF_ATTRIBUTE_FILENAME + roff ||
+ rsize < SIZEOF_ATTRIBUTE_FILENAME)
+ goto out;
+
+ fname = Add2Ptr(attr, roff);
+ if (fname->type == FILE_NAME_DOS)
+ goto next_attr;
+
+ names += 1;
+ if (name && name->len == fname->name_len &&
+ !ntfs_cmp_names_cpu(name, (struct le_str *)&fname->name_len,
+ NULL, false))
+ is_match = true;
+
+ goto next_attr;
+
+ case ATTR_DATA:
+ if (is_dir) {
+ /* Ignore the data attribute in a directory record. */
+ goto next_attr;
+ }
+
+ if (ino == MFT_REC_BADCLUST && !attr->non_res)
+ goto next_attr;
+
+ if (attr->name_len &&
+ ((ino != MFT_REC_BADCLUST || !attr->non_res ||
+ attr->name_len != ARRAY_SIZE(BAD_NAME) ||
+ memcmp(attr_name(attr), BAD_NAME, sizeof(BAD_NAME))) &&
+ (ino != MFT_REC_SECURE || !attr->non_res ||
+ attr->name_len != ARRAY_SIZE(SDS_NAME) ||
+ memcmp(attr_name(attr), SDS_NAME, sizeof(SDS_NAME))))) {
+ /* The file contains a named stream attribute; ignore it. */
+ goto next_attr;
+ }
+
+ if (is_attr_sparsed(attr))
+ ni->std_fa |= FILE_ATTRIBUTE_SPARSE_FILE;
+ else
+ ni->std_fa &= ~FILE_ATTRIBUTE_SPARSE_FILE;
+
+ if (is_attr_compressed(attr))
+ ni->std_fa |= FILE_ATTRIBUTE_COMPRESSED;
+ else
+ ni->std_fa &= ~FILE_ATTRIBUTE_COMPRESSED;
+
+ if (is_attr_encrypted(attr))
+ ni->std_fa |= FILE_ATTRIBUTE_ENCRYPTED;
+ else
+ ni->std_fa &= ~FILE_ATTRIBUTE_ENCRYPTED;
+
+ if (!attr->non_res) {
+ ni->i_valid = inode->i_size = rsize;
+ inode_set_bytes(inode, rsize);
+ t32 = asize;
+ } else {
+ t32 = le16_to_cpu(attr->nres.run_off);
+ }
+
+ mode = S_IFREG | (0777 & sbi->options.fs_fmask_inv);
+
+ if (!attr->non_res) {
+ ni->ni_flags |= NI_FLAG_RESIDENT;
+ goto next_attr;
+ }
+
+ inode_set_bytes(inode, attr_ondisk_size(attr));
+
+ ni->i_valid = le64_to_cpu(attr->nres.valid_size);
+ inode->i_size = le64_to_cpu(attr->nres.data_size);
+ if (!attr->nres.alloc_size)
+ goto next_attr;
+
+ run = ino == MFT_REC_BITMAP ? &sbi->used.bitmap.run
+ : &ni->file.run;
+ break;
+
+ case ATTR_ROOT:
+ if (attr->non_res)
+ goto out;
+
+ root = Add2Ptr(attr, roff);
+ is_root = true;
+
+ if (attr->name_len != ARRAY_SIZE(I30_NAME) ||
+ memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME)))
+ goto next_attr;
+
+ if (root->type != ATTR_NAME ||
+ root->rule != NTFS_COLLATION_TYPE_FILENAME)
+ goto out;
+
+ if (!is_dir)
+ goto next_attr;
+
+ ni->ni_flags |= NI_FLAG_DIR;
+
+ err = indx_init(&ni->dir, sbi, attr, INDEX_MUTEX_I30);
+ if (err)
+ goto out;
+
+ mode = sb->s_root
+ ? (S_IFDIR | (0777 & sbi->options.fs_dmask_inv))
+ : (S_IFDIR | 0777);
+ goto next_attr;
+
+ case ATTR_ALLOC:
+ if (!is_root || attr->name_len != ARRAY_SIZE(I30_NAME) ||
+ memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME)))
+ goto next_attr;
+
+ inode->i_size = le64_to_cpu(attr->nres.data_size);
+ ni->i_valid = le64_to_cpu(attr->nres.valid_size);
+ inode_set_bytes(inode, le64_to_cpu(attr->nres.alloc_size));
+
+ run = &ni->dir.alloc_run;
+ break;
+
+ case ATTR_BITMAP:
+ if (ino == MFT_REC_MFT) {
+ if (!attr->non_res)
+ goto out;
+#ifndef CONFIG_NTFS3_64BIT_CLUSTER
+ /* 0x20000000 = 2^32 / 8 */
+ if (le64_to_cpu(attr->nres.alloc_size) >= 0x20000000)
+ goto out;
+#endif
+ run = &sbi->mft.bitmap.run;
+ break;
+ } else if (is_dir && attr->name_len == ARRAY_SIZE(I30_NAME) &&
+ !memcmp(attr_name(attr), I30_NAME,
+ sizeof(I30_NAME)) &&
+ attr->non_res) {
+ run = &ni->dir.bitmap_run;
+ break;
+ }
+ goto next_attr;
+
+ case ATTR_REPARSE:
+ if (attr->name_len)
+ goto next_attr;
+
+ rp_fa = ni_parse_reparse(ni, attr, &rp);
+ switch (rp_fa) {
+ case REPARSE_LINK:
+ if (!attr->non_res) {
+ inode->i_size = rsize;
+ inode_set_bytes(inode, rsize);
+ t32 = asize;
+ } else {
+ inode->i_size =
+ le64_to_cpu(attr->nres.data_size);
+ t32 = le16_to_cpu(attr->nres.run_off);
+ }
+
+ /* Looks like a normal symlink. */
+ ni->i_valid = inode->i_size;
+
+ /* Clear directory bit */
+ if (ni->ni_flags & NI_FLAG_DIR) {
+ indx_clear(&ni->dir);
+ memset(&ni->dir, 0, sizeof(ni->dir));
+ ni->ni_flags &= ~NI_FLAG_DIR;
+ } else {
+ run_close(&ni->file.run);
+ }
+ mode = S_IFLNK | 0777;
+ is_dir = false;
+ if (attr->non_res) {
+ run = &ni->file.run;
+ goto attr_unpack_run; // double break
+ }
+ break;
+
+ case REPARSE_COMPRESSED:
+ break;
+
+ case REPARSE_DEDUPLICATED:
+ break;
+ }
+ goto next_attr;
+
+ case ATTR_EA_INFO:
+ if (!attr->name_len &&
+ resident_data_ex(attr, sizeof(struct EA_INFO))) {
+ ni->ni_flags |= NI_FLAG_EA;
+ /*
+ * ntfs_get_wsl_perm updates inode->i_uid, inode->i_gid, inode->i_mode
+ */
+ inode->i_mode = mode;
+ ntfs_get_wsl_perm(inode);
+ mode = inode->i_mode;
+ }
+ goto next_attr;
+
+ default:
+ goto next_attr;
+ }
+
+attr_unpack_run:
+ roff = le16_to_cpu(attr->nres.run_off);
+
+ t64 = le64_to_cpu(attr->nres.svcn);
+ err = run_unpack_ex(run, sbi, ino, t64, le64_to_cpu(attr->nres.evcn),
+ t64, Add2Ptr(attr, roff), asize - roff);
+ if (err < 0)
+ goto out;
+ err = 0;
+ goto next_attr;
+
+end_enum:
+
+ if (!std5)
+ goto out;
+
+ if (!is_match && name) {
+ /* reuse rec as buffer for ascii name */
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (std5->fa & FILE_ATTRIBUTE_READONLY)
+ mode &= ~0222;
+
+ if (!names) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ set_nlink(inode, names);
+
+ if (S_ISDIR(mode)) {
+ ni->std_fa |= FILE_ATTRIBUTE_DIRECTORY;
+
+ /*
+ * Dot and dot-dot should be included in the link count,
+ * but they were not included in the enumeration.
+ * Hard links to directories are usually disabled.
+ */
+ inode->i_op = &ntfs_dir_inode_operations;
+ inode->i_fop = &ntfs_dir_operations;
+ ni->i_valid = 0;
+ } else if (S_ISLNK(mode)) {
+ ni->std_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+ inode->i_op = &ntfs_link_inode_operations;
+ inode->i_fop = NULL;
+ inode_nohighmem(inode); // ??
+ } else if (S_ISREG(mode)) {
+ ni->std_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+ inode->i_op = &ntfs_file_inode_operations;
+ inode->i_fop = &ntfs_file_operations;
+ inode->i_mapping->a_ops =
+ is_compressed(ni) ? &ntfs_aops_cmpr : &ntfs_aops;
+ if (ino != MFT_REC_MFT)
+ init_rwsem(&ni->file.run_lock);
+ } else if (S_ISCHR(mode) || S_ISBLK(mode) || S_ISFIFO(mode) ||
+ S_ISSOCK(mode)) {
+ inode->i_op = &ntfs_special_inode_operations;
+ init_special_inode(inode, mode, inode->i_rdev);
+ } else if (fname && fname->home.low == cpu_to_le32(MFT_REC_EXTEND) &&
+ fname->home.seq == cpu_to_le16(MFT_REC_EXTEND)) {
+ /* Records in $Extend are not regular files or directories. */
+ } else {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if ((sbi->options.sys_immutable &&
+ (std5->fa & FILE_ATTRIBUTE_SYSTEM)) &&
+ !S_ISFIFO(mode) && !S_ISSOCK(mode) && !S_ISLNK(mode)) {
+ inode->i_flags |= S_IMMUTABLE;
+ } else {
+ inode->i_flags &= ~S_IMMUTABLE;
+ }
+
+ inode->i_mode = mode;
+ if (!(ni->ni_flags & NI_FLAG_EA)) {
+ /* No xattr means no security descriptor (security is stored in xattr). */
+ inode->i_flags |= S_NOSEC;
+ }
+
+Ok:
+ if (ino == MFT_REC_MFT && !sb->s_root)
+ sbi->mft.ni = NULL;
+
+ unlock_new_inode(inode);
+
+ return inode;
+
+out:
+ if (ino == MFT_REC_MFT && !sb->s_root)
+ sbi->mft.ni = NULL;
+
+ iget_failed(inode);
+ return ERR_PTR(err);
+}
+
+/* Returns 1 on match. */
+static int ntfs_test_inode(struct inode *inode, void *data)
+{
+ struct MFT_REF *ref = data;
+
+ return ino_get(ref) == inode->i_ino;
+}
+
+static int ntfs_set_inode(struct inode *inode, void *data)
+{
+ const struct MFT_REF *ref = data;
+
+ inode->i_ino = ino_get(ref);
+ return 0;
+}
+
+struct inode *ntfs_iget5(struct super_block *sb, const struct MFT_REF *ref,
+ const struct cpu_str *name)
+{
+ struct inode *inode;
+
+ inode = iget5_locked(sb, ino_get(ref), ntfs_test_inode, ntfs_set_inode,
+ (void *)ref);
+ if (unlikely(!inode))
+ return ERR_PTR(-ENOMEM);
+
+ /* If this is a freshly allocated inode, we need to read it now. */
+ if (inode->i_state & I_NEW)
+ inode = ntfs_read_mft(inode, name, ref);
+ else if (ref->seq != ntfs_i(inode)->mi.mrec->seq) {
+ /* inode overlaps? */
+ make_bad_inode(inode);
+ }
+
+ return inode;
+}
+
+enum get_block_ctx {
+ GET_BLOCK_GENERAL = 0,
+ GET_BLOCK_WRITE_BEGIN = 1,
+ GET_BLOCK_DIRECT_IO_R = 2,
+ GET_BLOCK_DIRECT_IO_W = 3,
+ GET_BLOCK_BMAP = 4,
+};
+
+static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo,
+ struct buffer_head *bh, int create,
+ enum get_block_ctx ctx)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct page *page = bh->b_page;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 block_size = sb->s_blocksize;
+ u64 bytes, lbo, valid;
+ u32 off;
+ int err;
+ CLST vcn, lcn, len;
+ bool new;
+
+ /* Clear previous state. */
+ clear_buffer_new(bh);
+ clear_buffer_uptodate(bh);
+
+ /* Direct write uses 'create=0'. */
+ if (!create && vbo >= ni->i_valid) {
+ /* out of valid */
+ return 0;
+ }
+
+ if (vbo >= inode->i_size) {
+ /* out of size */
+ return 0;
+ }
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_read_resident(ni, page);
+ ni_unlock(ni);
+
+ if (!err)
+ set_buffer_uptodate(bh);
+ bh->b_size = block_size;
+ return err;
+ }
+
+ vcn = vbo >> cluster_bits;
+ off = vbo & sbi->cluster_mask;
+ new = false;
+
+ err = attr_data_get_block(ni, vcn, 1, &lcn, &len, create ? &new : NULL);
+ if (err)
+ goto out;
+
+ if (!len)
+ return 0;
+
+ bytes = ((u64)len << cluster_bits) - off;
+
+ if (lcn == SPARSE_LCN) {
+ if (!create) {
+ if (bh->b_size > bytes)
+ bh->b_size = bytes;
+ return 0;
+ }
+ WARN_ON(1);
+ }
+
+ if (new) {
+ set_buffer_new(bh);
+ if ((len << cluster_bits) > block_size)
+ ntfs_sparse_cluster(inode, page, vcn, len);
+ }
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+
+ set_buffer_mapped(bh);
+ bh->b_bdev = sb->s_bdev;
+ bh->b_blocknr = lbo >> sb->s_blocksize_bits;
+
+ valid = ni->i_valid;
+
+ if (ctx == GET_BLOCK_DIRECT_IO_W) {
+ /* ntfs_direct_IO will update ni->i_valid. */
+ if (vbo >= valid)
+ set_buffer_new(bh);
+ } else if (create) {
+ /* Normal write. */
+ if (bytes > bh->b_size)
+ bytes = bh->b_size;
+
+ if (vbo >= valid)
+ set_buffer_new(bh);
+
+ if (vbo + bytes > valid) {
+ ni->i_valid = vbo + bytes;
+ mark_inode_dirty(inode);
+ }
+ } else if (vbo >= valid) {
+ /* Read beyond valid data; we should never get here because it was already checked. */
+ clear_buffer_mapped(bh);
+ } else if (vbo + bytes <= valid) {
+ /* normal read */
+ } else if (vbo + block_size <= valid) {
+ /* normal short read */
+ bytes = block_size;
+ } else {
+ /*
+ * read across valid size: vbo < valid && valid < vbo + block_size
+ */
+ bytes = block_size;
+
+ if (page) {
+ u32 voff = valid - vbo;
+
+ bh->b_size = block_size;
+ off = vbo & (PAGE_SIZE - 1);
+ set_bh_page(bh, page, off);
+ ll_rw_block(REQ_OP_READ, 0, 1, &bh);
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh)) {
+ err = -EIO;
+ goto out;
+ }
+ zero_user_segment(page, off + voff, off + block_size);
+ }
+ }
+
+ if (bh->b_size > bytes)
+ bh->b_size = bytes;
+
+#ifndef __LP64__
+ if (ctx == GET_BLOCK_DIRECT_IO_W || ctx == GET_BLOCK_DIRECT_IO_R) {
+ static_assert(sizeof(size_t) < sizeof(loff_t));
+ if (bytes > 0x40000000u)
+ bh->b_size = 0x40000000u;
+ }
+#endif
+
+ return 0;
+
+out:
+ return err;
+}
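+
+/*
+ * Worked example of the 'valid size' handling above (illustrative
+ * numbers): with block_size = 4096, i_size = 12000 and ni->i_valid =
+ * 5000, a read of the block at vbo = 4096 crosses the valid boundary,
+ * so the block is read from disk and the tail [5000, 8192) is zeroed
+ * via zero_user_segment().
+ */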
+
+int ntfs_get_block(struct inode *inode, sector_t vbn,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)vbn << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_GENERAL);
+}
+
+static int ntfs_get_block_bmap(struct inode *inode, sector_t vsn,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode,
+ (u64)vsn << inode->i_sb->s_blocksize_bits,
+ bh_result, create, GET_BLOCK_BMAP);
+}
+
+static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
+{
+ return generic_block_bmap(mapping, block, ntfs_get_block_bmap);
+}
+
+static int ntfs_readpage(struct file *file, struct page *page)
+{
+ int err;
+ struct address_space *mapping = page->mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_read_resident(ni, page);
+ ni_unlock(ni);
+ if (err != E_NTFS_NONRESIDENT) {
+ unlock_page(page);
+ return err;
+ }
+ }
+
+ if (is_compressed(ni)) {
+ ni_lock(ni);
+ err = ni_readpage_cmpr(ni, page);
+ ni_unlock(ni);
+ return err;
+ }
+
+ /* normal + sparse files */
+ return mpage_readpage(page, ntfs_get_block);
+}
+
+static void ntfs_readahead(struct readahead_control *rac)
+{
+ struct address_space *mapping = rac->mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u64 valid;
+ loff_t pos;
+
+ if (is_resident(ni)) {
+ /* no readahead for resident */
+ return;
+ }
+
+ if (is_compressed(ni)) {
+ /* no readahead for compressed */
+ return;
+ }
+
+ valid = ni->i_valid;
+ pos = readahead_pos(rac);
+
+ if (valid < i_size_read(inode) && pos <= valid &&
+ valid < pos + readahead_length(rac)) {
+ /* The range crosses 'valid'; read it page by page. */
+ return;
+ }
+
+ mpage_readahead(rac, ntfs_get_block);
+}
+
+static int ntfs_get_block_direct_IO_R(struct inode *inode, sector_t iblock,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)iblock << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_DIRECT_IO_R);
+}
+
+static int ntfs_get_block_direct_IO_W(struct inode *inode, sector_t iblock,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)iblock << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_DIRECT_IO_W);
+}
+
+static ssize_t ntfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+ struct file *file = iocb->ki_filp;
+ struct address_space *mapping = file->f_mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ loff_t vbo = iocb->ki_pos;
+ loff_t end;
+ int wr = iov_iter_rw(iter) & WRITE;
+ loff_t valid;
+ ssize_t ret;
+
+ if (is_resident(ni)) {
+ /* Switch to buffered write. */
+ ret = 0;
+ goto out;
+ }
+
+ ret = blockdev_direct_IO(iocb, inode, iter,
+ wr ? ntfs_get_block_direct_IO_W
+ : ntfs_get_block_direct_IO_R);
+
+ if (ret <= 0)
+ goto out;
+
+ end = vbo + ret;
+ valid = ni->i_valid;
+ if (wr) {
+ if (end > valid && !S_ISBLK(inode->i_mode)) {
+ ni->i_valid = end;
+ mark_inode_dirty(inode);
+ }
+ } else if (vbo < valid && valid < end) {
+ /* Zero the tail of the read that lies beyond the valid size. */
+ iov_iter_revert(iter, end - valid);
+ iov_iter_zero(end - valid, iter);
+ }
+
+out:
+ return ret;
+}
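+
+/*
+ * The revert-and-zero pair above handles a direct read that crosses the
+ * valid size. Illustrative numbers: if vbo = 0, valid = 5000 and
+ * blockdev_direct_IO() returned 8192 bytes, the iterator is rewound by
+ * end - valid = 3192 bytes and that tail is rewritten with zeros, so
+ * the caller never sees stale on-disk data past the valid size.
+ */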
+
+int ntfs_set_size(struct inode *inode, u64 new_size)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ int err;
+
+ /* Check for maximum file size */
+ if (is_sparsed(ni) || is_compressed(ni)) {
+ if (new_size > sbi->maxbytes_sparse) {
+ err = -EFBIG;
+ goto out;
+ }
+ } else if (new_size > sbi->maxbytes) {
+ err = -EFBIG;
+ goto out;
+ }
+
+ ni_lock(ni);
+ down_write(&ni->file.run_lock);
+
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size,
+ &ni->i_valid, true, NULL);
+
+ up_write(&ni->file.run_lock);
+ ni_unlock(ni);
+
+ mark_inode_dirty(inode);
+
+out:
+ return err;
+}
+
+static int ntfs_writepage(struct page *page, struct writeback_control *wbc)
+{
+ struct address_space *mapping = page->mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ int err;
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_write_resident(ni, page);
+ ni_unlock(ni);
+ if (err != E_NTFS_NONRESIDENT) {
+ unlock_page(page);
+ return err;
+ }
+ }
+
+ return block_write_full_page(page, ntfs_get_block, wbc);
+}
+
+static int ntfs_writepages(struct address_space *mapping,
+ struct writeback_control *wbc)
+{
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ /* Redirect the call to 'ntfs_writepage' for resident files. */
+ get_block_t *get_block = is_resident(ni) ? NULL : &ntfs_get_block;
+
+ return mpage_writepages(mapping, wbc, get_block);
+}
+
+static int ntfs_get_block_write_begin(struct inode *inode, sector_t vbn,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)vbn << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_WRITE_BEGIN);
+}
+
+static int ntfs_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, u32 len, u32 flags, struct page **pagep,
+ void **fsdata)
+{
+ int err;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ *pagep = NULL;
+ if (is_resident(ni)) {
+ struct page *page = grab_cache_page_write_begin(
+ mapping, pos >> PAGE_SHIFT, flags);
+
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ ni_lock(ni);
+ err = attr_data_read_resident(ni, page);
+ ni_unlock(ni);
+
+ if (!err) {
+ *pagep = page;
+ goto out;
+ }
+ unlock_page(page);
+ put_page(page);
+
+ if (err != E_NTFS_NONRESIDENT)
+ goto out;
+ }
+
+ err = block_write_begin(mapping, pos, len, flags, pagep,
+ ntfs_get_block_write_begin);
+
+out:
+ return err;
+}
+
+/* address_space_operations::write_end */
+static int ntfs_write_end(struct file *file, struct address_space *mapping,
+ loff_t pos, u32 len, u32 copied, struct page *page,
+ void *fsdata)
+
+{
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u64 valid = ni->i_valid;
+ bool dirty = false;
+ int err;
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_write_resident(ni, page);
+ ni_unlock(ni);
+ if (!err) {
+ dirty = true;
+ /* Clear any buffers in the page. */
+ if (page_has_buffers(page)) {
+ struct buffer_head *head, *bh;
+
+ bh = head = page_buffers(page);
+ do {
+ clear_buffer_dirty(bh);
+ clear_buffer_mapped(bh);
+ set_buffer_uptodate(bh);
+ } while (head != (bh = bh->b_this_page));
+ }
+ SetPageUptodate(page);
+ err = copied;
+ }
+ unlock_page(page);
+ put_page(page);
+ } else {
+ err = generic_write_end(file, mapping, pos, len, copied, page,
+ fsdata);
+ }
+
+ if (err >= 0) {
+ if (!(ni->std_fa & FILE_ATTRIBUTE_ARCHIVE)) {
+ inode->i_ctime = inode->i_mtime = current_time(inode);
+ ni->std_fa |= FILE_ATTRIBUTE_ARCHIVE;
+ dirty = true;
+ }
+
+ if (valid != ni->i_valid) {
+ /* ni->i_valid is changed in ntfs_get_block_vbo */
+ dirty = true;
+ }
+
+ if (dirty)
+ mark_inode_dirty(inode);
+ }
+
+ return err;
+}
+
+int reset_log_file(struct inode *inode)
+{
+ int err;
+ loff_t pos = 0;
+ u32 log_size = inode->i_size;
+ struct address_space *mapping = inode->i_mapping;
+
+ for (;;) {
+ u32 len;
+ void *kaddr;
+ struct page *page;
+
+ len = pos + PAGE_SIZE > log_size ? (log_size - pos) : PAGE_SIZE;
+
+ err = block_write_begin(mapping, pos, len, 0, &page,
+ ntfs_get_block_write_begin);
+ if (err)
+ goto out;
+
+ kaddr = kmap_atomic(page);
+ memset(kaddr, -1, len);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(page);
+
+ err = block_write_end(NULL, mapping, pos, len, len, page, NULL);
+ if (err < 0)
+ goto out;
+ pos += len;
+
+ if (pos >= log_size)
+ break;
+ balance_dirty_pages_ratelimited(mapping);
+ }
+out:
+ mark_inode_dirty_sync(inode);
+
+ return err;
+}
+
+int ntfs3_write_inode(struct inode *inode, struct writeback_control *wbc)
+{
+ return _ni_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+}
+
+int ntfs_sync_inode(struct inode *inode)
+{
+ return _ni_write_inode(inode, 1);
+}
+
+/*
+ * Helper function for ntfs_flush_inodes. This writes both the inode
+ * and the file data blocks, waiting for in-flight data blocks before
+ * the start of the call. It does not wait for any I/O started
+ * during the call.
+ */
+static int writeback_inode(struct inode *inode)
+{
+ int ret = sync_inode_metadata(inode, 0);
+
+ if (!ret)
+ ret = filemap_fdatawrite(inode->i_mapping);
+ return ret;
+}
+
+/*
+ * Write data and metadata corresponding to i1 and i2. The I/O is
+ * started but we do not wait for any of it to finish.
+ *
+ * filemap_flush is used for the block device, so if there is a dirty
+ * page for a block already in flight, we will not wait and start the
+ * I/O over again.
+ */
+int ntfs_flush_inodes(struct super_block *sb, struct inode *i1,
+ struct inode *i2)
+{
+ int ret = 0;
+
+ if (i1)
+ ret = writeback_inode(i1);
+ if (!ret && i2)
+ ret = writeback_inode(i2);
+ if (!ret)
+ ret = filemap_flush(sb->s_bdev->bd_inode->i_mapping);
+ return ret;
+}
+
+int inode_write_data(struct inode *inode, const void *data, size_t bytes)
+{
+ pgoff_t idx;
+
+ /* Write non resident data */
+ for (idx = 0; bytes; idx++) {
+ size_t op = bytes > PAGE_SIZE ? PAGE_SIZE : bytes;
+ struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+ if (IS_ERR(page))
+ return PTR_ERR(page);
+
+ lock_page(page);
+ WARN_ON(!PageUptodate(page));
+ ClearPageUptodate(page);
+
+ memcpy(page_address(page), data, op);
+
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ unlock_page(page);
+
+ ntfs_unmap_page(page);
+
+ bytes -= op;
+ data = Add2Ptr(data, PAGE_SIZE);
+ }
+ return 0;
+}
+
+/*
+ * Number of bytes required for a REPARSE_DATA_BUFFER (IO_REPARSE_TAG_SYMLINK)
+ * holding a unicode string of 'uni_len' characters.
+ */
+static inline u32 ntfs_reparse_bytes(u32 uni_len)
+{
+ /* header + unicode string + decorated unicode string */
+ return sizeof(short) * (2 * uni_len + 4) +
+ offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer.PathBuffer);
+}
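+
+/*
+ * Worked example (assuming the conventional Windows layout where
+ * PathBuffer starts 20 bytes into REPARSE_DATA_BUFFER): for a 7-char
+ * target, ntfs_reparse_bytes(7) = 2 * (2 * 7 + 4) + 20 = 56 bytes.
+ * The extra four u16s make room for the "\??\" prefix that decorates
+ * the SubstituteName.
+ */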
+
+static struct REPARSE_DATA_BUFFER *
+ntfs_create_reparse_buffer(struct ntfs_sb_info *sbi, const char *symname,
+ u32 size, u16 *nsize)
+{
+ int i, err;
+ struct REPARSE_DATA_BUFFER *rp;
+ __le16 *rp_name;
+ typeof(rp->SymbolicLinkReparseBuffer) *rs;
+
+ rp = ntfs_zalloc(ntfs_reparse_bytes(2 * size + 2));
+ if (!rp)
+ return ERR_PTR(-ENOMEM);
+
+ rs = &rp->SymbolicLinkReparseBuffer;
+ rp_name = rs->PathBuffer;
+
+ /* Convert link name to utf16 */
+ err = ntfs_nls_to_utf16(sbi, symname, size,
+ (struct cpu_str *)(rp_name - 1), 2 * size,
+ UTF16_LITTLE_ENDIAN);
+ if (err < 0)
+ goto out;
+
+ /* err is now the length of the symlink's unicode name. */
+ *nsize = ntfs_reparse_bytes(err);
+
+ if (*nsize > sbi->reparse.max_size) {
+ err = -EFBIG;
+ goto out;
+ }
+
+ /* Translate linux '/' into windows '\'. */
+ for (i = 0; i < err; i++) {
+ if (rp_name[i] == cpu_to_le16('/'))
+ rp_name[i] = cpu_to_le16('\\');
+ }
+
+ rp->ReparseTag = IO_REPARSE_TAG_SYMLINK;
+ rp->ReparseDataLength =
+ cpu_to_le16(*nsize - offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer));
+
+ /* PrintName + SubstituteName */
+ rs->SubstituteNameOffset = cpu_to_le16(sizeof(short) * err);
+ rs->SubstituteNameLength = cpu_to_le16(sizeof(short) * err + 8);
+ rs->PrintNameLength = rs->SubstituteNameOffset;
+
+ /*
+ * TODO: use relative path if possible to allow windows to parse this path
+ * 0 - absolute path, 1 - relative path (SYMLINK_FLAG_RELATIVE)
+ */
+ rs->Flags = 0;
+
+ memmove(rp_name + err + 4, rp_name, sizeof(short) * err);
+
+ /* decorate SubstituteName */
+ rp_name += err;
+ rp_name[0] = cpu_to_le16('\\');
+ rp_name[1] = cpu_to_le16('?');
+ rp_name[2] = cpu_to_le16('?');
+ rp_name[3] = cpu_to_le16('\\');
+
+ return rp;
+out:
+ ntfs_free(rp);
+ return ERR_PTR(err);
+}
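+
+/*
+ * Resulting PathBuffer layout, as built above (sketch): for the target
+ * "/tmp/x" the buffer holds the PrintName "\tmp\x" followed by the
+ * SubstituteName "\??\tmp\x"; SubstituteNameLength therefore equals
+ * PrintNameLength plus the 8 bytes of the "\??\" prefix.
+ */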
+
+struct inode *ntfs_create_inode(struct user_namespace *mnt_userns,
+ struct inode *dir, struct dentry *dentry,
+ const struct cpu_str *uni, umode_t mode,
+ dev_t dev, const char *symname, u32 size,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ struct super_block *sb = dir->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ const struct qstr *name = &dentry->d_name;
+ CLST ino = 0;
+ struct ntfs_inode *dir_ni = ntfs_i(dir);
+ struct ntfs_inode *ni = NULL;
+ struct inode *inode = NULL;
+ struct ATTRIB *attr;
+ struct ATTR_STD_INFO5 *std5;
+ struct ATTR_FILE_NAME *fname;
+ struct MFT_REC *rec;
+ u32 asize, dsize, sd_size;
+ enum FILE_ATTRIBUTE fa;
+ __le32 security_id = SECURITY_ID_INVALID;
+ CLST vcn;
+ const void *sd;
+ u16 t16, nsize = 0, aid = 0;
+ struct INDEX_ROOT *root, *dir_root;
+ struct NTFS_DE *e, *new_de = NULL;
+ struct REPARSE_DATA_BUFFER *rp = NULL;
+ bool rp_inserted = false;
+
+ dir_root = indx_get_root(&dir_ni->dir, dir_ni, NULL, NULL);
+ if (!dir_root)
+ return ERR_PTR(-EINVAL);
+
+ if (S_ISDIR(mode)) {
+ /* use parent's directory attributes */
+ fa = dir_ni->std_fa | FILE_ATTRIBUTE_DIRECTORY |
+ FILE_ATTRIBUTE_ARCHIVE;
+ /*
+ * By default a child directory inherits its parent's attributes.
+ * The root directory is hidden + system,
+ * so make an exception for children of the root.
+ */
+ if (dir->i_ino == MFT_REC_ROOT)
+ fa &= ~(FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM);
+ } else if (S_ISLNK(mode)) {
+ /* Ideally the link should be the same type (file/dir) as its target. */
+ fa = FILE_ATTRIBUTE_REPARSE_POINT;
+
+ /*
+ * linux: there are dirs, files, symlinks and so on.
+ * NTFS: symlinks are "dir + reparse" or "file + reparse".
+ * It would be good to create:
+ * dir + reparse if 'symname' points to a directory, or
+ * file + reparse if 'symname' points to a file.
+ * Unfortunately kern_path hangs if symname contains 'dir'.
+ */
+
+ /*
+ * struct path path;
+ *
+ * if (!kern_path(symname, LOOKUP_FOLLOW, &path)){
+ * struct inode *target = d_inode(path.dentry);
+ *
+ * if (S_ISDIR(target->i_mode))
+ * fa |= FILE_ATTRIBUTE_DIRECTORY;
+ * // if ( target->i_sb == sb ){
+ * // use relative path?
+ * // }
+ * path_put(&path);
+ * }
+ */
+ } else if (S_ISREG(mode)) {
+ if (sbi->options.sparse) {
+ /* Sparse regular file, because of the 'sparse' mount option. */
+ fa = FILE_ATTRIBUTE_SPARSE_FILE |
+ FILE_ATTRIBUTE_ARCHIVE;
+ } else if (dir_ni->std_fa & FILE_ATTRIBUTE_COMPRESSED) {
+ /* Compressed regular file, because the parent is compressed. */
+ fa = FILE_ATTRIBUTE_COMPRESSED | FILE_ATTRIBUTE_ARCHIVE;
+ } else {
+ /* regular file, default attributes */
+ fa = FILE_ATTRIBUTE_ARCHIVE;
+ }
+ } else {
+ fa = FILE_ATTRIBUTE_ARCHIVE;
+ }
+
+ if (!(mode & 0222))
+ fa |= FILE_ATTRIBUTE_READONLY;
+
+ /* allocate PATH_MAX bytes */
+ new_de = __getname();
+ if (!new_de) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ /* Mark rw ntfs as dirty. It will be cleared at umount. */
+ ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+ /* Step 1: allocate and fill new mft record */
+ err = ntfs_look_free_mft(sbi, &ino, false, NULL, NULL);
+ if (err)
+ goto out2;
+
+ ni = ntfs_new_inode(sbi, ino, fa & FILE_ATTRIBUTE_DIRECTORY);
+ if (IS_ERR(ni)) {
+ err = PTR_ERR(ni);
+ ni = NULL;
+ goto out3;
+ }
+ inode = &ni->vfs_inode;
+ inode_init_owner(mnt_userns, inode, dir, mode);
+
+ inode->i_atime = inode->i_mtime = inode->i_ctime = ni->i_crtime =
+ current_time(inode);
+
+ rec = ni->mi.mrec;
+ rec->hard_links = cpu_to_le16(1);
+ attr = Add2Ptr(rec, le16_to_cpu(rec->attr_off));
+
+ /* Get default security id */
+ sd = s_default_security;
+ sd_size = sizeof(s_default_security);
+
+ if (is_ntfs3(sbi)) {
+ security_id = dir_ni->std_security_id;
+ if (le32_to_cpu(security_id) < SECURITY_ID_FIRST) {
+ security_id = sbi->security.def_security_id;
+
+ if (security_id == SECURITY_ID_INVALID &&
+ !ntfs_insert_security(sbi, sd, sd_size,
+ &security_id, NULL))
+ sbi->security.def_security_id = security_id;
+ }
+ }
+
+ /* Insert standard info */
+ std5 = Add2Ptr(attr, SIZEOF_RESIDENT);
+
+ if (security_id == SECURITY_ID_INVALID) {
+ dsize = sizeof(struct ATTR_STD_INFO);
+ } else {
+ dsize = sizeof(struct ATTR_STD_INFO5);
+ std5->security_id = security_id;
+ ni->std_security_id = security_id;
+ }
+ asize = SIZEOF_RESIDENT + dsize;
+
+ attr->type = ATTR_STD;
+ attr->size = cpu_to_le32(asize);
+ attr->id = cpu_to_le16(aid++);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_size = cpu_to_le32(dsize);
+
+ std5->cr_time = std5->m_time = std5->c_time = std5->a_time =
+ kernel2nt(&inode->i_atime);
+
+ ni->std_fa = fa;
+ std5->fa = fa;
+
+ attr = Add2Ptr(attr, asize);
+
+ /* Insert file name */
+ err = fill_name_de(sbi, new_de, name, uni);
+ if (err)
+ goto out4;
+
+ mi_get_ref(&ni->mi, &new_de->ref);
+
+ fname = (struct ATTR_FILE_NAME *)(new_de + 1);
+ mi_get_ref(&dir_ni->mi, &fname->home);
+ fname->dup.cr_time = fname->dup.m_time = fname->dup.c_time =
+ fname->dup.a_time = std5->cr_time;
+ fname->dup.alloc_size = fname->dup.data_size = 0;
+ fname->dup.fa = std5->fa;
+ fname->dup.ea_size = fname->dup.reparse = 0;
+
+ dsize = le16_to_cpu(new_de->key_size);
+ asize = QuadAlign(SIZEOF_RESIDENT + dsize);
+
+ attr->type = ATTR_NAME;
+ attr->size = cpu_to_le32(asize);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.flags = RESIDENT_FLAG_INDEXED;
+ attr->id = cpu_to_le16(aid++);
+ attr->res.data_size = cpu_to_le32(dsize);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), fname, dsize);
+
+ attr = Add2Ptr(attr, asize);
+
+ if (security_id == SECURITY_ID_INVALID) {
+ /* Insert security attribute */
+ asize = SIZEOF_RESIDENT + QuadAlign(sd_size);
+
+ attr->type = ATTR_SECURE;
+ attr->size = cpu_to_le32(asize);
+ attr->id = cpu_to_le16(aid++);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_size = cpu_to_le32(sd_size);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), sd, sd_size);
+
+ attr = Add2Ptr(attr, asize);
+ }
+
+ if (fa & FILE_ATTRIBUTE_DIRECTORY) {
+ /*
+ * regular directory or symlink to directory
+ * Create root attribute
+ */
+ dsize = sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE);
+ asize = sizeof(I30_NAME) + SIZEOF_RESIDENT + dsize;
+
+ attr->type = ATTR_ROOT;
+ attr->size = cpu_to_le32(asize);
+ attr->id = cpu_to_le16(aid++);
+
+ attr->name_len = ARRAY_SIZE(I30_NAME);
+ attr->name_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_off =
+ cpu_to_le16(sizeof(I30_NAME) + SIZEOF_RESIDENT);
+ attr->res.data_size = cpu_to_le32(dsize);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), I30_NAME,
+ sizeof(I30_NAME));
+
+ root = Add2Ptr(attr, sizeof(I30_NAME) + SIZEOF_RESIDENT);
+ memcpy(root, dir_root, offsetof(struct INDEX_ROOT, ihdr));
+ root->ihdr.de_off =
+ cpu_to_le32(sizeof(struct INDEX_HDR)); // 0x10
+ root->ihdr.used = cpu_to_le32(sizeof(struct INDEX_HDR) +
+ sizeof(struct NTFS_DE));
+ root->ihdr.total = root->ihdr.used;
+
+ e = Add2Ptr(root, sizeof(struct INDEX_ROOT));
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+ e->flags = NTFS_IE_LAST;
+ } else if (S_ISLNK(mode)) {
+ /*
+ * symlink to file
+ * Create empty resident data attribute
+ */
+ asize = SIZEOF_RESIDENT;
+
+ /* insert empty ATTR_DATA */
+ attr->type = ATTR_DATA;
+ attr->size = cpu_to_le32(SIZEOF_RESIDENT);
+ attr->id = cpu_to_le16(aid++);
+ attr->name_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ } else {
+ /*
+ * Regular file or special node (device, fifo, socket).
+ */
+ attr->type = ATTR_DATA;
+ attr->id = cpu_to_le16(aid++);
+
+ if (S_ISREG(mode)) {
+ /* Create empty non resident data attribute */
+ attr->non_res = 1;
+ attr->nres.evcn = cpu_to_le64(-1ll);
+ if (fa & FILE_ATTRIBUTE_SPARSE_FILE) {
+ attr->size =
+ cpu_to_le32(SIZEOF_NONRESIDENT_EX + 8);
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ attr->flags = ATTR_FLAG_SPARSED;
+ asize = SIZEOF_NONRESIDENT_EX + 8;
+ } else if (fa & FILE_ATTRIBUTE_COMPRESSED) {
+ attr->size =
+ cpu_to_le32(SIZEOF_NONRESIDENT_EX + 8);
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ attr->flags = ATTR_FLAG_COMPRESSED;
+ attr->nres.c_unit = COMPRESSION_UNIT;
+ asize = SIZEOF_NONRESIDENT_EX + 8;
+ } else {
+ attr->size =
+ cpu_to_le32(SIZEOF_NONRESIDENT + 8);
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ asize = SIZEOF_NONRESIDENT + 8;
+ }
+ attr->nres.run_off = attr->name_off;
+ } else {
+ /* Create empty resident data attribute */
+ attr->size = cpu_to_le32(SIZEOF_RESIDENT);
+ attr->name_off = SIZEOF_RESIDENT_LE;
+ if (fa & FILE_ATTRIBUTE_SPARSE_FILE)
+ attr->flags = ATTR_FLAG_SPARSED;
+ else if (fa & FILE_ATTRIBUTE_COMPRESSED)
+ attr->flags = ATTR_FLAG_COMPRESSED;
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ asize = SIZEOF_RESIDENT;
+ ni->ni_flags |= NI_FLAG_RESIDENT;
+ }
+ }
+
+ if (S_ISDIR(mode)) {
+ ni->ni_flags |= NI_FLAG_DIR;
+ err = indx_init(&ni->dir, sbi, attr, INDEX_MUTEX_I30);
+ if (err)
+ goto out4;
+ } else if (S_ISLNK(mode)) {
+ rp = ntfs_create_reparse_buffer(sbi, symname, size, &nsize);
+
+ if (IS_ERR(rp)) {
+ err = PTR_ERR(rp);
+ rp = NULL;
+ goto out4;
+ }
+
+ /*
+ * Insert ATTR_REPARSE
+ */
+ attr = Add2Ptr(attr, asize);
+ attr->type = ATTR_REPARSE;
+ attr->id = cpu_to_le16(aid++);
+
+ /* resident or non resident? */
+ asize = QuadAlign(SIZEOF_RESIDENT + nsize);
+ t16 = PtrOffset(rec, attr);
+
+ if (asize + t16 + 8 > sbi->record_size) {
+ CLST alen;
+ CLST clst = bytes_to_cluster(sbi, nsize);
+
+ /* Bytes available for the packed run list. */
+ t16 = sbi->record_size - t16 - SIZEOF_NONRESIDENT;
+
+ attr->non_res = 1;
+ attr->nres.evcn = cpu_to_le64(clst - 1);
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ attr->nres.run_off = attr->name_off;
+ attr->nres.data_size = cpu_to_le64(nsize);
+ attr->nres.valid_size = attr->nres.data_size;
+ attr->nres.alloc_size =
+ cpu_to_le64(ntfs_up_cluster(sbi, nsize));
+
+ err = attr_allocate_clusters(sbi, &ni->file.run, 0, 0,
+ clst, NULL, 0, &alen, 0,
+ NULL);
+ if (err)
+ goto out5;
+
+ err = run_pack(&ni->file.run, 0, clst,
+ Add2Ptr(attr, SIZEOF_NONRESIDENT), t16,
+ &vcn);
+ if (err < 0)
+ goto out5;
+
+ if (vcn != clst) {
+ err = -EINVAL;
+ goto out5;
+ }
+
+ asize = SIZEOF_NONRESIDENT + QuadAlign(err);
+ inode->i_size = nsize;
+ } else {
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_size = cpu_to_le32(nsize);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), rp, nsize);
+ inode->i_size = nsize;
+ nsize = 0;
+ }
+
+ attr->size = cpu_to_le32(asize);
+
+ err = ntfs_insert_reparse(sbi, IO_REPARSE_TAG_SYMLINK,
+ &new_de->ref);
+ if (err)
+ goto out5;
+
+ rp_inserted = true;
+ }
+
+ attr = Add2Ptr(attr, asize);
+ attr->type = ATTR_END;
+
+ rec->used = cpu_to_le32(PtrOffset(rec, attr) + 8);
+ rec->next_attr_id = cpu_to_le16(aid);
+
+ /* Step 2: Add the new name to the index. */
+ err = indx_insert_entry(&dir_ni->dir, dir_ni, new_de, sbi, fnd);
+ if (err)
+ goto out6;
+
+ /* Update current directory record */
+ mark_inode_dirty(dir);
+
+ inode->i_generation = le16_to_cpu(rec->seq);
+
+ dir->i_mtime = dir->i_ctime = inode->i_atime;
+
+ if (S_ISDIR(mode)) {
+ if (dir->i_mode & S_ISGID)
+ mode |= S_ISGID;
+ inode->i_op = &ntfs_dir_inode_operations;
+ inode->i_fop = &ntfs_dir_operations;
+ } else if (S_ISLNK(mode)) {
+ inode->i_op = &ntfs_link_inode_operations;
+ inode->i_fop = NULL;
+ inode->i_mapping->a_ops = &ntfs_aops;
+ } else if (S_ISREG(mode)) {
+ inode->i_op = &ntfs_file_inode_operations;
+ inode->i_fop = &ntfs_file_operations;
+ inode->i_mapping->a_ops =
+ is_compressed(ni) ? &ntfs_aops_cmpr : &ntfs_aops;
+ init_rwsem(&ni->file.run_lock);
+ } else {
+ inode->i_op = &ntfs_special_inode_operations;
+ init_special_inode(inode, mode, dev);
+ }
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+ if (!S_ISLNK(mode) && (sb->s_flags & SB_POSIXACL)) {
+ err = ntfs_init_acl(mnt_userns, inode, dir);
+ if (err)
+ goto out6;
+ } else
+#endif
+ {
+ inode->i_flags |= S_NOSEC;
+ }
+
+ /* Write non resident data */
+ if (nsize) {
+ err = ntfs_sb_write_run(sbi, &ni->file.run, 0, rp, nsize);
+ if (err)
+ goto out7;
+ }
+
+ /* call 'd_instantiate' after inode->i_op is set but before finish_open */
+ d_instantiate(dentry, inode);
+
+ ntfs_save_wsl_perm(inode);
+ mark_inode_dirty(inode);
+ mark_inode_dirty(dir);
+
+ /* normal exit */
+ goto out2;
+
+out7:
+
+ /* undo 'indx_insert_entry' */
+ indx_delete_entry(&dir_ni->dir, dir_ni, new_de + 1,
+ le16_to_cpu(new_de->key_size), sbi);
+out6:
+ if (rp_inserted)
+ ntfs_remove_reparse(sbi, IO_REPARSE_TAG_SYMLINK, &new_de->ref);
+
+out5:
+ if (S_ISDIR(mode) || run_is_empty(&ni->file.run))
+ goto out4;
+
+ run_deallocate(sbi, &ni->file.run, false);
+
+out4:
+ clear_rec_inuse(rec);
+ clear_nlink(inode);
+ ni->mi.dirty = false;
+ discard_new_inode(inode);
+out3:
+ ntfs_mark_rec_free(sbi, ino);
+
+out2:
+ __putname(new_de);
+ ntfs_free(rp);
+
+out1:
+ if (err)
+ return ERR_PTR(err);
+
+ unlock_new_inode(inode);
+
+ return inode;
+}
+
+int ntfs_link_inode(struct inode *inode, struct dentry *dentry)
+{
+ int err;
+ struct inode *dir = d_inode(dentry->d_parent);
+ struct ntfs_inode *dir_ni = ntfs_i(dir);
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ const struct qstr *name = &dentry->d_name;
+ struct NTFS_DE *new_de = NULL;
+ struct ATTR_FILE_NAME *fname;
+ struct ATTRIB *attr;
+ u16 key_size;
+ struct INDEX_ROOT *dir_root;
+
+ dir_root = indx_get_root(&dir_ni->dir, dir_ni, NULL, NULL);
+ if (!dir_root)
+ return -EINVAL;
+
+ /* allocate PATH_MAX bytes */
+ new_de = __getname();
+ if (!new_de)
+ return -ENOMEM;
+
+ /* Mark rw ntfs as dirty. It will be cleared at umount. */
+ ntfs_set_state(ni->mi.sbi, NTFS_DIRTY_DIRTY);
+
+ // Insert file name
+ err = fill_name_de(sbi, new_de, name, NULL);
+ if (err)
+ goto out;
+
+ key_size = le16_to_cpu(new_de->key_size);
+ err = ni_insert_resident(ni, key_size, ATTR_NAME, NULL, 0, &attr, NULL);
+ if (err)
+ goto out;
+
+ mi_get_ref(&ni->mi, &new_de->ref);
+
+ fname = (struct ATTR_FILE_NAME *)(new_de + 1);
+ mi_get_ref(&dir_ni->mi, &fname->home);
+ fname->dup.cr_time = fname->dup.m_time = fname->dup.c_time =
+ fname->dup.a_time = kernel2nt(&inode->i_ctime);
+ fname->dup.alloc_size = fname->dup.data_size = 0;
+ fname->dup.fa = ni->std_fa;
+ fname->dup.ea_size = fname->dup.reparse = 0;
+
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), fname, key_size);
+
+ err = indx_insert_entry(&dir_ni->dir, dir_ni, new_de, sbi, NULL);
+ if (err)
+ goto out;
+
+ le16_add_cpu(&ni->mi.mrec->hard_links, 1);
+ ni->mi.dirty = true;
+
+out:
+ __putname(new_de);
+ return err;
+}
+
+/*
+ * ntfs_unlink_inode
+ *
+ * inode_operations::unlink
+ * inode_operations::rmdir
+ */
+int ntfs_unlink_inode(struct inode *dir, const struct dentry *dentry)
+{
+ int err;
+ struct super_block *sb = dir->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct inode *inode = d_inode(dentry);
+ struct ntfs_inode *ni = ntfs_i(inode);
+ const struct qstr *name = &dentry->d_name;
+ struct ntfs_inode *dir_ni = ntfs_i(dir);
+ struct ntfs_index *indx = &dir_ni->dir;
+ struct cpu_str *uni = NULL;
+ struct ATTR_FILE_NAME *fname;
+ u8 name_type;
+ struct ATTR_LIST_ENTRY *le;
+ struct MFT_REF ref;
+ bool is_dir = S_ISDIR(inode->i_mode);
+ struct INDEX_ROOT *dir_root;
+
+ dir_root = indx_get_root(indx, dir_ni, NULL, NULL);
+ if (!dir_root)
+ return -EINVAL;
+
+ ni_lock(ni);
+
+ if (is_dir && !dir_is_empty(inode)) {
+ err = -ENOTEMPTY;
+ goto out1;
+ }
+
+ if (ntfs_is_meta_file(sbi, inode->i_ino)) {
+ err = -EINVAL;
+ goto out1;
+ }
+
+ /* allocate PATH_MAX bytes */
+ uni = __getname();
+ if (!uni) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ /* Convert input string to unicode */
+ err = ntfs_nls_to_utf16(sbi, name->name, name->len, uni, NTFS_NAME_LEN,
+ UTF16_HOST_ENDIAN);
+ if (err < 0)
+ goto out2;
+
+ /* Mark rw ntfs as dirty. It will be cleared at umount. */
+ ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+ /* find name in record */
+ mi_get_ref(&dir_ni->mi, &ref);
+
+ le = NULL;
+ fname = ni_fname_name(ni, uni, &ref, &le);
+ if (!fname) {
+ err = -ENOENT;
+ goto out3;
+ }
+
+ name_type = paired_name(fname->type);
+
+ err = indx_delete_entry(indx, dir_ni, fname, fname_full_size(fname),
+ sbi);
+ if (err)
+ goto out3;
+
+ /* Then remove name from mft */
+ ni_remove_attr_le(ni, attr_from_name(fname), le);
+
+ le16_add_cpu(&ni->mi.mrec->hard_links, -1);
+ ni->mi.dirty = true;
+
+ if (name_type != FILE_NAME_POSIX) {
+ /* Now we should delete name by type */
+ fname = ni_fname_type(ni, name_type, &le);
+ if (fname) {
+ err = indx_delete_entry(indx, dir_ni, fname,
+ fname_full_size(fname), sbi);
+ if (err)
+ goto out3;
+
+ ni_remove_attr_le(ni, attr_from_name(fname), le);
+
+ le16_add_cpu(&ni->mi.mrec->hard_links, -1);
+ }
+ }
+out3:
+ switch (err) {
+ case 0:
+ drop_nlink(inode);
+ fallthrough;
+ case -ENOTEMPTY:
+ case -ENOSPC:
+ case -EROFS:
+ break;
+ default:
+ make_bad_inode(inode);
+ }
+
+ dir->i_mtime = dir->i_ctime = current_time(dir);
+ mark_inode_dirty(dir);
+ inode->i_ctime = dir->i_ctime;
+ if (inode->i_nlink)
+ mark_inode_dirty(inode);
+
+out2:
+ __putname(uni);
+out1:
+ ni_unlock(ni);
+ return err;
+}
+
+void ntfs_evict_inode(struct inode *inode)
+{
+ truncate_inode_pages_final(&inode->i_data);
+
+ if (inode->i_nlink)
+ _ni_write_inode(inode, inode_needs_sync(inode));
+
+ invalidate_inode_buffers(inode);
+ clear_inode(inode);
+
+ ni_clear(ntfs_i(inode));
+}
+
+static noinline int ntfs_readlink_hlp(struct inode *inode, char *buffer,
+ int buflen)
+{
+ int i, err = 0;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ u64 i_size = inode->i_size;
+ u16 nlen = 0;
+ void *to_free = NULL;
+ struct REPARSE_DATA_BUFFER *rp;
+ struct le_str *uni;
+ struct ATTRIB *attr;
+
+ /* Reparse data present. Try to parse it */
+ static_assert(!offsetof(struct REPARSE_DATA_BUFFER, ReparseTag));
+ static_assert(sizeof(u32) == sizeof(rp->ReparseTag));
+
+ *buffer = 0;
+
+ /* Read into a temporary buffer */
+ if (i_size > sbi->reparse.max_size || i_size <= sizeof(u32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_REPARSE, NULL, 0, NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!attr->non_res) {
+ rp = resident_data_ex(attr, i_size);
+ if (!rp) {
+ err = -EINVAL;
+ goto out;
+ }
+ } else {
+ rp = ntfs_malloc(i_size);
+ if (!rp) {
+ err = -ENOMEM;
+ goto out;
+ }
+ to_free = rp;
+ err = ntfs_read_run_nb(sbi, &ni->file.run, 0, rp, i_size, NULL);
+ if (err)
+ goto out;
+ }
+
+ err = -EINVAL;
+
+ /* Microsoft Tag */
+ switch (rp->ReparseTag) {
+ case IO_REPARSE_TAG_MOUNT_POINT:
+ /* Mount points and junctions */
+ /* Can we use 'Rp->MountPointReparseBuffer.PrintNameLength'? */
+ if (i_size <= offsetof(struct REPARSE_DATA_BUFFER,
+ MountPointReparseBuffer.PathBuffer))
+ goto out;
+ uni = Add2Ptr(rp,
+ offsetof(struct REPARSE_DATA_BUFFER,
+ MountPointReparseBuffer.PathBuffer) +
+ le16_to_cpu(rp->MountPointReparseBuffer
+ .PrintNameOffset) -
+ 2);
+ nlen = le16_to_cpu(rp->MountPointReparseBuffer.PrintNameLength);
+ break;
+
+ case IO_REPARSE_TAG_SYMLINK:
+ /* FolderSymbolicLink */
+ /* Can we use 'Rp->SymbolicLinkReparseBuffer.PrintNameLength'? */
+ if (i_size <= offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer.PathBuffer))
+ goto out;
+ uni = Add2Ptr(rp,
+ offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer.PathBuffer) +
+ le16_to_cpu(rp->SymbolicLinkReparseBuffer
+ .PrintNameOffset) -
+ 2);
+ nlen = le16_to_cpu(
+ rp->SymbolicLinkReparseBuffer.PrintNameLength);
+ break;
+
+ case IO_REPARSE_TAG_CLOUD:
+ case IO_REPARSE_TAG_CLOUD_1:
+ case IO_REPARSE_TAG_CLOUD_2:
+ case IO_REPARSE_TAG_CLOUD_3:
+ case IO_REPARSE_TAG_CLOUD_4:
+ case IO_REPARSE_TAG_CLOUD_5:
+ case IO_REPARSE_TAG_CLOUD_6:
+ case IO_REPARSE_TAG_CLOUD_7:
+ case IO_REPARSE_TAG_CLOUD_8:
+ case IO_REPARSE_TAG_CLOUD_9:
+ case IO_REPARSE_TAG_CLOUD_A:
+ case IO_REPARSE_TAG_CLOUD_B:
+ case IO_REPARSE_TAG_CLOUD_C:
+ case IO_REPARSE_TAG_CLOUD_D:
+ case IO_REPARSE_TAG_CLOUD_E:
+ case IO_REPARSE_TAG_CLOUD_F:
+ err = sizeof("OneDrive") - 1;
+ if (err > buflen)
+ err = buflen;
+ memcpy(buffer, "OneDrive", err);
+ goto out;
+
+ default:
+ if (IsReparseTagMicrosoft(rp->ReparseTag)) {
+ /* unknown Microsoft Tag */
+ goto out;
+ }
+ if (!IsReparseTagNameSurrogate(rp->ReparseTag) ||
+ i_size <= sizeof(struct REPARSE_POINT)) {
+ goto out;
+ }
+
+ /* Users tag */
+ uni = Add2Ptr(rp, sizeof(struct REPARSE_POINT) - 2);
+ nlen = le16_to_cpu(rp->ReparseDataLength) -
+ sizeof(struct REPARSE_POINT);
+ }
+
+ /* Convert nlen from bytes to UNICODE chars */
+ nlen >>= 1;
+
+ /* Check that name is available */
+ if (!nlen || &uni->name[nlen] > (__le16 *)Add2Ptr(rp, i_size))
+ goto out;
+
+ /* If name is already zero terminated then truncate it now */
+ if (!uni->name[nlen - 1])
+ nlen -= 1;
+ uni->len = nlen;
+
+ err = ntfs_utf16_to_nls(sbi, uni, buffer, buflen);
+
+ if (err < 0)
+ goto out;
+
+ /* translate windows '\' into linux '/' */
+ for (i = 0; i < err; i++) {
+ if (buffer[i] == '\\')
+ buffer[i] = '/';
+ }
+
+ /* Always set last zero */
+ buffer[err] = 0;
+out:
+ ntfs_free(to_free);
+ return err;
+}
+
+static const char *ntfs_get_link(struct dentry *de, struct inode *inode,
+ struct delayed_call *done)
+{
+ int err;
+ char *ret;
+
+ if (!de)
+ return ERR_PTR(-ECHILD);
+
+ ret = kmalloc(PAGE_SIZE, GFP_NOFS);
+ if (!ret)
+ return ERR_PTR(-ENOMEM);
+
+ err = ntfs_readlink_hlp(inode, ret, PAGE_SIZE);
+ if (err < 0) {
+ kfree(ret);
+ return ERR_PTR(err);
+ }
+
+ set_delayed_call(done, kfree_link, ret);
+
+ return ret;
+}
+
+// clang-format off
+const struct inode_operations ntfs_link_inode_operations = {
+ .get_link = ntfs_get_link,
+ .setattr = ntfs3_setattr,
+ .listxattr = ntfs_listxattr,
+ .permission = ntfs_permission,
+ .get_acl = ntfs_get_acl,
+ .set_acl = ntfs_set_acl,
+};
+
+const struct address_space_operations ntfs_aops = {
+ .readpage = ntfs_readpage,
+ .readahead = ntfs_readahead,
+ .writepage = ntfs_writepage,
+ .writepages = ntfs_writepages,
+ .write_begin = ntfs_write_begin,
+ .write_end = ntfs_write_end,
+ .direct_IO = ntfs_direct_IO,
+ .bmap = ntfs_bmap,
+ .set_page_dirty = __set_page_dirty_buffers,
+};
+
+const struct address_space_operations ntfs_aops_cmpr = {
+ .readpage = ntfs_readpage,
+ .readahead = ntfs_readahead,
+};
+// clang-format on
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
new file mode 100644
index 000000000000..6be13e256c1a
--- /dev/null
+++ b/fs/ntfs3/super.c
@@ -0,0 +1,1504 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ *
+ * terminology
+ *
+ * cluster - allocation unit - 512,1K,2K,4K,...,2M
+ * vcn - virtual cluster number - offset inside the file in clusters
+ * vbo - virtual byte offset - offset inside the file in bytes
+ * lcn - logical cluster number - 0 based cluster in clusters heap
+ * lbo - logical byte offset - absolute position inside volume
+ * run - maps vcn to lcn - stored in attributes in packed form
+ * attr - attribute segment - std/name/data etc records inside MFT
+ * mi - mft inode - one MFT record(usually 1024 bytes or 4K), consists of attributes
+ * ni - ntfs inode - extends linux inode. consists of one or more mft inodes
+ * index - unit inside directory - 2K, 4K, <=page size, does not depend on cluster size
+ *
+ * WSL - Windows Subsystem for Linux
+ * https://docs.microsoft.com/en-us/windows/wsl/file-permissions
+ * It stores uid/gid/mode/dev in xattr
+ *
+ */
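+
+/*
+ * Worked example (illustrative, not part of the original terminology
+ * list): with a 4K cluster (cluster_bits == 12), a file whose run maps
+ * vcn 0..7 -> lcn 100..107 resolves vbo 0x3000 as vcn = 0x3000 >> 12 = 3,
+ * hence lcn = 103 and lbo = 103 << 12 = 0x67000.
+ */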
+
+#include <linux/backing-dev.h>
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/exportfs.h>
+#include <linux/fs.h>
+#include <linux/iversion.h>
+#include <linux/module.h>
+#include <linux/nls.h>
+#include <linux/parser.h>
+#include <linux/seq_file.h>
+#include <linux/statfs.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+#include "lib/lib.h"
+#endif
+
+#ifdef CONFIG_PRINTK
+/*
+ * Trace warnings/notices/errors
+ * Thanks Joe Perches <joe(a)perches.com> for implementation
+ */
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...)
+{
+ struct va_format vaf;
+ va_list args;
+ int level;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+
+ /* Should we use different ratelimits for warnings/notices/errors? */
+ if (!___ratelimit(&sbi->msg_ratelimit, "ntfs3"))
+ return;
+
+ va_start(args, fmt);
+
+ level = printk_get_level(fmt);
+ vaf.fmt = printk_skip_level(fmt);
+ vaf.va = &args;
+ printk("%c%cntfs3: %s: %pV\n", KERN_SOH_ASCII, level, sb->s_id, &vaf);
+
+ va_end(args);
+}
+
+static char s_name_buf[512];
+static atomic_t s_name_buf_cnt = ATOMIC_INIT(1); // 1 means 'free s_name_buf'
+
+/* print warnings/notices/errors about inode using name or inode number */
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ char *name;
+ va_list args;
+ struct va_format vaf;
+ int level;
+
+ if (!___ratelimit(&sbi->msg_ratelimit, "ntfs3"))
+ return;
+
+ /* Use a statically allocated buffer, if possible */
+ name = atomic_dec_and_test(&s_name_buf_cnt)
+ ? s_name_buf
+ : kmalloc(sizeof(s_name_buf), GFP_NOFS);
+
+ if (name) {
+ struct dentry *de = d_find_alias(inode);
+ const u32 name_len = ARRAY_SIZE(s_name_buf) - 1;
+
+ if (de) {
+ spin_lock(&de->d_lock);
+ snprintf(name, name_len, " \"%s\"", de->d_name.name);
+ spin_unlock(&de->d_lock);
+ name[name_len] = 0; /* to be sure */
+ } else {
+ name[0] = 0;
+ }
+ dput(de); /* cocci warns if placed in branch "if (de)" */
+ }
+
+ va_start(args, fmt);
+
+ level = printk_get_level(fmt);
+ vaf.fmt = printk_skip_level(fmt);
+ vaf.va = &args;
+
+ printk("%c%cntfs3: %s: ino=%lx,%s %pV\n", KERN_SOH_ASCII, level,
+ sb->s_id, inode->i_ino, name ? name : "", &vaf);
+
+ va_end(args);
+
+ atomic_inc(&s_name_buf_cnt);
+ if (name != s_name_buf)
+ kfree(name);
+}
+#endif
+
+/*
+ * Shared memory struct.
+ *
+ * The on-disk ntfs upcase table is created by the ntfs formatter.
+ * The 'upcase' table is 128K bytes of memory.
+ * We should read it into memory when mounting.
+ * Several ntfs volumes likely use the same 'upcase' table.
+ * It is a good idea to share the in-memory 'upcase' table between volumes.
+ * Unfortunately winxp/vista/win7 use different upcase tables.
+ */
+static DEFINE_SPINLOCK(s_shared_lock);
+
+static struct {
+ void *ptr;
+ u32 len;
+ int cnt;
+} s_shared[8];
+
+/*
+ * ntfs_set_shared
+ *
+ * Returns 'ptr' if pointer was saved in shared memory
+ * Returns NULL if pointer was not shared
+ */
+void *ntfs_set_shared(void *ptr, u32 bytes)
+{
+ void *ret = NULL;
+ int i, j = -1;
+
+ spin_lock(&s_shared_lock);
+ for (i = 0; i < ARRAY_SIZE(s_shared); i++) {
+ if (!s_shared[i].cnt) {
+ j = i;
+ } else if (bytes == s_shared[i].len &&
+ !memcmp(s_shared[i].ptr, ptr, bytes)) {
+ s_shared[i].cnt += 1;
+ ret = s_shared[i].ptr;
+ break;
+ }
+ }
+
+ if (!ret && j != -1) {
+ s_shared[j].ptr = ptr;
+ s_shared[j].len = bytes;
+ s_shared[j].cnt = 1;
+ ret = ptr;
+ }
+ spin_unlock(&s_shared_lock);
+
+ return ret;
+}
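+
+/*
+ * Usage sketch (illustrative; mirrors the upcase handling later in
+ * ntfs_fill_super): try to share a freshly read table and free the
+ * private copy when an identical one is already registered:
+ *
+ *   shared = ntfs_set_shared(upcase, bytes);
+ *   if (shared && shared != upcase) {
+ *           sbi->upcase = shared;
+ *           ntfs_vfree(upcase);
+ *   }
+ */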
+
+/*
+ * ntfs_put_shared
+ *
+ * Returns 'ptr' if pointer is not shared anymore
+ * Returns NULL if pointer is still shared
+ */
+void *ntfs_put_shared(void *ptr)
+{
+ void *ret = ptr;
+ int i;
+
+ spin_lock(&s_shared_lock);
+ for (i = 0; i < ARRAY_SIZE(s_shared); i++) {
+ if (s_shared[i].cnt && s_shared[i].ptr == ptr) {
+ if (--s_shared[i].cnt)
+ ret = NULL;
+ break;
+ }
+ }
+ spin_unlock(&s_shared_lock);
+
+ return ret;
+}
+
+static inline void clear_mount_options(struct ntfs_mount_options *options)
+{
+ unload_nls(options->nls);
+}
+
+enum Opt {
+ Opt_uid,
+ Opt_gid,
+ Opt_umask,
+ Opt_dmask,
+ Opt_fmask,
+ Opt_immutable,
+ Opt_discard,
+ Opt_force,
+ Opt_sparse,
+ Opt_nohidden,
+ Opt_showmeta,
+ Opt_acl,
+ Opt_noatime,
+ Opt_nls,
+ Opt_prealloc,
+ Opt_no_acs_rules,
+ Opt_err,
+};
+
+static const match_table_t ntfs_tokens = {
+ { Opt_uid, "uid=%u" },
+ { Opt_gid, "gid=%u" },
+ { Opt_umask, "umask=%o" },
+ { Opt_dmask, "dmask=%o" },
+ { Opt_fmask, "fmask=%o" },
+ { Opt_immutable, "sys_immutable" },
+ { Opt_discard, "discard" },
+ { Opt_force, "force" },
+ { Opt_sparse, "sparse" },
+ { Opt_nohidden, "nohidden" },
+ { Opt_acl, "acl" },
+ { Opt_noatime, "noatime" },
+ { Opt_showmeta, "showmeta" },
+ { Opt_nls, "nls=%s" },
+ { Opt_prealloc, "prealloc" },
+ { Opt_no_acs_rules, "no_acs_rules" },
+ { Opt_err, NULL },
+};
+
+static noinline int ntfs_parse_options(struct super_block *sb, char *options,
+ int silent,
+ struct ntfs_mount_options *opts)
+{
+ char *p;
+ substring_t args[MAX_OPT_ARGS];
+ int option;
+ char nls_name[30];
+ struct nls_table *nls;
+
+ opts->fs_uid = current_uid();
+ opts->fs_gid = current_gid();
+ opts->fs_fmask_inv = opts->fs_dmask_inv = ~current_umask();
+ nls_name[0] = 0;
+
+ if (!options)
+ goto out;
+
+ while ((p = strsep(&options, ","))) {
+ int token;
+
+ if (!*p)
+ continue;
+
+ token = match_token(p, ntfs_tokens, args);
+ switch (token) {
+ case Opt_immutable:
+ opts->sys_immutable = 1;
+ break;
+ case Opt_uid:
+ if (match_int(&args[0], &option))
+ return -EINVAL;
+ opts->fs_uid = make_kuid(current_user_ns(), option);
+ if (!uid_valid(opts->fs_uid))
+ return -EINVAL;
+ opts->uid = 1;
+ break;
+ case Opt_gid:
+ if (match_int(&args[0], &option))
+ return -EINVAL;
+ opts->fs_gid = make_kgid(current_user_ns(), option);
+ if (!gid_valid(opts->fs_gid))
+ return -EINVAL;
+ opts->gid = 1;
+ break;
+ case Opt_umask:
+ if (match_octal(&args[0], &option))
+ return -EINVAL;
+ opts->fs_fmask_inv = opts->fs_dmask_inv = ~option;
+ opts->fmask = opts->dmask = 1;
+ break;
+ case Opt_dmask:
+ if (match_octal(&args[0], &option))
+ return -EINVAL;
+ opts->fs_dmask_inv = ~option;
+ opts->dmask = 1;
+ break;
+ case Opt_fmask:
+ if (match_octal(&args[0], &option))
+ return -EINVAL;
+ opts->fs_fmask_inv = ~option;
+ opts->fmask = 1;
+ break;
+ case Opt_discard:
+ opts->discard = 1;
+ break;
+ case Opt_force:
+ opts->force = 1;
+ break;
+ case Opt_sparse:
+ opts->sparse = 1;
+ break;
+ case Opt_nohidden:
+ opts->nohidden = 1;
+ break;
+ case Opt_acl:
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+ sb->s_flags |= SB_POSIXACL;
+ break;
+#else
+ ntfs_err(sb, "support for ACL not compiled in!");
+ return -EINVAL;
+#endif
+ case Opt_noatime:
+ sb->s_flags |= SB_NOATIME;
+ break;
+ case Opt_showmeta:
+ opts->showmeta = 1;
+ break;
+ case Opt_nls:
+ match_strlcpy(nls_name, &args[0], sizeof(nls_name));
+ break;
+ case Opt_prealloc:
+ opts->prealloc = 1;
+ break;
+ case Opt_no_acs_rules:
+ opts->no_acs_rules = 1;
+ break;
+ default:
+ if (!silent)
+ ntfs_err(
+ sb,
+ "Unrecognized mount option \"%s\" or missing value",
+ p);
+ //return -EINVAL;
+ }
+ }
+
+out:
+ if (!strcmp(nls_name[0] ? nls_name : CONFIG_NLS_DEFAULT, "utf8")) {
+ /* For UTF-8 use utf16s_to_utf8s/utf8s_to_utf16s instead of nls */
+ nls = NULL;
+ } else if (nls_name[0]) {
+ nls = load_nls(nls_name);
+ if (!nls) {
+ ntfs_err(sb, "failed to load \"%s\"", nls_name);
+ return -EINVAL;
+ }
+ } else {
+ nls = load_nls_default();
+ if (!nls) {
+ ntfs_err(sb, "failed to load default nls");
+ return -EINVAL;
+ }
+ }
+ opts->nls = nls;
+
+ return 0;
+}
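+
+/*
+ * Example (illustrative): parsing "uid=1000,fmask=0177,dmask=077,nls=utf8"
+ * sets opts->fs_uid to kuid 1000, opts->fs_fmask_inv to ~0177 and
+ * opts->fs_dmask_inv to ~077, and leaves opts->nls == NULL because utf8
+ * is handled via utf16s_to_utf8s/utf8s_to_utf16s instead of an nls table.
+ */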
+
+static int ntfs_remount(struct super_block *sb, int *flags, char *data)
+{
+ int err, ro_rw;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_mount_options old_opts;
+ char *orig_data = kstrdup(data, GFP_KERNEL);
+
+ if (data && !orig_data)
+ return -ENOMEM;
+
+ /* Store original options */
+ memcpy(&old_opts, &sbi->options, sizeof(old_opts));
+ clear_mount_options(&sbi->options);
+ memset(&sbi->options, 0, sizeof(sbi->options));
+
+ err = ntfs_parse_options(sb, data, 0, &sbi->options);
+ if (err)
+ goto restore_opts;
+
+ ro_rw = sb_rdonly(sb) && !(*flags & SB_RDONLY);
+ if (ro_rw && (sbi->flags & NTFS_FLAGS_NEED_REPLAY)) {
+ ntfs_warn(
+ sb,
+ "Couldn't remount rw because journal is not replayed. Please umount/remount instead\n");
+ err = -EINVAL;
+ goto restore_opts;
+ }
+
+ sync_filesystem(sb);
+
+ if (ro_rw && (sbi->volume.flags & VOLUME_FLAG_DIRTY) &&
+ !sbi->options.force) {
+ ntfs_warn(sb, "volume is dirty and \"force\" flag is not set!");
+ err = -EINVAL;
+ goto restore_opts;
+ }
+
+ clear_mount_options(&old_opts);
+
+ *flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME) |
+ SB_NODIRATIME | SB_NOATIME;
+ ntfs_info(sb, "re-mounted. Opts: %s", orig_data);
+ err = 0;
+ goto out;
+
+restore_opts:
+ clear_mount_options(&sbi->options);
+ memcpy(&sbi->options, &old_opts, sizeof(old_opts));
+
+out:
+ kfree(orig_data);
+ return err;
+}
+
+static struct kmem_cache *ntfs_inode_cachep;
+
+static struct inode *ntfs_alloc_inode(struct super_block *sb)
+{
+ struct ntfs_inode *ni = kmem_cache_alloc(ntfs_inode_cachep, GFP_NOFS);
+
+ if (!ni)
+ return NULL;
+
+ memset(ni, 0, offsetof(struct ntfs_inode, vfs_inode));
+
+ mutex_init(&ni->ni_lock);
+
+ return &ni->vfs_inode;
+}
+
+static void ntfs_i_callback(struct rcu_head *head)
+{
+ struct inode *inode = container_of(head, struct inode, i_rcu);
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ mutex_destroy(&ni->ni_lock);
+
+ kmem_cache_free(ntfs_inode_cachep, ni);
+}
+
+static void ntfs_destroy_inode(struct inode *inode)
+{
+ call_rcu(&inode->i_rcu, ntfs_i_callback);
+}
+
+static void init_once(void *foo)
+{
+ struct ntfs_inode *ni = foo;
+
+ inode_init_once(&ni->vfs_inode);
+}
+
+/* noinline to reduce binary size */
+static noinline void put_ntfs(struct ntfs_sb_info *sbi)
+{
+ ntfs_free(sbi->new_rec);
+ ntfs_vfree(ntfs_put_shared(sbi->upcase));
+ ntfs_free(sbi->def_table);
+
+ wnd_close(&sbi->mft.bitmap);
+ wnd_close(&sbi->used.bitmap);
+
+ if (sbi->mft.ni)
+ iput(&sbi->mft.ni->vfs_inode);
+
+ if (sbi->security.ni)
+ iput(&sbi->security.ni->vfs_inode);
+
+ if (sbi->reparse.ni)
+ iput(&sbi->reparse.ni->vfs_inode);
+
+ if (sbi->objid.ni)
+ iput(&sbi->objid.ni->vfs_inode);
+
+ if (sbi->volume.ni)
+ iput(&sbi->volume.ni->vfs_inode);
+
+ ntfs_update_mftmirr(sbi, 0);
+
+ indx_clear(&sbi->security.index_sii);
+ indx_clear(&sbi->security.index_sdh);
+ indx_clear(&sbi->reparse.index_r);
+ indx_clear(&sbi->objid.index_o);
+ ntfs_free(sbi->compress.lznt);
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ xpress_free_decompressor(sbi->compress.xpress);
+ lzx_free_decompressor(sbi->compress.lzx);
+#endif
+ clear_mount_options(&sbi->options);
+
+ ntfs_free(sbi);
+}
+
+static void ntfs_put_super(struct super_block *sb)
+{
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+
+ /* Mark rw ntfs as clean, if possible */
+ ntfs_set_state(sbi, NTFS_DIRTY_CLEAR);
+
+ put_ntfs(sbi);
+
+ sync_blockdev(sb->s_bdev);
+}
+
+static int ntfs_statfs(struct dentry *dentry, struct kstatfs *buf)
+{
+ struct super_block *sb = dentry->d_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+ buf->f_type = sb->s_magic;
+ buf->f_bsize = sbi->cluster_size;
+ buf->f_blocks = wnd->nbits;
+
+ buf->f_bfree = buf->f_bavail = wnd_zeroes(wnd);
+ buf->f_fsid.val[0] = sbi->volume.ser_num;
+ buf->f_fsid.val[1] = (sbi->volume.ser_num >> 32);
+ buf->f_namelen = NTFS_NAME_LEN;
+
+ return 0;
+}
+
+static int ntfs_show_options(struct seq_file *m, struct dentry *root)
+{
+ struct super_block *sb = root->d_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_mount_options *opts = &sbi->options;
+ struct user_namespace *user_ns = seq_user_ns(m);
+
+ if (opts->uid)
+ seq_printf(m, ",uid=%u",
+ from_kuid_munged(user_ns, opts->fs_uid));
+ if (opts->gid)
+ seq_printf(m, ",gid=%u",
+ from_kgid_munged(user_ns, opts->fs_gid));
+ if (opts->fmask)
+ seq_printf(m, ",fmask=%04o", ~opts->fs_fmask_inv);
+ if (opts->dmask)
+ seq_printf(m, ",dmask=%04o", ~opts->fs_dmask_inv);
+ if (opts->nls)
+ seq_printf(m, ",nls=%s", opts->nls->charset);
+ else
+ seq_puts(m, ",nls=utf8");
+ if (opts->sys_immutable)
+ seq_puts(m, ",sys_immutable");
+ if (opts->discard)
+ seq_puts(m, ",discard");
+ if (opts->sparse)
+ seq_puts(m, ",sparse");
+ if (opts->showmeta)
+ seq_puts(m, ",showmeta");
+ if (opts->nohidden)
+ seq_puts(m, ",nohidden");
+ if (opts->force)
+ seq_puts(m, ",force");
+ if (opts->no_acs_rules)
+ seq_puts(m, ",no_acs_rules");
+ if (opts->prealloc)
+ seq_puts(m, ",prealloc");
+ if (sb->s_flags & SB_POSIXACL)
+ seq_puts(m, ",acl");
+ if (sb->s_flags & SB_NOATIME)
+ seq_puts(m, ",noatime");
+
+ return 0;
+}
+
+/*super_operations::sync_fs*/
+static int ntfs_sync_fs(struct super_block *sb, int wait)
+{
+ int err = 0, err2;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni;
+ struct inode *inode;
+
+ ni = sbi->security.ni;
+ if (ni) {
+ inode = &ni->vfs_inode;
+ err2 = _ni_write_inode(inode, wait);
+ if (err2 && !err)
+ err = err2;
+ }
+
+ ni = sbi->objid.ni;
+ if (ni) {
+ inode = &ni->vfs_inode;
+ err2 = _ni_write_inode(inode, wait);
+ if (err2 && !err)
+ err = err2;
+ }
+
+ ni = sbi->reparse.ni;
+ if (ni) {
+ inode = &ni->vfs_inode;
+ err2 = _ni_write_inode(inode, wait);
+ if (err2 && !err)
+ err = err2;
+ }
+
+ if (!err)
+ ntfs_set_state(sbi, NTFS_DIRTY_CLEAR);
+
+ ntfs_update_mftmirr(sbi, wait);
+
+ return err;
+}
+
+static const struct super_operations ntfs_sops = {
+ .alloc_inode = ntfs_alloc_inode,
+ .destroy_inode = ntfs_destroy_inode,
+ .evict_inode = ntfs_evict_inode,
+ .put_super = ntfs_put_super,
+ .statfs = ntfs_statfs,
+ .show_options = ntfs_show_options,
+ .sync_fs = ntfs_sync_fs,
+ .remount_fs = ntfs_remount,
+ .write_inode = ntfs3_write_inode,
+};
+
+static struct inode *ntfs_export_get_inode(struct super_block *sb, u64 ino,
+ u32 generation)
+{
+ struct MFT_REF ref;
+ struct inode *inode;
+
+ ref.low = cpu_to_le32(ino);
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ ref.high = cpu_to_le16(ino >> 32);
+#else
+ ref.high = 0;
+#endif
+ ref.seq = cpu_to_le16(generation);
+
+ inode = ntfs_iget5(sb, &ref, NULL);
+ if (!IS_ERR(inode) && is_bad_inode(inode)) {
+ iput(inode);
+ inode = ERR_PTR(-ESTALE);
+ }
+
+ return inode;
+}
+
+static struct dentry *ntfs_fh_to_dentry(struct super_block *sb, struct fid *fid,
+ int fh_len, int fh_type)
+{
+ return generic_fh_to_dentry(sb, fid, fh_len, fh_type,
+ ntfs_export_get_inode);
+}
+
+static struct dentry *ntfs_fh_to_parent(struct super_block *sb, struct fid *fid,
+ int fh_len, int fh_type)
+{
+ return generic_fh_to_parent(sb, fid, fh_len, fh_type,
+ ntfs_export_get_inode);
+}
+
+/* TODO: == ntfs_sync_inode */
+static int ntfs_nfs_commit_metadata(struct inode *inode)
+{
+ return _ni_write_inode(inode, 1);
+}
+
+static const struct export_operations ntfs_export_ops = {
+ .fh_to_dentry = ntfs_fh_to_dentry,
+ .fh_to_parent = ntfs_fh_to_parent,
+ .get_parent = ntfs3_get_parent,
+ .commit_metadata = ntfs_nfs_commit_metadata,
+};
+
+/* Returns Gb,Mb to print with "%u.%02u Gb" */
+static u32 format_size_gb(const u64 bytes, u32 *mb)
+{
+ /* Do simple right 30 bit shift of 64 bit value */
+ u64 kbytes = bytes >> 10;
+ u32 kbytes32 = kbytes;
+
+ *mb = (100 * (kbytes32 & 0xfffff) + 0x7ffff) >> 20;
+ if (*mb >= 100)
+ *mb = 99;
+
+ return (kbytes32 >> 20) | (((u32)(kbytes >> 32)) << 12);
+}
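+
+/*
+ * Worked example (illustrative): for bytes == 0x60000000 (1.5 GiB),
+ * kbytes == 0x180000, so *mb = (100 * 0x80000 + 0x7ffff) >> 20 == 50
+ * and the function returns 1, which prints as "1.50 Gb".
+ */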
+
+static u32 true_sectors_per_clst(const struct NTFS_BOOT *boot)
+{
+ return boot->sectors_per_clusters <= 0x80
+ ? boot->sectors_per_clusters
+ : (1u << (0 - boot->sectors_per_clusters));
+}
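+
+/*
+ * Example (assuming the on-disk byte is interpreted as a signed 8-bit
+ * value, as the NTFS format defines it): 0x04 encodes 4 sectors per
+ * cluster, while 0xF9 (-7) encodes 1 << 7 == 128 sectors per cluster,
+ * i.e. a 64K cluster with 512-byte sectors.
+ */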
+
+/* Inits internal info from the on-disk boot sector */
+static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+ u64 dev_size)
+{
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ int err;
+ u32 mb, gb, boot_sector_size, sct_per_clst, record_size;
+ u64 sectors, clusters, fs_size, mlcn, mlcn2;
+ struct NTFS_BOOT *boot;
+ struct buffer_head *bh;
+ struct MFT_REC *rec;
+ u16 fn, ao;
+
+ sbi->volume.blocks = dev_size >> PAGE_SHIFT;
+
+ bh = ntfs_bread(sb, 0);
+ if (!bh)
+ return -EIO;
+
+ err = -EINVAL;
+ boot = (struct NTFS_BOOT *)bh->b_data;
+
+ if (memcmp(boot->system_id, "NTFS ", sizeof("NTFS ") - 1))
+ goto out;
+
+ /* 0x55AA is not mandatory. Thanks Maxim Suhanov */
+ /*if (0x55 != boot->boot_magic[0] || 0xAA != boot->boot_magic[1])
+ * goto out;
+ */
+
+ boot_sector_size = (u32)boot->bytes_per_sector[1] << 8;
+ if (boot->bytes_per_sector[0] || boot_sector_size < SECTOR_SIZE ||
+ !is_power_of2(boot_sector_size)) {
+ goto out;
+ }
+
+ /* cluster size: 512, 1K, 2K, 4K, ... 2M */
+ sct_per_clst = true_sectors_per_clst(boot);
+ if (!is_power_of2(sct_per_clst))
+ goto out;
+
+ mlcn = le64_to_cpu(boot->mft_clst);
+ mlcn2 = le64_to_cpu(boot->mft2_clst);
+ sectors = le64_to_cpu(boot->sectors_per_volume);
+
+ if (mlcn * sct_per_clst >= sectors)
+ goto out;
+
+ if (mlcn2 * sct_per_clst >= sectors)
+ goto out;
+
+ /* Check MFT record size */
+ if ((boot->record_size < 0 &&
+ SECTOR_SIZE > (2U << (-boot->record_size))) ||
+ (boot->record_size >= 0 && !is_power_of2(boot->record_size))) {
+ goto out;
+ }
+
+ /* Check index record size */
+ if ((boot->index_size < 0 &&
+ SECTOR_SIZE > (2U << (-boot->index_size))) ||
+ (boot->index_size >= 0 && !is_power_of2(boot->index_size))) {
+ goto out;
+ }
+
+ sbi->sector_size = boot_sector_size;
+ sbi->sector_bits = blksize_bits(boot_sector_size);
+ fs_size = (sectors + 1) << sbi->sector_bits;
+
+ gb = format_size_gb(fs_size, &mb);
+
+ /*
+ * - Volume formatted and mounted with the same sector size
+ * - Volume formatted 4K and mounted as 512
+ * - Volume formatted 512 and mounted as 4K
+ */
+ if (sbi->sector_size != sector_size) {
+ ntfs_warn(sb,
+ "Different NTFS' sector size and media sector size");
+ dev_size += sector_size - 1;
+ }
+
+ sbi->cluster_size = boot_sector_size * sct_per_clst;
+ sbi->cluster_bits = blksize_bits(sbi->cluster_size);
+
+ sbi->mft.lbo = mlcn << sbi->cluster_bits;
+ sbi->mft.lbo2 = mlcn2 << sbi->cluster_bits;
+
+ if (sbi->cluster_size < sbi->sector_size)
+ goto out;
+
+ sbi->cluster_mask = sbi->cluster_size - 1;
+ sbi->cluster_mask_inv = ~(u64)sbi->cluster_mask;
+ sbi->record_size = record_size = boot->record_size < 0
+ ? 1 << (-boot->record_size)
+ : (u32)boot->record_size
+ << sbi->cluster_bits;
+
+ if (record_size > MAXIMUM_BYTES_PER_MFT)
+ goto out;
+
+ sbi->record_bits = blksize_bits(record_size);
+ sbi->attr_size_tr = (5 * record_size >> 4); // ~320 bytes
+
+ sbi->max_bytes_per_attr =
+ record_size - QuadAlign(MFTRECORD_FIXUP_OFFSET_1) -
+ QuadAlign(((record_size >> SECTOR_SHIFT) * sizeof(short))) -
+ QuadAlign(sizeof(enum ATTR_TYPE));
+
+ sbi->index_size = boot->index_size < 0
+ ? 1u << (-boot->index_size)
+ : (u32)boot->index_size << sbi->cluster_bits;
+
+ sbi->volume.ser_num = le64_to_cpu(boot->serial_num);
+ sbi->volume.size = sectors << sbi->sector_bits;
+
+ /* Warn about a RAW volume (device smaller than the filesystem) */
+ if (dev_size < fs_size) {
+ u32 mb0, gb0;
+
+ gb0 = format_size_gb(dev_size, &mb0);
+ ntfs_warn(
+ sb,
+ "RAW NTFS volume: Filesystem size %u.%02u Gb > volume size %u.%02u Gb. Mount in read-only",
+ gb, mb, gb0, mb0);
+ sb->s_flags |= SB_RDONLY;
+ }
+
+ clusters = sbi->volume.size >> sbi->cluster_bits;
+#ifndef CONFIG_NTFS3_64BIT_CLUSTER
+ /* 32 bits per cluster */
+ if (clusters >> 32) {
+ ntfs_notice(
+ sb,
+ "NTFS %u.%02u Gb is too big to use 32 bits per cluster",
+ gb, mb);
+ goto out;
+ }
+#elif BITS_PER_LONG < 64
+#error "CONFIG_NTFS3_64BIT_CLUSTER incompatible in 32 bit OS"
+#endif
+
+ sbi->used.bitmap.nbits = clusters;
+
+ rec = ntfs_zalloc(record_size);
+ if (!rec) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ sbi->new_rec = rec;
+ rec->rhdr.sign = NTFS_FILE_SIGNATURE;
+ rec->rhdr.fix_off = cpu_to_le16(MFTRECORD_FIXUP_OFFSET_1);
+ fn = (sbi->record_size >> SECTOR_SHIFT) + 1;
+ rec->rhdr.fix_num = cpu_to_le16(fn);
+ ao = QuadAlign(MFTRECORD_FIXUP_OFFSET_1 + sizeof(short) * fn);
+ rec->attr_off = cpu_to_le16(ao);
+ rec->used = cpu_to_le32(ao + QuadAlign(sizeof(enum ATTR_TYPE)));
+ rec->total = cpu_to_le32(sbi->record_size);
+ ((struct ATTRIB *)Add2Ptr(rec, ao))->type = ATTR_END;
+
+ if (sbi->cluster_size < PAGE_SIZE)
+ sb_set_blocksize(sb, sbi->cluster_size);
+
+ sbi->block_mask = sb->s_blocksize - 1;
+ sbi->blocks_per_cluster = sbi->cluster_size >> sb->s_blocksize_bits;
+ sbi->volume.blocks = sbi->volume.size >> sb->s_blocksize_bits;
+
+ /* Maximum size for normal files */
+ sbi->maxbytes = (clusters << sbi->cluster_bits) - 1;
+
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ if (clusters >= (1ull << (64 - sbi->cluster_bits)))
+ sbi->maxbytes = -1;
+ sbi->maxbytes_sparse = -1;
+#else
+ /* Maximum size for sparse file */
+ sbi->maxbytes_sparse = (1ull << (sbi->cluster_bits + 32)) - 1;
+#endif
+
+ err = 0;
+
+out:
+ brelse(bh);
+
+ return err;
+}
+
+/* Try to mount */
+static int ntfs_fill_super(struct super_block *sb, void *data, int silent)
+{
+ int err;
+ struct ntfs_sb_info *sbi;
+ struct block_device *bdev = sb->s_bdev;
+ struct inode *bd_inode = bdev->bd_inode;
+ struct request_queue *rq = bdev_get_queue(bdev);
+ struct inode *inode = NULL;
+ struct ntfs_inode *ni;
+ size_t i, tt;
+ CLST vcn, lcn, len;
+ struct ATTRIB *attr;
+ const struct VOLUME_INFO *info;
+ u32 idx, done, bytes;
+ struct ATTR_DEF_ENTRY *t;
+ u16 *upcase = NULL;
+ u16 *shared;
+ bool is_ro;
+ struct MFT_REF ref;
+
+ ref.high = 0;
+
+ sbi = ntfs_zalloc(sizeof(struct ntfs_sb_info));
+ if (!sbi)
+ return -ENOMEM;
+
+ sb->s_fs_info = sbi;
+ sbi->sb = sb;
+ sb->s_flags |= SB_NODIRATIME;
+ sb->s_magic = 0x7366746e; // "ntfs"
+ sb->s_op = &ntfs_sops;
+ sb->s_export_op = &ntfs_export_ops;
+ sb->s_time_gran = NTFS_TIME_GRAN; // 100 nsec
+ sb->s_xattr = ntfs_xattr_handlers;
+
+ ratelimit_state_init(&sbi->msg_ratelimit, DEFAULT_RATELIMIT_INTERVAL,
+ DEFAULT_RATELIMIT_BURST);
+
+ err = ntfs_parse_options(sb, data, silent, &sbi->options);
+ if (err)
+ goto out;
+
+ if (!rq || !blk_queue_discard(rq) || !rq->limits.discard_granularity) {
+ ;
+ } else {
+ sbi->discard_granularity = rq->limits.discard_granularity;
+ sbi->discard_granularity_mask_inv =
+ ~(u64)(sbi->discard_granularity - 1);
+ }
+
+ sb_set_blocksize(sb, PAGE_SIZE);
+
+ /* parse boot */
+ err = ntfs_init_from_boot(sb, rq ? queue_logical_block_size(rq) : 512,
+ bd_inode->i_size);
+ if (err)
+ goto out;
+
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ sb->s_maxbytes = MAX_LFS_FILESIZE;
+#else
+ sb->s_maxbytes = 0xFFFFFFFFull << sbi->cluster_bits;
+#endif
+
+ mutex_init(&sbi->compress.mtx_lznt);
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ mutex_init(&sbi->compress.mtx_xpress);
+ mutex_init(&sbi->compress.mtx_lzx);
+#endif
+
+ /*
+ * Load $Volume. This should be done before $LogFile
+ * because 'sbi->volume.ni' is used in 'ntfs_set_state'.
+ */
+ ref.low = cpu_to_le32(MFT_REC_VOL);
+ ref.seq = cpu_to_le16(MFT_REC_VOL);
+ inode = ntfs_iget5(sb, &ref, &NAME_VOLUME);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $Volume.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ /* Load and save label (not necessary) */
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_LABEL, NULL, 0, NULL, NULL);
+
+ if (!attr) {
+ /* It is ok if no ATTR_LABEL */
+ } else if (!attr->non_res && !is_attr_ext(attr)) {
+ /* $AttrDef allows labels to be up to 128 symbols */
+ err = utf16s_to_utf8s(resident_data(attr),
+ le32_to_cpu(attr->res.data_size) >> 1,
+ UTF16_LITTLE_ENDIAN, sbi->volume.label,
+ sizeof(sbi->volume.label));
+ if (err < 0)
+ sbi->volume.label[0] = 0;
+ } else {
+ /* should we break mounting here? */
+ //err = -EINVAL;
+ //goto out;
+ }
+
+ attr = ni_find_attr(ni, attr, NULL, ATTR_VOL_INFO, NULL, 0, NULL, NULL);
+ if (!attr || is_attr_ext(attr)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ info = resident_data_ex(attr, SIZEOF_ATTRIBUTE_VOLUME_INFO);
+ if (!info) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ sbi->volume.major_ver = info->major_ver;
+ sbi->volume.minor_ver = info->minor_ver;
+ sbi->volume.flags = info->flags;
+
+ sbi->volume.ni = ni;
+ inode = NULL;
+
+ /* Load $MFTMirr to estimate recs_mirr */
+ ref.low = cpu_to_le32(MFT_REC_MIRR);
+ ref.seq = cpu_to_le16(MFT_REC_MIRR);
+ inode = ntfs_iget5(sb, &ref, &NAME_MIRROR);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $MFTMirr.");
+ inode = NULL;
+ goto out;
+ }
+
+ sbi->mft.recs_mirr =
+ ntfs_up_cluster(sbi, inode->i_size) >> sbi->record_bits;
+
+ iput(inode);
+
+ /* Load LogFile to replay */
+ ref.low = cpu_to_le32(MFT_REC_LOG);
+ ref.seq = cpu_to_le16(MFT_REC_LOG);
+ inode = ntfs_iget5(sb, &ref, &NAME_LOGFILE);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load \x24LogFile.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ err = ntfs_loadlog_and_replay(ni, sbi);
+ if (err)
+ goto out;
+
+ iput(inode);
+ inode = NULL;
+
+ is_ro = sb_rdonly(sbi->sb);
+
+ if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+ if (!is_ro) {
+ ntfs_warn(sb,
+ "failed to replay log file. Can't mount rw!");
+ err = -EINVAL;
+ goto out;
+ }
+ } else if (sbi->volume.flags & VOLUME_FLAG_DIRTY) {
+ if (!is_ro && !sbi->options.force) {
+ ntfs_warn(
+ sb,
+ "volume is dirty and \"force\" flag is not set!");
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ /* Load $MFT */
+ ref.low = cpu_to_le32(MFT_REC_MFT);
+ ref.seq = cpu_to_le16(1);
+
+ inode = ntfs_iget5(sb, &ref, &NAME_MFT);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $MFT.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ sbi->mft.used = ni->i_valid >> sbi->record_bits;
+ tt = inode->i_size >> sbi->record_bits;
+ sbi->mft.next_free = MFT_REC_USER;
+
+ err = wnd_init(&sbi->mft.bitmap, sb, tt);
+ if (err)
+ goto out;
+
+ err = ni_load_all_mi(ni);
+ if (err)
+ goto out;
+
+ sbi->mft.ni = ni;
+
+ /* Load $BadClus */
+ ref.low = cpu_to_le32(MFT_REC_BADCLUST);
+ ref.seq = cpu_to_le16(MFT_REC_BADCLUST);
+ inode = ntfs_iget5(sb, &ref, &NAME_BADCLUS);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $BadClus.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ for (i = 0; run_get_entry(&ni->file.run, i, &vcn, &lcn, &len); i++) {
+ if (lcn == SPARSE_LCN)
+ continue;
+
+ if (!sbi->bad_clusters)
+ ntfs_notice(sb, "Volume contains bad blocks");
+
+ sbi->bad_clusters += len;
+ }
+
+ iput(inode);
+
+ /* Load $Bitmap */
+ ref.low = cpu_to_le32(MFT_REC_BITMAP);
+ ref.seq = cpu_to_le16(MFT_REC_BITMAP);
+ inode = ntfs_iget5(sb, &ref, &NAME_BITMAP);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $Bitmap.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+#ifndef CONFIG_NTFS3_64BIT_CLUSTER
+ if (inode->i_size >> 32) {
+ err = -EINVAL;
+ goto out;
+ }
+#endif
+
+ /* Check bitmap boundary */
+ tt = sbi->used.bitmap.nbits;
+ if (inode->i_size < bitmap_size(tt)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Not necessary */
+ sbi->used.bitmap.set_tail = true;
+ err = wnd_init(&sbi->used.bitmap, sbi->sb, tt);
+ if (err)
+ goto out;
+
+ iput(inode);
+
+ /* Compute the mft zone */
+ err = ntfs_refresh_zone(sbi);
+ if (err)
+ goto out;
+
+ /* Load $AttrDef */
+ ref.low = cpu_to_le32(MFT_REC_ATTR);
+ ref.seq = cpu_to_le16(MFT_REC_ATTR);
+ inode = ntfs_iget5(sbi->sb, &ref, &NAME_ATTRDEF);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $AttrDef -> %d", err);
+ inode = NULL;
+ goto out;
+ }
+
+ if (inode->i_size < sizeof(struct ATTR_DEF_ENTRY)) {
+ err = -EINVAL;
+ goto out;
+ }
+ bytes = inode->i_size;
+ sbi->def_table = t = ntfs_malloc(bytes);
+ if (!t) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ for (done = idx = 0; done < bytes; done += PAGE_SIZE, idx++) {
+ unsigned long tail = bytes - done;
+ struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+ if (IS_ERR(page)) {
+ err = PTR_ERR(page);
+ goto out;
+ }
+ memcpy(Add2Ptr(t, done), page_address(page),
+ min(PAGE_SIZE, tail));
+ ntfs_unmap_page(page);
+
+ if (!idx && ATTR_STD != t->type) {
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ t += 1;
+ sbi->def_entries = 1;
+ done = sizeof(struct ATTR_DEF_ENTRY);
+ sbi->reparse.max_size = MAXIMUM_REPARSE_DATA_BUFFER_SIZE;
+ sbi->ea_max_size = 0x10000; /* default formatter value */
+
+ while (done + sizeof(struct ATTR_DEF_ENTRY) <= bytes) {
+ u32 t32 = le32_to_cpu(t->type);
+ u64 sz = le64_to_cpu(t->max_sz);
+
+ if ((t32 & 0xF) || le32_to_cpu(t[-1].type) >= t32)
+ break;
+
+ if (t->type == ATTR_REPARSE)
+ sbi->reparse.max_size = sz;
+ else if (t->type == ATTR_EA)
+ sbi->ea_max_size = sz;
+
+ done += sizeof(struct ATTR_DEF_ENTRY);
+ t += 1;
+ sbi->def_entries += 1;
+ }
+ iput(inode);
+
+ /* Load $UpCase */
+ ref.low = cpu_to_le32(MFT_REC_UPCASE);
+ ref.seq = cpu_to_le16(MFT_REC_UPCASE);
+ inode = ntfs_iget5(sb, &ref, &NAME_UPCASE);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load \x24LogFile.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ if (inode->i_size != 0x10000 * sizeof(short)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ sbi->upcase = upcase = ntfs_vmalloc(0x10000 * sizeof(short));
+ if (!upcase) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ for (idx = 0; idx < (0x10000 * sizeof(short) >> PAGE_SHIFT); idx++) {
+ const __le16 *src;
+ u16 *dst = Add2Ptr(upcase, idx << PAGE_SHIFT);
+ struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+ if (IS_ERR(page)) {
+ err = PTR_ERR(page);
+ goto out;
+ }
+
+ src = page_address(page);
+
+#ifdef __BIG_ENDIAN
+ for (i = 0; i < PAGE_SIZE / sizeof(u16); i++)
+ *dst++ = le16_to_cpu(*src++);
+#else
+ memcpy(dst, src, PAGE_SIZE);
+#endif
+ ntfs_unmap_page(page);
+ }
+
+ shared = ntfs_set_shared(upcase, 0x10000 * sizeof(short));
+ if (shared && upcase != shared) {
+ sbi->upcase = shared;
+ ntfs_vfree(upcase);
+ }
+
+ iput(inode);
+ inode = NULL;
+
+ if (is_ntfs3(sbi)) {
+ /* Load $Secure */
+ err = ntfs_security_init(sbi);
+ if (err)
+ goto out;
+
+ /* Load $Extend */
+ err = ntfs_extend_init(sbi);
+ if (err)
+ goto load_root;
+
+ /* Load $Extend\$Reparse */
+ err = ntfs_reparse_init(sbi);
+ if (err)
+ goto load_root;
+
+ /* Load $Extend\$ObjId */
+ err = ntfs_objid_init(sbi);
+ if (err)
+ goto load_root;
+ }
+
+load_root:
+ /* Load root */
+ ref.low = cpu_to_le32(MFT_REC_ROOT);
+ ref.seq = cpu_to_le16(MFT_REC_ROOT);
+ inode = ntfs_iget5(sb, &ref, &NAME_ROOT);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load root.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ sb->s_root = d_make_root(inode);
+
+ if (!sb->s_root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ return 0;
+
+out:
+ iput(inode);
+
+ if (sb->s_root) {
+ d_drop(sb->s_root);
+ sb->s_root = NULL;
+ }
+
+ put_ntfs(sbi);
+
+ sb->s_fs_info = NULL;
+ return err;
+}
+
+void ntfs_unmap_meta(struct super_block *sb, CLST lcn, CLST len)
+{
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct block_device *bdev = sb->s_bdev;
+ sector_t devblock = (u64)lcn * sbi->blocks_per_cluster;
+ unsigned long blocks = (u64)len * sbi->blocks_per_cluster;
+ unsigned long cnt = 0;
+ unsigned long limit = global_zone_page_state(NR_FREE_PAGES)
+ << (PAGE_SHIFT - sb->s_blocksize_bits);
+
+ if (limit >= 0x2000)
+ limit -= 0x1000;
+ else if (limit < 32)
+ limit = 32;
+ else
+ limit >>= 1;
+
+ while (blocks--) {
+ clean_bdev_aliases(bdev, devblock++, 1);
+ if (cnt++ >= limit) {
+ sync_blockdev(bdev);
+ cnt = 0;
+ }
+ }
+}
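+
+/*
+ * Throttling example (illustrative): with 4K pages and 512-byte blocks,
+ * 0x3000 free pages give limit == (0x3000 << 3) - 0x1000 == 0x17000, so
+ * sync_blockdev() runs roughly once per ~94000 cleaned buffer aliases.
+ */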
+
+/*
+ * ntfs_discard
+ *
+ * issue a discard request (trim for SSD)
+ */
+int ntfs_discard(struct ntfs_sb_info *sbi, CLST lcn, CLST len)
+{
+ int err;
+ u64 lbo, bytes, start, end;
+ struct super_block *sb;
+
+ if (sbi->used.next_free_lcn == lcn + len)
+ sbi->used.next_free_lcn = lcn;
+
+ if (sbi->flags & NTFS_FLAGS_NODISCARD)
+ return -EOPNOTSUPP;
+
+ if (!sbi->options.discard)
+ return -EOPNOTSUPP;
+
+ lbo = (u64)lcn << sbi->cluster_bits;
+ bytes = (u64)len << sbi->cluster_bits;
+
+ /* Align up 'start' on discard_granularity */
+ start = (lbo + sbi->discard_granularity - 1) &
+ sbi->discard_granularity_mask_inv;
+ /* Align down 'end' on discard_granularity */
+ end = (lbo + bytes) & sbi->discard_granularity_mask_inv;
+
+ sb = sbi->sb;
+ if (start >= end)
+ return 0;
+
+ err = blkdev_issue_discard(sb->s_bdev, start >> 9, (end - start) >> 9,
+ GFP_NOFS, 0);
+
+ if (err == -EOPNOTSUPP)
+ sbi->flags |= NTFS_FLAGS_NODISCARD;
+
+ return err;
+}
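+
+/*
+ * Alignment example (illustrative): with discard_granularity == 4096,
+ * lbo == 5000 and bytes == 20000, 'start' rounds up to 8192 and 'end'
+ * rounds down to 24576, so 512-byte sectors 16..47 are discarded.
+ */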
+
+static struct dentry *ntfs_mount(struct file_system_type *fs_type, int flags,
+ const char *dev_name, void *data)
+{
+ return mount_bdev(fs_type, flags, dev_name, data, ntfs_fill_super);
+}
+
+// clang-format off
+static struct file_system_type ntfs_fs_type = {
+ .owner = THIS_MODULE,
+ .name = "ntfs3",
+ .mount = ntfs_mount,
+ .kill_sb = kill_block_super,
+ .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
+};
+// clang-format on
+
+static int __init init_ntfs_fs(void)
+{
+ int err;
+
+ pr_notice("ntfs3: Index binary search\n");
+ pr_notice("ntfs3: Hot fix free clusters\n");
+ pr_notice("ntfs3: Max link count %u\n", NTFS_LINK_MAX);
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+ pr_notice("ntfs3: Enabled Linux POSIX ACLs support\n");
+#endif
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ pr_notice("ntfs3: Activated 64 bits per cluster\n");
+#else
+ pr_notice("ntfs3: Activated 32 bits per cluster\n");
+#endif
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ pr_notice("ntfs3: Read-only lzx/xpress compression included\n");
+#endif
+
+ err = ntfs3_init_bitmap();
+ if (err)
+ return err;
+
+ ntfs_inode_cachep = kmem_cache_create(
+ "ntfs_inode_cache", sizeof(struct ntfs_inode), 0,
+ (SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT),
+ init_once);
+ if (!ntfs_inode_cachep) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ err = register_filesystem(&ntfs_fs_type);
+ if (err)
+ goto out;
+
+ return 0;
+out:
+ kmem_cache_destroy(ntfs_inode_cachep);
+out1:
+ ntfs3_exit_bitmap();
+ return err;
+}
+
+static void __exit exit_ntfs_fs(void)
+{
+ if (ntfs_inode_cachep) {
+ rcu_barrier();
+ kmem_cache_destroy(ntfs_inode_cachep);
+ }
+
+ unregister_filesystem(&ntfs_fs_type);
+ ntfs3_exit_bitmap();
+}
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("ntfs3 read/write filesystem");
+MODULE_INFO(behaviour, "Index binary search");
+MODULE_INFO(behaviour, "Hot fix free clusters");
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+MODULE_INFO(behaviour, "Enabled Linux POSIX ACLs support");
+#endif
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+MODULE_INFO(cluster, "Activated 64 bits per cluster");
+#else
+MODULE_INFO(cluster, "Activated 32 bits per cluster");
+#endif
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+MODULE_INFO(compression, "Read-only lzx/xpress compression included");
+#endif
+
+MODULE_AUTHOR("Konstantin Komarov");
+MODULE_ALIAS_FS("ntfs3");
+
+module_init(init_ntfs_fs);
+module_exit(exit_ntfs_fs);
--
2.30.0

07 Dec '21
From: 沈子俊 <shenzijun(a)kylinos.cn>
mainline inclusion
from mainline-v5.16
commit 4a7e1e5fc294687a8941fa3eeb4a7e8539ca5e2f
category: bugfix
bugzilla: NA
CVE: NA
-----------------------------------------------------------------
When building with clang and GNU as, there is a warning about ignored
changed section attributes:
/tmp/sm4-c916c8.s: Assembler messages:
/tmp/sm4-c916c8.s:677: Warning: ignoring changed section attributes for
.data..cacheline_aligned
"static const" places the data in .rodata but __cacheline_aligned has
the section attribute to place it in .data..cacheline_aligned, in
addition to the aligned attribute.
To keep the alignment but avoid attempting to change sections, use the
____cacheline_aligned attribute, which is just the aligned attribute.
Fixes: 2b31277af577 ("crypto: sm4 - create SM4 library based on sm4 generic code")
Link: https://github.com/ClangBuiltLinux/linux/issues/1441
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang(a)linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
Signed-off-by: 沈子俊 <shenzijun(a)kylinos.cn>
---
lib/crypto/sm4.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/crypto/sm4.c b/lib/crypto/sm4.c
index 633b59fed9db..284e62576d0c 100644
--- a/lib/crypto/sm4.c
+++ b/lib/crypto/sm4.c
@@ -15,7 +15,7 @@ static const u32 fk[4] = {
0xa3b1bac6, 0x56aa3350, 0x677d9197, 0xb27022dc
};
-static const u32 __cacheline_aligned ck[32] = {
+static const u32 ____cacheline_aligned ck[32] = {
0x00070e15, 0x1c232a31, 0x383f464d, 0x545b6269,
0x70777e85, 0x8c939aa1, 0xa8afb6bd, 0xc4cbd2d9,
0xe0e7eef5, 0xfc030a11, 0x181f262d, 0x343b4249,
@@ -26,7 +26,7 @@ static const u32 __cacheline_aligned ck[32] = {
0x10171e25, 0x2c333a41, 0x484f565d, 0x646b7279
};
-static const u8 __cacheline_aligned sbox[256] = {
+static const u8 ____cacheline_aligned sbox[256] = {
0xd6, 0x90, 0xe9, 0xfe, 0xcc, 0xe1, 0x3d, 0xb7,
0x16, 0xb6, 0x14, 0xc2, 0x28, 0xfb, 0x2c, 0x05,
0x2b, 0x67, 0x9a, 0x76, 0x2a, 0xbe, 0x04, 0xc3,
--
2.30.0

[PATCH openEuler-1.0-LTS 1/6] ext4: always panic when errors=panic is specified
by Yang Yingliang 07 Dec '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.13-rc1
commit ac2f7ca51b0929461ea49918f27c11b680f28995
category: bugfix
bugzilla: 182973
CVE: NA
-------------------------------------------------
Before commit 014c9caa29d3 ("ext4: make ext4_abort() use
__ext4_error()"), the following series of commands would trigger a
panic:
1. mount /dev/sda -o ro,errors=panic test
2. mount /dev/sda -o remount,abort test
After commit 014c9caa29d3, remounting a file system using the test
mount option "abort" will no longer trigger a panic. This commit will
restore the behaviour immediately before commit 014c9caa29d3.
(However, note that the Linux kernel's behavior has not been
consistent; some previous kernel versions, including 5.4 and 4.19
similarly did not panic after using the mount option "abort".)
This also makes a change to long-standing behaviour; namely, the
following series of commands will now cause a panic, when previously it
did not:
1. mount /dev/sda -o ro,errors=panic test
2. echo test > /sys/fs/ext4/sda/trigger_fs_error
However, this makes ext4's behaviour much more consistent, so this is
a good thing.
Cc: stable(a)kernel.org
Fixes: 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Link: https://lore.kernel.org/r/20210401081903.3421208-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Signed-off-by: Zheng Liang <zhengliang6(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/ext4/super.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a051671f7cb89..5a58f72ac2090 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -652,12 +652,6 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
ext4_commit_super(sb);
}
- if (sb_rdonly(sb))
- return;
-
- if (continue_fs)
- goto out;
-
/*
* We force ERRORS_RO behavior when system is rebooting. Otherwise we
* could panic during 'reboot -f' as the underlying device got already
@@ -668,6 +662,12 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
sb->s_id);
}
+ if (sb_rdonly(sb))
+ return;
+
+ if (continue_fs)
+ goto out;
+
ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only");
/*
* Make sure updated value of ->s_mount_flags will be visible before
--
2.25.1
Backport 5.10.81 LTS patches from upstream.
Borislav Petkov (1):
selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
Gao Xiang (1):
erofs: fix unsafe pagevec reuse of hooked pclusters
Greg Thelen (1):
perf/core: Avoid put_page() when GUP fails
Joakim Zhang (2):
net: stmmac: add clocks management for gmac driver
net: stmmac: fix system hang if change mac address after interface
ifdown
Kees Cook (1):
fortify: Explicitly disable Clang support
Marc Zyngier (2):
PCI/MSI: Deal with devices lying about their MSI mask capability
PCI: Add MSI masking quirk for Nvidia ION AHCI
Masami Hiramatsu (1):
bootconfig: init: Fix memblock leak in xbc_make_cmdline()
Michael Riesch (1):
net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings
Nathan Chancellor (1):
scripts/lld-version.sh: Rewrite based on upstream ld-version.sh
Peter Zijlstra (1):
x86/iopl: Fake iopl(3) CLI/STI usage
Subbaraman Narayanamurthy (1):
thermal: Fix NULL pointer dereferences in of_thermal_ functions
Sven Schnelle (1):
parisc/entry: fix trace test in syscall exit path
Thomas Gleixner (1):
PCI/MSI: Destroy sysfs before freeing entries
Wei Yongjun (1):
net: stmmac: platform: fix build error with !CONFIG_PM_SLEEP
Wong Vee Khee (1):
net: stmmac: fix issue where clk is being unprepared twice
Xie Yongji (2):
block: Add a helper to validate the block size
loop: Use blk_validate_block_size() to validate block size
Yang Yingliang (1):
net: stmmac: fix missing unlock on error in stmmac_suspend()
Yue Hu (1):
erofs: remove the occupied parameter from z_erofs_pagevec_enqueue()
arch/parisc/kernel/entry.S | 2 +-
arch/x86/include/asm/insn-eval.h | 1 +
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/process.c | 1 +
arch/x86/kernel/traps.c | 34 ++++++
arch/x86/lib/insn-eval.c | 2 +-
drivers/block/loop.c | 19 +--
.../net/ethernet/stmicro/stmmac/dwmac-rk.c | 9 --
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 87 ++++++++++++--
.../net/ethernet/stmicro/stmmac/stmmac_mdio.c | 111 ++++++++++++++----
.../ethernet/stmicro/stmmac/stmmac_platform.c | 30 ++++-
drivers/pci/msi.c | 27 +++--
drivers/pci/quirks.c | 6 +
drivers/thermal/thermal_of.c | 9 +-
fs/erofs/zdata.c | 15 ++-
fs/erofs/zpvec.h | 14 ++-
include/linux/blkdev.h | 8 ++
include/linux/pci.h | 2 +
init/main.c | 1 +
kernel/events/core.c | 10 +-
scripts/lld-version.sh | 35 ++++--
security/Kconfig | 3 +
tools/testing/selftests/x86/iopl.c | 78 ++++++++----
24 files changed, 375 insertions(+), 131 deletions(-)
--
2.20.1
Backport 5.10.80 LTS patches from upstream.
Ahmad Fatoum (1):
watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT
Ajay Singh (1):
wilc1000: fix possible memory leak in cfg_scan_result()
Alagu Sankar (1):
ath10k: high latency fixes for beacon buffer
Aleksander Jan Bajkowski (3):
MIPS: lantiq: dma: add small delay after reset
MIPS: lantiq: dma: reset correct number of channel
MIPS: lantiq: dma: fix burst length for DEU
Alex Bee (1):
arm64: dts: rockchip: Fix GPU register width for RK3328
Alex Deucher (1):
drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
Alex Xu (Hello71) (1):
drm/plane-helper: fix uninitialized variable reference
Alexander Tsoy (1):
ALSA: usb-audio: Add registration quirk for JBL Quantum 400
Alexandru Ardelean (1):
iio: st_sensors: disable regulators after device unregistration
Alexei Starovoitov (2):
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and
var_off.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
Alexey Gladkov (1):
Fix user namespace leak
Alok Prasad (1):
RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
Amelie Delaunay (3):
usb: dwc2: drd: fix dwc2_force_mode call in dwc2_ovr_init
usb: dwc2: drd: fix dwc2_drd_role_sw_set when clock could be disabled
usb: dwc2: drd: reset current session before setting the new one
Amit Engel (1):
nvmet-tcp: fix header digest verification
Anand Jain (1):
btrfs: call btrfs_check_rw_degradable only if there is a missing
device
Anand Moon (2):
arm64: dts: meson-g12a: Fix the pwm regulator supply properties
arm64: dts: meson-g12b: Fix the pwm regulator supply properties
Anant Thazhemadam (1):
media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte()
Anders Roxell (1):
PM: hibernate: fix sparse warnings
Andrea Righi (1):
selftests/bpf: Fix fclose/pclose mismatch in test_progs
Andreas Gruenbacher (3):
iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
gfs2: Cancel remote delete work asynchronously
gfs2: Fix glock_hash_walk bugs
Andreas Kemnade (1):
arm: dts: omap3-gta04a4: accelerometer irq fix
Andrej Shadura (2):
HID: u2fzero: clarify error check and length calculations
HID: u2fzero: properly handle timeouts in usb_submit_urb
Andrey Grodzovsky (1):
drm/amdgpu: Fix MMIO access page fault
Andrii Nakryiko (6):
selftests/bpf: Fix strobemeta selftest regression
libbpf: Fix BTF data layout checks and allow empty BTF
libbpf: Allow loading empty BTFs
libbpf: Fix overflow in BTF sanity checks
libbpf: Fix BTF header parsing checks
selftests/bpf: Fix also no-alu32 strobemeta selftest
André Almeida (1):
ACPI: battery: Accept charges over the design capacity as full
Andy Shevchenko (2):
iio: st_sensors: Call st_sensors_power_enable() from bus drivers
serial: 8250_dw: Drop wrong use of ACPI_PTR()
Anel Orazgaliyeva (1):
cpuidle: Fix kobject memory leaks in error paths
Anson Jacob (1):
drm/amd/display: dcn20_resource_construct reduce scope of FPU enabled
Anssi Hannula (1):
serial: xilinx_uartps: Fix race condition causing stuck TX
Antoine Tenart (1):
net-sysfs: try not to restart the syscall if it will fail eventually
Arnaud Pouliquen (1):
rpmsg: Fix rpmsg_create_ept return when RPMSG config is not defined
Arnd Bergmann (9):
hyperv/vmbus: include linux/bitops.h
ifb: fix building without CONFIG_NET_CLS_ACT
ARM: 9136/1: ARMv7-M uses BE-8, not BE-32
drm/amdgpu: fix warning for overflow check
crypto: ecc - fix CRYPTO_DEFAULT_RNG dependency
memstick: avoid out-of-range warning
arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
ARM: 9156/1: drop cc-option fallbacks for architecture selection
ath10k: fix invalid dma_addr_t token assignment
Arun Easi (1):
scsi: qla2xxx: Fix kernel crash when accessing port_speed sysfs file
Asmaa Mnebhi (1):
gpio: mlxbf2.c: Add check for bgpio_init failure
Austin Kim (2):
ALSA: synth: missing check for possible NULL after the call to kstrdup
evm: mark evm_fixmode as __ro_after_init
Baochen Qiang (2):
ath11k: Change DMA_FROM_DEVICE to DMA_TO_DEVICE when map reinjected
packets
ath11k: Fix memory leak in ath11k_qmi_driver_event_work
Baptiste Lepers (1):
pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds
Barnabás Pőcze (1):
platform/x86: wmi: do not fail if disabling fails
Bastien Roucariès (1):
ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
Benjamin Li (2):
wcn36xx: handle connection loss indication
wcn36xx: add proper DMA memory barriers in rx path
Bixuan Cui (1):
powerpc/44x/fsp2: add missing of_node_put
Bjorn Andersson (1):
soc: qcom: rpmhpd: Make power_on actually enable the domain
Borislav Petkov (1):
x86/sev: Make the #VC exception stacks part of the default stacks
storage
Brett Creeley (1):
ice: Fix not stopping Tx queues for VFs
Bryan O'Donoghue (1):
wcn36xx: Fix Antenna Diversity Switching
Bryant Mairs (1):
drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
Can Guo (1):
scsi: ufs: Refactor ufshcd_setup_clocks() to remove skip_ref_clk
Charan Teja Reddy (1):
dma-buf: WARN on dmabuf release with pending attachments
Chen-Yu Tsai (2):
media: rkvdec: Do not override sizeimage for output format
media: rkvdec: Support dynamic resolution changes
Chengfeng Ye (1):
nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
Chenyuan Mi (1):
drm/nouveau/svm: Fix refcount leak bug and missing check against null
bug
Christian Löhle (1):
mmc: dw_mmc: Dont wait for DRTO on Write RSP error
Christoph Hellwig (1):
rds: stop using dmapool
Christophe JAILLET (6):
media: mtk-vpu: Fix a resource leak in the error handling path of
'mtk_vpu_probe()'
mmc: mxs-mmc: disable regulator on error and in the remove function
clk: mvebu: ap-cpu-clk: Fix a memory leak in error handling paths
soc/tegra: Fix an error handling path in tegra_powergate_power_up()
remoteproc: Fix a memory leak in an error handling path in
'rproc_handle_vdev()'
i2c: xlr: Fix a resource leak in the error handling path of
'xlr_i2c_probe()'
Christophe Leroy (1):
video: fbdev: chipsfb: use memset_io() instead of memset()
Claudio Imbrenda (2):
KVM: s390: pv: avoid double free of sida page
KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
Claudiu Beznea (2):
clk: at91: sam9x60-pll: use DIV_ROUND_CLOSEST_ULL
dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro
Clément Léger (1):
clk: at91: check pmc node status before registering syscore ops
Colin Ian King (4):
media: cxd2880-spi: Fix a null pointer dereference on error handling
path
media: cx23885: Fix snd_card_free call on null card pointer
media: em28xx: Don't use ops->suspend if it is NULL
mmc: moxart: Fix null pointer dereference on pointer host
Corey Minyard (1):
ipmi: Disable some operations during a panic
Cyril Strejc (1):
net: multicast: calculate csum of looped-back and forwarded packets
Damien Le Moal (1):
libata: fix read log timeout value
Dan Carpenter (13):
tpm: Check for integer overflow in tpm2_map_response_body()
ath11k: fix some sleeping in atomic bugs
b43legacy: fix a lower bounds test
b43: fix a lower bounds test
memstick: jmb38x_ms: use appropriate free function in
jmb38x_ms_alloc_host()
drm/msm: potential error pointer dereference in init()
drm/msm: uninitialized variable in msm_gem_import()
usb: gadget: hid: fix error code in do_config()
scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn()
phy: ti: gmii-sel: check of_get_address() for failure
rtc: rv3032: fix error handling in rv3032_clkout_set_rate()
zram: off by one in read_block_state()
gve: Fix off by one in gve_tx_timeout()
Dan Schatzberg (1):
cgroup: Fix rootcg cpu.stat guest double counting
Daniel Borkmann (2):
net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE
net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE
Daniel Jordan (1):
crypto: pcrypt - Delay write to padata->info
Dave Jones (1):
x86/mce: Add errata workaround for Skylake SKX37
David Hildenbrand (1):
s390/gmap: don't unconditionally call pte_unmap_unlock() in
__gmap_zap()
Davide Baldo (1):
ALSA: hda/realtek: Fixes HP Spectre x360 15-eb1xxx speakers
Derong Liu (1):
mmc: mtk-sd: Add wait dma stop done flow
Desmond Cheong Zhi Xi (1):
Bluetooth: fix init and cleanup of sco_conn.timeout_work
Dinghao Liu (1):
Bluetooth: btmtkuart: fix a memleak in mtk_hci_wmt_sync
Dirk Bender (1):
media: mt9p031: Fix corrupted frame after restarting stream
Dmitriy Ulitin (1):
media: stm32: Potential NULL pointer dereference in dcmi_irq_thread()
Dmitry Bogdanov (1):
scsi: qla2xxx: Fix unmap of already freed sgl
Dmitry Osipenko (1):
soc/tegra: pmc: Fix imbalanced clock disabling in error code path
Dominique Martinet (1):
9p/net: fix missing error check in p9_check_errors
Dongli Zhang (2):
xen/netfront: stop tx queues during live migration
vmxnet3: do not stop tx queues after netif_device_detach()
Dongliang Mu (2):
JFS: fix memleak in jfs_mount
memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe
Dust Li (1):
net/smc: fix sk_refcnt underflow on linkdown and fallback
Eiichi Tsukata (1):
vsock: prevent unnecessary refcnt inc for nonblocking connect
Eric Badger (1):
EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
Eric Biggers (1):
fscrypt: allow 256-bit master keys with AES-256-XTS
Eric Dumazet (4):
net: annotate data-race in neigh_output()
tcp: switch orphan_count to bare per-cpu counters
llc: fix out-of-bound array index in llc_sk_dev_hash()
net/sched: sch_taprio: fix undefined behavior in ktime_mono_to_any
Eric W. Biederman (3):
signal: Remove the bogus sigkill_pending in ptrace_stop
signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)
Erik Ekman (2):
sfc: Export fibre-specific supported link modes
sfc: Don't use netif_info before net_device setup
Evgeny Novikov (3):
media: atomisp: Fix error handling in probe
media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()
mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare()
Fabio Estevam (1):
ath10k: sdio: Add missing BH locking around napi_schdule()
Filipe Manana (1):
btrfs: fix lost error handling when replaying directory deletes
Florian Westphal (3):
fcnal-test: kill hanging ping/nettest binaries on cleanup
vrf: run conntrack only in context of lower/physdev for locally
generated packets
netfilter: nfnetlink_queue: fix OOB when mac header was cleared
Frank Rowand (1):
of: unittest: fix EXPECT text for gpio hog errors
Gao Xiang (1):
erofs: don't trigger WARN() when decompression fails
Geert Uytterhoeven (6):
arm64: dts: renesas: beacon: Fix Ethernet PHY mode
pinctrl: renesas: checker: Fix off-by-one bug in drive register check
mips: cm: Convert to bitfield API to fix out-of-bounds access
auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string
auxdisplay: ht16k33: Connect backlight to fbdev
auxdisplay: ht16k33: Fix frame buffer device blanking
Giovanni Cabiddu (2):
crypto: qat - detect PFVF collision after ACK
crypto: qat - disregard spurious PFVF interrupts
Guo Ren (1):
irqchip/sifive-plic: Fixup EOI failed when masked
Guru Das Srinagesh (1):
firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available()
Halil Pasic (1):
s390/cio: make ccw_device_dma_* more robust
Hannes Reinecke (1):
nvme: drop scan_lock and always kick requeue list when removing
namespaces
Hans de Goede (6):
drm: panel-orientation-quirks: Update the Lenovo Ideapad D330 quirk
(v2)
drm: panel-orientation-quirks: Add quirk for KD Kurio Smart C15200
2-in-1
drm: panel-orientation-quirks: Add quirk for the Samsung Galaxy Book
10.6
brcmfmac: Add DMI nvram filename quirk for Cyberbook T116 tablet
power: supply: bq27xxx: Fix kernel crash on IRQ handler register error
ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
Hao Wu (1):
tpm: fix Atmel TPM crash caused by too frequent queries
Harald Freudenberger (1):
s390/ap: Fix hanging ioctl caused by orphaned replies
Helge Deller (4):
parisc: Fix set_fixmap() on PA1.x CPUs
parisc: Fix ptrace check on syscall return
task_stack: Fix end_of_stack() for architectures with upwards-growing
stack
parisc: Fix backtrace to always include init funtion names
Henrik Grimler (1):
power: supply: max17042_battery: use VFSOC for capacity when no rsns
Iago Toral Quiroga (1):
drm/v3d: fix wait for TMU write combiner flush
Ian Rogers (1):
perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
Igor Pylypiv (1):
scsi: pm80xx: Fix misleading log statement in
pm8001_mpi_get_nvmd_resp()
Ilya Leoshkevich (1):
libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
Imre Deak (2):
ALSA: hda: Release controller display power during shutdown/reboot
ALSA: hda: Fix hang during shutdown due to link reset
Ingmar Klein (1):
PCI: Mark Atheros QCA6174 to avoid bus reset
Israel Rukshin (3):
nvmet: fix use-after-free when a port is removed
nvmet-rdma: fix use-after-free when a port is removed
nvmet-tcp: fix use-after-free when a port is removed
J. Bruce Fields (1):
nfsd: don't alloc under spinlock in rpc_parse_scope_id
Jack Andersen (1):
mfd: dln2: Add cell for initializing DLN2 ADC
Jackie Liu (3):
ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc()
MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
ar7: fix kernel builds for compiler test
Jaegeuk Kim (1):
f2fs: should use GFP_NOFS for directory inodes
Jakob Hauser (1):
power: supply: rt5033_battery: Change voltage values to µV
Jakub Kicinski (4):
net: sched: update default qdisc visibility after Tx queue cnt changes
net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
udp6: allow SO_MARK ctrl msg to affect routing
ethtool: fix ethtool msg len calculation for pause stats
Jan Kara (1):
ocfs2: fix data corruption on truncate
Jane Malalane (1):
x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
Janghyub Seo (1):
r8169: Add device 10ec:8162 to driver r8169
Janis Schoetterl-Glausch (1):
KVM: s390: Fix handle_sske page fault handling
Jaroslav Kysela (1):
ALSA: hda/realtek: Add a quirk for Acer Spin SP513-54N
Jason Ormes (1):
ALSA: usb-audio: Line6 HX-Stomp XL USB_ID for 48k-fixed quirk
Jens Axboe (2):
block: bump max plugged deferred size from 16 to 32
block: remove inaccurate requeue check
Jeremy Soller (1):
ALSA: hda/realtek: Headset fixup for Clevo NH77HJQ
Jernej Skrabec (1):
drm/sun4i: Fix macros in sun8i_csc.h
Jessica Zhang (1):
drm/msm: Fix potential NULL dereference in DPU SSPP
Jia-Ju Bai (1):
fs: orangefs: fix error return code of orangefs_revalidate_lookup()
Jiasheng Jiang (1):
rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()
Jim Mattson (1):
KVM: selftests: Fix nested SVM tests when built with clang
Jiri Olsa (1):
selftests/bpf: Fix perf_buffer test on system with offline cpus
Joerg Roedel (1):
x86/sev: Fix stack type check in vc_switch_off_ist()
Johan Hovold (14):
Input: iforce - fix control-message timeout
ALSA: ua101: fix division by zero at probe
ALSA: 6fire: fix control and bulk message timeouts
ALSA: line6: fix control and interrupt message timeouts
mwifiex: fix division by zero in fw download path
ath6kl: fix division by zero in send path
ath6kl: fix control-message timeout
ath10k: fix control-message timeout
ath10k: fix division by zero in send path
rtl8187: fix control-message timeouts
serial: 8250: fix racy uartclk update
most: fix control-message timeouts
USB: iowarrior: fix control-message timeouts
USB: chipidea: fix interrupt deadlock
Johannes Berg (1):
iwlwifi: mvm: disable RX-diversity in powersave
John Fastabend (2):
bpf, sockmap: Remove unhash handler for BPF sockmap usage
bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and
colliding
John Fraker (1):
gve: Recover from queue stall due to missed IRQ
Johnathon Clark (1):
ALSA: hda/realtek: Fix mic mute LED for the HP Spectre x360 14
Jon Maxwell (1):
tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
Jonas Dreßler (5):
mwifiex: Read a PCI register after writing the TX ring write pointer
mwifiex: Try waking the firmware until we get an interrupt
mwifiex: Run SET_BSS_MODE when changing from P2P to STATION vif-type
mwifiex: Properly initialize private structure on interface type
changes
mwifiex: Send DELBA requests according to spec
Josef Bacik (1):
btrfs: do not take the uuid_mutex in btrfs_rm_device
Josh Don (1):
fs/proc/uptime.c: Fix idle time reporting in /proc/uptime
Josh Poimboeuf (1):
objtool: Add xen_start_kernel() to noreturn list
Juergen Gross (1):
xen/balloon: add late_initcall_sync() for initial ballooning done
Junji Wei (1):
RDMA/rxe: Fix wrong port_cap_flags
Kai-Heng Feng (1):
ALSA: hda/realtek: Add quirk for HP EliteBook 840 G7 mute LED
Kalesh Singh (1):
tracing/cfi: Fix cmp_entries_* functions signature mismatch
Kan Liang (2):
perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
Kees Cook (5):
leaking_addresses: Always print a trailing newline
media: radio-wl1273: Avoid card name truncation
media: si470x: Avoid card name truncation
media: tm6000: Avoid card name truncation
clocksource/drivers/timer-ti-dm: Select TIMER_OF
Kewei Xu (1):
i2c: mediatek: fixing the incorrect register offset
Kishon Vijay Abraham I (2):
arm64: dts: ti: k3-j721e-main: Fix "max-virtual-functions" in PCIe EP
nodes
arm64: dts: ti: k3-j721e-main: Fix "bus-range" upto 256 bus number for
PCIe
Krzysztof Kozlowski (3):
regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS
is disabled
regulator: dt-bindings: samsung,s5m8767: correct
s5m8767,pmic-buck-default-dvs-idx property
mfd: core: Add missing of_node_put for loop iteration
Kumar Kartikeya Dwivedi (1):
selftests/bpf: Fix fd cleanup in sk_lookup test
Kunihiko Hayashi (1):
PCI: uniphier: Serialize INTx masking/unmasking and fix the bit
operation
Lad Prabhakar (1):
spi: spi-rpc-if: Check return value of rpcif_sw_init()
Lars-Peter Clausen (1):
dmaengine: dmaengine_desc_callback_valid(): Check for
`callback_result`
Lasse Collin (2):
lib/xz: Avoid overlapping memcpy() with invalid input with in-place
decompression
lib/xz: Validate the value before assigning it to an enum variable
Lee Jones (1):
soc: qcom: rpmhpd: Provide some missing struct member descriptions
Leon Romanovsky (1):
RDMA/mlx4: Return missed an error if device doesn't support steering
Li Chen (1):
PCI: cadence: Add cdns_plat_pcie_probe() missing return
Li Zhang (1):
btrfs: clear MISSING device status bit in btrfs_close_one_device
Linus Lüssing (1):
ath9k: Fix potential interrupt storm on queue reset
Linus Walleij (1):
net: dsa: rtl8366rb: Fix off-by-one bug
Loic Poulain (6):
wcn36xx: Fix HT40 capability for 2Ghz band
wcn36xx: Fix tx_status mechanism
wcn36xx: Fix (QoS) null data frame bitrate/modulation
wcn36xx: Correct band/freq reporting on RX
ath10k: Fix missing frame timestamp for beacon/probe-resp
wcn36xx: Fix discarded frames due to wrong sequence number
Lorenz Bauer (3):
bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
bpf: Prevent increasing bpf_jit_limit above max
selftests: bpf: Convert sk_lookup ctx access tests to PROG_TEST_RUN
Lorenzo Bianconi (3):
mt76: mt7615: fix endianness warning in mt7615_mac_write_txwi
mt76: mt76x02: fix endianness warnings in mt76x02_mac.c
mt76: mt7915: fix possible infinite loop release semaphore
Lucas Tanure (1):
ASoC: cs42l42: Disable regulators if probe fails
Lukas Wunner (1):
ifb: Depend on netfilter alternatively to tc
Maciej W. Rozycki (1):
MIPS: Fix assembly error from MIPSr2 code used within
MIPS_ISA_ARCH_LEVEL
Marc Kleine-Budde (1):
can: mcp251xfd: mcp251xfd_chip_start(): fix error handling for
mcp251xfd_chip_rx_int_enable()
Marek Behún (4):
PCI: pci-bridge-emul: Fix emulation of W1C bits
PCI: aardvark: Fix return value of MSI domain .alloc() method
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
PCI: aardvark: Don't spam about PIO Response Status
Marek Vasut (3):
rsi: Fix module dev_oper_mode parameter description
ARM: dts: stm32: Reduce DHCOR SPI NOR frequency to 50 MHz
video: backlight: Drop maximum brightness override for brightness zero
Marijn Suijten (1):
ARM: dts: qcom: msm8974: Add xo_board reference clock to DSI0 PHY
Mario (1):
drm: panel-orientation-quirks: Add quirk for GPD Win3
Mark Brown (1):
tpm_tis_spi: Add missing SPI ID
Mark Rutland (2):
KVM: arm64: Extract ESR_ELx.EC only
irq: mips: avoid nested irq_enter()
Markus Schneider-Pargmann (1):
hwrng: mtk - Force runtime pm ops for sleep ops
Martin Fuzzey (3):
rsi: fix occasional initialisation failure with BT coex
rsi: fix key enabled check causing unwanted encryption for vap_id > 0
rsi: fix rate mask set leading to P2P failure
Martin Kepplinger (1):
media: imx: set a media_device bus_info string
Masami Hiramatsu (2):
ia64: kprobes: Fix to pass correct trampoline address to the handler
ARM: clang: Do not rely on lr register for stacktrace
Mathias Nyman (1):
xhci: Fix USB 3.1 enumeration issues by increasing roothub
power-on-good delay
Matthew Auld (1):
drm/ttm: stop calling tt_swapin in vm_access
Matthias Schiffer (1):
net: phy: micrel: make *-skew-ps check more lenient
Maurizio Lombardi (1):
nvmet-tcp: fix a memory leak when releasing a queue
Max Gurtovoy (1):
nvme-rdma: fix error code in nvme_rdma_setup_ctrl
Maxim Kiselev (1):
net: davinci_emac: Fix interrupt pacing disable
Meng Li (2):
soc: fsl: dpio: replace smp_processor_id with raw_smp_processor_id
soc: fsl: dpio: use the combined functions to protect critical zone
Menglong Dong (1):
workqueue: make sysfs of unbound kworker cpumask more clever
Miaohe Lin (1):
mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and
zs_unregister_migration()
Michael Ellerman (1):
powerpc: Fix is_kvm_guest() / kvm_para_available()
Michael Pratt (1):
posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
Michael Tretter (1):
media: allegro: ignore interrupt if mailbox is not initialized
Michael Walle (1):
crypto: caam - disable pkc for non-E SoCs
Michal Hocko (1):
mm, oom: do not trigger out_of_memory from the #PF
Michał Mirosław (1):
ARM: 9155/1: fix early early_iounmap()
Miklos Szeredi (1):
fuse: fix page stealing
Miquel Raynal (9):
mtd: rawnand: socrates: Keep the driver compatible with on-die ECC
engines
mtd: rawnand: ams-delta: Keep the driver compatible with on-die ECC
engines
mtd: rawnand: xway: Keep the driver compatible with on-die ECC engines
mtd: rawnand: mpc5121: Keep the driver compatible with on-die ECC
engines
mtd: rawnand: gpio: Keep the driver compatible with on-die ECC engines
mtd: rawnand: pasemi: Keep the driver compatible with on-die ECC
engines
mtd: rawnand: orion: Keep the driver compatible with on-die ECC
engines
mtd: rawnand: plat_nand: Keep the driver compatible with on-die ECC
engines
mtd: rawnand: au1550nd: Keep the driver compatible with on-die ECC
engines
Muchun Song (1):
seq_file: fix passing wrong private data
Nadezda Lutovinova (2):
media: s5p-mfc: Add checking to s5p_mfc_probe().
media: rcar-csi2: Add checking to rcsi2_start_receiver()
Naohiro Aota (1):
block: schedule queue restart after BLK_STS_ZONE_RESOURCE
Nathan Chancellor (1):
platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning
Nathan Lynch (1):
powerpc: fix unbalanced node refcount in check_kvm_guest()
Naveen N. Rao (4):
powerpc/lib: Add helper to check if offset is within conditional
branch range
powerpc/bpf: Validate branch ranges
powerpc/security: Add a helper to query stf_barrier type
powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC
Neeraj Upadhyay (1):
rcu: Fix existing exp request check in sync_sched_exp_online_cleanup()
Nehal Bakulchandra Shah (1):
usb: xhci: Enable runtime-pm by default on AMD Yellow Carp platform
Nikita Yushchenko (1):
staging: most: dim2: do not double-register the same device
Nuno Sá (2):
iio: ad5770r: make devicetree property reading consistent
iio: adis: do not disabe IRQs in 'adis_init()'
Olivier Moysan (2):
ARM: dts: stm32: fix SAI sub nodes register range
ARM: dts: stm32: fix AV96 board SAI2 pin muxing on stm32mp15
Ondrej Mosnacek (1):
selinux: fix race condition when computing ocontext SIDs
Pablo Neira Ayuso (2):
netfilter: conntrack: set on IPS_ASSURED if flows enters internal
stream state
netfilter: nft_dynset: relax superfluous check on set updates
Pali Rohár (13):
serial: core: Fix initializing and restoring termios speed
PCI: aardvark: Do not clear status bits of masked interrupts
PCI: aardvark: Fix checking for link up via LTSSM state
PCI: aardvark: Do not unmask unused interrupts
PCI: aardvark: Fix reporting Data Link Layer Link Active
PCI: aardvark: Fix configuring Reference clock
PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on
emulated bridge
PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated
bridge
PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge
PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated
bridge
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
PCI: aardvark: Fix PCIe Max Payload Size setting
Paul E. McKenney (1):
rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
Pavel Skripkin (3):
ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume
media: em28xx: add missing em28xx_close_extension
media: dvb-usb: fix ununit-value in az6027_rc_query
Pawan Gupta (1):
smackfs: Fix use-after-free in netlbl_catmap_walk()
Paweł Anikiel (1):
reset: socfpga: add empty driver allowing consumers to probe
Pekka Korpinen (1):
iio: dac: ad5446: Fix ad5622_write() return value
Peter Rosin (1):
ARM: dts: at91: tse850: the emac<->phy interface is rmii
Peter Zijlstra (5):
locking/lockdep: Avoid RCU-induced noinstr fail
x86: Increase exception stack sizes
x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
objtool: Fix static_call list generation
rcu: Always inline rcu_dynticks_task*_{enter,exit}()
Phoenix Huang (1):
Input: elantench - fix misreporting trackpoint coordinates
Pradeep Kumar Chitrapu (1):
ath11k: fix packet drops due to incorrect 6 GHz freq value in rx
status
Punit Agrawal (1):
kprobes: Do not use local variable when creating debugfs file
Quentin Monnet (1):
bpftool: Avoid leaking the JSON writer prepared for program metadata
Quinn Tran (4):
scsi: qla2xxx: Fix use after free in eh_abort path
scsi: qla2xxx: Relogin during fabric disturbance
scsi: qla2xxx: Fix gnl list corruption
scsi: qla2xxx: Turn off target reset during issue_lip
Rafael J. Wysocki (2):
PM: sleep: Do not let "syscore" devices runtime-suspend during system
transitions
ACPICA: Avoid evaluating methods too early during system resume
Rafał Miłecki (1):
ARM: dts: BCM5301X: Fix memory nodes names
Rahul Lakkireddy (1):
cxgb4: fix eeprom len when diagnostics not implemented
Rahul Tanwar (1):
pinctrl: equilibrium: Fix function addition in multiple groups
Rajat Asthana (1):
media: mceusb: return without resubmitting URB in case of -EPROTO
error.
Randy Dunlap (5):
mmc: winbond: don't build on M68K
ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
usb: musb: select GENERIC_PHY instead of depending on it
usb: typec: STUSB160X should select REGMAP_I2C
m68k: set a default value for MEMORY_RESERVE
Ranjani Sridharan (1):
ASoC: SOF: topology: do not power down primary core during topology
removal
Reimar Döffinger (1):
libata: fix checking of DMA state
Ricardo Koller (1):
KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c
Ricardo Ribalda (7):
media: v4l2-ioctl: Fix check_ext_ctrls
media: uvcvideo: Set capability in s_param
media: uvcvideo: Return -EIO for control errors
media: uvcvideo: Set unique vdev name based in type
media: ipu3-imgu: imgu_fmt: Handle properly try
media: ipu3-imgu: VIDIOC_QUERYCAP: Fix bus_info
media: v4l2-ioctl: S_CTRL output the right value
Richard Fitzgerald (4):
ASoC: cs42l42: Correct some register default values
ASoC: cs42l42: Defer probe if request_threaded_irq() returns
EPROBE_DEFER
ASoC: cs42l42: Use device_property API instead of of_property
ASoC: cs42l42: Correct configuring of switch inversion from ts-inv
Robert-Ionut Alexa (1):
soc: fsl: dpaa2-console: free buffer before returning from
dpaa2_console_read
Russell King (Oracle) (1):
net: phylink: avoid mvneta warning when setting pause parameters
Ryder Lee (1):
mt76: mt7915: fix an off-by-one bound check
Sandeep Maheswaram (1):
phy: qcom-snps: Correct the FSEL_MASK
Saurav Kashyap (1):
scsi: qla2xxx: Changes to support FCP2 Target
Scott Wood (1):
rcutorture: Avoid problematic critical section nesting on PREEMPT_RT
Sean Christopherson (3):
x86/irq: Ensure PI wakeup handler is unregistered before module unload
KVM: VMX: Unregister posted interrupt wakeup handler on hardware
unsetup
KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in
use
Sean Young (3):
media: ite-cir: IR receiver stop working after receive overflow
media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers
media: ir_toy: assignment to be16 should be of correct type
Sebastian Andrzej Siewior (1):
lockdep: Let lock_is_held_type() detect recursive read as read
Sebastian Krzyszkowiak (2):
power: supply: max17042_battery: Prevent int underflow in
set_soc_threshold
power: supply: max17042_battery: Clear status bits in interrupt
handler
Seevalamuthu Mariappan (1):
ath11k: Align bss_chan_info structure with firmware
Selvin Xavier (1):
RDMA/bnxt_re: Fix query SRQ failure
Shaoying Xu (1):
ext4: fix lazy initialization next schedule time computation in more
granular unit
Shayne Chen (2):
mt76: mt7915: fix sta_rec_wtbl tag len
mt76: mt7915: fix muar_idx in mt7915_mcu_alloc_sta_req()
Shuah Khan (2):
selftests: kvm: fix mismatched fclose() after popen()
selftests/core: fix conflicting types compile error for close_range()
Shyam Sundar S K (1):
net: amd-xgbe: Toggle PLL settings during rate change
Sidong Yang (1):
btrfs: reflink: initialize return value to 0 in btrfs_extent_same()
Simon Ser (1):
drm/panel-orientation-quirks: add Valve Steam Deck
Srikar Dronamraju (3):
powerpc: Refactor is_kvm_guest() declaration to new header
powerpc: Rename is_kvm_guest() to check_kvm_guest()
powerpc: Reintroduce is_kvm_guest() as a fast-path check
Srinivas Kandagatla (2):
soundwire: debugfs: use controller id and link_id for debugfs
scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
Sriram R (2):
ath11k: Avoid reg rules update during firmware recovery
ath11k: Avoid race during regd updates
Stafford Horne (1):
openrisc: fix SMP tlb flush NULL pointer dereference
Stefan Agner (2):
phy: micrel: ksz8041nl: do not use power down mode
serial: imx: fix detach/attach of serial console
Stefan Schaeckeler (1):
ACPI: AC: Quirk GK45 to skip reading _PSR
Stephan Gerhold (2):
arm64: dts: qcom: msm8916: Fix Secondary MI2S bit clock
arm64: dts: qcom: pm8916: Remove wrong reg-names for rtc@6000
Stephen Suryaputra (1):
gre/sit: Don't generate link-local addr if addr_gen_mode is
IN6_ADDR_GEN_MODE_NONE
Steven Rostedt (VMware) (2):
ring-buffer: Protect ring_buffer_reset() from reentrancy
tracefs: Have tracefs directories not set OTH permission bits by
default
Sudarshan Rajagopalan (1):
arm64: mm: update max_pfn after memory hotplug
Sukadev Bhattiprolu (2):
ibmvnic: don't stop queue in xmit
ibmvnic: Process crqs after enabling interrupts
Sungjong Seo (1):
exfat: fix incorrect loading of i_blocks for large files
Sven Eckelmann (1):
ath10k: fix max antenna gain unit
Sven Schnelle (4):
parisc: fix warning in flush_tlb_all
parisc/unwind: fix unwinder when CONFIG_64BIT is enabled
parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
s390/tape: fix timer initialization in tape_std_assign()
Sylwester Dziedziuch (1):
ice: Fix replacing VF hardware MAC to existing MAC filter
Tadeusz Struk (1):
scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()
Takashi Iwai (8):
Input: i8042 - Add quirk for Fujitsu Lifebook T725
ALSA: hda/realtek: Add a quirk for HP OMEN 15 mute LED
ALSA: hda/realtek: Add quirk for ASUS UX550VE
ALSA: hda: Free card instance properly at probe errors
ALSA: timer: Unconditionally unlink slave instances, too
ALSA: mixer: oss: Fix racy access to slots
ALSA: hda: Reduce udelay() at SKL+ position reporting
ALSA: hda: Use position buffer for SKL+ again
Tang Bin (1):
crypto: s5p-sss - Add error handling in s5p_aes_probe()
Tao Zhang (1):
coresight: cti: Correct the parameter for pm_runtime_put
Tetsuo Handa (2):
smackfs: use __GFP_NOFAIL for smk_cipso_doi()
smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
Thomas Perrot (1):
spi: spl022: fix Microwire full duplex mode
Tiezhu Yang (1):
samples/kretprobes: Fix return value if register_kretprobe() failed
Tim Crawford (1):
ALSA: hda/realtek: Add quirk for Clevo PC70HS
Tim Gardner (2):
drm/msm: prevent NULL dereference in msm_gpu_crashstate_capture()
net: enetc: unmap DMA in enetc_send_cmd()
Todd Kjos (3):
binder: use euid from cred instead of using task
binder: use cred instead of task for selinux checks
binder: use cred instead of task for getsecid
Tom Lendacky (3):
x86/sme: Use #define USE_EARLY_PGTABLE_L5 in mem_encrypt_identity.c
arch/cc: Introduce a function to check for confidential computing
features
x86/sev: Add an x86 version of cc_platform_has()
Tom Rix (2):
media: TDA1997x: handle short reads of hdmi info frame.
apparmor: fix error check
Tong Zhang (1):
scsi: dc395: Fix error case unwinding
Tony Lindgren (3):
mmc: sdhci-omap: Fix NULL pointer exception if regulator is not
configured
mmc: sdhci-omap: Fix context restore
bus: ti-sysc: Fix timekeeping_suspended warning on resume
Tony Lu (1):
net/smc: Fix smc_link->llc_testlink_time overflow
Trond Myklebust (6):
NFS: Fix dentry verifier races
NFS: Fix deadlocks in nfs_scan_commit_list()
NFS: Fix up commit deadlocks
NFS: Fix an Oops in pnfs_mark_request_commit()
NFSv4: Fix a regression in nfs_set_open_stateid_locked()
SUNRPC: Partial revert of commit 6f9f17287e78
Tuo Li (2):
media: s5p-mfc: fix possible null-pointer dereference in
s5p_mfc_probe()
ath: dfs_pattern_detector: Fix possible null-pointer dereference in
channel_detector_create()
Vasant Hegde (1):
powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module
unload
Vasily Averin (2):
memcg: prohibit unconditional exceeding the limit of dying tasks
mm, oom: pagefault_out_of_memory: don't force global OOM for dying
tasks
Vegard Nossum (1):
staging: ks7010: select CRYPTO_HASH/CRYPTO_MICHAEL_MIC
Vincent Donnefort (1):
PM: EM: Fix inefficient states detection
Vineeth Vijayan (1):
s390/cio: check the subchannel validity for dev_busid
Vitaly Kuznetsov (1):
x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
Vladimir Oltean (1):
net: stmmac: allow a tc-taprio base-time of zero
Vladimir Zapolskiy (1):
phy: qcom-qusb2: Fix a memory leak on probe
Waiman Long (1):
cgroup: Make rebind_subsystems() disable v2 controllers all at once
Walter Stoll (1):
watchdog: Fix OMAP watchdog early handling
Wan Jiabing (1):
soc: qcom: apr: Add of_node_put() before return
Wang Hai (3):
USB: serial: keyspan: fix memleak on probe errors
libertas_tf: Fix possible memory leak in probe and disconnect
libertas: Fix possible memory leak in probe and disconnect
Wen Gong (1):
ath11k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
Wen Gu (1):
net/smc: Correct spelling mistake to TCPF_SYN_RECV
Willem de Bruijn (1):
selftests/net: udpgso_bench_rx: fix port argument
Wolfram Sang (1):
memory: renesas-rpc-if: Correct QSPI data transfer in Manual mode
Xiao Ni (1):
md: update superblock after changing rdev flags in state_store
Xiaoming Ni (2):
powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found
powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
Xin Xiong (1):
mmc: moxart: Fix reference count leaks in moxart_probe
Xuan Zhuo (1):
virtio_ring: check desc == NULL when using indirect with packed
Yajun Deng (1):
net: net_namespace: Fix undefined member in key_remove_domain()
Yang Yingliang (6):
ASoC: soc-core: fix null-ptr-deref in snd_soc_del_component_unlocked()
pinctrl: core: fix possible memory leak in pinctrl_enable()
spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in
bcm_qspi_probe()
hwmon: Fix possible memleak in __hwmon_device_register()
driver core: Fix possible memory leak in device_link_add()
power: supply: max17040: fix null-ptr-deref in max17040_probe()
Yazen Ghannam (1):
EDAC/amd64: Handle three rank interleaving mode
Yifan Zhang (1):
drm/amdkfd: fix resume error when iommu disabled in Picasso
Yu Xiao (1):
nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
YueHaibing (2):
opp: Fix return in _opp_add_static_v2()
xen-pciback: Fix return in pm_ctrl_init()
Zev Weiss (3):
hwmon: (pmbus/lm25066) Add offset coefficients
hwmon: (pmbus/lm25066) Let compiler determine outer dimension of
lm25066_coeff
mtd: core: don't remove debugfs directory if device is in use
Zhang Changzhong (2):
can: j1939: j1939_tp_cmd_recv(): ignore abort message in the BAM
transport
can: j1939: j1939_can_recv(): ignore messages with invalid source
address
Zhang Qiao (1):
kernel/sched: Fix sched_fork() access an invalid sched_task_group
Zheyu Ma (7):
cavium: Return negative value when pci_alloc_irq_vectors() fails
scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
mISDN: Fix return values of the probe function
cavium: Fix return values of the probe function
media: netup_unidvb: handle interrupt properly according to the
firmware
memstick: r592: Fix a UAF bug when removing the driver
mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
Ziyang Xuan (2):
rsi: stop thread firstly in rsi_91x_init() error handling
net: vlan: fix a UAF in vlan_dev_real_dev()
Zong-Zhe Yang (1):
rtw88: fix RX clock gate setting while fifo dump
jing yangyang (1):
firmware/psci: fix application of sizeof to pointer
liuyuntao (1):
virtio-gpu: fix possible memory allocation failure
.../admin-guide/kernel-parameters.txt | 7 +
.../bindings/regulator/samsung,s5m8767.txt | 23 +-
Documentation/filesystems/fscrypt.rst | 10 +-
arch/Kconfig | 3 +
arch/arm/Makefile | 22 +-
arch/arm/boot/dts/at91-tse850-3.dts | 2 +-
arch/arm/boot/dts/bcm4708-netgear-r6250.dts | 2 +-
arch/arm/boot/dts/bcm4709-asus-rt-ac87u.dts | 2 +-
.../boot/dts/bcm4709-buffalo-wxr-1900dhp.dts | 2 +-
arch/arm/boot/dts/bcm4709-linksys-ea9200.dts | 2 +-
arch/arm/boot/dts/bcm4709-netgear-r7000.dts | 2 +-
arch/arm/boot/dts/bcm4709-netgear-r8000.dts | 2 +-
.../boot/dts/bcm4709-tplink-archer-c9-v1.dts | 2 +-
arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts | 2 +-
arch/arm/boot/dts/bcm53016-meraki-mr32.dts | 2 +-
arch/arm/boot/dts/bcm94708.dts | 2 +-
arch/arm/boot/dts/bcm94709.dts | 2 +-
arch/arm/boot/dts/omap3-gta04.dtsi | 2 +-
arch/arm/boot/dts/qcom-msm8974.dtsi | 4 +-
arch/arm/boot/dts/stm32mp15-pinctrl.dtsi | 8 +-
arch/arm/boot/dts/stm32mp151.dtsi | 16 +-
arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi | 2 +-
.../boot/dts/sun7i-a20-olinuxino-lime2.dts | 2 +-
arch/arm/kernel/stacktrace.c | 3 +-
arch/arm/mach-s3c/irq-s3c24xx.c | 22 +-
arch/arm/mm/Kconfig | 2 +-
arch/arm/mm/mmu.c | 4 +-
.../boot/dts/amlogic/meson-g12a-sei510.dts | 2 +-
.../boot/dts/amlogic/meson-g12a-u200.dts | 2 +-
.../boot/dts/amlogic/meson-g12a-x96-max.dts | 2 +-
.../dts/amlogic/meson-g12b-khadas-vim3.dtsi | 4 +-
.../dts/amlogic/meson-g12b-odroid-n2.dtsi | 4 +-
.../boot/dts/amlogic/meson-g12b-w400.dtsi | 4 +-
arch/arm64/boot/dts/qcom/msm8916.dtsi | 8 +-
arch/arm64/boot/dts/qcom/pm8916.dtsi | 1 -
.../boot/dts/renesas/beacon-renesom-som.dtsi | 1 +
arch/arm64/boot/dts/rockchip/rk3328.dtsi | 2 +-
arch/arm64/boot/dts/ti/k3-j721e-main.dtsi | 16 +-
arch/arm64/include/asm/esr.h | 1 +
arch/arm64/include/asm/pgtable.h | 12 +-
arch/arm64/kvm/hyp/hyp-entry.S | 2 +-
arch/arm64/kvm/hyp/nvhe/host.S | 2 +-
arch/arm64/mm/mmu.c | 5 +
arch/arm64/net/bpf_jit_comp.c | 5 +
arch/ia64/Kconfig.debug | 2 +-
arch/ia64/kernel/kprobes.c | 9 +-
arch/m68k/Kconfig.machine | 1 +
arch/mips/Kconfig | 1 +
arch/mips/include/asm/cmpxchg.h | 5 +-
arch/mips/include/asm/mips-cm.h | 12 +-
arch/mips/kernel/mips-cm.c | 21 +-
arch/mips/kernel/r2300_fpu.S | 4 +-
arch/mips/kernel/syscall.c | 9 -
arch/mips/lantiq/xway/dma.c | 23 +-
arch/openrisc/kernel/dma.c | 4 +-
arch/openrisc/kernel/smp.c | 6 +-
arch/parisc/kernel/entry.S | 2 +-
arch/parisc/kernel/smp.c | 19 +-
arch/parisc/kernel/unwind.c | 21 +-
arch/parisc/kernel/vmlinux.lds.S | 3 +-
arch/parisc/mm/fixmap.c | 5 +-
arch/parisc/mm/init.c | 4 +-
arch/powerpc/include/asm/code-patching.h | 1 +
arch/powerpc/include/asm/firmware.h | 6 -
arch/powerpc/include/asm/kvm_guest.h | 25 ++
arch/powerpc/include/asm/kvm_para.h | 2 +-
arch/powerpc/include/asm/security_features.h | 5 +
arch/powerpc/kernel/firmware.c | 12 +-
arch/powerpc/kernel/security.c | 5 +
arch/powerpc/lib/code-patching.c | 7 +-
arch/powerpc/net/bpf_jit.h | 33 ++-
arch/powerpc/net/bpf_jit64.h | 8 +-
arch/powerpc/net/bpf_jit_comp64.c | 64 ++++-
arch/powerpc/platforms/44x/fsp2.c | 2 +
arch/powerpc/platforms/85xx/Makefile | 4 +-
arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 7 +-
arch/powerpc/platforms/85xx/smp.c | 12 +-
arch/powerpc/platforms/powernv/opal-prd.c | 12 +-
arch/powerpc/platforms/pseries/smp.c | 3 +
arch/s390/kvm/priv.c | 2 +
arch/s390/kvm/pv.c | 21 +-
arch/s390/mm/gmap.c | 5 +-
arch/sh/kernel/cpu/fpu.c | 10 +-
arch/x86/Kconfig | 1 +
arch/x86/events/intel/uncore_snbep.c | 6 +-
arch/x86/hyperv/hv_init.c | 5 +-
arch/x86/include/asm/cpu_entry_area.h | 8 +-
arch/x86/include/asm/mem_encrypt.h | 1 +
arch/x86/include/asm/page_64_types.h | 2 +-
arch/x86/kernel/Makefile | 6 +
arch/x86/kernel/cc_platform.c | 69 +++++
arch/x86/kernel/cpu/amd.c | 2 +
arch/x86/kernel/cpu/common.c | 44 ++-
arch/x86/kernel/cpu/cpu.h | 1 +
arch/x86/kernel/cpu/hygon.c | 2 +
arch/x86/kernel/cpu/mce/intel.c | 5 +-
arch/x86/kernel/irq.c | 4 +-
arch/x86/kernel/sev-es.c | 32 ---
arch/x86/kernel/traps.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 15 +-
arch/x86/mm/cpu_entry_area.c | 7 +
arch/x86/mm/mem_encrypt.c | 1 +
arch/x86/mm/mem_encrypt_identity.c | 9 +
block/blk-mq.c | 18 +-
block/blk.h | 6 +
crypto/Kconfig | 2 +-
crypto/pcrypt.c | 12 +-
drivers/acpi/ac.c | 19 ++
drivers/acpi/acpica/acglobal.h | 2 +
drivers/acpi/acpica/hwesleep.c | 8 +-
drivers/acpi/acpica/hwsleep.c | 11 +-
drivers/acpi/acpica/hwxfsleep.c | 7 +
drivers/acpi/battery.c | 2 +-
drivers/acpi/pmic/intel_pmic.c | 51 ++--
drivers/android/binder.c | 22 +-
drivers/ata/libata-core.c | 2 +-
drivers/ata/libata-eh.c | 8 +
drivers/auxdisplay/ht16k33.c | 66 +++--
drivers/auxdisplay/img-ascii-lcd.c | 10 +
drivers/base/core.c | 4 +-
drivers/base/power/main.c | 9 +-
drivers/block/zram/zram_drv.c | 2 +-
drivers/bluetooth/btmtkuart.c | 13 +-
drivers/bus/ti-sysc.c | 65 ++++-
drivers/char/hw_random/mtk-rng.c | 9 +-
drivers/char/ipmi/ipmi_msghandler.c | 10 +-
drivers/char/ipmi/ipmi_watchdog.c | 17 +-
drivers/char/tpm/tpm2-space.c | 3 +
drivers/char/tpm/tpm_tis_core.c | 26 +-
drivers/char/tpm/tpm_tis_core.h | 4 +
drivers/char/tpm/tpm_tis_spi_main.c | 1 +
drivers/clk/at91/clk-sam9x60-pll.c | 4 +-
drivers/clk/at91/pmc.c | 5 +
drivers/clk/mvebu/ap-cpu-clk.c | 14 +-
drivers/clocksource/Kconfig | 1 +
drivers/cpuidle/sysfs.c | 5 +-
drivers/crypto/caam/caampkc.c | 19 +-
drivers/crypto/caam/regs.h | 3 +
drivers/crypto/qat/qat_common/adf_pf2vf_msg.c | 13 +
drivers/crypto/qat/qat_common/adf_vf_isr.c | 6 +
drivers/crypto/s5p-sss.c | 2 +
drivers/dma-buf/dma-buf.c | 1 +
drivers/dma/at_xdmac.c | 2 +-
drivers/dma/dmaengine.h | 2 +-
drivers/edac/amd64_edac.c | 22 +-
drivers/edac/sb_edac.c | 2 +-
drivers/firmware/psci/psci_checker.c | 2 +-
drivers/firmware/qcom_scm.c | 2 +-
drivers/gpio/gpio-mlxbf2.c | 5 +
drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 4 +-
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 8 +-
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 17 +-
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 1 +
.../drm/amd/display/dc/dcn20/dcn20_resource.c | 16 +-
.../gpu/drm/drm_panel_orientation_quirks.c | 47 +++-
drivers/gpu/drm/drm_plane_helper.c | 1 -
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c | 8 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 4 +
drivers/gpu/drm/msm/msm_gem.c | 4 +-
drivers/gpu/drm/msm/msm_gpu.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_svm.c | 4 +
drivers/gpu/drm/sun4i/sun8i_csc.h | 4 +-
drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -
drivers/gpu/drm/v3d/v3d_gem.c | 4 +-
drivers/gpu/drm/virtio/virtgpu_vq.c | 8 +-
drivers/hid/hid-u2fzero.c | 10 +-
drivers/hv/hyperv_vmbus.h | 1 +
drivers/hwmon/hwmon.c | 6 +-
drivers/hwmon/pmbus/lm25066.c | 25 +-
.../hwtracing/coresight/coresight-cti-core.c | 2 +-
drivers/i2c/busses/i2c-mt65xx.c | 2 +-
drivers/i2c/busses/i2c-xlr.c | 6 +-
drivers/iio/accel/st_accel_core.c | 21 +-
drivers/iio/accel/st_accel_i2c.c | 17 +-
drivers/iio/accel/st_accel_spi.c | 17 +-
drivers/iio/dac/ad5446.c | 9 +-
drivers/iio/dac/ad5770r.c | 2 +-
drivers/iio/gyro/st_gyro_core.c | 15 +-
drivers/iio/gyro/st_gyro_i2c.c | 17 +-
drivers/iio/gyro/st_gyro_spi.c | 17 +-
drivers/iio/imu/adis.c | 4 +-
drivers/iio/magnetometer/st_magn_core.c | 15 +-
drivers/iio/magnetometer/st_magn_i2c.c | 14 +-
drivers/iio/magnetometer/st_magn_spi.c | 14 +-
drivers/iio/pressure/st_pressure_core.c | 15 +-
drivers/iio/pressure/st_pressure_i2c.c | 17 +-
drivers/iio/pressure/st_pressure_spi.c | 17 +-
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 3 +-
drivers/infiniband/hw/mlx4/qp.c | 4 +-
drivers/infiniband/hw/qedr/verbs.c | 15 +-
drivers/infiniband/sw/rxe/rxe_param.h | 2 +-
drivers/input/joystick/iforce/iforce-usb.c | 2 +-
drivers/input/mouse/elantech.c | 13 +
drivers/input/serio/i8042-x86ia64io.h | 14 +
drivers/irqchip/irq-bcm6345-l1.c | 2 +-
drivers/irqchip/irq-sifive-plic.c | 8 +-
drivers/isdn/hardware/mISDN/hfcpci.c | 8 +-
drivers/md/md.c | 11 +-
drivers/media/dvb-frontends/mn88443x.c | 18 +-
drivers/media/i2c/ir-kbd-i2c.c | 1 +
drivers/media/i2c/mt9p031.c | 28 +-
drivers/media/i2c/tda1997x.c | 8 +-
drivers/media/pci/cx23885/cx23885-alsa.c | 3 +-
.../pci/netup_unidvb/netup_unidvb_core.c | 27 +-
drivers/media/platform/mtk-vpu/mtk_vpu.c | 5 +-
drivers/media/platform/rcar-vin/rcar-csi2.c | 2 +
drivers/media/platform/s5p-mfc/s5p_mfc.c | 6 +-
drivers/media/platform/stm32/stm32-dcmi.c | 19 +-
drivers/media/radio/radio-wl1273.c | 2 +-
drivers/media/radio/si470x/radio-si470x-i2c.c | 2 +-
drivers/media/radio/si470x/radio-si470x-usb.c | 2 +-
drivers/media/rc/ir_toy.c | 2 +-
drivers/media/rc/ite-cir.c | 2 +-
drivers/media/rc/mceusb.c | 1 +
drivers/media/spi/cxd2880-spi.c | 2 +-
drivers/media/usb/dvb-usb/az6027.c | 1 +
drivers/media/usb/dvb-usb/dibusb-common.c | 2 +-
drivers/media/usb/em28xx/em28xx-cards.c | 5 +-
drivers/media/usb/em28xx/em28xx-core.c | 5 +-
drivers/media/usb/tm6000/tm6000-video.c | 3 +-
drivers/media/usb/uvc/uvc_driver.c | 7 +-
drivers/media/usb/uvc/uvc_v4l2.c | 7 +-
drivers/media/usb/uvc/uvc_video.c | 5 +
drivers/media/v4l2-core/v4l2-ioctl.c | 67 +++--
drivers/memory/fsl_ifc.c | 13 +-
drivers/memory/renesas-rpc-if.c | 113 +++++---
drivers/memstick/core/ms_block.c | 2 +-
drivers/memstick/host/jmb38x_ms.c | 2 +-
drivers/memstick/host/r592.c | 8 +-
drivers/mfd/dln2.c | 18 ++
drivers/mfd/mfd-core.c | 2 +
drivers/mmc/host/Kconfig | 2 +-
drivers/mmc/host/dw_mmc.c | 3 +-
drivers/mmc/host/moxart-mmc.c | 29 +-
drivers/mmc/host/mtk-sd.c | 5 +
drivers/mmc/host/mxs-mmc.c | 10 +
drivers/mmc/host/sdhci-omap.c | 18 +-
drivers/most/most_usb.c | 5 +-
drivers/mtd/mtdcore.c | 4 +-
drivers/mtd/nand/raw/ams-delta.c | 12 +-
drivers/mtd/nand/raw/au1550nd.c | 12 +-
drivers/mtd/nand/raw/gpio.c | 12 +-
drivers/mtd/nand/raw/mpc5121_nfc.c | 12 +-
drivers/mtd/nand/raw/orion_nand.c | 12 +-
drivers/mtd/nand/raw/pasemi_nand.c | 12 +-
drivers/mtd/nand/raw/plat_nand.c | 12 +-
drivers/mtd/nand/raw/socrates_nand.c | 12 +-
drivers/mtd/nand/raw/xway_nand.c | 12 +-
drivers/mtd/spi-nor/controllers/hisi-sfc.c | 1 -
drivers/net/Kconfig | 2 +-
.../net/can/spi/mcp251xfd/mcp251xfd-core.c | 2 +-
drivers/net/dsa/rtl8366rb.c | 2 +-
drivers/net/ethernet/amd/xgbe/xgbe-common.h | 8 +
drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 20 +-
.../net/ethernet/cavium/thunder/nic_main.c | 2 +-
.../net/ethernet/cavium/thunder/nicvf_main.c | 4 +-
.../ethernet/chelsio/cxgb4/cxgb4_ethtool.c | 7 +-
drivers/net/ethernet/chelsio/cxgb4/t4_hw.h | 2 +
.../chelsio/inline_crypto/chtls/chtls_cm.c | 2 +-
.../chelsio/inline_crypto/chtls/chtls_cm.h | 2 +-
.../net/ethernet/freescale/enetc/enetc_qos.c | 18 +-
drivers/net/ethernet/google/gve/gve.h | 4 +-
drivers/net/ethernet/google/gve/gve_adminq.h | 1 +
drivers/net/ethernet/google/gve/gve_main.c | 48 +++-
drivers/net/ethernet/ibm/ibmvnic.c | 5 +-
drivers/net/ethernet/intel/ice/ice_base.c | 2 +-
.../net/ethernet/intel/ice/ice_virtchnl_pf.c | 20 +-
drivers/net/ethernet/netronome/nfp/bpf/main.c | 16 +-
drivers/net/ethernet/netronome/nfp/bpf/main.h | 2 +
.../net/ethernet/netronome/nfp/bpf/offload.c | 17 +-
drivers/net/ethernet/realtek/r8169_main.c | 1 +
drivers/net/ethernet/sfc/mcdi_port_common.c | 37 ++-
drivers/net/ethernet/sfc/ptp.c | 4 +-
drivers/net/ethernet/sfc/siena_sriov.c | 2 +-
.../net/ethernet/stmicro/stmmac/stmmac_tc.c | 2 -
drivers/net/ethernet/ti/davinci_emac.c | 16 +-
drivers/net/ifb.c | 2 +
drivers/net/phy/micrel.c | 9 +-
drivers/net/phy/phylink.c | 2 +-
drivers/net/vmxnet3/vmxnet3_drv.c | 1 -
drivers/net/vrf.c | 28 +-
drivers/net/wireless/ath/ath10k/mac.c | 45 +++-
drivers/net/wireless/ath/ath10k/sdio.c | 5 +-
drivers/net/wireless/ath/ath10k/usb.c | 7 +-
drivers/net/wireless/ath/ath10k/wmi.c | 4 +
drivers/net/wireless/ath/ath10k/wmi.h | 3 +
drivers/net/wireless/ath/ath11k/dbring.c | 16 +-
drivers/net/wireless/ath/ath11k/dp_rx.c | 13 +-
drivers/net/wireless/ath/ath11k/mac.c | 2 +-
drivers/net/wireless/ath/ath11k/qmi.c | 4 +-
drivers/net/wireless/ath/ath11k/reg.c | 11 +-
drivers/net/wireless/ath/ath11k/reg.h | 2 +-
drivers/net/wireless/ath/ath11k/wmi.c | 40 ++-
drivers/net/wireless/ath/ath11k/wmi.h | 3 +-
drivers/net/wireless/ath/ath6kl/usb.c | 7 +-
drivers/net/wireless/ath/ath9k/main.c | 4 +-
.../net/wireless/ath/dfs_pattern_detector.c | 10 +-
drivers/net/wireless/ath/wcn36xx/dxe.c | 49 ++--
drivers/net/wireless/ath/wcn36xx/main.c | 8 +-
drivers/net/wireless/ath/wcn36xx/smd.c | 44 ++-
drivers/net/wireless/ath/wcn36xx/txrx.c | 64 +++--
drivers/net/wireless/ath/wcn36xx/txrx.h | 3 +-
drivers/net/wireless/broadcom/b43/phy_g.c | 2 +-
.../net/wireless/broadcom/b43legacy/radio.c | 2 +-
.../broadcom/brcm80211/brcmfmac/dmi.c | 10 +
.../net/wireless/intel/iwlwifi/mvm/utils.c | 3 +
.../net/wireless/marvell/libertas/if_usb.c | 2 +
.../net/wireless/marvell/libertas_tf/if_usb.c | 2 +
drivers/net/wireless/marvell/mwifiex/11n.c | 5 +-
.../net/wireless/marvell/mwifiex/cfg80211.c | 32 +--
drivers/net/wireless/marvell/mwifiex/pcie.c | 36 ++-
drivers/net/wireless/marvell/mwifiex/usb.c | 16 ++
drivers/net/wireless/marvell/mwl8k.c | 2 +-
.../net/wireless/mediatek/mt76/mt7615/mac.c | 15 +-
.../net/wireless/mediatek/mt76/mt76x02_mac.c | 13 +-
.../net/wireless/mediatek/mt76/mt7915/mcu.c | 8 +-
.../wireless/microchip/wilc1000/cfg80211.c | 3 +-
.../realtek/rtl818x/rtl8187/rtl8225.c | 14 +-
drivers/net/wireless/realtek/rtw88/fw.c | 7 +-
drivers/net/wireless/realtek/rtw88/reg.h | 1 +
drivers/net/wireless/rsi/rsi_91x_core.c | 2 +
drivers/net/wireless/rsi/rsi_91x_hal.c | 10 +-
drivers/net/wireless/rsi/rsi_91x_mac80211.c | 74 ++----
drivers/net/wireless/rsi/rsi_91x_main.c | 17 +-
drivers/net/wireless/rsi/rsi_91x_mgmt.c | 24 +-
drivers/net/wireless/rsi/rsi_91x_sdio.c | 5 +-
drivers/net/wireless/rsi/rsi_91x_usb.c | 5 +-
drivers/net/wireless/rsi/rsi_hal.h | 11 +
drivers/net/wireless/rsi/rsi_main.h | 15 +-
drivers/net/xen-netfront.c | 8 +
drivers/nfc/pn533/pn533.c | 6 +-
drivers/nvme/host/multipath.c | 9 +-
drivers/nvme/host/rdma.c | 2 +
drivers/nvme/target/configfs.c | 2 +
drivers/nvme/target/rdma.c | 24 ++
drivers/nvme/target/tcp.c | 21 +-
drivers/of/unittest.c | 16 +-
drivers/opp/of.c | 2 +-
.../controller/cadence/pcie-cadence-plat.c | 2 +
drivers/pci/controller/dwc/pcie-uniphier.c | 26 +-
drivers/pci/controller/pci-aardvark.c | 251 +++++++++++++++---
drivers/pci/pci-bridge-emul.c | 13 +
drivers/pci/quirks.c | 1 +
drivers/phy/qualcomm/phy-qcom-qusb2.c | 16 +-
drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 2 +-
drivers/phy/ti/phy-gmii-sel.c | 2 +
drivers/pinctrl/core.c | 2 +
drivers/pinctrl/pinctrl-equilibrium.c | 7 +-
drivers/pinctrl/renesas/core.c | 2 +-
drivers/platform/x86/thinkpad_acpi.c | 2 +-
drivers/platform/x86/wmi.c | 9 +-
drivers/power/supply/bq27xxx_battery_i2c.c | 3 +-
drivers/power/supply/max17040_battery.c | 2 +
drivers/power/supply/max17042_battery.c | 12 +-
drivers/power/supply/rt5033_battery.c | 2 +-
drivers/regulator/s5m8767.c | 21 +-
drivers/remoteproc/remoteproc_core.c | 8 +-
drivers/reset/reset-socfpga.c | 26 ++
drivers/rtc/rtc-rv3032.c | 4 +-
drivers/s390/char/tape_std.c | 3 +-
drivers/s390/cio/css.c | 4 +-
drivers/s390/cio/device_ops.c | 12 +-
drivers/s390/crypto/ap_queue.c | 2 +
drivers/scsi/csiostor/csio_lnode.c | 2 +-
drivers/scsi/dc395x.c | 1 +
drivers/scsi/pm8001/pm8001_hwi.c | 2 +-
drivers/scsi/qla2xxx/qla_attr.c | 24 +-
drivers/scsi/qla2xxx/qla_dbg.c | 3 +-
drivers/scsi/qla2xxx/qla_gbl.h | 2 -
drivers/scsi/qla2xxx/qla_init.c | 54 +++-
drivers/scsi/qla2xxx/qla_mr.c | 23 --
drivers/scsi/qla2xxx/qla_os.c | 47 ++--
drivers/scsi/qla2xxx/qla_target.c | 14 +-
drivers/scsi/scsi_lib.c | 2 -
drivers/scsi/ufs/ufshcd-pltfrm.c | 6 +-
drivers/scsi/ufs/ufshcd.c | 29 +-
drivers/scsi/ufs/ufshcd.h | 3 +
drivers/soc/fsl/dpaa2-console.c | 1 +
drivers/soc/fsl/dpio/dpio-service.c | 2 +-
drivers/soc/fsl/dpio/qbman-portal.c | 9 +-
drivers/soc/qcom/apr.c | 2 +
drivers/soc/qcom/rpmhpd.c | 21 +-
drivers/soc/tegra/pmc.c | 4 +-
drivers/soundwire/debugfs.c | 2 +-
drivers/spi/spi-bcm-qspi.c | 5 +-
drivers/spi/spi-pl022.c | 5 +-
drivers/spi/spi-rpc-if.c | 4 +-
drivers/staging/ks7010/Kconfig | 3 +
.../staging/media/allegro-dvt/allegro-core.c | 9 +
.../media/atomisp/i2c/atomisp-lm3554.c | 37 ++-
.../staging/media/imx/imx-media-dev-common.c | 2 +
drivers/staging/media/ipu3/ipu3-v4l2.c | 7 +-
drivers/staging/media/rkvdec/rkvdec-h264.c | 5 +-
drivers/staging/media/rkvdec/rkvdec.c | 40 +--
drivers/staging/most/dim2/Makefile | 2 +-
drivers/staging/most/dim2/dim2.c | 24 +-
drivers/staging/most/dim2/sysfs.c | 49 ----
drivers/staging/most/dim2/sysfs.h | 11 -
drivers/tty/serial/8250/8250_dw.c | 2 +-
drivers/tty/serial/8250/8250_port.c | 21 +-
drivers/tty/serial/imx.c | 4 +-
drivers/tty/serial/serial_core.c | 16 +-
drivers/tty/serial/xilinx_uartps.c | 3 +-
drivers/usb/chipidea/core.c | 23 +-
drivers/usb/dwc2/drd.c | 24 +-
drivers/usb/gadget/legacy/hid.c | 4 +-
drivers/usb/host/xhci-hub.c | 3 +-
drivers/usb/host/xhci-pci.c | 16 ++
drivers/usb/misc/iowarrior.c | 8 +-
drivers/usb/musb/Kconfig | 2 +-
drivers/usb/serial/keyspan.c | 15 +-
drivers/usb/typec/Kconfig | 4 +-
drivers/video/backlight/backlight.c | 6 -
drivers/video/fbdev/chipsfb.c | 2 +-
drivers/virtio/virtio_ring.c | 14 +-
drivers/watchdog/Kconfig | 2 +-
drivers/watchdog/f71808e_wdt.c | 4 +-
drivers/watchdog/omap_wdt.c | 6 +-
drivers/xen/balloon.c | 86 ++++--
.../xen/xen-pciback/conf_space_capability.c | 2 +-
fs/btrfs/disk-io.c | 3 +-
fs/btrfs/reflink.c | 2 +-
fs/btrfs/tree-log.c | 4 +-
fs/btrfs/volumes.c | 14 +-
fs/crypto/fscrypt_private.h | 5 +-
fs/crypto/hkdf.c | 11 +-
fs/crypto/keysetup.c | 57 +++-
fs/erofs/decompressor.c | 1 -
fs/exfat/inode.c | 2 +-
fs/ext4/super.c | 9 +-
fs/f2fs/inode.c | 2 +-
fs/f2fs/namei.c | 2 +-
fs/fuse/dev.c | 14 +-
fs/gfs2/glock.c | 24 +-
fs/jfs/jfs_mount.c | 51 ++--
fs/nfs/dir.c | 7 +-
fs/nfs/direct.c | 2 +-
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 +-
fs/nfs/nfs4idmap.c | 2 +-
fs/nfs/nfs4proc.c | 15 +-
fs/nfs/pnfs.h | 2 +-
fs/nfs/pnfs_nfs.c | 6 +-
fs/nfs/write.c | 26 +-
fs/ocfs2/file.c | 8 +-
fs/orangefs/dcache.c | 4 +-
fs/proc/stat.c | 4 +-
fs/proc/uptime.c | 14 +-
fs/tracefs/inode.c | 3 +-
include/linux/blkdev.h | 2 -
include/linux/cc_platform.h | 88 ++++++
include/linux/console.h | 2 +
include/linux/ethtool_netlink.h | 3 +
include/linux/filter.h | 1 +
include/linux/kernel_stat.h | 1 +
include/linux/libata.h | 2 +-
include/linux/lsm_hook_defs.h | 14 +-
include/linux/lsm_hooks.h | 14 +-
include/linux/nfs_fs.h | 1 +
include/linux/posix-timers.h | 2 +
include/linux/rpmsg.h | 2 +-
include/linux/sched/task.h | 3 +-
include/linux/sched/task_stack.h | 4 +
include/linux/security.h | 33 ++-
include/linux/seq_file.h | 2 +-
include/linux/tpm.h | 1 +
include/memory/renesas-rpc-if.h | 1 +
include/net/inet_connection_sock.h | 2 +-
include/net/llc.h | 4 +-
include/net/neighbour.h | 12 +-
include/net/sch_generic.h | 4 +
include/net/sock.h | 2 +-
include/net/strparser.h | 16 +-
include/net/tcp.h | 17 +-
include/net/udp.h | 5 +-
include/uapi/linux/ethtool_netlink.h | 4 +-
include/uapi/linux/pci_regs.h | 6 +
kernel/bpf/core.c | 4 +-
kernel/bpf/verifier.c | 4 +-
kernel/cgroup/cgroup.c | 31 ++-
kernel/cgroup/rstat.c | 2 -
kernel/fork.c | 3 +-
kernel/kprobes.c | 3 +-
kernel/locking/lockdep.c | 4 +-
kernel/power/energy_model.c | 23 +-
kernel/power/swap.c | 2 +-
kernel/rcu/rcutorture.c | 48 +++-
kernel/rcu/tasks.h | 3 +-
kernel/rcu/tree_exp.h | 2 +-
kernel/rcu/tree_plugin.h | 8 +-
kernel/sched/core.c | 43 +--
kernel/signal.c | 18 +-
kernel/time/posix-cpu-timers.c | 19 +-
kernel/trace/ring_buffer.c | 5 +
kernel/trace/tracing_map.c | 40 +--
kernel/workqueue.c | 15 +-
lib/decompress_unxz.c | 2 +-
lib/iov_iter.c | 5 +-
lib/xz/xz_dec_lzma2.c | 21 +-
lib/xz/xz_dec_stream.c | 6 +-
mm/memcontrol.c | 27 +-
mm/oom_kill.c | 23 +-
mm/zsmalloc.c | 7 +-
net/8021q/vlan.c | 3 -
net/8021q/vlan_dev.c | 3 +
net/9p/client.c | 2 +
net/bluetooth/sco.c | 9 +-
net/can/j1939/main.c | 7 +
net/can/j1939/transport.c | 6 +
net/core/dev.c | 5 +-
net/core/filter.c | 21 ++
net/core/neighbour.c | 48 ++--
net/core/net-sysfs.c | 55 ++++
net/core/net_namespace.c | 4 +
net/core/stream.c | 3 -
net/core/sysctl_net_core.c | 2 +-
net/dccp/dccp.h | 2 +-
net/dccp/proto.c | 14 +-
net/ethtool/pause.c | 3 +-
net/ipv4/inet_connection_sock.c | 4 +-
net/ipv4/inet_hashtables.c | 2 +-
net/ipv4/proc.c | 2 +-
net/ipv4/tcp.c | 40 ++-
net/ipv4/tcp_bpf.c | 1 -
net/ipv6/addrconf.c | 3 +
net/ipv6/udp.c | 2 +-
net/netfilter/nf_conntrack_proto_udp.c | 7 +-
net/netfilter/nfnetlink_queue.c | 2 +-
net/netfilter/nft_dynset.c | 11 +-
net/rds/ib.c | 10 -
net/rds/ib.h | 6 -
net/rds/ib_cm.c | 128 +++++----
net/rds/ib_recv.c | 18 +-
net/rds/ib_send.c | 8 +
net/rxrpc/rtt.c | 2 +-
net/sched/sch_generic.c | 9 +
net/sched/sch_mq.c | 24 ++
net/sched/sch_mqprio.c | 23 ++
net/sched/sch_taprio.c | 27 +-
net/smc/af_smc.c | 20 +-
net/smc/smc_llc.c | 2 +-
net/strparser/strparser.c | 10 +-
net/sunrpc/addr.c | 40 ++-
net/sunrpc/xprt.c | 28 +-
net/vmw_vsock/af_vsock.c | 2 +
samples/kprobes/kretprobe_example.c | 2 +-
scripts/leaking_addresses.pl | 3 +-
security/apparmor/label.c | 4 +-
security/integrity/evm/evm_main.c | 2 +-
security/security.c | 14 +-
security/selinux/hooks.c | 36 ++-
security/selinux/ss/services.c | 162 ++++++-----
security/smack/smackfs.c | 11 +-
sound/core/oss/mixer_oss.c | 43 ++-
sound/core/timer.c | 13 +-
sound/pci/hda/hda_intel.c | 74 +++---
sound/pci/hda/patch_realtek.c | 82 ++++++
sound/soc/codecs/cs42l42.c | 88 +++---
sound/soc/soc-core.c | 1 +
sound/soc/sof/topology.c | 9 +
sound/synth/emux/emux.c | 2 +-
sound/usb/6fire/comm.c | 2 +-
sound/usb/6fire/firmware.c | 6 +-
sound/usb/format.c | 1 +
sound/usb/line6/driver.c | 14 +-
sound/usb/line6/driver.h | 2 +-
sound/usb/line6/podhd.c | 6 +-
sound/usb/line6/toneport.c | 2 +-
sound/usb/misc/ua101.c | 4 +-
sound/usb/quirks.c | 1 +
tools/bpf/bpftool/prog.c | 16 +-
tools/lib/bpf/bpf_core_read.h | 2 +-
tools/lib/bpf/btf.c | 25 +-
tools/objtool/check.c | 19 +-
tools/perf/util/bpf-event.c | 4 +-
.../selftests/bpf/prog_tests/perf_buffer.c | 4 +-
.../selftests/bpf/prog_tests/sk_lookup.c | 85 ++++--
.../testing/selftests/bpf/progs/strobemeta.h | 4 +-
.../selftests/bpf/progs/test_sk_lookup.c | 62 +++--
tools/testing/selftests/bpf/test_progs.c | 4 +-
.../selftests/bpf/verifier/array_access.c | 2 +-
.../testing/selftests/core/close_range_test.c | 2 +-
tools/testing/selftests/kvm/lib/x86_64/svm.c | 22 +-
.../selftests/kvm/x86_64/mmio_warning_test.c | 2 +-
tools/testing/selftests/net/fcnal-test.sh | 3 +
tools/testing/selftests/net/udpgso_bench_rx.c | 11 +-
587 files changed, 4709 insertions(+), 2317 deletions(-)
create mode 100644 arch/powerpc/include/asm/kvm_guest.h
create mode 100644 arch/x86/kernel/cc_platform.c
delete mode 100644 drivers/staging/most/dim2/sysfs.c
create mode 100644 include/linux/cc_platform.h
--
2.20.1
From: 沈子俊 <shenzijun(a)kylinos.cn>
kylin inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4A842?from=project-issue
CVE: NA
-------------------------------------------------------------------------
Add the CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 and CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64
options in arch/x86/configs/openeuler_defconfig.
Signed-off-by: 沈子俊 <shenzijun(a)kylinos.cn>
---
arch/x86/configs/openeuler_defconfig | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index b25d908dc7a1..9b23f113f669 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -7991,6 +7991,8 @@ CONFIG_CRYPTO_SERPENT_SSE2_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX2_X86_64=m
CONFIG_CRYPTO_SM4=m
+CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64=m
+CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
--
2.30.0
[PATCH openEuler-5.10 01/54] hugetlb: before freeing hugetlb page set dtor to appropriate value
by Zheng Zengkai 03 Dec '21
From: Mike Kravetz <mike.kravetz(a)oracle.com>
mainline inclusion
from mainline-v5.15-rc1
commit e32d20c0c88b1cd0a44f882c4f0eb2f536363d1b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IGRQ
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------
When removing a hugetlb page from the pool the ref count is set to one (as
the free page has no ref count) and compound page destructor is set to
NULL_COMPOUND_DTOR. Since a subsequent call to free the hugetlb page will
call __free_pages for non-gigantic pages and free_gigantic_page for
gigantic pages the destructor is not used.
However, consider the following race with code taking a speculative
reference on the page:
Thread 0                                Thread 1
--------                                --------
remove_hugetlb_page
  set_page_refcounted(page);
  set_compound_page_dtor(page,
           NULL_COMPOUND_DTOR);
                                        get_page_unless_zero(page)
__update_and_free_page
  __free_pages(page,
           huge_page_order(h));
                /* Note that __free_pages() will simply drop
                   the reference to the page. */
                                        put_page(page)
                                          __put_compound_page()
                                            destroy_compound_page
                                              NULL_COMPOUND_DTOR
                                              BUG: kernel NULL pointer
                                              dereference, address:
                                              0000000000000000
To address this race, set the dtor to the normal compound page dtor for
non-gigantic pages. The dtor for gigantic pages does not matter as
gigantic pages are changed from a compound page to 'just a group of pages'
before freeing. Hence, the destructor is not used.
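To make the failure mode concrete, here is a minimal userspace sketch (hypothetical names, stdatomic-based; not the kernel implementation) of the speculative-reference pattern: whoever drops the last reference runs the destructor, so the destructor must already be valid while get_page_unless_zero() can still succeed.

#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_int refcount;
	void (*dtor)(struct obj *);	/* must be valid once speculative refs are possible */
};

/* Analogue of get_page_unless_zero(): take a reference only if the
 * count is still non-zero. */
static bool get_unless_zero(struct obj *o)
{
	int c = atomic_load(&o->refcount);

	while (c != 0)
		if (atomic_compare_exchange_weak(&o->refcount, &c, c + 1))
			return true;
	return false;
}

/* Analogue of put_page(): the thread dropping the last reference
 * invokes the destructor; a NULL dtor here is exactly the oops shown
 * in the race diagram above. */
static void put_ref(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		o->dtor(o);
}

The patch below closes the window by leaving a valid destructor (COMPOUND_PAGE_DTOR) in place for non-gigantic pages.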
Link: https://lkml.kernel.org/r/20210809184832.18342-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Mina Almasry <almasrymina(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
mm/hugetlb.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 47dd6b5e0040..6ae2d2e90681 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1364,8 +1364,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
 		h->surplus_huge_pages_node[nid]--;
 	}
 
+	/*
+	 * Very subtle
+	 *
+	 * For non-gigantic pages set the destructor to the normal compound
+	 * page dtor.  This is needed in case someone takes an additional
+	 * temporary ref to the page, and freeing is delayed until they drop
+	 * their reference.
+	 *
+	 * For gigantic pages set the destructor to the null dtor.  This
+	 * destructor will never be called.  Before freeing the gigantic
+	 * page destroy_compound_gigantic_page will turn the compound page
+	 * into a simple group of pages.  After this the destructor does not
+	 * apply.
+	 *
+	 * This handles the case where more than one ref is held when and
+	 * after update_and_free_page is called.
+	 */
 	set_page_refcounted(page);
-	set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+	if (hstate_is_gigantic(h))
+		set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+	else
+		set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
 
 	h->nr_huge_pages--;
 	h->nr_huge_pages_node[nid]--;
--
2.20.1
On 2021/12/3 17:46, 郑振鹏 wrote:
> Hi Xie,
>
> Once the gigabit driver has been merged into the 4.19 kernel, I plan to first push the 10-gigabit driver's feature upgrade for 4.19 along with patches syncing the latest bug fixes, and then prepare the gigabit driver patches for the 5.10 kernel.
5.10 will ship in the 22.03 LTS release in March. Don't you plan to get the driver into that release?
>
> BR,
> zhenpeng
>
> ----------------------------------------------------------------
>
> ****************************************************************
>
> 郑振鹏(zheng zhenpeng)
>
> Beijing WangXun Technology Co., Ltd. Software Engineer.
>
> Room A507, HuaXing Times Square, No.478 West Wensan Road.
>
> West Lake District, Hangzhou City, 310013 ZHEJIANG, P.R.CHINA.
>
>
>
> Office: +86(0571)89807901-8014
>
> Mobile: +86-13656681762
>
> E-Mail: zhenpengzheng(a)net-swift.com
>
> ****************************************************************
>
>
> *From:* Xie XiuQi <xiexiuqi(a)huawei.com>
> *Sent:* 2021-12-03 16:15
> *To:* 郑振鹏 <zhenpengzheng(a)net-swift.com>; QiuLaibin <qiulaibin(a)huawei.com>
> *Cc:* yangyingliang(a)huawei.com; kernel(a)openeuler.org
> *Subject:* Re: [openEuler] Merging the Netswift Giga NIC driver into openEuler 20.03
>
>
> On 2021/12/3 15:17, 郑振鹏 wrote:
> > Hello,
> >
> > So far I have only tested this patch on the x86 platform. Once ARM platform testing is complete, I will submit a new patch.
>
> Thanks.
>
> Is Netswift Giga NIC support for 22.03 (the 5.10 kernel) also being worked on?
>
> >
> > BR,
> > zhenpeng
> >
> >
> > ----------------------------------------------------------------
> >
> > ****************************************************************
> >
> > 郑振鹏(zheng zhenpeng)
> >
> > 北京网迅科技有限公司杭州分公司 软件工程师
> >
> > 浙江省杭州市西湖区文三路478号华星时代广场A座507室 310013
> >
> > Beijing WangXun Technology Co., Ltd. Software Engineer.
> >
> > Room A507, HuaXing Times Square, No.478 West Wensan Road.
> >
> > West Lake District, Hangzhou City, 310013 ZHEJIANG, P.R.CHINA.
> >
> >
> >
> > Office: +86(0571)89807901-8014
> >
> > Mobile: +86-13656681762
> >
> > E-Mail: zhenpengzheng(a)net-swift.com
> >
> > ****************************************************************
> >
> >
> > *From:* QiuLaibin <qiulaibin(a)huawei.com>
> > *Sent:* 2021-12-03 15:01
> > *To:* zhenpengzheng(a)net-swift.com
> > *Cc:* Xiexiuqi <xiexiuqi(a)huawei.com>; yangyingliang(a)huawei.com
> > *Subject:* [openEuler] Merging the Netswift Giga NIC driver into openEuler 20.03
> > Hi pengzheng,
> >
> > Thank you very much for your submission!
> >
> > We are currently merging into the release the Netswift Giga NIC driver patch set you previously submitted for openEuler:
> >
> > [openEuler-1.0-LTS,1/2] net: ngbe: Add Netswift Giga NIC driver
> > [openEuler-1.0-LTS,2/2] x86/config: Enable netswift Giga NIC driver for x86
> >
> > However, since the build config enables the driver only in the x86 config, we need to confirm whether it supports only the x86 platform, whether it can be enabled on ARM, and whether mistakenly enabling the ARM-related config would introduce problems.
> >
> > Best regards,
> > Laibin Qiu
> >
> >
> >
>
TC agenda item submission:
Topic: kernel page size and supported CPU specifications for the openEuler 22.03 LTS ARM64 release (5.10 kernel):
Decision point 1: Should openEuler 22.03 LTS default to 4K pages (with 48-bit VA/PA)?
Decision point 2: If 4K pages are the default, should a separate kernel package with 64K pages also be shipped?
Decision point 3: Maximum supported CPU count for openEuler 22.03 LTS: NR_CPUS=4096, with up to 128 NUMA nodes.
(Original 20.03 configuration: 64K pages, 48-bit VA/PA, NR_CPUS 1024, NODES 16)
Link to the earlier discussion:
https://gitee.com/openeuler/kernel/issues/I4HDHZ
Rationale 1: The discussion shows that 64K pages bring quite a few compatibility problems, and most OSes in the industry use 4K, which simplifies compatibility certification for drivers and upper-layer software and supports more usage scenarios. For performance, workloads where 64K wins can use huge pages or a separate 64K kernel build.
Rationale 2: Supporting 52-bit large memory requires 64K pages; 4K pages support at most 48-bit VA/PA, i.e. 256 TB of address space (a quick check of this arithmetic is sketched below). Supporting larger, PB-scale memory requires 64K.
Rationale 3: Judging from demand over the next few years and the specifications supported by other OSes, growing CPU counts are the trend; for compatibility and capacity, NR_CPUS should match the industry maximum of 4096. A 4-socket Phytium S2500 currently has at most 32 nodes; to meet evolution needs over the next few years, the proposed maximum NODES is 128.
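As a quick check of the address-space arithmetic above (a standalone sketch, not part of any patch; names are ours):

#include <stdio.h>

/* 48-bit addressing covers 2^48 bytes = 256 TiB; 52-bit covers 4 PiB. */
int main(void)
{
	unsigned long long va48 = 1ULL << 48;
	unsigned long long va52 = 1ULL << 52;

	printf("48-bit VA/PA: %llu TiB\n", va48 >> 40);	/* prints 256 */
	printf("52-bit VA/PA: %llu PiB\n", va52 >> 50);	/* prints 4 */
	return 0;
}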
On 2021/12/3 14:35, yangcong wrote:
>
>
> Submitted agenda items:
> Topic 5: Introducing a license risk identification system into the openEuler community - Yang Cong <yangcong_hit(a)163.com>
> Topic 6: Strategy for choosing licenses for the different types of content in openEuler community repos - Yang Cong <yangcong_hit(a)163.com>
> On 2021-12-01 09:34, Hufeng (Solar, Euler) <solar.hu(a)huawei.com> wrote:
> The current agenda items are as follows:
> Topic 1: release management SIG work plan - Hu Feng (carried over to this meeting)
> Topic 2: Planning discussion for new openEuler technology domains: Edge - Liu Shouyong, Embedded - Ren Wei (carried over to this meeting)
> Topic 3: Community resource support and project approval for NestOS: Du Yiwei <duyiwei(a)kylinos.cn> (carried over to this meeting)
> a. How to download NestOS images from the openEuler website
> b. Providing the resources needed to deploy the automatic-update environment NestOS requires
> c. How to set up the project so that more people participate in NestOS
> Topic 4: Greenplum white paper review - 270162781(a)qq.com - bo zhao
> Before the meeting, please confirm that the issues left over from previous meetings have been closed.
>
>
> -----Original Message-----
> From: Hufeng (Solar, Euler)
> Sent: Tuesday, November 30, 2021 2:45 PM
> To: tc(a)openeuler.org; 'dev(a)openeuler.org' <dev(a)openeuler.org>
> Subject: [TC] Agenda collection; agenda submitters, please attend on time, thank you. RE: [Dev] openEuler Technical Committee regular meeting
>
> The current agenda items are as follows:
> Topic 1: release management SIG work plan - Hu Feng (carried over to this meeting)
> Topic 2: Planning discussion for new openEuler technology domains: Edge - Liu Shouyong, Embedded - Ren Wei (carried over to this meeting)
> Topic 3: Community resource support and project approval for NestOS: Du Yiwei <duyiwei(a)kylinos.cn> (carried over to this meeting)
> a. How to download NestOS images from the openEuler website
> b. Providing the resources needed to deploy the automatic-update environment NestOS requires
> c. How to set up the project so that more people participate in NestOS
>
> -----Original Message-----
> From: openEuler conference [mailto:public@openeuler.org]
> Sent: Tuesday, November 30, 2021 2:35 PM
> Subject: [Dev] openEuler Technical Committee regular meeting
>
> Hello!
>
> The openEuler TC SIG invites you to attend a Zoom meeting (auto-recorded) to be held at 2021-12-01 10:00.
>
> The subject of the meeting is: openEuler Technical Committee regular meeting.
>
> You can join the meeting at https://us06web.zoom.us/j/82159612220?pwd=ZlNxWkEwY1MyUlQ3SmtFNmNEVGtwQT09.
>
> Note: You are advised to change your participant name after joining the meeting, or use your ID at gitee.com.
>
> More information: https://openeuler.org/en/
>
[PATCH openEuler-1.0-LTS] config: disable CONFIG_NGBE by default in hulk_defconfig
by Yang Yingliang 03 Dec '21
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4C4XW?from=project-issue
CVE: NA
---------------------------------------
Disable CONFIG_NGBE by default on ARM64 in hulk_defconfig.
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/configs/hulk_defconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/configs/hulk_defconfig b/arch/arm64/configs/hulk_defconfig
index fdf628f1fa028..e80f4b7fde56d 100644
--- a/arch/arm64/configs/hulk_defconfig
+++ b/arch/arm64/configs/hulk_defconfig
@@ -2491,6 +2491,7 @@ CONFIG_ICE=m
CONFIG_FM10K=m
CONFIG_NET_VENDOR_NETSWIFT=y
CONFIG_TXGBE=m
+# CONFIG_NGBE is not set
# CONFIG_JME is not set
# CONFIG_NET_VENDOR_MARVELL is not set
CONFIG_NET_VENDOR_MELLANOX=y
--
2.25.1
03 Dec '21
From: zhenpengzheng <zhenpengzheng(a)net-swift.com>
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4C4XW?from=project-issue
CVE: NA
------------------------------------------
This patch contains the main code of the Netswift Giga NIC driver, which supports the following devices:
1) Netswift WX1860AL_W 8088:0100[VID:DID]
2) Netswift WX1860A2 8088:0101[VID:DID]
3) Netswift WX1860A2S 8088:0102[VID:DID]
4) Netswift WX1860A4 8088:0103[VID:DID]
5) Netswift WX1860A4S 8088:0104[VID:DID]
6) Netswift WX1860AL2 8088:0105[VID:DID]
7) Netswift WX1860AL2S 8088:0106[VID:DID]
8) Netswift WX1860AL4 8088:0107[VID:DID]
9) Netswift WX1860AL4S 8088:0108[VID:DID]
10) Netswift WX1860NCSI 8088:0109[VID:DID]
11) Netswift WX1860A1 8088:010a[VID:DID]
12) Netswift WX1860AL1 8088:010b[VID:DID]
Signed-off-by: zhenpengzheng <zhenpengzheng(a)net-swift.com> #openEuler_contributor
Signed-off-by: Zhen Lei <thunder.leizhen(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Acked-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/netswift/Kconfig | 66 +
drivers/net/ethernet/netswift/Makefile | 1 +
drivers/net/ethernet/netswift/ngbe/Makefile | 16 +
drivers/net/ethernet/netswift/ngbe/ngbe.h | 1109 +++
.../net/ethernet/netswift/ngbe/ngbe_debugfs.c | 764 ++
.../net/ethernet/netswift/ngbe/ngbe_ethtool.c | 2756 +++++++
drivers/net/ethernet/netswift/ngbe/ngbe_hw.c | 5047 ++++++++++++
drivers/net/ethernet/netswift/ngbe/ngbe_hw.h | 280 +
drivers/net/ethernet/netswift/ngbe/ngbe_lib.c | 701 ++
.../net/ethernet/netswift/ngbe/ngbe_main.c | 7119 +++++++++++++++++
drivers/net/ethernet/netswift/ngbe/ngbe_mbx.c | 687 ++
drivers/net/ethernet/netswift/ngbe/ngbe_mbx.h | 167 +
.../net/ethernet/netswift/ngbe/ngbe_param.c | 839 ++
.../net/ethernet/netswift/ngbe/ngbe_pcierr.c | 257 +
.../net/ethernet/netswift/ngbe/ngbe_pcierr.h | 23 +
drivers/net/ethernet/netswift/ngbe/ngbe_phy.c | 1243 +++
drivers/net/ethernet/netswift/ngbe/ngbe_phy.h | 201 +
.../net/ethernet/netswift/ngbe/ngbe_procfs.c | 908 +++
drivers/net/ethernet/netswift/ngbe/ngbe_ptp.c | 858 ++
.../net/ethernet/netswift/ngbe/ngbe_sriov.c | 1461 ++++
.../net/ethernet/netswift/ngbe/ngbe_sriov.h | 63 +
.../net/ethernet/netswift/ngbe/ngbe_sysfs.c | 222 +
.../net/ethernet/netswift/ngbe/ngbe_type.h | 2941 +++++++
23 files changed, 27729 insertions(+)
create mode 100644 drivers/net/ethernet/netswift/ngbe/Makefile
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe.h
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_debugfs.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_ethtool.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_hw.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_hw.h
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_lib.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_main.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_mbx.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_mbx.h
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_param.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.h
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_phy.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_phy.h
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_procfs.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_ptp.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_sriov.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_sriov.h
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_sysfs.c
create mode 100644 drivers/net/ethernet/netswift/ngbe/ngbe_type.h
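For orientation, the twelve VID:DID pairs listed above are what the driver matches at probe time. Below is a hedged sketch of the corresponding PCI device table (illustrative only: the table name here is made up, and the driver's real table lives in ngbe_main.c inside this patch):

#include <linux/module.h>
#include <linux/pci.h>

/* Sketch: pci_device_id entries for the VID:DID pairs named in the
 * commit message; vendor ID 0x8088 is WangXun/Netswift. */
static const struct pci_device_id ngbe_sketch_pci_tbl[] = {
	{ PCI_DEVICE(0x8088, 0x0100) },	/* WX1860AL_W */
	{ PCI_DEVICE(0x8088, 0x0101) },	/* WX1860A2 */
	{ PCI_DEVICE(0x8088, 0x0102) },	/* WX1860A2S */
	{ PCI_DEVICE(0x8088, 0x0103) },	/* WX1860A4 */
	{ PCI_DEVICE(0x8088, 0x0104) },	/* WX1860A4S */
	{ PCI_DEVICE(0x8088, 0x0105) },	/* WX1860AL2 */
	{ PCI_DEVICE(0x8088, 0x0106) },	/* WX1860AL2S */
	{ PCI_DEVICE(0x8088, 0x0107) },	/* WX1860AL4 */
	{ PCI_DEVICE(0x8088, 0x0108) },	/* WX1860AL4S */
	{ PCI_DEVICE(0x8088, 0x0109) },	/* WX1860NCSI */
	{ PCI_DEVICE(0x8088, 0x010a) },	/* WX1860A1 */
	{ PCI_DEVICE(0x8088, 0x010b) },	/* WX1860AL1 */
	{ /* end: all zeros */ }
};
MODULE_DEVICE_TABLE(pci, ngbe_sketch_pci_tbl);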
diff --git a/drivers/net/ethernet/netswift/Kconfig b/drivers/net/ethernet/netswift/Kconfig
index c4b510b659ae9..58b0cfa917c63 100644
--- a/drivers/net/ethernet/netswift/Kconfig
+++ b/drivers/net/ethernet/netswift/Kconfig
@@ -17,4 +17,70 @@ if NET_VENDOR_NETSWIFT
source "drivers/net/ethernet/netswift/txgbe/Kconfig"
+config NGBE
+ tristate "Netswift PCI-Express Gigabit Ethernet support"
+ depends on PCI
+ imply PTP_1588_CLOCK
+ ---help---
+ This driver supports Netswift gigabit ethernet adapters.
+ For more information on how to identify your adapter, go
+ to <http://www.net-swift.com>
+
+ To compile this driver as a module, choose M here. The module
+ will be called ngbe.
+
+config NGBE_HWMON
+ bool "Netswift PCI-Express Gigabit adapters HWMON support"
+ default n
+ depends on NGBE && HWMON && !(NGBE=y && HWMON=m)
+ ---help---
+ Say Y if you want to expose thermal sensor data on these devices.
+
+ If unsure, say N.
+
+config NGBE_PROCFS
+ bool "Netswift PCI-Express Gigabit adapters procfs support"
+ default n
+ depends on NGBE && !NGBE_SYSFS
+ ---help---
+ Say Y if you want to setup procfs for these devices.
+
+ If unsure, say N.
+
+config NGBE_NO_LLI
+ bool "Netswift PCI-Express Gigabit adapters NO Low Latency Interrupt support"
+ default n
+ depends on NGBE
+ ---help---
+ Say N if you want to enable LLI for these devices.
+
+ If unsure, say Y.
+
+config NGBE_DEBUG_FS
+ bool "Netswift PCI-Express Gigabit adapters debugfs support"
+ default n
+ depends on NGBE
+ ---help---
+ Say Y if you want to setup debugfs for these devices.
+
+ If unsure, say N.
+
+config NGBE_POLL_LINK_STATUS
+ bool "Netswift PCI-Express Gigabit adapters poll mode support"
+ default n
+ depends on NGBE
+ ---help---
+ Say Y if you want to switch these devices to poll mode instead of interrupt-triggered TX/RX.
+
+ If unsure, say N.
+
+config NGBE_SYSFS
+ bool "Netswift PCI-Express Gigabit adapters sysfs support"
+ default n
+ depends on NGBE
+ ---help---
+ Say Y if you want to setup sysfs for these devices.
+
+ If unsure, say N.
+
endif # NET_VENDOR_NETSWIFT
diff --git a/drivers/net/ethernet/netswift/Makefile b/drivers/net/ethernet/netswift/Makefile
index 0845d08600bee..5690b6392ce2f 100644
--- a/drivers/net/ethernet/netswift/Makefile
+++ b/drivers/net/ethernet/netswift/Makefile
@@ -4,3 +4,4 @@
#
obj-$(CONFIG_TXGBE) += txgbe/
+obj-$(CONFIG_NGBE) += ngbe/
diff --git a/drivers/net/ethernet/netswift/ngbe/Makefile b/drivers/net/ethernet/netswift/ngbe/Makefile
new file mode 100644
index 0000000000000..dd6615eee4ede
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/Makefile
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+#
+# Makefile for the Netswift Gigabit PCI Express ethernet driver
+#
+
+obj-$(CONFIG_NGBE) += ngbe.o
+
+ngbe-objs := ngbe_main.o ngbe_ethtool.o \
+ ngbe_hw.o ngbe_phy.o ngbe_sriov.o \
+ ngbe_mbx.o ngbe_pcierr.o ngbe_param.o ngbe_lib.o ngbe_ptp.o
+
+ngbe-$(CONFIG_NGBE_HWMON) += ngbe_sysfs.o
+ngbe-$(CONFIG_NGBE_DEBUG_FS) += ngbe_debugfs.o
+ngbe-$(CONFIG_NGBE_PROCFS) += ngbe_procfs.o
+ngbe-$(CONFIG_NGBE_SYSFS) += ngbe_sysfs.o
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe.h b/drivers/net/ethernet/netswift/ngbe/ngbe.h
new file mode 100644
index 0000000000000..4e77777143a04
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe.h
@@ -0,0 +1,1109 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+
+#ifndef _NGBE_H_
+#define _NGBE_H_
+
+#include <net/ip.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/ethtool.h>
+#include <linux/if_vlan.h>
+#include <linux/sctp.h>
+#include <linux/timecounter.h>
+#include <linux/clocksource.h>
+#include <linux/net_tstamp.h>
+#include <linux/ptp_clock_kernel.h>
+#include <linux/aer.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/if_ether.h>
+#include <linux/sched.h>
+
+#include "ngbe_type.h"
+
+/* Ether Types */
+#define NGBE_ETH_P_LLDP 0x88CC
+#define NGBE_ETH_P_CNM 0x22E7
+
+/* TX/RX descriptor defines */
+#define NGBE_DEFAULT_TXD 512 /* default ring size */
+#define NGBE_DEFAULT_TX_WORK 256
+#define NGBE_MAX_TXD 8192
+#define NGBE_MIN_TXD 128
+
+#define NGBE_DEFAULT_RXD 512 /* default ring size */
+#define NGBE_DEFAULT_RX_WORK 256
+#define NGBE_MAX_RXD 8192
+#define NGBE_MIN_RXD 128
+
+#define NGBE_ETH_P_LLDP 0x88CC
+
+/* flow control */
+#define NGBE_MIN_FCRTL 0x40
+#define NGBE_MAX_FCRTL 0x7FF80
+#define NGBE_MIN_FCRTH 0x600
+#define NGBE_MAX_FCRTH 0x7FFF0
+#define NGBE_DEFAULT_FCPAUSE 0xFFFF
+#define NGBE_MIN_FCPAUSE 0
+#define NGBE_MAX_FCPAUSE 0xFFFF
+
+/* Supported Rx Buffer Sizes */
+#define NGBE_RXBUFFER_256 256 /* Used for skb receive header */
+#define NGBE_RXBUFFER_2K 2048
+#define NGBE_RXBUFFER_3K 3072
+#define NGBE_RXBUFFER_4K 4096
+#define NGBE_MAX_RXBUFFER 16384 /* largest size for single descriptor */
+
+/*
+ * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN means we
+ * reserve 64 more, and skb_shared_info adds an additional 320 bytes more,
+ * this adds up to 448 bytes of extra data.
+ *
+ * Since netdev_alloc_skb now allocates a page fragment we can use a value
+ * of 256 and the resultant skb will have a truesize of 960 or less.
+ */
+#define NGBE_RX_HDR_SIZE NGBE_RXBUFFER_256
+
+#define MAXIMUM_ETHERNET_VLAN_SIZE (VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
+
+/* How many Rx Buffers do we bundle into one write to the hardware ? */
+#define NGBE_RX_BUFFER_WRITE 16 /* Must be power of 2 */
+
+#define NGBE_RX_DMA_ATTR \
+ (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)
+
+enum ngbe_tx_flags {
+ /* cmd_type flags */
+ NGBE_TX_FLAGS_HW_VLAN = 0x01,
+ NGBE_TX_FLAGS_TSO = 0x02,
+ NGBE_TX_FLAGS_TSTAMP = 0x04,
+
+ /* olinfo flags */
+ NGBE_TX_FLAGS_CC = 0x08,
+ NGBE_TX_FLAGS_IPV4 = 0x10,
+ NGBE_TX_FLAGS_CSUM = 0x20,
+ NGBE_TX_FLAGS_OUTER_IPV4 = 0x100,
+ NGBE_TX_FLAGS_LINKSEC = 0x200,
+ NGBE_TX_FLAGS_IPSEC = 0x400,
+
+ /* software defined flags */
+ NGBE_TX_FLAGS_SW_VLAN = 0x40,
+ NGBE_TX_FLAGS_FCOE = 0x80,
+};
+
+/* VLAN info */
+#define NGBE_TX_FLAGS_VLAN_MASK 0xffff0000
+#define NGBE_TX_FLAGS_VLAN_PRIO_MASK 0xe0000000
+#define NGBE_TX_FLAGS_VLAN_PRIO_SHIFT 29
+#define NGBE_TX_FLAGS_VLAN_SHIFT 16
+
+#define NGBE_MAX_RX_DESC_POLL 10
+
+#define NGBE_MAX_VF_MC_ENTRIES 30
+#define NGBE_MAX_VF_FUNCTIONS 8
+#define MAX_EMULATION_MAC_ADDRS 16
+#define NGBE_MAX_PF_MACVLANS 15
+#define NGBE_VF_DEVICE_ID 0x1000
+
+/* must account for pools assigned to VFs. */
+#ifdef CONFIG_PCI_IOV
+#define VMDQ_P(p) ((p) + adapter->ring_feature[RING_F_VMDQ].offset)
+#else
+#define VMDQ_P(p) (p)
+#endif
+
+#define UPDATE_VF_COUNTER_32bit(reg, last_counter, counter) \
+ { \
+ u32 current_counter = rd32(hw, reg); \
+ if (current_counter < last_counter) \
+ counter += 0x100000000LL; \
+ last_counter = current_counter; \
+ counter &= 0xFFFFFFFF00000000LL; \
+ counter |= current_counter; \
+ }
+
+#define UPDATE_VF_COUNTER_36bit(reg_lsb, reg_msb, last_counter, counter) \
+ { \
+ u64 current_counter_lsb = rd32(hw, reg_lsb); \
+ u64 current_counter_msb = rd32(hw, reg_msb); \
+ u64 current_counter = (current_counter_msb << 32) | \
+ current_counter_lsb; \
+ if (current_counter < last_counter) \
+ counter += 0x1000000000LL; \
+ last_counter = current_counter; \
+ counter &= 0xFFFFFFF000000000LL; \
+ counter |= current_counter; \
+ }
+
+struct vf_stats {
+ u64 gprc;
+ u64 gorc;
+ u64 gptc;
+ u64 gotc;
+ u64 mprc;
+};
+
+struct vf_data_storage {
+ struct pci_dev *vfdev;
+ u8 __iomem *b4_addr;
+ u32 b4_buf[16];
+ unsigned char vf_mac_addresses[ETH_ALEN];
+ u16 vf_mc_hashes[NGBE_MAX_VF_MC_ENTRIES];
+ u16 num_vf_mc_hashes;
+ u16 default_vf_vlan_id;
+ u16 vlans_enabled;
+ bool clear_to_send;
+ struct vf_stats vfstats;
+ struct vf_stats last_vfstats;
+ struct vf_stats saved_rst_vfstats;
+ bool pf_set_mac;
+ u16 pf_vlan; /* When set, guest VLAN config not allowed. */
+ u16 pf_qos;
+ u16 min_tx_rate;
+ u16 max_tx_rate;
+ u16 vlan_count;
+ u8 spoofchk_enabled;
+ u8 trusted;
+ int xcast_mode;
+ unsigned int vf_api;
+};
+
+struct vf_macvlans {
+ struct list_head l;
+ int vf;
+ bool free;
+ bool is_macvlan;
+ u8 vf_macvlan[ETH_ALEN];
+};
+
+#define NGBE_MAX_TXD_PWR 14
+#define NGBE_MAX_DATA_PER_TXD (1 << NGBE_MAX_TXD_PWR)
+
+/* Tx Descriptors needed, worst case */
+#define TXD_USE_COUNT(S) DIV_ROUND_UP((S), NGBE_MAX_DATA_PER_TXD)
+#ifndef MAX_SKB_FRAGS
+#define DESC_NEEDED 4
+#elif (MAX_SKB_FRAGS < 16)
+#define DESC_NEEDED ((MAX_SKB_FRAGS * TXD_USE_COUNT(PAGE_SIZE)) + 4)
+#else
+#define DESC_NEEDED (MAX_SKB_FRAGS + 4)
+#endif
+
+/* wrapper around a pointer to a socket buffer,
+ * so a DMA handle can be stored along with the buffer */
+struct ngbe_tx_buffer {
+ union ngbe_tx_desc *next_to_watch;
+ unsigned long time_stamp;
+ struct sk_buff *skb;
+ unsigned int bytecount;
+ unsigned short gso_segs;
+ __be16 protocol;
+ DEFINE_DMA_UNMAP_ADDR(dma);
+ DEFINE_DMA_UNMAP_LEN(len);
+ u32 tx_flags;
+};
+
+struct ngbe_rx_buffer {
+ struct sk_buff *skb;
+ dma_addr_t dma;
+ dma_addr_t page_dma;
+ struct page *page;
+ unsigned int page_offset;
+};
+
+struct ngbe_queue_stats {
+ u64 packets;
+ u64 bytes;
+};
+
+struct ngbe_tx_queue_stats {
+ u64 restart_queue;
+ u64 tx_busy;
+ u64 tx_done_old;
+};
+
+struct ngbe_rx_queue_stats {
+ u64 non_eop_descs;
+ u64 alloc_rx_page_failed;
+ u64 alloc_rx_buff_failed;
+ u64 csum_good_cnt;
+ u64 csum_err;
+};
+
+#define NGBE_TS_HDR_LEN 8
+enum ngbe_ring_state_t {
+ __NGBE_RX_3K_BUFFER,
+ __NGBE_RX_BUILD_SKB_ENABLED,
+ __NGBE_TX_XPS_INIT_DONE,
+ __NGBE_TX_DETECT_HANG,
+ __NGBE_HANG_CHECK_ARMED,
+ __NGBE_RX_HS_ENABLED,
+};
+
+struct ngbe_fwd_adapter {
+ unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+ struct net_device *vdev;
+ struct ngbe_adapter *adapter;
+ unsigned int tx_base_queue;
+ unsigned int rx_base_queue;
+ int index; /* pool index on PF */
+};
+
+#define ring_uses_build_skb(ring) \
+ test_bit(__NGBE_RX_BUILD_SKB_ENABLED, &(ring)->state)
+
+
+#define ring_is_hs_enabled(ring) \
+ test_bit(__NGBE_RX_HS_ENABLED, &(ring)->state)
+#define set_ring_hs_enabled(ring) \
+ set_bit(__NGBE_RX_HS_ENABLED, &(ring)->state)
+#define clear_ring_hs_enabled(ring) \
+ clear_bit(__NGBE_RX_HS_ENABLED, &(ring)->state)
+#define check_for_tx_hang(ring) \
+ test_bit(__NGBE_TX_DETECT_HANG, &(ring)->state)
+#define set_check_for_tx_hang(ring) \
+ set_bit(__NGBE_TX_DETECT_HANG, &(ring)->state)
+#define clear_check_for_tx_hang(ring) \
+ clear_bit(__NGBE_TX_DETECT_HANG, &(ring)->state)
+
+struct ngbe_ring {
+ struct ngbe_ring *next; /* pointer to next ring in q_vector */
+ struct ngbe_q_vector *q_vector; /* backpointer to host q_vector */
+ struct net_device *netdev; /* netdev ring belongs to */
+ struct device *dev; /* device for DMA mapping */
+ struct ngbe_fwd_adapter *accel;
+ void *desc; /* descriptor ring memory */
+ union {
+ struct ngbe_tx_buffer *tx_buffer_info;
+ struct ngbe_rx_buffer *rx_buffer_info;
+ };
+ unsigned long state;
+ u8 __iomem *tail;
+ dma_addr_t dma; /* phys. address of descriptor ring */
+ unsigned int size; /* length in bytes */
+
+ u16 count; /* amount of descriptors */
+
+ u8 queue_index; /* needed for multiqueue queue management */
+ u8 reg_idx; /* holds the special value that gets
+ * the hardware register offset
+ * associated with this ring, which is
+ * different for DCB and RSS modes
+ */
+ u16 next_to_use;
+ u16 next_to_clean;
+
+ unsigned long last_rx_timestamp;
+
+ u16 rx_buf_len;
+ union {
+ u16 next_to_alloc;
+ struct {
+ u8 atr_sample_rate;
+ u8 atr_count;
+ };
+ };
+
+ u8 dcb_tc;
+ struct ngbe_queue_stats stats;
+ struct u64_stats_sync syncp;
+
+ union {
+ struct ngbe_tx_queue_stats tx_stats;
+ struct ngbe_rx_queue_stats rx_stats;
+ };
+} ____cacheline_internodealigned_in_smp;
+
+enum ngbe_ring_f_enum {
+ RING_F_NONE = 0,
+ RING_F_VMDQ, /* SR-IOV uses the same ring feature */
+ RING_F_RSS,
+ RING_F_ARRAY_SIZE /* must be last in enum set */
+};
+
+#define TGB_MAX_RX_QUEUES 16
+#define NGBE_MAX_TX_QUEUES 16
+
+#define NGBE_MAX_RSS_INDICES 8
+#define NGBE_MAX_VMDQ_INDICES 8
+#define NGBE_MAX_FDIR_INDICES 8
+#define MAX_RX_QUEUES 8
+#define MAX_TX_QUEUES 8
+#define NGBE_MAX_L2A_QUEUES 4
+#define NGBE_BAD_L2A_QUEUE 3
+
+#define NGBE_MAX_MACVLANS 8
+
+struct ngbe_ring_feature {
+ u16 limit; /* upper limit on feature indices */
+ u16 indices; /* current value of indices */
+ u16 mask; /* Mask used for feature to ring mapping */
+ u16 offset; /* offset to start of feature */
+};
+
+/*
+ * FCoE requires that all Rx buffers be over 2200 bytes in length. Since
+ * this is twice the size of a half page we need to double the page order
+ * for FCoE enabled Rx queues.
+ */
+static inline unsigned int ngbe_rx_bufsz(struct ngbe_ring __maybe_unused *ring)
+{
+#if MAX_SKB_FRAGS < 8
+ return ALIGN(NGBE_MAX_RXBUFFER / MAX_SKB_FRAGS, 1024);
+#else
+ return NGBE_RXBUFFER_2K;
+#endif
+}
+
+static inline unsigned int ngbe_rx_pg_order(struct ngbe_ring __maybe_unused *ring)
+{
+ return 0;
+}
+#define ngbe_rx_pg_size(_ring) (PAGE_SIZE << ngbe_rx_pg_order(_ring))
+
+struct ngbe_ring_container {
+ struct ngbe_ring *ring; /* pointer to linked list of rings */
+ unsigned int total_bytes; /* total bytes processed this int */
+ unsigned int total_packets; /* total packets processed this int */
+ u16 work_limit; /* total work allowed per interrupt */
+ u8 count; /* total number of rings in vector */
+ u8 itr; /* current ITR setting for ring */
+};
+
+/* iterator for handling rings in ring container */
+#define ngbe_for_each_ring(pos, head) \
+ for (pos = (head).ring; pos != NULL; pos = pos->next)
+
+#define MAX_RX_PACKET_BUFFERS ((adapter->flags & NGBE_FLAG_DCB_ENABLED) \
+ ? 8 : 1)
+#define MAX_TX_PACKET_BUFFERS MAX_RX_PACKET_BUFFERS
+
+/* MAX_MSIX_Q_VECTORS of these are allocated,
+ * but we only use one per queue-specific vector.
+ */
+struct ngbe_q_vector {
+ struct ngbe_adapter *adapter;
+ int cpu; /* CPU for DCA */
+ u16 v_idx; /* index of q_vector within array, also used for
+ * finding the bit in EICR and friends that
+ * represents the vector for this ring */
+ u16 itr; /* Interrupt throttle rate written to EITR */
+ struct ngbe_ring_container rx, tx;
+
+ struct napi_struct napi;
+ cpumask_t affinity_mask;
+ int numa_node;
+ struct rcu_head rcu; /* to avoid race with update stats on free */
+ char name[IFNAMSIZ + 17];
+ bool netpoll_rx;
+
+ /* for dynamic allocation of rings associated with this q_vector */
+ struct ngbe_ring ring[0] ____cacheline_internodealigned_in_smp;
+};
+
+#ifdef CONFIG_NGBE_HWMON
+
+#define NGBE_HWMON_TYPE_TEMP 0
+#define NGBE_HWMON_TYPE_ALARMTHRESH 1
+#define NGBE_HWMON_TYPE_DALARMTHRESH 2
+
+struct hwmon_attr {
+ struct device_attribute dev_attr;
+ struct ngbe_hw *hw;
+ struct ngbe_thermal_diode_data *sensor;
+ char name[19];
+};
+
+struct hwmon_buff {
+ struct device *device;
+ struct hwmon_attr *hwmon_list;
+ unsigned int n_hwmon;
+};
+#endif /* CONFIG_NGBE_HWMON */
+
+/*
+ * microsecond values for various ITR rates shifted by 2 to fit itr register
+ * with the first 3 bits reserved 0
+ */
+#define NGBE_70K_ITR 57
+#define NGBE_20K_ITR 200
+#define NGBE_4K_ITR 1024
+#define NGBE_7K_ITR 595
+
+/* ngbe_test_staterr - tests bits in Rx descriptor status and error fields */
+static inline __le32 ngbe_test_staterr(union ngbe_rx_desc *rx_desc,
+ const u32 stat_err_bits)
+{
+ return rx_desc->wb.upper.status_error & cpu_to_le32(stat_err_bits);
+}
+
+/* ngbe_desc_unused - calculate if we have unused descriptors */
+static inline u16 ngbe_desc_unused(struct ngbe_ring *ring)
+{
+ u16 ntc = ring->next_to_clean;
+ u16 ntu = ring->next_to_use;
+
+ return ((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1;
+}
+
+#define NGBE_RX_DESC(R, i) \
+ (&(((union ngbe_rx_desc *)((R)->desc))[i]))
+#define NGBE_TX_DESC(R, i) \
+ (&(((union ngbe_tx_desc *)((R)->desc))[i]))
+#define NGBE_TX_CTXTDESC(R, i) \
+ (&(((struct ngbe_tx_context_desc *)((R)->desc))[i]))
+
+#define NGBE_MAX_JUMBO_FRAME_SIZE 9432 /* max payload 9414 */
+#define TCP_TIMER_VECTOR 0
+#define OTHER_VECTOR 1
+#define NON_Q_VECTORS (OTHER_VECTOR + TCP_TIMER_VECTOR)
+
+#define NGBE_MAX_MSIX_Q_VECTORS_EMERALD 9
+
+struct ngbe_mac_addr {
+ u8 addr[ETH_ALEN];
+ u16 state; /* bitmask */
+ u64 pools;
+};
+
+#define NGBE_MAC_STATE_DEFAULT 0x1
+#define NGBE_MAC_STATE_MODIFIED 0x2
+#define NGBE_MAC_STATE_IN_USE 0x4
+
+#ifdef CONFIG_NGBE_PROCFS
+struct ngbe_therm_proc_data {
+ struct ngbe_hw *hw;
+ struct ngbe_thermal_diode_data *sensor_data;
+};
+#endif
+
+/*
+ * Only for array allocations in our adapter struct.
+ * we can actually assign 64 queue vectors based on our extended-extended
+ * interrupt registers.
+ */
+#define MAX_MSIX_Q_VECTORS NGBE_MAX_MSIX_Q_VECTORS_EMERALD
+#define MAX_MSIX_COUNT NGBE_MAX_MSIX_VECTORS_EMERALD
+
+#define MIN_MSIX_Q_VECTORS 1
+#define MIN_MSIX_COUNT (MIN_MSIX_Q_VECTORS + NON_Q_VECTORS)
+
+/* default to trying for four seconds */
+#define NGBE_TRY_LINK_TIMEOUT (4 * HZ)
+#define NGBE_SFP_POLL_JIFFIES (2 * HZ) /* SFP poll every 2 seconds */
+
+/**
+ * ngbe_adapter.flag
+ **/
+#define NGBE_FLAG_MSI_CAPABLE (u32)(1 << 0)
+#define NGBE_FLAG_MSI_ENABLED (u32)(1 << 1)
+#define NGBE_FLAG_MSIX_CAPABLE (u32)(1 << 2)
+#define NGBE_FLAG_MSIX_ENABLED (u32)(1 << 3)
+#ifndef CONFIG_NGBE_NO_LLI
+#define NGBE_FLAG_LLI_PUSH (u32)(1 << 4)
+#endif
+
+#define NGBE_FLAG_TPH_ENABLED (u32)(1 << 6)
+#define NGBE_FLAG_TPH_CAPABLE (u32)(1 << 7)
+#define NGBE_FLAG_TPH_ENABLED_DATA (u32)(1 << 8)
+
+#define NGBE_FLAG_MQ_CAPABLE (u32)(1 << 9)
+#define NGBE_FLAG_DCB_ENABLED (u32)(1 << 10)
+#define NGBE_FLAG_VMDQ_ENABLED (u32)(1 << 11)
+#define NGBE_FLAG_FAN_FAIL_CAPABLE (u32)(1 << 12)
+#define NGBE_FLAG_NEED_LINK_UPDATE (u32)(1 << 13)
+#define NGBE_FLAG_NEED_ANC_CHECK (u32)(1 << 14)
+#define NGBE_FLAG_FDIR_HASH_CAPABLE (u32)(1 << 15)
+#define NGBE_FLAG_FDIR_PERFECT_CAPABLE (u32)(1 << 16)
+#define NGBE_FLAG_SRIOV_CAPABLE (u32)(1 << 19)
+#define NGBE_FLAG_SRIOV_ENABLED (u32)(1 << 20)
+#define NGBE_FLAG_SRIOV_REPLICATION_ENABLE (u32)(1 << 21)
+#define NGBE_FLAG_SRIOV_L2SWITCH_ENABLE (u32)(1 << 22)
+#define NGBE_FLAG_SRIOV_VEPA_BRIDGE_MODE (u32)(1 << 23)
+#define NGBE_FLAG_RX_HWTSTAMP_ENABLED (u32)(1 << 24)
+#define NGBE_FLAG_VXLAN_OFFLOAD_CAPABLE (u32)(1 << 25)
+#define NGBE_FLAG_VXLAN_OFFLOAD_ENABLE (u32)(1 << 26)
+#define NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER (u32)(1 << 27)
+#define NGBE_FLAG_NEED_ETH_PHY_RESET (u32)(1 << 28)
+#define NGBE_FLAG_RX_HS_ENABLED (u32)(1 << 30)
+#define NGBE_FLAG_LINKSEC_ENABLED (u32)(1 << 31)
+#define NGBE_FLAG_IPSEC_ENABLED (u32)(1 << 5)
+
+/* preset defaults */
+#define NGBE_FLAGS_SP_INIT (NGBE_FLAG_MSI_CAPABLE \
+ | NGBE_FLAG_MSIX_CAPABLE \
+ | NGBE_FLAG_MQ_CAPABLE \
+ | NGBE_FLAG_SRIOV_CAPABLE)
+
+/**
+ * ngbe_adapter.flag2
+ **/
+#define NGBE_FLAG2_RSC_CAPABLE (1U << 0)
+#define NGBE_FLAG2_RSC_ENABLED (1U << 1)
+#define NGBE_FLAG2_TEMP_SENSOR_CAPABLE (1U << 3)
+#define NGBE_FLAG2_TEMP_SENSOR_EVENT (1U << 4)
+#define NGBE_FLAG2_SEARCH_FOR_SFP (1U << 5)
+#define NGBE_FLAG2_SFP_NEEDS_RESET (1U << 6)
+#define NGBE_FLAG2_PF_RESET_REQUESTED (1U << 7)
+#define NGBE_FLAG2_FDIR_REQUIRES_REINIT (1U << 8)
+#define NGBE_FLAG2_RSS_FIELD_IPV4_UDP (1U << 9)
+#define NGBE_FLAG2_RSS_FIELD_IPV6_UDP (1U << 10)
+#define NGBE_FLAG2_RSS_ENABLED (1U << 12)
+#define NGBE_FLAG2_PTP_PPS_ENABLED (1U << 11)
+#define NGBE_FLAG2_EEE_CAPABLE (1U << 14)
+#define NGBE_FLAG2_EEE_ENABLED (1U << 15)
+#define NGBE_FLAG2_VXLAN_REREG_NEEDED (1U << 16)
+#define NGBE_FLAG2_DEV_RESET_REQUESTED (1U << 18)
+#define NGBE_FLAG2_RESET_INTR_RECEIVED (1U << 19)
+#define NGBE_FLAG2_GLOBAL_RESET_REQUESTED (1U << 20)
+#define NGBE_FLAG2_MNG_REG_ACCESS_DISABLED (1U << 22)
+#define NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP (1U << 23)
+#define NGBE_FLAG2_PCIE_NEED_RECOVER (1U << 31)
+
+#define NGBE_SET_FLAG(_input, _flag, _result) \
+ ((_flag <= _result) ? \
+ ((u32)(_input & _flag) * (_result / _flag)) : \
+ ((u32)(_input & _flag) / (_flag / _result)))
+
+enum ngbe_isb_idx {
+ NGBE_ISB_HEADER,
+ NGBE_ISB_MISC,
+ NGBE_ISB_VEC0,
+ NGBE_ISB_VEC1,
+ NGBE_ISB_MAX
+};
+
+/* board specific private data structure */
+struct ngbe_adapter {
+ unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+ /* OS defined structs */
+ struct net_device *netdev;
+ struct pci_dev *pdev;
+
+ unsigned long state;
+
+ /* Some features need tri-state capability,
+ * thus the additional *_CAPABLE flags.
+ */
+ u32 flags;
+ u32 flags2;
+
+ /* Tx fast path data */
+ int num_tx_queues;
+ u16 tx_itr_setting;
+ u16 tx_work_limit;
+
+ /* Rx fast path data */
+ int num_rx_queues;
+ u16 rx_itr_setting;
+ u16 rx_work_limit;
+
+ unsigned int num_vmdqs; /* does not include pools assigned to VFs */
+ unsigned int queues_per_pool;
+
+ /* TX */
+ struct ngbe_ring *tx_ring[MAX_TX_QUEUES] ____cacheline_aligned_in_smp;
+
+ u64 restart_queue;
+ u64 lsc_int;
+ u32 tx_timeout_count;
+
+ /* RX */
+ struct ngbe_ring *rx_ring[MAX_RX_QUEUES];
+ u64 hw_csum_rx_error;
+ u64 hw_csum_rx_good;
+ u64 hw_rx_no_dma_resources;
+ u64 non_eop_descs;
+ u32 alloc_rx_page_failed;
+ u32 alloc_rx_buff_failed;
+
+ struct ngbe_q_vector *q_vector[MAX_MSIX_Q_VECTORS];
+
+#ifdef HAVE_DCBNL_IEEE
+ struct ieee_pfc *ngbe_ieee_pfc;
+ struct ieee_ets *ngbe_ieee_ets;
+#endif
+ enum ngbe_fc_mode last_lfc_mode;
+ int num_q_vectors; /* current number of q_vectors for device */
+ int max_q_vectors; /* upper limit of q_vectors for device */
+ struct ngbe_ring_feature ring_feature[RING_F_ARRAY_SIZE];
+ struct msix_entry *msix_entries;
+
+ u64 test_icr;
+ struct ngbe_ring test_tx_ring;
+ struct ngbe_ring test_rx_ring;
+
+ /* structs defined in ngbe_hw.h */
+ struct ngbe_hw hw;
+ u16 msg_enable;
+ struct ngbe_hw_stats stats;
+#ifndef CONFIG_NGBE_NO_LLI
+ u32 lli_port;
+ u32 lli_size;
+ u32 lli_etype;
+ u32 lli_vlan_pri;
+#endif /* CONFIG_NGBE_NO_LLI */
+
+ u32 *config_space;
+ u64 tx_busy;
+ unsigned int tx_ring_count;
+ unsigned int rx_ring_count;
+
+ u32 link_speed;
+ bool link_up;
+ unsigned long sfp_poll_time;
+ unsigned long link_check_timeout;
+
+ struct timer_list service_timer;
+ struct work_struct service_task;
+#ifdef CONFIG_NGBE_POLL_LINK_STATUS
+ struct timer_list link_check_timer;
+#endif
+ u32 atr_sample_rate;
+ u8 __iomem *io_addr; /* Mainly for iounmap use */
+ u32 wol;
+
+ u16 bd_number;
+ u16 bridge_mode;
+
+ char eeprom_id[32];
+ u16 eeprom_cap;
+ bool netdev_registered;
+ u32 interrupt_event;
+ u32 led_reg;
+
+ struct ptp_clock *ptp_clock;
+ struct ptp_clock_info ptp_caps;
+ struct work_struct ptp_tx_work;
+ struct sk_buff *ptp_tx_skb;
+ struct hwtstamp_config tstamp_config;
+ unsigned long ptp_tx_start;
+ unsigned long last_overflow_check;
+ unsigned long last_rx_ptp_check;
+ spinlock_t tmreg_lock;
+ struct cyclecounter hw_cc;
+ struct timecounter hw_tc;
+ u32 base_incval;
+ u32 tx_hwtstamp_timeouts;
+ u32 tx_hwtstamp_skipped;
+ u32 rx_hwtstamp_cleared;
+ void (*ptp_setup_sdp) (struct ngbe_adapter *);
+
+ DECLARE_BITMAP(active_vfs, NGBE_MAX_VF_FUNCTIONS);
+ unsigned int num_vfs;
+ struct vf_data_storage *vfinfo;
+ struct vf_macvlans vf_mvs;
+ struct vf_macvlans *mv_list;
+#ifdef CONFIG_PCI_IOV
+ u32 timer_event_accumulator;
+ u32 vferr_refcount;
+#endif
+ struct ngbe_mac_addr *mac_table;
+
+ __le16 vxlan_port;
+ __le16 geneve_port;
+
+#ifdef CONFIG_NGBE_SYSFS
+#ifdef CONFIG_NGBE_HWMON
+ struct hwmon_buff ngbe_hwmon_buff;
+#endif /* CONFIG_NGBE_HWMON */
+#else /* CONFIG_NGBE_SYSFS */
+#ifdef CONFIG_NGBE_PROCFS
+ struct proc_dir_entry *eth_dir;
+ struct proc_dir_entry *info_dir;
+ u64 old_lsc;
+ struct proc_dir_entry *therm_dir;
+ struct ngbe_therm_proc_data therm_data;
+#endif /* CONFIG_NGBE_PROCFS */
+#endif /* CONFIG_NGBE_SYSFS */
+
+#ifdef CONFIG_NGBE_DEBUG_FS
+ struct dentry *ngbe_dbg_adapter;
+#endif /* CONFIG_NGBE_DEBUG_FS */
+ u8 default_up;
+ unsigned long fwd_bitmask; /* bitmask indicating in use pools */
+ unsigned long tx_timeout_last_recovery;
+ u32 tx_timeout_recovery_level;
+
+#define NGBE_MAX_RETA_ENTRIES 128
+ u8 rss_indir_tbl[NGBE_MAX_RETA_ENTRIES];
+#define NGBE_RSS_KEY_SIZE 40
+ u32 rss_key[NGBE_RSS_KEY_SIZE / sizeof(u32)];
+
+ void *ipsec;
+
+ /* misc interrupt status block */
+ dma_addr_t isb_dma;
+ u32 *isb_mem;
+ u32 isb_tag[NGBE_ISB_MAX];
+
+ u32 hang_cnt;
+};
+
+static inline u32 ngbe_misc_isb(struct ngbe_adapter *adapter,
+ enum ngbe_isb_idx idx)
+{
+ u32 cur_tag = 0;
+ u32 cur_diff = 0;
+
+ cur_tag = adapter->isb_mem[NGBE_ISB_HEADER];
+ cur_diff = cur_tag - adapter->isb_tag[idx];
+
+ adapter->isb_tag[idx] = cur_tag;
+
+ return cpu_to_le32(adapter->isb_mem[idx]);
+}
+
+static inline u8 ngbe_max_rss_indices(struct ngbe_adapter *adapter)
+{
+ return NGBE_MAX_RSS_INDICES;
+}
+
+enum ngbe_state_t {
+ __NGBE_TESTING,
+ __NGBE_RESETTING,
+ __NGBE_DOWN,
+ __NGBE_HANGING,
+ __NGBE_DISABLED,
+ __NGBE_REMOVING,
+ __NGBE_SERVICE_SCHED,
+ __NGBE_SERVICE_INITED,
+ __NGBE_IN_SFP_INIT,
+ __NGBE_PTP_RUNNING,
+ __NGBE_PTP_TX_IN_PROGRESS,
+};
+
+struct ngbe_cb {
+ dma_addr_t dma;
+ u16 append_cnt; /* number of skb's appended */
+ bool page_released;
+ bool dma_released;
+};
+#define NGBE_CB(skb) ((struct ngbe_cb *)(skb)->cb)
+
+/* ESX ngbe CIM IOCTL definition */
+
+#ifdef CONFIG_NGBE_SYSFS
+void ngbe_sysfs_exit(struct ngbe_adapter *adapter);
+int ngbe_sysfs_init(struct ngbe_adapter *adapter);
+#endif /* CONFIG_NGBE_SYSFS */
+#ifdef CONFIG_NGBE_PROCFS
+void ngbe_procfs_exit(struct ngbe_adapter *adapter);
+int ngbe_procfs_init(struct ngbe_adapter *adapter);
+int ngbe_procfs_topdir_init(void);
+void ngbe_procfs_topdir_exit(void);
+#endif /* CONFIG_NGBE_PROCFS */
+
+/* needed by ngbe_main.c */
+int ngbe_validate_mac_addr(u8 *mc_addr);
+void ngbe_check_options(struct ngbe_adapter *adapter);
+void ngbe_assign_netdev_ops(struct net_device *netdev);
+
+/* needed by ngbe_ethtool.c */
+extern char ngbe_driver_name[];
+extern const char ngbe_driver_version[];
+
+void ngbe_irq_disable(struct ngbe_adapter *adapter);
+void ngbe_irq_enable(struct ngbe_adapter *adapter, bool queues, bool flush);
+int ngbe_open(struct net_device *netdev);
+int ngbe_close(struct net_device *netdev);
+void ngbe_up(struct ngbe_adapter *adapter);
+void ngbe_down(struct ngbe_adapter *adapter);
+void ngbe_reinit_locked(struct ngbe_adapter *adapter);
+void ngbe_reset(struct ngbe_adapter *adapter);
+void ngbe_set_ethtool_ops(struct net_device *netdev);
+int ngbe_setup_rx_resources(struct ngbe_ring *);
+int ngbe_setup_tx_resources(struct ngbe_ring *);
+void ngbe_free_rx_resources(struct ngbe_ring *);
+void ngbe_free_tx_resources(struct ngbe_ring *);
+void ngbe_configure_rx_ring(struct ngbe_adapter *,
+ struct ngbe_ring *);
+void ngbe_configure_tx_ring(struct ngbe_adapter *,
+ struct ngbe_ring *);
+void ngbe_update_stats(struct ngbe_adapter *adapter);
+int ngbe_init_interrupt_scheme(struct ngbe_adapter *adapter);
+void ngbe_reset_interrupt_capability(struct ngbe_adapter *adapter);
+void ngbe_set_interrupt_capability(struct ngbe_adapter *adapter);
+void ngbe_clear_interrupt_scheme(struct ngbe_adapter *adapter);
+netdev_tx_t ngbe_xmit_frame_ring(struct sk_buff *,
+ struct ngbe_adapter *,
+ struct ngbe_ring *);
+void ngbe_unmap_and_free_tx_resource(struct ngbe_ring *,
+ struct ngbe_tx_buffer *);
+void ngbe_alloc_rx_buffers(struct ngbe_ring *, u16);
+
+void ngbe_set_rx_mode(struct net_device *netdev);
+int ngbe_write_mc_addr_list(struct net_device *netdev);
+int ngbe_setup_tc(struct net_device *dev, u8 tc);
+void ngbe_tx_ctxtdesc(struct ngbe_ring *, u32, u32, u32, u32);
+void ngbe_do_reset(struct net_device *netdev);
+void ngbe_write_eitr(struct ngbe_q_vector *q_vector);
+int ngbe_poll(struct napi_struct *napi, int budget);
+void ngbe_disable_rx_queue(struct ngbe_adapter *adapter,
+ struct ngbe_ring *);
+void ngbe_vlan_strip_enable(struct ngbe_adapter *adapter);
+void ngbe_vlan_strip_disable(struct ngbe_adapter *adapter);
+
+#ifdef CONFIG_NGBE_DEBUG_FS
+void ngbe_dbg_adapter_init(struct ngbe_adapter *adapter);
+void ngbe_dbg_adapter_exit(struct ngbe_adapter *adapter);
+void ngbe_dbg_init(void);
+void ngbe_dbg_exit(void);
+void ngbe_dump(struct ngbe_adapter *adapter);
+#endif /* CONFIG_NGBE_DEBUG_FS */
+
+static inline struct netdev_queue *txring_txq(const struct ngbe_ring *ring)
+{
+ return netdev_get_tx_queue(ring->netdev, ring->queue_index);
+}
+
+int ngbe_wol_supported(struct ngbe_adapter *adapter);
+int ngbe_get_settings(struct net_device *netdev,
+ struct ethtool_cmd *ecmd);
+int ngbe_write_uc_addr_list(struct net_device *netdev, int pool);
+void ngbe_full_sync_mac_table(struct ngbe_adapter *adapter);
+int ngbe_add_mac_filter(struct ngbe_adapter *adapter,
+ u8 *addr, u16 pool);
+int ngbe_del_mac_filter(struct ngbe_adapter *adapter,
+ u8 *addr, u16 pool);
+int ngbe_available_rars(struct ngbe_adapter *adapter);
+void ngbe_vlan_mode(struct net_device *, u32);
+
+void ngbe_ptp_init(struct ngbe_adapter *adapter);
+void ngbe_ptp_stop(struct ngbe_adapter *adapter);
+void ngbe_ptp_suspend(struct ngbe_adapter *adapter);
+void ngbe_ptp_overflow_check(struct ngbe_adapter *adapter);
+void ngbe_ptp_rx_hang(struct ngbe_adapter *adapter);
+void ngbe_ptp_rx_hwtstamp(struct ngbe_adapter *adapter, struct sk_buff *skb);
+int ngbe_ptp_set_ts_config(struct ngbe_adapter *adapter, struct ifreq *ifr);
+int ngbe_ptp_get_ts_config(struct ngbe_adapter *adapter, struct ifreq *ifr);
+void ngbe_ptp_start_cyclecounter(struct ngbe_adapter *adapter);
+void ngbe_ptp_reset(struct ngbe_adapter *adapter);
+void ngbe_ptp_check_pps_event(struct ngbe_adapter *adapter);
+
+#ifdef CONFIG_PCI_IOV
+void ngbe_sriov_reinit(struct ngbe_adapter *adapter);
+#endif
+
+void ngbe_set_rx_drop_en(struct ngbe_adapter *adapter);
+
+u32 ngbe_rss_indir_tbl_entries(struct ngbe_adapter *adapter);
+void ngbe_store_reta(struct ngbe_adapter *adapter);
+
+/**
+ * Interrupt masking operations. Each bit in PX_ICn corresponds to an interrupt.
+ * Disable an interrupt by writing to PX_IMS with the corresponding bit = 1.
+ * Enable an interrupt by writing to PX_IMC with the corresponding bit = 1.
+ * Trigger an interrupt by writing to PX_ICS with the corresponding bit = 1.
+ **/
+//#define NGBE_INTR_ALL (~0ULL)
+#define NGBE_INTR_ALL 0x1FF
+#define NGBE_INTR_MISC(A) (1ULL << (A)->num_q_vectors)
+#define NGBE_INTR_MISC_VMDQ(A) (1ULL << ((A)->num_q_vectors + (A)->ring_feature[RING_F_VMDQ].offset))
+#define NGBE_INTR_QALL(A) (NGBE_INTR_MISC(A) - 1)
+#define NGBE_INTR_Q(i) (1ULL << (i))
+static inline void ngbe_intr_enable(struct ngbe_hw *hw, u64 qmask)
+{
+ u32 mask;
+
+ mask = (qmask & 0xFFFFFFFF);
+ if (mask) {
+ wr32(hw, NGBE_PX_IMC, mask);
+ }
+}
+
+static inline void ngbe_intr_disable(struct ngbe_hw *hw, u64 qmask)
+{
+ u32 mask;
+
+ mask = (qmask & 0xFFFFFFFF);
+ if (mask)
+ wr32(hw, NGBE_PX_IMS, mask);
+}
+
+static inline void ngbe_intr_trigger(struct ngbe_hw *hw, u64 qmask)
+{
+ u32 mask;
+
+ mask = (qmask & 0xFFFFFFFF);
+ if (mask)
+ wr32(hw, NGBE_PX_ICS, mask);
+}
+
+#define NGBE_RING_SIZE(R) ((R)->count < NGBE_MAX_TXD ? (R)->count / 128 : 0)
+
+
+#define NGBE_CPU_TO_BE16(_x) cpu_to_be16(_x)
+#define NGBE_BE16_TO_CPU(_x) be16_to_cpu(_x)
+#define NGBE_CPU_TO_BE32(_x) cpu_to_be32(_x)
+#define NGBE_BE32_TO_CPU(_x) be32_to_cpu(_x)
+
+#define msec_delay(_x) msleep(_x)
+
+#define usec_delay(_x) udelay(_x)
+
+#define STATIC static
+
+#define NGBE_NAME "ngbe"
+
+#define DPRINTK(nlevel, klevel, fmt, args...) \
+ ((void)((NETIF_MSG_##nlevel & adapter->msg_enable) && \
+ printk(KERN_##klevel NGBE_NAME ": %s: %s: " fmt, \
+ adapter->netdev->name, \
+ __func__, ## args)))
+
+#define ngbe_emerg(fmt, ...) printk(KERN_EMERG fmt, ## __VA_ARGS__)
+#define ngbe_alert(fmt, ...) printk(KERN_ALERT fmt, ## __VA_ARGS__)
+#define ngbe_crit(fmt, ...) printk(KERN_CRIT fmt, ## __VA_ARGS__)
+#define ngbe_error(fmt, ...) printk(KERN_ERR fmt, ## __VA_ARGS__)
+#define ngbe_warn(fmt, ...) printk(KERN_WARNING fmt, ## __VA_ARGS__)
+#define ngbe_notice(fmt, ...) printk(KERN_NOTICE fmt, ## __VA_ARGS__)
+#define ngbe_info(fmt, ...) printk(KERN_INFO fmt, ## __VA_ARGS__)
+#define ngbe_print(fmt, ...) printk(KERN_DEBUG fmt, ## __VA_ARGS__)
+#define ngbe_trace(fmt, ...) printk(KERN_INFO fmt, ## __VA_ARGS__)
+
+#define ngbe_debug(fmt, ...) do {} while (0)
+
+#define ASSERT(_x) do {} while (0)
+#define DEBUGOUT(S) do {} while (0)
+#define DEBUGOUT1(S, A...) do {} while (0)
+#define DEBUGOUT2(S, A...) do {} while (0)
+#define DEBUGOUT3(S, A...) do {} while (0)
+#define DEBUGOUT4(S, A...) do {} while (0)
+#define DEBUGOUT5(S, A...) do {} while (0)
+#define DEBUGOUT6(S, A...) do {} while (0)
+#define DEBUGFUNC(fmt, ...) do {} while (0)
+
+#define NGBE_SFP_DETECT_RETRIES 2
+
+struct ngbe_hw;
+struct ngbe_msg {
+ u16 msg_enable;
+};
+struct net_device *ngbe_hw_to_netdev(const struct ngbe_hw *hw);
+struct ngbe_msg *ngbe_hw_to_msg(const struct ngbe_hw *hw);
+
+static inline struct device *pci_dev_to_dev(struct pci_dev *pdev)
+{
+ return &pdev->dev;
+}
+
+#define hw_dbg(hw, format, arg...) \
+ netdev_dbg(ngbe_hw_to_netdev(hw), format, ## arg)
+#define hw_err(hw, format, arg...) \
+ netdev_err(ngbe_hw_to_netdev(hw), format, ## arg)
+#define e_dev_info(format, arg...) \
+ dev_info(pci_dev_to_dev(adapter->pdev), format, ## arg)
+#define e_dev_warn(format, arg...) \
+ dev_warn(pci_dev_to_dev(adapter->pdev), format, ## arg)
+#define e_dev_err(format, arg...) \
+ dev_err(pci_dev_to_dev(adapter->pdev), format, ## arg)
+#define e_dev_notice(format, arg...) \
+ dev_notice(pci_dev_to_dev(adapter->pdev), format, ## arg)
+#define e_dbg(msglvl, format, arg...) \
+ netif_dbg(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_info(msglvl, format, arg...) \
+ netif_info(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_err(msglvl, format, arg...) \
+ netif_err(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_warn(msglvl, format, arg...) \
+ netif_warn(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_crit(msglvl, format, arg...) \
+ netif_crit(adapter, msglvl, adapter->netdev, format, ## arg)
+
+#define NGBE_FAILED_READ_CFG_DWORD 0xffffffffU
+#define NGBE_FAILED_READ_CFG_WORD 0xffffU
+#define NGBE_FAILED_READ_CFG_BYTE 0xffU
+
+extern u32 ngbe_read_reg(struct ngbe_hw *hw, u32 reg, bool quiet);
+extern u16 ngbe_read_pci_cfg_word(struct ngbe_hw *hw, u32 reg);
+extern void ngbe_write_pci_cfg_word(struct ngbe_hw *hw, u32 reg, u16 value);
+
+#define NGBE_READ_PCIE_WORD ngbe_read_pci_cfg_word
+#define NGBE_WRITE_PCIE_WORD ngbe_write_pci_cfg_word
+#define NGBE_R32_Q(h, r) ngbe_read_reg(h, r, true)
+
+#ifndef writeq
+#define writeq(val, addr) do { writel((u32) (val), addr); \
+ writel((u32) (val >> 32), (addr + 4)); \
+ } while (0)
+#endif
+
+#define NGBE_EEPROM_GRANT_ATTEMPS 100
+#define NGBE_HTONL(_i) htonl(_i)
+#define NGBE_NTOHL(_i) ntohl(_i)
+#define NGBE_NTOHS(_i) ntohs(_i)
+#define NGBE_CPU_TO_LE32(_i) cpu_to_le32(_i)
+#define NGBE_LE32_TO_CPUS(_i) le32_to_cpus(_i)
+
+enum {
+ NGBE_ERROR_SOFTWARE,
+ NGBE_ERROR_POLLING,
+ NGBE_ERROR_INVALID_STATE,
+ NGBE_ERROR_UNSUPPORTED,
+ NGBE_ERROR_ARGUMENT,
+ NGBE_ERROR_CAUTION,
+};
+
+#define ERROR_REPORT(level, format, arg...) do { \
+ switch (level) { \
+ case NGBE_ERROR_SOFTWARE: \
+ case NGBE_ERROR_CAUTION: \
+ case NGBE_ERROR_POLLING: \
+ netif_warn(ngbe_hw_to_msg(hw), drv, ngbe_hw_to_netdev(hw), \
+ format, ## arg); \
+ break; \
+ case NGBE_ERROR_INVALID_STATE: \
+ case NGBE_ERROR_UNSUPPORTED: \
+ case NGBE_ERROR_ARGUMENT: \
+ netif_err(ngbe_hw_to_msg(hw), hw, ngbe_hw_to_netdev(hw), \
+ format, ## arg); \
+ break; \
+ default: \
+ break; \
+ } \
+} while (0)
+
+#define ERROR_REPORT1 ERROR_REPORT
+#define ERROR_REPORT2 ERROR_REPORT
+#define ERROR_REPORT3 ERROR_REPORT
+
+#define UNREFERENCED_XPARAMETER
+#define UNREFERENCED_1PARAMETER(_p) do { \
+ uninitialized_var(_p); \
+} while (0)
+#define UNREFERENCED_2PARAMETER(_p, _q) do { \
+ uninitialized_var(_p); \
+ uninitialized_var(_q); \
+} while (0)
+#define UNREFERENCED_3PARAMETER(_p, _q, _r) do { \
+ uninitialized_var(_p); \
+ uninitialized_var(_q); \
+ uninitialized_var(_r); \
+} while (0)
+#define UNREFERENCED_4PARAMETER(_p, _q, _r, _s) do { \
+ uninitialized_var(_p); \
+ uninitialized_var(_q); \
+ uninitialized_var(_r); \
+ uninitialized_var(_s); \
+} while (0)
+#define UNREFERENCED_PARAMETER(_p) UNREFERENCED_1PARAMETER(_p)
+
+#endif /* _NGBE_H_ */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_debugfs.c b/drivers/net/ethernet/netswift/ngbe/ngbe_debugfs.c
new file mode 100644
index 0000000000000..6710dff494796
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_debugfs.c
@@ -0,0 +1,764 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+
+#include "ngbe.h"
+
+#ifdef CONFIG_NGBE_DEBUG_FS
+#include <linux/debugfs.h>
+#include <linux/module.h>
+
+static struct dentry *ngbe_dbg_root;
+static int ngbe_data_mode;
+
+#define NGBE_DATA_FUNC(dm) ((dm) & ~0xFFFF)
+#define NGBE_DATA_ARGS(dm) ((dm) & 0xFFFF)
+enum ngbe_data_func {
+ NGBE_FUNC_NONE = (0 << 16),
+ NGBE_FUNC_DUMP_BAR = (1 << 16),
+ NGBE_FUNC_DUMP_RDESC = (2 << 16),
+ NGBE_FUNC_DUMP_TDESC = (3 << 16),
+ NGBE_FUNC_FLASH_READ = (4 << 16),
+ NGBE_FUNC_FLASH_WRITE = (5 << 16),
+};
+
+/**
+ * data operation
+ **/
+ssize_t
+ngbe_simple_read_from_pcibar(struct ngbe_adapter *adapter, int res,
+ void __user *buf, size_t size, loff_t *ppos)
+{
+ loff_t pos = *ppos;
+ u32 miss, len, limit = pci_resource_len(adapter->pdev, res);
+
+ if (pos < 0)
+ return 0;
+
+ limit = (pos + size <= limit ? pos + size : limit);
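+	/* copy one register word per iteration, trimming the first and
+	 * last words to the requested byte window
+	 */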
+ for (miss = 0; pos < limit && !miss; buf += len, pos += len) {
+ u32 val = 0, reg = round_down(pos, 4);
+ u32 off = pos - reg;
+
+		len = (reg + 4 <= limit ? 4 - off : limit - pos);
+		val = ngbe_rd32(adapter->io_addr + reg);
+		miss = copy_to_user(buf, (u8 *)&val + off, len);
+ }
+
+ size = pos - *ppos - miss;
+ *ppos += size;
+
+ return size;
+}
+
+ssize_t
+ngbe_simple_read_from_flash(struct ngbe_adapter *adapter,
+ void __user *buf, size_t size, loff_t *ppos)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ loff_t pos = *ppos;
+ size_t ret = 0;
+ loff_t rpos, rtail;
+ void __user *to = buf;
+ size_t available = adapter->hw.flash.dword_size << 2;
+
+ if (pos < 0)
+ return -EINVAL;
+ if (pos >= available || !size)
+ return 0;
+ if (size > available - pos)
+ size = available - pos;
+
+ rpos = round_up(pos, 4);
+ rtail = round_down(pos + size, 4);
+ if (rtail < rpos)
+ return 0;
+
+ to += rpos - pos;
+	while (rpos < rtail) {
+ u32 value = ngbe_rd32(adapter->io_addr + rpos);
+ if (TCALL(hw, flash.ops.write_buffer, rpos>>2, 1, &value)) {
+ ret = size;
+ break;
+ }
+		if (copy_to_user(to, &value, 4)) {
+ ret = size;
+ break;
+ }
+ to += 4;
+ rpos += 4;
+ }
+
+ if (ret == size)
+ return -EFAULT;
+ size -= ret;
+ *ppos = pos + size;
+ return size;
+}
+
+ssize_t
+ngbe_simple_write_to_flash(struct ngbe_adapter *adapter,
+ const void __user *from, size_t size, loff_t *ppos, size_t available)
+{
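+	/* flash writing is not implemented; claim the whole buffer was consumed */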
+ return size;
+}
+
+static ssize_t
+ngbe_dbg_data_ops_read(struct file *filp, char __user *buffer,
+ size_t size, loff_t *ppos)
+{
+ struct ngbe_adapter *adapter = filp->private_data;
+ u32 func = NGBE_DATA_FUNC(ngbe_data_mode);
+
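+	/* pick up the latest ngbe_data_mode programmed via the reg_ops file */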
+ rmb();
+
+ switch (func) {
+ case NGBE_FUNC_DUMP_BAR: {
+ u32 bar = NGBE_DATA_ARGS(ngbe_data_mode);
+
+ return ngbe_simple_read_from_pcibar(adapter, bar, buffer, size,
+ ppos);
+ }
+ case NGBE_FUNC_FLASH_READ: {
+ return ngbe_simple_read_from_flash(adapter, buffer, size, ppos);
+ }
+ case NGBE_FUNC_DUMP_RDESC: {
+ struct ngbe_ring *ring;
+ u32 queue = NGBE_DATA_ARGS(ngbe_data_mode);
+
+ if (queue >= adapter->num_rx_queues)
+ return 0;
+ queue += VMDQ_P(0) * adapter->queues_per_pool;
+ ring = adapter->rx_ring[queue];
+
+ return simple_read_from_buffer(buffer, size, ppos,
+ ring->desc, ring->size);
+ }
+ case NGBE_FUNC_DUMP_TDESC: {
+ struct ngbe_ring *ring;
+ u32 queue = NGBE_DATA_ARGS(ngbe_data_mode);
+
+ if (queue >= adapter->num_tx_queues)
+ return 0;
+ queue += VMDQ_P(0) * adapter->queues_per_pool;
+ ring = adapter->tx_ring[queue];
+
+ return simple_read_from_buffer(buffer, size, ppos,
+ ring->desc, ring->size);
+ }
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+static ssize_t
+ngbe_dbg_data_ops_write(struct file *filp,
+ const char __user *buffer,
+ size_t size, loff_t *ppos)
+{
+ struct ngbe_adapter *adapter = filp->private_data;
+ u32 func = NGBE_DATA_FUNC(ngbe_data_mode);
+
+ rmb();
+
+ switch (func) {
+	case NGBE_FUNC_FLASH_WRITE: {
+		u32 wr_size = NGBE_DATA_ARGS(ngbe_data_mode);
+
+		if (wr_size > adapter->hw.flash.dword_size << 2)
+			wr_size = adapter->hw.flash.dword_size << 2;
+
+		return ngbe_simple_write_to_flash(adapter, buffer, wr_size,
+						  ppos, wr_size);
+	}
+ default:
+ break;
+ }
+
+ return size;
+}
+static const struct file_operations ngbe_dbg_data_ops_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = ngbe_dbg_data_ops_read,
+ .write = ngbe_dbg_data_ops_write,
+};
+
+/**
+ * reg_ops operation
+ **/
+static char ngbe_dbg_reg_ops_buf[256] = "";
+static ssize_t
+ngbe_dbg_reg_ops_read(struct file *filp, char __user *buffer,
+ size_t count, loff_t *ppos)
+{
+ struct ngbe_adapter *adapter = filp->private_data;
+ char *buf;
+ int len;
+
+ /* don't allow partial reads */
+ if (*ppos != 0)
+ return 0;
+
+ buf = kasprintf(GFP_KERNEL, "%s: mode=0x%08x\n%s\n",
+ adapter->netdev->name, ngbe_data_mode,
+ ngbe_dbg_reg_ops_buf);
+ if (!buf)
+ return -ENOMEM;
+
+ if (count < strlen(buf)) {
+ kfree(buf);
+ return -ENOSPC;
+ }
+
+ len = simple_read_from_buffer(buffer, count, ppos, buf, strlen(buf));
+
+ kfree(buf);
+ return len;
+}
+
+static ssize_t
+ngbe_dbg_reg_ops_write(struct file *filp,
+ const char __user *buffer,
+ size_t count, loff_t *ppos)
+{
+ struct ngbe_adapter *adapter = filp->private_data;
+ char *pc = ngbe_dbg_reg_ops_buf;
+ int len;
+
+ /* don't allow partial writes */
+ if (*ppos != 0)
+ return 0;
+ if (count >= sizeof(ngbe_dbg_reg_ops_buf))
+ return -ENOSPC;
+
+ len = simple_write_to_buffer(ngbe_dbg_reg_ops_buf,
+ sizeof(ngbe_dbg_reg_ops_buf)-1,
+ ppos,
+ buffer,
+ count);
+ if (len < 0)
+ return len;
+
+ pc[len] = '\0';
+
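+	/*
+	 * Accepted commands:
+	 *   dump [bar|rdesc|tdesc] [n] - select a dump source for the data file
+	 *   flash <read|write> [n]     - select flash access via the data file
+	 *   read <reg>                 - read a register
+	 *   write <reg> <value>        - write a register
+	 */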
+ if (strncmp(pc, "dump", 4) == 0) {
+ u32 mode = 0;
+ u16 args;
+
+ pc += 4;
+ pc += strspn(pc, " \t");
+
+ if (!strncmp(pc, "bar", 3)) {
+ pc += 3;
+ mode = NGBE_FUNC_DUMP_BAR;
+ } else if (!strncmp(pc, "rdesc", 5)) {
+ pc += 5;
+ mode = NGBE_FUNC_DUMP_RDESC;
+ } else if (!strncmp(pc, "tdesc", 5)) {
+ pc += 5;
+ mode = NGBE_FUNC_DUMP_TDESC;
+ } else {
+ ngbe_dump(adapter);
+ }
+
+		if (mode && sscanf(pc, "%hu", &args) == 1)
+			mode |= args;
+
+ ngbe_data_mode = mode;
+	} else if (strncmp(pc, "flash", 5) == 0) {
+ u32 mode = 0;
+ u16 args;
+
+ pc += 5;
+ pc += strspn(pc, " \t");
+		if (!strncmp(pc, "read", 4)) {
+ pc += 4;
+ mode = NGBE_FUNC_FLASH_READ;
+ } else if (!strncmp(pc, "write", 5)) {
+ pc += 5;
+ mode = NGBE_FUNC_FLASH_WRITE;
+ }
+
+		if (mode && sscanf(pc, "%hu", &args) == 1)
+			mode |= args;
+
+ ngbe_data_mode = mode;
+ } else if (strncmp(ngbe_dbg_reg_ops_buf, "write", 5) == 0) {
+ u32 reg, value;
+ int cnt;
+		cnt = sscanf(&ngbe_dbg_reg_ops_buf[5], "%x %x", &reg, &value);
+ if (cnt == 2) {
+ wr32(&adapter->hw, reg, value);
+ e_dev_info("write: 0x%08x = 0x%08x\n", reg, value);
+ } else {
+ e_dev_info("write <reg> <value>\n");
+ }
+ } else if (strncmp(ngbe_dbg_reg_ops_buf, "read", 4) == 0) {
+ u32 reg, value;
+ int cnt;
+		cnt = sscanf(&ngbe_dbg_reg_ops_buf[4], "%x", &reg);
+ if (cnt == 1) {
+ value = rd32(&adapter->hw, reg);
+ e_dev_info("read 0x%08x = 0x%08x\n", reg, value);
+ } else {
+ e_dev_info("read <reg>\n");
+ }
+ } else {
+ e_dev_info("Unknown command %s\n", ngbe_dbg_reg_ops_buf);
+ e_dev_info("Available commands:\n");
+ e_dev_info(" read <reg>\n");
+ e_dev_info(" write <reg> <value>\n");
+ }
+ return count;
+}
+
+static const struct file_operations ngbe_dbg_reg_ops_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = ngbe_dbg_reg_ops_read,
+ .write = ngbe_dbg_reg_ops_write,
+};
+
+/**
+ * netdev_ops operation
+ **/
+static char ngbe_dbg_netdev_ops_buf[256] = "";
+static ssize_t
+ngbe_dbg_netdev_ops_read(struct file *filp,
+ char __user *buffer,
+ size_t count, loff_t *ppos)
+{
+ struct ngbe_adapter *adapter = filp->private_data;
+ char *buf;
+ int len;
+
+ /* don't allow partial reads */
+ if (*ppos != 0)
+ return 0;
+
+ buf = kasprintf(GFP_KERNEL, "%s: mode=0x%08x\n%s\n",
+ adapter->netdev->name, ngbe_data_mode,
+ ngbe_dbg_netdev_ops_buf);
+ if (!buf)
+ return -ENOMEM;
+
+ if (count < strlen(buf)) {
+ kfree(buf);
+ return -ENOSPC;
+ }
+
+ len = simple_read_from_buffer(buffer, count, ppos, buf, strlen(buf));
+
+ kfree(buf);
+ return len;
+}
+
+static ssize_t
+ngbe_dbg_netdev_ops_write(struct file *filp,
+ const char __user *buffer,
+ size_t count, loff_t *ppos)
+{
+ struct ngbe_adapter *adapter = filp->private_data;
+ int len;
+
+ /* don't allow partial writes */
+ if (*ppos != 0)
+ return 0;
+ if (count >= sizeof(ngbe_dbg_netdev_ops_buf))
+ return -ENOSPC;
+
+ len = simple_write_to_buffer(ngbe_dbg_netdev_ops_buf,
+ sizeof(ngbe_dbg_netdev_ops_buf)-1,
+ ppos,
+ buffer,
+ count);
+ if (len < 0)
+ return len;
+
+ ngbe_dbg_netdev_ops_buf[len] = '\0';
+
+ if (strncmp(ngbe_dbg_netdev_ops_buf, "tx_timeout", 10) == 0) {
+ adapter->netdev->netdev_ops->ndo_tx_timeout(adapter->netdev);
+ e_dev_info("tx_timeout called\n");
+ } else {
+ e_dev_info("Unknown command: %s\n", ngbe_dbg_netdev_ops_buf);
+ e_dev_info("Available commands:\n");
+ e_dev_info(" tx_timeout\n");
+ }
+ return count;
+}
+
+static const struct file_operations ngbe_dbg_netdev_ops_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = ngbe_dbg_netdev_ops_read,
+ .write = ngbe_dbg_netdev_ops_write,
+};
+
+/**
+ * ngbe_dbg_adapter_init - setup the debugfs directory for the adapter
+ * @adapter: the adapter that is starting up
+ **/
+void ngbe_dbg_adapter_init(struct ngbe_adapter *adapter)
+{
+ const char *name = pci_name(adapter->pdev);
+ struct dentry *pfile;
+
+ adapter->ngbe_dbg_adapter = debugfs_create_dir(name, ngbe_dbg_root);
+ if (!adapter->ngbe_dbg_adapter) {
+ e_dev_err("debugfs entry for %s failed\n", name);
+ return;
+ }
+
+ pfile = debugfs_create_file("data", 0600,
+ adapter->ngbe_dbg_adapter, adapter,
+ &ngbe_dbg_data_ops_fops);
+ if (!pfile)
+		e_dev_err("debugfs data for %s failed\n", name);
+
+ pfile = debugfs_create_file("reg_ops", 0600,
+ adapter->ngbe_dbg_adapter, adapter,
+ &ngbe_dbg_reg_ops_fops);
+ if (!pfile)
+ e_dev_err("debugfs reg_ops for %s failed\n", name);
+
+ pfile = debugfs_create_file("netdev_ops", 0600,
+ adapter->ngbe_dbg_adapter, adapter,
+ &ngbe_dbg_netdev_ops_fops);
+ if (!pfile)
+ e_dev_err("debugfs netdev_ops for %s failed\n", name);
+}
+
+/**
+ * ngbe_dbg_adapter_exit - clear out the adapter's debugfs entries
+ * @adapter: the adapter that is stopping
+ **/
+void ngbe_dbg_adapter_exit(struct ngbe_adapter *adapter)
+{
+ if (adapter->ngbe_dbg_adapter)
+ debugfs_remove_recursive(adapter->ngbe_dbg_adapter);
+ adapter->ngbe_dbg_adapter = NULL;
+}
+
+/**
+ * ngbe_dbg_init - start up debugfs for the driver
+ **/
+void ngbe_dbg_init(void)
+{
+ ngbe_dbg_root = debugfs_create_dir(ngbe_driver_name, NULL);
+ if (ngbe_dbg_root == NULL)
+ pr_err("init of debugfs failed\n");
+}
+
+/**
+ * ngbe_dbg_exit - clean out the driver's debugfs entries
+ **/
+void ngbe_dbg_exit(void)
+{
+ debugfs_remove_recursive(ngbe_dbg_root);
+}
+
+#endif /* CONFIG_NGBE_DEBUG_FS */
+
+struct ngbe_reg_info {
+ u32 offset;
+ u32 length;
+ char *name;
+};
+
+static struct ngbe_reg_info ngbe_reg_info_tbl[] = {
+
+ /* General Registers */
+ {NGBE_CFG_PORT_CTL, 1, "CTRL"},
+ {NGBE_CFG_PORT_ST, 1, "STATUS"},
+
+ /* RX Registers */
+ {NGBE_PX_RR_CFG(0), 1, "SRRCTL"},
+ {NGBE_PX_RR_RP(0), 1, "RDH"},
+ {NGBE_PX_RR_WP(0), 1, "RDT"},
+ {NGBE_PX_RR_CFG(0), 1, "RXDCTL"},
+ {NGBE_PX_RR_BAL(0), 1, "RDBAL"},
+ {NGBE_PX_RR_BAH(0), 1, "RDBAH"},
+
+ /* TX Registers */
+ {NGBE_PX_TR_BAL(0), 1, "TDBAL"},
+ {NGBE_PX_TR_BAH(0), 1, "TDBAH"},
+ {NGBE_PX_TR_RP(0), 1, "TDH"},
+ {NGBE_PX_TR_WP(0), 1, "TDT"},
+ {NGBE_PX_TR_CFG(0), 1, "TXDCTL"},
+
+ /* MACVLAN */
+ {NGBE_PSR_MAC_SWC_VM, 128, "PSR_MAC_SWC_VM"},
+ {NGBE_PSR_MAC_SWC_AD_L, 32, "PSR_MAC_SWC_AD"},
+ {NGBE_PSR_VLAN_TBL(0), 128, "PSR_VLAN_TBL"},
+
+ /* List Terminator */
+ { .name = NULL }
+};
+
+/**
+ * ngbe_regdump - register printout routine
+ * @hw: pointer to the hardware structure
+ * @reg_info: table entry describing the register block to dump
+ **/
+static void
+ngbe_regdump(struct ngbe_hw *hw, struct ngbe_reg_info *reg_info)
+{
+ int i, n = 0;
+ u32 buffer[32*8];
+
+ switch (reg_info->offset) {
+ case NGBE_PSR_MAC_SWC_AD_L:
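+		/* the MAC switcher array is read indirectly: select the
+		 * entry via SWC_IDX, then fetch its high and low dwords
+		 */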
+ for (i = 0; i < reg_info->length; i++) {
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, i);
+ buffer[n++] =
+ rd32(hw, NGBE_PSR_MAC_SWC_AD_H);
+ buffer[n++] =
+ rd32(hw, NGBE_PSR_MAC_SWC_AD_L);
+ }
+ break;
+ default:
+ for (i = 0; i < reg_info->length; i++) {
+ buffer[n++] = rd32(hw,
+ reg_info->offset + 4*i);
+ }
+ break;
+ }
+	/* print the captured words in "name[index] value" form */
+	for (i = 0; i < n; i++)
+		pr_info("%-20s[%02d] %08x\n", reg_info->name, i, buffer[i]);
+}
+
+/**
+ * ngbe_dump - Print registers, tx-rings and rx-rings
+ * @adapter: board private structure
+ **/
+void ngbe_dump(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_reg_info *reg_info;
+ int n = 0;
+ struct ngbe_ring *tx_ring;
+ struct ngbe_tx_buffer *tx_buffer;
+ union ngbe_tx_desc *tx_desc;
+ struct my_u0 { u64 a; u64 b; } *u0;
+ struct ngbe_ring *rx_ring;
+ union ngbe_rx_desc *rx_desc;
+ struct ngbe_rx_buffer *rx_buffer_info;
+ u32 staterr;
+ int i = 0;
+
+ if (!netif_msg_hw(adapter))
+ return;
+
+ /* Print Registers */
+ dev_info(&adapter->pdev->dev, "Register Dump\n");
+ pr_info(" Register Name Value\n");
+ for (reg_info = ngbe_reg_info_tbl; reg_info->name; reg_info++) {
+ ngbe_regdump(hw, reg_info);
+ }
+
+ /* Print TX Ring Summary */
+ if (!netdev || !netif_running(netdev))
+ return;
+
+ dev_info(&adapter->pdev->dev, "TX Rings Summary\n");
+ pr_info(" %s %s %s %s\n",
+ "Queue [NTU] [NTC] [bi(ntc)->dma ]",
+ "leng", "ntw", "timestamp");
+ for (n = 0; n < adapter->num_tx_queues; n++) {
+ tx_ring = adapter->tx_ring[n];
+ tx_buffer = &tx_ring->tx_buffer_info[tx_ring->next_to_clean];
+ pr_info(" %5d %5X %5X %016llX %08X %p %016llX\n",
+ n, tx_ring->next_to_use, tx_ring->next_to_clean,
+ (u64)dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ tx_buffer->next_to_watch,
+ (u64)tx_buffer->time_stamp);
+ }
+
+ /* Print TX Rings */
+ if (!netif_msg_tx_done(adapter))
+ goto rx_ring_summary;
+
+ dev_info(&adapter->pdev->dev, "TX Rings Dump\n");
+
+ /* Transmit Descriptor Formats
+ *
+ * Transmit Descriptor (Read)
+ * +--------------------------------------------------------------+
+ * 0 | Buffer Address [63:0] |
+ * +--------------------------------------------------------------+
+ * 8 |PAYLEN |POPTS|CC|IDX |STA |DCMD |DTYP |MAC |RSV |DTALEN |
+ * +--------------------------------------------------------------+
+ * 63 46 45 40 39 38 36 35 32 31 24 23 20 19 18 17 16 15 0
+ *
+ * Transmit Descriptor (Write-Back)
+ * +--------------------------------------------------------------+
+ * 0 | RSV [63:0] |
+ * +--------------------------------------------------------------+
+ * 8 | RSV | STA | RSV |
+ * +--------------------------------------------------------------+
+ * 63 36 35 32 31 0
+ */
+
+ for (n = 0; n < adapter->num_tx_queues; n++) {
+ tx_ring = adapter->tx_ring[n];
+ pr_info("------------------------------------\n");
+ pr_info("TX QUEUE INDEX = %d\n", tx_ring->queue_index);
+ pr_info("------------------------------------\n");
+ pr_info("%s%s %s %s %s %s\n",
+ "T [desc] [address 63:0 ] ",
+ "[PlPOIdStDDt Ln] [bi->dma ] ",
+ "leng", "ntw", "timestamp", "bi->skb");
+
+ for (i = 0; tx_ring->desc && (i < tx_ring->count); i++) {
+ tx_desc = NGBE_TX_DESC(tx_ring, i);
+ tx_buffer = &tx_ring->tx_buffer_info[i];
+ u0 = (struct my_u0 *)tx_desc;
+ if (dma_unmap_len(tx_buffer, len) > 0) {
+ pr_info("T [0x%03X] %016llX %016llX %016llX "
+ "%08X %p %016llX %p",
+ i,
+ le64_to_cpu(u0->a),
+ le64_to_cpu(u0->b),
+ (u64)dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ tx_buffer->next_to_watch,
+ (u64)tx_buffer->time_stamp,
+ tx_buffer->skb);
+ if (i == tx_ring->next_to_use &&
+ i == tx_ring->next_to_clean)
+ pr_cont(" NTC/U\n");
+ else if (i == tx_ring->next_to_use)
+ pr_cont(" NTU\n");
+ else if (i == tx_ring->next_to_clean)
+ pr_cont(" NTC\n");
+ else
+ pr_cont("\n");
+
+ if (netif_msg_pktdata(adapter) &&
+ tx_buffer->skb)
+ print_hex_dump(KERN_INFO, "",
+ DUMP_PREFIX_ADDRESS, 16, 1,
+ tx_buffer->skb->data,
+ dma_unmap_len(tx_buffer, len),
+ true);
+ }
+ }
+ }
+
+ /* Print RX Rings Summary */
+rx_ring_summary:
+ dev_info(&adapter->pdev->dev, "RX Rings Summary\n");
+ pr_info("Queue [NTU] [NTC]\n");
+ for (n = 0; n < adapter->num_rx_queues; n++) {
+ rx_ring = adapter->rx_ring[n];
+ pr_info("%5d %5X %5X\n",
+ n, rx_ring->next_to_use, rx_ring->next_to_clean);
+ }
+
+ /* Print RX Rings */
+ if (!netif_msg_rx_status(adapter))
+ return;
+
+ dev_info(&adapter->pdev->dev, "RX Rings Dump\n");
+
+ /* Receive Descriptor Formats
+ *
+ * Receive Descriptor (Read)
+ * 63 1 0
+ * +-----------------------------------------------------+
+ * 0 | Packet Buffer Address [63:1] |A0/NSE|
+ * +----------------------------------------------+------+
+ * 8 | Header Buffer Address [63:1] | DD |
+ * +-----------------------------------------------------+
+ *
+ *
+ * Receive Descriptor (Write-Back)
+ *
+ * 63 48 47 32 31 30 21 20 17 16 4 3 0
+ * +------------------------------------------------------+
+ * 0 |RSS / Frag Checksum|SPH| HDR_LEN |RSC- |Packet| RSS |
+ * |/ RTT / PCoE_PARAM | | | CNT | Type | Type |
+ * |/ Flow Dir Flt ID | | | | | |
+ * +------------------------------------------------------+
+ * 8 | VLAN Tag | Length |Extended Error| Xtnd Status/NEXTP |
+ * +------------------------------------------------------+
+ * 63 48 47 32 31 20 19 0
+ */
+
+ for (n = 0; n < adapter->num_rx_queues; n++) {
+ rx_ring = adapter->rx_ring[n];
+ pr_info("------------------------------------\n");
+ pr_info("RX QUEUE INDEX = %d\n", rx_ring->queue_index);
+ pr_info("------------------------------------\n");
+ pr_info("%s%s%s",
+ "R [desc] [ PktBuf A0] ",
+ "[ HeadBuf DD] [bi->dma ] [bi->skb ] ",
+ "<-- Adv Rx Read format\n");
+ pr_info("%s%s%s",
+ "RWB[desc] [PcsmIpSHl PtRs] ",
+ "[vl er S cks ln] ---------------- [bi->skb ] ",
+ "<-- Adv Rx Write-Back format\n");
+
+ for (i = 0; i < rx_ring->count; i++) {
+ rx_buffer_info = &rx_ring->rx_buffer_info[i];
+ rx_desc = NGBE_RX_DESC(rx_ring, i);
+ u0 = (struct my_u0 *)rx_desc;
+ staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
+ if (staterr & NGBE_RXD_STAT_DD) {
+ /* Descriptor Done */
+ pr_info("RWB[0x%03X] %016llX "
+ "%016llX ---------------- %p", i,
+ le64_to_cpu(u0->a),
+ le64_to_cpu(u0->b),
+ rx_buffer_info->skb);
+ } else {
+ pr_info("R [0x%03X] %016llX "
+ "%016llX %016llX %p", i,
+ le64_to_cpu(u0->a),
+ le64_to_cpu(u0->b),
+ (u64)rx_buffer_info->page_dma,
+ rx_buffer_info->skb);
+
+ if (netif_msg_pktdata(adapter) &&
+ rx_buffer_info->page_dma) {
+ print_hex_dump(KERN_INFO, "",
+ DUMP_PREFIX_ADDRESS, 16, 1,
+ page_address(rx_buffer_info->page) +
+ rx_buffer_info->page_offset,
+ ngbe_rx_bufsz(rx_ring), true);
+ }
+ }
+
+ if (i == rx_ring->next_to_use)
+ pr_cont(" NTU\n");
+ else if (i == rx_ring->next_to_clean)
+ pr_cont(" NTC\n");
+ else
+ pr_cont("\n");
+
+ }
+ }
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_ethtool.c b/drivers/net/ethernet/netswift/ngbe/ngbe_ethtool.c
new file mode 100644
index 0000000000000..ca389a7ec4ade
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_ethtool.c
@@ -0,0 +1,2756 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+/* ethtool support for ngbe */
+
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/ethtool.h>
+#include <linux/vmalloc.h>
+#include <linux/highmem.h>
+#include <linux/firmware.h>
+#include <linux/uaccess.h>
+#include <linux/net_tstamp.h>
+
+#include "ngbe.h"
+#include "ngbe_hw.h"
+#include "ngbe_phy.h"
+
+#define NGBE_ALL_RAR_ENTRIES 16
+
+struct ngbe_stats {
+ char stat_string[ETH_GSTRING_LEN];
+ int sizeof_stat;
+ int stat_offset;
+};
+
+#define NGBE_NETDEV_STAT(_net_stat) { \
+ .stat_string = #_net_stat, \
+ .sizeof_stat = sizeof_field(struct net_device_stats, _net_stat), \
+ .stat_offset = offsetof(struct net_device_stats, _net_stat) \
+}
+static const struct ngbe_stats ngbe_gstrings_net_stats[] = {
+ NGBE_NETDEV_STAT(rx_packets),
+ NGBE_NETDEV_STAT(tx_packets),
+ NGBE_NETDEV_STAT(rx_bytes),
+ NGBE_NETDEV_STAT(tx_bytes),
+ NGBE_NETDEV_STAT(rx_errors),
+ NGBE_NETDEV_STAT(tx_errors),
+ NGBE_NETDEV_STAT(rx_dropped),
+ NGBE_NETDEV_STAT(tx_dropped),
+ NGBE_NETDEV_STAT(multicast),
+ NGBE_NETDEV_STAT(collisions),
+ NGBE_NETDEV_STAT(rx_over_errors),
+ NGBE_NETDEV_STAT(rx_crc_errors),
+ NGBE_NETDEV_STAT(rx_frame_errors),
+ NGBE_NETDEV_STAT(rx_fifo_errors),
+ NGBE_NETDEV_STAT(rx_missed_errors),
+ NGBE_NETDEV_STAT(tx_aborted_errors),
+ NGBE_NETDEV_STAT(tx_carrier_errors),
+ NGBE_NETDEV_STAT(tx_fifo_errors),
+ NGBE_NETDEV_STAT(tx_heartbeat_errors),
+};
+
+#define NGBE_STAT(_name, _stat) { \
+ .stat_string = _name, \
+ .sizeof_stat = sizeof_field(struct ngbe_adapter, _stat), \
+ .stat_offset = offsetof(struct ngbe_adapter, _stat) \
+}
+static struct ngbe_stats ngbe_gstrings_stats[] = {
+ NGBE_STAT("rx_pkts_nic", stats.gprc),
+ NGBE_STAT("tx_pkts_nic", stats.gptc),
+ NGBE_STAT("rx_bytes_nic", stats.gorc),
+ NGBE_STAT("tx_bytes_nic", stats.gotc),
+ NGBE_STAT("lsc_int", lsc_int),
+ NGBE_STAT("tx_busy", tx_busy),
+ NGBE_STAT("non_eop_descs", non_eop_descs),
+ NGBE_STAT("broadcast", stats.bprc),
+ NGBE_STAT("rx_no_buffer_count", stats.rnbc[0]),
+ NGBE_STAT("tx_timeout_count", tx_timeout_count),
+ NGBE_STAT("tx_restart_queue", restart_queue),
+ NGBE_STAT("rx_long_length_count", stats.roc),
+ NGBE_STAT("rx_short_length_count", stats.ruc),
+ NGBE_STAT("tx_flow_control_xon", stats.lxontxc),
+ NGBE_STAT("rx_flow_control_xon", stats.lxonrxc),
+ NGBE_STAT("tx_flow_control_xoff", stats.lxofftxc),
+ NGBE_STAT("rx_flow_control_xoff", stats.lxoffrxc),
+ NGBE_STAT("rx_csum_offload_good_count", hw_csum_rx_good),
+ NGBE_STAT("rx_csum_offload_errors", hw_csum_rx_error),
+ NGBE_STAT("alloc_rx_page_failed", alloc_rx_page_failed),
+ NGBE_STAT("alloc_rx_buff_failed", alloc_rx_buff_failed),
+ NGBE_STAT("rx_no_dma_resources", hw_rx_no_dma_resources),
+ NGBE_STAT("os2bmc_rx_by_bmc", stats.o2bgptc),
+ NGBE_STAT("os2bmc_tx_by_bmc", stats.b2ospc),
+ NGBE_STAT("os2bmc_tx_by_host", stats.o2bspc),
+ NGBE_STAT("os2bmc_rx_by_host", stats.b2ogprc),
+ NGBE_STAT("tx_hwtstamp_timeouts", tx_hwtstamp_timeouts),
+ NGBE_STAT("rx_hwtstamp_cleared", rx_hwtstamp_cleared),
+};
+
+/* ngbe allocates num_tx_queues and num_rx_queues symmetrically, so we
+ * define the Rx queue count in terms of num_tx_queues; there is no
+ * good way to query the maximum number of Rx queues when CONFIG_RPS
+ * is disabled.
+ */
+#define NGBE_NUM_RX_QUEUES netdev->num_tx_queues
+#define NGBE_NUM_TX_QUEUES netdev->num_tx_queues
+
+#define NGBE_QUEUE_STATS_LEN ( \
+ (NGBE_NUM_TX_QUEUES + NGBE_NUM_RX_QUEUES) * \
+ (sizeof(struct ngbe_queue_stats) / sizeof(u64)))
+#define NGBE_GLOBAL_STATS_LEN ARRAY_SIZE(ngbe_gstrings_stats)
+#define NGBE_NETDEV_STATS_LEN ARRAY_SIZE(ngbe_gstrings_net_stats)
+#define NGBE_PB_STATS_LEN ( \
+ (sizeof(((struct ngbe_adapter *)0)->stats.pxonrxc) + \
+ sizeof(((struct ngbe_adapter *)0)->stats.pxontxc) + \
+ sizeof(((struct ngbe_adapter *)0)->stats.pxoffrxc) + \
+ sizeof(((struct ngbe_adapter *)0)->stats.pxofftxc)) \
+ / sizeof(u64))
+#define NGBE_VF_STATS_LEN \
+ ((((struct ngbe_adapter *)netdev_priv(netdev))->num_vfs) * \
+ (sizeof(struct vf_stats) / sizeof(u64)))
+#define NGBE_STATS_LEN (NGBE_GLOBAL_STATS_LEN + \
+ NGBE_NETDEV_STATS_LEN + \
+ NGBE_PB_STATS_LEN + \
+ NGBE_QUEUE_STATS_LEN + \
+ NGBE_VF_STATS_LEN)
+
+static const char ngbe_gstrings_test[][ETH_GSTRING_LEN] = {
+ "Register test (offline)", "Eeprom test (offline)",
+ "Interrupt test (offline)", "Loopback test (offline)",
+ "Link test (on/offline)"
+};
+#define NGBE_TEST_LEN (sizeof(ngbe_gstrings_test) / ETH_GSTRING_LEN)
+
+#define ngbe_isbackplane(type) \
+ ((type == ngbe_media_type_backplane) ? true : false)
+
+int ngbe_get_link_ksettings(struct net_device *netdev,
+ struct ethtool_link_ksettings *cmd)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 supported_link = 0;
+ u32 link_speed = 0;
+ bool autoneg = false;
+ u32 supported, advertising;
+	bool link_up = false;
+
+ ethtool_convert_link_mode_to_legacy_u32(&supported,
+ cmd->link_modes.supported);
+
+ TCALL(hw, mac.ops.get_link_capabilities, &supported_link, &autoneg);
+
+ /* set the supported link speeds */
+ if (supported_link & NGBE_LINK_SPEED_1GB_FULL)
+ supported |= (ngbe_isbackplane(hw->phy.media_type)) ?
+ SUPPORTED_1000baseKX_Full : SUPPORTED_1000baseT_Full;
+ if (supported_link & NGBE_LINK_SPEED_100_FULL)
+ supported |= SUPPORTED_100baseT_Full;
+ if (supported_link & NGBE_LINK_SPEED_10_FULL)
+ supported |= SUPPORTED_10baseT_Full;
+
+ /* default advertised speed if phy.autoneg_advertised isn't set */
+ advertising = supported;
+
+ /* set the advertised speeds */
+ if (hw->phy.autoneg_advertised) {
+ advertising = 0;
+ if (hw->phy.autoneg_advertised & NGBE_LINK_SPEED_100_FULL)
+ advertising |= ADVERTISED_100baseT_Full;
+ if (hw->phy.autoneg_advertised & NGBE_LINK_SPEED_1GB_FULL) {
+ if (supported & SUPPORTED_1000baseKX_Full)
+ advertising |= ADVERTISED_1000baseKX_Full;
+ else
+ advertising |= ADVERTISED_1000baseT_Full;
+ }
+ if (hw->phy.autoneg_advertised & NGBE_LINK_SPEED_10_FULL)
+ advertising |= ADVERTISED_10baseT_Full;
+ } else {
+ /* default modes in case phy.autoneg_advertised isn't set */
+ if (supported_link & NGBE_LINK_SPEED_1GB_FULL)
+ advertising |= ADVERTISED_1000baseT_Full;
+ if (supported_link & NGBE_LINK_SPEED_100_FULL)
+ advertising |= ADVERTISED_100baseT_Full;
+ if (supported_link & NGBE_LINK_SPEED_10_FULL)
+ advertising |= ADVERTISED_10baseT_Full;
+ }
+ supported |= SUPPORTED_Autoneg;
+	if (autoneg) {
+		advertising |= ADVERTISED_Autoneg;
+		cmd->base.autoneg = AUTONEG_ENABLE;
+	} else {
+		cmd->base.autoneg = AUTONEG_DISABLE;
+	}
+
+ /* Determine the remaining settings based on the PHY type. */
+ switch (adapter->hw.phy.type) {
+ case ngbe_phy_internal:
+ case ngbe_phy_m88e1512:
+ case ngbe_phy_zte:
+ supported |= SUPPORTED_TP;
+ advertising |= ADVERTISED_TP;
+ cmd->base.port = PORT_TP;
+ break;
+ case ngbe_phy_sfp_passive_tyco:
+ case ngbe_phy_sfp_passive_unknown:
+ case ngbe_phy_sfp_ftl:
+ case ngbe_phy_sfp_avago:
+ case ngbe_phy_sfp_intel:
+ case ngbe_phy_sfp_unknown:
+ switch (adapter->hw.phy.sfp_type) {
+ /* SFP+ devices, further checking needed */
+ case ngbe_sfp_type_da_cu:
+ case ngbe_sfp_type_da_cu_core0:
+ case ngbe_sfp_type_da_cu_core1:
+ supported |= SUPPORTED_FIBRE;
+ advertising |= ADVERTISED_FIBRE;
+ cmd->base.port = PORT_DA;
+ break;
+ case ngbe_sfp_type_sr:
+ case ngbe_sfp_type_lr:
+ case ngbe_sfp_type_srlr_core0:
+ case ngbe_sfp_type_srlr_core1:
+ case ngbe_sfp_type_1g_sx_core0:
+ case ngbe_sfp_type_1g_sx_core1:
+ case ngbe_sfp_type_1g_lx_core0:
+ case ngbe_sfp_type_1g_lx_core1:
+ supported |= SUPPORTED_FIBRE;
+ advertising |= ADVERTISED_FIBRE;
+ cmd->base.port = PORT_FIBRE;
+ break;
+ case ngbe_sfp_type_not_present:
+ supported |= SUPPORTED_FIBRE;
+ advertising |= ADVERTISED_FIBRE;
+ cmd->base.port = PORT_NONE;
+ break;
+ case ngbe_sfp_type_1g_cu_core0:
+ case ngbe_sfp_type_1g_cu_core1:
+ supported |= SUPPORTED_TP;
+ advertising |= ADVERTISED_TP;
+ cmd->base.port = PORT_TP;
+ break;
+ case ngbe_sfp_type_unknown:
+ default:
+ supported |= SUPPORTED_FIBRE;
+ advertising |= ADVERTISED_FIBRE;
+ cmd->base.port = PORT_OTHER;
+ break;
+ }
+ break;
+ case ngbe_phy_unknown:
+ case ngbe_phy_generic:
+ case ngbe_phy_sfp_unsupported:
+ default:
+ supported |= SUPPORTED_FIBRE;
+ advertising |= ADVERTISED_FIBRE;
+ cmd->base.port = PORT_OTHER;
+ break;
+ }
+
+ if (!in_interrupt()) {
+ TCALL(hw, mac.ops.check_link, &link_speed, &link_up, false);
+ } else {
+ /*
+ * this case is a special workaround for RHEL5 bonding
+ * that calls this routine from interrupt context
+ */
+ link_speed = adapter->link_speed;
+ link_up = adapter->link_up;
+ }
+
+ supported |= SUPPORTED_Pause;
+
+ switch (hw->fc.requested_mode) {
+ case ngbe_fc_full:
+ advertising |= ADVERTISED_Pause;
+ break;
+ case ngbe_fc_rx_pause:
+ advertising |= ADVERTISED_Pause |
+ ADVERTISED_Asym_Pause;
+ break;
+ case ngbe_fc_tx_pause:
+ advertising |= ADVERTISED_Asym_Pause;
+ break;
+ default:
+ advertising &= ~(ADVERTISED_Pause |
+ ADVERTISED_Asym_Pause);
+ }
+
+ if (link_up) {
+ switch (link_speed) {
+ case NGBE_LINK_SPEED_1GB_FULL:
+ cmd->base.speed = SPEED_1000;
+ break;
+ case NGBE_LINK_SPEED_100_FULL:
+ cmd->base.speed = SPEED_100;
+ break;
+ case NGBE_LINK_SPEED_10_FULL:
+ cmd->base.speed = SPEED_10;
+ break;
+ default:
+ break;
+ }
+ cmd->base.duplex = DUPLEX_FULL;
+ } else {
+		cmd->base.speed = SPEED_UNKNOWN;
+		cmd->base.duplex = DUPLEX_UNKNOWN;
+ }
+
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
+ supported);
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
+ advertising);
+ return 0;
+}
+
+static int ngbe_set_link_ksettings(struct net_device *netdev,
+ const struct ethtool_link_ksettings *cmd)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 advertised, old;
+ s32 err = 0;
+ u32 supported, advertising;
+ ethtool_convert_link_mode_to_legacy_u32(&supported,
+ cmd->link_modes.supported);
+ ethtool_convert_link_mode_to_legacy_u32(&advertising,
+ cmd->link_modes.advertising);
+
+ if ((hw->phy.media_type == ngbe_media_type_copper) ||
+ (hw->phy.multispeed_fiber)) {
+ /*
+ * this function does not support duplex forcing, but can
+ * limit the advertising of the adapter to the specified speed
+ */
+ if (advertising & ~supported) {
+ return -EINVAL;
+ }
+ old = hw->phy.autoneg_advertised;
+ advertised = 0;
+
+ if (cmd->base.autoneg == AUTONEG_ENABLE) {
+ hw->mac.autoneg = true;
+ if (advertising & ADVERTISED_1000baseT_Full)
+ advertised |= NGBE_LINK_SPEED_1GB_FULL;
+
+ if (advertising & ADVERTISED_100baseT_Full)
+ advertised |= NGBE_LINK_SPEED_100_FULL;
+
+ if (advertising & ADVERTISED_10baseT_Full)
+ advertised |= NGBE_LINK_SPEED_10_FULL;
+
+ if (old == advertised) {
+ return err;
+ }
+ } else {
+ if (cmd->base.duplex == DUPLEX_HALF) {
+ e_err(probe, "unsupported duplex\n");
+ return -EINVAL;
+ }
+
+ switch (cmd->base.speed) {
+ case SPEED_10:
+ advertised = NGBE_LINK_SPEED_10_FULL;
+ break;
+ case SPEED_100:
+ advertised = NGBE_LINK_SPEED_100_FULL;
+ break;
+ case SPEED_1000:
+ advertised = NGBE_LINK_SPEED_1GB_FULL;
+ break;
+ default:
+ e_err(probe, "unsupported speed\n");
+ return -EINVAL;
+ }
+ hw->mac.autoneg = false;
+ }
+
+ hw->mac.autotry_restart = true;
+ err = TCALL(hw, mac.ops.setup_link, advertised, true);
+ if (err) {
+ e_info(probe, "setup link failed with code %d\n", err);
+ TCALL(hw, mac.ops.setup_link, old, true);
+ } else {
+ hw->phy.autoneg_advertised = advertised;
+ }
+	} else {
+		/* in this case we currently only support 1Gb/FULL */
+		u32 speed = cmd->base.speed;
+
+		if ((cmd->base.autoneg == AUTONEG_ENABLE) ||
+		    (advertising != ADVERTISED_1000baseT_Full) ||
+		    (speed + cmd->base.duplex != SPEED_1000 + DUPLEX_FULL))
+			return -EINVAL;
+	}
+
+ return err;
+}
+
+static void ngbe_get_pauseparam(struct net_device *netdev,
+ struct ethtool_pauseparam *pause)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+
+ if (!hw->fc.disable_fc_autoneg)
+ pause->autoneg = 1;
+ else
+ pause->autoneg = 0;
+
+ if (hw->fc.current_mode == ngbe_fc_rx_pause) {
+ pause->rx_pause = 1;
+ } else if (hw->fc.current_mode == ngbe_fc_tx_pause) {
+ pause->tx_pause = 1;
+ } else if (hw->fc.current_mode == ngbe_fc_full) {
+ pause->rx_pause = 1;
+ pause->tx_pause = 1;
+ }
+}
+
+static int ngbe_set_pauseparam(struct net_device *netdev,
+ struct ethtool_pauseparam *pause)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_fc_info fc = hw->fc;
+
+ fc.disable_fc_autoneg = (pause->autoneg != AUTONEG_ENABLE);
+
+ if ((pause->rx_pause && pause->tx_pause) || pause->autoneg)
+ fc.requested_mode = ngbe_fc_full;
+ else if (pause->rx_pause)
+ fc.requested_mode = ngbe_fc_rx_pause;
+ else if (pause->tx_pause)
+ fc.requested_mode = ngbe_fc_tx_pause;
+ else
+ fc.requested_mode = ngbe_fc_none;
+
+ /* if the thing changed then we'll update and use new autoneg */
+ if (memcmp(&fc, &hw->fc, sizeof(struct ngbe_fc_info))) {
+ hw->fc = fc;
+ if (netif_running(netdev))
+ ngbe_reinit_locked(adapter);
+ else
+ ngbe_reset(adapter);
+ }
+
+ return 0;
+}
+
+static u32 ngbe_get_msglevel(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ return adapter->msg_enable;
+}
+
+static void ngbe_set_msglevel(struct net_device *netdev, u32 data)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ adapter->msg_enable = data;
+}
+
+static int ngbe_get_regs_len(struct net_device __always_unused *netdev)
+{
+#define NGBE_REGS_LEN 4096
+ return NGBE_REGS_LEN * sizeof(u32);
+}
+
+#define NGBE_GET_STAT(_A_, _R_) (_A_->stats._R_)
+
+static void ngbe_get_regs(struct net_device *netdev,
+ struct ethtool_regs *regs,
+ void *p)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 *regs_buff = p;
+ u32 i;
+ u32 id = 0;
+
+ memset(p, 0, NGBE_REGS_LEN * sizeof(u32));
+ regs_buff[NGBE_REGS_LEN - 1] = 0x55555555;
+
+ regs->version = hw->revision_id << 16 |
+ hw->device_id;
+
+ /* Global Registers */
+ /* chip control */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_PWR);//0
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_CTL);//1
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_PF_SM);//2
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_RST);//3
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_ST);//4
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_SWSM);//5
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MIS_RST_ST);//6
+ /* pvt sensor */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_CTL);//7
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_EN);//8
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_ST);//9
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_ALARM_THRE);//10
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_DALARM_THRE);//11
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_INT_EN);//12
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TS_ALARM_ST);//13
+ /* Fmgr Register */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_CMD);//14
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_DATA);//15
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_STATUS);//16
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_USR_CMD);//17
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_CMDCFG0);//18
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_CMDCFG1);//19
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_ILDR_STATUS);//20
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_SPI_ILDR_SWPTR);//21
+
+ /* Port Registers */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_PORT_CTL);//22
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_PORT_ST);//23
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_EX_VTYPE);//24
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_TCP_TIME);//25
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_LED_CTL);//26
+ /* GPIO */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_GPIO_DR);//27
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_GPIO_DDR);//28
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_GPIO_CTL);//29
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_GPIO_INTEN);//30
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_GPIO_INTMASK);//31
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_GPIO_INTSTATUS);//32
+ /* TX TPH */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_TPH_TDESC);//33
+ /* RX TPH */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_TPH_RDESC);//34
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_TPH_RHDR);//35
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_CFG_TPH_RPL);//36
+
+ /* TDMA */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_CTL);//37
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_POOL_TE);//38
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_PB_THRE);//39
+
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_LLQ);//40
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_ETYPE_LB_L);//41
+
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_ETYPE_AS_L);//42
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_MAC_AS_L);//43
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_VLAN_AS_L);//44
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_TCP_FLG_L);//45
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_TCP_FLG_H);//46
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_VLAN_INS(i));//47-54
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_ETAG_INS(i));//55-62
+ }
+ /* Transmit QOS */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_PBWARB_CTL);//63
+
+ /* statistics */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_DRP_CNT);//64
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_SEC_DRP);//65
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_PKT_CNT);//66
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_BYTE_CNT_L);//67
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_BYTE_CNT_H);//68
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDM_OS2BMC_CNT);//69
+
+ /* RDMA */
+ /* receive control */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_ARB_CTL);//70
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_POOL_RE);//71
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_PF_QDE);//72
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_PF_HIDE);//73
+ /* static */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_DRP_PKT);//74
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_PKT_CNT);//75
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_BYTE_CNT_L);//76
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_BYTE_CNT_H);//77
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDM_BMC2OS_CNT);//78
+
+ /* RDB */
+ /*flow control */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RFCV);//79
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RFCL);//80
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RFCH);//81
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RFCRT);//82
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RFCC);//83
+ /* receive packet buffer */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_PB_CTL);//84
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_PB_SZ);//85
+
+ /* lli interrupt */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_LLI_THRE);//86
+ /* ring assignment */
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_PL_CFG(i));//87-94
+ }
+ for (i = 0; i < 32; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RSSTBL(i));//95-126
+ }
+ for (i = 0; i < 10; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RSSRK(i));//127-136
+ }
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_RA_CTL);//137
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_5T_SDP(i));//138-145
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_5T_CTL0(i));//146-153
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_5T_CTL1(i));//154-161
+ }
+
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_SYN_CLS);//162
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_ETYPE_CLS(i));//163-170
+ }
+ /* statistics */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_MPCNT);//171
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_PKT_CNT);//172
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_REPLI_CNT);//173
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_DRP_CNT);//174
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_LXONTXC);//175
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_LXOFFTXC);//176
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_PFCMACDAL);//177
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_PFCMACDAH);//178
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RDB_TXSWERR);//179
+
+ /* PSR */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_CTL);//180
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MAX_SZ);//181
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VLAN_CTL);//182
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VM_CTL);//183
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_PKT_CNT);//184
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MNG_PKT_CNT);//185
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_DBG_DOP_CNT);//186
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MNG_DOP_CNT);//187
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VM_FLP_L);//188
+
+ /* vm l2 control */
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VM_L2CTL(i));//189-196
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_ETYPE_SWC(i));//197-204
+ }
+ for (i = 0; i < 128; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MC_TBL(i));//205-332
+ }
+ for (i = 0; i < 128; i++) {
+		regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_UC_TBL(i));//333-460
+ }
+ for (i = 0; i < 128; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VLAN_TBL(i));//461-588
+ }
+ /* mac switcher */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MAC_SWC_AD_L);//589
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MAC_SWC_AD_H);//590
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MAC_SWC_VM);//591
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MAC_SWC_IDX);//592
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VLAN_SWC);//593
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VLAN_SWC_VM_L);//594
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_VLAN_SWC_IDX);//595
+
+ /* mirror */
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MR_CTL(i));//596-599
+ }
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MR_VLAN_L(i));//600-603
+ }
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_MR_VM_L(i));//604-607
+ }
+ /* 1588 */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_1588_CTL);//608
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_1588_STMPL);//609
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_1588_STMPH);//610
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_1588_ATTRL);//611
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_1588_ATTRH);//612
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_1588_MSGTYPE);//613
+ /* wake up */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_WKUP_CTL);//614
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_WKUP_IPV);//615
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_WKUP_IP4TBL(i));//616-619
+ }
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_WKUP_IP6TBL(i));//620-623
+ }
+ for (i = 0; i < 16; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_LAN_FLEX_DW_L(i));//624-639
+ }
+ for (i = 0; i < 16; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_LAN_FLEX_DW_H(i));//640-655
+ }
+ for (i = 0; i < 16; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_LAN_FLEX_MSK(i));//656-671
+ }
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PSR_LAN_FLEX_CTL);//672
+
+ /* TDB */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_RFCS);//673
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_PB_SZ);//674
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_PBRARB_CTL);//675
+ /* statistic */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_OUT_PKT_CNT);//676
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_MNG_PKT_CNT);//677
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_LB_PKT_CNT);//678
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TDB_MNG_LARGE_DOP_CNT);//679
+
+ /* TSEC */
+ /* general tsec */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_CTL);//680
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_ST);//681
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_BUF_AF);//682
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_BUF_AE);//683
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_MIN_IFG);//684
+ /* 1588 */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_CTL);//685
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_STMPL);//686
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_STMPH);//687
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_SYSTIML);//688
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_SYSTIMH);//689
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_INC);//690
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_ADJL);//691
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_ADJH);//692
+
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_INT_ST);//693
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_INT_EN);//694
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_AUX_CTL);//695
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TSEC_1588_SDP(i));//696-699
+ }
+
+ /* RSEC */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RSEC_CTL);//700
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RSEC_ST);//701
+ /* mac wrapper */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_TX_CFG);//702
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_RX_CFG);//703
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_PKT_FLT);//704
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_WDG_TIMEOUT);//705
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_TX_FLOW_CTRL);//706
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_RX_FLOW_CTRL);//707
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_INT_ST);//708
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_MAC_INT_EN);//709
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_RX_FRAME_CNT_GOOD_BAD_LOW);//710
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_TX_FRAME_CNT_GOOD_BAD_LOW);//711
+
+ /* BAR register */
+ /* pf interrupt register */
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_MISC_IC);//712
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_MISC_ICS);//713
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_MISC_IEN);//714
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_GPIE);//715
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_IC);//716
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_ICS);//717
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_IMS);//718
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_IMC);//719
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_ISB_ADDR_L);//720
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_ISB_ADDR_H);//721
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_ITRSEL);//722
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_ITR(i));//723-730
+ }
+ for (i = 0; i < 4; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_IVAR(i));//731-734
+ }
+
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_MISC_IVAR);//735
+ /* pf receive ring register */
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_RR_BAL(i));//736-743
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_RR_BAH(i));//744-751
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_RR_WP(i));//752-759
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_RR_RP(i));//760-767
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_RR_CFG(i));//768-775
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_TR_BAL(i));//776-783
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_TR_BAH(i));//784-791
+ }
+ for (i = 0; i < 8; i++) {
+		regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_TR_WP(i));//792-799
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_TR_RP(i));//800-807
+ }
+ for (i = 0; i < 8; i++) {
+ regs_buff[id++] = NGBE_R32_Q(hw, NGBE_PX_TR_CFG(i));//808-815
+ }
+}
+
+static int ngbe_get_eeprom_len(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ return adapter->hw.eeprom.word_size * 2;
+}
+
+static int ngbe_get_eeprom(struct net_device *netdev,
+ struct ethtool_eeprom *eeprom, u8 *bytes)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u16 *eeprom_buff;
+ int first_word, last_word, eeprom_len;
+ int ret_val = 0;
+ u16 i;
+
+ if (eeprom->len == 0)
+ return -EINVAL;
+
+ eeprom->magic = hw->vendor_id | (hw->device_id << 16);
+
+ first_word = eeprom->offset >> 1;
+ last_word = (eeprom->offset + eeprom->len - 1) >> 1;
+ eeprom_len = last_word - first_word + 1;
+
+ eeprom_buff = kmalloc(sizeof(u16) * eeprom_len, GFP_KERNEL);
+ if (!eeprom_buff)
+ return -ENOMEM;
+
+ ret_val = TCALL(hw, eeprom.ops.read_buffer, first_word, eeprom_len,
+ eeprom_buff);
+
+ /* Device's eeprom is always little-endian, word addressable */
+ for (i = 0; i < eeprom_len; i++)
+ le16_to_cpus(&eeprom_buff[i]);
+
+ memcpy(bytes, (u8 *)eeprom_buff + (eeprom->offset & 1), eeprom->len);
+ kfree(eeprom_buff);
+
+ return ret_val;
+}
+
+static int ngbe_set_eeprom(struct net_device *netdev,
+ struct ethtool_eeprom *eeprom, u8 *bytes)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u16 *eeprom_buff;
+ void *ptr;
+ int max_len, first_word, last_word, ret_val = 0;
+ u16 i;
+
+ if (eeprom->len == 0)
+ return -EINVAL;
+
+ if (eeprom->magic != (hw->vendor_id | (hw->device_id << 16)))
+ return -EINVAL;
+
+ max_len = hw->eeprom.word_size * 2;
+
+ first_word = eeprom->offset >> 1;
+ last_word = (eeprom->offset + eeprom->len - 1) >> 1;
+ eeprom_buff = kmalloc(max_len, GFP_KERNEL);
+ if (!eeprom_buff)
+ return -ENOMEM;
+
+ ptr = eeprom_buff;
+
+ if (eeprom->offset & 1) {
+ /*
+ * need read/modify/write of first changed EEPROM word
+ * only the second byte of the word is being modified
+ */
+ ret_val = TCALL(hw, eeprom.ops.read, first_word,
+ &eeprom_buff[0]);
+ if (ret_val)
+ goto err;
+
+ ptr++;
+ }
+ if (((eeprom->offset + eeprom->len) & 1) && (ret_val == 0)) {
+ /*
+ * need read/modify/write of last changed EEPROM word
+ * only the first byte of the word is being modified
+ */
+ ret_val = TCALL(hw, eeprom.ops.read, last_word,
+ &eeprom_buff[last_word - first_word]);
+ if (ret_val)
+ goto err;
+ }
+
+ /* Device's eeprom is always little-endian, word addressable */
+ for (i = 0; i < last_word - first_word + 1; i++)
+ le16_to_cpus(&eeprom_buff[i]);
+
+ memcpy(ptr, bytes, eeprom->len);
+
+ for (i = 0; i < last_word - first_word + 1; i++)
+ cpu_to_le16s(&eeprom_buff[i]);
+
+ ret_val = TCALL(hw, eeprom.ops.write_buffer, first_word,
+ last_word - first_word + 1,
+ eeprom_buff);
+
+ /* Update the checksum */
+ if (ret_val == 0)
+ TCALL(hw, eeprom.ops.update_checksum);
+
+err:
+ kfree(eeprom_buff);
+ return ret_val;
+}
+
+static void ngbe_get_drvinfo(struct net_device *netdev,
+ struct ethtool_drvinfo *drvinfo)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ strncpy(drvinfo->driver, ngbe_driver_name,
+ sizeof(drvinfo->driver) - 1);
+ strncpy(drvinfo->version, ngbe_driver_version,
+ sizeof(drvinfo->version) - 1);
+	strncpy(drvinfo->fw_version, adapter->eeprom_id,
+		sizeof(drvinfo->fw_version) - 1);
+ strncpy(drvinfo->bus_info, pci_name(adapter->pdev),
+ sizeof(drvinfo->bus_info) - 1);
+ if (adapter->num_tx_queues <= NGBE_NUM_RX_QUEUES) {
+ drvinfo->n_stats = NGBE_STATS_LEN -
+ (NGBE_NUM_RX_QUEUES - adapter->num_tx_queues)*
+ (sizeof(struct ngbe_queue_stats) / sizeof(u64))*2;
+ } else {
+ drvinfo->n_stats = NGBE_STATS_LEN;
+ }
+ drvinfo->testinfo_len = NGBE_TEST_LEN;
+ drvinfo->regdump_len = ngbe_get_regs_len(netdev);
+}
+
+static void ngbe_get_ringparam(struct net_device *netdev,
+ struct ethtool_ringparam *ring)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ ring->rx_max_pending = NGBE_MAX_RXD;
+ ring->tx_max_pending = NGBE_MAX_TXD;
+ ring->rx_mini_max_pending = 0;
+ ring->rx_jumbo_max_pending = 0;
+ ring->rx_pending = adapter->rx_ring_count;
+ ring->tx_pending = adapter->tx_ring_count;
+ ring->rx_mini_pending = 0;
+ ring->rx_jumbo_pending = 0;
+}
+
+static int ngbe_set_ringparam(struct net_device *netdev,
+ struct ethtool_ringparam *ring)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_ring *temp_ring;
+ int i, err = 0;
+ u32 new_rx_count, new_tx_count;
+
+ if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
+ return -EINVAL;
+
+ new_tx_count = clamp_t(u32, ring->tx_pending,
+ NGBE_MIN_TXD, NGBE_MAX_TXD);
+ new_tx_count = ALIGN(new_tx_count, NGBE_REQ_TX_DESCRIPTOR_MULTIPLE);
+
+ new_rx_count = clamp_t(u32, ring->rx_pending,
+ NGBE_MIN_RXD, NGBE_MAX_RXD);
+ new_rx_count = ALIGN(new_rx_count, NGBE_REQ_RX_DESCRIPTOR_MULTIPLE);
+
+ if ((new_tx_count == adapter->tx_ring_count) &&
+ (new_rx_count == adapter->rx_ring_count)) {
+ /* nothing to do */
+ return 0;
+ }
+
+ while (test_and_set_bit(__NGBE_RESETTING, &adapter->state))
+ usleep_range(1000, 2000);
+
+ if (!netif_running(adapter->netdev)) {
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ adapter->tx_ring[i]->count = new_tx_count;
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ adapter->rx_ring[i]->count = new_rx_count;
+ adapter->tx_ring_count = new_tx_count;
+ adapter->rx_ring_count = new_rx_count;
+ goto clear_reset;
+ }
+
+ /* allocate temporary buffer to store rings in */
+ i = max_t(int, adapter->num_tx_queues, adapter->num_rx_queues);
+ temp_ring = vmalloc(i * sizeof(struct ngbe_ring));
+
+ if (!temp_ring) {
+ err = -ENOMEM;
+ goto clear_reset;
+ }
+
+ ngbe_down(adapter);
+
+ /*
+ * Setup new Tx resources and free the old Tx resources in that order.
+ * We can then assign the new resources to the rings via a memcpy.
+ * The advantage to this approach is that we are guaranteed to still
+ * have resources even in the case of an allocation failure.
+ */
+ if (new_tx_count != adapter->tx_ring_count) {
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ memcpy(&temp_ring[i], adapter->tx_ring[i],
+ sizeof(struct ngbe_ring));
+
+ temp_ring[i].count = new_tx_count;
+ err = ngbe_setup_tx_resources(&temp_ring[i]);
+ if (err) {
+ while (i) {
+ i--;
+ ngbe_free_tx_resources(&temp_ring[i]);
+ }
+ goto err_setup;
+ }
+ }
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ ngbe_free_tx_resources(adapter->tx_ring[i]);
+
+ memcpy(adapter->tx_ring[i], &temp_ring[i],
+ sizeof(struct ngbe_ring));
+ }
+
+ adapter->tx_ring_count = new_tx_count;
+ }
+
+ /* Repeat the process for the Rx rings if needed */
+ if (new_rx_count != adapter->rx_ring_count) {
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ memcpy(&temp_ring[i], adapter->rx_ring[i],
+ sizeof(struct ngbe_ring));
+
+ temp_ring[i].count = new_rx_count;
+ err = ngbe_setup_rx_resources(&temp_ring[i]);
+ if (err) {
+ while (i) {
+ i--;
+ ngbe_free_rx_resources(&temp_ring[i]);
+ }
+ goto err_setup;
+ }
+ }
+
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ ngbe_free_rx_resources(adapter->rx_ring[i]);
+
+ memcpy(adapter->rx_ring[i], &temp_ring[i],
+ sizeof(struct ngbe_ring));
+ }
+
+ adapter->rx_ring_count = new_rx_count;
+ }
+
+err_setup:
+ ngbe_up(adapter);
+ vfree(temp_ring);
+clear_reset:
+ clear_bit(__NGBE_RESETTING, &adapter->state);
+ return err;
+}
+
+static int ngbe_get_sset_count(struct net_device *netdev, int sset)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ switch (sset) {
+ case ETH_SS_TEST:
+ return NGBE_TEST_LEN;
+ case ETH_SS_STATS:
+ if (adapter->num_tx_queues <= NGBE_NUM_RX_QUEUES) {
+ return NGBE_STATS_LEN - (NGBE_NUM_RX_QUEUES - adapter->num_tx_queues)*
+ (sizeof(struct ngbe_queue_stats) / sizeof(u64))*2;
+ } else {
+ return NGBE_STATS_LEN;
+ }
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static void ngbe_get_ethtool_stats(struct net_device *netdev,
+ struct ethtool_stats __always_unused *stats,
+ u64 *data)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct rtnl_link_stats64 temp;
+ const struct rtnl_link_stats64 *net_stats;
+
+ u64 *queue_stat;
+ int stat_count, k;
+ unsigned int start;
+ struct ngbe_ring *ring;
+ int i, j;
+ char *p;
+
+ ngbe_update_stats(adapter);
+ net_stats = dev_get_stats(netdev, &temp);
+
+ for (i = 0; i < NGBE_NETDEV_STATS_LEN; i++) {
+ p = (char *)net_stats + ngbe_gstrings_net_stats[i].stat_offset;
+ data[i] = (ngbe_gstrings_net_stats[i].sizeof_stat ==
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ }
+ for (j = 0; j < NGBE_GLOBAL_STATS_LEN; j++, i++) {
+ p = (char *)adapter + ngbe_gstrings_stats[j].stat_offset;
+ data[i] = (ngbe_gstrings_stats[j].sizeof_stat ==
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ }
+
+ for (j = 0; j < adapter->num_tx_queues; j++) {
+ ring = adapter->tx_ring[j];
+ if (!ring) {
+ data[i++] = 0;
+ data[i++] = 0;
+ continue;
+ }
+
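+		/* read under the u64_stats seqcount so the packet/byte
+		 * pair stays consistent on 32-bit hosts
+		 */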
+ do {
+ start = u64_stats_fetch_begin_irq(&ring->syncp);
+ data[i] = ring->stats.packets;
+ data[i+1] = ring->stats.bytes;
+ } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
+ i += 2;
+ }
+
+ for (j = 0; j < adapter->num_rx_queues; j++) {
+ ring = adapter->rx_ring[j];
+ if (!ring) {
+ data[i++] = 0;
+ data[i++] = 0;
+ continue;
+ }
+
+ do {
+ start = u64_stats_fetch_begin_irq(&ring->syncp);
+ data[i] = ring->stats.packets;
+ data[i+1] = ring->stats.bytes;
+ } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
+ i += 2;
+ }
+
+ for (j = 0; j < NGBE_MAX_PACKET_BUFFERS; j++) {
+ data[i++] = adapter->stats.pxontxc[j];
+ data[i++] = adapter->stats.pxofftxc[j];
+ }
+ for (j = 0; j < NGBE_MAX_PACKET_BUFFERS; j++) {
+ data[i++] = adapter->stats.pxonrxc[j];
+ data[i++] = adapter->stats.pxoffrxc[j];
+ }
+
+ stat_count = sizeof(struct vf_stats) / sizeof(u64);
+ for (j = 0; j < adapter->num_vfs; j++) {
+ queue_stat = (u64 *)&adapter->vfinfo[j].vfstats;
+ for (k = 0; k < stat_count; k++)
+ data[i + k] = queue_stat[k];
+ queue_stat = (u64 *)&adapter->vfinfo[j].saved_rst_vfstats;
+ for (k = 0; k < stat_count; k++)
+ data[i + k] += queue_stat[k];
+ i += k;
+ }
+}
+
+static void ngbe_get_strings(struct net_device *netdev, u32 stringset,
+ u8 *data)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ char *p = (char *)data;
+ int i;
+
+ switch (stringset) {
+ case ETH_SS_TEST:
+ memcpy(data, *ngbe_gstrings_test,
+ NGBE_TEST_LEN * ETH_GSTRING_LEN);
+ break;
+ case ETH_SS_STATS:
+ for (i = 0; i < NGBE_NETDEV_STATS_LEN; i++) {
+ memcpy(p, ngbe_gstrings_net_stats[i].stat_string,
+ ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ }
+ for (i = 0; i < NGBE_GLOBAL_STATS_LEN; i++) {
+ memcpy(p, ngbe_gstrings_stats[i].stat_string,
+ ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ }
+		for (i = 0; i < adapter->num_tx_queues; i++) {
+ sprintf(p, "tx_queue_%u_packets", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "tx_queue_%u_bytes", i);
+ p += ETH_GSTRING_LEN;
+ }
+		for (i = 0; i < adapter->num_rx_queues; i++) {
+ sprintf(p, "rx_queue_%u_packets", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "rx_queue_%u_bytes", i);
+ p += ETH_GSTRING_LEN;
+ }
+ for (i = 0; i < NGBE_MAX_PACKET_BUFFERS; i++) {
+ sprintf(p, "tx_pb_%u_pxon", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "tx_pb_%u_pxoff", i);
+ p += ETH_GSTRING_LEN;
+ }
+ for (i = 0; i < NGBE_MAX_PACKET_BUFFERS; i++) {
+ sprintf(p, "rx_pb_%u_pxon", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "rx_pb_%u_pxoff", i);
+ p += ETH_GSTRING_LEN;
+ }
+ for (i = 0; i < adapter->num_vfs; i++) {
+ sprintf(p, "VF %d Rx Packets", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "VF %d Rx Bytes", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "VF %d Tx Packets", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "VF %d Tx Bytes", i);
+ p += ETH_GSTRING_LEN;
+ sprintf(p, "VF %d MC Packets", i);
+ p += ETH_GSTRING_LEN;
+ }
+ /* BUG_ON(p - data != NGBE_STATS_LEN * ETH_GSTRING_LEN); */
+ break;
+ }
+}
+
+static int ngbe_link_test(struct ngbe_adapter *adapter, u64 *data)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ bool link_up;
+ u32 link_speed = 0;
+
+ if (NGBE_REMOVED(hw->hw_addr)) {
+ *data = 1;
+ return 1;
+ }
+ *data = 0;
+ TCALL(hw, mac.ops.check_link, &link_speed, &link_up, true);
+ if (!link_up)
+ *data = 1;
+ return *data;
+}
+
+/* ethtool register test data */
+struct ngbe_reg_test {
+ u32 reg;
+ u8 array_len;
+ u8 test_type;
+ u32 mask;
+ u32 write;
+};
+
+/* In the hardware, registers are laid out either singly, in arrays
+ * spaced 0x40 bytes apart, or in contiguous tables. We assume
+ * most tests take place on arrays or single registers (handled
+ * as a single-element array) and special-case the tables.
+ * Table tests are always pattern tests.
+ *
+ * We also make provision for some required setup steps by specifying
+ * registers to be written without any read-back testing.
+ */
+
+#define PATTERN_TEST 1
+#define SET_READ_TEST 2
+#define WRITE_NO_TEST 3
+#define TABLE32_TEST 4
+#define TABLE64_TEST_LO 5
+#define TABLE64_TEST_HI 6
+
+/* default sapphire register test */
+static struct ngbe_reg_test reg_test_sapphire[] = {
+ { NGBE_RDB_RFCL, 1, PATTERN_TEST, 0x8007FFE0, 0x8007FFE0 },
+ { NGBE_RDB_RFCH, 1, PATTERN_TEST, 0x8007FFE0, 0x8007FFE0 },
+ { NGBE_PSR_VLAN_CTL, 1, PATTERN_TEST, 0x00000000, 0x00000000 },
+ { NGBE_PX_RR_BAL(0), 4, PATTERN_TEST, 0xFFFFFF80, 0xFFFFFF80 },
+ { NGBE_PX_RR_BAH(0), 4, PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
+ { NGBE_PX_RR_CFG(0), 4, WRITE_NO_TEST, 0, NGBE_PX_RR_CFG_RR_EN },
+ { NGBE_RDB_RFCH, 1, PATTERN_TEST, 0x8007FFE0, 0x8007FFE0 },
+ { NGBE_RDB_RFCV, 1, PATTERN_TEST, 0xFFFF0000, 0xFFFF0000 },
+ { NGBE_PX_TR_BAL(0), 4, PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
+ { NGBE_PX_TR_BAH(0), 4, PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
+ { NGBE_RDB_PB_CTL, 1, SET_READ_TEST, 0x00000001, 0x00000001 },
+ { NGBE_PSR_MC_TBL(0), 128, TABLE32_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
+ { .reg = 0 }
+};
+
+
+static bool reg_pattern_test(struct ngbe_adapter *adapter, u64 *data, int reg,
+ u32 mask, u32 write)
+{
+ u32 pat, val, before;
+ static const u32 test_pattern[] = {
+ 0x5A5A5A5A, 0xA5A5A5A5, 0x00000000, 0xFFFFFFFF
+ };
+
+ if (NGBE_REMOVED(adapter->hw.hw_addr)) {
+ *data = 1;
+ return true;
+ }
+ for (pat = 0; pat < ARRAY_SIZE(test_pattern); pat++) {
+ before = rd32(&adapter->hw, reg);
+ wr32(&adapter->hw, reg, test_pattern[pat] & write);
+ val = rd32(&adapter->hw, reg);
+ if (val != (test_pattern[pat] & write & mask)) {
+ e_err(drv,
+ "pattern test reg %04X failed: got 0x%08X "
+ "expected 0x%08X\n",
+ reg, val, test_pattern[pat] & write & mask);
+ *data = reg;
+ wr32(&adapter->hw, reg, before);
+ return true;
+ }
+ wr32(&adapter->hw, reg, before);
+ }
+ return false;
+}
+
+static bool reg_set_and_check(struct ngbe_adapter *adapter, u64 *data, int reg,
+ u32 mask, u32 write)
+{
+ u32 val, before;
+
+ if (NGBE_REMOVED(adapter->hw.hw_addr)) {
+ *data = 1;
+ return true;
+ }
+ before = rd32(&adapter->hw, reg);
+ wr32(&adapter->hw, reg, write & mask);
+ val = rd32(&adapter->hw, reg);
+ if ((write & mask) != (val & mask)) {
+ e_err(drv,
+ "set/check reg %04X test failed: got 0x%08X expected"
+ "0x%08X\n",
+ reg, (val & mask), (write & mask));
+ *data = reg;
+ wr32(&adapter->hw, reg, before);
+ return true;
+ }
+ wr32(&adapter->hw, reg, before);
+ return false;
+}
+
+static bool ngbe_reg_test(struct ngbe_adapter *adapter, u64 *data)
+{
+ struct ngbe_reg_test *test;
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i;
+
+ if (NGBE_REMOVED(hw->hw_addr)) {
+ e_err(drv, "Adapter removed - register test blocked\n");
+ *data = 1;
+ return true;
+ }
+
+ test = reg_test_sapphire;
+
+ /*
+ * Perform the remainder of the register test, looping through
+ * the test table until we either fail or reach the null entry.
+ */
+ while (test->reg) {
+ for (i = 0; i < test->array_len; i++) {
+ bool b = false;
+
+ switch (test->test_type) {
+ case PATTERN_TEST:
+ b = reg_pattern_test(adapter, data,
+ test->reg + (i * 0x40),
+ test->mask,
+ test->write);
+ break;
+ case SET_READ_TEST:
+ b = reg_set_and_check(adapter, data,
+ test->reg + (i * 0x40),
+ test->mask,
+ test->write);
+ break;
+ case WRITE_NO_TEST:
+ wr32(hw, test->reg + (i * 0x40),
+ test->write);
+ break;
+ case TABLE32_TEST:
+ b = reg_pattern_test(adapter, data,
+ test->reg + (i * 4),
+ test->mask,
+ test->write);
+ break;
+ case TABLE64_TEST_LO:
+ b = reg_pattern_test(adapter, data,
+ test->reg + (i * 8),
+ test->mask,
+ test->write);
+ break;
+ case TABLE64_TEST_HI:
+ b = reg_pattern_test(adapter, data,
+ (test->reg + 4) + (i * 8),
+ test->mask,
+ test->write);
+ break;
+ }
+ if (b)
+ return true;
+ }
+ test++;
+ }
+
+ *data = 0;
+ return false;
+}
+
+static bool ngbe_eeprom_test(struct ngbe_adapter *adapter, u64 *data)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 devcap;
+
+ if (TCALL(hw, eeprom.ops.eeprom_chksum_cap_st, NGBE_CALSUM_COMMAND, &devcap)) {
+ *data = 1;
+ return true;
+ }
+
+ *data = 0;
+ return false;
+}
+
+static irqreturn_t ngbe_test_intr(int __always_unused irq, void *data)
+{
+ struct net_device *netdev = (struct net_device *)data;
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ u64 icr;
+
+ /* get misc interrupt, as cannot get ring interrupt status */
+ icr = ngbe_misc_isb(adapter, NGBE_ISB_VEC1);
+ icr <<= 32;
+ icr |= ngbe_misc_isb(adapter, NGBE_ISB_VEC0);
+
+ adapter->test_icr = icr;
+
+ return IRQ_HANDLED;
+}
+
+static int ngbe_intr_test(struct ngbe_adapter *adapter, u64 *data)
+{
+ struct net_device *netdev = adapter->netdev;
+ u64 mask;
+ u32 i = 0, shared_int = true;
+ u32 irq = adapter->pdev->irq;
+
+ if (NGBE_REMOVED(adapter->hw.hw_addr)) {
+ *data = 1;
+ return -1;
+ }
+ *data = 0;
+
+ /* Hook up test interrupt handler just for this test */
+ if (adapter->msix_entries) {
+ /* NOTE: we don't test MSI-X interrupts here, yet */
+ return 0;
+ } else if (adapter->flags & NGBE_FLAG_MSI_ENABLED) {
+ shared_int = false;
+ if (request_irq(irq, &ngbe_test_intr, 0, netdev->name,
+ netdev)) {
+ *data = 1;
+ return -1;
+ }
+ } else if (!request_irq(irq, &ngbe_test_intr, IRQF_PROBE_SHARED,
+ netdev->name, netdev)) {
+ shared_int = false;
+ } else if (request_irq(irq, &ngbe_test_intr, IRQF_SHARED,
+ netdev->name, netdev)) {
+ *data = 1;
+ return -1;
+ }
+ e_info(hw, "testing %s interrupt\n",
+ (shared_int ? "shared" : "unshared"));
+
+ /* Disable all the interrupts */
+ ngbe_irq_disable(adapter);
+ NGBE_WRITE_FLUSH(&adapter->hw);
+ usleep_range(10000, 20000);
+
+ /* Test each interrupt */
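+ /* only vector 0 is exercised in this single-vector setup */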
+ for (; i < 1; i++) {
+ /* Interrupt to test */
+ mask = 1ULL << i;
+
+ if (!shared_int) {
+ /*
+ * Disable the interrupts to be reported in
+ * the cause register and then force the same
+ * interrupt and see if one gets posted. If
+ * an interrupt was posted to the bus, the
+ * test failed.
+ */
+ adapter->test_icr = 0;
+ ngbe_intr_disable(&adapter->hw, ~mask);
+ ngbe_intr_trigger(&adapter->hw, mask);
+ NGBE_WRITE_FLUSH(&adapter->hw);
+ usleep_range(10000, 20000);
+
+ if (adapter->test_icr & mask) {
+ *data = 3;
+ break;
+ }
+ }
+
+ /*
+ * Enable the interrupt to be reported in the cause
+ * register and then force the same interrupt and see
+ * if one gets posted. If an interrupt was not posted
+ * to the bus, the test failed.
+ */
+ adapter->test_icr = 0;
+ ngbe_intr_disable(&adapter->hw, NGBE_INTR_ALL);
+ ngbe_intr_trigger(&adapter->hw, mask);
+ NGBE_WRITE_FLUSH(&adapter->hw);
+ usleep_range(10000, 20000);
+
+ if (!(adapter->test_icr & mask)) {
+ *data = 0;
+ break;
+ }
+ }
+
+ /* Disable all the interrupts */
+ ngbe_intr_disable(&adapter->hw, NGBE_INTR_ALL);
+ NGBE_WRITE_FLUSH(&adapter->hw);
+ usleep_range(10000, 20000);
+
+ /* Unhook test interrupt handler */
+ free_irq(irq, netdev);
+
+ return *data;
+}
+
+static void ngbe_free_desc_rings(struct ngbe_adapter *adapter)
+{
+ struct ngbe_ring *tx_ring = &adapter->test_tx_ring;
+ struct ngbe_ring *rx_ring = &adapter->test_rx_ring;
+ struct ngbe_hw *hw = &adapter->hw;
+
+ /* shut down the DMA engines now so they can be reinitialized later */
+
+ /* first Rx */
+ TCALL(hw, mac.ops.disable_rx);
+ ngbe_disable_rx_queue(adapter, rx_ring);
+
+ /* now Tx */
+ wr32(hw, NGBE_PX_TR_CFG(tx_ring->reg_idx), 0);
+
+ wr32m(hw, NGBE_TDM_CTL, NGBE_TDM_CTL_TE, 0);
+
+ ngbe_reset(adapter);
+
+ ngbe_free_tx_resources(&adapter->test_tx_ring);
+ ngbe_free_rx_resources(&adapter->test_rx_ring);
+}
+
+static int ngbe_setup_desc_rings(struct ngbe_adapter *adapter)
+{
+ struct ngbe_ring *tx_ring = &adapter->test_tx_ring;
+ struct ngbe_ring *rx_ring = &adapter->test_rx_ring;
+ struct ngbe_hw *hw = &adapter->hw;
+ int ret_val;
+ int err;
+
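+ /* give the Rx packet buffer an even split before standing up the test rings */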
+ TCALL(hw, mac.ops.setup_rxpba, 0, 0, PBA_STRATEGY_EQUAL);
+
+ /* Setup Tx descriptor ring and Tx buffers */
+ tx_ring->count = NGBE_DEFAULT_TXD;
+ tx_ring->queue_index = 0;
+ tx_ring->dev = pci_dev_to_dev(adapter->pdev);
+ tx_ring->netdev = adapter->netdev;
+ tx_ring->reg_idx = adapter->tx_ring[0]->reg_idx;
+
+ err = ngbe_setup_tx_resources(tx_ring);
+ if (err)
+ return 1;
+
+ wr32m(&adapter->hw, NGBE_TDM_CTL,
+ NGBE_TDM_CTL_TE, NGBE_TDM_CTL_TE);
+ wr32m(hw, NGBE_TSEC_CTL, 0x2, 0);
+ wr32m(hw, NGBE_RSEC_CTL, 0x2, 0);
+ ngbe_configure_tx_ring(adapter, tx_ring);
+
+ /* enable mac transmitter */
+ wr32m(hw, NGBE_MAC_TX_CFG,
+ NGBE_MAC_TX_CFG_TE | NGBE_MAC_TX_CFG_SPEED_MASK,
+ NGBE_MAC_TX_CFG_TE | NGBE_MAC_TX_CFG_SPEED_1G);
+
+ /* Setup Rx Descriptor ring and Rx buffers */
+ rx_ring->count = NGBE_DEFAULT_RXD;
+ rx_ring->queue_index = 0;
+ rx_ring->dev = pci_dev_to_dev(adapter->pdev);
+ rx_ring->netdev = adapter->netdev;
+ rx_ring->reg_idx = adapter->rx_ring[0]->reg_idx;
+
+ err = ngbe_setup_rx_resources(rx_ring);
+ if (err) {
+ ret_val = 4;
+ goto err_nomem;
+ }
+
+ TCALL(hw, mac.ops.disable_rx);
+
+ ngbe_configure_rx_ring(adapter, rx_ring);
+
+ TCALL(hw, mac.ops.enable_rx);
+
+ return 0;
+
+err_nomem:
+ ngbe_free_desc_rings(adapter);
+ return ret_val;
+}
+
+static int ngbe_setup_loopback_test(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 reg_data;
+
+ /* Setup MAC loopback */
+ wr32m(hw, NGBE_MAC_RX_CFG,
+ NGBE_MAC_RX_CFG_LM, NGBE_MAC_RX_CFG_LM);
+
+ reg_data = rd32(hw, NGBE_PSR_CTL);
+ reg_data |= NGBE_PSR_CTL_BAM | NGBE_PSR_CTL_UPE |
+ NGBE_PSR_CTL_MPE | NGBE_PSR_CTL_TPE;
+ wr32(hw, NGBE_PSR_CTL, reg_data);
+
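+ /* vendor magic: registers 0x17000/0x17204 are not documented here;
+ * these values put the internal path into a loopback-friendly mode
+ */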
+ wr32(hw, 0x17000,
+ ((rd32(hw, 0x17000) |
+ 0x00000040U) & ~0x1U));
+
+ wr32(hw, 0x17204, 0x4);
+ wr32(hw, NGBE_PSR_VLAN_CTL,
+ rd32(hw, NGBE_PSR_VLAN_CTL) &
+ ~NGBE_PSR_VLAN_CTL_VFE);
+
+ NGBE_WRITE_FLUSH(hw);
+ usleep_range(10000, 20000);
+
+ return 0;
+}
+
+static void ngbe_loopback_cleanup(struct ngbe_adapter *adapter)
+{
+ wr32m(&adapter->hw, NGBE_MAC_RX_CFG,
+ NGBE_MAC_RX_CFG_LM, ~NGBE_MAC_RX_CFG_LM);
+}
+
+static void ngbe_create_lbtest_frame(struct sk_buff *skb,
+ unsigned int frame_size)
+{
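+ /* fill the frame with 0xFF, then overlay 0xAA and the 0xBE/0xAF
+ * markers past the midpoint so the receiver can validate the payload
+ */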
+ memset(skb->data, 0xFF, frame_size);
+ frame_size >>= 1;
+ memset(&skb->data[frame_size], 0xAA, frame_size / 2 - 1);
+ memset(&skb->data[frame_size + 10], 0xBE, 1);
+ memset(&skb->data[frame_size + 12], 0xAF, 1);
+}
+
+static bool ngbe_check_lbtest_frame(struct ngbe_rx_buffer *rx_buffer,
+ unsigned int frame_size)
+{
+ unsigned char *data;
+ bool match = true;
+
+ frame_size >>= 1;
+
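+ /* map the Rx page and check the marker bytes laid down by
+ * ngbe_create_lbtest_frame()
+ */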
+ data = kmap(rx_buffer->page) + rx_buffer->page_offset;
+ if (data[3] != 0xFF ||
+ data[frame_size + 10] != 0xBE ||
+ data[frame_size + 12] != 0xAF)
+ match = false;
+
+ kunmap(rx_buffer->page);
+
+ return match;
+}
+
+static u16 ngbe_clean_test_rings(struct ngbe_ring *rx_ring,
+ struct ngbe_ring *tx_ring,
+ unsigned int size)
+{
+ union ngbe_rx_desc *rx_desc;
+ struct ngbe_rx_buffer *rx_buffer;
+ struct ngbe_tx_buffer *tx_buffer;
+ const int bufsz = ngbe_rx_bufsz(rx_ring);
+ u16 rx_ntc, tx_ntc, count = 0;
+
+ /* initialize next to clean and descriptor values */
+ rx_ntc = rx_ring->next_to_clean;
+ tx_ntc = tx_ring->next_to_clean;
+ rx_desc = NGBE_RX_DESC(rx_ring, rx_ntc);
+
+ while (ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_DD)) {
+ /* unmap buffer on Tx side */
+ tx_buffer = &tx_ring->tx_buffer_info[tx_ntc];
+ ngbe_unmap_and_free_tx_resource(tx_ring, tx_buffer);
+
+ /* check Rx buffer */
+ rx_buffer = &rx_ring->rx_buffer_info[rx_ntc];
+
+ /* sync Rx buffer for CPU read */
+ dma_sync_single_for_cpu(rx_ring->dev,
+ rx_buffer->page_dma,
+ bufsz,
+ DMA_FROM_DEVICE);
+
+ /* verify contents of skb */
+ if (ngbe_check_lbtest_frame(rx_buffer, size))
+ count++;
+
+ /* sync Rx buffer for device write */
+ dma_sync_single_for_device(rx_ring->dev,
+ rx_buffer->page_dma,
+ bufsz,
+ DMA_FROM_DEVICE);
+
+ /* increment Rx/Tx next to clean counters */
+ rx_ntc++;
+ if (rx_ntc == rx_ring->count)
+ rx_ntc = 0;
+ tx_ntc++;
+ if (tx_ntc == tx_ring->count)
+ tx_ntc = 0;
+
+ /* fetch next descriptor */
+ rx_desc = NGBE_RX_DESC(rx_ring, rx_ntc);
+ }
+
+ /* re-map buffers to ring, store next to clean values */
+ ngbe_alloc_rx_buffers(rx_ring, count);
+ rx_ring->next_to_clean = rx_ntc;
+ tx_ring->next_to_clean = tx_ntc;
+
+ return count;
+}
+
+static int ngbe_run_loopback_test(struct ngbe_adapter *adapter)
+{
+ struct ngbe_ring *tx_ring = &adapter->test_tx_ring;
+ struct ngbe_ring *rx_ring = &adapter->test_rx_ring;
+ int i, j, lc, good_cnt, ret_val = 0;
+ unsigned int size = 1024;
+ netdev_tx_t tx_ret_val;
+ struct sk_buff *skb;
+ u32 flags_orig = adapter->flags;
+
+ /* DCB can modify the frames on Tx */
+ adapter->flags &= ~NGBE_FLAG_DCB_ENABLED;
+
+ /* allocate test skb */
+ skb = alloc_skb(size, GFP_KERNEL);
+ if (!skb)
+ return 11;
+
+ /* place data into test skb */
+ ngbe_create_lbtest_frame(skb, size);
+ skb_put(skb, size);
+
+ /*
+ * Calculate the loop count based on the largest descriptor ring
+ * The idea is to wrap the largest ring a number of times using 64
+ * send/receive pairs during each loop
+ */
+
+ if (rx_ring->count <= tx_ring->count)
+ lc = ((tx_ring->count / 64) * 2) + 1;
+ else
+ lc = ((rx_ring->count / 64) * 2) + 1;
+
+ for (j = 0; j <= lc; j++) {
+ /* reset count of good packets */
+ good_cnt = 0;
+
+ /* place 64 packets on the transmit queue*/
+ for (i = 0; i < 64; i++) {
+ skb_get(skb);
+ tx_ret_val = ngbe_xmit_frame_ring(skb,
+ adapter,
+ tx_ring);
+ if (tx_ret_val == NETDEV_TX_OK)
+ good_cnt++;
+ }
+
+ msleep(10);
+
+ if (good_cnt != 64) {
+ ret_val = 12;
+ break;
+ }
+
+ /* allow 200 milliseconds for packets to go from Tx to Rx */
+ msleep(200);
+
+ good_cnt = ngbe_clean_test_rings(rx_ring, tx_ring, size);
+ if (good_cnt != 64) {
+ ret_val = 13;
+ e_dev_err("ngbe_run_loopback_test: recv_cnt = %d\n", good_cnt);
+ break;
+ }
+ }
+
+ /* free the original skb */
+ kfree_skb(skb);
+ adapter->flags = flags_orig;
+
+ return ret_val;
+}
+
+static int ngbe_loopback_test(struct ngbe_adapter *adapter, u64 *data)
+{
+ *data = ngbe_setup_desc_rings(adapter);
+ if (*data)
+ goto out;
+ *data = ngbe_setup_loopback_test(adapter);
+ if (*data)
+ goto err_loopback;
+ *data = ngbe_run_loopback_test(adapter);
+ if (*data)
+ e_info(hw, "mac loopback testing failed\n");
+ ngbe_loopback_cleanup(adapter);
+
+err_loopback:
+ ngbe_free_desc_rings(adapter);
+out:
+ return *data;
+}
+
+static void ngbe_diag_test(struct net_device *netdev,
+ struct ethtool_test *eth_test, u64 *data)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ bool if_running = netif_running(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+
+ e_dev_info("ngbe_diag_test: start test\n");
+
+ if (NGBE_REMOVED(hw->hw_addr)) {
+ e_err(hw, "Adapter removed - test blocked\n");
+ data[0] = 1;
+ data[1] = 1;
+ data[2] = 1;
+ data[3] = 1;
+ data[4] = 1;
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+ return;
+ }
+ set_bit(__NGBE_TESTING, &adapter->state);
+ if (eth_test->flags == ETH_TEST_FL_OFFLINE) {
+ if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED) {
+ int i;
+ for (i = 0; i < adapter->num_vfs; i++) {
+ if (adapter->vfinfo[i].clear_to_send) {
+ e_warn(drv, "Please take active VFS "
+ "offline and restart the "
+ "adapter before running NIC "
+ "diagnostics\n");
+ data[0] = 1;
+ data[1] = 1;
+ data[2] = 1;
+ data[3] = 1;
+ data[4] = 1;
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+ clear_bit(__NGBE_TESTING,
+ &adapter->state);
+ goto skip_ol_tests;
+ }
+ }
+ }
+
+ /* Offline tests */
+ e_info(hw, "offline testing starting\n");
+
+ /* Link test performed before the hardware reset so autoneg doesn't
+ * interfere with the test result
+ */
+ if (ngbe_link_test(adapter, &data[4]))
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+
+ if (if_running) {
+ /* indicate we're in test mode */
+ ngbe_close(netdev);
+ } else {
+ msleep(20);
+ ngbe_reset(adapter);
+ }
+
+ e_info(hw, "register testing starting\n");
+
+ if (ngbe_reg_test(adapter, &data[0]))
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+
+ msleep(20);
+ ngbe_reset(adapter);
+ e_info(hw, "eeprom testing starting\n");
+ if (ngbe_eeprom_test(adapter, &data[1]))
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+ msleep(20);
+
+ ngbe_reset(adapter);
+ e_info(hw, "interrupt testing starting\n");
+ if (ngbe_intr_test(adapter, &data[2]))
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+
+ if (!(((hw->subsystem_device_id & OEM_MASK) == OCP_CARD) ||
+ ((hw->subsystem_device_id & NCSI_SUP_MASK) == NCSI_SUP))) {
+ /* If SR-IOV or VMDq is enabled then skip the MAC
+ * loopback diagnostic.
+ */
+ if (adapter->flags & (NGBE_FLAG_SRIOV_ENABLED |
+ NGBE_FLAG_VMDQ_ENABLED)) {
+ e_info(hw, "skip MAC loopback diagnostic in VT mode\n");
+ data[3] = 0;
+ goto skip_loopback;
+ }
+
+ e_info(hw, "loopback testing starting\n");
+ ngbe_loopback_test(adapter, &data[3]);
+ }
+
+ data[3] = 0;
+
+skip_loopback:
+ msleep(20);
+ ngbe_reset(adapter);
+
+ /* clear testing bit and return adapter to previous state */
+ clear_bit(__NGBE_TESTING, &adapter->state);
+ if (if_running)
+ ngbe_open(netdev);
+ } else {
+ e_info(hw, "online testing starting\n");
+
+ /* Online tests */
+ if (ngbe_link_test(adapter, &data[4]))
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+
+ /* Offline tests aren't run; pass by default */
+ data[0] = 0;
+ data[1] = 0;
+ data[2] = 0;
+ data[3] = 0;
+
+ clear_bit(__NGBE_TESTING, &adapter->state);
+ }
+
+skip_ol_tests:
+ msleep_interruptible(4 * 1000);
+}
+
+static int ngbe_wol_exclusion(struct ngbe_adapter *adapter,
+ struct ethtool_wolinfo *wol)
+{
+ int retval = 0;
+
+ /* WOL not supported for all devices */
+ if (!ngbe_wol_supported(adapter)) {
+ retval = 1;
+ wol->supported = 0;
+ }
+
+ return retval;
+}
+
+static void ngbe_get_wol(struct net_device *netdev,
+ struct ethtool_wolinfo *wol)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ struct ngbe_hw *hw = &adapter->hw;
+
+ wol->supported = WAKE_UCAST | WAKE_MCAST |
+ WAKE_BCAST | WAKE_MAGIC;
+ wol->wolopts = 0;
+
+ if (ngbe_wol_exclusion(adapter, wol) ||
+ !device_can_wakeup(pci_dev_to_dev(adapter->pdev)))
+ return;
+
+ if (adapter->wol & NGBE_PSR_WKUP_CTL_EX)
+ wol->wolopts |= WAKE_UCAST;
+ if (adapter->wol & NGBE_PSR_WKUP_CTL_MC)
+ wol->wolopts |= WAKE_MCAST;
+ if (adapter->wol & NGBE_PSR_WKUP_CTL_BC)
+ wol->wolopts |= WAKE_BCAST;
+ if (adapter->wol & NGBE_PSR_WKUP_CTL_MAG)
+ wol->wolopts |= WAKE_MAGIC;
+
+ if (!((hw->subsystem_device_id & WOL_SUP_MASK) == WOL_SUP))
+ wol->wolopts = 0;
+}
+
+static int ngbe_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 slot = hw->bus.lan_id;
+ u16 value;
+
+ if (wol->wolopts & (WAKE_PHY | WAKE_ARP | WAKE_MAGICSECURE))
+ return -EOPNOTSUPP;
+
+ if (ngbe_wol_exclusion(adapter, wol))
+ return wol->wolopts ? -EOPNOTSUPP : 0;
+ if (!((hw->subsystem_device_id & WOL_SUP_MASK) == WOL_SUP))
+ return -EOPNOTSUPP;
+ adapter->wol = 0;
+
+ if (wol->wolopts & WAKE_UCAST)
+ adapter->wol |= NGBE_PSR_WKUP_CTL_EX;
+ if (wol->wolopts & WAKE_MCAST)
+ adapter->wol |= NGBE_PSR_WKUP_CTL_MC;
+ if (wol->wolopts & WAKE_BCAST)
+ adapter->wol |= NGBE_PSR_WKUP_CTL_BC;
+ if (wol->wolopts & WAKE_MAGIC) {
+ adapter->wol |= NGBE_PSR_WKUP_CTL_MAG;
+ hw->wol_enabled = !!(adapter->wol);
+ wr32(hw, NGBE_PSR_WKUP_CTL, adapter->wol);
+ ngbe_read_ee_hostif(hw, 0x7FE, &value);
+ /* enable WoL in the EEPROM shadow RAM */
+ ngbe_write_ee_hostif(hw, 0x7FE, value | (1 << slot));
+ ngbe_write_ee_hostif(hw, 0x7FF, 0x5a5a);
+ device_set_wakeup_enable(pci_dev_to_dev(adapter->pdev), adapter->wol);
+ return 0;
+ }
+
+ ngbe_read_ee_hostif(hw, 0x7FE, &value);
+ /* disable WoL in the EEPROM shadow RAM */
+ ngbe_write_ee_hostif(hw, 0x7FE, value & ~(1 << slot));
+ ngbe_write_ee_hostif(hw, 0x7FF, 0x5a5a);
+ return 0;
+}
+
+static int ngbe_nway_reset(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ if (netif_running(netdev))
+ ngbe_reinit_locked(adapter);
+
+ return 0;
+}
+
+static int ngbe_set_phys_id(struct net_device *netdev,
+ enum ethtool_phys_id_state state)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+
+ switch (state) {
+ case ETHTOOL_ID_ACTIVE:
+ adapter->led_reg = rd32(hw, NGBE_CFG_LED_CTL);
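+ /* save LED state; the return value asks the ethtool core to
+ * drive two on/off blink cycles per second
+ */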
+ return 2;
+
+ case ETHTOOL_ID_ON:
+ TCALL(hw, mac.ops.led_on, NGBE_LED_LINK_1G);
+ break;
+
+ case ETHTOOL_ID_OFF:
+ TCALL(hw, mac.ops.led_off, NGBE_LED_LINK_100M | NGBE_LED_LINK_1G);
+ break;
+
+ case ETHTOOL_ID_INACTIVE:
+ /* Restore LED settings */
+ wr32(&adapter->hw, NGBE_CFG_LED_CTL,
+ adapter->led_reg);
+ break;
+ }
+
+ return 0;
+}
+
+static int ngbe_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ ec->tx_max_coalesced_frames_irq = adapter->tx_work_limit;
+ /* only valid if in constant ITR mode */
+ if (adapter->rx_itr_setting <= 1)
+ ec->rx_coalesce_usecs = adapter->rx_itr_setting;
+ else
+ ec->rx_coalesce_usecs = adapter->rx_itr_setting >> 2;
+
+ /* if in mixed tx/rx queues per vector mode, report only rx settings */
+ if (adapter->q_vector[0]->tx.count && adapter->q_vector[0]->rx.count)
+ return 0;
+
+ /* only valid if in constant ITR mode */
+ if (adapter->tx_itr_setting <= 1)
+ ec->tx_coalesce_usecs = adapter->tx_itr_setting;
+ else
+ ec->tx_coalesce_usecs = adapter->tx_itr_setting >> 2;
+
+ return 0;
+}
+
+static int ngbe_set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_q_vector *q_vector;
+ int i;
+ u16 tx_itr_param, rx_itr_param;
+ u16 tx_itr_prev;
+ bool need_reset = false;
+
+ if (adapter->q_vector[0]->tx.count && adapter->q_vector[0]->rx.count) {
+ /* reject Tx specific changes in case of mixed RxTx vectors */
+ if (ec->tx_coalesce_usecs)
+ return -EINVAL;
+ tx_itr_prev = adapter->rx_itr_setting;
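+ /* mixed Rx/Tx vectors track the Rx ITR setting */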
+ } else {
+ tx_itr_prev = adapter->tx_itr_setting;
+ }
+
+ if (ec->tx_max_coalesced_frames_irq)
+ adapter->tx_work_limit = ec->tx_max_coalesced_frames_irq;
+
+ if ((ec->rx_coalesce_usecs > (NGBE_MAX_EITR >> 2)) ||
+ (ec->tx_coalesce_usecs > (NGBE_MAX_EITR >> 2)))
+ return -EINVAL;
+
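+ /* ITR values are stored internally as usecs << 2; raw settings of
+ * 0 and 1 are kept verbatim and treated as special (off/default) modes
+ */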
+ if (ec->rx_coalesce_usecs > 1)
+ adapter->rx_itr_setting = ec->rx_coalesce_usecs << 2;
+ else
+ adapter->rx_itr_setting = ec->rx_coalesce_usecs;
+
+ if (adapter->rx_itr_setting == 1)
+ rx_itr_param = NGBE_20K_ITR;
+ else
+ rx_itr_param = adapter->rx_itr_setting;
+
+ if (ec->tx_coalesce_usecs > 1)
+ adapter->tx_itr_setting = ec->tx_coalesce_usecs << 2;
+ else
+ adapter->tx_itr_setting = ec->tx_coalesce_usecs;
+
+ if (adapter->tx_itr_setting == 1)
+ tx_itr_param = NGBE_20K_ITR;
+ else
+ tx_itr_param = adapter->tx_itr_setting;
+
+ /* mixed Rx/Tx */
+ if (adapter->q_vector[0]->tx.count && adapter->q_vector[0]->rx.count)
+ adapter->tx_itr_setting = adapter->rx_itr_setting;
+
+ /* detect ITR changes that require update of TXDCTL.WTHRESH */
+ if ((adapter->tx_itr_setting != 1) &&
+ (adapter->tx_itr_setting < NGBE_70K_ITR)) {
+ if ((tx_itr_prev == 1) ||
+ (tx_itr_prev >= NGBE_70K_ITR))
+ need_reset = true;
+ } else {
+ if ((tx_itr_prev != 1) &&
+ (tx_itr_prev < NGBE_70K_ITR))
+ need_reset = true;
+ }
+
+ if (adapter->hw.mac.dmac_config.watchdog_timer &&
+ (!adapter->rx_itr_setting && !adapter->tx_itr_setting)) {
+ e_info(probe,
+ "Disabling DMA coalescing because interrupt throttling "
+ "is disabled\n");
+ adapter->hw.mac.dmac_config.watchdog_timer = 0;
+ TCALL(hw, mac.ops.dmac_config);
+ }
+
+ for (i = 0; i < adapter->num_q_vectors; i++) {
+ q_vector = adapter->q_vector[i];
+ q_vector->tx.work_limit = adapter->tx_work_limit;
+ q_vector->rx.work_limit = adapter->rx_work_limit;
+ if (q_vector->tx.count && !q_vector->rx.count)
+ /* tx only */
+ q_vector->itr = tx_itr_param;
+ else
+ /* rx only or mixed */
+ q_vector->itr = rx_itr_param;
+ ngbe_write_eitr(q_vector);
+ }
+
+ /*
+ * do reset here at the end to make sure EITR==0 case is handled
+ * correctly w.r.t stopping tx, and changing TXDCTL.WTHRESH settings
+ * also locks in RSC enable/disable which requires reset
+ */
+ if (need_reset)
+ ngbe_do_reset(netdev);
+
+ return 0;
+}
+
+static int ngbe_get_rss_hash_opts(struct ngbe_adapter *adapter,
+ struct ethtool_rxnfc *cmd)
+{
+ cmd->data = 0;
+
+ /* Report default options for RSS on ngbe */
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+ /* fall through */
+ case UDP_V4_FLOW:
+ if (adapter->flags2 & NGBE_FLAG2_RSS_FIELD_IPV4_UDP)
+ cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+ /* fall through */
+ case SCTP_V4_FLOW:
+ case AH_ESP_V4_FLOW:
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+ case IPV4_FLOW:
+ cmd->data |= RXH_IP_SRC | RXH_IP_DST;
+ break;
+ case TCP_V6_FLOW:
+ cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+ /* fall through */
+ case UDP_V6_FLOW:
+ if (adapter->flags2 & NGBE_FLAG2_RSS_FIELD_IPV6_UDP)
+ cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+ /* fall through */
+ case SCTP_V6_FLOW:
+ case AH_ESP_V6_FLOW:
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+ case IPV6_FLOW:
+ cmd->data |= RXH_IP_SRC | RXH_IP_DST;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int ngbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+ int ret = -EOPNOTSUPP;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ cmd->data = adapter->num_rx_queues;
+ ret = 0;
+ break;
+ case ETHTOOL_GRXCLSRLCNT:
+ ret = 0;
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ break;
+ case ETHTOOL_GRXFH:
+ ret = ngbe_get_rss_hash_opts(adapter, cmd);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+#define UDP_RSS_FLAGS (NGBE_FLAG2_RSS_FIELD_IPV4_UDP | \
+ NGBE_FLAG2_RSS_FIELD_IPV6_UDP)
+static int ngbe_set_rss_hash_opt(struct ngbe_adapter *adapter,
+ struct ethtool_rxnfc *nfc)
+{
+ u32 flags2 = adapter->flags2;
+
+ /*
+ * RSS does not support anything other than hashing
+ * to queues on src and dst IPs and ports
+ */
+ if (nfc->data & ~(RXH_IP_SRC | RXH_IP_DST |
+ RXH_L4_B_0_1 | RXH_L4_B_2_3))
+ return -EINVAL;
+
+ switch (nfc->flow_type) {
+ case TCP_V4_FLOW:
+ case TCP_V6_FLOW:
+ if (!(nfc->data & RXH_IP_SRC) ||
+ !(nfc->data & RXH_IP_DST) ||
+ !(nfc->data & RXH_L4_B_0_1) ||
+ !(nfc->data & RXH_L4_B_2_3))
+ return -EINVAL;
+ break;
+ case UDP_V4_FLOW:
+ if (!(nfc->data & RXH_IP_SRC) ||
+ !(nfc->data & RXH_IP_DST))
+ return -EINVAL;
+ switch (nfc->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) {
+ case 0:
+ flags2 &= ~NGBE_FLAG2_RSS_FIELD_IPV4_UDP;
+ break;
+ case (RXH_L4_B_0_1 | RXH_L4_B_2_3):
+ flags2 |= NGBE_FLAG2_RSS_FIELD_IPV4_UDP;
+ break;
+ default:
+ return -EINVAL;
+ }
+ break;
+ case UDP_V6_FLOW:
+ if (!(nfc->data & RXH_IP_SRC) ||
+ !(nfc->data & RXH_IP_DST))
+ return -EINVAL;
+ switch (nfc->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) {
+ case 0:
+ flags2 &= ~NGBE_FLAG2_RSS_FIELD_IPV6_UDP;
+ break;
+ case (RXH_L4_B_0_1 | RXH_L4_B_2_3):
+ flags2 |= NGBE_FLAG2_RSS_FIELD_IPV6_UDP;
+ break;
+ default:
+ return -EINVAL;
+ }
+ break;
+ case AH_ESP_V4_FLOW:
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+ case SCTP_V4_FLOW:
+ case AH_ESP_V6_FLOW:
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+ case SCTP_V6_FLOW:
+ if (!(nfc->data & RXH_IP_SRC) ||
+ !(nfc->data & RXH_IP_DST) ||
+ (nfc->data & RXH_L4_B_0_1) ||
+ (nfc->data & RXH_L4_B_2_3))
+ return -EINVAL;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* if we changed something we need to update flags */
+ if (flags2 != adapter->flags2) {
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 mrqc;
+
+ mrqc = rd32(hw, NGBE_RDB_RA_CTL);
+
+ if ((flags2 & UDP_RSS_FLAGS) &&
+ !(adapter->flags2 & UDP_RSS_FLAGS))
+ e_warn(drv, "enabling UDP RSS: fragmented packets"
+ " may arrive out of order to the stack above\n");
+
+ adapter->flags2 = flags2;
+
+ /* Perform hash on these packet types */
+ mrqc |= NGBE_RDB_RA_CTL_RSS_IPV4
+ | NGBE_RDB_RA_CTL_RSS_IPV4_TCP
+ | NGBE_RDB_RA_CTL_RSS_IPV6
+ | NGBE_RDB_RA_CTL_RSS_IPV6_TCP;
+
+ mrqc &= ~(NGBE_RDB_RA_CTL_RSS_IPV4_UDP |
+ NGBE_RDB_RA_CTL_RSS_IPV6_UDP);
+
+ if (flags2 & NGBE_FLAG2_RSS_FIELD_IPV4_UDP)
+ mrqc |= NGBE_RDB_RA_CTL_RSS_IPV4_UDP;
+
+ if (flags2 & NGBE_FLAG2_RSS_FIELD_IPV6_UDP)
+ mrqc |= NGBE_RDB_RA_CTL_RSS_IPV6_UDP;
+
+ wr32(hw, NGBE_RDB_RA_CTL, mrqc);
+ }
+
+ return 0;
+}
+
+static int ngbe_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+ int ret = -EOPNOTSUPP;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXCLSRLINS:
+ break;
+ case ETHTOOL_SRXCLSRLDEL:
+ break;
+ case ETHTOOL_SRXFH:
+ ret = ngbe_set_rss_hash_opt(adapter, cmd);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int ngbe_rss_indir_tbl_max(struct ngbe_adapter *adapter)
+{
+ return 64;
+}
+
+static u32 ngbe_get_rxfh_key_size(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ return sizeof(adapter->rss_key);
+}
+
+static u32 ngbe_rss_indir_size(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ return ngbe_rss_indir_tbl_entries(adapter);
+}
+
+static void ngbe_get_reta(struct ngbe_adapter *adapter, u32 *indir)
+{
+ int i, reta_size = ngbe_rss_indir_tbl_entries(adapter);
+
+ for (i = 0; i < reta_size; i++)
+ indir[i] = adapter->rss_indir_tbl[i];
+}
+
+static int ngbe_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
+ u8 *hfunc)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ if (hfunc)
+ *hfunc = ETH_RSS_HASH_TOP;
+
+ if (indir)
+ ngbe_get_reta(adapter, indir);
+
+ if (key)
+ memcpy(key, adapter->rss_key, ngbe_get_rxfh_key_size(netdev));
+
+ return 0;
+}
+
+static int ngbe_set_rxfh(struct net_device *netdev, const u32 *indir,
+ const u8 *key, const u8 hfunc)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ int i;
+ u32 reta_entries = ngbe_rss_indir_tbl_entries(adapter);
+
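+ /* the hash function is fixed; only ETH_RSS_HASH_NO_CHANGE (0) is accepted */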
+ if (hfunc)
+ return -EINVAL;
+
+ /* Fill out the redirection table */
+ if (indir) {
+ int max_queues = min_t(int, adapter->num_rx_queues,
+ ngbe_rss_indir_tbl_max(adapter));
+
+ /*Allow at least 2 queues w/ SR-IOV.*/
+ if ((adapter->flags & NGBE_FLAG_SRIOV_ENABLED) &&
+ (max_queues < 2))
+ max_queues = 2;
+
+ /* Verify user input. */
+ for (i = 0; i < reta_entries; i++)
+ if (indir[i] >= max_queues)
+ return -EINVAL;
+
+ for (i = 0; i < reta_entries; i++)
+ adapter->rss_indir_tbl[i] = indir[i];
+ }
+
+ /* Fill out the rss hash key */
+ if (key)
+ memcpy(adapter->rss_key, key, ngbe_get_rxfh_key_size(netdev));
+
+ ngbe_store_reta(adapter);
+
+ return 0;
+}
+
+static int ngbe_get_ts_info(struct net_device *dev,
+ struct ethtool_ts_info *info)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+
+ /* we always support timestamping disabled */
+ info->rx_filters = 1 << HWTSTAMP_FILTER_NONE;
+
+ info->so_timestamping =
+ SOF_TIMESTAMPING_TX_SOFTWARE |
+ SOF_TIMESTAMPING_RX_SOFTWARE |
+ SOF_TIMESTAMPING_SOFTWARE |
+ SOF_TIMESTAMPING_TX_HARDWARE |
+ SOF_TIMESTAMPING_RX_HARDWARE |
+ SOF_TIMESTAMPING_RAW_HARDWARE;
+
+ if (adapter->ptp_clock)
+ info->phc_index = ptp_clock_index(adapter->ptp_clock);
+ else
+ info->phc_index = -1;
+
+ info->tx_types =
+ (1 << HWTSTAMP_TX_OFF) |
+ (1 << HWTSTAMP_TX_ON);
+
+ info->rx_filters |=
+ (1 << HWTSTAMP_FILTER_PTP_V1_L4_SYNC) |
+ (1 << HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_L2_EVENT) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_L4_EVENT) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_SYNC) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_L2_SYNC) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_L4_SYNC) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_DELAY_REQ) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ) |
+ (1 << HWTSTAMP_FILTER_PTP_V2_EVENT);
+
+ return 0;
+}
+
+static unsigned int ngbe_max_channels(struct ngbe_adapter *adapter)
+{
+ unsigned int max_combined;
+ u8 tcs = netdev_get_num_tc(adapter->netdev);
+
+ if (!(adapter->flags & NGBE_FLAG_MSIX_ENABLED)) {
+ /* We only support one q_vector without MSI-X */
+ max_combined = 1;
+ } else if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED) {
+ /* SR-IOV currently only allows one queue on the PF */
+ max_combined = 1;
+ } else if (tcs > 1) {
+ /* For DCB report channels per traffic class */
+ if (tcs > 4) {
+ /* 8 TC w/ 8 queues per TC */
+ max_combined = 8;
+ } else {
+ /* 4 TC w/ 16 queues per TC */
+ max_combined = 16;
+ }
+ } else if (adapter->atr_sample_rate) {
+ /* support up to 64 queues with ATR */
+ max_combined = NGBE_MAX_FDIR_INDICES;
+ } else {
+ /* support up to max allowed queues with RSS */
+ max_combined = ngbe_max_rss_indices(adapter);
+ }
+
+ return max_combined;
+}
+
+static void ngbe_get_channels(struct net_device *dev,
+ struct ethtool_channels *ch)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+
+ /* report maximum channels */
+ ch->max_combined = ngbe_max_channels(adapter);
+
+ /* report info for other vector */
+ if (adapter->flags & NGBE_FLAG_MSIX_ENABLED) {
+ ch->max_other = NON_Q_VECTORS;
+ ch->other_count = NON_Q_VECTORS;
+ }
+
+ /* record RSS queues */
+ ch->combined_count = adapter->ring_feature[RING_F_RSS].indices;
+
+ /* nothing else to report if RSS is disabled */
+ if (ch->combined_count == 1)
+ return;
+
+ /* we do not support ATR queueing if SR-IOV is enabled */
+ if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED)
+ return;
+
+ /* same thing goes for being DCB enabled */
+ if (netdev_get_num_tc(dev) > 1)
+ return;
+
+ /* if ATR is disabled we can exit */
+ if (!adapter->atr_sample_rate)
+ return;
+
+}
+
+static int ngbe_set_channels(struct net_device *dev,
+ struct ethtool_channels *ch)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+ unsigned int count = ch->combined_count;
+ u8 max_rss_indices = ngbe_max_rss_indices(adapter);
+
+ /* verify they are not requesting separate vectors */
+ if (!count || ch->rx_count || ch->tx_count)
+ return -EINVAL;
+
+ /* verify other_count has not changed */
+ if (ch->other_count != NON_Q_VECTORS)
+ return -EINVAL;
+
+ /* verify the number of channels does not exceed hardware limits */
+ if (count > ngbe_max_channels(adapter))
+ return -EINVAL;
+
+ /* cap RSS limit */
+ if (count > max_rss_indices)
+ count = max_rss_indices;
+ adapter->ring_feature[RING_F_RSS].limit = count;
+
+ /* use setup TC to update any traffic class queue mapping */
+ return ngbe_setup_tc(dev, netdev_get_num_tc(dev));
+}
+
+static int ngbe_get_eee(struct net_device *netdev, struct ethtool_eee *edata)
+{
+ return 0;
+}
+
+static int ngbe_set_eee(struct net_device *netdev, struct ethtool_eee *edata)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ethtool_eee eee_data;
+ s32 ret_val;
+
+ if (!(hw->mac.ops.setup_eee &&
+ (adapter->flags2 & NGBE_FLAG2_EEE_CAPABLE)))
+ return -EOPNOTSUPP;
+
+ memset(&eee_data, 0, sizeof(struct ethtool_eee));
+
+ ret_val = ngbe_get_eee(netdev, &eee_data);
+ if (ret_val)
+ return ret_val;
+
+ if (eee_data.eee_enabled && !edata->eee_enabled) {
+ if (eee_data.tx_lpi_enabled != edata->tx_lpi_enabled) {
+ e_dev_err("Setting EEE tx-lpi is not supported\n");
+ return -EINVAL;
+ }
+
+ if (eee_data.tx_lpi_timer != edata->tx_lpi_timer) {
+ e_dev_err("Setting EEE Tx LPI timer is not "
+ "supported\n");
+ return -EINVAL;
+ }
+
+ if (eee_data.advertised != edata->advertised) {
+ e_dev_err("Setting EEE advertised speeds is not "
+ "supported\n");
+ return -EINVAL;
+ }
+
+ }
+
+ if (eee_data.eee_enabled != edata->eee_enabled) {
+
+ if (edata->eee_enabled)
+ adapter->flags2 |= NGBE_FLAG2_EEE_ENABLED;
+ else
+ adapter->flags2 &= ~NGBE_FLAG2_EEE_ENABLED;
+
+ /* reset link */
+ if (netif_running(netdev))
+ ngbe_reinit_locked(adapter);
+ else
+ ngbe_reset(adapter);
+ }
+
+ return 0;
+}
+
+static int ngbe_set_flash(struct net_device *netdev, struct ethtool_flash *ef)
+{
+ int ret;
+ const struct firmware *fw;
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ ret = request_firmware(&fw, ef->data, &netdev->dev);
+ if (ret < 0)
+ return ret;
+
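+ /* region 0 is written to flash directly; other regions go through
+ * the management firmware interface when it is present
+ */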
+ if (ef->region == 0) {
+ ret = ngbe_upgrade_flash(&adapter->hw, ef->region,
+ fw->data, fw->size);
+ } else {
+ if (ngbe_mng_present(&adapter->hw)) {
+ ret = ngbe_upgrade_flash_hostif(&adapter->hw, ef->region,
+ fw->data, fw->size);
+ } else {
+ ret = -EOPNOTSUPP;
+ }
+ }
+
+ release_firmware(fw);
+ if (!ret)
+ dev_info(&netdev->dev,
+ "loaded firmware %s, reboot to make firmware work\n", ef->data);
+ return ret;
+}
+
+static const struct ethtool_ops ngbe_ethtool_ops = {
+ .get_link_ksettings = ngbe_get_link_ksettings,
+ .set_link_ksettings = ngbe_set_link_ksettings,
+ .get_drvinfo = ngbe_get_drvinfo,
+ .get_regs_len = ngbe_get_regs_len,
+ .get_regs = ngbe_get_regs,
+ .get_wol = ngbe_get_wol,
+ .set_wol = ngbe_set_wol,
+ .nway_reset = ngbe_nway_reset,
+ .get_link = ethtool_op_get_link,
+ .get_eeprom_len = ngbe_get_eeprom_len,
+ .get_eeprom = ngbe_get_eeprom,
+ .set_eeprom = ngbe_set_eeprom,
+ .get_ringparam = ngbe_get_ringparam,
+ .set_ringparam = ngbe_set_ringparam,
+ .get_pauseparam = ngbe_get_pauseparam,
+ .set_pauseparam = ngbe_set_pauseparam,
+ .get_msglevel = ngbe_get_msglevel,
+ .set_msglevel = ngbe_set_msglevel,
+ .self_test = ngbe_diag_test,
+ .get_strings = ngbe_get_strings,
+ .set_phys_id = ngbe_set_phys_id,
+ .get_sset_count = ngbe_get_sset_count,
+ .get_ethtool_stats = ngbe_get_ethtool_stats,
+ .get_coalesce = ngbe_get_coalesce,
+ .set_coalesce = ngbe_set_coalesce,
+ .get_rxnfc = ngbe_get_rxnfc,
+ .set_rxnfc = ngbe_set_rxnfc,
+ .get_eee = ngbe_get_eee,
+ .set_eee = ngbe_set_eee,
+ .get_channels = ngbe_get_channels,
+ .set_channels = ngbe_set_channels,
+ .get_ts_info = ngbe_get_ts_info,
+ .get_rxfh_indir_size = ngbe_rss_indir_size,
+ .get_rxfh_key_size = ngbe_get_rxfh_key_size,
+ .get_rxfh = ngbe_get_rxfh,
+ .set_rxfh = ngbe_set_rxfh,
+ .flash_device = ngbe_set_flash,
+};
+
+void ngbe_set_ethtool_ops(struct net_device *netdev)
+{
+ netdev->ethtool_ops = &ngbe_ethtool_ops;
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_hw.c b/drivers/net/ethernet/netswift/ngbe/ngbe_hw.c
new file mode 100644
index 0000000000000..73b8a328c267d
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_hw.c
@@ -0,0 +1,5047 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+
+#include "ngbe_type.h"
+#include "ngbe_hw.h"
+#include "ngbe_phy.h"
+#include "ngbe.h"
+
+#define NGBE_SP_MAX_TX_QUEUES 8
+#define NGBE_SP_MAX_RX_QUEUES 8
+#define NGBE_SP_RAR_ENTRIES 32
+#define NGBE_SP_MC_TBL_SIZE 128
+#define NGBE_SP_VFT_TBL_SIZE 128
+#define NGBE_SP_RX_PB_SIZE 42
+
+STATIC s32 ngbe_get_eeprom_semaphore(struct ngbe_hw *hw);
+STATIC void ngbe_release_eeprom_semaphore(struct ngbe_hw *hw);
+STATIC s32 ngbe_mta_vector(struct ngbe_hw *hw, u8 *mc_addr);
+
+STATIC s32 ngbe_setup_copper_link(struct ngbe_hw *hw,
+ u32 speed,
+ bool need_restart_AN);
+s32 ngbe_check_mac_link(struct ngbe_hw *hw, u32 *speed,
+ bool *link_up, bool link_up_wait_to_complete);
+s32 ngbe_check_mac_link_mdi(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete);
+s32 ngbe_check_mac_link_yt8521s(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete);
+
+u32 ngbe_rd32_epcs(struct ngbe_hw *hw, u32 addr)
+{
+ unsigned int portRegOffset;
+ u32 data;
+ /* Set the LAN port indicator to portRegOffset[1] */
+ /* 1st, write the regOffset to IDA_ADDR register */
+ portRegOffset = NGBE_XPCS_IDA_ADDR;
+ wr32(hw, portRegOffset, addr);
+
+ /* 2nd, read the data from IDA_DATA register */
+ portRegOffset = NGBE_XPCS_IDA_DATA;
+ data = rd32(hw, portRegOffset);
+
+ return data;
+}
+
+void ngbe_wr32_ephy(struct ngbe_hw *hw, u32 addr, u32 data)
+{
+ unsigned int portRegOffset;
+
+ /* Set the LAN port indicator to portRegOffset[1] */
+ /* 1st, write the regOffset to IDA_ADDR register */
+ portRegOffset = NGBE_ETHPHY_IDA_ADDR;
+ wr32(hw, portRegOffset, addr);
+
+ /* 2nd, write the data to the IDA_DATA register */
+ portRegOffset = NGBE_ETHPHY_IDA_DATA;
+ wr32(hw, portRegOffset, data);
+}
+
+void ngbe_wr32_epcs(struct ngbe_hw *hw, u32 addr, u32 data)
+{
+ unsigned int portRegOffset;
+
+ /* Set the LAN port indicator to portRegOffset[1] */
+ /* 1st, write the regOffset to IDA_ADDR register */
+ portRegOffset = NGBE_XPCS_IDA_ADDR;
+ wr32(hw, portRegOffset, addr);
+
+ /* 2nd, write the data to the IDA_DATA register */
+ portRegOffset = NGBE_XPCS_IDA_DATA;
+ wr32(hw, portRegOffset, data);
+}
+
+/**
+ * ngbe_get_pcie_msix_count - Gets MSI-X vector count
+ * @hw: pointer to hardware structure
+ *
+ * Read PCIe configuration space, and get the MSI-X vector count from
+ * the capabilities table.
+ **/
+u16 ngbe_get_pcie_msix_count(struct ngbe_hw *hw)
+{
+ u16 msix_count = 1;
+ u16 max_msix_count;
+ u32 pos;
+
+ DEBUGFUNC("\n");
+
+ /* emerald parts expose at most NGBE_MAX_MSIX_VECTORS_EMERALD vectors */
+ max_msix_count = NGBE_MAX_MSIX_VECTORS_EMERALD;
+ pos = pci_find_capability(((struct ngbe_adapter *)hw->back)->pdev,
+ PCI_CAP_ID_MSIX);
+ if (!pos)
+ return msix_count;
+ pci_read_config_word(((struct ngbe_adapter *)hw->back)->pdev,
+ pos + PCI_MSIX_FLAGS, &msix_count);
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ msix_count = 0;
+ msix_count &= NGBE_PCIE_MSIX_TBL_SZ_MASK;
+
+ /* MSI-X count is zero-based in HW */
+ msix_count++;
+
+ if (msix_count > max_msix_count)
+ msix_count = max_msix_count;
+
+ return msix_count;
+}
+
+/**
+ * ngbe_init_hw - Generic hardware initialization
+ * @hw: pointer to hardware structure
+ *
+ * Initialize the hardware by resetting the hardware, filling the bus info
+ * structure and media type, clears all on chip counters, initializes receive
+ * address registers, multicast table, VLAN filter table, calls routine to set
+ * up link and flow control settings, and leaves transmit and receive units
+ * disabled and uninitialized
+ **/
+s32 ngbe_init_hw(struct ngbe_hw *hw)
+{
+ s32 status;
+
+ DEBUGFUNC("\n");
+
+ /* Reset the hardware */
+ status = TCALL(hw, mac.ops.reset_hw);
+
+ if (status == 0) {
+ /* Start the HW */
+ status = TCALL(hw, mac.ops.start_hw);
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_clear_hw_cntrs - Generic clear hardware counters
+ * @hw: pointer to hardware structure
+ *
+ * Clears all hardware statistics counters by reading them from the hardware
+ * Statistics counters are clear on read.
+ **/
+s32 ngbe_clear_hw_cntrs(struct ngbe_hw *hw)
+{
+ u16 i = 0;
+
+ DEBUGFUNC("\n");
+
+ rd32(hw, NGBE_RX_CRC_ERROR_FRAMES_LOW);
+ rd32(hw, NGBE_RX_LEN_ERROR_FRAMES_LOW);
+ rd32(hw, NGBE_RDB_LXONTXC);
+ rd32(hw, NGBE_RDB_LXOFFTXC);
+ /* NGBE_MAC_LXONRXC (0x1e0c) is not implemented on this part, so it is skipped */
+ rd32(hw, NGBE_MAC_LXOFFRXC);
+
+ for (i = 0; i < 8; i++) {
+ /* select user priority i in the NGBE_MMC_CONTROL_UP field */
+ wr32m(hw, NGBE_MMC_CONTROL, NGBE_MMC_CONTROL_UP, i << 16);
+ rd32(hw, NGBE_MAC_PXOFFRXC);
+ }
+
+ for (i = 0; i < 8; i++)
+ wr32(hw, NGBE_PX_MPRC(i), 0);
+
+ rd32(hw, NGBE_PX_GPRC);
+ rd32(hw, NGBE_PX_GPTC);
+ rd32(hw, NGBE_PX_GORC_MSB);
+ rd32(hw, NGBE_PX_GOTC_MSB);
+
+ rd32(hw, NGBE_RX_BC_FRAMES_GOOD_LOW);
+ rd32(hw, NGBE_RX_UNDERSIZE_FRAMES_GOOD);
+ rd32(hw, NGBE_RX_OVERSIZE_FRAMES_GOOD);
+ rd32(hw, NGBE_RX_FRAME_CNT_GOOD_BAD_LOW);
+ rd32(hw, NGBE_TX_FRAME_CNT_GOOD_BAD_LOW);
+ rd32(hw, NGBE_TX_MC_FRAMES_GOOD_LOW);
+ rd32(hw, NGBE_TX_BC_FRAMES_GOOD_LOW);
+ rd32(hw, NGBE_RDM_DRP_PKT);
+ return 0;
+}
+
+/**
+ * ngbe_setup_fc - Set up flow control
+ * @hw: pointer to hardware structure
+ *
+ * Called at init time to set up flow control.
+ **/
+s32 ngbe_setup_fc(struct ngbe_hw *hw)
+{
+ s32 ret_val = 0;
+ u16 pcap_backplane = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Validate the requested mode */
+ if (hw->fc.strict_ieee && hw->fc.requested_mode == ngbe_fc_rx_pause) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "ngbe_fc_rx_pause not valid in strict IEEE mode\n");
+ ret_val = NGBE_ERR_INVALID_LINK_SETTINGS;
+ goto out;
+ }
+
+ /*
+ * gig parts do not have a word in the EEPROM to determine the
+ * default flow control setting, so we explicitly set it to full.
+ */
+ if (hw->fc.requested_mode == ngbe_fc_default)
+ hw->fc.requested_mode = ngbe_fc_full;
+
+ /*
+ * The possible values of fc.requested_mode are:
+ * 0: Flow control is completely disabled
+ * 1: Rx flow control is enabled (we can receive pause frames,
+ * but not send pause frames).
+ * 2: Tx flow control is enabled (we can send pause frames but
+ * we do not support receiving pause frames).
+ * 3: Both Rx and Tx flow control (symmetric) are enabled.
+ * other: Invalid.
+ */
+ switch (hw->fc.requested_mode) {
+ case ngbe_fc_none:
+ /* Flow control completely disabled by software override. */
+ break;
+ case ngbe_fc_tx_pause:
+ /*
+ * Tx Flow control is enabled, and Rx Flow control is
+ * disabled by software override.
+ */
+ if (hw->phy.type != ngbe_phy_m88e1512_sfi &&
+ hw->phy.type != ngbe_phy_yt8521s_sfi)
+ pcap_backplane |= NGBE_SR_AN_MMD_ADV_REG1_PAUSE_ASM;
+ else
+ pcap_backplane |= 0x100;
+ break;
+ case ngbe_fc_rx_pause:
+ /*
+ * Rx Flow control is enabled and Tx Flow control is
+ * disabled by software override. Since there really
+ * isn't a way to advertise that we are capable of RX
+ * Pause ONLY, we will advertise that we support both
+ * symmetric and asymmetric Rx PAUSE, as such we fall
+ * through to the fc_full statement. Later, we will
+ * disable the adapter's ability to send PAUSE frames.
+ */
+ case ngbe_fc_full:
+ /* Flow control (both Rx and Tx) is enabled by SW override. */
+ if (hw->phy.type != ngbe_phy_m88e1512_sfi &&
+ hw->phy.type != ngbe_phy_yt8521s_sfi)
+ pcap_backplane |= NGBE_SR_AN_MMD_ADV_REG1_PAUSE_SYM |
+ NGBE_SR_AN_MMD_ADV_REG1_PAUSE_ASM;
+ else
+ pcap_backplane |= 0x80;
+ break;
+ default:
+ ERROR_REPORT1(NGBE_ERROR_ARGUMENT,
+ "Flow control param set incorrectly\n");
+ ret_val = NGBE_ERR_CONFIG;
+ goto out;
+ }
+
+ /*
+ * AUTOC restart handles negotiation of 1G on backplane
+ * and copper.
+ */
+ if (hw->phy.media_type == ngbe_media_type_copper)
+ ret_val = TCALL(hw, phy.ops.set_adv_pause, pcap_backplane);
+
+out:
+ return ret_val;
+}
+
+/**
+ * ngbe_get_mac_addr - Generic get MAC address
+ * @hw: pointer to hardware structure
+ * @mac_addr: Adapter MAC address
+ *
+ * Reads the adapter's MAC address from first Receive Address Register (RAR0)
+ * A reset of the adapter must be performed prior to calling this function
+ * in order for the MAC address to have been loaded from the EEPROM into RAR0
+ **/
+s32 ngbe_get_mac_addr(struct ngbe_hw *hw, u8 *mac_addr)
+{
+ u32 rar_high;
+ u32 rar_low;
+ u16 i;
+
+ DEBUGFUNC("\n");
+
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, 0);
+ rar_high = rd32(hw, NGBE_PSR_MAC_SWC_AD_H);
+ rar_low = rd32(hw, NGBE_PSR_MAC_SWC_AD_L);
+
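+ /* the address is stored big-endian: the two high bytes live in
+ * AD_H and the four low bytes in AD_L
+ */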
+ for (i = 0; i < 2; i++)
+ mac_addr[i] = (u8)(rar_high >> (1 - i) * 8);
+
+ for (i = 0; i < 4; i++)
+ mac_addr[i + 2] = (u8)(rar_low >> (3 - i) * 8);
+
+ return 0;
+}
+
+/**
+ * ngbe_set_pci_config_data - Generic store PCI bus info
+ * @hw: pointer to hardware structure
+ * @link_status: the link status returned by the PCI config space
+ *
+ * Stores the PCI bus info (speed, width, type) within the ngbe_hw structure
+ **/
+void ngbe_set_pci_config_data(struct ngbe_hw *hw, u16 link_status)
+{
+ if (hw->bus.type == ngbe_bus_type_unknown)
+ hw->bus.type = ngbe_bus_type_pci_express;
+
+ switch (link_status & NGBE_PCI_LINK_WIDTH) {
+ case NGBE_PCI_LINK_WIDTH_1:
+ hw->bus.width = ngbe_bus_width_pcie_x1;
+ break;
+ case NGBE_PCI_LINK_WIDTH_2:
+ hw->bus.width = ngbe_bus_width_pcie_x2;
+ break;
+ case NGBE_PCI_LINK_WIDTH_4:
+ hw->bus.width = ngbe_bus_width_pcie_x4;
+ break;
+ case NGBE_PCI_LINK_WIDTH_8:
+ hw->bus.width = ngbe_bus_width_pcie_x8;
+ break;
+ default:
+ hw->bus.width = ngbe_bus_width_unknown;
+ break;
+ }
+
+ switch (link_status & NGBE_PCI_LINK_SPEED) {
+ case NGBE_PCI_LINK_SPEED_2500:
+ hw->bus.speed = ngbe_bus_speed_2500;
+ break;
+ case NGBE_PCI_LINK_SPEED_5000:
+ hw->bus.speed = ngbe_bus_speed_5000;
+ break;
+ case NGBE_PCI_LINK_SPEED_8000:
+ hw->bus.speed = ngbe_bus_speed_8000;
+ break;
+ default:
+ hw->bus.speed = ngbe_bus_speed_unknown;
+ break;
+ }
+}
+
+/**
+ * ngbe_get_bus_info - Generic set PCI bus info
+ * @hw: pointer to hardware structure
+ *
+ * Gets the PCI bus info (speed, width, type) then calls helper function to
+ * store this data within the ngbe_hw structure.
+ **/
+s32 ngbe_get_bus_info(struct ngbe_hw *hw)
+{
+ u16 link_status;
+
+ DEBUGFUNC("\n");
+
+ /* Get the negotiated link width and speed from PCI config space */
+ link_status = NGBE_READ_PCIE_WORD(hw, NGBE_PCI_LINK_STATUS);
+
+ ngbe_set_pci_config_data(hw, link_status);
+
+ return 0;
+}
+
+/**
+ * ngbe_set_lan_id_multi_port_pcie - Set LAN id for PCIe multiple port devices
+ * @hw: pointer to the HW structure
+ *
+ * Determines the LAN function id by reading memory-mapped registers
+ * and swaps the port value if requested.
+ **/
+void ngbe_set_lan_id_multi_port_pcie(struct ngbe_hw *hw)
+{
+ struct ngbe_bus_info *bus = &hw->bus;
+ u32 reg = 0;
+
+ DEBUGFUNC("\n");
+
+ reg = rd32(hw, NGBE_CFG_PORT_ST);
+ bus->lan_id = NGBE_CFG_PORT_ST_LAN_ID(reg);
+ bus->func = bus->lan_id;
+}
+
+/**
+ * ngbe_stop_adapter - Generic stop Tx/Rx units
+ * @hw: pointer to hardware structure
+ *
+ * Sets the adapter_stopped flag within ngbe_hw struct. Clears interrupts,
+ * disables transmit and receive units. The adapter_stopped flag is used by
+ * the shared code and drivers to determine if the adapter is in a stopped
+ * state and should not touch the hardware.
+ **/
+s32 ngbe_stop_adapter(struct ngbe_hw *hw)
+{
+ u16 i;
+
+ DEBUGFUNC("\n");
+
+ /*
+ * Set the adapter_stopped flag so other driver functions stop touching
+ * the hardware
+ */
+ hw->adapter_stopped = true;
+
+ /* Disable the receive unit */
+ TCALL(hw, mac.ops.disable_rx);
+
+ /* Set interrupt mask to stop interrupts from being generated */
+ ngbe_intr_disable(hw, NGBE_INTR_ALL);
+
+ /* Clear any pending interrupts, flush previous writes */
+ wr32(hw, NGBE_PX_MISC_IC, 0xffffffff);
+
+ /* note: bit 0 of NGBE_BME_CTL turns read-only after this write */
+ wr32(hw, NGBE_BME_CTL, 0x3);
+
+ /* Disable the transmit unit. Each queue must be disabled. */
+ for (i = 0; i < hw->mac.max_tx_queues; i++) {
+ wr32m(hw, NGBE_PX_TR_CFG(i),
+ NGBE_PX_TR_CFG_SWFLSH | NGBE_PX_TR_CFG_ENABLE,
+ NGBE_PX_TR_CFG_SWFLSH);
+ }
+
+ /* Disable the receive unit by stopping each queue */
+ for (i = 0; i < hw->mac.max_rx_queues; i++) {
+ wr32m(hw, NGBE_PX_RR_CFG(i),
+ NGBE_PX_RR_CFG_RR_EN, 0);
+ }
+
+ /* flush all queues disables */
+ NGBE_WRITE_FLUSH(hw);
+ msec_delay(2);
+
+ /*
+ * Prevent the PCI-E bus from hanging by disabling PCI-E master
+ * access and verify no pending requests
+ */
+ return ngbe_disable_pcie_master(hw);
+}
+
+/**
+ * ngbe_led_on - Turns on the software controllable LEDs.
+ * @hw: pointer to hardware structure
+ * @index: led number to turn on
+ **/
+s32 ngbe_led_on(struct ngbe_hw *hw, u32 index)
+{
+ u32 led_reg = rd32(hw, NGBE_CFG_LED_CTL);
+
+ DEBUGFUNC("\n");
+
+ /* To turn on the LED, set mode to ON. */
+ led_reg |= index | (index << NGBE_CFG_LED_CTL_LINK_OD_SHIFT);
+ wr32(hw, NGBE_CFG_LED_CTL, led_reg);
+ NGBE_WRITE_FLUSH(hw);
+
+ return 0;
+}
+
+/**
+ * ngbe_led_off - Turns off the software controllable LEDs.
+ * @hw: pointer to hardware structure
+ * @index: led number to turn off
+ **/
+s32 ngbe_led_off(struct ngbe_hw *hw, u32 index)
+{
+ u32 led_reg = rd32(hw, NGBE_CFG_LED_CTL);
+
+ DEBUGFUNC("\n");
+
+ /* To turn off the LED, set mode to OFF. */
+ led_reg &= ~(index << NGBE_CFG_LED_CTL_LINK_OD_SHIFT);
+ led_reg |= index;
+ wr32(hw, NGBE_CFG_LED_CTL, led_reg);
+ NGBE_WRITE_FLUSH(hw);
+ return 0;
+}
+
+/**
+ * ngbe_get_eeprom_semaphore - Get hardware semaphore
+ * @hw: pointer to hardware structure
+ *
+ * Sets the hardware semaphores so EEPROM access can occur for bit-bang method
+ **/
+STATIC s32 ngbe_get_eeprom_semaphore(struct ngbe_hw *hw)
+{
+ s32 status = NGBE_ERR_EEPROM;
+ u32 timeout = 2000;
+ u32 i;
+ u32 swsm;
+
+ /* Get SMBI software semaphore between device drivers first */
+ for (i = 0; i < timeout; i++) {
+ /*
+ * If the SMBI bit is 0 when we read it, then the bit will be
+ * set and we have the semaphore
+ */
+ swsm = rd32(hw, NGBE_MIS_SWSM);
+ if (!(swsm & NGBE_MIS_SWSM_SMBI)) {
+ status = 0;
+ break;
+ }
+ usec_delay(50);
+ }
+
+ if (i == timeout) {
+ DEBUGOUT("Driver can't access the Eeprom - SMBI Semaphore "
+ "not granted.\n");
+ /*
+ * this release is particularly important because our attempts
+ * above to get the semaphore may have succeeded, and if there
+ * was a timeout, we should unconditionally clear the semaphore
+ * bits to free the driver to make progress
+ */
+ ngbe_release_eeprom_semaphore(hw);
+
+ usec_delay(50);
+ /*
+ * one last try
+ * If the SMBI bit is 0 when we read it, then the bit will be
+ * set and we have the semaphore
+ */
+ swsm = rd32(hw, NGBE_MIS_SWSM);
+ if (!(swsm & NGBE_MIS_SWSM_SMBI))
+ status = 0;
+ }
+
+ /* Now get the semaphore between SW/FW through the SWESMBI bit */
+ if (status == 0) {
+ for (i = 0; i < timeout; i++) {
+ if (ngbe_check_mng_access(hw)) {
+ /* Set the SW EEPROM semaphore bit to request access */
+ wr32m(hw, NGBE_MNG_SW_SM,
+ NGBE_MNG_SW_SM_SM, NGBE_MNG_SW_SM_SM);
+
+ /*
+ * If we set the bit successfully then we got
+ * semaphore.
+ */
+ swsm = rd32(hw, NGBE_MNG_SW_SM);
+ if (swsm & NGBE_MNG_SW_SM_SM)
+ break;
+ }
+ usec_delay(50);
+ }
+
+ /*
+ * Release semaphores and return error if SW EEPROM semaphore
+ * was not granted because we don't have access to the EEPROM
+ */
+ if (i >= timeout) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "SWESMBI Software EEPROM semaphore not granted.\n");
+ ngbe_release_eeprom_semaphore(hw);
+ status = NGBE_ERR_EEPROM;
+ }
+ } else {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "Software semaphore SMBI between device drivers "
+ "not granted.\n");
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_release_eeprom_semaphore - Release hardware semaphore
+ * @hw: pointer to hardware structure
+ *
+ * This function clears hardware semaphore bits.
+ **/
+STATIC void ngbe_release_eeprom_semaphore(struct ngbe_hw *hw)
+{
+ if (ngbe_check_mng_access(hw)) {
+ wr32m(hw, NGBE_MNG_SW_SM,
+ NGBE_MNG_SW_SM_SM, 0);
+ wr32m(hw, NGBE_MIS_SWSM,
+ NGBE_MIS_SWSM_SMBI, 0);
+ NGBE_WRITE_FLUSH(hw);
+ }
+}
+
+/**
+ * ngbe_validate_mac_addr - Validate MAC address
+ * @mac_addr: pointer to MAC address.
+ *
+ * Tests a MAC address to ensure it is a valid Individual Address
+ **/
+s32 ngbe_validate_mac_addr(u8 *mac_addr)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Make sure it is not a multicast address */
+ if (NGBE_IS_MULTICAST(mac_addr)) {
+ DEBUGOUT("MAC address is multicast\n");
+ status = NGBE_ERR_INVALID_MAC_ADDR;
+ /* Not a broadcast address */
+ } else if (NGBE_IS_BROADCAST(mac_addr)) {
+ DEBUGOUT("MAC address is broadcast\n");
+ status = NGBE_ERR_INVALID_MAC_ADDR;
+ /* Reject the zero address */
+ } else if (mac_addr[0] == 0 && mac_addr[1] == 0 && mac_addr[2] == 0 &&
+ mac_addr[3] == 0 && mac_addr[4] == 0 && mac_addr[5] == 0) {
+ DEBUGOUT("MAC address is all zeros\n");
+ status = NGBE_ERR_INVALID_MAC_ADDR;
+ }
+ return status;
+}
+
+/**
+ * ngbe_set_rar - Set Rx address register
+ * @hw: pointer to hardware structure
+ * @index: Receive address register to write
+ * @addr: Address to put into receive address register
+ * @pools: VMDq "set" or "pool" index
+ * @enable_addr: set flag that address is active
+ *
+ * Puts an ethernet address into a receive address register.
+ **/
+s32 ngbe_set_rar(struct ngbe_hw *hw, u32 index, u8 *addr, u64 pools,
+ u32 enable_addr)
+{
+ u32 rar_low, rar_high;
+ u32 rar_entries = hw->mac.num_rar_entries;
+
+ DEBUGFUNC("\n");
+
+ /* Make sure we are using a valid rar index range */
+ if (index >= rar_entries) {
+ ERROR_REPORT2(NGBE_ERROR_ARGUMENT,
+ "RAR index %d is out of range.\n", index);
+ return NGBE_ERR_INVALID_ARGUMENT;
+ }
+
+ /* select the MAC address */
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, index);
+
+ /* setup VMDq pool mapping */
+ wr32(hw, NGBE_PSR_MAC_SWC_VM, pools & 0xFFFFFFFF);
+
+ /*
+ * HW expects these in little endian so we reverse the byte
+ * order from network order (big endian) to little endian
+ *
+ * Some parts put the VMDq setting in the extra RAH bits,
+ * so save everything except the lower 16 bits that hold part
+ * of the address and the address valid bit.
+ */
+ rar_low = ((u32)addr[5] |
+ ((u32)addr[4] << 8) |
+ ((u32)addr[3] << 16) |
+ ((u32)addr[2] << 24));
+ rar_high = ((u32)addr[1] |
+ ((u32)addr[0] << 8));
+ if (enable_addr != 0)
+ rar_high |= NGBE_PSR_MAC_SWC_AD_H_AV;
+
+ wr32(hw, NGBE_PSR_MAC_SWC_AD_L, rar_low);
+ wr32m(hw, NGBE_PSR_MAC_SWC_AD_H,
+ (NGBE_PSR_MAC_SWC_AD_H_AD(~0) |
+ NGBE_PSR_MAC_SWC_AD_H_ADTYPE(~0) |
+ NGBE_PSR_MAC_SWC_AD_H_AV),
+ rar_high);
+
+ return 0;
+}
+
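+/*
+ * For example (sketch): programming 00:11:22:33:44:55 through
+ * ngbe_set_rar() yields rar_low = 0x22334455 and rar_high = 0x00000011,
+ * matching the little-endian layout the hardware expects.
+ */
+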
+/**
+ * ngbe_clear_rar - Remove Rx address register
+ * @hw: pointer to hardware structure
+ * @index: Receive address register to write
+ *
+ * Clears an ethernet address from a receive address register.
+ **/
+s32 ngbe_clear_rar(struct ngbe_hw *hw, u32 index)
+{
+ u32 rar_entries = hw->mac.num_rar_entries;
+
+ DEBUGFUNC("\n");
+
+ /* Make sure we are using a valid rar index range */
+ if (index >= rar_entries) {
+ ERROR_REPORT2(NGBE_ERROR_ARGUMENT,
+ "RAR index %d is out of range.\n", index);
+ return NGBE_ERR_INVALID_ARGUMENT;
+ }
+
+ /*
+ * Some parts put the VMDq setting in the extra RAH bits,
+ * so save everything except the lower 16 bits that hold part
+ * of the address and the address valid bit.
+ */
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, index);
+
+ wr32(hw, NGBE_PSR_MAC_SWC_VM, 0);
+ wr32(hw, NGBE_PSR_MAC_SWC_AD_L, 0);
+ wr32m(hw, NGBE_PSR_MAC_SWC_AD_H,
+ (NGBE_PSR_MAC_SWC_AD_H_AD(~0) |
+ NGBE_PSR_MAC_SWC_AD_H_ADTYPE(~0) |
+ NGBE_PSR_MAC_SWC_AD_H_AV),
+ 0);
+
+ return 0;
+}
+
+/**
+ * ngbe_init_rx_addrs - Initializes receive address filters.
+ * @hw: pointer to hardware structure
+ *
+ * Places the MAC address in receive address register 0 and clears the rest
+ * of the receive address registers. Clears the multicast table. Assumes
+ * the receiver is in reset when the routine is called.
+ **/
+s32 ngbe_init_rx_addrs(struct ngbe_hw *hw)
+{
+ u32 i;
+ u32 rar_entries = hw->mac.num_rar_entries;
+ u32 psrctl;
+
+ DEBUGFUNC("\n");
+
+ /*
+	 * If the current MAC address is valid, assume it is a software
+	 * override of the permanent address.
+	 * Otherwise, use the permanent address from the EEPROM.
+ */
+ if (ngbe_validate_mac_addr(hw->mac.addr) ==
+ NGBE_ERR_INVALID_MAC_ADDR) {
+ /* Get the MAC address from the RAR0 for later reference */
+ TCALL(hw, mac.ops.get_mac_addr, hw->mac.addr);
+
+ DEBUGOUT3(" Keeping Current RAR0 Addr =%.2X %.2X %.2X %.2X %.2X %.2X\n",
+ hw->mac.addr[0], hw->mac.addr[1],
+ hw->mac.addr[2], hw->mac.addr[3],
+ hw->mac.addr[4], hw->mac.addr[5]);
+ } else {
+ /* Setup the receive address. */
+ DEBUGOUT("Overriding MAC Address in RAR[0]\n");
+ DEBUGOUT3(" New MAC Addr =%.2X %.2X %.2X %.2X %.2X %.2X\n",
+ hw->mac.addr[0], hw->mac.addr[1],
+ hw->mac.addr[2], hw->mac.addr[3],
+ hw->mac.addr[4], hw->mac.addr[5]);
+
+ TCALL(hw, mac.ops.set_rar, 0, hw->mac.addr, 0,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+ }
+ hw->addr_ctrl.overflow_promisc = 0;
+
+ hw->addr_ctrl.rar_used_count = 1;
+
+ /* Zero out the other receive addresses. */
+ DEBUGOUT1("Clearing RAR[1-%d]\n", rar_entries - 1);
+ for (i = 1; i < rar_entries; i++) {
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, i);
+ wr32(hw, NGBE_PSR_MAC_SWC_AD_L, 0);
+ wr32(hw, NGBE_PSR_MAC_SWC_AD_H, 0);
+ }
+
+ /* Clear the MTA */
+ hw->addr_ctrl.mta_in_use = 0;
+ psrctl = rd32(hw, NGBE_PSR_CTL);
+ psrctl &= ~(NGBE_PSR_CTL_MO | NGBE_PSR_CTL_MFE);
+ psrctl |= hw->mac.mc_filter_type << NGBE_PSR_CTL_MO_SHIFT;
+ wr32(hw, NGBE_PSR_CTL, psrctl);
+ DEBUGOUT(" Clearing MTA\n");
+ for (i = 0; i < hw->mac.mcft_size; i++)
+ wr32(hw, NGBE_PSR_MC_TBL(i), 0);
+
+ TCALL(hw, mac.ops.init_uta_tables);
+
+ return 0;
+}
+
+/**
+ * ngbe_add_uc_addr - Adds a secondary unicast address.
+ * @hw: pointer to hardware structure
+ * @addr: new address
+ * @vmdq: VMDq "set" or "pool" index
+ *
+ * Adds it to an unused receive address register or goes into promiscuous mode.
+ **/
+void ngbe_add_uc_addr(struct ngbe_hw *hw, u8 *addr, u32 vmdq)
+{
+ u32 rar_entries = hw->mac.num_rar_entries;
+ u32 rar;
+
+ DEBUGFUNC("\n");
+
+ DEBUGOUT6(" UC Addr = %.2X %.2X %.2X %.2X %.2X %.2X\n",
+ addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);
+
+ /*
+ * Place this address in the RAR if there is room,
+ * else put the controller into promiscuous mode
+ */
+ if (hw->addr_ctrl.rar_used_count < rar_entries) {
+ rar = hw->addr_ctrl.rar_used_count;
+ TCALL(hw, mac.ops.set_rar, rar, addr, vmdq,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+ DEBUGOUT1("Added a secondary address to RAR[%d]\n", rar);
+ hw->addr_ctrl.rar_used_count++;
+ } else {
+ hw->addr_ctrl.overflow_promisc++;
+ }
+
+ DEBUGOUT("ngbe_add_uc_addr Complete\n");
+}
+
+/**
+ * ngbe_update_uc_addr_list - Updates MAC list of secondary addresses
+ * @hw: pointer to hardware structure
+ * @addr_list: the list of new addresses
+ * @addr_count: number of addresses
+ * @next: iterator function to walk the address list
+ *
+ * The given list replaces any existing list. Clears the secondary addrs from
+ * receive address registers. Uses unused receive address registers for the
+ * first secondary addresses, and falls back to promiscuous mode as needed.
+ *
+ * Drivers using secondary unicast addresses must set user_set_promisc when
+ * manually putting the device into promiscuous mode.
+ **/
+s32 ngbe_update_uc_addr_list(struct ngbe_hw *hw, u8 *addr_list,
+ u32 addr_count, ngbe_mc_addr_itr next)
+{
+ u8 *addr;
+ u32 i;
+ u32 old_promisc_setting = hw->addr_ctrl.overflow_promisc;
+ u32 uc_addr_in_use;
+ u32 vmdq;
+
+ DEBUGFUNC("\n");
+
+ /*
+ * Clear accounting of old secondary address list,
+ * don't count RAR[0]
+ */
+ uc_addr_in_use = hw->addr_ctrl.rar_used_count - 1;
+ hw->addr_ctrl.rar_used_count -= uc_addr_in_use;
+ hw->addr_ctrl.overflow_promisc = 0;
+
+ /* Zero out the other receive addresses */
+	DEBUGOUT1("Clearing RAR[1-%d]\n", uc_addr_in_use);
+ for (i = 0; i < uc_addr_in_use; i++) {
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, 1 + i);
+ wr32(hw, NGBE_PSR_MAC_SWC_AD_L, 0);
+ wr32(hw, NGBE_PSR_MAC_SWC_AD_H, 0);
+ }
+
+ /* Add the new addresses */
+ for (i = 0; i < addr_count; i++) {
+ DEBUGOUT(" Adding the secondary addresses:\n");
+ addr = next(hw, &addr_list, &vmdq);
+ ngbe_add_uc_addr(hw, addr, vmdq);
+ }
+
+ if (hw->addr_ctrl.overflow_promisc) {
+ /* enable promisc if not already in overflow or set by user */
+ if (!old_promisc_setting && !hw->addr_ctrl.user_set_promisc) {
+ DEBUGOUT(" Entering address overflow promisc mode\n");
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_UPE, NGBE_PSR_CTL_UPE);
+ }
+ } else {
+ /* only disable if set by overflow, not by user */
+ if (old_promisc_setting && !hw->addr_ctrl.user_set_promisc) {
+ DEBUGOUT(" Leaving address overflow promisc mode\n");
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_UPE, 0);
+ }
+ }
+
+ DEBUGOUT("ngbe_update_uc_addr_list Complete\n");
+ return 0;
+}
+
+/**
+ * ngbe_mta_vector - Determines bit-vector in multicast table to set
+ * @hw: pointer to hardware structure
+ * @mc_addr: the multicast address
+ *
+ * Extracts the 12 bits from a multicast address that determine which
+ * bit-vector to set in the multicast table. The hardware uses 12 bits from
+ * incoming rx multicast addresses to determine the bit-vector to check in
+ * the MTA. Which of the 4 combinations of 12 bits the hardware uses is set
+ * by the MO field of the MCSTCTRL. The MO field is set during initialization
+ * to mc_filter_type.
+ **/
+STATIC s32 ngbe_mta_vector(struct ngbe_hw *hw, u8 *mc_addr)
+{
+ u32 vector = 0;
+
+ DEBUGFUNC("\n");
+
+ switch (hw->mac.mc_filter_type) {
+ case 0: /* use bits [47:36] of the address */
+ vector = ((mc_addr[4] >> 4) | (((u16)mc_addr[5]) << 4));
+ break;
+ case 1: /* use bits [46:35] of the address */
+ vector = ((mc_addr[4] >> 3) | (((u16)mc_addr[5]) << 5));
+ break;
+ case 2: /* use bits [45:34] of the address */
+ vector = ((mc_addr[4] >> 2) | (((u16)mc_addr[5]) << 6));
+ break;
+ case 3: /* use bits [43:32] of the address */
+ vector = ((mc_addr[4]) | (((u16)mc_addr[5]) << 8));
+ break;
+ default: /* Invalid mc_filter_type */
+ DEBUGOUT("MC filter type param set incorrectly\n");
+ ASSERT(0);
+ break;
+ }
+
+ /* vector can only be 12-bits or boundary will be exceeded */
+ vector &= 0xFFF;
+ return vector;
+}
+
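+/*
+ * Worked example (sketch): for mc_addr 01:00:5e:00:00:01 with
+ * mc_filter_type == 0, vector = (0x00 >> 4) | (0x01 << 4) = 0x010.
+ */
+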
+/**
+ * ngbe_set_mta - Set bit-vector in multicast table
+ * @hw: pointer to hardware structure
+ * @mc_addr: the multicast address to set
+ *
+ * Sets the bit-vector in the multicast table.
+ **/
+void ngbe_set_mta(struct ngbe_hw *hw, u8 *mc_addr)
+{
+ u32 vector;
+ u32 vector_bit;
+ u32 vector_reg;
+
+ DEBUGFUNC("\n");
+
+ hw->addr_ctrl.mta_in_use++;
+
+ vector = ngbe_mta_vector(hw, mc_addr);
+ DEBUGOUT1(" bit-vector = 0x%03X\n", vector);
+
+ /*
+ * The MTA is a register array of 128 32-bit registers. It is treated
+ * like an array of 4096 bits. We want to set bit
+ * BitArray[vector_value]. So we figure out what register the bit is
+ * in, read it, OR in the new bit, then write back the new value. The
+ * register is determined by the upper 7 bits of the vector value and
+	 * the bit within that register is determined by the lower 5 bits of
+ * the value.
+ */
+ vector_reg = (vector >> 5) & 0x7F;
+ vector_bit = vector & 0x1F;
+ hw->mac.mta_shadow[vector_reg] |= (1 << vector_bit);
+}
+
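+/*
+ * Continuing the example above (sketch): vector 0x010 selects
+ * mta_shadow[(0x010 >> 5) & 0x7F] = mta_shadow[0], bit 0x010 & 0x1F = 16.
+ */
+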
+/**
+ * ngbe_update_mc_addr_list - Updates MAC list of multicast addresses
+ * @hw: pointer to hardware structure
+ * @mc_addr_list: the list of new multicast addresses
+ * @mc_addr_count: number of addresses
+ * @next: iterator function to walk the multicast address list
+ * @clear: flag, when set clears the table beforehand
+ *
+ * When the clear flag is set, the given list replaces any existing list.
+ * Hashes the given addresses into the multicast table.
+ **/
+s32 ngbe_update_mc_addr_list(struct ngbe_hw *hw, u8 *mc_addr_list,
+ u32 mc_addr_count, ngbe_mc_addr_itr next,
+ bool clear)
+{
+ u32 i;
+ u32 vmdq;
+ u32 psrctl;
+
+ DEBUGFUNC("\n");
+
+ /*
+ * Set the new number of MC addresses that we are being requested to
+ * use.
+ */
+ hw->addr_ctrl.num_mc_addrs = mc_addr_count;
+ hw->addr_ctrl.mta_in_use = 0;
+
+ /* Clear mta_shadow */
+ if (clear) {
+ DEBUGOUT(" Clearing MTA\n");
+ memset(&hw->mac.mta_shadow, 0, sizeof(hw->mac.mta_shadow));
+ }
+
+ /* Update mta_shadow */
+ for (i = 0; i < mc_addr_count; i++) {
+ DEBUGOUT(" Adding the multicast addresses:\n");
+ ngbe_set_mta(hw, next(hw, &mc_addr_list, &vmdq));
+ }
+
+ /* Enable mta */
+ for (i = 0; i < hw->mac.mcft_size; i++)
+ wr32a(hw, NGBE_PSR_MC_TBL(0), i,
+ hw->mac.mta_shadow[i]);
+
+ if (hw->addr_ctrl.mta_in_use > 0) {
+ psrctl = rd32(hw, NGBE_PSR_CTL);
+ psrctl &= ~(NGBE_PSR_CTL_MO | NGBE_PSR_CTL_MFE);
+ psrctl |= NGBE_PSR_CTL_MFE |
+ (hw->mac.mc_filter_type << NGBE_PSR_CTL_MO_SHIFT);
+ wr32(hw, NGBE_PSR_CTL, psrctl);
+ }
+
+ DEBUGOUT("ngbe_update_mc_addr_list Complete\n");
+ return 0;
+}
+
+/**
+ * ngbe_enable_mc - Enable multicast address in RAR
+ * @hw: pointer to hardware structure
+ *
+ * Enables multicast address in RAR and the use of the multicast hash table.
+ **/
+s32 ngbe_enable_mc(struct ngbe_hw *hw)
+{
+ struct ngbe_addr_filter_info *a = &hw->addr_ctrl;
+ u32 psrctl;
+
+ DEBUGFUNC("\n");
+
+ if (a->mta_in_use > 0) {
+ psrctl = rd32(hw, NGBE_PSR_CTL);
+ psrctl &= ~(NGBE_PSR_CTL_MO | NGBE_PSR_CTL_MFE);
+ psrctl |= NGBE_PSR_CTL_MFE |
+ (hw->mac.mc_filter_type << NGBE_PSR_CTL_MO_SHIFT);
+ wr32(hw, NGBE_PSR_CTL, psrctl);
+ }
+
+ return 0;
+}
+
+/**
+ * ngbe_disable_mc - Disable multicast address in RAR
+ * @hw: pointer to hardware structure
+ *
+ * Disables multicast address in RAR and the use of the multicast hash table.
+ **/
+s32 ngbe_disable_mc(struct ngbe_hw *hw)
+{
+ struct ngbe_addr_filter_info *a = &hw->addr_ctrl;
+ u32 psrctl;
+
+	DEBUGFUNC("\n");
+
+ if (a->mta_in_use > 0) {
+ psrctl = rd32(hw, NGBE_PSR_CTL);
+ psrctl &= ~(NGBE_PSR_CTL_MO | NGBE_PSR_CTL_MFE);
+ psrctl |= hw->mac.mc_filter_type << NGBE_PSR_CTL_MO_SHIFT;
+ wr32(hw, NGBE_PSR_CTL, psrctl);
+ }
+
+ return 0;
+}
+
+/**
+ * ngbe_fc_enable - Enable flow control
+ * @hw: pointer to hardware structure
+ *
+ * Enable flow control according to the current settings.
+ **/
+s32 ngbe_fc_enable(struct ngbe_hw *hw)
+{
+ s32 ret_val = 0;
+ u32 mflcn_reg, fccfg_reg;
+ u32 reg;
+ u32 fcrtl, fcrth;
+
+ DEBUGFUNC("\n");
+
+ /* Validate the water mark configuration */
+ if (!hw->fc.pause_time) {
+ ret_val = NGBE_ERR_INVALID_LINK_SETTINGS;
+ goto out;
+ }
+
+ /* Low water mark of zero causes XOFF floods */
+ if ((hw->fc.current_mode & ngbe_fc_tx_pause) && hw->fc.high_water) {
+ if (!hw->fc.low_water || hw->fc.low_water >= hw->fc.high_water) {
+ DEBUGOUT("Invalid water mark configuration\n");
+ ret_val = NGBE_ERR_INVALID_LINK_SETTINGS;
+ goto out;
+ }
+ }
+
+ /* Negotiate the fc mode to use */
+ ngbe_fc_autoneg(hw);
+
+ /* Disable any previous flow control settings */
+ mflcn_reg = rd32(hw, NGBE_MAC_RX_FLOW_CTRL);
+ mflcn_reg &= ~NGBE_MAC_RX_FLOW_CTRL_RFE;
+
+ fccfg_reg = rd32(hw, NGBE_RDB_RFCC);
+ fccfg_reg &= ~NGBE_RDB_RFCC_RFCE_802_3X;
+
+ /*
+ * The possible values of fc.current_mode are:
+ * 0: Flow control is completely disabled
+ * 1: Rx flow control is enabled (we can receive pause frames,
+ * but not send pause frames).
+ * 2: Tx flow control is enabled (we can send pause frames but
+ * we do not support receiving pause frames).
+ * 3: Both Rx and Tx flow control (symmetric) are enabled.
+ * other: Invalid.
+ */
+ switch (hw->fc.current_mode) {
+ case ngbe_fc_none:
+ /*
+ * Flow control is disabled by software override or autoneg.
+ * The code below will actually disable it in the HW.
+ */
+ break;
+ case ngbe_fc_rx_pause:
+ /*
+ * Rx Flow control is enabled and Tx Flow control is
+ * disabled by software override. Since there really
+ * isn't a way to advertise that we are capable of RX
+ * Pause ONLY, we will advertise that we support both
+ * symmetric and asymmetric Rx PAUSE. Later, we will
+ * disable the adapter's ability to send PAUSE frames.
+ */
+ mflcn_reg |= NGBE_MAC_RX_FLOW_CTRL_RFE;
+ break;
+ case ngbe_fc_tx_pause:
+ /*
+ * Tx Flow control is enabled, and Rx Flow control is
+ * disabled by software override.
+ */
+ fccfg_reg |= NGBE_RDB_RFCC_RFCE_802_3X;
+ break;
+ case ngbe_fc_full:
+ /* Flow control (both Rx and Tx) is enabled by SW override. */
+ mflcn_reg |= NGBE_MAC_RX_FLOW_CTRL_RFE;
+ fccfg_reg |= NGBE_RDB_RFCC_RFCE_802_3X;
+ break;
+ default:
+ ERROR_REPORT1(NGBE_ERROR_ARGUMENT,
+ "Flow control param set incorrectly\n");
+ ret_val = NGBE_ERR_CONFIG;
+ goto out;
+ break;
+ }
+
+ /* Set 802.3x based flow control settings. */
+ wr32(hw, NGBE_MAC_RX_FLOW_CTRL, mflcn_reg);
+ wr32(hw, NGBE_RDB_RFCC, fccfg_reg);
+
+ /* Set up and enable Rx high/low water mark thresholds, enable XON. */
+ if ((hw->fc.current_mode & ngbe_fc_tx_pause) &&
+ hw->fc.high_water) {
+ /* 32Byte granularity */
+ fcrtl = (hw->fc.low_water << 10) |
+ NGBE_RDB_RFCL_XONE;
+ wr32(hw, NGBE_RDB_RFCL, fcrtl);
+ fcrth = (hw->fc.high_water << 10) |
+ NGBE_RDB_RFCH_XOFFE;
+ } else {
+ wr32(hw, NGBE_RDB_RFCL, 0);
+ /*
+ * In order to prevent Tx hangs when the internal Tx
+ * switch is enabled we must set the high water mark
+		 * to the Rx packet buffer size minus 24KB. This allows
+ * the Tx switch to function even under heavy Rx
+ * workloads.
+ */
+ fcrth = rd32(hw, NGBE_RDB_PB_SZ) - 24576;
+ }
+
+ wr32(hw, NGBE_RDB_RFCH, fcrth);
+
+ /* Configure pause time (2 TCs per register) */
+ reg = hw->fc.pause_time * 0x00010000;
+ wr32(hw, NGBE_RDB_RFCV, reg);
+
+ /* Configure flow control refresh threshold value */
+ wr32(hw, NGBE_RDB_RFCRT, hw->fc.pause_time / 2);
+
+out:
+ return ret_val;
+}
+
+/**
+ * ngbe_negotiate_fc - Negotiate flow control
+ * @hw: pointer to hardware structure
+ * @adv_reg: flow control advertised settings
+ * @lp_reg: link partner's flow control settings
+ * @adv_sym: symmetric pause bit in advertisement
+ * @adv_asm: asymmetric pause bit in advertisement
+ * @lp_sym: symmetric pause bit in link partner advertisement
+ * @lp_asm: asymmetric pause bit in link partner advertisement
+ *
+ * Find the intersection between advertised settings and link partner's
+ * advertised settings
+ **/
+STATIC s32 ngbe_negotiate_fc(struct ngbe_hw *hw, u32 adv_reg, u32 lp_reg,
+ u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm)
+{
+ if ((!(adv_reg)) || (!(lp_reg))) {
+ ERROR_REPORT3(NGBE_ERROR_UNSUPPORTED,
+ "Local or link partner's advertised flow control "
+ "settings are NULL. Local: %x, link partner: %x\n",
+ adv_reg, lp_reg);
+ return NGBE_ERR_FC_NOT_NEGOTIATED;
+ }
+
+ if ((adv_reg & adv_sym) && (lp_reg & lp_sym)) {
+ /*
+ * Now we need to check if the user selected Rx ONLY
+		 * pause frames. In this case, we had to advertise
+ * FULL flow control because we could not advertise RX
+ * ONLY. Hence, we must now check to see if we need to
+ * turn OFF the TRANSMISSION of PAUSE frames.
+ */
+ if (hw->fc.requested_mode == ngbe_fc_full) {
+ hw->fc.current_mode = ngbe_fc_full;
+ DEBUGOUT("Flow Control = FULL.\n");
+ } else {
+ hw->fc.current_mode = ngbe_fc_rx_pause;
+ DEBUGOUT("Flow Control=RX PAUSE frames only\n");
+ }
+ } else if (!(adv_reg & adv_sym) && (adv_reg & adv_asm) &&
+ (lp_reg & lp_sym) && (lp_reg & lp_asm)) {
+ hw->fc.current_mode = ngbe_fc_tx_pause;
+ DEBUGOUT("Flow Control = TX PAUSE frames only.\n");
+ } else if ((adv_reg & adv_sym) && (adv_reg & adv_asm) &&
+ !(lp_reg & lp_sym) && (lp_reg & lp_asm)) {
+ hw->fc.current_mode = ngbe_fc_rx_pause;
+ DEBUGOUT("Flow Control = RX PAUSE frames only.\n");
+ } else {
+ hw->fc.current_mode = ngbe_fc_none;
+ DEBUGOUT("Flow Control = NONE.\n");
+ }
+ return 0;
+}
+
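+/*
+ * Resolution summary for the negotiation above (sketch):
+ * - local SYM, partner SYM: full (or rx_pause if only rx was requested)
+ * - local ASM only, partner SYM+ASM: tx_pause
+ * - local SYM+ASM, partner ASM only: rx_pause
+ * - anything else: none
+ */
+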
+/**
+ * ngbe_fc_autoneg_copper - Enable flow control IEEE clause 37
+ * @hw: pointer to hardware structure
+ *
+ * Enable flow control according to IEEE clause 37.
+ **/
+STATIC s32 ngbe_fc_autoneg_copper(struct ngbe_hw *hw)
+{
+ u8 technology_ability_reg = 0;
+ u8 lp_technology_ability_reg = 0;
+
+ TCALL(hw, phy.ops.get_adv_pause, &technology_ability_reg);
+ TCALL(hw, phy.ops.get_lp_adv_pause, &lp_technology_ability_reg);
+
+ return ngbe_negotiate_fc(hw, (u32)technology_ability_reg,
+ (u32)lp_technology_ability_reg,
+ NGBE_TAF_SYM_PAUSE, NGBE_TAF_ASM_PAUSE,
+ NGBE_TAF_SYM_PAUSE, NGBE_TAF_ASM_PAUSE);
+}
+
+/**
+ * ngbe_fc_autoneg - Configure flow control
+ * @hw: pointer to hardware structure
+ *
+ * Compares our advertised flow control capabilities to those advertised by
+ * our link partner, and determines the proper flow control mode to use.
+ **/
+void ngbe_fc_autoneg(struct ngbe_hw *hw)
+{
+ s32 ret_val = NGBE_ERR_FC_NOT_NEGOTIATED;
+ u32 speed;
+ bool link_up;
+
+ DEBUGFUNC("\n");
+
+ /*
+ * AN should have completed when the cable was plugged in.
+ * Look for reasons to bail out. Bail out if:
+ * - FC autoneg is disabled, or if
+ * - link is not up.
+ */
+ if (hw->fc.disable_fc_autoneg) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "Flow control autoneg is disabled");
+ goto out;
+ }
+
+ TCALL(hw, mac.ops.check_link, &speed, &link_up, false);
+ if (!link_up) {
+ ERROR_REPORT1(NGBE_ERROR_SOFTWARE, "The link is down");
+ goto out;
+ }
+
+ switch (hw->phy.media_type) {
+ /* Autoneg flow control on fiber adapters */
+ case ngbe_media_type_fiber:
+ break;
+
+ /* Autoneg flow control on copper adapters */
+ case ngbe_media_type_copper:
+ ret_val = ngbe_fc_autoneg_copper(hw);
+ break;
+
+ default:
+ break;
+ }
+
+out:
+ if (ret_val == NGBE_OK) {
+ hw->fc.fc_was_autonegged = true;
+ } else {
+ hw->fc.fc_was_autonegged = false;
+ hw->fc.current_mode = hw->fc.requested_mode;
+ }
+}
+
+/**
+ * ngbe_disable_pcie_master - Disable PCI-express master access
+ * @hw: pointer to hardware structure
+ *
+ * Disables PCI-Express master access and verifies there are no pending
+ * requests. Returns NGBE_ERR_MASTER_REQUESTS_PENDING if the master disable
+ * bit has not caused the master requests to be disabled, else returns 0,
+ * signifying that master requests are disabled.
+ **/
+s32 ngbe_disable_pcie_master(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+ u32 i;
+
+ DEBUGFUNC("\n");
+
+ /* Always set this bit to ensure any future transactions are blocked */
+ pci_clear_master(((struct ngbe_adapter *)hw->back)->pdev);
+
+ /* Exit if master requests are blocked */
+ if (!(rd32(hw, NGBE_PX_TRANSACTION_PENDING)) ||
+ NGBE_REMOVED(hw->hw_addr))
+ goto out;
+
+ /* Poll for master request bit to clear */
+ for (i = 0; i < NGBE_PCI_MASTER_DISABLE_TIMEOUT; i++) {
+ usec_delay(100);
+ if (!(rd32(hw, NGBE_PX_TRANSACTION_PENDING)))
+ goto out;
+ }
+
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "PCIe transaction pending bit did not clear.\n");
+ status = NGBE_ERR_MASTER_REQUESTS_PENDING;
+
+out:
+ return status;
+}
+
+/**
+ * ngbe_acquire_swfw_sync - Acquire SWFW semaphore
+ * @hw: pointer to hardware structure
+ * @mask: Mask to specify which semaphore to acquire
+ *
+ * Acquires the SWFW semaphore through the GSSR register for the specified
+ * function (CSR, PHY0, PHY1, EEPROM, Flash)
+ **/
+s32 ngbe_acquire_swfw_sync(struct ngbe_hw *hw, u32 mask)
+{
+ u32 gssr = 0;
+ u32 swmask = mask;
+ u32 fwmask = mask << 16;
+ u32 timeout = 200;
+ u32 i;
+
+ for (i = 0; i < timeout; i++) {
+ /*
+ * SW NVM semaphore bit is used for access to all
+ * SW_FW_SYNC bits (not just NVM)
+ */
+ if (ngbe_get_eeprom_semaphore(hw))
+ return NGBE_ERR_SWFW_SYNC;
+
+ if (ngbe_check_mng_access(hw)) {
+ gssr = rd32(hw, NGBE_MNG_SWFW_SYNC);
+ if (!(gssr & (fwmask | swmask))) {
+ gssr |= swmask;
+ wr32(hw, NGBE_MNG_SWFW_SYNC, gssr);
+ ngbe_release_eeprom_semaphore(hw);
+ return 0;
+ } else {
+ /* Resource is currently in use by FW or SW */
+ ngbe_release_eeprom_semaphore(hw);
+ msec_delay(5);
+ }
+ }
+ }
+
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "ngbe_acquire_swfw_sync: i = %u, gssr = %u\n", i, gssr);
+
+ /* If time expired clear the bits holding the lock and retry */
+ if (gssr & (fwmask | swmask))
+ ngbe_release_swfw_sync(hw, gssr & (fwmask | swmask));
+
+ msec_delay(5);
+ return NGBE_ERR_SWFW_SYNC;
+}
+
+/**
+ * ngbe_release_swfw_sync - Release SWFW semaphore
+ * @hw: pointer to hardware structure
+ * @mask: Mask to specify which semaphore to release
+ *
+ * Releases the SWFW semaphore through the GSSR register for the specified
+ * function (CSR, PHY0, PHY1, EEPROM, Flash)
+ **/
+void ngbe_release_swfw_sync(struct ngbe_hw *hw, u32 mask)
+{
+ ngbe_get_eeprom_semaphore(hw);
+ if (ngbe_check_mng_access(hw))
+ wr32m(hw, NGBE_MNG_SWFW_SYNC, mask, 0);
+
+ ngbe_release_eeprom_semaphore(hw);
+}
+
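+/*
+ * Illustrative pairing of the two helpers above (sketch, error handling
+ * elided):
+ *
+ *     if (ngbe_acquire_swfw_sync(hw, NGBE_MNG_SWFW_SYNC_SW_MB) == 0) {
+ *             ... access the resource shared with firmware ...
+ *             ngbe_release_swfw_sync(hw, NGBE_MNG_SWFW_SYNC_SW_MB);
+ *     }
+ */
+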
+/**
+ * ngbe_disable_sec_rx_path - Stops the receive data path
+ * @hw: pointer to hardware structure
+ *
+ * Stops the receive data path and waits for the HW to internally empty
+ * the Rx security block
+ **/
+s32 ngbe_disable_sec_rx_path(struct ngbe_hw *hw)
+{
+#define NGBE_MAX_SECRX_POLL 40
+
+ int i;
+ int secrxreg;
+
+ DEBUGFUNC("\n");
+
+ wr32m(hw, NGBE_RSEC_CTL,
+ NGBE_RSEC_CTL_RX_DIS, NGBE_RSEC_CTL_RX_DIS);
+ for (i = 0; i < NGBE_MAX_SECRX_POLL; i++) {
+ secrxreg = rd32(hw, NGBE_RSEC_ST);
+ if (secrxreg & NGBE_RSEC_ST_RSEC_RDY)
+ break;
+ else
+ /* Use interrupt-safe sleep just in case */
+ usec_delay(1000);
+ }
+
+ /* For informational purposes only */
+ if (i >= NGBE_MAX_SECRX_POLL)
+ DEBUGOUT("Rx unit being enabled before security "
+ "path fully disabled. Continuing with init.\n");
+
+ return 0;
+}
+
+/**
+ * ngbe_enable_sec_rx_path - Enables the receive data path
+ * @hw: pointer to hardware structure
+ *
+ * Enables the receive data path.
+ **/
+s32 ngbe_enable_sec_rx_path(struct ngbe_hw *hw)
+{
+ DEBUGFUNC("\n");
+
+ wr32m(hw, NGBE_RSEC_CTL,
+ NGBE_RSEC_CTL_RX_DIS, 0);
+ NGBE_WRITE_FLUSH(hw);
+
+ return 0;
+}
+
+/**
+ * ngbe_insert_mac_addr - Find a RAR for this mac address
+ * @hw: pointer to hardware structure
+ * @addr: Address to put into receive address register
+ * @vmdq: VMDq pool to assign
+ *
+ * Puts an ethernet address into a receive address register, or
+ * finds the RAR that it is already in; adds to the pool list
+ **/
+s32 ngbe_insert_mac_addr(struct ngbe_hw *hw, u8 *addr, u32 vmdq)
+{
+ static const u32 NO_EMPTY_RAR_FOUND = 0xFFFFFFFF;
+ u32 first_empty_rar = NO_EMPTY_RAR_FOUND;
+ u32 rar;
+ u32 rar_low, rar_high;
+ u32 addr_low, addr_high;
+
+ DEBUGFUNC("\n");
+
+ /* swap bytes for HW little endian */
+ addr_low = addr[5] | (addr[4] << 8)
+ | (addr[3] << 16)
+ | (addr[2] << 24);
+ addr_high = addr[1] | (addr[0] << 8);
+
+ /*
+ * Either find the mac_id in rar or find the first empty space.
+ * rar_highwater points to just after the highest currently used
+ * rar in order to shorten the search. It grows when we add a new
+ * rar to the top.
+ */
+ for (rar = 0; rar < hw->mac.rar_highwater; rar++) {
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, rar);
+ rar_high = rd32(hw, NGBE_PSR_MAC_SWC_AD_H);
+
+ if (((NGBE_PSR_MAC_SWC_AD_H_AV & rar_high) == 0)
+ && first_empty_rar == NO_EMPTY_RAR_FOUND) {
+ first_empty_rar = rar;
+ } else if ((rar_high & 0xFFFF) == addr_high) {
+ rar_low = rd32(hw, NGBE_PSR_MAC_SWC_AD_L);
+ if (rar_low == addr_low)
+ break; /* found it already in the rars */
+ }
+ }
+
+ if (rar < hw->mac.rar_highwater) {
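+		/* already present in the RARs; nothing more to do */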
+
+ } else if (first_empty_rar != NO_EMPTY_RAR_FOUND) {
+ /* stick it into first empty RAR slot we found */
+ rar = first_empty_rar;
+ TCALL(hw, mac.ops.set_rar, rar, addr, vmdq,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+ } else if (rar == hw->mac.rar_highwater) {
+ /* add it to the top of the list and inc the highwater mark */
+ TCALL(hw, mac.ops.set_rar, rar, addr, vmdq,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+ hw->mac.rar_highwater++;
+ } else if (rar >= hw->mac.num_rar_entries) {
+ return NGBE_ERR_INVALID_MAC_ADDR;
+ }
+
+ return rar;
+}
+
+/**
+ * ngbe_clear_vmdq - Disassociate a VMDq pool index from a rx address
+ * @hw: pointer to hardware struct
+ * @rar: receive address register index to disassociate
+ * @vmdq: VMDq pool index to remove from the rar
+ **/
+s32 ngbe_clear_vmdq(struct ngbe_hw *hw, u32 rar, u32 vmdq)
+{
+ u32 mpsar_lo;
+ u32 rar_entries = hw->mac.num_rar_entries;
+
+ DEBUGFUNC("\n");
+ UNREFERENCED_PARAMETER(vmdq);
+
+ /* Make sure we are using a valid rar index range */
+ if (rar >= rar_entries) {
+ ERROR_REPORT2(NGBE_ERROR_ARGUMENT,
+ "RAR index %d is out of range.\n", rar);
+ return NGBE_ERR_INVALID_ARGUMENT;
+ }
+
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, rar);
+ mpsar_lo = rd32(hw, NGBE_PSR_MAC_SWC_VM);
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ goto done;
+
+ if (!mpsar_lo)
+ goto done;
+
+ /* was that the last pool using this rar? */
+ if (mpsar_lo == 0 && rar != 0)
+ TCALL(hw, mac.ops.clear_rar, rar);
+done:
+ return 0;
+}
+
+/**
+ * ngbe_set_vmdq - Associate a VMDq pool index with a rx address
+ * @hw: pointer to hardware struct
+ * @rar: receive address register index to associate with a VMDq index
+ * @pool: VMDq pool index
+ **/
+s32 ngbe_set_vmdq(struct ngbe_hw *hw, u32 rar, u32 pool)
+{
+ u32 rar_entries = hw->mac.num_rar_entries;
+
+ DEBUGFUNC("\n");
+ UNREFERENCED_PARAMETER(pool);
+
+ /* Make sure we are using a valid rar index range */
+ if (rar >= rar_entries) {
+ ERROR_REPORT2(NGBE_ERROR_ARGUMENT,
+ "RAR index %d is out of range.\n", rar);
+ return NGBE_ERR_INVALID_ARGUMENT;
+ }
+
+ return 0;
+}
+
+/**
+ * ngbe_set_vmdq_san_mac - Associate default VMDq pool index with a rx address
+ * @hw: pointer to hardware struct
+ * @vmdq: VMDq pool index
+ *
+ * This function should only be invoked in the IOV mode. In IOV mode, the
+ * default pool is the next pool after the number of VFs advertised, not 0.
+ * The MPSAR table needs to be updated for the SAN_MAC RAR
+ * [hw->mac.san_mac_rar_index].
+ **/
+s32 ngbe_set_vmdq_san_mac(struct ngbe_hw *hw, u32 vmdq)
+{
+ u32 rar = hw->mac.san_mac_rar_index;
+
+ DEBUGFUNC("\n");
+	/* the VM pool bitmap register is only 32 bits wide */
+ if (vmdq > 32)
+ return -1;
+
+ wr32(hw, NGBE_PSR_MAC_SWC_IDX, rar);
+ wr32(hw, NGBE_PSR_MAC_SWC_VM, 1 << vmdq);
+
+ return 0;
+}
+
+/**
+ * ngbe_init_uta_tables - Initialize the Unicast Table Array
+ * @hw: pointer to hardware structure
+ **/
+s32 ngbe_init_uta_tables(struct ngbe_hw *hw)
+{
+ int i;
+
+ DEBUGFUNC("\n");
+ DEBUGOUT(" Clearing UTA\n");
+
+ for (i = 0; i < 128; i++)
+ wr32(hw, NGBE_PSR_UC_TBL(i), 0);
+
+ return 0;
+}
+
+/**
+ * ngbe_find_vlvf_slot - find the vlanid or the first empty slot
+ * @hw: pointer to hardware structure
+ * @vlan: VLAN id to write to VLAN filter
+ *
+ * return the VLVF index where this VLAN id should be placed
+ *
+ **/
+s32 ngbe_find_vlvf_slot(struct ngbe_hw *hw, u32 vlan)
+{
+ u32 bits = 0;
+ u32 first_empty_slot = 0;
+ s32 regindex;
+
+ /* short cut the special case */
+ if (vlan == 0)
+ return 0;
+
+ /*
+ * Search for the vlan id in the VLVF entries. Save off the first empty
+ * slot found along the way
+ */
+ for (regindex = 1; regindex < NGBE_PSR_VLAN_SWC_ENTRIES; regindex++) {
+ wr32(hw, NGBE_PSR_VLAN_SWC_IDX, regindex);
+ bits = rd32(hw, NGBE_PSR_VLAN_SWC);
+ if (!bits && !(first_empty_slot))
+ first_empty_slot = regindex;
+ else if ((bits & 0x0FFF) == vlan)
+ break;
+ }
+
+ /*
+	 * If regindex is less than NGBE_PSR_VLAN_SWC_ENTRIES, then we found the vlan
+ * in the VLVF. Else use the first empty VLVF register for this
+ * vlan id.
+ */
+ if (regindex >= NGBE_PSR_VLAN_SWC_ENTRIES) {
+ if (first_empty_slot)
+ regindex = first_empty_slot;
+ else {
+ ERROR_REPORT1(NGBE_ERROR_SOFTWARE,
+ "No space in VLVF.\n");
+ regindex = NGBE_ERR_NO_SPACE;
+ }
+ }
+
+ return regindex;
+}
+
+/**
+ * ngbe_set_vfta - Set VLAN filter table
+ * @hw: pointer to hardware structure
+ * @vlan: VLAN id to write to VLAN filter
+ * @vind: VMDq output index that maps queue to VLAN id in VLVFB
+ * @vlan_on: boolean flag to turn on/off VLAN in VLVF
+ *
+ * Turn on/off specified VLAN in the VLAN filter table.
+ **/
+s32 ngbe_set_vfta(struct ngbe_hw *hw, u32 vlan, u32 vind,
+ bool vlan_on)
+{
+ s32 regindex;
+ u32 bitindex;
+ u32 vfta;
+ u32 targetbit;
+ s32 ret_val = 0;
+ bool vfta_changed = false;
+
+ DEBUGFUNC("\n");
+
+ if (vlan > 4095)
+ return NGBE_ERR_PARAM;
+
+ /*
+ * this is a 2 part operation - first the VFTA, then the
+ * VLVF and VLVFB if VT Mode is set
+ * We don't write the VFTA until we know the VLVF part succeeded.
+ */
+
+ /* Part 1
+ * The VFTA is a bitstring made up of 128 32-bit registers
+ * that enable the particular VLAN id, much like the MTA:
+ * bits[11-5]: which register
+ * bits[4-0]: which bit in the register
+ */
+ regindex = (vlan >> 5) & 0x7F;
+ bitindex = vlan & 0x1F;
+ targetbit = (1 << bitindex);
+ /* errata 5 */
+ vfta = hw->mac.vft_shadow[regindex];
+ if (vlan_on) {
+ if (!(vfta & targetbit)) {
+ vfta |= targetbit;
+ vfta_changed = true;
+ }
+ } else {
+ if ((vfta & targetbit)) {
+ vfta &= ~targetbit;
+ vfta_changed = true;
+ }
+ }
+
+ /* Part 2
+ * Call ngbe_set_vlvf to set VLVFB and VLVF
+ */
+ ret_val = ngbe_set_vlvf(hw, vlan, vind, vlan_on,
+ &vfta_changed);
+ if (ret_val != 0)
+ return ret_val;
+
+ if (vfta_changed)
+ wr32(hw, NGBE_PSR_VLAN_TBL(regindex), vfta);
+ /* errata 5 */
+ hw->mac.vft_shadow[regindex] = vfta;
+ return 0;
+}
+
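+/*
+ * Worked example (sketch): vlan 100 selects VFTA register
+ * (100 >> 5) & 0x7F = 3 and bit 100 & 0x1F = 4 within that register.
+ */
+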
+/**
+ * ngbe_set_vlvf - Set VLAN Pool Filter
+ * @hw: pointer to hardware structure
+ * @vlan: VLAN id to write to VLAN filter
+ * @vind: VMDq output index that maps queue to VLAN id in VLVFB
+ * @vlan_on: boolean flag to turn on/off VLAN in VLVF
+ * @vfta_changed: pointer to boolean flag which indicates whether VFTA
+ * should be changed
+ *
+ * Turn on/off specified bit in VLVF table.
+ **/
+s32 ngbe_set_vlvf(struct ngbe_hw *hw, u32 vlan, u32 vind,
+ bool vlan_on, bool *vfta_changed)
+{
+ u32 vt;
+
+ DEBUGFUNC("\n");
+
+ if (vlan > 4095)
+ return NGBE_ERR_PARAM;
+
+ /* If VT Mode is set
+ * Either vlan_on
+ * make sure the vlan is in VLVF
+ * set the vind bit in the matching VLVFB
+ * Or !vlan_on
+ * clear the pool bit and possibly the vind
+ */
+ vt = rd32(hw, NGBE_CFG_PORT_CTL);
+ if (vt & NGBE_CFG_PORT_CTL_NUM_VT_MASK) {
+ s32 vlvf_index;
+ u32 bits = 0;
+
+ vlvf_index = ngbe_find_vlvf_slot(hw, vlan);
+ if (vlvf_index < 0)
+ return vlvf_index;
+
+ wr32(hw, NGBE_PSR_VLAN_SWC_IDX, vlvf_index);
+ if (vlan_on) {
+ /* set the pool bit */
+ if (vind < 32) {
+ bits = rd32(hw,
+ NGBE_PSR_VLAN_SWC_VM_L);
+ bits |= (1 << vind);
+ wr32(hw,
+ NGBE_PSR_VLAN_SWC_VM_L,
+ bits);
+ }
+ } else {
+ /* clear the pool bit */
+ if (vind < 32) {
+ bits = rd32(hw,
+ NGBE_PSR_VLAN_SWC_VM_L);
+ bits &= ~(1 << vind);
+ wr32(hw,
+ NGBE_PSR_VLAN_SWC_VM_L,
+ bits);
+ } else {
+				bits = rd32(hw,
+ NGBE_PSR_VLAN_SWC_VM_L);
+ }
+ }
+
+ /*
+ * If there are still bits set in the VLVFB registers
+ * for the VLAN ID indicated we need to see if the
+ * caller is requesting that we clear the VFTA entry bit.
+ * If the caller has requested that we clear the VFTA
+ * entry bit but there are still pools/VFs using this VLAN
+ * ID entry then ignore the request. We're not worried
+ * about the case where we're turning the VFTA VLAN ID
+ * entry bit on, only when requested to turn it off as
+ * there may be multiple pools and/or VFs using the
+ * VLAN ID entry. In that case we cannot clear the
+ * VFTA bit until all pools/VFs using that VLAN ID have also
+ * been cleared. This will be indicated by "bits" being
+ * zero.
+ */
+ if (bits) {
+ wr32(hw, NGBE_PSR_VLAN_SWC,
+ (NGBE_PSR_VLAN_SWC_VIEN | vlan));
+ if ((!vlan_on) && (vfta_changed != NULL)) {
+ /* someone wants to clear the vfta entry
+ * but some pools/VFs are still using it.
+ * Ignore it. */
+ *vfta_changed = false;
+ }
+		} else {
+			wr32(hw, NGBE_PSR_VLAN_SWC, 0);
+		}
+ }
+
+ return 0;
+}
+
+/**
+ * ngbe_clear_vfta - Clear VLAN filter table
+ * @hw: pointer to hardware structure
+ *
+ * Clears the VLAN filter table, and the VMDq index associated with the filter
+ **/
+s32 ngbe_clear_vfta(struct ngbe_hw *hw)
+{
+ u32 offset;
+
+ DEBUGFUNC("\n");
+
+ for (offset = 0; offset < hw->mac.vft_size; offset++) {
+ wr32(hw, NGBE_PSR_VLAN_TBL(offset), 0);
+ /* errata 5 */
+ hw->mac.vft_shadow[offset] = 0;
+ }
+
+ for (offset = 0; offset < NGBE_PSR_VLAN_SWC_ENTRIES; offset++) {
+ wr32(hw, NGBE_PSR_VLAN_SWC_IDX, offset);
+ wr32(hw, NGBE_PSR_VLAN_SWC, 0);
+ wr32(hw, NGBE_PSR_VLAN_SWC_VM_L, 0);
+ }
+
+ return 0;
+}
+
+/**
+ * ngbe_set_mac_anti_spoofing - Enable/Disable MAC anti-spoofing
+ * @hw: pointer to hardware structure
+ * @enable: enable or disable switch for anti-spoofing
+ * @pf: Physical Function pool - do not enable anti-spoofing for the PF
+ *
+ **/
+void ngbe_set_mac_anti_spoofing(struct ngbe_hw *hw, bool enable, int pf)
+{
+ u64 pfvfspoof = 0;
+
+ DEBUGFUNC("\n");
+
+ if (enable) {
+ /*
+ * The PF should be allowed to spoof so that it can support
+		 * emulation mode NICs. Do not set the bits assigned to the PF.
+ * Remaining pools belong to the PF so they do not need to have
+ * anti-spoofing enabled.
+ */
+ pfvfspoof = (1 << pf) - 1;
+ wr32(hw, NGBE_TDM_MAC_AS_L,
+ pfvfspoof & 0xff);
+ } else {
+ wr32(hw, NGBE_TDM_MAC_AS_L, 0);
+ }
+}
+
+/**
+ * ngbe_set_vlan_anti_spoofing - Enable/Disable VLAN anti-spoofing
+ * @hw: pointer to hardware structure
+ * @enable: enable or disable switch for VLAN anti-spoofing
+ * @vf: Virtual Function pool - VF Pool to set for VLAN anti-spoofing
+ *
+ **/
+void ngbe_set_vlan_anti_spoofing(struct ngbe_hw *hw, bool enable, int vf)
+{
+ u32 pfvfspoof;
+
+ DEBUGFUNC("\n");
+
+ if (vf > 8)
+ return;
+
+ pfvfspoof = rd32(hw, NGBE_TDM_VLAN_AS_L);
+ if (enable)
+ pfvfspoof |= (1 << vf);
+ else
+ pfvfspoof &= ~(1 << vf);
+ wr32(hw, NGBE_TDM_VLAN_AS_L, pfvfspoof);
+}
+
+/**
+ * ngbe_set_ethertype_anti_spoofing - Enable/Disable Ethertype anti-spoofing
+ * @hw: pointer to hardware structure
+ * @enable: enable or disable switch for Ethertype anti-spoofing
+ * @vf: Virtual Function pool - VF Pool to set for Ethertype anti-spoofing
+ *
+ **/
+void ngbe_set_ethertype_anti_spoofing(struct ngbe_hw *hw,
+ bool enable, int vf)
+{
+ u32 pfvfspoof;
+
+ DEBUGFUNC("\n");
+
+ if (vf <= 8) {
+ pfvfspoof = rd32(hw, NGBE_TDM_ETYPE_AS_L);
+ if (enable)
+ pfvfspoof |= (1 << vf);
+ else
+ pfvfspoof &= ~(1 << vf);
+ wr32(hw, NGBE_TDM_ETYPE_AS_L, pfvfspoof);
+ }
+}
+
+/**
+ * ngbe_get_device_caps - Get additional device capabilities
+ * @hw: pointer to hardware structure
+ * @device_caps: the EEPROM word with the extra device capabilities
+ *
+ * This function will read the EEPROM location for the device capabilities,
+ * and return the word through device_caps.
+ **/
+s32 ngbe_get_device_caps(struct ngbe_hw *hw, u16 *device_caps)
+{
+ DEBUGFUNC("\n");
+
+ TCALL(hw, eeprom.ops.read,
+ hw->eeprom.sw_region_offset + NGBE_DEVICE_CAPS, device_caps);
+
+ return 0;
+}
+
+/**
+ * ngbe_calculate_checksum - Calculate checksum for buffer
+ * @buffer: pointer to EEPROM
+ * @length: size of EEPROM to calculate a checksum for
+ *
+ * Calculates the checksum of a buffer over the specified length. The
+ * checksum calculated is returned.
+ **/
+u8 ngbe_calculate_checksum(u8 *buffer, u32 length)
+{
+ u32 i;
+ u8 sum = 0;
+
+ DEBUGFUNC("\n");
+
+ if (!buffer)
+ return 0;
+
+ for (i = 0; i < length; i++)
+ sum += buffer[i];
+
+ return (u8) (0 - sum);
+}
+
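+/*
+ * The result is the two's complement of the byte sum, so a message whose
+ * bytes (checksum field included) sum to zero mod 256 verifies clean.
+ * Sketch: for bytes { 0x10, 0x20 } the checksum is 0xd0, and
+ * 0x10 + 0x20 + 0xd0 == 0x100 == 0 (mod 256).
+ */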
+
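+/**
+ * ngbe_host_interface_pass_command - Post command to manageability block
+ * @hw: pointer to the HW structure
+ * @buffer: contains the command to write, must be DWORD aligned
+ * @length: length of buffer in bytes, must be a multiple of 4
+ * @timeout: unused, kept for symmetry with ngbe_host_interface_command
+ * @return_data: unused, the command is posted without reading a reply
+ *
+ * Writes the command block into the mailbox RAM and raises SWRDY without
+ * waiting for the firmware to process it.
+ **/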
+s32 ngbe_host_interface_pass_command(struct ngbe_hw *hw, u32 *buffer,
+ u32 length, u32 timeout, bool return_data)
+{
+ u32 i;
+ u32 dword_len;
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (length == 0 || length > NGBE_HI_MAX_BLOCK_BYTE_LENGTH) {
+ DEBUGOUT1("Buffer length failure buffersize=%d.\n", length);
+ return NGBE_ERR_HOST_INTERFACE_COMMAND;
+ }
+
+ if (TCALL(hw, mac.ops.acquire_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_MB)
+ != 0) {
+ return NGBE_ERR_SWFW_SYNC;
+ }
+
+ /* Calculate length in DWORDs. We must be DWORD aligned */
+ if ((length % (sizeof(u32))) != 0) {
+ DEBUGOUT("Buffer length failure, not aligned to dword");
+ status = NGBE_ERR_INVALID_ARGUMENT;
+ goto rel_out;
+ }
+
+ dword_len = length >> 2;
+
+ /* The device driver writes the relevant command block
+ * into the ram area.
+ */
+ for (i = 0; i < dword_len; i++) {
+ if (ngbe_check_mng_access(hw))
+ wr32a(hw, NGBE_MNG_MBOX,
+ i, NGBE_CPU_TO_LE32(buffer[i]));
+ else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto rel_out;
+ }
+ }
+ /* Setting this bit tells the ARC that a new command is pending. */
+ if (ngbe_check_mng_access(hw))
+ wr32m(hw, NGBE_MNG_MBOX_CTL,
+ NGBE_MNG_MBOX_CTL_SWRDY, NGBE_MNG_MBOX_CTL_SWRDY);
+ else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto rel_out;
+ }
+
+rel_out:
+ TCALL(hw, mac.ops.release_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_MB);
+ return status;
+}
+
+/**
+ * ngbe_host_interface_command - Issue command to manageability block
+ * @hw: pointer to the HW structure
+ * @buffer: contains the command to write and where the return status will
+ * be placed
+ * @length: length of buffer, must be multiple of 4 bytes
+ * @timeout: time in ms to wait for command completion
+ * @return_data: read and return data from the buffer (true) or not (false)
+ * Needed because FW structures are big endian and decoding of
+ * these fields can be 8 bit or 16 bit based on command. Decoding
+ * is not easily understood without making a table of commands.
+ * So we will leave this up to the caller to read back the data
+ * in these cases.
+ *
+ * Communicates with the manageability block. On success return 0
+ * else return NGBE_ERR_HOST_INTERFACE_COMMAND.
+ **/
+s32 ngbe_host_interface_command(struct ngbe_hw *hw, u32 *buffer,
+ u32 length, u32 timeout, bool return_data)
+{
+ u32 hicr, i, bi;
+ u32 hdr_size = sizeof(struct ngbe_hic_hdr);
+ u16 buf_len;
+ u32 dword_len;
+ s32 status = 0;
+ u32 buf[64] = {};
+
+ DEBUGFUNC("\n");
+
+ if (length == 0 || length > NGBE_HI_MAX_BLOCK_BYTE_LENGTH) {
+ DEBUGOUT1("Buffer length failure buffersize=%d.\n", length);
+ return NGBE_ERR_HOST_INTERFACE_COMMAND;
+ }
+
+ if (TCALL(hw, mac.ops.acquire_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_MB)
+ != 0) {
+ return NGBE_ERR_SWFW_SYNC;
+ }
+
+ /* Calculate length in DWORDs. We must be DWORD aligned */
+ if ((length % (sizeof(u32))) != 0) {
+ DEBUGOUT("Buffer length failure, not aligned to dword");
+ status = NGBE_ERR_INVALID_ARGUMENT;
+ goto rel_out;
+ }
+
+	/* read to clean all status */
+ if (ngbe_check_mng_access(hw)) {
+ hicr = rd32(hw, NGBE_MNG_MBOX_CTL);
+ if ((hicr & NGBE_MNG_MBOX_CTL_FWRDY))
+ ERROR_REPORT1(NGBE_ERROR_CAUTION,
+ "fwrdy is set before command.\n");
+ }
+
+ dword_len = length >> 2;
+
+ /* The device driver writes the relevant command block
+ * into the ram area.
+ */
+ for (i = 0; i < dword_len; i++) {
+ if (ngbe_check_mng_access(hw))
+ wr32a(hw, NGBE_MNG_MBOX,
+ i, NGBE_CPU_TO_LE32(buffer[i]));
+ else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto rel_out;
+ }
+ }
+ /* Setting this bit tells the ARC that a new command is pending. */
+ if (ngbe_check_mng_access(hw))
+ wr32m(hw, NGBE_MNG_MBOX_CTL,
+ NGBE_MNG_MBOX_CTL_SWRDY, NGBE_MNG_MBOX_CTL_SWRDY);
+ else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto rel_out;
+ }
+
+ for (i = 0; i < timeout; i++) {
+ if (ngbe_check_mng_access(hw)) {
+ hicr = rd32(hw, NGBE_MNG_MBOX_CTL);
+ if ((hicr & NGBE_MNG_MBOX_CTL_FWRDY))
+ break;
+ }
+ msec_delay(1);
+ }
+
+ buf[0] = rd32(hw, NGBE_MNG_MBOX);
+ /* Check command completion */
+ if (timeout != 0 && i == timeout) {
+ ERROR_REPORT1(NGBE_ERROR_CAUTION,
+ "Command has failed with no status valid.\n");
+ printk("===%x= %x=\n", buffer[0] & 0xff, (~buf[0] >> 24));
+ printk("===%08x\n", rd32(hw, 0x1e100));
+ printk("===%08x\n", rd32(hw, 0x1e104));
+ printk("===%08x\n", rd32(hw, 0x1e108));
+ printk("===%08x\n", rd32(hw, 0x1e10c));
+ printk("===%08x\n", rd32(hw, 0x1e044));
+ printk("===%08x\n", rd32(hw, 0x10000));
+ if ((buffer[0] & 0xff) != (~buf[0] >> 24)) {
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+ goto rel_out;
+ }
+ }
+
+ if (!return_data)
+ goto rel_out;
+
+ /* Calculate length in DWORDs */
+ dword_len = hdr_size >> 2;
+
+ /* first pull in the header so we know the buffer length */
+ for (bi = 0; bi < dword_len; bi++) {
+ if (ngbe_check_mng_access(hw)) {
+ buffer[bi] = rd32a(hw, NGBE_MNG_MBOX,
+ bi);
+ NGBE_LE32_TO_CPUS(&buffer[bi]);
+ } else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto rel_out;
+ }
+ }
+
+	/* If there is anything in the data position, pull it in */
+ buf_len = ((struct ngbe_hic_hdr *)buffer)->buf_len;
+ if (buf_len == 0)
+ goto rel_out;
+
+ if (length < buf_len + hdr_size) {
+ DEBUGOUT("Buffer not large enough for reply message.\n");
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+ goto rel_out;
+ }
+
+	/* Calculate length in DWORDs, adding 3 to round up odd lengths */
+ dword_len = (buf_len + 3) >> 2;
+
+ /* Pull in the rest of the buffer (bi is where we left off) */
+ for (; bi <= dword_len; bi++) {
+ if (ngbe_check_mng_access(hw)) {
+ buffer[bi] = rd32a(hw, NGBE_MNG_MBOX,
+ bi);
+ NGBE_LE32_TO_CPUS(&buffer[bi]);
+ } else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto rel_out;
+ }
+ }
+
+rel_out:
+ TCALL(hw, mac.ops.release_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_MB);
+ return status;
+}
+
+/**
+ * ngbe_set_fw_drv_ver - Sends driver version to firmware
+ * @hw: pointer to the HW structure
+ * @maj: driver version major number
+ * @min: driver version minor number
+ * @build: driver version build number
+ * @sub: driver version sub build number
+ *
+ * Sends the driver version number to firmware through the manageability
+ * block. Returns 0 on success, NGBE_ERR_SWFW_SYNC when encountering an
+ * error acquiring the semaphore, or NGBE_ERR_HOST_INTERFACE_COMMAND when
+ * the command fails.
+ **/
+s32 ngbe_set_fw_drv_ver(struct ngbe_hw *hw, u8 maj, u8 min,
+ u8 build, u8 sub)
+{
+ struct ngbe_hic_drv_info fw_cmd;
+ int i;
+ s32 ret_val = 0;
+
+ DEBUGFUNC("\n");
+
+ fw_cmd.hdr.cmd = FW_CEM_CMD_DRIVER_INFO;
+ fw_cmd.hdr.buf_len = FW_CEM_CMD_DRIVER_INFO_LEN;
+ fw_cmd.hdr.cmd_or_resp.cmd_resv = FW_CEM_CMD_RESERVED;
+ fw_cmd.port_num = (u8)hw->bus.func;
+ fw_cmd.ver_maj = maj;
+ fw_cmd.ver_min = min;
+ fw_cmd.ver_build = build;
+ fw_cmd.ver_sub = sub;
+ fw_cmd.hdr.checksum = 0;
+ fw_cmd.hdr.checksum = ngbe_calculate_checksum((u8 *)&fw_cmd,
+ (FW_CEM_HDR_LEN + fw_cmd.hdr.buf_len));
+ fw_cmd.pad = 0;
+ fw_cmd.pad2 = 0;
+
+ for (i = 0; i <= FW_CEM_MAX_RETRIES; i++) {
+ ret_val = ngbe_host_interface_command(hw, (u32 *)&fw_cmd,
+ sizeof(fw_cmd),
+ NGBE_HI_COMMAND_TIMEOUT,
+ true);
+ if (ret_val != 0)
+ continue;
+
+ if (fw_cmd.hdr.cmd_or_resp.ret_status ==
+ FW_CEM_RESP_STATUS_SUCCESS)
+ ret_val = 0;
+ else
+ ret_val = NGBE_ERR_HOST_INTERFACE_COMMAND;
+
+ break;
+ }
+
+ return ret_val;
+}
+
+/**
+ * ngbe_reset_hostif - send reset cmd to fw
+ * @hw: pointer to hardware structure
+ *
+ * Sends a reset cmd to firmware through the manageability
+ * block. Returns 0 on success, NGBE_ERR_SWFW_SYNC when encountering an
+ * error acquiring the semaphore, or NGBE_ERR_HOST_INTERFACE_COMMAND when
+ * the command fails.
+ **/
+s32 ngbe_reset_hostif(struct ngbe_hw *hw)
+{
+ struct ngbe_hic_reset reset_cmd;
+ int i;
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ reset_cmd.hdr.cmd = FW_RESET_CMD;
+ reset_cmd.hdr.buf_len = FW_RESET_LEN;
+ reset_cmd.hdr.cmd_or_resp.cmd_resv = FW_CEM_CMD_RESERVED;
+ reset_cmd.lan_id = hw->bus.lan_id;
+ reset_cmd.reset_type = (u16)hw->reset_type;
+ reset_cmd.hdr.checksum = 0;
+ reset_cmd.hdr.checksum = ngbe_calculate_checksum((u8 *)&reset_cmd,
+ (FW_CEM_HDR_LEN + reset_cmd.hdr.buf_len));
+
+ /* send reset request to FW and wait for response */
+ for (i = 0; i <= FW_CEM_MAX_RETRIES; i++) {
+ status = ngbe_host_interface_command(hw, (u32 *)&reset_cmd,
+ sizeof(reset_cmd),
+ NGBE_HI_COMMAND_TIMEOUT,
+ true);
+ msleep(1);
+ if (status != 0)
+ continue;
+
+ if (reset_cmd.hdr.cmd_or_resp.ret_status ==
+ FW_CEM_RESP_STATUS_SUCCESS)
+ status = 0;
+ else
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+
+ break;
+ }
+
+ return status;
+}
+
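+/**
+ * ngbe_setup_mac_link_hostif - Set link speed through firmware
+ * @hw: pointer to hardware structure
+ * @speed: requested speed, passed through in the FW_SETUP_MAC_LINK_CMD
+ *
+ * Sends the setup-link cmd to firmware through the manageability block,
+ * retrying up to FW_CEM_MAX_RETRIES times.
+ **/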
+s32 ngbe_setup_mac_link_hostif(struct ngbe_hw *hw, u32 speed)
+{
+ struct ngbe_hic_phy_cfg cmd;
+ int i;
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ cmd.hdr.cmd = FW_SETUP_MAC_LINK_CMD;
+ cmd.hdr.buf_len = FW_SETUP_MAC_LINK_LEN;
+ cmd.hdr.cmd_or_resp.cmd_resv = FW_CEM_CMD_RESERVED;
+ cmd.lan_id = hw->bus.lan_id;
+ cmd.phy_mode = 0;
+ cmd.phy_speed = (u16)speed;
+ cmd.hdr.checksum = 0;
+ cmd.hdr.checksum = ngbe_calculate_checksum((u8 *)&cmd,
+ (FW_CEM_HDR_LEN + cmd.hdr.buf_len));
+
+ for (i = 0; i <= FW_CEM_MAX_RETRIES; i++) {
+ status = ngbe_host_interface_command(hw, (u32 *)&cmd,
+ sizeof(cmd),
+ NGBE_HI_COMMAND_TIMEOUT,
+ true);
+ if (status != 0)
+ continue;
+
+ if (cmd.hdr.cmd_or_resp.ret_status ==
+ FW_CEM_RESP_STATUS_SUCCESS)
+ status = 0;
+ else
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+
+ break;
+ }
+
+ return status;
+}
+
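+/*
+ * Bitwise CRC-16 with polynomial 0x1021 and zero initial value (the
+ * XMODEM flavour of CRC-CCITT), MSB first. Sanity check (sketch):
+ * ngbe_crc16_ccitt("123456789", 9) == 0x31c3.
+ */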
+u16 ngbe_crc16_ccitt(const u8 *buf, int size)
+{
+ u16 crc = 0;
+ int i;
+ while (--size >= 0) {
+ crc ^= (u16)*buf++ << 8;
+ for (i = 0; i < 8; i++) {
+ if (crc & 0x8000)
+ crc = crc << 1 ^ 0x1021;
+ else
+ crc <<= 1;
+ }
+ }
+ return crc;
+}
+
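+/**
+ * ngbe_upgrade_flash_hostif - Upgrade a flash region through firmware
+ * @hw: pointer to hardware structure
+ * @region: module id (NGBE_MODULE_EEPROM/FIRMWARE/HARDWARE)
+ * @data: image to program
+ * @size: image size in bytes
+ *
+ * Drives the start/write/verify host interface command sequence, sending
+ * the image in chunks of up to 248 bytes.
+ **/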
+s32 ngbe_upgrade_flash_hostif(struct ngbe_hw *hw, u32 region,
+ const u8 *data, u32 size)
+{
+ struct ngbe_hic_upg_start start_cmd;
+ struct ngbe_hic_upg_write write_cmd;
+ struct ngbe_hic_upg_verify verify_cmd;
+ u32 offset;
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ start_cmd.hdr.cmd = FW_FLASH_UPGRADE_START_CMD;
+ start_cmd.hdr.buf_len = FW_FLASH_UPGRADE_START_LEN;
+ start_cmd.hdr.cmd_or_resp.cmd_resv = FW_CEM_CMD_RESERVED;
+ start_cmd.module_id = (u8)region;
+ start_cmd.hdr.checksum = 0;
+ start_cmd.hdr.checksum = ngbe_calculate_checksum((u8 *)&start_cmd,
+ (FW_CEM_HDR_LEN + start_cmd.hdr.buf_len));
+ start_cmd.pad2 = 0;
+ start_cmd.pad3 = 0;
+
+ status = ngbe_host_interface_command(hw, (u32 *)&start_cmd,
+ sizeof(start_cmd),
+ NGBE_HI_FLASH_ERASE_TIMEOUT,
+ true);
+
+ if (start_cmd.hdr.cmd_or_resp.ret_status == FW_CEM_RESP_STATUS_SUCCESS)
+ status = 0;
+ else {
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+ return status;
+ }
+
+ for (offset = 0; offset < size;) {
+ write_cmd.hdr.cmd = FW_FLASH_UPGRADE_WRITE_CMD;
+ if (size - offset > 248) {
+ write_cmd.data_len = 248 / 4;
+ write_cmd.eof_flag = 0;
+ } else {
+ write_cmd.data_len = (u8)((size - offset) / 4);
+ write_cmd.eof_flag = 1;
+ }
+ memcpy((u8 *)write_cmd.data, &data[offset], write_cmd.data_len * 4);
+ write_cmd.hdr.buf_len = (write_cmd.data_len + 1) * 4;
+ write_cmd.hdr.cmd_or_resp.cmd_resv = FW_CEM_CMD_RESERVED;
+ write_cmd.check_sum = ngbe_crc16_ccitt((u8 *)write_cmd.data,
+ write_cmd.data_len * 4);
+
+ status = ngbe_host_interface_command(hw, (u32 *)&write_cmd,
+ sizeof(write_cmd),
+ NGBE_HI_FLASH_UPDATE_TIMEOUT,
+ true);
+		if (write_cmd.hdr.cmd_or_resp.ret_status ==
+ FW_CEM_RESP_STATUS_SUCCESS)
+ status = 0;
+ else {
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+ return status;
+ }
+ offset += write_cmd.data_len * 4;
+ }
+
+ verify_cmd.hdr.cmd = FW_FLASH_UPGRADE_VERIFY_CMD;
+ verify_cmd.hdr.buf_len = FW_FLASH_UPGRADE_VERIFY_LEN;
+ verify_cmd.hdr.cmd_or_resp.cmd_resv = FW_CEM_CMD_RESERVED;
+ switch (region) {
+ case NGBE_MODULE_EEPROM:
+ verify_cmd.action_flag = NGBE_RELOAD_EEPROM;
+ break;
+ case NGBE_MODULE_FIRMWARE:
+ verify_cmd.action_flag = NGBE_RESET_FIRMWARE;
+ break;
+ case NGBE_MODULE_HARDWARE:
+ verify_cmd.action_flag = NGBE_RESET_LAN;
+ break;
+ default:
+ ERROR_REPORT1(NGBE_ERROR_ARGUMENT,
+ "ngbe_upgrade_flash_hostif: region err %x\n", region);
+ return status;
+ }
+
+ verify_cmd.hdr.checksum = ngbe_calculate_checksum((u8 *)&verify_cmd,
+ (FW_CEM_HDR_LEN + verify_cmd.hdr.buf_len));
+
+ status = ngbe_host_interface_command(hw, (u32 *)&verify_cmd,
+ sizeof(verify_cmd),
+ NGBE_HI_FLASH_VERIFY_TIMEOUT,
+ true);
+
+ if (verify_cmd.hdr.cmd_or_resp.ret_status == FW_CEM_RESP_STATUS_SUCCESS)
+ status = 0;
+ else {
+ status = NGBE_ERR_HOST_INTERFACE_COMMAND;
+ }
+ return status;
+}
+
+/*
+ * cmd_addr is used for some special commands:
+ * 1. the sector address for the erase sector command
+ * 2. the flash address for the read/write flash commands
+ */
+u8 fmgr_cmd_op(struct ngbe_hw *hw, u32 cmd, u32 cmd_addr)
+{
+ u32 cmd_val = 0;
+ u32 time_out = 0;
+
+ cmd_val = (cmd << SPI_CLK_CMD_OFFSET) | (SPI_CLK_DIV << SPI_CLK_DIV_OFFSET) | cmd_addr;
+ wr32(hw, SPI_H_CMD_REG_ADDR, cmd_val);
+ while (1) {
+ if (rd32(hw, SPI_H_STA_REG_ADDR) & 0x1)
+ break;
+
+ if (time_out == SPI_TIME_OUT_VALUE)
+ return 1;
+
+ time_out = time_out + 1;
+ udelay(10);
+ }
+
+ return 0;
+}
+
+u8 fmgr_usr_cmd_op(struct ngbe_hw *hw, u32 usr_cmd)
+{
+ u8 status = 0;
+
+ wr32(hw, SPI_H_USR_CMD_REG_ADDR, usr_cmd);
+ status = fmgr_cmd_op(hw, SPI_CMD_USER_CMD, 0);
+
+ return status;
+}
+
+u8 flash_erase_chip(struct ngbe_hw *hw)
+{
+ u8 status = fmgr_cmd_op(hw, SPI_CMD_ERASE_CHIP, 0);
+ return status;
+}
+
+u8 flash_erase_sector(struct ngbe_hw *hw, u32 sec_addr)
+{
+ u8 status = fmgr_cmd_op(hw, SPI_CMD_ERASE_SECTOR, sec_addr);
+ return status;
+}
+
+u32 flash_read_dword(struct ngbe_hw *hw, u32 addr)
+{
+ u8 status = fmgr_cmd_op(hw, SPI_CMD_READ_DWORD, addr);
+ if (status)
+ return (u32)status;
+
+ return rd32(hw, SPI_H_DAT_REG_ADDR);
+}
+
+u8 flash_write_dword(struct ngbe_hw *hw, u32 addr, u32 dword)
+{
+ u8 status = 0;
+
+ wr32(hw, SPI_H_DAT_REG_ADDR, dword);
+ status = fmgr_cmd_op(hw, SPI_CMD_WRITE_DWORD, addr);
+ if (status)
+ return status;
+
+	if (dword != flash_read_dword(hw, addr))
+		return 1;
+
+ return 0;
+}
+
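+/*
+ * flash_write_dword() verifies by reading back; an illustrative
+ * erase-then-program sketch (addresses are hypothetical):
+ *
+ *     flash_erase_sector(hw, sec_addr);
+ *     if (flash_write_dword(hw, sec_addr, 0x12345678))
+ *             ... handle program/verify failure ...
+ */
+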
+int ngbe_flash_write_cab(struct ngbe_hw *hw, u32 addr, u32 value, u16 lan_id)
+{
+ int status;
+ struct ngbe_hic_read_cab buffer;
+
+ buffer.hdr.req.cmd = 0xE2;
+ buffer.hdr.req.buf_lenh = 0x6;
+ buffer.hdr.req.buf_lenl = 0x0;
+ buffer.hdr.req.checksum = 0xFF;
+
+	buffer.dbuf.d16[0] = cpu_to_le16(lan_id);
+	/* address and value as big-endian dwords */
+	buffer.dbuf.d32[0] = htonl(addr);
+	buffer.dbuf.d32[1] = htonl(value);
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer), 5000, true);
+ printk("0x1e100 :%08x\n", rd32(hw, 0x1e100));
+ printk("0x1e104 :%08x\n", rd32(hw, 0x1e104));
+ printk("0x1e108 :%08x\n", rd32(hw, 0x1e108));
+ printk("0x1e10c :%08x\n", rd32(hw, 0x1e10c));
+
+ return status;
+}
+
+int ngbe_flash_read_cab(struct ngbe_hw *hw, u32 addr, u16 lan_id)
+{
+ int status;
+ struct ngbe_hic_read_cab buffer;
+
+ buffer.hdr.req.cmd = 0xE1;
+ buffer.hdr.req.buf_lenh = 0xaa;
+ buffer.hdr.req.buf_lenl = 0;
+ buffer.hdr.req.checksum = 0xFF;
+
+	buffer.dbuf.d16[0] = cpu_to_le16(lan_id);
+	/* address as a big-endian dword */
+	buffer.dbuf.d32[0] = htonl(addr);
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer), 5000, true);
+
+ if (status)
+ return status;
+ if (ngbe_check_mng_access(hw)) {
+		(void)rd32a(hw, 0x1e100, 3);	/* read and discard the reply word */
+ printk("0x1e100 :%08x\n", rd32(hw, 0x1e100));
+ printk("0x1e104 :%08x\n", rd32(hw, 0x1e104));
+ printk("0x1e108 :%08x\n", rd32(hw, 0x1e108));
+ printk("0x1e10c :%08x\n", rd32(hw, 0x1e10c));
+ } else {
+ status = -147;
+ return status;
+ }
+
+ return rd32(hw, 0x1e108);
+}
+
+int ngbe_flash_write_unlock(struct ngbe_hw *hw)
+{
+ int status;
+ struct ngbe_hic_read_shadow_ram buffer;
+
+ buffer.hdr.req.cmd = 0x40;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = 0;
+ buffer.hdr.req.checksum = 0xFF;
+
+	/* address and length are unused for this command */
+	buffer.address = 0;
+	buffer.length = 0;
+
+	status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+					sizeof(buffer), 5000, false);
+
+	return status;
+}
+
+int ngbe_flash_write_lock(struct ngbe_hw *hw)
+{
+ int status;
+ struct ngbe_hic_read_shadow_ram buffer;
+
+ buffer.hdr.req.cmd = 0x39;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = 0;
+ buffer.hdr.req.checksum = 0xFF;
+
+	/* address and length are unused for this command */
+	buffer.address = 0;
+	buffer.length = 0;
+
+	status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+					sizeof(buffer), 5000, false);
+
+	return status;
+}
+
+int ngbe_upgrade_flash(struct ngbe_hw *hw, u32 region,
+ const u8 *data, u32 size)
+{
+ u32 sector_num = 0;
+ u32 read_data = 0;
+ u8 status = 0;
+ u8 skip = 0;
+ u32 i = 0, k = 0, n = 0;
+ u8 flash_vendor = 0;
+ u32 num[256] = {0};
+ u32 mac_addr0_dword0_t, mac_addr0_dword1_t;
+ u32 mac_addr1_dword0_t, mac_addr1_dword1_t;
+ u32 mac_addr2_dword0_t, mac_addr2_dword1_t;
+ u32 mac_addr3_dword0_t, mac_addr3_dword1_t;
+ u32 serial_num_dword0_t, serial_num_dword1_t, serial_num_dword2_t;
+
+	/* check sub_id */
+ printk("Checking sub_id .......\n");
+ printk("The card's sub_id : %04x\n", hw->subsystem_device_id);
+ printk("The image's sub_id : %04x\n", data[0xfffdc] << 8 | data[0xfffdd]);
+ if ((hw->subsystem_device_id & 0xffff) ==
+ ((data[0xfffdc] << 8 | data[0xfffdd]) & 0xffff)) {
+ printk("It is a right image\n");
+ } else if (hw->subsystem_device_id == 0xffff) {
+ printk("update anyway\n");
+ } else {
+		printk("====The Gigabit image does not match the Gigabit card====\n");
+ printk("====Please check your image====\n");
+ return -EOPNOTSUPP;
+ }
+
+ /*check dev_id*/
+ printk("Checking dev_id .......\n");
+ printk("The image's dev_id : %04x\n", data[0xfffde] << 8 | data[0xfffdf]);
+ printk("The card's dev_id : %04x\n", hw->device_id);
+ if (!((hw->device_id & 0xffff) ==
+ ((data[0xfffde] << 8 | data[0xfffdf]) & 0xffff))
+ && !(hw->device_id == 0xffff)) {
+		printk("====The Gigabit image does not match the Gigabit card====\n");
+ printk("====Please check your image====\n");
+ return -EOPNOTSUPP;
+ }
+
+ // unlock flash write protect
+ ngbe_release_eeprom_semaphore(hw);
+ ngbe_flash_write_unlock(hw);
+
+ wr32(hw, 0x10114, 0x9f050206);
+ wr32(hw, 0x10194, 0x9f050206);
+
+ ngbe_flash_write_cab(hw, 0x188, 0, 0);
+ ngbe_flash_write_cab(hw, 0x184, 0x60000000, 0);
+ msleep(1000);
+
+ mac_addr0_dword0_t = flash_read_dword(hw, MAC_ADDR0_WORD0_OFFSET_1G);
+ mac_addr0_dword1_t = flash_read_dword(hw, MAC_ADDR0_WORD1_OFFSET_1G) & 0xffff;
+ mac_addr1_dword0_t = flash_read_dword(hw, MAC_ADDR1_WORD0_OFFSET_1G);
+ mac_addr1_dword1_t = flash_read_dword(hw, MAC_ADDR1_WORD1_OFFSET_1G) & 0xffff;
+ mac_addr2_dword0_t = flash_read_dword(hw, MAC_ADDR2_WORD0_OFFSET_1G);
+ mac_addr2_dword1_t = flash_read_dword(hw, MAC_ADDR2_WORD1_OFFSET_1G) & 0xffff;
+ mac_addr3_dword0_t = flash_read_dword(hw, MAC_ADDR3_WORD0_OFFSET_1G);
+ mac_addr3_dword1_t = flash_read_dword(hw, MAC_ADDR3_WORD1_OFFSET_1G) & 0xffff;
+
+ serial_num_dword0_t = flash_read_dword(hw, PRODUCT_SERIAL_NUM_OFFSET_1G);
+ serial_num_dword1_t = flash_read_dword(hw, PRODUCT_SERIAL_NUM_OFFSET_1G + 4);
+ serial_num_dword2_t = flash_read_dword(hw, PRODUCT_SERIAL_NUM_OFFSET_1G + 8);
+ printk("Old: MAC Address0 is: 0x%04x%08x\n", mac_addr0_dword1_t,
+ mac_addr0_dword0_t);
+ printk(" MAC Address1 is: 0x%04x%08x\n", mac_addr1_dword1_t,
+ mac_addr1_dword0_t);
+ printk(" MAC Address2 is: 0x%04x%08x\n", mac_addr2_dword1_t,
+ mac_addr2_dword0_t);
+ printk(" MAC Address3 is: 0x%04x%08x\n", mac_addr3_dword1_t,
+ mac_addr3_dword0_t);
+
+ for (k = 0; k < (1024 / 4); k++) {
+ num[k] = flash_read_dword(hw, 0xfe000 + k * 4);
+ }
+
+ status = fmgr_usr_cmd_op(hw, 0x6); /* write enable */
+ status = fmgr_usr_cmd_op(hw, 0x98); /* global protection un-lock */
+ msleep(1000); /* 1 s */
+
+ /* Note: for Spansion FLASH, the first 8 sectors (4KB each) in sector0
+ * (64KB) need a special erase command (4K sector erase)
+ */
+ if (flash_vendor == 1) {
+ wr32(hw, SPI_CMD_CFG1_ADDR, 0x0103c720);
+ for (i = 0; i < 8; i++) {
+ flash_erase_sector(hw, i*128);
+ msleep(20); // 20 ms
+ }
+ wr32(hw, SPI_CMD_CFG1_ADDR, 0x0103c7d8);
+ }
+
+ /* Winbond Flash: erase chip command is okay, but erase sector doesn't work */
+ sector_num = size / SPI_SECTOR_SIZE;
+ if (flash_vendor == 2) {
+ status = flash_erase_chip(hw);
+ printk("Erase chip command, return status = %0d\n", status);
+ msleep(1000); // 1 s
+ } else {
+ wr32(hw, SPI_CMD_CFG1_ADDR, 0x0103c720);
+ for (i = 0; i < sector_num; i++) {
+ status = flash_erase_sector(hw, i * SPI_SECTOR_SIZE);
+ printk("Erase sector[%2d] command, return status = %0d\n", i, status);
+ msleep(50); // 50 ms
+ }
+ wr32(hw, SPI_CMD_CFG1_ADDR, 0x0103c7d8);
+ }
+
+ // Program Image file in dword
+ for (i = 0; i < size / 4; i++) {
+ read_data = data[4 * i + 3] << 24 | data[4 * i + 2] << 16 |
+ data[4 * i + 1] << 8 | data[4 * i];
+ read_data = __le32_to_cpu(read_data);
+ skip = ((i * 4 == MAC_ADDR0_WORD0_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR0_WORD1_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR1_WORD0_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR1_WORD1_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR2_WORD0_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR2_WORD1_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR3_WORD0_OFFSET_1G) ||
+ (i * 4 == MAC_ADDR3_WORD1_OFFSET_1G) ||
+ (i * 4 >= PRODUCT_SERIAL_NUM_OFFSET_1G &&
+ i * 4 <= PRODUCT_SERIAL_NUM_OFFSET_1G + 8));
+ if (read_data != 0xffffffff && !skip) {
+ status = flash_write_dword(hw, i * 4, read_data);
+ if (status) {
+ printk("ERROR: Program 0x%08x @addr: 0x%08x is failed !!\n",
+ read_data, i * 4);
+ read_data = flash_read_dword(hw, i * 4);
+ printk(" Read data from Flash is: 0x%08x\n", read_data);
+ return 1;
+ }
+ }
+ if (i % 1024 == 0) {
+ printk("\b\b\b\b%3d%%", (int)(i * 4 * 100 / size));
+ }
+ }
+
+ flash_write_dword(hw, MAC_ADDR0_WORD0_OFFSET_1G,
+ mac_addr0_dword0_t);
+ flash_write_dword(hw, MAC_ADDR0_WORD1_OFFSET_1G,
+ (mac_addr0_dword1_t | 0x80000000)); /* lan0 */
+ flash_write_dword(hw, MAC_ADDR1_WORD0_OFFSET_1G,
+ mac_addr1_dword0_t);
+ flash_write_dword(hw, MAC_ADDR1_WORD1_OFFSET_1G,
+ (mac_addr1_dword1_t | 0x80000000)); /* lan1 */
+ flash_write_dword(hw, MAC_ADDR2_WORD0_OFFSET_1G,
+ mac_addr2_dword0_t);
+ flash_write_dword(hw, MAC_ADDR2_WORD1_OFFSET_1G,
+ (mac_addr2_dword1_t | 0x80000000)); /* lan2 */
+ flash_write_dword(hw, MAC_ADDR3_WORD0_OFFSET_1G,
+ mac_addr3_dword0_t);
+ flash_write_dword(hw, MAC_ADDR3_WORD1_OFFSET_1G,
+ (mac_addr3_dword1_t | 0x80000000)); /* lan3 */
+ flash_write_dword(hw, PRODUCT_SERIAL_NUM_OFFSET_1G, serial_num_dword0_t);
+ flash_write_dword(hw, PRODUCT_SERIAL_NUM_OFFSET_1G+4, serial_num_dword1_t);
+ flash_write_dword(hw, PRODUCT_SERIAL_NUM_OFFSET_1G+8, serial_num_dword2_t);
+
+ for (n = 0; n < 1024 / 4; n++) {
+ if (num[n] != 0xffffffff)
+ flash_write_dword(hw, 0xfe000 + n * 4, num[n]);
+ }
+
+ return 0;
+}
+
+/**
+ * ngbe_set_rxpba - Initialize Rx packet buffer
+ * @hw: pointer to hardware structure
+ * @num_pb: number of packet buffers to allocate
+ * @headroom: reserve n KB of headroom
+ * @strategy: packet buffer allocation strategy
+ **/
+void ngbe_set_rxpba(struct ngbe_hw *hw, int num_pb, u32 headroom,
+ int strategy)
+{
+ u32 pbsize = hw->mac.rx_pb_size;
+ u32 rxpktsize, txpktsize, txpbthresh;
+
+ DEBUGFUNC("\n");
+
+ /* Reserve headroom */
+ pbsize -= headroom;
+
+ if (!num_pb)
+ num_pb = 1;
+
+ /* Divide remaining packet buffer space amongst the number of packet
+ * buffers requested using supplied strategy.
+ */
+ switch (strategy) {
+ case PBA_STRATEGY_EQUAL:
+ rxpktsize = (pbsize / num_pb) << NGBE_RDB_PB_SZ_SHIFT;
+ wr32(hw, NGBE_RDB_PB_SZ, rxpktsize);
+ break;
+ default:
+ break;
+ }
+
+ /* Only support an equally distributed Tx packet buffer strategy. */
+ txpktsize = NGBE_TDB_PB_SZ_MAX / num_pb;
+ txpbthresh = (txpktsize / NGBE_KB_TO_B) - NGBE_TXPKT_SIZE_MAX;
+
+ wr32(hw, NGBE_TDB_PB_SZ, txpktsize);
+ wr32(hw, NGBE_TDM_PB_THRE, txpbthresh);
+}
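+
+/*
+ * Arithmetic sketch for the strategy above, with hypothetical numbers
+ * (rx_pb_size and the register encodings are hardware specific): if
+ * rx_pb_size = 32, headroom = 0 and num_pb = 4, PBA_STRATEGY_EQUAL
+ * writes (32 / 4) << NGBE_RDB_PB_SZ_SHIFT to NGBE_RDB_PB_SZ, and the
+ * Tx side is sized to NGBE_TDB_PB_SZ_MAX / 4 per packet buffer.
+ */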
+
+STATIC const u8 ngbe_emc_temp_data[4] = {
+ NGBE_EMC_INTERNAL_DATA,
+ NGBE_EMC_DIODE1_DATA,
+ NGBE_EMC_DIODE2_DATA,
+ NGBE_EMC_DIODE3_DATA
+};
+
+STATIC const u8 ngbe_emc_therm_limit[4] = {
+ NGBE_EMC_INTERNAL_THERM_LIMIT,
+ NGBE_EMC_DIODE1_THERM_LIMIT,
+ NGBE_EMC_DIODE2_THERM_LIMIT,
+ NGBE_EMC_DIODE3_THERM_LIMIT
+};
+
+/**
+ * ngbe_get_thermal_sensor_data - Gathers thermal sensor data
+ * @hw: pointer to hardware structure
+ * @data: pointer to the thermal sensor data structure
+ *
+ * algorithm:
+ * T = (-4.8380E+01)N^0 + (3.1020E-01)N^1 + (-1.8201E-04)N^2 +
+ *     (8.1542E-08)N^3 + (-1.6743E-11)N^4
+ * algorithm with 5% more deviation, easy for implementation
+ * T = (-50)N^0 + (0.31)N^1 + (-0.0002)N^2 + (0.0000001)N^3
+ *
+ * Returns the thermal sensor data structure
+ **/
+s32 ngbe_get_thermal_sensor_data(struct ngbe_hw *hw)
+{
+ s64 tsv;
+ struct ngbe_thermal_sensor_data *data = &hw->mac.thermal_sensor_data;
+
+ DEBUGFUNC("\n");
+
+ /* Only support thermal sensors attached to physical port 0 */
+ if (hw->bus.lan_id)
+ return NGBE_NOT_IMPLEMENTED;
+
+ tsv = (s64)(rd32(hw, NGBE_TS_ST) &
+ NGBE_TS_ST_DATA_OUT_MASK);
+ /* 216 < tsv < 876 */
+
+ tsv = tsv < 876 ? tsv : 876;
+ tsv = tsv - 216;
+ tsv = tsv/4;
+ tsv = tsv - 40;
+ data->sensor.temp = (s16)tsv;
+
+ return 0;
+}
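+
+/*
+ * Worked example of the conversion above, following the code's own
+ * steps: a raw reading of tsv = 616 gives (616 - 216) / 4 - 40 = 60
+ * degrees C. Only the upper bound (876) is clamped, so raw values
+ * below 216 would yield temperatures below -40 C.
+ */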
+
+/**
+ * ngbe_init_thermal_sensor_thresh - Inits thermal sensor thresholds
+ * @hw: pointer to hardware structure
+ *
+ * Inits the thermal sensor thresholds according to the NVM map
+ * and save off the threshold and location values into mac.thermal_sensor_data
+ **/
+s32 ngbe_init_thermal_sensor_thresh(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+
+ struct ngbe_thermal_sensor_data *data = &hw->mac.thermal_sensor_data;
+
+ DEBUGFUNC("\n");
+
+ memset(data, 0, sizeof(struct ngbe_thermal_sensor_data));
+
+ /* Only support thermal sensors attached to physical port 0 */
+ if (hw->bus.lan_id)
+ return NGBE_NOT_IMPLEMENTED;
+
+ wr32(hw, NGBE_TS_INT_EN, NGBE_TS_INT_EN_DALARM_INT_EN | NGBE_TS_INT_EN_ALARM_INT_EN);
+
+ wr32(hw, NGBE_TS_EN, NGBE_TS_EN_ENA);
+
+ data->sensor.alarm_thresh = 115;
+ wr32(hw, NGBE_TS_ALARM_THRE, 0x344); /* (115 + 40) * 4 + 216 = 0x344 */
+ data->sensor.dalarm_thresh = 110;
+ wr32(hw, NGBE_TS_DALARM_THRE, 0x330); /* (110 + 40) * 4 + 216 = 0x330 */
+
+ return status;
+}
+
+void ngbe_disable_rx(struct ngbe_hw *hw)
+{
+ u32 pfdtxgswc;
+ u32 rxctrl;
+
+ DEBUGFUNC("\n");
+
+ rxctrl = rd32(hw, NGBE_RDB_PB_CTL);
+ if (rxctrl & NGBE_RDB_PB_CTL_PBEN) {
+ pfdtxgswc = rd32(hw, NGBE_PSR_CTL);
+ if (pfdtxgswc & NGBE_PSR_CTL_SW_EN) {
+ pfdtxgswc &= ~NGBE_PSR_CTL_SW_EN;
+ wr32(hw, NGBE_PSR_CTL, pfdtxgswc);
+ hw->mac.set_lben = true;
+ } else {
+ hw->mac.set_lben = false;
+ }
+ rxctrl &= ~NGBE_RDB_PB_CTL_PBEN;
+ wr32(hw, NGBE_RDB_PB_CTL, rxctrl);
+
+ /*OCP NCSI BMC need it*/
+ if (!(((hw->subsystem_device_id & OEM_MASK) == OCP_CARD) ||
+ ((hw->subsystem_device_id & WOL_SUP_MASK) == WOL_SUP) ||
+ ((hw->subsystem_device_id & NCSI_SUP_MASK) == NCSI_SUP))) {
+ /* disable mac receiver */
+ wr32m(hw, NGBE_MAC_RX_CFG,
+ NGBE_MAC_RX_CFG_RE, 0);
+ }
+ }
+}
+
+
+void ngbe_enable_rx(struct ngbe_hw *hw)
+{
+ u32 pfdtxgswc;
+
+ DEBUGFUNC("\n");
+
+ /* enable mac receiver */
+ wr32m(hw, NGBE_MAC_RX_CFG,
+ NGBE_MAC_RX_CFG_RE, NGBE_MAC_RX_CFG_RE);
+
+ wr32m(hw, NGBE_RSEC_CTL,
+ 0x2, 0);
+
+ wr32m(hw, NGBE_RDB_PB_CTL,
+ NGBE_RDB_PB_CTL_PBEN, NGBE_RDB_PB_CTL_PBEN);
+
+ if (hw->mac.set_lben) {
+ pfdtxgswc = rd32(hw, NGBE_PSR_CTL);
+ pfdtxgswc |= NGBE_PSR_CTL_SW_EN;
+ wr32(hw, NGBE_PSR_CTL, pfdtxgswc);
+ hw->mac.set_lben = false;
+ }
+}
+
+/**
+ * ngbe_mng_present - returns true when management capability is present
+ * @hw: pointer to hardware structure
+ */
+bool ngbe_mng_present(struct ngbe_hw *hw)
+{
+ u32 fwsm;
+
+ fwsm = rd32(hw, NGBE_MIS_ST);
+ return fwsm & NGBE_MIS_ST_MNG_INIT_DN;
+}
+
+bool ngbe_check_mng_access(struct ngbe_hw *hw)
+{
+ return ngbe_mng_present(hw);
+}
+
+int ngbe_check_flash_load(struct ngbe_hw *hw, u32 check_bit)
+{
+ u32 i = 0;
+ u32 reg = 0;
+ int err = 0;
+ /* if there's flash existing */
+ if (!(rd32(hw, NGBE_SPI_STATUS) &
+ NGBE_SPI_STATUS_FLASH_BYPASS)) {
+ /* wait hw load flash done */
+ for (i = 0; i < NGBE_MAX_FLASH_LOAD_POLL_TIME; i++) {
+ reg = rd32(hw, NGBE_SPI_ILDR_STATUS);
+ if (!(reg & check_bit)) {
+ /* done */
+ break;
+ }
+ msleep(200);
+ }
+ if (i == NGBE_MAX_FLASH_LOAD_POLL_TIME) {
+ err = NGBE_ERR_FLASH_LOADING_FAILED;
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "HW Loading Flash failed: %d\n", err);
+ }
+ }
+ return err;
+}
+
+/* The ngbe_ptype_lookup is used to convert from the 8-bit ptype in the
+ * hardware to a bit-field that can be used by SW to more easily determine the
+ * packet type.
+ *
+ * Macros are used to shorten the table lines and make this table human
+ * readable.
+ *
+ * We store the PTYPE in the top byte of the bit field - this is just so that
+ * we can check that the table doesn't have a row missing, as the index into
+ * the table should be the PTYPE.
+ *
+ * Typical work flow:
+ *
+ * IF NOT ngbe_ptype_lookup[ptype].known
+ * THEN
+ * Packet is unknown
+ * ELSE IF ngbe_ptype_lookup[ptype].mac == NGBE_DEC_PTYPE_MAC_IP
+ * Use the rest of the fields to look at the tunnels, inner protocols, etc
+ * ELSE
+ * Use the enum ngbe_l2_ptypes to decode the packet type
+ * ENDIF
+ */
+
+/* macro to make the table lines short */
+#define NGBE_PTT(ptype, mac, ip, etype, eip, proto, layer)\
+ { ptype, \
+ 1, \
+ /* mac */ NGBE_DEC_PTYPE_MAC_##mac, \
+ /* ip */ NGBE_DEC_PTYPE_IP_##ip, \
+ /* etype */ NGBE_DEC_PTYPE_ETYPE_##etype, \
+ /* eip */ NGBE_DEC_PTYPE_IP_##eip, \
+ /* proto */ NGBE_DEC_PTYPE_PROT_##proto, \
+ /* layer */ NGBE_DEC_PTYPE_LAYER_##layer }
+
+#define NGBE_UKN(ptype) \
+ { ptype, 0, 0, 0, 0, 0, 0, 0 }
+
+/* Lookup table mapping the HW PTYPE to the bit field for decoding */
+/* for ((pt=0;pt<256;pt++)); do printf "macro(0x%02X),\n" $pt; done */
+ngbe_dptype ngbe_ptype_lookup[256] = {
+ NGBE_UKN(0x00),
+ NGBE_UKN(0x01),
+ NGBE_UKN(0x02),
+ NGBE_UKN(0x03),
+ NGBE_UKN(0x04),
+ NGBE_UKN(0x05),
+ NGBE_UKN(0x06),
+ NGBE_UKN(0x07),
+ NGBE_UKN(0x08),
+ NGBE_UKN(0x09),
+ NGBE_UKN(0x0A),
+ NGBE_UKN(0x0B),
+ NGBE_UKN(0x0C),
+ NGBE_UKN(0x0D),
+ NGBE_UKN(0x0E),
+ NGBE_UKN(0x0F),
+
+ /* L2: mac */
+ NGBE_UKN(0x10),
+ NGBE_PTT(0x11, L2, NONE, NONE, NONE, NONE, PAY2),
+ NGBE_PTT(0x12, L2, NONE, NONE, NONE, TS, PAY2),
+ NGBE_PTT(0x13, L2, NONE, NONE, NONE, NONE, PAY2),
+ NGBE_PTT(0x14, L2, NONE, NONE, NONE, NONE, PAY2),
+ NGBE_PTT(0x15, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x16, L2, NONE, NONE, NONE, NONE, PAY2),
+ NGBE_PTT(0x17, L2, NONE, NONE, NONE, NONE, NONE),
+
+ /* L2: ethertype filter */
+ NGBE_PTT(0x18, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x19, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x1A, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x1B, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x1C, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x1D, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x1E, L2, NONE, NONE, NONE, NONE, NONE),
+ NGBE_PTT(0x1F, L2, NONE, NONE, NONE, NONE, NONE),
+
+ /* L3: ip non-tunnel */
+ NGBE_UKN(0x20),
+ NGBE_PTT(0x21, IP, FGV4, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x22, IP, IPV4, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x23, IP, IPV4, NONE, NONE, UDP, PAY4),
+ NGBE_PTT(0x24, IP, IPV4, NONE, NONE, TCP, PAY4),
+ NGBE_PTT(0x25, IP, IPV4, NONE, NONE, SCTP, PAY4),
+ NGBE_UKN(0x26),
+ NGBE_UKN(0x27),
+ NGBE_UKN(0x28),
+ NGBE_PTT(0x29, IP, FGV6, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x2A, IP, IPV6, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x2B, IP, IPV6, NONE, NONE, UDP, PAY3),
+ NGBE_PTT(0x2C, IP, IPV6, NONE, NONE, TCP, PAY4),
+ NGBE_PTT(0x2D, IP, IPV6, NONE, NONE, SCTP, PAY4),
+ NGBE_UKN(0x2E),
+ NGBE_UKN(0x2F),
+
+ /* L2: fcoe */
+ NGBE_PTT(0x30, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x31, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x32, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x33, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x34, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_UKN(0x35),
+ NGBE_UKN(0x36),
+ NGBE_UKN(0x37),
+ NGBE_PTT(0x38, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x39, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x3A, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x3B, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_PTT(0x3C, FCOE, NONE, NONE, NONE, NONE, PAY3),
+ NGBE_UKN(0x3D),
+ NGBE_UKN(0x3E),
+ NGBE_UKN(0x3F),
+
+ NGBE_UKN(0x40),
+ NGBE_UKN(0x41),
+ NGBE_UKN(0x42),
+ NGBE_UKN(0x43),
+ NGBE_UKN(0x44),
+ NGBE_UKN(0x45),
+ NGBE_UKN(0x46),
+ NGBE_UKN(0x47),
+ NGBE_UKN(0x48),
+ NGBE_UKN(0x49),
+ NGBE_UKN(0x4A),
+ NGBE_UKN(0x4B),
+ NGBE_UKN(0x4C),
+ NGBE_UKN(0x4D),
+ NGBE_UKN(0x4E),
+ NGBE_UKN(0x4F),
+ NGBE_UKN(0x50),
+ NGBE_UKN(0x51),
+ NGBE_UKN(0x52),
+ NGBE_UKN(0x53),
+ NGBE_UKN(0x54),
+ NGBE_UKN(0x55),
+ NGBE_UKN(0x56),
+ NGBE_UKN(0x57),
+ NGBE_UKN(0x58),
+ NGBE_UKN(0x59),
+ NGBE_UKN(0x5A),
+ NGBE_UKN(0x5B),
+ NGBE_UKN(0x5C),
+ NGBE_UKN(0x5D),
+ NGBE_UKN(0x5E),
+ NGBE_UKN(0x5F),
+ NGBE_UKN(0x60),
+ NGBE_UKN(0x61),
+ NGBE_UKN(0x62),
+ NGBE_UKN(0x63),
+ NGBE_UKN(0x64),
+ NGBE_UKN(0x65),
+ NGBE_UKN(0x66),
+ NGBE_UKN(0x67),
+ NGBE_UKN(0x68),
+ NGBE_UKN(0x69),
+ NGBE_UKN(0x6A),
+ NGBE_UKN(0x6B),
+ NGBE_UKN(0x6C),
+ NGBE_UKN(0x6D),
+ NGBE_UKN(0x6E),
+ NGBE_UKN(0x6F),
+ NGBE_UKN(0x70),
+ NGBE_UKN(0x71),
+ NGBE_UKN(0x72),
+ NGBE_UKN(0x73),
+ NGBE_UKN(0x74),
+ NGBE_UKN(0x75),
+ NGBE_UKN(0x76),
+ NGBE_UKN(0x77),
+ NGBE_UKN(0x78),
+ NGBE_UKN(0x79),
+ NGBE_UKN(0x7A),
+ NGBE_UKN(0x7B),
+ NGBE_UKN(0x7C),
+ NGBE_UKN(0x7D),
+ NGBE_UKN(0x7E),
+ NGBE_UKN(0x7F),
+
+ /* IPv4 --> IPv4/IPv6 */
+ NGBE_UKN(0x80),
+ NGBE_PTT(0x81, IP, IPV4, IPIP, FGV4, NONE, PAY3),
+ NGBE_PTT(0x82, IP, IPV4, IPIP, IPV4, NONE, PAY3),
+ NGBE_PTT(0x83, IP, IPV4, IPIP, IPV4, UDP, PAY4),
+ NGBE_PTT(0x84, IP, IPV4, IPIP, IPV4, TCP, PAY4),
+ NGBE_PTT(0x85, IP, IPV4, IPIP, IPV4, SCTP, PAY4),
+ NGBE_UKN(0x86),
+ NGBE_UKN(0x87),
+ NGBE_UKN(0x88),
+ NGBE_PTT(0x89, IP, IPV4, IPIP, FGV6, NONE, PAY3),
+ NGBE_PTT(0x8A, IP, IPV4, IPIP, IPV6, NONE, PAY3),
+ NGBE_PTT(0x8B, IP, IPV4, IPIP, IPV6, UDP, PAY4),
+ NGBE_PTT(0x8C, IP, IPV4, IPIP, IPV6, TCP, PAY4),
+ NGBE_PTT(0x8D, IP, IPV4, IPIP, IPV6, SCTP, PAY4),
+ NGBE_UKN(0x8E),
+ NGBE_UKN(0x8F),
+
+ /* IPv4 --> GRE/NAT --> NONE/IPv4/IPv6 */
+ NGBE_PTT(0x90, IP, IPV4, IG, NONE, NONE, PAY3),
+ NGBE_PTT(0x91, IP, IPV4, IG, FGV4, NONE, PAY3),
+ NGBE_PTT(0x92, IP, IPV4, IG, IPV4, NONE, PAY3),
+ NGBE_PTT(0x93, IP, IPV4, IG, IPV4, UDP, PAY4),
+ NGBE_PTT(0x94, IP, IPV4, IG, IPV4, TCP, PAY4),
+ NGBE_PTT(0x95, IP, IPV4, IG, IPV4, SCTP, PAY4),
+ NGBE_UKN(0x96),
+ NGBE_UKN(0x97),
+ NGBE_UKN(0x98),
+ NGBE_PTT(0x99, IP, IPV4, IG, FGV6, NONE, PAY3),
+ NGBE_PTT(0x9A, IP, IPV4, IG, IPV6, NONE, PAY3),
+ NGBE_PTT(0x9B, IP, IPV4, IG, IPV6, UDP, PAY4),
+ NGBE_PTT(0x9C, IP, IPV4, IG, IPV6, TCP, PAY4),
+ NGBE_PTT(0x9D, IP, IPV4, IG, IPV6, SCTP, PAY4),
+ NGBE_UKN(0x9E),
+ NGBE_UKN(0x9F),
+
+ /* IPv4 --> GRE/NAT --> MAC --> NONE/IPv4/IPv6 */
+ NGBE_PTT(0xA0, IP, IPV4, IGM, NONE, NONE, PAY3),
+ NGBE_PTT(0xA1, IP, IPV4, IGM, FGV4, NONE, PAY3),
+ NGBE_PTT(0xA2, IP, IPV4, IGM, IPV4, NONE, PAY3),
+ NGBE_PTT(0xA3, IP, IPV4, IGM, IPV4, UDP, PAY4),
+ NGBE_PTT(0xA4, IP, IPV4, IGM, IPV4, TCP, PAY4),
+ NGBE_PTT(0xA5, IP, IPV4, IGM, IPV4, SCTP, PAY4),
+ NGBE_UKN(0xA6),
+ NGBE_UKN(0xA7),
+ NGBE_UKN(0xA8),
+ NGBE_PTT(0xA9, IP, IPV4, IGM, FGV6, NONE, PAY3),
+ NGBE_PTT(0xAA, IP, IPV4, IGM, IPV6, NONE, PAY3),
+ NGBE_PTT(0xAB, IP, IPV4, IGM, IPV6, UDP, PAY4),
+ NGBE_PTT(0xAC, IP, IPV4, IGM, IPV6, TCP, PAY4),
+ NGBE_PTT(0xAD, IP, IPV4, IGM, IPV6, SCTP, PAY4),
+ NGBE_UKN(0xAE),
+ NGBE_UKN(0xAF),
+
+ /* IPv4 --> GRE/NAT --> MAC+VLAN --> NONE/IPv4/IPv6 */
+ NGBE_PTT(0xB0, IP, IPV4, IGMV, NONE, NONE, PAY3),
+ NGBE_PTT(0xB1, IP, IPV4, IGMV, FGV4, NONE, PAY3),
+ NGBE_PTT(0xB2, IP, IPV4, IGMV, IPV4, NONE, PAY3),
+ NGBE_PTT(0xB3, IP, IPV4, IGMV, IPV4, UDP, PAY4),
+ NGBE_PTT(0xB4, IP, IPV4, IGMV, IPV4, TCP, PAY4),
+ NGBE_PTT(0xB5, IP, IPV4, IGMV, IPV4, SCTP, PAY4),
+ NGBE_UKN(0xB6),
+ NGBE_UKN(0xB7),
+ NGBE_UKN(0xB8),
+ NGBE_PTT(0xB9, IP, IPV4, IGMV, FGV6, NONE, PAY3),
+ NGBE_PTT(0xBA, IP, IPV4, IGMV, IPV6, NONE, PAY3),
+ NGBE_PTT(0xBB, IP, IPV4, IGMV, IPV6, UDP, PAY4),
+ NGBE_PTT(0xBC, IP, IPV4, IGMV, IPV6, TCP, PAY4),
+ NGBE_PTT(0xBD, IP, IPV4, IGMV, IPV6, SCTP, PAY4),
+ NGBE_UKN(0xBE),
+ NGBE_UKN(0xBF),
+
+ /* IPv6 --> IPv4/IPv6 */
+ NGBE_UKN(0xC0),
+ NGBE_PTT(0xC1, IP, IPV6, IPIP, FGV4, NONE, PAY3),
+ NGBE_PTT(0xC2, IP, IPV6, IPIP, IPV4, NONE, PAY3),
+ NGBE_PTT(0xC3, IP, IPV6, IPIP, IPV4, UDP, PAY4),
+ NGBE_PTT(0xC4, IP, IPV6, IPIP, IPV4, TCP, PAY4),
+ NGBE_PTT(0xC5, IP, IPV6, IPIP, IPV4, SCTP, PAY4),
+ NGBE_UKN(0xC6),
+ NGBE_UKN(0xC7),
+ NGBE_UKN(0xC8),
+ NGBE_PTT(0xC9, IP, IPV6, IPIP, FGV6, NONE, PAY3),
+ NGBE_PTT(0xCA, IP, IPV6, IPIP, IPV6, NONE, PAY3),
+ NGBE_PTT(0xCB, IP, IPV6, IPIP, IPV6, UDP, PAY4),
+ NGBE_PTT(0xCC, IP, IPV6, IPIP, IPV6, TCP, PAY4),
+ NGBE_PTT(0xCD, IP, IPV6, IPIP, IPV6, SCTP, PAY4),
+ NGBE_UKN(0xCE),
+ NGBE_UKN(0xCF),
+
+ /* IPv6 --> GRE/NAT --> NONE/IPv4/IPv6 */
+ NGBE_PTT(0xD0, IP, IPV6, IG, NONE, NONE, PAY3),
+ NGBE_PTT(0xD1, IP, IPV6, IG, FGV4, NONE, PAY3),
+ NGBE_PTT(0xD2, IP, IPV6, IG, IPV4, NONE, PAY3),
+ NGBE_PTT(0xD3, IP, IPV6, IG, IPV4, UDP, PAY4),
+ NGBE_PTT(0xD4, IP, IPV6, IG, IPV4, TCP, PAY4),
+ NGBE_PTT(0xD5, IP, IPV6, IG, IPV4, SCTP, PAY4),
+ NGBE_UKN(0xD6),
+ NGBE_UKN(0xD7),
+ NGBE_UKN(0xD8),
+ NGBE_PTT(0xD9, IP, IPV6, IG, FGV6, NONE, PAY3),
+ NGBE_PTT(0xDA, IP, IPV6, IG, IPV6, NONE, PAY3),
+ NGBE_PTT(0xDB, IP, IPV6, IG, IPV6, UDP, PAY4),
+ NGBE_PTT(0xDC, IP, IPV6, IG, IPV6, TCP, PAY4),
+ NGBE_PTT(0xDD, IP, IPV6, IG, IPV6, SCTP, PAY4),
+ NGBE_UKN(0xDE),
+ NGBE_UKN(0xDF),
+
+ /* IPv6 --> GRE/NAT --> MAC --> NONE/IPv4/IPv6 */
+ NGBE_PTT(0xE0, IP, IPV6, IGM, NONE, NONE, PAY3),
+ NGBE_PTT(0xE1, IP, IPV6, IGM, FGV4, NONE, PAY3),
+ NGBE_PTT(0xE2, IP, IPV6, IGM, IPV4, NONE, PAY3),
+ NGBE_PTT(0xE3, IP, IPV6, IGM, IPV4, UDP, PAY4),
+ NGBE_PTT(0xE4, IP, IPV6, IGM, IPV4, TCP, PAY4),
+ NGBE_PTT(0xE5, IP, IPV6, IGM, IPV4, SCTP, PAY4),
+ NGBE_UKN(0xE6),
+ NGBE_UKN(0xE7),
+ NGBE_UKN(0xE8),
+ NGBE_PTT(0xE9, IP, IPV6, IGM, FGV6, NONE, PAY3),
+ NGBE_PTT(0xEA, IP, IPV6, IGM, IPV6, NONE, PAY3),
+ NGBE_PTT(0xEB, IP, IPV6, IGM, IPV6, UDP, PAY4),
+ NGBE_PTT(0xEC, IP, IPV6, IGM, IPV6, TCP, PAY4),
+ NGBE_PTT(0xED, IP, IPV6, IGM, IPV6, SCTP, PAY4),
+ NGBE_UKN(0xEE),
+ NGBE_UKN(0xEF),
+
+ /* IPv6 --> GRE/NAT --> MAC+VLAN --> NONE/IPv4/IPv6 */
+ NGBE_PTT(0xF0, IP, IPV6, IGMV, NONE, NONE, PAY3),
+ NGBE_PTT(0xF1, IP, IPV6, IGMV, FGV4, NONE, PAY3),
+ NGBE_PTT(0xF2, IP, IPV6, IGMV, IPV4, NONE, PAY3),
+ NGBE_PTT(0xF3, IP, IPV6, IGMV, IPV4, UDP, PAY4),
+ NGBE_PTT(0xF4, IP, IPV6, IGMV, IPV4, TCP, PAY4),
+ NGBE_PTT(0xF5, IP, IPV6, IGMV, IPV4, SCTP, PAY4),
+ NGBE_UKN(0xF6),
+ NGBE_UKN(0xF7),
+ NGBE_UKN(0xF8),
+ NGBE_PTT(0xF9, IP, IPV6, IGMV, FGV6, NONE, PAY3),
+ NGBE_PTT(0xFA, IP, IPV6, IGMV, IPV6, NONE, PAY3),
+ NGBE_PTT(0xFB, IP, IPV6, IGMV, IPV6, UDP, PAY4),
+ NGBE_PTT(0xFC, IP, IPV6, IGMV, IPV6, TCP, PAY4),
+ NGBE_PTT(0xFD, IP, IPV6, IGMV, IPV6, SCTP, PAY4),
+ NGBE_UKN(0xFE),
+ NGBE_UKN(0xFF),
+};
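+
+/*
+ * Illustrative sketch of the decode flow described above the table;
+ * the helper name is hypothetical and not part of this patch's API.
+ */
+static inline ngbe_dptype ngbe_decode_ptype_sketch(u8 ptype)
+{
+ ngbe_dptype dptype = ngbe_ptype_lookup[ptype];
+
+ /* a cleared 'known' field marks a NGBE_UKN() placeholder row */
+ if (!dptype.known)
+ return dptype;
+
+ /* for NGBE_DEC_PTYPE_MAC_IP the remaining fields describe the
+ * tunnel and inner protocols; otherwise the L2 type applies
+ */
+ return dptype;
+}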
+
+
+void ngbe_init_mac_link_ops(struct ngbe_hw *hw)
+{
+ struct ngbe_mac_info *mac = &hw->mac;
+
+ DEBUGFUNC("\n");
+
+ mac->ops.setup_link = ngbe_setup_mac_link;
+}
+
+/**
+ * ngbe_init_ops - Inits func ptrs and MAC type
+ * @hw: pointer to hardware structure
+ *
+ * Initialize the function pointers and assign the MAC type for emerald.
+ * Does not touch the hardware.
+ **/
+s32 ngbe_init_ops(struct ngbe_hw *hw)
+{
+ struct ngbe_mac_info *mac = &hw->mac;
+ struct ngbe_phy_info *phy = &hw->phy;
+
+ DEBUGFUNC("ngbe_init_ops");
+
+ ngbe_init_phy_ops_common(hw);
+ ngbe_init_ops_common(hw);
+
+ if (hw->phy.type == ngbe_phy_m88e1512 ||
+ hw->phy.type == ngbe_phy_m88e1512_sfi) {
+ phy->ops.read_reg_mdi = ngbe_phy_read_reg_mdi;
+ phy->ops.write_reg_mdi = ngbe_phy_write_reg_mdi;
+ phy->ops.setup_link = ngbe_phy_setup_link_m88e1512;
+ phy->ops.reset = ngbe_phy_reset_m88e1512;
+ phy->ops.check_event = ngbe_phy_check_event_m88e1512;
+ phy->ops.get_adv_pause = ngbe_phy_get_advertised_pause_m88e1512;
+ phy->ops.get_lp_adv_pause = ngbe_phy_get_lp_advertised_pause_m88e1512;
+ phy->ops.set_adv_pause = ngbe_phy_set_pause_advertisement_m88e1512;
+
+ mac->ops.check_link = ngbe_check_mac_link_mdi;
+ } else if (hw->phy.type == ngbe_phy_yt8521s_sfi) {
+ phy->ops.read_reg_mdi = ngbe_phy_read_reg_mdi;
+ phy->ops.write_reg_mdi = ngbe_phy_write_reg_mdi;
+ phy->ops.setup_link = ngbe_phy_setup_link_yt8521s;
+ phy->ops.reset = ngbe_phy_reset_yt8521s;
+ phy->ops.check_event = ngbe_phy_check_event_yt8521s;
+ phy->ops.get_adv_pause = ngbe_phy_get_advertised_pause_yt8521s;
+ phy->ops.get_lp_adv_pause = ngbe_phy_get_lp_advertised_pause_yt8521s;
+ phy->ops.set_adv_pause = ngbe_phy_set_pause_advertisement_yt8521s;
+
+ mac->ops.check_link = ngbe_check_mac_link_yt8521s;
+ }
+
+ return NGBE_OK;
+}
+
+/**
+ * ngbe_get_link_capabilities - Determines link capabilities
+ * @hw: pointer to hardware structure
+ * @speed: pointer to link speed
+ * @autoneg: true when autoneg or autotry is enabled
+ *
+ * Determines the link capabilities by reading the AUTOC register.
+ **/
+s32 ngbe_get_link_capabilities(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *autoneg)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (hw->device_id == NGBE_DEV_ID_EM_TEST ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860A2 ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860A2S ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860A4 ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860A4S ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860AL2 ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860AL2S ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860AL4 ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860AL4S ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860AL_W ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860A1 ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860AL1 ||
+ hw->device_id == 0x10c ||
+ hw->device_id == NGBE_DEV_ID_EM_WX1860NCSI) {
+ *speed = NGBE_LINK_SPEED_1GB_FULL |
+ NGBE_LINK_SPEED_100_FULL |
+ NGBE_LINK_SPEED_10_FULL;
+ *autoneg = false;
+ hw->phy.link_mode = NGBE_PHYSICAL_LAYER_1000BASE_T |
+ NGBE_PHYSICAL_LAYER_100BASE_TX;
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_get_copper_link_capabilities - Determines link capabilities
+ * @hw: pointer to hardware structure
+ * @speed: pointer to link speed
+ * @autoneg: boolean auto-negotiation value
+ *
+ * Determines the supported link capabilities by reading the PHY auto
+ * negotiation register.
+ **/
+s32 ngbe_get_copper_link_capabilities(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *autoneg)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ *autoneg = hw->mac.autoneg;
+
+ *speed = NGBE_LINK_SPEED_10_FULL |
+ NGBE_LINK_SPEED_100_FULL |
+ NGBE_LINK_SPEED_1GB_FULL;
+
+ return status;
+}
+
+/**
+ * ngbe_get_media_type - Get media type
+ * @hw: pointer to hardware structure
+ *
+ * Returns the media type (fiber, copper, backplane)
+ **/
+enum ngbe_media_type ngbe_get_media_type(struct ngbe_hw *hw)
+{
+ enum ngbe_media_type media_type;
+
+ DEBUGFUNC("\n");
+
+ ERROR_REPORT1(NGBE_ERROR_ARGUMENT,
+ "ngbe_get_media_type: hw->device_id = %u\n", hw->device_id);
+
+ media_type = ngbe_media_type_copper;
+
+ return media_type;
+}
+
+/**
+ * ngbe_stop_mac_link_on_d3 - Disables link on D3
+ * @hw: pointer to hardware structure
+ *
+ * Disables link during D3 power down sequence.
+ *
+ **/
+void ngbe_stop_mac_link_on_d3(struct ngbe_hw *hw)
+{
+ UNREFERENCED_PARAMETER(hw);
+}
+
+/**
+ * ngbe_setup_mac_link - Set MAC link speed
+ * @hw: pointer to hardware structure
+ * @speed: new link speed
+ * @autoneg_wait_to_complete: true when waiting for completion is needed
+ *
+ * Set the link speed in the AUTOC register and restarts link.
+ **/
+s32 ngbe_setup_mac_link(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete)
+{
+ bool autoneg = false;
+ s32 status = 0;
+ u32 link_capabilities = NGBE_LINK_SPEED_UNKNOWN;
+ u32 link_speed = NGBE_LINK_SPEED_UNKNOWN;
+ u32 lan_speed = 0;
+ bool link_up = false;
+
+ UNREFERENCED_PARAMETER(autoneg_wait_to_complete);
+ DEBUGFUNC("\n");
+
+ if (!(((hw->subsystem_device_id & OEM_MASK) == OCP_CARD) ||
+ ((hw->subsystem_device_id & WOL_SUP_MASK) == WOL_SUP) ||
+ ((hw->subsystem_device_id & NCSI_SUP_MASK) == NCSI_SUP))) {
+ /* Check to see if speed passed in is supported. */
+ status = TCALL(hw, mac.ops.get_link_capabilities,
+ &link_capabilities, &autoneg);
+ if (status)
+ goto out;
+
+ speed &= link_capabilities;
+
+ if (speed == NGBE_LINK_SPEED_UNKNOWN) {
+ status = NGBE_ERR_LINK_SETUP;
+ goto out;
+ }
+ }
+
+ status = TCALL(hw, mac.ops.check_link,
+ &link_speed, &link_up, false);
+ if (status != 0)
+ goto out;
+ if ((link_speed == speed) && link_up) {
+ switch (link_speed) {
+ case NGBE_LINK_SPEED_100_FULL:
+ lan_speed = 1;
+ break;
+ case NGBE_LINK_SPEED_1GB_FULL:
+ lan_speed = 2;
+ break;
+ case NGBE_LINK_SPEED_10_FULL:
+ lan_speed = 0;
+ break;
+ default:
+ break;
+ }
+ wr32m(hw, NGBE_CFG_LAN_SPEED,
+ 0x3, lan_speed);
+ }
+
+out:
+ return status;
+}
+
+
+/**
+ * ngbe_setup_copper_link - Set the PHY autoneg advertised field
+ * @hw: pointer to hardware structure
+ * @speed: new link speed
+ * @need_restart_AN: true to restart auto-negotiation after the PHY setup
+ *
+ * Restarts link on PHY and MAC based on settings passed in.
+ **/
+STATIC s32 ngbe_setup_copper_link(struct ngbe_hw *hw,
+ u32 speed,
+ bool need_restart_AN)
+{
+ s32 status;
+ struct ngbe_adapter *adapter = hw->back;
+
+ DEBUGFUNC("\n");
+
+ /* Setup the PHY according to input speed */
+ status = TCALL(hw, phy.ops.setup_link, speed,
+ need_restart_AN);
+
+ adapter->flags |= NGBE_FLAG_NEED_ANC_CHECK;
+
+ return status;
+}
+
+int ngbe_reset_misc(struct ngbe_hw *hw)
+{
+ int i;
+
+ /* receive packets that size > 2048 */
+ wr32m(hw, NGBE_MAC_RX_CFG,
+ NGBE_MAC_RX_CFG_JE, NGBE_MAC_RX_CFG_JE);
+
+ /* clear counters on read */
+ wr32m(hw, NGBE_MMC_CONTROL,
+ NGBE_MMC_CONTROL_RSTONRD, NGBE_MMC_CONTROL_RSTONRD);
+
+ wr32m(hw, NGBE_MAC_RX_FLOW_CTRL,
+ NGBE_MAC_RX_FLOW_CTRL_RFE, NGBE_MAC_RX_FLOW_CTRL_RFE);
+
+ wr32(hw, NGBE_MAC_PKT_FLT,
+ NGBE_MAC_PKT_FLT_PR);
+
+ wr32m(hw, NGBE_MIS_RST_ST,
+ NGBE_MIS_RST_ST_RST_INIT, 0x1E00);
+
+ /* errata 4: initialize mng flex tbl and wakeup flex tbl */
+ wr32(hw, NGBE_PSR_MNG_FLEX_SEL, 0);
+ for (i = 0; i < 16; i++) {
+ wr32(hw, NGBE_PSR_MNG_FLEX_DW_L(i), 0);
+ wr32(hw, NGBE_PSR_MNG_FLEX_DW_H(i), 0);
+ wr32(hw, NGBE_PSR_MNG_FLEX_MSK(i), 0);
+ }
+ wr32(hw, NGBE_PSR_LAN_FLEX_SEL, 0);
+ for (i = 0; i < 16; i++) {
+ wr32(hw, NGBE_PSR_LAN_FLEX_DW_L(i), 0);
+ wr32(hw, NGBE_PSR_LAN_FLEX_DW_H(i), 0);
+ wr32(hw, NGBE_PSR_LAN_FLEX_MSK(i), 0);
+ }
+
+ /* set pause frame dst mac addr */
+ wr32(hw, NGBE_RDB_PFCMACDAL, 0xC2000001);
+ wr32(hw, NGBE_RDB_PFCMACDAH, 0x0180);
+
+ wr32(hw, NGBE_MDIO_CLAUSE_SELECT, 0xF);
+
+ if (((hw->subsystem_device_id & OEM_MASK) == LY_M88E1512_SFP) ||
+ (hw->subsystem_device_id & OEM_MASK) == LY_YT8521S_SFP) {
+ /* GPIO 0 is used for SFP module power on/off control */
+ wr32(hw, NGBE_GPIO_DDR, 0x1);
+ wr32(hw, NGBE_GPIO_DR, NGBE_GPIO_DR_0);
+ }
+
+ ngbe_init_thermal_sensor_thresh(hw);
+
+ return 0;
+}
+
+/**
+ * ngbe_reset_hw - Perform hardware reset
+ * @hw: pointer to hardware structure
+ *
+ * Resets the hardware by resetting the transmit and receive units, masks
+ * and clears all interrupts, perform a PHY reset, and perform a link (MAC)
+ * reset.
+ **/
+s32 ngbe_reset_hw(struct ngbe_hw *hw)
+{
+ s32 status;
+ u32 reset = 0;
+ u32 i;
+ struct ngbe_mac_info *mac = &hw->mac;
+
+ u32 sr_pcs_ctl = 0, sr_pma_mmd_ctl1 = 0, sr_an_mmd_ctl = 0;
+ u32 sr_an_mmd_adv_reg2 = 0;
+ u32 vr_xs_or_pcs_mmd_digi_ctl1 = 0, curr_vr_xs_or_pcs_mmd_digi_ctl1 = 0;
+ u32 curr_sr_pcs_ctl = 0, curr_sr_pma_mmd_ctl1 = 0;
+ u32 curr_sr_an_mmd_ctl = 0, curr_sr_an_mmd_adv_reg2 = 0;
+
+ u32 reset_status = 0;
+ u32 rst_delay = 0;
+
+ struct ngbe_adapter *adapter = NULL;
+
+ DEBUGFUNC("\n");
+
+ /* Call adapter stop to disable tx/rx and clear interrupts */
+ status = TCALL(hw, mac.ops.stop_adapter);
+ if (status != 0)
+ goto reset_hw_out;
+
+ /* Identify PHY and related function pointers */
+ status = TCALL(hw, phy.ops.init);
+
+ if (status)
+ goto reset_hw_out;
+
+ if (ngbe_get_media_type(hw) == ngbe_media_type_copper) {
+ mac->ops.setup_link = ngbe_setup_copper_link;
+ mac->ops.get_link_capabilities =
+ ngbe_get_copper_link_capabilities;
+ }
+
+ /*
+ * Issue global reset to the MAC. Needs to be SW reset if link is up.
+ * If link reset is used when link is up, it might reset the PHY when
+ * mng is using it. If link is down or the flag to force full link
+ * reset is set, then perform link reset.
+ */
+ if (hw->force_full_reset) {
+ rst_delay = (rd32(hw, NGBE_MIS_RST_ST) &
+ NGBE_MIS_RST_ST_RST_INIT) >>
+ NGBE_MIS_RST_ST_RST_INI_SHIFT;
+ if (hw->reset_type == NGBE_SW_RESET) {
+ for (i = 0; i < rst_delay + 20; i++) {
+ reset_status =
+ rd32(hw, NGBE_MIS_RST_ST);
+ if (!(reset_status &
+ NGBE_MIS_RST_ST_DEV_RST_ST_MASK))
+ break;
+ msleep(100);
+ }
+
+ if (reset_status & NGBE_MIS_RST_ST_DEV_RST_ST_MASK) {
+ status = NGBE_ERR_RESET_FAILED;
+ DEBUGOUT("software reset polling failed to "
+ "complete.\n");
+ goto reset_hw_out;
+ }
+ status = ngbe_check_flash_load(hw,
+ NGBE_SPI_ILDR_STATUS_SW_RESET);
+ if (status != 0)
+ goto reset_hw_out;
+
+ } else if (hw->reset_type == NGBE_GLOBAL_RESET) {
+ adapter = (struct ngbe_adapter *)hw->back;
+ msleep(100 * rst_delay + 2000);
+ pci_restore_state(adapter->pdev);
+ pci_save_state(adapter->pdev);
+ pci_wake_from_d3(adapter->pdev, false);
+ }
+ } else {
+ if (hw->bus.lan_id == 0) {
+ reset = NGBE_MIS_RST_LAN0_RST;
+ } else if (hw->bus.lan_id == 1) {
+ reset = NGBE_MIS_RST_LAN1_RST;
+ } else if (hw->bus.lan_id == 2) {
+ reset = NGBE_MIS_RST_LAN2_RST;
+ } else if (hw->bus.lan_id == 3) {
+ reset = NGBE_MIS_RST_LAN3_RST;
+ }
+
+ wr32(hw, NGBE_MIS_RST,
+ reset | rd32(hw, NGBE_MIS_RST));
+ NGBE_WRITE_FLUSH(hw);
+
+ msleep(15);
+ }
+
+ status = ngbe_reset_misc(hw);
+ if (status != 0)
+ goto reset_hw_out;
+
+ if (!hw->mac.orig_link_settings_stored) {
+ hw->mac.orig_sr_pcs_ctl2 = sr_pcs_ctl;
+ hw->mac.orig_sr_pma_mmd_ctl1 = sr_pma_mmd_ctl1;
+ hw->mac.orig_sr_an_mmd_ctl = sr_an_mmd_ctl;
+ hw->mac.orig_sr_an_mmd_adv_reg2 = sr_an_mmd_adv_reg2;
+ hw->mac.orig_vr_xs_or_pcs_mmd_digi_ctl1 =
+ vr_xs_or_pcs_mmd_digi_ctl1;
+ hw->mac.orig_link_settings_stored = true;
+ } else {
+ /* If MNG FW is running on a multi-speed device that
+ * doesn't autoneg without driver support, we need to
+ * leave LMS in the state it was before the MAC reset.
+ * Likewise if we support WoL we don't want to change
+ * the LMS state.
+ */
+ hw->mac.orig_sr_pcs_ctl2 = curr_sr_pcs_ctl;
+ hw->mac.orig_sr_pma_mmd_ctl1 = curr_sr_pma_mmd_ctl1;
+ hw->mac.orig_sr_an_mmd_ctl = curr_sr_an_mmd_ctl;
+ hw->mac.orig_sr_an_mmd_adv_reg2 =
+ curr_sr_an_mmd_adv_reg2;
+ hw->mac.orig_vr_xs_or_pcs_mmd_digi_ctl1 =
+ curr_vr_xs_or_pcs_mmd_digi_ctl1;
+ }
+
+ /* Store the permanent mac address */
+ TCALL(hw, mac.ops.get_mac_addr, hw->mac.perm_addr);
+
+ /*
+ * Store MAC address from RAR0, clear receive address registers, and
+ * clear the multicast table. Also reset num_rar_entries to its
+ * default, since we modify this value when programming the SAN MAC
+ * address.
+ */
+ hw->mac.num_rar_entries = NGBE_SP_RAR_ENTRIES;
+ TCALL(hw, mac.ops.init_rx_addrs);
+
+ pci_set_master(((struct ngbe_adapter *)hw->back)->pdev);
+
+reset_hw_out:
+ return status;
+}
+
+/*
+ * These defines allow us to quickly generate all of the necessary instructions
+ * in the function below by simply calling out NGBE_COMPUTE_SIG_HASH_ITERATION
+ * for values 0 through 15
+ */
+#define NGBE_ATR_COMMON_HASH_KEY \
+ (NGBE_ATR_BUCKET_HASH_KEY & NGBE_ATR_SIGNATURE_HASH_KEY)
+#define NGBE_COMPUTE_SIG_HASH_ITERATION(_n) \
+do { \
+ u32 n = (_n); \
+ if (NGBE_ATR_COMMON_HASH_KEY & (0x01 << n)) \
+ common_hash ^= lo_hash_dword >> n; \
+ else if (NGBE_ATR_BUCKET_HASH_KEY & (0x01 << n)) \
+ bucket_hash ^= lo_hash_dword >> n; \
+ else if (NGBE_ATR_SIGNATURE_HASH_KEY & (0x01 << n)) \
+ sig_hash ^= lo_hash_dword << (16 - n); \
+ if (NGBE_ATR_COMMON_HASH_KEY & (0x01 << (n + 16))) \
+ common_hash ^= hi_hash_dword >> n; \
+ else if (NGBE_ATR_BUCKET_HASH_KEY & (0x01 << (n + 16))) \
+ bucket_hash ^= hi_hash_dword >> n; \
+ else if (NGBE_ATR_SIGNATURE_HASH_KEY & (0x01 << (n + 16))) \
+ sig_hash ^= hi_hash_dword << (16 - n); \
+} while (0)
+
+#define NGBE_COMPUTE_BKT_HASH_ITERATION(_n) \
+do { \
+ u32 n = (_n); \
+ if (NGBE_ATR_BUCKET_HASH_KEY & (0x01 << n)) \
+ bucket_hash ^= lo_hash_dword >> n; \
+ if (NGBE_ATR_BUCKET_HASH_KEY & (0x01 << (n + 16))) \
+ bucket_hash ^= hi_hash_dword >> n; \
+} while (0)
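+
+/*
+ * Usage sketch (illustrative): the iteration macros above expect local
+ * variables named sig_hash, bucket_hash, common_hash, lo_hash_dword and
+ * hi_hash_dword in the calling function, which then unrolls:
+ *
+ *	NGBE_COMPUTE_SIG_HASH_ITERATION(0);
+ *	NGBE_COMPUTE_SIG_HASH_ITERATION(1);
+ *	...
+ *	NGBE_COMPUTE_SIG_HASH_ITERATION(15);
+ *
+ * How the three partial hashes are finally combined is an assumption
+ * here; typically the common hash is XORed back into the other two.
+ */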
+
+/*
+ * These two macros are meant to address the fact that we have registers
+ * that are either all or in part big-endian. As a result on big-endian
+ * systems we will end up byte swapping the value to little-endian before
+ * it is byte swapped again and written to the hardware in the original
+ * big-endian format.
+ */
+#define NGBE_STORE_AS_BE32(_value) \
+ (((u32)(_value) >> 24) | (((u32)(_value) & 0x00FF0000) >> 8) | \
+ (((u32)(_value) & 0x0000FF00) << 8) | ((u32)(_value) << 24))
+
+#define NGBE_WRITE_REG_BE32(a, reg, value) \
+ wr32((a), (reg), NGBE_STORE_AS_BE32(NGBE_NTOHL(value)))
+
+#define NGBE_STORE_AS_BE16(_value) \
+ NGBE_NTOHS(((u16)(_value) >> 8) | ((u16)(_value) << 8))
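+
+/*
+ * Worked example: NGBE_STORE_AS_BE32(0x12345678) evaluates to
+ * 0x78563412, so a value that NGBE_NTOHL has already byte-swapped is
+ * swapped back and reaches the register in its original big-endian
+ * byte order.
+ */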
+
+/**
+ * ngbe_start_hw - Prepare hardware for Tx/Rx
+ * @hw: pointer to hardware structure
+ *
+ * Starts the hardware using the generic start_hw function.
+ * Then performs revision-specific operations, if any.
+ **/
+s32 ngbe_start_hw(struct ngbe_hw *hw)
+{
+ int ret_val = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Set the media type */
+ hw->phy.media_type = TCALL(hw, mac.ops.get_media_type);
+
+ /* PHY ops initialization must be done in reset_hw() */
+
+ /* Clear the VLAN filter table */
+ TCALL(hw, mac.ops.clear_vfta);
+
+ /* Clear statistics registers */
+ TCALL(hw, mac.ops.clear_hw_cntrs);
+
+ NGBE_WRITE_FLUSH(hw);
+
+ /* Setup flow control */
+ ret_val = TCALL(hw, mac.ops.setup_fc);
+
+ /* Clear adapter stopped flag */
+ hw->adapter_stopped = false;
+
+ /* We need to run link autotry after the driver loads */
+ hw->mac.autotry_restart = true;
+
+ return ret_val;
+}
+
+/**
+ * ngbe_enable_rx_dma - Enable the Rx DMA unit on emerald
+ * @hw: pointer to hardware structure
+ * @regval: register value to write to RXCTRL
+ *
+ * Enables the Rx DMA unit for emerald
+ **/
+s32 ngbe_enable_rx_dma(struct ngbe_hw *hw, u32 regval)
+{
+ DEBUGFUNC("\n");
+
+ /*
+ * Workaround for emerald silicon errata when enabling the Rx datapath.
+ * If traffic is incoming before we enable the Rx unit, it could hang
+ * the Rx DMA unit. Therefore, make sure the security engine is
+ * completely disabled prior to enabling the Rx unit.
+ */
+
+ TCALL(hw, mac.ops.disable_sec_rx_path);
+
+ if (regval & NGBE_RDB_PB_CTL_PBEN)
+ TCALL(hw, mac.ops.enable_rx);
+ else
+ TCALL(hw, mac.ops.disable_rx);
+
+ TCALL(hw, mac.ops.enable_sec_rx_path);
+
+ return 0;
+}
+
+/**
+ * ngbe_init_flash_params - Initialize flash params
+ * @hw: pointer to hardware structure
+ *
+ * Initializes the flash parameters ngbe_flash_info within the
+ * ngbe_hw struct in order to set up flash access.
+ **/
+s32 ngbe_init_flash_params(struct ngbe_hw *hw)
+{
+ struct ngbe_flash_info *flash = &hw->flash;
+ u32 eec;
+
+ DEBUGFUNC("\n");
+
+ eec = 0x1000000;
+ flash->semaphore_delay = 10;
+ flash->dword_size = (eec >> 2);
+ flash->address_bits = 24;
+ DEBUGOUT3("FLASH params: size = %d, address bits: %d\n",
+ flash->dword_size,
+ flash->address_bits);
+
+ return 0;
+}
+
+/**
+ * ngbe_read_flash_buffer - Read FLASH dword(s) using
+ * fastest available method
+ *
+ * @hw: pointer to hardware structure
+ * @offset: offset of dword in FLASH to read
+ * @dwords: number of dwords
+ * @data: dword(s) read from the FLASH
+ *
+ * Retrieves 32 bit dword(s) read from FLASH
+ **/
+s32 ngbe_read_flash_buffer(struct ngbe_hw *hw, u32 offset,
+ u32 dwords, u32 *data)
+{
+ s32 status = 0;
+ u32 i;
+
+ DEBUGFUNC("\n");
+
+ TCALL(hw, eeprom.ops.init_params);
+
+ if (!dwords || offset + dwords >= hw->flash.dword_size) {
+ status = NGBE_ERR_INVALID_ARGUMENT;
+ ERROR_REPORT1(NGBE_ERROR_ARGUMENT, "Invalid FLASH arguments");
+ return status;
+ }
+
+ for (i = 0; i < dwords; i++) {
+ wr32(hw, NGBE_SPI_CMD,
+ NGBE_SPI_CMD_ADDR(offset + i) |
+ NGBE_SPI_CMD_CMD(0x0));
+
+ status = po32m(hw, NGBE_SPI_STATUS,
+ NGBE_SPI_STATUS_OPDONE, NGBE_SPI_STATUS_OPDONE,
+ NGBE_SPI_TIMEOUT, 0);
+ if (status) {
+ DEBUGOUT("FLASH read timed out\n");
+ break;
+ }
+
+ /* the data register holds the dword once the read completes */
+ data[i] = rd32(hw, NGBE_SPI_DATA);
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_write_flash_buffer - Write FLASH dword(s) using
+ * fastest available method
+ *
+ * @hw: pointer to hardware structure
+ * @offset: offset of dword in FLASH to write
+ * @dwords: number of dwords
+ * @data: dword(s) to write to the FLASH
+ *
+ **/
+s32 ngbe_write_flash_buffer(struct ngbe_hw *hw, u32 offset,
+ u32 dwords, u32 *data)
+{
+ s32 status = 0;
+ u32 i;
+
+ DEBUGFUNC("\n");
+
+ TCALL(hw, eeprom.ops.init_params);
+
+ if (!dwords || offset + dwords >= hw->flash.dword_size) {
+ status = NGBE_ERR_INVALID_ARGUMENT;
+ ERROR_REPORT1(NGBE_ERROR_ARGUMENT, "Invalid FLASH arguments");
+ return status;
+ }
+
+ for (i = 0; i < dwords; i++) {
+ /* load the dword to write before issuing the command */
+ wr32(hw, NGBE_SPI_DATA, data[i]);
+ wr32(hw, NGBE_SPI_CMD,
+ NGBE_SPI_CMD_ADDR(offset + i) |
+ NGBE_SPI_CMD_CMD(0x1));
+
+ status = po32m(hw, NGBE_SPI_STATUS,
+ NGBE_SPI_STATUS_OPDONE, NGBE_SPI_STATUS_OPDONE,
+ NGBE_SPI_TIMEOUT, 0);
+ if (status != 0) {
+ DEBUGOUT("FLASH write timed out\n");
+ break;
+ }
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_init_eeprom_params - Initialize EEPROM params
+ * @hw: pointer to hardware structure
+ *
+ * Initializes the EEPROM parameters ngbe_eeprom_info within the
+ * ngbe_hw struct in order to set up EEPROM access.
+ **/
+s32 ngbe_init_eeprom_params(struct ngbe_hw *hw)
+{
+ struct ngbe_eeprom_info *eeprom = &hw->eeprom;
+ u16 eeprom_size;
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (eeprom->type == ngbe_eeprom_uninitialized) {
+ eeprom->semaphore_delay = 10;
+ eeprom->type = ngbe_eeprom_none;
+
+ if (!(rd32(hw, NGBE_SPI_STATUS) &
+ NGBE_SPI_STATUS_FLASH_BYPASS)) {
+ eeprom->type = ngbe_flash;
+ eeprom_size = 4096;
+ eeprom->word_size = eeprom_size >> 1;
+
+ DEBUGOUT2("Eeprom params: type = %d, size = %d\n",
+ eeprom->type, eeprom->word_size);
+ }
+ }
+
+ eeprom->sw_region_offset = 0x80;
+
+ return status;
+}
+
+/**
+ * ngbe_read_ee_hostif_data - Read EEPROM word using a host interface cmd
+ * assuming that the semaphore is already obtained.
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to read
+ * @data: word read from the EEPROM
+ *
+ * Reads a 16 bit word from the EEPROM using the hostif.
+ **/
+s32 ngbe_read_ee_hostif_data(struct ngbe_hw *hw, u16 offset,
+ u16 *data)
+{
+ s32 status;
+ struct ngbe_hic_read_shadow_ram buffer;
+
+ DEBUGFUNC("\n");
+ buffer.hdr.req.cmd = FW_READ_SHADOW_RAM_CMD;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = FW_READ_SHADOW_RAM_LEN;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* convert offset from words to bytes */
+ buffer.address = NGBE_CPU_TO_BE32(offset * 2);
+ /* one word */
+ buffer.length = NGBE_CPU_TO_BE16(sizeof(u16));
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT, false);
+
+ if (status)
+ return status;
+
+ if (ngbe_check_mng_access(hw)) {
+ *data = (u16)rd32a(hw, NGBE_MNG_MBOX,
+ FW_NVM_DATA_OFFSET);
+ } else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ return status;
+ }
+
+ return 0;
+}
+
+s32 ngbe_eepromcheck_cap(struct ngbe_hw *hw, u16 offset,
+ u32 *data)
+{
+ u32 tmp;
+ s32 status;
+ struct ngbe_hic_read_shadow_ram buffer;
+
+ DEBUGFUNC("\n");
+ buffer.hdr.req.cmd = FW_EEPROM_CHECK_STATUS;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = 0;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* this command carries no address/length payload */
+ buffer.address = 0;
+ buffer.length = 0;
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT, false);
+
+ if (status)
+ return status;
+ if (ngbe_check_mng_access(hw)) {
+ tmp = rd32a(hw, NGBE_MNG_MBOX, 1);
+ if (tmp == NGBE_CHECKSUM_CAP_ST_PASS)
+ status = 0;
+ else
+ status = NGBE_ERR_EEPROM_CHECKSUM;
+ } else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ return status;
+ }
+
+ return status;
+}
+
+s32 ngbe_phy_signal_set(struct ngbe_hw *hw)
+{
+ s32 status;
+ struct ngbe_hic_read_shadow_ram buffer;
+
+ DEBUGFUNC("\n");
+ buffer.hdr.req.cmd = FW_PHY_SIGNAL;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = 0;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* this command carries no address/length payload */
+ buffer.address = 0;
+ buffer.length = 0;
+
+ status = ngbe_host_interface_pass_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT, false);
+
+ return status;
+}
+
+/**
+ * ngbe_read_ee_hostif - Read EEPROM word using a host interface cmd
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to read
+ * @data: word read from the EEPROM
+ *
+ * Reads a 16 bit word from the EEPROM using the hostif.
+ **/
+s32 ngbe_read_ee_hostif(struct ngbe_hw *hw, u16 offset,
+ u16 *data)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (TCALL(hw, mac.ops.acquire_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH) == 0) {
+ status = ngbe_read_ee_hostif_data(hw, offset, data);
+ TCALL(hw, mac.ops.release_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ } else {
+ status = NGBE_ERR_SWFW_SYNC;
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_read_ee_hostif_buffer - Read EEPROM word(s) using hostif
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to read
+ * @words: number of words
+ * @data: word(s) read from the EEPROM
+ *
+ * Reads a 16 bit word(s) from the EEPROM using the hostif.
+ **/
+s32 ngbe_read_ee_hostif_buffer(struct ngbe_hw *hw,
+ u16 offset, u16 words, u16 *data)
+{
+ struct ngbe_hic_read_shadow_ram buffer;
+ u32 current_word = 0;
+ u16 words_to_read;
+ s32 status;
+ u32 i;
+ u32 value = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Take semaphore for the entire operation. */
+ status = TCALL(hw, mac.ops.acquire_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ if (status) {
+ DEBUGOUT("EEPROM read buffer - semaphore failed\n");
+ return status;
+ }
+ while (words) {
+ if (words > FW_MAX_READ_BUFFER_SIZE / 2)
+ words_to_read = FW_MAX_READ_BUFFER_SIZE / 2;
+ else
+ words_to_read = words;
+
+ buffer.hdr.req.cmd = FW_READ_SHADOW_RAM_CMD;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = FW_READ_SHADOW_RAM_LEN;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* convert offset from words to bytes */
+ buffer.address = NGBE_CPU_TO_BE32((offset + current_word) * 2);
+ buffer.length = NGBE_CPU_TO_BE16(words_to_read * 2);
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT,
+ false);
+
+ if (status) {
+ DEBUGOUT("Host interface command failed\n");
+ goto out;
+ }
+
+ for (i = 0; i < words_to_read; i++) {
+ u32 reg = NGBE_MNG_MBOX + (FW_NVM_DATA_OFFSET << 2) +
+ 2 * i;
+
+ if (ngbe_check_mng_access(hw))
+ value = rd32(hw, reg);
+ else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ goto out;
+ }
+ /* each 32-bit mailbox read holds two 16-bit words */
+ data[current_word] = (u16)(value & 0xffff);
+ current_word++;
+ i++;
+ /* consume the high half when another word was requested */
+ if (i < words_to_read) {
+ value >>= 16;
+ data[current_word] = (u16)(value & 0xffff);
+ current_word++;
+ }
+ }
+ words -= words_to_read;
+ }
+
+out:
+ TCALL(hw, mac.ops.release_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ return status;
+}
+
+
+/**
+ * ngbe_read_ee_hostif_data32 - Read EEPROM dword using a host interface cmd
+ * assuming that the semaphore is already obtained.
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to read
+ * @data: word read from the EEPROM
+ *
+ * Reads a 32 bit word from the EEPROM using the hostif.
+ **/
+s32 ngbe_read_ee_hostif_data32(struct ngbe_hw *hw, u16 offset,
+ u32 *data)
+{
+ s32 status;
+ struct ngbe_hic_read_shadow_ram buffer;
+
+ DEBUGFUNC("\n");
+ buffer.hdr.req.cmd = FW_READ_SHADOW_RAM_CMD;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = FW_READ_SHADOW_RAM_LEN;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* convert offset from words to bytes */
+ buffer.address = NGBE_CPU_TO_BE32(offset * 2);
+ /* one dword */
+ buffer.length = NGBE_CPU_TO_BE16(sizeof(u32));
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT, false);
+ if (status)
+ return status;
+ if (ngbe_check_mng_access(hw)) {
+ *data = (u32)rd32a(hw, NGBE_MNG_MBOX, FW_NVM_DATA_OFFSET);
+ } else {
+ status = NGBE_ERR_MNG_ACCESS_FAILED;
+ return status;
+ }
+ return 0;
+}
+
+/**
+ * ngbe_read_ee_hostif32 - Read EEPROM dword using a host interface cmd
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to read
+ * @data: word read from the EEPROM
+ *
+ * Reads a 32 bit word from the EEPROM using the hostif.
+ **/
+s32 ngbe_read_ee_hostif32(struct ngbe_hw *hw, u16 offset,
+ u32 *data)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (TCALL(hw, mac.ops.acquire_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_FLASH) == 0) {
+ status = ngbe_read_ee_hostif_data32(hw, offset, data);
+ TCALL(hw, mac.ops.release_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ } else {
+ status = NGBE_ERR_SWFW_SYNC;
+ }
+
+ return status;
+}
+
+
+/**
+ * ngbe_write_ee_hostif_data - Write EEPROM word using hostif
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to write
+ * @data: word write to the EEPROM
+ *
+ * Write a 16 bit word to the EEPROM using the hostif.
+ **/
+s32 ngbe_write_ee_hostif_data(struct ngbe_hw *hw, u16 offset,
+ u16 data)
+{
+ s32 status;
+ struct ngbe_hic_write_shadow_ram buffer;
+
+ DEBUGFUNC("\n");
+
+ buffer.hdr.req.cmd = FW_WRITE_SHADOW_RAM_CMD;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = FW_WRITE_SHADOW_RAM_LEN;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* one word */
+ buffer.length = NGBE_CPU_TO_BE16(sizeof(u16));
+ buffer.data = data;
+ buffer.address = NGBE_CPU_TO_BE32(offset * 2);
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT, false);
+
+ return status;
+}
+
+/**
+ * ngbe_write_ee_hostif - Write EEPROM word using hostif
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to write
+ * @data: word write to the EEPROM
+ *
+ * Write a 16 bit word to the EEPROM using the hostif.
+ **/
+s32 ngbe_write_ee_hostif(struct ngbe_hw *hw, u16 offset,
+ u16 data)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (TCALL(hw, mac.ops.acquire_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH) == 0) {
+ status = ngbe_write_ee_hostif_data(hw, offset, data);
+ TCALL(hw, mac.ops.release_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ } else {
+ DEBUGOUT("write ee hostif failed to get semaphore");
+ status = NGBE_ERR_SWFW_SYNC;
+ }
+
+ return status;
+}
+
+/**
+ * ngbe_write_ee_hostif_data32 - Write EEPROM dword using hostif
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to write
+ * @data: dword to write to the EEPROM
+ *
+ * Write a 32 bit word to the EEPROM using the hostif.
+ **/
+s32 ngbe_write_ee_hostif_data32(struct ngbe_hw *hw, u16 offset,
+ u32 data)
+{
+ s32 status;
+ struct ngbe_hic_write_shadow_ram buffer;
+
+ DEBUGFUNC("\n");
+
+ buffer.hdr.req.cmd = FW_WRITE_SHADOW_RAM_CMD;
+ buffer.hdr.req.buf_lenh = 0;
+ buffer.hdr.req.buf_lenl = FW_WRITE_SHADOW_RAM_LEN;
+ buffer.hdr.req.checksum = FW_DEFAULT_CHECKSUM;
+
+ /* one dword */
+ buffer.length = NGBE_CPU_TO_BE16(sizeof(u32));
+ buffer.data = data;
+ buffer.address = NGBE_CPU_TO_BE32(offset * 2);
+
+ status = ngbe_host_interface_command(hw, (u32 *)&buffer,
+ sizeof(buffer),
+ NGBE_HI_COMMAND_TIMEOUT, false);
+
+ return status;
+}
+
+/**
+ * ngbe_write_ee_hostif32 - Write EEPROM dword using hostif
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to write
+ * @data: dword to write to the EEPROM
+ *
+ * Write a 32 bit word to the EEPROM using the hostif.
+ **/
+s32 ngbe_write_ee_hostif32(struct ngbe_hw *hw, u16 offset,
+ u32 data)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("\n");
+
+ if (TCALL(hw, mac.ops.acquire_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH) == 0) {
+ status = ngbe_write_ee_hostif_data32(hw, offset, data);
+ TCALL(hw, mac.ops.release_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ } else {
+ DEBUGOUT("write ee hostif failed to get semaphore");
+ status = NGBE_ERR_SWFW_SYNC;
+ }
+
+ return status;
+}
+
+
+/**
+ * ngbe_write_ee_hostif_buffer - Write EEPROM word(s) using hostif
+ * @hw: pointer to hardware structure
+ * @offset: offset of word in the EEPROM to write
+ * @words: number of words
+ * @data: word(s) write to the EEPROM
+ *
+ * Write a 16 bit word(s) to the EEPROM using the hostif.
+ **/
+s32 ngbe_write_ee_hostif_buffer(struct ngbe_hw *hw,
+ u16 offset, u16 words, u16 *data)
+{
+ s32 status = 0;
+ u16 i = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Take semaphore for the entire operation. */
+ status = TCALL(hw, mac.ops.acquire_swfw_sync,
+ NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ if (status != 0) {
+ DEBUGOUT("EEPROM write buffer - semaphore failed\n");
+ return status;
+ }
+
+ for (i = 0; i < words; i++) {
+ status = ngbe_write_ee_hostif_data(hw, offset + i,
+ data[i]);
+
+ if (status != 0) {
+ DEBUGOUT("Eeprom buffered write failed\n");
+ break;
+ }
+ }
+
+ TCALL(hw, mac.ops.release_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_FLASH);
+ return status;
+}
+
+
+
+/**
+ * ngbe_calc_eeprom_checksum - Calculates and returns the checksum
+ * @hw: pointer to hardware structure
+ *
+ * Returns a negative error code on error, or the 16-bit checksum
+ **/
+s32 ngbe_calc_eeprom_checksum(struct ngbe_hw *hw)
+{
+ u16 *buffer = NULL; /* no caller-supplied buffer in this version */
+ u32 buffer_size = 0;
+
+ u16 *eeprom_ptrs = NULL;
+ u16 *local_buffer;
+ s32 status;
+ u16 checksum = 0;
+ u16 i;
+
+ DEBUGFUNC("\n");
+
+ TCALL(hw, eeprom.ops.init_params);
+
+ if (!buffer) {
+ eeprom_ptrs = (u16 *)vmalloc(NGBE_EEPROM_LAST_WORD *
+ sizeof(u16));
+ if (!eeprom_ptrs)
+ return NGBE_ERR_NO_SPACE;
+ /* Read pointer area */
+ status = ngbe_read_ee_hostif_buffer(hw, 0,
+ NGBE_EEPROM_LAST_WORD,
+ eeprom_ptrs);
+ if (status) {
+ DEBUGOUT("Failed to read EEPROM image\n");
+ vfree(eeprom_ptrs);
+ return status;
+ }
+ local_buffer = eeprom_ptrs;
+ } else {
+ if (buffer_size < NGBE_EEPROM_LAST_WORD)
+ return NGBE_ERR_PARAM;
+ local_buffer = buffer;
+ }
+
+ for (i = 0; i < NGBE_EEPROM_LAST_WORD; i++)
+ if (i != hw->eeprom.sw_region_offset + NGBE_EEPROM_CHECKSUM)
+ checksum += local_buffer[i];
+
+ checksum = (u16)NGBE_EEPROM_SUM - checksum;
+ if (eeprom_ptrs)
+ vfree(eeprom_ptrs);
+
+ return (s32)checksum;
+}
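+
+/*
+ * Sanity check of the arithmetic above: the stored checksum is
+ * NGBE_EEPROM_SUM minus the 16-bit sum of every other word, so adding
+ * up all words including the checksum slot gives back NGBE_EEPROM_SUM
+ * (mod 2^16). This is the invariant the validation path relies on.
+ */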
+
+/**
+ * ngbe_update_eeprom_checksum - Updates the EEPROM checksum and flash
+ * @hw: pointer to hardware structure
+ *
+ * After writing EEPROM to shadow RAM using EEWR register, software calculates
+ * checksum and updates the EEPROM and instructs the hardware to update
+ * the flash.
+ **/
+s32 ngbe_update_eeprom_checksum(struct ngbe_hw *hw)
+{
+ s32 status;
+ u16 checksum = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Read the first word from the EEPROM. If this times out or fails, do
+ * not continue or we could be in for a very long wait while every
+ * EEPROM read fails
+ */
+ status = ngbe_read_ee_hostif(hw, 0, &checksum);
+ if (status) {
+ DEBUGOUT("EEPROM read failed\n");
+ return status;
+ }
+
+ status = ngbe_calc_eeprom_checksum(hw);
+ if (status < 0)
+ return status;
+
+ checksum = (u16)(status & 0xffff);
+
+ status = ngbe_write_ee_hostif(hw, hw->eeprom.sw_region_offset +
+ NGBE_EEPROM_CHECKSUM, checksum);
+
+ return status;
+}
+
+/**
+ * ngbe_validate_eeprom_checksum - Validate EEPROM checksum
+ * @hw: pointer to hardware structure
+ * @checksum_val: calculated checksum
+ *
+ * Performs checksum calculation and validates the EEPROM checksum. If the
+ * caller does not need checksum_val, the value can be NULL.
+ **/
+s32 ngbe_validate_eeprom_checksum(struct ngbe_hw *hw,
+ u16 *checksum_val)
+{
+ s32 status;
+ u16 checksum;
+ u16 read_checksum = 0;
+
+ DEBUGFUNC("\n");
+
+ /* Read the first word from the EEPROM. If this times out or fails, do
+ * not continue or we could be in for a very long wait while every
+ * EEPROM read fails
+ */
+ status = TCALL(hw, eeprom.ops.read, 0, &checksum);
+ if (status) {
+ DEBUGOUT("EEPROM read failed\n");
+ return status;
+ }
+
+ status = TCALL(hw, eeprom.ops.calc_checksum);
+ if (status < 0)
+ return status;
+
+ checksum = (u16)(status & 0xffff);
+
+ status = ngbe_read_ee_hostif(hw, hw->eeprom.sw_region_offset +
+ NGBE_EEPROM_CHECKSUM,
+ &read_checksum);
+ if (status)
+ return status;
+
+ /* Verify read checksum from EEPROM is the same as
+ * calculated checksum
+ */
+ if (read_checksum != checksum) {
+ status = NGBE_ERR_EEPROM_CHECKSUM;
+ ERROR_REPORT1(NGBE_ERROR_INVALID_STATE,
+ "Invalid EEPROM checksum\n");
+ }
+
+ /* If the user cares, return the calculated checksum */
+ if (checksum_val)
+ *checksum_val = checksum;
+
+ return status;
+}
+
+/**
+ * ngbe_check_mac_link - Determine link and speed status
+ * @hw: pointer to hardware structure
+ * @speed: pointer to link speed
+ * @link_up: true when link is up
+ * @link_up_wait_to_complete: bool used to wait for link up or not
+ *
+ * Reads the links register to determine if link is up and the current speed
+ **/
+s32 ngbe_check_mac_link(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete)
+{
+ u32 i;
+ u16 value = 0;
+ s32 status = 0;
+ u16 speed_sta = 0;
+
+ DEBUGFUNC("ngbe_check_mac_link");
+
+ if (link_up_wait_to_complete) {
+ for (i = 0; i < NGBE_LINK_UP_TIME; i++) {
+ status = TCALL(hw, phy.ops.read_reg, 0x1A, 0xA43, &value);
+ if (!status && (value & 0x4)) {
+ *link_up = true;
+ break;
+ } else {
+ *link_up = false;
+ }
+ msleep(100);
+ }
+ } else {
+ status = TCALL(hw, phy.ops.read_reg, 0x1A, 0xA43, &value);
+ if (!status && (value & 0x4)) {
+ *link_up = true;
+ } else {
+ *link_up = false;
+ }
+ }
+
+ speed_sta = value & 0x38;
+ if (*link_up) {
+ if (speed_sta == 0x28) {
+ *speed = NGBE_LINK_SPEED_1GB_FULL;
+ } else if (speed_sta == 0x18) {
+ *speed = NGBE_LINK_SPEED_100_FULL;
+ } else if (speed_sta == 0x8) {
+ *speed = NGBE_LINK_SPEED_10_FULL;
+ }
+ } else {
+ *speed = NGBE_LINK_SPEED_UNKNOWN;
+ }
+
+ if (*speed == NGBE_LINK_SPEED_1GB_FULL) {
+ status = TCALL(hw, phy.ops.read_reg, 0xA, 0x0, &value);
+ if (!status && !(value & 0x2000)) {
+ *link_up = false;
+ }
+ }
+ return status;
+}
+
+s32 ngbe_check_mac_link_mdi(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete)
+{
+ u32 i;
+ u16 value = 0;
+ s32 status = 0;
+ u16 speed_sta = 0;
+
+ DEBUGFUNC("ngbe_check_mac_link_mdi");
+
+ if (hw->phy.type == ngbe_phy_m88e1512) {
+ /* select page 0 */
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ } else {
+ /* select page 1 */
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ }
+ status = TCALL(hw, phy.ops.read_reg_mdi, 17, 0, &value);
+ if (link_up_wait_to_complete) {
+ for (i = 0; i < NGBE_LINK_UP_TIME; i++) {
+ status = TCALL(hw, phy.ops.read_reg_mdi, 17, 0, &value);
+ if (value & 0x400) {
+ *link_up = true;
+ break;
+ } else {
+ *link_up = false;
+ }
+ msleep(100);
+ }
+ } else {
+ status = TCALL(hw, phy.ops.read_reg_mdi, 17, 0, &value);
+ if (value & 0x400) {
+ *link_up = true;
+ } else {
+ *link_up = false;
+ }
+ }
+
+ speed_sta = value & 0xC000;
+ if (*link_up) {
+ if (speed_sta == 0x8000) {
+ *speed = NGBE_LINK_SPEED_1GB_FULL;
+ } else if (speed_sta == 0x4000) {
+ *speed = NGBE_LINK_SPEED_100_FULL;
+ } else if (speed_sta == 0x0000) {
+ *speed = NGBE_LINK_SPEED_10_FULL;
+ }
+ } else {
+ *speed = NGBE_LINK_SPEED_UNKNOWN;
+ }
+
+ return status;
+}
+
+s32 ngbe_check_mac_link_yt8521s(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete)
+{
+ u32 i;
+ u16 value = 0;
+ s32 status = 0;
+ u16 speed_sta = 0;
+
+ DEBUGFUNC("ngbe_check_mac_link_yt");
+
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x11, 0, &value);
+ if (link_up_wait_to_complete) {
+ for (i = 0; i < NGBE_LINK_UP_TIME; i++) {
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x11, 0, &value);
+ if (value & 0x400) {
+ *link_up = true;
+ break;
+ } else {
+ *link_up = false;
+ }
+ msleep(100);
+ }
+ } else {
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x11, 0, &value);
+ if (value & 0x400) {
+ *link_up = true;
+ } else {
+ *link_up = false;
+ ngbe_phy_read_reg_mdi(hw, 0x11, 0, &value);
+ if (value & 0x400) {
+ *link_up = true;
+				/* yt8521s copper (UTP) side link is up */
+ } else {
+ *link_up = false;
+				/* yt8521s copper (UTP) side link is down */
+ }
+ }
+ }
+
+ speed_sta = value & 0xC000;
+ if (*link_up) {
+ if (speed_sta == 0x8000) {
+ *speed = NGBE_LINK_SPEED_1GB_FULL;
+ } else if (speed_sta == 0x4000) {
+ *speed = NGBE_LINK_SPEED_100_FULL;
+ } else if (speed_sta == 0x0000) {
+ *speed = NGBE_LINK_SPEED_10_FULL;
+ }
+ } else
+ *speed = NGBE_LINK_SPEED_UNKNOWN;
+ return status;
+}
+
+s32 ngbe_check_mac_link_zte(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete)
+{
+ u32 i;
+ u16 value = 0;
+ s32 status = 0;
+ u16 speed_sta = 0;
+
+ DEBUGFUNC("ngbe_check_mac_link_zte");
+
+ /* PHY status register */
+ status = TCALL(hw, phy.ops.read_reg_mdi, 0x1a, 0, &value);
+
+ if (link_up_wait_to_complete) {
+ for (i = 0; i < NGBE_LINK_UP_TIME; i++) {
+ status = TCALL(hw, phy.ops.read_reg_mdi, 0x1a, 0, &value);
+			/* bit 6 -> 0x0040 */
+ if (value & 0x40) {
+ *link_up = true;
+ break;
+ } else {
+ *link_up = false;
+ }
+ msleep(100);
+ }
+ } else {
+ status = TCALL(hw, phy.ops.read_reg_mdi, 0x1a, 0, &value);
+ if (value & 0x40) {
+ *link_up = true;
+ } else {
+ *link_up = false;
+ }
+ }
+
+	/* speed values tested below are 0x0200/0x0100, so mask bits 9:8 */
+	speed_sta = value & 0x0300;
+ if (*link_up) {
+ if (speed_sta == 0x0200) {
+ *speed = NGBE_LINK_SPEED_1GB_FULL;
+ } else if (speed_sta == 0x0100) {
+ *speed = NGBE_LINK_SPEED_100_FULL;
+ } else if (speed_sta == 0x0000) {
+ *speed = NGBE_LINK_SPEED_10_FULL;
+ }
+ } else {
+ *speed = NGBE_LINK_SPEED_UNKNOWN;
+ }
+ return status;
+}
+
+/**
+ * ngbe_setup_eee - Enable/disable EEE support
+ * @hw: pointer to the HW structure
+ * @enable_eee: boolean flag to enable EEE
+ *
+ * Enable/disable EEE based on enable_eee flag.
+ * Auto-negotiation must be started after BASE-T EEE bits in PHY register 7.3C
+ * are modified.
+ *
+ **/
+s32 ngbe_setup_eee(struct ngbe_hw *hw, bool enable_eee)
+{
+	/* EEE is not supported; accept the request and do nothing */
+	UNREFERENCED_PARAMETER(hw);
+	UNREFERENCED_PARAMETER(enable_eee);
+	DEBUGFUNC("ngbe_setup_eee");
+
+ return 0;
+}
+
+s32 ngbe_init_ops_common(struct ngbe_hw *hw)
+{
+ struct ngbe_mac_info *mac = &hw->mac;
+ struct ngbe_eeprom_info *eeprom = &hw->eeprom;
+ struct ngbe_flash_info *flash = &hw->flash;
+
+ /* MAC */
+ mac->ops.init_hw = ngbe_init_hw;
+ mac->ops.clear_hw_cntrs = ngbe_clear_hw_cntrs;
+ mac->ops.get_mac_addr = ngbe_get_mac_addr;
+ mac->ops.stop_adapter = ngbe_stop_adapter;
+ mac->ops.get_bus_info = ngbe_get_bus_info;
+ mac->ops.set_lan_id = ngbe_set_lan_id_multi_port_pcie;
+ mac->ops.acquire_swfw_sync = ngbe_acquire_swfw_sync;
+ mac->ops.release_swfw_sync = ngbe_release_swfw_sync;
+ mac->ops.reset_hw = ngbe_reset_hw;
+ mac->ops.get_media_type = ngbe_get_media_type;
+ mac->ops.disable_sec_rx_path = ngbe_disable_sec_rx_path;
+ mac->ops.enable_sec_rx_path = ngbe_enable_sec_rx_path;
+ mac->ops.enable_rx_dma = ngbe_enable_rx_dma;
+ mac->ops.start_hw = ngbe_start_hw;
+ mac->ops.get_device_caps = ngbe_get_device_caps;
+ mac->ops.setup_eee = ngbe_setup_eee;
+
+ /* LEDs */
+ mac->ops.led_on = ngbe_led_on;
+ mac->ops.led_off = ngbe_led_off;
+
+ /* RAR, Multicast, VLAN */
+ mac->ops.set_rar = ngbe_set_rar;
+ mac->ops.clear_rar = ngbe_clear_rar;
+ mac->ops.init_rx_addrs = ngbe_init_rx_addrs;
+ mac->ops.update_uc_addr_list = ngbe_update_uc_addr_list;
+ mac->ops.update_mc_addr_list = ngbe_update_mc_addr_list;
+ mac->ops.enable_mc = ngbe_enable_mc;
+ mac->ops.disable_mc = ngbe_disable_mc;
+ mac->ops.enable_rx = ngbe_enable_rx;
+ mac->ops.disable_rx = ngbe_disable_rx;
+ mac->ops.set_vmdq_san_mac = ngbe_set_vmdq_san_mac;
+ mac->ops.insert_mac_addr = ngbe_insert_mac_addr;
+ mac->rar_highwater = 1;
+ mac->ops.set_vfta = ngbe_set_vfta;
+ mac->ops.set_vlvf = ngbe_set_vlvf;
+ mac->ops.clear_vfta = ngbe_clear_vfta;
+ mac->ops.init_uta_tables = ngbe_init_uta_tables;
+ mac->ops.set_mac_anti_spoofing = ngbe_set_mac_anti_spoofing;
+ mac->ops.set_vlan_anti_spoofing = ngbe_set_vlan_anti_spoofing;
+ mac->ops.set_ethertype_anti_spoofing =
+ ngbe_set_ethertype_anti_spoofing;
+
+ /* Flow Control */
+ mac->ops.fc_enable = ngbe_fc_enable;
+ mac->ops.setup_fc = ngbe_setup_fc;
+
+ /* Link */
+ mac->ops.get_link_capabilities = ngbe_get_link_capabilities;
+ mac->ops.check_link = ngbe_check_mac_link;
+ mac->ops.setup_rxpba = ngbe_set_rxpba;
+
+ mac->mcft_size = NGBE_SP_MC_TBL_SIZE;
+ mac->vft_size = NGBE_SP_VFT_TBL_SIZE;
+ mac->num_rar_entries = NGBE_SP_RAR_ENTRIES;
+ mac->rx_pb_size = NGBE_SP_RX_PB_SIZE;
+ mac->max_rx_queues = NGBE_SP_MAX_RX_QUEUES;
+ mac->max_tx_queues = NGBE_SP_MAX_TX_QUEUES;
+ mac->max_msix_vectors = ngbe_get_pcie_msix_count(hw);
+
+ mac->arc_subsystem_valid = (rd32(hw, NGBE_MIS_ST) &
+ NGBE_MIS_ST_MNG_INIT_DN) ? true : false;
+
+ hw->mbx.ops.init_params = ngbe_init_mbx_params_pf;
+
+ /* EEPROM */
+ eeprom->ops.init_params = ngbe_init_eeprom_params;
+ eeprom->ops.calc_checksum = ngbe_calc_eeprom_checksum;
+ eeprom->ops.read = ngbe_read_ee_hostif;
+ eeprom->ops.read_buffer = ngbe_read_ee_hostif_buffer;
+ eeprom->ops.read32 = ngbe_read_ee_hostif32;
+ eeprom->ops.write = ngbe_write_ee_hostif;
+ eeprom->ops.write_buffer = ngbe_write_ee_hostif_buffer;
+ eeprom->ops.update_checksum = ngbe_update_eeprom_checksum;
+ eeprom->ops.validate_checksum = ngbe_validate_eeprom_checksum;
+ eeprom->ops.eeprom_chksum_cap_st = ngbe_eepromcheck_cap;
+ eeprom->ops.phy_signal_set = ngbe_phy_signal_set;
+
+ /* FLASH */
+ flash->ops.init_params = ngbe_init_flash_params;
+ flash->ops.read_buffer = ngbe_read_flash_buffer;
+ flash->ops.write_buffer = ngbe_write_flash_buffer;
+
+ /* Manageability interface */
+ mac->ops.set_fw_drv_ver = ngbe_set_fw_drv_ver;
+
+ mac->ops.get_thermal_sensor_data =
+ ngbe_get_thermal_sensor_data;
+ mac->ops.init_thermal_sensor_thresh =
+ ngbe_init_thermal_sensor_thresh;
+
+ return NGBE_OK;
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_hw.h b/drivers/net/ethernet/netswift/ngbe/ngbe_hw.h
new file mode 100644
index 0000000000000..d7f06643f258a
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_hw.h
@@ -0,0 +1,280 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+#ifndef _NGBE_HW_H_
+#define _NGBE_HW_H_
+
+#define NGBE_EMC_INTERNAL_DATA 0x00
+#define NGBE_EMC_INTERNAL_THERM_LIMIT 0x20
+#define NGBE_EMC_DIODE1_DATA 0x01
+#define NGBE_EMC_DIODE1_THERM_LIMIT 0x19
+#define NGBE_EMC_DIODE2_DATA 0x23
+#define NGBE_EMC_DIODE2_THERM_LIMIT 0x1A
+#define NGBE_EMC_DIODE3_DATA 0x2A
+#define NGBE_EMC_DIODE3_THERM_LIMIT 0x30
+
+#define SPI_CLK_DIV 3
+
+#define SPI_CMD_ERASE_CHIP 4 // SPI erase chip command
+#define SPI_CMD_ERASE_SECTOR 3 // SPI erase sector command
+#define SPI_CMD_WRITE_DWORD 0 // SPI write a dword command
+#define SPI_CMD_READ_DWORD 1 // SPI read a dword command
+#define SPI_CMD_USER_CMD 5 // SPI user command
+
+#define SPI_CLK_CMD_OFFSET 28 // SPI command field offset in Command register
+#define SPI_CLK_DIV_OFFSET 25 // SPI clock divide field offset in Command register
+
+#define SPI_TIME_OUT_VALUE 10000
+#define SPI_SECTOR_SIZE (4 * 1024) // FLASH sector size is 4KB
+#define SPI_H_CMD_REG_ADDR 0x10104 // SPI Command register address
+#define SPI_H_DAT_REG_ADDR 0x10108 // SPI Data register address
+#define SPI_H_STA_REG_ADDR 0x1010c // SPI Status register address
+#define SPI_H_USR_CMD_REG_ADDR 0x10110 // SPI User Command register address
+#define SPI_CMD_CFG1_ADDR 0x10118 // Flash command configuration register 1
+#define MISC_RST_REG_ADDR 0x1000c // Misc reset register address
+#define MGR_FLASH_RELOAD_REG_ADDR 0x101a0 // MGR reload flash read
+
+#define MAC_ADDR0_WORD0_OFFSET_1G 0x006000c // MAC Address for LAN0, stored in external FLASH
+#define MAC_ADDR0_WORD1_OFFSET_1G 0x0060014
+#define MAC_ADDR1_WORD0_OFFSET_1G 0x006800c // MAC Address for LAN1, stored in external FLASH
+#define MAC_ADDR1_WORD1_OFFSET_1G 0x0068014
+#define MAC_ADDR2_WORD0_OFFSET_1G 0x007000c // MAC Address for LAN2, stored in external FLASH
+#define MAC_ADDR2_WORD1_OFFSET_1G 0x0070014
+#define MAC_ADDR3_WORD0_OFFSET_1G 0x007800c // MAC Address for LAN3, stored in external FLASH
+#define MAC_ADDR3_WORD1_OFFSET_1G 0x0078014
+#define PRODUCT_SERIAL_NUM_OFFSET_1G 0x00f0000 // Product Serial Number, stored in external FLASH last sector
+
+struct ngbe_hic_read_cab {
+ union ngbe_hic_hdr2 hdr;
+ union {
+ u8 d8[252];
+ u16 d16[126];
+ u32 d32[63];
+ } dbuf;
+};
+
+/**
+ * Packet Type decoding
+ **/
+/* ngbe_dec_ptype.mac: outer mac */
+enum ngbe_dec_ptype_mac {
+ NGBE_DEC_PTYPE_MAC_IP = 0,
+ NGBE_DEC_PTYPE_MAC_L2 = 2,
+ NGBE_DEC_PTYPE_MAC_FCOE = 3,
+};
+
+/* ngbe_dec_ptype.[e]ip: outer&encaped ip */
+#define NGBE_DEC_PTYPE_IP_FRAG (0x4)
+enum ngbe_dec_ptype_ip {
+ NGBE_DEC_PTYPE_IP_NONE = 0,
+ NGBE_DEC_PTYPE_IP_IPV4 = 1,
+ NGBE_DEC_PTYPE_IP_IPV6 = 2,
+ NGBE_DEC_PTYPE_IP_FGV4 =
+ (NGBE_DEC_PTYPE_IP_FRAG | NGBE_DEC_PTYPE_IP_IPV4),
+ NGBE_DEC_PTYPE_IP_FGV6 =
+ (NGBE_DEC_PTYPE_IP_FRAG | NGBE_DEC_PTYPE_IP_IPV6),
+};
+
+/* ngbe_dec_ptype.etype: encaped type */
+enum ngbe_dec_ptype_etype {
+ NGBE_DEC_PTYPE_ETYPE_NONE = 0,
+ NGBE_DEC_PTYPE_ETYPE_IPIP = 1, /* IP+IP */
+ NGBE_DEC_PTYPE_ETYPE_IG = 2, /* IP+GRE */
+ NGBE_DEC_PTYPE_ETYPE_IGM = 3, /* IP+GRE+MAC */
+ NGBE_DEC_PTYPE_ETYPE_IGMV = 4, /* IP+GRE+MAC+VLAN */
+};
+
+/* ngbe_dec_ptype.proto: payload proto */
+enum ngbe_dec_ptype_prot {
+ NGBE_DEC_PTYPE_PROT_NONE = 0,
+ NGBE_DEC_PTYPE_PROT_UDP = 1,
+ NGBE_DEC_PTYPE_PROT_TCP = 2,
+ NGBE_DEC_PTYPE_PROT_SCTP = 3,
+ NGBE_DEC_PTYPE_PROT_ICMP = 4,
+ NGBE_DEC_PTYPE_PROT_TS = 5, /* time sync */
+};
+
+/* ngbe_dec_ptype.layer: payload layer */
+enum ngbe_dec_ptype_layer {
+ NGBE_DEC_PTYPE_LAYER_NONE = 0,
+ NGBE_DEC_PTYPE_LAYER_PAY2 = 1,
+ NGBE_DEC_PTYPE_LAYER_PAY3 = 2,
+ NGBE_DEC_PTYPE_LAYER_PAY4 = 3,
+};
+
+struct ngbe_dec_ptype {
+ u32 ptype:8;
+ u32 known:1;
+ u32 mac:2; /* outer mac */
+	u32 ip:3; /* outer ip */
+ u32 etype:3; /* encaped type */
+ u32 eip:3; /* encaped ip */
+ u32 prot:4; /* payload proto */
+ u32 layer:3; /* payload layer */
+};
+typedef struct ngbe_dec_ptype ngbe_dptype;
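+
+/*
+ * Decoding sketch (illustrative, not part of this patch): ngbe_decode_ptype()
+ * in ngbe_main.c expands the 8-bit hardware packet type into this packed
+ * struct so Rx paths can branch on individual fields, e.g.:
+ *
+ *	ngbe_dptype dptype = ngbe_decode_ptype(ptype);
+ *
+ *	if (dptype.prot == NGBE_DEC_PTYPE_PROT_TCP &&
+ *	    dptype.ip == NGBE_DEC_PTYPE_IP_IPV4)
+ *		handle_tcp_v4(skb);	(hypothetical helper)
+ */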
+
+u16 ngbe_get_pcie_msix_count(struct ngbe_hw *hw);
+s32 ngbe_init_hw(struct ngbe_hw *hw);
+s32 ngbe_start_hw(struct ngbe_hw *hw);
+s32 ngbe_clear_hw_cntrs(struct ngbe_hw *hw);
+s32 ngbe_read_pba_string(struct ngbe_hw *hw, u8 *pba_num,
+ u32 pba_num_size);
+s32 ngbe_get_mac_addr(struct ngbe_hw *hw, u8 *mac_addr);
+s32 ngbe_get_bus_info(struct ngbe_hw *hw);
+void ngbe_set_pci_config_data(struct ngbe_hw *hw, u16 link_status);
+void ngbe_set_lan_id_multi_port_pcie(struct ngbe_hw *hw);
+s32 ngbe_stop_adapter(struct ngbe_hw *hw);
+
+s32 ngbe_led_on(struct ngbe_hw *hw, u32 index);
+s32 ngbe_led_off(struct ngbe_hw *hw, u32 index);
+
+s32 ngbe_set_rar(struct ngbe_hw *hw, u32 index, u8 *addr, u64 pools,
+ u32 enable_addr);
+s32 ngbe_clear_rar(struct ngbe_hw *hw, u32 index);
+s32 ngbe_init_rx_addrs(struct ngbe_hw *hw);
+s32 ngbe_update_mc_addr_list(struct ngbe_hw *hw, u8 *mc_addr_list,
+ u32 mc_addr_count,
+ ngbe_mc_addr_itr func, bool clear);
+s32 ngbe_update_uc_addr_list(struct ngbe_hw *hw, u8 *addr_list,
+ u32 addr_count, ngbe_mc_addr_itr func);
+s32 ngbe_enable_mc(struct ngbe_hw *hw);
+s32 ngbe_disable_mc(struct ngbe_hw *hw);
+s32 ngbe_disable_sec_rx_path(struct ngbe_hw *hw);
+s32 ngbe_enable_sec_rx_path(struct ngbe_hw *hw);
+
+s32 ngbe_fc_enable(struct ngbe_hw *hw);
+void ngbe_fc_autoneg(struct ngbe_hw *hw);
+s32 ngbe_setup_fc(struct ngbe_hw *hw);
+
+s32 ngbe_validate_mac_addr(u8 *mac_addr);
+s32 ngbe_acquire_swfw_sync(struct ngbe_hw *hw, u32 mask);
+void ngbe_release_swfw_sync(struct ngbe_hw *hw, u32 mask);
+s32 ngbe_disable_pcie_master(struct ngbe_hw *hw);
+
+s32 ngbe_set_vmdq(struct ngbe_hw *hw, u32 rar, u32 vmdq);
+s32 ngbe_set_vmdq_san_mac(struct ngbe_hw *hw, u32 vmdq);
+s32 ngbe_clear_vmdq(struct ngbe_hw *hw, u32 rar, u32 vmdq);
+s32 ngbe_insert_mac_addr(struct ngbe_hw *hw, u8 *addr, u32 vmdq);
+s32 ngbe_init_uta_tables(struct ngbe_hw *hw);
+s32 ngbe_set_vfta(struct ngbe_hw *hw, u32 vlan,
+ u32 vind, bool vlan_on);
+s32 ngbe_set_vlvf(struct ngbe_hw *hw, u32 vlan, u32 vind,
+ bool vlan_on, bool *vfta_changed);
+s32 ngbe_clear_vfta(struct ngbe_hw *hw);
+s32 ngbe_find_vlvf_slot(struct ngbe_hw *hw, u32 vlan);
+
+void ngbe_set_mac_anti_spoofing(struct ngbe_hw *hw, bool enable, int pf);
+void ngbe_set_vlan_anti_spoofing(struct ngbe_hw *hw, bool enable, int vf);
+void ngbe_set_ethertype_anti_spoofing(struct ngbe_hw *hw,
+ bool enable, int vf);
+s32 ngbe_get_device_caps(struct ngbe_hw *hw, u16 *device_caps);
+void ngbe_set_rxpba(struct ngbe_hw *hw, int num_pb, u32 headroom,
+ int strategy);
+s32 ngbe_set_fw_drv_ver(struct ngbe_hw *hw, u8 maj, u8 min,
+ u8 build, u8 ver);
+s32 ngbe_reset_hostif(struct ngbe_hw *hw);
+u8 ngbe_calculate_checksum(u8 *buffer, u32 length);
+s32 ngbe_host_interface_command(struct ngbe_hw *hw, u32 *buffer,
+ u32 length, u32 timeout, bool return_data);
+
+void ngbe_clear_tx_pending(struct ngbe_hw *hw);
+void ngbe_stop_mac_link_on_d3(struct ngbe_hw *hw);
+bool ngbe_mng_present(struct ngbe_hw *hw);
+bool ngbe_check_mng_access(struct ngbe_hw *hw);
+
+s32 ngbe_get_thermal_sensor_data(struct ngbe_hw *hw);
+s32 ngbe_init_thermal_sensor_thresh(struct ngbe_hw *hw);
+void ngbe_enable_rx(struct ngbe_hw *hw);
+void ngbe_disable_rx(struct ngbe_hw *hw);
+s32 ngbe_setup_mac_link_multispeed_fiber(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete);
+int ngbe_check_flash_load(struct ngbe_hw *hw, u32 check_bit);
+
+/* @ngbe_api.h */
+void ngbe_atr_compute_perfect_hash(union ngbe_atr_input *input,
+ union ngbe_atr_input *mask);
+u32 ngbe_atr_compute_sig_hash(union ngbe_atr_hash_dword input,
+ union ngbe_atr_hash_dword common);
+
+s32 ngbe_get_link_capabilities(struct ngbe_hw *hw,
+ u32 *speed, bool *autoneg);
+enum ngbe_media_type ngbe_get_media_type(struct ngbe_hw *hw);
+void ngbe_disable_tx_laser_multispeed_fiber(struct ngbe_hw *hw);
+void ngbe_enable_tx_laser_multispeed_fiber(struct ngbe_hw *hw);
+void ngbe_flap_tx_laser_multispeed_fiber(struct ngbe_hw *hw);
+void ngbe_set_hard_rate_select_speed(struct ngbe_hw *hw,
+ u32 speed);
+s32 ngbe_setup_mac_link(struct ngbe_hw *hw, u32 speed,
+ bool autoneg_wait_to_complete);
+void ngbe_init_mac_link_ops(struct ngbe_hw *hw);
+s32 ngbe_reset_hw(struct ngbe_hw *hw);
+s32 ngbe_identify_phy(struct ngbe_hw *hw);
+s32 ngbe_init_ops_common(struct ngbe_hw *hw);
+s32 ngbe_enable_rx_dma(struct ngbe_hw *hw, u32 regval);
+s32 ngbe_init_ops(struct ngbe_hw *hw);
+s32 ngbe_setup_eee(struct ngbe_hw *hw, bool enable_eee);
+
+s32 ngbe_init_flash_params(struct ngbe_hw *hw);
+s32 ngbe_read_flash_buffer(struct ngbe_hw *hw, u32 offset,
+ u32 dwords, u32 *data);
+s32 ngbe_write_flash_buffer(struct ngbe_hw *hw, u32 offset,
+ u32 dwords, u32 *data);
+
+s32 ngbe_read_eeprom(struct ngbe_hw *hw,
+ u16 offset, u16 *data);
+s32 ngbe_read_eeprom_buffer(struct ngbe_hw *hw, u16 offset,
+ u16 words, u16 *data);
+s32 ngbe_init_eeprom_params(struct ngbe_hw *hw);
+s32 ngbe_update_eeprom_checksum(struct ngbe_hw *hw);
+s32 ngbe_calc_eeprom_checksum(struct ngbe_hw *hw);
+s32 ngbe_validate_eeprom_checksum(struct ngbe_hw *hw,
+ u16 *checksum_val);
+s32 ngbe_upgrade_flash(struct ngbe_hw *hw, u32 region,
+ const u8 *data, u32 size);
+s32 ngbe_write_ee_hostif_buffer(struct ngbe_hw *hw,
+ u16 offset, u16 words, u16 *data);
+s32 ngbe_write_ee_hostif(struct ngbe_hw *hw, u16 offset,
+ u16 data);
+s32 ngbe_write_ee_hostif32(struct ngbe_hw *hw, u16 offset,
+ u32 data);
+
+s32 ngbe_read_ee_hostif_buffer(struct ngbe_hw *hw,
+ u16 offset, u16 words, u16 *data);
+s32 ngbe_read_ee_hostif(struct ngbe_hw *hw, u16 offset, u16 *data);
+
+s32 ngbe_read_ee_hostif32(struct ngbe_hw *hw, u16 offset, u32 *data);
+
+u32 ngbe_rd32_epcs(struct ngbe_hw *hw, u32 addr);
+void ngbe_wr32_epcs(struct ngbe_hw *hw, u32 addr, u32 data);
+void ngbe_wr32_ephy(struct ngbe_hw *hw, u32 addr, u32 data);
+s32 ngbe_upgrade_flash_hostif(struct ngbe_hw *hw, u32 region,
+ const u8 *data, u32 size);
+
+s32 ngbe_check_mac_link_zte(struct ngbe_hw *hw,
+ u32 *speed,
+ bool *link_up,
+ bool link_up_wait_to_complete);
+
+s32 ngbe_eepromcheck_cap(struct ngbe_hw *hw, u16 offset,
+ u32 *data);
+s32 ngbe_phy_signal_set(struct ngbe_hw *hw);
+
+#endif /* _NGBE_HW_H_ */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_lib.c b/drivers/net/ethernet/netswift/ngbe/ngbe_lib.c
new file mode 100644
index 0000000000000..200fc34e9fa89
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_lib.c
@@ -0,0 +1,701 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include "ngbe.h"
+#include "ngbe_sriov.h"
+
+/**
+ * ngbe_cache_ring_vmdq - Descriptor ring to register mapping for VMDq
+ * @adapter: board private structure to initialize
+ *
+ * Cache the descriptor ring offsets for VMDq to the assigned rings. It
+ * will also try to cache the proper offsets if RSS/FCoE/SRIOV are enabled along
+ * with VMDq.
+ *
+ **/
+static bool ngbe_cache_ring_vmdq(struct ngbe_adapter *adapter)
+{
+ struct ngbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
+ int i;
+ u16 reg_idx;
+
+ /* only proceed if VMDq is enabled */
+ if (!(adapter->flags & NGBE_FLAG_VMDQ_ENABLED))
+ return false;
+
+ /* start at VMDq register offset for SR-IOV enabled setups */
+ reg_idx = vmdq->offset;
+
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+		/* map each Rx ring to consecutive registers from the pool offset */
+ adapter->rx_ring[i]->reg_idx = reg_idx + i;
+ }
+
+ reg_idx = vmdq->offset;
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+		/* map each Tx ring to consecutive registers from the pool offset */
+ adapter->tx_ring[i]->reg_idx = reg_idx + i;
+ }
+
+ return true;
+}
+
+/**
+ * ngbe_cache_ring_rss - Descriptor ring to register mapping for RSS
+ * @adapter: board private structure to initialize
+ *
+ * Cache the descriptor ring offsets for RSS; ring i maps to register index i.
+ *
+ **/
+static bool ngbe_cache_ring_rss(struct ngbe_adapter *adapter)
+{
+ u16 i;
+
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ adapter->rx_ring[i]->reg_idx = i;
+
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ adapter->tx_ring[i]->reg_idx = i;
+
+ return true;
+}
+
+/**
+ * ngbe_cache_ring_register - Descriptor ring to register mapping
+ * @adapter: board private structure to initialize
+ *
+ * Once we know the feature-set enabled for the device, we'll cache
+ * the register offset the descriptor ring is assigned to.
+ *
+ * Note, the order the various feature calls is important. It must start with
+ * the "most" features enabled at the same time, then trickle down to the
+ * least amount of features turned on at once.
+ **/
+static void ngbe_cache_ring_register(struct ngbe_adapter *adapter)
+{
+ if (ngbe_cache_ring_vmdq(adapter))
+ return;
+
+ ngbe_cache_ring_rss(adapter);
+}
+
+#define NGBE_RSS_64Q_MASK 0x3F
+#define NGBE_RSS_16Q_MASK 0xF
+#define NGBE_RSS_8Q_MASK 0x7
+#define NGBE_RSS_4Q_MASK 0x3
+#define NGBE_RSS_2Q_MASK 0x1
+#define NGBE_RSS_DISABLED_MASK 0x0
+
+/**
+ * ngbe_set_vmdq_queues: Allocate queues for VMDq devices
+ * @adapter: board private structure to initialize
+ *
+ * When VMDq (Virtual Machine Device Queues) is enabled, allocate queues
+ * and VM pools where appropriate. If RSS is available, then also try and
+ * enable RSS and map accordingly.
+ *
+ **/
+static bool ngbe_set_vmdq_queues(struct ngbe_adapter *adapter)
+{
+ u16 vmdq_i = adapter->ring_feature[RING_F_VMDQ].limit;
+ u16 vmdq_m = 0;
+ u16 rss_i = adapter->ring_feature[RING_F_RSS].limit;
+ u16 rss_m = NGBE_RSS_DISABLED_MASK;
+
+ /* only proceed if VMDq is enabled */
+ if (!(adapter->flags & NGBE_FLAG_VMDQ_ENABLED))
+ return false;
+
+ /* Add starting offset to total pool count */
+ vmdq_i += adapter->ring_feature[RING_F_VMDQ].offset;
+
+ /* double check we are limited to maximum pools */
+ vmdq_i = min_t(u16, NGBE_MAX_VMDQ_INDICES, vmdq_i);
+
+	/* when VMDq is enabled, disable RSS */
+ rss_i = 1;
+
+ /* remove the starting offset from the pool count */
+ vmdq_i -= adapter->ring_feature[RING_F_VMDQ].offset;
+
+ /* save features for later use */
+ adapter->ring_feature[RING_F_VMDQ].indices = vmdq_i;
+ adapter->ring_feature[RING_F_VMDQ].mask = vmdq_m;
+
+ /* limit RSS based on user input and save for later use */
+ adapter->ring_feature[RING_F_RSS].indices = rss_i;
+ adapter->ring_feature[RING_F_RSS].mask = rss_m;
+
+ adapter->queues_per_pool = rss_i;
+ adapter->num_rx_queues = vmdq_i * rss_i;
+ adapter->num_tx_queues = vmdq_i * rss_i;
+
+ return true;
+}
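+
+/*
+ * Worked example (illustrative): with a VMDq limit of 8 pools at offset 0,
+ * RSS is forced to one queue per pool above, so this sets
+ * num_rx_queues = num_tx_queues = 8 * 1 = 8, one queue pair per pool.
+ */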
+
+/**
+ * ngbe_set_rss_queues: Allocate queues for RSS
+ * @adapter: board private structure to initialize
+ *
+ * This is our "base" multiqueue mode. RSS (Receive Side Scaling) will try
+ * to allocate one Rx queue per CPU, and if available, one Tx queue per CPU.
+ *
+ **/
+static bool ngbe_set_rss_queues(struct ngbe_adapter *adapter)
+{
+ struct ngbe_ring_feature *f;
+ u16 rss_i;
+
+	/* set mask for 8 queue limit of RSS */
+ f = &adapter->ring_feature[RING_F_RSS];
+ rss_i = f->limit;
+
+ f->indices = rss_i;
+ f->mask = NGBE_RSS_8Q_MASK;
+
+ adapter->num_rx_queues = rss_i;
+ adapter->num_tx_queues = rss_i;
+
+ return true;
+}
+
+/**
+ * ngbe_set_num_queues: Allocate queues for device, feature dependent
+ * @adapter: board private structure to initialize
+ *
+ * This is the top level queue allocation routine. The order here is very
+ * important, starting with the "most" number of features turned on at once,
+ * and ending with the smallest set of features. This way large combinations
+ * can be allocated if they're turned on, and smaller combinations are the
+ * fallthrough conditions.
+ *
+ **/
+static void ngbe_set_num_queues(struct ngbe_adapter *adapter)
+{
+ /* Start with base case */
+ adapter->num_rx_queues = 1;
+ adapter->num_tx_queues = 1;
+ adapter->queues_per_pool = 1;
+
+ if (ngbe_set_vmdq_queues(adapter))
+ return;
+
+ ngbe_set_rss_queues(adapter);
+}
+
+/**
+ * ngbe_acquire_msix_vectors - acquire MSI-X vectors
+ * @adapter: board private structure
+ *
+ * Attempts to acquire a suitable range of MSI-X vector interrupts. Will
+ * return a negative error code if unable to acquire MSI-X vectors for any
+ * reason.
+ */
+static int ngbe_acquire_msix_vectors(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int i, vectors, vector_threshold;
+
+ if (!(adapter->flags & NGBE_FLAG_MSIX_CAPABLE))
+ return -EOPNOTSUPP;
+
+ /* We start by asking for one vector per queue pair */
+ vectors = max(adapter->num_rx_queues, adapter->num_tx_queues);
+
+ /* It is easy to be greedy for MSI-X vectors. However, it really
+ * doesn't do much good if we have a lot more vectors than CPUs. We'll
+ * be somewhat conservative and only ask for (roughly) the same number
+ * of vectors as there are CPUs.
+ */
+ vectors = min_t(int, vectors, num_online_cpus());
+
+ /* Some vectors are necessary for non-queue interrupts */
+ vectors += NON_Q_VECTORS;
+
+	/* Hardware can only support a maximum of hw->mac.max_msix_vectors.
+ * With features such as RSS and VMDq, we can easily surpass the
+ * number of Rx and Tx descriptor queues supported by our device.
+ * Thus, we cap the maximum in the rare cases where the CPU count also
+ * exceeds our vector limit
+ */
+ vectors = min_t(int, vectors, hw->mac.max_msix_vectors);
+
+ /* We want a minimum of two MSI-X vectors for (1) a TxQ[0] + RxQ[0]
+ * handler, and (2) an Other (Link Status Change, etc.) handler.
+ */
+ vector_threshold = MIN_MSIX_COUNT;
+
+	/* we need to alloc (7 VFs + 1 PF + 1 misc) or (8 VFs + 1 misc) MSI-X entries */
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP) {
+ vectors += adapter->ring_feature[RING_F_VMDQ].offset;
+ }
+
+ adapter->msix_entries = kcalloc(vectors,
+ sizeof(struct msix_entry),
+ GFP_KERNEL);
+ if (!adapter->msix_entries)
+ return -ENOMEM;
+
+ for (i = 0; i < vectors; i++)
+ adapter->msix_entries[i].entry = i;
+
+ vectors = pci_enable_msix_range(adapter->pdev, adapter->msix_entries,
+ vector_threshold, vectors);
+ if (vectors < 0) {
+ /* A negative count of allocated vectors indicates an error in
+ * acquiring within the specified range of MSI-X vectors */
+ e_dev_warn("Failed to allocate MSI-X interrupts. Err: %d\n",
+ vectors);
+
+ adapter->flags &= ~NGBE_FLAG_MSIX_ENABLED;
+ kfree(adapter->msix_entries);
+ adapter->msix_entries = NULL;
+
+ return vectors;
+ }
+
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP) {
+ if (vectors < 9) {
+ adapter->flags2 &= ~NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP;
+			e_dev_warn("Fewer than 9 MSI-X vectors available. Disabling misc IRQ remap.\n");
+ } else
+ vectors -= adapter->ring_feature[RING_F_VMDQ].offset;
+ }
+
+ /* we successfully allocated some number of vectors within our
+ * requested range.
+ */
+ adapter->flags |= NGBE_FLAG_MSIX_ENABLED;
+
+ /* Adjust for only the vectors we'll use, which is minimum
+ * of max_q_vectors, or the number of vectors we were allocated.
+ */
+ vectors -= NON_Q_VECTORS;
+ adapter->num_q_vectors = min_t(int, vectors, adapter->max_q_vectors);
+
+ return 0;
+}
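+
+/*
+ * Worked example (illustrative): on a 4-CPU system with 8 Rx and 8 Tx
+ * queues, the request is min(max(8, 8), 4) + NON_Q_VECTORS vectors, further
+ * capped at hw->mac.max_msix_vectors; pci_enable_msix_range() may then
+ * grant anything between MIN_MSIX_COUNT and that request.
+ */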
+
+static void ngbe_add_ring(struct ngbe_ring *ring,
+ struct ngbe_ring_container *head)
+{
+ ring->next = head->ring;
+ head->ring = ring;
+ head->count++;
+}
+
+/**
+ * ngbe_alloc_q_vector - Allocate memory for a single interrupt vector
+ * @adapter: board private structure to initialize
+ * @v_count: q_vectors allocated on adapter, used for ring interleaving
+ * @v_idx: index of vector in adapter struct
+ * @txr_count: total number of Tx rings to allocate
+ * @txr_idx: index of first Tx ring to allocate
+ * @rxr_count: total number of Rx rings to allocate
+ * @rxr_idx: index of first Rx ring to allocate
+ *
+ * We allocate one q_vector. If allocation fails we return -ENOMEM.
+ **/
+static int ngbe_alloc_q_vector(struct ngbe_adapter *adapter,
+ unsigned int v_count, unsigned int v_idx,
+ unsigned int txr_count, unsigned int txr_idx,
+ unsigned int rxr_count, unsigned int rxr_idx)
+{
+ struct ngbe_q_vector *q_vector;
+ struct ngbe_ring *ring;
+ int node = -1;
+ int cpu = -1;
+ u8 tcs = netdev_get_num_tc(adapter->netdev);
+
+ int ring_count, size;
+
+ /* note this will allocate space for the ring structure as well! */
+ ring_count = txr_count + rxr_count;
+ size = sizeof(struct ngbe_q_vector) +
+ (sizeof(struct ngbe_ring) * ring_count);
+
+ /* customize cpu for Flow Director mapping */
+ if ((tcs <= 1) && !(adapter->flags & NGBE_FLAG_VMDQ_ENABLED)) {
+ u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
+ if (rss_i > 1 && adapter->atr_sample_rate) {
+ if (cpu_online(v_idx)) {
+ cpu = v_idx;
+ node = cpu_to_node(cpu);
+ }
+ }
+ }
+
+ /* allocate q_vector and rings */
+ q_vector = kzalloc_node(size, GFP_KERNEL, node);
+ if (!q_vector)
+ q_vector = kzalloc(size, GFP_KERNEL);
+ if (!q_vector)
+ return -ENOMEM;
+
+ /* setup affinity mask and node */
+ if (cpu != -1)
+ cpumask_set_cpu(cpu, &q_vector->affinity_mask);
+ q_vector->numa_node = node;
+
+ /* initialize CPU for DCA */
+ q_vector->cpu = -1;
+
+ /* initialize NAPI */
+ netif_napi_add(adapter->netdev, &q_vector->napi,
+ ngbe_poll, 64);
+
+ /* tie q_vector and adapter together */
+ adapter->q_vector[v_idx] = q_vector;
+ q_vector->adapter = adapter;
+ q_vector->v_idx = v_idx;
+
+ /* initialize work limits */
+ q_vector->tx.work_limit = adapter->tx_work_limit;
+ q_vector->rx.work_limit = adapter->rx_work_limit;
+
+ /* initialize pointer to rings */
+ ring = q_vector->ring;
+
+	/* initialize ITR */
+ if (txr_count && !rxr_count) {
+ /* tx only vector */
+ if (adapter->tx_itr_setting == 1)
+ q_vector->itr = NGBE_7K_ITR;
+ else
+ q_vector->itr = adapter->tx_itr_setting;
+ } else {
+ /* rx or rx/tx vector */
+ if (adapter->rx_itr_setting == 1)
+ q_vector->itr = NGBE_7K_ITR;
+ else
+ q_vector->itr = adapter->rx_itr_setting;
+ }
+
+ while (txr_count) {
+ /* assign generic ring traits */
+ ring->dev = pci_dev_to_dev(adapter->pdev);
+ ring->netdev = adapter->netdev;
+
+ /* configure backlink on ring */
+ ring->q_vector = q_vector;
+
+ /* update q_vector Tx values */
+ ngbe_add_ring(ring, &q_vector->tx);
+
+ /* apply Tx specific ring traits */
+ ring->count = adapter->tx_ring_count;
+ if (adapter->num_vmdqs > 1)
+ ring->queue_index =
+ txr_idx % adapter->queues_per_pool;
+ else
+ ring->queue_index = txr_idx;
+
+ /* assign ring to adapter */
+ adapter->tx_ring[txr_idx] = ring;
+
+ /* update count and index */
+ txr_count--;
+ txr_idx += v_count;
+
+ /* push pointer to next ring */
+ ring++;
+ }
+
+ while (rxr_count) {
+ /* assign generic ring traits */
+ ring->dev = pci_dev_to_dev(adapter->pdev);
+ ring->netdev = adapter->netdev;
+
+ /* configure backlink on ring */
+ ring->q_vector = q_vector;
+
+ /* update q_vector Rx values */
+ ngbe_add_ring(ring, &q_vector->rx);
+
+ /* apply Rx specific ring traits */
+ ring->count = adapter->rx_ring_count;
+ if (adapter->num_vmdqs > 1)
+ ring->queue_index =
+ rxr_idx % adapter->queues_per_pool;
+ else
+ ring->queue_index = rxr_idx;
+
+ /* assign ring to adapter */
+ adapter->rx_ring[rxr_idx] = ring;
+
+ /* update count and index */
+ rxr_count--;
+ rxr_idx += v_count;
+
+ /* push pointer to next ring */
+ ring++;
+ }
+
+ return 0;
+}
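+
+/*
+ * Interleaving sketch (illustrative): rings are strided across vectors by
+ * v_count. With v_count = 4 and 8 Tx rings, vector 0 owns rings 0 and 4
+ * (txr_idx starts at 0 and advances by v_count), vector 1 owns rings 1 and
+ * 5, and so on.
+ */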
+
+/**
+ * ngbe_free_q_vector - Free memory allocated for specific interrupt vector
+ * @adapter: board private structure to initialize
+ * @v_idx: Index of vector to be freed
+ *
+ * This function frees the memory allocated to the q_vector. In addition if
+ * NAPI is enabled it will delete any references to the NAPI struct prior
+ * to freeing the q_vector.
+ **/
+static void ngbe_free_q_vector(struct ngbe_adapter *adapter, int v_idx)
+{
+ struct ngbe_q_vector *q_vector = adapter->q_vector[v_idx];
+ struct ngbe_ring *ring;
+
+ ngbe_for_each_ring(ring, q_vector->tx)
+ adapter->tx_ring[ring->queue_index] = NULL;
+
+ ngbe_for_each_ring(ring, q_vector->rx)
+ adapter->rx_ring[ring->queue_index] = NULL;
+
+ adapter->q_vector[v_idx] = NULL;
+ netif_napi_del(&q_vector->napi);
+ kfree_rcu(q_vector, rcu);
+}
+
+/**
+ * ngbe_alloc_q_vectors - Allocate memory for interrupt vectors
+ * @adapter: board private structure to initialize
+ *
+ * We allocate one q_vector per queue interrupt. If allocation fails we
+ * return -ENOMEM.
+ **/
+static int ngbe_alloc_q_vectors(struct ngbe_adapter *adapter)
+{
+ unsigned int q_vectors = adapter->num_q_vectors;
+ unsigned int rxr_remaining = adapter->num_rx_queues;
+ unsigned int txr_remaining = adapter->num_tx_queues;
+ unsigned int rxr_idx = 0, txr_idx = 0, v_idx = 0;
+ int err;
+
+ if (q_vectors >= (rxr_remaining + txr_remaining)) {
+ for (; rxr_remaining; v_idx++) {
+ err = ngbe_alloc_q_vector(adapter, q_vectors, v_idx,
+ 0, 0, 1, rxr_idx);
+ if (err)
+ goto err_out;
+
+ /* update counts and index */
+ rxr_remaining--;
+ rxr_idx++;
+ }
+ }
+
+ for (; v_idx < q_vectors; v_idx++) {
+ int rqpv = DIV_ROUND_UP(rxr_remaining, q_vectors - v_idx);
+ int tqpv = DIV_ROUND_UP(txr_remaining, q_vectors - v_idx);
+ err = ngbe_alloc_q_vector(adapter, q_vectors, v_idx,
+ tqpv, txr_idx,
+ rqpv, rxr_idx);
+
+ if (err)
+ goto err_out;
+
+ /* update counts and index */
+ rxr_remaining -= rqpv;
+ txr_remaining -= tqpv;
+ rxr_idx++;
+ txr_idx++;
+ }
+
+ return 0;
+
+err_out:
+ adapter->num_tx_queues = 0;
+ adapter->num_rx_queues = 0;
+ adapter->num_q_vectors = 0;
+
+ while (v_idx--)
+ ngbe_free_q_vector(adapter, v_idx);
+
+ return -ENOMEM;
+}
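+
+/*
+ * Distribution sketch (illustrative): with 3 vectors and 8 Rx/8 Tx rings,
+ * vector 0 gets DIV_ROUND_UP(8, 3) = 3 ring pairs, vector 1 gets
+ * DIV_ROUND_UP(5, 2) = 3, and vector 2 the remaining 2, so every ring ends
+ * up owned by exactly one q_vector.
+ */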
+
+/**
+ * ngbe_free_q_vectors - Free memory allocated for interrupt vectors
+ * @adapter: board private structure to initialize
+ *
+ * This function frees the memory allocated to the q_vectors. In addition if
+ * NAPI is enabled it will delete any references to the NAPI struct prior
+ * to freeing the q_vector.
+ **/
+static void ngbe_free_q_vectors(struct ngbe_adapter *adapter)
+{
+ int v_idx = adapter->num_q_vectors;
+
+ adapter->num_tx_queues = 0;
+ adapter->num_rx_queues = 0;
+ adapter->num_q_vectors = 0;
+
+ while (v_idx--)
+ ngbe_free_q_vector(adapter, v_idx);
+}
+
+void ngbe_reset_interrupt_capability(struct ngbe_adapter *adapter)
+{
+ if (adapter->flags & NGBE_FLAG_MSIX_ENABLED) {
+ adapter->flags &= ~NGBE_FLAG_MSIX_ENABLED;
+ pci_disable_msix(adapter->pdev);
+ kfree(adapter->msix_entries);
+ adapter->msix_entries = NULL;
+ } else if (adapter->flags & NGBE_FLAG_MSI_ENABLED) {
+ adapter->flags &= ~NGBE_FLAG_MSI_ENABLED;
+ pci_disable_msi(adapter->pdev);
+ }
+}
+
+/**
+ * ngbe_set_interrupt_capability - set MSI-X or MSI if supported
+ * @adapter: board private structure to initialize
+ *
+ * Attempt to configure the interrupts using the best available
+ * capabilities of the hardware and the kernel.
+ **/
+void ngbe_set_interrupt_capability(struct ngbe_adapter *adapter)
+{
+ int err;
+
+ /* We will try to get MSI-X interrupts first */
+ if (!ngbe_acquire_msix_vectors(adapter))
+ return;
+
+ /* At this point, we do not have MSI-X capabilities. We need to
+ * reconfigure or disable various features which require MSI-X
+ * capability.
+ */
+ /* Disable VMDq support */
+	e_dev_warn("Disabling VMDq support\n");
+ adapter->flags &= ~NGBE_FLAG_VMDQ_ENABLED;
+
+#ifdef CONFIG_PCI_IOV
+ /* Disable SR-IOV support */
+ e_dev_warn("Disabling SR-IOV support\n");
+ ngbe_disable_sriov(adapter);
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP)
+ adapter->flags2 &= ~NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP;
+#endif /* CONFIG_PCI_IOV */
+
+ /* Disable RSS */
+ e_dev_warn("Disabling RSS support\n");
+ adapter->ring_feature[RING_F_RSS].limit = 1;
+
+ /* recalculate number of queues now that many features have been
+ * changed or disabled.
+ */
+ ngbe_set_num_queues(adapter);
+ adapter->num_q_vectors = 1;
+
+ if (!(adapter->flags & NGBE_FLAG_MSI_CAPABLE))
+ return;
+
+ err = pci_enable_msi(adapter->pdev);
+ if (err)
+ e_dev_warn("Failed to allocate MSI interrupt, falling back to "
+ "legacy. Error: %d\n",
+ err);
+ else
+ adapter->flags |= NGBE_FLAG_MSI_ENABLED;
+}
+
+/**
+ * ngbe_init_interrupt_scheme - Determine proper interrupt scheme
+ * @adapter: board private structure to initialize
+ *
+ * We determine which interrupt scheme to use based on...
+ * - Kernel support (MSI, MSI-X)
+ * - which can be user-defined (via MODULE_PARAM)
+ * - Hardware queue count (num_*_queues)
+ * - defined by miscellaneous hardware support/features (RSS, etc.)
+ **/
+int ngbe_init_interrupt_scheme(struct ngbe_adapter *adapter)
+{
+ int err;
+
+	/* If assigned VFs >= 7, the PF queue IRQs remain at seq 0 and the
+	 * misc IRQ moves from seq 1 to seq 8, which needs extra handling.
+	 */
+ if (adapter->num_vfs >= NGBE_MAX_VF_FUNCTIONS - 1) {
+ adapter->flags2 |= NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP;
+ }
+
+ /* Number of supported queues */
+ ngbe_set_num_queues(adapter);
+
+ /* Set interrupt mode */
+ ngbe_set_interrupt_capability(adapter);
+
+ /* Allocate memory for queues */
+ err = ngbe_alloc_q_vectors(adapter);
+ if (err) {
+ e_err(probe, "Unable to allocate memory for queue vectors\n");
+ ngbe_reset_interrupt_capability(adapter);
+ return err;
+ }
+
+ ngbe_cache_ring_register(adapter);
+
+ set_bit(__NGBE_DOWN, &adapter->state);
+
+ return 0;
+}
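+
+/*
+ * Call-order sketch (illustrative): probe and reset paths are expected to
+ * pair this with ngbe_clear_interrupt_scheme(), e.g.:
+ *
+ *	err = ngbe_init_interrupt_scheme(adapter);
+ *	if (err)
+ *		goto err_free;	(err_free is a hypothetical label)
+ *	...
+ *	ngbe_clear_interrupt_scheme(adapter);
+ */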
+
+/**
+ * ngbe_clear_interrupt_scheme - Clear the current interrupt scheme settings
+ * @adapter: board private structure to clear interrupt scheme on
+ *
+ * We go through and clear interrupt specific resources and reset the structure
+ * to pre-load conditions
+ **/
+void ngbe_clear_interrupt_scheme(struct ngbe_adapter *adapter)
+{
+ ngbe_free_q_vectors(adapter);
+ ngbe_reset_interrupt_capability(adapter);
+
+	/* clear this flag */
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP) {
+ adapter->flags2 &= ~NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP;
+ }
+}
+
+void ngbe_tx_ctxtdesc(struct ngbe_ring *tx_ring, u32 vlan_macip_lens,
+ u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
+{
+ struct ngbe_tx_context_desc *context_desc;
+ u16 i = tx_ring->next_to_use;
+
+ context_desc = NGBE_TX_CTXTDESC(tx_ring, i);
+
+ i++;
+ tx_ring->next_to_use = (i < tx_ring->count) ? i : 0;
+
+ /* set bits to identify this as an advanced context descriptor */
+ type_tucmd |= NGBE_TXD_DTYP_CTXT;
+ context_desc->vlan_macip_lens = cpu_to_le32(vlan_macip_lens);
+ context_desc->seqnum_seed = cpu_to_le32(fcoe_sof_eof);
+ context_desc->type_tucmd_mlhl = cpu_to_le32(type_tucmd);
+ context_desc->mss_l4len_idx = cpu_to_le32(mss_l4len_idx);
+}
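+
+/*
+ * Index-wrap sketch (illustrative): next_to_use advances by one per context
+ * descriptor and wraps at tx_ring->count, so with count = 512 and
+ * next_to_use = 511 the descriptor lands in slot 511 and next_to_use
+ * returns to 0.
+ */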
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_main.c b/drivers/net/ethernet/netswift/ngbe/ngbe_main.c
new file mode 100644
index 0000000000000..9a3f4450d38e9
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_main.c
@@ -0,0 +1,7119 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/highmem.h>
+#include <linux/string.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/pkt_sched.h>
+#include <linux/ipv6.h>
+#include <net/checksum.h>
+#include <net/ip6_checksum.h>
+#include <linux/if_macvlan.h>
+#include <linux/ethtool.h>
+#include <linux/if_bridge.h>
+#include <net/vxlan.h>
+
+#include "ngbe.h"
+#include "ngbe_sriov.h"
+#include "ngbe_hw.h"
+#include "ngbe_phy.h"
+#include "ngbe_pcierr.h"
+
+char ngbe_driver_name[32] = NGBE_NAME;
+static const char ngbe_driver_string[] =
+ "WangXun Gigabit PCI Express Network Driver";
+
+#define DRV_VERSION __stringify(1.1.0oe)
+
+const char ngbe_driver_version[32] = DRV_VERSION;
+static const char ngbe_copyright[] =
+ "Copyright (c) 2018 -2019 Beijing WangXun Technology Co., Ltd";
+static const char ngbe_overheat_msg[] =
+ "Network adapter has been stopped because it has over heated. "
+ "If the problem persists, restart the computer, or "
+ "power off the system and replace the adapter";
+static const char ngbe_underheat_msg[] =
+ "Network adapter has been started again since the temperature "
+ "has been back to normal state";
+
+/* ngbe_pci_tbl - PCI Device ID Table
+ *
+ * Wildcard entries (PCI_ANY_ID) should come last
+ * Last entry must be all 0s
+ *
+ * { Vendor ID, Device ID, SubVendor ID, SubDevice ID,
+ * Class, Class Mask, private data (not used) }
+ */
+static const struct pci_device_id ngbe_pci_tbl[] = {
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_TEST), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860A2), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860A2S), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860A4), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860A4S), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860AL2), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860AL2S), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860AL4), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860AL4S), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860AL_W), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860NCSI), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860AL1), 0},
+ { PCI_VDEVICE(TRUSTNETIC, NGBE_DEV_ID_EM_WX1860A1), 0},
+ { PCI_VDEVICE(TRUSTNETIC, 0x10c), 0},
+ /* required last entry */
+ { .device = 0 }
+};
+MODULE_DEVICE_TABLE(pci, ngbe_pci_tbl);
+
+MODULE_AUTHOR("Beijing WangXun Technology Co., Ltd, <linux.nic(a)trustnetic.com>");
+MODULE_DESCRIPTION("WangXun(R) Gigabit PCI Express Network Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_VERSION);
+
+#define DEFAULT_DEBUG_LEVEL_SHIFT 3
+
+static struct workqueue_struct *ngbe_wq;
+
+static bool ngbe_check_cfg_remove(struct ngbe_hw *hw, struct pci_dev *pdev);
+static void ngbe_clean_rx_ring(struct ngbe_ring *rx_ring);
+static void ngbe_clean_tx_ring(struct ngbe_ring *tx_ring);
+
+extern ngbe_dptype ngbe_ptype_lookup[256];
+
+static inline ngbe_dptype ngbe_decode_ptype(const u8 ptype)
+{
+ return ngbe_ptype_lookup[ptype];
+}
+
+static inline ngbe_dptype
+decode_rx_desc_ptype(const union ngbe_rx_desc *rx_desc)
+{
+ return ngbe_decode_ptype(NGBE_RXD_PKTTYPE(rx_desc));
+}
+
+static void ngbe_check_minimum_link(struct ngbe_adapter *adapter,
+ int expected_gts)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct pci_dev *pdev;
+
+ /* Some devices are not connected over PCIe and thus do not negotiate
+ * speed. These devices do not have valid bus info, and thus any report
+ * we generate may not be correct.
+ */
+ if (hw->bus.type == ngbe_bus_type_internal)
+ return;
+
+ pdev = adapter->pdev;
+
+ pcie_print_link_status(pdev);
+}
+
+/**
+ * ngbe_enumerate_functions - Get the number of ports this device has
+ * @adapter: adapter structure
+ *
+ * This function enumerates the physical functions co-located on a single slot,
+ * in order to determine how many ports a device has. This is most useful in
+ * determining the required GT/s of PCIe bandwidth necessary for optimal
+ * performance.
+ **/
+static inline int ngbe_enumerate_functions(struct ngbe_adapter *adapter)
+{
+ struct pci_dev *entry, *pdev = adapter->pdev;
+ int physfns = 0;
+
+ list_for_each_entry(entry, &pdev->bus->devices, bus_list) {
+#ifdef CONFIG_PCI_IOV
+ /* don't count virtual functions */
+ if (entry->is_virtfn)
+ continue;
+#endif
+
+ /* When the devices on the bus don't all match our device ID,
+ * we can't reliably determine the correct number of
+ * functions. This can occur if a function has been direct
+ * attached to a virtual machine using VT-d, for example. In
+ * this case, simply return -1 to indicate this.
+ */
+ if ((entry->vendor != pdev->vendor) ||
+ (entry->device != pdev->device))
+ return -1;
+
+ physfns++;
+ }
+
+ return physfns;
+}
+
+void ngbe_service_event_schedule(struct ngbe_adapter *adapter)
+{
+ if (!test_bit(__NGBE_DOWN, &adapter->state) &&
+ !test_bit(__NGBE_REMOVING, &adapter->state) &&
+ !test_and_set_bit(__NGBE_SERVICE_SCHED, &adapter->state))
+ queue_work(ngbe_wq, &adapter->service_task);
+}
+
+static void ngbe_service_event_complete(struct ngbe_adapter *adapter)
+{
+ BUG_ON(!test_bit(__NGBE_SERVICE_SCHED, &adapter->state));
+
+ /* flush memory to make sure state is correct before next watchdog */
+ smp_mb__before_atomic();
+ clear_bit(__NGBE_SERVICE_SCHED, &adapter->state);
+}
+
+static void ngbe_remove_adapter(struct ngbe_hw *hw)
+{
+ struct ngbe_adapter *adapter = hw->back;
+
+ if (!hw->hw_addr)
+ return;
+ hw->hw_addr = NULL;
+ e_dev_err("Adapter removed\n");
+ if (test_bit(__NGBE_SERVICE_INITED, &adapter->state))
+ ngbe_service_event_schedule(adapter);
+}
+
+static void ngbe_check_remove(struct ngbe_hw *hw, u32 reg)
+{
+ u32 value;
+
+ /* The following check not only optimizes a bit by not
+ * performing a read on the status register when the
+ * register just read was a status register read that
+ * returned NGBE_FAILED_READ_REG. It also blocks any
+ * potential recursion.
+ */
+ if (reg == NGBE_CFG_PORT_ST) {
+ ngbe_remove_adapter(hw);
+ return;
+ }
+ value = rd32(hw, NGBE_CFG_PORT_ST);
+ if (value == NGBE_FAILED_READ_REG)
+ ngbe_remove_adapter(hw);
+}
+
+static u32 ngbe_validate_register_read(struct ngbe_hw *hw, u32 reg, bool quiet)
+{
+ int i;
+ u32 value;
+ u8 __iomem *reg_addr;
+ struct ngbe_adapter *adapter = hw->back;
+
+ reg_addr = READ_ONCE(hw->hw_addr);
+ if (NGBE_REMOVED(reg_addr))
+ return NGBE_FAILED_READ_REG;
+ for (i = 0; i < NGBE_DEAD_READ_RETRIES; ++i) {
+ value = ngbe_rd32(reg_addr + reg);
+ if (value != NGBE_DEAD_READ_REG)
+ break;
+ }
+ if (quiet)
+ return value;
+ if (value == NGBE_DEAD_READ_REG)
+ e_err(drv, "%s: register %x read unchanged\n", __func__, reg);
+ else
+ e_warn(hw, "%s: register %x read recovered after %d retries\n",
+ __func__, reg, i + 1);
+ return value;
+}
+
+/**
+ * ngbe_read_reg - Read from device register
+ * @hw: hw specific details
+ * @reg: offset of register to read
+ *
+ * Returns: value read or NGBE_FAILED_READ_REG if removed
+ *
+ * This function is used to read device registers. It checks for device
+ * removal by confirming any read that returns all ones by checking the
+ * status register value for all ones. This function avoids reading from
+ * the hardware if a removal was previously detected in which case it
+ * returns NGBE_FAILED_READ_REG (all ones).
+ */
+u32 ngbe_read_reg(struct ngbe_hw *hw, u32 reg, bool quiet)
+{
+ u32 value;
+ u8 __iomem *reg_addr;
+
+ reg_addr = READ_ONCE(hw->hw_addr);
+ if (NGBE_REMOVED(reg_addr))
+ return NGBE_FAILED_READ_REG;
+ value = ngbe_rd32(reg_addr + reg);
+ if (unlikely(value == NGBE_FAILED_READ_REG))
+ ngbe_check_remove(hw, reg);
+ if (unlikely(value == NGBE_DEAD_READ_REG))
+ value = ngbe_validate_register_read(hw, reg, quiet);
+ return value;
+}
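+
+/*
+ * Usage sketch (illustrative): callers should treat an all-ones return as a
+ * possible surprise removal rather than a valid register value:
+ *
+ *	u32 val = ngbe_read_reg(hw, NGBE_CFG_PORT_ST, false);
+ *
+ *	if (val == NGBE_FAILED_READ_REG)
+ *		return;	(device is gone; hw->hw_addr has been cleared)
+ */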
+
+static void ngbe_release_hw_control(struct ngbe_adapter *adapter)
+{
+ /* Let firmware take over control of h/w */
+ wr32m(&adapter->hw, NGBE_CFG_PORT_CTL,
+ NGBE_CFG_PORT_CTL_DRV_LOAD, 0);
+}
+
+static void ngbe_get_hw_control(struct ngbe_adapter *adapter)
+{
+ /* Let firmware know the driver has taken over */
+ wr32m(&adapter->hw, NGBE_CFG_PORT_CTL,
+ NGBE_CFG_PORT_CTL_DRV_LOAD, NGBE_CFG_PORT_CTL_DRV_LOAD);
+}
+
+/**
+ * ngbe_set_ivar - set the IVAR registers, mapping interrupt causes to vectors
+ * @adapter: pointer to adapter struct
+ * @direction: 0 for Rx, 1 for Tx, -1 for other causes
+ * @queue: queue to map the corresponding interrupt to
+ * @msix_vector: the vector to map to the corresponding queue
+ *
+ **/
+static void ngbe_set_ivar(struct ngbe_adapter *adapter, s8 direction,
+ u16 queue, u16 msix_vector)
+{
+ u32 ivar, index;
+ struct ngbe_hw *hw = &adapter->hw;
+
+ if (direction == -1) {
+ /* other causes */
+ msix_vector |= NGBE_PX_IVAR_ALLOC_VAL;
+ index = 0;
+ ivar = rd32(&adapter->hw, NGBE_PX_MISC_IVAR);
+ ivar &= ~(0xFF << index);
+ ivar |= (msix_vector << index);
+ /* if assigned VFs >= 7, the pf misc irq shall be remapped to 0x88. */
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP)
+ ivar = msix_vector;
+ wr32(&adapter->hw, NGBE_PX_MISC_IVAR, ivar);
+ } else {
+ /* tx or rx causes */
+ msix_vector |= NGBE_PX_IVAR_ALLOC_VAL;
+ index = ((16 * (queue & 1)) + (8 * direction));
+ ivar = rd32(hw, NGBE_PX_IVAR(queue >> 1));
+ ivar &= ~(0xFF << index);
+ ivar |= (msix_vector << index);
+ wr32(hw, NGBE_PX_IVAR(queue >> 1), ivar);
+ }
+}
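+
+/*
+ * IVAR layout sketch (illustrative): each NGBE_PX_IVAR register carries the
+ * vector bytes for one queue pair. For Rx queue 3 (direction 0) the shift
+ * is 16 * (3 & 1) + 8 * 0 = 16, i.e. bits 23:16 of NGBE_PX_IVAR(1).
+ */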
+
+void ngbe_unmap_and_free_tx_resource(struct ngbe_ring *ring,
+ struct ngbe_tx_buffer *tx_buffer)
+{
+ if (tx_buffer->skb) {
+ dev_kfree_skb_any(tx_buffer->skb);
+ if (dma_unmap_len(tx_buffer, len))
+ dma_unmap_single(ring->dev,
+ dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ DMA_TO_DEVICE);
+ } else if (dma_unmap_len(tx_buffer, len)) {
+ dma_unmap_page(ring->dev,
+ dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ DMA_TO_DEVICE);
+ }
+ tx_buffer->next_to_watch = NULL;
+ tx_buffer->skb = NULL;
+ dma_unmap_len_set(tx_buffer, len, 0);
+ /* tx_buffer must be completely set up in the transmit path */
+}
+
+static void ngbe_update_xoff_rx_lfc(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_hw_stats *hwstats = &adapter->stats;
+ int i;
+ u32 data;
+
+ if ((hw->fc.current_mode != ngbe_fc_full) &&
+ (hw->fc.current_mode != ngbe_fc_rx_pause))
+ return;
+
+ data = rd32(hw, NGBE_MAC_LXOFFRXC);
+
+ hwstats->lxoffrxc += data;
+
+ /* refill credits (no tx hang) if we received xoff */
+ if (!data)
+ return;
+
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ clear_bit(__NGBE_HANG_CHECK_ARMED,
+ &adapter->tx_ring[i]->state);
+}
+
+static u64 ngbe_get_tx_completed(struct ngbe_ring *ring)
+{
+ return ring->stats.packets;
+}
+
+static u64 ngbe_get_tx_pending(struct ngbe_ring *ring)
+{
+ struct ngbe_adapter *adapter;
+ struct ngbe_hw *hw;
+ u32 head, tail;
+
+ if (ring->accel)
+ adapter = ring->accel->adapter;
+ else
+ adapter = ring->q_vector->adapter;
+
+ hw = &adapter->hw;
+ head = rd32(hw, NGBE_PX_TR_RP(ring->reg_idx));
+ tail = rd32(hw, NGBE_PX_TR_WP(ring->reg_idx));
+
+ return ((head <= tail) ? tail : tail + ring->count) - head;
+}
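+
+/*
+ * Pending-count sketch (illustrative): with count = 512, a hardware read
+ * pointer (head) of 500 and a software write pointer (tail) of 10, the ring
+ * has wrapped, so pending = (10 + 512) - 500 = 22 descriptors remain.
+ */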
+
+static inline bool ngbe_check_tx_hang(struct ngbe_ring *tx_ring)
+{
+ u64 tx_done = ngbe_get_tx_completed(tx_ring);
+ u64 tx_done_old = tx_ring->tx_stats.tx_done_old;
+ u64 tx_pending = ngbe_get_tx_pending(tx_ring);
+
+ clear_check_for_tx_hang(tx_ring);
+
+ /*
+ * Check for a hung queue, but be thorough. This verifies
+ * that a transmit has been completed since the previous
+ * check AND there is at least one packet pending. The
+ * ARMED bit is set to indicate a potential hang. The
+ * bit is cleared if a pause frame is received to remove
+ * false hang detection due to PFC or 802.3x frames. By
+ * requiring this to fail twice we avoid races with
+ * pfc clearing the ARMED bit and conditions where we
+ * run the check_tx_hang logic with a transmit completion
+ * pending but without time to complete it yet.
+ */
+ if (tx_done_old == tx_done && tx_pending) {
+
+ /* make sure it is true for two checks in a row */
+ return test_and_set_bit(__NGBE_HANG_CHECK_ARMED,
+ &tx_ring->state);
+ }
+ /* update completed stats and continue */
+ tx_ring->tx_stats.tx_done_old = tx_done;
+ /* reset the countdown */
+ clear_bit(__NGBE_HANG_CHECK_ARMED, &tx_ring->state);
+
+ return false;
+}
+
+/**
+ * ngbe_tx_timeout - Respond to a Tx Hang
+ * @netdev: network interface device structure
+ **/
+static void ngbe_tx_timeout(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ bool real_tx_hang = false;
+ int i;
+ u16 value = 0;
+ u32 value2 = 0;
+ u32 head, tail;
+
+#define TX_TIMEO_LIMIT 16000
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct ngbe_ring *tx_ring = adapter->tx_ring[i];
+ if (check_for_tx_hang(tx_ring) && ngbe_check_tx_hang(tx_ring)) {
+ real_tx_hang = true;
+ e_info(drv, "&&ngbe_tx_timeout:i=%d&&", i);
+ }
+ }
+
+ pci_read_config_word(adapter->pdev, PCI_VENDOR_ID, &value);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "pci vendor id is 0x%x\n", value);
+
+ pci_read_config_word(adapter->pdev, PCI_COMMAND, &value);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "pci command reg is 0x%x.\n", value);
+
+ value2 = rd32(&adapter->hw, 0x10000);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "reg 0x10000 value is 0x%08x\n", value2);
+ value2 = rd32(&adapter->hw, 0x180d0);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "reg 0x180d0 value is 0x%08x\n", value2);
+ value2 = rd32(&adapter->hw, 0x180d4);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "reg 0x180d4 value is 0x%08x\n", value2);
+ value2 = rd32(&adapter->hw, 0x180d8);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "reg 0x180d8 value is 0x%08x\n", value2);
+ value2 = rd32(&adapter->hw, 0x180dc);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "reg 0x180dc value is 0x%08x\n", value2);
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ head = rd32(&adapter->hw, NGBE_PX_TR_RP(adapter->tx_ring[i]->reg_idx));
+ tail = rd32(&adapter->hw, NGBE_PX_TR_WP(adapter->tx_ring[i]->reg_idx));
+
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "tx ring %d next_to_use is %d, next_to_clean is %d\n",
+ i, adapter->tx_ring[i]->next_to_use, adapter->tx_ring[i]->next_to_clean);
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "tx ring %d hw rp is 0x%x, wp is 0x%x\n", i, head, tail);
+ }
+
+ value2 = rd32(&adapter->hw, NGBE_PX_IMS);
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "PX_IMS value is 0x%08x\n", value2);
+
+ if (value2) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "clear interrupt mask.\n");
+ wr32(&adapter->hw, NGBE_PX_ICS, value2);
+ wr32(&adapter->hw, NGBE_PX_IMC, value2);
+ }
+
+ if (adapter->hw.bus.lan_id == 0) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "tx timeout. do pcie recovery.\n");
+ adapter->flags2 |= NGBE_FLAG2_PCIE_NEED_RECOVER;
+ ngbe_service_event_schedule(adapter);
+ } else
+ wr32(&adapter->hw, NGBE_MIS_PF_SM, 1);
+}
+
+/**
+ * ngbe_clean_tx_irq - Reclaim resources after transmit completes
+ * @q_vector: structure containing interrupt and ring information
+ * @tx_ring: tx ring to clean
+ **/
+static bool ngbe_clean_tx_irq(struct ngbe_q_vector *q_vector,
+ struct ngbe_ring *tx_ring)
+{
+ struct ngbe_adapter *adapter = q_vector->adapter;
+ struct ngbe_tx_buffer *tx_buffer;
+ union ngbe_tx_desc *tx_desc;
+ unsigned int total_bytes = 0, total_packets = 0;
+ unsigned int budget = q_vector->tx.work_limit;
+ unsigned int i = tx_ring->next_to_clean;
+
+ if (test_bit(__NGBE_DOWN, &adapter->state))
+ return true;
+
+ tx_buffer = &tx_ring->tx_buffer_info[i];
+ tx_desc = NGBE_TX_DESC(tx_ring, i);
+ i -= tx_ring->count;
+
+ do {
+ union ngbe_tx_desc *eop_desc = tx_buffer->next_to_watch;
+
+ /* if next_to_watch is not set then there is no work pending */
+ if (!eop_desc)
+ break;
+
+ /* prevent any other reads prior to eop_desc */
+ read_barrier_depends();
+
+ /* if DD is not set pending work has not been completed */
+ if (!(eop_desc->wb.status & cpu_to_le32(NGBE_TXD_STAT_DD)))
+ break;
+
+ /* clear next_to_watch to prevent false hangs */
+ tx_buffer->next_to_watch = NULL;
+
+ /* update the statistics for this packet */
+ total_bytes += tx_buffer->bytecount;
+ total_packets += tx_buffer->gso_segs;
+
+ /* free the skb */
+ dev_consume_skb_any(tx_buffer->skb);
+
+ /* unmap skb header data */
+ dma_unmap_single(tx_ring->dev,
+ dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ DMA_TO_DEVICE);
+
+ /* clear tx_buffer data */
+ tx_buffer->skb = NULL;
+ dma_unmap_len_set(tx_buffer, len, 0);
+
+ /* unmap remaining buffers */
+ while (tx_desc != eop_desc) {
+ tx_buffer++;
+ tx_desc++;
+ i++;
+ if (unlikely(!i)) {
+ i -= tx_ring->count;
+ tx_buffer = tx_ring->tx_buffer_info;
+ tx_desc = NGBE_TX_DESC(tx_ring, 0);
+ }
+
+ /* unmap any remaining paged data */
+ if (dma_unmap_len(tx_buffer, len)) {
+ dma_unmap_page(tx_ring->dev,
+ dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ DMA_TO_DEVICE);
+ dma_unmap_len_set(tx_buffer, len, 0);
+ }
+ }
+
+ /* move us one more past the eop_desc for start of next pkt */
+ tx_buffer++;
+ tx_desc++;
+ i++;
+ if (unlikely(!i)) {
+ i -= tx_ring->count;
+ tx_buffer = tx_ring->tx_buffer_info;
+ tx_desc = NGBE_TX_DESC(tx_ring, 0);
+ }
+
+ /* issue prefetch for next Tx descriptor */
+ prefetch(tx_desc);
+
+ /* update budget accounting */
+ budget--;
+ } while (likely(budget));
+
+ i += tx_ring->count;
+ tx_ring->next_to_clean = i;
+ u64_stats_update_begin(&tx_ring->syncp);
+ tx_ring->stats.bytes += total_bytes;
+ tx_ring->stats.packets += total_packets;
+ u64_stats_update_end(&tx_ring->syncp);
+ q_vector->tx.total_bytes += total_bytes;
+ q_vector->tx.total_packets += total_packets;
+
+ if (check_for_tx_hang(tx_ring)) {
+ if (!ngbe_check_tx_hang(tx_ring)) {
+ adapter->hang_cnt = 0;
+ } else
+ adapter->hang_cnt++;
+
+ if (adapter->hang_cnt >= 5) {
+ /* schedule immediate reset if we believe we hung */
+ struct ngbe_hw *hw = &adapter->hw;
+ u16 value = 0;
+
+ e_err(drv, "Detected Tx Unit Hang\n"
+ " Tx Queue <%d>\n"
+ " TDH, TDT <%x>, <%x>\n"
+ " next_to_use <%x>\n"
+ " next_to_clean <%x>\n"
+ "tx_buffer_info[next_to_clean]\n"
+ " time_stamp <%lx>\n"
+ " jiffies <%lx>\n",
+ tx_ring->queue_index,
+ rd32(hw, NGBE_PX_TR_RP(tx_ring->reg_idx)),
+ rd32(hw, NGBE_PX_TR_WP(tx_ring->reg_idx)),
+ tx_ring->next_to_use, i,
+ tx_ring->tx_buffer_info[i].time_stamp, jiffies);
+
+ pci_read_config_word(adapter->pdev, PCI_VENDOR_ID, &value);
+ if (value == NGBE_FAILED_READ_CFG_WORD) {
+ e_info(hw, "pcie link has been lost.\n");
+ }
+
+ netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
+
+ e_info(probe,
+ "tx hang %d detected on queue %d, resetting adapter\n",
+ adapter->tx_timeout_count + 1, tx_ring->queue_index);
+
+ /* schedule immediate reset if we believe we hung */
+ e_info(hw, "real tx hang. do pcie recovery.\n");
+ adapter->flags2 |= NGBE_FLAG2_PCIE_NEED_RECOVER;
+ ngbe_service_event_schedule(adapter);
+
+ /* the adapter is about to reset, no point in enabling stuff */
+ return true;
+ }
+ }
+
+ netdev_tx_completed_queue(txring_txq(tx_ring),
+ total_packets, total_bytes);
+
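+/* wake the queue only when at least twice the worst-case descriptor
+ * count is free, to avoid bouncing between stopped and running
+ */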
+#define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
+ if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
+ (ngbe_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
+ /* Make sure that anybody stopping the queue after this
+ * sees the new next_to_clean.
+ */
+ smp_mb();
+
+ if (__netif_subqueue_stopped(tx_ring->netdev,
+ tx_ring->queue_index)
+ && !test_bit(__NGBE_DOWN, &adapter->state)) {
+ netif_wake_subqueue(tx_ring->netdev,
+ tx_ring->queue_index);
+ ++tx_ring->tx_stats.restart_queue;
+ }
+ }
+
+ return !!budget;
+}
+
+#define NGBE_RSS_L4_TYPES_MASK \
+ ((1ul << NGBE_RXD_RSSTYPE_IPV4_TCP) | \
+ (1ul << NGBE_RXD_RSSTYPE_IPV4_UDP) | \
+ (1ul << NGBE_RXD_RSSTYPE_IPV4_SCTP) | \
+ (1ul << NGBE_RXD_RSSTYPE_IPV6_TCP) | \
+ (1ul << NGBE_RXD_RSSTYPE_IPV6_UDP) | \
+ (1ul << NGBE_RXD_RSSTYPE_IPV6_SCTP))
+
+static inline void ngbe_rx_hash(struct ngbe_ring *ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ u16 rss_type;
+
+ if (!(ring->netdev->features & NETIF_F_RXHASH))
+ return;
+
+ rss_type = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info) &
+ NGBE_RXD_RSSTYPE_MASK;
+
+ if (!rss_type)
+ return;
+
+ skb_set_hash(skb, le32_to_cpu(rx_desc->wb.lower.hi_dword.rss),
+ (NGBE_RSS_L4_TYPES_MASK & (1ul << rss_type)) ?
+ PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
+}
+
+/**
+ * ngbe_rx_checksum - indicate in skb if hw indicated a good cksum
+ * @ring: structure containing ring specific data
+ * @rx_desc: current Rx descriptor being processed
+ * @skb: skb currently being received and modified
+ **/
+static inline void ngbe_rx_checksum(struct ngbe_ring *ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ ngbe_dptype dptype = decode_rx_desc_ptype(rx_desc);
+
+ skb->ip_summed = CHECKSUM_NONE;
+
+ skb_checksum_none_assert(skb);
+
+ /* Rx csum disabled */
+ if (!(ring->netdev->features & NETIF_F_RXCSUM))
+ return;
+
+ /* if IPv4 header checksum error */
+ if ((ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_IPCS) &&
+ ngbe_test_staterr(rx_desc, NGBE_RXD_ERR_IPE)) ||
+ (ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_OUTERIPCS) &&
+ ngbe_test_staterr(rx_desc, NGBE_RXD_ERR_OUTERIPER))) {
+ ring->rx_stats.csum_err++;
+ return;
+ }
+
+	/* L4 checksum offload flag must be set for the below code to work */
+ if (!ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_L4CS))
+ return;
+
+	/* likely incorrect csum if IPv6 Dest Header found */
+ if (dptype.prot != NGBE_DEC_PTYPE_PROT_SCTP && NGBE_RXD_IPV6EX(rx_desc))
+ return;
+
+ /* if L4 checksum error */
+ if (ngbe_test_staterr(rx_desc, NGBE_RXD_ERR_TCPE)) {
+ ring->rx_stats.csum_err++;
+ return;
+ }
+ /* If there is an outer header present that might contain a checksum
+ * we need to bump the checksum level by 1 to reflect the fact that
+ * we are indicating we validated the inner checksum.
+ */
+	if (dptype.etype >= NGBE_DEC_PTYPE_ETYPE_IG)
+		skb->csum_level = 1;
+
+ /* It must be a TCP or UDP or SCTP packet with a valid checksum */
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ ring->rx_stats.csum_good_cnt++;
+}
+
+static bool ngbe_alloc_mapped_skb(struct ngbe_ring *rx_ring,
+ struct ngbe_rx_buffer *bi)
+{
+ struct sk_buff *skb = bi->skb;
+ dma_addr_t dma = bi->dma;
+
+ if (unlikely(dma))
+ return true;
+
+ if (likely(!skb)) {
+ skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
+ rx_ring->rx_buf_len);
+ if (unlikely(!skb)) {
+ rx_ring->rx_stats.alloc_rx_buff_failed++;
+ return false;
+ }
+
+ bi->skb = skb;
+ }
+
+ dma = dma_map_single(rx_ring->dev, skb->data,
+ rx_ring->rx_buf_len, DMA_FROM_DEVICE);
+
+ /*
+ * if mapping failed free memory back to system since
+ * there isn't much point in holding memory we can't use
+ */
+ if (dma_mapping_error(rx_ring->dev, dma)) {
+ dev_kfree_skb_any(skb);
+ bi->skb = NULL;
+
+ rx_ring->rx_stats.alloc_rx_buff_failed++;
+ return false;
+ }
+
+ bi->dma = dma;
+ return true;
+}
+
+static bool ngbe_alloc_mapped_page(struct ngbe_ring *rx_ring,
+ struct ngbe_rx_buffer *bi)
+{
+ struct page *page = bi->page;
+ dma_addr_t dma;
+
+ /* since we are recycling buffers we should seldom need to alloc */
+ if (likely(page))
+ return true;
+
+ /* alloc new page for storage */
+ page = dev_alloc_pages(ngbe_rx_pg_order(rx_ring));
+ if (unlikely(!page)) {
+ rx_ring->rx_stats.alloc_rx_page_failed++;
+ return false;
+ }
+
+ /* map page for use */
+ dma = dma_map_page(rx_ring->dev, page, 0,
+ ngbe_rx_pg_size(rx_ring), DMA_FROM_DEVICE);
+
+ /*
+ * if mapping failed free memory back to system since
+ * there isn't much point in holding memory we can't use
+ */
+ if (dma_mapping_error(rx_ring->dev, dma)) {
+ __free_pages(page, ngbe_rx_pg_order(rx_ring));
+
+ rx_ring->rx_stats.alloc_rx_page_failed++;
+ return false;
+ }
+
+ bi->page_dma = dma;
+ bi->page = page;
+ bi->page_offset = 0;
+
+ return true;
+}
+
+/**
+ * ngbe_alloc_rx_buffers - Replace used receive buffers
+ * @rx_ring: ring to place buffers on
+ * @cleaned_count: number of buffers to replace
+ **/
+void ngbe_alloc_rx_buffers(struct ngbe_ring *rx_ring, u16 cleaned_count)
+{
+ union ngbe_rx_desc *rx_desc;
+ struct ngbe_rx_buffer *bi;
+ u16 i = rx_ring->next_to_use;
+
+ /* nothing to do */
+ if (!cleaned_count)
+ return;
+
+ rx_desc = NGBE_RX_DESC(rx_ring, i);
+ bi = &rx_ring->rx_buffer_info[i];
+ i -= rx_ring->count;
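+	/* like the Tx clean path, i is biased negative by the ring size
+	 * so that i == 0 marks the wrap point
+	 */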
+
+ do {
+ if (ring_is_hs_enabled(rx_ring)) {
+ if (!ngbe_alloc_mapped_skb(rx_ring, bi))
+ break;
+ rx_desc->read.hdr_addr = cpu_to_le64(bi->dma);
+ }
+
+ if (!ngbe_alloc_mapped_page(rx_ring, bi))
+ break;
+ rx_desc->read.pkt_addr =
+ cpu_to_le64(bi->page_dma + bi->page_offset);
+
+ rx_desc++;
+ bi++;
+ i++;
+ if (unlikely(!i)) {
+ rx_desc = NGBE_RX_DESC(rx_ring, 0);
+ bi = rx_ring->rx_buffer_info;
+ i -= rx_ring->count;
+ }
+
+ /* clear the status bits for the next_to_use descriptor */
+ rx_desc->wb.upper.status_error = 0;
+
+ cleaned_count--;
+ } while (cleaned_count);
+
+ i += rx_ring->count;
+
+ if (rx_ring->next_to_use != i) {
+ rx_ring->next_to_use = i;
+
+ /* update next to alloc since we have filled the ring */
+ rx_ring->next_to_alloc = i;
+
+ /* Force memory writes to complete before letting h/w
+ * know there are new descriptors to fetch. (Only
+ * applicable for weak-ordered memory model archs,
+ * such as IA-64).
+ */
+ wmb();
+ writel(i, rx_ring->tail);
+ }
+}
+
+static inline u16 ngbe_get_hlen(struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc)
+{
+ __le16 hdr_info = rx_desc->wb.lower.lo_dword.hs_rss.hdr_info;
+ u16 hlen = le16_to_cpu(hdr_info) & NGBE_RXD_HDRBUFLEN_MASK;
+
+ UNREFERENCED_PARAMETER(rx_ring);
+
+ if (hlen > (NGBE_RX_HDR_SIZE << NGBE_RXD_HDRBUFLEN_SHIFT))
+ hlen = 0;
+ else
+ hlen >>= NGBE_RXD_HDRBUFLEN_SHIFT;
+
+ return hlen;
+}
+
+static void ngbe_rx_vlan(struct ngbe_ring *ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ u8 idx = 0;
+ u16 ethertype;
+
+ if ((ring->netdev->features & NETIF_F_HW_VLAN_CTAG_RX) &&
+ ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_VP)) {
+ idx = (le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info) &
+ NGBE_RXD_TPID_MASK) >> NGBE_RXD_TPID_SHIFT;
+ ethertype = ring->q_vector->adapter->hw.tpid[idx];
+ __vlan_hwaccel_put_tag(skb,
+ htons(ethertype),
+ le16_to_cpu(rx_desc->wb.upper.vlan));
+ }
+}
+
+/**
+ * ngbe_process_skb_fields - Populate skb header fields from Rx descriptor
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being populated
+ *
+ * This function checks the ring, descriptor, and packet information in
+ * order to populate the hash, checksum, VLAN, timestamp, protocol, and
+ * other fields within the skb.
+ **/
+static void ngbe_process_skb_fields(struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ u32 flags = rx_ring->q_vector->adapter->flags;
+
+ ngbe_rx_hash(rx_ring, rx_desc, skb);
+ ngbe_rx_checksum(rx_ring, rx_desc, skb);
+
+ if (unlikely(flags & NGBE_FLAG_RX_HWTSTAMP_ENABLED) &&
+ unlikely(ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_TS))) {
+ ngbe_ptp_rx_hwtstamp(rx_ring->q_vector->adapter, skb);
+ rx_ring->last_rx_timestamp = jiffies;
+ }
+
+ ngbe_rx_vlan(rx_ring, rx_desc, skb);
+ skb_record_rx_queue(skb, rx_ring->queue_index);
+ skb->protocol = eth_type_trans(skb, rx_ring->netdev);
+}
+
+static void ngbe_rx_skb(struct ngbe_q_vector *q_vector,
+ struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ napi_gro_receive(&q_vector->napi, skb);
+}
+
+/**
+ * ngbe_is_non_eop - process handling of non-EOP buffers
+ * @rx_ring: Rx ring being processed
+ * @rx_desc: Rx descriptor for current buffer
+ * @skb: Current socket buffer containing buffer in progress
+ *
+ * This function updates next to clean. If the buffer is an EOP buffer
+ * this function exits returning false, otherwise it will place the
+ * sk_buff in the next buffer to be chained and return true indicating
+ * that this is in fact a non-EOP buffer.
+ **/
+static bool ngbe_is_non_eop(struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ struct ngbe_rx_buffer *rx_buffer =
+ &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+ u32 ntc = rx_ring->next_to_clean + 1;
+
+ /* fetch, update, and store next to clean */
+ ntc = (ntc < rx_ring->count) ? ntc : 0;
+ rx_ring->next_to_clean = ntc;
+
+ prefetch(NGBE_RX_DESC(rx_ring, ntc));
+
+ /* if we are the last buffer then there is nothing else to do */
+ if (likely(ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_EOP)))
+ return false;
+
+ /* place skb in next buffer to be received */
+ if (ring_is_hs_enabled(rx_ring)) {
+ rx_buffer->skb = rx_ring->rx_buffer_info[ntc].skb;
+ rx_buffer->dma = rx_ring->rx_buffer_info[ntc].dma;
+ rx_ring->rx_buffer_info[ntc].dma = 0;
+ }
+ rx_ring->rx_buffer_info[ntc].skb = skb;
+
+ rx_ring->rx_stats.non_eop_descs++;
+
+ return true;
+}
+
+/**
+ * ngbe_pull_tail - ngbe specific version of skb_pull_tail
+ * @skb: pointer to current skb being adjusted
+ *
+ * This function is an ngbe specific version of __pskb_pull_tail. The
+ * main difference between this version and the original function is that
+ * this function can make several assumptions about the state of things
+ * that allow for significant optimizations versus the standard function.
+ * As a result we can do things like drop a frag and maintain an accurate
+ * truesize for the skb.
+ */
+static void ngbe_pull_tail(struct sk_buff *skb)
+{
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+ unsigned char *va;
+ unsigned int pull_len;
+
+ /*
+ * it is valid to use page_address instead of kmap since we are
+ * working with pages allocated out of the lomem pool per
+ * alloc_page(GFP_ATOMIC)
+ */
+ va = skb_frag_address(frag);
+
+ /*
+ * we need the header to contain the greater of either ETH_HLEN or
+ * 60 bytes if the skb->len is less than 60 for skb_pad.
+ */
+ pull_len = eth_get_headlen(va, NGBE_RX_HDR_SIZE);
+
+ /* align pull length to size of long to optimize memcpy performance */
+ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
+
+ /* update all of the pointers */
+ skb_frag_size_sub(frag, pull_len);
+ frag->page_offset += pull_len;
+ skb->data_len -= pull_len;
+ skb->tail += pull_len;
+}
+
+/**
+ * ngbe_dma_sync_frag - perform DMA sync for first frag of SKB
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @skb: pointer to current skb being updated
+ *
+ * This function provides a basic DMA sync up for the first fragment of an
+ * skb. The reason for doing this is that the first fragment cannot be
+ * unmapped until we have reached the end of packet descriptor for a buffer
+ * chain.
+ */
+static void ngbe_dma_sync_frag(struct ngbe_ring *rx_ring,
+ struct sk_buff *skb)
+{
+ if (ring_uses_build_skb(rx_ring)) {
+ unsigned long offset = (unsigned long)(skb->data) & ~PAGE_MASK;
+
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ NGBE_CB(skb)->dma,
+ offset,
+ skb_headlen(skb),
+ DMA_FROM_DEVICE);
+ } else {
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ NGBE_CB(skb)->dma,
+ frag->page_offset,
+ skb_frag_size(frag),
+ DMA_FROM_DEVICE);
+ }
+
+ /* if the page was released unmap it */
+ if (unlikely(NGBE_CB(skb)->page_released)) {
+ dma_unmap_page_attrs(rx_ring->dev, NGBE_CB(skb)->dma,
+ ngbe_rx_pg_size(rx_ring),
+ DMA_FROM_DEVICE,
+ NGBE_RX_DMA_ATTR);
+ }
+}
+
+/**
+ * ngbe_cleanup_headers - Correct corrupted or empty headers
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being fixed
+ *
+ * Check for corrupted packet headers caused by senders on the local L2
+ * embedded NIC switch not setting up their Tx Descriptors right. These
+ * should be very rare.
+ *
+ * Also address the case where we are pulling data in on pages only
+ * and as such no data is present in the skb header.
+ *
+ * In addition if skb is not at least 60 bytes we need to pad it so that
+ * it is large enough to qualify as a valid Ethernet frame.
+ *
+ * Returns true if an error was encountered and skb was freed.
+ **/
+static bool ngbe_cleanup_headers(struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ struct net_device *netdev = rx_ring->netdev;
+
+ /* verify that the packet does not have any known errors */
+ if (unlikely(ngbe_test_staterr(rx_desc,
+ NGBE_RXD_ERR_FRAME_ERR_MASK) &&
+ !(netdev->features & NETIF_F_RXALL))) {
+ dev_kfree_skb_any(skb);
+ return true;
+ }
+
+ /* place header in linear portion of buffer */
+ if (skb_is_nonlinear(skb) && !skb_headlen(skb))
+ ngbe_pull_tail(skb);
+
+ /* if eth_skb_pad returns an error the skb was freed */
+ if (eth_skb_pad(skb))
+ return true;
+
+ return false;
+}
+
+/**
+ * ngbe_reuse_rx_page - page flip buffer and store it back on the ring
+ * @rx_ring: rx descriptor ring to store buffers on
+ * @old_buff: donor buffer to have page reused
+ *
+ * Synchronizes page for reuse by the adapter
+ **/
+static void ngbe_reuse_rx_page(struct ngbe_ring *rx_ring,
+ struct ngbe_rx_buffer *old_buff)
+{
+ struct ngbe_rx_buffer *new_buff;
+ u16 nta = rx_ring->next_to_alloc;
+
+ new_buff = &rx_ring->rx_buffer_info[nta];
+
+ /* update, and store next to alloc */
+ nta++;
+ rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
+
+ /* transfer page from old buffer to new buffer */
+ new_buff->page_dma = old_buff->page_dma;
+ new_buff->page = old_buff->page;
+ new_buff->page_offset = old_buff->page_offset;
+
+ /* sync the buffer for use by the device */
+ dma_sync_single_range_for_device(rx_ring->dev, new_buff->page_dma,
+ new_buff->page_offset,
+ ngbe_rx_bufsz(rx_ring),
+ DMA_FROM_DEVICE);
+}
+
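+/* pages from a remote NUMA node or from the emergency (pfmemalloc)
+ * reserves must not be recycled back into the Rx ring
+ */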
+static inline bool ngbe_page_is_reserved(struct page *page)
+{
+ return (page_to_nid(page) != numa_mem_id()) || page_is_pfmemalloc(page);
+}
+
+/**
+ * ngbe_add_rx_frag - Add contents of Rx buffer to sk_buff
+ * @rx_ring: rx descriptor ring to transact packets on
+ * @rx_buffer: buffer containing page to add
+ * @rx_desc: descriptor containing length of buffer written by hardware
+ * @skb: sk_buff to place the data into
+ *
+ * This function will add the data contained in rx_buffer->page to the skb.
+ * This is done either through a direct copy if the data in the buffer is
+ * less than the skb header size, otherwise it will just attach the page as
+ * a frag to the skb.
+ *
+ * The function will then update the page offset if necessary and return
+ * true if the buffer can be reused by the adapter.
+ **/
+static bool ngbe_add_rx_frag(struct ngbe_ring *rx_ring,
+ struct ngbe_rx_buffer *rx_buffer,
+ union ngbe_rx_desc *rx_desc,
+ struct sk_buff *skb)
+{
+ struct page *page = rx_buffer->page;
+ unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
+#if (PAGE_SIZE < 8192)
+ unsigned int truesize = ngbe_rx_bufsz(rx_ring);
+#else
+ unsigned int truesize = ALIGN(size, L1_CACHE_BYTES);
+ unsigned int last_offset = ngbe_rx_pg_size(rx_ring) -
+ ngbe_rx_bufsz(rx_ring);
+#endif
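+	/* on 4K-page systems each buffer is typically half a page, so reuse
+	 * flips page_offset between the two halves; on larger pages the
+	 * offset instead walks forward through the page in truesize steps
+	 */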
+
+ if ((size <= NGBE_RX_HDR_SIZE) && !skb_is_nonlinear(skb) &&
+ !ring_is_hs_enabled(rx_ring)) {
+ unsigned char *va = page_address(page) + rx_buffer->page_offset;
+
+ memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
+
+ /* page is not reserved, we can reuse buffer as-is */
+ if (likely(!ngbe_page_is_reserved(page)))
+ return true;
+
+ /* this page cannot be reused so discard it */
+ __free_pages(page, ngbe_rx_pg_order(rx_ring));
+ return false;
+ }
+
+ skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
+ rx_buffer->page_offset, size, truesize);
+
+ /* avoid re-using remote pages */
+ if (unlikely(ngbe_page_is_reserved(page)))
+ return false;
+
+#if (PAGE_SIZE < 8192)
+ /* if we are only owner of page we can reuse it */
+ if (unlikely(page_count(page) != 1))
+ return false;
+
+ /* flip page offset to other buffer */
+ rx_buffer->page_offset ^= truesize;
+#else
+ /* move offset up to the next cache line */
+ rx_buffer->page_offset += truesize;
+
+ if (rx_buffer->page_offset > last_offset)
+ return false;
+#endif
+
+ /* Even if we own the page, we are not allowed to use atomic_set()
+ * This would break get_page_unless_zero() users.
+ */
+ page_ref_inc(page);
+
+ return true;
+}
+
+static struct sk_buff *ngbe_fetch_rx_buffer(struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc)
+{
+ struct ngbe_rx_buffer *rx_buffer;
+ struct sk_buff *skb;
+ struct page *page;
+
+ rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+ page = rx_buffer->page;
+ prefetchw(page);
+
+ skb = rx_buffer->skb;
+
+ if (likely(!skb)) {
+ void *page_addr = page_address(page) +
+ rx_buffer->page_offset;
+
+ /* prefetch first cache line of first page */
+ prefetch(page_addr);
+#if L1_CACHE_BYTES < 128
+ prefetch(page_addr + L1_CACHE_BYTES);
+#endif
+
+ /* allocate a skb to store the frags */
+ skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
+ NGBE_RX_HDR_SIZE);
+ if (unlikely(!skb)) {
+ rx_ring->rx_stats.alloc_rx_buff_failed++;
+ return NULL;
+ }
+
+ /*
+ * we will be copying header into skb->data in
+ * pskb_may_pull so it is in our interest to prefetch
+ * it now to avoid a possible cache miss
+ */
+ prefetchw(skb->data);
+
+ /*
+ * Delay unmapping of the first packet. It carries the
+ * header information, HW may still access the header
+ * after the writeback. Only unmap it when EOP is
+ * reached
+ */
+ if (likely(ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_EOP)))
+ goto dma_sync;
+
+ NGBE_CB(skb)->dma = rx_buffer->page_dma;
+ } else {
+ if (ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_EOP))
+ ngbe_dma_sync_frag(rx_ring, skb);
+
+dma_sync:
+ /* we are reusing so sync this buffer for CPU use */
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ rx_buffer->page_dma,
+ rx_buffer->page_offset,
+ ngbe_rx_bufsz(rx_ring),
+ DMA_FROM_DEVICE);
+
+ rx_buffer->skb = NULL;
+ }
+
+ /* pull page into skb */
+ if (ngbe_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
+ /* hand second half of page back to the ring */
+ ngbe_reuse_rx_page(rx_ring, rx_buffer);
+ } else if (NGBE_CB(skb)->dma == rx_buffer->page_dma) {
+ /* the page has been released from the ring */
+ NGBE_CB(skb)->page_released = true;
+ } else {
+ /* we are not reusing the buffer so unmap it */
+ dma_unmap_page(rx_ring->dev, rx_buffer->page_dma,
+ ngbe_rx_pg_size(rx_ring),
+ DMA_FROM_DEVICE);
+ }
+
+ /* clear contents of buffer_info */
+ rx_buffer->page = NULL;
+
+ return skb;
+}
+
+static struct sk_buff *ngbe_fetch_rx_buffer_hs(struct ngbe_ring *rx_ring,
+ union ngbe_rx_desc *rx_desc)
+{
+ struct ngbe_rx_buffer *rx_buffer;
+ struct sk_buff *skb;
+ struct page *page;
+ int hdr_len = 0;
+
+ rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+ page = rx_buffer->page;
+ prefetchw(page);
+
+ skb = rx_buffer->skb;
+ rx_buffer->skb = NULL;
+ prefetchw(skb->data);
+
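+	/* first buffer of a header-split frame: if hardware reported a
+	 * header length, pull the header into the linear area and keep the
+	 * header buffer mapped until EOP; otherwise the header buffer went
+	 * unused and can be unmapped right away
+	 */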
+ if (!skb_is_nonlinear(skb)) {
+ hdr_len = ngbe_get_hlen(rx_ring, rx_desc);
+ if (hdr_len > 0) {
+ __skb_put(skb, hdr_len);
+ NGBE_CB(skb)->dma_released = true;
+ NGBE_CB(skb)->dma = rx_buffer->dma;
+ rx_buffer->dma = 0;
+ } else {
+ dma_unmap_single(rx_ring->dev,
+ rx_buffer->dma,
+ rx_ring->rx_buf_len,
+ DMA_FROM_DEVICE);
+ rx_buffer->dma = 0;
+ if (likely(ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_EOP)))
+ goto dma_sync;
+ NGBE_CB(skb)->dma = rx_buffer->page_dma;
+ goto add_frag;
+ }
+ }
+
+ if (ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_EOP)) {
+ if (skb_headlen(skb)) {
+			if (NGBE_CB(skb)->dma_released) {
+ dma_unmap_single(rx_ring->dev,
+ NGBE_CB(skb)->dma,
+ rx_ring->rx_buf_len,
+ DMA_FROM_DEVICE);
+ NGBE_CB(skb)->dma = 0;
+ NGBE_CB(skb)->dma_released = false;
+ }
+		} else {
+			ngbe_dma_sync_frag(rx_ring, skb);
+		}
+	}
+
+dma_sync:
+ /* we are reusing so sync this buffer for CPU use */
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ rx_buffer->page_dma,
+ rx_buffer->page_offset,
+ ngbe_rx_bufsz(rx_ring),
+ DMA_FROM_DEVICE);
+add_frag:
+ /* pull page into skb */
+ if (ngbe_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
+ /* hand second half of page back to the ring */
+ ngbe_reuse_rx_page(rx_ring, rx_buffer);
+ } else if (NGBE_CB(skb)->dma == rx_buffer->page_dma) {
+ /* the page has been released from the ring */
+ NGBE_CB(skb)->page_released = true;
+ } else {
+ /* we are not reusing the buffer so unmap it */
+ dma_unmap_page(rx_ring->dev, rx_buffer->page_dma,
+ ngbe_rx_pg_size(rx_ring),
+ DMA_FROM_DEVICE);
+ }
+
+ /* clear contents of buffer_info */
+ rx_buffer->page = NULL;
+
+ return skb;
+}
+
+/**
+ * ngbe_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
+ * @q_vector: structure containing interrupt and ring information
+ * @rx_ring: rx descriptor ring to transact packets on
+ * @budget: Total limit on number of packets to process
+ *
+ * This function provides a "bounce buffer" approach to Rx interrupt
+ * processing. The advantage to this is that on systems that have
+ * expensive overhead for IOMMU access this provides a means of avoiding
+ * it by maintaining the mapping of the page to the system.
+ *
+ * Returns amount of work completed.
+ **/
+static int ngbe_clean_rx_irq(struct ngbe_q_vector *q_vector,
+ struct ngbe_ring *rx_ring,
+ int budget)
+{
+ unsigned int total_rx_bytes = 0, total_rx_packets = 0;
+ u16 cleaned_count = ngbe_desc_unused(rx_ring);
+
+ do {
+ union ngbe_rx_desc *rx_desc;
+ struct sk_buff *skb;
+
+ /* return some buffers to hardware, one at a time is too slow */
+ if (cleaned_count >= NGBE_RX_BUFFER_WRITE) {
+ ngbe_alloc_rx_buffers(rx_ring, cleaned_count);
+ cleaned_count = 0;
+ }
+
+ rx_desc = NGBE_RX_DESC(rx_ring, rx_ring->next_to_clean);
+
+ if (!ngbe_test_staterr(rx_desc, NGBE_RXD_STAT_DD))
+ break;
+
+ /* This memory barrier is needed to keep us from reading
+ * any other fields out of the rx_desc until we know the
+ * descriptor has been written back
+ */
+ dma_rmb();
+
+ /* retrieve a buffer from the ring */
+ if (ring_is_hs_enabled(rx_ring))
+ skb = ngbe_fetch_rx_buffer_hs(rx_ring, rx_desc);
+ else
+ skb = ngbe_fetch_rx_buffer(rx_ring, rx_desc);
+
+ /* exit if we failed to retrieve a buffer */
+ if (!skb)
+ break;
+
+ cleaned_count++;
+
+ /* place incomplete frames back on ring for completion */
+ if (ngbe_is_non_eop(rx_ring, rx_desc, skb))
+ continue;
+
+ /* verify the packet layout is correct */
+ if (ngbe_cleanup_headers(rx_ring, rx_desc, skb))
+ continue;
+
+ /* probably a little skewed due to removing CRC */
+ total_rx_bytes += skb->len;
+
+ /* populate checksum, timestamp, VLAN, and protocol */
+ ngbe_process_skb_fields(rx_ring, rx_desc, skb);
+
+ ngbe_rx_skb(q_vector, rx_ring, rx_desc, skb);
+
+ /* update budget accounting */
+ total_rx_packets++;
+ } while (likely(total_rx_packets < budget));
+
+ u64_stats_update_begin(&rx_ring->syncp);
+ rx_ring->stats.packets += total_rx_packets;
+ rx_ring->stats.bytes += total_rx_bytes;
+ u64_stats_update_end(&rx_ring->syncp);
+ q_vector->rx.total_packets += total_rx_packets;
+ q_vector->rx.total_bytes += total_rx_bytes;
+
+ return total_rx_packets;
+}
+
+/**
+ * ngbe_configure_msix - Configure MSI-X hardware
+ * @adapter: board private structure
+ *
+ * ngbe_configure_msix sets up the hardware to properly generate MSI-X
+ * interrupts.
+ **/
+static void ngbe_configure_msix(struct ngbe_adapter *adapter)
+{
+ u16 v_idx;
+ u32 i;
+ u32 eitrsel = 0;
+
+ /* Populate MSIX to EITR Select */
+	if (!(adapter->flags & NGBE_FLAG_VMDQ_ENABLED)) {
+		wr32(&adapter->hw, NGBE_PX_ITRSEL, eitrsel);
+	} else {
+		for (i = 0; i < adapter->num_vfs; i++)
+			eitrsel |= 1 << i;
+		wr32(&adapter->hw, NGBE_PX_ITRSEL, eitrsel);
+	}
+
+ /*
+ * Populate the IVAR table and set the ITR values to the
+ * corresponding register.
+ */
+ for (v_idx = 0; v_idx < adapter->num_q_vectors; v_idx++) {
+ struct ngbe_q_vector *q_vector = adapter->q_vector[v_idx];
+ struct ngbe_ring *ring;
+
+ ngbe_for_each_ring(ring, q_vector->rx)
+ ngbe_set_ivar(adapter, 0, ring->reg_idx, v_idx);
+
+ ngbe_for_each_ring(ring, q_vector->tx)
+ ngbe_set_ivar(adapter, 1, ring->reg_idx, v_idx);
+
+ ngbe_write_eitr(q_vector);
+ }
+
+ /* misc ivar from seq 1 to seq 8 */
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP)
+ v_idx += adapter->ring_feature[RING_F_VMDQ].offset;
+
+ ngbe_set_ivar(adapter, -1, 0, v_idx);
+ wr32(&adapter->hw, NGBE_PX_ITR(v_idx), 1950);
+}
+
+enum latency_range {
+ lowest_latency = 0,
+ low_latency = 1,
+ bulk_latency = 2,
+ latency_invalid = 255
+};
+
+/**
+ * ngbe_write_eitr - write EITR register in hardware specific way
+ * @q_vector: structure containing interrupt and ring information
+ *
+ * This function is made to be called by ethtool and by the driver
+ * when it needs to update EITR registers at runtime. Hardware
+ * specific quirks/differences are taken care of here.
+ */
+void ngbe_write_eitr(struct ngbe_q_vector *q_vector)
+{
+ struct ngbe_adapter *adapter = q_vector->adapter;
+ struct ngbe_hw *hw = &adapter->hw;
+ int v_idx = q_vector->v_idx;
+ u32 itr_reg = q_vector->itr & NGBE_MAX_EITR;
+
+ itr_reg |= NGBE_PX_ITR_CNT_WDIS;
+
+ wr32(hw, NGBE_PX_ITR(v_idx), itr_reg);
+}
+
+/**
+ * ngbe_check_overtemp_subtask - check for over temperature
+ * @adapter: pointer to adapter
+ **/
+static void ngbe_check_overtemp_subtask(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 eicr = adapter->interrupt_event;
+ s32 temp_state;
+
+ if (test_bit(__NGBE_DOWN, &adapter->state))
+ return;
+ if (!(adapter->flags2 & NGBE_FLAG2_TEMP_SENSOR_CAPABLE))
+ return;
+ if (!(adapter->flags2 & NGBE_FLAG2_TEMP_SENSOR_EVENT))
+ return;
+
+ adapter->flags2 &= ~NGBE_FLAG2_TEMP_SENSOR_EVENT;
+
+	/*
+	 * The warning interrupt is shared by both ports, so it may not
+	 * have been meant for this port, and we may have missed the
+	 * event entirely; always re-check the over-heat status below.
+	 */
+ if (!(eicr & NGBE_PX_MISC_IC_OVER_HEAT))
+ return;
+
+ temp_state = ngbe_phy_check_overtemp(hw);
+ if (!temp_state || temp_state == NGBE_NOT_IMPLEMENTED)
+ return;
+
+ if (temp_state == NGBE_ERR_UNDERTEMP &&
+ test_bit(__NGBE_HANGING, &adapter->state)) {
+ e_crit(drv, "%s\n", ngbe_underheat_msg);
+ wr32m(&adapter->hw, NGBE_RDB_PB_CTL,
+ NGBE_RDB_PB_CTL_PBEN, NGBE_RDB_PB_CTL_PBEN);
+ netif_carrier_on(adapter->netdev);
+ clear_bit(__NGBE_HANGING, &adapter->state);
+ } else if (temp_state == NGBE_ERR_OVERTEMP &&
+ !test_and_set_bit(__NGBE_HANGING, &adapter->state)) {
+ e_crit(drv, "%s\n", ngbe_overheat_msg);
+ netif_carrier_off(adapter->netdev);
+ wr32m(&adapter->hw, NGBE_RDB_PB_CTL,
+ NGBE_RDB_PB_CTL_PBEN, 0);
+ }
+
+ adapter->interrupt_event = 0;
+}
+
+static void ngbe_check_overtemp_event(struct ngbe_adapter *adapter, u32 eicr)
+{
+ if (!(adapter->flags2 & NGBE_FLAG2_TEMP_SENSOR_CAPABLE))
+ return;
+
+ if (!(eicr & NGBE_PX_MISC_IC_OVER_HEAT))
+ return;
+ if (!test_bit(__NGBE_DOWN, &adapter->state)) {
+ adapter->interrupt_event = eicr;
+ adapter->flags2 |= NGBE_FLAG2_TEMP_SENSOR_EVENT;
+ ngbe_service_event_schedule(adapter);
+ }
+}
+
+static void ngbe_handle_phy_event(struct ngbe_hw *hw)
+{
+ struct ngbe_adapter *adapter = hw->back;
+ u32 reg;
+
+ reg = rd32(hw, NGBE_GPIO_INTSTATUS);
+ wr32(hw, NGBE_GPIO_EOI, reg);
+ TCALL(hw, phy.ops.check_event);
+ adapter->lsc_int++;
+ adapter->link_check_timeout = jiffies;
+ if (!test_bit(__NGBE_DOWN, &adapter->state)) {
+ ngbe_service_event_schedule(adapter);
+ }
+}
+
+/**
+ * ngbe_irq_enable - Enable default interrupt generation settings
+ * @adapter: board private structure
+ **/
+void ngbe_irq_enable(struct ngbe_adapter *adapter, bool queues, bool flush)
+{
+ u32 mask = 0;
+
+ /* enable misc interrupt */
+ mask = NGBE_PX_MISC_IEN_MASK;
+
+ if (adapter->flags2 & NGBE_FLAG2_TEMP_SENSOR_CAPABLE)
+ mask |= NGBE_PX_MISC_IEN_OVER_HEAT;
+
+ mask |= NGBE_PX_MISC_IEN_TIMESYNC;
+
+ wr32(&adapter->hw, NGBE_GPIO_DDR, 0x1);
+ wr32(&adapter->hw, NGBE_GPIO_INTEN, 0x3);
+ wr32(&adapter->hw, NGBE_GPIO_INTTYPE_LEVEL, 0x0);
+ if (adapter->hw.phy.type == ngbe_phy_yt8521s_sfi)
+ wr32(&adapter->hw, NGBE_GPIO_POLARITY, 0x0);
+ else
+ wr32(&adapter->hw, NGBE_GPIO_POLARITY, 0x3);
+
+ if (adapter->hw.phy.type == ngbe_phy_yt8521s_sfi)
+ mask |= NGBE_PX_MISC_IEN_GPIO;
+
+ wr32(&adapter->hw, NGBE_PX_MISC_IEN, mask);
+
+ /* unmask interrupt */
+ if (queues)
+ ngbe_intr_enable(&adapter->hw, NGBE_INTR_ALL);
+ else {
+ if (!(adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP))
+ ngbe_intr_enable(&adapter->hw, NGBE_INTR_MISC(adapter));
+ else
+ ngbe_intr_enable(&adapter->hw, NGBE_INTR_MISC_VMDQ(adapter));
+ }
+
+ /* flush configuration */
+ if (flush)
+ NGBE_WRITE_FLUSH(&adapter->hw);
+}
+
+static irqreturn_t ngbe_msix_other(int __always_unused irq, void *data)
+{
+ struct ngbe_adapter *adapter = data;
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 eicr;
+ u32 ecc;
+ u16 pci_val = 0;
+
+ eicr = ngbe_misc_isb(adapter, NGBE_ISB_MISC);
+ if (eicr & (NGBE_PX_MISC_IC_PHY | NGBE_PX_MISC_IC_GPIO))
+ ngbe_handle_phy_event(hw);
+
+ if (eicr & NGBE_PX_MISC_IC_VF_MBOX)
+ ngbe_msg_task(adapter);
+
+ if (eicr & NGBE_PX_MISC_IC_PCIE_REQ_ERR) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+			"lan id %d, PCIe request error found.\n", hw->bus.lan_id);
+
+ pci_read_config_word(adapter->pdev, PCI_VENDOR_ID, &pci_val);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "pci vendor id is 0x%x\n", pci_val);
+
+ pci_read_config_word(adapter->pdev, PCI_COMMAND, &pci_val);
+ ERROR_REPORT1(NGBE_ERROR_POLLING, "pci command reg is 0x%x.\n", pci_val);
+
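+		/* only function 0 drives the PCIe recovery; the other
+		 * function just flags the error via NGBE_MIS_PF_SM
+		 */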
+ if (hw->bus.lan_id == 0) {
+ adapter->flags2 |= NGBE_FLAG2_PCIE_NEED_RECOVER;
+ ngbe_service_event_schedule(adapter);
+ } else
+ wr32(&adapter->hw, NGBE_MIS_PF_SM, 1);
+ }
+
+ if (eicr & NGBE_PX_MISC_IC_INT_ERR) {
+		e_info(link, "Received unrecoverable ECC Err, "
+			"initiating reset.\n");
+ ecc = rd32(hw, NGBE_MIS_ST);
+ e_info(link, "ecc error status is 0x%08x\n", ecc);
+ if (((ecc & NGBE_MIS_ST_LAN0_ECC) && (hw->bus.lan_id == 0)) ||
+ ((ecc & NGBE_MIS_ST_LAN1_ECC) && (hw->bus.lan_id == 1)))
+ adapter->flags2 |= NGBE_FLAG2_DEV_RESET_REQUESTED;
+
+ ngbe_service_event_schedule(adapter);
+ }
+ if (eicr & NGBE_PX_MISC_IC_DEV_RST) {
+ adapter->flags2 |= NGBE_FLAG2_RESET_INTR_RECEIVED;
+ ngbe_service_event_schedule(adapter);
+ }
+ if ((eicr & NGBE_PX_MISC_IC_STALL) ||
+ (eicr & NGBE_PX_MISC_IC_ETH_EVENT)) {
+ adapter->flags2 |= NGBE_FLAG2_PF_RESET_REQUESTED;
+ ngbe_service_event_schedule(adapter);
+ }
+
+ ngbe_check_overtemp_event(adapter, eicr);
+
+ if (unlikely(eicr & NGBE_PX_MISC_IC_TIMESYNC))
+ ngbe_ptp_check_pps_event(adapter);
+
+ /* re-enable the original interrupt state, no lsc, no queues */
+ if (!test_bit(__NGBE_DOWN, &adapter->state))
+ ngbe_irq_enable(adapter, false, false);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t ngbe_msix_clean_rings(int __always_unused irq, void *data)
+{
+ struct ngbe_q_vector *q_vector = data;
+
+ /* EIAM disabled interrupts (on this vector) for us */
+
+ if (q_vector->rx.ring || q_vector->tx.ring)
+ napi_schedule_irqoff(&q_vector->napi);
+
+ return IRQ_HANDLED;
+}
+
+/**
+ * ngbe_poll - NAPI polling RX/TX cleanup routine
+ * @napi: napi struct with our devices info in it
+ * @budget: amount of work driver is allowed to do this pass, in packets
+ *
+ * This function will clean all queues associated with a q_vector.
+ **/
+int ngbe_poll(struct napi_struct *napi, int budget)
+{
+ struct ngbe_q_vector *q_vector =
+ container_of(napi, struct ngbe_q_vector, napi);
+ struct ngbe_adapter *adapter = q_vector->adapter;
+ struct ngbe_ring *ring;
+ int per_ring_budget;
+ bool clean_complete = true;
+
+ ngbe_for_each_ring(ring, q_vector->tx) {
+ if (!ngbe_clean_tx_irq(q_vector, ring))
+ clean_complete = false;
+ }
+
+ /* Exit if we are called by netpoll */
+ if (budget <= 0)
+ return budget;
+
+ /* attempt to distribute budget to each queue fairly, but don't allow
+ * the budget to go below 1 because we'll exit polling */
+ if (q_vector->rx.count > 1)
+ per_ring_budget = max(budget/q_vector->rx.count, 1);
+ else
+ per_ring_budget = budget;
+
+ ngbe_for_each_ring(ring, q_vector->rx) {
+ int cleaned = ngbe_clean_rx_irq(q_vector, ring,
+ per_ring_budget);
+
+ if (cleaned >= per_ring_budget)
+ clean_complete = false;
+ }
+
+ /* If all work not completed, return budget and keep polling */
+ if (!clean_complete)
+ return budget;
+
+ /* all work done, exit the polling mode */
+ napi_complete(napi);
+ if (!test_bit(__NGBE_DOWN, &adapter->state))
+ ngbe_intr_enable(&adapter->hw,
+ NGBE_INTR_Q(q_vector->v_idx));
+
+ return 0;
+}
+
+/**
+ * ngbe_request_msix_irqs - Initialize MSI-X interrupts
+ * @adapter: board private structure
+ *
+ * ngbe_request_msix_irqs allocates MSI-X vectors and requests
+ * interrupts from the kernel.
+ **/
+static int ngbe_request_msix_irqs(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ int vector, err;
+ int ri = 0, ti = 0;
+
+ for (vector = 0; vector < adapter->num_q_vectors; vector++) {
+ struct ngbe_q_vector *q_vector = adapter->q_vector[vector];
+ struct msix_entry *entry = &adapter->msix_entries[vector];
+
+ if (q_vector->tx.ring && q_vector->rx.ring) {
+ snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+ "%s-TxRx-%d", netdev->name, ri++);
+ ti++;
+ } else if (q_vector->rx.ring) {
+ snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+ "%s-rx-%d", netdev->name, ri++);
+ } else if (q_vector->tx.ring) {
+ snprintf(q_vector->name, sizeof(q_vector->name) - 1,
+ "%s-tx-%d", netdev->name, ti++);
+ } else {
+ /* skip this unused q_vector */
+ continue;
+ }
+ err = request_irq(entry->vector, &ngbe_msix_clean_rings, 0,
+ q_vector->name, q_vector);
+ if (err) {
+ e_err(probe, "request_irq failed for MSIX interrupt"
+ " '%s' Error: %d\n", q_vector->name, err);
+ goto free_queue_irqs;
+ }
+ }
+
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP)
+ vector += adapter->ring_feature[RING_F_VMDQ].offset;
+
+ err = request_irq(adapter->msix_entries[vector].vector,
+ ngbe_msix_other, 0, netdev->name, adapter);
+
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP)
+ vector -= adapter->ring_feature[RING_F_VMDQ].offset;
+
+ if (err) {
+ e_err(probe, "request_irq for msix_other failed: %d\n", err);
+ goto free_queue_irqs;
+ }
+
+ return 0;
+
+free_queue_irqs:
+ while (vector) {
+ vector--;
+
+ irq_set_affinity_hint(adapter->msix_entries[vector].vector,
+ NULL);
+
+ free_irq(adapter->msix_entries[vector].vector,
+ adapter->q_vector[vector]);
+ }
+ adapter->flags &= ~NGBE_FLAG_MSIX_ENABLED;
+ pci_disable_msix(adapter->pdev);
+ kfree(adapter->msix_entries);
+ adapter->msix_entries = NULL;
+ return err;
+}
+
+/**
+ * ngbe_intr - legacy mode Interrupt Handler
+ * @irq: interrupt number
+ * @data: pointer to a network interface device structure
+ **/
+static irqreturn_t ngbe_intr(int __always_unused irq, void *data)
+{
+ struct ngbe_adapter *adapter = data;
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_q_vector *q_vector = adapter->q_vector[0];
+ u32 eicr;
+ u32 eicr_misc;
+ u32 ecc = 0;
+
+ eicr = ngbe_misc_isb(adapter, NGBE_ISB_VEC0);
+ if (!eicr) {
+		/*
+		 * shared interrupt alert!
+		 * nothing is pending for us; re-enable the interrupts
+		 * that were auto-masked before the ISB read and bail.
+		 */
+ if (!test_bit(__NGBE_DOWN, &adapter->state))
+ ngbe_irq_enable(adapter, true, true);
+ return IRQ_NONE; /* Not our interrupt */
+ }
+ adapter->isb_mem[NGBE_ISB_VEC0] = 0;
+ if (!(adapter->flags & NGBE_FLAG_MSI_ENABLED))
+ wr32(&(adapter->hw), NGBE_PX_INTA, 1);
+
+ eicr_misc = ngbe_misc_isb(adapter, NGBE_ISB_MISC);
+ if (eicr_misc & (NGBE_PX_MISC_IC_PHY | NGBE_PX_MISC_IC_GPIO))
+ ngbe_handle_phy_event(hw);
+
+ if (eicr_misc & NGBE_PX_MISC_IC_INT_ERR) {
+		e_info(link, "Received unrecoverable ECC Err, "
+			"initiating reset.\n");
+ ecc = rd32(hw, NGBE_MIS_ST);
+ e_info(link, "ecc error status is 0x%08x\n", ecc);
+ adapter->flags2 |= NGBE_FLAG2_DEV_RESET_REQUESTED;
+ ngbe_service_event_schedule(adapter);
+ }
+
+ if (eicr_misc & NGBE_PX_MISC_IC_DEV_RST) {
+ adapter->flags2 |= NGBE_FLAG2_RESET_INTR_RECEIVED;
+ ngbe_service_event_schedule(adapter);
+ }
+ ngbe_check_overtemp_event(adapter, eicr_misc);
+
+ if (unlikely(eicr_misc & NGBE_PX_MISC_IC_TIMESYNC))
+ ngbe_ptp_check_pps_event(adapter);
+
+ adapter->isb_mem[NGBE_ISB_MISC] = 0;
+ /* would disable interrupts here but it is auto disabled */
+ napi_schedule_irqoff(&q_vector->napi);
+
+ /*
+ * re-enable link(maybe) and non-queue interrupts, no flush.
+ * ngbe_poll will re-enable the queue interrupts
+ */
+ if (!test_bit(__NGBE_DOWN, &adapter->state))
+ ngbe_irq_enable(adapter, false, false);
+
+ return IRQ_HANDLED;
+}
+
+/**
+ * ngbe_request_irq - initialize interrupts
+ * @adapter: board private structure
+ *
+ * Attempts to configure interrupts using the best available
+ * capabilities of the hardware and kernel.
+ **/
+static int ngbe_request_irq(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ int err;
+
+ if (adapter->flags & NGBE_FLAG_MSIX_ENABLED)
+ err = ngbe_request_msix_irqs(adapter);
+ else if (adapter->flags & NGBE_FLAG_MSI_ENABLED)
+ err = request_irq(adapter->pdev->irq, &ngbe_intr, 0,
+ netdev->name, adapter);
+ else
+ err = request_irq(adapter->pdev->irq, &ngbe_intr, IRQF_SHARED,
+ netdev->name, adapter);
+
+ if (err)
+ e_err(probe, "request_irq failed, Error %d\n", err);
+
+ return err;
+}
+
+static void ngbe_free_irq(struct ngbe_adapter *adapter)
+{
+ int vector;
+
+ if (!(adapter->flags & NGBE_FLAG_MSIX_ENABLED)) {
+ free_irq(adapter->pdev->irq, adapter);
+ return;
+ }
+
+ for (vector = 0; vector < adapter->num_q_vectors; vector++) {
+ struct ngbe_q_vector *q_vector = adapter->q_vector[vector];
+ struct msix_entry *entry = &adapter->msix_entries[vector];
+
+ /* free only the irqs that were actually requested */
+ if (!q_vector->rx.ring && !q_vector->tx.ring)
+ continue;
+
+ /* clear the affinity_mask in the IRQ descriptor */
+ irq_set_affinity_hint(entry->vector, NULL);
+
+ free_irq(entry->vector, q_vector);
+ }
+
+ if (adapter->flags2 & NGBE_FLAG2_SRIOV_MISC_IRQ_REMAP) {
+ free_irq(
+ adapter->msix_entries[vector + adapter->ring_feature[RING_F_VMDQ].offset].vector,
+ adapter);
+ } else
+ free_irq(adapter->msix_entries[vector++].vector, adapter);
+}
+
+/**
+ * ngbe_irq_disable - Mask off interrupt generation on the NIC
+ * @adapter: board private structure
+ **/
+void ngbe_irq_disable(struct ngbe_adapter *adapter)
+{
+ wr32(&adapter->hw, NGBE_PX_MISC_IEN, 0);
+ ngbe_intr_disable(&adapter->hw, NGBE_INTR_ALL);
+
+ NGBE_WRITE_FLUSH(&adapter->hw);
+ if (adapter->flags & NGBE_FLAG_MSIX_ENABLED) {
+ int vector;
+
+ for (vector = 0; vector < adapter->num_q_vectors; vector++)
+ synchronize_irq(adapter->msix_entries[vector].vector);
+
+ synchronize_irq(adapter->msix_entries[vector++].vector);
+ } else {
+ synchronize_irq(adapter->pdev->irq);
+ }
+}
+
+/**
+ * ngbe_configure_msi_and_legacy - Initialize PIN (INTA...) and MSI interrupts
+ * @adapter: board private structure
+ **/
+static void ngbe_configure_msi_and_legacy(struct ngbe_adapter *adapter)
+{
+ struct ngbe_q_vector *q_vector = adapter->q_vector[0];
+ struct ngbe_ring *ring;
+
+ ngbe_write_eitr(q_vector);
+
+ ngbe_for_each_ring(ring, q_vector->rx)
+ ngbe_set_ivar(adapter, 0, ring->reg_idx, 0);
+
+ ngbe_for_each_ring(ring, q_vector->tx)
+ ngbe_set_ivar(adapter, 1, ring->reg_idx, 0);
+
+ ngbe_set_ivar(adapter, -1, 0, 1);
+
+ e_info(hw, "Legacy interrupt IVAR setup done\n");
+}
+
+/**
+ * ngbe_configure_tx_ring - Configure Tx ring after Reset
+ * @adapter: board private structure
+ * @ring: structure containing ring specific data
+ *
+ * Configure the Tx descriptor ring after a reset.
+ **/
+void ngbe_configure_tx_ring(struct ngbe_adapter *adapter,
+ struct ngbe_ring *ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u64 tdba = ring->dma;
+ int wait_loop = 10;
+ u32 txdctl = NGBE_PX_TR_CFG_ENABLE;
+ u8 reg_idx = ring->reg_idx;
+
+ /* disable queue to avoid issues while updating state */
+ wr32(hw, NGBE_PX_TR_CFG(reg_idx), NGBE_PX_TR_CFG_SWFLSH);
+ NGBE_WRITE_FLUSH(hw);
+
+ wr32(hw, NGBE_PX_TR_BAL(reg_idx), tdba & DMA_BIT_MASK(32));
+ wr32(hw, NGBE_PX_TR_BAH(reg_idx), tdba >> 32);
+
+ /* reset head and tail pointers */
+ wr32(hw, NGBE_PX_TR_RP(reg_idx), 0);
+ wr32(hw, NGBE_PX_TR_WP(reg_idx), 0);
+ ring->tail = adapter->io_addr + NGBE_PX_TR_WP(reg_idx);
+
+	/* reset ntu and ntc to place SW in sync with hardware */
+ ring->next_to_clean = 0;
+ ring->next_to_use = 0;
+
+ txdctl |= NGBE_RING_SIZE(ring) << NGBE_PX_TR_CFG_TR_SIZE_SHIFT;
+
+ /*
+ * set WTHRESH to encourage burst writeback, it should not be set
+ * higher than 1 when:
+ * - ITR is 0 as it could cause false TX hangs
+ * - ITR is set to > 100k int/sec and BQL is enabled
+ *
+ * In order to avoid issues WTHRESH + PTHRESH should always be equal
+ * to or less than the number of on chip descriptors, which is
+ * currently 40.
+ */
+ txdctl |= 0x20 << NGBE_PX_TR_CFG_WTHRESH_SHIFT;
+ /*
+ * Setting PTHRESH to 32 both improves performance
+ * and avoids a TX hang with DFP enabled
+ */
+
+ /* initialize XPS */
+ if (!test_and_set_bit(__NGBE_TX_XPS_INIT_DONE, &ring->state)) {
+ struct ngbe_q_vector *q_vector = ring->q_vector;
+
+ if (q_vector)
+ netif_set_xps_queue(adapter->netdev,
+ &q_vector->affinity_mask,
+ ring->queue_index);
+ }
+
+ clear_bit(__NGBE_HANG_CHECK_ARMED, &ring->state);
+
+ /* enable queue */
+ wr32(hw, NGBE_PX_TR_CFG(reg_idx), txdctl);
+
+ /* poll to verify queue is enabled */
+ do {
+ msleep(1);
+ txdctl = rd32(hw, NGBE_PX_TR_CFG(reg_idx));
+ } while (--wait_loop && !(txdctl & NGBE_PX_TR_CFG_ENABLE));
+ if (!wait_loop)
+ e_err(drv, "Could not enable Tx Queue %d\n", reg_idx);
+}
+
+/**
+ * ngbe_configure_tx - Configure Transmit Unit after Reset
+ * @adapter: board private structure
+ *
+ * Configure the Tx unit of the MAC after a reset.
+ **/
+static void ngbe_configure_tx(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i;
+
+	/* TDM_CTL.TE must be set before Tx queues are enabled */
+ wr32m(hw, NGBE_TDM_CTL,
+ NGBE_TDM_CTL_TE, NGBE_TDM_CTL_TE);
+
+ /* Setup the HW Tx Head and Tail descriptor pointers */
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ ngbe_configure_tx_ring(adapter, adapter->tx_ring[i]);
+
+ wr32m(hw, NGBE_TSEC_BUF_AE, 0x3FF, 0x10);
+ wr32m(hw, NGBE_TSEC_CTL, 0x2, 0);
+
+ wr32m(hw, NGBE_TSEC_CTL, 0x1, 1);
+
+ /* enable mac transmitter */
+ wr32m(hw, NGBE_MAC_TX_CFG,
+ NGBE_MAC_TX_CFG_TE, NGBE_MAC_TX_CFG_TE);
+}
+
+static void ngbe_enable_rx_drop(struct ngbe_adapter *adapter,
+ struct ngbe_ring *ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u16 reg_idx = ring->reg_idx;
+
+ u32 srrctl = rd32(hw, NGBE_PX_RR_CFG(reg_idx));
+
+ srrctl |= NGBE_PX_RR_CFG_DROP_EN;
+
+ wr32(hw, NGBE_PX_RR_CFG(reg_idx), srrctl);
+}
+
+static void ngbe_disable_rx_drop(struct ngbe_adapter *adapter,
+ struct ngbe_ring *ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u16 reg_idx = ring->reg_idx;
+
+ u32 srrctl = rd32(hw, NGBE_PX_RR_CFG(reg_idx));
+
+ srrctl &= ~NGBE_PX_RR_CFG_DROP_EN;
+
+ wr32(hw, NGBE_PX_RR_CFG(reg_idx), srrctl);
+}
+
+void ngbe_set_rx_drop_en(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ /*
+ * We should set the drop enable bit if:
+ * SR-IOV is enabled
+ * or
+ * Number of Rx queues > 1 and flow control is disabled
+ *
+ * This allows us to avoid head of line blocking for security
+ * and performance reasons.
+ */
+ if (adapter->num_vfs || (adapter->num_rx_queues > 1 &&
+ !(adapter->hw.fc.current_mode & ngbe_fc_tx_pause))) {
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ ngbe_enable_rx_drop(adapter, adapter->rx_ring[i]);
+ } else {
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ ngbe_disable_rx_drop(adapter, adapter->rx_ring[i]);
+ }
+}
+
+static void ngbe_configure_srrctl(struct ngbe_adapter *adapter,
+ struct ngbe_ring *rx_ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 srrctl;
+ u16 reg_idx = rx_ring->reg_idx;
+
+ srrctl = rd32m(hw, NGBE_PX_RR_CFG(reg_idx),
+ ~(NGBE_PX_RR_CFG_RR_HDR_SZ |
+ NGBE_PX_RR_CFG_RR_BUF_SZ |
+ NGBE_PX_RR_CFG_SPLIT_MODE));
+
+ /* configure header buffer length, needed for RSC */
+ srrctl |= NGBE_RX_HDR_SIZE << NGBE_PX_RR_CFG_BSIZEHDRSIZE_SHIFT;
+
+ /* configure the packet buffer length */
+ srrctl |= ngbe_rx_bufsz(rx_ring) >> NGBE_PX_RR_CFG_BSIZEPKT_SHIFT;
+ if (ring_is_hs_enabled(rx_ring))
+ srrctl |= NGBE_PX_RR_CFG_SPLIT_MODE;
+
+ wr32(hw, NGBE_PX_RR_CFG(reg_idx), srrctl);
+}
+
+/**
+ * ngbe_rss_indir_tbl_entries - Return the number of entries in the RSS
+ * indirection table
+ * @adapter: device handle
+ */
+u32 ngbe_rss_indir_tbl_entries(struct ngbe_adapter *adapter)
+{
+ if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED)
+ return 64;
+ else
+ return 128;
+}
+
+/**
+ * ngbe_store_reta - Write the RETA table to HW
+ * @adapter: device handle
+ *
+ * Write the RSS redirection table stored in adapter.rss_indir_tbl[] to HW.
+ */
+void ngbe_store_reta(struct ngbe_adapter *adapter)
+{
+ u32 i, reta_entries = ngbe_rss_indir_tbl_entries(adapter);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 reta = 0;
+ u8 *indir_tbl = adapter->rss_indir_tbl;
+
+ /* Write redirection table to HW */
+ for (i = 0; i < reta_entries; i++) {
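+		/* each 32-bit RSSTBL register packs four 8-bit entries;
+		 * flush the accumulator after every fourth entry
+		 */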
+ reta |= indir_tbl[i] << (i & 0x3) * 8;
+ if ((i & 3) == 3) {
+ wr32(hw, NGBE_RDB_RSSTBL(i >> 2), reta);
+ reta = 0;
+ }
+ }
+}
+
+static void ngbe_setup_reta(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i, j;
+ u32 reta_entries = ngbe_rss_indir_tbl_entries(adapter);
+ u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
+
+ /*
+ * Program table for at least 2 queues w/ SR-IOV so that VFs can
+ * make full use of any rings they may have. We will use the
+ * PSRTYPE register to control how many rings we use within the PF.
+ */
+ if ((adapter->flags & NGBE_FLAG_SRIOV_ENABLED) && (rss_i < 2))
+ rss_i = 1;
+
+ /* Fill out hash function seeds */
+ for (i = 0; i < 10; i++)
+ wr32(hw, NGBE_RDB_RSSRK(i), adapter->rss_key[i]);
+
+ /* Fill out redirection table */
+ memset(adapter->rss_indir_tbl, 0, sizeof(adapter->rss_indir_tbl));
+
+ for (i = 0, j = 0; i < reta_entries; i++, j++) {
+ if (j == rss_i)
+ j = 0;
+
+ adapter->rss_indir_tbl[i] = j;
+ }
+
+ ngbe_store_reta(adapter);
+}
+
+static void ngbe_setup_mrqc(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 rss_field = 0;
+
+	/* VT and RSS do not coexist at the same time */
+ if (adapter->flags & NGBE_FLAG_VMDQ_ENABLED) {
+ return;
+ }
+
+ /* Disable indicating checksum in descriptor, enables RSS hash */
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_PCSD, NGBE_PSR_CTL_PCSD);
+
+ /* Perform hash on these packet types */
+ rss_field = NGBE_RDB_RA_CTL_RSS_IPV4 |
+ NGBE_RDB_RA_CTL_RSS_IPV4_TCP |
+ NGBE_RDB_RA_CTL_RSS_IPV6 |
+ NGBE_RDB_RA_CTL_RSS_IPV6_TCP;
+
+ if (adapter->flags2 & NGBE_FLAG2_RSS_FIELD_IPV4_UDP)
+ rss_field |= NGBE_RDB_RA_CTL_RSS_IPV4_UDP;
+ if (adapter->flags2 & NGBE_FLAG2_RSS_FIELD_IPV6_UDP)
+ rss_field |= NGBE_RDB_RA_CTL_RSS_IPV6_UDP;
+
+ netdev_rss_key_fill(adapter->rss_key, sizeof(adapter->rss_key));
+
+ ngbe_setup_reta(adapter);
+
+ if (adapter->flags2 & NGBE_FLAG2_RSS_ENABLED)
+ rss_field |= NGBE_RDB_RA_CTL_RSS_EN;
+ wr32(hw, NGBE_RDB_RA_CTL, rss_field);
+}
+
+static void ngbe_rx_desc_queue_enable(struct ngbe_adapter *adapter,
+ struct ngbe_ring *ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int wait_loop = NGBE_MAX_RX_DESC_POLL;
+ u32 rxdctl;
+ u8 reg_idx = ring->reg_idx;
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ return;
+
+ do {
+ msleep(1);
+ rxdctl = rd32(hw, NGBE_PX_RR_CFG(reg_idx));
+ } while (--wait_loop && !(rxdctl & NGBE_PX_RR_CFG_RR_EN));
+
+ if (!wait_loop) {
+ e_err(drv, "RXDCTL.ENABLE on Rx queue %d "
+ "not set within the polling period\n", reg_idx);
+ }
+}
+
+/* disable the specified rx ring/queue */
+void ngbe_disable_rx_queue(struct ngbe_adapter *adapter,
+ struct ngbe_ring *ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int wait_loop = NGBE_MAX_RX_DESC_POLL;
+ u32 rxdctl;
+ u8 reg_idx = ring->reg_idx;
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ return;
+
+ /* write value back with RXDCTL.ENABLE bit cleared */
+ wr32m(hw, NGBE_PX_RR_CFG(reg_idx),
+ NGBE_PX_RR_CFG_RR_EN, 0);
+
+ /* hardware may take up to 100us to actually disable rx queue */
+ do {
+ udelay(10);
+ rxdctl = rd32(hw, NGBE_PX_RR_CFG(reg_idx));
+ } while (--wait_loop && (rxdctl & NGBE_PX_RR_CFG_RR_EN));
+
+ if (!wait_loop) {
+ e_err(drv, "RXDCTL.ENABLE on Rx queue %d not cleared within "
+ "the polling period\n", reg_idx);
+ }
+}
+
+void ngbe_configure_rx_ring(struct ngbe_adapter *adapter,
+ struct ngbe_ring *ring)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u64 rdba = ring->dma;
+ u32 rxdctl;
+ u16 reg_idx = ring->reg_idx;
+
+ /* disable queue to avoid issues while updating state */
+ rxdctl = rd32(hw, NGBE_PX_RR_CFG(reg_idx));
+ ngbe_disable_rx_queue(adapter, ring);
+
+ wr32(hw, NGBE_PX_RR_BAL(reg_idx), rdba & DMA_BIT_MASK(32));
+ wr32(hw, NGBE_PX_RR_BAH(reg_idx), rdba >> 32);
+
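+	/* the RR_SIZE field encodes the ring length in units of 128
+	 * descriptors; the maximum ring size wraps around to 0
+	 */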
+ if (ring->count == NGBE_MAX_RXD)
+ rxdctl |= 0 << NGBE_PX_RR_CFG_RR_SIZE_SHIFT;
+ else
+ rxdctl |= (ring->count / 128) << NGBE_PX_RR_CFG_RR_SIZE_SHIFT;
+
+ rxdctl |= 0x1 << NGBE_PX_RR_CFG_RR_THER_SHIFT;
+ wr32(hw, NGBE_PX_RR_CFG(reg_idx), rxdctl);
+
+ /* reset head and tail pointers */
+ wr32(hw, NGBE_PX_RR_RP(reg_idx), 0);
+ wr32(hw, NGBE_PX_RR_WP(reg_idx), 0);
+ ring->tail = adapter->io_addr + NGBE_PX_RR_WP(reg_idx);
+
+	/* reset ntu and ntc to place SW in sync with hardware */
+ ring->next_to_clean = 0;
+ ring->next_to_use = 0;
+ ring->next_to_alloc = 0;
+
+ ngbe_configure_srrctl(adapter, ring);
+
+ /* enable receive descriptor ring */
+ wr32m(hw, NGBE_PX_RR_CFG(reg_idx),
+ NGBE_PX_RR_CFG_RR_EN, NGBE_PX_RR_CFG_RR_EN);
+
+ ngbe_rx_desc_queue_enable(adapter, ring);
+ ngbe_alloc_rx_buffers(ring, ngbe_desc_unused(ring));
+}
+
+static void ngbe_setup_psrtype(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int pool;
+
+ /* PSRTYPE must be initialized in adapters */
+ u32 psrtype = NGBE_RDB_PL_CFG_L4HDR |
+ NGBE_RDB_PL_CFG_L3HDR |
+ NGBE_RDB_PL_CFG_L2HDR |
+ NGBE_RDB_PL_CFG_TUN_OUTER_L2HDR |
+ NGBE_RDB_PL_CFG_TUN_TUNHDR;
+
+ for_each_set_bit(pool, &adapter->fwd_bitmask, NGBE_MAX_MACVLANS) {
+ wr32(hw, NGBE_RDB_PL_CFG(VMDQ_P(pool)), psrtype);
+ }
+}
+
+/**
+ * ngbe_configure_bridge_mode - common settings for configuring bridge mode
+ * @adapter: the private structure
+ *
+ * This function's purpose is to remove code duplication and configure some
+ * settings required to switch bridge modes.
+ **/
+static void ngbe_configure_bridge_mode(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ if (adapter->flags & NGBE_FLAG_SRIOV_VEPA_BRIDGE_MODE) {
+ /* disable Tx loopback, rely on switch hairpin mode */
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_SW_EN, 0);
+ } else {
+ /* enable Tx loopback for internal VF/PF communication */
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_SW_EN, NGBE_PSR_CTL_SW_EN);
+ }
+}
+
+static void ngbe_configure_virtualization(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i;
+ u8 vfe = 0;
+
+ if (!(adapter->flags & NGBE_FLAG_VMDQ_ENABLED))
+ return;
+
+ wr32m(hw, NGBE_PSR_VM_CTL,
+ NGBE_PSR_VM_CTL_POOL_MASK |
+ NGBE_PSR_VM_CTL_REPLEN,
+ VMDQ_P(0) << NGBE_PSR_VM_CTL_POOL_SHIFT |
+ NGBE_PSR_VM_CTL_REPLEN);
+
+ for_each_set_bit(i, &adapter->fwd_bitmask, NGBE_MAX_MACVLANS) {
+ /* accept untagged packets until a vlan tag is
+ * specifically set for the VMDQ queue/pool
+ */
+ wr32m(hw, NGBE_PSR_VM_L2CTL(i),
+ NGBE_PSR_VM_L2CTL_AUPE, NGBE_PSR_VM_L2CTL_AUPE);
+ }
+
+ vfe = 1 << (VMDQ_P(0));
+ /* Enable only the PF pools for Tx/Rx */
+ wr32(hw, NGBE_RDM_POOL_RE, vfe);
+ wr32(hw, NGBE_TDM_POOL_TE, vfe);
+
+ if (!(adapter->flags & NGBE_FLAG_SRIOV_ENABLED))
+ return;
+
+ /* configure default bridge settings */
+ ngbe_configure_bridge_mode(adapter);
+
+	/* Ensure LLDP and FC filters are set for Ethertype Antispoofing if we
+	 * will be calling set_ethertype_anti_spoofing for each VF in the loop
+	 * below.
+	 */
+ if (hw->mac.ops.set_ethertype_anti_spoofing) {
+ wr32(hw,
+ NGBE_PSR_ETYPE_SWC(NGBE_PSR_ETYPE_SWC_FILTER_LLDP),
+ (NGBE_PSR_ETYPE_SWC_FILTER_EN | /* enable filter */
+ NGBE_PSR_ETYPE_SWC_TX_ANTISPOOF |
+			NGBE_ETH_P_LLDP)); /* LLDP eth protocol type */
+
+ wr32(hw,
+ NGBE_PSR_ETYPE_SWC(NGBE_PSR_ETYPE_SWC_FILTER_FC),
+ (NGBE_PSR_ETYPE_SWC_FILTER_EN |
+ NGBE_PSR_ETYPE_SWC_TX_ANTISPOOF |
+ ETH_P_PAUSE));
+ }
+
+ for (i = 0; i < adapter->num_vfs; i++) {
+ if (!adapter->vfinfo[i].spoofchk_enabled)
+ ngbe_ndo_set_vf_spoofchk(adapter->netdev, i, false);
+ /* enable ethertype anti spoofing if hw supports it */
+ TCALL(hw, mac.ops.set_ethertype_anti_spoofing, true, i);
+ }
+}
+
+static void ngbe_set_rx_buffer_len(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct net_device *netdev = adapter->netdev;
+ u32 max_frame = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
+ struct ngbe_ring *rx_ring;
+ int i;
+ u32 mhadd;
+
+ /* adjust max frame to be at least the size of a standard frame */
+ if (max_frame < (ETH_FRAME_LEN + ETH_FCS_LEN))
+ max_frame = (ETH_FRAME_LEN + ETH_FCS_LEN);
+
+ mhadd = rd32(hw, NGBE_PSR_MAX_SZ);
+ if (max_frame != mhadd) {
+ wr32(hw, NGBE_PSR_MAX_SZ, max_frame);
+ }
+
+	/*
+	 * Set the per-queue Rx buffer length and header-split state
+	 */
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ rx_ring = adapter->rx_ring[i];
+
+ if (adapter->flags & NGBE_FLAG_RX_HS_ENABLED) {
+ rx_ring->rx_buf_len = NGBE_RX_HDR_SIZE;
+ set_ring_hs_enabled(rx_ring);
+ } else
+ clear_ring_hs_enabled(rx_ring);
+ }
+}
+
+/**
+ * ngbe_configure_rx - Configure Receive Unit after Reset
+ * @adapter: board private structure
+ *
+ * Configure the Rx unit of the MAC after a reset.
+ **/
+static void ngbe_configure_rx(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int i;
+ u32 rxctrl;
+
+ /* disable receives while setting up the descriptors */
+ TCALL(hw, mac.ops.disable_rx);
+
+ ngbe_setup_psrtype(adapter);
+
+ /* enable hw crc stripping */
+ wr32m(hw, NGBE_RSEC_CTL,
+ NGBE_RSEC_CTL_CRC_STRIP, NGBE_RSEC_CTL_CRC_STRIP);
+
+ /* Program registers for the distribution of queues */
+ ngbe_setup_mrqc(adapter);
+
+ /* set_rx_buffer_len must be called before ring initialization */
+ ngbe_set_rx_buffer_len(adapter);
+
+ /*
+ * Setup the HW Rx Head and Tail Descriptor Pointers and
+ * the Base and Length of the Rx Descriptor Ring
+ */
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ ngbe_configure_rx_ring(adapter, adapter->rx_ring[i]);
+
+ rxctrl = rd32(hw, NGBE_RDB_PB_CTL);
+
+ /* enable all receives */
+ rxctrl |= NGBE_RDB_PB_CTL_PBEN;
+ TCALL(hw, mac.ops.enable_rx_dma, rxctrl);
+}
+
+static int ngbe_vlan_rx_add_vid(struct net_device *netdev,
+ __always_unused __be16 proto, u16 vid)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ int pool_ndx = VMDQ_P(0);
+
+ /* add VID to filter table */
+ if (hw->mac.ops.set_vfta) {
+ if (vid < VLAN_N_VID)
+ set_bit(vid, adapter->active_vlans);
+
+ TCALL(hw, mac.ops.set_vfta, vid, pool_ndx, true);
+ if (adapter->flags & NGBE_FLAG_VMDQ_ENABLED) {
+ int i;
+ /* enable vlan id for all pools */
+ for_each_set_bit(i, &adapter->fwd_bitmask,
+ NGBE_MAX_MACVLANS)
+ TCALL(hw, mac.ops.set_vfta, vid,
+ VMDQ_P(i), true);
+ }
+ }
+
+ return 0;
+}
+
+static int ngbe_vlan_rx_kill_vid(struct net_device *netdev,
+ __always_unused __be16 proto, u16 vid)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ int pool_ndx = VMDQ_P(0);
+
+ /* User is not allowed to remove vlan ID 0 */
+ if (!vid)
+ return 0;
+
+ /* remove VID from filter table */
+ if (hw->mac.ops.set_vfta) {
+ TCALL(hw, mac.ops.set_vfta, vid, pool_ndx, false);
+ if (adapter->flags & NGBE_FLAG_VMDQ_ENABLED) {
+ int i;
+ /* remove vlan id from all pools */
+ for_each_set_bit(i, &adapter->fwd_bitmask,
+ NGBE_MAX_MACVLANS)
+ TCALL(hw, mac.ops.set_vfta, vid,
+ VMDQ_P(i), false);
+ }
+ }
+
+ clear_bit(vid, adapter->active_vlans);
+ return 0;
+}
+
+/**
+ * ngbe_vlan_strip_disable - helper to disable vlan tag stripping
+ * @adapter: driver data
+ */
+void ngbe_vlan_strip_disable(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int i, j;
+
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ struct ngbe_ring *ring = adapter->rx_ring[i];
+ if (ring->accel)
+ continue;
+ j = ring->reg_idx;
+ wr32m(hw, NGBE_PX_RR_CFG(j),
+ NGBE_PX_RR_CFG_VLAN, 0);
+ }
+}
+
+/**
+ * ngbe_vlan_strip_enable - helper to enable vlan tag stripping
+ * @adapter: driver data
+ */
+void ngbe_vlan_strip_enable(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int i, j;
+
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ struct ngbe_ring *ring = adapter->rx_ring[i];
+ if (ring->accel)
+ continue;
+ j = ring->reg_idx;
+ wr32m(hw, NGBE_PX_RR_CFG(j),
+ NGBE_PX_RR_CFG_VLAN, NGBE_PX_RR_CFG_VLAN);
+ }
+}
+
+void ngbe_vlan_mode(struct net_device *netdev, u32 features)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ bool enable;
+
+ enable = !!(features & (NETIF_F_HW_VLAN_CTAG_RX));
+ if (enable)
+ /* enable VLAN tag insert/strip */
+ ngbe_vlan_strip_enable(adapter);
+ else
+ /* disable VLAN tag insert/strip */
+ ngbe_vlan_strip_disable(adapter);
+}
+
+static void ngbe_restore_vlan(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ u16 vid;
+
+ ngbe_vlan_mode(netdev, netdev->features);
+
+ for_each_set_bit(vid, adapter->active_vlans, VLAN_N_VID)
+ ngbe_vlan_rx_add_vid(netdev, htons(ETH_P_8021Q), vid);
+}
+
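+/*
+ * ngbe_addr_list_itr - iterator handed to mac.ops.update_mc_addr_list.
+ *
+ * Returns the current multicast address and advances *mc_addr_ptr to the
+ * next entry in the netdev list (NULL once the list is exhausted), so the
+ * MAC code can walk the software list without knowing its layout.
+ */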
+static u8 *ngbe_addr_list_itr(struct ngbe_hw *hw,
+ u8 **mc_addr_ptr, u32 *vmdq)
+{
+ struct netdev_hw_addr *mc_ptr;
+ u8 *addr = *mc_addr_ptr;
+ struct ngbe_adapter *adapter = hw->back;
+
+	/* VMDQ_P implicitly uses the adapter struct when CONFIG_PCI_IOV is
+ * defined, so we have to wrap the pointer above correctly to prevent
+ * a warning.
+ */
+ *vmdq = VMDQ_P(0);
+
+ mc_ptr = container_of(addr, struct netdev_hw_addr, addr[0]);
+ if (mc_ptr->list.next) {
+ struct netdev_hw_addr *ha;
+
+ ha = list_entry(mc_ptr->list.next, struct netdev_hw_addr, list);
+ *mc_addr_ptr = ha->addr;
+	} else {
+		*mc_addr_ptr = NULL;
+	}
+
+ return addr;
+}
+
+/**
+ * ngbe_write_mc_addr_list - write multicast addresses to MTA
+ * @netdev: network interface device structure
+ *
+ * Writes multicast address list to the MTA hash table.
+ * Returns: -ENOMEM on failure
+ * 0 on no addresses written
+ * X on writing X addresses to MTA
+ **/
+int ngbe_write_mc_addr_list(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ struct netdev_hw_addr *ha;
+ u8 *addr_list = NULL;
+ int addr_count = 0;
+
+ if (!hw->mac.ops.update_mc_addr_list)
+ return -ENOMEM;
+
+ if (!netif_running(netdev))
+ return 0;
+
+ if (netdev_mc_empty(netdev)) {
+ TCALL(hw, mac.ops.update_mc_addr_list, NULL, 0,
+ ngbe_addr_list_itr, true);
+ } else {
+ ha = list_first_entry(&netdev->mc.list,
+ struct netdev_hw_addr, list);
+ addr_list = ha->addr;
+ addr_count = netdev_mc_count(netdev);
+ TCALL(hw, mac.ops.update_mc_addr_list, addr_list, addr_count,
+ ngbe_addr_list_itr, true);
+ }
+
+#ifdef CONFIG_PCI_IOV
+ ngbe_restore_vf_multicasts(adapter);
+#endif
+ return addr_count;
+}
+
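+/*
+ * mac_table bookkeeping: NGBE_MAC_STATE_IN_USE marks entries that must be
+ * programmed into a receive address register (RAR); NGBE_MAC_STATE_MODIFIED
+ * marks entries whose hardware copy is stale. ngbe_sync_mac_table() pushes
+ * only the modified entries, while ngbe_full_sync_mac_table() rewrites
+ * every slot.
+ */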
+void ngbe_full_sync_mac_table(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int i;
+ for (i = 0; i < hw->mac.num_rar_entries; i++) {
+ if (adapter->mac_table[i].state & NGBE_MAC_STATE_IN_USE) {
+ TCALL(hw, mac.ops.set_rar, i,
+ adapter->mac_table[i].addr,
+ adapter->mac_table[i].pools,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+ } else {
+ TCALL(hw, mac.ops.clear_rar, i);
+ }
+ adapter->mac_table[i].state &= ~(NGBE_MAC_STATE_MODIFIED);
+ }
+}
+
+static void ngbe_sync_mac_table(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int i;
+ for (i = 0; i < hw->mac.num_rar_entries; i++) {
+ if (adapter->mac_table[i].state & NGBE_MAC_STATE_MODIFIED) {
+ if (adapter->mac_table[i].state &
+ NGBE_MAC_STATE_IN_USE) {
+ TCALL(hw, mac.ops.set_rar, i,
+ adapter->mac_table[i].addr,
+ adapter->mac_table[i].pools,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+ } else {
+ TCALL(hw, mac.ops.clear_rar, i);
+ }
+ adapter->mac_table[i].state &=
+ ~(NGBE_MAC_STATE_MODIFIED);
+ }
+ }
+}
+
+int ngbe_available_rars(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i, count = 0;
+
+ for (i = 0; i < hw->mac.num_rar_entries; i++) {
+ if (adapter->mac_table[i].state == 0)
+ count++;
+ }
+ return count;
+}
+
+/* this function destroys the first RAR entry */
+static void ngbe_mac_set_default_filter(struct ngbe_adapter *adapter,
+ u8 *addr)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ memcpy(&adapter->mac_table[0].addr, addr, ETH_ALEN);
+ adapter->mac_table[0].pools = 1ULL << VMDQ_P(0);
+ adapter->mac_table[0].state = (NGBE_MAC_STATE_DEFAULT |
+ NGBE_MAC_STATE_IN_USE);
+ TCALL(hw, mac.ops.set_rar, 0, adapter->mac_table[0].addr,
+ adapter->mac_table[0].pools,
+ NGBE_PSR_MAC_SWC_AD_H_AV);
+}
+
+int ngbe_add_mac_filter(struct ngbe_adapter *adapter, u8 *addr, u16 pool)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i;
+
+ if (is_zero_ether_addr(addr))
+ return -EINVAL;
+
+ for (i = 0; i < hw->mac.num_rar_entries; i++) {
+		if (adapter->mac_table[i].state & NGBE_MAC_STATE_IN_USE)
+			continue;
+ adapter->mac_table[i].state |= (NGBE_MAC_STATE_MODIFIED |
+ NGBE_MAC_STATE_IN_USE);
+ memcpy(adapter->mac_table[i].addr, addr, ETH_ALEN);
+ adapter->mac_table[i].pools = (1ULL << pool);
+ ngbe_sync_mac_table(adapter);
+ return i;
+ }
+ return -ENOMEM;
+}
+
+static void ngbe_flush_sw_mac_table(struct ngbe_adapter *adapter)
+{
+ u32 i;
+ struct ngbe_hw *hw = &adapter->hw;
+
+ for (i = 0; i < hw->mac.num_rar_entries; i++) {
+ adapter->mac_table[i].state |= NGBE_MAC_STATE_MODIFIED;
+ adapter->mac_table[i].state &= ~NGBE_MAC_STATE_IN_USE;
+ memset(adapter->mac_table[i].addr, 0, ETH_ALEN);
+ adapter->mac_table[i].pools = 0;
+ }
+ ngbe_sync_mac_table(adapter);
+}
+
+int ngbe_del_mac_filter(struct ngbe_adapter *adapter, u8 *addr, u16 pool)
+{
+ /* search table for addr, if found, set to 0 and sync */
+ u32 i;
+ struct ngbe_hw *hw = &adapter->hw;
+
+ if (is_zero_ether_addr(addr))
+ return -EINVAL;
+
+ for (i = 0; i < hw->mac.num_rar_entries; i++) {
+		if (ether_addr_equal(addr, adapter->mac_table[i].addr) &&
+		    (adapter->mac_table[i].pools & (1ULL << pool))) {
+ adapter->mac_table[i].state |= NGBE_MAC_STATE_MODIFIED;
+ adapter->mac_table[i].state &= ~NGBE_MAC_STATE_IN_USE;
+ memset(adapter->mac_table[i].addr, 0, ETH_ALEN);
+ adapter->mac_table[i].pools = 0;
+ ngbe_sync_mac_table(adapter);
+ return 0;
+ }
+ }
+ return -ENOMEM;
+}
+
+/**
+ * ngbe_write_uc_addr_list - write unicast addresses to RAR table
+ * @netdev: network interface device structure
+ *
+ * Writes unicast address list to the RAR table.
+ * Returns: -ENOMEM on failure/insufficient address space
+ * 0 on no addresses written
+ * X on writing X addresses to the RAR table
+ **/
+int ngbe_write_uc_addr_list(struct net_device *netdev, int pool)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ int count = 0;
+
+ /* return ENOMEM indicating insufficient memory for addresses */
+ if (netdev_uc_count(netdev) > ngbe_available_rars(adapter))
+ return -ENOMEM;
+
+ if (!netdev_uc_empty(netdev)) {
+ struct netdev_hw_addr *ha;
+ netdev_for_each_uc_addr(ha, netdev) {
+ ngbe_del_mac_filter(adapter, ha->addr, pool);
+ ngbe_add_mac_filter(adapter, ha->addr, pool);
+ count++;
+ }
+ }
+ return count;
+}
+
+/**
+ * ngbe_set_rx_mode - Unicast, Multicast and Promiscuous mode set
+ * @netdev: network interface device structure
+ *
+ * The set_rx_method entry point is called whenever the unicast/multicast
+ * address list or the network interface flags are updated. This routine is
+ * responsible for configuring the hardware for proper unicast, multicast and
+ * promiscuous mode.
+ **/
+void ngbe_set_rx_mode(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 fctrl, vmolr, vlnctrl;
+ int count;
+
+ /* Check for Promiscuous and All Multicast modes */
+ fctrl = rd32m(hw, NGBE_PSR_CTL,
+ ~(NGBE_PSR_CTL_UPE | NGBE_PSR_CTL_MPE));
+ vmolr = rd32m(hw, NGBE_PSR_VM_L2CTL(VMDQ_P(0)),
+ ~(NGBE_PSR_VM_L2CTL_UPE |
+ NGBE_PSR_VM_L2CTL_MPE |
+ NGBE_PSR_VM_L2CTL_ROPE |
+ NGBE_PSR_VM_L2CTL_ROMPE));
+ vlnctrl = rd32m(hw, NGBE_PSR_VLAN_CTL,
+ ~(NGBE_PSR_VLAN_CTL_VFE |
+ NGBE_PSR_VLAN_CTL_CFIEN));
+
+ /* set all bits that we expect to always be set */
+ fctrl |= NGBE_PSR_CTL_BAM | NGBE_PSR_CTL_MFE;
+ vmolr |= NGBE_PSR_VM_L2CTL_BAM |
+ NGBE_PSR_VM_L2CTL_AUPE |
+ NGBE_PSR_VM_L2CTL_VACC;
+ vlnctrl |= NGBE_PSR_VLAN_CTL_VFE;
+
+ hw->addr_ctrl.user_set_promisc = false;
+ if (netdev->flags & IFF_PROMISC) {
+ hw->addr_ctrl.user_set_promisc = true;
+ fctrl |= (NGBE_PSR_CTL_UPE | NGBE_PSR_CTL_MPE);
+		/* the pf doesn't want packets routed to the vf, so clear UPE */
+ vmolr |= NGBE_PSR_VM_L2CTL_MPE;
+ vlnctrl &= ~NGBE_PSR_VLAN_CTL_VFE;
+ }
+
+ if (netdev->flags & IFF_ALLMULTI) {
+ fctrl |= NGBE_PSR_CTL_MPE;
+ vmolr |= NGBE_PSR_VM_L2CTL_MPE;
+ }
+
+ /* This is useful for sniffing bad packets. */
+ if (netdev->features & NETIF_F_RXALL) {
+ vmolr |= (NGBE_PSR_VM_L2CTL_UPE | NGBE_PSR_VM_L2CTL_MPE);
+ vlnctrl &= ~NGBE_PSR_VLAN_CTL_VFE;
+ /* receive bad packets */
+ wr32m(hw, NGBE_RSEC_CTL,
+ NGBE_RSEC_CTL_SAVE_MAC_ERR,
+ NGBE_RSEC_CTL_SAVE_MAC_ERR);
+ } else {
+ vmolr |= NGBE_PSR_VM_L2CTL_ROPE | NGBE_PSR_VM_L2CTL_ROMPE;
+ }
+
+ /*
+ * Write addresses to available RAR registers, if there is not
+ * sufficient space to store all the addresses then enable
+ * unicast promiscuous mode
+ */
+ count = ngbe_write_uc_addr_list(netdev, VMDQ_P(0));
+ if (count < 0) {
+ vmolr &= ~NGBE_PSR_VM_L2CTL_ROPE;
+ vmolr |= NGBE_PSR_VM_L2CTL_UPE;
+ }
+
+ /*
+ * Write addresses to the MTA, if the attempt fails
+ * then we should just turn on promiscuous mode so
+ * that we can at least receive multicast traffic
+ */
+ count = ngbe_write_mc_addr_list(netdev);
+ if (count < 0) {
+ vmolr &= ~NGBE_PSR_VM_L2CTL_ROMPE;
+ vmolr |= NGBE_PSR_VM_L2CTL_MPE;
+ }
+
+ wr32(hw, NGBE_PSR_VLAN_CTL, vlnctrl);
+ wr32(hw, NGBE_PSR_CTL, fctrl);
+ wr32(hw, NGBE_PSR_VM_L2CTL(VMDQ_P(0)), vmolr);
+
+ if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX)
+ ngbe_vlan_strip_enable(adapter);
+ else
+ ngbe_vlan_strip_disable(adapter);
+}
+
+static void ngbe_napi_enable_all(struct ngbe_adapter *adapter)
+{
+ struct ngbe_q_vector *q_vector;
+ int q_idx;
+
+ for (q_idx = 0; q_idx < adapter->num_q_vectors; q_idx++) {
+ q_vector = adapter->q_vector[q_idx];
+ napi_enable(&q_vector->napi);
+ }
+}
+
+static void ngbe_napi_disable_all(struct ngbe_adapter *adapter)
+{
+ struct ngbe_q_vector *q_vector;
+ int q_idx;
+
+ for (q_idx = 0; q_idx < adapter->num_q_vectors; q_idx++) {
+ q_vector = adapter->q_vector[q_idx];
+ napi_disable(&q_vector->napi);
+ }
+}
+
+/* NETIF_F_GSO_IPXIP4/6 may not be defined in all distributions */
+#define NGBE_GSO_PARTIAL_FEATURES (NETIF_F_GSO_GRE | \
+ NETIF_F_GSO_GRE_CSUM | \
+ NETIF_F_GSO_IPXIP4 | \
+ NETIF_F_GSO_IPXIP6 | \
+ NETIF_F_GSO_UDP_TUNNEL | \
+ NETIF_F_GSO_UDP_TUNNEL_CSUM)
+
+static inline unsigned long ngbe_tso_features(void)
+{
+ unsigned long features = 0;
+
+ features |= NETIF_F_TSO;
+ features |= NETIF_F_TSO6;
+ features |= NETIF_F_GSO_PARTIAL | NGBE_GSO_PARTIAL_FEATURES;
+
+ return features;
+}
+
+#ifndef CONFIG_NGBE_NO_LLI
+static void ngbe_configure_lli(struct ngbe_adapter *adapter)
+{
+ /* lli should only be enabled with MSI-X and MSI */
+ if (!(adapter->flags & NGBE_FLAG_MSI_ENABLED) &&
+ !(adapter->flags & NGBE_FLAG_MSIX_ENABLED))
+ return;
+
+ if (adapter->lli_etype) {
+ wr32(&adapter->hw, NGBE_RDB_5T_CTL1(0),
+ (NGBE_RDB_5T_CTL1_LLI |
+ NGBE_RDB_5T_CTL1_SIZE_BP));
+ wr32(&adapter->hw, NGBE_RDB_ETYPE_CLS(0),
+ NGBE_RDB_ETYPE_CLS_LLI);
+ wr32(&adapter->hw, NGBE_PSR_ETYPE_SWC(0),
+ (adapter->lli_etype |
+ NGBE_PSR_ETYPE_SWC_FILTER_EN));
+ }
+
+ if (adapter->lli_port) {
+ wr32(&adapter->hw, NGBE_RDB_5T_CTL1(0),
+ (NGBE_RDB_5T_CTL1_LLI |
+ NGBE_RDB_5T_CTL1_SIZE_BP));
+
+ wr32(&adapter->hw, NGBE_RDB_5T_CTL0(0),
+ (NGBE_RDB_5T_CTL0_POOL_MASK_EN |
+ (NGBE_RDB_5T_CTL0_PRIORITY_MASK <<
+ NGBE_RDB_5T_CTL0_PRIORITY_SHIFT) |
+ (NGBE_RDB_5T_CTL0_DEST_PORT_MASK <<
+ NGBE_RDB_5T_CTL0_5TUPLE_MASK_SHIFT)));
+
+ wr32(&adapter->hw, NGBE_RDB_5T_SDP(0),
+ (adapter->lli_port << 16));
+ }
+
+ if (adapter->lli_size) {
+ wr32(&adapter->hw, NGBE_RDB_5T_CTL1(0),
+ NGBE_RDB_5T_CTL1_LLI);
+ wr32m(&adapter->hw, NGBE_RDB_LLI_THRE,
+ NGBE_RDB_LLI_THRE_SZ(~0), adapter->lli_size);
+ wr32(&adapter->hw, NGBE_RDB_5T_CTL0(0),
+ (NGBE_RDB_5T_CTL0_POOL_MASK_EN |
+ (NGBE_RDB_5T_CTL0_PRIORITY_MASK <<
+ NGBE_RDB_5T_CTL0_PRIORITY_SHIFT) |
+ (NGBE_RDB_5T_CTL0_5TUPLE_MASK_MASK <<
+ NGBE_RDB_5T_CTL0_5TUPLE_MASK_SHIFT)));
+ }
+
+ if (adapter->lli_vlan_pri) {
+ wr32m(&adapter->hw, NGBE_RDB_LLI_THRE,
+ NGBE_RDB_LLI_THRE_PRIORITY_EN |
+ NGBE_RDB_LLI_THRE_UP(~0),
+ NGBE_RDB_LLI_THRE_PRIORITY_EN |
+ (adapter->lli_vlan_pri << NGBE_RDB_LLI_THRE_UP_SHIFT));
+ }
+}
+
+#endif /* CONFIG_NGBE_NO_LLI */
+/* Additional bittime to account for NGBE framing */
+#define NGBE_ETH_FRAMING 20
+
+/*
+ * ngbe_hpbthresh - calculate high water mark for flow control
+ *
+ * @adapter: board private structure to calculate for
+ */
+static int ngbe_hpbthresh(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct net_device *dev = adapter->netdev;
+ int link, tc, kb, marker;
+ u32 dv_id, rx_pba;
+
+ /* Calculate max LAN frame size */
+ tc = link = dev->mtu + ETH_HLEN + ETH_FCS_LEN + NGBE_ETH_FRAMING;
+
+ /* Calculate delay value for device */
+ dv_id = NGBE_DV(link, tc);
+
+ /* Loopback switch introduces additional latency */
+ if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED)
+ dv_id += NGBE_B2BT(tc);
+
+	/* Delay value is calculated in bit times; convert to KB */
+ kb = NGBE_BT2KB(dv_id);
+ rx_pba = rd32(hw, NGBE_RDB_PB_SZ)
+ >> NGBE_RDB_PB_SZ_SHIFT;
+
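+	/*
+	 * Worked example (illustrative numbers only): a 1500 byte MTU gives
+	 * link = tc = 1500 + 14 (ETH_HLEN) + 4 (ETH_FCS_LEN) +
+	 * 20 (NGBE_ETH_FRAMING) = 1538 bytes; NGBE_DV() turns that into a
+	 * worst-case delay in bit times, and NGBE_BT2KB() rounds it up to
+	 * the KB of packet-buffer headroom that must stay free.
+	 */
+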
+ marker = rx_pba - kb;
+
+	/* It is possible that the packet buffer is not large enough to
+	 * provide the required headroom. In that case warn the user and
+	 * do the best we can.
+	 */
+ if (marker < 0) {
+		e_warn(drv, "Packet Buffer can not provide enough "
+		       "headroom to support flow control. Decrease MTU "
+		       "or number of traffic classes\n");
+ marker = tc + 1;
+ }
+
+ return marker;
+}
+
+/*
+ * ngbe_lpbthresh - calculate low water mark for flow control
+ *
+ * @adapter: board private structure to calculate for
+ */
+static int ngbe_lpbthresh(struct ngbe_adapter *adapter)
+{
+ struct net_device *dev = adapter->netdev;
+ int tc;
+ u32 dv_id;
+
+ /* Calculate max LAN frame size */
+ tc = dev->mtu + ETH_HLEN + ETH_FCS_LEN;
+
+ /* Calculate delay value for device */
+ dv_id = NGBE_LOW_DV(tc);
+
+	/* Delay value is calculated in bit times; convert to KB */
+ return NGBE_BT2KB(dv_id);
+}
+
+/*
+ * ngbe_pbthresh_setup - calculate and set up the high and low water marks
+ */
+static void ngbe_pbthresh_setup(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int num_tc = netdev_get_num_tc(adapter->netdev);
+
+ if (!num_tc)
+ num_tc = 1;
+
+ hw->fc.high_water = ngbe_hpbthresh(adapter);
+ hw->fc.low_water = ngbe_lpbthresh(adapter);
+
+ /* Low water marks must not be larger than high water marks */
+ if (hw->fc.low_water > hw->fc.high_water)
+ hw->fc.low_water = 0;
+}
+
+static void ngbe_configure_pb(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int hdrm = 0;
+ int tc = netdev_get_num_tc(adapter->netdev);
+
+ TCALL(hw, mac.ops.setup_rxpba, tc, hdrm, PBA_STRATEGY_EQUAL);
+ ngbe_pbthresh_setup(adapter);
+}
+
+void ngbe_configure_isb(struct ngbe_adapter *adapter)
+{
+ /* set ISB Address */
+ struct ngbe_hw *hw = &adapter->hw;
+
+ wr32(hw, NGBE_PX_ISB_ADDR_L,
+ adapter->isb_dma & DMA_BIT_MASK(32));
+ wr32(hw, NGBE_PX_ISB_ADDR_H, adapter->isb_dma >> 32);
+}
+
+void ngbe_configure_port(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 value, i;
+
+	if (adapter->num_vfs == 0)
+		value = NGBE_CFG_PORT_CTL_NUM_VT_NONE;
+	else
+		value = NGBE_CFG_PORT_CTL_NUM_VT_8;
+
+ /* enable double vlan and qinq, NONE VT at default */
+ value |= NGBE_CFG_PORT_CTL_D_VLAN |
+ NGBE_CFG_PORT_CTL_QINQ;
+ wr32m(hw, NGBE_CFG_PORT_CTL,
+ NGBE_CFG_PORT_CTL_D_VLAN |
+ NGBE_CFG_PORT_CTL_QINQ |
+ NGBE_CFG_PORT_CTL_NUM_VT_MASK,
+ value);
+
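+	/*
+	 * Each NGBE_CFG_TAG_TPID register holds two 16-bit TPIDs: slot 0
+	 * carries 802.1Q/802.1AD for double VLAN, and the remaining slots
+	 * default to 802.1Q.
+	 */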
+ wr32(hw, NGBE_CFG_TAG_TPID(0),
+ ETH_P_8021Q | ETH_P_8021AD << 16);
+ adapter->hw.tpid[0] = ETH_P_8021Q;
+ adapter->hw.tpid[1] = ETH_P_8021AD;
+ for (i = 1; i < 4; i++)
+ wr32(hw, NGBE_CFG_TAG_TPID(i),
+ ETH_P_8021Q | ETH_P_8021Q << 16);
+ for (i = 2; i < 8; i++)
+ adapter->hw.tpid[i] = ETH_P_8021Q;
+}
+
+static void ngbe_configure(struct ngbe_adapter *adapter)
+{
+ ngbe_configure_pb(adapter);
+
+ /*
+ * We must restore virtualization before VLANs or else
+ * the VLVF registers will not be populated
+ */
+ ngbe_configure_virtualization(adapter);
+ /* configure Double Vlan */
+ ngbe_configure_port(adapter);
+
+ ngbe_set_rx_mode(adapter->netdev);
+ ngbe_restore_vlan(adapter);
+
+ ngbe_configure_tx(adapter);
+ ngbe_configure_rx(adapter);
+ ngbe_configure_isb(adapter);
+}
+
+/**
+ * ngbe_non_sfp_link_config - set up non-SFP+ link
+ * @hw: pointer to private hardware struct
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int ngbe_non_sfp_link_config(struct ngbe_hw *hw)
+{
+ u32 speed;
+ bool autoneg, link_up = false;
+ u32 ret = NGBE_ERR_LINK_SETUP;
+
+ ret = TCALL(hw, mac.ops.check_link, &speed, &link_up, false);
+
+ speed = hw->phy.autoneg_advertised;
+ if (!speed)
+ ret = TCALL(hw, mac.ops.get_link_capabilities, &speed,
+ &autoneg);
+
+	/* OCP and NCSI cards skip the extra PHY bring-up */
+	if ((hw->subsystem_device_id & OEM_MASK) != OCP_CARD &&
+	    (hw->subsystem_device_id & NCSI_SUP_MASK) != NCSI_SUP) {
+		msleep(50);
+		if (hw->phy.type == ngbe_phy_internal) {
+			TCALL(hw, eeprom.ops.phy_signal_set);
+			TCALL(hw, phy.ops.setup_once);
+		}
+	}
+
+ ret = TCALL(hw, mac.ops.setup_link, speed, false);
+
+ return ret;
+}
+
+static void ngbe_setup_gpie(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 gpie = 0;
+
+ if (adapter->flags & NGBE_FLAG_MSIX_ENABLED) {
+ gpie = NGBE_PX_GPIE_MODEL;
+ /*
+ * use EIAM to auto-mask when MSI-X interrupt is asserted
+ * this saves a register write for every interrupt
+ */
+ } else {
+ /* legacy interrupts, use EIAM to auto-mask when reading EICR,
+ * specifically only auto mask tx and rx interrupts */
+ }
+
+ wr32(hw, NGBE_PX_GPIE, gpie);
+}
+
+static void ngbe_up_complete(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int err;
+
+ ngbe_get_hw_control(adapter);
+ ngbe_setup_gpie(adapter);
+
+ if (adapter->flags & NGBE_FLAG_MSIX_ENABLED)
+ ngbe_configure_msix(adapter);
+ else
+ ngbe_configure_msi_and_legacy(adapter);
+
+ smp_mb__before_atomic();
+ clear_bit(__NGBE_DOWN, &adapter->state);
+ ngbe_napi_enable_all(adapter);
+#ifndef CONFIG_NGBE_NO_LLI
+ ngbe_configure_lli(adapter);
+#endif
+
+ err = ngbe_non_sfp_link_config(hw);
+ if (err)
+ e_err(probe, "link_config FAILED %d\n", err);
+
+	/* select GMII */
+ wr32(hw, NGBE_MAC_TX_CFG,
+ (rd32(hw, NGBE_MAC_TX_CFG) & ~NGBE_MAC_TX_CFG_SPEED_MASK) |
+ NGBE_MAC_TX_CFG_SPEED_1G);
+
+ /* clear any pending interrupts, may auto mask */
+ rd32(hw, NGBE_PX_IC);
+ rd32(hw, NGBE_PX_MISC_IC);
+ ngbe_irq_enable(adapter, true, true);
+
+ if (((hw->subsystem_device_id & OEM_MASK) == LY_M88E1512_SFP) ||
+ (hw->subsystem_device_id & OEM_MASK) == LY_YT8521S_SFP)
+		/* gpio0 is used for power on/off control */
+ wr32(hw, NGBE_GPIO_DR, 0);
+
+ /* enable transmits */
+ netif_tx_start_all_queues(adapter->netdev);
+
+ /* bring the link up in the watchdog, this could race with our first
+ * link up interrupt but shouldn't be a problem */
+ adapter->flags |= NGBE_FLAG_NEED_LINK_UPDATE;
+ adapter->link_check_timeout = jiffies;
+#ifdef CONFIG_NGBE_POLL_LINK_STATUS
+ mod_timer(&adapter->link_check_timer, jiffies);
+#endif
+ mod_timer(&adapter->service_timer, jiffies);
+ /* ngbe_clear_vf_stats_counters(adapter); */
+
+ /* Set PF Reset Done bit so PF/VF Mail Ops can work */
+ wr32m(hw, NGBE_CFG_PORT_CTL,
+ NGBE_CFG_PORT_CTL_PFRSTD, NGBE_CFG_PORT_CTL_PFRSTD);
+}
+
+void ngbe_reinit_locked(struct ngbe_adapter *adapter)
+{
+ WARN_ON(in_interrupt());
+ /* put off any impending NetWatchDogTimeout */
+ netif_trans_update(adapter->netdev);
+
+ while (test_and_set_bit(__NGBE_RESETTING, &adapter->state))
+ usleep_range(1000, 2000);
+ ngbe_down(adapter);
+ /*
+ * If SR-IOV enabled then wait a bit before bringing the adapter
+ * back up to give the VFs time to respond to the reset. The
+ * two second wait is based upon the watchdog timer cycle in
+ * the VF driver.
+ */
+ if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED)
+ msleep(2000);
+ ngbe_up(adapter);
+ clear_bit(__NGBE_RESETTING, &adapter->state);
+}
+
+void ngbe_up(struct ngbe_adapter *adapter)
+{
+ /* hardware has been reset, we need to reload some things */
+ ngbe_configure(adapter);
+ ngbe_up_complete(adapter);
+}
+
+void ngbe_reset(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct net_device *netdev = adapter->netdev;
+ int err;
+ u8 old_addr[ETH_ALEN];
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ return;
+
+ err = TCALL(hw, mac.ops.init_hw);
+ switch (err) {
+ case 0:
+ break;
+ case NGBE_ERR_MASTER_REQUESTS_PENDING:
+ e_dev_err("master disable timed out\n");
+ break;
+ case NGBE_ERR_EEPROM_VERSION:
+ /* We are running on a pre-production device, log a warning */
+ e_dev_warn("This device is a pre-production adapter/LOM. "
+ "Please be aware there may be issues associated "
+ "with your hardware. If you are experiencing "
+ "problems please contact your hardware "
+ "representative who provided you with this "
+ "hardware.\n");
+ break;
+ default:
+ e_dev_err("Hardware Error: %d\n", err);
+ }
+
+ /* do not flush user set addresses */
+ memcpy(old_addr, &adapter->mac_table[0].addr, netdev->addr_len);
+ ngbe_flush_sw_mac_table(adapter);
+ ngbe_mac_set_default_filter(adapter, old_addr);
+
+ /* update SAN MAC vmdq pool selection */
+ TCALL(hw, mac.ops.set_vmdq_san_mac, VMDQ_P(0));
+
+ /* Clear saved DMA coalescing values except for watchdog_timer */
+ hw->mac.dmac_config.fcoe_en = false;
+ hw->mac.dmac_config.link_speed = 0;
+ hw->mac.dmac_config.fcoe_tc = 0;
+ hw->mac.dmac_config.num_tcs = 0;
+
+ if (test_bit(__NGBE_PTP_RUNNING, &adapter->state))
+ ngbe_ptp_reset(adapter);
+}
+
+/**
+ * ngbe_clean_rx_ring - Free Rx Buffers per Queue
+ * @rx_ring: ring to free buffers from
+ **/
+static void ngbe_clean_rx_ring(struct ngbe_ring *rx_ring)
+{
+ struct device *dev = rx_ring->dev;
+ unsigned long size;
+ u16 i;
+
+ /* ring already cleared, nothing to do */
+ if (!rx_ring->rx_buffer_info)
+ return;
+
+ /* Free all the Rx ring sk_buffs */
+ for (i = 0; i < rx_ring->count; i++) {
+ struct ngbe_rx_buffer *rx_buffer = &rx_ring->rx_buffer_info[i];
+ if (rx_buffer->dma) {
+ dma_unmap_single(dev,
+ rx_buffer->dma,
+ rx_ring->rx_buf_len,
+ DMA_FROM_DEVICE);
+ rx_buffer->dma = 0;
+ }
+
+ if (rx_buffer->skb) {
+ struct sk_buff *skb = rx_buffer->skb;
+ if (NGBE_CB(skb)->dma_released) {
+ dma_unmap_single(dev,
+ NGBE_CB(skb)->dma,
+ rx_ring->rx_buf_len,
+ DMA_FROM_DEVICE);
+ NGBE_CB(skb)->dma = 0;
+ NGBE_CB(skb)->dma_released = false;
+ }
+
+ if (NGBE_CB(skb)->page_released)
+ dma_unmap_page(dev,
+ NGBE_CB(skb)->dma,
+ ngbe_rx_bufsz(rx_ring),
+ DMA_FROM_DEVICE);
+ dev_kfree_skb(skb);
+ rx_buffer->skb = NULL;
+ }
+
+ if (!rx_buffer->page)
+ continue;
+
+ dma_unmap_page(dev, rx_buffer->page_dma,
+ ngbe_rx_pg_size(rx_ring),
+ DMA_FROM_DEVICE);
+
+ __free_pages(rx_buffer->page,
+ ngbe_rx_pg_order(rx_ring));
+ rx_buffer->page = NULL;
+ }
+
+ size = sizeof(struct ngbe_rx_buffer) * rx_ring->count;
+ memset(rx_ring->rx_buffer_info, 0, size);
+
+ /* Zero out the descriptor ring */
+ memset(rx_ring->desc, 0, rx_ring->size);
+
+ rx_ring->next_to_alloc = 0;
+ rx_ring->next_to_clean = 0;
+ rx_ring->next_to_use = 0;
+}
+
+/**
+ * ngbe_clean_tx_ring - Free Tx Buffers
+ * @tx_ring: ring to be cleaned
+ **/
+static void ngbe_clean_tx_ring(struct ngbe_ring *tx_ring)
+{
+ struct ngbe_tx_buffer *tx_buffer_info;
+ unsigned long size;
+ u16 i;
+
+ /* ring already cleared, nothing to do */
+ if (!tx_ring->tx_buffer_info)
+ return;
+
+ /* Free all the Tx ring sk_buffs */
+ for (i = 0; i < tx_ring->count; i++) {
+ tx_buffer_info = &tx_ring->tx_buffer_info[i];
+ ngbe_unmap_and_free_tx_resource(tx_ring, tx_buffer_info);
+ }
+
+ netdev_tx_reset_queue(txring_txq(tx_ring));
+
+ size = sizeof(struct ngbe_tx_buffer) * tx_ring->count;
+ memset(tx_ring->tx_buffer_info, 0, size);
+
+ /* Zero out the descriptor ring */
+ memset(tx_ring->desc, 0, tx_ring->size);
+}
+
+/**
+ * ngbe_clean_all_rx_rings - Free Rx Buffers for all queues
+ * @adapter: board private structure
+ **/
+static void ngbe_clean_all_rx_rings(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ ngbe_clean_rx_ring(adapter->rx_ring[i]);
+}
+
+/**
+ * ngbe_clean_all_tx_rings - Free Tx Buffers for all queues
+ * @adapter: board private structure
+ **/
+static void ngbe_clean_all_tx_rings(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ ngbe_clean_tx_ring(adapter->tx_ring[i]);
+}
+
+void ngbe_disable_device(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 i;
+
+ /* signal that we are down to the interrupt handler */
+ if (test_and_set_bit(__NGBE_DOWN, &adapter->state))
+ return; /* do nothing if already down */
+
+ ngbe_disable_pcie_master(hw);
+ /* disable receives */
+ TCALL(hw, mac.ops.disable_rx);
+
+ /* disable all enabled rx queues */
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ /* this call also flushes the previous write */
+ ngbe_disable_rx_queue(adapter, adapter->rx_ring[i]);
+
+ netif_tx_stop_all_queues(netdev);
+
+ /* call carrier off first to avoid false dev_watchdog timeouts */
+ netif_carrier_off(netdev);
+ netif_tx_disable(netdev);
+
+ if ((hw->subsystem_device_id & OEM_MASK) == LY_M88E1512_SFP ||
+ (hw->subsystem_device_id & OEM_MASK) == LY_YT8521S_SFP)
+		/* gpio0 is used for power on/off control */
+ wr32(hw, NGBE_GPIO_DR, NGBE_GPIO_DR_0);
+
+ ngbe_irq_disable(adapter);
+
+ ngbe_napi_disable_all(adapter);
+
+ adapter->flags2 &= ~(NGBE_FLAG2_PF_RESET_REQUESTED |
+ NGBE_FLAG2_DEV_RESET_REQUESTED |
+ NGBE_FLAG2_GLOBAL_RESET_REQUESTED);
+ adapter->flags &= ~NGBE_FLAG_NEED_LINK_UPDATE;
+
+ del_timer_sync(&adapter->service_timer);
+#ifdef CONFIG_NGBE_POLL_LINK_STATUS
+ del_timer_sync(&adapter->link_check_timer);
+#endif
+ if (adapter->num_vfs) {
+ /* Clear EITR Select mapping */
+ wr32(&adapter->hw, NGBE_PX_ITRSEL, 0);
+
+ /* Mark all the VFs as inactive */
+ for (i = 0 ; i < adapter->num_vfs; i++)
+ adapter->vfinfo[i].clear_to_send = 0;
+
+ /* ping all the active vfs to let them know we are going down */
+ ngbe_ping_all_vfs(adapter);
+
+ /* Disable all VFTE/VFRE TX/RX */
+ ngbe_disable_tx_rx(adapter);
+ }
+
+	/* OCP/NCSI management needs the MAC transmitter kept enabled */
+	if (!(((hw->subsystem_device_id & OEM_MASK) == OCP_CARD) ||
+	    ((hw->subsystem_device_id & WOL_SUP_MASK) == WOL_SUP) ||
+	    ((hw->subsystem_device_id & NCSI_SUP_MASK) == NCSI_SUP))) {
+		/* disable mac transmitter */
+		wr32m(hw, NGBE_MAC_TX_CFG, NGBE_MAC_TX_CFG_TE, 0);
+	}
+
+ /* disable transmits in the hardware now that interrupts are off */
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ u8 reg_idx = adapter->tx_ring[i]->reg_idx;
+ wr32(hw, NGBE_PX_TR_CFG(reg_idx),
+ NGBE_PX_TR_CFG_SWFLSH);
+ }
+
+ /* Disable the Tx DMA engine */
+ wr32m(hw, NGBE_TDM_CTL, NGBE_TDM_CTL_TE, 0);
+}
+
+void ngbe_down(struct ngbe_adapter *adapter)
+{
+ ngbe_disable_device(adapter);
+
+ if (!pci_channel_offline(adapter->pdev))
+ ngbe_reset(adapter);
+
+ ngbe_clean_all_tx_rings(adapter);
+ ngbe_clean_all_rx_rings(adapter);
+}
+
+/**
+ * ngbe_init_shared_code - Initialize the shared code
+ * @hw: pointer to hardware structure
+ *
+ * This will assign function pointers and assign the MAC type and PHY code.
+ * Does not touch the hardware. This function must be called prior to any
+ * other function in the shared code. The ngbe_hw structure should be
+ * memset to 0 prior to calling this function. The following fields in
+ * hw structure should be filled in prior to calling this function:
+ * hw_addr, back, device_id, vendor_id, subsystem_device_id,
+ * subsystem_vendor_id, and revision_id
+ **/
+s32 ngbe_init_shared_code(struct ngbe_hw *hw)
+{
+ DEBUGFUNC("\n");
+
+ if ((hw->subsystem_device_id & INTERNAL_SFP_MASK) == INTERNAL_SFP ||
+ (hw->subsystem_device_id & OEM_MASK) == LY_M88E1512_SFP)
+ hw->phy.type = ngbe_phy_m88e1512_sfi;
+ else if (hw->subsystem_device_id == NGBE_WX1860AL_M88E1512_RJ45)
+ hw->phy.type = ngbe_phy_m88e1512;
+ else if ((hw->subsystem_device_id & OEM_MASK) == YT8521S_SFP ||
+ (hw->subsystem_device_id & OEM_MASK) == LY_YT8521S_SFP)
+ hw->phy.type = ngbe_phy_yt8521s_sfi;
+ else
+ hw->phy.type = ngbe_phy_internal;
+
+	/* select MDIO clause 22 for PHY access */
+	wr32(hw, NGBE_MDIO_CLAUSE_SELECT, 0xF);
+
+ return ngbe_init_ops(hw);
+}
+
+/**
+ * ngbe_sw_init - Initialize general software structures (struct ngbe_adapter)
+ * @adapter: board private structure to initialize
+ *
+ * ngbe_sw_init initializes the Adapter private data structure.
+ * Fields are initialized based on PCI device information and
+ * OS network device settings (MTU size).
+ **/
+static const u32 def_rss_key[10] = {
+ 0xE291D73D, 0x1805EC6C, 0x2A94B30D,
+ 0xA54F2BEC, 0xEA49AF7C, 0xE214AD3D, 0xB855AABE,
+ 0x6A3E67EA, 0x14364D17, 0x3BED200D
+};
+
+static int ngbe_sw_init(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct pci_dev *pdev = adapter->pdev;
+ int err;
+
+ /* PCI config space info */
+ hw->vendor_id = pdev->vendor;
+ hw->device_id = pdev->device;
+ pci_read_config_byte(pdev, PCI_REVISION_ID, &hw->revision_id);
+ if (hw->revision_id == NGBE_FAILED_READ_CFG_BYTE &&
+ ngbe_check_cfg_remove(hw, pdev)) {
+ e_err(probe, "read of revision id failed\n");
+ err = -ENODEV;
+ goto out;
+ }
+ hw->subsystem_vendor_id = pdev->subsystem_vendor;
+ hw->subsystem_device_id = pdev->subsystem_device;
+
+ /* phy type, phy ops, mac ops */
+ err = ngbe_init_shared_code(hw);
+ if (err) {
+ e_err(probe, "init_shared_code failed: %d\n", err);
+ goto out;
+ }
+
+ adapter->mac_table = kzalloc(sizeof(struct ngbe_mac_addr) *
+ hw->mac.num_rar_entries,
+ GFP_ATOMIC);
+ if (!adapter->mac_table) {
+ err = NGBE_ERR_OUT_OF_MEM;
+ e_err(probe, "mac_table allocation failed: %d\n", err);
+ goto out;
+ }
+
+ memcpy(adapter->rss_key, def_rss_key, sizeof(def_rss_key));
+
+ /* Set common capability flags and settings */
+ adapter->max_q_vectors = NGBE_MAX_MSIX_Q_VECTORS_EMERALD;
+
+ /* Set MAC specific capability flags and exceptions */
+ adapter->flags |= NGBE_FLAGS_SP_INIT;
+ adapter->flags2 |= NGBE_FLAG2_TEMP_SENSOR_CAPABLE;
+ adapter->flags2 |= NGBE_FLAG2_EEE_CAPABLE;
+
+ /* init mailbox params */
+ TCALL(hw, mbx.ops.init_params);
+
+ /* default flow control settings */
+ hw->fc.requested_mode = ngbe_fc_full;
+ hw->fc.current_mode = ngbe_fc_full; /* init for ethtool output */
+
+ adapter->last_lfc_mode = hw->fc.current_mode;
+ hw->fc.pause_time = NGBE_DEFAULT_FCPAUSE;
+ hw->fc.send_xon = true;
+ hw->fc.disable_fc_autoneg = false;
+
+ /* set default ring sizes */
+ adapter->tx_ring_count = NGBE_DEFAULT_TXD;
+ adapter->rx_ring_count = NGBE_DEFAULT_RXD;
+
+ /* set default work limits */
+ adapter->tx_work_limit = NGBE_DEFAULT_TX_WORK;
+ adapter->rx_work_limit = NGBE_DEFAULT_RX_WORK;
+
+ adapter->tx_timeout_recovery_level = 0;
+
+ /* PF holds first pool slot */
+ adapter->num_vmdqs = 1;
+ set_bit(0, &adapter->fwd_bitmask);
+ set_bit(__NGBE_DOWN, &adapter->state);
+out:
+ return err;
+}
+
+/**
+ * ngbe_setup_tx_resources - allocate Tx resources (Descriptors)
+ * @tx_ring: tx descriptor ring (for a specific queue) to setup
+ *
+ * Return 0 on success, negative on failure
+ **/
+int ngbe_setup_tx_resources(struct ngbe_ring *tx_ring)
+{
+ struct device *dev = tx_ring->dev;
+ int orig_node = dev_to_node(dev);
+ int numa_node = -1;
+ int size;
+
+ size = sizeof(struct ngbe_tx_buffer) * tx_ring->count;
+
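+	/*
+	 * Prefer memory on the NUMA node of the queue's interrupt vector;
+	 * fall back to any node if the local allocation fails.
+	 */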
+ if (tx_ring->q_vector)
+ numa_node = tx_ring->q_vector->numa_node;
+
+ tx_ring->tx_buffer_info = vzalloc_node(size, numa_node);
+ if (!tx_ring->tx_buffer_info)
+ tx_ring->tx_buffer_info = vzalloc(size);
+ if (!tx_ring->tx_buffer_info)
+ goto err;
+
+ /* round up to nearest 4K */
+ tx_ring->size = tx_ring->count * sizeof(union ngbe_tx_desc);
+ tx_ring->size = ALIGN(tx_ring->size, 4096);
+
+ set_dev_node(dev, numa_node);
+ tx_ring->desc = dma_alloc_coherent(dev,
+ tx_ring->size,
+ &tx_ring->dma,
+ GFP_KERNEL);
+ set_dev_node(dev, orig_node);
+ if (!tx_ring->desc)
+ tx_ring->desc = dma_alloc_coherent(dev, tx_ring->size,
+ &tx_ring->dma, GFP_KERNEL);
+ if (!tx_ring->desc)
+ goto err;
+
+ return 0;
+
+err:
+ vfree(tx_ring->tx_buffer_info);
+ tx_ring->tx_buffer_info = NULL;
+ dev_err(dev, "Unable to allocate memory for the Tx descriptor ring\n");
+ return -ENOMEM;
+}
+
+/**
+ * ngbe_setup_all_tx_resources - allocate all queues Tx resources
+ * @adapter: board private structure
+ *
+ * If this function returns with an error, then it's possible one or
+ * more of the rings is populated (while the rest are not). It is the
+ * callers duty to clean those orphaned rings.
+ *
+ * Return 0 on success, negative on failure
+ **/
+static int ngbe_setup_all_tx_resources(struct ngbe_adapter *adapter)
+{
+ int i, err = 0;
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ err = ngbe_setup_tx_resources(adapter->tx_ring[i]);
+ if (!err)
+ continue;
+
+ e_err(probe, "Allocation for Tx Queue %u failed\n", i);
+ goto err_setup_tx;
+ }
+
+ return 0;
+err_setup_tx:
+ /* rewind the index freeing the rings as we go */
+ while (i--)
+ ngbe_free_tx_resources(adapter->tx_ring[i]);
+ return err;
+}
+
+/**
+ * ngbe_setup_rx_resources - allocate Rx resources (Descriptors)
+ * @rx_ring: rx descriptor ring (for a specific queue) to setup
+ *
+ * Returns 0 on success, negative on failure
+ **/
+int ngbe_setup_rx_resources(struct ngbe_ring *rx_ring)
+{
+ struct device *dev = rx_ring->dev;
+ int orig_node = dev_to_node(dev);
+ int numa_node = -1;
+ int size;
+
+ size = sizeof(struct ngbe_rx_buffer) * rx_ring->count;
+
+ if (rx_ring->q_vector)
+ numa_node = rx_ring->q_vector->numa_node;
+
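+	/* try the ring's local NUMA node first, then fall back to any node */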
+ rx_ring->rx_buffer_info = vzalloc_node(size, numa_node);
+ if (!rx_ring->rx_buffer_info)
+ rx_ring->rx_buffer_info = vzalloc(size);
+ if (!rx_ring->rx_buffer_info)
+ goto err;
+
+ /* Round up to nearest 4K */
+ rx_ring->size = rx_ring->count * sizeof(union ngbe_rx_desc);
+ rx_ring->size = ALIGN(rx_ring->size, 4096);
+
+ set_dev_node(dev, numa_node);
+ rx_ring->desc = dma_alloc_coherent(dev,
+ rx_ring->size,
+ &rx_ring->dma,
+ GFP_KERNEL);
+ set_dev_node(dev, orig_node);
+ if (!rx_ring->desc)
+ rx_ring->desc = dma_alloc_coherent(dev, rx_ring->size,
+ &rx_ring->dma, GFP_KERNEL);
+ if (!rx_ring->desc)
+ goto err;
+
+ return 0;
+err:
+ vfree(rx_ring->rx_buffer_info);
+ rx_ring->rx_buffer_info = NULL;
+ dev_err(dev, "Unable to allocate memory for the Rx descriptor ring\n");
+ return -ENOMEM;
+}
+
+/**
+ * ngbe_setup_all_rx_resources - allocate all queues Rx resources
+ * @adapter: board private structure
+ *
+ * If this function returns with an error, then it's possible one or
+ * more of the rings is populated (while the rest are not). It is the
+ * callers duty to clean those orphaned rings.
+ *
+ * Return 0 on success, negative on failure
+ **/
+static int ngbe_setup_all_rx_resources(struct ngbe_adapter *adapter)
+{
+ int i, err = 0;
+
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ err = ngbe_setup_rx_resources(adapter->rx_ring[i]);
+		if (!err)
+			continue;
+
+ e_err(probe, "Allocation for Rx Queue %u failed\n", i);
+ goto err_setup_rx;
+ }
+
+ return 0;
+err_setup_rx:
+ /* rewind the index freeing the rings as we go */
+ while (i--)
+ ngbe_free_rx_resources(adapter->rx_ring[i]);
+ return err;
+}
+
+/**
+ * ngbe_setup_isb_resources - allocate interrupt status resources
+ * @adapter: board private structure
+ *
+ * Return 0 on success, negative on failure
+ **/
+static int ngbe_setup_isb_resources(struct ngbe_adapter *adapter)
+{
+ struct device *dev = pci_dev_to_dev(adapter->pdev);
+
+ adapter->isb_mem = dma_alloc_coherent(dev,
+ sizeof(u32) * NGBE_ISB_MAX,
+ &adapter->isb_dma,
+ GFP_KERNEL);
+ if (!adapter->isb_mem) {
+ e_err(probe, "ngbe_setup_isb_resources: alloc isb_mem failed\n");
+ return -ENOMEM;
+ }
+ memset(adapter->isb_mem, 0, sizeof(u32) * NGBE_ISB_MAX);
+ return 0;
+}
+
+/**
+ * ngbe_free_isb_resources - free interrupt status resources
+ * @adapter: board private structure
+ **/
+static void ngbe_free_isb_resources(struct ngbe_adapter *adapter)
+{
+ struct device *dev = pci_dev_to_dev(adapter->pdev);
+
+ dma_free_coherent(dev, sizeof(u32) * NGBE_ISB_MAX,
+ adapter->isb_mem, adapter->isb_dma);
+ adapter->isb_mem = NULL;
+}
+
+/**
+ * ngbe_free_tx_resources - Free Tx Resources per Queue
+ * @tx_ring: Tx descriptor ring for a specific queue
+ *
+ * Free all transmit software resources
+ **/
+void ngbe_free_tx_resources(struct ngbe_ring *tx_ring)
+{
+ ngbe_clean_tx_ring(tx_ring);
+
+ vfree(tx_ring->tx_buffer_info);
+ tx_ring->tx_buffer_info = NULL;
+
+ /* if not set, then don't free */
+ if (!tx_ring->desc)
+ return;
+
+ dma_free_coherent(tx_ring->dev, tx_ring->size,
+ tx_ring->desc, tx_ring->dma);
+ tx_ring->desc = NULL;
+}
+
+/**
+ * ngbe_free_all_tx_resources - Free Tx Resources for All Queues
+ * @adapter: board private structure
+ *
+ * Free all transmit software resources
+ **/
+static void ngbe_free_all_tx_resources(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ ngbe_free_tx_resources(adapter->tx_ring[i]);
+}
+
+/**
+ * ngbe_free_rx_resources - Free Rx Resources
+ * @rx_ring: ring to clean the resources from
+ *
+ * Free all receive software resources
+ **/
+void ngbe_free_rx_resources(struct ngbe_ring *rx_ring)
+{
+ ngbe_clean_rx_ring(rx_ring);
+
+ vfree(rx_ring->rx_buffer_info);
+ rx_ring->rx_buffer_info = NULL;
+
+ /* if not set, then don't free */
+ if (!rx_ring->desc)
+ return;
+
+ dma_free_coherent(rx_ring->dev, rx_ring->size,
+ rx_ring->desc, rx_ring->dma);
+ rx_ring->desc = NULL;
+}
+
+/**
+ * ngbe_free_all_rx_resources - Free Rx Resources for All Queues
+ * @adapter: board private structure
+ *
+ * Free all receive software resources
+ **/
+static void ngbe_free_all_rx_resources(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ ngbe_free_rx_resources(adapter->rx_ring[i]);
+}
+
+/**
+ * ngbe_change_mtu - Change the Maximum Transfer Unit
+ * @netdev: network interface device structure
+ * @new_mtu: new value for maximum frame size
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int ngbe_change_mtu(struct net_device *netdev, int new_mtu)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ if ((new_mtu < 68) || (new_mtu > 9414))
+ return -EINVAL;
+
+ /*
+ * we cannot allow legacy VFs to enable their receive
+ * paths when MTU greater than 1500 is configured. So display a
+ * warning that legacy VFs will be disabled.
+ */
+ if ((adapter->flags & NGBE_FLAG_SRIOV_ENABLED) &&
+ (new_mtu > ETH_DATA_LEN))
+ e_warn(probe, "Setting MTU > 1500 will disable legacy VFs\n");
+
+ e_info(probe, "changing MTU from %d to %d\n", netdev->mtu, new_mtu);
+
+ /* must set new MTU before calling down or up */
+ netdev->mtu = new_mtu;
+
+ if (netif_running(netdev))
+ ngbe_reinit_locked(adapter);
+
+ return 0;
+}
+
+/**
+ * ngbe_open - Called when a network interface is made active
+ * @netdev: network interface device structure
+ *
+ * Returns 0 on success, negative value on failure
+ *
+ * The open entry point is called when a network interface is made
+ * active by the system (IFF_UP). At this point all resources needed
+ * for transmit and receive operations are allocated, the interrupt
+ * handler is registered with the OS, the watchdog timer is started,
+ * and the stack is notified that the interface is ready.
+ **/
+int ngbe_open(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ int err;
+
+ /* disallow open during test */
+ if (test_bit(__NGBE_TESTING, &adapter->state))
+ return -EBUSY;
+
+ netif_carrier_off(netdev);
+
+ /* allocate transmit descriptors */
+ err = ngbe_setup_all_tx_resources(adapter);
+ if (err)
+ goto err_setup_tx;
+
+ /* allocate receive descriptors */
+ err = ngbe_setup_all_rx_resources(adapter);
+ if (err)
+ goto err_setup_rx;
+
+ err = ngbe_setup_isb_resources(adapter);
+ if (err)
+ goto err_req_isb;
+
+ ngbe_configure(adapter);
+
+ err = ngbe_request_irq(adapter);
+ if (err)
+ goto err_req_irq;
+
+ if (adapter->num_tx_queues) {
+ /* Notify the stack of the actual queue counts. */
+ err = netif_set_real_num_tx_queues(netdev, adapter->num_vmdqs > 1
+ ? adapter->queues_per_pool
+ : adapter->num_tx_queues);
+ if (err)
+ goto err_set_queues;
+ }
+
+ if (adapter->num_rx_queues) {
+ err = netif_set_real_num_rx_queues(netdev, adapter->num_vmdqs > 1
+ ? adapter->queues_per_pool
+ : adapter->num_rx_queues);
+ if (err)
+ goto err_set_queues;
+ }
+
+ ngbe_ptp_init(adapter);
+ ngbe_up_complete(adapter);
+
+ return 0;
+
+err_set_queues:
+ ngbe_free_irq(adapter);
+err_req_irq:
+ ngbe_free_isb_resources(adapter);
+err_req_isb:
+ ngbe_free_all_rx_resources(adapter);
+
+err_setup_rx:
+ ngbe_free_all_tx_resources(adapter);
+err_setup_tx:
+ ngbe_reset(adapter);
+ return err;
+}
+
+/**
+ * ngbe_close_suspend - actions necessary to both suspend and close flows
+ * @adapter: the private adapter struct
+ *
+ * This function should contain the necessary work common to both suspending
+ * and closing of the device.
+ */
+static void ngbe_close_suspend(struct ngbe_adapter *adapter)
+{
+ ngbe_ptp_suspend(adapter);
+ ngbe_disable_device(adapter);
+
+ ngbe_clean_all_tx_rings(adapter);
+ ngbe_clean_all_rx_rings(adapter);
+
+ ngbe_free_irq(adapter);
+
+ ngbe_free_isb_resources(adapter);
+ ngbe_free_all_rx_resources(adapter);
+ ngbe_free_all_tx_resources(adapter);
+}
+
+/**
+ * ngbe_close - Disables a network interface
+ * @netdev: network interface device structure
+ *
+ * Returns 0, this is not allowed to fail
+ *
+ * The close entry point is called when an interface is de-activated
+ * by the OS. The hardware is still under the drivers control, but
+ * needs to be disabled. A global MAC reset is issued to stop the
+ * hardware, and all transmit and receive resources are freed.
+ **/
+int ngbe_close(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ ngbe_ptp_stop(adapter);
+ ngbe_down(adapter);
+ ngbe_free_irq(adapter);
+
+ ngbe_free_isb_resources(adapter);
+ ngbe_free_all_rx_resources(adapter);
+ ngbe_free_all_tx_resources(adapter);
+
+ ngbe_release_hw_control(adapter);
+
+ return 0;
+}
+
+#ifdef CONFIG_PM
+static int ngbe_resume(struct pci_dev *pdev)
+{
+ struct ngbe_adapter *adapter;
+ struct net_device *netdev;
+	int err;
+
+ adapter = pci_get_drvdata(pdev);
+ netdev = adapter->netdev;
+ adapter->hw.hw_addr = adapter->io_addr;
+ pci_set_power_state(pdev, PCI_D0);
+ pci_restore_state(pdev);
+ /*
+ * pci_restore_state clears dev->state_saved so call
+ * pci_save_state to restore it.
+ */
+ pci_save_state(pdev);
+ wr32(&adapter->hw, NGBE_PSR_WKUP_CTL, adapter->wol);
+
+ err = pci_enable_device_mem(pdev);
+ if (err) {
+ e_dev_err("Cannot enable PCI device from suspend\n");
+ return err;
+ }
+ smp_mb__before_atomic();
+ clear_bit(__NGBE_DISABLED, &adapter->state);
+ pci_set_master(pdev);
+
+ pci_wake_from_d3(pdev, false);
+
+ ngbe_reset(adapter);
+
+ rtnl_lock();
+
+ err = ngbe_init_interrupt_scheme(adapter);
+ if (!err && netif_running(netdev))
+ err = ngbe_open(netdev);
+
+ rtnl_unlock();
+
+ if (err)
+ return err;
+
+ netif_device_attach(netdev);
+
+ return 0;
+}
+#endif /* CONFIG_PM */
+
+/*
+ * __ngbe_shutdown is not used when power management is disabled on
+ * older kernels (<2.6.12); it would then be defined but unused,
+ * causing a compile warning/error.
+ */
+static int __ngbe_shutdown(struct pci_dev *pdev, bool *enable_wake)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(pdev);
+ struct net_device *netdev = adapter->netdev;
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 wufc = adapter->wol;
+#ifdef CONFIG_PM
+ int retval = 0;
+#endif
+
+ netif_device_detach(netdev);
+
+ rtnl_lock();
+ if (netif_running(netdev))
+ ngbe_close_suspend(adapter);
+ rtnl_unlock();
+
+ ngbe_clear_interrupt_scheme(adapter);
+
+#ifdef CONFIG_PM
+ retval = pci_save_state(pdev);
+ if (retval)
+ return retval;
+#endif
+
+	/* this won't stop the link if manageability or WoL is enabled */
+ ngbe_stop_mac_link_on_d3(hw);
+
+ if (wufc) {
+ ngbe_set_rx_mode(netdev);
+ ngbe_configure_rx(adapter);
+		/* enable the optics for SFP+ fiber so we can wake on LAN */
+ TCALL(hw, mac.ops.enable_tx_laser);
+
+ /* turn on all-multi mode if wake on multicast is enabled */
+ if (wufc & NGBE_PSR_WKUP_CTL_MC) {
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_MPE, NGBE_PSR_CTL_MPE);
+ }
+
+ pci_clear_master(adapter->pdev);
+ wr32(hw, NGBE_PSR_WKUP_CTL, wufc);
+ } else {
+ wr32(hw, NGBE_PSR_WKUP_CTL, 0);
+ }
+
+ pci_wake_from_d3(pdev, !!wufc);
+
+ *enable_wake = !!wufc;
+ ngbe_release_hw_control(adapter);
+
+ if (!test_and_set_bit(__NGBE_DISABLED, &adapter->state))
+ pci_disable_device(pdev);
+
+ return 0;
+}
+
+#ifdef CONFIG_PM
+static int ngbe_suspend(struct pci_dev *pdev,
+ pm_message_t __always_unused state)
+{
+ int retval;
+ bool wake;
+
+ retval = __ngbe_shutdown(pdev, &wake);
+ if (retval)
+ return retval;
+
+ if (wake) {
+ pci_prepare_to_sleep(pdev);
+ } else {
+ pci_wake_from_d3(pdev, false);
+ pci_set_power_state(pdev, PCI_D3hot);
+ }
+
+ return 0;
+}
+#endif /* CONFIG_PM */
+
+static void ngbe_shutdown(struct pci_dev *pdev)
+{
+ bool wake;
+
+ __ngbe_shutdown(pdev, &wake);
+
+ if (system_state == SYSTEM_POWER_OFF) {
+ pci_wake_from_d3(pdev, wake);
+ pci_set_power_state(pdev, PCI_D3hot);
+ }
+}
+
+/**
+ * ngbe_get_stats64 - Get System Network Statistics
+ * @netdev: network interface device structure
+ * @stats: storage space for 64bit statistics
+ *
+ * Returns 64bit statistics, for use in the ndo_get_stats64 callback. This
+ * function replaces ngbe_get_stats for kernels which support it.
+ */
+static void ngbe_get_stats64(struct net_device *netdev,
+ struct rtnl_link_stats64 *stats)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ int i;
+
+ rcu_read_lock();
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ struct ngbe_ring *ring = READ_ONCE(adapter->rx_ring[i]);
+ u64 bytes, packets;
+ unsigned int start;
+
+ if (ring) {
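+			/* u64_stats seqcount: retry the read if a writer
+			 * updated the counters while we were fetching them
+			 */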
+ do {
+ start = u64_stats_fetch_begin_irq(&ring->syncp);
+ packets = ring->stats.packets;
+ bytes = ring->stats.bytes;
+ } while (u64_stats_fetch_retry_irq(&ring->syncp,
+ start));
+ stats->rx_packets += packets;
+ stats->rx_bytes += bytes;
+ }
+ }
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct ngbe_ring *ring = READ_ONCE(adapter->tx_ring[i]);
+ u64 bytes, packets;
+ unsigned int start;
+
+ if (ring) {
+ do {
+ start = u64_stats_fetch_begin_irq(&ring->syncp);
+ packets = ring->stats.packets;
+ bytes = ring->stats.bytes;
+ } while (u64_stats_fetch_retry_irq(&ring->syncp,
+ start));
+ stats->tx_packets += packets;
+ stats->tx_bytes += bytes;
+ }
+ }
+ rcu_read_unlock();
+ /* following stats updated by ngbe_watchdog_task() */
+ stats->multicast = netdev->stats.multicast;
+ stats->rx_errors = netdev->stats.rx_errors;
+ stats->rx_length_errors = netdev->stats.rx_length_errors;
+ stats->rx_crc_errors = netdev->stats.rx_crc_errors;
+ stats->rx_missed_errors = netdev->stats.rx_missed_errors;
+}
+
+/**
+ * ngbe_update_stats - Update the board statistics counters.
+ * @adapter: board private structure
+ **/
+void ngbe_update_stats(struct ngbe_adapter *adapter)
+{
+ struct net_device_stats *net_stats = &adapter->netdev->stats;
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_hw_stats *hwstats = &adapter->stats;
+ u64 total_mpc = 0;
+ u32 i, bprc, lxon, lxoff;
+ u64 non_eop_descs = 0, restart_queue = 0, tx_busy = 0;
+ u64 alloc_rx_page_failed = 0, alloc_rx_buff_failed = 0;
+ u64 bytes = 0, packets = 0, hw_csum_rx_error = 0;
+ u64 hw_csum_rx_good = 0;
+
+ if (test_bit(__NGBE_DOWN, &adapter->state) ||
+ test_bit(__NGBE_RESETTING, &adapter->state))
+ return;
+
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ struct ngbe_ring *rx_ring = adapter->rx_ring[i];
+ non_eop_descs += rx_ring->rx_stats.non_eop_descs;
+ alloc_rx_page_failed += rx_ring->rx_stats.alloc_rx_page_failed;
+ alloc_rx_buff_failed += rx_ring->rx_stats.alloc_rx_buff_failed;
+ hw_csum_rx_error += rx_ring->rx_stats.csum_err;
+ hw_csum_rx_good += rx_ring->rx_stats.csum_good_cnt;
+ bytes += rx_ring->stats.bytes;
+ packets += rx_ring->stats.packets;
+ }
+
+ adapter->non_eop_descs = non_eop_descs;
+ adapter->alloc_rx_page_failed = alloc_rx_page_failed;
+ adapter->alloc_rx_buff_failed = alloc_rx_buff_failed;
+ adapter->hw_csum_rx_error = hw_csum_rx_error;
+ adapter->hw_csum_rx_good = hw_csum_rx_good;
+ net_stats->rx_bytes = bytes;
+ net_stats->rx_packets = packets;
+
+ bytes = 0;
+ packets = 0;
+ /* gather some stats to the adapter struct that are per queue */
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct ngbe_ring *tx_ring = adapter->tx_ring[i];
+ restart_queue += tx_ring->tx_stats.restart_queue;
+ tx_busy += tx_ring->tx_stats.tx_busy;
+ bytes += tx_ring->stats.bytes;
+ packets += tx_ring->stats.packets;
+ }
+ adapter->restart_queue = restart_queue;
+ adapter->tx_busy = tx_busy;
+ net_stats->tx_bytes = bytes;
+ net_stats->tx_packets = packets;
+
+ hwstats->crcerrs += rd32(hw, NGBE_RX_CRC_ERROR_FRAMES_LOW);
+
+ hwstats->gprc += rd32(hw, NGBE_PX_GPRC);
+
+ ngbe_update_xoff_rx_lfc(adapter);
+
+ hwstats->o2bgptc += rd32(hw, NGBE_TDM_OS2BMC_CNT);
+ if (ngbe_check_mng_access(&adapter->hw)) {
+ hwstats->o2bspc += rd32(hw, NGBE_MNG_OS2BMC_CNT);
+ hwstats->b2ospc += rd32(hw, NGBE_MNG_BMC2OS_CNT);
+ }
+ hwstats->b2ogprc += rd32(hw, NGBE_RDM_BMC2OS_CNT);
+ hwstats->gorc += rd32(hw, NGBE_PX_GORC_LSB);
+ hwstats->gorc += (u64)rd32(hw, NGBE_PX_GORC_MSB) << 32;
+
+ hwstats->gotc += rd32(hw, NGBE_PX_GOTC_LSB);
+ hwstats->gotc += (u64)rd32(hw, NGBE_PX_GOTC_MSB) << 32;
+
+ adapter->hw_rx_no_dma_resources +=
+ rd32(hw, NGBE_RDM_DRP_PKT);
+ bprc = rd32(hw, NGBE_RX_BC_FRAMES_GOOD_LOW);
+ hwstats->bprc += bprc;
+ hwstats->mprc = 0;
+
+ for (i = 0; i < 8; i++)
+ hwstats->mprc += rd32(hw, NGBE_PX_MPRC(i));
+
+ hwstats->roc += rd32(hw, NGBE_RX_OVERSIZE_FRAMES_GOOD);
+ hwstats->rlec += rd32(hw, NGBE_RX_LEN_ERROR_FRAMES_LOW);
+ lxon = rd32(hw, NGBE_RDB_LXONTXC);
+ hwstats->lxontxc += lxon;
+ lxoff = rd32(hw, NGBE_RDB_LXOFFTXC);
+ hwstats->lxofftxc += lxoff;
+
+ hwstats->gptc += rd32(hw, NGBE_PX_GPTC);
+ hwstats->mptc += rd32(hw, NGBE_TX_MC_FRAMES_GOOD_LOW);
+ hwstats->ruc += rd32(hw, NGBE_RX_UNDERSIZE_FRAMES_GOOD);
+ hwstats->tpr += rd32(hw, NGBE_RX_FRAME_CNT_GOOD_BAD_LOW);
+ hwstats->bptc += rd32(hw, NGBE_TX_BC_FRAMES_GOOD_LOW);
+ /* Fill out the OS statistics structure */
+ net_stats->multicast = hwstats->mprc;
+
+ /* Rx Errors */
+ net_stats->rx_errors = hwstats->crcerrs +
+ hwstats->rlec;
+ net_stats->rx_dropped = 0;
+ net_stats->rx_length_errors = hwstats->rlec;
+ net_stats->rx_crc_errors = hwstats->crcerrs;
+ total_mpc = rd32(hw, NGBE_RDB_MPCNT);
+ net_stats->rx_missed_errors = total_mpc;
+
+ /*
+ * VF Stats Collection - skip while resetting because these
+ * are not clear on read and otherwise you'll sometimes get
+ * crazy values.
+ */
+ if (!test_bit(__NGBE_RESETTING, &adapter->state)) {
+		for (i = 0; i < adapter->num_vfs; i++) {
+			UPDATE_VF_COUNTER_32bit(NGBE_VX_GPRC,
+				adapter->vfinfo[i].last_vfstats.gprc,
+				adapter->vfinfo[i].vfstats.gprc);
+			UPDATE_VF_COUNTER_32bit(NGBE_VX_GPTC,
+				adapter->vfinfo[i].last_vfstats.gptc,
+				adapter->vfinfo[i].vfstats.gptc);
+			UPDATE_VF_COUNTER_36bit(NGBE_VX_GORC_LSB,
+				NGBE_VX_GORC_MSB,
+				adapter->vfinfo[i].last_vfstats.gorc,
+				adapter->vfinfo[i].vfstats.gorc);
+			UPDATE_VF_COUNTER_36bit(NGBE_VX_GOTC_LSB,
+				NGBE_VX_GOTC_MSB,
+				adapter->vfinfo[i].last_vfstats.gotc,
+				adapter->vfinfo[i].vfstats.gotc);
+			UPDATE_VF_COUNTER_32bit(NGBE_VX_MPRC,
+				adapter->vfinfo[i].last_vfstats.mprc,
+				adapter->vfinfo[i].vfstats.mprc);
+		}
+ }
+}
+
+/**
+ * ngbe_check_hang_subtask - check for hung queues and dropped interrupts
+ * @adapter: pointer to the device adapter structure
+ *
+ * This function serves two purposes. First it strobes the interrupt lines
+ * in order to make certain interrupts are occurring. Secondly it sets the
+ * bits needed to check for TX hangs. As a result we should immediately
+ * determine if a hang has occurred.
+ */
+static void ngbe_check_hang_subtask(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ /* If we're down or resetting, just bail */
+ if (test_bit(__NGBE_DOWN, &adapter->state) ||
+ test_bit(__NGBE_REMOVING, &adapter->state) ||
+ test_bit(__NGBE_RESETTING, &adapter->state))
+ return;
+
+ /* Force detection of hung controller */
+ if (netif_carrier_ok(adapter->netdev)) {
+ for (i = 0; i < adapter->num_tx_queues; i++)
+ set_check_for_tx_hang(adapter->tx_ring[i]);
+ }
+}
+
+static void ngbe_watchdog_an_complete(struct ngbe_adapter *adapter)
+{
+ u32 link_speed = 0;
+ u32 lan_speed = 0;
+ bool link_up = true;
+ struct ngbe_hw *hw = &adapter->hw;
+
+ if (!(adapter->flags & NGBE_FLAG_NEED_ANC_CHECK))
+ return;
+
+ TCALL(hw, mac.ops.check_link, &link_speed, &link_up, false);
+
+ adapter->link_speed = link_speed;
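+	/* NGBE_CFG_LAN_SPEED field encoding: 0 = 10M, 1 = 100M, 2 = 1G */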
+ switch (link_speed) {
+ case NGBE_LINK_SPEED_100_FULL:
+ lan_speed = 1;
+ break;
+ case NGBE_LINK_SPEED_1GB_FULL:
+ lan_speed = 2;
+ break;
+ case NGBE_LINK_SPEED_10_FULL:
+ lan_speed = 0;
+ break;
+ default:
+ break;
+ }
+ wr32m(hw, NGBE_CFG_LAN_SPEED,
+ 0x3, lan_speed);
+
+ if (link_speed & (NGBE_LINK_SPEED_1GB_FULL |
+ NGBE_LINK_SPEED_100_FULL | NGBE_LINK_SPEED_10_FULL)) {
+ wr32(hw, NGBE_MAC_TX_CFG,
+ (rd32(hw, NGBE_MAC_TX_CFG) &
+ ~NGBE_MAC_TX_CFG_SPEED_MASK) | NGBE_MAC_TX_CFG_TE |
+ NGBE_MAC_TX_CFG_SPEED_1G);
+ }
+
+ adapter->flags &= ~NGBE_FLAG_NEED_ANC_CHECK;
+ adapter->flags |= NGBE_FLAG_NEED_LINK_UPDATE;
+}
+
+/**
+ * ngbe_watchdog_update_link_status - update the link status
+ * @adapter: pointer to the device adapter structure
+ **/
+static void ngbe_watchdog_update_link_status(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 link_speed = adapter->link_speed;
+ bool link_up = adapter->link_up;
+ u32 lan_speed = 0;
+ u32 reg;
+
+#ifndef CONFIG_NGBE_POLL_LINK_STATUS
+ if (!(adapter->flags & NGBE_FLAG_NEED_LINK_UPDATE))
+ return;
+#endif
+ link_speed = NGBE_LINK_SPEED_1GB_FULL;
+ link_up = true;
+
+ TCALL(hw, mac.ops.check_link, &link_speed, &link_up, false);
+#ifndef CONFIG_NGBE_POLL_LINK_STATUS
+ if (link_up || time_after(jiffies, (adapter->link_check_timeout +
+ NGBE_TRY_LINK_TIMEOUT))) {
+ adapter->flags &= ~NGBE_FLAG_NEED_LINK_UPDATE;
+ }
+#else
+ if (adapter->link_up == link_up &&
+ adapter->link_speed == link_speed)
+ return;
+#endif
+
+ adapter->link_speed = link_speed;
+ switch (link_speed) {
+ case NGBE_LINK_SPEED_100_FULL:
+ lan_speed = 1;
+ break;
+ case NGBE_LINK_SPEED_1GB_FULL:
+ lan_speed = 2;
+ break;
+ case NGBE_LINK_SPEED_10_FULL:
+ lan_speed = 0;
+ break;
+ default:
+ break;
+ }
+ wr32m(hw, NGBE_CFG_LAN_SPEED,
+ 0x3, lan_speed);
+
+ if (link_up) {
+ TCALL(hw, mac.ops.fc_enable);
+ ngbe_set_rx_drop_en(adapter);
+
+ adapter->last_rx_ptp_check = jiffies;
+
+ if (test_bit(__NGBE_PTP_RUNNING, &adapter->state))
+ ngbe_ptp_start_cyclecounter(adapter);
+
+ if (link_speed & (NGBE_LINK_SPEED_1GB_FULL |
+ NGBE_LINK_SPEED_100_FULL | NGBE_LINK_SPEED_10_FULL)) {
+ wr32(hw, NGBE_MAC_TX_CFG,
+ (rd32(hw, NGBE_MAC_TX_CFG) &
+ ~NGBE_MAC_TX_CFG_SPEED_MASK) | NGBE_MAC_TX_CFG_TE |
+ NGBE_MAC_TX_CFG_SPEED_1G);
+ }
+
+ /* Re configure MAC RX */
+ reg = rd32(hw, NGBE_MAC_RX_CFG);
+ wr32(hw, NGBE_MAC_RX_CFG, reg);
+ wr32(hw, NGBE_MAC_PKT_FLT, NGBE_MAC_PKT_FLT_PR);
+ reg = rd32(hw, NGBE_MAC_WDG_TIMEOUT);
+ wr32(hw, NGBE_MAC_WDG_TIMEOUT, reg);
+ }
+
+ adapter->link_up = link_up;
+ if (hw->mac.ops.dmac_config && hw->mac.dmac_config.watchdog_timer) {
+ u8 num_tcs = netdev_get_num_tc(adapter->netdev);
+
+ if (hw->mac.dmac_config.link_speed != link_speed ||
+ hw->mac.dmac_config.num_tcs != num_tcs) {
+ hw->mac.dmac_config.link_speed = link_speed;
+ hw->mac.dmac_config.num_tcs = num_tcs;
+ TCALL(hw, mac.ops.dmac_config);
+ }
+ }
+}
+
+static void ngbe_update_default_up(struct ngbe_adapter *adapter)
+{
+ u8 up = 0;
+ adapter->default_up = up;
+}
+
+/**
+ * ngbe_watchdog_link_is_up - update netif_carrier status and
+ * print link up message
+ * @adapter: pointer to the device adapter structure
+ **/
+static void ngbe_watchdog_link_is_up(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 link_speed = adapter->link_speed;
+ bool flow_rx, flow_tx;
+
+ /* only continue if link was previously down */
+ if (netif_carrier_ok(netdev))
+ return;
+
+ adapter->flags2 &= ~NGBE_FLAG2_SEARCH_FOR_SFP;
+
+ /* flow_rx, flow_tx report link flow control status */
+ flow_rx = (rd32(hw, NGBE_MAC_RX_FLOW_CTRL) & 0x101) == 0x1;
+ flow_tx = !!(NGBE_RDB_RFCC_RFCE_802_3X &
+ rd32(hw, NGBE_RDB_RFCC));
+
+ e_info(drv, "NIC Link is Up %s, Flow Control: %s\n",
+ (link_speed == NGBE_LINK_SPEED_1GB_FULL ?
+ "1 Gbps" :
+ (link_speed == NGBE_LINK_SPEED_100_FULL ?
+ "100 Mbps" :
+ (link_speed == NGBE_LINK_SPEED_10_FULL ?
+ "10 Mbps" :
+ "unknown speed"))),
+ ((flow_rx && flow_tx) ? "RX/TX" :
+ (flow_rx ? "RX" :
+ (flow_tx ? "TX" : "None"))));
+
+ netif_carrier_on(netdev);
+ netif_tx_wake_all_queues(netdev);
+
+ /* update the default user priority for VFs */
+ ngbe_update_default_up(adapter);
+
+ /* ping all the active vfs to let them know link has changed */
+ ngbe_ping_all_vfs(adapter);
+}
+
+/**
+ * ngbe_watchdog_link_is_down - update netif_carrier status and
+ *                              print link down message
+ * @adapter: pointer to the adapter structure
+ **/
+static void ngbe_watchdog_link_is_down(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+
+ adapter->link_up = false;
+ adapter->link_speed = 0;
+
+ /* only continue if link was up previously */
+ if (!netif_carrier_ok(netdev))
+ return;
+
+ if (test_bit(__NGBE_PTP_RUNNING, &adapter->state))
+ ngbe_ptp_start_cyclecounter(adapter);
+
+ e_info(drv, "NIC Link is Down\n");
+ netif_carrier_off(netdev);
+ netif_tx_stop_all_queues(netdev);
+
+ /* ping all the active vfs to let them know link has changed */
+ ngbe_ping_all_vfs(adapter);
+}
+
+static bool ngbe_ring_tx_pending(struct ngbe_adapter *adapter)
+{
+ int i;
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct ngbe_ring *tx_ring = adapter->tx_ring[i];
+
+ if (tx_ring->next_to_use != tx_ring->next_to_clean)
+ return true;
+ }
+
+ return false;
+}
+
+static bool ngbe_vf_tx_pending(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+	u32 q_per_pool = 1;
+	u32 i, j;
+
+ if (!adapter->num_vfs)
+ return false;
+
+ for (i = 0; i < adapter->num_vfs; i++) {
+ for (j = 0; j < q_per_pool; j++) {
+ u32 h, t;
+
+ h = rd32(hw,
+ NGBE_PX_TR_RPn(q_per_pool, i, j));
+ t = rd32(hw,
+ NGBE_PX_TR_WPn(q_per_pool, i, j));
+
+ if (h != t)
+ return true;
+ }
+ }
+
+ return false;
+}
+
+/**
+ * ngbe_watchdog_flush_tx - flush queues on link down
+ * @adapter: pointer to the device adapter structure
+ **/
+static void ngbe_watchdog_flush_tx(struct ngbe_adapter *adapter)
+{
+ if (!netif_carrier_ok(adapter->netdev)) {
+ if (ngbe_ring_tx_pending(adapter) ||
+ ngbe_vf_tx_pending(adapter)) {
+ /* We've lost link, so the controller stops DMA,
+ * but we've got queued Tx work that's never going
+ * to get done, so reset controller to flush Tx.
+ * (Do the reset outside of interrupt context).
+ */
+			e_warn(drv, "initiating reset due to lost link with pending Tx work\n");
+ adapter->flags2 |= NGBE_FLAG2_PF_RESET_REQUESTED;
+ }
+ }
+}
+
+#ifdef CONFIG_PCI_IOV
+static inline void ngbe_issue_vf_flr(struct ngbe_adapter *adapter,
+ struct pci_dev *vfdev)
+{
+ int pos, i;
+ u16 status;
+
+ /* wait for pending transactions on the bus */
+ for (i = 0; i < 4; i++) {
+ if (i)
+ msleep((1 << (i - 1)) * 100);
+
+ pcie_capability_read_word(vfdev, PCI_EXP_DEVSTA, &status);
+ if (!(status & PCI_EXP_DEVSTA_TRPND))
+ goto clear;
+ }
+
+ e_dev_warn("Issuing VFLR with pending transactions\n");
+
+clear:
+ pos = pci_find_capability(vfdev, PCI_CAP_ID_EXP);
+ if (!pos)
+ return;
+
+ e_dev_err("Issuing VFLR for VF %s\n", pci_name(vfdev));
+ pci_write_config_word(vfdev, pos + PCI_EXP_DEVCTL,
+ PCI_EXP_DEVCTL_BCR_FLR);
+ msleep(100);
+}
+
+
+static void ngbe_spoof_check(struct ngbe_adapter *adapter)
+{
+ u32 ssvpc;
+
+ /* Do not perform spoof check if in non-IOV mode */
+ if (adapter->num_vfs == 0)
+ return;
+ ssvpc = rd32(&adapter->hw, NGBE_TDM_SEC_DRP);
+
+	/*
+	 * The ssvpc register is cleared on read; if it reads zero, no
+	 * spoofed packets were seen in the last interval.
+	 */
+ if (!ssvpc)
+ return;
+
+ e_warn(drv, "%d Spoofed packets detected\n", ssvpc);
+}
+#endif /* CONFIG_PCI_IOV */
+
+/**
+ * ngbe_watchdog_subtask - check and bring link up
+ * @adapter: pointer to the device adapter structure
+ **/
+static void ngbe_watchdog_subtask(struct ngbe_adapter *adapter)
+{
+ /* if interface is down do nothing */
+ if (test_bit(__NGBE_DOWN, &adapter->state) ||
+ test_bit(__NGBE_REMOVING, &adapter->state) ||
+ test_bit(__NGBE_RESETTING, &adapter->state))
+ return;
+
+ ngbe_watchdog_an_complete(adapter);
+#ifndef CONFIG_NGBE_POLL_LINK_STATUS
+ ngbe_watchdog_update_link_status(adapter);
+
+ if (adapter->link_up)
+ ngbe_watchdog_link_is_up(adapter);
+ else
+ ngbe_watchdog_link_is_down(adapter);
+#endif
+#ifdef CONFIG_PCI_IOV
+ ngbe_spoof_check(adapter);
+#endif /* CONFIG_PCI_IOV */
+
+ ngbe_update_stats(adapter);
+ ngbe_watchdog_flush_tx(adapter);
+}
+
+/**
+ * ngbe_service_timer - Timer Call-back
+ * @t: pointer to the service timer embedded in the adapter structure
+ **/
+static void ngbe_service_timer(struct timer_list *t)
+{
+ struct ngbe_adapter *adapter = from_timer(adapter, t, service_timer);
+ unsigned long next_event_offset;
+ struct ngbe_hw *hw = &adapter->hw;
+
+ /* poll faster when waiting for link */
+ if ((adapter->flags & NGBE_FLAG_NEED_LINK_UPDATE) ||
+ (adapter->flags & NGBE_FLAG_NEED_ANC_CHECK))
+ next_event_offset = HZ / 10;
+ else
+ next_event_offset = HZ * 2;
+
+	if ((rd32(&adapter->hw, NGBE_MIS_PF_SM) == 1) && hw->bus.lan_id)
+		adapter->flags2 |= NGBE_FLAG2_PCIE_NEED_RECOVER;
+
+ /* Reset the timer */
+ mod_timer(&adapter->service_timer, next_event_offset + jiffies);
+
+ ngbe_service_event_schedule(adapter);
+}
+
+#ifdef CONFIG_NGBE_POLL_LINK_STATUS
+static void ngbe_link_check_timer(struct timer_list *t)
+{
+ struct ngbe_adapter *adapter = from_timer(adapter, t, link_check_timer);
+ unsigned long next_event_offset = HZ / 1000;
+
+ mod_timer(&adapter->link_check_timer, next_event_offset + jiffies);
+ /* if interface is down do nothing */
+ if (test_bit(__NGBE_DOWN, &adapter->state) ||
+ test_bit(__NGBE_REMOVING, &adapter->state) ||
+ test_bit(__NGBE_RESETTING, &adapter->state))
+ return;
+
+ ngbe_watchdog_update_link_status(adapter);
+
+ if (adapter->link_up)
+ ngbe_watchdog_link_is_up(adapter);
+ else
+ ngbe_watchdog_link_is_down(adapter);
+}
+#endif
+
+static void ngbe_reset_subtask(struct ngbe_adapter *adapter)
+{
+ u32 reset_flag = 0;
+ u32 value = 0;
+
+ if (!(adapter->flags2 & (NGBE_FLAG2_PF_RESET_REQUESTED |
+ NGBE_FLAG2_DEV_RESET_REQUESTED |
+ NGBE_FLAG2_GLOBAL_RESET_REQUESTED |
+ NGBE_FLAG2_RESET_INTR_RECEIVED)))
+ return;
+
+ /* If we're already down, just bail */
+ if (test_bit(__NGBE_DOWN, &adapter->state) ||
+ test_bit(__NGBE_REMOVING, &adapter->state))
+ return;
+
+ netdev_err(adapter->netdev, "Reset adapter\n");
+ adapter->tx_timeout_count++;
+
+ rtnl_lock();
+ if (adapter->flags2 & NGBE_FLAG2_GLOBAL_RESET_REQUESTED) {
+ reset_flag |= NGBE_FLAG2_GLOBAL_RESET_REQUESTED;
+ adapter->flags2 &= ~NGBE_FLAG2_GLOBAL_RESET_REQUESTED;
+ }
+ if (adapter->flags2 & NGBE_FLAG2_DEV_RESET_REQUESTED) {
+ reset_flag |= NGBE_FLAG2_DEV_RESET_REQUESTED;
+ adapter->flags2 &= ~NGBE_FLAG2_DEV_RESET_REQUESTED;
+ }
+ if (adapter->flags2 & NGBE_FLAG2_PF_RESET_REQUESTED) {
+ reset_flag |= NGBE_FLAG2_PF_RESET_REQUESTED;
+ adapter->flags2 &= ~NGBE_FLAG2_PF_RESET_REQUESTED;
+ }
+
+ if (adapter->flags2 & NGBE_FLAG2_RESET_INTR_RECEIVED) {
+		/* If there's a recovery already waiting, it takes
+		 * precedence over starting a new reset sequence.
+		 */
+ adapter->flags2 &= ~NGBE_FLAG2_RESET_INTR_RECEIVED;
+ value = rd32m(&adapter->hw, NGBE_MIS_RST_ST,
+ NGBE_MIS_RST_ST_DEV_RST_TYPE_MASK) >>
+ NGBE_MIS_RST_ST_DEV_RST_TYPE_SHIFT;
+		if (value == NGBE_MIS_RST_ST_DEV_RST_TYPE_SW_RST)
+			adapter->hw.reset_type = NGBE_SW_RESET;
+		else if (value == NGBE_MIS_RST_ST_DEV_RST_TYPE_GLOBAL_RST)
+			adapter->hw.reset_type = NGBE_GLOBAL_RESET;
+ adapter->hw.force_full_reset = true;
+ ngbe_reinit_locked(adapter);
+ adapter->hw.force_full_reset = false;
+ goto unlock;
+ }
+
+ if (reset_flag & NGBE_FLAG2_DEV_RESET_REQUESTED) {
+ /* Request a Device Reset
+ *
+ * This will start the chip's countdown to the actual full
+ * chip reset event, and a warning interrupt to be sent
+ * to all PFs, including the requestor. Our handler
+ * for the warning interrupt will deal with the shutdown
+ * and recovery of the switch setup.
+ */
+		/* ngbe_dump(adapter); uncomment to dump state for debugging */
+
+ wr32m(&adapter->hw, NGBE_MIS_RST,
+ NGBE_MIS_RST_SW_RST, NGBE_MIS_RST_SW_RST);
+ e_info(drv, "ngbe_reset_subtask: sw reset\n");
+
+ } else if (reset_flag & NGBE_FLAG2_PF_RESET_REQUESTED) {
+		/* ngbe_dump(adapter); uncomment to dump state for debugging */
+ ngbe_reinit_locked(adapter);
+ } else if (reset_flag & NGBE_FLAG2_GLOBAL_RESET_REQUESTED) {
+ /* Request a Global Reset
+ *
+ * This will start the chip's countdown to the actual full
+ * chip reset event, and a warning interrupt to be sent
+ * to all PFs, including the requestor. Our handler
+ * for the warning interrupt will deal with the shutdown
+ * and recovery of the switch setup.
+ */
+		/* ngbe_dump(adapter); uncomment to dump state for debugging */
+ pci_save_state(adapter->pdev);
+ if (ngbe_mng_present(&adapter->hw)) {
+ ngbe_reset_hostif(&adapter->hw);
+ e_info(drv, "ngbe_reset_subtask: lan reset\n");
+
+ } else {
+ wr32m(&adapter->hw, NGBE_MIS_RST,
+ NGBE_MIS_RST_GLOBAL_RST,
+ NGBE_MIS_RST_GLOBAL_RST);
+ e_info(drv, "ngbe_reset_subtask: global reset\n");
+ }
+ }
+
+unlock:
+ rtnl_unlock();
+}
+
+static void ngbe_check_pcie_subtask(struct ngbe_adapter *adapter)
+{
+ if (!(adapter->flags2 & NGBE_FLAG2_PCIE_NEED_RECOVER))
+ return;
+
+ e_info(probe, "do recovery\n");
+ ngbe_pcie_do_recovery(adapter->pdev);
+ wr32m(&adapter->hw, NGBE_MIS_PF_SM,
+ NGBE_MIS_PF_SM_SM, 0);
+ adapter->flags2 &= ~NGBE_FLAG2_PCIE_NEED_RECOVER;
+}
+
+
+/**
+ * ngbe_service_task - manages and runs subtasks
+ * @work: pointer to work_struct containing our data
+ **/
+static void ngbe_service_task(struct work_struct *work)
+{
+ struct ngbe_adapter *adapter = container_of(work,
+ struct ngbe_adapter,
+ service_task);
+ if (NGBE_REMOVED(adapter->hw.hw_addr)) {
+ if (!test_bit(__NGBE_DOWN, &adapter->state)) {
+ rtnl_lock();
+ ngbe_down(adapter);
+ rtnl_unlock();
+ }
+ ngbe_service_event_complete(adapter);
+ return;
+ }
+
+ ngbe_check_pcie_subtask(adapter);
+ ngbe_reset_subtask(adapter);
+ ngbe_check_overtemp_subtask(adapter);
+ ngbe_watchdog_subtask(adapter);
+ ngbe_check_hang_subtask(adapter);
+
+ if (test_bit(__NGBE_PTP_RUNNING, &adapter->state)) {
+ ngbe_ptp_overflow_check(adapter);
+ if (unlikely(adapter->flags &
+ NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER))
+ ngbe_ptp_rx_hang(adapter);
+ }
+
+ ngbe_service_event_complete(adapter);
+}
+
+static u8 get_ipv6_proto(struct sk_buff *skb, int offset)
+{
+ struct ipv6hdr *hdr = (struct ipv6hdr *)(skb->data + offset);
+ u8 nexthdr = hdr->nexthdr;
+
+ offset += sizeof(struct ipv6hdr);
+
+ while (ipv6_ext_hdr(nexthdr)) {
+ struct ipv6_opt_hdr _hdr, *hp;
+
+ if (nexthdr == NEXTHDR_NONE)
+ break;
+
+ hp = skb_header_pointer(skb, offset, sizeof(_hdr), &_hdr);
+ if (!hp)
+ break;
+
+ if (nexthdr == NEXTHDR_FRAGMENT) {
+ break;
+ } else if (nexthdr == NEXTHDR_AUTH) {
+ offset += ipv6_authlen(hp);
+ } else {
+ offset += ipv6_optlen(hp);
+ }
+
+ nexthdr = hp->nexthdr;
+ }
+
+ return nexthdr;
+}
+
+union network_header {
+ struct iphdr *ipv4;
+ struct ipv6hdr *ipv6;
+ void *raw;
+};
+
+static ngbe_dptype encode_tx_desc_ptype(const struct ngbe_tx_buffer *first)
+{
+ struct sk_buff *skb = first->skb;
+ u8 tun_prot = 0;
+ u8 l4_prot = 0;
+ u8 ptype = 0;
+
+ if (skb->encapsulation) {
+ union network_header hdr;
+
+ switch (first->protocol) {
+ case __constant_htons(ETH_P_IP):
+ tun_prot = ip_hdr(skb)->protocol;
+ if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET))
+ goto encap_frag;
+ ptype = NGBE_PTYPE_TUN_IPV4;
+ break;
+ case __constant_htons(ETH_P_IPV6):
+ tun_prot = get_ipv6_proto(skb, skb_network_offset(skb));
+ if (tun_prot == NEXTHDR_FRAGMENT)
+ goto encap_frag;
+ ptype = NGBE_PTYPE_TUN_IPV6;
+ break;
+ default:
+ goto exit;
+ }
+
+ if (tun_prot == IPPROTO_IPIP) {
+ hdr.raw = (void *)inner_ip_hdr(skb);
+ ptype |= NGBE_PTYPE_PKT_IPIP;
+ } else if (tun_prot == IPPROTO_UDP) {
+ hdr.raw = (void *)inner_ip_hdr(skb);
+ } else {
+ goto exit;
+ }
+
+ switch (hdr.ipv4->version) {
+ case IPVERSION:
+ l4_prot = hdr.ipv4->protocol;
+ if (hdr.ipv4->frag_off & htons(IP_MF | IP_OFFSET)) {
+ ptype |= NGBE_PTYPE_TYP_IPFRAG;
+ goto exit;
+ }
+ break;
+ case 6:
+ l4_prot = get_ipv6_proto(skb,
+ skb_inner_network_offset(skb));
+ ptype |= NGBE_PTYPE_PKT_IPV6;
+ if (l4_prot == NEXTHDR_FRAGMENT) {
+ ptype |= NGBE_PTYPE_TYP_IPFRAG;
+ goto exit;
+ }
+ break;
+ default:
+ goto exit;
+ }
+ } else {
+encap_frag:
+ switch (first->protocol) {
+ case __constant_htons(ETH_P_IP):
+ l4_prot = ip_hdr(skb)->protocol;
+ ptype = NGBE_PTYPE_PKT_IP;
+ if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
+ ptype |= NGBE_PTYPE_TYP_IPFRAG;
+ goto exit;
+ }
+ break;
+#ifdef NETIF_F_IPV6_CSUM
+ case __constant_htons(ETH_P_IPV6):
+ l4_prot = get_ipv6_proto(skb, skb_network_offset(skb));
+ ptype = NGBE_PTYPE_PKT_IP | NGBE_PTYPE_PKT_IPV6;
+ if (l4_prot == NEXTHDR_FRAGMENT) {
+ ptype |= NGBE_PTYPE_TYP_IPFRAG;
+ goto exit;
+ }
+ break;
+#endif /* NETIF_F_IPV6_CSUM */
+ case __constant_htons(ETH_P_1588):
+ ptype = NGBE_PTYPE_L2_TS;
+ goto exit;
+ case __constant_htons(ETH_P_FIP):
+ ptype = NGBE_PTYPE_L2_FIP;
+ goto exit;
+ case __constant_htons(NGBE_ETH_P_LLDP):
+ ptype = NGBE_PTYPE_L2_LLDP;
+ goto exit;
+ case __constant_htons(NGBE_ETH_P_CNM):
+ ptype = NGBE_PTYPE_L2_CNM;
+ goto exit;
+ case __constant_htons(ETH_P_PAE):
+ ptype = NGBE_PTYPE_L2_EAPOL;
+ goto exit;
+ case __constant_htons(ETH_P_ARP):
+ ptype = NGBE_PTYPE_L2_ARP;
+ goto exit;
+ default:
+ ptype = NGBE_PTYPE_L2_MAC;
+ goto exit;
+ }
+ }
+
+ switch (l4_prot) {
+ case IPPROTO_TCP:
+ ptype |= NGBE_PTYPE_TYP_TCP;
+ break;
+ case IPPROTO_UDP:
+ ptype |= NGBE_PTYPE_TYP_UDP;
+ break;
+ case IPPROTO_SCTP:
+ ptype |= NGBE_PTYPE_TYP_SCTP;
+ break;
+ default:
+ ptype |= NGBE_PTYPE_TYP_IP;
+ break;
+ }
+
+exit:
+ return ngbe_decode_ptype(ptype);
+}
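Tracing the encoder for the most common case helps read the bit layout: an
untunneled IPv4/TCP frame takes the encap_frag ETH_P_IP case and then the
IPPROTO_TCP case, so the value handed to the decoder is simply the OR of
the two class macros:

	/* untunneled IPv4/TCP, traced through encode_tx_desc_ptype() */
	u8 ptype = NGBE_PTYPE_PKT_IP | NGBE_PTYPE_TYP_TCP;
	ngbe_dptype dptype = ngbe_decode_ptype(ptype);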
+
+static int ngbe_tso(struct ngbe_ring *tx_ring,
+ struct ngbe_tx_buffer *first,
+ u8 *hdr_len, ngbe_dptype dptype)
+{
+ struct sk_buff *skb = first->skb;
+ u32 vlan_macip_lens, type_tucmd;
+ u32 mss_l4len_idx, l4len;
+ struct tcphdr *tcph;
+ struct iphdr *iph;
+	u32 tunhdr_eiplen_tunlen = 0;
+	u8 tun_prot = 0;
+	bool enc = skb->encapsulation;
+	struct ipv6hdr *ipv6h;
+
+ if (skb->ip_summed != CHECKSUM_PARTIAL)
+ return 0;
+
+ if (!skb_is_gso(skb))
+ return 0;
+
+	if (skb_header_cloned(skb)) {
+		int err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+
+		if (err)
+			return err;
+	}
+
+ iph = enc ? inner_ip_hdr(skb) : ip_hdr(skb);
+ if (iph->version == 4) {
+ tcph = enc ? inner_tcp_hdr(skb) : tcp_hdr(skb);
+ iph->tot_len = 0;
+ iph->check = 0;
+ tcph->check = ~csum_tcpudp_magic(iph->saddr,
+ iph->daddr, 0,
+ IPPROTO_TCP,
+ 0);
+ first->tx_flags |= NGBE_TX_FLAGS_TSO |
+ NGBE_TX_FLAGS_CSUM |
+ NGBE_TX_FLAGS_IPV4 |
+ NGBE_TX_FLAGS_CC;
+ } else if (iph->version == 6 && skb_is_gso_v6(skb)) {
+ ipv6h = enc ? inner_ipv6_hdr(skb) : ipv6_hdr(skb);
+ tcph = enc ? inner_tcp_hdr(skb) : tcp_hdr(skb);
+ ipv6h->payload_len = 0;
+ tcph->check =
+ ~csum_ipv6_magic(&ipv6h->saddr,
+ &ipv6h->daddr,
+ 0, IPPROTO_TCP, 0);
+ first->tx_flags |= NGBE_TX_FLAGS_TSO |
+ NGBE_TX_FLAGS_CSUM |
+				   NGBE_TX_FLAGS_CC;
+	}
+
+ /* compute header lengths */
+ l4len = enc ? inner_tcp_hdrlen(skb) : tcp_hdrlen(skb);
+ *hdr_len = enc ? (skb_inner_transport_header(skb) - skb->data)
+ : skb_transport_offset(skb);
+ *hdr_len += l4len;
+
+ /* update gso size and bytecount with header size */
+ first->gso_segs = skb_shinfo(skb)->gso_segs;
+ first->bytecount += (first->gso_segs - 1) * *hdr_len;
+
+ /* mss_l4len_id: use 0 as index for TSO */
+ mss_l4len_idx = l4len << NGBE_TXD_L4LEN_SHIFT;
+ mss_l4len_idx |= skb_shinfo(skb)->gso_size << NGBE_TXD_MSS_SHIFT;
+
+ /* vlan_macip_lens: HEADLEN, MACLEN, VLAN tag */
+ if (enc) {
+ switch (first->protocol) {
+ case __constant_htons(ETH_P_IP):
+ tun_prot = ip_hdr(skb)->protocol;
+ first->tx_flags |= NGBE_TX_FLAGS_OUTER_IPV4;
+ break;
+ case __constant_htons(ETH_P_IPV6):
+ tun_prot = ipv6_hdr(skb)->nexthdr;
+ break;
+ default:
+ break;
+ }
+ switch (tun_prot) {
+ case IPPROTO_UDP:
+ tunhdr_eiplen_tunlen = NGBE_TXD_TUNNEL_UDP;
+ tunhdr_eiplen_tunlen |=
+ ((skb_network_header_len(skb) >> 2) <<
+ NGBE_TXD_OUTER_IPLEN_SHIFT) |
+ (((skb_inner_mac_header(skb) -
+ skb_transport_header(skb)) >> 1) <<
+ NGBE_TXD_TUNNEL_LEN_SHIFT);
+ break;
+ case IPPROTO_GRE:
+ tunhdr_eiplen_tunlen = NGBE_TXD_TUNNEL_GRE;
+ tunhdr_eiplen_tunlen |=
+ ((skb_network_header_len(skb) >> 2) <<
+ NGBE_TXD_OUTER_IPLEN_SHIFT) |
+ (((skb_inner_mac_header(skb) -
+ skb_transport_header(skb)) >> 1) <<
+ NGBE_TXD_TUNNEL_LEN_SHIFT);
+ break;
+ case IPPROTO_IPIP:
+ tunhdr_eiplen_tunlen = (((char *)inner_ip_hdr(skb)-
+ (char *)ip_hdr(skb)) >> 2) <<
+ NGBE_TXD_OUTER_IPLEN_SHIFT;
+ break;
+ default:
+ break;
+ }
+
+ vlan_macip_lens = skb_inner_network_header_len(skb) >> 1;
+	} else {
+		vlan_macip_lens = skb_network_header_len(skb) >> 1;
+	}
+
+ vlan_macip_lens |= skb_network_offset(skb) << NGBE_TXD_MACLEN_SHIFT;
+ vlan_macip_lens |= first->tx_flags & NGBE_TX_FLAGS_VLAN_MASK;
+
+ type_tucmd = dptype.ptype << 24;
+ ngbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, tunhdr_eiplen_tunlen,
+ type_tucmd, mss_l4len_idx);
+
+ return 1;
+}
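The context descriptor built above packs the L4 header length and the MSS
into one 32-bit word. A hedged worked example (the shift constants live in
the driver's headers and are used only symbolically here): for a plain
20-byte TCP header and a typical 1448-byte MSS,

	u32 l4len = 20, mss = 1448;
	u32 mss_l4len_idx = (l4len << NGBE_TXD_L4LEN_SHIFT) |
			    (mss << NGBE_TXD_MSS_SHIFT);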
+
+static void ngbe_tx_csum(struct ngbe_ring *tx_ring,
+ struct ngbe_tx_buffer *first, ngbe_dptype dptype)
+{
+ struct sk_buff *skb = first->skb;
+ u32 vlan_macip_lens = 0;
+ u32 mss_l4len_idx = 0;
+ u32 tunhdr_eiplen_tunlen = 0;
+ u8 tun_prot = 0;
+ u32 type_tucmd;
+
+ if (skb->ip_summed != CHECKSUM_PARTIAL) {
+ if (!(first->tx_flags & NGBE_TX_FLAGS_HW_VLAN) &&
+ !(first->tx_flags & NGBE_TX_FLAGS_CC))
+ return;
+ vlan_macip_lens = skb_network_offset(skb) <<
+ NGBE_TXD_MACLEN_SHIFT;
+ } else {
+ u8 l4_prot = 0;
+ union {
+ struct iphdr *ipv4;
+ struct ipv6hdr *ipv6;
+ u8 *raw;
+ } network_hdr;
+ union {
+ struct tcphdr *tcphdr;
+ u8 *raw;
+ } transport_hdr;
+
+ if (skb->encapsulation) {
+ network_hdr.raw = skb_inner_network_header(skb);
+ transport_hdr.raw = skb_inner_transport_header(skb);
+ vlan_macip_lens = skb_network_offset(skb) <<
+ NGBE_TXD_MACLEN_SHIFT;
+ switch (first->protocol) {
+ case __constant_htons(ETH_P_IP):
+ tun_prot = ip_hdr(skb)->protocol;
+ break;
+ case __constant_htons(ETH_P_IPV6):
+ tun_prot = ipv6_hdr(skb)->nexthdr;
+ break;
+ default:
+ if (unlikely(net_ratelimit())) {
+ dev_warn(tx_ring->dev,
+ "partial checksum but version=%d\n",
+ network_hdr.ipv4->version);
+ }
+ return;
+ }
+ switch (tun_prot) {
+ case IPPROTO_UDP:
+ tunhdr_eiplen_tunlen = NGBE_TXD_TUNNEL_UDP;
+ tunhdr_eiplen_tunlen |=
+ ((skb_network_header_len(skb) >> 2) <<
+ NGBE_TXD_OUTER_IPLEN_SHIFT) |
+ (((skb_inner_mac_header(skb) -
+ skb_transport_header(skb)) >> 1) <<
+ NGBE_TXD_TUNNEL_LEN_SHIFT);
+ break;
+ case IPPROTO_GRE:
+ tunhdr_eiplen_tunlen = NGBE_TXD_TUNNEL_GRE;
+ tunhdr_eiplen_tunlen |=
+ ((skb_network_header_len(skb) >> 2) <<
+ NGBE_TXD_OUTER_IPLEN_SHIFT) |
+ (((skb_inner_mac_header(skb) -
+ skb_transport_header(skb)) >> 1) <<
+ NGBE_TXD_TUNNEL_LEN_SHIFT);
+ break;
+ case IPPROTO_IPIP:
+ tunhdr_eiplen_tunlen =
+ (((char *)inner_ip_hdr(skb)-
+ (char *)ip_hdr(skb)) >> 2) <<
+ NGBE_TXD_OUTER_IPLEN_SHIFT;
+ break;
+ default:
+ break;
+ }
+
+ } else {
+ network_hdr.raw = skb_network_header(skb);
+ transport_hdr.raw = skb_transport_header(skb);
+ vlan_macip_lens = skb_network_offset(skb) <<
+ NGBE_TXD_MACLEN_SHIFT;
+ }
+
+ switch (network_hdr.ipv4->version) {
+ case IPVERSION:
+ vlan_macip_lens |=
+ (transport_hdr.raw - network_hdr.raw) >> 1;
+ l4_prot = network_hdr.ipv4->protocol;
+ break;
+ case 6:
+ vlan_macip_lens |=
+ (transport_hdr.raw - network_hdr.raw) >> 1;
+ l4_prot = network_hdr.ipv6->nexthdr;
+ break;
+ default:
+ break;
+ }
+
+ switch (l4_prot) {
+ case IPPROTO_TCP:
+ mss_l4len_idx = (transport_hdr.tcphdr->doff * 4) <<
+ NGBE_TXD_L4LEN_SHIFT;
+ break;
+ case IPPROTO_SCTP:
+ mss_l4len_idx = sizeof(struct sctphdr) <<
+ NGBE_TXD_L4LEN_SHIFT;
+ break;
+ case IPPROTO_UDP:
+ mss_l4len_idx = sizeof(struct udphdr) <<
+ NGBE_TXD_L4LEN_SHIFT;
+ break;
+ default:
+ break;
+ }
+
+ /* update TX checksum flag */
+ first->tx_flags |= NGBE_TX_FLAGS_CSUM;
+ }
+ first->tx_flags |= NGBE_TX_FLAGS_CC;
+ /* vlan_macip_lens: MACLEN, VLAN tag */
+ vlan_macip_lens |= first->tx_flags & NGBE_TX_FLAGS_VLAN_MASK;
+
+ type_tucmd = dptype.ptype << 24;
+ ngbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, tunhdr_eiplen_tunlen,
+ type_tucmd, mss_l4len_idx);
+}
+
+static u32 ngbe_tx_cmd_type(u32 tx_flags)
+{
+ /* set type for advanced descriptor with frame checksum insertion */
+ u32 cmd_type = NGBE_TXD_DTYP_DATA |
+ NGBE_TXD_IFCS;
+
+ /* set HW vlan bit if vlan is present */
+ cmd_type |= NGBE_SET_FLAG(tx_flags, NGBE_TX_FLAGS_HW_VLAN,
+ NGBE_TXD_VLE);
+
+ /* set segmentation enable bits for TSO/FSO */
+ cmd_type |= NGBE_SET_FLAG(tx_flags, NGBE_TX_FLAGS_TSO,
+ NGBE_TXD_TSE);
+
+ /* set timestamp bit if present */
+ cmd_type |= NGBE_SET_FLAG(tx_flags, NGBE_TX_FLAGS_TSTAMP,
+ NGBE_TXD_MAC_TSTAMP);
+
+ cmd_type |= NGBE_SET_FLAG(tx_flags, NGBE_TX_FLAGS_LINKSEC,
+ NGBE_TXD_LINKSEC);
+
+ return cmd_type;
+}
+
+static void ngbe_tx_olinfo_status(union ngbe_tx_desc *tx_desc,
+ u32 tx_flags, unsigned int paylen)
+{
+ u32 olinfo_status = paylen << NGBE_TXD_PAYLEN_SHIFT;
+
+ /* enable L4 checksum for TSO and TX checksum offload */
+ olinfo_status |= NGBE_SET_FLAG(tx_flags,
+ NGBE_TX_FLAGS_CSUM,
+ NGBE_TXD_L4CS);
+
+	/* enable IPv4 checksum for TSO */
+ olinfo_status |= NGBE_SET_FLAG(tx_flags,
+ NGBE_TX_FLAGS_IPV4,
+ NGBE_TXD_IIPCS);
+ /* enable outer IPv4 checksum for TSO */
+ olinfo_status |= NGBE_SET_FLAG(tx_flags,
+ NGBE_TX_FLAGS_OUTER_IPV4,
+ NGBE_TXD_EIPCS);
+	/*
+	 * Check Context must be set if Tx switch is enabled, which it
+	 * always is when virtual functions are running.
+	 */
+ olinfo_status |= NGBE_SET_FLAG(tx_flags,
+ NGBE_TX_FLAGS_CC,
+ NGBE_TXD_CC);
+
+ olinfo_status |= NGBE_SET_FLAG(tx_flags,
+ NGBE_TX_FLAGS_IPSEC,
+ NGBE_TXD_IPSEC);
+
+ tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
+}
+
+static int __ngbe_maybe_stop_tx(struct ngbe_ring *tx_ring, u16 size)
+{
+ netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
+
+ /* Herbert's original patch had:
+ * smp_mb__after_netif_stop_queue();
+ * but since that doesn't exist yet, just open code it.
+ */
+ smp_mb();
+
+ /* We need to check again in a case another CPU has just
+ * made room available.
+ */
+ if (likely(ngbe_desc_unused(tx_ring) < size))
+ return -EBUSY;
+
+ /* A reprieve! - use start_queue because it doesn't call schedule */
+ netif_start_subqueue(tx_ring->netdev, tx_ring->queue_index);
+ ++tx_ring->tx_stats.restart_queue;
+ return 0;
+}
+
+static inline int ngbe_maybe_stop_tx(struct ngbe_ring *tx_ring, u16 size)
+{
+ if (likely(ngbe_desc_unused(tx_ring) >= size))
+ return 0;
+
+ return __ngbe_maybe_stop_tx(tx_ring, size);
+}
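The smp_mb() in __ngbe_maybe_stop_tx() pairs with a re-check on the
completion side: after descriptors are cleaned, a stopped queue with enough
free slots can be restarted. That wake-up lives in the TX clean path, which
is not part of this hunk; a minimal sketch of the assumed pattern, using
the same helpers:

	if (netif_tx_queue_stopped(txring_txq(tx_ring)) &&
	    ngbe_desc_unused(tx_ring) >= DESC_NEEDED)
		netif_tx_wake_queue(txring_txq(tx_ring));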
+
+#define NGBE_TXD_CMD (NGBE_TXD_EOP | \
+ NGBE_TXD_RS)
+
+static int ngbe_tx_map(struct ngbe_ring *tx_ring,
+ struct ngbe_tx_buffer *first,
+ const u8 hdr_len)
+{
+ struct sk_buff *skb = first->skb;
+ struct ngbe_tx_buffer *tx_buffer;
+ union ngbe_tx_desc *tx_desc;
+ skb_frag_t *frag;
+ dma_addr_t dma;
+ unsigned int data_len, size;
+ u32 tx_flags = first->tx_flags;
+ u32 cmd_type = ngbe_tx_cmd_type(tx_flags);
+ u16 i = tx_ring->next_to_use;
+
+ tx_desc = NGBE_TX_DESC(tx_ring, i);
+
+ ngbe_tx_olinfo_status(tx_desc, tx_flags, skb->len - hdr_len);
+
+ size = skb_headlen(skb);
+ data_len = skb->data_len;
+
+ dma = dma_map_single(tx_ring->dev, skb->data, size, DMA_TO_DEVICE);
+
+ tx_buffer = first;
+
+ for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
+ if (dma_mapping_error(tx_ring->dev, dma))
+ goto dma_error;
+
+ /* record length, and DMA address */
+ dma_unmap_len_set(tx_buffer, len, size);
+ dma_unmap_addr_set(tx_buffer, dma, dma);
+
+ tx_desc->read.buffer_addr = cpu_to_le64(dma);
+
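+		/* The length bits of cmd_type are still clear at this
+		 * point, so XOR-ing NGBE_MAX_DATA_PER_TXD into it below
+		 * fills in the buffer length field directly; mappings
+		 * larger than one descriptor can carry are split across
+		 * several descriptors.
+		 */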
+ while (unlikely(size > NGBE_MAX_DATA_PER_TXD)) {
+ tx_desc->read.cmd_type_len =
+ cpu_to_le32(cmd_type ^ NGBE_MAX_DATA_PER_TXD);
+
+ i++;
+ tx_desc++;
+ if (i == tx_ring->count) {
+ tx_desc = NGBE_TX_DESC(tx_ring, 0);
+ i = 0;
+ }
+ tx_desc->read.olinfo_status = 0;
+
+ dma += NGBE_MAX_DATA_PER_TXD;
+ size -= NGBE_MAX_DATA_PER_TXD;
+
+ tx_desc->read.buffer_addr = cpu_to_le64(dma);
+ }
+
+ if (likely(!data_len))
+ break;
+
+ tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type ^ size);
+
+ i++;
+ tx_desc++;
+ if (i == tx_ring->count) {
+ tx_desc = NGBE_TX_DESC(tx_ring, 0);
+ i = 0;
+ }
+ tx_desc->read.olinfo_status = 0;
+
+ size = skb_frag_size(frag);
+
+ data_len -= size;
+
+ dma = skb_frag_dma_map(tx_ring->dev, frag, 0, size,
+ DMA_TO_DEVICE);
+
+ tx_buffer = &tx_ring->tx_buffer_info[i];
+ }
+
+ /* write last descriptor with RS and EOP bits */
+ cmd_type |= size | NGBE_TXD_CMD;
+ tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type);
+
+ netdev_tx_sent_queue(txring_txq(tx_ring), first->bytecount);
+
+ /* set the timestamp */
+ first->time_stamp = jiffies;
+
+ /*
+ * Force memory writes to complete before letting h/w know there
+ * are new descriptors to fetch. (Only applicable for weak-ordered
+ * memory model archs, such as IA-64).
+ *
+ * We also need this memory barrier to make certain all of the
+ * status bits have been updated before next_to_watch is written.
+ */
+ wmb();
+
+ /* set next_to_watch value indicating a packet is present */
+ first->next_to_watch = tx_desc;
+
+ i++;
+ if (i == tx_ring->count)
+ i = 0;
+
+ tx_ring->next_to_use = i;
+
+ ngbe_maybe_stop_tx(tx_ring, DESC_NEEDED);
+
+ skb_tx_timestamp(skb);
+
+ if (netif_xmit_stopped(txring_txq(tx_ring)) || !netdev_xmit_more()) {
+ writel(i, tx_ring->tail);
+ /* The following mmiowb() is required on certain
+		 * architectures (IA64/Altix in particular) in order to
+ * synchronize the I/O calls with respect to a spin lock. This
+ * is because the wmb() on those architectures does not
+ * guarantee anything for posted I/O writes.
+ *
+ * Note that the associated spin_unlock() is not within the
+ * driver code, but in the networking core stack.
+ */
+ mmiowb();
+ }
+
+ return 0;
+dma_error:
+ dev_err(tx_ring->dev, "TX DMA map failed\n");
+
+ /* clear dma mappings for failed tx_buffer_info map */
+ for (;;) {
+ tx_buffer = &tx_ring->tx_buffer_info[i];
+ if (dma_unmap_len(tx_buffer, len))
+ dma_unmap_page(tx_ring->dev,
+ dma_unmap_addr(tx_buffer, dma),
+ dma_unmap_len(tx_buffer, len),
+ DMA_TO_DEVICE);
+ dma_unmap_len_set(tx_buffer, len, 0);
+ if (tx_buffer == first)
+ break;
+ if (i == 0)
+ i += tx_ring->count;
+ i--;
+ }
+
+ dev_kfree_skb_any(first->skb);
+ first->skb = NULL;
+
+ tx_ring->next_to_use = i;
+
+ return -1;
+}
+
+/**
+ * ngbe_skb_pad_nonzero - pad the tail of an skb with 0x1 bytes
+ * @skb: buffer to pad
+ * @pad: space to pad
+ *
+ * Ensure that a buffer is followed by a padding area filled with the
+ * non-zero byte 0x1. Used by network drivers which may DMA or transfer
+ * data beyond the buffer end onto the wire.
+ *
+ * May return error in out of memory cases. The skb is freed on error.
+ */
+
+int ngbe_skb_pad_nonzero(struct sk_buff *skb, int pad)
+{
+ int err;
+ int ntail;
+
+	/* If the skbuff is non-linear, tailroom is always zero. */
+ if (!skb_cloned(skb) && skb_tailroom(skb) >= pad) {
+ memset(skb->data+skb->len, 0x1, pad);
+ return 0;
+ }
+
+ ntail = skb->data_len + pad - (skb->end - skb->tail);
+ if (likely(skb_cloned(skb) || ntail > 0)) {
+ err = pskb_expand_head(skb, 0, ntail, GFP_ATOMIC);
+ if (unlikely(err))
+ goto free_skb;
+ }
+
+ /* FIXME: The use of this function with non-linear skb's really needs
+ * to be audited.
+ */
+ err = skb_linearize(skb);
+ if (unlikely(err))
+ goto free_skb;
+
+ memset(skb->data + skb->len, 0x1, pad);
+ return 0;
+
+free_skb:
+ kfree_skb(skb);
+ return err;
+}
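A hypothetical call site, for illustration only (this hunk itself pads via
skb_put_padto() in ngbe_xmit_frame() below): pad a short frame to the
60-byte Ethernet minimum and treat failure as a drop, since the helper
frees the skb on error.

	if (skb->len < ETH_ZLEN &&
	    ngbe_skb_pad_nonzero(skb, ETH_ZLEN - skb->len))
		return NETDEV_TX_OK;	/* skb already freed on error */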
+
+netdev_tx_t ngbe_xmit_frame_ring(struct sk_buff *skb,
+ struct ngbe_adapter *adapter,
+ struct ngbe_ring *tx_ring)
+{
+ struct ngbe_tx_buffer *first;
+ int tso;
+ u32 tx_flags = 0;
+ unsigned short f;
+ u16 count = TXD_USE_COUNT(skb_headlen(skb));
+ __be16 protocol = skb->protocol;
+ u8 hdr_len = 0;
+ ngbe_dptype dptype;
+
+ /*
+ * need: 1 descriptor per page * PAGE_SIZE/NGBE_MAX_DATA_PER_TXD,
+ * + 1 desc for skb_headlen/NGBE_MAX_DATA_PER_TXD,
+ * + 2 desc gap to keep tail from touching head,
+ * + 1 desc for context descriptor,
+ * otherwise try next time
+ */
+ for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
+ count += TXD_USE_COUNT(skb_frag_size(&skb_shinfo(skb)->
+ frags[f]));
+
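+	/* Example: a linear 256-byte head plus three 4 KB frags needs
+	 * count = 1 + 3 data descriptors (assuming NGBE_MAX_DATA_PER_TXD
+	 * is at least 4 KB), so the check below reserves 4 + 3 = 7 slots:
+	 * one context descriptor plus the two-slot head/tail gap.
+	 */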
+ if (ngbe_maybe_stop_tx(tx_ring, count + 3)) {
+ tx_ring->tx_stats.tx_busy++;
+ return NETDEV_TX_BUSY;
+ }
+
+ /* record the location of the first descriptor for this packet */
+ first = &tx_ring->tx_buffer_info[tx_ring->next_to_use];
+ first->skb = skb;
+ first->bytecount = skb->len;
+ first->gso_segs = 1;
+
+ /* if we have a HW VLAN tag being added default to the HW one */
+ if (skb_vlan_tag_present(skb)) {
+ tx_flags |= skb_vlan_tag_get(skb) << NGBE_TX_FLAGS_VLAN_SHIFT;
+ tx_flags |= NGBE_TX_FLAGS_HW_VLAN;
+ /* else if it is a SW VLAN check the next protocol and store the tag */
+ } else if (protocol == htons(ETH_P_8021Q)) {
+ struct vlan_hdr *vhdr, _vhdr;
+ vhdr = skb_header_pointer(skb, ETH_HLEN, sizeof(_vhdr), &_vhdr);
+ if (!vhdr)
+ goto out_drop;
+
+ protocol = vhdr->h_vlan_encapsulated_proto;
+ tx_flags |= ntohs(vhdr->h_vlan_TCI) <<
+ NGBE_TX_FLAGS_VLAN_SHIFT;
+ tx_flags |= NGBE_TX_FLAGS_SW_VLAN;
+ }
+
+ if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+ adapter->ptp_clock) {
+ if (!test_and_set_bit_lock(__NGBE_PTP_TX_IN_PROGRESS,
+ &adapter->state)) {
+ skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+ tx_flags |= NGBE_TX_FLAGS_TSTAMP;
+
+ /* schedule check for Tx timestamp */
+ adapter->ptp_tx_skb = skb_get(skb);
+ adapter->ptp_tx_start = jiffies;
+ schedule_work(&adapter->ptp_tx_work);
+ } else {
+ adapter->tx_hwtstamp_skipped++;
+ }
+ }
+
+#ifdef CONFIG_PCI_IOV
+ /*
+ * Use the l2switch_enable flag - would be false if the DMA
+ * Tx switch had been disabled.
+ */
+ if (adapter->flags & NGBE_FLAG_SRIOV_L2SWITCH_ENABLE)
+ tx_flags |= NGBE_TX_FLAGS_CC;
+
+#endif
+ /* record initial flags and protocol */
+ first->tx_flags = tx_flags;
+ first->protocol = protocol;
+
+ dptype = encode_tx_desc_ptype(first);
+
+ tso = ngbe_tso(tx_ring, first, &hdr_len, dptype);
+ if (tso < 0)
+ goto out_drop;
+ else if (!tso)
+ ngbe_tx_csum(tx_ring, first, dptype);
+
+ if (ngbe_tx_map(tx_ring, first, hdr_len))
+ goto cleanup_tx_tstamp;
+
+ return NETDEV_TX_OK;
+
+out_drop:
+ dev_kfree_skb_any(first->skb);
+ first->skb = NULL;
+
+cleanup_tx_tstamp:
+ if (unlikely(tx_flags & NGBE_TX_FLAGS_TSTAMP)) {
+ dev_kfree_skb_any(adapter->ptp_tx_skb);
+ adapter->ptp_tx_skb = NULL;
+ cancel_work_sync(&adapter->ptp_tx_work);
+ clear_bit_unlock(__NGBE_PTP_TX_IN_PROGRESS, &adapter->state);
+ }
+
+ return NETDEV_TX_OK;
+}
+
+static netdev_tx_t ngbe_xmit_frame(struct sk_buff *skb,
+ struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_ring *tx_ring;
+ unsigned int r_idx = skb->queue_mapping;
+
+ if (!netif_carrier_ok(netdev)) {
+ dev_kfree_skb_any(skb);
+ return NETDEV_TX_OK;
+ }
+
+	/*
+	 * The minimum packet size for olinfo paylen is 17, so pad the skb
+	 * to meet this minimum size requirement.
+	 */
+ if (skb_put_padto(skb, 17))
+ return NETDEV_TX_OK;
+
+ if (r_idx >= adapter->num_tx_queues)
+ r_idx = r_idx % adapter->num_tx_queues;
+ tx_ring = adapter->tx_ring[r_idx];
+
+ return ngbe_xmit_frame_ring(skb, adapter, tx_ring);
+}
+
+/**
+ * ngbe_set_mac - Change the Ethernet Address of the NIC
+ * @netdev: network interface device structure
+ * @p: pointer to an address structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int ngbe_set_mac(struct net_device *netdev, void *p)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ struct sockaddr *addr = p;
+
+ if (!is_valid_ether_addr(addr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ ngbe_del_mac_filter(adapter, hw->mac.addr, VMDQ_P(0));
+ memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
+ memcpy(hw->mac.addr, addr->sa_data, netdev->addr_len);
+
+ ngbe_mac_set_default_filter(adapter, hw->mac.addr);
+ e_info(drv, "The mac has been set to %02X:%02X:%02X:%02X:%02X:%02X\n",
+ hw->mac.addr[0], hw->mac.addr[1], hw->mac.addr[2],
+ hw->mac.addr[3], hw->mac.addr[4], hw->mac.addr[5]);
+
+ return 0;
+}
+
+static int ngbe_mii_ioctl(struct net_device *netdev, struct ifreq *ifr,
+ int cmd)
+{
+ struct mii_ioctl_data *mii = (struct mii_ioctl_data *) &ifr->ifr_data;
+ int prtad, devad, ret = 0;
+
+ prtad = (mii->phy_id & MDIO_PHY_ID_PRTAD) >> 5;
+ devad = (mii->phy_id & MDIO_PHY_ID_DEVAD);
+
+ return ret;
+}
+
+static int ngbe_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ switch (cmd) {
+ case SIOCGHWTSTAMP:
+ return ngbe_ptp_get_ts_config(adapter, ifr);
+ case SIOCSHWTSTAMP:
+ return ngbe_ptp_set_ts_config(adapter, ifr);
+ case SIOCGMIIREG:
+ case SIOCSMIIREG:
+ return ngbe_mii_ioctl(netdev, ifr, cmd);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+/**
+ * ngbe_setup_tc - configure net_device for multiple traffic classes
+ * @dev: net device to configure
+ * @tc: number of traffic classes to enable
+ */
+int ngbe_setup_tc(struct net_device *dev, u8 tc)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+
+ /* Hardware has to reinitialize queues and interrupts to
+ * match packet buffer alignment. Unfortunately, the
+ * hardware is not flexible enough to do this dynamically.
+ */
+ if (netif_running(dev))
+ ngbe_close(dev);
+ else
+ ngbe_reset(adapter);
+
+ ngbe_clear_interrupt_scheme(adapter);
+
+	if (tc)
+		netdev_set_num_tc(dev, tc);
+	else
+		netdev_reset_tc(dev);
+
+ ngbe_init_interrupt_scheme(adapter);
+ if (netif_running(dev))
+ ngbe_open(dev);
+
+ return 0;
+}
+
+#ifdef CONFIG_PCI_IOV
+void ngbe_sriov_reinit(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+
+ rtnl_lock();
+ ngbe_setup_tc(netdev, netdev_get_num_tc(netdev));
+ rtnl_unlock();
+}
+#endif
+
+void ngbe_do_reset(struct net_device *netdev)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ if (netif_running(netdev))
+ ngbe_reinit_locked(adapter);
+ else
+ ngbe_reset(adapter);
+}
+
+static netdev_features_t ngbe_fix_features(struct net_device *netdev,
+ netdev_features_t features)
+{
+ /* If Rx checksum is disabled, then RSC/LRO should also be disabled */
+ if (!(features & NETIF_F_RXCSUM))
+ features &= ~NETIF_F_LRO;
+
+ /* Turn off LRO if not RSC capable */
+ features &= ~NETIF_F_LRO;
+
+ return features;
+}
+
+static int ngbe_set_features(struct net_device *netdev,
+ netdev_features_t features)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ bool need_reset = false;
+
+ if (features & NETIF_F_HW_VLAN_CTAG_RX)
+ ngbe_vlan_strip_enable(adapter);
+ else
+ ngbe_vlan_strip_disable(adapter);
+
+ if (features & NETIF_F_RXHASH) {
+ if (!(adapter->flags2 & NGBE_FLAG2_RSS_ENABLED)) {
+ wr32m(&adapter->hw, NGBE_RDB_RA_CTL,
+ NGBE_RDB_RA_CTL_RSS_EN, NGBE_RDB_RA_CTL_RSS_EN);
+ adapter->flags2 |= NGBE_FLAG2_RSS_ENABLED;
+ }
+ } else {
+ if (adapter->flags2 & NGBE_FLAG2_RSS_ENABLED) {
+ wr32m(&adapter->hw, NGBE_RDB_RA_CTL,
+ NGBE_RDB_RA_CTL_RSS_EN, ~NGBE_RDB_RA_CTL_RSS_EN);
+ adapter->flags2 &= ~NGBE_FLAG2_RSS_ENABLED;
+ }
+ }
+
+ if (need_reset)
+ ngbe_do_reset(netdev);
+
+ return 0;
+}
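The RSS toggle above leans on the read-modify-write helper wr32m(); its
assumed semantics (a sketch, not the driver's actual definition) make it
clear why passing ~NGBE_RDB_RA_CTL_RSS_EN as the value clears the bit:
only the bits selected by the mask are taken from the value argument.

	/* assumed shape of the driver's wr32m() helper */
	static void wr32m(struct ngbe_hw *hw, u32 reg, u32 mask, u32 field)
	{
		u32 val = rd32(hw, reg);

		val = (val & ~mask) | (field & mask);
		wr32(hw, reg, val);
	}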
+
+static int ngbe_ndo_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
+ struct net_device *dev,
+ const unsigned char *addr,
+ u16 vid,
+ u16 flags)
+{
+ /* guarantee we can provide a unique filter for the unicast address */
+ if (is_unicast_ether_addr(addr) || is_link_local_ether_addr(addr)) {
+ if (NGBE_MAX_PF_MACVLANS <= netdev_uc_count(dev))
+ return -ENOMEM;
+ }
+
+ return ndo_dflt_fdb_add(ndm, tb, dev, addr, vid, flags);
+}
+
+static int ngbe_ndo_bridge_setlink(struct net_device *dev,
+ struct nlmsghdr *nlh,
+ __always_unused u16 flags)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+ struct nlattr *attr, *br_spec;
+ int rem;
+
+ if (!(adapter->flags & NGBE_FLAG_SRIOV_ENABLED))
+ return -EOPNOTSUPP;
+
+ br_spec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
+
+ nla_for_each_nested(attr, br_spec, rem) {
+ __u16 mode;
+
+ if (nla_type(attr) != IFLA_BRIDGE_MODE)
+ continue;
+
+ mode = nla_get_u16(attr);
+ if (mode == BRIDGE_MODE_VEPA) {
+ adapter->flags |= NGBE_FLAG_SRIOV_VEPA_BRIDGE_MODE;
+ } else if (mode == BRIDGE_MODE_VEB) {
+ adapter->flags &= ~NGBE_FLAG_SRIOV_VEPA_BRIDGE_MODE;
+ } else {
+ return -EINVAL;
+ }
+
+ adapter->bridge_mode = mode;
+
+ /* re-configure settings related to bridge mode */
+ ngbe_configure_bridge_mode(adapter);
+
+ e_info(drv, "enabling bridge mode: %s\n",
+ mode == BRIDGE_MODE_VEPA ? "VEPA" : "VEB");
+ }
+
+ return 0;
+}
+
+static int ngbe_ndo_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
+ struct net_device *dev,
+ u32 __maybe_unused filter_mask,
+ int nlflags)
+{
+ struct ngbe_adapter *adapter = netdev_priv(dev);
+ u16 mode;
+
+ if (!(adapter->flags & NGBE_FLAG_SRIOV_ENABLED))
+ return 0;
+
+ mode = adapter->bridge_mode;
+ return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode, 0, 0, nlflags,
+ filter_mask, NULL);
+}
+
+#define NGBE_MAX_TUNNEL_HDR_LEN 80
+static netdev_features_t ngbe_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ u32 vlan_num = 0;
+ u16 vlan_depth = skb->mac_len;
+ __be16 type = skb->protocol;
+ struct vlan_hdr *vh;
+
+	if (skb_vlan_tag_present(skb))
+		vlan_num++;
+
+	if (vlan_depth)
+		vlan_depth -= VLAN_HLEN;
+	else
+		vlan_depth = ETH_HLEN;
+
+ while (type == htons(ETH_P_8021Q) || type == htons(ETH_P_8021AD)) {
+ vlan_num++;
+ vh = (struct vlan_hdr *)(skb->data + vlan_depth);
+ type = vh->h_vlan_encapsulated_proto;
+ vlan_depth += VLAN_HLEN;
+	}
+
+ if (vlan_num > 2)
+ features &= ~(NETIF_F_HW_VLAN_CTAG_TX |
+ NETIF_F_HW_VLAN_STAG_TX);
+
+ if (skb->encapsulation) {
+ if (unlikely(skb_inner_mac_header(skb) -
+ skb_transport_header(skb) >
+ NGBE_MAX_TUNNEL_HDR_LEN))
+ return features & ~NETIF_F_CSUM_MASK;
+ }
+ return features;
+}
+
+static const struct net_device_ops ngbe_netdev_ops = {
+ .ndo_open = ngbe_open,
+ .ndo_stop = ngbe_close,
+ .ndo_start_xmit = ngbe_xmit_frame,
+ .ndo_set_rx_mode = ngbe_set_rx_mode,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_set_mac_address = ngbe_set_mac,
+ .ndo_change_mtu = ngbe_change_mtu,
+ .ndo_tx_timeout = ngbe_tx_timeout,
+ .ndo_vlan_rx_add_vid = ngbe_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = ngbe_vlan_rx_kill_vid,
+ .ndo_do_ioctl = ngbe_ioctl,
+
+ .ndo_set_vf_mac = ngbe_ndo_set_vf_mac,
+ .ndo_set_vf_vlan = ngbe_ndo_set_vf_vlan,
+	/* set_vf_rate is not supported by emerald */
+ .ndo_set_vf_rate = ngbe_ndo_set_vf_bw,
+ .ndo_set_vf_spoofchk = ngbe_ndo_set_vf_spoofchk,
+ .ndo_set_vf_trust = ngbe_ndo_set_vf_trust,
+ .ndo_get_vf_config = ngbe_ndo_get_vf_config,
+ .ndo_get_stats64 = ngbe_get_stats64,
+
+ .ndo_fdb_add = ngbe_ndo_fdb_add,
+
+ .ndo_bridge_setlink = ngbe_ndo_bridge_setlink,
+ .ndo_bridge_getlink = ngbe_ndo_bridge_getlink,
+
+ .ndo_features_check = ngbe_features_check,
+ .ndo_set_features = ngbe_set_features,
+ .ndo_fix_features = ngbe_fix_features,
+};
+
+void ngbe_assign_netdev_ops(struct net_device *dev)
+{
+ dev->netdev_ops = &ngbe_netdev_ops;
+ ngbe_set_ethtool_ops(dev);
+ dev->watchdog_timeo = 5 * HZ;
+}
+
+/**
+ * ngbe_wol_supported - Check whether device supports WoL
+ * @adapter: the adapter private structure
+ *
+ * This function is used by probe and ethtool to determine
+ * which devices have WoL support
+ **/
+int ngbe_wol_supported(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+	/* WoL is reported as supported on PCI functions 0 through 3 */
+	return (hw->bus.func == 0) || (hw->bus.func == 1) ||
+	       (hw->bus.func == 2) || (hw->bus.func == 3);
+}
+
+
+/**
+ * ngbe_probe - Device Initialization Routine
+ * @pdev: PCI device information struct
+ * @ent: entry in ngbe_pci_tbl
+ *
+ * Returns 0 on success, negative on failure
+ *
+ * ngbe_probe initializes an adapter identified by a pci_dev structure.
+ * The OS initialization, configuring of the adapter private structure,
+ * and a hardware reset occur.
+ **/
+static int ngbe_probe(struct pci_dev *pdev,
+ const struct pci_device_id __always_unused *ent)
+{
+ struct net_device *netdev;
+ struct ngbe_adapter *adapter = NULL;
+ struct ngbe_hw *hw = NULL;
+ static int cards_found;
+ int err, pci_using_dac, expected_gts;
+ u32 eeprom_verl = 0;
+ u32 etrack_id = 0;
+ char *info_string, *i_s_var;
+ u32 eeprom_cksum_devcap = 0;
+ u32 saved_version = 0;
+ u32 devcap;
+
+ bool disable_dev = false;
+
+ netdev_features_t hw_features;
+
+ err = pci_enable_device_mem(pdev);
+ if (err)
+ return err;
+
+ if (!dma_set_mask(pci_dev_to_dev(pdev), DMA_BIT_MASK(64)) &&
+ !dma_set_coherent_mask(pci_dev_to_dev(pdev), DMA_BIT_MASK(64))) {
+ pci_using_dac = 1;
+ } else {
+ err = dma_set_mask(pci_dev_to_dev(pdev), DMA_BIT_MASK(32));
+ if (err) {
+ err = dma_set_coherent_mask(pci_dev_to_dev(pdev),
+ DMA_BIT_MASK(32));
+ if (err) {
+				dev_err(pci_dev_to_dev(pdev),
+					"No usable DMA configuration, aborting\n");
+ goto err_dma;
+ }
+ }
+ pci_using_dac = 0;
+ }
+
+ err = pci_request_selected_regions(pdev,
+ pci_select_bars(pdev, IORESOURCE_MEM),
+ ngbe_driver_name);
+ if (err) {
+ dev_err(pci_dev_to_dev(pdev),
+ "pci_request_selected_regions failed 0x%x\n", err);
+ goto err_pci_reg;
+ }
+
+ pci_enable_pcie_error_reporting(pdev);
+ pci_set_master(pdev);
+
+ /* errata 16 */
+ pcie_capability_clear_and_set_word(pdev, PCI_EXP_DEVCTL,
+ PCI_EXP_DEVCTL_READRQ,
+ 0x1000);
+
+ netdev = alloc_etherdev_mq(sizeof(struct ngbe_adapter), NGBE_MAX_TX_QUEUES);
+ if (!netdev) {
+ err = -ENOMEM;
+ goto err_alloc_etherdev;
+ }
+
+ SET_NETDEV_DEV(netdev, pci_dev_to_dev(pdev));
+
+ adapter = netdev_priv(netdev);
+ adapter->netdev = netdev;
+ adapter->pdev = pdev;
+ hw = &adapter->hw;
+ hw->back = adapter;
+ adapter->msg_enable = (1 << DEFAULT_DEBUG_LEVEL_SHIFT) - 1;
+
+ hw->hw_addr = ioremap(pci_resource_start(pdev, 0),
+ pci_resource_len(pdev, 0));
+
+ adapter->io_addr = hw->hw_addr;
+ if (!hw->hw_addr) {
+ err = -EIO;
+ goto err_ioremap;
+ }
+
+ /* autoneg default on */
+ hw->mac.autoneg = true;
+
+ /* assign netdev ops and ethtool ops */
+ ngbe_assign_netdev_ops(netdev);
+
+ strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);
+
+ adapter->bd_number = cards_found;
+
+ /* setup the private structure */
+ err = ngbe_sw_init(adapter);
+ if (err)
+ goto err_sw_init;
+
+ /*
+ * check_options must be called before setup_link to set up
+ * hw->fc completely
+ */
+ ngbe_check_options(adapter);
+
+ TCALL(hw, mac.ops.set_lan_id);
+
+ /* check if flash load is done after hw power up */
+ err = ngbe_check_flash_load(hw, NGBE_SPI_ILDR_STATUS_PERST);
+ if (err)
+ goto err_sw_init;
+ err = ngbe_check_flash_load(hw, NGBE_SPI_ILDR_STATUS_PWRRST);
+ if (err)
+ goto err_sw_init;
+
+ /* reset_hw fills in the perm_addr as well */
+
+ hw->phy.reset_if_overtemp = true;
+ err = TCALL(hw, mac.ops.reset_hw);
+ hw->phy.reset_if_overtemp = false;
+ if (err) {
+ e_dev_err("HW reset failed: %d\n", err);
+ goto err_sw_init;
+ }
+
+#ifdef CONFIG_PCI_IOV
+ if (adapter->num_vfs > 0) {
+		e_dev_warn("Enabling SR-IOV VFs using the max_vfs module parameter is deprecated.\n");
+		e_dev_warn("Please use the pci sysfs interface instead. Ex:\n");
+		e_dev_warn("echo '%d' > /sys/bus/pci/devices/%04x:%02x:%02x.%1x/sriov_numvfs\n",
+ adapter->num_vfs,
+ pci_domain_nr(pdev->bus),
+ pdev->bus->number,
+ PCI_SLOT(pdev->devfn),
+ PCI_FUNC(pdev->devfn));
+ }
+
+ if (adapter->flags & NGBE_FLAG_SRIOV_CAPABLE) {
+ pci_sriov_set_totalvfs(pdev, NGBE_MAX_VFS_DRV_LIMIT);
+ ngbe_enable_sriov(adapter);
+ }
+#endif /* CONFIG_PCI_IOV */
+
+ netdev->features |= NETIF_F_SG | NETIF_F_IP_CSUM;
+
+#ifdef NETIF_F_IPV6_CSUM
+ netdev->features |= NETIF_F_IPV6_CSUM;
+#endif
+
+ netdev->features |= NETIF_F_HW_VLAN_CTAG_TX |
+ NETIF_F_HW_VLAN_CTAG_RX;
+
+ netdev->features |= ngbe_tso_features();
+
+ if (adapter->flags2 & NGBE_FLAG2_RSS_ENABLED)
+ netdev->features |= NETIF_F_RXHASH;
+
+ netdev->features |= NETIF_F_RXCSUM;
+
+ /* copy netdev features into list of user selectable features */
+ hw_features = netdev->hw_features;
+ hw_features |= netdev->features;
+
+ /* set this bit last since it cannot be part of hw_features */
+ netdev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+ netdev->features |= NETIF_F_NTUPLE;
+
+ hw_features |= NETIF_F_NTUPLE;
+ netdev->hw_features = hw_features;
+
+ netdev->vlan_features |= NETIF_F_SG |
+ NETIF_F_IP_CSUM |
+ NETIF_F_IPV6_CSUM |
+ NETIF_F_TSO |
+ NETIF_F_TSO6;
+
+ netdev->hw_enc_features |= NETIF_F_SG | NETIF_F_IP_CSUM;
+
+ netdev->priv_flags |= IFF_UNICAST_FLT;
+ netdev->priv_flags |= IFF_SUPP_NOFCS;
+
+ /* MTU range: 68 - 9414 */
+ netdev->min_mtu = ETH_MIN_MTU;
+ netdev->max_mtu = NGBE_MAX_JUMBO_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN);
+
+ if (pci_using_dac) {
+ netdev->features |= NETIF_F_HIGHDMA;
+
+ netdev->vlan_features |= NETIF_F_HIGHDMA;
+
+ }
+
+ if (hw->bus.lan_id == 0) {
+ wr32(hw, NGBE_CALSUM_CAP_STATUS, 0x0);
+ wr32(hw, NGBE_EEPROM_VERSION_STORE_REG, 0x0);
+ } else {
+ eeprom_cksum_devcap = rd32(hw, NGBE_CALSUM_CAP_STATUS);
+ saved_version = rd32(hw, NGBE_EEPROM_VERSION_STORE_REG);
+ }
+
+ TCALL(hw, eeprom.ops.init_params);
+ TCALL(hw, mac.ops.release_swfw_sync, NGBE_MNG_SWFW_SYNC_SW_MB);
+ if (hw->bus.lan_id == 0 || eeprom_cksum_devcap == 0) {
+ /* make sure the EEPROM is good */
+ if (TCALL(hw, eeprom.ops.eeprom_chksum_cap_st, NGBE_CALSUM_COMMAND, &devcap)) {
+ e_dev_err("The EEPROM Checksum Is Not Valid\n");
+ err = -EIO;
+ goto err_sw_init;
+ }
+ }
+
+ memcpy(netdev->dev_addr, hw->mac.perm_addr, netdev->addr_len);
+
+ if (!is_valid_ether_addr(netdev->dev_addr)) {
+ e_dev_err("invalid MAC address\n");
+ err = -EIO;
+ goto err_sw_init;
+ }
+
+ ngbe_mac_set_default_filter(adapter, hw->mac.perm_addr);
+
+ timer_setup(&adapter->service_timer, ngbe_service_timer, 0);
+#ifdef CONFIG_NGBE_POLL_LINK_STATUS
+ timer_setup(&adapter->link_check_timer, ngbe_link_check_timer, 0);
+#endif
+ if (NGBE_REMOVED(hw->hw_addr)) {
+ err = -EIO;
+ goto err_sw_init;
+ }
+ INIT_WORK(&adapter->service_task, ngbe_service_task);
+ set_bit(__NGBE_SERVICE_INITED, &adapter->state);
+ clear_bit(__NGBE_SERVICE_SCHED, &adapter->state);
+
+ err = ngbe_init_interrupt_scheme(adapter);
+ if (err)
+ goto err_sw_init;
+
+ /* WOL not supported for all devices */
+ adapter->wol = 0;
+ if (hw->bus.lan_id == 0 || eeprom_cksum_devcap == 0) {
+ TCALL(hw, eeprom.ops.read,
+ hw->eeprom.sw_region_offset + NGBE_DEVICE_CAPS,
+ &adapter->eeprom_cap);
+		/* only supported on LAN0 */
+ adapter->eeprom_cap = NGBE_DEVICE_CAPS_WOL_PORT0;
+ } else {
+ adapter->eeprom_cap = eeprom_cksum_devcap & 0xffff;
+ }
+ if (ngbe_wol_supported(adapter))
+ adapter->wol = NGBE_PSR_WKUP_CTL_MAG;
+ if ((hw->subsystem_device_id & WOL_SUP_MASK) == WOL_SUP) {
+		/* enable WoL first in shadow RAM */
+ ngbe_write_ee_hostif(hw, 0x7FE, 0xa50F);
+ ngbe_write_ee_hostif(hw, 0x7FF, 0x5a5a);
+ }
+ hw->wol_enabled = !!(adapter->wol);
+ wr32(hw, NGBE_PSR_WKUP_CTL, adapter->wol);
+
+ device_set_wakeup_enable(pci_dev_to_dev(adapter->pdev), adapter->wol);
+
+	/*
+	 * Save off the EEPROM version number and Option ROM version, which
+	 * together make a unique identifier for the EEPROM.
+	 */
+ if (hw->bus.lan_id == 0 || saved_version == 0) {
+ TCALL(hw, eeprom.ops.read32,
+ hw->eeprom.sw_region_offset + NGBE_EEPROM_VERSION_L,
+ &eeprom_verl);
+ etrack_id = eeprom_verl;
+ wr32(hw, NGBE_EEPROM_VERSION_STORE_REG, etrack_id);
+ wr32(hw, NGBE_CALSUM_CAP_STATUS, 0x10000 | (u32)adapter->eeprom_cap);
+ } else if (eeprom_cksum_devcap) {
+ etrack_id = saved_version;
+ } else {
+ TCALL(hw, eeprom.ops.read32,
+ hw->eeprom.sw_region_offset + NGBE_EEPROM_VERSION_L,
+ &eeprom_verl);
+ etrack_id = eeprom_verl;
+ }
+
+	/* save off the EEPROM version string for later reporting */
+ snprintf(adapter->eeprom_id, sizeof(adapter->eeprom_id),
+ "0x%08x", etrack_id);
+
+ /* reset the hardware with the new settings */
+ err = TCALL(hw, mac.ops.start_hw);
+ if (err == NGBE_ERR_EEPROM_VERSION) {
+ /* We are running on a pre-production device, log a warning */
+		e_dev_warn("This device is a pre-production adapter/LOM. Please be aware there may be issues associated with your hardware. If you are experiencing problems please contact your hardware representative who provided you with this hardware.\n");
+ } else if (err) {
+ e_dev_err("HW init failed, err = %d\n", err);
+ goto err_register;
+ }
+
+ /* pick up the PCI bus settings for reporting later */
+ TCALL(hw, mac.ops.get_bus_info);
+
+ strcpy(netdev->name, "eth%d");
+ err = register_netdev(netdev);
+ if (err)
+ goto err_register;
+
+ pci_set_drvdata(pdev, adapter);
+ adapter->netdev_registered = true;
+
+ /*
+ * call save state here in standalone driver because it relies on
+ * adapter struct to exist, and needs to call netdev_priv
+ */
+ pci_save_state(pdev);
+
+ /* carrier off reporting is important to ethtool even BEFORE open */
+ netif_carrier_off(netdev);
+ /* keep stopping all the transmit queues for older kernels */
+ netif_tx_stop_all_queues(netdev);
+
+ /* print all messages at the end so that we use our eth%d name */
+
+	/* calculate the expected PCIe bandwidth required for optimal
+	 * performance. Note that some older parts will never have enough
+	 * bandwidth due to being older generation PCIe parts. We clamp these
+	 * parts to ensure that no warning is displayed, as this could confuse
+	 * users otherwise.
+	 */
+
+ expected_gts = ngbe_enumerate_functions(adapter) * 10;
+
+ /* don't check link if we failed to enumerate functions */
+ if (expected_gts > 0)
+ ngbe_check_minimum_link(adapter, expected_gts);
+
+ TCALL(hw, mac.ops.set_fw_drv_ver, 0xFF, 0xFF, 0xFF, 0xFF);
+
+	if (((hw->subsystem_device_id & NCSI_SUP_MASK) == NCSI_SUP) ||
+	    ((hw->subsystem_device_id & OEM_MASK) == OCP_CARD))
+		e_info(probe, "NCSI: supported\n");
+	else
+		e_info(probe, "NCSI: not supported\n");
+
+	e_info(probe, "PHY: %s, PBA No: Wang Xun GbE Family Controller\n",
+	       hw->phy.type == ngbe_phy_internal ? "Internal" : "External");
+
+ e_info(probe, "%02x:%02x:%02x:%02x:%02x:%02x\n",
+ netdev->dev_addr[0], netdev->dev_addr[1],
+ netdev->dev_addr[2], netdev->dev_addr[3],
+ netdev->dev_addr[4], netdev->dev_addr[5]);
+
+#define INFO_STRING_LEN 255
+ info_string = kzalloc(INFO_STRING_LEN, GFP_KERNEL);
+ if (!info_string) {
+ e_err(probe, "allocation for info string failed\n");
+ goto no_info_string;
+ }
+ i_s_var = info_string;
+ i_s_var += sprintf(info_string, "Enabled Features: ");
+ i_s_var += sprintf(i_s_var, "RxQ: %d TxQ: %d ",
+ adapter->num_rx_queues, adapter->num_tx_queues);
+ if (adapter->flags & NGBE_FLAG_TPH_ENABLED)
+ i_s_var += sprintf(i_s_var, "TPH ");
+
+ BUG_ON(i_s_var > (info_string + INFO_STRING_LEN));
+ /* end features printing */
+ e_info(probe, "%s\n", info_string);
+ kfree(info_string);
+no_info_string:
+
+#ifdef CONFIG_PCI_IOV
+ if (adapter->flags & NGBE_FLAG_SRIOV_ENABLED) {
+ int i;
+ for (i = 0; i < adapter->num_vfs; i++)
+ ngbe_vf_configuration(pdev, (i | 0x10000000));
+ }
+#endif
+
+ e_info(probe, "WangXun(R) Gigabit Network Connection\n");
+ cards_found++;
+
+#ifdef CONFIG_NGBE_SYSFS
+ if (ngbe_sysfs_init(adapter))
+ e_err(probe, "failed to allocate sysfs resources\n");
+#else
+#ifdef CONFIG_NGBE_PROCFS
+ if (ngbe_procfs_init(adapter))
+ e_err(probe, "failed to allocate procfs resources\n");
+#endif /* CONFIG_NGBE_PROCFS */
+#endif /* CONFIG_NGBE_SYSFS */
+
+
+#ifdef CONFIG_NGBE_DEBUG_FS
+ ngbe_dbg_adapter_init(adapter);
+#endif /* CONFIG_NGBE_DEBUG_FS */
+
+ return 0;
+
+err_register:
+ ngbe_clear_interrupt_scheme(adapter);
+ ngbe_release_hw_control(adapter);
+err_sw_init:
+#ifdef CONFIG_PCI_IOV
+ ngbe_disable_sriov(adapter);
+#endif /* CONFIG_PCI_IOV */
+ adapter->flags2 &= ~NGBE_FLAG2_SEARCH_FOR_SFP;
+ kfree(adapter->mac_table);
+ iounmap(adapter->io_addr);
+err_ioremap:
+ disable_dev = !test_and_set_bit(__NGBE_DISABLED, &adapter->state);
+ free_netdev(netdev);
+err_alloc_etherdev:
+ pci_release_selected_regions(pdev,
+ pci_select_bars(pdev, IORESOURCE_MEM));
+err_pci_reg:
+err_dma:
+ if (!adapter || disable_dev)
+ pci_disable_device(pdev);
+
+ return err;
+}
+
+/**
+ * ngbe_remove - Device Removal Routine
+ * @pdev: PCI device information struct
+ *
+ * ngbe_remove is called by the PCI subsystem to alert the driver
+ * that it should release a PCI device. This could be caused by a
+ * Hot-Plug event, or because the driver is going to be removed from
+ * memory.
+ **/
+static void ngbe_remove(struct pci_dev *pdev)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(pdev);
+ struct net_device *netdev;
+ bool disable_dev;
+
+ /* if !adapter then we already cleaned up in probe */
+ if (!adapter)
+ return;
+
+ netdev = adapter->netdev;
+#ifdef CONFIG_NGBE_DEBUG_FS
+ ngbe_dbg_adapter_exit(adapter);
+#endif
+
+ set_bit(__NGBE_REMOVING, &adapter->state);
+ cancel_work_sync(&adapter->service_task);
+
+#ifdef CONFIG_NGBE_SYSFS
+ ngbe_sysfs_exit(adapter);
+#else
+#ifdef CONFIG_NGBE_PROCFS
+ ngbe_procfs_exit(adapter);
+#endif
+#endif /* CONFIG_NGBE_SYSFS */
+ if (adapter->netdev_registered) {
+ unregister_netdev(netdev);
+ adapter->netdev_registered = false;
+ }
+
+#ifdef CONFIG_PCI_IOV
+ ngbe_disable_sriov(adapter);
+#endif
+
+ ngbe_clear_interrupt_scheme(adapter);
+ ngbe_release_hw_control(adapter);
+
+ iounmap(adapter->io_addr);
+ pci_release_selected_regions(pdev,
+ pci_select_bars(pdev, IORESOURCE_MEM));
+
+ kfree(adapter->mac_table);
+ disable_dev = !test_and_set_bit(__NGBE_DISABLED, &adapter->state);
+ free_netdev(netdev);
+
+ pci_disable_pcie_error_reporting(pdev);
+
+ if (disable_dev)
+ pci_disable_device(pdev);
+}
+
+static bool ngbe_check_cfg_remove(struct ngbe_hw *hw, struct pci_dev *pdev)
+{
+ u16 value;
+
+ pci_read_config_word(pdev, PCI_VENDOR_ID, &value);
+ if (value == NGBE_FAILED_READ_CFG_WORD) {
+ ngbe_remove_adapter(hw);
+ return true;
+ }
+ return false;
+}
+
+u16 ngbe_read_pci_cfg_word(struct ngbe_hw *hw, u32 reg)
+{
+ struct ngbe_adapter *adapter = hw->back;
+ u16 value;
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ return NGBE_FAILED_READ_CFG_WORD;
+ pci_read_config_word(adapter->pdev, reg, &value);
+ if (value == NGBE_FAILED_READ_CFG_WORD &&
+ ngbe_check_cfg_remove(hw, adapter->pdev))
+ return NGBE_FAILED_READ_CFG_WORD;
+ return value;
+}
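ngbe_read_pci_cfg_word() doubles as a surprise-removal probe: a config read
of all ones means the device has dropped off the bus, and
ngbe_check_cfg_remove() then marks the adapter removed so later MMIO is
short-circuited. A minimal caller sketch (illustrative only):

	u16 vendor = ngbe_read_pci_cfg_word(hw, PCI_VENDOR_ID);

	if (vendor == NGBE_FAILED_READ_CFG_WORD)
		return;	/* device gone; NGBE_REMOVED(hw->hw_addr) now true */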
+
+#ifdef CONFIG_PCI_IOV
+static u32 ngbe_read_pci_cfg_dword(struct ngbe_hw *hw, u32 reg)
+{
+ struct ngbe_adapter *adapter = hw->back;
+ u32 value;
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ return NGBE_FAILED_READ_CFG_DWORD;
+ pci_read_config_dword(adapter->pdev, reg, &value);
+ if (value == NGBE_FAILED_READ_CFG_DWORD &&
+ ngbe_check_cfg_remove(hw, adapter->pdev))
+ return NGBE_FAILED_READ_CFG_DWORD;
+ return value;
+}
+#endif /* CONFIG_PCI_IOV */
+
+void ngbe_write_pci_cfg_word(struct ngbe_hw *hw, u32 reg, u16 value)
+{
+ struct ngbe_adapter *adapter = hw->back;
+
+ if (NGBE_REMOVED(hw->hw_addr))
+ return;
+ pci_write_config_word(adapter->pdev, reg, value);
+}
+
+/**
+ * ngbe_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci connection state
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t ngbe_io_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(pdev);
+ struct net_device *netdev = adapter->netdev;
+
+#ifdef CONFIG_PCI_IOV
+ struct ngbe_hw *hw = &adapter->hw;
+ struct pci_dev *bdev, *vfdev;
+ u32 dw0, dw1, dw2, dw3;
+ int vf, pos;
+ u16 req_id, pf_func;
+
+ if (adapter->num_vfs == 0)
+ goto skip_bad_vf_detection;
+
+ bdev = pdev->bus->self;
+ while (bdev && (pci_pcie_type(bdev) != PCI_EXP_TYPE_ROOT_PORT))
+ bdev = bdev->bus->self;
+
+ if (!bdev)
+ goto skip_bad_vf_detection;
+
+ pos = pci_find_ext_capability(bdev, PCI_EXT_CAP_ID_ERR);
+ if (!pos)
+ goto skip_bad_vf_detection;
+
+ dw0 = ngbe_read_pci_cfg_dword(hw, pos + PCI_ERR_HEADER_LOG);
+ dw1 = ngbe_read_pci_cfg_dword(hw,
+ pos + PCI_ERR_HEADER_LOG + 4);
+ dw2 = ngbe_read_pci_cfg_dword(hw,
+ pos + PCI_ERR_HEADER_LOG + 8);
+ dw3 = ngbe_read_pci_cfg_dword(hw,
+ pos + PCI_ERR_HEADER_LOG + 12);
+ if (NGBE_REMOVED(hw->hw_addr))
+ goto skip_bad_vf_detection;
+
+ req_id = dw1 >> 16;
+ /* if bit 7 of the requestor ID is set then it's a VF */
+ if (!(req_id & 0x0080))
+ goto skip_bad_vf_detection;
+
+ pf_func = req_id & 0x01;
+ if ((pf_func & 1) == (pdev->devfn & 1)) {
+ vf = (req_id & 0x7F) >> 1;
+ e_dev_err("VF %d has caused a PCIe error\n", vf);
+ e_dev_err("TLP: dw0: %8.8x\tdw1: %8.8x\tdw2: "
+ "%8.8x\tdw3: %8.8x\n",
+ dw0, dw1, dw2, dw3);
+
+ /* Find the pci device of the offending VF */
+ vfdev = pci_get_device(PCI_VENDOR_ID_TRUSTNETIC,
+ NGBE_VF_DEVICE_ID, NULL);
+ while (vfdev) {
+ if (vfdev->devfn == (req_id & 0xFF))
+ break;
+ vfdev = pci_get_device(PCI_VENDOR_ID_TRUSTNETIC,
+ NGBE_VF_DEVICE_ID, vfdev);
+ }
+ /*
+ * There's a slim chance the VF could have been hot
+ * plugged, so if it is no longer present we don't need
+		 * to issue the VFLR. Just clean up the AER in that case.
+ */
+ if (vfdev) {
+ ngbe_issue_vf_flr(adapter, vfdev);
+ /* Free device reference count */
+ pci_dev_put(vfdev);
+ }
+
+ pci_cleanup_aer_uncorrect_error_status(pdev);
+ }
+
+ /*
+ * Even though the error may have occurred on the other port
+ * we still need to increment the vf error reference count for
+ * both ports because the I/O resume function will be called
+ * for both of them.
+ */
+ adapter->vferr_refcount++;
+
+ return PCI_ERS_RESULT_RECOVERED;
+
+ skip_bad_vf_detection:
+#endif /* CONFIG_PCI_IOV */
+
+ if (!test_bit(__NGBE_SERVICE_INITED, &adapter->state))
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ rtnl_lock();
+ netif_device_detach(netdev);
+
+ if (state == pci_channel_io_perm_failure) {
+ rtnl_unlock();
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ if (netif_running(netdev))
+ ngbe_close(netdev);
+
+ if (!test_and_set_bit(__NGBE_DISABLED, &adapter->state))
+ pci_disable_device(pdev);
+ rtnl_unlock();
+
+ /* Request a slot reset. */
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * ngbe_io_slot_reset - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Restart the card from scratch, as if from a cold-boot.
+ */
+static pci_ers_result_t ngbe_io_slot_reset(struct pci_dev *pdev)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(pdev);
+ pci_ers_result_t result = PCI_ERS_RESULT_RECOVERED;
+
+ if (pci_enable_device_mem(pdev)) {
+ e_err(probe, "Cannot re-enable PCI device after reset.\n");
+ result = PCI_ERS_RESULT_DISCONNECT;
+ } else {
+ smp_mb__before_atomic();
+ clear_bit(__NGBE_DISABLED, &adapter->state);
+ adapter->hw.hw_addr = adapter->io_addr;
+ pci_set_master(pdev);
+ pci_restore_state(pdev);
+ /*
+ * After second error pci->state_saved is false, this
+ * resets it so EEH doesn't break.
+ */
+ pci_save_state(pdev);
+
+ pci_wake_from_d3(pdev, false);
+
+ ngbe_reset(adapter);
+
+ result = PCI_ERS_RESULT_RECOVERED;
+ }
+
+ pci_cleanup_aer_uncorrect_error_status(pdev);
+
+ return result;
+}
+
+/**
+ * ngbe_io_resume - called when traffic can start flowing again.
+ * @pdev: Pointer to PCI device
+ *
+ * This callback is called when the error recovery driver tells us that
+ * it's OK to resume normal operation.
+ */
+static void ngbe_io_resume(struct pci_dev *pdev)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(pdev);
+ struct net_device *netdev = adapter->netdev;
+
+#ifdef CONFIG_PCI_IOV
+ if (adapter->vferr_refcount) {
+ e_info(drv, "Resuming after VF err\n");
+ adapter->vferr_refcount--;
+ return;
+ }
+#endif
+ rtnl_lock();
+ if (netif_running(netdev))
+ ngbe_open(netdev);
+
+ netif_device_attach(netdev);
+ rtnl_unlock();
+}
+
+static const struct pci_error_handlers ngbe_err_handler = {
+ .error_detected = ngbe_io_error_detected,
+ .slot_reset = ngbe_io_slot_reset,
+ .resume = ngbe_io_resume,
+};
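+
+/*
+ * Illustrative sketch (not part of this patch): the sequence in which the
+ * PCI core drives the handlers registered above during AER recovery. The
+ * function name is hypothetical and exists only to document the contract.
+ */
+static void __maybe_unused ngbe_aer_flow_example(struct pci_dev *pdev)
+{
+	/* 1. Error reported: detach the netdev and vote on recoverability. */
+	if (ngbe_io_error_detected(pdev, pci_channel_io_frozen) ==
+	    PCI_ERS_RESULT_DISCONNECT)
+		return;
+
+	/* 2. Slot was reset: re-enable the device and reset the hardware. */
+	if (ngbe_io_slot_reset(pdev) != PCI_ERS_RESULT_RECOVERED)
+		return;
+
+	/* 3. Recovery done: reopen the interface and reattach. */
+	ngbe_io_resume(pdev);
+}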
+
+struct net_device *ngbe_hw_to_netdev(const struct ngbe_hw *hw)
+{
+ return ((struct ngbe_adapter *)hw->back)->netdev;
+}
+
+struct ngbe_msg *ngbe_hw_to_msg(const struct ngbe_hw *hw)
+{
+ struct ngbe_adapter *adapter =
+ container_of(hw, struct ngbe_adapter, hw);
+ return (struct ngbe_msg *)&adapter->msg_enable;
+}
+
+static struct pci_driver ngbe_driver = {
+ .name = ngbe_driver_name,
+ .id_table = ngbe_pci_tbl,
+ .probe = ngbe_probe,
+ .remove = ngbe_remove,
+#ifdef CONFIG_PM
+ .suspend = ngbe_suspend,
+ .resume = ngbe_resume,
+#endif
+ .shutdown = ngbe_shutdown,
+ .sriov_configure = ngbe_pci_sriov_configure,
+ .err_handler = &ngbe_err_handler
+};
+
+/**
+ * ngbe_init_module - Driver Registration Routine
+ *
+ * ngbe_init_module is the first routine called when the driver is
+ * loaded. All it does is register with the PCI subsystem.
+ **/
+static int __init ngbe_init_module(void)
+{
+	int ret;
+
+	pr_info("%s - version %s\n", ngbe_driver_string, ngbe_driver_version);
+ pr_info("%s\n", ngbe_copyright);
+
+ ngbe_wq = create_singlethread_workqueue(ngbe_driver_name);
+ if (!ngbe_wq) {
+ pr_err("%s: Failed to create workqueue\n", ngbe_driver_name);
+ return -ENOMEM;
+ }
+
+#ifdef CONFIG_NGBE_PROCFS
+ if (ngbe_procfs_topdir_init())
+ pr_info("Procfs failed to initialize topdir\n");
+#endif
+
+#ifdef CONFIG_NGBE_DEBUG_FS
+ ngbe_dbg_init();
+#endif
+
+	ret = pci_register_driver(&ngbe_driver);
+	if (ret) {
+#ifdef CONFIG_NGBE_DEBUG_FS
+		ngbe_dbg_exit();
+#endif
+#ifdef CONFIG_NGBE_PROCFS
+		ngbe_procfs_topdir_exit();
+#endif
+		destroy_workqueue(ngbe_wq);
+	}
+	return ret;
+}
+
+module_init(ngbe_init_module);
+
+/**
+ * ngbe_exit_module - Driver Exit Cleanup Routine
+ *
+ * ngbe_exit_module is called just before the driver is removed
+ * from memory.
+ **/
+static void __exit ngbe_exit_module(void)
+{
+ pci_unregister_driver(&ngbe_driver);
+#ifdef CONFIG_NGBE_PROCFS
+ ngbe_procfs_topdir_exit();
+#endif
+ destroy_workqueue(ngbe_wq);
+#ifdef CONFIG_NGBE_DEBUG_FS
+ ngbe_dbg_exit();
+#endif /* CONFIG_NGBE_DEBUG_FS */
+}
+
+module_exit(ngbe_exit_module);
+
+/* ngbe_main.c */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_mbx.c b/drivers/net/ethernet/netswift/ngbe/ngbe_mbx.c
new file mode 100644
index 0000000000000..34167f78c207f
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_mbx.c
@@ -0,0 +1,687 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include "ngbe_type.h"
+#include "ngbe.h"
+#include "ngbe_mbx.h"
+
+
+/**
+ * ngbe_read_mbx - Reads a message from the mailbox
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @mbx_id: id of mailbox to read
+ *
+ * returns SUCCESS if it successfully read a message from the buffer
+ **/
+int ngbe_read_mbx(struct ngbe_hw *hw, u32 *msg, u16 size, u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int err = NGBE_ERR_MBX;
+
+ /* limit read to size of mailbox */
+ if (size > mbx->size)
+ size = mbx->size;
+
+ err = TCALL(hw, mbx.ops.read, msg, size, mbx_id);
+
+ return err;
+}
+
+/**
+ * ngbe_write_mbx - Write a message to the mailbox
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @mbx_id: id of mailbox to write
+ *
+ * returns SUCCESS if it successfully copied message into the buffer
+ **/
+int ngbe_write_mbx(struct ngbe_hw *hw, u32 *msg, u16 size, u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int err = 0;
+
+ if (size > mbx->size) {
+ err = NGBE_ERR_MBX;
+ ERROR_REPORT2(NGBE_ERROR_ARGUMENT,
+ "Invalid mailbox message size %d", size);
+ } else
+ err = TCALL(hw, mbx.ops.write, msg, size, mbx_id);
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_msg - checks to see if someone sent us mail
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to check
+ *
+ * returns SUCCESS if the Status bit was found or else ERR_MBX
+ **/
+int ngbe_check_for_msg(struct ngbe_hw *hw, u16 mbx_id)
+{
+ int err = NGBE_ERR_MBX;
+
+ err = TCALL(hw, mbx.ops.check_for_msg, mbx_id);
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_ack - checks to see if someone sent us ACK
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to check
+ *
+ * returns SUCCESS if the Status bit was found or else ERR_MBX
+ **/
+int ngbe_check_for_ack(struct ngbe_hw *hw, u16 mbx_id)
+{
+ int err = NGBE_ERR_MBX;
+
+ err = TCALL(hw, mbx.ops.check_for_ack, mbx_id);
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_rst - checks to see if other side has reset
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to check
+ *
+ * returns SUCCESS if the Status bit was found or else ERR_MBX
+ **/
+int ngbe_check_for_rst(struct ngbe_hw *hw, u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int err = NGBE_ERR_MBX;
+
+ if (mbx->ops.check_for_rst)
+ err = mbx->ops.check_for_rst(hw, mbx_id);
+
+ return err;
+}
+
+/**
+ * ngbe_poll_for_msg - Wait for message notification
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to poll
+ *
+ * returns SUCCESS if it successfully received a message notification
+ **/
+int ngbe_poll_for_msg(struct ngbe_hw *hw, u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int countdown = mbx->timeout;
+
+ if (!countdown || !mbx->ops.check_for_msg)
+ goto out;
+
+ while (countdown && TCALL(hw, mbx.ops.check_for_msg, mbx_id)) {
+ countdown--;
+ if (!countdown)
+ break;
+ udelay(mbx->udelay);
+ }
+
+ if (countdown == 0)
+ ERROR_REPORT2(NGBE_ERROR_POLLING,
+ "Polling for VF%d mailbox message timedout", mbx_id);
+
+out:
+ return countdown ? 0 : NGBE_ERR_MBX;
+}
+
+/**
+ * ngbe_poll_for_ack - Wait for message acknowledgement
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to poll
+ *
+ * returns SUCCESS if it successfully received a message acknowledgement
+ **/
+int ngbe_poll_for_ack(struct ngbe_hw *hw, u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int countdown = mbx->timeout;
+
+ if (!countdown || !mbx->ops.check_for_ack)
+ goto out;
+
+ while (countdown && TCALL(hw, mbx.ops.check_for_ack, mbx_id)) {
+ countdown--;
+ if (!countdown)
+ break;
+ udelay(mbx->udelay);
+ }
+
+ if (countdown == 0)
+ ERROR_REPORT2(NGBE_ERROR_POLLING,
+ "Polling for VF%d mailbox ack timedout", mbx_id);
+
+out:
+ return countdown ? 0 : NGBE_ERR_MBX;
+}
+
+/**
+ * ngbe_read_posted_mbx - Wait for message notification and receive message
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @mbx_id: id of mailbox to write
+ *
+ * returns SUCCESS if it successfully received a message notification and
+ * copied it into the receive buffer.
+ **/
+int ngbe_read_posted_mbx(struct ngbe_hw *hw, u32 *msg, u16 size, u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int err = NGBE_ERR_MBX;
+
+ if (!mbx->ops.read)
+ goto out;
+
+ err = ngbe_poll_for_msg(hw, mbx_id);
+
+	/* if a message notification arrived, read it; otherwise we timed out */
+ if (!err)
+ err = TCALL(hw, mbx.ops.read, msg, size, mbx_id);
+out:
+ return err;
+}
+
+/**
+ * ngbe_write_posted_mbx - Write a message to the mailbox, wait for ack
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @mbx_id: id of mailbox to write
+ *
+ * returns SUCCESS if it successfully copied message into the buffer and
+ * received an ack to that message within delay * timeout period
+ **/
+int ngbe_write_posted_mbx(struct ngbe_hw *hw, u32 *msg, u16 size,
+ u16 mbx_id)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+ int err;
+
+ /* exit if either we can't write or there isn't a defined timeout */
+ if (!mbx->timeout)
+ return NGBE_ERR_MBX;
+
+ /* send msg */
+ err = TCALL(hw, mbx.ops.write, msg, size, mbx_id);
+
+ /* if msg sent wait until we receive an ack */
+ if (!err)
+ err = ngbe_poll_for_ack(hw, mbx_id);
+
+ return err;
+}
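+
+/*
+ * Illustrative sketch (not part of this patch): a typical VF-side posted
+ * exchange built on the helpers above, negotiating mailbox API 1.3. The
+ * wrapper name is hypothetical; message IDs come from ngbe_mbx.h.
+ */
+static int __maybe_unused ngbe_negotiate_api_example(struct ngbe_hw *hw)
+{
+	u32 msg[2];
+	int err;
+
+	msg[0] = NGBE_VF_API_NEGOTIATE;
+	msg[1] = ngbe_mbox_api_13;
+
+	/* write the request and poll until the PF acks receipt */
+	err = ngbe_write_posted_mbx(hw, msg, 2, 0);
+	if (err)
+		return err;
+
+	/* poll for the PF's reply and read it back */
+	err = ngbe_read_posted_mbx(hw, msg, 2, 0);
+	if (err)
+		return err;
+
+	/* the PF echoes the request OR'd with ACK (success) or NACK */
+	if (msg[0] == (NGBE_VF_API_NEGOTIATE | NGBE_VT_MSGTYPE_ACK))
+		return 0;
+	return NGBE_ERR_MBX;
+}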
+
+/**
+ * ngbe_init_mbx_ops - Initialize MB function pointers
+ * @hw: pointer to the HW structure
+ *
+ * Sets up the mailbox read and write message function pointers
+ **/
+void ngbe_init_mbx_ops(struct ngbe_hw *hw)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+
+ mbx->ops.read_posted = ngbe_read_posted_mbx;
+ mbx->ops.write_posted = ngbe_write_posted_mbx;
+}
+
+/**
+ * ngbe_read_v2p_mailbox - read v2p mailbox
+ * @hw: pointer to the HW structure
+ *
+ * This function is used to read the v2p mailbox without losing the read to
+ * clear status bits.
+ **/
+u32 ngbe_read_v2p_mailbox(struct ngbe_hw *hw)
+{
+ u32 v2p_mailbox = rd32(hw, NGBE_VXMAILBOX);
+
+ v2p_mailbox |= hw->mbx.v2p_mailbox;
+ hw->mbx.v2p_mailbox |= v2p_mailbox & NGBE_VXMAILBOX_R2C_BITS;
+
+ return v2p_mailbox;
+}
+
+/**
+ * ngbe_check_for_bit_vf - Determine if a status bit was set
+ * @hw: pointer to the HW structure
+ * @mask: bitmask for bits to be tested and cleared
+ *
+ * This function is used to check for the read to clear bits within
+ * the V2P mailbox.
+ **/
+int ngbe_check_for_bit_vf(struct ngbe_hw *hw, u32 mask)
+{
+ u32 mailbox = ngbe_read_v2p_mailbox(hw);
+
+ hw->mbx.v2p_mailbox &= ~mask;
+
+ return (mailbox & mask ? 0 : NGBE_ERR_MBX);
+}
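+
+/*
+ * Illustrative sketch (not part of this patch): why the caching above
+ * matters. One register read clears all read-to-clear (R2C) bits in
+ * hardware, so ngbe_read_v2p_mailbox() latches them in hw->mbx.v2p_mailbox
+ * and each ngbe_check_for_bit_vf() call consumes exactly one of them.
+ * The function name is hypothetical.
+ */
+static void __maybe_unused ngbe_r2c_example(struct ngbe_hw *hw)
+{
+	/* this read may clear every R2C bit in the register... */
+	bool msg_pending = !ngbe_check_for_bit_vf(hw, NGBE_VXMAILBOX_PFSTS);
+	/* ...but a latched PFACK is still visible to this later check */
+	bool ack_pending = !ngbe_check_for_bit_vf(hw, NGBE_VXMAILBOX_PFACK);
+
+	(void)msg_pending;
+	(void)ack_pending;
+}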
+
+/**
+ * ngbe_check_for_msg_vf - checks to see if the PF has sent mail
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to check
+ *
+ * returns SUCCESS if the PF has set the Status bit or else ERR_MBX
+ **/
+int ngbe_check_for_msg_vf(struct ngbe_hw *hw, u16 mbx_id)
+{
+ int err = NGBE_ERR_MBX;
+
+ UNREFERENCED_PARAMETER(mbx_id);
+
+ /* read clear the pf sts bit */
+ if (!ngbe_check_for_bit_vf(hw, NGBE_VXMAILBOX_PFSTS)) {
+ err = 0;
+ hw->mbx.stats.reqs++;
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_ack_vf - checks to see if the PF has ACK'd
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to check
+ *
+ * returns SUCCESS if the PF has set the ACK bit or else ERR_MBX
+ **/
+int ngbe_check_for_ack_vf(struct ngbe_hw *hw, u16 mbx_id)
+{
+ int err = NGBE_ERR_MBX;
+
+ UNREFERENCED_PARAMETER(mbx_id);
+
+ /* read clear the pf ack bit */
+ if (!ngbe_check_for_bit_vf(hw, NGBE_VXMAILBOX_PFACK)) {
+ err = 0;
+ hw->mbx.stats.acks++;
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_rst_vf - checks to see if the PF has reset
+ * @hw: pointer to the HW structure
+ * @mbx_id: id of mailbox to check
+ *
+ * returns true if the PF has set the reset done bit or else false
+ **/
+int ngbe_check_for_rst_vf(struct ngbe_hw *hw, u16 mbx_id)
+{
+ int err = NGBE_ERR_MBX;
+
+ UNREFERENCED_PARAMETER(mbx_id);
+ if (!ngbe_check_for_bit_vf(hw, (NGBE_VXMAILBOX_RSTD |
+ NGBE_VXMAILBOX_RSTI))) {
+ err = 0;
+ hw->mbx.stats.rsts++;
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_obtain_mbx_lock_vf - obtain mailbox lock
+ * @hw: pointer to the HW structure
+ *
+ * return SUCCESS if we obtained the mailbox lock
+ **/
+int ngbe_obtain_mbx_lock_vf(struct ngbe_hw *hw)
+{
+ int err = NGBE_ERR_MBX;
+ u32 mailbox;
+
+ /* Take ownership of the buffer */
+ wr32(hw, NGBE_VXMAILBOX, NGBE_VXMAILBOX_VFU);
+
+ /* reserve mailbox for vf use */
+ mailbox = ngbe_read_v2p_mailbox(hw);
+ if (mailbox & NGBE_VXMAILBOX_VFU)
+ err = 0;
+ else
+ ERROR_REPORT2(NGBE_ERROR_POLLING,
+ "Failed to obtain mailbox lock for VF");
+
+ return err;
+}
+
+/**
+ * ngbe_write_mbx_vf - Write a message to the mailbox
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @mbx_id: id of mailbox to write
+ *
+ * returns SUCCESS if it successfully copied message into the buffer
+ **/
+int ngbe_write_mbx_vf(struct ngbe_hw *hw, u32 *msg, u16 size,
+ u16 mbx_id)
+{
+ int err;
+ u16 i;
+
+ UNREFERENCED_PARAMETER(mbx_id);
+
+ /* lock the mailbox to prevent pf/vf race condition */
+ err = ngbe_obtain_mbx_lock_vf(hw);
+ if (err)
+ goto out_no_write;
+
+ /* flush msg and acks as we are overwriting the message buffer */
+ ngbe_check_for_msg_vf(hw, 0);
+ ngbe_check_for_ack_vf(hw, 0);
+
+ /* copy the caller specified message to the mailbox memory buffer */
+ for (i = 0; i < size; i++)
+ wr32a(hw, NGBE_VXMBMEM, i, msg[i]);
+
+ /* update stats */
+ hw->mbx.stats.msgs_tx++;
+
+ /* Drop VFU and interrupt the PF to tell it a message has been sent */
+ wr32(hw, NGBE_VXMAILBOX, NGBE_VXMAILBOX_REQ);
+
+out_no_write:
+ return err;
+}
+
+/**
+ * ngbe_read_mbx_vf - Reads a message from the inbox intended for vf
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @mbx_id: id of mailbox to read
+ *
+ * returns SUCCESS if it successfully read a message from the buffer
+ **/
+int ngbe_read_mbx_vf(struct ngbe_hw *hw, u32 *msg, u16 size,
+ u16 mbx_id)
+{
+ int err = 0;
+ u16 i;
+ UNREFERENCED_PARAMETER(mbx_id);
+
+ /* lock the mailbox to prevent pf/vf race condition */
+ err = ngbe_obtain_mbx_lock_vf(hw);
+ if (err)
+ goto out_no_read;
+
+ /* copy the message from the mailbox memory buffer */
+ for (i = 0; i < size; i++)
+ msg[i] = rd32a(hw, NGBE_VXMBMEM, i);
+
+ /* Acknowledge receipt and release mailbox, then we're done */
+ wr32(hw, NGBE_VXMAILBOX, NGBE_VXMAILBOX_ACK);
+
+ /* update stats */
+ hw->mbx.stats.msgs_rx++;
+
+out_no_read:
+ return err;
+}
+
+/**
+ * ngbe_init_mbx_params_vf - set initial values for vf mailbox
+ * @hw: pointer to the HW structure
+ *
+ * Initializes the hw->mbx struct to correct values for vf mailbox
+ */
+void ngbe_init_mbx_params_vf(struct ngbe_hw *hw)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+
+ /* start mailbox as timed out and let the reset_hw call set the timeout
+ * value to begin communications */
+ mbx->timeout = 0;
+ mbx->udelay = NGBE_VF_MBX_INIT_DELAY;
+
+ mbx->size = NGBE_VXMAILBOX_SIZE;
+
+ mbx->ops.read = ngbe_read_mbx_vf;
+ mbx->ops.write = ngbe_write_mbx_vf;
+ mbx->ops.read_posted = ngbe_read_posted_mbx;
+ mbx->ops.write_posted = ngbe_write_posted_mbx;
+ mbx->ops.check_for_msg = ngbe_check_for_msg_vf;
+ mbx->ops.check_for_ack = ngbe_check_for_ack_vf;
+ mbx->ops.check_for_rst = ngbe_check_for_rst_vf;
+
+ mbx->stats.msgs_tx = 0;
+ mbx->stats.msgs_rx = 0;
+ mbx->stats.reqs = 0;
+ mbx->stats.acks = 0;
+ mbx->stats.rsts = 0;
+}
+
+int ngbe_check_for_bit_pf(struct ngbe_hw *hw, u32 mask)
+{
+ u32 mbvficr = rd32(hw, NGBE_MBVFICR);
+ int err = NGBE_ERR_MBX;
+
+ if (mbvficr & mask) {
+ err = 0;
+ wr32(hw, NGBE_MBVFICR, mask);
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_msg_pf - checks to see if the VF has sent mail
+ * @hw: pointer to the HW structure
+ * @vf: the VF index
+ *
+ * returns SUCCESS if the VF has set the Status bit or else ERR_MBX
+ **/
+int ngbe_check_for_msg_pf(struct ngbe_hw *hw, u16 vf)
+{
+ int err = NGBE_ERR_MBX;
+ u32 vf_bit = vf;
+
+ if (!ngbe_check_for_bit_pf(hw, NGBE_MBVFICR_VFREQ_VF1 << vf_bit)) {
+ err = 0;
+ hw->mbx.stats.reqs++;
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_ack_pf - checks to see if the VF has ACKed
+ * @hw: pointer to the HW structure
+ * @vf: the VF index
+ *
+ * returns SUCCESS if the VF has set the Status bit or else ERR_MBX
+ **/
+int ngbe_check_for_ack_pf(struct ngbe_hw *hw, u16 vf)
+{
+ int err = NGBE_ERR_MBX;
+ u32 vf_bit = vf;
+
+ if (!ngbe_check_for_bit_pf(hw, NGBE_MBVFICR_VFACK_VF1 << vf_bit)) {
+ err = 0;
+ hw->mbx.stats.acks++;
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_check_for_rst_pf - checks to see if the VF has reset
+ * @hw: pointer to the HW structure
+ * @vf: the VF index
+ *
+ * returns SUCCESS if the VF has set the Status bit or else ERR_MBX
+ **/
+int ngbe_check_for_rst_pf(struct ngbe_hw *hw, u16 vf)
+{
+ u32 vflre = 0;
+ int err = NGBE_ERR_MBX;
+
+ vflre = rd32(hw, NGBE_VFLRE);
+
+ if (vflre & (1 << vf)) {
+ err = 0;
+ wr32(hw, NGBE_VFLREC, (1 << vf));
+ hw->mbx.stats.rsts++;
+ }
+
+ return err;
+}
+
+/**
+ * ngbe_obtain_mbx_lock_pf - obtain mailbox lock
+ * @hw: pointer to the HW structure
+ * @vf: the VF index
+ *
+ * return SUCCESS if we obtained the mailbox lock
+ **/
+int ngbe_obtain_mbx_lock_pf(struct ngbe_hw *hw, u16 vf)
+{
+ int err = NGBE_ERR_MBX;
+ u32 mailbox;
+
+ /* Take ownership of the buffer */
+ wr32(hw, NGBE_PXMAILBOX(vf), NGBE_PXMAILBOX_PFU);
+
+ /* reserve mailbox for vf use */
+ mailbox = rd32(hw, NGBE_PXMAILBOX(vf));
+ if (mailbox & NGBE_PXMAILBOX_PFU)
+ err = 0;
+ else
+ ERROR_REPORT2(NGBE_ERROR_POLLING,
+ "Failed to obtain mailbox lock for PF%d", vf);
+
+ return err;
+}
+
+/**
+ * ngbe_write_mbx_pf - Places a message in the mailbox
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @vf: the VF index
+ *
+ * returns SUCCESS if it successfully copied message into the buffer
+ **/
+int ngbe_write_mbx_pf(struct ngbe_hw *hw, u32 *msg, u16 size,
+ u16 vf)
+{
+ int err;
+ u16 i;
+
+ /* lock the mailbox to prevent pf/vf race condition */
+ err = ngbe_obtain_mbx_lock_pf(hw, vf);
+ if (err)
+ goto out_no_write;
+
+ /* flush msg and acks as we are overwriting the message buffer */
+ ngbe_check_for_msg_pf(hw, vf);
+ ngbe_check_for_ack_pf(hw, vf);
+
+ /* copy the caller specified message to the mailbox memory buffer */
+ for (i = 0; i < size; i++)
+ wr32a(hw, NGBE_PXMBMEM(vf), i, msg[i]);
+
+	/* Interrupt VF to tell it a message has been sent and release buffer */
+ wr32(hw, NGBE_PXMAILBOX(vf), NGBE_PXMAILBOX_STS);
+
+ /* update stats */
+ hw->mbx.stats.msgs_tx++;
+
+out_no_write:
+ return err;
+}
+
+/**
+ * ngbe_read_mbx_pf - Read a message from the mailbox
+ * @hw: pointer to the HW structure
+ * @msg: The message buffer
+ * @size: Length of buffer
+ * @vf: the VF index
+ *
+ * This function copies a message from the mailbox buffer to the caller's
+ * memory buffer. The presumption is that the caller knows that there was
+ * a message due to a VF request so no polling for message is needed.
+ **/
+int ngbe_read_mbx_pf(struct ngbe_hw *hw, u32 *msg, u16 size,
+ u16 vf)
+{
+ int err;
+ u16 i;
+
+ /* lock the mailbox to prevent pf/vf race condition */
+ err = ngbe_obtain_mbx_lock_pf(hw, vf);
+ if (err)
+ goto out_no_read;
+
+	/* copy the message from the mailbox memory buffer */
+ for (i = 0; i < size; i++)
+ msg[i] = rd32a(hw, NGBE_PXMBMEM(vf), i);
+
+ /* Acknowledge the message and release buffer */
+ wr32(hw, NGBE_PXMAILBOX(vf), NGBE_PXMAILBOX_ACK);
+
+ /* update stats */
+ hw->mbx.stats.msgs_rx++;
+
+out_no_read:
+ return err;
+}
+
+/**
+ * ngbe_init_mbx_params_pf - set initial values for pf mailbox
+ * @hw: pointer to the HW structure
+ *
+ * Initializes the hw->mbx struct to correct values for pf mailbox
+ */
+void ngbe_init_mbx_params_pf(struct ngbe_hw *hw)
+{
+ struct ngbe_mbx_info *mbx = &hw->mbx;
+
+ mbx->timeout = 0;
+ mbx->udelay = 0;
+
+ mbx->size = NGBE_VXMAILBOX_SIZE;
+
+ mbx->ops.read = ngbe_read_mbx_pf;
+ mbx->ops.write = ngbe_write_mbx_pf;
+ mbx->ops.read_posted = ngbe_read_posted_mbx;
+ mbx->ops.write_posted = ngbe_write_posted_mbx;
+ mbx->ops.check_for_msg = ngbe_check_for_msg_pf;
+ mbx->ops.check_for_ack = ngbe_check_for_ack_pf;
+ mbx->ops.check_for_rst = ngbe_check_for_rst_pf;
+
+ mbx->stats.msgs_tx = 0;
+ mbx->stats.msgs_rx = 0;
+ mbx->stats.reqs = 0;
+ mbx->stats.acks = 0;
+ mbx->stats.rsts = 0;
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_mbx.h b/drivers/net/ethernet/netswift/ngbe/ngbe_mbx.h
new file mode 100644
index 0000000000000..5e89fa180f968
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_mbx.h
@@ -0,0 +1,167 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+#ifndef _NGBE_MBX_H_
+#define _NGBE_MBX_H_
+
+#define NGBE_VXMAILBOX_SIZE (16)
+
+/**
+ * VF Registers
+ **/
+#define NGBE_VXMAILBOX 0x00600
+#define NGBE_VXMAILBOX_REQ ((0x1) << 0) /* Request for PF Ready bit */
+#define NGBE_VXMAILBOX_ACK ((0x1) << 1) /* Ack PF message received */
+#define NGBE_VXMAILBOX_VFU ((0x1) << 2) /* VF owns the mailbox buffer */
+#define NGBE_VXMAILBOX_PFU ((0x1) << 3) /* PF owns the mailbox buffer */
+#define NGBE_VXMAILBOX_PFSTS ((0x1) << 4) /* PF wrote a message in the MB */
+#define NGBE_VXMAILBOX_PFACK ((0x1) << 5) /* PF ack the previous VF msg */
+#define NGBE_VXMAILBOX_RSTI ((0x1) << 6) /* PF has reset indication */
+#define NGBE_VXMAILBOX_RSTD ((0x1) << 7) /* PF has indicated reset done */
+#define NGBE_VXMAILBOX_R2C_BITS (NGBE_VXMAILBOX_RSTD | \
+ NGBE_VXMAILBOX_PFSTS | NGBE_VXMAILBOX_PFACK)
+
+#define NGBE_VXMBMEM 0x00C00 /* 16*4B */
+
+/**
+ * PF Registers
+ **/
+#define NGBE_PXMAILBOX(i) (0x00600 + (4 * (i))) /* i=[0,7] */
+#define NGBE_PXMAILBOX_STS ((0x1) << 0) /* Initiate message send to VF */
+#define NGBE_PXMAILBOX_ACK ((0x1) << 1) /* Ack message recv'd from VF */
+#define NGBE_PXMAILBOX_VFU ((0x1) << 2) /* VF owns the mailbox buffer */
+#define NGBE_PXMAILBOX_PFU ((0x1) << 3) /* PF owns the mailbox buffer */
+#define NGBE_PXMAILBOX_RVFU ((0x1) << 4) /* Reset VFU - used when VF stuck */
+
+#define NGBE_PXMBMEM(i) (0x5000 + (64 * (i))) /* i=[0,7] */
+
+#define NGBE_VFLRP(i) (0x00490 + (4 * (i))) /* i=[0,1] */
+#define NGBE_VFLRE 0x004A0
+#define NGBE_VFLREC 0x004A8
+
+/* SR-IOV specific macros */
+#define NGBE_MBVFICR 0x00480
+#define NGBE_MBVFICR_INDEX(vf) ((vf) >> 4)
+#define NGBE_MBVFICR_VFREQ_MASK (0x0000FFFF) /* bits for VF messages */
+#define NGBE_MBVFICR_VFREQ_VF1 (0x00000001) /* bit for VF 1 message */
+#define NGBE_MBVFICR_VFACK_MASK (0xFFFF0000) /* bits for VF acks */
+#define NGBE_MBVFICR_VFACK_VF1 (0x00010000) /* bit for VF 1 ack */
+
+/**
+ * Messages
+ **/
+/* If it's a NGBE_VF_* msg then it originates in the VF and is sent to the
+ * PF. The reverse is true if it is NGBE_PF_*.
+ * Message ACKs are the original message value OR'd with NGBE_VT_MSGTYPE_ACK
+ * (NACKs use NGBE_VT_MSGTYPE_NACK).
+ */
+#define NGBE_VT_MSGTYPE_ACK 0x80000000 /* Messages below or'd with
+ * this are the ACK */
+#define NGBE_VT_MSGTYPE_NACK 0x40000000 /* Messages below or'd with
+ * this are the NACK */
+#define NGBE_VT_MSGTYPE_CTS 0x20000000 /* Indicates that VF is still
+ * clear to send requests */
+#define NGBE_VT_MSGINFO_SHIFT 16
+/* bits 23:16 are used for extra info for certain messages */
+#define NGBE_VT_MSGINFO_MASK (0xFF << NGBE_VT_MSGINFO_SHIFT)
+
+/* definitions to support mailbox API version negotiation */
+
+/*
+ * each element denotes a version of the API; existing numbers may not
+ * change; any additions must go at the end
+ */
+enum ngbe_pfvf_api_rev {
+ ngbe_mbox_api_null,
+ ngbe_mbox_api_10, /* API version 1.0, linux/freebsd VF driver */
+ ngbe_mbox_api_11, /* API version 1.1, linux/freebsd VF driver */
+ ngbe_mbox_api_12, /* API version 1.2, linux/freebsd VF driver */
+ ngbe_mbox_api_13, /* API version 1.3, linux/freebsd VF driver */
+ ngbe_mbox_api_20, /* API version 2.0, solaris Phase1 VF driver */
+ ngbe_mbox_api_unknown, /* indicates that API version is not known */
+};
+
+/* mailbox API, legacy requests */
+#define NGBE_VF_RESET 0x01 /* VF requests reset */
+#define NGBE_VF_SET_MAC_ADDR 0x02 /* VF requests PF to set MAC addr */
+#define NGBE_VF_SET_MULTICAST 0x03 /* VF requests PF to set MC addr */
+#define NGBE_VF_SET_VLAN 0x04 /* VF requests PF to set VLAN */
+
+/* mailbox API, version 1.0 VF requests */
+#define NGBE_VF_SET_LPE 0x05 /* VF requests PF to set VMOLR.LPE */
+#define NGBE_VF_SET_MACVLAN 0x06 /* VF requests PF for unicast filter */
+#define NGBE_VF_API_NEGOTIATE 0x08 /* negotiate API version */
+
+/* mailbox API, version 1.1 VF requests */
+#define NGBE_VF_GET_QUEUES 0x09 /* get queue configuration */
+
+/* mailbox API, version 1.2 VF requests */
+#define NGBE_VF_GET_RETA 0x0a /* VF request for RETA */
+#define NGBE_VF_GET_RSS_KEY 0x0b /* get RSS key */
+#define NGBE_VF_UPDATE_XCAST_MODE 0x0c
+#define NGBE_VF_BACKUP 0x8001 /* VF requests backup */
+
+#define NGBE_VF_GET_LINK_STATUS 0x20 /* VF get link status from PF */
+
+/* mode choices for NGBE_VF_UPDATE_XCAST_MODE */
+enum ngbevf_xcast_modes {
+ NGBEVF_XCAST_MODE_NONE = 0,
+ NGBEVF_XCAST_MODE_MULTI,
+ NGBEVF_XCAST_MODE_ALLMULTI,
+ NGBEVF_XCAST_MODE_PROMISC,
+};
+
+/* GET_QUEUES return data indices within the mailbox */
+#define NGBE_VF_TX_QUEUES 1 /* number of Tx queues supported */
+#define NGBE_VF_RX_QUEUES 2 /* number of Rx queues supported */
+#define NGBE_VF_TRANS_VLAN 3 /* Indication of port vlan */
+#define NGBE_VF_DEF_QUEUE 4 /* Default queue offset */
+
+/* length of permanent address message returned from PF */
+#define NGBE_VF_PERMADDR_MSG_LEN 4
+/* word in permanent address message with the current multicast type */
+#define NGBE_VF_MC_TYPE_WORD 3
+
+#define NGBE_PF_CONTROL_MSG 0x0100 /* PF control message */
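+
+/*
+ * Illustrative sketch (not part of this patch): how a request word packs a
+ * message ID with extra info in bits 23:16, and how a VF classifies the
+ * PF's reply. The helper names are hypothetical.
+ */
+static inline u32 ngbe_mbx_pack_example(void)
+{
+	/* NGBE_VF_SET_MACVLAN for unicast filter index 2 */
+	return NGBE_VF_SET_MACVLAN | (2 << NGBE_VT_MSGINFO_SHIFT);
+}
+
+static inline bool ngbe_mbx_reply_nacked_example(u32 reply)
+{
+	return !!(reply & NGBE_VT_MSGTYPE_NACK);
+}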
+
+/* mailbox API, version 2.0 VF requests */
+#define NGBE_VF_API_NEGOTIATE 0x08 /* negotiate API version */
+#define NGBE_VF_GET_QUEUES 0x09 /* get queue configuration */
+#define NGBE_VF_ENABLE_MACADDR 0x0A /* enable MAC address */
+#define NGBE_VF_DISABLE_MACADDR 0x0B /* disable MAC address */
+#define NGBE_VF_GET_MACADDRS 0x0C /* get all configured MAC addrs */
+#define NGBE_VF_SET_MCAST_PROMISC 0x0D /* enable multicast promiscuous */
+#define NGBE_VF_GET_MTU 0x0E /* get bounds on MTU */
+#define NGBE_VF_SET_MTU 0x0F /* set a specific MTU */
+
+/* mailbox API, version 2.0 PF requests */
+#define NGBE_PF_TRANSPARENT_VLAN 0x0101 /* enable transparent vlan */
+
+#define NGBE_VF_MBX_INIT_TIMEOUT 2000 /* number of retries on mailbox */
+#define NGBE_VF_MBX_INIT_DELAY 500 /* microseconds between retries */
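+
+/*
+ * With these defaults the posted-mailbox helpers poll for at most
+ * NGBE_VF_MBX_INIT_TIMEOUT * NGBE_VF_MBX_INIT_DELAY = 2000 * 500 us = 1 s
+ * before reporting NGBE_ERR_MBX.
+ */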
+
+int ngbe_read_mbx(struct ngbe_hw *, u32 *, u16, u16);
+int ngbe_write_mbx(struct ngbe_hw *, u32 *, u16, u16);
+int ngbe_read_posted_mbx(struct ngbe_hw *, u32 *, u16, u16);
+int ngbe_write_posted_mbx(struct ngbe_hw *, u32 *, u16, u16);
+int ngbe_check_for_msg(struct ngbe_hw *, u16);
+int ngbe_check_for_ack(struct ngbe_hw *, u16);
+int ngbe_check_for_rst(struct ngbe_hw *, u16);
+void ngbe_init_mbx_ops(struct ngbe_hw *hw);
+void ngbe_init_mbx_params_vf(struct ngbe_hw *);
+void ngbe_init_mbx_params_pf(struct ngbe_hw *);
+
+#endif /* _NGBE_MBX_H_ */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_param.c b/drivers/net/ethernet/netswift/ngbe/ngbe_param.c
new file mode 100644
index 0000000000000..92f0dd0f32734
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_param.c
@@ -0,0 +1,839 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+
+#include <linux/types.h>
+#include <linux/module.h>
+
+#include "ngbe.h"
+
+/* This is the only thing that needs to be changed to adjust the
+ * maximum number of ports that the driver can manage.
+ */
+#define NGBE_MAX_NIC 32
+#define OPTION_UNSET -1
+#define OPTION_DISABLED 0
+#define OPTION_ENABLED 1
+
+#define STRINGIFY(foo) #foo /* magic for getting defines into strings */
+#define XSTRINGIFY(bar) STRINGIFY(bar)
+
+/* All parameters are treated the same, as an integer array of values.
+ * This macro just reduces the need to repeat the same declaration code
+ * over and over (plus this helps to avoid typo bugs).
+ */
+
+#define NGBE_PARAM_INIT { [0 ... NGBE_MAX_NIC] = OPTION_UNSET }
+
+#define NGBE_PARAM(X, desc) \
+ static int X[NGBE_MAX_NIC + 1] = NGBE_PARAM_INIT; \
+ static unsigned int num_##X; \
+ module_param_array(X, int, &num_##X, 0); \
+ MODULE_PARM_DESC(X, desc);
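+
+/*
+ * Illustrative sketch (not part of this patch): NGBE_PARAM(MQ, "...")
+ * below expands to roughly the following, so each option is a per-board
+ * array plus a count of how many boards the user supplied values for.
+ */
+#if 0
+static int MQ[NGBE_MAX_NIC + 1] = { [0 ... NGBE_MAX_NIC] = OPTION_UNSET };
+static unsigned int num_MQ;
+module_param_array(MQ, int, &num_MQ, 0);
+MODULE_PARM_DESC(MQ, "...");
+#endif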
+
+/* IntMode (Interrupt Mode)
+ *
+ * Valid Range: 0-2
+ * - 0 - Legacy Interrupt
+ * - 1 - MSI Interrupt
+ * - 2 - MSI-X Interrupt(s)
+ *
+ * Default Value: 2
+ */
+NGBE_PARAM(InterruptType, "Change Interrupt Mode (0=Legacy, 1=MSI, 2=MSI-X), "
+ "default IntMode (deprecated)");
+NGBE_PARAM(IntMode, "Change Interrupt Mode (0=Legacy, 1=MSI, 2=MSI-X), "
+ "default 2");
+#define NGBE_INT_LEGACY 0
+#define NGBE_INT_MSI 1
+#define NGBE_INT_MSIX 2
+#define NGBE_DEFAULT_INT NGBE_INT_MSIX
+
+/* MQ - Multiple Queue enable/disable
+ *
+ * Valid Range: 0, 1
+ * - 0 - disables MQ
+ * - 1 - enables MQ
+ *
+ * Default Value: 1
+ */
+
+NGBE_PARAM(MQ, "Disable or enable Multiple Queues, default 1");
+
+/* RSS - Receive-Side Scaling (RSS) Descriptor Queues
+ *
+ * Valid Range: 0-64
+ * - 0 - enables RSS and sets the Desc. Q's to min(64, num_online_cpus()).
+ * - 1-64 - enables RSS and sets the Desc. Q's to the specified value.
+ *
+ * Default Value: 0
+ */
+
+NGBE_PARAM(RSS, "Number of Receive-Side Scaling Descriptor Queues, "
+ "default 0=number of cpus");
+
+/* VMDQ - Virtual Machine Device Queues (VMDQ)
+ *
+ * Valid Range: 1-16
+ * - 1 Disables VMDQ by allocating only a single queue.
+ * - 2-16 - enables VMDQ and sets the Desc. Q's to the specified value.
+ *
+ * Default Value: 1
+ */
+
+#define NGBE_DEFAULT_NUM_VMDQ 8
+
+NGBE_PARAM(VMDQ, "Number of Virtual Machine Device Queues: 0/1 = disable, "
+ "2-16 enable (default=" XSTRINGIFY(NGBE_DEFAULT_NUM_VMDQ) ")");
+
+#ifdef CONFIG_PCI_IOV
+/* max_vfs - SR I/O Virtualization
+ *
+ * Valid Range: 0-63
+ * - 0 Disables SR-IOV
+ * - 1-63 - enables SR-IOV and sets the number of VFs enabled
+ *
+ * Default Value: 0
+ */
+
+#define MAX_SRIOV_VFS 8
+
+NGBE_PARAM(max_vfs, "Number of Virtual Functions: 0 = disable (default), "
+ "1-" XSTRINGIFY(MAX_SRIOV_VFS) " = enable "
+ "this many VFs");
+
+/* VEPA - Set internal bridge to VEPA mode
+ *
+ * Valid Range: 0-1
+ * - 0 Set bridge to VEB mode
+ * - 1 Set bridge to VEPA mode
+ *
+ * Default Value: 0
+ */
+/*
+ * Note:
+ * =====
+ * This provides the ability to ensure VEPA mode on the internal bridge even
+ * if the kernel does not support the netdev bridge setting operations.
+ */
+NGBE_PARAM(VEPA, "VEPA Bridge Mode: 0 = VEB (default), 1 = VEPA");
+#endif
+
+/* Interrupt Throttle Rate (interrupts/sec)
+ *
+ * Valid Range: 980-500000 (0=off, 1=dynamic)
+ *
+ * Default Value: 1
+ */
+#define DEFAULT_ITR 1
+NGBE_PARAM(InterruptThrottleRate, "Maximum interrupts per second, per vector, "
+ "(0,1,980-500000), default 1");
+#define MAX_ITR NGBE_MAX_INT_RATE
+#define MIN_ITR NGBE_MIN_INT_RATE
+
+#ifndef CONFIG_NGBE_NO_LLI
+
+/* LLIPort (Low Latency Interrupt TCP Port)
+ *
+ * Valid Range: 0 - 65535
+ *
+ * Default Value: 0 (disabled)
+ */
+NGBE_PARAM(LLIPort, "Low Latency Interrupt TCP Port (0-65535)");
+
+#define DEFAULT_LLIPORT 0
+#define MAX_LLIPORT 0xFFFF
+#define MIN_LLIPORT 0
+
+
+/* LLISize (Low Latency Interrupt on Packet Size)
+ *
+ * Valid Range: 0 - 1500
+ *
+ * Default Value: 0 (disabled)
+ */
+NGBE_PARAM(LLISize, "Low Latency Interrupt on Packet Size (0-1500)");
+
+#define DEFAULT_LLISIZE 0
+#define MAX_LLISIZE 1500
+#define MIN_LLISIZE 0
+
+/* LLIEType (Low Latency Interrupt Ethernet Type)
+ *
+ * Valid Range: 0 - 0x8fff
+ *
+ * Default Value: 0 (disabled)
+ */
+NGBE_PARAM(LLIEType, "Low Latency Interrupt Ethernet Protocol Type");
+
+#define DEFAULT_LLIETYPE 0
+#define MAX_LLIETYPE 0x8fff
+#define MIN_LLIETYPE 0
+
+/* LLIVLANP (Low Latency Interrupt on VLAN priority threshold)
+ *
+ * Valid Range: 0 - 7
+ *
+ * Default Value: 0 (disabled)
+ */
+NGBE_PARAM(LLIVLANP, "Low Latency Interrupt on VLAN priority threshold");
+
+#define DEFAULT_LLIVLANP 0
+#define MAX_LLIVLANP 7
+#define MIN_LLIVLANP 0
+
+#endif /* CONFIG_NGBE_NO_LLI */
+
+/* Software ATR packet sample rate
+ *
+ * Valid Range: 0-255 (0 = off, 1-255 = rate of Tx packet inspection)
+ *
+ * Default Value: 20
+ */
+NGBE_PARAM(AtrSampleRate, "Software ATR Tx packet sample rate");
+
+#define NGBE_MAX_ATR_SAMPLE_RATE 255
+#define NGBE_MIN_ATR_SAMPLE_RATE 1
+#define NGBE_ATR_SAMPLE_RATE_OFF 0
+#define NGBE_DEFAULT_ATR_SAMPLE_RATE 20
+
+/* Enable/disable Large Receive Offload
+ *
+ * Valid Values: 0(off), 1(on)
+ *
+ * Default Value: 1
+ */
+NGBE_PARAM(LRO, "Large Receive Offload (0,1), default 1 = on");
+
+/* Enable/disable support for DMA coalescing
+ *
+ * Valid Values: 0(off), 41 - 10000(on)
+ *
+ * Default Value: 0
+ */
+NGBE_PARAM(dmac_watchdog,
+	   "DMA coalescing watchdog in microseconds (0,41-10000), "
+	   "default 0 = off");
+
+/* Rx buffer mode
+ *
+ * Valid Range: 0-1 0 = no header split, 1 = hdr split
+ *
+ * Default Value: 0
+ */
+NGBE_PARAM(RxBufferMode, "0=(default)no header split\n"
+ "\t\t\t1=hdr split for recognized packet\n");
+
+#define NGBE_RXBUFMODE_NO_HEADER_SPLIT 0
+#define NGBE_RXBUFMODE_HEADER_SPLIT 1
+#define NGBE_DEFAULT_RXBUFMODE NGBE_RXBUFMODE_NO_HEADER_SPLIT
+
+struct ngbe_option {
+ enum { enable_option, range_option, list_option } type;
+ const char *name;
+ const char *err;
+ const char *msg;
+ int def;
+ union {
+ struct { /* range_option info */
+ int min;
+ int max;
+ } r;
+ struct { /* list_option info */
+ int nr;
+ const struct ngbe_opt_list {
+ int i;
+ char *str;
+ } *p;
+ } l;
+ } arg;
+};
+
+static int ngbe_validate_option(u32 *value,
+ struct ngbe_option *opt)
+{
+ int val = (int)*value;
+
+	if (val == OPTION_UNSET) {
+		/* unset means "use the default", not an invalid value */
+		*value = (u32)opt->def;
+		return 0;
+	}
+
+ switch (opt->type) {
+ case enable_option:
+ switch (val) {
+ case OPTION_ENABLED:
+ ngbe_info("ngbe: %s Enabled\n", opt->name);
+ return 0;
+ case OPTION_DISABLED:
+ ngbe_info("ngbe: %s Disabled\n", opt->name);
+ return 0;
+ }
+ break;
+ case range_option:
+ if ((val >= opt->arg.r.min && val <= opt->arg.r.max) ||
+ val == opt->def) {
+ if (opt->msg)
+ ngbe_info("ngbe: %s set to %d, %s\n",
+ opt->name, val, opt->msg);
+ else
+ ngbe_info("ngbe: %s set to %d\n",
+ opt->name, val);
+ return 0;
+ }
+ break;
+ case list_option: {
+ int i;
+ const struct ngbe_opt_list *ent;
+
+ for (i = 0; i < opt->arg.l.nr; i++) {
+ ent = &opt->arg.l.p[i];
+ if (val == ent->i) {
+ if (ent->str[0] != '\0')
+ ngbe_info("%s\n", ent->str);
+ return 0;
+ }
+ }
+ }
+ break;
+ default:
+		BUG();
+ }
+
+ ngbe_info("ngbe: Invalid %s specified (%d), %s\n",
+ opt->name, val, opt->err);
+ *value = (u32)opt->def;
+ return -1;
+}
+
+/**
+ * ngbe_check_options - Range Checking for Command Line Parameters
+ * @adapter: board private structure
+ *
+ * This routine checks all command line parameters for valid user
+ * input. If an invalid value is given, or if no user specified
+ * value exists, a default value is used. The final value is stored
+ * in a variable in the adapter structure.
+ **/
+void ngbe_check_options(struct ngbe_adapter *adapter)
+{
+ u32 bd = adapter->bd_number;
+ u32 *aflags = &adapter->flags;
+ struct ngbe_ring_feature *feature = adapter->ring_feature;
+ u32 vmdq;
+
+ if (bd >= NGBE_MAX_NIC) {
+ ngbe_notice("Warning: no configuration for board #%d\n", bd);
+ ngbe_notice("Using defaults for all values\n");
+ }
+
+ { /* Interrupt Mode */
+ u32 int_mode;
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Interrupt Mode",
+ .err =
+ "using default of "__MODULE_STRING(NGBE_DEFAULT_INT),
+ .def = NGBE_DEFAULT_INT,
+ .arg = { .r = { .min = NGBE_INT_LEGACY,
+ .max = NGBE_INT_MSIX} }
+ };
+
+ if (num_IntMode > bd || num_InterruptType > bd) {
+ int_mode = IntMode[bd];
+ if (int_mode == OPTION_UNSET)
+ int_mode = InterruptType[bd];
+ ngbe_validate_option(&int_mode, &opt);
+ switch (int_mode) {
+ case NGBE_INT_MSIX:
+ if (!(*aflags & NGBE_FLAG_MSIX_CAPABLE))
+ ngbe_info(
+ "Ignoring MSI-X setting; "
+ "support unavailable\n");
+ break;
+ case NGBE_INT_MSI:
+ if (!(*aflags & NGBE_FLAG_MSI_CAPABLE)) {
+ ngbe_info(
+ "Ignoring MSI setting; "
+ "support unavailable\n");
+ } else {
+ *aflags &= ~NGBE_FLAG_MSIX_CAPABLE;
+ }
+ break;
+ case NGBE_INT_LEGACY:
+ default:
+ *aflags &= ~NGBE_FLAG_MSIX_CAPABLE;
+ *aflags &= ~NGBE_FLAG_MSI_CAPABLE;
+ break;
+ }
+ } else {
+ /* default settings */
+ if (opt.def == NGBE_INT_MSIX &&
+ *aflags & NGBE_FLAG_MSIX_CAPABLE) {
+ *aflags |= NGBE_FLAG_MSIX_CAPABLE;
+ *aflags |= NGBE_FLAG_MSI_CAPABLE;
+ } else if (opt.def == NGBE_INT_MSI &&
+ *aflags & NGBE_FLAG_MSI_CAPABLE) {
+ *aflags &= ~NGBE_FLAG_MSIX_CAPABLE;
+ *aflags |= NGBE_FLAG_MSI_CAPABLE;
+ } else {
+ *aflags &= ~NGBE_FLAG_MSIX_CAPABLE;
+ *aflags &= ~NGBE_FLAG_MSI_CAPABLE;
+ }
+ }
+ }
+ { /* Multiple Queue Support */
+ static struct ngbe_option opt = {
+ .type = enable_option,
+ .name = "Multiple Queue Support",
+ .err = "defaulting to Enabled",
+ .def = OPTION_ENABLED
+ };
+
+ if (num_MQ > bd) {
+ u32 mq = MQ[bd];
+ ngbe_validate_option(&mq, &opt);
+ if (mq)
+ *aflags |= NGBE_FLAG_MQ_CAPABLE;
+ else
+ *aflags &= ~NGBE_FLAG_MQ_CAPABLE;
+ } else {
+ if (opt.def == OPTION_ENABLED)
+ *aflags |= NGBE_FLAG_MQ_CAPABLE;
+ else
+ *aflags &= ~NGBE_FLAG_MQ_CAPABLE;
+ }
+ /* Check Interoperability */
+ if ((*aflags & NGBE_FLAG_MQ_CAPABLE) &&
+ !(*aflags & NGBE_FLAG_MSIX_CAPABLE)) {
+ DPRINTK(PROBE, INFO,
+ "Multiple queues are not supported while MSI-X "
+ "is disabled. Disabling Multiple Queues.\n");
+ *aflags &= ~NGBE_FLAG_MQ_CAPABLE;
+ }
+ }
+
+ { /* Receive-Side Scaling (RSS) */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Receive-Side Scaling (RSS)",
+ .err = "using default.",
+ .def = 0,
+ .arg = { .r = { .min = 0,
+ .max = 1} }
+ };
+ u32 rss = RSS[bd];
+ /* adjust Max allowed RSS queues based on MAC type */
+ opt.arg.r.max = ngbe_max_rss_indices(adapter);
+
+ if (num_RSS > bd) {
+ ngbe_validate_option(&rss, &opt);
+ /* base it off num_online_cpus() with hardware limit */
+ if (!rss)
+ rss = min_t(int, opt.arg.r.max,
+ num_online_cpus());
+
+ feature[RING_F_RSS].limit = (u16)rss;
+ } else if (opt.def == 0) {
+ rss = min_t(int, ngbe_max_rss_indices(adapter),
+ num_online_cpus());
+ feature[RING_F_RSS].limit = rss;
+ }
+ /* Check Interoperability */
+ if (rss > 1) {
+ if (!(*aflags & NGBE_FLAG_MQ_CAPABLE)) {
+ DPRINTK(PROBE, INFO,
+ "Multiqueue is disabled. "
+ "Limiting RSS.\n");
+ feature[RING_F_RSS].limit = 1;
+ }
+ }
+ adapter->flags2 |= NGBE_FLAG2_RSS_ENABLED;
+ }
+ { /* Virtual Machine Device Queues (VMDQ) */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Virtual Machine Device Queues (VMDQ)",
+ .err = "defaulting to Disabled",
+ .def = OPTION_DISABLED,
+ .arg = { .r = { .min = OPTION_DISABLED,
+ .max = NGBE_MAX_VMDQ_INDICES
+ } }
+ };
+
+ if (num_VMDQ > bd) {
+ vmdq = VMDQ[bd];
+
+ ngbe_validate_option(&vmdq, &opt);
+
+			/* zero or one both mean disabled from our driver's
+			 * perspective
+			 */
+			if (vmdq > 1)
+				*aflags |= NGBE_FLAG_VMDQ_ENABLED;
+			else
+				*aflags &= ~NGBE_FLAG_VMDQ_ENABLED;
+
+ feature[RING_F_VMDQ].limit = (u16)vmdq;
+ } else {
+ if (opt.def == OPTION_DISABLED)
+ *aflags &= ~NGBE_FLAG_VMDQ_ENABLED;
+ else
+ *aflags |= NGBE_FLAG_VMDQ_ENABLED;
+
+ feature[RING_F_VMDQ].limit = opt.def;
+ }
+
+ /* Check Interoperability */
+ if (*aflags & NGBE_FLAG_VMDQ_ENABLED) {
+ if (!(*aflags & NGBE_FLAG_MQ_CAPABLE)) {
+ DPRINTK(PROBE, INFO,
+ "VMDQ is not supported while multiple "
+ "queues are disabled. "
+ "Disabling VMDQ.\n");
+ *aflags &= ~NGBE_FLAG_VMDQ_ENABLED;
+ feature[RING_F_VMDQ].limit = 0;
+ }
+ }
+ }
+#ifdef CONFIG_PCI_IOV
+ { /* Single Root I/O Virtualization (SR-IOV) */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "I/O Virtualization (IOV)",
+ .err = "defaulting to Disabled",
+ .def = OPTION_DISABLED,
+ .arg = { .r = { .min = OPTION_DISABLED,
+ .max = MAX_SRIOV_VFS} }
+ };
+
+ if (num_max_vfs > bd) {
+ u32 vfs = max_vfs[bd];
+ if (ngbe_validate_option(&vfs, &opt)) {
+ vfs = 0;
+				DPRINTK(PROBE, INFO,
+					"max_vfs out of range. "
+					"Disabling SR-IOV.\n");
+ }
+
+ adapter->num_vfs = vfs;
+
+ if (vfs)
+ *aflags |= NGBE_FLAG_SRIOV_ENABLED;
+ else
+ *aflags &= ~NGBE_FLAG_SRIOV_ENABLED;
+ } else {
+ if (opt.def == OPTION_DISABLED) {
+ adapter->num_vfs = 0;
+ *aflags &= ~NGBE_FLAG_SRIOV_ENABLED;
+ } else {
+ adapter->num_vfs = opt.def;
+ *aflags |= NGBE_FLAG_SRIOV_ENABLED;
+ }
+ }
+
+ /* Check Interoperability */
+ if (*aflags & NGBE_FLAG_SRIOV_ENABLED) {
+ if (!(*aflags & NGBE_FLAG_SRIOV_CAPABLE)) {
+ DPRINTK(PROBE, INFO,
+ "IOV is not supported on this "
+ "hardware. Disabling IOV.\n");
+ *aflags &= ~NGBE_FLAG_SRIOV_ENABLED;
+ adapter->num_vfs = 0;
+ } else if (!(*aflags & NGBE_FLAG_MQ_CAPABLE)) {
+ DPRINTK(PROBE, INFO,
+ "IOV is not supported while multiple "
+ "queues are disabled. "
+ "Disabling IOV.\n");
+ *aflags &= ~NGBE_FLAG_SRIOV_ENABLED;
+ adapter->num_vfs = 0;
+ }
+ }
+ }
+ { /* VEPA Bridge Mode enable for SR-IOV mode */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "VEPA Bridge Mode Enable",
+ .err = "defaulting to disabled",
+ .def = OPTION_DISABLED,
+ .arg = { .r = { .min = OPTION_DISABLED,
+ .max = OPTION_ENABLED} }
+ };
+
+ if (num_VEPA > bd) {
+ u32 vepa = VEPA[bd];
+ ngbe_validate_option(&vepa, &opt);
+ if (vepa)
+ adapter->flags |=
+ NGBE_FLAG_SRIOV_VEPA_BRIDGE_MODE;
+ } else {
+ if (opt.def == OPTION_ENABLED)
+ adapter->flags |=
+ NGBE_FLAG_SRIOV_VEPA_BRIDGE_MODE;
+ }
+ }
+#endif /* CONFIG_PCI_IOV */
+ { /* Interrupt Throttling Rate */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Interrupt Throttling Rate (ints/sec)",
+ .err = "using default of "__MODULE_STRING(DEFAULT_ITR),
+ .def = DEFAULT_ITR,
+ .arg = { .r = { .min = MIN_ITR,
+ .max = MAX_ITR } }
+ };
+
+ if (num_InterruptThrottleRate > bd) {
+ u32 itr = InterruptThrottleRate[bd];
+ switch (itr) {
+ case 0:
+ DPRINTK(PROBE, INFO, "%s turned off\n",
+ opt.name);
+ adapter->rx_itr_setting = 0;
+ break;
+ case 1:
+ DPRINTK(PROBE, INFO, "dynamic interrupt "
+ "throttling enabled\n");
+ adapter->rx_itr_setting = 1;
+ break;
+ default:
+ ngbe_validate_option(&itr, &opt);
+				/* values 0 and 1 are reserved as control
+				 * flags (off/dynamic), so store the interval
+				 * in usecs (1000000/itr) shifted left by 2
+				 */
+				adapter->rx_itr_setting = (u16)((1000000/itr) << 2);
+ break;
+ }
+ adapter->tx_itr_setting = adapter->rx_itr_setting;
+ } else {
+ adapter->rx_itr_setting = opt.def;
+ adapter->tx_itr_setting = opt.def;
+ }
+ }
+#ifndef CONFIG_NGBE_NO_LLI
+ { /* Low Latency Interrupt TCP Port*/
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Low Latency Interrupt TCP Port",
+ .err = "using default of "
+ __MODULE_STRING(DEFAULT_LLIPORT),
+ .def = DEFAULT_LLIPORT,
+ .arg = { .r = { .min = MIN_LLIPORT,
+ .max = MAX_LLIPORT } }
+ };
+
+ if (num_LLIPort > bd) {
+ adapter->lli_port = LLIPort[bd];
+ if (adapter->lli_port) {
+ ngbe_validate_option(&adapter->lli_port, &opt);
+ } else {
+ DPRINTK(PROBE, INFO, "%s turned off\n",
+ opt.name);
+ }
+ } else {
+ adapter->lli_port = opt.def;
+ }
+ }
+ { /* Low Latency Interrupt on Packet Size */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Low Latency Interrupt on Packet Size",
+ .err = "using default of "
+ __MODULE_STRING(DEFAULT_LLISIZE),
+ .def = DEFAULT_LLISIZE,
+ .arg = { .r = { .min = MIN_LLISIZE,
+ .max = MAX_LLISIZE } }
+ };
+
+ if (num_LLISize > bd) {
+ adapter->lli_size = LLISize[bd];
+ if (adapter->lli_size) {
+ ngbe_validate_option(&adapter->lli_size, &opt);
+ } else {
+ DPRINTK(PROBE, INFO, "%s turned off\n",
+ opt.name);
+ }
+ } else {
+ adapter->lli_size = opt.def;
+ }
+ }
+ { /* Low Latency Interrupt EtherType*/
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Low Latency Interrupt on Ethernet Protocol "
+ "Type",
+ .err = "using default of "
+ __MODULE_STRING(DEFAULT_LLIETYPE),
+ .def = DEFAULT_LLIETYPE,
+ .arg = { .r = { .min = MIN_LLIETYPE,
+ .max = MAX_LLIETYPE } }
+ };
+
+ if (num_LLIEType > bd) {
+ adapter->lli_etype = LLIEType[bd];
+ if (adapter->lli_etype) {
+ ngbe_validate_option(&adapter->lli_etype,
+ &opt);
+ } else {
+ DPRINTK(PROBE, INFO, "%s turned off\n",
+ opt.name);
+ }
+ } else {
+ adapter->lli_etype = opt.def;
+ }
+ }
+ { /* LLI VLAN Priority */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Low Latency Interrupt on VLAN priority "
+ "threshold",
+ .err = "using default of "
+ __MODULE_STRING(DEFAULT_LLIVLANP),
+ .def = DEFAULT_LLIVLANP,
+ .arg = { .r = { .min = MIN_LLIVLANP,
+ .max = MAX_LLIVLANP } }
+ };
+
+ if (num_LLIVLANP > bd) {
+ adapter->lli_vlan_pri = LLIVLANP[bd];
+ if (adapter->lli_vlan_pri) {
+ ngbe_validate_option(&adapter->lli_vlan_pri,
+ &opt);
+ } else {
+ DPRINTK(PROBE, INFO, "%s turned off\n",
+ opt.name);
+ }
+ } else {
+ adapter->lli_vlan_pri = opt.def;
+ }
+ }
+#endif /* CONFIG_NGBE_NO_LLI */
+
+ { /* Flow Director ATR Tx sample packet rate */
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Software ATR Tx packet sample rate",
+ .err = "using default of "
+ __MODULE_STRING(NGBE_DEFAULT_ATR_SAMPLE_RATE),
+ .def = NGBE_DEFAULT_ATR_SAMPLE_RATE,
+ .arg = {.r = {.min = NGBE_ATR_SAMPLE_RATE_OFF,
+ .max = NGBE_MAX_ATR_SAMPLE_RATE} }
+ };
+ static const char atr_string[] =
+ "ATR Tx Packet sample rate set to";
+
+ if (num_AtrSampleRate > bd) {
+ adapter->atr_sample_rate = AtrSampleRate[bd];
+
+ if (adapter->atr_sample_rate) {
+ ngbe_validate_option(&adapter->atr_sample_rate,
+ &opt);
+ DPRINTK(PROBE, INFO, "%s %d\n", atr_string,
+ adapter->atr_sample_rate);
+ }
+ } else {
+ adapter->atr_sample_rate = opt.def;
+ }
+ }
+
+ { /* LRO - Set Large Receive Offload */
+ struct ngbe_option opt = {
+ .type = enable_option,
+ .name = "LRO - Large Receive Offload",
+ .err = "defaulting to Disabled",
+ .def = OPTION_DISABLED
+ };
+ struct net_device *netdev = adapter->netdev;
+
+ if (num_LRO > bd) {
+ u32 lro = LRO[bd];
+ ngbe_validate_option(&lro, &opt);
+ if (lro)
+ netdev->features |= NETIF_F_LRO;
+ else
+ netdev->features &= ~NETIF_F_LRO;
+ } else if (opt.def == OPTION_ENABLED) {
+ netdev->features |= NETIF_F_LRO;
+ } else {
+ netdev->features &= ~NETIF_F_LRO;
+ }
+
+ if ((netdev->features & NETIF_F_LRO)) {
+ DPRINTK(PROBE, INFO,
+ "RSC is not supported on this "
+ "hardware. Disabling RSC.\n");
+ netdev->features &= ~NETIF_F_LRO;
+ }
+ }
+ { /* DMA Coalescing */
+ struct ngbe_option opt = {
+ .type = range_option,
+ .name = "dmac_watchdog",
+ .err = "defaulting to 0 (disabled)",
+ .def = 0,
+ .arg = { .r = { .min = 41, .max = 10000 } },
+ };
+ const char *cmsg = "DMA coalescing not supported on this "
+ "hardware";
+
+ opt.err = cmsg;
+ opt.msg = cmsg;
+ opt.arg.r.min = 0;
+ opt.arg.r.max = 0;
+
+ if (num_dmac_watchdog > bd) {
+ u32 dmac_wd = dmac_watchdog[bd];
+
+ ngbe_validate_option(&dmac_wd, &opt);
+ adapter->hw.mac.dmac_config.watchdog_timer = (u16)dmac_wd;
+ } else {
+ adapter->hw.mac.dmac_config.watchdog_timer = opt.def;
+ }
+ }
+
+ { /* Rx buffer mode */
+ u32 rx_buf_mode;
+ static struct ngbe_option opt = {
+ .type = range_option,
+ .name = "Rx buffer mode",
+ .err = "using default of "
+ __MODULE_STRING(NGBE_DEFAULT_RXBUFMODE),
+ .def = NGBE_DEFAULT_RXBUFMODE,
+ .arg = {.r = {.min = NGBE_RXBUFMODE_NO_HEADER_SPLIT,
+ .max = NGBE_RXBUFMODE_HEADER_SPLIT} }
+
+ };
+
+ if (num_RxBufferMode > bd) {
+ rx_buf_mode = RxBufferMode[bd];
+ ngbe_validate_option(&rx_buf_mode, &opt);
+ switch (rx_buf_mode) {
+ case NGBE_RXBUFMODE_NO_HEADER_SPLIT:
+ *aflags &= ~NGBE_FLAG_RX_HS_ENABLED;
+ break;
+ case NGBE_RXBUFMODE_HEADER_SPLIT:
+ *aflags |= NGBE_FLAG_RX_HS_ENABLED;
+ break;
+ default:
+ break;
+ }
+ } else {
+ *aflags &= ~NGBE_FLAG_RX_HS_ENABLED;
+ }
+ }
+}
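+
+/*
+ * Usage note (not part of this patch): every option above is a per-board
+ * array, so "modprobe ngbe IntMode=2,1 RSS=4,0" asks for MSI-X with four
+ * RSS queues on board 0 and MSI with one queue per online CPU on board 1.
+ */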
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.c b/drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.c
new file mode 100644
index 0000000000000..8d47bfabd6ad8
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.c
@@ -0,0 +1,257 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include <linux/delay.h>
+#include <linux/pci.h>
+#include "ngbe_pcierr.h"
+#include "ngbe.h"
+
+#define NGBE_ROOT_PORT_INTR_ON_MESG_MASK (PCI_ERR_ROOT_CMD_COR_EN | \
+					  PCI_ERR_ROOT_CMD_NONFATAL_EN | \
+					  PCI_ERR_ROOT_CMD_FATAL_EN)
+
+#ifndef PCI_ERS_RESULT_NO_AER_DRIVER
+/* No AER capabilities registered for the driver */
+#define PCI_ERS_RESULT_NO_AER_DRIVER ((__force pci_ers_result_t) 6)
+#endif
+
+static pci_ers_result_t merge_result(enum pci_ers_result orig,
+ enum pci_ers_result new)
+{
+ if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+ return PCI_ERS_RESULT_NO_AER_DRIVER;
+ if (new == PCI_ERS_RESULT_NONE)
+ return orig;
+ switch (orig) {
+ case PCI_ERS_RESULT_CAN_RECOVER:
+ case PCI_ERS_RESULT_RECOVERED:
+ orig = new;
+ break;
+ case PCI_ERS_RESULT_DISCONNECT:
+ if (new == PCI_ERS_RESULT_NEED_RESET)
+ orig = PCI_ERS_RESULT_NEED_RESET;
+ break;
+ default:
+ break;
+ }
+ return orig;
+}
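+
+/*
+ * Illustrative sketch (not part of this patch): how votes combine for a
+ * two-device subtree. NEED_RESET from the second device overrides an
+ * earlier CAN_RECOVER. The function name is hypothetical.
+ */
+static void __maybe_unused ngbe_merge_result_example(void)
+{
+	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
+
+	status = merge_result(status, PCI_ERS_RESULT_CAN_RECOVER);
+	/* still PCI_ERS_RESULT_CAN_RECOVER */
+	status = merge_result(status, PCI_ERS_RESULT_NEED_RESET);
+	/* now PCI_ERS_RESULT_NEED_RESET: the whole subtree needs a reset */
+	(void)status;
+}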
+
+static int ngbe_report_error_detected(struct pci_dev *dev,
+ enum pci_channel_state state,
+ enum pci_ers_result *result)
+{
+ pci_ers_result_t vote;
+ const struct pci_error_handlers *err_handler;
+
+ device_lock(&dev->dev);
+	if (!dev->driver ||
+	    !dev->driver->err_handler ||
+	    !dev->driver->err_handler->error_detected) {
+ /*
+ * If any device in the subtree does not have an error_detected
+ * callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
+ * error callbacks of "any" device in the subtree, and will
+ * exit in the disconnected error state.
+ */
+ if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+ vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+ else
+ vote = PCI_ERS_RESULT_NONE;
+ } else {
+ err_handler = dev->driver->err_handler;
+ vote = err_handler->error_detected(dev, state);
+ }
+
+ *result = merge_result(*result, vote);
+ device_unlock(&dev->dev);
+ return 0;
+}
+
+static int ngbe_report_frozen_detected(struct pci_dev *dev, void *data)
+{
+ return ngbe_report_error_detected(dev, pci_channel_io_frozen, data);
+}
+
+static int ngbe_report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+ pci_ers_result_t vote, *result = data;
+ const struct pci_error_handlers *err_handler;
+
+ device_lock(&dev->dev);
+ if (!dev->driver ||
+ !dev->driver->err_handler ||
+ !dev->driver->err_handler->mmio_enabled)
+ goto out;
+
+ err_handler = dev->driver->err_handler;
+ vote = err_handler->mmio_enabled(dev);
+ *result = merge_result(*result, vote);
+out:
+ device_unlock(&dev->dev);
+ return 0;
+}
+
+static int ngbe_report_slot_reset(struct pci_dev *dev, void *data)
+{
+ pci_ers_result_t vote, *result = data;
+ const struct pci_error_handlers *err_handler;
+
+ device_lock(&dev->dev);
+ if (!dev->driver ||
+ !dev->driver->err_handler ||
+ !dev->driver->err_handler->slot_reset)
+ goto out;
+
+ err_handler = dev->driver->err_handler;
+ vote = err_handler->slot_reset(dev);
+ *result = merge_result(*result, vote);
+out:
+ device_unlock(&dev->dev);
+ return 0;
+}
+
+static int ngbe_report_resume(struct pci_dev *dev, void *data)
+{
+ const struct pci_error_handlers *err_handler;
+
+ device_lock(&dev->dev);
+ dev->error_state = pci_channel_io_normal;
+	if (!dev->driver ||
+	    !dev->driver->err_handler ||
+	    !dev->driver->err_handler->resume)
+ goto out;
+
+ err_handler = dev->driver->err_handler;
+ err_handler->resume(dev);
+out:
+ device_unlock(&dev->dev);
+ return 0;
+}
+
+void ngbe_pcie_do_recovery(struct pci_dev *dev)
+{
+ pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
+ struct pci_bus *bus;
+ u32 reg32;
+ int pos;
+ int delay = 1;
+ u32 id;
+ u16 ctrl;
+ /*
+ * Error recovery runs on all subordinates of the first downstream port.
+ * If the downstream port detected the error, it is cleared at the end.
+ */
+ if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+ pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
+ dev = dev->bus->self;
+ bus = dev->subordinate;
+
+ pci_walk_bus(bus, ngbe_report_frozen_detected, &status);
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
+ if (pos) {
+ /* Disable Root's interrupt in response to error messages */
+ pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, ®32);
+ reg32 &= ~NGBE_ROOT_PORT_INTR_ON_MESG_MASK;
+ pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32);
+ }
+
+ pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &ctrl);
+ ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+ pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
+
+	/*
+	 * PCI spec v3.0 7.6.4.2 requires a minimum Trst of 1 ms. Double
+	 * this to 2 ms to ensure that we meet the minimum requirement.
+	 */
+
+ msleep(2);
+ ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+ pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
+
+	/*
+	 * Trhfa for conventional PCI is 2^25 clock cycles.
+	 * Assuming a minimum 33MHz clock this results in a 1s
+	 * delay before we can consider subordinate devices to
+	 * be re-initialized. PCIe has some ways to shorten this,
+	 * but we don't make use of them yet.
+	 */
+ ssleep(1);
+
+ pci_read_config_dword(dev, PCI_COMMAND, &id);
+ while (id == ~0) {
+ if (delay > 60000) {
+ pci_warn(dev, "not ready %dms after %s; giving up\n",
+ delay - 1, "bus_reset");
+ return;
+ }
+
+ if (delay > 1000)
+ pci_info(dev, "not ready %dms after %s; waiting\n",
+ delay - 1, "bus_reset");
+
+ msleep(delay);
+ delay *= 2;
+ pci_read_config_dword(dev, PCI_COMMAND, &id);
+ }
+
+ if (delay > 1000)
+ pci_info(dev, "ready %dms after %s\n", delay - 1,
+ "bus_reset");
+
+ pci_info(dev, "Root Port link has been reset\n");
+
+ if (pos) {
+ /* Clear Root Error Status */
+ pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+ pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, reg32);
+
+ /* Enable Root Port's interrupt in response to error messages */
+ pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, &reg32);
+ reg32 |= NGBE_ROOT_PORT_INTR_ON_MESG_MASK;
+ pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32);
+ }
+
+ if (status == PCI_ERS_RESULT_CAN_RECOVER) {
+ status = PCI_ERS_RESULT_RECOVERED;
+ pci_dbg(dev, "broadcast mmio_enabled message\n");
+ pci_walk_bus(bus, ngbe_report_mmio_enabled, &status);
+ }
+
+ if (status == PCI_ERS_RESULT_NEED_RESET) {
+ /*
+ * TODO: Should call platform-specific
+ * functions to reset slot before calling
+ * drivers' slot_reset callbacks?
+ */
+ status = PCI_ERS_RESULT_RECOVERED;
+ pci_dbg(dev, "broadcast slot_reset message\n");
+ pci_walk_bus(bus, ngbe_report_slot_reset, &status);
+ }
+
+ if (status != PCI_ERS_RESULT_RECOVERED)
+ goto failed;
+
+ pci_dbg(dev, "broadcast resume message\n");
+ pci_walk_bus(bus, ngbe_report_resume, &status);
+
+failed:
+ return;
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.h b/drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.h
new file mode 100644
index 0000000000000..f92def4d21667
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_pcierr.h
@@ -0,0 +1,23 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#ifndef _NGBE_PCIERR_H_
+#define _NGBE_PCIERR_H_
+
+void ngbe_pcie_do_recovery(struct pci_dev *dev);
+#endif /* _NGBE_PCIERR_H_ */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_phy.c b/drivers/net/ethernet/netswift/ngbe/ngbe_phy.c
new file mode 100644
index 0000000000000..2f9013c291a11
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_phy.c
@@ -0,0 +1,1243 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include "ngbe_phy.h"
+
+/**
+ * ngbe_check_reset_blocked - check status of MNG FW veto bit
+ * @hw: pointer to the hardware structure
+ *
+ * This function checks the MMNGC.MNG_VETO bit to see if there are
+ * any constraints on link from manageability. For MACs that don't
+ * have this bit, just return false since the link cannot be blocked
+ * via this method.
+ **/
+bool ngbe_check_reset_blocked(struct ngbe_hw *hw)
+{
+ u32 mmngc;
+
+ DEBUGFUNC("ngbe_check_reset_blocked");
+
+ mmngc = rd32(hw, NGBE_MIS_ST);
+ if (mmngc & NGBE_MIS_ST_MNG_VETO) {
+ ERROR_REPORT1(NGBE_ERROR_SOFTWARE,
+ "MNG_VETO bit detected.\n");
+ return true;
+ }
+
+ return false;
+}
+
+/* For internal phy only */
+s32 ngbe_phy_read_reg(struct ngbe_hw *hw,
+ u32 reg_offset,
+ u32 page,
+ u16 *phy_data)
+{
+ /* clear input */
+ *phy_data = 0;
+
+ wr32(hw, NGBE_PHY_CONFIG(NGBE_INTERNAL_PHY_PAGE_SELECT_OFFSET),
+ page);
+
+ if (reg_offset >= NGBE_INTERNAL_PHY_OFFSET_MAX) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "input reg offset %d exceed maximum 31.\n", reg_offset);
+ return NGBE_ERR_INVALID_ARGUMENT;
+ }
+
+ *phy_data = 0xFFFF & rd32(hw, NGBE_PHY_CONFIG(reg_offset));
+
+ return NGBE_OK;
+}
+
+/* For internal phy only */
+s32 ngbe_phy_write_reg(struct ngbe_hw *hw,
+ u32 reg_offset,
+ u32 page,
+ u16 phy_data)
+{
+ wr32(hw, NGBE_PHY_CONFIG(NGBE_INTERNAL_PHY_PAGE_SELECT_OFFSET),
+ page);
+
+ if (reg_offset >= NGBE_INTERNAL_PHY_OFFSET_MAX) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "input reg offset %d exceed maximum 31.\n", reg_offset);
+ return NGBE_ERR_INVALID_ARGUMENT;
+ }
+ wr32(hw, NGBE_PHY_CONFIG(reg_offset), phy_data);
+
+ return NGBE_OK;
+}
+
+s32 ngbe_check_internal_phy_id(struct ngbe_hw *hw)
+{
+ u16 phy_id_high = 0;
+ u16 phy_id_low = 0;
+ u16 phy_id = 0;
+
+ DEBUGFUNC("ngbe_check_internal_phy_id");
+
+ ngbe_phy_read_reg(hw, NGBE_MDI_PHY_ID1_OFFSET, 0, &phy_id_high);
+ phy_id = phy_id_high << 6;
+ ngbe_phy_read_reg(hw, NGBE_MDI_PHY_ID2_OFFSET, 0, &phy_id_low);
+ phy_id |= (phy_id_low & NGBE_MDI_PHY_ID_MASK) >> 10;
+
+ if (NGBE_INTERNAL_PHY_ID != phy_id) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "internal phy id 0x%x not supported.\n", phy_id);
+ return NGBE_ERR_DEVICE_NOT_SUPPORTED;
+ }
+ hw->phy.id = (u32)phy_id;
+
+ return NGBE_OK;
+}
+
+/**
+ * ngbe_phy_read_reg_mdi - Reads a value from a specified PHY register without
+ * the SWFW lock
+ * @hw: pointer to hardware structure
+ * @reg_addr: 32 bit address of PHY register to read
+ * @device_type: 5 bit device type
+ * @phy_data: Pointer to read data from PHY register
+ **/
+s32 ngbe_phy_read_reg_mdi(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data)
+{
+ u32 command;
+ s32 status = 0;
+
+ /* setup and write the address cycle command */
+ command = NGBE_MSCA_RA(reg_addr) |
+ NGBE_MSCA_PA(hw->phy.addr) |
+ NGBE_MSCA_DA(device_type);
+ wr32(hw, NGBE_MSCA, command);
+
+ command = NGBE_MSCC_CMD(NGBE_MSCA_CMD_READ) |
+ NGBE_MSCC_BUSY |
+ NGBE_MDIO_CLK(6);
+ wr32(hw, NGBE_MSCC, command);
+
+ /* wait to complete */
+ status = po32m(hw, NGBE_MSCC,
+ NGBE_MSCC_BUSY, ~NGBE_MSCC_BUSY,
+ NGBE_MDIO_TIMEOUT, 10);
+ if (status != 0) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "PHY address command did not complete.\n");
+ return NGBE_ERR_PHY;
+ }
+
+ /* read data from MSCC */
+ *phy_data = 0xFFFF & rd32(hw, NGBE_MSCC);
+
+ return 0;
+}
+
+/**
+ * ngbe_phy_write_reg_mdi - Writes a value to a specified PHY register
+ * without SWFW lock
+ * @hw: pointer to hardware structure
+ * @reg_addr: 32 bit PHY register to write
+ * @device_type: 5 bit device type
+ * @phy_data: Data to write to the PHY register
+ **/
+s32 ngbe_phy_write_reg_mdi(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data)
+{
+ u32 command;
+ s32 status = 0;
+
+ /* setup and write the address cycle command */
+ command = NGBE_MSCA_RA(reg_addr) |
+ NGBE_MSCA_PA(hw->phy.addr) |
+ NGBE_MSCA_DA(device_type);
+ wr32(hw, NGBE_MSCA, command);
+
+ command = phy_data | NGBE_MSCC_CMD(NGBE_MSCA_CMD_WRITE) |
+ NGBE_MSCC_BUSY | NGBE_MDIO_CLK(6);
+ wr32(hw, NGBE_MSCC, command);
+
+ /* wait to complete */
+ status = po32m(hw, NGBE_MSCC,
+ NGBE_MSCC_BUSY, ~NGBE_MSCC_BUSY,
+ NGBE_MDIO_TIMEOUT, 10);
+ if (status != 0) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "PHY address command did not complete.\n");
+ return NGBE_ERR_PHY;
+ }
+
+ return 0;
+}
+
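+/*
+ * Extended (debug) register access for the YT8521S PHY. These helpers
+ * assume the usual Motorcomm scheme: MII register 0x1e holds the
+ * extended-register address and 0x1f holds its data.
+ */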
+s32 ngbe_phy_read_reg_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data)
+{
+ s32 status = 0;
+ status = ngbe_phy_write_reg_mdi(hw, 0x1e, device_type, reg_addr);
+ if (!status)
+ status = ngbe_phy_read_reg_mdi(hw, 0x1f, device_type, phy_data);
+ return status;
+}
+
+s32 ngbe_phy_write_reg_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data)
+{
+ s32 status = 0;
+ status = ngbe_phy_write_reg_mdi(hw, 0x1e, device_type, reg_addr);
+ if (!status)
+ status = ngbe_phy_write_reg_mdi(hw, 0x1f, device_type, phy_data);
+ return status;
+}
+
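+/*
+ * The SDS variants below first write 0x02 to extended register 0xa000,
+ * which appears to switch the PHY to its SerDes (fiber) register space,
+ * and restore 0x00 (UTP space) when done.
+ */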
+s32 ngbe_phy_read_reg_sds_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data)
+{
+ s32 status = 0;
+ status = ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x02);
+ if (!status)
+ status = ngbe_phy_read_reg_ext_yt8521s(hw, reg_addr, device_type, phy_data);
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x00);
+ return status;
+}
+
+s32 ngbe_phy_write_reg_sds_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data)
+{
+ s32 status = 0;
+ status = ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x02);
+ if (!status)
+ status = ngbe_phy_write_reg_ext_yt8521s(hw, reg_addr, device_type, phy_data);
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x00);
+ return status;
+}
+
+s32 ngbe_phy_read_reg_sds_mii_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data)
+{
+ s32 status = 0;
+ status = ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x02);
+ if (!status)
+ status = ngbe_phy_read_reg_mdi(hw, reg_addr, device_type, phy_data);
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x00);
+ return status;
+}
+
+s32 ngbe_phy_write_reg_sds_mii_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data)
+{
+ s32 status = 0;
+ status = ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x02);
+ if (!status)
+ status = ngbe_phy_write_reg_mdi(hw, reg_addr, device_type, phy_data);
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, device_type, 0x00);
+ return status;
+}
+
+s32 ngbe_check_mdi_phy_id(struct ngbe_hw *hw)
+{
+ u16 phy_id_high = 0;
+ u16 phy_id_low = 0;
+ u32 phy_id = 0;
+
+ DEBUGFUNC("ngbe_check_mdi_phy_id");
+
+ if (hw->phy.type == ngbe_phy_m88e1512) {
+ /* select page 0 */
+ ngbe_phy_write_reg_mdi(hw, 22, 0, 0);
+ } else {
+ /* select page 1 */
+ ngbe_phy_write_reg_mdi(hw, 22, 0, 1);
+ }
+
+ ngbe_phy_read_reg_mdi(hw, NGBE_MDI_PHY_ID1_OFFSET, 0, &phy_id_high);
+ phy_id = phy_id_high << 6;
+ ngbe_phy_read_reg_mdi(hw, NGBE_MDI_PHY_ID2_OFFSET, 0, &phy_id_low);
+ phy_id |= (phy_id_low & NGBE_MDI_PHY_ID_MASK) >> 10;
+
+ if (NGBE_M88E1512_PHY_ID != phy_id) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "MDI phy id 0x%x not supported.\n", phy_id);
+ return NGBE_ERR_DEVICE_NOT_SUPPORTED;
+ }
+ hw->phy.id = phy_id;
+
+ return NGBE_OK;
+}
+
+bool ngbe_validate_phy_addr(struct ngbe_hw *hw, u32 phy_addr)
+{
+ u16 phy_id = 0;
+ bool valid = false;
+
+ DEBUGFUNC("ngbe_validate_phy_addr");
+
+ hw->phy.addr = phy_addr;
+
+ ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x3, 0, &phy_id);
+ if (phy_id != 0xFFFF && phy_id != 0x0)
+ valid = true;
+
+ return valid;
+}
+
+s32 ngbe_check_yt_phy_id(struct ngbe_hw *hw)
+{
+ u16 phy_id = 0;
+ bool valid = false;
+ u32 phy_addr;
+ DEBUGFUNC("ngbe_check_yt_phy_id");
+
+ for (phy_addr = 0; phy_addr < 32; phy_addr++) {
+ valid = ngbe_validate_phy_addr(hw, phy_addr);
+ if (valid) {
+ hw->phy.addr = phy_addr;
+ printk("valid phy addr is 0x%x\n", phy_addr);
+ break;
+ }
+ }
+ if (!valid) {
+ printk("cannnot find valid phy address.\n");
+ return NGBE_ERR_DEVICE_NOT_SUPPORTED;
+ }
+ ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x3, 0, &phy_id);
+ if (NGBE_YT8521S_PHY_ID != phy_id) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "MDI phy id 0x%x not supported.\n", phy_id);
+ printk("phy id is 0x%x\n", phy_id);
+ return NGBE_ERR_DEVICE_NOT_SUPPORTED;
+ }
+ hw->phy.id = phy_id;
+ return NGBE_OK;
+}
+
+s32 ngbe_check_zte_phy_id(struct ngbe_hw *hw)
+{
+ u16 phy_id_high = 0;
+ u16 phy_id_low = 0;
+ u16 phy_id = 0;
+
+ DEBUGFUNC("ngbe_check_zte_phy_id");
+
+ ngbe_phy_read_reg_mdi(hw, NGBE_MDI_PHY_ID1_OFFSET, 0, &phy_id_high);
+ phy_id = phy_id_high << 6;
+ ngbe_phy_read_reg_mdi(hw, NGBE_MDI_PHY_ID2_OFFSET, 0, &phy_id_low);
+ phy_id |= (phy_id_low & NGBE_MDI_PHY_ID_MASK) >> 10;
+
+ if (NGBE_INTERNAL_PHY_ID != phy_id) {
+ ERROR_REPORT1(NGBE_ERROR_UNSUPPORTED,
+ "MDI phy id 0x%x not supported.\n", phy_id);
+ return NGBE_ERR_DEVICE_NOT_SUPPORTED;
+ }
+ hw->phy.id = (u32)phy_id;
+
+ return NGBE_OK;
+}
+
+/**
+ * ngbe_phy_init - PHY/SFP specific init
+ * @hw: pointer to hardware structure
+ *
+ * Initialize any function pointers that were not able to be
+ * set during init_shared_code because the PHY/SFP type was
+ * not known. Perform the SFP init if necessary.
+ *
+**/
+s32 ngbe_phy_init(struct ngbe_hw *hw)
+{
+ s32 ret_val = 0;
+ u16 value = 0;
+ int i;
+
+ DEBUGFUNC("ngbe_phy_init");
+
+ /* set fwsw semaphore mask for phy first */
+ if (!hw->phy.phy_semaphore_mask) {
+ hw->phy.phy_semaphore_mask = NGBE_MNG_SWFW_SYNC_SW_PHY;
+ }
+
+ /* init phy.addr according to HW design */
+
+ hw->phy.addr = 0;
+
+ /* Identify the PHY or SFP module */
+ ret_val = TCALL(hw, phy.ops.identify);
+ if (ret_val == NGBE_ERR_SFP_NOT_SUPPORTED)
+ return ret_val;
+
+ /* enable interrupts, only link status change and AN complete are allowed */
+ if (hw->phy.type == ngbe_phy_internal) {
+ value = NGBE_INTPHY_INT_LSC | NGBE_INTPHY_INT_ANC;
+ TCALL(hw, phy.ops.write_reg, 0x12, 0xa42, value);
+ } else if (hw->phy.type == ngbe_phy_m88e1512 ||
+ hw->phy.type == ngbe_phy_m88e1512_sfi) {
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 2);
+ TCALL(hw, phy.ops.read_reg_mdi, 21, 0, &value);
+ value &= ~NGBE_M88E1512_RGM_TTC;
+ value |= NGBE_M88E1512_RGM_RTC;
+ TCALL(hw, phy.ops.write_reg_mdi, 21, 0, value);
+ if (hw->phy.type == ngbe_phy_m88e1512)
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ else
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+
+ TCALL(hw, phy.ops.write_reg_mdi, 0, 0, NGBE_MDI_PHY_RESET);
+ for (i = 0; i < 15; i++) {
+ TCALL(hw, phy.ops.read_reg_mdi, 0, 0, &value);
+ if (value & NGBE_MDI_PHY_RESET)
+ msleep(1);
+ else
+ break;
+ }
+
+ if (i == 15) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "phy reset exceeds maximum waiting period.\n");
+ return NGBE_ERR_PHY_TIMEOUT;
+ }
+
+ ret_val = TCALL(hw, phy.ops.reset);
+ if (ret_val) {
+ return ret_val;
+ }
+
+ /* set LED2 to interrupt output and INTn active low */
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 3);
+ TCALL(hw, phy.ops.read_reg_mdi, 18, 0, &value);
+ value |= NGBE_M88E1512_INT_EN;
+ value &= ~(NGBE_M88E1512_INT_POL);
+ TCALL(hw, phy.ops.write_reg_mdi, 18, 0, value);
+
+ if (hw->phy.type == ngbe_phy_m88e1512_sfi) {
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ TCALL(hw, phy.ops.read_reg_mdi, 16, 0, &value);
+ value &= ~0x4;
+ TCALL(hw, phy.ops.write_reg_mdi, 16, 0, value);
+ }
+
+ /* enable link status change and AN complete interrupts */
+ value = NGBE_M88E1512_INT_ANC | NGBE_M88E1512_INT_LSC;
+ if (hw->phy.type == ngbe_phy_m88e1512)
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ else
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ TCALL(hw, phy.ops.write_reg_mdi, 18, 0, value);
+
+ /* LED control */
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 3);
+ TCALL(hw, phy.ops.read_reg_mdi, 16, 0, &value);
+ value &= ~0x00FF;
+ value |= (NGBE_M88E1512_LED1_CONF << 4) | NGBE_M88E1512_LED0_CONF;
+ TCALL(hw, phy.ops.write_reg_mdi, 16, 0, value);
+ TCALL(hw, phy.ops.read_reg_mdi, 17, 0, &value);
+ value &= ~0x000F;
+
+ TCALL(hw, phy.ops.write_reg_mdi, 17, 0, value);
+ } else if (hw->phy.type == ngbe_phy_yt8521s_sfi) {
+ /* enable yt8521s interrupt */
+ /* select sds area register */
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, 0, 0x00);
+
+ /* enable interrupt */
+ value = 0x000C;
+ TCALL(hw, phy.ops.write_reg_mdi, 0x12, 0, value);
+
+ /* select fiber_to_rgmii first */
+ ngbe_phy_read_reg_ext_yt8521s(hw, 0xa006, 0, &value);
+ value &= ~0x100;
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa006, 0, value);
+
+ ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x0, 0, &value);
+ value |= 0x800;
+ ngbe_phy_write_reg_sds_mii_yt8521s(hw, 0x0, 0, value);
+ }
+
+ return ret_val;
+}
+
+/**
+ * ngbe_phy_identify - Identifies PHY type
+ * @hw: pointer to hardware structure
+ *
+ * Determines HW type and calls appropriate function.
+ **/
+s32 ngbe_phy_identify(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+
+ DEBUGFUNC("ngbe_phy_identify");
+
+ switch (hw->phy.type) {
+ case ngbe_phy_internal:
+ status = ngbe_check_internal_phy_id(hw);
+ break;
+ case ngbe_phy_m88e1512:
+ case ngbe_phy_m88e1512_sfi:
+ status = ngbe_check_mdi_phy_id(hw);
+ break;
+ case ngbe_phy_zte:
+ status = ngbe_check_zte_phy_id(hw);
+ break;
+ case ngbe_phy_yt8521s_sfi:
+ status = ngbe_check_yt_phy_id(hw);
+ break;
+ default:
+ status = NGBE_ERR_PHY_TYPE;
+ }
+
+ return status;
+}
+
+s32 ngbe_phy_reset(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+
+ u16 value = 0;
+ int i;
+
+ DEBUGFUNC("ngbe_phy_reset");
+
+ /* only support internal phy */
+ if (hw->phy.type != ngbe_phy_internal)
+ return NGBE_ERR_PHY_TYPE;
+
+ /* Don't reset PHY if it's shut down due to overtemp. */
+ if (!hw->phy.reset_if_overtemp &&
+ (NGBE_ERR_OVERTEMP == TCALL(hw, phy.ops.check_overtemp))) {
+ ERROR_REPORT1(NGBE_ERROR_CAUTION,
+ "OVERTEMP! Skip PHY reset.\n");
+ return NGBE_ERR_OVERTEMP;
+ }
+
+ /* Blocked by MNG FW so bail */
+ if (ngbe_check_reset_blocked(hw))
+ return status;
+
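+ /* the MII control-register reset bit self-clears; poll until it does */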
+ value |= NGBE_MDI_PHY_RESET;
+ status = TCALL(hw, phy.ops.write_reg, 0, 0, value);
+ for (i = 0; i < NGBE_PHY_RST_WAIT_PERIOD; i++) {
+ status = TCALL(hw, phy.ops.read_reg, 0, 0, &value);
+ if (!(value & NGBE_MDI_PHY_RESET))
+ break;
+ msleep(1);
+ }
+
+ if (i == NGBE_PHY_RST_WAIT_PERIOD) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "PHY MODE RESET did not complete.\n");
+ return NGBE_ERR_RESET_FAILED;
+ }
+
+ return status;
+}
+
+u32 ngbe_phy_setup_link(struct ngbe_hw *hw,
+ u32 speed,
+ bool need_restart_AN)
+{
+ u16 value = 0;
+
+ DEBUGFUNC("ngbe_phy_setup_link");
+
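+ /*
+ * Registers 4 and 9 are the standard MII advertisement registers:
+ * mask 0xA0 in reg 4 covers the 100/10M half-duplex bits, 0x100/0x40
+ * the 100/10M full-duplex bits, and 0x200 in reg 9 is 1000BASE-T full.
+ */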
+ /* disable 10/100M Half Duplex */
+ TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ value &= 0xFF5F;
+ TCALL(hw, phy.ops.write_reg, 4, 0, value);
+
+ /* set advertise enable according to input speed */
+ if (!(speed & NGBE_LINK_SPEED_1GB_FULL)) {
+ TCALL(hw, phy.ops.read_reg, 9, 0, &value);
+ value &= 0xFDFF;
+ TCALL(hw, phy.ops.write_reg, 9, 0, value);
+ } else {
+ TCALL(hw, phy.ops.read_reg, 9, 0, &value);
+ value |= 0x200;
+ TCALL(hw, phy.ops.write_reg, 9, 0, value);
+ }
+
+ if (!(speed & NGBE_LINK_SPEED_100_FULL)) {
+ TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ value &= 0xFEFF;
+ TCALL(hw, phy.ops.write_reg, 4, 0, value);
+ } else {
+ TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ value |= 0x100;
+ TCALL(hw, phy.ops.write_reg, 4, 0, value);
+ }
+
+ if (!(speed & NGBE_LINK_SPEED_10_FULL)) {
+ TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ value &= 0xFFBF;
+ TCALL(hw, phy.ops.write_reg, 4, 0, value);
+ } else {
+ TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ value |= 0x40;
+ TCALL(hw, phy.ops.write_reg, 4, 0, value);
+ }
+
+ /* restart AN and wait for the AN done interrupt */
+ if (((hw->subsystem_device_id & NCSI_SUP_MASK) == NCSI_SUP) ||
+ ((hw->subsystem_device_id & OEM_MASK) == OCP_CARD)) {
+ if (need_restart_AN)
+ value = NGBE_MDI_PHY_RESTART_AN | NGBE_MDI_PHY_ANE;
+ else
+ value = NGBE_MDI_PHY_ANE;
+ } else {
+ value = NGBE_MDI_PHY_RESTART_AN | NGBE_MDI_PHY_ANE;
+ }
+ TCALL(hw, phy.ops.write_reg, 0, 0, value);
+
+ value = 0x205B;
+ TCALL(hw, phy.ops.write_reg, 16, 0xd04, value);
+ TCALL(hw, phy.ops.write_reg, 17, 0xd04, 0);
+
+ TCALL(hw, phy.ops.read_reg, 18, 0xd04, &value);
+
+ value = value & 0xFF8C;
+ /* act led blinking mode set to 60ms */
+ value |= 0x2;
+ TCALL(hw, phy.ops.write_reg, 18, 0xd04, value);
+
+ TCALL(hw, phy.ops.check_event);
+
+ return NGBE_OK;
+}
+
+s32 ngbe_phy_reset_m88e1512(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+
+ u16 value = 0;
+ int i;
+
+ DEBUGFUNC("ngbe_phy_reset_m88e1512");
+
+ if (hw->phy.type != ngbe_phy_m88e1512 &&
+ hw->phy.type != ngbe_phy_m88e1512_sfi)
+ return NGBE_ERR_PHY_TYPE;
+
+ /* Don't reset PHY if it's shut down due to overtemp. */
+ if (!hw->phy.reset_if_overtemp &&
+ (NGBE_ERR_OVERTEMP == TCALL(hw, phy.ops.check_overtemp))) {
+ ERROR_REPORT1(NGBE_ERROR_CAUTION,
+ "OVERTEMP! Skip PHY reset.\n");
+ return NGBE_ERR_OVERTEMP;
+ }
+
+ /* Blocked by MNG FW so bail */
+ if (ngbe_check_reset_blocked(hw))
+ return status;
+
+ /* select page 18 reg 20 */
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 18);
+
+ if (hw->phy.type == ngbe_phy_m88e1512)
+ /* mode select to RGMII-to-copper */
+ value = 0;
+ else
+ /* mode select to RGMII-to-sfi */
+ value = 2;
+ status = TCALL(hw, phy.ops.write_reg_mdi, 20, 0, value);
+ /* mode reset */
+ value |= NGBE_MDI_PHY_RESET;
+ status = TCALL(hw, phy.ops.write_reg_mdi, 20, 0, value);
+
+ for (i = 0; i < NGBE_PHY_RST_WAIT_PERIOD; i++) {
+ status = TCALL(hw, phy.ops.read_reg_mdi, 20, 0, &value);
+ if (!(value & NGBE_MDI_PHY_RESET))
+ break;
+ msleep(1);
+ }
+
+ if (i == NGBE_PHY_RST_WAIT_PERIOD) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "M88E1512 MODE RESET did not complete.\n");
+ return NGBE_ERR_RESET_FAILED;
+ }
+
+ return status;
+}
+
+s32 ngbe_phy_reset_yt8521s(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+
+ u16 value = 0;
+ int i;
+
+ DEBUGFUNC("ngbe_phy_reset_yt8521s");
+
+ if (hw->phy.type != ngbe_phy_yt8521s_sfi)
+ return NGBE_ERR_PHY_TYPE;
+
+ /* Don't reset PHY if it's shut down due to overtemp. */
+ if (!hw->phy.reset_if_overtemp &&
+ (NGBE_ERR_OVERTEMP == TCALL(hw, phy.ops.check_overtemp))) {
+ ERROR_REPORT1(NGBE_ERROR_CAUTION,
+ "OVERTEMP! Skip PHY reset.\n");
+ return NGBE_ERR_OVERTEMP;
+ }
+
+ /* Blocked by MNG FW so bail */
+ if (ngbe_check_reset_blocked(hw))
+ return status;
+
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0, 0, &value);
+ /* sds software reset */
+ value |= 0x8000;
+ status = ngbe_phy_write_reg_sds_mii_yt8521s(hw, 0, 0, value);
+
+ for (i = 0; i < NGBE_PHY_RST_WAIT_PERIOD; i++) {
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0, 0, &value);
+ if (!(value & 0x8000))
+ break;
+ msleep(1);
+ }
+
+ if (i == NGBE_PHY_RST_WAIT_PERIOD) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "YT8521S Software RESET did not complete.\n");
+ return NGBE_ERR_RESET_FAILED;
+ }
+
+ return status;
+}
+
+u32 ngbe_phy_setup_link_m88e1512(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete)
+{
+ u16 value_r4 = 0;
+ u16 value_r9 = 0;
+ u16 value;
+
+ DEBUGFUNC("ngbe_phy_setup_link_m88e1512");
+ UNREFERENCED_PARAMETER(autoneg_wait_to_complete);
+
+ hw->phy.autoneg_advertised = 0;
+ if (hw->phy.type == ngbe_phy_m88e1512) {
+ if (speed & NGBE_LINK_SPEED_1GB_FULL) {
+ value_r9 |= NGBE_M88E1512_1000BASET_FULL;
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_1GB_FULL;
+ }
+
+ if (speed & NGBE_LINK_SPEED_100_FULL) {
+ value_r4 |= NGBE_M88E1512_100BASET_FULL;
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_100_FULL;
+ }
+
+ if (speed & NGBE_LINK_SPEED_10_FULL) {
+ value_r4 |= NGBE_M88E1512_10BASET_FULL;
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_10_FULL;
+ }
+
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ TCALL(hw, phy.ops.read_reg_mdi, 4, 0, &value);
+ value &= ~(NGBE_M88E1512_100BASET_FULL |
+ NGBE_M88E1512_100BASET_HALF |
+ NGBE_M88E1512_10BASET_FULL |
+ NGBE_M88E1512_10BASET_HALF);
+ value_r4 |= value;
+ TCALL(hw, phy.ops.write_reg_mdi, 4, 0, value_r4);
+
+ TCALL(hw, phy.ops.read_reg_mdi, 9, 0, &value);
+ value &= ~(NGBE_M88E1512_1000BASET_FULL |
+ NGBE_M88E1512_1000BASET_HALF);
+ value_r9 |= value;
+ TCALL(hw, phy.ops.write_reg_mdi, 9, 0, value_r9);
+
+ value = NGBE_MDI_PHY_RESTART_AN | NGBE_MDI_PHY_ANE;
+ TCALL(hw, phy.ops.write_reg_mdi, 0, 0, value);
+ } else {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_1GB_FULL;
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ TCALL(hw, phy.ops.read_reg_mdi, 4, 0, &value);
+ value &= ~0x60;
+ value |= 0x20;
+ TCALL(hw, phy.ops.write_reg_mdi, 4, 0, value);
+
+ value = NGBE_MDI_PHY_RESTART_AN | NGBE_MDI_PHY_ANE;
+ TCALL(hw, phy.ops.write_reg_mdi, 0, 0, value);
+ }
+
+ TCALL(hw, phy.ops.check_event);
+
+ return NGBE_OK;
+}
+
+u32 ngbe_phy_setup_link_yt8521s(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete)
+{
+ s32 ret_val = 0;
+ u16 value;
+ u16 value_r4 = 0;
+ u16 value_r9 = 0;
+
+ DEBUGFUNC("ngbe_phy_setup_link_yt8521s");
+ UNREFERENCED_PARAMETER(autoneg_wait_to_complete);
+
+ hw->phy.autoneg_advertised = 0;
+
+ if (hw->phy.type == ngbe_phy_yt8521s) {
+ value_r4 = 0x140;
+ value_r9 = 0x200;
+ /* disable 100/10base-T auto-negotiation advertisement */
+ ngbe_phy_read_reg_mdi(hw, 0x4, 0, &value);
+ value &= ~value_r4;
+ ngbe_phy_write_reg_mdi(hw, 0x4, 0, value);
+
+ /* disable 1000base-T auto-negotiation advertisement */
+ ngbe_phy_read_reg_mdi(hw, 0x9, 0, &value);
+ value &= ~value_r9;
+ ngbe_phy_write_reg_mdi(hw, 0x9, 0, value);
+
+ value_r4 = 0x0;
+ value_r9 = 0x0;
+
+ if (speed & NGBE_LINK_SPEED_1GB_FULL) {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_1GB_FULL;
+ value_r9 |= 0x200;
+ }
+ if (speed & NGBE_LINK_SPEED_100_FULL) {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_100_FULL;
+ value_r4 |= 0x100;
+ }
+ if (speed & NGBE_LINK_SPEED_10_FULL) {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_10_FULL;
+ value_r4 |= 0x40;
+ }
+
+ /* enable 1000base-T auto-negotiation advertisement */
+ ngbe_phy_read_reg_mdi(hw, 0x9, 0, &value);
+ value |= value_r9;
+ ngbe_phy_write_reg_mdi(hw, 0x9, 0, value);
+
+ /* enable 100/10base-T auto-negotiation advertisement */
+ ngbe_phy_read_reg_mdi(hw, 0x4, 0, &value);
+ value |= value_r4;
+ ngbe_phy_write_reg_mdi(hw, 0x4, 0, value);
+
+ /* software reset to make the above configuration take effect */
+ ngbe_phy_read_reg_mdi(hw, 0x0, 0, &value);
+ value |= 0x8000;
+ ngbe_phy_write_reg_mdi(hw, 0x0, 0, value);
+ } else {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_1GB_FULL;
+
+ /* RGMII_Config1 : Config rx and tx training delay */
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xA003, 0, 0x3cf1);
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xA001, 0, 0x8041);
+
+ /* software reset */
+ ngbe_phy_write_reg_sds_ext_yt8521s(hw, 0x0, 0, 0x9140);
+
+ /* power on phy */
+ ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x0, 0, &value);
+ value &= ~0x800;
+ ngbe_phy_write_reg_sds_mii_yt8521s(hw, 0x0, 0, value);
+ }
+
+ TCALL(hw, phy.ops.check_event);
+
+ return ret_val;
+}
+
+s32 ngbe_phy_reset_zte(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+ u16 value = 0;
+ int i;
+
+ DEBUGFUNC("ngbe_phy_reset_zte");
+
+ if (hw->phy.type != ngbe_phy_zte)
+ return NGBE_ERR_PHY_TYPE;
+
+ /* Don't reset PHY if it's shut down due to overtemp. */
+ if (!hw->phy.reset_if_overtemp &&
+ (NGBE_ERR_OVERTEMP == TCALL(hw, phy.ops.check_overtemp))) {
+ ERROR_REPORT1(NGBE_ERROR_CAUTION,
+ "OVERTEMP! Skip PHY reset.\n");
+ return NGBE_ERR_OVERTEMP;
+ }
+
+ /* Blocked by MNG FW so bail */
+ if (ngbe_check_reset_blocked(hw))
+ return status;
+
+ /* zte phy */
+ /* set control register[0x0] to reset mode */
+ value = 1;
+ /* mode reset */
+ value |= NGBE_MDI_PHY_RESET;
+ status = TCALL(hw, phy.ops.write_reg_mdi, 0, 0, value);
+
+ for (i = 0; i < NGBE_PHY_RST_WAIT_PERIOD; i++) {
+ status = TCALL(hw, phy.ops.read_reg_mdi, 0, 0, &value);
+ if (!(value & NGBE_MDI_PHY_RESET))
+ break;
+ msleep(1);
+ }
+
+ if (i == NGBE_PHY_RST_WAIT_PERIOD) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "ZTE MODE RESET did not complete.\n");
+ return NGBE_ERR_RESET_FAILED;
+ }
+
+ return status;
+}
+
+u32 ngbe_phy_setup_link_zte(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete)
+{
+ u16 ngbe_phy_ccr = 0;
+
+ DEBUGFUNC("ngbe_phy_setup_link_zte");
+ UNREFERENCED_PARAMETER(autoneg_wait_to_complete);
+ /*
+ * Clear autoneg_advertised and set new values based on input link
+ * speed.
+ */
+ hw->phy.autoneg_advertised = 0;
+ TCALL(hw, phy.ops.read_reg_mdi, 0, 0, &ngbe_phy_ccr);
+
+ if (speed & NGBE_LINK_SPEED_1GB_FULL) {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_1GB_FULL;
+ ngbe_phy_ccr |= NGBE_MDI_PHY_SPEED_SELECT1; /* bit 6 */
+ } else if (speed & NGBE_LINK_SPEED_100_FULL) {
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_100_FULL;
+ ngbe_phy_ccr |= NGBE_MDI_PHY_SPEED_SELECT0; /* bit 13 */
+ } else if (speed & NGBE_LINK_SPEED_10_FULL)
+ hw->phy.autoneg_advertised |= NGBE_LINK_SPEED_10_FULL;
+ else
+ return NGBE_LINK_SPEED_UNKNOWN;
+
+ ngbe_phy_ccr |= NGBE_MDI_PHY_DUPLEX; /* restart autonegotiation */
+ TCALL(hw, phy.ops.write_reg_mdi, 0, 0, ngbe_phy_ccr);
+
+ return speed;
+}
+
+/**
+ * ngbe_phy_check_overtemp - Checks if an overtemp occurred.
+ * @hw: pointer to hardware structure
+ *
+ * Checks if the LASI temp alarm status was triggered due to overtemp
+ **/
+s32 ngbe_phy_check_overtemp(struct ngbe_hw *hw)
+{
+ s32 status = 0;
+ u32 ts_state;
+
+ DEBUGFUNC("ngbe_phy_check_overtemp");
+
+ /* Check that the LASI temp alarm status was triggered */
+ ts_state = rd32(hw, NGBE_TS_ALARM_ST);
+
+ if (ts_state & NGBE_TS_ALARM_ST_DALARM)
+ status = NGBE_ERR_UNDERTEMP;
+ else if (ts_state & NGBE_TS_ALARM_ST_ALARM)
+ status = NGBE_ERR_OVERTEMP;
+
+ return status;
+}
+
+s32 ngbe_phy_check_event(struct ngbe_hw *hw)
+{
+ u16 value = 0;
+ struct ngbe_adapter *adapter = hw->back;
+
+ TCALL(hw, phy.ops.read_reg, 0x1d, 0xa43, &value);
+ adapter->flags |= NGBE_FLAG_NEED_LINK_UPDATE;
+ if (value & 0x10) {
+ adapter->flags |= NGBE_FLAG_NEED_LINK_UPDATE;
+ } else if (value & 0x08) {
+ adapter->flags |= NGBE_FLAG_NEED_ANC_CHECK;
+ }
+
+ return NGBE_OK;
+}
+
+s32 ngbe_phy_check_event_m88e1512(struct ngbe_hw *hw)
+{
+ u16 value = 0;
+ struct ngbe_adapter *adapter = hw->back;
+
+ if (hw->phy.type == ngbe_phy_m88e1512)
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ else
+ TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ TCALL(hw, phy.ops.read_reg_mdi, 19, 0, &value);
+
+ if (value & NGBE_M88E1512_LSC) {
+ adapter->flags |= NGBE_FLAG_NEED_LINK_UPDATE;
+ }
+
+ if (value & NGBE_M88E1512_ANC) {
+ adapter->flags |= NGBE_FLAG_NEED_ANC_CHECK;
+ }
+
+ return NGBE_OK;
+}
+
+s32 ngbe_phy_check_event_yt8521s(struct ngbe_hw *hw)
+{
+ u16 value = 0;
+ struct ngbe_adapter *adapter = hw->back;
+
+ ngbe_phy_write_reg_ext_yt8521s(hw, 0xa000, 0, 0x0);
+ TCALL(hw, phy.ops.read_reg_mdi, 0x13, 0, &value);
+
+ if (value & (NGBE_YT8521S_SDS_LINK_UP | NGBE_YT8521S_SDS_LINK_DOWN)) {
+ adapter->flags |= NGBE_FLAG_NEED_LINK_UPDATE;
+ }
+
+ return NGBE_OK;
+}
+
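+/*
+ * Pause advertisement helpers: bits 10..11 of MII advertisement
+ * register 4 are the standard 802.3 Pause/Asym-Pause bits, matching
+ * the NGBE_TAF_SYM_PAUSE/NGBE_TAF_ASM_PAUSE encoding.
+ */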
+s32 ngbe_phy_get_advertised_pause(struct ngbe_hw *hw, u8 *pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ status = TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ *pause_bit = (u8)((value >> 10) & 0x3);
+ return status;
+}
+
+s32 ngbe_phy_get_advertised_pause_m88e1512(struct ngbe_hw *hw, u8 *pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ if (hw->phy.type == ngbe_phy_m88e1512) {
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ status = TCALL(hw, phy.ops.read_reg_mdi, 4, 0, &value);
+ *pause_bit = (u8)((value >> 10) & 0x3);
+ } else {
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ status = TCALL(hw, phy.ops.read_reg_mdi, 4, 0, &value);
+ *pause_bit = (u8)((value >> 7) & 0x3);
+ }
+ return status;
+}
+
+s32 ngbe_phy_get_advertised_pause_yt8521s(struct ngbe_hw *hw, u8 *pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x04, 0, &value);
+ *pause_bit = (u8)((value >> 7) & 0x3);
+ return status;
+}
+
+s32 ngbe_phy_get_lp_advertised_pause(struct ngbe_hw *hw, u8 *pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ status = TCALL(hw, phy.ops.read_reg, 0x1d, 0xa43, &value);
+
+ status = TCALL(hw, phy.ops.read_reg, 0x1, 0, &value);
+ value = (value >> 5) & 0x1;
+
+ /* if AN complete then check lp adv pause */
+ status = TCALL(hw, phy.ops.read_reg, 5, 0, &value);
+ *pause_bit = (u8)((value >> 10) & 0x3);
+ return status;
+}
+
+s32 ngbe_phy_get_lp_advertised_pause_m88e1512(struct ngbe_hw *hw,
+ u8 *pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ if (hw->phy.type == ngbe_phy_m88e1512) {
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ status = TCALL(hw, phy.ops.read_reg_mdi, 5, 0, &value);
+ *pause_bit = (u8)((value >> 10) & 0x3);
+ } else {
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ status = TCALL(hw, phy.ops.read_reg_mdi, 5, 0, &value);
+ *pause_bit = (u8)((value >> 7) & 0x3);
+ }
+ return status;
+}
+
+s32 ngbe_phy_get_lp_advertised_pause_yt8521s(struct ngbe_hw *hw,
+ u8 *pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x05, 0, &value);
+ *pause_bit = (u8)((value >> 7) & 0x3);
+ return status;
+}
+
+s32 ngbe_phy_set_pause_advertisement(struct ngbe_hw *hw, u16 pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ status = TCALL(hw, phy.ops.read_reg, 4, 0, &value);
+ value &= ~0xC00;
+ value |= pause_bit;
+ status = TCALL(hw, phy.ops.write_reg, 4, 0, value);
+ return status;
+}
+
+s32 ngbe_phy_set_pause_advertisement_m88e1512(struct ngbe_hw *hw,
+ u16 pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+ if (hw->phy.type == ngbe_phy_m88e1512) {
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 0);
+ status = TCALL(hw, phy.ops.read_reg_mdi, 4, 0, &value);
+ value &= ~0xC00;
+ value |= pause_bit;
+ status = TCALL(hw, phy.ops.write_reg_mdi, 4, 0, value);
+ } else {
+ status = TCALL(hw, phy.ops.write_reg_mdi, 22, 0, 1);
+ status = TCALL(hw, phy.ops.read_reg_mdi, 4, 0, &value);
+ value &= ~0x180;
+ value |= pause_bit;
+ status = TCALL(hw, phy.ops.write_reg_mdi, 4, 0, value);
+ }
+
+ return status;
+}
+
+s32 ngbe_phy_set_pause_advertisement_yt8521s(struct ngbe_hw *hw,
+ u16 pause_bit)
+{
+ u16 value;
+ s32 status = 0;
+
+ status = ngbe_phy_read_reg_sds_mii_yt8521s(hw, 0x04, 0, &value);
+ value &= ~0x180;
+ value |= pause_bit;
+ status = ngbe_phy_write_reg_sds_mii_yt8521s(hw, 0x04, 0, value);
+
+ return status;
+}
+
+s32 ngbe_phy_setup(struct ngbe_hw *hw)
+{
+ int i;
+ u16 value = 0;
+
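+ /*
+ * Internal GPHY bring-up: wait for the PHY to leave reset, then step
+ * what appears to be its init state machine through register 20 on
+ * page 0xa46; register 16 on page 0xa42 reports the PHY state, where
+ * 3 means ready.
+ */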
+ for (i = 0; i < 15; i++) {
+ if (!rd32m(hw, NGBE_MIS_ST, NGBE_MIS_ST_GPHY_IN_RST(hw->bus.lan_id))) {
+ break;
+ }
+ msleep(1);
+ }
+
+ if (i == 15) {
+ ERROR_REPORT1(NGBE_ERROR_POLLING,
+ "GPhy reset exceeds maximum times.\n");
+ return NGBE_ERR_PHY_TIMEOUT;
+ }
+
+ for (i = 0; i < 1000; i++) {
+ TCALL(hw, phy.ops.read_reg, 29, 0xa43, &value);
+ if (value & 0x20)
+ break;
+ }
+
+ TCALL(hw, phy.ops.write_reg, 20, 0xa46, 1);
+ for (i = 0; i < 1000; i++) {
+ TCALL(hw, phy.ops.read_reg, 29, 0xa43, &value);
+ if (value & 0x20)
+ break;
+ }
+ if (i == 1000) {
+ return NGBE_ERR_PHY_TIMEOUT;
+ }
+
+ TCALL(hw, phy.ops.write_reg, 20, 0xa46, 2);
+ for (i = 0; i < 1000; i++) {
+ TCALL(hw, phy.ops.read_reg, 29, 0xa43, &value);
+ if (value & 0x20)
+ break;
+ }
+
+ if (i == 1000) {
+ return NGBE_ERR_PHY_TIMEOUT;
+ }
+
+ for (i = 0; i < 1000; i++) {
+ TCALL(hw, phy.ops.read_reg, 16, 0xa42, &value);
+ if ((value & 0x7) == 3)
+ break;
+ }
+
+ if (i == 1000) {
+ return NGBE_ERR_PHY_TIMEOUT;
+ }
+
+ return NGBE_OK;
+}
+
+s32 ngbe_init_phy_ops_common(struct ngbe_hw *hw)
+{
+ struct ngbe_phy_info *phy = &hw->phy;
+
+ phy->ops.reset = ngbe_phy_reset;
+ phy->ops.read_reg = ngbe_phy_read_reg;
+ phy->ops.write_reg = ngbe_phy_write_reg;
+ phy->ops.setup_link = ngbe_phy_setup_link;
+ phy->ops.check_overtemp = ngbe_phy_check_overtemp;
+ phy->ops.identify = ngbe_phy_identify;
+ phy->ops.init = ngbe_phy_init;
+ phy->ops.check_event = ngbe_phy_check_event;
+ phy->ops.get_adv_pause = ngbe_phy_get_advertised_pause;
+ phy->ops.get_lp_adv_pause = ngbe_phy_get_lp_advertised_pause;
+ phy->ops.set_adv_pause = ngbe_phy_set_pause_advertisement;
+ phy->ops.setup_once = ngbe_phy_setup;
+
+ return NGBE_OK;
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_phy.h b/drivers/net/ethernet/netswift/ngbe/ngbe_phy.h
new file mode 100644
index 0000000000000..c6568018b20c7
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_phy.h
@@ -0,0 +1,201 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+#ifndef _NGBE_PHY_H_
+#define _NGBE_PHY_H_
+
+#include "ngbe_type.h"
+#include "ngbe.h"
+
+/* EEPROM byte offsets */
+#define NGBE_SFF_IDENTIFIER 0x0
+#define NGBE_SFF_IDENTIFIER_SFP 0x3
+#define NGBE_SFF_VENDOR_OUI_BYTE0 0x25
+#define NGBE_SFF_VENDOR_OUI_BYTE1 0x26
+#define NGBE_SFF_VENDOR_OUI_BYTE2 0x27
+#define NGBE_SFF_1GBE_COMP_CODES 0x6
+#define NGBE_SFF_10GBE_COMP_CODES 0x3
+#define NGBE_SFF_CABLE_TECHNOLOGY 0x8
+#define NGBE_SFF_CABLE_SPEC_COMP 0x3C
+#define NGBE_SFF_SFF_8472_SWAP 0x5C
+#define NGBE_SFF_SFF_8472_COMP 0x5E
+#define NGBE_SFF_SFF_8472_OSCB 0x6E
+#define NGBE_SFF_SFF_8472_ESCB 0x76
+#define NGBE_SFF_IDENTIFIER_QSFP_PLUS 0xD
+#define NGBE_SFF_QSFP_VENDOR_OUI_BYTE0 0xA5
+#define NGBE_SFF_QSFP_VENDOR_OUI_BYTE1 0xA6
+#define NGBE_SFF_QSFP_VENDOR_OUI_BYTE2 0xA7
+#define NGBE_SFF_QSFP_CONNECTOR 0x82
+#define NGBE_SFF_QSFP_10GBE_COMP 0x83
+#define NGBE_SFF_QSFP_1GBE_COMP 0x86
+#define NGBE_SFF_QSFP_CABLE_LENGTH 0x92
+#define NGBE_SFF_QSFP_DEVICE_TECH 0x93
+
+/* Bitmasks */
+#define NGBE_SFF_DA_PASSIVE_CABLE 0x4
+#define NGBE_SFF_DA_ACTIVE_CABLE 0x8
+#define NGBE_SFF_DA_SPEC_ACTIVE_LIMITING 0x4
+#define NGBE_SFF_1GBASESX_CAPABLE 0x1
+#define NGBE_SFF_1GBASELX_CAPABLE 0x2
+#define NGBE_SFF_1GBASET_CAPABLE 0x8
+#define NGBE_SFF_10GBASESR_CAPABLE 0x10
+#define NGBE_SFF_10GBASELR_CAPABLE 0x20
+#define NGBE_SFF_SOFT_RS_SELECT_MASK 0x8
+#define NGBE_SFF_SOFT_RS_SELECT_10G 0x8
+#define NGBE_SFF_SOFT_RS_SELECT_1G 0x0
+#define NGBE_SFF_ADDRESSING_MODE 0x4
+#define NGBE_SFF_QSFP_DA_ACTIVE_CABLE 0x1
+#define NGBE_SFF_QSFP_DA_PASSIVE_CABLE 0x8
+#define NGBE_SFF_QSFP_CONNECTOR_NOT_SEPARABLE 0x23
+#define NGBE_SFF_QSFP_TRANSMITER_850NM_VCSEL 0x0
+#define NGBE_I2C_EEPROM_READ_MASK 0x100
+#define NGBE_I2C_EEPROM_STATUS_MASK 0x3
+#define NGBE_I2C_EEPROM_STATUS_NO_OPERATION 0x0
+#define NGBE_I2C_EEPROM_STATUS_PASS 0x1
+#define NGBE_I2C_EEPROM_STATUS_FAIL 0x2
+#define NGBE_I2C_EEPROM_STATUS_IN_PROGRESS 0x3
+
+#define NGBE_CS4227 0xBE /* CS4227 address */
+#define NGBE_CS4227_GLOBAL_ID_LSB 0
+#define NGBE_CS4227_SCRATCH 2
+#define NGBE_CS4227_GLOBAL_ID_VALUE 0x03E5
+#define NGBE_CS4227_SCRATCH_VALUE 0x5aa5
+#define NGBE_CS4227_RETRIES 5
+#define NGBE_CS4227_LINE_SPARE22_MSB 0x12AD /* Reg to program speed */
+#define NGBE_CS4227_LINE_SPARE24_LSB 0x12B0 /* Reg to program EDC */
+#define NGBE_CS4227_HOST_SPARE22_MSB 0x1AAD /* Reg to program speed */
+#define NGBE_CS4227_HOST_SPARE24_LSB 0x1AB0 /* Reg to program EDC */
+#define NGBE_CS4227_EDC_MODE_CX1 0x0002
+#define NGBE_CS4227_EDC_MODE_SR 0x0004
+#define NGBE_CS4227_RESET_HOLD 500 /* microseconds */
+#define NGBE_CS4227_RESET_DELAY 500 /* milliseconds */
+#define NGBE_CS4227_CHECK_DELAY 30 /* milliseconds */
+#define NGBE_PE 0xE0 /* Port expander address */
+#define NGBE_PE_OUTPUT 1 /* Output register offset */
+#define NGBE_PE_CONFIG 3 /* Config register offset */
+#define NGBE_PE_BIT1 (1 << 1)
+
+/* Flow control defines */
+#define NGBE_TAF_SYM_PAUSE (0x1)
+#define NGBE_TAF_ASM_PAUSE (0x2)
+
+/* Bit-shift macros */
+#define NGBE_SFF_VENDOR_OUI_BYTE0_SHIFT 24
+#define NGBE_SFF_VENDOR_OUI_BYTE1_SHIFT 16
+#define NGBE_SFF_VENDOR_OUI_BYTE2_SHIFT 8
+
+/* Vendor OUIs: format of OUI is 0x[byte0][byte1][byte2][00] */
+#define NGBE_SFF_VENDOR_OUI_TYCO 0x00407600
+#define NGBE_SFF_VENDOR_OUI_FTL 0x00906500
+#define NGBE_SFF_VENDOR_OUI_AVAGO 0x00176A00
+#define NGBE_SFF_VENDOR_OUI_INTEL 0x001B2100
+
+/* I2C SDA and SCL timing parameters for standard mode */
+#define NGBE_I2C_T_HD_STA 4
+#define NGBE_I2C_T_LOW 5
+#define NGBE_I2C_T_HIGH 4
+#define NGBE_I2C_T_SU_STA 5
+#define NGBE_I2C_T_HD_DATA 5
+#define NGBE_I2C_T_SU_DATA 1
+#define NGBE_I2C_T_RISE 1
+#define NGBE_I2C_T_FALL 1
+#define NGBE_I2C_T_SU_STO 4
+#define NGBE_I2C_T_BUF 5
+
+#ifndef NGBE_SFP_DETECT_RETRIES
+#define NGBE_SFP_DETECT_RETRIES 10
+#endif /* NGBE_SFP_DETECT_RETRIES */
+
+/* SFP+ SFF-8472 Compliance */
+#define NGBE_SFF_SFF_8472_UNSUP 0x00
+
+enum ngbe_phy_type ngbe_get_phy_type_from_id(struct ngbe_hw *hw);
+s32 ngbe_init_phy_ops_common(struct ngbe_hw *hw);
+s32 ngbe_phy_read_reg_mdi(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data);
+s32 ngbe_phy_write_reg_mdi(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data);
+
+s32 ngbe_phy_read_reg_sds_mii_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data);
+s32 ngbe_phy_write_reg_sds_mii_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data);
+
+s32 ngbe_phy_read_reg_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data);
+s32 ngbe_phy_write_reg_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data);
+
+s32 ngbe_phy_read_reg_sds_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 *phy_data);
+s32 ngbe_phy_write_reg_sds_ext_yt8521s(struct ngbe_hw *hw,
+ u32 reg_addr,
+ u32 device_type,
+ u16 phy_data);
+
+s32 ngbe_phy_init(struct ngbe_hw *hw);
+s32 ngbe_phy_identify(struct ngbe_hw *hw);
+s32 ngbe_phy_reset(struct ngbe_hw *hw);
+u32 ngbe_phy_setup_link(struct ngbe_hw *hw,
+ u32 speed,
+ bool need_restart_AN);
+s32 ngbe_phy_reset_m88e1512(struct ngbe_hw *hw);
+u32 ngbe_phy_setup_link_m88e1512(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete);
+s32 ngbe_phy_check_overtemp(struct ngbe_hw *hw);
+
+s32 ngbe_check_zte_phy_id(struct ngbe_hw *hw);
+s32 ngbe_phy_reset_zte(struct ngbe_hw *hw);
+u32 ngbe_phy_setup_link_zte(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete);
+s32 ngbe_phy_check_event(struct ngbe_hw *hw);
+s32 ngbe_phy_check_event_m88e1512(struct ngbe_hw *hw);
+s32 ngbe_phy_get_advertised_pause_m88e1512(struct ngbe_hw *hw, u8 *pause_bit);
+s32 ngbe_phy_get_lp_advertised_pause_m88e1512(struct ngbe_hw *hw,
+ u8 *pause_bit);
+s32 ngbe_phy_set_pause_advertisement_m88e1512(struct ngbe_hw *hw,
+ u16 pause_bit);
+
+s32 ngbe_phy_reset_yt8521s(struct ngbe_hw *hw);
+u32 ngbe_phy_setup_link_yt8521s(struct ngbe_hw *hw,
+ u32 speed,
+ bool autoneg_wait_to_complete);
+
+s32 ngbe_phy_check_event_yt8521s(struct ngbe_hw *hw);
+s32 ngbe_phy_get_advertised_pause_yt8521s(struct ngbe_hw *hw, u8 *pause_bit);
+s32 ngbe_phy_get_lp_advertised_pause_yt8521s(struct ngbe_hw *hw,
+ u8 *pause_bit);
+s32 ngbe_phy_set_pause_advertisement_yt8521s(struct ngbe_hw *hw,
+ u16 pause_bit);
+
+#endif /* _NGBE_PHY_H_ */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_procfs.c b/drivers/net/ethernet/netswift/ngbe/ngbe_procfs.c
new file mode 100644
index 0000000000000..f7ef1da9fd4ef
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_procfs.c
@@ -0,0 +1,908 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include "ngbe.h"
+#include "ngbe_hw.h"
+#include "ngbe_type.h"
+
+#ifdef CONFIG_NGBE_PROCFS
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/proc_fs.h>
+#include <linux/device.h>
+#include <linux/netdevice.h>
+
+static struct proc_dir_entry *ngbe_top_dir;
+
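+/*
+ * Each read handler below follows the same pattern: validate the
+ * adapter (and, where needed, hw/netdev) pointer passed as private
+ * data, then snprintf a single value into the page buffer. The
+ * entries are presumably registered with /proc later in this file.
+ */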
+static struct net_device_stats *procfs_get_stats(struct net_device *netdev)
+{
+ if (netdev == NULL)
+ return NULL;
+
+ /* only return the current stats */
+ return &netdev->stats;
+}
+
+static int ngbe_fwbanner(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ return snprintf(page, count, "%s\n", adapter->eeprom_id);
+}
+
+static int ngbe_porttype(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ return snprintf(page, count, "%d\n",
+ test_bit(__NGBE_DOWN, &adapter->state));
+}
+
+static int ngbe_portspeed(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ int speed = 0;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ switch (adapter->link_speed) {
+ case NGBE_LINK_SPEED_100_FULL:
+ speed = 1;
+ break;
+ case NGBE_LINK_SPEED_1GB_FULL:
+ speed = 10;
+ break;
+ case NGBE_LINK_SPEED_10GB_FULL:
+ speed = 100;
+ break;
+ default:
+ break;
+ }
+ return snprintf(page, count, "%d\n", speed);
+}
+
+static int ngbe_wqlflag(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ return snprintf(page, count, "%d\n", adapter->wol);
+}
+
+static int ngbe_xflowctl(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct ngbe_hw *hw;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "%d\n", hw->fc.current_mode);
+}
+
+static int ngbe_rxdrops(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->rx_dropped);
+}
+
+static int ngbe_rxerrors(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n", net_stats->rx_errors);
+}
+
+static int ngbe_rxupacks(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "%d\n", rd32(hw, NGBE_TPR));
+}
+
+static int ngbe_rxmpacks(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ int i, mprc = 0;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+ for (i = 0; i < 8; i++)
+ mprc += rd32(hw, NGBE_PX_MPRC(i));
+ return snprintf(page, count, "%d\n", mprc);
+}
+
+static int ngbe_rxbpacks(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "%d\n",
+ rd32(hw, NGBE_RX_BC_FRAMES_GOOD_LOW));
+}
+
+static int ngbe_txupacks(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "%d\n",
+ rd32(hw, NGBE_TX_FRAME_CNT_GOOD_BAD_LOW));
+}
+
+static int ngbe_txmpacks(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "%d\n",
+ rd32(hw, NGBE_TX_MC_FRAMES_GOOD_LOW));
+}
+
+static int ngbe_txbpacks(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "%d\n",
+ rd32(hw, NGBE_TX_BC_FRAMES_GOOD_LOW));
+}
+
+static int ngbe_txerrors(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->tx_errors);
+}
+
+static int ngbe_txdrops(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->tx_dropped);
+}
+
+static int ngbe_rxframes(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->rx_packets);
+}
+
+static int ngbe_rxbytes(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->rx_bytes);
+}
+
+static int ngbe_txframes(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->tx_packets);
+}
+
+static int ngbe_txbytes(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device_stats *net_stats;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ net_stats = procfs_get_stats(adapter->netdev);
+ if (net_stats == NULL)
+ return snprintf(page, count, "error: no net stats\n");
+
+ return snprintf(page, count, "%lu\n",
+ net_stats->tx_bytes);
+}
+
+static int ngbe_linkstat(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ int bitmask = 0;
+ u32 link_speed;
+ bool link_up = false;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ if (!test_bit(__NGBE_DOWN, &adapter->state))
+ bitmask |= 1;
+
+ /* always assume link is up, as there is no link check function */
+ link_up = true;
+ if (link_up)
+ bitmask |= 2;
+
+ if (adapter->old_lsc != adapter->lsc_int) {
+ bitmask |= 4;
+ adapter->old_lsc = adapter->lsc_int;
+ }
+
+ return snprintf(page, count, "0x%X\n", bitmask);
+}
+
+static int ngbe_funcid(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct ngbe_hw *hw;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "0x%X\n", hw->bus.func);
+}
+
+static int ngbe_funcvers(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void __always_unused *data)
+{
+ return snprintf(page, count, "%s\n", ngbe_driver_version);
+}
+
+static int ngbe_macburn(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "0x%02X%02X%02X%02X%02X%02X\n",
+ (unsigned int)hw->mac.perm_addr[0],
+ (unsigned int)hw->mac.perm_addr[1],
+ (unsigned int)hw->mac.perm_addr[2],
+ (unsigned int)hw->mac.perm_addr[3],
+ (unsigned int)hw->mac.perm_addr[4],
+ (unsigned int)hw->mac.perm_addr[5]);
+}
+
+static int ngbe_macadmn(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_hw *hw;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ return snprintf(page, count, "0x%02X%02X%02X%02X%02X%02X\n",
+ (unsigned int)hw->mac.addr[0],
+ (unsigned int)hw->mac.addr[1],
+ (unsigned int)hw->mac.addr[2],
+ (unsigned int)hw->mac.addr[3],
+ (unsigned int)hw->mac.addr[4],
+ (unsigned int)hw->mac.addr[5]);
+}
+
+static int ngbe_maclla1(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct ngbe_hw *hw;
+ int rc;
+ u16 eeprom_buff[6];
+ u16 first_word = 0x37;
+ const u16 word_count = ARRAY_SIZE(eeprom_buff);
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ hw = &adapter->hw;
+ if (hw == NULL)
+ return snprintf(page, count, "error: no hw data\n");
+
+ rc = TCALL(hw, eeprom.ops.read_buffer, first_word, 1, &first_word);
+ if (rc != 0)
+ return snprintf(page, count,
+ "error: reading pointer to the EEPROM\n");
+
+ if (first_word != 0x0000 && first_word != 0xFFFF) {
+ rc = TCALL(hw, eeprom.ops.read_buffer, first_word, word_count,
+ eeprom_buff);
+ if (rc != 0)
+ return snprintf(page, count, "error: reading buffer\n");
+ } else {
+ memset(eeprom_buff, 0, sizeof(eeprom_buff));
+ }
+
+ switch (hw->bus.func) {
+ case 0:
+ return snprintf(page, count, "0x%04X%04X%04X\n",
+ eeprom_buff[0],
+ eeprom_buff[1],
+ eeprom_buff[2]);
+ case 1:
+ return snprintf(page, count, "0x%04X%04X%04X\n",
+ eeprom_buff[3],
+ eeprom_buff[4],
+ eeprom_buff[5]);
+ default:
+ return snprintf(page, count, "unexpected port %d\n", hw->bus.func);
+ }
+}
+
+static int ngbe_mtusize(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device *netdev;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ netdev = adapter->netdev;
+ if (netdev == NULL)
+ return snprintf(page, count, "error: no net device\n");
+
+ return snprintf(page, count, "%d\n", netdev->mtu);
+}
+
+static int ngbe_featflag(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ int bitmask = 0;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device *netdev;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ netdev = adapter->netdev;
+ if (netdev == NULL)
+ return snprintf(page, count, "error: no net device\n");
+ if (adapter->netdev->features & NETIF_F_RXCSUM)
+ bitmask |= 1;
+ return snprintf(page, count, "%d\n", bitmask);
+}
+
+static int ngbe_lsominct(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void __always_unused *data)
+{
+ return snprintf(page, count, "%d\n", 1);
+}
+
+static int ngbe_prommode(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ struct net_device *netdev;
+
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+ netdev = adapter->netdev;
+ if (netdev == NULL)
+ return snprintf(page, count, "error: no net device\n");
+
+ return snprintf(page, count, "%d\n",
+ netdev->flags & IFF_PROMISC);
+}
+
+static int ngbe_txdscqsz(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ return snprintf(page, count, "%d\n", adapter->tx_ring[0]->count);
+}
+
+static int ngbe_rxdscqsz(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ return snprintf(page, count, "%d\n", adapter->rx_ring[0]->count);
+}
+
+static int ngbe_rxqavg(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ int index;
+ int diff = 0;
+ u16 ntc;
+ u16 ntu;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ for (index = 0; index < adapter->num_rx_queues; index++) {
+ ntc = adapter->rx_ring[index]->next_to_clean;
+ ntu = adapter->rx_ring[index]->next_to_use;
+
+ if (ntc >= ntu)
+ diff += (ntc - ntu);
+ else
+ diff += (adapter->rx_ring[index]->count - ntu + ntc);
+ }
+ if (adapter->num_rx_queues <= 0)
+ return snprintf(page, count,
+ "can't calculate, number of queues %d\n",
+ adapter->num_rx_queues);
+ return snprintf(page, count, "%d\n", diff/adapter->num_rx_queues);
+}
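+
+/*
+ * For illustration, each per-ring term above is the distance from
+ * next_to_use forward to next_to_clean, taken modulo the ring size.
+ * With hypothetical values count = 512, ntu = 500 and ntc = 10, the
+ * else-branch contributes 512 - 500 + 10 = 22 to diff.
+ */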
+
+static int ngbe_txqavg(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ int index;
+ int diff = 0;
+ u16 ntc;
+ u16 ntu;
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ for (index = 0; index < adapter->num_tx_queues; index++) {
+ ntc = adapter->tx_ring[index]->next_to_clean;
+ ntu = adapter->tx_ring[index]->next_to_use;
+
+ if (ntc >= ntu)
+ diff += (ntc - ntu);
+ else
+ diff += (adapter->tx_ring[index]->count - ntu + ntc);
+ }
+ if (adapter->num_tx_queues <= 0)
+ return snprintf(page, count,
+ "can't calculate, number of queues %d\n",
+ adapter->num_tx_queues);
+ return snprintf(page, count, "%d\n",
+ diff/adapter->num_tx_queues);
+}
+
+static int ngbe_iovotype(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void __always_unused *data)
+{
+ return snprintf(page, count, "2\n");
+}
+
+static int ngbe_funcnbr(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ return snprintf(page, count, "%d\n", adapter->num_vfs);
+}
+
+static int ngbe_pciebnbr(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_adapter *adapter = (struct ngbe_adapter *)data;
+ if (adapter == NULL)
+ return snprintf(page, count, "error: no adapter\n");
+
+ return snprintf(page, count, "%d\n", adapter->pdev->bus->number);
+}
+
+static int ngbe_therm_dealarmthresh(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_therm_proc_data *therm_data =
+ (struct ngbe_therm_proc_data *)data;
+
+ if (therm_data == NULL)
+ return snprintf(page, count, "error: no therm_data\n");
+
+ return snprintf(page, count, "%d\n",
+ therm_data->sensor_data->dalarm_thresh);
+}
+
+static int ngbe_therm_alarmthresh(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ struct ngbe_therm_proc_data *therm_data =
+ (struct ngbe_therm_proc_data *)data;
+
+ if (therm_data == NULL)
+ return snprintf(page, count, "error: no therm_data\n");
+
+ return snprintf(page, count, "%d\n",
+ therm_data->sensor_data->alarm_thresh);
+}
+
+static int ngbe_therm_temp(char *page, char __always_unused **start,
+ off_t __always_unused off, int count,
+ int __always_unused *eof, void *data)
+{
+ s32 status;
+ struct ngbe_therm_proc_data *therm_data =
+ (struct ngbe_therm_proc_data *)data;
+
+ if (therm_data == NULL)
+ return snprintf(page, count, "error: no therm_data\n");
+
+ status = ngbe_get_thermal_sensor_data(therm_data->hw);
+	if (status != 0)
+		return snprintf(page, count, "error: status %d returned\n",
+				status);
+
+ return snprintf(page, count, "%d\n", therm_data->sensor_data->temp);
+}
+
+struct ngbe_proc_type {
+ char name[32];
+ int (*read)(char*, char**, off_t, int, int*, void*);
+};
+
+struct ngbe_proc_type ngbe_proc_entries[] = {
+ {"fwbanner", &ngbe_fwbanner},
+ {"porttype", &ngbe_porttype},
+ {"portspeed", &ngbe_portspeed},
+ {"wqlflag", &ngbe_wqlflag},
+ {"xflowctl", &ngbe_xflowctl},
+ {"rxdrops", &ngbe_rxdrops},
+ {"rxerrors", &ngbe_rxerrors},
+ {"rxupacks", &ngbe_rxupacks},
+ {"rxmpacks", &ngbe_rxmpacks},
+ {"rxbpacks", &ngbe_rxbpacks},
+ {"txdrops", &ngbe_txdrops},
+ {"txerrors", &ngbe_txerrors},
+ {"txupacks", &ngbe_txupacks},
+ {"txmpacks", &ngbe_txmpacks},
+ {"txbpacks", &ngbe_txbpacks},
+ {"rxframes", &ngbe_rxframes},
+ {"rxbytes", &ngbe_rxbytes},
+ {"txframes", &ngbe_txframes},
+ {"txbytes", &ngbe_txbytes},
+ {"linkstat", &ngbe_linkstat},
+ {"funcid", &ngbe_funcid},
+ {"funcvers", &ngbe_funcvers},
+ {"macburn", &ngbe_macburn},
+ {"macadmn", &ngbe_macadmn},
+ {"maclla1", &ngbe_maclla1},
+ {"mtusize", &ngbe_mtusize},
+ {"featflag", &ngbe_featflag},
+ {"lsominct", &ngbe_lsominct},
+ {"prommode", &ngbe_prommode},
+ {"txdscqsz", &ngbe_txdscqsz},
+ {"rxdscqsz", &ngbe_rxdscqsz},
+ {"txqavg", &ngbe_txqavg},
+ {"rxqavg", &ngbe_rxqavg},
+ {"iovotype", &ngbe_iovotype},
+ {"funcnbr", &ngbe_funcnbr},
+ {"pciebnbr", &ngbe_pciebnbr},
+ {"", NULL}
+};
+
+struct ngbe_proc_type ngbe_internal_entries[] = {
+ {"temp", &ngbe_therm_temp},
+ {"alarmthresh", &ngbe_therm_alarmthresh},
+ {"dealarmthresh", &ngbe_therm_dealarmthresh},
+ {"", NULL}
+};
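+
+/*
+ * With the tables above, ngbe_procfs_init() below builds a per-device tree
+ * such as (the PCI address shown is only an example):
+ *
+ *   /proc/driver/ngbe/0000:01:00.0/info/linkstat
+ *   /proc/driver/ngbe/0000:01:00.0/info/maclla1
+ *   /proc/driver/ngbe/0000:01:00.0/info/sensor/temp
+ *
+ * where each file is backed by the matching read callback registered via
+ * create_proc_read_entry().
+ */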
+
+void ngbe_del_proc_entries(struct ngbe_adapter *adapter)
+{
+ int index;
+ int i;
+ char buf[16]; /* much larger than the sensor number will ever be */
+
+ if (ngbe_top_dir == NULL)
+ return;
+
+ for (i = 0; i < NGBE_MAX_SENSORS; i++) {
+ if (adapter->therm_dir[i] == NULL)
+ continue;
+
+ for (index = 0; ; index++) {
+ if (ngbe_internal_entries[index].read == NULL)
+ break;
+
+ remove_proc_entry(ngbe_internal_entries[index].name,
+ adapter->therm_dir[i]);
+ }
+ snprintf(buf, sizeof(buf), "sensor_%d", i);
+ remove_proc_entry(buf, adapter->info_dir);
+ }
+
+ if (adapter->info_dir != NULL) {
+ for (index = 0; ; index++) {
+ if (ngbe_proc_entries[index].read == NULL)
+ break;
+ remove_proc_entry(ngbe_proc_entries[index].name,
+ adapter->info_dir);
+ }
+ remove_proc_entry("info", adapter->eth_dir);
+ }
+
+ if (adapter->eth_dir != NULL)
+ remove_proc_entry(pci_name(adapter->pdev), ngbe_top_dir);
+}
+
+/* called from ngbe_main.c */
+void ngbe_procfs_exit(struct ngbe_adapter *adapter)
+{
+ ngbe_del_proc_entries(adapter);
+}
+
+int ngbe_procfs_topdir_init(void)
+{
+ ngbe_top_dir = proc_mkdir("driver/ngbe", NULL);
+ if (ngbe_top_dir == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void ngbe_procfs_topdir_exit(void)
+{
+ remove_proc_entry("driver/ngbe", NULL);
+}
+
+/* called from ngbe_main.c */
+int ngbe_procfs_init(struct ngbe_adapter *adapter)
+{
+ int rc = 0;
+ int index;
+ int i;
+ char buf[16]; /* much larger than the sensor number will ever be */
+
+ adapter->eth_dir = NULL;
+ adapter->info_dir = NULL;
+ adapter->therm_dir = NULL;
+
+ if (ngbe_top_dir == NULL) {
+ rc = -ENOMEM;
+ goto fail;
+ }
+
+ adapter->eth_dir = proc_mkdir(pci_name(adapter->pdev), ngbe_top_dir);
+ if (adapter->eth_dir == NULL) {
+ rc = -ENOMEM;
+ goto fail;
+ }
+
+ adapter->info_dir = proc_mkdir("info", adapter->eth_dir);
+ if (adapter->info_dir == NULL) {
+ rc = -ENOMEM;
+ goto fail;
+ }
+ for (index = 0; ; index++) {
+ if (ngbe_proc_entries[index].read == NULL)
+ break;
+ if (!(create_proc_read_entry(ngbe_proc_entries[index].name,
+ 0444,
+ adapter->info_dir,
+ ngbe_proc_entries[index].read,
+ adapter))) {
+
+ rc = -ENOMEM;
+ goto fail;
+ }
+ }
+ if (!TCALL(&(adapter->hw), ops.init_thermal_sensor_thresh))
+ goto exit;
+
+ snprintf(buf, sizeof(buf), "sensor");
+ adapter->therm_dir = proc_mkdir(buf, adapter->info_dir);
+ if (adapter->therm_dir == NULL) {
+ rc = -ENOMEM;
+ goto fail;
+ }
+ for (index = 0; ; index++) {
+ if (ngbe_internal_entries[index].read == NULL)
+ break;
+ /*
+ * therm_data struct contains pointer the read func
+ * will be needing
+ */
+ adapter->therm_data.hw = &adapter->hw;
+ adapter->therm_data.sensor_data =
+ &adapter->hw.mac.thermal_sensor_data.sensor;
+
+ if (!(create_proc_read_entry(
+ ngbe_internal_entries[index].name,
+ 0444,
+ adapter->therm_dir,
+ ngbe_internal_entries[index].read,
+ &adapter->therm_data))) {
+ rc = -ENOMEM;
+ goto fail;
+ }
+ }
+
+ goto exit;
+
+fail:
+ ngbe_del_proc_entries(adapter);
+exit:
+ return rc;
+}
+
+#endif /* CONFIG_NGBE_PROCFS */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_ptp.c b/drivers/net/ethernet/netswift/ngbe/ngbe_ptp.c
new file mode 100644
index 0000000000000..87e7d5dc11a43
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_ptp.c
@@ -0,0 +1,858 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+#include "ngbe.h"
+#include <linux/ptp_classify.h>
+
+/*
+ * SYSTIME is defined by a fixed point system which allows the user to
+ * define the scale counter increment value at every level change of
+ * the oscillator driving SYSTIME value. The time unit is determined by
+ * the clock frequency of the oscillator and TIMINCA register.
+ * The cyclecounter and timecounter structures are used to convert
+ * the scale counter into nanoseconds. SYSTIME registers need to be converted
+ * to ns values by use of only a right shift.
+ * The following math determines the largest incvalue that will fit into
+ * the available bits in the TIMINCA register:
+ * Period * [ 2 ^ ( MaxWidth - PeriodWidth ) ]
+ * PeriodWidth: Number of bits to store the clock period
+ * MaxWidth: The maximum width value of the TIMINCA register
+ * Period: The clock period for the oscillator, which changes based on the link
+ * speed:
+ * At 10Gb link or no link, the period is 6.4 ns.
+ * At 1Gb link, the period is multiplied by 10. (64ns)
+ * At 100Mb link, the period is multiplied by 100. (640ns)
+ * round(): discard the fractional portion of the calculation
+ *
+ * The calculated value allows us to right shift the SYSTIME register
+ * value in order to quickly convert it into a nanosecond clock,
+ * while allowing for the maximum possible adjustment value.
+ *
+ * LinkSpeed ClockFreq ClockPeriod TIMINCA:IV
+ * 10000Mbps 156.25MHz 6.4*10^-9 0xCCCCCC(0xFFFFF/ns)
+ * 1000 Mbps 62.5 MHz 16 *10^-9 0x800000(0x7FFFF/ns)
+ * 100 Mbps 6.25 MHz 160*10^-9 0xA00000(0xFFFF/ns)
+ * 10 Mbps 0.625 MHz 1600*10^-9 0xC7F380(0xFFF/ns)
+ * FPGA 31.25 MHz 32 *10^-9 0x800000(0x3FFFF/ns)
+ *
+ * These diagrams are only for the 10Gb link period
+ *
+ * +--------------+ +--------------+
+ * | 32 | | 8 | 3 | 20 |
+ * *--------------+ +--------------+
+ * \________ 43 bits ______/ fract
+ *
+ * The 43 bit SYSTIME overflows every
+ * 2^43 * 10^-9 / 3600 = 2.4 hours
+ */
+#define NGBE_INCVAL_10GB 0xCCCCCC
+#define NGBE_INCVAL_1GB 0x2000000 /* in Emerald all speeds are the same */
+#define NGBE_INCVAL_100 0xA00000
+#define NGBE_INCVAL_10 0xC7F380
+#define NGBE_INCVAL_FPGA 0x800000
+
+#define NGBE_INCVAL_SHIFT_10GB 20
+#define NGBE_INCVAL_SHIFT_1GB 22 /* in Emerald all speeds are the same */
+#define NGBE_INCVAL_SHIFT_100 15
+#define NGBE_INCVAL_SHIFT_10 12
+#define NGBE_INCVAL_SHIFT_FPGA 17
+
+#define NGBE_OVERFLOW_PERIOD (HZ * 30)
+#define NGBE_PTP_TX_TIMEOUT (HZ)
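+
+/*
+ * A minimal sketch of the resulting conversion, assuming the 1Gb values
+ * above are in effect (ngbe_ptp_start_cyclecounter() below sets
+ * cc.mult = 1 and takes cc.shift/incval from ngbe_ptp_link_speed_adjust()):
+ *
+ *   u64 stamp = ngbe_ptp_read(&adapter->hw_cc);
+ *   u64 ns = stamp >> NGBE_INCVAL_SHIFT_1GB;
+ *
+ * i.e. SYSTIME counts in units of 2^-22 ns and the timecounter core does
+ * the multiply-and-shift (plus overflow accounting) for us.
+ */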
+
+/**
+ * ngbe_ptp_read - read raw cycle counter (to be used by time counter)
+ * @hw_cc: the cyclecounter structure
+ *
+ * this function reads the cyclecounter registers and is called by the
+ * cyclecounter structure used to construct a ns counter from the
+ * arbitrary fixed point registers
+ */
+static u64 ngbe_ptp_read(const struct cyclecounter *hw_cc)
+{
+ struct ngbe_adapter *adapter =
+ container_of(hw_cc, struct ngbe_adapter, hw_cc);
+ struct ngbe_hw *hw = &adapter->hw;
+ u64 stamp = 0;
+
+ stamp |= (u64)rd32(hw, NGBE_TSEC_1588_SYSTIML);
+ stamp |= (u64)rd32(hw, NGBE_TSEC_1588_SYSTIMH) << 32;
+
+ return stamp;
+}
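+
+/*
+ * E.g. with hypothetical register contents SYSTIML = 0x89ABCDEF and
+ * SYSTIMH = 0x01234567, the assembled stamp is 0x0123456789ABCDEF.
+ */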
+
+/**
+ * ngbe_ptp_convert_to_hwtstamp - convert register value to hw timestamp
+ * @adapter: private adapter structure
+ * @hwtstamp: stack timestamp structure
+ * @systim: unsigned 64bit system time value
+ *
+ * We need to convert the adapter's RX/TXSTMP registers into a hwtstamp value
+ * which can be used by the stack's ptp functions.
+ *
+ * The lock is used to protect consistency of the cyclecounter and the SYSTIME
+ * registers. However, it does not need to protect against the Rx or Tx
+ * timestamp registers, as there can't be a new timestamp until the old one is
+ * unlatched by reading.
+ *
+ * In addition to the timestamp in hardware, some controllers need a software
+ * overflow cyclecounter, and this function takes this into account as well.
+ **/
+static void ngbe_ptp_convert_to_hwtstamp(struct ngbe_adapter *adapter,
+ struct skb_shared_hwtstamps *hwtstamp,
+ u64 timestamp)
+{
+ unsigned long flags;
+ u64 ns;
+
+ memset(hwtstamp, 0, sizeof(*hwtstamp));
+
+ spin_lock_irqsave(&adapter->tmreg_lock, flags);
+ ns = timecounter_cyc2time(&adapter->hw_tc, timestamp);
+ spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+
+ hwtstamp->hwtstamp = ns_to_ktime(ns);
+}
+
+/**
+ * ngbe_ptp_adjfreq
+ * @ptp: the ptp clock structure
+ * @ppb: parts per billion adjustment from base
+ *
+ * adjust the frequency of the ptp cycle counter by the
+ * indicated ppb from the base frequency.
+ */
+static int ngbe_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
+{
+ struct ngbe_adapter *adapter =
+ container_of(ptp, struct ngbe_adapter, ptp_caps);
+ struct ngbe_hw *hw = &adapter->hw;
+ u64 freq, incval;
+ u32 diff;
+ int neg_adj = 0;
+
+ if (ppb < 0) {
+ neg_adj = 1;
+ ppb = -ppb;
+ }
+
+ smp_mb();
+ incval = READ_ONCE(adapter->base_incval);
+
+ freq = incval;
+ freq *= ppb;
+ diff = div_u64(freq, 1000000000ULL);
+
+ incval = neg_adj ? (incval - diff) : (incval + diff);
+	/* temp setting */
+
+ if (incval > NGBE_TSEC_1588_INC_IV(~0))
+ e_dev_warn("PTP ppb adjusted SYSTIME rate overflowed!\n");
+ wr32(hw, NGBE_TSEC_1588_INC, NGBE_TSEC_1588_INC_IV(incval));
+
+ return 0;
+}
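+
+/*
+ * A worked example of the scaling above, for a hypothetical ppb = 100 with
+ * the 1Gb base incval 0x2000000 (33554432):
+ *
+ *   diff   = div_u64(33554432ULL * 100, 1000000000) = 3
+ *   incval = 33554432 + 3 = 33554435
+ *
+ * so a 100 ppb speed-up nudges the per-cycle increment by three fractional
+ * units.
+ */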
+
+/**
+ * ngbe_ptp_adjtime
+ * @ptp: the ptp clock structure
+ * @delta: offset to adjust the cycle counter by ns
+ *
+ * adjust the timer by resetting the timecounter structure.
+ */
+static int ngbe_ptp_adjtime(struct ptp_clock_info *ptp,
+ s64 delta)
+{
+ struct ngbe_adapter *adapter =
+ container_of(ptp, struct ngbe_adapter, ptp_caps);
+ unsigned long flags;
+
+ spin_lock_irqsave(&adapter->tmreg_lock, flags);
+ timecounter_adjtime(&adapter->hw_tc, delta);
+ spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+
+ return 0;
+}
+
+/**
+ * ngbe_ptp_gettime64
+ * @ptp: the ptp clock structure
+ * @ts: timespec64 structure to hold the current time value
+ *
+ * read the timecounter and return the correct value on ns,
+ * after converting it into a struct timespec64.
+ */
+static int ngbe_ptp_gettime64(struct ptp_clock_info *ptp,
+ struct timespec64 *ts)
+{
+ struct ngbe_adapter *adapter =
+ container_of(ptp, struct ngbe_adapter, ptp_caps);
+ unsigned long flags;
+ u64 ns;
+
+ spin_lock_irqsave(&adapter->tmreg_lock, flags);
+ ns = timecounter_read(&adapter->hw_tc);
+ spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+
+ *ts = ns_to_timespec64(ns);
+
+ return 0;
+}
+
+/**
+ * ngbe_ptp_settime64
+ * @ptp: the ptp clock structure
+ * @ts: the timespec64 containing the new time for the cycle counter
+ *
+ * reset the timecounter to use a new base value instead of the kernel
+ * wall timer value.
+ */
+static int ngbe_ptp_settime64(struct ptp_clock_info *ptp,
+ const struct timespec64 *ts)
+{
+ struct ngbe_adapter *adapter =
+ container_of(ptp, struct ngbe_adapter, ptp_caps);
+ u64 ns;
+ unsigned long flags;
+
+ ns = timespec64_to_ns(ts);
+
+ /* reset the timecounter */
+ spin_lock_irqsave(&adapter->tmreg_lock, flags);
+ timecounter_init(&adapter->hw_tc, &adapter->hw_cc, ns);
+ spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+
+ return 0;
+}
+
+/**
+ * ngbe_ptp_feature_enable
+ * @ptp: the ptp clock structure
+ * @rq: the requested feature to change
+ * @on: whether to enable or disable the feature
+ *
+ * enable (or disable) ancillary features of the phc subsystem.
+ * our driver only supports the PPS feature on the X540
+ */
+static int ngbe_ptp_feature_enable(struct ptp_clock_info *ptp,
+ struct ptp_clock_request *rq, int on)
+{
+ return -ENOTSUPP;
+}
+
+/**
+ * ngbe_ptp_check_pps_event
+ * @adapter: the private adapter structure
+ *
+ * This function is called by the interrupt routine when checking for
+ * interrupts. It will check and handle a pps event.
+ */
+void ngbe_ptp_check_pps_event(struct ngbe_adapter *adapter)
+{
+ struct ptp_clock_event event;
+
+ event.type = PTP_CLOCK_PPS;
+
+ /* this check is necessary in case the interrupt was enabled via some
+	 * alternative means (e.g. debugfs). Better to check here than
+ * everywhere that calls this function.
+ */
+ if (!adapter->ptp_clock)
+ return;
+
+ /* we don't config PPS on SDP yet, so just return.
+ * ptp_clock_event(adapter->ptp_clock, &event);
+ */
+}
+
+/**
+ * ngbe_ptp_overflow_check - watchdog task to detect SYSTIME overflow
+ * @adapter: private adapter struct
+ *
+ * this watchdog task periodically reads the timecounter
+ * in order to prevent missing when the system time registers wrap
+ * around. This needs to be run approximately twice a minute for the fastest
+ * overflowing hardware. We run it for all hardware since it shouldn't have a
+ * large impact.
+ */
+void ngbe_ptp_overflow_check(struct ngbe_adapter *adapter)
+{
+ bool timeout = time_is_before_jiffies(adapter->last_overflow_check +
+ NGBE_OVERFLOW_PERIOD);
+ struct timespec64 ts;
+
+ if (timeout) {
+ ngbe_ptp_gettime64(&adapter->ptp_caps, &ts);
+ adapter->last_overflow_check = jiffies;
+ }
+}
+
+/**
+ * ngbe_ptp_rx_hang - detect error case when Rx timestamp registers latched
+ * @adapter: private network adapter structure
+ *
+ * this watchdog task is scheduled to detect error case where hardware has
+ * dropped an Rx packet that was timestamped when the ring is full. The
+ * particular error is rare but leaves the device in a state unable to timestamp
+ * any future packets.
+ */
+void ngbe_ptp_rx_hang(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct ngbe_ring *rx_ring;
+ u32 tsyncrxctl = rd32(hw, NGBE_PSR_1588_CTL);
+ unsigned long rx_event;
+ int n;
+
+ /* if we don't have a valid timestamp in the registers, just update the
+ * timeout counter and exit
+ */
+ if (!(tsyncrxctl & NGBE_PSR_1588_CTL_VALID)) {
+ adapter->last_rx_ptp_check = jiffies;
+ return;
+ }
+
+ /* determine the most recent watchdog or rx_timestamp event */
+ rx_event = adapter->last_rx_ptp_check;
+ for (n = 0; n < adapter->num_rx_queues; n++) {
+ rx_ring = adapter->rx_ring[n];
+ if (time_after(rx_ring->last_rx_timestamp, rx_event))
+ rx_event = rx_ring->last_rx_timestamp;
+ }
+
+ /* only need to read the high RXSTMP register to clear the lock */
+ if (time_is_before_jiffies(rx_event + 5 * HZ)) {
+ rd32(hw, NGBE_PSR_1588_STMPH);
+ adapter->last_rx_ptp_check = jiffies;
+
+ adapter->rx_hwtstamp_cleared++;
+ e_warn(drv, "clearing RX Timestamp hang");
+ }
+}
+
+/**
+ * ngbe_ptp_clear_tx_timestamp - utility function to clear Tx timestamp state
+ * @adapter: the private adapter structure
+ *
+ * This function should be called whenever the state related to a Tx timestamp
+ * needs to be cleared. This helps ensure that all related bits are reset for
+ * the next Tx timestamp event.
+ */
+static void ngbe_ptp_clear_tx_timestamp(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ rd32(hw, NGBE_TSEC_1588_STMPH);
+ if (adapter->ptp_tx_skb) {
+ dev_kfree_skb_any(adapter->ptp_tx_skb);
+ adapter->ptp_tx_skb = NULL;
+ }
+ clear_bit_unlock(__NGBE_PTP_TX_IN_PROGRESS, &adapter->state);
+}
+
+/**
+ * ngbe_ptp_tx_hwtstamp - utility function which checks for TX time stamp
+ * @adapter: the private adapter struct
+ *
+ * if the timestamp is valid, we convert it into the timecounter ns
+ * value, then store that result into the shhwtstamps structure which
+ * is passed up the network stack
+ */
+static void ngbe_ptp_tx_hwtstamp(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct skb_shared_hwtstamps shhwtstamps;
+ u64 regval = 0;
+
+ regval |= (u64)rd32(hw, NGBE_TSEC_1588_STMPL);
+ regval |= (u64)rd32(hw, NGBE_TSEC_1588_STMPH) << 32;
+
+ ngbe_ptp_convert_to_hwtstamp(adapter, &shhwtstamps, regval);
+ skb_tstamp_tx(adapter->ptp_tx_skb, &shhwtstamps);
+
+ ngbe_ptp_clear_tx_timestamp(adapter);
+}
+
+/**
+ * ngbe_ptp_tx_hwtstamp_work
+ * @work: pointer to the work struct
+ *
+ * This work item polls TSYNCTXCTL valid bit to determine when a Tx hardware
+ * timestamp has been taken for the current skb. It is necessary because the
+ * descriptor's "done" bit does not correlate with the timestamp event.
+ */
+static void ngbe_ptp_tx_hwtstamp_work(struct work_struct *work)
+{
+ struct ngbe_adapter *adapter = container_of(work, struct ngbe_adapter,
+ ptp_tx_work);
+ struct ngbe_hw *hw = &adapter->hw;
+ bool timeout = time_is_before_jiffies(adapter->ptp_tx_start +
+ NGBE_PTP_TX_TIMEOUT);
+ u32 tsynctxctl;
+
+ /* we have to have a valid skb to poll for a timestamp */
+ if (!adapter->ptp_tx_skb) {
+ ngbe_ptp_clear_tx_timestamp(adapter);
+ return;
+ }
+
+ /* stop polling once we have a valid timestamp */
+ tsynctxctl = rd32(hw, NGBE_TSEC_1588_CTL);
+ if (tsynctxctl & NGBE_TSEC_1588_CTL_VALID) {
+ ngbe_ptp_tx_hwtstamp(adapter);
+ return;
+ }
+
+ /* check timeout last in case timestamp event just occurred */
+ if (timeout) {
+ ngbe_ptp_clear_tx_timestamp(adapter);
+ adapter->tx_hwtstamp_timeouts++;
+ e_warn(drv, "clearing Tx Timestamp hang");
+ } else {
+ /* reschedule to keep checking until we timeout */
+ schedule_work(&adapter->ptp_tx_work);
+ }
+}
+
+/**
+ * ngbe_ptp_rx_hwtstamp - utility function which checks for RX time stamp
+ * @adapter: the private adapter structure
+ * @skb: particular skb to send timestamp with
+ *
+ * if the timestamp is valid, we convert it into the timecounter ns
+ * value, then store that result into the shhwtstamps structure which
+ * is passed up the network stack
+ */
+void ngbe_ptp_rx_hwtstamp(struct ngbe_adapter *adapter, struct sk_buff *skb)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u64 regval = 0;
+ u32 tsyncrxctl;
+
+ /*
+ * Read the tsyncrxctl register afterwards in order to prevent taking an
+ * I/O hit on every packet.
+ */
+ tsyncrxctl = rd32(hw, NGBE_PSR_1588_CTL);
+ if (!(tsyncrxctl & NGBE_PSR_1588_CTL_VALID))
+ return;
+
+ regval |= (u64)rd32(hw, NGBE_PSR_1588_STMPL);
+ regval |= (u64)rd32(hw, NGBE_PSR_1588_STMPH) << 32;
+
+ ngbe_ptp_convert_to_hwtstamp(adapter, skb_hwtstamps(skb), regval);
+}
+
+/**
+ * ngbe_ptp_get_ts_config - get current hardware timestamping configuration
+ * @adapter: pointer to adapter structure
+ * @ifreq: ioctl data
+ *
+ * This function returns the current timestamping settings. Rather than
+ * attempt to deconstruct registers to fill in the values, simply keep a copy
+ * of the old settings around, and return a copy when requested.
+ */
+int ngbe_ptp_get_ts_config(struct ngbe_adapter *adapter, struct ifreq *ifr)
+{
+ struct hwtstamp_config *config = &adapter->tstamp_config;
+
+ return copy_to_user(ifr->ifr_data, config,
+ sizeof(*config)) ? -EFAULT : 0;
+}
+
+/**
+ * ngbe_ptp_set_timestamp_mode - setup the hardware for the requested mode
+ * @adapter: the private ngbe adapter structure
+ * @config: the hwtstamp configuration requested
+ *
+ * Outgoing time stamping can be enabled and disabled. Play nice and
+ * disable it when requested, although it shouldn't cause any overhead
+ * when no packet needs it. At most one packet in the queue may be
+ * marked for time stamping, otherwise it would be impossible to tell
+ * for sure to which packet the hardware time stamp belongs.
+ *
+ * Incoming time stamping has to be configured via the hardware
+ * filters. Not all combinations are supported, in particular event
+ * type has to be specified. Matching the kind of event packet is
+ * not supported, with the exception of "all V2 events regardless of
+ * level 2 or 4".
+ *
+ * Since hardware always timestamps Path delay packets when timestamping V2
+ * packets, regardless of the type specified in the register, only use V2
+ * Event mode. This more accurately tells the user what the hardware is going
+ * to do anyways.
+ *
+ * Note: this may modify the hwtstamp configuration towards a more general
+ * mode, if required to support the specifically requested mode.
+ */
+static int ngbe_ptp_set_timestamp_mode(struct ngbe_adapter *adapter,
+ struct hwtstamp_config *config)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 tsync_tx_ctl = NGBE_TSEC_1588_CTL_ENABLED;
+ u32 tsync_rx_ctl = NGBE_PSR_1588_CTL_ENABLED;
+ u32 tsync_rx_mtrl = PTP_EV_PORT << 16;
+ bool is_l2 = false;
+ u32 regval;
+
+ /* reserved for future extensions */
+ if (config->flags)
+ return -EINVAL;
+
+ switch (config->tx_type) {
+	case HWTSTAMP_TX_OFF:
+		tsync_tx_ctl = 0;
+		/* fall through */
+ case HWTSTAMP_TX_ON:
+ break;
+ default:
+ return -ERANGE;
+ }
+
+ switch (config->rx_filter) {
+ case HWTSTAMP_FILTER_NONE:
+ tsync_rx_ctl = 0;
+ tsync_rx_mtrl = 0;
+ adapter->flags &= ~(NGBE_FLAG_RX_HWTSTAMP_ENABLED |
+ NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+ break;
+ case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+ tsync_rx_ctl |= NGBE_PSR_1588_CTL_TYPE_L4_V1;
+ tsync_rx_mtrl |= NGBE_PSR_1588_MSGTYPE_V1_SYNC_MSG;
+ adapter->flags |= (NGBE_FLAG_RX_HWTSTAMP_ENABLED |
+ NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+ break;
+ case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+ tsync_rx_ctl |= NGBE_PSR_1588_CTL_TYPE_L4_V1;
+ tsync_rx_mtrl |= NGBE_PSR_1588_MSGTYPE_V1_DELAY_REQ_MSG;
+ adapter->flags |= (NGBE_FLAG_RX_HWTSTAMP_ENABLED |
+ NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+ break;
+ case HWTSTAMP_FILTER_PTP_V2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+ tsync_rx_ctl |= NGBE_PSR_1588_CTL_TYPE_EVENT_V2;
+ is_l2 = true;
+ config->rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+ adapter->flags |= (NGBE_FLAG_RX_HWTSTAMP_ENABLED |
+ NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+ break;
+ case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_ALL:
+ default:
+ /* register RXMTRL must be set in order to do V1 packets,
+ * therefore it is not possible to time stamp both V1 Sync and
+ * Delay_Req messages unless hardware supports timestamping all
+ * packets => return error
+ */
+ adapter->flags &= ~(NGBE_FLAG_RX_HWTSTAMP_ENABLED |
+ NGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+ config->rx_filter = HWTSTAMP_FILTER_NONE;
+ return -ERANGE;
+ }
+
+ /* define ethertype filter for timestamping L2 packets */
+ if (is_l2)
+ wr32(hw,
+ NGBE_PSR_ETYPE_SWC(NGBE_PSR_ETYPE_SWC_FILTER_1588),
+ (NGBE_PSR_ETYPE_SWC_FILTER_EN | /* enable filter */
+ NGBE_PSR_ETYPE_SWC_1588 | /* enable timestamping */
+ ETH_P_1588)); /* 1588 eth protocol type */
+ else
+ wr32(hw,
+ NGBE_PSR_ETYPE_SWC(NGBE_PSR_ETYPE_SWC_FILTER_1588),
+ 0);
+
+ /* enable/disable TX */
+ regval = rd32(hw, NGBE_TSEC_1588_CTL);
+ regval &= ~NGBE_TSEC_1588_CTL_ENABLED;
+ regval |= tsync_tx_ctl;
+ wr32(hw, NGBE_TSEC_1588_CTL, regval);
+
+ /* enable/disable RX */
+ regval = rd32(hw, NGBE_PSR_1588_CTL);
+ regval &= ~(NGBE_PSR_1588_CTL_ENABLED | NGBE_PSR_1588_CTL_TYPE_MASK);
+ regval |= tsync_rx_ctl;
+ wr32(hw, NGBE_PSR_1588_CTL, regval);
+
+ /* define which PTP packets are time stamped */
+ wr32(hw, NGBE_PSR_1588_MSGTYPE, tsync_rx_mtrl);
+
+ NGBE_WRITE_FLUSH(hw);
+
+ /* clear TX/RX timestamp state, just to be sure */
+ ngbe_ptp_clear_tx_timestamp(adapter);
+ rd32(hw, NGBE_PSR_1588_STMPH);
+
+ return 0;
+}
+
+/**
+ * ngbe_ptp_set_ts_config - user entry point for timestamp mode
+ * @adapter: pointer to adapter struct
+ * @ifreq: ioctl data
+ *
+ * Set hardware to requested mode. If unsupported, return an error with no
+ * changes. Otherwise, store the mode for future reference.
+ */
+int ngbe_ptp_set_ts_config(struct ngbe_adapter *adapter, struct ifreq *ifr)
+{
+ struct hwtstamp_config config;
+ int err;
+
+ if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+ return -EFAULT;
+
+ err = ngbe_ptp_set_timestamp_mode(adapter, &config);
+ if (err)
+ return err;
+
+ /* save these settings for future reference */
+ memcpy(&adapter->tstamp_config, &config,
+ sizeof(adapter->tstamp_config));
+
+ return copy_to_user(ifr->ifr_data, &config, sizeof(config)) ?
+ -EFAULT : 0;
+}
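+
+/*
+ * A minimal userspace sketch of driving this entry point via the standard
+ * SIOCSHWTSTAMP ioctl ("eth0" and the already-opened socket fd are
+ * assumptions):
+ *
+ *   struct hwtstamp_config cfg = {
+ *           .tx_type   = HWTSTAMP_TX_ON,
+ *           .rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT,
+ *   };
+ *   struct ifreq ifr = { 0 };
+ *
+ *   strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
+ *   ifr.ifr_data = (void *)&cfg;
+ *   ioctl(sock, SIOCSHWTSTAMP, &ifr);
+ *
+ * On return, cfg holds the mode actually programmed (for instance, any V2
+ * filter is coerced to HWTSTAMP_FILTER_PTP_V2_EVENT above).
+ */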
+
+static void ngbe_ptp_link_speed_adjust(struct ngbe_adapter *adapter,
+ u32 *shift, u32 *incval)
+{
+	/*
+ * Scale the NIC cycle counter by a large factor so that
+ * relatively small corrections to the frequency can be added
+ * or subtracted. The drawbacks of a large factor include
+ * (a) the clock register overflows more quickly, (b) the cycle
+ * counter structure must be able to convert the systime value
+ * to nanoseconds using only a multiplier and a right-shift,
+ * and (c) the value must fit within the timinca register space
+ * => math based on internal DMA clock rate and available bits
+ *
+	 * Note that when there is no link, the internal DMA clock runs at the
+	 * same rate as at 10Gb link speed. Set the registers correctly even
+	 * when the link is down to preserve the clock setting.
+ */
+
+ *shift = NGBE_INCVAL_SHIFT_1GB;
+ *incval = NGBE_INCVAL_1GB;
+}
+
+/**
+ * ngbe_ptp_start_cyclecounter - create the cycle counter from hw
+ * @adapter: pointer to the adapter structure
+ *
+ * This function should be called to set the proper values for the TIMINCA
+ * register and tell the cyclecounter structure what the tick rate of SYSTIME
+ * is. It does not directly modify SYSTIME registers or the timecounter
+ * structure. It should be called whenever a new TIMINCA value is necessary,
+ * such as during initialization or when the link speed changes.
+ */
+void ngbe_ptp_start_cyclecounter(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ unsigned long flags;
+ struct cyclecounter cc;
+ u32 incval = 0;
+
+ /* For some of the boards below this mask is technically incorrect.
+ * The timestamp mask overflows at approximately 61bits. However the
+ * particular hardware does not overflow on an even bitmask value.
+ * Instead, it overflows due to conversion of upper 32bits billions of
+ * cycles. Timecounters are not really intended for this purpose so
+ * they do not properly function if the overflow point isn't 2^N-1.
+ * However, the actual SYSTIME values in question take ~138 years to
+ * overflow. In practice this means they won't actually overflow. A
+ * proper fix to this problem would require modification of the
+ * timecounter delta calculations.
+ */
+ cc.mask = CLOCKSOURCE_MASK(64);
+ cc.mult = 1;
+ cc.shift = 0;
+
+ cc.read = ngbe_ptp_read;
+ ngbe_ptp_link_speed_adjust(adapter, &cc.shift, &incval);
+ wr32(hw, NGBE_TSEC_1588_INC, NGBE_TSEC_1588_INC_IV(incval));
+
+ /* update the base incval used to calculate frequency adjustment */
+ WRITE_ONCE(adapter->base_incval, incval);
+ smp_mb();
+
+ /* need lock to prevent incorrect read while modifying cyclecounter */
+ spin_lock_irqsave(&adapter->tmreg_lock, flags);
+ memcpy(&adapter->hw_cc, &cc, sizeof(adapter->hw_cc));
+ spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+}
+
+/**
+ * ngbe_ptp_reset
+ * @adapter: the ngbe private board structure
+ *
+ * When the MAC resets, all of the hardware configuration for timesync is
+ * reset. This function should be called to re-enable the device for PTP,
+ * using the last known settings. However, we do lose the current clock time,
+ * so we fallback to resetting it based on the kernel's realtime clock.
+ *
+ * This function will maintain the hwtstamp_config settings, and it retriggers
+ * the SDP output if it's enabled.
+ */
+void ngbe_ptp_reset(struct ngbe_adapter *adapter)
+{
+ unsigned long flags;
+
+ /* reset the hardware timestamping mode */
+ ngbe_ptp_set_timestamp_mode(adapter, &adapter->tstamp_config);
+ ngbe_ptp_start_cyclecounter(adapter);
+
+ spin_lock_irqsave(&adapter->tmreg_lock, flags);
+ timecounter_init(&adapter->hw_tc, &adapter->hw_cc,
+ ktime_to_ns(ktime_get_real()));
+ spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+
+ adapter->last_overflow_check = jiffies;
+}
+
+/**
+ * ngbe_ptp_create_clock
+ * @adapter: the ngbe private adapter structure
+ *
+ * This function performs setup of the user entry point function table and
+ * initializes the PTP clock device used by userspace to access the clock-like
+ * features of the PTP core. It will be called by ngbe_ptp_init, and may
+ * re-use a previously initialized clock (such as during a suspend/resume
+ * cycle).
+ */
+static long ngbe_ptp_create_clock(struct ngbe_adapter *adapter)
+{
+ struct net_device *netdev = adapter->netdev;
+ long err;
+
+ /* do nothing if we already have a clock device */
+ if (!IS_ERR_OR_NULL(adapter->ptp_clock))
+ return 0;
+
+ snprintf(adapter->ptp_caps.name, sizeof(adapter->ptp_caps.name),
+ "%s", netdev->name);
+ adapter->ptp_caps.owner = THIS_MODULE;
+ adapter->ptp_caps.max_adj = 500000000; /* 10^-9s */
+ adapter->ptp_caps.n_alarm = 0;
+ adapter->ptp_caps.n_ext_ts = 0;
+ adapter->ptp_caps.n_per_out = 0;
+ adapter->ptp_caps.pps = 0;
+ adapter->ptp_caps.adjfreq = ngbe_ptp_adjfreq;
+ adapter->ptp_caps.adjtime = ngbe_ptp_adjtime;
+ adapter->ptp_caps.gettime64 = ngbe_ptp_gettime64;
+ adapter->ptp_caps.settime64 = ngbe_ptp_settime64;
+ adapter->ptp_caps.enable = ngbe_ptp_feature_enable;
+
+ adapter->ptp_clock = ptp_clock_register(&adapter->ptp_caps,
+ pci_dev_to_dev(adapter->pdev));
+ if (IS_ERR(adapter->ptp_clock)) {
+ err = PTR_ERR(adapter->ptp_clock);
+ adapter->ptp_clock = NULL;
+ e_dev_err("ptp_clock_register failed\n");
+ return err;
+	}
+	e_dev_info("registered PHC device on %s\n", netdev->name);
+
+ /* Set the default timestamp mode to disabled here. We do this in
+ * create_clock instead of initialization, because we don't want to
+ * override the previous settings during a suspend/resume cycle.
+ */
+ adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
+ adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
+
+ return 0;
+}
+
+/**
+ * ngbe_ptp_init
+ * @adapter: the ngbe private adapter structure
+ *
+ * This function performs the required steps for enabling ptp
+ * support. If ptp support has already been loaded it simply calls the
+ * cyclecounter init routine and exits.
+ */
+void ngbe_ptp_init(struct ngbe_adapter *adapter)
+{
+ /* initialize the spin lock first, since the user might call the clock
+ * functions any time after we've initialized the ptp clock device.
+ */
+ spin_lock_init(&adapter->tmreg_lock);
+
+ /* obtain a ptp clock device, or re-use an existing device */
+ if (ngbe_ptp_create_clock(adapter))
+ return;
+
+	/* we have a clock, so we can initialize work for timestamps now */
+ INIT_WORK(&adapter->ptp_tx_work, ngbe_ptp_tx_hwtstamp_work);
+
+ /* reset the ptp related hardware bits */
+ ngbe_ptp_reset(adapter);
+
+ /* enter the NGBE_PTP_RUNNING state */
+ set_bit(__NGBE_PTP_RUNNING, &adapter->state);
+}
+
+/**
+ * ngbe_ptp_suspend - stop ptp work items
+ * @adapter: pointer to adapter struct
+ *
+ * This function suspends ptp activity, and prevents more work from being
+ * generated, but does not destroy the clock device.
+ */
+void ngbe_ptp_suspend(struct ngbe_adapter *adapter)
+{
+ /* leave the NGBE_PTP_RUNNING STATE */
+ if (!test_and_clear_bit(__NGBE_PTP_RUNNING, &adapter->state))
+ return;
+
+ adapter->flags2 &= ~NGBE_FLAG2_PTP_PPS_ENABLED;
+
+ cancel_work_sync(&adapter->ptp_tx_work);
+ ngbe_ptp_clear_tx_timestamp(adapter);
+}
+
+/**
+ * ngbe_ptp_stop - destroy the ptp_clock device
+ * @adapter: pointer to adapter struct
+ *
+ * Completely destroy the ptp_clock device, and disable all PTP related
+ * features. Intended to be run when the device is being closed.
+ */
+void ngbe_ptp_stop(struct ngbe_adapter *adapter)
+{
+ /* first, suspend ptp activity */
+ ngbe_ptp_suspend(adapter);
+
+ /* now destroy the ptp clock device */
+ if (adapter->ptp_clock) {
+ ptp_clock_unregister(adapter->ptp_clock);
+ adapter->ptp_clock = NULL;
+ e_dev_info("removed PHC on %s\n",
+ adapter->netdev->name);
+ }
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_sriov.c b/drivers/net/ethernet/netswift/ngbe/ngbe_sriov.c
new file mode 100644
index 0000000000000..785e25287ad39
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_sriov.c
@@ -0,0 +1,1461 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/string.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/ipv6.h>
+
+#include "ngbe.h"
+#include "ngbe_type.h"
+#include "ngbe_sriov.h"
+
+#ifdef CONFIG_PCI_IOV
+static int __ngbe_enable_sriov(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int num_vf_macvlans, i;
+ struct vf_macvlans *mv_list;
+
+ adapter->flags |= NGBE_FLAG_SRIOV_ENABLED;
+ e_dev_info("SR-IOV enabled with %d VFs\n", adapter->num_vfs);
+
+ /* Enable VMDq flag so device will be set in VM mode */
+ adapter->flags |= NGBE_FLAG_VMDQ_ENABLED;
+ if (!adapter->ring_feature[RING_F_VMDQ].limit)
+ adapter->ring_feature[RING_F_VMDQ].limit = 1;
+ adapter->ring_feature[RING_F_VMDQ].offset = adapter->num_vfs;
+
+ num_vf_macvlans = hw->mac.num_rar_entries -
+ (NGBE_MAX_PF_MACVLANS + 1 + adapter->num_vfs);
+
+ adapter->mv_list = mv_list = kcalloc(num_vf_macvlans,
+ sizeof(struct vf_macvlans),
+ GFP_KERNEL);
+ if (mv_list) {
+ /* Initialize list of VF macvlans */
+ INIT_LIST_HEAD(&adapter->vf_mvs.l);
+ for (i = 0; i < num_vf_macvlans; i++) {
+ mv_list->vf = -1;
+ mv_list->free = true;
+ list_add(&mv_list->l, &adapter->vf_mvs.l);
+ mv_list++;
+ }
+ }
+
+ /* Initialize default switching mode VEB */
+ wr32m(hw, NGBE_PSR_CTL,
+ NGBE_PSR_CTL_SW_EN, NGBE_PSR_CTL_SW_EN);
+
+ /* If call to enable VFs succeeded then allocate memory
+ * for per VF control structures.
+ */
+ adapter->vfinfo = kcalloc(adapter->num_vfs,
+ sizeof(struct vf_data_storage), GFP_KERNEL);
+ if (!adapter->vfinfo) {
+ adapter->num_vfs = 0;
+ e_dev_info("failed to allocate memory for VF Data Storage\n");
+ return -ENOMEM;
+ }
+
+ /* enable L2 switch and replication */
+ adapter->flags |= NGBE_FLAG_SRIOV_L2SWITCH_ENABLE |
+ NGBE_FLAG_SRIOV_REPLICATION_ENABLE;
+
+ /* We do not support RSS w/ SR-IOV */
+ adapter->ring_feature[RING_F_RSS].limit = 1;
+
+ /* enable spoof checking for all VFs */
+ for (i = 0; i < adapter->num_vfs; i++) {
+ /* enable spoof checking for all VFs */
+ adapter->vfinfo[i].spoofchk_enabled = true;
+
+ /* Untrust all VFs */
+ adapter->vfinfo[i].trusted = false;
+
+ /* set the default xcast mode */
+ adapter->vfinfo[i].xcast_mode = NGBEVF_XCAST_MODE_NONE;
+ }
+
+ wr32m(hw, NGBE_CFG_PORT_CTL,
+ NGBE_CFG_PORT_CTL_NUM_VT_MASK, NGBE_CFG_PORT_CTL_NUM_VT_8);
+
+ return 0;
+}
+
+#define NGBE_BA4_ADDR(vfinfo, reg) \
+ ((u8 __iomem *)((u8 *)(vfinfo)->b4_addr + (reg)))
+
+/**
+ * ngbe_get_vfs - Find and take references to all vf devices
+ * @adapter: Pointer to adapter struct
+ */
+static void ngbe_get_vfs(struct ngbe_adapter *adapter)
+{
+ struct pci_dev *pdev = adapter->pdev;
+ u16 vendor = pdev->vendor;
+ struct pci_dev *vfdev;
+ int vf = 0;
+ u16 vf_id;
+ int pos;
+
+ pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+ if (!pos)
+ return;
+ pci_read_config_word(pdev, pos + PCI_SRIOV_VF_DID, &vf_id);
+
+ vfdev = pci_get_device(vendor, vf_id, NULL);
+ for (; vfdev; vfdev = pci_get_device(vendor, vf_id, vfdev)) {
+ struct vf_data_storage *vfinfo;
+ if (!vfdev->is_virtfn)
+ continue;
+ if (vfdev->physfn != pdev)
+ continue;
+ if (vf >= adapter->num_vfs)
+ continue;
+
+ /*pci_dev_get(vfdev);*/
+ vfinfo = &adapter->vfinfo[vf];
+ vfinfo->vfdev = vfdev;
+ vfinfo->b4_addr = ioremap(pci_resource_start(vfdev, 4), 64);
+
+ ++vf;
+ }
+}
+
+/**
+ * ngbe_put_vfs - Release references to all vf devices
+ * @adapter: Pointer to adapter struct
+ */
+static void ngbe_put_vfs(struct ngbe_adapter *adapter)
+{
+ unsigned int num_vfs = adapter->num_vfs, vf;
+
+ /* put the reference to all of the vf devices */
+ for (vf = 0; vf < num_vfs; ++vf) {
+ struct vf_data_storage *vfinfo;
+ struct pci_dev *vfdev = adapter->vfinfo[vf].vfdev;
+
+ if (!vfdev)
+ continue;
+
+ vfinfo = &adapter->vfinfo[vf];
+ iounmap(vfinfo->b4_addr);
+ vfinfo->b4_addr = NULL;
+ vfinfo->vfdev = NULL;
+ /*pci_dev_put(vfdev);*/
+ }
+}
+
+/* Note this function is called when the user wants to enable SR-IOV
+ * VFs using the now deprecated module parameter
+ */
+void ngbe_enable_sriov(struct ngbe_adapter *adapter)
+{
+ int pre_existing_vfs = 0;
+
+ pre_existing_vfs = pci_num_vf(adapter->pdev);
+ if (!pre_existing_vfs && !adapter->num_vfs)
+ return;
+
+ /* If there are pre-existing VFs then we have to force
+	 * use of that many - override any module parameter value.
+ * This may result from the user unloading the PF driver
+ * while VFs were assigned to guest VMs or because the VFs
+ * have been created via the new PCI SR-IOV sysfs interface.
+ */
+ if (pre_existing_vfs) {
+ adapter->num_vfs = pre_existing_vfs;
+ dev_warn(&adapter->pdev->dev,
+ "Virtual Functions already enabled for this device -"
+ "Please reload all VF drivers to avoid spoofed packet "
+ "errors\n");
+ } else {
+ int err;
+ /*
+ * The sapphire supports up to 64 VFs per physical function
+ * but this implementation limits allocation to 63 so that
+ * basic networking resources are still available to the
+		 * physical function. If the user requests greater than
+ * 63 VFs then it is an error - reset to default of zero.
+ */
+ adapter->num_vfs = min_t(unsigned int, adapter->num_vfs,
+ NGBE_MAX_VFS_DRV_LIMIT);
+
+ err = pci_enable_sriov(adapter->pdev, adapter->num_vfs);
+ if (err) {
+ e_err(probe, "Failed to enable PCI sriov: %d\n", err);
+ adapter->num_vfs = 0;
+ return;
+ }
+ }
+
+ if (!__ngbe_enable_sriov(adapter)) {
+ ngbe_get_vfs(adapter);
+ return;
+ }
+
+ /* If we have gotten to this point then there is no memory available
+ * to manage the VF devices - print message and bail.
+ */
+ e_err(probe, "Unable to allocate memory for VF Data Storage - "
+ "SRIOV disabled\n");
+ ngbe_disable_sriov(adapter);
+}
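+
+/*
+ * The non-deprecated path is the PCI sysfs interface mentioned above, e.g.
+ * (the device address is an assumption):
+ *
+ *   echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
+ *
+ * which lets the PCI core call back into the driver instead of relying on
+ * the module parameter.
+ */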
+#endif /* CONFIG_PCI_IOV */
+
+int ngbe_disable_sriov(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ /* set num VFs to 0 to prevent access to vfinfo */
+ adapter->num_vfs = 0;
+
+ /* put the reference to all of the vf devices */
+#ifdef CONFIG_PCI_IOV
+ ngbe_put_vfs(adapter);
+#endif
+ /* free VF control structures */
+ kfree(adapter->vfinfo);
+ adapter->vfinfo = NULL;
+
+ /* free macvlan list */
+ kfree(adapter->mv_list);
+ adapter->mv_list = NULL;
+
+ /* if SR-IOV is already disabled then there is nothing to do */
+ if (!(adapter->flags & NGBE_FLAG_SRIOV_ENABLED))
+ return 0;
+
+#ifdef CONFIG_PCI_IOV
+ /*
+ * If our VFs are assigned we cannot shut down SR-IOV
+ * without causing issues, so just leave the hardware
+ * available but disabled
+ */
+ if (pci_vfs_assigned(adapter->pdev)) {
+ e_dev_warn("Unloading driver while VFs are assigned -"
+ "VFs will not be deallocated\n");
+ return -EPERM;
+ }
+ /* disable iov and allow time for transactions to clear */
+ pci_disable_sriov(adapter->pdev);
+#endif
+
+ /* set default pool back to 0 */
+ wr32m(hw, NGBE_PSR_VM_CTL,
+ NGBE_PSR_VM_CTL_POOL_MASK, 0);
+ NGBE_WRITE_FLUSH(hw);
+
+ adapter->ring_feature[RING_F_VMDQ].offset = 0;
+
+ /* take a breather then clean up driver data */
+ msleep(100);
+
+ adapter->flags &= ~NGBE_FLAG_SRIOV_ENABLED;
+
+ /* Disable VMDq flag so device will be set in VM mode */
+ if (adapter->ring_feature[RING_F_VMDQ].limit == 1) {
+ adapter->flags &= ~NGBE_FLAG_VMDQ_ENABLED;
+ }
+
+ return 0;
+}
+
+static int ngbe_set_vf_multicasts(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u32 vf)
+{
+ u16 entries = (msgbuf[0] & NGBE_VT_MSGINFO_MASK)
+ >> NGBE_VT_MSGINFO_SHIFT;
+ u16 *hash_list = (u16 *)&msgbuf[1];
+ struct vf_data_storage *vfinfo = &adapter->vfinfo[vf];
+ struct ngbe_hw *hw = &adapter->hw;
+ int i;
+ u32 vector_bit;
+ u32 vector_reg;
+ u32 mta_reg;
+ u32 vmolr = rd32(hw, NGBE_PSR_VM_L2CTL(vf));
+
+ /* only so many hash values supported */
+ entries = min(entries, (u16)NGBE_MAX_VF_MC_ENTRIES);
+
+ /* salt away the number of multi cast addresses assigned
+ * to this VF for later use to restore when the PF multi cast
+ * list changes
+ */
+ vfinfo->num_vf_mc_hashes = entries;
+
+ /* VFs are limited to using the MTA hash table for their multicast
+ * addresses */
+ for (i = 0; i < entries; i++)
+ vfinfo->vf_mc_hashes[i] = hash_list[i];
+
+ for (i = 0; i < vfinfo->num_vf_mc_hashes; i++) {
+ vector_reg = (vfinfo->vf_mc_hashes[i] >> 5) & 0x7F;
+ vector_bit = vfinfo->vf_mc_hashes[i] & 0x1F;
+ /* errata 5: maintain a copy of the register table conf */
+ mta_reg = hw->mac.mta_shadow[vector_reg];
+ mta_reg |= (1 << vector_bit);
+ hw->mac.mta_shadow[vector_reg] = mta_reg;
+ wr32(hw, NGBE_PSR_MC_TBL(vector_reg), mta_reg);
+ }
+ vmolr |= NGBE_PSR_VM_L2CTL_ROMPE;
+ wr32(hw, NGBE_PSR_VM_L2CTL(vf), vmolr);
+
+ return 0;
+}
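+
+/*
+ * For illustration, a hypothetical hash value 0x0ABC selects
+ *
+ *   vector_reg = (0x0ABC >> 5) & 0x7F = 85
+ *   vector_bit =  0x0ABC       & 0x1F = 28
+ *
+ * i.e. bit 28 of NGBE_PSR_MC_TBL(85), mirrored into mta_shadow per errata 5.
+ */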
+
+void ngbe_restore_vf_multicasts(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct vf_data_storage *vfinfo;
+ u32 i, j;
+ u32 vector_bit;
+ u32 vector_reg;
+
+ for (i = 0; i < adapter->num_vfs; i++) {
+ u32 vmolr = rd32(hw, NGBE_PSR_VM_L2CTL(i));
+ vfinfo = &adapter->vfinfo[i];
+ for (j = 0; j < vfinfo->num_vf_mc_hashes; j++) {
+ hw->addr_ctrl.mta_in_use++;
+ vector_reg = (vfinfo->vf_mc_hashes[j] >> 5) & 0x7F;
+ vector_bit = vfinfo->vf_mc_hashes[j] & 0x1F;
+ wr32m(hw, NGBE_PSR_MC_TBL(vector_reg),
+ 1 << vector_bit, 1 << vector_bit);
+ /* errata 5: maintain a copy of the reg table conf */
+ hw->mac.mta_shadow[vector_reg] |= (1 << vector_bit);
+ }
+ if (vfinfo->num_vf_mc_hashes)
+ vmolr |= NGBE_PSR_VM_L2CTL_ROMPE;
+ else
+ vmolr &= ~NGBE_PSR_VM_L2CTL_ROMPE;
+ wr32(hw, NGBE_PSR_VM_L2CTL(i), vmolr);
+ }
+
+ /* Restore any VF macvlans */
+ ngbe_full_sync_mac_table(adapter);
+}
+
+int ngbe_set_vf_vlan(struct ngbe_adapter *adapter, int add, int vid, u16 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ /* VLAN 0 is a special case, don't allow it to be removed */
+ if (!vid && !add)
+ return 0;
+
+ return TCALL(hw, mac.ops.set_vfta, vid, vf, (bool)add);
+}
+
+static int ngbe_set_vf_lpe(struct ngbe_adapter *adapter, u32 max_frame,
+ u32 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 max_frs, reg_val;
+
+ /*
+ * For sapphire we have to keep all PFs and VFs operating with
+ * the same max_frame value in order to avoid sending an oversize
+ * frame to a VF. In order to guarantee this is handled correctly
+ * for all cases we have several special exceptions to take into
+ * account before we can enable the VF for receive
+ */
+ struct net_device *dev = adapter->netdev;
+ int pf_max_frame = dev->mtu + ETH_HLEN;
+ u32 vf_shift, vfre;
+ s32 err = 0;
+
+ switch (adapter->vfinfo[vf].vf_api) {
+ case ngbe_mbox_api_11:
+ case ngbe_mbox_api_12:
+ case ngbe_mbox_api_13:
+ /*
+ * Version 1.1 supports jumbo frames on VFs if PF has
+ * jumbo frames enabled which means legacy VFs are
+ * disabled
+ */
+ if (pf_max_frame > ETH_FRAME_LEN)
+ break;
+ /* fall through */
+ default:
+ /*
+ * If the PF or VF are running w/ jumbo frames enabled
+ * we need to shut down the VF Rx path as we cannot
+ * support jumbo frames on legacy VFs
+ */
+ if ((pf_max_frame > ETH_FRAME_LEN) ||
+ (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)))
+ err = -EINVAL;
+ break;
+ }
+
+ /* determine VF receive enable location */
+ vf_shift = vf;
+
+ /* enable or disable receive depending on error */
+ vfre = rd32(hw, NGBE_RDM_POOL_RE);
+ if (err)
+ vfre &= ~(1 << vf_shift);
+ else
+ vfre |= 1 << vf_shift;
+ wr32(hw, NGBE_RDM_POOL_RE, vfre);
+
+ if (err) {
+ e_err(drv, "VF max_frame %d out of range\n", max_frame);
+ return err;
+ }
+
+ /* pull current max frame size from hardware */
+ max_frs = DIV_ROUND_UP(max_frame, 1024);
+ reg_val = rd32(hw, NGBE_MAC_WDG_TIMEOUT) &
+ NGBE_MAC_WDG_TIMEOUT_WTO_MASK;
+ if (max_frs > (reg_val + NGBE_MAC_WDG_TIMEOUT_WTO_DELTA)) {
+ wr32(hw, NGBE_MAC_WDG_TIMEOUT,
+ max_frs - NGBE_MAC_WDG_TIMEOUT_WTO_DELTA);
+ }
+
+ e_info(hw, "VF requests change max MTU to %d\n", max_frame);
+
+ return 0;
+}
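+
+/*
+ * E.g. for a hypothetical VF request of max_frame = 1518 (1500-byte payload
+ * plus Ethernet header and FCS), max_frs = DIV_ROUND_UP(1518, 1024) = 2
+ * units of 1024 bytes in the comparison above.
+ */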
+
+void ngbe_set_vmolr(struct ngbe_hw *hw, u16 vf, bool aupe)
+{
+ u32 vmolr = rd32(hw, NGBE_PSR_VM_L2CTL(vf));
+ vmolr |= NGBE_PSR_VM_L2CTL_BAM;
+ if (aupe)
+ vmolr |= NGBE_PSR_VM_L2CTL_AUPE;
+ else
+ vmolr &= ~NGBE_PSR_VM_L2CTL_AUPE;
+ wr32(hw, NGBE_PSR_VM_L2CTL(vf), vmolr);
+}
+
+static void ngbe_set_vmvir(struct ngbe_adapter *adapter,
+ u16 vid, u16 qos, u16 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 vmvir = vid | (qos << VLAN_PRIO_SHIFT) |
+ NGBE_TDM_VLAN_INS_VLANA_DEFAULT;
+
+ wr32(hw, NGBE_TDM_VLAN_INS(vf), vmvir);
+}
+
+static void ngbe_clear_vmvir(struct ngbe_adapter *adapter, u32 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ wr32(hw, NGBE_TDM_VLAN_INS(vf), 0);
+}
+
+static inline void ngbe_vf_reset_event(struct ngbe_adapter *adapter, u16 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ struct vf_data_storage *vfinfo = &adapter->vfinfo[vf];
+ u8 num_tcs = netdev_get_num_tc(adapter->netdev);
+
+ /* add PF assigned VLAN or VLAN 0 */
+ ngbe_set_vf_vlan(adapter, true, vfinfo->pf_vlan, vf);
+
+ /* reset offloads to defaults */
+ ngbe_set_vmolr(hw, vf, !vfinfo->pf_vlan);
+
+ /* set outgoing tags for VFs */
+ if (!vfinfo->pf_vlan && !vfinfo->pf_qos && !num_tcs) {
+ ngbe_clear_vmvir(adapter, vf);
+ } else {
+ if (vfinfo->pf_qos || !num_tcs)
+ ngbe_set_vmvir(adapter, vfinfo->pf_vlan,
+ vfinfo->pf_qos, vf);
+ else
+ ngbe_set_vmvir(adapter, vfinfo->pf_vlan,
+ adapter->default_up, vf);
+
+ if (vfinfo->spoofchk_enabled)
+ TCALL(hw, mac.ops.set_vlan_anti_spoofing, true, vf);
+ }
+
+ /* reset multicast table array for vf */
+ adapter->vfinfo[vf].num_vf_mc_hashes = 0;
+
+ /* Flush and reset the mta with the new values */
+ ngbe_set_rx_mode(adapter->netdev);
+
+ ngbe_del_mac_filter(adapter, adapter->vfinfo[vf].vf_mac_addresses, vf);
+
+ /* reset VF api back to unknown */
+ adapter->vfinfo[vf].vf_api = ngbe_mbox_api_10;
+}
+
+int ngbe_set_vf_mac(struct ngbe_adapter *adapter,
+ u16 vf, unsigned char *mac_addr)
+{
+ s32 retval = 0;
+ ngbe_del_mac_filter(adapter, adapter->vfinfo[vf].vf_mac_addresses, vf);
+ retval = ngbe_add_mac_filter(adapter, mac_addr, vf);
+ if (retval >= 0)
+ memcpy(adapter->vfinfo[vf].vf_mac_addresses, mac_addr, ETH_ALEN);
+ else
+ memset(adapter->vfinfo[vf].vf_mac_addresses, 0, ETH_ALEN);
+
+ return retval;
+}
+
+static int ngbe_negotiate_vf_api(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u32 vf)
+{
+ int api = msgbuf[1];
+
+ switch (api) {
+ case ngbe_mbox_api_10:
+ case ngbe_mbox_api_11:
+ case ngbe_mbox_api_12:
+ case ngbe_mbox_api_13:
+ adapter->vfinfo[vf].vf_api = api;
+ return 0;
+ default:
+ break;
+ }
+
+ e_info(drv, "VF %d requested invalid api version %u\n", vf, api);
+
+ return -1;
+}
+
+static int ngbe_get_vf_queues(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u32 vf)
+{
+ struct net_device *dev = adapter->netdev;
+ unsigned int default_tc = 0;
+ u8 num_tcs = netdev_get_num_tc(dev);
+
+ /* verify the PF is supporting the correct APIs */
+ switch (adapter->vfinfo[vf].vf_api) {
+ case ngbe_mbox_api_20:
+ case ngbe_mbox_api_11:
+ break;
+ default:
+ return -1;
+ }
+
+ /* only allow 1 Tx queue for bandwidth limiting */
+ msgbuf[NGBE_VF_TX_QUEUES] = 1;
+ msgbuf[NGBE_VF_RX_QUEUES] = 1;
+
+ /* notify VF of need for VLAN tag stripping, and correct queue */
+ if (num_tcs)
+ msgbuf[NGBE_VF_TRANS_VLAN] = num_tcs;
+ else if (adapter->vfinfo[vf].pf_vlan || adapter->vfinfo[vf].pf_qos)
+ msgbuf[NGBE_VF_TRANS_VLAN] = 1;
+ else
+ msgbuf[NGBE_VF_TRANS_VLAN] = 0;
+
+ /* notify VF of default queue */
+ msgbuf[NGBE_VF_DEF_QUEUE] = default_tc;
+
+ return 0;
+}
+
+static int ngbe_get_vf_link_status(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u32 vf)
+{
+ /* verify the PF is supporting the correct APIs */
+ switch (adapter->vfinfo[vf].vf_api) {
+ case ngbe_mbox_api_11:
+ case ngbe_mbox_api_12:
+ case ngbe_mbox_api_13:
+ break;
+ default:
+ return -1;
+ }
+
+ if (adapter->link_up)
+ msgbuf[1] = NGBE_VF_STATUS_LINKUP;
+ else
+ msgbuf[1] = 0;
+
+ return 0;
+}
+
+static int ngbe_set_vf_macvlan(struct ngbe_adapter *adapter,
+ u16 vf, int index, unsigned char *mac_addr)
+{
+ struct list_head *pos;
+ struct vf_macvlans *entry;
+ s32 retval = 0;
+
+ if (index <= 1) {
+ list_for_each(pos, &adapter->vf_mvs.l) {
+ entry = list_entry(pos, struct vf_macvlans, l);
+ if (entry->vf == vf) {
+ entry->vf = -1;
+ entry->free = true;
+ entry->is_macvlan = false;
+ ngbe_del_mac_filter(adapter,
+ entry->vf_macvlan, vf);
+ }
+ }
+ }
+
+ /*
+ * If index was zero then we were asked to clear the uc list
+ * for the VF. We're done.
+ */
+ if (!index)
+ return 0;
+
+ entry = NULL;
+
+ list_for_each(pos, &adapter->vf_mvs.l) {
+ entry = list_entry(pos, struct vf_macvlans, l);
+ if (entry->free)
+ break;
+ }
+
+ /*
+ * If we traversed the entire list and didn't find a free entry
+ * then we're out of space on the RAR table. Also entry may
+ * be NULL because the original memory allocation for the list
+ * failed, which is not fatal but does mean we can't support
+ * VF requests for MACVLAN because we couldn't allocate
+	 * memory for the list management required.
+ */
+ if (!entry || !entry->free)
+ return -ENOSPC;
+
+ retval = ngbe_add_mac_filter(adapter, mac_addr, vf);
+ if (retval >= 0) {
+ entry->free = false;
+ entry->is_macvlan = true;
+ entry->vf = vf;
+ memcpy(entry->vf_macvlan, mac_addr, ETH_ALEN);
+ }
+
+ return retval;
+}
+
+#ifdef CONFIG_PCI_IOV
+int ngbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
+{
+	unsigned char vf_mac_addr[ETH_ALEN];
+ struct ngbe_adapter *adapter = pci_get_drvdata(pdev);
+ unsigned int vfn = (event_mask & 0x7);
+ bool enable = ((event_mask & 0x10000000U) != 0);
+
+ if (enable) {
+ memset(vf_mac_addr, 0, ETH_ALEN);
+		memcpy(adapter->vfinfo[vfn].vf_mac_addresses, vf_mac_addr,
+		       ETH_ALEN);
+ }
+
+ return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
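+/*
+ * With a single queue per pool the queue index equals the VF index, so
+ * shifting the drop-enable bit by 'vf' targets that VF's one Rx queue.
+ */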
+static inline void ngbe_write_qde(struct ngbe_adapter *adapter, u32 vf,
+ u32 qde)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 q_per_pool = 1;
+ u32 reg = 0;
+ u32 i = vf * q_per_pool;
+
+ reg = rd32(hw, NGBE_RDM_PF_QDE);
+ reg |= qde << i;
+
+	wr32(hw, NGBE_RDM_PF_QDE, reg);
+}
+
+static inline void ngbe_write_hide_vlan(struct ngbe_adapter *adapter, u32 vf,
+ u32 hide_vlan)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 q_per_pool = 1;
+ u32 reg = 0;
+	u32 i = vf * q_per_pool;
+
+	reg = rd32(hw, NGBE_RDM_PF_HIDE);
+
+	/* set or clear only this VF's bit; ANDing the whole register with
+	 * (hide_vlan << i) when hide_vlan is 0 would wipe every other VF's
+	 * setting
+	 */
+	if (hide_vlan == 1)
+		reg |= 1 << i;
+	else
+		reg &= ~(1 << i);
+
+ wr32(hw, NGBE_RDM_PF_HIDE, reg);
+}
+
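+/*
+ * Handle a reset request from a VF: restore default filters and offloads,
+ * re-enable the pool's Tx/Rx, and reply with the VF's permanent MAC address
+ * (plus the multicast filter type) piggybacked on the ACK.
+ */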
+static int ngbe_vf_reset_msg(struct ngbe_adapter *adapter, u16 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ unsigned char *vf_mac = adapter->vfinfo[vf].vf_mac_addresses;
+ u32 reg, vf_shift;
+ u32 msgbuf[4] = {0, 0, 0, 0};
+ u8 *addr = (u8 *)(&msgbuf[1]);
+ struct net_device *dev = adapter->netdev;
+ int pf_max_frame;
+
+ e_info(probe, "VF Reset msg received from vf %d\n", vf);
+
+ /* reset the filters for the device */
+ ngbe_vf_reset_event(adapter, vf);
+
+ /* set vf mac address */
+ if (!is_zero_ether_addr(vf_mac))
+ ngbe_set_vf_mac(adapter, vf, vf_mac);
+
+ vf_shift = vf;
+
+ /* enable transmit for vf */
+ wr32m(hw, NGBE_TDM_POOL_TE,
+ 1 << vf, 1 << vf);
+
+ /* force drop enable for all VF Rx queues */
+ ngbe_write_qde(adapter, vf, 1);
+
+ /* enable receive for vf */
+ reg = rd32(hw, NGBE_RDM_POOL_RE);
+ reg |= 1 << vf_shift;
+
+ pf_max_frame = dev->mtu + ETH_HLEN;
+
+ if (pf_max_frame > ETH_FRAME_LEN)
+ reg &= ~(1 << vf_shift);
+ wr32(hw, NGBE_RDM_POOL_RE, reg);
+
+ /* enable VF mailbox for further messages */
+ adapter->vfinfo[vf].clear_to_send = true;
+
+ /* reply to reset with ack and vf mac address */
+ msgbuf[0] = NGBE_VF_RESET;
+ if (!is_zero_ether_addr(vf_mac)) {
+ msgbuf[0] |= NGBE_VT_MSGTYPE_ACK;
+ memcpy(addr, vf_mac, ETH_ALEN);
+ } else {
+ msgbuf[0] |= NGBE_VT_MSGTYPE_NACK;
+ dev_warn(pci_dev_to_dev(adapter->pdev),
+ "VF %d has no MAC address assigned, you may have to "
+ "assign one manually\n", vf);
+ }
+
+ /*
+ * Piggyback the multicast filter type so VF can compute the
+ * correct vectors
+ */
+ msgbuf[3] = hw->mac.mc_filter_type;
+ ngbe_write_mbx(hw, msgbuf, NGBE_VF_PERMADDR_MSG_LEN, vf);
+
+ return 0;
+}
+
+static int ngbe_set_vf_mac_addr(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u16 vf)
+{
+ u8 *new_mac = ((u8 *)(&msgbuf[1]));
+
+ if (!is_valid_ether_addr(new_mac)) {
+ e_warn(drv, "VF %d attempted to set invalid mac\n", vf);
+ return -1;
+ }
+
+ if (adapter->vfinfo[vf].pf_set_mac &&
+ memcmp(adapter->vfinfo[vf].vf_mac_addresses, new_mac,
+ ETH_ALEN)) {
+ u8 *pm = adapter->vfinfo[vf].vf_mac_addresses;
+ e_warn(drv,
+ "VF %d attempted to set a new MAC address but it already "
+ "has an administratively set MAC address "
+ "%2.2X:%2.2X:%2.2X:%2.2X:%2.2X:%2.2X\n",
+ vf, pm[0], pm[1], pm[2], pm[3], pm[4], pm[5]);
+ e_warn(drv, "Check the VF driver and if it is not using the "
+ "correct MAC address you may need to reload the VF "
+ "driver\n");
+ return -1;
+ }
+ return ngbe_set_vf_mac(adapter, vf, new_mac) < 0;
+}
+
+#ifdef CONFIG_PCI_IOV
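+/*
+ * Walk the VLAN switch control (VLVF) table by selecting each index via
+ * PSR_VLAN_SWC_IDX and reading PSR_VLAN_SWC back; index 0 is
+ * short-circuited for VLAN 0, so the search starts at entry 1.
+ */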
+static int ngbe_find_vlvf_entry(struct ngbe_hw *hw, u32 vlan)
+{
+ u32 vlvf;
+ s32 regindex;
+
+ /* short cut the special case */
+ if (vlan == 0)
+ return 0;
+
+ /* Search for the vlan id in the VLVF entries */
+ for (regindex = 1; regindex < NGBE_PSR_VLAN_SWC_ENTRIES; regindex++) {
+ wr32(hw, NGBE_PSR_VLAN_SWC_IDX, regindex);
+ vlvf = rd32(hw, NGBE_PSR_VLAN_SWC);
+ if ((vlvf & VLAN_VID_MASK) == vlan)
+ break;
+ }
+
+ /* Return a negative value if not found */
+ if (regindex >= NGBE_PSR_VLAN_SWC_ENTRIES)
+ regindex = -1;
+
+ return regindex;
+}
+#endif /* CONFIG_PCI_IOV */
+
+static int ngbe_set_vf_vlan_msg(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u16 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int add = (msgbuf[0] & NGBE_VT_MSGINFO_MASK) >> NGBE_VT_MSGINFO_SHIFT;
+ int vid = (msgbuf[1] & NGBE_PSR_VLAN_SWC_VLANID_MASK);
+ int err;
+ u8 tcs = netdev_get_num_tc(adapter->netdev);
+
+ if (adapter->vfinfo[vf].pf_vlan || tcs) {
+ e_warn(drv,
+ "VF %d attempted to override administratively set VLAN "
+ "configuration\n"
+ "Reload the VF driver to resume operations\n",
+ vf);
+ return -1;
+ }
+
+ if (add)
+ adapter->vfinfo[vf].vlan_count++;
+ else if (adapter->vfinfo[vf].vlan_count)
+ adapter->vfinfo[vf].vlan_count--;
+
+ /* in case of promiscuous mode any VLAN filter set for a VF must
+ * also have the PF pool added to it.
+ */
+ if (add && adapter->netdev->flags & IFF_PROMISC)
+ err = ngbe_set_vf_vlan(adapter, add, vid, VMDQ_P(0));
+
+ err = ngbe_set_vf_vlan(adapter, add, vid, vf);
+ if (!err && adapter->vfinfo[vf].spoofchk_enabled)
+ TCALL(hw, mac.ops.set_vlan_anti_spoofing, true, vf);
+
+#ifdef CONFIG_PCI_IOV
+ /* Go through all the checks to see if the VLAN filter should
+ * be wiped completely.
+ */
+ if (!add && adapter->netdev->flags & IFF_PROMISC) {
+ u32 bits = 0, vlvf;
+ s32 reg_ndx;
+
+ reg_ndx = ngbe_find_vlvf_entry(hw, vid);
+ if (reg_ndx < 0)
+ goto out;
+ wr32(hw, NGBE_PSR_VLAN_SWC_IDX, reg_ndx);
+ vlvf = rd32(hw, NGBE_PSR_VLAN_SWC);
+ /* See if any other pools are set for this VLAN filter
+ * entry other than the PF.
+ */
+ if (VMDQ_P(0) < 32) {
+ bits = rd32(hw, NGBE_PSR_VLAN_SWC_VM_L);
+ bits &= ~(1 << VMDQ_P(0));
+		} else {
+			bits = rd32(hw, NGBE_PSR_VLAN_SWC_VM_L);
+			bits &= ~(1 << (VMDQ_P(0) - 32));
+		}
+
+ /* If the filter was removed then ensure PF pool bit
+ * is cleared if the PF only added itself to the pool
+ * because the PF is in promiscuous mode.
+ */
+ if ((vlvf & VLAN_VID_MASK) == vid &&
+ !test_bit(vid, adapter->active_vlans) &&
+ !bits)
+ ngbe_set_vf_vlan(adapter, add, vid, VMDQ_P(0));
+ }
+
+out:
+#endif
+ return err;
+}
+
+static int ngbe_set_vf_macvlan_msg(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u16 vf)
+{
+ u8 *new_mac = ((u8 *)(&msgbuf[1]));
+ int index = (msgbuf[0] & NGBE_VT_MSGINFO_MASK) >>
+ NGBE_VT_MSGINFO_SHIFT;
+ int err;
+
+ if (adapter->vfinfo[vf].pf_set_mac && index > 0) {
+ e_warn(drv,
+ "VF %d requested MACVLAN filter but is administratively denied\n",
+ vf);
+ return -1;
+ }
+
+	/* A non-zero index indicates the VF is setting a filter */
+ if (index) {
+ if (!is_valid_ether_addr(new_mac)) {
+ e_warn(drv, "VF %d attempted to set invalid mac\n", vf);
+ return -1;
+ }
+
+ /*
+ * If the VF is allowed to set MAC filters then turn off
+ * anti-spoofing to avoid false positives.
+ */
+ if (adapter->vfinfo[vf].spoofchk_enabled)
+ ngbe_ndo_set_vf_spoofchk(adapter->netdev, vf, false);
+ }
+
+ err = ngbe_set_vf_macvlan(adapter, vf, index, new_mac);
+ if (err == -ENOSPC)
+ e_warn(drv,
+ "VF %d has requested a MACVLAN filter but there is no "
+ "space for it\n",
+ vf);
+
+ return err < 0;
+}
+
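+/*
+ * Translate the VF's requested receive mode into L2 control bits: each mode
+ * builds a disable mask and an enable mask that are applied in a single
+ * read-modify-write of the per-VF PSR_VM_L2CTL register, so switching modes
+ * cannot leave stale promiscuous bits behind.
+ */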
+static int ngbe_update_vf_xcast_mode(struct ngbe_adapter *adapter,
+ u32 *msgbuf, u32 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int xcast_mode = msgbuf[1];
+ u32 vmolr, fctrl, disable, enable;
+
+ /* verify the PF is supporting the correct APIs */
+ switch (adapter->vfinfo[vf].vf_api) {
+ case ngbe_mbox_api_12:
+		/* promisc mode was introduced in API 1.3 */
+		if (xcast_mode == NGBEVF_XCAST_MODE_PROMISC)
+			return -EOPNOTSUPP;
+		/* fall through */
+ case ngbe_mbox_api_13:
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ if (adapter->vfinfo[vf].xcast_mode == xcast_mode)
+ goto out;
+
+ switch (xcast_mode) {
+ case NGBEVF_XCAST_MODE_NONE:
+ disable = NGBE_PSR_VM_L2CTL_BAM |
+ NGBE_PSR_VM_L2CTL_ROMPE |
+ NGBE_PSR_VM_L2CTL_MPE |
+ NGBE_PSR_VM_L2CTL_UPE |
+ NGBE_PSR_VM_L2CTL_VPE;
+ enable = 0;
+ break;
+ case NGBEVF_XCAST_MODE_MULTI:
+ disable = NGBE_PSR_VM_L2CTL_MPE |
+ NGBE_PSR_VM_L2CTL_UPE |
+ NGBE_PSR_VM_L2CTL_VPE;
+ enable = NGBE_PSR_VM_L2CTL_BAM |
+ NGBE_PSR_VM_L2CTL_ROMPE;
+ break;
+ case NGBEVF_XCAST_MODE_ALLMULTI:
+ disable = NGBE_PSR_VM_L2CTL_UPE |
+ NGBE_PSR_VM_L2CTL_VPE;
+ enable = NGBE_PSR_VM_L2CTL_BAM |
+ NGBE_PSR_VM_L2CTL_ROMPE |
+ NGBE_PSR_VM_L2CTL_MPE;
+ break;
+ case NGBEVF_XCAST_MODE_PROMISC:
+ fctrl = rd32(hw, NGBE_PSR_CTL);
+ if (!(fctrl & NGBE_PSR_CTL_UPE)) {
+ /* VF promisc requires PF in promisc */
+ e_warn(drv,
+ "Enabling VF promisc requires PF in promisc\n");
+ return -EPERM;
+ }
+ disable = 0;
+ enable = NGBE_PSR_VM_L2CTL_BAM |
+ NGBE_PSR_VM_L2CTL_ROMPE |
+ NGBE_PSR_VM_L2CTL_MPE |
+ NGBE_PSR_VM_L2CTL_UPE |
+ NGBE_PSR_VM_L2CTL_VPE;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ vmolr = rd32(hw, NGBE_PSR_VM_L2CTL(vf));
+ vmolr &= ~disable;
+ vmolr |= enable;
+ wr32(hw, NGBE_PSR_VM_L2CTL(vf), vmolr);
+
+ adapter->vfinfo[vf].xcast_mode = xcast_mode;
+
+out:
+ msgbuf[1] = xcast_mode;
+
+ return 0;
+}
+
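+/*
+ * Central mailbox dispatcher: every VF request funnels through here. The
+ * low 16 bits of word 0 select the operation, and the PF answers by setting
+ * the ACK or NACK type bit (plus CTS) before writing the reply back.
+ */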
+static int ngbe_rcv_msg_from_vf(struct ngbe_adapter *adapter, u16 vf)
+{
+ u16 mbx_size = NGBE_VXMAILBOX_SIZE;
+ u32 msgbuf[NGBE_VXMAILBOX_SIZE];
+ struct ngbe_hw *hw = &adapter->hw;
+ s32 retval;
+
+ retval = ngbe_read_mbx(hw, msgbuf, mbx_size, vf);
+
+ if (retval) {
+ pr_err("Error receiving message from VF\n");
+ return retval;
+ }
+
+ /* this is a message we already processed, do nothing */
+ if (msgbuf[0] & (NGBE_VT_MSGTYPE_ACK | NGBE_VT_MSGTYPE_NACK))
+ return retval;
+
+ /* flush the ack before we write any messages back */
+ NGBE_WRITE_FLUSH(hw);
+
+ if (msgbuf[0] == NGBE_VF_RESET)
+ return ngbe_vf_reset_msg(adapter, vf);
+
+	/*
+	 * Until the VF completes a virtual function reset it should not be
+	 * allowed to start any configuration.
+	 */
+	if (!adapter->vfinfo[vf].clear_to_send) {
+ msgbuf[0] |= NGBE_VT_MSGTYPE_NACK;
+ ngbe_write_mbx(hw, msgbuf, 1, vf);
+ return retval;
+ }
+
+ switch ((msgbuf[0] & 0xFFFF)) {
+ case NGBE_VF_SET_MAC_ADDR:
+ retval = ngbe_set_vf_mac_addr(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_SET_MULTICAST:
+ retval = ngbe_set_vf_multicasts(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_SET_VLAN:
+ retval = ngbe_set_vf_vlan_msg(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_SET_LPE:
+ if (msgbuf[1] > NGBE_MAX_JUMBO_FRAME_SIZE) {
+ e_err(drv, "VF max_frame %d exceed MAX_JUMBO_FRAME_SIZE\n", msgbuf[1]);
+ return -EINVAL;
+ }
+ retval = ngbe_set_vf_lpe(adapter, msgbuf[1], vf);
+ break;
+ case NGBE_VF_SET_MACVLAN:
+ retval = ngbe_set_vf_macvlan_msg(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_API_NEGOTIATE:
+ retval = ngbe_negotiate_vf_api(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_GET_QUEUES:
+ retval = ngbe_get_vf_queues(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_UPDATE_XCAST_MODE:
+ retval = ngbe_update_vf_xcast_mode(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_GET_LINK_STATUS:
+ retval = ngbe_get_vf_link_status(adapter, msgbuf, vf);
+ break;
+ case NGBE_VF_BACKUP:
+ break;
+ default:
+ e_err(drv, "Unhandled Msg %8.8x\n", msgbuf[0]);
+ retval = NGBE_ERR_MBX;
+ break;
+ }
+
+ /* notify the VF of the results of what it sent us */
+ if (retval)
+ msgbuf[0] |= NGBE_VT_MSGTYPE_NACK;
+ else
+ msgbuf[0] |= NGBE_VT_MSGTYPE_ACK;
+
+ msgbuf[0] |= NGBE_VT_MSGTYPE_CTS;
+
+ ngbe_write_mbx(hw, msgbuf, mbx_size, vf);
+
+ return retval;
+}
+
+static void ngbe_rcv_ack_from_vf(struct ngbe_adapter *adapter, u16 vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 msg = NGBE_VT_MSGTYPE_NACK;
+
+ /* if device isn't clear to send it shouldn't be reading either */
+ if (!adapter->vfinfo[vf].clear_to_send)
+ ngbe_write_mbx(hw, &msg, 1, vf);
+}
+
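+/*
+ * Walk every VF and service, in order, any pending reset request, mailbox
+ * message, and ack. Resets are handled first so a rebooting VF is quiesced
+ * before its stale messages are read.
+ */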
+void ngbe_msg_task(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u16 vf;
+
+ for (vf = 0; vf < adapter->num_vfs; vf++) {
+ /* process any reset requests */
+ if (!ngbe_check_for_rst(hw, vf))
+ ngbe_vf_reset_event(adapter, vf);
+
+ /* process any messages pending */
+ if (!ngbe_check_for_msg(hw, vf))
+ ngbe_rcv_msg_from_vf(adapter, vf);
+
+ /* process any acks */
+ if (!ngbe_check_for_ack(hw, vf))
+ ngbe_rcv_ack_from_vf(adapter, vf);
+ }
+}
+
+void ngbe_disable_tx_rx(struct ngbe_adapter *adapter)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+
+ /* disable transmit and receive for all vfs */
+ wr32(hw, NGBE_TDM_POOL_TE, 0);
+ wr32(hw, NGBE_RDM_POOL_RE, 0);
+}
+
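+/* Poke a single VF with a control message; CTS is only advertised once the
+ * VF has completed its reset handshake.
+ */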
+static inline void ngbe_ping_vf(struct ngbe_adapter *adapter, int vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 ping;
+
+ ping = NGBE_PF_CONTROL_MSG;
+ if (adapter->vfinfo[vf].clear_to_send)
+ ping |= NGBE_VT_MSGTYPE_CTS;
+ ngbe_write_mbx(hw, &ping, 1, vf);
+}
+
+void ngbe_ping_all_vfs(struct ngbe_adapter *adapter)
+{
+	u16 i;
+
+	for (i = 0; i < adapter->num_vfs; i++)
+		ngbe_ping_vf(adapter, i);
+}
+
+int ngbe_ndo_set_vf_trust(struct net_device *netdev, int vf, bool setting)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ if (vf >= adapter->num_vfs)
+ return -EINVAL;
+
+ /* nothing to do */
+ if (adapter->vfinfo[vf].trusted == setting)
+ return 0;
+
+ adapter->vfinfo[vf].trusted = setting;
+
+ /* reset VF to reconfigure features */
+ adapter->vfinfo[vf].clear_to_send = false;
+ ngbe_ping_vf(adapter, vf);
+
+ e_info(drv, "VF %u is %strusted\n", vf, setting ? "" : "not ");
+
+ return 0;
+}
+
+#ifdef CONFIG_PCI_IOV
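+/*
+ * Entered via the PCI core's sriov_configure callback, typically when the
+ * administrator writes a VF count to the sriov_numvfs sysfs attribute.
+ * Returns the number of VFs actually enabled, or a negative errno.
+ */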
+static int ngbe_pci_sriov_enable(struct pci_dev *dev, int num_vfs)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(dev);
+ int err = 0;
+ int i;
+ int pre_existing_vfs = pci_num_vf(dev);
+
+ if (!(adapter->flags & NGBE_FLAG_SRIOV_CAPABLE)) {
+ e_dev_warn("SRIOV not supported on this device\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (pre_existing_vfs && pre_existing_vfs != num_vfs)
+ err = ngbe_disable_sriov(adapter);
+ else if (pre_existing_vfs && pre_existing_vfs == num_vfs)
+ goto out;
+
+ if (err)
+ goto err_out;
+
+	/* While the SR-IOV capability structure reports total VFs to be
+	 * 8, we limit the actual number that can be allocated to 7 so
+	 * that some transmit/receive resources can be reserved for the
+	 * PF. The PCI bus driver already checks for other values out of
+	 * range.
+	 */
+ if ((num_vfs + adapter->num_vmdqs) > NGBE_MAX_VF_FUNCTIONS) {
+ err = -EPERM;
+ goto err_out;
+ }
+
+ adapter->num_vfs = num_vfs;
+
+ err = __ngbe_enable_sriov(adapter);
+ if (err)
+ goto err_out;
+
+ for (i = 0; i < adapter->num_vfs; i++)
+ ngbe_vf_configuration(dev, (i | 0x10000000));
+
+ err = pci_enable_sriov(dev, num_vfs);
+ if (err) {
+ e_dev_warn("Failed to enable PCI sriov: %d\n", err);
+ goto err_out;
+ }
+ ngbe_get_vfs(adapter);
+ msleep(100);
+ ngbe_sriov_reinit(adapter);
+out:
+ return num_vfs;
+err_out:
+ return err;
+}
+
+static int ngbe_pci_sriov_disable(struct pci_dev *dev)
+{
+ struct ngbe_adapter *adapter = pci_get_drvdata(dev);
+ int err;
+ u32 current_flags = adapter->flags;
+
+ err = ngbe_disable_sriov(adapter);
+
+ /* Only reinit if no error and state changed */
+ if (!err && current_flags != adapter->flags)
+ ngbe_sriov_reinit(adapter);
+
+ return err;
+}
+#endif
+
+int ngbe_pci_sriov_configure(struct pci_dev __maybe_unused *dev,
+ int __maybe_unused num_vfs)
+{
+#ifdef CONFIG_PCI_IOV
+ if (num_vfs == 0)
+ return ngbe_pci_sriov_disable(dev);
+ else
+ return ngbe_pci_sriov_enable(dev, num_vfs);
+#endif
+ return 0;
+}
+
+int ngbe_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
+{
+ s32 retval = 0;
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ if (!is_valid_ether_addr(mac) || (vf >= adapter->num_vfs))
+ return -EINVAL;
+
+ dev_info(pci_dev_to_dev(adapter->pdev),
+ "setting MAC %pM on VF %d\n", mac, vf);
+ dev_info(pci_dev_to_dev(adapter->pdev),
+ "Reload the VF driver to make this change effective.\n");
+ retval = ngbe_set_vf_mac(adapter, vf, mac);
+ if (retval >= 0) {
+ adapter->vfinfo[vf].pf_set_mac = true;
+ if (test_bit(__NGBE_DOWN, &adapter->state)) {
+ dev_warn(pci_dev_to_dev(adapter->pdev),
+ "The VF MAC address has been set, but the PF "
+ "device is not up.\n");
+ dev_warn(pci_dev_to_dev(adapter->pdev),
+ "Bring the PF device up before attempting to "
+ "use the VF device.\n");
+ }
+ } else {
+ dev_warn(pci_dev_to_dev(adapter->pdev),
+ "The VF MAC address was NOT set due to invalid or "
+ "duplicate MAC address.\n");
+ }
+
+ return retval;
+}
+
+static int ngbe_enable_port_vlan(struct ngbe_adapter *adapter,
+ int vf, u16 vlan, u8 qos)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int err;
+
+ err = ngbe_set_vf_vlan(adapter, true, vlan, vf);
+ if (err)
+ goto out;
+ ngbe_set_vmvir(adapter, vlan, qos, vf);
+ ngbe_set_vmolr(hw, vf, false);
+ if (adapter->vfinfo[vf].spoofchk_enabled)
+ TCALL(hw, mac.ops.set_vlan_anti_spoofing, true, vf);
+ adapter->vfinfo[vf].vlan_count++;
+ /* enable hide vlan */
+ ngbe_write_qde(adapter, vf, 1);
+ ngbe_write_hide_vlan(adapter, vf, 1);
+ adapter->vfinfo[vf].pf_vlan = vlan;
+ adapter->vfinfo[vf].pf_qos = qos;
+ dev_info(pci_dev_to_dev(adapter->pdev),
+ "Setting VLAN %d, QOS 0x%x on VF %d\n", vlan, qos, vf);
+ if (test_bit(__NGBE_DOWN, &adapter->state)) {
+ dev_warn(pci_dev_to_dev(adapter->pdev),
+ "The VF VLAN has been set, but the PF device is not "
+ "up.\n");
+ dev_warn(pci_dev_to_dev(adapter->pdev),
+ "Bring the PF device up before attempting to use the VF "
+ "device.\n");
+ }
+
+out:
+ return err;
+}
+
+static int ngbe_disable_port_vlan(struct ngbe_adapter *adapter, int vf)
+{
+ struct ngbe_hw *hw = &adapter->hw;
+ int err;
+
+ err = ngbe_set_vf_vlan(adapter, false,
+ adapter->vfinfo[vf].pf_vlan, vf);
+ ngbe_clear_vmvir(adapter, vf);
+ ngbe_set_vmolr(hw, vf, true);
+ TCALL(hw, mac.ops.set_vlan_anti_spoofing, false, vf);
+ if (adapter->vfinfo[vf].vlan_count)
+ adapter->vfinfo[vf].vlan_count--;
+ /* disable hide vlan */
+ ngbe_write_hide_vlan(adapter, vf, 0);
+ adapter->vfinfo[vf].pf_vlan = 0;
+ adapter->vfinfo[vf].pf_qos = 0;
+
+ return err;
+}
+
+int ngbe_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
+ u8 qos, __be16 vlan_proto)
+{
+ int err = 0;
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ /* VLAN IDs accepted range 0-4094 */
+	if ((vf >= adapter->num_vfs) || (vlan > VLAN_VID_MASK - 1) || (qos > 7))
+ return -EINVAL;
+
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
+ if (vlan || qos) {
+ /*
+ * Check if there is already a port VLAN set, if so
+ * we have to delete the old one first before we
+ * can set the new one. The usage model had
+ * previously assumed the user would delete the
+ * old port VLAN before setting a new one but this
+ * is not necessarily the case.
+ */
+ if (adapter->vfinfo[vf].pf_vlan)
+ err = ngbe_disable_port_vlan(adapter, vf);
+ if (err)
+ goto out;
+ err = ngbe_enable_port_vlan(adapter, vf, vlan, qos);
+
+ } else {
+ err = ngbe_disable_port_vlan(adapter, vf);
+ }
+out:
+ return err;
+}
+
+/* Rate limits are not programmed into the hardware here; the values are
+ * only stored so ndo_get_vf_config can report them back.
+ */
+int ngbe_ndo_set_vf_bw(struct net_device *netdev,
+ int vf,
+ int min_tx_rate,
+ int max_tx_rate)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+ /* verify VF is active */
+ if (vf >= adapter->num_vfs)
+ return -EINVAL;
+
+ /* verify link is up */
+ if (!adapter->link_up)
+ return -EINVAL;
+
+	/* verify we are linked at 1 Gbps */
+ if (adapter->link_speed < NGBE_LINK_SPEED_1GB_FULL)
+ return -EINVAL;
+
+ /* store values */
+ adapter->vfinfo[vf].min_tx_rate = min_tx_rate;
+ adapter->vfinfo[vf].max_tx_rate = max_tx_rate;
+
+ return 0;
+}
+
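+/*
+ * MAC anti-spoofing is tracked per VF in TDM_MAC_AS_L; the matching VLAN
+ * anti-spoof bit is only programmed when the VF actually has VLANs
+ * configured.
+ */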
+int ngbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting)
+{
+ struct ngbe_adapter *adapter = netdev_priv(netdev);
+ struct ngbe_hw *hw = &adapter->hw;
+ u32 regval;
+
+ if (vf >= adapter->num_vfs)
+ return -EINVAL;
+
+ adapter->vfinfo[vf].spoofchk_enabled = setting;
+
+ if (vf < 32) {
+ regval = (setting << vf);
+ wr32m(hw, NGBE_TDM_MAC_AS_L,
+ regval | (1 << vf), regval);
+
+ if (adapter->vfinfo[vf].vlan_count) {
+ wr32m(hw, NGBE_TDM_VLAN_AS_L,
+ regval | (1 << vf), regval);
+ }
+ }
+
+ return 0;
+}
+
+int ngbe_ndo_get_vf_config(struct net_device *netdev,
+ int vf, struct ifla_vf_info *ivi)
+{
+	struct ngbe_adapter *adapter = netdev_priv(netdev);
+
+	if (vf >= adapter->num_vfs)
+		return -EINVAL;
+
+	ivi->vf = vf;
+ memcpy(&ivi->mac, adapter->vfinfo[vf].vf_mac_addresses, ETH_ALEN);
+
+ ivi->max_tx_rate = adapter->vfinfo[vf].max_tx_rate;
+ ivi->min_tx_rate = adapter->vfinfo[vf].min_tx_rate;
+
+ ivi->vlan = adapter->vfinfo[vf].pf_vlan;
+ ivi->qos = adapter->vfinfo[vf].pf_qos;
+
+ ivi->spoofchk = adapter->vfinfo[vf].spoofchk_enabled;
+ ivi->trusted = adapter->vfinfo[vf].trusted;
+
+ return 0;
+}
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_sriov.h b/drivers/net/ethernet/netswift/ngbe/ngbe_sriov.h
new file mode 100644
index 0000000000000..958c5303f72ad
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_sriov.h
@@ -0,0 +1,63 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+
+#ifndef _NGBE_SRIOV_H_
+#define _NGBE_SRIOV_H_
+
+/* The ngbe driver limits the max number of VFs that can be enabled
+ * to 7 (NGBE_MAX_VF_FUNCTIONS - 1)
+ */
+#define NGBE_MAX_VFS_DRV_LIMIT (NGBE_MAX_VF_FUNCTIONS - 1)
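+/* the SR-IOV enable path additionally reserves pools for VMDq, rejecting
+ * requests where num_vfs + num_vmdqs would exceed NGBE_MAX_VF_FUNCTIONS
+ */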
+
+void ngbe_restore_vf_multicasts(struct ngbe_adapter *adapter);
+int ngbe_set_vf_vlan(struct ngbe_adapter *adapter, int add, int vid, u16 vf);
+void ngbe_set_vmolr(struct ngbe_hw *hw, u16 vf, bool aupe);
+void ngbe_msg_task(struct ngbe_adapter *adapter);
+int ngbe_set_vf_mac(struct ngbe_adapter *adapter,
+ u16 vf, unsigned char *mac_addr);
+void ngbe_disable_tx_rx(struct ngbe_adapter *adapter);
+void ngbe_ping_all_vfs(struct ngbe_adapter *adapter);
+
+int ngbe_ndo_set_vf_mac(struct net_device *netdev, int queue, u8 *mac);
+
+int ngbe_ndo_set_vf_vlan(struct net_device *netdev, int queue, u16 vlan,
+ u8 qos, __be16 vlan_proto);
+
+int ngbe_ndo_set_vf_bw(struct net_device *netdev, int vf, int min_tx_rate,
+ int max_tx_rate);
+
+int ngbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting);
+int ngbe_ndo_set_vf_trust(struct net_device *netdev, int vf, bool setting);
+int ngbe_ndo_get_vf_config(struct net_device *netdev,
+ int vf, struct ifla_vf_info *ivi);
+
+int ngbe_disable_sriov(struct ngbe_adapter *adapter);
+#ifdef CONFIG_PCI_IOV
+int ngbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask);
+void ngbe_enable_sriov(struct ngbe_adapter *adapter);
+#endif
+int ngbe_pci_sriov_configure(struct pci_dev *dev, int num_vfs);
+
+#define NGBE_VF_STATUS_LINKUP 0x1
+
+#endif /* _NGBE_SRIOV_H_ */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_sysfs.c b/drivers/net/ethernet/netswift/ngbe/ngbe_sysfs.c
new file mode 100644
index 0000000000000..559d02b2feeb2
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_sysfs.c
@@ -0,0 +1,222 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#include "ngbe.h"
+#include "ngbe_hw.h"
+#include "ngbe_type.h"
+
+#ifdef CONFIG_NGBE_SYSFS
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/sysfs.h>
+#include <linux/kobject.h>
+#include <linux/device.h>
+#include <linux/netdevice.h>
+#include <linux/time.h>
+#ifdef CONFIG_NGBE_HWMON
+#include <linux/hwmon.h>
+#endif
+
+#ifdef CONFIG_NGBE_HWMON
+/* hwmon callback functions */
+static ssize_t ngbe_hwmon_show_temp(struct device __always_unused *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct hwmon_attr *ngbe_attr = container_of(attr, struct hwmon_attr,
+ dev_attr);
+ unsigned int value;
+
+ /* reset the temp field */
+ TCALL(ngbe_attr->hw, mac.ops.get_thermal_sensor_data);
+
+ value = ngbe_attr->sensor->temp;
+
+ /* display millidegree */
+ value *= 1000;
+
+ return sprintf(buf, "%u\n", value);
+}
+
+static ssize_t ngbe_hwmon_show_alarmthresh(struct device __always_unused *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct hwmon_attr *ngbe_attr = container_of(attr, struct hwmon_attr,
+ dev_attr);
+ unsigned int value = ngbe_attr->sensor->alarm_thresh;
+
+ /* display millidegree */
+ value *= 1000;
+
+ return sprintf(buf, "%u\n", value);
+}
+
+static ssize_t ngbe_hwmon_show_dalarmthresh(struct device __always_unused *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct hwmon_attr *ngbe_attr = container_of(attr, struct hwmon_attr,
+ dev_attr);
+ unsigned int value = ngbe_attr->sensor->dalarm_thresh;
+
+ /* display millidegree */
+ value *= 1000;
+
+ return sprintf(buf, "%u\n", value);
+}
+
+/**
+ * ngbe_add_hwmon_attr - Create hwmon attr table for a hwmon sysfs file.
+ * @adapter: pointer to the adapter structure
+ * @type: type of sensor data to display
+ *
+ * For each file we want in hwmon's sysfs interface we need a device_attribute
+ * This is included in our hwmon_attr struct that contains the references to
+ * the data structures we need to get the data to display.
+ */
+static int ngbe_add_hwmon_attr(struct ngbe_adapter *adapter, int type)
+{
+ int rc;
+ unsigned int n_attr;
+ struct hwmon_attr *ngbe_attr;
+
+ n_attr = adapter->ngbe_hwmon_buff.n_hwmon;
+ ngbe_attr = &adapter->ngbe_hwmon_buff.hwmon_list[n_attr];
+
+ switch (type) {
+ case NGBE_HWMON_TYPE_TEMP:
+ ngbe_attr->dev_attr.show = ngbe_hwmon_show_temp;
+ snprintf(ngbe_attr->name, sizeof(ngbe_attr->name),
+ "temp%u_input", 0);
+ break;
+ case NGBE_HWMON_TYPE_ALARMTHRESH:
+ ngbe_attr->dev_attr.show = ngbe_hwmon_show_alarmthresh;
+ snprintf(ngbe_attr->name, sizeof(ngbe_attr->name),
+ "temp%u_alarmthresh", 0);
+ break;
+ case NGBE_HWMON_TYPE_DALARMTHRESH:
+ ngbe_attr->dev_attr.show = ngbe_hwmon_show_dalarmthresh;
+ snprintf(ngbe_attr->name, sizeof(ngbe_attr->name),
+ "temp%u_dalarmthresh", 0);
+ break;
+ default:
+ rc = -EPERM;
+ return rc;
+ }
+
+	/* These are always the same regardless of type */
+ ngbe_attr->sensor =
+ &adapter->hw.mac.thermal_sensor_data.sensor;
+ ngbe_attr->hw = &adapter->hw;
+ ngbe_attr->dev_attr.store = NULL;
+ ngbe_attr->dev_attr.attr.mode = S_IRUGO;
+ ngbe_attr->dev_attr.attr.name = ngbe_attr->name;
+
+ rc = device_create_file(pci_dev_to_dev(adapter->pdev),
+ &ngbe_attr->dev_attr);
+
+ if (rc == 0)
+ ++adapter->ngbe_hwmon_buff.n_hwmon;
+
+ return rc;
+}
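+
+/*
+ * The attributes created above surface as temp0_input, temp0_alarmthresh
+ * and temp0_dalarmthresh under the adapter's PCI device sysfs directory,
+ * each reporting millidegrees as hwmon convention expects.
+ */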
+#endif /* CONFIG_NGBE_HWMON */
+
+static void ngbe_sysfs_del_adapter(
+ struct ngbe_adapter __maybe_unused *adapter)
+{
+#ifdef CONFIG_NGBE_HWMON
+ int i;
+
+ if (adapter == NULL)
+ return;
+
+ for (i = 0; i < adapter->ngbe_hwmon_buff.n_hwmon; i++) {
+ device_remove_file(pci_dev_to_dev(adapter->pdev),
+ &adapter->ngbe_hwmon_buff.hwmon_list[i].dev_attr);
+ }
+
+ kfree(adapter->ngbe_hwmon_buff.hwmon_list);
+
+ if (adapter->ngbe_hwmon_buff.device)
+ hwmon_device_unregister(adapter->ngbe_hwmon_buff.device);
+#endif /* CONFIG_NGBE_HWMON */
+}
+
+/* called from ngbe_main.c */
+void ngbe_sysfs_exit(struct ngbe_adapter *adapter)
+{
+ ngbe_sysfs_del_adapter(adapter);
+}
+
+/* called from ngbe_main.c */
+int ngbe_sysfs_init(struct ngbe_adapter *adapter)
+{
+ int rc = 0;
+#ifdef CONFIG_NGBE_HWMON
+ struct hwmon_buff *ngbe_hwmon = &adapter->ngbe_hwmon_buff;
+ int n_attrs;
+#endif /* CONFIG_NGBE_HWMON */
+
+	if (adapter == NULL)
+ goto err;
+
+#ifdef CONFIG_NGBE_HWMON
+
+ /* Don't create thermal hwmon interface if no sensors present */
+ if (TCALL(&adapter->hw, mac.ops.init_thermal_sensor_thresh))
+ goto no_thermal;
+
+	/*
+	 * Allocate space for the maximum number of attributes:
+	 * max num sensors * values (temp, alarmthresh, dalarmthresh)
+	 */
+ n_attrs = 3;
+ ngbe_hwmon->hwmon_list = kcalloc(n_attrs, sizeof(struct hwmon_attr),
+ GFP_KERNEL);
+ if (!ngbe_hwmon->hwmon_list) {
+ rc = -ENOMEM;
+ goto err;
+ }
+
+ ngbe_hwmon->device =
+ hwmon_device_register(pci_dev_to_dev(adapter->pdev));
+ if (IS_ERR(ngbe_hwmon->device)) {
+ rc = PTR_ERR(ngbe_hwmon->device);
+ goto err;
+ }
+
+ /* Bail if any hwmon attr struct fails to initialize */
+ rc = ngbe_add_hwmon_attr(adapter, NGBE_HWMON_TYPE_TEMP);
+ rc |= ngbe_add_hwmon_attr(adapter, NGBE_HWMON_TYPE_ALARMTHRESH);
+ rc |= ngbe_add_hwmon_attr(adapter, NGBE_HWMON_TYPE_DALARMTHRESH);
+ if (rc)
+ goto err;
+
+no_thermal:
+#endif /* CONFIG_NGBE_HWMON */
+ goto exit;
+
+err:
+ ngbe_sysfs_del_adapter(adapter);
+exit:
+ return rc;
+}
+#endif /* CONFIG_NGBE_SYSFS */
diff --git a/drivers/net/ethernet/netswift/ngbe/ngbe_type.h b/drivers/net/ethernet/netswift/ngbe/ngbe_type.h
new file mode 100644
index 0000000000000..4e7f627edbbce
--- /dev/null
+++ b/drivers/net/ethernet/netswift/ngbe/ngbe_type.h
@@ -0,0 +1,2941 @@
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2017 Beijing WangXun Technology Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#ifndef _NGBE_TYPE_H_
+#define _NGBE_TYPE_H_
+
+#include <linux/types.h>
+#include <linux/mdio.h>
+#include <linux/netdevice.h>
+
+/*
+ * The following is a brief description of the error categories used by the
+ * ERROR_REPORT* macros.
+ *
+ * - NGBE_ERROR_INVALID_STATE
+ * This category is for errors which represent a serious failure state that is
+ * unexpected, and could be potentially harmful to device operation. It should
+ * not be used for errors relating to issues that can be worked around or
+ * ignored.
+ *
+ * - NGBE_ERROR_POLLING
+ * This category is for errors related to polling/timeout issues and should be
+ * used in any case where a timeout occurred, a lock could not be obtained, or
+ * data was not received within the time limit.
+ *
+ * - NGBE_ERROR_CAUTION
+ * This category should be used for reporting issues that may be the cause of
+ * other errors, such as temperature warnings. It should indicate an event which
+ * could be serious, but hasn't necessarily caused problems yet.
+ *
+ * - NGBE_ERROR_SOFTWARE
+ * This category is intended for errors due to software state preventing
+ * something. The category is not intended for errors due to bad arguments, or
+ * due to unsupported features. It should be used when a state occurs which
+ * prevents action but is not a serious issue.
+ *
+ * - NGBE_ERROR_ARGUMENT
+ * This category is for when a bad or invalid argument is passed. It should be
+ * used whenever a function is called and error checking has detected the
+ * argument is wrong or incorrect.
+ *
+ * - NGBE_ERROR_UNSUPPORTED
+ * This category is for errors which are due to unsupported circumstances or
+ * configuration issues. It should not be used when the issue is due to an
+ * invalid argument, but for when something has occurred that is unsupported
+ * (Ex: Flow control autonegotiation or an unsupported SFP+ module.)
+ */
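+/*
+ * Illustrative only -- the reporting macro itself lives in the driver's
+ * debug headers rather than here, so the exact name and signature below are
+ * assumptions. A polling failure would be tagged roughly like:
+ *
+ *	ERROR_REPORT1(NGBE_ERROR_POLLING,
+ *		      "SPI command did not complete within the time limit");
+ */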
+
+/* Little Endian defines */
+#ifndef __le16
+#define __le16 u16
+#endif
+#ifndef __le32
+#define __le32 u32
+#endif
+#ifndef __le64
+#define __le64 u64
+#endif
+
+/* Big Endian defines */
+#ifndef __be16
+#define __be16 u16
+#define __be32 u32
+#define __be64 u64
+#endif
+
+/************ ngbe_register.h ************/
+/* Vendor ID */
+#ifndef PCI_VENDOR_ID_TRUSTNETIC
+#define PCI_VENDOR_ID_TRUSTNETIC 0x8088
+#endif
+
+/* Device IDs */
+/* copper */
+#define NGBE_DEV_ID_EM_TEST 0x0000
+#define NGBE_DEV_ID_EM_WX1860AL_W 0x0100
+#define NGBE_DEV_ID_EM_WX1860A2 0x0101
+#define NGBE_DEV_ID_EM_WX1860A2S 0x0102
+#define NGBE_DEV_ID_EM_WX1860A4 0x0103
+#define NGBE_DEV_ID_EM_WX1860A4S 0x0104
+#define NGBE_DEV_ID_EM_WX1860AL2 0x0105
+#define NGBE_DEV_ID_EM_WX1860AL2S 0x0106
+#define NGBE_DEV_ID_EM_WX1860AL4 0x0107
+#define NGBE_DEV_ID_EM_WX1860AL4S 0x0108
+#define NGBE_DEV_ID_EM_WX1860NCSI 0x0109
+#define NGBE_DEV_ID_EM_WX1860A1 0x010a
+#define NGBE_DEV_ID_EM_WX1860AL1 0x010b
+
+/* transfer units */
+#define NGBE_KB_TO_B 1024
+
+/* Subsystem ID */
+#define NGBE_WX1860AL_INTERNAL 0x0410
+#define NGBE_WX1860AL_ZTE5201_RJ45 0x0100
+#define NGBE_WX1860AL_M88E1512_RJ45 0x0200
+#define NGBE_WX1860AL_M88E1512_SFP 0x0403
+#define NGBE_WX1860AL_YT8521S_SFP 0x0460
+
+#define NGBE_SUBSYSTEM_ID_EM_SF100F_LP 0x0103
+#define NGBE_SUBSYSTEM_ID_EM_SF100HF_LP 0x0103
+#define NGBE_SUBSYSTEM_ID_EM_SF200T 0x0201
+#define NGBE_SUBSYSTEM_ID_EM_SF200T_S 0x0210
+#define NGBE_SUBSYSTEM_ID_EM_SF400T 0x0401
+#define NGBE_SUBSYSTEM_ID_EM_SF400T_S 0x0410
+#define NGBE_SUBSYSTEM_ID_EM_SF200HT 0x0202
+#define NGBE_SUBSYSTEM_ID_EM_SF200HT_S 0x0220
+#define NGBE_SUBSYSTEM_ID_EM_SF400HT 0x0402
+#define NGBE_SUBSYSTEM_ID_EM_SF400HT_S 0x0420
+#define NGBE_SUBSYSTEM_ID_EM_SF200HXT 0x0230
+#define NGBE_SUBSYSTEM_ID_EM_SF400HXT 0x0430
+#define NGBE_SUBSYSTEM_ID_EM_SF400_OCP 0x0440
+#define NGBE_SUBSYSTEM_ID_EM_SF400_LY 0x0450
+#define NGBE_SUBSYSTEM_ID_EM_SF400_LY_YT 0x0470
+
+#define INTERNAL_SFP 0x0003
+#define OCP_CARD 0x0040
+#define LY_M88E1512_SFP 0x0050
+#define YT8521S_SFP 0x0060
+#define LY_YT8521S_SFP 0x0070
+
+#define OEM_MASK 0x00F0
+#define INTERNAL_SFP_MASK 0x000F
+
+#define NCSI_SUP 0x8000
+#define NCSI_SUP_MASK 0x8000
+
+#define WOL_SUP 0x4000
+#define WOL_SUP_MASK 0x4000
+
+/* MDIO Manageable Devices (MMDs). */
+#define NGBE_MDIO_PMA_PMD_DEV_TYPE 0x1 /* PMA and PMD */
+#define NGBE_MDIO_PCS_DEV_TYPE 0x3 /* Physical Coding Sublayer*/
+#define NGBE_MDIO_PHY_XS_DEV_TYPE 0x4 /* PHY Extender Sublayer */
+#define NGBE_MDIO_AUTO_NEG_DEV_TYPE 0x7 /* Auto-Negotiation */
+#define NGBE_MDIO_VENDOR_SPECIFIC_1_DEV_TYPE 0x1E /* Vendor specific 1 */
+
+/* phy register definitions */
+/* VENDOR_SPECIFIC_1_DEV regs */
+#define NGBE_MDIO_VENDOR_SPECIFIC_1_STATUS 0x1 /* VS1 Status Reg */
+#define NGBE_MDIO_VENDOR_SPECIFIC_1_LINK_STATUS 0x0008 /* 1 = Link Up */
+#define NGBE_MDIO_VENDOR_SPECIFIC_1_SPEED_STATUS 0x0010 /* 0-10G, 1-1G */
+
+/* AUTO_NEG_DEV regs */
+#define NGBE_MDIO_AUTO_NEG_CONTROL 0x0 /* AUTO_NEG Control Reg */
+#define NGBE_MDIO_AUTO_NEG_ADVT 0x10 /* AUTO_NEG Advt Reg */
+#define NGBE_MDIO_AUTO_NEG_LP 0x13 /* AUTO_NEG LP Reg */
+#define NGBE_MDIO_AUTO_NEG_LP_STATUS 0xE820 /* AUTO NEG RX LP Status Reg */
+#define NGBE_MII_AUTONEG_VENDOR_PROVISION_1_REG 0xC400 /* 1G Provisioning 1 */
+#define NGBE_MII_AUTONEG_XNP_TX_REG 0x17 /* 1G XNP Transmit */
+#define NGBE_MII_AUTONEG_ADVERTISE_REG 0x10 /* 100M Advertisement */
+
+#define NGBE_MDIO_AUTO_NEG_1000BASE_EEE_ADVT 0x4
+#define NGBE_MDIO_AUTO_NEG_100BASE_EEE_ADVT 0x2
+#define NGBE_MDIO_AUTO_NEG_LP_1000BASE_CAP 0x8000
+
+#define NGBE_MII_1GBASE_T_ADVERTISE_XNP_TX 0x4000 /* full duplex, bit:14*/
+#define NGBE_MII_1GBASE_T_ADVERTISE 0x8000 /* full duplex, bit:15*/
+#define NGBE_MII_100BASE_T_ADVERTISE 0x0100 /* full duplex, bit:8 */
+#define NGBE_MII_100BASE_T_ADVERTISE_HALF 0x0080 /* half duplex, bit:7 */
+#define NGBE_MII_RESTART 0x200
+#define NGBE_MII_AUTONEG_COMPLETE 0x20
+#define NGBE_MII_AUTONEG_LINK_UP 0x04
+#define NGBE_MII_AUTONEG_REG 0x0
+
+/* PHY_XS_DEV regs */
+#define NGBE_MDIO_PHY_XS_CONTROL 0x0 /* PHY_XS Control Reg */
+#define NGBE_MDIO_PHY_XS_RESET 0x8000 /* PHY_XS Reset */
+
+/* Media-dependent registers. */
+#define NGBE_MDIO_PHY_ID_HIGH 0x2 /* PHY ID High Reg*/
+#define NGBE_MDIO_PHY_ID_LOW 0x3 /* PHY ID Low Reg*/
+#define NGBE_MDIO_PHY_SPEED_ABILITY 0x4 /* Speed Ability Reg */
+#define NGBE_MDIO_PHY_EXT_ABILITY 0xB /* Ext Ability Reg */
+
+#define NGBE_MDIO_PHY_SPEED_1G 0x0010 /* 1G capable */
+#define NGBE_MDIO_PHY_SPEED_100M 0x0020 /* 100M capable */
+#define NGBE_MDIO_PHY_SPEED_10M 0x0040 /* 10M capable */
+
+#define NGBE_MDIO_PHY_1000BASET_ABILITY 0x0020 /* 1000BaseT capable */
+#define NGBE_MDIO_PHY_100BASETX_ABILITY 0x0080 /* 100BaseTX capable */
+
+#define NGBE_PHY_REVISION_MASK 0xFFFFFFF0U
+#define NGBE_MAX_PHY_ADDR 32
+
+#define NGBE_MDIO_CLAUSE_SELECT 0x11220
+
+/* INTERNAL PHY CONTROL */
+#define NGBE_INTERNAL_PHY_PAGE_SELECT_OFFSET 31
+#define NGBE_INTERNAL_PHY_OFFSET_MAX 32
+#define NGBE_INTERNAL_PHY_ID 0x000732
+
+#define NGBE_INTPHY_LED0 0x0010
+#define NGBE_INTPHY_LED1 0x0040
+#define NGBE_INTPHY_LED2 0x2000
+
+#define NGBE_INTPHY_INT_LSC 0x0010
+#define NGBE_INTPHY_INT_ANC 0x0008
+
+/* PHY MDI STANDARD CONFIG */
+#define NGBE_MDI_PHY_ID1_OFFSET 2
+#define NGBE_MDI_PHY_ID2_OFFSET 3
+#define NGBE_MDI_PHY_ID_MASK 0xFFFFFC00U
+#define NGBE_MDI_PHY_SPEED_SELECT1 0x0040
+#define NGBE_MDI_PHY_DUPLEX 0x0100
+#define NGBE_MDI_PHY_RESTART_AN 0x0200
+#define NGBE_MDI_PHY_ANE 0x1000
+#define NGBE_MDI_PHY_SPEED_SELECT0 0x2000
+#define NGBE_MDI_PHY_RESET 0x8000
+
+#define NGBE_PHY_RST_WAIT_PERIOD 5
+
+#define NGBE_M88E1512_PHY_ID 0x005043
+/* reg 18_0 */
+#define NGBE_M88E1512_INT_LSC 0x0400
+#define NGBE_M88E1512_INT_ANC 0x0800
+/* reg 18_3 */
+#define NGBE_M88E1512_INT_EN 0x0080
+#define NGBE_M88E1512_INT_POL 0x0800
+
+/* reg 21_2 */
+#define NGBE_M88E1512_RGM_TTC 0x0010
+#define NGBE_M88E1512_RGM_RTC 0x0020
+
+/* LED control */
+#define NGBE_M88E1512_LED1_CONF 0x6
+#define NGBE_M88E1512_LED0_CONF 0x1
+
+/* LED polarity */
+#define NGBE_M88E1512_LED1_POL 0x1
+#define NGBE_M88E1512_LED0_POL 0x1
+
+/* reg 4_0 ADV REG*/
+#define NGBE_M88E1512_10BASET_HALF 0x0020
+#define NGBE_M88E1512_10BASET_FULL 0x0040
+#define NGBE_M88E1512_100BASET_HALF 0x0080
+#define NGBE_M88E1512_100BASET_FULL 0x0100
+
+/* reg 9_0 ADV REG*/
+#define NGBE_M88E1512_1000BASET_HALF 0x0100
+#define NGBE_M88E1512_1000BASET_FULL 0x0200
+
+/* reg 19_0 INT status*/
+#define NGBE_M88E1512_ANC 0x0800
+#define NGBE_M88E1512_LSC 0x0400
+
+/* yt8521s reg */
+#define NGBE_YT8521S_PHY_ID 0x011a
+
+#define NGBE_YT8521S_SDS_LINK_UP 0x4
+#define NGBE_YT8521S_SDS_LINK_DOWN 0x8
+
+/* PHY IDs*/
+#define TN1010_PHY_ID 0x00A19410U
+#define QT2022_PHY_ID 0x0043A400U
+#define ATH_PHY_ID 0x03429050U
+/* PHY FW revision */
+#define TNX_FW_REV 0xB
+#define AQ_FW_REV 0x20
+
+/* ETH PHY Registers */
+#define NGBE_SR_XS_PCS_MMD_STATUS1 0x30001
+#define NGBE_SR_PCS_CTL2 0x30007
+#define NGBE_SR_PMA_MMD_CTL1 0x10000
+#define NGBE_SR_MII_MMD_CTL 0x1F0000
+#define NGBE_SR_MII_MMD_DIGI_CTL 0x1F8000
+#define NGBE_SR_MII_MMD_AN_CTL 0x1F8001
+#define NGBE_SR_MII_MMD_AN_ADV 0x1F0004
+#define NGBE_SR_MII_MMD_AN_ADV_PAUSE(_v) ((0x3 & (_v)) << 7)
+#define NGBE_SR_MII_MMD_LP_BABL 0x1F0005
+#define NGBE_SR_AN_MMD_CTL 0x70000
+#define NGBE_SR_AN_MMD_ADV_REG1 0x70010
+#define NGBE_SR_AN_MMD_ADV_REG1_PAUSE(_v) ((0x3 & (_v)) << 10)
+#define NGBE_SR_AN_MMD_ADV_REG1_PAUSE_SYM 0x400
+#define NGBE_SR_AN_MMD_ADV_REG1_PAUSE_ASM 0x800
+#define NGBE_SR_AN_MMD_ADV_REG2 0x70011
+#define NGBE_SR_AN_MMD_LP_ABL1 0x70013
+#define NGBE_VR_AN_KR_MODE_CL 0x78003
+#define NGBE_VR_XS_OR_PCS_MMD_DIGI_CTL1 0x38000
+#define NGBE_VR_XS_OR_PCS_MMD_DIGI_STATUS 0x38010
+
+#define NGBE_PHY_MPLLA_CTL0 0x18071
+#define NGBE_PHY_MPLLA_CTL3 0x18077
+#define NGBE_PHY_MISC_CTL0 0x18090
+#define NGBE_PHY_VCO_CAL_LD0 0x18092
+#define NGBE_PHY_VCO_CAL_LD1 0x18093
+#define NGBE_PHY_VCO_CAL_LD2 0x18094
+#define NGBE_PHY_VCO_CAL_LD3 0x18095
+#define NGBE_PHY_VCO_CAL_REF0 0x18096
+#define NGBE_PHY_VCO_CAL_REF1 0x18097
+#define NGBE_PHY_RX_AD_ACK 0x18098
+#define NGBE_PHY_AFE_DFE_ENABLE 0x1805D
+#define NGBE_PHY_DFE_TAP_CTL0 0x1805E
+#define NGBE_PHY_RX_EQ_ATT_LVL0 0x18057
+#define NGBE_PHY_RX_EQ_CTL0 0x18058
+#define NGBE_PHY_RX_EQ_CTL 0x1805C
+#define NGBE_PHY_TX_EQ_CTL0 0x18036
+#define NGBE_PHY_TX_EQ_CTL1 0x18037
+#define NGBE_PHY_TX_RATE_CTL 0x18034
+#define NGBE_PHY_RX_RATE_CTL 0x18054
+#define NGBE_PHY_TX_GEN_CTL2 0x18032
+#define NGBE_PHY_RX_GEN_CTL2 0x18052
+#define NGBE_PHY_RX_GEN_CTL3 0x18053
+#define NGBE_PHY_MPLLA_CTL2 0x18073
+#define NGBE_PHY_RX_POWER_ST_CTL 0x18055
+#define NGBE_PHY_TX_POWER_ST_CTL 0x18035
+#define NGBE_PHY_TX_GENCTRL1 0x18031
+
+#define NGBE_SR_PCS_CTL2_PCS_TYPE_SEL_R 0x0
+#define NGBE_SR_PCS_CTL2_PCS_TYPE_SEL_X 0x1
+#define NGBE_SR_PCS_CTL2_PCS_TYPE_SEL_MASK 0x3
+#define NGBE_SR_PMA_MMD_CTL1_SPEED_SEL_1G 0x0
+#define NGBE_SR_PMA_MMD_CTL1_SPEED_SEL_MASK 0x2000
+#define NGBE_SR_PMA_MMD_CTL1_LB_EN 0x1
+#define NGBE_SR_MII_MMD_CTL_AN_EN 0x1000
+#define NGBE_SR_MII_MMD_CTL_RESTART_AN 0x0200
+#define NGBE_SR_AN_MMD_CTL_RESTART_AN 0x0200
+#define NGBE_SR_AN_MMD_CTL_ENABLE 0x1000
+#define NGBE_SR_AN_MMD_ADV_REG2_BP_TYPE_KX4 0x40
+#define NGBE_SR_AN_MMD_ADV_REG2_BP_TYPE_KX 0x20
+#define NGBE_SR_AN_MMD_ADV_REG2_BP_TYPE_KR 0x80
+#define NGBE_SR_AN_MMD_ADV_REG2_BP_TYPE_MASK 0xFFFF
+#define NGBE_VR_XS_OR_PCS_MMD_DIGI_CTL1_ENABLE 0x1000
+#define NGBE_VR_XS_OR_PCS_MMD_DIGI_CTL1_VR_RST 0x8000
+#define NGBE_VR_XS_OR_PCS_MMD_DIGI_STATUS_PSEQ_MASK 0x1C
+#define NGBE_VR_XS_OR_PCS_MMD_DIGI_STATUS_PSEQ_POWER_GOOD 0x10
+
+#define NGBE_PHY_MPLLA_CTL0_MULTIPLIER_1GBASEX_KX 32
+#define NGBE_PHY_MPLLA_CTL0_MULTIPLIER_OTHER 40
+#define NGBE_PHY_MPLLA_CTL0_MULTIPLIER_MASK 0xFF
+#define NGBE_PHY_MPLLA_CTL3_MULTIPLIER_BW_1GBASEX_KX 0x46
+#define NGBE_PHY_MPLLA_CTL3_MULTIPLIER_BW_OTHER 0x56
+#define NGBE_PHY_MPLLA_CTL3_MULTIPLIER_BW_MASK 0x7FF
+#define NGBE_PHY_MISC_CTL0_TX2RX_LB_EN_0 0x1
+#define NGBE_PHY_MISC_CTL0_TX2RX_LB_EN_3_1 0xE
+#define NGBE_PHY_MISC_CTL0_RX_VREF_CTRL 0x1F00
+#define NGBE_PHY_VCO_CAL_LD0_1GBASEX_KX 1344
+#define NGBE_PHY_VCO_CAL_LD0_OTHER 1360
+#define NGBE_PHY_VCO_CAL_LD0_MASK 0x1000
+#define NGBE_PHY_VCO_CAL_REF0_LD0_1GBASEX_KX 42
+#define NGBE_PHY_VCO_CAL_REF0_LD0_OTHER 34
+#define NGBE_PHY_VCO_CAL_REF0_LD0_MASK 0x3F
+#define NGBE_PHY_AFE_DFE_ENABLE_DFE_EN0 0x10
+#define NGBE_PHY_AFE_DFE_ENABLE_AFE_EN0 0x1
+#define NGBE_PHY_AFE_DFE_ENABLE_MASK 0xFF
+#define NGBE_PHY_RX_EQ_CTL_CONT_ADAPT0 0x1
+#define NGBE_PHY_RX_EQ_CTL_CONT_ADAPT_MASK 0xF
+#define NGBE_PHY_TX_RATE_CTL_TX0_RATE_RXAUI 0x1
+#define NGBE_PHY_TX_RATE_CTL_TX0_RATE_1GBASEX_KX 0x3
+#define NGBE_PHY_TX_RATE_CTL_TX0_RATE_OTHER 0x2
+#define NGBE_PHY_TX_RATE_CTL_TX1_RATE_OTHER 0x20
+#define NGBE_PHY_TX_RATE_CTL_TX2_RATE_OTHER 0x200
+#define NGBE_PHY_TX_RATE_CTL_TX3_RATE_OTHER 0x2000
+#define NGBE_PHY_TX_RATE_CTL_TX0_RATE_MASK 0x7
+#define NGBE_PHY_TX_RATE_CTL_TX1_RATE_MASK 0x70
+#define NGBE_PHY_TX_RATE_CTL_TX2_RATE_MASK 0x700
+#define NGBE_PHY_TX_RATE_CTL_TX3_RATE_MASK 0x7000
+#define NGBE_PHY_RX_RATE_CTL_RX0_RATE_RXAUI 0x1
+#define NGBE_PHY_RX_RATE_CTL_RX0_RATE_1GBASEX_KX 0x3
+#define NGBE_PHY_RX_RATE_CTL_RX0_RATE_OTHER 0x2
+#define NGBE_PHY_RX_RATE_CTL_RX1_RATE_OTHER 0x20
+#define NGBE_PHY_RX_RATE_CTL_RX2_RATE_OTHER 0x200
+#define NGBE_PHY_RX_RATE_CTL_RX3_RATE_OTHER 0x2000
+#define NGBE_PHY_RX_RATE_CTL_RX0_RATE_MASK 0x7
+#define NGBE_PHY_RX_RATE_CTL_RX1_RATE_MASK 0x70
+#define NGBE_PHY_RX_RATE_CTL_RX2_RATE_MASK 0x700
+#define NGBE_PHY_RX_RATE_CTL_RX3_RATE_MASK 0x7000
+#define NGBE_PHY_TX_GEN_CTL2_TX0_WIDTH_OTHER 0x100
+#define NGBE_PHY_TX_GEN_CTL2_TX0_WIDTH_MASK 0x300
+#define NGBE_PHY_TX_GEN_CTL2_TX1_WIDTH_OTHER 0x400
+#define NGBE_PHY_TX_GEN_CTL2_TX1_WIDTH_MASK 0xC00
+#define NGBE_PHY_TX_GEN_CTL2_TX2_WIDTH_OTHER 0x1000
+#define NGBE_PHY_TX_GEN_CTL2_TX2_WIDTH_MASK 0x3000
+#define NGBE_PHY_TX_GEN_CTL2_TX3_WIDTH_OTHER 0x4000
+#define NGBE_PHY_TX_GEN_CTL2_TX3_WIDTH_MASK 0xC000
+#define NGBE_PHY_RX_GEN_CTL2_RX0_WIDTH_OTHER 0x100
+#define NGBE_PHY_RX_GEN_CTL2_RX0_WIDTH_MASK 0x300
+#define NGBE_PHY_RX_GEN_CTL2_RX1_WIDTH_OTHER 0x400
+#define NGBE_PHY_RX_GEN_CTL2_RX1_WIDTH_MASK 0xC00
+#define NGBE_PHY_RX_GEN_CTL2_RX2_WIDTH_OTHER 0x1000
+#define NGBE_PHY_RX_GEN_CTL2_RX2_WIDTH_MASK 0x3000
+#define NGBE_PHY_RX_GEN_CTL2_RX3_WIDTH_OTHER 0x4000
+#define NGBE_PHY_RX_GEN_CTL2_RX3_WIDTH_MASK 0xC000
+
+#define NGBE_PHY_MPLLA_CTL2_DIV_CLK_EN_8 0x100
+#define NGBE_PHY_MPLLA_CTL2_DIV_CLK_EN_10 0x200
+#define NGBE_PHY_MPLLA_CTL2_DIV_CLK_EN_16P5 0x400
+#define NGBE_PHY_MPLLA_CTL2_DIV_CLK_EN_MASK 0x700
+
+#define NGBE_XPCS_POWER_GOOD_MAX_POLLING_TIME 100
+#define NGBE_PHY_INIT_DONE_POLLING_TIME 100
+
+/**************** Global Registers ****************************/
+/* chip control Registers */
+#define NGBE_MIS_RST 0x1000C
+#define NGBE_MIS_PWR 0x10000
+#define NGBE_MIS_CTL 0x10004
+#define NGBE_MIS_PF_SM 0x10008
+#define NGBE_MIS_ST 0x10028
+#define NGBE_MIS_SWSM 0x1002C
+#define NGBE_MIS_RST_ST 0x10030
+
+#define NGBE_MIS_RST_SW_RST 0x00000001U
+#define NGBE_MIS_RST_LAN0_RST 0x00000002U
+#define NGBE_MIS_RST_LAN1_RST 0x00000004U
+#define NGBE_MIS_RST_LAN2_RST 0x00000008U
+#define NGBE_MIS_RST_LAN3_RST 0x00000010U
+#define NGBE_MIS_RST_FW_RST 0x00000020U
+
+#define NGBE_MIS_RST_LAN0_CHG_ETH_MODE 0x20000000U
+#define NGBE_MIS_RST_LAN1_CHG_ETH_MODE 0x40000000U
+#define NGBE_MIS_RST_GLOBAL_RST 0x80000000U
+
+#define NGBE_MIS_PWR_LAN_ID(_r) ((0xF0000000U & (_r)) >> 28)
+#define NGBE_MIS_PWR_LAN_ID_0 (1)
+#define NGBE_MIS_PWR_LAN_ID_1 (2)
+#define NGBE_MIS_PWR_LAN_ID_2 (3)
+#define NGBE_MIS_PWR_LAN_ID_3 (4)
+
+#define NGBE_MIS_ST_MNG_INIT_DN 0x00000001U
+#define NGBE_MIS_ST_MNG_VETO 0x00000100U
+#define NGBE_MIS_ST_LAN0_ECC 0x00010000U
+#define NGBE_MIS_ST_LAN1_ECC 0x00020000U
+#define NGBE_MIS_ST_LAN2_ECC 0x00040000U
+#define NGBE_MIS_ST_LAN3_ECC 0x00080000U
+#define NGBE_MIS_ST_MNG_ECC 0x00100000U
+#define NGBE_MIS_ST_PCORE_ECC 0x00200000U
+#define NGBE_MIS_ST_PCIWRP_ECC 0x00400000U
+#define NGBE_MIS_ST_PCIEPHY_ECC 0x00800000U
+#define NGBE_MIS_ST_FMGR_ECC 0x01000000U
+#define NGBE_MIS_ST_GPHY_IN_RST(_r) (0x00000200U << (_r))
+
+#define NGBE_MIS_SWSM_SMBI 1
+#define NGBE_MIS_RST_ST_DEV_RST_ST_DONE 0x00000000U
+#define NGBE_MIS_RST_ST_DEV_RST_ST_REQ 0x00080000U
+#define NGBE_MIS_RST_ST_DEV_RST_ST_INPROGRESS 0x00100000U
+#define NGBE_MIS_RST_ST_DEV_RST_ST_MASK 0x00180000U
+#define NGBE_MIS_RST_ST_DEV_RST_TYPE_MASK 0x00070000U
+#define NGBE_MIS_RST_ST_DEV_RST_TYPE_SHIFT 16
+#define NGBE_MIS_RST_ST_DEV_RST_TYPE_SW_RST 0x3
+#define NGBE_MIS_RST_ST_DEV_RST_TYPE_GLOBAL_RST 0x5
+#define NGBE_MIS_RST_ST_RST_INIT 0x0000FF00U
+#define NGBE_MIS_RST_ST_RST_INI_SHIFT 8
+#define NGBE_MIS_RST_ST_RST_TIM 0x000000FFU
+#define NGBE_MIS_PF_SM_SM 1
+
+/* Sensors for PVT(Process Voltage Temperature) */
+#define NGBE_TS_CTL 0x10300
+#define NGBE_TS_EN 0x10304
+#define NGBE_TS_ST 0x10308
+#define NGBE_TS_ALARM_THRE 0x1030C
+#define NGBE_TS_DALARM_THRE 0x10310
+#define NGBE_TS_INT_EN 0x10314
+#define NGBE_TS_ALARM_ST 0x10318
+#define NGBE_TS_ALARM_ST_DALARM 0x00000002U
+#define NGBE_TS_ALARM_ST_ALARM 0x00000001U
+
+#define NGBE_EFUSE_WDATA0 0x10320
+#define NGBE_EFUSE_WDATA1 0x10324
+#define NGBE_EFUSE_RDATA0 0x10328
+#define NGBE_EFUSE_RDATA1 0x1032C
+#define NGBE_EFUSE_STATUS 0x10330
+
+#define NGBE_TS_CTL_CALI_DONE 0x80000000U
+#define NGBE_TS_EN_ENA 0x00000001U
+#define NGBE_TS_ST_DATA_OUT_MASK 0x000003FFU
+#define NGBE_TS_ALARM_THRE_MASK 0x000003FFU
+#define NGBE_TS_DALARM_THRE_MASK 0x000003FFU
+#define NGBE_TS_INT_EN_DALARM_INT_EN 0x00000002U
+#define NGBE_TS_INT_EN_ALARM_INT_EN 0x00000001U
+
+struct ngbe_thermal_diode_data {
+ s16 temp;
+ s16 alarm_thresh;
+ s16 dalarm_thresh;
+};
+
+struct ngbe_thermal_sensor_data {
+ struct ngbe_thermal_diode_data sensor;
+};
+
+/* FMGR Registers */
+#define NGBE_SPI_ILDR_STATUS 0x10120
+#define NGBE_SPI_ILDR_STATUS_PERST 0x00000001U /* PCIE_PERST is done */
+#define NGBE_SPI_ILDR_STATUS_PWRRST 0x00000002U /* Power on reset done */
+#define NGBE_SPI_ILDR_STATUS_SW_RESET 0x00000800U /* software reset done */
+#define NGBE_SPI_ILDR_STATUS_LAN0_SW_RST 0x00002000U /* lan0 soft reset done */
+#define NGBE_SPI_ILDR_STATUS_LAN1_SW_RST 0x00004000U /* lan1 soft reset done */
+#define NGBE_SPI_ILDR_STATUS_LAN2_SW_RST 0x00008000U /* lan2 soft reset done */
+#define NGBE_SPI_ILDR_STATUS_LAN3_SW_RST 0x00010000U /* lan3 soft reset done */
+
+#define NGBE_MAX_FLASH_LOAD_POLL_TIME 10
+
+#define NGBE_SPI_CMD 0x10104
+#define NGBE_SPI_CMD_CMD(_v) (((_v) & 0x7) << 28)
+#define NGBE_SPI_CMD_CLK(_v) (((_v) & 0x7) << 25)
+#define NGBE_SPI_CMD_ADDR(_v) (((_v) & 0x7FFFFF))
+
+#define NGBE_SPI_DATA 0x10108
+#define NGBE_SPI_DATA_BYPASS ((0x1) << 31)
+#define NGBE_SPI_DATA_STATUS(_v) (((_v) & 0xFF) << 16)
+#define NGBE_SPI_DATA_OP_DONE ((0x1))
+
+#define NGBE_SPI_STATUS 0x1010C
+#define NGBE_SPI_STATUS_OPDONE ((0x1))
+#define NGBE_SPI_STATUS_FLASH_BYPASS ((0x1) << 31)
+
+#define NGBE_SPI_USR_CMD 0x10110
+#define NGBE_SPI_CMDCFG0 0x10114
+#define NGBE_SPI_CMDCFG1 0x10118
+#define NGBE_SPI_ILDR_SWPTR 0x10124
+
+/************************* Port Registers ************************************/
+
+/* port cfg Registers */
+#define NGBE_CFG_PORT_CTL 0x14400
+#define NGBE_CFG_PORT_ST 0x14404
+#define NGBE_CFG_EX_VTYPE 0x14408
+#define NGBE_CFG_LED_CTL 0x14424
+
+/* internal phy reg_offset [0,31] */
+#define NGBE_PHY_CONFIG(reg_offset) (0x14000 + ((reg_offset) * 4))
+
+#define NGBE_CFG_TCP_TIME 0x14420
+#define NGBE_CFG_TAG_TPID(_i) (0x14430 + ((_i) * 4)) /* [0,3] */
+#define NGBE_CFG_LAN_SPEED 0x14440
+
+/* port cfg bit */
+#define NGBE_CFG_PORT_CTL_PFRSTD 0x00004000U /* Phy Function Reset Done */
+#define NGBE_CFG_PORT_CTL_D_VLAN 0x00000001U /* double vlan*/
+#define NGBE_CFG_PORT_CTL_ETAG_ETYPE_VLD 0x00000002U
+#define NGBE_CFG_PORT_CTL_QINQ 0x00000004U
+#define NGBE_CFG_PORT_CTL_DRV_LOAD 0x00000008U
+#define NGBE_CFG_PORT_CTL_NUM_VT_MASK 0x00001000U /* number of TVs */
+#define NGBE_CFG_PORT_CTL_NUM_VT_NONE 0x00000000U
+#define NGBE_CFG_PORT_CTL_NUM_VT_8 0x00001000U
+/* Status Bit */
+#define NGBE_CFG_PORT_ST_LINK_1000M 0x00000002U
+#define NGBE_CFG_PORT_ST_LINK_100M 0x00000004U
+#define NGBE_CFG_PORT_ST_LINK_10M 0x00000008U
+#define NGBE_CFG_PORT_ST_LAN_ID(_r) ((0x00000300U & (_r)) >> 8)
+#define NGBE_LINK_UP_TIME 90
+
+/* LED CTL Bit */
+
+#define NGBE_CFG_LED_CTL_LINK_10M_SEL 0x00000008U
+#define NGBE_CFG_LED_CTL_LINK_100M_SEL 0x00000004U
+#define NGBE_CFG_LED_CTL_LINK_1G_SEL 0x00000002U
+#define NGBE_CFG_LED_CTL_LINK_OD_SHIFT 16
+/* LED modes */
+#define NGBE_LED_LINK_10M NGBE_CFG_LED_CTL_LINK_10M_SEL
+#define NGBE_LED_LINK_1G NGBE_CFG_LED_CTL_LINK_1G_SEL
+#define NGBE_LED_LINK_100M NGBE_CFG_LED_CTL_LINK_100M_SEL
+
+/* GPIO Registers */
+#define NGBE_GPIO_DR 0x14800
+#define NGBE_GPIO_DDR 0x14804
+#define NGBE_GPIO_CTL 0x14808
+#define NGBE_GPIO_INTEN 0x14830
+#define NGBE_GPIO_INTMASK 0x14834
+#define NGBE_GPIO_INTTYPE_LEVEL 0x14838
+#define NGBE_GPIO_POLARITY 0x1483C
+#define NGBE_GPIO_INTSTATUS 0x14840
+#define NGBE_GPIO_EOI 0x1484C
+/*GPIO bit */
+#define NGBE_GPIO_DR_0 0x00000001U /* SDP0 Data Value */
+#define NGBE_GPIO_DR_1 0x00000002U /* SDP1 Data Value */
+#define NGBE_GPIO_DDR_0 0x00000001U /* SDP0 IO direction */
+#define NGBE_GPIO_DDR_1 0x00000002U /* SDP1 IO direction */
+#define NGBE_GPIO_CTL_SW_MODE 0x00000000U /* SDP software mode */
+#define NGBE_GPIO_INTEN_1 0x00000002U /* SDP1 interrupt enable */
+#define NGBE_GPIO_INTEN_2 0x00000004U /* SDP2 interrupt enable */
+#define NGBE_GPIO_INTEN_3 0x00000008U /* SDP3 interrupt enable */
+#define NGBE_GPIO_INTEN_5 0x00000020U /* SDP5 interrupt enable */
+#define NGBE_GPIO_INTEN_6 0x00000040U /* SDP6 interrupt enable */
+#define NGBE_GPIO_INTTYPE_LEVEL_2 0x00000004U /* SDP2 interrupt type level */
+#define NGBE_GPIO_INTTYPE_LEVEL_3 0x00000008U /* SDP3 interrupt type level */
+#define NGBE_GPIO_INTTYPE_LEVEL_5 0x00000020U /* SDP5 interrupt type level */
+#define NGBE_GPIO_INTTYPE_LEVEL_6 0x00000040U /* SDP6 interrupt type level */
+#define NGBE_GPIO_INTSTATUS_1 0x00000002U /* SDP1 interrupt status */
+#define NGBE_GPIO_INTSTATUS_2 0x00000004U /* SDP2 interrupt status */
+#define NGBE_GPIO_INTSTATUS_3 0x00000008U /* SDP3 interrupt status */
+#define NGBE_GPIO_INTSTATUS_5 0x00000020U /* SDP5 interrupt status */
+#define NGBE_GPIO_INTSTATUS_6 0x00000040U /* SDP6 interrupt status */
+#define NGBE_GPIO_EOI_2 0x00000004U /* SDP2 interrupt clear */
+#define NGBE_GPIO_EOI_3 0x00000008U /* SDP3 interrupt clear */
+#define NGBE_GPIO_EOI_5 0x00000020U /* SDP5 interrupt clear */
+#define NGBE_GPIO_EOI_6 0x00000040U /* SDP6 interrupt clear */
+
+/* TPH registers */
+#define NGBE_CFG_TPH_TDESC 0x14F00 /* TPH conf for Tx desc write back */
+#define NGBE_CFG_TPH_RDESC 0x14F04 /* TPH conf for Rx desc write back */
+#define NGBE_CFG_TPH_RHDR 0x14F08 /* TPH conf for writing Rx pkt header */
+#define NGBE_CFG_TPH_RPL 0x14F0C /* TPH conf for payload write access */
+/* TPH bit */
+#define NGBE_CFG_TPH_TDESC_EN 0x80000000U
+#define NGBE_CFG_TPH_TDESC_PH_SHIFT 29
+#define NGBE_CFG_TPH_TDESC_ST_SHIFT 16
+#define NGBE_CFG_TPH_RDESC_EN 0x80000000U
+#define NGBE_CFG_TPH_RDESC_PH_SHIFT 29
+#define NGBE_CFG_TPH_RDESC_ST_SHIFT 16
+#define NGBE_CFG_TPH_RHDR_EN 0x00008000U
+#define NGBE_CFG_TPH_RHDR_PH_SHIFT 13
+#define NGBE_CFG_TPH_RHDR_ST_SHIFT 0
+#define NGBE_CFG_TPH_RPL_EN 0x80000000U
+#define NGBE_CFG_TPH_RPL_PH_SHIFT 29
+#define NGBE_CFG_TPH_RPL_ST_SHIFT 16
+
+/*********************** Transmit DMA registers **************************/
+/* transmit global control */
+#define NGBE_TDM_CTL 0x18000
+#define NGBE_TDM_POOL_TE 0x18004
+#define NGBE_TDM_PB_THRE 0x18020
+
+#define NGBE_TDM_LLQ 0x18040
+#define NGBE_TDM_ETYPE_LB_L 0x18050
+
+#define NGBE_TDM_ETYPE_AS_L 0x18058
+#define NGBE_TDM_MAC_AS_L 0x18060
+
+#define NGBE_TDM_VLAN_AS_L 0x18070
+
+#define NGBE_TDM_TCP_FLG_L 0x18078
+#define NGBE_TDM_TCP_FLG_H 0x1807C
+#define NGBE_TDM_VLAN_INS(_i) (0x18100 + ((_i) * 4)) /* 8 of these 0 - 7 */
+/* TDM CTL BIT */
+#define NGBE_TDM_CTL_TE 0x1 /* Transmit Enable */
+#define NGBE_TDM_CTL_PADDING 0x2 /* Padding byte number for ipsec ESP */
+#define NGBE_TDM_CTL_VT_SHIFT 16 /* VLAN EtherType */
+/* Per VF Port VLAN insertion rules */
+#define NGBE_TDM_VLAN_INS_VLANA_DEFAULT 0x40000000U /*Always use default VLAN*/
+#define NGBE_TDM_VLAN_INS_VLANA_NEVER 0x80000000U /* Never insert VLAN tag */
+
+#define NGBE_TDM_RP_CTL_RST ((0x1) << 0)
+#define NGBE_TDM_RP_CTL_RPEN ((0x1) << 2)
+#define NGBE_TDM_RP_CTL_RLEN ((0x1) << 3)
+#define NGBE_TDM_RP_RATE_MIN(v) ((0x3FFF & (v)))
+#define NGBE_TDM_RP_RATE_MAX(v) ((0x3FFF & (v)) << 16)
+
+/* qos */
+#define NGBE_TDM_PBWARB_CTL 0x18200
+#define NGBE_TDM_VM_CREDIT_VAL(v) (0x3FF & (v))
+
+/* etag */
+#define NGBE_TDM_ETAG_INS(_i) (0x18700 + ((_i) * 4)) /* 8 of these 0 - 7 */
+/* statistic */
+#define NGBE_TDM_DRP_CNT 0x18300
+#define NGBE_TDM_SEC_DRP 0x18304
+#define NGBE_TDM_PKT_CNT 0x18308
+#define NGBE_TDM_BYTE_CNT_L 0x1830C
+#define NGBE_TDM_BYTE_CNT_H 0x18310
+#define NGBE_TDM_OS2BMC_CNT 0x18314
+
+/**************************** Receive DMA registers **************************/
+/* receive control */
+#define NGBE_RDM_ARB_CTL 0x12000
+#define NGBE_RDM_POOL_RE 0x12004
+
+#define NGBE_RDM_PF_QDE 0x12080
+#define NGBE_RDM_PF_HIDE 0x12090
+/* VFRE bitmask */
+#define NGBE_RDM_POOL_RE_ENABLE_ALL 0xFFFFFFFFU
+
+/* statistic */
+#define NGBE_RDM_DRP_PKT 0x12500
+#define NGBE_RDM_PKT_CNT 0x12504
+#define NGBE_RDM_BYTE_CNT_L 0x12508
+#define NGBE_RDM_BYTE_CNT_H 0x1250C
+#define NGBE_RDM_BMC2OS_CNT 0x12510
+
+/***************************** RDB registers *********************************/
+/* Flow Control Registers */
+#define NGBE_RDB_RFCV 0x19200
+#define NGBE_RDB_RFCL 0x19220
+#define NGBE_RDB_RFCH 0x19260
+#define NGBE_RDB_RFCRT 0x192A0
+#define NGBE_RDB_RFCC 0x192A4
+/* receive packet buffer */
+#define NGBE_RDB_PB_WRAP 0x19004
+#define NGBE_RDB_PB_SZ 0x19020
+
+#define NGBE_RDB_PB_CTL 0x19000
+#define NGBE_RDB_PB_SZ_SHIFT 10
+#define NGBE_RDB_PB_SZ_MASK 0x000FFC00U
+/* lli interrupt */
+#define NGBE_RDB_LLI_THRE 0x19080
+#define NGBE_RDB_LLI_THRE_SZ(_v) ((0xFFF & (_v)))
+#define NGBE_RDB_LLI_THRE_UP(_v) ((0x7 & (_v)) << 16)
+#define NGBE_RDB_LLI_THRE_UP_SHIFT 16
+
+/* ring assignment */
+#define NGBE_RDB_PL_CFG(_i) (0x19300 + ((_i) * 4)) /* [0,7] */
+#define NGBE_RDB_RSSTBL(_i) (0x19400 + ((_i) * 4)) /* [0,31] */
+#define NGBE_RDB_RSSRK(_i) (0x19480 + ((_i) * 4)) /* [0,9] */
+#define NGBE_RDB_RA_CTL 0x194F4
+#define NGBE_RDB_5T_SDP(_i) (0x19A00 + ((_i) * 4)) /*Src Dst Addr Q Filter*/
+#define NGBE_RDB_5T_CTL0(_i) (0x19C00 + ((_i) * 4)) /* Five Tuple Q Filter */
+#define NGBE_RDB_ETYPE_CLS(_i) (0x19100 + ((_i) * 4)) /* EType Q Select */
+#define NGBE_RDB_SYN_CLS 0x19130
+#define NGBE_RDB_5T_CTL1(_i) (0x19E00 + ((_i) * 4)) /*8 of these (0-7)*/
+/* VM RSS */
+#define NGBE_RDB_VMRSSRK(_i, _p) (0x1A000 + ((_i) * 4) + ((_p) * 0x40))
+#define NGBE_RDB_VMRSSTBL(_i, _p) (0x1B000 + ((_i) * 4) + ((_p) * 0x40))
+/* statistic */
+#define NGBE_RDB_MPCNT 0x19040
+#define NGBE_RDB_PKT_CNT 0x19060
+#define NGBE_RDB_REPLI_CNT 0x19064
+#define NGBE_RDB_DRP_CNT 0x19068
+#define NGBE_RDB_LXONTXC 0x1921C
+#define NGBE_RDB_LXOFFTXC 0x19218
+#define NGBE_RDB_PFCMACDAL 0x19210
+#define NGBE_RDB_PFCMACDAH 0x19214
+#define NGBE_RDB_TXSWERR 0x1906C
+#define NGBE_RDB_TXSWERR_TB_FREE 0x3FF
+/* rdb_pl_cfg reg mask */
+#define NGBE_RDB_PL_CFG_L4HDR 0x2
+#define NGBE_RDB_PL_CFG_L3HDR 0x4
+#define NGBE_RDB_PL_CFG_L2HDR 0x8
+#define NGBE_RDB_PL_CFG_TUN_OUTER_L2HDR 0x20
+#define NGBE_RDB_PL_CFG_TUN_TUNHDR 0x10
+/* RQTC Bit Masks and Shifts */
+#define NGBE_RDB_RSS_TC_SHIFT_TC(_i) ((_i) * 4)
+#define NGBE_RDB_RSS_TC_TC0_MASK (0x7 << 0)
+#define NGBE_RDB_RSS_TC_TC1_MASK (0x7 << 4)
+#define NGBE_RDB_RSS_TC_TC2_MASK (0x7 << 8)
+#define NGBE_RDB_RSS_TC_TC3_MASK (0x7 << 12)
+#define NGBE_RDB_RSS_TC_TC4_MASK (0x7 << 16)
+#define NGBE_RDB_RSS_TC_TC5_MASK (0x7 << 20)
+#define NGBE_RDB_RSS_TC_TC6_MASK (0x7 << 24)
+#define NGBE_RDB_RSS_TC_TC7_MASK (0x7 << 28)
+/* Packet Buffer Initialization */
+#define NGBE_MAX_PACKET_BUFFERS 8
+#define NGBE_RDB_PB_SZ_48KB 0x00000030U /* 48KB Packet Buffer */
+#define NGBE_RDB_PB_SZ_64KB 0x00000040U /* 64KB Packet Buffer */
+#define NGBE_RDB_PB_SZ_80KB 0x00000050U /* 80KB Packet Buffer */
+#define NGBE_RDB_PB_SZ_128KB 0x00000080U /* 128KB Packet Buffer */
+#define NGBE_RDB_PB_SZ_MAX 0x00000200U /* 512KB Packet Buffer */
+
+/* Packet buffer allocation strategies */
+enum {
+ PBA_STRATEGY_EQUAL = 0, /* Distribute PB space equally */
+#define PBA_STRATEGY_EQUAL PBA_STRATEGY_EQUAL
+ PBA_STRATEGY_WEIGHTED = 1, /* Weight front half of TCs */
+#define PBA_STRATEGY_WEIGHTED PBA_STRATEGY_WEIGHTED
+};
+
+/* FCRTL Bit Masks */
+#define NGBE_RDB_RFCL_XONE 0x80000000U /* XON enable */
+#define NGBE_RDB_RFCH_XOFFE 0x80000000U /* Packet buffer fc enable */
+/* FCCFG Bit Masks */
+#define NGBE_RDB_RFCC_RFCE_802_3X 0x00000008U /* Tx link FC enable */
+
+/* Immediate Interrupt Rx (A.K.A. Low Latency Interrupt) */
+#define NGBE_RDB_5T_CTL1_SIZE_BP 0x00001000U /* Packet size bypass */
+#define NGBE_RDB_5T_CTL1_LLI 0x00100000U /* Enables low latency Int */
+#define NGBE_RDB_LLI_THRE_PRIORITY_MASK 0x00070000U /* VLAN priority mask */
+#define NGBE_RDB_LLI_THRE_PRIORITY_EN 0x00080000U /* VLAN priority enable */
+
+#define NGBE_MAX_RDB_5T_CTL0_FILTERS 128
+#define NGBE_RDB_5T_CTL0_PROTOCOL_MASK 0x00000003U
+#define NGBE_RDB_5T_CTL0_PROTOCOL_TCP 0x00000000U
+#define NGBE_RDB_5T_CTL0_PROTOCOL_UDP 0x00000001U
+#define NGBE_RDB_5T_CTL0_PROTOCOL_SCTP 0x00000002U
+#define NGBE_RDB_5T_CTL0_PRIORITY_MASK 0x00000007U
+#define NGBE_RDB_5T_CTL0_PRIORITY_SHIFT 2
+#define NGBE_RDB_5T_CTL0_POOL_MASK 0x0000003FU
+#define NGBE_RDB_5T_CTL0_POOL_SHIFT 8
+#define NGBE_RDB_5T_CTL0_5TUPLE_MASK_MASK 0x00000007U
+#define NGBE_RDB_5T_CTL0_5TUPLE_MASK_SHIFT 27
+#define NGBE_RDB_5T_CTL0_SOURCE_PORT_MASK 0x1B
+#define NGBE_RDB_5T_CTL0_DEST_PORT_MASK 0x05
+#define NGBE_RDB_5T_CTL0_PROTOCOL_COMP_MASK 0x0F
+#define NGBE_RDB_5T_CTL0_POOL_MASK_EN 0x40000000U
+#define NGBE_RDB_5T_CTL0_QUEUE_ENABLE 0x80000000U
+
+#define NGBE_RDB_ETYPE_CLS_RX_QUEUE 0x007F0000U /* bits 22:16 */
+#define NGBE_RDB_ETYPE_CLS_RX_QUEUE_SHIFT 16
+#define NGBE_RDB_ETYPE_CLS_LLI 0x20000000U /* bit 29 */
+#define NGBE_RDB_ETYPE_CLS_QUEUE_EN 0x80000000U /* bit 31 */
+
+/* Receive Config masks */
+#define NGBE_RDB_PB_CTL_PBEN (0x80000000) /* Enable Receiver */
+#define NGBE_RDB_PB_CTL_DISABLED 0x1
+
+#define NGBE_RDB_RA_CTL_RSS_EN 0x00000004U /* RSS Enable */
+#define NGBE_RDB_RA_CTL_RSS_MASK 0xFFFF0000U
+#define NGBE_RDB_RA_CTL_RSS_IPV4_TCP 0x00010000U
+#define NGBE_RDB_RA_CTL_RSS_IPV4 0x00020000U
+#define NGBE_RDB_RA_CTL_RSS_IPV6 0x00100000U
+#define NGBE_RDB_RA_CTL_RSS_IPV6_TCP 0x00200000U
+#define NGBE_RDB_RA_CTL_RSS_IPV4_UDP 0x00400000U
+#define NGBE_RDB_RA_CTL_RSS_IPV6_UDP 0x00800000U
+
+/******************************* PSR Registers *******************************/
+/* psr control */
+#define NGBE_PSR_CTL 0x15000
+#define NGBE_PSR_VLAN_CTL 0x15088
+#define NGBE_PSR_VM_CTL 0x151B0
+#define NGBE_PSR_PKT_CNT 0x151B8
+#define NGBE_PSR_MNG_PKT_CNT 0x151BC
+#define NGBE_PSR_DBG_DOP_CNT 0x151C0
+#define NGBE_PSR_MNG_DOP_CNT 0x151C4
+#define NGBE_PSR_VM_FLP_L 0x151C8
+
+/* Header split receive */
+#define NGBE_PSR_CTL_SW_EN 0x00040000U
+#define NGBE_PSR_CTL_PCSD 0x00002000U
+#define NGBE_PSR_CTL_IPPCSE 0x00001000U
+#define NGBE_PSR_CTL_BAM 0x00000400U
+#define NGBE_PSR_CTL_UPE 0x00000200U
+#define NGBE_PSR_CTL_MPE 0x00000100U
+#define NGBE_PSR_CTL_MFE 0x00000080U
+#define NGBE_PSR_CTL_MO 0x00000060U
+#define NGBE_PSR_CTL_TPE 0x00000010U
+#define NGBE_PSR_CTL_MO_SHIFT 5
+/* VT_CTL bitmasks */
+#define NGBE_PSR_VM_CTL_DIS_DEFPL 0x20000000U /* disable default pool */
+#define NGBE_PSR_VM_CTL_REPLEN 0x40000000U /* replication enabled */
+#define NGBE_PSR_VM_CTL_POOL_SHIFT 7
+#define NGBE_PSR_VM_CTL_POOL_MASK (0x7 << NGBE_PSR_VM_CTL_POOL_SHIFT)
+/* VLAN Control Bit Masks */
+#define NGBE_PSR_VLAN_CTL_VET 0x0000FFFFU /* bits 0-15 */
+#define NGBE_PSR_VLAN_CTL_CFI 0x10000000U /* bit 28 */
+#define NGBE_PSR_VLAN_CTL_CFIEN 0x20000000U /* bit 29 */
+#define NGBE_PSR_VLAN_CTL_VFE 0x40000000U /* bit 30 */
+
+/* vm L2 control */
+#define NGBE_PSR_VM_L2CTL(_i) (0x15600 + ((_i) * 4))
+/* VMOLR bitmasks */
+#define NGBE_PSR_VM_L2CTL_LBDIS 0x00000002U /* disable loopback */
+#define NGBE_PSR_VM_L2CTL_LLB 0x00000004U /* local pool loopback */
+#define NGBE_PSR_VM_L2CTL_UPE 0x00000010U /* unicast promiscuous */
+#define NGBE_PSR_VM_L2CTL_TPE 0x00000020U /* ETAG promiscuous */
+#define NGBE_PSR_VM_L2CTL_VACC 0x00000040U /* accept nomatched vlan */
+#define NGBE_PSR_VM_L2CTL_VPE 0x00000080U /* vlan promiscuous mode */
+#define NGBE_PSR_VM_L2CTL_AUPE 0x00000100U /* accept untagged packets */
+#define NGBE_PSR_VM_L2CTL_ROMPE 0x00000200U /*accept packets in MTA tbl*/
+#define NGBE_PSR_VM_L2CTL_ROPE 0x00000400U /* accept packets in UC tbl*/
+#define NGBE_PSR_VM_L2CTL_BAM 0x00000800U /* accept broadcast packets*/
+#define NGBE_PSR_VM_L2CTL_MPE 0x00001000U /* multicast promiscuous */
+
+/* etype switcher 1st stage */
+#define NGBE_PSR_ETYPE_SWC(_i) (0x15128 + ((_i) * 4)) /* EType Queue Filter */
+/* ETYPE Queue Filter/Select Bit Masks */
+#define NGBE_MAX_PSR_ETYPE_SWC_FILTERS 8
+#define NGBE_PSR_ETYPE_SWC_FCOE 0x08000000U /* bit 27 */
+#define NGBE_PSR_ETYPE_SWC_TX_ANTISPOOF 0x20000000U /* bit 29 */
+#define NGBE_PSR_ETYPE_SWC_1588 0x40000000U /* bit 30 */
+#define NGBE_PSR_ETYPE_SWC_FILTER_EN 0x80000000U /* bit 31 */
+#define NGBE_PSR_ETYPE_SWC_POOL_ENABLE (1 << 26) /* bit 26 */
+#define NGBE_PSR_ETYPE_SWC_POOL_SHIFT 20
+/*
+ * ETQF filter list: one static filter per filter consumer. This is
+ * to avoid filter collisions later. Add new filters here.
+ *
+ * Current filters:
+ * EAPOL 802.1x (0x888e): Filter 0
+ * FCoE (0x8906): Filter 2
+ * 1588 (0x88f7): Filter 3
+ * FIP (0x8914): Filter 4
+ * LLDP (0x88CC): Filter 5
+ * LACP (0x8809): Filter 6
+ * FC (0x8808): Filter 7
+ */
+#define NGBE_PSR_ETYPE_SWC_FILTER_EAPOL 0
+#define NGBE_PSR_ETYPE_SWC_FILTER_FCOE 2
+#define NGBE_PSR_ETYPE_SWC_FILTER_1588 3
+#define NGBE_PSR_ETYPE_SWC_FILTER_FIP 4
+#define NGBE_PSR_ETYPE_SWC_FILTER_LLDP 5
+#define NGBE_PSR_ETYPE_SWC_FILTER_LACP 6
+#define NGBE_PSR_ETYPE_SWC_FILTER_FC 7
+
+/* mcast/ucast overflow tbl */
+#define NGBE_PSR_MC_TBL(_i) (0x15200 + ((_i) * 4))
+#define NGBE_PSR_UC_TBL(_i) (0x15400 + ((_i) * 4))
+
+/* vlan tbl */
+#define NGBE_PSR_VLAN_TBL(_i) (0x16000 + ((_i) * 4))
+
+/* mac switcher */
+#define NGBE_PSR_MAC_SWC_AD_L 0x16200
+#define NGBE_PSR_MAC_SWC_AD_H 0x16204
+#define NGBE_PSR_MAC_SWC_VM 0x16208
+#define NGBE_PSR_MAC_SWC_IDX 0x16210
+/* RAH */
+#define NGBE_PSR_MAC_SWC_AD_H_AD(v) (((v) & 0xFFFF))
+#define NGBE_PSR_MAC_SWC_AD_H_ADTYPE(v) (((v) & 0x1) << 30)
+#define NGBE_PSR_MAC_SWC_AD_H_AV 0x80000000U
+#define NGBE_CLEAR_VMDQ_ALL 0xFFFFFFFFU
+
+/* vlan switch */
+#define NGBE_PSR_VLAN_SWC 0x16220
+#define NGBE_PSR_VLAN_SWC_VM_L 0x16224
+#define NGBE_PSR_VLAN_SWC_IDX 0x16230 /* 32 vlan entries */
+/* VLAN pool filtering masks */
+#define NGBE_PSR_VLAN_SWC_VIEN 0x80000000U /* filter is valid */
+#define NGBE_PSR_VLAN_SWC_ENTRIES 32
+#define NGBE_PSR_VLAN_SWC_VLANID_MASK 0x00000FFFU
+#define NGBE_ETHERNET_IEEE_VLAN_TYPE 0x8100 /* 802.1q protocol */
+
+/* Management */
+#define NGBE_PSR_MNG_FIT_CTL 0x15820
+/* Management Bit Fields and Masks */
+#define NGBE_PSR_MNG_FIT_CTL_MPROXYE 0x40000000U /* Management Proxy Enable */
+#define NGBE_PSR_MNG_FIT_CTL_RCV_TCO_EN 0x00020000U /* Rcv TCO packet enable */
+#define NGBE_PSR_MNG_FIT_CTL_EN_BMC2OS 0x10000000U /* Ena BMC2OS and OS2BMC
+ * traffic */
+#define NGBE_PSR_MNG_FIT_CTL_EN_BMC2OS_SHIFT 28
+
+#define NGBE_PSR_MNG_FLEX_SEL 0x1582C
+#define NGBE_PSR_MNG_FLEX_DW_L(_i) (0x15A00 + ((_i) * 16)) /* [0,15] */
+#define NGBE_PSR_MNG_FLEX_DW_H(_i) (0x15A04 + ((_i) * 16))
+#define NGBE_PSR_MNG_FLEX_MSK(_i) (0x15A08 + ((_i) * 16))
+
+/* mirror */
+#define NGBE_PSR_MR_CTL(_i) (0x15B00 + ((_i) * 4)) /* [0,3] */
+#define NGBE_PSR_MR_VLAN_L(_i) (0x15B10 + ((_i) * 8))
+#define NGBE_PSR_MR_VM_L(_i) (0x15B30 + ((_i) * 8))
+
+/* 1588 */
+#define NGBE_PSR_1588_CTL 0x15188 /* Rx Time Sync Control register - RW */
+#define NGBE_PSR_1588_STMPL 0x151E8 /* Rx timestamp Low - RO */
+#define NGBE_PSR_1588_STMPH 0x151A4 /* Rx timestamp High - RO */
+#define NGBE_PSR_1588_ATTRL 0x151A0 /* Rx timestamp attribute low - RO */
+#define NGBE_PSR_1588_ATTRH 0x151A8 /* Rx timestamp attribute high - RO */
+#define NGBE_PSR_1588_MSGTYPE 0x15120 /* RX message type register low - RW */
+/* 1588 CTL Bit */
+#define NGBE_PSR_1588_CTL_VALID 0x00000001U /* Rx timestamp valid */
+#define NGBE_PSR_1588_CTL_TYPE_MASK 0x0000000EU /* Rx type mask */
+#define NGBE_PSR_1588_CTL_TYPE_L2_V2 0x00
+#define NGBE_PSR_1588_CTL_TYPE_L4_V1 0x02
+#define NGBE_PSR_1588_CTL_TYPE_L2_L4_V2 0x04
+#define NGBE_PSR_1588_CTL_TYPE_EVENT_V2 0x0A
+#define NGBE_PSR_1588_CTL_ENABLED 0x00000010U /* Rx Timestamp enabled*/
+/* 1588 msg type bit */
+#define NGBE_PSR_1588_MSGTYPE_V1_CTRLT_MASK 0x000000FFU
+#define NGBE_PSR_1588_MSGTYPE_V1_SYNC_MSG 0x00
+#define NGBE_PSR_1588_MSGTYPE_V1_DELAY_REQ_MSG 0x01
+#define NGBE_PSR_1588_MSGTYPE_V1_FOLLOWUP_MSG 0x02
+#define NGBE_PSR_1588_MSGTYPE_V1_DELAY_RESP_MSG 0x03
+#define NGBE_PSR_1588_MSGTYPE_V1_MGMT_MSG 0x04
+#define NGBE_PSR_1588_MSGTYPE_V2_MSGID_MASK 0x0000FF00U
+#define NGBE_PSR_1588_MSGTYPE_V2_SYNC_MSG 0x0000
+#define NGBE_PSR_1588_MSGTYPE_V2_DELAY_REQ_MSG 0x0100
+#define NGBE_PSR_1588_MSGTYPE_V2_PDELAY_REQ_MSG 0x0200
+#define NGBE_PSR_1588_MSGTYPE_V2_PDELAY_RESP_MSG 0x0300
+#define NGBE_PSR_1588_MSGTYPE_V2_FOLLOWUP_MSG 0x0800
+#define NGBE_PSR_1588_MSGTYPE_V2_DELAY_RESP_MSG 0x0900
+#define NGBE_PSR_1588_MSGTYPE_V2_PDELAY_FOLLOWUP_MSG 0x0A00
+#define NGBE_PSR_1588_MSGTYPE_V2_ANNOUNCE_MSG 0x0B00
+#define NGBE_PSR_1588_MSGTYPE_V2_SIGNALLING_MSG 0x0C00
+#define NGBE_PSR_1588_MSGTYPE_V2_MGMT_MSG 0x0D00
+
+/* Wake up registers */
+#define NGBE_PSR_WKUP_CTL 0x15B80
+#define NGBE_PSR_WKUP_IPV 0x15B84
+#define NGBE_PSR_LAN_FLEX_SEL 0x15B8C
+#define NGBE_PSR_WKUP_IP4TBL(_i) (0x15BC0 + ((_i) * 4)) /* [0,3] */
+#define NGBE_PSR_WKUP_IP6TBL(_i) (0x15BE0 + ((_i) * 4))
+#define NGBE_PSR_LAN_FLEX_DW_L(_i) (0x15C00 + ((_i) * 16)) /* [0,15] */
+#define NGBE_PSR_LAN_FLEX_DW_H(_i) (0x15C04 + ((_i) * 16))
+#define NGBE_PSR_LAN_FLEX_MSK(_i) (0x15C08 + ((_i) * 16))
+#define NGBE_PSR_LAN_FLEX_CTL 0x15CFC
+/* Wake Up Filter Control Bit */
+#define NGBE_PSR_WKUP_CTL_LNKC 0x00000001U /* Link Status Change Wakeup Enable*/
+#define NGBE_PSR_WKUP_CTL_MAG 0x00000002U /* Magic Packet Wakeup Enable */
+#define NGBE_PSR_WKUP_CTL_EX 0x00000004U /* Directed Exact Wakeup Enable */
+#define NGBE_PSR_WKUP_CTL_MC 0x00000008U /* Directed Multicast Wakeup Enable*/
+#define NGBE_PSR_WKUP_CTL_BC 0x00000010U /* Broadcast Wakeup Enable */
+#define NGBE_PSR_WKUP_CTL_ARP 0x00000020U /* ARP Request Packet Wakeup Enable*/
+#define NGBE_PSR_WKUP_CTL_IPV4 0x00000040U /* Directed IPv4 Pkt Wakeup Enable */
+#define NGBE_PSR_WKUP_CTL_IPV6 0x00000080U /* Directed IPv6 Pkt Wakeup Enable */
+#define NGBE_PSR_WKUP_CTL_IGNORE_TCO 0x00008000U /* Ignore WakeOn TCO pkts */
+#define NGBE_PSR_WKUP_CTL_FLX0 0x00010000U /* Flexible Filter 0 Ena */
+#define NGBE_PSR_WKUP_CTL_FLX1 0x00020000U /* Flexible Filter 1 Ena */
+#define NGBE_PSR_WKUP_CTL_FLX2 0x00040000U /* Flexible Filter 2 Ena */
+#define NGBE_PSR_WKUP_CTL_FLX3 0x00080000U /* Flexible Filter 3 Ena */
+#define NGBE_PSR_WKUP_CTL_FLX4 0x00100000U /* Flexible Filter 4 Ena */
+#define NGBE_PSR_WKUP_CTL_FLX5 0x00200000U /* Flexible Filter 5 Ena */
+#define NGBE_PSR_WKUP_CTL_FLX_FILTERS 0x000F0000U /* Mask for 4 flex filters */
+#define NGBE_PSR_WKUP_CTL_FLX_FILTERS_6 0x003F0000U /* Mask for 6 flex filters*/
+#define NGBE_PSR_WKUP_CTL_FLX_FILTERS_8 0x00FF0000U /* Mask for 8 flex filters*/
+#define NGBE_PSR_WKUP_CTL_FW_RST_WK 0x80000000U /* Ena wake on FW reset
+ * assertion */
+/* Mask for Ext. flex filters */
+#define NGBE_PSR_WKUP_CTL_EXT_FLX_FILTERS 0x00300000U
+#define NGBE_PSR_WKUP_CTL_ALL_FILTERS 0x000F00FFU /* Mask all 4 flex filters*/
+#define NGBE_PSR_WKUP_CTL_ALL_FILTERS_6 0x003F00FFU /* Mask all 6 flex filters*/
+#define NGBE_PSR_WKUP_CTL_ALL_FILTERS_8 0x00FF00FFU /* Mask all 8 flex filters*/
+#define NGBE_PSR_WKUP_CTL_FLX_OFFSET 16 /* Offset to the Flex Filters bits*/
+
+#define NGBE_PSR_MAX_SZ 0x15020
+
+/****************************** TDB ******************************************/
+#define NGBE_TDB_RFCS 0x1CE00
+#define NGBE_TDB_PB_SZ 0x1CC00
+
+#define NGBE_TDB_PRB_CTL 0x17010
+#define NGBE_TDB_PBRARB_CTL 0x1CD00
+
+#define NGBE_TDB_PB_SZ_MAX 0x00005000U /* 20KB Packet Buffer */
+#define NGBE_TXPKT_SIZE_MAX 0xA /* Max Tx Packet size */
+#define NGBE_MAX_PB 8
+/* statistic */
+#define NGBE_TDB_OUT_PKT_CNT 0x1CF00
+#define NGBE_TDB_MNG_PKT_CNT 0x1CF04
+#define NGBE_TDB_LB_PKT_CNT 0x1CF08
+#define NGBE_TDB_MNG_LARGE_DOP_CNT 0x1CF0C
+
+/****************************** TSEC *****************************************/
+/* Security Control Registers */
+#define NGBE_TSEC_CTL 0x1D000
+#define NGBE_TSEC_ST 0x1D004
+#define NGBE_TSEC_BUF_AF 0x1D008
+#define NGBE_TSEC_BUF_AE 0x1D00C
+#define NGBE_TSEC_MIN_IFG 0x1D020
+
+/* 1588 */
+#define NGBE_TSEC_1588_CTL 0x11F00 /* Tx Time Sync Control reg */
+#define NGBE_TSEC_1588_STMPL 0x11F04 /* Tx timestamp value Low */
+#define NGBE_TSEC_1588_STMPH 0x11F08 /* Tx timestamp value High */
+#define NGBE_TSEC_1588_SYSTIML 0x11F0C /* System time register Low */
+#define NGBE_TSEC_1588_SYSTIMH 0x11F10 /* System time register High */
+#define NGBE_TSEC_1588_INC 0x11F14 /* Increment attributes reg */
+#define NGBE_TSEC_1588_INC_IV(v) ((v) & 0x7FFFFFF)
+
+#define NGBE_TSEC_1588_ADJL 0x11F18 /* Time Adjustment Offset reg Low */
+#define NGBE_TSEC_1588_ADJH 0x11F1C /* Time Adjustment Offset reg High*/
+
+#define NGBE_TSEC_1588_INT_ST 0x11F20
+#define NGBE_TSEC_1588_INT_EN 0x11F24
+
+/* 1588 fields */
+#define NGBE_TSEC_1588_CTL_VALID 0x00000001U /* Tx timestamp valid */
+#define NGBE_TSEC_1588_CTL_ENABLED 0x00000010U /* Tx timestamping enabled */
+
+#define NGBE_TSEC_1588_AUX_CTL 0x11F28
+#define NGBE_TSEC_1588_TRGT_L(i) (0x11F2C + ((i) * 8)) /* [0,1] */
+#define NGBE_TSEC_1588_TRGT_H(i) (0x11F30 + ((i) * 8)) /* [0,1] */
+#define NGBE_TSEC_1588_FREQ_CLK_L(i) (0x11F3C + ((i) * 8)) /* [0,1] */
+#define NGBE_TSEC_1588_FREQ_CLK_H(i) (0x11F40 + ((i) * 8)) /* [0,1] */
+#define NGBE_TSEC_1588_AUX_STMP_L(i) (0x11F4C + ((i) * 8)) /* [0,1] */
+#define NGBE_TSEC_1588_AUX_STMP_H(i) (0x11F50 + ((i) * 8)) /* [0,1] */
+#define NGBE_TSEC_1588_SDP(n) (0x11F5C + ((n) * 4)) /* [0,3] */
+
+/********************************* RSEC **************************************/
+/* general rsec */
+#define NGBE_RSEC_CTL 0x17000
+#define NGBE_RSEC_ST 0x17004
+/* general rsec fields */
+#define NGBE_RSEC_CTL_SECRX_DIS 0x00000001U
+#define NGBE_RSEC_CTL_RX_DIS 0x00000002U
+#define NGBE_RSEC_CTL_CRC_STRIP 0x00000004U
+#define NGBE_RSEC_CTL_SAVE_MAC_ERR 0x00000040U
+#define NGBE_RSEC_ST_RSEC_RDY 0x00000001U
+#define NGBE_RSEC_ST_RSEC_OFLD_DIS 0x00000002U
+#define NGBE_RSEC_ST_ECC_RXERR 0x00000004U
+
+/* link sec */
+#define NGBE_RSEC_LSEC_CAP 0x17200
+#define NGBE_RSEC_LSEC_CTL 0x17204
+#define NGBE_RSEC_LSEC_SCI_L 0x17208
+#define NGBE_RSEC_LSEC_SCI_H 0x1720C
+#define NGBE_RSEC_LSEC_SA0 0x17210
+#define NGBE_RSEC_LSEC_SA1 0x17214
+#define NGBE_RSEC_LSEC_PKNUM0 0x17218
+#define NGBE_RSEC_LSEC_PKNUM1 0x1721C
+#define NGBE_RSEC_LSEC_KEY0(_n) 0x17220
+#define NGBE_RSEC_LSEC_KEY1(_n) 0x17230
+#define NGBE_RSEC_LSEC_UNTAG_PKT 0x17240
+#define NGBE_RSEC_LSEC_DEC_OCTET 0x17244
+#define NGBE_RSEC_LSEC_VLD_OCTET 0x17248
+#define NGBE_RSEC_LSEC_BAD_PKT 0x1724C
+#define NGBE_RSEC_LSEC_NOSCI_PKT 0x17250
+#define NGBE_RSEC_LSEC_UNSCI_PKT 0x17254
+#define NGBE_RSEC_LSEC_UNCHK_PKT 0x17258
+#define NGBE_RSEC_LSEC_DLY_PKT 0x1725C
+#define NGBE_RSEC_LSEC_LATE_PKT 0x17260
+#define NGBE_RSEC_LSEC_OK_PKT(_n) 0x17264
+#define NGBE_RSEC_LSEC_INV_PKT(_n) 0x17274
+#define NGBE_RSEC_LSEC_BADSA_PKT 0x1727C
+#define NGBE_RSEC_LSEC_INVSA_PKT 0x17280
+
+/* ipsec */
+#define NGBE_RSEC_IPS_IDX 0x17100
+#define NGBE_RSEC_IPS_IDX_WT 0x80000000U
+#define NGBE_RSEC_IPS_IDX_RD 0x40000000U
+#define NGBE_RSEC_IPS_IDX_TB_IDX 0x0U
+#define NGBE_RSEC_IPS_IDX_TB_IP 0x00000002U
+#define NGBE_RSEC_IPS_IDX_TB_SPI 0x00000004U
+#define NGBE_RSEC_IPS_IDX_TB_KEY 0x00000006U
+#define NGBE_RSEC_IPS_IDX_EN 0x00000001U
+#define NGBE_RSEC_IPS_IP(i) (0x17104 + ((i) * 4))
+#define NGBE_RSEC_IPS_SPI 0x17114
+#define NGBE_RSEC_IPS_IP_IDX 0x17118
+#define NGBE_RSEC_IPS_KEY(i) (0x1711C + ((i) * 4))
+#define NGBE_RSEC_IPS_SALT 0x1712C
+#define NGBE_RSEC_IPS_MODE 0x17130
+#define NGBE_RSEC_IPS_MODE_IPV6 0x00000010
+#define NGBE_RSEC_IPS_MODE_DEC 0x00000008
+#define NGBE_RSEC_IPS_MODE_ESP 0x00000004
+#define NGBE_RSEC_IPS_MODE_AH 0x00000002
+#define NGBE_RSEC_IPS_MODE_VALID 0x00000001
+
+/************************************** ETH PHY ******************************/
+#define NGBE_XPCS_IDA_ADDR 0x13000
+#define NGBE_XPCS_IDA_DATA 0x13004
+#define NGBE_ETHPHY_IDA_ADDR 0x13008
+#define NGBE_ETHPHY_IDA_DATA 0x1300C
+
+/************************************** MNG ********************************/
+#define NGBE_MNG_FW_SM 0x1E000
+#define NGBE_MNG_SW_SM 0x1E004
+#define NGBE_MNG_SWFW_SYNC 0x1E008
+#define NGBE_MNG_MBOX 0x1E100
+#define NGBE_MNG_MBOX_CTL 0x1E044
+
+#define NGBE_MNG_OS2BMC_CNT 0x1E094
+#define NGBE_MNG_BMC2OS_CNT 0x1E090
+
+/* Firmware Semaphore Register */
+#define NGBE_MNG_FW_SM_MODE_MASK 0xE
+#define NGBE_MNG_FW_SM_TS_ENABLED 0x1
+/* SW Semaphore Register bitmasks */
+#define NGBE_MNG_SW_SM_SM 0x00000001U /* software Semaphore */
+
+/* SW_FW_SYNC definitions */
+#define NGBE_MNG_SWFW_SYNC_SW_PHY 0x0001
+#define NGBE_MNG_SWFW_SYNC_SW_FLASH 0x0008
+#define NGBE_MNG_SWFW_SYNC_SW_MB 0x0004
+
+#define NGBE_MNG_MBOX_CTL_SWRDY 0x1
+#define NGBE_MNG_MBOX_CTL_SWACK 0x2
+#define NGBE_MNG_MBOX_CTL_FWRDY 0x4
+#define NGBE_MNG_MBOX_CTL_FWACK 0x8
+
+/************************************* ETH MAC *****************************/
+#define NGBE_MAC_TX_CFG 0x11000
+#define NGBE_MAC_RX_CFG 0x11004
+#define NGBE_MAC_PKT_FLT 0x11008
+#define NGBE_MAC_PKT_FLT_PR (0x1) /* promiscuous mode */
+#define NGBE_MAC_PKT_FLT_RA (0x80000000) /* receive all */
+#define NGBE_MAC_WDG_TIMEOUT 0x1100C
+#define NGBE_MAC_TX_FLOW_CTRL 0x11070
+#define NGBE_MAC_RX_FLOW_CTRL 0x11090
+#define NGBE_MAC_INT_ST 0x110B0
+#define NGBE_MAC_INT_EN 0x110B4
+#define NGBE_MAC_ADDRESS0_HIGH 0x11300
+#define NGBE_MAC_ADDRESS0_LOW 0x11304
+
+#define NGBE_MAC_TX_CFG_TE 0x00000001U
+#define NGBE_MAC_TX_CFG_SPEED_MASK 0x60000000U
+#define NGBE_MAC_TX_CFG_SPEED_1G 0x60000000U
+#define NGBE_MAC_RX_CFG_RE 0x00000001U
+#define NGBE_MAC_RX_CFG_JE 0x00000100U
+#define NGBE_MAC_RX_CFG_LM 0x00000400U
+#define NGBE_MAC_WDG_TIMEOUT_PWE 0x00000100U
+#define NGBE_MAC_WDG_TIMEOUT_WTO_MASK 0x0000000FU
+#define NGBE_MAC_WDG_TIMEOUT_WTO_DELTA 2
+
+#define NGBE_MAC_RX_FLOW_CTRL_RFE 0x00000001U /* receive fc enable */
+
+#define NGBE_MSCA 0x11200
+#define NGBE_MSCA_RA(v) ((0xFFFF & (v)))
+#define NGBE_MSCA_PA(v) ((0x1F & (v)) << 16)
+#define NGBE_MSCA_DA(v) ((0x1F & (v)) << 21)
+#define NGBE_MSCC 0x11204
+#define NGBE_MSCC_DATA(v) ((0xFFFF & (v)))
+#define NGBE_MSCC_CMD(v) ((0x3 & (v)) << 16)
+enum NGBE_MSCA_CMD_value {
+ NGBE_MSCA_CMD_RSV = 0,
+ NGBE_MSCA_CMD_WRITE,
+ NGBE_MSCA_CMD_POST_READ,
+ NGBE_MSCA_CMD_READ,
+};
+#define NGBE_MSCC_SADDR ((0x1U) << 18)
+#define NGBE_MSCC_CR(v) ((0x8U & (v)) << 19)
+#define NGBE_MSCC_BUSY ((0x1U) << 22)
+#define NGBE_MDIO_CLK(v) ((0x7 & (v)) << 19)
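+
+/*
+ * Example: a PHY register read through the MSCA/MSCC window above.
+ * Illustrative sketch only: rd32()/wr32() stand in for the driver's
+ * MMIO accessors and the MDIO clock divider value is an assumption.
+ *
+ *	wr32(hw, NGBE_MSCA, NGBE_MSCA_RA(reg_addr) |
+ *			    NGBE_MSCA_PA(phy_addr) |
+ *			    NGBE_MSCA_DA(device_type));
+ *	wr32(hw, NGBE_MSCC, NGBE_MSCC_CMD(NGBE_MSCA_CMD_READ) |
+ *			    NGBE_MDIO_CLK(6));
+ *	do {
+ *		mscc = rd32(hw, NGBE_MSCC);
+ *	} while (mscc & NGBE_MSCC_BUSY);
+ *	data = NGBE_MSCC_DATA(mscc);
+ */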
+
+/* EEE registers */
+
+/* statistic */
+#define NGBE_MAC_LXOFFRXC 0x11988
+#define NGBE_MAC_PXOFFRXC 0x119DC
+#define NGBE_RX_BC_FRAMES_GOOD_LOW 0x11918
+#define NGBE_RX_CRC_ERROR_FRAMES_LOW 0x11928
+#define NGBE_RX_LEN_ERROR_FRAMES_LOW 0x11978
+#define NGBE_RX_UNDERSIZE_FRAMES_GOOD 0x11938
+#define NGBE_RX_OVERSIZE_FRAMES_GOOD 0x1193C
+#define NGBE_RX_FRAME_CNT_GOOD_BAD_LOW 0x11900
+#define NGBE_TX_FRAME_CNT_GOOD_BAD_LOW 0x1181C
+#define NGBE_TX_MC_FRAMES_GOOD_LOW 0x1182C
+#define NGBE_TX_BC_FRAMES_GOOD_LOW 0x11824
+#define NGBE_MMC_CONTROL 0x11800
+#define NGBE_MMC_CONTROL_RSTONRD 0x4 /* reset on read */
+#define NGBE_MMC_CONTROL_UP 0x700
+
+/********************************* BAR registers ***************************/
+/* Interrupt Registers */
+#define NGBE_BME_CTL 0x12020
+#define NGBE_PX_MISC_IC 0x100
+#define NGBE_PX_MISC_ICS 0x104
+#define NGBE_PX_MISC_IEN 0x108
+#define NGBE_PX_MISC_IVAR 0x4FC
+#define NGBE_PX_GPIE 0x118
+#define NGBE_PX_ISB_ADDR_L 0x160
+#define NGBE_PX_ISB_ADDR_H 0x164
+#define NGBE_PX_TCP_TIMER 0x170
+#define NGBE_PX_ITRSEL 0x180
+#define NGBE_PX_IC 0x120
+#define NGBE_PX_ICS 0x130
+#define NGBE_PX_IMS 0x140
+#define NGBE_PX_IMC 0x150
+#define NGBE_PX_IVAR(_i) (0x500 + (_i) * 4) /* [0,3] */
+#define NGBE_PX_ITR(_i) (0x200 + (_i) * 4) /* [0,8] */
+#define NGBE_PX_TRANSACTION_PENDING 0x168
+#define NGBE_PX_INTA 0x110
+
+/* Interrupt register bitmasks */
+/* Extended Interrupt Cause Read */
+#define NGBE_PX_MISC_IC_DEV_RST 0x00000400U /* device reset event */
+#define NGBE_PX_MISC_IC_TIMESYNC 0x00000800U /* time sync */
+#define NGBE_PX_MISC_IC_STALL 0x00001000U /* trans or recv path is
+ * stalled */
+#define NGBE_PX_MISC_IC_LINKSEC 0x00002000U /* Tx LinkSec require key
+ * exchange */
+#define NGBE_PX_MISC_IC_RX_MISS 0x00004000U /* Packet Buffer Overrun */
+#define NGBE_PX_MISC_IC_I2C 0x00010000U /* I2C interrupt */
+#define NGBE_PX_MISC_IC_ETH_EVENT 0x00020000U /* err reported by MAC except
+ * eth link down */
+#define NGBE_PX_MISC_IC_PHY 0x00040000U /* link up */
+#define NGBE_PX_MISC_IC_INT_ERR 0x00100000U /* integrity error */
+#define NGBE_PX_MISC_IC_SPI 0x00200000U /* SPI interface */
+#define NGBE_PX_MISC_IC_VF_MBOX 0x00800000U /* VF-PF message box */
+#define NGBE_PX_MISC_IC_GPIO 0x04000000U /* GPIO interrupt */
+#define NGBE_PX_MISC_IC_PCIE_REQ_ERR 0x08000000U /* pcie request error int */
+#define NGBE_PX_MISC_IC_OVER_HEAT 0x10000000U /* overheat detection */
+#define NGBE_PX_MISC_IC_PROBE_MATCH 0x20000000U /* probe match */
+#define NGBE_PX_MISC_IC_MNG_HOST_MBOX 0x40000000U /* mng mailbox */
+#define NGBE_PX_MISC_IC_TIMER 0x80000000U /* tcp timer */
+
+/* Extended Interrupt Cause Set */
+#define NGBE_PX_MISC_ICS_ETH_LKDN 0x00000100U
+#define NGBE_PX_MISC_ICS_DEV_RST 0x00000400U
+#define NGBE_PX_MISC_ICS_TIMESYNC 0x00000800U
+#define NGBE_PX_MISC_ICS_STALL 0x00001000U
+#define NGBE_PX_MISC_ICS_LINKSEC 0x00002000U
+#define NGBE_PX_MISC_ICS_RX_MISS 0x00004000U
+#define NGBE_PX_MISC_ICS_FLOW_DIR 0x00008000U
+#define NGBE_PX_MISC_ICS_I2C 0x00010000U
+#define NGBE_PX_MISC_ICS_ETH_EVENT 0x00020000U
+#define NGBE_PX_MISC_ICS_ETH_LK 0x00040000U
+#define NGBE_PX_MISC_ICS_ETH_AN 0x00080000U
+#define NGBE_PX_MISC_ICS_INT_ERR 0x00100000U
+#define NGBE_PX_MISC_ICS_SPI 0x00200000U
+#define NGBE_PX_MISC_ICS_VF_MBOX 0x00800000U
+#define NGBE_PX_MISC_ICS_GPIO 0x04000000U
+#define NGBE_PX_MISC_ICS_PCIE_REQ_ERR 0x08000000U
+#define NGBE_PX_MISC_ICS_OVER_HEAT 0x10000000U
+#define NGBE_PX_MISC_ICS_PROBE_MATCH 0x20000000U
+#define NGBE_PX_MISC_ICS_MNG_HOST_MBOX 0x40000000U
+#define NGBE_PX_MISC_ICS_TIMER 0x80000000U
+
+/* Extended Interrupt Enable Set */
+#define NGBE_PX_MISC_IEN_ETH_LKDN 0x00000100U
+#define NGBE_PX_MISC_IEN_DEV_RST 0x00000400U
+#define NGBE_PX_MISC_IEN_TIMESYNC 0x00000800U
+#define NGBE_PX_MISC_IEN_STALL 0x00001000U
+#define NGBE_PX_MISC_IEN_LINKSEC 0x00002000U
+#define NGBE_PX_MISC_IEN_RX_MISS 0x00004000U
+#define NGBE_PX_MISC_IEN_I2C 0x00010000U
+#define NGBE_PX_MISC_IEN_ETH_EVENT 0x00020000U
+#define NGBE_PX_MISC_IEN_ETH_LK 0x00040000U
+#define NGBE_PX_MISC_IEN_ETH_AN 0x00080000U
+#define NGBE_PX_MISC_IEN_INT_ERR 0x00100000U
+#define NGBE_PX_MISC_IEN_SPI 0x00200000U
+#define NGBE_PX_MISC_IEN_VF_MBOX 0x00800000U
+#define NGBE_PX_MISC_IEN_GPIO 0x04000000U
+#define NGBE_PX_MISC_IEN_PCIE_REQ_ERR 0x08000000U
+#define NGBE_PX_MISC_IEN_OVER_HEAT 0x10000000U
+#define NGBE_PX_MISC_IEN_PROBE_MATCH 0x20000000U
+#define NGBE_PX_MISC_IEN_MNG_HOST_MBOX 0x40000000U
+#define NGBE_PX_MISC_IEN_TIMER 0x80000000U
+
+#define NGBE_PX_MISC_IEN_MASK ( \
+ NGBE_PX_MISC_IEN_ETH_LKDN| \
+ NGBE_PX_MISC_IEN_DEV_RST | \
+ NGBE_PX_MISC_IEN_ETH_EVENT | \
+ NGBE_PX_MISC_IEN_ETH_LK | \
+ NGBE_PX_MISC_IEN_ETH_AN | \
+ NGBE_PX_MISC_IEN_INT_ERR | \
+ NGBE_PX_MISC_IEN_VF_MBOX | \
+ NGBE_PX_MISC_IEN_GPIO | \
+ NGBE_PX_MISC_IEN_MNG_HOST_MBOX | \
+ NGBE_PX_MISC_IEN_STALL | \
+ NGBE_PX_MISC_IEN_PCIE_REQ_ERR | \
+ NGBE_PX_MISC_IEN_TIMER)
+
+/* General purpose Interrupt Enable */
+#define NGBE_PX_GPIE_MODEL 0x00000001U
+#define NGBE_PX_GPIE_IMEN 0x00000002U
+#define NGBE_PX_GPIE_LL_INTERVAL 0x000000F0U
+
+/* Interrupt Vector Allocation Registers */
+#define NGBE_PX_IVAR_REG_NUM 64
+#define NGBE_PX_IVAR_ALLOC_VAL 0x80 /* Interrupt Allocation valid */
+
+#define NGBE_MAX_INT_RATE 500000
+#define NGBE_MIN_INT_RATE 980
+#define NGBE_MAX_EITR 0x00007FFCU
+#define NGBE_MIN_EITR 4
+#define NGBE_PX_ITR_ITR_INT_MASK 0x00000FF8U
+#define NGBE_PX_ITR_LLI_CREDIT 0x001f0000U
+#define NGBE_PX_ITR_LLI_MOD 0x00008000U
+#define NGBE_PX_ITR_CNT_WDIS 0x80000000U
+#define NGBE_PX_ITR_ITR_CNT 0x0FE00000U
+
+/* transmit DMA Registers */
+#define NGBE_PX_TR_BAL(_i) (0x03000 + ((_i) * 0x40)) /* [0, 7] */
+#define NGBE_PX_TR_BAH(_i) (0x03004 + ((_i) * 0x40))
+#define NGBE_PX_TR_WP(_i) (0x03008 + ((_i) * 0x40))
+#define NGBE_PX_TR_RP(_i) (0x0300C + ((_i) * 0x40))
+#define NGBE_PX_TR_CFG(_i) (0x03010 + ((_i) * 0x40))
+/* Transmit Config masks */
+#define NGBE_PX_TR_CFG_ENABLE (1) /* Ena specific Tx Queue */
+#define NGBE_PX_TR_CFG_TR_SIZE_SHIFT 1 /* tx desc number per ring */
+#define NGBE_PX_TR_CFG_SWFLSH (1 << 26) /* Tx Desc. wr-bk flushing */
+#define NGBE_PX_TR_CFG_WTHRESH_SHIFT 16 /* shift to WTHRESH bits */
+#define NGBE_PX_TR_CFG_THRE_SHIFT 8
+
+#define NGBE_PX_TR_RPn(q_per_pool, vf_number, vf_q_index) \
+ (NGBE_PX_TR_RP((q_per_pool)*(vf_number) + (vf_q_index)))
+
+#define NGBE_PX_TR_WPn(q_per_pool, vf_number, vf_q_index) \
+ (NGBE_PX_TR_WP((q_per_pool)*(vf_number) + (vf_q_index)))
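+
+/*
+ * Example: with 2 queues per pool, VF 3, queue 1 maps to ring
+ * 2 * 3 + 1 = 7, so NGBE_PX_TR_RPn(2, 3, 1) resolves to
+ * NGBE_PX_TR_RP(7) = 0x0300C + 7 * 0x40 = 0x031CC.
+ */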
+
+/* Receive DMA Registers */
+#define NGBE_PX_RR_BAL(_i) (0x01000 + ((_i) * 0x40)) /* [0, 7] */
+#define NGBE_PX_RR_BAH(_i) (0x01004 + ((_i) * 0x40))
+#define NGBE_PX_RR_WP(_i) (0x01008 + ((_i) * 0x40))
+#define NGBE_PX_RR_RP(_i) (0x0100C + ((_i) * 0x40))
+#define NGBE_PX_RR_CFG(_i) (0x01010 + ((_i) * 0x40))
+/* PX_RR_CFG bit definitions */
+#define NGBE_PX_RR_CFG_RR_SIZE_SHIFT 1
+#define NGBE_PX_RR_CFG_BSIZEPKT_SHIFT 2 /* so many KBs */
+#define NGBE_PX_RR_CFG_BSIZEHDRSIZE_SHIFT 6 /* 64byte resolution (>> 6)
+ * + at bit 12 offset (<< 12)
+ * = (<< 6)
+ */
+#define NGBE_PX_RR_CFG_DROP_EN 0x40000000U
+#define NGBE_PX_RR_CFG_VLAN 0x80000000U
+#define NGBE_PX_RR_CFG_RSC 0x20000000U
+#define NGBE_PX_RR_CFG_CNTAG 0x10000000U
+#define NGBE_PX_RR_CFG_RSC_CNT_MD 0x08000000U
+#define NGBE_PX_RR_CFG_SPLIT_MODE 0x04000000U
+#define NGBE_PX_RR_CFG_STALL 0x02000000U
+#define NGBE_PX_RR_CFG_MAX_RSCBUF_1 0x00000000U
+#define NGBE_PX_RR_CFG_MAX_RSCBUF_4 0x00800000U
+#define NGBE_PX_RR_CFG_MAX_RSCBUF_8 0x01000000U
+#define NGBE_PX_RR_CFG_MAX_RSCBUF_16 0x01800000U
+#define NGBE_PX_RR_CFG_RR_THER 0x00070000U
+#define NGBE_PX_RR_CFG_RR_THER_SHIFT 16
+
+#define NGBE_PX_RR_CFG_RR_HDR_SZ 0x0000F000U
+#define NGBE_PX_RR_CFG_RR_BUF_SZ 0x00000F00U
+#define NGBE_PX_RR_CFG_RR_SZ 0x0000007EU
+#define NGBE_PX_RR_CFG_RR_EN 0x00000001U
+
+/* statistic */
+#define NGBE_PX_MPRC(_i) (0x1020 + ((_i) * 64)) /* [0,7] */
+#define NGBE_PX_BPRC(_i) (0x1024 + ((_i) * 64))
+
+#define NGBE_PX_MPTC(_i) (0x3020 + ((_i) * 64)) /* [0,7] */
+#define NGBE_PX_BPTC(_i) (0x3024 + ((_i) * 64))
+
+#define NGBE_VX_GPRC 0x01014
+#define NGBE_VX_GORC_LSB 0x01018
+#define NGBE_VX_GORC_MSB 0x0101C
+#define NGBE_VX_MPRC 0x01020
+#define NGBE_VX_BPRC 0x01024
+
+#define NGBE_VX_GPTC 0x03014
+#define NGBE_VX_GOTC_LSB 0x03018
+#define NGBE_VX_GOTC_MSB 0x0301C
+#define NGBE_VX_MPTC 0x03020
+#define NGBE_VX_BPTC 0x03024
+
+#define NGBE_PX_GPRC 0x12504
+
+#define NGBE_PX_GPTC 0x18308
+
+#define NGBE_PX_GORC_LSB 0x12508
+#define NGBE_PX_GORC_MSB 0x1250C
+
+#define NGBE_PX_GOTC_LSB 0x1830C
+#define NGBE_PX_GOTC_MSB 0x18310
+
+/*************************** Flash region definition *************************/
+/* EEC Register */
+#define NGBE_EEC_SK 0x00000001U /* EEPROM Clock */
+#define NGBE_EEC_CS 0x00000002U /* EEPROM Chip Select */
+#define NGBE_EEC_DI 0x00000004U /* EEPROM Data In */
+#define NGBE_EEC_DO 0x00000008U /* EEPROM Data Out */
+#define NGBE_EEC_FWE_MASK 0x00000030U /* FLASH Write Enable */
+#define NGBE_EEC_FWE_DIS 0x00000010U /* Disable FLASH writes */
+#define NGBE_EEC_FWE_EN 0x00000020U /* Enable FLASH writes */
+#define NGBE_EEC_FWE_SHIFT 4
+#define NGBE_EEC_REQ 0x00000040U /* EEPROM Access Request */
+#define NGBE_EEC_GNT 0x00000080U /* EEPROM Access Grant */
+#define NGBE_EEC_PRES 0x00000100U /* EEPROM Present */
+#define NGBE_EEC_ARD 0x00000200U /* EEPROM Auto Read Done */
+#define NGBE_EEC_FLUP 0x00800000U /* Flash update command */
+#define NGBE_EEC_SEC1VAL 0x02000000U /* Sector 1 Valid */
+#define NGBE_EEC_FLUDONE 0x04000000U /* Flash update done */
+/* EEPROM Addressing bits based on type (0-small, 1-large) */
+#define NGBE_EEC_ADDR_SIZE 0x00000400U
+#define NGBE_EEC_SIZE 0x00007800U /* EEPROM Size */
+#define NGBE_EERD_MAX_ADDR 0x00003FFFU /* EERD allows 14 bits for addr. */
+
+#define NGBE_EEC_SIZE_SHIFT 11
+#define NGBE_EEPROM_WORD_SIZE_SHIFT 6
+#define NGBE_EEPROM_OPCODE_BITS 8
+
+/* FLA Register */
+#define NGBE_FLA_LOCKED 0x00000040U
+
+/* Part Number String Length */
+#define NGBE_PBANUM_LENGTH 32
+
+/* Checksum and EEPROM pointers */
+#define NGBE_PBANUM_PTR_GUARD 0xFAFA
+#define NGBE_CHECKSUM_CAP_ST_PASS 0x80658383
+#define NGBE_CHECKSUM_CAP_ST_FAIL 0x70657376
+#define NGBE_EEPROM_CHECKSUM 0x2F
+#define NGBE_EEPROM_SUM 0xBABA
+#define NGBE_OPTION_ROM_PTR 0x05
+#define NGBE_SHADOW_RAM_SIZE 0x4000
+#define NGBE_PCIE_CONFIG_SIZE 0x08
+#define NGBE_EEPROM_LAST_WORD 0x800
+#define NGBE_FW_PTR 0x0F
+#define NGBE_SW_REGION_PTR 0x28
+
+#define NGBE_CALSUM_COMMAND 0xE9
+#define NGBE_CALSUM_CAP_STATUS 0x10224
+#define NGBE_EEPROM_VERSION_STORE_REG 0x1022C
+#define NGBE_SAN_MAC_ADDR_PTR 0x18
+#define NGBE_DEVICE_CAPS 0x1C
+#define NGBE_EEPROM_VERSION_L 0x1D
+#define NGBE_EEPROM_VERSION_H 0x1E
+
+#define NGBE_MAX_MSIX_VECTORS_EMERALD 0x09
+
+/* MSI-X capability fields masks */
+#define NGBE_PCIE_MSIX_TBL_SZ_MASK 0x7FF
+
+/* EEPROM Commands - SPI */
+#define NGBE_EEPROM_MAX_RETRY_SPI 5000 /* Max wait 5ms for RDY signal */
+#define NGBE_EEPROM_STATUS_RDY_SPI 0x01
+#define NGBE_EEPROM_READ_OPCODE_SPI 0x03 /* EEPROM read opcode */
+#define NGBE_EEPROM_WRITE_OPCODE_SPI 0x02 /* EEPROM write opcode */
+#define NGBE_EEPROM_A8_OPCODE_SPI 0x08 /* opcode bit-3 = addr bit-8 */
+#define NGBE_EEPROM_WREN_OPCODE_SPI 0x06 /* EEPROM set Write Ena latch */
+/* EEPROM reset Write Enable latch */
+#define NGBE_EEPROM_WRDI_OPCODE_SPI 0x04
+#define NGBE_EEPROM_RDSR_OPCODE_SPI 0x05 /* EEPROM read Status reg */
+#define NGBE_EEPROM_WRSR_OPCODE_SPI 0x01 /* EEPROM write Status reg */
+#define NGBE_EEPROM_ERASE4K_OPCODE_SPI 0x20 /* EEPROM ERASE 4KB */
+#define NGBE_EEPROM_ERASE64K_OPCODE_SPI 0xD8 /* EEPROM ERASE 64KB */
+#define NGBE_EEPROM_ERASE256_OPCODE_SPI 0xDB /* EEPROM ERASE 256B */
+
+/* EEPROM Read Register */
+#define NGBE_EEPROM_RW_REG_DATA 16 /* data offset in EEPROM read reg */
+#define NGBE_EEPROM_RW_REG_DONE 2 /* Offset to READ done bit */
+#define NGBE_EEPROM_RW_REG_START 1 /* First bit to start operation */
+#define NGBE_EEPROM_RW_ADDR_SHIFT 2 /* Shift to the address bits */
+#define NGBE_NVM_POLL_WRITE 1 /* Flag for polling for wr complete */
+#define NGBE_NVM_POLL_READ 0 /* Flag for polling for rd complete */
+
+#define NVM_INIT_CTRL_3 0x38
+#define NVM_INIT_CTRL_3_LPLU 0x8
+
+#define NGBE_ETH_LENGTH_OF_ADDRESS 6
+
+#define NGBE_EEPROM_PAGE_SIZE_MAX 128
+#define NGBE_EEPROM_RD_BUFFER_MAX_COUNT 256 /* words rd in burst */
+#define NGBE_EEPROM_WR_BUFFER_MAX_COUNT 256 /* words wr in burst */
+#define NGBE_EEPROM_CTRL_2 1 /* EEPROM CTRL word 2 */
+#define NGBE_EEPROM_CCD_BIT 2
+
+#ifndef NGBE_EEPROM_GRANT_ATTEMPTS
+#define NGBE_EEPROM_GRANT_ATTEMPTS 1000 /* EEPROM attempts to gain grant */
+#endif
+
+#ifndef NGBE_EERD_EEWR_ATTEMPTS
+/* Number of 5-microsecond intervals we wait for an EERD read or
+ * EEWR write to complete */
+#define NGBE_EERD_EEWR_ATTEMPTS 100000
+#endif
+
+#ifndef NGBE_FLUDONE_ATTEMPTS
+/* Number of attempts we wait for the flash update to complete */
+#define NGBE_FLUDONE_ATTEMPTS 20000
+#endif
+
+#define NGBE_PCIE_CTRL2 0x5 /* PCIe Control 2 Offset */
+#define NGBE_PCIE_CTRL2_DUMMY_ENABLE 0x8 /* Dummy Function Enable */
+#define NGBE_PCIE_CTRL2_LAN_DISABLE 0x2 /* LAN PCI Disable */
+#define NGBE_PCIE_CTRL2_DISABLE_SELECT 0x1 /* LAN Disable Select */
+
+#define NGBE_SAN_MAC_ADDR_PORT0_OFFSET 0x0
+#define NGBE_SAN_MAC_ADDR_PORT1_OFFSET 0x3
+#define NGBE_DEVICE_CAPS_ALLOW_ANY_SFP 0x1
+#define NGBE_DEVICE_CAPS_FCOE_OFFLOADS 0x2
+#define NGBE_FW_LESM_PARAMETERS_PTR 0x2
+#define NGBE_FW_LESM_STATE_1 0x1
+#define NGBE_FW_LESM_STATE_ENABLED 0x8000 /* LESM Enable bit */
+#define NGBE_FW_PASSTHROUGH_PATCH_CONFIG_PTR 0x4
+#define NGBE_FW_PATCH_VERSION_4 0x7
+#define NGBE_FCOE_IBA_CAPS_BLK_PTR 0x33 /* iSCSI/FCOE block */
+#define NGBE_FCOE_IBA_CAPS_FCOE 0x20 /* FCOE flags */
+#define NGBE_ISCSI_FCOE_BLK_PTR 0x17 /* iSCSI/FCOE block */
+#define NGBE_ISCSI_FCOE_FLAGS_OFFSET 0x0 /* FCOE flags */
+#define NGBE_ISCSI_FCOE_FLAGS_ENABLE 0x1 /* FCOE flags enable bit */
+#define NGBE_ALT_SAN_MAC_ADDR_BLK_PTR 0x17 /* Alt. SAN MAC block */
+#define NGBE_ALT_SAN_MAC_ADDR_CAPS_OFFSET 0x0 /* Alt SAN MAC capability */
+#define NGBE_ALT_SAN_MAC_ADDR_PORT0_OFFSET 0x1 /* Alt SAN MAC 0 offset */
+#define NGBE_ALT_SAN_MAC_ADDR_PORT1_OFFSET 0x4 /* Alt SAN MAC 1 offset */
+#define NGBE_ALT_SAN_MAC_ADDR_WWNN_OFFSET 0x7 /* Alt WWNN prefix offset */
+#define NGBE_ALT_SAN_MAC_ADDR_WWPN_OFFSET 0x8 /* Alt WWPN prefix offset */
+#define NGBE_ALT_SAN_MAC_ADDR_CAPS_SANMAC 0x0 /* Alt SAN MAC exists */
+#define NGBE_ALT_SAN_MAC_ADDR_CAPS_ALTWWN 0x1 /* Alt WWN base exists */
+#define NGBE_DEVICE_CAPS_WOL_PORT0_1 0x4 /* WoL supported on ports 0 & 1 */
+#define NGBE_DEVICE_CAPS_WOL_PORT0 0x8 /* WoL supported on port 0 */
+#define NGBE_DEVICE_CAPS_WOL_MASK 0xC /* Mask for WoL capabilities */
+
+/******************************** PCI Bus Info *******************************/
+#define NGBE_PCI_DEVICE_STATUS 0xAA
+#define NGBE_PCI_DEVICE_STATUS_TRANSACTION_PENDING 0x0020
+#define NGBE_PCI_LINK_STATUS 0xB2
+#define NGBE_PCI_DEVICE_CONTROL2 0xC8
+#define NGBE_PCI_LINK_WIDTH 0x3F0
+#define NGBE_PCI_LINK_WIDTH_1 0x10
+#define NGBE_PCI_LINK_WIDTH_2 0x20
+#define NGBE_PCI_LINK_WIDTH_4 0x40
+#define NGBE_PCI_LINK_WIDTH_8 0x80
+#define NGBE_PCI_LINK_SPEED 0xF
+#define NGBE_PCI_LINK_SPEED_2500 0x1
+#define NGBE_PCI_LINK_SPEED_5000 0x2
+#define NGBE_PCI_LINK_SPEED_8000 0x3
+#define NGBE_PCI_HEADER_TYPE_REGISTER 0x0E
+#define NGBE_PCI_HEADER_TYPE_MULTIFUNC 0x80
+#define NGBE_PCI_DEVICE_CONTROL2_16ms 0x0005
+
+#define NGBE_PCIDEVCTRL2_RELAX_ORDER_OFFSET 4
+#define NGBE_PCIDEVCTRL2_RELAX_ORDER_MASK \
+ (0x0001 << NGBE_PCIDEVCTRL2_RELAX_ORDER_OFFSET)
+#define NGBE_PCIDEVCTRL2_RELAX_ORDER_ENABLE \
+ (0x01 << NGBE_PCIDEVCTRL2_RELAX_ORDER_OFFSET)
+
+#define NGBE_PCIDEVCTRL2_TIMEO_MASK 0xf
+#define NGBE_PCIDEVCTRL2_16_32ms_def 0x0
+#define NGBE_PCIDEVCTRL2_50_100us 0x1
+#define NGBE_PCIDEVCTRL2_1_2ms 0x2
+#define NGBE_PCIDEVCTRL2_16_32ms 0x5
+#define NGBE_PCIDEVCTRL2_65_130ms 0x6
+#define NGBE_PCIDEVCTRL2_260_520ms 0x9
+#define NGBE_PCIDEVCTRL2_1_2s 0xa
+#define NGBE_PCIDEVCTRL2_4_8s 0xd
+#define NGBE_PCIDEVCTRL2_17_34s 0xe
+
+/******************* Receive Descriptor bit definitions **********************/
+#define NGBE_RXD_IPSEC_STATUS_SECP 0x00020000U
+#define NGBE_RXD_IPSEC_ERROR_INVALID_PROTOCOL 0x08000000U
+#define NGBE_RXD_IPSEC_ERROR_INVALID_LENGTH 0x10000000U
+#define NGBE_RXD_IPSEC_ERROR_AUTH_FAILED 0x18000000U
+#define NGBE_RXD_IPSEC_ERROR_BIT_MASK 0x18000000U
+
+#define NGBE_RXD_NEXTP_MASK 0x000FFFF0U /* Next Descriptor Index */
+#define NGBE_RXD_NEXTP_SHIFT 0x00000004U
+#define NGBE_RXD_STAT_MASK 0x000fffffU /* Stat/NEXTP: bit 0-19 */
+#define NGBE_RXD_STAT_DD 0x00000001U /* Done */
+#define NGBE_RXD_STAT_EOP 0x00000002U /* End of Packet */
+#define NGBE_RXD_STAT_CLASS_ID_MASK 0x0000001CU
+#define NGBE_RXD_STAT_CLASS_ID_TC_RSS 0x00000000U
+#define NGBE_RXD_STAT_CLASS_ID_SYN 0x00000008U
+#define NGBE_RXD_STAT_CLASS_ID_5_TUPLE 0x0000000CU
+#define NGBE_RXD_STAT_CLASS_ID_L2_ETYPE 0x00000010U
+#define NGBE_RXD_STAT_VP 0x00000020U /* IEEE VLAN Pkt */
+#define NGBE_RXD_STAT_UDPCS 0x00000040U /* UDP xsum calculated */
+#define NGBE_RXD_STAT_L4CS 0x00000080U /* L4 xsum calculated */
+#define NGBE_RXD_STAT_IPCS 0x00000100U /* IP xsum calculated */
+#define NGBE_RXD_STAT_PIF 0x00000200U /* passed in-exact filter */
+#define NGBE_RXD_STAT_OUTERIPCS 0x00000400U /* Cloud IP xsum calculated*/
+#define NGBE_RXD_STAT_VEXT 0x00000800U /* 1st VLAN found */
+#define NGBE_RXD_STAT_LLINT 0x00002000U /* Pkt caused Low Latency
+ * Int */
+#define NGBE_RXD_STAT_TS 0x00004000U /* IEEE1588 Time Stamp */
+#define NGBE_RXD_STAT_SECP 0x00008000U /* Security Processing */
+#define NGBE_RXD_STAT_LB 0x00010000U /* Loopback Status */
+#define NGBE_RXD_STAT_FCEOFS 0x00020000U /* FCoE EOF/SOF Stat */
+#define NGBE_RXD_STAT_FCSTAT 0x000C0000U /* FCoE Pkt Stat */
+#define NGBE_RXD_STAT_FCSTAT_NOMTCH 0x00000000U /* 00: No Ctxt Match */
+#define NGBE_RXD_STAT_FCSTAT_NODDP 0x00040000U /* 01: Ctxt w/o DDP */
+#define NGBE_RXD_STAT_FCSTAT_FCPRSP 0x00080000U /* 10: Recv. FCP_RSP */
+#define NGBE_RXD_STAT_FCSTAT_DDP 0x000C0000U /* 11: Ctxt w/ DDP */
+
+#define NGBE_RXD_ERR_MASK 0xfff00000U /* RDESC.ERRORS mask */
+#define NGBE_RXD_ERR_SHIFT 20 /* RDESC.ERRORS shift */
+#define NGBE_RXD_ERR_FCEOFE 0x80000000U /* FCEOFe/IPE */
+#define NGBE_RXD_ERR_HBO 0x00800000U /*Header Buffer Overflow */
+#define NGBE_RXD_ERR_OUTERIPER 0x04000000U /* CRC IP Header error */
+#define NGBE_RXD_ERR_SECERR_MASK 0x18000000U
+#define NGBE_RXD_ERR_RXE 0x20000000U /* Any MAC Error */
+#define NGBE_RXD_ERR_TCPE 0x40000000U /* TCP/UDP Checksum Error */
+#define NGBE_RXD_ERR_IPE 0x80000000U /* IP Checksum Error */
+
+#define NGBE_RXDPS_HDRSTAT_HDRSP 0x00008000U
+#define NGBE_RXDPS_HDRSTAT_HDRLEN_MASK 0x000003FFU
+
+#define NGBE_RXD_RSSTYPE_MASK 0x0000000FU
+#define NGBE_RXD_TPID_MASK 0x000001C0U
+#define NGBE_RXD_TPID_SHIFT 6
+#define NGBE_RXD_HDRBUFLEN_MASK 0x00007FE0U
+#define NGBE_RXD_RSCCNT_MASK 0x001E0000U
+#define NGBE_RXD_RSCCNT_SHIFT 17
+#define NGBE_RXD_HDRBUFLEN_SHIFT 5
+#define NGBE_RXD_SPLITHEADER_EN 0x00001000U
+#define NGBE_RXD_SPH 0x8000
+
+/* RSS Hash results */
+#define NGBE_RXD_RSSTYPE_NONE 0x00000000U
+#define NGBE_RXD_RSSTYPE_IPV4_TCP 0x00000001U
+#define NGBE_RXD_RSSTYPE_IPV4 0x00000002U
+#define NGBE_RXD_RSSTYPE_IPV6_TCP 0x00000003U
+#define NGBE_RXD_RSSTYPE_IPV4_SCTP 0x00000004U
+#define NGBE_RXD_RSSTYPE_IPV6 0x00000005U
+#define NGBE_RXD_RSSTYPE_IPV6_SCTP 0x00000006U
+#define NGBE_RXD_RSSTYPE_IPV4_UDP 0x00000007U
+#define NGBE_RXD_RSSTYPE_IPV6_UDP 0x00000008U
+
+/**
+ * receive packet type
+ * PTYPE:8 = TUN:2 + PKT:2 + TYP:4
+ **/
+/* TUN */
+#define NGBE_PTYPE_TUN_IPV4 (0x80)
+#define NGBE_PTYPE_TUN_IPV6 (0xC0)
+
+/* PKT for TUN */
+#define NGBE_PTYPE_PKT_IPIP (0x00) /* IP+IP */
+#define NGBE_PTYPE_PKT_IG (0x10) /* IP+GRE */
+#define NGBE_PTYPE_PKT_IGM (0x20) /* IP+GRE+MAC */
+#define NGBE_PTYPE_PKT_IGMV (0x30) /* IP+GRE+MAC+VLAN */
+/* PKT for !TUN */
+#define NGBE_PTYPE_PKT_MAC (0x10)
+#define NGBE_PTYPE_PKT_IP (0x20)
+#define NGBE_PTYPE_PKT_FCOE (0x30)
+
+/* TYP for PKT=mac */
+#define NGBE_PTYPE_TYP_MAC (0x01)
+#define NGBE_PTYPE_TYP_TS (0x02) /* time sync */
+#define NGBE_PTYPE_TYP_FIP (0x03)
+#define NGBE_PTYPE_TYP_LLDP (0x04)
+#define NGBE_PTYPE_TYP_CNM (0x05)
+#define NGBE_PTYPE_TYP_EAPOL (0x06)
+#define NGBE_PTYPE_TYP_ARP (0x07)
+/* TYP for PKT=ip */
+#define NGBE_PTYPE_PKT_IPV6 (0x08)
+#define NGBE_PTYPE_TYP_IPFRAG (0x01)
+#define NGBE_PTYPE_TYP_IP (0x02)
+#define NGBE_PTYPE_TYP_UDP (0x03)
+#define NGBE_PTYPE_TYP_TCP (0x04)
+#define NGBE_PTYPE_TYP_SCTP (0x05)
+/* TYP for PKT=fcoe */
+#define NGBE_PTYPE_PKT_VFT (0x08)
+#define NGBE_PTYPE_TYP_FCOE (0x00)
+#define NGBE_PTYPE_TYP_FCDATA (0x01)
+#define NGBE_PTYPE_TYP_FCRDY (0x02)
+#define NGBE_PTYPE_TYP_FCRSP (0x03)
+#define NGBE_PTYPE_TYP_FCOTHER (0x04)
+
+/* Packet type non-ip values */
+enum ngbe_l2_ptypes {
+ NGBE_PTYPE_L2_ABORTED = (NGBE_PTYPE_PKT_MAC),
+ NGBE_PTYPE_L2_MAC = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_MAC),
+ NGBE_PTYPE_L2_TS = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_TS),
+ NGBE_PTYPE_L2_FIP = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_FIP),
+ NGBE_PTYPE_L2_LLDP = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_LLDP),
+ NGBE_PTYPE_L2_CNM = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_CNM),
+ NGBE_PTYPE_L2_EAPOL = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_EAPOL),
+ NGBE_PTYPE_L2_ARP = (NGBE_PTYPE_PKT_MAC | NGBE_PTYPE_TYP_ARP),
+
+ NGBE_PTYPE_L2_IPV4_FRAG = (NGBE_PTYPE_PKT_IP |
+ NGBE_PTYPE_TYP_IPFRAG),
+ NGBE_PTYPE_L2_IPV4 = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_TYP_IP),
+ NGBE_PTYPE_L2_IPV4_UDP = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_TYP_UDP),
+ NGBE_PTYPE_L2_IPV4_TCP = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_TYP_TCP),
+ NGBE_PTYPE_L2_IPV4_SCTP = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_TYP_SCTP),
+ NGBE_PTYPE_L2_IPV6_FRAG = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_PKT_IPV6 |
+ NGBE_PTYPE_TYP_IPFRAG),
+ NGBE_PTYPE_L2_IPV6 = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_PKT_IPV6 |
+ NGBE_PTYPE_TYP_IP),
+ NGBE_PTYPE_L2_IPV6_UDP = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_PKT_IPV6 |
+ NGBE_PTYPE_TYP_UDP),
+ NGBE_PTYPE_L2_IPV6_TCP = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_PKT_IPV6 |
+ NGBE_PTYPE_TYP_TCP),
+ NGBE_PTYPE_L2_IPV6_SCTP = (NGBE_PTYPE_PKT_IP | NGBE_PTYPE_PKT_IPV6 |
+ NGBE_PTYPE_TYP_SCTP),
+
+ NGBE_PTYPE_L2_FCOE = (NGBE_PTYPE_PKT_FCOE | NGBE_PTYPE_TYP_FCOE),
+ NGBE_PTYPE_L2_FCOE_FCDATA = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_TYP_FCDATA),
+ NGBE_PTYPE_L2_FCOE_FCRDY = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_TYP_FCRDY),
+ NGBE_PTYPE_L2_FCOE_FCRSP = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_TYP_FCRSP),
+ NGBE_PTYPE_L2_FCOE_FCOTHER = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_TYP_FCOTHER),
+ NGBE_PTYPE_L2_FCOE_VFT = (NGBE_PTYPE_PKT_FCOE | NGBE_PTYPE_PKT_VFT),
+ NGBE_PTYPE_L2_FCOE_VFT_FCDATA = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_PKT_VFT | NGBE_PTYPE_TYP_FCDATA),
+ NGBE_PTYPE_L2_FCOE_VFT_FCRDY = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_PKT_VFT | NGBE_PTYPE_TYP_FCRDY),
+ NGBE_PTYPE_L2_FCOE_VFT_FCRSP = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_PKT_VFT | NGBE_PTYPE_TYP_FCRSP),
+ NGBE_PTYPE_L2_FCOE_VFT_FCOTHER = (NGBE_PTYPE_PKT_FCOE |
+ NGBE_PTYPE_PKT_VFT | NGBE_PTYPE_TYP_FCOTHER),
+
+ NGBE_PTYPE_L2_TUN4_MAC = (NGBE_PTYPE_TUN_IPV4 | NGBE_PTYPE_PKT_IGM),
+ NGBE_PTYPE_L2_TUN6_MAC = (NGBE_PTYPE_TUN_IPV6 | NGBE_PTYPE_PKT_IGM),
+};
+
+#define NGBE_RXD_PKTTYPE(_rxd) \
+ ((le32_to_cpu((_rxd)->wb.lower.lo_dword.data) >> 9) & 0xFF)
+#define NGBE_PTYPE_TUN(_pt) ((_pt) & 0xC0)
+#define NGBE_PTYPE_PKT(_pt) ((_pt) & 0x30)
+#define NGBE_PTYPE_TYP(_pt) ((_pt) & 0x0F)
+#define NGBE_PTYPE_TYPL4(_pt) ((_pt) & 0x07)
+
+#define NGBE_RXD_IPV6EX(_rxd) \
+ ((le32_to_cpu((_rxd)->wb.lower.lo_dword.data) >> 6) & 0x1)
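+
+/*
+ * Example: splitting a packet type with the masks above. For
+ * NGBE_PTYPE_L2_IPV4_TCP (0x20 | 0x04 = 0x24):
+ *
+ *	NGBE_PTYPE_TUN(0x24) == 0x00 (no tunnel header)
+ *	NGBE_PTYPE_PKT(0x24) == NGBE_PTYPE_PKT_IP (0x20)
+ *	NGBE_PTYPE_TYP(0x24) == NGBE_PTYPE_TYP_TCP (0x04)
+ */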
+
+/* Security Processing bit Indication */
+#define NGBE_RXD_LNKSEC_STATUS_SECP 0x00020000U
+#define NGBE_RXD_LNKSEC_ERROR_NO_SA_MATCH 0x08000000U
+#define NGBE_RXD_LNKSEC_ERROR_REPLAY_ERROR 0x10000000U
+#define NGBE_RXD_LNKSEC_ERROR_BIT_MASK 0x18000000U
+#define NGBE_RXD_LNKSEC_ERROR_BAD_SIG 0x18000000U
+
+/* Masks to determine if packets should be dropped due to frame errors */
+#define NGBE_RXD_ERR_FRAME_ERR_MASK NGBE_RXD_ERR_RXE
+
+/*********************** Adv Transmit Descriptor Config Masks ****************/
+#define NGBE_TXD_DTALEN_MASK 0x0000FFFFU /* Data buf length(bytes) */
+#define NGBE_TXD_MAC_LINKSEC 0x00040000U /* Insert LinkSec */
+#define NGBE_TXD_MAC_TSTAMP 0x00080000U /* IEEE1588 time stamp */
+#define NGBE_TXD_IPSEC_SA_INDEX_MASK 0x000003FFU /* IPSec SA index */
+#define NGBE_TXD_IPSEC_ESP_LEN_MASK 0x000001FFU /* IPSec ESP length */
+#define NGBE_TXD_DTYP_MASK 0x00F00000U /* DTYP mask */
+#define NGBE_TXD_DTYP_CTXT 0x00100000U /* Adv Context Desc */
+#define NGBE_TXD_DTYP_DATA 0x00000000U /* Adv Data Descriptor */
+#define NGBE_TXD_EOP 0x01000000U /* End of Packet */
+#define NGBE_TXD_IFCS 0x02000000U /* Insert FCS */
+#define NGBE_TXD_LINKSEC 0x04000000U /* enable linksec */
+#define NGBE_TXD_RS 0x08000000U /* Report Status */
+#define NGBE_TXD_ECU 0x10000000U /* DDP hdr type or iSCSI */
+#define NGBE_TXD_QCN 0x20000000U /* cntag insertion enable */
+#define NGBE_TXD_VLE 0x40000000U /* VLAN pkt enable */
+#define NGBE_TXD_TSE 0x80000000U /* TCP Seg enable */
+#define NGBE_TXD_STAT_DD 0x00000001U /* Descriptor Done */
+#define NGBE_TXD_IDX_SHIFT 4 /* Adv desc Index shift */
+#define NGBE_TXD_CC 0x00000080U /* Check Context */
+#define NGBE_TXD_IPSEC 0x00000100U /* enable ipsec esp */
+#define NGBE_TXD_IIPCS 0x00000400U
+#define NGBE_TXD_EIPCS 0x00000800U
+#define NGBE_TXD_L4CS 0x00000200U
+#define NGBE_TXD_PAYLEN_SHIFT 13 /* Adv desc PAYLEN shift */
+#define NGBE_TXD_MACLEN_SHIFT 9 /* Adv ctxt desc mac len shift */
+#define NGBE_TXD_VLAN_SHIFT 16 /* Adv ctxt vlan tag shift */
+#define NGBE_TXD_TAG_TPID_SEL_SHIFT 11
+#define NGBE_TXD_IPSEC_TYPE_SHIFT 14
+#define NGBE_TXD_ENC_SHIFT 15
+
+#define NGBE_TXD_TUCMD_IPSEC_TYPE_ESP 0x00004000U /* IPSec Type ESP */
+#define NGBE_TXD_TUCMD_IPSEC_ENCRYPT_EN 0x00008000U /* ESP Encrypt Enable */
+#define NGBE_TXD_TUCMD_FCOE 0x00010000U /* FCoE Frame Type */
+#define NGBE_TXD_FCOEF_EOF_MASK (0x3 << 10) /* FC EOF index */
+#define NGBE_TXD_FCOEF_SOF ((1 << 2) << 10) /* FC SOF index */
+#define NGBE_TXD_FCOEF_PARINC ((1 << 3) << 10) /* Rel_Off in F_CTL */
+#define NGBE_TXD_FCOEF_ORIE ((1 << 4) << 10) /* Orientation End */
+#define NGBE_TXD_FCOEF_ORIS ((1 << 5) << 10) /* Orientation Start */
+#define NGBE_TXD_FCOEF_EOF_N (0x0 << 10) /* 00: EOFn */
+#define NGBE_TXD_FCOEF_EOF_T (0x1 << 10) /* 01: EOFt */
+#define NGBE_TXD_FCOEF_EOF_NI (0x2 << 10) /* 10: EOFni */
+#define NGBE_TXD_FCOEF_EOF_A (0x3 << 10) /* 11: EOFa */
+#define NGBE_TXD_L4LEN_SHIFT 8 /* Adv ctxt L4LEN shift */
+#define NGBE_TXD_MSS_SHIFT 16 /* Adv ctxt MSS shift */
+
+#define NGBE_TXD_OUTER_IPLEN_SHIFT 12 /* Adv ctxt OUTERIPLEN shift */
+#define NGBE_TXD_TUNNEL_LEN_SHIFT 21 /* Adv ctxt TUNNELLEN shift */
+#define NGBE_TXD_TUNNEL_TYPE_SHIFT 11 /* Adv Tx Desc Tunnel Type shift */
+#define NGBE_TXD_TUNNEL_DECTTL_SHIFT 27 /* Adv ctxt DECTTL shift */
+#define NGBE_TXD_TUNNEL_UDP (0x0ULL << NGBE_TXD_TUNNEL_TYPE_SHIFT)
+#define NGBE_TXD_TUNNEL_GRE (0x1ULL << NGBE_TXD_TUNNEL_TYPE_SHIFT)
+
+/************ ngbe_type.h ************/
+/* Number of Transmit and Receive Descriptors must be a multiple of 8 */
+#define NGBE_REQ_TX_DESCRIPTOR_MULTIPLE 8
+#define NGBE_REQ_RX_DESCRIPTOR_MULTIPLE 8
+#define NGBE_REQ_TX_BUFFER_GRANULARITY 1024
+
+/* Vlan-specific macros */
+#define NGBE_RX_DESC_SPECIAL_VLAN_MASK 0x0FFF /* VLAN ID in lower 12 bits */
+#define NGBE_RX_DESC_SPECIAL_PRI_MASK 0xE000 /* Priority in upper 3 bits */
+#define NGBE_RX_DESC_SPECIAL_PRI_SHIFT 0x000D /* Priority in upper 3 of 16 */
+#define NGBE_TX_DESC_SPECIAL_PRI_SHIFT NGBE_RX_DESC_SPECIAL_PRI_SHIFT
+
+/* Transmit Descriptor */
+union ngbe_tx_desc {
+ struct {
+ __le64 buffer_addr; /* Address of descriptor's data buf */
+ __le32 cmd_type_len;
+ __le32 olinfo_status;
+ } read;
+ struct {
+ __le64 rsvd; /* Reserved */
+ __le32 nxtseq_seed;
+ __le32 status;
+ } wb;
+};
+
+/* Receive Descriptor */
+union ngbe_rx_desc {
+ struct {
+ __le64 pkt_addr; /* Packet buffer address */
+ __le64 hdr_addr; /* Header buffer address */
+ } read;
+ struct {
+ struct {
+ union {
+ __le32 data;
+ struct {
+ __le16 pkt_info; /* RSS, Pkt type */
+ __le16 hdr_info; /* Splithdr, hdrlen */
+ } hs_rss;
+ } lo_dword;
+ union {
+ __le32 rss; /* RSS Hash */
+ struct {
+ __le16 ip_id; /* IP id */
+ __le16 csum; /* Packet Checksum */
+ } csum_ip;
+ } hi_dword;
+ } lower;
+ struct {
+ __le32 status_error; /* ext status/error */
+ __le16 length; /* Packet length */
+ __le16 vlan; /* VLAN tag */
+ } upper;
+ } wb; /* writeback */
+};
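+
+/*
+ * Example: a write-back descriptor is owned by software once the DD
+ * status bit is set; EOP marks the last descriptor of a frame.
+ * Illustrative sketch only (rx_desc points at a union ngbe_rx_desc,
+ * complete_frame() is a hypothetical consumer):
+ *
+ *	u32 staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
+ *	u16 len = le16_to_cpu(rx_desc->wb.upper.length);
+ *
+ *	if ((staterr & NGBE_RXD_STAT_DD) && (staterr & NGBE_RXD_STAT_EOP))
+ *		complete_frame(len);
+ */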
+
+/* Context descriptors */
+struct ngbe_tx_context_desc {
+ __le32 vlan_macip_lens;
+ __le32 seqnum_seed;
+ __le32 type_tucmd_mlhl;
+ __le32 mss_l4len_idx;
+};
+
+/************************* Flow Director HASH ********************************/
+/* Software ATR hash keys */
+#define NGBE_ATR_BUCKET_HASH_KEY 0x3DAD14E2
+#define NGBE_ATR_SIGNATURE_HASH_KEY 0x174D3614
+
+/* Software ATR input stream values and masks */
+#define NGBE_ATR_HASH_MASK 0x7fff
+#define NGBE_ATR_L4TYPE_MASK 0x3
+#define NGBE_ATR_L4TYPE_UDP 0x1
+#define NGBE_ATR_L4TYPE_TCP 0x2
+#define NGBE_ATR_L4TYPE_SCTP 0x3
+#define NGBE_ATR_L4TYPE_IPV6_MASK 0x4
+#define NGBE_ATR_L4TYPE_TUNNEL_MASK 0x10
+enum ngbe_atr_flow_type {
+ NGBE_ATR_FLOW_TYPE_IPV4 = 0x0,
+ NGBE_ATR_FLOW_TYPE_UDPV4 = 0x1,
+ NGBE_ATR_FLOW_TYPE_TCPV4 = 0x2,
+ NGBE_ATR_FLOW_TYPE_SCTPV4 = 0x3,
+ NGBE_ATR_FLOW_TYPE_IPV6 = 0x4,
+ NGBE_ATR_FLOW_TYPE_UDPV6 = 0x5,
+ NGBE_ATR_FLOW_TYPE_TCPV6 = 0x6,
+ NGBE_ATR_FLOW_TYPE_SCTPV6 = 0x7,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_IPV4 = 0x10,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_UDPV4 = 0x11,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_TCPV4 = 0x12,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_SCTPV4 = 0x13,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_IPV6 = 0x14,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_UDPV6 = 0x15,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_TCPV6 = 0x16,
+ NGBE_ATR_FLOW_TYPE_TUNNELED_SCTPV6 = 0x17,
+};
+
+/* Flow Director ATR input struct. */
+union ngbe_atr_input {
+ /*
+ * Byte layout in order, all values with MSB first:
+ *
+ * vm_pool - 1 byte
+ * flow_type - 1 byte
+ * vlan_id - 2 bytes
+ * dst_ip - 16 bytes
+ * src_ip - 16 bytes
+ * src_port - 2 bytes
+ * dst_port - 2 bytes
+ * flex_bytes - 2 bytes
+ * bkt_hash - 2 bytes
+ */
+ struct {
+ u8 vm_pool;
+ u8 flow_type;
+ __be16 vlan_id;
+ __be32 dst_ip[4];
+ __be32 src_ip[4];
+ __be16 src_port;
+ __be16 dst_port;
+ __be16 flex_bytes;
+ __be16 bkt_hash;
+ } formatted;
+ __be32 dword_stream[11];
+};
+
+/* Flow Director compressed ATR hash input struct */
+union ngbe_atr_hash_dword {
+ struct {
+ u8 vm_pool;
+ u8 flow_type;
+ __be16 vlan_id;
+ } formatted;
+ __be32 ip;
+ struct {
+ __be16 src;
+ __be16 dst;
+ } port;
+ __be16 flex_bytes;
+ __be32 dword;
+};
+
+/****************** Manageability Host Interface defines **********************/
+#define NGBE_HI_MAX_BLOCK_BYTE_LENGTH 256 /* Num of bytes in range */
+#define NGBE_HI_MAX_BLOCK_DWORD_LENGTH 64 /* Num of dwords in range */
+#define NGBE_HI_COMMAND_TIMEOUT 5000 /* Process HI command limit */
+#define NGBE_HI_FLASH_ERASE_TIMEOUT 5000 /* Process Erase command limit */
+#define NGBE_HI_FLASH_UPDATE_TIMEOUT 5000 /* Process Update command limit */
+#define NGBE_HI_FLASH_VERIFY_TIMEOUT 60000 /* Process Apply command limit */
+#define NGBE_HI_PHY_MGMT_REQ_TIMEOUT 2000 /* Wait up to 2 seconds */
+
+/* CEM Support */
+#define FW_CEM_HDR_LEN 0x4
+#define FW_CEM_CMD_DRIVER_INFO 0xDD
+#define FW_CEM_CMD_DRIVER_INFO_LEN 0x5
+#define FW_CEM_CMD_RESERVED 0x0
+#define FW_CEM_UNUSED_VER 0x0
+#define FW_CEM_MAX_RETRIES 3
+#define FW_CEM_RESP_STATUS_SUCCESS 0x1
+#define FW_READ_SHADOW_RAM_CMD 0x31
+#define FW_READ_SHADOW_RAM_LEN 0x6
+#define FW_WRITE_SHADOW_RAM_CMD 0x33
+#define FW_WRITE_SHADOW_RAM_LEN 0xA /* 8 plus 1 WORD to write */
+#define FW_SHADOW_RAM_DUMP_CMD 0x36
+#define FW_SHADOW_RAM_DUMP_LEN 0
+#define FW_DEFAULT_CHECKSUM 0xFF /* checksum always 0xFF */
+#define FW_NVM_DATA_OFFSET 3
+#define FW_MAX_READ_BUFFER_SIZE 244
+#define FW_DISABLE_RXEN_CMD 0xDE
+#define FW_DISABLE_RXEN_LEN 0x1
+#define FW_PHY_MGMT_REQ_CMD 0x20
+#define FW_RESET_CMD 0xDF
+#define FW_RESET_LEN 0x2
+#define FW_SETUP_MAC_LINK_CMD 0xE0
+#define FW_SETUP_MAC_LINK_LEN 0x2
+#define FW_FLASH_UPGRADE_START_CMD 0xE3
+#define FW_FLASH_UPGRADE_START_LEN 0x1
+#define FW_FLASH_UPGRADE_WRITE_CMD 0xE4
+#define FW_FLASH_UPGRADE_VERIFY_CMD 0xE5
+#define FW_FLASH_UPGRADE_VERIFY_LEN 0x4
+#define FW_EEPROM_CHECK_STATUS 0xE9
+#define FW_PHY_SIGNAL 0xF0
+
+/* Host Interface Command Structures */
+struct ngbe_hic_hdr {
+ u8 cmd;
+ u8 buf_len;
+ union {
+ u8 cmd_resv;
+ u8 ret_status;
+ } cmd_or_resp;
+ u8 checksum;
+};
+
+struct ngbe_hic_hdr2_req {
+ u8 cmd;
+ u8 buf_lenh;
+ u8 buf_lenl;
+ u8 checksum;
+};
+
+struct ngbe_hic_hdr2_rsp {
+ u8 cmd;
+ u8 buf_lenl;
+ u8 buf_lenh_status; /* 7-5: high bits of buf_len, 4-0: status */
+ u8 checksum;
+};
+
+union ngbe_hic_hdr2 {
+ struct ngbe_hic_hdr2_req req;
+ struct ngbe_hic_hdr2_rsp rsp;
+};
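+
+/*
+ * Example: per the buf_lenh_status comment above, a response carries
+ * an 11-bit buffer length split across buf_lenl and bits 7-5 of
+ * buf_lenh_status, with the return status in bits 4-0:
+ *
+ *	buf_len = ((rsp->buf_lenh_status & 0xE0) << 3) | rsp->buf_lenl;
+ *	status = rsp->buf_lenh_status & 0x1F;
+ */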
+
+struct ngbe_hic_drv_info {
+ struct ngbe_hic_hdr hdr;
+ u8 port_num;
+ u8 ver_sub;
+ u8 ver_build;
+ u8 ver_min;
+ u8 ver_maj;
+ u8 pad; /* end spacing to ensure length is mult. of dword */
+ u16 pad2; /* end spacing to ensure length is mult. of dword */
+};
+
+/* These need to be dword aligned */
+struct ngbe_hic_read_shadow_ram {
+ union ngbe_hic_hdr2 hdr;
+ u32 address;
+ u16 length;
+ u16 pad2;
+ u16 data;
+ u16 pad3;
+};
+
+struct ngbe_hic_write_shadow_ram {
+ union ngbe_hic_hdr2 hdr;
+ u32 address;
+ u16 length;
+ u16 pad2;
+ u16 data;
+ u16 pad3;
+};
+
+struct ngbe_hic_disable_rxen {
+ struct ngbe_hic_hdr hdr;
+ u8 port_number;
+ u8 pad2;
+ u16 pad3;
+};
+
+struct ngbe_hic_reset {
+ struct ngbe_hic_hdr hdr;
+ u16 lan_id;
+ u16 reset_type;
+};
+
+struct ngbe_hic_phy_cfg {
+ struct ngbe_hic_hdr hdr;
+ u8 lan_id;
+ u8 phy_mode;
+ u16 phy_speed;
+};
+
+enum ngbe_module_id {
+ NGBE_MODULE_EEPROM = 0,
+ NGBE_MODULE_FIRMWARE,
+ NGBE_MODULE_HARDWARE,
+ NGBE_MODULE_PCIE
+};
+
+struct ngbe_hic_upg_start {
+ struct ngbe_hic_hdr hdr;
+ u8 module_id;
+ u8 pad2;
+ u16 pad3;
+};
+
+struct ngbe_hic_upg_write {
+ struct ngbe_hic_hdr hdr;
+ u8 data_len;
+ u8 eof_flag;
+ u16 check_sum;
+ u32 data[62];
+};
+
+enum ngbe_upg_flag {
+ NGBE_RESET_NONE = 0,
+ NGBE_RESET_FIRMWARE,
+ NGBE_RELOAD_EEPROM,
+ NGBE_RESET_LAN
+};
+
+struct ngbe_hic_upg_verify {
+ struct ngbe_hic_hdr hdr;
+ u32 action_flag;
+};
+
+/* Number of 100-microsecond intervals we wait for PCI Express master disable */
+#define NGBE_PCI_MASTER_DISABLE_TIMEOUT 800
+
+/* Check whether an address is multicast. This is a little-endian-specific check. */
+#define NGBE_IS_MULTICAST(Address) \
+ (bool)(((u8 *)(Address))[0] & ((u8)0x01))
+
+/* Check whether an address is broadcast. */
+#define NGBE_IS_BROADCAST(Address) \
+ ((((u8 *)(Address))[0] == ((u8)0xff)) && \
+ (((u8 *)(Address))[1] == ((u8)0xff)))
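+
+/*
+ * Example: 01:00:5e:00:00:01 has bit 0 of its first octet set, so
+ * NGBE_IS_MULTICAST() is true; ff:ff:ff:ff:ff:ff also passes
+ * NGBE_IS_BROADCAST(), which only samples the first two octets.
+ */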
+
+/* DCB registers */
+#define NGBE_DCB_MAX_TRAFFIC_CLASS 8
+
+/* Power Management */
+/* DMA Coalescing configuration */
+struct ngbe_dmac_config {
+ u16 watchdog_timer; /* usec units */
+ bool fcoe_en;
+ u32 link_speed;
+ u8 fcoe_tc;
+ u8 num_tcs;
+};
+
+/* Autonegotiation advertised speeds */
+typedef u32 ngbe_autoneg_advertised;
+/* Link speed */
+#define NGBE_LINK_SPEED_UNKNOWN 0
+#define NGBE_LINK_SPEED_100_FULL 1
+#define NGBE_LINK_SPEED_1GB_FULL 2
+#define NGBE_LINK_SPEED_10_FULL 8
+#define NGBE_LINK_SPEED_AUTONEG (NGBE_LINK_SPEED_100_FULL | \
+ NGBE_LINK_SPEED_1GB_FULL | \
+ NGBE_LINK_SPEED_10_FULL)
+
+/* Physical layer type */
+typedef u32 ngbe_physical_layer;
+#define NGBE_PHYSICAL_LAYER_UNKNOWN 0
+#define NGBE_PHYSICAL_LAYER_1000BASE_T 0x0002
+#define NGBE_PHYSICAL_LAYER_100BASE_TX 0x0004
+#define NGBE_PHYSICAL_LAYER_SFP_PLUS_CU 0x0008
+#define NGBE_PHYSICAL_LAYER_1000BASE_KX 0x0200
+#define NGBE_PHYSICAL_LAYER_1000BASE_BX 0x0400
+#define NGBE_PHYSICAL_LAYER_SFP_ACTIVE_DA 0x2000
+#define NGBE_PHYSICAL_LAYER_1000BASE_SX 0x4000
+
+/* Special PHY Init Routine */
+#define NGBE_PHY_INIT_OFFSET_NL 0x002B
+#define NGBE_PHY_INIT_END_NL 0xFFFF
+#define NGBE_CONTROL_MASK_NL 0xF000
+#define NGBE_DATA_MASK_NL 0x0FFF
+#define NGBE_CONTROL_SHIFT_NL 12
+#define NGBE_DELAY_NL 0
+#define NGBE_DATA_NL 1
+#define NGBE_CONTROL_NL 0x000F
+#define NGBE_CONTROL_EOL_NL 0x0FFF
+#define NGBE_CONTROL_SOL_NL 0x0000
+
+/* ethtool */
+#define SPEED_10 10
+#define SPEED_100 100
+#define SPEED_1000 1000
+
+/* Flow Control Data Sheet defined values
+ * Calculation and defines taken from 802.1bb Annex O
+ */
+
+/* BitTimes (BT) conversion */
+#define NGBE_BT2KB(BT) ((BT + (8 * 1024 - 1)) / (8 * 1024))
+#define NGBE_B2BT(BT) (BT * 8)
+
+/* Calculate Delay to respond to PFC */
+#define NGBE_PFC_D 672
+
+/* Calculate Cable Delay */
+#define NGBE_CABLE_DC 5556 /* Delay Copper */
+#define NGBE_CABLE_DO 5000 /* Delay Optical */
+
+/* Calculate Interface Delay X540 */
+#define NGBE_PHY_DC 25600 /* Delay 10G BASET */
+#define NGBE_MAC_DC 8192 /* Delay Copper XAUI interface */
+#define NGBE_XAUI_DC (2 * 2048) /* Delay Copper Phy */
+
+#define NGBE_ID_X540 (NGBE_MAC_DC + NGBE_XAUI_DC + NGBE_PHY_DC)
+
+/* Calculate Interface Delay */
+#define NGBE_PHY_D 12800
+#define NGBE_MAC_D 4096
+#define NGBE_XAUI_D (2 * 1024)
+
+#define NGBE_ID (NGBE_MAC_D + NGBE_XAUI_D + NGBE_PHY_D)
+
+/* Calculate Delay incurred from higher layer */
+#define NGBE_HD 6144
+
+/* Calculate PCI Bus delay for low thresholds */
+#define NGBE_PCI_DELAY 10000
+
+/* Calculate X540 delay value in bit times */
+#define NGBE_DV_X540(_max_frame_link, _max_frame_tc) \
+ ((36 * \
+ (NGBE_B2BT(_max_frame_link) + \
+ NGBE_PFC_D + \
+ (2 * NGBE_CABLE_DC) + \
+ (2 * NGBE_ID_X540) + \
+ NGBE_HD) / 25 + 1) + \
+ 2 * NGBE_B2BT(_max_frame_tc))
+
+
+/* Calculate delay value in bit times */
+#define NGBE_DV(_max_frame_link, _max_frame_tc) \
+ ((36 * \
+ (NGBE_B2BT(_max_frame_link) + \
+ NGBE_PFC_D + \
+ (2 * NGBE_CABLE_DC) + \
+ (2 * NGBE_ID) + \
+ NGBE_HD) / 25 + 1) + \
+ 2 * NGBE_B2BT(_max_frame_tc))
+
+/* Calculate low threshold delay values */
+#define NGBE_LOW_DV_X540(_max_frame_tc) \
+ (2 * NGBE_B2BT(_max_frame_tc) + \
+ (36 * NGBE_PCI_DELAY / 25) + 1)
+
+#define NGBE_LOW_DV(_max_frame_tc) \
+ (2 * NGBE_LOW_DV_X540(_max_frame_tc))
+
+/*
+ * Unavailable: The FCoE Boot Option ROM is not present in the flash.
+ * Disabled: Present; boot order is not set for any targets on the port.
+ * Enabled: Present; boot order is set for at least one target on the port.
+ */
+enum ngbe_fcoe_boot_status {
+ ngbe_fcoe_bootstatus_disabled = 0,
+ ngbe_fcoe_bootstatus_enabled = 1,
+ ngbe_fcoe_bootstatus_unavailable = 0xFFFF
+};
+
+enum ngbe_eeprom_type {
+ ngbe_eeprom_uninitialized = 0,
+ ngbe_eeprom_spi,
+ ngbe_flash,
+ ngbe_eeprom_none /* No NVM support */
+};
+
+enum ngbe_phy_type {
+ ngbe_phy_unknown = 0,
+ ngbe_phy_none,
+ ngbe_phy_internal,
+ ngbe_phy_m88e1512,
+ ngbe_phy_m88e1512_sfi,
+ ngbe_phy_yt8521s,
+ ngbe_phy_yt8521s_sfi,
+ ngbe_phy_zte,
+ ngbe_phy_sfp_passive_tyco,
+ ngbe_phy_sfp_passive_unknown,
+ ngbe_phy_sfp_active_unknown,
+ ngbe_phy_sfp_avago,
+ ngbe_phy_sfp_ftl,
+ ngbe_phy_sfp_ftl_active,
+ ngbe_phy_sfp_unknown,
+ ngbe_phy_sfp_intel,
+ ngbe_phy_sfp_unsupported, /*Enforce bit set with unsupported module*/
+ ngbe_phy_generic
+};
+
+/*
+ * SFP+ module type IDs:
+ *
+ * ID Module Type
+ * =============
+ * 0 SFP_DA_CU
+ * 1 SFP_SR
+ * 2 SFP_LR
+ * 3 SFP_DA_CU_CORE0
+ * 4 SFP_DA_CU_CORE1
+ * 5 SFP_SR/LR_CORE0
+ * 6 SFP_SR/LR_CORE1
+ */
+enum ngbe_sfp_type {
+ ngbe_sfp_type_da_cu = 0,
+ ngbe_sfp_type_sr = 1,
+ ngbe_sfp_type_lr = 2,
+ ngbe_sfp_type_da_cu_core0 = 3,
+ ngbe_sfp_type_da_cu_core1 = 4,
+ ngbe_sfp_type_srlr_core0 = 5,
+ ngbe_sfp_type_srlr_core1 = 6,
+ ngbe_sfp_type_da_act_lmt_core0 = 7,
+ ngbe_sfp_type_da_act_lmt_core1 = 8,
+ ngbe_sfp_type_1g_cu_core0 = 9,
+ ngbe_sfp_type_1g_cu_core1 = 10,
+ ngbe_sfp_type_1g_sx_core0 = 11,
+ ngbe_sfp_type_1g_sx_core1 = 12,
+ ngbe_sfp_type_1g_lx_core0 = 13,
+ ngbe_sfp_type_1g_lx_core1 = 14,
+ ngbe_sfp_type_not_present = 0xFFFE,
+ ngbe_sfp_type_unknown = 0xFFFF
+};
+
+enum ngbe_media_type {
+ ngbe_media_type_unknown = 0,
+ ngbe_media_type_fiber,
+ ngbe_media_type_copper,
+ ngbe_media_type_backplane,
+ ngbe_media_type_virtual
+};
+
+/* Flow Control Settings */
+enum ngbe_fc_mode {
+ ngbe_fc_none = 0,
+ ngbe_fc_rx_pause,
+ ngbe_fc_tx_pause,
+ ngbe_fc_full,
+ ngbe_fc_default
+};
+
+/* Smart Speed Settings */
+#define NGBE_SMARTSPEED_MAX_RETRIES 3
+enum ngbe_smart_speed {
+ ngbe_smart_speed_auto = 0,
+ ngbe_smart_speed_on,
+ ngbe_smart_speed_off
+};
+
+/* PCI bus types */
+enum ngbe_bus_type {
+ ngbe_bus_type_unknown = 0,
+ ngbe_bus_type_pci,
+ ngbe_bus_type_pcix,
+ ngbe_bus_type_pci_express,
+ ngbe_bus_type_internal,
+ ngbe_bus_type_reserved
+};
+
+/* PCI bus speeds */
+enum ngbe_bus_speed {
+ ngbe_bus_speed_unknown = 0,
+ ngbe_bus_speed_33 = 33,
+ ngbe_bus_speed_66 = 66,
+ ngbe_bus_speed_100 = 100,
+ ngbe_bus_speed_120 = 120,
+ ngbe_bus_speed_133 = 133,
+ ngbe_bus_speed_2500 = 2500,
+ ngbe_bus_speed_5000 = 5000,
+ ngbe_bus_speed_8000 = 8000,
+ ngbe_bus_speed_reserved
+};
+
+/* PCI bus widths */
+enum ngbe_bus_width {
+ ngbe_bus_width_unknown = 0,
+ ngbe_bus_width_pcie_x1 = 1,
+ ngbe_bus_width_pcie_x2 = 2,
+ ngbe_bus_width_pcie_x4 = 4,
+ ngbe_bus_width_pcie_x8 = 8,
+ ngbe_bus_width_32 = 32,
+ ngbe_bus_width_64 = 64,
+ ngbe_bus_width_reserved
+};
+
+struct ngbe_addr_filter_info {
+ u32 num_mc_addrs;
+ u32 rar_used_count;
+ u32 mta_in_use;
+ u32 overflow_promisc;
+ bool user_set_promisc;
+};
+
+/* Bus parameters */
+struct ngbe_bus_info {
+ enum ngbe_bus_speed speed;
+ enum ngbe_bus_width width;
+ enum ngbe_bus_type type;
+
+ u16 func;
+ u16 lan_id;
+};
+
+/* Flow control parameters */
+struct ngbe_fc_info {
+ u32 high_water; /* Flow Ctrl High-water */
+ u32 low_water; /* Flow Ctrl Low-water */
+ u16 pause_time; /* Flow Control Pause timer */
+ bool send_xon; /* Flow control send XON */
+ bool strict_ieee; /* Strict IEEE mode */
+ bool disable_fc_autoneg; /* Do not autonegotiate FC */
+ bool fc_was_autonegged; /* Is current_mode the result of autonegging? */
+ enum ngbe_fc_mode current_mode; /* FC mode in effect */
+ enum ngbe_fc_mode requested_mode; /* FC mode requested by caller */
+};
+
+/* Statistics counters collected by the MAC */
+struct ngbe_hw_stats {
+ u64 crcerrs;
+ u64 illerrc;
+ u64 errbc;
+ u64 mspdc;
+ u64 mpctotal;
+ u64 mpc[8];
+ u64 mlfc;
+ u64 mrfc;
+ u64 rlec;
+ u64 lxontxc;
+ u64 lxonrxc;
+ u64 lxofftxc;
+ u64 lxoffrxc;
+ u64 pxontxc[8];
+ u64 pxonrxc[8];
+ u64 pxofftxc[8];
+ u64 pxoffrxc[8];
+ u64 prc64;
+ u64 prc127;
+ u64 prc255;
+ u64 prc511;
+ u64 prc1023;
+ u64 prc1522;
+ u64 gprc;
+ u64 bprc;
+ u64 mprc;
+ u64 gptc;
+ u64 gorc;
+ u64 gotc;
+ u64 rnbc[8];
+ u64 ruc;
+ u64 rfc;
+ u64 roc;
+ u64 rjc;
+ u64 mngprc;
+ u64 mngpdc;
+ u64 mngptc;
+ u64 tor;
+ u64 tpr;
+ u64 tpt;
+ u64 ptc64;
+ u64 ptc127;
+ u64 ptc255;
+ u64 ptc511;
+ u64 ptc1023;
+ u64 ptc1522;
+ u64 mptc;
+ u64 bptc;
+ u64 xec;
+ u64 qprc[16];
+ u64 qptc[16];
+ u64 qbrc[16];
+ u64 qbtc[16];
+ u64 qprdc[16];
+ u64 pxon2offc[8];
+ u64 fccrc;
+ u64 fclast;
+ u64 fcoerpdc;
+ u64 fcoeprc;
+ u64 fcoeptc;
+ u64 fcoedwrc;
+ u64 fcoedwtc;
+ u64 fcoe_noddp;
+ u64 fcoe_noddp_ext_buff;
+ u64 ldpcec;
+ u64 pcrc8ec;
+ u64 b2ospc;
+ u64 b2ogprc;
+ u64 o2bgptc;
+ u64 o2bspc;
+};
+
+/* forward declaration */
+struct ngbe_hw;
+
+/* iterator type for walking multicast address lists */
+typedef u8* (*ngbe_mc_addr_itr) (struct ngbe_hw *hw, u8 **mc_addr_ptr,
+ u32 *vmdq);
+
+/* Function pointer table */
+struct ngbe_eeprom_operations {
+ s32 (*init_params)(struct ngbe_hw *);
+ s32 (*read)(struct ngbe_hw *, u16, u16 *);
+ s32 (*read_buffer)(struct ngbe_hw *, u16, u16, u16 *);
+ s32 (*read32)(struct ngbe_hw *, u16, u32 *);
+ s32 (*write)(struct ngbe_hw *, u16, u16);
+ s32 (*write_buffer)(struct ngbe_hw *, u16, u16, u16 *);
+ s32 (*validate_checksum)(struct ngbe_hw *, u16 *);
+ s32 (*update_checksum)(struct ngbe_hw *);
+ s32 (*calc_checksum)(struct ngbe_hw *);
+ s32 (*eeprom_chksum_cap_st)(struct ngbe_hw *, u16, u32 *);
+ s32 (*phy_signal_set)(struct ngbe_hw *);
+};
+
+struct ngbe_flash_operations {
+ s32 (*init_params)(struct ngbe_hw *);
+ s32 (*read_buffer)(struct ngbe_hw *, u32, u32, u32 *);
+ s32 (*write_buffer)(struct ngbe_hw *, u32, u32, u32 *);
+};
+
+struct ngbe_mac_operations {
+ s32 (*init_hw)(struct ngbe_hw *);
+ s32 (*reset_hw)(struct ngbe_hw *);
+ s32 (*start_hw)(struct ngbe_hw *);
+ s32 (*clear_hw_cntrs)(struct ngbe_hw *);
+ enum ngbe_media_type (*get_media_type)(struct ngbe_hw *);
+ s32 (*get_mac_addr)(struct ngbe_hw *, u8 *);
+ s32 (*get_device_caps)(struct ngbe_hw *, u16 *);
+ s32 (*stop_adapter)(struct ngbe_hw *);
+ s32 (*get_bus_info)(struct ngbe_hw *);
+ void (*set_lan_id)(struct ngbe_hw *);
+ s32 (*enable_rx_dma)(struct ngbe_hw *, u32);
+ s32 (*disable_sec_rx_path)(struct ngbe_hw *);
+ s32 (*enable_sec_rx_path)(struct ngbe_hw *);
+ s32 (*acquire_swfw_sync)(struct ngbe_hw *, u32);
+ void (*release_swfw_sync)(struct ngbe_hw *, u32);
+
+ /* Link */
+ void (*disable_tx_laser)(struct ngbe_hw *);
+ void (*enable_tx_laser)(struct ngbe_hw *);
+ void (*flap_tx_laser)(struct ngbe_hw *);
+ s32 (*setup_link)(struct ngbe_hw *, u32, bool);
+ s32 (*setup_mac_link)(struct ngbe_hw *, u32, bool);
+ s32 (*check_link)(struct ngbe_hw *, u32 *, bool *, bool);
+ s32 (*get_link_capabilities)(struct ngbe_hw *, u32 *,
+ bool *);
+ void (*set_rate_select_speed)(struct ngbe_hw *, u32);
+
+ /* Packet Buffer manipulation */
+ void (*setup_rxpba)(struct ngbe_hw *, int, u32, int);
+
+ /* LED */
+ s32 (*led_on)(struct ngbe_hw *, u32);
+ s32 (*led_off)(struct ngbe_hw *, u32);
+
+ /* RAR, Multicast, VLAN */
+ s32 (*set_rar)(struct ngbe_hw *, u32, u8 *, u64, u32);
+ s32 (*clear_rar)(struct ngbe_hw *, u32);
+ s32 (*insert_mac_addr)(struct ngbe_hw *, u8 *, u32);
+ s32 (*set_vmdq)(struct ngbe_hw *, u32, u32);
+ s32 (*set_vmdq_san_mac)(struct ngbe_hw *, u32);
+ s32 (*clear_vmdq)(struct ngbe_hw *, u32, u32);
+ s32 (*init_rx_addrs)(struct ngbe_hw *);
+ s32 (*update_uc_addr_list)(struct ngbe_hw *, u8 *, u32,
+ ngbe_mc_addr_itr);
+ s32 (*update_mc_addr_list)(struct ngbe_hw *, u8 *, u32,
+ ngbe_mc_addr_itr, bool clear);
+ s32 (*enable_mc)(struct ngbe_hw *);
+ s32 (*disable_mc)(struct ngbe_hw *);
+ s32 (*clear_vfta)(struct ngbe_hw *);
+ s32 (*set_vfta)(struct ngbe_hw *, u32, u32, bool);
+ s32 (*set_vlvf)(struct ngbe_hw *, u32, u32, bool, bool *);
+ s32 (*init_uta_tables)(struct ngbe_hw *);
+ void (*set_mac_anti_spoofing)(struct ngbe_hw *, bool, int);
+ void (*set_vlan_anti_spoofing)(struct ngbe_hw *, bool, int);
+
+ /* Flow Control */
+ s32 (*fc_enable)(struct ngbe_hw *);
+ s32 (*setup_fc)(struct ngbe_hw *);
+
+ /* Manageability interface */
+ s32 (*set_fw_drv_ver)(struct ngbe_hw *, u8, u8, u8, u8);
+ s32 (*get_thermal_sensor_data)(struct ngbe_hw *);
+ s32 (*init_thermal_sensor_thresh)(struct ngbe_hw *hw);
+ void (*get_rtrup2tc)(struct ngbe_hw *hw, u8 *map);
+ void (*disable_rx)(struct ngbe_hw *hw);
+ void (*enable_rx)(struct ngbe_hw *hw);
+ void (*set_source_address_pruning)(struct ngbe_hw *, bool,
+ unsigned int);
+ void (*set_ethertype_anti_spoofing)(struct ngbe_hw *, bool, int);
+ s32 (*dmac_config)(struct ngbe_hw *hw);
+ s32 (*setup_eee)(struct ngbe_hw *hw, bool enable_eee);
+};
+
+struct ngbe_phy_operations {
+ s32 (*identify)(struct ngbe_hw *);
+ s32 (*identify_sfp)(struct ngbe_hw *);
+ s32 (*init)(struct ngbe_hw *);
+ s32 (*reset)(struct ngbe_hw *);
+ s32 (*read_reg)(struct ngbe_hw *, u32, u32, u16 *);
+ s32 (*write_reg)(struct ngbe_hw *, u32, u32, u16);
+ s32 (*read_reg_mdi)(struct ngbe_hw *, u32, u32, u16 *);
+ s32 (*write_reg_mdi)(struct ngbe_hw *, u32, u32, u16);
+ u32 (*setup_link)(struct ngbe_hw *, u32, bool);
+ s32 (*setup_internal_link)(struct ngbe_hw *);
+ u32 (*setup_link_speed)(struct ngbe_hw *, u32, bool);
+ s32 (*check_link)(struct ngbe_hw *, u32 *, bool *);
+ s32 (*check_overtemp)(struct ngbe_hw *);
+ s32 (*check_event)(struct ngbe_hw *);
+ s32 (*get_adv_pause)(struct ngbe_hw *, u8 *);
+ s32 (*get_lp_adv_pause)(struct ngbe_hw *, u8 *);
+ s32 (*set_adv_pause)(struct ngbe_hw *, u16);
+ s32 (*setup_once)(struct ngbe_hw *);
+};
+
+struct ngbe_eeprom_info {
+ struct ngbe_eeprom_operations ops;
+ enum ngbe_eeprom_type type;
+ u32 semaphore_delay;
+ u16 word_size;
+ u16 address_bits;
+ u16 word_page_size;
+ u16 ctrl_word_3;
+ u16 sw_region_offset;
+};
+
+struct ngbe_flash_info {
+ struct ngbe_flash_operations ops;
+ u32 semaphore_delay;
+ u32 dword_size;
+ u16 address_bits;
+};
+
+#define NGBE_FLAGS_DOUBLE_RESET_REQUIRED 0x01
+struct ngbe_mac_info {
+ struct ngbe_mac_operations ops;
+ u8 addr[NGBE_ETH_LENGTH_OF_ADDRESS];
+ u8 perm_addr[NGBE_ETH_LENGTH_OF_ADDRESS];
+ u8 san_addr[NGBE_ETH_LENGTH_OF_ADDRESS];
+ /* prefix for World Wide Node Name (WWNN) */
+ u16 wwnn_prefix;
+ /* prefix for World Wide Port Name (WWPN) */
+ u16 wwpn_prefix;
+#define NGBE_MAX_MTA 128
+#define NGBE_MAX_VFTA_ENTRIES 128
+ u32 mta_shadow[NGBE_MAX_MTA];
+ s32 mc_filter_type;
+ u32 mcft_size;
+ u32 vft_shadow[NGBE_MAX_VFTA_ENTRIES];
+ u32 vft_size;
+ u32 num_rar_entries;
+ u32 rar_highwater;
+ u32 rx_pb_size;
+ u32 max_tx_queues;
+ u32 max_rx_queues;
+ u32 orig_sr_pcs_ctl2;
+ u32 orig_sr_pma_mmd_ctl1;
+ u32 orig_sr_an_mmd_ctl;
+ u32 orig_sr_an_mmd_adv_reg2;
+ u32 orig_vr_xs_or_pcs_mmd_digi_ctl1;
+ u8 san_mac_rar_index;
+ bool get_link_status;
+ u16 max_msix_vectors;
+ bool arc_subsystem_valid;
+ bool orig_link_settings_stored;
+ bool autotry_restart;
+ u8 flags;
+ struct ngbe_thermal_sensor_data thermal_sensor_data;
+ bool thermal_sensor_enabled;
+ struct ngbe_dmac_config dmac_config;
+ bool set_lben;
+ bool autoneg;
+};
+
+struct ngbe_phy_info {
+ struct ngbe_phy_operations ops;
+ enum ngbe_phy_type type;
+ u32 addr;
+ u32 id;
+ enum ngbe_sfp_type sfp_type;
+ bool sfp_setup_needed;
+ u32 revision;
+ enum ngbe_media_type media_type;
+ u32 phy_semaphore_mask;
+ u8 lan_id; /* to be deleted */
+ ngbe_autoneg_advertised autoneg_advertised;
+ enum ngbe_smart_speed smart_speed;
+ bool smart_speed_active;
+ bool multispeed_fiber;
+ bool reset_if_overtemp;
+ ngbe_physical_layer link_mode;
+};
+
+#include "ngbe_mbx.h"
+
+struct ngbe_mbx_operations {
+ void (*init_params)(struct ngbe_hw *hw);
+ s32 (*read)(struct ngbe_hw *, u32 *, u16, u16);
+ s32 (*write)(struct ngbe_hw *, u32 *, u16, u16);
+ s32 (*read_posted)(struct ngbe_hw *, u32 *, u16, u16);
+ s32 (*write_posted)(struct ngbe_hw *, u32 *, u16, u16);
+ s32 (*check_for_msg)(struct ngbe_hw *, u16);
+ s32 (*check_for_ack)(struct ngbe_hw *, u16);
+ s32 (*check_for_rst)(struct ngbe_hw *, u16);
+};
+
+struct ngbe_mbx_stats {
+ u32 msgs_tx;
+ u32 msgs_rx;
+
+ u32 acks;
+ u32 reqs;
+ u32 rsts;
+};
+
+struct ngbe_mbx_info {
+ struct ngbe_mbx_operations ops;
+ struct ngbe_mbx_stats stats;
+ u32 timeout;
+ u32 udelay;
+ u32 v2p_mailbox;
+ u16 size;
+};
+
+enum ngbe_reset_type {
+ NGBE_LAN_RESET = 0,
+ NGBE_SW_RESET,
+ NGBE_GLOBAL_RESET
+};
+
+enum ngbe_link_status {
+ NGBE_LINK_STATUS_NONE = 0,
+ NGBE_LINK_STATUS_KX,
+ NGBE_LINK_STATUS_KX4
+};
+
+struct ngbe_hw {
+ u8 __iomem *hw_addr;
+ void *back;
+ struct ngbe_mac_info mac;
+ struct ngbe_addr_filter_info addr_ctrl;
+ struct ngbe_fc_info fc;
+ struct ngbe_phy_info phy;
+ struct ngbe_eeprom_info eeprom;
+ struct ngbe_flash_info flash;
+ struct ngbe_bus_info bus;
+ struct ngbe_mbx_info mbx;
+ u16 device_id;
+ u16 vendor_id;
+ u16 subsystem_device_id;
+ u16 subsystem_vendor_id;
+ u8 revision_id;
+ bool adapter_stopped;
+ int api_version;
+ enum ngbe_reset_type reset_type;
+ bool force_full_reset;
+ bool allow_unsupported_sfp;
+ bool wol_enabled;
+ enum ngbe_link_status link_status;
+ u16 tpid[8];
+};
+
+#define TCALL(hw, func, args...) (((hw)->func != NULL) \
+ ? (hw)->func((hw), ##args) : NGBE_NOT_IMPLEMENTED)
+
+/* Error Codes */
+#define NGBE_OK 0
+#define NGBE_ERR 100
+#define NGBE_NOT_IMPLEMENTED 0x7FFFFFFF
+/* (-NGBE_ERR, NGBE_ERR): reserved for non-ngbe defined error code */
+#define NGBE_ERR_NOSUPP -(NGBE_ERR+0)
+#define NGBE_ERR_EEPROM -(NGBE_ERR+1)
+#define NGBE_ERR_EEPROM_CHECKSUM -(NGBE_ERR+2)
+#define NGBE_ERR_PHY -(NGBE_ERR+3)
+#define NGBE_ERR_CONFIG -(NGBE_ERR+4)
+#define NGBE_ERR_PARAM -(NGBE_ERR+5)
+#define NGBE_ERR_MAC_TYPE -(NGBE_ERR+6)
+#define NGBE_ERR_UNKNOWN_PHY -(NGBE_ERR+7)
+#define NGBE_ERR_LINK_SETUP -(NGBE_ERR+8)
+#define NGBE_ERR_ADAPTER_STOPPED -(NGBE_ERR+9)
+#define NGBE_ERR_INVALID_MAC_ADDR -(NGBE_ERR+10)
+#define NGBE_ERR_DEVICE_NOT_SUPPORTED -(NGBE_ERR+11)
+#define NGBE_ERR_MASTER_REQUESTS_PENDING -(NGBE_ERR+12)
+#define NGBE_ERR_INVALID_LINK_SETTINGS -(NGBE_ERR+13)
+#define NGBE_ERR_AUTONEG_NOT_COMPLETE -(NGBE_ERR+14)
+#define NGBE_ERR_RESET_FAILED -(NGBE_ERR+15)
+#define NGBE_ERR_SWFW_SYNC -(NGBE_ERR+16)
+#define NGBE_ERR_PHY_ADDR_INVALID -(NGBE_ERR+17)
+#define NGBE_ERR_I2C -(NGBE_ERR+18)
+#define NGBE_ERR_SFP_NOT_SUPPORTED -(NGBE_ERR+19)
+#define NGBE_ERR_SFP_NOT_PRESENT -(NGBE_ERR+20)
+#define NGBE_ERR_SFP_NO_INIT_SEQ_PRESENT -(NGBE_ERR+21)
+#define NGBE_ERR_NO_SAN_ADDR_PTR -(NGBE_ERR+22)
+#define NGBE_ERR_FDIR_REINIT_FAILED -(NGBE_ERR+23)
+#define NGBE_ERR_EEPROM_VERSION -(NGBE_ERR+24)
+#define NGBE_ERR_NO_SPACE -(NGBE_ERR+25)
+#define NGBE_ERR_OVERTEMP -(NGBE_ERR+26)
+#define NGBE_ERR_UNDERTEMP -(NGBE_ERR+27)
+#define NGBE_ERR_FC_NOT_NEGOTIATED -(NGBE_ERR+28)
+#define NGBE_ERR_FC_NOT_SUPPORTED -(NGBE_ERR+29)
+#define NGBE_ERR_SFP_SETUP_NOT_COMPLETE -(NGBE_ERR+30)
+#define NGBE_ERR_PBA_SECTION -(NGBE_ERR+31)
+#define NGBE_ERR_INVALID_ARGUMENT -(NGBE_ERR+32)
+#define NGBE_ERR_HOST_INTERFACE_COMMAND -(NGBE_ERR+33)
+#define NGBE_ERR_OUT_OF_MEM -(NGBE_ERR+34)
+#define NGBE_ERR_FEATURE_NOT_SUPPORTED -(NGBE_ERR+36)
+#define NGBE_ERR_EEPROM_PROTECTED_REGION -(NGBE_ERR+37)
+#define NGBE_ERR_FDIR_CMD_INCOMPLETE -(NGBE_ERR+38)
+#define NGBE_ERR_FLASH_LOADING_FAILED -(NGBE_ERR+39)
+#define NGBE_ERR_XPCS_POWER_UP_FAILED -(NGBE_ERR+40)
+#define NGBE_ERR_FW_RESP_INVALID -(NGBE_ERR+41)
+#define NGBE_ERR_PHY_INIT_NOT_DONE -(NGBE_ERR+42)
+#define NGBE_ERR_TIMEOUT -(NGBE_ERR+43)
+#define NGBE_ERR_TOKEN_RETRY -(NGBE_ERR+44)
+#define NGBE_ERR_REGISTER -(NGBE_ERR+45)
+#define NGBE_ERR_MBX -(NGBE_ERR+46)
+#define NGBE_ERR_MNG_ACCESS_FAILED -(NGBE_ERR+47)
+#define NGBE_ERR_PHY_TYPE -(NGBE_ERR+48)
+#define NGBE_ERR_PHY_TIMEOUT -(NGBE_ERR+49)
+
+/**
+ * register operations
+ **/
+/* read register */
+#define NGBE_DEAD_READ_RETRIES 10
+#define NGBE_DEAD_READ_REG 0xdeadbeefU
+#define NGBE_DEAD_READ_REG64 0xdeadbeefdeadbeefULL
+
+#define NGBE_FAILED_READ_REG 0xffffffffU
+#define NGBE_FAILED_READ_REG64 0xffffffffffffffffULL
+
+static inline bool NGBE_REMOVED(void __iomem *addr)
+{
+ return unlikely(!addr);
+}
+
+static inline u32
+ngbe_rd32(u8 __iomem *base)
+{
+ return readl(base);
+}
+
+static inline u32
+rd32(struct ngbe_hw *hw, u32 reg)
+{
+ u8 __iomem *base = READ_ONCE(hw->hw_addr);
+ u32 val = NGBE_FAILED_READ_REG;
+
+ if (unlikely(!base))
+ return val;
+
+ val = ngbe_rd32(base + reg);
+
+ return val;
+}
+#define rd32a(a, reg, offset) ( \
+ rd32((a), (reg) + ((offset) << 2)))
+
+static inline u32
+rd32m(struct ngbe_hw *hw, u32 reg, u32 mask)
+{
+ u8 __iomem *base = READ_ONCE(hw->hw_addr);
+ u32 val = NGBE_FAILED_READ_REG;
+
+ if (unlikely(!base))
+ return val;
+
+ val = ngbe_rd32(base + reg);
+ if (unlikely(val == NGBE_FAILED_READ_REG))
+ return val;
+
+ return val & mask;
+}
+
+/* write register */
+static inline void
+ngbe_wr32(u8 __iomem *base, u32 val)
+{
+ writel(val, base);
+}
+
+static inline void
+wr32(struct ngbe_hw *hw, u32 reg, u32 val)
+{
+ u8 __iomem *base = READ_ONCE(hw->hw_addr);
+
+ if (unlikely(!base))
+ return;
+
+ ngbe_wr32(base + reg, val);
+}
+#define wr32a(a, reg, off, val) \
+ wr32((a), (reg) + ((off) << 2), (val))
+
+static inline void
+wr32m(struct ngbe_hw *hw, u32 reg, u32 mask, u32 field)
+{
+ u8 __iomem *base = READ_ONCE(hw->hw_addr);
+ u32 val;
+
+ if (unlikely(!base))
+ return;
+
+ val = ngbe_rd32(base + reg);
+ if (unlikely(val == NGBE_FAILED_READ_REG))
+ return;
+
+ val = ((val & ~mask) | (field & mask));
+ ngbe_wr32(base + reg, val);
+}
+
+/* poll register */
+#define NGBE_MDIO_TIMEOUT 1000
+#define NGBE_I2C_TIMEOUT 1000
+#define NGBE_SPI_TIMEOUT 1000
+static inline s32
+po32m(struct ngbe_hw *hw, u32 reg,
+ u32 mask, u32 field, int usecs, int count)
+{
+ int loop;
+
+ loop = (count ? count : (usecs + 9) / 10);
+ usecs = (loop ? (usecs + loop - 1) / loop : 0);
+
+ count = loop;
+ do {
+ u32 value = rd32(hw, reg);
+ if ((value & mask) == (field & mask)) {
+ break;
+ }
+
+ if (loop-- <= 0)
+ break;
+
+ udelay(usecs);
+ } while (true);
+
+ return (count - loop <= count ? 0 : NGBE_ERR_TIMEOUT);
+}
+
+#define NGBE_WRITE_FLUSH(H) rd32(H, NGBE_MIS_PWR)
+
+#endif /* _NGBE_TYPE_H_ */
--
2.25.1
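For illustration, the accessors and the TCALL() dispatch macro defined in this header compose as in the minimal sketch below. This is a hedged example, not driver code: EXAMPLE_CTL_REG and example_start() are hypothetical names.

/* Sketch: read-modify-write a register, then dispatch through the
 * function-pointer table. Assumes ngbe_type.h (above) is included. */
#define EXAMPLE_CTL_REG 0x10000	/* hypothetical register offset */

static s32 example_start(struct ngbe_hw *hw)
{
	u32 ctl;

	/* set bit 0 without disturbing the other bits */
	wr32m(hw, EXAMPLE_CTL_REG, 0x1, 0x1);

	ctl = rd32(hw, EXAMPLE_CTL_REG);
	if (ctl == NGBE_FAILED_READ_REG)
		return NGBE_ERR_REGISTER;	/* adapter likely removed */

	/* returns NGBE_NOT_IMPLEMENTED when the op is not wired up */
	return TCALL(hw, mac.ops.start_hw);
}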
[PATCH openEuler-1.0-LTS] ras: report cpu logical index to userspace in arm event
by Yang Yingliang 03 Dec '21
From: Lostwayzxc <luoshengwei(a)huawei.com>
kunpeng inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IG00?from=project-issue
CVE: NA
When an ARM event is reported, rasdaemon needs to know the CPU logical index,
but the record carries only the MPIDR, with no mapping between it and the
logical index. Since the kernel already stores that mapping, get the logical
index via get_logical_index() and report it directly to userspace over the
perf interface.
Signed-off-by: Lostwayzxc <luoshengwei(a)huawei.com>
Reviewed-by: Lv Ying <lvying6(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/ras/ras.c | 8 +++++++-
include/linux/ras.h | 11 +++++++++++
include/ras/ras_event.h | 10 +++++++---
3 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index 9302ed7f42588..a526f124a5ff8 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -32,6 +32,7 @@ void log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev)
struct cper_arm_err_info *err_info;
struct cper_arm_ctx_info *ctx_info;
int n, sz;
+ int cpu;
pei_len = sizeof(struct cper_arm_err_info) * err->err_info_num;
pei_err = (u8 *)err + sizeof(struct cper_sec_proc_arm);
@@ -58,8 +59,13 @@ void log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev)
}
ven_err_data = (u8 *)ctx_info;
+ cpu = GET_LOGICAL_INDEX(err->mpidr);
+ /* when the return value is invalid, set cpu index to a large integer */
+ if (cpu < 0)
+ cpu = 0xFFFF;
+
trace_arm_event(err, pei_err, pei_len, ctx_err, ctx_len,
- ven_err_data, vsei_len, sev);
+ ven_err_data, vsei_len, sev, cpu);
}
static int __init ras_init(void)
diff --git a/include/linux/ras.h b/include/linux/ras.h
index 3431b4a5fa42d..e5ec31ad7a132 100644
--- a/include/linux/ras.h
+++ b/include/linux/ras.h
@@ -40,4 +40,15 @@ static inline void
log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev) { return; }
#endif
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
+#include <asm/smp_plat.h>
+/*
+ * Include ARM specific SMP header which provides a function mapping mpidr to
+ * cpu logical index.
+ */
+#define GET_LOGICAL_INDEX(mpidr) get_logical_index(mpidr & MPIDR_HWID_BITMASK)
+#else
+#define GET_LOGICAL_INDEX(mpidr) -EINVAL
+#endif /* CONFIG_ARM || CONFIG_ARM64 */
+
#endif /* __RAS_H__ */
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index 7c8cb123ba32d..2d6a662886e6d 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -182,9 +182,10 @@ TRACE_EVENT(arm_event,
const u32 ctx_len,
const u8 *oem,
const u32 oem_len,
- u8 sev),
+ u8 sev,
+ int cpu),
- TP_ARGS(proc, pei_err, pei_len, ctx_err, ctx_len, oem, oem_len, sev),
+ TP_ARGS(proc, pei_err, pei_len, ctx_err, ctx_len, oem, oem_len, sev, cpu),
TP_STRUCT__entry(
__field(u64, mpidr)
@@ -199,6 +200,7 @@ TRACE_EVENT(arm_event,
__field(u32, oem_len)
__dynamic_array(u8, buf2, oem_len)
__field(u8, sev)
+ __field(int, cpu)
),
TP_fast_assign(
@@ -225,11 +227,13 @@ TRACE_EVENT(arm_event,
__entry->oem_len = oem_len;
memcpy(__get_dynamic_array(buf2), oem, oem_len);
__entry->sev = sev;
+ __entry->cpu = cpu;
),
- TP_printk("error: %d; affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
+ TP_printk("cpu: %d; error: %d; affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
"running state: %d; PSCI state: %d; "
"%s: %d; %s: %s; %s: %d; %s: %s; %s: %d; %s: %s",
+ __entry->cpu,
__entry->sev,
__entry->affinity, __entry->mpidr, __entry->midr,
__entry->running_state, __entry->psci_state,
--
2.25.1
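The guard added above generalizes to any consumer of firmware-reported MPIDRs. A hedged sketch, where example_report_cpu() is a hypothetical name and only GET_LOGICAL_INDEX() comes from the patch:

#include <linux/ras.h>

/* Map an MPIDR to a logical CPU index, falling back to a sentinel. */
static int example_report_cpu(u64 mpidr)
{
	int cpu = GET_LOGICAL_INDEX(mpidr);

	/* get_logical_index() returns a negative value for an unknown
	 * MPIDR; 0xFFFF is out of range for any real logical index,
	 * so userspace can recognize the failure unambiguously. */
	return cpu < 0 ? 0xFFFF : cpu;
}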
[Meeting Notice] openEuler kernel tech-sharing session #15 & biweekly meeting Time: 2021-12-03 14:00-16:30
by Meeting Book 02 Dec '21
[PATCH openEuler-1.0-LTS 1/2] hugetlb: before freeing hugetlb page set dtor to appropriate value
by Yang Yingliang 02 Dec '21
From: Mike Kravetz <mike.kravetz(a)oracle.com>
mainline inclusion
from mainline-5.15-rc1
commit e32d20c0c88b1cd0a44f882c4f0eb2f536363d1b
category: bugfix
bugzilla: 180680
CVE: NA
---------------------------
When removing a hugetlb page from the pool the ref count is set to one (as
the free page has no ref count) and compound page destructor is set to
NULL_COMPOUND_DTOR. Since a subsequent call to free the hugetlb page will
call __free_pages for non-gigantic pages and free_gigantic_page for
gigantic pages the destructor is not used.
However, consider the following race with code taking a speculative
reference on the page:
Thread 0                                  Thread 1
--------                                  --------
remove_hugetlb_page
  set_page_refcounted(page);
  set_compound_page_dtor(page,
           NULL_COMPOUND_DTOR);
                                          get_page_unless_zero(page)
__update_and_free_page
  __free_pages(page,
           huge_page_order(h));
          /* Note that __free_pages() will simply drop
             the reference to the page. */
                                          put_page(page)
                                            __put_compound_page()
                                              destroy_compound_page
                                                NULL_COMPOUND_DTOR
                                                  BUG: kernel NULL pointer
                                                  dereference, address:
                                                  0000000000000000
To address this race, set the dtor to the normal compound page dtor for
non-gigantic pages. The dtor for gigantic pages does not matter as
gigantic pages are changed from a compound page to 'just a group of pages'
before freeing. Hence, the destructor is not used.
Link: https://lkml.kernel.org/r/20210809184832.18342-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Mina Almasry <almasrymina(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Conflicts:
mm/hugetlb.c
Signed-off-by: Chen Wandun <chenwandun(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/hugetlb.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5190154de3b09..2f65dad443ab3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1213,12 +1213,30 @@ static void update_and_free_page(struct hstate *h, struct page *page)
1 << PG_writeback);
}
VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page), page);
- set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+ /*
+ * Very subtle
+ *
+ * For non-gigantic pages set the destructor to the normal compound
+ * page dtor. This is needed in case someone takes an additional
+ * temporary ref to the page, and freeing is delayed until they drop
+ * their reference.
+ *
+ * For gigantic pages set the destructor to the null dtor. This
+ * destructor will never be called. Before freeing the gigantic
+ * page destroy_compound_gigantic_page will turn the compound page
+ * into a simple group of pages. After this the destructor does not
+ * apply.
+ *
+ * This handles the case where more than one ref is held when and
+ * after update_and_free_page is called.
+ */
set_page_refcounted(page);
if (hstate_is_gigantic(h)) {
+ set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
destroy_compound_gigantic_page(page, huge_page_order(h));
free_gigantic_page(page, huge_page_order(h));
} else {
+ set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
__free_pages(page, huge_page_order(h));
}
}
--
2.25.1
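For readers unfamiliar with the speculative-reference window in the race above, here is a hedged sketch of the racing reader; example_peek() is a hypothetical name, while get_page_unless_zero()/put_page() are the real page-reference primitives:

#include <linux/mm.h>

/* If put_page() below drops the last reference, the compound
 * destructor runs in this context, which is why the dtor must still
 * be valid (not NULL_COMPOUND_DTOR) when the pool frees the page. */
static void example_peek(struct page *page)
{
	if (!get_page_unless_zero(page))
		return;			/* page already freed */
	/* ... inspect the page ... */
	put_page(page);			/* may invoke the destructor */
}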
kernel sig topic proposal: add the osnoise/timerlat trace tools for debugging system/hardware noise to openEuler 22.03
by Wangshaobo (bobo) 02 Dec '21
Topic: add the osnoise/timerlat trace tools for debugging system/hardware noise to openEuler 22.03
Summary: tracing tools for measuring system/hardware background noise, to guarantee low latency for high-performance applications; osnoise/timerlat can probe hardware and system latency with fairly high precision across multiple cores.
https://gitee.com/openeuler/kernel/issues/I4G64B?from=project-issue
Memory cgroup enhancement.
Johannes Weiner (9):
mm: memcontrol: fix cpuhotplug statistics flushing
mm: memcontrol: kill mem_cgroup_nodeinfo()
mm: memcontrol: privatize memcg_page_state query functions
cgroup: rstat: support cgroup1
cgroup: rstat: punt root-level optimization to individual controllers
mm: memcontrol: switch to rstat
mm: memcontrol: consolidate lruvec stat flushing
kselftests: cgroup: update kmem test for new vmstat implementation
mm: memcontrol: fix blocking rstat function called from atomic cgroup1
thresholding code
Miaohe Lin (1):
mm, memcg: remove unused functions
Shakeel Butt (5):
memcg: switch lruvec stats to rstat
memcg: infrastructure to flush memcg stats
memcg: flush lruvec stats in the refault
memcg: flush stats only if updated
memcg: unify memcg stat flushing
Tejun Heo (2):
cgroup: rstat: fix A-A deadlock on 32bit around u64_stats_sync
blk-cgroup: blk_cgroup_bio_start() should use irq-safe operations on
blkg->iostat_cpu
block/blk-cgroup.c | 36 +-
include/linux/memcontrol.h | 179 ++++------
kernel/cgroup/cgroup.c | 34 +-
kernel/cgroup/rstat.c | 82 +++--
mm/memcontrol.c | 391 ++++++++++-----------
mm/vmscan.c | 6 +
mm/workingset.c | 1 +
tools/testing/selftests/cgroup/test_kmem.c | 22 +-
8 files changed, 372 insertions(+), 379 deletions(-)
--
2.20.1
30 Nov '21
From: Nadav Amit <namit(a)vmware.com>
stable inclusion
from stable-5.10.82
commit 40bc831ab5f630431010d1ff867390b07418a7ee
category: bugfix
bugzilla: 185820 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-4002
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
-----------------------------------------------
commit a4a118f2eead1d6c49e00765de89878288d4b890 upstream.
When __unmap_hugepage_range() calls to huge_pmd_unshare() succeed, a TLB
flush is missing. This TLB flush must be performed before releasing the
i_mmap_rwsem, in order to prevent an unshared PMDs page from being
released and reused before the TLB flush took place.
Arguably, a comprehensive solution would use mmu_gather interface to
batch the TLB flushes and the PMDs page release, however it is not an
easy solution: (1) try_to_unmap_one() and try_to_migrate_one() also call
huge_pmd_unshare() and they cannot use the mmu_gather interface; and (2)
deferring the release of the page reference for the PMDs page until
after i_mmap_rwsem is dropped can confuse huge_pmd_unshare() into
thinking PMDs are shared when they are not.
Fix __unmap_hugepage_range() by adding the missing TLB flush, and
forcing a flush when unshare is successful.
Fixes: 24669e58477e ("hugetlb: use mmu_gather instead of a temporary linked list for accumulating pages)" # 3.6
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
mm/hugetlb.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1bbe763dce73..47dd6b5e0040 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4017,6 +4017,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
struct mmu_notifier_range range;
+ bool force_flush = false;
WARN_ON(!is_vm_hugetlb_page(vma));
BUG_ON(start & ~huge_page_mask(h));
@@ -4045,10 +4046,8 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
ptl = huge_pte_lock(h, mm, ptep);
if (huge_pmd_unshare(mm, vma, &address, ptep)) {
spin_unlock(ptl);
- /*
- * We just unmapped a page of PMDs by clearing a PUD.
- * The caller's TLB flush range should cover this area.
- */
+ tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE);
+ force_flush = true;
continue;
}
@@ -4105,6 +4104,22 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
}
mmu_notifier_invalidate_range_end(&range);
tlb_end_vma(tlb, vma);
+
+ /*
+ * If we unshared PMDs, the TLB flush was not recorded in mmu_gather. We
+ * could defer the flush until now, since by holding i_mmap_rwsem we
+ * guaranteed that the last reference would not be dropped. But we must
+ * do the flushing before we return, as otherwise i_mmap_rwsem will be
+ * dropped and the last reference to the shared PMDs page might be
+ * dropped as well.
+ *
+ * In theory we could defer the freeing of the PMD pages as well, but
+ * huge_pmd_unshare() relies on the exact page_count for the PMD page to
+ * detect sharing, so we cannot defer the release of the page either.
+ * Instead, do flush now.
+ */
+ if (force_flush)
+ tlb_flush_mmu_tlbonly(tlb);
}
void __unmap_hugepage_range_final(struct mmu_gather *tlb,
--
2.20.1
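The shape of the fix, record the unshare, keep scanning, and force the flush before the protecting lock is dropped, is sketched below under stated assumptions: example_unmap_range() and example_pmd_unshared() are hypothetical stand-ins, and the snippet is kernel-context only.

#include <linux/mm.h>

static bool example_pmd_unshared(unsigned long addr);	/* hypothetical */

/* Defer the flush across the scan, but guarantee it happens before
 * the caller releases i_mmap_rwsem. */
static void example_unmap_range(struct mmu_gather *tlb,
				unsigned long addr, unsigned long end)
{
	bool force_flush = false;

	for (; addr < end; addr += PMD_SIZE) {
		if (example_pmd_unshared(addr)) {
			tlb_flush_pmd_range(tlb, addr & PUD_MASK, PUD_SIZE);
			force_flush = true;
		}
	}

	if (force_flush)
		tlb_flush_mmu_tlbonly(tlb);
}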
Backport 5.10.79 LTS patches from upstream.
Geert Uytterhoeven (1):
usb: gadget: Mark USB_FSL_QE broken on 64-bit
Gustavo A. R. Silva (1):
media: staging/intel-ipu3: css: Fix wrong size comparison
imgu_css_fw_init
James Buren (1):
usb-storage: Add compatibility quirk flags for iODD 2531/2541
Jan Kara (1):
isofs: Fix out of bound access for corrupted isofs image
Johan Hovold (8):
comedi: dt9812: fix DMA buffers on stack
comedi: ni_usb6501: fix NULL-deref in command paths
comedi: vmk80xx: fix transfer-buffer overflows
comedi: vmk80xx: fix bulk-buffer overflow
comedi: vmk80xx: fix bulk and interrupt message timeouts
staging: r8712u: fix control-message timeout
staging: rtl8192u: fix control-message timeouts
rsi: fix control-message timeout
Juergen Gross (1):
Revert "x86/kvm: fix vcpu-id indexed array sizes"
Neal Liu (1):
usb: ehci: handshake CMD_RUN instead of STS_HALT
Paolo Bonzini (1):
KVM: x86: avoid warning with -Wbitwise-instead-of-logical
Pavel Skripkin (1):
staging: rtl8712: fix use-after-free in rtl8712_dl_fw
Petr Mladek (1):
printk/console: Allow to disable console output by using console="" or
console=null
Todd Kjos (1):
binder: don't detect sender/target during buffer cleanup
Viraj Shah (1):
usb: musb: Balance list entry in musb_gadget_queue
arch/x86/kvm/ioapic.c | 2 +-
arch/x86/kvm/ioapic.h | 4 +-
arch/x86/kvm/mmu/mmu.c | 2 +-
drivers/android/binder.c | 14 +--
drivers/net/wireless/rsi/rsi_91x_usb.c | 2 +-
drivers/staging/comedi/drivers/dt9812.c | 115 +++++++++++++++-----
drivers/staging/comedi/drivers/ni_usb6501.c | 10 ++
drivers/staging/comedi/drivers/vmk80xx.c | 28 ++---
drivers/staging/media/ipu3/ipu3-css-fw.c | 7 +-
drivers/staging/media/ipu3/ipu3-css-fw.h | 2 +-
drivers/staging/rtl8192u/r8192U_core.c | 18 +--
drivers/staging/rtl8712/usb_intf.c | 4 +-
drivers/staging/rtl8712/usb_ops_linux.c | 2 +-
drivers/usb/gadget/udc/Kconfig | 1 +
drivers/usb/host/ehci-hcd.c | 11 +-
drivers/usb/host/ehci-platform.c | 6 +
drivers/usb/host/ehci.h | 1 +
drivers/usb/musb/musb_gadget.c | 4 +-
drivers/usb/storage/unusual_devs.h | 10 ++
fs/isofs/inode.c | 2 +
kernel/printk/printk.c | 9 +-
21 files changed, 180 insertions(+), 74 deletions(-)
--
2.20.1
[PATCH openEuler-5.10 1/3] ima: Fix warning: no previous prototype for function 'ima_add_kexec_buffer'
by Zheng Zengkai 30 Nov '21
From: Lakshmi Ramasubramanian <nramas(a)linux.microsoft.com>
mainline inclusion
from mainline-5.14
commit: c67913492fec317bc53ffdff496b6ba856d2868c
category: bugfix
bugzilla: 182971 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
The function prototype for ima_add_kexec_buffer() is present
in 'linux/ima.h'. But this header file is not included in
ima_kexec.c where the function is implemented. This results
in the following compiler warning when "-Wmissing-prototypes" flag
is turned on:
security/integrity/ima/ima_kexec.c:81:6: warning: no previous prototype
for function 'ima_add_kexec_buffer' [-Wmissing-prototypes]
Include the header file 'linux/ima.h' in ima_kexec.c to fix
the compiler warning.
Fixes: dce92f6b11c3 (arm64: Enable passing IMA log to next kernel on kexec)
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas(a)linux.microsoft.com>
Acked-by: Rob Herring <robh(a)kernel.org>
Signed-off-by: Mimi Zohar <zohar(a)linux.ibm.com>
Signed-off-by: Guo Zihua <guozihua(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
security/integrity/ima/ima_kexec.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 667887665823..f799cc278a9a 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -11,6 +11,7 @@
#include <linux/vmalloc.h>
#include <linux/kexec.h>
#include <linux/of.h>
+#include <linux/ima.h>
#include "ima.h"
#ifdef CONFIG_IMA_KEXEC
--
2.20.1
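For context, the class of warning being silenced reduces to the following hedged illustration; example_fn() and example.h are hypothetical:

/* example.c: compiled with -Wmissing-prototypes, this definition
 * warns unless a prototype is in scope. Including the header that
 * declares it (as the patch includes <linux/ima.h>) silences it. */
#include "example.h"	/* declares: void example_fn(void); */

void example_fn(void)
{
}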
[PATCH openEuler-5.10 01/10] powerpc/booke: Disable STRICT_KERNEL_RWX, DEBUG_PAGEALLOC and KFENCE
by Zheng Zengkai 30 Nov '21
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
mainline inclusion
from mainline-v5.16-rc1
commit 68b44f94d6370e2c6c790fedd28e637fa9964a93
category: bugfix
bugzilla: 185780 https://gitee.com/openeuler/kernel/issues/I4EUY7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
-------------------------------------------------
fsl_booke and 44x are not able to map kernel linear memory with
pages, so they can't support DEBUG_PAGEALLOC and KFENCE, and
STRICT_KERNEL_RWX is also a problem for now.
Enable those only on book3s (both 32 and 64 except KFENCE), 8xx and 40x.
Fixes: 88df6e90fa97 ("[POWERPC] DEBUG_PAGEALLOC for 32-bit")
Fixes: 95902e6c8864 ("powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32")
Fixes: 90cbac0e995d ("powerpc: Enable KFENCE for PPC32")
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/d1ad9fdd9b27da3fdfa16510bb542ed51fa6e134.16342921…
Conflicts:
arch/powerpc/Kconfig
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/powerpc/Kconfig | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 81a629c5133c..da2b1c3b9ae4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -135,7 +135,7 @@ config PPC
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
- select ARCH_HAS_STRICT_KERNEL_RWX if (PPC32 && !HIBERNATION)
+ select ARCH_HAS_STRICT_KERNEL_RWX if (PPC_BOOK3S_32 || PPC_8xx || 40x) && !HIBERNATION
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE
select ARCH_HAS_COPY_MC if PPC64
@@ -184,7 +184,7 @@ config PPC
select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14
select HAVE_ARCH_KASAN_VMALLOC if PPC32 && PPC_PAGE_SHIFT <= 14
select HAVE_ARCH_KGDB
- select HAVE_ARCH_KFENCE if PPC32
+ select HAVE_ARCH_KFENCE if PPC_BOOK3S_32 || PPC_8xx || 40x
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_NVRAM_OPS
@@ -360,7 +360,7 @@ config PPC_OF_PLATFORM_PCI
depends on PPC64 # not supported on 32 bits yet
config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- depends on PPC32 || PPC_BOOK3S_64
+ depends on PPC_BOOK3S || PPC_8xx || 40x
def_bool y
config ARCH_SUPPORTS_UPROBES
--
2.20.1
[PATCH openEuler-5.10 01/14] ubifs: fix slab-out-of-bounds in ubifs_change_lp
by Zheng Zengkai 30 Nov '21
From: Baokun Li <libaokun1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 182993 https://gitee.com/openeuler/kernel/issues/I4DDEL
---------------------------
Hulk Robot reported a KASAN report about slab-out-of-bounds:
==================================================================
BUG: KASAN: slab-out-of-bounds in ubifs_change_lp+0x3a9/0x1390 [ubifs]
Read of size 8 at addr ffff888101c961f8 by task fsstress/1068
[...]
Call Trace:
check_memory_region+0x1c1/0x1e0
ubifs_change_lp+0x3a9/0x1390 [ubifs]
ubifs_change_one_lp+0x170/0x220 [ubifs]
ubifs_garbage_collect+0x7f9/0xda0 [ubifs]
ubifs_budget_space+0xfe4/0x1bd0 [ubifs]
ubifs_write_begin+0x528/0x10c0 [ubifs]
Allocated by task 1068:
kmemdup+0x25/0x50
ubifs_lpt_lookup_dirty+0x372/0xb00 [ubifs]
ubifs_update_one_lp+0x46/0x260 [ubifs]
ubifs_tnc_end_commit+0x98b/0x1720 [ubifs]
do_commit+0x6cb/0x1950 [ubifs]
ubifs_run_commit+0x15a/0x2b0 [ubifs]
ubifs_budget_space+0x1061/0x1bd0 [ubifs]
ubifs_write_begin+0x528/0x10c0 [ubifs]
[...]
==================================================================
In ubifs_garbage_collect(), if ubifs_find_dirty_leb() returns an error,
lp is left uninitialized, yet lp.lnum may still be used in the out
branch with a random value. If that value is -1, or another value that
happens to pass the check, a slab-out-of-bounds access can occur in the
subsequent ubifs_change_lp().
To solve this problem, we initialize lp.lnum to -1; ubifs_find_dirty_leb()
then sets it to a valid LEB number (never -1), and ubifs_return_leb()
is executed only when lp.lnum != -1.
Additionally, if a retained or indexing LEB is found and the loop
continues, but then breaks before finding another LEB, the "taken" flag
of that LEB would be wrongly cleared in ubifs_return_leb(). This patch
fixes that bug as well.
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
fs/ubifs/gc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/ubifs/gc.c b/fs/ubifs/gc.c
index dc3e26e9ed7b..05e1eeae8457 100644
--- a/fs/ubifs/gc.c
+++ b/fs/ubifs/gc.c
@@ -692,6 +692,9 @@ int ubifs_garbage_collect(struct ubifs_info *c, int anyway)
for (i = 0; ; i++) {
int space_before, space_after;
+ /* Maybe continue after find and break before find */
+ lp.lnum = -1;
+
cond_resched();
/* Give the commit an opportunity to run */
@@ -843,7 +846,8 @@ int ubifs_garbage_collect(struct ubifs_info *c, int anyway)
ubifs_wbuf_sync_nolock(wbuf);
ubifs_ro_mode(c, ret);
mutex_unlock(&wbuf->io_mutex);
- ubifs_return_leb(c, lp.lnum);
+ if (lp.lnum != -1)
+ ubifs_return_leb(c, lp.lnum);
return ret;
}
--
2.20.1
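The sentinel idiom the fix relies on, shown in isolation as a hedged sketch; every example_* name is hypothetical:

/* Mark the held resource invalid up front so the error path can tell
 * "nothing acquired" apart from a stale value of an earlier iteration. */
static int example_find(int *lnum);	/* sets *lnum >= 0 on success */
static int example_use(int lnum);
static void example_release(int lnum);

static int example_gc(void)
{
	int lnum = -1;			/* sentinel: no LEB held */
	int err;

	for (;;) {
		err = example_find(&lnum);
		if (err)
			break;		/* may break before any find */
		err = example_use(lnum);
		if (err)
			break;		/* lnum is valid here */
		lnum = -1;		/* consumed; reset the sentinel */
	}

	if (lnum != -1)			/* release only what is held */
		example_release(lnum);
	return err;
}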
Hello Kernel,
I would like to communicate through this list.
------------------ Original ------------------
From: "kernel-request"<kernel-request(a)openeuler.org>;
Date: Tue, Nov 30, 2021 04:12 PM
To: "杨嫣"<yan.yang(a)i-soft.com.cn>;
Subject: Welcome to the "Kernel" mailing list
Welcome to the "Kernel" mailing list!
To post to this list, send your email to:
kernel(a)openeuler.org
You can unsubscribe or make adjustments to your options via email by
sending a message to:
kernel-leave(a)openeuler.org
with the word 'help' in the subject or body (don't include the
quotes), and you will receive a message with instructions. You will
need your password to change your options, but for security purposes,
this password is not included here. If you have forgotten your
password, you will need to reset it via the web UI.
[PATCH openEuler-1.0-LTS] defconfig: update the defconfigs to support 9P
by Yang Yingliang 30 Nov '21
From: Laibin Qiu <qiulaibin(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ICQF?from=project-issue
CVE: NA
-------------------------------------------------
Enable configs to support 9P.
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/configs/openeuler_defconfig | 9 ++++++++-
arch/x86/configs/openeuler_defconfig | 10 +++++++++-
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index c63a2f829db02..c2cb98dcf340b 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -1729,7 +1729,10 @@ CONFIG_RFKILL=m
CONFIG_RFKILL_LEDS=y
CONFIG_RFKILL_INPUT=y
CONFIG_RFKILL_GPIO=m
-# CONFIG_NET_9P is not set
+CONFIG_NET_9P=m
+CONFIG_NET_9P_VIRTIO=m
+# CONFIG_NET_9P_RDMA is not set
+# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
CONFIG_CEPH_LIB=m
# CONFIG_CEPH_LIB_PRETTYDEBUG is not set
@@ -5356,6 +5359,10 @@ CONFIG_CIFS_DFS_UPCALL=y
# CONFIG_CIFS_FSCACHE is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
+CONFIG_9P_FS=m
+CONFIG_9P_FSCACHE=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 0b103110f821c..8a1c4daf5c4b0 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -1805,7 +1805,11 @@ CONFIG_RFKILL=m
CONFIG_RFKILL_LEDS=y
CONFIG_RFKILL_INPUT=y
# CONFIG_RFKILL_GPIO is not set
-# CONFIG_NET_9P is not set
+CONFIG_NET_9P=m
+CONFIG_NET_9P_VIRTIO=m
+# CONFIG_NET_9P_XEN is not set
+# CONFIG_NET_9P_RDMA is not set
+# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
CONFIG_CEPH_LIB=m
# CONFIG_CEPH_LIB_PRETTYDEBUG is not set
@@ -6838,6 +6842,10 @@ CONFIG_CIFS_DFS_UPCALL=y
# CONFIG_CIFS_FSCACHE is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
+CONFIG_9P_FS=m
+CONFIG_9P_FSCACHE=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
--
2.25.1
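With these options built as modules, a guest can mount a virtio-9p share from userspace. A hedged sketch follows; the "hostshare" tag and the /mnt mount point are examples, not fixed names:

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* trans=virtio selects the CONFIG_NET_9P_VIRTIO transport;
	 * 9p2000.L is the Linux-extended protocol variant. */
	if (mount("hostshare", "/mnt", "9p", 0,
		  "trans=virtio,version=9p2000.L")) {
		perror("mount 9p");
		return 1;
	}
	return 0;
}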
[Meeting Notice] openEuler kernel tech-sharing session #15 & biweekly meeting Time: 2021-12-03 14:00-16:30
by Meeting Book 30 Nov '21
kernel sig topic proposal: discussion of the core CONFIG choices (PageSize/NR_CPUS/NODES_SHIFT) for openEuler 22.03 LTS
by Xie XiuQi 29 Nov '21
Topic: discussion of the core CONFIG options (PageSize/NR_CPUS/NODES_SHIFT) for openEuler 22.03 LTS
https://gitee.com/openeuler/kernel/issues/I4HDHZ
Discussion in the issue ahead of the meeting is welcome.
[PATCH openEuler-5.10 01/31] timer_list: avoid other cpu soft lockup when printing timer list
by Zheng Zengkai 29 Nov '21
From: Yang Yingliang <yangyingliang(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IYRE
---------------------------
If the system has many CPUs (e.g. 128), it takes a long time to print
messages to the console when executing echo q > /proc/sysrq-trigger.
When /proc/sys/kernel/numa_balancing is enabled and a migration thread
is woken up, that thread cannot continue until the printing finishes,
which triggers a soft lockup.
PID: 619 TASK: ffffa02fdd8bec80 CPU: 121 COMMAND: "migration/121"
#0 [ffff00000a103b10] __crash_kexec at ffff0000081bf200
#1 [ffff00000a103ca0] panic at ffff0000080ec93c
#2 [ffff00000a103d80] watchdog_timer_fn at ffff0000081f8a14
#3 [ffff00000a103e00] __run_hrtimer at ffff00000819701c
#4 [ffff00000a103e40] __hrtimer_run_queues at ffff000008197420
#5 [ffff00000a103ea0] hrtimer_interrupt at ffff00000819831c
#6 [ffff00000a103f10] arch_timer_dying_cpu at ffff000008b53144
#7 [ffff00000a103f30] handle_percpu_devid_irq at ffff000008174e34
#8 [ffff00000a103f70] generic_handle_irq at ffff00000816c5e8
#9 [ffff00000a103f90] __handle_domain_irq at ffff00000816d1f4
#10 [ffff00000a103fd0] gic_handle_irq at ffff000008081860
--- <IRQ stack> ---
#11 [ffff00000d6e3d50] el1_irq at ffff0000080834c8
#12 [ffff00000d6e3d60] multi_cpu_stop at ffff0000081d9964
#13 [ffff00000d6e3db0] cpu_stopper_thread at ffff0000081d9cfc
#14 [ffff00000d6e3e10] smpboot_thread_fn at ffff00000811e0a8
#15 [ffff00000d6e3e70] kthread at ffff000008118988
To avoid this soft lockup, add touch_all_softlockup_watchdogs()
in sysrq_timer_list_show()
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-By: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: wangxiongfeng 00379786 <wangxiongfeng2(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
kernel/time/timer_list.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index acb326f5f50a..4cb0e6f62e97 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -289,13 +289,17 @@ void sysrq_timer_list_show(void)
timer_list_header(NULL, now);
- for_each_online_cpu(cpu)
+ for_each_online_cpu(cpu) {
+ touch_all_softlockup_watchdogs();
print_cpu(NULL, cpu, now);
+ }
#ifdef CONFIG_GENERIC_CLOCKEVENTS
timer_list_show_tickdevices_header(NULL);
- for_each_online_cpu(cpu)
+ for_each_online_cpu(cpu) {
+ touch_all_softlockup_watchdogs();
print_tickdevice(NULL, tick_get_device(cpu), cpu);
+ }
#endif
return;
}
--
2.20.1
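The same pattern applies to any slow per-CPU diagnostic loop; a hedged kernel-context sketch, with example_dump_one() as a hypothetical stand-in for the slow printer:

#include <linux/cpumask.h>
#include <linux/nmi.h>

static void example_dump_one(int cpu);	/* hypothetical */

/* Console output can stall other CPUs' progress, so pet every
 * soft-lockup watchdog once per iteration of the long loop. */
static void example_dump_all(void)
{
	int cpu;

	for_each_online_cpu(cpu) {
		touch_all_softlockup_watchdogs();
		example_dump_one(cpu);
	}
}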
[PATCH openEuler-1.0-LTS 01/13] sched: Introduce qos scheduler for co-location
by Yang Yingliang 29 Nov '21
From: Zheng Zucheng <zhengzucheng(a)huawei.com>
hulk inclusion
category: feature
bugzilla: 51828, https://gitee.com/openeuler/kernel/issues/I4K96G
CVE: NA
--------------------------------
We introduce the idea of a QoS level to the scheduler, currently backed
by different scheduling policies. The QoS scheduler changes the policy
of the affected tasks when the QoS level of a task group is modified
through the cpu.qos_level cpu cgroup file. In this way we can satisfy
the different needs of tasks at different QoS levels.
Signed-off-by: Zhang Qiao <zhangqiao22(a)huawei.com>
Signed-off-by: Zheng Zucheng <zhengzucheng(a)huawei.com>
Reviewed-by: Chen Hui <judy.chenhui(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
init/Kconfig | 8 ++++
kernel/sched/core.c | 93 ++++++++++++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 4 ++
3 files changed, 105 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index c05347a29ca4d..a338519692d54 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -777,6 +777,14 @@ menuconfig CGROUP_SCHED
tasks.
if CGROUP_SCHED
+config QOS_SCHED
+ bool "Qos task scheduling"
+ depends on CGROUP_SCHED
+ depends on CFS_BANDWIDTH
+ depends on X86
+
+ default n
+
config FAIR_GROUP_SCHED
bool "Group scheduling for SCHED_OTHER"
depends on CGROUP_SCHED
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8866cd7f19c43..23160df884e49 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6338,6 +6338,15 @@ void ia64_set_curr_task(int cpu, struct task_struct *p)
/* task_group_lock serializes the addition/removal of task groups */
static DEFINE_SPINLOCK(task_group_lock);
+#ifdef CONFIG_QOS_SCHED
+static int alloc_qos_sched_group(struct task_group *tg, struct task_group *parent)
+{
+ tg->qos_level = parent->qos_level;
+
+ return 1;
+}
+#endif
+
static void sched_free_group(struct task_group *tg)
{
free_fair_sched_group(tg);
@@ -6358,6 +6367,11 @@ struct task_group *sched_create_group(struct task_group *parent)
if (!alloc_fair_sched_group(tg, parent))
goto err;
+#ifdef CONFIG_QOS_SCHED
+ if (!alloc_qos_sched_group(tg, parent))
+ goto err;
+#endif
+
if (!alloc_rt_sched_group(tg, parent))
goto err;
@@ -6426,6 +6440,30 @@ static void sched_change_group(struct task_struct *tsk, int type)
tg = autogroup_task_group(tsk, tg);
tsk->sched_task_group = tg;
+#ifdef CONFIG_QOS_SCHED
+ /*
+ * No need to re-setcheduler when a task is exiting or the task
+ * is in an autogroup.
+ */
+ if (!rt_task(tsk)
+ && !(tsk->flags & PF_EXITING)
+ && !task_group_is_autogroup(tg)) {
+ struct rq *rq = task_rq(tsk);
+ struct sched_attr attr = {
+ .sched_priority = 0,
+ };
+
+ if (tg->qos_level == -1) {
+ attr.sched_policy = SCHED_IDLE;
+ } else {
+ attr.sched_policy = SCHED_NORMAL;
+ }
+ attr.sched_nice = PRIO_TO_NICE(tsk->static_prio);
+
+ __setscheduler(rq, tsk, &attr, 0);
+ }
+#endif
+
#ifdef CONFIG_FAIR_GROUP_SCHED
if (tsk->sched_class->task_change_group)
tsk->sched_class->task_change_group(tsk, type);
@@ -6886,6 +6924,54 @@ static u64 cpu_rt_period_read_uint(struct cgroup_subsys_state *css,
}
#endif /* CONFIG_RT_GROUP_SCHED */
+#ifdef CONFIG_QOS_SCHED
+static int cpu_qos_write(struct cgroup_subsys_state *css,
+ struct cftype *cftype, s64 qos_level)
+{
+ struct css_task_iter it;
+ struct task_struct *tsk;
+ struct task_group *tg;
+ struct sched_param param;
+ int pid, policy;
+ tg = css_tg(css);
+
+ if (!tg->se[0])
+ return -EINVAL;
+
+ if (qos_level != -1 && qos_level != 0)
+ return -EINVAL;
+
+ if (tg->qos_level == qos_level)
+ goto done;
+
+ if (qos_level == -1) {
+ policy = SCHED_IDLE;
+ } else {
+ policy = SCHED_NORMAL;
+ }
+
+ tg->qos_level = qos_level;
+
+ param.sched_priority = 0;
+ css_task_iter_start(css, 0, &it);
+ while ((tsk = css_task_iter_next(&it))) {
+ pid = task_tgid_vnr(tsk);
+
+ if (pid > 0 && !rt_task(tsk))
+ sched_setscheduler(tsk, policy, ¶m);
+ }
+ css_task_iter_end(&it);
+
+done:
+ return 0;
+}
+
+static s64 cpu_qos_read(struct cgroup_subsys_state *css, struct cftype *cft)
+{
+ return css_tg(css)->qos_level;
+}
+#endif /* CONFIG_QOS_SCHED */
+
static struct cftype cpu_legacy_files[] = {
#ifdef CONFIG_FAIR_GROUP_SCHED
{
@@ -6921,6 +7007,13 @@ static struct cftype cpu_legacy_files[] = {
.read_u64 = cpu_rt_period_read_uint,
.write_u64 = cpu_rt_period_write_uint,
},
+#endif
+#ifdef CONFIG_QOS_SCHED
+ {
+ .name = "qos_level",
+ .read_s64 = cpu_qos_read,
+ .write_s64 = cpu_qos_write,
+ },
#endif
{ } /* Terminate */
};
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e6238db9dc996..c263cb2f35c5d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -402,7 +402,11 @@ struct task_group {
struct cfs_bandwidth cfs_bandwidth;
+#if defined(CONFIG_QOS_SCHED) && !defined(__GENKSYMS__)
+ long qos_level;
+#else
KABI_RESERVE(1)
+#endif
KABI_RESERVE(2)
};
--
2.25.1
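From userspace, marking a group offline then amounts to writing -1 into its cpu.qos_level file. A hedged sketch with an example cgroup-v1 path; adjust the path to the actual hierarchy:

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/fs/cgroup/cpu/offline_grp/cpu.qos_level", "w");

	if (!f) {
		perror("cpu.qos_level");
		return 1;
	}
	fprintf(f, "-1\n");	/* -1: offline (SCHED_IDLE); 0: online */
	return fclose(f) != 0;
}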
[PATCH openEuler-1.0-LTS 1/3] ACPI: CPPC: Fix cppc_cpufreq_init failed in CPU Hotplug situation
by Yang Yingliang 27 Nov '21
From: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4HYY4?from=project-issue
CVE: NA
-------------------------------------------------
Per-CPU variables cpc_desc_ptr are initialized in
acpi_cppc_processor_probe() when the processor devices are present and
added into the system. But when cpu_possible_mask and cpu_present_mask
is not equal, only cpc_desc_ptr in cpu_present_mask are initialized,
this will cause acpi_get_psd_map() failed in cppc_cpufreq_init().
To fix this issue, we parse the _PSD method for all possible CPUs to get
the P-State topology and modify acpi_get_psd_map() to rely on this
information.
Signed-off-by: Xiongfeng Wang <wangxiongfeng(a)huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/acpi/cppc_acpi.c | 93 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 89 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 6134f20a13f0c..e71c0e0572bea 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -415,7 +415,7 @@ static int acpi_get_psd(struct cpc_desc *cpc_ptr, acpi_handle handle)
*
* Return: 0 for success or negative value for err.
*/
-int acpi_get_psd_map(struct cppc_cpudata **all_cpu_data)
+static int __acpi_get_psd_map(struct cppc_cpudata **all_cpu_data, struct cpc_desc **cpc_pptr)
{
int count_target;
int retval = 0;
@@ -441,7 +441,7 @@ int acpi_get_psd_map(struct cppc_cpudata **all_cpu_data)
if (cpumask_test_cpu(i, covered_cpus))
continue;
- cpc_ptr = per_cpu(cpc_desc_ptr, i);
+ cpc_ptr = cpc_pptr[i];
if (!cpc_ptr) {
retval = -EFAULT;
goto err_ret;
@@ -466,7 +466,7 @@ int acpi_get_psd_map(struct cppc_cpudata **all_cpu_data)
if (i == j)
continue;
- match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
+ match_cpc_ptr = cpc_pptr[j];
if (!match_cpc_ptr) {
retval = -EFAULT;
goto err_ret;
@@ -499,7 +499,7 @@ int acpi_get_psd_map(struct cppc_cpudata **all_cpu_data)
if (!match_pr)
continue;
- match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
+ match_cpc_ptr = cpc_pptr[j];
if (!match_cpc_ptr) {
retval = -EFAULT;
goto err_ret;
@@ -532,6 +532,91 @@ int acpi_get_psd_map(struct cppc_cpudata **all_cpu_data)
free_cpumask_var(covered_cpus);
return retval;
}
+
+static acpi_status acpi_parse_cpc(acpi_handle handle, u32 lvl, void *data,
+ void **ret_p)
+{
+ struct acpi_device *adev = NULL;
+ struct cpc_desc *cpc_ptr, **cpc_pptr;
+ acpi_status status = AE_OK;
+ const int device_declaration = 1;
+ unsigned long long uid;
+ phys_cpuid_t phys_id;
+ int logical_id, ret;
+ int *parsed_core_num = (int *)ret_p;
+
+ if (acpi_bus_get_device(handle, &adev))
+ return AE_OK;
+
+ if (strcmp(acpi_device_hid(adev), ACPI_PROCESSOR_DEVICE_HID))
+ return AE_OK;
+
+ status = acpi_evaluate_integer(handle, METHOD_NAME__UID, NULL, &uid);
+ if (ACPI_FAILURE(status))
+ return AE_OK;
+ phys_id = acpi_get_phys_id(handle, device_declaration, uid);
+ if (invalid_phys_cpuid(phys_id))
+ return AE_OK;
+ logical_id = acpi_map_cpuid(phys_id, uid);
+ if (logical_id < 0)
+ return AE_OK;
+
+ cpc_pptr = (struct cpc_desc **)data;
+ cpc_ptr = cpc_pptr[logical_id];
+ cpc_ptr->cpu_id = logical_id;
+
+ ret = acpi_get_psd(cpc_ptr, handle);
+ if (ret)
+ return ret;
+
+ (*parsed_core_num)++;
+
+ return AE_OK;
+}
+
+int acpi_get_psd_map(struct cppc_cpudata **all_cpu_data)
+{
+ struct cpc_desc **cpc_pptr, *cpc_ptr;
+ int parsed_core_num = 0;
+ int i, ret;
+
+ cpc_pptr = kcalloc(num_possible_cpus(), sizeof(void *), GFP_KERNEL);
+ if (!cpc_pptr)
+ return -ENOMEM;
+ for_each_possible_cpu(i) {
+ cpc_pptr[i] = kzalloc(sizeof(struct cpc_desc), GFP_KERNEL);
+ if (!cpc_pptr[i]) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ }
+
+ /*
+ * We cannot use acpi_get_devices() to walk the processor devices
+ * because some processor devices may not be present.
+ */
+ ret = acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
+ ACPI_UINT32_MAX, acpi_parse_cpc, NULL,
+ cpc_pptr, (void **)&parsed_core_num);
+ if (ret)
+ goto out;
+ if (parsed_core_num != num_possible_cpus()) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ret = __acpi_get_psd_map(all_cpu_data, cpc_pptr);
+
+out:
+ for_each_possible_cpu(i) {
+ cpc_ptr = cpc_pptr[i];
+ if (cpc_ptr)
+ kfree(cpc_ptr);
+ }
+ kfree(cpc_pptr);
+
+ return ret;
+}
EXPORT_SYMBOL_GPL(acpi_get_psd_map);
static int register_pcc_channel(int pcc_ss_idx)
--
2.25.1
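For illustration only (not part of the patch): the failure mode above is
the general bug class of initializing a per-CPU pointer only for present
CPUs while a later consumer walks all possible CPUs. A hedged,
self-contained sketch with assumed CPU counts:

/* Assumption: 4 possible CPUs, of which only 2 are present at boot.
 * Present-only initialization leaves NULL holes that a possible-wide
 * walk (like acpi_get_psd_map() before this patch) trips over.
 */
#include <stdio.h>

#define NR_POSSIBLE 4
#define NR_PRESENT  2

static int desc_storage[NR_POSSIBLE];
static int *cpc_desc_ptr[NR_POSSIBLE];

int main(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_PRESENT; cpu++)	/* present-only init */
		cpc_desc_ptr[cpu] = &desc_storage[cpu];

	for (cpu = 0; cpu < NR_POSSIBLE; cpu++)	/* possible-wide walk */
		if (!cpc_desc_ptr[cpu])
			printf("cpu%d: no CPC descriptor\n", cpu);
	return 0;
}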

[PATCH openEuler-5.10 16/31] perf, kvm/arm64: perf-kvm-stat to report VM TRAP
by Zheng Zengkai 26 Nov '21
From: Zenghui Yu <yuzenghui(a)huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IZPY
CVE: NA
-------------------------------------------------
When a guest exits due to "TRAP", we can analyze the guest exit reasons
in more depth. Enhance perf-kvm-stat to record and analyze VM TRAP events.
There is a mapping between the guest's "trap_code" (ESR_ELx's bits[31:26])
and the "trap_reason" - kvm_arm_exception_class. Copy it from the kernel
into aarch64_guest_exits.h to export it to userspace.
This patch records two new KVM tracepoints: "kvm:kvm_trap_enter" and
"kvm:kvm_trap_exit", and reports statistical data between these two
tracepoints.
A simple test is shown below:
# ./tools/perf/perf kvm stat record -p 20763
[ perf record: Woken up 92 times to write data ]
[ perf record: Captured and wrote 203.727 MB perf.data.guest (2601786 samples) ]
# ./tools/perf/perf kvm stat report --event=vmexit
Analyze events for all VMs, all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
TRAP 640931 97.12% 100.00% 2.44us 14683.86us 3446.49us ( +- 0.05% )
IRQ 19019 2.88% 0.00% 0.90us 461.94us 2.12us ( +- 2.09% )
Total Samples:659950, Total events handled time:2209005391.30us.
# ./tools/perf/perf kvm stat report --event=trap
Analyze events for all VMs, all VCPUs:
TRAP-EVENT Samples Samples% Time% Min Time Max Time Avg time
WFx 601194 93.80% 99.98% 0.90us 4294.04us 3671.01us ( +- 0.03% )
SYS64 33714 5.26% 0.01% 1.10us 41.34us 5.68us ( +- 0.18% )
DABT_LOW 6014 0.94% 0.00% 1.12us 18.04us 2.57us ( +- 0.91% )
IABT_LOW 12 0.00% 0.01% 12597.76us 14679.96us 12893.61us ( +- 1.34% )
Total Samples:640934, Total events handled time:2207353434.56us.
Signed-off-by: Zenghui Yu <yuzenghui(a)huawei.com>
Reviewed-by: Hailiang Zhang <zhang.zhanghailiang(a)huawei.com>
Signed-off-by: Zenghui Yu <yuzenghui(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Link: https://lore.kernel.org/r/1560330526-15468-6-git-send-email-yuzenghui@huawe…
Link: https://gitee.com/openeuler/kernel/commit/59634497418b
Reviewed-by: Yanan Wang <wangyanan55(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
.../arch/arm64/util/aarch64_guest_exits.h | 72 +++++++++++++++++++
tools/perf/arch/arm64/util/kvm-stat.c | 68 ++++++++++++++++++
2 files changed, 140 insertions(+)
diff --git a/tools/perf/arch/arm64/util/aarch64_guest_exits.h b/tools/perf/arch/arm64/util/aarch64_guest_exits.h
index aec2e6e012d3..76e8f0358182 100644
--- a/tools/perf/arch/arm64/util/aarch64_guest_exits.h
+++ b/tools/perf/arch/arm64/util/aarch64_guest_exits.h
@@ -24,4 +24,76 @@
{ARM_EXCEPTION_TRAP, "TRAP" }, \
{ARM_EXCEPTION_HYP_GONE, "HYP_GONE" }
+/* esr.h */
+#define ESR_ELx_EC_UNKNOWN (0x00)
+#define ESR_ELx_EC_WFx (0x01)
+/* Unallocated EC: 0x02 */
+#define ESR_ELx_EC_CP15_32 (0x03)
+#define ESR_ELx_EC_CP15_64 (0x04)
+#define ESR_ELx_EC_CP14_MR (0x05)
+#define ESR_ELx_EC_CP14_LS (0x06)
+#define ESR_ELx_EC_FP_ASIMD (0x07)
+#define ESR_ELx_EC_CP10_ID (0x08) /* EL2 only */
+#define ESR_ELx_EC_PAC (0x09) /* EL2 and above */
+/* Unallocated EC: 0x0A - 0x0B */
+#define ESR_ELx_EC_CP14_64 (0x0C)
+#define ESR_ELx_EC_BTI (0x0D)
+#define ESR_ELx_EC_ILL (0x0E)
+/* Unallocated EC: 0x0F - 0x10 */
+#define ESR_ELx_EC_SVC32 (0x11)
+#define ESR_ELx_EC_HVC32 (0x12) /* EL2 only */
+#define ESR_ELx_EC_SMC32 (0x13) /* EL2 and above */
+/* Unallocated EC: 0x14 */
+#define ESR_ELx_EC_SVC64 (0x15)
+#define ESR_ELx_EC_HVC64 (0x16) /* EL2 and above */
+#define ESR_ELx_EC_SMC64 (0x17) /* EL2 and above */
+#define ESR_ELx_EC_SYS64 (0x18)
+#define ESR_ELx_EC_SVE (0x19)
+#define ESR_ELx_EC_ERET (0x1a) /* EL2 only */
+/* Unallocated EC: 0x1B */
+#define ESR_ELx_EC_FPAC (0x1C) /* EL1 and above */
+/* Unallocated EC: 0x1D - 0x1E */
+#define ESR_ELx_EC_IMP_DEF (0x1f) /* EL3 only */
+#define ESR_ELx_EC_IABT_LOW (0x20)
+#define ESR_ELx_EC_IABT_CUR (0x21)
+#define ESR_ELx_EC_PC_ALIGN (0x22)
+/* Unallocated EC: 0x23 */
+#define ESR_ELx_EC_DABT_LOW (0x24)
+#define ESR_ELx_EC_DABT_CUR (0x25)
+#define ESR_ELx_EC_SP_ALIGN (0x26)
+/* Unallocated EC: 0x27 */
+#define ESR_ELx_EC_FP_EXC32 (0x28)
+/* Unallocated EC: 0x29 - 0x2B */
+#define ESR_ELx_EC_FP_EXC64 (0x2C)
+/* Unallocated EC: 0x2D - 0x2E */
+#define ESR_ELx_EC_SERROR (0x2F)
+#define ESR_ELx_EC_BREAKPT_LOW (0x30)
+#define ESR_ELx_EC_BREAKPT_CUR (0x31)
+#define ESR_ELx_EC_SOFTSTP_LOW (0x32)
+#define ESR_ELx_EC_SOFTSTP_CUR (0x33)
+#define ESR_ELx_EC_WATCHPT_LOW (0x34)
+#define ESR_ELx_EC_WATCHPT_CUR (0x35)
+/* Unallocated EC: 0x36 - 0x37 */
+#define ESR_ELx_EC_BKPT32 (0x38)
+/* Unallocated EC: 0x39 */
+#define ESR_ELx_EC_VECTOR32 (0x3A) /* EL2 only */
+/* Unallocated EC: 0x3B */
+#define ESR_ELx_EC_BRK64 (0x3C)
+/* Unallocated EC: 0x3D - 0x3F */
+#define ESR_ELx_EC_MAX (0x3F)
+
+/* kvm_arm.h */
+#define ECN(x) { ESR_ELx_EC_##x, #x }
+
+#define kvm_arm_exception_class \
+ ECN(UNKNOWN), ECN(WFx), ECN(CP15_32), ECN(CP15_64), ECN(CP14_MR), \
+ ECN(CP14_LS), ECN(FP_ASIMD), ECN(CP10_ID), ECN(PAC), ECN(CP14_64), \
+ ECN(SVC64), ECN(HVC64), ECN(SMC64), ECN(SYS64), ECN(SVE), \
+ ECN(IMP_DEF), ECN(IABT_LOW), ECN(IABT_CUR), \
+ ECN(PC_ALIGN), ECN(DABT_LOW), ECN(DABT_CUR), \
+ ECN(SP_ALIGN), ECN(FP_EXC32), ECN(FP_EXC64), ECN(SERROR), \
+ ECN(BREAKPT_LOW), ECN(BREAKPT_CUR), ECN(SOFTSTP_LOW), \
+ ECN(SOFTSTP_CUR), ECN(WATCHPT_LOW), ECN(WATCHPT_CUR), \
+ ECN(BKPT32), ECN(VECTOR32), ECN(BRK64)
+
#endif /* ARCH_PERF_AARCH64_GUEST_EXITS_H */
diff --git a/tools/perf/arch/arm64/util/kvm-stat.c b/tools/perf/arch/arm64/util/kvm-stat.c
index 2fed20370829..a0a97073d2d1 100644
--- a/tools/perf/arch/arm64/util/kvm-stat.c
+++ b/tools/perf/arch/arm64/util/kvm-stat.c
@@ -4,10 +4,14 @@
* Copyright(c) 2019 Huawei Technologies Co., Ltd
*/
+#include <string.h>
+#include "../../../util/debug.h"
+#include "../../../util/evsel.h"
#include "../../../util/kvm-stat.h"
#include "aarch64_guest_exits.h"
define_exit_reasons_table(arm64_exit_reasons, kvm_arm_exception_type);
+define_exit_reasons_table(arm64_trap_reasons, kvm_arm_exception_class);
static struct kvm_events_ops exit_events = {
.is_begin_event = exit_event_begin,
@@ -22,14 +26,78 @@ const char *kvm_exit_reason = "ret";
const char *kvm_entry_trace = "kvm:kvm_entry";
const char *kvm_exit_trace = "kvm:kvm_exit";
+const char *kvm_trap_reason = "esr_ec";
+const char *kvm_trap_enter_trace = "kvm:kvm_trap_enter";
+const char *kvm_trap_exit_trace = "kvm:kvm_trap_exit";
+
+static void trap_event_get_key(struct evsel *evsel,
+ struct perf_sample *sample,
+ struct event_key *key)
+{
+ key->info = 0;
+ key->key = evsel__intval(evsel, sample, kvm_trap_reason);
+}
+
+static const char *get_trap_reason(u64 exit_code)
+{
+ struct exit_reasons_table *tbl = arm64_trap_reasons;
+
+ while (tbl->reason != NULL) {
+ if (tbl->exit_code == exit_code)
+ return tbl->reason;
+ tbl++;
+ }
+
+ pr_err("Unknown kvm trap exit code: %lld on aarch64\n",
+ (unsigned long long)exit_code);
+ return "UNKNOWN";
+}
+
+static bool trap_event_end(struct evsel *evsel,
+ struct perf_sample *sample __maybe_unused,
+ struct event_key *key __maybe_unused)
+{
+ return (!strcmp(evsel->name, kvm_trap_exit_trace));
+}
+
+static bool trap_event_begin(struct evsel *evsel,
+ struct perf_sample *sample, struct event_key *key)
+{
+ if (!strcmp(evsel->name, kvm_trap_enter_trace)) {
+ trap_event_get_key(evsel, sample, key);
+ return true;
+ }
+
+ return false;
+}
+
+static void trap_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
+ struct event_key *key,
+ char *decode)
+{
+ const char *trap_reason = get_trap_reason(key->key);
+
+ scnprintf(decode, decode_str_len, "%s", trap_reason);
+}
+
+static struct kvm_events_ops trap_events = {
+ .is_begin_event = trap_event_begin,
+ .is_end_event = trap_event_end,
+ .decode_key = trap_event_decode_key,
+ .name = "TRAP-EVENT",
+};
+
const char *kvm_events_tp[] = {
"kvm:kvm_entry",
"kvm:kvm_exit",
+ "kvm:kvm_trap_enter",
+ "kvm:kvm_trap_exit",
NULL,
};
struct kvm_reg_events_ops kvm_reg_events_ops[] = {
{ .name = "vmexit", .ops = &exit_events },
+ { .name = "trap", .ops = &trap_events },
{ NULL, NULL },
};
--
2.20.1
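A small self-contained illustration of the trap_code decoding described
above: the exception class is ESR_ELx bits [31:26], which is what the
esr_ec tracepoint field carries and what get_trap_reason() looks up. The
sample ESR value below is an assumption, chosen so the EC decodes to
0x24 (DABT_LOW in the table above).

#include <stdint.h>
#include <stdio.h>

#define ESR_ELX_EC_SHIFT 26
#define ESR_ELX_EC_MASK  0x3Fu

int main(void)
{
	uint32_t esr = 0x93C08004;	/* assumed example ESR_ELx value */
	uint32_t ec = (esr >> ESR_ELX_EC_SHIFT) & ESR_ELX_EC_MASK;

	printf("EC = 0x%02x\n", ec);	/* prints: EC = 0x24 */
	return 0;
}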

26 Nov '21
From: Yanling Song <songyl(a)ramaxel.com>
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4I0OZ
CVE: NA
--------------------------
Fix the typo last_cmsn, which should be last_pmsn.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Zhang Lei <zhanglei48(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/scsi/spfc/hw/spfc_queue.c | 28 ++++++++++++++--------------
drivers/scsi/spfc/hw/spfc_queue.h | 2 +-
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/scsi/spfc/hw/spfc_queue.c b/drivers/scsi/spfc/hw/spfc_queue.c
index 3f73fa26aad1..abcf1ff3f49f 100644
--- a/drivers/scsi/spfc/hw/spfc_queue.c
+++ b/drivers/scsi/spfc/hw/spfc_queue.c
@@ -1027,7 +1027,7 @@ u32 spfc_create_ssq(void *handle)
sq_ctrl->wqe_offset = 0;
sq_ctrl->head_start_cmsn = 0;
sq_ctrl->head_end_cmsn = SPFC_GET_WP_END_CMSN(0, sq_ctrl->wqe_num_per_buf);
- sq_ctrl->last_cmsn = 0;
+ sq_ctrl->last_pmsn = 0;
/* Linked List SQ Owner Bit 1 valid,0 invalid */
sq_ctrl->last_pi_owner = 1;
atomic_set(&sq_ctrl->sq_valid, true);
@@ -3127,7 +3127,7 @@ static u32 spfc_parent_sq_ring_direct_wqe_doorbell(struct spfc_parent_ssq_info *
struct spfc_hba_info *hba;
hba = (struct spfc_hba_info *)sq->hba;
- pmsn = sq->last_cmsn;
+ pmsn = sq->last_pmsn;
if (sq->cache_id == INVALID_VALUE32) {
FC_DRV_PRINT(UNF_LOG_IO_ATT, UNF_ERR,
@@ -3166,7 +3166,7 @@ u32 spfc_parent_sq_ring_doorbell(struct spfc_parent_ssq_info *sq, u8 qos_level,
struct spfc_parent_sq_db door_bell;
hba = (struct spfc_hba_info *)sq->hba;
- pmsn = sq->last_cmsn;
+ pmsn = sq->last_pmsn;
/* Obtain the low 8 Bit of PMSN */
pmsn_lo = (u8)(pmsn & SPFC_PMSN_MASK);
/* Obtain the high 8 Bit of PMSN */
@@ -3231,10 +3231,10 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
FC_DRV_PRINT(UNF_LOG_NORMAL, UNF_INFO,
"[info]Ssq(0x%x), xid(0x%x) qid(0x%x) add wqepage at Pmsn(0x%x), sqe_minus_cqe_cnt(0x%x)",
ssq->sqn, ssq->context_id, ssq->sq_queue_id,
- ssq->last_cmsn,
+ ssq->last_pmsn,
atomic_read(&ssq->sqe_minus_cqe_cnt));
- link_wqe_msn = SPFC_MSN_DEC(ssq->last_cmsn);
+ link_wqe_msn = SPFC_MSN_DEC(ssq->last_pmsn);
link_wqe = (struct spfc_linkwqe *)spfc_get_wqe_page_entry(tail_wpg,
ssq->wqe_offset);
msn_wd = be32_to_cpu(link_wqe->val_wd1);
@@ -3250,7 +3250,7 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
}
sqe_in_wp =
(struct spfc_sqe *)spfc_get_wqe_page_entry(tail_wpg, ssq->wqe_offset);
- spfc_build_wqe_owner_pmsn(io_sqe, (ssq->last_pi_owner), ssq->last_cmsn);
+ spfc_build_wqe_owner_pmsn(io_sqe, (ssq->last_pi_owner), ssq->last_pmsn);
SPFC_IO_STAT((struct spfc_hba_info *)ssq->hba, wqe_type);
wqe_gpa = tail_wpg->wpg_phy_addr + (ssq->wqe_offset * sizeof(struct spfc_sqe));
@@ -3260,11 +3260,11 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
dre_door_bell.wd0.cos = 0;
dre_door_bell.wd0.c = 0;
dre_door_bell.wd0.pi_hi =
- (u32)(ssq->last_cmsn >> UNF_SHIFT_12) & SPFC_DB_WD0_PI_H_MASK;
+ (u32)(ssq->last_pmsn >> UNF_SHIFT_12) & SPFC_DB_WD0_PI_H_MASK;
dre_door_bell.wd0.cntx_size = SPFC_CNTX_SIZE_T_256B;
dre_door_bell.wd0.xid = ssq->context_id;
dre_door_bell.wd1.sm_data = ssq->cache_id;
- dre_door_bell.wd1.pi_lo = (u32)(ssq->last_cmsn & SPFC_DB_WD0_PI_L_MASK);
+ dre_door_bell.wd1.pi_lo = (u32)(ssq->last_pmsn & SPFC_DB_WD0_PI_L_MASK);
io_sqe->db_val = *(u64 *)&dre_door_bell;
spfc_convert_parent_wqe_to_big_endian(io_sqe);
@@ -3275,7 +3275,7 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
"[INFO]Ssq(0x%x) xid:0x%x,qid:0x%x wqegpa:0x%llx,o:0x%x,outstandind:0x%x,pmsn:0x%x,cmsn:0x%x",
ssq->sqn, ssq->context_id, ssq->sq_queue_id, wqe_gpa,
ssq->last_pi_owner, atomic_read(&ssq->sqe_minus_cqe_cnt),
- ssq->last_cmsn, SPFC_GET_QUEUE_CMSN(ssq));
+ ssq->last_pmsn, SPFC_GET_QUEUE_CMSN(ssq));
ssq->accum_wqe_cnt++;
if (ssq->accum_wqe_cnt == accum_db_num) {
@@ -3286,7 +3286,7 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
}
ssq->wqe_offset += 1;
- ssq->last_cmsn = SPFC_MSN_INC(ssq->last_cmsn);
+ ssq->last_pmsn = SPFC_MSN_INC(ssq->last_pmsn);
atomic_inc(&ssq->sq_wqe_cnt);
atomic_inc(&ssq->sqe_minus_cqe_cnt);
SPFC_SQ_IO_STAT(ssq, wqe_type);
@@ -3319,7 +3319,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
FC_DRV_PRINT(UNF_LOG_NORMAL, UNF_INFO,
"[info]Ssq(0x%x), xid(0x%x) qid(0x%x) add wqepage at Pmsn(0x%x), WpgCnt(0x%x)",
ssq->sqn, ssq->context_id, ssq->sq_queue_id,
- ssq->last_cmsn,
+ ssq->last_pmsn,
atomic_read(&ssq->wqe_page_cnt));
cur_cmsn = SPFC_GET_QUEUE_CMSN(ssq);
spfc_free_sq_wqe_page(ssq, cur_cmsn);
@@ -3335,7 +3335,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
link_wqe->next_page_addr_hi = cpu_to_be32(addr_wd);
addr_wd = SPFC_LSD(new_wqe_page->wpg_phy_addr);
link_wqe->next_page_addr_lo = cpu_to_be32(addr_wd);
- link_wqe_msn = SPFC_MSN_DEC(ssq->last_cmsn);
+ link_wqe_msn = SPFC_MSN_DEC(ssq->last_pmsn);
msn_wd = be32_to_cpu(link_wqe->val_wd1);
msn_wd |= ((u32)(link_wqe_msn & SPFC_MSNWD_L_MASK));
msn_wd |= (((u32)(link_wqe_msn & SPFC_MSNWD_H_MASK)) << UNF_SHIFT_16);
@@ -3351,7 +3351,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
atomic_inc(&ssq->wqe_page_cnt);
}
- spfc_build_wqe_owner_pmsn(io_sqe, !(ssq->last_pi_owner), ssq->last_cmsn);
+ spfc_build_wqe_owner_pmsn(io_sqe, !(ssq->last_pi_owner), ssq->last_pmsn);
SPFC_IO_STAT((struct spfc_hba_info *)ssq->hba, wqe_type);
spfc_convert_parent_wqe_to_big_endian(io_sqe);
sqe_in_wp = (struct spfc_sqe *)spfc_get_wqe_page_entry(tail_wpg, ssq->wqe_offset);
@@ -3371,7 +3371,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
ssq->accum_wqe_cnt = 0;
}
ssq->wqe_offset += 1;
- ssq->last_cmsn = SPFC_MSN_INC(ssq->last_cmsn);
+ ssq->last_pmsn = SPFC_MSN_INC(ssq->last_pmsn);
atomic_inc(&ssq->sq_wqe_cnt);
atomic_inc(&ssq->sqe_minus_cqe_cnt);
SPFC_SQ_IO_STAT(ssq, wqe_type);
diff --git a/drivers/scsi/spfc/hw/spfc_queue.h b/drivers/scsi/spfc/hw/spfc_queue.h
index b1184eb17556..c09f098e7324 100644
--- a/drivers/scsi/spfc/hw/spfc_queue.h
+++ b/drivers/scsi/spfc/hw/spfc_queue.h
@@ -597,7 +597,7 @@ struct spfc_parent_ssq_info {
u32 wqe_offset;
u16 head_start_cmsn;
u16 head_end_cmsn;
- u16 last_cmsn;
+ u16 last_pmsn;
u16 last_pi_owner;
u32 queue_style;
atomic_t sq_valid;
--
2.20.1

26 Nov '21
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JXCG
CVE: NA
Changes:
1. Use the BSG module to replace the ioctl interface (a usage sketch
follows this list).
2. Use an email address as MODULE_AUTHOR.
3. Add error handling for the PCIe error case.
4. Support SMR HDDs in HBA mode.
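A hedged userspace sketch of what change 1 enables: submitting a
pass-through request through a bsg node with the generic sg_io_v4 ABI
from <linux/bsg.h>. The node path and the idea that the driver expects
a struct spraid_bsg_request in the request buffer are assumptions based
on the structures this patch adds, not a documented spraid interface.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/bsg.h>
#include <scsi/sg.h>

int main(void)
{
	unsigned char req[128] = {0};	/* stands in for struct spraid_bsg_request */
	struct sg_io_v4 hdr;
	int fd = open("/dev/bsg/spraid0", O_RDWR);	/* assumed node name */

	if (fd < 0) {
		perror("open bsg node");
		return 1;
	}

	memset(&hdr, 0, sizeof(hdr));
	hdr.guard = 'Q';		/* required magic for sg_io_v4 */
	hdr.protocol = BSG_PROTOCOL_SCSI;
	hdr.subprotocol = BSG_SUB_PROTOCOL_SCSI_TRANSPORT;
	hdr.request = (uintptr_t)req;
	hdr.request_len = sizeof(req);
	hdr.timeout = 30 * 1000;	/* milliseconds */

	if (ioctl(fd, SG_IO, &hdr) < 0)
		perror("SG_IO");
	close(fd);
	return 0;
}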
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu <yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/spraid.h | 146 ++-
drivers/scsi/spraid/spraid_main.c | 1412 +++++++++++++++--------------
2 files changed, 840 insertions(+), 718 deletions(-)
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
index da46d8e1b4b6..6a04fb65ec93 100644
--- a/drivers/scsi/spraid/spraid.h
+++ b/drivers/scsi/spraid/spraid.h
@@ -1,4 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
#ifndef __SPRAID_H_
#define __SPRAID_H_
@@ -24,7 +25,7 @@
#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
#define SPRAID_AQ_DEPTH 128
-#define SPRAID_NR_AEN_COMMANDS 1
+#define SPRAID_NR_AEN_COMMANDS 16
#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
@@ -44,7 +45,7 @@
#define SMALL_POOL_SIZE 256
#define MAX_SMALL_POOL_NUM 16
-#define MAX_CMD_PER_DEV 32
+#define MAX_CMD_PER_DEV 64
#define MAX_CDB_LEN 32
#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
@@ -53,7 +54,7 @@
#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
-#define SPRAID_SERVER_DEVICE_HAB_DID 0x2100
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
#define IO_6_DEFAULT_TX_LEN 256
@@ -142,11 +143,15 @@ enum {
enum {
SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
SPRAID_AEN_HOST_PROBING = 0x10,
};
enum {
- SPRAID_AEN_TIMESYN = 0x07
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
};
enum {
@@ -175,6 +180,16 @@ enum spraid_state {
SPRAID_DEAD,
};
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
struct spraid_completion {
__le32 result;
union {
@@ -217,8 +232,6 @@ struct spraid_dev {
struct dma_pool *prp_page_pool;
struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
mempool_t *iod_mempool;
- struct blk_mq_tag_set admin_tagset;
- struct request_queue *admin_q;
void __iomem *bar;
u32 max_qid;
u32 num_vecs;
@@ -232,23 +245,27 @@ struct spraid_dev {
u32 ctrl_config;
u32 online_queues;
u64 cap;
- struct device ctrl_device;
- struct cdev cdev;
int instance;
struct spraid_ctrl_info *ctrl_info;
struct spraid_dev_info *devices;
- struct spraid_ioq_ptcmd *ioq_ptcmds;
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock; /* spinlock for lock handling */
+
+ struct spraid_cmd *ioq_ptcmds;
struct list_head ioq_pt_list;
- spinlock_t ioq_pt_lock;
+ spinlock_t ioq_pt_lock; /* spinlock for lock handling */
- struct work_struct aen_work;
struct work_struct scan_work;
struct work_struct timesyn_work;
struct work_struct reset_work;
+ struct work_struct fw_act_work;
enum spraid_state state;
- spinlock_t state_lock;
+ spinlock_t state_lock; /* spinlock for lock handling */
+
+ struct request_queue *bsg_queue;
};
struct spraid_sgl_desc {
@@ -347,6 +364,35 @@ struct spraid_get_info {
__u32 rsvd12[4];
};
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
enum {
SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
@@ -393,6 +439,7 @@ struct spraid_admin_command {
struct spraid_get_info get_info;
struct spraid_abort_cmd abort;
struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
};
};
@@ -456,9 +503,6 @@ struct spraid_ioq_command {
};
};
-#define SPRAID_IOCTL_RESET_CMD _IOWR('N', 0x80, struct spraid_passthru_common_cmd)
-#define SPRAID_IOCTL_ADMIN_CMD _IOWR('N', 0x41, struct spraid_passthru_common_cmd)
-
struct spraid_passthru_common_cmd {
__u8 opcode;
__u8 flags;
@@ -494,8 +538,6 @@ struct spraid_passthru_common_cmd {
__u32 result1;
};
-#define SPRAID_IOCTL_IOQ_CMD _IOWR('N', 0x42, struct spraid_ioq_passthru_cmd)
-
struct spraid_ioq_passthru_cmd {
__u8 opcode;
__u8 flags;
@@ -560,7 +602,21 @@ struct spraid_ioq_passthru_cmd {
__u32 result1;
};
-struct spraid_ioq_ptcmd {
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
int qid;
int cid;
u32 result0;
@@ -572,14 +628,6 @@ struct spraid_ioq_ptcmd {
struct list_head list;
};
-struct spraid_admin_request {
- struct spraid_admin_command *cmd;
- u32 result0;
- u32 result1;
- u16 flags;
- u16 status;
-};
-
struct spraid_queue {
struct spraid_dev *hdev;
spinlock_t sq_lock; /* spinlock for lock handling */
@@ -607,7 +655,6 @@ struct spraid_queue {
};
struct spraid_iod {
- struct spraid_admin_request req;
struct spraid_queue *spraidq;
enum spraid_cmd_state state;
int npages;
@@ -623,13 +670,51 @@ struct spraid_iod {
};
#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
-#define SPRAID_DEV_INFO_ATTR_HDD(attr) ((attr) & 0x02)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
struct spraid_dev_info {
__le32 hdid;
__le16 target;
@@ -649,6 +734,11 @@ struct spraid_dev_list {
struct spraid_sdev_hostdata {
u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
};
#endif
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index a0a75ecb0027..86489164f672 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
-/*
- * Linux spraid device driver
- * Copyright(c) 2021 Ramaxel Memory Technology, Ltd
- */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
#define pr_fmt(fmt) "spraid: " fmt
#include <linux/sched/signal.h>
@@ -23,6 +23,8 @@
#include <linux/debugfs.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
@@ -30,6 +32,8 @@
#include <scsi/scsi_host.h>
#include <scsi/scsi_transport.h>
#include <scsi/scsi_dbg.h>
+#include <scsi/sg.h>
+
#include "spraid.h"
@@ -112,10 +116,10 @@ MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
static void spraid_free_queue(struct spraid_queue *spraidq);
static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1);
static DEFINE_IDA(spraid_instance_ida);
-static dev_t spraid_chr_devt;
+
static struct class *spraid_class;
#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
@@ -147,6 +151,13 @@ enum FW_STAT_CODE {
FW_STAT_NEED_RETRY
};
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60", "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED", "FORMATTING", "SANITIZING",
+ "INITIALIZING", "INITIALIZE_FAIL", "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
static int ioq_depth_set(const char *val, const struct kernel_param *kp)
{
int n = 0;
@@ -263,12 +274,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
return ret;
}
-static inline
-struct spraid_admin_request *spraid_admin_req(struct request *req)
-{
- return blk_mq_rq_to_pdu(req);
-}
-
static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
{
u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
@@ -419,7 +424,7 @@ static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
writel(spraidq->sq_tail, spraidq->q_db);
spin_unlock_irqrestore(&spraidq->sq_lock, flags);
- dev_log_dbg(spraidq->hdev->dev, "cid[%d], qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ dev_log_dbg(spraidq->hdev->dev, "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
acd->command_id, spraidq->qid, acd->opcode, acd->flags, le32_to_cpu(acd->hdid));
}
@@ -814,7 +819,7 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
if (scmd->result & SAM_STAT_CHECK_CONDITION) {
memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
memcpy(scmd->sense_buffer, iod->sense, SCSI_SENSE_BUFFERSIZE);
- set_driver_byte(scmd, DRIVER_SENSE);
+ scmd->result = (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
}
break;
case FW_STAT_ABORTED:
@@ -850,14 +855,13 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
int ret;
if (unlikely(!scmd)) {
- dev_err(hdev->dev, "err, scmd is null, return 0\n");
+ dev_err(hdev->dev, "err, scmd is null\n");
return 0;
}
if (unlikely(hdev->state != SPRAID_LIVE)) {
set_host_byte(scmd, DID_NO_CONNECT);
scmd->scsi_done(scmd);
- dev_err(hdev->dev, "[%s] err, hdev state is not live\n", __func__);
return 0;
}
@@ -894,7 +898,7 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
spraid_submit_cmd(ioq, &ioq_cmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
cid, hwq, elapsed / HZ, elapsed % HZ);
return 0;
@@ -945,6 +949,10 @@ static int spraid_slave_alloc(struct scsi_device *sdev)
scan_host:
hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
sdev->hostdata = hostdata;
up_read(&hdev->devices_rwsem);
return 0;
@@ -964,7 +972,7 @@ static int spraid_slave_configure(struct scsi_device *sdev)
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
- if (!hostdata) {
+ if (hostdata) {
idx = hostdata->hdid - 1;
if (sdev->channel == hdev->devices[idx].channel &&
sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
@@ -1176,6 +1184,75 @@ static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
spraidq->cq_phase;
}
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd, struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev, struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
@@ -1197,12 +1274,12 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
iod = scsi_cmd_priv(scmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
SPRAID_CMD_IN_FLIGHT) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
@@ -1215,6 +1292,8 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
return;
}
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
spraid_map_status(iod, scmd, cqe);
if (iod->nsge) {
iod->nsge = 0;
@@ -1224,38 +1303,36 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
scmd->scsi_done(scmd);
}
-static inline void spraid_end_admin_request(struct request *req, __le16 status,
- __le32 result0, __le32 result1)
-{
- struct spraid_admin_request *rq = spraid_admin_req(req);
-
- rq->status = le16_to_cpu(status) >> 1;
- rq->result0 = le32_to_cpu(result0);
- rq->result1 = le32_to_cpu(result1);
- blk_mq_complete_request(req);
-}
-
static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq, struct spraid_completion *cqe)
{
- struct blk_mq_tags *tags = adminq->hdev->admin_tagset.tags[0];
- struct request *req;
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
- req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
- if (unlikely(!req)) {
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
dev_warn(adminq->hdev->dev, "Invalid id %d completed on queue %d\n",
cqe->cmd_id, le16_to_cpu(cqe->sq_id));
return;
}
- spraid_end_admin_request(req, cqe->status, cqe->result, cqe->result1);
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
}
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = spraidq->hdev;
u32 result = le32_to_cpu(cqe->result);
- dev_info(hdev->dev, "rcv aen, status[%x], result[%x]\n",
- le16_to_cpu(cqe->status) >> 1, result);
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
return;
@@ -1264,22 +1341,19 @@ static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_comp
spraid_handle_aen_notice(hdev, result);
break;
case SPRAID_AEN_VS:
- spraid_handle_aen_vs(hdev, result);
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
break;
default:
dev_warn(hdev->dev, "Unsupported async event type: %u\n",
result & 0x7);
break;
}
- queue_work(spraid_wq, &hdev->aen_work);
}
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd);
-
static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
- struct spraid_ioq_ptcmd *ptcmd;
+ struct spraid_cmd *ptcmd;
ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
@@ -1289,8 +1363,6 @@ static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct sprai
ptcmd->result1 = le32_to_cpu(cqe->result1);
complete(&ptcmd->cmd_done);
-
- spraid_put_ioq_ptcmd(hdev, ptcmd);
}
static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
@@ -1304,7 +1376,7 @@ static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
return;
}
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
@@ -1452,62 +1524,117 @@ static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
}
-static inline void spraid_clear_spraid_request(struct request *req)
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
}
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
}
-static struct request *spraid_alloc_admin_request(struct request_queue *q,
- struct spraid_admin_command *cmd,
- blk_mq_req_flags_t flags)
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
{
- u32 op = COMMAND_IS_WRITE(cmd) ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN;
- struct request *req;
+ kfree(hdev->adm_cmds);
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
- req = blk_mq_alloc_request(q, op, flags);
- if (IS_ERR(req))
- return req;
- req->cmd_flags |= REQ_FAILFAST_DRIVER;
- spraid_clear_spraid_request(req);
- spraid_admin_req(req)->cmd = cmd;
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev, enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
- return req;
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
+
+ return cmd;
}
-static int spraid_submit_admin_sync_cmd(struct request_queue *q,
- struct spraid_admin_command *cmd,
- u32 *result, void *buffer,
- u32 bufflen, u32 timeout, int at_head, blk_mq_req_flags_t flags)
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
{
- struct request *req;
- int ret;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
- req = spraid_alloc_admin_request(q, cmd, flags);
- if (IS_ERR(req))
- return PTR_ERR(req);
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (buffer && bufflen) {
- ret = blk_rq_map_kern(q, req, buffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
+
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev, struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
}
- blk_execute_rq(req->q, NULL, req, at_head);
- if (result)
- *result = spraid_admin_req(req)->result0;
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
+ init_completion(&adm_cmd->cmd_done);
-out:
- blk_mq_free_request(req);
- return ret;
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid, cmd->usr_cmd.opcode,
+ cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
}
static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
@@ -1524,8 +1651,7 @@ static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
@@ -1542,8 +1668,7 @@ static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
admin_cmd.create_sq.cqid = cpu_to_le16(qid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static void spraid_free_queue(struct spraid_queue *spraidq)
@@ -1581,8 +1706,7 @@ static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
admin_cmd.delete_queue.opcode = op;
admin_cmd.delete_queue.qid = cpu_to_le16(id);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret)
dev_err(hdev->dev, "Delete %s:[%d] failed\n",
@@ -1663,19 +1787,28 @@ static int spraid_set_features(struct spraid_dev *hdev, u32 fid, u32 dword11, vo
size_t buflen, u32 *result)
{
struct spraid_admin_command admin_cmd;
- u32 res;
int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
admin_cmd.features.fid = cpu_to_le32(fid);
admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, &res,
- buffer, buflen, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
- if (!ret && result)
- *result = res;
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
return ret;
}
@@ -1764,8 +1897,7 @@ static int spraid_setup_io_queues(struct spraid_dev *hdev)
break;
}
dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d, online_queue: %d, ioq_depth: %d\n",
- __func__, hdev->max_qid, hdev->queue_count,
- hdev->online_queues, hdev->ioq_depth);
+ __func__, hdev->max_qid, hdev->queue_count, hdev->online_queues, hdev->ioq_depth);
return spraid_create_io_queues(hdev);
}
@@ -1889,10 +2021,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
struct spraid_admin_command admin_cmd;
struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
u32 i, idx, hdid, ndev;
int ret = 0;
- list_buf = kmalloc(sizeof(*list_buf), GFP_KERNEL);
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
if (!list_buf)
return -ENOMEM;
@@ -1901,9 +2034,9 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL, list_buf,
- sizeof(*list_buf), 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret) {
dev_err(hdev->dev, "Get device list failed, nd: %u, idx: %u, ret: %d\n",
@@ -1916,12 +2049,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
for (i = 0; i < ndev; i++) {
hdid = le32_to_cpu(list_buf->devices[i].hdid);
- dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[%x]\n",
- i, hdid,
- le16_to_cpu(list_buf->devices[i].target),
- list_buf->devices[i].channel,
- list_buf->devices[i].lun,
- list_buf->devices[i].attr);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid, le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
if (hdid > nd || hdid == 0) {
dev_err(hdev->dev, "err, hdid[%d] invalid\n", hdid);
continue;
@@ -1936,21 +2068,29 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
}
out:
- kfree(list_buf);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
return ret;
}
-static void spraid_send_aen(struct spraid_dev *hdev)
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
{
struct spraid_queue *adminq = &hdev->queues[0];
struct spraid_admin_command admin_cmd;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
- admin_cmd.common.command_id = SPRAID_AQ_BLK_MQ_DEPTH;
+ admin_cmd.common.command_id = cid;
spraid_submit_cmd(adminq, &admin_cmd);
- dev_info(hdev->dev, "send aen, cid[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
}
static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *device)
@@ -1976,7 +2116,7 @@ static int spraid_rescan_device(struct spraid_dev *hdev, struct spraid_dev_info
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d, target_id: %d, lun: %d\n",
device->channel, le16_to_cpu(device->target), 0);
return -ENODEV;
}
@@ -1993,7 +2133,7 @@ static int spraid_remove_device(struct spraid_dev *hdev, struct spraid_dev_info
sdev = scsi_device_lookup(shost, org_device->channel, le16_to_cpu(org_device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d, target_id: %d, lun: %d\n",
org_device->channel, le16_to_cpu(org_device->target), 0);
return -ENODEV;
}
@@ -2083,6 +2223,15 @@ static void spraid_timesyn_work(struct work_struct *work)
spraid_configure_timestamp(hdev);
}
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev = container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
static void spraid_queue_scan(struct spraid_dev *hdev)
{
queue_work(spraid_wq, &hdev->scan_work);
@@ -2094,6 +2243,9 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
case SPRAID_AEN_DEV_CHANGED:
spraid_queue_scan(hdev);
break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
case SPRAID_AEN_HOST_PROBING:
break;
default:
@@ -2101,25 +2253,25 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
}
}
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result)
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1)
{
- switch (result) {
+ switch ((result & 0xff00) >> 8) {
case SPRAID_AEN_TIMESYN:
queue_work(spraid_wq, &hdev->timesyn_work);
break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d], param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
default:
- dev_warn(hdev->dev, "async event result: %x\n", result);
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
}
}
-static void spraid_async_event_work(struct work_struct *work)
-{
- struct spraid_dev *hdev =
- container_of(work, struct spraid_dev, aen_work);
-
- spraid_send_aen(hdev);
-}
-
static int spraid_alloc_resources(struct spraid_dev *hdev)
{
int ret, nqueue;
@@ -2149,10 +2301,16 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
goto destroy_dma_pools;
}
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
return 0;
+free_queues:
+ kfree(hdev->queues);
destroy_dma_pools:
spraid_destroy_dma_pools(hdev);
free_ctrl_info:
@@ -2164,50 +2322,18 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
static void spraid_free_resources(struct spraid_dev *hdev)
{
+ spraid_free_admin_cmds(hdev);
kfree(hdev->queues);
spraid_destroy_dma_pools(hdev);
kfree(hdev->ctrl_info);
ida_free(&spraid_instance_ida, hdev->instance);
}
-static void spraid_setup_passthrough(struct request *req, struct spraid_admin_command *cmd)
-{
- memcpy(cmd, spraid_admin_req(req)->cmd, sizeof(*cmd));
- cmd->common.flags &= ~SPRAID_CMD_FLAG_SGL_ALL;
-}
-
-static inline void spraid_clear_hreq(struct request *req)
-{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
- }
-}
-
-static blk_status_t spraid_setup_admin_cmd(struct request *req, struct spraid_admin_command *cmd)
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
{
- spraid_clear_hreq(req);
-
- memset(cmd, 0, sizeof(*cmd));
- switch (req_op(req)) {
- case REQ_OP_DRV_IN:
- case REQ_OP_DRV_OUT:
- spraid_setup_passthrough(req, cmd);
- break;
- default:
- WARN_ON_ONCE(1);
- return BLK_STS_IOERR;
- }
-
- cmd->common.command_id = req->tag;
- return BLK_STS_OK;
-}
-
-static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
if (iod->nsge)
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2215,36 +2341,36 @@ static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
spraid_free_iod_res(hdev, iod);
}
-static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct request *req,
- struct spraid_admin_command *cmd)
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct request_queue *admin_q = req->q;
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
- blk_status_t ret = BLK_STS_IOERR;
- int nr_mapped;
- int res;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
- sg_init_table(iod->sg, blk_rq_nr_phys_segments(req));
- iod->nsge = blk_rq_map_sg(admin_q, req, iod->sg);
if (!iod->nsge)
goto out;
- dev_info(hdev->dev, "nseg: %u, nsge: %u\n",
- blk_rq_nr_phys_segments(req), iod->nsge);
-
- ret = BLK_STS_RESOURCE;
- nr_mapped = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
- if (!nr_mapped)
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
goto out;
- res = spraid_setup_prps(hdev, iod);
- if (res)
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
goto unmap;
+
cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
- return BLK_STS_OK;
+
+ return 0;
unmap:
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2252,137 +2378,29 @@ static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct reques
return ret;
}
-static blk_status_t spraid_init_admin_iod(struct request *rq, struct spraid_dev *hdev)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(rq);
- int nents = blk_rq_nr_phys_segments(rq);
- unsigned int size = blk_rq_payload_bytes(rq);
-
- if (nents > SPRAID_INT_PAGES || size > SPRAID_INT_BYTES(hdev)) {
- iod->sg = mempool_alloc(hdev->iod_mempool, GFP_ATOMIC);
- if (!iod->sg)
- return BLK_STS_RESOURCE;
- } else {
- iod->sg = iod->inline_sg;
- }
-
- iod->nsge = 0;
- iod->use_sgl = false;
- iod->npages = -1;
- iod->length = size;
- iod->sg_drv_mgmt = true;
-
- return BLK_STS_OK;
-}
-
-static blk_status_t spraid_queue_admin_rq(struct blk_mq_hw_ctx *hctx,
- const struct blk_mq_queue_data *bd)
-{
- struct spraid_queue *adminq = hctx->driver_data;
- struct spraid_dev *hdev = adminq->hdev;
- struct request *req = bd->rq;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_admin_command cmd;
- blk_status_t ret;
-
- ret = spraid_setup_admin_cmd(req, &cmd);
- if (ret)
- goto out;
-
- ret = spraid_init_admin_iod(req, hdev);
- if (ret)
- goto out;
-
- if (blk_rq_nr_phys_segments(req)) {
- ret = spraid_admin_map_data(hdev, req, &cmd);
- if (ret)
- goto cleanup_iod;
- }
-
- blk_mq_start_request(req);
- spraid_submit_cmd(adminq, &cmd);
- return BLK_STS_OK;
-
-cleanup_iod:
- spraid_free_iod_res(hdev, iod);
-out:
- return ret;
-}
-
-static blk_status_t spraid_error_status(struct request *req)
-{
- switch (spraid_admin_req(req)->status & 0x7ff) {
- case SPRAID_SC_SUCCESS:
- return BLK_STS_OK;
- default:
- return BLK_STS_IOERR;
- }
-}
-
-static void spraid_complete_admin_rq(struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_dev *hdev = iod->spraidq->hdev;
-
- if (blk_rq_nr_phys_segments(req))
- spraid_unmap_data(hdev, req);
- blk_mq_end_request(req, spraid_error_status(req));
-}
-
-static int spraid_admin_init_hctx(struct blk_mq_hw_ctx *hctx, void *data, unsigned int hctx_idx)
-{
- struct spraid_dev *hdev = data;
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(hctx_idx != 0);
- WARN_ON(hdev->admin_tagset.tags[0] != hctx->tags);
-
- hctx->driver_data = adminq;
- return 0;
-}
-
-static int spraid_admin_init_request(struct blk_mq_tag_set *set, struct request *req,
- unsigned int hctx_idx, unsigned int numa_node)
-{
- struct spraid_dev *hdev = set->driver_data;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(!adminq);
- iod->spraidq = adminq;
- return 0;
-}
-
-static enum blk_eh_timer_return
-spraid_admin_timeout(struct request *req, bool reserved)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *spraidq = iod->spraidq;
- struct spraid_dev *hdev = spraidq->hdev;
-
- dev_err(hdev->dev, "Admin cid[%d] qid[%d] timeout\n",
- req->tag, spraidq->qid);
-
- if (spraid_poll_cq(spraidq, req->tag)) {
- dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, completion polled\n",
- req->tag, spraidq->qid);
- return BLK_EH_DONE;
- }
-
- spraid_end_admin_request(req, cpu_to_le16(-EINVAL), 0, 0);
- return BLK_EH_DONE;
-}
-
static int spraid_get_ctrl_info(struct spraid_dev *hdev, struct spraid_ctrl_info *ctrl_info)
{
struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- ctrl_info, sizeof(struct spraid_ctrl_info), 0, 0, 0);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
}
static int spraid_init_ctrl_info(struct spraid_dev *hdev)
@@ -2416,6 +2434,11 @@ static int spraid_init_ctrl_info(struct spraid_dev *hdev)
dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
return 0;
}
@@ -2444,99 +2467,54 @@ static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
mempool_destroy(hdev->iod_mempool);
}
-static int spraid_submit_user_cmd(struct request_queue *q, struct spraid_admin_command *cmd,
- void __user *ubuffer, unsigned int bufflen, u32 *result,
- unsigned int timeout)
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct request *req;
- struct bio *bio = NULL;
- int ret;
-
- req = spraid_alloc_admin_request(q, cmd, 0);
- if (IS_ERR(req))
- return PTR_ERR(req);
-
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- spraid_admin_req(req)->flags |= SPRAID_REQ_USERCMD;
-
- if (ubuffer && bufflen) {
- ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
- bio = req->bio;
- }
- blk_execute_rq(req->q, NULL, req, 0);
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
- if (result) {
- result[0] = spraid_admin_req(req)->result0;
- result[1] = spraid_admin_req(req)->result1;
- }
- if (bio)
- blk_rq_unmap_user(bio);
-out:
- blk_mq_free_request(req);
- return ret;
-}
-
-static int spraid_user_admin_cmd(struct spraid_dev *hdev,
- struct spraid_passthru_common_cmd __user *ucmd)
-{
- struct spraid_passthru_common_cmd cmd;
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
struct spraid_admin_command admin_cmd;
- u32 timeout = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
int status;
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
-
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
-
- if (cmd.flags) {
- dev_err(hdev->dev, "Invalid flags in user command\n");
- return -EINVAL;
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n", __func__,
+ hdev->state);
+ return -EBUSY;
}
- dev_info(hdev->dev, "user_admin_cmd opcode: 0x%x, subopcode: 0x%x\n",
- cmd.opcode, cmd.cdw2 & 0x7ff);
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
memset(&admin_cmd, 0, sizeof(admin_cmd));
- admin_cmd.common.opcode = cmd.opcode;
- admin_cmd.common.flags = cmd.flags;
- admin_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- admin_cmd.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
- admin_cmd.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
- admin_cmd.common.cdw10 = cpu_to_le32(cmd.cdw10);
- admin_cmd.common.cdw11 = cpu_to_le32(cmd.cdw11);
- admin_cmd.common.cdw12 = cpu_to_le32(cmd.cdw12);
- admin_cmd.common.cdw13 = cpu_to_le32(cmd.cdw13);
- admin_cmd.common.cdw14 = cpu_to_le32(cmd.cdw14);
- admin_cmd.common.cdw15 = cpu_to_le32(cmd.cdw15);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
-
- status = spraid_submit_user_cmd(hdev->admin_q, &admin_cmd,
- (void __user *)(uintptr_t)cmd.addr, cmd.info_1.data_len,
- &cmd.result0, timeout);
-
- dev_info(hdev->dev, "user_admin_cmd status: 0x%x, result0: 0x%x, result1: 0x%x\n",
- status, cmd.result0, cmd.result1);
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0], &result[1], timeout);
if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0))
- return -EFAULT;
- if (put_user(cmd.result1, &ucmd->result1))
- return -EFAULT;
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
}
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode, status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
return status;
}
@@ -2548,8 +2526,8 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
INIT_LIST_HEAD(&hdev->ioq_pt_list);
spin_lock_init(&hdev->ioq_pt_lock);
- hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_ioq_ptcmd),
- GFP_KERNEL, hdev->numa_node);
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
if (!hdev->ioq_ptcmds) {
dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
@@ -2567,55 +2545,33 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
return 0;
}
-static struct spraid_ioq_ptcmd *spraid_get_ioq_ptcmd(struct spraid_dev *hdev)
-{
- struct spraid_ioq_ptcmd *cmd = NULL;
- unsigned long flags;
-
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- if (list_empty(&hdev->ioq_pt_list)) {
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
- dev_err(hdev->dev, "err, ioq ptcmd list empty\n");
- return NULL;
- }
- cmd = list_entry((&hdev->ioq_pt_list)->next, struct spraid_ioq_ptcmd, list);
- list_del_init(&cmd->list);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
-
- WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
-
- return cmd;
-}
-
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd)
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
{
- unsigned long flags;
-
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- list_add(&cmd->list, (&hdev->ioq_pt_list)->next);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
+ kfree(hdev->ioq_ptcmds);
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
}
static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq_command *cmd,
- u32 *result, void **sense, u32 timeout)
+ u32 *result, u32 *reslen, u32 timeout)
{
- struct spraid_queue *ioq;
int ret;
dma_addr_t sense_dma;
- struct spraid_ioq_ptcmd *pt_cmd = spraid_get_ioq_ptcmd(hdev);
-
- *sense = NULL;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
- if (!pt_cmd)
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
return -EFAULT;
+ }
- dev_info(hdev->dev, "[%s] ptcmd, cid[%d], qid[%d]\n", __func__, pt_cmd->cid, pt_cmd->qid);
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
init_completion(&pt_cmd->cmd_done);
ioq = &hdev->queues[pt_cmd->qid];
ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
- pt_cmd->priv = ioq->sense + ret;
+ sense_addr = ioq->sense + ret;
sense_dma = ioq->sense_dma_addr + ret;
cmd->common.sense_addr = cpu_to_le64(sense_dma);
@@ -2625,262 +2581,90 @@ static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq
spraid_submit_cmd(ioq, cmd);
if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
- dev_err(hdev->dev, "[%s] cid[%d], qid[%d] timeout\n",
- __func__, pt_cmd->cid, pt_cmd->qid);
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
return -EINVAL;
}
- if (result) {
- result[0] = pt_cmd->result0;
- result[1] = pt_cmd->result1;
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
}
- if ((pt_cmd->status & 0x17f) == 0x101)
- *sense = pt_cmd->priv;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
return pt_cmd->status;
}
-static int spraid_user_ioq_cmd(struct spraid_dev *hdev,
- struct spraid_ioq_passthru_cmd __user *ucmd)
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct spraid_ioq_passthru_cmd cmd;
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
struct spraid_ioq_command ioq_cmd;
- u32 timeout = 0;
int status = 0;
- u8 *data_ptr = NULL;
- dma_addr_t data_dma;
- enum dma_data_direction dma_dir = DMA_NONE;
- void *sense = NULL;
-
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
-
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
- if (cmd.data_len > PAGE_SIZE) {
+ if (cmd->data_len > PAGE_SIZE) {
dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
return -EFAULT;
}
- dev_info(hdev->dev, "[%s] opcode: 0x%x, subopcode: 0x%x, datalen: %d\n",
- __func__, cmd.opcode, cmd.info_1.subopcode, cmd.data_len);
-
- if (cmd.addr && cmd.data_len) {
- data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
- if (!data_ptr)
- return -ENOMEM;
-
- dma_dir = (cmd.opcode & 1) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n", __func__,
+ hdev->state);
+ return -EBUSY;
}
- if (dma_dir == DMA_TO_DEVICE) {
- if (copy_from_user(data_ptr, (void __user *)(uintptr_t)cmd.addr, cmd.data_len)) {
- dev_err(hdev->dev, "[%s] copy user data failed\n", __func__);
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init, datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
memset(&ioq_cmd, 0, sizeof(ioq_cmd));
- ioq_cmd.common.opcode = cmd.opcode;
- ioq_cmd.common.flags = cmd.flags;
- ioq_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- ioq_cmd.common.sense_len = cpu_to_le16(cmd.info_0.res_sense_len);
- ioq_cmd.common.cdb_len = cmd.info_0.cdb_len;
- ioq_cmd.common.rsvd2 = cmd.info_0.rsvd0;
- ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd.cdw3);
- ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd.cdw4);
- ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd.cdw5);
- ioq_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
-
- ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd.cdw10);
- ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd.cdw11);
- ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd.cdw12);
- ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd.cdw13);
- ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd.cdw14);
- ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd.data_len);
-
- memcpy(ioq_cmd.common.cdb, &cmd.cdw16, cmd.info_0.cdb_len);
-
- ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd.cdw26[0]);
- ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd.cdw26[1]);
- ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd.cdw26[2]);
- ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd.cdw26[3]);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
- timeout = timeout ? timeout : ADMIN_TIMEOUT;
-
- status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, &cmd.result0, &sense, timeout);
-
- if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (put_user(cmd.result1, &ucmd->result1)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (dma_dir == DMA_FROM_DEVICE &&
- copy_to_user((void __user *)(uintptr_t)cmd.addr, data_ptr, cmd.data_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job, (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
}
- if (sense) {
- if (copy_to_user((void *__user *)(uintptr_t)cmd.sense_addr,
- sense, cmd.info_0.res_sense_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply, &job->reply_len, timeout);
-free_dma_mem:
- if (data_ptr)
- dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, status, job->reply_len);
- return status;
+ spraid_bsg_unmap_data(hdev, job);
+ return status;
}
static int spraid_reset_work_sync(struct spraid_dev *hdev);
-static int spraid_user_reset_cmd(struct spraid_dev *hdev)
-{
- int ret;
-
- dev_info(hdev->dev, "[%s] start user reset cmd\n", __func__);
- ret = spraid_reset_work_sync(hdev);
- dev_info(hdev->dev, "[%s] stop user reset cmd[%d]\n", __func__, ret);
-
- return ret;
-}
-
-static int hdev_open(struct inode *inode, struct file *file)
-{
- struct spraid_dev *hdev =
- container_of(inode->i_cdev, struct spraid_dev, cdev);
- file->private_data = hdev;
- return 0;
-}
-
-static long hdev_ioctl(struct file *file, u32 cmd, unsigned long arg)
-{
- struct spraid_dev *hdev = file->private_data;
- void __user *argp = (void __user *)arg;
-
- switch (cmd) {
- case SPRAID_IOCTL_ADMIN_CMD:
- return spraid_user_admin_cmd(hdev, argp);
- case SPRAID_IOCTL_IOQ_CMD:
- return spraid_user_ioq_cmd(hdev, argp);
- case SPRAID_IOCTL_RESET_CMD:
- return spraid_user_reset_cmd(hdev);
- default:
- return -ENOTTY;
- }
-}
-
-static const struct file_operations spraid_dev_fops = {
- .owner = THIS_MODULE,
- .open = hdev_open,
- .unlocked_ioctl = hdev_ioctl,
- .compat_ioctl = hdev_ioctl,
-};
-
-static int spraid_create_cdev(struct spraid_dev *hdev)
-{
- int ret;
-
- device_initialize(&hdev->ctrl_device);
- hdev->ctrl_device.devt = MKDEV(MAJOR(spraid_chr_devt), hdev->instance);
- hdev->ctrl_device.class = spraid_class;
- hdev->ctrl_device.parent = hdev->dev;
- dev_set_drvdata(&hdev->ctrl_device, hdev);
- ret = dev_set_name(&hdev->ctrl_device, "spraid%d", hdev->instance);
- if (ret)
- return ret;
- cdev_init(&hdev->cdev, &spraid_dev_fops);
- hdev->cdev.owner = THIS_MODULE;
- ret = cdev_device_add(&hdev->cdev, &hdev->ctrl_device);
- if (ret) {
- dev_err(hdev->dev, "Add cdev failed, ret: %d", ret);
- put_device(&hdev->ctrl_device);
- kfree_const(hdev->ctrl_device.kobj.name);
- return ret;
- }
-
- return 0;
-}
-
-static inline void spraid_remove_cdev(struct spraid_dev *hdev)
-{
- cdev_device_del(&hdev->cdev, &hdev->ctrl_device);
-}
-
-static const struct blk_mq_ops spraid_admin_mq_ops = {
- .queue_rq = spraid_queue_admin_rq,
- .complete = spraid_complete_admin_rq,
- .init_hctx = spraid_admin_init_hctx,
- .init_request = spraid_admin_init_request,
- .timeout = spraid_admin_timeout,
-};
-
-static void spraid_remove_admin_tagset(struct spraid_dev *hdev)
-{
- if (hdev->admin_q && !blk_queue_dying(hdev->admin_q)) {
- blk_mq_unquiesce_queue(hdev->admin_q);
- blk_cleanup_queue(hdev->admin_q);
- blk_mq_free_tag_set(&hdev->admin_tagset);
- }
-}
-
-static int spraid_alloc_admin_tags(struct spraid_dev *hdev)
-{
- if (!hdev->admin_q) {
- hdev->admin_tagset.ops = &spraid_admin_mq_ops;
- hdev->admin_tagset.nr_hw_queues = 1;
-
- hdev->admin_tagset.queue_depth = SPRAID_AQ_MQ_TAG_DEPTH;
- hdev->admin_tagset.timeout = ADMIN_TIMEOUT;
- hdev->admin_tagset.numa_node = hdev->numa_node;
- hdev->admin_tagset.cmd_size =
- spraid_cmd_size(hdev, true, false);
- hdev->admin_tagset.flags = BLK_MQ_F_NO_SCHED;
- hdev->admin_tagset.driver_data = hdev;
-
- if (blk_mq_alloc_tag_set(&hdev->admin_tagset)) {
- dev_err(hdev->dev, "Allocate admin tagset failed\n");
- return -ENOMEM;
- }
-
- hdev->admin_q = blk_mq_init_queue(&hdev->admin_tagset);
- if (IS_ERR(hdev->admin_q)) {
- dev_err(hdev->dev, "Initialize admin request queue failed\n");
- blk_mq_free_tag_set(&hdev->admin_tagset);
- return -ENOMEM;
- }
- if (!blk_get_queue(hdev->admin_q)) {
- dev_err(hdev->dev, "Get admin request queue failed\n");
- spraid_remove_admin_tagset(hdev);
- hdev->admin_q = NULL;
- return -ENODEV;
- }
- } else {
- blk_mq_unquiesce_queue(hdev->admin_q);
- }
- return 0;
-}
-
static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
{
struct spraid_dev *hdev = shost_priv(scmd->device->host);
@@ -2891,7 +2675,7 @@ static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
spraidq = &hdev->queues[hwq];
if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE || spraid_poll_cq(spraidq, cid)) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] has been completed\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
cid, spraidq->qid);
return true;
}
@@ -2927,8 +2711,7 @@ static int spraid_send_abort_cmd(struct spraid_dev *hdev, u32 hdid, u16 qid, u16
admin_cmd.abort.sqid = cpu_to_le16(qid);
admin_cmd.abort.cid = cpu_to_le16(cid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
/* send reset command by admin quueue temporary */
@@ -2941,8 +2724,7 @@ static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
admin_cmd.reset.hdid = cpu_to_le32(hdid);
admin_cmd.reset.type = type;
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static bool spraid_change_host_state(struct spraid_dev *hdev, enum spraid_state newstate)
@@ -3022,7 +2804,7 @@ static void spraid_back_fault_cqe(struct spraid_queue *ioq, struct spraid_comple
scsi_dma_unmap(scmd);
spraid_free_iod_res(hdev, iod);
scmd->scsi_done(scmd);
- dev_warn(hdev->dev, "Back fault CQE, cid[%d], qid[%d]\n",
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
cqe->cmd_id, ioq->qid);
}
@@ -3106,17 +2888,13 @@ static void spraid_reset_work(struct work_struct *work)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto pci_disable;
-
ret = spraid_setup_io_queues(hdev);
if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
goto pci_disable;
spraid_change_host_state(hdev, SPRAID_LIVE);
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
return;
@@ -3288,6 +3066,62 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
return SUCCESS;
}
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev, "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n", __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
static ssize_t csts_pp_show(struct device *cdev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(cdev);
@@ -3347,7 +3181,7 @@ static ssize_t fw_version_show(struct device *cdev, struct device_attribute *att
struct Scsi_Host *shost = class_to_shost(cdev);
struct spraid_dev *hdev = shost_priv(shost);
- return snprintf(buf, sizeof(hdev->ctrl_info->fr), "%s\n", hdev->ctrl_info->fr);
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
}
static DEVICE_ATTR_RO(csts_pp);
@@ -3365,6 +3199,173 @@ static struct device_attribute *spraid_host_attrs[] = {
NULL,
};
+static int spraid_get_vd_info(struct spraid_dev *hdev, struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev, struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ? vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
static struct scsi_host_template spraid_driver_template = {
.module = THIS_MODULE,
.name = "Ramaxel Logic spraid driver",
@@ -3379,9 +3380,10 @@ static struct scsi_host_template spraid_driver_template = {
.eh_bus_reset_handler = spraid_bus_reset_handler,
.eh_host_reset_handler = spraid_shost_reset_handler,
.change_queue_depth = scsi_change_queue_depth,
- .host_tagset = 1,
+ .host_tagset = 0,
.this_id = -1,
.shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
};
static void spraid_shutdown(struct pci_dev *pdev)
@@ -3392,11 +3394,50 @@ static void spraid_shutdown(struct pci_dev *pdev)
spraid_disable_admin_queue(hdev, true);
}
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d], req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len, rq->timeout,
+ job->request_payload.sg_cnt, job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n", __func__, bsg_req->msgcode);
+ break;
+ }
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct spraid_dev *hdev;
struct Scsi_Host *shost;
int node, ret;
+ char bsg_name[15];
shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
if (!shost) {
@@ -3421,10 +3462,10 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto put_dev;
init_rwsem(&hdev->devices_rwsem);
- INIT_WORK(&hdev->aen_work, spraid_async_event_work);
INIT_WORK(&hdev->scan_work, spraid_scan_work);
INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
spin_lock_init(&hdev->state_lock);
ret = spraid_alloc_resources(hdev);
@@ -3439,17 +3480,13 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto disable_admin_q;
-
ret = spraid_init_ctrl_info(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_alloc_iod_ext_mem_pool(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_setup_io_queues(hdev);
if (ret)
@@ -3464,9 +3501,14 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto remove_io_queues;
}
- ret = spraid_create_cdev(hdev);
- if (ret)
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name, spraid_bsg_host_dispatch,
+ NULL, spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
goto remove_io_queues;
+ }
if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
dev_warn(hdev->dev, "warn only admin queue can be used\n");
@@ -3475,11 +3517,11 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
hdev->state = SPRAID_LIVE;
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
ret = spraid_dev_list_init(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
ret = spraid_configure_timestamp(hdev);
if (ret)
@@ -3487,20 +3529,18 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
ret = spraid_alloc_ioq_ptcmds(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
scsi_scan_host(hdev->shost);
return 0;
-remove_cdev:
- spraid_remove_cdev(hdev);
+remove_bsg:
+ spraid_remove_bsg(hdev);
remove_io_queues:
spraid_remove_io_queues(hdev);
free_iod_mempool:
spraid_free_iod_ext_mem_pool(hdev);
-free_admin_tagset:
- spraid_remove_admin_tagset(hdev);
disable_admin_q:
spraid_disable_admin_queue(hdev, false);
pci_disable:
@@ -3532,14 +3572,12 @@ static void spraid_remove(struct pci_dev *pdev)
}
flush_work(&hdev->reset_work);
+ spraid_remove_bsg(hdev);
scsi_remove_host(shost);
-
- kfree(hdev->ioq_ptcmds);
+ spraid_free_ioq_ptcmds(hdev);
kfree(hdev->devices);
- spraid_remove_cdev(hdev);
spraid_remove_io_queues(hdev);
spraid_free_iod_ext_mem_pool(hdev);
- spraid_remove_admin_tagset(hdev);
spraid_disable_admin_queue(hdev, false);
spraid_pci_disable(hdev);
spraid_free_resources(hdev);
@@ -3551,7 +3589,7 @@ static void spraid_remove(struct pci_dev *pdev)
}
static const struct pci_device_id spraid_id_table[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HAB_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HBA_DID) },
{ PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_RAID_DID) },
{ 0, }
};
@@ -3563,6 +3601,7 @@ static struct pci_driver spraid_driver = {
.probe = spraid_probe,
.remove = spraid_remove,
.shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
};
static int __init spraid_init(void)
@@ -3573,14 +3612,10 @@ static int __init spraid_init(void)
if (!spraid_wq)
return -ENOMEM;
- ret = alloc_chrdev_region(&spraid_chr_devt, 0, SPRAID_MINORS, "spraid");
- if (ret < 0)
- goto destroy_wq;
-
spraid_class = class_create(THIS_MODULE, "spraid");
if (IS_ERR(spraid_class)) {
ret = PTR_ERR(spraid_class);
- goto unregister_chrdev;
+ goto destroy_wq;
}
ret = pci_register_driver(&spraid_driver);
@@ -3591,8 +3626,6 @@ static int __init spraid_init(void)
destroy_class:
class_destroy(spraid_class);
-unregister_chrdev:
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_wq:
destroy_workqueue(spraid_wq);
@@ -3603,12 +3636,11 @@ static void __exit spraid_exit(void)
{
pci_unregister_driver(&spraid_driver);
class_destroy(spraid_class);
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_workqueue(spraid_wq);
ida_destroy(&spraid_instance_ida);
}
-MODULE_AUTHOR("Ramaxel Memory Technology");
+MODULE_AUTHOR("songyl(a)ramaxel.com");
MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
MODULE_LICENSE("GPL");
MODULE_VERSION(SPRAID_DRV_VERSION);
--
2.27.0
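The converted info helpers above (spraid_get_ctrl_info, spraid_get_vd_info,
spraid_get_bgtask) all follow one pattern: allocate a DMA-coherent page, point
the command's prp1 at it, submit synchronously, and copy the payload out only
on success. A minimal sketch of that round trip, reusing the spraid_* names
from the diff (the generic helper itself is illustrative, not part of the
patch):

/* Sketch only: the shared DMA round trip behind the info queries above.
 * spraid_* names follow the patch; this helper is not in the driver. */
static int spraid_get_info_page(struct spraid_dev *hdev, void *out, size_t len)
{
	struct spraid_admin_command admin_cmd;
	dma_addr_t data_dma = 0;
	u8 *data_ptr;
	int ret;

	if (len > PAGE_SIZE)
		return -EINVAL;

	/* One DMA-coherent page serves as the transfer buffer. */
	data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
	if (!data_ptr)
		return -ENOMEM;

	memset(&admin_cmd, 0, sizeof(admin_cmd));
	admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
	admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);

	/* Synchronous submit; copy the result out only on success. */
	ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
	if (!ret)
		memcpy(out, data_ptr, len);

	dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
	return ret;
}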
Re: [PATCH openEuler-1.0-LTS 1/1] ras: report cpu logical index to userspace in arm event
by Xie XiuQi 25 Nov '21
Hi,
Has this patch been merged into openEuler 20.03?
Changing the trace format amounts to changing the ABI. To avoid compatibility
problems with upper-layer applications such as rasdaemon: was this also
changed in previous versions?
Also, if the goal is to add the cpu logical index, how about pushing it to the
upstream community, so that we don't have to port this patch in every release?
On 2021/11/25 15:30, lostway(a)zju.edu.cn wrote:
> From: Lostwayzxc <luoshengwei(a)huawei.com>
>
> kunpeng inclusion
> category: feature
> bugzilla: https://gitee.com/openeuler/kernel/issues/I4IG00?from=project-issue
> CVE: NA
>
> When the arm event is reported, the rasdaemon needs to know the cpu logical index,
> but the record carries only the mpidr, with no mapping between it and the cpu
> logical index. Since the kernel has saved the mapping, get the logical index via
> get_logical_index() and report it directly to userspace via the perf i/f.
>
> Signed-off-by: Shengwei Luo <luoshengwei(a)huawei.com>
> ---
> drivers/ras/ras.c | 8 +++++++-
> include/linux/ras.h | 11 +++++++++++
> include/ras/ras_event.h | 10 +++++++---
> 3 files changed, 25 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
> index 9302ed7f4258..a526f124a5ff 100644
> --- a/drivers/ras/ras.c
> +++ b/drivers/ras/ras.c
> @@ -32,6 +32,7 @@ void log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev)
> struct cper_arm_err_info *err_info;
> struct cper_arm_ctx_info *ctx_info;
> int n, sz;
> + int cpu;
>
> pei_len = sizeof(struct cper_arm_err_info) * err->err_info_num;
> pei_err = (u8 *)err + sizeof(struct cper_sec_proc_arm);
> @@ -58,8 +59,13 @@ void log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev)
> }
> ven_err_data = (u8 *)ctx_info;
>
> + cpu = GET_LOGICAL_INDEX(err->mpidr);
> + /* when the return value is invalid, set cpu index to a large integer */
> + if (cpu < 0)
> + cpu = 0xFFFF;
> +
> trace_arm_event(err, pei_err, pei_len, ctx_err, ctx_len,
> - ven_err_data, vsei_len, sev);
> + ven_err_data, vsei_len, sev, cpu);
> }
>
> static int __init ras_init(void)
> diff --git a/include/linux/ras.h b/include/linux/ras.h
> index 3431b4a5fa42..e5ec31ad7a13 100644
> --- a/include/linux/ras.h
> +++ b/include/linux/ras.h
> @@ -40,4 +40,15 @@ static inline void
> log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev) { return; }
> #endif
>
> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> +#include <asm/smp_plat.h>
> +/*
> + * Include ARM specific SMP header which provides a function mapping mpidr to
> + * cpu logical index.
> + */
> +#define GET_LOGICAL_INDEX(mpidr) get_logical_index(mpidr & MPIDR_HWID_BITMASK)
> +#else
> +#define GET_LOGICAL_INDEX(mpidr) -EINVAL
> +#endif /* CONFIG_ARM || CONFIG_ARM64 */
> +
> #endif /* __RAS_H__ */
> diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> index 7c8cb123ba32..2d6a662886e6 100644
> --- a/include/ras/ras_event.h
> +++ b/include/ras/ras_event.h
> @@ -182,9 +182,10 @@ TRACE_EVENT(arm_event,
> const u32 ctx_len,
> const u8 *oem,
> const u32 oem_len,
> - u8 sev),
> + u8 sev,
> + int cpu),
>
> - TP_ARGS(proc, pei_err, pei_len, ctx_err, ctx_len, oem, oem_len, sev),
> + TP_ARGS(proc, pei_err, pei_len, ctx_err, ctx_len, oem, oem_len, sev, cpu),
>
> TP_STRUCT__entry(
> __field(u64, mpidr)
> @@ -199,6 +200,7 @@ TRACE_EVENT(arm_event,
> __field(u32, oem_len)
> __dynamic_array(u8, buf2, oem_len)
> __field(u8, sev)
> + __field(int, cpu)
> ),
>
> TP_fast_assign(
> @@ -225,11 +227,13 @@ TRACE_EVENT(arm_event,
> __entry->oem_len = oem_len;
> memcpy(__get_dynamic_array(buf2), oem, oem_len);
> __entry->sev = sev;
> + __entry->cpu = cpu;
> ),
>
> - TP_printk("error: %d; affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
> + TP_printk("cpu: %d; error: %d; affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
> "running state: %d; PSCI state: %d; "
> "%s: %d; %s: %s; %s: %d; %s: %s; %s: %d; %s: %s",
> + __entry->cpu,
> __entry->sev,
> __entry->affinity, __entry->mpidr, __entry->midr,
> __entry->running_state, __entry->psci_state,
>
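The mechanism under review is small enough to show on its own: resolve the
firmware-reported mpidr to a kernel cpu logical index and fall back to a
sentinel when the mapping fails. A minimal sketch, assuming the
GET_LOGICAL_INDEX() wrapper added by the patch (the helper itself is
illustrative):

/* Sketch only: mpidr -> logical cpu, with the 0xFFFF sentinel the patch
 * uses so the trace event always carries a cpu field. */
static int arm_err_cpu_index(const struct cper_sec_proc_arm *err)
{
	int cpu = GET_LOGICAL_INDEX(err->mpidr);

	return (cpu < 0) ? 0xFFFF : cpu;
}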
Hello,
Welcome to the openEuler kernel community.
The kernel SIG regular meeting is held on Friday afternoon every other week.
The next meeting is on December 3 (next Friday), 2:10-4:00 PM.
You can subscribe to kernel(a)openeuler.org and kernel-discuss(a)openeuer.org.
You can propose topics directly by email.
On 2021/11/24 13:11, yan.yang(a)i-soft.com.cn wrote:
> Hello,
>
> I am Yang Yan from iSoft Infrastructure Software, responsible for tracking new
> developments in the openEuler kernel. I would like to attend your kernel SIG
> meetings, but the web page does not list the meeting time or how to join.
> Could you please tell me how to take part in the SIG meetings? Thanks!
>
> ----------------
>
> Best Regards
>
> Yang Yan
>
> Address: 3F, Building E, Yard 7, Rongda Road (Taiji Information Industry
> Park), Chaoyang District, Beijing
> Postal code: 100102
> Tel: +86 10 8406 5566-8102
> Fax: +86 10 8496 6005
> Mobile: +86 158 1062 8953
> Web: www.i-soft.com.cn <http://www.i-soft.com.cn/>
>
[PATCH openEuler-1.0-LTS] lib/clear_user: ensure loop in __arch_clear_user cache-aligned v2
by Yang Yingliang 24 Nov '21
From: Cheng Jian <cj.chengjian(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3OX0C
CVE: NA
--------------------------------
We must ensure that the following four instructions are cache-aligned.
Otherwise, they cause problems with the performance of the libMicro
pread benchmark.
1:
# uao_user_alternative 9f, str, sttr, xzr, x0, 8
str xzr, [x0], #8
nop
subs x1, x1, #8
b.pl 1b
with this patch:
prc thr usecs/call samples errors cnt/samp size
pread_z100 1 1 5.88400 807 0 1 102400
The pread result can range from 5 to 9 usecs/call depending on
how this function happens to be aligned.
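The constraint is easy to restate: the loop body is four instructions
(16 bytes), and the added ".align 5" pads the loop head to a 32-byte boundary
so the body never straddles a fetch/cache boundary. A hypothetical C helper
that checks this property for an address range (illustrative only, not part
of the patch):

/* Illustrative: does [addr, addr + len) fit inside one naturally aligned
 * 2^shift-byte block?  ".align 5" guarantees this for shift == 5. */
static bool fits_in_aligned_block(unsigned long addr, unsigned long len,
				  unsigned int shift)
{
	unsigned long mask = ~((1UL << shift) - 1);

	return (addr & mask) == ((addr + len - 1) & mask);
}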
Signed-off-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/lib/clear_user.S | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
index 9ebc5d84e6154..410768a8d4166 100644
--- a/arch/arm64/lib/clear_user.S
+++ b/arch/arm64/lib/clear_user.S
@@ -20,7 +20,7 @@
#include <asm/asm-uaccess.h>
.text
- .align 6
+
/* Prototype: int __arch_clear_user(void *addr, size_t sz)
* Purpose : clear some user memory
* Params : addr - user memory address to clear
@@ -34,6 +34,9 @@ ENTRY(__arch_clear_user)
mov x2, x1 // save the size for fixup return
subs x1, x1, #8
b.mi 2f
+#ifdef CONFIG_ARCH_HISI
+ .align 5
+#endif
1:
uao_user_alternative 9f, str, sttr, xzr, x0, 8
subs x1, x1, #8
--
2.25.1
[PATCH OLK-5.10] atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_wait
by Zheng Zengkai 23 Nov '21
From: Zekun Shen <bruceshenzk(a)gmail.com>
mainline inclusion
from mainline
commit b922f622592af76b57cbc566eaeccda0b31a3496
bugzilla: 185788 https://e.gitee.com/open_euler/dashboard?issue=I4IYTG
CVE: CVE-2021-43975
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
This bug report shows up when running our research tools. The
report is a SOOB read, but it seems a SOOB write is also possible
a few lines below.
In detail, fw.len and sw.len are inputs coming from IO. A len
larger than the size of self->rpc triggers the SOOB. The patch fixes
the bugs by adding sanity checks.
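The fix is the standard pattern for device-supplied lengths: validate against
the destination buffer before any transfer. A minimal sketch of the check the
patch adds in both places, with names taken from the diff below (the
standalone helper is illustrative, not part of the driver):

/* Sketch only: reject a device-reported length that exceeds the
 * in-memory RPC buffer before any upload/download runs. */
static int aq_check_rpc_len(struct aq_hw_s *self, u32 len, const char *what)
{
	if (len > sizeof(self->rpc)) {
		printk(KERN_INFO "Invalid %s len: %x\n", what, len);
		return -EINVAL;
	}
	return 0;
}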
The bugs are triggerable with compromised/malfunctioning devices.
They are potentially exploitable given they first leak up to
0xffff bytes and can overwrite the region later.
The patch is tested with the QEMU emulator.
This is NOT tested with a real device.
Attached is the log we found by fuzzing.
BUG: KASAN: slab-out-of-bounds in
hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
Read of size 4 at addr ffff888016260b08 by task modprobe/213
CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1
Call Trace:
dump_stack+0x76/0xa0
print_address_description.constprop.0+0x16/0x200
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
__kasan_report.cold+0x37/0x7c
? aq_hw_read_reg_bit+0x60/0x70 [atlantic]
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
kasan_report+0xe/0x20
hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic]
hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic]
hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic]
? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic]
? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic]
hw_atl_utils_initfw+0x12a/0x1c8 [atlantic]
aq_nic_ndev_register+0x88/0x650 [atlantic]
? aq_nic_ndev_init+0x235/0x3c0 [atlantic]
aq_pci_probe+0x731/0x9b0 [atlantic]
? aq_pci_func_init+0xc0/0xc0 [atlantic]
local_pci_probe+0xd3/0x160
pci_device_probe+0x23f/0x3e0
Reported-by: Brendan Dolan-Gavitt <brendandg(a)nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
.../ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
index 404cbf60d3f2..da1d185f6d22 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
@@ -559,6 +559,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self,
goto err_exit;
if (fw.len == 0xFFFFU) {
+ if (sw.len > sizeof(self->rpc)) {
+ printk(KERN_INFO "Invalid sw len: %x\n", sw.len);
+ err = -EINVAL;
+ goto err_exit;
+ }
err = hw_atl_utils_fw_rpc_call(self, sw.len);
if (err < 0)
goto err_exit;
@@ -567,6 +572,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self,
if (rpc) {
if (fw.len) {
+ if (fw.len > sizeof(self->rpc)) {
+ printk(KERN_INFO "Invalid fw len: %x\n", fw.len);
+ err = -EINVAL;
+ goto err_exit;
+ }
err =
hw_atl_utils_fw_downld_dwords(self,
self->rpc_addr,
--
2.20.1
[PATCH openEuler-1.0-LTS] drm/ioctl: Ditch DRM_UNLOCKED except for the legacy vblank ioctl
by Yang Yingliang 23 Nov '21
From: Daniel Vetter <daniel.vetter(a)ffwll.ch>
mainline inclusion
from mainline-v5.4-rc1
commit 75426367cd377120a256cad0b35b02eec4b83591
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JI7R
CVE: NA
--------------------------------
This completes Emil's series of removing DRM_UNLOCKED from modern
drivers. It's entirely cargo-culted since we ignore it on
non-DRIVER_LEGACY drivers since:
commit ea487835e8876abf7ad909636e308c801a2bcda6
Author: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Date: Mon Sep 28 21:42:40 2015 +0200
drm: Enforce unlocked ioctl operation for kms driver ioctls
Now justifying why we can do this for legacy drivers too (and hence
close the source of all the bogus copypasting) is a bit more involved.
DRM_UNLOCKED was introduced in:
commit ed8b67040965e4fe695db333d5914e18ea5f146f
Author: Arnd Bergmann <arnd(a)arndb.de>
Date: Wed Dec 16 22:17:09 2009 +0000
drm: convert drm_ioctl to unlocked_ioctl
As an immediate hack to keep i810 happy, which would have deadlocked
without this trickery. The old BKL is automatically dropped in
schedule(), and hence the i810 vs. mmap_sem deadlock didn't actually
cause a real deadlock. But with a mutex it would. The solution was to
annotate these as DRM_UNLOCKED and mark i810 unsafe on SMP machines.
This conversion caused a regression, because unlike the BKL a mutex
isn't dropped over schedule (that thing again), which caused a vblank
wait in one thread to block the entire desktop and all its apps. Back
then we did vblank scheduling by blocking in the client, awesome isn't
it. This was fixed quickly in (ok not so quickly, took 2 years):
commit 8f4ff2b06afcd6f151868474a432c603057eaf56
Author: Ilija Hadzic <ihadzic(a)research.bell-labs.com>
Date: Mon Oct 31 17:46:18 2011 -0400
drm: do not sleep on vblank while holding a mutex
All the other DRM_UNLOCKED annotations for all the core ioctls was
work to reach finer-grained locking for modern drivers. This took
years, and culminated in:
commit fdd5b877e9ebc2029e1373b4a3cd057329a9ab7a
Author: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Date: Sat Dec 10 22:52:54 2016 +0100
drm: Enforce BKL-less ioctls for modern drivers
DRM_UNLOCKED was never required by any legacy drivers, except for the
vblank_wait IOCTL. Therefore we will not regress these old drivers by
going back to where we've been in 2011. For all modern drivers nothing
will change.
To make this perfectly clear, also add a comment to DRM_UNLOCKED.
v2: Don't forget about drm_ioc32.c (Michel).
Cc: Michel Dänzer <michel(a)daenzer.net>
Cc: Emil Velikov <emil.l.velikov(a)gmail.com>
Acked-by: Emil Velikov <emil.velikov(a)collabora.com>
Acked-by: Michel Dänzer <michel(a)daenzer.net>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190605120835.2798-1-daniel.…
Signed-off-by: Liu ZiXian <liuzixian4(a)huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: wangxiongfeng 00379786 <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/gpu/drm/drm_ioc32.c | 13 ++--
drivers/gpu/drm/drm_ioctl.c | 131 ++++++++++++++++++------------------
include/drm/drm_ioctl.h | 3 +
3 files changed, 74 insertions(+), 73 deletions(-)
diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
index f8672238d444b..ed20dbae66070 100644
--- a/drivers/gpu/drm/drm_ioc32.c
+++ b/drivers/gpu/drm/drm_ioc32.c
@@ -105,7 +105,7 @@ static int compat_drm_version(struct file *file, unsigned int cmd,
.desc = compat_ptr(v32.desc),
};
err = drm_ioctl_kernel(file, drm_version, &v,
- DRM_UNLOCKED|DRM_RENDER_ALLOW);
+ DRM_RENDER_ALLOW);
if (err)
return err;
@@ -139,7 +139,7 @@ static int compat_drm_getunique(struct file *file, unsigned int cmd,
.unique = compat_ptr(uq32.unique),
};
- err = drm_ioctl_kernel(file, drm_getunique, &uq, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_getunique, &uq, 0);
if (err)
return err;
@@ -177,7 +177,7 @@ static int compat_drm_getmap(struct file *file, unsigned int cmd,
return -EFAULT;
map.offset = m32.offset;
- err = drm_ioctl_kernel(file, drm_legacy_getmap_ioctl, &map, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_legacy_getmap_ioctl, &map, 0);
if (err)
return err;
@@ -262,7 +262,7 @@ static int compat_drm_getclient(struct file *file, unsigned int cmd,
client.idx = c32.idx;
- err = drm_ioctl_kernel(file, drm_getclient, &client, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_getclient, &client, 0);
if (err)
return err;
@@ -292,7 +292,7 @@ static int compat_drm_getstats(struct file *file, unsigned int cmd,
drm_stats32_t __user *argp = (void __user *)arg;
int err;
- err = drm_ioctl_kernel(file, drm_noop, NULL, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_noop, NULL, 0);
if (err)
return err;
@@ -887,8 +887,7 @@ static int compat_drm_mode_addfb2(struct file *file, unsigned int cmd,
sizeof(req64.modifier)))
return -EFAULT;
- err = drm_ioctl_kernel(file, drm_mode_addfb2, &req64,
- DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_mode_addfb2, &req64, 0);
if (err)
return err;
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index ba129b64b61fc..3fe512b8a488b 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -549,22 +549,21 @@ EXPORT_SYMBOL(drm_ioctl_permit);
/* Ioctl table */
static const struct drm_ioctl_desc drm_ioctls[] = {
- DRM_IOCTL_DEF(DRM_IOCTL_VERSION, drm_version,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_UNIQUE, drm_getunique, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_MAGIC, drm_getmagic, DRM_UNLOCKED),
+ DRM_IOCTL_DEF(DRM_IOCTL_VERSION, drm_version, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_UNIQUE, drm_getunique, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_MAGIC, drm_getmagic, 0),
DRM_IOCTL_DEF(DRM_IOCTL_IRQ_BUSID, drm_irq_by_busid, DRM_MASTER|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_MAP, drm_legacy_getmap_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, DRM_UNLOCKED | DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_MAP, drm_legacy_getmap_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, DRM_MASTER),
DRM_IOCTL_DEF(DRM_IOCTL_SET_UNIQUE, drm_invalid_op, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_BLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_UNBLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_AUTH|DRM_UNLOCKED|DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_AUTH|DRM_MASTER),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_MAP, drm_legacy_addmap_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_MAP, drm_legacy_rmmap_ioctl, DRM_AUTH),
@@ -572,8 +571,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_legacy_setsareactx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_legacy_getsareactx, DRM_AUTH),
- DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, DRM_UNLOCKED|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_DROP_MASTER, drm_dropmaster_ioctl, DRM_UNLOCKED|DRM_ROOT_ONLY),
+ DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, DRM_ROOT_ONLY),
+ DRM_IOCTL_DEF(DRM_IOCTL_DROP_MASTER, drm_dropmaster_ioctl, DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_CTX, drm_legacy_addctx, DRM_AUTH|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_CTX, drm_legacy_rmctx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
@@ -620,66 +619,66 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_IOCTL_DEF(DRM_IOCTL_UPDATE_DRAW, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH|DRM_UNLOCKED),
-
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, DRM_UNLOCKED),
-
- DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
-
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETCRTC, drm_mode_setcrtc, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANE, drm_mode_getplane, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPLANE, drm_mode_setplane, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR, drm_mode_cursor_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETGAMMA, drm_mode_gamma_get_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETGAMMA, drm_mode_gamma_set_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETENCODER, drm_mode_getencoder, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCONNECTOR, drm_mode_getconnector, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATTACHMODE, drm_noop, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DETACHMODE, drm_noop, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPERTY, drm_mode_getproperty_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPROPERTY, drm_connector_property_set_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPBLOB, drm_mode_getblob_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETFB, drm_mode_getfb, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB, drm_mode_addfb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB2, drm_mode_addfb2, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_RMFB, drm_mode_rmfb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_PAGE_FLIP, drm_mode_page_flip_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DIRTYFB, drm_mode_dirtyfb_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_DUMB, drm_mode_create_dumb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_MAP_DUMB, drm_mode_mmap_dumb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROY_DUMB, drm_mode_destroy_dumb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_GETPROPERTIES, drm_mode_obj_get_properties_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_SETPROPERTY, drm_mode_obj_set_property_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR2, drm_mode_cursor2_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATOMIC, drm_mode_atomic_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATEPROPBLOB, drm_mode_createblob_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROYPROPBLOB, drm_mode_destroyblob_ioctl, DRM_UNLOCKED),
+ DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH),
+ DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH),
+
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, 0),
+
+ DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETCRTC, drm_mode_setcrtc, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANE, drm_mode_getplane, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPLANE, drm_mode_setplane, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR, drm_mode_cursor_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETGAMMA, drm_mode_gamma_get_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETGAMMA, drm_mode_gamma_set_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETENCODER, drm_mode_getencoder, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCONNECTOR, drm_mode_getconnector, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATTACHMODE, drm_noop, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DETACHMODE, drm_noop, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPERTY, drm_mode_getproperty_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPROPERTY, drm_connector_property_set_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPBLOB, drm_mode_getblob_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETFB, drm_mode_getfb, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB, drm_mode_addfb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB2, drm_mode_addfb2, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_RMFB, drm_mode_rmfb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_PAGE_FLIP, drm_mode_page_flip_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DIRTYFB, drm_mode_dirtyfb_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_DUMB, drm_mode_create_dumb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_MAP_DUMB, drm_mode_mmap_dumb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROY_DUMB, drm_mode_destroy_dumb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_GETPROPERTIES, drm_mode_obj_get_properties_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_SETPROPERTY, drm_mode_obj_set_property_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR2, drm_mode_cursor2_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATOMIC, drm_mode_atomic_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATEPROPBLOB, drm_mode_createblob_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROYPROPBLOB, drm_mode_destroyblob_ioctl, 0),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_CREATE, drm_syncobj_create_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_DESTROY, drm_syncobj_destroy_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, drm_syncobj_handle_to_fd_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, drm_syncobj_fd_to_handle_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
+ DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, DRM_MASTER),
};
#define DRM_CORE_IOCTL_COUNT ARRAY_SIZE( drm_ioctls )
@@ -747,7 +746,7 @@ long drm_ioctl_kernel(struct file *file, drm_ioctl_t *func, void *kdata,
return retcode;
/* Enforce sane locking for modern driver ioctls. */
- if (!drm_core_check_feature(dev, DRIVER_LEGACY) ||
+ if (likely(!drm_core_check_feature(dev, DRIVER_LEGACY)) ||
(flags & DRM_UNLOCKED))
retcode = func(dev, kdata, file_priv);
else {
diff --git a/include/drm/drm_ioctl.h b/include/drm/drm_ioctl.h
index fafb6f592c4b9..10100a4bbe2ad 100644
--- a/include/drm/drm_ioctl.h
+++ b/include/drm/drm_ioctl.h
@@ -114,6 +114,9 @@ enum drm_ioctl_flags {
* Whether &drm_ioctl_desc.func should be called with the DRM BKL held
* or not. Enforced as the default for all modern drivers, hence there
* should never be a need to set this flag.
+ *
+ * Do not use anywhere else than for the VBLANK_WAIT IOCTL, which is the
+ * only legacy IOCTL which needs this.
*/
DRM_UNLOCKED = BIT(4),
/**
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] share_pool: add mm address check when access the process's sp_group file
by Yang Yingliang 23 Nov '21
From: Zhang Jian <zhangjian210(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JICC
CVE: NA
-------------------------------------------------
When we access the process's sp_group file and the process is a kernel
process, its task_struct->mm will be NULL, so we must check it and make
sure the process is not a kernel process.
v1->v2: The path where the process is a kernel process is rarely
triggered, so add an unlikely() annotation to speed up execution.
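For context, the guard added here is the standard kernel-thread check;
a minimal sketch of the fixed flow (names taken from the diff below):

	struct mm_struct *mm = task->mm;

	/* kernel threads have no user address space, so ->mm is NULL */
	if (unlikely(!mm))
		return 0;

	master = mm->sp_group_master;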
Signed-off-by: Zhang Jian <zhangjian210(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/share_pool.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c
index c6dcae92f67b5..2cc15f894cf84 100644
--- a/mm/share_pool.c
+++ b/mm/share_pool.c
@@ -3851,13 +3851,17 @@ int proc_sp_group_state(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
struct mm_struct *mm = task->mm;
- struct sp_group_master *master = mm->sp_group_master;
+ struct sp_group_master *master;
struct sp_proc_stat *proc_stat;
struct spg_proc_stat *spg_proc_stat;
int i;
unsigned long anon, file, shmem, total_rss, prot;
long sp_res, sp_res_nsize, non_sp_res, non_sp_shm;
+ if (unlikely(!mm))
+ return 0;
+
+ master = mm->sp_group_master;
if (!master)
return 0;
--
2.25.1
23 Nov '21
hulk inclusion
category: bugfix
bugzilla: NA
CVE: NA
--------------------------------
Enable some configs for test.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/configs/hulk_defconfig | 66 +++++++++++++++++++++++++------
1 file changed, 55 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/configs/hulk_defconfig b/arch/arm64/configs/hulk_defconfig
index 413667c434ee6..fdf628f1fa028 100644
--- a/arch/arm64/configs/hulk_defconfig
+++ b/arch/arm64/configs/hulk_defconfig
@@ -566,7 +566,7 @@ CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
-# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
+CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
#
# CPU frequency scaling drivers
@@ -1889,7 +1889,22 @@ CONFIG_MTD_PHYSMAP_OF=m
#
# CONFIG_MTD_DOCG3 is not set
# CONFIG_MTD_ONENAND is not set
-# CONFIG_MTD_NAND is not set
+CONFIG_MTD_NAND_ECC=m
+# CONFIG_MTD_NAND_ECC_SMC is not set
+CONFIG_MTD_NAND=m
+# CONFIG_MTD_NAND_ECC_BCH is not set
+# CONFIG_MTD_NAND_DENALI_PCI is not set
+# CONFIG_MTD_NAND_DENALI_DT is not set
+# CONFIG_MTD_NAND_GPIO is not set
+# CONFIG_MTD_NAND_RICOH is not set
+# CONFIG_MTD_NAND_DISKONCHIP is not set
+# CONFIG_MTD_NAND_DOCG4 is not set
+# CONFIG_MTD_NAND_CAFE is not set
+CONFIG_MTD_NAND_NANDSIM=m
+# CONFIG_MTD_NAND_BRCMNAND is not set
+# CONFIG_MTD_NAND_PLATFORM is not set
+# CONFIG_MTD_NAND_HISI504 is not set
+# CONFIG_MTD_NAND_QCOM is not set
# CONFIG_MTD_SPI_NAND is not set
#
@@ -4840,7 +4855,13 @@ CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_DEBUG is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
-# CONFIG_BTRFS_FS is not set
+CONFIG_BTRFS_FS=y
+CONFIG_BTRFS_FS_POSIX_ACL=y
+CONFIG_BTRFS_FS_CHECK_INTEGRITY=y
+CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
+CONFIG_BTRFS_DEBUG=y
+CONFIG_BTRFS_ASSERT=y
+CONFIG_BTRFS_FS_REF_VERIFY=y
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
CONFIG_FS_DAX=y
@@ -4937,8 +4958,31 @@ CONFIG_MISC_FILESYSTEMS=y
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
-# CONFIG_JFFS2_FS is not set
-# CONFIG_UBIFS_FS is not set
+CONFIG_JFFS2_FS=m
+CONFIG_JFFS2_FS_DEBUG=0
+CONFIG_JFFS2_FS_WRITEBUFFER=y
+CONFIG_JFFS2_FS_WBUF_VERIFY=y
+CONFIG_JFFS2_SUMMARY=y
+CONFIG_JFFS2_FS_XATTR=y
+CONFIG_JFFS2_FS_POSIX_ACL=y
+CONFIG_JFFS2_FS_SECURITY=y
+CONFIG_JFFS2_COMPRESSION_OPTIONS=y
+CONFIG_JFFS2_ZLIB=y
+CONFIG_JFFS2_LZO=y
+CONFIG_JFFS2_RTIME=y
+CONFIG_JFFS2_RUBIN=y
+# CONFIG_JFFS2_CMODE_NONE is not set
+CONFIG_JFFS2_CMODE_PRIORITY=y
+# CONFIG_JFFS2_CMODE_SIZE is not set
+# CONFIG_JFFS2_CMODE_FAVOURLZO is not set
+CONFIG_UBIFS_FS=m
+# CONFIG_UBIFS_FS_ADVANCED_COMPR is not set
+CONFIG_UBIFS_FS_LZO=y
+CONFIG_UBIFS_FS_ZLIB=y
+# CONFIG_UBIFS_ATIME_SUPPORT is not set
+CONFIG_UBIFS_FS_XATTR=y
+# CONFIG_UBIFS_FS_ENCRYPTION is not set
+CONFIG_UBIFS_FS_SECURITY=y
CONFIG_CRAMFS=m
CONFIG_CRAMFS_BLOCKDEV=y
# CONFIG_CRAMFS_MTD is not set
@@ -5503,7 +5547,7 @@ CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
-# CONFIG_DEBUG_RODATA_TEST is not set
+CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
@@ -5656,8 +5700,8 @@ CONFIG_ASYNC_RAID6_TEST=m
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
CONFIG_TEST_KSTRTOX=y
-# CONFIG_TEST_PRINTF is not set
-# CONFIG_TEST_BITMAP is not set
+CONFIG_TEST_PRINTF=m
+CONFIG_TEST_BITMAP=m
# CONFIG_TEST_BITFIELD is not set
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_OVERFLOW is not set
@@ -5665,13 +5709,13 @@ CONFIG_TEST_KSTRTOX=y
# CONFIG_TEST_HASH is not set
# CONFIG_TEST_IDA is not set
# CONFIG_TEST_LKM is not set
-# CONFIG_TEST_USER_COPY is not set
+CONFIG_TEST_USER_COPY=m
# CONFIG_TEST_BPF is not set
# CONFIG_FIND_BIT_BENCHMARK is not set
# CONFIG_TEST_FIRMWARE is not set
-# CONFIG_TEST_SYSCTL is not set
+CONFIG_TEST_SYSCTL=y
# CONFIG_TEST_UDELAY is not set
-# CONFIG_TEST_STATIC_KEYS is not set
+CONFIG_TEST_STATIC_KEYS=m
# CONFIG_TEST_KMOD is not set
# CONFIG_TEST_FREE_PAGES is not set
# CONFIG_MEMTEST is not set
--
2.25.1
23 Nov '21
hulk inclusion
category: bugfix
bugzilla: NA
CVE: NA
--------------------------------
Enable some configs for test.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/configs/hulk_defconfig | 60 +++++++++++++++++++++++++------
1 file changed, 49 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/configs/hulk_defconfig b/arch/arm64/configs/hulk_defconfig
index 413667c434ee6..e540b799f8d60 100644
--- a/arch/arm64/configs/hulk_defconfig
+++ b/arch/arm64/configs/hulk_defconfig
@@ -566,7 +566,7 @@ CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
-# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
+CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
#
# CPU frequency scaling drivers
@@ -1889,7 +1889,22 @@ CONFIG_MTD_PHYSMAP_OF=m
#
# CONFIG_MTD_DOCG3 is not set
# CONFIG_MTD_ONENAND is not set
-# CONFIG_MTD_NAND is not set
+CONFIG_MTD_NAND_ECC=m
+# CONFIG_MTD_NAND_ECC_SMC is not set
+CONFIG_MTD_NAND=m
+# CONFIG_MTD_NAND_ECC_BCH is not set
+# CONFIG_MTD_NAND_DENALI_PCI is not set
+# CONFIG_MTD_NAND_DENALI_DT is not set
+# CONFIG_MTD_NAND_GPIO is not set
+# CONFIG_MTD_NAND_RICOH is not set
+# CONFIG_MTD_NAND_DISKONCHIP is not set
+# CONFIG_MTD_NAND_DOCG4 is not set
+# CONFIG_MTD_NAND_CAFE is not set
+CONFIG_MTD_NAND_NANDSIM=m
+# CONFIG_MTD_NAND_BRCMNAND is not set
+# CONFIG_MTD_NAND_PLATFORM is not set
+# CONFIG_MTD_NAND_HISI504 is not set
+# CONFIG_MTD_NAND_QCOM is not set
# CONFIG_MTD_SPI_NAND is not set
#
@@ -4840,7 +4855,7 @@ CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_DEBUG is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
-# CONFIG_BTRFS_FS is not set
+CONFIG_BTRFS_FS=y
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
CONFIG_FS_DAX=y
@@ -4937,8 +4952,31 @@ CONFIG_MISC_FILESYSTEMS=y
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
-# CONFIG_JFFS2_FS is not set
-# CONFIG_UBIFS_FS is not set
+CONFIG_JFFS2_FS=m
+CONFIG_JFFS2_FS_DEBUG=0
+CONFIG_JFFS2_FS_WRITEBUFFER=y
+CONFIG_JFFS2_FS_WBUF_VERIFY=y
+CONFIG_JFFS2_SUMMARY=y
+CONFIG_JFFS2_FS_XATTR=y
+CONFIG_JFFS2_FS_POSIX_ACL=y
+CONFIG_JFFS2_FS_SECURITY=y
+CONFIG_JFFS2_COMPRESSION_OPTIONS=y
+CONFIG_JFFS2_ZLIB=y
+CONFIG_JFFS2_LZO=y
+CONFIG_JFFS2_RTIME=y
+CONFIG_JFFS2_RUBIN=y
+# CONFIG_JFFS2_CMODE_NONE is not set
+CONFIG_JFFS2_CMODE_PRIORITY=y
+# CONFIG_JFFS2_CMODE_SIZE is not set
+# CONFIG_JFFS2_CMODE_FAVOURLZO is not set
+CONFIG_UBIFS_FS=m
+# CONFIG_UBIFS_FS_ADVANCED_COMPR is not set
+CONFIG_UBIFS_FS_LZO=y
+CONFIG_UBIFS_FS_ZLIB=y
+# CONFIG_UBIFS_ATIME_SUPPORT is not set
+CONFIG_UBIFS_FS_XATTR=y
+# CONFIG_UBIFS_FS_ENCRYPTION is not set
+CONFIG_UBIFS_FS_SECURITY=y
CONFIG_CRAMFS=m
CONFIG_CRAMFS_BLOCKDEV=y
# CONFIG_CRAMFS_MTD is not set
@@ -5503,7 +5541,7 @@ CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
-# CONFIG_DEBUG_RODATA_TEST is not set
+CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
@@ -5656,8 +5694,8 @@ CONFIG_ASYNC_RAID6_TEST=m
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
CONFIG_TEST_KSTRTOX=y
-# CONFIG_TEST_PRINTF is not set
-# CONFIG_TEST_BITMAP is not set
+CONFIG_TEST_PRINTF=m
+CONFIG_TEST_BITMAP=m
# CONFIG_TEST_BITFIELD is not set
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_OVERFLOW is not set
@@ -5665,13 +5703,13 @@ CONFIG_TEST_KSTRTOX=y
# CONFIG_TEST_HASH is not set
# CONFIG_TEST_IDA is not set
# CONFIG_TEST_LKM is not set
-# CONFIG_TEST_USER_COPY is not set
+CONFIG_TEST_USER_COPY=m
# CONFIG_TEST_BPF is not set
# CONFIG_FIND_BIT_BENCHMARK is not set
# CONFIG_TEST_FIRMWARE is not set
-# CONFIG_TEST_SYSCTL is not set
+CONFIG_TEST_SYSCTL=y
# CONFIG_TEST_UDELAY is not set
-# CONFIG_TEST_STATIC_KEYS is not set
+CONFIG_TEST_STATIC_KEYS=m
# CONFIG_TEST_KMOD is not set
# CONFIG_TEST_FREE_PAGES is not set
# CONFIG_MEMTEST is not set
--
2.25.1
23 Nov '21
From: Daniel Vetter <daniel.vetter(a)ffwll.ch>
This completes Emil's series of removing DRM_UNLOCKED from modern
drivers. It's entirely cargo-culted since we ignore it on
non-DRIVER_LEGACY drivers since:
commit ea487835e8876abf7ad909636e308c801a2bcda6
Author: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Date: Mon Sep 28 21:42:40 2015 +0200
drm: Enforce unlocked ioctl operation for kms driver ioctls
Now justifying why we can do this for legacy drivers too (and hence
close the source of all the bogus copypasting) is a bit more involved.
DRM_UNLOCKED was introduced in:
commit ed8b67040965e4fe695db333d5914e18ea5f146f
Author: Arnd Bergmann <arnd(a)arndb.de>
Date: Wed Dec 16 22:17:09 2009 +0000
drm: convert drm_ioctl to unlocked_ioctl
As an immediate hack to keep i810 happy, which would have deadlocked
without this trickery. The old BKL is automatically dropped in
schedule(), and hence the i810 vs. mmap_sem deadlock didn't actually
cause a real deadlock. But with a mutex it would. The solution was to
annotate these as DRM_UNLOCKED and mark i810 unsafe on SMP machines.
This conversion caused a regression, because unlike the BKL a mutex
isn't dropped over schedule (that thing again), which caused a vblank
wait in one thread to block the entire desktop and all its apps. Back
then we did vblank scheduling by blocking in the client, awesome isn't
it. This was fixed quickly in (ok not so quickly, took 2 years):
commit 8f4ff2b06afcd6f151868474a432c603057eaf56
Author: Ilija Hadzic <ihadzic(a)research.bell-labs.com>
Date: Mon Oct 31 17:46:18 2011 -0400
drm: do not sleep on vblank while holding a mutex
All the other DRM_UNLOCKED annotations for all the core ioctls was
work to reach finer-grained locking for modern drivers. This took
years, and culminated in:
commit fdd5b877e9ebc2029e1373b4a3cd057329a9ab7a
Author: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Date: Sat Dec 10 22:52:54 2016 +0100
drm: Enforce BKL-less ioctls for modern drivers
DRM_UNLOCKED was never required by any legacy drivers, except for the
vblank_wait IOCTL. Therefore we will not regress these old drivers by
going back to where we've been in 2011. For all modern drivers nothing
will change.
To make this perfectly clear, also add a comment to DRM_UNLOCKED.
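Concretely, the locking gate left behind in drm_ioctl_kernel() reads
roughly as sketched below (simplified from the drm_ioctl.c hunk further
down; error paths omitted):

	/* modern driver, or a legacy ioctl explicitly marked
	 * DRM_UNLOCKED: dispatch without the global DRM mutex */
	if (!drm_core_check_feature(dev, DRIVER_LEGACY) ||
	    (flags & DRM_UNLOCKED))
		retcode = func(dev, kdata, file_priv);
	else {
		/* legacy driver, locked ioctl: serialize on the "BKL" */
		mutex_lock(&drm_global_mutex);
		retcode = func(dev, kdata, file_priv);
		mutex_unlock(&drm_global_mutex);
	}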
v2: Don't forget about drm_ioc32.c (Michel).
Cc: Michel Dänzer <michel(a)daenzer.net>
Cc: Emil Velikov <emil.l.velikov(a)gmail.com>
Acked-by: Emil Velikov <emil.velikov(a)collabora.com>
Acked-by: Michel Dänzer <michel(a)daenzer.net>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190605120835.2798-1-daniel.…
---
drivers/gpu/drm/drm_ioc32.c | 13 ++--
drivers/gpu/drm/drm_ioctl.c | 131 ++++++++++++++++++------------------
include/drm/drm_ioctl.h | 3 +
3 files changed, 74 insertions(+), 73 deletions(-)
diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
index f8672238d..ed20dbae6 100644
--- a/drivers/gpu/drm/drm_ioc32.c
+++ b/drivers/gpu/drm/drm_ioc32.c
@@ -105,7 +105,7 @@ static int compat_drm_version(struct file *file, unsigned int cmd,
.desc = compat_ptr(v32.desc),
};
err = drm_ioctl_kernel(file, drm_version, &v,
- DRM_UNLOCKED|DRM_RENDER_ALLOW);
+ DRM_RENDER_ALLOW);
if (err)
return err;
@@ -139,7 +139,7 @@ static int compat_drm_getunique(struct file *file, unsigned int cmd,
.unique = compat_ptr(uq32.unique),
};
- err = drm_ioctl_kernel(file, drm_getunique, &uq, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_getunique, &uq, 0);
if (err)
return err;
@@ -177,7 +177,7 @@ static int compat_drm_getmap(struct file *file, unsigned int cmd,
return -EFAULT;
map.offset = m32.offset;
- err = drm_ioctl_kernel(file, drm_legacy_getmap_ioctl, &map, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_legacy_getmap_ioctl, &map, 0);
if (err)
return err;
@@ -262,7 +262,7 @@ static int compat_drm_getclient(struct file *file, unsigned int cmd,
client.idx = c32.idx;
- err = drm_ioctl_kernel(file, drm_getclient, &client, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_getclient, &client, 0);
if (err)
return err;
@@ -292,7 +292,7 @@ static int compat_drm_getstats(struct file *file, unsigned int cmd,
drm_stats32_t __user *argp = (void __user *)arg;
int err;
- err = drm_ioctl_kernel(file, drm_noop, NULL, DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_noop, NULL, 0);
if (err)
return err;
@@ -887,8 +887,7 @@ static int compat_drm_mode_addfb2(struct file *file, unsigned int cmd,
sizeof(req64.modifier)))
return -EFAULT;
- err = drm_ioctl_kernel(file, drm_mode_addfb2, &req64,
- DRM_UNLOCKED);
+ err = drm_ioctl_kernel(file, drm_mode_addfb2, &req64, 0);
if (err)
return err;
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index ba129b64b..3fe512b8a 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -549,22 +549,21 @@ EXPORT_SYMBOL(drm_ioctl_permit);
/* Ioctl table */
static const struct drm_ioctl_desc drm_ioctls[] = {
- DRM_IOCTL_DEF(DRM_IOCTL_VERSION, drm_version,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_UNIQUE, drm_getunique, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_MAGIC, drm_getmagic, DRM_UNLOCKED),
+ DRM_IOCTL_DEF(DRM_IOCTL_VERSION, drm_version, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_UNIQUE, drm_getunique, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_MAGIC, drm_getmagic, 0),
DRM_IOCTL_DEF(DRM_IOCTL_IRQ_BUSID, drm_irq_by_busid, DRM_MASTER|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_MAP, drm_legacy_getmap_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, DRM_UNLOCKED | DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_MAP, drm_legacy_getmap_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, DRM_MASTER),
DRM_IOCTL_DEF(DRM_IOCTL_SET_UNIQUE, drm_invalid_op, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_BLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_UNBLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_AUTH|DRM_UNLOCKED|DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_AUTH|DRM_MASTER),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_MAP, drm_legacy_addmap_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_MAP, drm_legacy_rmmap_ioctl, DRM_AUTH),
@@ -572,8 +571,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_legacy_setsareactx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_legacy_getsareactx, DRM_AUTH),
- DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, DRM_UNLOCKED|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_DROP_MASTER, drm_dropmaster_ioctl, DRM_UNLOCKED|DRM_ROOT_ONLY),
+ DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, DRM_ROOT_ONLY),
+ DRM_IOCTL_DEF(DRM_IOCTL_DROP_MASTER, drm_dropmaster_ioctl, DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_CTX, drm_legacy_addctx, DRM_AUTH|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_CTX, drm_legacy_rmctx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
@@ -620,66 +619,66 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_IOCTL_DEF(DRM_IOCTL_UPDATE_DRAW, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
- DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH|DRM_UNLOCKED),
-
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, DRM_UNLOCKED),
-
- DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
-
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETCRTC, drm_mode_setcrtc, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANE, drm_mode_getplane, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPLANE, drm_mode_setplane, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR, drm_mode_cursor_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETGAMMA, drm_mode_gamma_get_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETGAMMA, drm_mode_gamma_set_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETENCODER, drm_mode_getencoder, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCONNECTOR, drm_mode_getconnector, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATTACHMODE, drm_noop, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DETACHMODE, drm_noop, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPERTY, drm_mode_getproperty_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPROPERTY, drm_connector_property_set_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPBLOB, drm_mode_getblob_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETFB, drm_mode_getfb, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB, drm_mode_addfb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB2, drm_mode_addfb2, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_RMFB, drm_mode_rmfb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_PAGE_FLIP, drm_mode_page_flip_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DIRTYFB, drm_mode_dirtyfb_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_DUMB, drm_mode_create_dumb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_MAP_DUMB, drm_mode_mmap_dumb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROY_DUMB, drm_mode_destroy_dumb_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_GETPROPERTIES, drm_mode_obj_get_properties_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_SETPROPERTY, drm_mode_obj_set_property_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR2, drm_mode_cursor2_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATOMIC, drm_mode_atomic_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATEPROPBLOB, drm_mode_createblob_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROYPROPBLOB, drm_mode_destroyblob_ioctl, DRM_UNLOCKED),
+ DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH),
+ DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH),
+
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, 0),
+
+ DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETCRTC, drm_mode_setcrtc, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANE, drm_mode_getplane, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPLANE, drm_mode_setplane, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR, drm_mode_cursor_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETGAMMA, drm_mode_gamma_get_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETGAMMA, drm_mode_gamma_set_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETENCODER, drm_mode_getencoder, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCONNECTOR, drm_mode_getconnector, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATTACHMODE, drm_noop, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DETACHMODE, drm_noop, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPERTY, drm_mode_getproperty_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPROPERTY, drm_connector_property_set_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPBLOB, drm_mode_getblob_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETFB, drm_mode_getfb, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB, drm_mode_addfb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB2, drm_mode_addfb2, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_RMFB, drm_mode_rmfb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_PAGE_FLIP, drm_mode_page_flip_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DIRTYFB, drm_mode_dirtyfb_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_DUMB, drm_mode_create_dumb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_MAP_DUMB, drm_mode_mmap_dumb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROY_DUMB, drm_mode_destroy_dumb_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_GETPROPERTIES, drm_mode_obj_get_properties_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_SETPROPERTY, drm_mode_obj_set_property_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR2, drm_mode_cursor2_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATOMIC, drm_mode_atomic_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATEPROPBLOB, drm_mode_createblob_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROYPROPBLOB, drm_mode_destroyblob_ioctl, 0),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_CREATE, drm_syncobj_create_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_DESTROY, drm_syncobj_destroy_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, drm_syncobj_handle_to_fd_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, drm_syncobj_fd_to_handle_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
+ DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
- DRM_UNLOCKED|DRM_RENDER_ALLOW),
- DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
- DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
+ DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, 0),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER),
+ DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, DRM_MASTER),
};
#define DRM_CORE_IOCTL_COUNT ARRAY_SIZE( drm_ioctls )
@@ -747,7 +746,7 @@ long drm_ioctl_kernel(struct file *file, drm_ioctl_t *func, void *kdata,
return retcode;
/* Enforce sane locking for modern driver ioctls. */
- if (!drm_core_check_feature(dev, DRIVER_LEGACY) ||
+ if (likely(!drm_core_check_feature(dev, DRIVER_LEGACY)) ||
(flags & DRM_UNLOCKED))
retcode = func(dev, kdata, file_priv);
else {
diff --git a/include/drm/drm_ioctl.h b/include/drm/drm_ioctl.h
index fafb6f592..10100a4bb 100644
--- a/include/drm/drm_ioctl.h
+++ b/include/drm/drm_ioctl.h
@@ -114,6 +114,9 @@ enum drm_ioctl_flags {
* Whether &drm_ioctl_desc.func should be called with the DRM BKL held
* or not. Enforced as the default for all modern drivers, hence there
* should never be a need to set this flag.
+ *
+ * Do not use anywhere else than for the VBLANK_WAIT IOCTL, which is the
+ * only legacy IOCTL which needs this.
*/
DRM_UNLOCKED = BIT(4),
/**
--
2.30.0
Kylin inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4J6G3?from=project-issue
CVE: NA
---------------------------------------
Reliably reproducible: the installation boots successfully, but lscpu
cannot show the binding between all CPUs and nodes. The reason is that
CONFIG_NODES_SHIFT=4 allows only 2^4 = 16 nodes, which cannot
accommodate the 32 nodes of this machine.
It needs to be changed to CONFIG_NODES_SHIFT=6 (2^6 = 64 nodes).
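The ceiling follows from how the kernel derives its node count (an
abridged sketch of the definitions in include/linux/numa.h):

	#ifdef CONFIG_NODES_SHIFT
	#define NODES_SHIFT	CONFIG_NODES_SHIFT
	#else
	#define NODES_SHIFT	0
	#endif
	#define MAX_NUMNODES	(1 << NODES_SHIFT)  /* 1<<4 = 16, 1<<6 = 64 */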
Signed-off-by: Wen Zhiwei <wenzhiwei(a)kylinos.cn>
---
arch/arm64/configs/openeuler_defconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 76d6a118330d..1a7d607137ad 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -384,7 +384,7 @@ CONFIG_NR_CPUS=1024
CONFIG_HOTPLUG_CPU=y
# CONFIG_ARM64_BOOTPARAM_HOTPLUG_CPU0 is not set
CONFIG_NUMA=y
-CONFIG_NODES_SHIFT=4
+CONFIG_NODES_SHIFT=6
CONFIG_USE_PERCPU_NUMA_NODE_ID=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
--
2.30.0
[PATCH openEuler-1.0-LTS] rq-qos: fix missed wake-ups in rq_qos_throttle try two
by Yang Yingliang 22 Nov '21
From: Jan Kara <jack(a)suse.cz>
stable inclusion
from stable-5.10.51
commit 8cc58a6e2c394aa48aa05f600be7d279efbafcd7
bugzilla: 175263
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 11c7aa0ddea8611007768d3e6b58d45dc60a19e1 upstream.
Commit 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle")
tried to fix a problem that a process could be sleeping in rq_qos_wait()
without anyone to wake it up. However the fix is not complete and the
following can still happen:
CPU1 (waiter1)              CPU2 (waiter2)              CPU3 (waker)
rq_qos_wait()               rq_qos_wait()
  acquire_inflight_cb() -> fails
                            acquire_inflight_cb() -> fails
                                                        completes IOs, inflight
                                                          decreased
  prepare_to_wait_exclusive()
                            prepare_to_wait_exclusive()
  has_sleeper = !wq_has_single_sleeper() -> true as there are two sleepers
                            has_sleeper = !wq_has_single_sleeper() -> true
  io_schedule()             io_schedule()
Deadlock as now there's nobody to wakeup the two waiters. The logic
automatically blocking when there are already sleepers is really subtle
and the only way to make it work reliably is that we check whether there
are some waiters in the queue when adding ourselves there. That way, we
are guaranteed that at least the first process to enter the wait queue
will recheck the waiting condition before going to sleep and thus
guarantee forward progress.
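The fix below therefore makes the enqueue primitive itself report
whether the queue was empty when we joined it; a simplified sketch of
the resulting wait loop in __wbt_wait() (token cleanup omitted):

	/* has_sleeper is computed under the waitqueue lock at enqueue
	 * time, instead of being sampled racily afterwards */
	has_sleeper = !__prepare_to_wait_exclusive(&rqw->wait, &data.wq,
						   TASK_UNINTERRUPTIBLE);
	do {
		if (data.got_token)
			break;
		if (!has_sleeper &&
		    rq_wait_inc_below(rqw, get_limit(rwb, rw)))
			break;	/* first waiter re-checks, may proceed */
		io_schedule();
		has_sleeper = false;
	} while (1);
	finish_wait(&rqw->wait, &data.wq);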
Fixes: 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle")
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20210607112613.25344-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Lihong Kou <koulihong(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-wbt.c | 4 ++--
include/linux/wait.h | 1 +
kernel/sched/wait.c | 19 ++++++++++++++++---
3 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index ffd5c17f5a101..366d294a11ef1 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -549,8 +549,8 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
if (!has_sleeper && rq_wait_inc_below(rqw, get_limit(rwb, rw)))
return;
- prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
- has_sleeper = !wq_has_single_sleeper(&rqw->wait);
+ has_sleeper = !__prepare_to_wait_exclusive(&rqw->wait, &data.wq,
+ TASK_UNINTERRUPTIBLE);
do {
/* The memory barrier in set_task_state saves us here. */
if (data.got_token)
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 29d28c2084ce8..1234e6cbacd2a 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -1121,6 +1121,7 @@ do { \
* Waitqueues which are removed from the waitqueue_head at wakeup time
*/
void prepare_to_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state);
+bool __prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state);
void prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state);
long prepare_to_wait_event(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state);
void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry);
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 5dd47f1103d18..c95b392735672 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -235,17 +235,30 @@ prepare_to_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_ent
}
EXPORT_SYMBOL(prepare_to_wait);
-void
-prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state)
+/* Returns true if we are the first waiter in the queue, false otherwise. */
+bool
+__prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state)
{
unsigned long flags;
+ bool was_empty = false;
wq_entry->flags |= WQ_FLAG_EXCLUSIVE;
spin_lock_irqsave(&wq_head->lock, flags);
- if (list_empty(&wq_entry->entry))
+ if (list_empty(&wq_entry->entry)) {
+ was_empty = list_empty(&wq_head->head);
__add_wait_queue_entry_tail(wq_head, wq_entry);
+ }
set_current_state(state);
spin_unlock_irqrestore(&wq_head->lock, flags);
+ return was_empty;
+
+}
+EXPORT_SYMBOL(__prepare_to_wait_exclusive);
+
+void
+prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state)
+{
+ __prepare_to_wait_exclusive(wq_head, wq_entry, state);
}
EXPORT_SYMBOL(prepare_to_wait_exclusive);
--
2.25.1
1
0
From: 沈子俊 <shenzijun(a)kylinos.cn>
This patchset exports some of the common functions implemented by the
SM4 AESNI/AVX algorithm, and reuses these functions to accelerate the
AESNI/AVX2 implementation.
The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni
Benchmarks on an Intel i5-6200U at 2.30GHz compare three
implementations: pure-software sm4-generic, AESNI/AVX acceleration,
and AESNI/AVX2 acceleration. The data comes from tcrypt modes 218 and
518. The columns are block sizes of different lengths; the data is
tabulated and the unit is Mb/s:
block-size | 16 64 128 256 1024 1420 4096
sm4-generic
ECB enc | 60.94 70.41 72.27 73.02 73.87 73.58 73.59
ECB dec | 61.87 70.53 72.15 73.09 73.89 73.92 73.86
CBC enc | 56.71 66.31 68.05 69.84 70.02 70.12 70.24
CBC dec | 54.54 65.91 68.22 69.51 70.63 70.79 70.82
CFB enc | 57.21 67.24 69.10 70.25 70.73 70.52 71.42
CFB dec | 57.22 64.74 66.31 67.24 67.40 67.64 67.58
CTR enc | 59.47 68.64 69.91 71.02 71.86 71.61 71.95
CTR dec | 59.94 68.77 69.95 71.00 71.84 71.55 71.95
sm4-aesni-avx
ECB enc | 44.95 177.35 292.06 316.98 339.48 322.27 330.59
ECB dec | 45.28 178.66 292.31 317.52 339.59 322.52 331.16
CBC enc | 57.75 67.68 69.72 70.60 71.48 71.63 71.74
CBC dec | 44.32 176.83 284.32 307.24 328.61 312.61 325.82
CFB enc | 57.81 67.64 69.63 70.55 71.40 71.35 71.70
CFB dec | 43.14 167.78 282.03 307.20 328.35 318.24 325.95
CTR enc | 42.35 163.32 279.11 302.93 320.86 310.56 317.93
CTR dec | 42.39 162.81 278.49 302.37 321.11 310.33 318.37
sm4-aesni-avx2
ECB enc | 45.19 177.41 292.42 316.12 339.90 322.53 330.54
ECB dec | 44.83 178.90 291.45 317.31 339.85 322.55 331.07
CBC enc | 57.66 67.62 69.73 70.55 71.58 71.66 71.77
CBC dec | 44.34 176.86 286.10 501.68 559.58 483.87 527.46
CFB enc | 57.43 67.60 69.61 70.52 71.43 71.28 71.65
CFB dec | 43.12 167.75 268.09 499.33 558.35 490.36 524.73
CTR enc | 42.42 163.39 256.17 493.95 552.45 481.58 517.19
CTR dec | 42.49 163.11 256.36 493.34 552.62 481.49 516.83
From the benchmark data it can be seen that at a block size of 1024,
AVX2 improves on the AVX acceleration by about 70% (e.g. CBC
decryption: 559.58 vs. 328.61 Mb/s), and is about 7.7 times faster
than the pure-software sm4-generic implementation.
沈子俊 (2):
crypto: x86/sm4 - export reusable AESNI/AVX functions
crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation
arch/x86/crypto/Makefile | 3 +
arch/x86/crypto/sm4-aesni-avx2-asm_64.S | 497 ++++++++++++++++++++++++
arch/x86/crypto/sm4-avx.h | 24 ++
arch/x86/crypto/sm4_aesni_avx2_glue.c | 169 ++++++++
arch/x86/crypto/sm4_aesni_avx_glue.c | 92 +++--
crypto/Kconfig | 22 ++
6 files changed, 775 insertions(+), 32 deletions(-)
create mode 100644 arch/x86/crypto/sm4-aesni-avx2-asm_64.S
create mode 100644 arch/x86/crypto/sm4-avx.h
create mode 100644 arch/x86/crypto/sm4_aesni_avx2_glue.c
--
2.30.0
Add 3 new features:
1. Add the get_rxfh_indir_size callback to the ethtool_ops structure
   (see the sketch after this list)
2. Support configuring the DMA attribute through firmware
3. Clear the CSUM and TSO flags when VXLAN is not supported
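For reference, wiring such a callback into ethtool_ops follows the
usual pattern (a hedged sketch: spnic_get_rxfh_indir_size() and
SPNIC_RSS_INDIR_SIZE are illustrative names; the real ones live in
spnic_rss.c and spnic_ethtool.c per the diffstat below):

	/* reports the RSS indirection table size, e.g. for ethtool -x */
	static u32 spnic_get_rxfh_indir_size(struct net_device *netdev)
	{
		return SPNIC_RSS_INDIR_SIZE;	/* illustrative constant */
	}

	static const struct ethtool_ops spnic_ethtool_ops = {
		.get_rxfh_indir_size	= spnic_get_rxfh_indir_size,
		/* ... remaining callbacks ... */
	};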
Yanling Song (3):
net:spnic: Add the get_rxfh_indir_size in ethtool_ops structure.
net:spnic:Support to configure DMA atrribute thru firmware.
net:spnic:add CSUM and TSO function execute condition
.../ramaxel/spnic/hw/sphw_comm_msg_intf.h | 2 +-
.../ethernet/ramaxel/spnic/hw/sphw_hw_comm.c | 27 ++++++
.../ethernet/ramaxel/spnic/hw/sphw_hw_comm.h | 3 +
.../ethernet/ramaxel/spnic/hw/sphw_hwdev.c | 82 ++++---------------
.../ethernet/ramaxel/spnic/spnic_ethtool.c | 2 +
.../net/ethernet/ramaxel/spnic/spnic_main.c | 7 +-
.../net/ethernet/ramaxel/spnic/spnic_rss.c | 5 ++
.../net/ethernet/ramaxel/spnic/spnic_rss.h | 2 +
8 files changed, 60 insertions(+), 70 deletions(-)
--
2.27.0
[PATCH openEuler-1.0-LTS 1/3] drivers : remove drivers/soc/hisilicon/lbc
by Yang Yingliang 22 Nov '21
From: fengsheng <fengsheng5(a)huawei.com>
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IYWE?from=project-issue
CVE: NA
------------------------------------------------------------
This driver is not in use. Remove it.
Signed-off-by: fengsheng <fengsheng5(a)huawei.com>
Reviewed-by: lidongming <lidongming5(a)huawei.com>
Reviewed-by: ouyang delong <ouyangdelong(a)huawei.com>
Acked-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
MAINTAINERS | 5 -
arch/arm64/configs/hulk_defconfig | 1 -
arch/arm64/configs/openeuler_defconfig | 1 -
arch/arm64/configs/storage_ci_defconfig | 1 -
arch/arm64/configs/syzkaller_defconfig | 1 -
arch/x86/configs/openeuler_defconfig | 1 -
arch/x86/configs/storage_ci_defconfig | 1 -
drivers/soc/hisilicon/Kconfig | 1 -
drivers/soc/hisilicon/Makefile | 1 -
drivers/soc/hisilicon/lbc/Kconfig | 3 -
drivers/soc/hisilicon/lbc/Makefile | 2 -
drivers/soc/hisilicon/lbc/hs_lbc_pltfm.c | 466 -----------------------
drivers/soc/hisilicon/lbc/hs_lbc_pltfm.h | 120 ------
13 files changed, 604 deletions(-)
delete mode 100644 drivers/soc/hisilicon/lbc/Kconfig
delete mode 100644 drivers/soc/hisilicon/lbc/Makefile
delete mode 100644 drivers/soc/hisilicon/lbc/hs_lbc_pltfm.c
delete mode 100644 drivers/soc/hisilicon/lbc/hs_lbc_pltfm.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 15609985b6cc6..b84c7cbd4e555 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16332,11 +16332,6 @@ S: Buried alive in reporters
F: *
F: */
-HISILICON LOCALBUS DRIVER
-M: Feng Sheng <fengsheng5(a)huawei.com>
-S: Maintained
-F: drivers/soc/hisilicon/lbc/
-
HISILICON SYSCTRL DRIVER
M: Feng Sheng <fengsheng5(a)huawei.com>
S: Maintained
diff --git a/arch/arm64/configs/hulk_defconfig b/arch/arm64/configs/hulk_defconfig
index a2a2216e43f1e..904e9b557122e 100644
--- a/arch/arm64/configs/hulk_defconfig
+++ b/arch/arm64/configs/hulk_defconfig
@@ -281,7 +281,6 @@ CONFIG_ARCH_XGENE=y
# CONFIG_ARCH_ZYNQMP is not set
CONFIG_HAVE_LIVEPATCH_WO_FTRACE=y
CONFIG_SOC_HISILICON_SYSCTL=m
-CONFIG_SOC_HISILICON_LBC=m
#
# Enable Livepatch
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 089f5be748b10..20ad626791d14 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -5011,7 +5011,6 @@ CONFIG_ARM_SMMU_V3=y
# Xilinx SoC drivers
#
# CONFIG_XILINX_VCU is not set
-CONFIG_SOC_HISILICON_LBC=m
CONFIG_SOC_HISILICON_SYSCTL=m
# CONFIG_PM_DEVFREQ is not set
CONFIG_EXTCON=y
diff --git a/arch/arm64/configs/storage_ci_defconfig b/arch/arm64/configs/storage_ci_defconfig
index f468538f22a52..6d671761fd080 100644
--- a/arch/arm64/configs/storage_ci_defconfig
+++ b/arch/arm64/configs/storage_ci_defconfig
@@ -2219,7 +2219,6 @@ CONFIG_CLKSRC_VERSATILE=y
# Xilinx SoC drivers
#
# CONFIG_XILINX_VCU is not set
-CONFIG_SOC_HISILICON_LBC=m
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
diff --git a/arch/arm64/configs/syzkaller_defconfig b/arch/arm64/configs/syzkaller_defconfig
index eb045bdc5ef16..7c8eca466e25a 100644
--- a/arch/arm64/configs/syzkaller_defconfig
+++ b/arch/arm64/configs/syzkaller_defconfig
@@ -281,7 +281,6 @@ CONFIG_ARCH_XGENE=y
# CONFIG_ARCH_ZYNQMP is not set
CONFIG_HAVE_LIVEPATCH_WO_FTRACE=y
CONFIG_SOC_HISILICON_SYSCTL=m
-CONFIG_SOC_HISILICON_LBC=m
#
# Enable Livepatch
#
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 5a56a81e39e19..0700ef6fc0cf0 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -6168,7 +6168,6 @@ CONFIG_IRQ_REMAP=y
# Xilinx SoC drivers
#
# CONFIG_XILINX_VCU is not set
-CONFIG_SOC_HISILICON_LBC=m
CONFIG_SOC_HISILICON_SYSCTL=m
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
diff --git a/arch/x86/configs/storage_ci_defconfig b/arch/x86/configs/storage_ci_defconfig
index 0847b17d293a0..27de222f67966 100644
--- a/arch/x86/configs/storage_ci_defconfig
+++ b/arch/x86/configs/storage_ci_defconfig
@@ -2389,7 +2389,6 @@ CONFIG_PCC=y
# Xilinx SoC drivers
#
# CONFIG_XILINX_VCU is not set
-CONFIG_SOC_HISILICON_LBC=m
CONFIG_SOC_HISILICON_SYSCTL=m
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
diff --git a/drivers/soc/hisilicon/Kconfig b/drivers/soc/hisilicon/Kconfig
index 520f60cec77b7..6dd657e3eaa98 100644
--- a/drivers/soc/hisilicon/Kconfig
+++ b/drivers/soc/hisilicon/Kconfig
@@ -1,3 +1,2 @@
-source "drivers/soc/hisilicon/lbc/Kconfig"
source "drivers/soc/hisilicon/sysctl/Kconfig"
diff --git a/drivers/soc/hisilicon/Makefile b/drivers/soc/hisilicon/Makefile
index 33a64532ace35..fe68ce0d54c3e 100644
--- a/drivers/soc/hisilicon/Makefile
+++ b/drivers/soc/hisilicon/Makefile
@@ -1,2 +1 @@
-obj-y += lbc/
obj-y += sysctl/
diff --git a/drivers/soc/hisilicon/lbc/Kconfig b/drivers/soc/hisilicon/lbc/Kconfig
deleted file mode 100644
index 054b6634dae9a..0000000000000
--- a/drivers/soc/hisilicon/lbc/Kconfig
+++ /dev/null
@@ -1,3 +0,0 @@
-config SOC_HISILICON_LBC
- tristate
- default m
\ No newline at end of file
diff --git a/drivers/soc/hisilicon/lbc/Makefile b/drivers/soc/hisilicon/lbc/Makefile
deleted file mode 100644
index d7ec052cb42e2..0000000000000
--- a/drivers/soc/hisilicon/lbc/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-lbc-objs := hs_lbc_pltfm.o
-obj-$(CONFIG_SOC_HISILICON_LBC) += lbc.o
\ No newline at end of file
diff --git a/drivers/soc/hisilicon/lbc/hs_lbc_pltfm.c b/drivers/soc/hisilicon/lbc/hs_lbc_pltfm.c
deleted file mode 100644
index c188cd308f862..0000000000000
--- a/drivers/soc/hisilicon/lbc/hs_lbc_pltfm.c
+++ /dev/null
@@ -1,466 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2019 Hisilicon Limited, All Rights Reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/init.h>
-#include <linux/errno.h>
-#include <linux/io.h>
-#include <linux/slab.h>
-#include <linux/platform_device.h>
-#include <linux/of.h>
-#include <linux/of_platform.h>
-#include <linux/spinlock.h>
-#include <linux/of_address.h>
-#include <linux/acpi.h>
-
-#include "hs_lbc_pltfm.h"
-
-#define LBC_DRIVER_VERSION "1.9.39.0"
-
-struct hisi_lbc_dev g_lbc_dev = {0};
-
-static void lbc_set_cs_base_addr(unsigned int index, unsigned int cs_base_addr)
-{
- LBC_REG_REGION *lbc_reg = (LBC_REG_REGION *)(ACCESS_ONCE(g_lbc_dev.regs_base));
-
- lbc_reg->cs_base[index] = cs_base_addr;
-}
-
-static void lbc_set_cs_data_width(unsigned int index, unsigned int width)
-{
- LBC_REG_REGION *lbc_reg = (LBC_REG_REGION *)(ACCESS_ONCE(g_lbc_dev.regs_base));
-
- lbc_reg->cs_ctrl[index].data_width = width;
-}
-
-static void lbc_set_cs_data_offset(unsigned int index, unsigned int offset)
-{
- LBC_REG_REGION *lbc_reg = (LBC_REG_REGION *)(ACCESS_ONCE(g_lbc_dev.regs_base));
-
- lbc_reg->cs_ctrl[index].addr_offset = offset;
-}
-
-static void lbc_set_cs_mem_size(unsigned int index, u64 mem_size)
-{
- unsigned int size = 0;
- LBC_REG_REGION *lbc_reg = (LBC_REG_REGION *)(ACCESS_ONCE(g_lbc_dev.regs_base));
-
- switch (mem_size) {
- case LBC_CS_MEM_SIZE_0:
- size = LBC_CS_MEM_SIZE_REG_0;
- break;
- case LBC_CS_MEM_SIZE_64K:
- size = LBC_CS_MEM_SIZE_REG_64K;
- break;
- case LBC_CS_MEM_SIZE_128K:
- size = LBC_CS_MEM_SIZE_REG_128K;
- break;
- case LBC_CS_MEM_SIZE_256K:
- size = LBC_CS_MEM_SIZE_REG_256K;
- break;
- case LBC_CS_MEM_SIZE_512K:
- size = LBC_CS_MEM_SIZE_REG_512K;
- break;
- case LBC_CS_MEM_SIZE_1M:
- size = LBC_CS_MEM_SIZE_REG_1M;
- break;
- case LBC_CS_MEM_SIZE_2M:
- size = LBC_CS_MEM_SIZE_REG_2M;
- break;
- case LBC_CS_MEM_SIZE_4M:
- size = LBC_CS_MEM_SIZE_REG_4M;
- break;
- case LBC_CS_MEM_SIZE_8M:
- size = LBC_CS_MEM_SIZE_REG_8M;
- break;
- case LBC_CS_MEM_SIZE_16M:
- size = LBC_CS_MEM_SIZE_REG_16M;
- break;
- case LBC_CS_MEM_SIZE_32M:
- size = LBC_CS_MEM_SIZE_REG_32M;
- break;
- case LBC_CS_MEM_SIZE_64M:
- size = LBC_CS_MEM_SIZE_REG_64M;
- break;
- case LBC_CS_MEM_SIZE_128M:
- size = LBC_CS_MEM_SIZE_REG_128M;
- break;
- case LBC_CS_MEM_SIZE_256M:
- size = LBC_CS_MEM_SIZE_REG_256M;
- break;
- default:
- size = 0;
- }
-
- lbc_reg->cs_ctrl[index].mem_size = size;
-}
-
-static int hisi_lbc_para_check(unsigned int index, unsigned int offset, unsigned int type)
-{
- /* cs index check */
- if (index >= LBC_CS_MAX_NUM)
- return -EINVAL;
-
- /* cs offset check */
- if (offset >= g_lbc_dev.cs[index].size)
- return -EINVAL;
-
- if (type != LBC_RWDATA_WIDTH_8
- && type != LBC_RWDATA_WIDTH_16
- && type != LBC_RWDATA_WIDTH_32)
- return -EINVAL;
-
- /* width check */
- if ((type == LBC_RWDATA_WIDTH_16)
- || (type == LBC_RWDATA_WIDTH_32)) {
- if (offset % (type * 0x2))
- return -EINVAL;
- }
-
- return 0;
-}
-
-static unsigned int lbc_read(unsigned int index, unsigned int offset, unsigned int type)
-{
- void __iomem *base_addr = ACCESS_ONCE(g_lbc_dev.cs[index].cs_base);
- unsigned int value;
- unsigned long flags;
-
- spin_lock_irqsave(&g_lbc_dev.cs[index].lock, flags);
-
- if (type == LBC_RWDATA_WIDTH_8)
- value = readb(base_addr + offset) & 0xff;
- else if (type == LBC_RWDATA_WIDTH_16)
- value = readw(base_addr + offset) & 0xffff;
- else
- value = readl(base_addr + offset) & 0xffffffff;
-
- spin_unlock_irqrestore(&g_lbc_dev.cs[index].lock, flags);
-
- return value;
-}
-
-static unsigned int lbc_read_unlock(unsigned int index, unsigned int offset, unsigned int type)
-{
- void __iomem *base_addr = ACCESS_ONCE(g_lbc_dev.cs[index].cs_base);
- unsigned int value;
-
- if (type == LBC_RWDATA_WIDTH_8)
- value = readb(base_addr + offset) & 0xff;
- else if (type == LBC_RWDATA_WIDTH_16)
- value = readw(base_addr + offset) & 0xffff;
- else
- value = readl(base_addr + offset) & 0xffffffff;
-
- return value;
-}
-
-static int lbc_write(unsigned int index, unsigned int offset, unsigned int type, unsigned int data)
-{
- void __iomem *base_addr = ACCESS_ONCE(g_lbc_dev.cs[index].cs_base);
- unsigned long flags;
-
- spin_lock_irqsave(&g_lbc_dev.cs[index].lock, flags);
-
- if (type == LBC_RWDATA_WIDTH_8)
- writeb(data & 0xff, base_addr + offset);
- else if (type == LBC_RWDATA_WIDTH_16)
- writew(data & 0xffff, base_addr + offset);
- else
- writel(data & 0xffffffff, base_addr + offset);
-
- spin_unlock_irqrestore(&g_lbc_dev.cs[index].lock, flags);
-
- return 0;
-}
-
-static int lbc_write_unlock(unsigned int index, unsigned int offset, unsigned int type, unsigned int data)
-{
- void __iomem *base_addr = ACCESS_ONCE(g_lbc_dev.cs[index].cs_base);
-
- if (type == LBC_RWDATA_WIDTH_8)
- writeb(data & 0xff, base_addr + offset);
- else if (type == LBC_RWDATA_WIDTH_16)
- writew(data & 0xffff, base_addr + offset);
- else
- writel(data & 0xffffffff, base_addr + offset);
-
- return 0;
-}
-
-int lbc_read8(unsigned int index, unsigned int offset, unsigned char *value)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_8)) {
- pr_err("Lbc para check failed\n");
- return -EINVAL;
- }
-
- if (!value) {
- pr_err("value is null\n");
- return -EINVAL;
- }
-
- *value = (unsigned char)lbc_read(index, offset, LBC_RWDATA_WIDTH_8);
-
- return 0;
-}
-EXPORT_SYMBOL(lbc_read8);
-
-int lbc_read8_nolock(unsigned int index, unsigned int offset, unsigned char *value)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_8)) {
- pr_err("Lbc para check failed\n");
- return -EINVAL;
- }
-
- if (!value) {
- pr_err("value is null\n");
- return -EINVAL;
- }
-
- *value = (unsigned char)lbc_read_unlock(index, offset, LBC_RWDATA_WIDTH_8);
- return 0;
-}
-EXPORT_SYMBOL(lbc_read8_nolock);
-
-unsigned short lbc_read16(unsigned int index, unsigned int offset)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_16)) {
- pr_err("Lbc para check failed\n");
- return 0;
- }
-
- return (unsigned short)lbc_read(index, offset, LBC_RWDATA_WIDTH_16);
-}
-
-unsigned int lbc_read32(unsigned int index, unsigned int offset)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_32)) {
- pr_err("Lbc para check failed\n");
- return 0;
- }
-
- return lbc_read(index, offset, LBC_RWDATA_WIDTH_32);
-}
-
-int lbc_write8(unsigned int index, unsigned int offset, unsigned char data)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_8)) {
- pr_err("Lbc para check failed\n");
- return -EINVAL;
- }
-
- return lbc_write(index, offset, LBC_RWDATA_WIDTH_8, (unsigned int)data);
-}
-EXPORT_SYMBOL(lbc_write8);
-
-int lbc_write8_nolock(unsigned int index, unsigned int offset, unsigned char data)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_8)) {
- pr_err("Lbc para check failed\n");
- return -EINVAL;
- }
-
- return lbc_write_unlock(index, offset, LBC_RWDATA_WIDTH_8, (unsigned int)data);
-}
-EXPORT_SYMBOL(lbc_write8_nolock);
-
-int lbc_write16(unsigned int index, unsigned int offset, unsigned short data)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_16)) {
- pr_err("Lbc para check failed\n");
- return -EINVAL;
- }
-
- return lbc_write(index, offset, LBC_RWDATA_WIDTH_16, (unsigned int)data);
-}
-
-int lbc_write32(unsigned int index, unsigned int offset, unsigned int data)
-{
- /* para check */
- if (hisi_lbc_para_check(index, offset, LBC_RWDATA_WIDTH_32)) {
- pr_err("Lbc para check failed\n");
- return -EINVAL;
- }
-
- return lbc_write(index, offset, LBC_RWDATA_WIDTH_32, (unsigned int)data);
-}
-
-static int hisi_lbc_cs_init(struct platform_device *pdev)
-{
- unsigned int index;
- unsigned int width;
- unsigned int shift;
- struct resource *cs_base = NULL;
-
- if (has_acpi_companion(g_lbc_dev.dev)) {
- /* get cs index */
- index = 0;
- (void)device_property_read_u32(g_lbc_dev.dev, "index", &index);
-
- if (index >= LBC_CS_MAX_NUM) {
- dev_err(g_lbc_dev.dev, "Cs index error\n");
- return -EINVAL;
- }
-
- /* lock init */
- spin_lock_init(&g_lbc_dev.cs[index].lock);
-
- /* get cs base address */
- cs_base = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-
- if (!cs_base) {
- dev_err(g_lbc_dev.dev, "Can not find this cs base resource\n");
- return -ENOENT;
- }
-
- g_lbc_dev.cs[index].cs_base = devm_ioremap_resource(&pdev->dev, cs_base);
-
- if (IS_ERR(g_lbc_dev.cs[index].cs_base))
- return (int)PTR_ERR(g_lbc_dev.cs[index].cs_base);
-
- g_lbc_dev.cs[index].size = (unsigned int)resource_size(cs_base);
-
- lbc_set_cs_base_addr(index, (unsigned int)cs_base->start);
- lbc_set_cs_mem_size(index, resource_size(cs_base));
-
- /* get cs width */
- width = 0;
- (void)device_property_read_u32(g_lbc_dev.dev, "width", &width);
-
- if (width > LBC_CS_WIDTH_32) {
- dev_err(g_lbc_dev.dev, "Cs width error\n");
- return -EINVAL;
- }
-
- g_lbc_dev.cs[index].width = width;
- lbc_set_cs_data_width(index, width);
-
- /* get cs address offset */
- shift = 0;
- (void)device_property_read_u32(g_lbc_dev.dev, "shift", &shift);
-
- if (shift > LBC_CS_ADDR_SHIFT_2) {
- dev_err(g_lbc_dev.dev, "Cs address shift error\n");
- return -EINVAL;
- }
-
- g_lbc_dev.cs[index].shift = shift;
-
- lbc_set_cs_data_offset(index, shift);
- }
-
- return 0;
-}
-
-static int hisi_lbc_probe(struct platform_device *pdev)
-{
- int ret;
- struct resource *regs_base = NULL;
-
- dev_info(&pdev->dev, "hisi lbc probe\n");
-
- if ((!pdev->dev.of_node) && (!ACPI_COMPANION(&pdev->dev))) {
- dev_err(&pdev->dev, "Device OF-Node and ACPI-Node is NULL\n");
- return -EFAULT;
- }
-
- g_lbc_dev.dev = &pdev->dev;
-
- /* get resource num */
- regs_base = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-
- if (!g_lbc_dev.is_reg_remaped) {
- g_lbc_dev.regs_base = devm_ioremap_resource(&pdev->dev, regs_base);
- g_lbc_dev.is_reg_remaped = 1;
- }
-
- if (IS_ERR(g_lbc_dev.regs_base)) {
- dev_err(&pdev->dev, "ERROR: regbase\n");
- return (int)PTR_ERR(g_lbc_dev.regs_base);
- }
-
- /* localbus cs init */
- ret = hisi_lbc_cs_init(pdev);
- if (ret) {
- dev_err(&pdev->dev, "Localbus cs init failed\n");
- return -1;
- }
-
- platform_set_drvdata(pdev, &g_lbc_dev);
- dev_info(&pdev->dev, "hisi lbc probe prob ok\n");
- return 0;
-}
-
-static int hisi_lbc_remove(struct platform_device *pdev)
-{
- return 0;
-}
-
-static const struct of_device_id g_hisi_lbc_pltfm_match[] = {
- {
- .compatible = "hisilicon, hi1620_lbc",
- },
- {},
-};
-
-#ifdef CONFIG_ACPI
-static const struct acpi_device_id g_hisi_lbc_acpi_match[] = {
- { "HISI0C01", 0 },
- { }
-};
-MODULE_DEVICE_TABLE(acpi, g_hisi_lbc_acpi_match);
-#endif
-
-static struct platform_driver g_hisi_lbc_driver = {
- .probe = hisi_lbc_probe,
- .remove = hisi_lbc_remove,
- .driver = {
- .name = "hisi-lbc",
- .owner = THIS_MODULE,
- .of_match_table = g_hisi_lbc_pltfm_match,
-#ifdef CONFIG_ACPI
- .acpi_match_table = ACPI_PTR(g_hisi_lbc_acpi_match),
-#endif
- },
-
-};
-
-static int __init hisi_lbc_init_driver(void)
-{
- return platform_driver_register((struct platform_driver *)&g_hisi_lbc_driver);
-}
-
-static void __exit hisi_lbc_exit_driver(void)
-{
- platform_driver_unregister((struct platform_driver *)&g_hisi_lbc_driver);
-}
-
-module_init(hisi_lbc_init_driver);
-module_exit(hisi_lbc_exit_driver);
-
-MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR("Huawei Tech. Co., Ltd.");
-MODULE_VERSION(LBC_DRIVER_VERSION);
-MODULE_DESCRIPTION("LBC driver for linux");
diff --git a/drivers/soc/hisilicon/lbc/hs_lbc_pltfm.h b/drivers/soc/hisilicon/lbc/hs_lbc_pltfm.h
deleted file mode 100644
index 24ca4366ec512..0000000000000
--- a/drivers/soc/hisilicon/lbc/hs_lbc_pltfm.h
+++ /dev/null
@@ -1,120 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2019 Hisilicon Limited, All Rights Reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- */
-
-#ifndef _HS_LBC_PLTFM_H_
-#define _HS_LBC_PLTFM_H_
-#include <linux/version.h>
-
-/* RW data width */
-#define LBC_RWDATA_WIDTH_8 (0)
-#define LBC_RWDATA_WIDTH_16 (1)
-#define LBC_RWDATA_WIDTH_32 (2)
-
-/* cs width */
-#define LBC_CS_WIDTH_8 (0)
-#define LBC_CS_WIDTH_16 (1)
-#define LBC_CS_WIDTH_32 (2)
-
-/* cs address shift */
-#define LBC_CS_ADDR_SHIFT_0 (0)
-#define LBC_CS_ADDR_SHIFT_1 (1)
-#define LBC_CS_ADDR_SHIFT_2 (2)
-
-#define LBC_CS_MAX_NUM (4)
-
-#define LBC_CS_MEM_SIZE_0 (0)
-#define LBC_CS_MEM_SIZE_64K (64 * 1024)
-#define LBC_CS_MEM_SIZE_128K (LBC_CS_MEM_SIZE_64K << 1)
-#define LBC_CS_MEM_SIZE_256K (LBC_CS_MEM_SIZE_128K << 1)
-#define LBC_CS_MEM_SIZE_512K (LBC_CS_MEM_SIZE_256K << 1)
-#define LBC_CS_MEM_SIZE_1M (LBC_CS_MEM_SIZE_512K << 1)
-#define LBC_CS_MEM_SIZE_2M (LBC_CS_MEM_SIZE_1M << 1)
-#define LBC_CS_MEM_SIZE_4M (LBC_CS_MEM_SIZE_2M << 1)
-#define LBC_CS_MEM_SIZE_8M (LBC_CS_MEM_SIZE_4M << 1)
-#define LBC_CS_MEM_SIZE_16M (LBC_CS_MEM_SIZE_8M << 1)
-#define LBC_CS_MEM_SIZE_32M (LBC_CS_MEM_SIZE_16M << 1)
-#define LBC_CS_MEM_SIZE_64M (LBC_CS_MEM_SIZE_32M << 1)
-#define LBC_CS_MEM_SIZE_128M (LBC_CS_MEM_SIZE_64M << 1)
-#define LBC_CS_MEM_SIZE_256M (LBC_CS_MEM_SIZE_128M << 1)
-
-#define LBC_CS_MEM_SIZE_REG_0 (0)
-#define LBC_CS_MEM_SIZE_REG_64K (1)
-#define LBC_CS_MEM_SIZE_REG_128K (2)
-#define LBC_CS_MEM_SIZE_REG_256K (3)
-#define LBC_CS_MEM_SIZE_REG_512K (4)
-#define LBC_CS_MEM_SIZE_REG_1M (5)
-#define LBC_CS_MEM_SIZE_REG_2M (6)
-#define LBC_CS_MEM_SIZE_REG_4M (7)
-#define LBC_CS_MEM_SIZE_REG_8M (8)
-#define LBC_CS_MEM_SIZE_REG_16M (9)
-#define LBC_CS_MEM_SIZE_REG_32M (10)
-#define LBC_CS_MEM_SIZE_REG_64M (11)
-#define LBC_CS_MEM_SIZE_REG_128M (12)
-#define LBC_CS_MEM_SIZE_REG_256M (13)
-#define LBC_CS_MEM_SIZE_REG_512M (14)
-#define LBC_CS_MEM_SIZE_REG_1G (15)
-#define LBC_CS_MEM_SIZE_REG_2G (16)
-#define LBC_CS_MEM_SIZE_REG_4G (17)
-
-typedef struct lbc_cs_ctrl {
- volatile unsigned int mem_size : 5;
- volatile unsigned int data_width : 2;
- volatile unsigned int data_order : 1;
- volatile unsigned int byte_order : 1;
- volatile unsigned int rdy_mode : 1;
- volatile unsigned int rdy_pol : 1;
- volatile unsigned int addr_offset : 1;
- volatile unsigned int lbctl_en : 1;
- volatile unsigned int page_en : 1;
- volatile unsigned int page_size : 2;
- volatile unsigned int rdy_tout_en : 1;
- volatile unsigned int rble : 1;
- volatile unsigned int reserved : 14;
-} LBC_CS_CTRL;
-
-#define LBC_REG_RSV_MAX_NUM 4
-#define LBC_REG_CRE_MAX_NUM 4
-typedef struct lbc_reg_region {
- volatile unsigned int cs_base[LBC_CS_MAX_NUM];
- volatile unsigned int cs_base_reserved[LBC_REG_RSV_MAX_NUM];
- volatile LBC_CS_CTRL cs_ctrl[LBC_CS_MAX_NUM];
- volatile LBC_CS_CTRL cs_ctrl_creserved[LBC_REG_CRE_MAX_NUM];
-} LBC_REG_REGION;
-
-struct hisi_lbc_cs {
- unsigned int index;
- spinlock_t lock;
- void __iomem *cs_base;
- unsigned int size;
- unsigned int width; /* width */
- unsigned int shift; /* address shift */
-};
-
-struct hisi_lbc_dev {
- unsigned char is_reg_remaped;
- struct device *dev;
- void __iomem *regs_base; /* localbus regs base addr */
- struct hisi_lbc_cs cs[LBC_CS_MAX_NUM];
-};
-
-#if LINUX_VERSION_CODE > KERNEL_VERSION(4, 16, 0)
-#define __ACCESS_ONCE(x) ({ \
- __maybe_unused typeof(x) __var = (__force typeof(x)) 0; \
- (volatile typeof(x) *)&(x); })
-#define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
-#endif
-
-#endif
--
2.25.1
[PATCH openEuler-1.0-LTS] atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_wait
by Yang Yingliang 22 Nov '21
From: Zekun Shen <bruceshenzk(a)gmail.com>
mainline inclusion
from mainline-v5.16-rc2
commit b922f622592af76b57cbc566eaeccda0b31a3496
category: bugfix
bugzilla: NA
CVE: CVE-2021-43975
-------------------------------------------------
This bug report showed up when running our research tools. The report
is a slab-out-of-bounds (SOOB) read, but an SOOB write also appears
possible a few lines below.
In detail, fw.len and sw.len are inputs coming from I/O; a length larger
than the size of self->rpc triggers the out-of-bounds access. The patch
fixes the bugs by adding sanity checks.
The bugs are triggerable with compromised/malfunctioning devices. They
are potentially exploitable, given that they first leak up to 0xffff
bytes and can later overwrite the region.
The patch is tested with the QEMU emulator; it is NOT tested with a
real device.
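The shape of the fix, as a minimal hedged sketch (dst_buf,
read_len_from_device() and copy_from_device() are hypothetical
stand-ins, not driver symbols): any length the device controls must be
bounded by the real destination buffer before it drives a copy:

    u8 dst_buf[256];                      /* hypothetical destination */
    u32 len = read_len_from_device();     /* device-controlled input */

    if (len > sizeof(dst_buf))            /* reject oversized lengths */
            return -EINVAL;
    copy_from_device(dst_buf, len);       /* now provably in bounds */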
Attached is the log we found by fuzzing.
BUG: KASAN: slab-out-of-bounds in
hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
Read of size 4 at addr ffff888016260b08 by task modprobe/213
CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1
Call Trace:
dump_stack+0x76/0xa0
print_address_description.constprop.0+0x16/0x200
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
__kasan_report.cold+0x37/0x7c
? aq_hw_read_reg_bit+0x60/0x70 [atlantic]
? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
kasan_report+0xe/0x20
hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic]
hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic]
hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic]
hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic]
? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic]
? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic]
hw_atl_utils_initfw+0x12a/0x1c8 [atlantic]
aq_nic_ndev_register+0x88/0x650 [atlantic]
? aq_nic_ndev_init+0x235/0x3c0 [atlantic]
aq_pci_probe+0x731/0x9b0 [atlantic]
? aq_pci_func_init+0xc0/0xc0 [atlantic]
local_pci_probe+0xd3/0x160
pci_device_probe+0x23f/0x3e0
Reported-by: Brendan Dolan-Gavitt <brendandg(a)nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
.../ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
index 096ec18e8f15a..49c80bac9ce28 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
@@ -459,6 +459,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self,
goto err_exit;
if (fw.len == 0xFFFFU) {
+ if (sw.len > sizeof(self->rpc)) {
+ printk(KERN_INFO "Invalid sw len: %x\n", sw.len);
+ err = -EINVAL;
+ goto err_exit;
+ }
err = hw_atl_utils_fw_rpc_call(self, sw.len);
if (err < 0)
goto err_exit;
@@ -469,6 +474,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self,
if (rpc) {
if (fw.len) {
+ if (fw.len > sizeof(self->rpc)) {
+ printk(KERN_INFO "Invalid fw len: %x\n", fw.len);
+ err = -EINVAL;
+ goto err_exit;
+ }
err =
hw_atl_utils_fw_downld_dwords(self,
self->rpc_addr,
--
2.25.1
[PATCH OLK-5.10 0/4] Introduce x86 assembler accelerated implementation for SM4 algorithm
by shenzijun 22 Nov '21
From: 沈子俊 <shenzijun(a)kylinos.cn>
This patchset extracts the common SM4 code into a separate library. At
the same time, the SM4 acceleration on arm64 is adjusted to use this
library. It then introduces an instruction-set accelerated
implementation for x86_64.
The optimization supports four SM4 modes: ECB, CBC, CFB, and CTR. Since
CBC and CFB cannot encrypt multiple blocks in parallel, the gain there
is modest. All selftests pass.
The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni
Benchmarked on an Intel Xeon Cascade Lake; the data comes from tcrypt
modes 218 and 518. The columns are block lengths in bytes, and the unit
is Mb/s:
sm4-generic | 16 64 128 256 1024 1420 4096
ECB enc | 40.99 46.50 48.05 48.41 49.20 49.25 49.28
ECB dec | 41.07 46.99 48.15 48.67 49.20 49.25 49.29
CBC enc | 37.71 45.28 46.77 47.60 48.32 48.37 48.40
CBC dec | 36.48 44.82 46.43 47.45 48.23 48.30 48.36
CFB enc | 37.94 44.84 46.12 46.94 47.57 47.46 47.68
CFB dec | 37.50 42.84 43.74 44.37 44.85 44.80 44.96
CTR enc | 39.20 45.63 46.75 47.49 48.09 47.85 48.08
CTR dec | 39.64 45.70 46.72 47.47 47.98 47.88 48.06
sm4-aesni-avx
ECB enc | 33.75 134.47 221.64 243.43 264.05 251.58 258.13
ECB dec | 34.02 134.92 223.11 245.14 264.12 251.04 258.33
CBC enc | 38.85 46.18 47.67 48.34 49.00 48.96 49.14
CBC dec | 33.54 131.29 223.88 245.27 265.50 252.41 263.78
CFB enc | 38.70 46.10 47.58 48.29 49.01 48.94 49.19
CFB dec | 32.79 128.40 223.23 244.87 265.77 253.31 262.79
CTR enc | 32.58 122.23 220.29 241.16 259.57 248.32 256.69
CTR dec | 32.81 122.47 218.99 241.54 258.42 248.58 256.61
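For context, a minimal sketch of how a kernel user reaches the
accelerated cipher through the generic crypto API; the highest-priority
implementation (sm4-aesni-avx when usable) is selected automatically.
Error handling and request setup are trimmed, and key is a placeholder:

    struct crypto_skcipher *tfm;

    tfm = crypto_alloc_skcipher("ctr(sm4)", 0, 0);
    if (IS_ERR(tfm))
            return PTR_ERR(tfm);
    crypto_skcipher_setkey(tfm, key, 16);   /* SM4 uses a 128-bit key */
    /* ... build an skcipher_request, call crypto_skcipher_encrypt() ... */
    crypto_free_skcipher(tfm);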
---
v3 changes:
* Remove single block algorithm that does not greatly improve performance
* Remove accelerated for sm4 key expand, which is not performance-critical
* Fix the warning on arm64/sm4-ce
v2 changes:
* SM4 library functions use "sm4_" prefix instead of "crypto_" prefix
* sm4-aesni-avx supports accelerated implementation of four specific modes
* tcrypt benchmark supports sm4-aesni-avx
* fixes of other reviews
沈子俊 (4):
crypto: sm4 - create SM4 library based on sm4 generic code
crypto: arm64/sm4-ce - Make dependent on sm4 library instead of
sm4-generic
crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation
crypto: tcrypt - add the asynchronous speed test for SM4
arch/arm64/crypto/Kconfig | 2 +-
arch/arm64/crypto/sm4-ce-glue.c | 20 +-
arch/x86/crypto/Makefile | 3 +
arch/x86/crypto/sm4-aesni-avx-asm_64.S | 589 +++++++++++++++++++++++++
arch/x86/crypto/sm4_aesni_avx_glue.c | 459 +++++++++++++++++++
crypto/Kconfig | 22 +
crypto/sm4_generic.c | 180 +-------
crypto/tcrypt.c | 26 +-
include/crypto/sm4.h | 25 +-
lib/crypto/Kconfig | 3 +
lib/crypto/Makefile | 3 +
lib/crypto/sm4.c | 176 ++++++++
12 files changed, 1330 insertions(+), 178 deletions(-)
create mode 100644 arch/x86/crypto/sm4-aesni-avx-asm_64.S
create mode 100644 arch/x86/crypto/sm4_aesni_avx_glue.c
create mode 100644 lib/crypto/sm4.c
--
2.30.0
[PATCH openEuler-5.10 01/88] topology: Represent clusters of CPUs within a die
by Zheng Zengkai 18 Nov '21
From: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
mainline inclusion
from tip/sched/core for v5.15-release
commit: c5e22feffdd736cb02b98b0f5b375c8ebc858dd4
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GEZS
CVE: NA
Reference: https://lore.kernel.org/lkml/20210924085104.44806-1-21cnbao@gmail.com/
------------------------------------------------------------------------
Both ACPI and DT provide the ability to describe additional layers of
topology between that of individual cores and higher level constructs
such as the level at which the last level cache is shared.
In ACPI this can be represented in PPTT as a Processor Hierarchy
Node Structure [1] that is the parent of the CPU cores and in turn
has a parent Processor Hierarchy Nodes Structure representing
a higher level of topology.
For example, Kunpeng 920 has 6 or 8 clusters in each NUMA node, and
each cluster has 4 CPUs. All clusters share the L3 cache data, but each
cluster has a local L3 tag. In addition, the clusters within a NUMA
node share some internal system buses.
+-----------------------------------+ +---------+
| +------+ +------+ +--------------------------+ |
| | CPU0 | | cpu1 | | +-----------+ | |
| +------+ +------+ | | | | |
| +----+ L3 | | |
| +------+ +------+ cluster | | tag | | |
| | CPU2 | | CPU3 | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | |
+-----------------------------------+ | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| | | L3 | | |
| +------+ +------+ +----+ tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | L3 |
| data |
+-----------------------------------+ | |
| +------+ +------+ | +-----------+ | |
| | | | | | | | | |
| +------+ +------+ +----+ L3 | | |
| | | tag | | |
| +------+ +------+ | | | | |
| | | | | | +-----------+ | |
| +------+ +------+ +--------------------------+ |
+-----------------------------------| | |
+-----------------------------------| | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| +----+ L3 | | |
| +------+ +------+ | | tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | |
+-----------------------------------+ | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| | | L3 | | |
| +------+ +------+ +---+ tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | |
+-----------------------------------+ | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| | | L3 | | |
| +------+ +------+ +--+ tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | +---------+
+-----------------------------------+
That means spreading tasks among clusters will bring more bandwidth
while packing tasks within one cluster will lead to smaller cache
synchronization latency. So both kernel and userspace will have
a chance to leverage this topology to deploy tasks accordingly to
achieve either smaller cache latency within one cluster or an even
distribution of load among clusters for higher throughput.
This patch exposes the cluster topology to both kernel and userspace.
Libraries like hwloc can discover clusters via cluster_cpus and the
related sysfs attributes; a PoC of hwloc support is at [2].
Note this patch only handles the ACPI case.
Special consideration is needed for SMT processors, where it is
necessary to move 2 levels up the hierarchy from the leaf nodes
(thus skipping the processor core level).
Note that arm64 / ACPI does not provide any means of identifying
a die level in the topology, but that may be unrelated to the cluster
level.
[1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
structure (Type 0)
[2] https://github.com/hisilicon/hwloc/tree/linux-cluster
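As an illustration only (a hedged userspace sketch, not part of the
patch; the path follows the sysfs attributes added here):

    #include <stdio.h>

    int main(void)
    {
            char buf[64];
            FILE *f = fopen("/sys/devices/system/cpu/cpu0/topology/"
                            "cluster_cpus_list", "r");

            if (!f)
                    return 1;       /* attribute not present */
            if (fgets(buf, sizeof(buf), f))
                    printf("cpu0 cluster peers: %s", buf);
            fclose(f);
            return 0;
    }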
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Signed-off-by: Barry Song <song.bao.hua(a)hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
Signed-off-by: Yicong Yang <yangyicong(a)hisilicon.com>
Reviewed-by: tao zeng <prime.zeng(a)hisilicon.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
Documentation/admin-guide/cputopology.rst | 26 +++++++--
arch/arm64/kernel/topology.c | 2 +
drivers/acpi/pptt.c | 67 +++++++++++++++++++++++
drivers/base/arch_topology.c | 15 +++++
drivers/base/topology.c | 10 ++++
include/linux/acpi.h | 5 ++
include/linux/arch_topology.h | 5 ++
include/linux/topology.h | 6 ++
8 files changed, 132 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/cputopology.rst b/Documentation/admin-guide/cputopology.rst
index b90dafcc8237..57be98ae27b8 100644
--- a/Documentation/admin-guide/cputopology.rst
+++ b/Documentation/admin-guide/cputopology.rst
@@ -18,6 +18,11 @@ die_id:
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
+cluster_id:
+ the cluster ID of cpuX. Typically it is the hardware platform's
+ identifier (rather than the kernel's). The actual value is
+ architecture and platform dependent.
+
core_id:
the CPU core ID of cpuX. Typically it is the hardware platform's
@@ -64,6 +69,15 @@ die_cpus_list:
human-readable list of CPUs within the same die.
+cluster_cpus:
+
+ internal kernel map of CPUs within the same cluster
+
+cluster_cpus_list:
+
+ human-readable list of CPUs within the same cluster.
+ The format is like 0-3, 8-11, 14,17.
+
book_siblings:
internal kernel map of cpuX's hardware threads within the same
@@ -96,11 +110,13 @@ these macros in include/asm-XXX/topology.h::
#define topology_physical_package_id(cpu)
#define topology_die_id(cpu)
+ #define topology_cluster_id(cpu)
#define topology_core_id(cpu)
#define topology_book_id(cpu)
#define topology_drawer_id(cpu)
#define topology_sibling_cpumask(cpu)
#define topology_core_cpumask(cpu)
+ #define topology_cluster_cpumask(cpu)
#define topology_die_cpumask(cpu)
#define topology_book_cpumask(cpu)
#define topology_drawer_cpumask(cpu)
@@ -116,10 +132,12 @@ not defined by include/asm-XXX/topology.h:
1) topology_physical_package_id: -1
2) topology_die_id: -1
-3) topology_core_id: 0
-4) topology_sibling_cpumask: just the given CPU
-5) topology_core_cpumask: just the given CPU
-6) topology_die_cpumask: just the given CPU
+3) topology_cluster_id: -1
+4) topology_core_id: 0
+5) topology_sibling_cpumask: just the given CPU
+6) topology_core_cpumask: just the given CPU
+7) topology_cluster_cpumask: just the given CPU
+8) topology_die_cpumask: just the given CPU
For architectures that don't support books (CONFIG_SCHED_BOOK) there are no
default definitions for topology_book_id() and topology_book_cpumask().
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 19cedf882a6b..a80fcb6dd88a 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -114,6 +114,8 @@ int __init parse_acpi_topology(void)
cpu_topology[cpu].thread_id = -1;
cpu_topology[cpu].core_id = topology_id;
}
+ topology_id = find_acpi_cpu_topology_cluster(cpu);
+ cpu_topology[cpu].cluster_id = topology_id;
topology_id = find_acpi_cpu_topology_package(cpu);
cpu_topology[cpu].package_id = topology_id;
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 20bd584f2516..3d9403ded527 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -849,6 +849,73 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
ACPI_PPTT_PHYSICAL_PACKAGE);
}
+/**
+ * find_acpi_cpu_topology_cluster() - Determine a unique CPU cluster value
+ * @cpu: Kernel logical CPU number
+ *
+ * Determine a topology unique cluster ID for the given CPU/thread.
+ * This ID can then be used to group peers, which will have matching ids.
+ *
+ * The cluster, if present is the level of topology above CPUs. In a
+ * multi-thread CPU, it will be the level above the CPU, not the thread.
+ * It may not exist in single CPU systems. In simple multi-CPU systems,
+ * it may be equal to the package topology level.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, the CPU cannot be found
+ * or there is no toplogy level above the CPU..
+ * Otherwise returns a value which represents the package for this CPU.
+ */
+
+int find_acpi_cpu_topology_cluster(unsigned int cpu)
+{
+ struct acpi_table_header *table;
+ acpi_status status;
+ struct acpi_pptt_processor *cpu_node, *cluster_node;
+ u32 acpi_cpu_id;
+ int retval;
+ int is_thread;
+
+ status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+ if (ACPI_FAILURE(status)) {
+ acpi_pptt_warn_missing();
+ return -ENOENT;
+ }
+
+ acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+ cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
+ if (cpu_node == NULL || !cpu_node->parent) {
+ retval = -ENOENT;
+ goto put_table;
+ }
+
+ is_thread = cpu_node->flags & ACPI_PPTT_ACPI_PROCESSOR_IS_THREAD;
+ cluster_node = fetch_pptt_node(table, cpu_node->parent);
+ if (cluster_node == NULL) {
+ retval = -ENOENT;
+ goto put_table;
+ }
+ if (is_thread) {
+ if (!cluster_node->parent) {
+ retval = -ENOENT;
+ goto put_table;
+ }
+ cluster_node = fetch_pptt_node(table, cluster_node->parent);
+ if (cluster_node == NULL) {
+ retval = -ENOENT;
+ goto put_table;
+ }
+ }
+ if (cluster_node->flags & ACPI_PPTT_ACPI_PROCESSOR_ID_VALID)
+ retval = cluster_node->acpi_processor_id;
+ else
+ retval = ACPI_PTR_DIFF(cluster_node, table);
+
+put_table:
+ acpi_put_table(table);
+
+ return retval;
+}
+
/**
* find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
* @cpu: Kernel logical CPU number
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index de8587cc119e..21e63b6fab83 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -506,6 +506,11 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
return core_mask;
}
+const struct cpumask *cpu_clustergroup_mask(int cpu)
+{
+ return &cpu_topology[cpu].cluster_sibling;
+}
+
void update_siblings_masks(unsigned int cpuid)
{
struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
@@ -523,6 +528,12 @@ void update_siblings_masks(unsigned int cpuid)
if (cpuid_topo->package_id != cpu_topo->package_id)
continue;
+ if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
+ cpuid_topo->cluster_id != -1) {
+ cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
+ cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
+ }
+
cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
@@ -541,6 +552,9 @@ static void clear_cpu_topology(int cpu)
cpumask_clear(&cpu_topo->llc_sibling);
cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
+ cpumask_clear(&cpu_topo->cluster_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->cluster_sibling);
+
cpumask_clear(&cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
cpumask_clear(&cpu_topo->thread_sibling);
@@ -556,6 +570,7 @@ void __init reset_cpu_topology(void)
cpu_topo->thread_id = -1;
cpu_topo->core_id = -1;
+ cpu_topo->cluster_id = -1;
cpu_topo->package_id = -1;
cpu_topo->llc_id = -1;
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 4d254fcc93d1..7157ac08ff57 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -46,6 +46,9 @@ static DEVICE_ATTR_RO(physical_package_id);
define_id_show_func(die_id);
static DEVICE_ATTR_RO(die_id);
+define_id_show_func(cluster_id);
+static DEVICE_ATTR_RO(cluster_id);
+
define_id_show_func(core_id);
static DEVICE_ATTR_RO(core_id);
@@ -61,6 +64,10 @@ define_siblings_show_func(core_siblings, core_cpumask);
static DEVICE_ATTR_RO(core_siblings);
static DEVICE_ATTR_RO(core_siblings_list);
+define_siblings_show_func(cluster_cpus, cluster_cpumask);
+static DEVICE_ATTR_RO(cluster_cpus);
+static DEVICE_ATTR_RO(cluster_cpus_list);
+
define_siblings_show_func(die_cpus, die_cpumask);
static DEVICE_ATTR_RO(die_cpus);
static DEVICE_ATTR_RO(die_cpus_list);
@@ -88,6 +95,7 @@ static DEVICE_ATTR_RO(drawer_siblings_list);
static struct attribute *default_attrs[] = {
&dev_attr_physical_package_id.attr,
&dev_attr_die_id.attr,
+ &dev_attr_cluster_id.attr,
&dev_attr_core_id.attr,
&dev_attr_thread_siblings.attr,
&dev_attr_thread_siblings_list.attr,
@@ -95,6 +103,8 @@ static struct attribute *default_attrs[] = {
&dev_attr_core_cpus_list.attr,
&dev_attr_core_siblings.attr,
&dev_attr_core_siblings_list.attr,
+ &dev_attr_cluster_cpus.attr,
+ &dev_attr_cluster_cpus_list.attr,
&dev_attr_die_cpus.attr,
&dev_attr_die_cpus_list.attr,
&dev_attr_package_cpus.attr,
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index a3db177329c2..9045dfb6d19c 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1350,6 +1350,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
int acpi_pptt_init(void);
int acpi_pptt_cpu_is_thread(unsigned int cpu);
int find_acpi_cpu_topology(unsigned int cpu, int level);
+int find_acpi_cpu_topology_cluster(unsigned int cpu);
int find_acpi_cpu_topology_package(unsigned int cpu);
int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
@@ -1362,6 +1363,10 @@ static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
{
return -EINVAL;
}
+static inline int find_acpi_cpu_topology_cluster(unsigned int cpu)
+{
+ return -EINVAL;
+}
static inline int find_acpi_cpu_topology_package(unsigned int cpu)
{
return -EINVAL;
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 0f6cd6b73a61..987c7ea75291 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -49,10 +49,12 @@ void topology_set_thermal_pressure(const struct cpumask *cpus,
struct cpu_topology {
int thread_id;
int core_id;
+ int cluster_id;
int package_id;
int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
+ cpumask_t cluster_sibling;
cpumask_t llc_sibling;
};
@@ -60,13 +62,16 @@ struct cpu_topology {
extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
+#define topology_cluster_id(cpu) (cpu_topology[cpu].cluster_id)
#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
+#define topology_cluster_cpumask(cpu) (&cpu_topology[cpu].cluster_sibling)
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
+const struct cpumask *cpu_clustergroup_mask(int cpu);
void update_siblings_masks(unsigned int cpu);
void remove_cpu_topology(unsigned int cpuid);
void reset_cpu_topology(void);
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 7634cd737061..80d27d717631 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -186,6 +186,9 @@ static inline int cpu_to_mem(int cpu)
#ifndef topology_die_id
#define topology_die_id(cpu) ((void)(cpu), -1)
#endif
+#ifndef topology_cluster_id
+#define topology_cluster_id(cpu) ((void)(cpu), -1)
+#endif
#ifndef topology_core_id
#define topology_core_id(cpu) ((void)(cpu), 0)
#endif
@@ -195,6 +198,9 @@ static inline int cpu_to_mem(int cpu)
#ifndef topology_core_cpumask
#define topology_core_cpumask(cpu) cpumask_of(cpu)
#endif
+#ifndef topology_cluster_cpumask
+#define topology_cluster_cpumask(cpu) cpumask_of(cpu)
+#endif
#ifndef topology_die_cpumask
#define topology_die_cpumask(cpu) cpumask_of(cpu)
#endif
--
2.20.1
[PATCH openEuler-1.0-LTS 1/2] crypto: public_key: fix overflow during implicit conversion
by Yang Yingliang 18 Nov '21
From: zhenwei pi <pizhenwei(a)bytedance.com>
stable inclusion
from linux-4.19.207
commit aab312696d37de80502ca633b40184de24f22917
--------------------------------
commit f985911b7bc75d5c98ed24d8aaa8b94c590f7c6a upstream.
We hit a kernel warning like the one below. It can be reproduced by
verifying a 256-byte data file with the keyctl command, running this
script:
RAWDATA=rawdata
SIGDATA=sigdata
modprobe pkcs8_key_parser
rm -rf *.der *.pem *.pfx
rm -rf $RAWDATA
dd if=/dev/random of=$RAWDATA bs=256 count=1
openssl req -nodes -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
-subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=xx.com/emailAddress=yy(a)xx.com"
KEY_ID=`openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER | keyctl \
padd asymmetric 123 @s`
keyctl pkey_sign $KEY_ID 0 $RAWDATA enc=pkcs1 hash=sha1 > $SIGDATA
keyctl pkey_verify $KEY_ID 0 $RAWDATA $SIGDATA enc=pkcs1 hash=sha1
Then the kernel reports:
WARNING: CPU: 5 PID: 344556 at crypto/rsa-pkcs1pad.c:540
pkcs1pad_verify+0x160/0x190
...
Call Trace:
public_key_verify_signature+0x282/0x380
? software_key_query+0x12d/0x180
? keyctl_pkey_params_get+0xd6/0x130
asymmetric_key_verify_signature+0x66/0x80
keyctl_pkey_verify+0xa5/0x100
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The cause of this issue is in 'asymmetric_key_verify_signature': the
assignment '.digest_size(u8) = params->in_len(u32)' overflows the u8
value, so use u32 instead of u8 for the digest_size field. Also reorder
struct public_key_signature, which saves 8 bytes on a 64-bit machine.
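A two-line illustration of the narrowing (generic C, not the driver
code):

    u32 in_len = 256;          /* e.g. the 256-byte data file above */
    u8 digest_size = in_len;   /* implicit conversion truncates to 0 */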
Cc: stable(a)vger.kernel.org
Signed-off-by: zhenwei pi <pizhenwei(a)bytedance.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/crypto/public_key.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/crypto/public_key.h b/include/crypto/public_key.h
index e0b681a717bac..052e26fda2e6c 100644
--- a/include/crypto/public_key.h
+++ b/include/crypto/public_key.h
@@ -35,9 +35,9 @@ extern void public_key_free(struct public_key *key);
struct public_key_signature {
struct asymmetric_key_id *auth_ids[2];
u8 *s; /* Signature */
- u32 s_size; /* Number of bytes in signature */
u8 *digest;
- u8 digest_size; /* Number of bytes in digest */
+ u32 s_size; /* Number of bytes in signature */
+ u32 digest_size; /* Number of bytes in digest */
const char *pkey_algo;
const char *hash_algo;
};
--
2.25.1
【Meeting Notice】openEuler kernel tech sharing session 14 & biweekly meeting Time: 2021-11-19 14:00-16:30
by Meeting Book 18 Nov '21
[PATCH openEuler-1.0-LTS] net: bridge: fix stale eth hdr pointer in br_dev_xmit
by Yang Yingliang 18 Nov '21
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
mainline inclusion
from mainline-v5.6-rc4
commit 823d81b0fa2cd83a640734e74caee338b5d3c093
category: bugfix
bugzilla: 185773
CVE: NA
-------------------------------------------------
In br_dev_xmit() we perform VLAN filtering in br_allowed_ingress(), but
if the packet carries the VLAN header inline (e.g. a bridge with
tx-vlan-offload disabled) then the VLAN filtering code uses
skb_vlan_untag() to extract the vid before filtering, which in turn
calls pskb_may_pull(), and we may end up with a stale eth pointer.
Moreover, the cached eth header pointer will generally be wrong after
that operation. Remove the eth header caching and just use eth_hdr()
directly; the compiler does the right thing and computes it only once,
so we lose nothing.
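A condensed sketch of the hazard (simplified; hlen and handle_arp() are
hypothetical stand-ins):

    struct ethhdr *eth = eth_hdr(skb);   /* points into skb->head */

    if (!pskb_may_pull(skb, hlen))       /* may reallocate skb->head */
            goto drop;
    /* 'eth' may now dangle; re-derive instead of caching: */
    if (eth_hdr(skb)->h_proto == htons(ETH_P_ARP))
            handle_arp(skb);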
Fixes: 057658cb33fb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Huang Guobin <huangguobin4(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/bridge/br_device.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index a350c05b7ff5e..7b7784bb1cb99 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -39,7 +39,6 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
struct pcpu_sw_netstats *brstats = this_cpu_ptr(br->stats);
const struct nf_br_ops *nf_ops;
const unsigned char *dest;
- struct ethhdr *eth;
u16 vid = 0;
rcu_read_lock();
@@ -58,15 +57,14 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
BR_INPUT_SKB_CB(skb)->brdev = dev;
skb_reset_mac_header(skb);
- eth = eth_hdr(skb);
skb_pull(skb, ETH_HLEN);
if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid))
goto out;
if (IS_ENABLED(CONFIG_INET) &&
- (eth->h_proto == htons(ETH_P_ARP) ||
- eth->h_proto == htons(ETH_P_RARP)) &&
+ (eth_hdr(skb)->h_proto == htons(ETH_P_ARP) ||
+ eth_hdr(skb)->h_proto == htons(ETH_P_RARP)) &&
br->neigh_suppress_enabled) {
br_do_proxy_suppress_arp(skb, br, vid, NULL);
} else if (IS_ENABLED(CONFIG_IPV6) &&
--
2.25.1
[PATCH openEuler-1.0-LTS] x86/entry: Make entry_64_compat.S objtool clean
by Yang Yingliang 17 Nov '21
From: Peter Zijlstra <peterz(a)infradead.org>
mainline inclusion
from mainline-v5.8-rc1
commit 1c3e5d3f60e26415d4227aa1193cf9e2db4df834
category: feature
bugzilla: 175666
CVE: NA
---------------------------
Currently entry_64_compat is exempt from objtool, but with vmlinux
mode there is no hiding it.
Make the following changes to make it pass:
- change entry_SYSENTER_compat to STT_NOTYPE; it's not a function
and doesn't have function type stack setup.
- mark all STT_NOTYPE symbols with UNWIND_HINT_EMPTY; so we do
validate them and don't treat them as unreachable.
- don't abuse RSP as a temp register, this confuses objtool
mightily as it (rightfully) thinks we're doing unspeakable
things to the stack.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre(a)oracle.com>
Acked-by: Andy Lutomirski <luto(a)kernel.org>
Link: https://lkml.kernel.org/r/20200505134341.272248024@linutronix.de
Signed-off-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
Conflicts:
arch/x86/entry/entry_64_compat.S
[wangshaobo: change ENDPROC to END, avoid objtool skipping STT_FUNC type check]
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/x86/entry/Makefile | 2 --
arch/x86/entry/entry_64_compat.S | 17 +++++++++++++++--
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index 06fc70cf5433d..e71890149ce0e 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -3,8 +3,6 @@
# Makefile for the x86 low level entry code
#
-OBJECT_FILES_NON_STANDARD_entry_64_compat.o := y
-
CFLAGS_syscall_64.o += $(call cc-option,-Wno-override-init,)
CFLAGS_syscall_32.o += $(call cc-option,-Wno-override-init,)
obj-y := entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 40d2834a8101e..f326e0c6e8dd8 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -47,11 +47,14 @@
* 0(%ebp) arg6
*/
ENTRY(entry_SYSENTER_compat)
+ UNWIND_HINT_EMPTY
/* Interrupts are off on entry. */
SWAPGS
/* We are about to clobber %rsp anyway, clobbering here is OK */
- SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+ pushq %rax
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+ popq %rax
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
@@ -104,6 +107,9 @@ ENTRY(entry_SYSENTER_compat)
xorl %r14d, %r14d /* nospec r14 */
pushq $0 /* pt_regs->r15 = 0 */
xorl %r15d, %r15d /* nospec r15 */
+
+ UNWIND_HINT_REGS
+
cld
/*
@@ -147,7 +153,7 @@ ENTRY(entry_SYSENTER_compat)
popfq
jmp .Lsysenter_flags_fixed
GLOBAL(__end_entry_SYSENTER_compat)
-ENDPROC(entry_SYSENTER_compat)
+END(entry_SYSENTER_compat)
/*
* 32-bit SYSCALL entry.
@@ -197,6 +203,7 @@ ENDPROC(entry_SYSENTER_compat)
* 0(%esp) arg6
*/
ENTRY(entry_SYSCALL_compat)
+ UNWIND_HINT_EMPTY
/* Interrupts are off on entry. */
swapgs
@@ -247,6 +254,8 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
pushq $0 /* pt_regs->r15 = 0 */
xorl %r15d, %r15d /* nospec r15 */
+ UNWIND_HINT_REGS
+
/*
* User mode is traced as though IRQs are on, and SYSENTER
* turned them off.
@@ -335,6 +344,7 @@ END(entry_SYSCALL_compat)
* ebp arg6
*/
ENTRY(entry_INT80_compat)
+ UNWIND_HINT_EMPTY
/*
* Interrupts are off on entry.
*/
@@ -396,6 +406,9 @@ ENTRY(entry_INT80_compat)
xorl %r14d, %r14d /* nospec r14 */
pushq %r15 /* pt_regs->r15 */
xorl %r15d, %r15d /* nospec r15 */
+
+ UNWIND_HINT_REGS
+
cld
/*
--
2.25.1
From: 沈子俊 <shenzijun(a)kylinos.cn>
The GCM and CCM modes of SM4 are defined in the RFC 8998 specification:
https://datatracker.ietf.org/doc/html/rfc8998
沈子俊 (3):
crypto: tcrypt - Fix missing return value check
crypto: testmgr - Add GCM/CCM mode test of SM4 algorithm
crypto: tcrypt - add GCM/CCM mode test for SM4 algorithm
crypto/tcrypt.c | 73 ++++++++++++++++++++---
crypto/testmgr.c | 29 ++++++++++
crypto/testmgr.h | 148 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 241 insertions(+), 9 deletions(-)
--
2.30.0
【Meeting Notice】openEuler kernel tech sharing session 14 & biweekly meeting Time: 2021-11-19 14:00-16:30
by Meeting Book 16 Nov '21
[PATCH openEuler-1.0-LTS] io_uring: fix ltout double free on completion race
by Yang Yingliang 15 Nov '21
From: Pavel Begunkov <asml.silence(a)gmail.com>
mainline inclusion
from mainline-v5.13-rc2
commit 447c19f3b5074409c794b350b10306e1da1ef4ba
category: bugfix
bugzilla: 185736
CVE: NA
-----------------------------------------------
Always remove the linked timeout from the master request's link list in
io_link_timeout_fn(); otherwise we may get a use-after-free when
io_link_timeout_fn() first puts the linked timeout on its failure path,
and the timeout is then found and put again when the master request is
freed.
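The ordering rule the fix enforces, sketched (a simplified view of the
hunk below):

    list_del_init(&req->link_list);          /* unlink unconditionally */
    if (refcount_inc_not_zero(&prev->refs))  /* only then pin prev */
            prev->flags &= ~REQ_F_LINK_TIMEOUT;
    else
            prev = NULL;                     /* prev already going away */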
Cc: stable(a)vger.kernel.org # 5.10+
Fixes: 90cd7e424969d ("io_uring: track link timeout's master explicitly")
Reported-and-tested-by: syzbot+5a864149dd970b546223(a)syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.16209901…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
fs/io_uring.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 226e9f123f654..da61eeaf64e88 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5925,8 +5925,8 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
if (!list_empty(&req->link_list)) {
prev = list_entry(req->link_list.prev, struct io_kiocb,
link_list);
+ list_del_init(&req->link_list);
if (refcount_inc_not_zero(&prev->refs)) {
- list_del_init(&req->link_list);
prev->flags &= ~REQ_F_LINK_TIMEOUT;
} else
prev = NULL;
--
2.25.1
backport psi feature and avoid kabi change
bugzilla: https://gitee.com/openeuler/kernel/issues/I47QS2
Changes since v2:
1. Remove the workingset_restore field. Adding WORKINGSET_RESTORE would
shift the memcg_stat_item values (see the sketch after this changelog):
enum memcg_stat_item {
MEMCG_CACHE = NR_VM_NODE_STAT_ITEMS,
MEMCG_RSS,
MEMCG_RSS_HUGE,
...
Thanks to Cheng jian.
2. Patch 22's and patch 23's commit messages add "from mainline-v5.10.78".
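A minimal illustration of the kABI concern from item 1 (generic C, not
the kernel enum; the EXAMPLE_* names are placeholders):

    enum example_stat {
            EXAMPLE_A = 100,   /* 100 */
            /* inserting EXAMPLE_NEW here would renumber EXAMPLE_B to
             * 102, breaking consumers that baked in the old value */
            EXAMPLE_B,         /* currently 101 */
    };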
Baruch Siach (1):
psi: fix reference to kernel commandline enable
Dan Schatzberg (1):
kernel/sched/psi.c: expose pressure metrics on root cgroup
Johannes Weiner (12):
mm: workingset: tell cache transitions from workingset thrashing
sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
sched: loadavg: make calc_load_n() public
sched: sched.h: make rq locking and clock functions available in
stats.h
sched: introduce this_rq_lock_irq()
psi: pressure stall information for CPU, memory, and IO
psi: cgroup support
psi: make disabling/enabling easier for vendor kernels
psi: fix aggregation idle shut-off
psi: avoid divide-by-zero crash inside virtual machines
fs: kernfs: add poll file operation
sched/psi: Fix sampling error and rare div0 crashes with cgroups and
high uptime
Josef Bacik (1):
blk-iolatency: use a percentile approache for ssd's
Liu Xinpeng (2):
psi:enable psi in config
psi:avoid kabi change
Miklos Szeredi (1):
fuse: ignore PG_workingset after stealing
Olof Johansson (1):
kernel/sched/psi.c: simplify cgroup_move_task()
Suren Baghdasaryan (6):
psi: introduce state_mask to represent stalled psi states
psi: make psi_enable static
psi: rename psi fields in preparation for psi trigger addition
psi: split update_stats into parts
psi: track changed states
include/: refactor headers to allow kthread.h inclusion in psi_types.h
Yafang Shao (1):
mm, memcg: add workingset_restore in memory.stat
Documentation/accounting/psi.txt | 73 +++
Documentation/admin-guide/cgroup-v2.rst | 22 +
Documentation/admin-guide/kernel-parameters.txt | 4 +
arch/arm64/configs/openeuler_defconfig | 2 +
arch/powerpc/platforms/cell/cpufreq_spudemand.c | 2 +-
arch/powerpc/platforms/cell/spufs/sched.c | 9 +-
arch/s390/appldata/appldata_os.c | 4 -
arch/x86/configs/openeuler_defconfig | 2 +
block/blk-iolatency.c | 183 +++++-
drivers/cpuidle/governors/menu.c | 4 -
drivers/spi/spi-rockchip.c | 1 +
fs/fuse/dev.c | 1 +
fs/kernfs/file.c | 31 +-
fs/proc/loadavg.c | 3 -
include/linux/cgroup-defs.h | 12 +
include/linux/cgroup.h | 17 +
include/linux/kernfs.h | 8 +
include/linux/kthread.h | 4 +
include/linux/page-flags.h | 5 +
include/linux/psi.h | 55 ++
include/linux/psi_types.h | 95 +++
include/linux/sched.h | 13 +
include/linux/sched/loadavg.h | 24 +-
include/linux/swap.h | 1 +
include/trace/events/mmflags.h | 1 +
init/Kconfig | 28 +
kernel/cgroup/cgroup.c | 55 +-
kernel/debug/kdb/kdb_main.c | 7 +-
kernel/fork.c | 4 +
kernel/kthread.c | 3 +
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 16 +-
kernel/sched/loadavg.c | 139 ++--
kernel/sched/psi.c | 823 ++++++++++++++++++++++++
kernel/sched/sched.h | 178 ++---
kernel/sched/stats.h | 86 +++
kernel/workqueue.c | 23 +
kernel/workqueue_internal.h | 6 +-
mm/compaction.c | 5 +
mm/filemap.c | 20 +-
mm/huge_memory.c | 1 +
mm/migrate.c | 2 +
mm/page_alloc.c | 9 +
mm/swap_state.c | 1 +
mm/vmscan.c | 10 +
mm/workingset.c | 113 +++-
46 files changed, 1827 insertions(+), 279 deletions(-)
create mode 100644 Documentation/accounting/psi.txt
create mode 100644 include/linux/psi.h
create mode 100644 include/linux/psi_types.h
create mode 100644 kernel/sched/psi.c
--
1.8.3.1
[PATCH openEuler-5.10 01/20] blk-mq: don't free tags if the tag_set is used by other device in queue initialztion
by Zheng Zengkai 15 Nov '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.16
commit a846a8e6c9a5949582c5a6a8bbc83a7d27fd891e
category: bugfix
bugzilla: 185668 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
-----------------------------------------------
We got a UAF report on v5.10 as follows:
[ 1446.674930] ==================================================================
[ 1446.675970] BUG: KASAN: use-after-free in blk_mq_get_driver_tag+0x9a4/0xa90
[ 1446.676902] Read of size 8 at addr ffff8880185afd10 by task kworker/1:2/12348
[ 1446.677851]
[ 1446.678073] CPU: 1 PID: 12348 Comm: kworker/1:2 Not tainted 5.10.0-10177-gc9c81b1e346a #2
[ 1446.679168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 1446.680692] Workqueue: kthrotld blk_throtl_dispatch_work_fn
[ 1446.681448] Call Trace:
[ 1446.681800] dump_stack+0x9b/0xce
[ 1446.682916] print_address_description.constprop.6+0x3e/0x60
[ 1446.685999] kasan_report.cold.9+0x22/0x3a
[ 1446.687186] blk_mq_get_driver_tag+0x9a4/0xa90
[ 1446.687785] blk_mq_dispatch_rq_list+0x21a/0x1d40
[ 1446.692576] __blk_mq_do_dispatch_sched+0x394/0x830
[ 1446.695758] __blk_mq_sched_dispatch_requests+0x398/0x4f0
[ 1446.698279] blk_mq_sched_dispatch_requests+0xdf/0x140
[ 1446.698967] __blk_mq_run_hw_queue+0xc0/0x270
[ 1446.699561] __blk_mq_delay_run_hw_queue+0x4cc/0x550
[ 1446.701407] blk_mq_run_hw_queue+0x13b/0x2b0
[ 1446.702593] blk_mq_sched_insert_requests+0x1de/0x390
[ 1446.703309] blk_mq_flush_plug_list+0x4b4/0x760
[ 1446.705408] blk_flush_plug_list+0x2c5/0x480
[ 1446.708471] blk_finish_plug+0x55/0xa0
[ 1446.708980] blk_throtl_dispatch_work_fn+0x23b/0x2e0
[ 1446.711236] process_one_work+0x6d4/0xfe0
[ 1446.711778] worker_thread+0x91/0xc80
[ 1446.713400] kthread+0x32d/0x3f0
[ 1446.714362] ret_from_fork+0x1f/0x30
[ 1446.714846]
[ 1446.715062] Allocated by task 1:
[ 1446.715509] kasan_save_stack+0x19/0x40
[ 1446.716026] __kasan_kmalloc.constprop.1+0xc1/0xd0
[ 1446.716673] blk_mq_init_tags+0x6d/0x330
[ 1446.717207] blk_mq_alloc_rq_map+0x50/0x1c0
[ 1446.717769] __blk_mq_alloc_map_and_request+0xe5/0x320
[ 1446.718459] blk_mq_alloc_tag_set+0x679/0xdc0
[ 1446.719050] scsi_add_host_with_dma.cold.3+0xa0/0x5db
[ 1446.719736] virtscsi_probe+0x7bf/0xbd0
[ 1446.720265] virtio_dev_probe+0x402/0x6c0
[ 1446.720808] really_probe+0x276/0xde0
[ 1446.721320] driver_probe_device+0x267/0x3d0
[ 1446.721892] device_driver_attach+0xfe/0x140
[ 1446.722491] __driver_attach+0x13a/0x2c0
[ 1446.723037] bus_for_each_dev+0x146/0x1c0
[ 1446.723603] bus_add_driver+0x3fc/0x680
[ 1446.724145] driver_register+0x1c0/0x400
[ 1446.724693] init+0xa2/0xe8
[ 1446.725091] do_one_initcall+0x9e/0x310
[ 1446.725626] kernel_init_freeable+0xc56/0xcb9
[ 1446.726231] kernel_init+0x11/0x198
[ 1446.726714] ret_from_fork+0x1f/0x30
[ 1446.727212]
[ 1446.727433] Freed by task 26992:
[ 1446.727882] kasan_save_stack+0x19/0x40
[ 1446.728420] kasan_set_track+0x1c/0x30
[ 1446.728943] kasan_set_free_info+0x1b/0x30
[ 1446.729517] __kasan_slab_free+0x111/0x160
[ 1446.730084] kfree+0xb8/0x520
[ 1446.730507] blk_mq_free_map_and_requests+0x10b/0x1b0
[ 1446.731206] blk_mq_realloc_hw_ctxs+0x8cb/0x15b0
[ 1446.731844] blk_mq_init_allocated_queue+0x374/0x1380
[ 1446.732540] blk_mq_init_queue_data+0x7f/0xd0
[ 1446.733155] scsi_mq_alloc_queue+0x45/0x170
[ 1446.733730] scsi_alloc_sdev+0x73c/0xb20
[ 1446.734281] scsi_probe_and_add_lun+0x9a6/0x2d90
[ 1446.734916] __scsi_scan_target+0x208/0xc50
[ 1446.735500] scsi_scan_channel.part.3+0x113/0x170
[ 1446.736149] scsi_scan_host_selected+0x25a/0x360
[ 1446.736783] store_scan+0x290/0x2d0
[ 1446.737275] dev_attr_store+0x55/0x80
[ 1446.737782] sysfs_kf_write+0x132/0x190
[ 1446.738313] kernfs_fop_write_iter+0x319/0x4b0
[ 1446.738921] new_sync_write+0x40e/0x5c0
[ 1446.739429] vfs_write+0x519/0x720
[ 1446.739877] ksys_write+0xf8/0x1f0
[ 1446.740332] do_syscall_64+0x2d/0x40
[ 1446.740802] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1446.741462]
[ 1446.741670] The buggy address belongs to the object at ffff8880185afd00
[ 1446.741670] which belongs to the cache kmalloc-256 of size 256
[ 1446.743276] The buggy address is located 16 bytes inside of
[ 1446.743276] 256-byte region [ffff8880185afd00, ffff8880185afe00)
[ 1446.744765] The buggy address belongs to the page:
[ 1446.745416] page:ffffea0000616b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x185ac
[ 1446.746694] head:ffffea0000616b00 order:2 compound_mapcount:0 compound_pincount:0
[ 1446.747719] flags: 0x1fffff80010200(slab|head)
[ 1446.748337] raw: 001fffff80010200 ffffea00006a3208 ffffea000061bf08 ffff88801004f240
[ 1446.749404] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
[ 1446.750455] page dumped because: kasan: bad access detected
[ 1446.751227]
[ 1446.751445] Memory state around the buggy address:
[ 1446.752102] ffff8880185afc00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1446.753090] ffff8880185afc80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1446.754079] >ffff8880185afd00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1446.755065] ^
[ 1446.755589] ffff8880185afd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1446.756574] ffff8880185afe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1446.757566] ==================================================================
The 'BLK_MQ_F_TAG_QUEUE_SHARED' flag will be set once a second device on
the same host initializes its queue successfully. However, if the second
device fails to allocate memory in blk_mq_alloc_and_init_hctx() from
blk_mq_realloc_hw_ctxs() from blk_mq_init_allocated_queue(),
__blk_mq_free_map_and_rqs() will be called on the error path, and if
'BLK_MQ_TAG_HCTX_SHARED' is not set, 'tag_set->tags' will be freed
while it is still used by the first device.
To fix this issue, move the release of newly allocated hardware contexts
from blk_mq_realloc_hw_ctxs() to __blk_mq_update_nr_hw_queues(), since
there is no need to release hardware contexts in
blk_mq_init_allocated_queue().
Fixes: 868f2f0b7206 ("blk-mq: dynamic h/w context count")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Ming Lei <ming.lei(a)redhat.com>
Link: https://lore.kernel.org/r/20211108074019.1058843-1-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
block/blk-mq.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
block/blk-mq.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index bc7a04cc2acf..fac25524e99e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3237,8 +3237,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
struct blk_mq_hw_ctx *hctx = hctxs[j];
if (hctx) {
- if (hctx->tags)
- blk_mq_free_map_and_requests(set, j);
blk_mq_exit_hctx(q, set, hctx, j);
hctxs[j] = NULL;
}
@@ -3724,8 +3722,13 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
list_for_each_entry(q, &set->tag_list, tag_set_list) {
blk_mq_realloc_hw_ctxs(set, q);
if (q->nr_hw_queues != set->nr_hw_queues) {
+ int i = prev_nr_hw_queues;
+
pr_warn("Increasing nr_hw_queues to %d fails, fallback to %d\n",
nr_hw_queues, prev_nr_hw_queues);
+ for (; i < set->nr_hw_queues; i++)
+ blk_mq_free_map_and_requests(set, i);
+
set->nr_hw_queues = prev_nr_hw_queues;
blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
goto fallback;
--
2.20.1
[PATCH openEuler-5.10 01/13] scsi: core: Put LLD module refcnt after SCSI device is released
by Zheng Zengkai 15 Nov '21
From: Ming Lei <ming.lei(a)redhat.com>
stable inclusion
from stable-5.10.78
commit 7b57c38d12aed1b5d92f74748bed25e0d041729f
bugzilla: 185700 https://gitee.com/openeuler/kernel/issues/I4IAU2
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit f2b85040acec9a928b4eb1b57a989324e8e38d3f upstream.
SCSI host release is triggered when the SCSI device is freed. We have to
make sure that the low-level device driver module won't be unloaded
before the SCSI host instance is released, because shost->hostt is
required in the release handler.
Make sure to put the LLD module refcnt only after the SCSI device is
released.
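The broken ordering, sketched (the pre-fix shape of scsi_device_put(),
simplified):

    module_put(sdev->host->hostt->module);  /* LLD module may unload now */
    put_device(&sdev->sdev_gendev);         /* final put runs the release
                                             * handler, which still reads
                                             * shost->hostt: use-after-free */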
Fixes a kernel panic of 'BUG: unable to handle page fault for address'
reported by Changhui and Yi.
Link: https://lore.kernel.org/r/20211008050118.1440686-1-ming.lei@redhat.com
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Reported-by: Changhui Zhong <czhong(a)redhat.com>
Reported-by: Yi Zhang <yi.zhang(a)redhat.com>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/scsi/scsi.c | 4 +++-
drivers/scsi/scsi_sysfs.c | 9 +++++++++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 24619c3bebd5..6ad834d61d4c 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -545,8 +545,10 @@ EXPORT_SYMBOL(scsi_device_get);
*/
void scsi_device_put(struct scsi_device *sdev)
{
- module_put(sdev->host->hostt->module);
+ struct module *mod = sdev->host->hostt->module;
+
put_device(&sdev->sdev_gendev);
+ module_put(mod);
}
EXPORT_SYMBOL(scsi_device_put);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 8173b67ec7b0..1378bb1a7371 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -450,9 +450,12 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
unsigned long flags;
+ struct module *mod;
sdev = container_of(work, struct scsi_device, ew.work);
+ mod = sdev->host->hostt->module;
+
scsi_dh_release_device(sdev);
parent = sdev->sdev_gendev.parent;
@@ -501,11 +504,17 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
if (parent)
put_device(parent);
+ module_put(mod);
}
static void scsi_device_dev_release(struct device *dev)
{
struct scsi_device *sdp = to_scsi_device(dev);
+
+ /* Set module pointer as NULL in case of module unloading */
+ if (!try_module_get(sdp->host->hostt->module))
+ sdp->host->hostt->module = NULL;
+
execute_in_process_context(scsi_device_dev_release_usercontext,
&sdp->ew);
}
--
2.20.1
[PATCH openEuler-5.10 01/20] blk-cgroup: synchronize blkg creation against policy deactivation
by Zheng Zengkai 15 Nov '21
From: Yu Kuai <yukuai3(a)huawei.com>
mainline inclusion
from mainline
commit 0c9d338c8443b06da8e8d3bfce824c5ea6d3488f
category: bugfix
bugzilla: 182378 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
Our test reports a null pointer dereference:
[ 168.534653] ==================================================================
[ 168.535614] Disabling lock debugging due to kernel taint
[ 168.536346] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 168.537274] #PF: supervisor read access in kernel mode
[ 168.537964] #PF: error_code(0x0000) - not-present page
[ 168.538667] PGD 0 P4D 0
[ 168.539025] Oops: 0000 [#1] PREEMPT SMP KASAN
[ 168.539656] CPU: 13 PID: 759 Comm: bash Tainted: G B 5.15.0-rc2-next-202100
[ 168.540954] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_0738364
[ 168.542736] RIP: 0010:bfq_pd_init+0x88/0x1e0
[ 168.543318] Code: 98 00 00 00 e8 c9 e4 5b ff 4c 8b 65 00 49 8d 7c 24 08 e8 bb e4 5b ff 4d0
[ 168.545803] RSP: 0018:ffff88817095f9c0 EFLAGS: 00010002
[ 168.546497] RAX: 0000000000000001 RBX: ffff888101a1c000 RCX: 0000000000000000
[ 168.547438] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff888106553428
[ 168.548402] RBP: ffff888106553400 R08: ffffffff961bcaf4 R09: 0000000000000001
[ 168.549365] R10: ffffffffa2e16c27 R11: fffffbfff45c2d84 R12: 0000000000000000
[ 168.550291] R13: ffff888101a1c098 R14: ffff88810c7a08c8 R15: ffffffffa55541a0
[ 168.551221] FS: 00007fac75227700(0000) GS:ffff88839ba80000(0000) knlGS:0000000000000000
[ 168.552278] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 168.553040] CR2: 0000000000000008 CR3: 0000000165ce7000 CR4: 00000000000006e0
[ 168.554000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 168.554929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 168.555888] Call Trace:
[ 168.556221] <TASK>
[ 168.556510] blkg_create+0x1c0/0x8c0
[ 168.556989] blkg_conf_prep+0x574/0x650
[ 168.557502] ? stack_trace_save+0x99/0xd0
[ 168.558033] ? blkcg_conf_open_bdev+0x1b0/0x1b0
[ 168.558629] tg_set_conf.constprop.0+0xb9/0x280
[ 168.559231] ? kasan_set_track+0x29/0x40
[ 168.559758] ? kasan_set_free_info+0x30/0x60
[ 168.560344] ? tg_set_limit+0xae0/0xae0
[ 168.560853] ? do_sys_openat2+0x33b/0x640
[ 168.561383] ? do_sys_open+0xa2/0x100
[ 168.561877] ? __x64_sys_open+0x4e/0x60
[ 168.562383] ? __kasan_check_write+0x20/0x30
[ 168.562951] ? copyin+0x48/0x70
[ 168.563390] ? _copy_from_iter+0x234/0x9e0
[ 168.563948] tg_set_conf_u64+0x17/0x20
[ 168.564467] cgroup_file_write+0x1ad/0x380
[ 168.565014] ? cgroup_file_poll+0x80/0x80
[ 168.565568] ? __mutex_lock_slowpath+0x30/0x30
[ 168.566165] ? pgd_free+0x100/0x160
[ 168.566649] kernfs_fop_write_iter+0x21d/0x340
[ 168.567246] ? cgroup_file_poll+0x80/0x80
[ 168.567796] new_sync_write+0x29f/0x3c0
[ 168.568314] ? new_sync_read+0x410/0x410
[ 168.568840] ? __handle_mm_fault+0x1c97/0x2d80
[ 168.569425] ? copy_page_range+0x2b10/0x2b10
[ 168.570007] ? _raw_read_lock_bh+0xa0/0xa0
[ 168.570622] vfs_write+0x46e/0x630
[ 168.571091] ksys_write+0xcd/0x1e0
[ 168.571563] ? __x64_sys_read+0x60/0x60
[ 168.572081] ? __kasan_check_write+0x20/0x30
[ 168.572659] ? do_user_addr_fault+0x446/0xff0
[ 168.573264] __x64_sys_write+0x46/0x60
[ 168.573774] do_syscall_64+0x35/0x80
[ 168.574264] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 168.574960] RIP: 0033:0x7fac74915130
[ 168.575456] Code: 73 01 c3 48 8b 0d 58 ed 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 444
[ 168.577969] RSP: 002b:00007ffc3080e288 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 168.578986] RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007fac74915130
[ 168.579937] RDX: 0000000000000009 RSI: 000056007669f080 RDI: 0000000000000001
[ 168.580884] RBP: 000056007669f080 R08: 000000000000000a R09: 00007fac75227700
[ 168.581841] R10: 000056007655c8f0 R11: 0000000000000246 R12: 0000000000000009
[ 168.582796] R13: 0000000000000001 R14: 00007fac74be55e0 R15: 00007fac74be08c0
[ 168.583757] </TASK>
[ 168.584063] Modules linked in:
[ 168.584494] CR2: 0000000000000008
[ 168.584964] ---[ end trace 2475611ad0f77a1a ]---
This is because blkg_alloc() is called from blkg_conf_prep() without
holding 'q->queue_lock', and elevator is exited before blkg_create():
thread 1                                thread 2
blkg_conf_prep
 spin_lock_irq(&q->queue_lock);
 blkg_lookup_check -> return NULL
 spin_unlock_irq(&q->queue_lock);
 blkg_alloc
  blkcg_policy_enabled -> true
  pd = ->pd_alloc_fn
  blkg->pd[i] = pd
                                        blk_mq_exit_sched
                                         bfq_exit_queue
                                          blkcg_deactivate_policy
                                           spin_lock_irq(&q->queue_lock);
                                           __clear_bit(pol->plid, q->blkcg_pols);
                                           spin_unlock_irq(&q->queue_lock);
                                        q->elevator = NULL;
 spin_lock_irq(&q->queue_lock);
 blkg_create
  if (blkg->pd[i])
   ->pd_init_fn -> q->elevator is NULL
 spin_unlock_irq(&q->queue_lock);
Because blkcg_deactivate_policy() requires the queue to be frozen, we can
grab q_usage_counter to synchronize blkg_conf_prep() against
blkcg_deactivate_policy().
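A minimal sketch of the resulting guard in blkg_conf_prep(), with the
lookup/creation details elided (the _sketch suffix marks this as an
illustration, not the exact function body):
static int blkg_conf_prep_sketch(struct request_queue *q)
{
	int ret;

	/* Pin q_usage_counter: blkcg_deactivate_policy() freezes the
	 * queue, so it must now wait until we call blk_queue_exit(). */
	ret = blk_queue_enter(q, 0);
	if (ret)
		return ret;

	spin_lock_irq(&q->queue_lock);
	/* ... blkg lookup/creation; ->pd_init_fn() can no longer race
	 * with the elevator being torn down ... */
	spin_unlock_irq(&q->queue_lock);

	blk_queue_exit(q);	/* must be dropped on every exit path */
	return 0;
}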
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Link: https://lore.kernel.org/r/20211020014036.2141723-1-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Conflict: block/blk-cgroup.c
- commit ed6cddefdfd3 ("block: convert the rest of block to
bdev_get_queue") is not backported.
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
block/blk-cgroup.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 5b19665bc486..37a5dbd2c4e4 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -620,6 +620,14 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
q = disk->queue;
+ /*
+ * blkcg_deactivate_policy() requires queue to be frozen, we can grab
+ * q_usage_counter to prevent concurrent with blkcg_deactivate_policy().
+ */
+ ret = blk_queue_enter(q, 0);
+ if (ret)
+ return ret;
+
rcu_read_lock();
spin_lock_irq(&q->queue_lock);
@@ -689,6 +697,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
goto success;
}
success:
+ blk_queue_exit(q);
ctx->disk = disk;
ctx->blkg = blkg;
ctx->body = input;
@@ -701,6 +710,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
rcu_read_unlock();
fail:
put_disk_and_module(disk);
+ blk_queue_exit(q);
/*
* If queue was bypassing, we should retry. Do so after a
* short msleep(). It isn't strictly necessary but queue
--
2.20.1
[PATCH openEuler-5.10 01/66] ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
by Zheng Zengkai 15 Nov '21
From: Lexi Shao <shaolexi(a)huawei.com>
stable inclusion
from stable-5.10.77
commit 3ceaa85c331d30752af3b88d280cc1bcaee2eb27
bugzilla: 185677 https://gitee.com/openeuler/kernel/issues/I4IAP7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit df909df0770779f1a5560c2bb641a2809655ef28 upstream.
ARM: kasan: Fix __get_user_check failure with kasan
In the macro __get_user_check defined in arch/arm/include/asm/uaccess.h,
the error code is stored in register int __e (r0). When kasan is
enabled, assigning a value to a kernel address might trigger a kasan check,
which unexpectedly overwrites r0 and causes undefined behavior on arm
kasan images.
One example is a failure in do_futex that results in a process soft lockup.
Log:
watchdog: BUG: soft lockup - CPU#0 stuck for 62946ms! [rs:main
Q:Reg:1151]
...
(__asan_store4) from (futex_wait_setup+0xf8/0x2b4)
(futex_wait_setup) from (futex_wait+0x138/0x394)
(futex_wait) from (do_futex+0x164/0xe40)
(do_futex) from (sys_futex_time32+0x178/0x230)
(sys_futex_time32) from (ret_fast_syscall+0x0/0x50)
The soft lockup happens in the function futex_wait_setup. The reason is
that the function get_futex_value_locked always returns EINVAL, so the pc
jumps back to the retry label, causing an endless loop.
This line in the function get_futex_value_locked
ret = __get_user(*dest, from);
is expanded to
*dest = (typeof(*(p))) __r2;
in the macro __get_user_check. Writing to the pointer dest triggers the
kasan check and overwrites the return value of the __get_user_x function.
The assembly code of get_futex_value_locked in kernel/futex.c:
...
c01f6dc8: eb0b020e bl c04b7608 <__get_user_4>
// "x = (typeof(*(p))) __r2;" triggers kasan check and r0 is overwritten
c01f6dcc: e1a00007 mov r0, r7
c01f6dd0: e1a05002 mov r5, r2
c01f6dd4: eb04f1e6 bl c0333574 <__asan_store4>
c01f6dd8: e5875000 str r5, [r7]
// save ret value of __get_user(*dest, from), which is dest address now
c01f6ddc: e1a05000 mov r5, r0
...
// checking return value of __get_user failed
c01f6e00: e3550000 cmp r5, #0
...
c01f6e0c: 01a00005 moveq r0, r5
// assign return value to EINVAL
c01f6e10: 13e0000d mvnne r0, #13
The return value is the destination address of get_user, and thus certainly
non-zero, so get_futex_value_locked always returns EINVAL.
Fix it by using a tmp variable to store the error code before the
assignment. This fix has no effect on non-kasan images thanks to compiler
optimization; it only affects cases that overwrite r0 due to the kasan check.
This should fix the bug discussed in Link:
[1] https://lore.kernel.org/linux-arm-kernel/0ef7c2a5-5d8b-c5e0-63fa-31693fd449…
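A heavily condensed sketch of the fixed macro shape (the real
__get_user_check also declares the other register bindings and a
size-dispatch switch, both elided here):
/* __e lives in r0; __r2 holds the fetched value. The size-dispatch
 * switch that calls __get_user_1/2/4/... is elided. */
#define __get_user_check_sketch(x, p)					\
({									\
	register int __e asm("r0");					\
	int __tmp_e;							\
	/* ... call __get_user_N(), which returns its error in r0 ... */\
	__tmp_e = __e;		/* snapshot r0 while it is still valid */\
	x = (typeof(*(p))) __r2; /* may emit __asan_storeN, clobbering r0 */\
	__tmp_e;		/* evaluate to the saved error code */	\
})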
Fixes: 421015713b30 ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: Lexi Shao <shaolexi(a)huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
arch/arm/include/asm/uaccess.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 476d1a15e669..da2a9e5fc59b 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -200,6 +200,7 @@ extern int __get_user_64t_4(void *);
register unsigned long __l asm("r1") = __limit; \
register int __e asm("r0"); \
unsigned int __ua_flags = uaccess_save_and_enable(); \
+ int __tmp_e; \
switch (sizeof(*(__p))) { \
case 1: \
if (sizeof((x)) >= 8) \
@@ -227,9 +228,10 @@ extern int __get_user_64t_4(void *);
break; \
default: __e = __get_user_bad(); break; \
} \
+ __tmp_e = __e; \
uaccess_restore(__ua_flags); \
x = (typeof(*(p))) __r2; \
- __e; \
+ __tmp_e; \
})
#define get_user(x, p) \
--
2.20.1
[PATCH openEuler-5.10 01/14] block, bfq: fix UAF problem in bfqg_stats_init()
by Zheng Zengkai 15 Nov '21
From: Zheng Liang <zhengliang6(a)huawei.com>
mainline inclusion
from mainline
commit 2fc428f6b7ca80794cb9928c90d4de524366659f
category: bugfix
bugzilla: 185657 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
-------------------------------------------------
In bfq_pd_alloc(), the function bfqg_stats_init() initializes bfqg's stats.
If blkg_rwstat_init() succeeds in initializing bfqg_stats->bytes but then
fails to initialize bfqg_stats->ios, bfqg_stats_init() returns an error and
bfqg is freed. However, blkg_rwstat->cpu_cnt is not deleted from the list
of percpu_counters, so a later traversal of that list hits a use-after-free.
We should use blkg_rwstat_exit() to clean up bfqg_stats->bytes in the
above scenario.
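The resulting shape of the function, sketched with the
CONFIG_BFQ_CGROUP_DEBUG counters collapsed into a comment:
static int bfqg_stats_init_sketch(struct bfqg_stats *stats, gfp_t gfp)
{
	if (blkg_rwstat_init(&stats->bytes, gfp) ||
	    blkg_rwstat_init(&stats->ios, gfp))
		goto error;

	/* ... CONFIG_BFQ_CGROUP_DEBUG counters, same goto-error pattern ... */

	return 0;

error:
	/* bfqg_stats_exit() -> blkg_rwstat_exit() also removes cpu_cnt
	 * from the percpu_counters list, closing the UAF window. */
	bfqg_stats_exit(stats);
	return -ENOMEM;
}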
Fixes: commit fd41e60331b ("bfq-iosched: stop using blkg->stat_bytes and ->stat_ios")
Signed-off-by: Zheng Liang <zhengliang6(a)huawei.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Link: https://lore.kernel.org/r/20211018024225.1493938-1-zhengliang6@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
block/bfq-cgroup.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index b791e2041e49..a6bcc779c912 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -463,7 +463,7 @@ static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp)
{
if (blkg_rwstat_init(&stats->bytes, gfp) ||
blkg_rwstat_init(&stats->ios, gfp))
- return -ENOMEM;
+ goto error;
#ifdef CONFIG_BFQ_CGROUP_DEBUG
if (blkg_rwstat_init(&stats->merged, gfp) ||
@@ -476,13 +476,15 @@ static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp)
bfq_stat_init(&stats->dequeue, gfp) ||
bfq_stat_init(&stats->group_wait_time, gfp) ||
bfq_stat_init(&stats->idle_time, gfp) ||
- bfq_stat_init(&stats->empty_time, gfp)) {
- bfqg_stats_exit(stats);
- return -ENOMEM;
- }
+ bfq_stat_init(&stats->empty_time, gfp))
+ goto error;
#endif
return 0;
+
+error:
+ bfqg_stats_exit(stats);
+ return -ENOMEM;
}
static struct bfq_group_data *cpd_to_bfqgd(struct blkcg_policy_data *cpd)
--
2.20.1
[openEuler-5.10 1/4] ARM: 9077/1: PLT: Move struct plt_entries definition to header
by Zheng Zengkai 15 Nov '21
From: Alex Sverdlin <alexander.sverdlin(a)nokia.com>
stable inclusion
from stable-5.10.69
commit ce90c6706d5a95ddda8d3cea01768bd0b4445851
bugzilla: 182675 https://gitee.com/openeuler/kernel/issues/I4I3ED
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
-------------------------------------------------
commit 4e271701c17dee70c6e1351c4d7d42e70405c6a9 upstream.
No functional change; later it will be re-used in several files.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Conflicts:
arch/arm/include/asm/module.h
Signed-off-by: Li Huafei <lihuafei1(a)huawei.com>
Reviewed-by: Yang Jihong <yangjihong1(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
arch/arm/include/asm/module.h | 9 +++++++++
arch/arm/kernel/module-plts.c | 9 ---------
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/module.h b/arch/arm/include/asm/module.h
index ac3df84b935c..9b17d2f5ab53 100644
--- a/arch/arm/include/asm/module.h
+++ b/arch/arm/include/asm/module.h
@@ -19,6 +19,15 @@ enum {
};
#endif
+#define PLT_ENT_STRIDE L1_CACHE_BYTES
+#define PLT_ENT_COUNT (PLT_ENT_STRIDE / sizeof(u32))
+#define PLT_ENT_SIZE (sizeof(struct plt_entries) / PLT_ENT_COUNT)
+
+struct plt_entries {
+ u32 ldr[PLT_ENT_COUNT];
+ u32 lit[PLT_ENT_COUNT];
+};
+
struct mod_plt_sec {
int plt_shndx;
int plt_count;
diff --git a/arch/arm/kernel/module-plts.c b/arch/arm/kernel/module-plts.c
index 1dbdf2726505..f019f389337b 100644
--- a/arch/arm/kernel/module-plts.c
+++ b/arch/arm/kernel/module-plts.c
@@ -12,10 +12,6 @@
#include <asm/cache.h>
#include <asm/opcodes.h>
-#define PLT_ENT_STRIDE L1_CACHE_BYTES
-#define PLT_ENT_COUNT (PLT_ENT_STRIDE / sizeof(u32))
-#define PLT_ENT_SIZE (sizeof(struct plt_entries) / PLT_ENT_COUNT)
-
#ifdef CONFIG_THUMB2_KERNEL
#define PLT_ENT_LDR __opcode_to_mem_thumb32(0xf8dff000 | \
(PLT_ENT_STRIDE - 4))
@@ -24,11 +20,6 @@
(PLT_ENT_STRIDE - 8))
#endif
-struct plt_entries {
- u32 ldr[PLT_ENT_COUNT];
- u32 lit[PLT_ENT_COUNT];
-};
-
static bool in_init(const struct module *mod, unsigned long loc)
{
return loc - (u32)mod->init_layout.base < mod->init_layout.size;
--
2.20.1
[PATCH openEuler-5.10 01/16] ext4: avoid recheck extent for EXT4_EX_FORCE_CACHE
by Zheng Zengkai 15 Nov '21
From: yangerkun <yangerkun(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 182952 https://gitee.com/openeuler/kernel/issues/I4DDEL
---------------------------
A buffer marked verified has already been checked before, so there is no
need to verify it and call set_buffer_verified() again.
Signed-off-by: yangerkun <yangerkun(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
fs/ext4/extents.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 618675a41efb..a77e25ca6867 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -544,13 +544,16 @@ __read_extent_tree_block(const char *function, unsigned int line,
if (err < 0)
goto errout;
}
- if (buffer_verified(bh) && !(flags & EXT4_EX_FORCE_CACHE))
- return bh;
- err = __ext4_ext_check(function, line, inode, ext_block_hdr(bh),
- depth, pblk, le32_to_cpu(idx->ei_block));
- if (err)
- goto errout;
- set_buffer_verified(bh);
+ if (buffer_verified(bh)) {
+ if (!(flags & EXT4_EX_FORCE_CACHE))
+ return bh;
+ } else {
+ err = __ext4_ext_check(function, line, inode, ext_block_hdr(bh),
+ depth, pblk, le32_to_cpu(idx->ei_block));
+ if (err)
+ goto errout;
+ set_buffer_verified(bh);
+ }
/*
* If this is a leaf block, cache all of its entries
*/
--
2.20.1
[PATCH openEuler-5.10 01/92] parisc: math-emu: Fix fall-through warnings
by Zheng Zengkai 15 Nov '21
From: Helge Deller <deller(a)gmx.de>
stable inclusion
from stable-5.10.76
commit b3b7f831a49b56c258ab8ef08d04045fcfb9b4a7
bugzilla: 182988 https://gitee.com/openeuler/kernel/issues/I4IAHF
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 6f1fce595b78b775d7fb585c15c2dc3a6994f96e upstream.
Fix lots of fallthrough warnings, e.g.:
arch/parisc/math-emu/fpudispatch.c:323:33: warning: this statement may fall through [-Wimplicit-fallthrough=]
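The fix makes every intended fall-through explicit with the fallthrough
pseudo-keyword and marks unreachable switch exits with BUG(); a generic
sketch of the pattern (not the parisc code itself, names illustrative):
/* Generic sketch of the pattern applied throughout fpudispatch.c. */
switch (fmt) {
case 2:			/* quad: copy the two extra words first */
	regs[t + 3] = regs[r1 + 3];
	regs[t + 2] = regs[r1 + 2];
	fallthrough;	/* deliberate: quad also needs the double part */
case 1:			/* double */
	regs[t + 1] = regs[r1 + 1];
	fallthrough;	/* deliberate: double also needs the single word */
case 0:			/* single */
	regs[t] = regs[r1];
	return NOEXCEPTION;
}
BUG();			/* every case above returns; reaching here is a bug */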
Signed-off-by: Helge Deller <deller(a)gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
arch/parisc/math-emu/fpudispatch.c | 56 ++++++++++++++++++++++++++++--
1 file changed, 53 insertions(+), 3 deletions(-)
diff --git a/arch/parisc/math-emu/fpudispatch.c b/arch/parisc/math-emu/fpudispatch.c
index 7c46969ead9b..01ed133227c2 100644
--- a/arch/parisc/math-emu/fpudispatch.c
+++ b/arch/parisc/math-emu/fpudispatch.c
@@ -310,12 +310,15 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
r1 &= ~3;
fpregs[t+3] = fpregs[r1+3];
fpregs[t+2] = fpregs[r1+2];
+ fallthrough;
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
fpregs[t] = fpregs[r1];
return(NOEXCEPTION);
}
+ BUG();
case 3: /* FABS */
switch (fmt) {
case 2: /* illegal */
@@ -325,13 +328,16 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
r1 &= ~3;
fpregs[t+3] = fpregs[r1+3];
fpregs[t+2] = fpregs[r1+2];
+ fallthrough;
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
/* copy and clear sign bit */
fpregs[t] = fpregs[r1] & 0x7fffffff;
return(NOEXCEPTION);
}
+ BUG();
case 6: /* FNEG */
switch (fmt) {
case 2: /* illegal */
@@ -341,13 +347,16 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
r1 &= ~3;
fpregs[t+3] = fpregs[r1+3];
fpregs[t+2] = fpregs[r1+2];
+ fallthrough;
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
/* copy and invert sign bit */
fpregs[t] = fpregs[r1] ^ 0x80000000;
return(NOEXCEPTION);
}
+ BUG();
case 7: /* FNEGABS */
switch (fmt) {
case 2: /* illegal */
@@ -357,13 +366,16 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
r1 &= ~3;
fpregs[t+3] = fpregs[r1+3];
fpregs[t+2] = fpregs[r1+2];
+ fallthrough;
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
/* copy and set sign bit */
fpregs[t] = fpregs[r1] | 0x80000000;
return(NOEXCEPTION);
}
+ BUG();
case 4: /* FSQRT */
switch (fmt) {
case 0:
@@ -376,6 +388,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 5: /* FRND */
switch (fmt) {
case 0:
@@ -389,7 +402,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(MAJOR_0C_EXCP);
}
} /* end of switch (subop) */
-
+ BUG();
case 1: /* class 1 */
df = extru(ir,fpdfpos,2); /* get dest format */
if ((df & 2) || (fmt & 2)) {
@@ -419,6 +432,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* dbl/dbl */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 1: /* FCNVXF */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -434,6 +448,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(dbl_to_dbl_fcnvxf(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 2: /* FCNVFX */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -449,6 +464,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(dbl_to_dbl_fcnvfx(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 3: /* FCNVFXT */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -464,6 +480,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(dbl_to_dbl_fcnvfxt(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 5: /* FCNVUF (PA2.0 only) */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -479,6 +496,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(dbl_to_dbl_fcnvuf(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 6: /* FCNVFU (PA2.0 only) */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -494,6 +512,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(dbl_to_dbl_fcnvfu(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 7: /* FCNVFUT (PA2.0 only) */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -509,10 +528,11 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
return(dbl_to_dbl_fcnvfut(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 4: /* undefined */
return(MAJOR_0C_EXCP);
} /* end of switch subop */
-
+ BUG();
case 2: /* class 2 */
fpu_type_flags=fpregs[FPU_TYPE_FLAG_POS];
r2 = extru(ir, fpr2pos, 5) * sizeof(double)/sizeof(u_int);
@@ -590,6 +610,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 1: /* FTEST */
switch (fmt) {
case 0:
@@ -609,8 +630,10 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3:
return(MAJOR_0C_EXCP);
}
+ BUG();
} /* end of switch subop */
} /* end of else for PA1.0 & PA1.1 */
+ BUG();
case 3: /* class 3 */
r2 = extru(ir,fpr2pos,5) * sizeof(double)/sizeof(u_int);
if (r2 == 0)
@@ -633,6 +656,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 1: /* FSUB */
switch (fmt) {
case 0:
@@ -645,6 +669,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 2: /* FMPY */
switch (fmt) {
case 0:
@@ -657,6 +682,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 3: /* FDIV */
switch (fmt) {
case 0:
@@ -669,6 +695,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
case 4: /* FREM */
switch (fmt) {
case 0:
@@ -681,6 +708,7 @@ decode_0c(u_int ir, u_int class, u_int subop, u_int fpregs[])
case 3: /* quad not implemented */
return(MAJOR_0C_EXCP);
}
+ BUG();
} /* end of class 3 switch */
} /* end of switch(class) */
@@ -736,10 +764,12 @@ u_int fpregs[];
return(MAJOR_0E_EXCP);
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
fpregs[t] = fpregs[r1];
return(NOEXCEPTION);
}
+ BUG();
case 3: /* FABS */
switch (fmt) {
case 2:
@@ -747,10 +777,12 @@ u_int fpregs[];
return(MAJOR_0E_EXCP);
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
fpregs[t] = fpregs[r1] & 0x7fffffff;
return(NOEXCEPTION);
}
+ BUG();
case 6: /* FNEG */
switch (fmt) {
case 2:
@@ -758,10 +790,12 @@ u_int fpregs[];
return(MAJOR_0E_EXCP);
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
fpregs[t] = fpregs[r1] ^ 0x80000000;
return(NOEXCEPTION);
}
+ BUG();
case 7: /* FNEGABS */
switch (fmt) {
case 2:
@@ -769,10 +803,12 @@ u_int fpregs[];
return(MAJOR_0E_EXCP);
case 1: /* double */
fpregs[t+1] = fpregs[r1+1];
+ fallthrough;
case 0: /* single */
fpregs[t] = fpregs[r1] | 0x80000000;
return(NOEXCEPTION);
}
+ BUG();
case 4: /* FSQRT */
switch (fmt) {
case 0:
@@ -785,6 +821,7 @@ u_int fpregs[];
case 3:
return(MAJOR_0E_EXCP);
}
+ BUG();
case 5: /* FRMD */
switch (fmt) {
case 0:
@@ -798,7 +835,7 @@ u_int fpregs[];
return(MAJOR_0E_EXCP);
}
} /* end of switch (subop */
-
+ BUG();
case 1: /* class 1 */
df = extru(ir,fpdfpos,2); /* get dest format */
/*
@@ -826,6 +863,7 @@ u_int fpregs[];
case 3: /* dbl/dbl */
return(MAJOR_0E_EXCP);
}
+ BUG();
case 1: /* FCNVXF */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -841,6 +879,7 @@ u_int fpregs[];
return(dbl_to_dbl_fcnvxf(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 2: /* FCNVFX */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -856,6 +895,7 @@ u_int fpregs[];
return(dbl_to_dbl_fcnvfx(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 3: /* FCNVFXT */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -871,6 +911,7 @@ u_int fpregs[];
return(dbl_to_dbl_fcnvfxt(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 5: /* FCNVUF (PA2.0 only) */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -886,6 +927,7 @@ u_int fpregs[];
return(dbl_to_dbl_fcnvuf(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 6: /* FCNVFU (PA2.0 only) */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -901,6 +943,7 @@ u_int fpregs[];
return(dbl_to_dbl_fcnvfu(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 7: /* FCNVFUT (PA2.0 only) */
switch(fmt) {
case 0: /* sgl/sgl */
@@ -916,9 +959,11 @@ u_int fpregs[];
return(dbl_to_dbl_fcnvfut(&fpregs[r1],0,
&fpregs[t],status));
}
+ BUG();
case 4: /* undefined */
return(MAJOR_0C_EXCP);
} /* end of switch subop */
+ BUG();
case 2: /* class 2 */
/*
* Be careful out there.
@@ -994,6 +1039,7 @@ u_int fpregs[];
}
} /* end of switch subop */
} /* end of else for PA1.0 & PA1.1 */
+ BUG();
case 3: /* class 3 */
/*
* Be careful out there.
@@ -1026,6 +1072,7 @@ u_int fpregs[];
return(dbl_fadd(&fpregs[r1],&fpregs[r2],
&fpregs[t],status));
}
+ BUG();
case 1: /* FSUB */
switch (fmt) {
case 0:
@@ -1035,6 +1082,7 @@ u_int fpregs[];
return(dbl_fsub(&fpregs[r1],&fpregs[r2],
&fpregs[t],status));
}
+ BUG();
case 2: /* FMPY or XMPYU */
/*
* check for integer multiply (x bit set)
@@ -1071,6 +1119,7 @@ u_int fpregs[];
&fpregs[r2],&fpregs[t],status));
}
}
+ BUG();
case 3: /* FDIV */
switch (fmt) {
case 0:
@@ -1080,6 +1129,7 @@ u_int fpregs[];
return(dbl_fdiv(&fpregs[r1],&fpregs[r2],
&fpregs[t],status));
}
+ BUG();
case 4: /* FREM */
switch (fmt) {
case 0:
--
2.20.1
15 Nov '21
From: Jonas Hahnfeld <hahnjo(a)hahnjo.de>
stable inclusion
from stable-5.10.75
commit f077d699c1d2aa05c8d5982bd646b040353f052c
bugzilla: 182987 https://gitee.com/openeuler/kernel/issues/I4I3MP
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 48827e1d6af58f219e89c7ec08dccbca28c7694e upstream.
The device advertises 8 formats, but only a rate of 48kHz is honored
by the hardware and 24 bits give chopped audio, so only report the
one working combination. This fixes out-of-the-box audio experience
with PipeWire which otherwise attempts to choose S24_3LE (while
PulseAudio defaulted to S16_LE).
Signed-off-by: Jonas Hahnfeld <hahnjo(a)hahnjo.de>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20211012200906.3492-1-hahnjo@hahnjo.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
sound/usb/quirks-table.h | 42 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
index 5728bf722c88..7c649cd38049 100644
--- a/sound/usb/quirks-table.h
+++ b/sound/usb/quirks-table.h
@@ -77,6 +77,48 @@
/* E-Mu 0204 USB */
{ USB_DEVICE_VENDOR_SPEC(0x041e, 0x3f19) },
+/*
+ * Creative Technology, Ltd Live! Cam Sync HD [VF0770]
+ * The device advertises 8 formats, but only a rate of 48kHz is honored by the
+ * hardware and 24 bits give chopped audio, so only report the one working
+ * combination.
+ */
+{
+ USB_DEVICE(0x041e, 0x4095),
+ .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
+ .ifnum = QUIRK_ANY_INTERFACE,
+ .type = QUIRK_COMPOSITE,
+ .data = &(const struct snd_usb_audio_quirk[]) {
+ {
+ .ifnum = 2,
+ .type = QUIRK_AUDIO_STANDARD_MIXER,
+ },
+ {
+ .ifnum = 3,
+ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
+ .data = &(const struct audioformat) {
+ .formats = SNDRV_PCM_FMTBIT_S16_LE,
+ .channels = 2,
+ .fmt_bits = 16,
+ .iface = 3,
+ .altsetting = 4,
+ .altset_idx = 4,
+ .endpoint = 0x82,
+ .ep_attr = 0x05,
+ .rates = SNDRV_PCM_RATE_48000,
+ .rate_min = 48000,
+ .rate_max = 48000,
+ .nr_rates = 1,
+ .rate_table = (unsigned int[]) { 48000 },
+ },
+ },
+ {
+ .ifnum = -1
+ },
+ },
+ },
+},
+
/*
* HP Wireless Audio
* When not ignored, causes instability issues for some users, forcing them to
--
2.20.1
[PATCH openEuler-5.10 01/20] ASoC: Intel: sof_sdw: tag SoundWire BEs as non-atomic
by Zheng Zengkai 15 Nov '21
From: Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
stable inclusion
from stable-5.10.74
commit 0bcfa99e8faeef75567e6d3a5ac9680d28240b21
bugzilla: 182986 https://gitee.com/openeuler/kernel/issues/I4I3MG
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 58eafe1ff52ee1ce255759fc15729519af180cbb ]
The SoundWire BEs make use of 'stream' functions for .prepare and
.trigger. These functions will in turn force a Bank Switch, which
implies a wait operation.
Mark SoundWire BEs as nonatomic for consistency, but keep all other
types of BEs as is. The initialization of .nonatomic is done outside
of the create_sdw_dailink helper to avoid adding more parameters to
deal with a single exception to the rule that BEs are atomic.
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
Reviewed-by: Rander Wang <rander.wang(a)intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan(a)linux.intel.com>
Reviewed-by: Bard Liao <bard.liao(a)intel.com>
Link: https://lore.kernel.org/r/20210907184436.33152-1-pierre-louis.bossart@linux…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
sound/soc/intel/boards/sof_sdw.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c
index 2770e8179983..25548555d8d7 100644
--- a/sound/soc/intel/boards/sof_sdw.c
+++ b/sound/soc/intel/boards/sof_sdw.c
@@ -847,6 +847,11 @@ static int create_sdw_dailink(struct device *dev, int *be_index,
cpus + *cpu_id, cpu_dai_num,
codecs, codec_num,
NULL, &sdw_ops);
+ /*
+ * SoundWire DAILINKs use 'stream' functions and Bank Switch operations
+ * based on wait_for_completion(), tag them as 'nonatomic'.
+ */
+ dai_links[*be_index].nonatomic = true;
ret = set_codec_init_func(link, dai_links + (*be_index)++,
playback, group_id);
--
2.20.1
[PATCH openEuler-5.10 01/80] Partially revert "usb: Kconfig: using select for USB_COMMON dependency"
by Zheng Zengkai 15 Nov '21
From: Ben Hutchings <ben(a)decadent.org.uk>
stable inclusion
from stable-5.10.73
commit 16d728110bd76d1ebb4aad8bcf36596f7ce11be0
bugzilla: 182983 https://gitee.com/openeuler/kernel/issues/I4I3M0
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 4d1aa9112c8e6995ef2c8a76972c9671332ccfea upstream.
This reverts commit cb9c1cfc86926d0e86d19c8e34f6c23458cd3478 for
USB_LED_TRIG. This config symbol has bool type and enables extra code
in usb_common itself, not a separate driver. Enabling it should not
force usb_common to be built-in!
Fixes: cb9c1cfc8692 ("usb: Kconfig: using select for USB_COMMON dependency")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk>
Signed-off-by: Salvatore Bonaccorso <carnil(a)debian.org>
Link: https://lore.kernel.org/r/20210921143442.340087-1-carnil@debian.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/usb/common/Kconfig | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/usb/common/Kconfig b/drivers/usb/common/Kconfig
index 5e8a04e3dd3c..b856622431a7 100644
--- a/drivers/usb/common/Kconfig
+++ b/drivers/usb/common/Kconfig
@@ -6,8 +6,7 @@ config USB_COMMON
config USB_LED_TRIG
bool "USB LED Triggers"
- depends on LEDS_CLASS && LEDS_TRIGGERS
- select USB_COMMON
+ depends on LEDS_CLASS && USB_COMMON && LEDS_TRIGGERS
help
This option adds LED triggers for USB host and/or gadget activity.
--
2.20.1
[PATCH openEuler-5.10 1/7] nbd: don't handle response without a corresponding request message
by Zheng Zengkai 15 Nov '21
From: Yu Kuai <yukuai3(a)huawei.com>
mainline inclusion
from mainline-next-20211018
commit b5644a3a79bf3be5f1238db1b2f241374b27b0f0
category: bugfix
bugzilla: 49890 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
While handling a response message from the server, nbd_read_stat() will
try to get the request by tag, and then complete the request. However,
this is problematic if nbd hasn't sent a corresponding request
message:
t1                      t2
                        submit_bio
                        nbd_queue_rq
                        blk_mq_start_request
recv_work
nbd_read_stat
blk_mq_tag_to_rq
blk_mq_complete_request
                        nbd_send_cmd
Thus add a new cmd flag 'NBD_CMD_INFLIGHT'; it will be set in
nbd_send_cmd() and checked in nbd_read_stat().
Note that this patch can't fix the case where blk_mq_tag_to_rq() might
return a freed request; that will be fixed in the following
patches.
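Sketched in isolation, the flag protocol looks like this (condensed from
the patch; both sides run under cmd->lock):
/* Submission side (nbd_handle_cmd), once the request really hit the wire: */
if (!ret)
	__set_bit(NBD_CMD_INFLIGHT, &cmd->flags);

/* Completion side (nbd_read_stat), before trusting the server's reply: */
if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
	dev_err(disk_to_dev(nbd->disk),
		"Suspicious reply %d (status %u flags %lu)",
		tag, cmd->status, cmd->flags);
	ret = -ENOENT;	/* no request message was ever sent for this tag */
	goto out;
}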
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Ming Lei <ming.lei(a)redhat.com>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Link: https://lore.kernel.org/r/20210916093350.1410403-2-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/block/nbd.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 8a841d5f422d..cdee84f3c672 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -122,6 +122,12 @@ struct nbd_device {
};
#define NBD_CMD_REQUEUED 1
+/*
+ * This flag will be set if nbd_queue_rq() succeed, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT 2
struct nbd_cmd {
struct nbd_device *nbd;
@@ -389,6 +395,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
if (!mutex_trylock(&cmd->lock))
return BLK_EH_RESET_TIMER;
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
if (!refcount_inc_not_zero(&nbd->config_refs)) {
cmd->status = BLK_STS_TIMEOUT;
mutex_unlock(&cmd->lock);
@@ -718,6 +725,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
cmd = blk_mq_rq_to_pdu(req);
mutex_lock(&cmd->lock);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+ tag, cmd->status, cmd->flags);
+ ret = -ENOENT;
+ goto out;
+ }
if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -817,6 +830,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
return true;
mutex_lock(&cmd->lock);
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
cmd->status = BLK_STS_IOERR;
mutex_unlock(&cmd->lock);
@@ -953,7 +967,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
* returns EAGAIN can be retried on a different socket.
*/
ret = nbd_send_cmd(nbd, cmd, index);
- if (ret == -EAGAIN) {
+ /*
+ * Access to this flag is protected by cmd->lock, thus it's safe to set
+ * the flag after nbd_send_cmd() succeed to send request to server.
+ */
+ if (!ret)
+ __set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ else if (ret == -EAGAIN) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Request send failed, requeueing\n");
nbd_mark_nsock_dead(nbd, nsock, 1);
--
2.20.1
[PATCH openEuler-5.10 01/29] spi: rockchip: handle zero length transfers without timing out
by Zheng Zengkai 15 Nov '21
From: Tobias Schramm <t.schramm(a)manjaro.org>
stable inclusion
from stable-5.10.72
commit 2ababcd8c2ababe7f11032b928b9e8ab35af5e8c
bugzilla: 182982 https://gitee.com/openeuler/kernel/issues/I4I3L1
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 5457773ef99f25fcc4b238ac76b68e28273250f4 ]
Previously zero length transfers submitted to the Rockchip SPI driver would
time out in the SPI layer. This happens because the SPI peripheral does
not trigger a transfer completion interrupt for zero length transfers.
Fix that by completing zero length transfers immediately at the start of
the transfer.
Signed-off-by: Tobias Schramm <t.schramm(a)manjaro.org>
Link: https://lore.kernel.org/r/20210827050357.165409-1-t.schramm@manjaro.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/spi/spi-rockchip.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c
index 0aab37cd64e7..624273d0e727 100644
--- a/drivers/spi/spi-rockchip.c
+++ b/drivers/spi/spi-rockchip.c
@@ -582,6 +582,12 @@ static int rockchip_spi_transfer_one(
int ret;
bool use_dma;
+ /* Zero length transfers won't trigger an interrupt on completion */
+ if (!xfer->len) {
+ spi_finalize_current_transfer(ctlr);
+ return 1;
+ }
+
WARN_ON(readl_relaxed(rs->regs + ROCKCHIP_SPI_SSIENR) &&
(readl_relaxed(rs->regs + ROCKCHIP_SPI_SR) & SR_BUSY));
--
2.20.1
[PATCH openEuler-5.10 01/86] tty: Fix out-of-bound vmalloc access in imageblit
by Zheng Zengkai 15 Nov '21
From: Igor Matheus Andrade Torrente <igormtorrente(a)gmail.com>
stable inclusion
from stable-5.10.71
commit d570c48dd37dbe8fc6875d4461d01a9554ae2560
bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 3b0c406124719b625b1aba431659f5cdc24a982c ]
This issue happens when a userspace program does an ioctl
FBIOPUT_VSCREENINFO, passing a fb_var_screeninfo struct
that contains values only in the fields xres, yres, and bits_per_pixel.
If this struct is the same as in the previous ioctl, vc_resize()
detects that and doesn't call resize_screen(), leaving the
fb_var_screeninfo incomplete. This leads updatescrollmode() to
calculate a wrong value for fbcon_display->vrows, which makes real_y()
return a wrong value of y, and that value eventually causes
imageblit to access an out-of-bound address.
To solve this issue, call resize_screen() even if the screen does not
need any resizing, so it will "fix and fill" the fb_var_screeninfo
independently.
Cc: stable <stable(a)vger.kernel.org> # after 5.15-rc2 is out, give it time to bake
Reported-and-tested-by: syzbot+858dc7a2f7ef07c2c219(a)syzkaller.appspotmail.com
Signed-off-by: Igor Matheus Andrade Torrente <igormtorrente(a)gmail.com>
Link: https://lore.kernel.org/r/20210628134509.15895-1-igormtorrente@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/tty/vt/vt.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index cea40ef090b7..a7ee1171eeb3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1220,8 +1220,25 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
new_row_size = new_cols << 1;
new_screen_size = new_row_size * new_rows;
- if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
- return 0;
+ if (new_cols == vc->vc_cols && new_rows == vc->vc_rows) {
+ /*
+ * This function is being called here to cover the case
+ * where the userspace calls the FBIOPUT_VSCREENINFO twice,
+ * passing the same fb_var_screeninfo containing the fields
+ * yres/xres equal to a number non-multiple of vc_font.height
+ * and yres_virtual/xres_virtual equal to number lesser than the
+ * vc_font.height and yres/xres.
+ * In the second call, the struct fb_var_screeninfo isn't
+ * being modified by the underlying driver because of the
+ * if above, and this causes the fbcon_display->vrows to become
+ * negative and it eventually leads to out-of-bound
+ * access by the imageblit function.
+ * To give the correct values to the struct and to not have
+ * to deal with possible errors from the code below, we call
+ * the resize_screen here as well.
+ */
+ return resize_screen(vc, new_cols, new_rows, user);
+ }
if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size)
return -EINVAL;
--
2.20.1
[PATCH openEuler-5.10 01/13] cgroup: Fix memory leak caused by missing cgroup_bpf_offline
by Zheng Zengkai 15 Nov '21
From: Quanyang Wang <quanyang.wang(a)windriver.com>
mainline inclusion
from mainline
commit 04f8ef5643bcd8bcde25dfdebef998aea480b2ba
category: bugfix
bugzilla: 182945 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/…
---------------------------
When enabling CONFIG_CGROUP_BPF, a memory leak can be observed with
kmemleak by running the commands below:
$mount -t cgroup -o none,name=foo cgroup cgroup/
$umount cgroup/
unreferenced object 0xc3585c40 (size 64):
comm "mount", pid 425, jiffies 4294959825 (age 31.990s)
hex dump (first 32 bytes):
01 00 00 80 84 8c 28 c0 00 00 00 00 00 00 00 00 ......(.........
00 00 00 00 00 00 00 00 6c 43 a0 c3 00 00 00 00 ........lC......
backtrace:
[<e95a2f9e>] cgroup_bpf_inherit+0x44/0x24c
[<1f03679c>] cgroup_setup_root+0x174/0x37c
[<ed4b0ac5>] cgroup1_get_tree+0x2c0/0x4a0
[<f85b12fd>] vfs_get_tree+0x24/0x108
[<f55aec5c>] path_mount+0x384/0x988
[<e2d5e9cd>] do_mount+0x64/0x9c
[<208c9cfe>] sys_mount+0xfc/0x1f4
[<06dd06e0>] ret_fast_syscall+0x0/0x48
[<a8308cb3>] 0xbeb4daa8
This is because, since commit 2b0d3d3e4fcf ("percpu_ref: reduce
memory footprint of percpu_ref in fast path"), root_cgrp->bpf.refcnt.data
is allocated by percpu_ref_init() in cgroup_bpf_inherit(), which
is called by cgroup_setup_root() when mounting, but it is not freed along
with root_cgrp when umounting. Adding cgroup_bpf_offline(), which calls
percpu_ref_kill(), to cgroup_kill_sb() frees root_cgrp->bpf.refcnt.data on
the umount path.
This patch also fixes commit 4bfc0bb2c60e ("bpf: decouple the lifetime
of cgroup_bpf from cgroup itself"): a cgroup_bpf_offline() is needed to
clean up the resources that were allocated by cgroup_bpf_inherit()
in cgroup_setup_root().
Inside cgroup_bpf_offline(), cgroup_get() is called at the beginning, and
cgroup_put() is called at the end of cgroup_bpf_release(), which is
triggered by cgroup_bpf_offline(). So cgroup_bpf_offline() keeps the
cgroup's refcount balanced.
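The invariant at play: every percpu_ref_init() allocates backing data that
only a later percpu_ref_kill()/release can free. A minimal sketch of the
pairing outside of cgroup (all names below are illustrative):
#include <linux/percpu-refcount.h>

static void demo_release(struct percpu_ref *ref)
{
	/* last reference dropped; safe to free the surrounding object */
}

static int demo_setup(struct percpu_ref *ref)
{
	/* Since commit 2b0d3d3e4fcf this allocates ref->data internally. */
	return percpu_ref_init(ref, demo_release, 0, GFP_KERNEL);
}

static void demo_teardown(struct percpu_ref *ref)
{
	/* Without this kill, ref->data is leaked -- the exact pattern
	 * kmemleak flagged for root_cgrp->bpf.refcnt on umount. */
	percpu_ref_kill(ref);
}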
Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Signed-off-by: Quanyang Wang <quanyang.wang(a)windriver.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Roman Gushchin <guro(a)fb.com>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Link: https://lore.kernel.org/bpf/20211018075623.26884-1-quanyang.wang@windriver.…
Signed-off-by: Lu Jialin <lujialin4(a)huawei.com>
Reviewed-by: weiyang wang <wangweiyang2(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
kernel/cgroup/cgroup.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 86ab4a1305f6..5e4a50091c18 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2159,8 +2159,10 @@ static void cgroup_kill_sb(struct super_block *sb)
* And don't kill the default root.
*/
if (list_empty(&root->cgrp.self.children) && root != &cgrp_dfl_root &&
- !percpu_ref_is_dying(&root->cgrp.self.refcnt))
+ !percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
+ cgroup_bpf_offline(&root->cgrp);
percpu_ref_kill(&root->cgrp.self.refcnt);
+ }
cgroup_put(&root->cgrp);
kernfs_kill_sb(sb);
}
--
2.20.1
[PATCH openEuler-5.10 01/97] PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO response
by Zheng Zengkai 15 Nov '21
From: Pali Rohár <pali(a)kernel.org>
stable inclusion
from stable-5.10.70
commit 31bd6cd06a18315c9a1b4c6035d027f086c59942
bugzilla: 182949 https://gitee.com/openeuler/kernel/issues/I4I3GQ
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 2b58db229eb617d97d5746113b77045f1f884bcb upstream.
Measurements in different conditions showed that the aardvark hardware PIO
response can take up to 1.44s. Increase the wait timeout from 1ms to 1.5s
to ensure that we do not miss responses from the hardware. After 1.44s the
hardware returns errors (e.g. Completer abort).
The previous two patches fixed the checking of the PIO status, so now we
can use it to also catch errors which are reported by the hardware after
1.44s.
After applying this patch, the kernel can detect and print PIO errors to dmesg:
[ 6.879999] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100004
[ 6.896436] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
[ 6.913049] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100010
[ 6.929663] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100010
[ 6.953558] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100014
[ 6.970170] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100014
[ 6.994328] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
Without this patch the kernel prints only a generic error to dmesg:
[ 5.246847] advk-pcie d0070000.pcie: config read/write timed out
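For reference, the polling loop this constant feeds is a simple bounded
busy-wait; a sketch of its shape (close to, but not verbatim, the
driver's advk_pcie_wait_pio()):
#define PIO_RETRY_CNT	750000	/* x 2 us per iteration = 1.5 s total */
#define PIO_RETRY_DELAY	2	/* 2 us */

static int advk_pcie_wait_pio_sketch(struct advk_pcie *pcie)
{
	int i;

	for (i = 0; i < PIO_RETRY_CNT; i++) {
		u32 start = advk_readl(pcie, PIO_START);
		u32 isr = advk_readl(pcie, PIO_ISR);

		if (!start && isr)	/* hardware completed the transfer */
			return 0;
		udelay(PIO_RETRY_DELAY);
	}
	return -ETIMEDOUT;	/* only after the full 1.5 s window */
}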
Link: https://lore.kernel.org/r/20210722144041.12661-3-pali@kernel.org
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Reviewed-by: Marek Behún <kabel(a)kernel.org>
Cc: stable(a)vger.kernel.org # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/pci/controller/pci-aardvark.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index f175cff39b46..4f1a29ede576 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -214,7 +214,7 @@
(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \
PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
-#define PIO_RETRY_CNT 500
+#define PIO_RETRY_CNT 750000 /* 1.5 s */
#define PIO_RETRY_DELAY 2 /* 2 us*/
#define LINK_WAIT_MAX_RETRIES 10
--
2.20.1
[PATCH openEuler-5.10 1/5] arm64: remove page granularity limitation from KFENCE
by Zheng Zengkai 15 Nov '21
From: Liu Shixin <liushixin2(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4DDEL
-------------------------------------------------
Currently, if KFENCE is enabled on arm64, the entire linear map is
mapped at page granularity, which seems overkill. Actually only the
kfence pool needs to be mapped at page granularity. We can remove the
restriction from KFENCE and force the linear mapping of the kfence pool
to page granularity later in arch_kfence_init_pool().
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
arch/arm64/include/asm/kfence.h | 70 ++++++++++++++++++++++++++++++++-
arch/arm64/mm/mmu.c | 6 +--
2 files changed, 71 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/kfence.h b/arch/arm64/include/asm/kfence.h
index d061176d57ea..322e95bc228d 100644
--- a/arch/arm64/include/asm/kfence.h
+++ b/arch/arm64/include/asm/kfence.h
@@ -8,9 +8,77 @@
#ifndef __ASM_KFENCE_H
#define __ASM_KFENCE_H
+#include <linux/kfence.h>
#include <asm/cacheflush.h>
+#include <asm/pgalloc.h>
-static inline bool arch_kfence_init_pool(void) { return true; }
+static inline int split_pud_page(pud_t *pud, unsigned long addr)
+{
+ int i;
+ pmd_t *pmd = pmd_alloc_one(&init_mm, addr);
+ unsigned long pfn = PFN_DOWN(__pa(addr));
+
+ if (!pmd)
+ return -ENOMEM;
+
+ for (i = 0; i < PTRS_PER_PMD; i++)
+ set_pmd(pmd + i, pmd_mkhuge(pfn_pmd(pfn + i * PTRS_PER_PTE, PAGE_KERNEL)));
+
+ smp_wmb(); /* See comment in __pte_alloc */
+ pud_populate(&init_mm, pud, pmd);
+
+ flush_tlb_kernel_range(addr, addr + PUD_SIZE);
+ return 0;
+}
+
+static inline int split_pmd_page(pmd_t *pmd, unsigned long addr)
+{
+ int i;
+ pte_t *pte = pte_alloc_one_kernel(&init_mm);
+ unsigned long pfn = PFN_DOWN(__pa(addr));
+
+ if (!pte)
+ return -ENOMEM;
+
+ for (i = 0; i < PTRS_PER_PTE; i++)
+ set_pte(pte + i, pfn_pte(pfn + i, PAGE_KERNEL));
+
+ smp_wmb(); /* See comment in __pte_alloc */
+ pmd_populate_kernel(&init_mm, pmd, pte);
+
+ flush_tlb_kernel_range(addr, addr + PMD_SIZE);
+ return 0;
+}
+
+static inline bool arch_kfence_init_pool(void)
+{
+ unsigned long addr;
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+
+ for (addr = (unsigned long)__kfence_pool; is_kfence_address((void *)addr);
+ addr += PAGE_SIZE) {
+ pgd = pgd_offset(&init_mm, addr);
+ if (pgd_leaf(*pgd))
+ return false;
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_leaf(*p4d))
+ return false;
+ pud = pud_offset(p4d, addr);
+ if (pud_leaf(*pud)) {
+ if (split_pud_page(pud, addr & PUD_MASK))
+ return false;
+ }
+ pmd = pmd_offset(pud, addr);
+ if (pmd_leaf(*pmd)) {
+ if (split_pmd_page(pmd, addr & PMD_MASK))
+ return false;
+ }
+ }
+ return true;
+}
static inline bool kfence_protect_page(unsigned long addr, bool protect)
{
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b6a9895d6655..1c2a965e65b3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -492,8 +492,7 @@ static void __init map_mem(pgd_t *pgdp)
int flags = 0;
u64 i;
- if (rodata_full || crash_mem_map || debug_pagealloc_enabled() ||
- IS_ENABLED(CONFIG_KFENCE))
+ if (rodata_full || crash_mem_map || debug_pagealloc_enabled())
flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
/*
@@ -1458,8 +1457,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
* KFENCE requires linear map to be mapped at page granularity, so that
* it is possible to protect/unprotect single pages in the KFENCE pool.
*/
- if (rodata_full || debug_pagealloc_enabled() ||
- IS_ENABLED(CONFIG_KFENCE))
+ if (rodata_full || debug_pagealloc_enabled())
flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
--
2.20.1
[PATCH openEuler-5.10 01/56] PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
by Zheng Zengkai 15 Nov '21
From: Pali Rohár <pali(a)kernel.org>
stable inclusion
from stable-5.10.69
commit 9e766b86a9ef653a8ca48a9f70d3dbb580284594
bugzilla: 182675 https://gitee.com/openeuler/kernel/issues/I4I3ED
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit e902bb7c24a7099d0eb0eb4cba06f2d91e9299f3 upstream.
The 16-bit Root Capabilities register is at offset 0x1e in the PCIe
Capability. Rename current 'rsvd' struct member to 'rootcap'.
Link: https://lore.kernel.org/r/20210722144041.12661-4-pali@kernel.org
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Reviewed-by: Marek Behún <kabel(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Acked-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
---
drivers/pci/pci-bridge-emul.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci-bridge-emul.h b/drivers/pci/pci-bridge-emul.h
index b31883022a8e..49bbd37ee318 100644
--- a/drivers/pci/pci-bridge-emul.h
+++ b/drivers/pci/pci-bridge-emul.h
@@ -54,7 +54,7 @@ struct pci_bridge_emul_pcie_conf {
__le16 slotctl;
__le16 slotsta;
__le16 rootctl;
- __le16 rsvd;
+ __le16 rootcap;
__le32 rootsta;
__le32 devcap2;
__le16 devctl2;
--
2.20.1
[PATCH openEuler-5.10 01/26] sched/fair: fix sd_llc_alloc_all() compile error
by Zheng Zengkai 15 Nov '21
From: Cheng Jian <cj.chengjian(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 38261, https://gitee.com/openeuler/kernel/issues/I49XPZ
CVE: NA
---------------------------
When CONFIG_SCHED_STEAL is disabled:
kernel/sched/topology.c:24:74: warning: ‘struct s_data’ declared inside parameter list will not be visible outside of this definition or declaration
24 | static inline int sd_llc_alloc_all(const struct cpumask *cpu_map, struct s_data *d) { return 0; }
| ^~~~~~
kernel/sched/topology.c: In function ‘build_sched_domains’:
kernel/sched/topology.c:2188:32: error: passing argument 2 of ‘sd_llc_alloc_all’ from incompatible pointer type [-Werror=incompatible-pointer-types]
2188 | if (sd_llc_alloc_all(cpu_map, &d))
| ^~
| |
| struct s_data *
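The root cause is a C scoping rule: a struct first declared inside a
prototype's parameter list is scoped to that prototype only. Moving the
forward declaration outside the #ifdef gives both configurations the same
type; a generic sketch of the fixed layout:
struct s_data;	/* forward declaration, visible to both branches */

#ifdef CONFIG_SCHED_STEAL
static int sd_llc_alloc_all(const struct cpumask *cpu_map, struct s_data *d);
#else
/* Without the declaration above, 'struct s_data' here would introduce a
 * new type local to this prototype, mismatching every caller. */
static inline int sd_llc_alloc_all(const struct cpumask *cpu_map,
				   struct s_data *d) { return 0; }
#endif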
Signed-off-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
kernel/sched/topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 0564aeabbcb8..fcf6aebb13c4 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -13,8 +13,8 @@ DEFINE_MUTEX(sched_domains_mutex);
static cpumask_var_t sched_domains_tmpmask;
static cpumask_var_t sched_domains_tmpmask2;
-#ifdef CONFIG_SCHED_STEAL
struct s_data;
+#ifdef CONFIG_SCHED_STEAL
static int sd_llc_alloc(struct sched_domain *sd);
static void sd_llc_free(struct sched_domain *sd);
static int sd_llc_alloc_all(const struct cpumask *cpu_map, struct s_data *d);
--
2.20.1
[PATCH openEuler-1.0-LTS] iommu: smmuv2: fix compile error when CONFIG_ARCH_PHYTIUM is off
by Yang Yingliang 15 Nov '21
From: Zheng Zengkai <zhengzengkai(a)huawei.com>
phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ
--------------------------------------
Disabling CONFIG_ARCH_PHYTIUM results in the following compile errors:
drivers/iommu/arm-smmu.c: In function ‘phytium_smmu_def_domain_type’:
drivers/iommu/arm-smmu.c:1641:6: error: implicit declaration of function ‘typeof_ft2000plus’ [-Werror=implicit-function-declaration]
1641 | if (typeof_ft2000plus() || typeof_s2500()) {
| ^~~~~~~~~~~~~~~~~
drivers/iommu/arm-smmu.c:1641:29: error: implicit declaration of function ‘typeof_s2500’ [-Werror=implicit-function-declaration]
1641 | if (typeof_ft2000plus() || typeof_s2500()) {
| ^~~~~~~~~~~~
cc1: some warnings being treated as errors
Fix it by using CONFIG_ARCH_PHYTIUM to control phytium related code.
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Reviewed-by: Zhen Lei <thunder.leizhen(a)huawei.com>
Reviewed-by: wangxiongfeng 00379786 <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/iommu/arm-smmu.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 18863198bb036..d1c00b1dfd2ef 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1634,8 +1634,7 @@ static void arm_smmu_put_resv_regions(struct device *dev,
#ifdef CONFIG_SMMU_BYPASS_DEV
-#ifdef CONFIG_ARM64
-#include <asm/cputype.h>
+#ifdef CONFIG_ARCH_PHYTIUM
static int phytium_smmu_def_domain_type(struct device *dev, unsigned int *type)
{
if (typeof_ft2000plus() || typeof_s2500()) {
--
2.25.1
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4I0OZ
CVE: NA
Fix the typo 'last_cmsn', which should be 'last_pmsn'.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
drivers/scsi/spfc/hw/spfc_queue.c | 28 ++++++++++++++--------------
drivers/scsi/spfc/hw/spfc_queue.h | 2 +-
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/scsi/spfc/hw/spfc_queue.c b/drivers/scsi/spfc/hw/spfc_queue.c
index 3f73fa26aad1..abcf1ff3f49f 100644
--- a/drivers/scsi/spfc/hw/spfc_queue.c
+++ b/drivers/scsi/spfc/hw/spfc_queue.c
@@ -1027,7 +1027,7 @@ u32 spfc_create_ssq(void *handle)
sq_ctrl->wqe_offset = 0;
sq_ctrl->head_start_cmsn = 0;
sq_ctrl->head_end_cmsn = SPFC_GET_WP_END_CMSN(0, sq_ctrl->wqe_num_per_buf);
- sq_ctrl->last_cmsn = 0;
+ sq_ctrl->last_pmsn = 0;
/* Linked List SQ Owner Bit 1 valid,0 invalid */
sq_ctrl->last_pi_owner = 1;
atomic_set(&sq_ctrl->sq_valid, true);
@@ -3127,7 +3127,7 @@ static u32 spfc_parent_sq_ring_direct_wqe_doorbell(struct spfc_parent_ssq_info *
struct spfc_hba_info *hba;
hba = (struct spfc_hba_info *)sq->hba;
- pmsn = sq->last_cmsn;
+ pmsn = sq->last_pmsn;
if (sq->cache_id == INVALID_VALUE32) {
FC_DRV_PRINT(UNF_LOG_IO_ATT, UNF_ERR,
@@ -3166,7 +3166,7 @@ u32 spfc_parent_sq_ring_doorbell(struct spfc_parent_ssq_info *sq, u8 qos_level,
struct spfc_parent_sq_db door_bell;
hba = (struct spfc_hba_info *)sq->hba;
- pmsn = sq->last_cmsn;
+ pmsn = sq->last_pmsn;
/* Obtain the low 8 Bit of PMSN */
pmsn_lo = (u8)(pmsn & SPFC_PMSN_MASK);
/* Obtain the high 8 Bit of PMSN */
@@ -3231,10 +3231,10 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
FC_DRV_PRINT(UNF_LOG_NORMAL, UNF_INFO,
"[info]Ssq(0x%x), xid(0x%x) qid(0x%x) add wqepage at Pmsn(0x%x), sqe_minus_cqe_cnt(0x%x)",
ssq->sqn, ssq->context_id, ssq->sq_queue_id,
- ssq->last_cmsn,
+ ssq->last_pmsn,
atomic_read(&ssq->sqe_minus_cqe_cnt));
- link_wqe_msn = SPFC_MSN_DEC(ssq->last_cmsn);
+ link_wqe_msn = SPFC_MSN_DEC(ssq->last_pmsn);
link_wqe = (struct spfc_linkwqe *)spfc_get_wqe_page_entry(tail_wpg,
ssq->wqe_offset);
msn_wd = be32_to_cpu(link_wqe->val_wd1);
@@ -3250,7 +3250,7 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
}
sqe_in_wp =
(struct spfc_sqe *)spfc_get_wqe_page_entry(tail_wpg, ssq->wqe_offset);
- spfc_build_wqe_owner_pmsn(io_sqe, (ssq->last_pi_owner), ssq->last_cmsn);
+ spfc_build_wqe_owner_pmsn(io_sqe, (ssq->last_pi_owner), ssq->last_pmsn);
SPFC_IO_STAT((struct spfc_hba_info *)ssq->hba, wqe_type);
wqe_gpa = tail_wpg->wpg_phy_addr + (ssq->wqe_offset * sizeof(struct spfc_sqe));
@@ -3260,11 +3260,11 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
dre_door_bell.wd0.cos = 0;
dre_door_bell.wd0.c = 0;
dre_door_bell.wd0.pi_hi =
- (u32)(ssq->last_cmsn >> UNF_SHIFT_12) & SPFC_DB_WD0_PI_H_MASK;
+ (u32)(ssq->last_pmsn >> UNF_SHIFT_12) & SPFC_DB_WD0_PI_H_MASK;
dre_door_bell.wd0.cntx_size = SPFC_CNTX_SIZE_T_256B;
dre_door_bell.wd0.xid = ssq->context_id;
dre_door_bell.wd1.sm_data = ssq->cache_id;
- dre_door_bell.wd1.pi_lo = (u32)(ssq->last_cmsn & SPFC_DB_WD0_PI_L_MASK);
+ dre_door_bell.wd1.pi_lo = (u32)(ssq->last_pmsn & SPFC_DB_WD0_PI_L_MASK);
io_sqe->db_val = *(u64 *)&dre_door_bell;
spfc_convert_parent_wqe_to_big_endian(io_sqe);
@@ -3275,7 +3275,7 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
"[INFO]Ssq(0x%x) xid:0x%x,qid:0x%x wqegpa:0x%llx,o:0x%x,outstandind:0x%x,pmsn:0x%x,cmsn:0x%x",
ssq->sqn, ssq->context_id, ssq->sq_queue_id, wqe_gpa,
ssq->last_pi_owner, atomic_read(&ssq->sqe_minus_cqe_cnt),
- ssq->last_cmsn, SPFC_GET_QUEUE_CMSN(ssq));
+ ssq->last_pmsn, SPFC_GET_QUEUE_CMSN(ssq));
ssq->accum_wqe_cnt++;
if (ssq->accum_wqe_cnt == accum_db_num) {
@@ -3286,7 +3286,7 @@ u32 spfc_direct_sq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *io
}
ssq->wqe_offset += 1;
- ssq->last_cmsn = SPFC_MSN_INC(ssq->last_cmsn);
+ ssq->last_pmsn = SPFC_MSN_INC(ssq->last_pmsn);
atomic_inc(&ssq->sq_wqe_cnt);
atomic_inc(&ssq->sqe_minus_cqe_cnt);
SPFC_SQ_IO_STAT(ssq, wqe_type);
@@ -3319,7 +3319,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
FC_DRV_PRINT(UNF_LOG_NORMAL, UNF_INFO,
"[info]Ssq(0x%x), xid(0x%x) qid(0x%x) add wqepage at Pmsn(0x%x), WpgCnt(0x%x)",
ssq->sqn, ssq->context_id, ssq->sq_queue_id,
- ssq->last_cmsn,
+ ssq->last_pmsn,
atomic_read(&ssq->wqe_page_cnt));
cur_cmsn = SPFC_GET_QUEUE_CMSN(ssq);
spfc_free_sq_wqe_page(ssq, cur_cmsn);
@@ -3335,7 +3335,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
link_wqe->next_page_addr_hi = cpu_to_be32(addr_wd);
addr_wd = SPFC_LSD(new_wqe_page->wpg_phy_addr);
link_wqe->next_page_addr_lo = cpu_to_be32(addr_wd);
- link_wqe_msn = SPFC_MSN_DEC(ssq->last_cmsn);
+ link_wqe_msn = SPFC_MSN_DEC(ssq->last_pmsn);
msn_wd = be32_to_cpu(link_wqe->val_wd1);
msn_wd |= ((u32)(link_wqe_msn & SPFC_MSNWD_L_MASK));
msn_wd |= (((u32)(link_wqe_msn & SPFC_MSNWD_H_MASK)) << UNF_SHIFT_16);
@@ -3351,7 +3351,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
atomic_inc(&ssq->wqe_page_cnt);
}
- spfc_build_wqe_owner_pmsn(io_sqe, !(ssq->last_pi_owner), ssq->last_cmsn);
+ spfc_build_wqe_owner_pmsn(io_sqe, !(ssq->last_pi_owner), ssq->last_pmsn);
SPFC_IO_STAT((struct spfc_hba_info *)ssq->hba, wqe_type);
spfc_convert_parent_wqe_to_big_endian(io_sqe);
sqe_in_wp = (struct spfc_sqe *)spfc_get_wqe_page_entry(tail_wpg, ssq->wqe_offset);
@@ -3371,7 +3371,7 @@ u32 spfc_parent_ssq_enqueue(struct spfc_parent_ssq_info *ssq, struct spfc_sqe *i
ssq->accum_wqe_cnt = 0;
}
ssq->wqe_offset += 1;
- ssq->last_cmsn = SPFC_MSN_INC(ssq->last_cmsn);
+ ssq->last_pmsn = SPFC_MSN_INC(ssq->last_pmsn);
atomic_inc(&ssq->sq_wqe_cnt);
atomic_inc(&ssq->sqe_minus_cqe_cnt);
SPFC_SQ_IO_STAT(ssq, wqe_type);
diff --git a/drivers/scsi/spfc/hw/spfc_queue.h b/drivers/scsi/spfc/hw/spfc_queue.h
index b1184eb17556..c09f098e7324 100644
--- a/drivers/scsi/spfc/hw/spfc_queue.h
+++ b/drivers/scsi/spfc/hw/spfc_queue.h
@@ -597,7 +597,7 @@ struct spfc_parent_ssq_info {
u32 wqe_offset;
u16 head_start_cmsn;
u16 head_end_cmsn;
- u16 last_cmsn;
+ u16 last_pmsn;
u16 last_pi_owner;
u32 queue_style;
atomic_t sq_valid;
--
2.30.0
15 Nov '21
From: 沈子俊 <shenzijun(a)kylinos.cn>
kylin inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4HDHZ?from=project-issue
CVE: NA
----------------------------------------------------------------------
Switch the arch/arm64 openeuler_defconfig from 64K to 4K pages (ARM64_4K_PAGES) and update the dependent options.
Signed-off-by: 沈子俊 <shenzijun(a)kylinos.cn>
---
arch/arm64/configs/openeuler_defconfig | 31 +++++++++++++-------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 76d6a118330d..87cf82bcb30e 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -250,12 +250,12 @@ CONFIG_TRACEPOINTS=y
CONFIG_ARM64=y
CONFIG_64BIT=y
CONFIG_MMU=y
-CONFIG_ARM64_PAGE_SHIFT=16
-CONFIG_ARM64_CONT_PTE_SHIFT=5
-CONFIG_ARM64_CONT_PMD_SHIFT=5
-CONFIG_ARCH_MMAP_RND_BITS_MIN=14
-CONFIG_ARCH_MMAP_RND_BITS_MAX=14
-CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7
+CONFIG_ARM64_PAGE_SHIFT=12
+CONFIG_ARM64_CONT_PTE_SHIFT=4
+CONFIG_ARM64_CONT_PMD_SHIFT=4
+CONFIG_ARCH_MMAP_RND_BITS_MIN=18
+CONFIG_ARCH_MMAP_RND_BITS_MAX=24
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=11
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
@@ -365,17 +365,17 @@ CONFIG_HISILICON_ERRATUM_HIP08_RU_PREFETCH=y
CONFIG_SOCIONEXT_SYNQUACER_PREITS=y
# end of ARM errata workarounds via the alternatives framework
-# CONFIG_ARM64_4K_PAGES is not set
+CONFIG_ARM64_4K_PAGES=y
# CONFIG_ARM64_16K_PAGES is not set
-CONFIG_ARM64_64K_PAGES=y
-# CONFIG_ARM64_VA_BITS_42 is not set
+# CONFIG_ARM64_64K_PAGES is not set
+CONFIG_ARM64_VA_BITS_39=y
# CONFIG_ARM64_VA_BITS_48 is not set
-CONFIG_ARM64_VA_BITS_52=y
+# CONFIG_ARM64_VA_BITS_52 is not set
# CONFIG_ARM64_FORCE_52BIT is not set
-CONFIG_ARM64_VA_BITS=52
-# CONFIG_ARM64_PA_BITS_48 is not set
-CONFIG_ARM64_PA_BITS_52=y
-CONFIG_ARM64_PA_BITS=52
+CONFIG_ARM64_VA_BITS=39
+CONFIG_ARM64_PA_BITS_48=y
+# CONFIG_ARM64_PA_BITS_52 is not set
+CONFIG_ARM64_PA_BITS=48
# CONFIG_CPU_BIG_ENDIAN is not set
CONFIG_CPU_LITTLE_ENDIAN=y
CONFIG_SCHED_MC=y
@@ -402,6 +402,7 @@ CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_HAVE_ARCH_PFN_VALID=y
CONFIG_HW_PERF_EVENTS=y
CONFIG_SYS_SUPPORTS_HUGETLBFS=y
+CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_PARAVIRT=y
@@ -411,7 +412,7 @@ CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_ARM64_CPU_PARK=y
# CONFIG_XEN is not set
-CONFIG_FORCE_MAX_ZONEORDER=14
+CONFIG_FORCE_MAX_ZONEORDER=11
CONFIG_UNMAP_KERNEL_AT_EL0=y
CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
CONFIG_ARM64_PMEM_RESERVE=y
--
2.30.0
15 Nov '21
From: 沈子俊 <shenzijun(a)kylinos.cn>
kylin inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4I4E2#note_7386759
CVE: NA
------------------------------------------------------------------
Add a forward declaration of 'struct s_data'.
Signed-off-by: 沈子俊 <shenzijun(a)kylinos.cn>
---
kernel/sched/topology.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 0564aeabbcb8..1fac5d28c0dc 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -20,6 +20,7 @@ static void sd_llc_free(struct sched_domain *sd);
static int sd_llc_alloc_all(const struct cpumask *cpu_map, struct s_data *d);
static void sd_llc_free_all(const struct cpumask *cpu_map);
#else
+struct s_data;
static inline void sd_llc_free(struct sched_domain *sd) {}
static inline int sd_llc_alloc_all(const struct cpumask *cpu_map, struct s_data *d) { return 0; }
static inline void sd_llc_free_all(const struct cpumask *cpu_map) {}
--
2.30.0
15 Nov '21
kylin inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4AH11?from=project-issue
CVE: NA
In drivers/md/md.c, when autorun_array() is called, do_md_run() runs
first and then do_md_stop(); in that case the pointer mddev->private
may be freed twice.
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
drivers/md/md.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4406bb137a27..aa204ec74066 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6000,8 +6000,10 @@ static void __md_stop(struct mddev *mddev)
spin_lock(&mddev->lock);
mddev->pers = NULL;
spin_unlock(&mddev->lock);
- pers->free(mddev, mddev->private);
- mddev->private = NULL;
+ if (mddev->private) {
+ pers->free(mddev, mddev->private);
+ mddev->private = NULL;
+ }
if (pers->sync_request && mddev->to_remove == NULL)
mddev->to_remove = &md_redundancy_group;
module_put(pers->owner);
--
2.30.0
15 Nov '21
kylin inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4AFG6?from=project-issue
CVE: NA
qla24xx_sp_unmap() already calls sp->free(sp), so there is no need to
call sp->free(sp) again after qla24xx_sp_unmap() returns.
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
drivers/scsi/qla2xxx/qla_gs.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index c3195d4c25e5..a7198a1e23fb 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -4228,7 +4228,6 @@ static void qla2x00_async_gpnft_gnnft_sp_done(void *s, int res)
if (rc) {
/* Cleanup here to prevent memory leak */
qla24xx_sp_unmap(vha, sp);
- sp->free(sp);
}
spin_lock_irqsave(&vha->work_lock, flags);
--
2.30.0
13 Nov '21
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4AFG6?from=project-issue
CVE: NA
qla24xx_sp_unmap() already calls sp->free(sp), so there is no need to
call sp->free(sp) again after qla24xx_sp_unmap() returns.
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
drivers/scsi/qla2xxx/qla_gs.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index c3195d4c25e5..a7198a1e23fb 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -4228,7 +4228,6 @@ static void qla2x00_async_gpnft_gnnft_sp_done(void *s, int res)
if (rc) {
/* Cleanup here to prevent memory leak */
qla24xx_sp_unmap(vha, sp);
- sp->free(sp);
}
spin_lock_irqsave(&vha->work_lock, flags);
--
2.30.0
From: 沈子俊 <shenzijun(a)kylinos.cn>
The GCM/CCM mode of SM4 is defined in the RFC 8998 specification:
https://datatracker.ietf.org/doc/html/rfc8998
沈子俊 (3):
crypto: tcrypt - Fix missing return value check
crypto: testmgr - Add GCM/CCM mode test of SM4 algorithm
crypto: tcrypt - add GCM/CCM mode test for SM4 algorithm
crypto/tcrypt.c | 73 ++++++++++++++++++++---
crypto/testmgr.c | 29 ++++++++++
crypto/testmgr.h | 148 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 241 insertions(+), 9 deletions(-)
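For context, a minimal sketch of how a kernel user could exercise the new
"gcm(sm4)" mode through the AEAD API once this series is applied — an
illustration under stated assumptions, not code from the series; error
handling is trimmed and buf is assumed to have 16 spare bytes for the tag:

#include <crypto/aead.h>
#include <linux/scatterlist.h>

static int sm4_gcm_demo(const u8 key[16], u8 *buf, unsigned int len, u8 iv[12])
{
	struct crypto_aead *tfm;
	struct aead_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int err;

	tfm = crypto_alloc_aead("gcm(sm4)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	/* RFC 8998: SM4 uses a 128-bit key; request the full 128-bit tag. */
	err = crypto_aead_setkey(tfm, key, 16) ?:
	      crypto_aead_setauthsize(tfm, 16);
	if (err)
		goto out_free_tfm;

	req = aead_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		err = -ENOMEM;
		goto out_free_tfm;
	}

	/* buf holds the plaintext on entry, ciphertext || tag on return */
	sg_init_one(&sg, buf, len + 16);
	aead_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				  crypto_req_done, &wait);
	aead_request_set_crypt(req, &sg, &sg, len, iv);
	aead_request_set_ad(req, 0);
	err = crypto_wait_req(crypto_aead_encrypt(req), &wait);

	aead_request_free(req);
out_free_tfm:
	crypto_free_aead(tfm);
	return err;
}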
--
2.30.0
[PATCH openEuler-1.0-LTS 1/2] crypto: hisilicon - add CRYPTO_TFM_REQ_MAY_BACKLOG flag judge in sec_process()
by Yang Yingliang 12 Nov '21
From: Yu'an Wang <wangyuan46(a)huawei.com>
driver inclusion
category: Bugfix
bugzilla: NA
CVE: NA
Check the CRYPTO_TFM_REQ_MAY_BACKLOG flag in the crypto driver so that
task processing can be limited: when the hardware queue is saturated and
the caller did not set the flag, fail the request with -EBUSY instead of
backlogging it.
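For reference, this is the caller-side half of the contract — a hedged
sketch, not code from this patch; my_complete_cb and my_ctx are
placeholders. A request submitted without CRYPTO_TFM_REQ_MAY_BACKLOG now
fails fast with -EBUSY when the SEC queue is saturated:

#include <crypto/skcipher.h>

static int submit_with_backlog(struct crypto_skcipher *tfm,
			       crypto_completion_t my_complete_cb,
			       void *my_ctx)
{
	struct skcipher_request *req;

	req = skcipher_request_alloc(tfm, GFP_KERNEL);
	if (!req)
		return -ENOMEM;

	/* Opt in to backlogging: a saturated driver queues the request
	 * instead of rejecting it. */
	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				      my_complete_cb, my_ctx);

	/* set src/dst/iv and call crypto_skcipher_encrypt() next */
	return 0;
}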
Signed-off-by: Yu'an Wang <wangyuan46(a)huawei.com>
Reviewed-by: Longfang Liu <liulongfang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/crypto/hisilicon/sec2/sec.h | 1 +
drivers/crypto/hisilicon/sec2/sec_crypto.c | 11 ++++++++---
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/hisilicon/sec2/sec.h b/drivers/crypto/hisilicon/sec2/sec.h
index f6d1878edcc5c..51b93e25b750b 100644
--- a/drivers/crypto/hisilicon/sec2/sec.h
+++ b/drivers/crypto/hisilicon/sec2/sec.h
@@ -43,6 +43,7 @@ struct sec_req {
int err_type;
int req_id;
+ u32 flag;
/* Status of the SEC request */
bool fake_busy;
diff --git a/drivers/crypto/hisilicon/sec2/sec_crypto.c b/drivers/crypto/hisilicon/sec2/sec_crypto.c
index 68fb0e5ef761d..482aa8d26640e 100644
--- a/drivers/crypto/hisilicon/sec2/sec_crypto.c
+++ b/drivers/crypto/hisilicon/sec2/sec_crypto.c
@@ -171,10 +171,13 @@ static int sec_bd_send(struct sec_ctx *ctx, struct sec_req *req)
struct sec_qp_ctx *qp_ctx = req->qp_ctx;
int ret;
- mutex_lock(&qp_ctx->req_lock);
+ if (ctx->fake_req_limit <=
+ atomic_read(&qp_ctx->qp->qp_status.used) &&
+ !(req->flag & CRYPTO_TFM_REQ_MAY_BACKLOG))
+ return -EBUSY;
+ mutex_lock(&qp_ctx->req_lock);
ret = hisi_qp_send(qp_ctx->qp, &req->sec_sqe);
-
if (ctx->fake_req_limit <=
atomic_read(&qp_ctx->qp->qp_status.used) && !ret) {
list_add_tail(&req->backlog_head, &qp_ctx->backlog);
@@ -917,7 +920,8 @@ static int sec_process(struct sec_ctx *ctx, struct sec_req *req)
sec_update_iv(req, ctx->alg_type);
ret = ctx->req_op->bd_send(ctx, req);
- if (unlikely(ret != -EBUSY && ret != -EINPROGRESS)) {
+ if (unlikely((ret != -EBUSY && ret != -EINPROGRESS) ||
+ (ret == -EBUSY && !(req->flag & CRYPTO_TFM_REQ_MAY_BACKLOG)))) {
dev_err_ratelimited(SEC_CTX_DEV(ctx),
"send sec request failed!\n");
goto err_send_req;
@@ -1009,6 +1013,7 @@ static int sec_skcipher_crypto(struct skcipher_request *sk_req, bool encrypt)
if (!sk_req->cryptlen)
return 0;
+ req->flag = sk_req->base.flags;
req->c_req.sk_req = sk_req;
req->c_req.encrypt = encrypt;
req->ctx = ctx;
--
2.25.1
[PATCH openEuler-1.0-LTS 1/3] tcp: always set retrans_stamp on recovery
by Yang Yingliang 12 Nov '21
From: Yuchung Cheng <ycheng(a)google.com>
mainline inclusion
from mainline-v5.1-rc1
commit 7ae189759cc48cf8b54beebff566e9fd2d4e7d7c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue
CVE: NA
------------------------------------------------------------
Previously TCP socket's retrans_stamp is not set if the
retransmission has failed to send. As a result, if a socket is
experiencing local issues to retransmit packets, determining when
to abort a socket is complicated without knowing the starting time of
the recovery since retrans_stamp may remain zero.
This complication causes sub-optimal behavior in which TCP may use the
latest, instead of the first, retransmission time to compute the
elapsed time of a stalling connection due to local issues. Then TCP
may disregard TCP retries settings and keep retrying until it finally
succeeds: not a good idea when the local host is already strained.
The simple fix is to always timestamp the start of a recovery.
It's worth noting that retrans_stamp is also used to compare echo
timestamp values to detect spurious recovery. This patch does
not break that because retrans_stamp is still later than when the
original packet was sent.
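As a hedged illustration of what the change buys (a hypothetical helper,
not part of this patch): once retrans_stamp is stamped on the first
attempted retransmit, the elapsed recovery time can be derived from it
directly, even when every retransmit so far failed to leave the host:

static u32 recovery_elapsed_ms(const struct sock *sk)
{
	const struct tcp_sock *tp = tcp_sk(sk);

	/* tcp_time_stamp() and retrans_stamp use the same millisecond
	 * clock, so this is the time since the recovery started. */
	return tcp_time_stamp(tp) - tp->retrans_stamp;
}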
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reviewed-by: Neal Cardwell <ncardwell(a)google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Conflicts:
net/ipv4/tcp_timer.c
Signed-off-by: Jiazhenyuan <jiazhenyuan@uniontech> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/ipv4/tcp_output.c | 9 ++++-----
net/ipv4/tcp_timer.c | 23 +++--------------------
2 files changed, 7 insertions(+), 25 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 97b9d671a83c9..6710056fd1b23 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3011,13 +3011,12 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
#endif
TCP_SKB_CB(skb)->sacked |= TCPCB_RETRANS;
tp->retrans_out += tcp_skb_pcount(skb);
-
- /* Save stamp of the first retransmit. */
- if (!tp->retrans_stamp)
- tp->retrans_stamp = tcp_skb_timestamp(skb);
-
}
+ /* Save stamp of the first (attempted) retransmit. */
+ if (!tp->retrans_stamp)
+ tp->retrans_stamp = tcp_skb_timestamp(skb);
+
if (tp->undo_retrans < 0)
tp->undo_retrans = 0;
tp->undo_retrans += tcp_skb_pcount(skb);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 681882a409686..8435cbad337d8 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -22,27 +22,13 @@
#include <linux/gfp.h>
#include <net/tcp.h>
-static u32 tcp_retransmit_stamp(const struct sock *sk)
-{
- u32 start_ts = tcp_sk(sk)->retrans_stamp;
-
- if (unlikely(!start_ts)) {
- struct sk_buff *head = tcp_rtx_queue_head(sk);
-
- if (!head)
- return 0;
- start_ts = tcp_skb_timestamp(head);
- }
- return start_ts;
-}
-
static u32 tcp_clamp_rto_to_user_timeout(const struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
u32 elapsed, start_ts;
- start_ts = tcp_retransmit_stamp(sk);
- if (!icsk->icsk_user_timeout || !start_ts)
+ start_ts = tcp_sk(sk)->retrans_stamp;
+ if (!icsk->icsk_user_timeout)
return icsk->icsk_rto;
elapsed = tcp_time_stamp(tcp_sk(sk)) - start_ts;
if (elapsed >= icsk->icsk_user_timeout)
@@ -196,10 +182,7 @@ static bool retransmits_timed_out(struct sock *sk,
if (!inet_csk(sk)->icsk_retransmits)
return false;
- start_ts = tcp_retransmit_stamp(sk);
- if (!start_ts)
- return false;
-
+ start_ts = tcp_sk(sk)->retrans_stamp;
if (likely(timeout == 0)) {
linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);
--
2.25.1
[PATCH openEuler-1.0-LTS 01/49] perf/x86/intel/pt: Fix mask of num_address_ranges
by Yang Yingliang 12 Nov '21
From: Xiaoyao Li <xiaoyao.li(a)intel.com>
stable inclusion
from linux-4.19.207
commit f109e8a1678ce920b7f0df7865f2a31754ec5d1c
--------------------------------
[ Upstream commit c53c6b7409f4cd9e542991b53d597fbe2751d7db ]
Per SDM, bit 2:0 of CPUID(0x14,1).EAX[2:0] reports the number of
configurable address ranges for filtering, not bit 1:0.
Signed-off-by: Xiaoyao Li <xiaoyao.li(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Link: https://lkml.kernel.org/r/20210824040622.4081502-1-xiaoyao.li@intel.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/x86/events/intel/pt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index db969f3f175cb..774fb0f0bf6df 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -69,7 +69,7 @@ static struct pt_cap_desc {
PT_CAP(topa_multiple_entries, 0, CPUID_ECX, BIT(1)),
PT_CAP(single_range_output, 0, CPUID_ECX, BIT(2)),
PT_CAP(payloads_lip, 0, CPUID_ECX, BIT(31)),
- PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x3),
+ PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x7),
PT_CAP(mtc_periods, 1, CPUID_EAX, 0xffff0000),
PT_CAP(cycle_thresholds, 1, CPUID_EBX, 0xffff),
PT_CAP(psb_periods, 1, CPUID_EBX, 0xffff0000),
--
2.25.1
[PATCH openEuler-1.0-LTS 1/9] Revert "EMMC: fix ascend hisi emmc probe failed problem according to mmc_host struct"
by Yang Yingliang 12 Nov '21
From: zhangguijiang <zhangguijiang(a)huawei.com>
ascend inclusion
category: bugfix
feature: Ascend emmc adaptation
bugzilla: https://gitee.com/openeuler/kernel/issues/I4F4LL
CVE: NA
--------------------
This reverts commit 80e65cb9c78f00090a95e6db3126e3e60ec2804d.
Signed-off-by: zhangguijiang <zhangguijiang(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Reviewed-by: YANHONG LU <luyanhong.luyanhong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/mmc/core/sd.c | 38 +++++++++++++++++++++++++++++---
drivers/mmc/host/dw_mmc_extern.h | 2 +-
include/linux/mmc/host.h | 2 +-
3 files changed, 37 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 8760b749292b8..20ca371b9f874 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -1163,9 +1163,9 @@ static int _mmc_sd_suspend(struct mmc_host *host)
err = mmc_deselect_cards(host);
if (!err) {
- if (!mmc_is_ascend_customized(host->parent))
+ if (!(mmc_is_ascend_customized(host->parent)))
mmc_power_off(host);
- else if (!mmc_card_keep_power(host))
+ else if (mmc_card_keep_power(host))
mmc_power_off(host);
mmc_card_set_suspended(host->card);
}
@@ -1269,10 +1269,42 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
return 0;
}
+#ifdef CONFIG_ASCEND_HISI_MMC
+/*********************sd ops begin**********************/
+static int mmc_do_sd_reset(struct mmc_host *host)
+{
+ struct mmc_card *card = host->card;
+
+ if (!host->bus_ops->power_restore)
+ return -EOPNOTSUPP;
+
+ if (!card)
+ return -EINVAL;
+
+ /* hw_reset for ip reset */
+ if (host->ops->hw_reset)
+ host->ops->hw_reset(host);
+
+ /* Only for K930/920 SD slow down clk*/
+ if (host->ops->slowdown_clk)
+ host->ops->slowdown_clk(host, host->ios.timing);
+
+ mmc_power_off(host);
+ mmc_set_clock(host, host->f_init);
+ /* Wait at least 200 ms */
+ mmc_delay(200);
+ mmc_power_up(host, host->card->ocr);
+ (void)mmc_select_voltage(host, host->card->ocr);
+
+ return host->bus_ops->power_restore(host);
+}
+#endif
static int mmc_sd_hw_reset(struct mmc_host *host)
{
+#ifdef CONFIG_ASCEND_HISI_MMC
if (mmc_is_ascend_customized(host->parent))
- return mmc_sd_reset(host);
+ return mmc_do_sd_reset(host);
+#endif
mmc_power_cycle(host, host->card->ocr);
return mmc_sd_init_card(host, host->card->ocr, host->card);
}
diff --git a/drivers/mmc/host/dw_mmc_extern.h b/drivers/mmc/host/dw_mmc_extern.h
index ab077b4955940..04d8c23f39e9a 100644
--- a/drivers/mmc/host/dw_mmc_extern.h
+++ b/drivers/mmc/host/dw_mmc_extern.h
@@ -8,7 +8,7 @@
#include "dw_mmc.h"
-#if defined(CONFIG_MMC_DW_HI3XXX) || defined(CONFIG_MMC_DW_HI3XXX_MODULE)
+#ifdef CONFIG_MMC_DW_HI3XXX_MODULE
extern void dw_mci_reg_dump(struct dw_mci *host);
extern void dw_mci_set_timeout(struct dw_mci *host);
extern bool dw_mci_stop_abort_cmd(struct mmc_command *cmd);
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index fabc23d156242..78b4d0a813b71 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -542,13 +542,13 @@ struct mmc_host {
bool cqe_enabled;
bool cqe_on;
+ unsigned long private[0] ____cacheline_aligned;
#ifdef CONFIG_ASCEND_HISI_MMC
const struct mmc_cmdq_host_ops *cmdq_ops;
int sdio_present;
unsigned int cmdq_slots;
struct mmc_cmdq_context_info cmdq_ctx;
#endif
- unsigned long private[0] ____cacheline_aligned;
};
struct device_node;
--
2.25.1
[PATCH openEuler-1.0-LTS] iommu: support phytium ft2000plus and S2500 iommu function
by Yang Yingliang 11 Nov '21
From: Mao HongBo <maohongbo(a)phytium.com.cn>
phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ
-----------------------------------
Fix an iommu issue with device access in virtualization scenarios
on FT2000plus and S2500.
Convert to the new Phytium cputype macro naming.
Signed-off-by: Mao HongBo <maohongbo(a)phytium.com.cn>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/include/asm/cputype.h | 14 +++++--
.../arm64/include/asm/phytium_machine_types.h | 37 +++++++++++++++++++
arch/arm64/kernel/topology.c | 2 +-
drivers/iommu/arm-smmu.c | 29 +++++++++++++--
drivers/irqchip/irq-gic-v3-its.c | 9 +++++
drivers/pci/quirks.c | 4 ++
drivers/usb/host/xhci-pci.c | 2 +-
7 files changed, 87 insertions(+), 10 deletions(-)
create mode 100644 arch/arm64/include/asm/phytium_machine_types.h
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 23298b0aedaf7..aa38796121300 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -101,8 +101,11 @@
#define HISI_CPU_PART_TSV110 0xD01
#define HISI_CPU_PART_TSV200 0xD02
-#define PHYTIUM_CPU_PART_FTC662 0x662
-#define PHYTIUM_CPU_PART_FTC663 0x663
+#define PHYTIUM_CPU_PART_1500A 0X660
+#define PHYTIUM_CPU_PART_2000AHK 0X661
+#define PHYTIUM_CPU_PART_2000PLUS 0X662
+#define PHYTIUM_CPU_PART_2004 0X663
+#define PHYTIUM_CPU_PART_2500 0X663
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
@@ -124,8 +127,11 @@
#define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
#define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
#define MIDR_HISI_TSV200 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV200)
-#define MIDR_PHYTIUM_FT2000PLUS MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_FTC662)
-#define MIDR_PHYTIUM_FT2500 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_FTC663)
+#define MIDR_FT_1500A MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_1500A)
+#define MIDR_FT_2000AHK MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2000AHK)
+#define MIDR_FT_2000PLUS MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2000PLUS)
+#define MIDR_FT_2004 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2004)
+#define MIDR_FT_2500 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2500)
#ifndef __ASSEMBLY__
diff --git a/arch/arm64/include/asm/phytium_machine_types.h b/arch/arm64/include/asm/phytium_machine_types.h
new file mode 100644
index 0000000000000..fb791988f0cee
--- /dev/null
+++ b/arch/arm64/include/asm/phytium_machine_types.h
@@ -0,0 +1,37 @@
+/*
+ * Authors: Wang Yinfeng <wangyinfenng(a)phytium.com.cn>
+ *
+ * Copyright (C) 2021, PHYTIUM Information Technology Co., Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __PHYTIUM_MACHINE_TYPES_H__
+#define __PHYTIUM_MACHINE_TYPES_H__
+
+#include <asm/cputype.h>
+#include <linux/types.h>
+
+static inline bool phytium_part(u32 cpuid)
+{
+ return ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) == cpuid);
+}
+
+#define typeof_ft1500a() phytium_part(MIDR_FT_1500A)
+#define typeof_ft2000ahk() phytium_part(MIDR_FT_2000AHK)
+#define typeof_ft2000plus() phytium_part(MIDR_FT_2000PLUS)
+#define typeof_ft2004() phytium_part(MIDR_FT_2004)
+#define typeof_s2500() phytium_part(MIDR_FT_2500)
+
+#endif
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 02d3e688d657d..2646695e2f2a4 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -291,7 +291,7 @@ void store_cpu_topology(unsigned int cpuid)
cpuid_topo->package_id = cpu_to_node(cpuid);
/* Some PHYTIUM FT2000PLUS platform firmware has no PPTT table */
- if ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2000PLUS
+ if ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) == MIDR_FT_2000PLUS
&& cpu_to_node(cpuid) == NUMA_NO_NODE) {
cpuid_topo->thread_id = 0;
cpuid_topo->package_id = 0;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8a268ab82ef02..18863198bb036 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -56,6 +56,10 @@
#include "arm-smmu-regs.h"
+#ifdef CONFIG_ARCH_PHYTIUM
+#include <asm/phytium_machine_types.h>
+#endif
+
/*
* Apparently, some Qualcomm arm64 platforms which appear to expose their SMMU
* global register space are still, in fact, using a hypervisor to mediate it
@@ -1407,6 +1411,20 @@ static int arm_smmu_add_device(struct device *dev)
return -ENODEV;
}
+#ifdef CONFIG_ARCH_PHYTIUM
+ /* ft2000+ */
+ if (typeof_ft2000plus()) {
+ int num = fwspec->num_ids;
+
+ for (i = 0; i < num; i++) {
+#define FWID_READ(id) (((u16)(id) >> 3) | (((id) >> SMR_MASK_SHIFT | 0x7000) << SMR_MASK_SHIFT))
+ u32 fwid = FWID_READ(fwspec->ids[i]);
+
+ iommu_fwspec_add_ids(dev, &fwid, 1);
+ }
+ }
+#endif
+
ret = -EINVAL;
for (i = 0; i < fwspec->num_ids; i++) {
u16 sid = fwspec->ids[i];
@@ -1481,6 +1499,12 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
if (group && smmu->s2crs[idx].group &&
group != smmu->s2crs[idx].group)
return ERR_PTR(-EINVAL);
+#ifdef CONFIG_ARCH_PHYTIUM
+ if (typeof_s2500())
+ break;
+ if (typeof_ft2000plus() && !smmu->s2crs[idx].group)
+ continue;
+#endif
group = smmu->s2crs[idx].group;
}
@@ -1614,10 +1638,7 @@ static void arm_smmu_put_resv_regions(struct device *dev,
#include <asm/cputype.h>
static int phytium_smmu_def_domain_type(struct device *dev, unsigned int *type)
{
- u32 midr = read_cpuid_id();
-
- if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2000PLUS)
- || ((midr & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2500)) {
+ if (typeof_ft2000plus() || typeof_s2500()) {
*type = IOMMU_DOMAIN_IDENTITY;
return 0;
}
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 1a2ecfa23fd8c..79648a2139412 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -51,6 +51,10 @@
#include "irq-gic-common.h"
+#ifdef CONFIG_ARCH_PHYTIUM
+#include <asm/phytium_machine_types.h>
+#endif
+
#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0)
#define ITS_FLAGS_WORKAROUND_CAVIUM_22375 (1ULL << 1)
#define ITS_FLAGS_WORKAROUND_CAVIUM_23144 (1ULL << 2)
@@ -1223,6 +1227,11 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data = its_get_event_id(d);
+#ifdef CONFIG_ARCH_PHYTIUM
+ if (typeof_ft2000plus())
+ return;
+#endif
+
iommu_dma_map_msi_msg(d->irq, msg);
}
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 961003c6dc807..99657b9bc82e0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4646,6 +4646,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_ZHAOXIN, 0x9083, pci_quirk_mf_endpoint_acs },
/* Zhaoxin Root/Downstream Ports */
{ PCI_VENDOR_ID_ZHAOXIN, PCI_ANY_ID, pci_quirk_zhaoxin_pcie_ports_acs },
+ /* because PLX switch Vendor id is 0x10b5 on phytium cpu */
+ { 0x10b5, PCI_ANY_ID, pci_quirk_xgene_acs },
+ /* because rootcomplex Vendor id is 0x17cd on phytium cpu */
+ { 0x17cd, PCI_ANY_ID, pci_quirk_xgene_acs },
{ 0 }
};
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 886cd6fc4b640..238a32cc311c9 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -313,7 +313,7 @@ static void phytium_xhci_pci_workaround(struct pci_dev *dev)
u32 midr = read_cpuid_id();
/* Firmware bug, DMA mask is not reported by the firmware */
- if ((midr & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2000PLUS)
+ if ((midr & MIDR_CPU_MODEL_MASK) == MIDR_FT_2000PLUS)
dma_set_mask(&dev->dev, DMA_BIT_MASK(64));
}
#else
--
2.25.1
[PATCH openEuler-1.0-LTS] arm64: Errata: fix kabi changed by cpu_errata and enable idc
by Yang Yingliang 11 Nov '21
From: Weilong Chen <chenweilong(a)huawei.com>
ascend inclusion
category: feature
bugzilla: 46922
CVE: NA
-------------------------------------
Patch "cache: Workaround HiSilicon Taishan DC CVAU"
breaks the kabi symbols:
cpu_hwcaps
cpu_hwcap_keys
Patch "arm64: Errata: fix kabi changed by cpu_errata" try to fix it
but incomplete. Eable IDC on platform TSV{110,200}.
Signed-off-by: Weilong Chen <chenweilong(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/kernel/cpufeature.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 822e6a2c0af1d..1bf9d84265de2 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -912,6 +912,19 @@ static bool has_cache_idc(const struct arm64_cpu_capabilities *entry,
{
u64 ctr;
+#ifndef CONFIG_HISILICON_ERRATUM_1980005
+ /* Fix kABI compatible for CONFIG_HISILICON_ERRATUM_1980005 */
+ static const struct midr_range idc_support_list[] = {
+ MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
+ MIDR_REV(MIDR_HISI_TSV200, 1, 0),
+ { /* sentinel */ }
+ };
+ if (is_midr_in_range_list(read_cpuid_id(), idc_support_list)) {
+ pr_info("CPU features: detected: Taishan IDC coherence workaround\n");
+ return true;
+ }
+#endif
+
if (scope == SCOPE_SYSTEM)
ctr = arm64_ftr_reg_ctrel0.sys_val;
else
--
2.25.1
11 Nov '21
From: Xunlei Pang <xlpang(a)linux.alibaba.com>
Export "cpu|io|memory.pressure" to cgroup v1 "cpuacct" subsystem.
hulk inclusion
category: feature
bugzilla: 182979 https://gitee.com/openeuler/kernel/issues/I4HOX6
------------------------------------------
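As a usage sketch, once exported the files read like any other PSI
interface; the mount point below is an assumption for a typical v1
setup, not something this patch mandates:

/* Userspace sketch: dump the CPU pressure of the cpuacct root group. */
#include <stdio.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/sys/fs/cgroup/cpuacct/cpu.pressure", "r");

	if (!f)
		return 1;
	/* e.g. "some avg10=0.00 avg60=0.00 avg300=0.00 total=0" */
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}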
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Signed-off-by: Xunlei Pang <xlpang(a)linux.alibaba.com>
Signed-off-by: Chen Wandun <chenwandun(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
init/Kconfig | 10 ++++++++++
kernel/cgroup/cgroup.c | 28 ++++++++++++++++++++++++++++
kernel/sched/cpuacct.c | 10 ++++++++++
kernel/sched/psi.c | 4 ++++
kernel/sched/sched.h | 4 ++++
5 files changed, 56 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index 04bc46ca0b9e..0afdb08131eb 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -635,6 +635,16 @@ config PSI_DEFAULT_DISABLED
Say N if unsure.
+config PSI_CGROUP_V1
+ bool "Support PSI under cgroup v1"
+ default Y
+ depends on PSI
+ help
+ If set, pressure stall information tracking will be used
+ for cgroup v1 other than v2.
+
+ Say N if unsure.
+
endmenu # "CPU/Task time and stats accounting"
config CPU_ISOLATION
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 701ef7ba4f95..86ab4a1305f6 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3659,6 +3659,34 @@ static void cgroup_pressure_release(struct kernfs_open_file *of)
{
psi_trigger_replace(&of->priv, NULL);
}
+
+struct cftype cgroup_v1_psi_files[] = {
+ {
+ .name = "io.pressure",
+ .flags = CFTYPE_NO_PREFIX,
+ .seq_show = cgroup_io_pressure_show,
+ .write = cgroup_io_pressure_write,
+ .poll = cgroup_pressure_poll,
+ .release = cgroup_pressure_release,
+ },
+ {
+ .name = "memory.pressure",
+ .flags = CFTYPE_NO_PREFIX,
+ .seq_show = cgroup_memory_pressure_show,
+ .write = cgroup_memory_pressure_write,
+ .poll = cgroup_pressure_poll,
+ .release = cgroup_pressure_release,
+ },
+ {
+ .name = "cpu.pressure",
+ .flags = CFTYPE_NO_PREFIX,
+ .seq_show = cgroup_cpu_pressure_show,
+ .write = cgroup_cpu_pressure_write,
+ .poll = cgroup_pressure_poll,
+ .release = cgroup_pressure_release,
+ },
+ { } /* terminate */
+};
#endif /* CONFIG_PSI */
static int cgroup_freeze_show(struct seq_file *seq, void *v)
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 941c28cf9738..4e5488659339 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -374,3 +374,13 @@ struct cgroup_subsys cpuacct_cgrp_subsys = {
.legacy_cftypes = files,
.early_init = true,
};
+
+#ifdef CONFIG_PSI
+static int __init cgroup_v1_psi_init(void)
+{
+ cgroup_add_legacy_cftypes(&cpuacct_cgrp_subsys, cgroup_v1_psi_files);
+ return 0;
+}
+
+late_initcall_sync(cgroup_v1_psi_init);
+#endif
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d50a31ecedee..0b48a74cbfac 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -752,7 +752,11 @@ static struct psi_group *iterate_groups(struct task_struct *task, void **iter)
struct cgroup *cgroup = NULL;
if (!*iter)
+#ifdef CONFIG_PSI_CGROUP_V1
+ cgroup = task_cgroup(task, cpuacct_cgrp_id);
+#else
cgroup = task->cgroups->dfl_cgrp;
+#endif
else if (*iter == &psi_system)
return NULL;
else
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4ce573eeca4c..e29f051f824f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2585,6 +2585,10 @@ unsigned long scale_irq_capacity(unsigned long util, unsigned long irq, unsigned
}
#endif
+#ifdef CONFIG_PSI
+extern struct cftype cgroup_v1_psi_files[];
+#endif
+
#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
#define perf_domain_span(pd) (to_cpumask(((pd)->em_pd->cpus)))
--
2.20.1
1
58

[PATCH openEuler-1.0-LTS] blk-mq: don't free tags if the tag_set is used by other device in queue initialztion
by Yang Yingliang 10 Nov '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.16
commit a846a8e6c9a5949582c5a6a8bbc83a7d27fd891e
category: bugfix
bugzilla: 185668
CVE: NA
-----------------------------------------------
We got a UAF report on v5.10 as follows:
[ 1446.674930] ==================================================================
[ 1446.675970] BUG: KASAN: use-after-free in blk_mq_get_driver_tag+0x9a4/0xa90
[ 1446.676902] Read of size 8 at addr ffff8880185afd10 by task kworker/1:2/12348
[ 1446.677851]
[ 1446.678073] CPU: 1 PID: 12348 Comm: kworker/1:2 Not tainted 5.10.0-10177-gc9c81b1e346a #2
[ 1446.679168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 1446.680692] Workqueue: kthrotld blk_throtl_dispatch_work_fn
[ 1446.681448] Call Trace:
[ 1446.681800] dump_stack+0x9b/0xce
[ 1446.682916] print_address_description.constprop.6+0x3e/0x60
[ 1446.685999] kasan_report.cold.9+0x22/0x3a
[ 1446.687186] blk_mq_get_driver_tag+0x9a4/0xa90
[ 1446.687785] blk_mq_dispatch_rq_list+0x21a/0x1d40
[ 1446.692576] __blk_mq_do_dispatch_sched+0x394/0x830
[ 1446.695758] __blk_mq_sched_dispatch_requests+0x398/0x4f0
[ 1446.698279] blk_mq_sched_dispatch_requests+0xdf/0x140
[ 1446.698967] __blk_mq_run_hw_queue+0xc0/0x270
[ 1446.699561] __blk_mq_delay_run_hw_queue+0x4cc/0x550
[ 1446.701407] blk_mq_run_hw_queue+0x13b/0x2b0
[ 1446.702593] blk_mq_sched_insert_requests+0x1de/0x390
[ 1446.703309] blk_mq_flush_plug_list+0x4b4/0x760
[ 1446.705408] blk_flush_plug_list+0x2c5/0x480
[ 1446.708471] blk_finish_plug+0x55/0xa0
[ 1446.708980] blk_throtl_dispatch_work_fn+0x23b/0x2e0
[ 1446.711236] process_one_work+0x6d4/0xfe0
[ 1446.711778] worker_thread+0x91/0xc80
[ 1446.713400] kthread+0x32d/0x3f0
[ 1446.714362] ret_from_fork+0x1f/0x30
[ 1446.714846]
[ 1446.715062] Allocated by task 1:
[ 1446.715509] kasan_save_stack+0x19/0x40
[ 1446.716026] __kasan_kmalloc.constprop.1+0xc1/0xd0
[ 1446.716673] blk_mq_init_tags+0x6d/0x330
[ 1446.717207] blk_mq_alloc_rq_map+0x50/0x1c0
[ 1446.717769] __blk_mq_alloc_map_and_request+0xe5/0x320
[ 1446.718459] blk_mq_alloc_tag_set+0x679/0xdc0
[ 1446.719050] scsi_add_host_with_dma.cold.3+0xa0/0x5db
[ 1446.719736] virtscsi_probe+0x7bf/0xbd0
[ 1446.720265] virtio_dev_probe+0x402/0x6c0
[ 1446.720808] really_probe+0x276/0xde0
[ 1446.721320] driver_probe_device+0x267/0x3d0
[ 1446.721892] device_driver_attach+0xfe/0x140
[ 1446.722491] __driver_attach+0x13a/0x2c0
[ 1446.723037] bus_for_each_dev+0x146/0x1c0
[ 1446.723603] bus_add_driver+0x3fc/0x680
[ 1446.724145] driver_register+0x1c0/0x400
[ 1446.724693] init+0xa2/0xe8
[ 1446.725091] do_one_initcall+0x9e/0x310
[ 1446.725626] kernel_init_freeable+0xc56/0xcb9
[ 1446.726231] kernel_init+0x11/0x198
[ 1446.726714] ret_from_fork+0x1f/0x30
[ 1446.727212]
[ 1446.727433] Freed by task 26992:
[ 1446.727882] kasan_save_stack+0x19/0x40
[ 1446.728420] kasan_set_track+0x1c/0x30
[ 1446.728943] kasan_set_free_info+0x1b/0x30
[ 1446.729517] __kasan_slab_free+0x111/0x160
[ 1446.730084] kfree+0xb8/0x520
[ 1446.730507] blk_mq_free_map_and_requests+0x10b/0x1b0
[ 1446.731206] blk_mq_realloc_hw_ctxs+0x8cb/0x15b0
[ 1446.731844] blk_mq_init_allocated_queue+0x374/0x1380
[ 1446.732540] blk_mq_init_queue_data+0x7f/0xd0
[ 1446.733155] scsi_mq_alloc_queue+0x45/0x170
[ 1446.733730] scsi_alloc_sdev+0x73c/0xb20
[ 1446.734281] scsi_probe_and_add_lun+0x9a6/0x2d90
[ 1446.734916] __scsi_scan_target+0x208/0xc50
[ 1446.735500] scsi_scan_channel.part.3+0x113/0x170
[ 1446.736149] scsi_scan_host_selected+0x25a/0x360
[ 1446.736783] store_scan+0x290/0x2d0
[ 1446.737275] dev_attr_store+0x55/0x80
[ 1446.737782] sysfs_kf_write+0x132/0x190
[ 1446.738313] kernfs_fop_write_iter+0x319/0x4b0
[ 1446.738921] new_sync_write+0x40e/0x5c0
[ 1446.739429] vfs_write+0x519/0x720
[ 1446.739877] ksys_write+0xf8/0x1f0
[ 1446.740332] do_syscall_64+0x2d/0x40
[ 1446.740802] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1446.741462]
[ 1446.741670] The buggy address belongs to the object at ffff8880185afd00
[ 1446.741670] which belongs to the cache kmalloc-256 of size 256
[ 1446.743276] The buggy address is located 16 bytes inside of
[ 1446.743276] 256-byte region [ffff8880185afd00, ffff8880185afe00)
[ 1446.744765] The buggy address belongs to the page:
[ 1446.745416] page:ffffea0000616b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x185ac
[ 1446.746694] head:ffffea0000616b00 order:2 compound_mapcount:0 compound_pincount:0
[ 1446.747719] flags: 0x1fffff80010200(slab|head)
[ 1446.748337] raw: 001fffff80010200 ffffea00006a3208 ffffea000061bf08 ffff88801004f240
[ 1446.749404] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
[ 1446.750455] page dumped because: kasan: bad access detected
[ 1446.751227]
[ 1446.751445] Memory state around the buggy address:
[ 1446.752102] ffff8880185afc00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1446.753090] ffff8880185afc80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1446.754079] >ffff8880185afd00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1446.755065] ^
[ 1446.755589] ffff8880185afd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1446.756574] ffff8880185afe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1446.757566] ==================================================================
Flag 'BLK_MQ_F_TAG_QUEUE_SHARED' will be set if the second device on the
same host initializes its queue successfully. However, if the second
device fails to allocate memory in blk_mq_alloc_and_init_hctx() from
blk_mq_realloc_hw_ctxs() from blk_mq_init_allocated_queue(),
__blk_mq_free_map_and_rqs() will be called on the error path, and if
'BLK_MQ_TAG_HCTX_SHARED' is not set, 'tag_set->tags' will be freed
while it is still used by the first device.
To fix this issue, move the release of the newly allocated hardware
contexts from blk_mq_realloc_hw_ctxs() to __blk_mq_update_nr_hw_queues(),
as there is no need to release hardware contexts in
blk_mq_init_allocated_queue().
Fixes: 868f2f0b7206 ("blk-mq: dynamic h/w context count")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Ming Lei <ming.lei(a)redhat.com>
Link: https://lore.kernel.org/r/20211108074019.1058843-1-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
block/blk-mq.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-mq.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 55c81dcafbdc2..ef62a83314a5d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2804,8 +2804,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
struct blk_mq_hw_ctx *hctx = hctxs[j];
if (hctx) {
- if (hctx->tags)
- blk_mq_free_map_and_requests(set, j);
blk_mq_exit_hctx(q, set, hctx, j);
hctxs[j] = NULL;
}
@@ -3236,8 +3234,13 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
list_for_each_entry(q, &set->tag_list, tag_set_list) {
blk_mq_realloc_hw_ctxs(set, q);
if (q->nr_hw_queues != set->nr_hw_queues) {
+ int i = prev_nr_hw_queues;
+
pr_warn("Increasing nr_hw_queues to %d fails, fallback to %d\n",
nr_hw_queues, prev_nr_hw_queues);
+ for (; i < set->nr_hw_queues; i++)
+ blk_mq_free_map_and_requests(set, i);
+
set->nr_hw_queues = prev_nr_hw_queues;
blk_mq_map_queues(set);
goto fallback;
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] nbd: add a flush_workqueue in nbd_start_device
by Yang Yingliang 10 Nov '21
From: Sun Ke <sunke32(a)huawei.com>
mainline inclusion
from mainline-v5.6-rc1
commit 5c0dd228b5fc30a3b732c7ae2657e0161ec7ed80
category: bugfix
bugzilla: 185690
CVE: NA
-----------------------------------------------
When kzalloc() fails, we may end up trying to destroy the
workqueue from inside the workqueue.
If num_connections is m (2 < m), and the NO.1 ~ NO.n
(1 < n < m) kzalloc() calls succeed while the NO.(n + 1)
fails, nbd_start_device() will return -ENOMEM
to nbd_start_device_ioctl(), and nbd_start_device_ioctl()
will return immediately without running flush_workqueue().
However, we still have n recv threads. If nbd_release()
runs first, the recv threads may have to drop the last
config_refs and try to destroy the workqueue from
inside the workqueue.
To fix it, add a flush_workqueue() in nbd_start_device().
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke <sunke32(a)huawei.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/block/nbd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 33a52be762d24..775cbb4c1bbcd 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1306,6 +1306,16 @@ static int nbd_start_device(struct nbd_device *nbd)
args = kzalloc(sizeof(*args), GFP_KERNEL);
if (!args) {
sock_shutdown(nbd);
+ /*
+ * If num_connections is m (2 < m),
+ * and NO.1 ~ NO.n(1 < n < m) kzallocs are successful.
+ * But NO.(n + 1) failed. We still have n recv threads.
+ * So, add flush_workqueue here to prevent recv threads
+ * dropping the last config_refs and trying to destroy
+ * the workqueue from inside the workqueue.
+ */
+ if (i)
+ flush_workqueue(nbd->recv_workq);
return -ENOMEM;
}
sk_set_memalloc(config->socks[i]->sock->sk);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] bpf, cgroup: Assign cgroup in cgroup_sk_alloc when called from interrupt
by Yang Yingliang 10 Nov '21
From: Daniel Borkmann <daniel(a)iogearbox.net>
mainline inclusion
from mainline-v5.15-rc4
commit 78cc316e9583067884eb8bd154301dc1e9ee945c
category: bugfix
bugzilla: 184472
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
If cgroup_sk_alloc() is called from interrupt context, then just assign the
root cgroup to skcd->cgroup. Prior to commit 8520e224f547 ("bpf, cgroups:
Fix cgroup v2 fallback on v1/v2 mixed mode") we would just return, and later
on in sock_cgroup_ptr(), we were NULL-testing the cgroup in fast-path, and
iff indeed NULL returning the root cgroup (v ?: &cgrp_dfl_root.cgrp). Rather
than re-adding the NULL-test to the fast-path we can just assign it once from
cgroup_sk_alloc() given v1/v2 handling has been simplified. The migration from
NULL test with returning &cgrp_dfl_root.cgrp to assigning &cgrp_dfl_root.cgrp
directly does /not/ change behavior for callers of sock_cgroup_ptr().
syzkaller was able to trigger a splat in the legacy netrom code base, where
the RX handler in nr_rx_frame() calls nr_make_new() which calls sk_alloc()
and therefore cgroup_sk_alloc() with in_interrupt() condition. Thus the NULL
skcd->cgroup, where it trips over on cgroup_sk_free() side given it expects
a non-NULL object. There are a few other candidates aside from netrom which
have similar pattern where in their accept-like implementation, they just call
to sk_alloc() and thus cgroup_sk_alloc() instead of sk_clone_lock() with the
corresponding cgroup_sk_clone() which then inherits the cgroup from the parent
socket. None of them are related to core protocols where BPF cgroup programs
are running from. However, in future, they should follow to implement a similar
inheritance mechanism.
Additionally, with a !CONFIG_CGROUP_NET_PRIO and !CONFIG_CGROUP_NET_CLASSID
configuration, the same issue was exposed also prior to 8520e224f547 due to
commit e876ecc67db8 ("cgroup: memcg: net: do not associate sock with unrelated
cgroup") which added the early in_interrupt() return back then.
Fixes: 8520e224f547 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode")
Fixes: e876ecc67db8 ("cgroup: memcg: net: do not associate sock with unrelated cgroup")
Reported-by: syzbot+df709157a4ecaf192b03(a)syzkaller.appspotmail.com
Reported-by: syzbot+533f389d4026d86a2a95(a)syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Tested-by: syzbot+df709157a4ecaf192b03(a)syzkaller.appspotmail.com
Tested-by: syzbot+533f389d4026d86a2a95(a)syzkaller.appspotmail.com
Acked-by: Tejun Heo <tj(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210927123921.21535-1-daniel@iogearbox.net
Conflicts:
kernel/cgroup/cgroup.c
Signed-off-by: Lu Jialin <lujialin4(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/cgroup/cgroup.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 682c5e231bddc..7897f1ab77266 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5935,8 +5935,11 @@ void cgroup_sk_alloc(struct sock_cgroup_data *skcd)
}
/* Don't associate the sock with unrelated interrupted task's cgroup. */
- if (in_interrupt())
+ if (in_interrupt()) {
+ cgroup_get(&cgrp_dfl_root.cgrp);
+ skcd->val = (unsigned long)&cgrp_dfl_root.cgrp;
return;
+ }
rcu_read_lock();
--
2.25.1
[PATCH openEuler-1.0-LTS 1/7] sctp: use init_tag from inithdr for ABORT chunk
by Yang Yingliang 10 Nov '21
From: Xin Long <lucien.xin(a)gmail.com>
mainline inclusion
from mainline-v5.15
commit 4f7019c7eb33967eb87766e0e4602b5576873680
category: bugfix
bugzilla: NA
CVE: CVE-2021-3772
-------------------------------------------------
Currently Linux SCTP uses the verification tag of the existing SCTP
asoc when failing to process and sending the packet with the ABORT
chunk. This will result in the peer accepting the ABORT chunk and
removing the SCTP asoc. One could exploit this to terminate a SCTP
asoc.
This patch is to fix it by always using the initiate tag of the
received INIT chunk for the ABORT chunk to be sent.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Huang Guobin <huangguobin4(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/sctp/sm_statefuns.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 19bd14a4eb07e..9ae0676afd5d4 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -6261,6 +6261,7 @@ static struct sctp_packet *sctp_ootb_pkt_new(
* yet.
*/
switch (chunk->chunk_hdr->type) {
+ case SCTP_CID_INIT:
case SCTP_CID_INIT_ACK:
{
struct sctp_initack_chunk *initack;
--
2.25.1
10 Nov '21
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4AH11?from=project-issue
CVE: NA
In drivers/md/md.c, when autorun_array() is called, do_md_run() runs
first and then do_md_stop(); in that case the pointer mddev->private
may be freed twice.
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
drivers/md/md.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4406bb137a27..aa204ec74066 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6000,8 +6000,10 @@ static void __md_stop(struct mddev *mddev)
spin_lock(&mddev->lock);
mddev->pers = NULL;
spin_unlock(&mddev->lock);
- pers->free(mddev, mddev->private);
- mddev->private = NULL;
+ if (mddev->private) {
+ pers->free(mddev, mddev->private);
+ mddev->private = NULL;
+ }
if (pers->sync_request && mddev->to_remove == NULL)
mddev->to_remove = &md_redundancy_group;
module_put(pers->owner);
--
2.30.0
[PATCH kernel-4.19] bios_parser.c: fix uninitialized variable in device_type_from_device_id
by shenzijun 10 Nov '21
From: 沈子俊 <shenzijun(a)kylinos.cn>
kylin inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4AH6U?from=project-issue
CVE: NA
-----------------------------------------------------------------------------
Initialize result_device_id.raw_device_tag so it is never used uninitialized.
Signed-off-by: 沈子俊 <shenzijun(a)kylinos.cn>
---
drivers/gpu/drm/amd/display/dc/bios/bios_parser.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
index be8a2494355a..9375757fb640 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
@@ -2450,6 +2450,7 @@ static struct device_id device_type_from_device_id(uint16_t device_id)
{
struct device_id result_device_id;
+ result_device_id.raw_device_tag = 0;
switch (device_id) {
case ATOM_DEVICE_LCD1_SUPPORT:
--
2.30.0
Re: [PATCH openEuler-1.0-LTS 0/6] Fix the problem that the number of tcp timeout retransmissions is lost
by QiuLaibin 09 Nov '21
hi zhenyuan:
Regarding the patch set "Fix the problem that the number of tcp timeout
retransmissions is lost" you submitted earlier, the committer gave the
following review comments during our internal review:
The switch to EDT in mainline involves several tcp and fq patchsets and
bugfixes; the suggestion here is to first merge only the last 3 patches,
which resolve this problem.
So please confirm whether merging only the last three patches is enough
to solve the problem you are currently seeing.
best regards
On 2021/10/22 18:52, Jiazhenyuan wrote:
>
>
>
> From: Jiazhenyuan <jiazhenyuan(a)uniontech.com>
>
> issue: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue
>
> jiazhenyuan (6):
> tcp: switch tcp and sch_fq to new earliest departure time
> net_sched: sch_fq: ensure maxrate fq parameter applies to EDT flows
>   tcp: address problems caused by EDT misshaps (mainline commit 9efdda4e3abe)
>   tcp: always set retrans_stamp on recovery (mainline commit 7ae189759cc4)
>   tcp: create a helper to model exponential backoff (mainline commit 01a523b07161)
>   tcp: adjust rto_base in retransmits_timed_out() (mainline commit 3256a2d6ab1f)
>
> net/ipv4/tcp_bbr.c | 7 +++--
> net/ipv4/tcp_input.c | 17 +++++++-----
> net/ipv4/tcp_output.c | 29 ++++++++++++++------
> net/ipv4/tcp_timer.c | 64 ++++++++++++++++++++-----------------------
> net/sched/sch_fq.c | 46 ++++++++++++++++++-------------
> 5 files changed, 92 insertions(+), 71 deletions(-)
>
> --
> 2.27.0
>
_______________________________________________
Kernel mailing list -- kernel(a)openeuler.org
To unsubscribe send an email to kernel-leave(a)openeuler.org
[PATCH openEuler-1.0-LTS 01/18] drivers/perf: arm_spe: Don't error on high-order pages for aux buf
by Yang Yingliang 09 Nov '21
From: Will Deacon <will.deacon(a)arm.com>
mainline inclusion
from mainline-v5.1-rc3
commit 14ae42a6f0b13130a97d94d23481128961de5d38
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4A1XO
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
------------------------------------------------------------------------
Since commit 5768402fd9c6 ("perf/ring_buffer: Use high order allocations
for AUX buffers optimistically"), the perf core tends to back aux buffer
allocations with high-order pages with the order encoded in the
PagePrivate data. The Arm SPE driver explicitly rejects such pages,
causing the perf tool to fail with:
| failed to mmap with 12 (Cannot allocate memory)
In actual fact, we can simply treat these pages just like any other
since the perf core takes care to populate the page array appropriately.
In theory we could try to map with PMDs where possible, but for now,
let's just get things working again.
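As an aside, a hedged sketch (not needed by the fix itself) of how the
encoded order could be recovered if a driver did care:

/* The perf core stores the allocation order in the head page's
 * private field for high-order AUX buffers. */
static unsigned int aux_page_order(void *va)
{
	struct page *page = virt_to_page(va);

	return PagePrivate(page) ? page_private(page) : 0;
}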
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Fixes: 5768402fd9c6 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
Reported-by: Hanjun Guo <guohanjun(a)huawei.com>
Tested-by: Hanjun Guo <guohanjun(a)huawei.com>
Tested-by: Sudeep Holla <sudeep.holla(a)arm.com>
Signed-off-by: Will Deacon <will.deacon(a)arm.com>
Signed-off-by: Qi Liu <liuqi115(a)huawei.com>
Reviewed-by: Yang Jihong <yangjihong1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/perf/arm_spe_pmu.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index a11951b083307..4fb65c61c8eab 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -856,16 +856,8 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
if (!pglist)
goto out_free_buf;
- for (i = 0; i < nr_pages; ++i) {
- struct page *page = virt_to_page(pages[i]);
-
- if (PagePrivate(page)) {
- pr_warn("unexpected high-order page for auxbuf!");
- goto out_free_pglist;
- }
-
+ for (i = 0; i < nr_pages; ++i)
pglist[i] = virt_to_page(pages[i]);
- }
buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
if (!buf->base)
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] drm/hisilicon: Support i2c driver algorithms for bit-shift adapters
by Yang Yingliang 09 Nov '21
From: Tian Tao <tiantao6(a)hisilicon.com>
mainline inclusion
from mainline-v5.14.0-rc7
commit 4eb4d99dfe3018d86f4529112aa7082f43b6996a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I469VQ
CVE: NA
Remove #include <drm/drm_probe_helper.h> in hibmc_drm_i2c.c.
-----------------------------
Add a driver implementation to support i2c driver algorithms for
bit-shift adapters, so that hibmc can use the interface provided by
drm to read the EDID.
Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1600778670-60370-2-git-send-e…
Signed-off-by: gouhao <gouhao(a)uniontech.com>
Reviewed-by: tian tao <tiantao6(a)hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/gpu/drm/hisilicon/hibmc/Makefile | 3 +-
.../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h | 25 +++++
.../gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c | 98 +++++++++++++++++++
3 files changed, 125 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
diff --git a/drivers/gpu/drm/hisilicon/hibmc/Makefile b/drivers/gpu/drm/hisilicon/hibmc/Makefile
index 3df726696372f..71c248f4c7562 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/Makefile
+++ b/drivers/gpu/drm/hisilicon/hibmc/Makefile
@@ -1,3 +1,4 @@
-hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_de.o hibmc_drm_vdac.o hibmc_drm_fbdev.o hibmc_ttm.o
+hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_de.o hibmc_drm_vdac.o \
+ hibmc_drm_fbdev.o hibmc_ttm.o hibmc_drm_i2c.o
obj-$(CONFIG_DRM_HISI_HIBMC) += hibmc-drm.o
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
index 4395dc6674bbc..c246151b29942 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
@@ -19,12 +19,18 @@
#ifndef HIBMC_DRM_DRV_H
#define HIBMC_DRM_DRV_H
+#include <linux/gpio/consumer.h>
+#include <linux/i2c-algo-bit.h>
+#include <linux/i2c.h>
+
+#include <drm/drm_edid.h>
#include <drm/drmP.h>
#include <drm/drm_atomic.h>
#include <drm/drm_fb_helper.h>
#include <drm/drm_gem.h>
#include <drm/ttm/ttm_bo_driver.h>
+
struct hibmc_framebuffer {
struct drm_framebuffer fb;
struct drm_gem_object *obj;
@@ -36,6 +42,13 @@ struct hibmc_fbdev {
int size;
};
+struct hibmc_connector {
+ struct drm_connector base;
+
+ struct i2c_adapter adapter;
+ struct i2c_algo_bit_data bit_data;
+};
+
struct hibmc_drm_private {
/* hw */
void __iomem *mmio;
@@ -46,6 +59,7 @@ struct hibmc_drm_private {
/* drm */
struct drm_device *dev;
+ struct hibmc_connector connector;
bool mode_config_initialized;
struct drm_atomic_state *suspend_state;
@@ -60,6 +74,16 @@ struct hibmc_drm_private {
bool mm_inited;
};
+static inline struct hibmc_connector *to_hibmc_connector(struct drm_connector *connector)
+{
+ return container_of(connector, struct hibmc_connector, base);
+}
+
+static inline struct hibmc_drm_private *to_hibmc_drm_private(struct drm_device *dev)
+{
+ return dev->dev_private;
+}
+
#define to_hibmc_framebuffer(x) container_of(x, struct hibmc_framebuffer, fb)
struct hibmc_bo {
@@ -110,6 +134,7 @@ int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);
int hibmc_mmap(struct file *filp, struct vm_area_struct *vma);
+int hibmc_ddc_create(struct drm_device *drm_dev, struct hibmc_connector *connector);
extern const struct drm_mode_config_funcs hibmc_mode_funcs;
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
new file mode 100644
index 0000000000000..ffd7c7bf4b7d8
--- /dev/null
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Hisilicon Hibmc SoC drm driver
+ *
+ * Based on the bochs drm driver.
+ *
+ * Copyright (c) 2016 Huawei Limited.
+ *
+ * Author:
+ * Tian Tao <tiantao6(a)hisilicon.com>
+ */
+
+#include <linux/delay.h>
+#include <linux/pci.h>
+
+#include <drm/drm_atomic_helper.h>
+
+#include "hibmc_drm_drv.h"
+
+#define GPIO_DATA 0x0802A0
+#define GPIO_DATA_DIRECTION 0x0802A4
+
+#define I2C_SCL_MASK BIT(0)
+#define I2C_SDA_MASK BIT(1)
+
+static void hibmc_set_i2c_signal(void *data, u32 mask, int value)
+{
+ struct hibmc_connector *hibmc_connector = data;
+ struct hibmc_drm_private *priv = to_hibmc_drm_private(hibmc_connector->base.dev);
+ u32 tmp_dir = readl(priv->mmio + GPIO_DATA_DIRECTION);
+
+ if (value) {
+ tmp_dir &= ~mask;
+ writel(tmp_dir, priv->mmio + GPIO_DATA_DIRECTION);
+ } else {
+ u32 tmp_data = readl(priv->mmio + GPIO_DATA);
+
+ tmp_data &= ~mask;
+ writel(tmp_data, priv->mmio + GPIO_DATA);
+
+ tmp_dir |= mask;
+ writel(tmp_dir, priv->mmio + GPIO_DATA_DIRECTION);
+ }
+}
+
+static int hibmc_get_i2c_signal(void *data, u32 mask)
+{
+ struct hibmc_connector *hibmc_connector = data;
+ struct hibmc_drm_private *priv = to_hibmc_drm_private(hibmc_connector->base.dev);
+ u32 tmp_dir = readl(priv->mmio + GPIO_DATA_DIRECTION);
+
+ if ((tmp_dir & mask) != mask) {
+ tmp_dir &= ~mask;
+ writel(tmp_dir, priv->mmio + GPIO_DATA_DIRECTION);
+ }
+
+ return (readl(priv->mmio + GPIO_DATA) & mask) ? 1 : 0;
+}
+
+static void hibmc_ddc_setsda(void *data, int state)
+{
+ hibmc_set_i2c_signal(data, I2C_SDA_MASK, state);
+}
+
+static void hibmc_ddc_setscl(void *data, int state)
+{
+ hibmc_set_i2c_signal(data, I2C_SCL_MASK, state);
+}
+
+static int hibmc_ddc_getsda(void *data)
+{
+ return hibmc_get_i2c_signal(data, I2C_SDA_MASK);
+}
+
+static int hibmc_ddc_getscl(void *data)
+{
+ return hibmc_get_i2c_signal(data, I2C_SCL_MASK);
+}
+
+int hibmc_ddc_create(struct drm_device *drm_dev,
+ struct hibmc_connector *connector)
+{
+ connector->adapter.owner = THIS_MODULE;
+ connector->adapter.class = I2C_CLASS_DDC;
+ snprintf(connector->adapter.name, I2C_NAME_SIZE, "HIS i2c bit bus");
+ connector->adapter.dev.parent = &drm_dev->pdev->dev;
+ i2c_set_adapdata(&connector->adapter, connector);
+ connector->adapter.algo_data = &connector->bit_data;
+
+ connector->bit_data.udelay = 20;
+ connector->bit_data.timeout = usecs_to_jiffies(2000);
+ connector->bit_data.data = connector;
+ connector->bit_data.setsda = hibmc_ddc_setsda;
+ connector->bit_data.setscl = hibmc_ddc_setscl;
+ connector->bit_data.getsda = hibmc_ddc_getsda;
+ connector->bit_data.getscl = hibmc_ddc_getscl;
+
+ return i2c_bit_add_bus(&connector->adapter);
+}
--
2.25.1
09 Nov '21
From: Arvind Sankar <nivedita(a)alum.mit.edu>
stable inclusion
from linux-4.19.164
commit b207caff4176e3a6ba273243da2db2e595e4aad2
CVE: CVE-2021-0938
--------------------------------
commit 3347acc6fcd4ee71ad18a9ff9d9dac176b517329 upstream.
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") neglected to copy barrier_data() from
compiler-gcc.h into compiler-clang.h.
The definition in compiler-gcc.h was really to work around clang's more
aggressive optimization, so this broke barrier_data() on clang, and
consequently memzero_explicit() as well.
For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
Fix this by moving the definition of barrier_data() into compiler.h.
Also move the gcc/clang definition of barrier() into compiler.h,
__memory_barrier() is icc-specific (and barrier() is already defined
using it in compiler-intel.h) and doesn't belong in compiler.h.
[rdunlap(a)infradead.org: fix ALPHA builds when SMP is not enabled]
Link: https://lkml.kernel.org/r/20201101231835.4589-1-rdunlap@infradead.org
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
Signed-off-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201014212631.207844-1-nivedita@alum.mit.edu
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[nd: backport to account for missing
commit e506ea451254a ("compiler.h: Split {READ,WRITE}_ONCE definitions out into rwonce.h")
commit d08b9f0ca6605 ("scs: Add support for Clang's Shadow Call Stack (SCS)")
commit a3f8a30f3f00 ("Compiler Attributes: use feature checks instead of version checks")]
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/compiler-clang.h | 1 -
include/linux/compiler-gcc.h | 19 -------------------
include/linux/compiler.h | 18 ++++++++++++++++--
3 files changed, 16 insertions(+), 22 deletions(-)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index d756f2318efe0..2d6e5e4bb5d93 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -39,7 +39,6 @@
* and may be redefined here because they should not be shared with other
* compilers, like ICC.
*/
-#define barrier() __asm__ __volatile__("" : : : "memory")
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define __assume_aligned(a, ...) \
__attribute__((__assume_aligned__(a, ## __VA_ARGS__)))
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 3ebee1ce6f982..14be095371093 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -14,25 +14,6 @@
# error Sorry, your compiler is too old - please upgrade it.
#endif
-/* Optimization barrier */
-
-/* The "volatile" is due to gcc bugs */
-#define barrier() __asm__ __volatile__("": : :"memory")
-/*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
- */
-#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
-
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index bb22908c79e83..0e769548e14f8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -79,11 +79,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
/* Optimization barrier */
#ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
#endif
#ifndef barrier_data
-# define barrier_data(ptr) barrier()
+/*
+ * This version is i.e. to prevent dead stores elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
#endif
/* workaround for GCC PR82365 if needed */
--
2.25.1
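For context, the consumer that motivated the fix above is memzero_explicit(). A minimal sketch of the pattern, simplified from lib/string.c with the barrier_data() macro inlined for illustration:

/* Simplified sketch, not the exact kernel source. Because the asm
 * statement takes @ptr as an input operand, the compiler must assume
 * the zeroed memory is observed, so it cannot delete the memset()
 * as dead stores -- which is exactly what clang did without this fix.
 */
#include <stddef.h>
#include <string.h>

#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

static void memzero_explicit_sketch(void *s, size_t count)
{
	memset(s, 0, count);
	barrier_data(s);	/* keep the stores above alive */
}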
09 Nov '21
From: Yanling Song <songyl(a)ramaxel.com>
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DBD7
CVE: NA
-----------------------------------------
Fix two compile errors:
1. Compilation fails when O=xxx is specified;
2. Compilation conflicts occur when spfc and spnic are built in parallel
with the -j option, because they share some .c files.
make O=tmp -j200 > build.log
In file included from ../drivers/scsi/spfc/hw/spfc_utils.c:4:0:
../drivers/scsi/spfc/hw/spfc_utils.h:7:10: fatal error: unf_type.h: No such file or directory
#include "unf_type.h"
^~~~~~~~~~~~
compilation terminated.
make[4]: *** [drivers/scsi/spfc/hw/spfc_utils.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [drivers/scsi/spfc] Error 2
make[3]: *** Waiting for unfinished jobs....
../drivers/net/ethernet/ramaxel/spnic/spnic_sriov.c:9:10: fatal error: sphw_common.h: No such file or directory
#include "sphw_common.h"
^~~~~~~~~~~~~~~
compilation terminated.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Zhang Lei<zhanglei48(a)huawei.com>
[zzk: Adjusted commit message]
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/net/ethernet/ramaxel/spnic/Makefile | 2 +-
drivers/scsi/spfc/Makefile | 30 ++++++++++-----------
drivers/scsi/spfc/sphw_api_cmd.c | 1 +
drivers/scsi/spfc/sphw_cmdq.c | 1 +
drivers/scsi/spfc/sphw_common.c | 1 +
drivers/scsi/spfc/sphw_eqs.c | 1 +
drivers/scsi/spfc/sphw_hw_cfg.c | 1 +
drivers/scsi/spfc/sphw_hw_comm.c | 1 +
drivers/scsi/spfc/sphw_hwdev.c | 1 +
drivers/scsi/spfc/sphw_hwif.c | 1 +
drivers/scsi/spfc/sphw_mbox.c | 1 +
drivers/scsi/spfc/sphw_mgmt.c | 1 +
drivers/scsi/spfc/sphw_prof_adap.c | 1 +
drivers/scsi/spfc/sphw_wq.c | 1 +
14 files changed, 28 insertions(+), 16 deletions(-)
create mode 120000 drivers/scsi/spfc/sphw_api_cmd.c
create mode 120000 drivers/scsi/spfc/sphw_cmdq.c
create mode 120000 drivers/scsi/spfc/sphw_common.c
create mode 120000 drivers/scsi/spfc/sphw_eqs.c
create mode 120000 drivers/scsi/spfc/sphw_hw_cfg.c
create mode 120000 drivers/scsi/spfc/sphw_hw_comm.c
create mode 120000 drivers/scsi/spfc/sphw_hwdev.c
create mode 120000 drivers/scsi/spfc/sphw_hwif.c
create mode 120000 drivers/scsi/spfc/sphw_mbox.c
create mode 120000 drivers/scsi/spfc/sphw_mgmt.c
create mode 120000 drivers/scsi/spfc/sphw_prof_adap.c
create mode 120000 drivers/scsi/spfc/sphw_wq.c
diff --git a/drivers/net/ethernet/ramaxel/spnic/Makefile b/drivers/net/ethernet/ramaxel/spnic/Makefile
index f86ccff374f6..207e1d9c431a 100644
--- a/drivers/net/ethernet/ramaxel/spnic/Makefile
+++ b/drivers/net/ethernet/ramaxel/spnic/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPNIC) += spnic.o
-subdir-ccflags-y += -I$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
spnic-objs := hw/sphw_common.o \
hw/sphw_hwif.o \
diff --git a/drivers/scsi/spfc/Makefile b/drivers/scsi/spfc/Makefile
index 02fe0213e048..849b730ac733 100644
--- a/drivers/scsi/spfc/Makefile
+++ b/drivers/scsi/spfc/Makefile
@@ -1,9 +1,9 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPFC) += spfc.o
-subdir-ccflags-y += -I$(src)/../../net/ethernet/ramaxel/spnic/hw
-subdir-ccflags-y += -I$(src)/hw
-subdir-ccflags-y += -I$(src)/common
+subdir-ccflags-y += -I$(srctree)/$(src)/../../net/ethernet/ramaxel/spnic/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/common
spfc-objs := common/unf_init.o \
common/unf_event.o \
@@ -33,15 +33,15 @@ spfc-objs := common/unf_init.o \
hw/spfc_cqm_bitmap_table.o \
hw/spfc_cqm_main.o \
hw/spfc_cqm_object.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hwdev.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hw_comm.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_prof_adap.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_common.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hwif.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_wq.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_eqs.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_mbox.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_api_cmd.o
+ sphw_hwdev.o \
+ sphw_hw_cfg.o \
+ sphw_hw_comm.o \
+ sphw_prof_adap.o \
+ sphw_common.o \
+ sphw_hwif.o \
+ sphw_wq.o \
+ sphw_cmdq.o \
+ sphw_eqs.o \
+ sphw_mbox.o \
+ sphw_mgmt.o \
+ sphw_api_cmd.o
diff --git a/drivers/scsi/spfc/sphw_api_cmd.c b/drivers/scsi/spfc/sphw_api_cmd.c
new file mode 120000
index 000000000000..27c7c0770fa3
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_api_cmd.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_api_cmd.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_cmdq.c b/drivers/scsi/spfc/sphw_cmdq.c
new file mode 120000
index 000000000000..5ac779ba274b
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_cmdq.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_common.c b/drivers/scsi/spfc/sphw_common.c
new file mode 120000
index 000000000000..a1a30a4840e1
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_common.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_common.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_eqs.c b/drivers/scsi/spfc/sphw_eqs.c
new file mode 120000
index 000000000000..74430dcb9dc5
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_eqs.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_eqs.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hw_cfg.c b/drivers/scsi/spfc/sphw_hw_cfg.c
new file mode 120000
index 000000000000..4f43d68624c1
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hw_cfg.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hw_comm.c b/drivers/scsi/spfc/sphw_hw_comm.c
new file mode 120000
index 000000000000..c943b3b2933a
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hw_comm.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hw_comm.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hwdev.c b/drivers/scsi/spfc/sphw_hwdev.c
new file mode 120000
index 000000000000..b7279f17eaa2
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hwdev.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hwdev.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hwif.c b/drivers/scsi/spfc/sphw_hwif.c
new file mode 120000
index 000000000000..d40ef71f9033
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hwif.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hwif.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_mbox.c b/drivers/scsi/spfc/sphw_mbox.c
new file mode 120000
index 000000000000..1b00fe7289cc
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_mbox.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_mbox.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_mgmt.c b/drivers/scsi/spfc/sphw_mgmt.c
new file mode 120000
index 000000000000..fd18a73e9d3a
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_mgmt.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_prof_adap.c b/drivers/scsi/spfc/sphw_prof_adap.c
new file mode 120000
index 000000000000..fbc7db05dd27
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_prof_adap.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_prof_adap.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_wq.c b/drivers/scsi/spfc/sphw_wq.c
new file mode 120000
index 000000000000..cdfcb3a610c0
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_wq.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_wq.c
\ No newline at end of file
--
2.20.1
09 Nov '21
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DBD7
CVE: NA
Fix two compile errors:
1. Compilation fails when O=xxx is specified;
2. Compilation conflicts occur when spfc and spnic are built in parallel with the -j option, because they share some .c files.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
drivers/net/ethernet/ramaxel/spnic/Makefile | 2 +-
drivers/scsi/spfc/Makefile | 30 ++++++++++-----------
drivers/scsi/spfc/sphw_api_cmd.c | 1 +
drivers/scsi/spfc/sphw_cmdq.c | 1 +
drivers/scsi/spfc/sphw_common.c | 1 +
drivers/scsi/spfc/sphw_eqs.c | 1 +
drivers/scsi/spfc/sphw_hw_cfg.c | 1 +
drivers/scsi/spfc/sphw_hw_comm.c | 1 +
drivers/scsi/spfc/sphw_hwdev.c | 1 +
drivers/scsi/spfc/sphw_hwif.c | 1 +
drivers/scsi/spfc/sphw_mbox.c | 1 +
drivers/scsi/spfc/sphw_mgmt.c | 1 +
drivers/scsi/spfc/sphw_prof_adap.c | 1 +
drivers/scsi/spfc/sphw_wq.c | 1 +
14 files changed, 28 insertions(+), 16 deletions(-)
create mode 120000 drivers/scsi/spfc/sphw_api_cmd.c
create mode 120000 drivers/scsi/spfc/sphw_cmdq.c
create mode 120000 drivers/scsi/spfc/sphw_common.c
create mode 120000 drivers/scsi/spfc/sphw_eqs.c
create mode 120000 drivers/scsi/spfc/sphw_hw_cfg.c
create mode 120000 drivers/scsi/spfc/sphw_hw_comm.c
create mode 120000 drivers/scsi/spfc/sphw_hwdev.c
create mode 120000 drivers/scsi/spfc/sphw_hwif.c
create mode 120000 drivers/scsi/spfc/sphw_mbox.c
create mode 120000 drivers/scsi/spfc/sphw_mgmt.c
create mode 120000 drivers/scsi/spfc/sphw_prof_adap.c
create mode 120000 drivers/scsi/spfc/sphw_wq.c
diff --git a/drivers/net/ethernet/ramaxel/spnic/Makefile b/drivers/net/ethernet/ramaxel/spnic/Makefile
index f86ccff374f6..207e1d9c431a 100644
--- a/drivers/net/ethernet/ramaxel/spnic/Makefile
+++ b/drivers/net/ethernet/ramaxel/spnic/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPNIC) += spnic.o
-subdir-ccflags-y += -I$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
spnic-objs := hw/sphw_common.o \
hw/sphw_hwif.o \
diff --git a/drivers/scsi/spfc/Makefile b/drivers/scsi/spfc/Makefile
index 02fe0213e048..849b730ac733 100644
--- a/drivers/scsi/spfc/Makefile
+++ b/drivers/scsi/spfc/Makefile
@@ -1,9 +1,9 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPFC) += spfc.o
-subdir-ccflags-y += -I$(src)/../../net/ethernet/ramaxel/spnic/hw
-subdir-ccflags-y += -I$(src)/hw
-subdir-ccflags-y += -I$(src)/common
+subdir-ccflags-y += -I$(srctree)/$(src)/../../net/ethernet/ramaxel/spnic/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/common
spfc-objs := common/unf_init.o \
common/unf_event.o \
@@ -33,15 +33,15 @@ spfc-objs := common/unf_init.o \
hw/spfc_cqm_bitmap_table.o \
hw/spfc_cqm_main.o \
hw/spfc_cqm_object.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hwdev.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hw_comm.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_prof_adap.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_common.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hwif.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_wq.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_eqs.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_mbox.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_api_cmd.o
+ sphw_hwdev.o \
+ sphw_hw_cfg.o \
+ sphw_hw_comm.o \
+ sphw_prof_adap.o \
+ sphw_common.o \
+ sphw_hwif.o \
+ sphw_wq.o \
+ sphw_cmdq.o \
+ sphw_eqs.o \
+ sphw_mbox.o \
+ sphw_mgmt.o \
+ sphw_api_cmd.o
diff --git a/drivers/scsi/spfc/sphw_api_cmd.c b/drivers/scsi/spfc/sphw_api_cmd.c
new file mode 120000
index 000000000000..27c7c0770fa3
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_api_cmd.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_api_cmd.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_cmdq.c b/drivers/scsi/spfc/sphw_cmdq.c
new file mode 120000
index 000000000000..5ac779ba274b
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_cmdq.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_common.c b/drivers/scsi/spfc/sphw_common.c
new file mode 120000
index 000000000000..a1a30a4840e1
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_common.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_common.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_eqs.c b/drivers/scsi/spfc/sphw_eqs.c
new file mode 120000
index 000000000000..74430dcb9dc5
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_eqs.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_eqs.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hw_cfg.c b/drivers/scsi/spfc/sphw_hw_cfg.c
new file mode 120000
index 000000000000..4f43d68624c1
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hw_cfg.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hw_comm.c b/drivers/scsi/spfc/sphw_hw_comm.c
new file mode 120000
index 000000000000..c943b3b2933a
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hw_comm.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hw_comm.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hwdev.c b/drivers/scsi/spfc/sphw_hwdev.c
new file mode 120000
index 000000000000..b7279f17eaa2
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hwdev.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hwdev.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hwif.c b/drivers/scsi/spfc/sphw_hwif.c
new file mode 120000
index 000000000000..d40ef71f9033
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hwif.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hwif.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_mbox.c b/drivers/scsi/spfc/sphw_mbox.c
new file mode 120000
index 000000000000..1b00fe7289cc
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_mbox.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_mbox.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_mgmt.c b/drivers/scsi/spfc/sphw_mgmt.c
new file mode 120000
index 000000000000..fd18a73e9d3a
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_mgmt.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_prof_adap.c b/drivers/scsi/spfc/sphw_prof_adap.c
new file mode 120000
index 000000000000..fbc7db05dd27
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_prof_adap.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_prof_adap.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_wq.c b/drivers/scsi/spfc/sphw_wq.c
new file mode 120000
index 000000000000..cdfcb3a610c0
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_wq.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_wq.c
\ No newline at end of file
--
2.27.0
DAMON (Data Access Monitor) feature: topic proposal
DAMON is a kernel feature that monitors the memory access patterns of selected user-space processes online, at low overhead. Built on two core mechanisms, region-based sampling and adaptive adjustment of region sizes, it supports policies defined from both user space and kernel space for data-access monitoring and memory-management optimization.
Use cases:
1. Detecting memory overload or excessively high access frequency
2. Memory-management optimization:
1) Move frequently accessed memory regions to the head of the LRU list
2) Move rarely accessed regions to the tail of the LRU list, or swap them out
3) Back large, frequently accessed regions with huge pages
3. Detecting inefficient inter-process communication
The overall DAMON framework was shown in an attached diagram (inline image not reproduced here).
Feature patch series:
http://patchwork.huawei.com/project/olk5.10/list/?series=20179
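As a rough illustration of the region-based sampling mechanism described above, here is a conceptual C sketch; every name in it is hypothetical, and it is not the actual DAMON implementation:

#include <stdbool.h>
#include <stdlib.h>

/* Illustrative only. DAMON tracks address ranges as regions and checks
 * one sampled page per region per interval, so monitoring cost scales
 * with the number of regions, not with the monitored address space.
 */
struct region_sketch {
	unsigned long start, end;  /* [start, end) address range */
	unsigned int nr_accesses;  /* accesses seen in recent intervals */
};

/* Stand-in for a page-table accessed-bit check (hypothetical). */
static bool page_was_accessed(unsigned long addr)
{
	(void)addr;
	return rand() & 1;
}

static void sample_region(struct region_sketch *r)
{
	/* One representative page speaks for the whole region. */
	unsigned long sample = r->start + (r->end - r->start) / 2;

	if (page_was_accessed(sample))
		r->nr_accesses++;
	/* The adaptive part (not shown) splits regions whose halves
	 * behave differently and merges neighbours with similar counts.
	 */
}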
[PATCH openEuler-1.0-LTS 1/7] ath: Use safer key clearing with key cache entries
by Yang Yingliang 08 Nov '21
From: Jouni Malinen <jouni(a)codeaurora.org>
stable inclusion
from linux-4.19.205
commit dd5815f023b89c9a28325d8a2a5f0779b57b7190
CVE: CVE-2020-3702
--------------------------------
commit 56c5485c9e444c2e85e11694b6c44f1338fc20fd upstream.
It is possible for there to be pending frames in TXQs with a reference
to the key cache entry that is being deleted. If such a key cache entry
is cleared, those pending frame in TXQ might get transmitted without
proper encryption. It is safer to leave the previously used key into the
key cache in such cases. Instead, only clear the MAC address to prevent
RX processing from using this key cache entry.
This is needed in particularly in AP mode where the TXQs cannot be
flushed on station disconnection. This change alone may not be able to
address all cases where the key cache entry might get reused for other
purposes immediately (the key cache entry should be released for reuse
only once the TXQs do not have any remaining references to them), but
this makes it less likely to get unprotected frames and the more
complete changes may end up being significantly more complex.
Signed-off-by: Jouni Malinen <jouni(a)codeaurora.org>
Signed-off-by: Kalle Valo <kvalo(a)codeaurora.org>
Link: https://lore.kernel.org/r/20201214172118.18100-2-jouni@codeaurora.org
Cc: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/wireless/ath/key.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/key.c b/drivers/net/wireless/ath/key.c
index 1816b4e7dc264..59618bb41f6c5 100644
--- a/drivers/net/wireless/ath/key.c
+++ b/drivers/net/wireless/ath/key.c
@@ -583,7 +583,16 @@ EXPORT_SYMBOL(ath_key_config);
*/
void ath_key_delete(struct ath_common *common, struct ieee80211_key_conf *key)
{
- ath_hw_keyreset(common, key->hw_key_idx);
+ /* Leave CCMP and TKIP (main key) configured to avoid disabling
+ * encryption for potentially pending frames already in a TXQ with the
+ * keyix pointing to this key entry. Instead, only clear the MAC address
+ * to prevent RX processing from using this key cache entry.
+ */
+ if (test_bit(key->hw_key_idx, common->ccmp_keymap) ||
+ test_bit(key->hw_key_idx, common->tkip_keymap))
+ ath_hw_keysetmac(common, key->hw_key_idx, NULL);
+ else
+ ath_hw_keyreset(common, key->hw_key_idx);
if (key->hw_key_idx < IEEE80211_WEP_NKID)
return;
--
2.25.1
08 Nov '21
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVH3?from=project-issue
CVE: NA
cap_convert_nscap() does permission checking as well as conversion of
the xattr value conditionally based on fs's user-ns.
This is needed by overlayfs and probably other layered fs (ecryptfs)
and is what vfs_foo() is supposed to do anyway.
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Acked-by: James Morris <jamorris(a)linux.microsoft.com>
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
fs/xattr.c | 18 ++++++++++++------
include/linux/capability.h | 2 +-
security/commoncap.c | 3 +--
3 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/fs/xattr.c b/fs/xattr.c
index adb11cb82be5..653e12d6060f 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -263,8 +263,16 @@ vfs_setxattr(struct dentry *dentry, const char *name, const void *value,
{
struct inode *inode = dentry->d_inode;
struct inode *delegated_inode = NULL;
+ const void *orig_value = value;
int error;
+ if (size && strcmp(name, XATTR_NAME_CAPS) == 0) {
+ error = cap_convert_nscap(dentry, &value, size);
+ if (error < 0)
+ return error;
+ size = error;
+ }
+
retry_deleg:
inode_lock(inode);
error = __vfs_setxattr_locked(dentry, name, value, size, flags,
@@ -276,6 +284,10 @@ vfs_setxattr(struct dentry *dentry, const char *name, const void *value,
if (!error)
goto retry_deleg;
}
+
+ if (value != orig_value)
+ kfree(value);
+
return error;
}
EXPORT_SYMBOL_GPL(vfs_setxattr);
@@ -530,12 +542,6 @@ setxattr(struct dentry *d, const char __user *name, const void __user *value,
if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
(strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
posix_acl_fix_xattr_from_user(kvalue, size);
- else if (strcmp(kname, XATTR_NAME_CAPS) == 0) {
- error = cap_convert_nscap(d, &kvalue, size);
- if (error < 0)
- goto out;
- size = error;
- }
}
error = vfs_setxattr(d, kname, kvalue, size, flags);
diff --git a/include/linux/capability.h b/include/linux/capability.h
index f640dcbc880c..9fee9a86505c 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -249,6 +249,6 @@ extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
/* audit system wants to get cap info from files as well */
extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
-extern int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size);
+extern int cap_convert_nscap(struct dentry *dentry, const void **ivalue, size_t size);
#endif /* !_LINUX_CAPABILITY_H */
diff --git a/security/commoncap.c b/security/commoncap.c
index 876cfe01d939..83546a782796 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -498,7 +498,7 @@ static bool validheader(size_t size, const struct vfs_cap_data *cap)
*
* If all is ok, we return the new size, on error return < 0.
*/
-int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size)
+int cap_convert_nscap(struct dentry *dentry, const void **ivalue, size_t size)
{
struct vfs_ns_cap_data *nscap;
uid_t nsrootid;
@@ -541,7 +541,6 @@ int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size)
nscap->magic_etc = cpu_to_le32(nsmagic);
memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32);
- kvfree(*ivalue);
*ivalue = nscap;
return newsize;
}
--
2.30.0
[PATCH openEuler-1.0-LTS] ext4: if zeroout fails fall back to splitting the extent node
by Yang Yingliang 08 Nov '21
From: Theodore Ts'o <tytso(a)mit.edu>
mainline inclusion
from mainline-5.15-rc1
commit 308c57ccf4318236be75dfa251c84713e694457b
category: bugfix
bugzilla: 109297
CVE: NA
---------------------------
If the underlying storage device is using thin-provisioning, it's
possible for a zeroout operation to return ENOSPC.
Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
left") added logic to retry block allocation since we might get free block
after we commit a transaction. But the ENOSPC from thin-provisioning
will confuse ext4, and lead to an infinite loop.
Since using zeroout instead of splitting the extent node is an
optimization, if it fails, we might as well fall back to splitting the
extent node.
Reported-by: yangerkun <yangerkun(a)huawei.com>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Signed-off-by: yangerkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/ext4/extents.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index fc00a78163117..10298c13a67d7 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3696,7 +3696,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
split_map.m_len - ee_block);
err = ext4_ext_zeroout(inode, &zero_ex1);
if (err)
- goto out;
+ goto fallback;
split_map.m_len = allocated;
}
if (split_map.m_lblk - ee_block + split_map.m_len <
@@ -3710,7 +3710,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
ext4_ext_pblock(ex));
err = ext4_ext_zeroout(inode, &zero_ex2);
if (err)
- goto out;
+ goto fallback;
}
split_map.m_len += split_map.m_lblk - ee_block;
@@ -3719,6 +3719,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
}
}
+fallback:
err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
flags);
if (err > 0)
--
2.25.1
[PATCH openEuler-1.0-LTS] dccp: don't duplicate ccid when cloning dccp sock
by Yang Yingliang 08 Nov '21
From: "Lin, Zhenpeng" <zplin(a)psu.edu>
mainline inclusion
from mainline-v5.15-rc2
commit d9ea761fdd197351890418acd462c51f241014a7
category: bugfix
bugzilla: 85666
CVE: CVE-2020-16119
-------------------------------------------------
Commit 2677d2067731 ("dccp: don't free ccid2_hc_tx_sock ...") fixed
a UAF but reintroduced CVE-2017-6074.
When the sock is cloned, two dccps_hc_tx_ccid will reference to the
same ccid. So one can free the ccid object twice from two socks after
cloning.
This issue was found by "Hadar Manor" as well and assigned with
CVE-2020-16119, which was fixed in Ubuntu's kernel. So here I port
the patch from Ubuntu to fix it.
The patch prevents cloned socks from referencing the same ccid.
Fixes: 2677d2067731410 ("dccp: don't free ccid2_hc_tx_sock ...")
Signed-off-by: Zhenpeng Lin <zplin(a)psu.edu>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/dccp/minisocks.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index ba6fc3c1186b9..e91838a7b8497 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -98,6 +98,8 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
newdp->dccps_role = DCCP_ROLE_SERVER;
newdp->dccps_hc_rx_ackvec = NULL;
newdp->dccps_service_list = NULL;
+ newdp->dccps_hc_rx_ccid = NULL;
+ newdp->dccps_hc_tx_ccid = NULL;
newdp->dccps_service = dreq->dreq_service;
newdp->dccps_timestamp_echo = dreq->dreq_timestamp_echo;
newdp->dccps_timestamp_time = dreq->dreq_timestamp_time;
--
2.25.1
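The bug class fixed above is generic: shallow-copying a structure that owns a pointer leaves two owners of one allocation. A minimal user-space sketch of the pattern and of the fix applied here (illustrative, not the dccp code itself):

#include <stdlib.h>
#include <string.h>

struct conn_sketch {
	void *ccid;	/* owned: the destructor frees it */
};

static struct conn_sketch *clone_conn(const struct conn_sketch *parent)
{
	struct conn_sketch *child = malloc(sizeof(*child));

	if (!child)
		return NULL;
	memcpy(child, parent, sizeof(*child));	/* shallow copy */
	/* The fix: drop the inherited pointer so only the parent owns
	 * the ccid; otherwise both destructors free it (double free).
	 */
	child->ccid = NULL;
	return child;
}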
[PATCH openEuler-1.0-LTS 01/14] Revert "selftests/bpf: add test_spec_readahead_xfs_file to support specail async readahead"
by Yang Yingliang 05 Nov '21
From: Hou Tao <houtao1(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4H3JT
CVE: NA
---------------------------
There are two issues with the current solution:
1) tracepoint xfs_read_file is visible in tracefs
It forms an ABI for userspace. That is bad, because new fields may be
added to xfs_writable_file to export more information to userspace.
2) tracepoint xfs_read_file is specific to xfs
HDFS can be stacked on ext4.
A new solution is proposed which uses a bare vfs tracepoint, so
revert commit 69513cfbe62d267c4a5e6025f31741b1f2cb946c.
Signed-off-by: Hou Tao <houtao1(a)huawei.com>
Reviewed-by: Kuohai Xu <xukuohai(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
tools/include/uapi/linux/xfs.h | 1 -
tools/testing/selftests/bpf/Makefile | 3 +-
.../bpf/test_spec_readahead_xfs_file.c | 39 -------------------
tools/testing/selftests/bpf/test_xfs_file.c | 9 +----
4 files changed, 3 insertions(+), 49 deletions(-)
delete mode 100644 tools/testing/selftests/bpf/test_spec_readahead_xfs_file.c
diff --git a/tools/include/uapi/linux/xfs.h b/tools/include/uapi/linux/xfs.h
index 1409b45affd34..a0d37e411ee18 100644
--- a/tools/include/uapi/linux/xfs.h
+++ b/tools/include/uapi/linux/xfs.h
@@ -6,7 +6,6 @@
#define FMODE_RANDOM (0x1000)
#define FMODE_WILLNEED (0x400000)
-#define FMODE_SPC_READAHEAD (0x800000)
struct xfs_writable_file {
const unsigned char *name;
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 46b1d5b864f5a..8d2737285f185 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -36,8 +36,7 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
test_get_stack_rawtp.o test_sockmap_kern.o test_sockhash_kern.o \
test_lwt_seg6local.o sendmsg4_prog.o sendmsg6_prog.o test_lirc_mode2_kern.o \
get_cgroup_id_kern.o socket_cookie_prog.o test_select_reuseport_kern.o \
- test_skb_cgroup_id_kern.o test_set_xfs_file.o test_clear_xfs_file.o \
- test_spec_readahead_xfs_file.o
+ test_skb_cgroup_id_kern.o test_set_xfs_file.o test_clear_xfs_file.o
# Order correspond to 'make run_tests' order
TEST_PROGS := test_kmod.sh \
diff --git a/tools/testing/selftests/bpf/test_spec_readahead_xfs_file.c b/tools/testing/selftests/bpf/test_spec_readahead_xfs_file.c
deleted file mode 100644
index ff8794a14cdcd..0000000000000
--- a/tools/testing/selftests/bpf/test_spec_readahead_xfs_file.c
+++ /dev/null
@@ -1,39 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-
-#include <linux/bpf.h>
-#include "bpf_helpers.h"
-#include <string.h>
-#include <linux/xfs.h>
-
-/* from /sys/kernel/debug/tracing/events/xfs/xfs_read_file */
-struct xfs_read_buffer_args {
- struct xfs_writable_file *file;
-};
-
-SEC("tracepoint/xfs/xfs_file_read")
-int bpf_prog1(struct xfs_read_buffer_args *ctx)
-{
- char fmt[] = "name: %s, set f_mode: %u\n";
- struct xfs_writable_file *file = ctx->file;
- char name[64] = {};
- char *tmp;
- unsigned long i_size;
- int len;
-
- bpf_probe_read(&tmp, 8, &(file->name));
- len = bpf_probe_read_str(name, 64, tmp);
- bpf_probe_read(&i_size, 8, &(file->i_size));
-
- if (!strncmp("blk_", name, 4)) {
- /* blk_xxx.meta or blk_xxx with size < 2M */
- if (len == 27 || (len == 15 && i_size <= 2 * 1024 * 1024))
- file->f_mode |= FMODE_WILLNEED;
- else if (len == 15) /* blk_xxx */
- file->f_mode |= FMODE_SPC_READAHEAD;
- bpf_trace_printk(fmt, sizeof(fmt), name, file->f_mode);
- }
- return 0;
-}
-
-char _license[] SEC("license") = "GPL";
-__u32 _version SEC("version") = 1;
diff --git a/tools/testing/selftests/bpf/test_xfs_file.c b/tools/testing/selftests/bpf/test_xfs_file.c
index 89e79d959677c..247c42be029b2 100644
--- a/tools/testing/selftests/bpf/test_xfs_file.c
+++ b/tools/testing/selftests/bpf/test_xfs_file.c
@@ -20,7 +20,6 @@ int main(int argc, char *argv[])
{
const char *set_file = "./test_set_xfs_file.o";
const char *clear_file = "./test_clear_xfs_file.o";
- const char *spec_readahead_file = "./test_spec_readahead_xfs_file.o";
const char *file = set_file;
struct bpf_object *obj;
int efd, err, prog_fd;
@@ -32,12 +31,8 @@ int main(int argc, char *argv[])
delay = strtol(str, &endptr, 10);
}
- if (argc >= 2) {
- if (!strcmp("clear", argv[1]))
- file = clear_file;
- if (!strcmp("spec_readahead", argv[1]))
- file = spec_readahead_file;
- }
+ if (argc >= 2 && !strcmp("clear", argv[1]))
+ file = clear_file;
err = bpf_prog_load(file, BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, &obj,
&prog_fd);
--
2.25.1
[PATCH OLK-5.10 v3] Net:NIC:SPNIC:Fix compile error when O= is specified
by Yanling Song 05 Nov '21
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CBDP
CVE: NA
Net:NIC:SPNIC:Fix compile error when O= is specified
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
Changes from V2:
Move the location of "Changes from"
Changes from V1:
1. Change category from feature to bugfix;
2. Add description
---
drivers/net/ethernet/ramaxel/spnic/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ramaxel/spnic/Makefile b/drivers/net/ethernet/ramaxel/spnic/Makefile
index f86ccff374f6..207e1d9c431a 100644
--- a/drivers/net/ethernet/ramaxel/spnic/Makefile
+++ b/drivers/net/ethernet/ramaxel/spnic/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPNIC) += spnic.o
-subdir-ccflags-y += -I$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
spnic-objs := hw/sphw_common.o \
hw/sphw_hwif.o \
--
2.27.0
[PATCH openEuler-1.0-LTS] EMMC: fix ascend hisi emmc probe failure caused by mmc_host struct layout
by Yang Yingliang 04 Nov '21
From: zhangguijiang <zhangguijiang(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GVSG
CVE: NA
-------------------
Struct mmc_host uses the member private[0] to mark the end of struct
mmc_host. In commit deaf01a6 we modified struct mmc_host but overlooked
the role of private[0] and did not keep it as the last member.
This made mmc card probing fail; fix the layout so that the ascend
hisi mmc card probes successfully again.
Signed-off-by: zhangguijiang <zhangguijiang(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/mmc/core/sd.c | 38 +++-----------------------------
drivers/mmc/host/dw_mmc_extern.h | 2 +-
include/linux/mmc/host.h | 2 +-
3 files changed, 5 insertions(+), 37 deletions(-)
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 20ca371b9f874..8760b749292b8 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -1163,9 +1163,9 @@ static int _mmc_sd_suspend(struct mmc_host *host)
err = mmc_deselect_cards(host);
if (!err) {
- if (!(mmc_is_ascend_customized(host->parent)))
+ if (!mmc_is_ascend_customized(host->parent))
mmc_power_off(host);
- else if (mmc_card_keep_power(host))
+ else if (!mmc_card_keep_power(host))
mmc_power_off(host);
mmc_card_set_suspended(host->card);
}
@@ -1269,42 +1269,10 @@ static int mmc_sd_runtime_resume(struct mmc_host *host)
return 0;
}
-#ifdef CONFIG_ASCEND_HISI_MMC
-/*********************sd ops begin**********************/
-static int mmc_do_sd_reset(struct mmc_host *host)
-{
- struct mmc_card *card = host->card;
-
- if (!host->bus_ops->power_restore)
- return -EOPNOTSUPP;
-
- if (!card)
- return -EINVAL;
-
- /* hw_reset for ip reset */
- if (host->ops->hw_reset)
- host->ops->hw_reset(host);
-
- /* Only for K930/920 SD slow down clk*/
- if (host->ops->slowdown_clk)
- host->ops->slowdown_clk(host, host->ios.timing);
-
- mmc_power_off(host);
- mmc_set_clock(host, host->f_init);
- /* Wait at least 200 ms */
- mmc_delay(200);
- mmc_power_up(host, host->card->ocr);
- (void)mmc_select_voltage(host, host->card->ocr);
-
- return host->bus_ops->power_restore(host);
-}
-#endif
static int mmc_sd_hw_reset(struct mmc_host *host)
{
-#ifdef CONFIG_ASCEND_HISI_MMC
if (mmc_is_ascend_customized(host->parent))
- return mmc_do_sd_reset(host);
-#endif
+ return mmc_sd_reset(host);
mmc_power_cycle(host, host->card->ocr);
return mmc_sd_init_card(host, host->card->ocr, host->card);
}
diff --git a/drivers/mmc/host/dw_mmc_extern.h b/drivers/mmc/host/dw_mmc_extern.h
index 04d8c23f39e9a..ab077b4955940 100644
--- a/drivers/mmc/host/dw_mmc_extern.h
+++ b/drivers/mmc/host/dw_mmc_extern.h
@@ -8,7 +8,7 @@
#include "dw_mmc.h"
-#ifdef CONFIG_MMC_DW_HI3XXX_MODULE
+#if defined(CONFIG_MMC_DW_HI3XXX) || defined(CONFIG_MMC_DW_HI3XXX_MODULE)
extern void dw_mci_reg_dump(struct dw_mci *host);
extern void dw_mci_set_timeout(struct dw_mci *host);
extern bool dw_mci_stop_abort_cmd(struct mmc_command *cmd);
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 78b4d0a813b71..fabc23d156242 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -542,13 +542,13 @@ struct mmc_host {
bool cqe_enabled;
bool cqe_on;
- unsigned long private[0] ____cacheline_aligned;
#ifdef CONFIG_ASCEND_HISI_MMC
const struct mmc_cmdq_host_ops *cmdq_ops;
int sdio_present;
unsigned int cmdq_slots;
struct mmc_cmdq_context_info cmdq_ctx;
#endif
+ unsigned long private[0] ____cacheline_aligned;
};
struct device_node;
--
2.25.1
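The underlying rule behind the fix above: an anchor member like private[0] hands out the storage that follows it, so it must stay the last member. A hedged sketch of why (names here are illustrative, not the real struct):

/* Illustrative only. A helper like mmc_priv() returns h->private,
 * assuming the host driver's private area begins right after the
 * struct. Any member declared after private[0] occupies that same
 * storage and is silently clobbered by the driver's private data.
 */
struct host_sketch {
	int common_state;
#ifdef BROKEN_LAYOUT
	unsigned long private[0] __attribute__((aligned(64)));
	int added_later;	/* aliases the private area: probe breaks */
#else
	int added_later;	/* new members belong before the anchor */
	unsigned long private[0] __attribute__((aligned(64)));
#endif
};

static inline void *host_priv_sketch(struct host_sketch *h)
{
	return h->private;
}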
[PATCH OLK-5.10 v2] Net:NIC:SPNIC:Fix compile error when O= is specified
by Yanling Song 04 Nov '21
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CBDP
CVE: NA
Change from V1:
1. Change category from feature to bugfix;
2. Add description
Net:NIC:SPNIC:Fix compile error when O= is specified
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
drivers/net/ethernet/ramaxel/spnic/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ramaxel/spnic/Makefile b/drivers/net/ethernet/ramaxel/spnic/Makefile
index f86ccff374f6..207e1d9c431a 100644
--- a/drivers/net/ethernet/ramaxel/spnic/Makefile
+++ b/drivers/net/ethernet/ramaxel/spnic/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPNIC) += spnic.o
-subdir-ccflags-y += -I$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
spnic-objs := hw/sphw_common.o \
hw/sphw_hwif.o \
--
2.27.0
[PATCH openEuler-1.0-LTS] Bluetooth: cmtp: fix file refcount when cmtp_attach_device fails
by Yang Yingliang 04 Nov '21
From: Thadeu Lima de Souza Cascardo <cascardo(a)canonical.com>
mainline inclusion
from mainline-v5.14-rc1
commit 3cfdf8fcaafa62a4123f92eb0f4a72650da3a479
category: bugfix
bugzilla: NA
CVE: CVE-2021-34981
-------------------------------------------------
When cmtp_attach_device fails, cmtp_add_connection returns the error value
which leads to the caller to doing fput through sockfd_put. But
cmtp_session kthread, which is stopped in this path will also call fput,
leading to a potential refcount underflow or a use-after-free.
Add a refcount before we signal the kthread to stop. The kthread will try
to grab the cmtp_session_sem mutex before doing the fput, which is held
when get_file is called, so there should be no races there.
Reported-by: Ryota Shiga
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)canonical.com>
Signed-off-by: Marcel Holtmann <marcel(a)holtmann.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: weiyang wang <wangweiyang2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/bluetooth/cmtp/core.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/bluetooth/cmtp/core.c b/net/bluetooth/cmtp/core.c
index 7f26a5a19ff6d..9873684a9d8ff 100644
--- a/net/bluetooth/cmtp/core.c
+++ b/net/bluetooth/cmtp/core.c
@@ -391,6 +391,11 @@ int cmtp_add_connection(struct cmtp_connadd_req *req, struct socket *sock)
if (!(session->flags & BIT(CMTP_LOOPBACK))) {
err = cmtp_attach_device(session);
if (err < 0) {
+ /* Caller will call fput in case of failure, and so
+ * will cmtp_session kthread.
+ */
+ get_file(session->sock->file);
+
atomic_inc(&session->terminate);
wake_up_interruptible(sk_sleep(session->sock->sk));
up_write(&cmtp_session_sem);
--
2.25.1
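The fix above restores reference-count balance: with the session kthread started, two fput() calls happen on this failure path, so one extra reference must be taken first. A toy sketch of the accounting (illustrative, not the cmtp code):

#include <assert.h>

static int refs = 1;			/* socket file starts with one ref */

static void get_file_sketch(void) { refs++; }
static void fput_sketch(void)     { assert(refs > 0); refs--; }

int main(void)
{
	get_file_sketch();	/* the fix: cover the kthread's fput() */
	fput_sketch();		/* cmtp_session kthread drops its ref */
	fput_sketch();		/* caller drops its ref via sockfd_put() */
	assert(refs == 0);	/* balanced: no underflow, no leak */
	return 0;
}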
[PATCH openEuler-1.0-LTS] scsi: hisi_sas: print status and error when sata io abnormally completed
by Yang Yingliang 04 Nov '21
From: Xingui Yang <yangxingui(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: NA
CVE: NA
---------------------
To help debugging efforts, print d2h status and error
D2H:
FIS Status Bits = 0x53
BSY = 0... .... Off
DRDY = .1.. .... On
DF = ..0. .... Off
DSC = ...1 .... On
DRQ = .... 0... Off
Alignment Error = .... .0.. Off
Sense Data Available = .... ..1. On
ERR = .... ...1 On
FIS Error Bits = 0x40
ICRC = 0... .... Off
UNC = .1.. .... On
MC (O) = ..0. .... Off
IDNF = ...0 .... Off
MCR (O) = .... 0... Off
ABRT = .... .0.. Off
EOM = .... ..0. Off
CCTO = .... ...0 Off
Here is an example print:
hisi_sas_v3_hw 0000:74:02.0: sata d2h status 0x53, error 0x40
Signed-off-by: Xingui Yang <yangxingui(a)huawei.com>
Reviewed-by: Kangfenglong <kangfenglong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index babf6486af526..4508c4a2f02fc 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2419,6 +2419,19 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
error_info[0], error_info[1],
error_info[2], error_info[3]);
+ if ((complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
+ (task->task_proto & SAS_PROTOCOL_SATA ||
+ task->task_proto & SAS_PROTOCOL_STP)) {
+ struct hisi_sas_status_buffer *status_buf =
+ hisi_sas_status_buf_addr_mem(slot);
+ u8 *iu = &status_buf->iu[0];
+ struct dev_to_host_fis *d2h =
+ (struct dev_to_host_fis *)iu;
+
+ dev_info(dev, "sata d2h status 0x%02x, error 0x%02x\n",
+ d2h->status, d2h->error);
+ }
+
if ((error_info[3] & RX_DATA_LEN_UNDERFLOW_MSK) &&
(task->task_proto == SAS_PROTOCOL_SSP)) {
/*print detail sense info when data underflow happened*/
--
2.25.1
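The bit breakdown in the commit message above follows the standard ATA status/error byte layout; a small sketch decoding the example values (the helper name is illustrative):

#include <stdio.h>

/* Standard ATA status bits, matching the table in the commit log. */
#define ATA_BSY    0x80
#define ATA_DRDY   0x40
#define ATA_DF     0x20
#define ATA_DSC    0x10
#define ATA_DRQ    0x08
#define ATA_ALIGN  0x04	/* alignment error */
#define ATA_SENSE  0x02	/* sense data available */
#define ATA_ERR    0x01
#define ATA_UNC    0x40	/* error byte: uncorrectable data error */

static void decode_d2h_sketch(unsigned char status, unsigned char error)
{
	printf("sata d2h status 0x%02x, error 0x%02x\n", status, error);
	if (status & ATA_ERR)
		printf("  ERR set%s\n",
		       (error & ATA_UNC) ? ", UNC: uncorrectable data" : "");
}

int main(void)
{
	/* 0x53 = DRDY | DSC | SENSE | ERR, 0x40 = UNC -- the same
	 * combination shown in the example print above.
	 */
	decode_d2h_sketch(0x53, 0x40);
	return 0;
}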
[PATCH openEuler-1.0-LTS 1/2] Revert "scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock"
by Yang Yingliang 04 Nov '21
From: Xingui Yang <yangxingui(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: NA
CVE: NA
---------------------
This reverts commit 5c725a983c0dedd067e5e643633db6fb5ecbeb91.
The reverted optimization depends on the kernel block MQ patches;
if those are not integrated, a spinlock deadlock may occur.
Signed-off-by: Xingui Yang <yangxingui(a)huawei.com>
Reviewed-by: Kangfenglong <kangfenglong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 50 ++++++++++++++------------
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 12 ++++---
2 files changed, 35 insertions(+), 27 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index bde4307596234..39fe67239f929 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -172,11 +172,13 @@ static void hisi_sas_slot_index_clear(struct hisi_hba *hisi_hba, int slot_idx)
static void hisi_sas_slot_index_free(struct hisi_hba *hisi_hba, int slot_idx)
{
+ unsigned long flags;
+
if (hisi_hba->hw->slot_index_alloc || (slot_idx >=
hisi_hba->hw->max_command_entries - HISI_SAS_RESERVED_IPTT_CNT)) {
- spin_lock(&hisi_hba->lock);
+ spin_lock_irqsave(&hisi_hba->lock, flags);
hisi_sas_slot_index_clear(hisi_hba, slot_idx);
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
}
}
@@ -192,12 +194,13 @@ static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba,
{
int index;
void *bitmap = hisi_hba->slot_index_tags;
+ unsigned long flags;
if (scsi_cmnd) {
return scsi_cmnd->request->tag;
}
- spin_lock(&hisi_hba->lock);
+ spin_lock_irqsave(&hisi_hba->lock, flags);
index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count,
hisi_hba->last_slot_index + 1);
if (index >= hisi_hba->slot_index_count) {
@@ -206,13 +209,13 @@ static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba,
hisi_hba->hw->max_command_entries -
HISI_SAS_RESERVED_IPTT_CNT);
if (index >= hisi_hba->slot_index_count) {
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
return -SAS_QUEUE_FULL;
}
}
hisi_sas_slot_index_set(hisi_hba, index);
hisi_hba->last_slot_index = index;
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
return index;
}
@@ -228,6 +231,7 @@ static void hisi_sas_slot_index_init(struct hisi_hba *hisi_hba)
void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task,
struct hisi_sas_slot *slot)
{
+ unsigned long flags;
int device_id = slot->device_id;
struct hisi_sas_device *sas_dev = &hisi_hba->devices[device_id];
@@ -256,9 +260,9 @@ void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task,
}
}
- spin_lock(&sas_dev->lock);
+ spin_lock_irqsave(&sas_dev->lock, flags);
list_del_init(&slot->entry);
- spin_unlock(&sas_dev->lock);
+ spin_unlock_irqrestore(&sas_dev->lock, flags);
memset(slot, 0, offsetof(struct hisi_sas_slot, buf));
@@ -510,14 +514,14 @@ static int hisi_sas_task_prep(struct sas_task *task,
slot_idx = rc;
slot = &hisi_hba->slot_info[slot_idx];
- spin_lock(&dq->lock);
+ spin_lock_irqsave(&dq->lock, flags);
wr_q_index = dq->wr_point;
dq->wr_point = (dq->wr_point + 1) % HISI_SAS_QUEUE_SLOTS;
list_add_tail(&slot->delivery, &dq->list);
- spin_unlock(&dq->lock);
- spin_lock(&sas_dev->lock);
+ spin_unlock_irqrestore(&dq->lock, flags);
+ spin_lock_irqsave(&sas_dev->lock, flags);
list_add_tail(&slot->entry, &sas_dev->list);
- spin_unlock(&sas_dev->lock);
+ spin_unlock_irqrestore(&sas_dev->lock, flags);
dlvry_queue = dq->id;
dlvry_queue_slot = wr_q_index;
@@ -583,6 +587,7 @@ static int hisi_sas_task_exec(struct sas_task *task, gfp_t gfp_flags,
{
u32 rc;
u32 pass = 0;
+ unsigned long flags;
struct hisi_hba *hisi_hba;
struct device *dev;
struct domain_device *device = task->dev;
@@ -616,9 +621,9 @@ static int hisi_sas_task_exec(struct sas_task *task, gfp_t gfp_flags,
dev_err(dev, "task exec: failed[%d]!\n", rc);
if (likely(pass)) {
- spin_lock(&dq->lock);
+ spin_lock_irqsave(&dq->lock, flags);
hisi_hba->hw->start_delivery(dq);
- spin_unlock(&dq->lock);
+ spin_unlock_irqrestore(&dq->lock, flags);
}
return rc;
@@ -669,11 +674,12 @@ static struct hisi_sas_device *hisi_sas_alloc_dev(struct domain_device *device)
{
struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
struct hisi_sas_device *sas_dev = NULL;
+ unsigned long flags;
int first = (hisi_hba->last_dev_id + 1) % HISI_SAS_MAX_DEVICES;
int dev_id;
int i;
- spin_lock(&hisi_hba->lock);
+ spin_lock_irqsave(&hisi_hba->lock, flags);
for (i = first; i < first + HISI_SAS_MAX_DEVICES; i++) {
dev_id = i % HISI_SAS_MAX_DEVICES;
if (hisi_hba->devices[dev_id].dev_type == SAS_PHY_UNUSED) {
@@ -694,7 +700,7 @@ static struct hisi_sas_device *hisi_sas_alloc_dev(struct domain_device *device)
}
if (sas_dev)
hisi_hba->last_dev_id = i;
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
return sas_dev;
}
@@ -1964,7 +1970,7 @@ hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba,
struct asd_sas_port *sas_port = device->port;
struct hisi_sas_cmd_hdr *cmd_hdr_base;
int dlvry_queue_slot, dlvry_queue, n_elem = 0, rc, slot_idx;
- unsigned long flags;
+ unsigned long flags, flags_dq = 0;
int wr_q_index;
if (unlikely(test_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags)))
@@ -1983,14 +1989,14 @@ hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba,
slot_idx = rc;
slot = &hisi_hba->slot_info[slot_idx];
- spin_lock(&dq->lock);
+ spin_lock_irqsave(&dq->lock, flags_dq);
wr_q_index = dq->wr_point;
dq->wr_point = (dq->wr_point + 1) % HISI_SAS_QUEUE_SLOTS;
list_add_tail(&slot->delivery, &dq->list);
- spin_unlock(&dq->lock);
- spin_lock(&sas_dev->lock);
+ spin_unlock_irqrestore(&dq->lock, flags_dq);
+ spin_lock_irqsave(&sas_dev->lock, flags);
list_add_tail(&slot->entry, &sas_dev->list);
- spin_unlock(&sas_dev->lock);
+ spin_unlock_irqrestore(&sas_dev->lock, flags);
dlvry_queue = dq->id;
dlvry_queue_slot = wr_q_index;
@@ -2019,9 +2025,9 @@ hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba,
spin_unlock_irqrestore(&task->task_state_lock, flags);
WRITE_ONCE(slot->ready, 1);
/* send abort command to the chip */
- spin_lock(&dq->lock);
+ spin_lock_irqsave(&dq->lock, flags);
hisi_hba->hw->start_delivery(dq);
- spin_unlock(&dq->lock);
+ spin_unlock_irqrestore(&dq->lock, flags);
return 0;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 730191e7f55b8..fa4233454dccb 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -778,6 +778,7 @@ slot_index_alloc_quirk_v2_hw(struct hisi_hba *hisi_hba,
struct hisi_sas_device *sas_dev = device->lldd_dev;
int sata_idx = sas_dev->sata_idx;
int start, end;
+ unsigned long flags;
if (!sata_dev) {
/*
@@ -801,12 +802,12 @@ slot_index_alloc_quirk_v2_hw(struct hisi_hba *hisi_hba,
end = 64 * (sata_idx + 2);
}
- spin_lock(&hisi_hba->lock);
+ spin_lock_irqsave(&hisi_hba->lock, flags);
while (1) {
start = find_next_zero_bit(bitmap,
hisi_hba->slot_index_count, start);
if (start >= end) {
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
return -SAS_QUEUE_FULL;
}
/*
@@ -818,7 +819,7 @@ slot_index_alloc_quirk_v2_hw(struct hisi_hba *hisi_hba,
}
set_bit(start, bitmap);
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
return start;
}
@@ -847,8 +848,9 @@ hisi_sas_device *alloc_dev_quirk_v2_hw(struct domain_device *device)
struct hisi_sas_device *sas_dev = NULL;
int i, sata_dev = dev_is_sata(device);
int sata_idx = -1;
+ unsigned long flags;
- spin_lock(&hisi_hba->lock);
+ spin_lock_irqsave(&hisi_hba->lock, flags);
if (sata_dev)
if (!sata_index_alloc_v2_hw(hisi_hba, &sata_idx))
@@ -879,7 +881,7 @@ hisi_sas_device *alloc_dev_quirk_v2_hw(struct domain_device *device)
}
out:
- spin_unlock(&hisi_hba->lock);
+ spin_unlock_irqrestore(&hisi_hba->lock, flags);
return sas_dev;
}
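
As background on the pattern applied throughout this patch: the _irqsave variants are required when a lock may be taken from both process context and interrupt context on the same CPU. A minimal sketch with hypothetical names (not code from this driver):

```c
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(demo_lock);	/* hypothetical lock */
static int demo_counter;		/* hypothetical shared state */

/* Callable from process context and from IRQ handlers alike. */
static void demo_update(void)
{
	unsigned long flags;

	/*
	 * A plain spin_lock() would deadlock if an interrupt arrived on
	 * this CPU while the lock is held and the handler tried to take
	 * the same lock. The _irqsave variant disables local interrupts
	 * and saves their previous state in 'flags' so that nesting is
	 * restored correctly on unlock.
	 */
	spin_lock_irqsave(&demo_lock, flags);
	demo_counter++;
	spin_unlock_irqrestore(&demo_lock, flags);
}
```

Note how the patch gives each critical section its own saved-flags variable (flags and flags_dq in hisi_sas_internal_abort_task_exec()) so that every save is paired with exactly one restore.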
--
2.25.1
1
1
[PATCH openEuler-1.0-LTS 01/18] net: hns3: limit bd numbers when getting dfx regs.
by Yang Yingliang 04 Nov '21
by Yang Yingliang 04 Nov '21
04 Nov '21
From: Yonglong Liu <liuyonglong(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: NA
CVE: NA
----------------------------
When getting dfx regs, the bd number reported by the firmware may be
unexpectedly large, which can overflow the buffer size calculation.
This patch limits the max bd number to 64 to fix the problem.
Signed-off-by: Yonglong Liu <liuyonglong(a)huawei.com>
Reviewed-by: li yongxin <liyongxin1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 8507eb60450fe..ac5f502d5e9be 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -11180,6 +11180,8 @@ static int hclge_get_dfx_reg_len(struct hclge_dev *hdev, int *len)
static int hclge_get_dfx_reg(struct hclge_dev *hdev, void *data)
{
+#define HCLGE_DFX_BD_NUM_MAX 64
+
u32 dfx_reg_type_num = ARRAY_SIZE(hclge_dfx_bd_offset_list);
int bd_num, bd_num_max, buf_len, i;
struct hclge_desc *desc_src;
@@ -11202,6 +11204,13 @@ static int hclge_get_dfx_reg(struct hclge_dev *hdev, void *data)
for (i = 1; i < dfx_reg_type_num; i++)
bd_num_max = max_t(int, bd_num_max, bd_num_list[i]);
+ if (bd_num_max > HCLGE_DFX_BD_NUM_MAX) {
+ dev_err(&hdev->pdev->dev,
+ "Get dfx reg fail, invalid bd number: %d\n",
+ bd_num_max);
+ goto out;
+ }
+
buf_len = sizeof(*desc_src) * bd_num_max;
desc_src = kzalloc(buf_len, GFP_KERNEL);
if (!desc_src) {
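
The fix follows a general rule: a count reported by firmware is untrusted input and must be validated before it sizes an allocation. A minimal sketch of the same pattern with hypothetical names (not the driver code itself):

```c
#include <linux/slab.h>
#include <linux/device.h>

#define DEMO_BD_NUM_MAX 64	/* hypothetical upper bound */

/* 'bd_num' comes from firmware and must not be trusted blindly. */
static void *demo_alloc_descs(struct device *dev, int bd_num,
			      size_t desc_size)
{
	if (bd_num <= 0 || bd_num > DEMO_BD_NUM_MAX) {
		dev_err(dev, "invalid bd number: %d\n", bd_num);
		return NULL;
	}
	/* The multiplication is now bounded and cannot overflow. */
	return kzalloc(desc_size * bd_num, GFP_KERNEL);
}
```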
--
2.25.1
1
17
[PATCH openEuler-1.0-LTS 1/2] s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant
by Yang Yingliang 04 Nov '21
by Yang Yingliang 04 Nov '21
04 Nov '21
From: Ilya Leoshkevich <iii(a)linux.ibm.com>
stable inclusion
from linux-4.19.207
commit e15c2fe2def24324bfdbfb7ec2837e40b2aac7fd
CVE: CVE-2021-20320
--------------------------------
commit 6e61dc9da0b7a0d91d57c2e20b5ea4fd2d4e7e53 upstream.
The JIT uses agfi for subtracting constants, but -(-0x80000000) cannot
be represented as a 32-bit signed binary integer. Fix by using algfi in
this particular case.
Reported-by: Johan Almbladh <johan.almbladh(a)anyfinetworks.com>
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii(a)linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Signed-off-by: He Fengqing <hefengqing(a)huawei.com>
Reviewed-by: weiyang wang <wangweiyang2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/s390/net/bpf_jit_comp.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 914b655eb5ba7..a8ee2f5c827d6 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -595,8 +595,13 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
case BPF_ALU64 | BPF_SUB | BPF_K: /* dst = dst - imm */
if (!imm)
break;
- /* agfi %dst,-imm */
- EMIT6_IMM(0xc2080000, dst_reg, -imm);
+ if (imm == -0x80000000) {
+ /* algfi %dst,0x80000000 */
+ EMIT6_IMM(0xc20a0000, dst_reg, 0x80000000);
+ } else {
+ /* agfi %dst,-imm */
+ EMIT6_IMM(0xc2080000, dst_reg, -imm);
+ }
break;
/*
* BPF_MUL
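
The arithmetic corner case behind this fix is plain two's-complement behavior: negating -0x80000000 overflows a 32-bit signed integer, since +0x80000000 is one past INT32_MAX. A standalone illustration:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	int32_t imm = INT32_MIN;	/* -0x80000000 */

	/*
	 * -imm would be +0x80000000, which does not fit in int32_t:
	 * signed overflow, undefined behavior in C. Widening to 64 bits
	 * first shows the value the JIT actually has to emit, which is
	 * why the patch switches to an unsigned add (algfi) of
	 * 0x80000000 for this one constant.
	 */
	int64_t negated = -(int64_t)imm;

	printf("imm=%d  -imm (widened)=%lld\n", imm, (long long)negated);
	return 0;
}
```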
--
2.25.1
1
1
[PATCH openEuler-21.09 2/2] tools: add a tool to calculate the CPU utilization rate
by Hongyu Li 03 Nov '21
by Hongyu Li 03 Nov '21
03 Nov '21
openEuler inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CIJQ
CVE: NA
----------------------------------------------------------------------
This tool helps calculate the CPU utilization rate with higher precision.
Signed-off-by: Hongyu Li <543306408(a)qq.com>
---
tools/accounting/Makefile | 2 +-
tools/accounting/cpu_rate_cal.c | 91 ++++++++++++++++++++++++++++
tools/accounting/cpu_rate_cal_readme | 15 +++++
3 files changed, 107 insertions(+), 1 deletion(-)
create mode 100644 tools/accounting/cpu_rate_cal.c
create mode 100644 tools/accounting/cpu_rate_cal_readme
diff --git a/tools/accounting/Makefile b/tools/accounting/Makefile
index 03687f19cbb1..471fbdd7b07b 100644
--- a/tools/accounting/Makefile
+++ b/tools/accounting/Makefile
@@ -2,7 +2,7 @@
CC := $(CROSS_COMPILE)gcc
CFLAGS := -I../../usr/include
-PROGS := getdelays
+PROGS := getdelays cpu_rate_cal
all: $(PROGS)
diff --git a/tools/accounting/cpu_rate_cal.c b/tools/accounting/cpu_rate_cal.c
new file mode 100644
index 000000000000..6d621b37c70e
--- /dev/null
+++ b/tools/accounting/cpu_rate_cal.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * cpu_rate_cal.c
+ *
+ * Copyright (C) 2021
+ *
+ * cpu idle time accounting
+ */
+
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <string.h>
+#include <unistd.h>
+#include <time.h>
+#include <limits.h>
+#include <sys/time.h>
+
+#define BUFFSIZE 4096
+#define HZ 100
+#define FILE_NAME "/proc/stat2"
+
+struct cpu_info {
+ char name[BUFFSIZE];
+ long long value[1];
+};
+
+int main(void)
+{
+ int cpu_number = sysconf(_SC_NPROCESSORS_ONLN);
+ struct cpu_info *cpus = (struct cpu_info *)malloc(sizeof(struct cpu_info)*cpu_number);
+ struct cpu_info *cpus_2 = (struct cpu_info *)malloc(sizeof(struct cpu_info)*cpu_number);
+
+ char buf[BUFFSIZE];
+ long long sub;
+ double value;
+
+ while (1) {
+ FILE *fp = fopen(FILE_NAME, "r");
+ int i = 0;
+ struct timeval start, end;
+
+
+ while (i < cpu_number+1) {
+ int n = fscanf(fp, "%s %lld\n", cpus[i].name, &cpus[i].value[0]);
+
+ if (n < 0) {
+ printf("wrong");
+ return -1;
+ }
+ i += 1;
+ }
+
+ gettimeofday(&start, NULL);
+ fclose(fp);
+ i = 0;
+
+ sleep(1);
+
+ FILE *fp_2 = fopen(FILE_NAME, "r");
+
+ while (i < cpu_number+1) {
+ int n = fscanf(fp_2, "%s %lld\n", cpus_2[i].name, &cpus_2[i].value[0]);
+
+ if (n < 0) {
+ printf("wrong");
+ return -1;
+ }
+ i += 1;
+ }
+
+ gettimeofday(&end, NULL);
+ fclose(fp_2);
+
+ sub = end.tv_sec-start.tv_sec;
+ value = sub*1000000.0+end.tv_usec-start.tv_usec;
+ system("reset");
+ printf("CPU idle rate %f\n", 1000000/HZ*(cpus_2[0].value[0]-cpus[0].value[0])
+ /value);
+
+ for (int i = 1; i < cpu_number+1; i++) {
+ printf("CPU%d idle rate %f\n", i-1, 1-1000000/HZ
+ *(cpus_2[i].value[0]-cpus[i].value[0])/value);
+ }
+ }
+ return 0;
+}
+
diff --git a/tools/accounting/cpu_rate_cal_readme b/tools/accounting/cpu_rate_cal_readme
new file mode 100644
index 000000000000..01b5d8d930fe
--- /dev/null
+++ b/tools/accounting/cpu_rate_cal_readme
@@ -0,0 +1,15 @@
+# calculate the cpu utilization rate
+
+cpu_rate_cal.c is a tool to calculate the CPU utilization rate. It prints statistics every second: the first line covers all CPUs together, and the following lines give the per-CPU figures.
+
+This tool can be compiled by running
+
+```sh
+gcc cpu_rate_cal.c -o cpu_rate_cal
+```
+
+We can use it by running
+
+```sh
+./cpu_rate_cal
+```
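
To sanity-check the arithmetic the tool performs, here is a worked example with made-up numbers, assuming HZ=100 (10,000 µs per tick, matching the constant in cpu_rate_cal.c):

```c
#include <stdio.h>

int main(void)
{
	const double usec_per_tick = 1000000.0 / 100;	/* HZ = 100 */
	long long idle_ticks_delta = 80;  /* stat2 delta over the interval */
	double elapsed_usec = 1000000.0;  /* ~1 s between the two reads */

	double idle = idle_ticks_delta * usec_per_tick / elapsed_usec;
	printf("idle fraction = %.2f, utilization = %.2f\n",
	       idle, 1.0 - idle);	/* 0.80 and 0.20 */
	return 0;
}
```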
--
2.17.1
1
0
openEuler inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CIJQ
CVE: NA
----------------------------------------------------------------------
The default way of calculating CPU utilization is to check which task is
running at each tick, which leads to inaccurate results.
This problem can be solved by counting the idle time via the scheduler
rather than via the tick interval: we record the time when the idle
process starts executing and compute its execution time when the idle
process is switched out.
The idle time of each CPU is exposed in the /proc/stat2 file, which gives
higher precision in accounting the CPU idle time than /proc/stat.
Signed-off-by: Hongyu Li <543306408(a)qq.com>
---
fs/proc/Kconfig | 7 ++++
fs/proc/Makefile | 1 +
fs/proc/stat2.c | 91 ++++++++++++++++++++++++++++++++++++++++++
kernel/sched/cputime.c | 34 ++++++++++++++++
kernel/sched/idle.c | 38 ++++++++++++++++++
5 files changed, 171 insertions(+)
create mode 100644 fs/proc/stat2.c
diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig
index c930001056f9..33588a37579e 100644
--- a/fs/proc/Kconfig
+++ b/fs/proc/Kconfig
@@ -107,3 +107,10 @@ config PROC_PID_ARCH_STATUS
config PROC_CPU_RESCTRL
def_bool n
depends on PROC_FS
+
+config PROC_IDLE
+ bool "include /proc/stat2 file"
+ depends on PROC_FS
+ default y
+ help
+ Provide the CPU idle time in the /proc/stat2 file.
diff --git a/fs/proc/Makefile b/fs/proc/Makefile
index 8704d41dd67c..b0d5f2b347d7 100644
--- a/fs/proc/Makefile
+++ b/fs/proc/Makefile
@@ -34,5 +34,6 @@ proc-$(CONFIG_PROC_VMCORE) += vmcore.o
proc-$(CONFIG_PRINTK) += kmsg.o
proc-$(CONFIG_PROC_PAGE_MONITOR) += page.o
proc-$(CONFIG_BOOT_CONFIG) += bootconfig.o
+proc-$(CONFIG_PROC_IDLE) += stat2.o
obj-$(CONFIG_ETMEM_SCAN) += etmem_scan.o
obj-$(CONFIG_ETMEM_SWAP) += etmem_swap.o
diff --git a/fs/proc/stat2.c b/fs/proc/stat2.c
new file mode 100644
index 000000000000..6036a946c71d
--- /dev/null
+++ b/fs/proc/stat2.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/proc/stat2.c
+ *
+ * Copyright (C) 2007
+ *
+ * cpu idle time accounting
+ */
+
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/kernel_stat.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/sched.h>
+#include <linux/sched/stat.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/time.h>
+#include <linux/irqnr.h>
+#include <linux/sched/cputime.h>
+#include <linux/tick.h>
+
+#ifdef CONFIG_PROC_IDLE
+
+#define PROC_NAME "stat2"
+
+extern u64 cal_idle_sum_exec_runtime(int cpu);
+
+static u64 get_idle_sum_exec_runtime(int cpu)
+{
+ u64 idle = cal_idle_sum_exec_runtime(cpu);
+
+ return idle;
+}
+
+static int show_idle(struct seq_file *p, void *v)
+{
+ int i;
+ u64 idle;
+
+ idle = 0;
+
+ for_each_possible_cpu(i) {
+
+ idle += get_idle_sum_exec_runtime(i);
+
+ }
+
+ seq_put_decimal_ull(p, "cpu ", nsec_to_clock_t(idle));
+ seq_putc(p, '\n');
+
+ for_each_online_cpu(i) {
+
+ idle = get_idle_sum_exec_runtime(i);
+
+ seq_printf(p, "cpu%d", i);
+ seq_put_decimal_ull(p, " ", nsec_to_clock_t(idle));
+ seq_putc(p, '\n');
+ }
+
+ return 0;
+}
+
+static int idle_open(struct inode *inode, struct file *file)
+{
+ unsigned int size = 32 + 32 * num_online_cpus();
+
+ return single_open_size(file, show_idle, NULL, size);
+}
+
+static struct proc_ops idle_procs_ops = {
+ .proc_open = idle_open,
+ .proc_read_iter = seq_read_iter,
+ .proc_lseek = seq_lseek,
+ .proc_release = single_release,
+};
+
+static int __init kernel_module_init(void)
+{
+ proc_create(PROC_NAME, 0, NULL, &idle_procs_ops);
+ return 0;
+}
+
+fs_initcall(kernel_module_init);
+
+#endif /*CONFIG_PROC_IDLE*/
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5a55d2300452..25218a8f822f 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -19,6 +19,8 @@
*/
DEFINE_PER_CPU(struct irqtime, cpu_irqtime);
+extern struct static_key_true proc_idle;
+
static int sched_clock_irqtime;
void enable_sched_clock_irqtime(void)
@@ -1078,3 +1080,35 @@ void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu)
EXPORT_SYMBOL_GPL(kcpustat_cpu_fetch);
#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
+
+
+#ifdef CONFIG_PROC_IDLE
+
+
+u64 cal_idle_sum_exec_runtime(int cpu)
+{
+ struct rq *rq = cpu_rq(cpu);
+ struct sched_entity *idle_se = &rq->idle->se;
+ u64 idle = idle_se->sum_exec_runtime;
+
+ if (!static_branch_likely(&proc_idle))
+ return 0ULL;
+
+ if (rq->curr == rq->idle) {
+ u64 now = sched_clock();
+ u64 delta_exec;
+
+ delta_exec = now - idle_se->exec_start;
+ if (unlikely((s64)delta_exec <= 0))
+ return idle;
+
+ schedstat_set(idle_se->statistics.exec_max,
+ max(delta_exec, idle_se->statistics.exec_max));
+
+ idle += delta_exec;
+ }
+
+ return idle;
+}
+
+#endif /* CONFIG_PROC_IDLE */
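
In short, the scheduler stamps exec_start when the idle task is picked (pick_next_task_idle() in the idle.c hunks below) and folds the elapsed delta into sum_exec_runtime when the idle task is switched out (put_prev_task_idle()); the read side above only has to add the still-running delta. A compact userspace restatement of that bookkeeping, with a plain struct instead of sched_entity (illustrative names only):

```c
#include <stdio.h>

struct idle_clock {
	unsigned long long exec_start;		/* when idle began running */
	unsigned long long sum_exec_runtime;	/* total idle time so far */
};

/* Mirrors pick_next_task_idle(): record when idle starts. */
static void idle_enter(struct idle_clock *c, unsigned long long now)
{
	c->exec_start = now;
}

/* Mirrors put_prev_task_idle(): accumulate the elapsed delta. */
static void idle_exit(struct idle_clock *c, unsigned long long now)
{
	if (now > c->exec_start)
		c->sum_exec_runtime += now - c->exec_start;
}

int main(void)
{
	struct idle_clock c = { 0, 0 };

	idle_enter(&c, 1000);
	idle_exit(&c, 4000);
	printf("idle time: %llu\n", c.sum_exec_runtime);	/* 3000 */
	return 0;
}
```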
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 36b545f17206..3714a1c0d57b 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -10,6 +10,8 @@
#include <trace/events/power.h>
+DEFINE_STATIC_KEY_TRUE(proc_idle);
+
/* Linker adds these: start and end of __cpuidle functions */
extern char __cpuidle_text_start[], __cpuidle_text_end[];
@@ -424,8 +426,35 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl
static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
{
+#ifdef CONFIG_PROC_IDLE
+ struct sched_entity *idle_se = &rq->idle->se;
+ u64 now;
+ u64 delta_exec;
+
+ if (!static_branch_likely(&proc_idle))
+ return;
+
+ now = sched_clock();
+ delta_exec = now - idle_se->exec_start;
+ if (unlikely((s64)delta_exec <= 0))
+ return;
+
+ schedstat_set(idle_se->statistics.exec_max,
+ max(delta_exec, idle_se->statistics.exec_max));
+
+ idle_se->sum_exec_runtime += delta_exec;
+#endif
}
+#ifdef CONFIG_PROC_IDLE
+static int __init init_proc_idle(char *str)
+{
+ if (!strcmp(str, "false"))
+ static_branch_disable(&proc_idle);
+
+ return 1;
+}
+__setup("proc_idle=", init_proc_idle);
+#endif
static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool first)
{
update_idle_core(rq);
@@ -436,6 +465,15 @@ struct task_struct *pick_next_task_idle(struct rq *rq)
{
struct task_struct *next = rq->idle;
+#ifdef CONFIG_PROC_IDLE
+ if (static_branch_likely(&proc_idle)) {
+ struct sched_entity *idle_se = &rq->idle->se;
+ u64 now = sched_clock();
+
+ idle_se->exec_start = now;
+ }
+#endif
+
set_next_task_idle(rq, next, true);
return next;
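
Given the two-column format that show_idle() emits (an aggregate "cpu <ticks>" line followed by one "cpuN <ticks>" line per online CPU), a minimal userspace reader could look like this — a sketch based only on the format visible in the patch, with error handling kept to a minimum:

```c
#include <stdio.h>

int main(void)
{
	FILE *fp = fopen("/proc/stat2", "r");
	char name[32];
	long long idle_ticks;

	if (!fp) {
		perror("fopen /proc/stat2");
		return 1;
	}
	while (fscanf(fp, "%31s %lld", name, &idle_ticks) == 2)
		printf("%s: %lld idle ticks\n", name, idle_ticks);
	fclose(fp);
	return 0;
}
```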
--
2.17.1
1
0
[PATCH openEuler-21.09 0/2] Improve the precision of accounting the CPU utilization rate
by Hongyu Li 03 Nov '21
by Hongyu Li 03 Nov '21
03 Nov '21
The current way of calculating the CPU utilization rate is not accurate.
The accounting system only works at the granularity of two ticks; however,
a process can give up the CPU before the tick ends.
This can be fixed by counting the idle time via the scheduler. We can
use the sum_exec_runtime of the idle process of each CPU to calculate
the CPU utilization rate. The idle time of each CPU is given in the
/proc/stat2 file. An example of using this file is also attached.
Hongyu Li (2):
eulerfs: add the /proc/stat2 file
tools: add a tool to calculate the CPU utilization rate
fs/proc/Kconfig | 7 +++
fs/proc/Makefile | 1 +
fs/proc/stat2.c | 91 ++++++++++++++++++++++++++++
kernel/sched/cputime.c | 34 +++++++++++
kernel/sched/idle.c | 38 ++++++++++++
tools/accounting/Makefile | 2 +-
tools/accounting/cpu_rate_cal.c | 91 ++++++++++++++++++++++++++++
tools/accounting/cpu_rate_cal_readme | 15 +++++
8 files changed, 278 insertions(+), 1 deletion(-)
create mode 100644 fs/proc/stat2.c
create mode 100644 tools/accounting/cpu_rate_cal.c
create mode 100644 tools/accounting/cpu_rate_cal_readme
--
2.17.1
1
0
Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DBD7
CVE: NA
Fix two compile errors:
1. Compilation fails when O=xxx is specified;
2. Compilation conflicts occur when spfc and spnic are compiled in parallel with the -j option, because they share some .c files
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
drivers/scsi/spfc/Makefile | 30 +++++++++++++++---------------
drivers/scsi/spfc/sphw_api_cmd.c | 1 +
drivers/scsi/spfc/sphw_cmdq.c | 1 +
drivers/scsi/spfc/sphw_common.c | 1 +
drivers/scsi/spfc/sphw_eqs.c | 1 +
drivers/scsi/spfc/sphw_hw_cfg.c | 1 +
drivers/scsi/spfc/sphw_hw_comm.c | 1 +
drivers/scsi/spfc/sphw_hwdev.c | 1 +
drivers/scsi/spfc/sphw_hwif.c | 1 +
drivers/scsi/spfc/sphw_mbox.c | 1 +
drivers/scsi/spfc/sphw_mgmt.c | 1 +
drivers/scsi/spfc/sphw_prof_adap.c | 1 +
drivers/scsi/spfc/sphw_wq.c | 1 +
13 files changed, 27 insertions(+), 15 deletions(-)
create mode 120000 drivers/scsi/spfc/sphw_api_cmd.c
create mode 120000 drivers/scsi/spfc/sphw_cmdq.c
create mode 120000 drivers/scsi/spfc/sphw_common.c
create mode 120000 drivers/scsi/spfc/sphw_eqs.c
create mode 120000 drivers/scsi/spfc/sphw_hw_cfg.c
create mode 120000 drivers/scsi/spfc/sphw_hw_comm.c
create mode 120000 drivers/scsi/spfc/sphw_hwdev.c
create mode 120000 drivers/scsi/spfc/sphw_hwif.c
create mode 120000 drivers/scsi/spfc/sphw_mbox.c
create mode 120000 drivers/scsi/spfc/sphw_mgmt.c
create mode 120000 drivers/scsi/spfc/sphw_prof_adap.c
create mode 120000 drivers/scsi/spfc/sphw_wq.c
diff --git a/drivers/scsi/spfc/Makefile b/drivers/scsi/spfc/Makefile
index 02fe0213e048..849b730ac733 100644
--- a/drivers/scsi/spfc/Makefile
+++ b/drivers/scsi/spfc/Makefile
@@ -1,9 +1,9 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPFC) += spfc.o
-subdir-ccflags-y += -I$(src)/../../net/ethernet/ramaxel/spnic/hw
-subdir-ccflags-y += -I$(src)/hw
-subdir-ccflags-y += -I$(src)/common
+subdir-ccflags-y += -I$(srctree)/$(src)/../../net/ethernet/ramaxel/spnic/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/common
spfc-objs := common/unf_init.o \
common/unf_event.o \
@@ -33,15 +33,15 @@ spfc-objs := common/unf_init.o \
hw/spfc_cqm_bitmap_table.o \
hw/spfc_cqm_main.o \
hw/spfc_cqm_object.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hwdev.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hw_comm.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_prof_adap.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_common.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_hwif.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_wq.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_eqs.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_mbox.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.o \
- ../../net/ethernet/ramaxel/spnic/hw/sphw_api_cmd.o
+ sphw_hwdev.o \
+ sphw_hw_cfg.o \
+ sphw_hw_comm.o \
+ sphw_prof_adap.o \
+ sphw_common.o \
+ sphw_hwif.o \
+ sphw_wq.o \
+ sphw_cmdq.o \
+ sphw_eqs.o \
+ sphw_mbox.o \
+ sphw_mgmt.o \
+ sphw_api_cmd.o
diff --git a/drivers/scsi/spfc/sphw_api_cmd.c b/drivers/scsi/spfc/sphw_api_cmd.c
new file mode 120000
index 000000000000..27c7c0770fa3
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_api_cmd.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_api_cmd.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_cmdq.c b/drivers/scsi/spfc/sphw_cmdq.c
new file mode 120000
index 000000000000..5ac779ba274b
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_cmdq.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_common.c b/drivers/scsi/spfc/sphw_common.c
new file mode 120000
index 000000000000..a1a30a4840e1
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_common.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_common.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_eqs.c b/drivers/scsi/spfc/sphw_eqs.c
new file mode 120000
index 000000000000..74430dcb9dc5
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_eqs.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_eqs.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hw_cfg.c b/drivers/scsi/spfc/sphw_hw_cfg.c
new file mode 120000
index 000000000000..4f43d68624c1
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hw_cfg.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hw_comm.c b/drivers/scsi/spfc/sphw_hw_comm.c
new file mode 120000
index 000000000000..c943b3b2933a
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hw_comm.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hw_comm.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hwdev.c b/drivers/scsi/spfc/sphw_hwdev.c
new file mode 120000
index 000000000000..b7279f17eaa2
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hwdev.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hwdev.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_hwif.c b/drivers/scsi/spfc/sphw_hwif.c
new file mode 120000
index 000000000000..d40ef71f9033
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_hwif.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_hwif.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_mbox.c b/drivers/scsi/spfc/sphw_mbox.c
new file mode 120000
index 000000000000..1b00fe7289cc
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_mbox.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_mbox.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_mgmt.c b/drivers/scsi/spfc/sphw_mgmt.c
new file mode 120000
index 000000000000..fd18a73e9d3a
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_mgmt.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_prof_adap.c b/drivers/scsi/spfc/sphw_prof_adap.c
new file mode 120000
index 000000000000..fbc7db05dd27
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_prof_adap.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_prof_adap.c
\ No newline at end of file
diff --git a/drivers/scsi/spfc/sphw_wq.c b/drivers/scsi/spfc/sphw_wq.c
new file mode 120000
index 000000000000..cdfcb3a610c0
--- /dev/null
+++ b/drivers/scsi/spfc/sphw_wq.c
@@ -0,0 +1 @@
+../../net/ethernet/ramaxel/spnic/hw/sphw_wq.c
\ No newline at end of file
--
2.27.0
1
0
02 Nov '21
From: Yu Kuai <yukuai3(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 182920, https://gitee.com/openeuler/kernel/issues/I4GLNX
CVE: NA
---------------------------
When a user passes 0x100000 as the index, nbd ends up creating the sysfs
dir "/sys/block/43:0":
nbd_dev_add
disk->first_minor = index << part_shift
-> default part_shift is 5, 0x100000 << 5 = 0x2000000
device_add_disk
blk_alloc_devt
MKDEV(disk->major, disk->first_minor + part->partno)
-> (0x2b << 20) | (0x2000000) = 0x2b00000
register_disk
device_add
device_create_sys_dev_entry
format_dev_t
MAJOR(devt) -> 0x2b00000 >> 20 = 0x2b
MINOR(devt) -> 0x2b00000 & 0xfffff = 0
sysfs_create_link -> /sys/block/43:0
If nbd already created a device with index 0, sysfs will complain
about the duplicate creation.
On the other hand, a similar duplicate creation will happen if
"index << part_shift" overflows to a value that is less than MINORMASK.
Thus fix the problem by adding a sanity check for first_minor.
Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/block/nbd.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 2a3794801704a..33a52be762d24 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1755,7 +1755,18 @@ static int nbd_dev_add(int index)
refcount_set(&nbd->refs, 1);
INIT_LIST_HEAD(&nbd->list);
disk->major = NBD_MAJOR;
+
+ /*
+ * Too big index can cause duplicate creation of sysfs files/links,
+ * because MKDEV() expect that the max first minor is MINORMASK, or
+ * index << part_shift can overflow.
+ */
disk->first_minor = index << part_shift;
+ if (disk->first_minor < index || disk->first_minor > MINORMASK) {
+ err = -EINVAL;
+ goto out_free_tags;
+ }
+
disk->fops = &nbd_fops;
disk->private_data = nbd;
sprintf(disk->disk_name, "nbd%d", index);
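
The collision described in the changelog can be reproduced outside the kernel; here is a sketch using the same constants (NBD major 43, the kernel's 20-bit minor field, and the default part_shift of 5 mentioned in the changelog):

```c
#include <stdio.h>

#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)
#define MKDEV(ma, mi)	(((ma) << MINORBITS) | (mi))

int main(void)
{
	int part_shift = 5;		/* default, per the changelog */
	int index = 0x100000;
	unsigned int first_minor = index << part_shift;	/* 0x2000000 */
	unsigned int devt = MKDEV(43, first_minor);

	/* The high bits of first_minor spill out of the minor field... */
	printf("major=%u minor=%u\n", devt >> MINORBITS, devt & MINORMASK);
	/* ...so this prints major=43 minor=0, colliding with index 0. */
	return 0;
}
```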
--
2.25.1
1
0
[PATCH openEuler-1.0-LTS] perf: hisi: Fix compile error if defined MODULE
by Yang Yingliang 02 Nov '21
by Yang Yingliang 02 Nov '21
02 Nov '21
From: Lijun Fang <fanglijun3(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4D4WR
CVE: NA
---------------------------
Fix a compile error when MODULE is defined.
Signed-off-by: Lijun Fang <fanglijun3(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/perf/hisilicon/hisi_uncore_lpddrc_pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/perf/hisilicon/hisi_uncore_lpddrc_pmu.c b/drivers/perf/hisilicon/hisi_uncore_lpddrc_pmu.c
index 8f8b211788e0f..ca395252ccc3d 100644
--- a/drivers/perf/hisilicon/hisi_uncore_lpddrc_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_lpddrc_pmu.c
@@ -241,7 +241,7 @@ static int hisi_lpddrc_pmu_init_irq(struct hisi_pmu *lpddrc_pmu,
static const struct of_device_id lpddrc_of_match[] = {
{ .compatible = "hisilicon,lpddrc-pmu", },
{},
-}
+};
MODULE_DEVICE_TABLE(of, lpddrc_of_match);
static int hisi_lpddrc_pmu_init_data(struct platform_device *pdev,
--
2.25.1
1
0
02 Nov '21
From: Lin Ma <linma(a)zju.edu.cn>
mainline inclusion
from mainline-v5.15-rc6
commit 1b1499a817c90fd1ce9453a2c98d2a01cca0e775
category: bugfix
bugzilla: NA
CVE: CVE-2021-3760
-------------------------------------------------
The nci_core_conn_close_rsp_packet() function will release the conn_info
with the given conn_id. However, it needs to set rf_conn_info to NULL to
prevent other routines, such as nci_rf_intf_activated_ntf_packet(), from
triggering the UAF.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
Signed-off-by: Lin Ma <linma(a)zju.edu.cn>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/nfc/nci/rsp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/nfc/nci/rsp.c b/net/nfc/nci/rsp.c
index e3bbf1937d0e9..7681f89dc312b 100644
--- a/net/nfc/nci/rsp.c
+++ b/net/nfc/nci/rsp.c
@@ -289,6 +289,8 @@ static void nci_core_conn_close_rsp_packet(struct nci_dev *ndev,
ndev->cur_conn_id);
if (conn_info) {
list_del(&conn_info->list);
+ if (conn_info == ndev->rf_conn_info)
+ ndev->rf_conn_info = NULL;
devm_kfree(&ndev->nfc_dev->dev, conn_info);
}
}
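
The bug class fixed here is a cached pointer outliving the object it points to. Reduced to its essentials, with hypothetical names rather than the NCI code:

```c
#include <stdlib.h>

struct conn_info { int conn_id; };

static struct conn_info *rf_cache;	/* hypothetical cached pointer */

static void close_conn(struct conn_info *ci)
{
	/*
	 * Clearing the cache before freeing is the whole fix: otherwise
	 * 'rf_cache' keeps pointing at freed memory and any later
	 * dereference is a use-after-free.
	 */
	if (ci == rf_cache)
		rf_cache = NULL;
	free(ci);
}
```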
--
2.25.1
1
0
[PATCH openEuler-1.0-LTS 1/4] ipv4: use siphash instead of Jenkins in fnhe_hashfun()
by Yang Yingliang 02 Nov '21
by Yang Yingliang 02 Nov '21
02 Nov '21
From: Eric Dumazet <edumazet(a)google.com>
mainline inclusion
from mainline-v5.14
commit 6457378fe796815c973f631a1904e147d6ee33b1
category: bugfix
bugzilla: NA
CVE: CVE-2021-20322
-------------------------------------------------
A group of security researchers brought to our attention
the weakness of the hash function used in fnhe_hashfun().
Let's use siphash instead of Jenkins hash to considerably
reduce security risks.
Also remove the inline keyword; this really is distracting.
Fixes: d546c621542d ("ipv4: harden fnhe_hashfun()")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: Keyu Man <kman001(a)ucr.edu>
Cc: Willy Tarreau <w(a)1wt.eu>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Xu Jia <xujia39(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/ipv4/route.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 958df3427c34f..dd43e798194ab 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -614,14 +614,14 @@ static struct fib_nh_exception *fnhe_oldest(struct fnhe_hash_bucket *hash)
return oldest;
}
-static inline u32 fnhe_hashfun(__be32 daddr)
+static u32 fnhe_hashfun(__be32 daddr)
{
- static u32 fnhe_hashrnd __read_mostly;
- u32 hval;
+ static siphash_key_t fnhe_hash_key __read_mostly;
+ u64 hval;
- net_get_random_once(&fnhe_hashrnd, sizeof(fnhe_hashrnd));
- hval = jhash_1word((__force u32) daddr, fnhe_hashrnd);
- return hash_32(hval, FNHE_HASH_SHIFT);
+ net_get_random_once(&fnhe_hash_key, sizeof(fnhe_hash_key));
+ hval = siphash_1u32((__force u32) daddr, &fnhe_hash_key);
+ return hash_64(hval, FNHE_HASH_SHIFT);
}
static void fill_route_from_fnhe(struct rtable *rt, struct fib_nh_exception *fnhe)
--
2.25.1
1
3
From: Laibin Qiu <qiulaibin(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4FS3G?from=project-issue
CVE: NA
---------------------------
There are some language problems in the README file, and the Markdown
format syntax does not render correctly, so it needs to be adjusted.
Signed-off-by: suqin <suqin2(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
README | 291 ++++++++++++++++++++++++++++++---------------------------
1 file changed, 155 insertions(+), 136 deletions(-)
diff --git a/README b/README
index 46c9ea3522c1c..21b2e09d62db2 100644
--- a/README
+++ b/README
@@ -1,174 +1,188 @@
-Contributions to openEuler kernel project
-=========================================
+# How to Contribute
+-------
-Sign CLA
---------
+- [How to Contribute](#How to Contribute)
-Before submitting any Contributions to openEuler, you have to sign CLA.
+ \- [Sign the CLA](#Sign the CLA)
-See:
- https://openeuler.org/zh/cla.html
- https://openeuler.org/en/cla.html
+ \- [Steps of submitting patches](#Steps of submitting patches)
-Steps of submitting patches
----------------------------
+ \- [Use the unified patch format](#Use the unified patch format)
-1. Compile and test your patches successfully.
-2. Generate patches
- Your patches should be based on top of latest openEuler branch, and should
- use git-format-patch to generate patches, and if it's a patchset, it's
- better to use --cover-letter option to describe what the patchset does.
+ \- [Define the patch format](#Define the patch format)
- Using scripts/checkpatch.pl to make sure there's no coding style issue.
+ \- [Examples](#Examples)
- And make sure your patch follow unified openEuler patch format describe
- below.
+ \- [Email client - Thunderbird settings](#Email client - Thunderbird settings)
-3. Send patch to openEuler mailing list
- Use this command to send patches to openEuler mailing list:
+- [Linux kernel](#Linux kernel)
- git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all
+### Sign the CLA
- *NOTE*: that you must add --suppress-cc=all if you use git send-email,
- otherwise the email will be cced to the people in upstream community and mailing
- lists.
+-------
- *See*: How to send patches using git-send-email
- https://git-scm.com/docs/git-send-email
+Before making any contributions to openEuler, sign the CLA first.
-4. Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions
- to send out.
+Address: [https://openeuler.org/en/cla.html](https://openeuler.org/en/cla.html)
- Use --subject-prefix="PATCH v2" option to add v2 tag for patchset.
- git format-patch --subject-prefix="PATCH v2" -1
+### Steps of submitting patches
+-------
- Subject examples:
- Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
- Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+**Step 1** Compile and test your patches.
-5. Upstream your kernel patch to kernel community is strongly recommended.
- openEuler will sync up with kernel master timely.
+**Step 2** Generate patches.
-6. Sign your work - the Developer’s Certificate of Origin
- As the same of upstream kernel community, you also need to sign your patch.
+Your patches should be generated based on the latest openEuler branch using git-format-patch. If your patches are in a patchset, it is better to use the **--cover-letter** option to describe what the patchset does.
- See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html
+Use **scripts/checkpatch.pl** to ensure that no coding style issue exists.
- The sign-off is a simple line at the end of the explanation for the patch,
- which certifies that you wrote it or otherwise have the right to pass it
- on as an open-source patch. The rules are pretty simple: if you can certify
- the below:
+In addition, ensure that your patches comply with the unified openEuler patch format described below.
- Developer’s Certificate of Origin 1.1
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+**Step 3** Send your patches to the openEuler mailing list.
- By making a contribution to this project, I certify that:
+To do so, run the following command:
- (a) The contribution was created in whole or in part by me and I have
- the right to submit it under the open source license indicated in
- the file; or
+ `git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all`
- (b The contribution is based upon previous work that, to the best of
- my knowledge, is covered under an appropriate open source license
- and I have the right under that license to submit that work with
- modifications, whether created in whole or in part by me, under
- the same open source license (unless I am permitted to submit under
- a different license), as indicated in the file; or
+*NOTE*: Add **--suppress-cc=all** if you use git-send-email; otherwise, the email will be copied to all people in the upstream community and mailing lists.
- (c) The contribution was provided directly to me by some other person
- who certified (a), (b) or (c) and I have not modified it.
+For details about how to send patches using git-send-email, see [https://git-scm.com/docs/git-send-email](https://git-scm.com/docs/git-send-….
- (d) I understand and agree that this project and the contribution are
- public and that a record of the contribution (including all personal
- information I submit with it, including my sign-off) is maintained
- indefinitely and may be redistributed consistent with this project
- or the open source license(s) involved.
+**Step 4** Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions to send out.
- then you just add a line saying:
+Use the **--subject-prefix="PATCH v2"** option to add the v2 tag to the patchset.
- Signed-off-by: Random J Developer <random(a)developer.example.org>
+ `git format-patch --subject-prefix="PATCH v2" -1`
- using your real name (sorry, no pseudonyms or anonymous contributions.)
+Subject examples:
-Use unified patch format
-------------------------
+ Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
+
+ Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+
+**Step 5** Upstream your kernel patches to the kernel community (recommended). openEuler will synchronize with the kernel master in a timely manner.
+
+**Step 6** Sign your work - the Developer’s Certificate of Origin.
+
+ Similar to the upstream kernel community, you also need to sign your patch.
+
+ For details, see [https://www.kernel.org/doc/html/latest/process/submitting-patches.html](htt….
+
+ The sign-off is a simple line at the end of the explanation of the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open source patch. The rules are pretty simple. You can certify as below:
+
+ Developer’s Certificate of Origin 1.1
+
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ By making a contribution to this project, I certify that:
+
+ (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file;
+
+ (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file;
+
+ (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
+
+ (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
+
+Then you add a line saying:
+
+Signed-off-by: Random J Developer <random(a)developer.example.org>
+
+Use your real name (sorry, no pseudonyms or anonymous contributions).
+
+### Use the unified patch format
+-------
Reasons:
-1. long term maintainability
- openEuler will merge massive patches. If all patches are merged by casual
- changelog format without a unified format, the git log will be messy, and
- then it's hard to figure out the original patch.
+1. Long term maintainability
+
+ openEuler will merge massive patches. If all patches are merged by casual
+
+ changelog formats without a unified format, the git logs will be messy, and
+
+ then it is hard to figure out the original patches.
+
+2. Kernel upgrade
-2. kernel upgrade
- We definitely will upgrade our openEuler kernel in someday, using strict
- patch management will alleviate the pain to migrate patches during big upgrade.
+ We definitely will upgrade our openEuler kernel someday, so strict patch management
-3. easy for script parsing
- Keyword highlighting is necessary for script parsing.
+ will alleviate the pain to migrate patches during big upgrades.
-Patch format definition
------------------------
+3. Easy for script parsing
+
+ Keyword highlighting is necessary for script parsing.
+
+### Define the patch format
+-------
+
+[M] stands for "mandatory".
+
+[O] stands for "optional".
-[M] stands for "mandatory"
-[O] stands for "option"
$category can be: bug preparation, bugfix, perf, feature, doc, other...
-If category is feature, then we also need to add feature name like below:
- category: feature
- feature: YYY (the feature name)
+If category is feature, we need to add a feature name as below:
-If the patch is related to CVE or bugzilla, then we need add the corresponding
-tag like below (In general, it should include at least one of the following):
- CVE: $cve-id
- bugzilla: $bug-id
+```cpp
+category: feature
+feature: YYY (the feature name)
+```
-Additional changelog should include at least one of the flollwing:
- 1) Why we should apply this patch
- 2) What real problem in product does this patch resolved
- 3) How could we reproduce this bug or how to test
- 4) Other useful information for help to understand this patch or problem
+If the patch is related to CVE or bugzilla, we need to add the corresponding tag as below (In general, it should include at least one of the following):
-The detail information is very useful for porting patch to another kenrel branch.
+```cpp
+CVE: $cve-id
+bugzilla: $bug-id
+```
-Example for mainline patch:
+Additional changelog should include at least one of the following:
- mainline inclusion [M]
- from $mainline-version [M]
- commit $id [M]
- category: $category [M]
- bugzilla: $bug-id [O]
- CVE: $cve-id [O]
+1. Why we should apply this patch
- additional changelog [O]
+2. What real problem in the product does this patch resolve
+
+3. How could we reproduce this bug or how to test
+
+4. Other useful information for help to understand this patch or problem
+
+The detailed information is very useful for migrating a patch to another kernel branch.
+
+Example for mainline patch:
- --------------------------------
+```cpp
+mainline inclusion [M]
+from $mainline-version [M]
+commit $id [M]
+category: $category [M]
+bugzilla: $bug-id [O]
+CVE: $cve-id [O]
- original changelog
+additional changelog [O]
- Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+--------------------------------
- ($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+original changelog
+Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+```
-Examples
---------
+### Examples
+-------
+```cpp
mainline inclusion
from mainline-4.10
commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
category: bugfix
bugzilla: 3004
-CVE: NA
-
-The patch fixes a BUG_ON in the product: injecting single bit ECC error
-to memory before system boot use hardware inject tools, which cause a
-large amount of CMCI during system booting .
+CVE: N/A
-[ 1.146580] mce: [Hardware Error]: Machine check events logged
-[ 1.152908] ------------[ cut here ]------------
-[ 1.157751] kernel BUG at kernel/timer.c:951!
-[ 1.162321] invalid opcode: 0000 [#1] SMP
-...
+The patch fixes a BUG_ON in the product: injecting a single-bit ECC error into the memory before system boot using hardware injection tools will cause a large amount of CMCI during system booting.
+[ 1.146580] mce: [Hardware Error]: Machine check events logged
+[ 1.152908] ------------[ cut here ]------------
+[ 1.157751] kernel BUG at kernel/timer.c:951!
+[ 1.162321] invalid opcode: 0000 [#1] SMP
-------------------------------------------------
@@ -177,33 +191,38 @@ original changelog
<original S-O-B>
Signed-off-by: Zhang San <zhangsan(a)huawei.com>
Tested-by: Li Si <lisi(a)huawei.com>
+```
+
+### Email client - Thunderbird settings
+-------
+
+If you are a new developer in the kernel community, it is highly recommended that you use the Thunderbird mail client.
+
+1. Thunderbird Installation
+
+ Obtain the English version of Thunderbird from [http://www.mozilla.org/]( http://www.mozilla.org/) and install it on your system.
+
+ Download URL: https://www.thunderbird.net/en-US/thunderbird/all/
+
+2. Settings
+
+ 2.1 Use the plain text format instead of the HTML format.
+
+ Choose **Options > Account Settings > Composition & Addressing**, and do **NOT** select Compose message in HTML format.
-Email Client - Thunderbird Settings
------------------------------------
+ 2.2 Editor settings
-If you are newly developer in the kernel community, it is highly recommended
-to use thunderbird mail client.
+ **Tools > Options> Advanced > Config editor**
-1. Thunderbird Installation
- Get English version Thunderbird from http://www.mozilla.org/ and install
- it on your system。
+ \- To bring up Thunderbird's registry editor, set **mailnews.send_plaintext_flowed** to **false**.
- Download url: https://www.thunderbird.net/en-US/thunderbird/all/
+ \- Disable HTML Format: Set **mail.identity.id1.compose_html** to **false**.
-2. Settings
- 2.1 Use plain text format instead of HTML format
- Options -> Account Settings -> Composition & Addressing, do *NOT* select
- "Compose message in HTML format".
+ \- Enable UTF-8: Set **prefs.converted-to-utf8** to **true**.
- 2.2 Editor Settings
- Tools->Options->Advanced->Config editor.
+ \- View messages in UTF-8: Set **mailnews.view_default_charset** to **UTF-8**.
- - To bring up the thunderbird's registry editor, and set:
- "mailnews.send_plaintext_flowed" to "false".
- - Disable HTML Format: Set "mail.identity.id1.compose_html" to "false".
- - Enable UTF8: Set "prefs.converted-to-utf8" to "true".
- - View message in UTF-8: Set "mailnews.view_default_charset" to "UTF-8".
- - Set mailnews.wraplength to 9999 for avoiding auto-wrap
+ \- Set **mailnews.wraplength** to **9999** to avoid auto-wrap.
Linux kernel
============
--
2.25.1
1
0
01 Nov '21
Ramaxel inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CBDP
CVE: NA
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
drivers/net/ethernet/ramaxel/spnic/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ramaxel/spnic/Makefile b/drivers/net/ethernet/ramaxel/spnic/Makefile
index f86ccff374f6..207e1d9c431a 100644
--- a/drivers/net/ethernet/ramaxel/spnic/Makefile
+++ b/drivers/net/ethernet/ramaxel/spnic/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPNIC) += spnic.o
-subdir-ccflags-y += -I$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
spnic-objs := hw/sphw_common.o \
hw/sphw_hwif.o \
--
2.27.0
2
1
01 Nov '21
Ramaxel inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DBD7
CVE: NA
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
---
drivers/scsi/spfc/Makefile | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/spfc/Makefile b/drivers/scsi/spfc/Makefile
index 02fe0213e048..205eadc35318 100644
--- a/drivers/scsi/spfc/Makefile
+++ b/drivers/scsi/spfc/Makefile
@@ -1,9 +1,9 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_SPFC) += spfc.o
-subdir-ccflags-y += -I$(src)/../../net/ethernet/ramaxel/spnic/hw
-subdir-ccflags-y += -I$(src)/hw
-subdir-ccflags-y += -I$(src)/common
+subdir-ccflags-y += -I$(srctree)/$(src)/../../net/ethernet/ramaxel/spnic/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/hw
+subdir-ccflags-y += -I$(srctree)/$(src)/common
spfc-objs := common/unf_init.o \
common/unf_event.o \
--
2.27.0
2
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4FS3G?from=project-issue
CVE: NA
---------------------------
There are some language problems in the README file, and the Markdown
format syntax does not render correctly, so it needs to be adjusted.
Signed-off-by: suqin <suqin2(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
---
README | 291 ++++++++++++++++++++++++++++++---------------------------
1 file changed, 155 insertions(+), 136 deletions(-)
diff --git a/README b/README
index 46c9ea352..21b2e09d6 100644
--- a/README
+++ b/README
@@ -1,174 +1,188 @@
-Contributions to openEuler kernel project
-=========================================
+# How to Contribute
+-------
-Sign CLA
---------
+- [How to Contribute](#How to Contribute)
-Before submitting any Contributions to openEuler, you have to sign CLA.
+ \- [Sign the CLA](#Sign the CLA)
-See:
- https://openeuler.org/zh/cla.html
- https://openeuler.org/en/cla.html
+ \- [Steps of submitting patches](#Steps of submitting patches)
-Steps of submitting patches
----------------------------
+ \- [Use the unified patch format](#Use the unified patch format)
-1. Compile and test your patches successfully.
-2. Generate patches
- Your patches should be based on top of latest openEuler branch, and should
- use git-format-patch to generate patches, and if it's a patchset, it's
- better to use --cover-letter option to describe what the patchset does.
+ \- [Define the patch format](#Define the patch format)
- Using scripts/checkpatch.pl to make sure there's no coding style issue.
+ \- [Examples](#Examples)
- And make sure your patch follow unified openEuler patch format describe
- below.
+ \- [Email client - Thunderbird settings](#Email client - Thunderbird settings)
-3. Send patch to openEuler mailing list
- Use this command to send patches to openEuler mailing list:
+- [Linux kernel](#Linux kernel)
- git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all
+### Sign the CLA
- *NOTE*: that you must add --suppress-cc=all if you use git send-email,
- otherwise the email will be cced to the people in upstream community and mailing
- lists.
+-------
- *See*: How to send patches using git-send-email
- https://git-scm.com/docs/git-send-email
+Before making any contributions to openEuler, sign the CLA first.
-4. Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions
- to send out.
+Address: [https://openeuler.org/en/cla.html](https://openeuler.org/en/cla.html)
- Use --subject-prefix="PATCH v2" option to add v2 tag for patchset.
- git format-patch --subject-prefix="PATCH v2" -1
+### Steps of submitting patches
+-------
- Subject examples:
- Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
- Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+**Step 1** Compile and test your patches.
-5. Upstream your kernel patch to kernel community is strongly recommended.
- openEuler will sync up with kernel master timely.
+**Step 2** Generate patches.
-6. Sign your work - the Developer’s Certificate of Origin
- As the same of upstream kernel community, you also need to sign your patch.
+Your patches should be generated based on the latest openEuler branch using git-format-patch. If your patches are in a patchset, it is better to use the **--cover-letter** option to describe what the patchset does.
- See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html
+Use **scripts/checkpatch.pl** to ensure that no coding style issue exists.
- The sign-off is a simple line at the end of the explanation for the patch,
- which certifies that you wrote it or otherwise have the right to pass it
- on as an open-source patch. The rules are pretty simple: if you can certify
- the below:
+In addition, ensure that your patches comply with the unified openEuler patch format described below.
- Developer’s Certificate of Origin 1.1
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+**Step 3** Send your patches to the openEuler mailing list.
- By making a contribution to this project, I certify that:
+To do so, run the following command:
- (a) The contribution was created in whole or in part by me and I have
- the right to submit it under the open source license indicated in
- the file; or
+ `git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all`
- (b The contribution is based upon previous work that, to the best of
- my knowledge, is covered under an appropriate open source license
- and I have the right under that license to submit that work with
- modifications, whether created in whole or in part by me, under
- the same open source license (unless I am permitted to submit under
- a different license), as indicated in the file; or
+*NOTE*: Add **--suppress-cc=all** if you use git-send-email; otherwise, the email will be copied to all people in the upstream community and mailing lists.
- (c) The contribution was provided directly to me by some other person
- who certified (a), (b) or (c) and I have not modified it.
+For details about how to send patches using git-send-email, see [https://git-scm.com/docs/git-send-email](https://git-scm.com/docs/git-send-….
- (d) I understand and agree that this project and the contribution are
- public and that a record of the contribution (including all personal
- information I submit with it, including my sign-off) is maintained
- indefinitely and may be redistributed consistent with this project
- or the open source license(s) involved.
+**Step 4** Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions to send out.
- then you just add a line saying:
+Use the **--subject-prefix="PATCH v2"** option to add the v2 tag to the patchset.
- Signed-off-by: Random J Developer <random(a)developer.example.org>
+ `git format-patch --subject-prefix="PATCH v2" -1`
- using your real name (sorry, no pseudonyms or anonymous contributions.)
+Subject examples:
-Use unified patch format
-------------------------
+ Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
+
+ Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+
+**Step 5** Upstream your kernel patches to the kernel community (recommended). openEuler will synchronize with the kernel master in a timely manner.
+
+**Step 6** Sign your work - the Developer’s Certificate of Origin.
+
+ Similar to the upstream kernel community, you also need to sign your patch.
+
+ For details, see [https://www.kernel.org/doc/html/latest/process/submitting-patches.html](htt….
+
+ The sign-off is a simple line at the end of the explanation of the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open source patch. The rules are pretty simple. You can certify as below:
+
+ Developer’s Certificate of Origin 1.1
+
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ By making a contribution to this project, I certify that:
+
+ (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file;
+
+ (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file;
+
+ (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
+
+ (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
+
+Then you add a line saying:
+
+Signed-off-by: Random J Developer <random(a)developer.example.org>
+
+Use your real name (sorry, no pseudonyms or anonymous contributions).
+
+### Use the unified patch format
+-------
Reasons:
-1. long term maintainability
- openEuler will merge massive patches. If all patches are merged by casual
- changelog format without a unified format, the git log will be messy, and
- then it's hard to figure out the original patch.
+1. Long term maintainability
+
+ openEuler will merge massive patches. If all patches are merged by casual
+
+ changelog formats without a unified format, the git logs will be messy, and
+
+ then it is hard to figure out the original patches.
+
+2. Kernel upgrade
-2. kernel upgrade
- We definitely will upgrade our openEuler kernel in someday, using strict
- patch management will alleviate the pain to migrate patches during big upgrade.
+ We definitely will upgrade our openEuler kernel in someday, so strict patch management
-3. easy for script parsing
- Keyword highlighting is necessary for script parsing.
+ will alleviate the pain to migrate patches during big upgrades.
-Patch format definition
------------------------
+3. Easy for script parsing
+
+ Keyword highlighting is necessary for script parsing.
+
+### Define the patch format
+-------
+
+[M] stands for "mandatory".
+
+[O] stands for "option".
-[M] stands for "mandatory"
-[O] stands for "option"
$category can be: bug preparation, bugfix, perf, feature, doc, other...
-If category is feature, then we also need to add feature name like below:
- category: feature
- feature: YYY (the feature name)
+If category is feature, we need to add a feature name as below:
-If the patch is related to CVE or bugzilla, then we need add the corresponding
-tag like below (In general, it should include at least one of the following):
- CVE: $cve-id
- bugzilla: $bug-id
+```cpp
+category: feature
+feature: YYY (the feature name)
+```
-Additional changelog should include at least one of the flollwing:
- 1) Why we should apply this patch
- 2) What real problem in product does this patch resolved
- 3) How could we reproduce this bug or how to test
- 4) Other useful information for help to understand this patch or problem
+If the patch is related to CVE or bugzilla, we need to add the corresponding tag as below (In general, it should include at least one of the following):
-The detail information is very useful for porting patch to another kenrel branch.
+```cpp
+CVE: $cve-id
+bugzilla: $bug-id
+```
-Example for mainline patch:
+Additional changelog should include at least one of the following:
- mainline inclusion [M]
- from $mainline-version [M]
- commit $id [M]
- category: $category [M]
- bugzilla: $bug-id [O]
- CVE: $cve-id [O]
+1. Why we should apply this patch
- additional changelog [O]
+2. What real problems in the product does this patch resolved
+
+3. How could we reproduce this bug or how to test
+
+4. Other useful information for help to understand this patch or problem
+
+The detailed information is very useful for migrating a patch to another kernel branch.
+
+Example for mainline patch:
- --------------------------------
+```cpp
+mainline inclusion [M]
+from $mainline-version [M]
+commit $id [M]
+category: $category [M]
+bugzilla: $bug-id [O]
+CVE: $cve-id [O]
- original changelog
+additional changelog [O]
- Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+--------------------------------
- ($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+original changelog
+Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+```
-Examples
---------
+### Examples
+-------
+```cpp
mainline inclusion
from mainline-4.10
commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
category: bugfix
bugzilla: 3004
-CVE: NA
-
-The patch fixes a BUG_ON in the product: injecting single bit ECC error
-to memory before system boot use hardware inject tools, which cause a
-large amount of CMCI during system booting .
+CVE: N/A
-[ 1.146580] mce: [Hardware Error]: Machine check events logged
-[ 1.152908] ------------[ cut here ]------------
-[ 1.157751] kernel BUG at kernel/timer.c:951!
-[ 1.162321] invalid opcode: 0000 [#1] SMP
-...
+The patch fixes a BUG_ON in the product: Injecting a single bit ECC error to the memory before system boot using hardware inject tools will cause a large amount of CMCI during system booting .
+[ 1.146580] mce: [Hardware Error]: Machine check events logged
+[ 1.152908] ------------[ cut here ]------------
+[ 1.157751] kernel BUG at kernel/timer.c:951!
+[ 1.162321] invalid opcode: 0000 [#1] SMP
-------------------------------------------------
@@ -177,33 +191,38 @@ original changelog
<original S-O-B>
Signed-off-by: Zhang San <zhangsan(a)huawei.com>
Tested-by: Li Si <lisi(a)huawei.com>
+```
+
+### Email client - Thunderbird settings
+-------
+
+If you are a new developer in the kernel community, it is highly recommended that you use the Thunderbird mail client.
+
+1. Thunderbird Installation
+
+ Obtain the English version of Thunderbird from [http://www.mozilla.org/]( http://www.mozilla.org/) and install it on your system.
+
+ Download URL: https://www.thunderbird.net/en-US/thunderbird/all/
+
+2. Settings
+
+ 2.1 Use the plain text format instead of the HTML format.
+
+ Choose **Options > Account Settings > Composition & Addressing**, and do **NOT** select Compose message in HTML format.
-Email Client - Thunderbird Settings
------------------------------------
+ 2.2 Editor settings
-If you are newly developer in the kernel community, it is highly recommended
-to use thunderbird mail client.
+ **Tools > Options> Advanced > Config editor**
-1. Thunderbird Installation
- Get English version Thunderbird from http://www.mozilla.org/ and install
- it on your system。
+ \- To bring up the Thunderbird's registry editor, set **mailnews.send_plaintext_flowed** to **false**.
- Download url: https://www.thunderbird.net/en-US/thunderbird/all/
+ \- Disable HTML Format: Set **mail.identity.id1.compose_html** to **false**.
-2. Settings
- 2.1 Use plain text format instead of HTML format
- Options -> Account Settings -> Composition & Addressing, do *NOT* select
- "Compose message in HTML format".
+ \- Enable UTF-8: Set **prefs.converted-to-utf8** to **true**.
- 2.2 Editor Settings
- Tools->Options->Advanced->Config editor.
+ \- View messages in UTF-8: Set **mailnews.view_default_charset** to **UTF-8**.
- - To bring up the thunderbird's registry editor, and set:
- "mailnews.send_plaintext_flowed" to "false".
- - Disable HTML Format: Set "mail.identity.id1.compose_html" to "false".
- - Enable UTF8: Set "prefs.converted-to-utf8" to "true".
- - View message in UTF-8: Set "mailnews.view_default_charset" to "UTF-8".
- - Set mailnews.wraplength to 9999 for avoiding auto-wrap
+ \- Set **mailnews.wraplength** to **9999** to avoid auto-wrap.
Linux kernel
============
--
2.22.0
[PATCH openEuler-1.0-LTS] PM: hibernate: Get block device exclusively in swsusp_check()
by Yang Yingliang 01 Nov '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.16
commit 39fbef4b0f77f9c89c8f014749ca533643a37c9f
category: bugfix
bugzilla: 182871
CVE: NA
-----------------------------------------------
The following kernel crash can be triggered:
[ 89.266592] ------------[ cut here ]------------
[ 89.267427] kernel BUG at fs/buffer.c:3020!
[ 89.268264] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 89.269116] CPU: 7 PID: 1750 Comm: kmmpd-loop0 Not tainted 5.10.0-862.14.0.6.x86_64-08610-gc932cda3cef4-dirty #20
[ 89.273169] RIP: 0010:submit_bh_wbc.isra.0+0x538/0x6d0
[ 89.277157] RSP: 0018:ffff888105ddfd08 EFLAGS: 00010246
[ 89.278093] RAX: 0000000000000005 RBX: ffff888124231498 RCX: ffffffffb2772612
[ 89.279332] RDX: 1ffff11024846293 RSI: 0000000000000008 RDI: ffff888124231498
[ 89.280591] RBP: ffff8881248cc000 R08: 0000000000000001 R09: ffffed1024846294
[ 89.281851] R10: ffff88812423149f R11: ffffed1024846293 R12: 0000000000003800
[ 89.283095] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8881161f7000
[ 89.284342] FS: 0000000000000000(0000) GS:ffff88839b5c0000(0000) knlGS:0000000000000000
[ 89.285711] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 89.286701] CR2: 00007f166ebc01a0 CR3: 0000000435c0e000 CR4: 00000000000006e0
[ 89.287919] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 89.289138] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 89.290368] Call Trace:
[ 89.290842] write_mmp_block+0x2ca/0x510
[ 89.292218] kmmpd+0x433/0x9a0
[ 89.294902] kthread+0x2dd/0x3e0
[ 89.296268] ret_from_fork+0x22/0x30
[ 89.296906] Modules linked in:
by running the following commands:
1. mkfs.ext4 -O mmp /dev/sda -b 1024
2. mount /dev/sda /home/test
3. echo "/dev/sda" > /sys/power/resume
That happens because swsusp_check() calls set_blocksize() on the
target partition, which confuses the file system:
Thread1                                Thread2
mount /dev/sda /home/test
get s_mmp_bh  --> has mapped flag
start kmmpd thread
                                       echo "/dev/sda" > /sys/power/resume
                                       resume_store
                                        software_resume
                                         swsusp_check
                                          set_blocksize
                                           truncate_inode_pages_range
                                            truncate_cleanup_page
                                             block_invalidatepage
                                              discard_buffer --> clean mapped flag
write_mmp_block
 submit_bh
  submit_bh_wbc
   BUG_ON(!buffer_mapped(bh))
To address this issue, modify swsusp_check() to open the target block
device with exclusive access.
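A minimal sketch of the exclusive-open pattern, using the pre-5.9
blkdev_get_by_dev()/blkdev_put() API of this tree (the function name is
illustrative, not part of the patch):
```c
#include <linux/blkdev.h>
#include <linux/err.h>

/* Sketch: open a block device with FMODE_EXCL so that concurrent
 * exclusive openers, such as mount, get -EBUSY instead of racing with
 * us. 'holder' is only an opaque cookie identifying the owner. */
static int probe_device_exclusively(dev_t dev)
{
        static void *holder;
        struct block_device *bdev;

        bdev = blkdev_get_by_dev(dev, FMODE_READ | FMODE_EXCL, &holder);
        if (IS_ERR(bdev))
                return PTR_ERR(bdev);

        /* ... inspect the device, e.g. read the hibernation signature ... */

        blkdev_put(bdev, FMODE_READ | FMODE_EXCL);
        return 0;
}
```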
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/power/swap.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index d7f6c1a288d33..4fde37fce4ea4 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -1512,9 +1512,10 @@ int swsusp_read(unsigned int *flags_p)
int swsusp_check(void)
{
int error;
+ void *holder;
hib_resume_bdev = blkdev_get_by_dev(swsusp_resume_device,
- FMODE_READ, NULL);
+ FMODE_READ | FMODE_EXCL, &holder);
if (!IS_ERR(hib_resume_bdev)) {
set_blocksize(hib_resume_bdev, PAGE_SIZE);
clear_page(swsusp_header);
@@ -1536,7 +1537,7 @@ int swsusp_check(void)
put:
if (error)
- blkdev_put(hib_resume_bdev, FMODE_READ);
+ blkdev_put(hib_resume_bdev, FMODE_READ | FMODE_EXCL);
else
pr_debug("Image signature found, resuming\n");
} else {
--
2.25.1
1
0
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4FS3G?from=project-issue
CVE: NA
---------------------------
There are some language problems in the README file, and the Markdown format
syntax does not render correctly, so the file needs to be adjusted.
Signed-off-by: suqin <suqin2(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
---
README | 311 +++++++++++++++++++++++++++++----------------------------
1 file changed, 161 insertions(+), 150 deletions(-)
diff --git a/README b/README
index 46c9ea352..3374c4726 100644
--- a/README
+++ b/README
@@ -1,174 +1,188 @@
-Contributions to openEuler kernel project
-=========================================
+# How to Contribute
+-------
-Sign CLA
---------
+- [How to Contribute](#How to Contribute)
-Before submitting any Contributions to openEuler, you have to sign CLA.
+ \- [Sign the CLA](#Sign the CLA)
-See:
- https://openeuler.org/zh/cla.html
- https://openeuler.org/en/cla.html
+ \- [Steps of submitting patches](#Steps of submitting patches)
-Steps of submitting patches
----------------------------
+ \- [Use the unified patch format](#Use the unified patch format)
-1. Compile and test your patches successfully.
-2. Generate patches
- Your patches should be based on top of latest openEuler branch, and should
- use git-format-patch to generate patches, and if it's a patchset, it's
- better to use --cover-letter option to describe what the patchset does.
+ \- [Define the patch format](#Define the patch format)
- Using scripts/checkpatch.pl to make sure there's no coding style issue.
+ \- [Examples](#Examples)
- And make sure your patch follow unified openEuler patch format describe
- below.
+ \- [Email client - Thunderbird settings](#Email client - Thunderbird settings)
-3. Send patch to openEuler mailing list
- Use this command to send patches to openEuler mailing list:
+- [Linux kernel](#Linux kernel)
- git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all
+### Sign the CLA
- *NOTE*: that you must add --suppress-cc=all if you use git send-email,
- otherwise the email will be cced to the people in upstream community and mailing
- lists.
+-------
- *See*: How to send patches using git-send-email
- https://git-scm.com/docs/git-send-email
+Before making any contributions to openEuler, sign the CLA first.
-4. Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions
- to send out.
+Address: [https://openeuler.org/en/cla.html](https://openeuler.org/en/cla.html)
- Use --subject-prefix="PATCH v2" option to add v2 tag for patchset.
- git format-patch --subject-prefix="PATCH v2" -1
+### Steps of submitting patches
+-------
- Subject examples:
- Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
- Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+**Step 1** Compile and test your patches.
-5. Upstream your kernel patch to kernel community is strongly recommended.
- openEuler will sync up with kernel master timely.
+**Step 2** Generate patches.
-6. Sign your work - the Developer’s Certificate of Origin
- As the same of upstream kernel community, you also need to sign your patch.
+Your patches should be generated based on the latest openEuler branch using git-format-patch. If your patches are in a patchset, it is better to use the **--cover-letter** option to describe what the patchset does.
- See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html
+Use **scripts/checkpatch.pl** to ensure that no coding style issue exists.
- The sign-off is a simple line at the end of the explanation for the patch,
- which certifies that you wrote it or otherwise have the right to pass it
- on as an open-source patch. The rules are pretty simple: if you can certify
- the below:
+In addition, ensure that your patches comply with the unified openEuler patch format described below.
- Developer’s Certificate of Origin 1.1
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+**Step 3** Send your patches to the openEuler mailing list.
- By making a contribution to this project, I certify that:
+To do so, run the following command:
- (a) The contribution was created in whole or in part by me and I have
- the right to submit it under the open source license indicated in
- the file; or
+ `git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all`
- (b The contribution is based upon previous work that, to the best of
- my knowledge, is covered under an appropriate open source license
- and I have the right under that license to submit that work with
- modifications, whether created in whole or in part by me, under
- the same open source license (unless I am permitted to submit under
- a different license), as indicated in the file; or
+*NOTE*: Add **--suppress-cc=all** if you use git-send-email; otherwise, the email will be copied to all people in the upstream community and mailing lists.
- (c) The contribution was provided directly to me by some other person
- who certified (a), (b) or (c) and I have not modified it.
+For details about how to send patches using git-send-email, see [https://git-scm.com/docs/git-send-email](https://git-scm.com/docs/git-send-….
- (d) I understand and agree that this project and the contribution are
- public and that a record of the contribution (including all personal
- information I submit with it, including my sign-off) is maintained
- indefinitely and may be redistributed consistent with this project
- or the open source license(s) involved.
+**Step 4** Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions to send out.
- then you just add a line saying:
+Use the **--subject-prefix="PATCH v2"** option to add the v2 tag to the patchset.
- Signed-off-by: Random J Developer <random(a)developer.example.org>
+ `git format-patch --subject-prefix="PATCH v2" -1`
- using your real name (sorry, no pseudonyms or anonymous contributions.)
+Subject examples:
-Use unified patch format
-------------------------
+ Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
+
+ Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+
+**Step 5** Upstream your kernel patches to the kernel community (recommended). openEuler will synchronize with the kernel master in a timely manner.
+
+**Step 6** Sign your work - the Developer’s Certificate of Origin.
+
+ Similar to the upstream kernel community, you also need to sign your patch.
+
+ For details, see [https://www.kernel.org/doc/html/latest/process/submitting-patches.html](htt….
+
+ The sign-off is a simple line at the end of the explanation of the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open source patch. The rules are pretty simple. You can certify as below:
+
+ Developer’s Certificate of Origin 1.1
+
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ By making a contribution to this project, I certify that:
+
+ (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file;
+
+ (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file;
+
+ (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
+
+ (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
+
+Then you add a line saying:
+
+Signed-off-by: Random J Developer <random(a)developer.example.org>
+
+Use your real name (sorry, no pseudonyms or anonymous contributions).
+
+### Use the unified patch format
+-------
Reasons:
-1. long term maintainability
- openEuler will merge massive patches. If all patches are merged by casual
- changelog format without a unified format, the git log will be messy, and
- then it's hard to figure out the original patch.
+1. Long term maintainability
+
+ openEuler will merge massive patches. If all patches are merged by casual
+
+ changelog formats without a unified format, the git logs will be messy, and
+
+ then it is hard to figure out the original patches.
+
+2. Kernel upgrade
-2. kernel upgrade
- We definitely will upgrade our openEuler kernel in someday, using strict
- patch management will alleviate the pain to migrate patches during big upgrade.
+ We definitely will upgrade our openEuler kernel in someday, so strict patch management
-3. easy for script parsing
- Keyword highlighting is necessary for script parsing.
+ will alleviate the pain to migrate patches during big upgrades.
-Patch format definition
------------------------
+3. Easy for script parsing
+
+ Keyword highlighting is necessary for script parsing.
+
+### Define the patch format
+-------
+
+[M] stands for "mandatory".
+
+[O] stands for "option".
-[M] stands for "mandatory"
-[O] stands for "option"
$category can be: bug preparation, bugfix, perf, feature, doc, other...
-If category is feature, then we also need to add feature name like below:
- category: feature
- feature: YYY (the feature name)
+If category is feature, we need to add a feature name as below:
-If the patch is related to CVE or bugzilla, then we need add the corresponding
-tag like below (In general, it should include at least one of the following):
- CVE: $cve-id
- bugzilla: $bug-id
+```cpp
+category: feature
+feature: YYY (the feature name)
+```
-Additional changelog should include at least one of the flollwing:
- 1) Why we should apply this patch
- 2) What real problem in product does this patch resolved
- 3) How could we reproduce this bug or how to test
- 4) Other useful information for help to understand this patch or problem
+If the patch is related to CVE or bugzilla, we need to add the corresponding tag as below (In general, it should include at least one of the following):
-The detail information is very useful for porting patch to another kenrel branch.
+```cpp
+CVE: $cve-id
+bugzilla: $bug-id
+```
-Example for mainline patch:
+Additional changelog should include at least one of the following:
- mainline inclusion [M]
- from $mainline-version [M]
- commit $id [M]
- category: $category [M]
- bugzilla: $bug-id [O]
- CVE: $cve-id [O]
+1. Why we should apply this patch
- additional changelog [O]
+2. What real problems in the product does this patch resolved
+
+3. How could we reproduce this bug or how to test
+
+4. Other useful information for help to understand this patch or problem
+
+The detailed information is very useful for migrating a patch to another kernel branch.
+
+Example for mainline patch:
- --------------------------------
+```cpp
+mainline inclusion [M]
+from $mainline-version [M]
+commit $id [M]
+category: $category [M]
+bugzilla: $bug-id [O]
+CVE: $cve-id [O]
- original changelog
+additional changelog [O]
- Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+--------------------------------
- ($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+original changelog
+Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+```
-Examples
---------
+### Examples
+-------
+```cpp
mainline inclusion
from mainline-4.10
commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
category: bugfix
bugzilla: 3004
-CVE: NA
-
-The patch fixes a BUG_ON in the product: injecting single bit ECC error
-to memory before system boot use hardware inject tools, which cause a
-large amount of CMCI during system booting .
+CVE: N/A
-[ 1.146580] mce: [Hardware Error]: Machine check events logged
-[ 1.152908] ------------[ cut here ]------------
-[ 1.157751] kernel BUG at kernel/timer.c:951!
-[ 1.162321] invalid opcode: 0000 [#1] SMP
-...
+The patch fixes a BUG_ON in the product: Injecting a single bit ECC error to the memory before system boot using hardware inject tools will cause a large amount of CMCI during system booting .
+[ 1.146580] mce: [Hardware Error]: Machine check events logged
+[ 1.152908] ------------[ cut here ]------------
+[ 1.157751] kernel BUG at kernel/timer.c:951!
+[ 1.162321] invalid opcode: 0000 [#1] SMP
-------------------------------------------------
@@ -177,50 +191,47 @@ original changelog
<original S-O-B>
Signed-off-by: Zhang San <zhangsan(a)huawei.com>
Tested-by: Li Si <lisi(a)huawei.com>
+```
+
+### Email client - Thunderbird settings
+-------
+
+If you are a new developer in the kernel community, it is highly recommended that you use the Thunderbird mail client.
+
+1. Thunderbird Installation
+
+ Obtain the English version of Thunderbird from [http://www.mozilla.org/]( http://www.mozilla.org/) and install it on your system.
+
+ Download URL: https://www.thunderbird.net/en-US/thunderbird/all/
+
+2. Settings
+
+ 2.1 Use the plain text format instead of the HTML format.
+
+ Choose **Options > Account Settings > Composition & Addressing**, and do **NOT** select Compose message in HTML format.
-Email Client - Thunderbird Settings
------------------------------------
+ 2.2 Editor settings
-If you are newly developer in the kernel community, it is highly recommended
-to use thunderbird mail client.
+ **Tools > Options> Advanced > Config editor**
-1. Thunderbird Installation
- Get English version Thunderbird from http://www.mozilla.org/ and install
- it on your system。
+ \- To bring up the Thunderbird's registry editor, set **mailnews.send_plaintext_flowed** to **false**.
- Download url: https://www.thunderbird.net/en-US/thunderbird/all/
+ \- Disable HTML Format: Set **mail.identity.id1.compose_html** to **false**.
-2. Settings
- 2.1 Use plain text format instead of HTML format
- Options -> Account Settings -> Composition & Addressing, do *NOT* select
- "Compose message in HTML format".
+ \- Enable UTF-8: Set **prefs.converted-to-utf8** to **true**.
- 2.2 Editor Settings
- Tools->Options->Advanced->Config editor.
+ \- View messages in UTF-8: Set **mailnews.view_default_charset** to **UTF-8**.
- - To bring up the thunderbird's registry editor, and set:
- "mailnews.send_plaintext_flowed" to "false".
- - Disable HTML Format: Set "mail.identity.id1.compose_html" to "false".
- - Enable UTF8: Set "prefs.converted-to-utf8" to "true".
- - View message in UTF-8: Set "mailnews.view_default_charset" to "UTF-8".
- - Set mailnews.wraplength to 9999 for avoiding auto-wrap
+ \- Set **mailnews.wraplength** to **9999** to avoid auto-wrap.
-Linux kernel
-============
+# Linux kernel
+-------
-There are several guides for kernel developers and users. These guides can
-be rendered in a number of formats, like HTML and PDF. Please read
-Documentation/admin-guide/README.rst first.
+There are several guides for kernel developers and users, which can be rendered in a number of formats, like HTML and PDF. You can read **Documentation/admin-guide/README.rst** first.
-In order to build the documentation, use ``make htmldocs`` or
-``make pdfdocs``. The formatted documentation can also be read online at:
+In order to build the documentation, use **make htmldocs** or **make pdfdocs**. The formatted documentation can also be read online at: https://www.kernel.org/doc/html/latest/
- https://www.kernel.org/doc/html/latest/
+There are various text files in the Documentation/ subdirectory, several of which use the Restructured Text markup notation. See Documentation/00-INDEX for a list of what is contained in each file.
-There are various text files in the Documentation/ subdirectory,
-several of them using the Restructured Text markup notation.
-See Documentation/00-INDEX for a list of what is contained in each file.
+Read the **Documentation/process/changes.rst** file, as it contains the requirements for building and running the kernel, and information about the problems that may be caused by upgrading your kernel.
-Please read the Documentation/process/changes.rst file, as it contains the
-requirements for building and running the kernel, and information about
-the problems which may result by upgrading your kernel.
--
2.22.0
[PATCH openEuler-1.0-LTS] blk-cgroup: synchronize blkg creation against policy deactivation
by Yang Yingliang 01 Nov '21
From: Yu Kuai <yukuai3(a)huawei.com>
mainline inclusion
from mainline-v5.16
commit 0c9d338c8443b06da8e8d3bfce824c5ea6d3488f
category: bugfix
bugzilla: 182378
CVE: NA
---------------------------
Our test reports a null pointer dereference:
[ 168.534653] ==================================================================
[ 168.535614] Disabling lock debugging due to kernel taint
[ 168.536346] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 168.537274] #PF: supervisor read access in kernel mode
[ 168.537964] #PF: error_code(0x0000) - not-present page
[ 168.538667] PGD 0 P4D 0
[ 168.539025] Oops: 0000 [#1] PREEMPT SMP KASAN
[ 168.539656] CPU: 13 PID: 759 Comm: bash Tainted: G B 5.15.0-rc2-next-202100
[ 168.540954] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_0738364
[ 168.542736] RIP: 0010:bfq_pd_init+0x88/0x1e0
[ 168.543318] Code: 98 00 00 00 e8 c9 e4 5b ff 4c 8b 65 00 49 8d 7c 24 08 e8 bb e4 5b ff 4d0
[ 168.545803] RSP: 0018:ffff88817095f9c0 EFLAGS: 00010002
[ 168.546497] RAX: 0000000000000001 RBX: ffff888101a1c000 RCX: 0000000000000000
[ 168.547438] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff888106553428
[ 168.548402] RBP: ffff888106553400 R08: ffffffff961bcaf4 R09: 0000000000000001
[ 168.549365] R10: ffffffffa2e16c27 R11: fffffbfff45c2d84 R12: 0000000000000000
[ 168.550291] R13: ffff888101a1c098 R14: ffff88810c7a08c8 R15: ffffffffa55541a0
[ 168.551221] FS: 00007fac75227700(0000) GS:ffff88839ba80000(0000) knlGS:0000000000000000
[ 168.552278] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 168.553040] CR2: 0000000000000008 CR3: 0000000165ce7000 CR4: 00000000000006e0
[ 168.554000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 168.554929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 168.555888] Call Trace:
[ 168.556221] <TASK>
[ 168.556510] blkg_create+0x1c0/0x8c0
[ 168.556989] blkg_conf_prep+0x574/0x650
[ 168.557502] ? stack_trace_save+0x99/0xd0
[ 168.558033] ? blkcg_conf_open_bdev+0x1b0/0x1b0
[ 168.558629] tg_set_conf.constprop.0+0xb9/0x280
[ 168.559231] ? kasan_set_track+0x29/0x40
[ 168.559758] ? kasan_set_free_info+0x30/0x60
[ 168.560344] ? tg_set_limit+0xae0/0xae0
[ 168.560853] ? do_sys_openat2+0x33b/0x640
[ 168.561383] ? do_sys_open+0xa2/0x100
[ 168.561877] ? __x64_sys_open+0x4e/0x60
[ 168.562383] ? __kasan_check_write+0x20/0x30
[ 168.562951] ? copyin+0x48/0x70
[ 168.563390] ? _copy_from_iter+0x234/0x9e0
[ 168.563948] tg_set_conf_u64+0x17/0x20
[ 168.564467] cgroup_file_write+0x1ad/0x380
[ 168.565014] ? cgroup_file_poll+0x80/0x80
[ 168.565568] ? __mutex_lock_slowpath+0x30/0x30
[ 168.566165] ? pgd_free+0x100/0x160
[ 168.566649] kernfs_fop_write_iter+0x21d/0x340
[ 168.567246] ? cgroup_file_poll+0x80/0x80
[ 168.567796] new_sync_write+0x29f/0x3c0
[ 168.568314] ? new_sync_read+0x410/0x410
[ 168.568840] ? __handle_mm_fault+0x1c97/0x2d80
[ 168.569425] ? copy_page_range+0x2b10/0x2b10
[ 168.570007] ? _raw_read_lock_bh+0xa0/0xa0
[ 168.570622] vfs_write+0x46e/0x630
[ 168.571091] ksys_write+0xcd/0x1e0
[ 168.571563] ? __x64_sys_read+0x60/0x60
[ 168.572081] ? __kasan_check_write+0x20/0x30
[ 168.572659] ? do_user_addr_fault+0x446/0xff0
[ 168.573264] __x64_sys_write+0x46/0x60
[ 168.573774] do_syscall_64+0x35/0x80
[ 168.574264] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 168.574960] RIP: 0033:0x7fac74915130
[ 168.575456] Code: 73 01 c3 48 8b 0d 58 ed 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 444
[ 168.577969] RSP: 002b:00007ffc3080e288 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 168.578986] RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007fac74915130
[ 168.579937] RDX: 0000000000000009 RSI: 000056007669f080 RDI: 0000000000000001
[ 168.580884] RBP: 000056007669f080 R08: 000000000000000a R09: 00007fac75227700
[ 168.581841] R10: 000056007655c8f0 R11: 0000000000000246 R12: 0000000000000009
[ 168.582796] R13: 0000000000000001 R14: 00007fac74be55e0 R15: 00007fac74be08c0
[ 168.583757] </TASK>
[ 168.584063] Modules linked in:
[ 168.584494] CR2: 0000000000000008
[ 168.584964] ---[ end trace 2475611ad0f77a1a ]---
This is because blkg_alloc() is called from blkg_conf_prep() without
holding 'q->queue_lock', and the elevator is exited before blkg_create():
thread 1                           thread 2
blkg_conf_prep
 spin_lock_irq(&q->queue_lock);
 blkg_lookup_check -> return NULL
 spin_unlock_irq(&q->queue_lock);
 blkg_alloc
  blkcg_policy_enabled -> true
  pd = ->pd_alloc_fn
  blkg->pd[i] = pd
                                   blk_mq_exit_sched
                                    bfq_exit_queue
                                     blkcg_deactivate_policy
                                      spin_lock_irq(&q->queue_lock);
                                      __clear_bit(pol->plid, q->blkcg_pols);
                                      spin_unlock_irq(&q->queue_lock);
                                      q->elevator = NULL;
 spin_lock_irq(&q->queue_lock);
 blkg_create
  if (blkg->pd[i])
   ->pd_init_fn -> q->elevator is NULL
 spin_unlock_irq(&q->queue_lock);
Because blkcg_deactivate_policy() requires the queue to be frozen, we can
grab q_usage_counter to synchronize blkg_conf_prep() against
blkcg_deactivate_policy().
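A minimal sketch of that synchronization, reusing this tree's
blk_queue_enter()/blk_queue_exit() interface (the wrapper name is
illustrative):
```c
#include <linux/blkdev.h>

/* Sketch: pin q_usage_counter across a section that must not race with
 * a queue freeze. blkcg_deactivate_policy() freezes the queue first, so
 * it cannot clear q->blkcg_pols or q->elevator while we hold the ref. */
static int blkg_setup_pinned(struct request_queue *q)
{
        int ret;

        ret = blk_queue_enter(q, 0);    /* fails if the queue is dying */
        if (ret)
                return ret;

        /* ... blkcg_policy_enabled() check and blkg/pd allocation ... */

        blk_queue_exit(q);              /* drop q_usage_counter */
        return 0;
}
```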
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Link: https://lore.kernel.org/r/20211020014036.2141723-1-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Conflict: block/blk-cgroup.c
- commit ed6cddefdfd3 ("block: convert the rest of block to
bdev_get_queue") is not backported.
- commit 015d254cb02b ("blkcg: separate blkcg_conf_get_disk() out of
blkg_conf_prep()") is not backported.
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-cgroup.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 7dca3a0144243..d4a8d8fbe1a0e 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -857,6 +857,14 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
q = disk->queue;
+ /*
+ * blkcg_deactivate_policy() requires queue to be frozen, we can grab
+ * q_usage_counter to prevent concurrent with blkcg_deactivate_policy().
+ */
+ ret = blk_queue_enter(q, 0);
+ if (ret)
+ goto fail;
+
rcu_read_lock();
spin_lock_irq(q->queue_lock);
@@ -891,13 +899,13 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
new_blkg = blkg_alloc(pos, q, GFP_KERNEL);
if (unlikely(!new_blkg)) {
ret = -ENOMEM;
- goto fail;
+ goto fail_exit_queue;
}
if (radix_tree_preload(GFP_KERNEL)) {
blkg_free(new_blkg);
ret = -ENOMEM;
- goto fail;
+ goto fail_exit_queue;
}
rcu_read_lock();
@@ -926,6 +934,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
goto success;
}
success:
+ blk_queue_exit(q);
ctx->disk = disk;
ctx->blkg = blkg;
ctx->body = body;
@@ -936,6 +945,8 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
fail_unlock:
spin_unlock_irq(q->queue_lock);
rcu_read_unlock();
+fail_exit_queue:
+ blk_queue_exit(q);
fail:
put_disk_and_module(disk);
/*
--
2.25.1
[PATCH openEuler-1.0-LTS] isdn: cpai: check ctr->cnr to avoid array index out of bound
by Yang Yingliang 01 Nov '21
From: Xiaolong Huang <butterflyhuangxx(a)gmail.com>
stable inclusion
from linux-4.19.214
commit 7d91adc0ccb060ce564103315189466eb822cc6a
CVE: CVE-2021-3896
--------------------------------
commit 1f3e2e97c003f80c4b087092b225c8787ff91e4d upstream.
The cmtp_add_connection() would add a cmtp session to a controller
and run a kernel thread to process cmtp.
    __module_get(THIS_MODULE);
    session->task = kthread_run(cmtp_session, session, "kcmtpd_ctr_%d",
                                session->num);
During this process, the kernel thread would call detach_capi_ctr()
to detach a registered controller. If the controller
was not attached yet, detach_capi_ctr() would
trigger an array-index-out-of-bounds bug.
[ 46.866069][ T6479] UBSAN: array-index-out-of-bounds in
drivers/isdn/capi/kcapi.c:483:21
[ 46.867196][ T6479] index -1 is out of range for type 'capi_ctr *[32]'
[ 46.867982][ T6479] CPU: 1 PID: 6479 Comm: kcmtpd_ctr_0 Not tainted
5.15.0-rc2+ #8
[ 46.869002][ T6479] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.14.0-2 04/01/2014
[ 46.870107][ T6479] Call Trace:
[ 46.870473][ T6479] dump_stack_lvl+0x57/0x7d
[ 46.870974][ T6479] ubsan_epilogue+0x5/0x40
[ 46.871458][ T6479] __ubsan_handle_out_of_bounds.cold+0x43/0x48
[ 46.872135][ T6479] detach_capi_ctr+0x64/0xc0
[ 46.872639][ T6479] cmtp_session+0x5c8/0x5d0
[ 46.873131][ T6479] ? __init_waitqueue_head+0x60/0x60
[ 46.873712][ T6479] ? cmtp_add_msgpart+0x120/0x120
[ 46.874256][ T6479] kthread+0x147/0x170
[ 46.874709][ T6479] ? set_kthread_struct+0x40/0x40
[ 46.875248][ T6479] ret_from_fork+0x1f/0x30
[ 46.875773][ T6479]
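The fix is to validate ctr->cnr before it is used as a 1-based array
index; a minimal sketch of the guard, with the constants from kcapi.c:
```c
/* Sketch: ctr->cnr indexes capi_controller[CAPI_MAXCONTR] starting at 1.
 * Reject anything outside [1, CAPI_MAXCONTR] before dereferencing, so a
 * never-attached controller (cnr == 0, giving index -1) cannot
 * underflow the array. */
if (ctr->cnr < 1 || ctr->cnr - 1 >= CAPI_MAXCONTR) {
        err = -EINVAL;
        goto unlock_out;
}
```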
Signed-off-by: Xiaolong Huang <butterflyhuangxx(a)gmail.com>
Acked-by: Arnd Bergmann <arnd(a)arndb.de>
Link: https://lore.kernel.org/r/20211008065830.305057-1-butterflyhuangxx@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Rui Xiang <rui.xiang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/isdn/capi/kcapi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/isdn/capi/kcapi.c b/drivers/isdn/capi/kcapi.c
index a4ceb61c5b603..ed9ee2bbf232e 100644
--- a/drivers/isdn/capi/kcapi.c
+++ b/drivers/isdn/capi/kcapi.c
@@ -565,6 +565,11 @@ int detach_capi_ctr(struct capi_ctr *ctr)
ctr_down(ctr, CAPI_CTR_DETACHED);
+ if (ctr->cnr < 1 || ctr->cnr - 1 >= CAPI_MAXCONTR) {
+ err = -EINVAL;
+ goto unlock_out;
+ }
+
if (capi_controller[ctr->cnr - 1] != ctr) {
err = -EINVAL;
goto unlock_out;
--
2.25.1
01 Nov '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.16
commit 0c98057be9efa32de78dbc4685fc73da9d71faa1
category: bugfix
bugzilla: 182939
CVE: NA
-----------------------------------------------
I got an issue as follows:
[ 263.886511] BUG: KASAN: use-after-free in pid_show+0x11f/0x13f
[ 263.888359] Read of size 4 at addr ffff8880bf0648c0 by task cat/746
[ 263.890479] CPU: 0 PID: 746 Comm: cat Not tainted 4.19.90-dirty #140
[ 263.893162] Call Trace:
[ 263.893509] dump_stack+0x108/0x15f
[ 263.893999] print_address_description+0xa5/0x372
[ 263.894641] kasan_report.cold+0x236/0x2a8
[ 263.895696] __asan_report_load4_noabort+0x25/0x30
[ 263.896365] pid_show+0x11f/0x13f
[ 263.897422] dev_attr_show+0x48/0x90
[ 263.898361] sysfs_kf_seq_show+0x24d/0x4b0
[ 263.899479] kernfs_seq_show+0x14e/0x1b0
[ 263.900029] seq_read+0x43f/0x1150
[ 263.900499] kernfs_fop_read+0xc7/0x5a0
[ 263.903764] vfs_read+0x113/0x350
[ 263.904231] ksys_read+0x103/0x270
[ 263.905230] __x64_sys_read+0x77/0xc0
[ 263.906284] do_syscall_64+0x106/0x360
[ 263.906797] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reproduce this issue as follows:
1. nbd-server 8000 /tmp/disk
2. nbd-client localhost 8000 /dev/nbd1
3. cat /sys/block/nbd1/pid
Then a use-after-free is triggered in pid_show().
The reason is that after step 2, the nbd-client process has already
exited, so its task_struct has already been freed.
To solve this issue, revert part of commit 6521d39a64b3 and remove the
useless 'recv_task' member of nbd_device.
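A minimal sketch of the resulting pattern (the attach helper is
illustrative; pid_show() matches the driver):
```c
#include <linux/sched.h>

/* Sketch: store the client's pid by value. A cached task_struct pointer
 * would dangle once the nbd-client task exits; a pid_t stays printable. */
static void nbd_record_client(struct nbd_device *nbd)
{
        nbd->pid = task_pid_nr(current);  /* was: nbd->task_recv = current */
}

static ssize_t pid_show(struct device *dev,
                        struct device_attribute *attr, char *buf)
{
        struct gendisk *disk = dev_to_disk(dev);
        struct nbd_device *nbd = disk->private_data;

        return sprintf(buf, "%d\n", nbd->pid);  /* no task_struct dereference */
}
```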
Fixes: 6521d39a64b3 ("nbd: Remove variable 'pid'")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Link: https://lore.kernel.org/r/20211020073959.2679255-1-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
drivers/block/nbd.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/block/nbd.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 45e6ae6add382..2a3794801704a 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -112,11 +112,11 @@ struct nbd_device {
struct workqueue_struct *recv_workq;
struct list_head list;
- struct task_struct *task_recv;
struct task_struct *task_setup;
struct completion *destroy_complete;
unsigned long flags;
+ pid_t pid; /* pid of nbd-client, if attached */
};
#define NBD_CMD_REQUEUED 1
@@ -211,7 +211,7 @@ static ssize_t pid_show(struct device *dev,
struct gendisk *disk = dev_to_disk(dev);
struct nbd_device *nbd = (struct nbd_device *)disk->private_data;
- return sprintf(buf, "%d\n", task_pid_nr(nbd->task_recv));
+ return sprintf(buf, "%d\n", nbd->pid);
}
static const struct device_attribute pid_attr = {
@@ -329,7 +329,7 @@ static void nbd_size_set(struct nbd_device *nbd, loff_t blocksize,
struct nbd_config *config = nbd->config;
config->blksize = blocksize;
config->bytesize = blocksize * nr_blocks;
- if (nbd->task_recv != NULL)
+ if (nbd->pid)
nbd_size_update(nbd, false);
}
@@ -1234,7 +1234,7 @@ static void nbd_config_put(struct nbd_device *nbd)
if (test_and_clear_bit(NBD_RT_HAS_PID_FILE,
&config->runtime_flags))
device_remove_file(disk_to_dev(nbd->disk), &pid_attr);
- nbd->task_recv = NULL;
+ nbd->pid = 0;
nbd_clear_sock(nbd);
if (config->num_connections) {
int i;
@@ -1269,7 +1269,7 @@ static int nbd_start_device(struct nbd_device *nbd)
int num_connections = config->num_connections;
int error = 0, i;
- if (nbd->task_recv)
+ if (nbd->pid)
return -EBUSY;
if (!config->socks)
return -EINVAL;
@@ -1288,7 +1288,7 @@ static int nbd_start_device(struct nbd_device *nbd)
}
blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
- nbd->task_recv = current;
+ nbd->pid = task_pid_nr(current);
nbd_parse_flags(nbd);
@@ -1546,8 +1546,8 @@ static int nbd_dbg_tasks_show(struct seq_file *s, void *unused)
{
struct nbd_device *nbd = s->private;
- if (nbd->task_recv)
- seq_printf(s, "recv: %d\n", task_pid_nr(nbd->task_recv));
+ if (nbd->pid)
+ seq_printf(s, "recv: %d\n", nbd->pid);
return 0;
}
@@ -2106,7 +2106,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
mutex_lock(&nbd->config_lock);
config = nbd->config;
if (!test_bit(NBD_RT_BOUND, &config->runtime_flags) ||
- !nbd->task_recv) {
+ !nbd->pid) {
dev_err(nbd_to_dev(nbd),
"not configured, cannot reconfigure\n");
ret = -EINVAL;
--
2.25.1
[PATCH openEuler-1.0-LTS] iommu/arm-smmu-v3: Add suspend and resume support
by Yang Yingliang 01 Nov '21
From: Bixuan Cui <cuibixuan(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DZ7Q
CVE: NA
-------------------------------------------------------
Add suspend and resume support for SMMUv3. The SMMU is
stopped when suspending and restarted when resuming.
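The wiring follows the standard platform-driver PM pattern; a condensed
sketch of the hunks below (the suspend hook is a no-op because the SMMU
registers are cleared on power-down, and resume re-runs the reset
sequence):
```c
#ifdef CONFIG_PM_SLEEP
static const struct dev_pm_ops arm_smmu_pm_ops = {
        .suspend = arm_smmu_suspend,  /* no-op: state is lost when powered off */
        .resume  = arm_smmu_resume,   /* calls arm_smmu_device_reset(smmu, true) */
};
#define ARM_SMMU_PM_OPS (&arm_smmu_pm_ops)
#else
#define ARM_SMMU_PM_OPS NULL          /* compiled out without PM_SLEEP */
#endif

/* hooked up via .pm = ARM_SMMU_PM_OPS in the platform_driver below */
```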
Signed-off-by: Bixuan Cui <cuibixuan(a)huawei.com>
Signed-off-by: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/iommu/arm-smmu-v3.c | 98 +++++++++++++++++++++++++++++++++----
1 file changed, 88 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 560980c54014a..12d503bb3b1e0 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -693,6 +693,7 @@ struct arm_smmu_device {
unsigned int mpam_partid_max;
unsigned int mpam_pmg_max;
+ bool bypass;
};
struct arm_smmu_stream {
@@ -3455,6 +3456,13 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
doorbell &= MSI_CFG0_ADDR_MASK;
+#ifdef CONFIG_PM_SLEEP
+ /* Saves the msg (base addr of msi irq) and restores it during resume */
+ desc->msg.address_lo = msg->address_lo;
+ desc->msg.address_hi = msg->address_hi;
+ desc->msg.data = msg->data;
+#endif
+
writeq_relaxed(doorbell, smmu->base + cfg[0]);
writel_relaxed(msg->data, smmu->base + cfg[1]);
writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
@@ -3510,6 +3518,40 @@ static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
devm_add_action(dev, arm_smmu_free_msis, dev);
}
+#ifdef CONFIG_PM_SLEEP
+static void arm_smmu_resume_msis(struct arm_smmu_device *smmu)
+{
+ struct msi_desc *desc;
+ struct device *dev = smmu->dev;
+
+ for_each_msi_entry(desc, dev) {
+ switch (desc->platform.msi_index) {
+ case EVTQ_MSI_INDEX:
+ case GERROR_MSI_INDEX:
+ case PRIQ_MSI_INDEX: {
+ phys_addr_t *cfg = arm_smmu_msi_cfg[desc->platform.msi_index];
+ struct msi_msg *msg = &desc->msg;
+ phys_addr_t doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
+
+ doorbell &= MSI_CFG0_ADDR_MASK;
+ writeq_relaxed(doorbell, smmu->base + cfg[0]);
+ writel_relaxed(msg->data, smmu->base + cfg[1]);
+ writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE,
+ smmu->base + cfg[2]);
+ break;
+ }
+ default:
+ continue;
+
+ }
+ }
+}
+#else
+static void arm_smmu_resume_msis(struct arm_smmu_device *smmu)
+{
+}
+#endif
+
static void arm_smmu_setup_message_based_spi(struct arm_smmu_device *smmu)
{
struct irq_desc *desc;
@@ -3541,11 +3583,17 @@ static void arm_smmu_setup_message_based_spi(struct arm_smmu_device *smmu)
}
}
-static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
+static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu, bool resume)
{
int irq, ret;
- arm_smmu_setup_msis(smmu);
+ if (!resume)
+ arm_smmu_setup_msis(smmu);
+ else {
+ /* The irq doesn't need to be re-requested during resume */
+ arm_smmu_resume_msis(smmu);
+ return;
+ }
/* Request interrupt lines */
irq = smmu->evtq.q.irq;
@@ -3587,7 +3635,7 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
}
}
-static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
+static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu, bool resume)
{
int ret, irq;
u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
@@ -3614,7 +3662,7 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
if (ret < 0)
dev_warn(smmu->dev, "failed to enable combined irq\n");
} else
- arm_smmu_setup_unique_irqs(smmu);
+ arm_smmu_setup_unique_irqs(smmu, resume);
if (smmu->features & ARM_SMMU_FEAT_PRI)
irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
@@ -3642,7 +3690,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
return ret;
}
-static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
+static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool resume)
{
int ret;
u32 reg, enables;
@@ -3747,7 +3795,7 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
}
}
- ret = arm_smmu_setup_irqs(smmu);
+ ret = arm_smmu_setup_irqs(smmu, resume);
if (ret) {
dev_err(smmu->dev, "failed to setup irqs\n");
return ret;
@@ -3757,7 +3805,7 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
/* Enable the SMMU interface, or ensure bypass */
- if (!bypass || disable_bypass) {
+ if (!smmu->bypass || disable_bypass) {
enables |= CR0_SMMUEN;
} else {
ret = arm_smmu_update_gbpa(smmu, 0, GBPA_ABORT);
@@ -4383,6 +4431,26 @@ int arm_smmu_get_dev_user_mpam_en(struct device *dev, int *user_mpam_en)
}
EXPORT_SYMBOL(arm_smmu_get_dev_user_mpam_en);
+#ifdef CONFIG_PM_SLEEP
+static int arm_smmu_suspend(struct device *dev)
+{
+ /*
+ * The smmu is powered off and related registers are automatically
+ * cleared when suspend. No need to do anything.
+ */
+ return 0;
+}
+
+static int arm_smmu_resume(struct device *dev)
+{
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+ arm_smmu_device_reset(smmu, true);
+
+ return 0;
+}
+#endif
+
static int arm_smmu_device_probe(struct platform_device *pdev)
{
int irq, ret;
@@ -4390,7 +4458,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
resource_size_t ioaddr;
struct arm_smmu_device *smmu;
struct device *dev = &pdev->dev;
- bool bypass;
smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
if (!smmu) {
@@ -4408,7 +4475,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
}
/* Set bypass mode according to firmware probing result */
- bypass = !!ret;
+ smmu->bypass = !!ret;
/* Base address */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -4454,7 +4521,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, smmu);
/* Reset the device */
- ret = arm_smmu_device_reset(smmu, bypass);
+ ret = arm_smmu_device_reset(smmu, false);
if (ret)
return ret;
@@ -4527,11 +4594,22 @@ static const struct of_device_id arm_smmu_of_match[] = {
};
MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
+#ifdef CONFIG_PM_SLEEP
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+ .suspend = arm_smmu_suspend,
+ .resume = arm_smmu_resume,
+};
+#define ARM_SMMU_PM_OPS (&arm_smmu_pm_ops)
+#else
+#define ARM_SMMU_PM_OPS NULL
+#endif
+
static struct platform_driver arm_smmu_driver = {
.driver = {
.name = "arm-smmu-v3",
.of_match_table = of_match_ptr(arm_smmu_of_match),
.suppress_bind_attrs = true,
+ .pm = ARM_SMMU_PM_OPS,
},
.probe = arm_smmu_device_probe,
.remove = arm_smmu_device_remove,
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] scsi: scsi_debug: Fix out-of-bound read in resp_readcap16()
by Yang Yingliang 01 Nov '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.16
commit 4e3ace0051e7e504b55d239daab8789dd89b863c
category: bugfix
bugzilla: 176010
CVE: NA
-----------------------------------------------
The following warning was observed running syzkaller:
[ 3813.830724] sg_write: data in/out 65466/242 bytes for SCSI command 0x9e-- guessing data in;
[ 3813.830724] program syz-executor not setting count and/or reply_len properly
[ 3813.836956] ==================================================================
[ 3813.839465] BUG: KASAN: stack-out-of-bounds in sg_copy_buffer+0x157/0x1e0
[ 3813.841773] Read of size 4096 at addr ffff8883cf80f540 by task syz-executor/1549
[ 3813.846612] Call Trace:
[ 3813.846995] dump_stack+0x108/0x15f
[ 3813.847524] print_address_description+0xa5/0x372
[ 3813.848243] kasan_report.cold+0x236/0x2a8
[ 3813.849439] check_memory_region+0x240/0x270
[ 3813.850094] memcpy+0x30/0x80
[ 3813.850553] sg_copy_buffer+0x157/0x1e0
[ 3813.853032] sg_copy_from_buffer+0x13/0x20
[ 3813.853660] fill_from_dev_buffer+0x135/0x370
[ 3813.854329] resp_readcap16+0x1ac/0x280
[ 3813.856917] schedule_resp+0x41f/0x1630
[ 3813.858203] scsi_debug_queuecommand+0xb32/0x17e0
[ 3813.862699] scsi_dispatch_cmd+0x330/0x950
[ 3813.863329] scsi_request_fn+0xd8e/0x1710
[ 3813.863946] __blk_run_queue+0x10b/0x230
[ 3813.864544] blk_execute_rq_nowait+0x1d8/0x400
[ 3813.865220] sg_common_write.isra.0+0xe61/0x2420
[ 3813.871637] sg_write+0x6c8/0xef0
[ 3813.878853] __vfs_write+0xe4/0x800
[ 3813.883487] vfs_write+0x17b/0x530
[ 3813.884008] ksys_write+0x103/0x270
[ 3813.886268] __x64_sys_write+0x77/0xc0
[ 3813.886841] do_syscall_64+0x106/0x360
[ 3813.887415] entry_SYSCALL_64_after_hwframe+0x44/0xa9
This issue can be reproduced with the following syzkaller log:
r0 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x26e1, 0x0)
r1 = syz_open_procfs(0xffffffffffffffff, &(0x7f0000000000)='fd/3\x00')
open_by_handle_at(r1, &(0x7f00000003c0)=ANY=[@ANYRESHEX], 0x602000)
r2 = syz_open_dev$sg(&(0x7f0000000000), 0x0, 0x40782)
write$binfmt_aout(r2, &(0x7f0000000340)=ANY=[@ANYBLOB="00000000deff000000000000000000000000000000000000000000000000000047f007af9e107a41ec395f1bded7be24277a1501ff6196a83366f4e6362bc0ff2b247f68a972989b094b2da4fb3607fcf611a22dd04310d28c75039d"], 0x126)
In resp_readcap16() we get an "int alloc_len" value of -1104926854, and then
pass the huge arr_len to fill_from_dev_buffer(), but arr is only 32 bytes.
This leads to an OOB read in sg_copy_buffer().
To solve this issue, define alloc_len as u32.
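A minimal sketch of the signedness pitfall, reusing the names from the
hunks below (inside resp_readcap16(); SDEBUG_READCAP16_ARR_SZ is 32):
```c
#include <asm/unaligned.h>
#include <linux/kernel.h>       /* min_t() */

/* Buggy: a large big-endian length from userspace wraps the signed int
 * negative, min() keeps the negative value, and it turns into a huge
 * copy length further down the call chain. */
int bad_len = get_unaligned_be32(cmd + 10);     /* e.g. -1104926854 */

/* Fixed: keep the length unsigned and clamp it to the 32-byte buffer. */
u32 alloc_len = get_unaligned_be32(cmd + 10);
u32 len = min_t(u32, alloc_len, SDEBUG_READCAP16_ARR_SZ);
```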
Link: https://lore.kernel.org/r/20211013033913.2551004-2-yebin10@huawei.com
Acked-by: Douglas Gilbert <dgilbert(a)interlog.com>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
conflicts:
drivers/scsi/scsi_debug.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/scsi_debug.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 082b2695e02f3..d8befc6fbec9c 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -1665,7 +1665,7 @@ static int resp_readcap16(struct scsi_cmnd *scp,
{
unsigned char *cmd = scp->cmnd;
unsigned char arr[SDEBUG_READCAP16_ARR_SZ];
- int alloc_len;
+ u32 alloc_len;
alloc_len = get_unaligned_be32(cmd + 10);
/* following just in case virtual_gb changed */
@@ -1694,7 +1694,7 @@ static int resp_readcap16(struct scsi_cmnd *scp,
}
return fill_from_dev_buffer(scp, arr,
- min(alloc_len, SDEBUG_READCAP16_ARR_SZ));
+ min_t(u32, alloc_len, SDEBUG_READCAP16_ARR_SZ));
}
#define SDEBUG_MAX_TGTPGS_ARR_SZ 1412
--
2.25.1
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4FS3G?from=project-issue
CVE: NA
---------------------------
There are some language problems in the README file, and the Markdown format
syntax does not render correctly, so the file needs to be adjusted.
Signed-off-by: suqin <suqin2(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
---
README | 226 ----------------------------------------
README.md | 237 ++++++++++++++++++++++++++++++++++++++++++
scripts/checkpatch.pl | 2 +-
3 files changed, 238 insertions(+), 227 deletions(-)
delete mode 100644 README
create mode 100644 README.md
diff --git a/README b/README
deleted file mode 100644
index 46c9ea352..000000000
--- a/README
+++ /dev/null
@@ -1,226 +0,0 @@
-Contributions to openEuler kernel project
-=========================================
-
-Sign CLA
---------
-
-Before submitting any Contributions to openEuler, you have to sign CLA.
-
-See:
- https://openeuler.org/zh/cla.html
- https://openeuler.org/en/cla.html
-
-Steps of submitting patches
----------------------------
-
-1. Compile and test your patches successfully.
-2. Generate patches
- Your patches should be based on top of latest openEuler branch, and should
- use git-format-patch to generate patches, and if it's a patchset, it's
- better to use --cover-letter option to describe what the patchset does.
-
- Using scripts/checkpatch.pl to make sure there's no coding style issue.
-
- And make sure your patch follow unified openEuler patch format describe
- below.
-
-3. Send patch to openEuler mailing list
- Use this command to send patches to openEuler mailing list:
-
- git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all
-
- *NOTE*: that you must add --suppress-cc=all if you use git send-email,
- otherwise the email will be cced to the people in upstream community and mailing
- lists.
-
- *See*: How to send patches using git-send-email
- https://git-scm.com/docs/git-send-email
-
-4. Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions
- to send out.
-
- Use --subject-prefix="PATCH v2" option to add v2 tag for patchset.
- git format-patch --subject-prefix="PATCH v2" -1
-
- Subject examples:
- Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
- Subject: [PATCH v3] ext2: improve scalability of bitmap searching
-
-5. Upstream your kernel patch to kernel community is strongly recommended.
- openEuler will sync up with kernel master timely.
-
-6. Sign your work - the Developer’s Certificate of Origin
- As the same of upstream kernel community, you also need to sign your patch.
-
- See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html
-
- The sign-off is a simple line at the end of the explanation for the patch,
- which certifies that you wrote it or otherwise have the right to pass it
- on as an open-source patch. The rules are pretty simple: if you can certify
- the below:
-
- Developer’s Certificate of Origin 1.1
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- By making a contribution to this project, I certify that:
-
- (a) The contribution was created in whole or in part by me and I have
- the right to submit it under the open source license indicated in
- the file; or
-
- (b The contribution is based upon previous work that, to the best of
- my knowledge, is covered under an appropriate open source license
- and I have the right under that license to submit that work with
- modifications, whether created in whole or in part by me, under
- the same open source license (unless I am permitted to submit under
- a different license), as indicated in the file; or
-
- (c) The contribution was provided directly to me by some other person
- who certified (a), (b) or (c) and I have not modified it.
-
- (d) I understand and agree that this project and the contribution are
- public and that a record of the contribution (including all personal
- information I submit with it, including my sign-off) is maintained
- indefinitely and may be redistributed consistent with this project
- or the open source license(s) involved.
-
- then you just add a line saying:
-
- Signed-off-by: Random J Developer <random(a)developer.example.org>
-
- using your real name (sorry, no pseudonyms or anonymous contributions.)
-
-Use unified patch format
-------------------------
-
-Reasons:
-
-1. long term maintainability
- openEuler will merge massive patches. If all patches are merged by casual
- changelog format without a unified format, the git log will be messy, and
- then it's hard to figure out the original patch.
-
-2. kernel upgrade
- We definitely will upgrade our openEuler kernel in someday, using strict
- patch management will alleviate the pain to migrate patches during big upgrade.
-
-3. easy for script parsing
- Keyword highlighting is necessary for script parsing.
-
-Patch format definition
------------------------
-
-[M] stands for "mandatory"
-[O] stands for "option"
-$category can be: bug preparation, bugfix, perf, feature, doc, other...
-
-If category is feature, then we also need to add feature name like below:
- category: feature
- feature: YYY (the feature name)
-
-If the patch is related to CVE or bugzilla, then we need add the corresponding
-tag like below (In general, it should include at least one of the following):
- CVE: $cve-id
- bugzilla: $bug-id
-
-Additional changelog should include at least one of the flollwing:
- 1) Why we should apply this patch
- 2) What real problem in the product does this patch resolve
- 3) How could we reproduce this bug or how to test
- 4) Other useful information for help to understand this patch or problem
-
-The detailed information is very useful for porting the patch to another kernel branch.
-
-Example for mainline patch:
-
- mainline inclusion [M]
- from $mainline-version [M]
- commit $id [M]
- category: $category [M]
- bugzilla: $bug-id [O]
- CVE: $cve-id [O]
-
- additional changelog [O]
-
- --------------------------------
-
- original changelog
-
- Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
-
- ($mainline-version could be mainline-3.5, mainline-3.6, etc...)
-
-Examples
---------
-
-mainline inclusion
-from mainline-4.10
-commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
-category: bugfix
-bugzilla: 3004
-CVE: NA
-
-The patch fixes a BUG_ON in the product: injecting a single bit ECC error
- to memory before system boot using hardware inject tools, which causes a
- large amount of CMCI during system booting.
-
-[ 1.146580] mce: [Hardware Error]: Machine check events logged
-[ 1.152908] ------------[ cut here ]------------
-[ 1.157751] kernel BUG at kernel/timer.c:951!
-[ 1.162321] invalid opcode: 0000 [#1] SMP
-...
-
--------------------------------------------------
-
-original changelog
-
-<original S-O-B>
-Signed-off-by: Zhang San <zhangsan(a)huawei.com>
-Tested-by: Li Si <lisi(a)huawei.com>
-
-Email Client - Thunderbird Settings
------------------------------------
-
-If you are a new developer in the kernel community, it is highly recommended
-to use the Thunderbird mail client.
-
-1. Thunderbird Installation
- Get English version Thunderbird from http://www.mozilla.org/ and install
- it on your system.
-
- Download url: https://www.thunderbird.net/en-US/thunderbird/all/
-
-2. Settings
- 2.1 Use plain text format instead of HTML format
- Options -> Account Settings -> Composition & Addressing, do *NOT* select
- "Compose message in HTML format".
-
- 2.2 Editor Settings
- Tools->Options->Advanced->Config editor.
-
- - To bring up the thunderbird's registry editor, and set:
- "mailnews.send_plaintext_flowed" to "false".
- - Disable HTML Format: Set "mail.identity.id1.compose_html" to "false".
- - Enable UTF8: Set "prefs.converted-to-utf8" to "true".
- - View message in UTF-8: Set "mailnews.view_default_charset" to "UTF-8".
- - Set mailnews.wraplength to 9999 for avoiding auto-wrap
-
-Linux kernel
-============
-
-There are several guides for kernel developers and users. These guides can
-be rendered in a number of formats, like HTML and PDF. Please read
-Documentation/admin-guide/README.rst first.
-
-In order to build the documentation, use ``make htmldocs`` or
-``make pdfdocs``. The formatted documentation can also be read online at:
-
- https://www.kernel.org/doc/html/latest/
-
-There are various text files in the Documentation/ subdirectory,
-several of them using the Restructured Text markup notation.
-See Documentation/00-INDEX for a list of what is contained in each file.
-
-Please read the Documentation/process/changes.rst file, as it contains the
-requirements for building and running the kernel, and information about
-the problems which may result by upgrading your kernel.
diff --git a/README.md b/README.md
new file mode 100644
index 000000000..3374c4726
--- /dev/null
+++ b/README.md
@@ -0,0 +1,237 @@
+# How to Contribute
+-------
+
+- [How to Contribute](#how-to-contribute)
+  - [Sign the CLA](#sign-the-cla)
+  - [Steps of submitting patches](#steps-of-submitting-patches)
+  - [Use the unified patch format](#use-the-unified-patch-format)
+  - [Define the patch format](#define-the-patch-format)
+  - [Examples](#examples)
+  - [Email client - Thunderbird settings](#email-client---thunderbird-settings)
+- [Linux kernel](#linux-kernel)
+
+### Sign the CLA
+
+-------
+
+Before making any contributions to openEuler, sign the CLA first.
+
+Address: [https://openeuler.org/en/cla.html](https://openeuler.org/en/cla.html)
+
+### Steps of submitting patches
+-------
+
+**Step 1** Compile and test your patches.
+
+**Step 2** Generate patches.
+
+Your patches should be generated based on the latest openEuler branch using git-format-patch. If your patches are in a patchset, it is better to use the **--cover-letter** option to describe what the patchset does.
+
+Use **scripts/checkpatch.pl** to ensure that no coding style issue exists.
+
+In addition, ensure that your patches comply with the unified openEuler patch format described below.
+
+**Step 3** Send your patches to the openEuler mailing list.
+
+To do so, run the following command:
+
+ `git send-email *.patch --to="kernel(a)openeuler.org" --suppress-cc=all`
+
+*NOTE*: Add **--suppress-cc=all** if you use git-send-email; otherwise, the email will be copied to all people in the upstream community and mailing lists.
+
+For details about how to send patches using git-send-email, see [https://git-scm.com/docs/git-send-email](https://git-scm.com/docs/git-send-email).
+
+**Step 4** Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions to send out.
+
+Use the **--subject-prefix="PATCH v2"** option to add the v2 tag to the patchset.
+
+ `git format-patch --subject-prefix="PATCH v2" -1`
+
+Subject examples:
+
+ Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
+
+ Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+
+**Step 5** Upstream your kernel patches to the kernel community (recommended). openEuler will synchronize with the kernel master in a timely manner.
+
+**Step 6** Sign your work - the Developer’s Certificate of Origin.
+
+ Similar to the upstream kernel community, you also need to sign your patch.
+
+ For details, see [https://www.kernel.org/doc/html/latest/process/submitting-patches.html](https://www.kernel.org/doc/html/latest/process/submitting-patches.html).
+
+ The sign-off is a simple line at the end of the explanation of the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open source patch. The rules are pretty simple. You can certify as below:
+
+ **Developer’s Certificate of Origin 1.1**
+
+ By making a contribution to this project, I certify that:
+
+ (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file;
+
+ (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file;
+
+ (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
+
+ (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
+
+Then you add a line saying:
+
+Signed-off-by: Random J Developer <random(a)developer.example.org>
+
+Use your real name (sorry, no pseudonyms or anonymous contributions).
+
+### Use the unified patch format
+-------
+
+Reasons:
+
+1. Long term maintainability
+
+ openEuler will merge massive numbers of patches. If all patches were merged
+ with casual changelogs and no unified format, the git log would be messy and
+ it would be hard to trace the original patches.
+
+2. Kernel upgrade
+
+ We will definitely upgrade the openEuler kernel someday, so strict patch
+ management will alleviate the pain of migrating patches during big upgrades.
+
+3. Easy for script parsing
+
+ Keyword highlighting is necessary for script parsing.
+
+### Define the patch format
+-------
+
+[M] stands for "mandatory".
+
+[O] stands for "optional".
+
+$category can be: bug preparation, bugfix, perf, feature, doc, other...
+
+If category is feature, we need to add a feature name as below:
+
+```cpp
+category: feature
+feature: YYY (the feature name)
+```
+
+If the patch is related to CVE or bugzilla, we need to add the corresponding tag as below (In general, it should include at least one of the following):
+
+```cpp
+CVE: $cve-id
+bugzilla: $bug-id
+```
+
+Additional changelog should include at least one of the following:
+
+1. Why we should apply this patch
+
+2. What real problem in the product does this patch resolve
+
+3. How could we reproduce this bug or how to test
+
+4. Other useful information that helps to understand this patch or the problem
+
+The detailed information is very useful for migrating a patch to another kernel branch.
+
+Example for mainline patch:
+
+```cpp
+mainline inclusion [M]
+from $mainline-version [M]
+commit $id [M]
+category: $category [M]
+bugzilla: $bug-id [O]
+CVE: $cve-id [O]
+
+additional changelog [O]
+
+--------------------------------
+
+original changelog
+Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+```
+
+### Examples
+-------
+
+```cpp
+mainline inclusion
+from mainline-4.10
+commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
+category: bugfix
+bugzilla: 3004
+CVE: NA
+
+The patch fixes a BUG_ON in the product: injecting a single-bit ECC error into the memory before system boot using hardware inject tools causes a large amount of CMCI during system booting.
+[ 1.146580] mce: [Hardware Error]: Machine check events logged
+[ 1.152908] ------------[ cut here ]------------
+[ 1.157751] kernel BUG at kernel/timer.c:951!
+[ 1.162321] invalid opcode: 0000 [#1] SMP
+
+-------------------------------------------------
+
+original changelog
+
+<original S-O-B>
+Signed-off-by: Zhang San <zhangsan(a)huawei.com>
+Tested-by: Li Si <lisi(a)huawei.com>
+```
+
+### Email client - Thunderbird settings
+-------
+
+If you are a new developer in the kernel community, it is highly recommended that you use the Thunderbird mail client.
+
+1. Thunderbird Installation
+
+ Obtain the English version of Thunderbird from [http://www.mozilla.org/](http://www.mozilla.org/) and install it on your system.
+
+ Download URL: https://www.thunderbird.net/en-US/thunderbird/all/
+
+2. Settings
+
+ 2.1 Use the plain text format instead of the HTML format.
+
+ Choose **Options > Account Settings > Composition & Addressing**, and do **NOT** select Compose message in HTML format.
+
+ 2.2 Editor settings
+
+ **Tools > Options > Advanced > Config editor**
+
+ - In Thunderbird's registry editor, set **mailnews.send_plaintext_flowed** to **false**.
+
+ - Disable HTML format: Set **mail.identity.id1.compose_html** to **false**.
+
+ - Enable UTF-8: Set **prefs.converted-to-utf8** to **true**.
+
+ - View messages in UTF-8: Set **mailnews.view_default_charset** to **UTF-8**.
+
+ - Set **mailnews.wraplength** to **9999** to avoid auto-wrap.
+
+# Linux kernel
+-------
+
+There are several guides for kernel developers and users, which can be rendered in a number of formats, like HTML and PDF. You can read **Documentation/admin-guide/README.rst** first.
+
+In order to build the documentation, use **make htmldocs** or **make pdfdocs**. The formatted documentation can also be read online at: https://www.kernel.org/doc/html/latest/
+
+There are various text files in the Documentation/ subdirectory, several of which use the Restructured Text markup notation. See Documentation/00-INDEX for a list of what is contained in each file.
+
+Read the **Documentation/process/changes.rst** file, as it contains the requirements for building and running the kernel, and information about the problems that may be caused by upgrading your kernel.
+
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 161b0224d..d9a8ae4da 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1063,7 +1063,7 @@ sub top_of_kernel_tree {
my @tree_check = (
"COPYING", "CREDITS", "Kbuild", "MAINTAINERS", "Makefile",
- "README", "Documentation", "arch", "include", "drivers",
+ "README.md", "Documentation", "arch", "include", "drivers",
"fs", "init", "ipc", "kernel", "lib", "scripts",
);
--
2.22.0
[PATCH openEuler-1.0-LTS] scsi: hisi_sas: unsupported DIX between OS and HBA only for SATA device
by Yang Yingliang 01 Nov '21
From: Yang Xingui <yangxingui(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: NA
CVE: NA
Signed-off-by: Yang Xingui <yangxingui(a)huawei.com>
Reviewed-by: Ouyangdelong <ouyangdelong(a)huawei.com>
Reviewed-by: Kangfenglong <kangfenglong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/hisi_sas/hisi_sas.h | 10 ++++++++++
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 7 ++++++-
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 3fd32606ecb00..8e9424e62a150 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -77,6 +77,16 @@
#define HISI_SAS_SATA_PROTOCOL_FPDMA 0x8
#define HISI_SAS_SATA_PROTOCOL_ATAPI 0x10
+#define HISI_SAS_DIF_PROT_MASK (SHOST_DIF_TYPE1_PROTECTION | \
+ SHOST_DIF_TYPE2_PROTECTION | \
+ SHOST_DIF_TYPE3_PROTECTION)
+
+#define HISI_SAS_DIX_PROT_MASK (SHOST_DIX_TYPE1_PROTECTION | \
+ SHOST_DIX_TYPE2_PROTECTION | \
+ SHOST_DIX_TYPE3_PROTECTION)
+
+#define HISI_SAS_PROT_MASK (HISI_SAS_DIF_PROT_MASK | HISI_SAS_DIX_PROT_MASK)
+
#define CLEAR_ITCT_TIMEOUT 20
struct hisi_hba;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 0e4cc16e542d6..9d6e21be35841 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -3357,9 +3357,14 @@ hisi_sas_shost_alloc_pci(struct pci_dev *pdev)
hisi_hba->dev = dev;
hisi_hba->shost = shost;
SHOST_TO_SAS_HA(shost) = &hisi_hba->sha;
- hisi_hba->enable_dix_dif = enable_dix_dif;
hisi_hba->user_ctl_irq = user_ctl_irq;
+ if (enable_dix_dif & ~HISI_SAS_PROT_MASK)
+ dev_err(dev, "unsupported protection mask 0x%x, using default (0x0)\n",
+ enable_dix_dif);
+ else
+ hisi_hba->enable_dix_dif = enable_dix_dif;
+
timer_setup(&hisi_hba->timer, NULL, 0);
if (hisi_sas_get_fw_info(hisi_hba) < 0)
--
2.25.1
[PATCH openEuler-1.0-LTS 01/19] nbd: don't handle response without a corresponding request message
by Yang Yingliang 30 Oct '21
From: Yu Kuai <yukuai3(a)huawei.com>
mainline inclusion
from mainline-next-20211018
commit b5644a3a79bf3be5f1238db1b2f241374b27b0f0
category: bugfix
bugzilla: 49890
CVE: NA
---------------------------
While handling a response message from the server, nbd_read_stat() will
try to get the request by tag and then complete the request. However,
this is problematic if nbd hasn't sent a corresponding request
message:
t1                        t2
submit_bio
 nbd_queue_rq
  blk_mq_start_request
                          recv_work
                           nbd_read_stat
                            blk_mq_tag_to_rq
                           blk_mq_complete_request
  nbd_send_cmd
Thus add a new cmd flag 'NBD_CMD_INFLIGHT'; it will be set in
nbd_send_cmd() and checked in nbd_read_stat().
Note that this patch can't fix the fact that blk_mq_tag_to_rq() might
return a freed request; this will be fixed in the following
patches.
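As an aside, the guard-flag handshake can be illustrated with a minimal
user-space sketch. This is not the kernel code: cmd->lock is modelled by a
pthread mutex and the flag by a plain bool.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative only: a reply may complete a command only after the
 * submit side has marked it in-flight under the same lock. */
struct cmd {
	pthread_mutex_t lock;
	bool inflight;
};

static void submit_side(struct cmd *c)
{
	/* ... after the request was successfully sent to the server ... */
	pthread_mutex_lock(&c->lock);
	c->inflight = true;	/* only now may a reply complete us */
	pthread_mutex_unlock(&c->lock);
}

static void reply_side(struct cmd *c)
{
	pthread_mutex_lock(&c->lock);
	if (!c->inflight) {	/* reply for a request that was never sent */
		pthread_mutex_unlock(&c->lock);
		printf("suspicious reply ignored\n");
		return;
	}
	c->inflight = false;
	pthread_mutex_unlock(&c->lock);
	printf("request completed\n");
}

int main(void)
{
	struct cmd c = { PTHREAD_MUTEX_INITIALIZER, false };

	reply_side(&c);		/* early reply: rejected */
	submit_side(&c);
	reply_side(&c);		/* normal reply: completed */
	return 0;
}
```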
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Ming Lei <ming.lei(a)redhat.com>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Link: https://lore.kernel.org/r/20210916093350.1410403-2-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/block/nbd.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 6a72c07ce3cba..05153b84d5400 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -120,6 +120,12 @@ struct nbd_device {
};
#define NBD_CMD_REQUEUED 1
+/*
+ * This flag will be set if nbd_queue_rq() succeed, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT 2
struct nbd_cmd {
struct nbd_device *nbd;
@@ -369,6 +375,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
if (!mutex_trylock(&cmd->lock))
return BLK_EH_RESET_TIMER;
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
if (!refcount_inc_not_zero(&nbd->config_refs)) {
cmd->status = BLK_STS_TIMEOUT;
mutex_unlock(&cmd->lock);
@@ -674,6 +681,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
cmd = blk_mq_rq_to_pdu(req);
mutex_lock(&cmd->lock);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+ tag, cmd->status, cmd->flags);
+ ret = -ENOENT;
+ goto out;
+ }
if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -768,6 +781,7 @@ static void nbd_clear_req(struct request *req, void *data, bool reserved)
struct nbd_cmd *cmd = blk_mq_rq_to_pdu(req);
mutex_lock(&cmd->lock);
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
cmd->status = BLK_STS_IOERR;
mutex_unlock(&cmd->lock);
@@ -903,7 +917,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
* returns EAGAIN can be retried on a different socket.
*/
ret = nbd_send_cmd(nbd, cmd, index);
- if (ret == -EAGAIN) {
+ /*
+ * Access to this flag is protected by cmd->lock, thus it's safe to set
+ * the flag after nbd_send_cmd() succeed to send request to server.
+ */
+ if (!ret)
+ __set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ else if (ret == -EAGAIN) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Request send failed, requeueing\n");
nbd_mark_nsock_dead(nbd, nsock, 1);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/5] numa: Move the management structures for cdm nodes to ddr
by Yang Yingliang 30 Oct '21
From: Wang Wensheng <wangwensheng4(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I
CVE: NA
-------------------------------------------------
The cdm nodes are more likely to raise an ECC error, and it may cause a
kernel crash if the essential structures go wrong. So move the
management structures for the hbm nodes to the ddr nodes of the same
partition to reduce the probability of kernel crashes.
Signed-off-by: Wang Wensheng <wangwensheng4(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/Kconfig | 10 ++++++++
arch/arm64/mm/numa.c | 54 +++++++++++++++++++++++++++++++++++++++-
include/linux/nodemask.h | 7 ++++++
mm/sparse.c | 8 +++---
4 files changed, 75 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9d49b9524e1d4..2f34aef79179e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1470,6 +1470,16 @@ config ASCEND_SHARE_POOL
help
This feature allows multiple processes to share virtual memory both
in kernel and user level, which is only enabled for ascend platform.
+
+config ASCEND_CLEAN_CDM
+ bool "move the management structure for HBM to DDR"
+ def_bool n
+ depends on COHERENT_DEVICE
+ help
+ The cdm nodes sometimes are easiler to raise an ECC error and it may
+ cause the kernel crash if the essential structures went wrong. So move
+ the management structures for hbm nodes to the ddr nodes of the same
+ partion to reduce the probability of kernel crashes.
endif
endmenu
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index a9d3ad5ee0cc3..a194bad6fdfcf 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -45,6 +45,57 @@ inline int arch_check_node_cdm(int nid)
return node_isset(nid, cdmmask);
}
+#ifdef CONFIG_ASCEND_CLEAN_CDM
+/**
+ * cdm_node_to_ddr_node - Convert the cdm node to the ddr node of the
+ * same partion.
+ * @nid: input node ID
+ *
+ * Here is a typical memory topology in usage.
+ * There are some DDR and HBM in each partion and DDRs present at first, then
+ * come all the HBMs of the first partion, then HBMs of the second partion, etc.
+ *
+ * -------------------------
+ * | P0 | P1 |
+ * ----------- | -----------
+ * |node0 DDR| | |node1 DDR|
+ * |---------- | ----------|
+ * |node2 HBM| | |node4 HBM|
+ * |---------- | ----------|
+ * |node3 HBM| | |node5 HBM|
+ * ----------- | -----------
+ *
+ * Return:
+ * This function returns a ddr node which is of the same partion with the input
+ * node if the input node is a HBM node.
+ * The input nid is returned if it is a DDR node or if the memory topology of
+ * the system doesn't apply to the above model.
+ */
+int __init cdm_node_to_ddr_node(int nid)
+{
+ nodemask_t ddr_mask;
+ int nr_ddr, cdm_per_part, fake_nid;
+ int nr_cdm = nodes_weight(cdmmask);
+
+ if (!nr_cdm || nodes_empty(numa_nodes_parsed))
+ return nid;
+
+ if (!node_isset(nid, cdmmask))
+ return nid;
+
+ nodes_xor(ddr_mask, cdmmask, numa_nodes_parsed);
+ nr_ddr = nodes_weight(ddr_mask);
+ cdm_per_part = nr_cdm / nr_ddr ? : 1;
+
+ fake_nid = (nid - nr_ddr) / cdm_per_part;
+ fake_nid = !node_isset(fake_nid, cdmmask) ? fake_nid : nid;
+
+ pr_info("nid: %d, fake_nid: %d\n", nid, fake_nid);
+
+ return fake_nid;
+}
+#endif
+
static int __init cdm_nodes_setup(char *s)
{
int nid;
@@ -264,11 +315,12 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
u64 nd_pa;
void *nd;
int tnid;
+ int fake_nid = cdm_node_to_ddr_node(nid);
if (start_pfn >= end_pfn)
pr_info("Initmem setup node %d [<memory-less node>]\n", nid);
- nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
+ nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, fake_nid);
nd = __va(nd_pa);
/* report and initialize */
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 41fb047bdba80..7c0571b95ce4d 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -508,6 +508,12 @@ static inline int node_random(const nodemask_t *mask)
#ifdef CONFIG_COHERENT_DEVICE
extern int arch_check_node_cdm(int nid);
+#ifdef CONFIG_ASCEND_CLEAN_CDM
+extern int cdm_node_to_ddr_node(int nid);
+#else
+static inline int cdm_node_to_ddr_node(int nid) { return nid; }
+#endif
+
static inline nodemask_t system_mem_nodemask(void)
{
nodemask_t system_mem;
@@ -551,6 +557,7 @@ static inline void node_clear_state_cdm(int node)
#else
static inline int arch_check_node_cdm(int nid) { return 0; }
+static inline int cdm_node_to_ddr_node(int nid) { return nid; }
static inline nodemask_t system_mem_nodemask(void)
{
diff --git a/mm/sparse.c b/mm/sparse.c
index 62ae3880a9add..f19d2ca250cee 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -458,21 +458,23 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
{
unsigned long pnum, usemap_longs, *usemap;
struct page *map;
+ int fake_nid = cdm_node_to_ddr_node(nid);
usemap_longs = BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS);
- usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
+ usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(fake_nid),
usemap_size() *
map_count);
if (!usemap) {
pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
goto failed;
}
- sparse_buffer_init(map_count * section_map_size(), nid);
+
+ sparse_buffer_init(map_count * section_map_size(), fake_nid);
for_each_present_section_nr(pnum_begin, pnum) {
if (pnum >= pnum_end)
break;
- map = sparse_mem_map_populate(pnum, nid, NULL);
+ map = sparse_mem_map_populate(pnum, fake_nid, NULL);
if (!map) {
pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
__func__, nid);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/4] perf: hisi: Add support for HiSilicon SoC PMU driver dt probe
by Yang Yingliang 30 Oct '21
From: Fang Lijun <fanglijun3(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4D4WR
CVE: NA
---------------------------
Add support for the hisi PMU driver dt probe, and fix its compile error
when CONFIG_ACPI is disabled.
Signed-off-by: Fang Lijun <fanglijun3(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/perf/Kconfig | 2 +-
drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 +
drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 23 +++++++++++++++----
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 20 +++++++++++++---
4 files changed, 38 insertions(+), 8 deletions(-)
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 92be6a36a128f..d4b9681418f88 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -72,7 +72,7 @@ config ARM_DSU_PMU
config HISI_PMU
bool "HiSilicon SoC PMU"
- depends on ARM64 && ACPI
+ depends on ARM64
help
Support for HiSilicon SoC uncore performance monitoring
unit (PMU), such as: L3C, HHA and DDRC.
diff --git a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
index 090667d487504..3f3f4ab3aacce 100644
--- a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
@@ -17,6 +17,7 @@
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/list.h>
+#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/smp.h>
diff --git a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
index c35bc248db7e7..4dd4d6b650aed 100644
--- a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
@@ -17,6 +17,7 @@
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/list.h>
+#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/smp.h>
@@ -235,20 +236,34 @@ static const struct acpi_device_id hisi_hha_pmu_acpi_match[] = {
};
MODULE_DEVICE_TABLE(acpi, hisi_hha_pmu_acpi_match);
-static int hisi_hha_pmu_init_data(struct platform_device *pdev,
+#ifdef CONFIG_ACPI
+static int hisi_hha_pmu_init_index(struct platform_device *pdev,
struct hisi_pmu *hha_pmu)
{
- unsigned long long id;
- struct resource *res;
acpi_status status;
+ unsigned long long id;
status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
- "_UID", NULL, &id);
+ "_UID", NULL, &id);
if (ACPI_FAILURE(status))
return -EINVAL;
hha_pmu->index_id = id;
+ return 0;
+}
+#endif
+
+static int hisi_hha_pmu_init_data(struct platform_device *pdev,
+ struct hisi_pmu *hha_pmu)
+{
+ struct resource *res;
+
+#ifdef CONFIG_ACPI
+ if (hisi_hha_pmu_init_index(pdev, hha_pmu))
+ dev_info(&pdev->dev, "Can not init index id by acpi!\n");
+#endif
+
/*
* Use SCCL_ID and UID to identify the HHA PMU, while
* SCCL_ID is in MPIDR[aff2].
diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index 6ce1d69c63198..4a42926800e50 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -17,6 +17,7 @@
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/list.h>
+#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/smp.h>
@@ -234,20 +235,33 @@ static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
};
MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
-static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
+#ifdef CONFIG_ACPI
+static int hisi_l3c_pmu_init_index(struct platform_device *pdev,
struct hisi_pmu *l3c_pmu)
{
unsigned long long id;
- struct resource *res;
acpi_status status;
status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
- "_UID", NULL, &id);
+ "_UID", NULL, &id);
if (ACPI_FAILURE(status))
return -EINVAL;
l3c_pmu->index_id = id;
+ return 0;
+}
+#endif
+
+static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
+ struct hisi_pmu *l3c_pmu)
+{
+ struct resource *res;
+
+#ifdef CONFIG_ACPI
+ if (hisi_l3c_pmu_init_index(pdev, l3c_pmu))
+ dev_info(&pdev->dev, "Can not init index id by acpi!");
+#endif
/*
* Use the SCCL_ID and CCL_ID to identify the L3C PMU, while
* SCCL_ID is in MPIDR[aff2] and CCL_ID is in MPIDR[aff1].
--
2.25.1
[PATCH openEuler-1.0-LTS 1/7] corelockup: Add support of cpu core hang check
by Yang Yingliang 30 Oct '21
From: Dong Kai <dongkai11(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1
CVE: NA
--------------------------------
The softlockup and hardlockup detectors only check the status
of the cpu on which they reside. If a certain cpu core suspends,
neither of them works. There is no valid log, but the cpu is
already abnormal and brings a lot of problems to the system.
To detect this case, we add the corelockup detector.
First we use whether a cpu core can respond to nmi as a
criterion to determine whether it is suspended. Then things are
simple. Each cpu core maintains its own nmi interrupt counts and
watches the nmi_counts of the next cpu core. If the nmi interrupt
counts no longer change, which means the core can't respond to
nmi normally, we regard it as suspended.
To ensure robustness, the warning is only triggered when the nmi
is lost more than two consecutive times.
The detection chain is as follows:
cpu0->cpu1->...->cpuN->cpu0
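The consecutive-miss rule can be sketched in user space as follows. This is
a simplification, not the kernel code: one watcher samples the nmi count of
its target core, and the per-cpu plumbing is omitted.

```c
#include <stdio.h>

/* Illustrative only: if the target's nmi count is unchanged for more
 * than two consecutive samples, the target is reported as locked up. */
struct watcher {
	unsigned long saved;	/* last observed nmi count of the target */
	unsigned int missed;	/* consecutive samples without progress  */
};

static int check(struct watcher *w, unsigned long target_nmi_count)
{
	if (w->saved != target_nmi_count) {
		w->saved = target_nmi_count;
		w->missed = 0;
		return 0;		/* target is alive */
	}
	return ++w->missed > 2;		/* hung after three stalled samples */
}

int main(void)
{
	struct watcher w = { .saved = (unsigned long)-1, .missed = 0 };
	unsigned long samples[] = { 10, 11, 11, 11, 11 };
	unsigned int i;

	for (i = 0; i < 5; i++)		/* the fifth sample reports a lockup */
		printf("count %lu -> %s\n", samples[i],
		       check(&w, samples[i]) ? "core LOCKUP" : "ok");
	return 0;
}
```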
Signed-off-by: Dong Kai <dongkai11(a)huawei.com>
Reviewed-by: Kuohai Xu <xukuohai(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/nmi.h | 6 ++
kernel/watchdog.c | 15 +++-
kernel/watchdog_hld.c | 165 ++++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 8 ++
4 files changed, 192 insertions(+), 2 deletions(-)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 84f324d65068b..745d66c36e244 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -124,6 +124,12 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
# endif
#endif
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+extern void corelockup_detector_init(void);
+extern void corelockup_detector_online_cpu(unsigned int cpu);
+extern void corelockup_detector_offline_cpu(unsigned int cpu);
+#endif
+
void watchdog_nmi_stop(void);
void watchdog_nmi_start(void);
int watchdog_nmi_probe(void);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 0dd17265dcbd4..8b54fd30a597f 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -551,15 +551,23 @@ static void softlockup_start_all(void)
int lockup_detector_online_cpu(unsigned int cpu)
{
- if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
+ if (cpumask_test_cpu(cpu, &watchdog_allowed_mask)) {
watchdog_enable(cpu);
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+ corelockup_detector_online_cpu(cpu);
+#endif
+ }
return 0;
}
int lockup_detector_offline_cpu(unsigned int cpu)
{
- if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
+ if (cpumask_test_cpu(cpu, &watchdog_allowed_mask)) {
watchdog_disable(cpu);
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+ corelockup_detector_offline_cpu(cpu);
+#endif
+ }
return 0;
}
@@ -783,4 +791,7 @@ void __init lockup_detector_init(void)
if (!watchdog_nmi_probe())
nmi_watchdog_available = true;
lockup_detector_setup();
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+ corelockup_detector_init();
+#endif
}
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 904a95262fcf6..e965c31958203 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -39,6 +39,163 @@ notrace void __weak arch_touch_nmi_watchdog(void)
}
EXPORT_SYMBOL(arch_touch_nmi_watchdog);
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+/*
+ * The softlockup and hardlockup detector only check the status
+ * of the cpu which it resides. If certain cpu core suspends,
+ * they are both not works. There is no any valid log but the
+ * cpu already abnormal and brings a lot of problems of system.
+ * To detect this case, we add the corelockup detector.
+ *
+ * First we use whether cpu core can responds to nmi as a sectence
+ * to determine if it is suspended. Then things is simple. Per cpu
+ * core maintains it's nmi interrupt counts and detector the
+ * nmi_counts of next cpu core. If the nmi interrupt counts not
+ * changed any more which means it can't respond nmi normally, we
+ * regard it as suspend.
+ *
+ * To ensure robustness, only consecutive lost nmi more than two
+ * times then trigger the warn.
+ *
+ * The detection chain is as following:
+ * cpu0->cpu1->...->cpuN->cpu0
+ *
+ * detector_cpu: the target cpu to detector of current cpu
+ * nmi_interrupts: the nmi counts of current cpu
+ * nmi_cnt_saved: saved nmi counts of detector_cpu
+ * nmi_cnt_missed: the nmi consecutive miss counts of detector_cpu
+ */
+static DEFINE_PER_CPU(unsigned int, detector_cpu);
+static DEFINE_PER_CPU(unsigned long, nmi_interrupts);
+static DEFINE_PER_CPU(unsigned long, nmi_cnt_saved);
+static DEFINE_PER_CPU(unsigned long, nmi_cnt_missed);
+static DEFINE_PER_CPU(bool, core_watchdog_warn);
+
+static void watchdog_nmi_interrupts(void)
+{
+ __this_cpu_inc(nmi_interrupts);
+}
+
+static void corelockup_status_copy(unsigned int from, unsigned int to)
+{
+ per_cpu(nmi_cnt_saved, to) = per_cpu(nmi_cnt_saved, from);
+ per_cpu(nmi_cnt_missed, to) = per_cpu(nmi_cnt_missed, from);
+
+ /* always update detector cpu at the end */
+ per_cpu(detector_cpu, to) = per_cpu(detector_cpu, from);
+}
+
+static void corelockup_status_init(unsigned int cpu, unsigned int target)
+{
+ /*
+ * initialize saved count to max to avoid unnecessary misjudge
+ * caused by delay running of nmi on target cpu
+ */
+ per_cpu(nmi_cnt_saved, cpu) = ULONG_MAX;
+ per_cpu(nmi_cnt_missed, cpu) = 0;
+
+ /* always update detector cpu at the end */
+ per_cpu(detector_cpu, cpu) = target;
+}
+
+void __init corelockup_detector_init(void)
+{
+ unsigned int cpu, next;
+
+ /* detector cpu is set to the next valid logically one */
+ for_each_cpu_and(cpu, &watchdog_cpumask, cpu_online_mask) {
+ next = cpumask_next_and(cpu, &watchdog_cpumask,
+ cpu_online_mask);
+ if (next >= nr_cpu_ids)
+ next = cpumask_first_and(&watchdog_cpumask,
+ cpu_online_mask);
+ corelockup_status_init(cpu, next);
+ }
+}
+
+/*
+ * Before: first->next
+ * After: first->[new]->next
+ */
+void corelockup_detector_online_cpu(unsigned int cpu)
+{
+ unsigned int first = cpumask_first_and(&watchdog_cpumask,
+ cpu_online_mask);
+
+ if (WARN_ON(first >= nr_cpu_ids))
+ return;
+
+ /* cpu->next */
+ corelockup_status_copy(first, cpu);
+
+ /* first->cpu */
+ corelockup_status_init(first, cpu);
+}
+
+/*
+ * Before: prev->cpu->next
+ * After: prev->next
+ */
+void corelockup_detector_offline_cpu(unsigned int cpu)
+{
+ unsigned int prev = nr_cpu_ids;
+ unsigned int i;
+
+ /* found prev cpu */
+ for_each_cpu_and(i, &watchdog_cpumask, cpu_online_mask) {
+ if (per_cpu(detector_cpu, i) == cpu) {
+ prev = i;
+ break;
+ }
+ }
+
+ if (WARN_ON(prev == nr_cpu_ids))
+ return;
+
+ /* prev->next */
+ corelockup_status_copy(cpu, prev);
+}
+
+static bool is_corelockup(unsigned int cpu)
+{
+ unsigned long nmi_int = per_cpu(nmi_interrupts, cpu);
+
+ /* skip check if only one cpu online */
+ if (cpu == smp_processor_id())
+ return false;
+
+ if (__this_cpu_read(nmi_cnt_saved) != nmi_int) {
+ __this_cpu_write(nmi_cnt_saved, nmi_int);
+ __this_cpu_write(nmi_cnt_missed, 0);
+ per_cpu(core_watchdog_warn, cpu) = false;
+ return false;
+ }
+
+ __this_cpu_inc(nmi_cnt_missed);
+ if (__this_cpu_read(nmi_cnt_missed) > 2)
+ return true;
+
+ return false;
+}
+NOKPROBE_SYMBOL(is_corelockup);
+
+static void watchdog_corelockup_check(struct pt_regs *regs)
+{
+ unsigned int cpu = __this_cpu_read(detector_cpu);
+
+ if (is_corelockup(cpu)) {
+ if (per_cpu(core_watchdog_warn, cpu) == true)
+ return;
+ pr_emerg("Watchdog detected core LOCKUP on cpu %d\n", cpu);
+
+ if (hardlockup_panic)
+ nmi_panic(regs, "Core LOCKUP");
+
+ per_cpu(core_watchdog_warn, cpu) = true;
+ }
+}
+#endif
+
#ifdef CONFIG_HARDLOCKUP_CHECK_TIMESTAMP
static DEFINE_PER_CPU(ktime_t, last_timestamp);
static DEFINE_PER_CPU(unsigned int, nmi_rearmed);
@@ -106,6 +263,14 @@ static inline bool watchdog_check_timestamp(void)
void watchdog_hardlockup_check(struct pt_regs *regs)
{
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+ /* Kick nmi interrupts */
+ watchdog_nmi_interrupts();
+
+ /* corelockup check */
+ watchdog_corelockup_check(regs);
+#endif
+
if (__this_cpu_read(watchdog_nmi_touch) == true) {
__this_cpu_write(watchdog_nmi_touch, false);
return;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0ee305de7d0ec..4a78bacd405bd 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -881,6 +881,14 @@ config HARDLOCKUP_DETECTOR
chance to run. The current stack trace is displayed upon detection
and the system will stay locked up.
+config CORELOCKUP_DETECTOR
+ bool "Detect Core Lockups"
+ depends on HARDLOCKUP_DETECTOR && SOFTLOCKUP_DETECTOR
+ depends on ARM64
+ default n
+ help
+ Corelockups is used to check whether cpu core hungup or not.
+
config BOOTPARAM_HARDLOCKUP_PANIC
bool "Panic (Reboot) On Hard Lockups"
depends on HARDLOCKUP_DETECTOR
--
2.25.1
[PATCH openEuler-1.0-LTS 01/41] share_pool: Rename sp_stat_idr to sp_proc_stat_idr
by Yang Yingliang 30 Oct '21
From: Tang Yizhou <tangyizhou(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI
CVE: NA
-------------------------------------------------
We are going to redesign the accounting subsystem of share pool.
We need to disambiguate the meaning of sp_stat_idr, as we will
introduce a struct representing per-spg statistics.
Signed-off-by: Tang Yizhou <tangyizhou(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Signed-off-by: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/share_pool.c | 44 ++++++++++++++++++++++----------------------
1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c
index cd6e137fe6698..c4b8daa47fcfd 100644
--- a/mm/share_pool.c
+++ b/mm/share_pool.c
@@ -95,9 +95,9 @@ static DEFINE_IDA(sp_group_id_ida);
/*** Statistical and maintenance tools ***/
/* idr of all sp_proc_stats */
-static DEFINE_IDR(sp_stat_idr);
-/* rw semaphore for sp_stat_idr and mm->sp_stat_id */
-static DECLARE_RWSEM(sp_stat_sem);
+static DEFINE_IDR(sp_proc_stat_idr);
+/* rw semaphore for sp_proc_stat_idr */
+static DECLARE_RWSEM(sp_proc_stat_sem);
/* for kthread buff_module_guard_work */
static struct sp_proc_stat kthread_stat;
@@ -107,7 +107,7 @@ static struct sp_proc_stat *sp_get_proc_stat_locked(int tgid)
{
struct sp_proc_stat *stat;
- stat = idr_find(&sp_stat_idr, tgid);
+ stat = idr_find(&sp_proc_stat_idr, tgid);
/* maybe NULL or not, we always return it */
return stat;
@@ -118,7 +118,7 @@ static struct sp_proc_stat *sp_get_proc_stat_ref_locked(int tgid)
{
struct sp_proc_stat *stat;
- stat = idr_find(&sp_stat_idr, tgid);
+ stat = idr_find(&sp_proc_stat_idr, tgid);
if (!stat || !atomic_inc_not_zero(&stat->use_count))
stat = NULL;
@@ -137,16 +137,16 @@ static struct sp_proc_stat *sp_init_proc_stat(struct task_struct *tsk,
int id, tgid = tsk->tgid;
int ret;
- down_write(&sp_stat_sem);
+ down_write(&sp_proc_stat_sem);
id = mm->sp_group_master->sp_stat_id;
if (id) {
/* other threads in the same process may have initialized it */
stat = sp_get_proc_stat_locked(tgid);
if (stat) {
- up_write(&sp_stat_sem);
+ up_write(&sp_proc_stat_sem);
return stat;
} else {
- up_write(&sp_stat_sem);
+ up_write(&sp_proc_stat_sem);
/* if enter this branch, that's our mistake */
pr_err_ratelimited("share pool: proc stat invalid id %d\n", id);
return ERR_PTR(-EBUSY);
@@ -155,7 +155,7 @@ static struct sp_proc_stat *sp_init_proc_stat(struct task_struct *tsk,
stat = kzalloc(sizeof(*stat), GFP_KERNEL);
if (stat == NULL) {
- up_write(&sp_stat_sem);
+ up_write(&sp_proc_stat_sem);
pr_err_ratelimited("share pool: alloc proc stat failed due to lack of memory\n");
return ERR_PTR(-ENOMEM);
}
@@ -167,16 +167,16 @@ static struct sp_proc_stat *sp_init_proc_stat(struct task_struct *tsk,
stat->mm = mm;
get_task_comm(stat->comm, tsk);
- ret = idr_alloc(&sp_stat_idr, stat, tgid, tgid + 1, GFP_KERNEL);
+ ret = idr_alloc(&sp_proc_stat_idr, stat, tgid, tgid + 1, GFP_KERNEL);
if (ret < 0) {
- up_write(&sp_stat_sem);
+ up_write(&sp_proc_stat_sem);
pr_err_ratelimited("share pool: proc stat idr alloc failed %d\n", ret);
kfree(stat);
return ERR_PTR(ret);
}
mm->sp_group_master->sp_stat_id = ret;
- up_write(&sp_stat_sem);
+ up_write(&sp_proc_stat_sem);
return stat;
}
@@ -184,9 +184,9 @@ static struct sp_proc_stat *sp_get_proc_stat(int tgid)
{
struct sp_proc_stat *stat;
- down_read(&sp_stat_sem);
+ down_read(&sp_proc_stat_sem);
stat = sp_get_proc_stat_locked(tgid);
- up_read(&sp_stat_sem);
+ up_read(&sp_proc_stat_sem);
return stat;
}
@@ -195,9 +195,9 @@ struct sp_proc_stat *sp_get_proc_stat_ref(int tgid)
{
struct sp_proc_stat *stat;
- down_read(&sp_stat_sem);
+ down_read(&sp_proc_stat_sem);
stat = sp_get_proc_stat_ref_locked(tgid);
- up_read(&sp_stat_sem);
+ up_read(&sp_proc_stat_sem);
return stat;
}
@@ -2850,10 +2850,10 @@ __setup("enable_sp_multi_group_mode", enable_sp_multi_group_mode);
static void free_sp_proc_stat(struct sp_proc_stat *stat)
{
- down_write(&sp_stat_sem);
+ down_write(&sp_proc_stat_sem);
stat->mm->sp_group_master->sp_stat_id = 0;
- idr_remove(&sp_stat_idr, stat->tgid);
- up_write(&sp_stat_sem);
+ idr_remove(&sp_proc_stat_idr, stat->tgid);
+ up_write(&sp_proc_stat_sem);
kfree(stat);
}
@@ -3163,9 +3163,9 @@ static int proc_stat_show(struct seq_file *seq, void *offset)
byte2kb(atomic64_read(&kthread_stat.k2u_size)));
/* pay attention to potential ABBA deadlock */
- down_read(&sp_stat_sem);
- idr_for_each(&sp_stat_idr, idr_proc_stat_cb, seq);
- up_read(&sp_stat_sem);
+ down_read(&sp_proc_stat_sem);
+ idr_for_each(&sp_proc_stat_idr, idr_proc_stat_cb, seq);
+ up_read(&sp_proc_stat_sem);
return 0;
}
--
2.25.1
From: Wang Wensheng <wangwensheng4(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: NA
CVE: NA
---------------------------
To avoid mmap using the vspace reserved for sharepool, we currently change
the high_limit to MMAP_SHARE_POOL_START in arch_get_unmapped_area() and
arch_get_unmapped_area_topdown(). In the mmap-topdown scene, this makes the
start address of mmap always MMAP_SHARE_POOL_START, so ASLR is broken.
To fix this, this patch sets mm->mmap_base based on
MMAP_SHARE_POOL_START instead of STACK_TOP in the topdown scene.
Fixes: 4bdd5c21793e ("ascend: memory: introduce do_mm_populate and hugetlb_insert_hugepage")
Signed-off-by: Wang Wensheng <wangwensheng4(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/mm/mmap.c | 6 +++++-
include/linux/share_pool.h | 4 ++--
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index ac89686c4af89..87f29df8126ba 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -28,6 +28,7 @@
#include <linux/io.h>
#include <linux/personality.h>
#include <linux/random.h>
+#include <linux/share_pool.h>
#include <asm/cputype.h>
@@ -80,7 +81,10 @@ static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
else if (gap > MAX_GAP)
gap = MAX_GAP;
- return PAGE_ALIGN(STACK_TOP - gap - rnd);
+ if (sp_is_enabled())
+ return ALIGN_DOWN(MMAP_SHARE_POOL_START - rnd, PAGE_SIZE);
+ else
+ return PAGE_ALIGN(STACK_TOP - gap - rnd);
}
/*
diff --git a/include/linux/share_pool.h b/include/linux/share_pool.h
index 9650f257b3ad7..9557a8be46677 100644
--- a/include/linux/share_pool.h
+++ b/include/linux/share_pool.h
@@ -130,8 +130,6 @@ struct sp_proc_stat {
atomic64_t k2u_size;
};
-#ifdef CONFIG_ASCEND_SHARE_POOL
-
#define MAP_SHARE_POOL 0x100000
#define MMAP_TOP_4G_SIZE 0x100000000UL
@@ -148,6 +146,8 @@ struct sp_proc_stat {
#define MMAP_SHARE_POOL_START (MMAP_SHARE_POOL_END - MMAP_SHARE_POOL_SIZE)
#define MMAP_SHARE_POOL_16G_START (MMAP_SHARE_POOL_END - MMAP_SHARE_POOL_DVPP_SIZE)
+#ifdef CONFIG_ASCEND_SHARE_POOL
+
static inline void sp_init_mm(struct mm_struct *mm)
{
mm->sp_group = NULL;
--
2.25.1
From: Andi Kleen <andi(a)firstfloor.org>
mainline inclusion
from mainline-5.11
commit 55a4de94c64bacffbcd802c954764e0de2ab217f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMQA
CVE: NA
--------------------------------
Add a new --quiet option to 'perf stat'. This is useful with 'perf stat
record' to write the data only to the perf.data file, which can lower
measurement overhead because the data doesn't need to be formatted.
On my 4C desktop:
% time ./perf stat record -e $(python -c 'print ",\
".join(["cycles"]*1000)') -a -I 1000 sleep 5
...
real 0m5.377s
user 0m0.238s
sys 0m0.452s
% time ./perf stat record --quiet -e $(python -c 'print ",\
".join(["cycles"]*1000)') -a -I 1000 sleep 5
real 0m5.452s
user 0m0.183s
sys 0m0.423s
In this example it cuts the user time by 20%. On systems with more cores
the savings are higher.
Signed-off-by: Andi Kleen <andi(a)firstfloor.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Alexey Budankov <alexey.budankov(a)linux.intel.com>
Link: http://lore.kernel.org/lkml/20201027002737.30942-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: yin-xiujiang <yinxiujiang(a)kylinos.cn>
Reviewed-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
Reviewed-by: Yang Jihong <yangjihong1(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
tools/perf/Documentation/perf-stat.txt | 4 ++++
tools/perf/builtin-stat.c | 6 +++++-
tools/perf/util/stat.h | 1 +
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 9f9f29025e49..f9bcd95bf352 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -320,6 +320,10 @@ STAT RECORD
-----------
Stores stat data into perf data file.
+--quiet::
+Don't print output. This is useful with perf stat record below to only
+write data to the perf.data file.
+
-o file::
--output file::
Output file name.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b01af171d94f..89e80a3bc9c3 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -973,6 +973,8 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
if (STAT_RECORD && perf_stat.data.is_pipe)
return;
+ if (stat_config.quiet)
+ return;
perf_evlist__print_counters(evsel_list, &stat_config, &target,
ts, argc, argv);
}
@@ -1171,6 +1173,8 @@ static struct option stat_options[] = {
"threads of same physical core"),
OPT_BOOLEAN(0, "summary", &stat_config.summary,
"print summary for interval mode"),
+ OPT_BOOLEAN(0, "quiet", &stat_config.quiet,
+ "don't print output (useful with record)"),
#ifdef HAVE_LIBPFM
OPT_CALLBACK(0, "pfm-events", &evsel_list, "event",
"libpfm4 event selector. use 'perf list' to list available events",
@@ -2132,7 +2136,7 @@ int cmd_stat(int argc, const char **argv)
goto out;
}
- if (!output) {
+ if (!output && !stat_config.quiet) {
struct timespec tm;
mode = append_file ? "a" : "w";
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 487010c624be..05adf8165025 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -122,6 +122,7 @@ struct perf_stat_config {
bool metric_no_group;
bool metric_no_merge;
bool stop_read_counter;
+ bool quiet;
FILE *output;
unsigned int interval;
unsigned int timeout;
--
2.20.1
[PATCH openEuler-1.0-LTS 001/103] mm/vmalloc: Hugepage vmalloc mappings
by Yang Yingliang 30 Oct '21
From: Nicholas Piggin <npiggin(a)gmail.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI
CVE: NA
https://lwn.net/ml/linux-kernel/20200825145753.529284-12-npiggin@gmail.com/
Don't distinguish between vmalloc and hugepage vmalloc, because there is no size
print in alloc_large_system_hash in v4.19.
Also, this patch adds page_order to vm_struct, which will break kabi.
--------------
Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
supports PMD sized vmap mappings.
vmalloc will attempt to allocate PMD-sized pages if allocating PMD size or
larger, and fall back to small pages if that was unsuccessful.
Allocations that do not use PAGE_KERNEL prot are not permitted to use huge
pages, because not all callers expect this (e.g., module allocations vs
strict module rwx).
This reduces TLB misses by nearly 30x on a `git diff` workload on a 2-node
POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
This can result in more internal fragmentation and memory overhead for a
given allocation; an option nohugevmalloc is added to disable this at boot.
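For illustration, the decision of when huge mappings are attempted can be
sketched in user space as below. The constant is an assumption: PMD_SIZE is
2MiB, as on arm64 and x86-64 with 4K base pages.

```c
#include <stdio.h>

#define PMD_SIZE	(2UL << 20)	/* assumed: 2MiB PMD, 4K base pages */

/* Illustrative only: huge mappings are tried when each node's share of
 * the allocation is at least PMD_SIZE; otherwise small pages are used. */
static int tries_huge_pages(unsigned long size, int node, int nr_online_nodes)
{
	unsigned long size_per_node = size;

	if (node < 0)	/* NUMA_NO_NODE: allocation spreads over all nodes */
		size_per_node /= nr_online_nodes;
	return size_per_node >= PMD_SIZE;
}

int main(void)
{
	printf("%d\n", tries_huge_pages(8UL << 20, -1, 4));	/* 1: 2M per node */
	printf("%d\n", tries_huge_pages(4UL << 20, -1, 4));	/* 0: 1M per node */
	printf("%d\n", tries_huge_pages(2UL << 20, 0, 4));	/* 1: node-local  */
	return 0;
}
```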
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Rui Xiang <rui.xiang(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Reviewed-by: Zefan Li <lizefan(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/Kconfig | 4 +
include/linux/vmalloc.h | 1 +
mm/vmalloc.c | 160 +++++++++++++++++++++++++++++++---------
3 files changed, 130 insertions(+), 35 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index e906cbb213444..00f55932ba781 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -559,6 +559,10 @@ config HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
config HAVE_ARCH_HUGE_VMAP
bool
+config HAVE_ARCH_HUGE_VMALLOC
+ depends on HAVE_ARCH_HUGE_VMAP
+ bool
+
config HAVE_ARCH_SOFT_DIRTY
bool
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 496ac80046c01..07b4b1141ed8a 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -39,6 +39,7 @@ struct vm_struct {
unsigned long size;
unsigned long flags;
struct page **pages;
+ unsigned int page_order;
unsigned int nr_pages;
phys_addr_t phys_addr;
const void *caller;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index fc6394184a1ba..e76b806a6c003 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -41,6 +41,19 @@
#include "internal.h"
+#ifdef CONFIG_HAVE_ARCH_HUGE_VMALLOC
+static bool __ro_after_init vmap_allow_huge = true;
+
+static int __init set_nohugevmalloc(char *str)
+{
+ vmap_allow_huge = false;
+ return 0;
+}
+early_param("nohugevmalloc", set_nohugevmalloc);
+#else /* CONFIG_HAVE_ARCH_HUGE_VMALLOC */
+static const bool vmap_allow_huge = false;
+#endif /* CONFIG_HAVE_ARCH_HUGE_VMALLOC */
+
struct vfree_deferred {
struct llist_head list;
struct work_struct wq;
@@ -410,6 +423,61 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
return 0;
}
+static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages)
+{
+ pgd_t *pgd;
+ unsigned long next;
+ int err = 0;
+ int nr = 0;
+
+ BUG_ON(addr >= end);
+ pgd = pgd_offset_k(addr);
+ do {
+ next = pgd_addr_end(addr, end);
+ err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr);
+ if (err)
+ return err;
+ } while (pgd++, addr = next, addr != end);
+
+ return 0;
+}
+
+static int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages, unsigned int page_shift)
+{
+ unsigned int i, nr = (end - addr) >> PAGE_SHIFT;
+
+ WARN_ON(page_shift < PAGE_SHIFT);
+
+ if (page_shift == PAGE_SHIFT)
+ return vmap_small_pages_range_noflush(addr, end, prot, pages);
+
+ for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
+ int err;
+
+ err = vmap_range_noflush(addr, addr + (1UL << page_shift),
+ __pa(page_address(pages[i])), prot,
+ page_shift);
+ if (err)
+ return err;
+
+ addr += 1UL << page_shift;
+ }
+
+ return 0;
+}
+
+static int vmap_pages_range(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages, unsigned int page_shift)
+{
+ int err;
+
+ err = vmap_pages_range_noflush(addr, end, prot, pages, page_shift);
+ flush_cache_vmap(addr, end);
+ return err;
+}
+
/**
* map_kernel_range_noflush - map kernel VM area with the specified pages
* @addr: start of the VM area to map
@@ -431,22 +499,7 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
int map_kernel_range_noflush(unsigned long addr, unsigned long size,
pgprot_t prot, struct page **pages)
{
- unsigned long end = addr + size;
- unsigned long next;
- pgd_t *pgd;
- int err = 0;
- int nr = 0;
-
- BUG_ON(addr >= end);
- pgd = pgd_offset_k(addr);
- do {
- next = pgd_addr_end(addr, end);
- err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr);
- if (err)
- return err;
- } while (pgd++, addr = next, addr != end);
-
- return 0;
+ return vmap_pages_range_noflush(addr, addr + size, prot, pages, PAGE_SHIFT);
}
int map_kernel_range(unsigned long start, unsigned long size, pgprot_t prot,
@@ -2270,11 +2323,11 @@ static void __vunmap(const void *addr, int deallocate_pages)
if (deallocate_pages) {
int i;
- for (i = 0; i < area->nr_pages; i++) {
+ for (i = 0; i < area->nr_pages; i += 1U << area->page_order) {
struct page *page = area->pages[i];
BUG_ON(!page);
- __free_pages(page, 0);
+ __free_pages(page, area->page_order);
}
kvfree(area->pages);
@@ -2403,9 +2456,12 @@ static void *__vmalloc_node(unsigned long size, unsigned long align,
gfp_t gfp_mask, pgprot_t prot,
int node, const void *caller);
static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
- pgprot_t prot, int node)
+ pgprot_t prot, unsigned int page_shift, int node)
{
struct page **pages;
+ unsigned long addr = (unsigned long)area->addr;
+ unsigned long size = get_vm_area_size(area);
+ unsigned int page_order = page_shift - PAGE_SHIFT;
unsigned int nr_pages;
unsigned long array_size;
unsigned int i;
@@ -2415,7 +2471,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
0 :
__GFP_HIGHMEM;
- nr_pages = get_vm_area_size(area) >> PAGE_SHIFT;
+ nr_pages = size >> PAGE_SHIFT;
array_size = (unsigned long)nr_pages * sizeof(struct page *);
/* Please note that the recursion is strictly bounded. */
@@ -2434,27 +2490,27 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
area->pages = pages;
area->nr_pages = nr_pages;
+ area->page_order = page_order;
- for (i = 0; i < area->nr_pages; i++) {
+ for (i = 0; i < area->nr_pages; i += 1U << page_order) {
struct page *page;
+ int p;
- if (node == NUMA_NO_NODE)
- page = alloc_page(alloc_mask|highmem_mask);
- else
- page = alloc_pages_node(node, alloc_mask|highmem_mask, 0);
-
+ page = alloc_pages_node(node, alloc_mask|highmem_mask, page_order);
if (unlikely(!page)) {
/* Successfully allocated i pages, free them in __vunmap() */
area->nr_pages = i;
goto fail;
}
- area->pages[i] = page;
+
+ for (p = 0; p < (1U << page_order); p++)
+ area->pages[i + p] = page + p;
+
if (gfpflags_allow_blocking(gfp_mask|highmem_mask))
cond_resched();
}
- if (map_kernel_range((unsigned long)area->addr, get_vm_area_size(area),
- prot, pages) < 0)
+ if (vmap_pages_range(addr, addr + size, prot, pages, page_shift) < 0)
goto fail;
return area->addr;
@@ -2462,7 +2518,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
fail:
warn_alloc(gfp_mask, NULL,
"vmalloc: allocation failure, allocated %ld of %ld bytes",
- (area->nr_pages*PAGE_SIZE), area->size);
+ (area->nr_pages*PAGE_SIZE), size);
vfree(area->addr);
return NULL;
}
@@ -2491,19 +2547,42 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
struct vm_struct *area;
void *addr;
unsigned long real_size = size;
+ unsigned long real_align = align;
+ unsigned int shift = PAGE_SHIFT;
- size = PAGE_ALIGN(size);
if (!size || (size >> PAGE_SHIFT) > totalram_pages)
goto fail;
+ if (vmap_allow_huge && (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))) {
+ unsigned long size_per_node;
+
+ /*
+ * Try huge pages. Only try for PAGE_KERNEL allocations,
+ * others like modules don't yet expect huge pages in
+ * their allocations due to apply_to_page_range not
+ * supporting them.
+ */
+
+ size_per_node = size;
+ if (node == NUMA_NO_NODE)
+ size_per_node /= num_online_nodes();
+ if (size_per_node >= PMD_SIZE) {
+ shift = PMD_SHIFT;
+ align = max(real_align, 1UL << shift);
+ size = ALIGN(real_size, 1UL << shift);
+ }
+ }
+
+again:
+ size = PAGE_ALIGN(size);
area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNINITIALIZED |
vm_flags, start, end, node, gfp_mask, caller);
if (!area)
goto fail;
- addr = __vmalloc_area_node(area, gfp_mask, prot, node);
+ addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node);
if (!addr)
- return NULL;
+ goto fail;
/*
* First make sure the mappings are removed from all page-tables
@@ -2523,8 +2602,19 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
return addr;
fail:
- warn_alloc(gfp_mask, NULL,
+ if (shift > PAGE_SHIFT) {
+ free_vm_area(area);
+ shift = PAGE_SHIFT;
+ align = real_align;
+ size = real_size;
+ goto again;
+ }
+
+ if (!area) {
+ /* Warn for area allocation, page allocations already warn */
+ warn_alloc(gfp_mask, NULL,
"vmalloc: allocation failure: %lu bytes", real_size);
+ }
return NULL;
}
@@ -3503,7 +3593,7 @@ static int s_show(struct seq_file *m, void *p)
seq_printf(m, " %pS", v->caller);
if (v->nr_pages)
- seq_printf(m, " pages=%d", v->nr_pages);
+ seq_printf(m, " pages=%d order=%d", v->nr_pages, v->page_order);
if (v->phys_addr)
seq_printf(m, " phys=%pa", &v->phys_addr);
--
2.25.1
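For illustration only (not part of the series), a throwaway test module along
these lines could be used to observe the new behavior; all names below are
hypothetical, and the only interfaces assumed are vmalloc()/vfree() plus the
order= field that the s_show() hunk above adds to /proc/vmallocinfo:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/vmalloc.h>

static void *buf;

static int __init vmalloc_order_demo_init(void)
{
	/*
	 * With the patch applied (and vmap_allow_huge set), a PAGE_KERNEL
	 * allocation of at least PMD_SIZE (2 MiB with 4 KiB pages) may be
	 * backed by huge mappings; if the huge attempt fails, the "again:"
	 * path above transparently retries with order-0 pages.
	 */
	buf = vmalloc(4UL << 20);
	if (!buf)
		return -ENOMEM;
	/* "grep order= /proc/vmallocinfo" should report order=9 here. */
	return 0;
}

static void __exit vmalloc_order_demo_exit(void)
{
	vfree(buf);	/* __vunmap() above frees 1 << page_order pages at a time */
}

module_init(vmalloc_order_demo_init);
module_exit(vmalloc_order_demo_exit);
MODULE_LICENSE("GPL");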
From: Fang Lijun <fanglijun3(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I
CVE: NA
-------------------------------------------------
An interface, do_vm_mmap(), is added to support allocation in the
address spaces of other processes.
Signed-off-by: Fang Lijun <fanglijun3(a)huawei.com>
Signed-off-by: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/mm.h | 3 +++
mm/mmap.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 75d94ea5d1c20..58fe28dd959b9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2436,6 +2436,9 @@ static inline void mm_populate(unsigned long addr, unsigned long len) {}
extern int __must_check vm_brk(unsigned long, unsigned long);
extern int __must_check vm_brk_flags(unsigned long, unsigned long, unsigned long);
extern int vm_munmap(unsigned long, size_t);
+extern unsigned long do_vm_mmap(struct mm_struct *mm, unsigned long addr,
+ unsigned long len, unsigned long prot,
+ unsigned long flag, unsigned long pgoff);
extern unsigned long __must_check vm_mmap(struct file *, unsigned long,
unsigned long, unsigned long,
unsigned long, unsigned long);
diff --git a/mm/mmap.c b/mm/mmap.c
index 80779bbb1c048..f7f1fd3b5fa39 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3094,6 +3094,40 @@ int vm_munmap(unsigned long start, size_t len)
}
EXPORT_SYMBOL(vm_munmap);
+/*
+ * Must acquire an additional reference to the mm struct to prevent the
+ * mm struct of the other process from being released.
+ *
+ * This interface is applicable only to kernel thread scenarios.
+ */
+unsigned long do_vm_mmap(struct mm_struct *mm, unsigned long addr,
+ unsigned long len, unsigned long prot,
+ unsigned long flag, unsigned long pgoff)
+{
+ unsigned long ret;
+ unsigned long populate;
+ LIST_HEAD(uf);
+
+ if (mm == NULL || current->mm)
+ return -EINVAL;
+
+ if (down_write_killable(&mm->mmap_sem))
+ return -EINTR;
+
+ current->mm = mm;
+ ret = do_mmap_pgoff(0, addr, len, prot, flag, pgoff,
+ &populate, &uf);
+
+ current->mm = NULL;
+ up_write(&mm->mmap_sem);
+ userfaultfd_unmap_complete(mm, &uf);
+ if (populate)
+ mm_populate(ret, populate);
+
+ return ret;
+}
+EXPORT_SYMBOL(do_vm_mmap);
+
SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len)
{
profile_munmap(addr);
--
2.25.1
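A hypothetical usage sketch (not part of the patch) makes the constraints in
the comment above concrete: the caller must be a kernel thread (current->mm ==
NULL), and it must hold its own reference on the target mm so the address
space cannot be released underneath it. map_into_task() is an invented name;
get_task_mm()/mmput() are the stock helpers for pinning an mm:

#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/sched/mm.h>

/* Map anonymous memory into another task's address space from a kthread. */
static unsigned long map_into_task(struct task_struct *tsk, unsigned long len)
{
	struct mm_struct *mm;
	unsigned long addr;

	mm = get_task_mm(tsk);	/* the extra reference; NULL for kernel threads */
	if (!mm)
		return -ESRCH;

	/* do_vm_mmap() rejects callers that already have an mm of their own */
	addr = do_vm_mmap(mm, 0, len, PROT_READ | PROT_WRITE,
			  MAP_ANONYMOUS | MAP_PRIVATE, 0);

	mmput(mm);		/* drop the reference taken above */
	return addr;		/* negative errno encoded; check IS_ERR_VALUE() */
}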
Re: [PATCH openEuler-21.03] vfio-ccw: Reset FSM state to IDLE inside FSM
by Wangshaobo (bobo) 29 Oct '21
Reviewed-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
On 2021/10/28 22:42, Chen Silong wrote:
> From: Eric Farman <farman(a)linux.ibm.com>
>
> stable inclusion
> from stable-v5.10.44
> commit cad3dc73c0645d00adfe96cebc8d950897cc1227
> bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=453
> CVE: NA
>
> -------------------------------------------------
>
> [ Upstream commit 6c02ac4c9211edabe17bda437ac97e578756f31b ]
>
> When an I/O request is made, the fsm_io_request() routine
> moves the FSM state from IDLE to CP_PROCESSING, and then
> fsm_io_helper() moves it to CP_PENDING if the START SUBCHANNEL
> received a cc0. Yet, the error case to go from CP_PROCESSING
> back to IDLE is done after the FSM call returns.
>
> Let's move this up into the FSM proper, to provide some
> better symmetry when unwinding in this case.
>
> Signed-off-by: Eric Farman <farman(a)linux.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck(a)redhat.com>
> Acked-by: Matthew Rosato <mjrosato(a)linux.ibm.com>
> Message-Id: <20210511195631.3995081-3-farman(a)linux.ibm.com>
> Signed-off-by: Cornelia Huck <cohuck(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> Signed-off-by: Chen Silong <2019117735(a)my.swjtu.edu.cn>
> ---
> drivers/s390/cio/vfio_ccw_fsm.c | 1 +
> drivers/s390/cio/vfio_ccw_ops.c | 2 --
> 2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index 23e61aa638e4..e435a9cd92da 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -318,6 +318,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
> }
>
> err_out:
> + private->state = VFIO_CCW_STATE_IDLE;
> trace_vfio_ccw_fsm_io_request(scsw->cmd.fctl, schid,
> io_region->ret_code, errstr);
> }
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index 1ad5f7018ec2..2280f51dd679 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -276,8 +276,6 @@ static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
> }
>
> vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
> - if (region->ret_code != 0)
> - private->state = VFIO_CCW_STATE_IDLE;
> ret = (region->ret_code != 0) ? region->ret_code : count;
>
> out_unlock:
[PATCH openEuler-1.0-LTS V3 0/6] Fix the problem that the number of tcp timeout retransmissions is lost
by Laibin Qiu 29 Oct '21
issue: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue
Eric Dumazet (4):
tcp: switch tcp and sch_fq to new earliest departure time model
net_sched: sch_fq: ensure maxrate fq parameter applies to EDT flows
tcp: address problems caused by EDT misshaps
tcp: adjust rto_base in retransmits_timed_out()
Yuchung Cheng (2):
tcp: always set retrans_stamp on recovery
tcp: create a helper to model exponential backoff
net/ipv4/tcp_bbr.c | 7 +++--
net/ipv4/tcp_input.c | 17 +++++++-----
net/ipv4/tcp_output.c | 31 +++++++++++++++------
net/ipv4/tcp_timer.c | 64 ++++++++++++++++++++-----------------------
net/sched/sch_fq.c | 46 ++++++++++++++++++-------------
5 files changed, 93 insertions(+), 72 deletions(-)
--
2.22.0
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4FS3G?from=project-issue
CVE: NA
---------------------------
There are some language problems in the README file, and its Markdown
format syntax does not render correctly, so the file needs to be adjusted.
Signed-off-by: suqin <suqin2(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
---
README | 226 ---------------------------------------------------
README.md | 237 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 237 insertions(+), 226 deletions(-)
delete mode 100644 README
create mode 100644 README.md
diff --git a/README b/README
deleted file mode 100644
index 46c9ea352..000000000
--- a/README
+++ /dev/null
@@ -1,226 +0,0 @@
-Contributions to openEuler kernel project
-=========================================
-
-Sign CLA
---------
-
-Before submitting any Contributions to openEuler, you have to sign CLA.
-
-See:
- https://openeuler.org/zh/cla.html
- https://openeuler.org/en/cla.html
-
-Steps of submitting patches
----------------------------
-
-1. Compile and test your patches successfully.
-2. Generate patches
- Your patches should be based on top of latest openEuler branch, and should
- use git-format-patch to generate patches, and if it's a patchset, it's
- better to use --cover-letter option to describe what the patchset does.
-
- Using scripts/checkpatch.pl to make sure there's no coding style issue.
-
- And make sure your patch follow unified openEuler patch format describe
- below.
-
-3. Send patch to openEuler mailing list
- Use this command to send patches to openEuler mailing list:
-
- git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all
-
- *NOTE*: that you must add --suppress-cc=all if you use git send-email,
- otherwise the email will be cced to the people in upstream community and mailing
- lists.
-
- *See*: How to send patches using git-send-email
- https://git-scm.com/docs/git-send-email
-
-4. Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions
- to send out.
-
- Use --subject-prefix="PATCH v2" option to add v2 tag for patchset.
- git format-patch --subject-prefix="PATCH v2" -1
-
- Subject examples:
- Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
- Subject: [PATCH v3] ext2: improve scalability of bitmap searching
-
-5. Upstream your kernel patch to kernel community is strongly recommended.
- openEuler will sync up with kernel master timely.
-
-6. Sign your work - the Developer’s Certificate of Origin
- As the same of upstream kernel community, you also need to sign your patch.
-
- See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html
-
- The sign-off is a simple line at the end of the explanation for the patch,
- which certifies that you wrote it or otherwise have the right to pass it
- on as an open-source patch. The rules are pretty simple: if you can certify
- the below:
-
- Developer’s Certificate of Origin 1.1
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- By making a contribution to this project, I certify that:
-
- (a) The contribution was created in whole or in part by me and I have
- the right to submit it under the open source license indicated in
- the file; or
-
- (b The contribution is based upon previous work that, to the best of
- my knowledge, is covered under an appropriate open source license
- and I have the right under that license to submit that work with
- modifications, whether created in whole or in part by me, under
- the same open source license (unless I am permitted to submit under
- a different license), as indicated in the file; or
-
- (c) The contribution was provided directly to me by some other person
- who certified (a), (b) or (c) and I have not modified it.
-
- (d) I understand and agree that this project and the contribution are
- public and that a record of the contribution (including all personal
- information I submit with it, including my sign-off) is maintained
- indefinitely and may be redistributed consistent with this project
- or the open source license(s) involved.
-
- then you just add a line saying:
-
- Signed-off-by: Random J Developer <random(a)developer.example.org>
-
- using your real name (sorry, no pseudonyms or anonymous contributions.)
-
-Use unified patch format
-------------------------
-
-Reasons:
-
-1. long term maintainability
- openEuler will merge massive patches. If all patches are merged by casual
- changelog format without a unified format, the git log will be messy, and
- then it's hard to figure out the original patch.
-
-2. kernel upgrade
- We definitely will upgrade our openEuler kernel in someday, using strict
- patch management will alleviate the pain to migrate patches during big upgrade.
-
-3. easy for script parsing
- Keyword highlighting is necessary for script parsing.
-
-Patch format definition
------------------------
-
-[M] stands for "mandatory"
-[O] stands for "option"
-$category can be: bug preparation, bugfix, perf, feature, doc, other...
-
-If category is feature, then we also need to add feature name like below:
- category: feature
- feature: YYY (the feature name)
-
-If the patch is related to CVE or bugzilla, then we need add the corresponding
-tag like below (In general, it should include at least one of the following):
- CVE: $cve-id
- bugzilla: $bug-id
-
-Additional changelog should include at least one of the flollwing:
- 1) Why we should apply this patch
- 2) What real problem in product does this patch resolved
- 3) How could we reproduce this bug or how to test
- 4) Other useful information for help to understand this patch or problem
-
-The detail information is very useful for porting patch to another kenrel branch.
-
-Example for mainline patch:
-
- mainline inclusion [M]
- from $mainline-version [M]
- commit $id [M]
- category: $category [M]
- bugzilla: $bug-id [O]
- CVE: $cve-id [O]
-
- additional changelog [O]
-
- --------------------------------
-
- original changelog
-
- Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
-
- ($mainline-version could be mainline-3.5, mainline-3.6, etc...)
-
-Examples
---------
-
-mainline inclusion
-from mainline-4.10
-commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
-category: bugfix
-bugzilla: 3004
-CVE: NA
-
-The patch fixes a BUG_ON in the product: injecting single bit ECC error
-to memory before system boot use hardware inject tools, which cause a
-large amount of CMCI during system booting .
-
-[ 1.146580] mce: [Hardware Error]: Machine check events logged
-[ 1.152908] ------------[ cut here ]------------
-[ 1.157751] kernel BUG at kernel/timer.c:951!
-[ 1.162321] invalid opcode: 0000 [#1] SMP
-...
-
--------------------------------------------------
-
-original changelog
-
-<original S-O-B>
-Signed-off-by: Zhang San <zhangsan(a)huawei.com>
-Tested-by: Li Si <lisi(a)huawei.com>
-
-Email Client - Thunderbird Settings
------------------------------------
-
-If you are newly developer in the kernel community, it is highly recommended
-to use thunderbird mail client.
-
-1. Thunderbird Installation
- Get English version Thunderbird from http://www.mozilla.org/ and install
- it on your system。
-
- Download url: https://www.thunderbird.net/en-US/thunderbird/all/
-
-2. Settings
- 2.1 Use plain text format instead of HTML format
- Options -> Account Settings -> Composition & Addressing, do *NOT* select
- "Compose message in HTML format".
-
- 2.2 Editor Settings
- Tools->Options->Advanced->Config editor.
-
- - To bring up the thunderbird's registry editor, and set:
- "mailnews.send_plaintext_flowed" to "false".
- - Disable HTML Format: Set "mail.identity.id1.compose_html" to "false".
- - Enable UTF8: Set "prefs.converted-to-utf8" to "true".
- - View message in UTF-8: Set "mailnews.view_default_charset" to "UTF-8".
- - Set mailnews.wraplength to 9999 for avoiding auto-wrap
-
-Linux kernel
-============
-
-There are several guides for kernel developers and users. These guides can
-be rendered in a number of formats, like HTML and PDF. Please read
-Documentation/admin-guide/README.rst first.
-
-In order to build the documentation, use ``make htmldocs`` or
-``make pdfdocs``. The formatted documentation can also be read online at:
-
- https://www.kernel.org/doc/html/latest/
-
-There are various text files in the Documentation/ subdirectory,
-several of them using the Restructured Text markup notation.
-See Documentation/00-INDEX for a list of what is contained in each file.
-
-Please read the Documentation/process/changes.rst file, as it contains the
-requirements for building and running the kernel, and information about
-the problems which may result by upgrading your kernel.
diff --git a/README.md b/README.md
new file mode 100644
index 000000000..20832fd85
--- /dev/null
+++ b/README.md
@@ -0,0 +1,237 @@
+# How to Contribute
+-------
+
+- [How to Contribute](#how-to-contribute)
+
+ \- [Sign the CLA](#sign-the-cla)
+
+ \- [Steps of submitting patches](#steps-of-submitting-patches)
+
+ \- [Use the unified patch format](#use-the-unified-patch-format)
+
+ \- [Define the patch format](#define-the-patch-format)
+
+ \- [Examples](#examples)
+
+ \- [Email client - Thunderbird settings](#email-client---thunderbird-settings)
+
+- [Linux kernel](#linux-kernel)
+
+### Sign the CLA
+
+-------
+
+Before making any contributions to openEuler, sign the CLA first.
+
+Address: [https://openeuler.org/en/cla.html](https://openeuler.org/en/cla.html)
+
+### Steps of submitting patches
+-------
+
+**Step 1** Compile and test your patches.
+
+**Step 2** Generate patches.
+
+Your patches should be generated based on the latest openEuler branch using git-format-patch. If your patches are in a patchset, it is better to use the **--cover-letter** option to describe what the patchset does.
+
+Use **scripts/checkpatch.pl** to ensure that no coding style issue exists.
+
+In addition, ensure that your patches comply with the unified openEuler patch format described below.
+
+**Step 3** Send your patches to the openEuler mailing list.
+
+To do so, run the following command:
+
+ `git send-email *.patch -to="kernel(a)openeuler.org" --suppress-cc=all`
+
+*NOTE*: Add **--suppress-cc=all** if you use git-send-email; otherwise, the email will be copied to all people in the upstream community and mailing lists.
+
+For details about how to send patches using git-send-email, see [https://git-scm.com/docs/git-send-email](https://git-scm.com/docs/git-send-….
+
+**Step 4** Mark "v1, v2, v3 ..." in your patch subject if you have multiple versions to send out.
+
+Use the **--subject-prefix="PATCH v2"** option to add the v2 tag to the patchset.
+
+ `git format-patch --subject-prefix="PATCH v2" -1`
+
+Subject examples:
+
+ Subject: [PATCH v2 01/27] fork: fix some -Wmissing-prototypes warnings
+
+ Subject: [PATCH v3] ext2: improve scalability of bitmap searching
+
+**Step 5** Upstream your kernel patches to the kernel community (recommended). openEuler will synchronize with the kernel master in a timely manner.
+
+**Step 6** Sign your work - the Developer’s Certificate of Origin.
+
+ Similar to the upstream kernel community, you also need to sign your patch.
+
+ For details, see [https://www.kernel.org/doc/html/latest/process/submitting-patches.html](htt….
+
+ The sign-off is a simple line at the end of the explanation of the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open source patch. The rules are pretty simple. You can certify as below:
+
+ Developer’s Certificate of Origin 1.1
+
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ By making a contribution to this project, I certify that:
+
+ (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file;
+
+ (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file;
+
+ (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
+
+ (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
+
+Then you add a line saying:
+
+Signed-off-by: Random J Developer <random(a)developer.example.org>
+
+Use your real name (sorry, no pseudonyms or anonymous contributions).
+
+### Use the unified patch format
+-------
+
+Reasons:
+
+1. Long term maintainability
+
+ openEuler will merge massive patches. If all patches are merged by casual
+
+ changelog formats without a unified format, the git logs will be messy, and
+
+ then it is hard to figure out the original patches.
+
+2. Kernel upgrade
+
+ We definitely will upgrade our openEuler kernel someday, so strict patch management
+
+ will alleviate the pain of migrating patches during big upgrades.
+
+3. Easy for script parsing
+
+ Keyword highlighting is necessary for script parsing.
+
+### Define the patch format
+-------
+
+[M] stands for "mandatory".
+
+[O] stands for "option".
+
+$category can be: bug preparation, bugfix, perf, feature, doc, other...
+
+If category is feature, we need to add a feature name as below:
+
+```cpp
+category: feature
+feature: YYY (the feature name)
+```
+
+If the patch is related to CVE or bugzilla, we need to add the corresponding tag as below (In general, it should include at least one of the following):
+
+```cpp
+CVE: $cve-id
+bugzilla: $bug-id
+```
+
+Additional changelog should include at least one of the following:
+
+1. Why we should apply this patch
+
+2. What real problems in the product does this patch resolve
+
+3. How could we reproduce this bug or how to test
+
+4. Other useful information for help to understand this patch or problem
+
+The detailed information is very useful for migrating a patch to another kernel branch.
+
+Example for mainline patch:
+
+```cpp
+mainline inclusion [M]
+from $mainline-version [M]
+commit $id [M]
+category: $category [M]
+bugzilla: $bug-id [O]
+CVE: $cve-id [O]
+
+additional changelog [O]
+
+--------------------------------
+
+original changelog
+Signed-off-by: $yourname <$yourname(a)huawei.com> [M]
+($mainline-version could be mainline-3.5, mainline-3.6, etc...)
+```
+
+### Examples
+-------
+
+```cpp
+mainline inclusion
+from mainline-4.10
+commit 0becc0ae5b42828785b589f686725ff5bc3b9b25
+category: bugfix
+bugzilla: 3004
+CVE: N/A
+
+The patch fixes a BUG_ON in the product: Injecting a single-bit ECC error into the memory before system boot using hardware inject tools will cause a large amount of CMCI during system booting.
+[ 1.146580] mce: [Hardware Error]: Machine check events logged
+[ 1.152908] ------------[ cut here ]------------
+[ 1.157751] kernel BUG at kernel/timer.c:951!
+[ 1.162321] invalid opcode: 0000 [#1] SMP
+
+-------------------------------------------------
+
+original changelog
+
+<original S-O-B>
+Signed-off-by: Zhang San <zhangsan(a)huawei.com>
+Tested-by: Li Si <lisi(a)huawei.com>
+```
+
+### Email client - Thunderbird settings
+-------
+
+If you are a new developer in the kernel community, it is highly recommended that you use the Thunderbird mail client.
+
+1. Thunderbird Installation
+
+ Obtain the English version of Thunderbird from [http://www.mozilla.org/](http://www.mozilla.org/) and install it on your system.
+
+ Download URL: https://www.thunderbird.net/en-US/thunderbird/all/
+
+2. Settings
+
+ 2.1 Use the plain text format instead of the HTML format.
+
+ Choose **Options > Account Settings > Composition & Addressing**, and do **NOT** select Compose message in HTML format.
+
+ 2.2 Editor settings
+
+ **Tools > Options> Advanced > Config editor**
+
+ \- To bring up the Thunderbird's registry editor, set **mailnews.send_plaintext_flowed** to **false**.
+
+ \- Disable HTML Format: Set **mail.identity.id1.compose_html** to **false**.
+
+ \- Enable UTF-8: Set **prefs.converted-to-utf8** to **true**.
+
+ \- View messages in UTF-8: Set **mailnews.view_default_charset** to **UTF-8**.
+
+ \- Set **mailnews.wraplength** to **9999** to avoid auto-wrap.
+
+# Linux kernel
+-------
+
+There are several guides for kernel developers and users, which can be rendered in a number of formats, like HTML and PDF. You can read **Documentation/admin-guide/README.rst** first.
+
+In order to build the documentation, use **make htmldocs** or **make pdfdocs**. The formatted documentation can also be read online at: https://www.kernel.org/doc/html/latest/
+
+There are various text files in the Documentation/ subdirectory, several of which use the Restructured Text markup notation. See Documentation/00-INDEX for a list of what is contained in each file.
+
+Read the **Documentation/process/changes.rst** file, as it contains the requirements for building and running the kernel, and information about the problems that may be caused by upgrading your kernel.
+
--
2.22.0
backport psi feature and avoid kabi change
bugzilla: https://gitee.com/openeuler/kernel/issues/I47QS2
Baruch Siach (1):
psi: fix reference to kernel commandline enable
Dan Schatzberg (1):
kernel/sched/psi.c: expose pressure metrics on root cgroup
Johannes Weiner (12):
mm: workingset: tell cache transitions from workingset thrashing
sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
sched: loadavg: make calc_load_n() public
sched: sched.h: make rq locking and clock functions available in
stats.h
sched: introduce this_rq_lock_irq()
psi: pressure stall information for CPU, memory, and IO
psi: cgroup support
psi: make disabling/enabling easier for vendor kernels
psi: fix aggregation idle shut-off
psi: avoid divide-by-zero crash inside virtual machines
fs: kernfs: add poll file operation
sched/psi: Fix sampling error and rare div0 crashes with cgroups and
high uptime
Josef Bacik (1):
blk-iolatency: use a percentile approache for ssd's
Liu Xinpeng (2):
psi:enable psi in config
psi:avoid kabi change
Miklos Szeredi (1):
fuse: ignore PG_workingset after stealing
Olof Johansson (1):
kernel/sched/psi.c: simplify cgroup_move_task()
Suren Baghdasaryan (6):
psi: introduce state_mask to represent stalled psi states
psi: make psi_enable static
psi: rename psi fields in preparation for psi trigger addition
psi: split update_stats into parts
psi: track changed states
include/: refactor headers to allow kthread.h inclusion in psi_types.h
Yafang Shao (1):
mm, memcg: add workingset_restore in memory.stat
Documentation/accounting/psi.txt | 73 +++
Documentation/admin-guide/cgroup-v2.rst | 22 +
Documentation/admin-guide/kernel-parameters.txt | 4 +
arch/arm64/configs/openeuler_defconfig | 2 +
arch/powerpc/platforms/cell/cpufreq_spudemand.c | 2 +-
arch/powerpc/platforms/cell/spufs/sched.c | 9 +-
arch/s390/appldata/appldata_os.c | 4 -
arch/x86/configs/openeuler_defconfig | 2 +
block/blk-iolatency.c | 183 +++++-
drivers/cpuidle/governors/menu.c | 4 -
drivers/spi/spi-rockchip.c | 1 +
fs/fuse/dev.c | 1 +
fs/kernfs/file.c | 31 +-
fs/proc/loadavg.c | 3 -
include/linux/cgroup-defs.h | 12 +
include/linux/cgroup.h | 17 +
include/linux/kernfs.h | 8 +
include/linux/kthread.h | 4 +
include/linux/mmzone.h | 3 +
include/linux/page-flags.h | 5 +
include/linux/psi.h | 55 ++
include/linux/psi_types.h | 95 +++
include/linux/sched.h | 13 +
include/linux/sched/loadavg.h | 24 +-
include/linux/swap.h | 1 +
include/trace/events/mmflags.h | 1 +
init/Kconfig | 28 +
kernel/cgroup/cgroup.c | 55 +-
kernel/debug/kdb/kdb_main.c | 7 +-
kernel/fork.c | 4 +
kernel/kthread.c | 3 +
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 16 +-
kernel/sched/loadavg.c | 139 ++--
kernel/sched/psi.c | 823 ++++++++++++++++++++++++
kernel/sched/sched.h | 178 ++---
kernel/sched/stats.h | 86 +++
kernel/workqueue.c | 23 +
kernel/workqueue_internal.h | 6 +-
mm/compaction.c | 5 +
mm/filemap.c | 20 +-
mm/huge_memory.c | 1 +
mm/memcontrol.c | 2 +
mm/migrate.c | 2 +
mm/page_alloc.c | 9 +
mm/swap_state.c | 1 +
mm/vmscan.c | 10 +
mm/vmstat.c | 1 +
mm/workingset.c | 117 +++-
49 files changed, 1837 insertions(+), 279 deletions(-)
create mode 100644 Documentation/accounting/psi.txt
create mode 100644 include/linux/psi.h
create mode 100644 include/linux/psi_types.h
create mode 100644 kernel/sched/psi.c
--
1.8.3.1
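Once the series is applied and psi is enabled (CONFIG_PSI, or the kernel
command line switch that the first patch's documentation fix refers to),
pressure data is exposed under /proc/pressure/. As a rough userspace
illustration, assuming the two-line format described in
Documentation/accounting/psi.txt:

#include <stdio.h>

/*
 * Dump memory pressure; expected output looks like:
 *   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
 *   full avg10=0.00 avg60=0.00 avg300=0.00 total=0
 */
int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/pressure/memory", "r");

	if (!f) {
		perror("/proc/pressure/memory");	/* psi disabled? */
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}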