Hi Tian Jun, attached is the evaluation material on the proposed KABI fixes for the TC regular meeting. Please take a look.
Thanks!
From: Tian, Jun J [mailto:jun.j.tian@intel.com]
Sent: October 25, 2022 10:43
To: Zhengzengkai <zhengzengkai@huawei.com>; Zhoukang (A) <zhoukang7@huawei.com>; kernel <kernel@openeuler.org>
Cc: Xiexiuqi <xiexiuqi@huawei.com>; Zeng, Jason <jason.zeng@intel.com>; Wang, Lin X <lin.x.wang@intel.com>; Dukaitian (Dukaitian, Intelligent Computing R&D) <dukaitian@huawei.com>; Huxinwei <huxinwei@huawei.com>; Hushiyuan <hushiyuan@huawei.com>; Xuhanbing <xuhanbing@huawei.com>
Subject: RE: [PATCH openEuler-5.10 0/4] Try to fix kabi change caused by Intel AMX
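The KABI problem is fundamentally a question of policy for merging new platforms into openEuler LTS updates. openEuler is currently bringing in a wide variety of platforms quickly and building out the ecosystem for each of them, and merging a large mainstream new platform can hardly avoid a large number of KABI changes. Given that, the focus should be on reviewing whether a KABI change affects third-party modules and, where it does, whether the affected modules can adapt to the change. The strict KABI requirements exist precisely to prevent such problems, but low-level kernel changes such as those to fpu and task_thread are rarely referenced by driver modules. If openEuler's review finds no such compatibility problem, and other mainstream OSVs have already merged the changes without hitting similar problems so far, then there should be a policy for deciding whether to accept the KABI changes of the corresponding platform.
In addition, neither using a macro to bypass the checking tool nor modifying the implementation of the existing patches to keep the KABI unified is an ideal solution. Modifying the patch implementation to keep the KABI consistent also makes the current openEuler base diverge from the upstream kernel and from other mainstream OSVs, which will cause conflicts in future rebases or kernel upgrades and may create problems for maintaining common third-party modules. In fact, adding a large number of __GENKSYMS__ macros also adds to the future maintenance and rebase burden.
So the essence of this problem is that we need to work out a corresponding policy together and establish an evaluation process, so that a wider range of platforms and ecosystems can be supported flexibly while KABI stability is still guaranteed.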
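As a rough illustration of the review step described above, the sketch below lists whitelisted symbols whose CRC differs between two Module.symvers-style listings and then reports which modules import any of them. It assumes each line starts with a hexadecimal CRC followed by the symbol name; the function names, the whitelist, and the per-module import sets are hypothetical inputs, not part of any existing openEuler tool. A changed CRC is only a coarse signal: a module that imports such a symbol may never touch the fields that actually changed.

```python
def parse_symvers(text):
    """Map symbol name -> CRC string from Module.symvers-style text.

    Assumption: every relevant line begins with a hex CRC, then the symbol.
    """
    crcs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            crcs[fields[1]] = fields[0]
    return crcs


def changed_kabi(old_text, new_text, whitelist):
    """Return whitelisted symbols whose CRC changed or that disappeared."""
    old = parse_symvers(old_text)
    new = parse_symvers(new_text)
    return sorted(s for s in whitelist if s in old and old[s] != new.get(s))


def affected_modules(changed, module_imports):
    """Return {module: [changed symbols it imports]} for affected modules.

    module_imports is a hypothetical mapping of module name -> set of
    imported symbol names, gathered by whatever means is available.
    """
    hit = set(changed)
    return {
        name: sorted(symbols & hit)
        for name, symbols in module_imports.items()
        if symbols & hit
    }
```

A module that shows up in the result is a candidate for closer inspection, not proof of an incompatibility.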
Thanks, Jun Tian
From: Zhengzengkai <zhengzengkai@huawei.com>
Sent: Monday, October 24, 2022 7:06 PM
To: Tian, Jun J <jun.j.tian@intel.com>; Zhoukang (A) <zhoukang7@huawei.com>; kernel <kernel@openeuler.org>
Cc: Xiexiuqi <xiexiuqi@huawei.com>; Zeng, Jason <jason.zeng@intel.com>; Wang, Lin X <lin.x.wang@intel.com>; Dukaitian (Dukaitian, Intelligent Computing R&D) <dukaitian@huawei.com>; Huxinwei <huxinwei@huawei.com>; Hushiyuan <hushiyuan@huawei.com>; Xuhanbing <xuhanbing@huawei.com>
Subject: RE: [PATCH openEuler-5.10 0/4] Try to fix kabi change caused by Intel AMX
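We do not recommend working around every KABI change (whether or not it comes from an SPR-related patch) by directly adding the __GENKSYMS__ macro.
The intent of this patch set is to have Intel evaluate together with us the benefits and risks of changing things this way. Put plainly, is it worth it?
Or is there a better approach?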
________________________________
郑增凯 Zheng Zengkai
Mobile: +86-50000020998 (For Welink, eSpace Calls)
Email: zhengzengkai@huawei.com

From: Tian, Jun J <jun.j.tian@intel.com>
To: Zhoukang (A) <zhoukang7@huawei.com>; Zhengzengkai <zhengzengkai@huawei.com>; kernel <kernel@openeuler.org>
Cc: Xiexiuqi <xiexiuqi@huawei.com>; Zeng, Jason <jason.zeng@intel.com>; Wang, Lin X <lin.x.wang@intel.com>; Dukaitian (Dukaitian, Intelligent Computing R&D) <dukaitian@huawei.com>; Huxinwei <huxinwei@huawei.com>; Hushiyuan <hushiyuan@huawei.com>; Xuhanbing <xuhanbing@huawei.com>
Time: 2022-10-24 15:42:04
Subject: RE: [PATCH openEuler-5.10 0/4] Try to fix kabi change caused by Intel AMX
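For now, to keep the KABI check unified, it is still best to use __GENKSYMS__ to wrap the kernel structs and ABI that have already been modified. That prevents check-kabi from repeatedly raising false alarms after the SPR code is merged, and we will handle the SPR-related KABI changes the same way to stay consistent. The precondition, of course, is that everyone has already reviewed the SPR changes and confirmed that they are not referenced or that the impact is controllable.
Modifying the KABI CRC tool may also cause other problems later; for example, KABI excluded from the checksum check may in the future be referenced by third-party modules.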
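To make the effect of the wrapping concrete, here is a toy model of the idea, not the real genksyms algorithm: a checksum tool that defines __GENKSYMS__ never sees members placed inside `#ifndef __GENKSYMS__`, so the checksum stays the same. Everything here is an assumption for illustration: `zlib.crc32` stands in for the real checksum, only flat, non-nested `#ifndef __GENKSYMS__ ... #endif` blocks are handled, and `demo_thread` is a made-up struct.

```python
import re
import zlib


def genksyms_view(source):
    """Return the source as a tool that defines __GENKSYMS__ would see it.

    Toy preprocessor: drops everything between '#ifndef __GENKSYMS__'
    and the next '#endif'. Nesting and '#else' are not handled.
    """
    kept = []
    skipping = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "#ifndef __GENKSYMS__":
            skipping = True
        elif stripped == "#endif" and skipping:
            skipping = False
        elif not skipping:
            kept.append(line)
    return "\n".join(kept)


def toy_crc(source):
    """CRC32 over whitespace-normalized tokens of the filtered source."""
    tokens = re.findall(r"\w+|[^\w\s]", genksyms_view(source))
    return zlib.crc32(" ".join(tokens).encode())


# Hypothetical struct used only for illustration.
OLD = """
struct demo_thread {
    unsigned long sp;
    unsigned long flags;
};
"""

HIDDEN = """
struct demo_thread {
    unsigned long sp;
    unsigned long flags;
#ifndef __GENKSYMS__
    unsigned int pkru;
#endif
};
"""

VISIBLE = HIDDEN.replace("#ifndef __GENKSYMS__\n", "").replace("#endif\n", "")

same_as_old = toy_crc(HIDDEN) == toy_crc(OLD)      # hidden member: CRC kept
differs_from_old = toy_crc(VISIBLE) != toy_crc(OLD)  # visible member: CRC moves
```

Note that hiding the member only keeps the checksum stable. The compiled struct still contains the new member, which is why the precondition above (the change is not referenced, or the impact is controllable) matters.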
Thanks, Jun Tian
-----Original Message-----
From: Zhoukang (A) <zhoukang7@huawei.com>
Sent: Saturday, October 22, 2022 3:54 PM
To: Zhengzengkai <zhengzengkai@huawei.com>; kernel@openeuler.org
Cc: Xiexiuqi <xiexiuqi@huawei.com>; Zeng, Jason <jason.zeng@intel.com>; Wang, Lin X <lin.x.wang@intel.com>; Tian, Jun J <jun.j.tian@intel.com>; Dukaitian (Dukaitian, Intelligent Computing R&D) <dukaitian@huawei.com>; Huxinwei <huxinwei@huawei.com>; Hushiyuan <hushiyuan@huawei.com>; Xuhanbing <xuhanbing@huawei.com>
Subject: RE: [PATCH openEuler-5.10 0/4] Try to fix kabi change caused by Intel AMX
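The code comments are a bit sparse; they do not convey the KABI compatibility approach or the usage constraints on drivers.
The compatibility patches are designed under the following constraints: the kernel KABI whitelist exists mainly to give third-party drivers a stable runtime environment; therefore driver code is forbidden from using the internal members of the fpu, task_thread, and interrupt data structures; and we have already checked the driver directory and confirmed that no driver uses them. Given these constraints, the current fix-kabi patches only avoid false positives from the checking tool.
An alternative approach: modify the kabi CRC calculation tool directly so that the CRC values of specific data structures are cleared; at the same time, a compile-time check is needed to ensure that drivers do not use the members of those data structures.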
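The alternative can be sketched in the same toy style, again as an assumption-laden illustration rather than a description of the real tool. `mask_exempt` replaces the body of each exempted struct with a fixed placeholder before the checksum is taken, so changes inside those structs no longer move the CRC. `member_uses` is a crude textual stand-in for the compile-time check: it only looks for `->name` or `.name` in driver source text and would miss anything hidden behind macros. The struct names and the `OPAQUE` placeholder are hypothetical, and only non-nested struct bodies are handled.

```python
import re
import zlib

# Matches 'struct name { ... }' with no nested braces in the body.
STRUCT_RE = re.compile(r"struct\s+(\w+)\s*\{[^{}]*\}")


def mask_exempt(source, exempt):
    """Replace the body of each exempt struct with a fixed placeholder."""
    def repl(match):
        name = match.group(1)
        if name in exempt:
            return f"struct {name} {{ OPAQUE }}"
        return match.group(0)
    return STRUCT_RE.sub(repl, source)


def toy_crc(source, exempt=frozenset()):
    """CRC32 over tokens after masking exempt structs (toy checksum)."""
    tokens = re.findall(r"\w+|[^\w\s]", mask_exempt(source, exempt))
    return zlib.crc32(" ".join(tokens).encode())


def member_uses(driver_source, members):
    """Return the forbidden member names that appear as '->m' or '.m'."""
    if not members:
        return []
    names = "|".join(re.escape(m) for m in members)
    pattern = re.compile(r"(?:->|\.)\s*(" + names + r")\b")
    return sorted({m.group(1) for m in pattern.finditer(driver_source)})
```

With `exempt={"demo_fpu"}`, adding a member to a hypothetical `struct demo_fpu` leaves `toy_crc` unchanged, while the same edit to any other struct still changes it. The exempted structs are then no longer protected by the checksum at all, so the member-use check becomes the only guard.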
-----Original Message-----
From: Zhengzengkai
Sent: Saturday, October 22, 2022 3:39 PM
To: kernel@openeuler.org
Cc: Xiexiuqi <xiexiuqi@huawei.com>; Zhoukang (A) <zhoukang7@huawei.com>; jason.zeng@intel.com; lin.x.wang@intel.com; jun.j.tian@intel.com; Dukaitian (Dukaitian, Intelligent Computing R&D) <dukaitian@huawei.com>; Zhengzengkai <zhengzengkai@huawei.com>
Subject: [PATCH openEuler-5.10 0/4] Try to fix kabi change caused by Intel AMX
These four patches try to avoid or fix kabi change caused by Intel AMX PR: https://gitee.com/openeuler/kernel/pulls/58
Zheng Zengkai (4):
  x86: Avoid kabi change caused by adding pkru element in thread_struct
  x86/extable: Avoid kabi change caused by exception table rework
  x86/fpu: Avoid kabi change caused by struct fpu
  mm: Fix kabi change caused by saved_auxv[] in mm_struct for x86_64
 arch/x86/include/asm/extable.h     |  4 ++++
 arch/x86/include/asm/fpu/types.h   |  4 ++++
 arch/x86/include/asm/processor.h   |  2 ++
 arch/x86/include/uapi/asm/auxvec.h |  1 +
 include/linux/mm_types.h           | 15 +++++++++++++++
 5 files changed, 26 insertions(+)

--
2.20.1