From: Mike Kravetz <mike.kravetz@oracle.com>

mainline inclusion
from mainline-v5.15-rc1
commit e32d20c0c88b1cd0a44f882c4f0eb2f536363d1b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IGRQ
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
When removing a hugetlb page from the pool, the ref count is set to one (as the free page has no ref count) and the compound page destructor is set to NULL_COMPOUND_DTOR. Since a subsequent call to free the hugetlb page will call __free_pages for non-gigantic pages and free_gigantic_page for gigantic pages, the destructor is not used.
However, consider the following race with code taking a speculative reference on the page:
Thread 0                                Thread 1
--------                                --------
remove_hugetlb_page
  set_page_refcounted(page);
  set_compound_page_dtor(page,
           NULL_COMPOUND_DTOR);
                                        get_page_unless_zero(page)
__update_and_free_page
  __free_pages(page,
           huge_page_order(h));

                /* Note that __free_pages() will simply drop
                   the reference to the page. */

                                        put_page(page)
                                          __put_compound_page()
                                            destroy_compound_page
                                              NULL_COMPOUND_DTOR
                                                BUG: kernel NULL pointer
                                                dereference, address:
                                                0000000000000000
To address this race, set the dtor to the normal compound page dtor for non-gigantic pages. The dtor for gigantic pages does not matter as gigantic pages are changed from a compound page to 'just a group of pages' before freeing. Hence, the destructor is not used.
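The speculative-reference pattern behind this race can be sketched with C11 atomics. This is a userspace analog for illustration only; `struct obj`, `get_unless_zero` and `put_obj` are stand-ins, not the kernel's page APIs. The point it shows: the last put runs the destructor, so the destructor must be valid whenever a speculative reference can still succeed.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Userspace analog of the page refcount pattern; names are illustrative. */
struct obj {
	atomic_int refcount;
	void (*dtor)(struct obj *);	/* analog of the compound page dtor */
};

static int destroyed;			/* set when the demo dtor runs */
static void obj_dtor(struct obj *o) { (void)o; destroyed = 1; }

/* Analog of get_page_unless_zero(): take a reference only if nonzero. */
static int get_unless_zero(struct obj *o)
{
	int c = atomic_load(&o->refcount);

	while (c != 0) {
		if (atomic_compare_exchange_weak(&o->refcount, &c, c + 1))
			return 1;	/* speculative reference taken */
	}
	return 0;
}

/* Analog of put_page(): the final reference drop runs the destructor,
 * so the dtor must never be NULL while a speculative get can succeed. */
static void put_obj(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		o->dtor(o);
}
```

In the fixed kernel code the same invariant is kept by leaving non-gigantic pages with a valid COMPOUND_PAGE_DTOR, so a delayed final put_page() still finds a usable destructor.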
Link: https://lkml.kernel.org/r/20210809184832.18342-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 mm/hugetlb.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 47dd6b5e0040..6ae2d2e90681 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1364,8 +1364,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
 		h->surplus_huge_pages_node[nid]--;
 	}
 
+	/*
+	 * Very subtle
+	 *
+	 * For non-gigantic pages set the destructor to the normal compound
+	 * page dtor. This is needed in case someone takes an additional
+	 * temporary ref to the page, and freeing is delayed until they drop
+	 * their reference.
+	 *
+	 * For gigantic pages set the destructor to the null dtor. This
+	 * destructor will never be called. Before freeing the gigantic
+	 * page destroy_compound_gigantic_page will turn the compound page
+	 * into a simple group of pages. After this the destructor does not
+	 * apply.
+	 *
+	 * This handles the case where more than one ref is held when and
+	 * after update_and_free_page is called.
+	 */
 	set_page_refcounted(page);
-	set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+	if (hstate_is_gigantic(h))
+		set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+	else
+		set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
 
 	h->nr_huge_pages--;
 	h->nr_huge_pages_node[nid]--;
From: Xiongfeng Wang <wangxiongfeng2@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBQ8
----------------------------------------
Kprobes use 'stop_machine' to modify code that could be run in the sdei_handler at the same time. This patch masks SDEI before running the stop_machine callback to avoid this race condition.
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 kernel/stop_machine.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 890b79cf0e7c..5c80fe3562b7 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -23,6 +23,10 @@
 #include <linux/nmi.h>
 #include <linux/sched/wake_q.h>
 
+#ifdef CONFIG_ARM64
+#include <linux/arm_sdei.h>
+#endif
+
 /*
  * Structure to determine completion condition and record errors.  May
  * be shared by works on different cpus.
@@ -218,6 +222,9 @@ static int multi_cpu_stop(void *data)
 		case MULTI_STOP_DISABLE_IRQ:
 			local_irq_disable();
 			hard_irq_disable();
+#ifdef CONFIG_ARM64
+			sdei_mask_local_cpu();
+#endif
 			break;
 		case MULTI_STOP_RUN:
 			if (is_active)
@@ -238,6 +245,9 @@ static int multi_cpu_stop(void *data)
 			rcu_momentary_dyntick_idle();
 	} while (curstate != MULTI_STOP_EXIT);
 
+#ifdef CONFIG_ARM64
+	sdei_unmask_local_cpu();
+#endif
 	local_irq_restore(flags);
 	return err;
 }
From: Wei Li <liwei391@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBQ8
----------------------------------------
Kprobes use 'stop_machine' to modify code that could be run in pseudo-NMI context at the same time. This patch masks pseudo-NMIs before running the stop_machine callback to avoid this race condition.
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/arm64/include/asm/arch_gicv3.h | 11 +++++++++++
 kernel/stop_machine.c               |  3 +++
 2 files changed, 14 insertions(+)
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index 3dd64dd18559..12aced900ada 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -184,5 +184,16 @@ static inline void gic_arch_enable_irqs(void)
 	asm volatile ("msr daifclr, #2" : : : "memory");
 }
 
+static inline void gic_arch_disable_irqs(void)
+{
+	asm volatile ("msr daifset, #2" : : : "memory");
+}
+
+static inline void gic_arch_restore_irqs(unsigned long flags)
+{
+	if (gic_supports_nmi())
+		asm volatile ("msr daif, %0" : : "r" (flags >> 32)
+			      : "memory");
+}
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_ARCH_GICV3_H */
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 5c80fe3562b7..dd5aeddbed5d 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -25,6 +25,7 @@
 
 #ifdef CONFIG_ARM64
 #include <linux/arm_sdei.h>
+#include <asm/arch_gicv3.h>
 #endif
 
 /*
@@ -223,6 +224,7 @@ static int multi_cpu_stop(void *data)
 			local_irq_disable();
 			hard_irq_disable();
 #ifdef CONFIG_ARM64
+			gic_arch_disable_irqs();
 			sdei_mask_local_cpu();
 #endif
 			break;
@@ -247,6 +249,7 @@ static int multi_cpu_stop(void *data)
 
 #ifdef CONFIG_ARM64
 	sdei_unmask_local_cpu();
+	gic_arch_restore_irqs(flags);
 #endif
 	local_irq_restore(flags);
 	return err;
From: Ajo Jose Panoor <ajo.jose.panoor@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JC4P
CVE: NA
-----------------------------------------------------------------
Writing to securityfs (x509_for_children) fails with permission issues during IMANS configuration. This is because IMANS checks for the CAP_SYS_ADMIN capability in the initial user namespace rather than in the newly created user namespace that the new process is actually part of.
Signed-off-by: Ajo Jose Panoor <ajo.jose.panoor@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 security/integrity/ima/ima_fs.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 00cd8095d346..b7959de25a5f 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -637,12 +637,16 @@ static const struct file_operations ima_data_upload_ops = {
 static int ima_open_for_children(struct inode *inode, struct file *file)
 {
 	struct ima_namespace *ima_ns = get_current_ns();
+	struct ima_namespace *ima_ns_for_children = current->nsproxy->ima_ns_for_children;
 
 	/* Allow to set children configuration only after unshare() */
 	if (ima_ns == current->nsproxy->ima_ns_for_children)
 		return -EPERM;
 
-	return ima_open_simple(inode, file);
+	if (!ns_capable(ima_ns_for_children->user_ns, CAP_SYS_ADMIN))
+		return -EPERM;
+
+	return 0;
 }
 
 static ssize_t ima_write_x509_for_children(struct file *file,
From: Guo Mengqi <guomengqi3@huawei.com>

ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KX9W
CVE: NA
------------
Support TS core RAS processing for Ascend.
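The mechanism this patch exports is a blocking notifier chain that TS core consumers subscribe to; conceptually it is just an ordered list of callbacks invoked with an action code and a payload pointer. A minimal userspace analog (illustrative types and names, not the kernel's notifier API) looks like this:

```c
#include <stddef.h>

#define MAX_NOTIFIERS 8

typedef int (*notifier_fn)(unsigned long action, void *data);

/* Simplified stand-in for a blocking notifier chain head. */
struct notifier_chain {
	notifier_fn fns[MAX_NOTIFIERS];
	int count;
};

/* Analog of blocking_notifier_chain_register(). */
static int chain_register(struct notifier_chain *c, notifier_fn fn)
{
	if (c->count >= MAX_NOTIFIERS)
		return -1;
	c->fns[c->count++] = fn;
	return 0;
}

/* Analog of blocking_notifier_call_chain(): call handlers in order;
 * a nonzero return stops the walk. */
static int chain_call(struct notifier_chain *c, unsigned long action,
		      void *data)
{
	for (int i = 0; i < c->count; i++) {
		int ret = c->fns[i](action, data);

		if (ret != 0)
			return ret;
	}
	return 0;
}

static int seen;	/* demo state for the example handler below */
static int count_handler(unsigned long action, void *data)
{
	(void)data;
	seen += (int)action;
	return 0;
}
```

In the patch itself, a TS core driver would register its handler on ghes_ts_err_chain and receive the CPER payload pointer as data when ghes_do_proc() encounters a CPER_SEC_TS_CORE section.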
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 drivers/acpi/apei/ghes.c | 7 +++++++
 include/acpi/ghes.h      | 2 ++
 include/linux/cper.h     | 4 ++++
 3 files changed, 13 insertions(+)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 0c8330ed1ffd..744769f7bddb 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -118,6 +118,9 @@ module_param_named(disable, ghes_disable, bool, 0);
 static LIST_HEAD(ghes_hed);
 static DEFINE_MUTEX(ghes_list_mutex);
 
+BLOCKING_NOTIFIER_HEAD(ghes_ts_err_chain);
+EXPORT_SYMBOL(ghes_ts_err_chain);
+
 /*
  * Because the memory area used to transfer hardware error information
  * from BIOS to Linux can be determined only in NMI, IRQ or timer
@@ -655,6 +658,10 @@ static bool ghes_do_proc(struct ghes *ghes,
 		} else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
 			queued = ghes_handle_arm_hw_error(gdata, sev);
+		}
+		else if (guid_equal(sec_type, &CPER_SEC_TS_CORE)) {
+			blocking_notifier_call_chain(&ghes_ts_err_chain,
+					0, acpi_hest_get_payload(gdata));
 		} else {
 			void *err = acpi_hest_get_payload(gdata);
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 34fb3431a8f3..89330e4872c0 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -145,4 +145,6 @@ int ghes_notify_sea(void);
 static inline int ghes_notify_sea(void) { return -ENOENT; }
 #endif
 
+extern struct blocking_notifier_head ghes_ts_err_chain;
+
 #endif /* GHES_H */
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 6a511a1078ca..78cf8a0b05a5 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -197,6 +197,10 @@ enum {
 #define CPER_SEC_DMAR_IOMMU						\
 	GUID_INIT(0x036F84E1, 0x7F37, 0x428c, 0xA7, 0x9E, 0x57, 0x5F,	\
 		  0xDF, 0xAA, 0x84, 0xEC)
+/* HISI ts core */
+#define CPER_SEC_TS_CORE						\
+	GUID_INIT(0xeb4c71f8, 0xbc76, 0x4c46, 0xbd, 0x9, 0xd0, 0xd3,	\
+		  0x45, 0x0, 0x5a, 0x92)
 
 #define CPER_PROC_VALID_TYPE			0x0001
 #define CPER_PROC_VALID_ISA			0x0002
From: yangerkun <yangerkun@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 185810, https://gitee.com/openeuler/kernel/issues/I4JX1G
CVE: NA
---------------------------
dio_bio_complete() will set a page dirty without considering whether the page still has valid buffer_heads attached. This can trigger problems when ext4 tries to write back such a page. For ext4, fix it by skipping writeback of pages that have no buffer_heads.
[1] https://lwn.net/Articles/774411/ : "DMA and get_user_pages()"
[2] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
[3] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>

Conflicts:
	fs/ext4/inode.c

Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 fs/ext4/inode.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0415548afc71..164161e4c144 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1940,6 +1940,20 @@ static int __ext4_journalled_writepage(struct page *page,
 	return ret;
 }
 
+static void cancel_page_dirty_status(struct page *page)
+{
+	struct address_space *mapping = page_mapping(page);
+	unsigned long flags;
+
+	cancel_dirty_page(page);
+	xa_lock_irqsave(&mapping->i_pages, flags);
+	__xa_clear_mark(&mapping->i_pages, page_index(page),
+			PAGECACHE_TAG_DIRTY);
+	__xa_clear_mark(&mapping->i_pages, page_index(page),
+			PAGECACHE_TAG_TOWRITE);
+	xa_unlock_irqrestore(&mapping->i_pages, flags);
+}
+
 /*
  * Note that we don't need to start a transaction unless we're journaling data
  * because we should have holes filled from ext4_page_mkwrite(). We even don't
@@ -1998,6 +2012,12 @@ static int ext4_writepage(struct page *page,
 		return -EIO;
 	}
 
+	if (WARN_ON(!page_has_buffers(page))) {
+		cancel_page_dirty_status(page);
+		unlock_page(page);
+		return 0;
+	}
+
 	trace_ext4_writepage(page);
 	size = i_size_read(inode);
 	if (page->index == size >> PAGE_SHIFT &&
@@ -2606,6 +2626,12 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			continue;
 		}
 
+		if (WARN_ON(!page_has_buffers(page))) {
+			cancel_page_dirty_status(page);
+			unlock_page(page);
+			continue;
+		}
+
 		wait_on_page_writeback(page);
 		BUG_ON(PageWriteback(page));
From: Wei Li <liwei391@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JUZZ
-------------------------------------------------
When CONFIG_DEBUG_PREEMPT and CONFIG_PREEMPT are enabled, a 'BUG' is triggered in the pmu based nmi_watchdog initialization:
[    3.341853] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
[    3.344392] caller is debug_smp_processor_id+0x17/0x20
[    3.344395] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #398
[    3.344397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[    3.344399] Call Trace:
[    3.344410]  dump_stack+0x60/0x76
[    3.344412]  check_preemption_disabled+0xba/0xc0
[    3.344415]  debug_smp_processor_id+0x17/0x20
[    3.344422]  hardlockup_detector_event_create+0xf/0x60
[    3.344427]  hardlockup_detector_perf_init+0xf/0x41
[    3.344430]  watchdog_nmi_probe+0xe/0x10
[    3.344432]  lockup_detector_init+0x22/0x5b
[    3.344437]  kernel_init_freeable+0x20c/0x245
[    3.344439]  ? rest_init+0xd0/0xd0
[    3.344441]  kernel_init+0xe/0x110
[    3.344446]  ret_from_fork+0x22/0x30
This issue was introduced by commit a79050434b45, which moved lockup_detector_init() down after do_basic_setup(), and thus after sched_init_smp() as well.
hardlockup_detector_event_create
  |- hardlockup_detector_perf_init (unsafe)
       |- watchdog_nmi_probe
            |- lockup_detector_init
  |- hardlockup_detector_perf_enable
       |- watchdog_nmi_enable
            |- watchdog_enable
                 |- lockup_detector_online_cpu
                 |- softlockup_start_fn
                      |- softlockup_start_all
                           |- lockup_detector_reconfigure
                                |- lockup_detector_setup
                                     |- lockup_detector_init
After analysing the calling context, it is only unsafe to use smp_processor_id() in hardlockup_detector_perf_init(), as the 'kernel_init' thread is preemptible after sched_init_smp().
Since this is just a test of whether the pmu based nmi_watchdog can be enabled (the real enabling happens later in softlockup_start_fn(), which ensures watchdog_enable() is called on every core), it is safe to simply disable preemption here to fix this 'BUG'.
Fixes: a79050434b45 ("lockup_detector: init lockup detector after all the init_calls")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: wangxiongfeng <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 kernel/watchdog_hld.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 060873ff8a6d..f535ddd76315 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -504,14 +504,17 @@ void __init hardlockup_detector_perf_restart(void)
  */
 int __init hardlockup_detector_perf_init(void)
 {
-	int ret = hardlockup_detector_event_create();
+	int ret;
 
+	preempt_disable();
+	ret = hardlockup_detector_event_create();
 	if (ret) {
 		pr_info("Perf NMI watchdog permanently disabled\n");
 	} else {
 		perf_event_release_kernel(this_cpu_read(watchdog_ev));
 		this_cpu_write(watchdog_ev, NULL);
 	}
+	preempt_enable();
 	return ret;
 }
 #endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */
From: YueHaibing <yuehaibing@huawei.com>

hulk inclusion
category: performance
bugzilla: 16588, https://gitee.com/openeuler/kernel/issues/I4K2W7
CVE: NA
-------------------------------------------------
This is the arm64 CRC-T10DIF transform accelerated with ARMv8 NEON instructions.
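For reference, CRC-T10DIF uses polynomial 0x8BB7 with a zero initial value and no bit reflection. A portable bit-at-a-time implementation (a correctness sketch to compare the NEON path against, not the accelerated code itself) looks like this:

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-T10DIF reference: poly 0x8BB7, init 0, no reflection,
 * no final XOR. The standard check value for "123456789" is 0xD0DB. */
static uint16_t crc_t10dif_ref(uint16_t crc, const unsigned char *buf,
			       size_t len)
{
	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)((uint16_t)buf[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}
```

The NEON version folds 64 bytes per iteration with PMULL carry-less multiplies instead of bit-by-bit shifting, but must produce exactly the same 16-bit result as this reference for every input.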
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/arm64/crypto/Makefile                |   3 +-
 arch/arm64/crypto/crct10dif-neon-asm_64.S | 752 ++++++++++++++++++++++
 arch/arm64/crypto/crct10dif-neon_glue.c   | 116 ++++
 3 files changed, 870 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/crypto/crct10dif-neon-asm_64.S
 create mode 100644 arch/arm64/crypto/crct10dif-neon_glue.c
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index d0901e610df3..a5d4b672b6e1 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -27,7 +27,8 @@ obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM64_CE) += crct10dif-ce.o -crct10dif-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o +crct10dif-ce-y := crct10dif-neon-asm_64.o crct10dif-neon_glue.o +AFLAGS_crct10dif-neon-asm_64.o := -march=armv8-a+crypto
obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o aes-ce-cipher-y := aes-ce-core.o aes-ce-glue.o diff --git a/arch/arm64/crypto/crct10dif-neon-asm_64.S b/arch/arm64/crypto/crct10dif-neon-asm_64.S new file mode 100644 index 000000000000..a37204bf5a7a --- /dev/null +++ b/arch/arm64/crypto/crct10dif-neon-asm_64.S @@ -0,0 +1,752 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2016-2017 Hisilicon Limited. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include <linux/linkage.h> +#include <asm/assembler.h> + +.global crc_t10dif_neon +.text + +/* X0 is initial CRC value + * X1 is data buffer + * X2 is the length of buffer + * X3 is the backup buffer(for extend) + * X4 for other extend parameter(for extend) + * Q0, Q1, Q2, Q3 maybe as parameter for other functions, + * the value of Q0, Q1, Q2, Q3 maybe modified. + * + * suggestion: + * 1. dont use general purpose register for calculation + * 2. set data endianness outside of the kernel + * 3. use ext as shifting around + * 4. dont use LD3/LD4, ST3/ST4 + */ + +crc_t10dif_neon: + /* push the register to stack that CRC16 will use */ + STP X5, X6, [sp, #-0x10]! + STP X7, X8, [sp, #-0x10]! + STP X9, X10, [sp, #-0x10]! + STP X11, X12, [sp, #-0x10]! + STP X13, X14, [sp, #-0x10]! + STP Q10, Q11, [sp, #-0x20]! + STP Q12, Q13, [sp, #-0x20]! + STP Q4, Q5, [sp, #-0x20]! + STP Q6, Q7, [sp, #-0x20]! + STP Q8, Q9, [sp, #-0x20]! + STP Q14, Q15, [sp, #-0x20]! + STP Q16, Q17, [sp, #-0x20]! + STP Q18, Q19, [sp, #-0x20]! + + SUB sp,sp,#0x20 + + MOV X11, #0 // PUSH STACK FLAG + + CMP X2, #0x80 + B.LT 2f // _less_than_128, <128 + + /* V10/V11/V12/V13 is 128bit. 
+ * we get data 512bit( by cacheline ) each time + */ + LDP Q10, Q11, [X1], #0x20 + LDP Q12, Q13, [X1], #0x20 + + /* move the initial value to V6 register */ + LSL X0, X0, #48 + EOR V6.16B, V6.16B, V6.16B + MOV V6.D[1], X0 + + /* big-little end change. because the data in memory is little-end, + * we deal the data for bigend + */ + + REV64 V10.16B, V10.16B + REV64 V11.16B, V11.16B + REV64 V12.16B, V12.16B + REV64 V13.16B, V13.16B + EXT V10.16B, V10.16B, V10.16B, #8 + EXT V11.16B, V11.16B, V11.16B, #8 + EXT V12.16B, V12.16B, V12.16B, #8 + EXT V13.16B, V13.16B, V13.16B, #8 + + EOR V10.16B, V10.16B, V6.16B + + SUB X2, X2, #0x80 + ADD X5, X1, #0x20 + + /* deal data when the size of buffer bigger than 128 bytes */ + /* _fold_64_B_loop */ + LDR Q6,=0xe658000000000000044c000000000000 +1: + + LDP Q16, Q17, [X1] ,#0x40 + LDP Q18, Q19, [X5], #0x40 + + /* carry-less multiply. + * V10 high-64bits carry-less multiply + * V6 high-64bits(PMULL2) + * V11 low-64bits carry-less multiply V6 low-64bits(PMULL) + */ + + PMULL2 V4.1Q, V10.2D, V6.2D + PMULL V10.1Q, V10.1D, V6.1D + PMULL2 V5.1Q, V11.2D, V6.2D + PMULL V11.1Q, V11.1D, V6.1D + + REV64 V16.16B, V16.16B + REV64 V17.16B, V17.16B + REV64 V18.16B, V18.16B + REV64 V19.16B, V19.16B + + PMULL2 V14.1Q, V12.2D, V6.2D + PMULL V12.1Q, V12.1D, V6.1D + PMULL2 V15.1Q, V13.2D, V6.2D + PMULL V13.1Q, V13.1D, V6.1D + + EXT V16.16B, V16.16B, V16.16B, #8 + EOR V10.16B, V10.16B, V4.16B + + EXT V17.16B, V17.16B, V17.16B, #8 + EOR V11.16B, V11.16B, V5.16B + + EXT V18.16B, V18.16B, V18.16B, #8 + EOR V12.16B, V12.16B, V14.16B + + EXT V19.16B, V19.16B, V19.16B, #8 + EOR V13.16B, V13.16B, V15.16B + + SUB X2, X2, #0x40 + + + EOR V10.16B, V10.16B, V16.16B + EOR V11.16B, V11.16B, V17.16B + + EOR V12.16B, V12.16B, V18.16B + EOR V13.16B, V13.16B, V19.16B + + CMP X2, #0x0 + B.GE 1b // >=0 + + LDR Q6, =0x06df0000000000002d56000000000000 + MOV V4.16B, V10.16B + /* V10 carry-less 0x06df000000000000([127:64]*[127:64]) */ + PMULL V4.1Q, V4.1D, V6.1D //switch PMULL 
& PMULL2 order + PMULL2 V10.1Q, V10.2D, V6.2D + EOR V11.16B, V11.16B, V4.16B + EOR V11.16B, V11.16B, V10.16B + + MOV V4.16B, V11.16B + PMULL V4.1Q, V4.1D, V6.1D //switch PMULL & PMULL2 order + PMULL2 V11.1Q, V11.2D, V6.2D + EOR V12.16B, V12.16B, V4.16B + EOR V12.16B, V12.16B, V11.16B + + MOV V4.16B, V12.16B + PMULL V4.1Q, V4.1D, V6.1D //switch PMULL & PMULL2 order + PMULL2 V12.1Q, V12.2D, V6.2D + EOR V13.16B, V13.16B, V4.16B + EOR V13.16B, V13.16B, V12.16B + + ADD X2, X2, #48 + CMP X2, #0x0 + B.LT 3f // _final_reduction_for_128, <0 + + /* _16B_reduction_loop */ +4: + /* unrelated load as early as possible*/ + LDR Q10, [X1], #0x10 + + MOV V4.16B, V13.16B + PMULL2 V13.1Q, V13.2D, V6.2D + PMULL V4.1Q, V4.1D, V6.1D + EOR V13.16B, V13.16B, V4.16B + + REV64 V10.16B, V10.16B + EXT V10.16B, V10.16B, V10.16B, #8 + + EOR V13.16B, V13.16B, V10.16B + + SUB X2, X2, #0x10 + CMP X2, #0x0 + B.GE 4b // _16B_reduction_loop, >=0 + + /* _final_reduction_for_128 */ +3: ADD X2, X2, #0x10 + CMP X2, #0x0 + B.EQ 5f // _128_done, ==0 + + /* _get_last_two_xmms */ +6: MOV V12.16B, V13.16B + SUB X1, X1, #0x10 + ADD X1, X1, X2 + LDR Q11, [X1], #0x10 + REV64 V11.16B, V11.16B + EXT V11.16B, V11.16B, V11.16B, #8 + + CMP X2, #8 + B.EQ 50f + B.LT 51f + B.GT 52f + +50: + /* dont use X register as temp one */ + FMOV D14, D12 + MOVI D12, #0 + MOV V12.D[1],V14.D[0] + B 53f +51: + MOV X9, #64 + LSL X13, X2, #3 // <<3 equal x8 + SUB X9, X9, X13 + MOV X5, V12.D[0] // low 64-bit + MOV X6, V12.D[1] // high 64-bit + LSR X10, X5, X9 // high bit of low 64-bit + LSL X7, X5, X13 + LSL X8, X6, X13 + ORR X8, X8, X10 // combination of high 64-bit + MOV V12.D[1], X8 + MOV V12.D[0], X7 + + B 53f +52: + LSL X13, X2, #3 // <<3 equal x8 + SUB X13, X13, #64 + + DUP V18.2D, X13 + FMOV D16, D12 + USHL D16, D16, D18 + EXT V12.16B, V16.16B, V16.16B, #8 + +53: + MOVI D14, #0 //add one zero constant + + CMP X2, #0 + B.EQ 30f + CMP X2, #1 + B.EQ 31f + CMP X2, #2 + B.EQ 32f + CMP X2, #3 + B.EQ 33f + CMP X2, #4 + B.EQ 34f + CMP 
X2, #5 + B.EQ 35f + CMP X2, #6 + B.EQ 36f + CMP X2, #7 + B.EQ 37f + CMP X2, #8 + B.EQ 38f + CMP X2, #9 + B.EQ 39f + CMP X2, #10 + B.EQ 40f + CMP X2, #11 + B.EQ 41f + CMP X2, #12 + B.EQ 42f + CMP X2, #13 + B.EQ 43f + CMP X2, #14 + B.EQ 44f + CMP X2, #15 + B.EQ 45f + + // >> 128bit +30: + EOR V13.16B, V13.16B, V13.16B + EOR V8.16B, V8.16B, V8.16B + LDR Q9,=0xffffffffffffffffffffffffffffffff + B 46f + + // >> 120bit +31: + USHR V13.2D, V13.2D, #56 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xff + LDR Q9,=0xffffffffffffffffffffffffffffff00 + B 46f + + // >> 112bit +32: + USHR V13.2D, V13.2D, #48 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffff + LDR Q9,=0xffffffffffffffffffffffffffff0000 + B 46f + + // >> 104bit +33: + USHR V13.2D, V13.2D, #40 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffffff + LDR Q9,=0xffffffffffffffffffffffffff000000 + B 46f + + // >> 96bit +34: + USHR V13.2D, V13.2D, #32 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffffffff + LDR Q9,=0xffffffffffffffffffffffff00000000 + B 46f + + // >> 88bit +35: + USHR V13.2D, V13.2D, #24 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffffffffff + LDR Q9,=0xffffffffffffffffffffff0000000000 + B 46f + + // >> 80bit +36: + USHR V13.2D, V13.2D, #16 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffffffffffff + LDR Q9,=0xffffffffffffffffffff000000000000 + B 46f + + // >> 72bit +37: + USHR V13.2D, V13.2D, #8 + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffffffffffffff + LDR Q9,=0xffffffffffffffffff00000000000000 + B 46f + + // >> 64bit +38: + EXT V13.16B, V13.16B, V14.16B, #8 + LDR Q8,=0xffffffffffffffff + LDR Q9,=0xffffffffffffffff0000000000000000 + B 46f + + // >> 56bit +39: + EXT V13.16B, V13.16B, V13.16B, #7 + MOV V13.S[3], V14.S[0] + MOV V13.H[5], V14.H[0] + MOV V13.B[9], V14.B[0] + + LDR Q8,=0xffffffffffffffffff + LDR Q9,=0xffffffffffffff000000000000000000 + B 46f + + // >> 48bit +40: + EXT V13.16B, V13.16B, V13.16B, #6 + MOV V13.S[3], V14.S[0] + MOV V13.H[5], V14.H[0] + + LDR 
Q8,=0xffffffffffffffffffff + LDR Q9,=0xffffffffffff00000000000000000000 + B 46f + + // >> 40bit +41: + EXT V13.16B, V13.16B, V13.16B, #5 + MOV V13.S[3], V14.S[0] + MOV V13.B[11], V14.B[0] + + LDR Q8,=0xffffffffffffffffffffff + LDR Q9,=0xffffffffff0000000000000000000000 + B 46f + + // >> 32bit +42: + EXT V13.16B, V13.16B, V13.16B, #4 + MOV V13.S[3], V14.S[0] + + LDR Q8,=0xffffffffffffffffffffffff + LDR Q9,=0xffffffff000000000000000000000000 + B 46f + + // >> 24bit +43: + EXT V13.16B, V13.16B, V13.16B, #3 + MOV V13.H[7], V14.H[0] + MOV V13.B[13], V14.B[0] + + LDR Q8,=0xffffffffffffffffffffffffff + LDR Q9,=0xffffff00000000000000000000000000 + B 46f + + // >> 16bit +44: + EXT V13.16B, V13.16B, V13.16B, #2 + MOV V13.H[7], V14.H[0] + + LDR Q8,=0xffffffffffffffffffffffffffff + LDR Q9,=0xffff0000000000000000000000000000 + B 46f + + // >> 8bit +45: + EXT V13.16B, V13.16B, V13.16B, #1 + MOV V13.B[15], V14.B[0] + + LDR Q8,=0xffffffffffffffffffffffffffffff + LDR Q9,=0xff000000000000000000000000000000 + + // backup V12 first + // pblendvb xmm1, xmm2 +46: + AND V12.16B, V12.16B, V9.16B + AND V11.16B, V11.16B, V8.16B + ORR V11.16B, V11.16B, V12.16B + + MOV V12.16B, V11.16B + MOV V4.16B, V13.16B + PMULL2 V13.1Q, V13.2D, V6.2D + PMULL V4.1Q, V4.1D, V6.1D + EOR V13.16B, V13.16B, V4.16B + EOR V13.16B, V13.16B, V12.16B + + /* _128_done. 
we change the Q6 D[0] and D[1] */ +5: LDR Q6, =0x2d560000000000001368000000000000 + MOVI D14, #0 + MOV V10.16B, V13.16B + PMULL2 V13.1Q, V13.2D, V6.2D + + MOV V10.D[1], V10.D[0] + MOV V10.D[0], V14.D[0] //set zero + + EOR V13.16B, V13.16B, V10.16B + + MOV V10.16B, V13.16B + LDR Q7, =0x00000000FFFFFFFFFFFFFFFFFFFFFFFF + AND V10.16B, V10.16B, V7.16B + + MOV S13, V13.S[3] + + PMULL V13.1Q, V13.1D, V6.1D + EOR V13.16B, V13.16B, V10.16B + + /* _barrett */ +7: LDR Q6, =0x00000001f65a57f8000000018bb70000 + MOVI D14, #0 + MOV V10.16B, V13.16B + PMULL2 V13.1Q, V13.2D, V6.2D + + EXT V13.16B, V13.16B, V13.16B, #12 + MOV V13.S[0], V14.S[0] + + EXT V6.16B, V6.16B, V6.16B, #8 + PMULL2 V13.1Q, V13.2D, V6.2D + + EXT V13.16B, V13.16B, V13.16B, #12 + MOV V13.S[0], V14.S[0] + + EOR V13.16B, V13.16B, V10.16B + MOV X0, V13.D[0] + + /* _cleanup */ +8: MOV X14, #48 + LSR X0, X0, X14 +99: + ADD sp, sp, #0x20 + + LDP Q18, Q19, [sp], #0x20 + LDP Q16, Q17, [sp], #0x20 + LDP Q14, Q15, [sp], #0x20 + + LDP Q8, Q9, [sp], #0x20 + LDP Q6, Q7, [sp], #0x20 + LDP Q4, Q5, [sp], #0x20 + LDP Q12, Q13, [sp], #0x20 + LDP Q10, Q11, [sp], #0x20 + LDP X13, X14, [sp], #0x10 + LDP X11, X12, [sp], #0x10 + LDP X9, X10, [sp], #0x10 + LDP X7, X8, [sp], #0x10 + LDP X5, X6, [sp], #0x10 + + RET + + /* _less_than_128 */ +2: CMP X2, #32 + B.LT 9f // _less_than_32 + LDR Q6, =0x06df0000000000002d56000000000000 + + LSL X0, X0, #48 + LDR Q10, =0x0 + MOV V10.D[1], X0 + LDR Q13, [X1], #0x10 + REV64 V13.16B, V13.16B + EXT V13.16B, V13.16B, V13.16B, #8 + + EOR V13.16B, V13.16B, V10.16B + + SUB X2, X2, #32 + B 4b + + /* _less_than_32 */ +9: CMP X2, #0 + B.EQ 99b // _cleanup + LSL X0, X0, #48 + LDR Q10,=0x0 + MOV V10.D[1], X0 + + CMP X2, #16 + B.EQ 10f // _exact_16_left + B.LE 11f // _less_than_16_left + LDR Q13, [X1], #0x10 + + REV64 V13.16B, V13.16B + EXT V13.16B, V13.16B, V13.16B, #8 + + EOR V13.16B, V13.16B, V10.16B + SUB X2, X2, #16 + LDR Q6, =0x06df0000000000002d56000000000000 + B 6b // _get_last_two_xmms + + /* 
_less_than_16_left */ +11: CMP X2, #4 + B.LT 13f // _only_less_than_4 + + /* backup the length of data, we used in _less_than_2_left */ + MOV X8, X2 + CMP X2, #8 + B.LT 14f // _less_than_8_left + + LDR X14, [X1], #8 + /* push the data to stack, we backup the data to V10 */ + STR X14, [sp, #0] + SUB X2, X2, #8 + ADD X11, X11, #8 + + /* _less_than_8_left */ +14: CMP X2, #4 + B.LT 15f // _less_than_4_left + + /* get 32bit data */ + LDR W5, [X1], #4 + + /* push the data to stack */ + STR W5, [sp, X11] + SUB X2, X2, #4 + ADD X11, X11, #4 + + /* _less_than_4_left */ +15: CMP X2, #2 + B.LT 16f // _less_than_2_left + + /* get 16bits data */ + LDRH W6, [X1], #2 + + /* push the data to stack */ + STRH W6, [sp, X11] + SUB X2, X2, #2 + ADD X11, X11, #2 + + /* _less_than_2_left */ +16: + /* get 8bits data */ + LDRB W7, [X1], #1 + STRB W7, [sp, X11] + ADD X11, X11, #1 + + /* POP data from stack, store to V13 */ + LDR Q13, [sp] + MOVI D14, #0 + REV64 V13.16B, V13.16B + MOV V8.16B, V13.16B + MOV V13.D[1], V8.D[0] + MOV V13.D[0], V8.D[1] + + EOR V13.16B, V13.16B, V10.16B + CMP X8, #15 + B.EQ 80f + CMP X8, #14 + B.EQ 81f + CMP X8, #13 + B.EQ 82f + CMP X8, #12 + B.EQ 83f + CMP X8, #11 + B.EQ 84f + CMP X8, #10 + B.EQ 85f + CMP X8, #9 + B.EQ 86f + CMP X8, #8 + B.EQ 87f + CMP X8, #7 + B.EQ 88f + CMP X8, #6 + B.EQ 89f + CMP X8, #5 + B.EQ 90f + CMP X8, #4 + B.EQ 91f + CMP X8, #3 + B.EQ 92f + CMP X8, #2 + B.EQ 93f + CMP X8, #1 + B.EQ 94f + CMP X8, #0 + B.EQ 95f + +80: + EXT V13.16B, V13.16B, V13.16B, #1 + MOV V13.B[15], V14.B[0] + B 5b + +81: + EXT V13.16B, V13.16B, V13.16B, #2 + MOV V13.H[7], V14.H[0] + B 5b + +82: + EXT V13.16B, V13.16B, V13.16B, #3 + MOV V13.H[7], V14.H[0] + MOV V13.B[13], V14.B[0] + B 5b +83: + + EXT V13.16B, V13.16B, V13.16B, #4 + MOV V13.S[3], V14.S[0] + B 5b + +84: + EXT V13.16B, V13.16B, V13.16B, #5 + MOV V13.S[3], V14.S[0] + MOV V13.B[11], V14.B[0] + B 5b + +85: + EXT V13.16B, V13.16B, V13.16B, #6 + MOV V13.S[3], V14.S[0] + MOV V13.H[5], V14.H[0] + B 5b + +86: + 
EXT V13.16B, V13.16B, V13.16B, #7 + MOV V13.S[3], V14.S[0] + MOV V13.H[5], V14.H[0] + MOV V13.B[9], V14.B[0] + B 5b + +87: + MOV V13.D[0], V13.D[1] + MOV V13.D[1], V14.D[0] + B 5b + +88: + EXT V13.16B, V13.16B, V13.16B, #9 + MOV V13.D[1], V14.D[0] + MOV V13.B[7], V14.B[0] + B 5b + +89: + EXT V13.16B, V13.16B, V13.16B, #10 + MOV V13.D[1], V14.D[0] + MOV V13.H[3], V14.H[0] + B 5b + +90: + EXT V13.16B, V13.16B, V13.16B, #11 + MOV V13.D[1], V14.D[0] + MOV V13.H[3], V14.H[0] + MOV V13.B[5], V14.B[0] + B 5b + +91: + MOV V13.S[0], V13.S[3] + MOV V13.D[1], V14.D[0] + MOV V13.S[1], V14.S[0] + B 5b + +92: + EXT V13.16B, V13.16B, V13.16B, #13 + MOV V13.D[1], V14.D[0] + MOV V13.S[1], V14.S[0] + MOV V13.B[3], V14.B[0] + B 5b + +93: + MOV V15.H[0], V13.H[7] + MOV V13.16B, V14.16B + MOV V13.H[0], V15.H[0] + B 5b + +94: + MOV V15.B[0], V13.B[15] + MOV V13.16B, V14.16B + MOV V13.B[0], V15.B[0] + B 5b + +95: + LDR Q13,=0x0 + B 5b // _128_done + + /* _exact_16_left */ +10: + LD1 { V13.2D }, [X1], #0x10 + + REV64 V13.16B, V13.16B + EXT V13.16B, V13.16B, V13.16B, #8 + EOR V13.16B, V13.16B, V10.16B + B 5b // _128_done + + /* _only_less_than_4 */ +13: CMP X2, #3 + MOVI D14, #0 + B.LT 17f //_only_less_than_3 + + LDR S13, [X1], #4 + MOV V13.B[15], V13.B[0] + MOV V13.B[14], V13.B[1] + MOV V13.B[13], V13.B[2] + MOV V13.S[0], V13.S[1] + + EOR V13.16B, V13.16B, V10.16B + + EXT V13.16B, V13.16B, V13.16B, #5 + + MOV V13.S[3], V14.S[0] + MOV V13.B[11], V14.B[0] + + B 7b // _barrett + /* _only_less_than_3 */ +17: + CMP X2, #2 + B.LT 18f // _only_less_than_2 + + LDR H13, [X1], #2 + MOV V13.B[15], V13.B[0] + MOV V13.B[14], V13.B[1] + MOV V13.H[0], V13.H[1] + + EOR V13.16B, V13.16B, V10.16B + + EXT V13.16B, V13.16B, V13.16B, #6 + MOV V13.S[3], V14.S[0] + MOV V13.H[5], V14.H[0] + + B 7b // _barrett + + /* _only_less_than_2 */ +18: + LDRB W7, [X1], #1 + LDR Q13, = 0x0 + MOV V13.B[15], W7 + + EOR V13.16B, V13.16B, V10.16B + + EXT V13.16B, V13.16B, V13.16B, #7 + MOV V13.S[3], V14.S[0] + MOV V13.H[5], 
V14.H[0] + MOV V13.B[9], V14.B[0] + + B 7b // _barrett diff --git a/arch/arm64/crypto/crct10dif-neon_glue.c b/arch/arm64/crypto/crct10dif-neon_glue.c new file mode 100644 index 000000000000..e0c4a9acee27 --- /dev/null +++ b/arch/arm64/crypto/crct10dif-neon_glue.c @@ -0,0 +1,116 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2016-2017 Hisilicon Limited. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + + +#include <linux/types.h> +#include <linux/module.h> +#include <linux/crc-t10dif.h> +#include <crypto/internal/hash.h> +#include <linux/init.h> +#include <linux/string.h> +#include <linux/kernel.h> + +asmlinkage __u16 crc_t10dif_neon(__u16 crc, const unsigned char *buf, + size_t len); + +struct chksum_desc_ctx { + __u16 crc; +}; + +/* + * Steps through buffer one byte at at time, calculates reflected + * crc using table. 
+ */ + +static int chksum_init(struct shash_desc *desc) +{ + struct chksum_desc_ctx *ctx = shash_desc_ctx(desc); + + ctx->crc = 0; + + return 0; +} + +static int chksum_update(struct shash_desc *desc, const u8 *data, + unsigned int length) +{ + struct chksum_desc_ctx *ctx = shash_desc_ctx(desc); + + ctx->crc = crc_t10dif_neon(ctx->crc, data, length); + return 0; +} + +static int chksum_final(struct shash_desc *desc, u8 *out) +{ + struct chksum_desc_ctx *ctx = shash_desc_ctx(desc); + + *(__u16 *)out = ctx->crc; + return 0; +} + +static int __chksum_finup(__u16 *crcp, const u8 *data, unsigned int len, + u8 *out) +{ + *(__u16 *)out = crc_t10dif_neon(*crcp, data, len); + return 0; +} + +static int chksum_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + struct chksum_desc_ctx *ctx = shash_desc_ctx(desc); + + return __chksum_finup(&ctx->crc, data, len, out); +} + +static int chksum_digest(struct shash_desc *desc, const u8 *data, + unsigned int length, u8 *out) +{ + struct chksum_desc_ctx *ctx = shash_desc_ctx(desc); + + return __chksum_finup(&ctx->crc, data, length, out); +} + +static struct shash_alg alg = { + .digestsize = CRC_T10DIF_DIGEST_SIZE, + .init = chksum_init, + .update = chksum_update, + .final = chksum_final, + .finup = chksum_finup, + .digest = chksum_digest, + .descsize = sizeof(struct chksum_desc_ctx), + .base = { + .cra_name = "crct10dif", + .cra_driver_name = "crct10dif-neon", + .cra_priority = 200, + .cra_blocksize = CRC_T10DIF_BLOCK_SIZE, + .cra_module = THIS_MODULE, + } +}; + +static int __init crct10dif_arm64_mod_init(void) +{ + return crypto_register_shash(&alg); +} + +static void __exit crct10dif_arm64_mod_fini(void) +{ + crypto_unregister_shash(&alg); +} + +module_init(crct10dif_arm64_mod_init); +module_exit(crct10dif_arm64_mod_fini); + +MODULE_AUTHOR("YueHaibing yuehaibing@huawei.com"); +MODULE_DESCRIPTION("T10 DIF CRC calculation accelerated with ARM64 NEON instruction."); +MODULE_LICENSE("GPL"); + 
+MODULE_ALIAS_CRYPTO("crct10dif"); +MODULE_ALIAS_CRYPTO("crct10dif-neon");
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Add the svm driver framework for ascend, which supports both dts and acpi probing.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/Kconfig | 10 +++ drivers/char/Makefile | 1 + drivers/char/svm.c | 140 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 151 insertions(+) create mode 100644 drivers/char/svm.c
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig index 4f451477281b..c80a5c641634 100644 --- a/drivers/char/Kconfig +++ b/drivers/char/Kconfig @@ -478,6 +478,16 @@ config PIN_MEMORY_DEV help pin memory driver
+config HISI_SVM + bool "Hisilicon svm driver" + depends on ARM64 && ARM_SMMU_V3 && MMU_NOTIFIER + default m + help + This driver provides character-level access to Hisilicon + SVM chipset. Typically, you can bind a task to the + svm and share the virtual memory with hisilicon svm device. + When in doubt, say "N". + endmenu
config RANDOM_TRUST_CPU diff --git a/drivers/char/Makefile b/drivers/char/Makefile index 71d76fd62692..362d4a9cd4cf 100644 --- a/drivers/char/Makefile +++ b/drivers/char/Makefile @@ -48,3 +48,4 @@ obj-$(CONFIG_XILLYBUS) += xillybus/ obj-$(CONFIG_POWERNV_OP_PANEL) += powernv-op-panel.o obj-$(CONFIG_ADI) += adi.o obj-$(CONFIG_PIN_MEMORY_DEV) += pin_memory.o +obj-$(CONFIG_HISI_SVM) += svm.o diff --git a/drivers/char/svm.c b/drivers/char/svm.c new file mode 100644 index 000000000000..d095b35c5c93 --- /dev/null +++ b/drivers/char/svm.c @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2017-2018 Hisilicon Limited. + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include <asm/esr.h> +#include <linux/mmu_context.h> + +#include <linux/delay.h> +#include <linux/err.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/iommu.h> +#include <linux/miscdevice.h> +#include <linux/mman.h> +#include <linux/mmu_notifier.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/of_address.h> +#include <linux/of_device.h> +#include <linux/platform_device.h> +#include <linux/ptrace.h> +#include <linux/security.h> +#include <linux/slab.h> +#include <linux/uaccess.h> +#include <linux/sched.h> +#include <linux/hugetlb.h> +#include <linux/sched/mm.h> +#include <linux/msi.h> +#include <linux/acpi.h> + +#define SVM_DEVICE_NAME "svm" + +struct core_device { + struct device dev; + struct iommu_group *group; + struct iommu_domain *domain; + u8 smmu_bypass; + struct list_head entry; +}; + +struct svm_device { + unsigned long long id; + struct miscdevice miscdev; + struct device *dev; + phys_addr_t l2buff; + unsigned long l2size; +}; + +struct svm_bind_process { + pid_t vpid; + u64 ttbr; + u64 tcr; + int pasid; + u32 flags; 
+#define SVM_BIND_PID (1 << 0) +}; + +/* + *svm_process is released in svm_notifier_release() when mm refcnt + *goes down zero. We should access svm_process only in the context + *where mm_struct is valid, which means we should always get mm + *refcnt first. + */ +struct svm_process { + struct pid *pid; + struct mm_struct *mm; + unsigned long asid; + struct rb_node rb_node; + struct mmu_notifier notifier; + /* For postponed release */ + struct rcu_head rcu; + int pasid; + struct mutex mutex; + struct rb_root sdma_list; + struct svm_device *sdev; + struct iommu_sva *sva; +}; + +static int svm_open(struct inode *inode, struct file *file) +{ + return 0; +} + +static long svm_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + /*TODO add svm ioctl*/ + return 0; +} +static const struct file_operations svm_fops = { + .owner = THIS_MODULE, + .open = svm_open, + .unlocked_ioctl = svm_ioctl, +}; + +static int svm_device_probe(struct platform_device *pdev) +{ + /*TODO svm device init*/ + return 0; +} + +static int svm_device_remove(struct platform_device *pdev) +{ + /*TODO svm device remove*/ + return 0; +} + +static const struct acpi_device_id svm_acpi_match[] = { + { "HSVM1980", 0}, + { } +}; +MODULE_DEVICE_TABLE(acpi, svm_acpi_match); + +static const struct of_device_id svm_of_match[] = { + { .compatible = "hisilicon,svm" }, + { } +}; +MODULE_DEVICE_TABLE(of, svm_of_match); + +/*svm acpi probe and remove*/ +static struct platform_driver svm_driver = { + .probe = svm_device_probe, + .remove = svm_device_remove, + .driver = { + .name = SVM_DEVICE_NAME, + .acpi_match_table = ACPI_PTR(svm_acpi_match), + .of_match_table = svm_of_match, + }, +}; + +module_platform_driver(svm_driver); + +MODULE_DESCRIPTION("Hisilicon SVM driver"); +MODULE_AUTHOR("Fang Lijun fanglijun3@huawei.com"); +MODULE_LICENSE("GPL v2");
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Init and remove the svm device, and add empty placeholder functions to init and remove the cores.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 108 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 105 insertions(+), 3 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index d095b35c5c93..f246e33e0e17 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -35,6 +35,11 @@
#define SVM_DEVICE_NAME "svm"
+static int probe_index; +static LIST_HEAD(child_list); +static DECLARE_RWSEM(svm_sem); +static struct mutex svm_process_mutex; + struct core_device { struct device dev; struct iommu_group *group; @@ -98,15 +103,112 @@ static const struct file_operations svm_fops = { .unlocked_ioctl = svm_ioctl, };
-static int svm_device_probe(struct platform_device *pdev) +static int svm_remove_core(struct device *dev, void *data) +{ + /* TODO remove core */ + return 0; +} + +static int svm_acpi_init_core(struct svm_device *sdev) +{ + /* TODO acpi init core */ + return 0; +} + +static int svm_dt_init_core(struct svm_device *sdev, struct device_node *np) { - /*TODO svm device init*/ + /* TODO dt init core */ return 0; }
+static int svm_device_probe(struct platform_device *pdev) +{ + int err = -1; + struct device *dev = &pdev->dev; + struct svm_device *sdev = NULL; + struct device_node *np = dev->of_node; + int alias_id; + + if (acpi_disabled && np == NULL) + return -ENODEV; + + if (!dev->bus) { + dev_dbg(dev, "this dev bus is NULL\n"); + return -EPROBE_DEFER; + } + + if (!dev->bus->iommu_ops) { + dev_dbg(dev, "defer probe svm device\n"); + return -EPROBE_DEFER; + } + + sdev = devm_kzalloc(dev, sizeof(*sdev), GFP_KERNEL); + if (sdev == NULL) + return -ENOMEM; + + if (!acpi_disabled) { + err = device_property_read_u64(dev, "svmid", &sdev->id); + if (err) { + dev_err(dev, "failed to get this svm device id\n"); + return err; + } + } else { + alias_id = of_alias_get_id(np, "svm"); + if (alias_id < 0) + sdev->id = probe_index; + else + sdev->id = alias_id; + } + + sdev->dev = dev; + sdev->miscdev.minor = MISC_DYNAMIC_MINOR; + sdev->miscdev.fops = &svm_fops; + sdev->miscdev.name = devm_kasprintf(dev, GFP_KERNEL, + SVM_DEVICE_NAME"%llu", sdev->id); + if (sdev->miscdev.name == NULL) + return -ENOMEM; + + dev_set_drvdata(dev, sdev); + err = misc_register(&sdev->miscdev); + if (err) { + dev_err(dev, "Unable to register misc device\n"); + return err; + } + + if (!acpi_disabled) { + err = svm_acpi_init_core(sdev); + if (err) { + dev_err(dev, "failed to init acpi cores\n"); + goto err_unregister_misc; + } + } else { + err = svm_dt_init_core(sdev, np); + if (err) { + dev_err(dev, "failed to init dt cores\n"); + goto err_unregister_misc; + } + + probe_index++; + } + + mutex_init(&svm_process_mutex); + + return err; + +err_unregister_misc: + misc_deregister(&sdev->miscdev); + + return err; +} + static int svm_device_remove(struct platform_device *pdev) { - /*TODO svm device remove*/ + struct device *dev = &pdev->dev; + struct svm_device *sdev = dev_get_drvdata(dev); + + device_for_each_child(sdev->dev, NULL, svm_remove_core); + misc_deregister(&sdev->miscdev); + return 0; }
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
The svm driver needs to init its child devices, so add acpi and dts functions to read the child devices of the svm device.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 283 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 279 insertions(+), 4 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index f246e33e0e17..6fb80f4e560c 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -35,6 +35,7 @@
#define SVM_DEVICE_NAME "svm"
+#define CORE_SID 0 static int probe_index; static LIST_HEAD(child_list); static DECLARE_RWSEM(svm_sem); @@ -86,6 +87,10 @@ struct svm_process { struct iommu_sva *sva; };
+static struct bus_type svm_bus_type = { + .name = "svm_bus", +}; + static int svm_open(struct inode *inode, struct file *file) { return 0; @@ -103,22 +108,292 @@ static const struct file_operations svm_fops = { .unlocked_ioctl = svm_ioctl, };
+static inline struct core_device *to_core_device(struct device *d) +{ + return container_of(d, struct core_device, dev); +} + +static void cdev_device_release(struct device *dev) +{ + struct core_device *cdev = to_core_device(dev); + + if (!acpi_disabled) + list_del(&cdev->entry); + + kfree(cdev); +} + static int svm_remove_core(struct device *dev, void *data) { - /* TODO remove core */ + struct core_device *cdev = to_core_device(dev); + + if (!cdev->smmu_bypass) { + iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA); + iommu_detach_group(cdev->domain, cdev->group); + iommu_group_put(cdev->group); + iommu_domain_free(cdev->domain); + } + + device_unregister(&cdev->dev); + + return 0; +} + +#ifdef CONFIG_ACPI +static int svm_acpi_add_core(struct svm_device *sdev, + struct acpi_device *children, int id) +{ + int err; + struct core_device *cdev = NULL; + char *name = NULL; + enum dev_dma_attr attr; + + name = devm_kasprintf(sdev->dev, GFP_KERNEL, "svm_child_dev%d", id); + if (name == NULL) + return -ENOMEM; + + cdev = kzalloc(sizeof(*cdev), GFP_KERNEL); + if (cdev == NULL) + return -ENOMEM; + cdev->dev.fwnode = &children->fwnode; + cdev->dev.parent = sdev->dev; + cdev->dev.bus = &svm_bus_type; + cdev->dev.release = cdev_device_release; + cdev->smmu_bypass = 0; + list_add(&cdev->entry, &child_list); + dev_set_name(&cdev->dev, "%s", name); + + err = device_register(&cdev->dev); + if (err) { + dev_info(&cdev->dev, "core_device register failed\n"); + list_del(&cdev->entry); + kfree(cdev); + return err; + } + + attr = acpi_get_dma_attr(children); + if (attr != DEV_DMA_NOT_SUPPORTED) { + err = acpi_dma_configure(&cdev->dev, attr); + if (err) { + dev_dbg(&cdev->dev, "acpi_dma_configure failed\n"); + return err; + } + } + + err = acpi_dev_prop_read_single(children, "hisi,smmu-bypass", + DEV_PROP_U8, &cdev->smmu_bypass); + if (err) + dev_info(&children->dev, "read smmu bypass failed\n"); + + cdev->group = iommu_group_get(&cdev->dev); + if (IS_ERR_OR_NULL(cdev->group)) { + 
dev_err(&cdev->dev, "smmu is not right configured\n"); + return -ENXIO; + } + + cdev->domain = iommu_domain_alloc(sdev->dev->bus); + if (cdev->domain == NULL) { + dev_info(&cdev->dev, "failed to alloc domain\n"); + return -ENOMEM; + } + + err = iommu_attach_group(cdev->domain, cdev->group); + if (err) { + dev_err(&cdev->dev, "failed group to domain\n"); + return err; + } + + err = iommu_dev_enable_feature(&cdev->dev, IOMMU_DEV_FEAT_IOPF); + if (err) { + dev_err(&cdev->dev, "failed to enable iopf feature, %d\n", err); + return err; + } + + err = iommu_dev_enable_feature(&cdev->dev, IOMMU_DEV_FEAT_SVA); + if (err) { + dev_err(&cdev->dev, "failed to enable sva feature\n"); + return err; + } + return 0; }
static int svm_acpi_init_core(struct svm_device *sdev) { - /* TODO acpi init core */ + int err = 0; + struct device *dev = sdev->dev; + struct acpi_device *adev = ACPI_COMPANION(sdev->dev); + struct acpi_device *cdev = NULL; + int id = 0; + + down_write(&svm_sem); + if (!svm_bus_type.iommu_ops) { + err = bus_register(&svm_bus_type); + if (err) { + up_write(&svm_sem); + dev_err(dev, "failed to register svm_bus_type\n"); + return err; + } + + err = bus_set_iommu(&svm_bus_type, dev->bus->iommu_ops); + if (err) { + up_write(&svm_sem); + dev_err(dev, "failed to set iommu for svm_bus_type\n"); + goto err_unregister_bus; + } + } else if (svm_bus_type.iommu_ops != dev->bus->iommu_ops) { + err = -EBUSY; + up_write(&svm_sem); + dev_err(dev, "iommu_ops configured, but changed!\n"); + return err; + } + up_write(&svm_sem); + + list_for_each_entry(cdev, &adev->children, node) { + err = svm_acpi_add_core(sdev, cdev, id++); + if (err) + device_for_each_child(dev, NULL, svm_remove_core); + } + + return err; + +err_unregister_bus: + bus_unregister(&svm_bus_type); + + return err; +} +#else +static int svm_acpi_init_core(struct svm_device *sdev) { return 0; } +#endif + +static int svm_of_add_core(struct svm_device *sdev, struct device_node *np) +{ + int err; + struct resource res; + struct core_device *cdev = NULL; + char *name = NULL; + + name = devm_kasprintf(sdev->dev, GFP_KERNEL, "svm%llu_%s", + sdev->id, np->name); + if (name == NULL) + return -ENOMEM; + + cdev = kzalloc(sizeof(*cdev), GFP_KERNEL); + if (cdev == NULL) + return -ENOMEM; + + cdev->dev.of_node = np; + cdev->dev.parent = sdev->dev; + cdev->dev.bus = &svm_bus_type; + cdev->dev.release = cdev_device_release; + cdev->smmu_bypass = of_property_read_bool(np, "hisi,smmu_bypass"); + dev_set_name(&cdev->dev, "%s", name); + + err = device_register(&cdev->dev); + if (err) { + dev_info(&cdev->dev, "core_device register failed\n"); + kfree(cdev); + return err; + } + + err = of_dma_configure(&cdev->dev, np, true); + if (err) { + 
dev_dbg(&cdev->dev, "of_dma_configure failed\n"); + return err; + } + + err = of_address_to_resource(np, 0, &res); + if (err) { + dev_info(&cdev->dev, "no reg, FW should install the sid\n"); + } else { + /* If the reg specified, install sid for the core */ + void __iomem *core_base = NULL; + int sid = cdev->dev.iommu->fwspec->ids[0]; + + core_base = ioremap(res.start, resource_size(&res)); + if (core_base == NULL) { + dev_err(&cdev->dev, "ioremap failed\n"); + return -ENOMEM; + } + + writel_relaxed(sid, core_base + CORE_SID); + iounmap(core_base); + } + + cdev->group = iommu_group_get(&cdev->dev); + if (IS_ERR_OR_NULL(cdev->group)) { + dev_err(&cdev->dev, "smmu is not right configured\n"); + return -ENXIO; + } + + cdev->domain = iommu_domain_alloc(sdev->dev->bus); + if (cdev->domain == NULL) { + dev_info(&cdev->dev, "failed to alloc domain\n"); + return -ENOMEM; + } + + err = iommu_attach_group(cdev->domain, cdev->group); + if (err) { + dev_err(&cdev->dev, "failed group to domain\n"); + return err; + } + + err = iommu_dev_enable_feature(&cdev->dev, IOMMU_DEV_FEAT_IOPF); + if (err) { + dev_err(&cdev->dev, "failed to enable iopf feature, %d\n", err); + return err; + } + + err = iommu_dev_enable_feature(&cdev->dev, IOMMU_DEV_FEAT_SVA); + if (err) { + dev_err(&cdev->dev, "failed to enable sva feature, %d\n", err); + return err; + } + return 0; }
static int svm_dt_init_core(struct svm_device *sdev, struct device_node *np) { - /* TODO dt init core */ - return 0; + int err = 0; + struct device_node *child = NULL; + struct device *dev = sdev->dev; + + down_write(&svm_sem); + if (svm_bus_type.iommu_ops == NULL) { + err = bus_register(&svm_bus_type); + if (err) { + up_write(&svm_sem); + dev_err(dev, "failed to register svm_bus_type\n"); + return err; + } + + err = bus_set_iommu(&svm_bus_type, dev->bus->iommu_ops); + if (err) { + up_write(&svm_sem); + dev_err(dev, "failed to set iommu for svm_bus_type\n"); + goto err_unregister_bus; + } + } else if (svm_bus_type.iommu_ops != dev->bus->iommu_ops) { + err = -EBUSY; + up_write(&svm_sem); + dev_err(dev, "iommu_ops configured, but changed!\n"); + return err; + } + up_write(&svm_sem); + + for_each_available_child_of_node(np, child) { + err = svm_of_add_core(sdev, child); + if (err) + device_for_each_child(dev, NULL, svm_remove_core); + } + + return err; + +err_unregister_bus: + bus_unregister(&svm_bus_type); + + return err; }
static int svm_device_probe(struct platform_device *pdev)
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Add the svm bind ioctl along with several supporting functions, including the svm_process alloc, svm_process release, and svm_process kref interfaces.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 272 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 270 insertions(+), 2 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index 6fb80f4e560c..2e964c604d1a 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -34,6 +34,9 @@ #include <linux/acpi.h>
#define SVM_DEVICE_NAME "svm" +#define ASID_SHIFT 48 + +#define SVM_IOCTL_PROCESS_BIND 0xffff
#define CORE_SID 0 static int probe_index; @@ -87,6 +90,222 @@ struct svm_process { struct iommu_sva *sva; };
+static char *svm_cmd_to_string(unsigned int cmd) +{ + switch (cmd) { + case SVM_IOCTL_PROCESS_BIND: + return "bind"; + default: + return "unsupported"; + } + + return NULL; +} + +static struct svm_process *find_svm_process(unsigned long asid) +{ + /* TODO */ + return 0; +} + +static void insert_svm_process(struct svm_process *process) +{ + /* TODO */ +} + +static void delete_svm_process(struct svm_process *process) +{ + /* TODO */ +} + +static struct svm_device *file_to_sdev(struct file *file) +{ + return container_of(file->private_data, + struct svm_device, miscdev); +} + +static void svm_dt_bind_cores(struct svm_process *process) +{ + /* TODO */ +} + +static void svm_acpi_bind_cores(struct svm_process *process) +{ + /* TODO */ +} + +static void svm_process_free(struct mmu_notifier *mn) +{ + struct svm_process *process = NULL; + + process = container_of(mn, struct svm_process, notifier); + arm64_mm_context_put(process->mm); + kfree(process); +} + +static void svm_process_release(struct svm_process *process) +{ + delete_svm_process(process); + put_pid(process->pid); + + mmu_notifier_put(&process->notifier); +} + +static void svm_notifier_release(struct mmu_notifier *mn, + struct mm_struct *mm) +{ + struct svm_process *process = NULL; + + process = container_of(mn, struct svm_process, notifier); + + /* + * No need to call svm_unbind_cores(), as iommu-sva will do the + * unbind in its mm_notifier callback. 
+ */ + + mutex_lock(&svm_process_mutex); + svm_process_release(process); + mutex_unlock(&svm_process_mutex); +} + +static struct mmu_notifier_ops svm_process_mmu_notifier = { + .release = svm_notifier_release, + .free_notifier = svm_process_free, +}; + +static struct svm_process * +svm_process_alloc(struct svm_device *sdev, struct pid *pid, + struct mm_struct *mm, unsigned long asid) +{ + struct svm_process *process = kzalloc(sizeof(*process), GFP_ATOMIC); + + if (!process) + return ERR_PTR(-ENOMEM); + + process->sdev = sdev; + process->pid = pid; + process->mm = mm; + process->asid = asid; + process->sdma_list = RB_ROOT; //lint !e64 + mutex_init(&process->mutex); + process->notifier.ops = &svm_process_mmu_notifier; + + return process; +} + +static struct task_struct *svm_get_task(struct svm_bind_process params) +{ + struct task_struct *task = NULL; + + if (params.flags & ~SVM_BIND_PID) + return ERR_PTR(-EINVAL); + + if (params.flags & SVM_BIND_PID) { + struct mm_struct *mm = NULL; + + rcu_read_lock(); + task = find_task_by_vpid(params.vpid); + if (task) + get_task_struct(task); + rcu_read_unlock(); + if (task == NULL) + return ERR_PTR(-ESRCH); + + /* check the permission */ + mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS); + if (IS_ERR_OR_NULL(mm)) { + pr_err("cannot access mm\n"); + put_task_struct(task); + return ERR_PTR(-ESRCH); + } + + mmput(mm); + } else { + get_task_struct(current); + task = current; + } + + return task; +} + +static int svm_process_bind(struct task_struct *task, + struct svm_device *sdev, u64 *ttbr, u64 *tcr, int *pasid) +{ + int err; + unsigned long asid; + struct pid *pid = NULL; + struct svm_process *process = NULL; + struct mm_struct *mm = NULL; + + if ((ttbr == NULL) || (tcr == NULL) || (pasid == NULL)) + return -EINVAL; + + pid = get_task_pid(task, PIDTYPE_PID); + if (pid == NULL) + return -EINVAL; + + mm = get_task_mm(task); + if (!mm) { + err = -EINVAL; + goto err_put_pid; + } + + asid = arm64_mm_context_get(mm); + if (!asid) { 
+ err = -ENOSPC; + goto err_put_mm; + } + + /* If a svm_process already exists, use it */ + mutex_lock(&svm_process_mutex); + process = find_svm_process(asid); + if (process == NULL) { + process = svm_process_alloc(sdev, pid, mm, asid); + if (IS_ERR(process)) { + err = PTR_ERR(process); + mutex_unlock(&svm_process_mutex); + goto err_put_mm_context; + } + err = mmu_notifier_register(&process->notifier, mm); + if (err) { + mutex_unlock(&svm_process_mutex); + goto err_free_svm_process; + } + + insert_svm_process(process); + + if (acpi_disabled) + svm_dt_bind_cores(process); + else + svm_acpi_bind_cores(process); + + mutex_unlock(&svm_process_mutex); + } else { + mutex_unlock(&svm_process_mutex); + arm64_mm_context_put(mm); + put_pid(pid); + } + + + *ttbr = virt_to_phys(mm->pgd) | asid << ASID_SHIFT; + *tcr = read_sysreg(tcr_el1); + *pasid = process->pasid; + + mmput(mm); + return 0; + +err_free_svm_process: + kfree(process); +err_put_mm_context: + arm64_mm_context_put(mm); +err_put_mm: + mmput(mm); +err_put_pid: + put_pid(pid); + + return err; +} + static struct bus_type svm_bus_type = { .name = "svm_bus", }; @@ -99,9 +318,58 @@ static int svm_open(struct inode *inode, struct file *file) static long svm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { - /*TODO add svm ioctl*/ - return 0; + int err = -EINVAL; + struct svm_bind_process params; + struct svm_device *sdev = file_to_sdev(file); + struct task_struct *task; + + if (!arg) + return -EINVAL; + + if (cmd == SVM_IOCTL_PROCESS_BIND) { + err = copy_from_user(&params, (void __user *)arg, + sizeof(params)); + if (err) { + dev_err(sdev->dev, "fail to copy params %d\n", err); + return -EFAULT; + } + } + + switch (cmd) { + case SVM_IOCTL_PROCESS_BIND: + task = svm_get_task(params); + if (IS_ERR(task)) { + dev_err(sdev->dev, "failed to get task\n"); + return PTR_ERR(task); + } + + err = svm_process_bind(task, sdev, &params.ttbr, + &params.tcr, &params.pasid); + if (err) { + put_task_struct(task); + dev_err(sdev->dev, "failed to bind task %d\n", err); + return err; + } + + put_task_struct(task); + err = copy_to_user((void __user *)arg, &params, + sizeof(params)); + if (err) { + dev_err(sdev->dev, "failed to copy to user!\n"); + return -EFAULT; + } + break; + default: + err = -EINVAL; + } + + if (err) + dev_err(sdev->dev, "%s: %s failed err = %d\n", __func__, + svm_cmd_to_string(cmd), err); + + return err; } + static const struct file_operations svm_fops = { .owner = THIS_MODULE, .open = svm_open,
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Use an rbtree to manage svm_process structures, and implement the find, insert, and delete operations on them.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 42 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 38 insertions(+), 4 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index 2e964c604d1a..da5c2a2be1c3 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -42,6 +42,7 @@ static int probe_index; static LIST_HEAD(child_list); static DECLARE_RWSEM(svm_sem); +static struct rb_root svm_process_root = RB_ROOT; static struct mutex svm_process_mutex;
struct core_device { @@ -104,18 +105,51 @@ static char *svm_cmd_to_string(unsigned int cmd)
static struct svm_process *find_svm_process(unsigned long asid) { - /* TODO */ - return 0; + struct rb_node *node = svm_process_root.rb_node; + + while (node) { + struct svm_process *process = NULL; + + process = rb_entry(node, struct svm_process, rb_node); + if (asid < process->asid) + node = node->rb_left; + else if (asid > process->asid) + node = node->rb_right; + else + return process; + } + + return NULL; }
static void insert_svm_process(struct svm_process *process) { - /* TODO */ + struct rb_node **p = &svm_process_root.rb_node; + struct rb_node *parent = NULL; + + while (*p) { + struct svm_process *tmp_process = NULL; + + parent = *p; + tmp_process = rb_entry(parent, struct svm_process, rb_node); + if (process->asid < tmp_process->asid) + p = &(*p)->rb_left; + else if (process->asid > tmp_process->asid) + p = &(*p)->rb_right; + else { + WARN_ON_ONCE("asid already in the tree"); + return; + } + } + + rb_link_node(&process->rb_node, parent, p); + rb_insert_color(&process->rb_node, &svm_process_root); }
static void delete_svm_process(struct svm_process *process) { - /* TODO */ + rb_erase(&process->rb_node, &svm_process_root); + RB_CLEAR_NODE(&process->rb_node); }
static struct svm_device *file_to_sdev(struct file *file)
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Implement svm core binding, so that a process can call the ioctl to bind itself to the device.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 71 +++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 64 insertions(+), 7 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index da5c2a2be1c3..213134ae76c5 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -158,14 +158,76 @@ static struct svm_device *file_to_sdev(struct file *file) struct svm_device, miscdev); }
+static inline struct core_device *to_core_device(struct device *d) +{ + return container_of(d, struct core_device, dev); +} + +static int svm_acpi_bind_core(struct core_device *cdev, void *data) +{ + struct task_struct *task = NULL; + struct svm_process *process = data; + + if (cdev->smmu_bypass) + return 0; + + task = get_pid_task(process->pid, PIDTYPE_PID); + if (!task) { + pr_err("failed to get task_struct\n"); + return -ESRCH; + } + + process->sva = iommu_sva_bind_device(&cdev->dev, task->mm, NULL); + if (!process->sva) { + pr_err("failed to bind device\n"); + return PTR_ERR(process->sva); + } + + process->pasid = task->mm->pasid; + put_task_struct(task); + + return 0; +} + +static int svm_dt_bind_core(struct device *dev, void *data) +{ + struct task_struct *task = NULL; + struct svm_process *process = data; + struct core_device *cdev = to_core_device(dev); + + if (cdev->smmu_bypass) + return 0; + + task = get_pid_task(process->pid, PIDTYPE_PID); + if (!task) { + pr_err("failed to get task_struct\n"); + return -ESRCH; + } + + process->sva = iommu_sva_bind_device(dev, task->mm, NULL); + if (!process->sva) { + pr_err("failed to bind device\n"); + return PTR_ERR(process->sva); + } + + process->pasid = task->mm->pasid; + put_task_struct(task); + + return 0; +} + static void svm_dt_bind_cores(struct svm_process *process) { - /* TODO */ + device_for_each_child(process->sdev->dev, process, svm_dt_bind_core); }
static void svm_acpi_bind_cores(struct svm_process *process) { - /* TODO */ + struct core_device *pos = NULL; + + list_for_each_entry(pos, &child_list, entry) { + svm_acpi_bind_core(pos, process); + } }
static void svm_process_free(struct mmu_notifier *mn) @@ -410,11 +472,6 @@ static const struct file_operations svm_fops = { .unlocked_ioctl = svm_ioctl, };
-static inline struct core_device *to_core_device(struct device *d) -{ - return container_of(d, struct core_device, dev); -} - static void cdev_device_release(struct device *dev) { struct core_device *cdev = to_core_device(dev);
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Add and export svm_get_pasid() to look up the pasid of a process by its pid.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index 213134ae76c5..ff66c7b9709c 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -755,6 +755,54 @@ static int svm_dt_init_core(struct svm_device *sdev, struct device_node *np) return err; }
+int svm_get_pasid(pid_t vpid, int dev_id __maybe_unused) +{ + int pasid; + unsigned long asid; + struct task_struct *task = NULL; + struct mm_struct *mm = NULL; + struct svm_process *process = NULL; + struct svm_bind_process params; + + params.flags = SVM_BIND_PID; + params.vpid = vpid; + params.pasid = -1; + params.ttbr = 0; + params.tcr = 0; + task = svm_get_task(params); + if (IS_ERR(task)) + return PTR_ERR(task); + + mm = get_task_mm(task); + if (mm == NULL) { + pasid = -EINVAL; + goto put_task; + } + + asid = arm64_mm_context_get(mm); + if (!asid) { + pasid = -ENOSPC; + goto put_mm; + } + + mutex_lock(&svm_process_mutex); + process = find_svm_process(asid); + mutex_unlock(&svm_process_mutex); + if (process) + pasid = process->pasid; + else + pasid = -ESRCH; + + arm64_mm_context_put(mm); +put_mm: + mmput(mm); +put_task: + put_task_struct(task); + + return pasid; +} +EXPORT_SYMBOL_GPL(svm_get_pasid); + static int svm_device_probe(struct platform_device *pdev) { int err = -1;
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
During svm probe, read the l2buff address and size from the device tree; the runtime (RTS) uses them to set up the l2buff.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index ff66c7b9709c..a430062f76e4 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -803,6 +803,27 @@ int svm_get_pasid(pid_t vpid, int dev_id __maybe_unused) } EXPORT_SYMBOL_GPL(svm_get_pasid);
+static int svm_dt_setup_l2buff(struct svm_device *sdev, struct device_node *np) +{ + struct device_node *l2buff = of_parse_phandle(np, "memory-region", 0); + + if (l2buff) { + struct resource r; + int err = of_address_to_resource(l2buff, 0, &r); + + if (err) { + of_node_put(l2buff); + return err; + } + + sdev->l2buff = r.start; + sdev->l2size = resource_size(&r); + } + + of_node_put(l2buff); + return 0; +} + static int svm_device_probe(struct platform_device *pdev) { int err = -1; @@ -864,6 +885,14 @@ static int svm_device_probe(struct platform_device *pdev) goto err_unregister_misc; } } else { + /* + * Get the l2buff phys address and size, if it do not exist + * just warn and continue, and runtime can not use L2BUFF. + */ + err = svm_dt_setup_l2buff(sdev, np); + if (err) + dev_warn(dev, "Cannot get l2buff\n"); + err = svm_dt_init_core(sdev, np); if (err) { dev_err(dev, "failed to init dt cores\n");
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
-------------------------------------------------
Add memory allocation and release functions to svm. The physical address of the allocated memory is guaranteed to be within 4GB.
For example: /* alloc */ fd = open("/dev/svm0",); mmap(0, ALLOC_SIZE,, MAP_PA32BIT, fd, 0);
/* free */ ioctl(fd, SVM_IOCTL_RELEASE_PHYS32,); close(fd);
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/alpha/include/uapi/asm/mman.h | 1 + arch/mips/include/uapi/asm/mman.h | 1 + arch/parisc/include/uapi/asm/mman.h | 1 + arch/powerpc/include/uapi/asm/mman.h | 1 + arch/sparc/include/uapi/asm/mman.h | 1 + arch/xtensa/include/uapi/asm/mman.h | 1 + drivers/char/svm.c | 239 +++++++++++++++++++++++++++ include/linux/mm.h | 1 + include/uapi/asm-generic/mman.h | 1 + mm/mmap.c | 4 + 10 files changed, 251 insertions(+)
diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h index a18ec7f63888..87abc7b03360 100644 --- a/arch/alpha/include/uapi/asm/mman.h +++ b/arch/alpha/include/uapi/asm/mman.h @@ -31,6 +31,7 @@ #define MAP_STACK 0x80000 /* give out an address that is best suited for process/thread stacks */ #define MAP_HUGETLB 0x100000 /* create a huge page mapping */ #define MAP_FIXED_NOREPLACE 0x200000/* MAP_FIXED which doesn't unmap underlying mapping */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */
#define MS_ASYNC 1 /* sync memory asynchronously */ #define MS_SYNC 2 /* synchronous memory sync */ diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h index 57dc2ac4f8bd..61cd225fcaa4 100644 --- a/arch/mips/include/uapi/asm/mman.h +++ b/arch/mips/include/uapi/asm/mman.h @@ -49,6 +49,7 @@ #define MAP_STACK 0x40000 /* give out an address that is best suited for process/thread stacks */ #define MAP_HUGETLB 0x80000 /* create a huge page mapping */ #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */
/* * Flags for msync diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h index ab78cba446ed..851678907640 100644 --- a/arch/parisc/include/uapi/asm/mman.h +++ b/arch/parisc/include/uapi/asm/mman.h @@ -26,6 +26,7 @@ #define MAP_HUGETLB 0x80000 /* create a huge page mapping */ #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ #define MAP_UNINITIALIZED 0 /* uninitialized anonymous mmap */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */
#define MS_SYNC 1 /* synchronous memory sync */ #define MS_ASYNC 2 /* sync memory asynchronously */ diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h index c0c737215b00..f0eb04780148 100644 --- a/arch/powerpc/include/uapi/asm/mman.h +++ b/arch/powerpc/include/uapi/asm/mman.h @@ -25,6 +25,7 @@ #define MCL_CURRENT 0x2000 /* lock all currently mapped pages */ #define MCL_FUTURE 0x4000 /* lock all additions to address space */ #define MCL_ONFAULT 0x8000 /* lock all pages that are faulted in */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */
/* Override any generic PKEY permission defines */ #define PKEY_DISABLE_EXECUTE 0x4 diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h index cec9f4109687..8caf19c604d0 100644 --- a/arch/sparc/include/uapi/asm/mman.h +++ b/arch/sparc/include/uapi/asm/mman.h @@ -21,5 +21,6 @@ #define MCL_CURRENT 0x2000 /* lock all currently mapped pages */ #define MCL_FUTURE 0x4000 /* lock all additions to address space */ #define MCL_ONFAULT 0x8000 /* lock all pages that are faulted in */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */
#endif /* _UAPI__SPARC_MMAN_H__ */ diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h index e5e643752947..a52ac8462b7d 100644 --- a/arch/xtensa/include/uapi/asm/mman.h +++ b/arch/xtensa/include/uapi/asm/mman.h @@ -56,6 +56,7 @@ #define MAP_STACK 0x40000 /* give out an address that is best suited for process/thread stacks */ #define MAP_HUGETLB 0x80000 /* create a huge page mapping */ #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */ #define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be * uninitialized */
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index a430062f76e4..b1dd373a745c 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -39,6 +39,10 @@ #define SVM_IOCTL_PROCESS_BIND 0xffff
#define CORE_SID 0 + +#define SVM_IOCTL_RELEASE_PHYS32 0xfff3 +#define MMAP_PHY32_MAX (16 * 1024 * 1024) + static int probe_index; static LIST_HEAD(child_list); static DECLARE_RWSEM(svm_sem); @@ -96,6 +100,8 @@ static char *svm_cmd_to_string(unsigned int cmd) switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_RELEASE_PHYS32: + return "release phys"; default: return "unsupported"; } @@ -402,6 +408,83 @@ static int svm_process_bind(struct task_struct *task, return err; }
+static pte_t *svm_get_pte(struct vm_area_struct *vma, + pud_t *pud, + unsigned long addr, + unsigned long *page_size, + unsigned long *offset) +{ + pte_t *pte = NULL; + unsigned long size = 0; + + if (is_vm_hugetlb_page(vma)) { + if (pud_present(*pud)) { + if (pud_huge(*pud)) { + pte = (pte_t *)pud; + *offset = addr & (PUD_SIZE - 1); + size = PUD_SIZE; + } else { + pte = (pte_t *)pmd_offset(pud, addr); + *offset = addr & (PMD_SIZE - 1); + size = PMD_SIZE; + } + } else { + pr_err("%s:hugetlb but pud not present\n", __func__); + } + } else { + pmd_t *pmd = pmd_offset(pud, addr); + + if (pmd_none(*pmd)) + return NULL; + + if (pmd_trans_huge(*pmd)) { + pte = (pte_t *)pmd; + *offset = addr & (PMD_SIZE - 1); + size = PMD_SIZE; + } else if (pmd_trans_unstable(pmd)) { + pr_warn("%s: thp unstable\n", __func__); + } else { + pte = pte_offset_map(pmd, addr); + *offset = addr & (PAGE_SIZE - 1); + size = PAGE_SIZE; + } + } + + if (page_size) + *page_size = size; + + return pte; +} + +/* Must be called with mmap_lock held */ +static pte_t *svm_walk_pt(unsigned long addr, unsigned long *page_size, + unsigned long *offset) +{ + pgd_t *pgd = NULL; + p4d_t *p4d = NULL; + pud_t *pud = NULL; + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma = NULL; + + vma = find_vma(mm, addr); + if (!vma) + return NULL; + + pgd = pgd_offset(mm, addr); + if (pgd_none_or_clear_bad(pgd)) + return NULL; + + p4d = p4d_offset(pgd, addr); + if (p4d_none_or_clear_bad(p4d)) + return NULL; + + pud = pud_offset(p4d, addr); + if (pud_none_or_clear_bad(pud)) + return NULL; + + return svm_get_pte(vma, pud, addr, page_size, offset); +} + static struct bus_type svm_bus_type = { .name = "svm_bus", }; @@ -411,6 +494,157 @@ static int svm_open(struct inode *inode, struct file *file) return 0; }
+static unsigned long svm_get_unmapped_area(struct file *file, + unsigned long addr0, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + unsigned long addr = addr0; + struct mm_struct *mm = current->mm; + struct vm_unmapped_area_info info; + struct svm_device *sdev = file_to_sdev(file); + + if (!acpi_disabled) + return -EPERM; + + if (flags & MAP_FIXED) { + if (IS_ALIGNED(addr, len)) + return addr; + + dev_err(sdev->dev, "MAP_FIXED but not aligned\n"); + return -EINVAL; //lint !e570 + } + + if (addr) { + struct vm_area_struct *vma = NULL; + + addr = ALIGN(addr, len); + + vma = find_vma(mm, addr); + if (TASK_SIZE - len >= addr && addr >= mmap_min_addr && + (vma == NULL || addr + len <= vm_start_gap(vma))) + return addr; + } + + info.flags = VM_UNMAPPED_AREA_TOPDOWN; + info.length = len; + info.low_limit = max(PAGE_SIZE, mmap_min_addr); + info.high_limit = mm->mmap_base; + info.align_mask = ((len >> PAGE_SHIFT) - 1) << PAGE_SHIFT; + info.align_offset = pgoff << PAGE_SHIFT; + + addr = vm_unmapped_area(&info); + + if (offset_in_page(addr)) { + VM_BUG_ON(addr != -ENOMEM); + info.flags = 0; + info.low_limit = TASK_UNMAPPED_BASE; + info.high_limit = TASK_SIZE; + + addr = vm_unmapped_area(&info); + } + + return addr; +} + +static int svm_mmap(struct file *file, struct vm_area_struct *vma) +{ + int err; + struct svm_device *sdev = file_to_sdev(file); + + if (!acpi_disabled) + return -EPERM; + + if (vma->vm_flags & VM_PA32BIT) { + unsigned long vm_size = vma->vm_end - vma->vm_start; + struct page *page = NULL; + + if ((vma->vm_end < vma->vm_start) || (vm_size > MMAP_PHY32_MAX)) + return -EINVAL; + + /* vma->vm_pgoff transfer the nid */ + if (vma->vm_pgoff == 0) + page = alloc_pages(GFP_KERNEL | GFP_DMA32, + get_order(vm_size)); + else + page = alloc_pages_node((int)vma->vm_pgoff, + GFP_KERNEL | __GFP_THISNODE, + get_order(vm_size)); + if (!page) { + dev_err(sdev->dev, "fail to alloc page on node 0x%lx\n", + vma->vm_pgoff); + return -ENOMEM; + } + + err = 
remap_pfn_range(vma, + vma->vm_start, + page_to_pfn(page), + vm_size, vma->vm_page_prot); + if (err) + dev_err(sdev->dev, + "fail to remap 0x%pK err=%d\n", + (void *)vma->vm_start, err); + } else { + if ((vma->vm_end < vma->vm_start) || + ((vma->vm_end - vma->vm_start) > sdev->l2size)) + return -EINVAL; + + vma->vm_page_prot = __pgprot((~PTE_SHARED) & + vma->vm_page_prot.pgprot); + + err = remap_pfn_range(vma, + vma->vm_start, + sdev->l2buff >> PAGE_SHIFT, + vma->vm_end - vma->vm_start, + __pgprot(vma->vm_page_prot.pgprot | PTE_DIRTY)); + if (err) + dev_err(sdev->dev, + "fail to remap 0x%pK err=%d\n", + (void *)vma->vm_start, err); + } + + return err; +} + +static int svm_release_phys32(unsigned long __user *arg) +{ + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma = NULL; + struct page *page = NULL; + pte_t *pte = NULL; + unsigned long phys, addr, offset; + unsigned int len = 0; + + if (arg == NULL) + return -EINVAL; + + if (get_user(addr, arg)) + return -EFAULT; + + down_read(&mm->mmap_lock); + pte = svm_walk_pt(addr, NULL, &offset); + if (pte && pte_present(*pte)) { + phys = PFN_PHYS(pte_pfn(*pte)) + offset; + } else { + up_read(&mm->mmap_lock); + return -EINVAL; + } + + vma = find_vma(mm, addr); + if (!vma) { + up_read(&mm->mmap_lock); + return -EFAULT; + } + + page = phys_to_page(phys); + len = vma->vm_end - vma->vm_start; + + __free_pages(page, get_order(len)); + + up_read(&mm->mmap_lock); + + return 0; +} + static long svm_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -455,6 +689,9 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_RELEASE_PHYS32: + err = svm_release_phys32((unsigned long __user *)arg); + break; default: err = -EINVAL; } @@ -469,6 +706,8 @@ static long svm_ioctl(struct file *file, unsigned int cmd, static const struct file_operations svm_fops = { .owner = THIS_MODULE, .open = svm_open, + .mmap = svm_mmap, + .get_unmapped_area = svm_get_unmapped_area, 
.unlocked_ioctl = svm_ioctl, };
diff --git a/include/linux/mm.h b/include/linux/mm.h index 3780281c8112..ae9b6688677f 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -298,6 +298,7 @@ extern unsigned int kobjsize(const void *objp); #define VM_HUGEPAGE 0x20000000 /* MADV_HUGEPAGE marked this vma */ #define VM_NOHUGEPAGE 0x40000000 /* MADV_NOHUGEPAGE marked this vma */ #define VM_MERGEABLE 0x80000000 /* KSM may merge identical pages */ +#define VM_PA32BIT 0x400000000 /* Physical address is within 4G */
#ifdef CONFIG_COHERENT_DEVICE #define VM_CDM 0x100000000 /* Contains coherent device memory */ diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h index 57e8195d0b53..344bb9b090a7 100644 --- a/include/uapi/asm-generic/mman.h +++ b/include/uapi/asm-generic/mman.h @@ -9,6 +9,7 @@ #define MAP_EXECUTABLE 0x1000 /* mark it as an executable */ #define MAP_LOCKED 0x2000 /* pages are locked */ #define MAP_NORESERVE 0x4000 /* don't check for reservations */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */
/* * Bits [26:31] are reserved, see asm-generic/hugetlb_encode.h diff --git a/mm/mmap.c b/mm/mmap.c index f63925a21c95..f705137fd248 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1462,6 +1462,10 @@ __do_mmap(struct file *file, unsigned long addr, unsigned long len, pkey = 0; }
+ /* Physical address is within 4G */ + if (flags & MAP_PA32BIT) + vm_flags |= VM_PA32BIT; + /* Do simple checking here so the lower-level routines won't have * to. we assume access permissions have been handled by the open * of the memory object, so we don't do any here.
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index b1dd373a745c..60e9f022bfb3 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -36,6 +36,7 @@ #define SVM_DEVICE_NAME "svm" #define ASID_SHIFT 48
+#define SVM_IOCTL_LOAD_FLAG 0xfffa #define SVM_IOCTL_PROCESS_BIND 0xffff
#define CORE_SID 0 @@ -100,6 +101,8 @@ static char *svm_cmd_to_string(unsigned int cmd) switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_LOAD_FLAG: + return "load flag"; case SVM_IOCTL_RELEASE_PHYS32: return "release phys"; default: @@ -494,6 +497,25 @@ static int svm_open(struct inode *inode, struct file *file) return 0; }
+static int svm_proc_load_flag(int __user *arg) +{ + static atomic_t l2buf_load_flag = ATOMIC_INIT(0); + int flag; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) + return -EINVAL; + + if (0 == (atomic_cmpxchg(&l2buf_load_flag, 0, 1))) + flag = 0; + else + flag = 1; + + return put_user(flag, arg); +} + static unsigned long svm_get_unmapped_area(struct file *file, unsigned long addr0, unsigned long len, unsigned long pgoff, unsigned long flags) @@ -689,6 +711,9 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_LOAD_FLAG: + err = svm_proc_load_flag((int __user *)arg); + break; case SVM_IOCTL_RELEASE_PHYS32: err = svm_release_phys32((unsigned long __user *)arg); break;
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
This feature is used to debug process memory.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 160 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 160 insertions(+)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index 60e9f022bfb3..bf9e299ceb4e 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -36,12 +36,14 @@ #define SVM_DEVICE_NAME "svm" #define ASID_SHIFT 48
+#define SVM_IOCTL_REMAP_PROC 0xfff4 #define SVM_IOCTL_LOAD_FLAG 0xfffa #define SVM_IOCTL_PROCESS_BIND 0xffff
#define CORE_SID 0
#define SVM_IOCTL_RELEASE_PHYS32 0xfff3 +#define SVM_REMAP_MEM_LEN_MAX (16 * 1024 * 1024) #define MMAP_PHY32_MAX (16 * 1024 * 1024)
static int probe_index; @@ -96,11 +98,21 @@ struct svm_process { struct iommu_sva *sva; };
+struct svm_proc_mem { + u32 dev_id; + u32 len; + u64 pid; + u64 vaddr; + u64 buf; +}; + static char *svm_cmd_to_string(unsigned int cmd) { switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_REMAP_PROC: + return "remap proc"; case SVM_IOCTL_LOAD_FLAG: return "load flag"; case SVM_IOCTL_RELEASE_PHYS32: @@ -497,6 +509,151 @@ static int svm_open(struct inode *inode, struct file *file) return 0; }
+static long svm_remap_get_phys(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long addr, unsigned long *phys, + unsigned long *page_size, unsigned long *offset) +{ + long err = -EINVAL; + pgd_t *pgd = NULL; + p4d_t *p4d = NULL; + pud_t *pud = NULL; + pte_t *pte = NULL; + + if (mm == NULL || vma == NULL || phys == NULL || + page_size == NULL || offset == NULL) + return err; + + pgd = pgd_offset(mm, addr); + if (pgd_none_or_clear_bad(pgd)) + return err; + + p4d = p4d_offset(pgd, addr); + if (p4d_none_or_clear_bad(p4d)) + return err; + + pud = pud_offset(p4d, addr); + if (pud_none_or_clear_bad(pud)) + return err; + + pte = svm_get_pte(vma, pud, addr, page_size, offset); + if (pte && pte_present(*pte)) { + *phys = PFN_PHYS(pte_pfn(*pte)); + return 0; + } + + return err; +} + +static long svm_remap_proc(unsigned long __user *arg) +{ + long ret = -EINVAL; + struct svm_proc_mem pmem; + struct task_struct *ptask = NULL; + struct mm_struct *pmm = NULL, *mm = current->mm; + struct vm_area_struct *pvma = NULL, *vma = NULL; + unsigned long end, vaddr, phys, buf, offset, pagesize; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) { + pr_err("arg is invalid.\n"); + return ret; + } + + ret = copy_from_user(&pmem, (void __user *)arg, sizeof(pmem)); + if (ret) { + pr_err("failed to copy args from user space.\n"); + return -EFAULT; + } + + if (pmem.buf & (PAGE_SIZE - 1)) { + pr_err("address is not aligned with page size, addr:%pK.\n", + (void *)pmem.buf); + return -EINVAL; + } + + rcu_read_lock(); + if (pmem.pid) { + ptask = find_task_by_vpid(pmem.pid); + if (!ptask) { + rcu_read_unlock(); + pr_err("No task for this pid\n"); + return -EINVAL; + } + } else { + ptask = current; + } + + get_task_struct(ptask); + rcu_read_unlock(); + pmm = ptask->mm; + + down_read(&mm->mmap_lock); + down_read(&pmm->mmap_lock); + + pvma = find_vma(pmm, pmem.vaddr); + if (pvma == NULL) { + ret = -ESRCH; + goto err; + } + + vma = find_vma(mm, pmem.buf); + if (vma == NULL) { + ret 
= -ESRCH; + goto err; + } + + if (pmem.len > SVM_REMAP_MEM_LEN_MAX) { + ret = -EINVAL; + pr_err("too large length of memory.\n"); + goto err; + } + vaddr = pmem.vaddr; + end = vaddr + pmem.len; + buf = pmem.buf; + vma->vm_flags |= VM_SHARED; + if (end > pvma->vm_end || end < vaddr) { + ret = -EINVAL; + pr_err("memory length is out of range, vaddr:%pK, len:%u.\n", + (void *)vaddr, pmem.len); + goto err; + } + + do { + ret = svm_remap_get_phys(pmm, pvma, vaddr, + &phys, &pagesize, &offset); + if (ret) { + ret = -EINVAL; + goto err; + } + + vaddr += pagesize - offset; + + do { + if (remap_pfn_range(vma, buf, phys >> PAGE_SHIFT, + PAGE_SIZE, + __pgprot(vma->vm_page_prot.pgprot | + PTE_DIRTY))) { + + ret = -ESRCH; + goto err; + } + + offset += PAGE_SIZE; + buf += PAGE_SIZE; + phys += PAGE_SIZE; + } while (offset < pagesize); + + } while (vaddr < end); + +err: + up_read(&pmm->mmap_lock); + up_read(&mm->mmap_lock); + put_task_struct(ptask); + return ret; +} + static int svm_proc_load_flag(int __user *arg) { static atomic_t l2buf_load_flag = ATOMIC_INIT(0); @@ -711,6 +868,9 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_REMAP_PROC: + err = svm_remap_proc((unsigned long __user *)arg); + break; case SVM_IOCTL_LOAD_FLAG: err = svm_proc_load_flag((int __user *)arg); break;
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Add svm_get_phy_memory_info and svm_get_hugeinfo to retrieve physical memory and hugepage information.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 118 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 118 insertions(+)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index bf9e299ceb4e..79b7e8f9b803 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -37,6 +37,8 @@ #define ASID_SHIFT 48
#define SVM_IOCTL_REMAP_PROC 0xfff4 +#define SVM_IOCTL_GETHUGEINFO 0xfff6 +#define SVM_IOCTL_GET_PHYMEMINFO 0xfff8 #define SVM_IOCTL_LOAD_FLAG 0xfffa #define SVM_IOCTL_PROCESS_BIND 0xffff
@@ -106,11 +108,32 @@ struct svm_proc_mem { u64 buf; };
+struct meminfo { + unsigned long hugetlbfree; + unsigned long hugetlbtotal; +}; + +struct phymeminfo { + unsigned long normal_total; + unsigned long normal_free; + unsigned long huge_total; + unsigned long huge_free; +}; + +struct phymeminfo_ioctl { + struct phymeminfo *info; + unsigned long nodemask; +}; + static char *svm_cmd_to_string(unsigned int cmd) { switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_GETHUGEINFO: + return "get hugeinfo"; + case SVM_IOCTL_GET_PHYMEMINFO: + return "get physical memory info"; case SVM_IOCTL_REMAP_PROC: return "remap proc"; case SVM_IOCTL_LOAD_FLAG: @@ -509,6 +532,95 @@ static int svm_open(struct inode *inode, struct file *file) return 0; }
+static long svm_get_hugeinfo(unsigned long __user *arg) +{ + struct hstate *h = &default_hstate; + struct meminfo info; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) + return -EINVAL; + + if (!hugepages_supported()) + return -ENOTSUPP; + + info.hugetlbfree = h->free_huge_pages; + info.hugetlbtotal = h->nr_huge_pages; + + if (copy_to_user((void __user *)arg, &info, sizeof(info))) + return -EFAULT; + + pr_info("svm get hugetlb info: order(%u), max_huge_pages(%lu)," + "nr_huge_pages(%lu), free_huge_pages(%lu), resv_huge_pages(%lu)", + h->order, + h->max_huge_pages, + h->nr_huge_pages, + h->free_huge_pages, + h->resv_huge_pages); + + return 0; +} + +static void svm_get_node_memory_info_inc(unsigned long nid, struct phymeminfo *info) +{ + struct sysinfo i; + struct hstate *h = &default_hstate; + unsigned long huge_free = 0; + unsigned long huge_total = 0; + + if (hugepages_supported()) { + huge_free = h->free_huge_pages_node[nid] * (PAGE_SIZE << huge_page_order(h)); + huge_total = h->nr_huge_pages_node[nid] * (PAGE_SIZE << huge_page_order(h)); + } + +#ifdef CONFIG_NUMA + si_meminfo_node(&i, nid); +#else + si_meminfo(&i); +#endif + info->normal_free += i.freeram * PAGE_SIZE; + info->normal_total += i.totalram * PAGE_SIZE - huge_total; + info->huge_total += huge_total; + info->huge_free += huge_free; +} + +static void __svm_get_memory_info(unsigned long nodemask, struct phymeminfo *info) +{ + memset(info, 0x0, sizeof(struct phymeminfo)); + + nodemask = nodemask & ((1UL << MAX_NUMNODES) - 1); + + while (nodemask) { + unsigned long nid = find_first_bit(&nodemask, BITS_PER_LONG); + + if (node_isset(nid, node_online_map)) + (void)svm_get_node_memory_info_inc(nid, info); + + nodemask &= ~(1UL << nid); + } +} + +static long svm_get_phy_memory_info(unsigned long __user *arg) +{ + struct phymeminfo info; + struct phymeminfo_ioctl para; + + if (arg == NULL) + return -EINVAL; + + if (copy_from_user(¶, (void __user *)arg, sizeof(para))) + return -EFAULT; + + 
__svm_get_memory_info(para.nodemask, &info); + + if (copy_to_user((void __user *)para.info, &info, sizeof(info))) + return -EFAULT; + + return 0; +} + static long svm_remap_get_phys(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long *phys, unsigned long *page_size, unsigned long *offset) @@ -868,6 +980,12 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_GETHUGEINFO: + err = svm_get_hugeinfo((unsigned long __user *)arg); + break; + case SVM_IOCTL_GET_PHYMEMINFO: + err = svm_get_phy_memory_info((unsigned long __user *)arg); + break; case SVM_IOCTL_REMAP_PROC: err = svm_remap_proc((unsigned long __user *)arg); break;
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 274 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 274 insertions(+)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index 79b7e8f9b803..19d36bddeb05 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -37,7 +37,9 @@ #define ASID_SHIFT 48
#define SVM_IOCTL_REMAP_PROC 0xfff4 +#define SVM_IOCTL_UNPIN_MEMORY 0xfff5 #define SVM_IOCTL_GETHUGEINFO 0xfff6 +#define SVM_IOCTL_PIN_MEMORY 0xfff7 #define SVM_IOCTL_GET_PHYMEMINFO 0xfff8 #define SVM_IOCTL_LOAD_FLAG 0xfffa #define SVM_IOCTL_PROCESS_BIND 0xffff @@ -100,6 +102,14 @@ struct svm_process { struct iommu_sva *sva; };
+struct svm_sdma { + struct rb_node node; + unsigned long addr; + int nr_pages; + struct page **pages; + atomic64_t ref; +}; + struct svm_proc_mem { u32 dev_id; u32 len; @@ -130,6 +140,10 @@ static char *svm_cmd_to_string(unsigned int cmd) switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_PIN_MEMORY: + return "pin memory"; + case SVM_IOCTL_UNPIN_MEMORY: + return "unpin memory"; case SVM_IOCTL_GETHUGEINFO: return "get hugeinfo"; case SVM_IOCTL_GET_PHYMEMINFO: @@ -532,6 +546,260 @@ static int svm_open(struct inode *inode, struct file *file) return 0; }
+static struct svm_sdma *svm_find_sdma(struct svm_process *process, + unsigned long addr, int nr_pages) +{ + struct rb_node *node = process->sdma_list.rb_node; + + while (node) { + struct svm_sdma *sdma = NULL; + + sdma = rb_entry(node, struct svm_sdma, node); + if (addr < sdma->addr) + node = node->rb_left; + else if (addr > sdma->addr) + node = node->rb_right; + else if (nr_pages < sdma->nr_pages) + node = node->rb_left; + else if (nr_pages > sdma->nr_pages) + node = node->rb_right; + else + return sdma; + } + + return NULL; +} + +static int svm_insert_sdma(struct svm_process *process, struct svm_sdma *sdma) +{ + struct rb_node **p = &process->sdma_list.rb_node; + struct rb_node *parent = NULL; + + while (*p) { + struct svm_sdma *tmp_sdma = NULL; + + parent = *p; + tmp_sdma = rb_entry(parent, struct svm_sdma, node); + if (sdma->addr < tmp_sdma->addr) + p = &(*p)->rb_left; + else if (sdma->addr > tmp_sdma->addr) + p = &(*p)->rb_right; + else if (sdma->nr_pages < tmp_sdma->nr_pages) + p = &(*p)->rb_left; + else if (sdma->nr_pages > tmp_sdma->nr_pages) + p = &(*p)->rb_right; + else { + /* + * add reference count and return -EBUSY + * to free former alloced one. 
+ */ + atomic64_inc(&tmp_sdma->ref); + return -EBUSY; + } + } + + rb_link_node(&sdma->node, parent, p); + rb_insert_color(&sdma->node, &process->sdma_list); + + return 0; +} + +static void svm_remove_sdma(struct svm_process *process, + struct svm_sdma *sdma, bool try_rm) +{ + int null_count = 0; + + if (try_rm && (!atomic64_dec_and_test(&sdma->ref))) + return; + + rb_erase(&sdma->node, &process->sdma_list); + RB_CLEAR_NODE(&sdma->node); + + while (sdma->nr_pages--) { + if (sdma->pages[sdma->nr_pages] == NULL) { + pr_err("null pointer, nr_pages:%d.\n", sdma->nr_pages); + null_count++; + continue; + } + + put_page(sdma->pages[sdma->nr_pages]); + } + + if (null_count) + dump_stack(); + + kvfree(sdma->pages); + kfree(sdma); +} + +static int svm_pin_pages(unsigned long addr, int nr_pages, + struct page **pages) +{ + int err; + + err = get_user_pages_fast(addr, nr_pages, 1, pages); + if (err > 0 && err < nr_pages) { + while (err--) + put_page(pages[err]); + err = -EFAULT; + } else if (err == 0) { + err = -EFAULT; + } + + return err; +} + +static int svm_add_sdma(struct svm_process *process, + unsigned long addr, unsigned long size) +{ + int err; + struct svm_sdma *sdma = NULL; + + sdma = kzalloc(sizeof(struct svm_sdma), GFP_KERNEL); + if (sdma == NULL) + return -ENOMEM; + + atomic64_set(&sdma->ref, 1); + sdma->addr = addr & PAGE_MASK; + sdma->nr_pages = (PAGE_ALIGN(size + addr) >> PAGE_SHIFT) - + (sdma->addr >> PAGE_SHIFT); + sdma->pages = kvcalloc(sdma->nr_pages, sizeof(char *), GFP_KERNEL); + if (sdma->pages == NULL) { + err = -ENOMEM; + goto err_free_sdma; + } + + /* + * If always pin the same addr with the same nr_pages, pin pages + * maybe should move after insert sdma with mutex lock. 
+ */ + err = svm_pin_pages(sdma->addr, sdma->nr_pages, sdma->pages); + if (err < 0) { + pr_err("%s: failed to pin pages addr 0x%pK, size 0x%lx\n", + __func__, (void *)addr, size); + goto err_free_pages; + } + + err = svm_insert_sdma(process, sdma); + if (err < 0) { + err = 0; + pr_debug("%s: sdma already exist!\n", __func__); + goto err_unpin_pages; + } + + return err; + +err_unpin_pages: + while (sdma->nr_pages--) + put_page(sdma->pages[sdma->nr_pages]); +err_free_pages: + kvfree(sdma->pages); +err_free_sdma: + kfree(sdma); + + return err; +} + +static int svm_pin_memory(unsigned long __user *arg) +{ + int err; + struct svm_process *process = NULL; + unsigned long addr, size, asid; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) + return -EINVAL; + + if (get_user(addr, arg)) + return -EFAULT; + + if (get_user(size, arg + 1)) + return -EFAULT; + + if ((addr + size <= addr) || (size >= (u64)UINT_MAX) || (addr == 0)) + return -EINVAL; + + asid = arm64_mm_context_get(current->mm); + if (!asid) + return -ENOSPC; + + mutex_lock(&svm_process_mutex); + process = find_svm_process(asid); + if (process == NULL) { + mutex_unlock(&svm_process_mutex); + err = -ESRCH; + goto out; + } + mutex_unlock(&svm_process_mutex); + + mutex_lock(&process->mutex); + err = svm_add_sdma(process, addr, size); + mutex_unlock(&process->mutex); + +out: + arm64_mm_context_put(current->mm); + + return err; +} + +static int svm_unpin_memory(unsigned long __user *arg) +{ + int err = 0, nr_pages; + struct svm_sdma *sdma = NULL; + unsigned long addr, size, asid; + struct svm_process *process = NULL; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) + return -EINVAL; + + if (get_user(addr, arg)) + return -EFAULT; + + if (get_user(size, arg + 1)) + return -EFAULT; + + if (ULONG_MAX - addr < size) + return -EINVAL; + + asid = arm64_mm_context_get(current->mm); + if (!asid) + return -ENOSPC; + + nr_pages = (PAGE_ALIGN(size + addr) >> PAGE_SHIFT) - + ((addr & PAGE_MASK) >> 
PAGE_SHIFT); + addr &= PAGE_MASK; + + mutex_lock(&svm_process_mutex); + process = find_svm_process(asid); + if (process == NULL) { + mutex_unlock(&svm_process_mutex); + err = -ESRCH; + goto out; + } + mutex_unlock(&svm_process_mutex); + + mutex_lock(&process->mutex); + sdma = svm_find_sdma(process, addr, nr_pages); + if (sdma == NULL) { + mutex_unlock(&process->mutex); + err = -ESRCH; + goto out; + } + + svm_remove_sdma(process, sdma, true); + mutex_unlock(&process->mutex); + +out: + arm64_mm_context_put(current->mm); + + return err; +} + static long svm_get_hugeinfo(unsigned long __user *arg) { struct hstate *h = &default_hstate; @@ -980,6 +1248,12 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_PIN_MEMORY: + err = svm_pin_memory((unsigned long __user *)arg); + break; + case SVM_IOCTL_UNPIN_MEMORY: + err = svm_unpin_memory((unsigned long __user *)arg); + break; case SVM_IOCTL_GETHUGEINFO: err = svm_get_hugeinfo((unsigned long __user *)arg); break;
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Implement setting the L2 cache read count: the SVM driver modifies the page table PTE to set the read count for the SMMU.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 524 +++++++++++++++++++++++++-------------------- 1 file changed, 291 insertions(+), 233 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index 19d36bddeb05..bc31724fb730 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -42,6 +42,7 @@ #define SVM_IOCTL_PIN_MEMORY 0xfff7 #define SVM_IOCTL_GET_PHYMEMINFO 0xfff8 #define SVM_IOCTL_LOAD_FLAG 0xfffa +#define SVM_IOCTL_SET_RC 0xfffc #define SVM_IOCTL_PROCESS_BIND 0xffff
#define CORE_SID 0 @@ -140,6 +141,8 @@ static char *svm_cmd_to_string(unsigned int cmd) switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_SET_RC: + return "set rc"; case SVM_IOCTL_PIN_MEMORY: return "pin memory"; case SVM_IOCTL_UNPIN_MEMORY: @@ -221,6 +224,270 @@ static inline struct core_device *to_core_device(struct device *d) return container_of(d, struct core_device, dev); }
+static struct svm_sdma *svm_find_sdma(struct svm_process *process, + unsigned long addr, int nr_pages) +{ + struct rb_node *node = process->sdma_list.rb_node; + + while (node) { + struct svm_sdma *sdma = NULL; + + sdma = rb_entry(node, struct svm_sdma, node); + if (addr < sdma->addr) + node = node->rb_left; + else if (addr > sdma->addr) + node = node->rb_right; + else if (nr_pages < sdma->nr_pages) + node = node->rb_left; + else if (nr_pages > sdma->nr_pages) + node = node->rb_right; + else + return sdma; + } + + return NULL; +} + +static int svm_insert_sdma(struct svm_process *process, struct svm_sdma *sdma) +{ + struct rb_node **p = &process->sdma_list.rb_node; + struct rb_node *parent = NULL; + + while (*p) { + struct svm_sdma *tmp_sdma = NULL; + + parent = *p; + tmp_sdma = rb_entry(parent, struct svm_sdma, node); + if (sdma->addr < tmp_sdma->addr) + p = &(*p)->rb_left; + else if (sdma->addr > tmp_sdma->addr) + p = &(*p)->rb_right; + else if (sdma->nr_pages < tmp_sdma->nr_pages) + p = &(*p)->rb_left; + else if (sdma->nr_pages > tmp_sdma->nr_pages) + p = &(*p)->rb_right; + else { + /* + * add reference count and return -EBUSY + * to free former alloced one. 
+ */ + atomic64_inc(&tmp_sdma->ref); + return -EBUSY; + } + } + + rb_link_node(&sdma->node, parent, p); + rb_insert_color(&sdma->node, &process->sdma_list); + + return 0; +} + +static void svm_remove_sdma(struct svm_process *process, + struct svm_sdma *sdma, bool try_rm) +{ + int null_count = 0; + + if (try_rm && (!atomic64_dec_and_test(&sdma->ref))) + return; + + rb_erase(&sdma->node, &process->sdma_list); + RB_CLEAR_NODE(&sdma->node); + + while (sdma->nr_pages--) { + if (sdma->pages[sdma->nr_pages] == NULL) { + pr_err("null pointer, nr_pages:%d.\n", sdma->nr_pages); + null_count++; + continue; + } + + put_page(sdma->pages[sdma->nr_pages]); + } + + if (null_count) + dump_stack(); + + kvfree(sdma->pages); + kfree(sdma); +} + +static int svm_pin_pages(unsigned long addr, int nr_pages, + struct page **pages) +{ + int err; + + err = get_user_pages_fast(addr, nr_pages, 1, pages); + if (err > 0 && err < nr_pages) { + while (err--) + put_page(pages[err]); + err = -EFAULT; + } else if (err == 0) { + err = -EFAULT; + } + + return err; +} + +static int svm_add_sdma(struct svm_process *process, + unsigned long addr, unsigned long size) +{ + int err; + struct svm_sdma *sdma = NULL; + + sdma = kzalloc(sizeof(struct svm_sdma), GFP_KERNEL); + if (sdma == NULL) + return -ENOMEM; + + atomic64_set(&sdma->ref, 1); + sdma->addr = addr & PAGE_MASK; + sdma->nr_pages = (PAGE_ALIGN(size + addr) >> PAGE_SHIFT) - + (sdma->addr >> PAGE_SHIFT); + sdma->pages = kvcalloc(sdma->nr_pages, sizeof(char *), GFP_KERNEL); + if (sdma->pages == NULL) { + err = -ENOMEM; + goto err_free_sdma; + } + + /* + * If always pin the same addr with the same nr_pages, pin pages + * maybe should move after insert sdma with mutex lock. 
+ */ + err = svm_pin_pages(sdma->addr, sdma->nr_pages, sdma->pages); + if (err < 0) { + pr_err("%s: failed to pin pages addr 0x%pK, size 0x%lx\n", + __func__, (void *)addr, size); + goto err_free_pages; + } + + err = svm_insert_sdma(process, sdma); + if (err < 0) { + err = 0; + pr_debug("%s: sdma already exist!\n", __func__); + goto err_unpin_pages; + } + + return err; + +err_unpin_pages: + while (sdma->nr_pages--) + put_page(sdma->pages[sdma->nr_pages]); +err_free_pages: + kvfree(sdma->pages); +err_free_sdma: + kfree(sdma); + + return err; +} + +static int svm_pin_memory(unsigned long __user *arg) +{ + int err; + struct svm_process *process = NULL; + unsigned long addr, size, asid; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) + return -EINVAL; + + if (get_user(addr, arg)) + return -EFAULT; + + if (get_user(size, arg + 1)) + return -EFAULT; + + if ((addr + size <= addr) || (size >= (u64)UINT_MAX) || (addr == 0)) + return -EINVAL; + + asid = arm64_mm_context_get(current->mm); + if (!asid) + return -ENOSPC; + + mutex_lock(&svm_process_mutex); + process = find_svm_process(asid); + if (process == NULL) { + mutex_unlock(&svm_process_mutex); + err = -ESRCH; + goto out; + } + mutex_unlock(&svm_process_mutex); + + mutex_lock(&process->mutex); + err = svm_add_sdma(process, addr, size); + mutex_unlock(&process->mutex); + +out: + arm64_mm_context_put(current->mm); + + return err; +} + +static int svm_unpin_memory(unsigned long __user *arg) +{ + int err = 0, nr_pages; + struct svm_sdma *sdma = NULL; + unsigned long addr, size, asid; + struct svm_process *process = NULL; + + if (!acpi_disabled) + return -EPERM; + + if (arg == NULL) + return -EINVAL; + + if (get_user(addr, arg)) + return -EFAULT; + + if (get_user(size, arg + 1)) + return -EFAULT; + + if (ULONG_MAX - addr < size) + return -EINVAL; + + asid = arm64_mm_context_get(current->mm); + if (!asid) + return -ENOSPC; + + nr_pages = (PAGE_ALIGN(size + addr) >> PAGE_SHIFT) - + ((addr & PAGE_MASK) >> 
PAGE_SHIFT); + addr &= PAGE_MASK; + + mutex_lock(&svm_process_mutex); + process = find_svm_process(asid); + if (process == NULL) { + mutex_unlock(&svm_process_mutex); + err = -ESRCH; + goto out; + } + mutex_unlock(&svm_process_mutex); + + mutex_lock(&process->mutex); + sdma = svm_find_sdma(process, addr, nr_pages); + if (sdma == NULL) { + mutex_unlock(&process->mutex); + err = -ESRCH; + goto out; + } + + svm_remove_sdma(process, sdma, true); + mutex_unlock(&process->mutex); + +out: + arm64_mm_context_put(current->mm); + + return err; +} + +static void svm_unpin_all(struct svm_process *process) +{ + struct rb_node *node = NULL; + + while ((node = rb_first(&process->sdma_list))) + svm_remove_sdma(process, + rb_entry(node, struct svm_sdma, node), + false); +} + static int svm_acpi_bind_core(struct core_device *cdev, void *data) { struct task_struct *task = NULL; @@ -293,6 +560,7 @@ static void svm_process_free(struct mmu_notifier *mn) struct svm_process *process = NULL;
process = container_of(mn, struct svm_process, notifier); + svm_unpin_all(process); arm64_mm_context_put(process->mm); kfree(process); } @@ -546,167 +814,14 @@ static int svm_open(struct inode *inode, struct file *file) return 0; }
-static struct svm_sdma *svm_find_sdma(struct svm_process *process, - unsigned long addr, int nr_pages) +static int svm_set_rc(unsigned long __user *arg) { - struct rb_node *node = process->sdma_list.rb_node; - - while (node) { - struct svm_sdma *sdma = NULL; - - sdma = rb_entry(node, struct svm_sdma, node); - if (addr < sdma->addr) - node = node->rb_left; - else if (addr > sdma->addr) - node = node->rb_right; - else if (nr_pages < sdma->nr_pages) - node = node->rb_left; - else if (nr_pages > sdma->nr_pages) - node = node->rb_right; - else - return sdma; - } - - return NULL; -} - -static int svm_insert_sdma(struct svm_process *process, struct svm_sdma *sdma) -{ - struct rb_node **p = &process->sdma_list.rb_node; - struct rb_node *parent = NULL; - - while (*p) { - struct svm_sdma *tmp_sdma = NULL; - - parent = *p; - tmp_sdma = rb_entry(parent, struct svm_sdma, node); - if (sdma->addr < tmp_sdma->addr) - p = &(*p)->rb_left; - else if (sdma->addr > tmp_sdma->addr) - p = &(*p)->rb_right; - else if (sdma->nr_pages < tmp_sdma->nr_pages) - p = &(*p)->rb_left; - else if (sdma->nr_pages > tmp_sdma->nr_pages) - p = &(*p)->rb_right; - else { - /* - * add reference count and return -EBUSY - * to free former alloced one. 
- */ - atomic64_inc(&tmp_sdma->ref); - return -EBUSY; - } - } - - rb_link_node(&sdma->node, parent, p); - rb_insert_color(&sdma->node, &process->sdma_list); - - return 0; -} - -static void svm_remove_sdma(struct svm_process *process, - struct svm_sdma *sdma, bool try_rm) -{ - int null_count = 0; - - if (try_rm && (!atomic64_dec_and_test(&sdma->ref))) - return; - - rb_erase(&sdma->node, &process->sdma_list); - RB_CLEAR_NODE(&sdma->node); - - while (sdma->nr_pages--) { - if (sdma->pages[sdma->nr_pages] == NULL) { - pr_err("null pointer, nr_pages:%d.\n", sdma->nr_pages); - null_count++; - continue; - } - - put_page(sdma->pages[sdma->nr_pages]); - } - - if (null_count) - dump_stack(); - - kvfree(sdma->pages); - kfree(sdma); -} - -static int svm_pin_pages(unsigned long addr, int nr_pages, - struct page **pages) -{ - int err; - - err = get_user_pages_fast(addr, nr_pages, 1, pages); - if (err > 0 && err < nr_pages) { - while (err--) - put_page(pages[err]); - err = -EFAULT; - } else if (err == 0) { - err = -EFAULT; - } - - return err; -} - -static int svm_add_sdma(struct svm_process *process, - unsigned long addr, unsigned long size) -{ - int err; - struct svm_sdma *sdma = NULL; - - sdma = kzalloc(sizeof(struct svm_sdma), GFP_KERNEL); - if (sdma == NULL) - return -ENOMEM; - - atomic64_set(&sdma->ref, 1); - sdma->addr = addr & PAGE_MASK; - sdma->nr_pages = (PAGE_ALIGN(size + addr) >> PAGE_SHIFT) - - (sdma->addr >> PAGE_SHIFT); - sdma->pages = kvcalloc(sdma->nr_pages, sizeof(char *), GFP_KERNEL); - if (sdma->pages == NULL) { - err = -ENOMEM; - goto err_free_sdma; - } - - /* - * If always pin the same addr with the same nr_pages, pin pages - * maybe should move after insert sdma with mutex lock. 
- */ - err = svm_pin_pages(sdma->addr, sdma->nr_pages, sdma->pages); - if (err < 0) { - pr_err("%s: failed to pin pages addr 0x%pK, size 0x%lx\n", - __func__, (void *)addr, size); - goto err_free_pages; - } - - err = svm_insert_sdma(process, sdma); - if (err < 0) { - err = 0; - pr_debug("%s: sdma already exist!\n", __func__); - goto err_unpin_pages; - } - - return err; - -err_unpin_pages: - while (sdma->nr_pages--) - put_page(sdma->pages[sdma->nr_pages]); -err_free_pages: - kvfree(sdma->pages); -err_free_sdma: - kfree(sdma); - - return err; -} - -static int svm_pin_memory(unsigned long __user *arg) -{ - int err; - struct svm_process *process = NULL; - unsigned long addr, size, asid; + unsigned long addr, size, rc; + unsigned long end, page_size, offset; + pte_t *pte = NULL; + struct mm_struct *mm = current->mm;
- if (!acpi_disabled) + if (acpi_disabled) return -EPERM;
if (arg == NULL) @@ -718,86 +833,26 @@ static int svm_pin_memory(unsigned long __user *arg) if (get_user(size, arg + 1)) return -EFAULT;
- if ((addr + size <= addr) || (size >= (u64)UINT_MAX) || (addr == 0)) - return -EINVAL; - - asid = arm64_mm_context_get(current->mm); - if (!asid) - return -ENOSPC; - - mutex_lock(&svm_process_mutex); - process = find_svm_process(asid); - if (process == NULL) { - mutex_unlock(&svm_process_mutex); - err = -ESRCH; - goto out; - } - mutex_unlock(&svm_process_mutex); - - mutex_lock(&process->mutex); - err = svm_add_sdma(process, addr, size); - mutex_unlock(&process->mutex); - -out: - arm64_mm_context_put(current->mm); - - return err; -} - -static int svm_unpin_memory(unsigned long __user *arg) -{ - int err = 0, nr_pages; - struct svm_sdma *sdma = NULL; - unsigned long addr, size, asid; - struct svm_process *process = NULL; - - if (!acpi_disabled) - return -EPERM; - - if (arg == NULL) - return -EINVAL; - - if (get_user(addr, arg)) - return -EFAULT; - - if (get_user(size, arg + 1)) + if (get_user(rc, arg + 2)) return -EFAULT;
- if (ULONG_MAX - addr < size) + end = addr + size; + if (addr >= end) return -EINVAL;
- asid = arm64_mm_context_get(current->mm); - if (!asid) - return -ENOSPC; - - nr_pages = (PAGE_ALIGN(size + addr) >> PAGE_SHIFT) - - ((addr & PAGE_MASK) >> PAGE_SHIFT); - addr &= PAGE_MASK; - - mutex_lock(&svm_process_mutex); - process = find_svm_process(asid); - if (process == NULL) { - mutex_unlock(&svm_process_mutex); - err = -ESRCH; - goto out; - } - mutex_unlock(&svm_process_mutex); - - mutex_lock(&process->mutex); - sdma = svm_find_sdma(process, addr, nr_pages); - if (sdma == NULL) { - mutex_unlock(&process->mutex); - err = -ESRCH; - goto out; + down_read(&mm->mmap_lock); + while (addr < end) { + pte = svm_walk_pt(addr, &page_size, &offset); + if (!pte) { + up_read(&mm->mmap_lock); + return -ESRCH; + } + pte->pte |= (rc & (u64)0x0f) << 59; + addr += page_size - offset; } + up_read(&mm->mmap_lock);
- svm_remove_sdma(process, sdma, true); - mutex_unlock(&process->mutex); - -out: - arm64_mm_context_put(current->mm); - - return err; + return 0; }
static long svm_get_hugeinfo(unsigned long __user *arg) @@ -1248,6 +1303,9 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_SET_RC: + err = svm_set_rc((unsigned long __user *)arg); + break; case SVM_IOCTL_PIN_MEMORY: err = svm_pin_memory((unsigned long __user *)arg); break;
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Add an ioctl to get the physical address for the TS core, and put it in the reserved memory.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/svm.c | 313 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 312 insertions(+), 1 deletion(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c index bc31724fb730..531c765e4415 100644 --- a/drivers/char/svm.c +++ b/drivers/char/svm.c @@ -41,6 +41,7 @@ #define SVM_IOCTL_GETHUGEINFO 0xfff6 #define SVM_IOCTL_PIN_MEMORY 0xfff7 #define SVM_IOCTL_GET_PHYMEMINFO 0xfff8 +#define SVM_IOCTL_GET_PHYS 0xfff9 #define SVM_IOCTL_LOAD_FLAG 0xfffa #define SVM_IOCTL_SET_RC 0xfffc #define SVM_IOCTL_PROCESS_BIND 0xffff @@ -141,6 +142,8 @@ static char *svm_cmd_to_string(unsigned int cmd) switch (cmd) { case SVM_IOCTL_PROCESS_BIND: return "bind"; + case SVM_IOCTL_GET_PHYS: + return "get phys"; case SVM_IOCTL_SET_RC: return "set rc"; case SVM_IOCTL_PIN_MEMORY: @@ -164,6 +167,231 @@ static char *svm_cmd_to_string(unsigned int cmd) return NULL; }
+/* + * image word of slot + * SVM_IMAGE_WORD_INIT: initial value, indicating that the slot is not used. + * SVM_IMAGE_WORD_VALID: valid data is filled in the slot + * SVM_IMAGE_WORD_DONE: the DMA operation is complete when the TS uses this address, + * so, this slot can be freed. + */ +#define SVM_IMAGE_WORD_INIT 0x0 +#define SVM_IMAGE_WORD_VALID 0xaa55aa55 +#define SVM_IMAGE_WORD_DONE 0x55ff55ff + +/* + * The length of this structure must be 64 bytes, which is the agreement with the TS. + * And the data type and sequence cannot be changed, because the TS core reads data + * based on the data type and sequence. + * image_word: slot status. For details, see SVM_IMAGE_WORD_xxx + * pid: pid of process which ioctl svm device to get physical addr, it is used for + * verification by TS. + * data_type: used to determine the data type by TS. Currently, data type must be + * SVM_VA2PA_TYPE_DMA. + * char data[48]: for the data type SVM_VA2PA_TYPE_DMA, the DMA address is stored. + */ +struct svm_va2pa_slot { + int image_word; + int resv; + int pid; + int data_type; + union { + char user_defined_data[48]; + struct { + unsigned long phys; + unsigned long len; + char reserved[32]; + }; + }; +}; + +struct svm_va2pa_trunk { + struct svm_va2pa_slot *slots; + int slot_total; + int slot_used; + unsigned long *bitmap; + struct mutex mutex; +}; + +struct svm_va2pa_trunk va2pa_trunk; + +#define SVM_VA2PA_TRUNK_SIZE_MAX 0x3200000 +#define SVM_VA2PA_MEMORY_ALIGN 64 +#define SVM_VA2PA_SLOT_SIZE sizeof(struct svm_va2pa_slot) +#define SVM_VA2PA_TYPE_DMA 0x1 +#define SVM_MEM_REG "va2pa trunk" +#define SVM_VA2PA_CLEAN_BATCH_NUM 0x80 + +struct device_node *svm_find_mem_reg_node(struct device *dev, const char *compat) +{ + int index = 0; + struct device_node *tmp = NULL; + struct device_node *np = dev->of_node; + + for (; ; index++) { + tmp = of_parse_phandle(np, "memory-region", index); + if (!tmp) + break; + + if (of_device_is_compatible(tmp, compat)) + return tmp; + + of_node_put(tmp); + } + 
+ return NULL; +} + +static int svm_parse_trunk_memory(struct device *dev, phys_addr_t *base, unsigned long *size) +{ + int err; + struct resource r; + struct device_node *trunk = NULL; + + trunk = svm_find_mem_reg_node(dev, SVM_MEM_REG); + if (!trunk) { + dev_err(dev, "Didn't find reserved memory\n"); + return -EINVAL; + } + + err = of_address_to_resource(trunk, 0, &r); + of_node_put(trunk); + if (err) { + dev_err(dev, "Couldn't address to resource for reserved memory\n"); + return -ENOMEM; + } + + *base = r.start; + *size = resource_size(&r); + + return 0; +} + +static int svm_setup_trunk(struct device *dev, phys_addr_t base, unsigned long size) +{ + int slot_total; + unsigned long *bitmap = NULL; + struct svm_va2pa_slot *slot = NULL; + + if (!IS_ALIGNED(base, SVM_VA2PA_MEMORY_ALIGN)) { + dev_err(dev, "Didn't aligned to %u\n", SVM_VA2PA_MEMORY_ALIGN); + return -EINVAL; + } + + if ((size == 0) || (size > SVM_VA2PA_TRUNK_SIZE_MAX)) { + dev_err(dev, "Size of reserved memory is not right\n"); + return -EINVAL; + } + + slot_total = size / SVM_VA2PA_SLOT_SIZE; + if (slot_total < BITS_PER_LONG) + return -EINVAL; + + bitmap = kvcalloc(slot_total / BITS_PER_LONG, sizeof(unsigned long), GFP_KERNEL); + if (!bitmap) { + dev_err(dev, "alloc memory failed\n"); + return -ENOMEM; + } + + slot = ioremap(base, size); + if (!slot) { + kvfree(bitmap); + dev_err(dev, "Ioremap trunk failed\n"); + return -ENXIO; + } + + va2pa_trunk.slots = slot; + va2pa_trunk.slot_used = 0; + va2pa_trunk.slot_total = slot_total; + va2pa_trunk.bitmap = bitmap; + mutex_init(&va2pa_trunk.mutex); + + return 0; +} + +static void svm_remove_trunk(struct device *dev) +{ + iounmap(va2pa_trunk.slots); + kvfree(va2pa_trunk.bitmap); + + va2pa_trunk.slots = NULL; + va2pa_trunk.bitmap = NULL; +} + +static void svm_set_slot_valid(unsigned long index, unsigned long phys, unsigned long len) +{ + struct svm_va2pa_slot *slot = &va2pa_trunk.slots[index]; + + slot->phys = phys; + slot->len = len; + slot->image_word = 
SVM_IMAGE_WORD_VALID; + slot->pid = current->tgid; + slot->data_type = SVM_VA2PA_TYPE_DMA; + __bitmap_set(va2pa_trunk.bitmap, index, 1); + va2pa_trunk.slot_used++; +} + +static void svm_set_slot_init(unsigned long index) +{ + struct svm_va2pa_slot *slot = &va2pa_trunk.slots[index]; + + slot->image_word = SVM_IMAGE_WORD_INIT; + __bitmap_clear(va2pa_trunk.bitmap, index, 1); + va2pa_trunk.slot_used--; +} + +static void svm_clean_done_slots(void) +{ + int used = va2pa_trunk.slot_used; + int count = 0; + long temp = -1; + phys_addr_t addr; + unsigned long *bitmap = va2pa_trunk.bitmap; + + for (; count < used && count < SVM_VA2PA_CLEAN_BATCH_NUM;) { + temp = find_next_bit(bitmap, va2pa_trunk.slot_total, temp + 1); + if (temp == va2pa_trunk.slot_total) + break; + + count++; + if (va2pa_trunk.slots[temp].image_word != SVM_IMAGE_WORD_DONE) + continue; + + addr = (phys_addr_t)va2pa_trunk.slots[temp].phys; + put_page(pfn_to_page(PHYS_PFN(addr))); + svm_set_slot_init(temp); + } +} + +static int svm_find_slot_init(unsigned long *index) +{ + int temp; + unsigned long *bitmap = va2pa_trunk.bitmap; + + temp = find_first_zero_bit(bitmap, va2pa_trunk.slot_total); + if (temp == va2pa_trunk.slot_total) + return -ENOSPC; + + *index = temp; + return 0; +} + +static int svm_va2pa_trunk_init(struct device *dev) +{ + int err; + phys_addr_t base; + unsigned long size; + + err = svm_parse_trunk_memory(dev, &base, &size); + if (err) + return err; + + err = svm_setup_trunk(dev, base, size); + if (err) + return err; + + return 0; +} + static struct svm_process *find_svm_process(unsigned long asid) { struct rb_node *node = svm_process_root.rb_node; @@ -805,6 +1033,78 @@ static pte_t *svm_walk_pt(unsigned long addr, unsigned long *page_size, return svm_get_pte(vma, pud, addr, page_size, offset); }
+static int svm_get_phys(unsigned long __user *arg) +{ + int err; + pte_t *ptep = NULL; + pte_t pte; + unsigned long index = 0; + struct page *page; + unsigned long addr, phys, offset; + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma = NULL; + unsigned long len; + + if (!acpi_disabled) + return -EPERM; + + if (get_user(addr, arg)) + return -EFAULT; + + down_read(&mm->mmap_lock); + ptep = svm_walk_pt(addr, NULL, &offset); + if (!ptep) { + up_read(&mm->mmap_lock); + return -EINVAL; + } + + pte = READ_ONCE(*ptep); + if (!pte_present(pte) || !(pfn_in_present_section(pte_pfn(pte)))) { + up_read(&mm->mmap_lock); + return -EINVAL; + } + + page = pte_page(pte); + get_page(page); + + phys = PFN_PHYS(pte_pfn(pte)) + offset; + + /* fix ts problem, which need the len to check out memory */ + len = 0; + vma = find_vma(mm, addr); + if (vma) + len = vma->vm_end - addr; + + up_read(&mm->mmap_lock); + + mutex_lock(&va2pa_trunk.mutex); + svm_clean_done_slots(); + if (va2pa_trunk.slot_used == va2pa_trunk.slot_total) { + err = -ENOSPC; + goto err_mutex_unlock; + } + + err = svm_find_slot_init(&index); + if (err) + goto err_mutex_unlock; + + svm_set_slot_valid(index, phys, len); + + err = put_user(index * SVM_VA2PA_SLOT_SIZE, (unsigned long __user *)arg); + if (err) + goto err_slot_init; + + mutex_unlock(&va2pa_trunk.mutex); + return 0; + +err_slot_init: + svm_set_slot_init(index); +err_mutex_unlock: + mutex_unlock(&va2pa_trunk.mutex); + put_page(page); + return err; +} + static struct bus_type svm_bus_type = { .name = "svm_bus", }; @@ -1303,6 +1603,9 @@ static long svm_ioctl(struct file *file, unsigned int cmd, return -EFAULT; } break; + case SVM_IOCTL_GET_PHYS: + err = svm_get_phys((unsigned long __user *)arg); + break; case SVM_IOCTL_SET_RC: err = svm_set_rc((unsigned long __user *)arg); break; @@ -1767,10 +2070,15 @@ static int svm_device_probe(struct platform_device *pdev) if (err) dev_warn(dev, "Cannot get l2buff\n");
+ if (svm_va2pa_trunk_init(dev)) { + dev_err(dev, "failed to init va2pa trunk\n"); + goto err_unregister_misc; + } + err = svm_dt_init_core(sdev, np); if (err) { dev_err(dev, "failed to init dt cores\n"); - goto err_unregister_misc; + goto err_remove_trunk; }
probe_index++; @@ -1780,6 +2088,9 @@ static int svm_device_probe(struct platform_device *pdev)
return err;
+err_remove_trunk: + svm_remove_trunk(dev); + err_unregister_misc: misc_deregister(&sdev->miscdev);
From: Lijun Fang fanglijun3@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA
--------
Enable CONFIG_HISI_SVM by default.
Signed-off-by: Lijun Fang fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/configs/openeuler_defconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig index f23a30a6fb01..d94a72fd0ffe 100644 --- a/arch/arm64/configs/openeuler_defconfig +++ b/arch/arm64/configs/openeuler_defconfig @@ -3296,6 +3296,7 @@ CONFIG_TCG_TIS_ST33ZP24_I2C=y CONFIG_TCG_TIS_ST33ZP24_SPI=y # CONFIG_XILLYBUS is not set CONFIG_PIN_MEMORY_DEV=m +CONFIG_HISI_SVM=y # end of Character devices
# CONFIG_RANDOM_TRUST_CPU is not set
From: Li Bin huawei.libin@huawei.com
hulk inclusion category: bugfix bugzilla: 30859, https://gitee.com/openeuler/kernel/issues/I4K6FB CVE: NA
Reference: http://openeuler.huawei.com/bugzilla/show_bug.cgi?id=30859
---------------------------
There is softlockup under fio pressure test with smmu enabled: watchdog: BUG: soft lockup - CPU#81 stuck for 22s! [swapper/81:0] ... Call trace: fq_flush_timeout+0xc0/0x110 call_timer_fn+0x34/0x178 expire_timers+0xec/0x158 run_timer_softirq+0xc0/0x1f8 __do_softirq+0x120/0x324 irq_exit+0x11c/0x140 __handle_domain_irq+0x6c/0xc0 gic_handle_irq+0x6c/0x170 el1_irq+0xb8/0x140 arch_cpu_idle+0x38/0x1c0 default_idle_call+0x24/0x44 do_idle+0x1f4/0x2d8 cpu_startup_entry+0x2c/0x30 secondary_start_kernel+0x17c/0x1c8
This is because the timer callback fq_flush_timeout may run for more than 10ms, and the timer may be processed continuously in softirq context, which triggers the soft lockup. Use a work item to handle the per-CPU fq_ring_free, which may take a long time, so that the soft lockup is avoided.
Signed-off-by: Li Bin huawei.libin@huawei.com Signed-off-by: Peng Wu wupeng58@huawei.com Reviewed-By: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/iommu/iova.c | 31 +++++++++++++++++++++---------- include/linux/iova.h | 1 + 2 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 30d969a4c5fd..2782a5be8df4 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -67,6 +67,7 @@ static void free_iova_flush_queue(struct iova_domain *iovad) if (timer_pending(&iovad->fq_timer)) del_timer(&iovad->fq_timer);
+ flush_work(&iovad->free_iova_work); fq_destroy_all_entries(iovad);
free_percpu(iovad->fq); @@ -76,6 +77,24 @@ static void free_iova_flush_queue(struct iova_domain *iovad) iovad->entry_dtor = NULL; }
+static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq); +static void free_iova_work_func(struct work_struct *work) +{ + struct iova_domain *iovad; + int cpu; + + iovad = container_of(work, struct iova_domain, free_iova_work); + for_each_possible_cpu(cpu) { + unsigned long flags; + struct iova_fq *fq; + + fq = per_cpu_ptr(iovad->fq, cpu); + spin_lock_irqsave(&fq->lock, flags); + fq_ring_free(iovad, fq); + spin_unlock_irqrestore(&fq->lock, flags); + } +} + int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb, iova_entry_dtor entry_dtor) { @@ -106,6 +125,7 @@ int init_iova_flush_queue(struct iova_domain *iovad,
iovad->fq = queue;
+ INIT_WORK(&iovad->free_iova_work, free_iova_work_func); timer_setup(&iovad->fq_timer, fq_flush_timeout, 0); atomic_set(&iovad->fq_timer_on, 0);
@@ -530,20 +550,11 @@ static void fq_destroy_all_entries(struct iova_domain *iovad) static void fq_flush_timeout(struct timer_list *t) { struct iova_domain *iovad = from_timer(iovad, t, fq_timer); - int cpu;
atomic_set(&iovad->fq_timer_on, 0); iova_domain_flush(iovad);
- for_each_possible_cpu(cpu) { - unsigned long flags; - struct iova_fq *fq; - - fq = per_cpu_ptr(iovad->fq, cpu); - spin_lock_irqsave(&fq->lock, flags); - fq_ring_free(iovad, fq); - spin_unlock_irqrestore(&fq->lock, flags); - } + schedule_work(&iovad->free_iova_work); }
void queue_iova(struct iova_domain *iovad, diff --git a/include/linux/iova.h b/include/linux/iova.h index a0637abffee8..56a05c92820c 100644 --- a/include/linux/iova.h +++ b/include/linux/iova.h @@ -95,6 +95,7 @@ struct iova_domain { flush-queues */ atomic_t fq_timer_on; /* 1 when timer is active, 0 when not */ + struct work_struct free_iova_work; };
static inline unsigned long iova_size(struct iova *iova)
From: Zhen Lei thunder.leizhen@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4K6DE CVE: NA
The non-strict SMMU mode brings significant performance gains and resolves the NVMe soft lockup problem, so enable it by default.
-----------------------------------------------------
Currently, many peripherals are faster than before. For example, the top speed of older network cards was 10Gb/s, and now it's more than 25Gb/s. But with IOMMU page-table mapping enabled, it's hard to reach the top speed in strict mode because of the frequent map and unmap operations. To keep up with the times, I think it's better to make non-strict the default.
Below is our iperf performance data for a 25Gb network card: strict mode: 18-20 Gb/s non-strict mode: 23.5 Gb/s
Signed-off-by: Zhen Lei thunder.leizhen@huawei.com Signed-off-by: Xie XiuQi xiexiuqi@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Zhen Lei thunder.leizhen@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- Documentation/admin-guide/kernel-parameters.txt | 4 ++-- drivers/iommu/iommu.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index d4b9d4a05b7d..dc6c40aaaa4f 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2029,13 +2029,13 @@
iommu.strict= [ARM64] Configure TLB invalidation behaviour Format: { "0" | "1" } - 0 - Lazy mode. + 0 - Lazy mode (default). Request that DMA unmap operations use deferred invalidation of hardware TLBs, for increased throughput at the cost of reduced device isolation. Will fall back to strict mode if not supported by the relevant IOMMU driver. - 1 - Strict mode (default). + 1 - Strict mode. DMA unmap operations invalidate IOMMU hardware TLBs synchronously.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 398bafcd6a55..9adb9d2502ae 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -29,7 +29,7 @@ static struct kset *iommu_group_kset; static DEFINE_IDA(iommu_group_ida);
static unsigned int iommu_def_domain_type __read_mostly; -static bool iommu_dma_strict __read_mostly = true; +static bool iommu_dma_strict __read_mostly; static u32 iommu_cmd_line __read_mostly;
/*
From: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com
euler inclusion category: bugfix bugzilla: 9509, https://gitee.com/openeuler/kernel/issues/I4K61K CVE: NA
Reference: https://lore.kernel.org/lkml/20181017044843.GD1068@jagdpanzerIV/T/
-------------------------------------------------
Make printk_safe_enter_irqsave()/etc macros available to the rest of the kernel.
Signed-off-by: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Signed-off-by: Hongbo Yao yaohongbo@huawei.com Signed-off-by: Peng Wu wupeng58@huawei.com Reviewed-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/printk.h | 40 +++++++++++++++++++++++++++++++++++++ kernel/printk/internal.h | 37 ---------------------------------- kernel/printk/printk_safe.c | 6 ++++-- 3 files changed, 44 insertions(+), 39 deletions(-)
diff --git a/include/linux/printk.h b/include/linux/printk.h index e6a8ee6db68e..7d787f91db92 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -161,6 +161,46 @@ static inline void printk_nmi_direct_enter(void) { } static inline void printk_nmi_direct_exit(void) { } #endif /* PRINTK_NMI */
+#ifdef CONFIG_PRINTK +extern void printk_safe_enter(void); +extern void printk_safe_exit(void); + +#define printk_safe_enter_irqsave(flags) \ + do { \ + local_irq_save(flags); \ + printk_safe_enter(); \ + } while (0) + +#define printk_safe_exit_irqrestore(flags) \ + do { \ + printk_safe_exit(); \ + local_irq_restore(flags); \ + } while (0) + +#define printk_safe_enter_irq() \ + do { \ + local_irq_disable(); \ + printk_safe_enter(); \ + } while (0) + +#define printk_safe_exit_irq() \ + do { \ + printk_safe_exit(); \ + local_irq_enable(); \ + } while (0) +#else +/* + * On !PRINTK builds we still export console output related locks + * and some functions (console_unlock()/tty/etc.), so printk-safe + * must preserve the existing local IRQ guarantees. + */ +#define printk_safe_enter_irqsave(flags) local_irq_save(flags) +#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags) + +#define printk_safe_enter_irq() local_irq_disable() +#define printk_safe_exit_irq() local_irq_enable() +#endif + struct dev_printk_info;
#ifdef CONFIG_PRINTK diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 3a8fd491758c..b1c155328b04 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -22,53 +22,16 @@ int vprintk_store(int facility, int level, __printf(1, 0) int vprintk_default(const char *fmt, va_list args); __printf(1, 0) int vprintk_deferred(const char *fmt, va_list args); __printf(1, 0) int vprintk_func(const char *fmt, va_list args); -void __printk_safe_enter(void); -void __printk_safe_exit(void);
void printk_safe_init(void); bool printk_percpu_data_ready(void);
-#define printk_safe_enter_irqsave(flags) \ - do { \ - local_irq_save(flags); \ - __printk_safe_enter(); \ - } while (0) - -#define printk_safe_exit_irqrestore(flags) \ - do { \ - __printk_safe_exit(); \ - local_irq_restore(flags); \ - } while (0) - -#define printk_safe_enter_irq() \ - do { \ - local_irq_disable(); \ - __printk_safe_enter(); \ - } while (0) - -#define printk_safe_exit_irq() \ - do { \ - __printk_safe_exit(); \ - local_irq_enable(); \ - } while (0) - void defer_console_output(void);
#else
__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; }
-/* - * In !PRINTK builds we still export logbuf_lock spin_lock, console_sem - * semaphore and some of console functions (console_unlock()/etc.), so - * printk-safe must preserve the existing local IRQ guarantees. - */ -#define printk_safe_enter_irqsave(flags) local_irq_save(flags) -#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags) - -#define printk_safe_enter_irq() local_irq_disable() -#define printk_safe_exit_irq() local_irq_enable() - static inline void printk_safe_init(void) { } static inline bool printk_percpu_data_ready(void) { return false; } #endif /* CONFIG_PRINTK */ diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index 2e9e3ed7d63e..d03c36565e0d 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -356,16 +356,18 @@ static __printf(1, 0) int vprintk_safe(const char *fmt, va_list args) }
/* Can be preempted by NMI. */ -void __printk_safe_enter(void) +void printk_safe_enter(void) { this_cpu_inc(printk_context); } +EXPORT_SYMBOL_GPL(printk_safe_enter);
/* Can be preempted by NMI. */ -void __printk_safe_exit(void) +void printk_safe_exit(void) { this_cpu_dec(printk_context); } +EXPORT_SYMBOL_GPL(printk_safe_exit);
__printf(1, 0) int vprintk_func(const char *fmt, va_list args) {
From: Hongbo Yao yaohongbo@huawei.com
euler inclusion category: bugfix bugzilla: 9509, https://gitee.com/openeuler/kernel/issues/I4K61K CVE: NA
Reference: http://openeuler.huawei.com/bugzilla/show_bug.cgi?id=9509
------------------------------------------------
Syzkaller hit 'possible deadlock in console_unlock' several times. Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&(&port->lock)->rlock); lock(&port_lock_key); lock(&(&port->lock)->rlock); lock(console_owner);
The problem is that the console driver invoked via call_console_driver() can also take the following locks:
uart_port->lock tty_wakeup tty_port->lock
So we can have the following:
tty_write tty_port->lock printk call_console_driver console_driver uart_port->lock tty_wakeup tty_port->lock << deadlock
To solve this problem, switch to printk_safe mode around that kmalloc(). This redirects all printk()s issued under kmalloc() to a special per-CPU buffer, which is flushed later from a safe context (irq work).
Signed-off-by: Hongbo Yao yaohongbo@huawei.com Signed-off-by: Peng Wu wupeng58@huawei.com Reviewed-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/tty/tty_buffer.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c index bd2d91546e32..1831738b33f4 100644 --- a/drivers/tty/tty_buffer.c +++ b/drivers/tty/tty_buffer.c @@ -172,7 +172,9 @@ static struct tty_buffer *tty_buffer_alloc(struct tty_port *port, size_t size) have queued and recycle that ? */ if (atomic_read(&port->buf.mem_used) > port->buf.mem_limit) return NULL; + printk_safe_enter(); p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC); + printk_safe_exit(); if (p == NULL) return NULL;
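The printk_safe idea used above can be illustrated with a small userspace sketch. This is a hypothetical, single-threaded analogy (safe_enter/safe_exit/log_msg/flush_deferred are invented names, not the kernel API; the kernel uses a per-CPU counter and buffer, modeled here by plain globals):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* While the reentrancy counter is non-zero, messages are stashed in a
 * buffer instead of being emitted (where taking console/tty locks could
 * deadlock), and are flushed later from a safe context (irq work in the
 * kernel). */

static int safe_context;          /* mirrors the kernel's printk_context */
static char deferred[256];        /* mirrors the per-CPU safe buffer */
static size_t deferred_len;

static void safe_enter(void) { safe_context++; }  /* printk_safe_enter() */
static void safe_exit(void)  { safe_context--; }  /* printk_safe_exit()  */

/* "printk": emit directly, unless we are inside a safe section. */
static void log_msg(const char *msg, char *out, size_t outsz)
{
    if (safe_context) {
        size_t n = strlen(msg);

        if (deferred_len + n < sizeof(deferred)) {
            memcpy(deferred + deferred_len, msg, n);
            deferred_len += n;
        }
    } else {
        snprintf(out, outsz, "%s", msg);
    }
}

/* Later, from a safe context, the deferred messages are emitted. */
static void flush_deferred(char *out, size_t outsz)
{
    snprintf(out, outsz, "%.*s", (int)deferred_len, deferred);
    deferred_len = 0;
}
```

A message logged between safe_enter() and safe_exit() is not emitted immediately; it only appears when flush_deferred() runs, which is exactly what lets the kmalloc() in tty_buffer_alloc() avoid recursing into the console path.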
From: Xu Qiang xuqiang36@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I CVE: NA
-------------------------------------------------
Export console_flush_on_panic for bbox to use.
Signed-off-by: Xu Qiang xuqiang36@huawei.com Signed-off-by: Fang Lijun fanglijun3@huawei.com Reviewed-by: Ding Tianhong dingtianhong@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/printk/printk.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index b9c63109acab..16cb3837b3a5 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2647,6 +2647,7 @@ void console_flush_on_panic(enum con_flush_mode mode) } console_unlock(); } +EXPORT_SYMBOL(console_flush_on_panic);
/* * Return the console tty driver structure and its associated index
From: Weilong Chen chenweilong@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMAR CVE: NA
-------------------------------------------------
Deliver all error types to the driver, as the driver needs to process the errors in process context.
Signed-off-by: Weilong Chen chenweilong@huawei.com Reviewed-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/acpi/apei/ghes.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 744769f7bddb..fc816c902394 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -665,11 +665,13 @@ static bool ghes_do_proc(struct ghes *ghes, } else { void *err = acpi_hest_get_payload(gdata);
- ghes_defer_non_standard_event(gdata, sev); log_non_standard_event(sec_type, fru_id, fru_text, sec_sev, err, gdata->error_data_length); } + + /* Customization deliver all types error to driver. */ + ghes_defer_non_standard_event(gdata, sev); }
return queued;
From: Xu Qiang xuqiang36@huawei.com
ascend inclusion category: bugfix Bugzilla: N/A CVE: N/A
-------------------------------------------
Signed-off-by: Xu Qiang xuqiang36@huawei.com Acked-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/printk/printk_safe.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index d03c36565e0d..b774685ccf80 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -288,6 +288,7 @@ void printk_safe_flush_on_panic(void)
printk_safe_flush(); } +EXPORT_SYMBOL_GPL(printk_safe_flush_on_panic);
#ifdef CONFIG_PRINTK_NMI /*
From: Bixuan Cui cuibixuan@huawei.com
ascend inclusion category: feature bugzilla: NA CVE: NA
-------------------------------------------------
Export log_buf_addr_get()/log_buf_len_get() for bbox driver.
Signed-off-by: Bixuan Cui cuibixuan@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/printk/printk.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 16cb3837b3a5..e237ac1a6533 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -457,12 +457,14 @@ char *log_buf_addr_get(void) { return log_buf; } +EXPORT_SYMBOL_GPL(log_buf_addr_get);
/* Return log buffer size */ u32 log_buf_len_get(void) { return log_buf_len; } +EXPORT_SYMBOL_GPL(log_buf_len_get);
/* * Define how much of the log buffer we could take at maximum. The value
From: Bixuan Cui cuibixuan@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4K2U5 CVE: NA
-------------------------------------------------
Export cpu_suspend/cpu_resume/psci_ops for lowpower driver.
Signed-off-by: Bixuan Cui cuibixuan@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/kernel/suspend.c | 2 ++ drivers/firmware/psci/psci.c | 2 ++ 2 files changed, 4 insertions(+)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c index 9f8cdeccd1ba..acef0422ab2e 100644 --- a/arch/arm64/kernel/suspend.c +++ b/arch/arm64/kernel/suspend.c @@ -134,6 +134,8 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
return ret; } +EXPORT_SYMBOL_GPL(cpu_suspend); +EXPORT_SYMBOL_GPL(cpu_resume);
static int __init cpu_suspend_init(void) { diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c index 00af99b6f97c..151d00898cab 100644 --- a/drivers/firmware/psci/psci.c +++ b/drivers/firmware/psci/psci.c @@ -47,6 +47,8 @@ */ static int resident_cpu = -1; struct psci_operations psci_ops; +EXPORT_SYMBOL(psci_ops); + static enum arm_smccc_conduit psci_conduit = SMCCC_CONDUIT_NONE;
bool psci_tos_resident_on(int cpu)
From: Tang Yizhou tangyizhou@huawei.com
hulk inclusion category: feature bugzilla: 37631 CVE: NA
-------------------------------------------------
To support signal monitoring, we need to export the defined tracepoint signal_generate so that it can be used in kernel modules.
Signed-off-by: Tang Yizhou tangyizhou@huawei.com Reviewed-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/signal.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/signal.c b/kernel/signal.c index 30e1b37a73e1..a6434ee9cdbb 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -56,6 +56,8 @@ #include <asm/siginfo.h> #include <asm/cacheflush.h>
+EXPORT_TRACEPOINT_SYMBOL(signal_generate); + /* * SLAB caches for signal bits. */
From: Weilong Chen chenweilong@huawei.com
ascend inclusion category: feature bugzilla: 46922 CVE: NA
-------------------------------------
Taishan's L1/L2 caches are inclusive and the data is consistent: any change in L1 does not require a DC operation to flush the cache line from L1 to L2. It is therefore safe to skip cleaning the data cache by address to the point of unification.
Without the IDC feature, the kernel needs to flush the icache as well as the dcache, causing performance degradation.
The flaw refers to V110/V200 variant 1.
Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Reviewed-by: Ding Tianhong dingtianhong@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Weilong Chen chenweilong@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- Documentation/arm64/silicon-errata.rst | 2 ++ arch/arm64/Kconfig | 9 ++++++++ arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpu_errata.c | 32 ++++++++++++++++++++++++++ 4 files changed, 45 insertions(+), 1 deletion(-)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 719510247292..616570593704 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -143,6 +143,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| Hisilicon | TSV{110,200} | #1980005 | HISILICON_ERRATUM_1980005 | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 507fdcb74153..08a93ca8f0d9 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -771,6 +771,15 @@ config HISILICON_ERRATUM_161600802
If unsure, say Y.
+config HISILICON_ERRATUM_1980005 + bool "Hisilicon erratum IDC support" + default n + help + The HiSilicon TSV100/200 SoC support idc but report wrong value to + kernel. + + If unsure, say N. + config QCOM_FALKOR_ERRATUM_1003 bool "Falkor E1003: Incorrect translation due to ASID change" default y diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index aba1209a684c..93bae3795165 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -70,7 +70,8 @@ #define ARM64_WORKAROUND_HISI_HIP08_RU_PREFETCH 60 #define ARM64_CLEARPAGE_STNP 61 #define ARM64_HAS_TWED 62 +#define ARM64_WORKAROUND_HISILICON_1980005 63
-#define ARM64_NCAPS 63 +#define ARM64_NCAPS 64
#endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 3a427d9f0ef6..abb6c903abef 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -60,6 +60,29 @@ is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope) return model == entry->midr_range.model; }
+#ifdef CONFIG_HISILICON_ERRATUM_1980005 +static bool +hisilicon_1980005_match(const struct arm64_cpu_capabilities *entry, + int scope) +{ + static const struct midr_range idc_support_list[] = { + MIDR_ALL_VERSIONS(MIDR_HISI_TSV110), + MIDR_REV(MIDR_HISI_TSV200, 1, 0), + { /* sentinel */ } + }; + + return is_midr_in_range_list(read_cpuid_id(), idc_support_list); +} + +static void +hisilicon_1980005_enable(const struct arm64_cpu_capabilities *__unused) +{ + cpus_set_cap(ARM64_HAS_CACHE_IDC); + arm64_ftr_reg_ctrel0.sys_val |= BIT(CTR_IDC_SHIFT); + sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCT, 0); +} +#endif + static bool has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry, int scope) @@ -473,6 +496,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, .cpu_enable = cpu_enable_trap_ctr_access, }, +#ifdef CONFIG_HISILICON_ERRATUM_1980005 + { + .desc = "Taishan IDC coherence workaround", + .capability = ARM64_WORKAROUND_HISILICON_1980005, + .matches = hisilicon_1980005_match, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .cpu_enable = hisilicon_1980005_enable, + }, +#endif #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003 { .desc = "Qualcomm Technologies Falkor/Kryo erratum 1003",
From: Xie XiuQi xiexiuqi@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBSJ CVE: NA
----------------------------------------
LSE atomic instructions might introduce performance regressions on specific benchmarks or workloads, so add a cmdline option to enable/disable them.
Passing "lse=off" on the command line disables LSE atomic instructions.
Signed-off-by: Xie XiuQi xiexiuqi@huawei.com Tested-by: Qiang Xiaojun qiangxiaojun@huawei.com [liwei: Fix compile warning with CONFIG_ARM64_LSE_ATOMICS=n] Signed-off-by: Wei Li liwei391@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/kernel/cpufeature.c | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 1e20a579d30a..4e934aca9f53 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1183,6 +1183,20 @@ static u64 __read_sysreg_by_encoding(u32 sys_id)
#include <linux/irqchip/arm-gic-v3.h>
+static bool lse_disabled; + +static int __init parse_lse(char *str) +{ + if (str == NULL) + return 1; + + if (!strncmp(str, "off", 3)) + lse_disabled = true; + + return 0; +} +early_param("lse", parse_lse); + static bool feature_matches(u64 reg, const struct arm64_cpu_capabilities *entry) { @@ -1205,6 +1219,20 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope) return feature_matches(val, entry); }
+#ifdef CONFIG_ARM64_LSE_ATOMICS +static bool has_cpuid_feature_lse(const struct arm64_cpu_capabilities *entry, + int scope) +{ + if (lse_disabled) { + pr_info_once("%s forced OFF by command line option\n", + entry->desc); + return false; + } + + return has_cpuid_feature(entry, scope); +} +#endif + static bool has_useable_gicv3_cpuif(const struct arm64_cpu_capabilities *entry, int scope) { bool has_sre; @@ -1793,7 +1821,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .desc = "LSE atomic instructions", .capability = ARM64_HAS_LSE_ATOMICS, .type = ARM64_CPUCAP_SYSTEM_FEATURE, - .matches = has_cpuid_feature, + .matches = has_cpuid_feature_lse, .sys_reg = SYS_ID_AA64ISAR0_EL1, .field_pos = ID_AA64ISAR0_ATOMICS_SHIFT, .sign = FTR_UNSIGNED,
From: Xiongfeng Wang wangxiongfeng2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KCU2 CVE: NA
----------------------------------------
One patch in the ILP32 patchset renamed 'compat_user_mode' and 'compat_thumb_mode' to 'a32_user_mode' and 'a32_thumb_mode'. However, these two macros are used by some open-source userspace applications. To keep compatibility, we redefine these two macros.
Fixes: 23b2f00 ("arm64: rename functions that reference compat term") Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: liwei liwei391@huawei.com --- arch/arm64/include/asm/ptrace.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 34ed891da81b..9193e40b0cce 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -220,6 +220,8 @@ static inline void forget_syscall(struct pt_regs *regs) #define a32_thumb_mode(regs) (0) #endif
+#define compat_thumb_mode(regs) a32_thumb_mode(regs) + #define user_mode(regs) \ (((regs)->pstate & PSR_MODE_MASK) == PSR_MODE_EL0t)
@@ -227,6 +229,8 @@ static inline void forget_syscall(struct pt_regs *regs) (((regs)->pstate & (PSR_MODE32_BIT | PSR_MODE_MASK)) == \ (PSR_MODE32_BIT | PSR_MODE_EL0t))
+#define compat_user_mode(regs) a32_user_mode(regs) + #define processor_mode(regs) \ ((regs)->pstate & PSR_MODE_MASK)
From: Xiongfeng Wang wangxiongfeng2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KCU2 CVE: NA
----------------------------------------
When we set 'nr_cpus=1' as a kernel parameter, we get the following error. This is because 'acpi_map_cpuid()' returns -ENODEV in 'acpi_map_cpu()' when there are not enough logical CPU numbers, so we need to check the returned logical CPU number and return an error if it is negative.
[ 0.025955] Unable to handle kernel paging request at virtual address ffff00002915b828 [ 0.025958] Mem abort info: [ 0.025959] ESR = 0x96000006 [ 0.025961] Exception class = DABT (current EL), IL = 32 bits [ 0.025963] SET = 0, FnV = 0 [ 0.025965] EA = 0, S1PTW = 0 [ 0.025966] Data abort info: [ 0.025968] ISV = 0, ISS = 0x00000006 [ 0.025970] CM = 0, WnR = 0 [ 0.025972] swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____) [ 0.025974] [ffff00002915b828] pgd=000000013fffe003, pud=000000013fffd003, pmd=0000000000000000 [ 0.025979] Internal error: Oops: 96000006 [#1] SMP [ 0.025981] Modules linked in: [ 0.025983] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) [ 0.025986] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.19.141+ #37 [ 0.025988] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 0.025991] pstate: a0c00005 (NzCv daif +PAN +UAO) [ 0.025993] pc : acpi_map_cpu+0xe0/0x170 [ 0.025996] lr : acpi_map_cpu+0xb8/0x170 [ 0.025997] sp : ffff8000fef1fa50 [ 0.025999] x29: ffff8000fef1fa50 x28: ffff000008e22058 [ 0.026001] x27: ffff0000092a6000 x26: ffff000008d60778 [ 0.026004] x25: ffff0000094c3000 x24: 0000000000000001 [ 0.026006] x23: ffff8000fe802c18 x22: 00000000ffffffff [ 0.026008] x21: ffff000008a16000 x20: ffff000008a16a20 [ 0.026011] x19: 00000000ffffffed x18: ffffffffffffffff [ 0.026013] x17: 0000000087411dcf x16: 00000000b93a5600 [ 0.026015] x15: ffff000009159708 x14: 0720072007200720 [ 0.026018] x13: 0720072007200720 x12: 0720072007200720 [ 0.026020] x11: 072007200720073d x10: 073d073d073d073d [ 0.026022] x9 : 073d073d073d0764 x8 : 0765076607660766 [ 0.026024] x7 : 0766076607660778 x6 : 0000000000000130 [ 0.026027] x5 : ffff0000085955c8 x4 : 0000000000000000 [ 0.026029] x3 : 0000000000000000 x2 : ffff00000915b830 [ 0.026031] x1 : ffff00002915b828 x0 : 0000200000000000 [ 0.026033] Call trace: [ 0.026035] acpi_map_cpu+0xe0/0x170 [ 0.026038] acpi_processor_add+0x44c/0x640 [
0.026040] acpi_bus_attach+0x174/0x218 [ 0.026043] acpi_bus_attach+0xa8/0x218 [ 0.026045] acpi_bus_attach+0xa8/0x218 [ 0.026047] acpi_bus_attach+0xa8/0x218 [ 0.026049] acpi_bus_scan+0x58/0xb8 [ 0.026052] acpi_scan_init+0xf4/0x234 [ 0.026054] acpi_init+0x318/0x384 [ 0.026056] do_one_initcall+0x54/0x250 [ 0.026059] kernel_init_freeable+0x2d4/0x3c0 [ 0.026061] kernel_init+0x18/0x118 [ 0.026063] ret_from_fork+0x10/0x18 [ 0.026066] Code: d2800020 9120c042 8b010c41 9ad32000 (f820303f)
Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Reviewed-by: Keqian Zhu zhukeqian1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com --- arch/arm64/kernel/acpi.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c index b51ffac3b38d..a81105cfe57e 100644 --- a/arch/arm64/kernel/acpi.c +++ b/arch/arm64/kernel/acpi.c @@ -410,6 +410,10 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, int cpu, nid;
cpu = acpi_map_cpuid(physid, acpi_id); + if (cpu < 0) { + pr_info("Unable to map GICC to logical cpu number\n"); + return cpu; + } nid = acpi_get_node(handle); if (nid != NUMA_NO_NODE) { set_cpu_numa_node(cpu, nid);
From: Xiongfeng Wang wangxiongfeng2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KCU2 CVE: NA
----------------------------------------
When I enabled CONFIG_ARM64_ILP32 and CONFIG_UBSAN, I got the following compile errors. We need to disable UBSAN for 'vdso-ilp32', as was done in commit ab2a69eee74d ("Fix compile problem when CONFIG_KASAN and CONFIG_UBSAN were on")
`.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o `.data' referenced in section `.text' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o: defined in discarded section `.data' of arch/arm64/kernel/vdso-ilp32/gettimeofday-ilp32.o
Signed-off-by: Wei Li liwei391@huawei.com Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/vdso-ilp32/Makefile | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kernel/vdso-ilp32/Makefile b/arch/arm64/kernel/vdso-ilp32/Makefile index 9a5bbe313769..088ba0a7237d 100644 --- a/arch/arm64/kernel/vdso-ilp32/Makefile +++ b/arch/arm64/kernel/vdso-ilp32/Makefile @@ -55,6 +55,9 @@ endif
# Disable gcov profiling for VDSO code GCOV_PROFILE := n +KASAN_SANITIZE := n +UBSAN_SANITIZE := n +KCOV_INSTRUMENT := n
obj-y += vdso-ilp32.o extra-y += vdso-ilp32.lds
From: Xiongfeng Wang wangxiongfeng2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KCU2 CVE: NA
----------------------------------------
Fix the following crash, which occurs when 'fq_flush_timeout()' accesses 'fq->lock' after 'iovad->fq' has been cleared. This happens when 'free_iova_flush_queue()' is called while the 'fq_timer' handler is executing: once the handler starts running, its pending state is cleared and it is detached, so a plain 'del_timer()' does not wait for it. This patch uses 'del_timer_sync()' to wait for the timer handler 'fq_flush_timeout()' to finish before destroying the flush queue.
[ 9052.361840] Unable to handle kernel paging request at virtual address 0000a02fd6c66008 [ 9052.361843] Mem abort info: [ 9052.361845] ESR = 0x96000004 [ 9052.361847] Exception class = DABT (current EL), IL = 32 bits [ 9052.361849] SET = 0, FnV = 0 [ 9052.361850] EA = 0, S1PTW = 0 [ 9052.361852] Data abort info: [ 9052.361853] ISV = 0, ISS = 0x00000004 [ 9052.361855] CM = 0, WnR = 0 [ 9052.361860] user pgtable: 4k pages, 48-bit VAs, pgdp = 000000009b665b91 [ 9052.361863] [0000a02fd6c66008] pgd=0000000000000000 [ 9052.361870] Internal error: Oops: 96000004 [#1] SMP [ 9052.361873] Process rmmod (pid: 51122, stack limit = 0x000000003f5524f7) [ 9052.361881] CPU: 69 PID: 51122 Comm: rmmod Kdump: loaded Tainted: G OE 4.19.36- [ 9052.361882] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 0.81 07/10/2019 [ 9052.361885] pstate: 80400089 (Nzcv daIf +PAN -UAO) [ 9052.361902] pc : fq_flush_timeout+0x9c/0x110 [ 9052.361904] lr : (null) [ 9052.361906] sp : ffff00000965bd80 [ 9052.361907] x29: ffff00000965bd80 x28: 0000000000000202 [ 9052.361912] x27: 0000000000000000 x26: 0000000000000053 [ 9052.361915] x25: ffffa026ed805008 x24: ffff000009119810 [ 9052.361919] x23: ffff00000911b938 x22: ffff00000911bc04 [ 9052.361922] x21: ffffa026ed804f28 x20: 0000a02fd6c66008 [ 9052.361926] x19: 0000a02fd6c64000 x18: ffff000009117000 [ 9052.361929] x17: 0000000000000008 x16: 0000000000000000 [ 9052.361933] x15: ffff000009119708 x14: 0000000000000115 [ 9052.361936] x13: ffff0000092f09d7 x12: 0000000000000000 [ 9052.361940] x11: 0000000000000001 x10: ffff00000965be98 [ 9052.361943] x9 : 0000000000000000 x8 : 0000000000000007 [ 9052.361947] x7 : 0000000000000010 x6 : 000000d658b784ef [ 9052.361950] x5 : 00ffffffffffffff x4 : 00000000ffffffff [ 9052.361954] x3 : 0000000000000013 x2 : 0000000000000001 [ 9052.361957] x1 : 0000000000000000 x0 : 0000a02fd6c66008 [ 9052.361961] Call trace: [ 9052.361967] fq_flush_timeout+0x9c/0x110 [ 9052.361976] call_timer_fn+0x34/0x178 [ 9052.361980] 
expire_timers+0xec/0x158 [ 9052.361983] run_timer_softirq+0xc0/0x1f8 [ 9052.361987] __do_softirq+0x120/0x324 [ 9052.361995] irq_exit+0x11c/0x140 [ 9052.362003] __handle_domain_irq+0x6c/0xc0 [ 9052.362005] gic_handle_irq+0x6c/0x150 [ 9052.362008] el1_irq+0xb8/0x140 [ 9052.362010] vprintk_emit+0x2b4/0x320 [ 9052.362013] vprintk_default+0x54/0x90 [ 9052.362016] vprintk_func+0xa0/0x150 [ 9052.362019] printk+0x74/0x94 [ 9052.362034] nvme_get_smart+0x200/0x220 [nvme] [ 9052.362041] nvme_remove+0x38/0x250 [nvme] [ 9052.362051] pci_device_remove+0x48/0xd8 [ 9052.362065] device_release_driver_internal+0x1b4/0x250 [ 9052.362068] driver_detach+0x64/0xe8 [ 9052.362072] bus_remove_driver+0x64/0x118 [ 9052.362074] driver_unregister+0x34/0x60 [ 9052.362077] pci_unregister_driver+0x24/0xd8 [ 9052.362083] nvme_exit+0x24/0x1754 [nvme] [ 9052.362094] __arm64_sys_delete_module+0x19c/0x2a0 [ 9052.362102] el0_svc_common+0x78/0x130 [ 9052.362106] el0_svc_handler+0x38/0x78 [ 9052.362108] el0_svc+0x8/0xc
Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Yao Hongbo yaohongbo@huawei.com Reviewed-by: Zhen Lei thunder.leizhen@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- drivers/iommu/iova.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 2782a5be8df4..82504049f8e4 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -64,8 +64,7 @@ static void free_iova_flush_queue(struct iova_domain *iovad) if (!has_iova_flush_queue(iovad)) return;
- if (timer_pending(&iovad->fq_timer)) - del_timer(&iovad->fq_timer); + del_timer_sync(&iovad->fq_timer);
flush_work(&iovad->free_iova_work); fq_destroy_all_entries(iovad);
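The del_timer() vs. del_timer_sync() distinction the fix depends on can be sketched with a small single-threaded state-machine model (sim_* names are hypothetical; this is an analogy, not the kernel timer implementation):

```c
#include <assert.h>
#include <stdbool.h>

/* While a timer handler runs, the timer is detached and no longer
 * "pending".  del_timer() only cancels a pending activation, so it
 * returns immediately even though the handler may still be mid-flight,
 * about to dereference data we are going to free.  del_timer_sync()
 * additionally waits for a running handler to finish. */

enum timer_state { TIMER_IDLE, TIMER_PENDING, TIMER_RUNNING };

struct sim_timer {
	enum timer_state state;
	void (*fn)(struct sim_timer *);
};

/* del_timer(): cancel a pending activation only. */
static bool sim_del_timer(struct sim_timer *t)
{
	if (t->state == TIMER_PENDING) {
		t->state = TIMER_IDLE;
		return true;
	}
	return false;	/* a RUNNING handler is left untouched! */
}

/* del_timer_sync(): also wait until an in-flight handler completes. */
static void sim_del_timer_sync(struct sim_timer *t)
{
	while (t->state == TIMER_RUNNING)
		t->fn(t);	/* "wait": let the handler run to completion */
	sim_del_timer(t);
}

static void sample_handler(struct sim_timer *t)
{
	t->state = TIMER_IDLE;	/* handler finished */
}
```

After sim_del_timer() on a RUNNING timer the handler is still active, which is precisely the window in which free_iova_flush_queue() used to destroy the flush queue; only after the sync variant returns is teardown safe.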
From: Xiongfeng Wang wangxiongfeng2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KCU2 CVE: NA
----------------------------------------
We need to clear the EOI for the secure timer only when we panic from the SDEI handler. Clearing the EOI for the secure timer in the normal panic routine has no bad effect on Hi1620, but it may cause undefined behavior on Hi1616. So add a check for NMI context before we clear the EOI for the secure timer.
Fixes: dd397d5febc4 ("sdei_watchdog: clear EOI of the secure timer before kdump")
Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Wei Li liwei391@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Xie XiuQi xiexiuqi@huawei.com --- arch/arm64/kernel/machine_kexec.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c index 53def49c2ea3..0b09ee49cfba 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -262,7 +262,8 @@ void machine_crash_shutdown(struct pt_regs *regs) * interrupt failed to trigger in the second kernel. So we clear eoi * of the secure timer before booting the second kernel. */ - sdei_watchdog_clear_eoi(); + if (in_nmi()) + sdei_watchdog_clear_eoi();
/* for crashing cpu */ crash_save_cpu(regs, smp_processor_id());
From: Xiongfeng Wang wangxiongfeng2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KCU2 CVE: NA
----------------------------------------
When I ran the Syzkaller test suite, I got the following call trace.

Reviewed-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Xie XiuQi xiexiuqi@huawei.com
================================================================================ UBSAN: Undefined behaviour in kernel/time/ntp.c:457:16 signed integer overflow: 9223372036854775807 + 500 cannot be represented in type 'long int' CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.25-dirty #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 handle_overflow+0x193/0x1e2 lib/ubsan.c:190 second_overflow+0x403/0x540 kernel/time/ntp.c:457 accumulate_nsecs_to_secs kernel/time/timekeeping.c:2002 [inline] logarithmic_accumulation kernel/time/timekeeping.c:2046 [inline] timekeeping_advance+0x2bb/0xec0 kernel/time/timekeeping.c:2114 tick_do_update_jiffies64.part.2+0x1a0/0x350 kernel/time/tick-sched.c:97 tick_do_update_jiffies64 kernel/time/tick-sched.c:1229 [inline] tick_nohz_update_jiffies kernel/time/tick-sched.c:499 [inline] tick_nohz_irq_enter kernel/time/tick-sched.c:1232 [inline] tick_irq_enter+0x1fd/0x240 kernel/time/tick-sched.c:1249 irq_enter+0xc4/0x100 kernel/softirq.c:353 entering_irq arch/x86/include/asm/apic.h:517 [inline] entering_ack_irq arch/x86/include/asm/apic.h:523 [inline] smp_apic_timer_interrupt+0x20/0x480 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:864 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58 Code: 01 f0 0f 82 bc fd ff ff 48 c7 c7 c0 21 b1 83 e8 a1 0a 02 ff e9 ab fd ff ff 4c 89 e7 e8 77 b6 a5 fe e9 6a ff ff ff 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffff888106307d20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000007 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881062e4f1c RBP: 0000000000000003 R08: ffffed107c5dc77b R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 
ffffffff848c78a0 R13: 0000000000000003 R14: 1ffff11020c60fae R15: 0000000000000000 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline] default_idle+0x24/0x2b0 arch/x86/kernel/process.c:561 cpuidle_idle_call kernel/sched/idle.c:153 [inline] do_idle+0x2ca/0x420 kernel/sched/idle.c:262 cpu_startup_entry+0xcb/0xe0 kernel/sched/idle.c:368 start_secondary+0x421/0x570 arch/x86/kernel/smpboot.c:271 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243 ================================================================================
This is because time_maxerror is set to 0x7FFFFFFFFFFFFFFF by the user. It overflows when we add 'MAXFREQ / NSEC_PER_USEC' to it in second_overflow().
This patch adds a limit check and saturates time_maxerror when the user sets it.
Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- kernel/time/ntp.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c index 069ca78fb0bf..4fa664b26e16 100644 --- a/kernel/time/ntp.c +++ b/kernel/time/ntp.c @@ -680,6 +680,8 @@ static inline void process_adjtimex_modes(const struct __kernel_timex *txc,
if (txc->modes & ADJ_MAXERROR) time_maxerror = txc->maxerror; + if (time_maxerror > NTP_PHASE_LIMIT) + time_maxerror = NTP_PHASE_LIMIT;
if (txc->modes & ADJ_ESTERROR) time_esterror = txc->esterror;
From: Fang Lijun fanglijun3@huawei.com
ascend inclusion category: Bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMLR CVE: NA
--------------
Enabling COHERENT_DEVICE degrades performance: the Hackbench test time (Pipe_Process_Number=800) increases from 0.3 to 1.8. When cdmmask is cacheline aligned, performance improves to the same level as with COHERENT_DEVICE disabled.
Signed-off-by: Fang Lijun fanglijun3@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c index b2260bb53691..9a2e29a3a597 100644 --- a/arch/arm64/mm/numa.c +++ b/arch/arm64/mm/numa.c @@ -26,7 +26,7 @@ static u8 *numa_distance; bool numa_off;
#ifdef CONFIG_COHERENT_DEVICE -nodemask_t cdmmask; +nodemask_t __cacheline_aligned cdmmask;
inline int arch_check_node_cdm(int nid) {
From: Herbert Xu herbert@gondor.apana.org.au
mainline inclusion from mainline-master commit cbbb5f07ab737f868f90d429255d5d644280f6a9 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4IU7R CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/co...
----------------------------------------------------------------------
The function qm_qos_value_init expects an unsigned integer but is incorrectly supplying a signed format to sscanf. This patch fixes it.
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/crypto/hisilicon/qm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index 69528da0c0d7..a82960fae6eb 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -4235,7 +4235,7 @@ static ssize_t qm_qos_value_init(const char *buf, unsigned long *val) return -EINVAL; }
- ret = sscanf(buf, "%ld", val); + ret = sscanf(buf, "%lu", val); if (ret != QM_QOS_VAL_NUM) return -EINVAL;
From: Colin Ian King colin.king@canonical.com
mainline inclusion from mainline-master commit 6e96dbe7c40a66a1dac3cdc8d29e9172d937a7b1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4IU8M CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/co...
----------------------------------------------------------------------
There is a spelling mistake in a literal string. Fix it.
Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/crypto/hisilicon/zip/zip_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/hisilicon/zip/zip_main.c b/drivers/crypto/hisilicon/zip/zip_main.c index 7148201ce76e..873971ef9aee 100644 --- a/drivers/crypto/hisilicon/zip/zip_main.c +++ b/drivers/crypto/hisilicon/zip/zip_main.c @@ -218,7 +218,7 @@ static const struct debugfs_reg32 hzip_dfx_regs[] = { {"HZIP_AVG_DELAY ", 0x28ull}, {"HZIP_MEM_VISIBLE_DATA ", 0x30ull}, {"HZIP_MEM_VISIBLE_ADDR ", 0x34ull}, - {"HZIP_COMSUMED_BYTE ", 0x38ull}, + {"HZIP_CONSUMED_BYTE ", 0x38ull}, {"HZIP_PRODUCED_BYTE ", 0x40ull}, {"HZIP_COMP_INF ", 0x70ull}, {"HZIP_PRE_OUT ", 0x78ull},
From: Kai Ye yekai13@huawei.com
mainline inclusion from mainline-master commit 183b60e005975d3c84c22199ca64a9221e620fb6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4IUAA CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/co...
----------------------------------------------------------------------
As the qm should register to uacce only in UACCE_MODE_SVA mode, this patch checks the uacce mode before doing the uacce registration.
Signed-off-by: Kai Ye yekai13@huawei.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/crypto/hisilicon/qm.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index a82960fae6eb..edcd4282d951 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -3127,7 +3127,7 @@ static int qm_alloc_uacce(struct hisi_qm *qm) if (IS_ERR(uacce)) return PTR_ERR(uacce);
- if (uacce->flags & UACCE_DEV_SVA && qm->mode == UACCE_MODE_SVA) { + if (uacce->flags & UACCE_DEV_SVA) { qm->use_sva = true; } else { /* only consider sva case */ @@ -3411,8 +3411,10 @@ void hisi_qm_uninit(struct hisi_qm *qm)
qm_irq_unregister(qm); hisi_qm_pci_uninit(qm); - uacce_remove(qm->uacce); - qm->uacce = NULL; + if (qm->use_sva) { + uacce_remove(qm->uacce); + qm->uacce = NULL; + }
up_write(&qm->qps_lock); } @@ -5835,9 +5837,11 @@ int hisi_qm_init(struct hisi_qm *qm) goto err_irq_register; }
- ret = qm_alloc_uacce(qm); - if (ret < 0) - dev_warn(dev, "fail to alloc uacce (%d)\n", ret); + if (qm->mode == UACCE_MODE_SVA) { + ret = qm_alloc_uacce(qm); + if (ret < 0) + dev_warn(dev, "fail to alloc uacce (%d)\n", ret); + }
ret = hisi_qm_memory_init(qm); if (ret) @@ -5850,8 +5854,10 @@ int hisi_qm_init(struct hisi_qm *qm) return 0;
err_alloc_uacce: - uacce_remove(qm->uacce); - qm->uacce = NULL; + if (qm->use_sva) { + uacce_remove(qm->uacce); + qm->uacce = NULL; + } err_irq_register: qm_irq_unregister(qm); err_pci_init:
From: Longfang Liu liulongfang@huawei.com
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JA4O
----------------------------------------------------------------------
In the previous method, the get-queue operation overwrote the PF queue address, which caused a call trace when the PF device driver was unloaded.
Signed-off-by: Longfang Liu liulongfang@huawei.com Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- .../crypto/hisilicon/migration/acc_vf_migration.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/crypto/hisilicon/migration/acc_vf_migration.c b/drivers/crypto/hisilicon/migration/acc_vf_migration.c index 63c396d55344..54f83edabf44 100644 --- a/drivers/crypto/hisilicon/migration/acc_vf_migration.c +++ b/drivers/crypto/hisilicon/migration/acc_vf_migration.c @@ -657,19 +657,17 @@ static int pf_qm_state_pre_save(struct hisi_qm *qm, int vf_id = acc_vf_dev->vf_id; int ret;
- /* vf acc type save */ + /* Vf acc type save */ vf_data->acc_type = acc_vf_dev->acc_type;
- /* vf qp num save from PF */ - ret = pf_qm_get_qp_num(qm, vf_id, &qm->qp_base, &qm->qp_num); - if (ret || qm->qp_num <= 1) { + /* Vf qp num save from PF */ + ret = pf_qm_get_qp_num(qm, vf_id, &vf_data->qp_base, &vf_data->qp_num); + if (ret) { dev_err(dev, "failed to get vft qp nums!\n"); return -EINVAL; } - vf_data->qp_base = qm->qp_base; - vf_data->qp_num = qm->qp_num;
- /* vf isolation state save from PF */ + /* Vf isolation state save from PF */ ret = qm_read_reg(qm, QM_QUE_ISO_CFG_V, &vf_data->que_iso_cfg, 1); if (ret) { dev_err(dev, "failed to read QM_QUE_ISO_CFG_V!\n");
From: Longfang Liu liulongfang@huawei.com
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JA45
----------------------------------------------------------------------
In the guest reset scenario, the driver inside the guest does not perceive the system restart, so the accelerator driver cannot update the driver state accordingly. When the target end of the live migration is restored, the restart of the qp cannot be skipped normally, which causes a page fault exception and a reset of the accelerator PF.
Signed-off-by: Longfang Liu liulongfang@huawei.com Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- .../hisilicon/migration/acc_vf_migration.c | 32 +++++++++++++------ 1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/drivers/crypto/hisilicon/migration/acc_vf_migration.c b/drivers/crypto/hisilicon/migration/acc_vf_migration.c index 54f83edabf44..7dcea3b9a6c6 100644 --- a/drivers/crypto/hisilicon/migration/acc_vf_migration.c +++ b/drivers/crypto/hisilicon/migration/acc_vf_migration.c @@ -500,11 +500,12 @@ static void vf_qm_fun_restart(struct hisi_qm *qm, int i;
/* - * When the system is rebooted, the SMMU page table is destroyed, - * and the QP queue cannot be returned normally at this time. - * if vf_ready == 0x2, don't need to restart QP. + * When the Guest is rebooted or reseted, the SMMU page table + * will be destroyed, and the QP queue cannot be returned + * normally at this time. so if Guest acc driver have removed, + * don't need to restart QP. */ - if (vf_data->vf_state == VF_PREPARE) { + if (vf_data->vf_state != VF_READY) { dev_err(dev, "failed to restart VF!\n"); return; } @@ -805,12 +806,7 @@ static int acc_vf_set_device_state(struct acc_vf_migration *acc_vf_dev,
break; case VFIO_DEVICE_STATE_STOP: - /* restart all VF's QP */ - vf_qm_fun_restart(qm, acc_vf_dev); - - break; case VFIO_DEVICE_STATE_RESUMING: - break; default: ret = -EFAULT; @@ -1210,12 +1206,30 @@ static void acc_vf_release(void *device_data) module_put(THIS_MODULE); }
+static void acc_vf_reset(void *device_data) +{ + struct acc_vf_migration *acc_vf_dev = + vfio_pci_vendor_data(device_data); + struct hisi_qm *qm = acc_vf_dev->vf_qm; + struct device *dev = &qm->pdev->dev; + u32 vf_state = VF_NOT_READY; + int ret; + + dev_info(dev, "QEMU prepare to Reset Guest!\n"); + ret = qm_write_reg(qm, QM_VF_STATE, &vf_state, 1); + if (ret) + dev_err(dev, "failed to write QM_VF_STATE\n"); +} + static long acc_vf_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { switch (cmd) { case VFIO_DEVICE_GET_REGION_INFO: return acc_vf_get_region_info(device_data, cmd, arg); + case VFIO_DEVICE_RESET: + acc_vf_reset(device_data); + return vfio_pci_ioctl(device_data, cmd, arg); default: return vfio_pci_ioctl(device_data, cmd, arg); }
From: Longfang Liu liulongfang@huawei.com
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JA4W
----------------------------------------------------------------------
In the previous driver, the queue isolation configuration register of the PF page was mixed up with the queue isolation configuration register of the VF page, which caused a mismatch in the configuration information during migration. Because BIT0 of the PF page's queue isolation register is consistent with the value of the VF page's queue isolation register, the driver only needs to read the register of the VF page.
Signed-off-by: Longfang Liu liulongfang@huawei.com Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/crypto/hisilicon/migration/acc_vf_migration.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/crypto/hisilicon/migration/acc_vf_migration.c b/drivers/crypto/hisilicon/migration/acc_vf_migration.c index 7dcea3b9a6c6..920f19916fea 100644 --- a/drivers/crypto/hisilicon/migration/acc_vf_migration.c +++ b/drivers/crypto/hisilicon/migration/acc_vf_migration.c @@ -381,12 +381,6 @@ static int qm_rw_regs_write(struct hisi_qm *qm, struct acc_vf_data *vf_data) return ret; }
- ret = qm_write_reg(qm, QM_QUE_ISO_CFG_V, &vf_data->que_iso_cfg, 1); - if (ret) { - dev_err(dev, "failed to write QM_QUE_ISO_CFG_V!\n"); - return ret; - } - ret = qm_write_reg(qm, QM_PAGE_SIZE, &vf_data->page_size, 1); if (ret) { dev_err(dev, "failed to write QM_PAGE_SIZE!\n"); @@ -518,6 +512,7 @@ static int vf_match_info_check(struct hisi_qm *qm, struct acc_vf_migration *acc_vf_dev) { struct acc_vf_data *vf_data = acc_vf_dev->vf_data; + struct hisi_qm *pf_qm = acc_vf_dev->pf_qm; struct device *dev = &qm->pdev->dev; u32 que_iso_state; int ret; @@ -541,7 +536,7 @@ static int vf_match_info_check(struct hisi_qm *qm, }
/* vf isolation state check */ - ret = qm_read_reg(qm, QM_QUE_ISO_CFG_V, &que_iso_state, 1); + ret = qm_read_reg(pf_qm, QM_QUE_ISO_CFG_V, &que_iso_state, 1); if (ret) { dev_err(dev, "failed to read QM_QUE_ISO_CFG_V!\n"); return ret;
From: Yang Shen shenyang39@huawei.com
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4J9WR
----------------------------------------------------------------------
Enable the config options for the SVA feature.
Signed-off-by: Yang Shen shenyang39@huawei.com Reviewed-by: Hao Fang fanghao11@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/configs/openeuler_defconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig index d94a72fd0ffe..73cd7d9c9a78 100644 --- a/arch/arm64/configs/openeuler_defconfig +++ b/arch/arm64/configs/openeuler_defconfig @@ -5812,11 +5812,12 @@ CONFIG_IOMMU_IO_PGTABLE_LPAE=y # CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set CONFIG_OF_IOMMU=y CONFIG_IOMMU_DMA=y +CONFIG_IOMMU_SVA_LIB=y CONFIG_ARM_SMMU=y # CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS is not set CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y CONFIG_ARM_SMMU_V3=y -# CONFIG_ARM_SMMU_V3_SVA is not set +CONFIG_ARM_SMMU_V3_SVA=y CONFIG_SMMU_BYPASS_DEV=y # CONFIG_QCOM_IOMMU is not set # CONFIG_VIRTIO_IOMMU is not set
From: Xishi Qiu qiuxishi@huawei.com
euler inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4K05G?from=project-issue CVE: N/A
-------------------------------------------------
Add support for port isolation for QLogic HBA cards.
Signed-off-by: Xishi Qiu qiuxishi@huawei.com Signed-off-by: Fang Ying fangying1@huawei.com Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Hui Wang john.wanghui@huawei.com Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com
Conflicts: drivers/pci/quirks.c
Signed-off-by: Xuefeng Wang wxf.wang@hisilicon.com Reviewed-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com
Conflicts: drivers/pci/quirks.c Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/pci/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index c5d38cb329d7..96245b93e613 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4889,6 +4889,8 @@ static const struct pci_dev_acs_enabled { { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_spt_pch_acs }, { 0x19a2, 0x710, pci_quirk_mf_endpoint_acs }, /* Emulex BE3-R */ { 0x10df, 0x720, pci_quirk_mf_endpoint_acs }, /* Emulex Skyhawk-R */ + { 0x1077, 0x2031, pci_quirk_mf_endpoint_acs }, /* QLogic QL2672 */ + { 0x1077, 0x2532, pci_quirk_mf_endpoint_acs }, /* QLogic QL2532 */ { PCI_VENDOR_ID_CAVIUM, PCI_ANY_ID, pci_quirk_cavium_acs }, /* Cavium multi-function devices */
From: Wei Li liwei391@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JUZZ
---------------------------
On ARM64, armv8_pmu_driver_init() is called in do_basic_setup(), so lockup_detector_init() will fail to create the perf event if it is moved back before do_basic_setup(). So revert the patch first.
Fixes: 60565144df0a ("init: only move down lockup_detector_init() when sdei_watchdog is enabled") Signed-off-by: Wei Li liwei391@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/kernel/watchdog_sdei.c | 2 +- include/linux/nmi.h | 2 -- init/main.c | 6 +----- 3 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kernel/watchdog_sdei.c b/arch/arm64/kernel/watchdog_sdei.c index cdbe2ebe3d69..aa980b090598 100644 --- a/arch/arm64/kernel/watchdog_sdei.c +++ b/arch/arm64/kernel/watchdog_sdei.c @@ -21,7 +21,7 @@ #define SDEI_NMI_WATCHDOG_HWIRQ 29
static int sdei_watchdog_event_num; -bool disable_sdei_nmi_watchdog; +static bool disable_sdei_nmi_watchdog; static bool sdei_watchdog_registered; static DEFINE_PER_CPU(ktime_t, last_check_time);
diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 0cc36b799df6..a01ab0ade22d 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -241,10 +241,8 @@ int proc_watchdog_cpumask(struct ctl_table *, int, void *, size_t *, loff_t *);
#ifdef CONFIG_SDEI_WATCHDOG void sdei_watchdog_clear_eoi(void); -extern bool disable_sdei_nmi_watchdog; #else static inline void sdei_watchdog_clear_eoi(void) { } -#define disable_sdei_nmi_watchdog 1 #endif
#endif diff --git a/init/main.c b/init/main.c index 646e20a8d1ff..dedd20bcfc9c 100644 --- a/init/main.c +++ b/init/main.c @@ -1517,8 +1517,6 @@ static noinline void __init kernel_init_freeable(void)
rcu_init_tasks_generic(); do_pre_smp_initcalls(); - if (disable_sdei_nmi_watchdog) - lockup_detector_init();
smp_init(); sched_init_smp(); @@ -1530,9 +1528,7 @@ static noinline void __init kernel_init_freeable(void)
do_basic_setup();
- /* sdei_watchdog needs to be initialized after sdei_init */ - if (!disable_sdei_nmi_watchdog) - lockup_detector_init(); + lockup_detector_init();
kunit_run_all_tests();
From: Li Kun hw.likun@huawei.com
euler inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4L078 CVE: NA
-------------------------------------------------
We backported commit 17f4bad3abc7 ("ima: remove usage of filename parameter") to support absolute paths in the IMA measurement log. When getting the absolute path fails, a pathname with a NULL value is passed on to the subsequent measurement processing. Fall back to the relative path (the dentry name) when getting the absolute path fails.
Signed-off-by: Li Kun hw.likun@huawei.com Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Hui Wang john.wanghui@huawei.com
Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Jason Yan yanaijie@huawei.com
Conflicts: security/integrity/ima/ima_main.c
Signed-off-by: Guo Zihua guozihua@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- security/integrity/ima/ima_main.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c index 84826e4c844d..e905b70ab0bd 100644 --- a/security/integrity/ima/ima_main.c +++ b/security/integrity/ima/ima_main.c @@ -502,6 +502,9 @@ static int process_ns_measurement(struct file *file, const struct cred *cred, if (!pathbuf) /* ima_rdwr_violation possibly pre-fetched */ pathname = ima_d_path(&file->f_path, &pathbuf, filename);
+ if (!pathname || strlen(pathname) > IMA_EVENT_NAME_LEN_MAX) + pathname = file->f_path.dentry->d_name.name; + found_digest = ima_lookup_digest(iint->ima_hash->digest, hash_algo, COMPACT_FILE);
From: Cheng Jian cj.chengjian@huawei.com
euler inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4K2D1 CVE: NA
-------------------------------------------------------------------------
When we register a kretprobe, data_size is used to allocate space for storing per-instance private data.

If we use a negative value as data_size, the registration will succeed, but it then causes a slab-out-of-bounds access, which can be found by KASAN.

The call trace is as below:
============================================================= BUG: KASAN: slab-out-of-bounds in trampoline_probe_handler +0xb4/0x2f0 at addr ffff8000b732a7a0 Read of size 8 by task sh/1945 ============================================================= BUG kmalloc-64 (Tainted: G B W OE ): kasan: bad access detected ------------------------------------------------------------- INFO: Allocated in register_kretprobe+0x12c/0x350 age=157 cpu=4 pid=1947 ...... INFO: Freed in do_one_initcall+0x110/0x260 age=169 cpu=4 pid=1947 ...... INFO: Slab 0xffff7bffc2dcca80 objects=21 used=10 fp=0xffff8000b732aa80 flags=0x7fff00000004080 INFO: Object 0xffff8000b732a780 @offset=1920 fp=0x (null)
CPU: 7 PID: 1945 Comm: sh Tainted: G B W OE 4.1.46 #8 Hardware name: linux,dummy-virt (DT) Call trace: [<0008d2a0>] dump_backtrace+0x0/0x220 [<0008d4e0>] show_stack+0x20/0x30 [<00ff2278>] dump_stack+0xa8/0xcc [<002dc6c8>] print_trailer+0xf8/0x160 [<002e20d8>] object_err+0x48/0x60 [<002e48dc>] kasan_report+0x26c/0x5a0 [<002e39a0>] __asan_load8+0x60/0x80 [<01000054>] trampoline_probe_handler+0xb4/0x2f0 [<00ffff38>] kretprobe_trampoline+0x54/0xbc Memory state around the buggy address: b732a680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc b732a700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >b732a780: 00 00 00 00 07 fc fc fc fc fc fc fc fc fc fc fc ^
If data_size is invalid, then we should not register it.
Signed-off-by: Cheng Jian cj.chengjian@huawei.com Reported-by: Kong ZhangHuan kongzhanghuan@huawei.com Acked-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Mao Wenan maowenan@huawei.com Signed-off-by: Hui Wang john.wanghui@huawei.com Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com
Conflicts: kernel/kprobes.c
Signed-off-by: Xuefeng Wang wxf.wang@hisilicon.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com
Conflicts: kernel/kprobes.c
[ hf: cherry-pick from openEuler-1.0-LTS ] Signed-off-by: Li Huafei lihuafei1@huawei.com Reviewed-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/kprobes.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c index f590e9ff3706..ae9354f2cf2e 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2118,6 +2118,9 @@ int register_kretprobe(struct kretprobe *rp) int i; void *addr;
+ if ((ssize_t)rp->data_size < 0) + return -EINVAL; + ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset); if (ret) return ret;