This patch series migrates the share pool feature from OLK-5.10 to OLK-6.6. We do not migrate it patch by patch, since OLK-5.10 carries hundreds of patches, including bugfixes and small incremental features. Instead we take the final version and split it into a few small patches, each of which contains one feature in its final form.

The share pool is a large feature, mainly used to share user virtual memory between processes in the same group. It is used as follows:

1. Process A creates a new group, which is owned by process A.
2. Process A adds process B to the group.
3. Process A adds process C to the same group.
4. Process B allocates a new memory VA and writes something into it.
5. The VA is sent to process C via IPC, and process C receives it.
6. Process C accesses the VA and reads the data directly.
7. Process A can add more processes to the group to share the memory.
8. Free the memory with the free function or by exiting the group.

The new feature is enabled by both CONFIG_SHARE_POOL and the enable_ascend_share_pool bootarg; it has no effect when disabled.
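For reference, below is a minimal sketch of this flow at the in-kernel API level, using the mg_sp_* interfaces added later in this series. The tgid values, the chosen flags and the error handling are illustrative only and are not part of any patch:

#include <linux/err.h>
#include <linux/mman.h>
#include <linux/sizes.h>
#include <linux/share_pool.h>

/* Illustrative only: drive steps 1-8 above from a kernel module. */
static int share_pool_flow(int tgid_a, int tgid_b, int tgid_c)
{
	int spg_id, ret;
	void *va;

	/* Step 1: process A creates a new group (id generated automatically). */
	spg_id = mg_sp_group_add_task(tgid_a, PROT_READ | PROT_WRITE, SPG_ID_AUTO);
	if (spg_id < 0)
		return spg_id;

	/* Steps 2-3: add process B and process C to the same group. */
	ret = mg_sp_group_add_task(tgid_b, PROT_READ | PROT_WRITE, spg_id);
	if (ret < 0)
		return ret;
	ret = mg_sp_group_add_task(tgid_c, PROT_READ | PROT_WRITE, spg_id);
	if (ret < 0)
		return ret;

	/*
	 * Step 4: allocate shared memory. The returned VA is valid in every
	 * process of the group, so it can be passed around by IPC (steps 5-7).
	 */
	va = mg_sp_alloc(SZ_2M, SP_HUGEPAGE, spg_id);
	if (IS_ERR(va))
		return PTR_ERR(va);

	/* Step 8: free the shared memory when it is no longer needed. */
	return mg_sp_free((unsigned long)va, spg_id);
}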
Wang Wensheng (19):
  mm/hugetlb: Introduce hugetlb_insert_hugepage_pte[_by_pa]
  mm/vmalloc: Extend vmalloc usage about hugepage
  mm: Extend mmap associated functions to accept mm_struct
  mm/sharepool: Add base framework for share_pool
  mm/sharepool: Add sp_area management code
  mm/sharepool: Reserve the va space for share_pool
  mm/sharepool: Implement mg_sp_make_share_u2k()
  mm/sharepool: Implement mg_sp_unshare_kva
  mm/sharepool: Implement mg_sp_walk_page_range()
  mm/sharepool: Implement mg_sp_free()
  mm/sharepool: Implement mg_sp_alloc()
  mm/sharepool: Implement mg_sp_make_share_k2u()
  mm/sharepool: Implement mg_sp_group_add_task()
  mm/sharepool: Implement mg_sp_group_id_by_pid()
  mm/sharepool: Implement mg_sp_id_of_current()
  mm/sharepool: Implement mg_sp_config_dvpp_range()
  mm/sharepool: Add proc interfaces to show sp info
  mm/sharepool: support fork() and exit() to handle the mm
  mm/sharepool: Protect the va reserved for sharepool
 include/linux/hugetlb.h    |   32 +
 include/linux/mempolicy.h  |   11 +
 include/linux/mm.h         |   17 +
 include/linux/mm_types.h   |    6 +
 include/linux/share_pool.h |  277 +++
 include/linux/vmalloc.h    |   18 +
 kernel/fork.c              |    3 +
 mm/Kconfig                 |   34 +
 mm/Makefile                |    1 +
 mm/gup.c                   |   23 +-
 mm/hugetlb.c               |   78 +
 mm/mempolicy.c             |   12 +-
 mm/mmap.c                  |   50 +-
 mm/mremap.c                |    4 +
 mm/share_pool.c            | 3450 ++++++++++++++++++++++++++++++++++++
 mm/util.c                  |    4 +
 mm/vmalloc.c               |  205 ++-
 17 files changed, 4204 insertions(+), 21 deletions(-)
 create mode 100644 include/linux/share_pool.h
 create mode 100644 mm/share_pool.c
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
Add hugetlb_insert_hugepage_pte[_by_pa] to insert hugepages into a process page table. The by_pa version behaves like remap_pfn_range(): it makes the PTE special, so it can be used for reserved physical memory.
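For illustration only (not part of this patch): a driver's mmap handler could use the by_pa variant roughly as below to map a reserved, physically contiguous region with PMD-sized entries. RESERVED_PHYS_BASE is a made-up address, and a PMD-sized hstate (2M hugepages) is assumed to be configured:

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/hugetlb.h>

#define RESERVED_PHYS_BASE	0x200000000UL	/* hypothetical, PMD-aligned */

static int demo_reserved_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long addr = vma->vm_start;
	unsigned long phys = RESERVED_PHYS_BASE;
	int ret;

	/* Both the user address and the length must be PMD-aligned. */
	if (!IS_ALIGNED(addr, PMD_SIZE) ||
	    !IS_ALIGNED(vma->vm_end - vma->vm_start, PMD_SIZE))
		return -EINVAL;

	for (; addr < vma->vm_end; addr += PMD_SIZE, phys += PMD_SIZE) {
		/* The PTE is made special, as with remap_pfn_range(). */
		ret = hugetlb_insert_hugepage_pte_by_pa(vma->vm_mm, addr,
							vma->vm_page_prot, phys);
		if (ret)
			return ret;
	}

	vm_flags_set(vma, VM_DONTEXPAND | VM_DONTDUMP);
	return 0;
}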
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
---
 include/linux/hugetlb.h | 32 +++++++++++++++++++
 mm/Kconfig              | 15 +++++++++
 mm/hugetlb.c            | 70 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 117 insertions(+)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 47d25a5e1933..57c2630cb80d 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -282,6 +282,26 @@ long hugetlb_change_protection(struct vm_area_struct *vma, bool is_hugetlb_entry_migration(pte_t pte); void hugetlb_unshare_all_pmds(struct vm_area_struct *vma);
+#ifdef CONFIG_HUGETLB_INSERT_PAGE +int hugetlb_insert_hugepage_pte(struct mm_struct *mm, unsigned long addr, + pgprot_t prot, struct page *hpage); +int hugetlb_insert_hugepage_pte_by_pa(struct mm_struct *mm, + unsigned long vir_addr, + pgprot_t prot, unsigned long phy_addr); +#else /* CONFIG_HUGETLB_INSERT_PAGE */ +static inline int hugetlb_insert_hugepage_pte(struct mm_struct *mm, unsigned long addr, + pgprot_t prot, struct page *hpage) +{ + return -EPERM; +} +static inline int hugetlb_insert_hugepage_pte_by_pa(struct mm_struct *mm, + unsigned long vir_addr, + pgprot_t prot, unsigned long phy_addr) +{ + return -EPERM; +} +#endif /* CONFIG_HUGETLB_INSERT_PAGE */ + #else /* !CONFIG_HUGETLB_PAGE */
static inline void hugetlb_dup_vma_private(struct vm_area_struct *vma) @@ -491,6 +511,18 @@ static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
static inline void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) { }
+static inline int hugetlb_insert_hugepage_pte(struct mm_struct *mm, unsigned long addr, + pgprot_t prot, struct page *hpage) +{ + return -EPERM; +} +static inline int hugetlb_insert_hugepage_pte_by_pa(struct mm_struct *mm, + unsigned long vir_addr, + pgprot_t prot, unsigned long phy_addr) +{ + return -EPERM; +} + #endif /* !CONFIG_HUGETLB_PAGE */ /* * hugepages at page global directory. If arch support diff --git a/mm/Kconfig b/mm/Kconfig index ece4f2847e2b..3dead7328cd5 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1269,6 +1269,21 @@ config LOCK_MM_AND_FIND_VMA bool depends on !STACK_GROWSUP
+menuconfig ASCEND_FEATURES + bool "Support Ascend Features" + depends on ARM64 + select HUGETLB_INSERT_PAGE + help + The Ascend chip use the Hisilicon DaVinci architecture, and mainly + focus on AI and machine leanring area, contains many external features. + Enable this config to enable selective list of these features. + If unsure, say N + +config HUGETLB_INSERT_PAGE + bool + help + This allowed a driver to insert hugetlb mapping into user address space. + source "mm/damon/Kconfig"
endmenu diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1a4d388b6a3b..a148584422d9 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7552,3 +7552,73 @@ static void __init hugetlb_cma_check(void) }
#endif /* CONFIG_CMA */ + +#ifdef CONFIG_HUGETLB_INSERT_PAGE +static pte_t *hugetlb_huge_pte_alloc(struct mm_struct *mm, unsigned long addr, + unsigned long size) +{ + pgd_t *pgdp; + p4d_t *p4dp; + pud_t *pudp; + pte_t *ptep = NULL; + + pgdp = pgd_offset(mm, addr); + p4dp = p4d_offset(pgdp, addr); + pudp = pud_alloc(mm, p4dp, addr); + if (!pudp) + return NULL; + + ptep = (pte_t *)pmd_alloc(mm, pudp, addr); + + return ptep; +} + +static int __hugetlb_insert_hugepage(struct mm_struct *mm, unsigned long addr, + pgprot_t prot, unsigned long pfn) +{ + int ret = 0; + pte_t *ptep, entry; + struct hstate *h; + spinlock_t *ptl; + + h = size_to_hstate(PMD_SIZE); + if (!h) + return -EINVAL; + + ptep = hugetlb_huge_pte_alloc(mm, addr, huge_page_size(h)); + if (!ptep) + return -ENXIO; + + if (WARN_ON(ptep && !pte_none(*ptep) && !pmd_huge(*(pmd_t *)ptep))) + return -ENXIO; + + entry = pfn_pte(pfn, prot); + entry = huge_pte_mkdirty(entry); + if (!(pgprot_val(prot) & PTE_RDONLY)) + entry = huge_pte_mkwrite(entry); + entry = pte_mkyoung(entry); + entry = pte_mkhuge(entry); + entry = pte_mkspecial(entry); + + ptl = huge_pte_lockptr(h, mm, ptep); + spin_lock(ptl); + set_huge_pte_at(mm, addr, ptep, entry, PMD_SIZE); + spin_unlock(ptl); + + return ret; +} + +int hugetlb_insert_hugepage_pte(struct mm_struct *mm, unsigned long addr, + pgprot_t prot, struct page *hpage) +{ + return __hugetlb_insert_hugepage(mm, addr, prot, page_to_pfn(hpage)); +} +EXPORT_SYMBOL_GPL(hugetlb_insert_hugepage_pte); + +int hugetlb_insert_hugepage_pte_by_pa(struct mm_struct *mm, unsigned long addr, + pgprot_t prot, unsigned long phy_addr) +{ + return __hugetlb_insert_hugepage(mm, addr, prot, phy_addr >> PAGE_SHIFT); +} +EXPORT_SYMBOL_GPL(hugetlb_insert_hugepage_pte_by_pa); +#endif /* CONFIG_HUGETLB_INSERT_PAGE */
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This allows the user to do huge vmalloc allocations and remap those hugepage ranges into userspace.

Some devices cannot handle mixed page-table levels. They need to know exactly whether the memory they allocated is backed by hugepages or not. Introduce vmalloc/vmap/remap interfaces that handle only hugepages.

Introduce the VM_HUGE_PAGES flag. __vmalloc_node_range() allocates PMD_SIZE hugepages when VM_HUGE_PAGES is specified.
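For illustration only (not part of this patch): a driver could pair the new allocator with the new remap helper roughly as below; the demo_* names and the buffer lifetime handling are assumptions for the example:

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

static void *demo_hugebuf;

static int demo_hugebuf_alloc(unsigned long size)
{
	/* Backed only by PMD-sized pages; size is rounded up to PMD_SIZE. */
	demo_hugebuf = vmalloc_hugepage_user(size);	/* zeroed, VM_USERMAP */
	return demo_hugebuf ? 0 : -ENOMEM;
}

static int demo_hugebuf_mmap(struct file *file, struct vm_area_struct *vma)
{
	/* Map the buffer from its start; vma start and length must be PMD-aligned. */
	return remap_vmalloc_hugepage_range(vma, demo_hugebuf, 0);
}

static void demo_hugebuf_free(void)
{
	vfree(demo_hugebuf);
}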
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
---
 include/linux/vmalloc.h |  18 ++++
 mm/Kconfig              |   8 ++
 mm/vmalloc.c            | 205 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 229 insertions(+), 2 deletions(-)
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index c720be70c8dd..e7db501b7602 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -36,6 +36,12 @@ struct iov_iter; /* in uio.h */ #define VM_DEFER_KMEMLEAK 0 #endif
+#ifdef CONFIG_EXTEND_HUGEPAGE_MAPPING +#define VM_HUGE_PAGES 0x00004000 /* vmalloc hugepage mapping only */ +#else +#define VM_HUGE_PAGES 0 +#endif + /* bits [20..32] reserved for arch specific ioremap internals */
/* @@ -173,6 +179,18 @@ extern int remap_vmalloc_range_partial(struct vm_area_struct *vma, extern int remap_vmalloc_range(struct vm_area_struct *vma, void *addr, unsigned long pgoff);
+#ifdef CONFIG_EXTEND_HUGEPAGE_MAPPING +extern void *vmalloc_hugepage(unsigned long size); +extern void *vmalloc_hugepage_user(unsigned long size); +extern void *vmap_hugepage(struct page **pages, unsigned int count, + unsigned long flags, pgprot_t prot); +extern int remap_vmalloc_hugepage_range_partial(struct vm_area_struct *vma, + unsigned long uaddr, void *kaddr, + unsigned long pgoff, unsigned long size); +extern int remap_vmalloc_hugepage_range(struct vm_area_struct *vma, + void *addr, unsigned long pgoff); +#endif + /* * Architectures can set this mask to a combination of PGTBL_P?D_MODIFIED values * and let generic vmalloc and ioremap code know when arch_sync_kernel_mappings() diff --git a/mm/Kconfig b/mm/Kconfig index 3dead7328cd5..64a8aea7f67a 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1273,6 +1273,7 @@ menuconfig ASCEND_FEATURES bool "Support Ascend Features" depends on ARM64 select HUGETLB_INSERT_PAGE + select EXTEND_HUGEPAGE_MAPPING help The Ascend chip use the Hisilicon DaVinci architecture, and mainly focus on AI and machine leanring area, contains many external features. @@ -1284,6 +1285,13 @@ config HUGETLB_INSERT_PAGE help This allowed a driver to insert hugetlb mapping into user address space.
+config EXTEND_HUGEPAGE_MAPPING + bool + help + Introduce vmalloc/vmap/remap interfaces that handle only hugepages. + This allow the user to do huge vmalloc and remap those hugepage range + into userspace. + source "mm/damon/Kconfig"
endmenu diff --git a/mm/vmalloc.c b/mm/vmalloc.c index a3fedb3ee0db..7794d2b0db64 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3267,7 +3267,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, size_per_node = size; if (node == NUMA_NO_NODE) size_per_node /= num_online_nodes(); - if (arch_vmap_pmd_supported(prot) && size_per_node >= PMD_SIZE) + if (arch_vmap_pmd_supported(prot) && (size_per_node >= PMD_SIZE || + (IS_ENABLED(CONFIG_EXTEND_HUGEPAGE_MAPPING) && + (vm_flags & VM_HUGE_PAGES)))) shift = PMD_SHIFT; else shift = arch_vmap_pte_supported_shift(size_per_node); @@ -3350,7 +3352,8 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, return area->addr;
fail: - if (shift > PAGE_SHIFT) { + if (shift > PAGE_SHIFT && !(IS_ENABLED(CONFIG_EXTEND_HUGEPAGE_MAPPING) && + (vm_flags & VM_HUGE_PAGES))) { shift = PAGE_SHIFT; align = real_align; size = real_size; @@ -4308,6 +4311,204 @@ bool vmalloc_dump_obj(void *object) } #endif
+#ifdef CONFIG_EXTEND_HUGEPAGE_MAPPING +/** + * vmalloc_hugepage - allocate virtually contiguous hugetlb memory + * @size: allocation size + * + * Allocate enough huge pages to cover @size and map them into + * contiguous kernel virtual space. + * + * The allocation size is aligned to PMD_SIZE automatically + */ +void *vmalloc_hugepage(unsigned long size) +{ + return __vmalloc_node_range(size, PMD_SIZE, VMALLOC_START, VMALLOC_END, + GFP_KERNEL, PAGE_KERNEL, + VM_ALLOW_HUGE_VMAP | VM_HUGE_PAGES, NUMA_NO_NODE, + __builtin_return_address(0)); +} +EXPORT_SYMBOL(vmalloc_hugepage); + +/** + * vmalloc_hugepage_user - allocate virtually contiguous hugetlb memory + * for userspace + * @size: allocation size + * + * Allocate enough huge pages to cover @size and map them into + * contiguous kernel virtual space. The resulting memory area + * is zeroed so it can be mapped to userspace without leaking data. + * + * The allocation size is aligned to PMD_SIZE automatically + */ +void *vmalloc_hugepage_user(unsigned long size) +{ + return __vmalloc_node_range(size, PMD_SIZE, VMALLOC_START, VMALLOC_END, + GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL, + VM_ALLOW_HUGE_VMAP | VM_USERMAP | VM_HUGE_PAGES, NUMA_NO_NODE, + __builtin_return_address(0)); +} +EXPORT_SYMBOL(vmalloc_hugepage_user); + +static int vmap_hugepages_range_noflush(unsigned long addr, unsigned long end, + pgprot_t prot, struct page **pages, unsigned int page_shift) +{ + unsigned int i, nr = (end - addr) >> page_shift; + + for (i = 0; i < nr; i++) { + int err; + + err = vmap_range_noflush(addr, addr + (1UL << page_shift), + __pa(page_address(pages[i])), prot, + page_shift); + if (err) + return err; + + addr += 1UL << page_shift; + } + + return 0; +} + +static int vmap_hugepages_range(unsigned long addr, unsigned long end, + pgprot_t prot, struct page **pages, + unsigned int page_shift) +{ + int err; + + err = vmap_hugepages_range_noflush(addr, end, prot, pages, page_shift); + flush_cache_vmap(addr, end); + + return err; +} + +/** + * vmap_hugepage - map an array of huge pages into virtually contiguous space + * @pages: array of huge page pointers (only the header) + * @count: number of pages to map + * @flags: vm_area->flags + * @prot: page protection for the mapping + * + * Maps @count pages from @pages into contiguous kernel virtual + * space. + */ +void *vmap_hugepage(struct page **pages, unsigned int count, + unsigned long flags, pgprot_t prot) +{ + struct vm_struct *area; + unsigned long size; /* In bytes */ + + might_sleep(); + + if (count > totalram_pages()) + return NULL; + + size = (unsigned long)count << PMD_SHIFT; + area = __get_vm_area_node(size, PMD_SIZE, PMD_SHIFT, flags | VM_HUGE_PAGES, + VMALLOC_START, VMALLOC_END, + NUMA_NO_NODE, GFP_KERNEL, __builtin_return_address(0)); + if (!area) + return NULL; + + if (vmap_hugepages_range((unsigned long)area->addr, + (unsigned long)area->addr + size, prot, + pages, PMD_SHIFT) < 0) { + vunmap(area->addr); + return NULL; + } + + return area->addr; +} +EXPORT_SYMBOL(vmap_hugepage); + +/** + * remap_vmalloc_hugepage_range_partial - map vmalloc hugepages + * to userspace + * @vma: vma to cover + * @uaddr: target user address to start at + * @kaddr: virtual address of vmalloc hugepage kernel memory + * @size: size of map area + * + * Returns: 0 for success, -Exxx on failure + * + * This function checks that @kaddr is a valid vmalloc'ed area, + * and that it is big enough to cover the range starting at + * @uaddr in @vma. Will return failure if that criteria isn't + * met. 
+ * + * Similar to remap_pfn_range() (see mm/memory.c) + */ +int remap_vmalloc_hugepage_range_partial(struct vm_area_struct *vma, unsigned long uaddr, + void *kaddr, unsigned long pgoff, unsigned long size) +{ + struct vm_struct *area; + unsigned long off; + unsigned long end_index; + + if (check_shl_overflow(pgoff, PMD_SHIFT, &off)) + return -EINVAL; + + size = ALIGN(size, PMD_SIZE); + + if (!IS_ALIGNED(uaddr, PMD_SIZE) || !IS_ALIGNED((unsigned long)kaddr, PMD_SIZE)) + return -EINVAL; + + area = find_vm_area(kaddr); + if (!area) + return -EINVAL; + + if (!(area->flags & VM_USERMAP)) + return -EINVAL; + + if (check_add_overflow(size, off, &end_index) || + end_index > get_vm_area_size(area)) + return -EINVAL; + kaddr += off; + + do { + struct page *page = vmalloc_to_page(kaddr); + int ret; + + ret = hugetlb_insert_hugepage_pte_by_pa(vma->vm_mm, uaddr, + vma->vm_page_prot, page_to_phys(page)); + if (ret) + return ret; + + uaddr += PMD_SIZE; + kaddr += PMD_SIZE; + size -= PMD_SIZE; + } while (size > 0); + + vm_flags_set(vma, VM_DONTEXPAND | VM_DONTDUMP); + + return 0; +} +EXPORT_SYMBOL(remap_vmalloc_hugepage_range_partial); + +/** + * remap_vmalloc_hugepage_range - map vmalloc hugepages to userspace + * @vma: vma to cover (map full range of vma) + * @addr: vmalloc memory + * @pgoff: number of hugepages into addr before first page to map + * + * Returns: 0 for success, -Exxx on failure + * + * This function checks that addr is a valid vmalloc'ed area, and + * that it is big enough to cover the vma. Will return failure if + * that criteria isn't met. + * + * Similar to remap_pfn_range() (see mm/memory.c) + */ +int remap_vmalloc_hugepage_range(struct vm_area_struct *vma, void *addr, + unsigned long pgoff) +{ + return remap_vmalloc_hugepage_range_partial(vma, vma->vm_start, + addr, pgoff, + vma->vm_end - vma->vm_start); +} +EXPORT_SYMBOL(remap_vmalloc_hugepage_range); +#endif + #ifdef CONFIG_PROC_FS static void *s_start(struct seq_file *m, loff_t *pos) __acquires(&vmap_purge_lock)
ascend inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
do_mmap()/mmap_region()/__mm_populate()/do_mbind() can only operate on the current process. The share pool now needs to create memory mappings in other processes as well, so export new variants that take an explicit mm_struct to select the target process. This does not change the existing logic and is only used by the share pool.
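For illustration only (not part of this patch): with the new helpers, creating and populating a mapping in a target process looks roughly as below. The locking follows the do_mmap() rule, just against the target mm, and the userfaultfd list handling is omitted for brevity:

#include <linux/err.h>
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/mman.h>

static unsigned long demo_map_into_mm(struct mm_struct *mm, struct file *file,
				      unsigned long addr, unsigned long len)
{
	unsigned long populate = 0;
	unsigned long mapped;
	LIST_HEAD(uf);

	if (mmap_write_lock_killable(mm))
		return -EINTR;
	/* Same semantics as do_mmap(), but against an explicit @mm. */
	mapped = __do_mmap_mm(mm, file, addr, len, PROT_READ | PROT_WRITE,
			      MAP_SHARED | MAP_POPULATE, 0, 0, &populate, &uf);
	mmap_write_unlock(mm);

	/* Fault the pages in for the target mm rather than current->mm. */
	if (!IS_ERR_VALUE(mapped) && populate)
		do_mm_populate(mm, mapped, populate, 0);

	return mapped;
}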
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
---
 include/linux/mempolicy.h | 11 +++++++++++
 include/linux/mm.h        | 11 +++++++++++
 mm/gup.c                  | 23 ++++++++++++++---------
 mm/mempolicy.c            | 12 +++++++++---
 mm/mmap.c                 | 34 +++++++++++++++++++++++++++-------
 5 files changed, 72 insertions(+), 19 deletions(-)
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h index 6c2754d7bfed..ca62e945e4f7 100644 --- a/include/linux/mempolicy.h +++ b/include/linux/mempolicy.h @@ -184,6 +184,9 @@ static inline bool mpol_is_preferred_many(struct mempolicy *pol)
extern bool apply_policy_zone(struct mempolicy *policy, enum zone_type zone);
+extern long __do_mbind(unsigned long start, unsigned long len, + unsigned short mode, unsigned short mode_flags, + nodemask_t *nmask, unsigned long flags, struct mm_struct *mm); #else
struct mempolicy {}; @@ -294,5 +297,13 @@ static inline bool mpol_is_preferred_many(struct mempolicy *pol) return false; }
+static inline long __do_mbind(unsigned long start, unsigned long len, + unsigned short mode, unsigned short mode_flags, + nodemask_t *nmask, unsigned long flags, struct mm_struct *mm) +{ + return 0; +} + + #endif /* CONFIG_NUMA */ #endif diff --git a/include/linux/mm.h b/include/linux/mm.h index 8cf86b56aba5..4fe823ead243 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3288,6 +3288,10 @@ extern unsigned long do_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flags, vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate, struct list_head *uf); +extern unsigned long __do_mmap_mm(struct mm_struct *mm, struct file *file, unsigned long addr, + unsigned long len, unsigned long prot, + unsigned long flags, vm_flags_t vm_flags, unsigned long pgoff, + unsigned long *populate, struct list_head *uf); extern int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm, unsigned long start, size_t len, struct list_head *uf, bool unlock); @@ -3301,6 +3305,8 @@ extern int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, struct list_head *uf, bool unlock); extern int __mm_populate(unsigned long addr, unsigned long len, int ignore_errors); +extern int do_mm_populate(struct mm_struct *mm, unsigned long start, unsigned long len, + int ignore_errors); static inline void mm_populate(unsigned long addr, unsigned long len) { /* Ignore errors */ @@ -3308,6 +3314,11 @@ static inline void mm_populate(unsigned long addr, unsigned long len) } #else static inline void mm_populate(unsigned long addr, unsigned long len) {} +static inline int do_mm_populate(struct mm_struct *mm, unsigned long start, unsigned long len, + int ignore_errors) +{ + return -EPERM; +} #endif
/* These take the mm semaphore themselves */ diff --git a/mm/gup.c b/mm/gup.c index 2f8a2d89fde1..813add0d3a74 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1727,16 +1727,9 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start, return ret; }
-/* - * __mm_populate - populate and/or mlock pages within a range of address space. - * - * This is used to implement mlock() and the MAP_POPULATE / MAP_LOCKED mmap - * flags. VMAs must be already marked with the desired vm_flags, and - * mmap_lock must not be held. - */ -int __mm_populate(unsigned long start, unsigned long len, int ignore_errors) +int do_mm_populate(struct mm_struct *mm, unsigned long start, unsigned long len, + int ignore_errors) { - struct mm_struct *mm = current->mm; unsigned long end, nstart, nend; struct vm_area_struct *vma = NULL; int locked = 0; @@ -1787,6 +1780,18 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors) mmap_read_unlock(mm); return ret; /* 0 or negative error code */ } + +/* + * __mm_populate - populate and/or mlock pages within a range of address space. + * + * This is used to implement mlock() and the MAP_POPULATE / MAP_LOCKED mmap + * flags. VMAs must be already marked with the desired vm_flags, and + * mmap_lock must not be held. + */ +int __mm_populate(unsigned long start, unsigned long len, int ignore_errors) +{ + return do_mm_populate(current->mm, start, len, ignore_errors); +} #else /* CONFIG_MMU */ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start, unsigned long nr_pages, struct page **pages, diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 4b8447f8175b..b23a239de750 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -1262,11 +1262,10 @@ static struct folio *new_folio(struct folio *src, unsigned long start) } #endif
-static long do_mbind(unsigned long start, unsigned long len, +long __do_mbind(unsigned long start, unsigned long len, unsigned short mode, unsigned short mode_flags, - nodemask_t *nmask, unsigned long flags) + nodemask_t *nmask, unsigned long flags, struct mm_struct *mm) { - struct mm_struct *mm = current->mm; struct vm_area_struct *vma, *prev; struct vma_iterator vmi; struct mempolicy *new; @@ -1377,6 +1376,13 @@ static long do_mbind(unsigned long start, unsigned long len, return err; }
+static long do_mbind(unsigned long start, unsigned long len, + unsigned short mode, unsigned short mode_flags, + nodemask_t *nmask, unsigned long flags) +{ + return __do_mbind(start, len, mode, mode_flags, nmask, flags, current->mm); +} + /* * User space interface with variable sized bitmaps for nodelists. */ diff --git a/mm/mmap.c b/mm/mmap.c index 9e018d8dd7d6..df2624e48119 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1197,16 +1197,19 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode, return true; }
+static unsigned long __mmap_region(struct mm_struct *mm, + struct file *file, unsigned long addr, + unsigned long len, vm_flags_t vm_flags, + unsigned long pgoff, struct list_head *uf); /* * The caller must write-lock current->mm->mmap_lock. */ -unsigned long do_mmap(struct file *file, unsigned long addr, +unsigned long __do_mmap_mm(struct mm_struct *mm, struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flags, vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate, struct list_head *uf) { - struct mm_struct *mm = current->mm; int pkey = 0;
*populate = 0; @@ -1371,7 +1374,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, vm_flags |= VM_NORESERVE; }
- addr = mmap_region(file, addr, len, vm_flags, pgoff, uf); + addr = __mmap_region(mm, file, addr, len, vm_flags, pgoff, uf); if (!IS_ERR_VALUE(addr) && ((vm_flags & VM_LOCKED) || (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE)) @@ -1379,6 +1382,16 @@ unsigned long do_mmap(struct file *file, unsigned long addr, return addr; }
+unsigned long do_mmap(struct file *file, unsigned long addr, + unsigned long len, unsigned long prot, + unsigned long flags, vm_flags_t vm_flags, + unsigned long pgoff, unsigned long *populate, + struct list_head *uf) +{ + return __do_mmap_mm(current->mm, file, addr, len, prot, flags, + vm_flags, pgoff, populate, uf); +} + unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len, unsigned long prot, unsigned long flags, unsigned long fd, unsigned long pgoff) @@ -2659,11 +2672,11 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len, return do_vmi_munmap(&vmi, mm, start, len, uf, false); }
-unsigned long mmap_region(struct file *file, unsigned long addr, - unsigned long len, vm_flags_t vm_flags, unsigned long pgoff, - struct list_head *uf) +static unsigned long __mmap_region(struct mm_struct *mm, + struct file *file, unsigned long addr, + unsigned long len, vm_flags_t vm_flags, + unsigned long pgoff, struct list_head *uf) { - struct mm_struct *mm = current->mm; struct vm_area_struct *vma = NULL; struct vm_area_struct *next, *prev, *merge; pgoff_t pglen = len >> PAGE_SHIFT; @@ -2915,6 +2928,13 @@ unsigned long mmap_region(struct file *file, unsigned long addr, return error; }
+unsigned long mmap_region(struct file *file, unsigned long addr, + unsigned long len, vm_flags_t vm_flags, unsigned long pgoff, + struct list_head *uf) +{ + return __mmap_region(current->mm, file, addr, len, vm_flags, pgoff, uf); +} + static int __vm_munmap(unsigned long start, size_t len, bool unlock) { int ret;
ascend inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
The share pool is a large feature, mainly used to share user virtual memory between processes in the same group. It is used as follows:
1. Process A creates a new group, which is owned by process A.
2. Process A adds process B to the group.
3. Process A adds process C to the same group.
4. Process B allocates a new memory VA and writes something into it.
5. The VA is sent to process C via IPC, and process C receives it.
6. Process C accesses the VA and reads the data directly.
7. Process A can add more processes to the group to share the memory.
8. Free the memory with the free function or by exiting the group.

The new feature is enabled by both CONFIG_SHARE_POOL and the enable_ascend_share_pool bootarg; it has no effect when disabled.
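For illustration only (not part of this patch): the intended use of the k2u interface declared here, sharing a kernel buffer with the current task and unsharing it again. The vmalloc_user() buffer and the SPG_ID_NONE share/unshare pairing are assumptions for the example:

#include <linux/err.h>
#include <linux/sched.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>
#include <linux/share_pool.h>

static int demo_share_k2u(void)
{
	void *kbuf, *uva;
	int ret;

	kbuf = vmalloc_user(SZ_2M);
	if (!kbuf)
		return -ENOMEM;

	/* Map the kernel buffer into current's share-pool VA space. */
	uva = mg_sp_make_share_k2u((unsigned long)kbuf, SZ_2M, 0,
				   current->tgid, SPG_ID_NONE);
	if (IS_ERR(uva)) {
		vfree(kbuf);
		return PTR_ERR(uva);
	}

	/* ... hand uva to userspace and use it ... */

	ret = mg_sp_unshare((unsigned long)uva, SZ_2M, SPG_ID_NONE);
	vfree(kbuf);
	return ret;
}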
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
---
 include/linux/mm.h         |   6 +
 include/linux/mm_types.h   |   6 +
 include/linux/share_pool.h | 220 ++++++++++++++++++++
 mm/Kconfig                 |  11 ++
 mm/Makefile                |   1 +
 mm/share_pool.c            | 315 +++++++++++++++++++++++++++++++
 6 files changed, 559 insertions(+)
 create mode 100644 include/linux/share_pool.h
 create mode 100644 mm/share_pool.c
diff --git a/include/linux/mm.h b/include/linux/mm.h index 4fe823ead243..e2ba77243461 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -382,6 +382,12 @@ extern unsigned int kobjsize(const void *objp); # define VM_MTE_ALLOWED VM_NONE #endif
+#if defined(CONFIG_SHARE_POOL) +# define VM_SHARE_POOL VM_HIGH_ARCH_4 +#else +# define VM_SHARE_POOL VM_NONE +#endif + #ifndef VM_GROWSUP # define VM_GROWSUP VM_NONE #endif diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 582aa5e44a5a..0bc3c7c191a5 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -674,6 +674,9 @@ struct vm_area_struct { struct vma_numab_state *numab_state; /* NUMA Balancing state */ #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; +#ifdef CONFIG_SHARE_POOL + struct sp_area *spa; +#endif } __randomize_layout;
#ifdef CONFIG_SCHED_MM_CID @@ -931,6 +934,9 @@ struct mm_struct { #endif } lru_gen; #endif /* CONFIG_LRU_GEN */ +#ifdef CONFIG_SHARE_POOL + struct sp_group_master *sp_group_master; +#endif } __randomize_layout;
/* diff --git a/include/linux/share_pool.h b/include/linux/share_pool.h new file mode 100644 index 000000000000..1333b9994242 --- /dev/null +++ b/include/linux/share_pool.h @@ -0,0 +1,220 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef LINUX_SHARE_POOL_H +#define LINUX_SHARE_POOL_H + +#include <linux/mman.h> +#include <linux/mm_types.h> +#include <linux/notifier.h> +#include <linux/vmalloc.h> +#include <linux/printk.h> +#include <linux/hashtable.h> +#include <linux/numa.h> +#include <linux/jump_label.h> + +#define SP_HUGEPAGE (1 << 0) +#define SP_HUGEPAGE_ONLY (1 << 1) +#define SP_DVPP (1 << 2) +#define SP_SPEC_NODE_ID (1 << 3) +#define SP_PROT_RO (1 << 16) +/* + * SP_PROT_FOCUS should used with SP_PROT_RO, + * to alloc a memory within sharepool ro memory. + */ +#define SP_PROT_FOCUS (1 << 17) + +#define DEVICE_ID_BITS 4UL +#define DEVICE_ID_MASK ((1UL << DEVICE_ID_BITS) - 1UL) +#define DEVICE_ID_SHIFT 32UL +#define NODE_ID_BITS NODES_SHIFT +#define NODE_ID_MASK ((1UL << NODE_ID_BITS) - 1UL) +#define NODE_ID_SHIFT (DEVICE_ID_SHIFT + DEVICE_ID_BITS) + +#define SP_FLAG_MASK (SP_HUGEPAGE | SP_HUGEPAGE_ONLY | SP_DVPP | \ + SP_SPEC_NODE_ID | SP_PROT_RO | SP_PROT_FOCUS | \ + (DEVICE_ID_MASK << DEVICE_ID_SHIFT) | \ + (NODE_ID_MASK << NODE_ID_SHIFT)) + +#define sp_flags_device_id(flags) (((flags) >> DEVICE_ID_SHIFT) & DEVICE_ID_MASK) +#define sp_flags_node_id(flags) (((flags) >> NODE_ID_SHIFT) & NODE_ID_MASK) + +#define SPG_ID_NONE (-1) /* not associated with sp_group, only for specified thread */ +#define SPG_ID_DEFAULT 0 /* use the spg id of current thread */ +#define SPG_ID_MIN 1 /* valid id should be >= 1 */ +#define SPG_ID_MAX 99999 +#define SPG_ID_AUTO_MIN 100000 +#define SPG_ID_AUTO_MAX 199999 +#define SPG_ID_AUTO 200000 /* generate group id automatically */ +#define SPG_ID_LOCAL_MIN 200001 +#define SPG_ID_LOCAL_MAX 299999 +#define SPG_ID_LOCAL 300000 /* generate group id in local range */ + +#define MAX_DEVID 8 /* the max num of Da-vinci devices */ + +extern struct static_key_false share_pool_enabled_key; + +struct sp_walk_data { + struct page **pages; + unsigned int page_count; + unsigned long uva_aligned; + unsigned long page_size; + bool is_hugepage; + bool is_page_type_set; + pmd_t *pmd; +}; + +#define MAP_SHARE_POOL 0x200000 + +#define MMAP_TOP_4G_SIZE 0x100000000UL + +/* 8T - 64G size */ +#define MMAP_SHARE_POOL_NORMAL_SIZE 0x7F000000000UL +/* 64G */ +#define MMAP_SHARE_POOL_RO_SIZE 0x1000000000UL +/* 8T size*/ +#define MMAP_SHARE_POOL_DVPP_SIZE 0x80000000000UL +/* 16G size */ +#define MMAP_SHARE_POOL_16G_SIZE 0x400000000UL +/* skip 8T for stack */ +#define MMAP_SHARE_POOL_SKIP 0x80000000000UL +#define MMAP_SHARE_POOL_END (TASK_SIZE - MMAP_SHARE_POOL_SKIP) +#define MMAP_SHARE_POLL_DVPP_END (MMAP_SHARE_POOL_END) +/* MMAP_SHARE_POOL_DVPP_START should be align to 16G */ +#define MMAP_SHARE_POOL_DVPP_START (MMAP_SHARE_POLL_DVPP_END - MMAP_SHARE_POOL_DVPP_SIZE) +#define MMAP_SHARE_POOL_RO_END (MMAP_SHARE_POOL_DVPP_START) +#define MMAP_SHARE_POOL_RO_START (MMAP_SHARE_POOL_RO_END - MMAP_SHARE_POOL_RO_SIZE) +#define MMAP_SHARE_POOL_NORMAL_END (MMAP_SHARE_POOL_RO_START) +#define MMAP_SHARE_POOL_NORMAL_START (MMAP_SHARE_POOL_NORMAL_END - MMAP_SHARE_POOL_NORMAL_SIZE) +#define MMAP_SHARE_POOL_START (MMAP_SHARE_POOL_NORMAL_START) + +#define MMAP_SHARE_POOL_DYNAMIC_DVPP_BASE 0x100000000000ULL +#define MMAP_SHARE_POOL_DYNAMIC_DVPP_END (MMAP_SHARE_POOL_DYNAMIC_DVPP_BASE + \ + MMAP_SHARE_POOL_16G_SIZE * 64) + +#ifdef CONFIG_SHARE_POOL + +/* + * Those interfaces are exported for modules + */ 
+extern int mg_sp_group_add_task(int tgid, unsigned long prot, int spg_id); +extern int mg_sp_group_id_by_pid(int tgid, int *spg_ids, int *num); + +extern void *mg_sp_alloc(unsigned long size, unsigned long sp_flags, int spg_id); +extern void *mg_sp_alloc_nodemask(unsigned long size, unsigned long sp_flags, int spg_id, + nodemask_t nodemask); +extern int mg_sp_free(unsigned long addr, int id); + +extern void *mg_sp_make_share_k2u(unsigned long kva, unsigned long size, + unsigned long sp_flags, int tgid, int spg_id); +extern void *mg_sp_make_share_u2k(unsigned long uva, unsigned long size, int tgid); +extern int mg_sp_unshare(unsigned long va, unsigned long size, int spg_id); + +extern int mg_sp_walk_page_range(unsigned long uva, unsigned long size, + struct task_struct *tsk, struct sp_walk_data *sp_walk_data); + +extern void mg_sp_walk_page_free(struct sp_walk_data *sp_walk_data); + +extern bool mg_sp_config_dvpp_range(size_t start, size_t size, int device_id, int tgid); + +extern bool mg_is_sharepool_addr(unsigned long addr); + +extern int mg_sp_id_of_current(void); + +static inline bool sp_is_enabled(void) +{ + return static_branch_likely(&share_pool_enabled_key); +} + +static inline void sp_area_work_around(struct vm_unmapped_area_info *info) +{ + if (sp_is_enabled()) + info->high_limit = min(info->high_limit, MMAP_SHARE_POOL_START); +} + +static inline bool sp_check_vm_share_pool(unsigned long vm_flags) +{ + if (sp_is_enabled() && (vm_flags & VM_SHARE_POOL)) + return true; + + return false; +} + +#else /* CONFIG_SHARE_POOL */ + +static inline int mg_sp_group_add_task(int tgid, unsigned long prot, int spg_id) +{ + return -EPERM; +} + +static inline int mg_sp_group_id_by_pid(int tgid, int *spg_ids, int *num) +{ + return -EPERM; +} + +static inline void *mg_sp_alloc(unsigned long size, unsigned long sp_flags, int spg_id) +{ + return NULL; +} + +static inline int mg_sp_free(unsigned long addr, int id) +{ + return -EPERM; +} + +static inline void *mg_sp_make_share_k2u(unsigned long kva, unsigned long size, + unsigned long sp_flags, int tgid, int spg_id) +{ + return NULL; +} + +static inline void *mg_sp_make_share_u2k(unsigned long uva, unsigned long size, int tgid) +{ + return NULL; +} + +static inline int mg_sp_unshare(unsigned long va, unsigned long size, int id) +{ + return -EPERM; +} + +static inline int mg_sp_id_of_current(void) +{ + return -EPERM; +} + +static inline int mg_sp_walk_page_range(unsigned long uva, unsigned long size, + struct task_struct *tsk, struct sp_walk_data *sp_walk_data) +{ + return 0; +} + +static inline void mg_sp_walk_page_free(struct sp_walk_data *sp_walk_data) +{ +} + +static inline bool mg_sp_config_dvpp_range(size_t start, size_t size, int device_id, int tgid) +{ + return false; +} + +static inline bool mg_is_sharepool_addr(unsigned long addr) +{ + return false; +} + +static inline bool sp_is_enabled(void) +{ + return false; +} + +static inline void sp_area_work_around(struct vm_unmapped_area_info *info) +{ +} + +static inline bool sp_check_vm_share_pool(unsigned long vm_flags) +{ + return false; +} + +#endif /* !CONFIG_SHARE_POOL */ + +#endif /* LINUX_SHARE_POOL_H */ diff --git a/mm/Kconfig b/mm/Kconfig index 64a8aea7f67a..0f68e5bbeb89 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1274,6 +1274,7 @@ menuconfig ASCEND_FEATURES depends on ARM64 select HUGETLB_INSERT_PAGE select EXTEND_HUGEPAGE_MAPPING + select SHARE_POOL help The Ascend chip use the Hisilicon DaVinci architecture, and mainly focus on AI and machine leanring area, contains many external 
features. @@ -1292,6 +1293,16 @@ config EXTEND_HUGEPAGE_MAPPING This allow the user to do huge vmalloc and remap those hugepage range into userspace.
+config SHARE_POOL + bool + depends on EXTEND_HUGEPAGE_MAPPING + select ARCH_USES_HIGH_VMA_FLAGS + help + This feature allows multiple processes to share virtual memory both + in kernel and user level, which is only enabled for ascend platform. + To enable this feature, enable_ascend_share_pool bootarg is needed. + + source "mm/damon/Kconfig"
endmenu diff --git a/mm/Makefile b/mm/Makefile index ec65984e2ade..c51aca1d9ec7 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -138,3 +138,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o +obj-$(CONFIG_SHARE_POOL) += share_pool.o diff --git a/mm/share_pool.c b/mm/share_pool.c new file mode 100644 index 000000000000..a9e30b06486e --- /dev/null +++ b/mm/share_pool.c @@ -0,0 +1,315 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Huawei Ascend Share Pool Memory + * + * Copyright (C) 2020 Huawei Limited + * Author: Tang Yizhou tangyizhou@huawei.com + * Zefan Li lizefan@huawei.com + * Wu Peng wupeng58@huawei.com + * Ding Tianhong dingtgianhong@huawei.com + * Zhou Guanghui zhouguanghui1@huawei.com + * Li Ming limingming.li@huawei.com + * + * This code is based on the hisilicon ascend platform. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#define pr_fmt(fmt) "share pool: " fmt + +#include <linux/share_pool.h> +#include <linux/sched.h> +#include <linux/sched/task.h> +#include <linux/sched/mm.h> +#include <linux/mm_types.h> +#include <linux/idr.h> +#include <linux/mutex.h> +#include <linux/rwsem.h> +#include <linux/spinlock.h> +#include <linux/slab.h> +#include <linux/rbtree.h> +#include <linux/shmem_fs.h> +#include <linux/file.h> +#include <linux/printk.h> +#include <linux/hugetlb.h> +#include <linux/vmalloc.h> +#include <linux/pid.h> +#include <linux/pid_namespace.h> +#include <linux/atomic.h> +#include <linux/lockdep.h> +#include <linux/kernel.h> +#include <linux/falloc.h> +#include <linux/types.h> +#include <linux/proc_fs.h> +#include <linux/seq_file.h> +#include <linux/rmap.h> +#include <linux/preempt.h> +#include <linux/swapops.h> +#include <linux/mmzone.h> +#include <linux/timekeeping.h> +#include <linux/time64.h> +#include <linux/pagewalk.h> +#include <linux/workqueue.h> + +/** + * mp_sp_group_id_by_pid() - Get the sp_group ID array of a process. + * @tgid: tgid of target process. + * @spg_ids: point to an array to save the group ids the process belongs to + * @num: input the spg_ids array size; output the spg number of the process + * + * Return: + * >0 - the sp_group ID. + * -ENODEV - target process doesn't belong to any sp_group. + * -EINVAL - spg_ids or num is NULL. + * -E2BIG - the num of groups process belongs to is larger than *num + */ +int mg_sp_group_id_by_pid(int tgid, int *spg_ids, int *num) +{ + return -EOPNOTSUPP; +} +EXPORT_SYMBOL_GPL(mg_sp_group_id_by_pid); + +/** + * mg_sp_group_add_task() - Add a process to an share group (sp_group). + * @tgid: the tgid of the task to be added. + * @prot: the prot of task for this spg. + * @spg_id: the ID of the sp_group. + * + * Return: A postive group number for success, -errno on failure. + * + * Valid @spg_id: + * [SPG_ID_MIN, SPG_ID_MAX]: + * the task would be added to the group with @spg_id, if the + * group doesn't exist, just create it. + * [SPG_ID_AUTO_MIN, SPG_ID_AUTO_MAX]: + * the task would be added to the group with @spg_id, if it + * doesn't exist ,return failed. + * SPG_ID_AUTO: + * the task would be added into a new group with a new id in range + * [SPG_ID_AUTO_MIN, SPG_ID_AUTO_MAX]. + * + * This function can be taken into four parts: + * 1. Check and initlize the task specified by @tgid properly. + * 2. 
Create or get the spg specified by @spg_id. + * 3. Check the spg and task together and link the task into the spg if + * everything looks good. + * 4. Map the existing sp_area from the spg into the new task. + */ +int mg_sp_group_add_task(int tgid, unsigned long prot, int spg_id) +{ + return -EOPNOTSUPP; +} +EXPORT_SYMBOL_GPL(mg_sp_group_add_task); + +int mg_sp_id_of_current(void) +{ + return -EOPNOTSUPP; +} +EXPORT_SYMBOL_GPL(mg_sp_id_of_current); + +/** + * mg_sp_free() - Free the memory allocated by mg_sp_alloc() or + * mg_sp_alloc_nodemask(). + * + * @addr: the starting VA of the memory. + * @id: Address space identifier, which is used to distinguish the addr. + * + * Return: + * * 0 - success. + * * -EINVAL - the memory can't be found or was not allocated by share pool. + * * -EPERM - the caller has no permision to free the memory. + */ +int mg_sp_free(unsigned long addr, int id) +{ + return -EOPNOTSUPP; +} +EXPORT_SYMBOL_GPL(mg_sp_free); + +static void __init proc_sharepool_init(void) +{ + if (!proc_mkdir("sharepool", NULL)) + return; +} + +void *mg_sp_alloc_nodemask(unsigned long size, unsigned long sp_flags, int spg_id, + nodemask_t nodemask) +{ + return ERR_PTR(-EOPNOTSUPP); +} +EXPORT_SYMBOL_GPL(mg_sp_alloc_nodemask); + +/** + * mg_sp_alloc() - Allocate shared memory for all the processes in a sp_group. + * @size: the size of memory to allocate. + * @sp_flags: how to allocate the memory. + * @spg_id: the share group that the memory is allocated to. + * + * Use pass through allocation if spg_id == SPG_ID_DEFAULT in multi-group mode. + * + * Return: + * * if succeed, return the starting address of the shared memory. + * * if fail, return the pointer of -errno. + */ +void *mg_sp_alloc(unsigned long size, unsigned long sp_flags, int spg_id) +{ + return ERR_PTR(-EOPNOTSUPP); +} +EXPORT_SYMBOL_GPL(mg_sp_alloc); + +/** + * mg_sp_make_share_k2u() - Share kernel memory to current process or an sp_group. + * @kva: the VA of shared kernel memory. + * @size: the size of shared kernel memory. + * @sp_flags: how to allocate the memory. We only support SP_DVPP. + * @tgid: the tgid of the specified process (Not currently in use). + * @spg_id: the share group that the memory is shared to. + * + * Return: the shared target user address to start at + * + * Share kernel memory to current task if spg_id == SPG_ID_NONE + * or SPG_ID_DEFAULT in multi-group mode. + * + * Return: + * * if succeed, return the shared user address to start at. + * * if fail, return the pointer of -errno. + */ +void *mg_sp_make_share_k2u(unsigned long kva, unsigned long size, + unsigned long sp_flags, int tgid, int spg_id) +{ + return ERR_PTR(-EOPNOTSUPP); +} +EXPORT_SYMBOL_GPL(mg_sp_make_share_k2u); + +/** + * mg_sp_make_share_u2k() - Share user memory of a specified process to kernel. + * @uva: the VA of shared user memory + * @size: the size of shared user memory + * @tgid: the tgid of the specified process(Not currently in use) + * + * Return: + * * if success, return the starting kernel address of the shared memory. + * * if failed, return the pointer of -errno. + */ +void *mg_sp_make_share_u2k(unsigned long uva, unsigned long size, int tgid) +{ + return ERR_PTR(-EOPNOTSUPP); +} +EXPORT_SYMBOL_GPL(mg_sp_make_share_u2k); + +/** + * mg_sp_unshare() - Unshare the kernel or user memory which shared by calling + * sp_make_share_{k2u,u2k}(). + * @va: the specified virtual address of memory + * @size: the size of unshared memory + * + * Use spg_id of current thread if spg_id == SPG_ID_DEFAULT. 
+ * + * Return: 0 for success, -errno on failure. + */ +int mg_sp_unshare(unsigned long va, unsigned long size, int spg_id) +{ + return -EOPNOTSUPP; +} +EXPORT_SYMBOL_GPL(mg_sp_unshare); + +/** + * mg_sp_walk_page_range() - Walk page table with caller specific callbacks. + * @uva: the start VA of user memory. + * @size: the size of user memory. + * @tsk: task struct of the target task. + * @sp_walk_data: a structure of a page pointer array. + * + * Return: 0 for success, -errno on failure. + * + * When return 0, sp_walk_data describing [uva, uva+size) can be used. + * When return -errno, information in sp_walk_data is useless. + */ +int mg_sp_walk_page_range(unsigned long uva, unsigned long size, + struct task_struct *tsk, struct sp_walk_data *sp_walk_data) +{ + return -EOPNOTSUPP; +} +EXPORT_SYMBOL_GPL(mg_sp_walk_page_range); + +/** + * mg_sp_walk_page_free() - Free the sp_walk_data structure. + * @sp_walk_data: a structure of a page pointer array to be freed. + */ +void mg_sp_walk_page_free(struct sp_walk_data *sp_walk_data) +{ +} +EXPORT_SYMBOL_GPL(mg_sp_walk_page_free); + +/** + * mg_sp_config_dvpp_range() - User can config the share pool start address + * of each Da-vinci device. + * @start: the value of share pool start + * @size: the value of share pool + * @device_id: the num of Da-vinci device + * @tgid: the tgid of device process + * + * Return true for success. + * Return false if parameter invalid or has been set up. + * This functuon has no concurrent problem. + */ +bool mg_sp_config_dvpp_range(size_t start, size_t size, int device_id, int tgid) +{ + return false; +} +EXPORT_SYMBOL_GPL(mg_sp_config_dvpp_range); + +static bool is_sp_reserve_addr(unsigned long addr) +{ + return addr >= MMAP_SHARE_POOL_START && addr < MMAP_SHARE_POOL_END; +} + +/* + * | 16G host | 16G device | ... | | + * ^ + * | + * MMAP_SHARE_POOL_DVPP_BASE + 16G * 64 + * We only check the device regions. + */ +static bool is_sp_dynamic_dvpp_addr(unsigned long addr) +{ + if (addr < MMAP_SHARE_POOL_DYNAMIC_DVPP_BASE || addr >= MMAP_SHARE_POOL_DYNAMIC_DVPP_END) + return false; + + return (addr - MMAP_SHARE_POOL_DYNAMIC_DVPP_BASE) & MMAP_SHARE_POOL_16G_SIZE; +} + +/** + * mg_is_sharepool_addr() - Check if a user memory address belongs to share pool. + * @addr: the userspace address to be checked. + * + * Return true if addr belongs to share pool, or false vice versa. + */ +bool mg_is_sharepool_addr(unsigned long addr) +{ + return sp_is_enabled() && + ((is_sp_reserve_addr(addr) || is_sp_dynamic_dvpp_addr(addr))); +} +EXPORT_SYMBOL_GPL(mg_is_sharepool_addr); + +DEFINE_STATIC_KEY_FALSE(share_pool_enabled_key); + +static int __init enable_share_pool(char *s) +{ + static_branch_enable(&share_pool_enabled_key); + pr_info("Ascend enable share pool features via bootargs\n"); + + return 1; +} +__setup("enable_ascend_share_pool", enable_share_pool); + +static int __init share_pool_init(void) +{ + if (!sp_is_enabled()) + return 0; + + proc_sharepool_init(); + + return 0; +} +late_initcall(share_pool_init);
ascend inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
The sp_area structure specifies a memory region used by share_pool. Add alloc/get/drop operations for sp_area. The sp_mapping structure manages a set of sp_areas.
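For illustration only (the series has its own lookup helpers, which are not in this hunk): the sp_areas of an sp_mapping are kept in an address-sorted rbtree, so finding the sp_area that covers an address follows the usual rbtree walk sketched below. The helper name is hypothetical, it would live next to the structure definitions in mm/share_pool.c, and the caller is assumed to hold the mapping's lock:

#include <linux/rbtree.h>

static struct sp_area *demo_find_spa(struct sp_mapping *spm, unsigned long addr)
{
	struct rb_node *node = spm->area_root.rb_node;

	while (node) {
		struct sp_area *spa = rb_entry(node, struct sp_area, rb_node);

		if (addr < spa->va_start)
			node = node->rb_left;
		else if (addr >= spa->va_end)
			node = node->rb_right;
		else
			return spa;	/* va_start <= addr < va_end */
	}

	return NULL;
}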
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
---
 mm/share_pool.c | 953 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 953 insertions(+)
diff --git a/mm/share_pool.c b/mm/share_pool.c index a9e30b06486e..d12150626496 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -52,6 +52,625 @@ #include <linux/pagewalk.h> #include <linux/workqueue.h>
+/* Use spa va address as mmap offset. This can work because spa_file + * is setup with 64-bit address space. So va shall be well covered. + */ +#define addr_offset(spa) ((spa)->va_start) + +#define byte2kb(size) ((size) >> 10) +#define byte2mb(size) ((size) >> 20) +#define page2kb(page_num) ((page_num) << (PAGE_SHIFT - 10)) + +#define MAX_GROUP_FOR_SYSTEM 50000 +#define MAX_GROUP_FOR_TASK 3000 +#define MAX_PROC_PER_GROUP 1024 + +#define GROUP_NONE 0 + +#define SEC2US(sec) ((sec) * 1000000) +#define NS2US(ns) ((ns) / 1000) + +#define PF_DOMAIN_CORE 0x10000000 /* AOS CORE processes in sched.h */ + +static int system_group_count; + +/* idr of all sp_groups */ +static DEFINE_IDR(sp_group_idr); +/* rw semaphore for sp_group_idr and mm->sp_group_master */ +static DECLARE_RWSEM(sp_global_sem); + +/*** Statistical and maintenance tools ***/ + +/* list of all sp_group_masters */ +static LIST_HEAD(master_list); +/* mutex to protect insert/delete ops from master_list */ +static DEFINE_MUTEX(master_list_lock); + +/* list of all spm-dvpp */ +static LIST_HEAD(spm_dvpp_list); +/* mutex to protect insert/delete ops from master_list */ +static DEFINE_MUTEX(spm_list_lock); + +#define SEQ_printf(m, x...) \ +do { \ + if (m) \ + seq_printf(m, x); \ + else \ + pr_info(x); \ +} while (0) + +struct sp_meminfo { + /* not huge page size from sp_alloc */ + atomic64_t alloc_nsize; + /* huge page size from sp_alloc */ + atomic64_t alloc_hsize; + /* total size from sp_k2u */ + atomic64_t k2u_size; +}; + +enum sp_mapping_type { + SP_MAPPING_START, + SP_MAPPING_DVPP = SP_MAPPING_START, + SP_MAPPING_NORMAL, + SP_MAPPING_RO, + SP_MAPPING_END, +}; + +/* + * address space management + */ +struct sp_mapping { + unsigned long type; + atomic_t user; + unsigned long start[MAX_DEVID]; + unsigned long end[MAX_DEVID]; + struct rb_root area_root; + + struct rb_node *free_area_cache; + unsigned long cached_hole_size; + unsigned long cached_vstart; + + /* list head for all groups attached to this mapping, dvpp mapping only */ + struct list_head group_head; + struct list_head spm_node; + spinlock_t sp_mapping_lock; +}; + +/* Processes in the same sp_group can share memory. + * Memory layout for share pool: + * + * |-------------------- 8T -------------------|---|------ 8T ------------| + * | Device 0 | Device 1 |...| | + * |----------------------------------------------------------------------| + * |------------- 16G -------------| 16G | | | + * | DVPP GROUP0 | DVPP GROUP1 | ... | ... |...| sp normal memory | + * | sp | sp | | | | | + * |----------------------------------------------------------------------| + * + * The host SVM feature reserves 8T virtual memory by mmap, and due to the + * restriction of DVPP, while SVM and share pool will both allocate memory + * for DVPP, the memory have to be in the same 32G range. + * + * Share pool reserves 16T memory, with 8T for normal uses and 8T for DVPP. + * Within this 8T DVPP memory, SVM will call sp_config_dvpp_range() to + * tell us which 16G memory range is reserved for share pool . + * + * In some scenarios where there is no host SVM feature, share pool uses + * the default 8G memory setting for DVPP. 
+ */ +struct sp_group { + int id; + struct file *file; + struct file *file_hugetlb; + /* number of process in this group */ + int proc_num; + /* list head of processes (sp_group_node, each represents a process) */ + struct list_head proc_head; + /* it is protected by rw_lock of this spg */ + struct rb_root spa_root; + /* group statistics */ + struct sp_meminfo meminfo; + atomic_t use_count; + atomic_t spa_num; + /* protect the group internal elements */ + struct rw_semaphore rw_lock; + /* list node for dvpp mapping */ + struct list_head mnode; + struct sp_mapping *mapping[SP_MAPPING_END]; +}; + +/* a per-process(per mm) struct which manages a sp_group_node list */ +struct sp_group_master { + pid_t tgid; + /* + * number of sp groups the process belongs to, + * a.k.a the number of sp_node in group_head + */ + unsigned int group_num; + /* list head of sp_node */ + struct list_head group_head; + struct mm_struct *mm; + /* + * Used to apply for the shared pool memory of the current process. + * For example, sp_alloc non-share memory or k2task. + */ + struct sp_group *local; + struct sp_meminfo meminfo; + struct list_head list_node; + char comm[TASK_COMM_LEN]; +}; + +/* + * each instance represents an sp group the process belongs to + * sp_group_master : sp_group_node = 1 : N + * sp_group_node->spg : sp_group = 1 : 1 + * sp_group_node : sp_group->proc_head = N : 1 + */ +struct sp_group_node { + /* list node in sp_group->proc_head */ + struct list_head proc_node; + /* list node in sp_group_maseter->group_head */ + struct list_head group_node; + struct sp_group_master *master; + struct sp_group *spg; + unsigned long prot; + + /* + * alloc amount minus free amount, may be negative when freed by + * another task in the same sp group. + */ + struct sp_meminfo meminfo; +}; + +static inline void sp_add_group_master(struct sp_group_master *master) +{ + mutex_lock(&master_list_lock); + list_add_tail(&master->list_node, &master_list); + mutex_unlock(&master_list_lock); +} + +static inline void sp_del_group_master(struct sp_group_master *master) +{ + mutex_lock(&master_list_lock); + list_del(&master->list_node); + mutex_unlock(&master_list_lock); +} + +static void meminfo_init(struct sp_meminfo *meminfo) +{ + memset(meminfo, 0, sizeof(struct sp_meminfo)); +} + +static void meminfo_inc_usage(unsigned long size, bool huge, struct sp_meminfo *meminfo) +{ + if (huge) + atomic64_add(size, &meminfo->alloc_hsize); + else + atomic64_add(size, &meminfo->alloc_nsize); +} + +static void meminfo_dec_usage(unsigned long size, bool huge, struct sp_meminfo *meminfo) +{ + if (huge) + atomic64_sub(size, &meminfo->alloc_hsize); + else + atomic64_sub(size, &meminfo->alloc_nsize); +} + +static void meminfo_inc_k2u(unsigned long size, struct sp_meminfo *meminfo) +{ + atomic64_add(size, &meminfo->k2u_size); +} + +static void meminfo_dec_k2u(unsigned long size, struct sp_meminfo *meminfo) +{ + atomic64_sub(size, &meminfo->k2u_size); +} + +static inline long meminfo_alloc_sum(struct sp_meminfo *meminfo) +{ + return atomic64_read(&meminfo->alloc_nsize) + + atomic64_read(&meminfo->alloc_hsize); +} + +static inline long meminfo_alloc_sum_byKB(struct sp_meminfo *meminfo) +{ + return byte2kb(meminfo_alloc_sum(meminfo)); +} + +static inline long meminfo_k2u_size(struct sp_meminfo *meminfo) +{ + return byte2kb(atomic64_read(&meminfo->k2u_size)); +} + +static inline long long meminfo_total_size(struct sp_meminfo *meminfo) +{ + return atomic64_read(&meminfo->alloc_nsize) + + atomic64_read(&meminfo->alloc_hsize) + + 
atomic64_read(&meminfo->k2u_size); +} + +static unsigned long sp_mapping_type(struct sp_mapping *spm) +{ + return spm->type; +} + +static void sp_mapping_set_type(struct sp_mapping *spm, unsigned long type) +{ + spm->type = type; +} + +static struct sp_mapping *sp_mapping_normal; +static struct sp_mapping *sp_mapping_ro; + +static void sp_mapping_add_to_list(struct sp_mapping *spm) +{ + mutex_lock(&spm_list_lock); + if (sp_mapping_type(spm) == SP_MAPPING_DVPP) + list_add_tail(&spm->spm_node, &spm_dvpp_list); + mutex_unlock(&spm_list_lock); +} + +static void sp_mapping_remove_from_list(struct sp_mapping *spm) +{ + mutex_lock(&spm_list_lock); + if (sp_mapping_type(spm) == SP_MAPPING_DVPP) + list_del(&spm->spm_node); + mutex_unlock(&spm_list_lock); +} + +static void sp_mapping_range_init(struct sp_mapping *spm) +{ + int i; + + for (i = 0; i < MAX_DEVID; i++) { + switch (sp_mapping_type(spm)) { + case SP_MAPPING_RO: + spm->start[i] = MMAP_SHARE_POOL_RO_START; + spm->end[i] = MMAP_SHARE_POOL_RO_END; + break; + case SP_MAPPING_NORMAL: + spm->start[i] = MMAP_SHARE_POOL_NORMAL_START; + spm->end[i] = MMAP_SHARE_POOL_NORMAL_END; + break; + case SP_MAPPING_DVPP: + spm->start[i] = MMAP_SHARE_POOL_DVPP_START + i * MMAP_SHARE_POOL_16G_SIZE; + spm->end[i] = spm->start[i] + MMAP_SHARE_POOL_16G_SIZE; + break; + default: + pr_err("Invalid sp_mapping type [%lu]\n", sp_mapping_type(spm)); + break; + } + } +} + +static struct sp_mapping *sp_mapping_create(unsigned long type) +{ + struct sp_mapping *spm; + + spm = kzalloc(sizeof(struct sp_mapping), GFP_KERNEL); + if (!spm) + return NULL; + + sp_mapping_set_type(spm, type); + sp_mapping_range_init(spm); + atomic_set(&spm->user, 0); + spm->area_root = RB_ROOT; + INIT_LIST_HEAD(&spm->group_head); + spin_lock_init(&spm->sp_mapping_lock); + sp_mapping_add_to_list(spm); + + return spm; +} + +static void sp_mapping_destroy(struct sp_mapping *spm) +{ + sp_mapping_remove_from_list(spm); + kfree(spm); +} + +static void sp_mapping_attach(struct sp_group *spg, struct sp_mapping *spm) +{ + unsigned long type = sp_mapping_type(spm); + + atomic_inc(&spm->user); + spg->mapping[type] = spm; + if (type == SP_MAPPING_DVPP) + list_add_tail(&spg->mnode, &spm->group_head); +} + +static void sp_mapping_detach(struct sp_group *spg, struct sp_mapping *spm) +{ + unsigned long type; + + if (!spm) + return; + + type = sp_mapping_type(spm); + if (type == SP_MAPPING_DVPP) + list_del(&spg->mnode); + if (atomic_dec_and_test(&spm->user)) + sp_mapping_destroy(spm); + + spg->mapping[type] = NULL; +} + +/* merge old mapping to new, and the old mapping would be destroyed */ +static void sp_mapping_merge(struct sp_mapping *new, struct sp_mapping *old) +{ + struct sp_group *spg, *tmp; + + if (new == old) + return; + + list_for_each_entry_safe(spg, tmp, &old->group_head, mnode) { + list_move_tail(&spg->mnode, &new->group_head); + spg->mapping[SP_MAPPING_DVPP] = new; + } + + atomic_add(atomic_read(&old->user), &new->user); + sp_mapping_destroy(old); +} + +static bool is_mapping_empty(struct sp_mapping *spm) +{ + return RB_EMPTY_ROOT(&spm->area_root); +} + +static bool can_mappings_merge(struct sp_mapping *m1, struct sp_mapping *m2) +{ + int i; + + for (i = 0; i < MAX_DEVID; i++) + if (m1->start[i] != m2->start[i] || m1->end[i] != m2->end[i]) + return false; + + return true; +} + +/* + * 1. The mappings of local group is set on creating. + * 2. This is used to setup the mapping for groups created during add_task. + * 3. The normal mapping exists for all groups. + * 4. 
The dvpp mappings for the new group and local group can merge _iff_ at + * least one of the mapping is empty. + * the caller must hold sp_global_sem + * NOTE: undo the mergeing when the later process failed. + */ +static int sp_group_setup_mapping_normal(struct mm_struct *mm, struct sp_group *spg) +{ + struct sp_mapping *local_dvpp_mapping, *spg_dvpp_mapping; + + local_dvpp_mapping = mm->sp_group_master->local->mapping[SP_MAPPING_DVPP]; + spg_dvpp_mapping = spg->mapping[SP_MAPPING_DVPP]; + + if (!list_empty(&spg->proc_head)) { + /* + * Don't return an error when the mappings' address range conflict. + * As long as the mapping is unused, we can drop the empty mapping. + * This may change the address range for the task or group implicitly, + * give a warn for it. + */ + bool is_conflict = !can_mappings_merge(local_dvpp_mapping, spg_dvpp_mapping); + + if (is_mapping_empty(local_dvpp_mapping)) { + sp_mapping_merge(spg_dvpp_mapping, local_dvpp_mapping); + if (is_conflict) + pr_warn_ratelimited("task address space conflict, spg_id=%d\n", + spg->id); + } else if (is_mapping_empty(spg_dvpp_mapping)) { + sp_mapping_merge(local_dvpp_mapping, spg_dvpp_mapping); + if (is_conflict) + pr_warn_ratelimited("group address space conflict, spg_id=%d\n", + spg->id); + } else { + pr_info_ratelimited("Duplicate address space, id=%d\n", spg->id); + return -EINVAL; + } + } else { + /* the mapping of local group is always set */ + sp_mapping_attach(spg, local_dvpp_mapping); + if (!spg->mapping[SP_MAPPING_NORMAL]) + sp_mapping_attach(spg, sp_mapping_normal); + if (!spg->mapping[SP_MAPPING_RO]) + sp_mapping_attach(spg, sp_mapping_ro); + } + + return 0; +} + +static inline bool is_local_group(int spg_id) +{ + return spg_id >= SPG_ID_LOCAL_MIN && spg_id <= SPG_ID_LOCAL_MAX; +} + +static void update_mem_usage_alloc(unsigned long size, bool inc, + bool is_hugepage, struct sp_group_node *spg_node) +{ + if (inc) { + meminfo_inc_usage(size, is_hugepage, &spg_node->meminfo); + meminfo_inc_usage(size, is_hugepage, &spg_node->master->meminfo); + } else { + meminfo_dec_usage(size, is_hugepage, &spg_node->meminfo); + meminfo_dec_usage(size, is_hugepage, &spg_node->master->meminfo); + } +} + +static void update_mem_usage_k2u(unsigned long size, bool inc, + struct sp_group_node *spg_node) +{ + if (inc) { + meminfo_inc_k2u(size, &spg_node->meminfo); + meminfo_inc_k2u(size, &spg_node->master->meminfo); + } else { + meminfo_dec_k2u(size, &spg_node->meminfo); + meminfo_dec_k2u(size, &spg_node->master->meminfo); + } +} + +struct sp_spa_stat { + atomic64_t alloc_num; + atomic64_t k2u_task_num; + atomic64_t k2u_spg_num; + atomic64_t alloc_size; + atomic64_t k2u_task_size; + atomic64_t k2u_spg_size; + atomic64_t dvpp_size; + atomic64_t dvpp_va_size; +}; + +static struct sp_spa_stat spa_stat; + +/* statistics of all sp group born from sp_alloc and k2u(spg) */ +struct sp_overall_stat { + atomic_t spa_total_num; + atomic64_t spa_total_size; +}; + +static struct sp_overall_stat sp_overall_stat; + +/*** Global share pool VA allocator ***/ + +enum spa_type { + SPA_TYPE_ALLOC = 1, + SPA_TYPE_K2TASK, + SPA_TYPE_K2SPG, +}; + +/* + * The lifetime for a sp_area: + * 1. The sp_area was created from a sp_mapping with sp_mapping_lock held. + * 2. The sp_area was added into a sp_group (using rb_tree). + * 3. The sp_area was mapped to all tasks in the sp_group and we bump its + * reference when each mmap succeeded. + * 4. When a new task was added into the sp_group, we map the sp_area into + * the new task and increase its reference count. + * 5. 
When a task was deleted from the sp_group, we unmap the sp_area for + * the task and decrease its reference count. + * 6. Also, we can use sp_free/sp_unshare to unmap the sp_area for all the + * tasks in the sp_group. And the reference count was decreased for each + * munmap. + * 7. When the refcount for sp_area reach zero: + * a. the sp_area would firstly be deleted from the sp_group and then + * deleted from sp_mapping. + * b. no one should use the sp_area from the view of sp_group. + * c. the spa->spg should not be used when the sp_area is not on a spg. + * + * The locking rules: + * 1. The newly created sp_area with a refcount of one. This is to distinct + * the new sp_area from a dying sp_area. + * 2. Use spg->rw_lock to protect all the sp_area in the sp_group. And the + * sp_area cannot be deleted without spg->rw_lock. + */ +struct sp_area { + unsigned long va_start; + unsigned long va_end; /* va_end always align to hugepage */ + unsigned long real_size; /* real size with alignment */ + unsigned long region_vstart; /* belong to normal region or DVPP region */ + unsigned long flags; + bool is_hugepage; + atomic_t use_count; /* How many vmas use this VA region */ + struct rb_node rb_node; /* address sorted rbtree */ + struct rb_node spg_link; /* link to the spg->rb_root */ + struct sp_group *spg; + struct sp_mapping *spm; /* where spa born from */ + enum spa_type type; + unsigned long kva; /* shared kva */ + pid_t applier; /* the original applier process */ + int preferred_node_id; /* memory node */ + struct work_struct work; +}; + +static unsigned long spa_size(struct sp_area *spa) +{ + return spa->real_size; +} + +static struct file *spa_file(struct sp_area *spa) +{ + if (spa->is_hugepage) + return spa->spg->file_hugetlb; + else + return spa->spg->file; +} + +/* the caller should hold sp_area_lock */ +static void spa_inc_usage(struct sp_area *spa) +{ + enum spa_type type = spa->type; + unsigned long size = spa->real_size; + bool is_dvpp = spa->flags & SP_DVPP; + bool is_huge = spa->is_hugepage; + + switch (type) { + case SPA_TYPE_ALLOC: + atomic64_inc(&spa_stat.alloc_num); + atomic64_add(size, &spa_stat.alloc_size); + meminfo_inc_usage(size, is_huge, &spa->spg->meminfo); + break; + case SPA_TYPE_K2TASK: + atomic64_inc(&spa_stat.k2u_task_num); + atomic64_add(size, &spa_stat.k2u_task_size); + meminfo_inc_k2u(size, &spa->spg->meminfo); + break; + case SPA_TYPE_K2SPG: + atomic64_inc(&spa_stat.k2u_spg_num); + atomic64_add(size, &spa_stat.k2u_spg_size); + meminfo_inc_k2u(size, &spa->spg->meminfo); + break; + default: + WARN(1, "invalid spa type"); + } + + if (is_dvpp) { + atomic64_add(size, &spa_stat.dvpp_size); + atomic64_add(ALIGN(size, PMD_SIZE), &spa_stat.dvpp_va_size); + } + + if (!is_local_group(spa->spg->id)) { + atomic_inc(&sp_overall_stat.spa_total_num); + atomic64_add(size, &sp_overall_stat.spa_total_size); + } +} + +/* the caller should hold sp_area_lock */ +static void spa_dec_usage(struct sp_area *spa) +{ + enum spa_type type = spa->type; + unsigned long size = spa->real_size; + bool is_dvpp = spa->flags & SP_DVPP; + bool is_huge = spa->is_hugepage; + + switch (type) { + case SPA_TYPE_ALLOC: + atomic64_dec(&spa_stat.alloc_num); + atomic64_sub(size, &spa_stat.alloc_size); + meminfo_dec_usage(size, is_huge, &spa->spg->meminfo); + break; + case SPA_TYPE_K2TASK: + atomic64_dec(&spa_stat.k2u_task_num); + atomic64_sub(size, &spa_stat.k2u_task_size); + meminfo_dec_k2u(size, &spa->spg->meminfo); + break; + case SPA_TYPE_K2SPG: + atomic64_dec(&spa_stat.k2u_spg_num); + 
atomic64_sub(size, &spa_stat.k2u_spg_size); + meminfo_dec_k2u(size, &spa->spg->meminfo); + break; + default: + WARN(1, "invalid spa type"); + } + + if (is_dvpp) { + atomic64_sub(size, &spa_stat.dvpp_size); + atomic64_sub(ALIGN(size, PMD_SIZE), &spa_stat.dvpp_va_size); + } + + if (!is_local_group(spa->spg->id)) { + atomic_dec(&sp_overall_stat.spa_total_num); + atomic64_sub(spa->real_size, &sp_overall_stat.spa_total_size); + } +} /** * mp_sp_group_id_by_pid() - Get the sp_group ID array of a process. * @tgid: tgid of target process. @@ -108,6 +727,323 @@ int mg_sp_id_of_current(void) } EXPORT_SYMBOL_GPL(mg_sp_id_of_current);
+#define insert_sp_area(__root, __spa, node, rb_node_param) \ + do { \ + struct rb_node **p = &((__root)->rb_node); \ + struct rb_node *parent = NULL; \ + while (*p) { \ + struct sp_area *tmp; \ + parent = *p; \ + tmp = rb_entry(parent, struct sp_area, rb_node_param);\ + if (__spa->va_start < tmp->va_end) \ + p = &(*p)->rb_left; \ + else if (__spa->va_end > tmp->va_start) \ + p = &(*p)->rb_right; \ + else \ + WARN(1, "duplicate spa of tree " #__root);\ + } \ + rb_link_node((node), parent, p); \ + rb_insert_color((node), (__root)); \ + } while (0) + +/* the caller must hold sp_mapping_lock */ +static void spm_insert_area(struct sp_mapping *spm, struct sp_area *spa) +{ + insert_sp_area(&spm->area_root, spa, &spa->rb_node, rb_node); +} + +static void sp_group_insert_area(struct sp_group *spg, struct sp_area *spa) +{ + insert_sp_area(&spg->spa_root, spa, &spa->spg_link, spg_link); + atomic_inc(&spg->spa_num); + spa_inc_usage(spa); + if (atomic_read(&spg->spa_num) == 1) + atomic_inc(&spg->use_count); +} + +/* + * The caller must hold spg->rw_lock. + * Return true to indicate that ths spa_num in the spg reaches zero and the caller + * should drop the extra spg->use_count added in sp_group_insert_area(). + */ +static bool sp_group_delete_area(struct sp_group *spg, struct sp_area *spa) +{ + rb_erase(&spa->spg_link, &spg->spa_root); + spa_dec_usage(spa); + return atomic_dec_and_test(&spa->spg->spa_num); +} + +/** + * sp_area_alloc() - Allocate a region of VA from the share pool. + * @size: the size of VA to allocate. + * @flags: how to allocate the memory. + * @spg: the share group that the memory is allocated to. + * @type: the type of the region. + * @applier: the tgid of the task which allocates the region. + * + * Return: a valid pointer for success, NULL on failure. + */ +static struct sp_area *sp_area_alloc(unsigned long size, unsigned long flags, + struct sp_group *spg, enum spa_type type, + pid_t applier, int node_id) +{ + int device_id; + struct sp_area *spa, *first, *err; + struct rb_node *n; + unsigned long vstart; + unsigned long vend; + unsigned long addr; + unsigned long size_align = ALIGN(size, PMD_SIZE); /* va aligned to 2M */ + struct sp_mapping *mapping; + + device_id = sp_flags_device_id(flags); + if (device_id < 0 || device_id >= MAX_DEVID) { + pr_err("invalid device id %d\n", device_id); + return ERR_PTR(-EINVAL); + } + + if (flags & SP_PROT_FOCUS) { + if ((flags & (SP_DVPP | SP_PROT_RO)) != SP_PROT_RO) { + pr_err("invalid sp_flags [%lx]\n", flags); + return ERR_PTR(-EINVAL); + } + mapping = spg->mapping[SP_MAPPING_RO]; + } else if (flags & SP_DVPP) { + mapping = spg->mapping[SP_MAPPING_DVPP]; + } else { + mapping = spg->mapping[SP_MAPPING_NORMAL]; + } + + if (!mapping) { + pr_err_ratelimited("non DVPP spg, id %d\n", spg->id); + return ERR_PTR(-EINVAL); + } + + vstart = mapping->start[device_id]; + vend = mapping->end[device_id]; + spa = kmalloc(sizeof(struct sp_area), GFP_KERNEL); + if (unlikely(!spa)) + return ERR_PTR(-ENOMEM); + + spin_lock(&mapping->sp_mapping_lock); + + /* + * Invalidate cache if we have more permissive parameters. + * cached_hole_size notes the largest hole noticed _below_ + * the sp_area cached in free_area_cache: if size fits + * into that hole, we want to scan from vstart to reuse + * the hole instead of allocating above free_area_cache. + * Note that sp_area_free may update free_area_cache + * without updating cached_hole_size. 
+ */ + if (!mapping->free_area_cache || size_align < mapping->cached_hole_size || + vstart != mapping->cached_vstart) { + mapping->cached_hole_size = 0; + mapping->free_area_cache = NULL; + } + + /* record if we encounter less permissive parameters */ + mapping->cached_vstart = vstart; + + /* find starting point for our search */ + if (mapping->free_area_cache) { + first = rb_entry(mapping->free_area_cache, struct sp_area, rb_node); + addr = first->va_end; + if (addr + size_align < addr) { + err = ERR_PTR(-EOVERFLOW); + goto error; + } + } else { + addr = vstart; + if (addr + size_align < addr) { + err = ERR_PTR(-EOVERFLOW); + goto error; + } + + n = mapping->area_root.rb_node; + first = NULL; + + while (n) { + struct sp_area *tmp; + + tmp = rb_entry(n, struct sp_area, rb_node); + if (tmp->va_end >= addr) { + first = tmp; + if (tmp->va_start <= addr) + break; + n = n->rb_left; + } else + n = n->rb_right; + } + + if (!first) + goto found; + } + + /* from the starting point, traverse areas until a suitable hole is found */ + while (addr + size_align > first->va_start && addr + size_align <= vend) { + if (addr + mapping->cached_hole_size < first->va_start) + mapping->cached_hole_size = first->va_start - addr; + addr = first->va_end; + if (addr + size_align < addr) { + err = ERR_PTR(-EOVERFLOW); + goto error; + } + + n = rb_next(&first->rb_node); + if (n) + first = rb_entry(n, struct sp_area, rb_node); + else + goto found; + } + +found: + if (addr + size_align > vend) { + err = ERR_PTR(-EOVERFLOW); + goto error; + } + + spa->va_start = addr; + spa->va_end = addr + size_align; + spa->real_size = size; + spa->region_vstart = vstart; + spa->flags = flags; + spa->is_hugepage = (flags & SP_HUGEPAGE); + spa->spg = spg; + spa->spm = mapping; + spa->type = type; + spa->kva = 0; /* NULL pointer */ + spa->applier = applier; + spa->preferred_node_id = node_id; + atomic_set(&spa->use_count, 1); + + /* the link location could be saved before, to be optimized */ + spm_insert_area(mapping, spa); + mapping->free_area_cache = &spa->rb_node; + + spin_unlock(&mapping->sp_mapping_lock); + sp_group_insert_area(spg, spa); + + return spa; + +error: + spin_unlock(&mapping->sp_mapping_lock); + kfree(spa); + return err; +} + +/* + * Find a spa with key @addr from @spg and increase its use_count. + * The caller should hold spg->rw_lock + */ +static struct sp_area *sp_area_get(struct sp_group *spg, + unsigned long addr) +{ + struct rb_node *n = spg->spa_root.rb_node; + + while (n) { + struct sp_area *spa; + + spa = rb_entry(n, struct sp_area, spg_link); + if (addr < spa->va_start) { + n = n->rb_left; + } else if (addr > spa->va_start) { + n = n->rb_right; + } else { + /* a spa without any user will die soon */ + if (atomic_inc_not_zero(&spa->use_count)) + return spa; + else + return NULL; + } + } + + return NULL; +} + +/* + * Free the VA region starting from addr to the share pool + */ +static void sp_area_free(struct sp_area *spa) +{ + struct sp_mapping *spm = spa->spm; + + spin_lock(&spm->sp_mapping_lock); + if (spm->free_area_cache) { + struct sp_area *cache; + + cache = rb_entry(spm->free_area_cache, struct sp_area, rb_node); + if (spa->va_start <= cache->va_start) { + spm->free_area_cache = rb_prev(&spa->rb_node); + /* + * the new cache node may be changed to another region, + * i.e. 
from DVPP region to normal region + */ + if (spm->free_area_cache) { + cache = rb_entry(spm->free_area_cache, + struct sp_area, rb_node); + spm->cached_vstart = cache->region_vstart; + } + /* + * We don't try to update cached_hole_size, + * but it won't go very wrong. + */ + } + } + + rb_erase(&spa->rb_node, &spm->area_root); + spin_unlock(&spm->sp_mapping_lock); + RB_CLEAR_NODE(&spa->rb_node); + kfree(spa); +} + +static void sp_area_put_locked(struct sp_area *spa) +{ + if (atomic_dec_and_test(&spa->use_count)) { + if (sp_group_delete_area(spa->spg, spa)) + /* the caller must hold a refcount for spa->spg under spg->rw_lock */ + atomic_dec(&spa->spg->use_count); + sp_area_free(spa); + } +} + +static void sp_group_put(struct sp_group *spg) {} +static void sp_area_drop_func(struct work_struct *work) +{ + bool spa_zero; + struct sp_area *spa = container_of(work, struct sp_area, work); + struct sp_group *spg = spa->spg; + + down_write(&spg->rw_lock); + spa_zero = sp_group_delete_area(spg, spa); + up_write(&spg->rw_lock); + sp_area_free(spa); + if (spa_zero) + sp_group_put(spg); +} + +void __sp_area_drop(struct vm_area_struct *vma) +{ + struct sp_area *spa = vma->spa; + + if (!(vma->vm_flags & VM_SHARE_POOL)) + return; + + /* + * Considering a situation where task A and B are in the same spg. + * A is exiting and calling remove_vma(). Before A calls this func, + * B calls sp_free() to free the same spa. So spa maybe NULL when A + * calls this func later. + */ + if (!spa) + return; + + if (atomic_dec_and_test(&spa->use_count)) { + INIT_WORK(&spa->work, sp_area_drop_func); + schedule_work(&spa->work); + } +} + /** * mg_sp_free() - Free the memory allocated by mg_sp_alloc() or * mg_sp_alloc_nodemask(). @@ -308,8 +1244,25 @@ static int __init share_pool_init(void) if (!sp_is_enabled()) return 0;
+ sp_mapping_normal = sp_mapping_create(SP_MAPPING_NORMAL); + if (!sp_mapping_normal) + goto fail; + atomic_inc(&sp_mapping_normal->user); + + sp_mapping_ro = sp_mapping_create(SP_MAPPING_RO); + if (!sp_mapping_ro) + goto free_normal; + atomic_inc(&sp_mapping_ro->user); + proc_sharepool_init();
return 0; + +free_normal: + kfree(sp_mapping_normal); +fail: + pr_err("Ascend share pool initialization failed\n"); + static_branch_disable(&share_pool_enabled_key); + return 1; } late_initcall(share_pool_init);
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
Change the mmap_base in mm_struct so that user mappings are placed below the share-pool region, and check the limit in get_unmapped_area().
The VMAs mapped from the share pool cannot be merged because the underlying sp_areas cannot be merged. Check this in is_mergeable_vma() instead of vma_merge().
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- include/linux/share_pool.h | 11 +++++++++++ mm/mmap.c | 7 +++++++ mm/util.c | 4 ++++ 3 files changed, 22 insertions(+)
diff --git a/include/linux/share_pool.h b/include/linux/share_pool.h index 1333b9994242..6da32aef6886 100644 --- a/include/linux/share_pool.h +++ b/include/linux/share_pool.h @@ -130,6 +130,13 @@ static inline void sp_area_work_around(struct vm_unmapped_area_info *info) info->high_limit = min(info->high_limit, MMAP_SHARE_POOL_START); }
+extern void __sp_area_drop(struct vm_area_struct *vma); +static inline void sp_area_drop(struct vm_area_struct *vma) +{ + if (sp_is_enabled()) + __sp_area_drop(vma); +} + static inline bool sp_check_vm_share_pool(unsigned long vm_flags) { if (sp_is_enabled() && (vm_flags & VM_SHARE_POOL)) @@ -181,6 +188,10 @@ static inline int mg_sp_id_of_current(void) return -EPERM; }
+static inline void sp_area_drop(struct vm_area_struct *vma) +{ +} + static inline int mg_sp_walk_page_range(unsigned long uva, unsigned long size, struct task_struct *tsk, struct sp_walk_data *sp_walk_data) { diff --git a/mm/mmap.c b/mm/mmap.c index df2624e48119..ff9d9a8d25ce 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -47,6 +47,7 @@ #include <linux/oom.h> #include <linux/sched/mm.h> #include <linux/ksm.h> +#include <linux/share_pool.h>
#include <linux/uaccess.h> #include <asm/cacheflush.h> @@ -142,6 +143,7 @@ static void remove_vma(struct vm_area_struct *vma, bool unreachable) if (vma->vm_file) fput(vma->vm_file); mpol_put(vma_policy(vma)); + sp_area_drop(vma); if (unreachable) __vm_area_free(vma); else @@ -740,6 +742,10 @@ static inline bool is_mergeable_vma(struct vm_area_struct *vma, return false; if (!anon_vma_name_eq(anon_vma_name(vma), anon_name)) return false; + /* don't merge this kind of vma as sp_area couldn't be merged */ + if (sp_check_vm_share_pool(vm_flags)) + return false; + return true; }
@@ -1680,6 +1686,7 @@ unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) { unsigned long addr;
+ sp_area_work_around(info); if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) addr = unmapped_area_topdown(info); else diff --git a/mm/util.c b/mm/util.c index be798981acc7..90250cbc82fe 100644 --- a/mm/util.c +++ b/mm/util.c @@ -23,6 +23,7 @@ #include <linux/processor.h> #include <linux/sizes.h> #include <linux/compat.h> +#include <linux/share_pool.h>
#include <linux/uaccess.h>
@@ -439,6 +440,9 @@ static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack) else if (gap > MAX_GAP) gap = MAX_GAP;
+ if (sp_is_enabled()) + return ALIGN_DOWN(MMAP_SHARE_POOL_START - rnd, PAGE_SIZE); + return PAGE_ALIGN(STACK_TOP - gap - rnd); #endif }
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This function maps user memory of a process into the kernel vmalloc space.
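A usage sketch for reviewers (not part of the patch): an in-kernel caller maps a user buffer it received into vmalloc space and accesses it directly. The demo_* name, the pr_debug() output and the minimal error handling are illustrative only.

#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/share_pool.h>

static void *demo_map_user_buffer(unsigned long uva, unsigned long size)
{
	/* Map [uva, uva + size) of the current task into vmalloc space. */
	void *kva = mg_sp_make_share_u2k(uva, size, current->tgid);

	if (IS_ERR(kva))
		return kva;

	/* The kernel may now access the user pages directly through kva. */
	pr_debug("u2k mapping at %p\n", kva);

	return kva; /* released later via mg_sp_unshare() */
}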
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 299 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 298 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index d12150626496..49d9f7bfefbf 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -671,6 +671,13 @@ static void spa_dec_usage(struct sp_area *spa) atomic64_sub(spa->real_size, &sp_overall_stat.spa_total_size); } } + +static inline void check_interrupt_context(void) +{ + if (unlikely(in_interrupt())) + panic("function can't be used in interrupt context\n"); +} + /** * mp_sp_group_id_by_pid() - Get the sp_group ID array of a process. * @tgid: tgid of target process. @@ -1117,6 +1124,247 @@ void *mg_sp_make_share_k2u(unsigned long kva, unsigned long size, } EXPORT_SYMBOL_GPL(mg_sp_make_share_k2u);
+static int sp_pmd_entry(pmd_t *pmd, unsigned long addr, + unsigned long next, struct mm_walk *walk) +{ + struct page *page; + struct sp_walk_data *sp_walk_data = walk->private; + + /* + * There exist a scene in DVPP where the pagetable is huge page but its + * vma doesn't record it, something like THP. + * So we cannot make out whether it is a hugepage map until we access the + * pmd here. If mixed size of pages appear, just return an error. + */ + if (pmd_huge(*pmd)) { + if (!sp_walk_data->is_page_type_set) { + sp_walk_data->is_page_type_set = true; + sp_walk_data->is_hugepage = true; + } else if (!sp_walk_data->is_hugepage) { + return -EFAULT; + } + + /* To skip pte level walk */ + walk->action = ACTION_CONTINUE; + + page = pmd_page(*pmd); + get_page(page); + sp_walk_data->pages[sp_walk_data->page_count++] = page; + + return 0; + } + + if (!sp_walk_data->is_page_type_set) { + sp_walk_data->is_page_type_set = true; + sp_walk_data->is_hugepage = false; + } else if (sp_walk_data->is_hugepage) + return -EFAULT; + + sp_walk_data->pmd = pmd; + + return 0; +} + +static int sp_pte_entry(pte_t *pte, unsigned long addr, + unsigned long next, struct mm_walk *walk) +{ + struct page *page; + struct sp_walk_data *sp_walk_data = walk->private; + pmd_t *pmd = sp_walk_data->pmd; + +retry: + if (unlikely(!pte_present(*pte))) { + swp_entry_t entry; + spinlock_t *ptl = pte_lockptr(walk->mm, pmd); + + if (pte_none(*pte)) + goto no_page; + entry = pte_to_swp_entry(*pte); + if (!is_migration_entry(entry)) + goto no_page; + + pte_unmap_unlock(pte, ptl); + migration_entry_wait(walk->mm, pmd, addr); + pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); + goto retry; + } + + page = pte_page(*pte); + get_page(page); + sp_walk_data->pages[sp_walk_data->page_count++] = page; + return 0; + +no_page: + pr_debug("the page of addr %lx unexpectedly not in RAM\n", + (unsigned long)addr); + return -EFAULT; +} + +static int sp_test_walk(unsigned long addr, unsigned long next, + struct mm_walk *walk) +{ + /* + * FIXME: The devmm driver uses remap_pfn_range() but actually there + * are associated struct pages, so they should use vm_map_pages() or + * similar APIs. Before the driver has been converted to correct APIs + * we use this test_walk() callback so we can treat VM_PFNMAP VMAs as + * normal VMAs. + */ + return 0; +} + +static int sp_pte_hole(unsigned long start, unsigned long end, + int depth, struct mm_walk *walk) +{ + pr_debug("hole [%lx, %lx) appeared unexpectedly\n", + (unsigned long)start, (unsigned long)end); + return -EFAULT; +} + +static int sp_hugetlb_entry(pte_t *ptep, unsigned long hmask, + unsigned long addr, unsigned long next, + struct mm_walk *walk) +{ + pte_t pte = huge_ptep_get(ptep); + struct page *page = pte_page(pte); + struct sp_walk_data *sp_walk_data; + + if (unlikely(!pte_present(pte))) { + pr_debug("the page of addr %lx unexpectedly not in RAM\n", (unsigned long)addr); + return -EFAULT; + } + + sp_walk_data = walk->private; + get_page(page); + sp_walk_data->pages[sp_walk_data->page_count++] = page; + return 0; +} + +/* + * __sp_walk_page_range() - Walk page table with caller specific callbacks. + * @uva: the start VA of user memory. + * @size: the size of user memory. + * @mm: mm struct of the target task. + * @sp_walk_data: a structure of a page pointer array. + * + * the caller must hold mm->mmap_lock + * + * Notes for parameter alignment: + * When size == 0, let it be page_size, so that at least one page is walked. 
+ * + * When size > 0, for convenience, usually the parameters of uva and + * size are not page aligned. There are four different alignment scenarios and + * we must handler all of them correctly. + * + * The basic idea is to align down uva and align up size so all the pages + * in range [uva, uva + size) are walked. However, there are special cases. + * + * Considering a 2M-hugepage addr scenario. Assuming the caller wants to + * traverse range [1001M, 1004.5M), so uva and size is 1001M and 3.5M + * accordingly. The aligned-down uva is 1000M and the aligned-up size is 4M. + * The traverse range will be [1000M, 1004M). Obviously, the final page for + * [1004M, 1004.5M) is not covered. + * + * To fix this problem, we need to walk an additional page, size should be + * ALIGN(uva+size) - uva_aligned + */ +static int __sp_walk_page_range(unsigned long uva, unsigned long size, + struct mm_struct *mm, struct sp_walk_data *sp_walk_data) +{ + int ret = 0; + struct vm_area_struct *vma; + unsigned long page_nr; + struct page **pages = NULL; + bool is_hugepage = false; + unsigned long uva_aligned; + unsigned long size_aligned; + unsigned int page_size = PAGE_SIZE; + struct mm_walk_ops sp_walk = {}; + + /* + * Here we also support non share pool memory in this interface + * because the caller can't distinguish whether a uva is from the + * share pool or not. It is not the best idea to do so, but currently + * it simplifies overall design. + * + * In this situation, the correctness of the parameters is mainly + * guaranteed by the caller. + */ + vma = find_vma(mm, uva); + if (!vma) { + pr_debug("u2k input uva %lx is invalid\n", (unsigned long)uva); + return -EINVAL; + } + if (is_vm_hugetlb_page(vma)) + is_hugepage = true; + + sp_walk.pte_hole = sp_pte_hole; + sp_walk.test_walk = sp_test_walk; + if (is_hugepage) { + sp_walk_data->is_hugepage = true; + sp_walk.hugetlb_entry = sp_hugetlb_entry; + page_size = PMD_SIZE; + } else { + sp_walk_data->is_hugepage = false; + sp_walk.pte_entry = sp_pte_entry; + sp_walk.pmd_entry = sp_pmd_entry; + } + + sp_walk_data->is_page_type_set = false; + sp_walk_data->page_count = 0; + sp_walk_data->page_size = page_size; + uva_aligned = ALIGN_DOWN(uva, page_size); + sp_walk_data->uva_aligned = uva_aligned; + if (size == 0) + size_aligned = page_size; + else + /* special alignment handling */ + size_aligned = ALIGN(uva + size, page_size) - uva_aligned; + + if (uva_aligned + size_aligned < uva_aligned) { + pr_err_ratelimited("overflow happened in walk page range\n"); + return -EINVAL; + } + + page_nr = size_aligned / page_size; + pages = kvmalloc_array(page_nr, sizeof(struct page *), GFP_KERNEL); + if (!pages) { + pr_err_ratelimited("alloc page array failed in walk page range\n"); + return -ENOMEM; + } + sp_walk_data->pages = pages; + + ret = walk_page_range(mm, uva_aligned, uva_aligned + size_aligned, + &sp_walk, sp_walk_data); + if (ret) { + while (sp_walk_data->page_count--) + put_page(pages[sp_walk_data->page_count]); + kvfree(pages); + sp_walk_data->pages = NULL; + } + + if (sp_walk_data->is_hugepage) + sp_walk_data->uva_aligned = ALIGN_DOWN(uva, PMD_SIZE); + + return ret; +} + +static void __sp_walk_page_free(struct sp_walk_data *data) +{ + int i = 0; + struct page *page; + + while (i < data->page_count) { + page = data->pages[i++]; + put_page(page); + } + + kvfree(data->pages); + /* prevent repeated release */ + data->page_count = 0; + data->pages = NULL; +} + /** * mg_sp_make_share_u2k() - Share user memory of a specified process to kernel. 
* @uva: the VA of shared user memory @@ -1129,7 +1377,56 @@ EXPORT_SYMBOL_GPL(mg_sp_make_share_k2u); */ void *mg_sp_make_share_u2k(unsigned long uva, unsigned long size, int tgid) { - return ERR_PTR(-EOPNOTSUPP); + int ret = 0; + struct mm_struct *mm = current->mm; + void *p = ERR_PTR(-ESRCH); + struct sp_walk_data sp_walk_data; + struct vm_struct *area; + + if (!sp_is_enabled()) + return ERR_PTR(-EOPNOTSUPP); + + check_interrupt_context(); + + if (mm == NULL) { + pr_err("u2k: kthread is not allowed\n"); + return ERR_PTR(-EPERM); + } + + mmap_write_lock(mm); + ret = __sp_walk_page_range(uva, size, mm, &sp_walk_data); + if (ret) { + pr_err_ratelimited("walk page range failed %d\n", ret); + mmap_write_unlock(mm); + return ERR_PTR(ret); + } + + if (sp_walk_data.is_hugepage) + p = vmap_hugepage(sp_walk_data.pages, sp_walk_data.page_count, + VM_MAP, PAGE_KERNEL); + else + p = vmap(sp_walk_data.pages, sp_walk_data.page_count, VM_MAP, + PAGE_KERNEL); + mmap_write_unlock(mm); + + if (!p) { + pr_err("vmap(huge) in u2k failed\n"); + __sp_walk_page_free(&sp_walk_data); + return ERR_PTR(-ENOMEM); + } + + p = p + (uva - sp_walk_data.uva_aligned); + + /* + * kva p may be used later in k2u. Since p comes from uva originally, + * it's reasonable to add flag VM_USERMAP so that p can be remapped + * into userspace again. + */ + area = find_vm_area(p); + area->flags |= VM_USERMAP; + + kvfree(sp_walk_data.pages); + return p; } EXPORT_SYMBOL_GPL(mg_sp_make_share_u2k);
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This is the reverse of sp_make_share_u2k(): it unmaps the vmalloc area that u2k created.
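A usage sketch (not part of the patch) of the paired teardown for a kva returned by mg_sp_make_share_u2k(). The demo_* name is illustrative; spg_id is ignored for kernel addresses, so SPG_ID_NONE is passed.

#include <linux/kernel.h>
#include <linux/share_pool.h>

/* Release a mapping previously returned by mg_sp_make_share_u2k(). */
static void demo_unmap_user_buffer(void *kva, unsigned long size)
{
	int ret = mg_sp_unshare((unsigned long)kva, size, SPG_ID_NONE);

	if (ret)
		pr_err("u2k unshare failed: %d\n", ret);
}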
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 95 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index 49d9f7bfefbf..ce96ce8d3c80 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -1100,6 +1100,36 @@ void *mg_sp_alloc(unsigned long size, unsigned long sp_flags, int spg_id) } EXPORT_SYMBOL_GPL(mg_sp_alloc);
+/** + * is_vmap_hugepage() - Check if a kernel address belongs to vmalloc family. + * @addr: the kernel space address to be checked. + * + * Return: + * * >0 - a vmalloc hugepage addr. + * * =0 - a normal vmalloc addr. + * * -errno - failure. + */ +static int is_vmap_hugepage(unsigned long addr) +{ + struct vm_struct *area; + + if (unlikely(!addr)) { + pr_err_ratelimited("null vmap addr pointer\n"); + return -EINVAL; + } + + area = find_vm_area((void *)addr); + if (unlikely(!area)) { + pr_debug("can't find vm area(%lx)\n", addr); + return -EINVAL; + } + + if (area->flags & VM_HUGE_PAGES) + return 1; + else + return 0; +} + /** * mg_sp_make_share_k2u() - Share kernel memory to current process or an sp_group. * @kva: the VA of shared kernel memory. @@ -1430,6 +1460,50 @@ void *mg_sp_make_share_u2k(unsigned long uva, unsigned long size, int tgid) } EXPORT_SYMBOL_GPL(mg_sp_make_share_u2k);
+/* No possible concurrent protection, take care when use */ +static int sp_unshare_kva(unsigned long kva, unsigned long size) +{ + unsigned long addr, kva_aligned; + struct page *page; + unsigned long size_aligned; + unsigned long step; + bool is_hugepage = true; + int ret; + + ret = is_vmap_hugepage(kva); + if (ret > 0) { + kva_aligned = ALIGN_DOWN(kva, PMD_SIZE); + size_aligned = ALIGN(kva + size, PMD_SIZE) - kva_aligned; + step = PMD_SIZE; + } else if (ret == 0) { + kva_aligned = ALIGN_DOWN(kva, PAGE_SIZE); + size_aligned = ALIGN(kva + size, PAGE_SIZE) - kva_aligned; + step = PAGE_SIZE; + is_hugepage = false; + } else { + pr_err_ratelimited("check vmap hugepage failed %d\n", ret); + return -EINVAL; + } + + if (kva_aligned + size_aligned < kva_aligned) { + pr_err_ratelimited("overflow happened in unshare kva\n"); + return -EINVAL; + } + + for (addr = kva_aligned; addr < (kva_aligned + size_aligned); addr += step) { + page = vmalloc_to_page((void *)addr); + if (page) + put_page(page); + else + WARN(1, "vmalloc %pK to page/hugepage failed\n", + (void *)addr); + } + + vunmap((void *)kva_aligned); + + return 0; +} + /** * mg_sp_unshare() - Unshare the kernel or user memory which shared by calling * sp_make_share_{k2u,u2k}(). @@ -1442,7 +1516,27 @@ EXPORT_SYMBOL_GPL(mg_sp_make_share_u2k); */ int mg_sp_unshare(unsigned long va, unsigned long size, int spg_id) { - return -EOPNOTSUPP; + int ret = 0; + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + check_interrupt_context(); + + if (current->flags & PF_KTHREAD) + return -EINVAL; + + if (va < TASK_SIZE) { + } else if (va >= PAGE_OFFSET) { + /* kernel address */ + ret = sp_unshare_kva(va, size); + } else { + /* regard user and kernel address ranges as bad address */ + pr_debug("unshare addr %lx is not a user or kernel addr\n", (unsigned long)va); + ret = -EFAULT; + } + + return ret; } EXPORT_SYMBOL_GPL(mg_sp_unshare);
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This is a simple wrapper around walk_page_range() that collects all the pages of an spa. Holes in the range are not supported.
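A usage sketch (not part of the patch): collect the pages backing a user range of the current task, then drop the references with mg_sp_walk_page_free(). The demo_* name and the pr_debug() output are illustrative only.

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/share_pool.h>

static int demo_collect_pages(unsigned long uva, unsigned long size)
{
	struct sp_walk_data walk = { 0 };
	unsigned long i;
	int ret;

	ret = mg_sp_walk_page_range(uva, size, current, &walk);
	if (ret)
		return ret;

	for (i = 0; i < walk.page_count; i++)
		pr_debug("page %lu: pfn %lx\n", i, page_to_pfn(walk.pages[i]));

	/* Drop the page references and free the page array. */
	mg_sp_walk_page_free(&walk);

	return 0;
}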
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 40 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 39 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index ce96ce8d3c80..cb3828ec325c 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -1555,7 +1555,36 @@ EXPORT_SYMBOL_GPL(mg_sp_unshare); int mg_sp_walk_page_range(unsigned long uva, unsigned long size, struct task_struct *tsk, struct sp_walk_data *sp_walk_data) { - return -EOPNOTSUPP; + struct mm_struct *mm; + int ret = 0; + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + check_interrupt_context(); + + if (unlikely(!sp_walk_data)) { + pr_err_ratelimited("null pointer when walk page range\n"); + return -EINVAL; + } + if (!tsk || (tsk->flags & PF_EXITING)) + return -ESRCH; + + get_task_struct(tsk); + mm = get_task_mm(tsk); + if (!mm) { + put_task_struct(tsk); + return -ESRCH; + } + + mmap_write_lock(mm); + ret = __sp_walk_page_range(uva, size, mm, sp_walk_data); + mmap_write_unlock(mm); + + mmput(mm); + put_task_struct(tsk); + + return ret; } EXPORT_SYMBOL_GPL(mg_sp_walk_page_range);
@@ -1565,6 +1594,15 @@ EXPORT_SYMBOL_GPL(mg_sp_walk_page_range); */ void mg_sp_walk_page_free(struct sp_walk_data *sp_walk_data) { + if (!sp_is_enabled()) + return; + + check_interrupt_context(); + + if (!sp_walk_data) + return; + + __sp_walk_page_free(sp_walk_data); } EXPORT_SYMBOL_GPL(mg_sp_walk_page_free);
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
Free the user shared memory allocated by sp_alloc(). Note that this unmaps the memory from all the processes in the share pool group.
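A usage sketch (not part of the patch): free a buffer previously returned by mg_sp_alloc() for the given group; the mapping is removed from every process in that group. The demo_* name is illustrative only.

#include <linux/kernel.h>
#include <linux/share_pool.h>

static void demo_free_shared(void *va, int spg_id)
{
	int ret = mg_sp_free((unsigned long)va, spg_id);

	if (ret)
		pr_err("sp_free failed: %d\n", ret);
}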
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 190 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 189 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index cb3828ec325c..f05b0f3a8067 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -672,12 +672,69 @@ static void spa_dec_usage(struct sp_area *spa) } }
+static void update_mem_usage(unsigned long size, bool inc, bool is_hugepage, + struct sp_group_node *spg_node, enum spa_type type) +{ + switch (type) { + case SPA_TYPE_ALLOC: + update_mem_usage_alloc(size, inc, is_hugepage, spg_node); + break; + case SPA_TYPE_K2TASK: + case SPA_TYPE_K2SPG: + update_mem_usage_k2u(size, inc, spg_node); + break; + default: + WARN(1, "invalid stat type\n"); + } +} + static inline void check_interrupt_context(void) { if (unlikely(in_interrupt())) panic("function can't be used in interrupt context\n"); }
+/* + * Get the sp_group from the mm and return the associated sp_group_node. + * The caller should promise the @mm would not be deleted from the @spg. + */ +static struct sp_group *sp_group_get_from_mm(struct mm_struct *mm, int spg_id, + struct sp_group_node **pnode) +{ + struct sp_group *spg = NULL; + struct sp_group_node *spg_node; + struct sp_group_master *master; + + down_read(&sp_global_sem); + + master = mm->sp_group_master; + if (!master) { + up_read(&sp_global_sem); + return NULL; + } + + if (spg_id == SPG_ID_DEFAULT) { + atomic_inc(&master->local->use_count); + /* There is only one task in the local group */ + *pnode = list_first_entry(&master->local->proc_head, + struct sp_group_node, proc_node); + up_read(&sp_global_sem); + return master->local; + } + + list_for_each_entry(spg_node, &master->group_head, group_node) + if (spg_node->spg->id == spg_id) { + if (atomic_inc_not_zero(&spg_node->spg->use_count)) { + spg = spg_node->spg; + *pnode = spg_node; + } + break; + } + up_read(&sp_global_sem); + + return spg; +} + /** * mp_sp_group_id_by_pid() - Get the sp_group ID array of a process. * @tgid: tgid of target process. @@ -1051,6 +1108,126 @@ void __sp_area_drop(struct vm_area_struct *vma) } }
+/* + * The function calls of do_munmap() won't change any non-atomic member + * of struct sp_group. Please review the following chain: + * do_munmap -> remove_vma_list -> remove_vma -> sp_area_drop -> + * sp_area_free + */ +static void sp_munmap(struct mm_struct *mm, unsigned long addr, + unsigned long size) +{ + int err; + + mmap_write_lock(mm); + if (unlikely(!mmget_not_zero(mm))) { + mmap_write_unlock(mm); + pr_warn("munmap: target mm is exiting\n"); + return; + } + + err = do_munmap(mm, addr, size, NULL); + /* we are not supposed to fail */ + if (err) + pr_err("failed to unmap VA %pK when sp munmap, %d\n", (void *)addr, err); + + mmap_write_unlock(mm); + mmput_async(mm); +} + +/* The caller should hold the write lock for spa->spg->rw_lock */ +static void __sp_free(struct sp_area *spa, struct mm_struct *stop) +{ + struct mm_struct *mm; + struct sp_group_node *spg_node = NULL; + + list_for_each_entry(spg_node, &spa->spg->proc_head, proc_node) { + mm = spg_node->master->mm; + if (mm == stop) + break; + sp_munmap(mm, spa->va_start, spa_size(spa)); + } +} + +/* Free the memory of the backing shmem or hugetlbfs */ +static void sp_fallocate(struct sp_area *spa) +{ + int ret; + unsigned long mode = FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE; + unsigned long offset = addr_offset(spa); + + ret = vfs_fallocate(spa_file(spa), mode, offset, spa_size(spa)); + if (ret) + WARN(1, "sp fallocate failed %d\n", ret); +} + +static struct sp_group *sp_group_get_from_idr(int spg_id) +{ + struct sp_group *spg; + + down_read(&sp_global_sem); + spg = idr_find(&sp_group_idr, spg_id); + if (!spg || !atomic_inc_not_zero(&spg->use_count)) + spg = NULL; + up_read(&sp_global_sem); + + return spg; +} + +static int sp_free_inner(unsigned long addr, int spg_id, bool is_sp_free) +{ + int ret = 0; + struct sp_area *spa; + struct sp_group *spg; + struct sp_group_node *spg_node; + const char *str = is_sp_free ? "sp_free" : "unshare_uva"; + + if (!current->mm) + spg = sp_group_get_from_idr(spg_id); + else + spg = sp_group_get_from_mm(current->mm, spg_id, &spg_node); + if (!spg) { + pr_err("%s, get group failed %d\n", str, spg_id); + return -EINVAL; + } + + down_write(&spg->rw_lock); + spa = sp_area_get(spg, addr); + if (!spa) { + pr_debug("%s, invalid input addr %lx\n", str, addr); + ret = -EINVAL; + goto drop_spg; + } + + if ((is_sp_free && spa->type != SPA_TYPE_ALLOC) || + (!is_sp_free && spa->type == SPA_TYPE_ALLOC)) { + ret = -EINVAL; + pr_warn("%s failed, spa_type is not correct\n", str); + goto drop_spa; + } + + if (!current->mm && spa->applier != current->tgid) { + ret = -EPERM; + pr_err("%s, free a spa allocated by other process(%d), current(%d)\n", + str, spa->applier, current->tgid); + goto drop_spa; + } + + __sp_free(spa, NULL); + if (spa->type == SPA_TYPE_ALLOC) + sp_fallocate(spa); + + if (current->mm) + update_mem_usage(spa_size(spa), false, spa->is_hugepage, spg_node, spa->type); + +drop_spa: + sp_area_put_locked(spa); +drop_spg: + up_write(&spg->rw_lock); + sp_group_put(spg); + return ret; +} + /** * mg_sp_free() - Free the memory allocated by mg_sp_alloc() or * mg_sp_alloc_nodemask(). @@ -1065,7 +1242,15 @@ void __sp_area_drop(struct vm_area_struct *vma) */ int mg_sp_free(unsigned long addr, int id) { - return -EOPNOTSUPP; + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + check_interrupt_context(); + + if (current->flags & PF_KTHREAD) + return -EINVAL; + + return sp_free_inner(addr, id, true); } EXPORT_SYMBOL_GPL(mg_sp_free);
@@ -1527,6 +1712,9 @@ int mg_sp_unshare(unsigned long va, unsigned long size, int spg_id) return -EINVAL;
if (va < TASK_SIZE) { + /* All the spa are aligned to 2M. */ + spg_id = (spg_id == SPG_ID_NONE) ? SPG_ID_DEFAULT : spg_id; + ret = sp_free_inner(ALIGN_DOWN(va, PMD_SIZE), spg_id, false); } else if (va >= PAGE_OFFSET) { /* kernel address */ ret = sp_unshare_kva(va, size);
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
Allocate shared memory for the tasks in a share pool group. Tasks in the same group can access the memory through the same virtual address.
When the user calls mg_sp_alloc(), the physical memory is allocated in do_mm_populate() for the first process that is populated, and the pages are charged to that process's memcg. We therefore start with the current process so that the pages are charged to the memcg of the current process.
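A usage sketch (not part of the patch): allocate a hugepage-backed buffer that becomes visible at the same address in every process of the group. The size, the SP_HUGEPAGE choice and the demo_* name are illustrative only.

#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/share_pool.h>
#include <linux/sizes.h>

static void *demo_alloc_shared(int spg_id)
{
	void *va = mg_sp_alloc(SZ_4M, SP_HUGEPAGE, spg_id);

	if (IS_ERR(va))
		pr_err("sp_alloc failed: %ld\n", PTR_ERR(va));

	/*
	 * On success the same va is mapped in every process of the group and
	 * the backing pages are charged to the current process's memcg.
	 */
	return va;
}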
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 359 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 357 insertions(+), 2 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index f05b0f3a8067..50343077d0b1 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -484,6 +484,11 @@ static inline bool is_local_group(int spg_id) return spg_id >= SPG_ID_LOCAL_MIN && spg_id <= SPG_ID_LOCAL_MAX; }
+static int sp_init_group_master(struct task_struct *tsk, struct mm_struct *mm) +{ + return -EOPNOTSUPP; +} + static void update_mem_usage_alloc(unsigned long size, bool inc, bool is_hugepage, struct sp_group_node *spg_node) { @@ -694,6 +699,16 @@ static inline void check_interrupt_context(void) panic("function can't be used in interrupt context\n"); }
+struct sp_alloc_context { + unsigned long size; + unsigned long size_aligned; + unsigned long sp_flags; + nodemask_t *nodemask; + int preferred_node_id; + bool have_mbind; + enum spa_type type; +}; + /* * Get the sp_group from the mm and return the associated sp_group_node. * The caller should promise the @mm would not be deleted from the @spg. @@ -753,6 +768,11 @@ int mg_sp_group_id_by_pid(int tgid, int *spg_ids, int *num) } EXPORT_SYMBOL_GPL(mg_sp_group_id_by_pid);
+static bool is_online_node_id(int node_id) +{ + return node_id >= 0 && node_id < MAX_NUMNODES && node_online(node_id); +} + /** * mg_sp_group_add_task() - Add a process to an share group (sp_group). * @tgid: the tgid of the task to be added. @@ -1260,10 +1280,345 @@ static void __init proc_sharepool_init(void) return; }
+/* wrapper of __do_mmap() and the caller must hold mmap_write_lock(mm). */ +static unsigned long sp_mmap(struct mm_struct *mm, struct file *file, + struct sp_area *spa, unsigned long *populate, + unsigned long prot) +{ + unsigned long addr = spa->va_start; + unsigned long size = spa_size(spa); + unsigned long flags = MAP_FIXED_NOREPLACE | MAP_SHARED | MAP_POPULATE | + MAP_SHARE_POOL; + unsigned long vm_flags = VM_NORESERVE | VM_SHARE_POOL | VM_DONTCOPY; + unsigned long pgoff = addr_offset(spa) >> PAGE_SHIFT; + struct vm_area_struct *vma; + + if (spa->flags & SP_PROT_RO) + prot &= ~PROT_WRITE; + + atomic_inc(&spa->use_count); + addr = __do_mmap_mm(mm, file, addr, size, prot, flags, vm_flags, pgoff, + populate, NULL); + if (IS_ERR_VALUE(addr)) { + atomic_dec(&spa->use_count); + pr_err("do_mmap fails %ld\n", addr); + return addr; + } + + vma = find_vma(mm, addr); + vma->spa = spa; + + if (prot & PROT_WRITE) + /* clean PTE_RDONLY flags or trigger SMMU event */ + vma->vm_page_prot = __pgprot(((~PTE_RDONLY) & vma->vm_page_prot.pgprot) | + PTE_DIRTY); + else + vm_flags_clear(vma, VM_MAYWRITE); + + return addr; +} + +static int sp_alloc_prepare(unsigned long size, unsigned long sp_flags, + int spg_id, struct sp_alloc_context *ac) +{ + int device_id, node_id; + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + check_interrupt_context(); + + device_id = sp_flags_device_id(sp_flags); + node_id = sp_flags & SP_SPEC_NODE_ID ? sp_flags_node_id(sp_flags) : device_id; + if (!is_online_node_id(node_id)) { + pr_err_ratelimited("invalid numa node id %d\n", node_id); + return -EINVAL; + } + ac->preferred_node_id = node_id; + + if (current->flags & PF_KTHREAD) { + pr_err_ratelimited("allocation failed, task is kthread\n"); + return -EINVAL; + } + + if (unlikely(!size || (size >> PAGE_SHIFT) > totalram_pages())) { + pr_err_ratelimited("allocation failed, invalid size %lu\n", size); + return -EINVAL; + } + + if (spg_id != SPG_ID_DEFAULT && (spg_id < SPG_ID_MIN || spg_id >= SPG_ID_AUTO)) { + pr_err_ratelimited("allocation failed, invalid group id %d\n", spg_id); + return -EINVAL; + } + + if (sp_flags & (~SP_FLAG_MASK)) { + pr_err_ratelimited("allocation failed, invalid flag %lx\n", sp_flags); + return -EINVAL; + } + + if (sp_flags & SP_HUGEPAGE_ONLY) + sp_flags |= SP_HUGEPAGE; + + if (spg_id == SPG_ID_DEFAULT) { + /* + * We should first init the group_master in pass through scene and + * don't free it until we release the mm_struct. + */ + int ret = sp_init_group_master(current, current->mm); + + if (ret) { + pr_err("sp_alloc init local mapping failed %d\n", ret); + return ret; + } + } + + ac->type = SPA_TYPE_ALLOC; + ac->size = size; + ac->sp_flags = sp_flags; + ac->have_mbind = false; + ac->size_aligned = (sp_flags & SP_HUGEPAGE) ? ALIGN(size, PMD_SIZE) : + ALIGN(size, PAGE_SIZE); + + return 0; +} + +static bool sp_alloc_fallback(struct sp_area *spa, struct sp_alloc_context *ac) +{ + /* + * If hugepage allocation fails, this will transfer to normal page + * and try again. (only if SP_HUGEPAGE_ONLY is not flagged + */ + if (!(ac->sp_flags & SP_HUGEPAGE) || (ac->sp_flags & SP_HUGEPAGE_ONLY)) + return false; + + ac->size_aligned = ALIGN(ac->size, PAGE_SIZE); + ac->sp_flags &= ~SP_HUGEPAGE; + /* + * The mempolicy for shared memory is located at backend file, which varies + * between normal pages and huge pages. So we should set the mbind policy again + * when we retry using normal pages. 
+ */ + ac->have_mbind = false; + sp_area_put_locked(spa); + return true; +} + +static long sp_mbind(struct mm_struct *mm, unsigned long start, unsigned long len, + nodemask_t *nodemask) +{ + return __do_mbind(start, len, MPOL_BIND, MPOL_F_STATIC_NODES, + nodemask, MPOL_MF_STRICT, mm); +} + +static int sp_alloc_populate(struct mm_struct *mm, struct sp_area *spa, + unsigned long populate, struct sp_alloc_context *ac) +{ + int ret; + + if (ac && !ac->have_mbind) { + ret = sp_mbind(mm, spa->va_start, spa->real_size, ac->nodemask); + if (ret < 0) { + pr_err("cannot bind the memory range to node[%*pbl], err:%d\n", + nodemask_pr_args(ac->nodemask), ret); + return ret; + } + ac->have_mbind = true; + } + + /* + * We are not ignoring errors, so if we fail to allocate + * physical memory we just return failure, so we won't encounter + * page fault later on, and more importantly sp_make_share_u2k() + * depends on this feature (and MAP_LOCKED) to work correctly. + */ + ret = do_mm_populate(mm, spa->va_start, populate, 0); + if (ac && (ac->sp_flags & SP_HUGEPAGE) && unlikely(ret == -EFAULT)) + ret = -ENOMEM; + if (ret) { + if (unlikely(fatal_signal_pending(current))) + pr_warn("allocation failed, current thread is killed\n"); + else + pr_warn("allocation failed due to mm populate failed(potential no enough memory when -12): %d\n", + ret); + } + + return ret; +} + +static int sp_k2u_populate(struct mm_struct *mm, struct sp_area *spa) +{ + return -EOPNOTSUPP; +} + +#define SP_SKIP_ERR 1 +/* + * The caller should increase the refcnt of the spa to prevent that we map + * a dead spa into a mm_struct. + */ +static int sp_map_spa_to_mm(struct mm_struct *mm, struct sp_area *spa, + unsigned long prot, struct sp_alloc_context *ac, + const char *str) +{ + int ret; + unsigned long mmap_addr; + unsigned long populate = 0; + + mmap_write_lock(mm); + if (unlikely(!mmget_not_zero(mm))) { + mmap_write_unlock(mm); + pr_warn("sp_map: target mm is exiting\n"); + return SP_SKIP_ERR; + } + + /* when success, mmap_addr == spa->va_start */ + mmap_addr = sp_mmap(mm, spa_file(spa), spa, &populate, prot); + if (IS_ERR_VALUE(mmap_addr)) { + mmap_write_unlock(mm); + mmput_async(mm); + pr_err("%s, sp mmap failed %ld\n", str, mmap_addr); + return (int)mmap_addr; + } + + if (spa->type == SPA_TYPE_ALLOC) { + mmap_write_unlock(mm); + ret = sp_alloc_populate(mm, spa, populate, ac); + if (ret) { + mmap_write_lock(mm); + do_munmap(mm, mmap_addr, spa_size(spa), NULL); + mmap_write_unlock(mm); + } + } else { + ret = sp_k2u_populate(mm, spa); + if (ret) { + do_munmap(mm, mmap_addr, spa_size(spa), NULL); + pr_info("k2u populate failed, %d\n", ret); + } + mmap_write_unlock(mm); + } + mmput_async(mm); + + return ret; +} + +static int sp_alloc_mmap_populate(struct sp_area *spa, struct sp_alloc_context *ac, + struct sp_group_node *spg_node) +{ + int ret = 0; + int mmap_ret = 0; + struct mm_struct *mm; + bool reach_current = false; + + mmap_ret = sp_map_spa_to_mm(current->mm, spa, spg_node->prot, ac, "sp_alloc"); + if (mmap_ret) { + /* Don't skip error for current process */ + mmap_ret = (mmap_ret == SP_SKIP_ERR) ? 
-EINVAL : mmap_ret; + goto fallocate; + } + + /* create mapping for each process in the group */ + list_for_each_entry(spg_node, &spa->spg->proc_head, proc_node) { + mm = spg_node->master->mm; + if (mm == current->mm) { + reach_current = true; + continue; + } + mmap_ret = sp_map_spa_to_mm(mm, spa, spg_node->prot, ac, "sp_alloc"); + if (mmap_ret) { + /* + * Goto fallback procedure upon ERR_VALUE, + * but skip the coredump situation, + * because we don't want one misbehaving process to affect others. + */ + if (mmap_ret != SP_SKIP_ERR) + goto unmap; + + continue; + } + ret = mmap_ret; + } + + return ret; + +unmap: + __sp_free(spa, mm); + if (!reach_current) + sp_munmap(current->mm, spa->va_start, spa_size(spa)); +fallocate: + /* + * Sometimes do_mm_populate() allocates some memory and then failed to + * allocate more. (e.g. memory use reaches cgroup limit.) + * In this case, it will return enomem, but will not free the + * memory which has already been allocated. + * + * So if sp_map_spa_to_mm fails, always call sp_fallocate() + * to make sure backup physical memory of the shared file is freed. + */ + sp_fallocate(spa); + + return mmap_ret; +} + +void *__mg_sp_alloc_nodemask(unsigned long size, unsigned long sp_flags, int spg_id, + nodemask_t *nodemask) +{ + int ret = 0; + struct sp_area *spa; + struct sp_group *spg; + nodemask_t __nodemask; + struct sp_alloc_context ac; + struct sp_group_node *spg_node; + + ret = sp_alloc_prepare(size, sp_flags, spg_id, &ac); + if (ret) + return ERR_PTR(ret); + + if (!nodemask) { /* mg_sp_alloc */ + nodes_clear(__nodemask); + node_set(ac.preferred_node_id, __nodemask); + ac.nodemask = &__nodemask; + } else /* mg_sp_alloc_nodemask */ + ac.nodemask = nodemask; + + spg = sp_group_get_from_mm(current->mm, spg_id, &spg_node); + if (!spg) { + pr_err("allocation failed, can't find group(%d)\n", spg_id); + return ERR_PTR(-ENODEV); + } + + down_write(&spg->rw_lock); +try_again: + spa = sp_area_alloc(ac.size_aligned, ac.sp_flags, spg, + ac.type, current->tgid, ac.preferred_node_id); + if (IS_ERR(spa)) { + up_write(&spg->rw_lock); + pr_err("alloc spa failed in allocation(potential no enough virtual memory when -75): %ld\n", + PTR_ERR(spa)); + ret = PTR_ERR(spa); + goto out; + } + + ret = sp_alloc_mmap_populate(spa, &ac, spg_node); + if (ret == -ENOMEM && sp_alloc_fallback(spa, &ac)) + goto try_again; + + if (!ret) + update_mem_usage(spa_size(spa), true, spa->is_hugepage, spg_node, spa->type); + + sp_area_put_locked(spa); + up_write(&spg->rw_lock); + +out: + sp_group_put(spg); + if (ret) + return ERR_PTR(ret); + else + return (void *)(spa->va_start); +} + void *mg_sp_alloc_nodemask(unsigned long size, unsigned long sp_flags, int spg_id, nodemask_t nodemask) { - return ERR_PTR(-EOPNOTSUPP); + return __mg_sp_alloc_nodemask(size, sp_flags, spg_id, &nodemask); } EXPORT_SYMBOL_GPL(mg_sp_alloc_nodemask);
@@ -1281,7 +1636,7 @@ EXPORT_SYMBOL_GPL(mg_sp_alloc_nodemask); */ void *mg_sp_alloc(unsigned long size, unsigned long sp_flags, int spg_id) { - return ERR_PTR(-EOPNOTSUPP); + return __mg_sp_alloc_nodemask(size, sp_flags, spg_id, NULL); } EXPORT_SYMBOL_GPL(mg_sp_alloc);
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This shares a kernel vmalloc memory range with userspace via remap_pfn_range()/hugetlb_insert_hugepage_pte().
The hugepages remapped to userspace are special mappings, so there is no need to clear the rmap or manage the dirty bit in __unmap_hugepage_range().
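A usage sketch (not part of the patch): export a vmalloc buffer to the caller's local group. The demo_* name and the zero sp_flags are illustrative; SPG_ID_DEFAULT shares the buffer with the current process only.

#include <linux/err.h>
#include <linux/sched.h>
#include <linux/share_pool.h>
#include <linux/vmalloc.h>

static void *demo_export_kernel_buf(unsigned long size)
{
	void *uva;
	void *kva = vmalloc(size);

	if (!kva)
		return ERR_PTR(-ENOMEM);

	uva = mg_sp_make_share_k2u((unsigned long)kva, size, 0,
				   current->tgid, SPG_ID_DEFAULT);
	if (IS_ERR(uva))
		vfree(kva);

	return uva;
}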
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/hugetlb.c | 8 +++ mm/share_pool.c | 177 ++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 179 insertions(+), 6 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a148584422d9..da2e3542f414 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -35,6 +35,7 @@ #include <linux/delayacct.h> #include <linux/memory.h> #include <linux/mm_inline.h> +#include <linux/share_pool.h>
#include <asm/page.h> #include <asm/pgalloc.h> @@ -5449,6 +5450,13 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
pte = huge_ptep_get_and_clear(mm, address, ptep); tlb_remove_huge_tlb_entry(h, tlb, ptep, address); + + /* sharepool k2u mapped pages are marked special */ + if (sp_check_vm_share_pool(vma->vm_flags) && pte_special(pte)) { + spin_unlock(ptl); + continue; + } + if (huge_pte_dirty(pte)) set_page_dirty(page); /* Leave a uffd-wp pte marker if needed */ diff --git a/mm/share_pool.c b/mm/share_pool.c index 50343077d0b1..6c1fd2f06d06 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -709,6 +709,15 @@ struct sp_alloc_context { enum spa_type type; };
+struct sp_k2u_context { + unsigned long kva; + unsigned long kva_aligned; + unsigned long size; + unsigned long size_aligned; + unsigned long sp_flags; + enum spa_type type; +}; + /* * Get the sp_group from the mm and return the associated sp_group_node. * The caller should promise the @mm would not be deleted from the @spg. @@ -1445,10 +1454,7 @@ static int sp_alloc_populate(struct mm_struct *mm, struct sp_area *spa, return ret; }
-static int sp_k2u_populate(struct mm_struct *mm, struct sp_area *spa) -{ - return -EOPNOTSUPP; -} +static int sp_k2u_populate(struct mm_struct *mm, struct sp_area *spa);
#define SP_SKIP_ERR 1 /* @@ -1558,7 +1564,7 @@ static int sp_alloc_mmap_populate(struct sp_area *spa, struct sp_alloc_context * return mmap_ret; }
-void *__mg_sp_alloc_nodemask(unsigned long size, unsigned long sp_flags, int spg_id, +static void *__mg_sp_alloc_nodemask(unsigned long size, unsigned long sp_flags, int spg_id, nodemask_t *nodemask) { int ret = 0; @@ -1670,6 +1676,115 @@ static int is_vmap_hugepage(unsigned long addr) return 0; }
+static unsigned long __sp_remap_get_pfn(unsigned long kva) +{ + unsigned long pfn = -EINVAL; + + /* sp_make_share_k2u only support vmalloc address */ + if (is_vmalloc_addr((void *)kva)) + pfn = vmalloc_to_pfn((void *)kva); + + return pfn; +} + +static int sp_k2u_populate(struct mm_struct *mm, struct sp_area *spa) +{ + int ret; + struct vm_area_struct *vma; + unsigned long kva = spa->kva; + unsigned long addr, buf, offset; + + /* This should not fail because we hold the mmap_lock sicne mmap */ + vma = find_vma(mm, spa->va_start); + if (is_vm_hugetlb_page(vma)) { + ret = remap_vmalloc_hugepage_range(vma, (void *)kva, 0); + if (ret) { + pr_debug("remap vmalloc hugepage failed, ret %d, kva is %lx\n", + ret, (unsigned long)kva); + return ret; + } + vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); + } else { + addr = kva; + offset = 0; + buf = spa->va_start; + do { + ret = remap_pfn_range(vma, buf, __sp_remap_get_pfn(addr), PAGE_SIZE, + __pgprot(vma->vm_page_prot.pgprot)); + if (ret) { + pr_err("remap_pfn_range failed %d\n", ret); + return ret; + } + offset += PAGE_SIZE; + buf += PAGE_SIZE; + addr += PAGE_SIZE; + } while (offset < spa_size(spa)); + } + + return 0; +} + +static int sp_k2u_prepare(unsigned long kva, unsigned long size, + unsigned long sp_flags, int spg_id, struct sp_k2u_context *kc) +{ + int is_hugepage, ret; + unsigned int page_size = PAGE_SIZE; + unsigned long kva_aligned, size_aligned; + + check_interrupt_context(); + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + if (!size) { + pr_err_ratelimited("k2u input size is 0.\n"); + return -EINVAL; + } + + if (sp_flags & ~SP_FLAG_MASK) { + pr_err_ratelimited("k2u sp_flags %lx error\n", sp_flags); + return -EINVAL; + } + sp_flags &= ~SP_HUGEPAGE; + + if (!current->mm) { + pr_err_ratelimited("k2u: kthread is not allowed\n"); + return -EPERM; + } + + is_hugepage = is_vmap_hugepage(kva); + if (is_hugepage > 0) { + sp_flags |= SP_HUGEPAGE; + page_size = PMD_SIZE; + } else if (is_hugepage == 0) { + /* do nothing */ + } else { + pr_err_ratelimited("k2u kva is not vmalloc address\n"); + return is_hugepage; + } + + if (spg_id == SPG_ID_DEFAULT) { + ret = sp_init_group_master(current, current->mm); + if (ret) { + pr_err("k2u_task init local mapping failed %d\n", ret); + return ret; + } + } + + /* aligned down kva is convenient for caller to start with any valid kva */ + kva_aligned = ALIGN_DOWN(kva, page_size); + size_aligned = ALIGN(kva + size, page_size) - kva_aligned; + + kc->kva = kva; + kc->kva_aligned = kva_aligned; + kc->size = size; + kc->size_aligned = size_aligned; + kc->sp_flags = sp_flags; + kc->type = (spg_id == SPG_ID_DEFAULT) ? SPA_TYPE_K2TASK : SPA_TYPE_K2SPG; + + return 0; +} + /** * mg_sp_make_share_k2u() - Share kernel memory to current process or an sp_group. * @kva: the VA of shared kernel memory. @@ -1690,7 +1805,57 @@ static int is_vmap_hugepage(unsigned long addr) void *mg_sp_make_share_k2u(unsigned long kva, unsigned long size, unsigned long sp_flags, int tgid, int spg_id) { - return ERR_PTR(-EOPNOTSUPP); + int mmap_ret, ret; + struct sp_area *spa; + struct sp_group *spg; + struct sp_k2u_context kc; + struct sp_group_node *spg_node, *ori_node; + + spg_id = (spg_id == SPG_ID_NONE) ? 
SPG_ID_DEFAULT : spg_id; + ret = sp_k2u_prepare(kva, size, sp_flags, spg_id, &kc); + if (ret) + return ERR_PTR(ret); + + spg = sp_group_get_from_mm(current->mm, spg_id, &ori_node); + if (!spg) { + pr_err("k2u failed, can't find group(%d)\n", spg_id); + return ERR_PTR(-ENODEV); + } + + down_write(&spg->rw_lock); + spa = sp_area_alloc(kc.size_aligned, kc.sp_flags, spg, kc.type, current->tgid, 0); + if (IS_ERR(spa)) { + up_write(&spg->rw_lock); + pr_err("alloc spa failed in k2u_spg (potential no enough virtual memory when -75): %ld\n", + PTR_ERR(spa)); + sp_group_put(spg); + return spa; + } + + ret = -EINVAL; + spa->kva = kc.kva_aligned; + list_for_each_entry(spg_node, &spg->proc_head, proc_node) { + struct mm_struct *mm = spg_node->master->mm; + + mmap_ret = sp_map_spa_to_mm(mm, spa, spg_node->prot, NULL, "k2u"); + if (mmap_ret) { + if (mmap_ret == SP_SKIP_ERR) + continue; + pr_err("remap k2u to spg failed %d\n", mmap_ret); + __sp_free(spa, mm); + ret = mmap_ret; + break; + } + ret = mmap_ret; + } + + if (!ret) + update_mem_usage(spa_size(spa), true, spa->is_hugepage, ori_node, spa->type); + sp_area_put_locked(spa); + up_write(&spg->rw_lock); + sp_group_put(spg); + + return ret ? ERR_PTR(ret) : (void *)(spa->va_start + (kc.kva - kc.kva_aligned)); } EXPORT_SYMBOL_GPL(mg_sp_make_share_k2u);
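For illustration, here is a minimal kernel-side sketch of how a caller might use this interface. The helper name, buffer handling and choice of flags are assumptions for the example rather than part of the patch, and the tgid argument is simply passed as current->tgid.

#include <linux/err.h>
#include <linux/sched.h>
#include <linux/share_pool.h>
#include <linux/vmalloc.h>

/* Hypothetical helper: share a vmalloc buffer with the calling process. */
static void *demo_share_kbuf(size_t len)
{
	void *kva, *uva;

	/* k2u only accepts vmalloc addresses, see sp_k2u_prepare() */
	kva = vmalloc(len);
	if (!kva)
		return ERR_PTR(-ENOMEM);

	/* SPG_ID_NONE collapses to SPG_ID_DEFAULT, i.e. the task-local case */
	uva = mg_sp_make_share_k2u((unsigned long)kva, len, 0,
				   current->tgid, SPG_ID_NONE);
	if (IS_ERR(uva))
		vfree(kva);

	return uva;	/* user VA the current process can access */
}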
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
Add tasks to a share pool group. The shared memory regions allocated earlier are mapped into the new joiner as well.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 452 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 450 insertions(+), 2 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index 6c1fd2f06d06..c10c6091115c 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -479,11 +479,34 @@ static int sp_group_setup_mapping_normal(struct mm_struct *mm, struct sp_group * return 0; }
+static int sp_group_setup_mapping_local(struct mm_struct *mm, struct sp_group *local) +{ + struct sp_mapping *spm; + + spm = sp_mapping_create(SP_MAPPING_DVPP); + if (!spm) + return -ENOMEM; + + sp_mapping_attach(local, spm); + sp_mapping_attach(local, sp_mapping_normal); + sp_mapping_attach(local, sp_mapping_ro); + + return 0; +} + static inline bool is_local_group(int spg_id) { return spg_id >= SPG_ID_LOCAL_MIN && spg_id <= SPG_ID_LOCAL_MAX; }
+static int sp_group_setup_mapping(struct mm_struct *mm, struct sp_group *spg) +{ + if (is_local_group(spg->id)) + return sp_group_setup_mapping_local(mm, spg); + else + return sp_group_setup_mapping_normal(mm, spg); +} + static int sp_init_group_master(struct task_struct *tsk, struct mm_struct *mm) { return -EOPNOTSUPP; @@ -709,6 +732,10 @@ struct sp_alloc_context { enum spa_type type; };
+static int sp_map_spa_to_mm(struct mm_struct *mm, struct sp_area *spa, + unsigned long prot, struct sp_alloc_context *ac, + const char *str); + struct sp_k2u_context { unsigned long kva; unsigned long kva_aligned; @@ -718,6 +745,82 @@ struct sp_k2u_context { enum spa_type type; };
+static void free_sp_group_locked(struct sp_group *spg) +{ + int type; + + fput(spg->file); + fput(spg->file_hugetlb); + idr_remove(&sp_group_idr, spg->id); + + for (type = SP_MAPPING_START; type < SP_MAPPING_END; type++) + sp_mapping_detach(spg, spg->mapping[type]); + + if (!is_local_group(spg->id)) + system_group_count--; + + kfree(spg); + WARN(system_group_count < 0, "unexpected group count\n"); +} + +static void free_sp_group(struct sp_group *spg) +{ + down_write(&sp_global_sem); + free_sp_group_locked(spg); + up_write(&sp_global_sem); +} + +static void sp_group_put_locked(struct sp_group *spg) +{ + lockdep_assert_held_write(&sp_global_sem); + + if (atomic_dec_and_test(&spg->use_count)) + free_sp_group_locked(spg); +} + +static void sp_group_put(struct sp_group *spg) +{ + if (atomic_dec_and_test(&spg->use_count)) + free_sp_group(spg); +} + +/* use with put_task_struct(task) */ +static int get_task(int tgid, struct task_struct **task) +{ + struct task_struct *tsk; + struct pid *p; + + rcu_read_lock(); + p = find_pid_ns(tgid, &init_pid_ns); + tsk = pid_task(p, PIDTYPE_TGID); + if (!tsk || (tsk->flags & PF_EXITING)) { + rcu_read_unlock(); + return -ESRCH; + } + get_task_struct(tsk); + rcu_read_unlock(); + + *task = tsk; + return 0; +} + +/* + * the caller must: + * 1. hold spg->rw_lock + * 2. ensure no concurrency problem for mm_struct + */ +static bool is_process_in_group(struct sp_group *spg, + struct mm_struct *mm) +{ + struct sp_group_node *spg_node; + + list_for_each_entry(spg_node, &spg->proc_head, proc_node) + if (spg_node->master->mm == mm) + return true; + + return false; +} + /* * Get the sp_group from the mm and return the associated sp_group_node. * The caller should promise the @mm would not be deleted from the @spg. @@ -782,6 +885,270 @@ static bool is_online_node_id(int node_id) return node_id >= 0 && node_id < MAX_NUMNODES && node_online(node_id); }
+static void sp_group_init(struct sp_group *spg, int spg_id) +{ + spg->id = spg_id; + spg->proc_num = 0; + spg->spa_root = RB_ROOT; + atomic_set(&spg->use_count, 1); + atomic_set(&spg->spa_num, 0); + INIT_LIST_HEAD(&spg->proc_head); + INIT_LIST_HEAD(&spg->mnode); + init_rwsem(&spg->rw_lock); + meminfo_init(&spg->meminfo); +} + +/* + * sp_group_create - create a new sp_group + * @spg_id: specify the id for the new sp_group + * + * valid @spg_id: + * SPG_ID_AUTO: + * Allocate a id in range [SPG_ID_AUTO_MIN, APG_ID_AUTO_MAX] + * SPG_ID_LOCAL: + * Allocate a id in range [SPG_ID_LOCAL_MIN, APG_ID_LOCAL_MAX] + * [SPG_ID_MIN, SPG_ID_MAX]: + * Using the input @spg_id for the new sp_group. + * + * Return: the newly created sp_group or an errno. + * Context: The caller should protect sp_group_idr from being access. + */ +static struct sp_group *sp_group_create(int spg_id) +{ + int ret, start, end; + struct sp_group *spg; + char name[DNAME_INLINE_LEN]; + int hsize_log = MAP_HUGE_2MB >> MAP_HUGE_SHIFT; + + if (unlikely(system_group_count + 1 == MAX_GROUP_FOR_SYSTEM && + spg_id != SPG_ID_LOCAL)) { + pr_err("reach system max group num\n"); + return ERR_PTR(-ENOSPC); + } + + if (spg_id == SPG_ID_LOCAL) { + start = SPG_ID_LOCAL_MIN; + end = SPG_ID_LOCAL_MAX + 1; + } else if (spg_id == SPG_ID_AUTO) { + start = SPG_ID_AUTO_MIN; + end = SPG_ID_AUTO_MAX + 1; + } else if (spg_id >= SPG_ID_MIN && spg_id <= SPG_ID_MAX) { + start = spg_id; + end = spg_id + 1; + } else { + pr_err("invalid input spg_id:%d\n", spg_id); + return ERR_PTR(-EINVAL); + } + + spg = kzalloc(sizeof(*spg), GFP_KERNEL); + if (spg == NULL) + return ERR_PTR(-ENOMEM); + + ret = idr_alloc(&sp_group_idr, spg, start, end, GFP_KERNEL); + if (ret < 0) { + pr_err("group %d idr alloc failed %d\n", spg_id, ret); + goto out_kfree; + } + spg_id = ret; + + sprintf(name, "sp_group_%d", spg_id); + spg->file = shmem_kernel_file_setup(name, MAX_LFS_FILESIZE, VM_NORESERVE); + if (IS_ERR(spg->file)) { + pr_err("spg file setup failed %ld\n", PTR_ERR(spg->file)); + ret = PTR_ERR(spg->file); + goto out_idr_remove; + } + + sprintf(name, "sp_group_%d_huge", spg_id); + spg->file_hugetlb = hugetlb_file_setup(name, MAX_LFS_FILESIZE, + VM_NORESERVE, HUGETLB_ANONHUGE_INODE, hsize_log); + if (IS_ERR(spg->file_hugetlb)) { + pr_err("spg file_hugetlb setup failed %ld\n", PTR_ERR(spg->file_hugetlb)); + ret = PTR_ERR(spg->file_hugetlb); + goto out_fput; + } + + sp_group_init(spg, spg_id); + + if (!is_local_group(spg_id)) + system_group_count++; + + return spg; + +out_fput: + fput(spg->file); +out_idr_remove: + idr_remove(&sp_group_idr, spg_id); +out_kfree: + kfree(spg); + return ERR_PTR(ret); +} + +/* the caller must hold sp_global_sem */ +static struct sp_group *sp_group_get_or_alloc(int spg_id) +{ + struct sp_group *spg; + + spg = idr_find(&sp_group_idr, spg_id); + if (!spg || !atomic_inc_not_zero(&spg->use_count)) + spg = sp_group_create(spg_id); + + return spg; +} + +/* the caller must hold sp_global_sem */ +static struct sp_group_node *spg_node_alloc(struct mm_struct *mm, + unsigned long prot, struct sp_group *spg) +{ + struct sp_group_master *master = mm->sp_group_master; + struct sp_group_node *spg_node; + + spg_node = kzalloc(sizeof(struct sp_group_node), GFP_KERNEL); + if (!spg_node) + return NULL; + + INIT_LIST_HEAD(&spg_node->group_node); + INIT_LIST_HEAD(&spg_node->proc_node); + spg_node->spg = spg; + spg_node->master = master; + spg_node->prot = prot; + meminfo_init(&spg_node->meminfo); + + return spg_node; +} + +/* + * sp_group_link_task - Actually add a task into 
a group + * @mm: specify the input task + * @spg: the sp_group + * @prot: read/write protection for the task in the group + * + * The input @mm and @spg must have been initialized properly and could not + * be freed during the sp_group_link_task(). + * the caller must hold sp_global_sem. + */ +static int sp_group_link_task(struct mm_struct *mm, struct sp_group *spg, + unsigned long prot, struct sp_group_node **pnode) +{ + int ret; + struct sp_group_node *node; + struct sp_group_master *master = mm->sp_group_master; + + if (master->group_num == MAX_GROUP_FOR_TASK) { + pr_err("task reaches max group num\n"); + return -ENOSPC; + } + + if (is_process_in_group(spg, mm)) { + pr_err("task already in target group(%d)\n", spg->id); + return -EEXIST; + } + + if (spg->proc_num + 1 == MAX_PROC_PER_GROUP) { + pr_err("add group: group(%d) reaches max process num\n", spg->id); + return -ENOSPC; + } + + node = spg_node_alloc(mm, prot, spg); + if (!node) + return -ENOMEM; + + ret = sp_group_setup_mapping(mm, spg); + if (ret) + goto out_kfree; + + /* + * We pin only the mm_struct instead of the memory space of the target mm. + * So we must ensure the existence of the memory space via mmget_not_zero + * before we would access it. + */ + mmgrab(mm); + master->group_num++; + list_add_tail(&node->group_node, &master->group_head); + atomic_inc(&spg->use_count); + spg->proc_num++; + list_add_tail(&node->proc_node, &spg->proc_head); + if (pnode) + *pnode = node; + + return 0; + +out_kfree: + kfree(node); + + return ret; +} + +static void sp_group_unlink_task(struct sp_group_node *spg_node) +{ + struct sp_group *spg = spg_node->spg; + struct sp_group_master *master = spg_node->master; + + list_del(&spg_node->proc_node); + spg->proc_num--; + list_del(&spg_node->group_node); + master->group_num--; + + mmdrop(master->mm); + sp_group_put_locked(spg); + kfree(spg_node); +} + +/* + * Find and initialize the mm of the task specified by @tgid. + * We increace the usercount for the mm on success. + */ +static int mm_add_group_init(pid_t tgid, struct mm_struct **pmm) +{ + int ret; + struct mm_struct *mm; + struct task_struct *tsk; + + ret = get_task(tgid, &tsk); + if (ret) + return ret; + + /* + * group_leader: current thread may be exiting in a multithread process + * + * DESIGN IDEA + * We increase mm->mm_users deliberately to ensure it's decreased in + * share pool under only 2 circumstances, which will simply the overall + * design as mm won't be freed unexpectedly. + * + * The corresponding refcount decrements are as follows: + * 1. the error handling branch of THIS function. + * 2. In sp_group_exit(). It's called only when process is exiting. + */ + mm = get_task_mm(tsk->group_leader); + if (!mm) { + ret = -ESRCH; + goto out_put_task; + } + + ret = sp_init_group_master(tsk, mm); + if (ret) + goto out_put_mm; + + if (mm->sp_group_master && mm->sp_group_master->tgid != tgid) { + pr_err("add: task(%d) is a vfork child of the original task(%d)\n", + tgid, mm->sp_group_master->tgid); + ret = -EINVAL; + goto out_put_mm; + } + *pmm = mm; + +out_put_mm: + if (ret) + mmput(mm); +out_put_task: + put_task_struct(tsk); + + return ret; +} + +static void sp_area_put_locked(struct sp_area *spa); +static void sp_munmap(struct mm_struct *mm, unsigned long addr, unsigned long size); /** * mg_sp_group_add_task() - Add a process to an share group (sp_group). * @tgid: the tgid of the task to be added. 
@@ -810,7 +1177,89 @@ static bool is_online_node_id(int node_id) */ int mg_sp_group_add_task(int tgid, unsigned long prot, int spg_id) { - return -EOPNOTSUPP; + int ret = 0; + struct sp_area *spa; + struct mm_struct *mm; + struct sp_group *spg; + struct rb_node *p, *n; + struct sp_group_node *spg_node; + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + check_interrupt_context(); + + /* only allow READ, READ | WRITE */ + if (!((prot == PROT_READ) || (prot == (PROT_READ | PROT_WRITE)))) { + pr_err_ratelimited("prot is invalid 0x%lx\n", prot); + return -EINVAL; + } + + if (spg_id < SPG_ID_MIN || spg_id > SPG_ID_AUTO) { + pr_err_ratelimited("add group failed, invalid group id %d\n", spg_id); + return -EINVAL; + } + + ret = mm_add_group_init(tgid, &mm); + if (ret < 0) + return ret; + + down_write(&sp_global_sem); + spg = sp_group_get_or_alloc(spg_id); + if (IS_ERR(spg)) { + ret = PTR_ERR(spg); + goto out_unlock; + } + /* save spg_id before we release sp_global_sem, or UAF may occur */ + spg_id = spg->id; + + down_write(&spg->rw_lock); + ret = sp_group_link_task(mm, spg, prot, &spg_node); + if (ret < 0) + goto put_spg; + + /* + * create mappings of existing shared memory segments into this + * new process' page table. + */ + for (p = rb_first(&spg->spa_root); p; p = n) { + n = rb_next(p); + spa = container_of(p, struct sp_area, spg_link); + + if (!atomic_inc_not_zero(&spa->use_count)) { + pr_warn("be careful, add new task(%d) to an exiting group(%d)\n", + tgid, spg_id); + continue; + } + + ret = sp_map_spa_to_mm(mm, spa, prot, NULL, "add_task"); + sp_area_put_locked(spa); + if (ret) { + pr_warn("mmap old spa to new task failed, %d\n", ret); + /* it makes no scene to skip error for coredump here */ + ret = ret < 0 ? ret : -EFAULT; + + for (p = rb_prev(p); p; p = n) { + n = rb_prev(p); + spa = container_of(p, struct sp_area, spg_link); + if (!atomic_inc_not_zero(&spa->use_count)) + continue; + sp_munmap(mm, spa->va_start, spa_size(spa)); + sp_area_put_locked(spa); + } + sp_group_unlink_task(spg_node); + break; + } + } +put_spg: + up_write(&spg->rw_lock); + sp_group_put_locked(spg); +out_unlock: + up_write(&sp_global_sem); + /* We put the mm_struct later to protect the mm from exiting while sp_mmap */ + mmput(mm); + + return ret < 0 ? ret : spg_id; } EXPORT_SYMBOL_GPL(mg_sp_group_add_task);
@@ -1100,7 +1549,6 @@ static void sp_area_put_locked(struct sp_area *spa) } }
-static void sp_group_put(struct sp_group *spg) {} static void sp_area_drop_func(struct work_struct *work) { bool spa_zero;
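As a usage sketch (assumptions: the two tgids come from the caller, and SPG_ID_AUTO allocates a fresh group id as sp_group_create() above suggests), a driver could group two cooperating processes like this:

#include <linux/mman.h>
#include <linux/share_pool.h>

/* Hypothetical helper: put two tasks into one share pool group. */
static int demo_setup_group(int tgid_a, int tgid_b)
{
	int spg_id;

	/* returns the allocated group id on success */
	spg_id = mg_sp_group_add_task(tgid_a, PROT_READ | PROT_WRITE, SPG_ID_AUTO);
	if (spg_id < 0)
		return spg_id;

	/* the peer joins the same group and gets the old mappings too */
	return mg_sp_group_add_task(tgid_b, PROT_READ | PROT_WRITE, spg_id);
}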
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This is used to find the groups a task has joined. Also add spg management code.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 55 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index c10c6091115c..1465c1c0a35a 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -876,7 +876,61 @@ static struct sp_group *sp_group_get_from_mm(struct mm_struct *mm, int spg_id, */ int mg_sp_group_id_by_pid(int tgid, int *spg_ids, int *num) { - return -EOPNOTSUPP; + int ret = 0, real_count; + struct sp_group_node *node; + struct sp_group_master *master = NULL; + struct task_struct *tsk; + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + check_interrupt_context(); + + if (!spg_ids || !num || *num <= 0) + return -EINVAL; + + ret = get_task(tgid, &tsk); + if (ret) + return ret; + + down_read(&sp_global_sem); + task_lock(tsk); + if (tsk->mm) + master = tsk->mm->sp_group_master; + task_unlock(tsk); + + if (!master) { + ret = -ENODEV; + goto out_up_read; + } + + /* + * There is a local group for each process which is used for + * passthrough allocation. The local group is an internal + * implementation for convenience and is not meant to bother + * the user. + */ + real_count = master->group_num - 1; + if (real_count <= 0) { + ret = -ENODEV; + goto out_up_read; + } + if ((unsigned int)*num < real_count) { + ret = -E2BIG; + goto out_up_read; + } + *num = real_count; + + list_for_each_entry(node, &master->group_head, group_node) { + if (is_local_group(node->spg->id)) + continue; + *(spg_ids++) = node->spg->id; + } + +out_up_read: + up_read(&sp_global_sem); + put_task_struct(tsk); + return ret; } EXPORT_SYMBOL_GPL(mg_sp_group_id_by_pid);
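A hedged sketch of how a caller might query the result; the array size is an arbitrary choice for the example.

#include <linux/printk.h>
#include <linux/share_pool.h>

#define DEMO_MAX_GROUPS 16	/* arbitrary bound for this example */

/* Hypothetical helper: print every non-local group a task has joined. */
static void demo_show_groups(int tgid)
{
	int ids[DEMO_MAX_GROUPS];
	int num = DEMO_MAX_GROUPS;
	int i, ret;

	ret = mg_sp_group_id_by_pid(tgid, ids, &num);
	if (ret) {
		pr_info("task %d: no share pool group (%d)\n", tgid, ret);
		return;
	}

	for (i = 0; i < num; i++)
		pr_info("task %d is in group %d\n", tgid, ids[i]);
}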
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This is used to get the local group id of the current process. When the process is exiting, the caller of sp_free/sp_unshare must supply a real spg_id instead of zero, because current->mm is NULL at that point and the spg cannot be reached via the mm_struct of current.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 102 insertions(+), 2 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index 1465c1c0a35a..37d1fef98ec4 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -507,9 +507,81 @@ static int sp_group_setup_mapping(struct mm_struct *mm, struct sp_group *spg) return sp_group_setup_mapping_normal(mm, spg); }
+static struct sp_group *sp_group_create(int spg_id); +static void sp_group_put_locked(struct sp_group *spg); +static int sp_group_link_task(struct mm_struct *mm, struct sp_group *spg, + unsigned long prot, struct sp_group_node **pnode); +static int init_local_group(struct mm_struct *mm) +{ + int ret; + struct sp_group *spg; + struct sp_group_master *master = mm->sp_group_master; + + spg = sp_group_create(SPG_ID_LOCAL); + if (IS_ERR(spg)) + return PTR_ERR(spg); + + ret = sp_group_link_task(mm, spg, PROT_READ | PROT_WRITE, NULL); + sp_group_put_locked(spg); + if (!ret) + master->local = spg; + + return ret; +} + +/* The caller must hold sp_global_sem and the input @mm cannot be freed */ +static int sp_init_group_master_locked(struct task_struct *tsk, struct mm_struct *mm) +{ + int ret; + struct sp_group_master *master; + + if (mm->sp_group_master) + return 0; + + master = kmalloc(sizeof(struct sp_group_master), GFP_KERNEL); + if (!master) + return -ENOMEM; + + INIT_LIST_HEAD(&master->group_head); + master->group_num = 0; + master->mm = mm; + master->tgid = tsk->tgid; + get_task_comm(master->comm, current); + meminfo_init(&master->meminfo); + mm->sp_group_master = master; + sp_add_group_master(master); + + ret = init_local_group(mm); + if (ret) + goto free_master; + + return 0; + +free_master: + sp_del_group_master(master); + mm->sp_group_master = NULL; + kfree(master); + + return ret; +} + static int sp_init_group_master(struct task_struct *tsk, struct mm_struct *mm) { - return -EOPNOTSUPP; + int ret; + + down_read(&sp_global_sem); + /* The sp_group_master would never change once set */ + if (mm->sp_group_master) { + up_read(&sp_global_sem); + return 0; + } + up_read(&sp_global_sem); + + down_write(&sp_global_sem); + ret = sp_init_group_master_locked(tsk, mm); + up_write(&sp_global_sem); + + return ret; }
static void update_mem_usage_alloc(unsigned long size, bool inc, @@ -1319,7 +1391,35 @@ EXPORT_SYMBOL_GPL(mg_sp_group_add_task);
int mg_sp_id_of_current(void) { - return -EOPNOTSUPP; + int ret, spg_id; + struct sp_group_master *master; + + if (!sp_is_enabled()) + return -EOPNOTSUPP; + + if ((current->flags & PF_KTHREAD) || !current->mm) + return -EINVAL; + + down_read(&sp_global_sem); + master = current->mm->sp_group_master; + if (master) { + spg_id = master->local->id; + up_read(&sp_global_sem); + return spg_id; + } + up_read(&sp_global_sem); + + down_write(&sp_global_sem); + ret = sp_init_group_master_locked(current, current->mm); + if (ret) { + up_write(&sp_global_sem); + return ret; + } + master = current->mm->sp_group_master; + spg_id = master->local->id; + up_write(&sp_global_sem); + + return spg_id; } EXPORT_SYMBOL_GPL(mg_sp_id_of_current);
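A sketch of the intended usage pattern described above (the helper and the cached field are hypothetical): the local id is fetched while current->mm is still valid and kept around so that later free/unshare calls can name the group explicitly instead of passing zero.

#include <linux/share_pool.h>

/* Hypothetical helper: cache the local group id for later cleanup paths. */
static int demo_cache_local_id(int *cached_spg_id)
{
	int spg_id = mg_sp_id_of_current();

	if (spg_id < 0)
		return spg_id;

	/* later passed to e.g. mg_sp_free(addr, *cached_spg_id) */
	*cached_spg_id = spg_id;
	return 0;
}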
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
This is used to configure the virtual address range for SP_DVPP flagged allocations. It is meant for special hardware that can only access a limited address range.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 76 insertions(+), 1 deletion(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index 37d1fef98ec4..a0b818fa4355 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -584,6 +584,33 @@ static int sp_init_group_master(struct task_struct *tsk, struct mm_struct *mm) return ret; }
+static struct sp_group *sp_get_local_group(struct task_struct *tsk, struct mm_struct *mm) +{ + int ret; + struct sp_group_master *master; + + down_read(&sp_global_sem); + master = mm->sp_group_master; + if (master && master->local) { + atomic_inc(&master->local->use_count); + up_read(&sp_global_sem); + return master->local; + } + up_read(&sp_global_sem); + + down_write(&sp_global_sem); + ret = sp_init_group_master_locked(tsk, mm); + if (ret) { + up_write(&sp_global_sem); + return ERR_PTR(ret); + } + master = mm->sp_group_master; + atomic_inc(&master->local->use_count); + up_write(&sp_global_sem); + + return master->local; +} + static void update_mem_usage_alloc(unsigned long size, bool inc, bool is_hugepage, struct sp_group_node *spg_node) { @@ -2916,6 +2943,7 @@ void mg_sp_walk_page_free(struct sp_walk_data *sp_walk_data) } EXPORT_SYMBOL_GPL(mg_sp_walk_page_free);
+static bool is_sp_dynamic_dvpp_addr(unsigned long addr); /** * mg_sp_config_dvpp_range() - User can config the share pool start address * of each Da-vinci device. @@ -2930,7 +2958,54 @@ EXPORT_SYMBOL_GPL(mg_sp_walk_page_free); */ bool mg_sp_config_dvpp_range(size_t start, size_t size, int device_id, int tgid) { - return false; + int ret; + bool err = false; + struct task_struct *tsk; + struct mm_struct *mm; + struct sp_group *spg; + struct sp_mapping *spm; + unsigned long default_start; + + if (!sp_is_enabled()) + return false; + + /* NOTE: check the start address */ + if (tgid < 0 || size <= 0 || size > MMAP_SHARE_POOL_16G_SIZE || + device_id < 0 || device_id >= MAX_DEVID || !is_online_node_id(device_id) + || !is_sp_dynamic_dvpp_addr(start) || !is_sp_dynamic_dvpp_addr(start + size - 1)) + return false; + + ret = get_task(tgid, &tsk); + if (ret) + return false; + + mm = get_task_mm(tsk->group_leader); + if (!mm) + goto put_task; + + spg = sp_get_local_group(tsk, mm); + if (IS_ERR(spg)) + goto put_mm; + + spm = spg->mapping[SP_MAPPING_DVPP]; + default_start = MMAP_SHARE_POOL_DVPP_START + device_id * MMAP_SHARE_POOL_16G_SIZE; + /* The dvpp range of each group can be configured only once */ + if (spm->start[device_id] != default_start) + goto put_spg; + + spm->start[device_id] = start; + spm->end[device_id] = start + size; + + err = true; + +put_spg: + sp_group_put(spg); +put_mm: + mmput(mm); +put_task: + put_task_struct(tsk); + + return err; } EXPORT_SYMBOL_GPL(mg_sp_config_dvpp_range);
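A sketch of a possible call; the base address and the 4G window size are assumptions of the example. The start must fall inside the dynamic DVPP range checked above, and a group's range can only be configured once.

#include <linux/sizes.h>
#include <linux/share_pool.h>

/* Hypothetical helper: pin the SP_DVPP window of a task on device 0. */
static bool demo_limit_dvpp_window(int tgid, unsigned long base)
{
	/* returns false if the range was already configured or is invalid */
	return mg_sp_config_dvpp_range(base, SZ_4G, 0, tgid);
}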
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
The /proc/sharepool/* interfaces show all the share pool groups and, system-wide, the processes that belong to them.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- mm/share_pool.c | 321 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 315 insertions(+), 6 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c index a0b818fa4355..c9b75b3ca632 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -1912,12 +1912,6 @@ int mg_sp_free(unsigned long addr, int id) } EXPORT_SYMBOL_GPL(mg_sp_free);
-static void __init proc_sharepool_init(void) -{ - if (!proc_mkdir("sharepool", NULL)) - return; -} - /* wrapper of __do_mmap() and the caller must hold mmap_write_lock(mm). */ static unsigned long sp_mmap(struct mm_struct *mm, struct file *file, struct sp_area *spa, unsigned long *populate, @@ -3042,6 +3036,321 @@ bool mg_is_sharepool_addr(unsigned long addr) } EXPORT_SYMBOL_GPL(mg_is_sharepool_addr);
+/*** Statistical and maintenance functions ***/ + +static void get_mm_rss_info(struct mm_struct *mm, unsigned long *anon, + unsigned long *file, unsigned long *shmem, unsigned long *total_rss) +{ + *anon = get_mm_counter(mm, MM_ANONPAGES); + *file = get_mm_counter(mm, MM_FILEPAGES); + *shmem = get_mm_counter(mm, MM_SHMEMPAGES); + *total_rss = *anon + *file + *shmem; +} + +static void get_process_sp_res(struct sp_group_master *master, + long *sp_res_out, long *sp_res_nsize_out) +{ + struct sp_group *spg; + struct sp_group_node *spg_node; + + *sp_res_out = 0; + *sp_res_nsize_out = 0; + + list_for_each_entry(spg_node, &master->group_head, group_node) { + spg = spg_node->spg; + *sp_res_out += meminfo_alloc_sum_byKB(&spg->meminfo); + *sp_res_nsize_out += byte2kb(atomic64_read(&spg->meminfo.alloc_nsize)); + } +} + +/* + * Statistics of RSS has a maximum 64 pages deviation (256KB). + * Please check_sync_rss_stat(). + */ +static void get_process_non_sp_res(unsigned long total_rss, unsigned long shmem, + long sp_res_nsize, long *non_sp_res_out, long *non_sp_shm_out) +{ + long non_sp_res, non_sp_shm; + + non_sp_res = page2kb(total_rss) - sp_res_nsize; + non_sp_res = non_sp_res < 0 ? 0 : non_sp_res; + non_sp_shm = page2kb(shmem) - sp_res_nsize; + non_sp_shm = non_sp_shm < 0 ? 0 : non_sp_shm; + + *non_sp_res_out = non_sp_res; + *non_sp_shm_out = non_sp_shm; +} + +static void print_process_prot(struct seq_file *seq, unsigned long prot) +{ + if (prot == PROT_READ) + seq_puts(seq, "R"); + else if (prot == (PROT_READ | PROT_WRITE)) + seq_puts(seq, "RW"); + else + seq_puts(seq, "-"); +} + +static void spa_stat_of_mapping_show(struct seq_file *seq, struct sp_mapping *spm) +{ + struct rb_node *node; + struct sp_area *spa; + + spin_lock(&spm->sp_mapping_lock); + for (node = rb_first(&spm->area_root); node; node = rb_next(node)) { + spa = rb_entry(node, struct sp_area, rb_node); + + seq_printf(seq, "%-10d 0x%-14lx 0x%-14lx %-10ld ", + spa->spg->id, spa->va_start, spa->va_end, byte2kb(spa->real_size)); + + switch (spa->type) { + case SPA_TYPE_ALLOC: + seq_printf(seq, "%-7s ", "ALLOC"); + break; + case SPA_TYPE_K2TASK: + seq_printf(seq, "%-7s ", "TASK"); + break; + case SPA_TYPE_K2SPG: + seq_printf(seq, "%-7s ", "SPG"); + break; + default: + /* usually impossible, perhaps a developer's mistake */ + break; + } + + if (spa->is_hugepage) + seq_printf(seq, "%-5s ", "Y"); + else + seq_printf(seq, "%-5s ", "N"); + + seq_printf(seq, "%-8d ", spa->applier); + seq_printf(seq, "%-8d\n", atomic_read(&spa->use_count)); + + } + spin_unlock(&spm->sp_mapping_lock); +} + +static void spa_ro_stat_show(struct seq_file *seq) +{ + spa_stat_of_mapping_show(seq, sp_mapping_ro); +} + +static void spa_normal_stat_show(struct seq_file *seq) +{ + spa_stat_of_mapping_show(seq, sp_mapping_normal); +} + +static void spa_dvpp_stat_show(struct seq_file *seq) +{ + struct sp_mapping *spm; + + mutex_lock(&spm_list_lock); + list_for_each_entry(spm, &spm_dvpp_list, spm_node) + spa_stat_of_mapping_show(seq, spm); + mutex_unlock(&spm_list_lock); +} + + +static void spa_overview_show(struct seq_file *seq) +{ + s64 total_num, alloc_num, k2u_task_num, k2u_spg_num; + s64 total_size, alloc_size, k2u_task_size, k2u_spg_size; + s64 dvpp_size, dvpp_va_size; + + if (!sp_is_enabled()) + return; + + alloc_num = atomic64_read(&spa_stat.alloc_num); + k2u_task_num = atomic64_read(&spa_stat.k2u_task_num); + k2u_spg_num = atomic64_read(&spa_stat.k2u_spg_num); + alloc_size = atomic64_read(&spa_stat.alloc_size); + k2u_task_size = 
atomic64_read(&spa_stat.k2u_task_size); + k2u_spg_size = atomic64_read(&spa_stat.k2u_spg_size); + dvpp_size = atomic64_read(&spa_stat.dvpp_size); + dvpp_va_size = atomic64_read(&spa_stat.dvpp_va_size); + total_num = alloc_num + k2u_task_num + k2u_spg_num; + total_size = alloc_size + k2u_task_size + k2u_spg_size; + + SEQ_printf(seq, "Spa total num %lld.\n", total_num); + SEQ_printf(seq, "Spa alloc num %lld, k2u(task) num %lld, k2u(spg) num %lld.\n", + alloc_num, k2u_task_num, k2u_spg_num); + SEQ_printf(seq, "Spa total size: %13lld KB\n", byte2kb(total_size)); + SEQ_printf(seq, "Spa alloc size: %13lld KB\n", byte2kb(alloc_size)); + SEQ_printf(seq, "Spa k2u(task) size: %13lld KB\n", byte2kb(k2u_task_size)); + SEQ_printf(seq, "Spa k2u(spg) size: %13lld KB\n", byte2kb(k2u_spg_size)); + SEQ_printf(seq, "Spa dvpp size: %13lld KB\n", byte2kb(dvpp_size)); + SEQ_printf(seq, "Spa dvpp va size: %13lld MB\n", byte2mb(dvpp_va_size)); + SEQ_printf(seq, "\n"); +} + +static int spg_info_show(int id, void *p, void *data) +{ + struct sp_group *spg = p; + struct seq_file *seq = data; + + if (id >= SPG_ID_LOCAL_MIN && id <= SPG_ID_LOCAL_MAX) + return 0; + + SEQ_printf(seq, "Group %6d ", id); + + down_read(&spg->rw_lock); + SEQ_printf(seq, "size: %lld KB, spa num: %d, total alloc: %ld KB, ", + byte2kb(meminfo_total_size(&spg->meminfo)), + atomic_read(&spg->spa_num), + meminfo_alloc_sum_byKB(&spg->meminfo)); + SEQ_printf(seq, "normal alloc: %lld KB, huge alloc: %lld KB\n", + byte2kb(atomic64_read(&spg->meminfo.alloc_nsize)), + byte2kb(atomic64_read(&spg->meminfo.alloc_hsize))); + up_read(&spg->rw_lock); + + return 0; +} + +static void spg_overview_show(struct seq_file *seq) +{ + if (!sp_is_enabled()) + return; + + SEQ_printf(seq, "Share pool total size: %lld KB, spa total num: %d.\n", + byte2kb(atomic64_read(&sp_overall_stat.spa_total_size)), + atomic_read(&sp_overall_stat.spa_total_num)); + + down_read(&sp_global_sem); + idr_for_each(&sp_group_idr, spg_info_show, seq); + up_read(&sp_global_sem); + + SEQ_printf(seq, "\n"); +} + +static bool should_show_statistics(void) +{ + if (!capable(CAP_SYS_ADMIN)) + return false; + + if (task_active_pid_ns(current) != &init_pid_ns) + return false; + + return true; +} + +static int spa_stat_show(struct seq_file *seq, void *offset) +{ + if (!should_show_statistics()) + return -EPERM; + + spg_overview_show(seq); + spa_overview_show(seq); + /* print the file header */ + seq_printf(seq, "%-10s %-16s %-16s %-10s %-7s %-5s %-8s %-8s\n", + "Group ID", "va_start", "va_end", "Size(KB)", "Type", "Huge", "PID", "Ref"); + spa_ro_stat_show(seq); + spa_normal_stat_show(seq); + spa_dvpp_stat_show(seq); + return 0; +} + +static int proc_usage_by_group(int id, void *p, void *data) +{ + struct sp_group *spg = p; + struct seq_file *seq = data; + struct sp_group_node *spg_node; + struct mm_struct *mm; + struct sp_group_master *master; + int tgid; + unsigned long anon, file, shmem, total_rss; + + down_read(&spg->rw_lock); + list_for_each_entry(spg_node, &spg->proc_head, proc_node) { + master = spg_node->master; + mm = master->mm; + tgid = master->tgid; + + get_mm_rss_info(mm, &anon, &file, &shmem, &total_rss); + + seq_printf(seq, "%-8d ", tgid); + seq_printf(seq, "%-8d ", id); + seq_printf(seq, "%-9ld %-9ld %-9ld %-8ld %-7ld %-7ld ", + meminfo_alloc_sum_byKB(&spg_node->meminfo), + meminfo_k2u_size(&spg_node->meminfo), + meminfo_alloc_sum_byKB(&spg_node->spg->meminfo), + page2kb(mm->total_vm), page2kb(total_rss), + page2kb(shmem)); + print_process_prot(seq, spg_node->prot); + seq_putc(seq, '\n'); 
+ } + up_read(&spg->rw_lock); + cond_resched(); + + return 0; +} + +static int proc_group_usage_show(struct seq_file *seq, void *offset) +{ + if (!should_show_statistics()) + return -EPERM; + + spg_overview_show(seq); + spa_overview_show(seq); + + /* print the file header */ + seq_printf(seq, "%-8s %-8s %-9s %-9s %-9s %-8s %-7s %-7s %-4s\n", + "PID", "Group_ID", "SP_ALLOC", "SP_K2U", "SP_RES", + "VIRT", "RES", "Shm", "PROT"); + + down_read(&sp_global_sem); + idr_for_each(&sp_group_idr, proc_usage_by_group, seq); + up_read(&sp_global_sem); + + return 0; +} + +static int proc_usage_show(struct seq_file *seq, void *offset) +{ + struct sp_group_master *master = NULL; + unsigned long anon, file, shmem, total_rss; + long sp_res, sp_res_nsize, non_sp_res, non_sp_shm; + struct sp_meminfo *meminfo; + + if (!should_show_statistics()) + return -EPERM; + + seq_printf(seq, "%-8s %-16s %-9s %-9s %-9s %-10s %-10s %-8s\n", + "PID", "COMM", "SP_ALLOC", "SP_K2U", "SP_RES", "Non-SP_RES", + "Non-SP_Shm", "VIRT"); + + down_read(&sp_global_sem); + mutex_lock(&master_list_lock); + list_for_each_entry(master, &master_list, list_node) { + meminfo = &master->meminfo; + get_mm_rss_info(master->mm, &anon, &file, &shmem, &total_rss); + get_process_sp_res(master, &sp_res, &sp_res_nsize); + get_process_non_sp_res(total_rss, shmem, sp_res_nsize, + &non_sp_res, &non_sp_shm); + seq_printf(seq, "%-8d %-16s %-9ld %-9ld %-9ld %-10ld %-10ld %-8ld\n", + master->tgid, master->comm, + meminfo_alloc_sum_byKB(meminfo), + meminfo_k2u_size(meminfo), + sp_res, non_sp_res, non_sp_shm, + page2kb(master->mm->total_vm)); + } + mutex_unlock(&master_list_lock); + up_read(&sp_global_sem); + + return 0; +} + +static void __init proc_sharepool_init(void) +{ + if (!proc_mkdir("sharepool", NULL)) + return; + + proc_create_single_data("sharepool/spa_stat", 0400, NULL, spa_stat_show, NULL); + proc_create_single_data("sharepool/proc_stat", 0400, NULL, proc_group_usage_show, NULL); + proc_create_single_data("sharepool/proc_overview", 0400, NULL, proc_usage_show, NULL); +} + +/*** End of tatistical and maintenance functions ***/ + DEFINE_STATIC_KEY_FALSE(share_pool_enabled_key);
static int __init enable_share_pool(char *s)
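For orientation, reading /proc/sharepool/proc_stat produces a table with the header printed by proc_group_usage_show() above; the row below is made up purely to illustrate the layout (all size columns are in KB), so the values carry no meaning.

PID      Group_ID SP_ALLOC  SP_K2U    SP_RES    VIRT     RES     Shm     PROT
1234     1001     2048      0         2048      81920    4096    2048    RW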
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
fork() creates a new mm for the new process. That mm should not inherit any share pool information from the parent process, so it needs to be cleaned.
exit() will mmput the mm and free the memory; if the mm is already used by an sp_group, the group membership needs to be cleaned up first.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- include/linux/share_pool.h | 21 +++++++++++++- kernel/fork.c | 3 ++ mm/share_pool.c | 59 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 82 insertions(+), 1 deletion(-)
diff --git a/include/linux/share_pool.h b/include/linux/share_pool.h index 6da32aef6886..87ce9eb9fa3e 100644 --- a/include/linux/share_pool.h +++ b/include/linux/share_pool.h @@ -124,6 +124,13 @@ static inline bool sp_is_enabled(void) return static_branch_likely(&share_pool_enabled_key); }
+extern void __sp_mm_clean(struct mm_struct *mm); +static inline void sp_mm_clean(struct mm_struct *mm) +{ + if (sp_is_enabled()) + __sp_mm_clean(mm); +} + static inline void sp_area_work_around(struct vm_unmapped_area_info *info) { if (sp_is_enabled()) @@ -145,6 +152,11 @@ static inline bool sp_check_vm_share_pool(unsigned long vm_flags) return false; }
+static inline void sp_init_mm(struct mm_struct *mm) +{ + mm->sp_group_master = NULL; +} + #else /* CONFIG_SHARE_POOL */
static inline int mg_sp_group_add_task(int tgid, unsigned long prot, int spg_id) @@ -188,10 +200,18 @@ static inline int mg_sp_id_of_current(void) return -EPERM; }
+static inline void sp_mm_clean(struct mm_struct *mm) +{ +} + static inline void sp_area_drop(struct vm_area_struct *vma) { }
+static inline void sp_init_mm(struct mm_struct *mm) +{ +} + static inline int mg_sp_walk_page_range(unsigned long uva, unsigned long size, struct task_struct *tsk, struct sp_walk_data *sp_walk_data) { @@ -225,7 +245,6 @@ static inline bool sp_check_vm_share_pool(unsigned long vm_flags) { return false; } - #endif /* !CONFIG_SHARE_POOL */
#endif /* LINUX_SHARE_POOL_H */ diff --git a/kernel/fork.c b/kernel/fork.c index 3b6d20dfb9a8..edbb16be9b39 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -99,6 +99,7 @@ #include <linux/stackprotector.h> #include <linux/user_events.h> #include <linux/iommu.h> +#include <linux/share_pool.h>
#include <asm/pgalloc.h> #include <linux/uaccess.h> @@ -1308,6 +1309,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, NR_MM_COUNTERS)) goto fail_pcpu;
+ sp_init_mm(mm); mm->user_ns = get_user_ns(user_ns); lru_gen_init_mm(mm); return mm; @@ -1347,6 +1349,7 @@ static inline void __mmput(struct mm_struct *mm) ksm_exit(mm); khugepaged_exit(mm); /* must run before exit_mmap */ exit_mmap(mm); + sp_mm_clean(mm); mm_put_huge_zero_page(mm); set_mm_exe_file(mm, NULL); if (!list_empty(&mm->mmlist)) { diff --git a/mm/share_pool.c b/mm/share_pool.c index c9b75b3ca632..f85fe6c8be87 100644 --- a/mm/share_pool.c +++ b/mm/share_pool.c @@ -3351,6 +3351,65 @@ static void __init proc_sharepool_init(void)
/*** End of statistical and maintenance functions ***/
+void __sp_mm_clean(struct mm_struct *mm) +{ + struct sp_meminfo *meminfo; + long alloc_size, k2u_size; + /* lockless visit */ + struct sp_group_master *master = mm->sp_group_master; + struct sp_group_node *spg_node, *tmp; + struct sp_group *spg; + + if (!master) + return; + + /* + * There are two basic scenarios when a process in the share pool is + * exiting but its share pool memory usage is not 0. + * 1. Process A called sp_alloc(), but it terminates without calling + * sp_free(). Then its share pool memory usage is a positive number. + * 2. Process A never called sp_alloc(), and process B in the same spg + * called sp_alloc() to get an addr u. Then A gets u somehow and + * calls sp_free(u). Now A's share pool memory usage is a negative + * number. Notice B's memory usage will be a positive number. + * + * We decided to print an info message for both of these scenarios. + * + * A process not in an sp group doesn't need to print because there + * won't be any memory which is not freed. + */ + meminfo = &master->meminfo; + alloc_size = meminfo_alloc_sum(meminfo); + k2u_size = atomic64_read(&meminfo->k2u_size); + if (alloc_size != 0 || k2u_size != 0) + pr_info("process %s(%d) exits. It applied %ld aligned KB, k2u shared %ld aligned KB\n", + master->comm, master->tgid, + byte2kb(alloc_size), byte2kb(k2u_size)); + + down_write(&sp_global_sem); + list_for_each_entry_safe(spg_node, tmp, &master->group_head, group_node) { + spg = spg_node->spg; + + down_write(&spg->rw_lock); + + list_del(&spg_node->proc_node); + spg->proc_num--; + list_del(&spg_node->group_node); + master->group_num--; + + up_write(&spg->rw_lock); + + mmdrop(mm); + sp_group_put_locked(spg); + kfree(spg_node); + } + up_write(&sp_global_sem); + + sp_del_group_master(master); + + kfree(master); +} + DEFINE_STATIC_KEY_FALSE(share_pool_enabled_key);
static int __init enable_share_pool(char *s)
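One practical consequence, sketched below with a hypothetical helper: because sp_init_mm() clears sp_group_master in the child, group membership is not inherited across fork(), and a child that wants to keep sharing has to be added to a group again.

#include <linux/mman.h>
#include <linux/share_pool.h>

/* Hypothetical helper: re-add a forked child to an existing group. */
static int demo_readd_child(int child_tgid, int spg_id)
{
	return mg_sp_group_add_task(child_tgid, PROT_READ | PROT_WRITE, spg_id);
}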
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8LNGH
---------------------------------------------
Add protections for the VA range reserved for sharepool. Forbid mremap/munmap from touching that range.
Signed-off-by: Wang Wensheng wangwensheng4@huawei.com --- include/linux/share_pool.h | 27 +++++++++++++++++++++++++++ mm/mmap.c | 9 +++++++++ mm/mremap.c | 4 ++++ 3 files changed, 40 insertions(+)
diff --git a/include/linux/share_pool.h b/include/linux/share_pool.h index 87ce9eb9fa3e..bd2e0ca8f975 100644 --- a/include/linux/share_pool.h +++ b/include/linux/share_pool.h @@ -157,6 +157,23 @@ static inline void sp_init_mm(struct mm_struct *mm) mm->sp_group_master = NULL; }
+static inline bool sp_check_addr(unsigned long addr) +{ + if (sp_is_enabled() && mg_is_sharepool_addr(addr)) + return true; + else + return false; +} + +static inline bool sp_check_mmap_addr(unsigned long addr, unsigned long flags) +{ + if (sp_is_enabled() && mg_is_sharepool_addr(addr) && + !(flags & MAP_SHARE_POOL)) + return true; + else + return false; +} + #else /* CONFIG_SHARE_POOL */
static inline int mg_sp_group_add_task(int tgid, unsigned long prot, int spg_id) @@ -245,6 +262,16 @@ static inline bool sp_check_vm_share_pool(unsigned long vm_flags) { return false; } + +static inline bool sp_check_addr(unsigned long addr) +{ + return false; +} + +static inline bool sp_check_mmap_addr(unsigned long addr, unsigned long flags) +{ + return false; +} #endif /* !CONFIG_SHARE_POOL */
#endif /* LINUX_SHARE_POOL_H */ diff --git a/mm/mmap.c b/mm/mmap.c index ff9d9a8d25ce..21d1fc39bf21 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1720,6 +1720,9 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr, if (len > mmap_end - mmap_min_addr) return -ENOMEM;
+ if (sp_check_mmap_addr(addr, flags)) + return -EINVAL; + if (flags & MAP_FIXED) return addr;
@@ -1769,6 +1772,9 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr, if (len > mmap_end - mmap_min_addr) return -ENOMEM;
+ if (sp_check_mmap_addr(addr, flags)) + return -EINVAL; + if (flags & MAP_FIXED) return addr;
@@ -2949,6 +2955,9 @@ static int __vm_munmap(unsigned long start, size_t len, bool unlock) LIST_HEAD(uf); VMA_ITERATOR(vmi, mm, start);
+ if (sp_check_addr(start)) + return -EINVAL; + if (mmap_write_lock_killable(mm)) return -EINTR;
diff --git a/mm/mremap.c b/mm/mremap.c index 382e81c33fc4..b6979f9d687c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include <linux/uaccess.h> #include <linux/userfaultfd_k.h> #include <linux/mempolicy.h> +#include <linux/share_pool.h>
#include <asm/cacheflush.h> #include <asm/tlb.h> @@ -948,6 +949,9 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, if (offset_in_page(addr)) return ret;
+ if (sp_check_addr(addr) || sp_check_addr(new_addr)) + return ret; + old_len = PAGE_ALIGN(old_len); new_len = PAGE_ALIGN(new_len);
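From user space the effect can be sketched as follows. This is only an illustration under the assumption that addr/len describe a live share pool mapping obtained elsewhere; with the checks above in place, a plain munmap() on such an address is expected to fail with EINVAL instead of tearing the mapping down.

#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical user-space check: the reserved range rejects plain munmap(). */
static void demo_try_unmap(void *addr, size_t len)
{
	if (munmap(addr, len) == -1 && errno == EINVAL)
		printf("share pool VA is protected from munmap()\n");
}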
On 2023/12/4 17:17, Wang Wensheng wrote:
> +static inline bool sp_check_addr(unsigned long addr)
> +{
> +	if (sp_is_enabled() && mg_is_sharepool_addr(addr))
> +		return true;
> +	else
> +		return false;
> +}
> +
> +static inline bool sp_check_mmap_addr(unsigned long addr, unsigned long flags)
> +{
> +	if (sp_is_enabled() && mg_is_sharepool_addr(addr) &&
> +	    !(flags & MAP_SHARE_POOL))
> +		return true;
> +	else
> +		return false;
> +}
For both of these you could just return false directly at the end; the else is not needed.
Feedback: The patch(es) you sent to the kernel@openeuler.org mailing list have been successfully converted to a pull request! Pull request link: https://gitee.com/openeuler/kernel/pulls/3165 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/M...