From: David Hildenbrand <david@redhat.com>
stable inclusion
from stable-v6.6.33
commit b410a6c84d849a44c387d1ddf6edb84f1d7c37c8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IAD6H2
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 90a7592da14951bd21f74a53246ba30955a648aa ]
s390x must disable shared zeropages for processes running VMs, because the VMs could end up making use of "storage keys" or protected virtualization, which are incompatible with shared zeropages.
Yet, with userfaultfd it is possible to insert shared zeropages into such processes. Let's fall back to simply allocating a fresh zeroed anonymous folio and insert that instead.
mm_forbids_zeropage() was introduced in commit 593befa6ab74 ("mm: introduce mm_forbids_zeropage function"), briefly before userfaultfd went upstream.
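For context, mm_forbids_zeropage() is an opt-out hook: the generic definition in include/linux/mm.h evaluates to 0, and an architecture (today only s390) can override it from its <asm/pgtable.h>. Roughly (a sketch of the generic stub; the comment is paraphrased):

	/*
	 * Allows an architecture to veto the shared zeropage for a given MM;
	 * s390 overrides this in <asm/pgtable.h> because storage keys and
	 * protected virtualization cannot coexist with a shared zero page.
	 */
	#ifndef mm_forbids_zeropage
	#define mm_forbids_zeropage(X)	(0)
	#endif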
Note that we don't want to fail the UFFDIO_ZEROPAGE request as we do for hugetlb; that would be rather unexpected. Further, we also cannot really indicate "not supported" to user space ahead of time: the MM might only start forbidding zeropages after userfaultfd was already registered.
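For illustration, a minimal user-space sketch (zeropage_range() is a hypothetical helper; uffd is assumed to be a userfaultfd file descriptor already registered over the range). The point is that callers keep issuing UFFDIO_ZEROPAGE unchanged, whether the kernel maps the shared zeropage or falls back to a zeroed folio:

	#include <linux/userfaultfd.h>
	#include <sys/ioctl.h>

	static int zeropage_range(int uffd, unsigned long addr, unsigned long len)
	{
		struct uffdio_zeropage zp = {
			.range = { .start = addr, .len = len },
			.mode = 0,
		};

		if (ioctl(uffd, UFFDIO_ZEROPAGE, &zp))
			return -1;
		/* On success, zp.zeropage holds the number of bytes resolved;
		 * user space cannot tell which of the two paths was taken. */
		return zp.zeropage == (__s64)len ? 0 : -1;
	}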
[ agordeev: Fixed checkpatch complaints ]
Fixes: c1a4de99fada ("userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation")
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240411161441.910170-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
 mm/userfaultfd.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index ef7f6348c7ec..315e59583bd6 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -214,6 +214,38 @@ static int mfill_atomic_pte_copy(pmd_t *dst_pmd,
 	goto out;
 }
 
+static int mfill_atomic_pte_zeroed_folio(pmd_t *dst_pmd,
+					 struct vm_area_struct *dst_vma,
+					 unsigned long dst_addr)
+{
+	struct folio *folio;
+	int ret = -ENOMEM;
+
+	folio = vma_alloc_zeroed_movable_folio(dst_vma, dst_addr);
+	if (!folio)
+		return ret;
+
+	if (mem_cgroup_charge(folio, dst_vma->vm_mm, GFP_KERNEL))
+		goto out_put;
+
+	/*
+	 * The memory barrier inside __folio_mark_uptodate makes sure that
+	 * zeroing out the folio become visible before mapping the page
+	 * using set_pte_at(). See do_anonymous_page().
+	 */
+	__folio_mark_uptodate(folio);
+
+	ret = mfill_atomic_install_pte(dst_pmd, dst_vma, dst_addr,
+				       &folio->page, true, 0);
+	if (ret)
+		goto out_put;
+
+	return 0;
+out_put:
+	folio_put(folio);
+	return ret;
+}
+
 static int mfill_atomic_pte_zeropage(pmd_t *dst_pmd,
 				     struct vm_area_struct *dst_vma,
 				     unsigned long dst_addr)
@@ -222,6 +254,9 @@ static int mfill_atomic_pte_zeropage(pmd_t *dst_pmd,
 	spinlock_t *ptl;
 	int ret;
 
+	if (mm_forbids_zeropage(dst_vma->vm_mm))
+		return mfill_atomic_pte_zeroed_folio(dst_pmd, dst_vma, dst_addr);
+
 	_dst_pte = pte_mkspecial(pfn_pte(my_zero_pfn(dst_addr),
 					 dst_vma->vm_page_prot));
 	ret = -EAGAIN;