From: Mina Almasry almasrymina@google.com
[ Upstream commit d84cf06e3dd8c5c5b547b5d8931015fc536678e5 ]
The userfaultfd hugetlb tests cause a resv_huge_pages underflow. This happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on an index for which we already have a page in the cache. When this happens, we allocate a second page, double consuming the reservation, and then fail to insert the page into the cache and return -EEXIST.
To fix this, we first check if there is a page in the cache which already consumed the reservation, and return -EEXIST immediately if so.
There is still a rare condition where we fail to copy the page contents AND race with a call for hugetlb_no_page() for this index and again we will underflow resv_huge_pages. That is fixed in a more complicated patch not targeted for -stable.
Test:
Hacked the code locally such that resv_huge_pages underflows produce a warning, then:
./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success ./tools/testing/selftests/vm/userfaultfd hugetlb 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
Both tests succeed and produce no warnings. After the test runs number of free/resv hugepages is correct.
[mike.kravetz@oracle.com: changelog fixes]
Link: https://lkml.kernel.org/r/20210528004649.85298-1-almasrymina@google.com Fixes: 8fb5debc5fcd ("userfaultfd: hugetlbfs: add hugetlb_mcopy_atomic_pte for userfaultfd support") Signed-off-by: Mina Almasry almasrymina@google.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Cc: Axel Rasmussen axelrasmussen@google.com Cc: Peter Xu peterx@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- mm/hugetlb.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 491a40d547790..a30922c297aeb 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4338,10 +4338,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, struct page *page;
if (!*pagep) { - ret = -ENOMEM; + /* If a page already exists, then it's UFFDIO_COPY for + * a non-missing case. Return -EEXIST. + */ + if (vm_shared && + hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) { + ret = -EEXIST; + goto out; + } + page = alloc_huge_page(dst_vma, dst_addr, 0); - if (IS_ERR(page)) + if (IS_ERR(page)) { + ret = -ENOMEM; goto out; + }
ret = copy_huge_page_from_user(page, (const void __user *) src_addr,