CVE-2024-47745
Kirill A. Shutemov (1):
  mm: split critical region in remap_file_pages() and invoke LSMs in between

Liam Howlett (1):
  mm/mmap.c: don't unlock VMAs in remap_file_pages()

Liam R. Howlett (1):
  remap_file_pages: Use vma_lookup() instead of find_vma()

Shu Han (1):
  mm: call the security_mmap_file() LSM hook in remap_file_pages()

 mm/mmap.c | 82 ++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 51 insertions(+), 31 deletions(-)
From: Liam Howlett liam.howlett@oracle.com
mainline inclusion
from mainline-v5.13-rc1
commit fce000b1bc08c64c0cff4bb705b3970bd6fc1e34
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAYQSE
CVE: CVE-2024-47745
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Since this call uses MAP_FIXED, do_mmap() will munlock the necessary range. There is also an error in the loop test expression: it evaluates as false on the first iteration, so the loop body has never executed.
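To see why the removed loop never ran, here is a minimal user-space sketch. `struct toy_vma` and `removed_loop_iterations()` are hypothetical stand-ins, not kernel code; only the loop test `tmp->vm_start >= start + size` is taken from the deleted lines. The sketch assumes what the syscall has already checked by this point: `start` lies inside the first VMA and `size` is nonzero, so `vm_start <= start < start + size` and the test is false on its first evaluation.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's vm_area_struct (illustration only). */
struct toy_vma {
	unsigned long vm_start;   /* first address in the mapping */
	unsigned long vm_end;     /* first address past the mapping */
	struct toy_vma *vm_next;  /* next mapping in address order */
};

/*
 * Count how many times the body of the removed loop would run.
 * The test is the one from the deleted code; the extra NULL guard on tmp
 * is an addition so the sketch cannot walk off the end of the toy list.
 */
static int removed_loop_iterations(struct toy_vma *vma,
				   unsigned long start, unsigned long size)
{
	int n = 0;
	struct toy_vma *tmp;

	for (tmp = vma; tmp && tmp->vm_start >= start + size;
	     tmp = tmp->vm_next)
		n++;

	return n;
}
```

For any VMA that contains `start`, the function returns 0, which matches the commit's observation that the munlock loop was dead code.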
Link: https://lkml.kernel.org/r/20210223235010.2296915-1-Liam.Howlett@Oracle.com
Signed-off-by: Liam R. Howlett Liam.Howlett@Oracle.com
Acked-by: Hugh Dickins hughd@google.com
Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org
Reviewed-by: David Hildenbrand david@redhat.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Liu Shixin liushixin2@huawei.com
---
 mm/mmap.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 9b6fcf8c2f1d..e138fde2e733 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3341,25 +3341,9 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 
 	flags &= MAP_NONBLOCK;
 	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
-	if (vma->vm_flags & VM_LOCKED) {
-		struct vm_area_struct *tmp;
+	if (vma->vm_flags & VM_LOCKED)
 		flags |= MAP_LOCKED;
 
-		/* drop PG_Mlocked flag for over-mapped range */
-		for (tmp = vma; tmp->vm_start >= start + size;
-				tmp = tmp->vm_next) {
-			/*
-			 * Split pmd and munlock page on the border
-			 * of the range.
-			 */
-			vma_adjust_trans_huge(tmp, start, start + size, 0);
-
-			munlock_vma_pages_range(tmp,
-					max(tmp->vm_start, start),
-					min(tmp->vm_end, start + size));
-		}
-	}
-
 	file = get_file(vma->vm_file);
 	ret = do_mmap(vma->vm_file, start, size,
 			prot, flags, pgoff, &populate, NULL);
From: "Liam R. Howlett" Liam.Howlett@Oracle.com
mainline inclusion
from mainline-v5.15-rc1
commit 9b593cb20283e68e5e65b09ca10038935297f05b
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAYQSE
CVE: CVE-2024-47745
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Using vma_lookup() verifies that the start address is contained in the found vma, so the separate check against vma->vm_start can be dropped. This makes the code easier to read.
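The difference between the two lookups can be modelled in user space. This is a sketch only: `struct toy_vma`, `toy_find_vma()` and `toy_vma_lookup()` are hypothetical names operating on a sorted array, not the kernel's data structures. It assumes find_vma() returns the first mapping that ends above the address (which may start above it too), while vma_lookup() returns a mapping only when the address falls inside it.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMA: the half-open range [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Model of find_vma(): first mapping whose vm_end is above addr. */
static const struct toy_vma *toy_find_vma(const struct toy_vma *vmas,
					  size_t n, unsigned long addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];

	return NULL;
}

/* Model of vma_lookup(): same search, but reject a mapping above addr. */
static const struct toy_vma *toy_vma_lookup(const struct toy_vma *vmas,
					    size_t n, unsigned long addr)
{
	const struct toy_vma *vma = toy_find_vma(vmas, n, addr);

	if (vma && addr < vma->vm_start)
		return NULL;

	return vma;
}
```

With mappings at [0x1000, 0x2000) and [0x4000, 0x5000), an address of 0x3000 in the hole makes `toy_find_vma()` return the second mapping, whereas `toy_vma_lookup()` returns NULL. That is the case the removed `start < vma->vm_start` check used to catch.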
Link: https://lkml.kernel.org/r/20210817135234.1550204-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett Liam.Howlett@Oracle.com
Reviewed-by: David Hildenbrand david@redhat.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Liu Shixin liushixin2@huawei.com
---
 mm/mmap.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index e138fde2e733..116954328072 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3305,14 +3305,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;
 
-	vma = find_vma(mm, start);
+	vma = vma_lookup(mm, start);
 
 	if (!vma || !(vma->vm_flags & VM_SHARED))
 		goto out;
 
-	if (start < vma->vm_start)
-		goto out;
-
 	if (start + size > vma->vm_end) {
 		struct vm_area_struct *next;
 
From: Shu Han ebpqwerty472123@gmail.com
mainline inclusion
from mainline-v6.12-rc6
commit ea7e2d5e49c05e5db1922387b09ca74aa40f46e2
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAYQSE
CVE: CVE-2024-47745
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The remap_file_pages syscall handler calls do_mmap() directly, and do_mmap() does not perform the LSM security check. If the process has previously called personality(READ_IMPLIES_EXEC) and remap_file_pages() is then called on RW pages, the pages are actually remapped as RWX, bypassing a W^X policy enforced by SELinux.
So we should check prot with the security_mmap_file() LSM hook in the remap_file_pages syscall handler before do_mmap() is called. Otherwise, an attacker can potentially bypass a W^X policy enforced by SELinux.
The bypass is similar to CVE-2016-10044, which bypasses the same policy via AIO and is described in [1].
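The RW to RWX upgrade can be illustrated with a small user-space model. This is a sketch, not the kernel code: `toy_effective_prot()` is a hypothetical helper. It assumes the mapping path adds PROT_EXEC when PROT_READ is requested, the READ_IMPLIES_EXEC personality bit is set, and the backing file is not on a noexec mount (the `file_noexec` parameter is likewise an illustrative stand-in for that condition).

```c
#include <stdbool.h>
#include <sys/mman.h>         /* PROT_READ, PROT_WRITE, PROT_EXEC */
#include <sys/personality.h>  /* READ_IMPLIES_EXEC */

/*
 * Hypothetical model of the protection bits a mapping ends up with.
 * A readable request under READ_IMPLIES_EXEC silently gains PROT_EXEC
 * unless the backing file lives on a noexec mount.
 */
static int toy_effective_prot(int prot, unsigned long persona,
			      bool file_noexec)
{
	if ((prot & PROT_READ) && (persona & READ_IMPLIES_EXEC) &&
	    !file_noexec)
		prot |= PROT_EXEC;

	return prot;
}
```

Without an LSM check on this path, nothing gets the chance to veto the writable and executable result. That is the gap the added security_mmap_file() call closes.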
The PoC:
$ cat > test.c

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/personality.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
	size_t pagesz = sysconf(_SC_PAGE_SIZE);
	int mfd = syscall(SYS_memfd_create, "test", 0);
	const char *buf = mmap(NULL, 4 * pagesz, PROT_READ | PROT_WRITE,
		MAP_SHARED, mfd, 0);
	unsigned int old = syscall(SYS_personality, 0xffffffff);
	syscall(SYS_personality, READ_IMPLIES_EXEC | old);
	syscall(SYS_remap_file_pages, buf, pagesz, 0, 2, 0);
	syscall(SYS_personality, old);
	// show the RWX page exists even if W^X policy is enforced
	int fd = open("/proc/self/maps", O_RDONLY);
	unsigned char buf2[1024];
	while (1) {
		int ret = read(fd, buf2, 1024);
		if (ret <= 0) break;
		write(1, buf2, ret);
	}
	close(fd);
}

$ gcc test.c -o test
$ ./test | grep rwx
7f1836c34000-7f1836c35000 rwxs 00002000 00:01 2050 /memfd:test (deleted)
Link: https://project-zero.issues.chromium.org/issues/42452389 [1]
Cc: stable@vger.kernel.org
Signed-off-by: Shu Han ebpqwerty472123@gmail.com
Acked-by: Stephen Smalley stephen.smalley.work@gmail.com
[PM: subject line tweaks]
Signed-off-by: Paul Moore paul@paul-moore.com
Conflicts:
	mm/mmap.c
[ Context conflict because do_mmap(). ]
Signed-off-by: Liu Shixin liushixin2@huawei.com
---
 mm/mmap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 116954328072..223b72bcb1e2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3342,8 +3342,12 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 		flags |= MAP_LOCKED;
 
 	file = get_file(vma->vm_file);
+	ret = security_mmap_file(vma->vm_file, prot, flags);
+	if (ret)
+		goto out_fput;
 	ret = do_mmap(vma->vm_file, start, size,
 			prot, flags, pgoff, &populate, NULL);
+out_fput:
 	fput(file);
 out:
 	mmap_write_unlock(mm);
From: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com
mainline inclusion
from mainline-v6.12-rc6
commit 58a039e679fe72bd0efa8b2abe669a7914bb4429
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAYQSE
CVE: CVE-2024-47745
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") fixed a security issue by adding an LSM check when remapping file pages, so that LSMs have the opportunity to evaluate the action as they do for other memory operations such as mmap() and mprotect().
However, that commit called security_mmap_file() while holding mmap_lock, whereas the other callers do it before taking the lock, since commit 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
This caused a lock inversion with IMA, which takes mmap_lock and i_mutex in the opposite order when the remap_file_pages() system call is called.
Solve the issue by splitting the critical region in remap_file_pages() into two regions: the first takes a read lock on mmap_lock, retrieves the VMA and the associated file descriptor, and calculates the 'prot' and 'flags' variables; the second takes a write lock on mmap_lock, checks that the VMA flags and the VMA file descriptor are the same as the ones obtained in the first critical region (otherwise the system call fails), and calls do_mmap().
In between, after releasing the read lock and before taking the write lock, call security_mmap_file(). Because no mmap_lock is held at that point, the lock inversion is resolved.
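The snapshot, check, and revalidate sequence described above can be sketched as a single-threaded user-space model. Everything here is hypothetical and for illustration only: `toy_mm`, `toy_vma`, `toy_lock()`, `toy_remap()` and the three sample callbacks are not kernel APIs, and the "lock" is just a depth counter standing in for mmap_lock. A callback that modifies the VMA simulates another thread changing the mapping while no lock is held.

```c
#include <stddef.h>

/* Hypothetical toy state; none of these names exist in the kernel. */
struct toy_vma {
	unsigned long vm_flags;
	const void *vm_file;
};

struct toy_mm {
	struct toy_vma *vma;  /* the mapping found at 'start', or NULL */
	int lock_depth;       /* stand-in for mmap_lock being held */
};

enum { TOY_OK = 0, TOY_EACCES = -13, TOY_EINVAL = -22 };

typedef int (*toy_check_fn)(const void *file, struct toy_mm *mm);

static void toy_lock(struct toy_mm *mm)   { mm->lock_depth++; }
static void toy_unlock(struct toy_mm *mm) { mm->lock_depth--; }

static int toy_remap(struct toy_mm *mm, toy_check_fn security_check)
{
	unsigned long saved_flags;
	const void *saved_file;
	int ret;

	/* Region 1 (read lock in the patch): look up and snapshot. */
	toy_lock(mm);
	if (!mm->vma) {
		toy_unlock(mm);
		return TOY_EINVAL;
	}
	saved_flags = mm->vma->vm_flags;
	saved_file = mm->vma->vm_file;
	toy_unlock(mm);

	/* Between the regions: run the security check with no lock held. */
	ret = security_check(saved_file, mm);
	if (ret)
		return ret;

	/* Region 2 (write lock in the patch): revalidate, then act. */
	toy_lock(mm);
	ret = TOY_EINVAL;
	if (mm->vma && mm->vma->vm_flags == saved_flags &&
	    mm->vma->vm_file == saved_file)
		ret = TOY_OK;  /* the real code calls do_mmap() here */
	toy_unlock(mm);

	return ret;
}

/* Sample checks: allow, deny, and allow while the VMA changes underneath. */
static int toy_allow(const void *file, struct toy_mm *mm)
{
	(void)file; (void)mm;
	return TOY_OK;
}

static int toy_deny(const void *file, struct toy_mm *mm)
{
	(void)file; (void)mm;
	return TOY_EACCES;
}

static int toy_allow_but_race(const void *file, struct toy_mm *mm)
{
	(void)file;
	mm->vma->vm_flags ^= 1UL;  /* simulate a concurrent flag change */
	return TOY_OK;
}
```

Calling `toy_remap()` with `toy_allow` succeeds, `toy_deny` propagates the check's error, and `toy_allow_but_race` fails with the toy EINVAL because the flags no longer match the snapshot. The design trades one extra lookup for the ability to run the check without holding the lock.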
Link: https://lkml.kernel.org/r/20241018161415.3845146-1-roberto.sassu@huaweicloud...
Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
Signed-off-by: Roberto Sassu roberto.sassu@huawei.com
Reported-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.G...
Tested-by: Roberto Sassu roberto.sassu@huawei.com
Reviewed-by: Roberto Sassu roberto.sassu@huawei.com
Reviewed-by: Jann Horn jannh@google.com
Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
Reviewed-by: Liam R. Howlett Liam.Howlett@Oracle.com
Reviewed-by: Paul Moore paul@paul-moore.com
Tested-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com
Cc: Jarkko Sakkinen jarkko@kernel.org
Cc: Dmitry Kasatkin dmitry.kasatkin@gmail.com
Cc: Eric Snowberg eric.snowberg@oracle.com
Cc: James Morris jmorris@namei.org
Cc: Mimi Zohar zohar@linux.ibm.com
Cc: "Serge E. Hallyn" serge@hallyn.com
Cc: Shu Han ebpqwerty472123@gmail.com
Cc: Vlastimil Babka vbabka@suse.cz
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflicts:
	mm/mmap.c
[ Context conflict because pr_warn_once() and do_mmap(). ]
Signed-off-by: Liu Shixin liushixin2@huawei.com
---
 mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 52 insertions(+), 17 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 223b72bcb1e2..4d465d66f26b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3286,6 +3286,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	unsigned long populate = 0;
 	unsigned long ret = -EINVAL;
 	struct file *file;
+	vm_flags_t vm_flags;
 
 	pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.rst.\n",
 		     current->comm, current->pid);
@@ -3302,12 +3303,60 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (pgoff + (size >> PAGE_SHIFT) < pgoff)
 		return ret;
 
-	if (mmap_write_lock_killable(mm))
+	if (mmap_read_lock_killable(mm))
 		return -EINTR;
 
+	/*
+	 * Look up VMA under read lock first so we can perform the security
+	 * without holding locks (which can be problematic). We reacquire a
+	 * write lock later and check nothing changed underneath us.
+	 */
 	vma = vma_lookup(mm, start);
 
-	if (!vma || !(vma->vm_flags & VM_SHARED))
+	if (!vma || !(vma->vm_flags & VM_SHARED)) {
+		mmap_read_unlock(mm);
+		return -EINVAL;
+	}
+
+	prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
+	prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
+	prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
+
+	flags &= MAP_NONBLOCK;
+	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
+	if (vma->vm_flags & VM_LOCKED)
+		flags |= MAP_LOCKED;
+
+	/* Save vm_flags used to calculate prot and flags, and recheck later. */
+	vm_flags = vma->vm_flags;
+	file = get_file(vma->vm_file);
+
+	mmap_read_unlock(mm);
+
+	/* Call outside mmap_lock to be consistent with other callers. */
+	ret = security_mmap_file(file, prot, flags);
+	if (ret) {
+		fput(file);
+		return ret;
+	}
+
+	ret = -EINVAL;
+
+	/* OK security check passed, take write lock + let it rip. */
+	if (mmap_write_lock_killable(mm)) {
+		fput(file);
+		return -EINTR;
+	}
+
+	vma = vma_lookup(mm, start);
+
+	if (!vma)
+		goto out;
+
+	/* Make sure things didn't change under us. */
+	if (vma->vm_flags != vm_flags)
+		goto out;
+	if (vma->vm_file != file)
 		goto out;
 
 	if (start + size > vma->vm_end) {
@@ -3332,25 +3381,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 			goto out;
 	}
 
-	prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
-	prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
-	prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
-
-	flags &= MAP_NONBLOCK;
-	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
-	if (vma->vm_flags & VM_LOCKED)
-		flags |= MAP_LOCKED;
-
-	file = get_file(vma->vm_file);
-	ret = security_mmap_file(vma->vm_file, prot, flags);
-	if (ret)
-		goto out_fput;
 	ret = do_mmap(vma->vm_file, start, size,
 			prot, flags, pgoff, &populate, NULL);
-out_fput:
-	fput(file);
 out:
 	mmap_write_unlock(mm);
+	fput(file);
 	if (populate)
 		mm_populate(ret, populate);
 	if (!IS_ERR_VALUE(ret))
Feedback: The patch(es) which you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/13393
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/W...