secretmem: disable memfd_secret() if arch cannot set direct map

JIRA: https://issues.redhat.com/browse/RHEL-27745
JIRA: https://issues.redhat.com/browse/RHEL-66627
CVE: CVE-2024-50182

This patch is a backport of the following upstream commit:
commit 532b53cebe58f34ce1c0f34d866f5c0e335c53c6
Author: Patrick Roy <roypat@amazon.co.uk>
Date:   Tue Oct 1 09:00:41 2024 +0100

    secretmem: disable memfd_secret() if arch cannot set direct map

    Return -ENOSYS from memfd_secret() syscall if !can_set_direct_map().  This
    is the case for example on some arm64 configurations, where marking 4k
    PTEs in the direct map not present can only be done if the direct map is
    set up at 4k granularity in the first place (as ARM's break-before-make
    semantics do not easily allow breaking apart large/gigantic pages).

    More precisely, on arm64 systems with !can_set_direct_map(),
    set_direct_map_invalid_noflush() is a no-op, however it returns success
    (0) instead of an error.  This means that memfd_secret will seemingly
    "work" (e.g.  syscall succeeds, you can mmap the fd and fault in pages),
    but it does not actually achieve its goal of removing its memory from the
    direct map.

    Note that with this patch, memfd_secret() will start erroring on systems
    where can_set_direct_map() returns false (arm64 with
    CONFIG_RODATA_FULL_DEFAULT_ENABLED=n, CONFIG_DEBUG_PAGEALLOC=n and
    CONFIG_KFENCE=n), but that still seems better than the current silent
    failure.  Since CONFIG_RODATA_FULL_DEFAULT_ENABLED defaults to 'y', most
    arm64 systems actually have a working memfd_secret() and aren't be
    affected.

    From going through the iterations of the original memfd_secret patch
    series, it seems that disabling the syscall in these scenarios was the
    intended behavior [1] (preferred over having
    set_direct_map_invalid_noflush return an error as that would result in
    SIGBUSes at page-fault time), however the check for it got dropped between
    v16 [2] and v17 [3], when secretmem moved away from CMA allocations.

    [1]: https://lore.kernel.org/lkml/20201124164930.GK8537@kernel.org/
    [2]: https://lore.kernel.org/lkml/20210121122723.3446-11-rppt@kernel.org/#t
    [3]: https://lore.kernel.org/lkml/20201125092208.12544-10-rppt@kernel.org/

    Link: https://lkml.kernel.org/r/20241001080056.784735-1-roypat@amazon.co.uk
    Fixes: 1507f51255 ("mm: introduce memfd_secret system call to create "secret" memory areas")
    Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
    Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
    Cc: Alexander Graf <graf@amazon.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: James Gowans <jgowans@amazon.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
This commit is contained in:
Rafael Aquini 2024-12-09 12:19:15 -05:00
parent 92597a2f34
commit a6c409e80c
1 changed files with 2 additions and 2 deletions

View File

@ -238,7 +238,7 @@ SYSCALL_DEFINE1(memfd_secret, unsigned int, flags)
/* make sure local flags do not confict with global fcntl.h */
BUILD_BUG_ON(SECRETMEM_FLAGS_MASK & O_CLOEXEC);
if (!secretmem_enable)
if (!secretmem_enable || !can_set_direct_map())
return -ENOSYS;
if (flags & ~(SECRETMEM_FLAGS_MASK | O_CLOEXEC))
@ -280,7 +280,7 @@ static struct file_system_type secretmem_fs = {
static int __init secretmem_init(void)
{
if (!secretmem_enable)
if (!secretmem_enable || !can_set_direct_map())
return 0;
secretmem_mnt = kern_mount(&secretmem_fs);