From: Erik Kaneda <erik.kaneda@intel.com>

mainline inclusion
from mainline-v5.11-rc1
commit 32cf1a12cad43358e47dac8014379c2f33dfbed4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4T7OX
CVE: NA
--------------------------------
ACPICA commit 52d1da5dcbd79a722b70f02a1a83f04088f51ff6
There was a memory leak that occurred when a _CID object is defined as a package containing string objects. When _CID is checked for any possible repairs, it calls a helper function to repair _HID (because _CID basically contains multiple _HID entries).
The _HID repair function assumes that string objects are standalone objects that are not contained inside any packages. The _HID repair function replaces the string object with a brand new object and attempts to delete the old object by decrementing its reference count. Strings inside of packages have a reference count of 2, so the _HID repair function leaves this object in a dangling state and causes a memory leak.
Instead of allocating a brand new object and removing the old object, use the existing object when repairing the _HID object.
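To see why a single reference-count decrement cannot free a string that lives inside a package, here is a stand-alone toy model of the leak (illustrative only; the names and the refcounting are simplified assumptions, not ACPICA code):

  #include <stdlib.h>

  struct toy_obj { int refcount; };

  static void toy_put(struct toy_obj *obj)
  {
  	if (--obj->refcount == 0)
  		free(obj);
  }

  /*
   * Old behaviour: allocate a replacement and drop one reference on the
   * original.  An object referenced by a package starts at refcount 2,
   * so it ends up at 1 and is never freed.
   */
  static struct toy_obj *repair_by_replacing(struct toy_obj *in_package)
  {
  	struct toy_obj *fresh = calloc(1, sizeof(*fresh));

  	fresh->refcount = 1;
  	toy_put(in_package);	/* 2 -> 1: dangling, leaked */
  	return fresh;
  }

Repairing the existing object in place, as the hunk below does, sidesteps the question of who owns the old object entirely.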
Link: https://github.com/acpica/acpica/commit/52d1da5d
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Conflicts:
	drivers/acpi/acpica/nsrepair2.c
[wangxiongfeng: fix conflicts caused by context mismatch]
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 drivers/acpi/acpica/nsrepair2.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/drivers/acpi/acpica/nsrepair2.c b/drivers/acpi/acpica/nsrepair2.c
index 125143c41bb8..fb4f8f6209bd 100644
--- a/drivers/acpi/acpica/nsrepair2.c
+++ b/drivers/acpi/acpica/nsrepair2.c
@@ -491,9 +491,8 @@ acpi_ns_repair_HID(struct acpi_evaluate_info *info,
 		   union acpi_operand_object **return_object_ptr)
 {
 	union acpi_operand_object *return_object = *return_object_ptr;
-	union acpi_operand_object *new_string;
-	char *source;
 	char *dest;
+	char *source;
 
 	ACPI_FUNCTION_NAME(ns_repair_HID);
 
@@ -514,13 +513,6 @@ acpi_ns_repair_HID(struct acpi_evaluate_info *info,
 		return (AE_OK);
 	}
 
-	/* It is simplest to always create a new string object */
-
-	new_string = acpi_ut_create_string_object(return_object->string.length);
-	if (!new_string) {
-		return (AE_NO_MEMORY);
-	}
-
 	/*
 	 * Remove a leading asterisk if present. For some unknown reason, there
 	 * are many machines in the field that contains IDs like this.
@@ -530,7 +522,7 @@ acpi_ns_repair_HID(struct acpi_evaluate_info *info,
 	source = return_object->string.pointer;
 	if (*source == '*') {
 		source++;
-		new_string->string.length--;
+		return_object->string.length--;
 
 		ACPI_DEBUG_PRINT((ACPI_DB_REPAIR,
 				  "%s: Removed invalid leading asterisk\n",
@@ -545,12 +537,11 @@ acpi_ns_repair_HID(struct acpi_evaluate_info *info,
 	 * "NNNN####" where N is an uppercase letter or decimal digit, and
 	 * # is a hex digit.
 	 */
-	for (dest = new_string->string.pointer; *source; dest++, source++) {
+	for (dest = return_object->string.pointer; *source; dest++, source++) {
 		*dest = (char)toupper((int)*source);
 	}
+	return_object->string.pointer[return_object->string.length] = 0;
 
-	acpi_ut_remove_reference(return_object);
-	*return_object_ptr = new_string;
 	return (AE_OK);
 }
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
mainline inclusion from mainline-v5.11-rc1 commit b93b7ef61764819b6060f69e35ea9d6563b9b5d8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4T7OX CVE: NA
--------------------------------
When wakeup signaling is enabled for a bridge for the second (or every next) time in a row, its existing device wakeup power configuration may not match the new conditions. For example, some devices below it may have been put into low-power states, which changes the device wakeup power conditions or similar. This causes functional problems to appear on some systems (for example, the Thunderbolt port on Dell Precision 5550 cannot detect devices plugged in after it has been suspended).
For this reason, modify __acpi_device_wakeup_enable() to refresh the device wakeup power configuration of the target device on every invocation, not just when it is called for that device for the first time in a row.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 drivers/acpi/device_pm.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index ecd2ddc2215f..6de9bed7c1b2 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -758,16 +758,26 @@ static int __acpi_device_wakeup_enable(struct acpi_device *adev,
 
 	mutex_lock(&acpi_wakeup_lock);
 
-	if (wakeup->enable_count >= INT_MAX) {
-		acpi_handle_info(adev->handle, "Wakeup enable count out of bounds!\n");
-		goto out;
-	}
+	/*
+	 * If the device wakeup power is already enabled, disable it and enable
+	 * it again in case it depends on the configuration of subordinate
+	 * devices and the conditions have changed since it was enabled last
+	 * time.
+	 */
 	if (wakeup->enable_count > 0)
-		goto inc;
+		acpi_disable_wakeup_device_power(adev);
 
 	error = acpi_enable_wakeup_device_power(adev, target_state);
-	if (error)
+	if (error) {
+		if (wakeup->enable_count > 0) {
+			acpi_disable_gpe(wakeup->gpe_device, wakeup->gpe_number);
+			wakeup->enable_count = 0;
+		}
 		goto out;
+	}
+
+	if (wakeup->enable_count > 0)
+		goto inc;
 
 	status = acpi_enable_gpe(wakeup->gpe_device, wakeup->gpe_number);
 	if (ACPI_FAILURE(status)) {
@@ -780,7 +790,10 @@ static int __acpi_device_wakeup_enable(struct acpi_device *adev,
 		  (unsigned int)wakeup->gpe_number);
 
 inc:
-	wakeup->enable_count++;
+	if (wakeup->enable_count < INT_MAX)
+		wakeup->enable_count++;
+	else
+		acpi_handle_info(adev->handle, "Wakeup enable count out of bounds!\n");
 
 out:
 	mutex_unlock(&acpi_wakeup_lock);
From: Marc Zyngier <maz@kernel.org>

mainline inclusion
from mainline-v5.13-rc1
commit 2a20b08f06e70860272bc7f52b5423c1b2f06696
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4T7OX
CVE: NA
--------------------------------
When using ACPI on arm64, which implies the GIC IRQ model, no table should ever provide a GSI number in the range [0:15], as these are reserved for IPIs.
However, drivers tend to call acpi_unregister_gsi() with any random GSI number provided by half-baked tables, which results in an exploding kernel when its IPIs have been unconfigured.
In order to catch this, check for the silly case early, warn that something is going wrong and avoid the above disaster.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: dann frazier <dann.frazier@canonical.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20210421164317.1718831-3-maz@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 drivers/acpi/irq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/irq.c b/drivers/acpi/irq.c
index e209081d644b..c68e694fca26 100644
--- a/drivers/acpi/irq.c
+++ b/drivers/acpi/irq.c
@@ -75,8 +75,12 @@ void acpi_unregister_gsi(u32 gsi)
 {
 	struct irq_domain *d = irq_find_matching_fwnode(acpi_gsi_domain_id,
 							DOMAIN_BUS_ANY);
-	int irq = irq_find_mapping(d, gsi);
+	int irq;
 
+	if (WARN_ON(acpi_irq_model == ACPI_IRQ_MODEL_GIC && gsi < 16))
+		return;
+
+	irq = irq_find_mapping(d, gsi);
 	irq_dispose_mapping(irq);
 }
 EXPORT_SYMBOL_GPL(acpi_unregister_gsi);
From: Yu Kuai <yukuai3@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4TNJT
--------------------------------
The inode list 's_inodes' should be protected by 'sb->s_inode_list_lock' instead of a self-defined lock.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 fs/dirty_pages.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/fs/dirty_pages.c b/fs/dirty_pages.c
index 0e718ead417f..21e2251b6619 100644
--- a/fs/dirty_pages.c
+++ b/fs/dirty_pages.c
@@ -24,7 +24,6 @@ static struct proc_dir_entry *dirty_dir;
 
 static struct mutex buff_used;	/* buffer is in used */
 static struct mutex buff_lock;	/* lock when buffer is changed */
-DEFINE_SPINLOCK(inode_sb_list_lock);
 
 /* proc root directory */
 #define DIRTY_ROOT "dirty"
@@ -119,7 +118,7 @@ static void dump_dirtypages_sb(struct super_block *sb, struct seq_file *m)
 	if (!tmpname)
 		return;
 
-	spin_lock(&inode_sb_list_lock);
+	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
 		spin_lock(&inode->i_lock);
@@ -135,7 +134,7 @@ static void dump_dirtypages_sb(struct super_block *sb, struct seq_file *m)
 		}
 		__iget(inode);
 		spin_unlock(&inode->i_lock);
-		spin_unlock(&inode_sb_list_lock);
+		spin_unlock(&sb->s_inode_list_lock);
 
 		cond_resched();
 
@@ -169,9 +168,9 @@ static void dump_dirtypages_sb(struct super_block *sb, struct seq_file *m)
 skip:
 		iput(toput_inode);
 		toput_inode = inode;
-		spin_lock(&inode_sb_list_lock);
+		spin_lock(&sb->s_inode_list_lock);
 	}
-	spin_unlock(&inode_sb_list_lock);
+	spin_unlock(&sb->s_inode_list_lock);
 done:
 	iput(toput_inode);
 	kfree(tmpname);
From: ChenXiaoSong <chenxiaosong2@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 186233, https://gitee.com/src-openeuler/kernel/issues/I4TLA2?from=project-issue
CVE: NA
-----------------------------------------------
When configfs_register_subsystem() or configfs_unregister_subsystem() is executing link_group() or unlink_group(), it is possible for two processes to add or delete entries on the same list concurrently. Some unfortunate interleavings of them can cause a kernel panic.
One of the cases is:

A --> B --> C --> D
A <-- B <-- C <-- D

     delete list_head *B          |     delete list_head *C
----------------------------------|----------------------------------
configfs_unregister_subsystem     | configfs_unregister_subsystem
  unlink_group                    |   unlink_group
    unlink_obj                    |     unlink_obj
      list_del_init               |       list_del_init
        __list_del_entry          |         __list_del_entry
          __list_del              |           __list_del
            // next == C          |
            next->prev = prev     |
                                  |             next->prev = prev
            prev->next = next     |
                                  |             // prev == B
                                  |             prev->next = next
Fix this by adding a mutex when calling link_group() or unlink_group(). Since the parent configfs_subsystem is NULL when the config_item is the root, create a new configfs_subsystem_mutex for that case.
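The corruption pattern in the diagram can be reproduced with a stand-alone toy (a simplified doubly-linked list, not kernel code):

  struct node { struct node *prev, *next; };

  /* __list_del() from the diagram, with no locking around it. */
  static void toy_list_del(struct node *n)
  {
  	n->next->prev = n->prev;	/* thread 1 stops here...          */
  	n->prev->next = n->next;	/* ...thread 2 runs its deletion,
  					 * then this write goes through a
  					 * stale neighbour pointer.        */
  }

If thread 1 deletes B and thread 2 deletes C at the interleaving shown above, B's stale next still points at C and C's stale prev still points at B, so the final prev->next assignments stitch already-deleted nodes back into the list.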
Fixes: 7063fbf22611 ("[PATCH] configfs: User-driven configuration filesystem")
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 fs/configfs/dir.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index b0983e2a4e2c..a9cc6d2d727c 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -36,6 +36,14 @@
  */
 DEFINE_SPINLOCK(configfs_dirent_lock);
 
+/*
+ * All of link_obj/unlink_obj/link_group/unlink_group require that
+ * subsys->su_mutex is held.
+ * But parent configfs_subsystem is NULL when config_item is root.
+ * Use this mutex when config_item is root.
+ */
+static DEFINE_MUTEX(configfs_subsystem_mutex);
+
 static void configfs_d_iput(struct dentry * dentry,
 			    struct inode * inode)
 {
@@ -1884,7 +1892,9 @@ int configfs_register_subsystem(struct configfs_subsystem *subsys)
 	group->cg_item.ci_name = group->cg_item.ci_namebuf;
 
 	sd = root->d_fsdata;
+	mutex_lock(&configfs_subsystem_mutex);
 	link_group(to_config_group(sd->s_element), group);
+	mutex_unlock(&configfs_subsystem_mutex);
 
 	inode_lock_nested(d_inode(root), I_MUTEX_PARENT);
 
@@ -1909,7 +1919,9 @@ int configfs_register_subsystem(struct configfs_subsystem *subsys)
 	inode_unlock(d_inode(root));
 
 	if (err) {
+		mutex_lock(&configfs_subsystem_mutex);
 		unlink_group(group);
+		mutex_unlock(&configfs_subsystem_mutex);
 		configfs_release_fs();
 	}
 	put_fragment(frag);
@@ -1956,7 +1968,9 @@ void configfs_unregister_subsystem(struct configfs_subsystem *subsys)
 
 	dput(dentry);
 
+	mutex_lock(&configfs_subsystem_mutex);
 	unlink_group(group);
+	mutex_unlock(&configfs_subsystem_mutex);
 	configfs_release_fs();
 }
From: Jarkko Sakkinen <jarkko@kernel.org>

mainline inclusion
from mainline-v5.11-rc1
commit 70d3b8ddcd20d3c859676f56c43c7b2360c70266
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
Define the SGX architectural data structures used by various SGX functions. This is not an exhaustive representation of all SGX data structures but only those needed by the kernel.
The goal is to sequester hardware structures in "sgx/arch.h" and keep them separate from kernel-internal or uapi structures.
The data structures are described in Intel SDM section 37.6.
Intel-SIG: commit 70d3b8ddcd20 x86/sgx: Add SGX architectural data structures
Backport for SGX Foundations support

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-2-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/sgx/arch.h | 338 +++++++++++++++++++++++++++++++++
 1 file changed, 338 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/sgx/arch.h
diff --git a/arch/x86/kernel/cpu/sgx/arch.h b/arch/x86/kernel/cpu/sgx/arch.h
new file mode 100644
index 000000000000..dd7602c44c72
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/arch.h
@@ -0,0 +1,338 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/**
+ * Copyright(c) 2016-20 Intel Corporation.
+ *
+ * Contains data structures defined by the SGX architecture. Data structures
+ * defined by the Linux software stack should not be placed here.
+ */
+#ifndef _ASM_X86_SGX_ARCH_H
+#define _ASM_X86_SGX_ARCH_H
+
+#include <linux/bits.h>
+#include <linux/types.h>
+
+/* The SGX specific CPUID function. */
+#define SGX_CPUID		0x12
+/* EPC enumeration. */
+#define SGX_CPUID_EPC		2
+/* An invalid EPC section, i.e. the end marker. */
+#define SGX_CPUID_EPC_INVALID	0x0
+/* A valid EPC section. */
+#define SGX_CPUID_EPC_SECTION	0x1
+/* The bitmask for the EPC section type. */
+#define SGX_CPUID_EPC_MASK	GENMASK(3, 0)
+
+/**
+ * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
+ * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
+ *				been completed yet.
+ * %SGX_INVALID_EINITTOKEN:	EINITTOKEN is invalid and enclave signer's
+ *				public key does not match IA32_SGXLEPUBKEYHASH.
+ * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was received
+ */
+enum sgx_return_code {
+	SGX_NOT_TRACKED			= 11,
+	SGX_INVALID_EINITTOKEN		= 16,
+	SGX_UNMASKED_EVENT		= 128,
+};
+
+/* The modulus size for 3072-bit RSA keys. */
+#define SGX_MODULUS_SIZE 384
+
+/**
+ * enum sgx_miscselect - additional information to an SSA frame
+ * %SGX_MISC_EXINFO:	Report #PF or #GP to the SSA frame.
+ *
+ * Save State Area (SSA) is a stack inside the enclave used to store processor
+ * state when an exception or interrupt occurs. This enum defines additional
+ * information stored to an SSA frame.
+ */
+enum sgx_miscselect {
+	SGX_MISC_EXINFO		= BIT(0),
+};
+
+#define SGX_MISC_RESERVED_MASK	GENMASK_ULL(63, 1)
+
+#define SGX_SSA_GPRS_SIZE		184
+#define SGX_SSA_MISC_EXINFO_SIZE	16
+
+/**
+ * enum sgx_attributes - the attributes field in &struct sgx_secs
+ * %SGX_ATTR_INIT:		Enclave can be entered (is initialized).
+ * %SGX_ATTR_DEBUG:		Allow ENCLS(EDBGRD) and ENCLS(EDBGWR).
+ * %SGX_ATTR_MODE64BIT:		Tell that this a 64-bit enclave.
+ * %SGX_ATTR_PROVISIONKEY:	Allow to use provisioning keys for remote
+ *				attestation.
+ * %SGX_ATTR_KSS:		Allow to use key separation and sharing (KSS).
+ * %SGX_ATTR_EINITTOKENKEY:	Allow to use token signing key that is used to
+ *				sign cryptographic tokens that can be passed to
+ *				EINIT as an authorization to run an enclave.
+ */
+enum sgx_attribute {
+	SGX_ATTR_INIT		= BIT(0),
+	SGX_ATTR_DEBUG		= BIT(1),
+	SGX_ATTR_MODE64BIT	= BIT(2),
+	SGX_ATTR_PROVISIONKEY	= BIT(4),
+	SGX_ATTR_EINITTOKENKEY	= BIT(5),
+	SGX_ATTR_KSS		= BIT(7),
+};
+
+#define SGX_ATTR_RESERVED_MASK	(BIT_ULL(3) | BIT_ULL(6) | GENMASK_ULL(63, 8))
+
+/**
+ * struct sgx_secs - SGX Enclave Control Structure (SECS)
+ * @size:		size of the address space
+ * @base:		base address of the address space
+ * @ssa_frame_size:	size of an SSA frame
+ * @miscselect:		additional information stored to an SSA frame
+ * @attributes:		attributes for enclave
+ * @xfrm:		XSave-Feature Request Mask (subset of XCR0)
+ * @mrenclave:		SHA256-hash of the enclave contents
+ * @mrsigner:		SHA256-hash of the public key used to sign the SIGSTRUCT
+ * @config_id:		a user-defined value that is used in key derivation
+ * @isv_prod_id:	a user-defined value that is used in key derivation
+ * @isv_svn:		a user-defined value that is used in key derivation
+ * @config_svn:		a user-defined value that is used in key derivation
+ *
+ * SGX Enclave Control Structure (SECS) is a special enclave page that is not
+ * visible in the address space. In fact, this structure defines the address
+ * range and other global attributes for the enclave and it is the first EPC
+ * page created for any enclave. It is moved from a temporary buffer to an EPC
+ * by the means of ENCLS[ECREATE] function.
+ */
+struct sgx_secs {
+	u64 size;
+	u64 base;
+	u32 ssa_frame_size;
+	u32 miscselect;
+	u8  reserved1[24];
+	u64 attributes;
+	u64 xfrm;
+	u32 mrenclave[8];
+	u8  reserved2[32];
+	u32 mrsigner[8];
+	u8  reserved3[32];
+	u32 config_id[16];
+	u16 isv_prod_id;
+	u16 isv_svn;
+	u16 config_svn;
+	u8  reserved4[3834];
+} __packed;
+
+/**
+ * enum sgx_tcs_flags - execution flags for TCS
+ * %SGX_TCS_DBGOPTIN:	If enabled allows single-stepping and breakpoints
+ *			inside an enclave. It is cleared by EADD but can
+ *			be set later with EDBGWR.
+ */
+enum sgx_tcs_flags {
+	SGX_TCS_DBGOPTIN	= 0x01,
+};
+
+#define SGX_TCS_RESERVED_MASK	GENMASK_ULL(63, 1)
+#define SGX_TCS_RESERVED_SIZE	4024
+
+/**
+ * struct sgx_tcs - Thread Control Structure (TCS)
+ * @state:		used to mark an entered TCS
+ * @flags:		execution flags (cleared by EADD)
+ * @ssa_offset:		SSA stack offset relative to the enclave base
+ * @ssa_index:		the current SSA frame index (cleard by EADD)
+ * @nr_ssa_frames:	the number of frame in the SSA stack
+ * @entry_offset:	entry point offset relative to the enclave base
+ * @exit_addr:		address outside the enclave to exit on an exception or
+ *			interrupt
+ * @fs_offset:		offset relative to the enclave base to become FS
+ *			segment inside the enclave
+ * @gs_offset:		offset relative to the enclave base to become GS
+ *			segment inside the enclave
+ * @fs_limit:		size to become a new FS-limit (only 32-bit enclaves)
+ * @gs_limit:		size to become a new GS-limit (only 32-bit enclaves)
+ *
+ * Thread Control Structure (TCS) is an enclave page visible in its address
+ * space that defines an entry point inside the enclave. A thread enters inside
+ * an enclave by supplying address of TCS to ENCLU(EENTER). A TCS can be entered
+ * by only one thread at a time.
+ */
+struct sgx_tcs {
+	u64 state;
+	u64 flags;
+	u64 ssa_offset;
+	u32 ssa_index;
+	u32 nr_ssa_frames;
+	u64 entry_offset;
+	u64 exit_addr;
+	u64 fs_offset;
+	u64 gs_offset;
+	u32 fs_limit;
+	u32 gs_limit;
+	u8  reserved[SGX_TCS_RESERVED_SIZE];
+} __packed;
+
+/**
+ * struct sgx_pageinfo - an enclave page descriptor
+ * @addr:	address of the enclave page
+ * @contents:	pointer to the page contents
+ * @metadata:	pointer either to a SECINFO or PCMD instance
+ * @secs:	address of the SECS page
+ */
+struct sgx_pageinfo {
+	u64 addr;
+	u64 contents;
+	u64 metadata;
+	u64 secs;
+} __packed __aligned(32);
+
+
+/**
+ * enum sgx_page_type - bits in the SECINFO flags defining the page type
+ * %SGX_PAGE_TYPE_SECS:	a SECS page
+ * %SGX_PAGE_TYPE_TCS:	a TCS page
+ * %SGX_PAGE_TYPE_REG:	a regular page
+ * %SGX_PAGE_TYPE_VA:	a VA page
+ * %SGX_PAGE_TYPE_TRIM:	a page in trimmed state
+ */
+enum sgx_page_type {
+	SGX_PAGE_TYPE_SECS,
+	SGX_PAGE_TYPE_TCS,
+	SGX_PAGE_TYPE_REG,
+	SGX_PAGE_TYPE_VA,
+	SGX_PAGE_TYPE_TRIM,
+};
+
+#define SGX_NR_PAGE_TYPES	5
+#define SGX_PAGE_TYPE_MASK	GENMASK(7, 0)
+
+/**
+ * enum sgx_secinfo_flags - the flags field in &struct sgx_secinfo
+ * %SGX_SECINFO_R:	allow read
+ * %SGX_SECINFO_W:	allow write
+ * %SGX_SECINFO_X:	allow execution
+ * %SGX_SECINFO_SECS:	a SECS page
+ * %SGX_SECINFO_TCS:	a TCS page
+ * %SGX_SECINFO_REG:	a regular page
+ * %SGX_SECINFO_VA:	a VA page
+ * %SGX_SECINFO_TRIM:	a page in trimmed state
+ */
+enum sgx_secinfo_flags {
+	SGX_SECINFO_R		= BIT(0),
+	SGX_SECINFO_W		= BIT(1),
+	SGX_SECINFO_X		= BIT(2),
+	SGX_SECINFO_SECS	= (SGX_PAGE_TYPE_SECS << 8),
+	SGX_SECINFO_TCS		= (SGX_PAGE_TYPE_TCS << 8),
+	SGX_SECINFO_REG		= (SGX_PAGE_TYPE_REG << 8),
+	SGX_SECINFO_VA		= (SGX_PAGE_TYPE_VA << 8),
+	SGX_SECINFO_TRIM	= (SGX_PAGE_TYPE_TRIM << 8),
+};
+
+#define SGX_SECINFO_PERMISSION_MASK	GENMASK_ULL(2, 0)
+#define SGX_SECINFO_PAGE_TYPE_MASK	(SGX_PAGE_TYPE_MASK << 8)
+#define SGX_SECINFO_RESERVED_MASK	~(SGX_SECINFO_PERMISSION_MASK | \
+					  SGX_SECINFO_PAGE_TYPE_MASK)
+
+/**
+ * struct sgx_secinfo - describes attributes of an EPC page
+ * @flags:	permissions and type
+ *
+ * Used together with ENCLS leaves that add or modify an EPC page to an
+ * enclave to define page permissions and type.
+ */
+struct sgx_secinfo {
+	u64 flags;
+	u8  reserved[56];
+} __packed __aligned(64);
+
+#define SGX_PCMD_RESERVED_SIZE 40
+
+/**
+ * struct sgx_pcmd - Paging Crypto Metadata (PCMD)
+ * @enclave_id:	enclave identifier
+ * @mac:	MAC over PCMD, page contents and isvsvn
+ *
+ * PCMD is stored for every swapped page to the regular memory. When ELDU loads
+ * the page back it recalculates the MAC by using a isvsvn number stored in a
+ * VA page. Together these two structures bring integrity and rollback
+ * protection.
+ */
+struct sgx_pcmd {
+	struct sgx_secinfo secinfo;
+	u64 enclave_id;
+	u8  reserved[SGX_PCMD_RESERVED_SIZE];
+	u8  mac[16];
+} __packed __aligned(128);
+
+#define SGX_SIGSTRUCT_RESERVED1_SIZE 84
+#define SGX_SIGSTRUCT_RESERVED2_SIZE 20
+#define SGX_SIGSTRUCT_RESERVED3_SIZE 32
+#define SGX_SIGSTRUCT_RESERVED4_SIZE 12
+
+/**
+ * struct sgx_sigstruct_header - defines author of the enclave
+ * @header1:		constant byte string
+ * @vendor:		must be either 0x0000 or 0x8086
+ * @date:		YYYYMMDD in BCD
+ * @header2:		costant byte string
+ * @swdefined:		software defined value
+ */
+struct sgx_sigstruct_header {
+	u64 header1[2];
+	u32 vendor;
+	u32 date;
+	u64 header2[2];
+	u32 swdefined;
+	u8  reserved1[84];
+} __packed;
+
+/**
+ * struct sgx_sigstruct_body - defines contents of the enclave
+ * @miscselect:		additional information stored to an SSA frame
+ * @misc_mask:		required miscselect in SECS
+ * @attributes:		attributes for enclave
+ * @xfrm:		XSave-Feature Request Mask (subset of XCR0)
+ * @attributes_mask:	required attributes in SECS
+ * @xfrm_mask:		required XFRM in SECS
+ * @mrenclave:		SHA256-hash of the enclave contents
+ * @isvprodid:		a user-defined value that is used in key derivation
+ * @isvsvn:		a user-defined value that is used in key derivation
+ */
+struct sgx_sigstruct_body {
+	u32 miscselect;
+	u32 misc_mask;
+	u8  reserved2[20];
+	u64 attributes;
+	u64 xfrm;
+	u64 attributes_mask;
+	u64 xfrm_mask;
+	u8  mrenclave[32];
+	u8  reserved3[32];
+	u16 isvprodid;
+	u16 isvsvn;
+} __packed;
+
+/**
+ * struct sgx_sigstruct - an enclave signature
+ * @header:		defines author of the enclave
+ * @modulus:		the modulus of the public key
+ * @exponent:		the exponent of the public key
+ * @signature:		the signature calculated over the fields except modulus,
+ * @body:		defines contents of the enclave
+ * @q1:			a value used in RSA signature verification
+ * @q2:			a value used in RSA signature verification
+ *
+ * Header and body are the parts that are actual signed. The remaining fields
+ * define the signature of the enclave.
+ */
+struct sgx_sigstruct {
+	struct sgx_sigstruct_header header;
+	u8  modulus[SGX_MODULUS_SIZE];
+	u32 exponent;
+	u8  signature[SGX_MODULUS_SIZE];
+	struct sgx_sigstruct_body body;
+	u8  reserved4[12];
+	u8  q1[SGX_MODULUS_SIZE];
+	u8  q2[SGX_MODULUS_SIZE];
+} __packed;
+
+#define SGX_LAUNCH_TOKEN_SIZE 304
+
+#endif /* _ASM_X86_SGX_ARCH_H */
From: Jarkko Sakkinen <jarkko@kernel.org>

mainline inclusion
from mainline-v5.11-rc1
commit 2c273671d0dfcf89c9c8a319ed093406e3ff665c
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
ENCLS is the privileged instruction that wraps virtually all of the kernel-side SGX functionality for managing enclaves. It is essentially the ioctl() of instructions, with each leaf function implementing different SGX-related functionality.
Add macros to wrap the ENCLS functionality. There are two main groups, one for functions which do not return error codes and a "ret_" set for those that do.
ENCLS functions are documented in Intel SDM section 36.6.
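The three-way return encoding is easiest to see in isolation. A minimal user-space demonstration of the ENCLS_FAULT_FLAG scheme, with the constant values copied from the hunk below (the program itself is purely illustrative):

  #include <stdio.h>

  #define ENCLS_FAULT_FLAG 0x40000000
  #define ENCLS_TRAPNR(r)  ((r) & ~ENCLS_FAULT_FLAG)

  int main(void)
  {
  	int fault = ENCLS_FAULT_FLAG | 14;	/* encoded #PF (trap 14)  */
  	int sgx_err = 11;			/* SGX_NOT_TRACKED        */

  	printf("fault? %d  trapnr %d\n",
  	       !!(fault & ENCLS_FAULT_FLAG), ENCLS_TRAPNR(fault));
  	printf("fault? %d  code   %d\n",
  	       !!(sgx_err & ENCLS_FAULT_FLAG), sgx_err);
  	return 0;
  }

Negative values remain available for ordinary -errno returns, since bit 31 distinguishes them from both positive groups.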
Intel-SIG: commit 2c273671d0df x86/sgx: Add wrappers for ENCLS functions
Backport for SGX Foundations support

Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-3-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/sgx/encls.h | 231 ++++++++++++++++++++++++++++++++
 1 file changed, 231 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/sgx/encls.h
diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
new file mode 100644
index 000000000000..443188fe7e70
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -0,0 +1,231 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _X86_ENCLS_H
+#define _X86_ENCLS_H
+
+#include <linux/bitops.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/rwsem.h>
+#include <linux/types.h>
+#include <asm/asm.h>
+#include <asm/traps.h>
+#include "sgx.h"
+
+enum sgx_encls_function {
+	ECREATE	= 0x00,
+	EADD	= 0x01,
+	EINIT	= 0x02,
+	EREMOVE	= 0x03,
+	EDGBRD	= 0x04,
+	EDGBWR	= 0x05,
+	EEXTEND	= 0x06,
+	ELDU	= 0x08,
+	EBLOCK	= 0x09,
+	EPA	= 0x0A,
+	EWB	= 0x0B,
+	ETRACK	= 0x0C,
+};
+
+/**
+ * ENCLS_FAULT_FLAG - flag signifying an ENCLS return code is a trapnr
+ *
+ * ENCLS has its own (positive value) error codes and also generates
+ * ENCLS specific #GP and #PF faults.  And the ENCLS values get munged
+ * with system error codes as everything percolates back up the stack.
+ * Unfortunately (for us), we need to precisely identify each unique
+ * error code, e.g. the action taken if EWB fails varies based on the
+ * type of fault and on the exact SGX error code, i.e. we can't simply
+ * convert all faults to -EFAULT.
+ *
+ * To make all three error types coexist, we set bit 30 to identify an
+ * ENCLS fault.  Bit 31 (technically bits N:31) is used to differentiate
+ * between positive (faults and SGX error codes) and negative (system
+ * error codes) values.
+ */
+#define ENCLS_FAULT_FLAG 0x40000000
+
+/* Retrieve the encoded trapnr from the specified return code. */
+#define ENCLS_TRAPNR(r) ((r) & ~ENCLS_FAULT_FLAG)
+
+/* Issue a WARN() about an ENCLS function. */
+#define ENCLS_WARN(r, name) {						  \
+	do {								  \
+		int _r = (r);						  \
+		WARN_ONCE(_r, "%s returned %d (0x%x)\n", (name), _r, _r); \
+	} while (0);							  \
+}
+
+/**
+ * encls_failed() - Check if an ENCLS function failed
+ * @ret:	the return value of an ENCLS function call
+ *
+ * Check if an ENCLS function failed. This happens when the function causes a
+ * fault that is not caused by an EPCM conflict or when the function returns a
+ * non-zero value.
+ */
+static inline bool encls_failed(int ret)
+{
+	if (ret & ENCLS_FAULT_FLAG)
+		return ENCLS_TRAPNR(ret) != X86_TRAP_PF;
+
+	return !!ret;
+}
+
+/**
+ * __encls_ret_N - encode an ENCLS function that returns an error code in EAX
+ * @rax:	function number
+ * @inputs:	asm inputs for the function
+ *
+ * Emit assembly for an ENCLS function that returns an error code, e.g. EREMOVE.
+ * And because SGX isn't complex enough as it is, function that return an error
+ * code also modify flags.
+ *
+ * Return:
+ *	0 on success,
+ *	SGX error code on failure
+ */
+#define __encls_ret_N(rax, inputs...)				\
+	({							\
+	int ret;						\
+	asm volatile(						\
+	"1: .byte 0x0f, 0x01, 0xcf;\n\t"			\
+	"2:\n"							\
+	".section .fixup,\"ax\"\n"				\
+	"3: orl $"__stringify(ENCLS_FAULT_FLAG)",%%eax\n"	\
+	"   jmp 2b\n"						\
+	".previous\n"						\
+	_ASM_EXTABLE_FAULT(1b, 3b)				\
+	: "=a"(ret)						\
+	: "a"(rax), inputs					\
+	: "memory", "cc");					\
+	ret;							\
+	})
+
+#define __encls_ret_1(rax, rcx)		\
+	({				\
+	__encls_ret_N(rax, "c"(rcx));	\
+	})
+
+#define __encls_ret_2(rax, rbx, rcx)		\
+	({					\
+	__encls_ret_N(rax, "b"(rbx), "c"(rcx));	\
+	})
+
+#define __encls_ret_3(rax, rbx, rcx, rdx)			\
+	({							\
+	__encls_ret_N(rax, "b"(rbx), "c"(rcx), "d"(rdx));	\
+	})
+
+/**
+ * __encls_N - encode an ENCLS function that doesn't return an error code
+ * @rax:	function number
+ * @rbx_out:	optional output variable
+ * @inputs:	asm inputs for the function
+ *
+ * Emit assembly for an ENCLS function that does not return an error code, e.g.
+ * ECREATE. Leaves without error codes either succeed or fault. @rbx_out is an
+ * optional parameter for use by EDGBRD, which returns the requested value in
+ * RBX.
+ *
+ * Return:
+ *	0 on success,
+ *	trapnr with ENCLS_FAULT_FLAG set on fault
+ */
+#define __encls_N(rax, rbx_out, inputs...)			\
+	({							\
+	int ret;						\
+	asm volatile(						\
+	"1: .byte 0x0f, 0x01, 0xcf;\n\t"			\
+	"   xor %%eax,%%eax;\n"					\
+	"2:\n"							\
+	".section .fixup,\"ax\"\n"				\
+	"3: orl $"__stringify(ENCLS_FAULT_FLAG)",%%eax\n"	\
+	"   jmp 2b\n"						\
+	".previous\n"						\
+	_ASM_EXTABLE_FAULT(1b, 3b)				\
+	: "=a"(ret), "=b"(rbx_out)				\
+	: "a"(rax), inputs					\
+	: "memory");						\
+	ret;							\
+	})
+
+#define __encls_2(rax, rbx, rcx)				\
+	({							\
+	unsigned long ign_rbx_out;				\
+	__encls_N(rax, ign_rbx_out, "b"(rbx), "c"(rcx));	\
+	})
+
+#define __encls_1_1(rax, data, rcx)			\
+	({						\
+	unsigned long rbx_out;				\
+	int ret = __encls_N(rax, rbx_out, "c"(rcx));	\
+	if (!ret)					\
+		data = rbx_out;				\
+	ret;						\
+	})
+
+static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
+{
+	return __encls_2(ECREATE, pginfo, secs);
+}
+
+static inline int __eextend(void *secs, void *addr)
+{
+	return __encls_2(EEXTEND, secs, addr);
+}
+
+static inline int __eadd(struct sgx_pageinfo *pginfo, void *addr)
+{
+	return __encls_2(EADD, pginfo, addr);
+}
+
+static inline int __einit(void *sigstruct, void *token, void *secs)
+{
+	return __encls_ret_3(EINIT, sigstruct, secs, token);
+}
+
+static inline int __eremove(void *addr)
+{
+	return __encls_ret_1(EREMOVE, addr);
+}
+
+static inline int __edbgwr(void *addr, unsigned long *data)
+{
+	return __encls_2(EDGBWR, *data, addr);
+}
+
+static inline int __edbgrd(void *addr, unsigned long *data)
+{
+	return __encls_1_1(EDGBRD, *data, addr);
+}
+
+static inline int __etrack(void *addr)
+{
+	return __encls_ret_1(ETRACK, addr);
+}
+
+static inline int __eldu(struct sgx_pageinfo *pginfo, void *addr,
+			 void *va)
+{
+	return __encls_ret_3(ELDU, pginfo, addr, va);
+}
+
+static inline int __eblock(void *addr)
+{
+	return __encls_ret_1(EBLOCK, addr);
+}
+
+static inline int __epa(void *addr)
+{
+	unsigned long rbx = SGX_PAGE_TYPE_VA;
+
+	return __encls_2(EPA, rbx, addr);
+}
+
+static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
+			void *va)
+{
+	return __encls_ret_3(EWB, pginfo, addr, va);
+}
+
+#endif /* _X86_ENCLS_H */
From: Sean Christopherson <sean.j.christopherson@intel.com>

mainline inclusion
from mainline-v5.11-rc1
commit e7b6385b01d8e9fb7a97887c3ea649abb95bb8c8
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
Populate X86_FEATURE_SGX feature from CPUID and tie it to the Kconfig option with disabled-features.h.
IA32_FEATURE_CONTROL.SGX_ENABLE must be examined in addition to the CPUID bits to enable full SGX support. The BIOS must both set this bit and lock IA32_FEATURE_CONTROL for SGX to be supported (Intel SDM section 36.7.1). The setting or clearing of this bit has no impact on the CPUID bits above, which is why it needs to be detected separately.
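For reference, the same two-step check can be performed from user space; a sketch, assuming root and the msr module so that /dev/cpu/0/msr is readable (the bit positions are taken from this patch; everything else is illustrative):

  #include <cpuid.h>
  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <unistd.h>

  #define MSR_IA32_FEAT_CTL	0x3a
  #define FEAT_CTL_LOCKED	(1ULL << 0)
  #define FEAT_CTL_SGX_ENABLED	(1ULL << 18)

  int main(void)
  {
  	unsigned int eax, ebx, ecx, edx;
  	uint64_t msr;
  	int fd;

  	/* CPUID.(EAX=7,ECX=0):EBX[2] enumerates SGX. */
  	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) ||
  	    !(ebx & (1u << 2))) {
  		puts("no SGX CPUID bit");
  		return 1;
  	}

  	fd = open("/dev/cpu/0/msr", O_RDONLY);
  	if (fd < 0 || pread(fd, &msr, 8, MSR_IA32_FEAT_CTL) != 8) {
  		puts("cannot read IA32_FEATURE_CONTROL");
  		return 1;
  	}

  	printf("SGX %s by firmware\n",
  	       (msr & FEAT_CTL_LOCKED) && (msr & FEAT_CTL_SGX_ENABLED) ?
  	       "enabled and locked" : "not enabled/locked");
  	return 0;
  }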
Intel-SIG: commit e7b6385b01d8 x86/cpufeatures: Add Intel SGX hardware bits
Backport for SGX Foundations support

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-4-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/include/asm/cpufeatures.h       | 1 +
 arch/x86/include/asm/disabled-features.h | 8 +++++++-
 arch/x86/include/asm/msr-index.h         | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dad350d42ecf..1181f5c7bbef 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -241,6 +241,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
 #define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
 #define X86_FEATURE_TSC_ADJUST		( 9*32+ 1) /* TSC adjustment MSR 0x3B */
+#define X86_FEATURE_SGX			( 9*32+ 2) /* Software Guard Extensions */
 #define X86_FEATURE_BMI1		( 9*32+ 3) /* 1st group bit manipulation extensions */
 #define X86_FEATURE_HLE			( 9*32+ 4) /* Hardware Lock Elision */
 #define X86_FEATURE_AVX2		( 9*32+ 5) /* AVX2 instructions */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 09db5b8f1444..018654e65c60 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -59,6 +59,12 @@
 /* Force disable because it's broken beyond repair */
 #define DISABLE_ENQCMD	(1 << (X86_FEATURE_ENQCMD & 31))
 
+#ifdef CONFIG_X86_SGX
+# define DISABLE_SGX	0
+#else
+# define DISABLE_SGX	(1 << (X86_FEATURE_SGX & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -71,7 +77,7 @@
 #define DISABLED_MASK6	0
 #define DISABLED_MASK7	(DISABLE_PTI)
 #define DISABLED_MASK8	0
-#define DISABLED_MASK9	(DISABLE_SMAP)
+#define DISABLED_MASK9	(DISABLE_SMAP|DISABLE_SGX)
 #define DISABLED_MASK10	0
 #define DISABLED_MASK11	0
 #define DISABLED_MASK12	0
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index c36a083c8ec0..1a333c4d2079 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -610,6 +610,7 @@
 #define FEAT_CTL_LOCKED				BIT(0)
 #define FEAT_CTL_VMX_ENABLED_INSIDE_SMX		BIT(1)
 #define FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX	BIT(2)
+#define FEAT_CTL_SGX_ENABLED			BIT(18)
 #define FEAT_CTL_LMCE_ENABLED			BIT(20)
 
 #define MSR_IA32_TSC_ADJUST		0x0000003b
From: Sean Christopherson <sean.j.christopherson@intel.com>

mainline inclusion
from mainline-v5.11-rc1
commit d205e0f1426e0f99e2b4f387c49f2d8b66e129dd
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
The SGX Launch Control hardware helps restrict which enclaves the hardware will run. Launch control is intended to restrict what software can run with enclave protections, which helps protect the overall system from bad enclaves.
For the kernel's purposes, there are effectively two modes in which the launch control hardware can operate: rigid and flexible. In its rigid mode, an entity other than the kernel has ultimate authority over which enclaves can be run (firmware, Intel, etc...). In its flexible mode, the kernel has ultimate authority over which enclaves can run.
Enable X86_FEATURE_SGX_LC to enumerate when the CPU supports SGX Launch Control in general.
Add MSR_IA32_SGXLEPUBKEYHASH{0, 1, 2, 3}, which when combined contain a SHA256 hash of a 3072-bit RSA public key. The hardware allows SGX enclaves signed with this public key to initialize and run [*]. Enclaves not signed with this key can not initialize and run.
Add FEAT_CTL_SGX_LC_ENABLED, which informs whether the SGXLEPUBKEYHASH MSRs can be written by the kernel.
If the MSRs do not exist or are read-only, the launch control hardware is operating in rigid mode. Linux does not and will not support creating enclaves when hardware is configured in rigid mode because it takes away the authority for launch decisions from the kernel. Note, this does not preclude KVM from virtualizing/exposing SGX to a KVM guest when launch control hardware is operating in rigid mode.
[*] Intel SDM: 38.1.4 Intel SGX Launch Control Configuration
Intel-SIG: commit d205e0f1426e x86/{cpufeatures,msr}: Add Intel SGX Launch Control hardware bits
Backport for SGX Foundations support

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-5-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/msr-index.h   | 7 +++++++
 2 files changed, 8 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1181f5c7bbef..f5ef2d5b9231 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -357,6 +357,7 @@
 #define X86_FEATURE_MOVDIRI		(16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B		(16*32+28) /* MOVDIR64B instruction */
 #define X86_FEATURE_ENQCMD		(16*32+29) /* ENQCMD and ENQCMDS instructions */
+#define X86_FEATURE_SGX_LC		(16*32+30) /* Software Guard Extensions Launch Control */
 
 /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
 #define X86_FEATURE_OVERFLOW_RECOV	(17*32+ 0) /* MCA overflow recovery support */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 1a333c4d2079..5de2040b73a7 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -610,6 +610,7 @@
 #define FEAT_CTL_LOCKED				BIT(0)
 #define FEAT_CTL_VMX_ENABLED_INSIDE_SMX		BIT(1)
 #define FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX	BIT(2)
+#define FEAT_CTL_SGX_LC_ENABLED			BIT(17)
 #define FEAT_CTL_SGX_ENABLED			BIT(18)
 #define FEAT_CTL_LMCE_ENABLED			BIT(20)
 
@@ -630,6 +631,12 @@
 #define MSR_IA32_UCODE_WRITE		0x00000079
 #define MSR_IA32_UCODE_REV		0x0000008b
 
+/* Intel SGX Launch Enclave Public Key Hash MSRs */
+#define MSR_IA32_SGXLEPUBKEYHASH0	0x0000008C
+#define MSR_IA32_SGXLEPUBKEYHASH1	0x0000008D
+#define MSR_IA32_SGXLEPUBKEYHASH2	0x0000008E
+#define MSR_IA32_SGXLEPUBKEYHASH3	0x0000008F
+
 #define MSR_IA32_SMM_MONITOR_CTL	0x0000009b
 #define MSR_IA32_SMBASE			0x0000009e
From: Sean Christopherson <sean.j.christopherson@intel.com>

mainline inclusion
from mainline-v5.11-rc1
commit e7e0545299d8cb0fd6fe3ba50401b7f5c3937362
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
Although carved out of normal DRAM, enclave memory is marked in the system memory map as reserved and is not managed by the core mm. There may be several regions spread across the system. Each contiguous region is called an Enclave Page Cache (EPC) section. EPC sections are enumerated via CPUID.
Enclave pages can only be accessed when they are mapped as part of an enclave, by a hardware thread running inside the enclave.
Parse CPUID data, create metadata for EPC pages and populate a simple EPC page allocator. Although much smaller, 'struct sgx_epc_page' metadata is the SGX analog of the core mm 'struct page'.
Similar to how the core mm's page->flags encode zone and NUMA information, embed the EPC section index into the first eight bits of sgx_epc_page->desc. This allows a quick reverse lookup from EPC page to EPC section. Existing client hardware supports only a single section, while upcoming server hardware will support at most eight sections. Thus, eight bits should be enough for long term needs.
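A sketch of the descriptor encoding described above (the names are hypothetical; the hunk below ultimately stores the index in a plain 'section' field, but the reverse lookup works the same way):

  /*
   * EPC pages are 4K-aligned, so the low 12 bits of their physical
   * address are free; the section index needs only the low 8 of them.
   */
  #define EPC_SECTION_MASK 0xffUL

  static unsigned long epc_desc(unsigned long pa, unsigned int section_index)
  {
  	return pa | section_index;	/* pa must be 4K-aligned */
  }

  static unsigned int epc_section_of(unsigned long desc)
  {
  	return desc & EPC_SECTION_MASK;
  }

Either way, a single array lookup (sgx_epc_sections[index]) recovers the owning section from a page without storing a back-pointer in every page.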
Intel-SIG: commit e7e0545299d8 x86/sgx: Initialize metadata for Enclave Page Cache (EPC) sections
Backport for SGX Foundations support

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Serge Ayoun <serge.ayoun@intel.com>
Signed-off-by: Serge Ayoun <serge.ayoun@intel.com>
Co-developed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-6-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/Kconfig                 |  17 +++
 arch/x86/kernel/cpu/Makefile     |   1 +
 arch/x86/kernel/cpu/sgx/Makefile |   2 +
 arch/x86/kernel/cpu/sgx/main.c   | 190 +++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/sgx.h    |  60 ++++++++++
 5 files changed, 270 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/sgx/Makefile
 create mode 100644 arch/x86/kernel/cpu/sgx/main.c
 create mode 100644 arch/x86/kernel/cpu/sgx/sgx.h
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1d3176a41a29..5cccb82ca62f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1959,6 +1959,23 @@ config X86_INTEL_TSX_MODE_AUTO
 	  side channel attacks- equals the tsx=auto command line parameter.
 endchoice
 
+config X86_SGX
+	bool "Software Guard eXtensions (SGX)"
+	depends on X86_64 && CPU_SUP_INTEL
+	depends on CRYPTO=y
+	depends on CRYPTO_SHA256=y
+	select SRCU
+	select MMU_NOTIFIER
+	help
+	  Intel(R) Software Guard eXtensions (SGX) is a set of CPU instructions
+	  that can be used by applications to set aside private regions of code
+	  and data, referred to as enclaves. An enclave's private memory can
+	  only be accessed by code running within the enclave. Accesses from
+	  outside the enclave, including other enclaves, are disallowed by
+	  hardware.
+
+	  If unsure, say N.
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 93792b457b81..637b499450d1 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -48,6 +48,7 @@ obj-$(CONFIG_X86_MCE)			+= mce/
 obj-$(CONFIG_MTRR)			+= mtrr/
 obj-$(CONFIG_MICROCODE)			+= microcode/
 obj-$(CONFIG_X86_CPU_RESCTRL)		+= resctrl/
+obj-$(CONFIG_X86_SGX)			+= sgx/
 
 obj-$(CONFIG_X86_LOCAL_APIC)		+= perfctr-watchdog.o
 
diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile
new file mode 100644
index 000000000000..79510ce01b3b
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/Makefile
@@ -0,0 +1,2 @@
+obj-y += \
+	main.o
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
new file mode 100644
index 000000000000..187a237eec38
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -0,0 +1,190 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016-20 Intel Corporation. */
+
+#include <linux/freezer.h>
+#include <linux/highmem.h>
+#include <linux/kthread.h>
+#include <linux/pagemap.h>
+#include <linux/ratelimit.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/signal.h>
+#include <linux/slab.h>
+#include "encls.h"
+
+struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
+static int sgx_nr_epc_sections;
+static struct task_struct *ksgxd_tsk;
+
+/*
+ * Reset dirty EPC pages to uninitialized state. Laundry can be left with SECS
+ * pages whose child pages blocked EREMOVE.
+ */
+static void sgx_sanitize_section(struct sgx_epc_section *section)
+{
+	struct sgx_epc_page *page;
+	LIST_HEAD(dirty);
+	int ret;
+
+	while (!list_empty(&section->laundry_list)) {
+		if (kthread_should_stop())
+			return;
+
+		spin_lock(&section->lock);
+
+		page = list_first_entry(&section->laundry_list,
+					struct sgx_epc_page, list);
+
+		ret = __eremove(sgx_get_epc_virt_addr(page));
+		if (!ret)
+			list_move(&page->list, &section->page_list);
+		else
+			list_move_tail(&page->list, &dirty);
+
+		spin_unlock(&section->lock);
+
+		cond_resched();
+	}
+
+	list_splice(&dirty, &section->laundry_list);
+}
+
+static int ksgxd(void *p)
+{
+	int i;
+
+	set_freezable();
+
+	/*
+	 * Sanitize pages in order to recover from kexec(). The 2nd pass is
+	 * required for SECS pages, whose child pages blocked EREMOVE.
+	 */
+	for (i = 0; i < sgx_nr_epc_sections; i++)
+		sgx_sanitize_section(&sgx_epc_sections[i]);
+
+	for (i = 0; i < sgx_nr_epc_sections; i++) {
+		sgx_sanitize_section(&sgx_epc_sections[i]);
+
+		/* Should never happen. */
+		if (!list_empty(&sgx_epc_sections[i].laundry_list))
+			WARN(1, "EPC section %d has unsanitized pages.\n", i);
+	}
+
+	return 0;
+}
+
+static bool __init sgx_page_reclaimer_init(void)
+{
+	struct task_struct *tsk;
+
+	tsk = kthread_run(ksgxd, NULL, "ksgxd");
+	if (IS_ERR(tsk))
+		return false;
+
+	ksgxd_tsk = tsk;
+
+	return true;
+}
+
+static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
+					 unsigned long index,
+					 struct sgx_epc_section *section)
+{
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	unsigned long i;
+
+	section->virt_addr = memremap(phys_addr, size, MEMREMAP_WB);
+	if (!section->virt_addr)
+		return false;
+
+	section->pages = vmalloc(nr_pages * sizeof(struct sgx_epc_page));
+	if (!section->pages) {
+		memunmap(section->virt_addr);
+		return false;
+	}
+
+	section->phys_addr = phys_addr;
+	spin_lock_init(&section->lock);
+	INIT_LIST_HEAD(&section->page_list);
+	INIT_LIST_HEAD(&section->laundry_list);
+
+	for (i = 0; i < nr_pages; i++) {
+		section->pages[i].section = index;
+		list_add_tail(&section->pages[i].list, &section->laundry_list);
+	}
+
+	return true;
+}
+
+/**
+ * A section metric is concatenated in a way that @low bits 12-31 define the
+ * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the
+ * metric.
+ */
+static inline u64 __init sgx_calc_section_metric(u64 low, u64 high)
+{
+	return (low & GENMASK_ULL(31, 12)) +
+	       ((high & GENMASK_ULL(19, 0)) << 32);
+}
+
+static bool __init sgx_page_cache_init(void)
+{
+	u32 eax, ebx, ecx, edx, type;
+	u64 pa, size;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(sgx_epc_sections); i++) {
+		cpuid_count(SGX_CPUID, i + SGX_CPUID_EPC, &eax, &ebx, &ecx, &edx);
+
+		type = eax & SGX_CPUID_EPC_MASK;
+		if (type == SGX_CPUID_EPC_INVALID)
+			break;
+
+		if (type != SGX_CPUID_EPC_SECTION) {
+			pr_err_once("Unknown EPC section type: %u\n", type);
+			break;
+		}
+
+		pa = sgx_calc_section_metric(eax, ebx);
+		size = sgx_calc_section_metric(ecx, edx);
+
+		pr_info("EPC section 0x%llx-0x%llx\n", pa, pa + size - 1);
+
+		if (!sgx_setup_epc_section(pa, size, i, &sgx_epc_sections[i])) {
+			pr_err("No free memory for an EPC section\n");
+			break;
+		}
+
+		sgx_nr_epc_sections++;
+	}
+
+	if (!sgx_nr_epc_sections) {
+		pr_err("There are zero EPC sections.\n");
+		return false;
+	}
+
+	return true;
+}
+
+static void __init sgx_init(void)
+{
+	int i;
+
+	if (!boot_cpu_has(X86_FEATURE_SGX))
+		return;
+
+	if (!sgx_page_cache_init())
+		return;
+
+	if (!sgx_page_reclaimer_init())
+		goto err_page_cache;
+
+	return;
+
+err_page_cache:
+	for (i = 0; i < sgx_nr_epc_sections; i++) {
+		vfree(sgx_epc_sections[i].pages);
+		memunmap(sgx_epc_sections[i].virt_addr);
+	}
+}
+
+device_initcall(sgx_init);
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
new file mode 100644
index 000000000000..02afa84dd8fd
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _X86_SGX_H
+#define _X86_SGX_H
+
+#include <linux/bitops.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/rwsem.h>
+#include <linux/types.h>
+#include <asm/asm.h>
+#include "arch.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt) "sgx: " fmt
+
+#define SGX_MAX_EPC_SECTIONS	8
+
+struct sgx_epc_page {
+	unsigned int section;
+	struct list_head list;
+};
+
+/*
+ * The firmware can define multiple chunks of EPC to the different areas of the
+ * physical memory e.g. for memory areas of the each node. This structure is
+ * used to store EPC pages for one EPC section and virtual memory area where
+ * the pages have been mapped.
+ */
+struct sgx_epc_section {
+	unsigned long phys_addr;
+	void *virt_addr;
+	struct list_head page_list;
+	struct list_head laundry_list;
+	struct sgx_epc_page *pages;
+	spinlock_t lock;
+};
+
+extern struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
+
+static inline unsigned long sgx_get_epc_phys_addr(struct sgx_epc_page *page)
+{
+	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
+	unsigned long index;
+
+	index = ((unsigned long)page - (unsigned long)section->pages) / sizeof(*page);
+
+	return section->phys_addr + index * PAGE_SIZE;
+}
+
+static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page)
+{
+	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
+	unsigned long index;
+
+	index = ((unsigned long)page - (unsigned long)section->pages) / sizeof(*page);
+
+	return section->virt_addr + index * PAGE_SIZE;
+}
+
+#endif /* _X86_SGX_H */
From: Sean Christopherson <sean.j.christopherson@intel.com>

mainline inclusion
from mainline-v5.11-rc1
commit 224ab3527f89f69ae57dc53555826667ac46a3cc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
Kernel support for SGX is ultimately decided by the state of the launch control bits in the feature control MSR (MSR_IA32_FEAT_CTL). If the hardware supports SGX, but neglects to support flexible launch control, the kernel will not enable SGX.
Enable SGX at feature control MSR initialization and update the associated X86_FEATURE flags accordingly. Disable X86_FEATURE_SGX (and all derivatives) if the kernel is not able to establish itself as the authority over SGX Launch Control.
All checks are performed for each logical CPU (not just boot CPU) in order to verify that MSR_IA32_FEATURE_CONTROL is correctly configured on all CPUs. All SGX code in this series expects the same configuration from all CPUs.
This differs from VMX, where X86_FEATURE_VMX is intentionally cleared only for the current CPU so that KVM can provide additional information if it fails to load, such as which CPU doesn't support VMX. There's not much the kernel or an administrator can do to fix the situation, so SGX neglects to convey additional details about these kinds of failures if they occur.
Intel-SIG: commit 224ab3527f89 x86/cpu/intel: Detect SGX support
Backport for SGX Foundations support

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-8-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/feat_ctl.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
index 29a3bedabd06..d38e97325018 100644
--- a/arch/x86/kernel/cpu/feat_ctl.c
+++ b/arch/x86/kernel/cpu/feat_ctl.c
@@ -93,16 +93,32 @@ static void init_vmx_capabilities(struct cpuinfo_x86 *c)
 }
 #endif /* CONFIG_X86_VMX_FEATURE_NAMES */
 
+static void clear_sgx_caps(void)
+{
+	setup_clear_cpu_cap(X86_FEATURE_SGX);
+	setup_clear_cpu_cap(X86_FEATURE_SGX_LC);
+}
+
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 {
 	bool tboot = tboot_enabled();
+	bool enable_sgx;
 	u64 msr;
 
 	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
 		clear_cpu_cap(c, X86_FEATURE_VMX);
+		clear_sgx_caps();
 		return;
 	}
 
+	/*
+	 * Enable SGX if and only if the kernel supports SGX and Launch Control
+	 * is supported, i.e. disable SGX if the LE hash MSRs can't be written.
+	 */
+	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
+		     cpu_has(c, X86_FEATURE_SGX_LC) &&
+		     IS_ENABLED(CONFIG_X86_SGX);
+
 	if (msr & FEAT_CTL_LOCKED)
 		goto update_caps;
 
@@ -124,13 +140,16 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
 	}
 
+	if (enable_sgx)
+		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
+
 	wrmsrl(MSR_IA32_FEAT_CTL, msr);
 
 update_caps:
 	set_cpu_cap(c, X86_FEATURE_MSR_IA32_FEAT_CTL);
 
 	if (!cpu_has(c, X86_FEATURE_VMX))
-		return;
+		goto update_sgx;
 
 	if ( (tboot && !(msr & FEAT_CTL_VMX_ENABLED_INSIDE_SMX)) ||
 	    (!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))) {
@@ -143,4 +162,12 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 		init_vmx_capabilities(c);
 #endif
 	}
+
+update_sgx:
+	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
+	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
+		if (enable_sgx)
+			pr_err_once("SGX disabled by BIOS\n");
+		clear_sgx_caps();
+	}
 }
From: Jarkko Sakkinen <jarkko@kernel.org>

mainline inclusion
from mainline-v5.11-rc1
commit 38853a303982e3be3eccb1a1132399a5c5e2d806
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
Add a kernel parameter to disable SGX kernel support and document it.
[ bp: Massage. ]
Intel-SIG: commit 38853a303982 x86/cpu/intel: Add a nosgx kernel parameter
Backport for SGX Foundations support

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Tested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Link: https://lkml.kernel.org/r/20201112220135.165028-9-jarkko@kernel.org
Signed-off-by: Fan Du <fan.du@intel.com> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Bamvor Zhang <bamvor.zhang@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 2 ++
 arch/x86/kernel/cpu/feat_ctl.c                  | 9 +++++++++
 2 files changed, 11 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b73108a82f18..a4e5614bee12 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3498,6 +3498,8 @@
 
 	nosep		[BUGS=X86-32] Disables x86 SYSENTER/SYSEXIT support.
 
+	nosgx		[X86-64,SGX] Disables Intel SGX kernel support.
+
 	nosmp		[SMP] Tells an SMP kernel to act as a UP kernel,
 			and disable the IO APIC.  legacy for "maxcpus=0".
 
diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
index d38e97325018..3b1b01f2b248 100644
--- a/arch/x86/kernel/cpu/feat_ctl.c
+++ b/arch/x86/kernel/cpu/feat_ctl.c
@@ -99,6 +99,15 @@ static void clear_sgx_caps(void)
 	setup_clear_cpu_cap(X86_FEATURE_SGX_LC);
 }
 
+static int __init nosgx(char *str)
+{
+	clear_sgx_caps();
+
+	return 0;
+}
+
+early_param("nosgx", nosgx);
+
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 {
 	bool tboot = tboot_enabled();
From: Jarkko Sakkinen <jarkko@kernel.org>

mainline inclusion
from mainline-v5.11-rc1
commit d2285493bef310b66b56dfe4eb75c1e2f431ea5c
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI
CVE: NA
--------------------------------
Add functions for runtime allocation and free.
This allocator and its algorithms are as simple as it gets. They do a linear search across all EPC sections and find the first free page. They are not NUMA-aware and only hand out individual pages. The SGX hardware does not support large pages, so something more complicated like a buddy allocator is unwarranted.
The free function (sgx_free_epc_page()) implicitly calls ENCLS[EREMOVE], which returns the page to the uninitialized state. This ensures that the page is ready for use at the next allocation.
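As a usage sketch (hypothetical in-kernel caller, not part of this patch; assumes the kernel context of this file):

	/* Minimal sketch of the alloc/free pairing added by this patch. */
	#include <linux/err.h>
	#include "sgx.h"

	static int example_claim_epc_page(void)
	{
		struct sgx_epc_page *epc_page;

		/* Linear first-fit scan across all EPC sections. */
		epc_page = __sgx_alloc_epc_page();
		if (IS_ERR(epc_page))
			return PTR_ERR(epc_page); /* -ENOMEM when all sections are full */

		/* ... hand the page to ECREATE/EADD here ... */

		/* Implicit EREMOVE, then back onto the section's free list. */
		sgx_free_epc_page(epc_page);
		return 0;
	}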
Intel-SIG: commit d2285493bef3 x86/sgx: Add SGX page allocator functions Backport for SGX Foundations support
Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-10-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/main.c | 65 ++++++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/sgx.h | 3 ++ 2 files changed, 68 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 187a237eec38..2e53afc288a4 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -85,6 +85,71 @@ static bool __init sgx_page_reclaimer_init(void) return true; }
+static struct sgx_epc_page *__sgx_alloc_epc_page_from_section(struct sgx_epc_section *section) +{ + struct sgx_epc_page *page; + + spin_lock(&section->lock); + + if (list_empty(&section->page_list)) { + spin_unlock(&section->lock); + return NULL; + } + + page = list_first_entry(&section->page_list, struct sgx_epc_page, list); + list_del_init(&page->list); + + spin_unlock(&section->lock); + return page; +} + +/** + * __sgx_alloc_epc_page() - Allocate an EPC page + * + * Iterate through EPC sections and borrow a free EPC page to the caller. When a + * page is no longer needed it must be released with sgx_free_epc_page(). + * + * Return: + * an EPC page, + * -errno on error + */ +struct sgx_epc_page *__sgx_alloc_epc_page(void) +{ + struct sgx_epc_section *section; + struct sgx_epc_page *page; + int i; + + for (i = 0; i < sgx_nr_epc_sections; i++) { + section = &sgx_epc_sections[i]; + + page = __sgx_alloc_epc_page_from_section(section); + if (page) + return page; + } + + return ERR_PTR(-ENOMEM); +} + +/** + * sgx_free_epc_page() - Free an EPC page + * @page: an EPC page + * + * Call EREMOVE for an EPC page and insert it back to the list of free pages. + */ +void sgx_free_epc_page(struct sgx_epc_page *page) +{ + struct sgx_epc_section *section = &sgx_epc_sections[page->section]; + int ret; + + ret = __eremove(sgx_get_epc_virt_addr(page)); + if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) + return; + + spin_lock(&section->lock); + list_add_tail(&page->list, &section->page_list); + spin_unlock(&section->lock); +} + static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, unsigned long index, struct sgx_epc_section *section) diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 02afa84dd8fd..bd9dcb1ffcfa 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -57,4 +57,7 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) return section->virt_addr + index * PAGE_SIZE; }
+struct sgx_epc_page *__sgx_alloc_epc_page(void); +void sgx_free_epc_page(struct sgx_epc_page *page); + #endif /* _X86_SGX_H */
From: Sean Christopherson sean.j.christopherson@intel.com
mainline inclusion from mainline-v5.11-rc1 commit 95bb7c42ac8a94ce3d0eb059ad64430390351ccb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Background
==========
1. SGX enclave pages are populated with data by copying from normal memory via ioctl() (SGX_IOC_ENCLAVE_ADD_PAGES), which will be added later in this series.
2. It is desirable to be able to restrict those normal memory data sources. For instance, to ensure that the source data is executable before copying data to an executable enclave page.
3. Enclave page permissions are dynamic (just like normal permissions) and can be adjusted at runtime with mprotect().
This creates a problem because the original data source may have long since vanished at the time when enclave page permissions are established (mmap() or mprotect()).
The solution (elsewhere in this series) is to force enclave creators to declare their paging permission *intent* up front to the ioctl(). This intent can be immediately compared to the source data’s mapping and rejected if necessary.
The “intent” is also stashed off for later comparison with enclave PTEs. This ensures that any future mmap()/mprotect() operations performed by the enclave creator or done on behalf of the enclave can be compared with the earlier declared permissions.
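A minimal sketch of that comparison (the helper name is hypothetical; the real check lands in the SGX driver later in this series):

	#include <linux/mm.h>

	/* Allow only requests that stay within the declared intent. */
	static bool vm_flags_within_intent(unsigned long declared_max,
					   unsigned long requested)
	{
		unsigned long prot = requested & (VM_READ | VM_WRITE | VM_EXEC);

		return (declared_max & prot) == prot;
	}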
Problem =======
There is an existing mmap() hook which allows SGX to perform this permission comparison at mmap() time. However, there is no corresponding ->mprotect() hook.
Solution ========
Add a vm_ops->mprotect() hook so that mprotect() operations which are inconsistent with any page's stashed intent can be rejected by the driver.
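A sketch of what a driver-side hook looks like (mydrv_* names are hypothetical; the SGX driver later in this series wires the hook the same way):

	#include <linux/mm.h>

	static int mydrv_mprotect(struct vm_area_struct *vma, unsigned long start,
				  unsigned long end, unsigned long newflags)
	{
		/* Compare newflags against intent stashed at page-add time. */
		return 0; /* or -EACCES to reject the mprotect() */
	}

	static const struct vm_operations_struct mydrv_vm_ops = {
		.mprotect = mydrv_mprotect,
	};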
Intel-SIG: commit 95bb7c42ac8a mm: Add 'mprotect' hook to struct vm_operations_struct Backport for SGX Foundations support
Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Co-developed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Acked-by: Dave Hansen dave.hansen@intel.com Acked-by: Mel Gorman mgorman@techsingularity.net Acked-by: Hillf Danton hdanton@sina.com Cc: linux-mm@kvack.org Link: https://lkml.kernel.org/r/20201112220135.165028-11-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/mm.h | 7 +++++++ mm/mprotect.c | 7 +++++++ 2 files changed, 14 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h index 4831348f31e9..2db253209f81 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -595,6 +595,13 @@ struct vm_operations_struct { void (*close)(struct vm_area_struct * area); int (*split)(struct vm_area_struct * area, unsigned long addr); int (*mremap)(struct vm_area_struct * area); + /* + * Called by mprotect() to make driver-specific permission + * checks before mprotect() is finalised. The VMA must not + * be modified. Returns 0 if mprotect() can proceed. + */ + int (*mprotect)(struct vm_area_struct *vma, unsigned long start, + unsigned long end, unsigned long newflags); vm_fault_t (*fault)(struct vm_fault *vmf); vm_fault_t (*huge_fault)(struct vm_fault *vmf, enum page_entry_size pe_size); diff --git a/mm/mprotect.c b/mm/mprotect.c index 56c02beb6041..ab709023e9aa 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -616,9 +616,16 @@ static int do_mprotect_pkey(unsigned long start, size_t len, tmp = vma->vm_end; if (tmp > end) tmp = end; + + if (vma->vm_ops && vma->vm_ops->mprotect) + error = vma->vm_ops->mprotect(vma, nstart, tmp, newflags); + if (error) + goto out; + error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); if (error) goto out; + nstart = tmp;
if (nstart < prev->vm_end)
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 3fe0778edac8628637e2fd23835996523b1a3372 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Intel(R) SGX is a new hardware functionality that can be used by applications to set aside private regions of code and data called enclaves. New hardware protects enclave code and data from outside access and modification.
Add a driver that presents a device file and ioctl API to build and manage enclaves.
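A hedged userspace sketch of the resulting interface (assumes the device node exists and the kernel headers are installed; error handling trimmed):

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		/* Each open() yields an independent enclave context. */
		int fd = open("/dev/sgx_enclave", O_RDWR);

		if (fd < 0) {
			perror("open /dev/sgx_enclave");
			return 1;
		}
		/* The build/init ioctl()s from later patches go here. */
		close(fd);
		return 0;
	}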
[ bp: Small touchups, remove unused encl variable in sgx_encl_find() as Reported-by: kernel test robot lkp@intel.com ]
Intel-SIG: commit 3fe0778edac8 x86/sgx: Add an SGX misc driver interface Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-12-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/Makefile | 2 + arch/x86/kernel/cpu/sgx/driver.c | 112 ++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/driver.h | 16 ++++ arch/x86/kernel/cpu/sgx/encl.c | 146 +++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/encl.h | 58 ++++++++++++ arch/x86/kernel/cpu/sgx/main.c | 12 ++- 6 files changed, 345 insertions(+), 1 deletion(-) create mode 100644 arch/x86/kernel/cpu/sgx/driver.c create mode 100644 arch/x86/kernel/cpu/sgx/driver.h create mode 100644 arch/x86/kernel/cpu/sgx/encl.c create mode 100644 arch/x86/kernel/cpu/sgx/encl.h
diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile index 79510ce01b3b..3fc451120735 100644 --- a/arch/x86/kernel/cpu/sgx/Makefile +++ b/arch/x86/kernel/cpu/sgx/Makefile @@ -1,2 +1,4 @@ obj-y += \ + driver.o \ + encl.o \ main.o diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c new file mode 100644 index 000000000000..c2810e1c7cf1 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -0,0 +1,112 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include <linux/acpi.h> +#include <linux/miscdevice.h> +#include <linux/mman.h> +#include <linux/security.h> +#include <linux/suspend.h> +#include <asm/traps.h> +#include "driver.h" +#include "encl.h" + +static int sgx_open(struct inode *inode, struct file *file) +{ + struct sgx_encl *encl; + + encl = kzalloc(sizeof(*encl), GFP_KERNEL); + if (!encl) + return -ENOMEM; + + xa_init(&encl->page_array); + mutex_init(&encl->lock); + + file->private_data = encl; + + return 0; +} + +static int sgx_release(struct inode *inode, struct file *file) +{ + struct sgx_encl *encl = file->private_data; + struct sgx_encl_page *entry; + unsigned long index; + + xa_for_each(&encl->page_array, index, entry) { + if (entry->epc_page) { + sgx_free_epc_page(entry->epc_page); + encl->secs_child_cnt--; + entry->epc_page = NULL; + } + + kfree(entry); + } + + xa_destroy(&encl->page_array); + + if (!encl->secs_child_cnt && encl->secs.epc_page) { + sgx_free_epc_page(encl->secs.epc_page); + encl->secs.epc_page = NULL; + } + + /* Detect EPC page leaks. */ + WARN_ON_ONCE(encl->secs_child_cnt); + WARN_ON_ONCE(encl->secs.epc_page); + + kfree(encl); + return 0; +} + +static int sgx_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct sgx_encl *encl = file->private_data; + int ret; + + ret = sgx_encl_may_map(encl, vma->vm_start, vma->vm_end, vma->vm_flags); + if (ret) + return ret; + + vma->vm_ops = &sgx_vm_ops; + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; + vma->vm_private_data = encl; + + return 0; +} + +static unsigned long sgx_get_unmapped_area(struct file *file, + unsigned long addr, + unsigned long len, + unsigned long pgoff, + unsigned long flags) +{ + if ((flags & MAP_TYPE) == MAP_PRIVATE) + return -EINVAL; + + if (flags & MAP_FIXED) + return addr; + + return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); +} + +static const struct file_operations sgx_encl_fops = { + .owner = THIS_MODULE, + .open = sgx_open, + .release = sgx_release, + .mmap = sgx_mmap, + .get_unmapped_area = sgx_get_unmapped_area, +}; + +static struct miscdevice sgx_dev_enclave = { + .minor = MISC_DYNAMIC_MINOR, + .name = "sgx_enclave", + .nodename = "sgx_enclave", + .fops = &sgx_encl_fops, +}; + +int __init sgx_drv_init(void) +{ + if (!cpu_feature_enabled(X86_FEATURE_SGX_LC)) + return -ENODEV; + + return misc_register(&sgx_dev_enclave); +} diff --git a/arch/x86/kernel/cpu/sgx/driver.h b/arch/x86/kernel/cpu/sgx/driver.h new file mode 100644 index 000000000000..cda9c43b7543 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/driver.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ARCH_SGX_DRIVER_H__ +#define __ARCH_SGX_DRIVER_H__ + +#include <crypto/hash.h> +#include <linux/kref.h> +#include <linux/mmu_notifier.h> +#include <linux/radix-tree.h> +#include <linux/rwsem.h> +#include <linux/sched.h> +#include <linux/workqueue.h> +#include "sgx.h" + +int sgx_drv_init(void); + +#endif /* __ARCH_X86_SGX_DRIVER_H__ */ diff --git a/arch/x86/kernel/cpu/sgx/encl.c 
b/arch/x86/kernel/cpu/sgx/encl.c new file mode 100644 index 000000000000..b9d445db7ff1 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -0,0 +1,146 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include <linux/lockdep.h> +#include <linux/mm.h> +#include <linux/mman.h> +#include <linux/shmem_fs.h> +#include <linux/suspend.h> +#include <linux/sched/mm.h> +#include "arch.h" +#include "encl.h" +#include "encls.h" +#include "sgx.h" + +static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, + unsigned long addr, + unsigned long vm_flags) +{ + unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); + struct sgx_encl_page *entry; + + entry = xa_load(&encl->page_array, PFN_DOWN(addr)); + if (!entry) + return ERR_PTR(-EFAULT); + + /* + * Verify that the faulted page has equal or higher build time + * permissions than the VMA permissions (i.e. the subset of {VM_READ, + * VM_WRITE, VM_EXECUTE} in vma->vm_flags). + */ + if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) + return ERR_PTR(-EFAULT); + + /* No page found. */ + if (!entry->epc_page) + return ERR_PTR(-EFAULT); + + /* Entry successfully located. */ + return entry; +} + +static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) +{ + unsigned long addr = (unsigned long)vmf->address; + struct vm_area_struct *vma = vmf->vma; + struct sgx_encl_page *entry; + unsigned long phys_addr; + struct sgx_encl *encl; + vm_fault_t ret; + + encl = vma->vm_private_data; + + mutex_lock(&encl->lock); + + entry = sgx_encl_load_page(encl, addr, vma->vm_flags); + if (IS_ERR(entry)) { + mutex_unlock(&encl->lock); + + return VM_FAULT_SIGBUS; + } + + phys_addr = sgx_get_epc_phys_addr(entry->epc_page); + + ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); + if (ret != VM_FAULT_NOPAGE) { + mutex_unlock(&encl->lock); + + return VM_FAULT_SIGBUS; + } + + mutex_unlock(&encl->lock); + + return VM_FAULT_NOPAGE; +} + +/** + * sgx_encl_may_map() - Check if a requested VMA mapping is allowed + * @encl: an enclave pointer + * @start: lower bound of the address range, inclusive + * @end: upper bound of the address range, exclusive + * @vm_flags: VMA flags + * + * Iterate through the enclave pages contained within [@start, @end) to verify + * that the permissions requested by a subset of {VM_READ, VM_WRITE, VM_EXEC} + * do not contain any permissions that are not contained in the build time + * permissions of any of the enclave pages within the given address range. + * + * An enclave creator must declare the strongest permissions that will be + * needed for each enclave page. This ensures that mappings have the identical + * or weaker permissions than the earlier declared permissions. + * + * Return: 0 on success, -EACCES otherwise + */ +int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start, + unsigned long end, unsigned long vm_flags) +{ + unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); + struct sgx_encl_page *page; + unsigned long count = 0; + int ret = 0; + + XA_STATE(xas, &encl->page_array, PFN_DOWN(start)); + + /* + * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might + * conflict with the enclave page permissions. + */ + if (current->personality & READ_IMPLIES_EXEC) + return -EACCES; + + mutex_lock(&encl->lock); + xas_lock(&xas); + xas_for_each(&xas, page, PFN_DOWN(end - 1)) { + if (~page->vm_max_prot_bits & vm_prot_bits) { + ret = -EACCES; + break; + } + + /* Reschedule on every XA_CHECK_SCHED iteration. 
*/ + if (!(++count % XA_CHECK_SCHED)) { + xas_pause(&xas); + xas_unlock(&xas); + mutex_unlock(&encl->lock); + + cond_resched(); + + mutex_lock(&encl->lock); + xas_lock(&xas); + } + } + xas_unlock(&xas); + mutex_unlock(&encl->lock); + + return ret; +} + +static int sgx_vma_mprotect(struct vm_area_struct *vma, unsigned long start, + unsigned long end, unsigned long newflags) +{ + return sgx_encl_may_map(vma->vm_private_data, start, end, newflags); +} + +const struct vm_operations_struct sgx_vm_ops = { + .fault = sgx_vma_fault, + .mprotect = sgx_vma_mprotect, +}; diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h new file mode 100644 index 000000000000..1df8011fa23d --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -0,0 +1,58 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/** + * Copyright(c) 2016-20 Intel Corporation. + * + * Contains the software defined data structures for enclaves. + */ +#ifndef _X86_ENCL_H +#define _X86_ENCL_H + +#include <linux/cpumask.h> +#include <linux/kref.h> +#include <linux/list.h> +#include <linux/mm_types.h> +#include <linux/mmu_notifier.h> +#include <linux/mutex.h> +#include <linux/notifier.h> +#include <linux/srcu.h> +#include <linux/workqueue.h> +#include <linux/xarray.h> +#include "sgx.h" + +struct sgx_encl_page { + unsigned long desc; + unsigned long vm_max_prot_bits; + struct sgx_epc_page *epc_page; + struct sgx_encl *encl; +}; + +struct sgx_encl { + unsigned long base; + unsigned long size; + unsigned int page_cnt; + unsigned int secs_child_cnt; + struct mutex lock; + struct xarray page_array; + struct sgx_encl_page secs; +}; + +extern const struct vm_operations_struct sgx_vm_ops; + +static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr, + struct vm_area_struct **vma) +{ + struct vm_area_struct *result; + + result = find_vma(mm, addr); + if (!result || result->vm_ops != &sgx_vm_ops || addr < result->vm_start) + return -EINVAL; + + *vma = result; + + return 0; +} + +int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start, + unsigned long end, unsigned long vm_flags); + +#endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 2e53afc288a4..38f2e80cc31a 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -9,6 +9,8 @@ #include <linux/sched/mm.h> #include <linux/sched/signal.h> #include <linux/slab.h> +#include "driver.h" +#include "encl.h" #include "encls.h"
struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; @@ -232,9 +234,10 @@ static bool __init sgx_page_cache_init(void)
static void __init sgx_init(void) { + int ret; int i;
- if (!boot_cpu_has(X86_FEATURE_SGX)) + if (!cpu_feature_enabled(X86_FEATURE_SGX)) return;
if (!sgx_page_cache_init()) @@ -243,8 +246,15 @@ static void __init sgx_init(void) if (!sgx_page_reclaimer_init()) goto err_page_cache;
+ ret = sgx_drv_init(); + if (ret) + goto err_kthread; + return;
+err_kthread: + kthread_stop(ksgxd_tsk); + err_page_cache: for (i = 0; i < sgx_nr_epc_sections; i++) { vfree(sgx_epc_sections[i].pages);
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 888d249117876239593fe3039b6ead8ad6849035 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Add an ioctl() that performs the ECREATE function of the ENCLS instruction, which creates an SGX Enclave Control Structure (SECS).
Although the SECS is an in-memory data structure, it is present in enclave memory and is not directly accessible by software.
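A hedged userspace sketch of driving the new ioctl (preparing a valid SECS page is out of scope here and is assumed to have happened already):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <asm/sgx.h>

	/* secs_page: a prepared 4 KiB SECS (size, base, attributes, ...). */
	int create_enclave(int enclave_fd, void *secs_page)
	{
		struct sgx_enclave_create create = {
			.src = (uint64_t)(uintptr_t)secs_page,
		};

		return ioctl(enclave_fd, SGX_IOC_ENCLAVE_CREATE, &create);
	}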
Intel-SIG: commit 888d24911787 x86/sgx: Add SGX_IOC_ENCLAVE_CREATE Backport for SGX Foundations support
Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-13-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- .../userspace-api/ioctl/ioctl-number.rst | 1 + arch/x86/include/uapi/asm/sgx.h | 25 ++++ arch/x86/kernel/cpu/sgx/Makefile | 1 + arch/x86/kernel/cpu/sgx/driver.c | 12 ++ arch/x86/kernel/cpu/sgx/driver.h | 3 + arch/x86/kernel/cpu/sgx/encl.c | 8 ++ arch/x86/kernel/cpu/sgx/encl.h | 7 + arch/x86/kernel/cpu/sgx/ioctl.c | 123 ++++++++++++++++++ 8 files changed, 180 insertions(+) create mode 100644 arch/x86/include/uapi/asm/sgx.h create mode 100644 arch/x86/kernel/cpu/sgx/ioctl.c
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 55a2d9b2ce33..a4c75a28c839 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -323,6 +323,7 @@ Code Seq# Include File Comments mailto:tlewis@mindspring.com 0xA3 90-9F linux/dtlk.h 0xA4 00-1F uapi/linux/tee.h Generic TEE subsystem +0xA4 00-1F uapi/asm/sgx.h mailto:linux-sgx@vger.kernel.org 0xAA 00-3F linux/uapi/linux/userfaultfd.h 0xAB 00-1F linux/nbd.h 0xAC 00-1F linux/raw.h diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h new file mode 100644 index 000000000000..f31bb17e27c3 --- /dev/null +++ b/arch/x86/include/uapi/asm/sgx.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * Copyright(c) 2016-20 Intel Corporation. + */ +#ifndef _UAPI_ASM_X86_SGX_H +#define _UAPI_ASM_X86_SGX_H + +#include <linux/types.h> +#include <linux/ioctl.h> + +#define SGX_MAGIC 0xA4 + +#define SGX_IOC_ENCLAVE_CREATE \ + _IOW(SGX_MAGIC, 0x00, struct sgx_enclave_create) + +/** + * struct sgx_enclave_create - parameter structure for the + * %SGX_IOC_ENCLAVE_CREATE ioctl + * @src: address for the SECS page data + */ +struct sgx_enclave_create { + __u64 src; +}; + +#endif /* _UAPI_ASM_X86_SGX_H */ diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile index 3fc451120735..91d3dc784a29 100644 --- a/arch/x86/kernel/cpu/sgx/Makefile +++ b/arch/x86/kernel/cpu/sgx/Makefile @@ -1,4 +1,5 @@ obj-y += \ driver.o \ encl.o \ + ioctl.o \ main.o diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index c2810e1c7cf1..ee947b721d8d 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -88,10 +88,22 @@ static unsigned long sgx_get_unmapped_area(struct file *file, return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); }
+#ifdef CONFIG_COMPAT +static long sgx_compat_ioctl(struct file *filep, unsigned int cmd, + unsigned long arg) +{ + return sgx_ioctl(filep, cmd, arg); +} +#endif + static const struct file_operations sgx_encl_fops = { .owner = THIS_MODULE, .open = sgx_open, .release = sgx_release, + .unlocked_ioctl = sgx_ioctl, +#ifdef CONFIG_COMPAT + .compat_ioctl = sgx_compat_ioctl, +#endif .mmap = sgx_mmap, .get_unmapped_area = sgx_get_unmapped_area, }; diff --git a/arch/x86/kernel/cpu/sgx/driver.h b/arch/x86/kernel/cpu/sgx/driver.h index cda9c43b7543..a728e8e848bd 100644 --- a/arch/x86/kernel/cpu/sgx/driver.h +++ b/arch/x86/kernel/cpu/sgx/driver.h @@ -9,8 +9,11 @@ #include <linux/rwsem.h> #include <linux/sched.h> #include <linux/workqueue.h> +#include <uapi/asm/sgx.h> #include "sgx.h"
+long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg); + int sgx_drv_init(void);
#endif /* __ARCH_X86_SGX_DRIVER_H__ */ diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index b9d445db7ff1..57eff300f487 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -46,6 +46,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) struct sgx_encl_page *entry; unsigned long phys_addr; struct sgx_encl *encl; + unsigned long pfn; vm_fault_t ret;
encl = vma->vm_private_data; @@ -61,6 +62,13 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
+ /* Check if another thread got here first to insert the PTE. */ + if (!follow_pfn(vma, addr, &pfn)) { + mutex_unlock(&encl->lock); + + return VM_FAULT_NOPAGE; + } + ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); if (ret != VM_FAULT_NOPAGE) { mutex_unlock(&encl->lock); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index 1df8011fa23d..7cc175825b01 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -26,9 +26,16 @@ struct sgx_encl_page { struct sgx_encl *encl; };
+enum sgx_encl_flags { + SGX_ENCL_IOCTL = BIT(0), + SGX_ENCL_DEBUG = BIT(1), + SGX_ENCL_CREATED = BIT(2), +}; + struct sgx_encl { unsigned long base; unsigned long size; + unsigned long flags; unsigned int page_cnt; unsigned int secs_child_cnt; struct mutex lock; diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c new file mode 100644 index 000000000000..1355490843d1 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -0,0 +1,123 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include <asm/mman.h> +#include <linux/mman.h> +#include <linux/delay.h> +#include <linux/file.h> +#include <linux/hashtable.h> +#include <linux/highmem.h> +#include <linux/ratelimit.h> +#include <linux/sched/signal.h> +#include <linux/shmem_fs.h> +#include <linux/slab.h> +#include <linux/suspend.h> +#include "driver.h" +#include "encl.h" +#include "encls.h" + +static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) +{ + struct sgx_epc_page *secs_epc; + struct sgx_pageinfo pginfo; + struct sgx_secinfo secinfo; + unsigned long encl_size; + long ret; + + /* The extra page goes to SECS. */ + encl_size = secs->size + PAGE_SIZE; + + secs_epc = __sgx_alloc_epc_page(); + if (IS_ERR(secs_epc)) + return PTR_ERR(secs_epc); + + encl->secs.epc_page = secs_epc; + + pginfo.addr = 0; + pginfo.contents = (unsigned long)secs; + pginfo.metadata = (unsigned long)&secinfo; + pginfo.secs = 0; + memset(&secinfo, 0, sizeof(secinfo)); + + ret = __ecreate((void *)&pginfo, sgx_get_epc_virt_addr(secs_epc)); + if (ret) { + ret = -EIO; + goto err_out; + } + + if (secs->attributes & SGX_ATTR_DEBUG) + set_bit(SGX_ENCL_DEBUG, &encl->flags); + + encl->secs.encl = encl; + encl->base = secs->base; + encl->size = secs->size; + + /* Set only after completion, as encl->lock has not been taken. */ + set_bit(SGX_ENCL_CREATED, &encl->flags); + + return 0; + +err_out: + sgx_free_epc_page(encl->secs.epc_page); + encl->secs.epc_page = NULL; + + return ret; +} + +/** + * sgx_ioc_enclave_create() - handler for %SGX_IOC_ENCLAVE_CREATE + * @encl: An enclave pointer. + * @arg: The ioctl argument. + * + * Allocate kernel data structures for the enclave and invoke ECREATE. + * + * Return: + * - 0: Success. + * - -EIO: ECREATE failed. + * - -errno: POSIX error. + */ +static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg) +{ + struct sgx_enclave_create create_arg; + void *secs; + int ret; + + if (test_bit(SGX_ENCL_CREATED, &encl->flags)) + return -EINVAL; + + if (copy_from_user(&create_arg, arg, sizeof(create_arg))) + return -EFAULT; + + secs = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (!secs) + return -ENOMEM; + + if (copy_from_user(secs, (void __user *)create_arg.src, PAGE_SIZE)) + ret = -EFAULT; + else + ret = sgx_encl_create(encl, secs); + + kfree(secs); + return ret; +} + +long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) +{ + struct sgx_encl *encl = filep->private_data; + int ret; + + if (test_and_set_bit(SGX_ENCL_IOCTL, &encl->flags)) + return -EBUSY; + + switch (cmd) { + case SGX_IOC_ENCLAVE_CREATE: + ret = sgx_ioc_enclave_create(encl, (void __user *)arg); + break; + default: + ret = -ENOIOCTLCMD; + break; + } + + clear_bit(SGX_ENCL_IOCTL, &encl->flags); + return ret; +}
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit c6d26d370767fa227fc44b98a8bdad112efdf563 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
SGX enclave pages are inaccessible to normal software. They must be populated with data by copying from normal memory with the help of the EADD and EEXTEND functions of the ENCLS instruction.
Add an ioctl() which performs the EADD function, adding new data to an enclave, and optionally the EEXTEND function, which hashes the page contents and uses the hash as part of the enclave “measurement” to ensure enclave integrity.

The enclave author gets to decide which pages will be included in the enclave measurement with EEXTEND. Measurement is very slow and sometimes has very little value. For instance, an enclave _could_ measure every page of data and code, but would be slow to initialize. Or, it might just measure its code and then trust that code to initialize the bulk of its data after it starts running.
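A hedged userspace sketch of the call (secinfo must point to a zeroed struct sgx_secinfo with the page-type and R/W/X flags set; values here are illustrative):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <asm/sgx.h>

	int add_measured_pages(int enclave_fd, void *src, uint64_t offset,
			       uint64_t length, void *secinfo)
	{
		struct sgx_enclave_add_pages add = {
			.src     = (uint64_t)(uintptr_t)src,     /* page-aligned source */
			.offset  = offset,                       /* page-aligned enclave offset */
			.length  = length,                       /* multiple of the page size */
			.secinfo = (uint64_t)(uintptr_t)secinfo,
			.flags   = SGX_PAGE_MEASURE,             /* EEXTEND in 256-byte chunks */
		};

		/* On return, add.count holds the number of bytes processed. */
		return ioctl(enclave_fd, SGX_IOC_ENCLAVE_ADD_PAGES, &add);
	}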
Intel-SIG: commit c6d26d370767 x86/sgx: Add SGX_IOC_ENCLAVE_ADD_PAGES Backport for SGX Foundations support
Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-14-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/include/uapi/asm/sgx.h | 30 ++++ arch/x86/kernel/cpu/sgx/ioctl.c | 284 ++++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/sgx.h | 1 + 3 files changed, 315 insertions(+)
diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h index f31bb17e27c3..835f7e588f0d 100644 --- a/arch/x86/include/uapi/asm/sgx.h +++ b/arch/x86/include/uapi/asm/sgx.h @@ -8,10 +8,21 @@ #include <linux/types.h> #include <linux/ioctl.h>
+/** + * enum sgx_page_flags - page control flags + * %SGX_PAGE_MEASURE: Measure the page contents with a sequence of + * ENCLS[EEXTEND] operations. + */ +enum sgx_page_flags { + SGX_PAGE_MEASURE = 0x01, +}; + #define SGX_MAGIC 0xA4
#define SGX_IOC_ENCLAVE_CREATE \ _IOW(SGX_MAGIC, 0x00, struct sgx_enclave_create) +#define SGX_IOC_ENCLAVE_ADD_PAGES \ + _IOWR(SGX_MAGIC, 0x01, struct sgx_enclave_add_pages)
/** * struct sgx_enclave_create - parameter structure for the @@ -22,4 +33,23 @@ struct sgx_enclave_create { __u64 src; };
+/** + * struct sgx_enclave_add_pages - parameter structure for the + * %SGX_IOC_ENCLAVE_ADD_PAGES ioctl + * @src: start address for the page data + * @offset: starting page offset + * @length: length of the data (multiple of the page size) + * @secinfo: address for the SECINFO data + * @flags: page control flags + * @count: number of bytes added (multiple of the page size) + */ +struct sgx_enclave_add_pages { + __u64 src; + __u64 offset; + __u64 length; + __u64 secinfo; + __u64 flags; + __u64 count; +}; + #endif /* _UAPI_ASM_X86_SGX_H */ diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 1355490843d1..82acff7bda60 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -101,6 +101,287 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg) return ret; }
+static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl, + unsigned long offset, + u64 secinfo_flags) +{ + struct sgx_encl_page *encl_page; + unsigned long prot; + + encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL); + if (!encl_page) + return ERR_PTR(-ENOMEM); + + encl_page->desc = encl->base + offset; + encl_page->encl = encl; + + prot = _calc_vm_trans(secinfo_flags, SGX_SECINFO_R, PROT_READ) | + _calc_vm_trans(secinfo_flags, SGX_SECINFO_W, PROT_WRITE) | + _calc_vm_trans(secinfo_flags, SGX_SECINFO_X, PROT_EXEC); + + /* + * TCS pages must always RW set for CPU access while the SECINFO + * permissions are *always* zero - the CPU ignores the user provided + * values and silently overwrites them with zero permissions. + */ + if ((secinfo_flags & SGX_SECINFO_PAGE_TYPE_MASK) == SGX_SECINFO_TCS) + prot |= PROT_READ | PROT_WRITE; + + /* Calculate maximum of the VM flags for the page. */ + encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0); + + return encl_page; +} + +static int sgx_validate_secinfo(struct sgx_secinfo *secinfo) +{ + u64 perm = secinfo->flags & SGX_SECINFO_PERMISSION_MASK; + u64 pt = secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK; + + if (pt != SGX_SECINFO_REG && pt != SGX_SECINFO_TCS) + return -EINVAL; + + if ((perm & SGX_SECINFO_W) && !(perm & SGX_SECINFO_R)) + return -EINVAL; + + /* + * CPU will silently overwrite the permissions as zero, which means + * that we need to validate it ourselves. + */ + if (pt == SGX_SECINFO_TCS && perm) + return -EINVAL; + + if (secinfo->flags & SGX_SECINFO_RESERVED_MASK) + return -EINVAL; + + if (memchr_inv(secinfo->reserved, 0, sizeof(secinfo->reserved))) + return -EINVAL; + + return 0; +} + +static int __sgx_encl_add_page(struct sgx_encl *encl, + struct sgx_encl_page *encl_page, + struct sgx_epc_page *epc_page, + struct sgx_secinfo *secinfo, unsigned long src) +{ + struct sgx_pageinfo pginfo; + struct vm_area_struct *vma; + struct page *src_page; + int ret; + + /* Deny noexec. */ + vma = find_vma(current->mm, src); + if (!vma) + return -EFAULT; + + if (!(vma->vm_flags & VM_MAYEXEC)) + return -EACCES; + + ret = get_user_pages(src, 1, 0, &src_page, NULL); + if (ret < 1) + return -EFAULT; + + pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page); + pginfo.addr = encl_page->desc & PAGE_MASK; + pginfo.metadata = (unsigned long)secinfo; + pginfo.contents = (unsigned long)kmap_atomic(src_page); + + ret = __eadd(&pginfo, sgx_get_epc_virt_addr(epc_page)); + + kunmap_atomic((void *)pginfo.contents); + put_page(src_page); + + return ret ? -EIO : 0; +} + +/* + * If the caller requires measurement of the page as a proof for the content, + * use EEXTEND to add a measurement for 256 bytes of the page. Repeat this + * operation until the entire page is measured." 
+ */ +static int __sgx_encl_extend(struct sgx_encl *encl, + struct sgx_epc_page *epc_page) +{ + unsigned long offset; + int ret; + + for (offset = 0; offset < PAGE_SIZE; offset += SGX_EEXTEND_BLOCK_SIZE) { + ret = __eextend(sgx_get_epc_virt_addr(encl->secs.epc_page), + sgx_get_epc_virt_addr(epc_page) + offset); + if (ret) { + if (encls_failed(ret)) + ENCLS_WARN(ret, "EEXTEND"); + + return -EIO; + } + } + + return 0; +} + +static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, + unsigned long offset, struct sgx_secinfo *secinfo, + unsigned long flags) +{ + struct sgx_encl_page *encl_page; + struct sgx_epc_page *epc_page; + int ret; + + encl_page = sgx_encl_page_alloc(encl, offset, secinfo->flags); + if (IS_ERR(encl_page)) + return PTR_ERR(encl_page); + + epc_page = __sgx_alloc_epc_page(); + if (IS_ERR(epc_page)) { + kfree(encl_page); + return PTR_ERR(epc_page); + } + + mmap_read_lock(current->mm); + mutex_lock(&encl->lock); + + /* + * Insert prior to EADD in case of OOM. EADD modifies MRENCLAVE, i.e. + * can't be gracefully unwound, while failure on EADD/EXTEND is limited + * to userspace errors (or kernel/hardware bugs). + */ + ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc), + encl_page, GFP_KERNEL); + if (ret) + goto err_out_unlock; + + ret = __sgx_encl_add_page(encl, encl_page, epc_page, secinfo, + src); + if (ret) + goto err_out; + + /* + * Complete the "add" before doing the "extend" so that the "add" + * isn't in a half-baked state in the extremely unlikely scenario + * the enclave will be destroyed in response to EEXTEND failure. + */ + encl_page->encl = encl; + encl_page->epc_page = epc_page; + encl->secs_child_cnt++; + + if (flags & SGX_PAGE_MEASURE) { + ret = __sgx_encl_extend(encl, epc_page); + if (ret) + goto err_out; + } + + mutex_unlock(&encl->lock); + mmap_read_unlock(current->mm); + return ret; + +err_out: + xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc)); + +err_out_unlock: + mutex_unlock(&encl->lock); + mmap_read_unlock(current->mm); + + sgx_free_epc_page(epc_page); + kfree(encl_page); + + return ret; +} + +/** + * sgx_ioc_enclave_add_pages() - The handler for %SGX_IOC_ENCLAVE_ADD_PAGES + * @encl: an enclave pointer + * @arg: a user pointer to a struct sgx_enclave_add_pages instance + * + * Add one or more pages to an uninitialized enclave, and optionally extend the + * measurement with the contents of the page. The SECINFO and measurement mask + * are applied to all pages. + * + * A SECINFO for a TCS is required to always contain zero permissions because + * CPU silently zeros them. Allowing anything else would cause a mismatch in + * the measurement. + * + * mmap()'s protection bits are capped by the page permissions. For each page + * address, the maximum protection bits are computed with the following + * heuristics: + * + * 1. A regular page: PROT_R, PROT_W and PROT_X match the SECINFO permissions. + * 2. A TCS page: PROT_R | PROT_W. + * + * mmap() is not allowed to surpass the minimum of the maximum protection bits + * within the given address range. + * + * The function deinitializes kernel data structures for enclave and returns + * -EIO in any of the following conditions: + * + * - Enclave Page Cache (EPC), the physical memory holding enclaves, has + * been invalidated. This will cause EADD and EEXTEND to fail. + * - If the source address is corrupted somehow when executing EADD. + * + * Return: + * - 0: Success. + * - -EACCES: The source page is located in a noexec partition. + * - -ENOMEM: Out of EPC pages. 
+ * - -EINTR: The call was interrupted before data was processed. + * - -EIO: Either EADD or EEXTEND failed because invalid source address + * or power cycle. + * - -errno: POSIX error. + */ +static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg) +{ + struct sgx_enclave_add_pages add_arg; + struct sgx_secinfo secinfo; + unsigned long c; + int ret; + + if (!test_bit(SGX_ENCL_CREATED, &encl->flags)) + return -EINVAL; + + if (copy_from_user(&add_arg, arg, sizeof(add_arg))) + return -EFAULT; + + if (!IS_ALIGNED(add_arg.offset, PAGE_SIZE) || + !IS_ALIGNED(add_arg.src, PAGE_SIZE)) + return -EINVAL; + + if (add_arg.length & (PAGE_SIZE - 1)) + return -EINVAL; + + if (add_arg.offset + add_arg.length - PAGE_SIZE >= encl->size) + return -EINVAL; + + if (copy_from_user(&secinfo, (void __user *)add_arg.secinfo, + sizeof(secinfo))) + return -EFAULT; + + if (sgx_validate_secinfo(&secinfo)) + return -EINVAL; + + for (c = 0 ; c < add_arg.length; c += PAGE_SIZE) { + if (signal_pending(current)) { + if (!c) + ret = -EINTR; + + break; + } + + if (need_resched()) + cond_resched(); + + ret = sgx_encl_add_page(encl, add_arg.src + c, add_arg.offset + c, + &secinfo, add_arg.flags); + if (ret) + break; + } + + add_arg.count = c; + + if (copy_to_user(arg, &add_arg, sizeof(add_arg))) + return -EFAULT; + + return ret; +} + long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { struct sgx_encl *encl = filep->private_data; @@ -113,6 +394,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) case SGX_IOC_ENCLAVE_CREATE: ret = sgx_ioc_enclave_create(encl, (void __user *)arg); break; + case SGX_IOC_ENCLAVE_ADD_PAGES: + ret = sgx_ioc_enclave_add_pages(encl, (void __user *)arg); + break; default: ret = -ENOIOCTLCMD; break; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index bd9dcb1ffcfa..91234f425b89 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -14,6 +14,7 @@ #define pr_fmt(fmt) "sgx: " fmt
#define SGX_MAX_EPC_SECTIONS 8 +#define SGX_EEXTEND_BLOCK_SIZE 256
struct sgx_epc_page { unsigned int section;
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 9d0c151b41fed7b879030f4e533143d098781701 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Enclaves have two basic states. They are either being built and are malleable and can be modified by doing things like adding pages. Or, they are locked down and not accepting changes. They can only be run after they have been locked down. The ENCLS[EINIT] function induces the transition from being malleable to locked-down.
Add an ioctl() that performs ENCLS[EINIT]. After this, new pages can no longer be added with ENCLS[EADD]. This is also the point at which the enclave can be measured to verify its integrity.
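A hedged userspace sketch (the SIGSTRUCT comes from the enclave signing tool; loading it into memory is assumed to happen elsewhere):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <asm/sgx.h>

	int init_enclave(int enclave_fd, void *sigstruct)
	{
		struct sgx_enclave_init init = {
			.sigstruct = (uint64_t)(uintptr_t)sigstruct,
		};

		/* -EPERM here typically means an invalid SIGSTRUCT. */
		return ioctl(enclave_fd, SGX_IOC_ENCLAVE_INIT, &init);
	}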
Intel-SIG: commit 9d0c151b41fe x86/sgx: Add SGX_IOC_ENCLAVE_INIT Backport for SGX Foundations support
Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-15-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/include/uapi/asm/sgx.h | 11 ++ arch/x86/kernel/cpu/sgx/driver.c | 27 +++++ arch/x86/kernel/cpu/sgx/driver.h | 8 ++ arch/x86/kernel/cpu/sgx/encl.h | 3 + arch/x86/kernel/cpu/sgx/ioctl.c | 193 ++++++++++++++++++++++++++++++- 5 files changed, 241 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h index 835f7e588f0d..66f2d32cb4d7 100644 --- a/arch/x86/include/uapi/asm/sgx.h +++ b/arch/x86/include/uapi/asm/sgx.h @@ -23,6 +23,8 @@ enum sgx_page_flags { _IOW(SGX_MAGIC, 0x00, struct sgx_enclave_create) #define SGX_IOC_ENCLAVE_ADD_PAGES \ _IOWR(SGX_MAGIC, 0x01, struct sgx_enclave_add_pages) +#define SGX_IOC_ENCLAVE_INIT \ + _IOW(SGX_MAGIC, 0x02, struct sgx_enclave_init)
/** * struct sgx_enclave_create - parameter structure for the @@ -52,4 +54,13 @@ struct sgx_enclave_add_pages { __u64 count; };
+/** + * struct sgx_enclave_init - parameter structure for the + * %SGX_IOC_ENCLAVE_INIT ioctl + * @sigstruct: address for the SIGSTRUCT data + */ +struct sgx_enclave_init { + __u64 sigstruct; +}; + #endif /* _UAPI_ASM_X86_SGX_H */ diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index ee947b721d8d..bf5c4a36a548 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -10,6 +10,10 @@ #include "driver.h" #include "encl.h"
+u64 sgx_attributes_reserved_mask; +u64 sgx_xfrm_reserved_mask = ~0x3; +u32 sgx_misc_reserved_mask; + static int sgx_open(struct inode *inode, struct file *file) { struct sgx_encl *encl; @@ -117,8 +121,31 @@ static struct miscdevice sgx_dev_enclave = {
int __init sgx_drv_init(void) { + unsigned int eax, ebx, ecx, edx; + u64 attr_mask; + u64 xfrm_mask; + if (!cpu_feature_enabled(X86_FEATURE_SGX_LC)) return -ENODEV;
+ cpuid_count(SGX_CPUID, 0, &eax, &ebx, &ecx, &edx); + + if (!(eax & 1)) { + pr_err("SGX disabled: SGX1 instruction support not available.\n"); + return -ENODEV; + } + + sgx_misc_reserved_mask = ~ebx | SGX_MISC_RESERVED_MASK; + + cpuid_count(SGX_CPUID, 1, &eax, &ebx, &ecx, &edx); + + attr_mask = (((u64)ebx) << 32) + (u64)eax; + sgx_attributes_reserved_mask = ~attr_mask | SGX_ATTR_RESERVED_MASK; + + if (cpu_feature_enabled(X86_FEATURE_OSXSAVE)) { + xfrm_mask = (((u64)edx) << 32) + (u64)ecx; + sgx_xfrm_reserved_mask = ~xfrm_mask; + } + return misc_register(&sgx_dev_enclave); } diff --git a/arch/x86/kernel/cpu/sgx/driver.h b/arch/x86/kernel/cpu/sgx/driver.h index a728e8e848bd..6b0063221659 100644 --- a/arch/x86/kernel/cpu/sgx/driver.h +++ b/arch/x86/kernel/cpu/sgx/driver.h @@ -12,6 +12,14 @@ #include <uapi/asm/sgx.h> #include "sgx.h"
+#define SGX_EINIT_SPIN_COUNT 20 +#define SGX_EINIT_SLEEP_COUNT 50 +#define SGX_EINIT_SLEEP_TIME 20 + +extern u64 sgx_attributes_reserved_mask; +extern u64 sgx_xfrm_reserved_mask; +extern u32 sgx_misc_reserved_mask; + long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
int sgx_drv_init(void); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index 7cc175825b01..8a4d1edded68 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -30,6 +30,7 @@ enum sgx_encl_flags { SGX_ENCL_IOCTL = BIT(0), SGX_ENCL_DEBUG = BIT(1), SGX_ENCL_CREATED = BIT(2), + SGX_ENCL_INITIALIZED = BIT(3), };
struct sgx_encl { @@ -41,6 +42,8 @@ struct sgx_encl { struct mutex lock; struct xarray page_array; struct sgx_encl_page secs; + unsigned long attributes; + unsigned long attributes_mask; };
extern const struct vm_operations_struct sgx_vm_ops; diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 82acff7bda60..e036819ea5c1 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -51,6 +51,8 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) encl->secs.encl = encl; encl->base = secs->base; encl->size = secs->size; + encl->attributes = secs->attributes; + encl->attributes_mask = SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT | SGX_ATTR_KSS;
/* Set only after completion, as encl->lock has not been taken. */ set_bit(SGX_ENCL_CREATED, &encl->flags); @@ -334,7 +336,8 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg) unsigned long c; int ret;
- if (!test_bit(SGX_ENCL_CREATED, &encl->flags)) + if (!test_bit(SGX_ENCL_CREATED, &encl->flags) || + test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) return -EINVAL;
if (copy_from_user(&add_arg, arg, sizeof(add_arg))) @@ -382,6 +385,191 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg) return ret; }
+static int __sgx_get_key_hash(struct crypto_shash *tfm, const void *modulus, + void *hash) +{ + SHASH_DESC_ON_STACK(shash, tfm); + + shash->tfm = tfm; + + return crypto_shash_digest(shash, modulus, SGX_MODULUS_SIZE, hash); +} + +static int sgx_get_key_hash(const void *modulus, void *hash) +{ + struct crypto_shash *tfm; + int ret; + + tfm = crypto_alloc_shash("sha256", 0, CRYPTO_ALG_ASYNC); + if (IS_ERR(tfm)) + return PTR_ERR(tfm); + + ret = __sgx_get_key_hash(tfm, modulus, hash); + + crypto_free_shash(tfm); + return ret; +} + +static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct, + void *token) +{ + u64 mrsigner[4]; + int i, j, k; + void *addr; + int ret; + + /* + * Deny initializing enclaves with attributes (namely provisioning) + * that have not been explicitly allowed. + */ + if (encl->attributes & ~encl->attributes_mask) + return -EACCES; + + /* + * Attributes should not be enforced *only* against what's available on + * platform (done in sgx_encl_create) but checked and enforced against + * the mask for enforcement in sigstruct. For example an enclave could + * opt to sign with AVX bit in xfrm, but still be loadable on a platform + * without it if the sigstruct->body.attributes_mask does not turn that + * bit on. + */ + if (sigstruct->body.attributes & sigstruct->body.attributes_mask & + sgx_attributes_reserved_mask) + return -EINVAL; + + if (sigstruct->body.miscselect & sigstruct->body.misc_mask & + sgx_misc_reserved_mask) + return -EINVAL; + + if (sigstruct->body.xfrm & sigstruct->body.xfrm_mask & + sgx_xfrm_reserved_mask) + return -EINVAL; + + ret = sgx_get_key_hash(sigstruct->modulus, mrsigner); + if (ret) + return ret; + + mutex_lock(&encl->lock); + + /* + * ENCLS[EINIT] is interruptible because it has such a high latency, + * e.g. 50k+ cycles on success. If an IRQ/NMI/SMI becomes pending, + * EINIT may fail with SGX_UNMASKED_EVENT so that the event can be + * serviced. + */ + for (i = 0; i < SGX_EINIT_SLEEP_COUNT; i++) { + for (j = 0; j < SGX_EINIT_SPIN_COUNT; j++) { + addr = sgx_get_epc_virt_addr(encl->secs.epc_page); + + preempt_disable(); + + for (k = 0; k < 4; k++) + wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + k, mrsigner[k]); + + ret = __einit(sigstruct, token, addr); + + preempt_enable(); + + if (ret == SGX_UNMASKED_EVENT) + continue; + else + break; + } + + if (ret != SGX_UNMASKED_EVENT) + break; + + msleep_interruptible(SGX_EINIT_SLEEP_TIME); + + if (signal_pending(current)) { + ret = -ERESTARTSYS; + goto err_out; + } + } + + if (ret & ENCLS_FAULT_FLAG) { + if (encls_failed(ret)) + ENCLS_WARN(ret, "EINIT"); + + ret = -EIO; + } else if (ret) { + pr_debug("EINIT returned %d\n", ret); + ret = -EPERM; + } else { + set_bit(SGX_ENCL_INITIALIZED, &encl->flags); + } + +err_out: + mutex_unlock(&encl->lock); + return ret; +} + +/** + * sgx_ioc_enclave_init() - handler for %SGX_IOC_ENCLAVE_INIT + * @encl: an enclave pointer + * @arg: userspace pointer to a struct sgx_enclave_init instance + * + * Flush any outstanding enqueued EADD operations and perform EINIT. The + * Launch Enclave Public Key Hash MSRs are rewritten as necessary to match + * the enclave's MRSIGNER, which is caculated from the provided sigstruct. + * + * Return: + * - 0: Success. + * - -EPERM: Invalid SIGSTRUCT. + * - -EIO: EINIT failed because of a power cycle. + * - -errno: POSIX error. 
+ */ +static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg) +{ + struct sgx_sigstruct *sigstruct; + struct sgx_enclave_init init_arg; + struct page *initp_page; + void *token; + int ret; + + if (!test_bit(SGX_ENCL_CREATED, &encl->flags) || + test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) + return -EINVAL; + + if (copy_from_user(&init_arg, arg, sizeof(init_arg))) + return -EFAULT; + + initp_page = alloc_page(GFP_KERNEL); + if (!initp_page) + return -ENOMEM; + + sigstruct = kmap(initp_page); + token = (void *)((unsigned long)sigstruct + PAGE_SIZE / 2); + memset(token, 0, SGX_LAUNCH_TOKEN_SIZE); + + if (copy_from_user(sigstruct, (void __user *)init_arg.sigstruct, + sizeof(*sigstruct))) { + ret = -EFAULT; + goto out; + } + + /* + * A legacy field used with Intel signed enclaves. These used to mean + * regular and architectural enclaves. The CPU only accepts these values + * but they do not have any other meaning. + * + * Thus, reject any other values. + */ + if (sigstruct->header.vendor != 0x0000 && + sigstruct->header.vendor != 0x8086) { + ret = -EINVAL; + goto out; + } + + ret = sgx_encl_init(encl, sigstruct, token); + +out: + kunmap(initp_page); + __free_page(initp_page); + return ret; +} + + long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { struct sgx_encl *encl = filep->private_data; @@ -397,6 +585,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) case SGX_IOC_ENCLAVE_ADD_PAGES: ret = sgx_ioc_enclave_add_pages(encl, (void __user *)arg); break; + case SGX_IOC_ENCLAVE_INIT: + ret = sgx_ioc_enclave_init(encl, (void __user *)arg); + break; default: ret = -ENOIOCTLCMD; break;
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit c82c61865024b9981f00358433bebed92ca20c00 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
The whole point of SGX is to create a hardware-protected place to do “stuff”. But, before someone is willing to hand over the keys to the castle, an enclave must often prove that it is running on an SGX-protected processor. Provisioning enclaves play a key role in providing proof.
There are actually three different enclaves in play in order to make this happen:
1. The application enclave. The familiar one we know and love that runs the actual code that’s doing real work. There can be many of these on a single system, or even in a single application.
2. The quoting enclave (QE). The QE is mentioned in lots of silly whitepapers, but, for the purposes of kernel enabling, just pretend they do not exist.
3. The provisioning enclave. There is typically only one of these enclaves per system. Provisioning enclaves have access to a special hardware key.
They can use this key to help to generate certificates which serve as proof that enclaves are running on trusted SGX hardware. These certificates can be passed around without revealing the special key.
Any user who can create a provisioning enclave can access the processor-unique Provisioning Certificate Key which has privacy and fingerprinting implications. Even if a user is permitted to create normal application enclaves (via /dev/sgx_enclave), they should not be able to create provisioning enclaves. That means a separate permissions scheme is needed to control provisioning enclave privileges.
Implement a separate device file (/dev/sgx_provision) which allows creating provisioning enclaves. This device will typically have more strict permissions than the plain enclave device.
The actual device “driver” is an empty stub. Open file descriptors for this device will represent a token which allows provisioning enclave duty. This file descriptor can be passed around and ultimately given as an argument to the /dev/sgx_enclave driver ioctl().
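A hedged userspace sketch of the token-passing flow (assumes the caller has been granted read access to /dev/sgx_provision):

	#include <fcntl.h>
	#include <sys/ioctl.h>
	#include <unistd.h>
	#include <asm/sgx.h>

	int allow_provisioning(int enclave_fd)
	{
		struct sgx_enclave_provision params;
		int pfd = open("/dev/sgx_provision", O_RDONLY);

		if (pfd < 0)
			return -1; /* no permission to the token device */

		params.fd = pfd;
		if (ioctl(enclave_fd, SGX_IOC_ENCLAVE_PROVISION, &params) < 0) {
			close(pfd);
			return -1;
		}

		close(pfd); /* the grant sticks to the enclave, not the fd */
		return 0;
	}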
[ bp: Touchups. ]
Intel-SIG: commit c82c61865024 x86/sgx: Add SGX_IOC_ENCLAVE_PROVISION Backport for SGX Foundations support
Suggested-by: Andy Lutomirski luto@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Cc: linux-security-module@vger.kernel.org Link: https://lkml.kernel.org/r/20201112220135.165028-16-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/include/uapi/asm/sgx.h | 11 ++++++++++ arch/x86/kernel/cpu/sgx/driver.c | 24 ++++++++++++++++++++- arch/x86/kernel/cpu/sgx/driver.h | 2 ++ arch/x86/kernel/cpu/sgx/ioctl.c | 37 ++++++++++++++++++++++++++++++++ 4 files changed, 73 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h index 66f2d32cb4d7..c32210235bf5 100644 --- a/arch/x86/include/uapi/asm/sgx.h +++ b/arch/x86/include/uapi/asm/sgx.h @@ -25,6 +25,8 @@ enum sgx_page_flags { _IOWR(SGX_MAGIC, 0x01, struct sgx_enclave_add_pages) #define SGX_IOC_ENCLAVE_INIT \ _IOW(SGX_MAGIC, 0x02, struct sgx_enclave_init) +#define SGX_IOC_ENCLAVE_PROVISION \ + _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
/** * struct sgx_enclave_create - parameter structure for the @@ -63,4 +65,13 @@ struct sgx_enclave_init { __u64 sigstruct; };
+/** + * struct sgx_enclave_provision - parameter structure for the + * %SGX_IOC_ENCLAVE_PROVISION ioctl + * @fd: file handle of /dev/sgx_provision + */ +struct sgx_enclave_provision { + __u64 fd; +}; + #endif /* _UAPI_ASM_X86_SGX_H */ diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index bf5c4a36a548..899c18499d1a 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -112,6 +112,10 @@ static const struct file_operations sgx_encl_fops = { .get_unmapped_area = sgx_get_unmapped_area, };
+const struct file_operations sgx_provision_fops = { + .owner = THIS_MODULE, +}; + static struct miscdevice sgx_dev_enclave = { .minor = MISC_DYNAMIC_MINOR, .name = "sgx_enclave", @@ -119,11 +123,19 @@ static struct miscdevice sgx_dev_enclave = { .fops = &sgx_encl_fops, };
+static struct miscdevice sgx_dev_provision = { + .minor = MISC_DYNAMIC_MINOR, + .name = "sgx_provision", + .nodename = "sgx_provision", + .fops = &sgx_provision_fops, +}; + int __init sgx_drv_init(void) { unsigned int eax, ebx, ecx, edx; u64 attr_mask; u64 xfrm_mask; + int ret;
if (!cpu_feature_enabled(X86_FEATURE_SGX_LC)) return -ENODEV; @@ -147,5 +159,15 @@ int __init sgx_drv_init(void) sgx_xfrm_reserved_mask = ~xfrm_mask; }
- return misc_register(&sgx_dev_enclave); + ret = misc_register(&sgx_dev_enclave); + if (ret) + return ret; + + ret = misc_register(&sgx_dev_provision); + if (ret) { + misc_deregister(&sgx_dev_enclave); + return ret; + } + + return 0; } diff --git a/arch/x86/kernel/cpu/sgx/driver.h b/arch/x86/kernel/cpu/sgx/driver.h index 6b0063221659..4eddb4d571ef 100644 --- a/arch/x86/kernel/cpu/sgx/driver.h +++ b/arch/x86/kernel/cpu/sgx/driver.h @@ -20,6 +20,8 @@ extern u64 sgx_attributes_reserved_mask; extern u64 sgx_xfrm_reserved_mask; extern u32 sgx_misc_reserved_mask;
+extern const struct file_operations sgx_provision_fops; + long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
int sgx_drv_init(void); diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index e036819ea5c1..0ba0e670e2f0 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -569,6 +569,40 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg) return ret; }
+/**
+ * sgx_ioc_enclave_provision() - handler for %SGX_IOC_ENCLAVE_PROVISION
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a struct sgx_enclave_provision instance
+ *
+ * Allow ATTRIBUTE.PROVISION_KEY for an enclave by providing a file handle to
+ * /dev/sgx_provision.
+ *
+ * Return:
+ * - 0:		Success.
+ * - -errno:	Otherwise.
+ */
+static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
+{
+	struct sgx_enclave_provision params;
+	struct file *file;
+
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	file = fget(params.fd);
+	if (!file)
+		return -EINVAL;
+
+	if (file->f_op != &sgx_provision_fops) {
+		fput(file);
+		return -EINVAL;
+	}
+
+	encl->attributes_mask |= SGX_ATTR_PROVISIONKEY;
+
+	fput(file);
+	return 0;
+}
long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { @@ -588,6 +622,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) case SGX_IOC_ENCLAVE_INIT: ret = sgx_ioc_enclave_init(encl, (void __user *)arg); break; + case SGX_IOC_ENCLAVE_PROVISION: + ret = sgx_ioc_enclave_provision(encl, (void __user *)arg); + break; default: ret = -ENOIOCTLCMD; break;
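For illustration only (not part of the patch), a runtime would typically invoke the new ioctl before SGX_IOC_ENCLAVE_INIT, roughly as sketched below. The helper name and error handling are hypothetical; the device paths, struct and ioctl number come from the diff above:

  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <asm/sgx.h>

  /* Hypothetical helper: allow PROVISION_KEY for an enclave created
   * through /dev/sgx_enclave ("encl_fd"). */
  static int sgx_enable_provision_key(int encl_fd)
  {
  	struct sgx_enclave_provision params;
  	int provision_fd, ret;

  	provision_fd = open("/dev/sgx_provision", O_RDONLY);
  	if (provision_fd < 0)
  		return -1;

  	params.fd = provision_fd;
  	ret = ioctl(encl_fd, SGX_IOC_ENCLAVE_PROVISION, &params);

  	/* The kernel takes its own reference on the file during the ioctl,
  	 * so the descriptor can be closed immediately afterwards. */
  	close(provision_fd);
  	return ret;
  }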
From: Sean Christopherson sean.j.christopherson@intel.com
mainline inclusion from mainline-v5.11-rc1 commit 8382c668ce4f367d902f4a340a1bfa9e46096ec1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Signals are a horrid little mechanism. They are especially nasty in multi-threaded environments because signal state like handlers is global across the entire process. But, signals are basically the only way that userspace can “gracefully” handle and recover from exceptions.
The kernel generally does not like exceptions to occur during execution. But, exceptions are a fact of life and must be handled in some circumstances. The kernel handles them by keeping a list of individual instructions which may cause exceptions. Instead of truly handling the exception and returning to the instruction that caused it, the kernel instead restarts execution at a *different* instruction. This makes it obvious to that thread of execution that the exception occurred and lets *that* code handle the exception instead of the handler.
This is not dissimilar to the try/catch exceptions mechanisms that some programming languages have, but applied *very* surgically to single instructions. It effectively changes the visible architecture of the instruction.
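For readers unfamiliar with the mechanism, here is a rough sketch modeled on the 5.10-era uaccess code (not code from this series): the load at label 1 may fault; the __ex_table entry emitted by _ASM_EXTABLE() redirects execution to the .fixup snippet at label 3, which records the failure and resumes at label 2.

  #include <asm/asm.h>		/* _ASM_EXTABLE() */
  #include <linux/errno.h>

  /* Kernel-style sketch: read one word from userspace. */
  static int read_user_word(const int __user *uaddr, int *val)
  {
  	int err = 0;

  	asm volatile("1:	movl (%[addr]), %[out]\n"
  		     "2:\n"
  		     ".section .fixup,\"ax\"\n"
  		     "3:	movl %[efault], %[err]\n"
  		     "	jmp 2b\n"
  		     ".previous\n"
  		     _ASM_EXTABLE(1b, 3b)
  		     : [out] "=r" (*val), [err] "+r" (err)
  		     : [addr] "r" (uaddr), [efault] "i" (-EFAULT));

  	return err;
  }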
Problem
=======
SGX generates a lot of signals, and the code to enter and exit enclaves and muck with signal handling is truly horrid. At the same time, an approach like kernel exception fixup cannot easily be applied to userspace instructions because it changes the visible instruction architecture.
Solution
========
The vDSO is a special page of kernel-provided instructions that run in userspace. Any userspace calling into the vDSO knows that it is special. This allows the kernel a place to legitimately rewrite the user/kernel contract and change instruction behavior.
Add support for fixing up exceptions that occur while executing in the vDSO. This replaces what could traditionally only be done with signal handling.
This new mechanism will be used to replace previously direct use of SGX instructions by userspace.
Just introduce the vDSO infrastructure. Later patches will actually replace signal generation with vDSO exception fixup.
Intel-SIG: commit 8382c668ce4f x86/vdso: Add support for exception fixup in vDSO functions Backport for SGX Foundations support
Suggested-by: Andy Lutomirski luto@amacapital.net Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-17-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/entry/vdso/Makefile | 6 ++-- arch/x86/entry/vdso/extable.c | 46 ++++++++++++++++++++++++ arch/x86/entry/vdso/extable.h | 28 +++++++++++++++ arch/x86/entry/vdso/vdso-layout.lds.S | 9 ++++- arch/x86/entry/vdso/vdso2c.h | 50 ++++++++++++++++++++++++++- arch/x86/include/asm/vdso.h | 5 +++ 6 files changed, 139 insertions(+), 5 deletions(-) create mode 100644 arch/x86/entry/vdso/extable.c create mode 100644 arch/x86/entry/vdso/extable.h
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 21243747965d..2ad757fb3c23 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -29,7 +29,7 @@ vobjs32-y := vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o vobjs32-y += vdso32/vclock_gettime.o
# files to link into kernel -obj-y += vma.o +obj-y += vma.o extable.o KASAN_SANITIZE_vma.o := y UBSAN_SANITIZE_vma.o := y KCSAN_SANITIZE_vma.o := y @@ -128,8 +128,8 @@ $(obj)/%-x32.o: $(obj)/%.o FORCE
targets += vdsox32.lds $(vobjx32s-y)
-$(obj)/%.so: OBJCOPYFLAGS := -S
-$(obj)/%.so: $(obj)/%.so.dbg FORCE
+$(obj)/%.so: OBJCOPYFLAGS := -S --remove-section __ex_table
+$(obj)/%.so: $(obj)/%.so.dbg FORCE
	$(call if_changed,objcopy)
$(obj)/vdsox32.so.dbg: $(obj)/vdsox32.lds $(vobjx32s) FORCE diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c new file mode 100644 index 000000000000..afcf5b65beef --- /dev/null +++ b/arch/x86/entry/vdso/extable.c @@ -0,0 +1,46 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/err.h> +#include <linux/mm.h> +#include <asm/current.h> +#include <asm/traps.h> +#include <asm/vdso.h> + +struct vdso_exception_table_entry { + int insn, fixup; +}; + +bool fixup_vdso_exception(struct pt_regs *regs, int trapnr, + unsigned long error_code, unsigned long fault_addr) +{ + const struct vdso_image *image = current->mm->context.vdso_image; + const struct vdso_exception_table_entry *extable; + unsigned int nr_entries, i; + unsigned long base; + + /* + * Do not attempt to fixup #DB or #BP. It's impossible to identify + * whether or not a #DB/#BP originated from within an SGX enclave and + * SGX enclaves are currently the only use case for vDSO fixup. + */ + if (trapnr == X86_TRAP_DB || trapnr == X86_TRAP_BP) + return false; + + if (!current->mm->context.vdso) + return false; + + base = (unsigned long)current->mm->context.vdso + image->extable_base; + nr_entries = image->extable_len / (sizeof(*extable)); + extable = image->extable; + + for (i = 0; i < nr_entries; i++) { + if (regs->ip == base + extable[i].insn) { + regs->ip = base + extable[i].fixup; + regs->di = trapnr; + regs->si = error_code; + regs->dx = fault_addr; + return true; + } + } + + return false; +} diff --git a/arch/x86/entry/vdso/extable.h b/arch/x86/entry/vdso/extable.h new file mode 100644 index 000000000000..b56f6b012941 --- /dev/null +++ b/arch/x86/entry/vdso/extable.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __VDSO_EXTABLE_H +#define __VDSO_EXTABLE_H + +/* + * Inject exception fixup for vDSO code. Unlike normal exception fixup, + * vDSO uses a dedicated handler the addresses are relative to the overall + * exception table, not each individual entry. + */ +#ifdef __ASSEMBLY__ +#define _ASM_VDSO_EXTABLE_HANDLE(from, to) \ + ASM_VDSO_EXTABLE_HANDLE from to + +.macro ASM_VDSO_EXTABLE_HANDLE from:req to:req + .pushsection __ex_table, "a" + .long (\from) - __ex_table + .long (\to) - __ex_table + .popsection +.endm +#else +#define _ASM_VDSO_EXTABLE_HANDLE(from, to) \ + ".pushsection __ex_table, "a"\n" \ + ".long (" #from ") - __ex_table\n" \ + ".long (" #to ") - __ex_table\n" \ + ".popsection\n" +#endif + +#endif /* __VDSO_EXTABLE_H */ diff --git a/arch/x86/entry/vdso/vdso-layout.lds.S b/arch/x86/entry/vdso/vdso-layout.lds.S index 4d152933547d..dc8da7695859 100644 --- a/arch/x86/entry/vdso/vdso-layout.lds.S +++ b/arch/x86/entry/vdso/vdso-layout.lds.S @@ -75,11 +75,18 @@ SECTIONS * stuff that isn't used at runtime in between. */
- .text : { *(.text*) } :text =0x90909090, + .text : { + *(.text*) + *(.fixup) + } :text =0x90909090, + +
.altinstructions : { *(.altinstructions) } :text .altinstr_replacement : { *(.altinstr_replacement) } :text
+ __ex_table : { *(__ex_table) } :text + /DISCARD/ : { *(.discard) *(.discard.*) diff --git a/arch/x86/entry/vdso/vdso2c.h b/arch/x86/entry/vdso/vdso2c.h index 6f46e11ce539..1c7cfac7e64a 100644 --- a/arch/x86/entry/vdso/vdso2c.h +++ b/arch/x86/entry/vdso/vdso2c.h @@ -5,6 +5,41 @@ * are built for 32-bit userspace. */
+static void BITSFUNC(copy)(FILE *outfile, const unsigned char *data, size_t len) +{ + size_t i; + + for (i = 0; i < len; i++) { + if (i % 10 == 0) + fprintf(outfile, "\n\t"); + fprintf(outfile, "0x%02X, ", (int)(data)[i]); + } +} + + +/* + * Extract a section from the input data into a standalone blob. Used to + * capture kernel-only data that needs to persist indefinitely, e.g. the + * exception fixup tables, but only in the kernel, i.e. the section can + * be stripped from the final vDSO image. + */ +static void BITSFUNC(extract)(const unsigned char *data, size_t data_len, + FILE *outfile, ELF(Shdr) *sec, const char *name) +{ + unsigned long offset; + size_t len; + + offset = (unsigned long)GET_LE(&sec->sh_offset); + len = (size_t)GET_LE(&sec->sh_size); + + if (offset + len > data_len) + fail("section to extract overruns input data"); + + fprintf(outfile, "static const unsigned char %s[%lu] = {", name, len); + BITSFUNC(copy)(outfile, data + offset, len); + fprintf(outfile, "\n};\n\n"); +} + static void BITSFUNC(go)(void *raw_addr, size_t raw_len, void *stripped_addr, size_t stripped_len, FILE *outfile, const char *image_name) @@ -15,7 +50,7 @@ static void BITSFUNC(go)(void *raw_addr, size_t raw_len, ELF(Ehdr) *hdr = (ELF(Ehdr) *)raw_addr; unsigned long i, syms_nr; ELF(Shdr) *symtab_hdr = NULL, *strtab_hdr, *secstrings_hdr, - *alt_sec = NULL; + *alt_sec = NULL, *extable_sec = NULL; ELF(Dyn) *dyn = 0, *dyn_end = 0; const char *secstrings; INT_BITS syms[NSYMS] = {}; @@ -77,6 +112,8 @@ static void BITSFUNC(go)(void *raw_addr, size_t raw_len, if (!strcmp(secstrings + GET_LE(&sh->sh_name), ".altinstructions")) alt_sec = sh; + if (!strcmp(secstrings + GET_LE(&sh->sh_name), "__ex_table")) + extable_sec = sh; }
if (!symtab_hdr) @@ -155,6 +192,9 @@ static void BITSFUNC(go)(void *raw_addr, size_t raw_len, (int)((unsigned char *)stripped_addr)[i]); } fprintf(outfile, "\n};\n\n"); + if (extable_sec) + BITSFUNC(extract)(raw_addr, raw_len, outfile, + extable_sec, "extable");
fprintf(outfile, "const struct vdso_image %s = {\n", image_name); fprintf(outfile, "\t.data = raw_data,\n"); @@ -165,6 +205,14 @@ static void BITSFUNC(go)(void *raw_addr, size_t raw_len, fprintf(outfile, "\t.alt_len = %lu,\n", (unsigned long)GET_LE(&alt_sec->sh_size)); } + if (extable_sec) { + fprintf(outfile, "\t.extable_base = %lu,\n", + (unsigned long)GET_LE(&extable_sec->sh_offset)); + fprintf(outfile, "\t.extable_len = %lu,\n", + (unsigned long)GET_LE(&extable_sec->sh_size)); + fprintf(outfile, "\t.extable = extable,\n"); + } + for (i = 0; i < NSYMS; i++) { if (required_syms[i].export && syms[i]) fprintf(outfile, "\t.sym_%s = %" PRIi64 ",\n", diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h index bbcdc7b8f963..b5d23470f56b 100644 --- a/arch/x86/include/asm/vdso.h +++ b/arch/x86/include/asm/vdso.h @@ -15,6 +15,8 @@ struct vdso_image { unsigned long size; /* Always a multiple of PAGE_SIZE */
unsigned long alt, alt_len; + unsigned long extable_base, extable_len; + const void *extable;
long sym_vvar_start; /* Negative offset to the vvar area */
@@ -45,6 +47,9 @@ extern void __init init_vdso_image(const struct vdso_image *image);
extern int map_vdso_once(const struct vdso_image *image, unsigned long addr);
+extern bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
+				 unsigned long error_code,
+				 unsigned long fault_addr);
 #endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_VDSO_H */
From: Sean Christopherson sean.j.christopherson@intel.com
mainline inclusion from mainline-v5.11-rc1 commit cd072dab453a9b4a9f7927f9eddca5a156fbd87d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
vDSO exception fixup is a replacement for signals in limited situations. Signals and vDSO exception fixup need to provide similar information to userspace, including the hardware error code.
That hardware error code needs to be sanitized. For instance, if userspace accesses a kernel address, the error code could indicate to userspace whether the address had a Present=1 PTE. That can leak information about the kernel layout to userspace, which is bad.
The existing signal code does this sanitization, but fairly late in the signal process. The vDSO exception code runs before the sanitization happens.
Move error code sanitization out of the signal code and into a helper. Call the helper in the signal code.
Intel-SIG: commit cd072dab453a x86/fault: Add a helper function to sanitize error code. Backport summary: This is a dependency patch for the CET preparation patches.
Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-18-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/mm/fault.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 7b707c9b661a..e86cdbb9692e 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -610,11 +610,9 @@ pgtable_bad(struct pt_regs *regs, unsigned long error_code, oops_end(flags, regs, sig); }
-static void set_signal_archinfo(unsigned long address, - unsigned long error_code) +static void sanitize_error_code(unsigned long address, + unsigned long *error_code) { - struct task_struct *tsk = current; - /* * To avoid leaking information about the kernel page * table layout, pretend that user-mode accesses to @@ -625,7 +623,13 @@ static void set_signal_archinfo(unsigned long address, * information and does not appear to cause any problems. */ if (address >= TASK_SIZE_MAX) - error_code |= X86_PF_PROT; + *error_code |= X86_PF_PROT; +} + +static void set_signal_archinfo(unsigned long address, + unsigned long error_code) +{ + struct task_struct *tsk = current;
tsk->thread.trap_nr = X86_TRAP_PF; tsk->thread.error_code = error_code | X86_PF_USER; @@ -666,6 +670,8 @@ no_context(struct pt_regs *regs, unsigned long error_code, * faulting through the emulate_vsyscall() logic. */ if (current->thread.sig_on_uaccess_err && signal) { + sanitize_error_code(address, &error_code); + set_signal_archinfo(address, error_code);
/* XXX: hwpoison faults will set the wrong code. */ @@ -819,13 +825,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, if (is_errata100(regs, address)) return;
- /* - * To avoid leaking information about the kernel page table - * layout, pretend that user-mode accesses to kernel addresses - * are always protection faults. - */ - if (address >= TASK_SIZE_MAX) - error_code |= X86_PF_PROT; + sanitize_error_code(address, &error_code);
if (likely(show_unhandled_signals)) show_signal_msg(regs, error_code, address, tsk); @@ -944,6 +944,8 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address, if (is_prefetch(regs, error_code, address)) return;
+ sanitize_error_code(address, &error_code); + set_signal_archinfo(address, error_code);
#ifdef CONFIG_MEMORY_FAILURE
From: Sean Christopherson sean.j.christopherson@intel.com
mainline inclusion from mainline-v5.11-rc1 commit 334872a0919890a70cccd00b8e11931020a819be category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
vDSO functions can now leverage an exception fixup mechanism similar to kernel exception fixup. For vDSO exception fixup, the initial user is Intel's Software Guard Extensions (SGX), which will wrap the low-level transitions to/from the enclave, i.e. EENTER and ERESUME instructions, in a vDSO function and leverage fixup to intercept exceptions that would otherwise generate a signal. This allows the vDSO wrapper to return the fault information directly to its caller, obviating the need for SGX applications and libraries to juggle signal handlers.
Attempt to fixup vDSO exceptions immediately prior to populating and sending signal information. Except for the delivery mechanism, an exception in a vDSO function should be treated like any other exception in userspace, e.g. any fault that is successfully handled by the kernel should not be directly visible to userspace.
Although it's debatable whether or not all exceptions are of interest to enclaves, defer to the vDSO fixup to decide whether to do fixup or generate a signal. Future users of vDSO fixup, if there ever are any, will undoubtedly have different requirements than SGX enclaves, e.g. the fixup vs. signal logic can be made function specific if/when necessary.
Suggested-by: Andy Lutomirski luto@amacapital.net Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-19-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/traps.c | 10 ++++++++++ arch/x86/mm/fault.c | 7 +++++++ 2 files changed, 17 insertions(+)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 2d4ecd50e69b..5da01819fb47 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -60,6 +60,7 @@ #include <asm/umip.h> #include <asm/insn.h> #include <asm/insn-eval.h> +#include <asm/vdso.h>
#ifdef CONFIG_X86_64 #include <asm/x86_init.h> @@ -117,6 +118,9 @@ do_trap_no_signal(struct task_struct *tsk, int trapnr, const char *str, tsk->thread.error_code = error_code; tsk->thread.trap_nr = trapnr; die(str, regs, error_code); + } else { + if (fixup_vdso_exception(regs, trapnr, error_code, 0)) + return 0; }
/* @@ -585,6 +589,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) tsk->thread.error_code = error_code; tsk->thread.trap_nr = X86_TRAP_GP;
+ if (fixup_vdso_exception(regs, X86_TRAP_GP, error_code, 0)) + return; + show_signal(tsk, SIGSEGV, "", desc, regs, error_code); force_sig(SIGSEGV); goto exit; @@ -1083,6 +1090,9 @@ static void math_error(struct pt_regs *regs, int trapnr) if (!si_code) goto exit;
+ if (fixup_vdso_exception(regs, trapnr, 0, 0)) + return; + force_sig_fault(SIGFPE, si_code, (void __user *)uprobe_get_trap_addr(regs)); exit: diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index e86cdbb9692e..34112c63b347 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -31,6 +31,7 @@ #include <asm/cpu_entry_area.h> /* exception stack */ #include <asm/pgtable_areas.h> /* VMALLOC_START, ... */ #include <asm/kvm_para.h> /* kvm_handle_async_pf */ +#include <asm/vdso.h> /* fixup_vdso_exception() */
#define CREATE_TRACE_POINTS #include <asm/trace/exceptions.h> @@ -827,6 +828,9 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
sanitize_error_code(address, &error_code);
+ if (fixup_vdso_exception(regs, X86_TRAP_PF, error_code, address)) + return; + if (likely(show_unhandled_signals)) show_signal_msg(regs, error_code, address, tsk);
@@ -946,6 +950,9 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
sanitize_error_code(address, &error_code);
+ if (fixup_vdso_exception(regs, X86_TRAP_PF, error_code, address)) + return; + set_signal_archinfo(address, error_code);
#ifdef CONFIG_MEMORY_FAILURE
From: Sean Christopherson sean.j.christopherson@intel.com
mainline inclusion from mainline-v5.11-rc1 commit 84664369520170f48546c55cbc1f3fbde9b1e140 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Enclaves encounter exceptions for lots of reasons: everything from enclave page faults to NULL pointer dereferences, to system calls that must be “proxied” to the kernel from outside the enclave.
In addition to the code contained inside an enclave, there is also supporting code outside the enclave called an “SGX runtime”, which is virtually always implemented inside a shared library. The runtime helps build the enclave and handles things like *re*building the enclave if it got destroyed by something like a suspend/resume cycle.
The rebuilding has traditionally been handled in SIGSEGV handlers, registered by the library. But, being process-wide, shared state, signal handling and shared libraries do not mix well.
Introduce a vDSO function call that wraps the enclave entry functions (EENTER/ERESUME functions of the ENCLU instruction) and returns information about any exceptions to the caller in the SGX runtime.
Instead of generating a signal, the kernel places exception information in RDI, RSI and RDX. The kernel-provided userspace portion of the vDSO handler will place this information in a user-provided buffer or trigger a user-provided callback at the time of the exception.
The vDSO function calling convention uses the standard RDI, RSI, RDX, RCX, R8 and R9 registers. This makes it possible to declare the vDSO function with a C prototype, but other than that there is no specific support for the SystemV ABI. Things like saving XSAVE state are the responsibility of the enclave and the runtime.
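For orientation, a hedged sketch of a caller using the uapi types added below; the symbol lookup and the initialized enclave are assumed to exist already (e.g. located via AT_SYSINFO_EHDR, the way the selftest later in this series does):

  #include <stdio.h>
  #include <asm/sgx.h>	/* vdso_sgx_enter_enclave_t, struct sgx_enclave_run */

  #define EENTER	2	/* ENCLU leaf; mirrors the value in asm/enclu.h */

  /* "eenter" is assumed to point at __vdso_sgx_enter_enclave in the
   * mapped vDSO, and "tcs" at a TCS page of an initialized enclave. */
  static int enter_enclave(vdso_sgx_enter_enclave_t eenter, __u64 tcs,
  			 unsigned long rdi, unsigned long rsi)
  {
  	struct sgx_enclave_run run = { .tcs = tcs };
  	int ret;

  	ret = eenter(rdi, rsi, 0, EENTER, 0, 0, &run);
  	if (ret == -EFAULT)	/* fixed-up exception, no SIGSEGV */
  		fprintf(stderr, "enclave fault: vector %u at 0x%llx\n",
  			run.exception_vector, run.exception_addr);

  	return ret;
  }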
[ bp: Change vsgx.o build dependency to CONFIG_X86_SGX. ]
Intel-SIG: commit 846643695201 x86/vdso: Implement a vDSO for Intel SGX enclave call Backport for SGX Foundations support
Suggested-by: Andy Lutomirski luto@amacapital.net Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Co-developed-by: Cedric Xing cedric.xing@intel.com Signed-off-by: Cedric Xing cedric.xing@intel.com Co-developed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-20-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/entry/vdso/Makefile | 2 + arch/x86/entry/vdso/vdso.lds.S | 1 + arch/x86/entry/vdso/vsgx.S | 151 ++++++++++++++++++++++++++++++++ arch/x86/include/asm/enclu.h | 9 ++ arch/x86/include/uapi/asm/sgx.h | 91 +++++++++++++++++++ 5 files changed, 254 insertions(+) create mode 100644 arch/x86/entry/vdso/vsgx.S create mode 100644 arch/x86/include/asm/enclu.h
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 2ad757fb3c23..02e3e42f380b 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -27,6 +27,7 @@ VDSO32-$(CONFIG_IA32_EMULATION) := y vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o vobjs32-y := vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o vobjs32-y += vdso32/vclock_gettime.o +vobjs-$(CONFIG_X86_SGX) += vsgx.o
# files to link into kernel obj-y += vma.o extable.o @@ -98,6 +99,7 @@ $(vobjs): KBUILD_CFLAGS := $(filter-out $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS CFLAGS_REMOVE_vclock_gettime.o = -pg CFLAGS_REMOVE_vdso32/vclock_gettime.o = -pg CFLAGS_REMOVE_vgetcpu.o = -pg +CFLAGS_REMOVE_vsgx.o = -pg
# # X32 processes use x32 vDSO to access 64bit kernel data. diff --git a/arch/x86/entry/vdso/vdso.lds.S b/arch/x86/entry/vdso/vdso.lds.S index 36b644e16272..4bf48462fca7 100644 --- a/arch/x86/entry/vdso/vdso.lds.S +++ b/arch/x86/entry/vdso/vdso.lds.S @@ -27,6 +27,7 @@ VERSION { __vdso_time; clock_getres; __vdso_clock_getres; + __vdso_sgx_enter_enclave; local: *; }; } diff --git a/arch/x86/entry/vdso/vsgx.S b/arch/x86/entry/vdso/vsgx.S new file mode 100644 index 000000000000..86a0e94f68df --- /dev/null +++ b/arch/x86/entry/vdso/vsgx.S @@ -0,0 +1,151 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include <linux/linkage.h> +#include <asm/export.h> +#include <asm/errno.h> +#include <asm/enclu.h> + +#include "extable.h" + +/* Relative to %rbp. */ +#define SGX_ENCLAVE_OFFSET_OF_RUN 16 + +/* The offsets relative to struct sgx_enclave_run. */ +#define SGX_ENCLAVE_RUN_TCS 0 +#define SGX_ENCLAVE_RUN_LEAF 8 +#define SGX_ENCLAVE_RUN_EXCEPTION_VECTOR 12 +#define SGX_ENCLAVE_RUN_EXCEPTION_ERROR_CODE 14 +#define SGX_ENCLAVE_RUN_EXCEPTION_ADDR 16 +#define SGX_ENCLAVE_RUN_USER_HANDLER 24 +#define SGX_ENCLAVE_RUN_USER_DATA 32 /* not used */ +#define SGX_ENCLAVE_RUN_RESERVED_START 40 +#define SGX_ENCLAVE_RUN_RESERVED_END 256 + +.code64 +.section .text, "ax" + +SYM_FUNC_START(__vdso_sgx_enter_enclave) + /* Prolog */ + .cfi_startproc + push %rbp + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset %rbp, 0 + mov %rsp, %rbp + .cfi_def_cfa_register %rbp + push %rbx + .cfi_rel_offset %rbx, -8 + + mov %ecx, %eax +.Lenter_enclave: + /* EENTER <= function <= ERESUME */ + cmp $EENTER, %eax + jb .Linvalid_input + cmp $ERESUME, %eax + ja .Linvalid_input + + mov SGX_ENCLAVE_OFFSET_OF_RUN(%rbp), %rcx + + /* Validate that the reserved area contains only zeros. */ + mov $SGX_ENCLAVE_RUN_RESERVED_START, %rbx +1: + cmpq $0, (%rcx, %rbx) + jne .Linvalid_input + add $8, %rbx + cmpq $SGX_ENCLAVE_RUN_RESERVED_END, %rbx + jne 1b + + /* Load TCS and AEP */ + mov SGX_ENCLAVE_RUN_TCS(%rcx), %rbx + lea .Lasync_exit_pointer(%rip), %rcx + + /* Single ENCLU serving as both EENTER and AEP (ERESUME) */ +.Lasync_exit_pointer: +.Lenclu_eenter_eresume: + enclu + + /* EEXIT jumps here unless the enclave is doing something fancy. */ + mov SGX_ENCLAVE_OFFSET_OF_RUN(%rbp), %rbx + + /* Set exit_reason. */ + movl $EEXIT, SGX_ENCLAVE_RUN_LEAF(%rbx) + + /* Invoke userspace's exit handler if one was provided. */ +.Lhandle_exit: + cmpq $0, SGX_ENCLAVE_RUN_USER_HANDLER(%rbx) + jne .Linvoke_userspace_handler + + /* Success, in the sense that ENCLU was attempted. */ + xor %eax, %eax + +.Lout: + pop %rbx + leave + .cfi_def_cfa %rsp, 8 + ret + + /* The out-of-line code runs with the pre-leave stack frame. */ + .cfi_def_cfa %rbp, 16 + +.Linvalid_input: + mov $(-EINVAL), %eax + jmp .Lout + +.Lhandle_exception: + mov SGX_ENCLAVE_OFFSET_OF_RUN(%rbp), %rbx + + /* Set the exception info. */ + mov %eax, (SGX_ENCLAVE_RUN_LEAF)(%rbx) + mov %di, (SGX_ENCLAVE_RUN_EXCEPTION_VECTOR)(%rbx) + mov %si, (SGX_ENCLAVE_RUN_EXCEPTION_ERROR_CODE)(%rbx) + mov %rdx, (SGX_ENCLAVE_RUN_EXCEPTION_ADDR)(%rbx) + jmp .Lhandle_exit + +.Linvoke_userspace_handler: + /* Pass the untrusted RSP (at exit) to the callback via %rcx. */ + mov %rsp, %rcx + + /* Save struct sgx_enclave_exception %rbx is about to be clobbered. */ + mov %rbx, %rax + + /* Save the untrusted RSP offset in %rbx (non-volatile register). */ + mov %rsp, %rbx + and $0xf, %rbx + + /* + * Align stack per x86_64 ABI. Note, %rsp needs to be 16-byte aligned + * _after_ pushing the parameters on the stack, hence the bonus push. 
+	 */
+	and	$-0x10, %rsp
+	push	%rax
+
+	/* Push struct sgx_enclave_exception as a param to the callback. */
+	push	%rax
+
+	/* Clear RFLAGS.DF per x86_64 ABI */
+	cld
+
+	/*
+	 * Load the callback pointer to %rax and lfence for LVI (load value
+	 * injection) protection before making the call.
+	 */
+	mov	SGX_ENCLAVE_RUN_USER_HANDLER(%rax), %rax
+	lfence
+	call	*%rax
+
+	/* Undo the post-exit %rsp adjustment. */
+	lea	0x10(%rsp, %rbx), %rsp
+
+	/*
+	 * If the return from callback is zero or negative, return immediately,
+	 * else re-execute ENCLU with the positive return value interpreted as
+	 * the requested ENCLU function.
+	 */
+	cmp	$0, %eax
+	jle	.Lout
+	jmp	.Lenter_enclave
+
+	.cfi_endproc
+
+_ASM_VDSO_EXTABLE_HANDLE(.Lenclu_eenter_eresume, .Lhandle_exception)
+
+SYM_FUNC_END(__vdso_sgx_enter_enclave)
diff --git a/arch/x86/include/asm/enclu.h b/arch/x86/include/asm/enclu.h
new file mode 100644
index 000000000000..b1314e41a744
--- /dev/null
+++ b/arch/x86/include/asm/enclu.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_ENCLU_H
+#define _ASM_X86_ENCLU_H
+
+#define EENTER	0x02
+#define ERESUME	0x03
+#define EEXIT	0x04
+
+#endif /* _ASM_X86_ENCLU_H */
diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index c32210235bf5..791e45334a4a 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -74,4 +74,95 @@ struct sgx_enclave_provision {
	__u64 fd;
 };
+struct sgx_enclave_run; + +/** + * typedef sgx_enclave_user_handler_t - Exit handler function accepted by + * __vdso_sgx_enter_enclave() + * @run: The run instance given by the caller + * + * The register parameters contain the snapshot of their values at enclave + * exit. An invalid ENCLU function number will cause -EINVAL to be returned + * to the caller. + * + * Return: + * - <= 0: The given value is returned back to the caller. + * - > 0: ENCLU function to invoke, either EENTER or ERESUME. + */ +typedef int (*sgx_enclave_user_handler_t)(long rdi, long rsi, long rdx, + long rsp, long r8, long r9, + struct sgx_enclave_run *run); + +/** + * struct sgx_enclave_run - the execution context of __vdso_sgx_enter_enclave() + * @tcs: TCS used to enter the enclave + * @function: The last seen ENCLU function (EENTER, ERESUME or EEXIT) + * @exception_vector: The interrupt vector of the exception + * @exception_error_code: The exception error code pulled out of the stack + * @exception_addr: The address that triggered the exception + * @user_handler: User provided callback run on exception + * @user_data: Data passed to the user handler + * @reserved Reserved for future extensions + * + * If @user_handler is provided, the handler will be invoked on all return paths + * of the normal flow. The user handler may transfer control, e.g. via a + * longjmp() call or a C++ exception, without returning to + * __vdso_sgx_enter_enclave(). + */ +struct sgx_enclave_run { + __u64 tcs; + __u32 function; + __u16 exception_vector; + __u16 exception_error_code; + __u64 exception_addr; + __u64 user_handler; + __u64 user_data; + __u8 reserved[216]; +}; + +/** + * typedef vdso_sgx_enter_enclave_t - Prototype for __vdso_sgx_enter_enclave(), + * a vDSO function to enter an SGX enclave. + * @rdi: Pass-through value for RDI + * @rsi: Pass-through value for RSI + * @rdx: Pass-through value for RDX + * @function: ENCLU function, must be EENTER or ERESUME + * @r8: Pass-through value for R8 + * @r9: Pass-through value for R9 + * @run: struct sgx_enclave_run, must be non-NULL + * + * NOTE: __vdso_sgx_enter_enclave() does not ensure full compliance with the + * x86-64 ABI, e.g. doesn't handle XSAVE state. Except for non-volatile + * general purpose registers, EFLAGS.DF, and RSP alignment, preserving/setting + * state in accordance with the x86-64 ABI is the responsibility of the enclave + * and its runtime, i.e. __vdso_sgx_enter_enclave() cannot be called from C + * code without careful consideration by both the enclave and its runtime. + * + * All general purpose registers except RAX, RBX and RCX are passed as-is to the + * enclave. RAX, RBX and RCX are consumed by EENTER and ERESUME and are loaded + * with @function, asynchronous exit pointer, and @run.tcs respectively. + * + * RBP and the stack are used to anchor __vdso_sgx_enter_enclave() to the + * pre-enclave state, e.g. to retrieve @run.exception and @run.user_handler + * after an enclave exit. All other registers are available for use by the + * enclave and its runtime, e.g. an enclave can push additional data onto the + * stack (and modify RSP) to pass information to the optional user handler (see + * below). + * + * Most exceptions reported on ENCLU, including those that occur within the + * enclave, are fixed up and reported synchronously instead of being delivered + * via a standard signal. Debug Exceptions (#DB) and Breakpoints (#BP) are + * never fixed up and are always delivered via standard signals. 
On synchronously
+ * reported exceptions, -EFAULT is returned and details about the exception are
+ * recorded in @run.exception, the optional sgx_enclave_exception struct.
+ *
+ * Return:
+ * - 0:		ENCLU function was successfully executed.
+ * - -EINVAL:	Invalid ENCL number (neither EENTER nor ERESUME).
+ */
+typedef int (*vdso_sgx_enter_enclave_t)(unsigned long rdi, unsigned long rsi,
+					unsigned long rdx, unsigned int function,
+					unsigned long r8, unsigned long r9,
+					struct sgx_enclave_run *run);
+
 #endif /* _UAPI_ASM_X86_SGX_H */
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 2adcba79e69d4a4c0ac3bb86f466d8b5df301608 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Add a selftest for SGX. It is a trivial test where a simple enclave copies one 64-bit word of memory between two memory locations, but ensures that all SGX hardware and software infrastructure is functioning.
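Once merged, the test should be runnable through the usual kselftest entry point on hardware (or a VM) with SGX enabled and the driver nodes present, e.g.:

  $ make -C tools/testing/selftests TARGETS=sgx run_tests

or built and invoked directly from tools/testing/selftests/sgx as ./test_sgx; it prints SUCCESS and exits with KSFT_PASS when the copied word matches.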
Intel-SIG: commit 2adcba79e69d selftests/x86: Add a selftest for SGX Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Cc: linux-kselftest@vger.kernel.org Link: https://lkml.kernel.org/r/20201112220135.165028-21-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/sgx/.gitignore | 2 + tools/testing/selftests/sgx/Makefile | 53 +++ tools/testing/selftests/sgx/call.S | 44 ++ tools/testing/selftests/sgx/defines.h | 21 + tools/testing/selftests/sgx/load.c | 277 +++++++++++++ tools/testing/selftests/sgx/main.c | 246 +++++++++++ tools/testing/selftests/sgx/main.h | 38 ++ tools/testing/selftests/sgx/sigstruct.c | 391 ++++++++++++++++++ tools/testing/selftests/sgx/test_encl.c | 20 + tools/testing/selftests/sgx/test_encl.lds | 40 ++ .../selftests/sgx/test_encl_bootstrap.S | 89 ++++ 12 files changed, 1222 insertions(+) create mode 100644 tools/testing/selftests/sgx/.gitignore create mode 100644 tools/testing/selftests/sgx/Makefile create mode 100644 tools/testing/selftests/sgx/call.S create mode 100644 tools/testing/selftests/sgx/defines.h create mode 100644 tools/testing/selftests/sgx/load.c create mode 100644 tools/testing/selftests/sgx/main.c create mode 100644 tools/testing/selftests/sgx/main.h create mode 100644 tools/testing/selftests/sgx/sigstruct.c create mode 100644 tools/testing/selftests/sgx/test_encl.c create mode 100644 tools/testing/selftests/sgx/test_encl.lds create mode 100644 tools/testing/selftests/sgx/test_encl_bootstrap.S
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index d9c283503159..2e20e30a6faa 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -50,6 +50,7 @@ TARGETS += openat2 TARGETS += rseq TARGETS += rtc TARGETS += seccomp +TARGETS += sgx TARGETS += sigaltstack TARGETS += size TARGETS += sparc64 diff --git a/tools/testing/selftests/sgx/.gitignore b/tools/testing/selftests/sgx/.gitignore new file mode 100644 index 000000000000..fbaf0bda9a92 --- /dev/null +++ b/tools/testing/selftests/sgx/.gitignore @@ -0,0 +1,2 @@ +test_sgx +test_encl.elf diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile new file mode 100644 index 000000000000..d51c90663943 --- /dev/null +++ b/tools/testing/selftests/sgx/Makefile @@ -0,0 +1,53 @@ +top_srcdir = ../../../.. + +include ../lib.mk + +.PHONY: all clean + +CAN_BUILD_X86_64 := $(shell ../x86/check_cc.sh $(CC) \ + ../x86/trivial_64bit_program.c) + +ifndef OBJCOPY +OBJCOPY := $(CROSS_COMPILE)objcopy +endif + +INCLUDES := -I$(top_srcdir)/tools/include +HOST_CFLAGS := -Wall -Werror -g $(INCLUDES) -fPIC -z noexecstack +ENCL_CFLAGS := -Wall -Werror -static -nostdlib -nostartfiles -fPIC \ + -fno-stack-protector -mrdrnd $(INCLUDES) + +TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx + +ifeq ($(CAN_BUILD_X86_64), 1) +all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf +endif + +$(OUTPUT)/test_sgx: $(OUTPUT)/main.o \ + $(OUTPUT)/load.o \ + $(OUTPUT)/sigstruct.o \ + $(OUTPUT)/call.o + $(CC) $(HOST_CFLAGS) -o $@ $^ -lcrypto + +$(OUTPUT)/main.o: main.c + $(CC) $(HOST_CFLAGS) -c $< -o $@ + +$(OUTPUT)/load.o: load.c + $(CC) $(HOST_CFLAGS) -c $< -o $@ + +$(OUTPUT)/sigstruct.o: sigstruct.c + $(CC) $(HOST_CFLAGS) -c $< -o $@ + +$(OUTPUT)/call.o: call.S + $(CC) $(HOST_CFLAGS) -c $< -o $@ + +$(OUTPUT)/test_encl.elf: test_encl.lds test_encl.c test_encl_bootstrap.S + $(CC) $(ENCL_CFLAGS) -T $^ -o $@ + +EXTRA_CLEAN := \ + $(OUTPUT)/test_encl.elf \ + $(OUTPUT)/load.o \ + $(OUTPUT)/call.o \ + $(OUTPUT)/main.o \ + $(OUTPUT)/sigstruct.o \ + $(OUTPUT)/test_sgx \ + $(OUTPUT)/test_sgx.o \ diff --git a/tools/testing/selftests/sgx/call.S b/tools/testing/selftests/sgx/call.S new file mode 100644 index 000000000000..4ecadc7490f4 --- /dev/null +++ b/tools/testing/selftests/sgx/call.S @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/** +* Copyright(c) 2016-20 Intel Corporation. +*/ + + .text + + .global sgx_call_vdso +sgx_call_vdso: + .cfi_startproc + push %r15 + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset %r15, 0 + push %r14 + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset %r14, 0 + push %r13 + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset %r13, 0 + push %r12 + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset %r12, 0 + push %rbx + .cfi_adjust_cfa_offset 8 + .cfi_rel_offset %rbx, 0 + push $0 + .cfi_adjust_cfa_offset 8 + push 0x38(%rsp) + .cfi_adjust_cfa_offset 8 + call *eenter(%rip) + add $0x10, %rsp + .cfi_adjust_cfa_offset -0x10 + pop %rbx + .cfi_adjust_cfa_offset -8 + pop %r12 + .cfi_adjust_cfa_offset -8 + pop %r13 + .cfi_adjust_cfa_offset -8 + pop %r14 + .cfi_adjust_cfa_offset -8 + pop %r15 + .cfi_adjust_cfa_offset -8 + ret + .cfi_endproc diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h new file mode 100644 index 000000000000..592c1ccf4576 --- /dev/null +++ b/tools/testing/selftests/sgx/defines.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2016-20 Intel Corporation. 
+ */ + +#ifndef DEFINES_H +#define DEFINES_H + +#include <stdint.h> + +#define PAGE_SIZE 4096 +#define PAGE_MASK (~(PAGE_SIZE - 1)) + +#define __aligned(x) __attribute__((__aligned__(x))) +#define __packed __attribute__((packed)) + +#include "../../../../arch/x86/kernel/cpu/sgx/arch.h" +#include "../../../../arch/x86/include/asm/enclu.h" +#include "../../../../arch/x86/include/uapi/asm/sgx.h" + +#endif /* DEFINES_H */ diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c new file mode 100644 index 000000000000..9d43b75aaa55 --- /dev/null +++ b/tools/testing/selftests/sgx/load.c @@ -0,0 +1,277 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include <assert.h> +#include <elf.h> +#include <errno.h> +#include <fcntl.h> +#include <stdbool.h> +#include <stdio.h> +#include <stdint.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/ioctl.h> +#include <sys/mman.h> +#include <sys/stat.h> +#include <sys/time.h> +#include <sys/types.h> +#include "defines.h" +#include "main.h" + +void encl_delete(struct encl *encl) +{ + if (encl->encl_base) + munmap((void *)encl->encl_base, encl->encl_size); + + if (encl->bin) + munmap(encl->bin, encl->bin_size); + + if (encl->fd) + close(encl->fd); + + if (encl->segment_tbl) + free(encl->segment_tbl); + + memset(encl, 0, sizeof(*encl)); +} + +static bool encl_map_bin(const char *path, struct encl *encl) +{ + struct stat sb; + void *bin; + int ret; + int fd; + + fd = open(path, O_RDONLY); + if (fd == -1) { + perror("open()"); + return false; + } + + ret = stat(path, &sb); + if (ret) { + perror("stat()"); + goto err; + } + + bin = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0); + if (bin == MAP_FAILED) { + perror("mmap()"); + goto err; + } + + encl->bin = bin; + encl->bin_size = sb.st_size; + + close(fd); + return true; + +err: + close(fd); + return false; +} + +static bool encl_ioc_create(struct encl *encl) +{ + struct sgx_secs *secs = &encl->secs; + struct sgx_enclave_create ioc; + int rc; + + assert(encl->encl_base != 0); + + memset(secs, 0, sizeof(*secs)); + secs->ssa_frame_size = 1; + secs->attributes = SGX_ATTR_MODE64BIT; + secs->xfrm = 3; + secs->base = encl->encl_base; + secs->size = encl->encl_size; + + ioc.src = (unsigned long)secs; + rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_CREATE, &ioc); + if (rc) { + fprintf(stderr, "SGX_IOC_ENCLAVE_CREATE failed: errno=%d\n", + errno); + munmap((void *)secs->base, encl->encl_size); + return false; + } + + return true; +} + +static bool encl_ioc_add_pages(struct encl *encl, struct encl_segment *seg) +{ + struct sgx_enclave_add_pages ioc; + struct sgx_secinfo secinfo; + int rc; + + memset(&secinfo, 0, sizeof(secinfo)); + secinfo.flags = seg->flags; + + ioc.src = (uint64_t)encl->src + seg->offset; + ioc.offset = seg->offset; + ioc.length = seg->size; + ioc.secinfo = (unsigned long)&secinfo; + ioc.flags = SGX_PAGE_MEASURE; + + rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_ADD_PAGES, &ioc); + if (rc < 0) { + fprintf(stderr, "SGX_IOC_ENCLAVE_ADD_PAGES failed: errno=%d.\n", + errno); + return false; + } + + return true; +} + +bool encl_load(const char *path, struct encl *encl) +{ + Elf64_Phdr *phdr_tbl; + off_t src_offset; + Elf64_Ehdr *ehdr; + int i, j; + int ret; + + memset(encl, 0, sizeof(*encl)); + + ret = open("/dev/sgx_enclave", O_RDWR); + if (ret < 0) { + fprintf(stderr, "Unable to open /dev/sgx_enclave\n"); + goto err; + } + + encl->fd = ret; + + if (!encl_map_bin(path, encl)) + goto err; + + ehdr = encl->bin; + phdr_tbl = 
encl->bin + ehdr->e_phoff; + + for (i = 0; i < ehdr->e_phnum; i++) { + Elf64_Phdr *phdr = &phdr_tbl[i]; + + if (phdr->p_type == PT_LOAD) + encl->nr_segments++; + } + + encl->segment_tbl = calloc(encl->nr_segments, + sizeof(struct encl_segment)); + if (!encl->segment_tbl) + goto err; + + for (i = 0, j = 0; i < ehdr->e_phnum; i++) { + Elf64_Phdr *phdr = &phdr_tbl[i]; + unsigned int flags = phdr->p_flags; + struct encl_segment *seg; + + if (phdr->p_type != PT_LOAD) + continue; + + seg = &encl->segment_tbl[j]; + + if (!!(flags & ~(PF_R | PF_W | PF_X))) { + fprintf(stderr, + "%d has invalid segment flags 0x%02x.\n", i, + phdr->p_flags); + goto err; + } + + if (j == 0 && flags != (PF_R | PF_W)) { + fprintf(stderr, + "TCS has invalid segment flags 0x%02x.\n", + phdr->p_flags); + goto err; + } + + if (j == 0) { + src_offset = phdr->p_offset & PAGE_MASK; + + seg->prot = PROT_READ | PROT_WRITE; + seg->flags = SGX_PAGE_TYPE_TCS << 8; + } else { + seg->prot = (phdr->p_flags & PF_R) ? PROT_READ : 0; + seg->prot |= (phdr->p_flags & PF_W) ? PROT_WRITE : 0; + seg->prot |= (phdr->p_flags & PF_X) ? PROT_EXEC : 0; + seg->flags = (SGX_PAGE_TYPE_REG << 8) | seg->prot; + } + + seg->offset = (phdr->p_offset & PAGE_MASK) - src_offset; + seg->size = (phdr->p_filesz + PAGE_SIZE - 1) & PAGE_MASK; + + printf("0x%016lx 0x%016lx 0x%02x\n", seg->offset, seg->size, + seg->prot); + + j++; + } + + assert(j == encl->nr_segments); + + encl->src = encl->bin + src_offset; + encl->src_size = encl->segment_tbl[j - 1].offset + + encl->segment_tbl[j - 1].size; + + for (encl->encl_size = 4096; encl->encl_size < encl->src_size; ) + encl->encl_size <<= 1; + + return true; + +err: + encl_delete(encl); + return false; +} + +static bool encl_map_area(struct encl *encl) +{ + size_t encl_size = encl->encl_size; + void *area; + + area = mmap(NULL, encl_size * 2, PROT_NONE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (area == MAP_FAILED) { + perror("mmap"); + return false; + } + + encl->encl_base = ((uint64_t)area + encl_size - 1) & ~(encl_size - 1); + + munmap(area, encl->encl_base - (uint64_t)area); + munmap((void *)(encl->encl_base + encl_size), + (uint64_t)area + encl_size - encl->encl_base); + + return true; +} + +bool encl_build(struct encl *encl) +{ + struct sgx_enclave_init ioc; + int ret; + int i; + + if (!encl_map_area(encl)) + return false; + + if (!encl_ioc_create(encl)) + return false; + + /* + * Pages must be added before mapping VMAs because their permissions + * cap the VMA permissions. + */ + for (i = 0; i < encl->nr_segments; i++) { + struct encl_segment *seg = &encl->segment_tbl[i]; + + if (!encl_ioc_add_pages(encl, seg)) + return false; + } + + ioc.sigstruct = (uint64_t)&encl->sigstruct; + ret = ioctl(encl->fd, SGX_IOC_ENCLAVE_INIT, &ioc); + if (ret) { + fprintf(stderr, "SGX_IOC_ENCLAVE_INIT failed: errno=%d\n", + errno); + return false; + } + + return true; +} diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c new file mode 100644 index 000000000000..724cec700926 --- /dev/null +++ b/tools/testing/selftests/sgx/main.c @@ -0,0 +1,246 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. 
*/ + +#include <elf.h> +#include <errno.h> +#include <fcntl.h> +#include <stdbool.h> +#include <stdio.h> +#include <stdint.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/ioctl.h> +#include <sys/mman.h> +#include <sys/stat.h> +#include <sys/time.h> +#include <sys/types.h> +#include "defines.h" +#include "main.h" +#include "../kselftest.h" + +static const uint64_t MAGIC = 0x1122334455667788ULL; +vdso_sgx_enter_enclave_t eenter; + +struct vdso_symtab { + Elf64_Sym *elf_symtab; + const char *elf_symstrtab; + Elf64_Word *elf_hashtab; +}; + +static void *vdso_get_base_addr(char *envp[]) +{ + Elf64_auxv_t *auxv; + int i; + + for (i = 0; envp[i]; i++) + ; + + auxv = (Elf64_auxv_t *)&envp[i + 1]; + + for (i = 0; auxv[i].a_type != AT_NULL; i++) { + if (auxv[i].a_type == AT_SYSINFO_EHDR) + return (void *)auxv[i].a_un.a_val; + } + + return NULL; +} + +static Elf64_Dyn *vdso_get_dyntab(void *addr) +{ + Elf64_Ehdr *ehdr = addr; + Elf64_Phdr *phdrtab = addr + ehdr->e_phoff; + int i; + + for (i = 0; i < ehdr->e_phnum; i++) + if (phdrtab[i].p_type == PT_DYNAMIC) + return addr + phdrtab[i].p_offset; + + return NULL; +} + +static void *vdso_get_dyn(void *addr, Elf64_Dyn *dyntab, Elf64_Sxword tag) +{ + int i; + + for (i = 0; dyntab[i].d_tag != DT_NULL; i++) + if (dyntab[i].d_tag == tag) + return addr + dyntab[i].d_un.d_ptr; + + return NULL; +} + +static bool vdso_get_symtab(void *addr, struct vdso_symtab *symtab) +{ + Elf64_Dyn *dyntab = vdso_get_dyntab(addr); + + symtab->elf_symtab = vdso_get_dyn(addr, dyntab, DT_SYMTAB); + if (!symtab->elf_symtab) + return false; + + symtab->elf_symstrtab = vdso_get_dyn(addr, dyntab, DT_STRTAB); + if (!symtab->elf_symstrtab) + return false; + + symtab->elf_hashtab = vdso_get_dyn(addr, dyntab, DT_HASH); + if (!symtab->elf_hashtab) + return false; + + return true; +} + +static unsigned long elf_sym_hash(const char *name) +{ + unsigned long h = 0, high; + + while (*name) { + h = (h << 4) + *name++; + high = h & 0xf0000000; + + if (high) + h ^= high >> 24; + + h &= ~high; + } + + return h; +} + +static Elf64_Sym *vdso_symtab_get(struct vdso_symtab *symtab, const char *name) +{ + Elf64_Word bucketnum = symtab->elf_hashtab[0]; + Elf64_Word *buckettab = &symtab->elf_hashtab[2]; + Elf64_Word *chaintab = &symtab->elf_hashtab[2 + bucketnum]; + Elf64_Sym *sym; + Elf64_Word i; + + for (i = buckettab[elf_sym_hash(name) % bucketnum]; i != STN_UNDEF; + i = chaintab[i]) { + sym = &symtab->elf_symtab[i]; + if (!strcmp(name, &symtab->elf_symstrtab[sym->st_name])) + return sym; + } + + return NULL; +} + +bool report_results(struct sgx_enclave_run *run, int ret, uint64_t result, + const char *test) +{ + bool valid = true; + + if (ret) { + printf("FAIL: %s() returned: %d\n", test, ret); + valid = false; + } + + if (run->function != EEXIT) { + printf("FAIL: %s() function, expected: %u, got: %u\n", test, EEXIT, + run->function); + valid = false; + } + + if (result != MAGIC) { + printf("FAIL: %s(), expected: 0x%lx, got: 0x%lx\n", test, MAGIC, + result); + valid = false; + } + + if (run->user_data) { + printf("FAIL: %s() user data, expected: 0x0, got: 0x%llx\n", + test, run->user_data); + valid = false; + } + + return valid; +} + +static int user_handler(long rdi, long rsi, long rdx, long ursp, long r8, long r9, + struct sgx_enclave_run *run) +{ + run->user_data = 0; + return 0; +} + +int main(int argc, char *argv[], char *envp[]) +{ + struct sgx_enclave_run run; + struct vdso_symtab symtab; + Elf64_Sym *eenter_sym; + uint64_t result = 0; + struct encl encl; + 
unsigned int i; + void *addr; + int ret; + + memset(&run, 0, sizeof(run)); + + if (!encl_load("test_encl.elf", &encl)) { + encl_delete(&encl); + ksft_exit_skip("cannot load enclaves\n"); + } + + if (!encl_measure(&encl)) + goto err; + + if (!encl_build(&encl)) + goto err; + + /* + * An enclave consumer only must do this. + */ + for (i = 0; i < encl.nr_segments; i++) { + struct encl_segment *seg = &encl.segment_tbl[i]; + + addr = mmap((void *)encl.encl_base + seg->offset, seg->size, + seg->prot, MAP_SHARED | MAP_FIXED, encl.fd, 0); + if (addr == MAP_FAILED) { + fprintf(stderr, "mmap() failed, errno=%d.\n", errno); + exit(KSFT_FAIL); + } + } + + memset(&run, 0, sizeof(run)); + run.tcs = encl.encl_base; + + addr = vdso_get_base_addr(envp); + if (!addr) + goto err; + + if (!vdso_get_symtab(addr, &symtab)) + goto err; + + eenter_sym = vdso_symtab_get(&symtab, "__vdso_sgx_enter_enclave"); + if (!eenter_sym) + goto err; + + eenter = addr + eenter_sym->st_value; + + ret = sgx_call_vdso((void *)&MAGIC, &result, 0, EENTER, NULL, NULL, &run); + if (!report_results(&run, ret, result, "sgx_call_vdso")) + goto err; + + + /* Invoke the vDSO directly. */ + result = 0; + ret = eenter((unsigned long)&MAGIC, (unsigned long)&result, 0, EENTER, + 0, 0, &run); + if (!report_results(&run, ret, result, "eenter")) + goto err; + + /* And with an exit handler. */ + run.user_handler = (__u64)user_handler; + run.user_data = 0xdeadbeef; + ret = eenter((unsigned long)&MAGIC, (unsigned long)&result, 0, EENTER, + 0, 0, &run); + if (!report_results(&run, ret, result, "user_handler")) + goto err; + + printf("SUCCESS\n"); + encl_delete(&encl); + exit(KSFT_PASS); + +err: + encl_delete(&encl); + exit(KSFT_FAIL); +} diff --git a/tools/testing/selftests/sgx/main.h b/tools/testing/selftests/sgx/main.h new file mode 100644 index 000000000000..45e6ab65442a --- /dev/null +++ b/tools/testing/selftests/sgx/main.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2016-20 Intel Corporation. + */ + +#ifndef MAIN_H +#define MAIN_H + +struct encl_segment { + off_t offset; + size_t size; + unsigned int prot; + unsigned int flags; +}; + +struct encl { + int fd; + void *bin; + off_t bin_size; + void *src; + size_t src_size; + size_t encl_size; + off_t encl_base; + unsigned int nr_segments; + struct encl_segment *segment_tbl; + struct sgx_secs secs; + struct sgx_sigstruct sigstruct; +}; + +void encl_delete(struct encl *ctx); +bool encl_load(const char *path, struct encl *encl); +bool encl_measure(struct encl *encl); +bool encl_build(struct encl *encl); + +int sgx_call_vdso(void *rdi, void *rsi, long rdx, u32 function, void *r8, void *r9, + struct sgx_enclave_run *run); + +#endif /* MAIN_H */ diff --git a/tools/testing/selftests/sgx/sigstruct.c b/tools/testing/selftests/sgx/sigstruct.c new file mode 100644 index 000000000000..cc06f108bae7 --- /dev/null +++ b/tools/testing/selftests/sgx/sigstruct.c @@ -0,0 +1,391 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. 
*/ + +#define _GNU_SOURCE +#include <assert.h> +#include <getopt.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <unistd.h> +#include <openssl/err.h> +#include <openssl/pem.h> +#include "defines.h" +#include "main.h" + +struct q1q2_ctx { + BN_CTX *bn_ctx; + BIGNUM *m; + BIGNUM *s; + BIGNUM *q1; + BIGNUM *qr; + BIGNUM *q2; +}; + +static void free_q1q2_ctx(struct q1q2_ctx *ctx) +{ + BN_CTX_free(ctx->bn_ctx); + BN_free(ctx->m); + BN_free(ctx->s); + BN_free(ctx->q1); + BN_free(ctx->qr); + BN_free(ctx->q2); +} + +static bool alloc_q1q2_ctx(const uint8_t *s, const uint8_t *m, + struct q1q2_ctx *ctx) +{ + ctx->bn_ctx = BN_CTX_new(); + ctx->s = BN_bin2bn(s, SGX_MODULUS_SIZE, NULL); + ctx->m = BN_bin2bn(m, SGX_MODULUS_SIZE, NULL); + ctx->q1 = BN_new(); + ctx->qr = BN_new(); + ctx->q2 = BN_new(); + + if (!ctx->bn_ctx || !ctx->s || !ctx->m || !ctx->q1 || !ctx->qr || + !ctx->q2) { + free_q1q2_ctx(ctx); + return false; + } + + return true; +} + +static bool calc_q1q2(const uint8_t *s, const uint8_t *m, uint8_t *q1, + uint8_t *q2) +{ + struct q1q2_ctx ctx; + + if (!alloc_q1q2_ctx(s, m, &ctx)) { + fprintf(stderr, "Not enough memory for Q1Q2 calculation\n"); + return false; + } + + if (!BN_mul(ctx.q1, ctx.s, ctx.s, ctx.bn_ctx)) + goto out; + + if (!BN_div(ctx.q1, ctx.qr, ctx.q1, ctx.m, ctx.bn_ctx)) + goto out; + + if (BN_num_bytes(ctx.q1) > SGX_MODULUS_SIZE) { + fprintf(stderr, "Too large Q1 %d bytes\n", + BN_num_bytes(ctx.q1)); + goto out; + } + + if (!BN_mul(ctx.q2, ctx.s, ctx.qr, ctx.bn_ctx)) + goto out; + + if (!BN_div(ctx.q2, NULL, ctx.q2, ctx.m, ctx.bn_ctx)) + goto out; + + if (BN_num_bytes(ctx.q2) > SGX_MODULUS_SIZE) { + fprintf(stderr, "Too large Q2 %d bytes\n", + BN_num_bytes(ctx.q2)); + goto out; + } + + BN_bn2bin(ctx.q1, q1); + BN_bn2bin(ctx.q2, q2); + + free_q1q2_ctx(&ctx); + return true; +out: + free_q1q2_ctx(&ctx); + return false; +} + +struct sgx_sigstruct_payload { + struct sgx_sigstruct_header header; + struct sgx_sigstruct_body body; +}; + +static bool check_crypto_errors(void) +{ + int err; + bool had_errors = false; + const char *filename; + int line; + char str[256]; + + for ( ; ; ) { + if (ERR_peek_error() == 0) + break; + + had_errors = true; + err = ERR_get_error_line(&filename, &line); + ERR_error_string_n(err, str, sizeof(str)); + fprintf(stderr, "crypto: %s: %s:%d\n", str, filename, line); + } + + return had_errors; +} + +static inline const BIGNUM *get_modulus(RSA *key) +{ + const BIGNUM *n; + + RSA_get0_key(key, &n, NULL, NULL); + return n; +} + +static RSA *gen_sign_key(void) +{ + BIGNUM *e; + RSA *key; + int ret; + + e = BN_new(); + key = RSA_new(); + + if (!e || !key) + goto err; + + ret = BN_set_word(e, RSA_3); + if (ret != 1) + goto err; + + ret = RSA_generate_key_ex(key, 3072, e, NULL); + if (ret != 1) + goto err; + + BN_free(e); + + return key; + +err: + RSA_free(key); + BN_free(e); + + return NULL; +} + +static void reverse_bytes(void *data, int length) +{ + int i = 0; + int j = length - 1; + uint8_t temp; + uint8_t *ptr = data; + + while (i < j) { + temp = ptr[i]; + ptr[i] = ptr[j]; + ptr[j] = temp; + i++; + j--; + } +} + +enum mrtags { + MRECREATE = 0x0045544145524345, + MREADD = 0x0000000044444145, + MREEXTEND = 0x00444E4554584545, +}; + +static bool mrenclave_update(EVP_MD_CTX *ctx, const void *data) +{ + if (!EVP_DigestUpdate(ctx, data, 64)) { + fprintf(stderr, "digest update failed\n"); + return false; + } + + return true; +} + +static bool 
mrenclave_commit(EVP_MD_CTX *ctx, uint8_t *mrenclave) +{ + unsigned int size; + + if (!EVP_DigestFinal_ex(ctx, (unsigned char *)mrenclave, &size)) { + fprintf(stderr, "digest commit failed\n"); + return false; + } + + if (size != 32) { + fprintf(stderr, "invalid digest size = %u\n", size); + return false; + } + + return true; +} + +struct mrecreate { + uint64_t tag; + uint32_t ssaframesize; + uint64_t size; + uint8_t reserved[44]; +} __attribute__((__packed__)); + + +static bool mrenclave_ecreate(EVP_MD_CTX *ctx, uint64_t blob_size) +{ + struct mrecreate mrecreate; + uint64_t encl_size; + + for (encl_size = 0x1000; encl_size < blob_size; ) + encl_size <<= 1; + + memset(&mrecreate, 0, sizeof(mrecreate)); + mrecreate.tag = MRECREATE; + mrecreate.ssaframesize = 1; + mrecreate.size = encl_size; + + if (!EVP_DigestInit_ex(ctx, EVP_sha256(), NULL)) + return false; + + return mrenclave_update(ctx, &mrecreate); +} + +struct mreadd { + uint64_t tag; + uint64_t offset; + uint64_t flags; /* SECINFO flags */ + uint8_t reserved[40]; +} __attribute__((__packed__)); + +static bool mrenclave_eadd(EVP_MD_CTX *ctx, uint64_t offset, uint64_t flags) +{ + struct mreadd mreadd; + + memset(&mreadd, 0, sizeof(mreadd)); + mreadd.tag = MREADD; + mreadd.offset = offset; + mreadd.flags = flags; + + return mrenclave_update(ctx, &mreadd); +} + +struct mreextend { + uint64_t tag; + uint64_t offset; + uint8_t reserved[48]; +} __attribute__((__packed__)); + +static bool mrenclave_eextend(EVP_MD_CTX *ctx, uint64_t offset, + const uint8_t *data) +{ + struct mreextend mreextend; + int i; + + for (i = 0; i < 0x1000; i += 0x100) { + memset(&mreextend, 0, sizeof(mreextend)); + mreextend.tag = MREEXTEND; + mreextend.offset = offset + i; + + if (!mrenclave_update(ctx, &mreextend)) + return false; + + if (!mrenclave_update(ctx, &data[i + 0x00])) + return false; + + if (!mrenclave_update(ctx, &data[i + 0x40])) + return false; + + if (!mrenclave_update(ctx, &data[i + 0x80])) + return false; + + if (!mrenclave_update(ctx, &data[i + 0xC0])) + return false; + } + + return true; +} + +static bool mrenclave_segment(EVP_MD_CTX *ctx, struct encl *encl, + struct encl_segment *seg) +{ + uint64_t end = seg->offset + seg->size; + uint64_t offset; + + for (offset = seg->offset; offset < end; offset += PAGE_SIZE) { + if (!mrenclave_eadd(ctx, offset, seg->flags)) + return false; + + if (!mrenclave_eextend(ctx, offset, encl->src + offset)) + return false; + } + + return true; +} + +bool encl_measure(struct encl *encl) +{ + uint64_t header1[2] = {0x000000E100000006, 0x0000000000010000}; + uint64_t header2[2] = {0x0000006000000101, 0x0000000100000060}; + struct sgx_sigstruct *sigstruct = &encl->sigstruct; + struct sgx_sigstruct_payload payload; + uint8_t digest[SHA256_DIGEST_LENGTH]; + unsigned int siglen; + RSA *key = NULL; + EVP_MD_CTX *ctx; + int i; + + memset(sigstruct, 0, sizeof(*sigstruct)); + + sigstruct->header.header1[0] = header1[0]; + sigstruct->header.header1[1] = header1[1]; + sigstruct->header.header2[0] = header2[0]; + sigstruct->header.header2[1] = header2[1]; + sigstruct->exponent = 3; + sigstruct->body.attributes = SGX_ATTR_MODE64BIT; + sigstruct->body.xfrm = 3; + + /* sanity check */ + if (check_crypto_errors()) + goto err; + + key = gen_sign_key(); + if (!key) + goto err; + + BN_bn2bin(get_modulus(key), sigstruct->modulus); + + ctx = EVP_MD_CTX_create(); + if (!ctx) + goto err; + + if (!mrenclave_ecreate(ctx, encl->src_size)) + goto err; + + for (i = 0; i < encl->nr_segments; i++) { + struct encl_segment *seg = 
&encl->segment_tbl[i]; + + if (!mrenclave_segment(ctx, encl, seg)) + goto err; + } + + if (!mrenclave_commit(ctx, sigstruct->body.mrenclave)) + goto err; + + memcpy(&payload.header, &sigstruct->header, sizeof(sigstruct->header)); + memcpy(&payload.body, &sigstruct->body, sizeof(sigstruct->body)); + + SHA256((unsigned char *)&payload, sizeof(payload), digest); + + if (!RSA_sign(NID_sha256, digest, SHA256_DIGEST_LENGTH, + sigstruct->signature, &siglen, key)) + goto err; + + if (!calc_q1q2(sigstruct->signature, sigstruct->modulus, sigstruct->q1, + sigstruct->q2)) + goto err; + + /* BE -> LE */ + reverse_bytes(sigstruct->signature, SGX_MODULUS_SIZE); + reverse_bytes(sigstruct->modulus, SGX_MODULUS_SIZE); + reverse_bytes(sigstruct->q1, SGX_MODULUS_SIZE); + reverse_bytes(sigstruct->q2, SGX_MODULUS_SIZE); + + EVP_MD_CTX_destroy(ctx); + RSA_free(key); + return true; + +err: + EVP_MD_CTX_destroy(ctx); + RSA_free(key); + return false; +} diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c new file mode 100644 index 000000000000..cf25b5dc1e03 --- /dev/null +++ b/tools/testing/selftests/sgx/test_encl.c @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include <stddef.h> +#include "defines.h" + +static void *memcpy(void *dest, const void *src, size_t n) +{ + size_t i; + + for (i = 0; i < n; i++) + ((char *)dest)[i] = ((char *)src)[i]; + + return dest; +} + +void encl_body(void *rdi, void *rsi) +{ + memcpy(rsi, rdi, 8); +} diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds new file mode 100644 index 000000000000..0fbbda7e665e --- /dev/null +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -0,0 +1,40 @@ +OUTPUT_FORMAT(elf64-x86-64) + +PHDRS +{ + tcs PT_LOAD; + text PT_LOAD; + data PT_LOAD; +} + +SECTIONS +{ + . = 0; + .tcs : { + *(.tcs*) + } : tcs + + . = ALIGN(4096); + .text : { + *(.text*) + *(.rodata*) + } : text + + . = ALIGN(4096); + .data : { + *(.data*) + } : data + + /DISCARD/ : { + *(.comment*) + *(.note*) + *(.debug*) + *(.eh_frame*) + } +} + +ASSERT(!DEFINED(.altinstructions), "ALTERNATIVES are not supported in enclaves") +ASSERT(!DEFINED(.altinstr_replacement), "ALTERNATIVES are not supported in enclaves") +ASSERT(!DEFINED(.discard.retpoline_safe), "RETPOLINE ALTERNATIVES are not supported in enclaves") +ASSERT(!DEFINED(.discard.nospec), "RETPOLINE ALTERNATIVES are not supported in enclaves") +ASSERT(!DEFINED(.got.plt), "Libcalls are not supported in enclaves") diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S new file mode 100644 index 000000000000..5d5680d4ea39 --- /dev/null +++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S @@ -0,0 +1,89 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2016-20 Intel Corporation. + */ + + .macro ENCLU + .byte 0x0f, 0x01, 0xd7 + .endm + + .section ".tcs", "aw" + .balign 4096 + + .fill 1, 8, 0 # STATE (set by CPU) + .fill 1, 8, 0 # FLAGS + .quad encl_ssa # OSSA + .fill 1, 4, 0 # CSSA (set by CPU) + .fill 1, 4, 1 # NSSA + .quad encl_entry # OENTRY + .fill 1, 8, 0 # AEP (set by EENTER and ERESUME) + .fill 1, 8, 0 # OFSBASE + .fill 1, 8, 0 # OGSBASE + .fill 1, 4, 0xFFFFFFFF # FSLIMIT + .fill 1, 4, 0xFFFFFFFF # GSLIMIT + .fill 4024, 1, 0 # Reserved + + # Identical to the previous TCS. 
+ .fill 1, 8, 0 # STATE (set by CPU) + .fill 1, 8, 0 # FLAGS + .quad encl_ssa # OSSA + .fill 1, 4, 0 # CSSA (set by CPU) + .fill 1, 4, 1 # NSSA + .quad encl_entry # OENTRY + .fill 1, 8, 0 # AEP (set by EENTER and ERESUME) + .fill 1, 8, 0 # OFSBASE + .fill 1, 8, 0 # OGSBASE + .fill 1, 4, 0xFFFFFFFF # FSLIMIT + .fill 1, 4, 0xFFFFFFFF # GSLIMIT + .fill 4024, 1, 0 # Reserved + + .text + +encl_entry: + # RBX contains the base address for TCS, which is also the first address + # inside the enclave. By adding the value of le_stack_end to it, we get + # the absolute address for the stack. + lea (encl_stack)(%rbx), %rax + xchg %rsp, %rax + push %rax + + push %rcx # push the address after EENTER + push %rbx # push the enclave base address + + call encl_body + + pop %rbx # pop the enclave base address + + /* Clear volatile GPRs, except RAX (EEXIT function). */ + xor %rcx, %rcx + xor %rdx, %rdx + xor %rdi, %rdi + xor %rsi, %rsi + xor %r8, %r8 + xor %r9, %r9 + xor %r10, %r10 + xor %r11, %r11 + + # Reset status flags. + add %rdx, %rdx # OF = SF = AF = CF = 0; ZF = PF = 1 + + # Prepare EEXIT target by popping the address of the instruction after + # EENTER to RBX. + pop %rbx + + # Restore the caller stack. + pop %rax + mov %rax, %rsp + + # EEXIT + mov $4, %rax + enclu + + .section ".data", "aw" + +encl_ssa: + .space 4096 + + .balign 4096 + .space 8192 +encl_stack:
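For reference, the two SIGSTRUCT helper values that calc_q1q2() in the signing tool above derives with BN_mul()/BN_div() are, for signature s and modulus m (384-byte big-endian integers), q1 = floor(s^2 / m) and q2 = floor((s^2 mod m) * s / m). Both must fit in SGX_MODULUS_SIZE bytes, and encl_measure() byte-reverses them to little-endian along with the signature and modulus.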
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 1728ab54b4be94aed89276eeb8e750a345659765 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Just like normal RAM, there is a limited amount of enclave memory available and overcommitting it is a very valuable tool to reduce resource use. Introduce a simple reclaim mechanism for enclave pages.
In contrast to normal page reclaim, the kernel cannot directly access enclave memory. To get around this, the SGX architecture provides a set of functions to help. Among other things, these functions copy enclave memory to and from normal memory, encrypting it and protecting its integrity in the process.
Implement a page reclaimer by using these functions. It picks victim pages in LRU fashion from all the enclaves running in the system. A new kernel thread (ksgxd) reclaims pages in the background based on watermarks, similar to normal kswapd.
All enclave pages can be reclaimed, architecturally. But, there are some limits to this, such as the special SECS metadata page which must be reclaimed last. The page version array (used to mitigate replaying old reclaimed pages) is also architecturally reclaimable, but reclaiming it is not yet implemented. The end result is that the vast majority of enclave pages are currently reclaimable.
Intel-SIG: commit 1728ab54b4be x86/sgx: Add a page reclaimer Backport for SGX Foundations support
Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-22-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/driver.c | 59 ++-- arch/x86/kernel/cpu/sgx/encl.c | 483 ++++++++++++++++++++++++++++++- arch/x86/kernel/cpu/sgx/encl.h | 51 ++++ arch/x86/kernel/cpu/sgx/ioctl.c | 89 +++++- arch/x86/kernel/cpu/sgx/main.c | 466 +++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/sgx.h | 13 + 6 files changed, 1134 insertions(+), 27 deletions(-)
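For orientation before the diff, the control flow of the new reclaimer thread boils down to the loop below. This is a condensed sketch of the ksgxd() code added in the main.c hunk that follows, not an additional change:

static int ksgxd(void *p)
{
	/* One-time EPC sanitization pass omitted; see the hunk below. */

	while (!kthread_should_stop()) {
		if (try_to_freeze())
			continue;

		/* Sleep until free EPC pages fall below the high watermark. */
		wait_event_freezable(ksgxd_waitq,
				     kthread_should_stop() ||
				     sgx_should_reclaim(SGX_NR_HIGH_PAGES));

		if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
			sgx_reclaim_pages();	/* scan up to SGX_NR_TO_SCAN pages */

		cond_resched();
	}

	return 0;
}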
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index 899c18499d1a..f2eac41bb4ff 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -17,13 +17,24 @@ u32 sgx_misc_reserved_mask; static int sgx_open(struct inode *inode, struct file *file) { struct sgx_encl *encl; + int ret;
encl = kzalloc(sizeof(*encl), GFP_KERNEL); if (!encl) return -ENOMEM;
+ kref_init(&encl->refcount); xa_init(&encl->page_array); mutex_init(&encl->lock); + INIT_LIST_HEAD(&encl->va_pages); + INIT_LIST_HEAD(&encl->mm_list); + spin_lock_init(&encl->mm_lock); + + ret = init_srcu_struct(&encl->srcu); + if (ret) { + kfree(encl); + return ret; + }
file->private_data = encl;
@@ -33,31 +44,37 @@ static int sgx_open(struct inode *inode, struct file *file) static int sgx_release(struct inode *inode, struct file *file) { struct sgx_encl *encl = file->private_data; - struct sgx_encl_page *entry; - unsigned long index; - - xa_for_each(&encl->page_array, index, entry) { - if (entry->epc_page) { - sgx_free_epc_page(entry->epc_page); - encl->secs_child_cnt--; - entry->epc_page = NULL; + struct sgx_encl_mm *encl_mm; + + /* + * Drain the remaining mm_list entries. At this point the list contains + * entries for processes, which have closed the enclave file but have + * not exited yet. The processes, which have exited, are gone from the + * list by sgx_mmu_notifier_release(). + */ + for ( ; ; ) { + spin_lock(&encl->mm_lock); + + if (list_empty(&encl->mm_list)) { + encl_mm = NULL; + } else { + encl_mm = list_first_entry(&encl->mm_list, + struct sgx_encl_mm, list); + list_del_rcu(&encl_mm->list); }
- kfree(entry); - } + spin_unlock(&encl->mm_lock);
- xa_destroy(&encl->page_array); + /* The enclave is no longer mapped by any mm. */ + if (!encl_mm) + break;
- if (!encl->secs_child_cnt && encl->secs.epc_page) { - sgx_free_epc_page(encl->secs.epc_page); - encl->secs.epc_page = NULL; + synchronize_srcu(&encl->srcu); + mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm); + kfree(encl_mm); }
- /* Detect EPC page leaks. */ - WARN_ON_ONCE(encl->secs_child_cnt); - WARN_ON_ONCE(encl->secs.epc_page); - - kfree(encl); + kref_put(&encl->refcount, sgx_encl_release); return 0; }
@@ -70,6 +87,10 @@ static int sgx_mmap(struct file *file, struct vm_area_struct *vma) if (ret) return ret;
+ ret = sgx_encl_mm_add(encl, vma->vm_mm); + if (ret) + return ret; + vma->vm_ops = &sgx_vm_ops; vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; vma->vm_private_data = encl; diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 57eff300f487..b74dadf85989 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -12,11 +12,90 @@ #include "encls.h" #include "sgx.h"
+/* + * ELDU: Load an EPC page as unblocked. For more info, see "OS Management of EPC + * Pages" in the SDM. + */ +static int __sgx_encl_eldu(struct sgx_encl_page *encl_page, + struct sgx_epc_page *epc_page, + struct sgx_epc_page *secs_page) +{ + unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK; + struct sgx_encl *encl = encl_page->encl; + struct sgx_pageinfo pginfo; + struct sgx_backing b; + pgoff_t page_index; + int ret; + + if (secs_page) + page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); + else + page_index = PFN_DOWN(encl->size); + + ret = sgx_encl_get_backing(encl, page_index, &b); + if (ret) + return ret; + + pginfo.addr = encl_page->desc & PAGE_MASK; + pginfo.contents = (unsigned long)kmap_atomic(b.contents); + pginfo.metadata = (unsigned long)kmap_atomic(b.pcmd) + + b.pcmd_offset; + + if (secs_page) + pginfo.secs = (u64)sgx_get_epc_virt_addr(secs_page); + else + pginfo.secs = 0; + + ret = __eldu(&pginfo, sgx_get_epc_virt_addr(epc_page), + sgx_get_epc_virt_addr(encl_page->va_page->epc_page) + va_offset); + if (ret) { + if (encls_failed(ret)) + ENCLS_WARN(ret, "ELDU"); + + ret = -EFAULT; + } + + kunmap_atomic((void *)(unsigned long)(pginfo.metadata - b.pcmd_offset)); + kunmap_atomic((void *)(unsigned long)pginfo.contents); + + sgx_encl_put_backing(&b, false); + + return ret; +} + +static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, + struct sgx_epc_page *secs_page) +{ + + unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK; + struct sgx_encl *encl = encl_page->encl; + struct sgx_epc_page *epc_page; + int ret; + + epc_page = sgx_alloc_epc_page(encl_page, false); + if (IS_ERR(epc_page)) + return epc_page; + + ret = __sgx_encl_eldu(encl_page, epc_page, secs_page); + if (ret) { + sgx_free_epc_page(epc_page); + return ERR_PTR(ret); + } + + sgx_free_va_slot(encl_page->va_page, va_offset); + list_move(&encl_page->va_page->list, &encl->va_pages); + encl_page->desc &= ~SGX_ENCL_PAGE_VA_OFFSET_MASK; + encl_page->epc_page = epc_page; + + return epc_page; +} + static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, unsigned long addr, unsigned long vm_flags) { unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); + struct sgx_epc_page *epc_page; struct sgx_encl_page *entry;
entry = xa_load(&encl->page_array, PFN_DOWN(addr)); @@ -31,11 +110,27 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) return ERR_PTR(-EFAULT);
- /* No page found. */ - if (!entry->epc_page) - return ERR_PTR(-EFAULT); - /* Entry successfully located. */ + if (entry->epc_page) { + if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED) + return ERR_PTR(-EBUSY); + + return entry; + } + + if (!(encl->secs.epc_page)) { + epc_page = sgx_encl_eldu(&encl->secs, NULL); + if (IS_ERR(epc_page)) + return ERR_CAST(epc_page); + } + + epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); + if (IS_ERR(epc_page)) + return ERR_CAST(epc_page); + + encl->secs_child_cnt++; + sgx_mark_page_reclaimable(entry->epc_page); + return entry; }
@@ -51,12 +146,23 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
encl = vma->vm_private_data;
+ /* + * It's very unlikely but possible that allocating memory for the + * mm_list entry of a forked process failed in sgx_vma_open(). When + * this happens, vm_private_data is set to NULL. + */ + if (unlikely(!encl)) + return VM_FAULT_SIGBUS; + mutex_lock(&encl->lock);
entry = sgx_encl_load_page(encl, addr, vma->vm_flags); if (IS_ERR(entry)) { mutex_unlock(&encl->lock);
+ if (PTR_ERR(entry) == -EBUSY) + return VM_FAULT_NOPAGE; + return VM_FAULT_SIGBUS; }
@@ -76,11 +182,29 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) return VM_FAULT_SIGBUS; }
+ sgx_encl_test_and_clear_young(vma->vm_mm, entry); mutex_unlock(&encl->lock);
return VM_FAULT_NOPAGE; }
+static void sgx_vma_open(struct vm_area_struct *vma) +{ + struct sgx_encl *encl = vma->vm_private_data; + + /* + * It's possible but unlikely that vm_private_data is NULL. This can + * happen in a grandchild of a process, when sgx_encl_mm_add() had + * failed to allocate memory in this callback. + */ + if (unlikely(!encl)) + return; + + if (sgx_encl_mm_add(encl, vma->vm_mm)) + vma->vm_private_data = NULL; +} + + /** * sgx_encl_may_map() - Check if a requested VMA mapping is allowed * @encl: an enclave pointer @@ -151,4 +275,355 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, unsigned long start, const struct vm_operations_struct sgx_vm_ops = { .fault = sgx_vma_fault, .mprotect = sgx_vma_mprotect, + .open = sgx_vma_open, +}; + +/** + * sgx_encl_release - Destroy an enclave instance + * @kref: address of a kref inside &sgx_encl + * + * Used together with kref_put(). Frees all the resources associated with the + * enclave and the instance itself. + */ +void sgx_encl_release(struct kref *ref) +{ + struct sgx_encl *encl = container_of(ref, struct sgx_encl, refcount); + struct sgx_va_page *va_page; + struct sgx_encl_page *entry; + unsigned long index; + + xa_for_each(&encl->page_array, index, entry) { + if (entry->epc_page) { + /* + * The page and its radix tree entry cannot be freed + * if the page is being held by the reclaimer. + */ + if (sgx_unmark_page_reclaimable(entry->epc_page)) + continue; + + sgx_free_epc_page(entry->epc_page); + encl->secs_child_cnt--; + entry->epc_page = NULL; + } + + kfree(entry); + } + + xa_destroy(&encl->page_array); + + if (!encl->secs_child_cnt && encl->secs.epc_page) { + sgx_free_epc_page(encl->secs.epc_page); + encl->secs.epc_page = NULL; + } + + while (!list_empty(&encl->va_pages)) { + va_page = list_first_entry(&encl->va_pages, struct sgx_va_page, + list); + list_del(&va_page->list); + sgx_free_epc_page(va_page->epc_page); + kfree(va_page); + } + + if (encl->backing) + fput(encl->backing); + + cleanup_srcu_struct(&encl->srcu); + + WARN_ON_ONCE(!list_empty(&encl->mm_list)); + + /* Detect EPC page leak's. */ + WARN_ON_ONCE(encl->secs_child_cnt); + WARN_ON_ONCE(encl->secs.epc_page); + + kfree(encl); +} + +/* + * 'mm' is exiting and no longer needs mmu notifications. + */ +static void sgx_mmu_notifier_release(struct mmu_notifier *mn, + struct mm_struct *mm) +{ + struct sgx_encl_mm *encl_mm = container_of(mn, struct sgx_encl_mm, mmu_notifier); + struct sgx_encl_mm *tmp = NULL; + + /* + * The enclave itself can remove encl_mm. Note, objects can't be moved + * off an RCU protected list, but deletion is ok. 
+ */ + spin_lock(&encl_mm->encl->mm_lock); + list_for_each_entry(tmp, &encl_mm->encl->mm_list, list) { + if (tmp == encl_mm) { + list_del_rcu(&encl_mm->list); + break; + } + } + spin_unlock(&encl_mm->encl->mm_lock); + + if (tmp == encl_mm) { + synchronize_srcu(&encl_mm->encl->srcu); + mmu_notifier_put(mn); + } +} + +static void sgx_mmu_notifier_free(struct mmu_notifier *mn) +{ + struct sgx_encl_mm *encl_mm = container_of(mn, struct sgx_encl_mm, mmu_notifier); + + kfree(encl_mm); +} + +static const struct mmu_notifier_ops sgx_mmu_notifier_ops = { + .release = sgx_mmu_notifier_release, + .free_notifier = sgx_mmu_notifier_free, }; + +static struct sgx_encl_mm *sgx_encl_find_mm(struct sgx_encl *encl, + struct mm_struct *mm) +{ + struct sgx_encl_mm *encl_mm = NULL; + struct sgx_encl_mm *tmp; + int idx; + + idx = srcu_read_lock(&encl->srcu); + + list_for_each_entry_rcu(tmp, &encl->mm_list, list) { + if (tmp->mm == mm) { + encl_mm = tmp; + break; + } + } + + srcu_read_unlock(&encl->srcu, idx); + + return encl_mm; +} + +int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm) +{ + struct sgx_encl_mm *encl_mm; + int ret; + + /* + * Even though a single enclave may be mapped into an mm more than once, + * each 'mm' only appears once on encl->mm_list. This is guaranteed by + * holding the mm's mmap lock for write before an mm can be added or + * remove to an encl->mm_list. + */ + mmap_assert_write_locked(mm); + + /* + * It's possible that an entry already exists in the mm_list, because it + * is removed only on VFS release or process exit. + */ + if (sgx_encl_find_mm(encl, mm)) + return 0; + + encl_mm = kzalloc(sizeof(*encl_mm), GFP_KERNEL); + if (!encl_mm) + return -ENOMEM; + + encl_mm->encl = encl; + encl_mm->mm = mm; + encl_mm->mmu_notifier.ops = &sgx_mmu_notifier_ops; + + ret = __mmu_notifier_register(&encl_mm->mmu_notifier, mm); + if (ret) { + kfree(encl_mm); + return ret; + } + + spin_lock(&encl->mm_lock); + list_add_rcu(&encl_mm->list, &encl->mm_list); + /* Pairs with smp_rmb() in sgx_reclaimer_block(). */ + smp_wmb(); + encl->mm_list_version++; + spin_unlock(&encl->mm_lock); + + return 0; +} + +static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl, + pgoff_t index) +{ + struct inode *inode = encl->backing->f_path.dentry->d_inode; + struct address_space *mapping = inode->i_mapping; + gfp_t gfpmask = mapping_gfp_mask(mapping); + + return shmem_read_mapping_page_gfp(mapping, index, gfpmask); +} + +/** + * sgx_encl_get_backing() - Pin the backing storage + * @encl: an enclave pointer + * @page_index: enclave page index + * @backing: data for accessing backing storage for the page + * + * Pin the backing storage pages for storing the encrypted contents and Paging + * Crypto MetaData (PCMD) of an enclave page. + * + * Return: + * 0 on success, + * -errno otherwise. 
+ */ +int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index, + struct sgx_backing *backing) +{ + pgoff_t pcmd_index = PFN_DOWN(encl->size) + 1 + (page_index >> 5); + struct page *contents; + struct page *pcmd; + + contents = sgx_encl_get_backing_page(encl, page_index); + if (IS_ERR(contents)) + return PTR_ERR(contents); + + pcmd = sgx_encl_get_backing_page(encl, pcmd_index); + if (IS_ERR(pcmd)) { + put_page(contents); + return PTR_ERR(pcmd); + } + + backing->page_index = page_index; + backing->contents = contents; + backing->pcmd = pcmd; + backing->pcmd_offset = + (page_index & (PAGE_SIZE / sizeof(struct sgx_pcmd) - 1)) * + sizeof(struct sgx_pcmd); + + return 0; +} + +/** + * sgx_encl_put_backing() - Unpin the backing storage + * @backing: data for accessing backing storage for the page + * @do_write: mark pages dirty + */ +void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write) +{ + if (do_write) { + set_page_dirty(backing->pcmd); + set_page_dirty(backing->contents); + } + + put_page(backing->pcmd); + put_page(backing->contents); +} + +static int sgx_encl_test_and_clear_young_cb(pte_t *ptep, unsigned long addr, + void *data) +{ + pte_t pte; + int ret; + + ret = pte_young(*ptep); + if (ret) { + pte = pte_mkold(*ptep); + set_pte_at((struct mm_struct *)data, addr, ptep, pte); + } + + return ret; +} + +/** + * sgx_encl_test_and_clear_young() - Test and reset the accessed bit + * @mm: mm_struct that is checked + * @page: enclave page to be tested for recent access + * + * Checks the Access (A) bit from the PTE corresponding to the enclave page and + * clears it. + * + * Return: 1 if the page has been recently accessed and 0 if not. + */ +int sgx_encl_test_and_clear_young(struct mm_struct *mm, + struct sgx_encl_page *page) +{ + unsigned long addr = page->desc & PAGE_MASK; + struct sgx_encl *encl = page->encl; + struct vm_area_struct *vma; + int ret; + + ret = sgx_encl_find(mm, addr, &vma); + if (ret) + return 0; + + if (encl != vma->vm_private_data) + return 0; + + ret = apply_to_page_range(vma->vm_mm, addr, PAGE_SIZE, + sgx_encl_test_and_clear_young_cb, vma->vm_mm); + if (ret < 0) + return 0; + + return ret; +} + +/** + * sgx_alloc_va_page() - Allocate a Version Array (VA) page + * + * Allocate a free EPC page and convert it to a Version Array (VA) page. + * + * Return: + * a VA page, + * -errno otherwise + */ +struct sgx_epc_page *sgx_alloc_va_page(void) +{ + struct sgx_epc_page *epc_page; + int ret; + + epc_page = sgx_alloc_epc_page(NULL, true); + if (IS_ERR(epc_page)) + return ERR_CAST(epc_page); + + ret = __epa(sgx_get_epc_virt_addr(epc_page)); + if (ret) { + WARN_ONCE(1, "EPA returned %d (0x%x)", ret, ret); + sgx_free_epc_page(epc_page); + return ERR_PTR(-EFAULT); + } + + return epc_page; +} + +/** + * sgx_alloc_va_slot - allocate a VA slot + * @va_page: a &struct sgx_va_page instance + * + * Allocates a slot from a &struct sgx_va_page instance. + * + * Return: offset of the slot inside the VA page + */ +unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page) +{ + int slot = find_first_zero_bit(va_page->slots, SGX_VA_SLOT_COUNT); + + if (slot < SGX_VA_SLOT_COUNT) + set_bit(slot, va_page->slots); + + return slot << 3; +} + +/** + * sgx_free_va_slot - free a VA slot + * @va_page: a &struct sgx_va_page instance + * @offset: offset of the slot inside the VA page + * + * Frees a slot from a &struct sgx_va_page instance. 
+ */ +void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset) +{ + clear_bit(offset >> 3, va_page->slots); +} + +/** + * sgx_va_page_full - is the VA page full? + * @va_page: a &struct sgx_va_page instance + * + * Return: true if all slots have been taken + */ +bool sgx_va_page_full(struct sgx_va_page *va_page) +{ + int slot = find_first_zero_bit(va_page->slots, SGX_VA_SLOT_COUNT); + + return slot == SGX_VA_SLOT_COUNT; +} diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index 8a4d1edded68..d8d30ccbef4c 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -19,11 +19,18 @@ #include <linux/xarray.h> #include "sgx.h"
+/* 'desc' bits holding the offset in the VA (version array) page. */ +#define SGX_ENCL_PAGE_VA_OFFSET_MASK GENMASK_ULL(11, 3) + +/* 'desc' bit marking that the page is being reclaimed. */ +#define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3) + struct sgx_encl_page { unsigned long desc; unsigned long vm_max_prot_bits; struct sgx_epc_page *epc_page; struct sgx_encl *encl; + struct sgx_va_page *va_page; };
enum sgx_encl_flags { @@ -33,6 +40,13 @@ enum sgx_encl_flags { SGX_ENCL_INITIALIZED = BIT(3), };
+struct sgx_encl_mm { + struct sgx_encl *encl; + struct mm_struct *mm; + struct list_head list; + struct mmu_notifier mmu_notifier; +}; + struct sgx_encl { unsigned long base; unsigned long size; @@ -44,6 +58,30 @@ struct sgx_encl { struct sgx_encl_page secs; unsigned long attributes; unsigned long attributes_mask; + + cpumask_t cpumask; + struct file *backing; + struct kref refcount; + struct list_head va_pages; + unsigned long mm_list_version; + struct list_head mm_list; + spinlock_t mm_lock; + struct srcu_struct srcu; +}; + +#define SGX_VA_SLOT_COUNT 512 + +struct sgx_va_page { + struct sgx_epc_page *epc_page; + DECLARE_BITMAP(slots, SGX_VA_SLOT_COUNT); + struct list_head list; +}; + +struct sgx_backing { + pgoff_t page_index; + struct page *contents; + struct page *pcmd; + unsigned long pcmd_offset; };
extern const struct vm_operations_struct sgx_vm_ops; @@ -65,4 +103,17 @@ static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr, int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start, unsigned long end, unsigned long vm_flags);
+void sgx_encl_release(struct kref *ref); +int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm); +int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index, + struct sgx_backing *backing); +void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write); +int sgx_encl_test_and_clear_young(struct mm_struct *mm, + struct sgx_encl_page *page); + +struct sgx_epc_page *sgx_alloc_va_page(void); +unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); +void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); +bool sgx_va_page_full(struct sgx_va_page *va_page); + #endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 0ba0e670e2f0..6d37117ac8a0 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -16,20 +16,77 @@ #include "encl.h" #include "encls.h"
+static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) +{ + struct sgx_va_page *va_page = NULL; + void *err; + + BUILD_BUG_ON(SGX_VA_SLOT_COUNT != + (SGX_ENCL_PAGE_VA_OFFSET_MASK >> 3) + 1); + + if (!(encl->page_cnt % SGX_VA_SLOT_COUNT)) { + va_page = kzalloc(sizeof(*va_page), GFP_KERNEL); + if (!va_page) + return ERR_PTR(-ENOMEM); + + va_page->epc_page = sgx_alloc_va_page(); + if (IS_ERR(va_page->epc_page)) { + err = ERR_CAST(va_page->epc_page); + kfree(va_page); + return err; + } + + WARN_ON_ONCE(encl->page_cnt % SGX_VA_SLOT_COUNT); + } + encl->page_cnt++; + return va_page; +} + +static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page) +{ + encl->page_cnt--; + + if (va_page) { + sgx_free_epc_page(va_page->epc_page); + list_del(&va_page->list); + kfree(va_page); + } +} + static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) { struct sgx_epc_page *secs_epc; + struct sgx_va_page *va_page; struct sgx_pageinfo pginfo; struct sgx_secinfo secinfo; unsigned long encl_size; + struct file *backing; long ret;
+ va_page = sgx_encl_grow(encl); + if (IS_ERR(va_page)) + return PTR_ERR(va_page); + else if (va_page) + list_add(&va_page->list, &encl->va_pages); + /* else the tail page of the VA page list had free slots. */ + /* The extra page goes to SECS. */ encl_size = secs->size + PAGE_SIZE;
- secs_epc = __sgx_alloc_epc_page(); - if (IS_ERR(secs_epc)) - return PTR_ERR(secs_epc); + backing = shmem_file_setup("SGX backing", encl_size + (encl_size >> 5), + VM_NORESERVE); + if (IS_ERR(backing)) { + ret = PTR_ERR(backing); + goto err_out_shrink; + } + + encl->backing = backing; + + secs_epc = sgx_alloc_epc_page(&encl->secs, true); + if (IS_ERR(secs_epc)) { + ret = PTR_ERR(secs_epc); + goto err_out_backing; + }
encl->secs.epc_page = secs_epc;
@@ -63,6 +120,13 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) sgx_free_epc_page(encl->secs.epc_page); encl->secs.epc_page = NULL;
+err_out_backing: + fput(encl->backing); + encl->backing = NULL; + +err_out_shrink: + sgx_encl_shrink(encl, va_page); + return ret; }
@@ -228,21 +292,35 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, { struct sgx_encl_page *encl_page; struct sgx_epc_page *epc_page; + struct sgx_va_page *va_page; int ret;
encl_page = sgx_encl_page_alloc(encl, offset, secinfo->flags); if (IS_ERR(encl_page)) return PTR_ERR(encl_page);
- epc_page = __sgx_alloc_epc_page(); + epc_page = sgx_alloc_epc_page(encl_page, true); if (IS_ERR(epc_page)) { kfree(encl_page); return PTR_ERR(epc_page); }
+ va_page = sgx_encl_grow(encl); + if (IS_ERR(va_page)) { + ret = PTR_ERR(va_page); + goto err_out_free; + } + mmap_read_lock(current->mm); mutex_lock(&encl->lock);
+ /* + * Adding to encl->va_pages must be done under encl->lock. Ditto for + * deleting (via sgx_encl_shrink()) in the error path. + */ + if (va_page) + list_add(&va_page->list, &encl->va_pages); + /* * Insert prior to EADD in case of OOM. EADD modifies MRENCLAVE, i.e. * can't be gracefully unwound, while failure on EADD/EXTEND is limited @@ -273,6 +351,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, goto err_out; }
+ sgx_mark_page_reclaimable(encl_page->epc_page); mutex_unlock(&encl->lock); mmap_read_unlock(current->mm); return ret; @@ -281,9 +360,11 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
err_out_unlock: + sgx_encl_shrink(encl, va_page); mutex_unlock(&encl->lock); mmap_read_unlock(current->mm);
+err_out_free: sgx_free_epc_page(epc_page); kfree(encl_page);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 38f2e80cc31a..3426785df457 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -16,6 +16,15 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; +static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); + +/* + * These variables are part of the state of the reclaimer, and must be accessed + * with sgx_reclaimer_lock acquired. + */ +static LIST_HEAD(sgx_active_page_list); + +static DEFINE_SPINLOCK(sgx_reclaimer_lock);
/* * Reset dirty EPC pages to uninitialized state. Laundry can be left with SECS @@ -50,6 +59,348 @@ static void sgx_sanitize_section(struct sgx_epc_section *section) list_splice(&dirty, §ion->laundry_list); }
+static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) +{ + struct sgx_encl_page *page = epc_page->owner; + struct sgx_encl *encl = page->encl; + struct sgx_encl_mm *encl_mm; + bool ret = true; + int idx; + + idx = srcu_read_lock(&encl->srcu); + + list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) { + if (!mmget_not_zero(encl_mm->mm)) + continue; + + mmap_read_lock(encl_mm->mm); + ret = !sgx_encl_test_and_clear_young(encl_mm->mm, page); + mmap_read_unlock(encl_mm->mm); + + mmput_async(encl_mm->mm); + + if (!ret) + break; + } + + srcu_read_unlock(&encl->srcu, idx); + + if (!ret) + return false; + + return true; +} + +static void sgx_reclaimer_block(struct sgx_epc_page *epc_page) +{ + struct sgx_encl_page *page = epc_page->owner; + unsigned long addr = page->desc & PAGE_MASK; + struct sgx_encl *encl = page->encl; + unsigned long mm_list_version; + struct sgx_encl_mm *encl_mm; + struct vm_area_struct *vma; + int idx, ret; + + do { + mm_list_version = encl->mm_list_version; + + /* Pairs with smp_rmb() in sgx_encl_mm_add(). */ + smp_rmb(); + + idx = srcu_read_lock(&encl->srcu); + + list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) { + if (!mmget_not_zero(encl_mm->mm)) + continue; + + mmap_read_lock(encl_mm->mm); + + ret = sgx_encl_find(encl_mm->mm, addr, &vma); + if (!ret && encl == vma->vm_private_data) + zap_vma_ptes(vma, addr, PAGE_SIZE); + + mmap_read_unlock(encl_mm->mm); + + mmput_async(encl_mm->mm); + } + + srcu_read_unlock(&encl->srcu, idx); + } while (unlikely(encl->mm_list_version != mm_list_version)); + + mutex_lock(&encl->lock); + + ret = __eblock(sgx_get_epc_virt_addr(epc_page)); + if (encls_failed(ret)) + ENCLS_WARN(ret, "EBLOCK"); + + mutex_unlock(&encl->lock); +} + +static int __sgx_encl_ewb(struct sgx_epc_page *epc_page, void *va_slot, + struct sgx_backing *backing) +{ + struct sgx_pageinfo pginfo; + int ret; + + pginfo.addr = 0; + pginfo.secs = 0; + + pginfo.contents = (unsigned long)kmap_atomic(backing->contents); + pginfo.metadata = (unsigned long)kmap_atomic(backing->pcmd) + + backing->pcmd_offset; + + ret = __ewb(&pginfo, sgx_get_epc_virt_addr(epc_page), va_slot); + + kunmap_atomic((void *)(unsigned long)(pginfo.metadata - + backing->pcmd_offset)); + kunmap_atomic((void *)(unsigned long)pginfo.contents); + + return ret; +} + +static void sgx_ipi_cb(void *info) +{ +} + +static const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl) +{ + cpumask_t *cpumask = &encl->cpumask; + struct sgx_encl_mm *encl_mm; + int idx; + + /* + * Can race with sgx_encl_mm_add(), but ETRACK has already been + * executed, which means that the CPUs running in the new mm will enter + * into the enclave with a fresh epoch. + */ + cpumask_clear(cpumask); + + idx = srcu_read_lock(&encl->srcu); + + list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) { + if (!mmget_not_zero(encl_mm->mm)) + continue; + + cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm)); + + mmput_async(encl_mm->mm); + } + + srcu_read_unlock(&encl->srcu, idx); + + return cpumask; +} + +/* + * Swap page to the regular memory transformed to the blocked state by using + * EBLOCK, which means that it can no loger be referenced (no new TLB entries). + * + * The first trial just tries to write the page assuming that some other thread + * has reset the count for threads inside the enlave by using ETRACK, and + * previous thread count has been zeroed out. The second trial calls ETRACK + * before EWB. If that fails we kick all the HW threads out, and then do EWB, + * which should be guaranteed the succeed. 
+ */ +static void sgx_encl_ewb(struct sgx_epc_page *epc_page, + struct sgx_backing *backing) +{ + struct sgx_encl_page *encl_page = epc_page->owner; + struct sgx_encl *encl = encl_page->encl; + struct sgx_va_page *va_page; + unsigned int va_offset; + void *va_slot; + int ret; + + encl_page->desc &= ~SGX_ENCL_PAGE_BEING_RECLAIMED; + + va_page = list_first_entry(&encl->va_pages, struct sgx_va_page, + list); + va_offset = sgx_alloc_va_slot(va_page); + va_slot = sgx_get_epc_virt_addr(va_page->epc_page) + va_offset; + if (sgx_va_page_full(va_page)) + list_move_tail(&va_page->list, &encl->va_pages); + + ret = __sgx_encl_ewb(epc_page, va_slot, backing); + if (ret == SGX_NOT_TRACKED) { + ret = __etrack(sgx_get_epc_virt_addr(encl->secs.epc_page)); + if (ret) { + if (encls_failed(ret)) + ENCLS_WARN(ret, "ETRACK"); + } + + ret = __sgx_encl_ewb(epc_page, va_slot, backing); + if (ret == SGX_NOT_TRACKED) { + /* + * Slow path, send IPIs to kick cpus out of the + * enclave. Note, it's imperative that the cpu + * mask is generated *after* ETRACK, else we'll + * miss cpus that entered the enclave between + * generating the mask and incrementing epoch. + */ + on_each_cpu_mask(sgx_encl_ewb_cpumask(encl), + sgx_ipi_cb, NULL, 1); + ret = __sgx_encl_ewb(epc_page, va_slot, backing); + } + } + + if (ret) { + if (encls_failed(ret)) + ENCLS_WARN(ret, "EWB"); + + sgx_free_va_slot(va_page, va_offset); + } else { + encl_page->desc |= va_offset; + encl_page->va_page = va_page; + } +} + +static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, + struct sgx_backing *backing) +{ + struct sgx_encl_page *encl_page = epc_page->owner; + struct sgx_encl *encl = encl_page->encl; + struct sgx_backing secs_backing; + int ret; + + mutex_lock(&encl->lock); + + sgx_encl_ewb(epc_page, backing); + encl_page->epc_page = NULL; + encl->secs_child_cnt--; + + if (!encl->secs_child_cnt && test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) { + ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size), + &secs_backing); + if (ret) + goto out; + + sgx_encl_ewb(encl->secs.epc_page, &secs_backing); + + sgx_free_epc_page(encl->secs.epc_page); + encl->secs.epc_page = NULL; + + sgx_encl_put_backing(&secs_backing, true); + } + +out: + mutex_unlock(&encl->lock); +} + +/* + * Take a fixed number of pages from the head of the active page pool and + * reclaim them to the enclave's private shmem files. Skip the pages, which have + * been accessed since the last scan. Move those pages to the tail of active + * page pool so that the pages get scanned in LRU like fashion. + * + * Batch process a chunk of pages (at the moment 16) in order to degrade amount + * of IPI's and ETRACK's potentially required. sgx_encl_ewb() does degrade a bit + * among the HW threads with three stage EWB pipeline (EWB, ETRACK + EWB and IPI + * + EWB) but not sufficiently. Reclaiming one page at a time would also be + * problematic as it would increase the lock contention too much, which would + * halt forward progress. 
+ */ +static void sgx_reclaim_pages(void) +{ + struct sgx_epc_page *chunk[SGX_NR_TO_SCAN]; + struct sgx_backing backing[SGX_NR_TO_SCAN]; + struct sgx_epc_section *section; + struct sgx_encl_page *encl_page; + struct sgx_epc_page *epc_page; + pgoff_t page_index; + int cnt = 0; + int ret; + int i; + + spin_lock(&sgx_reclaimer_lock); + for (i = 0; i < SGX_NR_TO_SCAN; i++) { + if (list_empty(&sgx_active_page_list)) + break; + + epc_page = list_first_entry(&sgx_active_page_list, + struct sgx_epc_page, list); + list_del_init(&epc_page->list); + encl_page = epc_page->owner; + + if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) + chunk[cnt++] = epc_page; + else + /* The owner is freeing the page. No need to add the + * page back to the list of reclaimable pages. + */ + epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + } + spin_unlock(&sgx_reclaimer_lock); + + for (i = 0; i < cnt; i++) { + epc_page = chunk[i]; + encl_page = epc_page->owner; + + if (!sgx_reclaimer_age(epc_page)) + goto skip; + + page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); + ret = sgx_encl_get_backing(encl_page->encl, page_index, &backing[i]); + if (ret) + goto skip; + + mutex_lock(&encl_page->encl->lock); + encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED; + mutex_unlock(&encl_page->encl->lock); + continue; + +skip: + spin_lock(&sgx_reclaimer_lock); + list_add_tail(&epc_page->list, &sgx_active_page_list); + spin_unlock(&sgx_reclaimer_lock); + + kref_put(&encl_page->encl->refcount, sgx_encl_release); + + chunk[i] = NULL; + } + + for (i = 0; i < cnt; i++) { + epc_page = chunk[i]; + if (epc_page) + sgx_reclaimer_block(epc_page); + } + + for (i = 0; i < cnt; i++) { + epc_page = chunk[i]; + if (!epc_page) + continue; + + encl_page = epc_page->owner; + sgx_reclaimer_write(epc_page, &backing[i]); + sgx_encl_put_backing(&backing[i], true); + + kref_put(&encl_page->encl->refcount, sgx_encl_release); + epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + + section = &sgx_epc_sections[epc_page->section]; + spin_lock(§ion->lock); + list_add_tail(&epc_page->list, §ion->page_list); + section->free_cnt++; + spin_unlock(§ion->lock); + } +} + +static unsigned long sgx_nr_free_pages(void) +{ + unsigned long cnt = 0; + int i; + + for (i = 0; i < sgx_nr_epc_sections; i++) + cnt += sgx_epc_sections[i].free_cnt; + + return cnt; +} + +static bool sgx_should_reclaim(unsigned long watermark) +{ + return sgx_nr_free_pages() < watermark && + !list_empty(&sgx_active_page_list); +} + static int ksgxd(void *p) { int i; @@ -71,6 +422,20 @@ static int ksgxd(void *p) WARN(1, "EPC section %d has unsanitized pages.\n", i); }
+ while (!kthread_should_stop()) { + if (try_to_freeze()) + continue; + + wait_event_freezable(ksgxd_waitq, + kthread_should_stop() || + sgx_should_reclaim(SGX_NR_HIGH_PAGES)); + + if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) + sgx_reclaim_pages(); + + cond_resched(); + } + return 0; }
@@ -100,6 +465,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_section(struct sgx_epc_sec
page = list_first_entry(§ion->page_list, struct sgx_epc_page, list); list_del_init(&page->list); + section->free_cnt--;
spin_unlock(§ion->lock); return page; @@ -132,6 +498,100 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) return ERR_PTR(-ENOMEM); }
+/** + * sgx_mark_page_reclaimable() - Mark a page as reclaimable + * @page: EPC page + * + * Mark a page as reclaimable and add it to the active page list. Pages + * are automatically removed from the active list when freed. + */ +void sgx_mark_page_reclaimable(struct sgx_epc_page *page) +{ + spin_lock(&sgx_reclaimer_lock); + page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED; + list_add_tail(&page->list, &sgx_active_page_list); + spin_unlock(&sgx_reclaimer_lock); +} + +/** + * sgx_unmark_page_reclaimable() - Remove a page from the reclaim list + * @page: EPC page + * + * Clear the reclaimable flag and remove the page from the active page list. + * + * Return: + * 0 on success, + * -EBUSY if the page is in the process of being reclaimed + */ +int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) +{ + spin_lock(&sgx_reclaimer_lock); + if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { + /* The page is being reclaimed. */ + if (list_empty(&page->list)) { + spin_unlock(&sgx_reclaimer_lock); + return -EBUSY; + } + + list_del(&page->list); + page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + } + spin_unlock(&sgx_reclaimer_lock); + + return 0; +} + +/** + * sgx_alloc_epc_page() - Allocate an EPC page + * @owner: the owner of the EPC page + * @reclaim: reclaim pages if necessary + * + * Iterate through EPC sections and borrow a free EPC page to the caller. When a + * page is no longer needed it must be released with sgx_free_epc_page(). If + * @reclaim is set to true, directly reclaim pages when we are out of pages. No + * mm's can be locked when @reclaim is set to true. + * + * Finally, wake up ksgxd when the number of pages goes below the watermark + * before returning back to the caller. + * + * Return: + * an EPC page, + * -errno on error + */ +struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) +{ + struct sgx_epc_page *page; + + for ( ; ; ) { + page = __sgx_alloc_epc_page(); + if (!IS_ERR(page)) { + page->owner = owner; + break; + } + + if (list_empty(&sgx_active_page_list)) + return ERR_PTR(-ENOMEM); + + if (!reclaim) { + page = ERR_PTR(-EBUSY); + break; + } + + if (signal_pending(current)) { + page = ERR_PTR(-ERESTARTSYS); + break; + } + + sgx_reclaim_pages(); + cond_resched(); + } + + if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) + wake_up(&ksgxd_waitq); + + return page; +} + /** * sgx_free_epc_page() - Free an EPC page * @page: an EPC page @@ -143,12 +603,15 @@ void sgx_free_epc_page(struct sgx_epc_page *page) struct sgx_epc_section *section = &sgx_epc_sections[page->section]; int ret;
+ WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); + ret = __eremove(sgx_get_epc_virt_addr(page)); if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) return;
spin_lock(§ion->lock); list_add_tail(&page->list, §ion->page_list); + section->free_cnt++; spin_unlock(§ion->lock); }
@@ -176,9 +639,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; + section->pages[i].flags = 0; + section->pages[i].owner = NULL; list_add_tail(§ion->pages[i].list, §ion->laundry_list); }
+ section->free_cnt = nr_pages; return true; }
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 91234f425b89..a188a683ffb6 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -15,9 +15,17 @@
#define SGX_MAX_EPC_SECTIONS 8 #define SGX_EEXTEND_BLOCK_SIZE 256 +#define SGX_NR_TO_SCAN 16 +#define SGX_NR_LOW_PAGES 32 +#define SGX_NR_HIGH_PAGES 64 + +/* Pages, which are being tracked by the page reclaimer. */ +#define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0)
struct sgx_epc_page { unsigned int section; + unsigned int flags; + struct sgx_encl_page *owner; struct list_head list; };
@@ -33,6 +41,7 @@ struct sgx_epc_section { struct list_head page_list; struct list_head laundry_list; struct sgx_epc_page *pages; + unsigned long free_cnt; spinlock_t lock; };
@@ -61,4 +70,8 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) struct sgx_epc_page *__sgx_alloc_epc_page(void); void sgx_free_epc_page(struct sgx_epc_page *page);
+void sgx_mark_page_reclaimable(struct sgx_epc_page *page); +int sgx_unmark_page_reclaimable(struct sgx_epc_page *page); +struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); + #endif /* _X86_SGX_H */
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 947c6e11fa4310b31c10016ae9816cdca3f1694e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Enclave memory is normally inaccessible from outside the enclave. This makes enclaves hard to debug. However, enclaves can be put in a debug mode when they are being built. In that mode, enclave data *can* be read and/or written by using the ENCLS[EDBGRD] and ENCLS[EDBGWR] functions.
This is obviously only for debugging and destroys all the protections present with normal enclaves. But, enclaves know their own debug status and can adjust their behavior appropriately.
Add a vm_ops->access() implementation which can be used to read and write memory inside debug enclaves. This is typically used via ptrace() APIs.
[ bp: Massage. ]
Intel-SIG: commit 947c6e11fa43 x86/sgx: Add ptrace() support for the SGX driver Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-23-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/encl.c | 111 +++++++++++++++++++++++++++++++++ 1 file changed, 111 insertions(+)
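For illustration, a hedged user-space sketch of how the new hook is reached: PTRACE_PEEKDATA on a stopped tracee goes through access_process_vm() and, for an SGX VMA, ends up in the vm_ops->access() implementation added below. The pid and enclave address here are placeholders, and the read fails unless the enclave was built with SGX_ENCL_DEBUG:

#include <errno.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Read one word from a debug enclave mapping of another process. */
static long peek_enclave_word(pid_t pid, unsigned long encl_addr)
{
	long word;

	if (ptrace(PTRACE_ATTACH, pid, NULL, NULL))
		return -1;
	waitpid(pid, NULL, 0);

	errno = 0;
	word = ptrace(PTRACE_PEEKDATA, pid, (void *)encl_addr, NULL);
	if (word == -1 && errno)
		perror("PTRACE_PEEKDATA");	/* EIO on non-debug enclaves */

	ptrace(PTRACE_DETACH, pid, NULL, NULL);
	return word;
}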
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index b74dadf85989..ee50a5010277 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -272,10 +272,121 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, unsigned long start, return sgx_encl_may_map(vma->vm_private_data, start, end, newflags); }
+static int sgx_encl_debug_read(struct sgx_encl *encl, struct sgx_encl_page *page, + unsigned long addr, void *data) +{ + unsigned long offset = addr & ~PAGE_MASK; + int ret; + + + ret = __edbgrd(sgx_get_epc_virt_addr(page->epc_page) + offset, data); + if (ret) + return -EIO; + + return 0; +} + +static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *page, + unsigned long addr, void *data) +{ + unsigned long offset = addr & ~PAGE_MASK; + int ret; + + ret = __edbgwr(sgx_get_epc_virt_addr(page->epc_page) + offset, data); + if (ret) + return -EIO; + + return 0; +} + +/* + * Load an enclave page to EPC if required, and take encl->lock. + */ +static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl, + unsigned long addr, + unsigned long vm_flags) +{ + struct sgx_encl_page *entry; + + for ( ; ; ) { + mutex_lock(&encl->lock); + + entry = sgx_encl_load_page(encl, addr, vm_flags); + if (PTR_ERR(entry) != -EBUSY) + break; + + mutex_unlock(&encl->lock); + } + + if (IS_ERR(entry)) + mutex_unlock(&encl->lock); + + return entry; +} + +static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, + void *buf, int len, int write) +{ + struct sgx_encl *encl = vma->vm_private_data; + struct sgx_encl_page *entry = NULL; + char data[sizeof(unsigned long)]; + unsigned long align; + int offset; + int cnt; + int ret = 0; + int i; + + /* + * If process was forked, VMA is still there but vm_private_data is set + * to NULL. + */ + if (!encl) + return -EFAULT; + + if (!test_bit(SGX_ENCL_DEBUG, &encl->flags)) + return -EFAULT; + + for (i = 0; i < len; i += cnt) { + entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK, + vma->vm_flags); + if (IS_ERR(entry)) { + ret = PTR_ERR(entry); + break; + } + + align = ALIGN_DOWN(addr + i, sizeof(unsigned long)); + offset = (addr + i) & (sizeof(unsigned long) - 1); + cnt = sizeof(unsigned long) - offset; + cnt = min(cnt, len - i); + + ret = sgx_encl_debug_read(encl, entry, align, data); + if (ret) + goto out; + + if (write) { + memcpy(data + offset, buf + i, cnt); + ret = sgx_encl_debug_write(encl, entry, align, data); + if (ret) + goto out; + } else { + memcpy(buf + i, data + offset, cnt); + } + +out: + mutex_unlock(&encl->lock); + + if (ret) + break; + } + + return ret < 0 ? ret : i; +} + const struct vm_operations_struct sgx_vm_ops = { .fault = sgx_vma_fault, .mprotect = sgx_vma_mprotect, .open = sgx_vma_open, + .access = sgx_vma_access, };
/**
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 3fa97bf001262a1d88ec9b4ac5ae6abe0ed1356c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Document the Intel SGX kernel architecture. The fine-grained architecture details can be looked up from Intel SDM Volume 3D.
Intel-SIG: commit 3fa97bf00126 Documentation/x86: Document SGX kernel architecture Backport for SGX Foundations support
Co-developed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Cc: linux-doc@vger.kernel.org Link: https://lkml.kernel.org/r/20201112220135.165028-24-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- Documentation/x86/index.rst | 1 + Documentation/x86/sgx.rst | 211 ++++++++++++++++++++++++++++++++++++ 2 files changed, 212 insertions(+) create mode 100644 Documentation/x86/sgx.rst
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index b224d12c880b..647a570d4931 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -33,3 +33,4 @@ x86-specific Documentation i386/index x86_64/index sva + sgx diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst new file mode 100644 index 000000000000..eaee1368b4fd --- /dev/null +++ b/Documentation/x86/sgx.rst @@ -0,0 +1,211 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============================== +Software Guard eXtensions (SGX) +=============================== + +Overview +======== + +Software Guard eXtensions (SGX) hardware enables user space applications +to set aside private memory regions of code and data: + +* Privileged (ring-0) ENCLS functions orchestrate the construction of the + regions. +* Unprivileged (ring-3) ENCLU functions allow an application to enter and + execute inside the regions. + +These memory regions are called enclaves. An enclave can only be entered at a +fixed set of entry points. Each entry point can hold a single hardware thread +at a time. While the enclave is loaded from a regular binary file by using +ENCLS functions, only the threads inside the enclave can access its memory. The +region is denied from outside access by the CPU, and encrypted before it leaves +the LLC. + +The support can be determined by + + ``grep sgx /proc/cpuinfo`` + +SGX must both be supported in the processor and enabled by the BIOS. If SGX +appears to be unsupported on a system which has hardware support, ensure +support is enabled in the BIOS. If a BIOS presents a choice between "Enabled" +and "Software Enabled" modes for SGX, choose "Enabled". + +Enclave Page Cache +================== + +SGX utilizes an *Enclave Page Cache (EPC)* to store pages that are associated +with an enclave. It is contained in a BIOS-reserved region of physical memory. +Unlike pages used for regular memory, pages can only be accessed from outside of +the enclave during enclave construction with special, limited SGX instructions. + +Only a CPU executing inside an enclave can directly access enclave memory. +However, a CPU executing inside an enclave may access normal memory outside the +enclave. + +The kernel manages enclave memory similar to how it treats device memory. + +Enclave Page Types +------------------ + +**SGX Enclave Control Structure (SECS)** + Enclave's address range, attributes and other global data are defined + by this structure. + +**Regular (REG)** + Regular EPC pages contain the code and data of an enclave. + +**Thread Control Structure (TCS)** + Thread Control Structure pages define the entry points to an enclave and + track the execution state of an enclave thread. + +**Version Array (VA)** + Version Array pages contain 512 slots, each of which can contain a version + number for a page evicted from the EPC. + +Enclave Page Cache Map +---------------------- + +The processor tracks EPC pages in a hardware metadata structure called the +*Enclave Page Cache Map (EPCM)*. The EPCM contains an entry for each EPC page +which describes the owning enclave, access rights and page type among other +things. + +EPCM permissions are separate from the normal page tables. This prevents the +kernel from, for instance, allowing writes to data which an enclave wishes to +remain read-only. EPCM permissions may only impose additional restrictions on +top of normal x86 page permissions.
+ +For all intents and purposes, the SGX architecture allows the processor to +invalidate all EPCM entries at will. This requires that software be prepared to +handle an EPCM fault at any time. In practice, this can happen on events like +power transitions when the ephemeral key that encrypts enclave memory is lost. + +Application interface +===================== + +Enclave build functions +----------------------- + +In addition to the traditional compiler and linker build process, SGX has a +separate enclave “build” process. Enclaves must be built before they can be +executed (entered). The first step in building an enclave is opening the +**/dev/sgx_enclave** device. Since enclave memory is protected from direct +access, special privileged instructions are then used to copy data into enclave +pages and establish enclave page permissions. + +.. kernel-doc:: arch/x86/kernel/cpu/sgx/ioctl.c + :functions: sgx_ioc_enclave_create + sgx_ioc_enclave_add_pages + sgx_ioc_enclave_init + sgx_ioc_enclave_provision + +Enclave vDSO +------------ + +Entering an enclave can only be done through SGX-specific EENTER and ERESUME +functions, and is a non-trivial process. Because of the complexity of +transitioning to and from an enclave, enclaves typically utilize a library to +handle the actual transitions. This is roughly analogous to how glibc +implementations are used by most applications to wrap system calls. + +Another crucial characteristic of enclaves is that they can generate exceptions +as part of their normal operation that need to be handled in the enclave or are +unique to SGX. + +Instead of the traditional signal mechanism to handle these exceptions, SGX +can leverage special exception fixup provided by the vDSO. The kernel-provided +vDSO function wraps low-level transitions to/from the enclave like EENTER and +ERESUME. The vDSO function intercepts exceptions that would otherwise generate +a signal and returns the fault information directly to its caller. This avoids +the need to juggle signal handlers. + +.. kernel-doc:: arch/x86/include/uapi/asm/sgx.h + :functions: vdso_sgx_enter_enclave_t + +ksgxd +===== + +SGX support includes a kernel thread called *ksgxd*. + +EPC sanitization +---------------- + +ksgxd is started when SGX initializes. Enclave memory is typically ready +for use when the processor powers on or resets. However, if SGX has been in +use since the reset, enclave pages may be in an inconsistent state. This might +occur after a crash and kexec() cycle, for instance. At boot, ksgxd +reinitializes all enclave pages so that they can be allocated and re-used. + +The sanitization is done by going through EPC address space and applying the +EREMOVE function to each physical page. Some enclave pages like SECS pages have +hardware dependencies on other pages which prevents EREMOVE from functioning. +Executing two EREMOVE passes removes the dependencies. + +Page reclaimer +-------------- + +Similar to the core kswapd, ksgxd is responsible for managing the +overcommitment of enclave memory. If the system runs out of enclave memory, +*ksgxd* “swaps” enclave memory to normal memory. + +Launch Control +============== + +SGX provides a launch control mechanism. After all enclave pages have been +copied, the kernel executes the EINIT function, which initializes the enclave. Only after +this can the CPU execute inside the enclave. + +EINIT takes an RSA-3072 signature of the enclave measurement.
The function +checks that the measurement is correct and the signature is signed with the key +hashed to the four **IA32_SGXLEPUBKEYHASH{0, 1, 2, 3}** MSRs representing the +SHA256 of a public key. + +Those MSRs can be configured by the BIOS to be either readable or writable. +Linux supports only writable configuration in order to give full control to the +kernel on launch control policy. Before calling the EINIT function, the driver sets +the MSRs to match the enclave's signing key. + +Encryption engines +================== + +In order to conceal the enclave data while it is out of the CPU package, the +memory controller has an encryption engine to transparently encrypt and decrypt +enclave memory. + +In CPUs prior to Ice Lake, the Memory Encryption Engine (MEE) is used to +encrypt pages leaving the CPU caches. MEE uses an n-ary Merkle tree with root in +SRAM to maintain integrity of the encrypted data. This provides integrity and +anti-replay protection but does not scale to large memory sizes because the time +required to update the Merkle tree grows logarithmically in relation to the +memory size. + +CPUs starting from Icelake use Total Memory Encryption (TME) in the place of +MEE. TME-based SGX implementations do not have an integrity Merkle tree, which +means integrity and replay-attacks are not mitigated. But, it includes +additional changes to prevent cipher text from being returned and SW memory +aliases from being created. + +DMA to enclave memory is blocked by range registers on both MEE and TME systems +(SDM section 41.10). + +Usage Models +============ + +Shared Library +-------------- + +Sensitive data and the code that acts on it is partitioned from the application +into a separate library. The library is then linked as a DSO which can be loaded +into an enclave. The application can then make individual function calls into +the enclave through special SGX instructions. A run-time within the enclave is +configured to marshal function parameters into and out of the enclave and to +call the correct library function. + +Application Container +--------------------- + +An application may be loaded into a container enclave which is specially +configured with a library OS and run-time which permits the application to run. +The enclave run-time and library OS work together to execute the application +when a thread enters the enclave.
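For illustration, here is a minimal userspace sketch of the build flow the document describes: open /dev/sgx_enclave and issue SGX_IOC_ENCLAVE_CREATE. It assumes the UAPI from arch/x86/include/uapi/asm/sgx.h (struct sgx_enclave_create, SGX_IOC_ENCLAVE_CREATE); the zeroed SECS page is a placeholder, since a real loader must populate the SECS per the SDM and then continue with SGX_IOC_ENCLAVE_ADD_PAGES and SGX_IOC_ENCLAVE_INIT.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <asm/sgx.h>

int main(void)
{
	struct sgx_enclave_create create_arg;
	void *secs;
	int fd;

	fd = open("/dev/sgx_enclave", O_RDWR);
	if (fd < 0) {
		perror("open /dev/sgx_enclave");
		return 1;
	}

	/* The SECS page handed to ECREATE must be page-aligned. */
	secs = aligned_alloc(4096, 4096);
	if (!secs)
		return 1;
	memset(secs, 0, 4096);
	/* ... fill in size, base, attributes, etc. per the SDM here ... */

	create_arg.src = (unsigned long)secs;
	if (ioctl(fd, SGX_IOC_ENCLAVE_CREATE, &create_arg))
		perror("SGX_IOC_ENCLAVE_CREATE");

	free(secs);
	close(fd);
	return 0;
}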
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit bc4bac2ecef0e47fd5c02f9c6f9585fd477f9beb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Add the maintainer information for the SGX subsystem.
Intel-SIG: commit bc4bac2ecef0 x86/sgx: Update MAINTAINERS Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jethro Beekman jethro@fortanix.com Link: https://lkml.kernel.org/r/20201112220135.165028-25-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- MAINTAINERS | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS index 342cc197d0ca..9b5ee13d1373 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9159,6 +9159,19 @@ F: Documentation/x86/intel_txt.rst F: arch/x86/kernel/tboot.c F: include/linux/tboot.h
+INTEL SGX +M: Jarkko Sakkinen jarkko@kernel.org +L: linux-sgx@vger.kernel.org +S: Supported +Q: https://patchwork.kernel.org/project/intel-sgx/list/ +T: git git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-sgx.git +F: Documentation/x86/sgx.rst +F: arch/x86/entry/vdso/vsgx.S +F: arch/x86/include/uapi/asm/sgx.h +F: arch/x86/kernel/cpu/sgx/* +F: tools/testing/selftests/sgx/* +K: \bSGX_ + INTERCONNECT API M: Georgi Djakov georgi.djakov@linaro.org L: linux-pm@vger.kernel.org
From: Dave Hansen dave.hansen@linux.intel.com
mainline inclusion from mainline-v5.11-rc1 commit 67655b57f8f59467506463055d9a8398d2836377 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Short Version:
The SGX section->laundry_list structure is effectively thread-local, but declared next to some shared structures. Its semantics are clear as mud. Fix that. No functional changes. Compile tested only.
Long Version:
The SGX hardware keeps per-page metadata. This can provide things like permissions, integrity and replay protection. It also prevents things like having an enclave page mapped multiple times or shared between enclaves.
But, that presents a problem for kexec()'d kernels (or any other kernel that does not run immediately after a hardware reset). This is because the last kernel may have been rude and forgotten to reset pages, which would trigger the "shared page" sanity check.
To fix this, the SGX code "launders" the pages by running the EREMOVE instruction on all pages at boot. This is slow and can take a long time, so it is performed off in the SGX-specific ksgxd instead of being synchronous at boot. The init code hands the list of pages to launder in a per-SGX-section list: ->laundry_list. The only code to touch this list is the init code and ksgxd. This means that no locking is necessary for ->laundry_list.
However, a lock is required for section->page_list, which is accessed while creating enclaves and by ksgxd. This lock (section->lock) is acquired by ksgxd while also processing ->laundry_list. It is easy to confuse the purpose of the locking as being for ->laundry_list and ->page_list.
Rename ->laundry_list to ->init_laundry_list to make it clear that this is not normally used at runtime. Also add some comments clarifying the locking, and reorganize 'sgx_epc_section' to put 'lock' near the things it protects.
Note: init_laundry_list is 128 bytes of wasted space at runtime. It could theoretically be dynamically allocated and then freed after the laundering process. But it would take nearly 128 bytes of extra instructions to do that.
Intel-SIG: commit 67655b57f8f5 x86/sgx: Clarify 'laundry_list' locking Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20201116222531.4834-1-dave.hansen@intel.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/main.c | 14 ++++++++------ arch/x86/kernel/cpu/sgx/sgx.h | 15 ++++++++++++--- 2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 3426785df457..c519fc5f6948 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -36,13 +36,15 @@ static void sgx_sanitize_section(struct sgx_epc_section *section) LIST_HEAD(dirty); int ret;
- while (!list_empty(§ion->laundry_list)) { + /* init_laundry_list is thread-local, no need for a lock: */ + while (!list_empty(§ion->init_laundry_list)) { if (kthread_should_stop()) return;
+ /* needed for access to ->page_list: */ spin_lock(§ion->lock);
- page = list_first_entry(§ion->laundry_list, + page = list_first_entry(§ion->init_laundry_list, struct sgx_epc_page, list);
ret = __eremove(sgx_get_epc_virt_addr(page)); @@ -56,7 +58,7 @@ static void sgx_sanitize_section(struct sgx_epc_section *section) cond_resched(); }
- list_splice(&dirty, §ion->laundry_list); + list_splice(&dirty, §ion->init_laundry_list); }
static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) @@ -418,7 +420,7 @@ static int ksgxd(void *p) sgx_sanitize_section(&sgx_epc_sections[i]);
/* Should never happen. */ - if (!list_empty(&sgx_epc_sections[i].laundry_list)) + if (!list_empty(&sgx_epc_sections[i].init_laundry_list)) WARN(1, "EPC section %d has unsanitized pages.\n", i); }
@@ -635,13 +637,13 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->phys_addr = phys_addr; spin_lock_init(§ion->lock); INIT_LIST_HEAD(§ion->page_list); - INIT_LIST_HEAD(§ion->laundry_list); + INIT_LIST_HEAD(§ion->init_laundry_list);
for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; section->pages[i].owner = NULL; - list_add_tail(§ion->pages[i].list, §ion->laundry_list); + list_add_tail(§ion->pages[i].list, §ion->init_laundry_list); }
section->free_cnt = nr_pages; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index a188a683ffb6..5fa42d143feb 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -34,15 +34,24 @@ struct sgx_epc_page { * physical memory e.g. for memory areas of the each node. This structure is * used to store EPC pages for one EPC section and virtual memory area where * the pages have been mapped. + * + * 'lock' must be held before accessing 'page_list' or 'free_cnt'. */ struct sgx_epc_section { unsigned long phys_addr; void *virt_addr; - struct list_head page_list; - struct list_head laundry_list; struct sgx_epc_page *pages; - unsigned long free_cnt; + spinlock_t lock; + struct list_head page_list; + unsigned long free_cnt; + + /* + * Pages which need EREMOVE run on them before they can be + * used. Only safe to be accessed in ksgxd and init code. + * Not protected by locks. + */ + struct list_head init_laundry_list; };
extern struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 14132a5b807bb5caf778fe7ae1597e630971e949 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Return -ERESTARTSYS instead of -EINTR in sgx_ioc_enclave_add_pages() when interrupted before any pages have been processed. At this point the ioctl can obviously be restarted safely.
Intel-SIG: commit 14132a5b807b x86/sgx: Return -ERESTARTSYS in sgx_ioc_enclave_add_pages() Backport for SGX Foundations support
Reported-by: Haitao Huang haitao.huang@intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20201118213932.63341-1-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 6d37117ac8a0..30aefc93a31d 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -444,7 +444,7 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg) for (c = 0 ; c < add_arg.length; c += PAGE_SIZE) { if (signal_pending(current)) { if (!c) - ret = -EINTR; + ret = -ERESTARTSYS;
break; }
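A generic sketch of the pattern this fix relies on (illustrative, not the SGX code): when a signal arrives before any unit of work has completed, returning -ERESTARTSYS lets the syscall layer restart the ioctl transparently, whereas once work has been done the partial progress is reported so userspace does not redo completed units.

#include <linux/errno.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>

static long process_units(unsigned long nr_units)
{
	unsigned long c;

	for (c = 0; c < nr_units; c++) {
		if (signal_pending(current))
			/* Nothing done yet: safe to restart transparently. */
			return c ? (long)c : -ERESTARTSYS;
		/* ... process unit 'c' ... */
		cond_resched();
	}
	return nr_units;
}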
From: Borislav Petkov bp@suse.de
mainline inclusion from mainline-v5.11-rc1 commit afe76eca862ccde2a0c30105fc97a46a0b59339b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Fix
./arch/x86/kernel/cpu/sgx/ioctl.c:666: warning: Function parameter or member \ 'encl' not described in 'sgx_ioc_enclave_provision' ./arch/x86/kernel/cpu/sgx/ioctl.c:666: warning: Excess function parameter \ 'enclave' description in 'sgx_ioc_enclave_provision'
Intel-SIG: commit afe76eca862c x86/sgx: Fix sgx_ioc_enclave_provision() kernel-doc comment Backport for SGX Foundations support
Reported-by: Stephen Rothwell sfr@canb.auug.org.au Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20201123181922.0c009406@canb.auug.org.au Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 30aefc93a31d..c206aee80a04 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -652,7 +652,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
/** * sgx_ioc_enclave_provision() - handler for %SGX_IOC_ENCLAVE_PROVISION - * @enclave: an enclave pointer + * @encl: an enclave pointer * @arg: userspace pointer to a struct sgx_enclave_provision instance * * Allow ATTRIBUTE.PROVISION_KEY for an enclave by providing a file handle to
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit a4b9c48b96517ff4780b22a784e7537eac5dc21b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
The sgx_enclave_add_pages.length field is documented as
* @length: length of the data (multiple of the page size)
Fail with -EINVAL, when the caller gives a zero length buffer of data to be added as pages to an enclave. Right now 'ret' is returned as uninitialized in that case.
[ bp: Flesh out commit message. ]
Intel-SIG: commit a4b9c48b9651 x86/sgx: Return -EINVAL on a zero length buffer in sgx_ioc_enclave_add_pages() Backport for SGX Foundations support
Fixes: c6d26d370767 ("x86/sgx: Add SGX_IOC_ENCLAVE_ADD_PAGES") Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lore.kernel.org/linux-sgx/X8ehQssnslm194ld@mwanda/ Link: https://lkml.kernel.org/r/20201203183527.139317-1-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index c206aee80a04..90a5caf76939 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -428,7 +428,7 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg) !IS_ALIGNED(add_arg.src, PAGE_SIZE)) return -EINVAL;
- if (add_arg.length & (PAGE_SIZE - 1)) + if (!add_arg.length || add_arg.length & (PAGE_SIZE - 1)) return -EINVAL;
if (add_arg.offset + add_arg.length - PAGE_SIZE >= encl->size)
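From the userspace side, the effect of this fix can be demonstrated with a hypothetical test (not part of the patch), assuming the uapi struct sgx_enclave_add_pages: a zero-length request now fails with a well-defined EINVAL instead of returning an uninitialized value.

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <asm/sgx.h>

/* Returns the errno from a deliberately zero-length ADD_PAGES call. */
int add_pages_zero_length(int enclave_fd)
{
	struct sgx_enclave_add_pages addp;

	memset(&addp, 0, sizeof(addp));	/* addp.length == 0 */
	if (ioctl(enclave_fd, SGX_IOC_ENCLAVE_ADD_PAGES, &addp) == -1)
		return errno;		/* expected after this fix: EINVAL */
	return 0;
}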
From: Sami Tolvanen samitolvanen@google.com
mainline inclusion from mainline-v5.12-rc1-dontuse commit 31bf92881714fe9962d43d097b5114a9b4ad0a12 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
device_initcall() expects a function of type initcall_t, which returns an integer. Change the signature of sgx_init() to match.
Intel-SIG: commit 31bf92881714 x86/sgx: Fix the return type of sgx_init() Backport for SGX Foundations support
Fixes: e7e0545299d8c ("x86/sgx: Initialize metadata for Enclave Page Cache (EPC) sections") Signed-off-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Darren Kenny darren.kenny@oracle.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Link: https://lkml.kernel.org/r/20210113232311.277302-1-samitolvanen@google.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index c519fc5f6948..8df81a3ed945 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -700,25 +700,27 @@ static bool __init sgx_page_cache_init(void) return true; }
-static void __init sgx_init(void) +static int __init sgx_init(void) { int ret; int i;
if (!cpu_feature_enabled(X86_FEATURE_SGX)) - return; + return -ENODEV;
if (!sgx_page_cache_init()) - return; + return -ENOMEM;
- if (!sgx_page_reclaimer_init()) + if (!sgx_page_reclaimer_init()) { + ret = -ENOMEM; goto err_page_cache; + }
ret = sgx_drv_init(); if (ret) goto err_kthread;
- return; + return 0;
err_kthread: kthread_stop(ksgxd_tsk); @@ -728,6 +730,8 @@ static void __init sgx_init(void) vfree(sgx_epc_sections[i].pages); memunmap(sgx_epc_sections[i].virt_addr); } + + return ret; }
device_initcall(sgx_init);
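The requirement being fixed is simply the shape of initcall_t, which is int (*)(void). A minimal illustrative example (example_hw_present() is a hypothetical probe helper, not a kernel API):

#include <linux/init.h>
#include <linux/errno.h>
#include <linux/types.h>

static bool example_hw_present(void)	/* hypothetical probe helper */
{
	return false;
}

static int __init example_init(void)
{
	/* initcall_t: return 0 on success or a -errno on failure. */
	if (!example_hw_present())
		return -ENODEV;
	return 0;
}
device_initcall(example_init);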
From: Daniel Vetter daniel.vetter@ffwll.ch
mainline inclusion from mainline-v5.12-rc1-dontuse commit dc9b7be557ca94301ea5c06c0d72307e642ffb18 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
PTE insertion is fundamentally racy, and this check doesn't do anything useful. Quoting Sean:
"Yeah, it can be whacked. The original, never-upstreamed code asserted that the resolved PFN matched the PFN being installed by the fault handler as a sanity check on the SGX driver's EPC management. The WARN assertion got dropped for whatever reason, leaving that useless chunk."
Jason stumbled over this as a new user of follow_pfn(), and I'm trying to get rid of unsafe callers of that function so it can be locked down further.
This is independent prep work for the referenced patch series:
https://lore.kernel.org/dri-devel/20201127164131.2244124-1-daniel.vetter@ffw...
Intel-SIG: commit dc9b7be557ca x86/sgx: Drop racy follow_pfn() check Backport for SGX Foundations support
Fixes: 947c6e11fa43 ("x86/sgx: Add ptrace() support for the SGX driver") Reported-by: Jason Gunthorpe jgg@ziepe.ca Signed-off-by: Daniel Vetter daniel.vetter@intel.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Link: https://lkml.kernel.org/r/20210204184519.2809313-1-daniel.vetter@ffwll.ch Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/encl.c | 8 -------- 1 file changed, 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index ee50a5010277..20a2dd5ba2b4 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -141,7 +141,6 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) struct sgx_encl_page *entry; unsigned long phys_addr; struct sgx_encl *encl; - unsigned long pfn; vm_fault_t ret;
encl = vma->vm_private_data; @@ -168,13 +167,6 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
- /* Check if another thread got here first to insert the PTE. */ - if (!follow_pfn(vma, addr, &pfn)) { - mutex_unlock(&encl->lock); - - return VM_FAULT_NOPAGE; - } - ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); if (ret != VM_FAULT_NOPAGE) { mutex_unlock(&encl->lock);
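A simplified sketch of why the pre-check could be dropped (not the full SGX fault handler): vmf_insert_pfn() already copes with a PTE that a racing thread has installed and returns VM_FAULT_NOPAGE in that case as well, so the handler can call it unconditionally.

#include <linux/mm.h>
#include <linux/pfn.h>

static vm_fault_t example_fault(struct vm_fault *vmf, unsigned long phys_addr)
{
	/* No follow_pfn() pre-check needed; the race is handled inside. */
	return vmf_insert_pfn(vmf->vma, vmf->address, PFN_DOWN(phys_addr));
}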
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11 commit 2ade0d60939bcd54197c133b03b460fe62a4ec47 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
This has been shown in tests:
[ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100
This is essentially a use-after free, although SRCU notices it as an SRCU cleanup in an invalid context.
== Background ==
SGX has a data structure (struct sgx_encl_mm) which keeps per-mm SGX metadata. This is separate from struct sgx_encl because, in theory, an enclave can be mapped from more than one mm. sgx_encl_mm includes a pointer back to the sgx_encl.
This means that sgx_encl must have a longer lifetime than all of the sgx_encl_mm's that point to it. That's usually the case: sgx_encl_mm is freed only after the mmu_notifier is unregistered in sgx_release().
However, there's a race. If the process is exiting, sgx_mmu_notifier_release() can be called in parallel with sgx_release() instead of being called *by* it. The mmu_notifier path keeps encl_mm alive past when sgx_encl can be freed. This inverts the lifetime rules and means that sgx_mmu_notifier_release() can access a freed sgx_encl.
== Fix ==
Increase encl->refcount when encl_mm->encl is established. Release this reference when encl_mm is freed. This ensures that encl outlives encl_mm.
[ bp: Massage commit message. ]
Intel-SIG: commit 2ade0d60939b x86/sgx: Maintain encl->refcount for each encl->mm_list entry Backport for SGX Foundations support
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer") Reported-by: Haitao Huang haitao.huang@linux.intel.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lkml.kernel.org/r/20210207221401.29933-1-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/driver.c | 3 +++ arch/x86/kernel/cpu/sgx/encl.c | 5 +++++ 2 files changed, 8 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index f2eac41bb4ff..8ce6d8371cfb 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -72,6 +72,9 @@ static int sgx_release(struct inode *inode, struct file *file) synchronize_srcu(&encl->srcu); mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm); kfree(encl_mm); + + /* 'encl_mm' is gone, put encl_mm->encl reference: */ + kref_put(&encl->refcount, sgx_encl_release); }
kref_put(&encl->refcount, sgx_encl_release); diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 20a2dd5ba2b4..7449ef33f081 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -473,6 +473,9 @@ static void sgx_mmu_notifier_free(struct mmu_notifier *mn) { struct sgx_encl_mm *encl_mm = container_of(mn, struct sgx_encl_mm, mmu_notifier);
+ /* 'encl_mm' is going away, put encl_mm->encl reference: */ + kref_put(&encl_mm->encl->refcount, sgx_encl_release); + kfree(encl_mm); }
@@ -526,6 +529,8 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm) if (!encl_mm) return -ENOMEM;
+ /* Grab a refcount for the encl_mm->encl reference: */ + kref_get(&encl->refcount); encl_mm->encl = encl; encl_mm->mm = mm; encl_mm->mmu_notifier.ops = &sgx_mmu_notifier_ops;
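The fix follows the standard kref lifetime rule: take a reference when a pointer to the object is stored, and drop it when the holder of that pointer is freed. A self-contained sketch of the pairing (generic names, not the SGX structures):

#include <linux/kref.h>
#include <linux/slab.h>

struct parent {
	struct kref refcount;
};

struct child {
	struct parent *parent;
};

static void parent_release(struct kref *ref)
{
	kfree(container_of(ref, struct parent, refcount));
}

static struct parent *parent_create(void)
{
	struct parent *p = kzalloc(sizeof(*p), GFP_KERNEL);

	if (p)
		kref_init(&p->refcount);	/* refcount = 1 */
	return p;
}

/* Pairing rule: grab a reference when storing the back-pointer... */
static struct child *child_create(struct parent *p)
{
	struct child *c = kzalloc(sizeof(*c), GFP_KERNEL);

	if (!c)
		return NULL;
	kref_get(&p->refcount);
	c->parent = p;
	return c;
}

/* ...and drop it when the holder goes away, so parent outlives child. */
static void child_destroy(struct child *c)
{
	kref_put(&c->parent->refcount, parent_release);
	kfree(c);
}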
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.13-rc1 commit 51ab30eb2ad4c4a61f827dc18863cd70dc46dc32 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
During normal runtime, the "ksgxd" daemon behaves like a version of kswapd just for SGX. But, before it starts acting like kswapd, its first job is to initialize enclave memory.
Currently, the SGX boot code places each enclave page on a epc_section->init_laundry_list. Once it starts up, the ksgxd code walks over that list and populates the actual SGX page allocator.
However, the per-section structures are going away to make way for the SGX NUMA allocator. There's also little need to have a per-section structure; the enclave pages are all treated identically, and they can be placed on the correct allocator list from metadata stored in the enclave page (struct sgx_epc_page) itself.
Modify sgx_sanitize_section() to take a single page list instead of taking a section and deriving the list from there.
Intel-SIG: commit 51ab30eb2ad4 x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lkml.kernel.org/r/20210317235332.362001-1-jarkko.sakkinen@intel.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/main.c | 54 ++++++++++++++++------------------ arch/x86/kernel/cpu/sgx/sgx.h | 7 ----- 2 files changed, 25 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 8df81a3ed945..f3a5cd2d27ef 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -26,39 +26,43 @@ static LIST_HEAD(sgx_active_page_list);
static DEFINE_SPINLOCK(sgx_reclaimer_lock);
+static LIST_HEAD(sgx_dirty_page_list); + /* - * Reset dirty EPC pages to uninitialized state. Laundry can be left with SECS - * pages whose child pages blocked EREMOVE. + * Reset post-kexec EPC pages to the uninitialized state. The pages are removed + * from the input list, and made available for the page allocator. SECS pages + * prepending their children in the input list are left intact. */ -static void sgx_sanitize_section(struct sgx_epc_section *section) +static void __sgx_sanitize_pages(struct list_head *dirty_page_list) { struct sgx_epc_page *page; LIST_HEAD(dirty); int ret;
- /* init_laundry_list is thread-local, no need for a lock: */ - while (!list_empty(§ion->init_laundry_list)) { + /* dirty_page_list is thread-local, no need for a lock: */ + while (!list_empty(dirty_page_list)) { if (kthread_should_stop()) return;
- /* needed for access to ->page_list: */ - spin_lock(§ion->lock); - - page = list_first_entry(§ion->init_laundry_list, - struct sgx_epc_page, list); + page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
ret = __eremove(sgx_get_epc_virt_addr(page)); - if (!ret) - list_move(&page->list, §ion->page_list); - else + if (!ret) { + /* + * page is now sanitized. Make it available via the SGX + * page allocator: + */ + list_del(&page->list); + sgx_free_epc_page(page); + } else { + /* The page is not yet clean - move to the dirty list. */ list_move_tail(&page->list, &dirty); - - spin_unlock(§ion->lock); + }
cond_resched(); }
- list_splice(&dirty, §ion->init_laundry_list); + list_splice(&dirty, dirty_page_list); }
static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) @@ -405,24 +409,17 @@ static bool sgx_should_reclaim(unsigned long watermark)
static int ksgxd(void *p) { - int i; - set_freezable();
/* * Sanitize pages in order to recover from kexec(). The 2nd pass is * required for SECS pages, whose child pages blocked EREMOVE. */ - for (i = 0; i < sgx_nr_epc_sections; i++) - sgx_sanitize_section(&sgx_epc_sections[i]); - - for (i = 0; i < sgx_nr_epc_sections; i++) { - sgx_sanitize_section(&sgx_epc_sections[i]); + __sgx_sanitize_pages(&sgx_dirty_page_list); + __sgx_sanitize_pages(&sgx_dirty_page_list);
- /* Should never happen. */ - if (!list_empty(&sgx_epc_sections[i].init_laundry_list)) - WARN(1, "EPC section %d has unsanitized pages.\n", i); - } + /* sanity check: */ + WARN_ON(!list_empty(&sgx_dirty_page_list));
while (!kthread_should_stop()) { if (try_to_freeze()) @@ -637,13 +634,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->phys_addr = phys_addr; spin_lock_init(§ion->lock); INIT_LIST_HEAD(§ion->page_list); - INIT_LIST_HEAD(§ion->init_laundry_list);
for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; section->pages[i].owner = NULL; - list_add_tail(§ion->pages[i].list, §ion->init_laundry_list); + list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); }
section->free_cnt = nr_pages; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 5fa42d143feb..bc8af0428640 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -45,13 +45,6 @@ struct sgx_epc_section { spinlock_t lock; struct list_head page_list; unsigned long free_cnt; - - /* - * Pages which need EREMOVE run on them before they can be - * used. Only safe to be accessed in ksgxd and init code. - * Not protected by locks. - */ - struct list_head init_laundry_list; };
extern struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
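The two-pass sanitization above is an instance of a drain-and-retry idiom: items whose processing fails on the first pass (here, SECS pages whose live children block EREMOVE) are parked on a local list and spliced back for the next pass. A driver-neutral sketch:

#include <linux/list.h>

struct work_item {
	struct list_head list;
};

static void drain_with_retry(struct list_head *pending,
			     int (*process)(struct work_item *))
{
	struct work_item *item;
	LIST_HEAD(failed);

	while (!list_empty(pending)) {
		item = list_first_entry(pending, struct work_item, list);
		if (process(item))
			/* Could not process yet; retry on a later pass. */
			list_move_tail(&item->list, &failed);
		else
			list_del(&item->list);	/* now owned elsewhere */
	}
	list_splice(&failed, pending);
}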
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.13-rc1 commit 901ddbb9ecf5425183ea0c09d10c2fd7868dce54 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Background Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com
==========
SGX enclave memory is enumerated by the processor in contiguous physical ranges called Enclave Page Cache (EPC) sections. Currently, there is a free list per section, but allocations simply target the lowest-numbered sections. This is functional, but has no NUMA awareness.
Fortunately, EPC sections are covered by entries in the ACPI SRAT table. These entries allow each EPC section to be associated with a NUMA node, just like normal RAM.
Solution ========
Implement a NUMA-aware enclave page allocator. Mirror the buddy allocator and maintain a list of enclave pages for each NUMA node. Attempt to allocate enclave memory first from local nodes, then fall back to other nodes.
Note that the fallback is not as sophisticated as the buddy allocator and is itself not aware of NUMA distances. When a node's free list is empty, it searches for the next-highest node with enclave pages (and will wrap if necessary). This could be improved in the future.
Other =====
NUMA_KEEP_MEMINFO dependency is required for phys_to_target_node().
[ Kai Huang: Do not return NULL from __sgx_alloc_epc_page() because callers do not expect that and that leads to a NULL ptr deref. ]
[ dhansen: Fix an uninitialized 'nid' variable in __sgx_alloc_epc_page() as
Reported-by: kernel test robot lkp@intel.com
to avoid any potential allocations from the wrong NUMA node or even premature allocation failures. ]
Intel-SIG: commit 901ddbb9ecf5 x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page() Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Kai Huang kai.huang@intel.com Signed-off-by: Dave Hansen dave.hansen@intel.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/lkml/158188326978.894464.217282995221175417.stgit@dw... Link: https://lkml.kernel.org/r/20210319040602.178558-1-kai.huang@intel.com Link: https://lkml.kernel.org/r/20210318214933.29341-1-dave.hansen@intel.com Link: https://lkml.kernel.org/r/20210317235332.362001-2-jarkko.sakkinen@intel.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/Kconfig | 1 + arch/x86/kernel/cpu/sgx/main.c | 119 +++++++++++++++++++++------------ arch/x86/kernel/cpu/sgx/sgx.h | 16 +++-- 3 files changed, 88 insertions(+), 48 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5cccb82ca62f..040fb77366dd 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1966,6 +1966,7 @@ config X86_SGX depends on CRYPTO_SHA256=y select SRCU select MMU_NOTIFIER + select NUMA_KEEP_MEMINFO if NUMA help Intel(R) Software Guard eXtensions (SGX) is a set of CPU instructions that can be used by applications to set aside private regions of code diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index f3a5cd2d27ef..13a7599ce7d4 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -23,9 +23,21 @@ static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); * with sgx_reclaimer_lock acquired. */ static LIST_HEAD(sgx_active_page_list); - static DEFINE_SPINLOCK(sgx_reclaimer_lock);
+/* The free page list lock protected variables prepend the lock. */ +static unsigned long sgx_nr_free_pages; + +/* Nodes with one or more EPC sections. */ +static nodemask_t sgx_numa_mask; + +/* + * Array with one list_head for each possible NUMA node. Each + * list contains all the sgx_epc_section's which are on that + * node. + */ +static struct sgx_numa_node *sgx_numa_nodes; + static LIST_HEAD(sgx_dirty_page_list);
/* @@ -312,6 +324,7 @@ static void sgx_reclaim_pages(void) struct sgx_epc_section *section; struct sgx_encl_page *encl_page; struct sgx_epc_page *epc_page; + struct sgx_numa_node *node; pgoff_t page_index; int cnt = 0; int ret; @@ -383,28 +396,18 @@ static void sgx_reclaim_pages(void) epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
section = &sgx_epc_sections[epc_page->section]; - spin_lock(§ion->lock); - list_add_tail(&epc_page->list, §ion->page_list); - section->free_cnt++; - spin_unlock(§ion->lock); - } -} - -static unsigned long sgx_nr_free_pages(void) -{ - unsigned long cnt = 0; - int i; - - for (i = 0; i < sgx_nr_epc_sections; i++) - cnt += sgx_epc_sections[i].free_cnt; + node = section->node;
- return cnt; + spin_lock(&node->lock); + list_add_tail(&epc_page->list, &node->free_page_list); + sgx_nr_free_pages++; + spin_unlock(&node->lock); + } }
static bool sgx_should_reclaim(unsigned long watermark) { - return sgx_nr_free_pages() < watermark && - !list_empty(&sgx_active_page_list); + return sgx_nr_free_pages < watermark && !list_empty(&sgx_active_page_list); }
static int ksgxd(void *p) @@ -451,45 +454,56 @@ static bool __init sgx_page_reclaimer_init(void) return true; }
-static struct sgx_epc_page *__sgx_alloc_epc_page_from_section(struct sgx_epc_section *section) +static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) { - struct sgx_epc_page *page; + struct sgx_numa_node *node = &sgx_numa_nodes[nid]; + struct sgx_epc_page *page = NULL;
- spin_lock(§ion->lock); + spin_lock(&node->lock);
- if (list_empty(§ion->page_list)) { - spin_unlock(§ion->lock); + if (list_empty(&node->free_page_list)) { + spin_unlock(&node->lock); return NULL; }
- page = list_first_entry(§ion->page_list, struct sgx_epc_page, list); + page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); - section->free_cnt--; + sgx_nr_free_pages--; + + spin_unlock(&node->lock);
- spin_unlock(§ion->lock); return page; }
/** * __sgx_alloc_epc_page() - Allocate an EPC page * - * Iterate through EPC sections and borrow a free EPC page to the caller. When a - * page is no longer needed it must be released with sgx_free_epc_page(). + * Iterate through NUMA nodes and reserve ia free EPC page to the caller. Start + * from the NUMA node, where the caller is executing. * * Return: - * an EPC page, - * -errno on error + * - an EPC page: A borrowed EPC pages were available. + * - NULL: Out of EPC pages. */ struct sgx_epc_page *__sgx_alloc_epc_page(void) { - struct sgx_epc_section *section; struct sgx_epc_page *page; - int i; + int nid_of_current = numa_node_id(); + int nid = nid_of_current;
- for (i = 0; i < sgx_nr_epc_sections; i++) { - section = &sgx_epc_sections[i]; + if (node_isset(nid_of_current, sgx_numa_mask)) { + page = __sgx_alloc_epc_page_from_node(nid_of_current); + if (page) + return page; + } + + /* Fall back to the non-local NUMA nodes: */ + while (true) { + nid = next_node_in(nid, sgx_numa_mask); + if (nid == nid_of_current) + break;
- page = __sgx_alloc_epc_page_from_section(section); + page = __sgx_alloc_epc_page_from_node(nid); if (page) return page; } @@ -600,6 +614,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) void sgx_free_epc_page(struct sgx_epc_page *page) { struct sgx_epc_section *section = &sgx_epc_sections[page->section]; + struct sgx_numa_node *node = section->node; int ret;
WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); @@ -608,10 +623,12 @@ void sgx_free_epc_page(struct sgx_epc_page *page) if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) return;
- spin_lock(§ion->lock); - list_add_tail(&page->list, §ion->page_list); - section->free_cnt++; - spin_unlock(§ion->lock); + spin_lock(&node->lock); + + list_add_tail(&page->list, &node->free_page_list); + sgx_nr_free_pages++; + + spin_unlock(&node->lock); }
static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, @@ -632,8 +649,6 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, }
section->phys_addr = phys_addr; - spin_lock_init(§ion->lock); - INIT_LIST_HEAD(§ion->page_list);
for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -642,7 +657,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); }
- section->free_cnt = nr_pages; + sgx_nr_free_pages += nr_pages; return true; }
@@ -661,8 +676,13 @@ static bool __init sgx_page_cache_init(void) { u32 eax, ebx, ecx, edx, type; u64 pa, size; + int nid; int i;
+ sgx_numa_nodes = kmalloc_array(num_possible_nodes(), sizeof(*sgx_numa_nodes), GFP_KERNEL); + if (!sgx_numa_nodes) + return false; + for (i = 0; i < ARRAY_SIZE(sgx_epc_sections); i++) { cpuid_count(SGX_CPUID, i + SGX_CPUID_EPC, &eax, &ebx, &ecx, &edx);
@@ -685,6 +705,21 @@ static bool __init sgx_page_cache_init(void) break; }
+ nid = numa_map_to_online_node(phys_to_target_node(pa)); + if (nid == NUMA_NO_NODE) { + /* The physical address is already printed above. */ + pr_warn(FW_BUG "Unable to map EPC section to online node. Fallback to the NUMA node 0.\n"); + nid = 0; + } + + if (!node_isset(nid, sgx_numa_mask)) { + spin_lock_init(&sgx_numa_nodes[nid].lock); + INIT_LIST_HEAD(&sgx_numa_nodes[nid].free_page_list); + node_set(nid, sgx_numa_mask); + } + + sgx_epc_sections[i].node = &sgx_numa_nodes[nid]; + sgx_nr_epc_sections++; }
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index bc8af0428640..653af8ca1a25 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -29,22 +29,26 @@ struct sgx_epc_page { struct list_head list; };
+/* + * Contains the tracking data for NUMA nodes having EPC pages. Most importantly, + * the free page list local to the node is stored here. + */ +struct sgx_numa_node { + struct list_head free_page_list; + spinlock_t lock; +}; + /* * The firmware can define multiple chunks of EPC to the different areas of the * physical memory e.g. for memory areas of the each node. This structure is * used to store EPC pages for one EPC section and virtual memory area where * the pages have been mapped. - * - * 'lock' must be held before accessing 'page_list' or 'free_cnt'. */ struct sgx_epc_section { unsigned long phys_addr; void *virt_addr; struct sgx_epc_page *pages; - - spinlock_t lock; - struct list_head page_list; - unsigned long free_cnt; + struct sgx_numa_node *node; };
extern struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
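The fallback loop in __sgx_alloc_epc_page() is a wrap-around walk of a nodemask with next_node_in(): try the local node first, then every other node in the mask until the walk returns to the starting point. A standalone sketch of that search, assuming a non-empty mask (alloc_from_node is a caller-supplied placeholder):

#include <linux/nodemask.h>
#include <linux/topology.h>

static void *alloc_numa_first(nodemask_t *mask,
			      void *(*alloc_from_node)(int nid))
{
	int start = numa_node_id();
	int nid = start;
	void *obj;

	/* Prefer the node the caller is executing on. */
	if (node_isset(start, *mask)) {
		obj = alloc_from_node(start);
		if (obj)
			return obj;
	}

	/* next_node_in() wraps, so stop once we are back at 'start'. */
	while ((nid = next_node_in(nid, *mask)) != start) {
		obj = alloc_from_node(nid);
		if (obj)
			return obj;
	}
	return NULL;
}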
On 2/22/22 10:12 PM, Zheng Zengkai wrote:
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.13-rc1 commit 901ddbb9ecf5425183ea0c09d10c2fd7868dce54 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
Background Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com
Malformat here.
Regards
Bamvor
On 2022/2/23 14:35, Bamvor Zhang wrote:
On 2/22/22 10:12 PM, Zheng Zengkai wrote:
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.13-rc1 commit 901ddbb9ecf5425183ea0c09d10c2fd7868dce54 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
Background Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com
Malformat here.
Regards
Bamvor
Patchwork incorrectly identifies the "======" separator and automatically adds the Reviewed-by before it.
Background Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com
==========
SGX enclave memory is enumerated by the processor in contiguous physical
==========
SGX enclave memory is enumerated by the processor in contiguous physical ranges called Enclave Page Cache (EPC) sections. Currently, there is a free list per section, but allocations simply target the lowest-numbered sections. This is functional, but has no NUMA awareness.
Fortunately, EPC sections are covered by entries in the ACPI SRAT table. These entries allow each EPC section to be associated with a NUMA node, just like normal RAM.
Solution
Implement a NUMA-aware enclave page allocator. Mirror the buddy allocator and maintain a list of enclave pages for each NUMA node. Attempt to allocate enclave memory first from local nodes, then fall back to other nodes.
Note that the fallback is not as sophisticated as the buddy allocator and is itself not aware of NUMA distances. When a node's free list is empty, it searches for the next-highest node with enclave pages (and will wrap if necessary). This could be improved in the future.
Other
NUMA_KEEP_MEMINFO dependency is required for phys_to_target_node().
[ Kai Huang: Do not return NULL from __sgx_alloc_epc_page() because callers do not expect that and that leads to a NULL ptr deref. ]
[ dhansen: Fix an uninitialized 'nid' variable in __sgx_alloc_epc_page() as
Reported-by: kernel test robot <lkp@intel.com> to avoid any potential allocations from the wrong NUMA node or even premature allocation failures. ]
Intel-SIG: commit 901ddbb9ecf5 x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page() Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Kai Huang kai.huang@intel.com Signed-off-by: Dave Hansen dave.hansen@intel.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/lkml/158188326978.894464.217282995221175417.stgit@dw... Link: https://lkml.kernel.org/r/20210319040602.178558-1-kai.huang@intel.com Link: https://lkml.kernel.org/r/20210318214933.29341-1-dave.hansen@intel.com Link: https://lkml.kernel.org/r/20210317235332.362001-2-jarkko.sakkinen@intel.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
arch/x86/Kconfig | 1 + arch/x86/kernel/cpu/sgx/main.c | 119 +++++++++++++++++++++------------ arch/x86/kernel/cpu/sgx/sgx.h | 16 +++-- 3 files changed, 88 insertions(+), 48 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5cccb82ca62f..040fb77366dd 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1966,6 +1966,7 @@ config X86_SGX depends on CRYPTO_SHA256=y select SRCU select MMU_NOTIFIER
- select NUMA_KEEP_MEMINFO if NUMA help Intel(R) Software Guard eXtensions (SGX) is a set of CPU instructions that can be used by applications to set aside private regions of code
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index f3a5cd2d27ef..13a7599ce7d4 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -23,9 +23,21 @@ static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); * with sgx_reclaimer_lock acquired. */ static LIST_HEAD(sgx_active_page_list);
- static DEFINE_SPINLOCK(sgx_reclaimer_lock);
+/* The free page list lock protected variables prepend the lock. */ +static unsigned long sgx_nr_free_pages;
+/* Nodes with one or more EPC sections. */ +static nodemask_t sgx_numa_mask;
+/*
- Array with one list_head for each possible NUMA node. Each
- list contains all the sgx_epc_section's which are on that
- node.
- */
+static struct sgx_numa_node *sgx_numa_nodes;
static LIST_HEAD(sgx_dirty_page_list);
@@ -312,6 +324,7 @@ static void sgx_reclaim_pages(void)
 	struct sgx_epc_section *section;
 	struct sgx_encl_page *encl_page;
 	struct sgx_epc_page *epc_page;
+	struct sgx_numa_node *node;
 	pgoff_t page_index;
 	int cnt = 0;
 	int ret;
@@ -383,28 +396,18 @@ static void sgx_reclaim_pages(void)
 		epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;

 		section = &sgx_epc_sections[epc_page->section];
-		spin_lock(&section->lock);
-		list_add_tail(&epc_page->list, &section->page_list);
-		section->free_cnt++;
-		spin_unlock(&section->lock);
-	}
-}
+		node = section->node;

-static unsigned long sgx_nr_free_pages(void)
-{
-	unsigned long cnt = 0;
-	int i;
-
-	for (i = 0; i < sgx_nr_epc_sections; i++)
-		cnt += sgx_epc_sections[i].free_cnt;
-
-	return cnt;
+		spin_lock(&node->lock);
+		list_add_tail(&epc_page->list, &node->free_page_list);
+		sgx_nr_free_pages++;
+		spin_unlock(&node->lock);
+	}
 }

 static bool sgx_should_reclaim(unsigned long watermark)
 {
-	return sgx_nr_free_pages() < watermark &&
-	       !list_empty(&sgx_active_page_list);
+	return sgx_nr_free_pages < watermark && !list_empty(&sgx_active_page_list);
 }

 static int ksgxd(void *p)
@@ -451,45 +454,56 @@ static bool __init sgx_page_reclaimer_init(void)
 	return true;
 }

-static struct sgx_epc_page *__sgx_alloc_epc_page_from_section(struct sgx_epc_section *section)
+static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid)
 {
-	struct sgx_epc_page *page;
+	struct sgx_numa_node *node = &sgx_numa_nodes[nid];
+	struct sgx_epc_page *page = NULL;

-	spin_lock(&section->lock);
+	spin_lock(&node->lock);

-	if (list_empty(&section->page_list)) {
-		spin_unlock(&section->lock);
+	if (list_empty(&node->free_page_list)) {
+		spin_unlock(&node->lock);
 		return NULL;
 	}

-	page = list_first_entry(&section->page_list, struct sgx_epc_page, list);
+	page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list);
 	list_del_init(&page->list);
-	section->free_cnt--;
+	sgx_nr_free_pages--;

-	spin_unlock(&section->lock);
+	spin_unlock(&node->lock);

 	return page;
 }

 /**
  * __sgx_alloc_epc_page() - Allocate an EPC page
  *
- * Iterate through EPC sections and borrow a free EPC page to the caller. When a
- * page is no longer needed it must be released with sgx_free_epc_page().
+ * Iterate through NUMA nodes and reserve a free EPC page to the caller. Start
+ * from the NUMA node, where the caller is executing.
  *
  * Return:
- *   an EPC page,
- *   -errno on error
+ * - an EPC page:	A borrowed EPC page was available.
+ * - NULL:		Out of EPC pages.
  */
 struct sgx_epc_page *__sgx_alloc_epc_page(void)
 {
-	struct sgx_epc_section *section;
 	struct sgx_epc_page *page;
-	int i;
+	int nid_of_current = numa_node_id();
+	int nid = nid_of_current;

-	for (i = 0; i < sgx_nr_epc_sections; i++) {
-		section = &sgx_epc_sections[i];
+	if (node_isset(nid_of_current, sgx_numa_mask)) {
+		page = __sgx_alloc_epc_page_from_node(nid_of_current);
+		if (page)
+			return page;
+	}
+
+	/* Fall back to the non-local NUMA nodes: */
+	while (true) {
+		nid = next_node_in(nid, sgx_numa_mask);
+		if (nid == nid_of_current)
+			break;

-		page = __sgx_alloc_epc_page_from_section(section);
+		page = __sgx_alloc_epc_page_from_node(nid);
 		if (page)
 			return page;
 	}
@@ -600,6 +614,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 void sgx_free_epc_page(struct sgx_epc_page *page)
 {
 	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
+	struct sgx_numa_node *node = section->node;
 	int ret;

 	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
@@ -608,10 +623,12 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
 	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
 		return;

-	spin_lock(&section->lock);
-	list_add_tail(&page->list, &section->page_list);
-	section->free_cnt++;
-	spin_unlock(&section->lock);
+	spin_lock(&node->lock);
+	list_add_tail(&page->list, &node->free_page_list);
+	sgx_nr_free_pages++;
+	spin_unlock(&node->lock);
 }

 static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
@@ -632,8 +649,6 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
 	}

 	section->phys_addr = phys_addr;
-	spin_lock_init(&section->lock);
-	INIT_LIST_HEAD(&section->page_list);

 	for (i = 0; i < nr_pages; i++) {
 		section->pages[i].section = index;
@@ -642,7 +657,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
 		list_add_tail(&section->pages[i].list, &sgx_dirty_page_list);
 	}

-	section->free_cnt = nr_pages;
+	sgx_nr_free_pages += nr_pages;
 	return true;
 }

@@ -661,8 +676,13 @@ static bool __init sgx_page_cache_init(void)
 {
 	u32 eax, ebx, ecx, edx, type;
 	u64 pa, size;
+	int nid;
 	int i;

+	sgx_numa_nodes = kmalloc_array(num_possible_nodes(), sizeof(*sgx_numa_nodes), GFP_KERNEL);
+	if (!sgx_numa_nodes)
+		return false;
+
 	for (i = 0; i < ARRAY_SIZE(sgx_epc_sections); i++) {
 		cpuid_count(SGX_CPUID, i + SGX_CPUID_EPC, &eax, &ebx, &ecx, &edx);
@@ -685,6 +705,21 @@ static bool __init sgx_page_cache_init(void)
 			break;
 		}

+		nid = numa_map_to_online_node(phys_to_target_node(pa));
+		if (nid == NUMA_NO_NODE) {
+			/* The physical address is already printed above. */
+			pr_warn(FW_BUG "Unable to map EPC section to online node. Fallback to the NUMA node 0.\n");
+			nid = 0;
+		}
+
+		if (!node_isset(nid, sgx_numa_mask)) {
+			spin_lock_init(&sgx_numa_nodes[nid].lock);
+			INIT_LIST_HEAD(&sgx_numa_nodes[nid].free_page_list);
+			node_set(nid, sgx_numa_mask);
+		}
+
+		sgx_epc_sections[i].node = &sgx_numa_nodes[nid];
+
 		sgx_nr_epc_sections++;
 	}

diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index bc8af0428640..653af8ca1a25 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -29,22 +29,26 @@ struct sgx_epc_page {
 	struct list_head list;
 };

+/*
+ * Contains the tracking data for NUMA nodes having EPC pages. Most
+ * importantly, the free page list local to the node is stored here.
+ */
+struct sgx_numa_node {
+	struct list_head free_page_list;
+	spinlock_t lock;
+};
+
 /*
  * The firmware can define multiple chunks of EPC to the different areas of the
  * physical memory e.g. for memory areas of the each node. This structure is
  * used to store EPC pages for one EPC section and virtual memory area where
  * the pages have been mapped.
- *
- * 'lock' must be held before accessing 'page_list' or 'free_cnt'.
  */
 struct sgx_epc_section {
 	unsigned long phys_addr;
 	void *virt_addr;
 	struct sgx_epc_page *pages;
-
-	spinlock_t lock;
-	struct list_head page_list;
-	unsigned long free_cnt;
+	struct sgx_numa_node *node;
 };
extern struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
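For intuition, the node-fallback order implemented by __sgx_alloc_epc_page() above can be modeled in plain user-space C. This is only a sketch: the node_mask array and the next_node_in() helper below are simplified stand-ins for the kernel's sgx_numa_mask and next_node_in().

#include <stdio.h>

#define MAX_NODES 8

/* Hypothetical stand-in for sgx_numa_mask: nodes 0, 2 and 3 have EPC. */
static const int node_mask[MAX_NODES] = { 1, 0, 1, 1, 0, 0, 0, 0 };

/* Simplified stand-in for the kernel's next_node_in(): wrap around the mask. */
static int next_node_in(int nid)
{
	int i;

	for (i = 1; i <= MAX_NODES; i++) {
		int candidate = (nid + i) % MAX_NODES;

		if (node_mask[candidate])
			return candidate;
	}
	return nid;
}

int main(void)
{
	int nid_of_current = 2;	/* pretend the caller runs on node 2 */
	int nid = nid_of_current;

	printf("try local node %d first\n", nid_of_current);

	/* Fall back to the non-local nodes until the search wraps around. */
	while (1) {
		nid = next_node_in(nid);
		if (nid == nid_of_current)
			break;
		printf("then try node %d\n", nid);
	}
	return 0;
}

Running it prints the search order 2, 3, 0: the local node first, then the remaining EPC-bearing nodes in mask order.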
From: Ira Weiny ira.weiny@intel.com
mainline inclusion from mainline-v5.13-rc1 commit 633b0616cfe085679471a4c0fae02e8c3a1a9866 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
kmap() is inefficient and is being replaced by kmap_local_page(), if possible. There is no readily apparent reason why initp_page needs to be allocated and kmap'ed() except that 'sigstruct' needs to be page-aligned and 'token' 512 byte-aligned.
Rather than change it to kmap_local_page(), use kmalloc() instead because kmalloc() can give this alignment when allocating PAGE_SIZE bytes.
Remove the alloc_page()/kmap() and replace with kmalloc(PAGE_SIZE, ...) to get a page aligned kernel address.
In addition, add a comment to document the alignment requirements so that others don't attempt to 'fix' this again.
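The alignment the comment relies on can be sanity-checked in user space. In this sketch, aligned_alloc() stands in for kmalloc(); the kernel-side guarantee is that kmalloc() returns naturally aligned memory for power-of-two sizes such as PAGE_SIZE.

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE   4096UL
#define TOKEN_ALIGN 512UL

int main(void)
{
	/* User-space stand-in for kmalloc(PAGE_SIZE, GFP_KERNEL). */
	void *sigstruct = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
	void *token;

	if (!sigstruct)
		return 1;

	/* 'token' lives in the second half of the same page. */
	token = (void *)((uintptr_t)sigstruct + PAGE_SIZE / 2);

	assert(((uintptr_t)sigstruct % PAGE_SIZE) == 0);	/* page boundary */
	assert(((uintptr_t)token % TOKEN_ALIGN) == 0);		/* 512-byte boundary */

	free(sigstruct);
	return 0;
}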
[ bp: Massage commit message. ]
Intel-SIG: commit 633b0616cfe0 x86/sgx: Remove unnecessary kmap() from sgx_ioc_enclave_init() Backport for SGX Foundations support
Signed-off-by: Ira Weiny ira.weiny@intel.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20210324182246.2484875-1-ira.weiny@intel.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/sgx/ioctl.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 90a5caf76939..2e10367ea66c 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -604,7 +604,6 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg) { struct sgx_sigstruct *sigstruct; struct sgx_enclave_init init_arg; - struct page *initp_page; void *token; int ret;
@@ -615,11 +614,15 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg) if (copy_from_user(&init_arg, arg, sizeof(init_arg))) return -EFAULT;
- initp_page = alloc_page(GFP_KERNEL); - if (!initp_page) + /* + * 'sigstruct' must be on a page boundary and 'token' on a 512 byte + * boundary. kmalloc() will give this alignment when allocating + * PAGE_SIZE bytes. + */ + sigstruct = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (!sigstruct) return -ENOMEM;
- sigstruct = kmap(initp_page); token = (void *)((unsigned long)sigstruct + PAGE_SIZE / 2); memset(token, 0, SGX_LAUNCH_TOKEN_SIZE);
@@ -645,8 +648,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg) ret = sgx_encl_init(encl, sigstruct, token);
out: - kunmap(initp_page); - __free_page(initp_page); + kfree(sigstruct); return ret; }
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.11-rc1 commit 0eaa8d153a1d573e53b8283c90db44057d1376f6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Use a statically generated key for signing the enclave, because generating keys on the fly can eat the kernel entropy pool. Another good reason for doing this is predictable builds. The RSA key has been arbitrarily selected. Its contents do not matter.
This also makes the selftest execute a lot quicker, instead of the delay it had before (caused by slow key generation).
[ bp: Disambiguate "static key" which means something else in the kernel, fix typos. ]
Intel-SIG: commit 0eaa8d153a1d selftests/sgx: Use a statically generated 3072-bit RSA key Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Cc: linux-kselftest@vger.kernel.org Link: https://lkml.kernel.org/r/20201118170640.39629-1-jarkko@kernel.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/Makefile | 6 +++- tools/testing/selftests/sgx/main.h | 3 ++ tools/testing/selftests/sgx/sign_key.S | 12 ++++++++ tools/testing/selftests/sgx/sign_key.pem | 39 ++++++++++++++++++++++++ tools/testing/selftests/sgx/sigstruct.c | 34 ++++++++------------- 5 files changed, 71 insertions(+), 23 deletions(-) create mode 100644 tools/testing/selftests/sgx/sign_key.S create mode 100644 tools/testing/selftests/sgx/sign_key.pem
diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile index d51c90663943..7f12d55b97f8 100644 --- a/tools/testing/selftests/sgx/Makefile +++ b/tools/testing/selftests/sgx/Makefile @@ -25,7 +25,8 @@ endif $(OUTPUT)/test_sgx: $(OUTPUT)/main.o \ $(OUTPUT)/load.o \ $(OUTPUT)/sigstruct.o \ - $(OUTPUT)/call.o + $(OUTPUT)/call.o \ + $(OUTPUT)/sign_key.o $(CC) $(HOST_CFLAGS) -o $@ $^ -lcrypto
$(OUTPUT)/main.o: main.c @@ -40,6 +41,9 @@ $(OUTPUT)/sigstruct.o: sigstruct.c $(OUTPUT)/call.o: call.S $(CC) $(HOST_CFLAGS) -c $< -o $@
+$(OUTPUT)/sign_key.o: sign_key.S + $(CC) $(HOST_CFLAGS) -c $< -o $@ + $(OUTPUT)/test_encl.elf: test_encl.lds test_encl.c test_encl_bootstrap.S $(CC) $(ENCL_CFLAGS) -T $^ -o $@
diff --git a/tools/testing/selftests/sgx/main.h b/tools/testing/selftests/sgx/main.h index 45e6ab65442a..67211a708f04 100644 --- a/tools/testing/selftests/sgx/main.h +++ b/tools/testing/selftests/sgx/main.h @@ -27,6 +27,9 @@ struct encl { struct sgx_sigstruct sigstruct; };
+extern unsigned char sign_key[]; +extern unsigned char sign_key_end[]; + void encl_delete(struct encl *ctx); bool encl_load(const char *path, struct encl *encl); bool encl_measure(struct encl *encl); diff --git a/tools/testing/selftests/sgx/sign_key.S b/tools/testing/selftests/sgx/sign_key.S new file mode 100644 index 000000000000..e4fbe948444a --- /dev/null +++ b/tools/testing/selftests/sgx/sign_key.S @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/** +* Copyright(c) 2016-20 Intel Corporation. +*/ + + .section ".rodata", "a" + +sign_key: + .globl sign_key + .incbin "sign_key.pem" +sign_key_end: + .globl sign_key_end diff --git a/tools/testing/selftests/sgx/sign_key.pem b/tools/testing/selftests/sgx/sign_key.pem new file mode 100644 index 000000000000..d76f21f19187 --- /dev/null +++ b/tools/testing/selftests/sgx/sign_key.pem @@ -0,0 +1,39 @@ +-----BEGIN RSA PRIVATE KEY----- +MIIG4wIBAAKCAYEApalGbq7Q+usM91CPtksu3D+b0Prc8gAFL6grM3mg85A5Bx8V +cfMXPgtrw8EYFwQxDAvzZWwl+9VfOX0ECrFRBkOHcOiG0SnADN8+FLj1UiNUQwbp +S6OzhNWuRcSbGraSOyUlVlV0yMQSvewyzGklOaXBe30AJqzIBc8QfdSxKuP8rs0Z +ga6k/Bl73osrYKByILJTUUeZqjLERsE6GebsdzbWgKn8qVqng4ZS4yMNg6LeRlH3 ++9CIPgg4jwpSLHcp7dq2qTIB9a0tGe9ayp+5FbucpB6U7ePold0EeRN6RlJGDF9k +L93v8P5ykz5G5gYZ2g0K1X2sHIWV4huxPgv5PXgdyQYbK+6olqj0d5rjYuwX57Ul +k6SroPS1U6UbdCjG5txM+BNGU0VpD0ZhrIRw0leQdnNcCO9sTJuInZrgYacSVJ7u +mtB+uCt+uzUesc+l+xPRYA+9e14lLkZp7AAmo9FvL816XDI09deehJ3i/LmHKCRN +tuqC5TprRjFwUr6dAgEDAoIBgG5w2Z8fNfycs0+LCnmHdJLVEotR6KFVWMpwHMz7 +wKJgJgS/Y6FMuilc8oKAuroCy11dTO5IGVKOP3uorVx2NgQtBPXwWeDGgAiU1A3Q +o4wXjYIEm4fCd63jyYPYZ2ckYXzDbjmOTdstYdPyzIhGGNEZK6eoqsRzMAPfYFPj +IMdCqHSIu6vJw1K7p+myHOsVoWshjODaZnF3LYSA0WaZ8vokjwBxUxuRxQJZjJds +s60XPtmL+qfgWtQFewoG4XL6GuD8FcXccynRRtzrLtFNPIl9BQfWfjBBhTC1/Te1 +0Z6XbZvpdUTD9OfLB7SbR2OUFNpKQgriO0iYVdbW3cr7uu38Zwp4W1TX73DPjoi6 +KNooP6SGWd4mRJW2+dUmSYS4QNG8eVVZswKcploEIXlAKRsOe4kzJJ1iETugIe85 +uX8nd1WYEp65xwoRUg8hqng0MeyveVbXqNKuJG6tzNDt9kgFYo+hmC/oouAW2Dtc +T9jdRAwKJXqA2Eg6OkgXCEv+kwKBwQDYaQiFMlFhsmLlqI+EzCUh7c941/cL7m6U +7j98+8ngl0HgCEcrc10iJVCKakQW3YbPzAx3XkKTaGjWazvvrFarXIGlOud64B8a +iWyQ7VdlnmZnNEdk+C83tI91OQeaTKqRLDGzKh29Ry/jL8Pcbazt+kDgxa0H7qJp +roADUanLQuNkYubpbhFBh3xpa2EExaVq6rF7nIVsD8W9TrbmPKA4LgH7z0iy544D +kVCNYsTjYDdUWP+WiSor8kCnnpjnN9sCgcEAw/eNezUD1UDf6OYFC9+5JZJFn4Tg +mZMyN93JKIb199ffwnjtHUSjcyiWeesXucpzwtGbTcwQnDisSW4oneYKLSEBlBaq +scqiUugyGZZOthFSCbdXYXMViK2vHrKlkse7GxVlROKcEhM/pRBrmjaGO8eWR+D4 +FO2wCXzVs3KgV6j779frw0vC54oHOxc9+Lu1rSHp4i+600koyvL/zF6U/5tZXIvN +YW2yoiQJnjCmVA1pwbwV6KAUTPDTMnBK+YjnAoHBAJBGBa4hi5Z27JkbCliIGMFJ +NPs6pLKe9GNJf6in2+sPgUAFhMeiPhbDiwbxgrnpBIqICE+ULGJFmzmc0p/IOceT +ARjR76dAFLxbnbXzj5kURETNhO36yiUjCk4mBRGIcbYddndxaSjaH+zKgpLzyJ6m +1esuc1qfFvEfAAI2cTIsl5hB70ZJYNZaUvDyQK3ZGPHxy6e9rkgKg9OJz0QoatAe +q/002yHvtAJg4F5B2JeVejg7VQ8GHB1MKxppu0TP5wKBwQCCpQj8zgKOKz/wmViy +lSYZDC5qWJW7t3bP6TDFr06lOpUsUJ4TgxeiGw778g/RMaKB4RIz3WBoJcgw9BsT +7rFza1ZiucchMcGMmswRDt8kC4wGejpA92Owc8oUdxkMhSdnY5jYlxK2t3/DYEe8 +JFl9L7mFQKVjSSAGUzkiTGrlG1Kf5UfXh9dFBq98uilQfSPIwUaWynyM23CHTKqI +Pw3/vOY9sojrnncWwrEUIG7is5vWfWPwargzSzd29YdRBe8CgcEAuRVewK/YeNOX +B7ZG6gKKsfsvrGtY7FPETzLZAHjoVXYNea4LVZ2kn4hBXXlvw/4HD+YqcTt4wmif +5JQlDvjNobUiKJZpzy7hklVhF7wZFl4pCF7Yh43q9iQ7gKTaeUG7MiaK+G8Zz8aY +HW9rsiihbdZkccMvnPfO9334XMxl3HtBRzLstjUlbLB7Sdh+7tZ3JQidCOFNs5pE +XyWwnASPu4tKfDahH1UUTp1uJcq/6716CSWg080avYxFcn75qqsb +-----END RSA PRIVATE KEY----- diff --git a/tools/testing/selftests/sgx/sigstruct.c b/tools/testing/selftests/sgx/sigstruct.c index cc06f108bae7..dee7a3d6c5a5 100644 --- a/tools/testing/selftests/sgx/sigstruct.c +++ 
b/tools/testing/selftests/sgx/sigstruct.c @@ -135,33 +135,21 @@ static inline const BIGNUM *get_modulus(RSA *key)
static RSA *gen_sign_key(void) { - BIGNUM *e; + unsigned long sign_key_length; + BIO *bio; RSA *key; - int ret;
- e = BN_new(); - key = RSA_new(); + sign_key_length = (unsigned long)&sign_key_end - + (unsigned long)&sign_key;
- if (!e || !key) - goto err; - - ret = BN_set_word(e, RSA_3); - if (ret != 1) - goto err; - - ret = RSA_generate_key_ex(key, 3072, e, NULL); - if (ret != 1) - goto err; + bio = BIO_new_mem_buf(&sign_key, sign_key_length); + if (!bio) + return NULL;
- BN_free(e); + key = PEM_read_bio_RSAPrivateKey(bio, NULL, NULL, NULL); + BIO_free(bio);
return key; - -err: - RSA_free(key); - BN_free(e); - - return NULL; }
static void reverse_bytes(void *data, int length) @@ -339,8 +327,10 @@ bool encl_measure(struct encl *encl) goto err;
key = gen_sign_key(); - if (!key) + if (!key) { + ERR_print_errors_fp(stdout); goto err; + }
BN_bn2bin(get_modulus(key), sigstruct->modulus);
From: Dave Hansen dave.hansen@linux.intel.com
mainline inclusion from mainline-v5.13-rc1 commit 4284f7acb78bfb0e0c26a2b78e2b2c3d68fccd6f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
The SGX device file (/dev/sgx_enclave) is unusual in that it requires execute permissions. It has to be both "chmod +x" *and* be on a filesystem without 'noexec'.
In the future, udev and systemd should get updates to set up systems automatically. But, for now, nobody's systems do this automatically, and everybody gets error messages like this when running ./test_sgx:
0x0000000000000000 0x0000000000002000 0x03
0x0000000000002000 0x0000000000001000 0x05
0x0000000000003000 0x0000000000003000 0x03
mmap() failed, errno=1.
That isn't very user friendly, even for forgetful kernel developers.
Further, the test case is rather haphazard about its use of fprintf() versus perror().
Improve the error messages. Use perror() where possible. Lastly, do some sanity checks on opening and mmap()ing the device file so that we can get a decent error message out to the user.
Now, if your user doesn't have permission, you'll get the following:
$ ls -l /dev/sgx_enclave
crw------- 1 root root 10, 126 Mar 18 11:29 /dev/sgx_enclave
$ ./test_sgx
Unable to open /dev/sgx_enclave: Permission denied

[ Fan Du: amend commit log ]
If you then 'chown dave:dave /dev/sgx_enclave' (or whatever), but you leave execute permissions off, you'll get:
$ ls -l /dev/sgx_enclave
crw------- 1 dave dave 10, 126 Mar 18 11:29 /dev/sgx_enclave
$ ./test_sgx
no execute permissions on device file
If you fix that with "chmod ug+x /dev/sgx_enclave" but you leave /dev as noexec, you'll get this:
$ mount | grep "/dev .*noexec"
udev on /dev type devtmpfs (rw,nosuid,noexec,...)
$ ./test_sgx
ERROR: mmap for exec: Operation not permitted
mmap() succeeded for PROT_READ, but failed for PROT_EXEC
check that user has execute permissions on /dev/sgx_enclave and
that /dev does not have noexec set: 'mount | grep "/dev .*noexec"'
That can be fixed with:
mount -o remount,exec /dev
Hopefully, the combination of better error messages and the search engines indexing this message will help people fix their systems until we do this properly.
[ bp: Improve error messages more. ]
Intel-SIG: commit 4284f7acb78b selftests/sgx: Improve error detection and messages Backport for SGX Foundations support
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Link: https://lore.kernel.org/r/20210318194301.11D9A984@viggo.jf.intel.com Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/load.c | 69 ++++++++++++++++++++++++------ tools/testing/selftests/sgx/main.c | 2 +- 2 files changed, 56 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index 9d43b75aaa55..f441ac34b4d4 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -45,19 +45,19 @@ static bool encl_map_bin(const char *path, struct encl *encl)
fd = open(path, O_RDONLY); if (fd == -1) { - perror("open()"); + perror("enclave executable open()"); return false; }
ret = stat(path, &sb); if (ret) { - perror("stat()"); + perror("enclave executable stat()"); goto err; }
bin = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0); if (bin == MAP_FAILED) { - perror("mmap()"); + perror("enclave executable mmap()"); goto err; }
@@ -90,8 +90,7 @@ static bool encl_ioc_create(struct encl *encl) ioc.src = (unsigned long)secs; rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_CREATE, &ioc); if (rc) { - fprintf(stderr, "SGX_IOC_ENCLAVE_CREATE failed: errno=%d\n", - errno); + perror("SGX_IOC_ENCLAVE_CREATE failed"); munmap((void *)secs->base, encl->encl_size); return false; } @@ -116,31 +115,72 @@ static bool encl_ioc_add_pages(struct encl *encl, struct encl_segment *seg)
rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_ADD_PAGES, &ioc); if (rc < 0) { - fprintf(stderr, "SGX_IOC_ENCLAVE_ADD_PAGES failed: errno=%d.\n", - errno); + perror("SGX_IOC_ENCLAVE_ADD_PAGES failed"); return false; }
return true; }
+ + bool encl_load(const char *path, struct encl *encl) { + const char device_path[] = "/dev/sgx_enclave"; Elf64_Phdr *phdr_tbl; off_t src_offset; Elf64_Ehdr *ehdr; + struct stat sb; + void *ptr; int i, j; int ret; + int fd = -1;
memset(encl, 0, sizeof(*encl));
- ret = open("/dev/sgx_enclave", O_RDWR); - if (ret < 0) { - fprintf(stderr, "Unable to open /dev/sgx_enclave\n"); + fd = open(device_path, O_RDWR); + if (fd < 0) { + perror("Unable to open /dev/sgx_enclave"); + goto err; + } + + ret = stat(device_path, &sb); + if (ret) { + perror("device file stat()"); + goto err; + } + + /* + * This just checks if the /dev file has these permission + * bits set. It does not check that the current user is + * the owner or in the owning group. + */ + if (!(sb.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH))) { + fprintf(stderr, "no execute permissions on device file %s\n", device_path); + goto err; + } + + ptr = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, 0); + if (ptr == (void *)-1) { + perror("mmap for read"); + goto err; + } + munmap(ptr, PAGE_SIZE); + +#define ERR_MSG \ +"mmap() succeeded for PROT_READ, but failed for PROT_EXEC.\n" \ +" Check that current user has execute permissions on %s and \n" \ +" that /dev does not have noexec set: mount | grep "/dev .*noexec"\n" \ +" If so, remount it executable: mount -o remount,exec /dev\n\n" + + ptr = mmap(NULL, PAGE_SIZE, PROT_EXEC, MAP_SHARED, fd, 0); + if (ptr == (void *)-1) { + fprintf(stderr, ERR_MSG, device_path); goto err; } + munmap(ptr, PAGE_SIZE);
- encl->fd = ret; + encl->fd = fd;
if (!encl_map_bin(path, encl)) goto err; @@ -217,6 +257,8 @@ bool encl_load(const char *path, struct encl *encl) return true;
err: + if (fd != -1) + close(fd); encl_delete(encl); return false; } @@ -229,7 +271,7 @@ static bool encl_map_area(struct encl *encl) area = mmap(NULL, encl_size * 2, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (area == MAP_FAILED) { - perror("mmap"); + perror("reservation mmap()"); return false; }
@@ -268,8 +310,7 @@ bool encl_build(struct encl *encl) ioc.sigstruct = (uint64_t)&encl->sigstruct; ret = ioctl(encl->fd, SGX_IOC_ENCLAVE_INIT, &ioc); if (ret) { - fprintf(stderr, "SGX_IOC_ENCLAVE_INIT failed: errno=%d\n", - errno); + perror("SGX_IOC_ENCLAVE_INIT failed"); return false; }
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 724cec700926..b117bb86a73f 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -195,7 +195,7 @@ int main(int argc, char *argv[], char *envp[]) addr = mmap((void *)encl.encl_base + seg->offset, seg->size, seg->prot, MAP_SHARED | MAP_FIXED, encl.fd, 0); if (addr == MAP_FAILED) { - fprintf(stderr, "mmap() failed, errno=%d.\n", errno); + perror("mmap() segment failed"); exit(KSFT_FAIL); } }
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.13-rc1 commit f33dece70e11ce82a09cb1ea2d7c32347b82c67e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Use the library function getauxval() instead of a custom function to get the base address of the vDSO.
[ bp: Massage commit message. ]
Intel-SIG: commit f33dece70e11 selftests/sgx: Use getauxval() to simplify test code Backport for SGX Foundations support
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Acked-by: Shuah Khan skhan@linuxfoundation.org Link: https://lkml.kernel.org/r/20210314111621.68428-1-tianjia.zhang@linux.alibaba... Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/main.c | 24 ++++-------------------- 1 file changed, 4 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index b117bb86a73f..d304a4044eb9 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -15,6 +15,7 @@ #include <sys/stat.h> #include <sys/time.h> #include <sys/types.h> +#include <sys/auxv.h> #include "defines.h" #include "main.h" #include "../kselftest.h" @@ -28,24 +29,6 @@ struct vdso_symtab { Elf64_Word *elf_hashtab; };
-static void *vdso_get_base_addr(char *envp[]) -{ - Elf64_auxv_t *auxv; - int i; - - for (i = 0; envp[i]; i++) - ; - - auxv = (Elf64_auxv_t *)&envp[i + 1]; - - for (i = 0; auxv[i].a_type != AT_NULL; i++) { - if (auxv[i].a_type == AT_SYSINFO_EHDR) - return (void *)auxv[i].a_un.a_val; - } - - return NULL; -} - static Elf64_Dyn *vdso_get_dyntab(void *addr) { Elf64_Ehdr *ehdr = addr; @@ -162,7 +145,7 @@ static int user_handler(long rdi, long rsi, long rdx, long ursp, long r8, long r return 0; }
-int main(int argc, char *argv[], char *envp[]) +int main(int argc, char *argv[]) { struct sgx_enclave_run run; struct vdso_symtab symtab; @@ -203,7 +186,8 @@ int main(int argc, char *argv[], char *envp[]) memset(&run, 0, sizeof(run)); run.tcs = encl.encl_base;
- addr = vdso_get_base_addr(envp); + /* Get vDSO base address */ + addr = (void *)getauxval(AT_SYSINFO_EHDR); if (!addr) goto err;
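For reference, the library call can also be exercised on its own; this minimal sketch only prints the vDSO base address that the kernel passes in the AT_SYSINFO_EHDR auxiliary vector entry.

#include <stdio.h>
#include <sys/auxv.h>

int main(void)
{
	/* The kernel publishes the vDSO base via the auxiliary vector. */
	void *vdso = (void *)getauxval(AT_SYSINFO_EHDR);

	if (!vdso) {
		fprintf(stderr, "no vDSO mapped\n");
		return 1;
	}

	printf("vDSO base: %p\n", vdso);
	return 0;
}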
From: Sean Christopherson sean.j.christopherson@intel.com
mainline inclusion from mainline-v5.13-rc1 commit 8ca52cc38dc8fdcbdbd0c23eafb19db5e5f5c8d0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Expose SGX architectural structures, as KVM will use many of the architectural constants and structs to virtualize SGX.
Name the new header file asm/sgx.h, rather than asm/sgx_arch.h, to have a single header that provides SGX facilities to share with other kernel components. Also update MAINTAINERS to include asm/sgx.h.
Intel-SIG: commit 8ca52cc38dc8 x86/sgx: Expose SGX architectural definitions to the kernel Backport for SGX Foundations support
Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Co-developed-by: Kai Huang kai.huang@intel.com Signed-off-by: Kai Huang kai.huang@intel.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Jarkko Sakkinen jarkko@kernel.org Acked-by: Dave Hansen dave.hansen@intel.com Link: https://lkml.kernel.org/r/6bf47acd91ab4d709e66ad1692c7803e4c9063a0.161613630... Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- MAINTAINERS | 1 + .../cpu/sgx/arch.h => include/asm/sgx.h} | 20 ++++++++++++++----- arch/x86/kernel/cpu/sgx/encl.c | 2 +- arch/x86/kernel/cpu/sgx/sgx.h | 2 +- tools/testing/selftests/sgx/defines.h | 2 +- 5 files changed, 19 insertions(+), 8 deletions(-) rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx.h} (95%)
diff --git a/MAINTAINERS b/MAINTAINERS index 9b5ee13d1373..23a23bd94c00 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9167,6 +9167,7 @@ Q: https://patchwork.kernel.org/project/intel-sgx/list/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-sgx.git F: Documentation/x86/sgx.rst F: arch/x86/entry/vdso/vsgx.S +F: arch/x86/include/asm/sgx.h F: arch/x86/include/uapi/asm/sgx.h F: arch/x86/kernel/cpu/sgx/* F: tools/testing/selftests/sgx/* diff --git a/arch/x86/kernel/cpu/sgx/arch.h b/arch/x86/include/asm/sgx.h similarity index 95% rename from arch/x86/kernel/cpu/sgx/arch.h rename to arch/x86/include/asm/sgx.h index dd7602c44c72..d96e5451d28a 100644 --- a/arch/x86/kernel/cpu/sgx/arch.h +++ b/arch/x86/include/asm/sgx.h @@ -2,15 +2,20 @@ /** * Copyright(c) 2016-20 Intel Corporation. * - * Contains data structures defined by the SGX architecture. Data structures - * defined by the Linux software stack should not be placed here. + * Intel Software Guard Extensions (SGX) support. */ -#ifndef _ASM_X86_SGX_ARCH_H -#define _ASM_X86_SGX_ARCH_H +#ifndef _ASM_X86_SGX_H +#define _ASM_X86_SGX_H
#include <linux/bits.h> #include <linux/types.h>
+/* + * This file contains both data structures defined by SGX architecture and Linux + * defined software data structures and functions. The two should not be mixed + * together for better readability. The architectural definitions come first. + */ + /* The SGX specific CPUID function. */ #define SGX_CPUID 0x12 /* EPC enumeration. */ @@ -335,4 +340,9 @@ struct sgx_sigstruct {
#define SGX_LAUNCH_TOKEN_SIZE 304
-#endif /* _ASM_X86_SGX_ARCH_H */ +/* + * Do not put any hardware-defined SGX structure representations below this + * comment! + */ + +#endif /* _ASM_X86_SGX_H */ diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 7449ef33f081..97fb7efce224 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -7,7 +7,7 @@ #include <linux/shmem_fs.h> #include <linux/suspend.h> #include <linux/sched/mm.h> -#include "arch.h" +#include <asm/sgx.h> #include "encl.h" #include "encls.h" #include "sgx.h" diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 653af8ca1a25..2a2b5c857451 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -8,7 +8,7 @@ #include <linux/rwsem.h> #include <linux/types.h> #include <asm/asm.h> -#include "arch.h" +#include <asm/sgx.h>
#undef pr_fmt #define pr_fmt(fmt) "sgx: " fmt diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index 592c1ccf4576..0bd73428d2f3 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -14,7 +14,7 @@ #define __aligned(x) __attribute__((__aligned__(x))) #define __packed __attribute__((packed))
-#include "../../../../arch/x86/kernel/cpu/sgx/arch.h" +#include "../../../../arch/x86/include/asm/sgx.h" #include "../../../../arch/x86/include/asm/enclu.h" #include "../../../../arch/x86/include/uapi/asm/sgx.h"
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.14-rc1 commit 6a7171b8a0f8e961744d0c46fb7547662a3fca36 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Rename symbols for better clarity:
* 'eenter' might be confused for directly calling ENCLU[EENTER]. It does
  not. It calls into the VDSO, which actually has the EENTER instruction.

* 'sgx_call_vdso' is *only* used for entering the enclave. It's not some
  generic SGX call into the VDSO.
Intel-SIG: commit 6a7171b8a0f8 selftests/sgx: Rename 'eenter' and 'sgx_call_vdso' Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/call.S | 6 +++--- tools/testing/selftests/sgx/main.c | 25 +++++++++++++------------ tools/testing/selftests/sgx/main.h | 4 ++-- 3 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/tools/testing/selftests/sgx/call.S b/tools/testing/selftests/sgx/call.S index 4ecadc7490f4..b09a25890f3b 100644 --- a/tools/testing/selftests/sgx/call.S +++ b/tools/testing/selftests/sgx/call.S @@ -5,8 +5,8 @@
.text
- .global sgx_call_vdso -sgx_call_vdso: + .global sgx_enter_enclave +sgx_enter_enclave: .cfi_startproc push %r15 .cfi_adjust_cfa_offset 8 @@ -27,7 +27,7 @@ sgx_call_vdso: .cfi_adjust_cfa_offset 8 push 0x38(%rsp) .cfi_adjust_cfa_offset 8 - call *eenter(%rip) + call *vdso_sgx_enter_enclave(%rip) add $0x10, %rsp .cfi_adjust_cfa_offset -0x10 pop %rbx diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index d304a4044eb9..43da68388e25 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -21,7 +21,7 @@ #include "../kselftest.h"
static const uint64_t MAGIC = 0x1122334455667788ULL; -vdso_sgx_enter_enclave_t eenter; +vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave;
struct vdso_symtab { Elf64_Sym *elf_symtab; @@ -149,7 +149,7 @@ int main(int argc, char *argv[]) { struct sgx_enclave_run run; struct vdso_symtab symtab; - Elf64_Sym *eenter_sym; + Elf64_Sym *sgx_enter_enclave_sym; uint64_t result = 0; struct encl encl; unsigned int i; @@ -194,29 +194,30 @@ int main(int argc, char *argv[]) if (!vdso_get_symtab(addr, &symtab)) goto err;
- eenter_sym = vdso_symtab_get(&symtab, "__vdso_sgx_enter_enclave"); - if (!eenter_sym) + sgx_enter_enclave_sym = vdso_symtab_get(&symtab, "__vdso_sgx_enter_enclave"); + if (!sgx_enter_enclave_sym) goto err;
- eenter = addr + eenter_sym->st_value; + vdso_sgx_enter_enclave = addr + sgx_enter_enclave_sym->st_value;
- ret = sgx_call_vdso((void *)&MAGIC, &result, 0, EENTER, NULL, NULL, &run); - if (!report_results(&run, ret, result, "sgx_call_vdso")) + ret = sgx_enter_enclave((void *)&MAGIC, &result, 0, EENTER, + NULL, NULL, &run); + if (!report_results(&run, ret, result, "sgx_enter_enclave_unclobbered")) goto err;
/* Invoke the vDSO directly. */ result = 0; - ret = eenter((unsigned long)&MAGIC, (unsigned long)&result, 0, EENTER, - 0, 0, &run); - if (!report_results(&run, ret, result, "eenter")) + ret = vdso_sgx_enter_enclave((unsigned long)&MAGIC, (unsigned long)&result, + 0, EENTER, 0, 0, &run); + if (!report_results(&run, ret, result, "sgx_enter_enclave")) goto err;
/* And with an exit handler. */ run.user_handler = (__u64)user_handler; run.user_data = 0xdeadbeef; - ret = eenter((unsigned long)&MAGIC, (unsigned long)&result, 0, EENTER, - 0, 0, &run); + ret = vdso_sgx_enter_enclave((unsigned long)&MAGIC, (unsigned long)&result, + 0, EENTER, 0, 0, &run); if (!report_results(&run, ret, result, "user_handler")) goto err;
diff --git a/tools/testing/selftests/sgx/main.h b/tools/testing/selftests/sgx/main.h index 67211a708f04..68672fd86cf9 100644 --- a/tools/testing/selftests/sgx/main.h +++ b/tools/testing/selftests/sgx/main.h @@ -35,7 +35,7 @@ bool encl_load(const char *path, struct encl *encl); bool encl_measure(struct encl *encl); bool encl_build(struct encl *encl);
-int sgx_call_vdso(void *rdi, void *rsi, long rdx, u32 function, void *r8, void *r9, - struct sgx_enclave_run *run); +int sgx_enter_enclave(void *rdi, void *rsi, long rdx, u32 function, void *r8, void *r9, + struct sgx_enclave_run *run);
#endif /* MAIN_H */
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.14-rc1 commit 235d1c9c63088c33d746a1e7e92e15153b8d1192 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Migrate to kselftest harness. Use a fixture test with enclave initialized and de-initialized for each of the existing three tests, in other words:
1. One FIXTURE() for managing the enclave life-cycle.
2. Three TEST_F()'s, one for each test case.
Dump lines of /proc/self/maps matching "sgx" in FIXTURE_SETUP() as this can be very useful debugging information later on.
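For readers new to the harness, the pattern used below reduces to roughly this shape. The 'counter' fixture is made up for illustration; the include path assumes the file sits one directory below tools/testing/selftests/.

#include "../kselftest_harness.h"

FIXTURE(counter) {
	int value;		/* per-test state, fresh for every TEST_F() */
};

FIXTURE_SETUP(counter)
{
	self->value = 41;	/* runs before each test */
}

FIXTURE_TEARDOWN(counter)
{
	self->value = 0;	/* runs after each test */
}

TEST_F(counter, increments)
{
	self->value++;
	ASSERT_EQ(self->value, 42);
}

TEST_HARNESS_MAIN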
Amended commit log: This migration changes the output of this test. Instead of skipping the tests if open /dev/sgx_enclave fails, it will run all the tests and report failures on all of them. Shuah Khan skhan@linuxfoundation.org
Intel-SIG: commit 235d1c9c6308 selftests/sgx: Migrate to kselftest harness Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/load.c | 3 - tools/testing/selftests/sgx/main.c | 177 +++++++++++++++-------------- 2 files changed, 92 insertions(+), 88 deletions(-)
diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index f441ac34b4d4..00928be57fc4 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -239,9 +239,6 @@ bool encl_load(const char *path, struct encl *encl) seg->offset = (phdr->p_offset & PAGE_MASK) - src_offset; seg->size = (phdr->p_filesz + PAGE_SIZE - 1) & PAGE_MASK;
- printf("0x%016lx 0x%016lx 0x%02x\n", seg->offset, seg->size, - seg->prot); - j++; }
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 43da68388e25..6da19b6bf287 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -17,8 +17,8 @@ #include <sys/types.h> #include <sys/auxv.h> #include "defines.h" +#include "../kselftest_harness.h" #include "main.h" -#include "../kselftest.h"
static const uint64_t MAGIC = 0x1122334455667788ULL; vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave; @@ -107,85 +107,49 @@ static Elf64_Sym *vdso_symtab_get(struct vdso_symtab *symtab, const char *name) return NULL; }
-bool report_results(struct sgx_enclave_run *run, int ret, uint64_t result, - const char *test) -{ - bool valid = true; - - if (ret) { - printf("FAIL: %s() returned: %d\n", test, ret); - valid = false; - } - - if (run->function != EEXIT) { - printf("FAIL: %s() function, expected: %u, got: %u\n", test, EEXIT, - run->function); - valid = false; - } - - if (result != MAGIC) { - printf("FAIL: %s(), expected: 0x%lx, got: 0x%lx\n", test, MAGIC, - result); - valid = false; - } - - if (run->user_data) { - printf("FAIL: %s() user data, expected: 0x0, got: 0x%llx\n", - test, run->user_data); - valid = false; - } - - return valid; -} - -static int user_handler(long rdi, long rsi, long rdx, long ursp, long r8, long r9, - struct sgx_enclave_run *run) -{ - run->user_data = 0; - return 0; -} +FIXTURE(enclave) { + struct encl encl; + struct sgx_enclave_run run; +};
-int main(int argc, char *argv[]) +FIXTURE_SETUP(enclave) { - struct sgx_enclave_run run; + Elf64_Sym *sgx_enter_enclave_sym = NULL; struct vdso_symtab symtab; - Elf64_Sym *sgx_enter_enclave_sym; - uint64_t result = 0; - struct encl encl; + struct encl_segment *seg; unsigned int i; void *addr; - int ret; - - memset(&run, 0, sizeof(run));
- if (!encl_load("test_encl.elf", &encl)) { - encl_delete(&encl); + if (!encl_load("test_encl.elf", &self->encl)) { + encl_delete(&self->encl); ksft_exit_skip("cannot load enclaves\n"); }
- if (!encl_measure(&encl)) + for (i = 0; i < self->encl.nr_segments; i++) { + seg = &self->encl.segment_tbl[i]; + + TH_LOG("0x%016lx 0x%016lx 0x%02x", seg->offset, seg->size, seg->prot); + } + + if (!encl_measure(&self->encl)) goto err;
- if (!encl_build(&encl)) + if (!encl_build(&self->encl)) goto err;
/* * An enclave consumer only must do this. */ - for (i = 0; i < encl.nr_segments; i++) { - struct encl_segment *seg = &encl.segment_tbl[i]; - - addr = mmap((void *)encl.encl_base + seg->offset, seg->size, - seg->prot, MAP_SHARED | MAP_FIXED, encl.fd, 0); - if (addr == MAP_FAILED) { - perror("mmap() segment failed"); - exit(KSFT_FAIL); - } + for (i = 0; i < self->encl.nr_segments; i++) { + struct encl_segment *seg = &self->encl.segment_tbl[i]; + + addr = mmap((void *)self->encl.encl_base + seg->offset, seg->size, + seg->prot, MAP_SHARED | MAP_FIXED, self->encl.fd, 0); + EXPECT_NE(addr, MAP_FAILED); + if (addr == MAP_FAILED) + goto err; }
- memset(&run, 0, sizeof(run)); - run.tcs = encl.encl_base; - /* Get vDSO base address */ addr = (void *)getauxval(AT_SYSINFO_EHDR); if (!addr) @@ -200,32 +164,75 @@ int main(int argc, char *argv[])
vdso_sgx_enter_enclave = addr + sgx_enter_enclave_sym->st_value;
- ret = sgx_enter_enclave((void *)&MAGIC, &result, 0, EENTER, - NULL, NULL, &run); - if (!report_results(&run, ret, result, "sgx_enter_enclave_unclobbered")) - goto err; + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base;
+err: + if (!sgx_enter_enclave_sym) + encl_delete(&self->encl);
- /* Invoke the vDSO directly. */ - result = 0; - ret = vdso_sgx_enter_enclave((unsigned long)&MAGIC, (unsigned long)&result, - 0, EENTER, 0, 0, &run); - if (!report_results(&run, ret, result, "sgx_enter_enclave")) - goto err; + ASSERT_NE(sgx_enter_enclave_sym, NULL); +}
- /* And with an exit handler. */ - run.user_handler = (__u64)user_handler; - run.user_data = 0xdeadbeef; - ret = vdso_sgx_enter_enclave((unsigned long)&MAGIC, (unsigned long)&result, - 0, EENTER, 0, 0, &run); - if (!report_results(&run, ret, result, "user_handler")) - goto err; +FIXTURE_TEARDOWN(enclave) +{ + encl_delete(&self->encl); +}
- printf("SUCCESS\n"); - encl_delete(&encl); - exit(KSFT_PASS); +#define ENCL_CALL(in, out, run, clobbered) \ + ({ \ + int ret; \ + if ((clobbered)) \ + ret = vdso_sgx_enter_enclave((unsigned long)(in), (unsigned long)(out), 0, \ + EENTER, 0, 0, (run)); \ + else \ + ret = sgx_enter_enclave((void *)(in), (void *)(out), 0, EENTER, NULL, NULL, \ + (run)); \ + ret; \ + }) + +TEST_F(enclave, unclobbered_vdso) +{ + uint64_t result = 0;
-err: - encl_delete(&encl); - exit(KSFT_FAIL); + EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, false), 0); + + EXPECT_EQ(result, MAGIC); + EXPECT_EQ(self->run.function, EEXIT); + EXPECT_EQ(self->run.user_data, 0); +} + +TEST_F(enclave, clobbered_vdso) +{ + uint64_t result = 0; + + EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, true), 0); + + EXPECT_EQ(result, MAGIC); + EXPECT_EQ(self->run.function, EEXIT); + EXPECT_EQ(self->run.user_data, 0); } + +static int test_handler(long rdi, long rsi, long rdx, long ursp, long r8, long r9, + struct sgx_enclave_run *run) +{ + run->user_data = 0; + + return 0; +} + +TEST_F(enclave, clobbered_vdso_and_user_function) +{ + uint64_t result = 0; + + self->run.user_handler = (__u64)test_handler; + self->run.user_data = 0xdeadbeef; + + EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, true), 0); + + EXPECT_EQ(result, MAGIC); + EXPECT_EQ(self->run.function, EEXIT); + EXPECT_EQ(self->run.user_data, 0); +} + +TEST_HARNESS_MAIN
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.14-rc1 commit 040efd1c35f93787cbd26be6fc6493592571f424 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Often, it's useful to check whether /proc/self/maps looks sane when dealing with memory mapped objects, especially when they are JIT'ish dynamically constructed objects. Therefore, dump "/dev/sgx_enclave" matching lines from the memory map in FIXTURE_SETUP().
Intel-SIG: commit 040efd1c35f9 selftests/sgx: Dump enclave memory map Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/main.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 6da19b6bf287..14030f8b85ff 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -117,6 +117,8 @@ FIXTURE_SETUP(enclave) Elf64_Sym *sgx_enter_enclave_sym = NULL; struct vdso_symtab symtab; struct encl_segment *seg; + char maps_line[256]; + FILE *maps_file; unsigned int i; void *addr;
@@ -167,6 +169,18 @@ FIXTURE_SETUP(enclave) memset(&self->run, 0, sizeof(self->run)); self->run.tcs = self->encl.encl_base;
+ maps_file = fopen("/proc/self/maps", "r"); + if (maps_file != NULL) { + while (fgets(maps_line, sizeof(maps_line), maps_file) != NULL) { + maps_line[strlen(maps_line) - 1] = '\0'; + + if (strstr(maps_line, "/dev/sgx_enclave")) + TH_LOG("%s", maps_line); + } + + fclose(maps_file); + } + err: if (!sgx_enter_enclave_sym) encl_delete(&self->encl);
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.14-rc1 commit b334fb6fa7f38b4ad188d38307aea45e827b56ce category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Add EXPECT_EEXIT() macro, which will conditionally print the exception information, in addition to
EXPECT_EQ(self->run.function, EEXIT);
Intel-SIG: commit b334fb6fa7f3 selftests/sgx: Add EXPECT_EEXIT() macro Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/main.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 14030f8b85ff..bcd0257f48e0 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -205,6 +205,14 @@ FIXTURE_TEARDOWN(enclave) ret; \ })
+#define EXPECT_EEXIT(run) \ + do { \ + EXPECT_EQ((run)->function, EEXIT); \ + if ((run)->function != EEXIT) \ + TH_LOG("0x%02x 0x%02x 0x%016llx", (run)->exception_vector, \ + (run)->exception_error_code, (run)->exception_addr); \ + } while (0) + TEST_F(enclave, unclobbered_vdso) { uint64_t result = 0; @@ -212,7 +220,7 @@ TEST_F(enclave, unclobbered_vdso) EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, false), 0);
EXPECT_EQ(result, MAGIC); - EXPECT_EQ(self->run.function, EEXIT); + EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); }
@@ -223,7 +231,7 @@ TEST_F(enclave, clobbered_vdso) EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, true), 0);
EXPECT_EQ(result, MAGIC); - EXPECT_EQ(self->run.function, EEXIT); + EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); }
@@ -245,7 +253,7 @@ TEST_F(enclave, clobbered_vdso_and_user_function) EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, true), 0);
EXPECT_EQ(result, MAGIC); - EXPECT_EQ(self->run.function, EEXIT); + EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); }
From: Jarkko Sakkinen jarkko@kernel.org
mainline inclusion from mainline-v5.14-rc1 commit 22118ce17eb8dcf2a6ba2f6fb250816ddb59685a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Extend the enclave to have two operations: ENCL_OP_PUT and ENCL_OP_GET. ENCL_OP_PUT stores a value inside the enclave address space and ENCL_OP_GET reads it. The internal buffer can later be extended to be variable size, and allow reclaimer tests.
Intel-SIG: commit 22118ce17eb8 selftests/sgx: Refine the test enclave to have storage Backport for SGX Foundations support
Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/defines.h | 10 ++++ tools/testing/selftests/sgx/main.c | 57 ++++++++++++++++++----- tools/testing/selftests/sgx/test_encl.c | 19 +++++++- tools/testing/selftests/sgx/test_encl.lds | 3 +- 4 files changed, 74 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index 0bd73428d2f3..f88562afcaa0 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -18,4 +18,14 @@ #include "../../../../arch/x86/include/asm/enclu.h" #include "../../../../arch/x86/include/uapi/asm/sgx.h"
+enum encl_op_type { + ENCL_OP_PUT, + ENCL_OP_GET, +}; + +struct encl_op { + uint64_t type; + uint64_t buffer; +}; + #endif /* DEFINES_H */ diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index bcd0257f48e0..e252015e0c15 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -193,14 +193,14 @@ FIXTURE_TEARDOWN(enclave) encl_delete(&self->encl); }
-#define ENCL_CALL(in, out, run, clobbered) \ +#define ENCL_CALL(op, run, clobbered) \ ({ \ int ret; \ if ((clobbered)) \ - ret = vdso_sgx_enter_enclave((unsigned long)(in), (unsigned long)(out), 0, \ + ret = vdso_sgx_enter_enclave((unsigned long)(op), 0, 0, \ EENTER, 0, 0, (run)); \ else \ - ret = sgx_enter_enclave((void *)(in), (void *)(out), 0, EENTER, NULL, NULL, \ + ret = sgx_enter_enclave((void *)(op), NULL, 0, EENTER, NULL, NULL, \ (run)); \ ret; \ }) @@ -215,22 +215,44 @@ FIXTURE_TEARDOWN(enclave)
TEST_F(enclave, unclobbered_vdso) { - uint64_t result = 0; + struct encl_op op;
- EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, false), 0); + op.type = ENCL_OP_PUT; + op.buffer = MAGIC; + + EXPECT_EQ(ENCL_CALL(&op, &self->run, false), 0);
- EXPECT_EQ(result, MAGIC); + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.user_data, 0); + + op.type = ENCL_OP_GET; + op.buffer = 0; + + EXPECT_EQ(ENCL_CALL(&op, &self->run, false), 0); + + EXPECT_EQ(op.buffer, MAGIC); EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); }
TEST_F(enclave, clobbered_vdso) { - uint64_t result = 0; + struct encl_op op; + + op.type = ENCL_OP_PUT; + op.buffer = MAGIC; + + EXPECT_EQ(ENCL_CALL(&op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.user_data, 0); + + op.type = ENCL_OP_GET; + op.buffer = 0;
- EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, true), 0); + EXPECT_EQ(ENCL_CALL(&op, &self->run, true), 0);
- EXPECT_EQ(result, MAGIC); + EXPECT_EQ(op.buffer, MAGIC); EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); } @@ -245,14 +267,25 @@ static int test_handler(long rdi, long rsi, long rdx, long ursp, long r8, long r
TEST_F(enclave, clobbered_vdso_and_user_function) { - uint64_t result = 0; + struct encl_op op;
self->run.user_handler = (__u64)test_handler; self->run.user_data = 0xdeadbeef;
- EXPECT_EQ(ENCL_CALL(&MAGIC, &result, &self->run, true), 0); + op.type = ENCL_OP_PUT; + op.buffer = MAGIC; + + EXPECT_EQ(ENCL_CALL(&op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.user_data, 0); + + op.type = ENCL_OP_GET; + op.buffer = 0; + + EXPECT_EQ(ENCL_CALL(&op, &self->run, true), 0);
- EXPECT_EQ(result, MAGIC); + EXPECT_EQ(op.buffer, MAGIC); EXPECT_EEXIT(&self->run); EXPECT_EQ(self->run.user_data, 0); } diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index cf25b5dc1e03..734ea52f9924 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -4,6 +4,8 @@ #include <stddef.h> #include "defines.h"
+static uint8_t encl_buffer[8192] = { 1 }; + static void *memcpy(void *dest, const void *src, size_t n) { size_t i; @@ -14,7 +16,20 @@ static void *memcpy(void *dest, const void *src, size_t n) return dest; }
-void encl_body(void *rdi, void *rsi) +void encl_body(void *rdi, void *rsi) { - memcpy(rsi, rdi, 8); + struct encl_op *op = (struct encl_op *)rdi; + + switch (op->type) { + case ENCL_OP_PUT: + memcpy(&encl_buffer[0], &op->buffer, 8); + break; + + case ENCL_OP_GET: + memcpy(&op->buffer, &encl_buffer[0], 8); + break; + + default: + break; + } } diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds index 0fbbda7e665e..a1ec64f7d91f 100644 --- a/tools/testing/selftests/sgx/test_encl.lds +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -18,9 +18,10 @@ SECTIONS .text : { *(.text*) *(.rodata*) + FILL(0xDEADBEEF); + . = ALIGN(4096); } : text
- . = ALIGN(4096); .data : { *(.data*) } : data
From: Dave Hansen dave.hansen@linux.intel.com
mainline inclusion from mainline-v5.14-rc1 commit 4896df9d53ae5521f3ce83751e828ad70bc65c80 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
The SGX selftests can fail for a bunch of non-obvious reasons like 'noexec' permissions on /dev (which is the default *EVERYWHERE* it seems).
A new test mistakenly also looked for +x permission on the /dev/sgx_enclave. File execute permissions really only apply to the ability of execve() to work on a file, *NOT* on the ability for an application to map the file with PROT_EXEC. SGX needs to mmap(PROT_EXEC), but doesn't need to execve() the device file.
Remove the check.
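The rule being corrected here is easy to verify in isolation: mmap(PROT_EXEC) on a file is gated by read access and the mount's noexec flag, not by the file's +x bits. A small user-space sketch; /etc/hostname is just an arbitrary world-readable file that normally lacks execute permission.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/etc/hostname", O_RDONLY);
	void *ptr;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Succeeds without +x on the file, unless the mount is noexec. */
	ptr = mmap(NULL, 4096, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
	if (ptr == MAP_FAILED)
		perror("mmap(PROT_EXEC)");
	else
		munmap(ptr, 4096);

	close(fd);
	return 0;
}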
Intel-SIG: commit 4896df9d53ae selftests/sgx: remove checks for file execute permissions Backport for SGX Foundations support
Fixes: 4284f7acb78b ("selftests/sgx: Improve error detection and messages") Reported-by: Tim Gardner tim.gardner@canonical.com Cc: Jarkko Sakkinen jarkko@kernel.org Cc: Reinette Chatre reinette.chatre@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Shuah Khan shuah@kernel.org Cc: linux-sgx@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: linux-kernel@vger.kernel.org Tested-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/load.c | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index 00928be57fc4..3ebe5d1fe337 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -150,16 +150,6 @@ bool encl_load(const char *path, struct encl *encl) goto err; }
- /* - * This just checks if the /dev file has these permission - * bits set. It does not check that the current user is - * the owner or in the owning group. - */ - if (!(sb.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH))) { - fprintf(stderr, "no execute permissions on device file %s\n", device_path); - goto err; - } - ptr = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, 0); if (ptr == (void *)-1) { perror("mmap for read"); @@ -169,13 +159,13 @@ bool encl_load(const char *path, struct encl *encl)
#define ERR_MSG \ "mmap() succeeded for PROT_READ, but failed for PROT_EXEC.\n" \ -" Check that current user has execute permissions on %s and \n" \ -" that /dev does not have noexec set: mount | grep \"/dev .*noexec\"\n" \ +" Check that /dev does not have noexec set:\n" \ +" \tmount | grep \"/dev .*noexec\"\n" \ " If so, remount it executable: mount -o remount,exec /dev\n\n"
ptr = mmap(NULL, PAGE_SIZE, PROT_EXEC, MAP_SHARED, fd, 0); if (ptr == (void *)-1) { - fprintf(stderr, ERR_MSG, device_path); + fprintf(stderr, ERR_MSG); goto err; } munmap(ptr, PAGE_SIZE);
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.14-rc6 commit 567c39047dbee341244fe3bf79fea24ee0897ff9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Q1 and Q2 are numbers with *maximum* length of 384 bytes. If the calculated length of Q1 and Q2 is less than 384 bytes, things will go wrong.
E.g. if Q2 is 383 bytes, then
1. The bytes of q2 are copied to sigstruct->q2 in calc_q1q2().
2. The entire sigstruct->q2 is reversed, which results in it being 256 * Q2,
   because the unused last byte of sigstruct->q2 ends up in front of the
   bytes written by calc_q1q2().

Either a change in the key or in the measurement can trigger the bug. E.g. an unmeasured heap could cause a devastating change in Q1 or Q2.
Reverse exactly the bytes of Q1 and Q2 in calc_q1q2() before returning to the caller.
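The BN_bn2bin() behavior at the root of this is easy to demonstrate with OpenSSL alone (a sketch; build with -lcrypto):

#include <openssl/bn.h>
#include <stdio.h>

int main(void)
{
	BIGNUM *q = BN_new();
	unsigned char buf[8] = { 0 };
	int len, i;

	if (!q)
		return 1;

	/* A value that needs only 2 of the 8 buffer bytes. */
	BN_set_word(q, 0xBEEF);

	/* BN_bn2bin() returns the number of bytes it actually wrote. */
	len = BN_bn2bin(q, buf);
	printf("len = %d\n", len);	/* prints 2, not 8 */

	for (i = 0; i < (int)sizeof(buf); i++)
		printf("%02x ", buf[i]);	/* be ef 00 00 00 00 00 00 */
	printf("\n");

	/*
	 * Reversing all sizeof(buf) bytes would move 0xbeef to the end,
	 * scaling the little-endian value; reversing only 'len' bytes,
	 * as the fix does, yields the intended result.
	 */
	BN_free(q);
	return 0;
}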
Intel-SIG: commit 567c39047dbe selftests/sgx: Fix Q1 and Q2 calculation in sigstruct.c Backport for SGX Foundations support
Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX") Link: https://lore.kernel.org/linux-sgx/20210301051836.30738-1-tianjia.zhang@linux... Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Fan Du fan.du@intel.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/sgx/sigstruct.c | 41 +++++++++++++------------ 1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/sgx/sigstruct.c b/tools/testing/selftests/sgx/sigstruct.c index dee7a3d6c5a5..92bbc5a15c39 100644 --- a/tools/testing/selftests/sgx/sigstruct.c +++ b/tools/testing/selftests/sgx/sigstruct.c @@ -55,10 +55,27 @@ static bool alloc_q1q2_ctx(const uint8_t *s, const uint8_t *m, return true; }
+static void reverse_bytes(void *data, int length) +{ + int i = 0; + int j = length - 1; + uint8_t temp; + uint8_t *ptr = data; + + while (i < j) { + temp = ptr[i]; + ptr[i] = ptr[j]; + ptr[j] = temp; + i++; + j--; + } +} + static bool calc_q1q2(const uint8_t *s, const uint8_t *m, uint8_t *q1, uint8_t *q2) { struct q1q2_ctx ctx; + int len;
if (!alloc_q1q2_ctx(s, m, &ctx)) { fprintf(stderr, "Not enough memory for Q1Q2 calculation\n"); @@ -89,8 +106,10 @@ static bool calc_q1q2(const uint8_t *s, const uint8_t *m, uint8_t *q1, goto out; }
- BN_bn2bin(ctx.q1, q1); - BN_bn2bin(ctx.q2, q2); + len = BN_bn2bin(ctx.q1, q1); + reverse_bytes(q1, len); + len = BN_bn2bin(ctx.q2, q2); + reverse_bytes(q2, len);
free_q1q2_ctx(&ctx); return true; @@ -152,22 +171,6 @@ static RSA *gen_sign_key(void) return key; }
-static void reverse_bytes(void *data, int length) -{ - int i = 0; - int j = length - 1; - uint8_t temp; - uint8_t *ptr = data; - - while (i < j) { - temp = ptr[i]; - ptr[i] = ptr[j]; - ptr[j] = temp; - i++; - j--; - } -} - enum mrtags { MRECREATE = 0x0045544145524345, MREADD = 0x0000000044444145, @@ -367,8 +370,6 @@ bool encl_measure(struct encl *encl) /* BE -> LE */ reverse_bytes(sigstruct->signature, SGX_MODULUS_SIZE); reverse_bytes(sigstruct->modulus, SGX_MODULUS_SIZE); - reverse_bytes(sigstruct->q1, SGX_MODULUS_SIZE); - reverse_bytes(sigstruct->q2, SGX_MODULUS_SIZE);
EVP_MD_CTX_destroy(ctx); RSA_free(key);
From: Laibin Qiu qiulaibin@huawei.com
openEuler inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIGI CVE: NA
--------------------------------
Enable configs to fully support Intel Ice Lake.
Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Bamvor Zhang bamvor.zhang@suse.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/configs/openeuler_defconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig index 316c4122a859..b3657bc5fccb 100644 --- a/arch/x86/configs/openeuler_defconfig +++ b/arch/x86/configs/openeuler_defconfig @@ -442,6 +442,7 @@ CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y CONFIG_X86_INTEL_TSX_MODE_OFF=y # CONFIG_X86_INTEL_TSX_MODE_ON is not set # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set +CONFIG_X86_SGX=y CONFIG_EFI=y CONFIG_EFI_STUB=y CONFIG_EFI_MIXED=y @@ -6370,7 +6371,7 @@ CONFIG_HSU_DMA=y # DMA Clients # CONFIG_ASYNC_TX_DMA=y -# CONFIG_DMATEST is not set +CONFIG_DMATEST=y CONFIG_DMA_ENGINE_RAID=y
#
From: Liguang Zhang zhangliguang@linux.alibaba.com
mainline inclusion from mainline-v5.14-rc1 commit 114e43371c58 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4UACK CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-----------------------------------------------------------------------
In the DSDT table, the TPM _CID was SMO0768, with no _HID definition. After a kernel upgrade from 4.19 to 5.10, the TPM probe function changed, which causes the device probe to fail. In order to make the newer kernel compatible with the older ACPI definition, it is best to fall back to a default probe function.
Signed-off-by: Liguang Zhang zhangliguang@linux.alibaba.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jiefeng Ou oujiefeng@h-partners.com Reviewed-by: Yicong Yang yangyicong@huawei.com Reviewed-by: Jay Fang f.fangjian@huawei.com Acked-by: Xinwei Kong kong.kongxinwei@hisilicon.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/tpm/tpm_tis_spi_main.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis_spi_main.c b/drivers/char/tpm/tpm_tis_spi_main.c index d64bea3298a2..aaa59a00eeae 100644 --- a/drivers/char/tpm/tpm_tis_spi_main.c +++ b/drivers/char/tpm/tpm_tis_spi_main.c @@ -240,10 +240,14 @@ static int tpm_tis_spi_driver_probe(struct spi_device *spi) tpm_tis_spi_probe_func probe_func;
probe_func = of_device_get_match_data(&spi->dev); - if (!probe_func && spi_dev_id) - probe_func = (tpm_tis_spi_probe_func)spi_dev_id->driver_data; - if (!probe_func) - return -ENODEV; + if (!probe_func) { + if (spi_dev_id) { + probe_func = (tpm_tis_spi_probe_func)spi_dev_id->driver_data; + if (!probe_func) + return -ENODEV; + } else + probe_func = tpm_tis_spi_probe; + }
return probe_func(spi); }
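The selection logic added above can be summarized by the following stand-alone sketch; all names and types are simplified stand-ins for the spi/tpm structures, and -19 stands in for -ENODEV.

/* probe_fallback.c - simplified sketch of the probe selection above */
#include <stdio.h>
#include <stddef.h>

typedef int (*probe_func_t)(void);

/* stand-in for tpm_tis_spi_probe(), the default probe */
static int default_probe(void)
{
	printf("default probe used\n");
	return 0;
}

/*
 * of_data: probe from a devicetree match (may be NULL);
 * id_data: probe from the spi_device_id table (may be NULL);
 * have_id: whether an spi_device_id entry matched at all.
 */
static int driver_probe(probe_func_t of_data, probe_func_t id_data, int have_id)
{
	probe_func_t probe = of_data;

	if (!probe) {
		if (have_id) {
			probe = id_data;
			if (!probe)
				return -19;	/* -ENODEV */
		} else {
			/* ACPI enumeration via _CID only: neither OF data
			 * nor an id table entry, fall back to the default */
			probe = default_probe;
		}
	}
	return probe();
}

int main(void)
{
	return driver_probe(NULL, NULL, 0);	/* takes the fallback path */
}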
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
'struct klp_func_node' is used to save the code of a function which is then to be live-patched. Currently this structure is defined separately in x86/arm32/arm64/ppc32/ppc64.
The definitions of the field that saves the old function address differ: in x86/arm32/ppc32/ppc64 it is 'void *old_func'; in arm64 it is 'unsigned long old_addr'.

The minority follows the majority: unify them all as 'void *old_func'.
Preparatory only, no functional change.
Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/kernel/livepatch.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/livepatch.c b/arch/arm64/kernel/livepatch.c index e60841e975c6..a833fc14e114 100644 --- a/arch/arm64/kernel/livepatch.c +++ b/arch/arm64/kernel/livepatch.c @@ -56,7 +56,7 @@ static inline bool offset_in_range(unsigned long pc, unsigned long addr, struct klp_func_node { struct list_head node; struct list_head func_stack; - unsigned long old_addr; + void *old_func; #ifdef CONFIG_ARM64_MODULE_PLTS u32 old_insns[LJMP_INSN_SIZE]; #else @@ -66,12 +66,12 @@ struct klp_func_node {
static LIST_HEAD(klp_func_list);
-static struct klp_func_node *klp_find_func_node(unsigned long old_addr) +static struct klp_func_node *klp_find_func_node(void *old_func) { struct klp_func_node *func_node;
list_for_each_entry(func_node, &klp_func_list, node) { - if (func_node->old_addr == old_addr) + if (func_node->old_func == old_func) return func_node; }
@@ -180,7 +180,7 @@ static int klp_check_activeness_func(struct klp_patch *patch, int enable, * When enable, checking the currently * active functions. */ - func_node = klp_find_func_node((unsigned long)func->old_func); + func_node = klp_find_func_node(func->old_func); if (!func_node || list_empty(&func_node->func_stack)) { func_addr = (unsigned long)func->old_func; @@ -221,7 +221,7 @@ static int klp_check_activeness_func(struct klp_patch *patch, int enable, * patched function and the function itself * which to be unpatched. */ - func_node = klp_find_func_node((unsigned long)func->old_func); + func_node = klp_find_func_node(func->old_func); if (!func_node) { return -EINVAL; } @@ -370,7 +370,7 @@ int arch_klp_patch_func(struct klp_func *func) #endif int ret = 0;
- func_node = klp_find_func_node((unsigned long)func->old_func); + func_node = klp_find_func_node(func->old_func); if (!func_node) { func_node = func->func_node; if (!func_node) @@ -378,7 +378,7 @@ int arch_klp_patch_func(struct klp_func *func) memory_flag = 1;
INIT_LIST_HEAD(&func_node->func_stack); - func_node->old_addr = (unsigned long)func->old_func; + func_node->old_func = func->old_func;
#ifdef CONFIG_ARM64_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) { @@ -447,11 +447,11 @@ void arch_klp_unpatch_func(struct klp_func *func) int i; u32 insns[LJMP_INSN_SIZE]; #endif - func_node = klp_find_func_node((unsigned long)func->old_func); + func_node = klp_find_func_node(func->old_func); if (WARN_ON(!func_node)) return;
- pc = func_node->old_addr; + pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { #ifdef CONFIG_ARM64_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++)
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
Introduce 'struct arch_klp_data' to encapsulate arch-related data.

This is preparatory work for moving 'struct klp_func_node' out of 'arch' and reducing duplicated code.
Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm/include/asm/livepatch.h | 17 +++++++++++++++++ arch/arm/kernel/livepatch.c | 15 +++++---------- arch/arm64/include/asm/livepatch.h | 15 +++++++++++++++ arch/arm64/kernel/livepatch.c | 16 +++++----------- arch/powerpc/include/asm/livepatch.h | 19 ++++++++++++++++++- arch/powerpc/kernel/livepatch_32.c | 7 +++---- arch/powerpc/kernel/livepatch_64.c | 19 +++++++------------ arch/x86/include/asm/livepatch.h | 11 +++++++++++ arch/x86/kernel/livepatch.c | 7 +++---- 9 files changed, 84 insertions(+), 42 deletions(-)
diff --git a/arch/arm/include/asm/livepatch.h b/arch/arm/include/asm/livepatch.h index 216078d8c2b0..abeccf19f9ca 100644 --- a/arch/arm/include/asm/livepatch.h +++ b/arch/arm/include/asm/livepatch.h @@ -36,4 +36,21 @@ void arch_klp_unpatch_func(struct klp_func *func); int klp_check_calltrace(struct klp_patch *patch, int enable); #endif
+ +#if defined(CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) + +#ifdef CONFIG_ARM_MODULE_PLTS +#define LJMP_INSN_SIZE 3 +#endif + +struct arch_klp_data { +#ifdef CONFIG_ARM_MODULE_PLTS + u32 old_insns[LJMP_INSN_SIZE]; +#else + u32 old_insn; +#endif +}; + +#endif + #endif /* _ASM_ARM_LIVEPATCH_H */ diff --git a/arch/arm/kernel/livepatch.c b/arch/arm/kernel/livepatch.c index 221275714899..a72fb2c03d9d 100644 --- a/arch/arm/kernel/livepatch.c +++ b/arch/arm/kernel/livepatch.c @@ -38,7 +38,6 @@ #endif
#ifdef CONFIG_ARM_MODULE_PLTS -#define LJMP_INSN_SIZE 3 #define MAX_SIZE_TO_CHECK (LJMP_INSN_SIZE * ARM_INSN_SIZE) #define CHECK_JUMP_RANGE LJMP_INSN_SIZE
@@ -51,11 +50,7 @@ struct klp_func_node { struct list_head node; struct list_head func_stack; void *old_func; -#ifdef CONFIG_ARM_MODULE_PLTS - u32 old_insns[LJMP_INSN_SIZE]; -#else - u32 old_insn; -#endif + struct arch_klp_data arch_data; };
static LIST_HEAD(klp_func_list); @@ -401,12 +396,12 @@ int arch_klp_patch_func(struct klp_func *func) #ifdef CONFIG_ARM_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) { ret = arm_insn_read((u32 *)func->old_func + i, - &func_node->old_insns[i]); + &func_node->arch_data.old_insns[i]); if (ret) break; } #else - ret = arm_insn_read(func->old_func, &func_node->old_insn); + ret = arm_insn_read(func->old_func, &func_node->arch_data.old_insn); #endif if (ret) { return -EPERM; @@ -461,11 +456,11 @@ void arch_klp_unpatch_func(struct klp_func *func) if (list_is_singular(&func_node->func_stack)) { #ifdef CONFIG_ARM_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) { - insns[i] = func_node->old_insns[i]; + insns[i] = func_node->arch_data.old_insns[i]; __patch_text(((u32 *)pc) + i, insns[i]); } #else - insn = func_node->old_insn; + insn = func_node->arch_data.old_insn; __patch_text((void *)pc, insn); #endif list_del_rcu(&func->stack_node); diff --git a/arch/arm64/include/asm/livepatch.h b/arch/arm64/include/asm/livepatch.h index d85991fff647..11a972c596cf 100644 --- a/arch/arm64/include/asm/livepatch.h +++ b/arch/arm64/include/asm/livepatch.h @@ -48,4 +48,19 @@ int klp_check_calltrace(struct klp_patch *patch, int enable); #error Live patching support is disabled; check CONFIG_LIVEPATCH #endif
+ +#if defined(CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) + +#define LJMP_INSN_SIZE 4 + +struct arch_klp_data { +#ifdef CONFIG_ARM64_MODULE_PLTS + u32 old_insns[LJMP_INSN_SIZE]; +#else + u32 old_insn; +#endif +}; + +#endif + #endif /* _ASM_ARM64_LIVEPATCH_H */ diff --git a/arch/arm64/kernel/livepatch.c b/arch/arm64/kernel/livepatch.c index a833fc14e114..5c4629c8e6e2 100644 --- a/arch/arm64/kernel/livepatch.c +++ b/arch/arm64/kernel/livepatch.c @@ -34,8 +34,6 @@ #include <linux/sched/debug.h> #include <linux/kallsyms.h>
-#define LJMP_INSN_SIZE 4 - #ifdef CONFIG_ARM64_MODULE_PLTS #define MAX_SIZE_TO_CHECK (LJMP_INSN_SIZE * sizeof(u32)) #define CHECK_JUMP_RANGE LJMP_INSN_SIZE @@ -57,11 +55,7 @@ struct klp_func_node { struct list_head node; struct list_head func_stack; void *old_func; -#ifdef CONFIG_ARM64_MODULE_PLTS - u32 old_insns[LJMP_INSN_SIZE]; -#else - u32 old_insn; -#endif + struct arch_klp_data arch_data; };
static LIST_HEAD(klp_func_list); @@ -383,13 +377,13 @@ int arch_klp_patch_func(struct klp_func *func) #ifdef CONFIG_ARM64_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) { ret = aarch64_insn_read(((u32 *)func->old_func) + i, - &func_node->old_insns[i]); + &func_node->arch_data.old_insns[i]); if (ret) break; } #else ret = aarch64_insn_read((void *)func->old_func, - &func_node->old_insn); + &func_node->arch_data.old_insn); #endif if (ret) { return -EPERM; @@ -455,9 +449,9 @@ void arch_klp_unpatch_func(struct klp_func *func) if (list_is_singular(&func_node->func_stack)) { #ifdef CONFIG_ARM64_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) - insns[i] = func_node->old_insns[i]; + insns[i] = func_node->arch_data.old_insns[i]; #else - insn = func_node->old_insn; + insn = func_node->arch_data.old_insn; #endif list_del_rcu(&func->stack_node); list_del_rcu(&func_node->node); diff --git a/arch/powerpc/include/asm/livepatch.h b/arch/powerpc/include/asm/livepatch.h index 5e5161ceac13..5901e3069c07 100644 --- a/arch/powerpc/include/asm/livepatch.h +++ b/arch/powerpc/include/asm/livepatch.h @@ -37,7 +37,7 @@ struct klp_func; int arch_klp_patch_func(struct klp_func *func); void arch_klp_unpatch_func(struct klp_func *func);
-#ifdef CONFIG_PPC64 +#if defined(CONFIG_PPC64) /* * use the livepatch stub to jump to the trampoline. * It is similar to stub, but does not need to save @@ -81,6 +81,23 @@ int livepatch_create_branch(unsigned long pc, unsigned long trampoline, unsigned long addr, struct module *me); + +struct arch_klp_data { + u32 old_insns[LJMP_INSN_SIZE]; +#ifdef PPC64_ELF_ABI_v1 + struct ppc64_klp_btramp_entry trampoline; +#else + unsigned long trampoline; +#endif /* PPC64_ELF_ABI_v1 */ +}; + +#elif defined(CONFIG_PPC32) + +#define LJMP_INSN_SIZE 4 +struct arch_klp_data { + u32 old_insns[LJMP_INSN_SIZE]; +}; + #endif /* CONFIG_PPC64 */
#endif /* CONFIG_LIVEPATCH_FTRACE */ diff --git a/arch/powerpc/kernel/livepatch_32.c b/arch/powerpc/kernel/livepatch_32.c index 13f3200bf52f..e4af8faa08bb 100644 --- a/arch/powerpc/kernel/livepatch_32.c +++ b/arch/powerpc/kernel/livepatch_32.c @@ -31,7 +31,6 @@
#if defined (CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) || \ defined (CONFIG_LIVEPATCH_WO_FTRACE) -#define LJMP_INSN_SIZE 4 #define MAX_SIZE_TO_CHECK (LJMP_INSN_SIZE * sizeof(u32)) #define CHECK_JUMP_RANGE LJMP_INSN_SIZE
@@ -39,7 +38,7 @@ struct klp_func_node { struct list_head node; struct list_head func_stack; void *old_func; - u32 old_insns[LJMP_INSN_SIZE]; + struct arch_klp_data arch_data; };
static LIST_HEAD(klp_func_list); @@ -419,7 +418,7 @@ int arch_klp_patch_func(struct klp_func *func) INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; for (i = 0; i < LJMP_INSN_SIZE; i++) { - ret = copy_from_kernel_nofault(&func_node->old_insns[i], + ret = copy_from_kernel_nofault(&func_node->arch_data.old_insns[i], ((u32 *)func->old_func) + i, LJMP_INSN_SIZE); if (ret) { return -EPERM; @@ -482,7 +481,7 @@ void arch_klp_unpatch_func(struct klp_func *func) pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { for (i = 0; i < LJMP_INSN_SIZE; i++) - insns[i] = func_node->old_insns[i]; + insns[i] = func_node->arch_data.old_insns[i];
list_del_rcu(&func->stack_node); list_del_rcu(&func_node->node); diff --git a/arch/powerpc/kernel/livepatch_64.c b/arch/powerpc/kernel/livepatch_64.c index 77fbb5603137..30e4ba7db602 100644 --- a/arch/powerpc/kernel/livepatch_64.c +++ b/arch/powerpc/kernel/livepatch_64.c @@ -43,12 +43,7 @@ struct klp_func_node { struct list_head node; struct list_head func_stack; void *old_func; - u32 old_insns[LJMP_INSN_SIZE]; -#ifdef PPC64_ELF_ABI_v1 - struct ppc64_klp_btramp_entry trampoline; -#else - unsigned long trampoline; -#endif + struct arch_klp_data arch_data; };
static LIST_HEAD(klp_func_list); @@ -265,10 +260,10 @@ static int klp_check_activeness_func(struct klp_patch *patch, int enable, #endif
if (func_node == NULL || - func_node->trampoline.magic != BRANCH_TRAMPOLINE_MAGIC) + func_node->arch_data.trampoline.magic != BRANCH_TRAMPOLINE_MAGIC) continue;
- func_addr = (unsigned long)&func_node->trampoline; + func_addr = (unsigned long)&func_node->arch_data.trampoline; func_size = sizeof(struct ppc64_klp_btramp_entry); ret = add_func_to_list(check_funcs, &pcheck, func_addr, func_size, "trampoline", 0); @@ -468,7 +463,7 @@ int arch_klp_patch_func(struct klp_func *func) INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; for (i = 0; i < LJMP_INSN_SIZE; i++) { - ret = copy_from_kernel_nofault(&func_node->old_insns[i], + ret = copy_from_kernel_nofault(&func_node->arch_data.old_insns[i], ((u32 *)func->old_func) + i, 4); if (ret) { return -EPERM; @@ -482,7 +477,7 @@ int arch_klp_patch_func(struct klp_func *func) pc = (unsigned long)func->old_func; new_addr = (unsigned long)func->new_func;
- ret = livepatch_create_branch(pc, (unsigned long)&func_node->trampoline, + ret = livepatch_create_branch(pc, (unsigned long)&func_node->arch_data.trampoline, new_addr, func->old_mod); if (ret) goto ERR_OUT; @@ -518,7 +513,7 @@ void arch_klp_unpatch_func(struct klp_func *func) pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { for (i = 0; i < LJMP_INSN_SIZE; i++) - insns[i] = func_node->old_insns[i]; + insns[i] = func_node->arch_data.old_insns[i];
list_del_rcu(&func->stack_node); list_del_rcu(&func_node->node); @@ -534,7 +529,7 @@ void arch_klp_unpatch_func(struct klp_func *func) struct klp_func, stack_node); new_addr = (unsigned long)next_func->new_func;
- livepatch_create_branch(pc, (unsigned long)&func_node->trampoline, + livepatch_create_branch(pc, (unsigned long)&func_node->arch_data.trampoline, new_addr, func->old_mod);
pr_debug("[%s %d] old = 0x%lx/0x%lx/%pS, new = 0x%lx/0x%lx/%pS\n", diff --git a/arch/x86/include/asm/livepatch.h b/arch/x86/include/asm/livepatch.h index dd51bb6c1816..c89f20576ca7 100644 --- a/arch/x86/include/asm/livepatch.h +++ b/arch/x86/include/asm/livepatch.h @@ -30,4 +30,15 @@ void arch_klp_unpatch_func(struct klp_func *func); int klp_check_calltrace(struct klp_patch *patch, int enable); #endif
+ +#if defined(CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) || \ + defined(CONFIG_LIVEPATCH_WO_FTRACE) + +#define JMP_E9_INSN_SIZE 5 +struct arch_klp_data { + unsigned char old_code[JMP_E9_INSN_SIZE]; +}; + +#endif + #endif /* _ASM_X86_LIVEPATCH_H */ diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c index 386d224d5890..469a7e1323f5 100644 --- a/arch/x86/kernel/livepatch.c +++ b/arch/x86/kernel/livepatch.c @@ -33,13 +33,12 @@
#if defined (CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) || \ defined (CONFIG_LIVEPATCH_WO_FTRACE) -#define JMP_E9_INSN_SIZE 5
struct klp_func_node { struct list_head node; struct list_head func_stack; void *old_func; - unsigned char old_code[JMP_E9_INSN_SIZE]; + struct arch_klp_data arch_data; };
static LIST_HEAD(klp_func_list); @@ -430,7 +429,7 @@ int arch_klp_patch_func(struct klp_func *func)
INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; - ret = copy_from_kernel_nofault(func_node->old_code, + ret = copy_from_kernel_nofault(func_node->arch_data.old_code, (void *)ip, JMP_E9_INSN_SIZE); if (ret) { return -EPERM; @@ -460,7 +459,7 @@ void arch_klp_unpatch_func(struct klp_func *func) if (list_is_singular(&func_node->func_stack)) { list_del_rcu(&func->stack_node); list_del_rcu(&func_node->node); - new = klp_old_code(func_node->old_code); + new = klp_old_code(func_node->arch_data.old_code); } else { list_del_rcu(&func->stack_node); next_func = list_first_or_null_rcu(&func_node->func_stack,
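The pattern introduced here is plain struct embedding: the arch-specific fields move into 'struct arch_klp_data', and the common 'struct klp_func_node' carries them as an 'arch_data' member. A minimal sketch, using the arm64 field layout from the patch but otherwise illustrative scaffolding:

/* arch_data_demo.c - sketch of the struct-embedding pattern above */
#include <stdint.h>
#include <stdio.h>

#define LJMP_INSN_SIZE 4			/* arm64 value from the patch */

struct arch_klp_data {				/* per-arch saved state */
	uint32_t old_insns[LJMP_INSN_SIZE];
};

struct klp_func_node {				/* common, arch-independent part */
	void *old_func;
	struct arch_klp_data arch_data;		/* arch details tucked away here */
};

int main(void)
{
	static int target;			/* pretend this is the old function */
	struct klp_func_node node = { .old_func = &target };

	node.arch_data.old_insns[0] = 0xd503201f;	/* an A64 NOP encoding */
	printf("old_func=%p insn0=%08x\n", node.old_func,
	       node.arch_data.old_insns[0]);
	return 0;
}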
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
Move 'struct klp_func_node' into include/linux/livepatch.h, then introduce klp_{add,del}_func_node and move the 'klp_func_list' related code out of 'arch' in order to reduce duplicated code.
Preparatory only, no functional change.
Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm/kernel/livepatch.c | 25 ++----------------------- arch/arm64/kernel/livepatch.c | 28 +++------------------------- arch/powerpc/kernel/livepatch_32.c | 27 +++------------------------ arch/powerpc/kernel/livepatch_64.c | 27 +++------------------------ arch/x86/kernel/livepatch.c | 29 ++--------------------------- include/linux/livepatch.h | 11 +++++++++++ kernel/livepatch/core.c | 24 ++++++++++++++++++++++++ 7 files changed, 48 insertions(+), 123 deletions(-)
diff --git a/arch/arm/kernel/livepatch.c b/arch/arm/kernel/livepatch.c index a72fb2c03d9d..d243f5f3e76a 100644 --- a/arch/arm/kernel/livepatch.c +++ b/arch/arm/kernel/livepatch.c @@ -46,27 +46,6 @@ #define CHECK_JUMP_RANGE 1 #endif
-struct klp_func_node { - struct list_head node; - struct list_head func_stack; - void *old_func; - struct arch_klp_data arch_data; -}; - -static LIST_HEAD(klp_func_list); - -static struct klp_func_node *klp_find_func_node(void *old_func) -{ - struct klp_func_node *func_node; - - list_for_each_entry(func_node, &klp_func_list, node) { - if (func_node->old_func == old_func) - return func_node; - } - - return NULL; -} - #ifdef CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY /* * The instruction set on arm is A32. @@ -406,7 +385,7 @@ int arch_klp_patch_func(struct klp_func *func) if (ret) { return -EPERM; } - list_add_rcu(&func_node->node, &klp_func_list); + klp_add_func_node(func_node); }
list_add_rcu(&func->stack_node, &func_node->func_stack); @@ -464,7 +443,7 @@ void arch_klp_unpatch_func(struct klp_func *func) __patch_text((void *)pc, insn); #endif list_del_rcu(&func->stack_node); - list_del_rcu(&func_node->node); + klp_del_func_node(func_node); } else { list_del_rcu(&func->stack_node); next_func = list_first_or_null_rcu(&func_node->func_stack, diff --git a/arch/arm64/kernel/livepatch.c b/arch/arm64/kernel/livepatch.c index 5c4629c8e6e2..a4b64e74f722 100644 --- a/arch/arm64/kernel/livepatch.c +++ b/arch/arm64/kernel/livepatch.c @@ -51,27 +51,6 @@ static inline bool offset_in_range(unsigned long pc, unsigned long addr, #define CHECK_JUMP_RANGE 1 #endif
-struct klp_func_node { - struct list_head node; - struct list_head func_stack; - void *old_func; - struct arch_klp_data arch_data; -}; - -static LIST_HEAD(klp_func_list); - -static struct klp_func_node *klp_find_func_node(void *old_func) -{ - struct klp_func_node *func_node; - - list_for_each_entry(func_node, &klp_func_list, node) { - if (func_node->old_func == old_func) - return func_node; - } - - return NULL; -} - #ifdef CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY /* * The instruction set on arm64 is A64. @@ -388,8 +367,7 @@ int arch_klp_patch_func(struct klp_func *func) if (ret) { return -EPERM; } - - list_add_rcu(&func_node->node, &klp_func_list); + klp_add_func_node(func_node); }
list_add_rcu(&func->stack_node, &func_node->func_stack); @@ -425,7 +403,7 @@ int arch_klp_patch_func(struct klp_func *func) ERR_OUT: list_del_rcu(&func->stack_node); if (memory_flag) { - list_del_rcu(&func_node->node); + klp_del_func_node(func_node); }
return -EPERM; @@ -454,7 +432,7 @@ void arch_klp_unpatch_func(struct klp_func *func) insn = func_node->arch_data.old_insn; #endif list_del_rcu(&func->stack_node); - list_del_rcu(&func_node->node); + klp_del_func_node(func_node);
#ifdef CONFIG_ARM64_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) { diff --git a/arch/powerpc/kernel/livepatch_32.c b/arch/powerpc/kernel/livepatch_32.c index e4af8faa08bb..e2b9e9b25477 100644 --- a/arch/powerpc/kernel/livepatch_32.c +++ b/arch/powerpc/kernel/livepatch_32.c @@ -33,27 +33,6 @@ defined (CONFIG_LIVEPATCH_WO_FTRACE) #define MAX_SIZE_TO_CHECK (LJMP_INSN_SIZE * sizeof(u32)) #define CHECK_JUMP_RANGE LJMP_INSN_SIZE - -struct klp_func_node { - struct list_head node; - struct list_head func_stack; - void *old_func; - struct arch_klp_data arch_data; -}; - -static LIST_HEAD(klp_func_list); - -static struct klp_func_node *klp_find_func_node(void *old_func) -{ - struct klp_func_node *func_node; - - list_for_each_entry(func_node, &klp_func_list, node) { - if (func_node->old_func == old_func) - return func_node; - } - - return NULL; -} #endif
#ifdef CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY @@ -425,7 +404,7 @@ int arch_klp_patch_func(struct klp_func *func) } }
- list_add_rcu(&func_node->node, &klp_func_list); + klp_add_func_node(func_node); }
list_add_rcu(&func->stack_node, &func_node->func_stack); @@ -463,7 +442,7 @@ int arch_klp_patch_func(struct klp_func *func) ERR_OUT: list_del_rcu(&func->stack_node); if (memory_flag) { - list_del_rcu(&func_node->node); + klp_del_func_node(func_node); }
return -EPERM; @@ -484,7 +463,7 @@ void arch_klp_unpatch_func(struct klp_func *func) insns[i] = func_node->arch_data.old_insns[i];
list_del_rcu(&func->stack_node); - list_del_rcu(&func_node->node); + klp_del_func_node(func_node);
for (i = 0; i < LJMP_INSN_SIZE; i++) patch_instruction((struct ppc_inst *)(((u32 *)pc) + i), diff --git a/arch/powerpc/kernel/livepatch_64.c b/arch/powerpc/kernel/livepatch_64.c index 30e4ba7db602..94decee6cb8d 100644 --- a/arch/powerpc/kernel/livepatch_64.c +++ b/arch/powerpc/kernel/livepatch_64.c @@ -38,27 +38,6 @@ defined(CONFIG_LIVEPATCH_WO_FTRACE) #define MAX_SIZE_TO_CHECK (LJMP_INSN_SIZE * sizeof(u32)) #define CHECK_JUMP_RANGE LJMP_INSN_SIZE - -struct klp_func_node { - struct list_head node; - struct list_head func_stack; - void *old_func; - struct arch_klp_data arch_data; -}; - -static LIST_HEAD(klp_func_list); - -static struct klp_func_node *klp_find_func_node(void *old_func) -{ - struct klp_func_node *func_node; - - list_for_each_entry(func_node, &klp_func_list, node) { - if (func_node->old_func == old_func) - return func_node; - } - - return NULL; -} #endif
#ifdef CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY @@ -469,7 +448,7 @@ int arch_klp_patch_func(struct klp_func *func) return -EPERM; } } - list_add_rcu(&func_node->node, &klp_func_list); + klp_add_func_node(func_node); }
list_add_rcu(&func->stack_node, &func_node->func_stack); @@ -495,7 +474,7 @@ int arch_klp_patch_func(struct klp_func *func) ERR_OUT: list_del_rcu(&func->stack_node); if (memory_flag) { - list_del_rcu(&func_node->node); + klp_del_func_node(func_node); }
return -EPERM; @@ -516,7 +495,7 @@ void arch_klp_unpatch_func(struct klp_func *func) insns[i] = func_node->arch_data.old_insns[i];
list_del_rcu(&func->stack_node); - list_del_rcu(&func_node->node); + klp_del_func_node(func_node);
for (i = 0; i < LJMP_INSN_SIZE; i++) patch_instruction((struct ppc_inst *)((u32 *)pc + i), diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c index 469a7e1323f5..640ce1053a87 100644 --- a/arch/x86/kernel/livepatch.c +++ b/arch/x86/kernel/livepatch.c @@ -31,31 +31,6 @@ #include <asm/nops.h> #include <asm/sections.h>
-#if defined (CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) || \ - defined (CONFIG_LIVEPATCH_WO_FTRACE) - -struct klp_func_node { - struct list_head node; - struct list_head func_stack; - void *old_func; - struct arch_klp_data arch_data; -}; - -static LIST_HEAD(klp_func_list); - -static struct klp_func_node *klp_find_func_node(void *old_func) -{ - struct klp_func_node *func_node; - - list_for_each_entry(func_node, &klp_func_list, node) { - if (func_node->old_func == old_func) - return func_node; - } - - return NULL; -} -#endif - #ifdef CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY /* * The instruction set on x86 is CISC. @@ -434,7 +409,7 @@ int arch_klp_patch_func(struct klp_func *func) if (ret) { return -EPERM; } - list_add_rcu(&func_node->node, &klp_func_list); + klp_add_func_node(func_node); }
list_add_rcu(&func->stack_node, &func_node->func_stack); @@ -458,7 +433,7 @@ void arch_klp_unpatch_func(struct klp_func *func) ip = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { list_del_rcu(&func->stack_node); - list_del_rcu(&func_node->node); + klp_del_func_node(func_node); new = klp_old_code(func_node->arch_data.old_code); } else { list_del_rcu(&func->stack_node); diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h index 229216646d50..5f88e6429484 100644 --- a/include/linux/livepatch.h +++ b/include/linux/livepatch.h @@ -223,6 +223,17 @@ int klp_enable_patch(struct klp_patch *); #elif defined(CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY) int klp_register_patch(struct klp_patch *patch); int klp_unregister_patch(struct klp_patch *patch); + +struct klp_func_node { + struct list_head node; + struct list_head func_stack; + void *old_func; + struct arch_klp_data arch_data; +}; + +struct klp_func_node *klp_find_func_node(const void *old_func); +void klp_add_func_node(struct klp_func_node *func_node); +void klp_del_func_node(struct klp_func_node *func_node); #endif
int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs, diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index 660a4b4f61e4..bfa9462f8f38 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -1248,6 +1248,30 @@ int __weak klp_check_calltrace(struct klp_patch *patch, int enable) return 0; }
+static LIST_HEAD(klp_func_list); + +struct klp_func_node *klp_find_func_node(const void *old_func) +{ + struct klp_func_node *func_node; + + list_for_each_entry(func_node, &klp_func_list, node) { + if (func_node->old_func == old_func) + return func_node; + } + + return NULL; +} + +void klp_add_func_node(struct klp_func_node *func_node) +{ + list_add_rcu(&func_node->node, &klp_func_list); +} + +void klp_del_func_node(struct klp_func_node *func_node) +{ + list_del_rcu(&func_node->node); +} + /* * This function is called from stop_machine() context. */
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
Introduce __weak arch_klp_mem_{alloc,free}, then reduce the duplicated arch_klp_mem_{prepare,recycle} implementations.
Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm/kernel/livepatch.c | 30 ---------------------- arch/arm64/kernel/livepatch.c | 30 ---------------------- arch/powerpc/kernel/livepatch_32.c | 30 ---------------------- arch/powerpc/kernel/livepatch_64.c | 33 ++++++++---------------- arch/x86/kernel/livepatch.c | 30 ---------------------- kernel/livepatch/core.c | 41 ++++++++++++++++++++++++++---- 6 files changed, 47 insertions(+), 147 deletions(-)
diff --git a/arch/arm/kernel/livepatch.c b/arch/arm/kernel/livepatch.c index d243f5f3e76a..6d53e8d250b0 100644 --- a/arch/arm/kernel/livepatch.c +++ b/arch/arm/kernel/livepatch.c @@ -499,33 +499,3 @@ int arch_klp_func_can_patch(struct klp_func *func) return 0; } #endif /* #ifdef CONFIG_ARM_MODULE_PLTS */ - -void arch_klp_mem_prepare(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func->func_node = kzalloc(sizeof(struct klp_func_node), - GFP_ATOMIC); - } - } -} - -void arch_klp_mem_recycle(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - struct klp_func_node *func_node; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func_node = func->func_node; - if (func_node && list_is_singular(&func_node->func_stack)) { - kfree(func_node); - func->func_node = NULL; - } - } - } -} diff --git a/arch/arm64/kernel/livepatch.c b/arch/arm64/kernel/livepatch.c index a4b64e74f722..3746c689a6f1 100644 --- a/arch/arm64/kernel/livepatch.c +++ b/arch/arm64/kernel/livepatch.c @@ -498,33 +498,3 @@ int arch_klp_func_can_patch(struct klp_func *func) return 0; } #endif - -void arch_klp_mem_prepare(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func->func_node = kzalloc(sizeof(struct klp_func_node), - GFP_ATOMIC); - } - } -} - -void arch_klp_mem_recycle(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - struct klp_func_node *func_node; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func_node = func->func_node; - if (func_node && list_is_singular(&func_node->func_stack)) { - kfree(func_node); - func->func_node = NULL; - } - } - } -} diff --git a/arch/powerpc/kernel/livepatch_32.c b/arch/powerpc/kernel/livepatch_32.c index e2b9e9b25477..638f18d88c33 100644 --- a/arch/powerpc/kernel/livepatch_32.c +++ b/arch/powerpc/kernel/livepatch_32.c @@ -515,34 +515,4 @@ int arch_klp_func_can_patch(struct klp_func *func) } return 0; } - -void arch_klp_mem_prepare(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func->func_node = kzalloc(sizeof(struct klp_func_node), - GFP_ATOMIC); - } - } -} - -void arch_klp_mem_recycle(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - struct klp_func_node *func_node; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func_node = func->func_node; - if (func_node && list_is_singular(&func_node->func_stack)) { - kfree(func_node); - func->func_node = NULL; - } - } - } -} #endif diff --git a/arch/powerpc/kernel/livepatch_64.c b/arch/powerpc/kernel/livepatch_64.c index 94decee6cb8d..c0d6fd5ead8e 100644 --- a/arch/powerpc/kernel/livepatch_64.c +++ b/arch/powerpc/kernel/livepatch_64.c @@ -579,32 +579,21 @@ int arch_klp_init_func(struct klp_object *obj, struct klp_func *func) return 0; }
-void arch_klp_mem_prepare(struct klp_patch *patch) +/* + * Trampoline would be stored in the allocated memory and it need + * executable permission, so ppc64 use 'module_alloc' but not 'kmalloc'. + */ +void *arch_klp_mem_alloc(size_t size) { - struct klp_object *obj; - struct klp_func *func; + void *mem = module_alloc(size);
- klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func->func_node = module_alloc(sizeof(struct klp_func_node)); - } - } + if (mem) + memset(mem, 0, size); /* initially clear the memory */ + return mem; }
-void arch_klp_mem_recycle(struct klp_patch *patch) +void arch_klp_mem_free(void *mem) { - struct klp_object *obj; - struct klp_func *func; - struct klp_func_node *func_node; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func_node = func->func_node; - if (func_node && list_is_singular(&func_node->func_stack)) { - module_memfree(func_node); - func->func_node = NULL; - } - } - } + module_memfree(mem); } #endif diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c index 640ce1053a87..02c869ea18fb 100644 --- a/arch/x86/kernel/livepatch.c +++ b/arch/x86/kernel/livepatch.c @@ -447,34 +447,4 @@ void arch_klp_unpatch_func(struct klp_func *func) /* replace the text with the new text */ text_poke((void *)ip, new, JMP_E9_INSN_SIZE); } - -void arch_klp_mem_prepare(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func->func_node = kzalloc(sizeof(struct klp_func_node), - GFP_ATOMIC); - } - } -} - -void arch_klp_mem_recycle(struct klp_patch *patch) -{ - struct klp_object *obj; - struct klp_func *func; - struct klp_func_node *func_node; - - klp_for_each_object(patch, obj) { - klp_for_each_func(obj, func) { - func_node = func->func_node; - if (func_node && list_is_singular(&func_node->func_stack)) { - kfree(func_node); - func->func_node = NULL; - } - } - } -} #endif diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index bfa9462f8f38..38c2b603b6a8 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -1327,12 +1327,43 @@ void __weak arch_klp_code_modify_post_process(void) { }
-void __weak arch_klp_mem_prepare(struct klp_patch *patch) +void __weak *arch_klp_mem_alloc(size_t size) { + return kzalloc(size, GFP_ATOMIC); }
-void __weak arch_klp_mem_recycle(struct klp_patch *patch) +void __weak arch_klp_mem_free(void *mem) { + kfree(mem); +} + +static void klp_mem_prepare(struct klp_patch *patch) +{ + struct klp_object *obj; + struct klp_func *func; + + klp_for_each_object(patch, obj) { + klp_for_each_func(obj, func) { + func->func_node = arch_klp_mem_alloc(sizeof(struct klp_func_node)); + } + } +} + +static void klp_mem_recycle(struct klp_patch *patch) +{ + struct klp_object *obj; + struct klp_func *func; + struct klp_func_node *func_node; + + klp_for_each_object(patch, obj) { + klp_for_each_func(obj, func) { + func_node = func->func_node; + if (func_node && list_is_singular(&func_node->func_stack)) { + arch_klp_mem_free(func_node); + func->func_node = NULL; + } + } + } }
static int __klp_disable_patch(struct klp_patch *patch) @@ -1361,7 +1392,7 @@ static int __klp_disable_patch(struct klp_patch *patch) if (ret) return ret;
- arch_klp_mem_recycle(patch); + klp_mem_recycle(patch); return 0; } #endif /* if defined(CONFIG_LIVEPATCH_PER_TASK_CONSISTENCY) */ @@ -1594,11 +1625,11 @@ static int __klp_enable_patch(struct klp_patch *patch) #endif
arch_klp_code_modify_prepare(); - arch_klp_mem_prepare(patch); + klp_mem_prepare(patch); ret = stop_machine(klp_try_enable_patch, &patch_data, cpu_online_mask); arch_klp_code_modify_post_process(); if (ret) { - arch_klp_mem_recycle(patch); + klp_mem_recycle(patch); return ret; }
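The __weak mechanism this patch relies on lets the generic kzalloc/kfree pair serve as defaults while an architecture (ppc64 here, which needs executable memory for its trampolines) supplies a strong override in its own object file. A minimal user-space sketch of the pattern, with hypothetical names:

/* weak_demo.c - user-space sketch of the __weak default/override pattern;
 * demo_mem_alloc() is an illustrative name, not the kernel API.
 * Build: gcc weak_demo.c -o weak_demo
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Weak default: generic zeroed allocation, like the kzalloc-based
 * __weak arch_klp_mem_alloc() above. */
__attribute__((weak)) void *demo_mem_alloc(size_t size)
{
	void *mem = malloc(size);

	if (mem)
		memset(mem, 0, size);	/* initially clear the memory */
	return mem;
}

/* An architecture needing special memory (ppc64 uses module_alloc() so the
 * trampoline is executable) would define a strong demo_mem_alloc() in its
 * own object file; the linker then prefers it over this weak default. */

int main(void)
{
	unsigned char *p = demo_mem_alloc(16);

	printf("first byte after alloc: %d\n", p ? p[0] : -1);
	free(p);
	return 0;
}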
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
Introduce 'arch_klp_save_old_code' in preparation for moving the func_node initialization code out of stop_machine.
Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm/include/asm/livepatch.h | 2 ++ arch/arm/kernel/livepatch.c | 28 ++++++++++++++++---------- arch/arm64/include/asm/livepatch.h | 2 ++ arch/arm64/kernel/livepatch.c | 30 ++++++++++++++++++---------- arch/powerpc/include/asm/livepatch.h | 3 +++ arch/powerpc/kernel/livepatch_32.c | 24 +++++++++++++++------- arch/powerpc/kernel/livepatch_64.c | 26 ++++++++++++++++-------- arch/x86/include/asm/livepatch.h | 2 ++ arch/x86/kernel/livepatch.c | 9 +++++++-- 9 files changed, 88 insertions(+), 38 deletions(-)
diff --git a/arch/arm/include/asm/livepatch.h b/arch/arm/include/asm/livepatch.h index abeccf19f9ca..4f1cf4c72097 100644 --- a/arch/arm/include/asm/livepatch.h +++ b/arch/arm/include/asm/livepatch.h @@ -51,6 +51,8 @@ struct arch_klp_data { #endif };
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func); + #endif
#endif /* _ASM_ARM_LIVEPATCH_H */ diff --git a/arch/arm/kernel/livepatch.c b/arch/arm/kernel/livepatch.c index 6d53e8d250b0..ffecb1f0c968 100644 --- a/arch/arm/kernel/livepatch.c +++ b/arch/arm/kernel/livepatch.c @@ -353,6 +353,23 @@ long arm_insn_read(void *addr, u32 *insnp) return ret; }
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func) +{ + long ret; +#ifdef CONFIG_ARM_MODULE_PLTS + int i; + + for (i = 0; i < LJMP_INSN_SIZE; i++) { + ret = arm_insn_read((u32 *)old_func + i, &arch_data->old_insns[i]); + if (ret) + break; + } +#else + ret = arm_insn_read(old_func, &arch_data->old_insn); +#endif + return ret; +} + int arch_klp_patch_func(struct klp_func *func) { struct klp_func_node *func_node; @@ -372,16 +389,7 @@ int arch_klp_patch_func(struct klp_func *func)
INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; -#ifdef CONFIG_ARM_MODULE_PLTS - for (i = 0; i < LJMP_INSN_SIZE; i++) { - ret = arm_insn_read((u32 *)func->old_func + i, - &func_node->arch_data.old_insns[i]); - if (ret) - break; - } -#else - ret = arm_insn_read(func->old_func, &func_node->arch_data.old_insn); -#endif + ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); if (ret) { return -EPERM; } diff --git a/arch/arm64/include/asm/livepatch.h b/arch/arm64/include/asm/livepatch.h index 11a972c596cf..a9bc7ce4cc6e 100644 --- a/arch/arm64/include/asm/livepatch.h +++ b/arch/arm64/include/asm/livepatch.h @@ -61,6 +61,8 @@ struct arch_klp_data { #endif };
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func); + #endif
#endif /* _ASM_ARM64_LIVEPATCH_H */ diff --git a/arch/arm64/kernel/livepatch.c b/arch/arm64/kernel/livepatch.c index 3746c689a6f1..6683cb0a28fa 100644 --- a/arch/arm64/kernel/livepatch.c +++ b/arch/arm64/kernel/livepatch.c @@ -331,6 +331,24 @@ int klp_check_calltrace(struct klp_patch *patch, int enable) } #endif
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func) +{ + long ret; +#ifdef CONFIG_ARM64_MODULE_PLTS + int i; + + for (i = 0; i < LJMP_INSN_SIZE; i++) { + ret = aarch64_insn_read(((u32 *)old_func) + i, + &arch_data->old_insns[i]); + if (ret) + break; + } +#else + ret = aarch64_insn_read(old_func, &arch_data->old_insn); +#endif + return ret; +} + int arch_klp_patch_func(struct klp_func *func) { struct klp_func_node *func_node; @@ -353,17 +371,7 @@ int arch_klp_patch_func(struct klp_func *func) INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func;
-#ifdef CONFIG_ARM64_MODULE_PLTS - for (i = 0; i < LJMP_INSN_SIZE; i++) { - ret = aarch64_insn_read(((u32 *)func->old_func) + i, - &func_node->arch_data.old_insns[i]); - if (ret) - break; - } -#else - ret = aarch64_insn_read((void *)func->old_func, - &func_node->arch_data.old_insn); -#endif + ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); if (ret) { return -EPERM; } diff --git a/arch/powerpc/include/asm/livepatch.h b/arch/powerpc/include/asm/livepatch.h index 5901e3069c07..fea12c6b915c 100644 --- a/arch/powerpc/include/asm/livepatch.h +++ b/arch/powerpc/include/asm/livepatch.h @@ -93,6 +93,7 @@ struct arch_klp_data {
#elif defined(CONFIG_PPC32)
+#define PPC32_INSN_SIZE 4 #define LJMP_INSN_SIZE 4 struct arch_klp_data { u32 old_insns[LJMP_INSN_SIZE]; @@ -100,6 +101,8 @@ struct arch_klp_data {
#endif /* CONFIG_PPC64 */
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func); + #endif /* CONFIG_LIVEPATCH_FTRACE */
#ifdef CONFIG_LIVEPATCH_STOP_MACHINE_CONSISTENCY diff --git a/arch/powerpc/kernel/livepatch_32.c b/arch/powerpc/kernel/livepatch_32.c index 638f18d88c33..ddc858294afb 100644 --- a/arch/powerpc/kernel/livepatch_32.c +++ b/arch/powerpc/kernel/livepatch_32.c @@ -378,6 +378,20 @@ static inline bool offset_in_range(unsigned long pc, unsigned long addr, return (offset >= -range && offset < range); }
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func) +{ + long ret; + int i; + + for (i = 0; i < LJMP_INSN_SIZE; i++) { + ret = copy_from_kernel_nofault(&arch_data->old_insns[i], + ((u32 *)old_func) + i, PPC32_INSN_SIZE); + if (ret) + break; + } + return ret; +} + int arch_klp_patch_func(struct klp_func *func) { struct klp_func_node *func_node; @@ -396,13 +410,9 @@ int arch_klp_patch_func(struct klp_func *func) memory_flag = 1; INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; - for (i = 0; i < LJMP_INSN_SIZE; i++) { - ret = copy_from_kernel_nofault(&func_node->arch_data.old_insns[i], - ((u32 *)func->old_func) + i, LJMP_INSN_SIZE); - if (ret) { - return -EPERM; - } - } + ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); + if (ret) + return -EPERM;
klp_add_func_node(func_node); } diff --git a/arch/powerpc/kernel/livepatch_64.c b/arch/powerpc/kernel/livepatch_64.c index c0d6fd5ead8e..6fa7c3c20528 100644 --- a/arch/powerpc/kernel/livepatch_64.c +++ b/arch/powerpc/kernel/livepatch_64.c @@ -424,11 +424,25 @@ int klp_check_calltrace(struct klp_patch *patch, int enable) #endif
#ifdef CONFIG_LIVEPATCH_WO_FTRACE + +long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func) +{ + long ret; + int i; + + for (i = 0; i < LJMP_INSN_SIZE; i++) { + ret = copy_from_kernel_nofault(&arch_data->old_insns[i], + ((u32 *)old_func) + i, PPC64_INSN_SIZE); + if (ret) + break; + } + return ret; +} + int arch_klp_patch_func(struct klp_func *func) { struct klp_func_node *func_node; unsigned long pc, new_addr; - int i; int memory_flag = 0; long ret;
@@ -441,13 +455,9 @@ int arch_klp_patch_func(struct klp_func *func) memory_flag = 1; INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; - for (i = 0; i < LJMP_INSN_SIZE; i++) { - ret = copy_from_kernel_nofault(&func_node->arch_data.old_insns[i], - ((u32 *)func->old_func) + i, 4); - if (ret) { - return -EPERM; - } - } + ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); + if (ret) + return -EPERM; klp_add_func_node(func_node); }
diff --git a/arch/x86/include/asm/livepatch.h b/arch/x86/include/asm/livepatch.h index c89f20576ca7..e23c2da3c323 100644 --- a/arch/x86/include/asm/livepatch.h +++ b/arch/x86/include/asm/livepatch.h @@ -39,6 +39,8 @@ struct arch_klp_data { unsigned char old_code[JMP_E9_INSN_SIZE]; };
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func); + #endif
#endif /* _ASM_X86_LIVEPATCH_H */ diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c index 02c869ea18fb..d5c03869df00 100644 --- a/arch/x86/kernel/livepatch.c +++ b/arch/x86/kernel/livepatch.c @@ -388,6 +388,12 @@ void arch_klp_code_modify_post_process(void) mutex_unlock(&text_mutex); }
+long arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func) +{ + return copy_from_kernel_nofault(arch_data->old_code, + old_func, JMP_E9_INSN_SIZE); +} + int arch_klp_patch_func(struct klp_func *func) { struct klp_func_node *func_node; @@ -404,8 +410,7 @@ int arch_klp_patch_func(struct klp_func *func)
INIT_LIST_HEAD(&func_node->func_stack); func_node->old_func = func->old_func; - ret = copy_from_kernel_nofault(func_node->arch_data.old_code, - (void *)ip, JMP_E9_INSN_SIZE); + ret = arch_klp_save_old_code(&func_node->arch_data, (void *)ip); if (ret) { return -EPERM; }
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
There are several issues in klp_mem_{prepare,recycle}:

1. Memory leak when saving the old code fails:
   __klp_enable_patch
     klp_mem_prepare
       klp_for_each_func(patch)
         func->func_node = kzalloc(...)            <-- 1. Allocate memory
     stop_machine(klp_try_enable_patch, ...)
       enable_patch
         arch_klp_patch_func
           INIT_LIST_HEAD(&func_node->func_stack); <-- 2. func_stack list initialized as empty
           copy_from_kernel_nofault                <-- 3. Saving the old code fails here
     klp_mem_recycle
       klp_for_each_func(patch)                    <-- 4. Here the func_stack list is empty but not singular, so 'func_node' is never freed!
         if (func_node && list_is_singular(&func_node->func_stack))
             kfree(func_node);
2. Memory leak in the following scenario:
   Suppose patches P1/P2 patch the same old function; then:
   enable P1 --> enable P2 --> disable P2 --> disable P1

3. UAF (use-after-free) in the following scenario:
   Suppose patches P1/P2 patch the same old function; then:
   enable P1 --> enable P2 --> disable P1 --> disable P2
The above problems were introduced in commit ec7ce700674f ("[Huawei] livepatch: put memory alloc and free out stop machine"): before it, 'func_node' was kept only in 'klp_func_list'; after it, 'func_node' is kept both in 'klp_func_list' and in 'struct klp_func', and the conditions for freeing the memory of 'func_node' are somewhat wrong.

To resolve this, add the missing checks and do the func_node initialization in klp_mem_prepare.
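The crux of case 1 is the difference between an empty list and a singular list. A minimal stand-alone sketch (not kernel code; it re-implements the two predicates with the same semantics as <linux/list.h>) makes the missed condition visible:

/* list_demo.c - why the old "list_is_singular -> kfree" condition leaks */
#include <stdbool.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

static bool list_empty(const struct list_head *h) { return h->next == h; }

/* singular == exactly one entry: non-empty, and first == last */
static bool list_is_singular(const struct list_head *h)
{
	return !list_empty(h) && (h->next == h->prev);
}

int main(void)
{
	struct list_head func_stack;

	INIT_LIST_HEAD(&func_stack);
	/* After a failed arch_klp_patch_func(), func_stack stays empty, so
	 * list_is_singular() is false, the kfree() never fires and the node
	 * leaks, exactly as in step 4 of the trace above. */
	printf("empty=%d singular=%d\n",
	       list_empty(&func_stack), list_is_singular(&func_stack));
	return 0;
}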
Fixes: ("000c0197ed37 livepatch: put memory alloc and free out stop machine") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm/kernel/livepatch.c | 22 +--------- arch/arm64/kernel/livepatch.c | 30 +------------ arch/powerpc/kernel/livepatch_32.c | 25 +---------- arch/powerpc/kernel/livepatch_64.c | 24 +--------- arch/x86/kernel/livepatch.c | 21 +-------- kernel/livepatch/core.c | 70 +++++++++++++++++++++++++----- kernel/livepatch/patch.c | 5 ++- 7 files changed, 74 insertions(+), 123 deletions(-)
diff --git a/arch/arm/kernel/livepatch.c b/arch/arm/kernel/livepatch.c index ffecb1f0c968..4b07e73ad37b 100644 --- a/arch/arm/kernel/livepatch.c +++ b/arch/arm/kernel/livepatch.c @@ -375,32 +375,15 @@ int arch_klp_patch_func(struct klp_func *func) struct klp_func_node *func_node; unsigned long pc, new_addr; u32 insn; - long ret; #ifdef CONFIG_ARM_MODULE_PLTS int i; u32 insns[LJMP_INSN_SIZE]; #endif
- func_node = klp_find_func_node(func->old_func); - if (!func_node) { - func_node = func->func_node; - if (!func_node) - return -ENOMEM; - - INIT_LIST_HEAD(&func_node->func_stack); - func_node->old_func = func->old_func; - ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); - if (ret) { - return -EPERM; - } - klp_add_func_node(func_node); - } - + func_node = func->func_node; list_add_rcu(&func->stack_node, &func_node->func_stack); - pc = (unsigned long)func->old_func; new_addr = (unsigned long)func->new_func; - #ifdef CONFIG_ARM_MODULE_PLTS if (!offset_in_range(pc, new_addr, SZ_32M)) { /* @@ -438,7 +421,7 @@ void arch_klp_unpatch_func(struct klp_func *func) u32 insns[LJMP_INSN_SIZE]; #endif
- func_node = klp_find_func_node(func->old_func); + func_node = func->func_node; pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { #ifdef CONFIG_ARM_MODULE_PLTS @@ -451,7 +434,6 @@ void arch_klp_unpatch_func(struct klp_func *func) __patch_text((void *)pc, insn); #endif list_del_rcu(&func->stack_node); - klp_del_func_node(func_node); } else { list_del_rcu(&func->stack_node); next_func = list_first_or_null_rcu(&func_node->func_stack, diff --git a/arch/arm64/kernel/livepatch.c b/arch/arm64/kernel/livepatch.c index 6683cb0a28fa..2c292008440c 100644 --- a/arch/arm64/kernel/livepatch.c +++ b/arch/arm64/kernel/livepatch.c @@ -354,35 +354,15 @@ int arch_klp_patch_func(struct klp_func *func) struct klp_func_node *func_node; unsigned long pc, new_addr; u32 insn; - u32 memory_flag = 0; #ifdef CONFIG_ARM64_MODULE_PLTS int i; u32 insns[LJMP_INSN_SIZE]; #endif - int ret = 0; - - func_node = klp_find_func_node(func->old_func); - if (!func_node) { - func_node = func->func_node; - if (!func_node) - return -ENOMEM; - memory_flag = 1; - - INIT_LIST_HEAD(&func_node->func_stack); - func_node->old_func = func->old_func; - - ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); - if (ret) { - return -EPERM; - } - klp_add_func_node(func_node); - }
+ func_node = func->func_node; list_add_rcu(&func->stack_node, &func_node->func_stack); - pc = (unsigned long)func->old_func; new_addr = (unsigned long)func->new_func; - #ifdef CONFIG_ARM64_MODULE_PLTS if (offset_in_range(pc, new_addr, SZ_128M)) { insn = aarch64_insn_gen_branch_imm(pc, new_addr, @@ -410,9 +390,6 @@ int arch_klp_patch_func(struct klp_func *func)
ERR_OUT: list_del_rcu(&func->stack_node); - if (memory_flag) { - klp_del_func_node(func_node); - }
return -EPERM; } @@ -427,10 +404,8 @@ void arch_klp_unpatch_func(struct klp_func *func) int i; u32 insns[LJMP_INSN_SIZE]; #endif - func_node = klp_find_func_node(func->old_func); - if (WARN_ON(!func_node)) - return;
+ func_node = func->func_node; pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { #ifdef CONFIG_ARM64_MODULE_PLTS @@ -440,7 +415,6 @@ void arch_klp_unpatch_func(struct klp_func *func) insn = func_node->arch_data.old_insn; #endif list_del_rcu(&func->stack_node); - klp_del_func_node(func_node);
#ifdef CONFIG_ARM64_MODULE_PLTS for (i = 0; i < LJMP_INSN_SIZE; i++) { diff --git a/arch/powerpc/kernel/livepatch_32.c b/arch/powerpc/kernel/livepatch_32.c index ddc858294afb..99acabd730e0 100644 --- a/arch/powerpc/kernel/livepatch_32.c +++ b/arch/powerpc/kernel/livepatch_32.c @@ -397,28 +397,11 @@ int arch_klp_patch_func(struct klp_func *func) struct klp_func_node *func_node; unsigned long pc, new_addr; long ret; - int memory_flag = 0; int i; u32 insns[LJMP_INSN_SIZE];
- func_node = klp_find_func_node(func->old_func); - if (!func_node) { - func_node = func->func_node; - if (!func_node) - return -ENOMEM; - - memory_flag = 1; - INIT_LIST_HEAD(&func_node->func_stack); - func_node->old_func = func->old_func; - ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); - if (ret) - return -EPERM; - - klp_add_func_node(func_node); - } - + func_node = func->func_node; list_add_rcu(&func->stack_node, &func_node->func_stack); - pc = (unsigned long)func->old_func; new_addr = (unsigned long)func->new_func; if (offset_in_range(pc, new_addr, SZ_32M)) { @@ -451,9 +434,6 @@ int arch_klp_patch_func(struct klp_func *func)
ERR_OUT: list_del_rcu(&func->stack_node); - if (memory_flag) { - klp_del_func_node(func_node); - }
return -EPERM; } @@ -466,14 +446,13 @@ void arch_klp_unpatch_func(struct klp_func *func) u32 insns[LJMP_INSN_SIZE]; int i;
- func_node = klp_find_func_node(func->old_func); + func_node = func->func_node; pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { for (i = 0; i < LJMP_INSN_SIZE; i++) insns[i] = func_node->arch_data.old_insns[i];
list_del_rcu(&func->stack_node); - klp_del_func_node(func_node);
for (i = 0; i < LJMP_INSN_SIZE; i++) patch_instruction((struct ppc_inst *)(((u32 *)pc) + i), diff --git a/arch/powerpc/kernel/livepatch_64.c b/arch/powerpc/kernel/livepatch_64.c index 6fa7c3c20528..b319675afd4c 100644 --- a/arch/powerpc/kernel/livepatch_64.c +++ b/arch/powerpc/kernel/livepatch_64.c @@ -443,29 +443,13 @@ int arch_klp_patch_func(struct klp_func *func) { struct klp_func_node *func_node; unsigned long pc, new_addr; - int memory_flag = 0; long ret;
- func_node = klp_find_func_node(func->old_func); - if (!func_node) { - func_node = func->func_node; - if (!func_node) - return -ENOMEM; - - memory_flag = 1; - INIT_LIST_HEAD(&func_node->func_stack); - func_node->old_func = func->old_func; - ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); - if (ret) - return -EPERM; - klp_add_func_node(func_node); - } - + func_node = func->func_node; list_add_rcu(&func->stack_node, &func_node->func_stack);
pc = (unsigned long)func->old_func; new_addr = (unsigned long)func->new_func; - ret = livepatch_create_branch(pc, (unsigned long)&func_node->arch_data.trampoline, new_addr, func->old_mod); if (ret) @@ -483,9 +467,6 @@ int arch_klp_patch_func(struct klp_func *func)
ERR_OUT: list_del_rcu(&func->stack_node); - if (memory_flag) { - klp_del_func_node(func_node); - }
return -EPERM; } @@ -498,14 +479,13 @@ void arch_klp_unpatch_func(struct klp_func *func) u32 insns[LJMP_INSN_SIZE]; int i;
- func_node = klp_find_func_node(func->old_func); + func_node = func->func_node; pc = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { for (i = 0; i < LJMP_INSN_SIZE; i++) insns[i] = func_node->arch_data.old_insns[i];
list_del_rcu(&func->stack_node); - klp_del_func_node(func_node);
for (i = 0; i < LJMP_INSN_SIZE; i++) patch_instruction((struct ppc_inst *)((u32 *)pc + i), diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c index d5c03869df00..b3e891efa113 100644 --- a/arch/x86/kernel/livepatch.c +++ b/arch/x86/kernel/livepatch.c @@ -399,26 +399,10 @@ int arch_klp_patch_func(struct klp_func *func) struct klp_func_node *func_node; unsigned long ip, new_addr; void *new; - long ret;
- func_node = klp_find_func_node(func->old_func); + func_node = func->func_node; ip = (unsigned long)func->old_func; - if (!func_node) { - func_node = func->func_node; - if (!func_node) - return -ENOMEM; - - INIT_LIST_HEAD(&func_node->func_stack); - func_node->old_func = func->old_func; - ret = arch_klp_save_old_code(&func_node->arch_data, (void *)ip); - if (ret) { - return -EPERM; - } - klp_add_func_node(func_node); - } - list_add_rcu(&func->stack_node, &func_node->func_stack); - new_addr = (unsigned long)func->new_func; /* replace the text with the new text */ new = klp_jmp_code(ip, new_addr); @@ -434,11 +418,10 @@ void arch_klp_unpatch_func(struct klp_func *func) unsigned long ip, new_addr; void *new;
- func_node = klp_find_func_node(func->old_func); + func_node = func->func_node; ip = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { list_del_rcu(&func->stack_node); - klp_del_func_node(func_node); new = klp_old_code(func_node->arch_data.old_code); } else { list_del_rcu(&func->stack_node); diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index 38c2b603b6a8..2da8b922278a 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -1133,6 +1133,7 @@ static void klp_init_func_early(struct klp_object *obj, { kobject_init(&func->kobj, &klp_ktype_func); list_add_tail(&func->node, &obj->func_list); + func->func_node = NULL; }
static void klp_init_object_early(struct klp_patch *patch, @@ -1337,31 +1338,79 @@ void __weak arch_klp_mem_free(void *mem) kfree(mem); }
-static void klp_mem_prepare(struct klp_patch *patch) +long __weak arch_klp_save_old_code(struct arch_klp_data *arch_data, void *old_func) +{ + return -ENOSYS; +} + +static struct klp_func_node *func_node_alloc(struct klp_func *func) +{ + long ret; + struct klp_func_node *func_node = NULL; + + func_node = klp_find_func_node(func->old_func); + if (func_node) /* The old_func has ever been patched */ + return func_node; + func_node = arch_klp_mem_alloc(sizeof(struct klp_func_node)); + if (func_node) { + INIT_LIST_HEAD(&func_node->func_stack); + func_node->old_func = func->old_func; + /* + * Module which contains 'old_func' would not be removed because + * it's reference count has been held during registration. + * But it's not in stop_machine context here, 'old_func' should + * not be modified as saving old code. + */ + ret = arch_klp_save_old_code(&func_node->arch_data, func->old_func); + if (ret) { + arch_klp_mem_free(func_node); + pr_err("save old code failed, ret=%ld\n", ret); + return NULL; + } + klp_add_func_node(func_node); + } + return func_node; +} + +static void func_node_free(struct klp_func *func) +{ + struct klp_func_node *func_node; + + func_node = func->func_node; + if (func_node) { + func->func_node = NULL; + if (list_empty(&func_node->func_stack)) { + klp_del_func_node(func_node); + arch_klp_mem_free(func_node); + } + } +} + +static int klp_mem_prepare(struct klp_patch *patch) { struct klp_object *obj; struct klp_func *func;
klp_for_each_object(patch, obj) { klp_for_each_func(obj, func) { - func->func_node = arch_klp_mem_alloc(sizeof(struct klp_func_node)); + func->func_node = func_node_alloc(func); + if (func->func_node == NULL) { + pr_err("alloc func_node failed\n"); + return -ENOMEM; + } } } + return 0; }
static void klp_mem_recycle(struct klp_patch *patch) { struct klp_object *obj; struct klp_func *func; - struct klp_func_node *func_node;
klp_for_each_object(patch, obj) { klp_for_each_func(obj, func) { - func_node = func->func_node; - if (func_node && list_is_singular(&func_node->func_stack)) { - arch_klp_mem_free(func_node); - func->func_node = NULL; - } + func_node_free(func); } } } @@ -1625,8 +1674,9 @@ static int __klp_enable_patch(struct klp_patch *patch) #endif
arch_klp_code_modify_prepare(); - klp_mem_prepare(patch); - ret = stop_machine(klp_try_enable_patch, &patch_data, cpu_online_mask); + ret = klp_mem_prepare(patch); + if (ret == 0) + ret = stop_machine(klp_try_enable_patch, &patch_data, cpu_online_mask); arch_klp_code_modify_post_process(); if (ret) { klp_mem_recycle(patch); diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c index 28e0de4edd72..6515b8e99829 100644 --- a/kernel/livepatch/patch.c +++ b/kernel/livepatch/patch.c @@ -257,6 +257,8 @@ static void klp_unpatch_func(struct klp_func *func) return; if (WARN_ON(!func->old_func)) return; + if (WARN_ON(!func->func_node)) + return;
arch_klp_unpatch_func(func);
@@ -269,9 +271,10 @@ static inline int klp_patch_func(struct klp_func *func)
if (WARN_ON(!func->old_func)) return -EINVAL; - if (WARN_ON(func->patched)) return -EINVAL; + if (WARN_ON(!func->func_node)) + return -EINVAL;
ret = arch_klp_patch_func(func); if (!ret)
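As a rough user-space analogue of the allocate-once, free-when-empty func_node scheme introduced above (a minimal sketch; all names here are hypothetical, and a plain reference counter stands in for the kernel's func_stack list):

#include <stdlib.h>

/* Hypothetical stand-in for struct klp_func_node: one node per patched
 * function, shared by every patch that hooks that function. */
struct demo_func_node {
	void *old_func;
	int refs;			/* stands in for the func_stack list */
	struct demo_func_node *next;
};

static struct demo_func_node *node_list;

static struct demo_func_node *find_node(void *old_func)
{
	struct demo_func_node *n;

	for (n = node_list; n; n = n->next)
		if (n->old_func == old_func)
			return n;
	return NULL;
}

/* Allocate only the first time old_func is patched; later patches of
 * the same function reuse the node (cf. func_node_alloc()). */
static struct demo_func_node *node_alloc(void *old_func)
{
	struct demo_func_node *n = find_node(old_func);

	if (!n) {
		n = calloc(1, sizeof(*n));
		if (!n)
			return NULL;
		n->old_func = old_func;
		n->next = node_list;
		node_list = n;
	}
	n->refs++;
	return n;
}

/* Drop one user; free only when nobody still references the node
 * (cf. func_node_free() checking list_empty(&func_node->func_stack)). */
static void node_free(struct demo_func_node *n)
{
	struct demo_func_node **pp;

	if (--n->refs > 0)
		return;
	for (pp = &node_list; *pp; pp = &(*pp)->next)
		if (*pp == n) {
			*pp = n->next;
			break;
		}
	free(n);
}

int main(void)
{
	int f1, f2;
	struct demo_func_node *a = node_alloc(&f1);	/* allocates */
	struct demo_func_node *b = node_alloc(&f1);	/* reuses: a == b */
	struct demo_func_node *c = node_alloc(&f2);	/* second node */

	node_free(a);	/* f1's node still has one user */
	node_free(b);	/* last user gone: node is freed */
	node_free(c);
	return 0;
}

Every klp_func that targets the same old_func shares one node, and the node is torn down only when its last user drops it; that is the invariant func_node_alloc() and func_node_free() maintain via klp_find_func_node() and the list_empty() check.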
From: Zheng Yejian zhengyejian1@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4UAQ1
--------------------------------
Refer to the following code: 'strncpy' stops copying as soon as it encounters a NUL character. For example, when 'code' is "53 be 00 0a 05", 'old_code' becomes "53 be 00 00 00".
static void *klp_old_code(unsigned char *code)
{
	static unsigned char old_code[JMP_E9_INSN_SIZE];

	strncpy(old_code, code, JMP_E9_INSN_SIZE);
	return old_code;
}
As a result, the original instructions cannot be restored completely, and the system behaves abnormally.
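A minimal user-space demonstration of the failure mode (illustrative only, not the kernel code; JMP_E9_INSN_SIZE is taken to be 5 here, the length of an x86 "jmp rel32"):

#include <stdio.h>
#include <string.h>

#define JMP_E9_INSN_SIZE 5	/* length of "e9 xx xx xx xx" */

int main(void)
{
	/* Instruction bytes with an embedded 0x00, as in the example above. */
	unsigned char code[JMP_E9_INSN_SIZE] = { 0x53, 0xbe, 0x00, 0x0a, 0x05 };
	unsigned char via_strncpy[JMP_E9_INSN_SIZE];
	unsigned char via_memcpy[JMP_E9_INSN_SIZE];
	int i;

	/* strncpy() treats its input as a C string: it stops at the first
	 * NUL byte and zero-pads the rest, corrupting binary data. */
	strncpy((char *)via_strncpy, (const char *)code, JMP_E9_INSN_SIZE);

	/* memcpy() copies every byte regardless of content. */
	memcpy(via_memcpy, code, JMP_E9_INSN_SIZE);

	for (i = 0; i < JMP_E9_INSN_SIZE; i++)
		printf("%02x %02x\n", via_strncpy[i], via_memcpy[i]);
	return 0;
}

The patch below sidesteps the copy entirely: since the previous change keeps the saved bytes in func_node->arch_data.old_code for the whole lifetime of the patch, arch_klp_unpatch_func() can return that buffer directly.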
Fixes: f5a6746743d8 ("livepatch/x86: support livepatch without ftrace") Suggested-by: Xu Kuohai xukuohai@huawei.com Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/livepatch.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c index b3e891efa113..2a541c7de167 100644 --- a/arch/x86/kernel/livepatch.c +++ b/arch/x86/kernel/livepatch.c @@ -367,14 +367,6 @@ static void *klp_jmp_code(unsigned long ip, unsigned long addr) return text_gen_insn(JMP32_INSN_OPCODE, (void *)ip, (void *)addr); }
-static void *klp_old_code(unsigned char *code) -{ - static unsigned char old_code[JMP_E9_INSN_SIZE]; - - strncpy(old_code, code, JMP_E9_INSN_SIZE); - return old_code; -} - void arch_klp_code_modify_prepare(void) __acquires(&text_mutex) { @@ -422,7 +414,7 @@ void arch_klp_unpatch_func(struct klp_func *func) ip = (unsigned long)func_node->old_func; if (list_is_singular(&func_node->func_stack)) { list_del_rcu(&func->stack_node); - new = klp_old_code(func_node->arch_data.old_code); + new = func_node->arch_data.old_code; } else { list_del_rcu(&func->stack_node); next_func = list_first_or_null_rcu(&func_node->func_stack,
From: Jingxian He hejingxian@huawei.com
euleros inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4UCEQ CVE: NA
-------------------------------- When the pinmemory setting string is NULL, return directly without setting the pinmemory address.
Signed-off-by: Jingxian He hejingxian@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- mm/pin_mem.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/pin_mem.c b/mm/pin_mem.c index 34fe373c5fcc..d37938bcdc97 100644 --- a/mm/pin_mem.c +++ b/mm/pin_mem.c @@ -1092,9 +1092,12 @@ static int __init parse_pin_memory(char *cmdline) { char *cur = cmdline;
+ if (!cmdline) + return 0; + pin_mem_len = memparse(cmdline, &cur); if (cmdline == cur) { - pr_warn("crashkernel: memory value expected\n"); + pr_warn("pinmem: memory value expected\n"); return -EINVAL; }
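For illustration, a user-space sketch of the guarded parser (memparse_demo() is a simplified stand-in for the kernel's memparse(); in the kernel, an early parameter handler can be invoked with a NULL argument when the option appears on the command line without a value):

#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for memparse(): parse a number with an optional
 * K/M/G suffix; *retptr ends up where parsing stopped. */
static unsigned long long memparse_demo(const char *ptr, char **retptr)
{
	unsigned long long ret = strtoull(ptr, retptr, 0);

	switch (**retptr) {
	case 'G': ret <<= 10;	/* fall through */
	case 'M': ret <<= 10;	/* fall through */
	case 'K': ret <<= 10;
		(*retptr)++;
	default:
		break;
	}
	return ret;
}

static int parse_pin_memory(char *cmdline)
{
	char *cur = cmdline;
	unsigned long long pin_mem_len;

	if (!cmdline)	/* option given with no value: nothing to parse */
		return 0;

	pin_mem_len = memparse_demo(cmdline, &cur);
	if (cmdline == cur) {	/* no number was consumed */
		fprintf(stderr, "pinmem: memory value expected\n");
		return -1;
	}
	printf("pin_mem_len = %llu\n", pin_mem_len);
	return 0;
}

int main(void)
{
	parse_pin_memory(NULL);		/* returns 0 without dereferencing NULL */
	parse_pin_memory("100M");	/* pin_mem_len = 104857600 */
	parse_pin_memory("abc");	/* pinmem: memory value expected */
	return 0;
}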
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
This patch contains basic architecture support, including boot code, devicetree support, memory management and DMA support, process and signal management, time management, interrupt handling, syscalls, clocks, loadable module support, ptrace support, headers and library code.
Now, it works on SW3231 and SW831.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/sw_64/Kconfig | 686 ++++++ arch/sw_64/Kconfig.debug | 46 + arch/sw_64/Makefile | 70 + arch/sw_64/Makefile.postlink | 36 + arch/sw_64/boot/.gitignore | 2 + arch/sw_64/boot/Makefile | 29 + arch/sw_64/boot/dts/Makefile | 21 + arch/sw_64/boot/dts/chip3.dts | 195 ++ arch/sw_64/boot/dts/chip_vt.dts | 38 + arch/sw_64/chip/Makefile | 2 + arch/sw_64/chip/chip3/Makefile | 8 + arch/sw_64/chip/chip3/chip.c | 795 +++++++ arch/sw_64/chip/chip3/cpufreq_debugfs.c | 153 ++ arch/sw_64/chip/chip3/i2c-lib.c | 425 ++++ arch/sw_64/chip/chip3/irq_chip.c | 96 + arch/sw_64/chip/chip3/msi.c | 471 ++++ arch/sw_64/chip/chip3/pci-quirks.c | 247 +++ arch/sw_64/chip/chip3/vt_msi.c | 150 ++ arch/sw_64/defconfig | 73 + arch/sw_64/include/asm/Kbuild | 22 + arch/sw_64/include/asm/a.out-core.h | 80 + arch/sw_64/include/asm/a.out.h | 16 + arch/sw_64/include/asm/acenv.h | 40 + arch/sw_64/include/asm/acpi.h | 97 + arch/sw_64/include/asm/agp.h | 19 + arch/sw_64/include/asm/asm-offsets.h | 7 + arch/sw_64/include/asm/asm-prototypes.h | 23 + arch/sw_64/include/asm/ast2400.h | 168 ++ arch/sw_64/include/asm/atomic.h | 373 ++++ arch/sw_64/include/asm/barrier.h | 24 + arch/sw_64/include/asm/bitops.h | 470 ++++ arch/sw_64/include/asm/bug.h | 8 + arch/sw_64/include/asm/bugs.h | 9 + arch/sw_64/include/asm/cache.h | 13 + arch/sw_64/include/asm/cacheflush.h | 95 + arch/sw_64/include/asm/checksum.h | 74 + arch/sw_64/include/asm/chip3_io.h | 315 +++ arch/sw_64/include/asm/cmpxchg.h | 72 + arch/sw_64/include/asm/compiler.h | 7 + arch/sw_64/include/asm/console.h | 11 + arch/sw_64/include/asm/core.h | 48 + arch/sw_64/include/asm/cpu.h | 1 + arch/sw_64/include/asm/cputime.h | 7 + arch/sw_64/include/asm/current.h | 10 + arch/sw_64/include/asm/debug.h | 27 + arch/sw_64/include/asm/delay.h | 11 + arch/sw_64/include/asm/device.h | 13 + arch/sw_64/include/asm/div64.h | 7 + arch/sw_64/include/asm/dma-direct.h | 15 + arch/sw_64/include/asm/dma-mapping.h | 12 + arch/sw_64/include/asm/dma.h | 356 +++ arch/sw_64/include/asm/dmi.h | 30 + arch/sw_64/include/asm/early_ioremap.h | 30 + arch/sw_64/include/asm/efi.h | 38 + arch/sw_64/include/asm/elf.h | 170 ++ arch/sw_64/include/asm/emergency-restart.h | 7 + arch/sw_64/include/asm/exec.h | 7 + arch/sw_64/include/asm/extable.h | 55 + arch/sw_64/include/asm/floppy.h | 116 + arch/sw_64/include/asm/fpu.h | 91 + arch/sw_64/include/asm/ftrace.h | 40 + arch/sw_64/include/asm/futex.h | 133 ++ arch/sw_64/include/asm/hardirq.h | 24 + arch/sw_64/include/asm/hcall.h | 41 + arch/sw_64/include/asm/hmcall.h | 205 ++ arch/sw_64/include/asm/hugetlb.h | 25 + arch/sw_64/include/asm/hw_init.h | 180 ++ arch/sw_64/include/asm/hw_irq.h | 16 + arch/sw_64/include/asm/insn.h | 96 + arch/sw_64/include/asm/io.h | 291 +++ arch/sw_64/include/asm/irq.h | 31 + arch/sw_64/include/asm/irq_impl.h | 47 + arch/sw_64/include/asm/irq_regs.h | 7 + arch/sw_64/include/asm/irqflags.h | 63 + arch/sw_64/include/asm/jump_label.h | 50 + arch/sw_64/include/asm/kdebug.h | 15 + arch/sw_64/include/asm/kexec.h | 70 + arch/sw_64/include/asm/kgdb.h | 68 + arch/sw_64/include/asm/kmap_types.h | 15 + arch/sw_64/include/asm/kprobes.h | 71 + arch/sw_64/include/asm/kvm_asm.h | 14 + arch/sw_64/include/asm/kvm_cma.h | 11 + arch/sw_64/include/asm/kvm_emulate.h | 46 + arch/sw_64/include/asm/kvm_host.h | 119 + arch/sw_64/include/asm/kvm_mmio.h | 17 + 
arch/sw_64/include/asm/kvm_para.h | 26 + arch/sw_64/include/asm/kvm_timer.h | 9 + arch/sw_64/include/asm/linkage.h | 9 + arch/sw_64/include/asm/local.h | 125 ++ arch/sw_64/include/asm/local64.h | 7 + arch/sw_64/include/asm/memory.h | 34 + arch/sw_64/include/asm/mmu.h | 10 + arch/sw_64/include/asm/mmu_context.h | 218 ++ arch/sw_64/include/asm/mmzone.h | 47 + arch/sw_64/include/asm/module.h | 23 + arch/sw_64/include/asm/msi.h | 47 + arch/sw_64/include/asm/numa.h | 34 + arch/sw_64/include/asm/page.h | 63 + arch/sw_64/include/asm/param.h | 11 + arch/sw_64/include/asm/parport.h | 19 + arch/sw_64/include/asm/pci.h | 151 ++ arch/sw_64/include/asm/percpu.h | 19 + arch/sw_64/include/asm/perf_event.h | 7 + arch/sw_64/include/asm/pgalloc.h | 44 + arch/sw_64/include/asm/pgtable-4level.h | 32 + arch/sw_64/include/asm/pgtable.h | 634 ++++++ arch/sw_64/include/asm/platform.h | 17 + arch/sw_64/include/asm/preempt.h | 7 + arch/sw_64/include/asm/processor.h | 130 ++ arch/sw_64/include/asm/ptrace.h | 49 + arch/sw_64/include/asm/seccomp.h | 15 + arch/sw_64/include/asm/sections.h | 8 + arch/sw_64/include/asm/segment.h | 7 + arch/sw_64/include/asm/serial.h | 16 + arch/sw_64/include/asm/setup.h | 46 + arch/sw_64/include/asm/sfp-machine.h | 69 + arch/sw_64/include/asm/shmparam.h | 7 + arch/sw_64/include/asm/signal.h | 22 + arch/sw_64/include/asm/smp.h | 181 ++ arch/sw_64/include/asm/socket.h | 11 + arch/sw_64/include/asm/sparsemem.h | 9 + arch/sw_64/include/asm/special_insns.h | 20 + arch/sw_64/include/asm/spinlock.h | 24 + arch/sw_64/include/asm/spinlock_types.h | 8 + arch/sw_64/include/asm/string.h | 50 + arch/sw_64/include/asm/suspend.h | 48 + arch/sw_64/include/asm/sw64_init.h | 45 + arch/sw_64/include/asm/sw64io.h | 115 + arch/sw_64/include/asm/switch_to.h | 26 + arch/sw_64/include/asm/syscall.h | 75 + arch/sw_64/include/asm/tc.h | 16 + arch/sw_64/include/asm/termios.h | 81 + arch/sw_64/include/asm/thread_info.h | 133 ++ arch/sw_64/include/asm/timex.h | 24 + arch/sw_64/include/asm/tlb.h | 18 + arch/sw_64/include/asm/tlbflush.h | 119 + arch/sw_64/include/asm/topology.h | 68 + arch/sw_64/include/asm/trace_clock.h | 10 + arch/sw_64/include/asm/types.h | 7 + arch/sw_64/include/asm/uaccess.h | 313 +++ arch/sw_64/include/asm/ucontext.h | 14 + arch/sw_64/include/asm/unaligned.h | 12 + arch/sw_64/include/asm/unistd.h | 26 + arch/sw_64/include/asm/uprobes.h | 38 + arch/sw_64/include/asm/user.h | 53 + arch/sw_64/include/asm/vcpu.h | 47 + arch/sw_64/include/asm/vdso.h | 116 + arch/sw_64/include/asm/vga.h | 85 + arch/sw_64/include/asm/vmalloc.h | 5 + arch/sw_64/include/asm/word-at-a-time.h | 43 + arch/sw_64/include/asm/wrperfmon.h | 62 + arch/sw_64/include/asm/xchg.h | 328 +++ arch/sw_64/include/asm/xor.h | 847 ++++++++ arch/sw_64/include/uapi/asm/Kbuild | 4 + arch/sw_64/include/uapi/asm/a.out.h | 88 + arch/sw_64/include/uapi/asm/auxvec.h | 28 + arch/sw_64/include/uapi/asm/bitsperlong.h | 9 + arch/sw_64/include/uapi/asm/bootparam.h | 22 + arch/sw_64/include/uapi/asm/byteorder.h | 7 + arch/sw_64/include/uapi/asm/compiler.h | 83 + arch/sw_64/include/uapi/asm/console.h | 51 + arch/sw_64/include/uapi/asm/errno.h | 128 ++ arch/sw_64/include/uapi/asm/fcntl.h | 58 + arch/sw_64/include/uapi/asm/fpu.h | 218 ++ arch/sw_64/include/uapi/asm/gentrap.h | 38 + arch/sw_64/include/uapi/asm/hmcall.h | 15 + arch/sw_64/include/uapi/asm/ioctl.h | 19 + arch/sw_64/include/uapi/asm/ioctls.h | 123 ++ arch/sw_64/include/uapi/asm/ipcbuf.h | 7 + arch/sw_64/include/uapi/asm/kvm.h | 129 ++ arch/sw_64/include/uapi/asm/kvm_para.h | 7 + 
arch/sw_64/include/uapi/asm/mman.h | 83 + arch/sw_64/include/uapi/asm/msgbuf.h | 28 + arch/sw_64/include/uapi/asm/param.h | 16 + arch/sw_64/include/uapi/asm/perf_regs.h | 38 + arch/sw_64/include/uapi/asm/poll.h | 7 + arch/sw_64/include/uapi/asm/posix_types.h | 18 + arch/sw_64/include/uapi/asm/ptrace.h | 94 + arch/sw_64/include/uapi/asm/reg.h | 53 + arch/sw_64/include/uapi/asm/regdef.h | 45 + arch/sw_64/include/uapi/asm/resource.h | 23 + arch/sw_64/include/uapi/asm/sembuf.h | 23 + arch/sw_64/include/uapi/asm/setup.h | 7 + arch/sw_64/include/uapi/asm/shmbuf.h | 39 + arch/sw_64/include/uapi/asm/sigcontext.h | 35 + arch/sw_64/include/uapi/asm/siginfo.h | 11 + arch/sw_64/include/uapi/asm/signal.h | 119 + arch/sw_64/include/uapi/asm/socket.h | 127 ++ arch/sw_64/include/uapi/asm/sockios.h | 17 + arch/sw_64/include/uapi/asm/stat.h | 51 + arch/sw_64/include/uapi/asm/statfs.h | 9 + arch/sw_64/include/uapi/asm/swab.h | 43 + arch/sw_64/include/uapi/asm/sysinfo.h | 20 + arch/sw_64/include/uapi/asm/termbits.h | 202 ++ arch/sw_64/include/uapi/asm/termios.h | 70 + arch/sw_64/include/uapi/asm/types.h | 28 + arch/sw_64/include/uapi/asm/unistd.h | 17 + arch/sw_64/kernel/.gitignore | 2 + arch/sw_64/kernel/Makefile | 56 + arch/sw_64/kernel/acpi.c | 396 ++++ arch/sw_64/kernel/asm-offsets.c | 262 +++ arch/sw_64/kernel/audit.c | 61 + arch/sw_64/kernel/cacheinfo.c | 100 + arch/sw_64/kernel/core.c | 72 + arch/sw_64/kernel/crash_dump.c | 58 + arch/sw_64/kernel/dma_swiotlb.c | 25 + arch/sw_64/kernel/dup_print.c | 92 + arch/sw_64/kernel/early_init.c | 30 + arch/sw_64/kernel/early_printk.c | 186 ++ arch/sw_64/kernel/entry-ftrace.S | 195 ++ arch/sw_64/kernel/entry.S | 706 ++++++ arch/sw_64/kernel/ftrace.c | 176 ++ arch/sw_64/kernel/head.S | 109 + arch/sw_64/kernel/hibernate.c | 80 + arch/sw_64/kernel/hibernate_asm.S | 124 ++ arch/sw_64/kernel/insn.c | 123 ++ arch/sw_64/kernel/irq.c | 123 ++ arch/sw_64/kernel/irq_sw64.c | 93 + arch/sw_64/kernel/jump_label.c | 33 + arch/sw_64/kernel/kgdb.c | 236 ++ arch/sw_64/kernel/kprobes/Makefile | 3 + arch/sw_64/kernel/kprobes/common.h | 9 + arch/sw_64/kernel/kprobes/decode-insn.c | 103 + arch/sw_64/kernel/kprobes/kprobes.c | 316 +++ arch/sw_64/kernel/kvm_cma.c | 273 +++ arch/sw_64/kernel/machine_kexec.c | 217 ++ arch/sw_64/kernel/module.c | 291 +++ arch/sw_64/kernel/msi.c | 58 + arch/sw_64/kernel/pci-noop.c | 145 ++ arch/sw_64/kernel/pci-sysfs.c | 368 ++++ arch/sw_64/kernel/pci.c | 733 +++++++ arch/sw_64/kernel/pci_common.c | 285 +++ arch/sw_64/kernel/pci_impl.h | 75 + arch/sw_64/kernel/pci_iommu.c | 772 +++++++ arch/sw_64/kernel/perf_event.c | 763 +++++++ arch/sw_64/kernel/perf_regs.c | 37 + arch/sw_64/kernel/proc_misc.c | 25 + arch/sw_64/kernel/process.c | 352 +++ arch/sw_64/kernel/proto.h | 25 + arch/sw_64/kernel/ptrace.c | 707 ++++++ arch/sw_64/kernel/relocate.c | 313 +++ arch/sw_64/kernel/relocate_kernel.S | 176 ++ arch/sw_64/kernel/segvdbg.c | 32 + arch/sw_64/kernel/setup.c | 1047 +++++++++ arch/sw_64/kernel/signal.c | 415 ++++ arch/sw_64/kernel/smp.c | 810 +++++++ arch/sw_64/kernel/stacktrace.c | 46 + arch/sw_64/kernel/suspend.c | 79 + arch/sw_64/kernel/suspend_asm.S | 99 + arch/sw_64/kernel/sys_sw64.c | 151 ++ arch/sw_64/kernel/syscalls/Makefile | 38 + arch/sw_64/kernel/syscalls/syscall.tbl | 528 +++++ arch/sw_64/kernel/syscalls/syscallhdr.sh | 36 + arch/sw_64/kernel/syscalls/syscalltbl.sh | 32 + arch/sw_64/kernel/systbls.S | 16 + arch/sw_64/kernel/tc.c | 39 + arch/sw_64/kernel/time.c | 251 +++ arch/sw_64/kernel/timer.c | 149 ++ arch/sw_64/kernel/topology.c | 170 ++ 
arch/sw_64/kernel/traps.c | 1651 ++++++++++++++ arch/sw_64/kernel/unaligned.c | 59 + arch/sw_64/kernel/uprobes.c | 158 ++ arch/sw_64/kernel/vdso.c | 152 ++ arch/sw_64/kernel/vdso/.gitignore | 4 + arch/sw_64/kernel/vdso/Makefile | 74 + arch/sw_64/kernel/vdso/so2s.sh | 5 + arch/sw_64/kernel/vdso/vdso.S | 32 + arch/sw_64/kernel/vdso/vdso.lds.S | 89 + arch/sw_64/kernel/vdso/vgettimeofday.c | 179 ++ arch/sw_64/kernel/vdso/vrt_sigreturn.S | 29 + arch/sw_64/kernel/vmlinux.lds.S | 102 + arch/sw_64/kvm/Kconfig | 47 + arch/sw_64/kvm/Makefile | 13 + arch/sw_64/kvm/emulate.c | 115 + arch/sw_64/kvm/entry.S | 285 +++ arch/sw_64/kvm/handle_exit.c | 45 + arch/sw_64/kvm/irq.h | 12 + arch/sw_64/kvm/kvm-sw64.c | 713 ++++++ arch/sw_64/kvm/kvm_timer.c | 78 + arch/sw_64/kvm/mmio.c | 82 + arch/sw_64/kvm/vmem.c | 154 ++ arch/sw_64/lib/Kconfig | 40 + arch/sw_64/lib/Makefile | 49 + arch/sw_64/lib/checksum.c | 183 ++ arch/sw_64/lib/clear_page.S | 46 + arch/sw_64/lib/clear_user.S | 102 + arch/sw_64/lib/copy_page.S | 71 + arch/sw_64/lib/copy_user.S | 106 + arch/sw_64/lib/csum_ipv6_magic.S | 113 + arch/sw_64/lib/csum_partial_copy.c | 373 ++++ arch/sw_64/lib/deep-clear_page.S | 53 + arch/sw_64/lib/deep-copy_page.S | 53 + arch/sw_64/lib/deep-copy_user.S | 342 +++ arch/sw_64/lib/deep-memcpy.S | 240 +++ arch/sw_64/lib/deep-memset.S | 148 ++ arch/sw_64/lib/divide.S | 190 ++ arch/sw_64/lib/fls.c | 34 + arch/sw_64/lib/fpreg.c | 992 +++++++++ arch/sw_64/lib/iomap.c | 508 +++++ arch/sw_64/lib/iomap_copy.c | 55 + arch/sw_64/lib/memcpy.S | 201 ++ arch/sw_64/lib/memmove.S | 148 ++ arch/sw_64/lib/memset.S | 153 ++ arch/sw_64/lib/strcpy.S | 131 ++ arch/sw_64/lib/strncpy.S | 156 ++ arch/sw_64/lib/udelay.c | 70 + arch/sw_64/math-emu/Makefile | 10 + arch/sw_64/math-emu/math.c | 2267 ++++++++++++++++++++ arch/sw_64/math-emu/qrnnd.S | 133 ++ arch/sw_64/math-emu/sfp-util.h | 36 + arch/sw_64/mm/Makefile | 12 + arch/sw_64/mm/fault.c | 361 ++++ arch/sw_64/mm/hugetlbpage.c | 329 +++ arch/sw_64/mm/init.c | 349 +++ arch/sw_64/mm/mmap.c | 102 + arch/sw_64/mm/numa.c | 460 ++++ arch/sw_64/mm/physaddr.c | 39 + arch/sw_64/mm/thp.c | 64 + arch/sw_64/net/Makefile | 5 + arch/sw_64/net/bpf_jit.h | 343 +++ arch/sw_64/net/bpf_jit_comp.c | 973 +++++++++ arch/sw_64/oprofile/Makefile | 13 + arch/sw_64/oprofile/common.c | 172 ++ arch/sw_64/oprofile/op_impl.h | 56 + arch/sw_64/oprofile/op_model_sw2f.c | 280 +++ arch/sw_64/platform/Makefile | 2 + arch/sw_64/platform/platform_xuelang.c | 64 + arch/sw_64/tools/.gitignore | 2 + arch/sw_64/tools/Makefile | 8 + arch/sw_64/tools/relocs.c | 634 ++++++ arch/sw_64/tools/relocs.h | 71 + arch/sw_64/tools/relocs_main.c | 86 + 332 files changed, 44356 insertions(+) create mode 100644 arch/sw_64/Kconfig create mode 100644 arch/sw_64/Kconfig.debug create mode 100644 arch/sw_64/Makefile create mode 100644 arch/sw_64/Makefile.postlink create mode 100644 arch/sw_64/boot/.gitignore create mode 100644 arch/sw_64/boot/Makefile create mode 100644 arch/sw_64/boot/dts/Makefile create mode 100644 arch/sw_64/boot/dts/chip3.dts create mode 100644 arch/sw_64/boot/dts/chip_vt.dts create mode 100644 arch/sw_64/chip/Makefile create mode 100644 arch/sw_64/chip/chip3/Makefile create mode 100644 arch/sw_64/chip/chip3/chip.c create mode 100644 arch/sw_64/chip/chip3/cpufreq_debugfs.c create mode 100644 arch/sw_64/chip/chip3/i2c-lib.c create mode 100644 arch/sw_64/chip/chip3/irq_chip.c create mode 100644 arch/sw_64/chip/chip3/msi.c create mode 100644 arch/sw_64/chip/chip3/pci-quirks.c create mode 100644 arch/sw_64/chip/chip3/vt_msi.c create mode 
100644 arch/sw_64/defconfig create mode 100644 arch/sw_64/include/asm/Kbuild create mode 100644 arch/sw_64/include/asm/a.out-core.h create mode 100644 arch/sw_64/include/asm/a.out.h create mode 100644 arch/sw_64/include/asm/acenv.h create mode 100644 arch/sw_64/include/asm/acpi.h create mode 100644 arch/sw_64/include/asm/agp.h create mode 100644 arch/sw_64/include/asm/asm-offsets.h create mode 100644 arch/sw_64/include/asm/asm-prototypes.h create mode 100644 arch/sw_64/include/asm/ast2400.h create mode 100644 arch/sw_64/include/asm/atomic.h create mode 100644 arch/sw_64/include/asm/barrier.h create mode 100644 arch/sw_64/include/asm/bitops.h create mode 100644 arch/sw_64/include/asm/bug.h create mode 100644 arch/sw_64/include/asm/bugs.h create mode 100644 arch/sw_64/include/asm/cache.h create mode 100644 arch/sw_64/include/asm/cacheflush.h create mode 100644 arch/sw_64/include/asm/checksum.h create mode 100644 arch/sw_64/include/asm/chip3_io.h create mode 100644 arch/sw_64/include/asm/cmpxchg.h create mode 100644 arch/sw_64/include/asm/compiler.h create mode 100644 arch/sw_64/include/asm/console.h create mode 100644 arch/sw_64/include/asm/core.h create mode 100644 arch/sw_64/include/asm/cpu.h create mode 100644 arch/sw_64/include/asm/cputime.h create mode 100644 arch/sw_64/include/asm/current.h create mode 100644 arch/sw_64/include/asm/debug.h create mode 100644 arch/sw_64/include/asm/delay.h create mode 100644 arch/sw_64/include/asm/device.h create mode 100644 arch/sw_64/include/asm/div64.h create mode 100644 arch/sw_64/include/asm/dma-direct.h create mode 100644 arch/sw_64/include/asm/dma-mapping.h create mode 100644 arch/sw_64/include/asm/dma.h create mode 100644 arch/sw_64/include/asm/dmi.h create mode 100644 arch/sw_64/include/asm/early_ioremap.h create mode 100644 arch/sw_64/include/asm/efi.h create mode 100644 arch/sw_64/include/asm/elf.h create mode 100644 arch/sw_64/include/asm/emergency-restart.h create mode 100644 arch/sw_64/include/asm/exec.h create mode 100644 arch/sw_64/include/asm/extable.h create mode 100644 arch/sw_64/include/asm/floppy.h create mode 100644 arch/sw_64/include/asm/fpu.h create mode 100644 arch/sw_64/include/asm/ftrace.h create mode 100644 arch/sw_64/include/asm/futex.h create mode 100644 arch/sw_64/include/asm/hardirq.h create mode 100644 arch/sw_64/include/asm/hcall.h create mode 100644 arch/sw_64/include/asm/hmcall.h create mode 100644 arch/sw_64/include/asm/hugetlb.h create mode 100644 arch/sw_64/include/asm/hw_init.h create mode 100644 arch/sw_64/include/asm/hw_irq.h create mode 100644 arch/sw_64/include/asm/insn.h create mode 100644 arch/sw_64/include/asm/io.h create mode 100644 arch/sw_64/include/asm/irq.h create mode 100644 arch/sw_64/include/asm/irq_impl.h create mode 100644 arch/sw_64/include/asm/irq_regs.h create mode 100644 arch/sw_64/include/asm/irqflags.h create mode 100644 arch/sw_64/include/asm/jump_label.h create mode 100644 arch/sw_64/include/asm/kdebug.h create mode 100644 arch/sw_64/include/asm/kexec.h create mode 100644 arch/sw_64/include/asm/kgdb.h create mode 100644 arch/sw_64/include/asm/kmap_types.h create mode 100644 arch/sw_64/include/asm/kprobes.h create mode 100644 arch/sw_64/include/asm/kvm_asm.h create mode 100644 arch/sw_64/include/asm/kvm_cma.h create mode 100644 arch/sw_64/include/asm/kvm_emulate.h create mode 100644 arch/sw_64/include/asm/kvm_host.h create mode 100644 arch/sw_64/include/asm/kvm_mmio.h create mode 100644 arch/sw_64/include/asm/kvm_para.h create mode 100644 arch/sw_64/include/asm/kvm_timer.h create mode 
100644 arch/sw_64/include/asm/linkage.h create mode 100644 arch/sw_64/include/asm/local.h create mode 100644 arch/sw_64/include/asm/local64.h create mode 100644 arch/sw_64/include/asm/memory.h create mode 100644 arch/sw_64/include/asm/mmu.h create mode 100644 arch/sw_64/include/asm/mmu_context.h create mode 100644 arch/sw_64/include/asm/mmzone.h create mode 100644 arch/sw_64/include/asm/module.h create mode 100644 arch/sw_64/include/asm/msi.h create mode 100644 arch/sw_64/include/asm/numa.h create mode 100644 arch/sw_64/include/asm/page.h create mode 100644 arch/sw_64/include/asm/param.h create mode 100644 arch/sw_64/include/asm/parport.h create mode 100644 arch/sw_64/include/asm/pci.h create mode 100644 arch/sw_64/include/asm/percpu.h create mode 100644 arch/sw_64/include/asm/perf_event.h create mode 100644 arch/sw_64/include/asm/pgalloc.h create mode 100644 arch/sw_64/include/asm/pgtable-4level.h create mode 100644 arch/sw_64/include/asm/pgtable.h create mode 100644 arch/sw_64/include/asm/platform.h create mode 100644 arch/sw_64/include/asm/preempt.h create mode 100644 arch/sw_64/include/asm/processor.h create mode 100644 arch/sw_64/include/asm/ptrace.h create mode 100644 arch/sw_64/include/asm/seccomp.h create mode 100644 arch/sw_64/include/asm/sections.h create mode 100644 arch/sw_64/include/asm/segment.h create mode 100644 arch/sw_64/include/asm/serial.h create mode 100644 arch/sw_64/include/asm/setup.h create mode 100644 arch/sw_64/include/asm/sfp-machine.h create mode 100644 arch/sw_64/include/asm/shmparam.h create mode 100644 arch/sw_64/include/asm/signal.h create mode 100644 arch/sw_64/include/asm/smp.h create mode 100644 arch/sw_64/include/asm/socket.h create mode 100644 arch/sw_64/include/asm/sparsemem.h create mode 100644 arch/sw_64/include/asm/special_insns.h create mode 100644 arch/sw_64/include/asm/spinlock.h create mode 100644 arch/sw_64/include/asm/spinlock_types.h create mode 100644 arch/sw_64/include/asm/string.h create mode 100644 arch/sw_64/include/asm/suspend.h create mode 100644 arch/sw_64/include/asm/sw64_init.h create mode 100644 arch/sw_64/include/asm/sw64io.h create mode 100644 arch/sw_64/include/asm/switch_to.h create mode 100644 arch/sw_64/include/asm/syscall.h create mode 100644 arch/sw_64/include/asm/tc.h create mode 100644 arch/sw_64/include/asm/termios.h create mode 100644 arch/sw_64/include/asm/thread_info.h create mode 100644 arch/sw_64/include/asm/timex.h create mode 100644 arch/sw_64/include/asm/tlb.h create mode 100644 arch/sw_64/include/asm/tlbflush.h create mode 100644 arch/sw_64/include/asm/topology.h create mode 100644 arch/sw_64/include/asm/trace_clock.h create mode 100644 arch/sw_64/include/asm/types.h create mode 100644 arch/sw_64/include/asm/uaccess.h create mode 100644 arch/sw_64/include/asm/ucontext.h create mode 100644 arch/sw_64/include/asm/unaligned.h create mode 100644 arch/sw_64/include/asm/unistd.h create mode 100644 arch/sw_64/include/asm/uprobes.h create mode 100644 arch/sw_64/include/asm/user.h create mode 100644 arch/sw_64/include/asm/vcpu.h create mode 100644 arch/sw_64/include/asm/vdso.h create mode 100644 arch/sw_64/include/asm/vga.h create mode 100644 arch/sw_64/include/asm/vmalloc.h create mode 100644 arch/sw_64/include/asm/word-at-a-time.h create mode 100644 arch/sw_64/include/asm/wrperfmon.h create mode 100644 arch/sw_64/include/asm/xchg.h create mode 100644 arch/sw_64/include/asm/xor.h create mode 100644 arch/sw_64/include/uapi/asm/Kbuild create mode 100644 arch/sw_64/include/uapi/asm/a.out.h create mode 100644 
arch/sw_64/include/uapi/asm/auxvec.h create mode 100644 arch/sw_64/include/uapi/asm/bitsperlong.h create mode 100644 arch/sw_64/include/uapi/asm/bootparam.h create mode 100644 arch/sw_64/include/uapi/asm/byteorder.h create mode 100644 arch/sw_64/include/uapi/asm/compiler.h create mode 100644 arch/sw_64/include/uapi/asm/console.h create mode 100644 arch/sw_64/include/uapi/asm/errno.h create mode 100644 arch/sw_64/include/uapi/asm/fcntl.h create mode 100644 arch/sw_64/include/uapi/asm/fpu.h create mode 100644 arch/sw_64/include/uapi/asm/gentrap.h create mode 100644 arch/sw_64/include/uapi/asm/hmcall.h create mode 100644 arch/sw_64/include/uapi/asm/ioctl.h create mode 100644 arch/sw_64/include/uapi/asm/ioctls.h create mode 100644 arch/sw_64/include/uapi/asm/ipcbuf.h create mode 100644 arch/sw_64/include/uapi/asm/kvm.h create mode 100644 arch/sw_64/include/uapi/asm/kvm_para.h create mode 100644 arch/sw_64/include/uapi/asm/mman.h create mode 100644 arch/sw_64/include/uapi/asm/msgbuf.h create mode 100644 arch/sw_64/include/uapi/asm/param.h create mode 100644 arch/sw_64/include/uapi/asm/perf_regs.h create mode 100644 arch/sw_64/include/uapi/asm/poll.h create mode 100644 arch/sw_64/include/uapi/asm/posix_types.h create mode 100644 arch/sw_64/include/uapi/asm/ptrace.h create mode 100644 arch/sw_64/include/uapi/asm/reg.h create mode 100644 arch/sw_64/include/uapi/asm/regdef.h create mode 100644 arch/sw_64/include/uapi/asm/resource.h create mode 100644 arch/sw_64/include/uapi/asm/sembuf.h create mode 100644 arch/sw_64/include/uapi/asm/setup.h create mode 100644 arch/sw_64/include/uapi/asm/shmbuf.h create mode 100644 arch/sw_64/include/uapi/asm/sigcontext.h create mode 100644 arch/sw_64/include/uapi/asm/siginfo.h create mode 100644 arch/sw_64/include/uapi/asm/signal.h create mode 100644 arch/sw_64/include/uapi/asm/socket.h create mode 100644 arch/sw_64/include/uapi/asm/sockios.h create mode 100644 arch/sw_64/include/uapi/asm/stat.h create mode 100644 arch/sw_64/include/uapi/asm/statfs.h create mode 100644 arch/sw_64/include/uapi/asm/swab.h create mode 100644 arch/sw_64/include/uapi/asm/sysinfo.h create mode 100644 arch/sw_64/include/uapi/asm/termbits.h create mode 100644 arch/sw_64/include/uapi/asm/termios.h create mode 100644 arch/sw_64/include/uapi/asm/types.h create mode 100644 arch/sw_64/include/uapi/asm/unistd.h create mode 100644 arch/sw_64/kernel/.gitignore create mode 100644 arch/sw_64/kernel/Makefile create mode 100644 arch/sw_64/kernel/acpi.c create mode 100644 arch/sw_64/kernel/asm-offsets.c create mode 100644 arch/sw_64/kernel/audit.c create mode 100644 arch/sw_64/kernel/cacheinfo.c create mode 100644 arch/sw_64/kernel/core.c create mode 100644 arch/sw_64/kernel/crash_dump.c create mode 100644 arch/sw_64/kernel/dma_swiotlb.c create mode 100644 arch/sw_64/kernel/dup_print.c create mode 100644 arch/sw_64/kernel/early_init.c create mode 100644 arch/sw_64/kernel/early_printk.c create mode 100644 arch/sw_64/kernel/entry-ftrace.S create mode 100644 arch/sw_64/kernel/entry.S create mode 100644 arch/sw_64/kernel/ftrace.c create mode 100644 arch/sw_64/kernel/head.S create mode 100644 arch/sw_64/kernel/hibernate.c create mode 100644 arch/sw_64/kernel/hibernate_asm.S create mode 100644 arch/sw_64/kernel/insn.c create mode 100644 arch/sw_64/kernel/irq.c create mode 100644 arch/sw_64/kernel/irq_sw64.c create mode 100644 arch/sw_64/kernel/jump_label.c create mode 100644 arch/sw_64/kernel/kgdb.c create mode 100644 arch/sw_64/kernel/kprobes/Makefile create mode 100644 arch/sw_64/kernel/kprobes/common.h 
create mode 100644 arch/sw_64/kernel/kprobes/decode-insn.c create mode 100644 arch/sw_64/kernel/kprobes/kprobes.c create mode 100644 arch/sw_64/kernel/kvm_cma.c create mode 100644 arch/sw_64/kernel/machine_kexec.c create mode 100644 arch/sw_64/kernel/module.c create mode 100644 arch/sw_64/kernel/msi.c create mode 100644 arch/sw_64/kernel/pci-noop.c create mode 100644 arch/sw_64/kernel/pci-sysfs.c create mode 100644 arch/sw_64/kernel/pci.c create mode 100644 arch/sw_64/kernel/pci_common.c create mode 100644 arch/sw_64/kernel/pci_impl.h create mode 100644 arch/sw_64/kernel/pci_iommu.c create mode 100644 arch/sw_64/kernel/perf_event.c create mode 100644 arch/sw_64/kernel/perf_regs.c create mode 100644 arch/sw_64/kernel/proc_misc.c create mode 100644 arch/sw_64/kernel/process.c create mode 100644 arch/sw_64/kernel/proto.h create mode 100644 arch/sw_64/kernel/ptrace.c create mode 100644 arch/sw_64/kernel/relocate.c create mode 100644 arch/sw_64/kernel/relocate_kernel.S create mode 100644 arch/sw_64/kernel/segvdbg.c create mode 100644 arch/sw_64/kernel/setup.c create mode 100644 arch/sw_64/kernel/signal.c create mode 100644 arch/sw_64/kernel/smp.c create mode 100644 arch/sw_64/kernel/stacktrace.c create mode 100644 arch/sw_64/kernel/suspend.c create mode 100644 arch/sw_64/kernel/suspend_asm.S create mode 100644 arch/sw_64/kernel/sys_sw64.c create mode 100644 arch/sw_64/kernel/syscalls/Makefile create mode 100644 arch/sw_64/kernel/syscalls/syscall.tbl create mode 100644 arch/sw_64/kernel/syscalls/syscallhdr.sh create mode 100644 arch/sw_64/kernel/syscalls/syscalltbl.sh create mode 100644 arch/sw_64/kernel/systbls.S create mode 100644 arch/sw_64/kernel/tc.c create mode 100644 arch/sw_64/kernel/time.c create mode 100644 arch/sw_64/kernel/timer.c create mode 100644 arch/sw_64/kernel/topology.c create mode 100644 arch/sw_64/kernel/traps.c create mode 100644 arch/sw_64/kernel/unaligned.c create mode 100644 arch/sw_64/kernel/uprobes.c create mode 100644 arch/sw_64/kernel/vdso.c create mode 100644 arch/sw_64/kernel/vdso/.gitignore create mode 100644 arch/sw_64/kernel/vdso/Makefile create mode 100755 arch/sw_64/kernel/vdso/so2s.sh create mode 100644 arch/sw_64/kernel/vdso/vdso.S create mode 100644 arch/sw_64/kernel/vdso/vdso.lds.S create mode 100644 arch/sw_64/kernel/vdso/vgettimeofday.c create mode 100644 arch/sw_64/kernel/vdso/vrt_sigreturn.S create mode 100644 arch/sw_64/kernel/vmlinux.lds.S create mode 100644 arch/sw_64/kvm/Kconfig create mode 100644 arch/sw_64/kvm/Makefile create mode 100644 arch/sw_64/kvm/emulate.c create mode 100644 arch/sw_64/kvm/entry.S create mode 100644 arch/sw_64/kvm/handle_exit.c create mode 100644 arch/sw_64/kvm/irq.h create mode 100644 arch/sw_64/kvm/kvm-sw64.c create mode 100644 arch/sw_64/kvm/kvm_timer.c create mode 100644 arch/sw_64/kvm/mmio.c create mode 100644 arch/sw_64/kvm/vmem.c create mode 100644 arch/sw_64/lib/Kconfig create mode 100644 arch/sw_64/lib/Makefile create mode 100644 arch/sw_64/lib/checksum.c create mode 100644 arch/sw_64/lib/clear_page.S create mode 100644 arch/sw_64/lib/clear_user.S create mode 100644 arch/sw_64/lib/copy_page.S create mode 100644 arch/sw_64/lib/copy_user.S create mode 100644 arch/sw_64/lib/csum_ipv6_magic.S create mode 100644 arch/sw_64/lib/csum_partial_copy.c create mode 100644 arch/sw_64/lib/deep-clear_page.S create mode 100644 arch/sw_64/lib/deep-copy_page.S create mode 100644 arch/sw_64/lib/deep-copy_user.S create mode 100644 arch/sw_64/lib/deep-memcpy.S create mode 100644 arch/sw_64/lib/deep-memset.S create mode 100644 
arch/sw_64/lib/divide.S create mode 100644 arch/sw_64/lib/fls.c create mode 100644 arch/sw_64/lib/fpreg.c create mode 100644 arch/sw_64/lib/iomap.c create mode 100644 arch/sw_64/lib/iomap_copy.c create mode 100644 arch/sw_64/lib/memcpy.S create mode 100644 arch/sw_64/lib/memmove.S create mode 100644 arch/sw_64/lib/memset.S create mode 100644 arch/sw_64/lib/strcpy.S create mode 100644 arch/sw_64/lib/strncpy.S create mode 100644 arch/sw_64/lib/udelay.c create mode 100644 arch/sw_64/math-emu/Makefile create mode 100644 arch/sw_64/math-emu/math.c create mode 100644 arch/sw_64/math-emu/qrnnd.S create mode 100644 arch/sw_64/math-emu/sfp-util.h create mode 100644 arch/sw_64/mm/Makefile create mode 100644 arch/sw_64/mm/fault.c create mode 100644 arch/sw_64/mm/hugetlbpage.c create mode 100644 arch/sw_64/mm/init.c create mode 100644 arch/sw_64/mm/mmap.c create mode 100644 arch/sw_64/mm/numa.c create mode 100644 arch/sw_64/mm/physaddr.c create mode 100644 arch/sw_64/mm/thp.c create mode 100644 arch/sw_64/net/Makefile create mode 100644 arch/sw_64/net/bpf_jit.h create mode 100644 arch/sw_64/net/bpf_jit_comp.c create mode 100644 arch/sw_64/oprofile/Makefile create mode 100644 arch/sw_64/oprofile/common.c create mode 100644 arch/sw_64/oprofile/op_impl.h create mode 100644 arch/sw_64/oprofile/op_model_sw2f.c create mode 100644 arch/sw_64/platform/Makefile create mode 100644 arch/sw_64/platform/platform_xuelang.c create mode 100644 arch/sw_64/tools/.gitignore create mode 100644 arch/sw_64/tools/Makefile create mode 100644 arch/sw_64/tools/relocs.c create mode 100644 arch/sw_64/tools/relocs.h create mode 100644 arch/sw_64/tools/relocs_main.c
diff --git a/arch/sw_64/Kconfig b/arch/sw_64/Kconfig new file mode 100644 index 000000000000..bef7ab381674 --- /dev/null +++ b/arch/sw_64/Kconfig @@ -0,0 +1,686 @@ +# SPDX-License-Identifier: GPL-2.0 +config SW64 + bool + default y + select AUDIT_ARCH + select VIRT_IO + select HAVE_AOUT + select HAVE_IDE + select HAVE_OPROFILE +# select HAVE_SYSCALL_WRAPPERS + select HAVE_IRQ_WORK + select HAVE_PCSPKR_PLATFORM + select HAVE_PERF_EVENTS + select HAVE_GENERIC_HARDIRQS + select GENERIC_CLOCKEVENTS + select GENERIC_IRQ_PROBE + select GENERIC_IRQ_LEGACY + select GENERIC_IDLE_LOOP + select GENERIC_IRQ_SHOW + select ARCH_WANT_IPC_PARSE_VERSION + select ARCH_HAVE_NMI_SAFE_CMPXCHG + select ARCH_SUPPORTS_MSI + select ARCH_MIGHT_HAVE_PC_SERIO + select ARCH_NO_PREEMPT + select ARCH_USE_CMPXCHG_LOCKREF + select GENERIC_SMP_IDLE_THREAD + select HAVE_MOD_ARCH_SPECIFIC + select MODULES_USE_ELF_RELA + select ARCH_SUPPORTS_NUMA_BALANCING + select ARCH_WANTS_PROT_NUMA_PROT_NONE + select HAVE_ARCH_TRANSPARENT_HUGEPAGE + select HAVE_GENERIC_RCU_GUP + select HAVE_ARCH_AUDITSYSCALL + select HAVE_ARCH_SECCOMP_FILTER + select GENERIC_SIGALTSTACK + select OLD_SIGACTION + select OLD_SIGSUSPEND + select GENERIC_STRNCPY_FROM_USER + select GENERIC_STRNLEN_USER + select HAVE_ARCH_KGDB + select ARCH_HAS_PHYS_TO_DMA + select HAVE_MEMBLOCK + select HAVE_MEMBLOCK_NODE_MAP + select NO_BOOTMEM + select ARCH_USE_QUEUED_RWLOCKS + select ARCH_USE_QUEUED_SPINLOCKS + select COMMON_CLK + select HANDLE_DOMAIN_IRQ + select ARCH_INLINE_READ_LOCK if !PREEMPT + select ARCH_INLINE_READ_LOCK_BH if !PREEMPT + select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPT + select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPT + select ARCH_INLINE_READ_UNLOCK if !PREEMPT + select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPT + select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPT + select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPT + select ARCH_INLINE_WRITE_LOCK if !PREEMPT + select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPT + select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPT + select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPT + select ARCH_INLINE_WRITE_UNLOCK if !PREEMPT + select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPT + select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPT + select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPT + select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPT + select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPT + select ARCH_INLINE_SPIN_LOCK if !PREEMPT + select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPT + select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPT + select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPT + select ARCH_INLINE_SPIN_UNLOCK if !PREEMPT + select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPT + select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPT + select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPT + select ARCH_SUPPORTS_ATOMIC_RMW + select ARCH_HAS_SG_CHAIN + select IRQ_FORCED_THREADING + select GENERIC_IRQ_MIGRATION if SMP + select HAVE_FUNCTION_TRACER + select HAVE_DYNAMIC_FTRACE + select HAVE_FTRACE_MCOUNT_RECORD + select HAVE_C_RECORDMCOUNT + select HAVE_FUNCTION_GRAPH_TRACER + select HAVE_KPROBES + select HAVE_KRETPROBES + select HAVE_SYSCALL_TRACEPOINTS + select ARCH_SUPPORTS_UPROBES + select OF_EARLY_FLATTREE if OF + select HAVE_EBPF_JIT + select SPARSEMEM_EXTREME if SPARSEMEM + select HAVE_ARCH_JUMP_LABEL + select ARCH_WANT_FRAME_POINTERS + select HAVE_ASM_MODVERSIONS + select ARCH_HAS_ELF_RANDOMIZE + select HAVE_PERF_USER_STACK_DUMP + select HAVE_PERF_REGS + select ARCH_SUPPORTS_ACPI + select ACPI if ARCH_SUPPORTS_ACPI + select ACPI_REDUCED_HARDWARE_ONLY if ACPI + select 
GENERIC_TIME_VSYSCALL + select SET_FS + select PCI_MSI_ARCH_FALLBACKS + select DMA_OPS if PCI + +config LOCKDEP_SUPPORT + def_bool y + +config 64BIT + def_bool y + +config MMU + bool + default y + +config PGTABLE_LEVELS + int + default 4 + +config SYS_SUPPORTS_HUGETLBFS + def_bool y + +config RWSEM_GENERIC_SPINLOCK + bool + +config RWSEM_XCHGADD_ALGORITHM + bool + default y + +config ARCH_ENABLE_MEMORY_HOTPLUG + bool + default y + +config ARCH_ENABLE_MEMORY_HOTREMOVE + bool + default y + +config ARCH_HAS_ILOG2_U32 + bool + default n + +config ARCH_HAS_ILOG2_U64 + bool + default n + +config GENERIC_GPIO + bool + +config ZONE_DMA32 + bool + default y + +config NEED_DMA_MAP_STATE + def_bool y + +config NEED_SG_DMA_LENGTH + def_bool y + +config ARCH_WANT_HUGE_PMD_SHARE + def_bool y + +config GENERIC_ISA_DMA + bool + default y + +config NONCACHE_PAGE + bool + depends on SW64 + default y + + +config AUDIT_ARCH + bool + +config SYS_HAS_EARLY_PRINTK + bool + +menu "System setup" + +menu "Machine Configuration" + +choice + prompt "Subarchitecture Configuration" + +config SUBARCH_C3B + bool "C3B" + +endchoice + +choice + prompt "Chipset Family" + +config SW64_CHIP3 + bool "Chip3" + depends on SUBARCH_C3B +endchoice + +choice + prompt "Runtime System" + depends on SW64_CHIP3 + default SW64_ASIC + +config SW64_FPGA + bool "FPGA" + help + Support for chip3 FPGA. + +config SW64_SIM + bool "Hardware Simulator" + help + Support for chip3 hardware simulator. + +config SW64_ASIC + bool "ASIC" + help + Support for chip3 asic. + +endchoice + +config SW64_CHIP3_ASIC_DEBUG + bool "Debug Support for Chip3 Asic" + depends on SW64_ASIC + help + Used for debug + +config CPUFREQ_DEBUGFS + bool "CPU Frequency debugfs interface for Chip3 Asic" + depends on SW64_CHIP3 && DEBUG_FS + help + Turns on the DebugFS interface for CPU Frequency. + + If you don't know what to do here, say N. + +choice + prompt "Platform Type" + +config PLATFORM_XUELANG + bool "Xuelang" + depends on SW64_CHIP3 + select SPARSE_IRQ + select SYS_HAS_EARLY_PRINTK + help + Sunway chip3 board chipset + +endchoice + +endmenu + +config LOCK_MEMB + bool "Insert mem barrier before lock instruction" + default y + +choice + prompt "DMA Mapping Type" + depends on SW64 && PCI + +config DIRECT_DMA + bool "Direct DMA Mapping" + depends on SW64 && PCI + +config SWIOTLB + bool "Software IO TLB" + depends on SW64 && PCI + help + Software IO TLB + +endchoice + +# clear all implied options (don't want default values for those): +# Most of these machines have ISA slots; not exactly sure which don't, +# and this doesn't activate hordes of code, so do it always. +config ISA + bool + default y + help + Find out whether you have ISA slots on your motherboard. ISA is the + name of a bus system, i.e. the way the CPU talks to the other stuff + inside your box. Other bus systems are PCI, EISA, MicroChannel + (MCA) or VESA. ISA is an older system, now being displaced by PCI; + newer boards don't support it. If you have ISA, say Y, otherwise N. + +config ISA_DMA_API + bool + default y + +config PCI + bool "PCI Support" + depends on SW64 + select GENERIC_PCI_IOMAP + default y + help + Find out whether you have a PCI motherboard. PCI is the name of a + bus system, i.e. the way the CPU talks to the other stuff inside + your box. Other bus systems are ISA, EISA, MicroChannel (MCA) or + VESA. If you have PCI, say Y, otherwise N. 
+ +config PCI_DOMAINS + bool + default y + +config PCI_SYSCALL + def_bool PCI + +config IOMMU_HELPER + def_bool PCI + +config PHYSICAL_START + hex "Physical address where the kernel starts" + default "0x900000" + help + This gives the physical address where the kernel starts, and it + is 0x10000 before _text. If you plan to use kernel for capturing + the crash dump change this value to start of the reserved region + (the "X" value as specified in the "crashkernel=YM@XM" command + line boot parameter passed to the panic-ed kernel). + +config KEXEC + bool "Kexec system call (EXPERIMENTAL)" + select KEXEC_CORE + help + kexec is a system call that implements the ability to shutdown your + current kernel, and to start another kernel. It is like a reboot + but it is independent of the system firmware. And like a reboot + you can start any kernel with it, not just Linux. + + The name comes from the similarity to the exec system call. + + It is an ongoing process to be certain the hardware in a machine + is properly shutdown, so do not be surprised if this code does not + initially work for you. As of this writing the exact hardware + interface is strongly in flux, so no good recommendation can be + made. + +config CRASH_DUMP + bool "Kernel crash dumps (EXPERIMENTAL)" + help + Generate crash dump after being started by kexec. + This should be normally only set in special crash dump kernels + which are loaded in the main kernel with kexec-tools into + a specially reserved region and then later executed after + a crash by kdump/kexec. The crash dump kernel must be compiled + to a memory address not used by the main kernel or firmware using + PHYSICAL_START. + +config SECCOMP + def_bool y + prompt "Enable seccomp to safely compute untrusted bytecode" + help + This kernel feature is useful for number crunching applications + that may need to compute untrusted bytecode during their + execution. By using pipes or other transports made available to + the process as file descriptors supporting the read/write + syscalls, it's possible to isolate those applications in + their own address space using seccomp. Once seccomp is + enabled via prctl(PR_SET_SECCOMP), it cannot be disabled + and the task is only allowed to execute a few safe syscalls + defined by each seccomp mode. + + If unsure, say Y. Only embedded should say N here. + +config GENERIC_HWEIGHT + bool + default y + +config LOCK_FIXUP + bool "fix up the lock" + depends on SW64 + help + Add an instruction("memb\n") to ensure the correctness of the lock. + + +config SMP + bool "Symmetric multi-processing support" + depends on SW64 + select USE_GENERIC_SMP_HELPERS + help + This enables support for systems with more than one CPU. If you have + a system with only one CPU, like most personal computers, say N. If + you have a system with more than one CPU, say Y. + + If you say N here, the kernel will run on single and multiprocessor + machines, but will use only one CPU of a multiprocessor machine. If + you say Y here, the kernel will run on many, but not all, + singleprocessor machines. On a singleprocessor machine, the kernel + will run faster if you say N here. + + See also the SMP-HOWTO available at + http://www.tldp.org/docs.html#howto. + + If you don't know what to do here, say N. 
+ +config ARCH_PROC_KCORE_TEXT + def_bool y + +config HAVE_DEC_LOCK + bool "Use arch-specified dec_and_lock" + depends on SMP && !NUMA + default y + +config TRACE_IRQFLAGS_SUPPORT + def_bool y + +config ARCH_SUPPORTS_UPROBES + def_bool y + +config NR_CPUS + int "Maximum number of CPUs (2-256)" + range 2 256 + depends on SMP + default "64" if SW64_CHIP3 + help + SW6 support can handle a maximum of 256 CPUs. + +config HOTPLUG_CPU + bool "Support for hot-pluggable CPUs" + depends on SMP + help + Say Y here to allow turning CPUs off and on. CPUs can be + controlled through /sys/devices/system/cpu. + ( Note: power management support will enable this option + automatically on SMP systems. ) + Say N if you want to disable CPU hotplug. + +config ARCH_SPARSEMEM_ENABLE + bool "Sparse Memory Support" + depends on SMP + select SPARSEMEM_VMEMMAP_ENABLE + +config ARCH_DISCONTIGMEM_ENABLE + bool "Discontiguous Memory Support" + depends on SMP + help + Say Y to support efficient handling of discontiguous physical memory, + for architectures which are either NUMA (Non-Uniform Memory Access) + or have huge holes in the physical address space for other reasons. + See file:Documentation/vm/numa for more. + +source "kernel/Kconfig.preempt" + + +config NUMA + bool "NUMA Support" + depends on SMP && !FLATMEM + help + Say Y to compile the kernel to support NUMA (Non-Uniform Memory + Access). This option is for configuring high-end multiprocessor + server machines. If in doubt, say N. + +config USE_PERCPU_NUMA_NODE_ID + def_bool y + depends on NUMA + +config NODES_SHIFT + int + default "7" + depends on NEED_MULTIPLE_NODES + +config RELOCATABLE + bool "Relocatable kernel" + help + This builds a kernel image that retains relocation information + so it can be loaded someplace besides the default 1MB. + The relocations make the kernel binary about 15% larger, + but are discarded at runtime + +config RELOCATION_TABLE_SIZE + hex "Relocation table size" + depends on RELOCATABLE + range 0x0 0x01000000 + default "0x80000" + help + A table of relocation data will be appended to the kernel binary + and parsed at boot to fix up the relocated kernel. + + This option allows the amount of space reserved for the table to be + adjusted, although the default of 1Mb should be ok in most cases. + + The build will fail and a valid size suggested if this is too small. + + If unsure, leave at the default value. + +config RANDOMIZE_BASE + bool "Randomize the address of the kernel image" + depends on RELOCATABLE + help + Randomizes the physical and virtual address at which the + kernel image is loaded, as a security feature that + deters exploit attempts relying on knowledge of the location + of kernel internals. + + Entropy is generated using any coprocessor 0 registers available. + + The kernel will be offset by up to RANDOMIZE_BASE_MAX_OFFSET. + + If unsure, say N. + +config RANDOMIZE_BASE_MAX_OFFSET + hex "Maximum kASLR offset" if EXPERT + depends on RANDOMIZE_BASE + range 0x0 0x20000000 + default "0x10000000" + help + When kASLR is active, this provides the maximum offset that will + be applied to the kernel image. It should be set according to the + amount of physical RAM available in the target system minus + PHYSICAL_START and must be a power of 2. + + This is limited by the size of KTEXT space, 512Mb. The default is 256MB. 
+ +config HZ + int "HZ of the short timer" + default 500 + +source "drivers/pci/Kconfig" +source "drivers/eisa/Kconfig" + +source "drivers/pcmcia/Kconfig" + +source "fs/Kconfig.binfmt" + +source "arch/sw_64/lib/Kconfig" + +endmenu + +menu "Boot options" + +config SW64_IRQ_CHIP + bool + +config USE_OF + bool "Flattened Device Tree support" + select GENERIC_IRQ_CHIP + select IRQ_DOMAIN + select SW64_IRQ_CHIP + select OF + help + Include support for flattened device tree machine descriptions. + +config SW64_BUILTIN_DTB + bool "Embed DTB in kernel image" + depends on OF + default n + help + Embeds a device tree binary in the kernel image. + +config SW64_BUILTIN_DTB_NAME + string "Built in DTB" + depends on SW64_BUILTIN_DTB + help + Set the name of the DTB to embed, leave blank to pick one + automatically based on kernel configuration. + +config EFI + bool "UEFI runtime support" + select UCS2_STRING + select EFI_RUNTIME_WRAPPERS + default y + help + This option provides support for runtime services provided + by UEFI firmware (such as non-volatile variables, realtime + clock, and platform reset). A UEFI stub is also provided to + allow the kernel to be booted as an EFI application. This + is only useful on systems that have UEFI firmware. + +config DMI + bool "Enable support for SMBIOS (DMI) tables" + depends on EFI + default y + help + This enables SMBIOS/DMI feature for systems. + + This option is only useful on systems that have UEFI firmware. + However, even with this option, the resultant kernel should + continue to boot on existing non-UEFI platforms. + + NOTE: This does *NOT* enable or encourage the use of DMI quirks, + i.e., the practice of identifying the platform via DMI to + decide whether certain workarounds for buggy hardware and/or + firmware need to be enabled. This would require the DMI subsystem + to be enabled much earlier than we do on ARM, which is non-trivial. + +config CMDLINE_BOOL + bool "Built-in kernel command line" + help + Allow for specifying boot arguments to the kernel at + build time. On some systems (e.g. embedded ones), it is + necessary or convenient to provide some or all of the + kernel boot arguments with the kernel itself (that is, + to not rely on the boot loader to provide them.) + + To compile command line arguments into the kernel, + set this option to 'Y', then fill in the + boot arguments in CONFIG_CMDLINE. + + Systems with fully functional boot loaders (i.e. non-embedded) + should leave this option set to 'N'. + +config CMDLINE + string "Built-in kernel command string" + depends on CMDLINE_BOOL + default "" + help + Enter arguments here that should be compiled into the kernel + image and used at boot time. If the boot loader provides a + command line at boot time, it is appended to this string to + form the full kernel command line, when the system boots. + + However, you can use the CONFIG_CMDLINE_OVERRIDE option to + change this behavior. + + In most cases, the command line (whether built-in or provided + by the boot loader) should specify the device for the root + file system. + +config CMDLINE_OVERRIDE + bool "Built-in command line overrides boot loader arguments" + depends on CMDLINE_BOOL + help + Set this option to 'Y' to have the kernel ignore the boot loader + command line, and use ONLY the built-in command line. + + This is used to work around broken boot loaders. This should + be set to 'N' under normal conditions. 
+ +config FORCE_MAX_ZONEORDER + int + default "16" if (HUGETLB_PAGE) + default "11" + help + The kernel memory allocator divides physically contiguous memory + blocks into "zones", where each zone is a power of two number of + pages. This option selects the largest power of two that the kernel + keeps in the memory allocator. If you need to allocate very large + blocks of physically contiguous memory, then you may need to + increase this value. + + This config option is actually maximum order plus one. For example, + a value of 11 means that the largest free memory block is 2^10 pages. + + We make sure that we can allocate upto a HugePage size for each configuration. + Hence we have : + MAX_ORDER = (PMD_SHIFT - PAGE_SHIFT) + 1 => PAGE_SHIFT - 2 + +endmenu + +source "drivers/firmware/Kconfig" + +menu "Power management options" + +source "kernel/power/Kconfig" + +source "drivers/acpi/Kconfig" + +config ARCH_SUSPEND_POSSIBLE + depends on SW64 + def_bool y + +config ARCH_HIBERNATION_POSSIBLE + depends on SW64 + def_bool y + +config SW64_SUSPEND_DEEPSLEEP_NONBOOT_CORE + depends on SUSPEND + bool "SW64 non bootcore suspend into deep sleep mode" + default n + +config SW64_SUSPEND_DEEPSLEEP_BOOTCORE + depends on SUSPEND + bool "SW64 bootcore suspend into deep sleep mode" + default n + + +source "drivers/cpuidle/Kconfig" + +source "drivers/idle/Kconfig" + +endmenu + +# DUMMY_CONSOLE may be defined in drivers/video/console/Kconfig +# but we also need it if VGA_HOSE is set +config DUMMY_CONSOLE + bool + depends on VGA_HOSE + default y + + +source "arch/sw_64/kvm/Kconfig" diff --git a/arch/sw_64/Kconfig.debug b/arch/sw_64/Kconfig.debug new file mode 100644 index 000000000000..2cd2036e0996 --- /dev/null +++ b/arch/sw_64/Kconfig.debug @@ -0,0 +1,46 @@ +# SPDX-License-Identifier: GPL-2.0 +config EARLY_PRINTK + bool "Early printk" if EXPERT + depends on SYS_HAS_EARLY_PRINTK + default y + help + This option enables special console drivers which allow the kernel + to print messages very early in the bootup process. + + This is useful for kernel debugging when your machine crashes very + early before the console code is initialized. For normal operation, + it is not recommended because it looks ugly on some machines and + doesn't cooperate with an X server. You should normally say N here, + unless you want to debug such a crash. + +config UNA_PRINT + bool "Show debug info about user unalign memory access" + default n + +config MATHEMU + tristate "Kernel FP software completion" if DEBUG_KERNEL && !SMP + default y if !DEBUG_KERNEL || SMP + help + This option is required for IEEE compliant floating point arithmetic + on the SW. The only time you would ever not say Y is to say M in + order to debug the code. Say Y unless you know what you are doing. + +config STACKTRACE_SUPPORT + bool + default y + +config SW64_RRU + bool "Enable RRU(Remote Read User)" + depends on SW64 + default n + help + Duplicate user stdout and stderr to specific space. + Do not enable it in a production kernel. + +config SW64_RRK + bool "Enable RRK(Remote Read Kernel)" + depends on SW64 + default y + help + Duplicate kernel log to specific space. + Do not enable it in a production kernel. diff --git a/arch/sw_64/Makefile b/arch/sw_64/Makefile new file mode 100644 index 000000000000..341fe6a0d9c8 --- /dev/null +++ b/arch/sw_64/Makefile @@ -0,0 +1,70 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# sw/Makefile +# +# This file is subject to the terms and conditions of the GNU General Public +# License. 
See the file "COPYING" in the main directory of this archive +# for more details. +# +# Copyright (C) 1994 by Linus Torvalds +# + +archscripts: scripts_basic + $(Q)$(MAKE) $(build)=arch/sw_64/tools relocs + +archheaders: + $(Q)$(MAKE) $(build)=arch/sw_64/kernel/syscalls all + +NM := $(NM) -B +CCVERSION := $(shell $(CC) -dumpversion) +LDFLAGS_vmlinux := -static -N #-relax +CHECKFLAGS += -D__sw__ + +ifeq ($(CONFIG_RELOCATABLE),y) +LDFLAGS_vmlinux += --emit-relocs +endif + +CHECKFLAGS += -D__sw__ +cflags-y := -pipe -ffixed-8 -mno-fp-regs #-msmall-data +cflags-y += $(call cc-option, -fno-jump-tables) + +cflags-y += $(cpuflags-y) + +KBUILD_CFLAGS += $(cflags-y) + +head-y := arch/sw_64/kernel/head.o + +core-y += arch/sw_64/kernel/ arch/sw_64/mm/ +core-y += arch/sw_64/platform/ +core-y += arch/sw_64/chip/ +core-$(CONFIG_MATHEMU) += arch/sw_64/math-emu/ +drivers-$(CONFIG_OPROFILE) += arch/sw_64/oprofile/ +libs-y += arch/sw_64/lib/ +core-$(CONFIG_KVM) += arch/sw_64/kvm/ +core-$(CONFIG_SW64_BUILTIN_DTB) += arch/sw_64/boot/dts/ +core-$(CONFIG_NET) += arch/sw_64/net/ + +# export what is needed by arch/sw_64/boot/Makefile +LIBS_Y := $(patsubst %/, %/lib.a, $(libs-y)) +export LIBS_Y + +boot := arch/sw_64/boot + +#Default target when executing make with no arguments +all: $(boot)/vmlinux.bin.gz + +$(boot)/vmlinux.bin.gz: vmlinux + $(Q)$(MAKE) $(build)=$(boot) $@ + +bootimage bootpfile bootpzfile: vmlinux + $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@ + +archclean: + $(Q)$(MAKE) $(clean)=$(boot) + $(Q)$(MAKE) $(clean)=arch/sw_64/tools + +KBUILD_IMAGE := $(boot)/vmlinux.bin + +define archhelp + echo '* boot - Compressed kernel image (arch/sw_64/boot/vmlinux.bin.gz)' +endef diff --git a/arch/sw_64/Makefile.postlink b/arch/sw_64/Makefile.postlink new file mode 100644 index 000000000000..248844d141dd --- /dev/null +++ b/arch/sw_64/Makefile.postlink @@ -0,0 +1,36 @@ +# SPDX-License-Identifier: GPL-2.0 +# =========================================================================== +# Post-link SW64 pass +# =========================================================================== +# +# 1. Insert relocations into vmlinux + +PHONY := __archpost +__archpost: + +-include include/config/auto.conf +include scripts/Kbuild.include + +CMD_RELOCS = arch/sw_64/tools/relocs +quiet_cmd_relocs = RELOCS $@ + cmd_relocs = $(CMD_RELOCS) $@ + +# `@true` prevents complaint when there is nothing to be done + +vmlinux: FORCE + @true +ifeq ($(CONFIG_RELOCATABLE),y) + $(call if_changed,relocs) +endif + +%.ko: FORCE + @true + +clean: + @true + +PHONY += FORCE clean + +FORCE: + +.PHONY: $(PHONY) diff --git a/arch/sw_64/boot/.gitignore b/arch/sw_64/boot/.gitignore new file mode 100644 index 000000000000..8a90e24c76ab --- /dev/null +++ b/arch/sw_64/boot/.gitignore @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +vmlinux diff --git a/arch/sw_64/boot/Makefile b/arch/sw_64/boot/Makefile new file mode 100644 index 000000000000..dd0976484649 --- /dev/null +++ b/arch/sw_64/boot/Makefile @@ -0,0 +1,29 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# arch/sw_64/boot/Makefile +# +# +# This file is subject to the terms and conditions of the GNU General Public +# License. See the file "COPYING" in the main directory of this archive +# for more details. +# +# Based on arch/arm64/boot/Makefile. 
+# + +OBJCOPYFLAGS_vmlinux.bin := -O binary + +targets := vmlinux vmlinux.bin vmlinux.bin.gz + +quiet_cmd_strip = STRIP $@ + cmd_strip = $(STRIP) -o $@ $< + +# Compressed kernel image +$(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin FORCE + $(call if_changed,gzip) + @echo ' Kernel $@ is ready' + +$(obj)/vmlinux: vmlinux FORCE + $(call if_changed,strip) + +$(obj)/vmlinux.bin: $(obj)/vmlinux FORCE + $(call if_changed,objcopy) diff --git a/arch/sw_64/boot/dts/Makefile b/arch/sw_64/boot/dts/Makefile new file mode 100644 index 000000000000..b9834c70be22 --- /dev/null +++ b/arch/sw_64/boot/dts/Makefile @@ -0,0 +1,21 @@ +# SPDX-License-Identifier: GPL-2.0 +# Built-in dtb + +builtindtb-y := chip3 + +ifeq ($(CONFIG_SW64_BUILTIN_DTB), y) +ifneq ($(CONFIG_SW64_BUILTIN_DTB_NAME),"") + builtindtb-y := $(patsubst "%",%,$(CONFIG_SW64_BUILTIN_DTB_NAME)) +endif + +obj-y += $(builtindtb-y).dtb.o +dtb-y := $(builtindtb-y).dtb + +# for CONFIG_OF_ALL_DTBS test +dtstree := $(srctree)/$(src) +dtb- := $(patsubst $(dtstree)/%.dts,%.dtb, $(wildcard $(dtstree)/*.dts)) +else +dtb-y := $(builtindtb-y).dtb +endif + +clean-files := *.dtb *.dtb.S diff --git a/arch/sw_64/boot/dts/chip3.dts b/arch/sw_64/boot/dts/chip3.dts new file mode 100644 index 000000000000..ce61dfe6e7bd --- /dev/null +++ b/arch/sw_64/boot/dts/chip3.dts @@ -0,0 +1,195 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Default device tree; + */ + +/dts-v1/; +/ { + compatible = "sunway,chip3"; + model = "chip3"; + #address-cells = <2>; + #size-cells = <2>; + + soc { + compatible = "simple-bus"; + #address-cells = <2>; + #size-cells = <2>; + ranges; + + clocks { + i2cclk: i2cclk { + compatible = "fixed-clock"; + clock-frequency = <25000000>; + #clock-cells = <0>; + clock-output-names = "i2cclk_25mhz"; + }; + spiclk: spiclk { + compatible = "fixed-clock"; + clock-frequency = <25000000>; + #clock-cells = <0>; + clock-output-names = "spiclk_25mhz"; + }; + + }; + + intc: interrupt-controller{ + compatible = "sw64,sw6_irq_controller"; + interrupt-controller; + #interrupt-cells = <1>; + }; + + + uart: serial0@8033 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "sw6,sunway-apb-uart"; + reg = <0x8033 0x0 0x0 0x1000>; + interrupt-parent=<&intc>; + interrupts = <3>; + reg-shift = <9>; + reg-io-width = <4>; + clock-frequency = <24000000>; + status = "okay"; + }; + + serial1@9033 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "sw6,sunway-apb-uart"; + reg = <0x9033 0x0 0x0 0x1000>; + reg-shift = <9>; + reg-io-width = <4>; + clock-frequency = <24000000>; + status = "okay"; + }; + + + i2c0@0x8031 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "snps,designware-i2c"; + reg = <0x8031 0x0 0x0 0x8000>; + clock-frequency = <100000>; + clocks = <&i2cclk>; + interrupt-parent=<&intc>; + interrupts = <5>; + status = "okay"; + }; + + i2c1@0x8034 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0x8034 0x0 0x0 0x8000>; + clock-frequency = <100000>; + clocks = <&i2cclk>; + interrupt-parent=<&intc>; + interrupts = <6>; + status = "okay"; + }; + + i2c2@0x8035 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0x8035 0x0 0x0 0x8000>; + clock-frequency = <100000>; + clocks = <&i2cclk>; + interrupt-parent=<&intc>; + interrupts = <7>; + status = "okay"; + + rtc: pcf8523@68 { + compatible = "nxp,pcf8523"; + reg = <0x68>; + }; + }; + + spi: spi@0x8032 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "sunway,chip3-spi"; + reg = <0x8032 0x0 0x0 0x8000>; + 
clocks = <&spiclk>; + interrupt-parent=<&intc>; + interrupts = <4>; + status = "okay"; + + flash@0 { + compatible = "winbond,w25q32dw", "jedec,spi-flash"; + spi-max-frequency = <25000000>; + m25p,fast-read; + spi-cpha; + spi-cpol; + poll_mode = <1>; /* poll_mode:1 interrupt mode: 0 */ + reg-io-width = <2>; + reg = <0 0 0 0 >; /* 0: flash chip selected bit */ + + partitions { + compatible = "fixed-partitions"; + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "test"; + reg = <0 0x400000>; + }; + }; + }; + + flash@1 { + compatible = "winbond,w25q32dw", "jedec,spi-flash"; + spi-max-frequency = <25000000>; + m25p,fast-read; + spi-cpha; + spi-cpol; + poll_mode = <1>; /* poll_mode:1 interrupt mode: 0 */ + reg-io-width = <2>; + reg = <1 0 0 0 >; /* 1: flash chip selected bit */ + + partitions { + compatible = "fixed-partitions"; + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "test"; + reg = <0 0x400000>; + }; + }; + }; + }; + + lpc: lpc@0x8037 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "sw,sw6b_lpc"; + reg = <0x8037 0x40000000 0x0 0x8000>; + interrupt-parent=<&intc>; + interrupts = <2>; + status = "okay"; + + }; + + gpio: gpio@8036 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "snps,sw-gpio"; + reg = <0x8036 0x0 0x0 0x8000>; + status = "okay"; + + porta: gpio-contraller@0 { + compatible = "snps,dw-apb-gpio-port"; + gpio-controller; + #gpio-cells = <2>; + snps,nr-gpios = <8>; + reg = <0 0 0 0>; + interrupt-controller; + #interrupt-cells = <2>; + interrupt-parent=<&intc>; + interrupts = <0>; + }; + }; + + }; +}; diff --git a/arch/sw_64/boot/dts/chip_vt.dts b/arch/sw_64/boot/dts/chip_vt.dts new file mode 100644 index 000000000000..f0bcf1db1d08 --- /dev/null +++ b/arch/sw_64/boot/dts/chip_vt.dts @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Default device tree; + */ + +/dts-v1/; +/ { + compatible = "sunway,chip3"; + model = "chip3"; + #address-cells = <2>; + #size-cells = <2>; + + soc { + compatible = "simple-bus"; + #address-cells = <2>; + #size-cells = <2>; + ranges; + + intc: interrupt-controller{ + compatible = "sw64,sw6_irq_vt_controller"; + interrupt-controller; + #interrupt-cells = <1>; + }; + + uart: serial0@8801 { + #address-cells = <2>; + #size-cells = <2>; + compatible = "ns16550a"; + reg = <0x8801 0x3f8 0x0 0x10>; + interrupt-parent=<&intc>; + interrupts = <12>; + reg-shift = <0>; + reg-io-width = <1>; + clock-frequency = <24000000>; + status = "okay"; + }; + }; +}; diff --git a/arch/sw_64/chip/Makefile b/arch/sw_64/chip/Makefile new file mode 100644 index 000000000000..a64818cdf35b --- /dev/null +++ b/arch/sw_64/chip/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_SW64_CHIP3) := chip3/ diff --git a/arch/sw_64/chip/chip3/Makefile b/arch/sw_64/chip/chip3/Makefile new file mode 100644 index 000000000000..2b7b5790003f --- /dev/null +++ b/arch/sw_64/chip/chip3/Makefile @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-y := chip.o i2c-lib.o + +obj-$(CONFIG_PCI) += pci-quirks.o +obj-$(CONFIG_PCI_MSI) += msi.o vt_msi.o +obj-$(CONFIG_SW64_IRQ_CHIP) += irq_chip.o +obj-$(CONFIG_CPUFREQ_DEBUGFS) += cpufreq_debugfs.o diff --git a/arch/sw_64/chip/chip3/chip.c b/arch/sw_64/chip/chip3/chip.c new file mode 100644 index 000000000000..adb4d325fc91 --- /dev/null +++ b/arch/sw_64/chip/chip3/chip.c @@ -0,0 +1,795 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/pci.h> +#include <linux/seq_file.h> +#include <linux/clocksource.h> +#include <linux/msi.h> +#include 
<linux/delay.h> +#include <asm/sw64_init.h> +#include <asm/sw64io.h> +#include <asm/pci.h> +#include <asm/core.h> +#include <asm/irq_impl.h> +#include <asm/wrperfmon.h> +#include <asm/hw_init.h> +#include "../../../../drivers/pci/pci.h" + +static u64 read_longtime(struct clocksource *cs) +{ + u64 result; + unsigned long node; + + if (IS_ENABLED(CONFIG_SW64_FPGA) || IS_ENABLED(CONFIG_SW64_SIM)) + node = 0; + else + node = __this_cpu_read(hard_node_id); + result = sw64_io_read(node, LONG_TIME); + + return result; +} + +static int longtime_enable(struct clocksource *cs) +{ + switch (cpu_desc.model) { + case CPU_SW3231: + sw64_io_write(0, GPIO_SWPORTA_DR, 0); + sw64_io_write(0, GPIO_SWPORTA_DDR, 0xff); + break; + case CPU_SW831: + sw64_io_write(0, LONG_TIME_START_EN, 0x1); + break; + default: + break; + } + + return 0; +} + +static struct clocksource clocksource_longtime = { + .name = "longtime", + .rating = 100, + .enable = longtime_enable, + .flags = CLOCK_SOURCE_IS_CONTINUOUS, + .mask = CLOCKSOURCE_MASK(64), + .shift = 0, + .mult = 0, + .read = read_longtime, +}; + +static u64 read_vtime(struct clocksource *cs) +{ + u64 result; + unsigned long node = __this_cpu_read(hard_node_id); + unsigned long vtime_addr = PAGE_OFFSET | IO_BASE | LONG_TIME; + + if (is_in_guest()) + result = rdio64(vtime_addr); + else + result = sw64_io_read(node, LONG_TIME); + + return result; +} + +static int vtime_enable(struct clocksource *cs) +{ + return 0; +} + +static struct clocksource clocksource_vtime = { + .name = "vtime", + .rating = 100, + .enable = vtime_enable, + .flags = CLOCK_SOURCE_IS_CONTINUOUS, + .mask = CLOCKSOURCE_MASK(64), + .shift = 0, + .mult = 0, + .read = read_vtime, +}; + +void setup_chip_clocksource(void) +{ +#ifdef CONFIG_SW64_SIM + clocksource_register_khz(&clocksource_longtime, 400000); /* Hardware Simulator 400MHz */ +#elif defined(CONFIG_SW64_FPGA) + clocksource_register_khz(&clocksource_longtime, 1000); /* FPGA 1MHz */ +#else + if (is_in_host()) + clocksource_register_khz(&clocksource_longtime, 25000); + else + clocksource_register_khz(&clocksource_vtime, 25000); +#endif +} + +static int chip3_get_cpu_nums(void) +{ + unsigned long trkmode; + int cpus; + + if (is_guest_or_emul()) + return 1; + + trkmode = sw64_io_read(0, TRKMODE); + trkmode = (trkmode >> 6) & 0x3; + cpus = 1 << trkmode; + + return cpus; +} + +static unsigned long chip3_get_vt_node_mem(int nodeid) +{ + return *(unsigned long *)MMSIZE; +} + +static unsigned long chip3_get_node_mem(int nodeid) +{ + unsigned long mc_config, mc_online, mc_cap, mc_num; + unsigned long node_mem; + + mc_config = sw64_io_read(nodeid, MC_CAP_CFG) & 0xf; + mc_cap = (1UL << mc_config) << 28; + mc_online = sw64_io_read(nodeid, MC_ONLINE) & 0xff; + mc_num = __kernel_ctpop(mc_online); + node_mem = mc_cap * mc_num; + + return node_mem; +} + +static void chip3_setup_core_start(struct cpumask *cpumask) +{ + int i, j, cpus; + unsigned long coreonline; + + cpus = chip3_get_cpu_nums(); + for (i = 0; i < cpus; i++) { + coreonline = sw64_io_read(i, CORE_ONLINE); + for (j = 0; j < 32; j++) { + if (coreonline & (1UL << j)) + cpumask_set_cpu(i * 32 + j, cpumask); + } + } + +} + +int chip_pcie_configure(struct pci_controller *hose) +{ + struct pci_dev *dev; + struct pci_bus *bus, *top; + struct list_head *next; + unsigned int max_read_size, smallest_max_payload; + int max_payloadsize, iov_bus = 0; + unsigned long rc_index, node; + unsigned long piuconfig0, value; + unsigned int pcie_caps_offset; + unsigned int rc_conf_value; + u16 devctl, new_values; + bool rc_ari_disabled = false, found =
false; + + node = hose->node; + rc_index = hose->index; + smallest_max_payload = read_rc_conf(node, rc_index, RC_EXP_DEVCAP); + smallest_max_payload &= PCI_EXP_DEVCAP_PAYLOAD; + + top = hose->bus; + bus = top; + next = top->devices.next; + + for (;;) { + if (next == &bus->devices) { + /* end of this bus, go up or finish */ + if (bus == top) + break; + next = bus->self->bus_list.next; + bus = bus->self->bus; + continue; + } + dev = list_entry(next, struct pci_dev, bus_list); + if (dev->subordinate) { + /* this is a pci-pci bridge, do its devices next */ + next = dev->subordinate->devices.next; + bus = dev->subordinate; + } else + next = dev->bus_list.next; + + if (!found) { + if (pci_is_root_bus(dev->bus)) { + if (list_empty(&dev->subordinate->devices)) + rc_ari_disabled = true; + } else { + if (!pci_ari_enabled(dev->bus)) { + rc_ari_disabled = true; + found = true; + } + } + } + +#ifdef CONFIG_PCI_IOV + if (dev->is_physfn) + iov_bus += dev->sriov->max_VF_buses - dev->bus->number; +#endif + + /* Query device PCIe capability register */ + pcie_caps_offset = dev->pcie_cap; + if (pcie_caps_offset == 0) + continue; + max_payloadsize = dev->pcie_mpss; + if (max_payloadsize < smallest_max_payload) + smallest_max_payload = max_payloadsize; + } + + if (rc_ari_disabled) { + rc_conf_value = read_rc_conf(node, rc_index, RC_EXP_DEVCTL2); + rc_conf_value &= ~PCI_EXP_DEVCTL2_ARI; + write_rc_conf(node, rc_index, RC_EXP_DEVCTL2, rc_conf_value); + } else { + rc_conf_value = read_rc_conf(node, rc_index, RC_EXP_DEVCTL2); + rc_conf_value |= PCI_EXP_DEVCTL2_ARI; + write_rc_conf(node, rc_index, RC_EXP_DEVCTL2, rc_conf_value); + } + + rc_conf_value = read_rc_conf(node, rc_index, RC_EXP_DEVCAP); + rc_conf_value &= PCI_EXP_DEVCAP_PAYLOAD; + max_payloadsize = rc_conf_value; + if (max_payloadsize < smallest_max_payload) + smallest_max_payload = max_payloadsize; + + max_read_size = 0x2; /* Limit to 512B */ + value = read_rc_conf(node, rc_index, RC_EXP_DEVCTL); + value &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ); + value |= (max_read_size << 12) | (smallest_max_payload << 5); + write_rc_conf(node, rc_index, RC_EXP_DEVCTL, value); + new_values = (max_read_size << 12) | (smallest_max_payload << 5); + + piuconfig0 = read_piu_ior0(node, rc_index, PIUCONFIG0); + piuconfig0 &= ~(0x7fUL << 9); + if (smallest_max_payload == 0x2) { + piuconfig0 |= (0x20UL << 9); + write_piu_ior0(node, rc_index, PIUCONFIG0, piuconfig0); + } else { + piuconfig0 |= (0x40UL << 9); + write_piu_ior0(node, rc_index, PIUCONFIG0, piuconfig0); + } + + printk("Node%ld RC%ld MPSS %luB, MRRS %luB, Piuconfig0 %#lx, ARI %s\n", + node, rc_index, (1UL << smallest_max_payload) << 7, + (1UL << max_read_size) << 7, piuconfig0, + rc_ari_disabled ? "disabled" : "enabled"); + + /* Now, set the max_payload_size for all devices to that value. 
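+ * The second bus walk below writes this agreed value (plus the 512B + * MRRS chosen above) into each device's PCI_EXP_DEVCTL, so the root + * complex and every endpoint end up with a consistent MPS/MRRS setting.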
*/ + bus = top; + next = top->devices.next; + for (;;) { + if (next == &bus->devices) { + /* end of this bus, go up or finish */ + if (bus == top) + break; + next = bus->self->bus_list.next; + bus = bus->self->bus; + continue; + } + dev = list_entry(next, struct pci_dev, bus_list); + if (dev->subordinate) { + /* this is a pci-pci bridge, do its devices next */ + next = dev->subordinate->devices.next; + bus = dev->subordinate; + } else + next = dev->bus_list.next; + + pcie_caps_offset = dev->pcie_cap; + if (pcie_caps_offset == 0) + continue; + + pci_read_config_word(dev, pcie_caps_offset + PCI_EXP_DEVCTL, &devctl); + devctl &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ); + devctl |= new_values; + pci_write_config_word(dev, pcie_caps_offset + PCI_EXP_DEVCTL, devctl); + } + + return iov_bus; +} + +static int chip3_check_pci_vt_linkup(unsigned long node, unsigned long index) +{ + if (node == 0 && index == 0) + return 0; + else + return 1; +} + +static int chip3_check_pci_linkup(unsigned long node, unsigned long index) +{ + unsigned long rc_debug; + +#ifdef CONFIG_SW64_FPGA //for PCIE4.0 + printk("waiting for link up...\n"); + if (index == 0) + sw64_io_write(node, PIU_TOP0_CONFIG, 0x10011); + else + sw64_io_write(node, PIU_TOP1_CONFIG, 0x10011); + mdelay(10); + rc_debug = read_piu_ior1(node, index, RCDEBUGINF1); + while (!(rc_debug & 0x1)) { + udelay(10); + rc_debug = read_piu_ior1(node, index, RCDEBUGINF1); + } + mdelay(10); +#endif +#ifdef CONFIG_SW64_SIM + printk("waiting for link up...\n"); + rc_debug = read_piu_ior1(node, index, RCDEBUGINF1); + while (!(rc_debug & 0x1)) { + udelay(10); + rc_debug = read_piu_ior1(node, index, RCDEBUGINF1); + } +#endif + rc_debug = read_piu_ior1(node, index, RCDEBUGINF1); + + return !(rc_debug & 0x1); +} + +static void chip3_set_rc_piu(unsigned long node, unsigned long index) +{ + unsigned int i, value; + u32 rc_misc_ctrl; + + if (is_guest_or_emul()) + return; + + /* configure RC, set PCI-E root controller */ + write_rc_conf(node, index, RC_COMMAND, 0x00100007); + write_rc_conf(node, index, RC_PORT_LINK_CTL, 0x1f0020); + write_rc_conf(node, index, RC_EXP_DEVCTL, 0x2850); + write_rc_conf(node, index, RC_EXP_DEVCTL2, 0x6); + write_rc_conf(node, index, RC_ORDER_RULE_CTL, 0x0100); + + if (IS_ENABLED(CONFIG_SUSPEND) && IS_ENABLED(CONFIG_SW64_SIM)) { + value = read_rc_conf(node, index, RC_LINK_STAT); + value |= 0x3; + write_rc_conf(node, index, RC_LINK_STAT, value); + } + + /* enable DBI_RO_WR_EN */ + rc_misc_ctrl = read_rc_conf(node, index, RC_MISC_CONTROL_1); + write_rc_conf(node, index, RC_MISC_CONTROL_1, rc_misc_ctrl | 0x1); + + /* fix up DEVICE_ID_VENDOR_ID register */ + value = (PCI_DEVICE_ID_CHIP3 << 16) | PCI_VENDOR_ID_JN; + write_rc_conf(node, index, RC_VENDOR_ID, value); + + /* set PCI-E root class code */ + value = read_rc_conf(node, index, RC_REVISION_ID); + write_rc_conf(node, index, RC_REVISION_ID, (PCI_CLASS_BRIDGE_HOST << 16) | value); + + /* disable DBI_RO_WR_EN */ + write_rc_conf(node, index, RC_MISC_CONTROL_1, rc_misc_ctrl); + + write_rc_conf(node, index, RC_PRIMARY_BUS, 0xffffff); + write_piu_ior0(node, index, PIUCONFIG0, 0x38056); + write_piu_ior1(node, index, PIUCONFIG1, 0x2); + write_piu_ior1(node, index, ERRENABLE, -1); + + /* set DMA offset value PCITODMA_OFFSET */ + write_piu_ior0(node, index, EPDMABAR, PCITODMA_OFFSET); + if (IS_ENABLED(CONFIG_PCI_MSI)) { + write_piu_ior0(node, index, PIUCONFIG0, 0x38076); + write_piu_ior0(node, index, MSIADDR, MSIX_MSG_ADDR); + for (i = 0; i < 256; i++) + write_piu_ior0(node, index, MSICONFIG0 + 
(i << 7), 0); + } +} + +static void chip3_set_intx(unsigned long node, unsigned long index, + unsigned long int_conf) +{ + if (is_guest_or_emul()) + return; + + write_piu_ior0(node, index, INTACONFIG, int_conf | (0x8UL << 10)); + write_piu_ior0(node, index, INTBCONFIG, int_conf | (0x4UL << 10)); + write_piu_ior0(node, index, INTCCONFIG, int_conf | (0x2UL << 10)); + write_piu_ior0(node, index, INTDCONFIG, int_conf | (0x1UL << 10)); +} + +static unsigned long chip3_get_rc_enable(unsigned long node) +{ + unsigned long rc_enable; + + if (is_guest_or_emul()) + return 1; + + if (!IS_ENABLED(CONFIG_SW64_ASIC)) { + rc_enable = 0x1; + sw64_io_write(node, IO_START, rc_enable); + } + rc_enable = sw64_io_read(node, IO_START); + + return rc_enable; +} + +static int chip3_map_irq(const struct pci_dev *dev, u8 slot, u8 pin) +{ + struct pci_controller *hose = dev->sysdata; + + return hose->int_irq; +} + +extern struct pci_controller *hose_head, **hose_tail; +static void sw6_handle_intx(unsigned int offset) +{ + struct pci_controller *hose; + unsigned long value; + + hose = hose_head; + for (hose = hose_head; hose; hose = hose->next) { + value = read_piu_ior0(hose->node, hose->index, INTACONFIG + (offset << 7)); + if (value >> 63) { + value = value & (~(1UL << 62)); + write_piu_ior0(hose->node, hose->index, INTACONFIG + (offset << 7), value); + handle_irq(hose->int_irq); + value = value | (1UL << 62); + write_piu_ior0(hose->node, hose->index, INTACONFIG + (offset << 7), value); + } + if (hose->iommu_enable) { + value = read_piu_ior0(hose->node, hose->index, IOMMUEXCPT_STATUS); + if (value >> 63) + handle_irq(hose->int_irq); + } + } +} + +static void chip3_device_interrupt(unsigned long irq_info) +{ + unsigned int i; + + if (is_guest_or_emul()) { + handle_irq(irq_info); + return; + } + + for (i = 0; i < 4; i++) { + if ((irq_info >> i) & 0x1) + sw6_handle_intx(i); + } +} + +static void set_devint_wken(int node, int val) +{ + sw64_io_write(node, DEVINT_WKEN, val); + sw64_io_write(node, DEVINTWK_INTEN, 0x0); +} + +static void clear_rc_status(int node, int rc) +{ + unsigned int val, status; + + val = 0x10000; + do { + write_rc_conf(node, rc, RC_STATUS, val); + mb(); + status = read_rc_conf(node, rc, RC_STATUS); + } while (status >> 16); +} + +static void chip3_suspend(int wake) +{ + unsigned long val; + unsigned int val_32; + unsigned long rc_start; + int node, rc, index, cpus; + + cpus = chip3_get_cpu_nums(); + for (node = 0; node < cpus; node++) { + rc = -1; + rc_start = sw64_io_read(node, IO_START); + index = ffs(rc_start); + while (index) { + rc += index; + if (wake) { + val_32 = read_rc_conf(node, rc, RC_CONTROL); + val_32 &= ~0x8; + write_rc_conf(node, rc, RC_CONTROL, val_32); + + set_devint_wken(node, 0x0); + val = 0x8000000000000000UL; + write_piu_ior0(node, rc, PMEINTCONFIG, val); + write_piu_ior0(node, rc, PMEMSICONFIG, val); + + clear_rc_status(node, rc); + } else { + val_32 = read_rc_conf(node, rc, RC_CONTROL); + val_32 |= 0x8; + write_rc_conf(node, rc, RC_CONTROL, val_32); + + clear_rc_status(node, rc); + set_devint_wken(node, 0x1f0); +#ifdef CONFIG_PCI_MSI //USE MSI + val_32 = read_rc_conf(node, rc, RC_COMMAND); + val_32 |= 0x400; + write_rc_conf(node, rc, RC_COMMAND, val_32); + val_32 = read_rc_conf(node, rc, RC_MSI_CONTROL); + val_32 |= 0x10000; + write_rc_conf(node, rc, RC_MSI_CONTROL, val_32); + val = 0x4000000000000000UL; + write_piu_ior0(node, rc, PMEMSICONFIG, val); +#else //USE INT + val = 0x4000000000000400UL; + write_piu_ior0(node, rc, PMEINTCONFIG, val); +#endif + } + rc_start = 
rc_start >> index; + index = ffs(rc_start); + } + } +} + +static void chip3_hose_init(struct pci_controller *hose) +{ + unsigned long pci_io_base; + + hose->sparse_mem_base = 0; + hose->sparse_io_base = 0; + pci_io_base = IO_BASE | (hose->node << IO_NODE_SHIFT) + | PCI_BASE | (hose->index << IO_RC_SHIFT); + + hose->dense_mem_base = pci_io_base; + hose->dense_io_base = pci_io_base | PCI_LEGACY_IO; + hose->ep_config_space_base = PAGE_OFFSET | pci_io_base | PCI_EP_CFG; + hose->rc_config_space_base = PAGE_OFFSET | pci_io_base | PCI_RC_CFG; + + if (is_in_host()) + hose->mem_space->start = pci_io_base + PCI_32BIT_MEMIO; + else + hose->mem_space->start = pci_io_base + PCI_32BIT_VT_MEMIO; + hose->mem_space->end = hose->mem_space->start + PCI_32BIT_MEMIO_SIZE - 1; + hose->mem_space->name = "pci memory space"; + hose->mem_space->flags = IORESOURCE_MEM; + + if (request_resource(&iomem_resource, hose->mem_space) < 0) + pr_err("Failed to request MEM on hose %ld\n", hose->index); + hose->pre_mem_space->start = pci_io_base | PCI_64BIT_MEMIO; + hose->pre_mem_space->end = hose->pre_mem_space->start + PCI_64BIT_MEMIO_SIZE - 1; + hose->pre_mem_space->name = "pci pre mem space"; + hose->pre_mem_space->flags = IORESOURCE_MEM | IORESOURCE_PREFETCH | IORESOURCE_MEM_64; + + if (request_resource(&iomem_resource, hose->pre_mem_space) < 0) + pr_err("Failed to request 64bit MEM on hose %ld\n", hose->index); + hose->io_space->start = pci_io_base | PCI_LEGACY_IO; + hose->io_space->end = hose->io_space->start + PCI_LEGACY_IO_SIZE - 1; + hose->io_space->name = "pci io space"; + hose->io_space->flags = IORESOURCE_IO; + + if (request_resource(&ioport_resource, hose->io_space) < 0) + pr_err("Failed to request IO on hose %ld\n", hose->index); + hose->busn_space->name = "PCI busn"; + hose->busn_space->start = 0xff; + hose->busn_space->end = 0xff; + hose->busn_space->flags = IORESOURCE_BUS; + hose->first_busno = hose->self_busno = hose->busn_space->start; + hose->last_busno = hose->busn_space->end; + + if (is_in_host()) { + if (IS_ENABLED(CONFIG_PCI_MSI)) + memset(hose->piu_msiconfig, 0, 256/8); + } +}; + +static void chip3_init_ops_fixup(void) +{ + if (is_guest_or_emul()) { + sw64_chip_init->early_init.get_node_mem = chip3_get_vt_node_mem; + sw64_chip_init->pci_init.check_pci_linkup = chip3_check_pci_vt_linkup; + } +}; + +static void chip3_ops_fixup(void) +{ + if (is_guest_or_emul()) + sw64_chip->suspend = NULL; +}; + +static struct sw64_chip_init_ops chip3_chip_init_ops = { + .early_init = { + .setup_core_start = chip3_setup_core_start, + .get_node_mem = chip3_get_node_mem, + }, + .pci_init = { + .map_irq = chip3_map_irq, + .get_rc_enable = chip3_get_rc_enable, + .hose_init = chip3_hose_init, + .set_rc_piu = chip3_set_rc_piu, + .check_pci_linkup = chip3_check_pci_linkup, + .set_intx = chip3_set_intx, + }, + .fixup = chip3_init_ops_fixup, +}; + +static struct sw64_chip_ops chip3_chip_ops = { + .get_cpu_num = chip3_get_cpu_nums, + .suspend = chip3_suspend, + .fixup = chip3_ops_fixup, +}; + +void __init sw64_setup_chip_ops(void) +{ + sw64_chip_init = &chip3_chip_init_ops; + sw64_chip = &chip3_chip_ops; +} + +/* Performance counter hook. A module can override this to do something useful. 
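+ * The default handler below merely counts the interrupt as spurious; + * perf_irq is exported so that a profiling module can install its own + * handler in its place.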
*/ +static void dummy_perf(unsigned long vector, struct pt_regs *regs) +{ + irq_err_count++; + pr_crit("Performance counter interrupt!\n"); +} + +void (*perf_irq)(unsigned long, struct pt_regs*) = dummy_perf; +EXPORT_SYMBOL(perf_irq); + +#ifdef CONFIG_PCI_MSI +extern void handle_pci_msi_interrupt(unsigned long type, + unsigned long vector, + unsigned long pci_msi1_addr); +#else +void handle_pci_msi_interrupt(unsigned long type, + unsigned long vector, unsigned long pci_msi1_addr) +{ + pr_warn("SW arch disable CONFIG_PCI_MSI option.\n"); +} +#endif + +static void handle_fault_int(void) +{ + int node; + + node = __this_cpu_read(hard_node_id); + printk("enter fault int, si_fault_stat = %#lx\n", + sw64_io_read(node, SI_FAULT_STAT)); + sw64_io_write(node, SI_FAULT_INT_EN, 0); + sw64_io_write(node, DLI_RLTD_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG0_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG1_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG2_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG3_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG4_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG5_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG6_FAULT_INTEN, 0); + sw64_io_write(node, DUAL_CG7_FAULT_INTEN, 0); +} + +static void handle_mt_int(void) +{ + printk("enter mt int\n"); +} + +static void handle_nmi_int(void) +{ + printk("enter nmi int\n"); +} + +static void handle_dev_int(struct pt_regs *regs) +{ + unsigned long config_val, val, stat; + int node = 0; + unsigned int hwirq; + + config_val = sw64_io_read(node, DEV_INT_CONFIG); + val = config_val & (~(1UL << 8)); + sw64_io_write(node, DEV_INT_CONFIG, val); + stat = sw64_io_read(node, MCU_DVC_INT); + + while (stat) { + hwirq = ffs(stat) - 1; + handle_domain_irq(NULL, hwirq, regs); + stat &= ~(1UL << hwirq); + } + /*do handle irq */ + + sw64_io_write(node, DEV_INT_CONFIG, config_val); +} + +void handle_chip_irq(unsigned long type, unsigned long vector, + unsigned long irq_arg, struct pt_regs *regs) +{ + struct pt_regs *old_regs; + + if (is_guest_or_emul()) { + if ((type & 0xffff) > 15) { + vector = type; + if (vector == 16) + type = INT_INTx; + else + type = INT_MSI; + } + } + + switch (type & 0xffff) { + case INT_MSI: + old_regs = set_irq_regs(regs); + handle_pci_msi_interrupt(type, vector, irq_arg); + set_irq_regs(old_regs); + return; + case INT_INTx: + old_regs = set_irq_regs(regs); + chip3_device_interrupt(vector); + set_irq_regs(old_regs); + return; + + case INT_IPI: +#ifdef CONFIG_SMP + handle_ipi(regs); + return; +#else + irq_err_count++; + pr_crit("Interprocessor interrupt? You must be kidding!\n"); +#endif + break; + case INT_RTC: + old_regs = set_irq_regs(regs); + sw64_timer_interrupt(); + set_irq_regs(old_regs); + return; + case INT_VT_SERIAL: + old_regs = set_irq_regs(regs); + handle_irq(type); + set_irq_regs(old_regs); + return; + case INT_PC0: + perf_irq(PERFMON_PC0, regs); + return; + case INT_PC1: + perf_irq(PERFMON_PC1, regs); + return; + case INT_DEV: + old_regs = set_irq_regs(regs); + handle_dev_int(regs); + set_irq_regs(old_regs); + return; + case INT_FAULT: + old_regs = set_irq_regs(regs); + handle_fault_int(); + set_irq_regs(old_regs); + return; + case INT_MT: + old_regs = set_irq_regs(regs); + handle_mt_int(); + set_irq_regs(old_regs); + return; + case INT_NMI: + old_regs = set_irq_regs(regs); + handle_nmi_int(); + set_irq_regs(old_regs); + return; + default: + pr_crit("Hardware intr %ld %lx? 
uh?\n", type, vector); + } + pr_crit("PC = %016lx PS = %04lx\n", regs->pc, regs->ps); +} + +/* + * Early fix up the chip3 Root Complex settings + */ +static void chip3_pci_fixup_root_complex(struct pci_dev *dev) +{ + int i; + struct pci_bus *bus = dev->bus; + struct pci_controller *hose = bus->sysdata; + + hose->self_busno = hose->busn_space->start; + + if (likely(bus->number == hose->self_busno)) { + if (IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)) { + /* Check Root Complex port again */ + dev->is_hotplug_bridge = 0; + dev->current_state = PCI_D0; + } + + dev->class &= 0xff; + dev->class |= PCI_CLASS_BRIDGE_HOST << 8; + for (i = 0; i < PCI_NUM_RESOURCES; i++) { + dev->resource[i].start = 0; + dev->resource[i].end = 0; + dev->resource[i].flags = 0; + } + } + atomic_inc(&dev->enable_cnt); +} + +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JN, PCI_DEVICE_ID_CHIP3, chip3_pci_fixup_root_complex); diff --git a/arch/sw_64/chip/chip3/cpufreq_debugfs.c b/arch/sw_64/chip/chip3/cpufreq_debugfs.c new file mode 100644 index 000000000000..3b152f84454f --- /dev/null +++ b/arch/sw_64/chip/chip3/cpufreq_debugfs.c @@ -0,0 +1,153 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/debugfs.h> +#include <linux/list.h> +#include <linux/kernel.h> +#include <linux/slab.h> +#include <linux/mm_types.h> +#include <linux/delay.h> +#include <linux/uaccess.h> +#include <linux/mm.h> +#include <asm/sw64io.h> +#include <asm/hw_init.h> +#include <asm/debug.h> + +#define CLK_PRT 0x1UL +#define CORE_CLK0_V (0x1UL << 1) +#define CORE_CLK0_R (0x1UL << 2) +#define CORE_CLK2_V (0x1UL << 15) +#define CORE_CLK2_R (0x1UL << 16) + +#define CLK_LV1_SEL_PRT 0x1UL +#define CLK_LV1_SEL_MUXA (0x1UL << 2) +#define CLK_LV1_SEL_MUXB (0x1UL << 3) + +#define CORE_PLL0_CFG_SHIFT 4 +#define CORE_PLL2_CFG_SHIFT 18 + +static int cpu_freq[16] = { + 200, 1200, 1800, 1900, + 1950, 2000, 2050, 2100, + 2150, 2200, 2250, 2300, + 2350, 2400, 2450, 2500 +}; + +static int cpufreq_show(struct seq_file *m, void *v) +{ + int i; + u64 val; + + val = sw64_io_read(0, CLK_CTL); + val = val >> CORE_PLL2_CFG_SHIFT; + + for (i = 0; i < sizeof(cpu_freq)/sizeof(int); i++) { + if (cpu_freq[val] == cpu_freq[i]) + seq_printf(m, "[%d] ", cpu_freq[i]); + else + seq_printf(m, "%d ", cpu_freq[i]); + } + seq_puts(m, "\n"); + + return 0; +} + +static int cpufreq_open(struct inode *inode, struct file *file) +{ + return single_open(file, cpufreq_show, NULL); +} + +static ssize_t cpufreq_set(struct file *file, const char __user *user_buf, + size_t len, loff_t *ppos) +{ + char buf[5]; + size_t size; + int cf, i, err, index; + u64 val; + + size = min(sizeof(buf) - 1, len); + if (copy_from_user(buf, user_buf, size)) + return -EFAULT; + buf[size] = '\0'; + + err = kstrtoint(buf, 10, &cf); + if (err) + return err; + + index = -1; + for (i = 0; i < sizeof(cpu_freq)/sizeof(int); i++) { + if (cf == cpu_freq[i]) { + index = i; + break; + } + } + + if (index < 0) + return -EINVAL; + + /* Set CLK_CTL PLL2 */ + sw64_io_write(0, CLK_CTL, CORE_CLK2_R | CORE_CLK2_V | CLK_PRT); + sw64_io_write(1, CLK_CTL, CORE_CLK2_R | CORE_CLK2_V | CLK_PRT); + val = sw64_io_read(0, CLK_CTL); + + sw64_io_write(0, CLK_CTL, val | index << CORE_PLL2_CFG_SHIFT); + sw64_io_write(1, CLK_CTL, val | index << CORE_PLL2_CFG_SHIFT); + + udelay(1); + + sw64_io_write(0, CLK_CTL, CORE_CLK2_V | CLK_PRT + | index << CORE_PLL2_CFG_SHIFT); + sw64_io_write(1, CLK_CTL, CORE_CLK2_V | CLK_PRT + | index << CORE_PLL2_CFG_SHIFT); + val = sw64_io_read(0, CLK_CTL); + + /* LV1 select PLL1/PLL2 */ + sw64_io_write(0, CLU_LV1_SEL, CLK_LV1_SEL_MUXA 
| CLK_LV1_SEL_PRT); + sw64_io_write(1, CLU_LV1_SEL, CLK_LV1_SEL_MUXA | CLK_LV1_SEL_PRT); + + /* Set CLK_CTL PLL0 */ + sw64_io_write(0, CLK_CTL, val | CORE_CLK0_R | CORE_CLK0_V); + sw64_io_write(1, CLK_CTL, val | CORE_CLK0_R | CORE_CLK0_V); + + sw64_io_write(0, CLK_CTL, val | CORE_CLK0_R | CORE_CLK0_V + | index << CORE_PLL0_CFG_SHIFT); + sw64_io_write(1, CLK_CTL, val | CORE_CLK0_R | CORE_CLK0_V + | index << CORE_PLL0_CFG_SHIFT); + + udelay(1); + + sw64_io_write(0, CLK_CTL, val | CORE_CLK0_V + | index << CORE_PLL0_CFG_SHIFT); + sw64_io_write(1, CLK_CTL, val | CORE_CLK0_V + | index << CORE_PLL0_CFG_SHIFT); + + /* LV1 select PLL0/PLL1 */ + sw64_io_write(0, CLU_LV1_SEL, CLK_LV1_SEL_MUXB | CLK_LV1_SEL_PRT); + sw64_io_write(1, CLU_LV1_SEL, CLK_LV1_SEL_MUXB | CLK_LV1_SEL_PRT); + + return len; +} + +static const struct file_operations set_cpufreq_fops = { + .open = cpufreq_open, + .read = seq_read, + .write = cpufreq_set, + .llseek = seq_lseek, + .release = single_release, +}; + +static int __init cpufreq_debugfs_init(void) +{ + struct dentry *cpufreq_entry; + + if (!sw64_debugfs_dir) + return -ENODEV; + + cpufreq_entry = debugfs_create_file("cpufreq", 0600, + sw64_debugfs_dir, NULL, + &set_cpufreq_fops); + if (!cpufreq_entry) + return -ENOMEM; + + return 0; +} +late_initcall(cpufreq_debugfs_init); diff --git a/arch/sw_64/chip/chip3/i2c-lib.c b/arch/sw_64/chip/chip3/i2c-lib.c new file mode 100644 index 000000000000..581f2b3d81a1 --- /dev/null +++ b/arch/sw_64/chip/chip3/i2c-lib.c @@ -0,0 +1,425 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * arch/sw_64/chip/chip3/i2c-lib.c + * + * Copyright (C) 2020 Platform Software + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * The drivers in this file are synchronous/blocking. In addition, + * use poll mode to read/write slave devices on the I2C bus instead + * of the interrupt mode. + */ + +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/delay.h> +#include <linux/errno.h> +#include <linux/fb.h> +#include <linux/init.h> +#include <linux/device.h> + +#define CPLD_BUSNR 2 + +#ifndef _I2C_DEBUG_FLAG_ +#define _I2C_DEBUG_FLAG_ 0 +#endif + +#define ICC_DEBUG(fmt, ...) 
\ + do { \ + if (_I2C_DEBUG_FLAG_) { \ + printk(fmt, ##__VA_ARGS__); \ + } \ + } while (0) + +#define IC_CLK_KHZ 25000 + +/* I2C register definitions */ +#define DW_IC_CON 0x0 +#define DW_IC_STATUS 0x3800 +#define DW_IC_DATA_CMD 0x0800 +#define DW_IC_TAR 0x00200 +#define DW_IC_ENABLE 0x3600 +#define DW_IC_CMD 0x0100 +#define DW_IC_STOP 0x0200 +#define DW_IC_SDA_HOLD 0x3e00 +#define DW_IC_SDA_SETUP 0x4a00 +#define DW_IC_SS_SCL_HCNT 0x0a00 +#define DW_IC_SS_SCL_LCNT 0x0c00 +#define DW_IC_FS_SCL_HCNT 0x0e00 +#define DW_IC_FS_SCL_LCNT 0x1000 +#define DW_IC_TX_TL 0x1e00 +#define DW_IC_RX_TL 0x1c00 +#define DW_IC_INTR_MASK 0x1800 + +#define MAX_RETRY 10000000 + +#define DW_IC_STATUS_ACTIVITY 0x1 +#define DW_IC_STATUS_TFNF 0x2 +#define DW_IC_STATUS_TFE 0x4 +#define DW_IC_STATUS_RFNE 0x8 +#define DW_IC_STATUS_RFF 0x10 + +#define DW_IC_CON_MASTER 0x1 +#define DW_IC_CON_SPEED_STD 0x2 +#define DW_IC_CON_SPEED_FAST 0x4 +#define DW_IC_CON_10BITADDR_MASTER 0x10 +#define DW_IC_CON_RESTART_EN 0x20 +#define DW_IC_CON_SLAVE_DISABLE 0x40 + +#define INTEL_MID_STD_CFG (DW_IC_CON_MASTER | \ + DW_IC_CON_SLAVE_DISABLE | \ + DW_IC_CON_RESTART_EN) + +#define DW_IC_INTR_RX_UNDER 0x001 +#define DW_IC_INTR_RX_OVER 0x002 +#define DW_IC_INTR_RX_FULL 0x004 +#define DW_IC_INTR_TX_OVER 0x008 +#define DW_IC_INTR_TX_EMPTY 0x010 +#define DW_IC_INTR_RD_REQ 0x020 +#define DW_IC_INTR_TX_ABRT 0x040 +#define DW_IC_INTR_RX_DONE 0x080 +#define DW_IC_INTR_ACTIVITY 0x100 +#define DW_IC_INTR_STOP_DET 0x200 +#define DW_IC_INTR_START_DET 0x400 +#define DW_IC_INTR_GEN_CALL 0x800 + +#define DW_IC_INTR_DEFAULT_MASK (DW_IC_INTR_RX_FULL | \ + DW_IC_INTR_TX_EMPTY | \ + DW_IC_INTR_TX_ABRT | \ + DW_IC_INTR_STOP_DET) + +enum i2c_bus_operation { + I2C_BUS_READ, + I2C_BUS_WRITE, +}; + +static uint64_t m_i2c_base_address; + +/* + * This function get I2Cx controller base address + * + * @param i2c_controller_index Bus Number of I2C controller. + * @return I2C BAR. + */ +uint64_t get_i2c_bar_addr(uint8_t i2c_controller_index) +{ + uint64_t base_addr = 0; + + if (i2c_controller_index == 0) + base_addr = PAGE_OFFSET | IO_BASE | IIC0_BASE; + else if (i2c_controller_index == 1) + base_addr = PAGE_OFFSET | IO_BASE | IIC1_BASE; + else if (i2c_controller_index == 2) + base_addr = PAGE_OFFSET | IO_BASE | IIC2_BASE; + + return base_addr; +} + +void write_cpu_i2c_controller(uint64_t offset, uint32_t data) +{ + mb(); + *(volatile uint32_t *)(m_i2c_base_address + offset) = data; +} + +uint32_t read_cpu_i2c_controller(uint64_t offset) +{ + uint32_t data; + + data = *(volatile uint32_t *)(m_i2c_base_address + offset); + mb(); + return data; +} + +static int poll_for_status_set0(uint16_t status_bit) +{ + uint64_t retry = 0; + uint32_t temp = read_cpu_i2c_controller(DW_IC_STATUS); + + temp = read_cpu_i2c_controller(DW_IC_STATUS); + + while (retry < MAX_RETRY) { + if (read_cpu_i2c_controller(DW_IC_STATUS) & status_bit) + break; + retry++; + } + + if (retry == MAX_RETRY) + return -ETIME; + + return 0; +} + +static uint32_t i2c_dw_scl_lcnt(uint32_t ic_clk, uint32_t t_low, + uint32_t tf, uint32_t offset) +{ + /* + * Conditional expression: + * + * IC_[FS]S_SCL_LCNT + 1 >= IC_CLK * (t_low + tf) + * + * DW I2C core starts counting the SCL CNTs for the LOW period + * of the SCL clock (t_low) as soon as it pulls the SCL line. + * In order to meet the t_low timing spec, we need to take into + * account the fall time of SCL signal (tf). Default tf value + * should be 0.3 us, for safety. 
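+ * + * With ic_clk in kHz and t_low/tf in ns, ic_clk * (t_low + tf) is in + * millionths of a clock cycle, so adding 500000 before the division + * rounds the result to the nearest whole cycle instead of truncating.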
+ */ + return ((ic_clk * (t_low + tf) + 500000) / 1000000) - 1 + offset; +} + +static uint32_t i2c_dw_scl_hcnt(uint32_t ic_clk, uint32_t t_symbol, + uint32_t tf, uint32_t cond, uint32_t offset) +{ + /* + * DesignWare I2C core doesn't seem to have a solid strategy to meet + * the tHD;STA timing spec. Configuring _HCNT based on tHIGH spec + * will result in violation of the tHD;STA spec. + */ + if (cond) + /* + * Conditional expression: + * + * IC_[FS]S_SCL_HCNT + (1+4+3) >= IC_CLK * tHIGH + * + * This is based on the DW manuals, and represents an ideal + * configuration. The resulting I2C bus speed will be faster + * than any of the others. + * + * If your hardware is free from tHD;STA issue, try this one. + */ + return (ic_clk * t_symbol + 500000) / 1000000 - 8 + offset; + /* + * Conditional expression: + * + * IC_[FS]S_SCL_HCNT + 3 >= IC_CLK * (tHD;STA + tf) + * + * This is just an experimental rule; the tHD;STA period turned + * out to be proportional to (_HCNT + 3). With this setting, + * we could meet both tHIGH and tHD;STA timing specs. + * + * If unsure, you'd better take this alternative. + * + * The reason why we need to take into account "tf" here, + * is the same as described in i2c_dw_scl_lcnt(). + */ + return (ic_clk * (t_symbol + tf) + 500000) / 1000000 - 3 + offset; +} + +static int wait_for_cpu_i2c_bus_busy(void) +{ + uint64_t retry = 0; + uint32_t status = 0; + + do { + retry++; + status = !!(read_cpu_i2c_controller(DW_IC_STATUS) & DW_IC_STATUS_ACTIVITY); + } while ((retry < MAX_RETRY) && status); + + if (retry == MAX_RETRY) + return -ETIME; + + return 0; +} + +static int i2c_read(uint8_t reg_offset, uint8_t *buffer, uint32_t length) +{ + int status; + uint32_t i; + + status = poll_for_status_set0(DW_IC_STATUS_TFE); + if (status) + return status; + + write_cpu_i2c_controller(DW_IC_DATA_CMD, reg_offset); + + for (i = 0; i < length; i++) { + if (i == length - 1) + write_cpu_i2c_controller(DW_IC_DATA_CMD, DW_IC_CMD | DW_IC_STOP); + else + write_cpu_i2c_controller(DW_IC_DATA_CMD, DW_IC_CMD); + + if (poll_for_status_set0(DW_IC_STATUS_RFNE) == 0) + buffer[i] = *(uint8_t *) (m_i2c_base_address + DW_IC_DATA_CMD); + else + pr_err("Read timeout line %d.\n", __LINE__); + } + + return 0; +} + +static int i2c_write(uint8_t reg_offset, uint8_t *buffer, uint32_t length) +{ + int status; + uint32_t i; + + /* Data transfer, poll till transmit ready bit is set */ + status = poll_for_status_set0(DW_IC_STATUS_TFE); + if (status) { + pr_err("In i2c-lib.c, line %d.\n", __LINE__); + return status; + } + + write_cpu_i2c_controller(DW_IC_DATA_CMD, reg_offset); + + for (i = 0; i < length; i++) { + if (poll_for_status_set0(DW_IC_STATUS_TFNF) == 0) { + if (i == length - 1) + write_cpu_i2c_controller(DW_IC_DATA_CMD, buffer[i] | DW_IC_STOP); + else + write_cpu_i2c_controller(DW_IC_DATA_CMD, buffer[i]); + } else { + pr_err("Write timeout line %d.\n", __LINE__); + } + } + + mdelay(200); + status = poll_for_status_set0(DW_IC_STATUS_TFE); + if (status) { + pr_err("In i2c-lib.c, line %d.\n", __LINE__); + return status; + } + + return 0; +} + +/* Initialize I2C controller */ +void init_cpu_i2c_controller(void) +{ + uint32_t h_cnt; + uint32_t l_cnt; + uint32_t input_ic_clk_rate = IC_CLK_KHZ; /* in kHz, i.e.
25MHz */ + uint32_t sda_falling_time = 300; + uint32_t scl_falling_time = 300; + + /* + * The I2C protocol specification requires 300ns of hold time on the + * SDA signal (tHD;DAT) in standard and fast speed modes, and a hold + * time long enough to bridge the undefined part between logic 1 and + * logic 0 of the falling edge of SCL in high speed mode. + */ + uint32_t sda_hold_time = 432; + uint32_t sda_hold = 0; + + /* First, disable the controller. */ + ICC_DEBUG("Initialize CPU I2C controller\n"); + + write_cpu_i2c_controller(DW_IC_ENABLE, 0); + + sda_hold = (input_ic_clk_rate * sda_hold_time + 500000) / 1000000; + write_cpu_i2c_controller(DW_IC_SDA_HOLD, sda_hold); + + /* Set standard and fast speed dividers for high/low periods. */ + /* Standard-mode */ + h_cnt = i2c_dw_scl_hcnt(input_ic_clk_rate, 4000, sda_falling_time, 0, 0); + l_cnt = i2c_dw_scl_lcnt(input_ic_clk_rate, 4700, scl_falling_time, 0); + + write_cpu_i2c_controller(DW_IC_SS_SCL_HCNT, h_cnt); + write_cpu_i2c_controller(DW_IC_SS_SCL_LCNT, l_cnt); + + ICC_DEBUG("Standard-mode HCNT=%x, LCNT=%x\n", h_cnt, l_cnt); + + /* Fast-mode */ + h_cnt = i2c_dw_scl_hcnt(input_ic_clk_rate, 600, sda_falling_time, 0, 0); + l_cnt = i2c_dw_scl_lcnt(input_ic_clk_rate, 1300, scl_falling_time, 0); + + write_cpu_i2c_controller(DW_IC_FS_SCL_HCNT, h_cnt); + write_cpu_i2c_controller(DW_IC_FS_SCL_LCNT, l_cnt); + + ICC_DEBUG("Fast-mode HCNT=%x, LCNT=%x\n\n", h_cnt, l_cnt); + + /* Configure Tx/Rx FIFO threshold levels; since we will be working + * in polling mode, set both thresholds to their minimum + */ + write_cpu_i2c_controller(DW_IC_TX_TL, 0); + write_cpu_i2c_controller(DW_IC_RX_TL, 0); + write_cpu_i2c_controller(DW_IC_INTR_MASK, DW_IC_INTR_DEFAULT_MASK); + + /* Configure the i2c master */ + write_cpu_i2c_controller(DW_IC_CON, + INTEL_MID_STD_CFG | DW_IC_CON_SPEED_STD); + +} + +/* + * This function enables I2C controllers. + * + * @param i2c_controller_index Bus Number of I2C controllers. + */ +void enable_i2c_controller(uint8_t i2c_controller_index) +{ + m_i2c_base_address = get_i2c_bar_addr(i2c_controller_index); + init_cpu_i2c_controller(); +} + +/* + * Write/Read data from I2C device. + * + * @i2c_controller_index: i2c bus number + * @slave_address: slave address + * @operation: to read or write + * @length: number of bytes + * @reg_offset: register offset + * @buffer: in/out buffer + */ +int i2c_bus_rw(uint8_t i2c_controller_index, uint8_t slave_address, + enum i2c_bus_operation operation, uint32_t length, + uint8_t reg_offset, void *buffer) +{ + uint8_t *byte_buffer = buffer; + int status = 0; + uint32_t databuffer, temp; + + m_i2c_base_address = get_i2c_bar_addr(i2c_controller_index); + status = wait_for_cpu_i2c_bus_busy(); + if (status) { + pr_err("%d\n", __LINE__); + return status; + } + + mdelay(1000); + + /* Set the slave address. */ + write_cpu_i2c_controller(DW_IC_ENABLE, 0x0); /* Disable controller */ + databuffer = read_cpu_i2c_controller(DW_IC_CON); + databuffer &= ~DW_IC_CON_10BITADDR_MASTER; + write_cpu_i2c_controller(DW_IC_CON, databuffer); + + /* Fill the target addr. */ + write_cpu_i2c_controller(DW_IC_TAR, slave_address); + + temp = read_cpu_i2c_controller(DW_IC_TAR); + + /* Configure Tx/Rx FIFO threshold levels.
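+ * (both thresholds were already set to their minimum for polled + * operation in init_cpu_i2c_controller(); here the adapter is simply + * re-enabled and the interrupt mask restored before the transfer).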
*/ + write_cpu_i2c_controller(DW_IC_ENABLE, 0x1); /* Enable the adapter */ + write_cpu_i2c_controller(DW_IC_INTR_MASK, DW_IC_INTR_DEFAULT_MASK); + + if (operation == I2C_BUS_READ) + status = i2c_read(reg_offset, byte_buffer, length); + else if (operation == I2C_BUS_WRITE) + status = i2c_write(reg_offset, byte_buffer, length); + + /* Disable controller */ + write_cpu_i2c_controller(DW_IC_ENABLE, 0x0); + + return status; +} + +void disable_i2c_controller(uint8_t i2c_controller_index) +{ + m_i2c_base_address = get_i2c_bar_addr(i2c_controller_index); + + /* Disable controller */ + write_cpu_i2c_controller(DW_IC_ENABLE, 0x0); + m_i2c_base_address = 0; +} + +void cpld_write(uint8_t slave_addr, uint8_t reg, uint8_t data) +{ + enable_i2c_controller(CPLD_BUSNR); + i2c_bus_rw(CPLD_BUSNR, slave_addr, I2C_BUS_WRITE, sizeof(uint8_t), reg, &data); + disable_i2c_controller(CPLD_BUSNR); +} diff --git a/arch/sw_64/chip/chip3/irq_chip.c b/arch/sw_64/chip/chip3/irq_chip.c new file mode 100644 index 000000000000..ee43e87c554b --- /dev/null +++ b/arch/sw_64/chip/chip3/irq_chip.c @@ -0,0 +1,96 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/platform_device.h> +#include <linux/string.h> +#include <linux/types.h> +#include <linux/string.h> +#include <linux/i2c.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/irqdomain.h> +#include <linux/irqchip.h> +#include <asm/irq.h> +#include <asm/sw64io.h> +#include <asm/irq_impl.h> + +static void fake_irq_mask(struct irq_data *data) +{ +} + +static void fake_irq_unmask(struct irq_data *data) +{ +} + +static struct irq_chip onchip_intc = { + .name = "SW fake Intc", + .irq_mask = fake_irq_mask, + .irq_unmask = fake_irq_unmask, +}; + +static int sw64_intc_domain_map(struct irq_domain *d, unsigned int irq, + irq_hw_number_t hw) +{ + + irq_set_chip_and_handler(irq, &onchip_intc, handle_level_irq); + irq_set_status_flags(irq, IRQ_LEVEL); + return 0; +} + +static const struct irq_domain_ops sw64_intc_domain_ops = { + .xlate = irq_domain_xlate_onecell, + .map = sw64_intc_domain_map, +}; + +static struct irq_domain *root_domain; + +static int __init +init_onchip_IRQ(struct device_node *intc, struct device_node *parent) +{ + + int node = 0; + int hwirq = 0, nirq = 8; + + if (parent) + panic("DeviceTree incore intc not a root irq controller\n"); + + root_domain = irq_domain_add_linear(intc, 8, + &sw64_intc_domain_ops, NULL); + + if (!root_domain) + panic("root irq domain not avail\n"); + + /* with this we don't need to export root_domain */ + irq_set_default_host(root_domain); + + for (hwirq = 0 ; hwirq < nirq ; hwirq++) + irq_create_mapping(root_domain, hwirq); + + /*enable MCU_DVC_INT_EN*/ + sw64_io_write(node, MCU_DVC_INT_EN, 0xff); + + return 0; +} + +IRQCHIP_DECLARE(sw64_intc, "sw64,sw6_irq_controller", init_onchip_IRQ); + +static int __init +init_onchip_vt_IRQ(struct device_node *intc, struct device_node *parent) +{ + if (parent) + panic("DeviceTree incore intc not a root irq controller\n"); + + root_domain = irq_domain_add_legacy(intc, 16, 0, 0, + &sw64_intc_domain_ops, NULL); + + if (!root_domain) + panic("root irq domain not avail\n"); + + /* with this we don't need to export root_domain */ + irq_set_default_host(root_domain); + + return 0; +} + +IRQCHIP_DECLARE(sw64_vt_intc, "sw64,sw6_irq_vt_controller", init_onchip_vt_IRQ); diff --git a/arch/sw_64/chip/chip3/msi.c b/arch/sw_64/chip/chip3/msi.c new file mode 100644 index 000000000000..0c6d415e082e --- /dev/null +++ 
b/arch/sw_64/chip/chip3/msi.c @@ -0,0 +1,471 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/pci.h> +#include <linux/irq.h> +#include <linux/kernel.h> +#include <linux/cpumask.h> +#include <linux/module.h> +#include <linux/msi.h> +#include <linux/irqdomain.h> +#include <asm/irq_impl.h> +#include <asm/msi.h> +#include <asm/pci.h> +#include <asm/sw64io.h> + +static struct irq_domain *msi_default_domain; +static DEFINE_RAW_SPINLOCK(vector_lock); +DEFINE_PER_CPU(vector_irq_t, vector_irq) = { + [0 ... PERCPU_MSI_IRQS - 1] = 0, +}; + +struct sw64_msi_chip_data { + spinlock_t cdata_lock; + unsigned long msi_config; + unsigned long rc_node; + unsigned long rc_index; + unsigned int msi_config_index; + unsigned int dst_coreid; + unsigned int vector; + unsigned int prev_coreid; + unsigned int prev_vector; + bool move_in_progress; +}; + +static struct sw64_msi_chip_data *alloc_sw_msi_chip_data(struct irq_data *irq_data) +{ + struct sw64_msi_chip_data *data; + int node; + + node = irq_data_get_node(irq_data); + data = kzalloc_node(sizeof(*data), GFP_KERNEL, node); + if (!data) + return NULL; + spin_lock_init(&data->cdata_lock); + return data; +} + +static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg) +{ + struct sw64_msi_chip_data *chip_data; + + chip_data = irq_data_get_irq_chip_data(data->parent_data); + msg->address_hi = MSI_ADDR_BASE_HI; + msg->address_lo = MSI_ADDR_BASE_LO; + msg->data = chip_data->msi_config_index; +} + +static bool find_free_core_vector(const struct cpumask *search_mask, int *found_coreid, int *found_vector) +{ + int vector, coreid; + bool found = false, find_once_global = false; + + coreid = cpumask_first(search_mask); +try_again: + for (vector = 0; vector < 256; vector++) { + while (per_cpu(vector_irq, coreid)[vector]) { + coreid = cpumask_next(coreid, search_mask); + if (coreid >= nr_cpu_ids) { + if (vector == 255) { + if (find_once_global) { + printk("No global free vector\n"); + return found; + } + printk("No local free vector\n"); + search_mask = cpu_online_mask; + coreid = cpumask_first(search_mask); + find_once_global = true; + goto try_again; + } + coreid = cpumask_first(search_mask); + break; + } + } + if (!per_cpu(vector_irq, coreid)[vector]) + break; + } + + found = true; + *found_coreid = coreid; + *found_vector = vector; + return found; +} + +static unsigned long set_piu_msi_config(struct pci_controller *hose, int found_coreid, + int msiconf_index, int found_vector) +{ + unsigned int reg; + unsigned long msi_config; + int phy_coreid; + + msi_config = (1UL << 62) | ((unsigned long)found_vector << 10); + phy_coreid = cpu_to_rcid(found_coreid); + msi_config |= ((phy_coreid >> 5) << 6) | (phy_coreid & 0x1f); + reg = MSICONFIG0 + (unsigned long)(msiconf_index << 7); + write_piu_ior0(hose->node, hose->index, reg, msi_config); + msi_config = read_piu_ior0(hose->node, hose->index, reg); + set_bit(msiconf_index, hose->piu_msiconfig); + + return msi_config; +} + +static int sw64_set_affinity(struct irq_data *d, const struct cpumask *cpumask, bool force) +{ + struct sw64_msi_chip_data *cdata; + struct pci_controller *hose; + struct irq_data *irqd; + struct msi_desc *entry; + struct cpumask searchmask; + unsigned long flags, msi_config; + int found_vector, found_coreid; + + /* Is this valid ? 
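+ * An affinity request whose mask contains no online CPU cannot be + * honoured, so it is rejected up front.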
*/ + if (cpumask_any_and(cpumask, cpu_online_mask) >= nr_cpu_ids) + return -EINVAL; + + irqd = irq_domain_get_irq_data(msi_default_domain->parent, d->irq); + /* Don't do anything if the interrupt isn't started */ + if (!irqd_is_started(irqd)) + return IRQ_SET_MASK_OK; + + cdata = irqd->chip_data; + if (!cdata) + return -ENOMEM; + + /* If existing target coreid is already in the new mask, and is online then do nothing.*/ + if (cpu_online(cdata->dst_coreid) && cpumask_test_cpu(cdata->dst_coreid, cpumask)) + return IRQ_SET_MASK_OK; + + raw_spin_lock_irqsave(&vector_lock, flags); + + cpumask_and(&searchmask, cpumask, cpu_online_mask); + if (!find_free_core_vector(&searchmask, &found_coreid, &found_vector)) { + raw_spin_unlock_irqrestore(&vector_lock, flags); + return -ENOSPC; + } + + /* update new setting */ + entry = irq_get_msi_desc(irqd->irq); + hose = (struct pci_controller *)msi_desc_to_pci_sysdata(entry); + spin_lock(&cdata->cdata_lock); + per_cpu(vector_irq, found_coreid)[found_vector] = irqd->irq; + msi_config = set_piu_msi_config(hose, found_coreid, cdata->msi_config_index, found_vector); + cdata->prev_vector = cdata->vector; + cdata->prev_coreid = cdata->dst_coreid; + cdata->dst_coreid = found_coreid; + cdata->vector = found_vector; + cdata->msi_config = msi_config; + cdata->move_in_progress = true; + spin_unlock(&cdata->cdata_lock); + cpumask_copy(irq_data_get_affinity_mask(irqd), &searchmask); + + raw_spin_unlock_irqrestore(&vector_lock, flags); + + return 0; +} + +static void chip_irq_ack(struct irq_data *data) +{ +} + +static struct irq_chip pci_msi_controller = { + .name = "PCI-MSI", + .irq_unmask = pci_msi_unmask_irq, + .irq_mask = pci_msi_mask_irq, + .irq_ack = chip_irq_ack, + .irq_compose_msi_msg = irq_msi_compose_msg, + .flags = IRQCHIP_SKIP_SET_WAKE, + .irq_set_affinity = sw64_set_affinity, +}; + +static int __assign_irq_vector(int virq, unsigned int nr_irqs, + struct irq_domain *domain, struct pci_controller *hose) +{ + struct irq_data *irq_data; + const struct cpumask *mask; + struct cpumask searchmask; + struct sw64_msi_chip_data *cdata; + int msiconf_index, coreid, node; + int i, found_vector, found_coreid; + unsigned long msi_config; + int start_index; + + if (unlikely((nr_irqs > 1) && (!is_power_of_2(nr_irqs)))) + nr_irqs = __roundup_pow_of_two(nr_irqs); + + msiconf_index = bitmap_find_next_zero_area(hose->piu_msiconfig, 256, 0, + nr_irqs, nr_irqs - 1); + + if (msiconf_index >= 256) { + printk("No free msi on PIU!\n"); + return -ENOSPC; + } + + start_index = msiconf_index; + irq_data = irq_domain_get_irq_data(domain, virq); + BUG_ON(!irq_data); + irq_data->chip = &pci_msi_controller; + + if (irqd_affinity_is_managed(irq_data)) { + mask = irq_data_get_affinity_mask(irq_data); + cpumask_and(&searchmask, mask, cpu_online_mask); + } else { + node = irq_data_get_node(irq_data); + cpumask_copy(&searchmask, cpumask_of_node(node)); + } + + coreid = cpumask_first(&searchmask); + if (coreid >= nr_cpu_ids) + cpumask_copy(&searchmask, cpu_online_mask); + + for (i = 0; i < nr_irqs; i++) { + if (!find_free_core_vector(&searchmask, &found_coreid, &found_vector)) + return -ENOSPC; + + per_cpu(vector_irq, found_coreid)[found_vector] = virq + i; + + if (i) { + irq_data = irq_domain_get_irq_data(domain, virq + i); + irq_data->chip = &pci_msi_controller; + } + + cdata = alloc_sw_msi_chip_data(irq_data); + if (!cdata) { + printk("error alloc irq chip data\n"); + return -ENOMEM; + } + + irq_data->chip_data = cdata; + msiconf_index = start_index + i; + msi_config = set_piu_msi_config(hose, 
found_coreid, msiconf_index, found_vector); + + cdata->dst_coreid = found_coreid; + cdata->vector = found_vector; + cdata->rc_index = hose->index; + cdata->rc_node = hose->node; + cdata->msi_config = msi_config; + cdata->msi_config_index = msiconf_index; + cdata->prev_coreid = found_coreid; + cdata->prev_vector = found_vector; + cdata->move_in_progress = false; + } + return 0; +} + +static int assign_irq_vector(int irq, unsigned int nr_irqs, + struct irq_domain *domain, struct pci_controller *hose) +{ + int err; + unsigned long flags; + + raw_spin_lock_irqsave(&vector_lock, flags); + err = __assign_irq_vector(irq, nr_irqs, domain, hose); + raw_spin_unlock_irqrestore(&vector_lock, flags); + return err; +} + +static void sw64_vector_free_irqs(struct irq_domain *domain, + unsigned int virq, unsigned int nr_irqs) +{ + int i; + struct irq_data *irq_data; + unsigned long flags; + + for (i = 0; i < nr_irqs; i++) { + irq_data = irq_domain_get_irq_data(domain, virq + i); + if (irq_data && irq_data->chip_data) { + struct sw64_msi_chip_data *cdata; + struct msi_desc *entry; + struct pci_controller *hose; + + raw_spin_lock_irqsave(&vector_lock, flags); + cdata = irq_data->chip_data; + entry = irq_get_msi_desc(virq + i); + if (entry) { + hose = (struct pci_controller *)msi_desc_to_pci_sysdata(entry); + clear_bit(cdata->msi_config_index, hose->piu_msiconfig); + } + irq_domain_reset_irq_data(irq_data); + per_cpu(vector_irq, cdata->dst_coreid)[cdata->vector] = 0; + kfree(cdata); + raw_spin_unlock_irqrestore(&vector_lock, flags); + } + } +} + +static void sw64_irq_free_descs(unsigned int virq, unsigned int nr_irqs) +{ + if (is_guest_or_emul()) + return irq_free_descs(virq, nr_irqs); + + return irq_domain_free_irqs(virq, nr_irqs); +} + +void arch_teardown_msi_irqs(struct pci_dev *dev) +{ + struct msi_desc *desc; + int i; + + for_each_pci_msi_entry(desc, dev) { + if (desc->irq) { + for (i = 0; i < desc->nvec_used; i++) + sw64_irq_free_descs(desc->irq + i, 1); + desc->irq = 0; + } + } +} + +static int sw64_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq, + unsigned int nr_irqs, void *arg) +{ + int err; + struct irq_alloc_info *info = arg; + struct pci_controller *hose; + + if (arg == NULL) + return -ENODEV; + hose = info->msi_dev->sysdata; + err = assign_irq_vector(virq, nr_irqs, domain, hose); + if (err) + goto error; + return 0; +error: + sw64_vector_free_irqs(domain, virq, nr_irqs); + return err; +} + +static int pci_msi_prepare(struct irq_domain *domain, struct device *dev, + int nvec, msi_alloc_info_t *arg) +{ + struct pci_dev *pdev = to_pci_dev(dev); + struct msi_desc *desc = first_pci_msi_entry(pdev); + + memset(arg, 0, sizeof(*arg)); + arg->msi_dev = pdev; + if (desc->msi_attrib.is_msix) + arg->type = IRQ_ALLOC_TYPE_MSIX; + else + arg->type = IRQ_ALLOC_TYPE_MSI; + return 0; +} + +static struct msi_domain_ops pci_msi_domain_ops = { + .msi_prepare = pci_msi_prepare, +}; + +static struct msi_domain_info pci_msi_domain_info = { + .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | + MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX, + .ops = &pci_msi_domain_ops, + .chip = &pci_msi_controller, + .handler = handle_edge_irq, + .handler_name = "edge", +}; + +static int sw64_irq_map(struct irq_domain *d, unsigned int virq, irq_hw_number_t hw) +{ + irq_set_chip_and_handler(virq, &sw64_irq_chip, handle_level_irq); + irq_set_status_flags(virq, IRQ_LEVEL); + return 0; +} + +const struct irq_domain_ops sw64_msi_domain_ops = { + .map = sw64_irq_map, + .alloc = sw64_vector_alloc_irqs, + .free = 
sw64_vector_free_irqs, +}; + +int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) +{ + struct irq_domain *domain; + int err; + + if (is_guest_or_emul()) + return sw64_setup_vt_msi_irqs(pdev, nvec, type); + + domain = msi_default_domain; + if (domain == NULL) + return -ENOSYS; + err = msi_domain_alloc_irqs(domain, &pdev->dev, nvec); + return err; +} + +void arch_init_msi_domain(struct irq_domain *parent) +{ + struct irq_domain *sw64_irq_domain; + + if (is_guest_or_emul()) + return; + + sw64_irq_domain = irq_domain_add_tree(NULL, &sw64_msi_domain_ops, NULL); + BUG_ON(sw64_irq_domain == NULL); + irq_set_default_host(sw64_irq_domain); + msi_default_domain = pci_msi_create_irq_domain(NULL, + &pci_msi_domain_info, sw64_irq_domain); + if (!msi_default_domain) + pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n"); +} + +static void irq_move_complete(struct sw64_msi_chip_data *cdata, int coreid, int vector) +{ + if (likely(!cdata->move_in_progress)) + return; + if (vector == cdata->vector && cdata->dst_coreid == coreid) { + raw_spin_lock(&vector_lock); + cdata->move_in_progress = 0; + per_cpu(vector_irq, cdata->prev_coreid)[cdata->prev_vector] = 0; + raw_spin_unlock(&vector_lock); + } +} + +void handle_pci_msi_interrupt(unsigned long type, unsigned long vector, unsigned long pci_msi1_addr) +{ + int i, msi_index = 0; + int vector_index = 0, logical_cid; + unsigned long value = 0; + unsigned long int_pci_msi[3]; + unsigned long *ptr; + struct irq_data *irq_data; + struct sw64_msi_chip_data *cdata; + + if (is_guest_or_emul()) { + handle_irq(vector); + return; + } + + ptr = (unsigned long *)pci_msi1_addr; + int_pci_msi[0] = *ptr; + int_pci_msi[1] = *(ptr + 1); + int_pci_msi[2] = *(ptr + 2); + + logical_cid = smp_processor_id(); + + for (i = 0; i < 4; i++) { + vector_index = i * 64; + while (vector != 0) { + int irq = 0; + int piu_index = 0; + + msi_index = find_next_bit(&vector, 64, msi_index); + if (msi_index == 64) { + msi_index = 0; + continue; + } + + irq = per_cpu(vector_irq, logical_cid)[vector_index + msi_index]; + irq_data = irq_domain_get_irq_data(msi_default_domain->parent, irq); + cdata = irq_data_get_irq_chip_data(irq_data); + spin_lock(&cdata->cdata_lock); + irq_move_complete(cdata, logical_cid, vector_index + msi_index); + piu_index = cdata->msi_config_index; + value = cdata->msi_config | (1UL << 63); + write_piu_ior0(cdata->rc_node, cdata->rc_index, MSICONFIG0 + (piu_index << 7), value); + spin_unlock(&cdata->cdata_lock); + handle_irq(irq); + + vector = vector & (~(1UL << msi_index)); + } + + vector = int_pci_msi[i % 3]; + } +} + +MODULE_LICENSE("GPL v2"); diff --git a/arch/sw_64/chip/chip3/pci-quirks.c b/arch/sw_64/chip/chip3/pci-quirks.c new file mode 100644 index 000000000000..e70c211df68f --- /dev/null +++ b/arch/sw_64/chip/chip3/pci-quirks.c @@ -0,0 +1,247 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/pci.h> +#include <linux/delay.h> +#include <asm/pci.h> +#include <asm/sw64io.h> +#include <asm/hw_init.h> + +static int handshake(void __iomem *ptr, u32 mask, u32 done, + int wait_usec, int delay_usec) +{ + u32 result; + + do { + result = readl(ptr); + result &= mask; + if (result == done) + return 0; + udelay(delay_usec); + wait_usec -= delay_usec; + } while (wait_usec > 0); + return -ETIMEDOUT; +} + +#define XHCI_HCC_EXT_CAPS(p) (((p) >> 16) & 0xffff) +#define XHCI_EXT_CAPS_ID(p) (((p) >> 0) & 0xff) +#define XHCI_EXT_CAPS_NEXT(p) (((p) >> 8) & 0xff) +#define XHCI_HC_LENGTH(p) (((p) >> 0) & 0x00ff) +#define XHCI_CMD_OFFSET (0x00) +#define 
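/*
 * Illustrative sketch, not part of the patch: the demultiplex idea in
 * handle_pci_msi_interrupt() above. Each 64-bit pending word covers
 * vectors [word * 64, word * 64 + 63]; a conventional rendering with
 * for_each_set_bit() looks like this (the MSICONFIG write-back and the
 * irq_move_complete() call are elided):
 */
static void sketch_demux_pending(unsigned long pending[4], int cpu)
{
	int word, bit;

	for (word = 0; word < 4; word++) {
		for_each_set_bit(bit, &pending[word], 64) {
			int irq = per_cpu(vector_irq, cpu)[word * 64 + bit];

			handle_irq(irq);
		}
	}
}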
XHCI_STS_OFFSET (0x04) +#define XHCI_EXT_CAPS_LEGACY (1) +#define XHCI_HCC_PARAMS_OFFSET (0x10) +#define XHCI_LEGACY_CONTROL_OFFSET (0x04) +#define XHCI_LEGACY_DISABLE_SMI ((0x7 << 1) + (0xff << 5) + (0x7 << 17)) +#define XHCI_LEGACY_SMI_EVENTS (0x7 << 29) +#define XHCI_HC_BIOS_OWNED (1 << 16) +#define XHCI_HC_OS_OWNED (1 << 24) +#define XHCI_CMD_RUN (1 << 0) +#define XHCI_STS_HALT (1 << 0) +#define XHCI_MAX_HALT_USEC (16 * 1000) +#define XHCI_CMD_EIE (1 << 2) +#define XHCI_CMD_HSEIE (1 << 3) +#define XHCI_CMD_EWE (1 << 10) +#define XHCI_IRQS (XHCI_CMD_EIE | XHCI_CMD_HSEIE | XHCI_CMD_EWE) +#define XHCI_STS_CNR (1 << 11) +#define STS_FATAL (1 << 2) +#define STS_EINT (1 << 3) +#define STS_PORT (1 << 4) +#define STS_SRE (1 << 10) +#define STS_RW1C_BITS (STS_FATAL | STS_EINT | STS_PORT | STS_SRE) + +static inline int xhci_find_next_ext_cap(void __iomem *base, u32 start, int id) +{ + u32 val; + u32 next; + u32 offset; + + offset = start; + if (!start || start == XHCI_HCC_PARAMS_OFFSET) { + val = readl(base + XHCI_HCC_PARAMS_OFFSET); + if (val == ~0) + return 0; + offset = XHCI_HCC_EXT_CAPS(val) << 2; + if (!offset) + return 0; + } + do { + val = readl(base + offset); + if (val == ~0) + return 0; + if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) == id)) + return offset; + + next = XHCI_EXT_CAPS_NEXT(val); + offset += next << 2; + } while (next); + + return 0; +} + +extern void usb_enable_intel_xhci_ports(struct pci_dev *xhci_pdev); + +static void +fixup_usb_xhci_reset(struct pci_dev *dev) +{ + void __iomem *op_reg_base; + int timeout; + u32 xhci_command; + u32 tmp, val; + void __iomem *base; + struct pci_controller *hose = dev->sysdata; + unsigned long offset; + int ext_cap_offset; + int retries = 3; + + pci_read_config_dword(dev, PCI_COMMAND, &tmp); + tmp |= (PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER); + pci_write_config_dword(dev, PCI_COMMAND, tmp); + + pci_read_config_dword(dev, PCI_BASE_ADDRESS_0, &tmp); + if (tmp & PCI_BASE_ADDRESS_MEM_TYPE_MASK) { + pci_read_config_dword(dev, PCI_BASE_ADDRESS_1, &val); + offset = (unsigned long)(val) << 32 | (tmp & (~0xf)); + } else + offset = (unsigned long)(tmp & (~0xf)); + + if (offset == 0) + return; + + base = (void *)__va(SW64_PCI_IO_BASE(hose->node, hose->index) | offset); + + ext_cap_offset = xhci_find_next_ext_cap(base, 0, XHCI_EXT_CAPS_LEGACY); + if (!ext_cap_offset) + goto hc_init; + + val = readl(base + ext_cap_offset); + + if ((dev->vendor == PCI_VENDOR_ID_TI && dev->device == 0x8241) || + (dev->vendor == PCI_VENDOR_ID_RENESAS + && dev->device == 0x0014)) { + val = (val | XHCI_HC_OS_OWNED) & ~XHCI_HC_BIOS_OWNED; + writel(val, base + ext_cap_offset); + } + + if (val & XHCI_HC_BIOS_OWNED) { + writel(val | XHCI_HC_OS_OWNED, base + ext_cap_offset); + + timeout = handshake(base + ext_cap_offset, XHCI_HC_BIOS_OWNED, + 0, 1000000, 10); + if (timeout) { + pr_err("xHCI BIOS handoff failed (BIOS bug ?) %08x\n", val); + writel(val & ~XHCI_HC_BIOS_OWNED, base + ext_cap_offset); + } + } + + val = readl(base + ext_cap_offset + XHCI_LEGACY_CONTROL_OFFSET); + val &= XHCI_LEGACY_DISABLE_SMI; + val |= XHCI_LEGACY_SMI_EVENTS; + writel(val, base + ext_cap_offset + XHCI_LEGACY_CONTROL_OFFSET); + +hc_init: + if (dev->vendor == PCI_VENDOR_ID_INTEL) + usb_enable_intel_xhci_ports(dev); + + op_reg_base = base + XHCI_HC_LENGTH(readl(base)); + + timeout = handshake(op_reg_base + XHCI_STS_OFFSET, XHCI_STS_CNR, 0, + 5000000, 10); + if (timeout) { + val = readl(op_reg_base + XHCI_STS_OFFSET); + pr_err("xHCI HW not ready after 5 sec (HC bug?) 
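/*
 * Worked example, not part of the patch, for xhci_find_next_ext_cap():
 * if HCCPARAMS at offset 0x10 reads 0x02001234, XHCI_HCC_EXT_CAPS()
 * extracts 0x0200, so the first capability header sits at byte offset
 * 0x0200 << 2 == 0x800. In each header, bits [7:0] are the capability
 * ID (1 == USB Legacy Support) and bits [15:8] the dword distance to
 * the next header, with 0 terminating the list.
 */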
status = 0x%x\n", val); + } + + xhci_command = readl(op_reg_base + XHCI_CMD_OFFSET); + xhci_command |= 0x2; + writel(xhci_command, op_reg_base + XHCI_CMD_OFFSET); + + timeout = handshake(op_reg_base + XHCI_CMD_OFFSET, + 0x2, 0, 10 * 1000 * 1000, 125); + if (timeout) + pr_err("xHCI BIOS handoff time out\n"); + +retry: + val = readl(op_reg_base + XHCI_STS_OFFSET); + val |= STS_RW1C_BITS; + writel(val, op_reg_base + XHCI_STS_OFFSET); + val = readl(op_reg_base + XHCI_STS_OFFSET); + + if ((val & STS_RW1C_BITS) && retries--) { + pr_err("clear USB Status Register (status = %#x) failed, retry\n", val); + goto retry; + } + + val = readl(op_reg_base + XHCI_CMD_OFFSET); + val &= ~(XHCI_CMD_RUN | XHCI_IRQS); + writel(val, op_reg_base + XHCI_CMD_OFFSET); + timeout = handshake(op_reg_base + XHCI_STS_OFFSET, XHCI_STS_HALT, 1, + XHCI_MAX_HALT_USEC, 125); + if (timeout) { + val = readl(op_reg_base + XHCI_STS_OFFSET); + pr_err("xHCI HW did not halt within %d usec status = 0x%x\n", + XHCI_MAX_HALT_USEC, val); + } + + xhci_command = readl(op_reg_base + XHCI_CMD_OFFSET); + xhci_command |= 0x2; + writel(xhci_command, op_reg_base + XHCI_CMD_OFFSET); + + timeout = handshake(op_reg_base + XHCI_CMD_OFFSET, + 0x2, 0, 10 * 1000 * 1000, 125); + if (timeout) + pr_err("xHCI BIOS handoff time out\n"); + + pci_read_config_dword(dev, PCI_COMMAND, &tmp); + tmp &= ~(PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER); + pci_write_config_dword(dev, PCI_COMMAND, tmp); +} +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID, + PCI_CLASS_SERIAL_USB_XHCI, 0, fixup_usb_xhci_reset); + +#ifdef CONFIG_DCA +static void enable_sw_dca(struct pci_dev *dev) +{ + struct pci_controller *hose = (struct pci_controller *)dev->sysdata; + unsigned long node, rc_index, dca_ctl, dca_conf; + int i; + + if (dev->class >> 8 != PCI_CLASS_NETWORK_ETHERNET) + return; + node = hose->node; + rc_index = hose->index; + for (i = 0; i < 256; i++) { + dca_conf = read_piu_ior1(node, rc_index, DEVICEID0 + (i << 7)); + if (dca_conf >> 63) + continue; + else { + dca_conf = (1UL << 63) | (dev->bus->number << 8) | dev->devfn; + printk("dca device index %d, dca_conf = %#lx\n", i, dca_conf); + write_piu_ior1(node, rc_index, DEVICEID0 + (i << 7), dca_conf); + break; + } + } + dca_ctl = read_piu_ior1(node, rc_index, DCACONTROL); + if (dca_ctl & 0x1) { + dca_ctl = 0x2; + write_piu_ior1(node, rc_index, DCACONTROL, dca_ctl); + printk("Node %ld RC %ld enable DCA 1.0\n", node, rc_index); + } +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, enable_sw_dca); +#endif + +void __init reserve_mem_for_pci(void) +{ + int ret; + unsigned long base; + + base = is_in_host() ? 
PCI_32BIT_MEMIO : PCI_32BIT_VT_MEMIO; + + ret = add_memmap_region(base, PCI_32BIT_MEMIO_SIZE, memmap_pci); + if (ret) { + pr_err("reserved pages for pcie memory space failed\n"); + return; + } + + pr_info("reserved pages for pcie memory space %lx:%lx\n", base >> PAGE_SHIFT, + (base + PCI_32BIT_MEMIO_SIZE) >> PAGE_SHIFT); +} diff --git a/arch/sw_64/chip/chip3/vt_msi.c b/arch/sw_64/chip/chip3/vt_msi.c new file mode 100644 index 000000000000..31f49d3c3511 --- /dev/null +++ b/arch/sw_64/chip/chip3/vt_msi.c @@ -0,0 +1,150 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/pci.h> +#include <linux/irq.h> +#include <linux/kernel.h> +#include <linux/cpumask.h> +#include <linux/module.h> +#include <linux/msi.h> +#include <asm/irq_impl.h> +#include <asm/msi.h> +#include <asm/pci.h> +#include <asm/sw64io.h> + +#define QEMU_MSIX_MSG_ADDR (0x8000fee00000UL) + +static DEFINE_RAW_SPINLOCK(vector_lock); + +static struct irq_chip msi_chip = { + .name = "PCI-MSI", + .irq_unmask = pci_msi_unmask_irq, + .irq_mask = pci_msi_mask_irq, + .irq_ack = sw64_irq_noop, +}; + +static int qemu_msi_compose_msg(unsigned int irq, struct msi_msg *msg) +{ + msg->address_hi = (unsigned int)(QEMU_MSIX_MSG_ADDR >> 32); + msg->address_lo = (unsigned int)(QEMU_MSIX_MSG_ADDR & 0xffffffff); + msg->data = irq; + return irq; +} + +int chip_setup_vt_msix_irq(struct pci_dev *dev, struct msi_desc *desc) +{ + struct msi_msg msg; + int virq, val_node = 0; + struct irq_data *irq_data; + struct sw6_msi_chip_data *cdata; + struct pci_controller *hose = (struct pci_controller *)dev->sysdata; + unsigned long flags, node, rc_index; + const struct cpumask *mask; + + node = hose->node; + rc_index = hose->index; + mask = cpumask_of_node(node); + + raw_spin_lock_irqsave(&vector_lock, flags); + /* Find unused msi config reg in PIU-IOR0 */ + if (!node_online(node)) + val_node = next_node_in(node, node_online_map); + else + val_node = node; + + virq = irq_alloc_descs_from(NR_IRQS_LEGACY, desc->nvec_used, val_node); + if (virq < 0) { + pr_debug("cannot allocate IRQ(base 16, count %d)\n", desc->nvec_used); + raw_spin_unlock_irqrestore(&vector_lock, flags); + return virq; + } + + qemu_msi_compose_msg(virq, &msg); + irq_set_msi_desc(virq, desc); + pci_write_msi_msg((virq), &msg); + irq_set_chip_and_handler_name(virq, &msi_chip, handle_edge_irq, "edge"); + irq_data = irq_get_irq_data(virq); + cdata = kzalloc(sizeof(*cdata), GFP_KERNEL); + if (!cdata) + return -ENOMEM; + irq_data->chip_data = cdata; + raw_spin_unlock_irqrestore(&vector_lock, flags); + return 0; +} +EXPORT_SYMBOL(chip_setup_vt_msix_irq); + +int chip_setup_vt_msi_irqs(struct pci_dev *dev, int nvec, int type) +{ + struct msi_desc *desc; + struct msi_msg msg; + struct pci_controller *hose = (struct pci_controller *)dev->sysdata; + struct irq_data *irq_data; + struct sw6_msi_chip_data *cdata; + int i = 0; + unsigned long node, rc_index; + int virq = -1, val_node = 0; + unsigned long flags; + + if (type == PCI_CAP_ID_MSI && nvec > 32) + return 1; + + node = hose->node; + rc_index = hose->index; + raw_spin_lock_irqsave(&vector_lock, flags); + for_each_msi_entry(desc, &(dev->dev)) { + /* Find unused msi config reg in PIU-IOR0 */ + if (!node_online(node)) + val_node = next_node_in(node, node_online_map); + else + val_node = node; + virq = irq_alloc_descs_from(NR_IRQS_LEGACY, desc->nvec_used, val_node); + if (virq < 0) { + pr_debug("cannot allocate IRQ(base 16, count %d)\n", desc->nvec_used); + raw_spin_unlock_irqrestore(&vector_lock, flags); + return virq; + } + qemu_msi_compose_msg(virq, 
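/*
 * Worked example, not part of the patch, for qemu_msi_compose_msg():
 * with QEMU_MSIX_MSG_ADDR = 0x8000fee00000 and virq = 42 the message
 * programmed into the device is
 *
 *	address_hi = 0x00008000
 *	address_lo = 0xfee00000
 *	data       = 42
 *
 * i.e. under emulation the Linux virq itself rides as the MSI payload
 * and is handed straight to handle_irq(vector) in
 * handle_pci_msi_interrupt().
 */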
&msg); + for (i = 0; i < desc->nvec_used; i++) { + irq_set_msi_desc_off(virq, i, desc); + pci_write_msi_msg((virq + i), &msg); + desc->msi_attrib.multiple = ilog2(__roundup_pow_of_two(nvec)); + irq_set_chip_and_handler_name(virq + i, &msi_chip, handle_edge_irq, "edge"); + irq_data = irq_get_irq_data(virq + i); + cdata = kzalloc(sizeof(*cdata), GFP_KERNEL); + if (!cdata) + return -ENOMEM; + irq_data->chip_data = cdata; + } + } + + raw_spin_unlock_irqrestore(&vector_lock, flags); + return 0; +} +EXPORT_SYMBOL(chip_setup_vt_msi_irqs); + +int __arch_setup_vt_msix_irqs(struct pci_dev *dev, int nvec, int type) +{ + struct msi_desc *entry; + int ret; + + list_for_each_entry(entry, &dev->dev.msi_list, list) { + ret = chip_setup_vt_msix_irq(dev, entry); + if (ret) + return ret; + } + + return 0; +} + +int sw64_setup_vt_msi_irqs(struct pci_dev *dev, int nvec, int type) +{ + int ret = 0; + + if (type == PCI_CAP_ID_MSI) + ret = chip_setup_vt_msi_irqs(dev, nvec, type); + else if (type == PCI_CAP_ID_MSIX) + ret = __arch_setup_vt_msix_irqs(dev, nvec, type); + else + pr_info("SW arch do not identify ID:%d\n", type); + + return ret; +} +MODULE_LICENSE("GPL v2"); diff --git a/arch/sw_64/defconfig b/arch/sw_64/defconfig new file mode 100644 index 000000000000..d641ca0c108a --- /dev/null +++ b/arch/sw_64/defconfig @@ -0,0 +1,73 @@ +CONFIG_EXPERIMENTAL=y +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_LOG_BUF_SHIFT=14 +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_KALLSYMS_ALL=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_VERBOSE_MCHECK=y +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=m +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +# CONFIG_IPV6 is not set +CONFIG_NETFILTER=y +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_FILTER=m +CONFIG_VLAN_8021Q=m +CONFIG_PNP=y +CONFIG_ISAPNP=y +CONFIG_BLK_DEV_FD=y +CONFIG_BLK_DEV_LOOP=m +CONFIG_IDE=y +CONFIG_BLK_DEV_IDECD=y +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_ALI15X3=y +CONFIG_BLK_DEV_CMD64X=y +CONFIG_BLK_DEV_CY82C693=y +CONFIG_SCSI=y +CONFIG_BLK_DEV_SD=y +CONFIG_BLK_DEV_SR=y +CONFIG_BLK_DEV_SR_VENDOR=y +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_CMDS_PER_DEVICE=253 +# CONFIG_AIC7XXX_DEBUG_ENABLE is not set +CONFIG_NETDEVICES=y +CONFIG_DUMMY=m +CONFIG_NET_ETHERNET=y +CONFIG_NET_VENDOR_3COM=y +CONFIG_VORTEX=y +CONFIG_NET_TULIP=y +CONFIG_DE2104X=m +CONFIG_TULIP=y +CONFIG_TULIP_MMIO=y +CONFIG_NET_PCI=y +CONFIG_YELLOWFIN=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_RTC=y +CONFIG_EXT2_FS=y +CONFIG_REISERFS_FS=m +CONFIG_ISO9660_FS=y +CONFIG_MSDOS_FS=y +CONFIG_VFAT_FS=y +CONFIG_PROC_KCORE=y +CONFIG_TMPFS=y +CONFIG_NFS_FS=m +CONFIG_NFS_V3=y +CONFIG_NFSD=m +CONFIG_NFSD_V3=y +CONFIG_NLS_CODEPAGE_437=y +CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_INFO=y +CONFIG_SW64_LEGACY_START_ADDRESS=y +CONFIG_MATHEMU=y +CONFIG_CRYPTO_HMAC=y diff --git a/arch/sw_64/include/asm/Kbuild b/arch/sw_64/include/asm/Kbuild new file mode 100644 index 000000000000..ab266af1a06d --- /dev/null +++ b/arch/sw_64/include/asm/Kbuild @@ -0,0 +1,22 @@ +# SPDX-License-Identifier: GPL-2.0 +header-y += compiler.h +header-y += console.h +header-y += fpu.h +header-y += gentrap.h +header-y += hmcall.h +header-y += reg.h +header-y += regdef.h +header-y += sysinfo.h +header-y += page.h +header-y += elf.h + +generated-y += syscall_table.h +generic-y += export.h +generic-y += kvm_types.h +generic-y += rwsem.h + +generic-y += qrwlock.h +generic-y += qspinlock.h +generic-y += mcs_spinlock.h +generic-y += 
clkdev.h +generic-y += scatterlist.h diff --git a/arch/sw_64/include/asm/a.out-core.h b/arch/sw_64/include/asm/a.out-core.h new file mode 100644 index 000000000000..39dc16142955 --- /dev/null +++ b/arch/sw_64/include/asm/a.out-core.h @@ -0,0 +1,80 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* a.out coredump register dumper + * + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +#ifndef _ASM_SW64_OUT_CORE_H +#define _ASM_SW64_OUT_CORE_H + +#ifdef __KERNEL__ + +#include <linux/user.h> + +/* + * Fill in the user structure for an ECOFF core dump. + */ +static inline void aout_dump_thread(struct pt_regs *pt, struct user *dump) +{ + /* switch stack follows right below pt_regs: */ + struct switch_stack *sw = ((struct switch_stack *) pt) - 1; + + dump->magic = CMAGIC; + dump->start_code = current->mm->start_code; + dump->start_data = current->mm->start_data; + dump->start_stack = rdusp() & ~(PAGE_SIZE - 1); + dump->u_tsize = ((current->mm->end_code - dump->start_code) + >> PAGE_SHIFT); + dump->u_dsize = ((current->mm->brk + PAGE_SIZE - 1 - dump->start_data) + >> PAGE_SHIFT); + dump->u_ssize = (current->mm->start_stack - dump->start_stack + + PAGE_SIZE - 1) >> PAGE_SHIFT; + + /* + * We store the registers in an order/format that makes life easier + * for gdb. + */ + dump->regs[EF_V0] = pt->r0; + dump->regs[EF_T0] = pt->r1; + dump->regs[EF_T1] = pt->r2; + dump->regs[EF_T2] = pt->r3; + dump->regs[EF_T3] = pt->r4; + dump->regs[EF_T4] = pt->r5; + dump->regs[EF_T5] = pt->r6; + dump->regs[EF_T6] = pt->r7; + dump->regs[EF_T7] = pt->r8; + dump->regs[EF_S0] = sw->r9; + dump->regs[EF_S1] = sw->r10; + dump->regs[EF_S2] = sw->r11; + dump->regs[EF_S3] = sw->r12; + dump->regs[EF_S4] = sw->r13; + dump->regs[EF_S5] = sw->r14; + dump->regs[EF_S6] = sw->r15; + dump->regs[EF_A3] = pt->r19; + dump->regs[EF_A4] = pt->r20; + dump->regs[EF_A5] = pt->r21; + dump->regs[EF_T8] = pt->r22; + dump->regs[EF_T9] = pt->r23; + dump->regs[EF_T10] = pt->r24; + dump->regs[EF_T11] = pt->r25; + dump->regs[EF_RA] = pt->r26; + dump->regs[EF_T12] = pt->r27; + dump->regs[EF_AT] = pt->r28; + dump->regs[EF_SP] = rdusp(); + dump->regs[EF_PS] = pt->ps; + dump->regs[EF_PC] = pt->pc; + dump->regs[EF_GP] = pt->gp; + dump->regs[EF_A0] = pt->r16; + dump->regs[EF_A1] = pt->r17; + dump->regs[EF_A2] = pt->r18; + memcpy((char *)dump->regs + EF_SIZE, sw->fp, 32 * 8); +} + +#endif /* __KERNEL__ */ +#endif /* _ASM_SW64_OUT_CORE_H */ diff --git a/arch/sw_64/include/asm/a.out.h b/arch/sw_64/include/asm/a.out.h new file mode 100644 index 000000000000..4f2004a7fa8e --- /dev/null +++ b/arch/sw_64/include/asm/a.out.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_A_OUT_H +#define _ASM_SW64_A_OUT_H + +#include <uapi/asm/a.out.h> + +/* Assume that start addresses below 4G belong to a TASO application. + * Unfortunately, there is no proper bit in the exec header to check. + * Worse, we have to notice the start address before swapping to use + * /sbin/loader, which of course is _not_ a TASO application. + */ +#define SET_AOUT_PERSONALITY(BFPM, EX) \ + set_personality(((BFPM->taso || EX.ah.entry < 0x100000000L \ + ? 
ADDR_LIMIT_32BIT : 0) | PER_OSF4)) + +#endif /* _ASM_SW64_A_OUT_H */ diff --git a/arch/sw_64/include/asm/acenv.h b/arch/sw_64/include/asm/acenv.h new file mode 100644 index 000000000000..53b2898718fe --- /dev/null +++ b/arch/sw_64/include/asm/acenv.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _ASM_SW64_ACENV_H +#define _ASM_SW64_ACENV_H + +#define COMPILER_DEPENDENT_INT64 long +#define COMPILER_DEPENDENT_UINT64 unsigned long + +/* + * Calling conventions: + * + * ACPI_SYSTEM_XFACE - Interfaces to host OS (handlers, threads) + * ACPI_EXTERNAL_XFACE - External ACPI interfaces + * ACPI_INTERNAL_XFACE - Internal ACPI interfaces + * ACPI_INTERNAL_VAR_XFACE - Internal variable-parameter list interfaces + */ +#define ACPI_SYSTEM_XFACE +#define ACPI_EXTERNAL_XFACE +#define ACPI_INTERNAL_XFACE +#define ACPI_INTERNAL_VAR_XFACE + +/* Asm macros */ +#define ACPI_FLUSH_CPU_CACHE() + +int __acpi_acquire_global_lock(unsigned int *lock); +int __acpi_release_global_lock(unsigned int *lock); + +#define ACPI_ACQUIRE_GLOBAL_LOCK(facs, Acq) \ + ((Acq) = __acpi_acquire_global_lock(&facs->global_lock)) + +#define ACPI_RELEASE_GLOBAL_LOCK(facs, Acq) \ + ((Acq) = __acpi_release_global_lock(&facs->global_lock)) + +/* + * Math helper asm macros + */ +#define ACPI_DIV_64_BY_32(n_hi, n_lo, d32, q32, r32) + +#define ACPI_SHIFT_RIGHT_64(n_hi, n_lo) +#endif /* _ASM_SW64_ACENV_H */ diff --git a/arch/sw_64/include/asm/acpi.h b/arch/sw_64/include/asm/acpi.h new file mode 100644 index 000000000000..38615b969555 --- /dev/null +++ b/arch/sw_64/include/asm/acpi.h @@ -0,0 +1,97 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _ASM_SW64_ACPI_H +#define _ASM_SW64_ACPI_H + +#include <asm/processor.h> +#include <asm/mmu.h> +#include <asm/numa.h> + +#ifdef CONFIG_ACPI +extern int acpi_noirq; +extern int acpi_strict; +extern int acpi_disabled; +extern int acpi_pci_disabled; + +/* _ASM_SW64_PDC_H */ +#define ACPI_PDC_P_FFH (0x0001) +#define ACPI_PDC_C_C1_HALT (0x0002) +#define ACPI_PDC_T_FFH (0x0004) +#define ACPI_PDC_SMP_C1PT (0x0008) +#define ACPI_PDC_SMP_C2C3 (0x0010) +#define ACPI_PDC_SMP_P_SWCOORD (0x0020) +#define ACPI_PDC_SMP_C_SWCOORD (0x0040) +#define ACPI_PDC_SMP_T_SWCOORD (0x0080) +#define ACPI_PDC_C_C1_FFH (0x0100) +#define ACPI_PDC_C_C2C3_FFH (0x0200) +#define ACPI_PDC_SMP_P_HWCOORD (0x0800) + +#define ACPI_PDC_EST_CAPABILITY_SMP (ACPI_PDC_SMP_C1PT | \ + ACPI_PDC_C_C1_HALT | \ + ACPI_PDC_P_FFH) + +#define ACPI_PDC_EST_CAPABILITY_SWSMP (ACPI_PDC_SMP_C1PT | \ + ACPI_PDC_C_C1_HALT | \ + ACPI_PDC_SMP_P_SWCOORD | \ + ACPI_PDC_SMP_P_HWCOORD | \ + ACPI_PDC_P_FFH) + +#define ACPI_PDC_C_CAPABILITY_SMP (ACPI_PDC_SMP_C2C3 | \ + ACPI_PDC_SMP_C1PT | \ + ACPI_PDC_C_C1_HALT | \ + ACPI_PDC_C_C1_FFH | \ + ACPI_PDC_C_C2C3_FFH) + +#define ACPI_TABLE_UPGRADE_MAX_PHYS (max_low_pfn_mapped << PAGE_SHIFT) +static inline void disable_acpi(void) +{ + acpi_disabled = 1; + acpi_pci_disabled = 1; + acpi_noirq = 1; +} + +static inline void acpi_noirq_set(void) { acpi_noirq = 1; } +static inline void acpi_disable_pci(void) +{ + acpi_pci_disabled = 1; + acpi_noirq_set(); +} + +static inline bool acpi_has_cpu_in_madt(void) +{ + return true; +} + +/* Low-level suspend routine. 
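/*
 * Illustrative sketch, not part of the patch: the FACS global-lock
 * protocol behind __acpi_acquire_global_lock(), mirroring the generic
 * x86 implementation. Bit 0 of the lock word is "pending", bit 1 is
 * "owned".
 */
static int sketch_acquire_global_lock(unsigned int *lock)
{
	unsigned int old, new;

	do {
		old = *lock;
		/* take "owned"; if already owned, raise "pending" instead */
		new = ((old & ~0x3) + 2) + ((old >> 1) & 0x1);
	} while (cmpxchg(lock, old, new) != old);

	/* 0: acquired; -1: held elsewhere, we are queued as pending */
	return (new < 3) ? -1 : 0;
}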
*/ +extern int (*acpi_suspend_lowlevel)(void); +extern unsigned long long arch_acpi_wakeup_start; + +/* Physical address to resume after wakeup */ +#define acpi_wakeup_address arch_acpi_wakeup_start + +/* + * Check if the CPU can handle C2 and deeper + */ +static inline unsigned int acpi_processor_cstate_check(unsigned int max_cstate) +{ + return max_cstate; +} + +static inline bool arch_has_acpi_pdc(void) +{ + return false; +} + +static inline void arch_acpi_set_pdc_bits(u32 *buf) +{ +} +#else /* !CONFIG_ACPI */ + +static inline void acpi_noirq_set(void) { } +static inline void acpi_disable_pci(void) { } +static inline void disable_acpi(void) { } + +#endif /* !CONFIG_ACPI */ + +#define acpi_unlazy_tlb(x) +#endif /* _ASM_SW64_ACPI_H */ diff --git a/arch/sw_64/include/asm/agp.h b/arch/sw_64/include/asm/agp.h new file mode 100644 index 000000000000..e9d16888910e --- /dev/null +++ b/arch/sw_64/include/asm/agp.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_AGP_H +#define _ASM_SW64_AGP_H 1 + +#include <asm/io.h> + +/* dummy for now */ + +#define map_page_into_agp(page) +#define unmap_page_from_agp(page) +#define flush_agp_cache() mb() + +/* GATT allocation. Returns/accepts GATT kernel virtual address. */ +#define alloc_gatt_pages(order) \ + ((char *)__get_free_pages(GFP_KERNEL, (order))) +#define free_gatt_pages(table, order) \ + free_pages((unsigned long)(table), (order)) + +#endif diff --git a/arch/sw_64/include/asm/asm-offsets.h b/arch/sw_64/include/asm/asm-offsets.h new file mode 100644 index 000000000000..72cd408a9c6f --- /dev/null +++ b/arch/sw_64/include/asm/asm-offsets.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_ASM_OFFSETS_H +#define _ASM_SW64_ASM_OFFSETS_H + +#include <generated/asm-offsets.h> + +#endif diff --git a/arch/sw_64/include/asm/asm-prototypes.h b/arch/sw_64/include/asm/asm-prototypes.h new file mode 100644 index 000000000000..21f4f494d74d --- /dev/null +++ b/arch/sw_64/include/asm/asm-prototypes.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_ASM_PROTOTYPES_H +#define _ASM_SW64_ASM_PROTOTYPES_H + +#include <linux/spinlock.h> +#include <asm/checksum.h> +#include <asm/console.h> +#include <asm/page.h> +#include <asm/string.h> +#include <linux/uaccess.h> + +#include <asm-generic/asm-prototypes.h> + +extern void __divl(void); +extern void __reml(void); +extern void __divw(void); +extern void __remw(void); +extern void __divlu(void); +extern void __remlu(void); +extern void __divwu(void); +extern void __remwu(void); + +#endif diff --git a/arch/sw_64/include/asm/ast2400.h b/arch/sw_64/include/asm/ast2400.h new file mode 100644 index 000000000000..5f4cc84ff3a8 --- /dev/null +++ b/arch/sw_64/include/asm/ast2400.h @@ -0,0 +1,168 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015 Weiqiang Su David.suwq@gmail.com + * + * Both AST2400D and AST2400F package variants are supported. + */ + +#ifndef _ASM_SW64_AST2400_H +#define _ASM_SW64_AST2400_H + +#include <linux/device.h> + +/* Logical Device Numbers (LDN). 
*/ +#define AST2400_FDC 0x00 /* Floppy */ +#define AST2400_PP 0x01 /* Parallel port */ +#define AST2400_SP1 0x02 /* Com1 */ +#define AST2400_SP2 0x03 /* Com2 & IR */ +#define AST2400_KBC 0x05 /* PS/2 keyboard and mouse */ +#define AST2400_CIR 0x06 +#define AST2400_GPIO6789_V 0x07 +#define AST2400_WDT1_GPIO01A_V 0x08 +#define AST2400_GPIO1234567_V 0x09 +#define AST2400_ACPI 0x0A +#define AST2400_HWM_FPLED 0x0B /* Hardware monitor & front LED */ +#define AST2400_VID 0x0D +#define AST2400_CIRWKUP 0x0E /* CIR wakeup */ +#define AST2400_GPIO_PP_OD 0x0F /* GPIO Push-Pull/Open drain select */ +#define AST2400_SVID 0x14 +#define AST2400_DSLP 0x16 /* Deep sleep */ +#define AST2400_GPIOA_LDN 0x17 + +/* virtual LDN for GPIO and WDT */ +#define AST2400_WDT1 ((0 << 8) | AST2400_WDT1_GPIO01A_V) + +#define AST2400_GPIOBASE ((0 << 8) | AST2400_WDT1_GPIO01A_V) //? + +#define AST2400_GPIO0 ((1 << 8) | AST2400_WDT1_GPIO01A_V) +#define AST2400_GPIO1 ((1 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO2 ((2 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO3 ((3 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO4 ((4 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO5 ((5 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO6 ((6 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO7 ((7 << 8) | AST2400_GPIO1234567_V) +#define AST2400_GPIO8 ((0 << 8) | AST2400_GPIO6789_V) +#define AST2400_GPIO9 ((1 << 8) | AST2400_GPIO6789_V) +#define AST2400_GPIOA ((2 << 8) | AST2400_WDT1_GPIO01A_V) + +#define SUPERIO_PNP_PORT 0x2E +#define SUPERIO_CHIPID 0xC333 + +struct device_operations; +typedef struct pnp_device { + unsigned int port; + unsigned int device; + + struct device_operations *ops; +} *device_t; + +struct pnp_mode_ops { + void (*enter_conf_mode)(device_t dev); + void (*exit_conf_mode)(device_t dev); +}; + + +struct device_operations { + void (*read_resources)(device_t dev); + void (*set_resources)(device_t dev); + void (*enable_resources)(device_t dev); + void (*init)(device_t dev); + void (*final)(device_t dev); + void (*enable)(device_t dev); + void (*disable)(device_t dev); + + const struct pnp_mode_ops *ops_pnp_mode; +}; + +/* PNP helper operations */ +struct io_info { + unsigned int mask, set; +}; + +struct pnp_info { + bool enabled; /* set if we should enable the device */ + struct pnp_device pnp_device; + unsigned int function; /* Must be at least 16 bits (virtual LDNs)! 
*/ +}; + +/* Chip operations */ +struct chip_operations { + void (*enable_dev)(struct device *dev); + void (*init)(void *chip_info); + void (*final)(void *chip_info); + unsigned int initialized : 1; + unsigned int finalized : 1; + const char *name; +}; + +typedef struct superio_ast2400_device { + struct device *dev; + const char *name; + unsigned int enabled : 1; /* set if we should enable the device */ + unsigned int superio_ast2400_efir; /* extended function index register */ + unsigned int superio_ast2400_efdr; /* extended function data register */ + struct chip_operations *chip_ops; + const void *chip_info; +} *superio_device_t; + + +static inline void pnp_enter_conf_mode_a5a5(device_t dev) +{ + outb(0xa5, dev->port); + outb(0xa5, dev->port); +} + +static inline void pnp_exit_conf_mode_aa(device_t dev) +{ + outb(0xaa, dev->port); +} + +/* PNP config mode wrappers */ + +static inline void pnp_enter_conf_mode(device_t dev) +{ + if (dev->ops->ops_pnp_mode) + dev->ops->ops_pnp_mode->enter_conf_mode(dev); +} + +static inline void pnp_exit_conf_mode(device_t dev) +{ + if (dev->ops->ops_pnp_mode) + dev->ops->ops_pnp_mode->exit_conf_mode(dev); +} + +/* PNP device operations */ +static inline u8 pnp_read_config(device_t dev, u8 reg) +{ + outb(reg, dev->port); + return inb(dev->port + 1); +} + +static inline void pnp_write_config(device_t dev, u8 reg, u8 value) +{ + outb(reg, dev->port); + outb(value, dev->port + 1); +} + +static inline void pnp_set_logical_device(device_t dev) +{ + pnp_write_config(dev, 0x07, dev->device & 0xff); +// pnp_write_config(dev, 0x07, 0x3); +} + +static inline void pnp_set_enable(device_t dev, int enable) +{ + u8 tmp; + + tmp = pnp_read_config(dev, 0x30); + + if (enable) + tmp |= 1; + else + tmp &= ~1; + + pnp_write_config(dev, 0x30, tmp); +} + +#endif /* _ASM_SW64_AST2400_H */ diff --git a/arch/sw_64/include/asm/atomic.h b/arch/sw_64/include/asm/atomic.h new file mode 100644 index 000000000000..126417a1aeee --- /dev/null +++ b/arch/sw_64/include/asm/atomic.h @@ -0,0 +1,373 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_ATOMIC_H +#define _ASM_SW64_ATOMIC_H + +#include <linux/types.h> +#include <asm/barrier.h> +#include <asm/cmpxchg.h> + +/* + * Atomic operations that C can't guarantee us. Useful for + * resource counting etc... + * + * But use these as seldom as possible since they are much slower + * than regular operations. + */ + +#define ATOMIC_INIT(i) { (i) } +#define ATOMIC64_INIT(i) { (i) } + +#define atomic_read(v) READ_ONCE((v)->counter) +#define atomic64_read(v) READ_ONCE((v)->counter) + +#define atomic_set(v, i) WRITE_ONCE((v)->counter, (i)) +#define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i)) + +/* + * To get proper branch prediction for the main line, we must branch + * forward to code at the end of this object's .text section, then + * branch back to restart the operation. + */ +#define atomic64_cmpxchg(v, old, new) (cmpxchg(&((v)->counter), old, new)) +#define atomic64_xchg(v, new) (xchg(&((v)->counter), new)) + +#define atomic_cmpxchg(v, old, new) (cmpxchg(&((v)->counter), old, new)) +#define atomic_xchg(v, new) (xchg(&((v)->counter), new)) + + +/** + * atomic_fetch_add_unless - add unless the number is a given value + * @v: pointer of type atomic_t + * @a: the amount to add to v... + * @u: ...unless v is equal to u. + * + * Atomically adds @a to @v, so long as it was not @u. + * Returns the old value of @v. 
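/*
 * Usage sketch, not part of the patch: probing the chip through the
 * helpers above. Reading the ID from config registers 0x20/0x21 after
 * the a5/a5 unlock follows common Super I/O practice and is an
 * assumption here, as is the bare pnp_device initialization.
 */
static bool sketch_probe_ast2400(void)
{
	struct pnp_device dev = { .port = SUPERIO_PNP_PORT };
	u16 id;

	pnp_enter_conf_mode_a5a5(&dev);
	id = pnp_read_config(&dev, 0x20) << 8;	/* assumed ID high byte */
	id |= pnp_read_config(&dev, 0x21);	/* assumed ID low byte */
	pnp_exit_conf_mode_aa(&dev);

	return id == SUPERIO_CHIPID;		/* 0xC333 */
}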
+ */ +static inline int atomic_fetch_add_unless(atomic_t *v, int a, int u) +{ + int old, new, c; + unsigned long addr; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %2\n" + "1: lldw %0, 0(%3)\n" + " cmpeq %0, %5, %4\n" + " seleq %4, 1, $31, %4\n" + " wr_f %4\n" + " addw %0, %6, %1\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %1, 0(%3)\n" + " rd_f %1\n" + " beq %4, 2f\n" + " beq %1, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (old), "=&r" (new), "=m" (v->counter), "=&r" (addr), "=&r" (c) + : "Ir" (u), "Ir" (a), "m" (v->counter)); + return old; +} +#define atomic_fetch_add_unless atomic_fetch_add_unless +/** + * atomic64_fetch_add_unless - add unless the number is a given value + * @v: pointer of type atomic64_t + * @a: the amount to add to v... + * @u: ...unless v is equal to u. + * + * Atomically adds @a to @v, so long as it was not @u. + * Returns the old value of @v. + */ +static inline long atomic64_fetch_add_unless(atomic64_t *v, long a, long u) +{ + long old, new, c; + unsigned long addr; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %2\n" + "1: lldl %0, 0(%3)\n" + " cmpeq %0, %5, %4\n" + " seleq %4, 1, $31, %4\n" + " wr_f %4\n" + " addl %0, %6, %1\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %1, 0(%3)\n" + " rd_f %1\n" + " beq %4, 2f\n" + " beq %1, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (old), "=&r" (new), "=m" (v->counter), "=&r" (addr), "=&r" (c) + : "Ir" (u), "Ir" (a), "m" (v->counter)); + return old; +} +#define atomic64_fetch_add_unless atomic64_fetch_add_unless +/* + * atomic64_dec_if_positive - decrement by 1 if old value positive + * @v: pointer of type atomic_t + * + * The function returns the old value of *v minus 1, even if + * the atomic variable, v, was not decremented. 
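/*
 * Illustrative sketch, not part of the patch: a C-level equivalent of
 * the lldw/wr_f/lstw/rd_f sequence in atomic_fetch_add_unless() above,
 * expressed as a compare-and-swap loop. The asm form additionally uses
 * wr_f to cancel the conditional store when the embedded compare fails.
 */
static inline int sketch_fetch_add_unless(atomic_t *v, int a, int u)
{
	int old = atomic_read(v);

	while (old != u) {
		int prev = atomic_cmpxchg(v, old, old + a);

		if (prev == old)
			break;		/* stored; return pre-add value */
		old = prev;		/* raced with another writer, retry */
	}
	return old;
}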
+ */ +static inline long atomic64_dec_if_positive(atomic64_t *v) +{ + unsigned long old, temp1, addr, temp2; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %2\n" + "1: lldl %4, 0(%3)\n" + " cmple %4, 0, %0\n" + " seleq %0, 1, $31, %0\n" + " wr_f %0\n" + " subl %4, 1, %1\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %1, 0(%3)\n" + " rd_f %1\n" + " beq %0, 2f\n" + " beq %1, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr), "=&r" (old) + : "m" (v->counter)); + return old - 1; +} + + +#define atomic64_dec_if_positive atomic64_dec_if_positive + +#ifdef CONFIG_LOCK_MEMB +#define LOCK_MEMB "memb\n" +#else +#define LOCK_MEMB +#endif + +#ifdef CONFIG_LOCK_FIXUP +#define LOCK_FIXUP "memb\n" +#else +#define LOCK_FIXUP +#endif + + +#define ATOMIC_OP(op, asm_op) \ +static inline void atomic_##op(int i, atomic_t *v) \ +{ \ + unsigned long temp1, temp2, addr; \ + __asm__ __volatile__( \ + LOCK_MEMB \ + " ldi %3, %2\n" \ + "1: lldw %0, 0(%3)\n" \ + " ldi %1, 1\n" \ + " wr_f %1\n" \ + " " #asm_op " %0, %4, %0\n" \ + LOCK_FIXUP \ + " lstw %0, 0(%3)\n" \ + " rd_f %0\n" \ + " beq %0, 2f\n" \ + ".subsection 2\n" \ + "2: br 1b\n" \ + ".previous" \ + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr) \ + : "Ir" (i), "m" (v->counter)); \ +} \ + + +#define ATOMIC_OP_RETURN(op, asm_op) \ +static inline int atomic_##op##_return_relaxed(int i, atomic_t *v) \ +{ \ + int temp1, temp2; \ + unsigned long addr; \ + __asm__ __volatile__( \ + LOCK_MEMB \ + " ldi %3, %2\n" \ + "1: lldw %0, 0(%3)\n" \ + " ldi %1, 1\n" \ + " wr_f %1\n" \ + " " #asm_op " %0, %4, %1\n" \ + " " #asm_op " %0, %4, %0\n" \ + LOCK_FIXUP \ + " lstw %1, 0(%3)\n" \ + " rd_f %1\n" \ + " beq %1, 2f\n" \ + ".subsection 2\n" \ + "2: br 1b\n" \ + ".previous" \ + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr) \ + : "Ir" (i), "m" (v->counter)); \ + return temp1; \ +} \ + + + +#define ATOMIC_FETCH_OP(op, asm_op) \ +static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v) \ +{ \ + int temp1, temp2; \ + unsigned long addr; \ + __asm__ __volatile__( \ + LOCK_MEMB \ + " ldi %3, %2\n" \ + "1: lldw %0, 0(%3)\n" \ + " ldi %1, 1\n" \ + " wr_f %1\n" \ + " " #asm_op " %0, %4, %1\n" \ + LOCK_FIXUP \ + " lstw %1, 0(%3)\n" \ + " rd_f %1\n" \ + " beq %1, 2f\n" \ + ".subsection 2\n" \ + "2: br 1b\n" \ + ".previous" \ + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr) \ + : "Ir" (i), "m" (v->counter)); \ + return temp1; \ +} \ + + +#define ATOMIC64_OP(op, asm_op) \ +static inline void atomic64_##op(long i, atomic64_t *v) \ +{ \ + unsigned long temp1, temp2, addr; \ + __asm__ __volatile__( \ + LOCK_MEMB \ + " ldi %3, %2\n" \ + "1: lldl %0, 0(%3)\n" \ + " ldi %1, 1\n" \ + " wr_f %1\n" \ + " " #asm_op " %0, %4, %0\n" \ + LOCK_FIXUP \ + " lstl %0, 0(%3)\n" \ + " rd_f %0\n" \ + " beq %0, 2f\n" \ + ".subsection 2\n" \ + "2: br 1b\n" \ + ".previous" \ + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr) \ + : "Ir" (i), "m" (v->counter)); \ +} \ + + +#define ATOMIC64_OP_RETURN(op, asm_op) \ +static inline long atomic64_##op##_return_relaxed(long i, atomic64_t *v)\ +{ \ + long temp1, temp2; \ + unsigned long addr; \ + __asm__ __volatile__( \ + LOCK_MEMB \ + " ldi %3, %2\n" \ + "1: lldl %0, 0(%3)\n" \ + " ldi %1, 1\n" \ + " wr_f %1\n" \ + " " #asm_op " %0, %4, %1\n" \ + " " #asm_op " %0, %4, %0\n" \ + LOCK_FIXUP \ + " lstl %1, 0(%3)\n" \ + " rd_f %1\n" \ + " beq %1, 2f\n" \ + ".subsection 2\n" 
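/*
 * Usage sketch, not part of the patch: atomic64_dec_if_positive()
 * above returns old - 1 even when it declines to store (for old <= 0
 * wr_f is armed with 0, so the lstl is suppressed), letting callers
 * distinguish the cases by sign:
 *
 *	long r = atomic64_dec_if_positive(&v);
 *	if (r < 0)		// counter was already <= 0, no store
 *		...
 *	else if (r == 0)	// we performed the 1 -> 0 transition
 *		...
 */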
\ + "2: br 1b\n" \ + ".previous" \ + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr) \ + : "Ir" (i), "m" (v->counter)); \ + return temp1; \ +} + +#define ATOMIC64_FETCH_OP(op, asm_op) \ +static inline long atomic64_fetch_##op##_relaxed(long i, atomic64_t *v) \ +{ \ + long temp1, temp2; \ + unsigned long addr; \ + __asm__ __volatile__( \ + LOCK_MEMB \ + " ldi %3, %2\n" \ + "1: lldl %0, 0(%3)\n" \ + " ldi %1, 1\n" \ + " wr_f %1\n" \ + " " #asm_op " %0, %4, %1\n" \ + LOCK_FIXUP \ + " lstl %1, 0(%3)\n" \ + " rd_f %1\n" \ + " beq %1, 2f\n" \ + ".subsection 2\n" \ + "2: br 1b\n" \ + ".previous" \ + : "=&r" (temp1), "=&r" (temp2), "=m" (v->counter), "=&r" (addr) \ + : "Ir" (i), "m" (v->counter)); \ + return temp1; \ +} \ + +#define ATOMIC_OPS(op) \ + ATOMIC_OP(op, op##w) \ + ATOMIC_OP_RETURN(op, op##w) \ + ATOMIC_FETCH_OP(op, op##w) \ + ATOMIC64_OP(op, op##l) \ + ATOMIC64_OP_RETURN(op, op##l) \ + ATOMIC64_FETCH_OP(op, op##l) \ + +ATOMIC_OPS(add) +ATOMIC_OPS(sub) + +#define atomic_add_return_relaxed atomic_add_return_relaxed +#define atomic_sub_return_relaxed atomic_sub_return_relaxed +#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed +#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed + +#define atomic64_add_return_relaxed atomic64_add_return_relaxed +#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed +#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed +#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed + +#undef ATOMIC_OPS + +#define ATOMIC_OPS(op, asm) \ + ATOMIC_OP(op, asm) \ + ATOMIC_FETCH_OP(op, asm) \ + ATOMIC64_OP(op, asm) \ + ATOMIC64_FETCH_OP(op, asm) \ + +ATOMIC_OPS(and, and) +ATOMIC_OPS(andnot, bic) +ATOMIC_OPS(or, bis) +ATOMIC_OPS(xor, xor) + +#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed +#define atomic_fetch_andnot_relaxed atomic_fetch_andnot_relaxed +#define atomic_fetch_or_relaxed atomic_fetch_or_relaxed +#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed + +#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed +#define atomic64_fetch_andnot_relaxed atomic64_fetch_andnot_relaxed +#define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed +#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed + +#undef ATOMIC_OPS +#undef ATOMIC64_FETCH_OP +#undef ATOMIC64_OP_RETURN +#undef ATOMIC64_OP +#undef ATOMIC_FETCH_OP +#undef ATOMIC_OP_RETURN +#undef ATOMIC_OP + +#define atomic_andnot atomic_andnot +#define atomic64_andnot atomic64_andnot + +#endif /* _ASM_SW64_ATOMIC_H */ diff --git a/arch/sw_64/include/asm/barrier.h b/arch/sw_64/include/asm/barrier.h new file mode 100644 index 000000000000..c691038919cd --- /dev/null +++ b/arch/sw_64/include/asm/barrier.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_BARRIER_H +#define _ASM_SW64_BARRIER_H + +#include <asm/compiler.h> + +#define mb() __asm__ __volatile__("memb" : : : "memory") + +#define rmb() __asm__ __volatile__("memb" : : : "memory") + +#define wmb() __asm__ __volatile__("memb" : : : "memory") + +#ifdef CONFIG_SMP +#define __ASM_SMP_MB "\tmemb\n" +#else +#define __ASM_SMP_MB +#endif + +#define __smp_mb__before_atomic() barrier() +#define __smp_mb__after_atomic() barrier() + +#include <asm-generic/barrier.h> + +#endif /* _ASM_SW64_BARRIER_H */ diff --git a/arch/sw_64/include/asm/bitops.h b/arch/sw_64/include/asm/bitops.h new file mode 100644 index 000000000000..8c4844e8d067 --- /dev/null +++ b/arch/sw_64/include/asm/bitops.h @@ -0,0 +1,470 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_BITOPS_H 
+#define _ASM_SW64_BITOPS_H + +#ifndef _LINUX_BITOPS_H +#error only <linux/bitops.h> can be included directly +#endif + +#include <asm/compiler.h> +#include <asm/barrier.h> + +/* + * Copyright 1994, Linus Torvalds. + */ + +/* + * These have to be done with inline assembly: that way the bit-setting + * is guaranteed to be atomic. All bit operations return 0 if the bit + * was cleared before the operation and != 0 if it was not. + * + * To get proper branch prediction for the main line, we must branch + * forward to code at the end of this object's .text section, then + * branch back to restart the operation. + * + * bit 0 is the LSB of addr; bit 64 is the LSB of (addr+1). + */ + +static inline void +set_bit(unsigned long nr, volatile void *addr) +{ + unsigned long temp1, temp2, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %5\n" + "1: lldw %0, 0(%3)\n" + " ldi %1, 1\n" + " wr_f %1\n" + " bis %0, %4, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%3)\n" + " rd_f %1\n" + " beq %1, 2f\n" + ".subsection 2\n" + "2: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (*m), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m)); +} + +/* + * WARNING: non atomic version. + */ +static inline void +__set_bit(unsigned long nr, volatile void *addr) +{ + int *m = ((int *) addr) + (nr >> 5); + + *m |= 1 << (nr & 31); +} + +#define smp_mb__before_clear_bit() smp_mb() +#define smp_mb__after_clear_bit() smp_mb() + +static inline void +clear_bit(unsigned long nr, volatile void *addr) +{ + unsigned long temp1, temp2, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %5\n" + "1: lldw %0, 0(%3)\n" + " ldi %1, 1\n" + " wr_f %1\n" + " bic %0, %4, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%3)\n" + " rd_f %1\n" + " beq %1, 2f\n" + ".subsection 2\n" + "2: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (*m), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m)); +} + +static inline void +clear_bit_unlock(unsigned long nr, volatile void *addr) +{ + smp_mb(); + clear_bit(nr, addr); +} + +/* + * WARNING: non atomic version. + */ +static inline void +__clear_bit(unsigned long nr, volatile void *addr) +{ + int *m = ((int *) addr) + (nr >> 5); + + *m &= ~(1 << (nr & 31)); +} + +static inline void +__clear_bit_unlock(unsigned long nr, volatile void *addr) +{ + smp_mb(); + __clear_bit(nr, addr); +} + +static inline void +change_bit(unsigned long nr, volatile void *addr) +{ + unsigned long temp1, temp2, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %5\n" + "1: lldw %0, 0(%3)\n" + " ldi %1, 1\n" + " wr_f %1\n" + " xor %0, %4, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%3)\n" + " rd_f %1\n" + " beq %1, 2f\n" + ".subsection 2\n" + "2: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (*m), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m)); +} + +/* + * WARNING: non atomic version. 
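/*
 * Worked example, not part of the patch, for the addressing used by
 * set_bit() and friends above: for nr = 70, nr >> 5 == 2 selects the
 * third 32-bit word and 1UL << (70 & 31) == 1UL << 6 is the in-word
 * mask, i.e. bit 70 of the bitmap is bit 6 of word 2.
 */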
+ */ +static inline void +__change_bit(unsigned long nr, volatile void *addr) +{ + int *m = ((int *) addr) + (nr >> 5); + + *m ^= 1 << (nr & 31); +} + + +static inline int +test_and_set_bit(unsigned long nr, volatile void *addr) +{ + unsigned long oldbit; + unsigned long temp1, temp2, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %4, %6\n" + "1: lldw %0, 0(%4)\n" + " and %0, %5, %3\n" + " seleq %3, 1, $31, %1\n" + " wr_f %1\n" + " bis %0, %5, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%4)\n" + " rd_f %0\n" + " bne %3, 2f\n" // %3 is not zero, no need to set, return + " beq %0, 3f\n" // failed to set, try again. + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (*m), "=&r" (oldbit), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m) : "memory"); + + return oldbit != 0; +} + +static inline int +test_and_set_bit_lock(unsigned long nr, volatile void *addr) +{ + unsigned long oldbit; + unsigned long temp1, temp2, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %4, %6\n" + "1: lldw %0, 0(%4)\n" + " and %0, %5, %3\n" + " seleq %3, 1, $31, %1\n" + " wr_f %1\n" + " bis %0, %5, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%4)\n" + " rd_f %0\n" + " bne %3, 2f\n" // %3 is not zero, no need to set, return + " beq %0, 3f\n" // failed to set, try again. + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (*m), "=&r" (oldbit), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m) : "memory"); + + return oldbit != 0; +} + +/* + * WARNING: non atomic version. + */ +static inline int +__test_and_set_bit(unsigned long nr, volatile void *addr) +{ + unsigned long mask = 1 << (nr & 0x1f); + int *m = ((int *) addr) + (nr >> 5); + int old = *m; + + *m = old | mask; + return (old & mask) != 0; +} + +static inline int +test_and_clear_bit(unsigned long nr, volatile void *addr) +{ + unsigned long oldbit; + unsigned long temp1, temp2, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %4, %6\n" + "1: lldw %0, 0(%4)\n" + " and %0, %5, %3\n" + " selne %3, 1, $31, %1\n" //Note: here is SELNE!!! + " wr_f %1\n" + " bic %0, %5, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%4)\n" + " rd_f %0\n" + " beq %3, 2f\n" // %3 is zero, no need to set, return + " beq %0, 3f\n" // failed to set, try again. + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (*m), "=&r" (oldbit), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m) : "memory"); + + return oldbit != 0; +} + +/* + * WARNING: non atomic version. 
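/*
 * Usage sketch, not part of the patch: test_and_set_bit_lock() above
 * pairs with clear_bit_unlock() as a bit spinlock:
 *
 *	while (test_and_set_bit_lock(0, &word))
 *		cpu_relax();		// spin until we own bit 0
 *	... critical section ...
 *	clear_bit_unlock(0, &word);	// smp_mb(), then release bit 0
 */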
+ */ +static inline int +__test_and_clear_bit(unsigned long nr, volatile void *addr) +{ + unsigned long mask = 1 << (nr & 0x1f); + int *m = ((int *) addr) + (nr >> 5); + int old = *m; + + *m = old & ~mask; + return (old & mask) != 0; +} + +static inline int +test_and_change_bit(unsigned long nr, volatile void *addr) +{ + unsigned long oldbit; + unsigned long temp, base; + int *m = ((int *) addr) + (nr >> 5); + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %5\n" + "1: lldw %0, 0(%3)\n" + " ldi %2, 1\n" + " wr_f %2\n" + " and %0, %4, %2\n" + " xor %0, %4, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %0, 0(%3)\n" + " rd_f %0\n" + " beq %0, 3f\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (temp), "=m" (*m), "=&r" (oldbit), "=&r" (base) + : "Ir" (1UL << (nr & 31)), "m" (*m) : "memory"); + + return oldbit != 0; +} + +/* + * WARNING: non atomic version. + */ +static inline int +__test_and_change_bit(unsigned long nr, volatile void *addr) +{ + unsigned long mask = 1 << (nr & 0x1f); + int *m = ((int *) addr) + (nr >> 5); + int old = *m; + + *m = old ^ mask; + return (old & mask) != 0; +} + +static inline int +test_bit(int nr, const volatile void *addr) +{ + return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL; +} + +/* + * ffz = Find First Zero in word. Undefined if no zero exists, + * so code should check against ~0UL first.. + * + * Do a binary search on the bits. Due to the nature of large + * constants on the sw64, it is worthwhile to split the search. + */ +static inline unsigned long ffz_b(unsigned long x) +{ + unsigned long sum, x1, x2, x4; + + x = ~x & -~x; /* set first 0 bit, clear others */ + x1 = x & 0xAA; + x2 = x & 0xCC; + x4 = x & 0xF0; + sum = x2 ? 2 : 0; + sum += (x4 != 0) * 4; + sum += (x1 != 0); + + return sum; +} + +static inline unsigned long ffz(unsigned long word) +{ + return __kernel_cttz(~word); +} + +/* + * __ffs = Find First set bit in word. Undefined if no set bit exists. + */ +static inline unsigned long __ffs(unsigned long word) +{ + return __kernel_cttz(word); +} + +#ifdef __KERNEL__ + +/* + * ffs: find first bit set. This is defined the same way as + * the libc and compiler builtin ffs routines, therefore + * differs in spirit from the above __ffs. + */ + +static inline int ffs(int word) +{ + int result = __ffs(word) + 1; + + return word ? result : 0; +} + +/* + * fls: find last bit set. + */ +static inline int fls64(unsigned long word) +{ + return 64 - __kernel_ctlz(word); +} + +static inline unsigned long __fls(unsigned long x) +{ + return fls64(x) - 1; +} + +static inline int fls(int x) +{ + return fls64((unsigned int) x); +} + +/* + * hweightN: returns the hamming weight (i.e. the number + * of bits set) of a N-bit word + */ + +static inline unsigned long __arch_hweight64(unsigned long w) +{ + return __kernel_ctpop(w); +} + +static inline unsigned int __arch_hweight32(unsigned int w) +{ + return __arch_hweight64(w); +} + +static inline unsigned int __arch_hweight16(unsigned int w) +{ + return __arch_hweight64(w & 0xffff); +} + +static inline unsigned int __arch_hweight8(unsigned int w) +{ + return __arch_hweight64(w & 0xff); +} + +#include <asm-generic/bitops/const_hweight.h> + +#endif /* __KERNEL__ */ + +#include <asm-generic/bitops/find.h> + +#ifdef __KERNEL__ + +/* + * Every architecture must define this function. It's the fastest + * way of searching a 100-bit bitmap. It's guaranteed that at least + * one of the 100 bits is cleared. 
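/*
 * Worked examples, not part of the patch, for the counting helpers
 * above (64-bit operands):
 *
 *	__ffs(0x90)   == 4	cttz counts the trailing zero bits
 *	ffz(~0x90UL)  == 4	ffz(x) is just cttz(~x)
 *	fls64(0x90)   == 8	64 - ctlz(0x90) == 64 - 56
 *	ffs(0)        == 0	libc convention; __ffs(0) is undefined
 */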
+ */ +static inline unsigned long +sched_find_first_bit(const unsigned long b[2]) +{ + unsigned long b0, b1, ofs, tmp; + + b0 = b[0]; + b1 = b[1]; + ofs = (b0 ? 0 : 64); + tmp = (b0 ? b0 : b1); + + return __ffs(tmp) + ofs; +} + +#include <asm-generic/bitops/le.h> + +#include <asm-generic/bitops/ext2-atomic-setbit.h> + +#endif /* __KERNEL__ */ + +#endif /* _ASM_SW64_BITOPS_H */ diff --git a/arch/sw_64/include/asm/bug.h b/arch/sw_64/include/asm/bug.h new file mode 100644 index 000000000000..4a179f236ccf --- /dev/null +++ b/arch/sw_64/include/asm/bug.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_BUG_H +#define _ASM_SW64_BUG_H + +#include <linux/linkage.h> +#include <asm-generic/bug.h> + +#endif /* _ASM_SW64_BUG_H */ diff --git a/arch/sw_64/include/asm/bugs.h b/arch/sw_64/include/asm/bugs.h new file mode 100644 index 000000000000..c4a336fe04a2 --- /dev/null +++ b/arch/sw_64/include/asm/bugs.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_BUGS_H +#define _ASM_SW64_BUGS_H + +static void check_bugs(void) +{ +} + +#endif diff --git a/arch/sw_64/include/asm/cache.h b/arch/sw_64/include/asm/cache.h new file mode 100644 index 000000000000..a59a74110884 --- /dev/null +++ b/arch/sw_64/include/asm/cache.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * include/asm/cache.h + */ +#ifndef _ASM_SW64_CACHE_H +#define _ASM_SW64_CACHE_H + +#define L1_CACHE_BYTES 128 +#define L1_CACHE_SHIFT 7 + +#define SMP_CACHE_BYTES L1_CACHE_BYTES + +#endif diff --git a/arch/sw_64/include/asm/cacheflush.h b/arch/sw_64/include/asm/cacheflush.h new file mode 100644 index 000000000000..985161896f71 --- /dev/null +++ b/arch/sw_64/include/asm/cacheflush.h @@ -0,0 +1,95 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CACHEFLUSH_H +#define _ASM_SW64_CACHEFLUSH_H + +#include <linux/mm.h> +#include <asm/hw_init.h> + +/* Caches aren't brain-dead on the sw64. */ +#define flush_cache_all() do { } while (0) +#define flush_cache_mm(mm) do { } while (0) +#define flush_cache_dup_mm(mm) do { } while (0) +#define flush_cache_range(vma, start, end) do { } while (0) +#define flush_cache_page(vma, vmaddr, pfn) do { } while (0) +#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0 +#define flush_dcache_page(page) do { } while (0) +#define flush_dcache_mmap_lock(mapping) do { } while (0) +#define flush_dcache_mmap_unlock(mapping) do { } while (0) +#define flush_cache_vmap(start, end) do { } while (0) +#define flush_cache_vunmap(start, end) do { } while (0) + +/* Note that the following two definitions are _highly_ dependent + * on the contexts in which they are used in the kernel. I personally + * think it is criminal how loosely defined these macros are. + */ + +/* We need to flush the kernel's icache after loading modules. The + * only other use of this macro is in load_aout_interp which is not + * used on sw64. + + * Note that this definition should *not* be used for userspace + * icache flushing. While functional, it is _way_ overkill. The + * icache is tagged with ASNs and it suffices to allocate a new ASN + * for the process. 
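/*
 * Worked example, not part of the patch, for sched_find_first_bit():
 * with b = { 0, 0x8 } the first word is empty, so ofs = 64 and
 * tmp = b[1], giving __ffs(0x8) + 64 == 3 + 64 == 67.
 */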
+ */ +#ifndef CONFIG_SMP +static inline void +flush_icache_range(unsigned long start, unsigned long end) +{ + if (icache_is_vivt_no_ictag()) + imb(); +} +#define flush_icache_range flush_icache_range +#else +extern void smp_imb(void); +static inline void +flush_icache_range(unsigned long start, unsigned long end) +{ + if (icache_is_vivt_no_ictag()) + smp_imb(); +} +#define flush_icache_range flush_icache_range +#endif + +/* We need to flush the userspace icache after setting breakpoints in + * ptrace. + + * Instead of indiscriminately using imb, take advantage of the fact + * that icache entries are tagged with the ASN and load a new mm context. + */ +/* ??? Ought to use this in arch/sw_64/kernel/signal.c too. */ + +#ifndef CONFIG_SMP +#include <linux/sched.h> + +extern void __load_new_mm_context(struct mm_struct *); +static inline void +flush_icache_user_page(struct vm_area_struct *vma, struct page *page, + unsigned long addr, int len) +{ + if ((vma->vm_flags & VM_EXEC) && icache_is_vivt_no_ictag()) + imb(); +} +#define flush_icache_user_page flush_icache_user_page +#else +extern void flush_icache_user_page(struct vm_area_struct *vma, + struct page *page, + unsigned long addr, int len); +#define flush_icache_user_page flush_icache_user_page +#endif + +/* This is used only in __do_fault and do_swap_page. */ +#define flush_icache_page(vma, page) \ + flush_icache_user_page((vma), (page), 0, 0) + +#define copy_to_user_page(vma, page, vaddr, dst, src, len) \ +do { \ + memcpy(dst, src, len); \ + flush_icache_user_page(vma, page, vaddr, len); \ +} while (0) +#define copy_from_user_page(vma, page, vaddr, dst, src, len) \ + memcpy(dst, src, len) + +#include <asm-generic/cacheflush.h> + +#endif /* _ASM_SW64_CACHEFLUSH_H */ diff --git a/arch/sw_64/include/asm/checksum.h b/arch/sw_64/include/asm/checksum.h new file mode 100644 index 000000000000..0bb933350dc6 --- /dev/null +++ b/arch/sw_64/include/asm/checksum.h @@ -0,0 +1,74 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CHECKSUM_H +#define _ASM_SW64_CHECKSUM_H + +#include <linux/in6.h> + +/* + * This is a version of ip_compute_csum() optimized for IP headers, + * which always checksum on 4 octet boundaries. 
+ */ +extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl); + +/* + * computes the checksum of the TCP/UDP pseudo-header + * returns a 16-bit checksum, already complemented + */ +__sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr, + __u32 len, __u8 proto, __wsum sum); + +__wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, + __u32 len, __u8 proto, __wsum sum); + +/* + * computes the checksum of a memory block at buff, length len, + * and adds in "sum" (32-bit) + * + * returns a 32-bit number suitable for feeding into itself + * or csum_tcpudp_magic + * + * this function must be called with even lengths, except + * for the last fragment, which may be odd + * + * it's best to have buff aligned on a 32-bit boundary + */ +extern __wsum csum_partial(const void *buff, int len, __wsum sum); + +/* + * the same as csum_partial, but copies from src while it + * checksums + * + * here even more important to align src and dst on a 32-bit (or even + * better 64-bit) boundary + */ +#define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER +#define _HAVE_ARCH_CSUM_AND_COPY +__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len); + +__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len); + +/* + * this routine is used for miscellaneous IP-like checksums, mainly + * in icmp.c + */ + +extern __sum16 ip_compute_csum(const void *buff, int len); + +/* + * Fold a partial checksum without adding pseudo headers + */ + +static inline __sum16 csum_fold(__wsum csum) +{ + u32 sum = (__force u32)csum; + + sum = (sum & 0xffff) + (sum >> 16); + sum = (sum & 0xffff) + (sum >> 16); + return (__force __sum16)~sum; +} + +#define _HAVE_ARCH_IPV6_CSUM +extern __sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, __u32 len, + __u8 proto, __wsum sum); +#endif diff --git a/arch/sw_64/include/asm/chip3_io.h b/arch/sw_64/include/asm/chip3_io.h new file mode 100644 index 000000000000..1028842f7a81 --- /dev/null +++ b/arch/sw_64/include/asm/chip3_io.h @@ -0,0 +1,315 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CHIP3_IO_H +#define _ASM_SW64_CHIP3_IO_H + +#include <asm/platform.h> + +#define IO_BASE (0x1UL << 47) +#define PCI_BASE (0x1UL << 43) +#define PCI_IOR0_BASE (0x2UL << 32) +#define PCI_IOR1_BASE (0x3UL << 32) + +#ifdef CONFIG_SW64_FPGA +#define PCI_RC_CFG (0x4UL << 32) +#else +#define PCI_RC_CFG (0x5UL << 32) +#endif + +#define PCI_EP_CFG (0x3UL << 33) +#define PCI_LEGACY_IO (0x1UL << 32) +#define PCI_LEGACY_IO_SIZE (0x100000000UL) +#define PCI_MEM_UNPRE 0x0UL +#define PCI_32BIT_VT_MEMIO (0xc0000000UL) +#define PCI_32BIT_MEMIO (0xe0000000UL) +#define PCI_32BIT_MEMIO_SIZE (0x20000000UL) +#define PCI_64BIT_MEMIO (0x1UL << 39) +#define PCI_64BIT_MEMIO_SIZE (0x8000000000UL) + +#define IO_RC_SHIFT 40 +#define IO_NODE_SHIFT 44 +#define IO_MARK_BIT 47 + +/* MSIConfig */ +#define MSICONFIG_VALID (0x1UL << 63) +#define MSICONFIG_EN (0x1UL << 62) +#define MSICONFIG_VECTOR_SHIFT 10 + +#define SW64_PCI_IO_BASE(m, n) \ + (IO_BASE | ((m) << IO_NODE_SHIFT) | PCI_BASE | ((n) << IO_RC_SHIFT)) +#define SW64_IO_BASE(x) (IO_BASE | ((x) << IO_NODE_SHIFT)) + +#define SW64_PCI0_BUS 0 +#define PCI0_BUS SW64_PCI0_BUS + +#define MAX_NR_NODES 0x2 +#define MAX_NR_RCS 0x6 + +#define SW64_PCI_DEBUG 0 +#if SW64_PCI_DEBUG +#define PCIINFO(fmt, args...) printk(fmt, ##args) +#else +#define PCIINFO(fmt, args...) 
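/*
 * Worked example, not part of the patch, for csum_fold() above with
 * csum = 0x1234ffff:
 *
 *	0xffff + 0x1234  = 0x11233	first fold overflows 16 bits
 *	0x1233 + 0x0001  = 0x1234	second fold absorbs the carry
 *	~0x1234 & 0xffff = 0xedcb	one's-complement result returned
 */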
+#endif + +#define MCU_BASE (0x3UL << 36) +#define CAB0_BASE (0x10UL << 32) +#define INTPU_BASE (0x2aUL << 32) +#define IIC0_BASE (0x31UL << 32) +#define SPI_BASE (0x32UL << 32) +#define UART_BASE (0x33UL << 32) +#define IIC1_BASE (0x34UL << 32) +#define IIC2_BASE (0x35UL << 32) +#define GPIO_BASE (0x36UL << 32) +#define LPC_BASE (0x37UL << 32) +#define LPC_LEGACY_IO (0x1UL << 28 | IO_BASE | LPC_BASE) +#define LPC_MEM_IO (0x2UL << 28 | IO_BASE | LPC_BASE) +#define LPC_FIRMWARE_IO (0x3UL << 28 | IO_BASE | LPC_BASE) +#define DLIA_BASE (0x20UL << 32) +#define DLIB_BASE (0x21UL << 32) +#define DLIC_BASE (0x22UL << 32) +#define DLI_PHY_CTL (0x10UL << 24) +#define PCI_VT_LEGACY_IO (IO_BASE | PCI_BASE | PCI_LEGACY_IO) + +/*-----------------------addr-----------------------*/ +/* CAB0 REG */ +enum { + TRKMODE = CAB0_BASE | 0x80UL, +}; + +/* DLIA IO REG */ +enum { + DLIA_BWTEST_PAT = DLIA_BASE | 0x100980UL, + DLIA_PHY_VLDLANE = DLIA_BASE | DLI_PHY_CTL | 0x300UL, +}; + +/* DLIB IO REG */ +enum { + DLIB_BWTEST_PAT = DLIB_BASE | 0x100980UL, + DLIB_PHY_VLDLANE = DLIB_BASE | DLI_PHY_CTL | 0x300UL, +}; + +/* DLIC IO REG */ +enum { + DLIC_BWTEST_PAT = DLIC_BASE | 0x100980UL, + DLIC_PHY_VLDLANE = DLIC_BASE | DLI_PHY_CTL | 0x300UL, +}; +/* INTPU REG */ +enum { + LCORE_SLEEPY = INTPU_BASE | 0x0UL, + LCORE_SLEEP = INTPU_BASE | 0x80UL, + DEVICE_MISS = INTPU_BASE | 0x100UL, + LONG_TIME = INTPU_BASE | 0x180UL, + LCORE_IDLE = INTPU_BASE | 0x280UL, + MT_INT_CONFIG = INTPU_BASE | 0x300UL, + DEV_INT_CONFIG = INTPU_BASE | 0x480UL, + FMT_ERR = INTPU_BASE | 0x700UL, + FAULT_INT_CONFIG = INTPU_BASE | 0x780UL, + SERR_CNTTH = INTPU_BASE | 0x880UL, + MCUSERR_CNT = INTPU_BASE | 0x900UL, + IRUSERR_CNT = INTPU_BASE | 0xa80UL, + ERRRPT_EN = INTPU_BASE | 0xb00UL, + IINT_MISS_VECTOR = INTPU_BASE | 0x1100UL, + IINT_MIS = INTPU_BASE | 0x1180UL, + IINT_MISS_RPTEN = INTPU_BASE | 0x1200UL, + DEVINT_MISS_RPTEN = INTPU_BASE | 0x1280UL, + ECCSERR = INTPU_BASE | 0x1300UL, + ECCSERR_RPTEN = INTPU_BASE | 0x1380UL, + ECCMERR = INTPU_BASE | 0x1400UL, + ECCMERR_RPTEN = INTPU_BASE | 0x1480UL, + DEVINT_WKEN = INTPU_BASE | 0x1500UL, + NMI_INT_CONFIG = INTPU_BASE | 0x1580UL, + DEVINTWK_INTEN = INTPU_BASE | 0x1600UL, +}; + +/* MC IO REG */ +enum { + CFGDEC = 0x400UL, + CFGCR = 0x480UL, + INIT_CTRL = 0x580UL, + CFGERR = 0xd00UL, + FSMSTAT = 0xe00UL, + PUB_INTERFACE = 0x1000UL, + POWERCTRL = 0x1080UL, + CFGMR0 = 0x1280UL, + CFGMR1 = 0x1300UL, + CFGMR2 = 0x1380UL, + CFGMR3 = 0x1400UL, + PERF_CTRL = 0x1480UL, + MC_PERF0 = 0x1500UL, + CFGMR4 = 0x1800UL, + CFGMR5 = 0x1880UL, + CFGMR6 = 0x1900UL, + MC_CTRL = 0x1c00UL, + MEMSERR_P = 0x1c80UL, + MEMSERR = 0x1d00UL, +}; + +/* MCU CSR */ +enum { + INIT_CTL = MCU_BASE | 0x680UL, + MT_STATE = MCU_BASE | 0x700UL, + CORE_ONLINE = MCU_BASE | 0x780UL, + MT_INT = MCU_BASE | 0x800UL, + MT_INT_END = MCU_BASE | 0x880UL, + CPU_ID = MCU_BASE | 0x900UL, + DLI_RLTD_FAULT = MCU_BASE | 0x980UL, + DLI_RLTD_FAULT_EN = MCU_BASE | 0xa00UL, + DLI_RLTD_FAULT_INTEN = MCU_BASE | 0xa80UL, + FAULT_SOURCE = MCU_BASE | 0xb00UL, + INT_SOURCE = MCU_BASE | 0xb80UL, + CORE_STATE0 = MCU_BASE | 0xc00UL, + CORE_STATE1 = MCU_BASE | 0xc80UL, + CFG_INFO = MCU_BASE | 0x1100UL, + MC_CAP_CFG = MCU_BASE | 0x1180UL, + IO_START = MCU_BASE | 0x1300UL, + UART_ONLINE = MCU_BASE | 0x1780UL, + MCU_DVC_INT = MCU_BASE | 0x3000UL, + MCU_DVC_INT_EN = MCU_BASE | 0x3080UL, + SI_FAULT_STAT = MCU_BASE | 0x3100UL, + SI_FAULT_EN = MCU_BASE | 0x3180UL, + SI_FAULT_INT_EN = MCU_BASE | 0x3200UL, + FIFO_SYNSEL = MCU_BASE | 0x3400UL, + CPU_INFO = MCU_BASE | 0x3480UL, + 
WAKEUP_CTL = MCU_BASE | 0x3500UL, + FLAGREG = MCU_BASE | 0x3580UL, + NMI_CTL = MCU_BASE | 0x3600UL, + PIUPLL_CNT = MCU_BASE | 0x3680UL, + MC_ONLINE = MCU_BASE | 0x3780UL, + FLASH_INFO = MCU_BASE | 0x3800UL, + RTPUSROMCNT = MCU_BASE | 0x3880UL, + CLU_LV1_SEL = MCU_BASE | 0x3a80UL, + CLU_LV2_SEL = MCU_BASE | 0x3b00UL, + CLK_CTL = MCU_BASE | 0x3b80UL, + SLEEP_WAIT_CNT = MCU_BASE | 0x4980UL, + CHIP_ID = MCU_BASE | 0x4b00UL, + PIU_TOP0_CONFIG = MCU_BASE | 0x4c80UL, + PIU_TOP1_CONFIG = MCU_BASE | 0x4d00UL, + LVDS_CTL = MCU_BASE | 0x4d80UL, + LPC_DMAREQ_TOTH = MCU_BASE | 0x5100UL, + DLI_ONLINE = MCU_BASE | 0x6180UL, + LPC_DMAREQ_HADR = MCU_BASE | 0x6200UL, + PIU_PHY_SRST_H = MCU_BASE | 0x6280UL, + CLK_SEL_PCIE0 = MCU_BASE | 0x6280UL, + CLK_SEL_PCIE1 = MCU_BASE | 0x6300UL, + CLK_SEL_PCIE2 = MCU_BASE | 0x6380UL, + CLK_SEL_PCIE3 = MCU_BASE | 0x6400UL, + CLK_SEL_PCIE4 = MCU_BASE | 0x6480UL, + CLK_SEL_PCIE5 = MCU_BASE | 0x6500UL, + PERST_N_PCIE0 = MCU_BASE | 0x6680UL, + PERST_N_PCIE1 = MCU_BASE | 0x6700UL, + PERST_N_PCIE2 = MCU_BASE | 0x6780UL, + PERST_N_PCIE3 = MCU_BASE | 0x6800UL, + PERST_N_PCIE4 = MCU_BASE | 0x6880UL, + PERST_N_PCIE5 = MCU_BASE | 0x6900UL, + BUTTON_RST_N_PCIE0 = MCU_BASE | 0x6a80UL, + BUTTON_RST_N_PCIE1 = MCU_BASE | 0x6b00UL, + BUTTON_RST_N_PCIE2 = MCU_BASE | 0x6b80UL, + BUTTON_RST_N_PCIE3 = MCU_BASE | 0x6c00UL, + BUTTON_RST_N_PCIE4 = MCU_BASE | 0x6c80UL, + BUTTON_RST_N_PCIE5 = MCU_BASE | 0x6d00UL, + DUAL_CG0_FAULT = MCU_BASE | 0x6d80UL, + DUAL_CG1_FAULT = MCU_BASE | 0x6e00UL, + DUAL_CG2_FAULT = MCU_BASE | 0x6e80UL, + DUAL_CG3_FAULT = MCU_BASE | 0x6f00UL, + DUAL_CG4_FAULT = MCU_BASE | 0x6f80UL, + DUAL_CG5_FAULT = MCU_BASE | 0x7000UL, + DUAL_CG6_FAULT = MCU_BASE | 0x7080UL, + DUAL_CG7_FAULT = MCU_BASE | 0x7100UL, + DUAL_CG0_FAULT_EN = MCU_BASE | 0x7180UL, + DUAL_CG1_FAULT_EN = MCU_BASE | 0x7200UL, + DUAL_CG2_FAULT_EN = MCU_BASE | 0x7280UL, + DUAL_CG3_FAULT_EN = MCU_BASE | 0x7300UL, + DUAL_CG4_FAULT_EN = MCU_BASE | 0x7380UL, + DUAL_CG5_FAULT_EN = MCU_BASE | 0x7400UL, + DUAL_CG6_FAULT_EN = MCU_BASE | 0x7480UL, + DUAL_CG7_FAULT_EN = MCU_BASE | 0x7500UL, + DUAL_CG0_FAULT_INTEN = MCU_BASE | 0x7580UL, + DUAL_CG1_FAULT_INTEN = MCU_BASE | 0x7600UL, + DUAL_CG2_FAULT_INTEN = MCU_BASE | 0x7680UL, + DUAL_CG3_FAULT_INTEN = MCU_BASE | 0x7700UL, + DUAL_CG4_FAULT_INTEN = MCU_BASE | 0x7780UL, + DUAL_CG5_FAULT_INTEN = MCU_BASE | 0x7800UL, + DUAL_CG6_FAULT_INTEN = MCU_BASE | 0x7880UL, + DUAL_CG7_FAULT_INTEN = MCU_BASE | 0x7900UL, + SOFT_INFO0 = MCU_BASE | 0x7f00UL, + LONG_TIME_START_EN = MCU_BASE | 0x9000UL, +}; + +/*--------------------------offset-----------------------------------*/ +/* PIU IOR0 */ +enum { + PIUCONFIG0 = 0x0UL, + EPDMABAR = 0x80UL, + IOMMUSEGITEM0 = 0x100UL, + IOMMUEXCPT_CTRL = 0x2100UL, + MSIADDR = 0x2180UL, + MSICONFIG0 = 0x2200UL, + INTACONFIG = 0xa200UL, + INTBCONFIG = 0xa280UL, + INTCCONFIG = 0xa300UL, + INTDCONFIG = 0xa380UL, + AERERRINTCONFIG = 0xa400UL, + AERERRMSICONFIG = 0xa480UL, + PMEINTCONFIG = 0xa500UL, + PMEMSICONFIG = 0xa580UL, + HPINTCONFIG = 0xa600UL, + HPMSICONFIG = 0xa680UL, + DTBASEADDR = 0xb000UL, + DTLB_FLUSHALL = 0xb080UL, + DTLB_FLUSHDEV = 0xb100UL, + PTLB_FLUSHALL = 0xb180UL, + PTLB_FLUSHDEV = 0xb200UL, + PTLB_FLUSHVADDR = 0xb280UL, + PCACHE_FLUSHALL = 0xb300UL, + PCACHE_FLUSHDEV = 0xb380UL, + PCACHE_FLUSHPADDR = 0xb400UL, + TIMEOUT_CONFIG = 0xb480UL, + IOMMUEXCPT_STATUS = 0xb500UL, + IOMMUPAGE_PADDR1 = 0xb580UL, + IOMMUPAGE_PADDR2 = 0xb600UL, + IOMMUPAGE_PADDR3 = 0xb680UL, + PTLB_ACCESS = 0xb700UL, + PTLB_ITEM_TAG = 0xb780UL, + PTLB_ITEM_DATA = 0xb800UL, 
+ PCACHE_ACCESS = 0xb880UL, + PCACHE_ITEM_TAG = 0xb900UL, + PCACHE_ITEM_DATA0 = 0xb980UL, +}; + +/* PIU IOR1 */ +enum { + PIUCONFIG1 = 0x0UL, + ERRENABLE = 0x880UL, + RCDEBUGINF1 = 0xc80UL, + DCACONTROL = 0x1a00UL, + DEVICEID0 = 0x1a80UL, +}; + +/* RC */ +enum { + RC_VENDOR_ID = 0x0UL, + RC_COMMAND = 0x80UL, + RC_REVISION_ID = 0x100UL, + RC_PRIMARY_BUS = 0x300UL, + RC_MSI_CONTROL = 0xa00UL, + RC_EXP_DEVCAP = 0xe80UL, + RC_EXP_DEVCTL = 0xf00UL, + RC_SLOT_CTRL = 0x1100UL, + RC_LINK_STAT = 0x1000UL, + RC_CONTROL = 0X1180UL, + RC_STATUS = 0X1200UL, + RC_EXP_DEVCTL2 = 0x1300UL, + RC_PORT_LINK_CTL = 0xe200UL, + RC_ORDER_RULE_CTL = 0x11680UL, + RC_MISC_CONTROL_1 = 0x11780UL, + RC_PHY_INT_REG = 0x80000UL, + RC_PHY_EXT_GEN1 = 0x82400UL, + RC_PHY_EXT_GEN2 = 0x82480UL, +}; +/* GPIO */ +enum { + GPIO_SWPORTA_DR = GPIO_BASE | 0x0UL, + GPIO_SWPORTA_DDR = GPIO_BASE | 0x200UL, +}; +/*--------------------------------------------------------------------------*/ +#endif diff --git a/arch/sw_64/include/asm/cmpxchg.h b/arch/sw_64/include/asm/cmpxchg.h new file mode 100644 index 000000000000..e07abc47c7dd --- /dev/null +++ b/arch/sw_64/include/asm/cmpxchg.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CMPXCHG_H +#define _ASM_SW64_CMPXCHG_H + +/* + * Atomic exchange routines. + */ + +#define __ASM__MB +#define ____xchg(type, args...) __xchg ## type ## _local(args) +#define ____cmpxchg(type, args...) __cmpxchg ## type ## _local(args) +#include <asm/xchg.h> + +#define xchg_local(ptr, x) \ +({ \ + __typeof__(*(ptr)) _x_ = (x); \ + (__typeof__(*(ptr))) __xchg_local((ptr), (unsigned long)_x_, \ + sizeof(*(ptr))); \ +}) + +#define cmpxchg_local(ptr, o, n) \ +({ \ + __typeof__(*(ptr)) _o_ = (o); \ + __typeof__(*(ptr)) _n_ = (n); \ + (__typeof__(*(ptr))) __cmpxchg_local((ptr), (unsigned long)_o_, \ + (unsigned long)_n_, \ + sizeof(*(ptr))); \ +}) + +#define cmpxchg64_local(ptr, o, n) \ +({ \ + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ + cmpxchg_local((ptr), (o), (n)); \ +}) + +#ifdef CONFIG_SMP +#undef __ASM__MB +#define __ASM__MB "\tmemb\n" +#endif +#undef ____xchg +#undef ____cmpxchg +#define ____xchg(type, args...) __xchg ##type(args) +#define ____cmpxchg(type, args...) 
__cmpxchg ##type(args) +#include <asm/xchg.h> + +#define xchg(ptr, x) \ +({ \ + __typeof__(*(ptr)) _x_ = (x); \ + (__typeof__(*(ptr))) __xchg((ptr), (unsigned long)_x_, \ + sizeof(*(ptr))); \ +}) + +#define cmpxchg(ptr, o, n) \ +({ \ + __typeof__(*(ptr)) _o_ = (o); \ + __typeof__(*(ptr)) _n_ = (n); \ + (__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \ + (unsigned long)_n_, sizeof(*(ptr)));\ +}) + +#define cmpxchg64(ptr, o, n) \ +({ \ + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ + cmpxchg((ptr), (o), (n)); \ +}) + +#undef __ASM__MB +#undef ____cmpxchg + +#define __HAVE_ARCH_CMPXCHG 1 + +#endif /* _ASM_SW64_CMPXCHG_H */ diff --git a/arch/sw_64/include/asm/compiler.h b/arch/sw_64/include/asm/compiler.h new file mode 100644 index 000000000000..9a80aa6a0ba8 --- /dev/null +++ b/arch/sw_64/include/asm/compiler.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_COMPILER_H +#define _ASM_SW64_COMPILER_H + +#include <uapi/asm/compiler.h> + +#endif /* _ASM_SW64_COMPILER_H */ diff --git a/arch/sw_64/include/asm/console.h b/arch/sw_64/include/asm/console.h new file mode 100644 index 000000000000..0c01cb740bce --- /dev/null +++ b/arch/sw_64/include/asm/console.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CONSOLE_H +#define _ASM_SW64_CONSOLE_H + +#include <uapi/asm/console.h> +#ifndef __ASSEMBLY__ +struct crb_struct; +extern int callback_init_done; +extern void callback_init(void); +#endif /* __ASSEMBLY__ */ +#endif /* _ASM_SW64_CONSOLE_H */ diff --git a/arch/sw_64/include/asm/core.h b/arch/sw_64/include/asm/core.h new file mode 100644 index 000000000000..72d752c87412 --- /dev/null +++ b/arch/sw_64/include/asm/core.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CORE_H +#define _ASM_SW64_CORE_H + +#define II_II0 0 +#define II_II1 1 +#define II_SLEEP 2 +#define II_WAKE 3 +#define II_NMII 6 + +#ifdef CONFIG_SW64_CHIP3 +#define II_RESET II_NMII +#define CORES_PER_NODE_SHIFT 5 +#endif +#define CORES_PER_NODE (1UL << CORES_PER_NODE_SHIFT) + +/* + * 0x00 ~ 0xff for hardware mm fault + */ + +#define MMCSR__TNV 0x0 +#define MMCSR__IACV 0x1 +#define MMCSR__FOR 0x2 +#define MMCSR__FOE 0x3 +#define MMCSR__FOW 0x4 + +#define MMCSR__BAD_DVA 0x6 +#define MMCSR__ACV1 0x7 +#define MMCSR__ACV0 0xc +#define MMCSR__BAD_IVA 0xf + +/* 0x100 ~ 0x1ff for match debug */ +#define MMCSR__DA_MATCH 0x100 +#define MMCSR__DV_MATCH 0x101 +#define MMCSR__DAV_MATCH 0x102 +#define MMCSR__IA_MATCH 0x103 +#define MMCSR__IDA_MATCH 0x104 + + /* entry.S */ +extern void entArith(void); +extern void entIF(void); +extern void entInt(void); +extern void entMM(void); +extern void entSys(void); +extern void entUna(void); +/* head.S */ +extern void __smp_callin(unsigned long); +#endif diff --git a/arch/sw_64/include/asm/cpu.h b/arch/sw_64/include/asm/cpu.h new file mode 100644 index 000000000000..ea32a7d3cf1b --- /dev/null +++ b/arch/sw_64/include/asm/cpu.h @@ -0,0 +1 @@ +/* SPDX-License-Identifier: GPL-2.0 */ diff --git a/arch/sw_64/include/asm/cputime.h b/arch/sw_64/include/asm/cputime.h new file mode 100644 index 000000000000..bada5a01d887 --- /dev/null +++ b/arch/sw_64/include/asm/cputime.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CPUTIME_H +#define _ASM_SW64_CPUTIME_H + +#include <asm-generic/cputime.h> + +#endif /* _ASM_SW64_CPUTIME_H */ diff --git a/arch/sw_64/include/asm/current.h b/arch/sw_64/include/asm/current.h new file mode 100644 index 000000000000..219b5ce9f4fc --- /dev/null +++ 
b/arch/sw_64/include/asm/current.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_CURRENT_H +#define _ASM_SW64_CURRENT_H + +#include <linux/thread_info.h> + +#define get_current() (current_thread_info()->task) +#define current get_current() + +#endif /* _ASM_SW64_CURRENT_H */ diff --git a/arch/sw_64/include/asm/debug.h b/arch/sw_64/include/asm/debug.h new file mode 100644 index 000000000000..f0507acc31a7 --- /dev/null +++ b/arch/sw_64/include/asm/debug.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2020 Mao Minkai + * Author: Mao Minkai + * + * This code is taken from arch/mips/include/asm/debug.h + * Copyright (C) 2015 Imagination Technologies + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + */ + +#ifndef _ASM_SW64_DEBUG_H +#define _ASM_SW64_DEBUG_H + +#include <linux/debugfs.h> + +/* + * sw64_debugfs_dir corresponds to the "sw_64" directory at the top level + * of the DebugFS hierarchy. SW64-specific DebugFS entries should be + * placed beneath this directory. + */ +extern struct dentry *sw64_debugfs_dir; + +#endif /* _ASM_SW64_DEBUG_H */ diff --git a/arch/sw_64/include/asm/delay.h b/arch/sw_64/include/asm/delay.h new file mode 100644 index 000000000000..45112c7c3c01 --- /dev/null +++ b/arch/sw_64/include/asm/delay.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_DELAY_H +#define _ASM_SW64_DELAY_H + +extern void __delay(unsigned long loops); +extern void udelay(unsigned long usecs); + +extern void ndelay(unsigned long nsecs); +#define ndelay ndelay + +#endif /* defined(_ASM_SW64_DELAY_H) */ diff --git a/arch/sw_64/include/asm/device.h b/arch/sw_64/include/asm/device.h new file mode 100644 index 000000000000..dadd756d6934 --- /dev/null +++ b/arch/sw_64/include/asm/device.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_DEVICE_H +#define _ASM_SW64_DEVICE_H + +struct dev_archdata { +#if defined(CONFIG_SUNWAY_IOMMU) + void *iommu; +#endif +}; + +struct pdev_archdata { +}; +#endif diff --git a/arch/sw_64/include/asm/div64.h b/arch/sw_64/include/asm/div64.h new file mode 100644 index 000000000000..306581407ba5 --- /dev/null +++ b/arch/sw_64/include/asm/div64.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_DIV64_H +#define _ASM_SW64_DIV64_H + +#include <asm-generic/div64.h> + +#endif diff --git a/arch/sw_64/include/asm/dma-direct.h b/arch/sw_64/include/asm/dma-direct.h new file mode 100644 index 000000000000..dee1680b8f6d --- /dev/null +++ b/arch/sw_64/include/asm/dma-direct.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_DMA_DIRECT_H +#define _ASM_SW64_DMA_DIRECT_H + +static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) +{ + return paddr; +} + +static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr) +{ + return daddr; +} + +#endif /* _ASM_SW64_DMA_DIRECT_H */ diff --git a/arch/sw_64/include/asm/dma-mapping.h b/arch/sw_64/include/asm/dma-mapping.h new file mode 100644 index 000000000000..bb84690eabfe --- /dev/null +++ b/arch/sw_64/include/asm/dma-mapping.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_DMA_MAPPING_H +#define _ASM_SW64_DMA_MAPPING_H + +extern const struct dma_map_ops *dma_ops; + +static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus) +{ + return dma_ops; +} + +#endif /* _ASM_SW64_DMA_MAPPING_H */ diff --git a/arch/sw_64/include/asm/dma.h b/arch/sw_64/include/asm/dma.h new file mode 100644 index 000000000000..1211b71f347e --- /dev/null +++ b/arch/sw_64/include/asm/dma.h @@ -0,0 +1,356 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * include/asm-sw_64/dma.h + * + * This is essentially the same as the i386 DMA stuff, as the SW64PCs + * use ISA-compatible dma. The only extension is support for high-page + * registers that allow to set the top 8 bits of a 32-bit DMA address. + * This register should be written last when setting up a DMA address + * as this will also enable DMA across 64 KB boundaries. + */ + +/* $Id: dma.h,v 1.7 1992/12/14 00:29:34 root Exp root $ + * linux/include/asm/dma.h: Defines for using and allocating dma channels. + * Written by Hennus Bergman, 1992. + * High DMA channel support & info by Hannu Savolainen + * and John Boyd, Nov. 1992. + */ + +#ifndef _ASM_SW64_DMA_H +#define _ASM_SW64_DMA_H + +#include <linux/spinlock.h> +#include <asm/io.h> + +#define dma_outb outb +#define dma_inb inb + +/* + * NOTES about DMA transfers: + * + * controller 1: channels 0-3, byte operations, ports 00-1F + * controller 2: channels 4-7, word operations, ports C0-DF + * + * - ALL registers are 8 bits only, regardless of transfer size + * - channel 4 is not used - cascades 1 into 2. + * - channels 0-3 are byte - addresses/counts are for physical bytes + * - channels 5-7 are word - addresses/counts are for physical words + * - transfers must not cross physical 64K (0-3) or 128K (5-7) boundaries + * - transfer count loaded to registers is 1 less than actual count + * - controller 2 offsets are all even (2x offsets for controller 1) + * - page registers for 5-7 don't use data bit 0, represent 128K pages + * - page registers for 0-3 use bit 0, represent 64K pages + * + * DMA transfers are limited to the lower 16MB of _physical_ memory. + * Note that addresses loaded into registers must be _physical_ addresses, + * not logical addresses (which may differ if paging is active). + * + * Address mapping for channels 0-3: + * + * A23 ... A16 A15 ... A8 A7 ... A0 (Physical addresses) + * | ... | | ... | | ... | + * | ... | | ... | | ... | + * | ... | | ... | | ... | + * P7 ... P0 A7 ... A0 A7 ... A0 + * | Page | Addr MSB | Addr LSB | (DMA registers) + * + * Address mapping for channels 5-7: + * + * A23 ... A17 A16 A15 ... A9 A8 A7 ... A1 A0 (Physical addresses) + * | ... | \ \ ... \ \ \ ... \ \ + * | ... | \ \ ... \ \ \ ... \ (not used) + * | ... | \ \ ... \ \ \ ... \ + * P7 ... P1 (0) A7 A6 ... A0 A7 A6 ... A0 + * | Page | Addr MSB | Addr LSB | (DMA registers) + * + * Again, channels 5-7 transfer _physical_ words (16 bits), so addresses + * and counts _must_ be word-aligned (the lowest address bit is _ignored_ at + * the hardware level, so odd-byte transfers aren't possible). + * + * Transfer count (_not # bytes_) is limited to 64K, represented as actual + * count - 1 : 64K => 0xFFFF, 1 => 0x0000. Thus, count is always 1 or more, + * and up to 128K bytes may be transferred on channels 5-7 in one operation. + * + */ + +#define MAX_DMA_CHANNELS 8 + +/* + * ISA DMA limitations on sw64 platforms, + + * These may be due to SIO (PCI<->ISA bridge) chipset limitation, or + * just a wiring limit. + */ + +/* + * Maximum address for all the others is the complete 32-bit bus + * address space. 
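+ *
+ * As a usage sketch (chan, buf and len are assumed caller variables), a
+ * single ISA transfer is normally programmed with the helpers defined
+ * below, under the DMA spinlock:
+ *
+ *	unsigned long flags = claim_dma_lock();
+ *	clear_dma_ff(chan);
+ *	set_dma_mode(chan, DMA_MODE_READ);
+ *	set_dma_addr(chan, virt_to_bus(buf));
+ *	set_dma_count(chan, len);
+ *	enable_dma(chan);
+ *	release_dma_lock(flags);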
+ */ +#define MAX_ISA_DMA_ADDRESS 0x100000000UL + +/* + * If we have the iommu, we don't have any address limitations on DMA. + * Otherwise (Nautilus, RX164), we have to have 0-16 Mb DMA zone + * like i386. + */ +#ifdef CONFIG_IOMMU_SUPPORT +#ifndef CONFIG_SWICH_GPU +#define MAX_DMA_ADDRESS (PAGE_OFFSET + 0x100000000) +#else +#define MAX_DMA_ADDRESS (~0UL) +#endif /* CONFIG_SWICH_GPU */ +#else +#define MAX_DMA_ADDRESS (PAGE_OFFSET + 0x80000000) +#endif + +/* 8237 DMA controllers */ +#define IO_DMA1_BASE 0x00 /* 8 bit slave DMA, channels 0..3 */ +#define IO_DMA2_BASE 0xC0 /* 16 bit master DMA, ch 4(=slave input)..7 */ + +/* DMA controller registers */ +#define DMA1_CMD_REG 0x08 /* command register (w) */ +#define DMA1_STAT_REG 0x08 /* status register (r) */ +#define DMA1_REQ_REG 0x09 /* request register (w) */ +#define DMA1_MASK_REG 0x0A /* single-channel mask (w) */ +#define DMA1_MODE_REG 0x0B /* mode register (w) */ +#define DMA1_CLEAR_FF_REG 0x0C /* clear pointer flip-flop (w) */ +#define DMA1_TEMP_REG 0x0D /* Temporary Register (r) */ +#define DMA1_RESET_REG 0x0D /* Master Clear (w) */ +#define DMA1_CLR_MASK_REG 0x0E /* Clear Mask */ +#define DMA1_MASK_ALL_REG 0x0F /* all-channels mask (w) */ +#define DMA1_EXT_MODE_REG (0x400 | DMA1_MODE_REG) + +#define DMA2_CMD_REG 0xD0 /* command register (w) */ +#define DMA2_STAT_REG 0xD0 /* status register (r) */ +#define DMA2_REQ_REG 0xD2 /* request register (w) */ +#define DMA2_MASK_REG 0xD4 /* single-channel mask (w) */ +#define DMA2_MODE_REG 0xD6 /* mode register (w) */ +#define DMA2_CLEAR_FF_REG 0xD8 /* clear pointer flip-flop (w) */ +#define DMA2_TEMP_REG 0xDA /* Temporary Register (r) */ +#define DMA2_RESET_REG 0xDA /* Master Clear (w) */ +#define DMA2_CLR_MASK_REG 0xDC /* Clear Mask */ +#define DMA2_MASK_ALL_REG 0xDE /* all-channels mask (w) */ +#define DMA2_EXT_MODE_REG (0x400 | DMA2_MODE_REG) + +#define DMA_ADDR_0 0x00 /* DMA address registers */ +#define DMA_ADDR_1 0x02 +#define DMA_ADDR_2 0x04 +#define DMA_ADDR_3 0x06 +#define DMA_ADDR_4 0xC0 +#define DMA_ADDR_5 0xC4 +#define DMA_ADDR_6 0xC8 +#define DMA_ADDR_7 0xCC + +#define DMA_CNT_0 0x01 /* DMA count registers */ +#define DMA_CNT_1 0x03 +#define DMA_CNT_2 0x05 +#define DMA_CNT_3 0x07 +#define DMA_CNT_4 0xC2 +#define DMA_CNT_5 0xC6 +#define DMA_CNT_6 0xCA +#define DMA_CNT_7 0xCE + +#define DMA_PAGE_0 0x87 /* DMA page registers */ +#define DMA_PAGE_1 0x83 +#define DMA_PAGE_2 0x81 +#define DMA_PAGE_3 0x82 +#define DMA_PAGE_5 0x8B +#define DMA_PAGE_6 0x89 +#define DMA_PAGE_7 0x8A + +#define DMA_HIPAGE_0 (0x400 | DMA_PAGE_0) +#define DMA_HIPAGE_1 (0x400 | DMA_PAGE_1) +#define DMA_HIPAGE_2 (0x400 | DMA_PAGE_2) +#define DMA_HIPAGE_3 (0x400 | DMA_PAGE_3) +#define DMA_HIPAGE_4 (0x400 | DMA_PAGE_4) +#define DMA_HIPAGE_5 (0x400 | DMA_PAGE_5) +#define DMA_HIPAGE_6 (0x400 | DMA_PAGE_6) +#define DMA_HIPAGE_7 (0x400 | DMA_PAGE_7) + +#define DMA_MODE_READ 0x44 /* I/O to memory, no autoinit, increment, single mode */ +#define DMA_MODE_WRITE 0x48 /* memory to I/O, no autoinit, increment, single mode */ +#define DMA_MODE_CASCADE 0xC0 /* pass thru DREQ->HRQ, DACK<-HLDA only */ + +#define DMA_AUTOINIT 0x10 + +extern spinlock_t dma_spin_lock; + +static inline unsigned long claim_dma_lock(void) +{ + unsigned long flags; + + spin_lock_irqsave(&dma_spin_lock, flags); + return flags; +} + +static inline void release_dma_lock(unsigned long flags) +{ + spin_unlock_irqrestore(&dma_spin_lock, flags); +} + +/* enable/disable a specific DMA channel */ +static inline void enable_dma(unsigned int dmanr) +{ + if 
(dmanr <= 3) + dma_outb(dmanr, DMA1_MASK_REG); + else + dma_outb(dmanr & 3, DMA2_MASK_REG); +} + +static inline void disable_dma(unsigned int dmanr) +{ + if (dmanr <= 3) + dma_outb(dmanr | 4, DMA1_MASK_REG); + else + dma_outb((dmanr & 3) | 4, DMA2_MASK_REG); +} + +/* Clear the 'DMA Pointer Flip Flop'. + * Write 0 for LSB/MSB, 1 for MSB/LSB access. + * Use this once to initialize the FF to a known state. + * After that, keep track of it. :-) + * --- In order to do that, the DMA routines below should --- + * --- only be used while interrupts are disabled! --- + */ +static inline void clear_dma_ff(unsigned int dmanr) +{ + if (dmanr <= 3) + dma_outb(0, DMA1_CLEAR_FF_REG); + else + dma_outb(0, DMA2_CLEAR_FF_REG); +} + +/* set mode (above) for a specific DMA channel */ +static inline void set_dma_mode(unsigned int dmanr, char mode) +{ + if (dmanr <= 3) + dma_outb(mode | dmanr, DMA1_MODE_REG); + else + dma_outb(mode | (dmanr & 3), DMA2_MODE_REG); +} + +/* set extended mode for a specific DMA channel */ +static inline void set_dma_ext_mode(unsigned int dmanr, char ext_mode) +{ + if (dmanr <= 3) + dma_outb(ext_mode | dmanr, DMA1_EXT_MODE_REG); + else + dma_outb(ext_mode | (dmanr & 3), DMA2_EXT_MODE_REG); +} + +/* Set only the page register bits of the transfer address. + * This is used for successive transfers when we know the contents of + * the lower 16 bits of the DMA current address register. + */ +static inline void set_dma_page(unsigned int dmanr, unsigned int pagenr) +{ + switch (dmanr) { + case 0: + dma_outb(pagenr, DMA_PAGE_0); + dma_outb((pagenr >> 8), DMA_HIPAGE_0); + break; + case 1: + dma_outb(pagenr, DMA_PAGE_1); + dma_outb((pagenr >> 8), DMA_HIPAGE_1); + break; + case 2: + dma_outb(pagenr, DMA_PAGE_2); + dma_outb((pagenr >> 8), DMA_HIPAGE_2); + break; + case 3: + dma_outb(pagenr, DMA_PAGE_3); + dma_outb((pagenr >> 8), DMA_HIPAGE_3); + break; + case 5: + dma_outb(pagenr & 0xfe, DMA_PAGE_5); + dma_outb((pagenr >> 8), DMA_HIPAGE_5); + break; + case 6: + dma_outb(pagenr & 0xfe, DMA_PAGE_6); + dma_outb((pagenr >> 8), DMA_HIPAGE_6); + break; + case 7: + dma_outb(pagenr & 0xfe, DMA_PAGE_7); + dma_outb((pagenr >> 8), DMA_HIPAGE_7); + break; + } +} + + +/* Set transfer address & page bits for specific DMA channel. + * Assumes dma flipflop is clear. + */ +static inline void set_dma_addr(unsigned int dmanr, unsigned int a) +{ + if (dmanr <= 3) { + dma_outb(a & 0xff, ((dmanr & 3) << 1) + IO_DMA1_BASE); + dma_outb((a >> 8) & 0xff, ((dmanr & 3) << 1) + IO_DMA1_BASE); + } else { + dma_outb((a >> 1) & 0xff, ((dmanr & 3) << 2) + IO_DMA2_BASE); + dma_outb((a >> 9) & 0xff, ((dmanr & 3) << 2) + IO_DMA2_BASE); + } + set_dma_page(dmanr, a >> 16); /* set hipage last to enable 32-bit mode */ +} + + +/* Set transfer size (max 64k for DMA1..3, 128k for DMA5..7) for + * a specific DMA channel. + * You must ensure the parameters are valid. + * NOTE: from a manual: "the number of transfers is one more + * than the initial word count"! This is taken into account. + * Assumes dma flip-flop is clear. + * NOTE 2: "count" represents _bytes_ and must be even for channels 5-7. + */ +static inline void set_dma_count(unsigned int dmanr, unsigned int count) +{ + count--; + if (dmanr <= 3) { + dma_outb(count & 0xff, ((dmanr & 3) << 1) + 1 + IO_DMA1_BASE); + dma_outb((count >> 8) & 0xff, ((dmanr & 3) << 1) + 1 + IO_DMA1_BASE); + } else { + dma_outb((count >> 1) & 0xff, ((dmanr & 3) << 2) + 2 + IO_DMA2_BASE); + dma_outb((count >> 9) & 0xff, ((dmanr & 3) << 2) + 2 + IO_DMA2_BASE); + } +} + + +/* Get DMA residue count. 
After a DMA transfer, this + * should return zero. Reading this while a DMA transfer is + * still in progress will return unpredictable results. + * If called before the channel has been used, it may return 1. + * Otherwise, it returns the number of _bytes_ left to transfer. + * + * Assumes DMA flip-flop is clear. + */ +static inline int get_dma_residue(unsigned int dmanr) +{ + unsigned int io_port = (dmanr <= 3) ? + ((dmanr & 3) << 1) + 1 + IO_DMA1_BASE : + ((dmanr & 3) << 2) + 2 + IO_DMA2_BASE; + + /* using short to get 16-bit wrap around */ + unsigned short count; + + count = 1 + dma_inb(io_port); + count += dma_inb(io_port) << 8; + + return (dmanr <= 3) ? count : (count << 1); +} + + +/* These are in kernel/dma.c: */ +extern int request_dma(unsigned int dmanr, const char *device_id); /* reserve a DMA channel */ +extern void free_dma(unsigned int dmanr); /* release it again */ +#define KERNEL_HAVE_CHECK_DMA +extern int check_dma(unsigned int dmanr); + +/* From PCI */ + +#ifdef CONFIG_PCI +extern int isa_dma_bridge_buggy; +#else +#define isa_dma_bridge_buggy (0) +#endif + + +#endif /* _ASM_SW64_DMA_H */ diff --git a/arch/sw_64/include/asm/dmi.h b/arch/sw_64/include/asm/dmi.h new file mode 100644 index 000000000000..5142aa66ea45 --- /dev/null +++ b/arch/sw_64/include/asm/dmi.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw_64/include/asm/dmi.h + * + * Copyright (C) 2019 Deepin Limited. + * Porting by: Deepin Kernel Team (kernel@deepin.com) + * + * based on arch/x86/include/asm/dmi.h + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + */ + +#ifndef _ASM_SW64_DMI_H +#define _ASM_SW64_DMI_H + +#include <linux/io.h> +#include <linux/slab.h> +#include <asm/io.h> +#include <asm/early_ioremap.h> + +/* Use early IO mappings for DMI because it's initialized early */ +#define dmi_early_remap(x, l) early_ioremap(x, l) +#define dmi_early_unmap(x, l) early_iounmap(x, l) +#define dmi_remap(x, l) early_ioremap(x, l) +#define dmi_unmap(x) early_iounmap(x, 0) +#define dmi_alloc(l) kzalloc(l, GFP_KERNEL) + +#endif diff --git a/arch/sw_64/include/asm/early_ioremap.h b/arch/sw_64/include/asm/early_ioremap.h new file mode 100644 index 000000000000..6f6fc6218cb3 --- /dev/null +++ b/arch/sw_64/include/asm/early_ioremap.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_EARLY_IOREMAP_H +#define _ASM_SW64_EARLY_IOREMAP_H + +#include <asm/page.h> +#include <asm/io.h> + +static inline void __iomem * +early_ioremap(unsigned long phys_addr, unsigned long size) +{ + unsigned long y = 0; + + if (phys_addr >= __START_KERNEL_map) { + y = (unsigned long) phys_to_virt(__pa(phys_addr)); + } else { + y = phys_addr; + y += PAGE_OFFSET; + } + + return (void __iomem *) y; +} +#define early_memremap(phys_addr, size) early_ioremap(phys_addr, size) + +static inline void early_iounmap(volatile void __iomem *addr, unsigned long size) +{ + return; +} +#define early_memunmap(addr, size) early_iounmap(addr, size) + +#endif diff --git a/arch/sw_64/include/asm/efi.h b/arch/sw_64/include/asm/efi.h new file mode 100644 index 000000000000..2bc863e3b836 --- /dev/null +++ b/arch/sw_64/include/asm/efi.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_EFI_H +#define _ASM_SW64_EFI_H + +#include <asm/io.h> +#include <asm/early_ioremap.h> +#ifdef CONFIG_EFI +extern void efi_init(void); +#else +#define efi_init() +#define efi_idmap_init() 
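+
+/*
+ * Illustrative expansion of arch_efi_call_virt() below (a sketch; the
+ * efi.runtime pointer comes from the generic EFI runtime wrappers):
+ *
+ *	arch_efi_call_virt(efi.runtime, get_time, tm, tc)
+ *
+ * declares an efi_get_time_t pointer, loads efi.runtime->get_time into
+ * it and invokes it as get_time(tm, tc).
+ */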
+#endif + +#define arch_efi_call_virt_setup() +#define arch_efi_call_virt_teardown() + +#define arch_efi_call_virt(p, f, args...) \ +({ \ + efi_##f##_t * __f; \ + __f = p->f; \ + __f(args); \ +}) + +#define ARCH_EFI_IRQ_FLAGS_MASK 0x00000001 + +/* arch specific definitions used by the stub code */ + +/* + * AArch64 requires the DTB to be 8-byte aligned in the first 512MiB from + * start of kernel and may not cross a 2MiB boundary. We set alignment to + * 2MiB so we know it won't cross a 2MiB boundary. + */ +#define EFI_FDT_ALIGN SZ_2M /* used by allocate_new_fdt_and_exit_boot() */ +#define MAX_FDT_OFFSET SZ_512M + +#define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__) + +#endif /* _ASM_SW64_EFI_H */ diff --git a/arch/sw_64/include/asm/elf.h b/arch/sw_64/include/asm/elf.h new file mode 100644 index 000000000000..150629b0b615 --- /dev/null +++ b/arch/sw_64/include/asm/elf.h @@ -0,0 +1,170 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_ELF_H +#define _ASM_SW64_ELF_H +#ifdef __KERNEL__ +#include <asm/auxvec.h> +#include <asm/special_insns.h> +#endif +/* Special values for the st_other field in the symbol table. */ + + +#define STO_SW64_NOPV 0x80 +#define STO_SW64_STD_GPLOAD 0x88 + +/* + * SW-64 ELF relocation types + */ +#define R_SW64_NONE 0 /* No reloc */ +#define R_SW64_REFLONG 1 /* Direct 32 bit */ +#define R_SW64_REFQUAD 2 /* Direct 64 bit */ +#define R_SW64_GPREL32 3 /* GP relative 32 bit */ +#define R_SW64_LITERAL 4 /* GP relative 16 bit w/optimization */ +#define R_SW64_LITUSE 5 /* Optimization hint for LITERAL */ +#define R_SW64_GPDISP 6 /* Add displacement to GP */ +#define R_SW64_BRADDR 7 /* PC+4 relative 23 bit shifted */ +#define R_SW64_HINT 8 /* PC+4 relative 16 bit shifted */ +#define R_SW64_SREL16 9 /* PC relative 16 bit */ +#define R_SW64_SREL32 10 /* PC relative 32 bit */ +#define R_SW64_SREL64 11 /* PC relative 64 bit */ +#define R_SW64_GPRELHIGH 17 /* GP relative 32 bit, high 16 bits */ +#define R_SW64_GPRELLOW 18 /* GP relative 32 bit, low 16 bits */ +#define R_SW64_GPREL16 19 /* GP relative 16 bit */ +#define R_SW64_COPY 24 /* Copy symbol at runtime */ +#define R_SW64_GLOB_DAT 25 /* Create GOT entry */ +#define R_SW64_JMP_SLOT 26 /* Create PLT entry */ +#define R_SW64_RELATIVE 27 /* Adjust by program base */ +#define R_SW64_BRSGP 28 +#define R_SW64_TLSGD 29 +#define R_SW64_TLS_LDM 30 +#define R_SW64_DTPMOD64 31 +#define R_SW64_GOTDTPREL 32 +#define R_SW64_DTPREL64 33 +#define R_SW64_DTPRELHI 34 +#define R_SW64_DTPRELLO 35 +#define R_SW64_DTPREL16 36 +#define R_SW64_GOTTPREL 37 +#define R_SW64_TPREL64 38 +#define R_SW64_TPRELHI 39 +#define R_SW64_TPRELLO 40 +#define R_SW64_TPREL16 41 +#define R_SW64_LITERAL_GOT 43 /* GP relative */ + +#define SHF_SW64_GPREL 0x10000000 + +/* Legal values for e_flags field of Elf64_Ehdr. */ + +#define EF_SW64_32BIT 1 /* All addresses are below 2GB */ + +/* + * ELF register definitions.. + */ + +/* + * The legacy version of <sys/procfs.h> makes gregset_t 46 entries long. + * I have no idea why that is so. For now, we just leave it at 33 + * (32 general regs + processor status word). + */ +#define ELF_NGREG 33 +#define ELF_NFPREG 32 + + +typedef unsigned long elf_greg_t; +typedef elf_greg_t elf_gregset_t[ELF_NGREG]; + +typedef double elf_fpreg_t; +typedef elf_fpreg_t elf_fpregset_t[ELF_NFPREG]; + +/* + * This is used to ensure we don't load something for the wrong architecture. + */ +#define elf_check_arch(x) ((x)->e_machine == EM_SW64) + +/* + * These are used to set parameters in the core dumps. 
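+ * sw64 is a 64-bit little-endian target, hence the ELFCLASS64 and
+ * ELFDATA2LSB values below.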
+ */ +#define ELF_CLASS ELFCLASS64 +#define ELF_DATA ELFDATA2LSB +#define ELF_ARCH EM_SW64 + +#define ELF_EXEC_PAGESIZE 8192 + +/* + * This is the location that an ET_DYN program is loaded if exec'ed. Typical + * use of this is to invoke "./ld.so someprog" to test out a new version of + * the loader. We need to make sure that it is out of the way of the program + * that it will "exec", and that there is sufficient room for the brk. + */ + +#define ELF_ET_DYN_BASE (TASK_UNMAPPED_BASE + 0x1000000) + +/* + * $0 is set by ld.so to a pointer to a function which might be + * registered using atexit. This provides a means for the dynamic + * linker to call DT_FINI functions for shared libraries that have + * been loaded before the code runs. + + * So that we can use the same startup file with static executables, + * we start programs with a value of 0 to indicate that there is no + * such function. + */ + +#define ELF_PLAT_INIT(_r, load_addr) (_r->r0 = 0) + +/* + * The registers are laid out in pt_regs for HMCODE and syscall + * convenience. Re-order them for the linear elf_gregset_t. + */ + +#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1 +struct linux_binprm; +extern int arch_setup_additional_pages(struct linux_binprm *bprm, + int uses_interp); + +#ifdef __KERNEL__ +struct pt_regs; +struct thread_info; +struct task_struct; +extern void dump_elf_thread(elf_greg_t *dest, struct pt_regs *pt, + struct thread_info *ti); +#define ELF_CORE_COPY_REGS(DEST, REGS) \ + dump_elf_thread(DEST, REGS, current_thread_info()); + +/* Similar, but for a thread other than current. */ + +extern int dump_elf_task(elf_greg_t *dest, struct task_struct *task); +#define ELF_CORE_COPY_TASK_REGS(TASK, DEST) dump_elf_task(*(DEST), TASK) + +/* Similar, but for the FP registers. */ + +extern int dump_elf_task_fp(elf_fpreg_t *dest, struct task_struct *task); +#define ELF_CORE_COPY_FPREGS(TASK, DEST) dump_elf_task_fp(*(DEST), TASK) + +/* + * This yields a mask that user programs can use to figure out what + * instruction set this CPU supports. This is trivial on SW-64, + * but not so on other machines. + */ + +#define ELF_HWCAP (~amask(-1)) + +/* + * This yields a string that ld.so will use to load implementation + * specific libraries for optimization. This is more specific in + * intent than poking at uname or /proc/cpuinfo. 
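+ *
+ * For example, a glibc-style ld.so substitutes this string for the
+ * $PLATFORM token in DT_RPATH/DT_RUNPATH entries, so a library path of
+ * /usr/lib/$PLATFORM would resolve to /usr/lib/sw_64 here.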
+ */ + +#define ELF_PLATFORM ("sw_64") + +/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */ +#define ARCH_DLINFO \ +do { \ + NEW_AUX_ENT(AT_SYSINFO_EHDR, \ + (elf_addr_t)current->mm->context.vdso); \ +} while (0) + +struct mm_struct; +extern unsigned long arch_randomize_brk(struct mm_struct *mm); +#define arch_randomize_brk arch_randomize_brk +#endif + +#endif /* _ASM_SW64_ELF_H */ diff --git a/arch/sw_64/include/asm/emergency-restart.h b/arch/sw_64/include/asm/emergency-restart.h new file mode 100644 index 000000000000..fabb33ebf0eb --- /dev/null +++ b/arch/sw_64/include/asm/emergency-restart.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_EMERGENCY_RESTART_H +#define _ASM_SW64_EMERGENCY_RESTART_H + +#include <asm-generic/emergency-restart.h> + +#endif /* _ASM_SW64_EMERGENCY_RESTART_H */ diff --git a/arch/sw_64/include/asm/exec.h b/arch/sw_64/include/asm/exec.h new file mode 100644 index 000000000000..4a9cb71c5c47 --- /dev/null +++ b/arch/sw_64/include/asm/exec.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_EXEC_H +#define _ASM_SW64_EXEC_H + +#define arch_align_stack(x) (x) + +#endif /* _ASM_SW64_EXEC_H */ diff --git a/arch/sw_64/include/asm/extable.h b/arch/sw_64/include/asm/extable.h new file mode 100644 index 000000000000..12b50b68a0d2 --- /dev/null +++ b/arch/sw_64/include/asm/extable.h @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_EXTABLE_H +#define _ASM_SW64_EXTABLE_H + +/* + * About the exception table: + * + * - insn is a 32-bit pc-relative offset from the faulting insn. + * - nextinsn is a 16-bit offset off of the faulting instruction + * (not off of the *next* instruction as branches are). + * - errreg is the register in which to place -EFAULT. + * - valreg is the final target register for the load sequence + * and will be zeroed. + * + * Either errreg or valreg may be $31, in which case nothing happens. + * + * The exception fixup information "just so happens" to be arranged + * as in a MEM format instruction. This lets us emit our three + * values like so: + * + * lda valreg, nextinsn(errreg) + * + */ + +struct exception_table_entry { + signed int insn; + union exception_fixup { + unsigned int unit; + struct { + signed int nextinsn : 16; + unsigned int errreg : 5; + unsigned int valreg : 5; + } bits; + } fixup; +}; + +/* Returns the new pc */ +#define fixup_exception(map_reg, _fixup, pc) \ +({ \ + if ((_fixup)->fixup.bits.valreg != 31) \ + map_reg((_fixup)->fixup.bits.valreg) = 0; \ + if ((_fixup)->fixup.bits.errreg != 31) \ + map_reg((_fixup)->fixup.bits.errreg) = -EFAULT; \ + (pc) + (_fixup)->fixup.bits.nextinsn; \ +}) + +#define ARCH_HAS_RELATIVE_EXTABLE + +#define swap_ex_entry_fixup(a, b, tmp, delta) \ + do { \ + (a)->fixup.unit = (b)->fixup.unit; \ + (b)->fixup.unit = (tmp).fixup.unit; \ + } while (0) + +#endif diff --git a/arch/sw_64/include/asm/floppy.h b/arch/sw_64/include/asm/floppy.h new file mode 100644 index 000000000000..f4646d99d80c --- /dev/null +++ b/arch/sw_64/include/asm/floppy.h @@ -0,0 +1,116 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Architecture specific parts of the Floppy driver + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. 
+ * + * Copyright (C) 1995 + */ +#ifndef _ASM_SW64_FLOPPY_H +#define _ASM_SW64_FLOPPY_H + +#define fd_inb(port) inb_p(port) +#define fd_outb(value, port) outb_p(value, port) + +#define fd_enable_dma() enable_dma(FLOPPY_DMA) +#define fd_disable_dma() disable_dma(FLOPPY_DMA) +#define fd_request_dma() request_dma(FLOPPY_DMA, "floppy") +#define fd_free_dma() free_dma(FLOPPY_DMA) +#define fd_clear_dma_ff() clear_dma_ff(FLOPPY_DMA) +#define fd_set_dma_mode(mode) set_dma_mode(FLOPPY_DMA, mode) +#define fd_set_dma_addr(addr) set_dma_addr(FLOPPY_DMA, virt_to_bus(addr)) +#define fd_set_dma_count(count) set_dma_count(FLOPPY_DMA, count) +#define fd_enable_irq() enable_irq(FLOPPY_IRQ) +#define fd_disable_irq() disable_irq(FLOPPY_IRQ) +#define fd_cacheflush(addr, size) /* nothing */ +#define fd_request_irq() \ + request_irq(FLOPPY_IRQ, floppy_interrupt, 0, "floppy", NULL) +#define fd_free_irq() free_irq(FLOPPY_IRQ, NULL) + +#ifdef CONFIG_PCI + +#include <linux/pci.h> + +#define fd_dma_setup(addr, size, mode, io) \ + sw64_fd_dma_setup(addr, size, mode, io) + +static inline int +sw64_fd_dma_setup(char *addr, unsigned long size, int mode, int io) +{ + static unsigned long prev_size; + static dma_addr_t bus_addr; + static char *prev_addr; + static int prev_dir; + int dir; + + dir = (mode != DMA_MODE_READ) ? PCI_DMA_FROMDEVICE : PCI_DMA_TODEVICE; + + if (bus_addr + && (addr != prev_addr || size != prev_size || dir != prev_dir)) { + /* different from last time -- unmap prev */ + bus_addr = 0; + } + + if (!bus_addr) /* need to map it */ + bus_addr = virt_to_bus(addr); + + /* remember this one as prev */ + prev_addr = addr; + prev_size = size; + prev_dir = dir; + + fd_clear_dma_ff(); + fd_cacheflush(addr, size); + fd_set_dma_mode(mode); + set_dma_addr(FLOPPY_DMA, bus_addr); + fd_set_dma_count(size); + virtual_dma_port = io; + fd_enable_dma(); + + return 0; +} + +#endif /* CONFIG_PCI */ + +inline void virtual_dma_init(void) +{ + /* Nothing to do on an sw64 */ +} + +static int FDC1 = 0x3f0; +static int FDC2 = -1; + +/* + * Again, the CMOS information doesn't work on the sw64.. + */ +#define FLOPPY0_TYPE 6 +#define FLOPPY1_TYPE 0 + +#define N_FDC 2 +#define N_DRIVE 8 + +/* + * Most sw64s have no problems with floppy DMA crossing 64k borders, + * except for certain ones, like XL and RUFFIAN. + * + * However, the test is simple and fast, and this *is* floppy, after all, + * so we do it for all platforms, just to make sure. + * + * This is advantageous in other circumstances as well, as in moving + * about the PCI DMA windows and forcing the floppy to start doing + * scatter-gather when it never had before, and there *is* a problem + * on that platform... ;-} + */ + +static inline unsigned long CROSS_64KB(void *a, unsigned long s) +{ + unsigned long p = (unsigned long)a; + + return ((p + s - 1) ^ p) & ~0xffffUL; +} + +#define EXTRA_FLOPPY_PARAMS + +#endif /* __ASM_SW64_FLOPPY_H */ diff --git a/arch/sw_64/include/asm/fpu.h b/arch/sw_64/include/asm/fpu.h new file mode 100644 index 000000000000..a0b0ff5af168 --- /dev/null +++ b/arch/sw_64/include/asm/fpu.h @@ -0,0 +1,91 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_FPU_H +#define _ASM_SW64_FPU_H + +#include <uapi/asm/fpu.h> +#ifdef __KERNEL__ + +/* + * The following two functions don't need trapb/excb instructions + * around the mf_fpcr/mt_fpcr instructions because (a) the kernel + * never generates arithmetic faults and (b) sys_call instructions + * are implied trap barriers. 
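+ *
+ * The FPCR itself is only reachable through a floating-point register,
+ * which is why rdfpcr()/wrfpcr() below stage the value through $f0 and
+ * save/restore the caller's $f0 on the stack with vstd/vldd.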
+ */ +static inline unsigned long +rdfpcr(void) +{ + unsigned long ret; + unsigned long fp[4] __aligned(32); + + __asm__ __volatile__ ( + " vstd $f0, %0\n\t" + " rfpcr $f0\n\t" + " fimovd $f0, %1\n\t" + " vldd $f0, %0\n\t" + : "=m"(*fp), "=r"(ret)); + + return ret; +} + +static inline void +wrfpcr(unsigned long val) +{ + unsigned long tmp; + unsigned long fp[4] __aligned(32); + + __asm__ __volatile__ ( + " vstd $f0, %0\n\t" + " ifmovd %2, $f0\n\t" + " wfpcr $f0\n\t" + " and %2, 0x3, %1\n\t" + " beq %1, 1f\n\t" + " subl %1, 1, %1\n\t" + " beq %1, 2f\n\t" + " subl %1, 1, %1\n\t" + " beq %1, 3f\n\t" + " setfpec3\n\t" + " br 6f\n\t" + "1: setfpec0\n\t" + " br 6f\n\t" + "2: setfpec1\n\t" + " br 6f\n\t" + "3: setfpec2\n\t" + "6: vldd $f0, %0\n\t" + : "=m"(*fp), "=&r"(tmp) : "r"(val)); +} + +static inline unsigned long +swcr_update_status(unsigned long swcr, unsigned long fpcr) +{ + /* + * SW64 implements most of the bits in hardware. Collect + * the accrued exception bits from the real fpcr. + */ + swcr &= ~(IEEE_STATUS_MASK0 | IEEE_STATUS_MASK1 + | IEEE_STATUS_MASK2 | IEEE_STATUS_MASK3); + swcr |= (fpcr >> 35) & IEEE_STATUS_MASK0; + swcr |= (fpcr >> 13) & IEEE_STATUS_MASK1; + swcr |= (fpcr << 14) & IEEE_STATUS_MASK2; + swcr |= (fpcr << 36) & IEEE_STATUS_MASK3; + return swcr; +} + +extern unsigned long sw64_read_fp_reg(unsigned long reg); +extern void sw64_write_fp_reg(unsigned long reg, unsigned long val); +extern unsigned long sw64_read_fp_reg_s(unsigned long reg); +extern void sw64_write_fp_reg_s(unsigned long reg, unsigned long val); + + +extern void sw64_write_simd_fp_reg_s(unsigned long reg, + unsigned long f0, unsigned long f1); +extern void sw64_write_simd_fp_reg_d(unsigned long reg, + unsigned long f0, unsigned long f1, + unsigned long f2, unsigned long f3); +extern void sw64_write_simd_fp_reg_ldwe(unsigned long reg, int a); +extern void sw64_read_simd_fp_m_s(unsigned long reg, unsigned long *fp_value); +extern void sw64_read_simd_fp_m_d(unsigned long reg, unsigned long *fp_value); + +#endif /* __KERNEL__ */ + +#endif /* _ASM_SW64_FPU_H */ diff --git a/arch/sw_64/include/asm/ftrace.h b/arch/sw_64/include/asm/ftrace.h new file mode 100644 index 000000000000..ea82224e5826 --- /dev/null +++ b/arch/sw_64/include/asm/ftrace.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw_64/include/asm/ftrace.h + * + * Copyright (C) 2019, serveros, linyue + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#ifndef _ASM_SW64_FTRACE_H +#define _ASM_SW64_FTRACE_H +#include <asm/insn.h> + +#define MCOUNT_ADDR ((unsigned long)_mcount) +#define MCOUNT_INSN_SIZE SW64_INSN_SIZE + +#ifndef __ASSEMBLY__ +#include <linux/compat.h> + +extern void _mcount(unsigned long); + +struct dyn_arch_ftrace { + /* No extra data needed for sw64 */ +}; + +extern unsigned long ftrace_graph_call; + + +static inline unsigned long ftrace_call_adjust(unsigned long addr) +{ + /* + * addr is the address of the mcount call instruction. + * recordmcount does the necessary offset calculation. 
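+	 * No runtime fix-up is needed on sw64, so the address is returned
+	 * unchanged.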
+ */ + return addr; +} + +#endif /* ifndef __ASSEMBLY__ */ +#endif /* _ASM_SW64_FTRACE_H */ diff --git a/arch/sw_64/include/asm/futex.h b/arch/sw_64/include/asm/futex.h new file mode 100644 index 000000000000..50324470173f --- /dev/null +++ b/arch/sw_64/include/asm/futex.h @@ -0,0 +1,133 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_FUTEX_H +#define _ASM_SW64_FUTEX_H + +#ifdef __KERNEL__ + +#include <linux/futex.h> +#include <linux/uaccess.h> +#include <asm/errno.h> +#include <asm/barrier.h> + +#ifndef LOCK_MEMB +#ifdef CONFIG_LOCK_MEMB +#define LOCK_MEMB "memb\n" +#else +#define LOCK_MEMB +#endif +#endif + +#ifndef LOCK_FIXUP +#ifdef CONFIG_LOCK_FIXUP +#define LOCK_FIXUP "memb\n" +#else +#define LOCK_FIXUP +#endif +#endif + +#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg, tmp) \ + __asm__ __volatile__( \ + LOCK_MEMB \ + "1: lldw %0, 0(%3)\n" \ + " ldi %2, 1\n" \ + " wr_f %2\n" \ + insn \ + LOCK_FIXUP \ + "2: lstw %1, 0(%3)\n" \ + " rd_f %2\n" \ + " beq %2, 4f\n" \ + " bis $31, $31, %1\n" \ + "3: .subsection 2\n" \ + "4: br 1b\n" \ + " .previous\n" \ + " .section __ex_table, \"a\"\n" \ + " .long 1b-.\n" \ + " ldi $31, 3b-1b(%1)\n" \ + " .long 2b-.\n" \ + " ldi $31, 3b-2b(%1)\n" \ + " .previous\n" \ + : "=&r" (oldval), "=&r"(ret), "=&r"(tmp) \ + : "r" (uaddr), "r"(oparg) \ + : "memory") + +static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, + u32 __user *uaddr) +{ + int oldval = 0, ret; + unsigned long tmp; + + pagefault_disable(); + + switch (op) { + case FUTEX_OP_SET: + __futex_atomic_op("mov %4, %1\n", ret, oldval, uaddr, oparg, tmp); + break; + case FUTEX_OP_ADD: + __futex_atomic_op("addw %0, %4, %1\n", ret, oldval, uaddr, oparg, tmp); + break; + case FUTEX_OP_OR: + __futex_atomic_op("or %0, %4, %1\n", ret, oldval, uaddr, oparg, tmp); + break; + case FUTEX_OP_ANDN: + __futex_atomic_op("andnot %0, %4, %1\n", ret, oldval, uaddr, oparg, tmp); + break; + case FUTEX_OP_XOR: + __futex_atomic_op("xor %0, %4, %1\n", ret, oldval, uaddr, oparg, tmp); + break; + default: + ret = -ENOSYS; + } + + pagefault_enable(); + + if (!ret) + *oval = oldval; + + return ret; +} + +static inline int +futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, + u32 oldval, u32 newval) +{ + int ret = 0, cmp; + u32 prev, tmp; + + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + + __asm__ __volatile__ ( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + "1: lldw %1, 0(%4)\n" + " cmpeq %1, %5, %2\n" + " wr_f %2\n" + " bis $31, %6, %3\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + "2: lstw %3, 0(%4)\n" + " rd_f %3\n" + " beq %2, 3f\n" + " beq %3, 4f\n" + "3: .subsection 2\n" + "4: br 1b\n" + " .previous\n" + " .section __ex_table, \"a\"\n" + " .long 1b-.\n" + " ldi $31, 3b-1b(%0)\n" + " .long 2b-.\n" + " ldi $31, 3b-2b(%0)\n" + " .previous\n" + : "+r"(ret), "=&r"(prev), "=&r"(cmp), "=&r"(tmp) + : "r"(uaddr), "r"((long)(int)oldval), "r"(newval) + : "memory"); + + *uval = prev; + return ret; +} + +#endif /* __KERNEL__ */ + +#endif /* _ASM_SW64_FUTEX_H */ diff --git a/arch/sw_64/include/asm/hardirq.h b/arch/sw_64/include/asm/hardirq.h new file mode 100644 index 000000000000..03368c3659dd --- /dev/null +++ b/arch/sw_64/include/asm/hardirq.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_HARDIRQ_H +#define _ASM_SW64_HARDIRQ_H + +void ack_bad_irq(unsigned int irq); +#define ack_bad_irq ack_bad_irq + +#include <linux/irq.h> + +#define __ARCH_IRQ_STAT +typedef struct { + u16 __softirq_pending; + unsigned int timer_irqs_event; +} 
____cacheline_aligned irq_cpustat_t; + +DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); + +#define inc_irq_stat(member) this_cpu_inc(irq_stat.member) +#define arch_irq_stat_cpu arch_irq_stat_cpu +#define arch_irq_stat arch_irq_stat +extern u64 arch_irq_stat_cpu(unsigned int cpu); +extern u64 arch_irq_stat(void); + +#endif /* _ASM_SW64_HARDIRQ_H */ diff --git a/arch/sw_64/include/asm/hcall.h b/arch/sw_64/include/asm/hcall.h new file mode 100644 index 000000000000..8117752b657e --- /dev/null +++ b/arch/sw_64/include/asm/hcall.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_HCALL_H +#define _ASM_SW64_HCALL_H + +#define HMC_hcall 0x32 +/* HCALL must > 0 */ +enum HCALL_TYPE { + HCALL_HALT = 10, + HCALL_NOTIFY = 11, + HCALL_SHUTDOWN = 12, + HCALL_SET_CLOCKEVENT = 13, + HCALL_IVI = 14, /* interrupt between virtual cpu */ + HCALL_TBI = 15, /* tlb flush for virtual cpu */ + HCALL_STOP = 16, /* indicate virtual cpu stopped */ + HCALL_RESTART = 17, /* indicate virtual cpu restarted */ + HCALL_MSI = 18, /* guest request msi intr */ + HCALL_MSIX = 19, /* guest request msix intr */ + HCALL_SWNET = 20, /* guest request swnet service */ + HCALL_SWNET_IRQ = 21, /* guest request swnet intr */ + HCALL_FATAL_ERROR = 22, /* guest fatal error, issued by hmcode */ + NR_HCALL +}; + +static inline unsigned long hcall(unsigned long hcall, unsigned long arg0, + unsigned long arg1, unsigned long arg2) +{ + register unsigned long __r0 __asm__("$0"); + register unsigned long __r16 __asm__("$16") = hcall; + register unsigned long __r17 __asm__("$17") = arg0; + register unsigned long __r18 __asm__("$18") = arg1; + register unsigned long __r19 __asm__("$19") = arg2; + + __asm__ __volatile__( + "sys_call %5 " + : "=r"(__r16), "=r"(__r17), "=r"(__r18), "=r"(__r19), "=r"(__r0) + : "i"(HMC_hcall), "0"(__r16), "1"(__r17), "2"(__r18), "3"(__r19) + : "$1", "$22", "$23", "$24", "$25"); + return __r0; +} + +#endif /* _ASM_SW64_HCALL_H */ diff --git a/arch/sw_64/include/asm/hmcall.h b/arch/sw_64/include/asm/hmcall.h new file mode 100644 index 000000000000..310cc61a5a34 --- /dev/null +++ b/arch/sw_64/include/asm/hmcall.h @@ -0,0 +1,205 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_HMC_H +#define _ASM_SW64_HMC_H + +/* + * Common HMC-code + */ +/* 0x0 - 0x3F : Kernel Level HMC routine */ +#define HMC_halt 0x00 +#define HMC_rdio64 0x01 +#define HMC_rdio32 0x02 +#define HMC_cpuid 0x03 +#define HMC_sleepen 0x05 +#define HMC_rdksp 0x06 +#define HMC_rdptbr 0x0B +#define HMC_wrptbr 0x0C +#define HMC_wrksp 0x0E +#define HMC_mtinten 0x0F +#define HMC_load_mm 0x11 +#define HMC_rdpcbb 0x12 +#define HMC_wrpcbb 0x13 +#define HMC_tbisasn 0x14 +#define HMC_tbivpn 0x19 +#define HMC_ret 0x1A +#define HMC_wrvpcr 0x29 +#define HMC_wrfen 0x2B +#define HMC_kvcpucb 0x2C +#define HMC_sflush 0x2F +#define HMC_swpctx 0x30 +#define HMC_entervm 0x31 +#define HMC_hcall 0x32 +#define HMC_tbi 0x33 +#define HMC_wrent 0x34 +#define HMC_swpipl 0x35 +#define HMC_rdps 0x36 +#define HMC_wrkgp 0x37 +#define HMC_wrusp 0x38 +#define HMC_rvpcr 0x39 +#define HMC_rdusp 0x3A +#define HMC_wrtimer 0x3B +#define HMC_whami 0x3C +#define HMC_retsys 0x3D +#define HMC_sendii 0x3E +#define HMC_rti 0x3F + + +/* 0x80 - 0xBF : User Level HMC routine */ +#define HMC_bpt 0x80 +#define HMC_callsys 0x83 +#define HMC_imb 0x86 +#define HMC_rwreg 0x87 +#define HMC_rdunique 0x9E +#define HMC_wrunique 0x9F +#define HMC_sz_uflush 0xA8 +#define HMC_gentrap 0xAA +#define HMC_wrperfmon 0xB0 +#define HMC_longtime 0xB1 + +#ifdef __KERNEL__ 
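+
+/*
+ * The __CALL_HMC_* macros below generate one inline assembly wrapper
+ * per HMC service. As an illustrative expansion (a sketch, with the
+ * constraint lists elided), __CALL_HMC_RW1(swpipl, unsigned long,
+ * unsigned long) yields roughly:
+ *
+ *	static inline unsigned long swpipl(unsigned long arg0)
+ *	{
+ *		register unsigned long __r0 __asm__("$0");
+ *		register unsigned long __r16 __asm__("$16") = arg0;
+ *		__asm__ __volatile__("sys_call %2 # swpipl" ...);
+ *		return __r0;
+ *	}
+ */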
+#ifndef __ASSEMBLY__ + +extern void halt(void) __attribute__((noreturn)); +#define __halt() __asm__ __volatile__ ("sys_call %0 #halt" : : "i" (HMC_halt)) + +#define imb() \ + __asm__ __volatile__ ("sys_call %0 #imb" : : "i" (HMC_imb) : "memory") + +#define __CALL_HMC_R0(NAME, TYPE) \ +static inline TYPE NAME(void) \ +{ \ + register TYPE __r0 __asm__("$0"); \ + __asm__ __volatile__( \ + "sys_call %1 # " #NAME \ + : "=r" (__r0) \ + : "i" (HMC_ ## NAME) \ + : "$1", "$16", "$22", "$23", "$24", "$25"); \ + return __r0; \ +} + +#define __CALL_HMC_W1(NAME, TYPE0) \ +static inline void NAME(TYPE0 arg0) \ +{ \ + register TYPE0 __r16 __asm__("$16") = arg0; \ + __asm__ __volatile__( \ + "sys_call %1 # "#NAME \ + : "=r"(__r16) \ + : "i"(HMC_ ## NAME), "0"(__r16) \ + : "$1", "$22", "$23", "$24", "$25"); \ +} + +#define __CALL_HMC_W2(NAME, TYPE0, TYPE1) \ +static inline void NAME(TYPE0 arg0, TYPE1 arg1) \ +{ \ + register TYPE0 __r16 __asm__("$16") = arg0; \ + register TYPE1 __r17 __asm__("$17") = arg1; \ + __asm__ __volatile__( \ + "sys_call %2 # "#NAME \ + : "=r"(__r16), "=r"(__r17) \ + : "i"(HMC_ ## NAME), "0"(__r16), "1"(__r17) \ + : "$1", "$22", "$23", "$24", "$25"); \ +} + +#define __CALL_HMC_RW1(NAME, RTYPE, TYPE0) \ +static inline RTYPE NAME(TYPE0 arg0) \ +{ \ + register RTYPE __r0 __asm__("$0"); \ + register TYPE0 __r16 __asm__("$16") = arg0; \ + __asm__ __volatile__( \ + "sys_call %2 # "#NAME \ + : "=r"(__r16), "=r"(__r0) \ + : "i"(HMC_ ## NAME), "0"(__r16) \ + : "$1", "$22", "$23", "$24", "$25"); \ + return __r0; \ +} + +#define __CALL_HMC_RW2(NAME, RTYPE, TYPE0, TYPE1) \ +static inline RTYPE NAME(TYPE0 arg0, TYPE1 arg1) \ +{ \ + register RTYPE __r0 __asm__("$0"); \ + register TYPE0 __r16 __asm__("$16") = arg0; \ + register TYPE1 __r17 __asm__("$17") = arg1; \ + __asm__ __volatile__( \ + "sys_call %3 # "#NAME \ + : "=r"(__r16), "=r"(__r17), "=r"(__r0) \ + : "i"(HMC_ ## NAME), "0"(__r16), "1"(__r17) \ + : "$1", "$22", "$23", "$24", "$25"); \ + return __r0; \ +} + +#define __CALL_HMC_RW3(NAME, RTYPE, TYPE0, TYPE1, TYPE2) \ +static inline RTYPE NAME(TYPE0 arg0, TYPE1 arg1, TYPE2 arg2) \ +{ \ + register RTYPE __r0 __asm__("$0"); \ + register TYPE0 __r16 __asm__("$16") = arg0; \ + register TYPE1 __r17 __asm__("$17") = arg1; \ + register TYPE2 __r18 __asm__("$18") = arg2; \ + __asm__ __volatile__( \ + "sys_call %4 # "#NAME \ + : "=r"(__r16), "=r"(__r17), "=r"(__r18), "=r"(__r0) \ + : "i"(HMC_ ## NAME), "0"(__r16), "1"(__r17), "2"(__r18) \ + : "$1", "$22", "$23", "$24", "$25"); \ + return __r0; \ +} + +#define sflush() \ +{ \ + __asm__ __volatile__("sys_call 0x2f"); \ +} + +__CALL_HMC_R0(rdps, unsigned long); + +__CALL_HMC_R0(rdusp, unsigned long); +__CALL_HMC_W1(wrusp, unsigned long); + +__CALL_HMC_R0(rdksp, unsigned long); +__CALL_HMC_W1(wrksp, unsigned long); + +__CALL_HMC_W2(load_mm, unsigned long, unsigned long); +__CALL_HMC_R0(rdpcbb, unsigned long); +__CALL_HMC_W1(wrpcbb, unsigned long); + +__CALL_HMC_R0(rdptbr, unsigned long); +__CALL_HMC_W1(wrptbr, unsigned long); + +__CALL_HMC_RW1(swpipl, unsigned long, unsigned long); +__CALL_HMC_R0(whami, unsigned long); +__CALL_HMC_RW1(rdio64, unsigned long, unsigned long); +__CALL_HMC_RW1(rdio32, unsigned int, unsigned long); +__CALL_HMC_R0(kvcpucb, unsigned long); +__CALL_HMC_R0(sleepen, unsigned long); +__CALL_HMC_R0(mtinten, unsigned long); +__CALL_HMC_W2(wrent, void*, unsigned long); +__CALL_HMC_W2(tbisasn, unsigned long, unsigned long); +__CALL_HMC_W1(wrkgp, unsigned long); +__CALL_HMC_RW2(wrperfmon, unsigned long, unsigned long, unsigned long); 
+__CALL_HMC_RW3(sendii, unsigned long, unsigned long, unsigned long, unsigned long); +__CALL_HMC_W1(wrtimer, unsigned long); +__CALL_HMC_RW3(tbivpn, unsigned long, unsigned long, unsigned long, unsigned long); +__CALL_HMC_RW2(cpuid, unsigned long, unsigned long, unsigned long); + +/* + * TB routines.. + */ +#define __tbi(nr, arg, arg1...) \ +({ \ + register unsigned long __r16 __asm__("$16") = (nr); \ + register unsigned long __r17 __asm__("$17"); arg; \ + __asm__ __volatile__( \ + "sys_call %3 #__tbi" \ + : "=r" (__r16), "=r" (__r17) \ + : "0" (__r16), "i" (HMC_tbi), ##arg1 \ + : "$0", "$1", "$22", "$23", "$24", "$25"); \ +}) + +#define tbi(x, y) __tbi(x, __r17 = (y), "1" (__r17)) +#define tbisi(x) __tbi(1, __r17 = (x), "1" (__r17)) +#define tbisd(x) __tbi(2, __r17 = (x), "1" (__r17)) +#define tbis(x) __tbi(3, __r17 = (x), "1" (__r17)) +#define tbiap() __tbi(-1, /* no second argument */) +#define tbia() __tbi(-2, /* no second argument */) + +#endif /* !__ASSEMBLY__ */ +#endif /* __KERNEL__ */ + +#endif /* _ASM_SW64_HMC_H */ diff --git a/arch/sw_64/include/asm/hugetlb.h b/arch/sw_64/include/asm/hugetlb.h new file mode 100644 index 000000000000..11565a8f86cb --- /dev/null +++ b/arch/sw_64/include/asm/hugetlb.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_HUGETLB_H +#define _ASM_SW64_HUGETLB_H + +#include <asm/page.h> +#include <asm-generic/hugetlb.h> + +static inline void hugetlb_prefault_arch_hook(struct mm_struct *mm) +{ +} + +void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr, + unsigned long end, unsigned long floor, + unsigned long ceiling); + +static inline int arch_prepare_hugepage(struct page *page) +{ + return 0; +} + +static inline void arch_release_hugepage(struct page *page) +{ +} + +#endif /* _ASM_SW64_HUGETLB_H */ diff --git a/arch/sw_64/include/asm/hw_init.h b/arch/sw_64/include/asm/hw_init.h new file mode 100644 index 000000000000..9a56590ef653 --- /dev/null +++ b/arch/sw_64/include/asm/hw_init.h @@ -0,0 +1,180 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_HW_INIT_H +#define _ASM_SW64_HW_INIT_H +#include <linux/numa.h> + +#define MMSIZE __va(0x2040) + +/* + * Descriptor for a cache + */ +struct cache_desc { + unsigned int size; /* Bytes per way */ + unsigned int sets; /* Number of lines per set */ + unsigned char ways; /* Number of ways */ + unsigned char linesz; /* Size of line in bytes */ + unsigned char flags; /* Flags describing cache properties */ +}; + +struct cpuinfo_sw64 { + unsigned long loops_per_jiffy; + unsigned long last_asn; + int need_new_asn; + int asn_lock; + unsigned long ipi_count; + unsigned long prof_multiplier; + unsigned long prof_counter; + unsigned char mcheck_expected; + unsigned char mcheck_taken; + unsigned char mcheck_extra; + struct cache_desc icache; /* Primary I-cache */ + struct cache_desc dcache; /* Primary D or combined I/D cache */ + struct cache_desc scache; /* Secondary cache */ + struct cache_desc tcache; /* Tertiary/split secondary cache */ +} __attribute__((aligned(64))); + +struct cpu_desc_t { + __u8 model; + __u8 family; + __u8 chip_var; + __u8 arch_var; + __u8 arch_rev; + __u8 pa_bits; + __u8 va_bits; + char vendor_id[16]; + char model_id[64]; + unsigned long frequency; + __u8 run_mode; +} __randomize_layout; + +#define MAX_NUMSOCKETS 8 +struct socket_desc_t { + bool is_online; /* 1 for online, 0 for offline */ + int numcores; + unsigned long socket_mem; +}; + +enum memmap_types { + memmap_reserved, + memmap_pci, + memmap_initrd, + memmap_kvm, + memmap_crashkernel, 
+ memmap_acpi, + memmap_use, + memmap_protected, +}; + +#define MAX_NUMMEMMAPS 64 +struct memmap_entry { + u64 addr; /* start of memory segment */ + u64 size; /* size of memory segment */ + enum memmap_types type; +}; + +extern struct cpuinfo_sw64 cpu_data[NR_CPUS]; +extern struct cpu_desc_t cpu_desc; +extern struct socket_desc_t socket_desc[MAX_NUMSOCKETS]; +extern int memmap_nr; +extern struct memmap_entry memmap_map[MAX_NUMMEMMAPS]; +extern cpumask_t cpu_offline; +extern bool memblock_initialized; + +int __init add_memmap_region(u64 addr, u64 size, enum memmap_types type); +void __init process_memmap(void); + +static inline unsigned long get_cpu_freq(void) +{ + return cpu_desc.frequency; +} + +static inline bool icache_is_vivt_no_ictag(void) +{ + /* + * Icache of C3B is vivt with ICtag. C4 will be vipt. + */ + return (cpu_desc.arch_var == 0x3 && cpu_desc.arch_rev == 0x1); +} + +enum RUNMODE { + HOST_MODE = 0, + GUEST_MODE = 1, + EMUL_MODE = 2, +}; + +static inline bool is_in_host(void) +{ + return !cpu_desc.run_mode; +} + +static inline bool is_in_guest(void) +{ + return cpu_desc.run_mode == GUEST_MODE; +} + +static inline bool is_guest_or_emul(void) +{ + return !!cpu_desc.run_mode; +} + +#define CPU_SW3231 0x31 +#define CPU_SW831 0x32 + +#define GET_TABLE_ENTRY 1 +#define GET_VENDOR_ID 2 +#define GET_MODEL 3 +#define GET_CPU_FREQ 4 +#define GET_CACHE_INFO 5 + +#define TABLE_ENTRY_MAX 32 +#define VENDOR_ID_MAX 2 +#define MODEL_MAX 8 +#define CACHE_INFO_MAX 4 + +#define L1_ICACHE 0 +#define L1_DCACHE 1 +#define L2_CACHE 2 +#define L3_CACHE 3 + +#define CPUID_ARCH_REV_MASK 0xf +#define CPUID_ARCH_REV(val) ((val) & CPUID_ARCH_REV_MASK) +#define CPUID_ARCH_VAR_SHIFT 4 +#define CPUID_ARCH_VAR_MASK (0xf << CPUID_ARCH_VAR_SHIFT) +#define CPUID_ARCH_VAR(val) \ + (((val) & CPUID_ARCH_VAR_MASK) >> CPUID_ARCH_VAR_SHIFT) +#define CPUID_CHIP_VAR_SHIFT 8 +#define CPUID_CHIP_VAR_MASK (0xf << CPUID_CHIP_VAR_SHIFT) +#define CPUID_CHIP_VAR(val) \ + (((val) & CPUID_CHIP_VAR_MASK) >> CPUID_CHIP_VAR_SHIFT) +#define CPUID_FAMILY_SHIFT 12 +#define CPUID_FAMILY_MASK (0xf << CPUID_FAMILY_SHIFT) +#define CPUID_FAMILY(val) \ + (((val) & CPUID_FAMILY_MASK) >> CPUID_FAMILY_SHIFT) +#define CPUID_MODEL_SHIFT 24 +#define CPUID_MODEL_MASK (0xff << CPUID_MODEL_SHIFT) +#define CPUID_MODEL(val) \ + (((val) & CPUID_MODEL_MASK) >> CPUID_MODEL_SHIFT) +#define CPUID_PA_BITS_SHIFT 32 +#define CPUID_PA_BITS_MASK (0x7fUL << CPUID_PA_BITS_SHIFT) +#define CPUID_PA_BITS(val) \ + (((val) & CPUID_PA_BITS_MASK) >> CPUID_PA_BITS_SHIFT) +#define CPUID_VA_BITS_SHIFT 39 +#define CPUID_VA_BITS_MASK (0x7fUL << CPUID_VA_BITS_SHIFT) +#define CPUID_VA_BITS(val) \ + (((val) & CPUID_VA_BITS_MASK) >> CPUID_VA_BITS_SHIFT) + + +#define CACHE_SIZE_SHIFT 0 +#define CACHE_SIZE_MASK (0xffffffffUL << CACHE_SIZE_SHIFT) +#define CACHE_SIZE(val) \ + (((val) & CACHE_SIZE_MASK) >> CACHE_SIZE_SHIFT) +#define CACHE_LINE_BITS_SHIFT 32 +#define CACHE_LINE_BITS_MASK (0xfUL << CACHE_LINE_BITS_SHIFT) +#define CACHE_LINE_BITS(val) \ + (((val) & CACHE_LINE_BITS_MASK) >> CACHE_LINE_BITS_SHIFT) +#define CACHE_INDEX_BITS_SHIFT 36 +#define CACHE_INDEX_BITS_MASK (0x3fUL << CACHE_INDEX_BITS_SHIFT) +#define CACHE_INDEX_BITS(val) \ + (((val) & CACHE_INDEX_BITS_MASK) >> CACHE_INDEX_BITS_SHIFT) + +#endif /* HW_INIT_H */ diff --git a/arch/sw_64/include/asm/hw_irq.h b/arch/sw_64/include/asm/hw_irq.h new file mode 100644 index 000000000000..f6fd1d802abd --- /dev/null +++ b/arch/sw_64/include/asm/hw_irq.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef 
_ASM_SW64_HW_IRQ_H +#define _ASM_SW64_HW_IRQ_H + +#include<asm/msi.h> + +extern volatile unsigned long irq_err_count; +DECLARE_PER_CPU(unsigned long, irq_pmi_count); + +#define ACTUAL_NR_IRQS NR_IRQS + +#ifdef CONFIG_PCI_MSI +typedef unsigned int vector_irq_t[PERCPU_MSI_IRQS]; +DECLARE_PER_CPU(vector_irq_t, vector_irq); +#endif +#endif diff --git a/arch/sw_64/include/asm/insn.h b/arch/sw_64/include/asm/insn.h new file mode 100644 index 000000000000..54a9a2026784 --- /dev/null +++ b/arch/sw_64/include/asm/insn.h @@ -0,0 +1,96 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2019, serveros, linyue + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ +#ifndef _ASM_SW64_INSN_H +#define _ASM_SW64_INSN_H +#include <linux/types.h> + +/* Register numbers */ +enum { + R26 = 26, + R27, + R28, + R31 = 31, +}; + +#define BR_MAX_DISP 0xfffff +/* SW64 instructions are always 32 bits. */ +#define SW64_INSN_SIZE 4 + +#define ___SW64_RA(a) (((a) & 0x1f) << 21) +#define ___SW64_RB(b) (((b) & 0x1f) << 16) +#define ___SW64_SIMP_RC(c) (((c) & 0x1f)) +#define ___SW64_ST_DISP(disp) (((disp) & 0xffff)) +#define ___SW64_SYSCALL_FUNC(func) ((func) & 0xff) +#define ___SW64_BR_DISP(disp) (((disp) & 0x1fffff)) + + +#define SW64_INSN_BIS 0x40000740 +#define SW64_INSN_CALL 0x04000000 +#define SW64_INSN_SYS_CALL 0x02000000 +#define SW64_INSN_BR 0x10000000 + +#define SW64_BIS(a, b, c) (SW64_INSN_BIS | ___SW64_RA(a) | ___SW64_RB(b) | ___SW64_SIMP_RC(c)) +#define SW64_CALL(a, b, disp) (SW64_INSN_CALL | ___SW64_RA(a) | ___SW64_RB(b) | ___SW64_ST_DISP(disp)) +#define SW64_SYS_CALL(func) (SW64_INSN_SYS_CALL | ___SW64_SYSCALL_FUNC(func)) +#define SW64_BR(a, disp) (SW64_INSN_BR | ___SW64_RA(a) | ___SW64_BR_DISP(disp)) + +extern int sw64_insn_read(void *addr, u32 *insnp); +extern int sw64_insn_write(void *addr, u32 insn); +extern int sw64_insn_double_write(void *addr, u64 insn); +extern unsigned int sw64_insn_nop(void); +extern unsigned int sw64_insn_call(unsigned int ra, unsigned int rb); +extern unsigned int sw64_insn_sys_call(unsigned int num); +extern unsigned int sw64_insn_br(unsigned int ra, unsigned long pc, unsigned long new_pc); + +#define SW64_OPCODE_RA(opcode) ((opcode >> 21) & 0x1f) + +#define SW64_INSN(name, opcode, mask) \ +static inline bool sw64_insn_is_##name(u32 insn) \ +{ \ + return (insn & mask) == opcode; \ +} + +SW64_INSN(sys_call_b, 0x00000000, 0xfc000000); +SW64_INSN(sys_call, 0x00000001, 0xfc000000); +SW64_INSN(call, 0x04000000, 0xfc000000); +SW64_INSN(ret, 0x08000000, 0xfc000000); +SW64_INSN(jmp, 0x0c000000, 0xfc000000); +SW64_INSN(br, 0x10000000, 0xfc000000); +SW64_INSN(bsr, 0x14000000, 0xfc000000); +SW64_INSN(memb, 0x18000000, 0xfc00ffff); +SW64_INSN(imemb, 0x18000001, 0xfc00ffff); +SW64_INSN(rtc, 0x18000020, 0xfc00ffff); +SW64_INSN(halt, 0x18000080, 0xfc00ffff); +SW64_INSN(rd_f, 0x18001000, 0xfc00ffff); +SW64_INSN(beq, 0xc0000000, 0xfc000000); +SW64_INSN(bne, 0xc4000000, 0xfc000000); +SW64_INSN(blt, 0xc8000000, 0xfc000000); +SW64_INSN(ble, 0xcc000000, 
0xfc000000); +SW64_INSN(bgt, 0xd0000000, 0xfc000000); +SW64_INSN(bge, 0xd4000000, 0xfc000000); +SW64_INSN(blbc, 0xd8000000, 0xfc000000); +SW64_INSN(blbs, 0xdc000000, 0xfc000000); +SW64_INSN(fbeq, 0xe0000000, 0xfc000000); +SW64_INSN(fbne, 0xe4000000, 0xfc000000); +SW64_INSN(fblt, 0xe8000000, 0xfc000000); +SW64_INSN(fble, 0xec000000, 0xfc000000); +SW64_INSN(fbgt, 0xf0000000, 0xfc000000); +SW64_INSN(fbge, 0xf4000000, 0xfc000000); +SW64_INSN(lldw, 0x20000000, 0xfc00f000); +SW64_INSN(lldl, 0x20001000, 0xfc00f000); + +#endif /* _ASM_SW64_INSN_H */ diff --git a/arch/sw_64/include/asm/io.h b/arch/sw_64/include/asm/io.h new file mode 100644 index 000000000000..6796c64f94ae --- /dev/null +++ b/arch/sw_64/include/asm/io.h @@ -0,0 +1,291 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_IO_H +#define _ASM_SW64_IO_H + +#ifdef __KERNEL__ + +#include <linux/kernel.h> +#include <linux/mm.h> +#include <asm/compiler.h> +#include <asm/pgtable.h> + +/* The generic header contains only prototypes. Including it ensures that + * the implementation we have here matches that interface. + */ +#include <asm-generic/iomap.h> + +/* We don't use IO slowdowns on the sw64, but.. */ +#define __SLOW_DOWN_IO do { } while (0) +#define SLOW_DOWN_IO do { } while (0) + +/* + * Change virtual addresses to physical addresses and vv. + */ +static inline unsigned long virt_to_phys(void *address) +{ + return __pa(address); +} + +static inline void *phys_to_virt(unsigned long address) +{ + return __va(address); +} + +#define page_to_phys(page) page_to_pa(page) + +/* Maximum PIO space address supported? */ +#define IO_SPACE_LIMIT 0xffffffffffffffff + +/* + * Change addresses as seen by the kernel (virtual) to addresses as + * seen by a device (bus), and vice versa. + * + * Note that this only works for a limited range of kernel addresses, + * and very well may not span all memory. Consider this interface + * deprecated in favour of the DMA-mapping API. + */ + +static inline unsigned long __deprecated virt_to_bus(void *address) +{ + return virt_to_phys(address); +} +#define isa_virt_to_bus virt_to_bus + +static inline void * __deprecated bus_to_virt(unsigned long address) +{ + void *virt; + + /* This check is a sanity check but also ensures that bus address 0 + * maps to virtual address 0 which is useful to detect null pointers + * (the NCR driver is much simpler if NULL pointers are preserved). + */ + virt = phys_to_virt(address); + return (long)address <= 0 ? NULL : virt; +} +#define isa_bus_to_virt bus_to_virt + +/* + * There are different chipsets to interface the sw64 CPUs to the world. + */ + +#define IO_CONCAT(a, b) _IO_CONCAT(a, b) +#define _IO_CONCAT(a, b) a ## _ ## b + +#include <asm/sw64io.h> + +/* + * Generic IO read/write. These perform native-endian accesses. 
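+ *
+ * One detail worth calling out: in __raw_readl() below, ldw sign-extends
+ * the loaded 32-bit value, so the "zapnot %0, 0xf, %0" that follows it
+ * clears bits 63:32 and makes the returned u32 zero-extended.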
+ */ + +#define __raw_writeb __raw_writeb +static inline void __raw_writeb(u8 val, volatile void __iomem *addr) +{ + asm volatile("stb %0, 0(%1)" : : "r" (val), "r" (addr)); +} + +#define __raw_writew __raw_writew +static inline void __raw_writew(u16 val, volatile void __iomem *addr) +{ + asm volatile("sth %0, 0(%1)" : : "r" (val), "r" (addr)); +} + +#define __raw_writel __raw_writel +static inline void __raw_writel(u32 val, volatile void __iomem *addr) +{ + asm volatile("stw %0, 0(%1)" : : "r" (val), "r" (addr)); +} + +#define __raw_writeq __raw_writeq +static inline void __raw_writeq(u64 val, volatile void __iomem *addr) +{ + asm volatile("stl %0, 0(%1)" : : "r" (val), "r" (addr)); +} + +#define __raw_readb __raw_readb +static inline u8 __raw_readb(const volatile void __iomem *addr) +{ + u8 val; + + asm volatile("ldbu %0, 0(%1)" : "=r" (val) : "r" (addr)); + return val; +} + +#define __raw_readw __raw_readw +static inline u16 __raw_readw(const volatile void __iomem *addr) +{ + u16 val; + + asm volatile("ldhu %0, 0(%1)" : "=r" (val) : "r" (addr)); + return val; +} + +#define __raw_readl __raw_readl +static inline u32 __raw_readl(const volatile void __iomem *addr) +{ + u32 val; + + asm volatile("ldw %0, 0(%1)\n" + "zapnot %0, 0xf, %0\n" + : "=r" (val) : "r" (addr)); + return val; +} + +#define __raw_readq __raw_readq +static inline u64 __raw_readq(const volatile void __iomem *addr) +{ + u64 val; + + asm volatile("ldl %0, 0(%1)" : "=r" (val) : "r" (addr)); + return val; +} + +/* IO barriers */ + +#define __iormb() rmb() +#define __iowmb() wmb() +#define mmiowb() do { } while (0) + +/* + * Relaxed I/O memory access primitives. These follow the Device memory + * ordering rules but do not guarantee any ordering relative to Normal memory + * accesses. + */ +#define readb_relaxed(c) __raw_readb(c) +#define readw_relaxed(c) __raw_readw(c) +#define readl_relaxed(c) __raw_readl(c) +#define readq_relaxed(c) __raw_readq(c) + +#define writeb_relaxed(v, c) __raw_writeb((v), (c)) +#define writew_relaxed(v, c) __raw_writew((v), (c)) +#define writel_relaxed(v, c) __raw_writel((v), (c)) +#define writeq_relaxed(v, c) __raw_writeq((v), (c)) + +/* + * I/O memory access primitives. Reads are ordered relative to any + * following Normal memory access. Writes are ordered relative to any prior + * Normal memory access. + */ +#define readb(c) ({ u8 __v = readb_relaxed(c); __iormb(); __v; }) +#define readw(c) ({ u16 __v = readw_relaxed(c); __iormb(); __v; }) +#define readl(c) ({ u32 __v = readl_relaxed(c); __iormb(); __v; }) +#define readq(c) ({ u64 __v = readq_relaxed(c); __iormb(); __v; }) + +#define writeb(v, c) ({ __iowmb(); writeb_relaxed((v), (c)); }) +#define writew(v, c) ({ __iowmb(); writew_relaxed((v), (c)); }) +#define writel(v, c) ({ __iowmb(); writel_relaxed((v), (c)); }) +#define writeq(v, c) ({ __iowmb(); writeq_relaxed((v), (c)); }) +/* + * We always have external versions of these routines. + */ +extern u8 inb(unsigned long port); +extern u16 inw(unsigned long port); +extern u32 inl(unsigned long port); +extern void outb(u8 b, unsigned long port); +extern void outw(u16 b, unsigned long port); +extern void outl(u32 b, unsigned long port); + +/* + * Mapping from port numbers to __iomem space is pretty easy. 
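+ *
+ * The port number is handed straight to the platform hook, so the
+ * returned cookie is whatever sw64_platform->ioportmap() produces.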
+ */ +static inline void __iomem *ioportmap(unsigned long addr) +{ + return sw64_platform->ioportmap(addr); +} + +static inline void __iomem *__ioremap(phys_addr_t addr, size_t size, + pgprot_t prot) +{ + unsigned long tmp = addr | PAGE_OFFSET; + + return (void __iomem *)(tmp); +} + +#define ioremap(addr, size) __ioremap((addr), (size), PAGE_KERNEL) +#define ioremap_nocache(addr, size) __ioremap((addr), (size), PAGE_KERNEL) +#define ioremap_cache(addr, size) __ioremap((addr), (size), PAGE_KERNEL) +#define ioremap_uc ioremap_nocache + +static inline void __iounmap(volatile void __iomem *addr) +{ +} + +#define iounmap __iounmap + +static inline int __is_ioaddr(unsigned long addr) +{ + return addr >= (PAGE_OFFSET | IO_BASE); +} + +#define __is_ioaddr(a) __is_ioaddr((unsigned long)(a)) + +static inline int __is_mmio(const volatile void __iomem *xaddr) +{ + unsigned long addr = (unsigned long)xaddr; + + return (addr & 0x100000000UL) == 0; +} + + + +#define ioread16be(p) be16_to_cpu(ioread16(p)) +#define ioread32be(p) be32_to_cpu(ioread32(p)) +#define iowrite16be(v, p) iowrite16(cpu_to_be16(v), (p)) +#define iowrite32be(v, p) iowrite32(cpu_to_be32(v), (p)) + +#define inb_p inb +#define inw_p inw +#define inl_p inl +#define outb_p outb +#define outw_p outw +#define outl_p outl + + +/* + * String version of IO memory access ops: + */ +extern void memcpy_fromio(void *, const volatile void __iomem *, long); +extern void memcpy_toio(volatile void __iomem *, const void *, long); +extern void _memset_c_io(volatile void __iomem *, unsigned long, long); + +static inline void memset_io(volatile void __iomem *addr, u8 c, long len) +{ + _memset_c_io(addr, 0x0101010101010101UL * c, len); +} + +#define __HAVE_ARCH_MEMSETW_IO +static inline void memsetw_io(volatile void __iomem *addr, u16 c, long len) +{ + _memset_c_io(addr, 0x0001000100010001UL * c, len); +} + +/* + * String versions of in/out ops: + */ +extern void insb(unsigned long port, void *dst, unsigned long count); +extern void insw(unsigned long port, void *dst, unsigned long count); +extern void insl(unsigned long port, void *dst, unsigned long count); +extern void outsb(unsigned long port, const void *src, unsigned long count); +extern void outsw(unsigned long port, const void *src, unsigned long count); +extern void outsl(unsigned long port, const void *src, unsigned long count); + +/* + * These defines will override the defaults when doing RTC queries + */ + +#define RTC_PORT(x) (0x70 + (x)) +#define RTC_ALWAYS_BCD 0 + +/* + * Convert a physical pointer to a virtual kernel pointer for /dev/mem + * access + */ +#define xlate_dev_mem_ptr(p) __va(p) + +/* + * Convert a virtual cached pointer to an uncached pointer + */ +#define xlate_dev_kmem_ptr(p) p + +#endif /* __KERNEL__ */ + +#endif /* _ASM_SW64_IO_H */ diff --git a/arch/sw_64/include/asm/irq.h b/arch/sw_64/include/asm/irq.h new file mode 100644 index 000000000000..be98132ce340 --- /dev/null +++ b/arch/sw_64/include/asm/irq.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_IRQ_H +#define _ASM_SW64_IRQ_H + +/* + * arch/sw/include/asm/irq.h + * + * (C) 2012 OSKernel JN + */ + +#include <linux/linkage.h> + +#define NR_VECTORS_PERCPU 256 +#define NR_IRQS_LEGACY 16 +#define NR_IRQS ((NR_VECTORS_PERCPU + NR_IRQS_LEGACY) * NR_CPUS) + +static inline int irq_canonicalize(int irq) +{ + /* + * XXX is this true for all Sw? The old serial driver + * did it this way for years without any complaints, so.... + */ + return ((irq == 2) ? 
9 : irq); +} + +struct pt_regs; +extern void (*perf_irq)(unsigned long, struct pt_regs *); +extern void fixup_irqs(void); +extern void sw64_timer_interrupt(void); + +#endif /* _ASM_SW64_IRQ_H */ diff --git a/arch/sw_64/include/asm/irq_impl.h b/arch/sw_64/include/asm/irq_impl.h new file mode 100644 index 000000000000..713267077142 --- /dev/null +++ b/arch/sw_64/include/asm/irq_impl.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This file contains declarations and inline functions for interfacing + * with the IRQ handling routines in irq.c. + */ + +#ifndef _ASM_SW64_IRQ_IMPL_H +#define _ASM_SW64_IRQ_IMPL_H + +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/profile.h> + +#define SW64_PCIE0_INT_BASE 17 +#define SW64_PCIE0_MSI_BASE 21 + +#define SW64_PCIE1_INT_BASE 277 +#define SW64_PCIE1_MSI_BASE 281 + +#define RTC_IRQ 8 +#define SWI2C_IRQ 14 + +enum sw64_irq_type { + INT_IPI = 1, + INT_PC0 = 2, + INT_PC1 = 3, + INT_INTx = 5, + INT_MSI = 6, + INT_MT = 7, + INT_RTC = 9, + INT_FAULT = 10, + INT_VT_SERIAL = 12, + INT_DEV = 17, + INT_NMI = 18, + INT_LEGACY = 31, +}; + +extern struct irqaction timer_irqaction; +extern void init_rtc_irq(irq_handler_t handler); +extern void handle_irq(int irq); +extern void handle_ipi(struct pt_regs *); +extern void __init sw64_init_irq(void); +extern irqreturn_t timer_interrupt(int irq, void *dev); +extern void handle_chip_irq(unsigned long type, unsigned long vector, + unsigned long irq_arg, struct pt_regs *regs); + +#endif diff --git a/arch/sw_64/include/asm/irq_regs.h b/arch/sw_64/include/asm/irq_regs.h new file mode 100644 index 000000000000..bba48f36a40f --- /dev/null +++ b/arch/sw_64/include/asm/irq_regs.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_IRQ_REGS_H +#define _ASM_SW64_IRQ_REGS_H + +#include <asm-generic/irq_regs.h> + +#endif diff --git a/arch/sw_64/include/asm/irqflags.h b/arch/sw_64/include/asm/irqflags.h new file mode 100644 index 000000000000..6101b6ad2e99 --- /dev/null +++ b/arch/sw_64/include/asm/irqflags.h @@ -0,0 +1,63 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_IRQFLAGS_H +#define _ASM_SW64_IRQFLAGS_H + +#include <asm/hmcall.h> + +#define IPL_MIN 0 +#define IPL_SW0 1 +#define IPL_SW1 2 +#define IPL_DEV0 3 +#define IPL_DEV1 4 +#define IPL_TIMER 5 +#define IPL_PERF 6 +#define IPL_POWERFAIL 6 +#define IPL_MCHECK 7 +#define IPL_MAX 7 + +#define getipl() (rdps() & 7) +#define setipl(ipl) ((void) swpipl(ipl)) + +static inline unsigned long arch_local_save_flags(void) +{ + return rdps(); +} + +static inline void arch_local_irq_disable(void) +{ + setipl(IPL_MAX); + barrier(); +} + +static inline unsigned long arch_local_irq_save(void) +{ + unsigned long flags = swpipl(IPL_MAX); + + barrier(); + return flags; +} + +static inline void arch_local_irq_enable(void) +{ + barrier(); + setipl(IPL_MIN); +} + +static inline void arch_local_irq_restore(unsigned long flags) +{ + barrier(); + setipl(flags); + barrier(); +} + +static inline bool arch_irqs_disabled_flags(unsigned long flags) +{ + return flags > IPL_MIN; +} + +static inline bool arch_irqs_disabled(void) +{ + return arch_irqs_disabled_flags(getipl()); +} + +#endif /* _ASM_SW64_IRQFLAGS_H */ diff --git a/arch/sw_64/include/asm/jump_label.h b/arch/sw_64/include/asm/jump_label.h new file mode 100644 index 000000000000..78d3fb6246f0 --- /dev/null +++ b/arch/sw_64/include/asm/jump_label.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ASM_SW64_JUMP_LABEL_H +#define __ASM_SW64_JUMP_LABEL_H + 
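+/*
+ * Usage sketch, assuming a normal static-key consumer (illustrative; the
+ * key name is hypothetical):
+ *
+ *	DEFINE_STATIC_KEY_FALSE(sw64_example_key);
+ *
+ *	if (static_branch_unlikely(&sw64_example_key))
+ *		slow_path();
+ *
+ * arch_static_branch() below emits the nop for the disabled case and
+ * arch_static_branch_jump() the unconditional br for the enabled case;
+ * enabling the key patches one into the other using the jump_entry
+ * records collected in the __jump_table section.
+ */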
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+#include <asm/insn.h>
+
+#define JUMP_LABEL_NOP_SIZE	SW64_INSN_SIZE
+
+static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
+{
+	asm_volatile_goto("1:	nop\n\t"
+		".pushsection __jump_table, \"aw\"\n\t"
+		".align 3\n\t"
+		".quad 1b, %l[l_yes], %0\n\t"
+		".popsection\n\t"
+		: : "i"(&((char *)key)[branch]) : : l_yes);
+
+	return false;
+l_yes:
+	return true;
+}
+
+static __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch)
+{
+	asm_volatile_goto("1:	br %l[l_yes]\n\t"
+		".pushsection __jump_table, \"aw\"\n\t"
+		".align 3\n\t"
+		".quad 1b, %l[l_yes], %0\n\t"
+		".popsection\n\t"
+		: : "i"(&((char *)key)[branch]) : : l_yes);
+
+	return false;
+l_yes:
+	return true;
+}
+
+typedef u64 jump_label_t;
+
+struct jump_entry {
+	jump_label_t code;
+	jump_label_t target;
+	jump_label_t key;
+};
+
+#endif /* __ASSEMBLY__ */
+#endif /* __ASM_SW64_JUMP_LABEL_H */
diff --git a/arch/sw_64/include/asm/kdebug.h b/arch/sw_64/include/asm/kdebug.h
new file mode 100644
index 000000000000..73793057c3e8
--- /dev/null
+++ b/arch/sw_64/include/asm/kdebug.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_KDEBUG_H
+#define _ASM_SW64_KDEBUG_H
+
+#include <linux/notifier.h>
+
+enum die_val {
+	DIE_OOPS = 1,
+	DIE_BREAK,
+	DIE_SSTEPBP,
+	DIE_UPROBE,
+	DIE_UPROBE_XOL,
+};
+
+#endif /* _ASM_SW64_KDEBUG_H */
diff --git a/arch/sw_64/include/asm/kexec.h b/arch/sw_64/include/asm/kexec.h
new file mode 100644
index 000000000000..a99aba9638e6
--- /dev/null
+++ b/arch/sw_64/include/asm/kexec.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_KEXEC_H
+#define _ASM_SW64_KEXEC_H
+
+#ifdef CONFIG_KEXEC
+
+/* Maximum physical address we can use pages from */
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+/* Maximum address we can reach in physical address mode */
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+/* Maximum address we can use for the control code buffer */
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+#define KEXEC_CONTROL_PAGE_SIZE	8192
+
+#define KEXEC_ARCH KEXEC_ARCH_SW64
+
+#define KEXEC_SW64_ATAGS_OFFSET 0x1000
+#define KEXEC_SW64_ZIMAGE_OFFSET 0x8000
+
+#ifndef __ASSEMBLY__
+
+/**
+ * crash_setup_regs() - save registers for the panic kernel
+ * @newregs: registers are saved here
+ * @oldregs: registers to be saved (may be %NULL)
+ *
+ * Function copies machine registers from @oldregs to @newregs. If @oldregs is
+ * %NULL then current registers are stored there.
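+ * Note that the inline-asm path below captures only the registers that
+ * struct pt_regs has room for: r0-r8, r19-r28 and the pc.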
+ */ +static inline void crash_setup_regs(struct pt_regs *newregs, + struct pt_regs *oldregs) +{ + if (oldregs) { + memcpy(newregs, oldregs, sizeof(*newregs)); + } else { + __asm__ __volatile__ ("stl $0, %0" : "=m" (newregs->r0)); + __asm__ __volatile__ ("stl $1, %0" : "=m" (newregs->r1)); + __asm__ __volatile__ ("stl $2, %0" : "=m" (newregs->r2)); + __asm__ __volatile__ ("stl $3, %0" : "=m" (newregs->r3)); + __asm__ __volatile__ ("stl $4, %0" : "=m" (newregs->r4)); + __asm__ __volatile__ ("stl $5, %0" : "=m" (newregs->r5)); + __asm__ __volatile__ ("stl $6, %0" : "=m" (newregs->r6)); + __asm__ __volatile__ ("stl $7, %0" : "=m" (newregs->r7)); + __asm__ __volatile__ ("stl $8, %0" : "=m" (newregs->r8)); + __asm__ __volatile__ ("stl $19, %0" : "=m" (newregs->r19)); + __asm__ __volatile__ ("stl $20, %0" : "=m" (newregs->r20)); + __asm__ __volatile__ ("stl $21, %0" : "=m" (newregs->r21)); + __asm__ __volatile__ ("stl $22, %0" : "=m" (newregs->r22)); + __asm__ __volatile__ ("stl $23, %0" : "=m" (newregs->r23)); + __asm__ __volatile__ ("stl $24, %0" : "=m" (newregs->r24)); + __asm__ __volatile__ ("stl $25, %0" : "=m" (newregs->r25)); + __asm__ __volatile__ ("stl $26, %0" : "=m" (newregs->r26)); + __asm__ __volatile__ ("stl $27, %0" : "=m" (newregs->r27)); + __asm__ __volatile__ ("stl $28, %0" : "=m" (newregs->r28)); + newregs->pc = (unsigned long)current_text_addr(); + } +} + +/* Function pointer to optional machine-specific reinitialization */ +extern void (*kexec_reinit)(void); + +#endif /* __ASSEMBLY__ */ + +struct kimage; +extern unsigned long kexec_args[4]; + +#endif /* CONFIG_KEXEC */ + +#endif /* _ASM_SW64_KEXEC_H */ diff --git a/arch/sw_64/include/asm/kgdb.h b/arch/sw_64/include/asm/kgdb.h new file mode 100644 index 000000000000..1d807362e867 --- /dev/null +++ b/arch/sw_64/include/asm/kgdb.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * sw64 KGDB support + * + * Based on arch/arm64/include/kgdb.h + * + * Copyright (C) Xia Bin + * Author: Xia Bin + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ +#ifndef _ASM_SW64_KGDB_H +#define _ASM_SW64_KGDB_H + +#include <asm/ptrace.h> +#include <linux/sched.h> + +#ifndef __ASSEMBLY__ + + +#define GDB_ADJUSTS_BREAK_OFFSET +#define BREAK_INSTR_SIZE 4 +#define CACHE_FLUSH_IS_SAFE 0 + +static inline void arch_kgdb_breakpoint(void) +{ + asm __volatile__ ("sys_call/b 0x80"); +} + +void sw64_task_to_gdb_regs(struct task_struct *task, unsigned long *regs); + +extern void kgdb_handle_bus_error(void); +extern int kgdb_fault_expected; +extern unsigned long get_reg(struct task_struct *task, unsigned long regno); + +#endif /* !__ASSEMBLY__ */ + +/* + * general purpose registers size in bytes. + */ +#define DBG_MAX_REG_NUM (67) + +/* + * Size of I/O buffer for gdb packet. + * considering to hold all register contents, size is set + */ + +#define BUFMAX 4096 + +/* + * Number of bytes required for gdb_regs buffer. 
+ * _GP_REGS: 8 bytes, _FP_REGS: 16 bytes and _EXTRA_REGS: 4 bytes each + * GDB fails to connect for size beyond this with error + * "'g' packet reply is too long" + */ +#define NUMREGBYTES (DBG_MAX_REG_NUM * 8) + +#endif /* _ASM_SW64_KGDB_H */ diff --git a/arch/sw_64/include/asm/kmap_types.h b/arch/sw_64/include/asm/kmap_types.h new file mode 100644 index 000000000000..8e86b08dee94 --- /dev/null +++ b/arch/sw_64/include/asm/kmap_types.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KMAP_TYPES_H +#define _ASM_SW64_KMAP_TYPES_H + +/* Dummy header just to define km_type. */ + +#ifdef CONFIG_DEBUG_HIGHMEM +#define __WITH_KM_FENCE +#endif + +#include <asm-generic/kmap_types.h> + +#undef __WITH_KM_FENCE + +#endif diff --git a/arch/sw_64/include/asm/kprobes.h b/arch/sw_64/include/asm/kprobes.h new file mode 100644 index 000000000000..c19b961a19da --- /dev/null +++ b/arch/sw_64/include/asm/kprobes.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Kernel Probes (KProbes) + * Based on arch/mips/include/asm/kprobes.h + */ + +#ifndef _ASM_SW64_KPROBES_H +#define _ASM_SW64_KPROBES_H + +#include <asm-generic/kprobes.h> + +#define BREAK_KPROBE 0x40ffffff +#define BREAK_KPROBE_SS 0x40fffeff + +#ifdef CONFIG_KPROBES +#include <linux/ptrace.h> +#include <linux/types.h> + +#include <asm/cacheflush.h> +#include <asm/kdebug.h> + +#define __ARCH_WANT_KPROBES_INSN_SLOT + +struct kprobe; +struct pt_regs; + +typedef u32 kprobe_opcode_t; + +#define MAX_INSN_SIZE 2 + +#define flush_insn_slot(p) \ +do { \ + if (p->addr) \ + flush_icache_range((unsigned long)p->addr, \ + (unsigned long)p->addr + \ + (MAX_INSN_SIZE * sizeof(kprobe_opcode_t))); \ +} while (0) + + +#define kretprobe_blacklist_size 0 + +void arch_remove_kprobe(struct kprobe *p); + +/* Architecture specific copy of original instruction*/ +struct arch_specific_insn { + /* copy of the original instruction */ + kprobe_opcode_t *insn; +}; + +struct prev_kprobe { + struct kprobe *kp; + unsigned long status; +}; + +#define SKIP_DELAYSLOT 0x0001 + +/* per-cpu kprobe control block */ +struct kprobe_ctlblk { + unsigned long kprobe_status; + /* Per-thread fields, used while emulating branches */ + unsigned long flags; + unsigned long target_pc; + struct prev_kprobe prev_kprobe; +}; +extern int kprobe_handler(struct pt_regs *regs); +extern int post_kprobe_handler(struct pt_regs *regs); +extern int kprobe_fault_handler(struct pt_regs *regs, unsigned long mmcsr); + + +#endif /* CONFIG_KPROBES */ +#endif /* _ASM_SW64_KPROBES_H */ diff --git a/arch/sw_64/include/asm/kvm_asm.h b/arch/sw_64/include/asm/kvm_asm.h new file mode 100644 index 000000000000..4b851682188c --- /dev/null +++ b/arch/sw_64/include/asm/kvm_asm.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_ASM_H +#define _ASM_SW64_KVM_ASM_H + +#define SW64_KVM_EXIT_HOST_INTR 0 +#define SW64_KVM_EXIT_IO 1 +#define SW64_KVM_EXIT_HALT 10 +#define SW64_KVM_EXIT_SHUTDOWN 12 +#define SW64_KVM_EXIT_TIMER 13 +#define SW64_KVM_EXIT_IPI 14 +#define SW64_KVM_EXIT_RESTART 17 +#define SW64_KVM_EXIT_FATAL_ERROR 22 + +#endif /* _ASM_SW64_KVM_ASM_H */ diff --git a/arch/sw_64/include/asm/kvm_cma.h b/arch/sw_64/include/asm/kvm_cma.h new file mode 100644 index 000000000000..192bca436380 --- /dev/null +++ b/arch/sw_64/include/asm/kvm_cma.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_CMA_H__ +#define _ASM_SW64_KVM_CMA_H__ + +#include <linux/cma.h> + +extern int __init kvm_cma_declare_contiguous(phys_addr_t base, + 
phys_addr_t size, phys_addr_t limit, + phys_addr_t alignment, unsigned int order_per_bit, + const char *name, struct cma **res_cma); +#endif diff --git a/arch/sw_64/include/asm/kvm_emulate.h b/arch/sw_64/include/asm/kvm_emulate.h new file mode 100644 index 000000000000..d842008f189a --- /dev/null +++ b/arch/sw_64/include/asm/kvm_emulate.h @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_EMULATE_H +#define _ASM_SW64_KVM_EMULATE_H + +#include <linux/kvm_host.h> +#include <asm/kvm_asm.h> + +#define R(x) ((size_t) &((struct kvm_regs *)0)->x) + +static int reg_offsets[32] = { + R(r0), R(r1), R(r2), R(r3), R(r4), R(r5), R(r6), R(r7), R(r8), + R(r9), R(r10), R(r11), R(r12), R(r13), R(r14), R(r15), + R(r16), R(r17), R(r18), + R(r19), R(r20), R(r21), R(r22), R(r23), R(r24), R(r25), R(r26), + R(r27), R(r28), R(gp), + 0, 0, +}; + + +static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num, + unsigned long val) +{ + void *regs_ptr = (void *)&vcpu->arch.regs; + + regs_ptr += reg_offsets[reg_num]; + *(unsigned long *)regs_ptr = val; +} + +static inline unsigned long vcpu_get_reg(struct kvm_vcpu *vcpu, u8 reg_num) +{ + void *regs_ptr = (void *)&vcpu->arch.regs; + + if (reg_num == 31) + return 0; + regs_ptr += reg_offsets[reg_num]; + return *(unsigned long *)regs_ptr; +} + +void sw64_decode(struct kvm_vcpu *vcpu, unsigned int insn, + struct kvm_run *run); + +unsigned int interrupt_pending(struct kvm_vcpu *vcpu, bool *more); +void clear_vcpu_irq(struct kvm_vcpu *vcpu); +void inject_vcpu_irq(struct kvm_vcpu *vcpu, unsigned int irq); +void try_deliver_interrupt(struct kvm_vcpu *vcpu, unsigned int irq, bool more); +#endif diff --git a/arch/sw_64/include/asm/kvm_host.h b/arch/sw_64/include/asm/kvm_host.h new file mode 100644 index 000000000000..913a2e9789c1 --- /dev/null +++ b/arch/sw_64/include/asm/kvm_host.h @@ -0,0 +1,119 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_HOST_H +#define _ASM_SW64_KVM_HOST_H + +#include <linux/types.h> +#include <linux/hardirq.h> +#include <linux/list.h> +#include <linux/mutex.h> +#include <linux/spinlock.h> +#include <linux/signal.h> +#include <linux/sched.h> +#include <linux/bug.h> +#include <linux/mm.h> +#include <linux/mmu_notifier.h> +#include <linux/preempt.h> +#include <linux/msi.h> +#include <linux/slab.h> +#include <linux/rcupdate.h> +#include <linux/ratelimit.h> +#include <linux/err.h> +#include <linux/bitmap.h> +#include <linux/compiler.h> +#include <asm/signal.h> +#include <asm/vcpu.h> + +#include <generated/autoconf.h> +#include <asm/ptrace.h> + +#include <asm/kvm_mmio.h> + +#define KVM_MAX_VCPUS 64 +#define KVM_USER_MEM_SLOTS 512 + +#define KVM_HALT_POLL_NS_DEFAULT 0 +#define KVM_IRQCHIP_NUM_PINS 256 +/* KVM Hugepage definitions for sw64 */ +#define KVM_NR_PAGE_SIZES 3 +#define KVM_HPAGE_GFN_SHIFT(x) (((x) - 1) * 9) +#define KVM_HPAGE_SHIFT(x) (PAGE_SHIFT + KVM_HPAGE_GFN_SHIFT(x)) +#define KVM_HPAGE_SIZE(x) (1UL << KVM_HPAGE_SHIFT(x)) +#define KVM_HPAGE_MASK(x) (~(KVM_HPAGE_SIZE(x) - 1)) +#define KVM_PAGES_PER_HPAGE(x) (KVM_HPAGE_SIZE(x) / PAGE_SIZE) + +struct kvm_arch_memory_slot { + +}; + +struct kvm_arch { + struct swvm_mem mem; +}; + + +struct kvm_vcpu_arch { + struct kvm_regs regs __attribute__((__aligned__(32))); + struct vcpucb vcb; + struct task_struct *tsk; + unsigned int pcpu_id; /* current running pcpu id */ + + /* Virtual clock device */ + struct hrtimer hrt; + unsigned long timer_next_event; + int first_run; + int halted; + int stopped; + int restart; + + /* Pending virtual interrupts */ + 
DECLARE_BITMAP(irqs_pending, SWVM_IRQS); + unsigned long vpnc[NR_CPUS]; + + /* WAIT executed */ + int wait; + + /* vcpu power-off state */ + bool power_off; + + /* Don't run the guest (internal implementation need) */ + bool pause; + + struct kvm_decode mmio_decode; +}; + +struct vmem_info { + unsigned long start; + size_t size; + atomic_t refcnt; +}; + +struct kvm_vm_stat { + u32 remote_tlb_flush; +}; + +struct kvm_vcpu_stat { + u64 halt_successful_poll; + u64 halt_attempted_poll; + u64 halt_poll_success_ns; + u64 halt_poll_fail_ns; + u64 halt_wakeup; + u64 halt_poll_invalid; +}; + +int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, + int exception_index, struct hcall_args *hargs); +void vcpu_send_ipi(struct kvm_vcpu *vcpu, int target_vcpuid); +static inline void kvm_arch_hardware_disable(void) {} +static inline void kvm_arch_sync_events(struct kvm *kvm) {} +static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {} +static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arch_free_memslot(struct kvm *kvm, + struct kvm_memory_slot *slot) {} +static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} +static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} +static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm, + struct kvm_memory_slot *slot) {} +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} +static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {} + +#endif /* _ASM_SW64_KVM_HOST_H */ diff --git a/arch/sw_64/include/asm/kvm_mmio.h b/arch/sw_64/include/asm/kvm_mmio.h new file mode 100644 index 000000000000..9ba31c91902f --- /dev/null +++ b/arch/sw_64/include/asm/kvm_mmio.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_MMIO_H +#define _ASM_SW64_KVM_MMIO_H + +#include <linux/kvm_host.h> +#include <asm/kvm_asm.h> + +struct kvm_decode { + unsigned long rt; + bool sign_extend; +}; + +int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run); +int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run, + struct hcall_args *hargs); + +#endif /* _ASM_SW64_KVM_MMIO_H */ diff --git a/arch/sw_64/include/asm/kvm_para.h b/arch/sw_64/include/asm/kvm_para.h new file mode 100644 index 000000000000..ba78c5371570 --- /dev/null +++ b/arch/sw_64/include/asm/kvm_para.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_PARA_H +#define _ASM_SW64_KVM_PARA_H + +#include <uapi/asm/kvm_para.h> + +#define HMC_hcall 0x32 + +static inline unsigned long kvm_hypercall3(unsigned long num, + unsigned long arg0, + unsigned long arg1, + unsigned long arg2) +{ + register unsigned long __r0 __asm__("$0"); + register unsigned long __r16 __asm__("$16") = num; + register unsigned long __r17 __asm__("$17") = arg0; + register unsigned long __r18 __asm__("$18") = arg1; + register unsigned long __r19 __asm__("$19") = arg2; + __asm__ __volatile__( + "sys_call %5" + : "=r"(__r16), "=r"(__r17), "=r"(__r18), "=r"(__r19), "=r"(__r0) + : "i"(HMC_hcall), "0"(__r16), "1"(__r17), "2"(__r18), "3"(__r19) + : "$1", "$22", "$23", "$24", "$25"); + return __r0; +} +#endif diff --git a/arch/sw_64/include/asm/kvm_timer.h b/arch/sw_64/include/asm/kvm_timer.h new file mode 100644 index 000000000000..be50bba9c4c6 --- /dev/null +++ b/arch/sw_64/include/asm/kvm_timer.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_KVM_TIMER_H +#define _ASM_SW64_KVM_TIMER_H 
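+/*
+ * Flow sketch, inferred from the declarations below (an assumption, not a
+ * statement of this file's implementation): set_timer() arms the vcpu's
+ * hrtimer for a guest-requested delta, clockdev_fn() is the hrtimer
+ * callback that runs when it expires, and set_interrupt() posts the
+ * resulting timer irq to the vcpu.
+ */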
+ +void set_timer(struct kvm_vcpu *vcpu, unsigned long delta); +void set_interrupt(struct kvm_vcpu *vcpu, unsigned int irq); +enum hrtimer_restart clockdev_fn(struct hrtimer *timer); + +#endif diff --git a/arch/sw_64/include/asm/linkage.h b/arch/sw_64/include/asm/linkage.h new file mode 100644 index 000000000000..96c83663d9e8 --- /dev/null +++ b/arch/sw_64/include/asm/linkage.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_LINKAGE_H +#define _ASM_SW64_LINKAGE_H + +#define cond_syscall(x) asm(".weak\t" #x "\n" #x " = sys_ni_syscall") +#define SYSCALL_ALIAS(alias, name) \ + asm(#alias " = " #name "\n\t.globl " #alias) + +#endif diff --git a/arch/sw_64/include/asm/local.h b/arch/sw_64/include/asm/local.h new file mode 100644 index 000000000000..9144600f641d --- /dev/null +++ b/arch/sw_64/include/asm/local.h @@ -0,0 +1,125 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_LOCAL_H +#define _ASM_SW64_LOCAL_H + +#include <linux/percpu.h> +#include <linux/atomic.h> + +typedef struct { + atomic_long_t a; +} local_t; + +#define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) } +#define local_read(l) atomic_long_read(&(l)->a) +#define local_set(l, i) atomic_long_set(&(l)->a, (i)) +#define local_inc(l) atomic_long_inc(&(l)->a) +#define local_dec(l) atomic_long_dec(&(l)->a) +#define local_add(i, l) atomic_long_add((i), (&(l)->a)) +#define local_sub(i, l) atomic_long_sub((i), (&(l)->a)) + +static inline long local_add_return(long i, local_t *l) +{ + long temp1, temp2, result, addr; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %4, %2\n" + "1: lldl %0, 0(%4)\n" + " ldi %1, 1\n" + " wr_f %1\n" + " addl %0, %5, %3\n" + " addl %0, %5, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %0, 0(%4)\n" + " rd_f %0\n" + " beq %0, 2f\n" + ".subsection 2\n" + "2: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (l->a.counter), + "=&r" (result), "=&r" (addr) + : "Ir" (i), "m" (l->a.counter) : "memory"); + return result; +} + +static inline long local_sub_return(long i, local_t *l) +{ + long temp1, temp2, result, addr; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %4, %2\n" + "1: lldl %0, 0(%4)\n" + " ldi %1, 1\n" + " wr_f %1\n" + " subl %0, %5, %3\n" + " subl %0, %5, %0\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %0, 0(%4)\n" + " rd_f %0\n" + " beq %0, 2f\n" + ".subsection 2\n" + "2: br 1b\n" + ".previous" + : "=&r" (temp1), "=&r" (temp2), "=m" (l->a.counter), + "=&r" (result), "=&r" (addr) + : "Ir" (i), "m" (l->a.counter) : "memory"); + return result; +} + +#define local_cmpxchg(l, o, n) \ + (cmpxchg_local(&((l)->a.counter), (o), (n))) +#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n))) + +/** + * local_add_unless - add unless the number is a given value + * @l: pointer of type local_t + * @a: the amount to add to l... + * @u: ...unless l is equal to u. + * + * Atomically adds @a to @l, so long as it was not @u. + * Returns non-zero if @l was not @u, and zero otherwise. 
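+ *
+ * Usage sketch (illustrative; the counter name is hypothetical):
+ * local_inc_not_zero(), defined below in terms of this macro, fails
+ * instead of reviving a counter that has already dropped to zero:
+ *
+ *	if (!local_inc_not_zero(&ref))
+ *		return -EAGAIN;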
+ */
+#define local_add_unless(l, a, u)				\
+({								\
+	long c, old;						\
+	c = local_read(l);					\
+	for (;;) {						\
+		if (unlikely(c == (u)))				\
+			break;					\
+		old = local_cmpxchg((l), c, c + (a));		\
+		if (likely(old == c))				\
+			break;					\
+		c = old;					\
+	}							\
+	c != (u);						\
+})
+#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
+
+#define local_add_negative(a, l) (local_add_return((a), (l)) < 0)
+
+#define local_dec_return(l) local_sub_return(1, (l))
+
+#define local_inc_return(l) local_add_return(1, (l))
+
+#define local_sub_and_test(i, l) (local_sub_return((i), (l)) == 0)
+
+#define local_inc_and_test(l) (local_add_return(1, (l)) == 0)
+
+#define local_dec_and_test(l) (local_sub_return(1, (l)) == 0)
+
+/* Verify if faster than atomic ops */
+#define __local_inc(l) ((l)->a.counter++)
+#define __local_dec(l) ((l)->a.counter--)
+#define __local_add(i, l) ((l)->a.counter += (i))
+#define __local_sub(i, l) ((l)->a.counter -= (i))
+
+#endif /* _ASM_SW64_LOCAL_H */
diff --git a/arch/sw_64/include/asm/local64.h b/arch/sw_64/include/asm/local64.h
new file mode 100644
index 000000000000..4278133cd8fa
--- /dev/null
+++ b/arch/sw_64/include/asm/local64.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_LOCAL64_H
+#define _ASM_SW64_LOCAL64_H
+
+#include <asm-generic/local64.h>
+
+#endif
diff --git a/arch/sw_64/include/asm/memory.h b/arch/sw_64/include/asm/memory.h
new file mode 100644
index 000000000000..d3191165c7b5
--- /dev/null
+++ b/arch/sw_64/include/asm/memory.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_MEMORY_H
+#define _ASM_SW64_MEMORY_H
+
+#ifdef CONFIG_NUMA
+#include <linux/numa.h>
+#endif
+
+#define NODE0_START (_TEXT_START - __START_KERNEL_map)
+
+#define MAX_PHYSMEM_BITS 48
+
+struct mem_desc_t {
+	unsigned long phys_base;	/* start address of physical memory */
+	unsigned long phys_size;	/* size of physical memory */
+	phys_addr_t base;		/* start address of memory managed by kernel */
+	phys_addr_t size;		/* size of memory managed by kernel */
+};
+extern struct mem_desc_t mem_desc;
+
+struct numa_node_desc_t {
+	phys_addr_t base;
+	phys_addr_t size;
+};
+extern struct numa_node_desc_t numa_nodes_desc[];
+
+void __init callback_init(void);
+void __init mem_detect(void);
+void __init sw64_memblock_init(void);
+void __init zone_sizes_init(void);
+void __init sw64_numa_init(void);
+void __init sw64_memory_present(void);
+
+#endif /* _ASM_SW64_MEMORY_H */
diff --git a/arch/sw_64/include/asm/mmu.h b/arch/sw_64/include/asm/mmu.h
new file mode 100644
index 000000000000..548c73b318cb
--- /dev/null
+++ b/arch/sw_64/include/asm/mmu.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_MMU_H
+#define _ASM_SW64_MMU_H
+
+/* The sw64 MMU context is one "unsigned long" bitmap per CPU */
+typedef struct {
+	unsigned long asid[NR_CPUS];
+	void *vdso;
+} mm_context_t;
+#endif
diff --git a/arch/sw_64/include/asm/mmu_context.h b/arch/sw_64/include/asm/mmu_context.h
new file mode 100644
index 000000000000..6b2ab3224ec9
--- /dev/null
+++ b/arch/sw_64/include/asm/mmu_context.h
@@ -0,0 +1,218 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_MMU_CONTEXT_H
+#define _ASM_SW64_MMU_CONTEXT_H
+
+/*
+ * get a new mmu context..
+ *
+ * Copyright (C) 1996, Linus Torvalds
+ */
+#include <linux/mm_types.h>
+
+#include <asm/compiler.h>
+#include <asm/io.h>
+
+/*
+ * Force a context reload. This is needed when we change the page
+ * table pointer or when we update the ASN of the current process.
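+ * The reload itself is the HMC_swpctx call in __reload_thread() below,
+ * which takes the physical address of the new pcb in $16.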
+ */ + +static inline unsigned long +__reload_thread(struct pcb_struct *pcb) +{ + register unsigned long a0 __asm__("$16"); + register unsigned long v0 __asm__("$0"); + + a0 = virt_to_phys(pcb); + __asm__ __volatile__( + "sys_call %2 #__reload_thread" + : "=r"(v0), "=r"(a0) + : "i"(HMC_swpctx), "r"(a0) + : "$1", "$22", "$23", "$24", "$25"); + + return v0; +} + +#define load_asn_ptbr load_mm + +/* + * The maximum ASN's the processor supports. + * + * If a processor implements address space numbers (ASNs), and the old + * PTE has the Address Space Match (ASM) bit clear (ASNs in use) and + * the Valid bit set, then entries can also effectively be made coherent + * by assigning a new, unused ASN to the currently running process and + * not reusing the previous ASN before calling the appropriate HMcode + * routine to invalidate the translation buffer (TB). + * + */ + +#ifdef CONFIG_SUBARCH_C3B +#define MAX_ASN 1023 +#define WIDTH_HARDWARE_ASN 10 +#endif + +/* + * cpu_last_asn(processor): + * 63 0 + * +-------------+----------------+--------------+ + * | asn version | this processor | hardware asn | + * +-------------+----------------+--------------+ + */ + +#include <asm/hw_init.h> +#ifdef CONFIG_SMP +#define cpu_last_asn(cpuid) (cpu_data[cpuid].last_asn) +#else +extern unsigned long last_asn; +#define cpu_last_asn(cpuid) last_asn +#endif /* CONFIG_SMP */ + +#define ASN_FIRST_VERSION (1UL << WIDTH_HARDWARE_ASN) +#define HARDWARE_ASN_MASK ((1UL << WIDTH_HARDWARE_ASN) - 1) + +/* + * NOTE! The way this is set up, the high bits of the "asn_cache" (and + * the "mm->context") are the ASN _version_ code. A version of 0 is + * always considered invalid, so to invalidate another process you only + * need to do "p->mm->context = 0". + * + * If we need more ASN's than the processor has, we invalidate the old + * user TLB's (tbiap()) and start a new ASN version. That will automatically + * force a new asn for any other processes the next time they want to + * run. + */ + +static inline unsigned long +__get_new_mm_context(struct mm_struct *mm, long cpu) +{ + unsigned long asn = cpu_last_asn(cpu); + unsigned long next = asn + 1; + + if ((asn & HARDWARE_ASN_MASK) >= MAX_ASN) { + tbiap(); + next = (asn & ~HARDWARE_ASN_MASK) + ASN_FIRST_VERSION; + } + cpu_last_asn(cpu) = next; + return next; +} + +static inline void +switch_mm(struct mm_struct *prev_mm, struct mm_struct *next_mm, + struct task_struct *next) +{ + /* Check if our ASN is of an older version, and thus invalid. */ + unsigned long asn; + unsigned long mmc; + long cpu = smp_processor_id(); + +#ifdef CONFIG_SMP + cpu_data[cpu].asn_lock = 1; + barrier(); +#endif + asn = cpu_last_asn(cpu); + mmc = next_mm->context.asid[cpu]; + if ((mmc ^ asn) & ~HARDWARE_ASN_MASK) { + /* Check if mmc and cpu asn is in the same version */ + mmc = __get_new_mm_context(next_mm, cpu); + next_mm->context.asid[cpu] = mmc; + } +#ifdef CONFIG_SMP + else + cpu_data[cpu].need_new_asn = 1; +#endif + + /* + * Always update the PCB ASN. Another thread may have allocated + * a new mm->context (via flush_tlb_mm) without the ASN serial + * number wrapping. We have no way to detect when this is needed. + */ + task_thread_info(next)->pcb.asn = mmc & HARDWARE_ASN_MASK; + /* + * Always update the PCB PTBR. If next is kernel thread, it must + * update PTBR. If next is user process, it's ok to update PTBR. 
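+ * Note that pcb.ptbr holds a page frame number, hence the
+ * __pa(next_mm->pgd) >> PAGE_SHIFT below.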
+ */ + task_thread_info(next)->pcb.ptbr = (__pa(next_mm->pgd)) >> PAGE_SHIFT; + load_asn_ptbr(task_thread_info(next)->pcb.asn, task_thread_info(next)->pcb.ptbr); +} + +extern void __load_new_mm_context(struct mm_struct *); + +#ifdef CONFIG_SMP +#define check_mmu_context() \ +do { \ + int cpu = smp_processor_id(); \ + cpu_data[cpu].asn_lock = 0; \ + barrier(); \ + if (cpu_data[cpu].need_new_asn) { \ + struct mm_struct *mm = current->active_mm; \ + cpu_data[cpu].need_new_asn = 0; \ + if (!mm->context.asid[cpu]) \ + __load_new_mm_context(mm); \ + } \ +} while (0) +#else +#define check_mmu_context() do { } while (0) +#endif + +static inline void activate_mm(struct mm_struct *prev_mm, + struct mm_struct *next_mm) +{ + __load_new_mm_context(next_mm); +} + +#define deactivate_mm(tsk, mm) do { } while (0) + +static inline int init_new_context(struct task_struct *tsk, + struct mm_struct *mm) +{ + int i; + + for_each_possible_cpu(i) + mm->context.asid[i] = 0; + if (tsk != current) + task_thread_info(tsk)->pcb.ptbr + = (__pa(mm->pgd)) >> PAGE_SHIFT; + return 0; +} + +static inline void destroy_context(struct mm_struct *mm) +{ + /* Nothing to do. */ +} + +static inline void enter_lazy_tlb(struct mm_struct *mm, + struct task_struct *tsk) +{ + task_thread_info(tsk)->pcb.ptbr + = (__pa(mm->pgd)) >> PAGE_SHIFT; +} + +static inline int arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ + return 0; +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, unsigned long start, + unsigned long end) +{ +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, + bool write, bool execute, + bool foreign) +{ + /* by default, allow everything */ + return true; +} +#endif /* _ASM_SW64_MMU_CONTEXT_H */ diff --git a/arch/sw_64/include/asm/mmzone.h b/arch/sw_64/include/asm/mmzone.h new file mode 100644 index 000000000000..924e33f6d326 --- /dev/null +++ b/arch/sw_64/include/asm/mmzone.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_MMZONE_H +#define _ASM_SW64_MMZONE_H + +#include <asm/smp.h> + +/* + * Following are macros that are specific to this numa platform. 
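+ *
+ * In particular, mk_pte()/pte_page() below place the pfn at bit
+ * _PTE_FLAGS_BITS rather than PAGE_SHIFT, leaving the low bits of the
+ * pte for the protection flags.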
+ */
+
+extern pg_data_t *node_data[];
+
+#ifdef CONFIG_NEED_MULTIPLE_NODES
+#define NODE_DATA(nid)	(node_data[(nid)])
+#endif
+
+#ifdef CONFIG_DISCONTIGMEM
+extern int pa_to_nid(unsigned long pa);
+extern int pfn_valid(unsigned long pfn);
+
+#define mk_pte(page, pgprot)						\
+({									\
+	pte_t pte;							\
+	unsigned long pfn;						\
+									\
+	pfn = page_to_pfn(page) << _PTE_FLAGS_BITS;			\
+	pte_val(pte) = pfn | pgprot_val(pgprot);			\
+									\
+	pte;								\
+})
+
+#define pte_page(x)							\
+({									\
+	unsigned long kvirt;						\
+	struct page *__xx;						\
+									\
+	kvirt = (unsigned long)__va(pte_val(x) >> (_PTE_FLAGS_BITS-PAGE_SHIFT));\
+	__xx = virt_to_page(kvirt);					\
+									\
+	__xx;								\
+})
+
+#define page_to_pa(page)	(page_to_pfn(page) << PAGE_SHIFT)
+#define pfn_to_nid(pfn)		pa_to_nid(((u64)(pfn) << PAGE_SHIFT))
+#endif /* CONFIG_DISCONTIGMEM */
+
+#endif /* _ASM_SW64_MMZONE_H */
diff --git a/arch/sw_64/include/asm/module.h b/arch/sw_64/include/asm/module.h
new file mode 100644
index 000000000000..55e6e333585f
--- /dev/null
+++ b/arch/sw_64/include/asm/module.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_MODULE_H
+#define _ASM_SW64_MODULE_H
+
+struct mod_arch_specific {
+	unsigned int gotsecindex;
+};
+
+#define Elf_Sym		Elf64_Sym
+#define Elf_Shdr	Elf64_Shdr
+#define Elf_Ehdr	Elf64_Ehdr
+#define Elf_Phdr	Elf64_Phdr
+#define Elf_Dyn		Elf64_Dyn
+#define Elf_Rel		Elf64_Rel
+#define Elf_Rela	Elf64_Rela
+
+#define ARCH_SHF_SMALL SHF_SW64_GPREL
+
+#ifdef MODULE
+asm(".section .got, \"aw\", @progbits; .align 3; .previous");
+#endif
+
+#endif /* _ASM_SW64_MODULE_H */
diff --git a/arch/sw_64/include/asm/msi.h b/arch/sw_64/include/asm/msi.h
new file mode 100644
index 000000000000..079fac0d128e
--- /dev/null
+++ b/arch/sw_64/include/asm/msi.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_MSI_H
+#define _ASM_SW64_MSI_H
+
+#define NR_VECTORS NR_IRQS
+#define NR_IRQ_VECTORS NR_IRQS
+
+#define AUTO_ASSIGN 0
+
+#define LAST_DEVICE_VECTOR 31
+
+#define MSI_OFFSET 0x44
+
+#define NUM_MSI_IRQS 256
+
+#define PERCPU_MSI_IRQS 256
+
+#define MSIX_MSG_ADDR (0x91abc0UL)
+
+#ifdef CONFIG_PCI_MSI
+extern int sw64_setup_vt_msi_irqs(struct pci_dev *dev, int nvec, int type);
+extern int msi_compose_msg(unsigned int irq, struct msi_msg *msg);
+extern void sw64_irq_noop(struct irq_data *d);
+extern struct irq_chip sw64_irq_chip;
+
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+#define MSI_ADDR_BASE_HI	0
+#define MSI_ADDR_BASE_LO	0x91abc0
+struct sw6_msi_chip_data {
+	unsigned int msi_config_index;
+};
+extern void arch_init_msi_domain(struct irq_domain *domain);
+enum irq_alloc_type {
+	IRQ_ALLOC_TYPE_MSI,
+	IRQ_ALLOC_TYPE_MSIX,
+	IRQ_ALLOC_TYPE_INTX,
+};
+struct irq_alloc_info {
+	struct msi_desc *desc;
+	enum irq_alloc_type type;
+	struct pci_dev *msi_dev;
+	irq_hw_number_t hwirq;
+};
+typedef struct irq_alloc_info msi_alloc_info_t;
+#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
+#endif /* CONFIG_PCI_MSI */
+#endif /* _ASM_SW64_MSI_H */
diff --git a/arch/sw_64/include/asm/numa.h b/arch/sw_64/include/asm/numa.h
new file mode 100644
index 000000000000..47071007e8ff
--- /dev/null
+++ b/arch/sw_64/include/asm/numa.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_SW64_NUMA_H
+#define _ASM_SW64_NUMA_H
+
+#include <linux/nodemask.h>
+
+#ifdef CONFIG_NUMA
+extern nodemask_t numa_nodes_parsed __initdata;
+struct numa_memblk {
+	u64 start;
+	u64 end;
+	int nid;
+};
+
+#define NR_NODE_MEMBLKS		(MAX_NUMNODES*2)
+struct numa_meminfo {
+	int nr_blks;
+	struct numa_memblk blk[NR_NODE_MEMBLKS];
+};
+extern int __init 
numa_add_memblk(int nodeid, u64 start, u64 end); +extern s16 __cpuid_to_node[CONFIG_NR_CPUS]; +static inline void numa_clear_node(int cpu) +{ +} + +static inline void set_cpuid_to_node(int cpuid, s16 node) +{ + __cpuid_to_node[cpuid] = node; +} + +#endif /* CONFIG_NUMA */ + +#endif /* _ASM_SW64_NUMA_H */ diff --git a/arch/sw_64/include/asm/page.h b/arch/sw_64/include/asm/page.h new file mode 100644 index 000000000000..6e17d5e437c5 --- /dev/null +++ b/arch/sw_64/include/asm/page.h @@ -0,0 +1,63 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PAGE_H +#define _ASM_SW64_PAGE_H + +#include <linux/const.h> +#include <asm/hmcall.h> + +/* PAGE_SHIFT determines the page size */ +#define PAGE_SHIFT 13 +#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT) +#define PAGE_MASK (~(PAGE_SIZE - 1)) + +#define HPAGE_SHIFT PMD_SHIFT +#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT) +#define HPAGE_MASK (~(HPAGE_SIZE - 1)) +#define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT) + +#define HUGE_MAX_HSTATE 2 + +#ifdef __KERNEL__ +#ifndef __ASSEMBLY__ + +extern void clear_page(void *page); +#define clear_user_page(page, vaddr, pg) clear_page(page) + +#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \ + alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr) +#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE + +extern void copy_page(void *_to, void *_from); +#define copy_user_page(to, from, vaddr, pg) copy_page(to, from) + +typedef struct page *pgtable_t; + +extern unsigned long __phys_addr(unsigned long); +#endif /* !__ASSEMBLY__ */ + +#define KERNEL_IMAGE_SIZE (512 * 1024 * 1024) + +#include <asm/pgtable-4level.h> + +#if defined(CONFIG_SW64_LEGACY_KTEXT_ADDRESS) +#define __START_KERNEL_map PAGE_OFFSET +#else +#define __START_KERNEL_map 0xffffffff80000000 +#endif + +#define __pa(x) __phys_addr((unsigned long)(x)) +#define __va(x) ((void *)((unsigned long) (x) + PAGE_OFFSET)) +#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) +#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT) + +#ifdef CONFIG_FLATMEM +#define pfn_valid(pfn) ((pfn) < max_mapnr) +#endif /* CONFIG_FLATMEM */ + +#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | VM_EXEC | \ + VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) +#include <asm-generic/memory_model.h> +#include <asm-generic/getorder.h> +#endif + +#endif /* _ASM_SW64_PAGE_H */ diff --git a/arch/sw_64/include/asm/param.h b/arch/sw_64/include/asm/param.h new file mode 100644 index 000000000000..49c5d03a3370 --- /dev/null +++ b/arch/sw_64/include/asm/param.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PARAM_H +#define _ASM_SW64_PARAM_H + +#include <uapi/asm/param.h> + +#undef HZ +#define HZ CONFIG_HZ +#define USER_HZ 100 +#define CLOCKS_PER_SEC USER_HZ /* frequency at which times() counts */ +#endif /* _ASM_SW64_PARAM_H */ diff --git a/arch/sw_64/include/asm/parport.h b/arch/sw_64/include/asm/parport.h new file mode 100644 index 000000000000..82b9a219b797 --- /dev/null +++ b/arch/sw_64/include/asm/parport.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * parport.h: platform-specific PC-style parport initialisation + * + * Copyright (C) 1999, 2000 Tim Waugh tim@cyberelk.demon.co.uk + * + * This file should only be included by drivers/parport/parport_pc.c. 
+ */ + +#ifndef _ASM_SW64_PARPORT_H +#define _ASM_SW64_PARPORT_H + +static int parport_pc_find_isa_ports(int autoirq, int autodma); +static int parport_pc_find_nonpci_ports(int autoirq, int autodma) +{ + return parport_pc_find_isa_ports(autoirq, autodma); +} + +#endif /* !(_ASM_SW64_PARPORT_H) */ diff --git a/arch/sw_64/include/asm/pci.h b/arch/sw_64/include/asm/pci.h new file mode 100644 index 000000000000..7e0c03da1d17 --- /dev/null +++ b/arch/sw_64/include/asm/pci.h @@ -0,0 +1,151 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PCI_H +#define _ASM_SW64_PCI_H + +#ifdef __KERNEL__ + +#include <linux/spinlock.h> +#include <linux/dma-mapping.h> +#include <linux/scatterlist.h> + +/* + * The following structure is used to manage multiple PCI busses. + */ + +struct pci_dev; +struct pci_bus; +struct resource; +struct pci_iommu_arena; +struct sunway_iommu; +struct page; + + +/* A controller. Used to manage multiple PCI busses. */ + +struct pci_controller { + struct pci_controller *next; + struct pci_bus *bus; + struct resource *io_space; + struct resource *mem_space; + struct resource *pre_mem_space; + struct resource *busn_space; + unsigned long sparse_mem_base; + unsigned long dense_mem_base; + unsigned long sparse_io_base; + unsigned long dense_io_base; + + /* This one's for the kernel only. It's in KSEG somewhere. */ + unsigned long ep_config_space_base; + unsigned long rc_config_space_base; + + unsigned long index; + unsigned long node; + DECLARE_BITMAP(piu_msiconfig, 256); + int int_irq; + /* For compatibility with current (as of July 2003) pciutils + and XFree86. Eventually will be removed. */ + unsigned int need_domain_info; + + struct pci_iommu_arena *sg_pci; + struct pci_iommu_arena *sg_isa; + + bool iommu_enable; + struct sunway_iommu *pci_iommu; + int first_busno; + int last_busno; + int self_busno; + void *sysdata; +}; + +/* Override the logic in pci_scan_bus for skipping already-configured + * bus numbers. + */ + +#define pcibios_assign_all_busses() 1 + +#define PCIBIOS_MIN_IO 0 +#define PCIBIOS_MIN_MEM 0 + +extern void pcibios_set_master(struct pci_dev *dev); +extern void __init sw64_init_pci(void); +extern void __init sw64_device_interrupt(unsigned long vector); +extern void __init sw64_init_irq(void); +extern void __init sw64_init_arch(void); +extern unsigned char sw64_swizzle(struct pci_dev *dev, u8 *pinp); +extern struct pci_ops sw64_pci_ops; +extern int sw64_map_irq(const struct pci_dev *dev, u8 slot, u8 pin); +extern struct pci_controller *hose_head; + +/* TODO: integrate with include/asm-generic/pci.h ? */ +static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel) +{ + return channel ? 15 : 14; +} + +#ifdef CONFIG_SUNWAY_IOMMU +extern struct syscore_ops iommu_cpu_syscore_ops; +#endif + +#define pci_domain_nr(bus) 0 + +static inline int pci_proc_domain(struct pci_bus *bus) +{ + struct pci_controller *hose = bus->sysdata; + + return hose->need_domain_info; +} + +#ifdef CONFIG_NUMA +static inline int __pcibus_to_node(const struct pci_bus *bus) +{ + struct pci_controller *hose; + + hose = bus->sysdata; + if (!node_online(hose->node)) + return next_node_in(hose->node, node_online_map); + else + return hose->node; +} +#define pcibus_to_node(bus) __pcibus_to_node(bus) +#endif + +#endif /* __KERNEL__ */ + +/* Values for the `which' argument to sys_pciconfig_iobase. 
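+ * A rough userspace sketch, assuming the Alpha-style pciconfig_iobase
+ * syscall is wired up on sw64:
+ *
+ *	long base = syscall(__NR_pciconfig_iobase, IOBASE_DENSE_MEM,
+ *			    bus, devfn);
+ *
+ * returns the base of the selected space; IOBASE_FROM_HOSE may be
+ * or-ed into `which' so that the bus argument names a hose instead.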
*/ +#define IOBASE_HOSE 0 +#define IOBASE_SPARSE_MEM 1 +#define IOBASE_DENSE_MEM 2 +#define IOBASE_SPARSE_IO 3 +#define IOBASE_DENSE_IO 4 +#define IOBASE_ROOT_BUS 5 +#define IOBASE_FROM_HOSE 0x10000 + +extern int pci_legacy_read(struct pci_bus *bus, loff_t port, u32 *val, + size_t count); +extern int pci_legacy_write(struct pci_bus *bus, loff_t port, u32 val, + size_t count); +extern int pci_mmap_legacy_page_range(struct pci_bus *bus, + struct vm_area_struct *vma, + enum pci_mmap_state mmap_state); +extern void pci_adjust_legacy_attr(struct pci_bus *bus, + enum pci_mmap_state mmap_type); +#define HAVE_PCI_LEGACY 1 + +extern int pci_create_resource_files(struct pci_dev *dev); +extern void pci_remove_resource_files(struct pci_dev *dev); +extern void __init reserve_mem_for_pci(void); +extern int chip_pcie_configure(struct pci_controller *hose); + +#define PCI_VENDOR_ID_JN 0x5656 +#define PCI_DEVICE_ID_CHIP3 0x3231 +#define PCI_DEVICE_ID_JN_PCIESW 0x1000 +#define PCI_DEVICE_ID_JN_PCIEUSIP 0x1200 +#define PCI_DEVICE_ID_JN_PCIE2PCI 0x1314 + +#define NR_IRQ_VECTORS NR_IRQS + +#define LAST_DEVICE_VECTOR 31 + +#define PCITODMA_OFFSET 0x0 /*0 offset*/ + +#endif /* _ASM_SW64_PCI_H */ diff --git a/arch/sw_64/include/asm/percpu.h b/arch/sw_64/include/asm/percpu.h new file mode 100644 index 000000000000..3acdf36bcf55 --- /dev/null +++ b/arch/sw_64/include/asm/percpu.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PERCPU_H +#define _ASM_SW64_PERCPU_H + +/* + * To calculate addresses of locally defined variables, GCC uses + * 32-bit displacement from the GP. Which doesn't work for per cpu + * variables in modules, as an offset to the kernel per cpu area is + * way above 4G. + * + * Always use weak definitions for percpu variables in modules. + */ +#if defined(MODULE) && defined(CONFIG_SMP) +#define ARCH_NEEDS_WEAK_PER_CPU +#endif + +#include <asm-generic/percpu.h> + +#endif /* _ASM_SW64_PERCPU_H */ diff --git a/arch/sw_64/include/asm/perf_event.h b/arch/sw_64/include/asm/perf_event.h new file mode 100644 index 000000000000..5f5a45217544 --- /dev/null +++ b/arch/sw_64/include/asm/perf_event.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PERF_EVENT_H +#define _ASM_SW64_PERF_EVENT_H + +#include <asm/wrperfmon.h> + +#endif /* _ASM_SW64_PERF_EVENT_H */ diff --git a/arch/sw_64/include/asm/pgalloc.h b/arch/sw_64/include/asm/pgalloc.h new file mode 100644 index 000000000000..3cfdcbef7ef8 --- /dev/null +++ b/arch/sw_64/include/asm/pgalloc.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PGALLOC_H +#define _ASM_SW64_PGALLOC_H + +#include <linux/mm.h> +#include <linux/mmzone.h> +#include <asm-generic/pgalloc.h> /* for pte_{alloc,free}_one */ + +/* + * Allocate and free page tables. The xxx_kernel() versions are + * used to allocate a kernel page table - this turns on ASN bits + * if any. 
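+ *
+ * e.g. wiring a freshly allocated PTE page into a pmd is just
+ * pmd_populate(mm, pmd, pte_alloc_one(mm)); pmd_set() stores its
+ * physical address together with _PAGE_TABLE.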
+ */ + +static inline void +pmd_populate(struct mm_struct *mm, pmd_t *pmd, pgtable_t pte) +{ + pmd_set(pmd, (pte_t *)(page_to_pa(pte) + PAGE_OFFSET)); +} +#define pmd_pgtable(pmd) pmd_page(pmd) + +static inline void +pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, pte_t *pte) +{ + pmd_set(pmd, pte); +} + +static inline void +p4d_populate(struct mm_struct *mm, p4d_t *p4d, pud_t *pud) +{ + p4d_set(p4d, pud); +} + +static inline void +pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd) +{ + pud_set(pud, pmd); +} + +extern pgd_t *pgd_alloc(struct mm_struct *mm); + +#define check_pgt_cache() do { } while (0) + +#endif /* _ASM_SW64_PGALLOC_H */ diff --git a/arch/sw_64/include/asm/pgtable-4level.h b/arch/sw_64/include/asm/pgtable-4level.h new file mode 100644 index 000000000000..8c45f441c520 --- /dev/null +++ b/arch/sw_64/include/asm/pgtable-4level.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PGTABLE_4LEVEL_H +#define _ASM_SW64_PGTABLE_4LEVEL_H + +#ifdef __KERNEL__ +#ifndef __ASSEMBLY__ +/* + * These are used to make use of C type-checking.. + */ +typedef struct { unsigned long pte; } pte_t; +typedef struct { unsigned long pmd; } pmd_t; +typedef struct { unsigned long pgd; } pgd_t; +typedef struct { unsigned long pud; } pud_t; +typedef struct { unsigned long pgprot; } pgprot_t; + +#define pte_val(x) ((x).pte) +#define pmd_val(x) ((x).pmd) +#define pgd_val(x) ((x).pgd) +#define pud_val(x) ((x).pud) +#define pgprot_val(x) ((x).pgprot) + +#define __pte(x) ((pte_t) { (x) }) +#define __pmd(x) ((pmd_t) { (x) }) +#define __pud(x) ((pud_t) { (x) }) +#define __pgd(x) ((pgd_t) { (x) }) +#define __pgprot(x) ((pgprot_t) { (x) }) +#endif /* !__ASSEMBLY__ */ + +#define PAGE_OFFSET 0xfff0000000000000 + +#endif +#endif /* _ASM_SW64_PGTABLE_4LEVEL_H */ diff --git a/arch/sw_64/include/asm/pgtable.h b/arch/sw_64/include/asm/pgtable.h new file mode 100644 index 000000000000..32fde38a2be0 --- /dev/null +++ b/arch/sw_64/include/asm/pgtable.h @@ -0,0 +1,634 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PGTABLE_H +#define _ASM_SW64_PGTABLE_H + + +#include <asm-generic/pgtable-nop4d.h> + +/* + * This file contains the functions and defines necessary to modify and use + * the sw64 page table tree. + * + * This hopefully works with any standard sw64 page-size, as defined + * in <asm/page.h> (currently 8192). + */ +#include <linux/mmzone.h> + +#include <asm/page.h> +#include <asm/processor.h> /* For TASK_SIZE */ +#include <asm/setup.h> + +struct mm_struct; +struct vm_area_struct; + +/* Certain architectures need to do special things when PTEs + * within a page table are directly modified. Thus, the following + * hook is made available. 
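+ *
+ * On sw64 both hooks are plain stores, e.g.
+ *	set_pte(ptep, pfn_pte(pfn, PAGE_KERNEL));
+ * needs no extra MMU bookkeeping (see update_mmu_cache() further down).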
+ */
+#define set_pte(pteptr, pteval) ((*(pteptr)) = (pteval))
+#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
+
+#define set_pmd(pmdptr, pmdval) ((*(pmdptr)) = (pmdval))
+#define set_pmd_at(mm, addr, pmdp, pmdval) set_pmd(pmdp, pmdval)
+
+/* PGDIR_SHIFT determines what a fourth-level page table entry can map */
+#define PGDIR_SHIFT (PAGE_SHIFT + 3 * (PAGE_SHIFT - 3))
+#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
+#define PGDIR_MASK (~(PGDIR_SIZE - 1))
+
+/* PUD_SHIFT determines the size of the area a third-level page table can map */
+#define PUD_SHIFT (PAGE_SHIFT + 2 * (PAGE_SHIFT - 3))
+#define PUD_SIZE (1UL << PUD_SHIFT)
+#define PUD_MASK (~(PUD_SIZE-1))
+
+/* PMD_SHIFT determines the size of the area a second-level page table can map */
+#define PMD_SHIFT (PAGE_SHIFT + (PAGE_SHIFT - 3))
+#define PMD_SIZE (1UL << PMD_SHIFT)
+#define PMD_MASK (~(PMD_SIZE - 1))
+
+/*
+ * Entries per page directory level: the sw64 is four-level, with
+ * all levels having a one-page page table.
+ */
+#define PTRS_PER_PTE (1UL << (PAGE_SHIFT - 3))
+#define PTRS_PER_PMD (1UL << (PAGE_SHIFT - 3))
+#define PTRS_PER_PGD (1UL << (PAGE_SHIFT - 3))
+#define PTRS_PER_PUD (1UL << (PAGE_SHIFT - 3))
+
+#define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE)
+#define FIRST_USER_ADDRESS 0UL
+
+/* Number of pointers that fit on a page: this will go away. */
+#define PTRS_PER_PAGE (1UL << (PAGE_SHIFT - 3))
+
+#define VMALLOC_START (-2 * PGDIR_SIZE)
+#ifndef CONFIG_SPARSEMEM_VMEMMAP
+#define VMALLOC_END (-PGDIR_SIZE)
+#else
+#define VMEMMAP_END (-PGDIR_SIZE)
+#define vmemmap ((struct page *)VMEMMAP_END - (1UL << (3 * (PAGE_SHIFT - 3))))
+#define VMALLOC_END ((unsigned long)vmemmap)
+#endif
+
+/*
+ * HMcode-imposed page table bits
+ */
+#define _PAGE_VALID 0x0001
+#define _PAGE_FOR 0x0002 /* used for page protection (fault on read) */
+#define _PAGE_FOW 0x0004 /* used for page protection (fault on write) */
+#define _PAGE_FOE 0x0008 /* used for page protection (fault on exec) */
+#define _PAGE_ASM 0x0010
+#define _PAGE_PHU 0x0020 /* used for 256M page size bit */
+#define _PAGE_PSE 0x0040 /* used for 8M page size bit */
+#define _PAGE_PROTNONE 0x0080 /* used for numa page balancing */
+#define _PAGE_KRE 0x0400 /* xxx - see below on the "accessed" bit */
+#define _PAGE_URE 0x0800 /* xxx */
+#define _PAGE_KWE 0x4000 /* used to do the dirty bit in software */
+#define _PAGE_UWE 0x8000 /* used to do the dirty bit in software */
+
+/* .. and these are ours ... */
+#define _PAGE_DIRTY 0x20000
+#define _PAGE_ACCESSED 0x40000
+
+#define _PAGE_BIT_ACCESSED 18 /* bit of _PAGE_ACCESSED */
+#define _PAGE_BIT_FOW 2 /* bit of _PAGE_FOW */
+#define _PAGE_SPLITTING 0x200000 /* For Transparent Huge Page */
+#define _PAGE_BIT_SPLITTING 21 /* bit of _PAGE_SPLITTING */
+
+/*
+ * NOTE! The "accessed" bit isn't necessarily exact: it can be kept exactly
+ * by software (use the KRE/URE/KWE/UWE bits appropriately), but I'll fake it.
+ * Under Linux/sw64, the "accessed" bit just means "read", and I'll just use
+ * the KRE/URE bits to watch for it. That way we don't need to overload the
+ * KWE/UWE bits with both handling dirty and accessed.
+ *
+ * Note that the kernel uses the accessed bit just to check whether to page
+ * out a page or not, so it doesn't have to be exact anyway.
+ */ + +#define __DIRTY_BITS (_PAGE_DIRTY | _PAGE_KWE | _PAGE_UWE) +#define __ACCESS_BITS (_PAGE_ACCESSED | _PAGE_KRE | _PAGE_URE) + + +#define _PFN_MASK 0xFFFFFFFFF0000000UL +#define _PFN_BITS 36 +#define _PTE_FLAGS_BITS (64 - _PFN_BITS) + +#define _PAGE_TABLE (_PAGE_VALID | __DIRTY_BITS | __ACCESS_BITS) +#define _PAGE_CHG_MASK (_PFN_MASK | __DIRTY_BITS | __ACCESS_BITS) +#define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_PHU) + +/* + * All the normal masks have the "page accessed" bits on, as any time they are used, + * the page is accessed. They are cleared only by the page-out routines + */ +#define PAGE_NONE __pgprot(__ACCESS_BITS | _PAGE_FOR | _PAGE_FOW | _PAGE_FOE | _PAGE_PROTNONE) +#define PAGE_SHARED __pgprot(_PAGE_VALID | __ACCESS_BITS) +#define PAGE_COPY __pgprot(_PAGE_VALID | __ACCESS_BITS | _PAGE_FOW) +#define PAGE_READONLY __pgprot(_PAGE_VALID | __ACCESS_BITS | _PAGE_FOW) +#define PAGE_KERNEL __pgprot(_PAGE_VALID | _PAGE_ASM | _PAGE_KRE | _PAGE_KWE) +#define _PAGE_NORMAL(x) __pgprot(_PAGE_VALID | __ACCESS_BITS | (x)) + +#define _PAGE_P(x) _PAGE_NORMAL((x) | _PAGE_FOW) +#define _PAGE_S(x) _PAGE_NORMAL(x) + +/* + * The hardware can handle write-only mappings, but as the sw64 + * architecture does byte-wide writes with a read-modify-write + * sequence, it's not practical to have write-without-read privs. + * Thus the "-w- -> rw-" and "-wx -> rwx" mapping here (and in + * arch/sw_64/mm/fault.c) + */ + /* xwr */ +#define __P000 _PAGE_P(_PAGE_FOE | _PAGE_FOW | _PAGE_FOR) +#define __P001 _PAGE_P(_PAGE_FOE | _PAGE_FOW) +#define __P010 _PAGE_P(_PAGE_FOE) +#define __P011 _PAGE_P(_PAGE_FOE) +#define __P100 _PAGE_P(_PAGE_FOW | _PAGE_FOR) +#define __P101 _PAGE_P(_PAGE_FOW) +#define __P110 _PAGE_P(0) +#define __P111 _PAGE_P(0) + +#define __S000 _PAGE_S(_PAGE_FOE | _PAGE_FOW | _PAGE_FOR) +#define __S001 _PAGE_S(_PAGE_FOE | _PAGE_FOW) +#define __S010 _PAGE_S(_PAGE_FOE) +#define __S011 _PAGE_S(_PAGE_FOE) +#define __S100 _PAGE_S(_PAGE_FOW | _PAGE_FOR) +#define __S101 _PAGE_S(_PAGE_FOW) +#define __S110 _PAGE_S(0) +#define __S111 _PAGE_S(0) + +/* + * pgprot_noncached() is only for infiniband pci support, and a real + * implementation for RAM would be more complicated. + */ +#define pgprot_noncached(prot) (prot) + +/* + * ZERO_PAGE is a global shared page that is always zero: used + * for zero-mapped memory areas etc.. + */ + +extern struct page *empty_zero_page; +#define ZERO_PAGE(vaddr) (empty_zero_page) + +/* number of bits that fit into a memory pointer */ +#define BITS_PER_PTR (8 * sizeof(unsigned long)) + +/* to align the pointer to a pointer address */ +#define PTR_MASK (~(sizeof(void *) - 1)) + +/* sizeof(void*)==1<<SIZEOF_PTR_LOG2 */ +#define SIZEOF_PTR_LOG2 3 + +/* to find an entry in a page-table */ +#define PAGE_PTR(address) \ + ((unsigned long)(address) >> (PAGE_SHIFT - SIZEOF_PTR_LOG2) & PTR_MASK & ~PAGE_MASK) + +#define PHYS_TWIDDLE(pfn) (pfn) + +/* + * Conversion functions: convert a page and protection to a page entry, + * and a page entry and page directory to the page they refer to. 
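+ *
+ * e.g. mk_pte(page, PAGE_SHARED) packs the page frame number above
+ * _PTE_FLAGS_BITS and ors in the protection bits, so
+ * pte_page(mk_pte(page, prot)) gets back the original struct page.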
+ */ +#define page_to_pa(page) (page_to_pfn(page) << PAGE_SHIFT) + +#define pmd_pfn(pmd) (pmd_val(pmd) >> _PTE_FLAGS_BITS) +#define pte_pfn(pte) (pte_val(pte) >> _PTE_FLAGS_BITS) +#ifndef CONFIG_DISCONTIGMEM +#define pte_page(pte) pfn_to_page(pte_pfn(pte)) +#define mk_pte(page, pgprot) \ +({ \ + pte_t pte; \ + \ + pte_val(pte) = (page_to_pfn(page) << _PTE_FLAGS_BITS) | pgprot_val(pgprot); \ + pte; \ +}) +#endif + +static inline pte_t pfn_pte(unsigned long physpfn, pgprot_t pgprot) +{ + pte_t pte; + + pte_val(pte) = (PHYS_TWIDDLE(physpfn) << _PTE_FLAGS_BITS) | pgprot_val(pgprot); + return pte; +} + +static inline pmd_t pfn_pmd(unsigned long physpfn, pgprot_t pgprot) +{ + pmd_t pmd; + + pmd_val(pmd) = (PHYS_TWIDDLE(physpfn) << _PTE_FLAGS_BITS) | pgprot_val(pgprot); + return pmd; +} + +static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) +{ + pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot); + return pte; +} + +static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) +{ + pmd_val(pmd) = (pmd_val(pmd) & _HPAGE_CHG_MASK) | pgprot_val(newprot); + return pmd; +} + +static inline void pmd_set(pmd_t *pmdp, pte_t *ptep) +{ + pmd_val(*pmdp) = _PAGE_TABLE | (__pa(ptep) << (_PTE_FLAGS_BITS - PAGE_SHIFT)); +} + +static inline void pud_set(pud_t *pudp, pmd_t *pmdp) +{ + pud_val(*pudp) = _PAGE_TABLE | (__pa(pmdp) << (_PTE_FLAGS_BITS - PAGE_SHIFT)); +} + +static inline void p4d_set(p4d_t *p4dp, pud_t *pudp) +{ + p4d_val(*p4dp) = _PAGE_TABLE | (__pa(pudp) << (_PTE_FLAGS_BITS - PAGE_SHIFT)); +} + +static inline unsigned long +pmd_page_vaddr(pmd_t pmd) +{ + return ((pmd_val(pmd) & _PFN_MASK) >> (_PTE_FLAGS_BITS-PAGE_SHIFT)) + PAGE_OFFSET; +} + +#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> _PTE_FLAGS_BITS)) +#define pud_page(pud) (pfn_to_page(pud_val(pud) >> _PTE_FLAGS_BITS)) +#define p4d_page(p4d) (pfn_to_page(p4d_val(p4d) >> _PTE_FLAGS_BITS)) + +static inline pud_t *p4d_pgtable(p4d_t p4d) +{ + return (pud_t *)(PAGE_OFFSET + ((p4d_val(p4d) & _PFN_MASK) >> (_PTE_FLAGS_BITS-PAGE_SHIFT))); +} + +static inline pmd_t *pud_pgtable(pud_t pud) +{ + return (pmd_t *)(PAGE_OFFSET + ((pud_val(pud) & _PFN_MASK) >> (_PTE_FLAGS_BITS-PAGE_SHIFT))); +} + +static inline int pte_none(pte_t pte) +{ + return !pte_val(pte); +} + +static inline int pte_present(pte_t pte) +{ + return pte_val(pte) & (_PAGE_VALID | _PAGE_PROTNONE); +} + +static inline int pte_huge(pte_t pte) +{ + return pte_val(pte) & _PAGE_PSE; +} + +static inline void pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + pte_val(*ptep) = 0; +} + +#define pte_accessible pte_accessible +static inline bool pte_accessible(struct mm_struct *mm, pte_t a) +{ + if (pte_val(a) & _PAGE_VALID) + return true; + + if ((pte_val(a) & _PAGE_PROTNONE) && + mm_tlb_flush_pending(mm)) + return true; + + return false; +} + +static inline int pmd_none(pmd_t pmd) +{ + return !pmd_val(pmd); +} + +static inline int pmd_bad(pmd_t pmd) +{ + return (pmd_val(pmd) & ~_PFN_MASK) != _PAGE_TABLE; +} + +static inline int pmd_present(pmd_t pmd) +{ + return pmd_val(pmd) & (_PAGE_VALID | _PAGE_PROTNONE); +} + +static inline void pmd_clear(pmd_t *pmdp) +{ + pmd_val(*pmdp) = 0; +} + +static inline int pmd_dirty(pmd_t pmd) +{ + return pmd_val(pmd) & _PAGE_DIRTY; +} + +static inline int pmd_young(pmd_t pmd) +{ + return pmd_val(pmd) & _PAGE_ACCESSED; +} + +#define __HAVE_ARCH_PMD_WRITE +#define pmd_write pmd_write +static inline int pmd_write(pmd_t pmd) +{ + return !(pmd_val(pmd) & _PAGE_FOW); +} + +static inline pmd_t pmd_wrprotect(pmd_t pmd) +{ + 
pmd_val(pmd) |= _PAGE_FOW; + return pmd; +} + +static inline pmd_t pmd_mkinvalid(pmd_t pmd) +{ + pmd_val(pmd) &= ~(_PAGE_VALID | _PAGE_PROTNONE); + return pmd; +} + +static inline pmd_t pmd_mkclean(pmd_t pmd) +{ + pmd_val(pmd) &= ~(__DIRTY_BITS); + pmd_val(pmd) |= _PAGE_FOW; + return pmd; +} + +static inline pmd_t pmd_mkold(pmd_t pmd) +{ + pmd_val(pmd) &= ~(__ACCESS_BITS); + return pmd; +} + +static inline pmd_t pmd_mkwrite(pmd_t pmd) +{ + pmd_val(pmd) &= ~_PAGE_FOW; + return pmd; +} + +static inline pmd_t pmd_mkdirty(pmd_t pmd) +{ + pmd_val(pmd) |= __DIRTY_BITS; + return pmd; +} + +static inline pmd_t pmd_mkyoung(pmd_t pmd) +{ + pmd_val(pmd) |= __ACCESS_BITS; + return pmd; +} + +static inline pmd_t pmd_mkhuge(pmd_t pmd) +{ + pmd_val(pmd) |= _PAGE_PSE; + return pmd; +} + +static inline int pud_none(pud_t pud) +{ + return !pud_val(pud); +} + +static inline int pud_bad(pud_t pud) +{ + return (pud_val(pud) & ~_PFN_MASK) != _PAGE_TABLE; +} + +static inline int pud_present(pud_t pud) +{ + return pud_val(pud) & _PAGE_VALID; +} + +static inline void pud_clear(pud_t *pudp) +{ + pud_val(*pudp) = 0; +} + +static inline int p4d_none(p4d_t p4d) +{ + return !p4d_val(p4d); +} + +static inline int p4d_bad(p4d_t p4d) +{ + return (p4d_val(p4d) & ~_PFN_MASK) != _PAGE_TABLE; +} + +static inline int p4d_present(p4d_t p4d) +{ + return p4d_val(p4d) & _PAGE_VALID; +} + +static inline void p4d_clear(p4d_t *p4dp) +{ + p4d_val(*p4dp) = 0; +} + +/* + * The following only work if pte_present() is true. + * Undefined behaviour if not.. + */ +static inline int pte_write(pte_t pte) +{ + return !(pte_val(pte) & _PAGE_FOW); +} + +static inline int pte_dirty(pte_t pte) +{ + return pte_val(pte) & _PAGE_DIRTY; +} + +static inline int pte_young(pte_t pte) +{ + return pte_val(pte) & _PAGE_ACCESSED; +} + +static inline pte_t pte_wrprotect(pte_t pte) +{ + pte_val(pte) |= _PAGE_FOW; + return pte; +} + +static inline pte_t pte_mkclean(pte_t pte) +{ + pte_val(pte) &= ~(__DIRTY_BITS); + pte_val(pte) |= _PAGE_FOW; + return pte; +} + +static inline pte_t pte_mkold(pte_t pte) +{ + pte_val(pte) &= ~(__ACCESS_BITS); + return pte; +} + +static inline pte_t pte_mkwrite(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_FOW; + return pte; +} + +static inline pte_t pte_mkdirty(pte_t pte) +{ + pte_val(pte) |= __DIRTY_BITS; + return pte; +} + +static inline pte_t pte_mkyoung(pte_t pte) +{ + pte_val(pte) |= __ACCESS_BITS; + return pte; +} + +static inline pte_t pte_mkhuge(pte_t pte) +{ + pte_val(pte) |= _PAGE_PSE; + return pte; +} + +#ifdef CONFIG_NUMA_BALANCING +/* + * See the comment in include/asm-generic/pgtable.h + */ +static inline int pte_protnone(pte_t pte) +{ + return (pte_val(pte) & (_PAGE_PROTNONE | _PAGE_VALID)) + == _PAGE_PROTNONE; +} + +static inline int pmd_protnone(pmd_t pmd) +{ + return (pmd_val(pmd) & (_PAGE_PROTNONE | _PAGE_VALID)) + == _PAGE_PROTNONE; +} +#endif + + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + +/* We don't have hardware dirty/accessed bits, generic_pmdp_establish is fine.*/ +#define pmdp_establish generic_pmdp_establish + +static inline int pmd_trans_splitting(pmd_t pmd) +{ + return pmd_val(pmd) & _PAGE_SPLITTING; +} + +static inline int pmd_trans_huge(pmd_t pmd) +{ + return pmd_val(pmd) & _PAGE_PSE; +} + +static inline int has_transparent_hugepage(void) +{ + return 1; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + +#define __HAVE_ARCH_PMDP_GET_AND_CLEAR +static inline pmd_t pmdp_get_and_clear(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp) +{ + unsigned long pmd_val = xchg(&pmdp->pmd, 0); + pmd_t pmd = 
(pmd_t){pmd_val}; + return pmd; +} + +#define __HAVE_ARCH_PMDP_SET_WRPROTECT +static inline void pmdp_set_wrprotect(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp) +{ + set_bit(_PAGE_BIT_FOW, (unsigned long *)pmdp); +} + +#define mk_pmd(page, pgprot) pfn_pmd(page_to_pfn(page), (pgprot)) + +#define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS +extern int pmdp_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp, + pmd_t entry, int dirty); + +#define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG +extern int pmdp_test_and_clear_young(struct vm_area_struct *vma, + unsigned long addr, pmd_t *pmdp); + +#define __HAVE_ARCH_PMDP_CLEAR_YOUNG_FLUSH +extern int pmdp_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp); + + +#define __HAVE_ARCH_PMDP_SPLITTING_FLUSH +extern void pmdp_splitting_flush(struct vm_area_struct *vma, + unsigned long addr, pmd_t *pmdp); + +#define PAGE_DIR_OFFSET(tsk, address) pgd_offset((tsk), (address)) + +/* to find an entry in a kernel page-table-directory */ +#define pgd_offset_k(address) pgd_offset(&init_mm, (address)) + +/* to find an entry in a page-table-directory. */ +#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)) +#define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address)) + +extern pgd_t swapper_pg_dir[1024]; + +/* + * The sw64 doesn't have any external MMU info: the kernel page + * tables contain all the necessary information. + */ +#define update_mmu_cache(vma, address, ptep) do { } while (0) +#define update_mmu_cache_pmd(vma, address, pmd) do { } while (0) + +/* + * Encode and decode a swap entry: + * + * Format of swap PTE: + * bit 0: _PAGE_VALID (must be zero) + * bit 6: _PAGE_PSE (must be zero) + * bit 7: _PAGE_PROTNONE (must be zero) + * bits 8-15: swap type + * bits 16-63: swap offset + */ +#define __SWP_TYPE_SHIFT 8 +#define __SWP_TYPE_BITS 8 +#define __SWP_OFFSET_BITS 48 +#define __SWP_TYPE_MASK ((1UL << __SWP_TYPE_BITS) - 1) +#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT) +#define __SWP_OFFSET_MASK ((1UL << __SWP_OFFSET_BITS) - 1) + +#define __swp_type(x) (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK) +#define __swp_offset(x) (((x).val >> __SWP_OFFSET_SHIFT) & __SWP_OFFSET_MASK) +#define __swp_entry(type, offset) \ + ((swp_entry_t) { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) }) + +#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) }) +#define __swp_entry_to_pte(x) ((pte_t) { (x).val }) + +#if defined(CONFIG_FLATMEM) +#define kern_addr_valid(addr) (1) +#elif defined(CONFIG_DISCONTIGMEM) +/* XXX: FIXME -- wli */ +#define kern_addr_valid(kaddr) (0) +#elif defined(CONFIG_SPARSEMEM) +#define kern_addr_valid(addr) (1) +#endif + +#define pte_ERROR(e) \ + pr_err("%s: %d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e)) +#define pmd_ERROR(e) \ + pr_err("%s: %d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e)) +#define pud_ERROR(e) \ + pr_err("%s: %d: bad pud %016lx.\n", __FILE__, __LINE__, pud_val(e)) +#define pgd_ERROR(e) \ + pr_err("%s: %d: bad pgd %016lx.\n", __FILE__, __LINE__, pgd_val(e)) +extern void paging_init(void); + +/* We have our own get_unmapped_area to cope with ADDR_LIMIT_32BIT. 
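+ * e.g. a task running with the ADDR_LIMIT_32BIT personality bit must
+ * only be handed mmap addresses below 4GB; STACK_TOP and
+ * TASK_UNMAPPED_BASE in <asm/processor.h> apply the same limit.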
*/ +#define HAVE_ARCH_UNMAPPED_AREA + +#endif /* _ASM_SW64_PGTABLE_H */ diff --git a/arch/sw_64/include/asm/platform.h b/arch/sw_64/include/asm/platform.h new file mode 100644 index 000000000000..318b6ca732cd --- /dev/null +++ b/arch/sw_64/include/asm/platform.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PLATFORM_H +#define _ASM_SW64_PLATFORM_H + +struct sw64_platform_ops { + void (*kill_arch)(int mode); + void __iomem *(*ioportmap)(unsigned long); + void (*register_platform_devices)(void); + void (*ops_fixup)(void); +}; + + +extern struct sw64_platform_ops *sw64_platform; + +extern struct sw64_platform_ops xuelang_ops; + +#endif /* _ASM_SW64_PLATFORM_H */ diff --git a/arch/sw_64/include/asm/preempt.h b/arch/sw_64/include/asm/preempt.h new file mode 100644 index 000000000000..dc6643a43766 --- /dev/null +++ b/arch/sw_64/include/asm/preempt.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PREEMPT_H +#define _ASM_SW64_PREEMPT_H + +#include <asm-generic/preempt.h> + +#endif /* _ASM_SW64_PREEMPT_H */ diff --git a/arch/sw_64/include/asm/processor.h b/arch/sw_64/include/asm/processor.h new file mode 100644 index 000000000000..645c33a596ff --- /dev/null +++ b/arch/sw_64/include/asm/processor.h @@ -0,0 +1,130 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * include/asm-sw64/processor.h + * + * Copyright (C) 1994 Linus Torvalds + */ + +#ifndef _ASM_SW64_PROCESSOR_H +#define _ASM_SW64_PROCESSOR_H + +#include <linux/personality.h> /* for ADDR_LIMIT_32BIT */ + +/* + * Returns current instruction pointer ("program counter"). + */ +#define current_text_addr() \ + ({ void *__pc; __asm__ ("br %0, .+4" : "=r"(__pc)); __pc; }) + +/* + * SW64 does have an arch_pick_mmap_layout() + */ +#define HAVE_ARCH_PICK_MMAP_LAYOUT 1 + +/* + * We have a 52-bit user address space: 4PB user VM... + */ +#define TASK_SIZE (0x10000000000000UL) +#define UNMAPPED_BASE (TASK_SIZE >> 6) +#define STACK_TOP \ + (current->personality & ADDR_LIMIT_32BIT ? 0x80000000 : 0x00120000000UL) + +#define STACK_TOP_MAX 0x00120000000UL + +/* This decides where the kernel will search for a free chunk of vm + * space during mmap's. + */ +#define TASK_UNMAPPED_BASE \ + ((current->personality & ADDR_LIMIT_32BIT) ? 0x40000000 : UNMAPPED_BASE) + +typedef struct { + unsigned long seg; +} mm_segment_t; + +struct context_fpregs { + unsigned long f0[4]; + unsigned long f1[4]; + unsigned long f2[4]; + unsigned long f3[4]; + unsigned long f4[4]; + unsigned long f5[4]; + unsigned long f6[4]; + unsigned long f7[4]; + unsigned long f8[4]; + unsigned long f9[4]; + unsigned long f10[4]; + unsigned long f11[4]; + unsigned long f12[4]; + unsigned long f13[4]; + unsigned long f14[4]; + unsigned long f15[4]; + unsigned long f16[4]; + unsigned long f17[4]; + unsigned long f18[4]; + unsigned long f19[4]; + unsigned long f20[4]; + unsigned long f21[4]; + unsigned long f22[4]; + unsigned long f23[4]; + unsigned long f24[4]; + unsigned long f25[4]; + unsigned long f26[4]; + unsigned long f27[4]; + unsigned long f28[4]; + unsigned long f29[4]; + unsigned long f30[4]; +} __aligned(32); /* 256 bits aligned for simd */ + +struct thread_struct { + struct context_fpregs ctx_fp; + unsigned long fpcr; +}; +#define INIT_THREAD { } + +/* Return saved PC of a blocked thread. */ +struct task_struct; +extern unsigned long thread_saved_pc(struct task_struct *); + +/* Do necessary setup to start up a newly executed thread. 
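+ * It is typically reached from the binfmt loader, roughly
+ * start_thread(regs, elf_entry, start_stack).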
*/ +struct pt_regs; +extern void start_thread(struct pt_regs *, unsigned long, unsigned long); + +/* Free all resources held by a thread. */ +extern void release_thread(struct task_struct *); + +unsigned long get_wchan(struct task_struct *p); + +#define KSTK_EIP(tsk) (task_pt_regs(tsk)->pc) + +#define KSTK_ESP(tsk) \ + ((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp) + +#define cpu_relax() barrier() + +#define ARCH_HAS_PREFETCH +#define ARCH_HAS_PREFETCHW +#define ARCH_HAS_SPINLOCK_PREFETCH + +#ifndef CONFIG_SMP +/* Nothing to prefetch. */ +#define spin_lock_prefetch(lock) do { } while (0) +#endif + +static inline void prefetch(const void *ptr) +{ + __builtin_prefetch(ptr, 0, 3); +} + +static inline void prefetchw(const void *ptr) +{ + __builtin_prefetch(ptr, 1, 3); +} + +#ifdef CONFIG_SMP +static inline void spin_lock_prefetch(const void *ptr) +{ + __builtin_prefetch(ptr, 1, 3); +} +#endif + +#endif /* _ASM_SW64_PROCESSOR_H */ diff --git a/arch/sw_64/include/asm/ptrace.h b/arch/sw_64/include/asm/ptrace.h new file mode 100644 index 000000000000..1dde5e6cba8a --- /dev/null +++ b/arch/sw_64/include/asm/ptrace.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_PTRACE_H +#define _ASM_SW64_PTRACE_H + +#include <uapi/asm/ptrace.h> + + +#define arch_has_single_step() (1) +#define user_mode(regs) (((regs)->ps & 8) != 0) +#define instruction_pointer(regs) ((regs)->pc) +#define profile_pc(regs) instruction_pointer(regs) +#define current_user_stack_pointer() rdusp() +#define user_stack_pointer(regs) rdusp() +#define kernel_stack_pointer(regs) (((regs->ps) >> 4) & (TASK_SIZE - 1)) +#define instruction_pointer_set(regs, val) ((regs)->pc = val) + +#define task_pt_regs(task) \ + ((struct pt_regs *) (task_stack_page(task) + 2 * PAGE_SIZE) - 1) + +#define current_pt_regs() \ + ((struct pt_regs *) ((char *)current_thread_info() + 2 * PAGE_SIZE) - 1) +#define signal_pt_regs current_pt_regs + +#define force_successful_syscall_return() (current_pt_regs()->r0 = 0) + +#define MAX_REG_OFFSET (offsetof(struct pt_regs, r18)) +/** + * regs_get_register() - get register value from its offset + * @regs: pt_regs from which register value is gotten + * @offset: offset of the register. + * + * regs_get_register returns the value of a register whose offset from @regs. + * The @offset is the offset of the register in struct pt_regs. + * If @offset is bigger than MAX_REG_OFFSET, this returns 0. + */ +static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset) +{ + if (unlikely(offset > MAX_REG_OFFSET)) + return 0; + + return *(unsigned long *)((unsigned long)regs + offset); +} +extern int regs_query_register_offset(const char *name); + +static inline unsigned long regs_return_value(struct pt_regs *regs) +{ + return regs->r0; +} +#endif diff --git a/arch/sw_64/include/asm/seccomp.h b/arch/sw_64/include/asm/seccomp.h new file mode 100644 index 000000000000..db2f298862c3 --- /dev/null +++ b/arch/sw_64/include/asm/seccomp.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw_64/include/asm/seccomp.h + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ */ +#ifndef _ASM_SW64_SECCOMP_H +#define _ASM_SW64_SECCOMP_H + +#include <asm/unistd.h> +#include <asm-generic/seccomp.h> + +#endif /* _ASM_SW64_SECCOMP_H */ diff --git a/arch/sw_64/include/asm/sections.h b/arch/sw_64/include/asm/sections.h new file mode 100644 index 000000000000..37dab4fde720 --- /dev/null +++ b/arch/sw_64/include/asm/sections.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SECTIONS_H +#define _ASM_SW64_SECTIONS_H + +/* nothing to see, move along */ +#include <asm-generic/sections.h> + +#endif diff --git a/arch/sw_64/include/asm/segment.h b/arch/sw_64/include/asm/segment.h new file mode 100644 index 000000000000..dc90357765e5 --- /dev/null +++ b/arch/sw_64/include/asm/segment.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SEGMENT_H +#define _ASM_SW64_SEGMENT_H + +/* Only here because we have some old header files that expect it.. */ + +#endif diff --git a/arch/sw_64/include/asm/serial.h b/arch/sw_64/include/asm/serial.h new file mode 100644 index 000000000000..059e603642b9 --- /dev/null +++ b/arch/sw_64/include/asm/serial.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SERIAL_H +#define _ASM_SW64_SERIAL_H + +#define BASE_BAUD (1843200 / 16) + +/* Standard COM flags (except for COM4, because of the 8514 problem) */ +#ifdef CONFIG_SERIAL_8250_DETECT_IRQ +#define STD_COM_FLAGS (UPF_BOOT_AUTOCONF | UPF_SKIP_TEST | UPF_AUTO_IRQ) +#define STD_COM4_FLAGS (UPF_BOOT_AUTOCONF | UPF_AUTO_IRQ) +#else +#define STD_COM_FLAGS (UPF_BOOT_AUTOCONF | UPF_SKIP_TEST) +#define STD_COM4_FLAGS UPF_BOOT_AUTOCONF +#endif + +#endif /* _ASM_SW64_SERIAL_H */ diff --git a/arch/sw_64/include/asm/setup.h b/arch/sw_64/include/asm/setup.h new file mode 100644 index 000000000000..c0fb4e8bd80c --- /dev/null +++ b/arch/sw_64/include/asm/setup.h @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SETUP_H +#define _ASM_SW64_SETUP_H + +#include <uapi/asm/setup.h> + +/* + * We leave one page for the initial stack page, and one page for + * the initial process structure. Also, the console eats 3 MB for + * the initial bootloader (one of which we can reclaim later). + */ +#define BOOT_PCB 0x20000000 +#define BOOT_ADDR 0x20000000 +/* Remove when official MILO sources have ELF support: */ +#define BOOT_SIZE (16 * 1024) + +#define KERNEL_START_PHYS CONFIG_PHYSICAL_START +#define KERNEL_START (__START_KERNEL_map + CONFIG_PHYSICAL_START) + +/* INIT_STACK may be used for merging lwk to kernel*/ +#define INIT_STACK (KERNEL_START + 0x02000) + +/* + * This is setup by the secondary bootstrap loader. Because + * the zero page is zeroed out as soon as the vm system is + * initialized, we need to copy things out into a more permanent + * place. 
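+ *
+ * e.g. the initrd left by the bootloader is picked up through
+ * INITRD_START/INITRD_SIZE, which read the words at PARAM + 0x100 and
+ * PARAM + 0x108 below.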
+ */
+#define PARAM (KERNEL_START + 0x0A000)
+#define COMMAND_LINE ((char *)(KERNEL_START + 0x0B000))
+#define INITRD_START (*(unsigned long *)(PARAM + 0x100))
+#define INITRD_SIZE (*(unsigned long *)(PARAM + 0x108))
+#define DTB_START (*(unsigned long *)(PARAM + 0x118))
+
+#define _TEXT_START (KERNEL_START + 0x10000)
+
+#define COMMAND_LINE_OFF (0x10000UL - 0xB000UL)
+#define INITRD_START_OFF (0x10000UL - 0xA100UL)
+#define INITRD_SIZE_OFF (0x10000UL - 0xA108UL)
+
+#ifndef __ASSEMBLY__
+#include <asm/bootparam.h>
+extern struct boot_params *sunway_boot_params;
+#endif
+
+#endif
diff --git a/arch/sw_64/include/asm/sfp-machine.h b/arch/sw_64/include/asm/sfp-machine.h
new file mode 100644
index 000000000000..9b3e8688feee
--- /dev/null
+++ b/arch/sw_64/include/asm/sfp-machine.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Machine-dependent software floating-point definitions.
+ * sw64 kernel version.
+ * Copyright (C) 1997,1998,1999 Free Software Foundation, Inc.
+ * This file is part of the GNU C Library.
+ * Contributed by Richard Henderson (rth@cygnus.com),
+ * Jakub Jelinek (jakub@redhat.com) and
+ * David S. Miller (davem@redhat.com).
+ */
+
+#ifndef _ASM_SW64_SFP_MACHINE_H
+#define _ASM_SW64_SFP_MACHINE_H
+
+#define _FP_W_TYPE_SIZE 64
+#define _FP_W_TYPE unsigned long
+#define _FP_WS_TYPE signed long
+#define _FP_I_TYPE long
+
+#define _FP_MUL_MEAT_S(R, X, Y) \
+	_FP_MUL_MEAT_1_imm(_FP_WFRACBITS_S, R, X, Y)
+#define _FP_MUL_MEAT_D(R, X, Y) \
+	_FP_MUL_MEAT_1_wide(_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y) \
+	_FP_MUL_MEAT_2_wide(_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y) _FP_DIV_MEAT_1_imm(S, R, X, Y, _FP_DIV_HELP_imm)
+#define _FP_DIV_MEAT_D(R, X, Y) _FP_DIV_MEAT_1_udiv(D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y) _FP_DIV_MEAT_2_udiv(Q, R, X, Y)
+
+#define _FP_NANFRAC_S _FP_QNANBIT_S
+#define _FP_NANFRAC_D _FP_QNANBIT_D
+#define _FP_NANFRAC_Q _FP_QNANBIT_Q
+#define _FP_NANSIGN_S 1
+#define _FP_NANSIGN_D 1
+#define _FP_NANSIGN_Q 1
+
+#define _FP_KEEPNANFRACP 1
+
+/* sw64 Architecture Handbook, 4.7.10.4 says that
+ * we should prefer any type of NaN in Fb, then Fa.
+ */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
+do { \
+	R##_s = Y##_s; \
+	_FP_FRAC_COPY_##wc(R, X); \
+	R##_c = FP_CLS_NAN; \
+} while (0)
+
+/* Obtain the current rounding mode. */
+#define FP_ROUNDMODE mode
+#define FP_RND_NEAREST (FPCR_DYN_NORMAL >> FPCR_DYN_SHIFT)
+#define FP_RND_ZERO (FPCR_DYN_CHOPPED >> FPCR_DYN_SHIFT)
+#define FP_RND_PINF (FPCR_DYN_PLUS >> FPCR_DYN_SHIFT)
+#define FP_RND_MINF (FPCR_DYN_MINUS >> FPCR_DYN_SHIFT)
+
+/* Exception flags.
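+ * These give the generic soft-fp FP_EX_* names the values of the
+ * corresponding IEEE trap-enable bits, so e.g. a division by zero
+ * raises FP_EX_DIVZERO == IEEE_TRAP_ENABLE_DZE.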
*/ +#define FP_EX_INVALID IEEE_TRAP_ENABLE_INV +#define FP_EX_OVERFLOW IEEE_TRAP_ENABLE_OVF +#define FP_EX_UNDERFLOW IEEE_TRAP_ENABLE_UNF +#define FP_EX_DIVZERO IEEE_TRAP_ENABLE_DZE +#define FP_EX_INEXACT IEEE_TRAP_ENABLE_INE +#define FP_EX_DENORM IEEE_TRAP_ENABLE_DNO + +#define FP_DENORM_ZERO (swcr & IEEE_MAP_DMZ) + +/* We write the results always */ +#define FP_INHIBIT_RESULTS 0 + +#endif diff --git a/arch/sw_64/include/asm/shmparam.h b/arch/sw_64/include/asm/shmparam.h new file mode 100644 index 000000000000..15f71533b1ed --- /dev/null +++ b/arch/sw_64/include/asm/shmparam.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SHMPARAM_H +#define _ASM_SW64_SHMPARAM_H + +#define SHMLBA PAGE_SIZE /* attach addr a multiple of this */ + +#endif /* _ASM_SW64_SHMPARAM_H */ diff --git a/arch/sw_64/include/asm/signal.h b/arch/sw_64/include/asm/signal.h new file mode 100644 index 000000000000..3e91b72c0b0a --- /dev/null +++ b/arch/sw_64/include/asm/signal.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SIGNAL_H +#define _ASM_SW64_SIGNAL_H + +#include <uapi/asm/signal.h> + +#define _NSIG 64 +#define _NSIG_BPW 64 +#define _NSIG_WORDS (_NSIG / _NSIG_BPW) + +typedef unsigned long old_sigset_t; /* at least 32 bits */ + +typedef struct { + unsigned long sig[_NSIG_WORDS]; +} sigset_t; + +#ifdef CONFIG_OLD_SIGACTION +#define __ARCH_HAS_SA_RESTORER +#endif + +#include <asm/sigcontext.h> +#endif diff --git a/arch/sw_64/include/asm/smp.h b/arch/sw_64/include/asm/smp.h new file mode 100644 index 000000000000..e7aa742f73f0 --- /dev/null +++ b/arch/sw_64/include/asm/smp.h @@ -0,0 +1,181 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SMP_H +#define _ASM_SW64_SMP_H + +#include <linux/threads.h> +#include <linux/cpumask.h> +#include <linux/bitops.h> +#include <asm/hmcall.h> +#include <asm/hcall.h> +#include <asm/core.h> +#include <asm/hw_init.h> + +/* HACK: Cabrio WHAMI return value is bogus if more than 8 bits used.. :-( */ + +extern cpumask_t core_start; + +static inline unsigned char +__hard_smp_processor_id(void) +{ + register unsigned char __r0 __asm__("$0"); + __asm__ __volatile__( + "sys_call %1 #whami" + : "=r"(__r0) + : "i" (HMC_whami) + : "$1", "$22", "$23", "$24", "$25"); + return __r0; +} + +static inline unsigned long +read_vpcr(void) +{ + register unsigned long __r0 __asm__("$0"); + __asm__ __volatile__( + "sys_call %1 #rvpcr" + : "=r"(__r0) + : "i" (0x39) + : "$1", "$22", "$23", "$24", "$25"); + return __r0; +} + +#ifdef CONFIG_SMP +/* SMP initialization hook for setup_arch */ +void __init setup_smp(void); + +#include <asm/irq.h> + +/* smp reset control block */ +struct smp_rcb_struct { + void (*restart_entry)(unsigned long args); + unsigned long restart_args; + unsigned long ready; + unsigned long init_done; +}; + +#define INIT_SMP_RCB ((struct smp_rcb_struct *) __va(0x820000UL)) + +#define hard_smp_processor_id() __hard_smp_processor_id() +#define raw_smp_processor_id() (current_thread_info()->cpu) + +/* The map from sequential logical cpu number to hard cid. */ +extern int __cpu_to_rcid[NR_CPUS]; +#define cpu_to_rcid(cpu) __cpu_to_rcid[cpu] + +/* + * Map from hard cid to sequential logical cpu number. This will only + * not be idempotent when cpus failed to come on-line. 
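+ *
+ * i.e. rcid_to_cpu(cpu_to_rcid(cpu)) == cpu always holds for online
+ * cpus, but some hard cids may map to no logical cpu at all.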
+ */ +extern int __rcid_to_cpu[NR_CPUS]; +#define rcid_to_cpu(cpu) __rcid_to_cpu[cpu] +#define cpu_physical_id(cpu) __cpu_to_rcid[cpu] + +extern unsigned long tidle_pcb[NR_CPUS]; + +struct smp_ops { + void (*smp_prepare_boot_cpu)(void); + void (*smp_prepare_cpus)(unsigned int max_cpus); + void (*smp_cpus_done)(unsigned int max_cpus); + + void (*stop_other_cpus)(int wait); + void (*smp_send_reschedule)(int cpu); + + int (*cpu_up)(unsigned int cpu, struct task_struct *tidle); + int (*cpu_disable)(void); + void (*cpu_die)(unsigned int cpu); + void (*play_dead)(void); + + void (*send_call_func_ipi)(const struct cpumask *mask); + void (*send_call_func_single_ipi)(int cpu); +}; + +extern struct smp_ops smp_ops; + +static inline void smp_send_stop(void) +{ + smp_ops.stop_other_cpus(0); +} + +static inline void stop_other_cpus(void) +{ + smp_ops.stop_other_cpus(1); +} + +static inline void smp_prepare_boot_cpu(void) +{ + smp_ops.smp_prepare_boot_cpu(); +} + +static inline void smp_prepare_cpus(unsigned int max_cpus) +{ + smp_ops.smp_prepare_cpus(max_cpus); +} + +static inline void smp_cpus_done(unsigned int max_cpus) +{ + smp_ops.smp_cpus_done(max_cpus); +} + +static inline int __cpu_up(unsigned int cpu, struct task_struct *tidle) +{ + return smp_ops.cpu_up(cpu, tidle); +} + +static inline int __cpu_disable(void) +{ + return smp_ops.cpu_disable(); +} + +static inline void __cpu_die(unsigned int cpu) +{ + smp_ops.cpu_die(cpu); +} + +static inline void play_dead(void) +{ + smp_ops.play_dead(); +} + +static inline void smp_send_reschedule(int cpu) +{ + smp_ops.smp_send_reschedule(cpu); +} + +static inline void arch_send_call_function_single_ipi(int cpu) +{ + smp_ops.send_call_func_single_ipi(cpu); +} + +static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask) +{ + smp_ops.send_call_func_ipi(mask); +} + + +#else /* CONFIG_SMP */ +static inline void play_dead(void) +{ + BUG(); /*Fixed me*/ +} +#define hard_smp_processor_id() 0 +#define smp_call_function_on_cpu(func, info, wait, cpu) ({ 0; }) +#define cpu_to_rcid(cpu) ((int)whami()) +#define rcid_to_cpu(rcid) 0 +#endif /* CONFIG_SMP */ + +#define NO_PROC_ID (-1) + +static inline void send_ipi(int cpu, unsigned long type) +{ + int rcid; + + rcid = cpu_to_rcid(cpu); + + if (is_in_guest()) + hcall(HCALL_IVI, rcid, type, 0); + else + sendii(rcid, type, 0); +} + +#define reset_cpu(cpu) send_ipi((cpu), II_RESET) + +#endif diff --git a/arch/sw_64/include/asm/socket.h b/arch/sw_64/include/asm/socket.h new file mode 100644 index 000000000000..e87043467775 --- /dev/null +++ b/arch/sw_64/include/asm/socket.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SOCKET_H +#define _ASM_SW64_SOCKET_H + +#include <uapi/asm/socket.h> + +/* O_NONBLOCK clashes with the bits used for socket types. Therefore we + * have to define SOCK_NONBLOCK to a different value here. 
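+ *
+ * e.g. socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0) works because
+ * 0x40000000 is well clear of the low bits that encode the socket type.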
+ */ +#define SOCK_NONBLOCK 0x40000000 +#endif /* _ASM_SW64_SOCKET_H */ diff --git a/arch/sw_64/include/asm/sparsemem.h b/arch/sw_64/include/asm/sparsemem.h new file mode 100644 index 000000000000..a60e757f3838 --- /dev/null +++ b/arch/sw_64/include/asm/sparsemem.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SPARSEMEM_H +#define _ASM_SW64_SPARSEMEM_H + +#include <asm/memory.h> + +#define SECTION_SIZE_BITS 28 + +#endif /* _ASM_SW64_SPARSEMEM_H */ diff --git a/arch/sw_64/include/asm/special_insns.h b/arch/sw_64/include/asm/special_insns.h new file mode 100644 index 000000000000..7f5a52b20444 --- /dev/null +++ b/arch/sw_64/include/asm/special_insns.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SPECIAL_INSNS_H +#define _ASM_SW64_SPECIAL_INSNS_H + +enum amask_enum { + AMASK_BWX = (1UL << 0), + AMASK_FIX = (1UL << 1), + AMASK_CIX = (1UL << 2), + AMASK_MAX = (1UL << 8), + AMASK_PRECISE_TRAP = (1UL << 9), +}; + +#define amask(mask) \ +({ \ + unsigned long __amask, __input = (mask); \ + __asm__ ("mov %1, %0" : "=r"(__amask) : "rI"(__input)); \ + __amask; \ +}) + +#endif /* _ASM_SW64_SPECIAL_INSNS_H */ diff --git a/arch/sw_64/include/asm/spinlock.h b/arch/sw_64/include/asm/spinlock.h new file mode 100644 index 000000000000..a05afb9af496 --- /dev/null +++ b/arch/sw_64/include/asm/spinlock.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ +#ifndef _ASM_SW64_SPINLOCK_H +#define _ASM_SW64_SPINLOCK_H + +#include <asm/qrwlock.h> +#include <asm/qspinlock.h> + +/* See include/linux/spinlock.h */ +#define smp_mb__after_spinlock() smp_mb() + +#endif /* _ASM_SW64_SPINLOCK_H */ diff --git a/arch/sw_64/include/asm/spinlock_types.h b/arch/sw_64/include/asm/spinlock_types.h new file mode 100644 index 000000000000..28f2183ced74 --- /dev/null +++ b/arch/sw_64/include/asm/spinlock_types.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SPINLOCK_TYPES_H +#define _ASM_SW64_SPINLOCK_TYPES_H + +#include <asm-generic/qspinlock_types.h> +#include <asm-generic/qrwlock_types.h> + +#endif diff --git a/arch/sw_64/include/asm/string.h b/arch/sw_64/include/asm/string.h new file mode 100644 index 000000000000..4f4a4687d8d0 --- /dev/null +++ b/arch/sw_64/include/asm/string.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_STRING_H +#define _ASM_SW64_STRING_H + +#ifdef __KERNEL__ + +/* + * GCC of any recent vintage doesn't do stupid things with bcopy. + * EGCS 1.1 knows all about expanding memcpy inline, others don't. + * + * Similarly for a memset with data = 0. + */ + +#define __HAVE_ARCH_MEMCPY +extern void *memcpy(void *dest, const void *src, size_t n); +/* For backward compatibility with modules. Unused otherwise. 
*/ +extern void *__memcpy(void *dest, const void *src, size_t n); + +#define __HAVE_ARCH_MEMMOVE +extern void *memmove(void *dest, const void *src, size_t n); + +#define __HAVE_ARCH_MEMSET +extern void *__constant_c_memset(void *s, unsigned long c, size_t n); +extern void *___memset(void *s, int c, size_t n); +extern void *__memset(void *s, int c, size_t n); +extern void *memset(void *s, int c, size_t n); + +#define __HAVE_ARCH_STRCPY +extern char *strcpy(char *dest, const char *src); + +#define __HAVE_ARCH_STRNCPY +extern char *strncpy(char *dest, const char *src, size_t n); + +/* + * The following routine is like memset except that it writes 16-bit + * aligned values. The DEST and COUNT parameters must be even for + * correct operation. + */ + +#define __HAVE_ARCH_MEMSETW +extern void *__memsetw(void *dest, unsigned short c, size_t count); + +#define memsetw(s, c, n) \ +(__builtin_constant_p(c) \ + ? __constant_c_memset((s), 0x0001000100010001UL * (unsigned short)(c), (n)) \ + : __memsetw((s), (c), (n))) + +#endif /* __KERNEL__ */ + +#endif /* _ASM_SW64_STRING_H */ diff --git a/arch/sw_64/include/asm/suspend.h b/arch/sw_64/include/asm/suspend.h new file mode 100644 index 000000000000..83fd413fd6e2 --- /dev/null +++ b/arch/sw_64/include/asm/suspend.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SLEEP_H +#define _ASM_SW64_SLEEP_H + +#include <asm/hmcall.h> +#include <asm/ptrace.h> +#include <asm/processor.h> +#define SOFTINF_SLEEP_MAGIC 0x0123456789ABCDEFUL + +#ifdef CONFIG_HIBERNATION +#include <asm/vcpu.h> +#include <asm/thread_info.h> +#endif + +struct callee_saved_regs { + unsigned long r9; + unsigned long r10; + unsigned long r11; + unsigned long r12; + unsigned long r13; + unsigned long r14; + unsigned long r15; + unsigned long ra; +}; + +struct callee_saved_fpregs { + unsigned long f2[4]; + unsigned long f3[4]; + unsigned long f4[4]; + unsigned long f5[4]; + unsigned long f6[4]; + unsigned long f7[4]; + unsigned long f8[4]; + unsigned long f9[4]; +} __aligned(32); /* 256 bits aligned for simd */ + +struct processor_state { + struct callee_saved_regs regs; + struct callee_saved_fpregs fpregs; + unsigned long fpcr; +#ifdef CONFIG_HIBERNATION + struct pcb_struct pcb; + struct vcpucb vcb; +#endif +}; + +extern void sw64_suspend_deep_sleep(struct processor_state *state); +#endif /* _ASM_SW64_SLEEP_H */ diff --git a/arch/sw_64/include/asm/sw64_init.h b/arch/sw_64/include/asm/sw64_init.h new file mode 100644 index 000000000000..15842d22e5ba --- /dev/null +++ b/arch/sw_64/include/asm/sw64_init.h @@ -0,0 +1,45 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_INIT_H +#define _ASM_SW64_INIT_H + +#include <linux/cpu.h> +#include <linux/pci.h> + + +struct sw64_early_init_ops { + void (*setup_core_start)(struct cpumask *cpumask); + unsigned long (*get_node_mem)(int nodeid); +}; + +struct sw64_pci_init_ops { + int (*map_irq)(const struct pci_dev *dev, u8 slot, u8 pin); + unsigned long (*get_rc_enable)(unsigned long node); + void (*hose_init)(struct pci_controller *hose); + void (*set_rc_piu)(unsigned long node, unsigned long index); + int (*check_pci_linkup)(unsigned long node, unsigned long index); + void (*set_intx)(unsigned long node, unsigned long index, + unsigned long int_conf); +}; + + +struct sw64_chip_init_ops { + struct sw64_early_init_ops early_init; + struct sw64_pci_init_ops pci_init; + void (*fixup)(void); +}; + +struct sw64_chip_ops { + int (*get_cpu_num)(void); + void (*device_interrupt)(unsigned long irq_info); + void (*suspend)(int 
wake); + void (*fixup)(void); +}; + +extern void sw64_init_noop(void); +extern void sw64_setup_chip_ops(void); +extern struct sw64_chip_ops *sw64_chip; +extern struct sw64_chip_init_ops *sw64_chip_init; + +DECLARE_PER_CPU(unsigned long, hard_node_id); + +#endif /* _ASM_SW64_INIT_H */ diff --git a/arch/sw_64/include/asm/sw64io.h b/arch/sw_64/include/asm/sw64io.h new file mode 100644 index 000000000000..7c032070acf0 --- /dev/null +++ b/arch/sw_64/include/asm/sw64io.h @@ -0,0 +1,115 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SW64IO_H +#define _ASM_SW64_SW64IO_H + +#include <asm/page.h> + +extern void setup_chip_clocksource(void); + +#if defined(CONFIG_SW64_CHIP3) +#include <asm/chip3_io.h> +#endif + +#define MK_RC_CFG(nid, idx) \ + (PAGE_OFFSET | SW64_PCI_IO_BASE((nid), (idx)) | PCI_RC_CFG) +#define MK_PIU_IOR0(nid, idx) \ + (PAGE_OFFSET | SW64_PCI_IO_BASE((nid), (idx)) | PCI_IOR0_BASE) +#define MK_PIU_IOR1(nid, idx) \ + (PAGE_OFFSET | SW64_PCI_IO_BASE((nid), (idx)) | PCI_IOR1_BASE) + +static inline unsigned int +read_rc_conf(unsigned long node, unsigned long rc_index, + unsigned int conf_offset) +{ + unsigned long addr; + unsigned int value; + + addr = MK_RC_CFG(node, rc_index) | conf_offset; + value = *(volatile unsigned int *)addr; + mb(); + + return value; +} + +static inline void +write_rc_conf(unsigned long node, unsigned long rc_index, + unsigned int conf_offset, unsigned int data) +{ + unsigned long addr; + + addr = MK_RC_CFG(node, rc_index) | conf_offset; + *(unsigned int *)addr = data; + mb(); +} + +static inline unsigned long +read_piu_ior0(unsigned long node, unsigned long rc_index, + unsigned int reg) +{ + unsigned long addr; + unsigned long value; + + addr = MK_PIU_IOR0(node, rc_index) + reg; + value = *(volatile unsigned long __iomem *)addr; + mb(); + + return value; +} + +static inline void +write_piu_ior0(unsigned long node, unsigned long rc_index, + unsigned int reg, unsigned long data) +{ + unsigned long addr; + + addr = MK_PIU_IOR0(node, rc_index) + reg; + *(unsigned long __iomem *)addr = data; + mb(); +} + +static inline unsigned long +read_piu_ior1(unsigned long node, unsigned long rc_index, + unsigned int reg) +{ + unsigned long addr, value; + + addr = MK_PIU_IOR1(node, rc_index) + reg; + value = *(volatile unsigned long __iomem *)addr; + mb(); + + return value; +} + +static inline void +write_piu_ior1(unsigned long node, unsigned long rc_index, + unsigned int reg, unsigned long data) +{ + unsigned long addr; + + addr = MK_PIU_IOR1(node, rc_index) + reg; + *(volatile unsigned long __iomem *)addr = data; + mb(); +} + +static inline unsigned long +sw64_io_read(unsigned long node, unsigned long reg) +{ + unsigned long addr, value; + + addr = PAGE_OFFSET | SW64_IO_BASE(node) | reg; + value = *(volatile unsigned long __iomem *)addr; + mb(); + + return value; +} + +static inline void +sw64_io_write(unsigned long node, unsigned long reg, unsigned long data) +{ + unsigned long addr; + + addr = PAGE_OFFSET | SW64_IO_BASE(node) | reg; + *(volatile unsigned long __iomem *)addr = data; + mb(); +} +#endif diff --git a/arch/sw_64/include/asm/switch_to.h b/arch/sw_64/include/asm/switch_to.h new file mode 100644 index 000000000000..22045b247557 --- /dev/null +++ b/arch/sw_64/include/asm/switch_to.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SWITCH_TO_H +#define _ASM_SW64_SWITCH_TO_H + +struct task_struct; +extern struct task_struct *__switch_to(unsigned long, struct task_struct *); +extern void restore_da_match_after_sched(void); 
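+
+/*
+ * Rough shape of a context switch here: switch_to() passes
+ * __switch_to() the physical address of the next task's PCB, gets the
+ * previous task back in (L), and check_mmu_context() then re-checks
+ * the mm context before the new thread runs.
+ */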
+#define switch_to(P, N, L) \ +do { \ + (L) = __switch_to(virt_to_phys(&task_thread_info(N)->pcb), (P));\ + check_mmu_context(); \ +} while (0) + + +/* TODO: finish_arch_switch has been removed from arch-independent code. */ + +/* + * finish_arch_switch will be called after switch_to + */ +#define finish_arch_post_lock_switch() \ +do { \ + restore_da_match_after_sched(); \ +} while (0) + + +#endif /* _ASM_SW64_SWITCH_TO_H */ diff --git a/arch/sw_64/include/asm/syscall.h b/arch/sw_64/include/asm/syscall.h new file mode 100644 index 000000000000..4b784c3d846b --- /dev/null +++ b/arch/sw_64/include/asm/syscall.h @@ -0,0 +1,75 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_SYSCALL_H +#define _ASM_SW64_SYSCALL_H + +#include <uapi/linux/audit.h> + +extern void *sys_call_table[]; +static inline int syscall_get_nr(struct task_struct *task, + struct pt_regs *regs) +{ + return regs->r0; +} + +static inline long +syscall_get_error(struct task_struct *task, struct pt_regs *regs) +{ + return regs->r19 ? -regs->r0 : 0; +} + +static inline long syscall_get_return_value(struct task_struct *task, + struct pt_regs *regs) +{ + return regs->r0; +} + +static inline void syscall_set_return_value(struct task_struct *task, + struct pt_regs *regs, + int error, long val) +{ + if (error) { + regs->r0 = -error; + regs->r19 = -1; + } else { + regs->r0 = val; + regs->r19 = 0; + } +} + + +static inline void syscall_rollback(struct task_struct *task, + struct pt_regs *regs) +{ + /* Do nothing */ +} + +static inline void syscall_get_arguments(struct task_struct *task, + struct pt_regs *regs, + unsigned long *args) +{ + *args++ = regs->r16; + *args++ = regs->r17; + *args++ = regs->r18; + *args++ = regs->r19; + *args++ = regs->r20; + *args = regs->r21; +} + +static inline void syscall_set_arguments(struct task_struct *task, + struct pt_regs *regs, + const unsigned long *args) +{ + regs->r16 = *args++; + regs->r17 = *args++; + regs->r18 = *args++; + regs->r19 = *args++; + regs->r20 = *args++; + regs->r21 = *args; +} + +static inline int syscall_get_arch(struct task_struct *task) +{ + return AUDIT_ARCH_SW64; +} + +#endif /* _ASM_SW64_SYSCALL_H */ diff --git a/arch/sw_64/include/asm/tc.h b/arch/sw_64/include/asm/tc.h new file mode 100644 index 000000000000..f995a2a75f85 --- /dev/null +++ b/arch/sw_64/include/asm/tc.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TC_H +#define _ASM_SW64_TC_H + +static inline unsigned long rdtc(void) +{ + unsigned long ret; + + __asm__ __volatile__ ("rtc %0" : "=r"(ret)); + return ret; +} + +extern void tc_sync_clear(void); +extern void tc_sync_ready(void *ignored); +extern void tc_sync_set(void); +#endif diff --git a/arch/sw_64/include/asm/termios.h b/arch/sw_64/include/asm/termios.h new file mode 100644 index 000000000000..ef509946675a --- /dev/null +++ b/arch/sw_64/include/asm/termios.h @@ -0,0 +1,81 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TERMIOS_H +#define _ASM_SW64_TERMIOS_H + +#include <uapi/asm/termios.h> + +/* eof=^D eol=\0 eol2=\0 erase=del + * werase=^W kill=^U reprint=^R sxtc=\0 + * intr=^C quit=^\ susp=^Z <VDSUSP> + * start=^Q stop=^S lnext=^V discard=^U + * vmin=\1 vtime=\0 + */ +#define INIT_C_CC "\004\000\000\177\027\025\022\000\003\034\032\000\021\023\026\025\001\000" + +/* + * Translate a "termio" structure into a "termios". Ugh. 
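+ *
+ * The macro evaluates to the copy_from_user() result, so a nonzero
+ * value means -EFAULT to the caller; only the low 16 bits of each flag
+ * word are overwritten because struct termio carries unsigned shorts.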
+ */ + +#define user_termio_to_kernel_termios(a_termios, u_termio) \ +({ \ + struct ktermios *k_termios = (a_termios); \ + struct termio k_termio; \ + int canon, ret; \ + \ + ret = copy_from_user(&k_termio, u_termio, sizeof(k_termio)); \ + if (!ret) { \ + /* Overwrite only the low bits. */ \ + *(unsigned short *)&k_termios->c_iflag = k_termio.c_iflag; \ + *(unsigned short *)&k_termios->c_oflag = k_termio.c_oflag; \ + *(unsigned short *)&k_termios->c_cflag = k_termio.c_cflag; \ + *(unsigned short *)&k_termios->c_lflag = k_termio.c_lflag; \ + canon = k_termio.c_lflag & ICANON; \ + \ + k_termios->c_cc[VINTR] = k_termio.c_cc[_VINTR]; \ + k_termios->c_cc[VQUIT] = k_termio.c_cc[_VQUIT]; \ + k_termios->c_cc[VERASE] = k_termio.c_cc[_VERASE]; \ + k_termios->c_cc[VKILL] = k_termio.c_cc[_VKILL]; \ + k_termios->c_cc[VEOL2] = k_termio.c_cc[_VEOL2]; \ + k_termios->c_cc[VSWTC] = k_termio.c_cc[_VSWTC]; \ + k_termios->c_cc[canon ? VEOF : VMIN] = k_termio.c_cc[_VEOF]; \ + k_termios->c_cc[canon ? VEOL : VTIME] = k_termio.c_cc[_VEOL]; \ + } \ + ret; \ +}) + +/* + * Translate a "termios" structure into a "termio". Ugh. + * + * Note the "fun" _VMIN overloading. + */ +#define kernel_termios_to_user_termio(u_termio, a_termios) \ +({ \ + struct ktermios *k_termios = (a_termios); \ + struct termio k_termio; \ + int canon; \ + \ + k_termio.c_iflag = k_termios->c_iflag; \ + k_termio.c_oflag = k_termios->c_oflag; \ + k_termio.c_cflag = k_termios->c_cflag; \ + canon = (k_termio.c_lflag = k_termios->c_lflag) & ICANON; \ + \ + k_termio.c_line = k_termios->c_line; \ + k_termio.c_cc[_VINTR] = k_termios->c_cc[VINTR]; \ + k_termio.c_cc[_VQUIT] = k_termios->c_cc[VQUIT]; \ + k_termio.c_cc[_VERASE] = k_termios->c_cc[VERASE]; \ + k_termio.c_cc[_VKILL] = k_termios->c_cc[VKILL]; \ + k_termio.c_cc[_VEOF] = k_termios->c_cc[canon ? VEOF : VMIN]; \ + k_termio.c_cc[_VEOL] = k_termios->c_cc[canon ? 
VEOL : VTIME]; \
+ k_termio.c_cc[_VEOL2] = k_termios->c_cc[VEOL2]; \
+ k_termio.c_cc[_VSWTC] = k_termios->c_cc[VSWTC]; \
+ \
+ copy_to_user(u_termio, &k_termio, sizeof(k_termio)); \
+})
+
+#define user_termios_to_kernel_termios(k, u) \
+ copy_from_user(k, u, sizeof(struct termios))
+
+#define kernel_termios_to_user_termios(u, k) \
+ copy_to_user(u, k, sizeof(struct termios))
+
+#endif /* _ASM_SW64_TERMIOS_H */
diff --git a/arch/sw_64/include/asm/thread_info.h b/arch/sw_64/include/asm/thread_info.h
new file mode 100644
index 000000000000..cffb09fc6262
--- /dev/null
+++ b/arch/sw_64/include/asm/thread_info.h
@@ -0,0 +1,133 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_THREAD_INFO_H
+#define _ASM_SW64_THREAD_INFO_H
+
+#ifdef __KERNEL__
+
+#ifndef __ASSEMBLY__
+#include <asm/processor.h>
+#include <asm/types.h>
+#include <asm/sysinfo.h>
+
+struct pcb_struct {
+ unsigned long ksp;
+ unsigned long usp;
+ unsigned long ptbr;
+ unsigned int pcc;
+ unsigned int asn;
+ unsigned long unique;
+ unsigned long flags;
+ unsigned long da_match, da_mask;
+ unsigned long dv_match, dv_mask;
+ unsigned long dc_ctl;
+};
+
+struct thread_info {
+ struct pcb_struct pcb; /* hmcode state */
+
+ struct task_struct *task; /* main task structure */
+ unsigned int flags; /* low level flags */
+ unsigned int ieee_state; /* see fpu.h */
+
+ mm_segment_t addr_limit; /* thread address space */
+ unsigned int cpu; /* current CPU */
+ int preempt_count; /* 0 => preemptible, <0 => BUG */
+ unsigned int status; /* thread-synchronous flags */
+
+ int bpt_nsaved;
+ unsigned long bpt_addr[2]; /* breakpoint handling */
+ unsigned int bpt_insn[2];
+#ifdef CONFIG_DYNAMIC_FTRACE
+ unsigned long dyn_ftrace_addr;
+#endif
+};
+
+/*
+ * Macros/functions for gaining access to the thread information structure.
+ */
+#define INIT_THREAD_INFO(tsk) \
+{ \
+ .task = &tsk, \
+ .addr_limit = KERNEL_DS, \
+ .preempt_count = INIT_PREEMPT_COUNT, \
+}
+
+/* How to get the thread information struct from C. */
+register struct thread_info *__current_thread_info __asm__("$8");
+#define current_thread_info() __current_thread_info
+
+#endif /* __ASSEMBLY__ */
+
+/* Thread information allocation. */
+#define THREAD_SIZE_ORDER 1
+#define THREAD_SIZE (2 * PAGE_SIZE)
+
+/*
+ * Thread information flags:
+ * - these are process state flags and used from assembly
+ * - pending work-to-be-done flags come first and must be assigned to be
+ * within bits 0 to 7 to fit in an immediate operand.
+ *
+ * TIF_SYSCALL_TRACE is known to be 0 via blbs.
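+ *
+ * (blbs, "branch if low bit set", tests bit 0 of a register in a
+ * single instruction, so entry code can check TIF_SYSCALL_TRACE
+ * without building a mask first.  Illustrative only, with a made-up
+ * offset name:
+ *	ldw	$1, TI_FLAGS($8)
+ *	blbs	$1, handle_strace
+ * where $8 is __current_thread_info as declared above.)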
+ */ +#define TIF_SYSCALL_TRACE 0 /* syscall trace active */ +#define TIF_NOTIFY_RESUME 1 /* callback before returning to user */ +#define TIF_SIGPENDING 2 /* signal pending */ +#define TIF_NEED_RESCHED 3 /* rescheduling necessary */ +#define TIF_SYSCALL_AUDIT 4 /* syscall audit active */ +#define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */ +#define TIF_DIE_IF_KERNEL 9 /* dik recursion lock */ +#define TIF_SYSCALL_TRACEPOINT 10 +#define TIF_SECCOMP 11 /* secure computing */ +#define TIF_MEMDIE 13 /* is terminating due to OOM killer */ +#define TIF_POLLING_NRFLAG 14 /* idle is polling for TIF_NEED_RESCHED */ + +#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) +#define _TIF_SIGPENDING (1 << TIF_SIGPENDING) +#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) +#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) +#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) +#define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG) +#define _TIF_SECCOMP (1 << TIF_SECCOMP) +#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT) +#define _TIF_UPROBE (1 << TIF_UPROBE) + +/* Work to do on interrupt/exception return. */ +#define _TIF_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ + _TIF_NOTIFY_RESUME | _TIF_UPROBE) + +#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ + _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP) + +/* Work to do on any return to userspace. */ +#define _TIF_ALLWORK_MASK (_TIF_WORK_MASK | _TIF_SYSCALL_TRACE) + +#define TS_UAC_NOPRINT 0x0001 /* ! Preserve the following three */ +#define TS_UAC_NOFIX 0x0002 /* ! flags as they match */ +#define TS_UAC_SIGBUS 0x0004 /* ! userspace part of 'prctl' */ + +#define SET_UNALIGN_CTL(task, value) ({ \ + __u32 status = task_thread_info(task)->status & ~UAC_BITMASK; \ + if (value & PR_UNALIGN_NOPRINT) \ + status |= TS_UAC_NOPRINT; \ + if (value & PR_UNALIGN_SIGBUS) \ + status |= TS_UAC_SIGBUS; \ + if (value & PR_NOFIX) /* sw-specific */ \ + status |= TS_UAC_NOFIX; \ + task_thread_info(task)->status = status; \ + 0; }) + +#define GET_UNALIGN_CTL(task, value) ({ \ + __u32 status = task_thread_info(task)->status & ~UAC_BITMASK; \ + __u32 res = 0; \ + if (status & TS_UAC_NOPRINT) \ + res |= PR_UNALIGN_NOPRINT; \ + if (status & TS_UAC_SIGBUS) \ + res |= PR_UNALIGN_SIGBUS; \ + if (status & TS_UAC_NOFIX) \ + res |= PR_NOFIX; \ + put_user(res, (int __user *)(value)); \ + }) + +#endif /* __KERNEL__ */ +#endif /* _ASM_SW64_THREAD_INFO_H */ diff --git a/arch/sw_64/include/asm/timex.h b/arch/sw_64/include/asm/timex.h new file mode 100644 index 000000000000..9065e39a0466 --- /dev/null +++ b/arch/sw_64/include/asm/timex.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TIMEX_H +#define _ASM_SW64_TIMEX_H + +#include <asm/tc.h> + +/* + * With only one or two oddballs, we use the RTC as the ticker, selecting + * the 32.768kHz reference clock, which nicely divides down to our HZ. + */ +#define CLOCK_TICK_RATE 32768 + +/* + * Standard way to access the cycle counter. 
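+ *
+ * get_cycles() below just wraps rdtc() from <asm/tc.h>.  A typical
+ * (purely illustrative) use is coarse delta timing:
+ *
+ *	cycles_t t0 = get_cycles();
+ *	do_work();
+ *	pr_debug("took %lu cycles\n", get_cycles() - t0);
+ *
+ * where do_work() is a placeholder.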
+ */ + +typedef unsigned long cycles_t; + +static inline cycles_t get_cycles(void) +{ + return rdtc(); +} + +#endif diff --git a/arch/sw_64/include/asm/tlb.h b/arch/sw_64/include/asm/tlb.h new file mode 100644 index 000000000000..4902624dba88 --- /dev/null +++ b/arch/sw_64/include/asm/tlb.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TLB_H +#define _ASM_SW64_TLB_H + +#define tlb_start_vma(tlb, vma) do { } while (0) +#define tlb_end_vma(tlb, vma) do { } while (0) +#define __tlb_remove_tlb_entry(tlb, pte, addr) do { } while (0) + +#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) + +#include <asm-generic/tlb.h> + +#define __pte_free_tlb(tlb, pte, address) pte_free((tlb)->mm, pte) +#define __pmd_free_tlb(tlb, pmd, address) pmd_free((tlb)->mm, pmd) + +#define __pud_free_tlb(tlb, pud, address) pud_free((tlb)->mm, pud) + +#endif diff --git a/arch/sw_64/include/asm/tlbflush.h b/arch/sw_64/include/asm/tlbflush.h new file mode 100644 index 000000000000..7805bb287257 --- /dev/null +++ b/arch/sw_64/include/asm/tlbflush.h @@ -0,0 +1,119 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TLBFLUSH_H +#define _ASM_SW64_TLBFLUSH_H + +#include <linux/mm.h> +#include <linux/sched.h> +#include <asm/compiler.h> +#include <asm/pgalloc.h> +#include <asm/hw_init.h> +#include <asm/hmcall.h> + +extern void __load_new_mm_context(struct mm_struct *); + + +static inline void flush_tlb_current(struct mm_struct *mm) +{ + __load_new_mm_context(mm); +} + +/* + * Flush just one page in the current TLB set. We need to be very + * careful about the icache here, there is no way to invalidate a + * specific icache page. + */ + +static inline void flush_tlb_current_page(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long addr) +{ + if (vma->vm_flags & VM_EXEC) { + tbi(3, addr); + if (icache_is_vivt_no_ictag()) + imb(); + } else + tbi(2, addr); +} + + +/* Flush current user mapping. */ +static inline void flush_tlb(void) +{ + flush_tlb_current(current->active_mm); +} + +/* Flush someone else's user mapping. */ +static inline void flush_tlb_other(struct mm_struct *mm) +{ + unsigned long *mmc; + + if (mm) { + mmc = &mm->context.asid[smp_processor_id()]; + /* + * Check it's not zero first to avoid cacheline ping pong + * when possible. + */ + if (*mmc) + *mmc = 0; + } +} + +#ifndef CONFIG_SMP +/* + * Flush everything (kernel mapping may also have changed + * due to vmalloc/vfree). + */ +static inline void flush_tlb_all(void) +{ + tbia(); +} + +/* Flush a specified user mapping. */ +static inline void +flush_tlb_mm(struct mm_struct *mm) +{ + if (mm == current->mm) + flush_tlb_current(mm); + else + flush_tlb_other(mm); +} + +/* Page-granular tlb flush. */ +static inline void flush_tlb_page(struct vm_area_struct *vma, + unsigned long addr) +{ + struct mm_struct *mm = vma->vm_mm; + + if (mm == current->mm) + flush_tlb_current_page(mm, vma, addr); + else + flush_tlb_other(mm); +} + +/* + * Flush a specified range of user mapping. On the sw64 we flush + * the whole user tlb. 
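+ *
+ * (A page-at-a-time loop,
+ *	for (addr = start & PAGE_MASK; addr < end; addr += PAGE_SIZE)
+ *		flush_tlb_page(vma, addr);
+ * would be the fine-grained alternative; flushing the whole mm is
+ * used here instead, trading precision for a single invalidate.)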
+ */ +static inline void flush_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + flush_tlb_mm(vma->vm_mm); +} + +#else /* CONFIG_SMP */ + +extern void flush_tlb_all(void); +extern void flush_tlb_mm(struct mm_struct *); +extern void flush_tlb_page(struct vm_area_struct *, unsigned long); +extern void flush_tlb_range(struct vm_area_struct *, unsigned long, + unsigned long); + +#endif /* CONFIG_SMP */ + +static inline void flush_tlb_kernel_range(unsigned long start, + unsigned long end) +{ + flush_tlb_all(); +} + +#endif /* _ASM_SW64_TLBFLUSH_H */ diff --git a/arch/sw_64/include/asm/topology.h b/arch/sw_64/include/asm/topology.h new file mode 100644 index 000000000000..79af6349fe80 --- /dev/null +++ b/arch/sw_64/include/asm/topology.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TOPOLOGY_H +#define _ASM_SW64_TOPOLOGY_H + +#include <linux/smp.h> +#include <linux/threads.h> +#include <linux/cpumask.h> +#include <asm/core.h> +#include <asm/smp.h> + +#define THREAD_ID_SHIFT 5 +#define THREAD_ID_MASK 1 +#define CORE_ID_MASK ((1 << THREAD_ID_SHIFT) - 1) + +extern struct cpu_topology cpu_topology[NR_CPUS]; + +#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id) +#define topology_core_id(cpu) (cpu_topology[cpu].core_id) +#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling) +#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling) +#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling) + +void init_cpu_topology(void); +void store_cpu_topology(int cpuid); +void remove_cpu_topology(int cpuid); +const struct cpumask *cpu_coregroup_mask(int cpu); + +static inline int rcid_to_package(int rcid) +{ + return rcid >> CORES_PER_NODE_SHIFT; +} + +#ifdef CONFIG_NUMA + +#ifndef CONFIG_USE_PERCPU_NUMA_NODE_ID +extern int cpuid_to_nid(int cpuid); +static inline int cpu_to_node(int cpu) +{ + int node; + + node = cpuid_to_nid(cpu); + +#ifdef DEBUG_NUMA + BUG_ON(node < 0); +#endif + + return node; +} + +static inline void set_cpu_numa_node(int cpu, int node) { } +#endif /* CONFIG_USE_PERCPU_NUMA_NODE_ID */ + +extern const struct cpumask *cpumask_of_node(int node); +extern void numa_add_cpu(unsigned int cpu); +extern void numa_remove_cpu(unsigned int cpu); +extern void numa_store_cpu_info(unsigned int cpu); +#define parent_node(node) (node) +#define cpumask_of_pcibus(bus) (cpu_online_mask) +#else /* !CONFIG_NUMA */ +static inline void numa_add_cpu(unsigned int cpu) { } +static inline void numa_remove_cpu(unsigned int cpu) { } +static inline void numa_store_cpu_info(unsigned int cpu) { } +#endif /* CONFIG_NUMA */ +#include <asm-generic/topology.h> + +static inline void arch_fix_phys_package_id(int num, u32 slot) { } + +#endif /* _ASM_SW64_TOPOLOGY_H */ diff --git a/arch/sw_64/include/asm/trace_clock.h b/arch/sw_64/include/asm/trace_clock.h new file mode 100644 index 000000000000..57324215a837 --- /dev/null +++ b/arch/sw_64/include/asm/trace_clock.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TRACE_CLOCK_H +#define _ASM_SW64_TRACE_CLOCK_H + +#include <linux/compiler.h> +#include <linux/types.h> + +#define ARCH_TRACE_CLOCKS + +#endif /* _ASM_SW64_TRACE_CLOCK_H */ diff --git a/arch/sw_64/include/asm/types.h b/arch/sw_64/include/asm/types.h new file mode 100644 index 000000000000..37d626269a02 --- /dev/null +++ b/arch/sw_64/include/asm/types.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_TYPES_H +#define _ASM_SW64_TYPES_H + +#include 
<asm-generic/int-ll64.h>
+
+#endif /* _ASM_SW64_TYPES_H */
diff --git a/arch/sw_64/include/asm/uaccess.h b/arch/sw_64/include/asm/uaccess.h
new file mode 100644
index 000000000000..ceacfaa07cfb
--- /dev/null
+++ b/arch/sw_64/include/asm/uaccess.h
@@ -0,0 +1,313 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_UACCESS_H
+#define _ASM_SW64_UACCESS_H
+
+/*
+ * The fs value determines whether argument validity checking should be
+ * performed or not. If get_fs() == USER_DS, checking is performed, with
+ * get_fs() == KERNEL_DS, checking is bypassed.
+ *
+ * Or at least it did once upon a time. Nowadays it is a mask that
+ * defines which bits of the address space are off limits. This is a
+ * wee bit faster than the above.
+ *
+ * For historical reasons, these macros are grossly misnamed.
+ */
+
+#define KERNEL_DS ((mm_segment_t) { 0UL })
+#define USER_DS ((mm_segment_t) { -0x10000000000000UL })
+
+#define get_fs() (current_thread_info()->addr_limit)
+#define get_ds() (KERNEL_DS)
+#define set_fs(x) (current_thread_info()->addr_limit = (x))
+
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
+
+/*
+ * Is an address valid? This does a straightforward calculation rather
+ * than tests.
+ *
+ * Address valid if:
+ * - "addr" doesn't have any high-bits set
+ * - AND "size" doesn't have any high-bits set
+ * - AND "addr+size-(size != 0)" doesn't have any high-bits set
+ * - OR we are in kernel mode.
+ */
+#define __access_ok(addr, size) ({ \
+ unsigned long __ao_a = (addr), __ao_b = (size); \
+ unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b; \
+ (get_fs().seg & (__ao_a | __ao_b | __ao_end)) == 0; })
+
+#define access_ok(addr, size) \
+({ \
+ __chk_user_ptr(addr); \
+ __access_ok(((unsigned long)(addr)), (size)); \
+})
+
+/*
+ * These are the main single-value transfer routines. They automatically
+ * use the right size if we just have the right pointer type.
+ *
+ * As the sw64 uses the same address space for kernel and user
+ * data, we can just do these as direct assignments. (Of course, the
+ * exception handling means that it's no longer "just"...)
+ *
+ * Careful to not
+ * (a) re-use the arguments for side effects (sizeof/typeof is ok)
+ * (b) require any knowledge of processes at this stage
+ */
+#define put_user(x, ptr) \
+ __put_user_check((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
+#define get_user(x, ptr) \
+ __get_user_check((x), (ptr), sizeof(*(ptr)))
+
+/*
+ * The "__xxx" versions do not do address space checking, useful when
+ * doing multiple accesses to the same area (the programmer has to do the
+ * checks by hand with "access_ok()")
+ */
+#define __put_user(x, ptr) \
+ __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
+#define __get_user(x, ptr) \
+ __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
+/*
+ * The "ldi %1, 2b-1b(%0)" bits are magic to get the assembler to
+ * encode the bits we need for resolving the exception. See the
+ * more extensive comments with fixup_inline_exception below for
+ * more information.
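+ *
+ * Roughly: the ".long 1b - ." word records where the faulting access
+ * lives, pc-relative, and the "ldi" that follows is never executed --
+ * the fixup code decodes its register fields and displacement (the
+ * distance from label 1 to label 2) to learn which register to write
+ * and where to resume.  This mirrors the classic alpha scheme; see
+ * the __ex_table entries in the macros below.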
+ */
+
+extern void __get_user_unknown(void);
+
+#define __get_user_nocheck(x, ptr, size) \
+({ \
+ long __gu_err = 0; \
+ unsigned long __gu_val; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: \
+ __get_user_8(ptr); \
+ break; \
+ case 2: \
+ __get_user_16(ptr); \
+ break; \
+ case 4: \
+ __get_user_32(ptr); \
+ break; \
+ case 8: \
+ __get_user_64(ptr); \
+ break; \
+ default: \
+ __get_user_unknown(); \
+ break; \
+ } \
+ (x) = (__force __typeof__(*(ptr))) __gu_val; \
+ __gu_err; \
+})
+
+#define __get_user_check(x, ptr, size) \
+({ \
+ long __gu_err = -EFAULT; \
+ unsigned long __gu_val = 0; \
+ const __typeof__(*(ptr)) __user *__gu_addr = (ptr); \
+ if (__access_ok((unsigned long)__gu_addr, size)) { \
+ __gu_err = 0; \
+ switch (size) { \
+ case 1: \
+ __get_user_8(__gu_addr); \
+ break; \
+ case 2: \
+ __get_user_16(__gu_addr); \
+ break; \
+ case 4: \
+ __get_user_32(__gu_addr); \
+ break; \
+ case 8: \
+ __get_user_64(__gu_addr); \
+ break; \
+ default: \
+ __get_user_unknown(); \
+ break; \
+ } \
+ } \
+ (x) = (__force __typeof__(*(ptr))) __gu_val; \
+ __gu_err; \
+})
+
+struct __large_struct { unsigned long buf[100]; };
+#define __m(x) (*(struct __large_struct __user *)(x))
+
+#define __get_user_64(addr) \
+ __asm__("1: ldl %0,%2\n" \
+ "2:\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi %0, 2b-1b(%1)\n" \
+ ".previous" \
+ : "=r"(__gu_val), "=r"(__gu_err) \
+ : "m"(__m(addr)), "1"(__gu_err))
+
+#define __get_user_32(addr) \
+ __asm__("1: ldw %0,%2\n" \
+ "2:\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi %0, 2b-1b(%1)\n" \
+ ".previous" \
+ : "=r"(__gu_val), "=r"(__gu_err) \
+ : "m"(__m(addr)), "1"(__gu_err))
+
+#define __get_user_16(addr) \
+ __asm__("1: ldhu %0,%2\n" \
+ "2:\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi %0, 2b-1b(%1)\n" \
+ ".previous" \
+ : "=r"(__gu_val), "=r"(__gu_err) \
+ : "m"(__m(addr)), "1"(__gu_err))
+
+#define __get_user_8(addr) \
+ __asm__("1: ldbu %0,%2\n" \
+ "2:\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi %0, 2b-1b(%1)\n" \
+ ".previous" \
+ : "=r"(__gu_val), "=r"(__gu_err) \
+ : "m"(__m(addr)), "1"(__gu_err))
+
+extern void __put_user_unknown(void);
+
+#define __put_user_nocheck(x, ptr, size) \
+({ \
+ long __pu_err = 0; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: \
+ __put_user_8(x, ptr); \
+ break; \
+ case 2: \
+ __put_user_16(x, ptr); \
+ break; \
+ case 4: \
+ __put_user_32(x, ptr); \
+ break; \
+ case 8: \
+ __put_user_64(x, ptr); \
+ break; \
+ default: \
+ __put_user_unknown(); \
+ break; \
+ } \
+ __pu_err; \
+})
+
+#define __put_user_check(x, ptr, size) \
+({ \
+ long __pu_err = -EFAULT; \
+ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \
+ if (__access_ok((unsigned long)__pu_addr, size)) { \
+ __pu_err = 0; \
+ switch (size) { \
+ case 1: \
+ __put_user_8(x, __pu_addr); \
+ break; \
+ case 2: \
+ __put_user_16(x, __pu_addr); \
+ break; \
+ case 4: \
+ __put_user_32(x, __pu_addr); \
+ break; \
+ case 8: \
+ __put_user_64(x, __pu_addr); \
+ break; \
+ default: \
+ __put_user_unknown(); \
+ break; \
+ } \
+ } \
+ __pu_err; \
+})
+
+/*
+ * The "__put_user_xx()" macros tell gcc they read from memory
+ * instead of writing: this is because they do not write to
+ * any memory gcc knows about, so there are no aliasing issues
+ */
+#define __put_user_64(x, addr) \
+__asm__ __volatile__("1: stl %r2, %1\n" \
+ "2:\n" \
+ ".section __ex_table, \"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi $31, 2b-1b(%0)\n" \
+ ".previous" \
+ : "=r"(__pu_err) \
+ : "m" (__m(addr)), "rJ" (x), "0"(__pu_err))
+
+#define __put_user_32(x, addr) \
+__asm__ __volatile__("1: stw %r2, %1\n" \
+ "2:\n" \
+ ".section __ex_table, \"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi $31, 2b-1b(%0)\n" \
+ ".previous" \
+ : "=r"(__pu_err) \
+ : "m"(__m(addr)), "rJ"(x), "0"(__pu_err))
+
+#define __put_user_16(x, addr) \
+__asm__ __volatile__("1: sth %r2, %1\n" \
+ "2:\n" \
+ ".section __ex_table, \"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi $31, 2b-1b(%0)\n" \
+ ".previous" \
+ : "=r"(__pu_err) \
+ : "m"(__m(addr)), "rJ"(x), "0"(__pu_err))
+
+#define __put_user_8(x, addr) \
+__asm__ __volatile__("1: stb %r2, %1\n" \
+ "2:\n" \
+ ".section __ex_table, \"a\"\n" \
+ " .long 1b - .\n" \
+ " ldi $31, 2b-1b(%0)\n" \
+ ".previous" \
+ : "=r"(__pu_err) \
+ : "m"(__m(addr)), "rJ"(x), "0"(__pu_err))
+
+/*
+ * Complex access routines
+ */
+
+extern long __copy_user(void *to, const void *from, long len);
+
+static inline unsigned long
+raw_copy_from_user(void *to, const void __user *from, unsigned long len)
+{
+ return __copy_user(to, (__force const void *)from, len);
+}
+
+static inline unsigned long
+raw_copy_to_user(void __user *to, const void *from, unsigned long len)
+{
+ return __copy_user((__force void *)to, from, len);
+}
+
+extern long __clear_user(void __user *to, long len);
+
+static inline long
+clear_user(void __user *to, long len)
+{
+ if (__access_ok((unsigned long)to, len))
+ len = __clear_user(to, len);
+ return len;
+}
+
+#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
+
+extern long strncpy_from_user(char *dest, const char __user *src, long count);
+extern __must_check long strlen_user(const char __user *str);
+extern __must_check long strnlen_user(const char __user *str, long n);
+
+#include <asm/extable.h>
+#endif /* _ASM_SW64_UACCESS_H */
diff --git a/arch/sw_64/include/asm/ucontext.h b/arch/sw_64/include/asm/ucontext.h
new file mode 100644
index 000000000000..d40eebe988ef
--- /dev/null
+++ b/arch/sw_64/include/asm/ucontext.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_UCONTEXT_H
+#define _ASM_SW64_UCONTEXT_H
+
+struct ucontext {
+ unsigned long uc_flags;
+ struct ucontext *uc_link;
+ old_sigset_t uc_old_sigmask;
+ stack_t uc_stack;
+ struct sigcontext uc_mcontext;
+ sigset_t uc_sigmask; /* mask last for extensibility */
+};
+
+#endif /* _ASM_SW64_UCONTEXT_H */
diff --git a/arch/sw_64/include/asm/unaligned.h b/arch/sw_64/include/asm/unaligned.h
new file mode 100644
index 000000000000..91fdff923ce5
--- /dev/null
+++ b/arch/sw_64/include/asm/unaligned.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_UNALIGNED_H
+#define _ASM_SW64_UNALIGNED_H
+
+#include <linux/unaligned/le_struct.h>
+#include <linux/unaligned/be_byteshift.h>
+#include <linux/unaligned/generic.h>
+
+#define get_unaligned __get_unaligned_le
+#define put_unaligned __put_unaligned_le
+
+#endif /* _ASM_SW64_UNALIGNED_H */
diff --git a/arch/sw_64/include/asm/unistd.h b/arch/sw_64/include/asm/unistd.h
new file mode 100644
index 000000000000..c1778adf4fba
--- /dev/null
+++ b/arch/sw_64/include/asm/unistd.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_UNISTD_H
+#define _ASM_SW64_UNISTD_H
+
+#include <uapi/asm/unistd.h>
+
+#define NR_SYSCALLS 519
+#define NR_syscalls NR_SYSCALLS
+
+#define __ARCH_WANT_NEW_STAT
+#define __ARCH_WANT_OLD_READDIR
+#define __ARCH_WANT_STAT64
+#define __ARCH_WANT_SYS_GETHOSTNAME
+#define __ARCH_WANT_SYS_FADVISE64
+#define __ARCH_WANT_SYS_GETPGRP
+#define __ARCH_WANT_SYS_OLD_GETRLIMIT
+#define __ARCH_WANT_SYS_OLDUMOUNT +#define __ARCH_WANT_SYS_SIGPENDING +#define __ARCH_WANT_SYS_UTIME +#define __ARCH_WANT_SYS_FORK +#define __ARCH_WANT_SYS_VFORK +#define __ARCH_WANT_SYS_CLONE +#define __ARCH_WANT_SYS_SOCKETCALL +#define __ARCH_WANT_SYS_SIGPROCMASK + +#endif /* _ASM_SW64_UNISTD_H */ diff --git a/arch/sw_64/include/asm/uprobes.h b/arch/sw_64/include/asm/uprobes.h new file mode 100644 index 000000000000..97b67af25bce --- /dev/null +++ b/arch/sw_64/include/asm/uprobes.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + */ +#ifndef _ASM_SW64_UPROBES_H +#define _ASM_SW64_UPROBES_H + +#include <linux/notifier.h> +#include <linux/types.h> +#include <asm/insn.h> + +/* + * We want this to be defined as union sw64_instruction but that makes the + * generic code blow up. + */ +typedef u32 uprobe_opcode_t; + +#define MAX_UINSN_BYTES SW64_INSN_SIZE +#define UPROBE_XOL_SLOT_BYTES SW64_INSN_SIZE + +#define UPROBE_BRK_UPROBE 0x000d000d /* break 13 */ +#define UPROBE_BRK_UPROBE_XOL 0x000e000d /* break 14 */ + +#define UPROBE_SWBP_INSN UPROBE_BRK_UPROBE +#define UPROBE_SWBP_INSN_SIZE MAX_UINSN_BYTES + +struct arch_uprobe { + u32 insn; + u32 ixol[2]; +}; + +struct arch_uprobe_task { + unsigned long saved_trap_nr; +}; + +#endif /* _ASM_SW64_UPROBES_H */ diff --git a/arch/sw_64/include/asm/user.h b/arch/sw_64/include/asm/user.h new file mode 100644 index 000000000000..a6ff58097ea3 --- /dev/null +++ b/arch/sw_64/include/asm/user.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_USER_H +#define _ASM_SW64_USER_H + +#include <linux/sched.h> +#include <linux/ptrace.h> + +#include <asm/page.h> +#include <asm/reg.h> + +/* + * Core file format: The core file is written in such a way that gdb + * can understand it and provide useful information to the user (under + * linux we use the `trad-core' bfd). The file contents are as follows: + * + * upage: 1 page consisting of a user struct that tells gdb + * what is present in the file. Directly after this is a + * copy of the task_struct, which is currently not used by gdb, + * but it may come in handy at some point. All of the registers + * are stored as part of the upage. The upage should always be + * only one page long. + * data: The data segment follows next. We use current->end_text to + * current->brk to pick up all of the user variables, plus any memory + * that may have been sbrk'ed. No attempt is made to determine if a + * page is demand-zero or if a page is totally unused, we just cover + * the entire range. All of the addresses are rounded in such a way + * that an integral number of pages is written. + * stack: We need the stack information in order to get a meaningful + * backtrace. We need to write the data from usp to + * current->start_stack, so we round each of these in order to be able + * to write an integer number of pages. 
+ */ +struct user { + unsigned long regs[EF_SIZE/8+32]; /* integer and fp regs */ + size_t u_tsize; /* text size (pages) */ + size_t u_dsize; /* data size (pages) */ + size_t u_ssize; /* stack size (pages) */ + unsigned long start_code; /* text starting address */ + unsigned long start_data; /* data starting address */ + unsigned long start_stack; /* stack starting address */ + long signal; /* signal causing core dump */ + unsigned long u_ar0; /* help gdb find registers */ + unsigned long magic; /* identifies a core file */ + char u_comm[32]; /* user command name */ +}; + +#define NBPG PAGE_SIZE +#define UPAGES 1 +#define HOST_TEXT_START_ADDR (u.start_code) +#define HOST_DATA_START_ADDR (u.start_data) +#define HOST_STACK_END_ADDR (u.start_stack + u.u_ssize * NBPG) + +#endif /* _ASM_SW64_USER_H */ diff --git a/arch/sw_64/include/asm/vcpu.h b/arch/sw_64/include/asm/vcpu.h new file mode 100644 index 000000000000..5b3fe80aed1b --- /dev/null +++ b/arch/sw_64/include/asm/vcpu.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_VCPU_H +#define _ASM_SW64_VCPU_H + +#ifndef __ASSEMBLY__ + +struct vcpucb { + unsigned long go_flag; + unsigned long pcbb; + unsigned long ksp; + unsigned long usp; + unsigned long kgp; + unsigned long ent_arith; + unsigned long ent_if; + unsigned long ent_int; + unsigned long ent_mm; + unsigned long ent_sys; + unsigned long ent_una; + unsigned long stack_pc; + unsigned long new_a0; + unsigned long new_a1; + unsigned long new_a2; + unsigned long whami; + unsigned long csr_save; + unsigned long wakeup_magic; + unsigned long host_vcpucb; + unsigned long upcr; + unsigned long vpcr; + unsigned long dtb_pcr; + unsigned long guest_ksp; + unsigned long guest_usp; + unsigned long vcpu_irq_disabled; + unsigned long vcpu_irq; + unsigned long ptbr; + unsigned long int_stat0; + unsigned long int_stat1; + unsigned long int_stat2; + unsigned long int_stat3; + unsigned long reset_entry; + unsigned long pvcpu; + unsigned long exit_reason; + unsigned long ipaddr; + unsigned long vcpu_irq_vector; +}; + +#endif /* __ASSEMBLY__ */ +#endif /* _ASM_SW64_VCPU_H */ diff --git a/arch/sw_64/include/asm/vdso.h b/arch/sw_64/include/asm/vdso.h new file mode 100644 index 000000000000..8ecd5add42ad --- /dev/null +++ b/arch/sw_64/include/asm/vdso.h @@ -0,0 +1,116 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2020 SW64 Limited + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ +#ifndef _ASM_SW64_VDSO_H +#define _ASM_SW64_VDSO_H + +#ifdef __KERNEL__ + +/* + * Default link address for the vDSO. + * Since we randomise the VDSO mapping, there's little point in trying + * to prelink this. 
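+ * (Instead, the vDSO locates itself at run time: get_vdso_base()
+ * below derives the mapping address from the current pc and
+ * page-aligns it, and get_vdso_data() finds the data page one page
+ * below that base.)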
+ */ +#define VDSO_LBASE 0x0 + +#ifndef __ASSEMBLY__ + +#include <asm/page.h> +#include <asm/sw64io.h> +#include <asm/processor.h> +#define VDSO_SYMBOL(base, name) \ +({ \ + extern const unsigned long __vdso_##name; \ + ((unsigned long)(base) + __vdso_##name); \ +}) + + +struct vdso_data { + u64 xtime_sec; + u64 xtime_nsec; + u32 wall_to_mono_sec; + u32 wall_to_mono_nsec; + u32 cs_shift; + u32 cs_mult; + u64 cs_cycle_last; + u64 cs_mask; + s32 tz_minuteswest; + s32 tz_dsttime; + u32 seq_count; +}; + +static inline unsigned long get_vdso_base(void) +{ + unsigned long addr, tmp; + __asm__ __volatile__( + " br %1, 1f\n" + "1: ldi %0, 0(%1)\n" + : "=r" (addr), "=&r" (tmp) + ::); + + addr &= ~(PAGE_SIZE - 1); + return addr; +} + +static inline const struct vdso_data *get_vdso_data(void) +{ + return (const struct vdso_data *)(get_vdso_base() - PAGE_SIZE); +} + +static inline u32 vdso_data_read_begin(const struct vdso_data *data) +{ + u32 seq; + + while (true) { + seq = READ_ONCE(data->seq_count); + if (likely(!(seq & 1))) { + /* Paired with smp_wmb() in vdso_data_write_*(). */ + smp_rmb(); + return seq; + } + + cpu_relax(); + } +} + +static inline bool vdso_data_read_retry(const struct vdso_data *data, + u32 start_seq) +{ + /* Paired with smp_wmb() in vdso_data_write_*(). */ + smp_rmb(); + return unlikely(data->seq_count != start_seq); +} + +static inline void vdso_data_write_begin(struct vdso_data *data) +{ + ++data->seq_count; + + /* Ensure sequence update is written before other data page values. */ + smp_wmb(); +} + +static inline void vdso_data_write_end(struct vdso_data *data) +{ + /* Ensure data values are written before updating sequence again. */ + smp_wmb(); + ++data->seq_count; +} + + +#endif /* !__ASSEMBLY__ */ + +#endif /* __KERNEL__ */ +#endif /* _ASM_SW64_VDSO_H */ diff --git a/arch/sw_64/include/asm/vga.h b/arch/sw_64/include/asm/vga.h new file mode 100644 index 000000000000..3ca5c397b946 --- /dev/null +++ b/arch/sw_64/include/asm/vga.h @@ -0,0 +1,85 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Access to VGA videoram + * + * (c) 1998 Martin Mares mj@ucw.cz + */ + +#ifndef _ASM_SW64_VGA_H +#define _ASM_SW64_VGA_H + +#include <asm/io.h> + +#define VT_BUF_HAVE_RW +#define VT_BUF_HAVE_MEMSETW +#define VT_BUF_HAVE_MEMCPYW + +static inline void scr_writew(u16 val, volatile u16 *addr) +{ + if (__is_ioaddr(addr)) + __raw_writew(val, (volatile u16 __iomem *) addr); + else + *addr = val; +} + +static inline u16 scr_readw(volatile const u16 *addr) +{ + if (__is_ioaddr(addr)) + return __raw_readw((volatile const u16 __iomem *) addr); + else + return *addr; +} + +static inline void scr_memsetw(u16 *s, u16 c, unsigned int count) +{ + if (__is_ioaddr(s)) + memsetw_io((u16 __iomem *) s, c, count); + else + memsetw(s, c, count); +} + +/* Do not trust that the usage will be correct; analyze the arguments. */ +extern void scr_memcpyw(u16 *d, const u16 *s, unsigned int count); + +/* + * ??? These are currently only used for downloading character sets. As + * such, they don't need memory barriers. Is this all they are intended + * to be used for? 
+ */ +#define vga_readb(a) readb((u8 __iomem *)(a)) +#define vga_writeb(v, a) writeb(v, (u8 __iomem *)(a)) + +#ifdef CONFIG_VGA_HOSE +#include <linux/ioport.h> +#include <linux/pci.h> + +extern struct pci_controller *pci_vga_hose; + +# define __is_port_vga(a) \ + (((a) >= 0x3b0) && ((a) < 0x3e0) && \ + ((a) != 0x3b3) && ((a) != 0x3d3)) + +# define __is_mem_vga(a) \ + (((a) >= 0xa0000) && ((a) <= 0xc0000)) + +# define FIXUP_IOADDR_VGA(a) do { \ + if (pci_vga_hose && __is_port_vga(a)) \ + (a) += pci_vga_hose->io_space->start; \ +} while (0) + +# define FIXUP_MEMADDR_VGA(a) do { \ + if (pci_vga_hose && __is_mem_vga(a)) \ + (a) += pci_vga_hose->mem_space->start; \ +} while (0) + +#else /* CONFIG_VGA_HOSE */ +# define pci_vga_hose 0 +# define __is_port_vga(a) 0 +# define __is_mem_vga(a) 0 +# define FIXUP_IOADDR_VGA(a) +# define FIXUP_MEMADDR_VGA(a) +#endif /* CONFIG_VGA_HOSE */ + +#define VGA_MAP_MEM(x, s) ((unsigned long)ioremap(x, s)) + +#endif diff --git a/arch/sw_64/include/asm/vmalloc.h b/arch/sw_64/include/asm/vmalloc.h new file mode 100644 index 000000000000..a76d1133d6c6 --- /dev/null +++ b/arch/sw_64/include/asm/vmalloc.h @@ -0,0 +1,5 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_VMALLOC_H +#define _ASM_SW64_VMALLOC_H + +#endif /* _ASM_SW64_VMALLOC_H */ diff --git a/arch/sw_64/include/asm/word-at-a-time.h b/arch/sw_64/include/asm/word-at-a-time.h new file mode 100644 index 000000000000..623efbec4429 --- /dev/null +++ b/arch/sw_64/include/asm/word-at-a-time.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SW64_WORD_AT_A_TIME_H +#define _ASM_SW64_WORD_AT_A_TIME_H + +#include <asm/compiler.h> + +/* + * word-at-a-time interface for SW64. + */ + +/* + * We do not use the word_at_a_time struct on SW64, but it needs to be + * implemented to humour the generic code. + */ +struct word_at_a_time { + const unsigned long unused; +}; + +#define WORD_AT_A_TIME_CONSTANTS { 0 } + +/* Return nonzero if val has a zero */ +static inline unsigned long has_zero(unsigned long val, unsigned long *bits, const struct word_at_a_time *c) +{ + unsigned long zero_locations = __kernel_cmpgeb(0, val); + *bits = zero_locations; + return zero_locations; +} + +static inline unsigned long prep_zero_mask(unsigned long val, unsigned long bits, const struct word_at_a_time *c) +{ + return bits; +} + +#define create_zero_mask(bits) (bits) + +static inline unsigned long find_zero(unsigned long bits) +{ + return __kernel_cttz(bits); +} + +#define zero_bytemask(mask) ((2ul << (find_zero(mask) * 8)) - 1) + +#endif /* _ASM_SW64_WORD_AT_A_TIME_H */ diff --git a/arch/sw_64/include/asm/wrperfmon.h b/arch/sw_64/include/asm/wrperfmon.h new file mode 100644 index 000000000000..eaa6735b5a25 --- /dev/null +++ b/arch/sw_64/include/asm/wrperfmon.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Definitions for use with the sw64 wrperfmon HMCODE call. 
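+ *
+ * The raw event numbers appear to be partitioned per counter: pc0
+ * events are used as-is while pc1 events sit at PC1_RAW_BASE, so a
+ * single raw config value names both counter and event, e.g.
+ * (illustrative)
+ *	config = PC1_RAW_BASE + PC1_BRANCH_MISSES;	(= 0x10B)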
+ */
+
+#ifndef _ASM_SW64_WRPERFMON_H
+#define _ASM_SW64_WRPERFMON_H
+
+#define PERFMON_PC0 0
+#define PERFMON_PC1 1
+
+/* Following commands are implemented on all CPUs */
+#define PERFMON_CMD_DISABLE 0
+#define PERFMON_CMD_ENABLE 1
+#define PERFMON_CMD_EVENT_PC0 2
+#define PERFMON_CMD_EVENT_PC1 3
+#define PERFMON_CMD_PM 4
+#define PERFMON_CMD_READ 5
+#define PERFMON_CMD_READ_CLEAR 6
+#define PERFMON_CMD_WRITE_PC0 7
+#define PERFMON_CMD_WRITE_PC1 8
+
+#define PERFMON_DISABLE_ARGS_PC0 1
+#define PERFMON_DISABLE_ARGS_PC1 2
+#define PERFMON_DISABLE_ARGS_PC 3
+
+#define PERFMON_ENABLE_ARGS_PC0 1
+#define PERFMON_ENABLE_ARGS_PC1 2
+#define PERFMON_ENABLE_ARGS_PC 3
+
+#define PERFMON_READ_PC0 0
+#define PERFMON_READ_PC1 1
+
+#define PC0_RAW_BASE 0x0
+#define PC1_RAW_BASE 0x100
+#define PC0_MIN 0x0
+#define PC0_MAX 0xF
+#define PC1_MIN 0x0
+#define PC1_MAX 0x37
+
+/* pc0 events */
+#define PC0_INSTRUCTIONS 0x0
+#define PC0_BRANCH_INSTRUCTIONS 0x3
+#define PC0_CPU_CYCLES 0x8
+#define PC0_ITB_READ 0x9
+#define PC0_DTB_READ 0xA
+#define PC0_ICACHE_READ 0xB
+#define PC0_DCACHE_READ 0xC
+#define PC0_SCACHE_REFERENCES 0xD
+
+/* pc1 events */
+#define PC1_BRANCH_MISSES 0xB
+#define PC1_SCACHE_MISSES 0x10
+#define PC1_ICACHE_READ_MISSES 0x16
+#define PC1_ITB_MISSES 0x17
+#define PC1_DTB_SINGLE_MISSES 0x30
+#define PC1_DCACHE_MISSES 0x32
+
+#define MAX_HWEVENTS 2
+#define PMC_COUNT_MASK ((1UL << 58) - 1)
+
+#endif
diff --git a/arch/sw_64/include/asm/xchg.h b/arch/sw_64/include/asm/xchg.h
new file mode 100644
index 000000000000..bac67623da91
--- /dev/null
+++ b/arch/sw_64/include/asm/xchg.h
@@ -0,0 +1,328 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SW64_CMPXCHG_H
+#error Do not include xchg.h directly!
+#else
+/*
+ * xchg/xchg_local and cmpxchg/cmpxchg_local share the same code
+ * except that the local versions do not have the expensive memory barrier.
+ * So this file is included twice from asm/cmpxchg.h.
+ */
+
+/*
+ * Atomic exchange.
+ * Since it can be used to implement critical sections
+ * it must clobber "memory" (also for interrupts in UP).
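+ *
+ * The "memory" clobber is what makes, say, a (purely illustrative)
+ * test-and-set lock built on top of xchg() safe:
+ *
+ *	while (xchg(&lock, 1))
+ *		cpu_relax();
+ *	... critical section ...
+ *	xchg(&lock, 0);
+ *
+ * Without the clobber the compiler would be free to move critical-
+ * section accesses across the exchange.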
+ */
+
+static inline unsigned long
+____xchg(_u8, volatile char *m, unsigned long val)
+{
+ unsigned long ret, tmp, addr64;
+
+ __asm__ __volatile__(
+#ifdef CONFIG_LOCK_MEMB
+ " memb\n"
+#endif
+
+ " andnot %4, 7, %3\n"
+ " inslb %1, %4, %1\n"
+ "1: lldl %2, 0(%3)\n"
+ " ldi %0, 1\n"
+ " wr_f %0\n"
+ " extlb %2, %4, %0\n"
+ " masklb %2, %4, %2\n"
+ " or %1, %2, %2\n"
+#ifdef CONFIG_LOCK_FIXUP
+ " memb\n"
+#endif
+ " lstl %2, 0(%3)\n"
+ " rd_f %2\n"
+ " beq %2, 2f\n"
+ ".subsection 2\n"
+ "2: br 1b\n"
+ ".previous"
+ : "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
+ : "r" ((long)m), "1" (val) : "memory");
+
+ return ret;
+}
+
+static inline unsigned long
+____xchg(_u16, volatile short *m, unsigned long val)
+{
+ unsigned long ret, tmp, addr64;
+
+ __asm__ __volatile__(
+#ifdef CONFIG_LOCK_MEMB
+ " memb\n"
+#endif
+ " andnot %4, 7, %3\n"
+ " inslh %1, %4, %1\n"
+ "1: lldl %2, 0(%3)\n"
+ " ldi %0, 1\n"
+ " wr_f %0\n"
+ " extlh %2, %4, %0\n"
+ " masklh %2, %4, %2\n"
+ " or %1, %2, %2\n"
+#ifdef CONFIG_LOCK_FIXUP
+ " memb\n"
+#endif
+ " lstl %2, 0(%3)\n"
+ " rd_f %2\n"
+ " beq %2, 2f\n"
+ ".subsection 2\n"
+ "2: br 1b\n"
+ ".previous"
+ : "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
+ : "r" ((long)m), "1" (val) : "memory");
+
+ return ret;
+}
+
+static inline unsigned long
+____xchg(_u32, volatile int *m, unsigned long val)
+{
+ unsigned long dummy, addr;
+
+ __asm__ __volatile__(
+#ifdef CONFIG_LOCK_MEMB
+ " memb\n"
+#endif
+ " ldi %3, %5\n"
+ "1: lldw %0, 0(%3)\n"
+ " ldi %1, 1\n"
+ " wr_f %1\n"
+ " bis $31, %4, %1\n"
+#ifdef CONFIG_LOCK_FIXUP
+ " memb\n"
+#endif
+ " lstw %1, 0(%3)\n"
+ " rd_f %1\n"
+ " beq %1, 2f\n"
+ ".subsection 2\n"
+ "2: br 1b\n"
+ ".previous"
+ : "=&r" (val), "=&r" (dummy), "=m" (*m), "=&r"(addr)
+ : "rI" (val), "m" (*m) : "memory");
+
+ return val;
+}
+
+static inline unsigned long
+____xchg(_u64, volatile long *m, unsigned long val)
+{
+ unsigned long dummy, addr;
+
+ __asm__ __volatile__(
+#ifdef CONFIG_LOCK_MEMB
+ " memb\n"
+#endif
+ " ldi %3, %5\n"
+ "1: lldl %0, 0(%3)\n"
+ " ldi %1, 1\n"
+ " wr_f %1\n"
+ " bis $31, %4, %1\n"
+#ifdef CONFIG_LOCK_FIXUP
+ " memb\n"
+#endif
+ " lstl %1, 0(%3)\n"
+ " rd_f %1\n"
+ " beq %1, 2f\n"
+ ".subsection 2\n"
+ "2: br 1b\n"
+ ".previous"
+ : "=&r" (val), "=&r" (dummy), "=m" (*m), "=&r"(addr)
+ : "rI" (val), "m" (*m) : "memory");
+
+ return val;
+}
+
+/*
+ * This function doesn't exist, so you'll get a linker error
+ * if something tries to do an invalid xchg().
+ */
+extern void __xchg_called_with_bad_pointer(void);
+
+static __always_inline unsigned long
+____xchg(, volatile void *ptr, unsigned long x, int size)
+{
+ switch (size) {
+ case 1:
+ return ____xchg(_u8, ptr, x);
+ case 2:
+ return ____xchg(_u16, ptr, x);
+ case 4:
+ return ____xchg(_u32, ptr, x);
+ case 8:
+ return ____xchg(_u64, ptr, x);
+ }
+ __xchg_called_with_bad_pointer();
+ return x;
+}
+
+/*
+ * Atomic compare and exchange. Compare OLD with MEM, if identical,
+ * store NEW in MEM. Return the initial value in MEM. Success is
+ * indicated by comparing RETURN with OLD.
+ *
+ * The memory barrier should be placed in SMP only when we actually
+ * make the change. If we don't change anything (so if the returned
+ * prev is equal to old) then we aren't acquiring anything new and
+ * we don't need any memory barrier as far as I can tell.
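+ *
+ * The usual (illustrative) consumer is a compare-and-swap retry
+ * loop, e.g. incrementing a location lock-free:
+ *
+ *	long old, new;
+ *	do {
+ *		old = READ_ONCE(*p);
+ *		new = old + 1;
+ *	} while (cmpxchg(p, old, new) != old);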
+ */ + +static inline unsigned long +____cmpxchg(_u8, volatile char *m, unsigned char old, unsigned char new) +{ + unsigned long prev, tmp, cmp, addr64; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " andnot %5, 7, %4\n" + " inslb %1, %5, %1\n" + "1: lldl %2, 0(%4)\n" + " extlb %2, %5, %0\n" + " cmpeq %0, %6, %3\n" + " wr_f %3\n" + " masklb %2, %5, %2\n" + " or %1, %2, %2\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %2, 0(%4)\n" + " rd_f %2\n" + " beq %3, 2f\n" + " beq %2, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64) + : "r" ((long)m), "Ir" (old), "1" (new) : "memory"); + + return prev; +} + +static inline unsigned long +____cmpxchg(_u16, volatile short *m, unsigned short old, unsigned short new) +{ + unsigned long prev, tmp, cmp, addr64; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " andnot %5, 7, %4\n" + " inslh %1, %5, %1\n" + "1: lldl %2, 0(%4)\n" + " extlh %2, %5, %0\n" + " cmpeq %0, %6, %3\n" + " wr_f %3\n" + " masklh %2, %5, %2\n" + " or %1, %2, %2\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %2, 0(%4)\n" + " rd_f %2\n" + " beq %3, 2f\n" + " beq %2, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64) + : "r" ((long)m), "Ir" (old), "1" (new) : "memory"); + + return prev; +} + +static inline unsigned long +____cmpxchg(_u32, volatile int *m, int old, int new) +{ + unsigned long prev, cmp, addr, tmp; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %7\n" + "1: lldw %0, 0(%3)\n" + " cmpeq %0, %5, %1\n" + " wr_f %1\n" + " bis $31, %6, %4\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstw %4, 0(%3)\n" + " rd_f %4\n" + " beq %1, 2f\n" + " beq %4, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r"(prev), "=&r"(cmp), "=m"(*m), "=&r"(addr), "=&r"(tmp) + : "r"((long) old), "r"(new), "m"(*m) : "memory"); + + return prev; +} + +static inline unsigned long +____cmpxchg(_u64, volatile long *m, unsigned long old, unsigned long new) +{ + unsigned long prev, cmp, addr, tmp; + + __asm__ __volatile__( +#ifdef CONFIG_LOCK_MEMB + " memb\n" +#endif + " ldi %3, %7\n" + "1: lldl %0, 0(%3)\n" + " cmpeq %0, %5, %1\n" + " wr_f %1\n" + " bis $31, %6, %4\n" +#ifdef CONFIG_LOCK_FIXUP + " memb\n" +#endif + " lstl %4, 0(%3)\n" + " rd_f %4\n" + " beq %1, 2f\n" + " beq %4, 3f\n" + "2:\n" + ".subsection 2\n" + "3: br 1b\n" + ".previous" + : "=&r"(prev), "=&r"(cmp), "=m"(*m), "=&r"(addr), "=&r"(tmp) + : "r"((long) old), "r"(new), "m"(*m) : "memory"); + + return prev; +} + +/* + * This function doesn't exist, so you'll get a linker error + * if something tries to do an invalid cmpxchg(). 
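+ * (The size switch below collapses at compile time, so the call to
+ * this undefined helper survives only for unsupported operand sizes,
+ * turning a silent miscompile into a link-time failure.)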
+ */ +extern void __cmpxchg_called_with_bad_pointer(void); + +static __always_inline unsigned long ____cmpxchg(, volatile void *ptr, + unsigned long old, + unsigned long new, int size) +{ + switch (size) { + case 1: + return ____cmpxchg(_u8, ptr, old, new); + case 2: + return ____cmpxchg(_u16, ptr, old, new); + case 4: + return ____cmpxchg(_u32, ptr, old, new); + case 8: + return ____cmpxchg(_u64, ptr, old, new); + } + __cmpxchg_called_with_bad_pointer(); + return old; +} + +#endif diff --git a/arch/sw_64/include/asm/xor.h b/arch/sw_64/include/asm/xor.h new file mode 100644 index 000000000000..af95259ed8ef --- /dev/null +++ b/arch/sw_64/include/asm/xor.h @@ -0,0 +1,847 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Optimized RAID-5 checksumming functions. + */ + +#ifndef _ASM_SW64_XOR_H +#define _ASM_SW64_XOR_H + +extern void xor_sw64_2(unsigned long, unsigned long *, unsigned long *); +extern void xor_sw64_3(unsigned long, unsigned long *, unsigned long *, + unsigned long *); +extern void xor_sw64_4(unsigned long, unsigned long *, unsigned long *, + unsigned long *, unsigned long *); +extern void xor_sw64_5(unsigned long, unsigned long *, unsigned long *, + unsigned long *, unsigned long *, unsigned long *); + +extern void xor_sw64_prefetch_2(unsigned long, unsigned long *, + unsigned long *); +extern void xor_sw64_prefetch_3(unsigned long, unsigned long *, + unsigned long *, unsigned long *); +extern void xor_sw64_prefetch_4(unsigned long, unsigned long *, + unsigned long *, unsigned long *, + unsigned long *); +extern void xor_sw64_prefetch_5(unsigned long, unsigned long *, + unsigned long *, unsigned long *, + unsigned long *, unsigned long *); + +asm(" \n\ + .text \n\ + .align 3 \n\ + .ent xor_sw64_2 \n\ +xor_sw64_2: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + .align 4 \n\ +2: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 8($17) \n\ + ldl $3, 8($18) \n\ + \n\ + ldl $4, 16($17) \n\ + ldl $5, 16($18) \n\ + ldl $6, 24($17) \n\ + ldl $7, 24($18) \n\ + \n\ + ldl $19, 32($17) \n\ + ldl $20, 32($18) \n\ + ldl $21, 40($17) \n\ + ldl $22, 40($18) \n\ + \n\ + ldl $23, 48($17) \n\ + ldl $24, 48($18) \n\ + ldl $25, 56($17) \n\ + xor $0, $1, $0 # 7 cycles from $1 load \n\ + \n\ + ldl $27, 56($18) \n\ + xor $2, $3, $2 \n\ + stl $0, 0($17) \n\ + xor $4, $5, $4 \n\ + \n\ + stl $2, 8($17) \n\ + xor $6, $7, $6 \n\ + stl $4, 16($17) \n\ + xor $19, $20, $19 \n\ + \n\ + stl $6, 24($17) \n\ + xor $21, $22, $21 \n\ + stl $19, 32($17) \n\ + xor $23, $24, $23 \n\ + \n\ + stl $21, 40($17) \n\ + xor $25, $27, $25 \n\ + stl $23, 48($17) \n\ + subl $16, 1, $16 \n\ + \n\ + stl $25, 56($17) \n\ + addl $17, 64, $17 \n\ + addl $18, 64, $18 \n\ + bgt $16, 2b \n\ + \n\ + ret \n\ + .end xor_sw64_2 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_3 \n\ +xor_sw64_3: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + .align 4 \n\ +3: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 0($19) \n\ + ldl $3, 8($17) \n\ + \n\ + ldl $4, 8($18) \n\ + ldl $6, 16($17) \n\ + ldl $7, 16($18) \n\ + ldl $21, 24($17) \n\ + \n\ + ldl $22, 24($18) \n\ + ldl $24, 32($17) \n\ + ldl $25, 32($18) \n\ + ldl $5, 8($19) \n\ + \n\ + ldl $20, 16($19) \n\ + ldl $23, 24($19) \n\ + ldl $27, 32($19) \n\ + \n\ + xor $0, $1, $1 # 8 cycles from $0 load \n\ + xor $3, $4, $4 # 6 cycles from $4 load \n\ + xor $6, $7, $7 # 6 cycles from $7 load \n\ + xor $21, $22, $22 # 5 cycles from $22 load \n\ + \n\ + xor $1, $2, $2 # 9 cycles from $2 load \n\ + xor $24, $25, $25 # 5 cycles from $25 load \n\ + stl $2, 0($17) \n\ + xor $4, $5, $5 # 6 cycles from $5 
load \n\ + \n\ + stl $5, 8($17) \n\ + xor $7, $20, $20 # 7 cycles from $20 load \n\ + stl $20, 16($17) \n\ + xor $22, $23, $23 # 7 cycles from $23 load \n\ + \n\ + stl $23, 24($17) \n\ + xor $25, $27, $27 # 7 cycles from $27 load \n\ + stl $27, 32($17) \n\ + \n\ + ldl $0, 40($17) \n\ + ldl $1, 40($18) \n\ + ldl $3, 48($17) \n\ + ldl $4, 48($18) \n\ + \n\ + ldl $6, 56($17) \n\ + ldl $7, 56($18) \n\ + ldl $2, 40($19) \n\ + ldl $5, 48($19) \n\ + \n\ + ldl $20, 56($19) \n\ + xor $0, $1, $1 # 4 cycles from $1 load \n\ + xor $3, $4, $4 # 5 cycles from $4 load \n\ + xor $6, $7, $7 # 5 cycles from $7 load \n\ + \n\ + xor $1, $2, $2 # 4 cycles from $2 load \n\ + xor $4, $5, $5 # 5 cycles from $5 load \n\ + stl $2, 40($17) \n\ + xor $7, $20, $20 # 4 cycles from $20 load \n\ + \n\ + stl $5, 48($17) \n\ + subl $16, 1, $16 \n\ + stl $20, 56($17) \n\ + addl $19, 64, $19 \n\ + \n\ + addl $18, 64, $18 \n\ + addl $17, 64, $17 \n\ + bgt $16, 3b \n\ + ret \n\ + .end xor_sw64_3 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_4 \n\ +xor_sw64_4: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + .align 4 \n\ +4: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 0($19) \n\ + ldl $3, 0($20) \n\ + \n\ + ldl $4, 8($17) \n\ + ldl $5, 8($18) \n\ + ldl $6, 8($19) \n\ + ldl $7, 8($20) \n\ + \n\ + ldl $21, 16($17) \n\ + ldl $22, 16($18) \n\ + ldl $23, 16($19) \n\ + ldl $24, 16($20) \n\ + \n\ + ldl $25, 24($17) \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + ldl $27, 24($18) \n\ + xor $2, $3, $3 # 6 cycles from $3 load \n\ + \n\ + ldl $0, 24($19) \n\ + xor $1, $3, $3 \n\ + ldl $1, 24($20) \n\ + xor $4, $5, $5 # 7 cycles from $5 load \n\ + \n\ + stl $3, 0($17) \n\ + xor $6, $7, $7 \n\ + xor $21, $22, $22 # 7 cycles from $22 load \n\ + xor $5, $7, $7 \n\ + \n\ + stl $7, 8($17) \n\ + xor $23, $24, $24 # 7 cycles from $24 load \n\ + ldl $2, 32($17) \n\ + xor $22, $24, $24 \n\ + \n\ + ldl $3, 32($18) \n\ + ldl $4, 32($19) \n\ + ldl $5, 32($20) \n\ + xor $25, $27, $27 # 8 cycles from $27 load \n\ + \n\ + ldl $6, 40($17) \n\ + ldl $7, 40($18) \n\ + ldl $21, 40($19) \n\ + ldl $22, 40($20) \n\ + \n\ + stl $24, 16($17) \n\ + xor $0, $1, $1 # 9 cycles from $1 load \n\ + xor $2, $3, $3 # 5 cycles from $3 load \n\ + xor $27, $1, $1 \n\ + \n\ + stl $1, 24($17) \n\ + xor $4, $5, $5 # 5 cycles from $5 load \n\ + ldl $23, 48($17) \n\ + ldl $24, 48($18) \n\ + \n\ + ldl $25, 48($19) \n\ + xor $3, $5, $5 \n\ + ldl $27, 48($20) \n\ + ldl $0, 56($17) \n\ + \n\ + ldl $1, 56($18) \n\ + ldl $2, 56($19) \n\ + xor $6, $7, $7 # 8 cycles from $6 load \n\ + ldl $3, 56($20) \n\ + \n\ + stl $5, 32($17) \n\ + xor $21, $22, $22 # 8 cycles from $22 load \n\ + xor $7, $22, $22 \n\ + xor $23, $24, $24 # 5 cycles from $24 load \n\ + \n\ + stl $22, 40($17) \n\ + xor $25, $27, $27 # 5 cycles from $27 load \n\ + xor $24, $27, $27 \n\ + xor $0, $1, $1 # 5 cycles from $1 load \n\ + \n\ + stl $27, 48($17) \n\ + xor $2, $3, $3 # 4 cycles from $3 load \n\ + xor $1, $3, $3 \n\ + subl $16, 1, $16 \n\ + \n\ + stl $3, 56($17) \n\ + addl $20, 64, $20 \n\ + addl $19, 64, $19 \n\ + addl $18, 64, $18 \n\ + \n\ + addl $17, 64, $17 \n\ + bgt $16, 4b \n\ + ret \n\ + .end xor_sw64_4 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_5 \n\ +xor_sw64_5: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + .align 4 \n\ +5: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 0($19) \n\ + ldl $3, 0($20) \n\ + \n\ + ldl $4, 0($21) \n\ + ldl $5, 8($17) \n\ + ldl $6, 8($18) \n\ + ldl $7, 8($19) \n\ + \n\ + ldl $22, 8($20) \n\ + ldl $23, 8($21) \n\ + ldl $24, 16($17) \n\ + ldl $25, 16($18) \n\ + \n\ + 
ldl $27, 16($19) \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + ldl $28, 16($20) \n\ + xor $2, $3, $3 # 6 cycles from $3 load \n\ + \n\ + ldl $0, 16($21) \n\ + xor $1, $3, $3 \n\ + ldl $1, 24($17) \n\ + xor $3, $4, $4 # 7 cycles from $4 load \n\ + \n\ + stl $4, 0($17) \n\ + xor $5, $6, $6 # 7 cycles from $6 load \n\ + xor $7, $22, $22 # 7 cycles from $22 load \n\ + xor $6, $23, $23 # 7 cycles from $23 load \n\ + \n\ + ldl $2, 24($18) \n\ + xor $22, $23, $23 \n\ + ldl $3, 24($19) \n\ + xor $24, $25, $25 # 8 cycles from $25 load \n\ + \n\ + stl $23, 8($17) \n\ + xor $25, $27, $27 # 8 cycles from $27 load \n\ + ldl $4, 24($20) \n\ + xor $28, $0, $0 # 7 cycles from $0 load \n\ + \n\ + ldl $5, 24($21) \n\ + xor $27, $0, $0 \n\ + ldl $6, 32($17) \n\ + ldl $7, 32($18) \n\ + \n\ + stl $0, 16($17) \n\ + xor $1, $2, $2 # 6 cycles from $2 load \n\ + ldl $22, 32($19) \n\ + xor $3, $4, $4 # 4 cycles from $4 load \n\ + \n\ + ldl $23, 32($20) \n\ + xor $2, $4, $4 \n\ + ldl $24, 32($21) \n\ + ldl $25, 40($17) \n\ + \n\ + ldl $27, 40($18) \n\ + ldl $28, 40($19) \n\ + ldl $0, 40($20) \n\ + xor $4, $5, $5 # 7 cycles from $5 load \n\ + \n\ + stl $5, 24($17) \n\ + xor $6, $7, $7 # 7 cycles from $7 load \n\ + ldl $1, 40($21) \n\ + ldl $2, 48($17) \n\ + \n\ + ldl $3, 48($18) \n\ + xor $7, $22, $22 # 7 cycles from $22 load \n\ + ldl $4, 48($19) \n\ + xor $23, $24, $24 # 6 cycles from $24 load \n\ + \n\ + ldl $5, 48($20) \n\ + xor $22, $24, $24 \n\ + ldl $6, 48($21) \n\ + xor $25, $27, $27 # 7 cycles from $27 load \n\ + \n\ + stl $24, 32($17) \n\ + xor $27, $28, $28 # 8 cycles from $28 load \n\ + ldl $7, 56($17) \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + \n\ + ldl $22, 56($18) \n\ + ldl $23, 56($19) \n\ + ldl $24, 56($20) \n\ + ldl $25, 56($21) \n\ + \n\ + xor $28, $1, $1 \n\ + xor $2, $3, $3 # 9 cycles from $3 load \n\ + xor $3, $4, $4 # 9 cycles from $4 load \n\ + xor $5, $6, $6 # 8 cycles from $6 load \n\ + \n\ + stl $1, 40($17) \n\ + xor $4, $6, $6 \n\ + xor $7, $22, $22 # 7 cycles from $22 load \n\ + xor $23, $24, $24 # 6 cycles from $24 load \n\ + \n\ + stl $6, 48($17) \n\ + xor $22, $24, $24 \n\ + subl $16, 1, $16 \n\ + xor $24, $25, $25 # 8 cycles from $25 load \n\ + \n\ + stl $25, 56($17) \n\ + addl $21, 64, $21 \n\ + addl $20, 64, $20 \n\ + addl $19, 64, $19 \n\ + \n\ + addl $18, 64, $18 \n\ + addl $17, 64, $17 \n\ + bgt $16, 5b \n\ + ret \n\ + .end xor_sw64_5 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_prefetch_2 \n\ +xor_sw64_prefetch_2: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + \n\ + fillde 0($17) \n\ + fillde 0($18) \n\ + \n\ + fillde 64($17) \n\ + fillde 64($18) \n\ + \n\ + fillde 128($17) \n\ + fillde 128($18) \n\ + \n\ + fillde 192($17) \n\ + fillde 192($18) \n\ + .align 4 \n\ +2: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 8($17) \n\ + ldl $3, 8($18) \n\ + \n\ + ldl $4, 16($17) \n\ + ldl $5, 16($18) \n\ + ldl $6, 24($17) \n\ + ldl $7, 24($18) \n\ + \n\ + ldl $19, 32($17) \n\ + ldl $20, 32($18) \n\ + ldl $21, 40($17) \n\ + ldl $22, 40($18) \n\ + \n\ + ldl $23, 48($17) \n\ + ldl $24, 48($18) \n\ + ldl $25, 56($17) \n\ + ldl $27, 56($18) \n\ + \n\ + fillde 256($17) \n\ + xor $0, $1, $0 # 8 cycles from $1 load \n\ + fillde 256($18) \n\ + xor $2, $3, $2 \n\ + \n\ + stl $0, 0($17) \n\ + xor $4, $5, $4 \n\ + stl $2, 8($17) \n\ + xor $6, $7, $6 \n\ + \n\ + stl $4, 16($17) \n\ + xor $19, $20, $19 \n\ + stl $6, 24($17) \n\ + xor $21, $22, $21 \n\ + \n\ + stl $19, 32($17) \n\ + xor $23, $24, $23 \n\ + stl $21, 40($17) \n\ + xor $25, $27, $25 \n\ + \n\ + stl $23, 48($17) \n\ + subl 
$16, 1, $16 \n\ + stl $25, 56($17) \n\ + addl $17, 64, $17 \n\ + \n\ + addl $18, 64, $18 \n\ + bgt $16, 2b \n\ + ret \n\ + .end xor_sw64_prefetch_2 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_prefetch_3 \n\ +xor_sw64_prefetch_3: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + \n\ + fillde 0($17) \n\ + fillde 0($18) \n\ + fillde 0($19) \n\ + \n\ + fillde 64($17) \n\ + fillde 64($18) \n\ + fillde 64($19) \n\ + \n\ + fillde 128($17) \n\ + fillde 128($18) \n\ + fillde 128($19) \n\ + \n\ + fillde 192($17) \n\ + fillde 192($18) \n\ + fillde 192($19) \n\ + .align 4 \n\ +3: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 0($19) \n\ + ldl $3, 8($17) \n\ + \n\ + ldl $4, 8($18) \n\ + ldl $6, 16($17) \n\ + ldl $7, 16($18) \n\ + ldl $21, 24($17) \n\ + \n\ + ldl $22, 24($18) \n\ + ldl $24, 32($17) \n\ + ldl $25, 32($18) \n\ + ldl $5, 8($19) \n\ + \n\ + ldl $20, 16($19) \n\ + ldl $23, 24($19) \n\ + ldl $27, 32($19) \n\ + \n\ + xor $0, $1, $1 # 8 cycles from $0 load \n\ + xor $3, $4, $4 # 7 cycles from $4 load \n\ + xor $6, $7, $7 # 6 cycles from $7 load \n\ + xor $21, $22, $22 # 5 cycles from $22 load \n\ + \n\ + xor $1, $2, $2 # 9 cycles from $2 load \n\ + xor $24, $25, $25 # 5 cycles from $25 load \n\ + stl $2, 0($17) \n\ + xor $4, $5, $5 # 6 cycles from $5 load \n\ + \n\ + stl $5, 8($17) \n\ + xor $7, $20, $20 # 7 cycles from $20 load \n\ + stl $20, 16($17) \n\ + xor $22, $23, $23 # 7 cycles from $23 load \n\ + \n\ + stl $23, 24($17) \n\ + xor $25, $27, $27 # 7 cycles from $27 load \n\ + stl $27, 32($17) \n\ + \n\ + ldl $0, 40($17) \n\ + ldl $1, 40($18) \n\ + ldl $3, 48($17) \n\ + ldl $4, 48($18) \n\ + \n\ + ldl $6, 56($17) \n\ + ldl $7, 56($18) \n\ + ldl $2, 40($19) \n\ + ldl $5, 48($19) \n\ + \n\ + ldl $20, 56($19) \n\ + fillde 256($17) \n\ + fillde 256($18) \n\ + fillde 256($19) \n\ + \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + xor $3, $4, $4 # 5 cycles from $4 load \n\ + xor $6, $7, $7 # 5 cycles from $7 load \n\ + xor $1, $2, $2 # 4 cycles from $2 load \n\ + \n\ + xor $4, $5, $5 # 5 cycles from $5 load \n\ + xor $7, $20, $20 # 4 cycles from $20 load \n\ + stl $2, 40($17) \n\ + subl $16, 1, $16 \n\ + \n\ + stl $5, 48($17) \n\ + addl $19, 64, $19 \n\ + stl $20, 56($17) \n\ + addl $18, 64, $18 \n\ + \n\ + addl $17, 64, $17 \n\ + bgt $16, 3b \n\ + ret \n\ + .end xor_sw64_prefetch_3 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_prefetch_4 \n\ +xor_sw64_prefetch_4: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + \n\ + fillde 0($17) \n\ + fillde 0($18) \n\ + fillde 0($19) \n\ + fillde 0($20) \n\ + \n\ + fillde 64($17) \n\ + fillde 64($18) \n\ + fillde 64($19) \n\ + fillde 64($20) \n\ + \n\ + fillde 128($17) \n\ + fillde 128($18) \n\ + fillde 128($19) \n\ + fillde 128($20) \n\ + \n\ + fillde 192($17) \n\ + fillde 192($18) \n\ + fillde 192($19) \n\ + fillde 192($20) \n\ + .align 4 \n\ +4: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 0($19) \n\ + ldl $3, 0($20) \n\ + \n\ + ldl $4, 8($17) \n\ + ldl $5, 8($18) \n\ + ldl $6, 8($19) \n\ + ldl $7, 8($20) \n\ + \n\ + ldl $21, 16($17) \n\ + ldl $22, 16($18) \n\ + ldl $23, 16($19) \n\ + ldl $24, 16($20) \n\ + \n\ + ldl $25, 24($17) \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + ldl $27, 24($18) \n\ + xor $2, $3, $3 # 6 cycles from $3 load \n\ + \n\ + ldl $0, 24($19) \n\ + xor $1, $3, $3 \n\ + ldl $1, 24($20) \n\ + xor $4, $5, $5 # 7 cycles from $5 load \n\ + \n\ + stl $3, 0($17) \n\ + xor $6, $7, $7 \n\ + xor $21, $22, $22 # 7 cycles from $22 load \n\ + xor $5, $7, $7 \n\ + \n\ + stl $7, 8($17) \n\ + xor $23, $24, $24 # 7 cycles from $24 load \n\ 
+ ldl $2, 32($17) \n\ + xor $22, $24, $24 \n\ + \n\ + ldl $3, 32($18) \n\ + ldl $4, 32($19) \n\ + ldl $5, 32($20) \n\ + xor $25, $27, $27 # 8 cycles from $27 load \n\ + \n\ + ldl $6, 40($17) \n\ + ldl $7, 40($18) \n\ + ldl $21, 40($19) \n\ + ldl $22, 40($20) \n\ + \n\ + stl $24, 16($17) \n\ + xor $0, $1, $1 # 9 cycles from $1 load \n\ + xor $2, $3, $3 # 5 cycles from $3 load \n\ + xor $27, $1, $1 \n\ + \n\ + stl $1, 24($17) \n\ + xor $4, $5, $5 # 5 cycles from $5 load \n\ + ldl $23, 48($17) \n\ + xor $3, $5, $5 \n\ + \n\ + ldl $24, 48($18) \n\ + ldl $25, 48($19) \n\ + ldl $27, 48($20) \n\ + ldl $0, 56($17) \n\ + \n\ + ldl $1, 56($18) \n\ + ldl $2, 56($19) \n\ + ldl $3, 56($20) \n\ + xor $6, $7, $7 # 8 cycles from $6 load \n\ + \n\ + fillde 256($17) \n\ + xor $21, $22, $22 # 8 cycles from $22 load \n\ + fillde 256($18) \n\ + xor $7, $22, $22 \n\ + \n\ + fillde 256($19) \n\ + xor $23, $24, $24 # 6 cycles from $24 load \n\ + fillde 256($20) \n\ + xor $25, $27, $27 # 6 cycles from $27 load \n\ + \n\ + stl $5, 32($17) \n\ + xor $24, $27, $27 \n\ + xor $0, $1, $1 # 7 cycles from $1 load \n\ + xor $2, $3, $3 # 6 cycles from $3 load \n\ + \n\ + stl $22, 40($17) \n\ + xor $1, $3, $3 \n\ + stl $27, 48($17) \n\ + subl $16, 1, $16 \n\ + \n\ + stl $3, 56($17) \n\ + addl $20, 64, $20 \n\ + addl $19, 64, $19 \n\ + addl $18, 64, $18 \n\ + \n\ + addl $17, 64, $17 \n\ + bgt $16, 4b \n\ + ret \n\ + .end xor_sw64_prefetch_4 \n\ + \n\ + .align 3 \n\ + .ent xor_sw64_prefetch_5 \n\ +xor_sw64_prefetch_5: \n\ + .prologue 0 \n\ + srl $16, 6, $16 \n\ + \n\ + fillde 0($17) \n\ + fillde 0($18) \n\ + fillde 0($19) \n\ + fillde 0($20) \n\ + fillde 0($21) \n\ + \n\ + fillde 64($17) \n\ + fillde 64($18) \n\ + fillde 64($19) \n\ + fillde 64($20) \n\ + fillde 64($21) \n\ + \n\ + fillde 128($17) \n\ + fillde 128($18) \n\ + fillde 128($19) \n\ + fillde 128($20) \n\ + fillde 128($21) \n\ + \n\ + fillde 192($17) \n\ + fillde 192($18) \n\ + fillde 192($19) \n\ + fillde 192($20) \n\ + fillde 192($21) \n\ + .align 4 \n\ +5: \n\ + ldl $0, 0($17) \n\ + ldl $1, 0($18) \n\ + ldl $2, 0($19) \n\ + ldl $3, 0($20) \n\ + \n\ + ldl $4, 0($21) \n\ + ldl $5, 8($17) \n\ + ldl $6, 8($18) \n\ + ldl $7, 8($19) \n\ + \n\ + ldl $22, 8($20) \n\ + ldl $23, 8($21) \n\ + ldl $24, 16($17) \n\ + ldl $25, 16($18) \n\ + \n\ + ldl $27, 16($19) \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + ldl $28, 16($20) \n\ + xor $2, $3, $3 # 6 cycles from $3 load \n\ + \n\ + ldl $0, 16($21) \n\ + xor $1, $3, $3 \n\ + ldl $1, 24($17) \n\ + xor $3, $4, $4 # 7 cycles from $4 load \n\ + \n\ + stl $4, 0($17) \n\ + xor $5, $6, $6 # 7 cycles from $6 load \n\ + xor $7, $22, $22 # 7 cycles from $22 load \n\ + xor $6, $23, $23 # 7 cycles from $23 load \n\ + \n\ + ldl $2, 24($18) \n\ + xor $22, $23, $23 \n\ + ldl $3, 24($19) \n\ + xor $24, $25, $25 # 8 cycles from $25 load \n\ + \n\ + stl $23, 8($17) \n\ + xor $25, $27, $27 # 8 cycles from $27 load \n\ + ldl $4, 24($20) \n\ + xor $28, $0, $0 # 7 cycles from $0 load \n\ + \n\ + ldl $5, 24($21) \n\ + xor $27, $0, $0 \n\ + ldl $6, 32($17) \n\ + ldl $7, 32($18) \n\ + \n\ + stl $0, 16($17) \n\ + xor $1, $2, $2 # 6 cycles from $2 load \n\ + ldl $22, 32($19) \n\ + xor $3, $4, $4 # 4 cycles from $4 load \n\ + \n\ + ldl $23, 32($20) \n\ + xor $2, $4, $4 \n\ + ldl $24, 32($21) \n\ + ldl $25, 40($17) \n\ + \n\ + ldl $27, 40($18) \n\ + ldl $28, 40($19) \n\ + ldl $0, 40($20) \n\ + xor $4, $5, $5 # 7 cycles from $5 load \n\ + \n\ + stl $5, 24($17) \n\ + xor $6, $7, $7 # 7 cycles from $7 load \n\ + ldl $1, 40($21) \n\ + ldl $2, 48($17) \n\ + 
\n\ + ldl $3, 48($18) \n\ + xor $7, $22, $22 # 7 cycles from $22 load \n\ + ldl $4, 48($19) \n\ + xor $23, $24, $24 # 6 cycles from $24 load \n\ + \n\ + ldl $5, 48($20) \n\ + xor $22, $24, $24 \n\ + ldl $6, 48($21) \n\ + xor $25, $27, $27 # 7 cycles from $27 load \n\ + \n\ + stl $24, 32($17) \n\ + xor $27, $28, $28 # 8 cycles from $28 load \n\ + ldl $7, 56($17) \n\ + xor $0, $1, $1 # 6 cycles from $1 load \n\ + \n\ + ldl $22, 56($18) \n\ + ldl $23, 56($19) \n\ + ldl $24, 56($20) \n\ + ldl $25, 56($21) \n\ + \n\ + fillde 256($17) \n\ + xor $28, $1, $1 \n\ + fillde 256($18) \n\ + xor $2, $3, $3 # 9 cycles from $3 load \n\ + \n\ + fillde 256($19) \n\ + xor $3, $4, $4 # 9 cycles from $4 load \n\ + fillde 256($20) \n\ + xor $5, $6, $6 # 8 cycles from $6 load \n\ + \n\ + stl $1, 40($17) \n\ + xor $4, $6, $6 \n\ + xor $7, $22, $22 # 7 cycles from $22 load \n\ + xor $23, $24, $24 # 6 cycles from $24 load \n\ + \n\ + stl $6, 48($17) \n\ + xor $22, $24, $24 \n\ + fillde 256($21) \n\ + xor $24, $25, $25 # 8 cycles from $25 load \n\ + \n\ + stl $25, 56($17) \n\ + subl $16, 1, $16 \n\ + addl $21, 64, $21 \n\ + addl $20, 64, $20 \n\ + \n\ + addl $19, 64, $19 \n\ + addl $18, 64, $18 \n\ + addl $17, 64, $17 \n\ + bgt $16, 5b \n\ + \n\ + ret \n\ + .end xor_sw64_prefetch_5 \n\ +"); + +static struct xor_block_template xor_block_sw64 = { + .name = "sw64", + .do_2 = xor_sw64_2, + .do_3 = xor_sw64_3, + .do_4 = xor_sw64_4, + .do_5 = xor_sw64_5, +}; + +static struct xor_block_template xor_block_sw64_prefetch = { + .name = "sw64 prefetch", + .do_2 = xor_sw64_prefetch_2, + .do_3 = xor_sw64_prefetch_3, + .do_4 = xor_sw64_prefetch_4, + .do_5 = xor_sw64_prefetch_5, +}; + +/* For grins, also test the generic routines. */ +#include <asm-generic/xor.h> + +#undef XOR_TRY_TEMPLATES +#define XOR_TRY_TEMPLATES \ + do { \ + xor_speed(&xor_block_8regs); \ + xor_speed(&xor_block_32regs); \ + xor_speed(&xor_block_sw64); \ + xor_speed(&xor_block_sw64_prefetch); \ + } while (0) + +/* + * Force the use of sw64_prefetch as it is significantly + * faster in the cold cache case. + */ +#define XOR_SELECT_TEMPLATE(FASTEST) (&xor_block_sw64_prefetch) + +#endif diff --git a/arch/sw_64/include/uapi/asm/Kbuild b/arch/sw_64/include/uapi/asm/Kbuild new file mode 100644 index 000000000000..a01bfb9600ec --- /dev/null +++ b/arch/sw_64/include/uapi/asm/Kbuild @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0 +# UAPI Header export list + +generated-y += unistd_64.h diff --git a/arch/sw_64/include/uapi/asm/a.out.h b/arch/sw_64/include/uapi/asm/a.out.h new file mode 100644 index 000000000000..addb648b8ed6 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/a.out.h @@ -0,0 +1,88 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_A_OUT_H +#define _UAPI_ASM_SW64_A_OUT_H + +#include <linux/types.h> + +/* + * ECOFF header structs. ECOFF files consist of: + * - a file header (struct filehdr), + * - an a.out header (struct aouthdr), + * - one or more section headers (struct scnhdr). + * The filhdr's "f_nscns" field contains the + * number of section headers. + */ + +struct filehdr { + /* "file" header */ + __u16 f_magic, f_nscns; + __u32 f_timdat; + __u64 f_symptr; + __u32 f_nsyms; + __u16 f_opthdr, f_flags; +}; + +struct aouthdr { + __u64 info; /* after that it looks quite normal.. 
*/ + __u64 tsize; + __u64 dsize; + __u64 bsize; + __u64 entry; + __u64 text_start; /* with a few additions that actually make sense */ + __u64 data_start; + __u64 bss_start; + __u32 gprmask, fprmask; /* bitmask of general & floating point regs used in binary */ + __u64 gpvalue; +}; + +struct scnhdr { + char s_name[8]; + __u64 s_paddr; + __u64 s_vaddr; + __u64 s_size; + __u64 s_scnptr; + __u64 s_relptr; + __u64 s_lnnoptr; + __u16 s_nreloc; + __u16 s_nlnno; + __u32 s_flags; +}; + +struct exec { + /* "file" header */ + struct filehdr fh; + struct aouthdr ah; +}; + +/* + * Defines so that the kernel exec code can access the a.out header + * fields... + */ +#define a_info ah.info +#define a_text ah.tsize +#define a_data ah.dsize +#define a_bss ah.bsize +#define a_entry ah.entry +#define a_textstart ah.text_start +#define a_datastart ah.data_start +#define a_bssstart ah.bss_start +#define a_gprmask ah.gprmask +#define a_fprmask ah.fprmask +#define a_gpvalue ah.gpvalue + +#define N_TXTADDR(x) ((x).a_textstart) +#define N_DATADDR(x) ((x).a_datastart) +#define N_BSSADDR(x) ((x).a_bssstart) +#define N_DRSIZE(x) 0 +#define N_TRSIZE(x) 0 +#define N_SYMSIZE(x) 0 + +#define AOUTHSZ sizeof(struct aouthdr) +#define SCNHSZ sizeof(struct scnhdr) +#define SCNROUND 16 + +#define N_TXTOFF(x) \ + ((long) N_MAGIC(x) == ZMAGIC ? 0 : \ + (sizeof(struct exec) + (x).fh.f_nscns * SCNHSZ + SCNROUND - 1) & ~(SCNROUND - 1)) + +#endif /* _UAPI_ASM_SW64_A_OUT_H */ diff --git a/arch/sw_64/include/uapi/asm/auxvec.h b/arch/sw_64/include/uapi/asm/auxvec.h new file mode 100644 index 000000000000..59854f3ac501 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/auxvec.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_AUXVEC_H +#define _UAPI_ASM_SW64_AUXVEC_H + +/* Reserve these numbers for any future use of a VDSO. */ +#if 1 +#define AT_SYSINFO 32 +#define AT_SYSINFO_EHDR 33 +#endif + +/* + * More complete cache descriptions than AT_[DIU]CACHEBSIZE. If the + * value is -1, then the cache doesn't exist. Otherwise: + * + * bit 0-3: Cache set-associativity; 0 means fully associative. + * bit 4-7: Log2 of cacheline size. + * bit 8-31: Size of the entire cache >> 8. + * bit 32-63: Reserved. 
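For reference, a value with this layout decodes straightforwardly from userspace; a minimal sketch, assuming glibc's getauxval(3) is available (the helper name is illustrative, not part of this patch):

	#include <sys/auxv.h>
	#include <stdio.h>

	static void print_l1d_shape(void)
	{
		/* 35 == AT_L1D_CACHESHAPE, defined just below */
		unsigned long shape = getauxval(35);

		if (shape == 0 || shape == (unsigned long)-1)
			return;		/* not reported, or cache absent */

		printf("assoc=%lu line=%lu size=%lu\n",
		       shape & 0xf,			/* bits 0-3 */
		       1UL << ((shape >> 4) & 0xf),	/* bits 4-7 */
		       ((shape >> 8) & 0xffffff) << 8);	/* bits 8-31 */
	}
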
+ */ + +#define AT_L1I_CACHESHAPE 34 +#define AT_L1D_CACHESHAPE 35 +#define AT_L2_CACHESHAPE 36 +#define AT_L3_CACHESHAPE 37 + +#define AT_VECTOR_SIZE_ARCH 4 /* entries in ARCH_DLINFO */ + +#endif /* _UAPI_ASM_SW64_AUXVEC_H */ diff --git a/arch/sw_64/include/uapi/asm/bitsperlong.h b/arch/sw_64/include/uapi/asm/bitsperlong.h new file mode 100644 index 000000000000..5d2c677a86b8 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/bitsperlong.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_BITSPERLONG_H +#define _UAPI_ASM_SW64_BITSPERLONG_H + +#define __BITS_PER_LONG 64 + +#include <asm-generic/bitsperlong.h> + +#endif /* _UAPI_ASM_SW64_BITSPERLONG_H */ diff --git a/arch/sw_64/include/uapi/asm/bootparam.h b/arch/sw_64/include/uapi/asm/bootparam.h new file mode 100644 index 000000000000..6ce75d65e86e --- /dev/null +++ b/arch/sw_64/include/uapi/asm/bootparam.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_ASM_SW64_BOOTPARAM_H +#define _UAPI_ASM_SW64_BOOTPARAM_H + +#ifndef __ASSEMBLY__ + +#include <linux/types.h> + +struct boot_params { + __u64 initrd_start; /* logical address of initrd */ + __u64 initrd_size; /* size of initrd */ + __u64 dtb_start; /* logical address of dtb */ + __u64 efi_systab; /* logical address of EFI system table */ + __u64 efi_memmap; /* logical address of EFI memory map */ + __u64 efi_memmap_size; /* size of EFI memory map */ + __u64 efi_memdesc_size; /* size of an EFI memory map descriptor */ + __u64 efi_memdesc_version; /* memory descriptor version */ + __u64 cmdline; /* logical address of cmdline */ +}; +#endif + +#endif /* _UAPI_ASM_SW64_BOOTPARAM_H */ diff --git a/arch/sw_64/include/uapi/asm/byteorder.h b/arch/sw_64/include/uapi/asm/byteorder.h new file mode 100644 index 000000000000..1b1698df58ca --- /dev/null +++ b/arch/sw_64/include/uapi/asm/byteorder.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_BYTEORDER_H +#define _UAPI_ASM_SW64_BYTEORDER_H + +#include <linux/byteorder/little_endian.h> + +#endif /* _UAPI_ASM_SW64_BYTEORDER_H */ diff --git a/arch/sw_64/include/uapi/asm/compiler.h b/arch/sw_64/include/uapi/asm/compiler.h new file mode 100644 index 000000000000..e5cf0fb170fa --- /dev/null +++ b/arch/sw_64/include/uapi/asm/compiler.h @@ -0,0 +1,83 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_COMPILER_H +#define _UAPI_ASM_SW64_COMPILER_H + +/* + * Herein are macros we use to describe various patterns we want to present to GCC. + * In all cases we can get better schedules out of the compiler if we hide + * as little as possible inside inline assembly. However, we want to be + * able to know what we'll get out before giving up inline assembly. Thus + * these tests and macros. 
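As one example of the pattern, the single-instruction count wrappers defined below can back generic bit helpers; a sketch (hypothetical helper name, assuming the cttz instruction is always present on sw64):

	/* ffs()-style helper on top of __kernel_cttz (defined below). */
	static inline int sw64_ffs(unsigned long x)
	{
		return x ? (int)__kernel_cttz(x) + 1 : 0;
	}
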
+ */ + +#define __kernel_inslb(val, shift) \ +({ \ + unsigned long __kir; \ + __asm__("inslb %2, %1, %0" : "=r"(__kir) : "rI"(shift), "r"(val));\ + __kir; \ +}) + +#define __kernel_inslh(val, shift) \ +({ \ + unsigned long __kir; \ + __asm__("inslh %2, %1, %0" : "=r"(__kir) : "rI"(shift), "r"(val));\ + __kir; \ +}) + +#define __kernel_insll(val, shift) \ +({ \ + unsigned long __kir; \ + __asm__("insll %2, %1, %0" : "=r"(__kir) : "rI"(shift), "r"(val));\ + __kir; \ +}) + +#define __kernel_inshw(val, shift) \ +({ \ + unsigned long __kir; \ + __asm__("inshw %2, %1, %0" : "=r"(__kir) : "rI"(shift), "r"(val));\ + __kir; \ +}) + +#define __kernel_extlb(val, shift) \ +({ \ + unsigned long __kir; \ + __asm__("extlb %2, %1, %0" : "=r"(__kir) : "rI"(shift), "r"(val));\ + __kir; \ +}) + +#define __kernel_extlh(val, shift) \ +({ \ + unsigned long __kir; \ + __asm__("extlh %2, %1, %0" : "=r"(__kir) : "rI"(shift), "r"(val));\ + __kir; \ +}) + +#define __kernel_cmpgeb(a, b) \ +({ \ + unsigned long __kir; \ + __asm__("cmpgeb %r2, %1, %0" : "=r"(__kir) : "rI"(b), "rJ"(a)); \ + __kir; \ +}) + +#define __kernel_cttz(x) \ +({ \ + unsigned long __kir; \ + __asm__("cttz %1, %0" : "=r"(__kir) : "r"(x)); \ + __kir; \ +}) + +#define __kernel_ctlz(x) \ +({ \ + unsigned long __kir; \ + __asm__("ctlz %1, %0" : "=r"(__kir) : "r"(x)); \ + __kir; \ +}) + +#define __kernel_ctpop(x) \ +({ \ + unsigned long __kir; \ + __asm__("ctpop %1, %0" : "=r"(__kir) : "r"(x)); \ + __kir; \ +}) + +#endif /* _UAPI_ASM_SW64_COMPILER_H */ diff --git a/arch/sw_64/include/uapi/asm/console.h b/arch/sw_64/include/uapi/asm/console.h new file mode 100644 index 000000000000..91246b759ecf --- /dev/null +++ b/arch/sw_64/include/uapi/asm/console.h @@ -0,0 +1,51 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_CONSOLE_H +#define _UAPI_ASM_SW64_CONSOLE_H + +/* + * Console callback routine numbers + */ +#define CCB_GETC 0x01 +#define CCB_PUTS 0x02 +#define CCB_RESET_TERM 0x03 +#define CCB_SET_TERM_INT 0x04 +#define CCB_SET_TERM_CTL 0x05 +#define CCB_PROCESS_KEYCODE 0x06 +#define CCB_OPEN_CONSOLE 0x07 +#define CCB_CLOSE_CONSOLE 0x08 + +#define CCB_OPEN 0x10 +#define CCB_CLOSE 0x11 +#define CCB_IOCTL 0x12 +#define CCB_READ 0x13 +#define CCB_WRITE 0x14 + +#define CCB_SET_ENV 0x20 +#define CCB_RESET_ENV 0x21 +#define CCB_GET_ENV 0x22 +#define CCB_SAVE_ENV 0x23 + +#define CCB_PSWITCH 0x30 +#define CCB_BIOS_EMUL 0x32 + +/* + * Environment variable numbers + */ +#define ENV_AUTO_ACTION 0x01 +#define ENV_BOOT_DEV 0x02 +#define ENV_BOOTDEF_DEV 0x03 +#define ENV_BOOTED_DEV 0x04 +#define ENV_BOOT_FILE 0x05 +#define ENV_BOOTED_FILE 0x06 +#define ENV_BOOT_OSFLAGS 0x07 +#define ENV_BOOTED_OSFLAGS 0x08 +#define ENV_BOOT_RESET 0x09 +#define ENV_DUMP_DEV 0x0A +#define ENV_ENABLE_AUDIT 0x0B +#define ENV_LICENSE 0x0C +#define ENV_CHAR_SET 0x0D +#define ENV_LANGUAGE 0x0E +#define ENV_TTY_DEV 0x0F + + +#endif /* _UAPI_ASM_SW64_CONSOLE_H */ diff --git a/arch/sw_64/include/uapi/asm/errno.h b/arch/sw_64/include/uapi/asm/errno.h new file mode 100644 index 000000000000..04b07f30c787 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/errno.h @@ -0,0 +1,128 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_ERRNO_H +#define _UAPI_ASM_SW64_ERRNO_H + +#include <asm-generic/errno-base.h> + +#undef EAGAIN /* 11 in errno-base.h */ + +#define EDEADLK 11 /* Resource deadlock would occur */ + +#define EAGAIN 35 /* Try again */ +#define EWOULDBLOCK EAGAIN /* Operation would block */ +#define EINPROGRESS 36 /* Operation now in progress */ +#define EALREADY 
37 /* Operation already in progress */ +#define ENOTSOCK 38 /* Socket operation on non-socket */ +#define EDESTADDRREQ 39 /* Destination address required */ +#define EMSGSIZE 40 /* Message too long */ +#define EPROTOTYPE 41 /* Protocol wrong type for socket */ +#define ENOPROTOOPT 42 /* Protocol not available */ +#define EPROTONOSUPPORT 43 /* Protocol not supported */ +#define ESOCKTNOSUPPORT 44 /* Socket type not supported */ +#define EOPNOTSUPP 45 /* Operation not supported on transport endpoint */ +#define EPFNOSUPPORT 46 /* Protocol family not supported */ +#define EAFNOSUPPORT 47 /* Address family not supported by protocol */ +#define EADDRINUSE 48 /* Address already in use */ +#define EADDRNOTAVAIL 49 /* Cannot assign requested address */ +#define ENETDOWN 50 /* Network is down */ +#define ENETUNREACH 51 /* Network is unreachable */ +#define ENETRESET 52 /* Network dropped connection because of reset */ +#define ECONNABORTED 53 /* Software caused connection abort */ +#define ECONNRESET 54 /* Connection reset by peer */ +#define ENOBUFS 55 /* No buffer space available */ +#define EISCONN 56 /* Transport endpoint is already connected */ +#define ENOTCONN 57 /* Transport endpoint is not connected */ +#define ESHUTDOWN 58 /* Cannot send after transport endpoint shutdown */ +#define ETOOMANYREFS 59 /* Too many references: cannot splice */ +#define ETIMEDOUT 60 /* Connection timed out */ +#define ECONNREFUSED 61 /* Connection refused */ +#define ELOOP 62 /* Too many symbolic links encountered */ +#define ENAMETOOLONG 63 /* File name too long */ +#define EHOSTDOWN 64 /* Host is down */ +#define EHOSTUNREACH 65 /* No route to host */ +#define ENOTEMPTY 66 /* Directory not empty */ + +#define EUSERS 68 /* Too many users */ +#define EDQUOT 69 /* Quota exceeded */ +#define ESTALE 70 /* Stale NFS file handle */ +#define EREMOTE 71 /* Object is remote */ + +#define ENOLCK 77 /* No record locks available */ +#define ENOSYS 78 /* Function not implemented */ + +#define ENOMSG 80 /* No message of desired type */ +#define EIDRM 81 /* Identifier removed */ +#define ENOSR 82 /* Out of streams resources */ +#define ETIME 83 /* Timer expired */ +#define EBADMSG 84 /* Not a data message */ +#define EPROTO 85 /* Protocol error */ +#define ENODATA 86 /* No data available */ +#define ENOSTR 87 /* Device not a stream */ + +#define ENOPKG 92 /* Package not installed */ + +#define EILSEQ 116 /* Illegal byte sequence */ + +/* The following are just random noise.. 
*/ +#define ECHRNG 88 /* Channel number out of range */ +#define EL2NSYNC 89 /* Level 2 not synchronized */ +#define EL3HLT 90 /* Level 3 halted */ +#define EL3RST 91 /* Level 3 reset */ + +#define ELNRNG 93 /* Link number out of range */ +#define EUNATCH 94 /* Protocol driver not attached */ +#define ENOCSI 95 /* No CSI structure available */ +#define EL2HLT 96 /* Level 2 halted */ +#define EBADE 97 /* Invalid exchange */ +#define EBADR 98 /* Invalid request descriptor */ +#define EXFULL 99 /* Exchange full */ +#define ENOANO 100 /* No anode */ +#define EBADRQC 101 /* Invalid request code */ +#define EBADSLT 102 /* Invalid slot */ + +#define EDEADLOCK EDEADLK + +#define EBFONT 104 /* Bad font file format */ +#define ENONET 105 /* Machine is not on the network */ +#define ENOLINK 106 /* Link has been severed */ +#define EADV 107 /* Advertise error */ +#define ESRMNT 108 /* Srmount error */ +#define ECOMM 109 /* Communication error on send */ +#define EMULTIHOP 110 /* Multihop attempted */ +#define EDOTDOT 111 /* RFS specific error */ +#define EOVERFLOW 112 /* Value too large for defined data type */ +#define ENOTUNIQ 113 /* Name not unique on network */ +#define EBADFD 114 /* File descriptor in bad state */ +#define EREMCHG 115 /* Remote address changed */ + +#define EUCLEAN 117 /* Structure needs cleaning */ +#define ENOTNAM 118 /* Not a XENIX named type file */ +#define ENAVAIL 119 /* No XENIX semaphores available */ +#define EISNAM 120 /* Is a named type file */ +#define EREMOTEIO 121 /* Remote I/O error */ + +#define ELIBACC 122 /* Can not access a needed shared library */ +#define ELIBBAD 123 /* Accessing a corrupted shared library */ +#define ELIBSCN 124 /* .lib section in a.out corrupted */ +#define ELIBMAX 125 /* Attempting to link in too many shared libraries */ +#define ELIBEXEC 126 /* Cannot exec a shared library directly */ +#define ERESTART 127 /* Interrupted system call should be restarted */ +#define ESTRPIPE 128 /* Streams pipe error */ + +#define ENOMEDIUM 129 /* No medium found */ +#define EMEDIUMTYPE 130 /* Wrong medium type */ +#define ECANCELED 131 /* Operation Cancelled */ +#define ENOKEY 132 /* Required key not available */ +#define EKEYEXPIRED 133 /* Key has expired */ +#define EKEYREVOKED 134 /* Key has been revoked */ +#define EKEYREJECTED 135 /* Key was rejected by service */ + +/* for robust mutexes */ +#define EOWNERDEAD 136 /* Owner died */ +#define ENOTRECOVERABLE 137 /* State not recoverable */ + +#define ERFKILL 138 /* Operation not possible due to RF-kill */ + +#define EHWPOISON 139 /* Memory page has hardware error */ + +#endif diff --git a/arch/sw_64/include/uapi/asm/fcntl.h b/arch/sw_64/include/uapi/asm/fcntl.h new file mode 100644 index 000000000000..29c3aece8b55 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/fcntl.h @@ -0,0 +1,58 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_FCNTL_H +#define _UAPI_ASM_SW64_FCNTL_H + +#define O_CREAT 01000 /* not fcntl */ +#define O_TRUNC 02000 /* not fcntl */ +#define O_EXCL 04000 /* not fcntl */ +#define O_NOCTTY 010000 /* not fcntl */ + +#define O_NONBLOCK 00004 +#define O_APPEND 00010 +#define O_DSYNC 040000 /* used to be O_SYNC, see below */ +#define O_DIRECTORY 0100000 /* must be a directory */ +#define O_NOFOLLOW 0200000 /* don't follow links */ +#define O_LARGEFILE 0400000 /* will be set by the kernel on every open */ +#define O_DIRECT 02000000 /* direct disk access */ +#define O_NOATIME 04000000 +#define O_CLOEXEC 010000000 /* set close_on_exec */ +/* + * Before Linux 2.6.33 only O_DSYNC 
semantics were implemented, but using + the O_SYNC flag. We continue to use the existing numerical value + for O_DSYNC semantics now, but using the correct symbolic name for it. + This new value is used to request true POSIX O_SYNC semantics. It is + defined in this strange way to make sure applications compiled against + new headers get at least O_DSYNC semantics on older kernels. + * + * This has the nice side-effect that we can simply test for O_DSYNC + wherever we do not care if O_DSYNC or O_SYNC is used. + * + * Note: __O_SYNC must never be used directly. + */ +#define __O_SYNC 020000000 +#define O_SYNC (__O_SYNC|O_DSYNC) + +#define O_PATH 040000000 +#define __O_TMPFILE 0100000000 + +#define F_GETLK 7 +#define F_SETLK 8 +#define F_SETLKW 9 + +#define F_SETOWN 5 /* for sockets. */ +#define F_GETOWN 6 /* for sockets. */ +#define F_SETSIG 10 /* for sockets. */ +#define F_GETSIG 11 /* for sockets. */ + +/* for posix fcntl() and lockf() */ +#define F_RDLCK 1 +#define F_WRLCK 2 +#define F_UNLCK 8 + +/* for old implementation of BSD flock() */ +#define F_EXLCK 16 /* or 3 */ +#define F_SHLCK 32 /* or 4 */ + +#include <asm-generic/fcntl.h> + +#endif diff --git a/arch/sw_64/include/uapi/asm/fpu.h b/arch/sw_64/include/uapi/asm/fpu.h new file mode 100644 index 000000000000..9b25f97e6a3a --- /dev/null +++ b/arch/sw_64/include/uapi/asm/fpu.h @@ -0,0 +1,218 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_FPU_H +#define _UAPI_ASM_SW64_FPU_H + +/* + * SW-64 floating-point control register defines: + */ +#define FPCR_DNOD (1UL << 47) /* denorm INV trap disable */ +#define FPCR_DNZ (1UL << 48) /* denorms to zero */ +#define FPCR_INVD (1UL << 49) /* invalid op disable (opt.) */ +#define FPCR_DZED (1UL << 50) /* division by zero disable (opt.) */ +#define FPCR_OVFD (1UL << 51) /* overflow disable (optional) */ +#define FPCR_INV (1UL << 52) /* invalid operation */ +#define FPCR_DZE (1UL << 53) /* division by zero */ +#define FPCR_OVF (1UL << 54) /* overflow */ +#define FPCR_UNF (1UL << 55) /* underflow */ +#define FPCR_INE (1UL << 56) /* inexact */ +#define FPCR_IOV (1UL << 57) /* integer overflow */ +#define FPCR_UNDZ (1UL << 60) /* underflow to zero (opt.) */ +#define FPCR_UNFD (1UL << 61) /* underflow disable (opt.) */ +#define FPCR_INED (1UL << 62) /* inexact disable (opt.) */ +#define FPCR_SUM (1UL << 63) /* summary bit */ + +#define FPCR_DYN_SHIFT 58 /* first dynamic rounding mode bit */ +#define FPCR_DYN_CHOPPED (0x0UL << FPCR_DYN_SHIFT) /* towards 0 */ +#define FPCR_DYN_MINUS (0x1UL << FPCR_DYN_SHIFT) /* towards -INF */ +#define FPCR_DYN_NORMAL (0x2UL << FPCR_DYN_SHIFT) /* towards nearest */ +#define FPCR_DYN_PLUS (0x3UL << FPCR_DYN_SHIFT) /* towards +INF */ +#define FPCR_DYN_MASK (0x3UL << FPCR_DYN_SHIFT) + +#define FPCR_MASK 0xffff800000000000L +
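The dynamic rounding mode occupies the two-bit field at bits 58-59 defined above, so changing it in a saved FPCR image is a plain mask-and-set; a sketch (hypothetical helper name; moving the value to and from the live FPCR requires the architecture's FPCR access instructions, which are outside this header):

	/* Select round-to-nearest in a software FPCR image. */
	static inline unsigned long fpcr_set_round_nearest(unsigned long fpcr)
	{
		fpcr &= ~FPCR_DYN_MASK;		/* clear bits 58-59 */
		fpcr |= FPCR_DYN_NORMAL;	/* 0x2UL << 58 */
		return fpcr;
	}
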
+/* status bits coming from hardware fpcr, defined by fire3 */ +#define FPCR_STATUS_INV0 (1UL << 52) +#define FPCR_STATUS_DZE0 (1UL << 53) +#define FPCR_STATUS_OVF0 (1UL << 54) +#define FPCR_STATUS_UNF0 (1UL << 55) +#define FPCR_STATUS_INE0 (1UL << 56) +#define FPCR_STATUS_OVI0 (1UL << 57) + +#define FPCR_STATUS_INV1 (1UL << 36) +#define FPCR_STATUS_DZE1 (1UL << 37) +#define FPCR_STATUS_OVF1 (1UL << 38) +#define FPCR_STATUS_UNF1 (1UL << 39) +#define FPCR_STATUS_INE1 (1UL << 40) +#define FPCR_STATUS_OVI1 (1UL << 41) + +#define FPCR_STATUS_INV2 (1UL << 20) +#define FPCR_STATUS_DZE2 (1UL << 21) +#define FPCR_STATUS_OVF2 (1UL << 22) +#define FPCR_STATUS_UNF2 (1UL << 23) +#define FPCR_STATUS_INE2 (1UL << 24) +#define FPCR_STATUS_OVI2 (1UL << 25) + +#define FPCR_STATUS_INV3 (1UL << 4) +#define FPCR_STATUS_DZE3 (1UL << 5) +#define FPCR_STATUS_OVF3 (1UL << 6) +#define FPCR_STATUS_UNF3 (1UL << 7) +#define FPCR_STATUS_INE3 (1UL << 8) +#define FPCR_STATUS_OVI3 (1UL << 9) + +#define FPCR_STATUS_MASK0 (FPCR_STATUS_INV0 | FPCR_STATUS_DZE0 | \ + FPCR_STATUS_OVF0 | FPCR_STATUS_UNF0 | \ + FPCR_STATUS_INE0 | FPCR_STATUS_OVI0) + +#define FPCR_STATUS_MASK1 (FPCR_STATUS_INV1 | FPCR_STATUS_DZE1 | \ + FPCR_STATUS_OVF1 | FPCR_STATUS_UNF1 | \ + FPCR_STATUS_INE1 | FPCR_STATUS_OVI1) + +#define FPCR_STATUS_MASK2 (FPCR_STATUS_INV2 | FPCR_STATUS_DZE2 | \ + FPCR_STATUS_OVF2 | FPCR_STATUS_UNF2 | \ + FPCR_STATUS_INE2 | FPCR_STATUS_OVI2) + +#define FPCR_STATUS_MASK3 (FPCR_STATUS_INV3 | FPCR_STATUS_DZE3 | \ + FPCR_STATUS_OVF3 | FPCR_STATUS_UNF3 | \ + FPCR_STATUS_INE3 | FPCR_STATUS_OVI3) + + +/* + * IEEE trap enables are implemented in software. These per-thread + * bits are stored in the "ieee_state" field of "struct thread_info". + * Thus, the bits are defined so as not to conflict with the + * floating-point enable bit (which is architected). + */ +#define IEEE_TRAP_ENABLE_INV (1UL << 1) /* invalid op */ +#define IEEE_TRAP_ENABLE_DZE (1UL << 2) /* division by zero */ +#define IEEE_TRAP_ENABLE_OVF (1UL << 3) /* overflow */ +#define IEEE_TRAP_ENABLE_UNF (1UL << 4) /* underflow */ +#define IEEE_TRAP_ENABLE_INE (1UL << 5) /* inexact */ +#define IEEE_TRAP_ENABLE_DNO (1UL << 6) /* denorm */ +#define IEEE_TRAP_ENABLE_MASK (IEEE_TRAP_ENABLE_INV | IEEE_TRAP_ENABLE_DZE |\ + IEEE_TRAP_ENABLE_OVF | IEEE_TRAP_ENABLE_UNF |\ + IEEE_TRAP_ENABLE_INE | IEEE_TRAP_ENABLE_DNO) + +/* Denorm and Underflow flushing */ +#define IEEE_MAP_DMZ (1UL << 12) /* Map denorm inputs to zero */ +#define IEEE_MAP_UMZ (1UL << 13) /* Map underflowed outputs to zero */ + +#define IEEE_MAP_MASK (IEEE_MAP_DMZ | IEEE_MAP_UMZ) + +/* status bits coming from fpcr: */ +#define IEEE_STATUS_INV (1UL << 17) +#define IEEE_STATUS_DZE (1UL << 18) +#define IEEE_STATUS_OVF (1UL << 19) +#define IEEE_STATUS_UNF (1UL << 20) +#define IEEE_STATUS_INE (1UL << 21) +#define IEEE_STATUS_DNO (1UL << 22) + + +#define IEEE_STATUS_MASK (IEEE_STATUS_INV | IEEE_STATUS_DZE | \ + IEEE_STATUS_OVF | IEEE_STATUS_UNF | \ + IEEE_STATUS_INE | IEEE_STATUS_DNO) + +#define IEEE_SW_MASK (IEEE_TRAP_ENABLE_MASK | \ + IEEE_STATUS_MASK | IEEE_MAP_MASK) + +#define IEEE_CURRENT_RM_SHIFT 32 +#define IEEE_CURRENT_RM_MASK (3UL << IEEE_CURRENT_RM_SHIFT) + +#define IEEE_STATUS_TO_EXCSUM_SHIFT 16 + +#define IEEE_INHERIT (1UL << 63) /* inherit on thread create? */
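The EXCSUM shift above reflects a deliberate alignment: each software status bit sits exactly 16 bits above the matching trap-enable bit (IEEE_STATUS_INV at bit 17 lands on IEEE_TRAP_ENABLE_INV at bit 1, and likewise for DZE through DNO), so the conversion is one mask and shift. A sketch using only definitions from this header (helper name illustrative):

	/* Fold accumulated status bits down onto trap-enable positions. */
	static inline unsigned long ieee_status_to_excsum(unsigned long swcr)
	{
		return (swcr & IEEE_STATUS_MASK) >> IEEE_STATUS_TO_EXCSUM_SHIFT;
	}
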
+ +/* ieee_state expanded to support SIMD, added by fire3 */ + +#define IEEE_STATUS_INV0 (1UL << 17) +#define IEEE_STATUS_DZE0 (1UL << 18) +#define IEEE_STATUS_OVF0 (1UL << 19) +#define IEEE_STATUS_UNF0 (1UL << 20) +#define IEEE_STATUS_INE0 (1UL << 21) +#define IEEE_STATUS_DNO0 (1UL << 22) +#define IEEE_STATUS_MASK0 (IEEE_STATUS_INV0 | IEEE_STATUS_DZE0 | \ + IEEE_STATUS_OVF0 | IEEE_STATUS_UNF0 | \ + IEEE_STATUS_INE0 | IEEE_STATUS_DNO0) + +#define IEEE_STATUS0_TO_EXCSUM_SHIFT 16 + +#define IEEE_STATUS_INV1 (1UL << 23) +#define IEEE_STATUS_DZE1 (1UL << 24) +#define IEEE_STATUS_OVF1 (1UL << 25) +#define IEEE_STATUS_UNF1 (1UL << 26) +#define IEEE_STATUS_INE1 (1UL << 27) +#define IEEE_STATUS_DNO1 (1UL << 28) +#define IEEE_STATUS_MASK1 (IEEE_STATUS_INV1 | IEEE_STATUS_DZE1 | \ + IEEE_STATUS_OVF1 | IEEE_STATUS_UNF1 | \ + IEEE_STATUS_INE1 | IEEE_STATUS_DNO1) + +#define IEEE_STATUS1_TO_EXCSUM_SHIFT 22 + +#define IEEE_STATUS_INV2 (1UL << 34) +#define IEEE_STATUS_DZE2 (1UL << 35) +#define IEEE_STATUS_OVF2 (1UL << 36) +#define IEEE_STATUS_UNF2 (1UL << 37) +#define IEEE_STATUS_INE2 (1UL << 38) +#define IEEE_STATUS_DNO2 (1UL << 39) +#define IEEE_STATUS_MASK2 (IEEE_STATUS_INV2 | IEEE_STATUS_DZE2 | \ + IEEE_STATUS_OVF2 | IEEE_STATUS_UNF2 | \ + IEEE_STATUS_INE2 | IEEE_STATUS_DNO2) + +#define IEEE_STATUS2_TO_EXCSUM_SHIFT 33 + +#define IEEE_STATUS_INV3 (1UL << 40) +#define IEEE_STATUS_DZE3 (1UL << 41) +#define IEEE_STATUS_OVF3 (1UL << 42) +#define IEEE_STATUS_UNF3 (1UL << 43) +#define IEEE_STATUS_INE3 (1UL << 44) +#define IEEE_STATUS_DNO3 (1UL << 45) +#define IEEE_STATUS_MASK3 (IEEE_STATUS_INV3 | IEEE_STATUS_DZE3 | \ + IEEE_STATUS_OVF3 | IEEE_STATUS_UNF3 | \ + IEEE_STATUS_INE3 | IEEE_STATUS_DNO3) + +#define IEEE_STATUS3_TO_EXCSUM_SHIFT 39 + + +/* + * Convert the software IEEE trap enable and status bits into the + * hardware fpcr format. + */ +static inline unsigned long +ieee_swcr_to_fpcr(unsigned long sw) +{ + unsigned long fp; + + fp = (sw & IEEE_STATUS_MASK0) << 35; + fp |= (sw & IEEE_STATUS_MASK1) << 13; + fp |= (sw & IEEE_STATUS_MASK2) >> 14; + fp |= (sw & IEEE_STATUS_MASK3) >> 36; + + fp |= (sw & IEEE_MAP_DMZ) << 36; + fp |= (sw & IEEE_STATUS_MASK0 ? FPCR_SUM : 0); + fp |= (sw & IEEE_STATUS_MASK1 ? FPCR_SUM : 0); + fp |= (sw & IEEE_STATUS_MASK2 ? FPCR_SUM : 0); + fp |= (sw & IEEE_STATUS_MASK3 ? FPCR_SUM : 0); + fp |= (~sw & (IEEE_TRAP_ENABLE_INV + | IEEE_TRAP_ENABLE_DZE + | IEEE_TRAP_ENABLE_OVF)) << 48; + fp |= (~sw & (IEEE_TRAP_ENABLE_UNF | IEEE_TRAP_ENABLE_INE)) << 57; + fp |= (sw & IEEE_MAP_UMZ ? FPCR_UNDZ | FPCR_UNFD : 0); + fp |= (~sw & IEEE_TRAP_ENABLE_DNO) << 41; + return fp; +} + +static inline unsigned long +ieee_fpcr_to_swcr(unsigned long fp) +{ + unsigned long sw; + + sw = (fp >> 35) & IEEE_STATUS_MASK; + sw |= (fp >> 36) & IEEE_MAP_DMZ; + sw |= (~fp >> 48) & (IEEE_TRAP_ENABLE_INV + | IEEE_TRAP_ENABLE_DZE + | IEEE_TRAP_ENABLE_OVF); + sw |= (~fp >> 57) & (IEEE_TRAP_ENABLE_UNF | IEEE_TRAP_ENABLE_INE); + sw |= (fp >> 47) & IEEE_MAP_UMZ; + sw |= (~fp >> 41) & IEEE_TRAP_ENABLE_DNO; + return sw; +} +#endif /* _UAPI_ASM_SW64_FPU_H */ diff --git a/arch/sw_64/include/uapi/asm/gentrap.h b/arch/sw_64/include/uapi/asm/gentrap.h new file mode 100644 index 000000000000..4345058291fb --- /dev/null +++ b/arch/sw_64/include/uapi/asm/gentrap.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_GENTRAP_H +#define _UAPI_ASM_SW64_GENTRAP_H + +/* + * Definitions for gentrap causes. 
They are generated by user-level + * programs and therefore should be compatible with the corresponding + * legacy definitions. + */ +#define GEN_INTOVF -1 /* integer overflow */ +#define GEN_INTDIV -2 /* integer division by zero */ +#define GEN_FLTOVF -3 /* fp overflow */ +#define GEN_FLTDIV -4 /* fp division by zero */ +#define GEN_FLTUND -5 /* fp underflow */ +#define GEN_FLTINV -6 /* invalid fp operand */ +#define GEN_FLTINE -7 /* inexact fp operand */ +#define GEN_DECOVF -8 /* decimal overflow (for COBOL??) */ +#define GEN_DECDIV -9 /* decimal division by zero */ +#define GEN_DECINV -10 /* invalid decimal operand */ +#define GEN_ROPRAND -11 /* reserved operand */ +#define GEN_ASSERTERR -12 /* assertion error */ +#define GEN_NULPTRERR -13 /* null pointer error */ +#define GEN_STKOVF -14 /* stack overflow */ +#define GEN_STRLENERR -15 /* string length error */ +#define GEN_SUBSTRERR -16 /* substring error */ +#define GEN_RANGERR -17 /* range error */ +#define GEN_SUBRNG -18 +#define GEN_SUBRNG1 -19 +#define GEN_SUBRNG2 -20 +#define GEN_SUBRNG3 -21 /* these report range errors for */ +#define GEN_SUBRNG4 -22 /* subscripting (indexing) at levels 0..7 */ +#define GEN_SUBRNG5 -23 +#define GEN_SUBRNG6 -24 +#define GEN_SUBRNG7 -25 + +/* the remaining codes (-26..-1023) are reserved. */ + +#endif /* _UAPI_ASM_SW64_GENTRAP_H */ diff --git a/arch/sw_64/include/uapi/asm/hmcall.h b/arch/sw_64/include/uapi/asm/hmcall.h new file mode 100644 index 000000000000..524101102fb8 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/hmcall.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_HMCALL_H +#define _UAPI_ASM_SW64_HMCALL_H + +/* hmcall may be used in user mode */ + +#define HMC_bpt 0x80 +#define HMC_callsys 0x83 +#define HMC_imb 0x86 +#define HMC_rdunique 0x9E +#define HMC_wrunique 0x9F +#define HMC_gentrap 0xAA +#define HMC_wrperfmon 0xB0 + +#endif diff --git a/arch/sw_64/include/uapi/asm/ioctl.h b/arch/sw_64/include/uapi/asm/ioctl.h new file mode 100644 index 000000000000..d62f10a6fa64 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/ioctl.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_IOCTL_H +#define _UAPI_ASM_SW64_IOCTL_H + +#define _IOC_SIZEBITS 13 +#define _IOC_DIRBITS 3 + +/* + * Direction bits _IOC_NONE could be 0, but legacy version gives it a bit. + * And this turns out useful to catch old ioctl numbers in header files for + * us. 
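With these overrides in place, the generic _IO/_IOR/_IOW macros work as usual, only with a 13-bit size field and the non-zero direction values defined just below; a purely hypothetical driver command for illustration:

	#include <asm/ioctl.h>

	/* Hypothetical example: direction _IOC_WRITE (4U) goes in the
	 * 3 direction bits; the 13-bit size field holds sizeof(int). */
	#define MYDRV_SET_MODE _IOW('M', 0, int)
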
+ */ +#define _IOC_NONE 1U +#define _IOC_READ 2U +#define _IOC_WRITE 4U + +#include <asm-generic/ioctl.h> + +#endif /* _UAPI_ASM_SW64_IOCTL_H */ diff --git a/arch/sw_64/include/uapi/asm/ioctls.h b/arch/sw_64/include/uapi/asm/ioctls.h new file mode 100644 index 000000000000..eab34173f222 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/ioctls.h @@ -0,0 +1,123 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_IOCTLS_H +#define _UAPI_ASM_SW64_IOCTLS_H + +#include <asm/ioctl.h> + +#define FIOCLEX _IO('f', 1) +#define FIONCLEX _IO('f', 2) +#define FIOASYNC _IOW('f', 125, int) +#define FIONBIO _IOW('f', 126, int) +#define FIONREAD _IOR('f', 127, int) +#define TIOCINQ FIONREAD +#define FIOQSIZE _IOR('f', 128, loff_t) + +#define TIOCGETP _IOR('t', 8, struct sgttyb) +#define TIOCSETP _IOW('t', 9, struct sgttyb) +#define TIOCSETN _IOW('t', 10, struct sgttyb) /* TIOCSETP wo flush */ + +#define TIOCSETC _IOW('t', 17, struct tchars) +#define TIOCGETC _IOR('t', 18, struct tchars) +#define TCGETS _IOR('t', 19, struct termios) +#define TCSETS _IOW('t', 20, struct termios) +#define TCSETSW _IOW('t', 21, struct termios) +#define TCSETSF _IOW('t', 22, struct termios) + +#define TCGETA _IOR('t', 23, struct termio) +#define TCSETA _IOW('t', 24, struct termio) +#define TCSETAW _IOW('t', 25, struct termio) +#define TCSETAF _IOW('t', 28, struct termio) + +#define TCSBRK _IO('t', 29) +#define TCXONC _IO('t', 30) +#define TCFLSH _IO('t', 31) + +#define TIOCSWINSZ _IOW('t', 103, struct winsize) +#define TIOCGWINSZ _IOR('t', 104, struct winsize) +#define TIOCSTART _IO('t', 110) /* start output, like ^Q */ +#define TIOCSTOP _IO('t', 111) /* stop output, like ^S */ +#define TIOCOUTQ _IOR('t', 115, int) /* output queue size */ + +#define TIOCGLTC _IOR('t', 116, struct ltchars) +#define TIOCSLTC _IOW('t', 117, struct ltchars) +#define TIOCSPGRP _IOW('t', 118, int) +#define TIOCGPGRP _IOR('t', 119, int) + +#define TIOCEXCL 0x540C +#define TIOCNXCL 0x540D +#define TIOCSCTTY 0x540E + +#define TIOCSTI 0x5412 +#define TIOCMGET 0x5415 +#define TIOCMBIS 0x5416 +#define TIOCMBIC 0x5417 +#define TIOCMSET 0x5418 +# define TIOCM_LE 0x001 +# define TIOCM_DTR 0x002 +# define TIOCM_RTS 0x004 +# define TIOCM_ST 0x008 +# define TIOCM_SR 0x010 +# define TIOCM_CTS 0x020 +# define TIOCM_CAR 0x040 +# define TIOCM_RNG 0x080 +# define TIOCM_DSR 0x100 +# define TIOCM_CD TIOCM_CAR +# define TIOCM_RI TIOCM_RNG +# define TIOCM_OUT1 0x2000 +# define TIOCM_OUT2 0x4000 +# define TIOCM_LOOP 0x8000 + +#define TIOCGSOFTCAR 0x5419 +#define TIOCSSOFTCAR 0x541A +#define TIOCLINUX 0x541C +#define TIOCCONS 0x541D +#define TIOCGSERIAL 0x541E +#define TIOCSSERIAL 0x541F +#define TIOCPKT 0x5420 +# define TIOCPKT_DATA 0 +# define TIOCPKT_FLUSHREAD 1 +# define TIOCPKT_FLUSHWRITE 2 +# define TIOCPKT_STOP 4 +# define TIOCPKT_START 8 +# define TIOCPKT_NOSTOP 16 +# define TIOCPKT_DOSTOP 32 +# define TIOCPKT_IOCTL 64 + + +#define TIOCNOTTY 0x5422 +#define TIOCSETD 0x5423 +#define TIOCGETD 0x5424 +#define TCSBRKP 0x5425 /* Needed for POSIX tcsendbreak() */ +#define TIOCSBRK 0x5427 /* BSD compatibility */ +#define TIOCCBRK 0x5428 /* BSD compatibility */ +#define TIOCGSID 0x5429 /* Return the session ID of FD */ +#define TIOCGRS485 _IOR('T', 0x2E, struct serial_rs485) +#define TIOCSRS485 _IOWR('T', 0x2F, struct serial_rs485) +#define TIOCGPTN _IOR('T', 0x30, unsigned int) /* Get Pty Number (of pty-mux device) */ +#define TIOCSPTLCK _IOW('T', 0x31, int) /* Lock/unlock Pty */ +#define TIOCGDEV _IOR('T', 0x32, unsigned int) /* Get primary device node of 
/dev/console */ +#define TIOCSIG _IOW('T', 0x36, int) /* Generate signal on Pty slave */ +#define TIOCVHANGUP 0x5437 +#define TIOCGPKT _IOR('T', 0x38, int) /* Get packet mode state */ +#define TIOCGPTLCK _IOR('T', 0x39, int) /* Get Pty lock state */ +#define TIOCGEXCL _IOR('T', 0x40, int) /* Get exclusive mode state */ +#define TIOCGPTPEER _IO('T', 0x41) /* Safely open the slave */ +#define TIOCGISO7816 _IOR('T', 0x42, struct serial_iso7816) +#define TIOCSISO7816 _IOWR('T', 0x43, struct serial_iso7816) + +#define TIOCSERCONFIG 0x5453 +#define TIOCSERGWILD 0x5454 +#define TIOCSERSWILD 0x5455 +#define TIOCGLCKTRMIOS 0x5456 +#define TIOCSLCKTRMIOS 0x5457 +#define TIOCSERGSTRUCT 0x5458 /* For debugging only */ +#define TIOCSERGETLSR 0x5459 /* Get line status register */ +/* ioctl (fd, TIOCSERGETLSR, &result) where result may be as below */ +# define TIOCSER_TEMT 0x01 /* Transmitter physically empty */ +#define TIOCSERGETMULTI 0x545A /* Get multiport config */ +#define TIOCSERSETMULTI 0x545B /* Set multiport config */ + +#define TIOCMIWAIT 0x545C /* wait for a change on serial input line(s) */ +#define TIOCGICOUNT 0x545D /* read serial port inline interrupt counts */ + +#endif /* _UAPI_ASM_SW64_IOCTLS_H */ diff --git a/arch/sw_64/include/uapi/asm/ipcbuf.h b/arch/sw_64/include/uapi/asm/ipcbuf.h new file mode 100644 index 000000000000..f063105ba09f --- /dev/null +++ b/arch/sw_64/include/uapi/asm/ipcbuf.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_IPCBUF_H +#define _UAPI_ASM_SW64_IPCBUF_H + +#include <asm-generic/ipcbuf.h> + +#endif diff --git a/arch/sw_64/include/uapi/asm/kvm.h b/arch/sw_64/include/uapi/asm/kvm.h new file mode 100644 index 000000000000..47877b56e980 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/kvm.h @@ -0,0 +1,129 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_KVM_H +#define _UAPI_ASM_SW64_KVM_H + +/* + * KVM SW specific structures and definitions. 
+ */ +#define SWVM_IRQS 64 +enum SW64_KVM_IRQ { + SW64_KVM_IRQ_IPI = 27, + SW64_KVM_IRQ_TIMER = 9, + SW64_KVM_IRQ_KBD = 29, + SW64_KVM_IRQ_MOUSE = 30, +}; + +#define SWVM_VM_TYPE_DEFAULT 0 +#define SWVM_VM_TYPE_PHYVCPU 1 +#define __KVM_HAVE_IRQ_LINE + +#define SWVM_NUM_NUMA_MEMBANKS 1 +#define KVM_NR_IRQCHIPS 1 +/* + * for KVM_GET_REGS and KVM_SET_REGS + */ +struct kvm_regs { + unsigned long r0; + unsigned long r1; + unsigned long r2; + unsigned long r3; + + unsigned long r4; + unsigned long r5; + unsigned long r6; + unsigned long r7; + + unsigned long r8; + unsigned long r9; + unsigned long r10; + unsigned long r11; + + unsigned long r12; + unsigned long r13; + unsigned long r14; + unsigned long r15; + + unsigned long r19; + unsigned long r20; + unsigned long r21; + unsigned long r22; + + unsigned long r23; + unsigned long r24; + unsigned long r25; + unsigned long r26; + + unsigned long r27; + unsigned long r28; + unsigned long __padding0; + unsigned long fpcr; + + unsigned long fp[124]; + /* These are saved by HMcode: */ + unsigned long ps; + unsigned long pc; + unsigned long gp; + unsigned long r16; + unsigned long r17; + unsigned long r18; +}; + + +/* + * return stack for __sw64_vcpu_run + */ +struct vcpu_run_ret_stack { + unsigned long ra; + unsigned long r0; +}; + +struct host_int_args { + unsigned long r18; + unsigned long r17; + unsigned long r16; +}; + +/* + * for KVM_GET_FPU and KVM_SET_FPU + */ +struct kvm_fpu { +}; + +struct hcall_args { + unsigned long arg0, arg1, arg2; +}; + +struct phyvcpu_hcall_args { + unsigned long call; + struct hcall_args args; +}; + +struct kvm_debug_exit_arch { + unsigned long epc; +}; + +/* for KVM_SET_GUEST_DEBUG */ +struct kvm_guest_debug_arch { +}; + +/* definition of registers in kvm_run */ +struct kvm_sync_regs { +}; + +/* dummy definition */ +struct kvm_sregs { +}; + + +struct swvm_mem_bank { + unsigned long guest_phys_addr; + unsigned long host_phys_addr; + unsigned long host_addr; + unsigned long size; +}; + +struct swvm_mem { + struct swvm_mem_bank membank[SWVM_NUM_NUMA_MEMBANKS]; +}; + +#endif /* _UAPI_ASM_SW64_KVM_H */ diff --git a/arch/sw_64/include/uapi/asm/kvm_para.h b/arch/sw_64/include/uapi/asm/kvm_para.h new file mode 100644 index 000000000000..405840b0e1d8 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/kvm_para.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_KVM_PARA_H +#define _UAPI_ASM_SW64_KVM_PARA_H + +#include <asm-generic/kvm_para.h> + +#endif diff --git a/arch/sw_64/include/uapi/asm/mman.h b/arch/sw_64/include/uapi/asm/mman.h new file mode 100644 index 000000000000..f9ac285702a5 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/mman.h @@ -0,0 +1,83 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_MMAN_H +#define _UAPI_ASM_SW64_MMAN_H + +#define PROT_READ 0x1 /* page can be read */ +#define PROT_WRITE 0x2 /* page can be written */ +#define PROT_EXEC 0x4 /* page can be executed */ +#define PROT_SEM 0x8 /* page may be used for atomic ops */ +#define PROT_NONE 0x0 /* page can not be accessed */ +#define PROT_GROWSDOWN 0x01000000 /* mprotect flag: extend change to start of growsdown vma */ +#define PROT_GROWSUP 0x02000000 /* mprotect flag: extend change to end of growsup vma */ + +#define MAP_TYPE 0x0f /* Mask for type of mapping */ +#define MAP_FIXED 0x100 /* Interpret addr exactly */ +#define MAP_ANONYMOUS 0x10 /* don't use a file */ + +/* not used by linux, may be deprecated */ +#define _MAP_HASSEMAPHORE 0x0200 +#define _MAP_INHERIT 0x0400 +#define _MAP_UNALIGNED 0x0800 + +/* 
These are linux-specific */ +#define MAP_GROWSDOWN 0x01000 /* stack-like segment */ +#define MAP_DENYWRITE 0x02000 /* ETXTBSY */ +#define MAP_EXECUTABLE 0x04000 /* mark it as an executable */ +#define MAP_LOCKED 0x08000 /* lock the mapping */ +#define MAP_NORESERVE 0x10000 /* don't check for reservations */ +#define MAP_POPULATE 0x20000 /* populate (prefault) pagetables */ +#define MAP_NONBLOCK 0x40000 /* do not block on IO */ +#define MAP_STACK 0x80000 /* give out an address that is best suited for process/thread stacks */ +#define MAP_HUGETLB 0x100000 /* create a huge page mapping */ +#define MAP_FIXED_NOREPLACE 0x200000 /* MAP_FIXED which doesn't unmap underlying mapping */ +#define MAP_PA32BIT 0x400000 /* physical address is within 4G */ +#define MAP_CHECKNODE 0x800000 /* hugetlb numa node check */ + +#define MS_ASYNC 1 /* sync memory asynchronously */ +#define MS_SYNC 2 /* synchronous memory sync */ +#define MS_INVALIDATE 4 /* invalidate the caches */ + +#define MCL_CURRENT 8192 /* lock all currently mapped pages */ +#define MCL_FUTURE 16384 /* lock all additions to address space */ +#define MCL_ONFAULT 32768 /* lock all pages that are faulted in */ + +#define MLOCK_ONFAULT 0x01 /* Lock pages in range after they are faulted in, do not prefault */ + +#define MADV_NORMAL 0 /* no further special treatment */ +#define MADV_RANDOM 1 /* expect random page references */ +#define MADV_SEQUENTIAL 2 /* expect sequential page references */ +#define MADV_WILLNEED 3 /* will need these pages */ +#define MADV_SPACEAVAIL 5 /* ensure resources are available */ +#define MADV_DONTNEED 6 /* don't need these pages */ + +/* common/generic parameters */ +#define MADV_FREE 8 /* free pages only if memory pressure */ +#define MADV_REMOVE 9 /* remove these pages & resources */ +#define MADV_DONTFORK 10 /* don't inherit across fork */ +#define MADV_DOFORK 11 /* do inherit across fork */ + +#define MADV_MERGEABLE 12 /* KSM may merge identical pages */ +#define MADV_UNMERGEABLE 13 /* KSM may not merge identical pages */ + +#define MADV_HUGEPAGE 14 /* Worth backing with hugepages */ +#define MADV_NOHUGEPAGE 15 /* Not worth backing with hugepages */ + +#define MADV_DONTDUMP 16 /* Explicitly exclude from the core dump, + overrides the coredump filter bits */ +#define MADV_DODUMP 17 /* Clear the MADV_DONTDUMP flag */ + +#define MADV_WIPEONFORK 18 /* Zero memory on fork, child only */ +#define MADV_KEEPONFORK 19 /* Undo MADV_WIPEONFORK */ + +#define MADV_COLD 20 /* deactivate these pages */ +#define MADV_PAGEOUT 21 /* reclaim these pages */ + +/* compatibility flags */ +#define MAP_FILE 0 + + +#define PKEY_DISABLE_ACCESS 0x1 +#define PKEY_DISABLE_WRITE 0x2 +#define PKEY_ACCESS_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE) + +#endif /* _UAPI_ASM_SW64_MMAN_H */ diff --git a/arch/sw_64/include/uapi/asm/msgbuf.h b/arch/sw_64/include/uapi/asm/msgbuf.h new file mode 100644 index 000000000000..d61eea10813d --- /dev/null +++ b/arch/sw_64/include/uapi/asm/msgbuf.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_MSGBUF_H +#define _UAPI_ASM_SW64_MSGBUF_H + +/* + * The msqid64_ds structure for sw64 architecture. + * Note extra padding because this structure is passed back and forth + * between kernel and user space. 
+ * + * Pad space is left for: + * - 2 miscellaneous 64-bit values + */ + +struct msqid64_ds { + struct ipc64_perm msg_perm; + long msg_stime; /* last msgsnd time */ + long msg_rtime; /* last msgrcv time */ + long msg_ctime; /* last change time */ + unsigned long msg_cbytes; /* current number of bytes on queue */ + unsigned long msg_qnum; /* number of messages in queue */ + unsigned long msg_qbytes; /* max number of bytes on queue */ + __kernel_pid_t msg_lspid; /* pid of last msgsnd */ + __kernel_pid_t msg_lrpid; /* last receive pid */ + unsigned long __unused1; + unsigned long __unused2; +}; + +#endif /* _UAPI_ASM_SW64_MSGBUF_H */ diff --git a/arch/sw_64/include/uapi/asm/param.h b/arch/sw_64/include/uapi/asm/param.h new file mode 100644 index 000000000000..75eeac6a7dc8 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/param.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_PARAM_H +#define _UAPI_ASM_SW64_PARAM_H + +#define HZ 100 + +#define EXEC_PAGESIZE 8192 + +#ifndef NOGROUP +#define NOGROUP (-1) +#endif + +#define MAXHOSTNAMELEN 64 /* max length of hostname */ + + +#endif /* _UAPI_ASM_SW64_PARAM_H */ diff --git a/arch/sw_64/include/uapi/asm/perf_regs.h b/arch/sw_64/include/uapi/asm/perf_regs.h new file mode 100644 index 000000000000..426ae642fcc8 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/perf_regs.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +#ifndef _ASM_SW64_PERF_REGS_H +#define _ASM_SW64_PERF_REGS_H + +enum perf_event_sw64_regs { + PERF_REG_SW64_R0, + PERF_REG_SW64_R1, + PERF_REG_SW64_R2, + PERF_REG_SW64_R3, + PERF_REG_SW64_R4, + PERF_REG_SW64_R5, + PERF_REG_SW64_R6, + PERF_REG_SW64_R7, + PERF_REG_SW64_R8, + PERF_REG_SW64_R19, + PERF_REG_SW64_R20, + PERF_REG_SW64_R21, + PERF_REG_SW64_R22, + PERF_REG_SW64_R23, + PERF_REG_SW64_R24, + PERF_REG_SW64_R25, + PERF_REG_SW64_R26, + PERF_REG_SW64_R27, + PERF_REG_SW64_R28, + PERF_REG_SW64_HAE, + PERF_REG_SW64_TRAP_A0, + PERF_REG_SW64_TRAP_A1, + PERF_REG_SW64_TRAP_A2, + PERF_REG_SW64_PS, + PERF_REG_SW64_PC, + PERF_REG_SW64_GP, + PERF_REG_SW64_R16, + PERF_REG_SW64_R17, + PERF_REG_SW64_R18, + PERF_REG_SW64_MAX, +}; +#endif /* _ASM_SW64_PERF_REGS_H */ diff --git a/arch/sw_64/include/uapi/asm/poll.h b/arch/sw_64/include/uapi/asm/poll.h new file mode 100644 index 000000000000..5e2de3182050 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/poll.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_POLL_H +#define _UAPI_ASM_SW64_POLL_H + +#include <asm-generic/poll.h> + +#endif diff --git a/arch/sw_64/include/uapi/asm/posix_types.h b/arch/sw_64/include/uapi/asm/posix_types.h new file mode 100644 index 000000000000..fb7badf78c3c --- /dev/null +++ b/arch/sw_64/include/uapi/asm/posix_types.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_POSIX_TYPES_H +#define _UAPI_ASM_SW64_POSIX_TYPES_H + +/* + * This file is generally used by user-level software, so you need to + * be a little careful about namespace pollution etc. Also, we cannot + * assume GCC is being used. 
+ */ + +typedef unsigned long __kernel_ino_t; +#define __kernel_ino_t __kernel_ino_t + +typedef unsigned long __kernel_sigset_t; /* at least 32 bits */ + +#include <asm-generic/posix_types.h> + +#endif /* _UAPI_ASM_SW64_POSIX_TYPES_H */ diff --git a/arch/sw_64/include/uapi/asm/ptrace.h b/arch/sw_64/include/uapi/asm/ptrace.h new file mode 100644 index 000000000000..7cf7bf5a75b4 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/ptrace.h @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_PTRACE_H +#define _UAPI_ASM_SW64_PTRACE_H + + +/* + * This struct defines the way the registers are stored on the + * kernel stack during a system call or other kernel entry. + * + * NOTE! I want to minimize the overhead of system calls, so this + * struct has as little information as possible. It does not have + * + * - floating point regs: the kernel doesn't change those + * - r9-15: saved by the C compiler + * + * This makes "fork()" and "exec()" a bit more complex, but should + * give us low system call latency. + */ + +struct pt_regs { + unsigned long r0; + unsigned long r1; + unsigned long r2; + unsigned long r3; + unsigned long r4; + unsigned long r5; + unsigned long r6; + unsigned long r7; + unsigned long r8; + unsigned long r19; + unsigned long r20; + unsigned long r21; + unsigned long r22; + unsigned long r23; + unsigned long r24; + unsigned long r25; + unsigned long r26; + unsigned long r27; + unsigned long r28; + unsigned long hae; +/* JRP - These are the values provided to a0-a2 by HMcode */ + unsigned long trap_a0; + unsigned long trap_a1; + unsigned long trap_a2; +/* These are saved by HMcode: */ + unsigned long ps; + unsigned long pc; + unsigned long gp; + unsigned long r16; + unsigned long r17; + unsigned long r18; +}; + +/* + * This is the extended stack used by signal handlers and the context + * switcher: it's pushed after the normal "struct pt_regs". + */ +struct switch_stack { + unsigned long r9; + unsigned long r10; + unsigned long r11; + unsigned long r12; + unsigned long r13; + unsigned long r14; + unsigned long r15; + unsigned long r26; +}; + +#define PTRACE_GETREGS 12 /* get general purpose registers */ +#define PTRACE_SETREGS 13 /* set general purpose registers */ +#define PTRACE_GETFPREGS 14 /* get floating-point registers */ +#define PTRACE_SETFPREGS 15 /* set floating-point registers */ +/* PTRACE_ATTACH is 16 */ +/* PTRACE_DETACH is 17 */ + +#define REG_BASE 0 +#define REG_END 29 +#define USP 30 +#define FPREG_BASE 32 +#define FPREG_END 62 +#define FPCR 63 +#define PC 64 +#define UNIQUE 65 +#define VECREG_BASE 67 +#define VECREG_END 161 +#define F31_V1 98 +#define F31_V2 130 +#define DA_MATCH 163 +#define DA_MASK 164 +#define DV_MATCH 165 +#define DV_MASK 166 +#define DC_CTL 167 + +#endif /* _UAPI_ASM_SW64_PTRACE_H */ diff --git a/arch/sw_64/include/uapi/asm/reg.h b/arch/sw_64/include/uapi/asm/reg.h new file mode 100644 index 000000000000..a19dc4cbf744 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/reg.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_REG_H +#define _UAPI_ASM_SW64_REG_H + +/* + * Exception frame offsets. 
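The numeric table in ptrace.h above gives the "addresses" a debugger passes to PTRACE_PEEKUSER/PTRACE_POKEUSER. Assuming sw64 keeps the usual convention of register-index addresses (as Alpha did), reading a stopped child's pc looks like this sketch:

	#include <sys/types.h>
	#include <sys/ptrace.h>
	#include <errno.h>

	static long read_child_pc(pid_t pid)
	{
		errno = 0;
		/* 64 == PC in the table above */
		return ptrace(PTRACE_PEEKUSER, pid, (void *)64L, NULL);
	}
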
+ */ +#define EF_V0 0 +#define EF_T0 1 +#define EF_T1 2 +#define EF_T2 3 +#define EF_T3 4 +#define EF_T4 5 +#define EF_T5 6 +#define EF_T6 7 +#define EF_T7 8 +#define EF_S0 9 +#define EF_S1 10 +#define EF_S2 11 +#define EF_S3 12 +#define EF_S4 13 +#define EF_S5 14 +#define EF_S6 15 +#define EF_A3 16 +#define EF_A4 17 +#define EF_A5 18 +#define EF_T8 19 +#define EF_T9 20 +#define EF_T10 21 +#define EF_T11 22 +#define EF_RA 23 +#define EF_T12 24 +#define EF_AT 25 +#define EF_SP 26 +#define EF_PS 27 +#define EF_PC 28 +#define EF_GP 29 +#define EF_A0 30 +#define EF_A1 31 +#define EF_A2 32 + +#define EF_SIZE (33*8) +#define HWEF_SIZE (6*8) /* size of HMCODE frame (PS-A2) */ + +#define EF_SSIZE (EF_SIZE - HWEF_SIZE) + +/* + * Map register number into core file offset. + */ +#define CORE_REG(reg, ubase) \ + (((unsigned long *)((unsigned long)(ubase)))[reg]) + +#endif /* _UAPI_ASM_SW64_REG_H */ diff --git a/arch/sw_64/include/uapi/asm/regdef.h b/arch/sw_64/include/uapi/asm/regdef.h new file mode 100644 index 000000000000..5031abc0947a --- /dev/null +++ b/arch/sw_64/include/uapi/asm/regdef.h @@ -0,0 +1,45 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_REGDEF_H +#define _UAPI_ASM_SW64_REGDEF_H + +#define v0 $0 /* function return value */ + +#define t0 $1 /* temporary registers (caller-saved) */ +#define t1 $2 +#define t2 $3 +#define t3 $4 +#define t4 $5 +#define t5 $6 +#define t6 $7 +#define t7 $8 + +#define s0 $9 /* saved-registers (callee-saved registers) */ +#define s1 $10 +#define s2 $11 +#define s3 $12 +#define s4 $13 +#define s5 $14 +#define s6 $15 +#define fp s6 /* frame-pointer (s6 in frame-less procedures) */ + +#define a0 $16 /* argument registers (caller-saved) */ +#define a1 $17 +#define a2 $18 +#define a3 $19 +#define a4 $20 +#define a5 $21 + +#define t8 $22 /* more temps (caller-saved) */ +#define t9 $23 +#define t10 $24 +#define t11 $25 +#define ra $26 /* return address register */ +#define t12 $27 + +#define pv t12 /* procedure-variable register */ +#define AT $at /* assembler temporary */ +#define gp $29 /* global pointer */ +#define sp $30 /* stack pointer */ +#define zero $31 /* reads as zero, writes are noops */ + +#endif /* _UAPI_ASM_SW64_REGDEF_H */ diff --git a/arch/sw_64/include/uapi/asm/resource.h b/arch/sw_64/include/uapi/asm/resource.h new file mode 100644 index 000000000000..ff7dc683c195 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/resource.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_RESOURCE_H +#define _UAPI_ASM_SW64_RESOURCE_H + +/* + * SW-64/Linux-specific ordering of these four resource limit IDs, + * the rest comes from the generic header: + */ +#define RLIMIT_NOFILE 6 /* max number of open files */ +#define RLIMIT_AS 7 /* address space limit */ +#define RLIMIT_NPROC 8 /* max number of processes */ +#define RLIMIT_MEMLOCK 9 /* max locked-in-memory address space */ + +/* + * SuS says limits have to be unsigned. Fine, it's unsigned, but + * we retain the old value for compatibility, especially with DU. + * When you run into the 2^63 barrier, you call me. 
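CORE_REG in reg.h above simply treats the saved frame as an array of quadwords indexed by the EF_* constants; for instance, pulling the return address out of a dumped frame (buffer name hypothetical):

	/* ef points at the 33-quadword exception frame in a core file. */
	static unsigned long core_frame_ra(void *ef)
	{
		return CORE_REG(EF_RA, ef);	/* slot 23 */
	}
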
+ */ +#define RLIM_INFINITY 0x7ffffffffffffffful + +#include <asm-generic/resource.h> + +#endif /* _UAPI_ASM_SW64_RESOURCE_H */ diff --git a/arch/sw_64/include/uapi/asm/sembuf.h b/arch/sw_64/include/uapi/asm/sembuf.h new file mode 100644 index 000000000000..f574390bcd57 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/sembuf.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_SEMBUF_H +#define _UAPI_ASM_SW64_SEMBUF_H + +/* + * The semid64_ds structure for sw64 architecture. + * Note extra padding because this structure is passed back and forth + * between kernel and user space. + * + * Pad space is left for: + * - 2 miscellaneous 64-bit values + */ + +struct semid64_ds { + struct ipc64_perm sem_perm; /* permissions .. see ipc.h */ + long sem_otime; /* last semop time */ + long sem_ctime; /* last change time */ + unsigned long sem_nsems; /* no. of semaphores in array */ + unsigned long __unused1; + unsigned long __unused2; +}; + +#endif /* _UAPI_ASM_SW64_SEMBUF_H */ diff --git a/arch/sw_64/include/uapi/asm/setup.h b/arch/sw_64/include/uapi/asm/setup.h new file mode 100644 index 000000000000..fefd57415a3b --- /dev/null +++ b/arch/sw_64/include/uapi/asm/setup.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_SETUP_H +#define _UAPI_ASM_SW64_SETUP_H + +#define COMMAND_LINE_SIZE 2048 + +#endif diff --git a/arch/sw_64/include/uapi/asm/shmbuf.h b/arch/sw_64/include/uapi/asm/shmbuf.h new file mode 100644 index 000000000000..66d8cb5b2ba3 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/shmbuf.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_SHMBUF_H +#define _UAPI_ASM_SW64_SHMBUF_H + +/* + * The shmid64_ds structure for sw64 architecture. + * Note extra padding because this structure is passed back and forth + * between kernel and user space. + * + * Pad space is left for: + * - 2 miscellaneous 64-bit values + */ + +struct shmid64_ds { + struct ipc64_perm shm_perm; /* operation perms */ + size_t shm_segsz; /* size of segment (bytes) */ + long shm_atime; /* last attach time */ + long shm_dtime; /* last detach time */ + long shm_ctime; /* last change time */ + __kernel_pid_t shm_cpid; /* pid of creator */ + __kernel_pid_t shm_lpid; /* pid of last operator */ + unsigned long shm_nattch; /* no. of current attaches */ + unsigned long __unused1; + unsigned long __unused2; +}; + +struct shminfo64 { + unsigned long shmmax; + unsigned long shmmin; + unsigned long shmmni; + unsigned long shmseg; + unsigned long shmall; + unsigned long __unused1; + unsigned long __unused2; + unsigned long __unused3; + unsigned long __unused4; +}; + +#endif /* _UAPI_ASM_SW64_SHMBUF_H */ diff --git a/arch/sw_64/include/uapi/asm/sigcontext.h b/arch/sw_64/include/uapi/asm/sigcontext.h new file mode 100644 index 000000000000..c2b7cff884eb --- /dev/null +++ b/arch/sw_64/include/uapi/asm/sigcontext.h @@ -0,0 +1,35 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_SIGCONTEXT_H +#define _UAPI_ASM_SW64_SIGCONTEXT_H + +struct sigcontext { + /* + * What should we have here? I'd probably better use the same + * stack layout as DEC Unix, just in case we ever want to try + * running their binaries.. + * + * This is the basic layout, but I don't know if we'll ever + * actually fill in all the values.. 
+ */
+	long sc_onstack;
+	long sc_mask;
+	long sc_pc;
+	long sc_ps;
+	long sc_regs[32];
+	long sc_ownedfp;
+	long sc_fpregs[128];	/* SIMD-FP */
+	unsigned long sc_fpcr;
+	unsigned long sc_fp_control;
+	unsigned long sc_reserved1, sc_reserved2;
+	unsigned long sc_ssize;
+	char *sc_sbase;
+	unsigned long sc_traparg_a0;
+	unsigned long sc_traparg_a1;
+	unsigned long sc_traparg_a2;
+	unsigned long sc_fp_trap_pc;
+	unsigned long sc_fp_trigger_sum;
+	unsigned long sc_fp_trigger_inst;
+};
+
+
+#endif
diff --git a/arch/sw_64/include/uapi/asm/siginfo.h b/arch/sw_64/include/uapi/asm/siginfo.h
new file mode 100644
index 000000000000..b50afbf15f7c
--- /dev/null
+++ b/arch/sw_64/include/uapi/asm/siginfo.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_SW64_SIGINFO_H
+#define _UAPI_ASM_SW64_SIGINFO_H
+
+#define __ARCH_SI_PREAMBLE_SIZE (4 * sizeof(int))
+#define __ARCH_SI_TRAPNO
+
+#include <asm-generic/siginfo.h>
+
+
+#endif
diff --git a/arch/sw_64/include/uapi/asm/signal.h b/arch/sw_64/include/uapi/asm/signal.h
new file mode 100644
index 000000000000..71471c8c7624
--- /dev/null
+++ b/arch/sw_64/include/uapi/asm/signal.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_SW64_SIGNAL_H
+#define _UAPI_ASM_SW64_SIGNAL_H
+
+#include <linux/types.h>
+
+/* Avoid too many header ordering problems. */
+struct siginfo;
+
+#ifndef __KERNEL__
+/* Here we must cater to libcs that poke about in kernel headers. */
+
+#define NSIG 32
+typedef unsigned long sigset_t;
+
+#endif /* __KERNEL__ */
+
+
+/*
+ * Linux/sw64 has different signal numbers than Linux/i386.
+ */
+#define SIGHUP 1
+#define SIGINT 2
+#define SIGQUIT 3
+#define SIGILL 4
+#define SIGTRAP 5
+#define SIGABRT 6
+#define SIGEMT 7
+#define SIGFPE 8
+#define SIGKILL 9
+#define SIGBUS 10
+#define SIGSEGV 11
+#define SIGSYS 12
+#define SIGPIPE 13
+#define SIGALRM 14
+#define SIGTERM 15
+#define SIGURG 16
+#define SIGSTOP 17
+#define SIGTSTP 18
+#define SIGCONT 19
+#define SIGCHLD 20
+#define SIGTTIN 21
+#define SIGTTOU 22
+#define SIGIO 23
+#define SIGXCPU 24
+#define SIGXFSZ 25
+#define SIGVTALRM 26
+#define SIGPROF 27
+#define SIGWINCH 28
+#define SIGINFO 29
+#define SIGUSR1 30
+#define SIGUSR2 31
+
+#define SIGPOLL SIGIO
+#define SIGPWR SIGINFO
+#define SIGIOT SIGABRT
+
+/* These should not be considered constants from userland. */
+#define SIGRTMIN 32
+#define SIGRTMAX _NSIG
+
+/*
+ * SA_FLAGS values:
+ *
+ * SA_ONSTACK indicates that a registered stack_t will be used.
+ * SA_RESTART flag to get restarting signals (which were the default long ago)
+ * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop.
+ * SA_RESETHAND clears the handler when the signal is delivered.
+ * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies.
+ * SA_NODEFER prevents the current signal from being masked in the handler.
+ *
+ * SA_ONESHOT and SA_NOMASK are the historical Linux names for the Single
+ * Unix names RESETHAND and NODEFER respectively.
+ */
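The SA_* semantics spelled out in the comment above are what userspace exercises through sigaction(2). A minimal sketch for context, not part of this patch; standard POSIX API assumed and all names illustrative:

#include <signal.h>

static void handler(int sig, siginfo_t *info, void *ucontext)
{
	/* SA_SIGINFO selects this three-argument handler form */
}

int main(void)
{
	struct sigaction sa = { 0 };

	sa.sa_sigaction = handler;
	sa.sa_flags = SA_SIGINFO | SA_RESTART;	/* extended info, restart syscalls */
	sigemptyset(&sa.sa_mask);		/* mask nothing extra in the handler */
	return sigaction(SIGUSR1, &sa, NULL);
}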
+
+#define SA_ONSTACK 0x00000001
+#define SA_RESTART 0x00000002
+#define SA_NOCLDSTOP 0x00000004
+#define SA_NODEFER 0x00000008
+#define SA_RESETHAND 0x00000010
+#define SA_NOCLDWAIT 0x00000020
+#define SA_SIGINFO 0x00000040
+
+#define SA_ONESHOT SA_RESETHAND
+#define SA_NOMASK SA_NODEFER
+
+#define MINSIGSTKSZ 4096
+#define SIGSTKSZ 16384
+
+#define SIG_BLOCK 1 /* for blocking signals */
+#define SIG_UNBLOCK 2 /* for unblocking signals */
+#define SIG_SETMASK 3 /* for setting the signal mask */
+
+#include <asm-generic/signal-defs.h>
+
+#ifndef __KERNEL__
+/* Here we must cater to libcs that poke about in kernel headers. */
+
+struct sigaction {
+	union {
+		__sighandler_t _sa_handler;
+		void (*_sa_sigaction)(int, struct siginfo *, void *);
+	} _u;
+	sigset_t sa_mask;
+	int sa_flags;
+};
+
+#define sa_handler _u._sa_handler
+#define sa_sigaction _u._sa_sigaction
+
+#endif /* __KERNEL__ */
+
+typedef struct sigaltstack {
+	void __user *ss_sp;
+	int ss_flags;
+	size_t ss_size;
+} stack_t;
+
+#endif /* _UAPI_ASM_SW64_SIGNAL_H */
diff --git a/arch/sw_64/include/uapi/asm/socket.h b/arch/sw_64/include/uapi/asm/socket.h
new file mode 100644
index 000000000000..abfa2108522c
--- /dev/null
+++ b/arch/sw_64/include/uapi/asm/socket.h
@@ -0,0 +1,127 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_SW64_SOCKET_H
+#define _UAPI_ASM_SW64_SOCKET_H
+
+#include <asm/sockios.h>
+
+/* For setsockopt(2) */
+/*
+ * Note: we only bother about making the SOL_SOCKET options
+ * same as legacy, as that's all that "normal" programs are
+ * likely to set. We don't necessarily want to be binary
+ * compatible with _everything_.
+ */
+#define SOL_SOCKET 0xffff
+
+#define SO_DEBUG 0x0001
+#define SO_REUSEADDR 0x0004
+#define SO_KEEPALIVE 0x0008
+#define SO_DONTROUTE 0x0010
+#define SO_BROADCAST 0x0020
+#define SO_LINGER 0x0080
+#define SO_OOBINLINE 0x0100
+#define SO_REUSEPORT 0x0200
+
+#define SO_TYPE 0x1008
+#define SO_ERROR 0x1007
+#define SO_SNDBUF 0x1001
+#define SO_RCVBUF 0x1002
+#define SO_SNDBUFFORCE 0x100a
+#define SO_RCVBUFFORCE 0x100b
+#define SO_RCVLOWAT 0x1010
+#define SO_SNDLOWAT 0x1011
+#define SO_RCVTIMEO_OLD 0x1012
+#define SO_SNDTIMEO_OLD 0x1013
+#define SO_ACCEPTCONN 0x1014
+#define SO_PROTOCOL 0x1028
+#define SO_DOMAIN 0x1029
+
+/* linux-specific, might as well be the same as on i386 */
+#define SO_NO_CHECK 11
+#define SO_PRIORITY 12
+#define SO_BSDCOMPAT 14
+
+#define SO_PASSCRED 17
+#define SO_PEERCRED 18
+#define SO_BINDTODEVICE 25
+
+/* Socket filtering */
+#define SO_ATTACH_FILTER 26
+#define SO_DETACH_FILTER 27
+#define SO_GET_FILTER SO_ATTACH_FILTER
+
+#define SO_PEERNAME 28
+#define SO_TIMESTAMP 29
+#define SCM_TIMESTAMP SO_TIMESTAMP
+
+#define SO_PEERSEC 30
+#define SO_PASSSEC 34
+#define SO_TIMESTAMPNS 35
+#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+
+/* Security levels - as per NRL IPv6 - don't actually do anything */
+#define SO_SECURITY_AUTHENTICATION 19
+#define SO_SECURITY_ENCRYPTION_TRANSPORT 20
+#define SO_SECURITY_ENCRYPTION_NETWORK 21
+
+#define SO_MARK 36
+
+#define SO_TIMESTAMPING 37
+#define SCM_TIMESTAMPING SO_TIMESTAMPING
+
+#define SO_RXQ_OVFL 40
+
+#define SO_WIFI_STATUS 41
+#define SCM_WIFI_STATUS SO_WIFI_STATUS
+#define SO_PEEK_OFF 42
+
+/* Instruct lower device to use last 4-bytes of skb data as FCS */
+#define SO_NOFCS 43
+
+#define SO_LOCK_FILTER 44
+#define SO_SELECT_ERR_QUEUE 45
+#define SO_BUSY_POLL 46
+#define SO_MAX_PACING_RATE 47
+#define SO_BPF_EXTENSIONS 48
+#define SO_INCOMING_CPU 49
+#define SO_ATTACH_BPF 50
+#define SO_DETACH_BPF SO_DETACH_FILTER
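These option values are consumed through setsockopt(2)/getsockopt(2); a small hedged sketch of typical use (not part of this patch; assumes a libc that exposes the sw64 values above):

#include <sys/socket.h>

int tune_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int one = 1;
	int sndbuf = 1 << 20;	/* request a 1 MiB send buffer */

	if (fd < 0)
		return -1;
	/* SOL_SOCKET is 0xffff on sw64; the symbolic names hide the values */
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));
	return fd;
}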
+ +#define SO_ATTACH_REUSEPORT_CBPF 51 +#define SO_ATTACH_REUSEPORT_EBPF 52 + +#define SO_CNX_ADVICE 53 + +#define SCM_TIMESTAMPING_OPT_STATS 54 + +#define SO_MEMINFO 55 + +#define SO_INCOMING_NAPI_ID 56 + +#define SO_COOKIE 57 + +#define SCM_TIMESTAMPING_PKTINFO 58 + +#define SO_PEERGROUPS 59 + +#define SO_ZEROCOPY 60 + +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + +#define SO_BINDTOIFINDEX 62 + +#define SO_TIMESTAMP_OLD 29 +#define SO_TIMESTAMPNS_OLD 35 +#define SO_TIMESTAMPING_OLD 37 + +#define SO_TIMESTAMP_NEW 63 +#define SO_TIMESTAMPNS_NEW 64 +#define SO_TIMESTAMPING_NEW 65 + +#define SO_RCVTIMEO_NEW 66 +#define SO_SNDTIMEO_NEW 67 + +#define SO_DETACH_REUSEPORT_BPF 68 + +#endif /* _UAPI_ASM_SW64_SOCKET_H */ diff --git a/arch/sw_64/include/uapi/asm/sockios.h b/arch/sw_64/include/uapi/asm/sockios.h new file mode 100644 index 000000000000..1f30fb881065 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/sockios.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_SOCKIOS_H +#define _UAPI_ASM_SW64_SOCKIOS_H + +/* Socket-level I/O control calls. */ + +#define FIOGETOWN _IOR('f', 123, int) +#define FIOSETOWN _IOW('f', 124, int) + +#define SIOCATMARK _IOR('s', 7, int) +#define SIOCSPGRP _IOW('s', 8, pid_t) +#define SIOCGPGRP _IOR('s', 9, pid_t) + +#define SIOCGSTAMP_OLD 0x8906 /* Get stamp (timeval) */ +#define SIOCGSTAMPNS_OLD 0x8907 /* Get stamp (timespec) */ + +#endif /* _UAPI_ASM_SW64_SOCKIOS_H */ diff --git a/arch/sw_64/include/uapi/asm/stat.h b/arch/sw_64/include/uapi/asm/stat.h new file mode 100644 index 000000000000..b1c1c5e3db22 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/stat.h @@ -0,0 +1,51 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_STAT_H +#define _UAPI_ASM_SW64_STAT_H + +struct stat { + unsigned int st_dev; + unsigned int st_ino; + unsigned int st_mode; + unsigned int st_nlink; + unsigned int st_uid; + unsigned int st_gid; + unsigned int st_rdev; + long st_size; + unsigned long st_atime; + unsigned long st_mtime; + unsigned long st_ctime; + unsigned int st_blksize; + unsigned int st_blocks; + unsigned int st_flags; + unsigned int st_gen; +}; + +/* + * The stat64 structure increases the size of dev_t, blkcnt_t, adds + * nanosecond resolution times, and padding for expansion. 
+ */ + +struct stat64 { + unsigned long st_dev; + unsigned long st_ino; + unsigned long st_rdev; + long st_size; + unsigned long st_blocks; + + unsigned int st_mode; + unsigned int st_uid; + unsigned int st_gid; + unsigned int st_blksize; + unsigned int st_nlink; + unsigned int __pad0; + + unsigned long st_atime; + unsigned long st_atime_nsec; + unsigned long st_mtime; + unsigned long st_mtime_nsec; + unsigned long st_ctime; + unsigned long st_ctime_nsec; + long __unused[3]; +}; + +#endif diff --git a/arch/sw_64/include/uapi/asm/statfs.h b/arch/sw_64/include/uapi/asm/statfs.h new file mode 100644 index 000000000000..3b8d1e3300a9 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/statfs.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_STATFS_H +#define _UAPI_ASM_SW64_STATFS_H + +#include <linux/types.h> + +#include <asm-generic/statfs.h> + +#endif diff --git a/arch/sw_64/include/uapi/asm/swab.h b/arch/sw_64/include/uapi/asm/swab.h new file mode 100644 index 000000000000..a3d67645aa52 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/swab.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_SWAB_H +#define _UAPI_ASM_SW64_SWAB_H + +#include <linux/types.h> +#include <linux/compiler.h> +#include <asm/compiler.h> + +#ifdef __GNUC__ + +static inline __attribute_const__ __u32 __arch_swab32(__u32 x) +{ + /* + * Unfortunately, we can't use the 6 instruction sequence + * on sw64 since the latency of the UNPKBW is 3, which is + * pretty hard to hide. Just in case a future implementation + * has a lower latency, here's the sequence (also by Mike Burrows) + * + * UNPKBW a0, v0 v0: 00AA00BB00CC00DD + * SLL v0, 24, a0 a0: BB00CC00DD000000 + * BIS v0, a0, a0 a0: BBAACCBBDDCC00DD + * EXTWL a0, 6, v0 v0: 000000000000BBAA + * ZAP a0, 0xf3, a0 a0: 00000000DDCC0000 + * ADDL a0, v0, v0 v0: ssssssssDDCCBBAA + */ + + __u64 t0, t1, t2, t3; + + t0 = __kernel_inshw(x, 7); /* t0 : 0000000000AABBCC */ + t1 = __kernel_inslh(x, 3); /* t1 : 000000CCDD000000 */ + t1 |= t0; /* t1 : 000000CCDDAABBCC */ + t2 = t1 >> 16; /* t2 : 0000000000CCDDAA */ + t0 = t1 & 0xFF00FF00; /* t0 : 00000000DD00BB00 */ + t3 = t2 & 0x00FF00FF; /* t3 : 0000000000CC00AA */ + t1 = t0 + t3; /* t1 : ssssssssDDCCBBAA */ + + return t1; +} +#define __arch_swab32 __arch_swab32 + +#endif /* __GNUC__ */ + +#endif /* _UAPI_ASM_SW64_SWAB_H */ diff --git a/arch/sw_64/include/uapi/asm/sysinfo.h b/arch/sw_64/include/uapi/asm/sysinfo.h new file mode 100644 index 000000000000..9d2112f8bc4d --- /dev/null +++ b/arch/sw_64/include/uapi/asm/sysinfo.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * include/asm/sysinfo.h + */ + +#ifndef _UAPI_ASM_SW64_SYSINFO_H +#define _UAPI_ASM_SW64_SYSINFO_H + +#define GSI_IEEE_FP_CONTROL 45 + +#define SSI_IEEE_FP_CONTROL 14 +#define SSI_IEEE_RAISE_EXCEPTION 1001 /* linux specific */ + +#define UAC_BITMASK 7 +#define UAC_NOPRINT 1 +#define UAC_NOFIX 2 +#define UAC_SIGBUS 4 +#define PR_NOFIX 4 /* do not fix up unaligned accesses */ + +#endif /* _UAPI_ASM_SW64_SYSINFO_H */ diff --git a/arch/sw_64/include/uapi/asm/termbits.h b/arch/sw_64/include/uapi/asm/termbits.h new file mode 100644 index 000000000000..bcb9adb11e81 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/termbits.h @@ -0,0 +1,202 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_TERMBITS_H +#define _UAPI_ASM_SW64_TERMBITS_H + +#include <linux/posix_types.h> + +typedef unsigned char cc_t; +typedef unsigned int speed_t; +typedef unsigned int tcflag_t; + +/* + * termios type and macro 
definitions. Be careful about adding stuff + * to this file since it's used in GNU libc and there are strict rules + * concerning namespace pollution. + */ + +#define NCCS 19 +struct termios { + tcflag_t c_iflag; /* input mode flags */ + tcflag_t c_oflag; /* output mode flags */ + tcflag_t c_cflag; /* control mode flags */ + tcflag_t c_lflag; /* local mode flags */ + cc_t c_cc[NCCS]; /* control characters */ + cc_t c_line; /* line discipline (== c_cc[19]) */ + speed_t c_ispeed; /* input speed */ + speed_t c_ospeed; /* output speed */ +}; + +/* SW-64 has matching termios and ktermios */ + +struct ktermios { + tcflag_t c_iflag; /* input mode flags */ + tcflag_t c_oflag; /* output mode flags */ + tcflag_t c_cflag; /* control mode flags */ + tcflag_t c_lflag; /* local mode flags */ + cc_t c_cc[NCCS]; /* control characters */ + cc_t c_line; /* line discipline (== c_cc[19]) */ + speed_t c_ispeed; /* input speed */ + speed_t c_ospeed; /* output speed */ +}; + +/* c_cc characters */ +#define VEOF 0 +#define VEOL 1 +#define VEOL2 2 +#define VERASE 3 +#define VWERASE 4 +#define VKILL 5 +#define VREPRINT 6 +#define VSWTC 7 +#define VINTR 8 +#define VQUIT 9 +#define VSUSP 10 +#define VSTART 12 +#define VSTOP 13 +#define VLNEXT 14 +#define VDISCARD 15 +#define VMIN 16 +#define VTIME 17 + +/* c_iflag bits */ +#define IGNBRK 0000001 +#define BRKINT 0000002 +#define IGNPAR 0000004 +#define PARMRK 0000010 +#define INPCK 0000020 +#define ISTRIP 0000040 +#define INLCR 0000100 +#define IGNCR 0000200 +#define ICRNL 0000400 +#define IXON 0001000 +#define IXOFF 0002000 +#define IXANY 0004000 +#define IUCLC 0010000 +#define IMAXBEL 0020000 +#define IUTF8 0040000 + +/* c_oflag bits */ +#define OPOST 0000001 +#define ONLCR 0000002 +#define OLCUC 0000004 + +#define OCRNL 0000010 +#define ONOCR 0000020 +#define ONLRET 0000040 + +#define OFILL 00000100 +#define OFDEL 00000200 +#define NLDLY 00001400 +#define NL0 00000000 +#define NL1 00000400 +#define NL2 00001000 +#define NL3 00001400 +#define TABDLY 00006000 +#define TAB0 00000000 +#define TAB1 00002000 +#define TAB2 00004000 +#define TAB3 00006000 +#define CRDLY 00030000 +#define CR0 00000000 +#define CR1 00010000 +#define CR2 00020000 +#define CR3 00030000 +#define FFDLY 00040000 +#define FF0 00000000 +#define FF1 00040000 +#define BSDLY 00100000 +#define BS0 00000000 +#define BS1 00100000 +#define VTDLY 00200000 +#define VT0 00000000 +#define VT1 00200000 +#define XTABS 01000000 /* Hmm.. Linux/i386 considers this part of TABDLY.. 
*/ + +/* c_cflag bit meaning */ +#define CBAUD 0000037 +#define B0 0000000 /* hang up */ +#define B50 0000001 +#define B75 0000002 +#define B110 0000003 +#define B134 0000004 +#define B150 0000005 +#define B200 0000006 +#define B300 0000007 +#define B600 0000010 +#define B1200 0000011 +#define B1800 0000012 +#define B2400 0000013 +#define B4800 0000014 +#define B9600 0000015 +#define B19200 0000016 +#define B38400 0000017 +#define EXTA B19200 +#define EXTB B38400 +#define CBAUDEX 0000000 +#define B57600 00020 +#define B115200 00021 +#define B230400 00022 +#define B460800 00023 +#define B500000 00024 +#define B576000 00025 +#define B921600 00026 +#define B1000000 00027 +#define B1152000 00030 +#define B1500000 00031 +#define B2000000 00032 +#define B2500000 00033 +#define B3000000 00034 +#define B3500000 00035 +#define B4000000 00036 + +#define CSIZE 00001400 +#define CS5 00000000 +#define CS6 00000400 +#define CS7 00001000 +#define CS8 00001400 + +#define CSTOPB 00002000 +#define CREAD 00004000 +#define PARENB 00010000 +#define PARODD 00020000 +#define HUPCL 00040000 + +#define CLOCAL 00100000 +#define CMSPAR 010000000000 /* mark or space (stick) parity */ +#define CRTSCTS 020000000000 /* flow control */ + +/* c_lflag bits */ +#define ISIG 0x00000080 +#define ICANON 0x00000100 +#define XCASE 0x00004000 +#define ECHO 0x00000008 +#define ECHOE 0x00000002 +#define ECHOK 0x00000004 +#define ECHONL 0x00000010 +#define NOFLSH 0x80000000 +#define TOSTOP 0x00400000 +#define ECHOCTL 0x00000040 +#define ECHOPRT 0x00000020 +#define ECHOKE 0x00000001 +#define FLUSHO 0x00800000 +#define PENDIN 0x20000000 +#define IEXTEN 0x00000400 +#define EXTPROC 0x10000000 + +/* Values for the ACTION argument to `tcflow'. */ +#define TCOOFF 0 +#define TCOON 1 +#define TCIOFF 2 +#define TCION 3 + +/* Values for the QUEUE_SELECTOR argument to `tcflush'. */ +#define TCIFLUSH 0 +#define TCOFLUSH 1 +#define TCIOFLUSH 2 + +/* Values for the OPTIONAL_ACTIONS argument to `tcsetattr'. */ +#define TCSANOW 0 +#define TCSADRAIN 1 +#define TCSAFLUSH 2 + +#endif /* _UAPI_ASM_SW64_TERMBITS_H */ diff --git a/arch/sw_64/include/uapi/asm/termios.h b/arch/sw_64/include/uapi/asm/termios.h new file mode 100644 index 000000000000..d44e218b29b5 --- /dev/null +++ b/arch/sw_64/include/uapi/asm/termios.h @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI_ASM_SW64_TERMIOS_H +#define _UAPI_ASM_SW64_TERMIOS_H + +#include <asm/ioctls.h> +#include <asm/termbits.h> + +struct sgttyb { + char sg_ispeed; + char sg_ospeed; + char sg_erase; + char sg_kill; + short sg_flags; +}; + +struct tchars { + char t_intrc; + char t_quitc; + char t_startc; + char t_stopc; + char t_eofc; + char t_brkc; +}; + +struct ltchars { + char t_suspc; + char t_dsuspc; + char t_rprntc; + char t_flushc; + char t_werasc; + char t_lnextc; +}; + +struct winsize { + unsigned short ws_row; + unsigned short ws_col; + unsigned short ws_xpixel; + unsigned short ws_ypixel; +}; + +#define NCC 8 +struct termio { + unsigned short c_iflag; /* input mode flags */ + unsigned short c_oflag; /* output mode flags */ + unsigned short c_cflag; /* control mode flags */ + unsigned short c_lflag; /* local mode flags */ + unsigned char c_line; /* line discipline */ + unsigned char c_cc[NCC]; /* control characters */ +}; + +/* + * c_cc characters in the termio structure. Oh, how I love being + * backwardly compatible. Notice that character 4 and 5 are + * interpreted differently depending on whether ICANON is set in + * c_lflag. 
If it's set, they are used as _VEOF and _VEOL, otherwise
+ * as _VMIN and _VTIME. This is for compatibility with SysV...
+ */
+#define _VINTR 0
+#define _VQUIT 1
+#define _VERASE 2
+#define _VKILL 3
+#define _VEOF 4
+#define _VMIN 4
+#define _VEOL 5
+#define _VTIME 5
+#define _VEOL2 6
+#define _VSWTC 7
+
+
+#endif /* _UAPI_ASM_SW64_TERMIOS_H */
diff --git a/arch/sw_64/include/uapi/asm/types.h b/arch/sw_64/include/uapi/asm/types.h
new file mode 100644
index 000000000000..9c605ea7bba9
--- /dev/null
+++ b/arch/sw_64/include/uapi/asm/types.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_SW64_TYPES_H
+#define _UAPI_ASM_SW64_TYPES_H
+
+/*
+ * This file is never included by application software unless
+ * explicitly requested (e.g., via linux/types.h) in which case the
+ * application is Linux specific so (user-) name space pollution is
+ * not a major issue. However, for interoperability, libraries still
+ * need to be careful to avoid name clashes.
+ */
+
+/*
+ * This is here because we used to use l64 for sw64 and we don't want
+ * to impact user mode with our change to ll64 in the kernel.
+ *
+ * However, some user programs are fine with this. They can
+ * flag __SANE_USERSPACE_TYPES__ to get int-ll64.h here.
+ */
+#ifndef __KERNEL__
+#ifndef __SANE_USERSPACE_TYPES__
+#include <asm-generic/int-l64.h>
+#else
+#include <asm-generic/int-ll64.h>
+#endif /* __SANE_USERSPACE_TYPES__ */
+#endif /* __KERNEL__ */
+
+#endif /* _UAPI_ASM_SW64_TYPES_H */
diff --git a/arch/sw_64/include/uapi/asm/unistd.h b/arch/sw_64/include/uapi/asm/unistd.h
new file mode 100644
index 000000000000..225358536dc9
--- /dev/null
+++ b/arch/sw_64/include/uapi/asm/unistd.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_SW64_UNISTD_H
+#define _UAPI_ASM_SW64_UNISTD_H
+
+/*
+ * These are traditionally the names used for generic system calls
+ */
+#define __NR_umount __NR_umount2
+
+#include <asm/unistd_64.h>
+
+/* sw64 doesn't have protection keys. */
+#define __IGNORE_pkey_mprotect
+#define __IGNORE_pkey_alloc
+#define __IGNORE_pkey_free
+
+#endif /* _UAPI_ASM_SW64_UNISTD_H */
diff --git a/arch/sw_64/kernel/.gitignore b/arch/sw_64/kernel/.gitignore
new file mode 100644
index 000000000000..46c9537c5551
--- /dev/null
+++ b/arch/sw_64/kernel/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+vmlinux.lds
diff --git a/arch/sw_64/kernel/Makefile b/arch/sw_64/kernel/Makefile
new file mode 100644
index 000000000000..d9e2fcbc1e91
--- /dev/null
+++ b/arch/sw_64/kernel/Makefile
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the linux kernel.
+#
+
+extra-y := head.o vmlinux.lds
+asflags-y := $(KBUILD_CFLAGS)
+ccflags-y := -Wno-sign-compare
+
+ifdef CONFIG_FTRACE
+CFLAGS_REMOVE_ftrace.o = -pg
+CFLAGS_REMOVE_insn.o = -pg
+CFLAGS_REMOVE_printk.o = -pg
+endif
+
+obj-y := entry.o traps.o process.o sys_sw64.o irq.o \
+	irq_sw64.o signal.o setup.o ptrace.o time.o \
+	systbls.o dup_print.o tc.o \
+	insn.o early_init.o topology.o cacheinfo.o \
+	vdso.o vdso/
+
+obj-$(CONFIG_ACPI) += acpi.o
+obj-$(CONFIG_STACKTRACE) += stacktrace.o
+obj-$(CONFIG_SMP) += smp.o
+obj-$(CONFIG_PCI) += pci.o pci-sysfs.o
+obj-$(CONFIG_MODULES) += module.o
+obj-$(CONFIG_PCI_MSI) += msi.o
+obj-$(CONFIG_SUSPEND) += suspend_asm.o suspend.o
+obj-$(CONFIG_PERF_EVENTS) += perf_event.o
+obj-$(CONFIG_HIBERNATION) += hibernate_asm.o hibernate.o
+obj-$(CONFIG_AUDIT) += audit.o
+obj-$(CONFIG_DIRECT_DMA) += pci_common.o
+obj-$(CONFIG_SWIOTLB) += dma_swiotlb.o
+obj-$(CONFIG_RELOCATABLE) += relocate.o
+obj-$(CONFIG_DEBUG_FS) += unaligned.o segvdbg.o
+obj-$(CONFIG_JUMP_LABEL) += jump_label.o
+
+ifndef CONFIG_PCI
+obj-y += pci-noop.o
+endif
+
+ifdef CONFIG_KVM
+obj-y += kvm_cma.o
+endif
+
+# Core logic support
+obj-$(CONFIG_SW64) += core.o timer.o
+
+obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
+obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
+obj-$(CONFIG_KPROBES) += kprobes/
+obj-$(CONFIG_UPROBES) += uprobes.o
+obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
+obj-$(CONFIG_KGDB) += kgdb.o
+obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o
diff --git a/arch/sw_64/kernel/acpi.c b/arch/sw_64/kernel/acpi.c
new file mode 100644
index 000000000000..1c1afe8e812e
--- /dev/null
+++ b/arch/sw_64/kernel/acpi.c
@@ -0,0 +1,396 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/init.h>
+#include <linux/acpi.h>
+#include <linux/acpi_pmtmr.h>
+#include <linux/efi.h>
+#include <linux/stddef.h>
+#include <linux/cpumask.h>
+#include <linux/module.h>
+#include <linux/dmi.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/slab.h>
+#include <linux/ioport.h>
+#include <linux/pci.h>
+#include <linux/memblock.h>
+#include <acpi/actbl.h>
+#include <acpi/actbl2.h>
+
+#include <asm/pgtable.h>
+#include <asm/io.h>
+#include <asm/smp.h>
+#include <asm/numa.h>
+#include <asm/early_ioremap.h>
+
+int acpi_disabled = 1;
+EXPORT_SYMBOL(acpi_disabled);
+int acpi_noirq;	/* skip ACPI IRQ initialization */
+int acpi_pci_disabled;	/* skip ACPI PCI scan and IRQ initialization */
+EXPORT_SYMBOL(acpi_pci_disabled);
+int acpi_strict;
+u64 arch_acpi_wakeup_start;
+u64 acpi_saved_sp_s3;
+
+#define MAX_LOCAL_APIC 256
+
+#define PREFIX "ACPI: "
+/*
+ * The default interrupt routing model is PIC (8259). This gets
+ * overridden if IOAPICs are enumerated (below).
+ */
+enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_IOSAPIC;
+void __iomem *__init __acpi_map_table(unsigned long phys, unsigned long size)
+{
+	if (!phys || !size)
+		return NULL;
+
+	return early_ioremap(phys, size);
+}
+void __init __acpi_unmap_table(void __iomem *map, unsigned long size)
+{
+	if (!map || !size)
+		return;
+
+	early_iounmap(map, size);
+}
+/*
+ * The following __acpi_xx functions should be implemented for a specific cpu.
+ */
+int acpi_gsi_to_irq(u32 gsi, unsigned int *irqp)
+{
+	if (irqp != NULL)
+		*irqp = acpi_register_gsi(NULL, gsi, -1, -1);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_gsi_to_irq);
+
+int acpi_isa_irq_to_gsi(unsigned int isa_irq, u32 *gsi)
+{
+	if (gsi)
+		*gsi = isa_irq;
+
+	return 0;
+}
+
+int (*acpi_suspend_lowlevel)(void);
+
+/*
+ * success: return IRQ number (>=0)
+ * failure: return < 0
+ */
+static struct irq_domain *irq_default_domain;
+int acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
+{
+	u32 irq;
+
+	irq = irq_find_mapping(irq_default_domain, gsi);
+
+	return irq;
+}
+EXPORT_SYMBOL_GPL(acpi_register_gsi);
+
+void acpi_unregister_gsi(u32 gsi)
+{
+
+}
+EXPORT_SYMBOL_GPL(acpi_unregister_gsi);
+/*
+ * ACPI based hotplug support for CPU
+ */
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#include <acpi/processor.h>
+
+/* wrapper to silence section mismatch warning */
+int __ref acpi_map_lsapic(acpi_handle handle, int physid, int *pcpu)
+{
+	return 0;
+}
+EXPORT_SYMBOL(acpi_map_lsapic);
+
+int acpi_unmap_lsapic(int cpu)
+{
+	return 0;
+}
+EXPORT_SYMBOL(acpi_unmap_lsapic);
+#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+
+u8 acpi_checksum(u8 *table, u32 length)
+{
+	u8 ret = 0;
+
+	while (length--) {
+		ret += *table;
+		table++;
+	}
+	return -ret;
+}
+
+static int __init parse_acpi(char *arg)
+{
+	if (!arg)
+		return -EINVAL;
+
+	/* "acpi=off" disables both ACPI table parsing and interpreter */
+	if (strcmp(arg, "off") == 0) {
+		disable_acpi();
+	} else {
+		/* Core will printk when we return error. */
+		return -EINVAL;
+	}
+	return 0;
+}
early_param("acpi", parse_acpi);

Correction: the early_param line above carries the diff prefix as well:
+early_param("acpi", parse_acpi);
+
+/*
+ * __acpi_acquire_global_lock
+ * will always return -1, indicating that it owns the lock.
+ *
+ * __acpi_release_global_lock will always return 0, indicating
+ * that no acquiring request is pending.
+ */
+int __acpi_acquire_global_lock(unsigned int *lock)
+{
+	return -1;
+}
+
+int __acpi_release_global_lock(unsigned int *lock)
+{
+	return 0;
+}
+
+#ifdef CONFIG_ACPI_NUMA
+static __init int setup_node(int pxm)
+{
+	return acpi_map_pxm_to_node(pxm);
+}
+
+/*
+ * Callback for SLIT parsing. pxm_to_node() returns NUMA_NO_NODE for
+ * I/O localities since SRAT does not list them. I/O localities are
+ * not supported at this point.
+ */ +extern unsigned char __node_distances[MAX_NUMNODES][MAX_NUMNODES]; +unsigned int numa_distance_cnt; + +static inline unsigned int get_numa_distances_cnt(struct acpi_table_slit *slit) +{ + return slit->locality_count; +} + +void __init numa_set_distance(int from, int to, int distance) +{ + unsigned char *numa_distance = (unsigned char *)__node_distances; + + if ((u8)distance != distance || + (from == to && distance != LOCAL_DISTANCE)) { + pr_warn_once("Warning: invalid distance parameter, from=%d to=%d distance=%d\n", + from, to, distance); + return; + } + + numa_distance[from * numa_distance_cnt + to] = distance; +} + +void __init acpi_numa_slit_init(struct acpi_table_slit *slit) +{ + int i, j; + + numa_distance_cnt = get_numa_distances_cnt(slit); + + for (i = 0; i < slit->locality_count; i++) { + const int from_node = pxm_to_node(i); + + if (from_node == NUMA_NO_NODE) + continue; + + for (j = 0; j < slit->locality_count; j++) { + const int to_node = pxm_to_node(j); + + if (to_node == NUMA_NO_NODE) + continue; + + numa_set_distance(from_node, to_node, + slit->entry[slit->locality_count * i + j]); + } + } +} + +extern cpumask_t possible_cpu_per_node; +/* Callback for Proximity Domain -> CPUID mapping */ +void __init +acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa) +{ + int pxm, node; + + if (srat_disabled()) + return; + if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) { + bad_srat(); + return; + } + if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0) + return; + pxm = pa->proximity_domain_lo; + if (acpi_srat_revision >= 2) { + pxm |= (pa->proximity_domain_hi[0] << 8); + pxm |= (pa->proximity_domain_hi[1] << 16); + pxm |= (pa->proximity_domain_hi[2] << 24); + } + node = setup_node(pxm); + if (node < 0) { + pr_err("SRAT: Too many proximity domains %x\n", pxm); + bad_srat(); + return; + } + + if (pa->apic_id >= CONFIG_NR_CPUS) { + pr_err("SRAT: PXM %u -> CPU 0x%02x -> Node %u skipped apicid that is too big\n", pxm, pa->apic_id, node); + return; + } + + if (!cpu_guestmode) + numa_add_cpu(__cpu_number_map[pa->apic_id], node); + else + numa_add_cpu(pa->apic_id, node); + + set_cpuid_to_node(pa->apic_id, node); + node_set(node, numa_nodes_parsed); + acpi_numa = 1; + pr_err("SRAT: PXM %u -> CPU 0x%02x -> Node %u\n", + pxm, pa->apic_id, node); +} + +#ifdef CONFIG_MEMORY_HOTPLUG +static inline int save_add_info(void) { return 1; } +#else +static inline int save_add_info(void) { return 0; } +#endif + +/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */ +int __init +acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) +{ + u64 start, end; + u32 hotpluggable; + int node, pxm; + + if (srat_disabled()) + goto out_err; + if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) + goto out_err_bad_srat; + if ((ma->flags & ACPI_SRAT_MEM_ENABLED) == 0) + goto out_err; + hotpluggable = ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE; + if (hotpluggable && !save_add_info()) + goto out_err; + + start = ma->base_address; + end = start + ma->length; + pxm = ma->proximity_domain; + if (acpi_srat_revision <= 1) + pxm &= 0xff; + + node = setup_node(pxm); + if (node < 0) { + pr_err("SRAT: Too many proximity domains.\n"); + goto out_err_bad_srat; + } + if (numa_add_memblk(node, start, end) < 0) + goto out_err_bad_srat; + + node_set(node, numa_nodes_parsed); + + pr_info("SRAT: Node %u PXM %u [mem %#010Lx-%#010Lx]%s%s\n", + node, pxm, + (unsigned long long) start, (unsigned long long) end - 1, + hotpluggable ? 
" hotplug" : "", + ma->flags & ACPI_SRAT_MEM_NON_VOLATILE ? " non-volatile" : ""); + + /* Mark hotplug range in memblock. */ + if (hotpluggable && memblock_mark_hotplug(start, ma->length)) + pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n", + (unsigned long long)start, (unsigned long long)end - 1); + + max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1)); + + return 0; +out_err_bad_srat: + bad_srat(); +out_err: + return -1; +} + +void __init acpi_numa_arch_fixup(void) {} +#endif + +#ifdef CONFIG_ACPI_HOTPLUG_CPU +#include <acpi/processor.h> +static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid) +{ +#ifdef CONFIG_ACPI_NUMA + int nid; + + nid = acpi_get_node(handle); + if (nid != NUMA_NO_NODE) { + set_cpuid_to_node(cpu, nid); + node_set(nid, numa_nodes_parsed); + } +#endif + return 0; +} + +int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, + int *pcpu) +{ + int cpu; + struct acpi_madt_local_apic *processor; + + processor = kzalloc(sizeof(struct acpi_madt_local_apic), GFP_KERNEL); + processor->id = physid; + processor->processor_id = acpi_id; + processor->lapic_flags = ACPI_MADT_ENABLED; + + cpu = set_processor_mask(processor); + if (cpu < 0) { + pr_info(PREFIX "Unable to map lapic to logical cpu number\n"); + return cpu; + } + + acpi_map_cpu2node(handle, cpu, physid); + + *pcpu = cpu; + return 0; +} +EXPORT_SYMBOL(acpi_map_cpu); + +int acpi_unmap_cpu(int cpu) +{ +#ifdef CONFIG_ACPI_NUMA + set_cpuid_to_node(cpu, NUMA_NO_NODE); +#endif + set_cpu_present(cpu, false); + num_processors--; + + pr_info("cpu%d hot remove!\n", cpu); + + return 0; +} +EXPORT_SYMBOL(acpi_unmap_cpu); +#endif /* CONFIG_ACPI_HOTPLUG_CPU */ + +void __init acpi_boot_table_init(void) + +{ + /* + * If acpi_disabled, bail out + */ + if (!acpi_disabled) { + if (acpi_table_init()) { + pr_err("Failed to init ACPI tables\n"); + disable_acpi(); + } + pr_info("Enable ACPI support\n"); + } +} diff --git a/arch/sw_64/kernel/asm-offsets.c b/arch/sw_64/kernel/asm-offsets.c new file mode 100644 index 000000000000..44e7fa77265e --- /dev/null +++ b/arch/sw_64/kernel/asm-offsets.c @@ -0,0 +1,262 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Generate definitions needed by assembly language modules. + * This code generates raw asm output which is post-processed to extract + * and format the required data. 
+ */ + +#include <linux/types.h> +#include <linux/stddef.h> +#include <linux/sched.h> +#include <linux/ptrace.h> +#include <linux/kbuild.h> +#include <linux/suspend.h> +#include <asm/io.h> +#include <asm/suspend.h> +#include "traps.c" + +#include <asm/kvm.h> +void foo(void) +{ + DEFINE(TI_TASK, offsetof(struct thread_info, task)); + DEFINE(TI_FLAGS, offsetof(struct thread_info, flags)); + DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); + BLANK(); + + DEFINE(TASK_BLOCKED, offsetof(struct task_struct, blocked)); + DEFINE(TASK_CRED, offsetof(struct task_struct, cred)); + DEFINE(TASK_REAL_PARENT, offsetof(struct task_struct, real_parent)); + DEFINE(TASK_GROUP_LEADER, offsetof(struct task_struct, group_leader)); + DEFINE(TASK_TGID, offsetof(struct task_struct, tgid)); + BLANK(); + + OFFSET(PSTATE_REGS, processor_state, regs); + OFFSET(PSTATE_FPREGS, processor_state, fpregs); + OFFSET(PSTATE_FPCR, processor_state, fpcr); +#ifdef CONFIG_HIBERNATION + OFFSET(PSTATE_PCB, processor_state, pcb); +#endif + OFFSET(PCB_KSP, pcb_struct, ksp); + OFFSET(PBE_ADDR, pbe, address); + OFFSET(PBE_ORIG_ADDR, pbe, orig_address); + OFFSET(PBE_NEXT, pbe, next); + OFFSET(CALLEE_R9, callee_saved_regs, r9); + OFFSET(CALLEE_R10, callee_saved_regs, r10); + OFFSET(CALLEE_R11, callee_saved_regs, r11); + OFFSET(CALLEE_R12, callee_saved_regs, r12); + OFFSET(CALLEE_R13, callee_saved_regs, r13); + OFFSET(CALLEE_R14, callee_saved_regs, r14); + OFFSET(CALLEE_R15, callee_saved_regs, r15); + OFFSET(CALLEE_RA, callee_saved_regs, ra); + OFFSET(CALLEE_F2, callee_saved_fpregs, f2); + OFFSET(CALLEE_F3, callee_saved_fpregs, f3); + OFFSET(CALLEE_F4, callee_saved_fpregs, f4); + OFFSET(CALLEE_F5, callee_saved_fpregs, f5); + OFFSET(CALLEE_F6, callee_saved_fpregs, f6); + OFFSET(CALLEE_F7, callee_saved_fpregs, f7); + OFFSET(CALLEE_F8, callee_saved_fpregs, f8); + OFFSET(CALLEE_F9, callee_saved_fpregs, f9); + BLANK(); + DEFINE(CRED_UID, offsetof(struct cred, uid)); + DEFINE(CRED_EUID, offsetof(struct cred, euid)); + DEFINE(CRED_GID, offsetof(struct cred, gid)); + DEFINE(CRED_EGID, offsetof(struct cred, egid)); + BLANK(); + + DEFINE(PT_REGS_SIZE, sizeof(struct pt_regs)); + DEFINE(PT_REGS_R0, offsetof(struct pt_regs, r0)); + DEFINE(PT_REGS_R1, offsetof(struct pt_regs, r1)); + DEFINE(PT_REGS_R2, offsetof(struct pt_regs, r2)); + DEFINE(PT_REGS_R3, offsetof(struct pt_regs, r3)); + DEFINE(PT_REGS_R4, offsetof(struct pt_regs, r4)); + DEFINE(PT_REGS_R5, offsetof(struct pt_regs, r5)); + DEFINE(PT_REGS_R6, offsetof(struct pt_regs, r6)); + DEFINE(PT_REGS_R7, offsetof(struct pt_regs, r7)); + DEFINE(PT_REGS_R8, offsetof(struct pt_regs, r8)); + DEFINE(PT_REGS_R19, offsetof(struct pt_regs, r19)); + DEFINE(PT_REGS_R20, offsetof(struct pt_regs, r20)); + DEFINE(PT_REGS_R21, offsetof(struct pt_regs, r21)); + DEFINE(PT_REGS_R22, offsetof(struct pt_regs, r22)); + DEFINE(PT_REGS_R23, offsetof(struct pt_regs, r23)); + DEFINE(PT_REGS_R24, offsetof(struct pt_regs, r24)); + DEFINE(PT_REGS_R25, offsetof(struct pt_regs, r25)); + DEFINE(PT_REGS_R26, offsetof(struct pt_regs, r26)); + DEFINE(PT_REGS_R27, offsetof(struct pt_regs, r27)); + DEFINE(PT_REGS_R28, offsetof(struct pt_regs, r28)); + DEFINE(PT_REGS_TRAP_A0, offsetof(struct pt_regs, trap_a0)); + DEFINE(PT_REGS_TRAP_A1, offsetof(struct pt_regs, trap_a1)); + DEFINE(PT_REGS_TRAP_A2, offsetof(struct pt_regs, trap_a2)); + DEFINE(PT_REGS_PS, offsetof(struct pt_regs, ps)); + DEFINE(PT_REGS_PC, offsetof(struct pt_regs, pc)); + DEFINE(PT_REGS_GP, offsetof(struct pt_regs, gp)); + DEFINE(PT_REGS_R16, offsetof(struct pt_regs, 
r16)); + DEFINE(PT_REGS_R17, offsetof(struct pt_regs, r17)); + DEFINE(PT_REGS_R18, offsetof(struct pt_regs, r18)); + BLANK(); + + DEFINE(SWITCH_STACK_SIZE, sizeof(struct switch_stack)); + DEFINE(SWITCH_STACK_R9, offsetof(struct switch_stack, r9)); + DEFINE(SWITCH_STACK_R10, offsetof(struct switch_stack, r10)); + DEFINE(SWITCH_STACK_R11, offsetof(struct switch_stack, r11)); + DEFINE(SWITCH_STACK_R12, offsetof(struct switch_stack, r12)); + DEFINE(SWITCH_STACK_R13, offsetof(struct switch_stack, r13)); + DEFINE(SWITCH_STACK_R14, offsetof(struct switch_stack, r14)); + DEFINE(SWITCH_STACK_R15, offsetof(struct switch_stack, r15)); + DEFINE(SWITCH_STACK_RA, offsetof(struct switch_stack, r26)); + BLANK(); + + DEFINE(ALLREGS_SIZE, sizeof(struct allregs)); + DEFINE(ALLREGS_R0, offsetof(struct allregs, regs[0])); + DEFINE(ALLREGS_R1, offsetof(struct allregs, regs[1])); + DEFINE(ALLREGS_R2, offsetof(struct allregs, regs[2])); + DEFINE(ALLREGS_R3, offsetof(struct allregs, regs[3])); + DEFINE(ALLREGS_R4, offsetof(struct allregs, regs[4])); + DEFINE(ALLREGS_R5, offsetof(struct allregs, regs[5])); + DEFINE(ALLREGS_R6, offsetof(struct allregs, regs[6])); + DEFINE(ALLREGS_R7, offsetof(struct allregs, regs[7])); + DEFINE(ALLREGS_R8, offsetof(struct allregs, regs[8])); + DEFINE(ALLREGS_R9, offsetof(struct allregs, regs[9])); + DEFINE(ALLREGS_R10, offsetof(struct allregs, regs[10])); + DEFINE(ALLREGS_R11, offsetof(struct allregs, regs[11])); + DEFINE(ALLREGS_R12, offsetof(struct allregs, regs[12])); + DEFINE(ALLREGS_R13, offsetof(struct allregs, regs[13])); + DEFINE(ALLREGS_R14, offsetof(struct allregs, regs[14])); + DEFINE(ALLREGS_R15, offsetof(struct allregs, regs[15])); + DEFINE(ALLREGS_R16, offsetof(struct allregs, regs[16])); + DEFINE(ALLREGS_R17, offsetof(struct allregs, regs[17])); + DEFINE(ALLREGS_R18, offsetof(struct allregs, regs[18])); + DEFINE(ALLREGS_R19, offsetof(struct allregs, regs[19])); + DEFINE(ALLREGS_R20, offsetof(struct allregs, regs[20])); + DEFINE(ALLREGS_R21, offsetof(struct allregs, regs[21])); + DEFINE(ALLREGS_R22, offsetof(struct allregs, regs[22])); + DEFINE(ALLREGS_R23, offsetof(struct allregs, regs[23])); + DEFINE(ALLREGS_R24, offsetof(struct allregs, regs[24])); + DEFINE(ALLREGS_R25, offsetof(struct allregs, regs[25])); + DEFINE(ALLREGS_R26, offsetof(struct allregs, regs[26])); + DEFINE(ALLREGS_R27, offsetof(struct allregs, regs[27])); + DEFINE(ALLREGS_R28, offsetof(struct allregs, regs[28])); + DEFINE(ALLREGS_R29, offsetof(struct allregs, regs[29])); + DEFINE(ALLREGS_R30, offsetof(struct allregs, regs[30])); + DEFINE(ALLREGS_R31, offsetof(struct allregs, regs[31])); + DEFINE(ALLREGS_PS, offsetof(struct allregs, ps)); + DEFINE(ALLREGS_PC, offsetof(struct allregs, pc)); + DEFINE(ALLREGS_GP, offsetof(struct allregs, gp)); + DEFINE(ALLREGS_A0, offsetof(struct allregs, a0)); + DEFINE(ALLREGS_A1, offsetof(struct allregs, a1)); + DEFINE(ALLREGS_A2, offsetof(struct allregs, a2)); + BLANK(); + + DEFINE(KVM_REGS_SIZE, sizeof(struct kvm_regs)); + DEFINE(KVM_REGS_R0, offsetof(struct kvm_regs, r0)); + DEFINE(KVM_REGS_R1, offsetof(struct kvm_regs, r1)); + DEFINE(KVM_REGS_R2, offsetof(struct kvm_regs, r2)); + DEFINE(KVM_REGS_R3, offsetof(struct kvm_regs, r3)); + DEFINE(KVM_REGS_R4, offsetof(struct kvm_regs, r4)); + DEFINE(KVM_REGS_R5, offsetof(struct kvm_regs, r5)); + DEFINE(KVM_REGS_R6, offsetof(struct kvm_regs, r6)); + DEFINE(KVM_REGS_R7, offsetof(struct kvm_regs, r7)); + DEFINE(KVM_REGS_R8, offsetof(struct kvm_regs, r8)); + DEFINE(KVM_REGS_R9, offsetof(struct kvm_regs, r9)); + 
DEFINE(KVM_REGS_R10, offsetof(struct kvm_regs, r10)); + DEFINE(KVM_REGS_R11, offsetof(struct kvm_regs, r11)); + DEFINE(KVM_REGS_R12, offsetof(struct kvm_regs, r12)); + DEFINE(KVM_REGS_R13, offsetof(struct kvm_regs, r13)); + DEFINE(KVM_REGS_R14, offsetof(struct kvm_regs, r14)); + DEFINE(KVM_REGS_R15, offsetof(struct kvm_regs, r15)); + DEFINE(KVM_REGS_R19, offsetof(struct kvm_regs, r19)); + DEFINE(KVM_REGS_R20, offsetof(struct kvm_regs, r20)); + DEFINE(KVM_REGS_R21, offsetof(struct kvm_regs, r21)); + DEFINE(KVM_REGS_R22, offsetof(struct kvm_regs, r22)); + DEFINE(KVM_REGS_R23, offsetof(struct kvm_regs, r23)); + DEFINE(KVM_REGS_R24, offsetof(struct kvm_regs, r24)); + DEFINE(KVM_REGS_R25, offsetof(struct kvm_regs, r25)); + DEFINE(KVM_REGS_R26, offsetof(struct kvm_regs, r26)); + DEFINE(KVM_REGS_R27, offsetof(struct kvm_regs, r27)); + DEFINE(KVM_REGS_R28, offsetof(struct kvm_regs, r28)); + DEFINE(KVM_REGS_FPCR, offsetof(struct kvm_regs, fpcr)); + DEFINE(KVM_REGS_F0, offsetof(struct kvm_regs, fp[0 * 4])); + DEFINE(KVM_REGS_F1, offsetof(struct kvm_regs, fp[1 * 4])); + DEFINE(KVM_REGS_F2, offsetof(struct kvm_regs, fp[2 * 4])); + DEFINE(KVM_REGS_F3, offsetof(struct kvm_regs, fp[3 * 4])); + DEFINE(KVM_REGS_F4, offsetof(struct kvm_regs, fp[4 * 4])); + DEFINE(KVM_REGS_F5, offsetof(struct kvm_regs, fp[5 * 4])); + DEFINE(KVM_REGS_F6, offsetof(struct kvm_regs, fp[6 * 4])); + DEFINE(KVM_REGS_F7, offsetof(struct kvm_regs, fp[7 * 4])); + DEFINE(KVM_REGS_F8, offsetof(struct kvm_regs, fp[8 * 4])); + DEFINE(KVM_REGS_F9, offsetof(struct kvm_regs, fp[9 * 4])); + DEFINE(KVM_REGS_F10, offsetof(struct kvm_regs, fp[10 * 4])); + DEFINE(KVM_REGS_F11, offsetof(struct kvm_regs, fp[11 * 4])); + DEFINE(KVM_REGS_F12, offsetof(struct kvm_regs, fp[12 * 4])); + DEFINE(KVM_REGS_F13, offsetof(struct kvm_regs, fp[13 * 4])); + DEFINE(KVM_REGS_F14, offsetof(struct kvm_regs, fp[14 * 4])); + DEFINE(KVM_REGS_F15, offsetof(struct kvm_regs, fp[15 * 4])); + DEFINE(KVM_REGS_F16, offsetof(struct kvm_regs, fp[16 * 4])); + DEFINE(KVM_REGS_F17, offsetof(struct kvm_regs, fp[17 * 4])); + DEFINE(KVM_REGS_F18, offsetof(struct kvm_regs, fp[18 * 4])); + DEFINE(KVM_REGS_F19, offsetof(struct kvm_regs, fp[19 * 4])); + DEFINE(KVM_REGS_F20, offsetof(struct kvm_regs, fp[20 * 4])); + DEFINE(KVM_REGS_F21, offsetof(struct kvm_regs, fp[21 * 4])); + DEFINE(KVM_REGS_F22, offsetof(struct kvm_regs, fp[22 * 4])); + DEFINE(KVM_REGS_F23, offsetof(struct kvm_regs, fp[23 * 4])); + DEFINE(KVM_REGS_F24, offsetof(struct kvm_regs, fp[24 * 4])); + DEFINE(KVM_REGS_F25, offsetof(struct kvm_regs, fp[25 * 4])); + DEFINE(KVM_REGS_F26, offsetof(struct kvm_regs, fp[26 * 4])); + DEFINE(KVM_REGS_F27, offsetof(struct kvm_regs, fp[27 * 4])); + DEFINE(KVM_REGS_F28, offsetof(struct kvm_regs, fp[28 * 4])); + DEFINE(KVM_REGS_F29, offsetof(struct kvm_regs, fp[29 * 4])); + DEFINE(KVM_REGS_F30, offsetof(struct kvm_regs, fp[30 * 4])); + DEFINE(KVM_REGS_PS, offsetof(struct kvm_regs, ps)); + DEFINE(KVM_REGS_PC, offsetof(struct kvm_regs, pc)); + DEFINE(KVM_REGS_GP, offsetof(struct kvm_regs, gp)); + DEFINE(KVM_REGS_R16, offsetof(struct kvm_regs, r16)); + DEFINE(KVM_REGS_R17, offsetof(struct kvm_regs, r17)); + DEFINE(KVM_REGS_R18, offsetof(struct kvm_regs, r18)); + BLANK(); + + DEFINE(VCPU_RET_SIZE, sizeof(struct vcpu_run_ret_stack)); + DEFINE(VCPU_RET_RA, offsetof(struct vcpu_run_ret_stack, ra)); + DEFINE(VCPU_RET_R0, offsetof(struct vcpu_run_ret_stack, r0)); + BLANK(); + + DEFINE(HOST_INT_SIZE, sizeof(struct host_int_args)); + DEFINE(HOST_INT_R18, offsetof(struct host_int_args, r18)); + 
DEFINE(HOST_INT_R17, offsetof(struct host_int_args, r17)); + DEFINE(HOST_INT_R16, offsetof(struct host_int_args, r16)); + BLANK(); + + DEFINE(TASK_THREAD, offsetof(struct task_struct, thread)); + DEFINE(THREAD_CTX_FP, offsetof(struct thread_struct, ctx_fp)); + DEFINE(THREAD_FPCR, offsetof(struct thread_struct, fpcr)); + DEFINE(CTX_FP_F0, offsetof(struct context_fpregs, f0)); + DEFINE(CTX_FP_F1, offsetof(struct context_fpregs, f1)); + DEFINE(CTX_FP_F2, offsetof(struct context_fpregs, f2)); + DEFINE(CTX_FP_F3, offsetof(struct context_fpregs, f3)); + DEFINE(CTX_FP_F4, offsetof(struct context_fpregs, f4)); + DEFINE(CTX_FP_F5, offsetof(struct context_fpregs, f5)); + DEFINE(CTX_FP_F6, offsetof(struct context_fpregs, f6)); + DEFINE(CTX_FP_F7, offsetof(struct context_fpregs, f7)); + DEFINE(CTX_FP_F8, offsetof(struct context_fpregs, f8)); + DEFINE(CTX_FP_F9, offsetof(struct context_fpregs, f9)); + DEFINE(CTX_FP_F10, offsetof(struct context_fpregs, f10)); + DEFINE(CTX_FP_F11, offsetof(struct context_fpregs, f11)); + DEFINE(CTX_FP_F12, offsetof(struct context_fpregs, f12)); + DEFINE(CTX_FP_F13, offsetof(struct context_fpregs, f13)); + DEFINE(CTX_FP_F14, offsetof(struct context_fpregs, f14)); + DEFINE(CTX_FP_F15, offsetof(struct context_fpregs, f15)); + DEFINE(CTX_FP_F16, offsetof(struct context_fpregs, f16)); + DEFINE(CTX_FP_F17, offsetof(struct context_fpregs, f17)); + DEFINE(CTX_FP_F18, offsetof(struct context_fpregs, f18)); + DEFINE(CTX_FP_F19, offsetof(struct context_fpregs, f19)); + DEFINE(CTX_FP_F20, offsetof(struct context_fpregs, f20)); + DEFINE(CTX_FP_F21, offsetof(struct context_fpregs, f21)); + DEFINE(CTX_FP_F22, offsetof(struct context_fpregs, f22)); + DEFINE(CTX_FP_F23, offsetof(struct context_fpregs, f23)); + DEFINE(CTX_FP_F24, offsetof(struct context_fpregs, f24)); + DEFINE(CTX_FP_F25, offsetof(struct context_fpregs, f25)); + DEFINE(CTX_FP_F26, offsetof(struct context_fpregs, f26)); + DEFINE(CTX_FP_F27, offsetof(struct context_fpregs, f27)); + DEFINE(CTX_FP_F28, offsetof(struct context_fpregs, f28)); + DEFINE(CTX_FP_F29, offsetof(struct context_fpregs, f29)); + DEFINE(CTX_FP_F30, offsetof(struct context_fpregs, f30)); + BLANK(); +} diff --git a/arch/sw_64/kernel/audit.c b/arch/sw_64/kernel/audit.c new file mode 100644 index 000000000000..adc4622211d2 --- /dev/null +++ b/arch/sw_64/kernel/audit.c @@ -0,0 +1,61 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/init.h> +#include <linux/types.h> +#include <linux/audit.h> +#include <asm/unistd.h> + +static unsigned int dir_class[] = { +#include <asm-generic/audit_dir_write.h> +~0U +}; + +static unsigned int read_class[] = { +#include <asm-generic/audit_read.h> +~0U +}; + +static unsigned int write_class[] = { +#include <asm-generic/audit_write.h> +~0U +}; + +static unsigned int chattr_class[] = { +#include <asm-generic/audit_change_attr.h> +~0U +}; + +static unsigned int signal_class[] = { +#include <asm-generic/audit_signal.h> +~0U +}; + +int audit_classify_arch(int arch) +{ + return 0; +} + +int audit_classify_syscall(int abi, unsigned int syscall) +{ + switch (syscall) { + case __NR_open: + return 2; + case __NR_openat: + return 3; + case __NR_execve: + return 5; + default: + return 0; + } +} + +static int __init audit_classes_init(void) +{ + audit_register_class(AUDIT_CLASS_WRITE, write_class); + audit_register_class(AUDIT_CLASS_READ, read_class); + audit_register_class(AUDIT_CLASS_DIR_WRITE, dir_class); + audit_register_class(AUDIT_CLASS_CHATTR, chattr_class); + audit_register_class(AUDIT_CLASS_SIGNAL, signal_class); + return 0; 
+} + +device_initcall(audit_classes_init); diff --git a/arch/sw_64/kernel/cacheinfo.c b/arch/sw_64/kernel/cacheinfo.c new file mode 100644 index 000000000000..5193d7544b59 --- /dev/null +++ b/arch/sw_64/kernel/cacheinfo.c @@ -0,0 +1,100 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SW64 cacheinfo support + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ +#include <linux/arch_topology.h> +#include <linux/cacheinfo.h> +#include <asm/hw_init.h> +#include <asm/topology.h> + +/* Populates leaf and increments to next leaf */ +#define populate_cache(cache, leaf, c_level, c_type, c_id) \ +do { \ + leaf->id = c_id; \ + leaf->attributes = CACHE_ID; \ + leaf->type = c_type; \ + leaf->level = c_level; \ + leaf->coherency_line_size = c->cache.linesz; \ + leaf->number_of_sets = c->cache.sets; \ + leaf->ways_of_associativity = c->cache.ways; \ + leaf->size = c->cache.size; \ + leaf++; \ +} while (0) + +int init_cache_level(unsigned int cpu) +{ + struct cpuinfo_sw64 *c = &cpu_data[cpu]; + struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); + int levels = 0, leaves = 0; + + /* + * If Dcache is not set, we assume the cache structures + * are not properly initialized. + */ + if (c->dcache.size) + levels += 1; + else + return -ENOENT; + + + leaves += (c->icache.size) ? 
2 : 1; + + if (c->scache.size) { + levels++; + leaves++; + } + + if (c->tcache.size) { + levels++; + leaves++; + } + + this_cpu_ci->num_levels = levels; + this_cpu_ci->num_leaves = leaves; + return 0; +} + +int populate_cache_leaves(unsigned int cpu) +{ + struct cpuinfo_sw64 *c = &cpu_data[cpu]; + struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); + struct cacheinfo *this_leaf = this_cpu_ci->info_list; + struct cpu_topology *topo = &cpu_topology[cpu]; + + if (c->icache.size) { + cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map); + populate_cache(dcache, this_leaf, 1, CACHE_TYPE_DATA, cpu); + cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map); + populate_cache(icache, this_leaf, 1, CACHE_TYPE_INST, cpu); + + } else { + cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map); + populate_cache(dcache, this_leaf, 1, CACHE_TYPE_UNIFIED, cpu); + } + + if (c->scache.size) { + cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map); + populate_cache(scache, this_leaf, 2, CACHE_TYPE_UNIFIED, cpu); + } + + if (c->tcache.size) { + cpumask_copy(&this_leaf->shared_cpu_map, cpu_online_mask); + populate_cache(tcache, this_leaf, 3, CACHE_TYPE_UNIFIED, topo->package_id); + } + + this_cpu_ci->cpu_map_populated = true; + + return 0; +} diff --git a/arch/sw_64/kernel/core.c b/arch/sw_64/kernel/core.c new file mode 100644 index 000000000000..4a35c1dc1e19 --- /dev/null +++ b/arch/sw_64/kernel/core.c @@ -0,0 +1,72 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/types.h> +#include <linux/pci.h> +#include <linux/sched.h> +#include <linux/init.h> +#include <linux/irq.h> +#include <linux/memblock.h> +#include <linux/pfn.h> +#include <linux/export.h> +#include <asm/core.h> +#include <asm/tlbflush.h> +#include <asm/smp.h> +#include <asm/compiler.h> +#include <asm/mmu_context.h> +#include <asm/bitops.h> +#include <asm/sw64_init.h> +#include <asm/hw_init.h> +#ifdef CONFIG_NUMA +#include <asm/memory.h> +#endif +#include "pci_impl.h" + +#ifdef CONFIG_NUMA +#ifdef CONFIG_DISCONTIGMEM +int pa_to_nid(unsigned long pa) +{ + int i = 0; + phys_addr_t pfn_base, pfn_size, pfn; + + pfn = pa >> PAGE_SHIFT; + for (i = 0; i < MAX_NUMNODES; i++) { + if (!NODE_DATA(i)) + continue; + + pfn_base = NODE_DATA(i)->node_start_pfn; + pfn_size = NODE_DATA(i)->node_spanned_pages; + + if (pfn >= pfn_base && pfn < pfn_base + pfn_size) + return i; + } + + pr_err("%s: pa %#lx does not belong to any node, return node 0\n", __func__, pa); + return 0; +} +EXPORT_SYMBOL(pa_to_nid); +#endif /* CONFIG_DISCONTIGMEM */ + +#ifndef CONFIG_USE_PERCPU_NUMA_NODE_ID +extern int cpu_to_node_map[NR_CPUS]; +int cpuid_to_nid(int cpuid) +{ + return cpu_to_node_map[cpuid]; +} +EXPORT_SYMBOL(cpuid_to_nid); +#endif /* CONFIG_USE_PERCPU_NUMA_NODE_ID */ +#else /* !CONFIG_NUMA */ +#ifdef CONFIG_DISCONTIGMEM +int pa_to_nid(unsigned long pa) +{ + return 0; +} +EXPORT_SYMBOL(pa_to_nid); +#endif /* CONFIG_DISCONTIGMEM */ + +#ifndef CONFIG_USE_PERCPU_NUMA_NODE_ID +int cpuid_to_nid(int cpuid) +{ + return 0; +} +EXPORT_SYMBOL(cpuid_to_nid); +#endif /* CONFIG_USE_PERCPU_NUMA_NODE_ID */ +#endif /* CONFIG_NUMA */ diff --git a/arch/sw_64/kernel/crash_dump.c b/arch/sw_64/kernel/crash_dump.c new file mode 100644 index 000000000000..f3836afe3e25 --- /dev/null +++ b/arch/sw_64/kernel/crash_dump.c @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * arch/sw_64/kernel/crash_dump.c + * + * Copyright (C) 2019 JN + * Author: He Sheng + * + * This code is taken from arch/x86/kernel/crash_dump_64.c + * Created by: Hariprasad Nellitheertha (hari@in.ibm.com) + * Copyright (C) IBM 
Corporation, 2004. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/errno.h>
+#include <linux/crash_dump.h>
+#include <linux/uaccess.h>
+#include <linux/io.h>
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is in the user address space
+ *
+ * This function copies one page from old kernel memory into the buffer
+ * pointed to by @buf. If @buf is in userspace, set @userbuf to %1. Returns
+ * number of bytes copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			size_t csize, unsigned long offset,
+			int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = ioremap(__pfn_to_phys(pfn), PAGE_SIZE);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user(buf, vaddr + offset, csize)) {
+			iounmap(vaddr);
+			return -EFAULT;
+		}
+	} else {
+		memcpy(buf, vaddr + offset, csize);
+	}
+
+	iounmap(vaddr);
+	return csize;
+}
diff --git a/arch/sw_64/kernel/dma_swiotlb.c b/arch/sw_64/kernel/dma_swiotlb.c
new file mode 100644
index 000000000000..a665e022fad8
--- /dev/null
+++ b/arch/sw_64/kernel/dma_swiotlb.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arch/sw_64/kernel/dma_swiotlb.c
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/pci.h>
+#include <linux/gfp.h>
+#include <linux/export.h>
+#include <linux/scatterlist.h>
+#include <linux/log2.h>
+#include <linux/dma-mapping.h>
+#include <linux/iommu-helper.h>
+#include <linux/slab.h>
+
+#include <linux/swiotlb.h>
+#include <linux/cache.h>
+#include <linux/module.h>
+#include <asm/dma.h>
+
+#include <asm/io.h>
+
+const struct dma_map_ops *dma_ops = NULL;
+EXPORT_SYMBOL(dma_ops);
diff --git a/arch/sw_64/kernel/dup_print.c b/arch/sw_64/kernel/dup_print.c
new file mode 100644
index 000000000000..ac0a95d4d30b
--- /dev/null
+++ b/arch/sw_64/kernel/dup_print.c
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+#include <linux/delay.h>
+#include <linux/spinlock.h>
+#include <linux/uaccess.h>
+
+#ifdef CONFIG_SW64_RRK
+
+#define KERNEL_PRINTK_BUFF_BASE (0x700000UL + __START_KERNEL_map)
+
+static DEFINE_SPINLOCK(printk_lock);
+
+unsigned long sw64_printk_offset;
+#define PRINTK_SIZE 0x100000UL
+
+/*
+ * For outputting the kernel message on the console
+ * with a full-system emulator.
+ */
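The mailbox store in sw64_printk() below packs the low 32 bits of the printk buffer address and the message length into a single 64-bit write for the emulator to decode. A hedged restatement of that encoding for context (helper name illustrative; field layout taken from the code that follows):

/* bits 31..0: low half of the buffer address, bits 63..32: message length */
static inline unsigned long qemu_printf_pack(unsigned long buf, unsigned long len)
{
	return (buf & 0xffffffffUL) | (len << 32);
}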
+#define QEMU_PRINTF_BUFF_BASE (0x805000040000ULL | PAGE_OFFSET)
+
+int sw64_printk(const char *fmt, va_list args)
+{
+	char *sw64_printk_buf;
+	int printed_len = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&printk_lock, flags);
+
+	sw64_printk_buf = (char *)(KERNEL_PRINTK_BUFF_BASE + sw64_printk_offset);
+
+	if (sw64_printk_offset >= (PRINTK_SIZE-1024)) { //printk wrapped
+		sw64_printk_offset = 0;
+		sw64_printk_buf = (char *)(KERNEL_PRINTK_BUFF_BASE + sw64_printk_offset);
+		memset(sw64_printk_buf, 0, PRINTK_SIZE);
+		printed_len += vscnprintf(sw64_printk_buf, 1024, fmt, args);
+	} else {
+		printed_len += vscnprintf(sw64_printk_buf, 1024, fmt, args);
+		if (is_guest_or_emul()) {
+			unsigned long write_addr = QEMU_PRINTF_BUFF_BASE;
+			*(unsigned long *)write_addr = (unsigned long)((((unsigned long)sw64_printk_buf) & 0xffffffffUL)
+					| ((unsigned long)printed_len << 32));
+		}
+	}
+	sw64_printk_offset += printed_len;
+	spin_unlock_irqrestore(&printk_lock, flags);
+	return printed_len;
+}
+#endif
+
+#ifdef CONFIG_SW64_RRU
+static DEFINE_SPINLOCK(printf_lock);
+#define USER_PRINT_BUFF_BASE (0x600000UL + __START_KERNEL_map)
+#define USER_PRINT_BUFF_LEN 0x100000UL
+#define USER_MESSAGE_MAX_LEN 0x100000UL
+unsigned long sw64_printf_offset;
+int sw64_user_printf(const char __user *buf, int len)
+{
+	static char *user_printf_buf;
+	unsigned long flags;
+
+	if (current->pid <= 0)
+		return 0;
+
+	/*
+	 * Do not write a large (fake) message, which may not come from
+	 * STDOUT/STDERR any more since the file descriptor could be
+	 * duplicated into a pipe.
	 */
+	if (len > USER_MESSAGE_MAX_LEN)
+		return 0;
+
+	spin_lock_irqsave(&printf_lock, flags);
+	user_printf_buf = (char *)(USER_PRINT_BUFF_BASE + sw64_printf_offset);
+
+	if (sw64_printf_offset == 0)
+		memset(user_printf_buf, 0, USER_PRINT_BUFF_LEN);
+
+	if ((sw64_printf_offset + len) > USER_PRINT_BUFF_LEN) {
+		sw64_printf_offset = 0;
+		user_printf_buf = (char *)(USER_PRINT_BUFF_BASE + sw64_printf_offset);
+		memset(user_printf_buf, 0, USER_PRINT_BUFF_LEN);
+	}
+	copy_from_user(user_printf_buf, buf, len);
+	sw64_printf_offset += len;
+	spin_unlock_irqrestore(&printf_lock, flags);
+	return 0;
+}
+#endif
diff --git a/arch/sw_64/kernel/early_init.c b/arch/sw_64/kernel/early_init.c
new file mode 100644
index 000000000000..392627bef8bb
--- /dev/null
+++ b/arch/sw_64/kernel/early_init.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/start_kernel.h>
+
+#include <asm/sw64_init.h>
+
+void sw64_init_noop(void) { }
+struct sw64_platform_ops *sw64_platform;
+EXPORT_SYMBOL(sw64_platform);
+struct sw64_chip_ops *sw64_chip;
+struct sw64_chip_init_ops *sw64_chip_init;
+
+static void __init sw64_setup_platform_ops(void)
+{
+	/*
+	 * FIXME: set platform operations depending on CONFIG for now.
+	 * SMBIOS will help us to determine the actual board.
+ */ +#ifdef CONFIG_PLATFORM_XUELANG + sw64_platform = &xuelang_ops; +#endif +} + + +asmlinkage __visible void __init sw64_start_kernel(void) +{ + sw64_setup_chip_ops(); + sw64_setup_platform_ops(); + sw64_platform->ops_fixup(); + start_kernel(); +} diff --git a/arch/sw_64/kernel/early_printk.c b/arch/sw_64/kernel/early_printk.c new file mode 100644 index 000000000000..f4d5f2d5c876 --- /dev/null +++ b/arch/sw_64/kernel/early_printk.c @@ -0,0 +1,186 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/console.h> +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/string.h> +#include <linux/errno.h> +#include <asm/io.h> + +static unsigned long early_serial_base; /* ttyS0 */ + +#define XMTRDY 0x20 + +#define DLAB 0x80 + +#define TXR 0 /* Transmit register (WRITE) */ +#define RXR 0 /* Receive register (READ) */ +#define IER 1 /* Interrupt Enable */ +#define IIR 2 /* Interrupt ID */ +#define FCR 2 /* FIFO control */ +#define LCR 3 /* Line control */ +#define MCR 4 /* Modem control */ +#define LSR 5 /* Line Status */ +#define MSR 6 /* Modem Status */ +#define DLL 0 /* Divisor Latch Low */ +#define DLH 1 /* Divisor latch High */ + +static void mem32_serial_out(unsigned long addr, int offset, int value) +{ + void __iomem *vaddr = (void __iomem *)addr; + + offset = offset << 9; + + writel(value, vaddr + offset); +} + +static unsigned int mem32_serial_in(unsigned long addr, int offset) +{ + void __iomem *vaddr = (void __iomem *)addr; + + offset = offset << 9; + + return readl(vaddr + offset); +} + +static unsigned int (*serial_in)(unsigned long addr, int offset) = mem32_serial_in; +static void (*serial_out)(unsigned long addr, int offset, int value) = mem32_serial_out; + +static int early_serial_putc(unsigned char ch) +{ + unsigned int timeout = 0xffff; + + while ((serial_in(early_serial_base, LSR) & XMTRDY) == 0 && --timeout) + cpu_relax(); + serial_out(early_serial_base, TXR, ch); + + return timeout ? 
0 : -1; +} + +static void early_serial_write(struct console *con, const char *s, unsigned int n) +{ + while (*s && n-- > 0) { + if (*s == '\n') + early_serial_putc('\r'); + early_serial_putc(*s); + s++; + } +} + +static unsigned int uart_get_refclk(void) +{ + return 24000000UL; +} + +static unsigned int uart_calculate_baudrate_divisor(unsigned long baudrate) +{ + unsigned int refclk = uart_get_refclk(); + + return (1 + (2 * refclk) / (baudrate * 16)) / 2; +} + +static __init void early_serial_hw_init(unsigned long baud) +{ + unsigned char c; + unsigned long divisor = uart_calculate_baudrate_divisor(baud); + + serial_out(early_serial_base, LCR, 0x3); /* 8n1 */ + serial_out(early_serial_base, IER, 0); /* no interrupt */ + serial_out(early_serial_base, FCR, 0); /* no fifo */ + serial_out(early_serial_base, MCR, 0x3); /* DTR + RTS */ + + c = serial_in(early_serial_base, LCR); + serial_out(early_serial_base, LCR, c | DLAB); + serial_out(early_serial_base, DLL, divisor & 0xff); + serial_out(early_serial_base, DLH, (divisor >> 8) & 0xff); + serial_out(early_serial_base, LCR, c & ~DLAB); +} + +#define DEFAULT_BAUD 115200 + +static __init void early_serial_init(char *s) +{ + unsigned long baud = DEFAULT_BAUD; + char *e; + + if (*s == ',') + ++s; + + if (*s) { + unsigned int port; + static const long bases[] __initconst = { 0xfff0803300000000ULL, + 0xfff0903300000000ULL }; + + if (!strncmp(s, "ttyS", 4)) + s += 4; + port = simple_strtoul(s, &e, 10); + if (port > 1 || s == e) + port = 0; + early_serial_base = bases[port]; + s += strcspn(s, ","); + if (*s == ',') + s++; + } + + if (*s) { + baud = simple_strtoull(s, &e, 0); + + if (baud == 0 || s == e) + baud = DEFAULT_BAUD; + } + + /* These will always be IO based ports */ + serial_in = mem32_serial_in; + serial_out = mem32_serial_out; + + /* Set up the HW */ + early_serial_hw_init(baud); +} + +static struct console early_serial_console = { + .name = "early", + .write = early_serial_write, + .flags = CON_PRINTBUFFER, + .index = -1, +}; + +static void early_console_register(struct console *con, int keep_early) +{ + if (con->index != -1) { + pr_crit("ERROR: earlyprintk= %s already used\n", + con->name); + return; + } + early_console = con; + + if (keep_early) + early_console->flags &= ~CON_BOOT; + else + early_console->flags |= CON_BOOT; + + register_console(early_console); +} + +static int __init setup_early_printk(char *buf) +{ + int keep; + + if (!buf) + return 0; + + if (early_console) + return 0; + + keep = (strstr(buf, "keep") != NULL); + + if (!strncmp(buf, "serial", 6)) { + buf += 6; + early_serial_init(buf); + early_console_register(&early_serial_console, keep); + if (!strncmp(buf, ",ttyS", 5)) + buf += 5; + } + + return 0; +} + +early_param("earlyprintk", setup_early_printk); diff --git a/arch/sw_64/kernel/entry-ftrace.S b/arch/sw_64/kernel/entry-ftrace.S new file mode 100644 index 000000000000..3f88a9fe2e3e --- /dev/null +++ b/arch/sw_64/kernel/entry-ftrace.S @@ -0,0 +1,195 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw_64/kernel/entry-ftrace.S + * + * Author: linyue + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + */ +#include <linux/linkage.h> + + .text + .set noat + .align 4 + +#define FTRACE_SP_OFF 0x50 + .macro mcount_enter + subl $sp, FTRACE_SP_OFF, $sp + stl $16, 0($sp) + stl $17, 0x8($sp) + stl $18, 0x10($sp) + stl $26, 0x18($sp) + stl $27, 0x20($sp) + stl $28, 0x28($sp) + stl $29, 0x30($sp) + stl $19, 0x38($sp) + stl $20, 0x40($sp) + stl $21, 0x48($sp) + .endm + + .macro mcount_end + ldl $16, 0($sp) + ldl $17, 0x8($sp) + ldl $18, 0x10($sp) + ldl $26, 0x18($sp) + ldl $27, 0x20($sp) + ldl $28, 0x28($sp) + ldl $29, 0x30($sp) + ldl $19, 0x38($sp) + ldl $20, 0x40($sp) + ldl $21, 0x48($sp) + addl $sp, FTRACE_SP_OFF, $sp + .endm + +#ifdef CONFIG_DYNAMIC_FTRACE + .global _mcount + .ent _mcount +_mcount: + ret $31, ($28), 1 + .end _mcount + + + .global ftrace_caller + .ent ftrace_caller +ftrace_caller: + mcount_enter + + br $27, 2f +2: ldgp $29, 0($27) + + bis $28, $31, $16 + subl $16, 8, $16 + bis $26, $31, $17 + + ldi $4, current_tracer + ldl $27, 0($4) + + .global ftrace_call +ftrace_call: /* tracer(pc, lr); call 26, 27 , 1 */ + nop + +#ifdef CONFIG_FUNCTION_GRAPH_TRACER + ldi $27, prepare_ftrace_return /* prepare_ftrace_return(&lr, pc, fp) */ + .global ftrace_graph_call +ftrace_graph_call: /* ftrace_graph_caller(); */ + nop /* If enabled, this will be replaced */ + /* "br ftrace_graph_caller" */ +#endif + mcount_end + ret $31, ($28), 1 + .end ftrace_caller +#else /* !CONFIG_DYNAMIC_FTRACE */ + + .global _mcount + .ent _mcount +_mcount: + mcount_enter + + br $27, 1f +1: ldgp $29, 0($27) + + ldi $4, ftrace_trace_function + ldl $27, 0($4) + ldi $5, ftrace_stub + cmpeq $4, $5, $6 + bne $6, skip_ftrace + + bis $28, $31, $16 + subl $16, 8, $16 + bis $26, $31, $17 + call $26, ($27), 1 + +skip_ftrace: +#ifdef CONFIG_FUNCTION_GRAPH_TRACER + ldi $4, ftrace_graph_return + ldl $4, 0($4) + ldi $5, ftrace_stub + cmpeq $4, $5, $6 + beq $6, ftrace_graph_caller + + + ldi $4, ftrace_graph_entry + ldl $4, 0($4) + ldi $5, ftrace_graph_entry_stub + cmpeq $4, $5, $6 + beq $6, ftrace_graph_caller +#endif + mcount_end + ret $31, ($28), 1 + .end _mcount + +#endif /* CONFIG_DYNAMIC_FTRACE */ + + .global ftrace_stub + .ent ftrace_stub +ftrace_stub: + ret $31, ($26), 1 + .end ftrace_stub + + +#ifdef CONFIG_FUNCTION_GRAPH_TRACER + .macro RESTORE_GRAPH_ARGS + ldl $26, 0x18($sp) + ldl $28, 0x28($sp) + .endm + + /* save return value regs*/ + .macro save_return_regs + subl $sp, 0x8, $sp + stl $0, 0x0($sp) + .endm + + /* restore return value regs*/ + .macro restore_return_regs + ldl $0, 0x0($sp) + addl $sp, 0x8, $sp + .endm + + +/* + * void ftrace_graph_caller(void) + * + * Called from _mcount() or ftrace_caller() when function_graph tracer is + * selected. + * This function w/ prepare_ftrace_return() fakes link register's value on + * the call stack in order to intercept instrumented function's return path + * and run return_to_handler() later on its exit. + */ + .global ftrace_graph_caller + .ent ftrace_graph_caller +ftrace_graph_caller: + memb /* need memb, otherwise it'll go wrong */ + RESTORE_GRAPH_ARGS + addl $sp, 0x18, $16 + bis $28, $31, $17 + subl $17, 8, $17 + bis $15, $31, $18 /* parent's fp */ + + call $26, ($27) /* prepare_ftrace_return() */ + + mcount_end + ret $31, ($28), 1 + .end ftrace_graph_caller + +/* + * void return_to_handler(void) + * + * Run ftrace_return_to_handler() before going back to parent. + * @fp is checked against the value passed by ftrace_graph_caller() + * only when HAVE_FUNCTION_GRAPH_FP_TEST is enabled. 
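+ *
+ * A sketch of the interception (descriptive only; the saved state
+ * actually lives on the per-task ftrace return stack):
+ *
+ *	without tracing:		with graph tracing:
+ *	  bar: ... ret -> caller	  bar: ... ret -> return_to_handler
+ *					  return_to_handler:
+ *					    ra = ftrace_return_to_handler()
+ *					    ret -> the real caller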
+ */ +ENTRY(return_to_handler) + save_return_regs + br $27, 3f +3: ldgp $29, 0($27) + ldi $27, ftrace_return_to_handler + call $26, ($27) + bis $0, $31, $26 + restore_return_regs + ret $31, ($26), 1 +END(return_to_handler) + +#endif diff --git a/arch/sw_64/kernel/entry.S b/arch/sw_64/kernel/entry.S new file mode 100644 index 000000000000..753eb31a76c6 --- /dev/null +++ b/arch/sw_64/kernel/entry.S @@ -0,0 +1,706 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Kernel entry-points. + */ + +#include <asm/asm-offsets.h> +#include <asm/thread_info.h> +#include <asm/hmcall.h> +#include <asm/errno.h> +#include <asm/unistd.h> + + .text + .set noat +/* + * This defines the normal kernel pt-regs layout. + * + * regs 9-15 preserved by C code + * regs 16-18 saved by HMcode + * regs 29-30 saved and set up by HMcode + * JRP - Save regs 16-18 in a special area of the stack, so that + * the hmcode-provided values are available to the signal handler. + */ + +#define SAVE_ALL \ + subl $sp, PT_REGS_PS, $sp; \ + stl $0, PT_REGS_R0($sp); \ + stl $1, PT_REGS_R1($sp); \ + stl $2, PT_REGS_R2($sp); \ + stl $3, PT_REGS_R3($sp); \ + stl $4, PT_REGS_R4($sp); \ + stl $28, PT_REGS_R28($sp); \ + stl $5, PT_REGS_R5($sp); \ + stl $6, PT_REGS_R6($sp); \ + stl $7, PT_REGS_R7($sp); \ + stl $8, PT_REGS_R8($sp); \ + stl $19, PT_REGS_R19($sp); \ + stl $20, PT_REGS_R20($sp); \ + stl $21, PT_REGS_R21($sp); \ + stl $22, PT_REGS_R22($sp); \ + stl $23, PT_REGS_R23($sp); \ + stl $24, PT_REGS_R24($sp); \ + stl $25, PT_REGS_R25($sp); \ + stl $26, PT_REGS_R26($sp); \ + stl $27, PT_REGS_R27($sp); \ + stl $16, PT_REGS_TRAP_A0($sp); \ + stl $17, PT_REGS_TRAP_A1($sp); \ + stl $18, PT_REGS_TRAP_A2($sp) + +#define RESTORE_ALL \ + ldl $0, PT_REGS_R0($sp); \ + ldl $1, PT_REGS_R1($sp); \ + ldl $2, PT_REGS_R2($sp); \ + ldl $3, PT_REGS_R3($sp); \ + ldl $4, PT_REGS_R4($sp); \ + ldl $5, PT_REGS_R5($sp); \ + ldl $6, PT_REGS_R6($sp); \ + ldl $7, PT_REGS_R7($sp); \ + ldl $8, PT_REGS_R8($sp); \ + ldl $19, PT_REGS_R19($sp); \ + ldl $20, PT_REGS_R20($sp); \ + ldl $21, PT_REGS_R21($sp); \ + ldl $22, PT_REGS_R22($sp); \ + ldl $23, PT_REGS_R23($sp); \ + ldl $24, PT_REGS_R24($sp); \ + ldl $25, PT_REGS_R25($sp); \ + ldl $26, PT_REGS_R26($sp); \ + ldl $27, PT_REGS_R27($sp); \ + ldl $28, PT_REGS_R28($sp); \ + addl $sp, PT_REGS_PS, $sp + +/* + * Non-syscall kernel entry points. + */ + + .align 4 + .globl entInt + .ent entInt +entInt: + SAVE_ALL + ldi $8, 0x3fff + ldi $26, ret_from_sys_call + bic $sp, $8, $8 + mov $sp, $19 + call $31, do_entInt + .end entInt + + .align 4 + .globl entArith + .ent entArith +entArith: + SAVE_ALL + ldi $8, 0x3fff + ldi $26, ret_from_sys_call + bic $sp, $8, $8 + mov $sp, $18 + call $31, do_entArith + .end entArith + + .align 4 + .globl entMM + .ent entMM +entMM: + SAVE_ALL +/* save $9 - $15 so the inline exception code can manipulate them. */ + subl $sp, SWITCH_STACK_RA, $sp + stl $9, SWITCH_STACK_R9($sp) + stl $10, SWITCH_STACK_R10($sp) + stl $11, SWITCH_STACK_R11($sp) + stl $12, SWITCH_STACK_R12($sp) + stl $13, SWITCH_STACK_R13($sp) + stl $14, SWITCH_STACK_R14($sp) + stl $15, SWITCH_STACK_R15($sp) + addl $sp, SWITCH_STACK_RA, $19 +/* handle the fault */ + ldi $8, 0x3fff + bic $sp, $8, $8 + call $26, do_page_fault +/* reload the registers after the exception code played. 
*/ + ldl $9, SWITCH_STACK_R9($sp) + ldl $10, SWITCH_STACK_R10($sp) + ldl $11, SWITCH_STACK_R11($sp) + ldl $12, SWITCH_STACK_R12($sp) + ldl $13, SWITCH_STACK_R13($sp) + ldl $14, SWITCH_STACK_R14($sp) + ldl $15, SWITCH_STACK_R15($sp) + addl $sp, SWITCH_STACK_RA, $sp +/* finish up the syscall as normal. */ + br ret_from_sys_call + .end entMM + + .align 4 + .globl entIF + .ent entIF +entIF: + SAVE_ALL + ldi $8, 0x3fff + ldi $26, ret_from_sys_call + bic $sp, $8, $8 + mov $sp, $17 + call $31, do_entIF + .end entIF + + .align 4 + .globl entUna + .ent entUna +entUna: + ldi $sp, -ALLREGS_PS($sp) + stl $0, ALLREGS_R0($sp) + ldl $0, ALLREGS_PS($sp) /* get PS */ + stl $1, ALLREGS_R1($sp) + stl $2, ALLREGS_R2($sp) + stl $3, ALLREGS_R3($sp) + and $0, 8, $0 /* user mode? */ + stl $4, ALLREGS_R4($sp) + bne $0, entUnaUser /* yup -> do user-level unaligned fault */ + stl $5, ALLREGS_R5($sp) + stl $6, ALLREGS_R6($sp) + stl $7, ALLREGS_R7($sp) + stl $8, ALLREGS_R8($sp) + stl $9, ALLREGS_R9($sp) + stl $10, ALLREGS_R10($sp) + stl $11, ALLREGS_R11($sp) + stl $12, ALLREGS_R12($sp) + stl $13, ALLREGS_R13($sp) + stl $14, ALLREGS_R14($sp) + stl $15, ALLREGS_R15($sp) + /* 16-18 HMCODE-saved */ + stl $19, ALLREGS_R19($sp) + stl $20, ALLREGS_R20($sp) + stl $21, ALLREGS_R21($sp) + stl $22, ALLREGS_R22($sp) + stl $23, ALLREGS_R23($sp) + stl $24, ALLREGS_R24($sp) + stl $25, ALLREGS_R25($sp) + stl $26, ALLREGS_R26($sp) + stl $27, ALLREGS_R27($sp) + stl $28, ALLREGS_R28($sp) + mov $sp, $19 + stl $gp, ALLREGS_R29($sp) + ldi $8, 0x3fff + stl $31, ALLREGS_R31($sp) + bic $sp, $8, $8 + call $26, do_entUna + ldl $0, ALLREGS_R0($sp) + ldl $1, ALLREGS_R1($sp) + ldl $2, ALLREGS_R2($sp) + ldl $3, ALLREGS_R3($sp) + ldl $4, ALLREGS_R4($sp) + ldl $5, ALLREGS_R5($sp) + ldl $6, ALLREGS_R6($sp) + ldl $7, ALLREGS_R7($sp) + ldl $8, ALLREGS_R8($sp) + ldl $9, ALLREGS_R9($sp) + ldl $10, ALLREGS_R10($sp) + ldl $11, ALLREGS_R11($sp) + ldl $12, ALLREGS_R12($sp) + ldl $13, ALLREGS_R13($sp) + ldl $14, ALLREGS_R14($sp) + ldl $15, ALLREGS_R15($sp) + /* 16-18 HMCODE-saved */ + ldl $19, ALLREGS_R19($sp) + ldl $20, ALLREGS_R20($sp) + ldl $21, ALLREGS_R21($sp) + ldl $22, ALLREGS_R22($sp) + ldl $23, ALLREGS_R23($sp) + ldl $24, ALLREGS_R24($sp) + ldl $25, ALLREGS_R25($sp) + ldl $26, ALLREGS_R26($sp) + ldl $27, ALLREGS_R27($sp) + ldl $28, ALLREGS_R28($sp) + ldl $gp, ALLREGS_R29($sp) + ldi $sp, ALLREGS_PS($sp) + sys_call HMC_rti + .end entUna + + .align 4 + .ent entUnaUser +entUnaUser: + ldl $0, ALLREGS_R0($sp) /* restore original $0 */ + ldi $sp, ALLREGS_PS($sp) /* pop entUna's stack frame */ + SAVE_ALL /* setup normal kernel stack */ + ldi $sp, -SWITCH_STACK_RA($sp) + stl $9, SWITCH_STACK_R9($sp) + stl $10, SWITCH_STACK_R10($sp) + stl $11, SWITCH_STACK_R11($sp) + stl $12, SWITCH_STACK_R12($sp) + stl $13, SWITCH_STACK_R13($sp) + stl $14, SWITCH_STACK_R14($sp) + stl $15, SWITCH_STACK_R15($sp) + ldi $8, 0x3fff + addl $sp, SWITCH_STACK_RA, $19 + bic $sp, $8, $8 + call $26, do_entUnaUser + ldl $9, SWITCH_STACK_R9($sp) + ldl $10, SWITCH_STACK_R10($sp) + ldl $11, SWITCH_STACK_R11($sp) + ldl $12, SWITCH_STACK_R12($sp) + ldl $13, SWITCH_STACK_R13($sp) + ldl $14, SWITCH_STACK_R14($sp) + ldl $15, SWITCH_STACK_R15($sp) + ldi $sp, SWITCH_STACK_RA($sp) + br ret_from_sys_call + .end entUnaUser + + +/* + * The system call entry point is special. Most importantly, it looks + * like a function call to userspace as far as clobbered registers. We + * do preserve the argument registers (for syscall restarts) and $26 + * (for leaf syscall functions). + * + * So much for theory. 
We don't take advantage of this yet. + * + * Note that a0-a2 are not saved by HMcode as with the other entry points. + */ + + .align 4 + .globl entSys + .globl ret_from_sys_call + .ent entSys +entSys: + + SAVE_ALL + ldi $8, 0x3fff + bic $sp, $8, $8 + ldi $4, NR_SYSCALLS($31) + stl $16, PT_REGS_R16($sp) + ldi $5, sys_call_table + ldi $27, sys_ni_syscall + cmpult $0, $4, $4 + ldw $3, TI_FLAGS($8) + stl $17, PT_REGS_R17($sp) + s8addl $0, $5, $5 + stl $18, PT_REGS_R18($sp) + ldi $6, _TIF_SYSCALL_WORK + and $3, $6, $3 + bne $3, strace + + beq $4, 1f + ldl $27, 0($5) +1: call $26, ($27), ni_syscall + ldgp $gp, 0($26) + blt $0, $syscall_error /* the call failed */ + stl $0, PT_REGS_R0($sp) + stl $31, PT_REGS_R19($sp) /* a3=0 => no error */ + + .align 4 +ret_from_sys_call: +#ifdef CONFIG_SUBARCH_C3B + fillcs 0($sp) /* prefetch */ + fillcs 128($sp) /* prefetch */ +#endif + selne $26, 0, $18, $18 /* $18 = 0 => non-restartable */ + ldl $0, PT_REGS_PS($sp) + and $0, 8, $0 + beq $0, ret_to_kernel +ret_to_user: + /* Make sure need_resched and sigpending don't change between + sampling and the rti. */ + ldi $16, 7 + sys_call HMC_swpipl + ldw $17, TI_FLAGS($8) + and $17, _TIF_WORK_MASK, $2 + bne $2, work_pending +restore_all: + RESTORE_ALL + sys_call HMC_rti + +ret_to_kernel: + ldi $16, 7 + sys_call HMC_swpipl + br restore_all + + + .align 3 +$syscall_error: + /* + * Some system calls (e.g., ptrace) can return arbitrary + * values which might normally be mistaken as error numbers. + * Those functions must zero $0 (v0) directly in the stack + * frame to indicate that a negative return value wasn't an + * error number.. + */ + ldl $18, PT_REGS_R0($sp) /* old syscall nr (zero if success) */ + beq $18, $ret_success + + ldl $19, PT_REGS_R19($sp) /* .. and this a3 */ + subl $31, $0, $0 /* with error in v0 */ + addl $31, 1, $1 /* set a3 for errno return */ + stl $0, PT_REGS_R0($sp) + mov $31, $26 /* tell "ret_from_sys_call" we can restart */ + stl $1, PT_REGS_R19($sp) /* a3 for return */ + br ret_from_sys_call + + +$ret_success: + stl $0, PT_REGS_R0($sp) + stl $31, PT_REGS_R19($sp) /* a3=0 => no error */ + br ret_from_sys_call + .end entSys + +/* + * Do all cleanup when returning from all interrupts and system calls. + * + * Arguments: + * $8: current. + * $17: TI_FLAGS. + * $18: The old syscall number, or zero if this is not a return + * from a syscall that errored and is possibly restartable. + * $19: The old a3 value + */ + + .align 4 + .ent work_pending +work_pending: + and $17, _TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_UPROBE, $2 + bne $2, $work_notifysig + +$work_resched: + /* + * We can get here only if we returned from syscall without SIGPENDING + * or got through work_notifysig already. Either case means no syscall + * restarts for us, so let $18 and $19 burn. + */ + call $26, schedule + mov 0, $18 + br ret_to_user + +$work_notifysig: + mov $sp, $16 + bsr $1, do_switch_stack + call $26, do_work_pending + bsr $1, undo_switch_stack + br restore_all + .end work_pending + + + +/* + * PTRACE syscall handler + */ + + .align 4 + .ent strace +strace: + /* set up signal stack, call syscall_trace */ + bsr $1, do_switch_stack + mov $0, $9 + mov $19, $10 + call $26, syscall_trace_enter + mov $9, $18 + mov $10, $19 + bsr $1, undo_switch_stack + + blt $0, $syscall_trace_failed + + /* get the system call number and the arguments back.. 
*/ + ldl $16, PT_REGS_R16($sp) + ldl $17, PT_REGS_R17($sp) + ldl $18, PT_REGS_R18($sp) + ldl $19, PT_REGS_R19($sp) + ldl $20, PT_REGS_R20($sp) + ldl $21, PT_REGS_R21($sp) + + /* get the system call pointer.. */ + ldi $1, NR_SYSCALLS($31) + ldi $2, sys_call_table + ldi $27, ni_syscall + + cmpult $0, $1, $1 + s8addl $0, $2, $2 + beq $1, 1f + ldl $27, 0($2) +1: call $26, ($27), sys_gettimeofday +ret_from_straced: + ldgp $gp, 0($26) + + /* check return.. */ + blt $0, $strace_error /* the call failed */ + stl $31, PT_REGS_R19($sp) /* a3=0 => no error */ +$strace_success: + stl $0, PT_REGS_R0($sp) /* save return value */ + + bsr $1, do_switch_stack + call $26, syscall_trace_leave + bsr $1, undo_switch_stack + br $31, ret_from_sys_call + + .align 3 +$strace_error: + ldl $18, PT_REGS_R0($sp) /* old syscall nr (zero if success) */ + + beq $18, $strace_success + ldl $19, PT_REGS_R19($sp) /* .. and this a3 */ + + subl $31, $0, $0 /* with error in v0 */ + addl $31, 1, $1 /* set a3 for errno return */ + stl $0, PT_REGS_R0($sp) + stl $1, PT_REGS_R19($sp) /* a3 for return */ + + bsr $1, do_switch_stack + mov $18, $9 /* save old syscall number */ + mov $19, $10 /* save old a3 */ + call $26, syscall_trace_leave + mov $9, $18 + mov $10, $19 + bsr $1, undo_switch_stack + + mov $31, $26 /* tell "ret_from_sys_call" we can restart */ + br ret_from_sys_call + +$syscall_trace_failed: + bsr $1, do_switch_stack + mov $18, $9 + mov $19, $10 + call $26, syscall_trace_leave + mov $9, $18 + mov $10, $19 + bsr $1, undo_switch_stack + mov $31, $26 /* tell "ret_from_sys_call" we can restart */ + br ret_from_sys_call + .end strace + + .align 4 + .ent do_switch_stack +do_switch_stack: + ldi $sp, -SWITCH_STACK_SIZE($sp) + flds $f31, 0($sp) /* fillde hint */ + stl $9, SWITCH_STACK_R9($sp) + stl $10, SWITCH_STACK_R10($sp) + stl $11, SWITCH_STACK_R11($sp) + stl $12, SWITCH_STACK_R12($sp) + stl $13, SWITCH_STACK_R13($sp) + stl $14, SWITCH_STACK_R14($sp) + stl $15, SWITCH_STACK_R15($sp) + stl $26, SWITCH_STACK_RA($sp) + // SIMD-FP + ldl $9, TI_TASK($8) + ldi $9, TASK_THREAD($9) + ldi $10, THREAD_CTX_FP($9) + vstd $f0, CTX_FP_F0($10) + vstd $f1, CTX_FP_F1($10) + vstd $f2, CTX_FP_F2($10) + vstd $f3, CTX_FP_F3($10) + vstd $f4, CTX_FP_F4($10) + vstd $f5, CTX_FP_F5($10) + vstd $f6, CTX_FP_F6($10) + vstd $f7, CTX_FP_F7($10) + vstd $f8, CTX_FP_F8($10) + vstd $f9, CTX_FP_F9($10) + vstd $f10, CTX_FP_F10($10) + vstd $f11, CTX_FP_F11($10) + vstd $f12, CTX_FP_F12($10) + vstd $f13, CTX_FP_F13($10) + vstd $f14, CTX_FP_F14($10) + vstd $f15, CTX_FP_F15($10) + vstd $f16, CTX_FP_F16($10) + vstd $f17, CTX_FP_F17($10) + vstd $f18, CTX_FP_F18($10) + vstd $f19, CTX_FP_F19($10) + vstd $f20, CTX_FP_F20($10) + vstd $f21, CTX_FP_F21($10) + vstd $f22, CTX_FP_F22($10) + vstd $f23, CTX_FP_F23($10) + vstd $f24, CTX_FP_F24($10) + vstd $f25, CTX_FP_F25($10) + vstd $f26, CTX_FP_F26($10) + vstd $f27, CTX_FP_F27($10) + rfpcr $f0 + vstd $f28, CTX_FP_F28($10) + vstd $f29, CTX_FP_F29($10) + vstd $f30, CTX_FP_F30($10) + fstd $f0, THREAD_FPCR($9) + vldd $f0, CTX_FP_F0($10) + ldl $9, SWITCH_STACK_R9($sp) + ldl $10, SWITCH_STACK_R10($sp) + ret $31, ($1), 1 + .end do_switch_stack + + .align 4 + .ent undo_switch_stack +undo_switch_stack: +#ifdef CONFIG_SUBARCH_C3B + fillcs 0($sp) /* prefetch */ +#endif + ldl $11, SWITCH_STACK_R11($sp) + ldl $12, SWITCH_STACK_R12($sp) + ldl $13, SWITCH_STACK_R13($sp) + ldl $14, SWITCH_STACK_R14($sp) + ldl $15, SWITCH_STACK_R15($sp) + ldl $26, SWITCH_STACK_RA($sp) + // SIMD-FP + ldl $9, TI_TASK($8) + ldi $9, TASK_THREAD($9) + fldd $f0, 
THREAD_FPCR($9) + wfpcr $f0 + fimovd $f0, $10 + and $10, 0x3, $10 + beq $10, $setfpec_0 + subl $10, 0x1, $10 + beq $10, $setfpec_1 + subl $10, 0x1, $10 + beq $10, $setfpec_2 + setfpec3 + br $setfpec_over +$setfpec_0: + setfpec0 + br $setfpec_over +$setfpec_1: + setfpec1 + br $setfpec_over +$setfpec_2: + setfpec2 +$setfpec_over: + ldi $10, THREAD_CTX_FP($9) + vldd $f0, CTX_FP_F0($10) + vldd $f1, CTX_FP_F1($10) + vldd $f2, CTX_FP_F2($10) + vldd $f3, CTX_FP_F3($10) + vldd $f4, CTX_FP_F4($10) + vldd $f5, CTX_FP_F5($10) + vldd $f6, CTX_FP_F6($10) + vldd $f7, CTX_FP_F7($10) + vldd $f8, CTX_FP_F8($10) + vldd $f9, CTX_FP_F9($10) + vldd $f10, CTX_FP_F10($10) + vldd $f11, CTX_FP_F11($10) + vldd $f12, CTX_FP_F12($10) + vldd $f13, CTX_FP_F13($10) + vldd $f14, CTX_FP_F14($10) + vldd $f15, CTX_FP_F15($10) + vldd $f16, CTX_FP_F16($10) + vldd $f17, CTX_FP_F17($10) + vldd $f18, CTX_FP_F18($10) + vldd $f19, CTX_FP_F19($10) + vldd $f20, CTX_FP_F20($10) + vldd $f21, CTX_FP_F21($10) + vldd $f22, CTX_FP_F22($10) + vldd $f23, CTX_FP_F23($10) + vldd $f24, CTX_FP_F24($10) + vldd $f25, CTX_FP_F25($10) + vldd $f26, CTX_FP_F26($10) + vldd $f27, CTX_FP_F27($10) + vldd $f28, CTX_FP_F28($10) + vldd $f29, CTX_FP_F29($10) + vldd $f30, CTX_FP_F30($10) + ldl $9, SWITCH_STACK_R9($sp) + ldl $10, SWITCH_STACK_R10($sp) + ldi $sp, SWITCH_STACK_SIZE($sp) + ret $31, ($1), 1 + .end undo_switch_stack + +/* + * The meat of the context switch code. + */ + + .align 4 + .globl __switch_to + .ent __switch_to +__switch_to: + .prologue 0 + bsr $1, do_switch_stack + sys_call HMC_swpctx + ldi $8, 0x3fff + bic $sp, $8, $8 + bsr $1, undo_switch_stack + mov $17, $0 + ret + .end __switch_to + +/* + * New processes begin life here. + */ + + .globl ret_from_fork + .align 4 + .ent ret_from_fork +ret_from_fork: + ldi $26, ret_from_sys_call + mov $17, $16 + jmp $31, schedule_tail + .end ret_from_fork + +/* + * ... and new kernel threads - here + */ + .align 4 + .globl ret_from_kernel_thread + .ent ret_from_kernel_thread +ret_from_kernel_thread: + mov $17, $16 + call $26, schedule_tail + mov $9, $27 + mov $10, $16 + call $26, ($9) + mov $31, $19 /* to disable syscall restarts */ + br $31, ret_to_user + .end ret_from_kernel_thread + +/* + * Special system calls. Most of these are special in that they either + * have to play switch_stack games or in some way use the pt_regs struct. + */ + +.macro fork_like name + .align 4 + .globl sw64_\name + .ent sw64_\name +sw64_\name: + .prologue 0 + bsr $1, do_switch_stack + call $26, sys_\name + ldl $26, SWITCH_STACK_RA($sp) + ldi $sp, SWITCH_STACK_SIZE($sp) + ret + .end sw64_\name + .endm + +fork_like fork +fork_like vfork +fork_like clone + + .align 4 + .globl sys_sigreturn + .ent sys_sigreturn +sys_sigreturn: + .prologue 0 + ldi $9, ret_from_straced + cmpult $26, $9, $9 + ldi $sp, -SWITCH_STACK_SIZE($sp) + call $26, do_sigreturn + bne $9, 1f + call $26, syscall_trace_leave +1: br $1, undo_switch_stack + br ret_from_sys_call + .end sys_sigreturn + + .align 4 + .globl sys_rt_sigreturn + .ent sys_rt_sigreturn +sys_rt_sigreturn: + .prologue 0 + ldi $9, ret_from_straced + cmpult $26, $9, $9 + ldi $sp, -SWITCH_STACK_SIZE($sp) + call $26, do_rt_sigreturn + bne $9, 1f + call $26, syscall_trace_leave +1: br $1, undo_switch_stack + br ret_from_sys_call + .end sys_rt_sigreturn + + .align 4 + .globl ni_syscall + .ent ni_syscall +ni_syscall: + .prologue 0 + /* Special because it also implements overflow handling via + * syscall number 0. And if you recall, zero is a special + * trigger for "not an error". 
Store large non-zero there.
+	 */
+	ldi	$0, -ENOSYS
+	unop
+	stl	$0, PT_REGS_R0($sp)
+	ret
+	.end	ni_syscall
diff --git a/arch/sw_64/kernel/ftrace.c b/arch/sw_64/kernel/ftrace.c
new file mode 100644
index 000000000000..413562b5d9be
--- /dev/null
+++ b/arch/sw_64/kernel/ftrace.c
@@ -0,0 +1,176 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Based on arch/arm64/kernel/ftrace.c
+ *
+ * Copyright (C) 2019 os kernel team
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/ftrace.h>
+#include <linux/module.h>
+#include <linux/swab.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
+#include <asm/ftrace.h>
+#include <asm/insn.h>
+
+#ifdef CONFIG_FUNCTION_TRACER
+EXPORT_SYMBOL(_mcount);
+#endif
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+unsigned long current_tracer = (unsigned long)ftrace_stub;
+
+/*
+ * Replace two instructions, which may be a branch or NOP.
+ */
+static int ftrace_modify_double_code(unsigned long pc, u64 new)
+{
+	if (sw64_insn_double_write((void *)pc, new))
+		return -EPERM;
+	return 0;
+}
+
+/*
+ * Replace a single instruction, which may be a branch or NOP.
+ */
+static int ftrace_modify_code(unsigned long pc, u32 new)
+{
+	if (sw64_insn_write((void *)pc, new))
+		return -EPERM;
+	return 0;
+}
+
+/*
+ * Replace the tracer function in ftrace_caller()
+ */
+int ftrace_update_ftrace_func(ftrace_func_t func)
+{
+	unsigned long pc;
+	u32 new;
+
+	current_tracer = (unsigned long)func;
+
+	pc = (unsigned long)&ftrace_call;
+
+	new = sw64_insn_call(R26, R27);
+	return ftrace_modify_code(pc, new);
+}
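+
+/*
+ * Call-site patching scheme used below (a summary; the ldl encoding is
+ * spelled out in ftrace_make_call()):
+ *
+ *	disabled (ftrace_make_nop):	enabled (ftrace_make_call):
+ *		nop				ldl  $28, dyn_ftrace_addr($8)
+ *		nop				call $28, ($28)
+ *
+ * $8 holds the current thread_info pointer, so the trampoline address
+ * is loaded from thread_info.dyn_ftrace_addr, which
+ * ftrace_dyn_arch_init() initializes to FTRACE_ADDR.
+ */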
+/*
+ * Turn on the call to ftrace_caller() in the instrumented function
+ */
+int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned long pc = rec->ip;
+	u32 new;
+	int ret;
+
+	/* ldl r28,(ftrace_addr_offset)(r8) */
+	new = (0x23U << 26) | (28U << 21) | (8U << 16) | offsetof(struct thread_info, dyn_ftrace_addr);
+	ret = ftrace_modify_code(pc, new);
+	if (ret)
+		return ret;
+	pc = pc + 4;
+	new = sw64_insn_call(R28, R28);
+	return ftrace_modify_code(pc, new);
+}
+
+/*
+ * Turn off the call to ftrace_caller() in the instrumented function
+ */
+int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
+		    unsigned long addr)
+{
+	unsigned long pc = rec->ip;
+	unsigned long insn;
+
+	insn = sw64_insn_nop();
+	insn = (insn << 32) | insn;
+	return ftrace_modify_double_code(pc, insn);
+}
+
+void arch_ftrace_update_code(int command)
+{
+	ftrace_modify_all_code(command);
+}
+
+/* tracer_addr must be the same as syscall_ftrace */
+int __init ftrace_dyn_arch_init(void)
+{
+	init_thread_info.dyn_ftrace_addr = FTRACE_ADDR;
+	return 0;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE */
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+/*
+ * function_graph tracer expects ftrace_return_to_handler() to be called
+ * on the way back to parent. For this purpose, this function is called
+ * in _mcount() or ftrace_caller() to replace the return address (*parent)
+ * on the call stack with return_to_handler.
+ *
+ * Note that @frame_pointer is used only for a sanity check later.
+ */
+void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
+			   unsigned long frame_pointer)
+{
+	unsigned long return_hooker = (unsigned long)&return_to_handler;
+	unsigned long old;
+
+	if (unlikely(atomic_read(&current->tracing_graph_pause)))
+		return;
+
+	/*
+	 * Note:
+	 * No protection against faulting at *parent, which may be seen
+	 * on other archs. It's unlikely to happen on SW64.
+	 */
+	old = *parent;
+
+	if (!function_graph_enter(old, self_addr, frame_pointer, NULL))
+		*parent = return_hooker;
+}
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+/*
+ * Turn on/off the call to ftrace_graph_caller() in ftrace_caller()
+ * depending on @enable.
+ */
+static int ftrace_modify_graph_caller(bool enable)
+{
+	unsigned long pc = (unsigned long)&ftrace_graph_call;
+	u32 branch, nop;
+
+	branch = sw64_insn_br(R31, pc, (unsigned long)ftrace_graph_caller);
+	nop = sw64_insn_nop();
+
+	if (enable)
+		return ftrace_modify_code(pc, branch);
+	else
+		return ftrace_modify_code(pc, nop);
+}
+
+int ftrace_enable_ftrace_graph_caller(void)
+{
+	return ftrace_modify_graph_caller(true);
+}
+
+int ftrace_disable_ftrace_graph_caller(void)
+{
+	return ftrace_modify_graph_caller(false);
+}
+#endif /* CONFIG_DYNAMIC_FTRACE */
+#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/sw_64/kernel/head.S b/arch/sw_64/kernel/head.S
new file mode 100644
index 000000000000..5fff0f33c9e2
--- /dev/null
+++ b/arch/sw_64/kernel/head.S
@@ -0,0 +1,109 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Initial boot stuff. At this point, the bootloader has already
+ * switched into HMcode and loaded us at the correct address
+ * (START_ADDR). So there isn't much left for us to do: just set up
+ * the kernel global pointer and jump to the kernel entry point.
+ */
+
+#include <linux/init.h>
+#include <asm/asm-offsets.h>
+#include <asm/hmcall.h>
+#include <asm/setup.h>
+
+__HEAD
+	.globl	_stext
+	.set	noreorder
+	.globl	__start
+	.ent	__start
+_stext:
+__start:
+	.prologue 0
+	br	$27, 1f
+1:	ldgp	$29, 0($27)
+	/* We need to get current_task_info loaded up... */
+	ldi	$8, init_thread_union
+	/* ... and find our stack ... */
+	ldi	$30, 0x4000 - PT_REGS_SIZE($8)
+	/* ... and then we can clear the bss data. */
+	ldi	$2, __bss_start
+	ldi	$3, __bss_stop
+	/* 8-byte alignment */
+1:	and	$2, 0x7, $1		# align check
+	bne	$1, 3f
+2:	subl	$3, $2, $1		# aligned clear
+	ble	$1, 4f
+	subl	$1, 0x8, $1
+	ble	$1, 3f
+	stl	$31, 0($2)
+	addl	$2, 8, $2
+	br	$31, 2b
+3:	stb	$31, 0($2)		# unaligned clear
+	addl	$2, 1, $2
+	subl	$3, $2, $1
+	bgt	$1, 1b
+4:	# clearing finished
+#ifdef CONFIG_RELOCATABLE
+	ldi	$30, -8($30)
+	stl	$29, 0($30)
+	/* Copy the kernel and apply the relocations */
+	call	$26, relocate_kernel
+	ldl	$29, 0($30)
+	addl	$29, $0, $29
+	/* Repoint the sp into the new kernel image */
+	ldi	$30, 0x4000 - PT_REGS_SIZE($8)
+#endif
+	/* ... and then we can start the kernel. */
+	call	$26, sw64_start_kernel
+	sys_call HMC_halt
+	.end	__start
+
+#ifdef CONFIG_SMP
+	.align 3
+	.globl	__smp_callin
+	.ent	__smp_callin
+	/* On entry here the PCB of the idle task for this processor
+	 * has been loaded. We've arranged for the tilde_pcb[x] for
+	 * this process to contain the PCBB of the target idle task.
+	 */
+__smp_callin:
+	.prologue 1
+	br	$27, 2f			# we copy this from above "br $27 1f"
2:	ldgp	$29, 0($27)		# First order of business, load the GP.
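+	# Bring-up sequence for a secondary core (a summary of the code
+	# below): flush the TLB (HMC_tbi), read the hard core id
+	# (HMC_whami), map it to a logical cpu via __rcid_to_cpu, switch
+	# to that cpu's idle-task PCB (HMC_swpctx), then enter smp_callin().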
+ + subl $31, 2, $16 + sys_call HMC_tbi + + sys_call HMC_whami # Get hard cid + + sll $0, 2, $0 + ldi $1, __rcid_to_cpu + addl $1, $0, $1 + ldw $0, 0($1) # Get logical cpu number + + sll $0, 3, $0 + ldi $1, tidle_pcb + addl $1, $0, $1 + ldl $16, 0($1) # Get PCBB of idle thread + + sys_call HMC_swpctx + ldi $8, 0x3fff # Find "current". + bic $30, $8, $8 + + call $26, smp_callin + sys_call HMC_halt + .end __smp_callin +#endif /* CONFIG_SMP */ + # + # It is handy, on occasion, to make halt actually just loop. + # Putting it here means we dont have to recompile the whole + # kernel. + # + + .align 3 + .globl halt + .ent halt +halt: + .prologue 0 + sys_call HMC_halt + .end halt diff --git a/arch/sw_64/kernel/hibernate.c b/arch/sw_64/kernel/hibernate.c new file mode 100644 index 000000000000..33426e3ed305 --- /dev/null +++ b/arch/sw_64/kernel/hibernate.c @@ -0,0 +1,80 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/suspend.h> +#include <asm/hmcall.h> +#include <asm/suspend.h> + +struct processor_state hibernate_state; +/* Defined in hibernate_asm.S */ +extern int restore_image(void); + +void save_processor_state(void) +{ + struct vcpucb *vcb = &(hibernate_state.vcb); + + vcb->ksp = rdksp(); + vcb->usp = rdusp(); + vcb->pcbb = rdpcbb(); + vcb->ptbr = rdptbr(); +} + +void restore_processor_state(void) +{ + struct vcpucb *vcb = &(hibernate_state.vcb); + + wrksp(vcb->ksp); + wrusp(vcb->usp); + wrpcbb(vcb->pcbb); + wrptbr(vcb->ptbr); + sflush(); + tbia(); + imb(); +} + +int swsusp_arch_resume(void) +{ + restore_image(); + return 0; +} +/* References to section boundaries */ +extern const void __nosave_begin, __nosave_end; +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn = PFN_DOWN(__pa(&__nosave_begin)); + unsigned long nosave_end_pfn = PFN_UP(__pa(&__nosave_end)); + + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); +} + +struct restore_data_record { + unsigned long magic; +}; + +#define RESTORE_MAGIC 0x0123456789ABCDEFUL + +/** + * arch_hibernation_header_save - populate the architecture specific part + * of a hibernation image header + * @addr: address to save the data at + */ +int arch_hibernation_header_save(void *addr, unsigned int max_size) +{ + struct restore_data_record *rdr = addr; + + if (max_size < sizeof(struct restore_data_record)) + return -EOVERFLOW; + rdr->magic = RESTORE_MAGIC; + return 0; +} + +/** + * arch_hibernation_header_restore - read the architecture specific data + * from the hibernation image header + * @addr: address to read the data from + */ +int arch_hibernation_header_restore(void *addr) +{ + struct restore_data_record *rdr = addr; + + return (rdr->magic == RESTORE_MAGIC) ? 
0 : -EINVAL; +} diff --git a/arch/sw_64/kernel/hibernate_asm.S b/arch/sw_64/kernel/hibernate_asm.S new file mode 100644 index 000000000000..3acbcdbae0b3 --- /dev/null +++ b/arch/sw_64/kernel/hibernate_asm.S @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include <linux/linkage.h> +#include <asm/asm-offsets.h> +#include <asm/page.h> +#include <asm/regdef.h> + + .text + .set noat +ENTRY(swsusp_arch_suspend) + ldi $16, hibernate_state + ldi $1, PSTATE_REGS($16) + stl $9, CALLEE_R9($1) + stl $10, CALLEE_R10($1) + stl $11, CALLEE_R11($1) + stl $12, CALLEE_R12($1) + stl $13, CALLEE_R13($1) + stl $14, CALLEE_R14($1) + stl $15, CALLEE_R15($1) + stl $26, CALLEE_RA($1) + /* SIMD-FP */ + ldi $1, PSTATE_FPREGS($16) + vstd $f2, CALLEE_F2($1) + vstd $f3, CALLEE_F3($1) + vstd $f4, CALLEE_F4($1) + vstd $f5, CALLEE_F5($1) + vstd $f6, CALLEE_F6($1) + vstd $f7, CALLEE_F7($1) + vstd $f8, CALLEE_F8($1) + vstd $f9, CALLEE_F9($1) + rfpcr $f0 + fstd $f0, PSTATE_FPCR($16) + + ldi $1, PSTATE_PCB($16) + stl sp, PCB_KSP($1) + call swsusp_save + ldi $16, hibernate_state + ldi $1, PSTATE_REGS($16) + ldl $26, CALLEE_RA($1) + + /* save current_thread_info()->pcbb */ + ret +END(swsusp_arch_suspend) + +ENTRY(restore_image) + /* prepare to copy image data to their original locations */ + ldi t0, restore_pblist + ldl t0, 0(t0) +$loop: + beq t0, $done + + /* get addresses from the pbe and copy the page */ + ldl t1, PBE_ADDR(t0) /* source */ + ldl t2, PBE_ORIG_ADDR(t0) /* destination */ + ldi t3, PAGE_SIZE + addl t1, t3, t3 +$cpyloop: + ldl t8, 0(t1) + stl t8, 0(t2) + addl t1, 8, t1 + addl t2, 8, t2 + cmpeq t1, t3, t4 + beq t4, $cpyloop + + /* progress to the next pbe */ + ldl t0, PBE_NEXT(t0) + bne t0, $loop +$done: + + /* tell the hibernation core that we've just restored the memory */ + ldi $0, in_suspend + stl $31, 0($0) + + ldi $16, hibernate_state + ldi $1, PSTATE_REGS($16) + + ldl $9, CALLEE_R9($1) + ldl $10, CALLEE_R10($1) + ldl $11, CALLEE_R11($1) + ldl $12, CALLEE_R12($1) + ldl $13, CALLEE_R13($1) + ldl $14, CALLEE_R14($1) + ldl $15, CALLEE_R15($1) + ldl $26, CALLEE_RA($1) + /* SIMD-FP */ + fldd $f0, PSTATE_FPCR($16) + wfpcr $f0 + fimovd $f0, $2 + and $2, 0x3, $2 + beq $2, $hibernate_setfpec_0 + subl $2, 0x1, $2 + beq $2, $hibernate_setfpec_1 + subl $2, 0x1, $2 + beq $2, $hibernate_setfpec_2 + setfpec3 + br $hibernate_setfpec_over +$hibernate_setfpec_0: + setfpec0 + br $hibernate_setfpec_over +$hibernate_setfpec_1: + setfpec1 + br $hibernate_setfpec_over +$hibernate_setfpec_2: + setfpec2 +$hibernate_setfpec_over: + ldi $1, PSTATE_FPREGS($16) + vldd $f2, CALLEE_F2($1) + vldd $f3, CALLEE_F3($1) + vldd $f4, CALLEE_F4($1) + vldd $f5, CALLEE_F5($1) + vldd $f6, CALLEE_F6($1) + vldd $f7, CALLEE_F7($1) + vldd $f8, CALLEE_F8($1) + vldd $f9, CALLEE_F9($1) + + ldi $1, PSTATE_PCB($16) + ldl sp, PCB_KSP($1) + + ldi $8, 0x3fff + bic sp, $8, $8 + + ldi $0, 0($31) + + ret +END(restore_image) diff --git a/arch/sw_64/kernel/insn.c b/arch/sw_64/kernel/insn.c new file mode 100644 index 000000000000..71d3832d1fe3 --- /dev/null +++ b/arch/sw_64/kernel/insn.c @@ -0,0 +1,123 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2019, serveros, linyue + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ +#include <linux/bitops.h> +#include <linux/bug.h> +#include <linux/compiler.h> +#include <linux/kernel.h> +#include <linux/mm.h> +#include <linux/smp.h> +#include <linux/spinlock.h> +#include <linux/stop_machine.h> +#include <linux/types.h> +#include <linux/uaccess.h> +#include <linux/kprobes.h> + +#include <asm/cacheflush.h> +#include <asm/insn.h> + + +//static DEFINE_RAW_SPINLOCK(patch_lock); + +int __kprobes sw64_insn_read(void *addr, u32 *insnp) +{ + int ret; + __le32 val; + + ret = copy_from_kernel_nofault(&val, addr, SW64_INSN_SIZE); + if (!ret) + *insnp = le32_to_cpu(val); + + return ret; +} + +static int __kprobes __sw64_insn_write(void *addr, __le32 insn) +{ + void *waddr = addr; + int ret; + + //raw_spin_lock_irqsave(&patch_lock, flags); + + ret = copy_to_kernel_nofault(waddr, &insn, SW64_INSN_SIZE); + + //raw_spin_unlock_irqrestore(&patch_lock, flags); + + return ret; +} + +static int __kprobes __sw64_insn_double_write(void *addr, __le64 insn) +{ + void *waddr = addr; + //unsigned long flags = 0; + int ret; + + //raw_spin_lock_irqsave(&patch_lock, flags); + + ret = copy_to_kernel_nofault(waddr, &insn, 2 * SW64_INSN_SIZE); + + //raw_spin_unlock_irqrestore(&patch_lock, flags); + + return ret; +} + +int __kprobes sw64_insn_write(void *addr, u32 insn) +{ + u32 *tp = addr; + /* SW64 instructions must be word aligned */ + if ((uintptr_t)tp & 0x3) + return -EINVAL; + return __sw64_insn_write(addr, cpu_to_le32(insn)); +} + +int __kprobes sw64_insn_double_write(void *addr, u64 insn) +{ + u32 *tp = addr; + /* SW64 instructions must be word aligned */ + if ((uintptr_t)tp & 0x3) + return -EINVAL; + return __sw64_insn_double_write(addr, cpu_to_le64(insn)); +} +unsigned int __kprobes sw64_insn_nop(void) +{ + return SW64_BIS(R31, R31, R31); +} + +unsigned int __kprobes sw64_insn_call(unsigned int ra, unsigned int rb) +{ + return SW64_CALL(ra, rb, 1); +} + +unsigned int __kprobes sw64_insn_sys_call(unsigned int num) +{ + return SW64_SYS_CALL(num); +} + +/* 'pc' is the address of br instruction, not the +4 PC. 'new_pc' is the target address. */ +unsigned int __kprobes sw64_insn_br(unsigned int ra, unsigned long pc, unsigned long new_pc) +{ + int offset = new_pc - pc; + unsigned int disp, minus = 0x1fffff; + + if (!(offset <= BR_MAX_DISP && offset >= -BR_MAX_DISP)) + return -1; + if (offset > 0) + disp = (offset - 4) / 4; + else + disp = ~(-offset / 4) & minus; + + return SW64_BR(ra, disp); + +} diff --git a/arch/sw_64/kernel/irq.c b/arch/sw_64/kernel/irq.c new file mode 100644 index 000000000000..6cd26af15b23 --- /dev/null +++ b/arch/sw_64/kernel/irq.c @@ -0,0 +1,123 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * linux/arch/sw_64/kernel/irq.c + * + * Copyright (C) 1995 Linus Torvalds + * + * This file contains the code used by various IRQ handling routines: + * asking for different IRQ's should be done through these routines + * instead of just grabbing them. Thus setups with different IRQ numbers + * shouldn't result in any weird surprises, and installing new handlers + * should be easier. 
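+ *
+ * A driver-side sketch for context (generic kernel API, nothing
+ * SW64-specific; "my_handler" and "my_dev" are placeholder names):
+ *
+ *	if (request_irq(irq, my_handler, 0, "my_dev", my_dev))
+ *		pr_err("my_dev: could not request irq %d\n", irq);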
+ */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/kernel_stat.h> +#include <linux/signal.h> +#include <linux/sched.h> +#include <linux/ptrace.h> +#include <linux/interrupt.h> +#include <linux/random.h> +#include <linux/irq.h> +#include <linux/proc_fs.h> +#include <linux/seq_file.h> +#include <linux/profile.h> +#include <linux/bitops.h> +#include <linux/uaccess.h> + +#include <asm/hw_init.h> +#include <asm/hardirq.h> +#include <asm/io.h> + +volatile unsigned long irq_err_count; +DEFINE_PER_CPU(unsigned long, irq_pmi_count); +DEFINE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); +EXPORT_PER_CPU_SYMBOL(irq_stat); + +void ack_bad_irq(unsigned int irq) +{ + irq_err_count++; + pr_crit("Unexpected IRQ trap at vector %u\n", irq); +} + +u64 arch_irq_stat_cpu(unsigned int cpu) +{ + u64 sum = per_cpu(irq_stat, cpu).timer_irqs_event; + + return sum; +} + +u64 arch_irq_stat(void) +{ + return 0; +} + +int arch_show_interrupts(struct seq_file *p, int prec) +{ + int j; + + seq_printf(p, "%*s: ", prec, "TIMER"); + for_each_online_cpu(j) + seq_printf(p, "%10u", per_cpu(irq_stat, j).timer_irqs_event); + seq_puts(p, "\n"); + +#ifdef CONFIG_SMP + seq_printf(p, "%*s: ", prec, "IPI"); + for_each_online_cpu(j) + seq_printf(p, "%10lu ", cpu_data[j].ipi_count); + seq_puts(p, "\n"); +#endif + seq_printf(p, "%*s: ", prec, "PMI"); + for_each_online_cpu(j) + seq_printf(p, "%10lu ", per_cpu(irq_pmi_count, j)); + seq_puts(p, "\n"); + + seq_printf(p, "ERR: %10lu\n", irq_err_count); + return 0; +} + +/* + * handle_irq handles all normal device IRQ's (the special + * SMP cross-CPU interrupts have their own specific + * handlers). + */ + +#define MAX_ILLEGAL_IRQS 16 + +void +handle_irq(int irq) +{ + /* + * We ack quickly, we don't want the irq controller + * thinking we're snobs just because some other CPU has + * disabled global interrupts (we have already done the + * INT_ACK cycles, it's too late to try to pretend to the + * controller that we aren't taking the interrupt). + * + * 0 return value means that this irq is already being + * handled by some other CPU. (or is disabled) + */ + static unsigned int illegal_count; + struct irq_desc *desc = irq_to_desc(irq); + + if (!desc || ((unsigned int) irq > ACTUAL_NR_IRQS && + illegal_count < MAX_ILLEGAL_IRQS)) { + irq_err_count++; + illegal_count++; + pr_crit("device_interrupt: invalid interrupt %d\n", irq); + return; + } + + irq_enter(); + generic_handle_irq_desc(desc); + irq_exit(); +} + +#ifdef CONFIG_HOTPLUG_CPU +void fixup_irqs(void) +{ + irq_migrate_all_off_this_cpu(); +} +#endif diff --git a/arch/sw_64/kernel/irq_sw64.c b/arch/sw_64/kernel/irq_sw64.c new file mode 100644 index 000000000000..376e8397ba35 --- /dev/null +++ b/arch/sw_64/kernel/irq_sw64.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SW64 specific irq code. 
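+ *
+ * Dispatch path as wired up here and in entry.S: the entInt vector
+ * (installed with wrent() in init_IRQ()) traps into do_entInt(), which
+ * forwards the interrupt to handle_chip_irq() and from there to the
+ * generic IRQ layer.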
+ */ + +#include <linux/sched.h> +#include <linux/irq.h> +#include <linux/kernel_stat.h> +#include <linux/module.h> +#include <linux/irqchip.h> +#include <linux/irqdesc.h> +#include <linux/irqdomain.h> +#include <asm/dma.h> +#include <asm/irq_impl.h> +#include <asm/core.h> +#include <asm/perf_event.h> +#include <asm/hmcall.h> + +asmlinkage void +do_entInt(unsigned long type, unsigned long vector, + unsigned long irq_arg, struct pt_regs *regs) +{ + local_irq_disable(); + handle_chip_irq(type, vector, irq_arg, regs); +} +EXPORT_SYMBOL(do_entInt); + +void __init +init_IRQ(void) +{ + /* + * Just in case the platform init_irq() causes interrupts/mchecks + * (as is the case with RAWHIDE, at least). + */ + wrent(entInt, 0); + + sw64_init_irq(); + irqchip_init(); +} + +DEFINE_SPINLOCK(irq_lock); + +static void +__enable_irq(struct irq_data *d) +{ +} + +static void +__disable_irq(struct irq_data *d) +{ +} + +static unsigned int +__startup_irq(struct irq_data *d) +{ + __enable_irq(d); + return 0; +} + +static void +__mask_and_ack_irq(struct irq_data *d) +{ + spin_lock(&irq_lock); + __disable_irq(d); + spin_unlock(&irq_lock); +} + +struct irq_chip sw64_irq_chip = { + .name = "SW64_NODE", + .irq_startup = __startup_irq, + .irq_unmask = __enable_irq, + .irq_mask = __disable_irq, + .irq_mask_ack = __mask_and_ack_irq, +}; + +void __weak arch_init_msi_domain(struct irq_domain *parent) {} + +int __init arch_early_irq_init(void) +{ + int i; + + for (i = 0; i < NR_IRQS; ++i) { + irq_set_chip_and_handler(i, &sw64_irq_chip, handle_level_irq); + irq_set_status_flags(i, IRQ_LEVEL); + } + arch_init_msi_domain(NULL); + return 0; +} + +int __init arch_probe_nr_irqs(void) +{ + return NR_IRQS_LEGACY; +} diff --git a/arch/sw_64/kernel/jump_label.c b/arch/sw_64/kernel/jump_label.c new file mode 100644 index 000000000000..a67d16eb3076 --- /dev/null +++ b/arch/sw_64/kernel/jump_label.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/kernel.h> +#include <linux/jump_label.h> +#include <asm/insn.h> +#include <asm/bug.h> +#include <asm/cacheflush.h> + +void arch_jump_label_transform(struct jump_entry *entry, + enum jump_label_type type) +{ + u32 *insnp = (u32 *)entry->code; + u32 insn; + + if (type == JUMP_LABEL_JMP) { + insn = sw64_insn_br(R31, (entry->code), entry->target); + BUG_ON(insn == -1); + } else { + insn = sw64_insn_nop(); + } + + *insnp = insn; + + flush_icache_range(entry->code, entry->code + SW64_INSN_SIZE); +} + +void arch_jump_label_transform_static(struct jump_entry *entry, + enum jump_label_type type) +{ + /* + * no need to rewrite NOP + */ +} diff --git a/arch/sw_64/kernel/kgdb.c b/arch/sw_64/kernel/kgdb.c new file mode 100644 index 000000000000..c1100ef8fcdd --- /dev/null +++ b/arch/sw_64/kernel/kgdb.c @@ -0,0 +1,236 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * sw64 KGDB support + * + * Based on arch/arm64/kernel/kgdb.c + * + * Copyright (C) Xia Bin + * Author: Xia Bin + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. 
+ */ + +#include <linux/irq.h> +#include <linux/kdebug.h> +#include <linux/kgdb.h> +#include <linux/uaccess.h> +#include <asm/cacheflush.h> + +struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = { + { "r0", 8, offsetof(struct pt_regs, r0)}, + { "r1", 8, offsetof(struct pt_regs, r1)}, + { "r2", 8, offsetof(struct pt_regs, r2)}, + { "r3", 8, offsetof(struct pt_regs, r3)}, + { "r4", 8, offsetof(struct pt_regs, r4)}, + { "r5", 8, offsetof(struct pt_regs, r5)}, + { "r6", 8, offsetof(struct pt_regs, r6)}, + { "r7", 8, offsetof(struct pt_regs, r7)}, + { "r8", 8, offsetof(struct pt_regs, r8)}, + + { "r9", 8, -1 }, + { "r10", 8, -1 }, + { "r11", 8, -1 }, + { "r12", 8, -1 }, + { "r13", 8, -1 }, + { "r14", 8, -1 }, + { "r15", 8, -1 }, + + { "r16", 8, offsetof(struct pt_regs, r16)}, + { "r17", 8, offsetof(struct pt_regs, r17)}, + { "r18", 8, offsetof(struct pt_regs, r18)}, + + { "r19", 8, offsetof(struct pt_regs, r19)}, + { "r20", 8, offsetof(struct pt_regs, r20)}, + { "r21", 8, offsetof(struct pt_regs, r21)}, + { "r22", 8, offsetof(struct pt_regs, r22)}, + { "r23", 8, offsetof(struct pt_regs, r23)}, + { "r24", 8, offsetof(struct pt_regs, r24)}, + { "r25", 8, offsetof(struct pt_regs, r25)}, + { "r26", 8, offsetof(struct pt_regs, r26)}, + { "r27", 8, offsetof(struct pt_regs, r27)}, + { "at", 8, offsetof(struct pt_regs, r28)}, + { "gp", 8, offsetof(struct pt_regs, gp)}, + { "sp", 8, -1 }, + { "zero", 8, -1 }, + + { "f0", 8, -1 }, + { "f1", 8, -1 }, + { "f2", 8, -1 }, + { "f3", 8, -1 }, + { "f4", 8, -1 }, + { "f5", 8, -1 }, + { "f6", 8, -1 }, + { "f7", 8, -1 }, + { "f8", 8, -1 }, + { "f9", 8, -1 }, + { "f10", 8, -1 }, + { "f11", 8, -1 }, + { "f12", 8, -1 }, + { "f13", 8, -1 }, + { "f14", 8, -1 }, + { "f15", 8, -1 }, + { "f16", 8, -1 }, + { "f17", 8, -1 }, + { "f18", 8, -1 }, + { "f19", 8, -1 }, + { "f20", 8, -1 }, + { "f21", 8, -1 }, + { "f22", 8, -1 }, + { "f23", 8, -1 }, + { "f24", 8, -1 }, + { "f25", 8, -1 }, + { "f26", 8, -1 }, + { "f27", 8, -1 }, + { "f28", 8, -1 }, + { "f29", 8, -1 }, + { "f30", 8, -1 }, + { "fpcr", 8, -1 }, + + { "pc", 8, offsetof(struct pt_regs, pc)}, + { "", 8, -1 }, + { "unique", 8, -1}, +}; + +char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs) +{ + if (regno >= DBG_MAX_REG_NUM || regno < 0) + return NULL; + + if (dbg_reg_def[regno].offset != -1) + memcpy(mem, (void *)regs + dbg_reg_def[regno].offset, + dbg_reg_def[regno].size); + else + memset(mem, 0, dbg_reg_def[regno].size); + return dbg_reg_def[regno].name; +} + +int dbg_set_reg(int regno, void *mem, struct pt_regs *regs) +{ + if (regno >= DBG_MAX_REG_NUM || regno < 0) + return -EINVAL; + + if (dbg_reg_def[regno].offset != -1) + memcpy((void *)regs + dbg_reg_def[regno].offset, mem, + dbg_reg_def[regno].size); + return 0; +} + +void +sleeping_thread_to_gdb_regs(unsigned long *gdb_regs, struct task_struct *task) +{ + int i; + /* Initialize to zero */ + memset((char *)gdb_regs, 0, NUMREGBYTES); + for (i = 0; i < DBG_MAX_REG_NUM; i++) + gdb_regs[i] = get_reg(task, i); +} + +void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc) +{ + pr_info("BEFORE SET PC WITH %lx\n", pc); + instruction_pointer(regs) = pc; + pr_info("AFTER SET PC IS %lx\n", instruction_pointer(regs)); +} + +static void kgdb_call_nmi_hook(void *ignored) +{ + kgdb_nmicallback(raw_smp_processor_id(), NULL); +} + +void kgdb_roundup_cpus(unsigned long flags) +{ + local_irq_enable(); + smp_call_function(kgdb_call_nmi_hook, NULL, 0); + local_irq_disable(); +} + +int kgdb_arch_handle_exception(int exception_vector, int signo, + int err_code, char 
*remcom_in_buffer,
+			       char *remcom_out_buffer,
+			       struct pt_regs *linux_regs)
+{
+	char *ptr;
+	unsigned long address = -1;
+
+	switch (remcom_in_buffer[0]) {
+	case 'c':
+		ptr = &remcom_in_buffer[1];
+		if (kgdb_hex2long(&ptr, &address))
+			kgdb_arch_set_pc(linux_regs, address);
+		return 0;
+	}
+	return -1;
+}
+
+static int __kgdb_notify(struct die_args *args, unsigned long cmd)
+{
+	struct pt_regs *regs = args->regs;
+
+	/* Userspace events, ignore. */
+	if (user_mode(regs))
+		return NOTIFY_DONE;
+
+	if (kgdb_handle_exception(1, args->signr, cmd, regs))
+		return NOTIFY_DONE;
+
+	return NOTIFY_STOP;
+}
+
+static int
+kgdb_notify(struct notifier_block *self, unsigned long cmd, void *ptr)
+{
+	unsigned long flags;
+	int ret;
+
+	local_irq_save(flags);
+	ret = __kgdb_notify(ptr, cmd);
+	local_irq_restore(flags);
+
+	return ret;
+}
+
+static struct notifier_block kgdb_notifier = {
+	.notifier_call	= kgdb_notify,
+};
+
+/*
+ * kgdb_arch_init - Perform any architecture specific initialization.
+ * This function will handle the initialization of any architecture
+ * specific callbacks.
+ */
+int kgdb_arch_init(void)
+{
+	int ret = register_die_notifier(&kgdb_notifier);
+
+	if (ret != 0)
+		return ret;
+	return 0;
+}
+
+/*
+ * kgdb_arch_exit - Perform any architecture specific uninitialization.
+ * This function will handle the uninitialization of any architecture
+ * specific callbacks, for dynamic registration and unregistration.
+ */
+void kgdb_arch_exit(void)
+{
+	unregister_die_notifier(&kgdb_notifier);
+}
+
+/*
+ * sw64 instructions are always little-endian.
+ * The break instruction is encoded in LE format.
+ */
+struct kgdb_arch arch_kgdb_ops = {
+	.gdb_bpt_instr = {0x80, 0x00, 0x00, 0x00}
+};
diff --git a/arch/sw_64/kernel/kprobes/Makefile b/arch/sw_64/kernel/kprobes/Makefile
new file mode 100644
index 000000000000..b3b1d849a63a
--- /dev/null
+++ b/arch/sw_64/kernel/kprobes/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_KPROBES)		+= kprobes.o decode-insn.o
+obj-$(CONFIG_KPROBES_ON_FTRACE)	+= ftrace.o
diff --git a/arch/sw_64/kernel/kprobes/common.h b/arch/sw_64/kernel/kprobes/common.h
new file mode 100644
index 000000000000..de10058f0376
--- /dev/null
+++ b/arch/sw_64/kernel/kprobes/common.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SW64_KERNEL_KPROBES_COMMON_H
+#define _SW64_KERNEL_KPROBES_COMMON_H
+
+
+extern bool sw64_insn_can_kprobe(kprobe_opcode_t *addr);
+
+
+#endif /* _SW64_KERNEL_KPROBES_COMMON_H */
diff --git a/arch/sw_64/kernel/kprobes/decode-insn.c b/arch/sw_64/kernel/kprobes/decode-insn.c
new file mode 100644
index 000000000000..e3ab856d6084
--- /dev/null
+++ b/arch/sw_64/kernel/kprobes/decode-insn.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Based on arch/arm64/kernel/probes/decode-insn.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/kprobes.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <asm/insn.h>
+#include <asm/sections.h>
+#include "common.h"
+
+static bool __kprobes sw64_insn_is_steppable(u32 insn)
+{
+	/*
+	 * Branch instructions will write a new value into the PC which is
+	 * likely to be relative to the XOL address and therefore invalid.
+	 * Deliberate generation of an exception during stepping is also not
+	 * currently safe, and privileged or ordering-sensitive instructions
+	 * (syscalls, memory barriers, load-locked) can do any number of
+	 * nasty things we cannot handle during single-stepping.
+	 */
+	if (sw64_insn_is_sys_call_b(insn) ||
+	    sw64_insn_is_sys_call(insn) ||
+	    sw64_insn_is_call(insn) ||
+	    sw64_insn_is_ret(insn) ||
+	    sw64_insn_is_jmp(insn) ||
+	    sw64_insn_is_br(insn) ||
+	    sw64_insn_is_bsr(insn) ||
+	    sw64_insn_is_memb(insn) ||
+	    sw64_insn_is_imemb(insn) ||
+	    sw64_insn_is_rtc(insn) ||
+	    sw64_insn_is_lldl(insn) ||
+	    sw64_insn_is_lldw(insn) ||
+	    sw64_insn_is_beq(insn) ||
+	    sw64_insn_is_bne(insn) ||
+	    sw64_insn_is_blt(insn) ||
+	    sw64_insn_is_ble(insn) ||
+	    sw64_insn_is_bgt(insn) ||
+	    sw64_insn_is_bge(insn) ||
+	    sw64_insn_is_blbc(insn) ||
+	    sw64_insn_is_blbs(insn) ||
+	    sw64_insn_is_fbeq(insn) ||
+	    sw64_insn_is_fbne(insn) ||
+	    sw64_insn_is_fblt(insn) ||
+	    sw64_insn_is_fble(insn) ||
+	    sw64_insn_is_fbgt(insn) ||
+	    sw64_insn_is_fbge(insn))
+		return false;
+
+	return true;
+}
+
+
+#ifdef CONFIG_KPROBES
+/*
+ * Reject probes that would land inside an lldl/lldw ... rd_f
+ * load-locked sequence: count the unmatched lldl/lldw instructions
+ * between the start of the function and the probe point.
+ */
+static bool __kprobes is_probed_between_atomic(kprobe_opcode_t *addr)
+{
+	int count = 0;
+	unsigned long size = 0, offset = 0;
+	kprobe_opcode_t *scan_start = NULL;
+
+	if (kallsyms_lookup_size_offset((unsigned long)addr, &size, &offset))
+		scan_start = addr - (offset / sizeof(kprobe_opcode_t));
+
+	while (scan_start < addr) {
+		if (sw64_insn_is_lldl(le32_to_cpu(*scan_start)) ||
+		    sw64_insn_is_lldw(le32_to_cpu(*scan_start)))
+			count++;
+		if (sw64_insn_is_rd_f(le32_to_cpu(*scan_start)))
+			count--;
+		scan_start++;
+	}
+	if (count)
+		return false;
+
+	return true;
+}
+
+bool __kprobes sw64_insn_can_kprobe(kprobe_opcode_t *addr)
+{
+	u32 insn = le32_to_cpu(*addr);
+
+	if (!sw64_insn_is_steppable(insn)) {
+		pr_warn("addr is not steppable, can't probe\n");
+		return false;
+	}
+	if (!is_probed_between_atomic(addr)) {
+		pr_warn("addr is inside an atomic sequence, can't probe\n");
+		return false;
+	}
+	return true;
+}
+#endif
diff --git a/arch/sw_64/kernel/kprobes/kprobes.c b/arch/sw_64/kernel/kprobes/kprobes.c
new file mode 100644
index 000000000000..85400f96f991
--- /dev/null
+++ b/arch/sw_64/kernel/kprobes/kprobes.c
@@ -0,0 +1,316 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Kernel Probes (KProbes)
+ * arch/sw_64/kernel/kprobes.c
+ */
+
+#include <linux/kprobes.h>
+#include <linux/preempt.h>
+#include <linux/uaccess.h>
+#include <linux/kdebug.h>
+#include <linux/slab.h>
+
+#include <asm/ptrace.h>
+#include <asm/insn.h>
+#include "common.h"
+
+static u32 breakpoint_insn = BREAK_KPROBE;
+static u32 breakpoint2_insn = BREAK_KPROBE_SS;
+
+int post_kprobe_handler(struct pt_regs *regs);
+
+DEFINE_PER_CPU(struct kprobe *, current_kprobe);
+DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
+
+int __kprobes arch_prepare_kprobe(struct kprobe *p)
+{
+	int ret = 0;
+	extern char __start_rodata[];
+	extern char __end_rodata[];
+	unsigned long probe_addr = (unsigned long)p->addr;
+
+	if (probe_addr & 0x3)
+		return -EINVAL;
+
+	if (!sw64_insn_can_kprobe(p->addr))
+		return -EINVAL;
+	/* copy instruction */
+	p->opcode = le32_to_cpu(*p->addr);
+
+	if (probe_addr >= (unsigned long) __start_rodata &&
+			probe_addr <= (unsigned long)
__end_rodata)
+		return -EINVAL;
+
+
+	/* insn: must be on special executable page on sw64. */
+	p->ainsn.insn = get_insn_slot();
+	if (!p->ainsn.insn) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	/*
+	 * In the kprobe->ainsn.insn[] array we store the original
+	 * instruction at index zero and a break trap instruction at
+	 * index one.
+	 */
+	p->ainsn.insn[0] = p->opcode;
+	p->ainsn.insn[1] = breakpoint2_insn;
+out:
+	return ret;
+}
+
+void __kprobes arch_arm_kprobe(struct kprobe *p)
+{
+	sw64_insn_write(p->addr, breakpoint_insn);
+	flush_insn_slot(p);
+}
+
+void __kprobes arch_disarm_kprobe(struct kprobe *p)
+{
+	sw64_insn_write(p->addr, p->opcode);
+	flush_insn_slot(p);
+}
+
+void __kprobes arch_remove_kprobe(struct kprobe *p)
+{
+	if (p->ainsn.insn) {
+		free_insn_slot(p->ainsn.insn, 0);
+		p->ainsn.insn = NULL;
+	}
+}
+
+static void save_previous_kprobe(struct kprobe_ctlblk *kcb)
+{
+	kcb->prev_kprobe.kp = kprobe_running();
+	kcb->prev_kprobe.status = kcb->kprobe_status;
+}
+
+static void restore_previous_kprobe(struct kprobe_ctlblk *kcb)
+{
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
+	kcb->kprobe_status = kcb->prev_kprobe.status;
+}
+
+static void __kprobes set_current_kprobe(struct kprobe *p)
+{
+	__this_cpu_write(current_kprobe, p);
+}
+
+
+static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
+				       struct kprobe_ctlblk *kcb, int reenter)
+{
+	if (reenter) {
+		save_previous_kprobe(kcb);
+		set_current_kprobe(p);
+		kcb->kprobe_status = KPROBE_REENTER;
+	} else {
+		kcb->kprobe_status = KPROBE_HIT_SS;
+	}
+
+	/* insn simulation */
+	kcb->target_pc = regs->pc;
+	regs->pc = (unsigned long)&p->ainsn.insn[0];
+}
+
+static int __kprobes reenter_kprobe(struct kprobe *p,
+				    struct pt_regs *regs,
+				    struct kprobe_ctlblk *kcb)
+{
+	switch (kcb->kprobe_status) {
+	case KPROBE_HIT_SSDONE:
+	case KPROBE_HIT_ACTIVE:
+		kprobes_inc_nmissed_count(p);
+		setup_singlestep(p, regs, kcb, 1);
+		break;
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+		pr_warn("Unrecoverable kprobe detected.\n");
+		dump_kprobe(p);
+		BUG();
+		break;
+	default:
+		WARN_ON(1);
+		return 0;
+	}
+	return 1;
+}
+
+int __kprobes kprobe_handler(struct pt_regs *regs)
+{
+	struct kprobe *p;
+	struct kprobe_ctlblk *kcb;
+	unsigned long addr = instruction_pointer(regs);
+
+	if (user_mode(regs))
+		return 0;
+	/*
+	 * We don't want to be preempted for the entire
+	 * duration of kprobe processing
+	 */
+	preempt_disable();
+	kcb = get_kprobe_ctlblk();
+	p = get_kprobe((kprobe_opcode_t *)(addr - 4));
+
+	if (p) {
+		if (kprobe_running()) {
+			if (reenter_kprobe(p, regs, kcb))
+				return 1;
+		} else {
+			set_current_kprobe(p);
+			kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+
+			/*
+			 * If we have no pre-handler or it returned 0, we
+			 * continue with normal processing. If we have a
+			 * pre-handler and it returned non-zero, that means
+			 * the user handler set up registers to exit to another
+			 * instruction, so we must skip the single stepping.
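			 * For reference, a client pre-handler looks like the
			 * following minimal, illustrative sketch (handler name
			 * and probed symbol are hypothetical, and the block is
			 * compiled out):
			 */
+#if 0	/* example only, not built */
+static int example_pre(struct kprobe *kp, struct pt_regs *regs)
+{
+	/* Returning 0 here lets kprobe_handler() go on to single-step. */
+	pr_info("kprobe hit at pc=%#lx\n", regs->pc);
+	return 0;
+}
+
+static struct kprobe example_probe = {
+	.symbol_name	= "kernel_clone",	/* hypothetical target */
+	.pre_handler	= example_pre,
+};
+/* register_kprobe(&example_probe) would arm it. */
+#endif
+			/*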
+ */ + if (!p->pre_handler || !p->pre_handler(p, regs)) + setup_singlestep(p, regs, kcb, 0); + else + reset_current_kprobe(); + return 1; + } + } + return 0; + +} +int __kprobes post_kprobe_handler(struct pt_regs *regs) +{ + struct kprobe *cur = kprobe_running(); + struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); + + if (!cur) + return 0; + + if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) { + kcb->kprobe_status = KPROBE_HIT_SSDONE; + cur->post_handler(cur, regs, 0); + } + + // resume_execution(cur, regs, kcb); + regs->pc = kcb->target_pc; + + + /* Restore back the original saved kprobes variables and continue. */ + if (kcb->kprobe_status == KPROBE_REENTER) { + restore_previous_kprobe(kcb); + goto out; + } + reset_current_kprobe(); +out: + preempt_enable_no_resched(); + + return 1; +} + +int __kprobes kprobe_fault_handler(struct pt_regs *regs, unsigned long mmcsr) +{ + struct kprobe *cur = kprobe_running(); + struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); + + if (cur->fault_handler && cur->fault_handler(cur, regs, mmcsr)) + return 1; + + if (kcb->kprobe_status & KPROBE_HIT_SS) { + regs->pc = kcb->target_pc; + + reset_current_kprobe(); + preempt_enable_no_resched(); + } + return 0; +} + +/* + * Wrapper routine for handling exceptions. + */ +int __kprobes kprobe_exceptions_notify(struct notifier_block *self, + unsigned long val, void *data) +{ + + struct die_args *args = (struct die_args *)data; + int ret = NOTIFY_DONE; + + switch (val) { + case DIE_BREAK: + if (kprobe_handler(args->regs)) + ret = NOTIFY_STOP; + break; + case DIE_SSTEPBP: + if (post_kprobe_handler(args->regs)) + ret = NOTIFY_STOP; + break; + default: + break; + } + return ret; +} +/* + * Function return probe trampoline: + * - init_kprobes() establishes a probepoint here + * - When the probed function returns, this probe causes the + * handlers to fire + */ +static void __used kretprobe_trampoline_holder(void) +{ + asm volatile( + /* Keep the assembler from reordering and placing JR here. 
*/ + ".set noreorder\n\t" + "nop\n\t" + ".global kretprobe_trampoline\n" + "kretprobe_trampoline:\n\t" + "nop\n\t" + : : : "memory"); +} + +void kretprobe_trampoline(void); + +void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + ri->ret_addr = (kprobe_opcode_t *) regs->r26; + ri->fp = NULL; + + /* Replace the return addr with trampoline addr */ + regs->r26 = (unsigned long)kretprobe_trampoline; +} + +/* + * Called when the probe at kretprobe trampoline is hit + */ +static int __kprobes trampoline_probe_handler(struct kprobe *p, + struct pt_regs *regs) +{ + unsigned long orig_ret_address; + + orig_ret_address = __kretprobe_trampoline_handler(regs, kretprobe_trampoline, NULL); + instruction_pointer(regs) = orig_ret_address; + + /* + * By returning a non-zero value, we are telling + * kprobe_handler() that we don't want the post_handler + * to run (and have re-enabled preemption) + */ + return 1; +} + +int __kprobes arch_trampoline_kprobe(struct kprobe *p) +{ + if (p->addr == (kprobe_opcode_t *)kretprobe_trampoline) + return 1; + + return 0; +} + +static struct kprobe trampoline_p = { + .addr = (kprobe_opcode_t *)kretprobe_trampoline, + .pre_handler = trampoline_probe_handler +}; + +int __init arch_init_kprobes(void) +{ + return register_kprobe(&trampoline_p); +} diff --git a/arch/sw_64/kernel/kvm_cma.c b/arch/sw_64/kernel/kvm_cma.c new file mode 100644 index 000000000000..dc61e2e369e8 --- /dev/null +++ b/arch/sw_64/kernel/kvm_cma.c @@ -0,0 +1,273 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Contiguous Memory Allocator for KVM + * + * This program is modified on the basis of CMA, to achieve cross-node + * memory reservation, as well as reserved memory information statistics. + */ + +#define pr_fmt(fmt) "kvm_cma: " fmt + +#include <linux/memblock.h> +#include <linux/err.h> +#include <linux/mm.h> +#include <linux/sizes.h> +#include <linux/slab.h> +#include <linux/log2.h> +#include <linux/highmem.h> +#include <linux/io.h> +#include <linux/cma.h> +#include <linux/page-isolation.h> + +#include "../../../mm/cma.h" +#include "../../../mm/internal.h" + +struct cma kvm_cma_areas[MAX_CMA_AREAS]; +unsigned int kvm_cma_area_count; + +static void __init init_kvm_cma_reserved_pageblock(struct page *page) +{ + unsigned int i = pageblock_nr_pages; + struct page *p = page; + + do { + __ClearPageReserved(p); + set_page_count(p, 0); + } while (++p, --i); + + set_pageblock_migratetype(page, MIGRATE_ISOLATE); + + if (pageblock_order >= MAX_ORDER) { + i = pageblock_nr_pages; + p = page; + do { + set_page_refcounted(p); + __free_pages(p, MAX_ORDER - 1); + p += MAX_ORDER_NR_PAGES; + } while (i -= MAX_ORDER_NR_PAGES); + } else { + set_page_refcounted(page); + __free_pages(page, pageblock_order); + } + + adjust_managed_page_count(page, pageblock_nr_pages); +} + +static int __init kvm_cma_activate_area(struct cma *cma) +{ + int bitmap_size = BITS_TO_LONGS(cma_bitmap_maxno(cma)) * sizeof(long); + unsigned long base_pfn = cma->base_pfn, pfn = base_pfn; + unsigned int i = cma->count >> pageblock_order; + + cma->bitmap = kzalloc(bitmap_size, GFP_KERNEL); + + if (!cma->bitmap) { + cma->count = 0; + return -ENOMEM; + } + + WARN_ON_ONCE(!pfn_valid(pfn)); + + do { + unsigned int j; + + base_pfn = pfn; + + for (j = pageblock_nr_pages; j; --j, pfn++) + WARN_ON_ONCE(!pfn_valid(pfn)); + + init_kvm_cma_reserved_pageblock(pfn_to_page(base_pfn)); + } while (--i); + + spin_lock_init(&cma->lock); + + return 0; +} + +static int __init kvm_cma_init_reserved_areas(void) +{ + int i; + + for 
(i = 0; i < kvm_cma_area_count; i++) {
+		int ret = kvm_cma_activate_area(&kvm_cma_areas[i]);
+
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+core_initcall(kvm_cma_init_reserved_areas);
+
+/**
+ * kvm_cma_init_reserved_mem() - create custom contiguous area
+ * from reserved memory
+ * @base: Base address of the reserved area
+ * @size: Size of the reserved area (in bytes).
+ * @order_per_bit: Order of pages represented by one bit on bitmap.
+ * @name: The name of the area. If this parameter is NULL, the name of
+ *        the area will be set to "cmaN", where N is a running counter of
+ *        used areas.
+ * @res_cma: Pointer to store the created cma region.
+ *
+ * This function creates custom contiguous area from already reserved memory.
+ */
+int __init kvm_cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
+				     unsigned int order_per_bit, const char *name,
+				     struct cma **res_cma)
+{
+	struct cma *cma;
+	phys_addr_t alignment;
+
+	/* Sanity checks */
+	if (kvm_cma_area_count == ARRAY_SIZE(kvm_cma_areas)) {
+		pr_err("Not enough slots for CMA reserved regions!\n");
+		return -ENOSPC;
+	}
+
+	if (!size || !memblock_is_region_reserved(base, size))
+		return -EINVAL;
+
+	/* ensure minimal alignment required by mm core */
+	alignment = PAGE_SIZE <<
+		max_t(unsigned long, MAX_ORDER - 1, pageblock_order);
+
+	/* alignment should be aligned with order_per_bit */
+	if (!IS_ALIGNED(alignment >> PAGE_SHIFT, 1 << order_per_bit))
+		return -EINVAL;
+
+	if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
+		return -EINVAL;
+
+	/*
+	 * Each reserved area must be initialised later, when more kernel
+	 * subsystems (like slab allocator) are available.
+	 */
+	cma = &kvm_cma_areas[kvm_cma_area_count];
+
+	if (name)
+		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
+	else
+		snprintf(cma->name, CMA_MAX_NAME, "cma%d", kvm_cma_area_count);
+
+	cma->base_pfn = PFN_DOWN(base);
+	cma->count = size >> PAGE_SHIFT;
+	cma->order_per_bit = order_per_bit;
+	*res_cma = cma;
+	kvm_cma_area_count++;
+	totalcma_pages += (size / PAGE_SIZE);
+
+	return 0;
+}
+
+/**
+ * kvm_cma_declare_contiguous() - reserve contiguous area for VM
+ * @base: Base address of the reserved area (optional).
+ * @size: Size of the reserved area (in bytes).
+ * @limit: End address of the reserved memory (optional, 0 for any).
+ * @alignment: Alignment for the CMA area, should be power of 2 or zero
+ * @order_per_bit: Order of pages represented by one bit on bitmap.
+ * @name: The name of the area. See function cma_init_reserved_mem()
+ * @res_cma: Pointer to store the created cma region.
+ *
+ * This function reserves memory from early allocator. It should be
+ * called by arch specific code once the early allocator (memblock or bootmem)
+ * has been activated and all other subsystems have already allocated/reserved
+ * memory. This function allows creation of custom reserved areas.
+ */
+int __init kvm_cma_declare_contiguous(phys_addr_t base,
+				      phys_addr_t size, phys_addr_t limit,
+				      phys_addr_t alignment, unsigned int order_per_bit,
+				      const char *name, struct cma **res_cma)
+{
+	phys_addr_t memblock_end = memblock_end_of_DRAM();
+	phys_addr_t highmem_start;
+	int ret = 0;
+
+	/*
+	 * We can't use __pa(high_memory) directly, since high_memory
+	 * isn't a valid direct map VA, and DEBUG_VIRTUAL will (validly)
+	 * complain. Find the boundary by adding one to the last valid
+	 * address.
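	 *
	 * For context, an arch-level caller would use this helper roughly
	 * as in the following sketch (names and sizes are illustrative
	 * only; "base" stands for a previously chosen, suitably aligned
	 * physical address):
	 *
	 *	static struct cma *kvm_cma;
	 *
	 *	// from setup_arch(), once memblock is usable:
	 *	kvm_cma_declare_contiguous(base, SZ_2G, 0, 0, 0,
	 *				   "kvm_cma", &kvm_cma);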
+	 */
+	highmem_start = __pa(high_memory - 1) + 1;
+
+	if (!size)
+		return -EINVAL;
+
+	if (alignment && !is_power_of_2(alignment))
+		return -EINVAL;
+
+	/*
+	 * Sanitise input arguments.
+	 * Pages at both ends of the CMA area could be merged into adjacent
+	 * unmovable migratetype pages by the page allocator's buddy
+	 * algorithm. In that case you couldn't get contiguous memory,
+	 * which is not what we want.
+	 */
+	alignment = max(alignment, (phys_addr_t)PAGE_SIZE <<
+			max_t(unsigned long, MAX_ORDER - 1, pageblock_order));
+	if (base & (alignment - 1)) {
+		ret = -EINVAL;
+		pr_err("Region at %pa must be aligned to %pa bytes\n",
+		       &base, &alignment);
+		goto err;
+	}
+	base = ALIGN(base, alignment);
+	size = ALIGN(size, alignment);
+	limit &= ~(alignment - 1);
+
+	if (!base) {
+		ret = -EINVAL;
+		pr_err("Base address of region must be specified\n");
+		goto err;
+	}
+
+	/* size should be aligned with order_per_bit */
+	if (!IS_ALIGNED(size >> PAGE_SHIFT, 1 << order_per_bit))
+		return -EINVAL;
+
+	/*
+	 * The request region must not cross the low/high memory boundary.
+	 */
+	if (base < highmem_start && base + size > highmem_start) {
+		ret = -EINVAL;
+		pr_err("Region at %pa defined on low/high memory boundary (%pa)\n",
+		       &base, &highmem_start);
+		goto err;
+	}
+
+	/*
+	 * If the limit is unspecified or above the memblock end, its effective
+	 * value will be the memblock end. Set it explicitly to simplify further
+	 * checks.
+	 */
+	if (limit == 0 || limit > memblock_end)
+		limit = memblock_end;
+
+	if (base + size > limit) {
+		ret = -EINVAL;
+		pr_err("Size (%pa) of region at %pa exceeds limit (%pa)\n",
+		       &size, &base, &limit);
+		goto err;
+	}
+
+	/* Reserve memory */
+	if (memblock_is_region_reserved(base, size) ||
+	    memblock_reserve(base, size) < 0) {
+		ret = -EBUSY;
+		goto err;
+	}
+	ret = kvm_cma_init_reserved_mem(base, size, order_per_bit,
+					name, res_cma);
+	if (ret)
+		goto free_mem;
+
+	pr_info("Reserved %ld MiB at %pa\n", (unsigned long)size / SZ_1M,
+		&base);
+	return 0;
+
+free_mem:
+	memblock_free(base, size);
+err:
+	pr_err("Failed to reserve %ld MiB\n", (unsigned long)size / SZ_1M);
+	return ret;
+}
diff --git a/arch/sw_64/kernel/machine_kexec.c b/arch/sw_64/kernel/machine_kexec.c
new file mode 100644
index 000000000000..c778bc1374af
--- /dev/null
+++ b/arch/sw_64/kernel/machine_kexec.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * machine_kexec.c for kexec
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2. See the file COPYING for more details.
+ */
+#include <linux/compiler.h>
+#include <linux/kexec.h>
+#include <linux/mm.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/irq.h>
+#include <linux/reboot.h>
+#include <asm/cacheflush.h>
+#include <asm/page.h>
+#include <asm/io.h>
+#include <linux/cpu.h>
+#include <linux/smp.h>
+
+extern void *kexec_control_page;
+extern const unsigned char relocate_new_kernel[];
+extern const size_t relocate_new_kernel_size;
+
+extern unsigned long kexec_start_address;
+extern unsigned long kexec_indirection_page;
+
+static atomic_t waiting_for_crash_ipi;
+
+#ifdef CONFIG_SMP
+extern struct smp_rcb_struct *smp_rcb;
+
+/*
+ * Wait until the relocation code is prepared, then send
+ * secondary CPUs to spin until the kernel has been relocated.
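 *
 * The handshake with machine_shutdown() is, in sketch form:
 *
 *	boot CPU                            secondary CPUs
 *	--------                            --------------
 *	WRITE_ONCE(smp_rcb->ready, 0)
 *	smp_call_function(kexec_smp_down)   local_irq_disable()
 *	wait until num_online_cpus() == 1   spin while smp_rcb->ready != 0
 *	                                    set_cpu_online(cpu, false)
 *	                                    reset_cpu(cpu)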
+ */ +static void kexec_smp_down(void *ignored) +{ + int cpu = smp_processor_id(); + + local_irq_disable(); + while (READ_ONCE(smp_rcb->ready) != 0) + mdelay(1); + set_cpu_online(cpu, false); + reset_cpu(cpu); +} +#endif + +int machine_kexec_prepare(struct kimage *kimage) +{ + return 0; +} + +void machine_kexec_cleanup(struct kimage *kimage) +{ +} + +void machine_shutdown(void) +{ +#ifdef CONFIG_SMP + WRITE_ONCE(smp_rcb->ready, 0); + smp_call_function(kexec_smp_down, NULL, 0); + smp_wmb(); + while (num_online_cpus() > 1) { + cpu_relax(); + mdelay(1); + } +#endif +} + +#ifdef CONFIG_SMP +static void machine_crash_nonpanic_core(void *unused) +{ + int cpu; + struct pt_regs regs; + + cpu = smp_processor_id(); + + local_irq_disable(); + crash_setup_regs(®s, NULL); + pr_debug("CPU %u will stop doing anything useful since another CPU has crashed\n", cpu); + crash_save_cpu(®s, cpu); + flush_cache_all(); + + set_cpu_online(cpu, false); + atomic_dec(&waiting_for_crash_ipi); + while (READ_ONCE(smp_rcb->ready) != 0) + mdelay(1); + if (cpu != 0) + reset_cpu(cpu); + else + machine_kexec(kexec_crash_image); +} +#else +static inline void machine_crash_nonpanic_core(void *unused) { } +#endif + +static void machine_kexec_mask_interrupts(void) +{ + unsigned int i; + struct irq_desc *desc; + + for_each_irq_desc(i, desc) { + struct irq_chip *chip; + + chip = irq_desc_get_chip(desc); + if (!chip) + continue; + + if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data)) + chip->irq_eoi(&desc->irq_data); + + if (chip->irq_mask) + chip->irq_mask(&desc->irq_data); + + if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data)) + chip->irq_disable(&desc->irq_data); + } +} + +void machine_crash_shutdown(struct pt_regs *regs) +{ + int cpu; + unsigned long msecs; + + cpu = smp_processor_id(); + local_irq_disable(); + kernel_restart_prepare(NULL); + atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); + smp_call_function(machine_crash_nonpanic_core, NULL, false); + msecs = 1000; /* Wait at most a second for the other cpus to stop */ + while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { + mdelay(1); + msecs--; + } + if (atomic_read(&waiting_for_crash_ipi) > 0) + pr_warn("Non-crashing CPUs did not react to IPI\n"); + + crash_save_cpu(regs, cpu); + machine_kexec_mask_interrupts(); + pr_info("Loading crashdump kernel...\n"); +#ifdef CONFIG_SMP + WRITE_ONCE(smp_rcb->ready, 0); + if (cpu != 0) + reset_cpu(cpu); +#endif +} + +#define phys_to_ktext(pa) (__START_KERNEL_map + (pa)) + +typedef void (*noretfun_t)(void) __noreturn; + +void machine_kexec(struct kimage *image) +{ + void *reboot_code_buffer; + unsigned long entry; + unsigned long *ptr; + struct boot_params *params = sunway_boot_params; + + + reboot_code_buffer = kexec_control_page; + pr_info("reboot_code_buffer = %px\n", reboot_code_buffer); + kexec_start_address = phys_to_ktext(image->start); + pr_info("kexec_start_address = %#lx\n", kexec_start_address); + if (image->type == KEXEC_TYPE_DEFAULT) + kexec_indirection_page = + (unsigned long) phys_to_virt(image->head & PAGE_MASK); + else + kexec_indirection_page = (unsigned long)&image->head; + + pr_info("kexec_indirection_page = %#lx, image->head=%#lx\n", + kexec_indirection_page, image->head); + + params->cmdline = kexec_start_address - COMMAND_LINE_OFF; + params->initrd_start = *(__u64 *)(kexec_start_address - INITRD_START_OFF); + params->initrd_size = *(__u64 *)(kexec_start_address - INITRD_SIZE_OFF); + + pr_info("initrd_start = %#llx, initrd_size = %#llx\n" + "dtb_start = %#llx, efi_systab = 
%#llx\n" + "efi_memmap = %#llx, efi_memmap_size = %#llx\n" + "efi_memdesc_size = %#llx, efi_memdesc_version = %#llx\n" + "cmdline = %#llx\n", + params->initrd_start, params->initrd_size, + params->dtb_start, params->efi_systab, + params->efi_memmap, params->efi_memmap_size, + params->efi_memdesc_size, params->efi_memdesc_version, + params->cmdline); + + memcpy(reboot_code_buffer, relocate_new_kernel, relocate_new_kernel_size); + + /* + * The generic kexec code builds a page list with physical + * addresses. they are directly accessible through KSEG0 (or + * CKSEG0 or XPHYS if on 64bit system), hence the + * phys_to_virt() call. + */ + for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); + ptr = (entry & IND_INDIRECTION) ? + phys_to_virt(entry & PAGE_MASK) : ptr + 1) { + if (*ptr & IND_SOURCE || *ptr & IND_INDIRECTION || + *ptr & IND_DESTINATION) + *ptr = (unsigned long) phys_to_virt(*ptr); + } + + /* + * we do not want to be bothered. + */ + local_irq_disable(); + + pr_info("Will call new kernel at %08lx\n", image->start); + pr_info("Bye ...\n"); + //flush_cache_all(); + //sflush(); + //tbia(); + smp_wmb(); + ((noretfun_t) reboot_code_buffer)(); +} diff --git a/arch/sw_64/kernel/module.c b/arch/sw_64/kernel/module.c new file mode 100644 index 000000000000..c75d8a2e4309 --- /dev/null +++ b/arch/sw_64/kernel/module.c @@ -0,0 +1,291 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/moduleloader.h> +#include <linux/elf.h> +#include <linux/vmalloc.h> +#include <linux/fs.h> +#include <linux/string.h> +#include <linux/kernel.h> +#include <linux/slab.h> + +#if 0 +#define DEBUGP printk +#else +#define DEBUGP(fmt...) +#endif + +/* Allocate the GOT at the end of the core sections. */ + +struct got_entry { + struct got_entry *next; + Elf64_Sxword r_addend; + int got_offset; +}; + +static inline void +process_reloc_for_got(Elf64_Rela *rela, + struct got_entry *chains, Elf64_Xword *poffset) +{ + unsigned long r_sym = ELF64_R_SYM(rela->r_info); + unsigned long r_type = ELF64_R_TYPE(rela->r_info); + Elf64_Sxword r_addend = rela->r_addend; + struct got_entry *g; + + if (r_type != R_SW64_LITERAL) + return; + + for (g = chains + r_sym; g ; g = g->next) + if (g->r_addend == r_addend) { + if (g->got_offset == 0) { + g->got_offset = *poffset; + *poffset += 8; + } + goto found_entry; + } + + g = kmalloc(sizeof(*g), GFP_KERNEL); + g->next = chains[r_sym].next; + g->r_addend = r_addend; + g->got_offset = *poffset; + *poffset += 8; + chains[r_sym].next = g; + + found_entry: + /* + * Trick: most of the ELF64_R_TYPE field is unused. There are + * 42 valid relocation types, and a 32-bit field. Co-opt the + * bits above 256 to store the got offset for this reloc. + */ + rela->r_info |= g->got_offset << 8; +} + +int +module_frob_arch_sections(Elf64_Ehdr *hdr, Elf64_Shdr *sechdrs, + char *secstrings, struct module *me) +{ + struct got_entry *chains; + Elf64_Rela *rela; + Elf64_Shdr *esechdrs, *symtab, *s, *got; + unsigned long nsyms, nrela, i; + + esechdrs = sechdrs + hdr->e_shnum; + symtab = got = NULL; + + /* Find out how large the symbol table is. Allocate one got_entry + * head per symbol. Normally this will be enough, but not always. + * We'll chain different offsets for the symbol down each head. 
+ */ + for (s = sechdrs; s < esechdrs; ++s) + if (s->sh_type == SHT_SYMTAB) + symtab = s; + else if (!strcmp(".got", secstrings + s->sh_name)) { + got = s; + me->arch.gotsecindex = s - sechdrs; + } + + if (!symtab) { + pr_err("module %s: no symbol table\n", me->name); + return -ENOEXEC; + } + if (!got) { + pr_err("module %s: no got section\n", me->name); + return -ENOEXEC; + } + + nsyms = symtab->sh_size / sizeof(Elf64_Sym); + chains = kcalloc(nsyms, sizeof(struct got_entry), GFP_KERNEL); + if (!chains) { + pr_err("module %s: no memory for symbol chain buffer\n", + me->name); + return -ENOMEM; + } + + got->sh_size = 0; + got->sh_addralign = 8; + got->sh_type = SHT_NOBITS; + + /* Examine all LITERAL relocations to find out what GOT entries + * are required. This sizes the GOT section as well. + */ + for (s = sechdrs; s < esechdrs; ++s) + if (s->sh_type == SHT_RELA) { + nrela = s->sh_size / sizeof(Elf64_Rela); + rela = (void *)hdr + s->sh_offset; + for (i = 0; i < nrela; ++i) + process_reloc_for_got(rela+i, chains, + &got->sh_size); + } + + /* Free the memory we allocated. */ + for (i = 0; i < nsyms; ++i) { + struct got_entry *g, *n; + + for (g = chains[i].next; g ; g = n) { + n = g->next; + kfree(g); + } + } + kfree(chains); + + return 0; +} + +int +apply_relocate_add(Elf64_Shdr *sechdrs, const char *strtab, + unsigned int symindex, unsigned int relsec, + struct module *me) +{ + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr; + unsigned long i, n = sechdrs[relsec].sh_size / sizeof(*rela); + Elf64_Sym *symtab, *sym; + void *base, *location; + unsigned long got, gp; + + DEBUGP("Applying relocate section %u to %u\n", relsec, + sechdrs[relsec].sh_info); + + base = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr; + symtab = (Elf64_Sym *)sechdrs[symindex].sh_addr; + + /* The small sections were sorted to the end of the segment. + * The following should definitely cover them. + */ + got = sechdrs[me->arch.gotsecindex].sh_addr; + if (me->core_layout.size > 0x10000) + gp = got + 0x8000; + else + gp = (u64)me->core_layout.base + me->core_layout.size - 0x8000; + + for (i = 0; i < n; i++) { + unsigned long r_sym = ELF64_R_SYM(rela[i].r_info); + unsigned long r_type = ELF64_R_TYPE(rela[i].r_info); + unsigned long r_got_offset = r_type >> 8; + unsigned long value, hi, lo; + + r_type &= 0xff; + + /* This is where to make the change. */ + location = base + rela[i].r_offset; + + /* This is the symbol it is referring to. Note that all + * unresolved symbols have been resolved. + */ + sym = symtab + r_sym; + value = sym->st_value + rela[i].r_addend; + + switch (r_type) { + case R_SW64_NONE: + break; + case R_SW64_REFLONG: + *(u32 *)location = value; + break; + case R_SW64_REFQUAD: + /* BUG() can produce misaligned relocations. 
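		 * For example, an 8-byte relocation in the __bug_table can
		 * land on a 4-byte boundary, so the value is written below as
		 * two aligned 32-bit halves instead of a single u64 store.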
*/ + ((u32 *)location)[0] = value; + ((u32 *)location)[1] = value >> 32; + break; + case R_SW64_GPREL32: + value -= gp; + if ((int)value != value) + goto reloc_overflow; + *(u32 *)location = value; + break; + case R_SW64_LITERAL: + hi = got + r_got_offset; + lo = hi - gp; + if ((short)lo != lo) { + unsigned long over_offset = (lo + 0x8000) >> 16; + + if ((over_offset & 0x8000) == 0) { + *(u16 *)(location - 0x4) = over_offset; + *(u16 *)location = lo - ((over_offset << 16) + gp); + *(u64 *)hi = value; + } else { + goto reloc_overflow; + } + } else { + *(u16 *)location = lo; + *(u64 *)hi = value; + } + break; + case R_SW64_LITERAL_GOT: + /* empty for now need to fill */ + break; + case R_SW64_LITUSE: + break; + case R_SW64_GPDISP: + value = gp - (u64)location; + lo = (short)value; + hi = (int)(value - lo); + if (hi + lo != value) + goto reloc_overflow; + *(u16 *)location = hi >> 16; + *(u16 *)(location + rela[i].r_addend) = lo; + break; + case R_SW64_BRSGP: + /* + * BRSGP is only allowed to bind to local symbols. + * If the section is undef, this means that the + * value was resolved from somewhere else. + */ + if (sym->st_shndx == SHN_UNDEF) + goto reloc_overflow; + if ((sym->st_other & STO_SW64_STD_GPLOAD) == + STO_SW64_STD_GPLOAD) + /* Omit the prologue. */ + value += 8; + /* FALLTHRU */ + case R_SW64_BRADDR: + value -= (u64)location + 4; + if (value & 3) + goto reloc_overflow; + value = (long)value >> 2; + if (value + (1<<21) >= 1<<22) + goto reloc_overflow; + value &= 0x1fffff; + value |= *(u32 *)location & ~0x1fffff; + *(u32 *)location = value; + break; + case R_SW64_HINT: + break; + case R_SW64_SREL32: + value -= (u64)location; + if ((int)value != value) + goto reloc_overflow; + *(u32 *)location = value; + break; + case R_SW64_SREL64: + value -= (u64)location; + *(u64 *)location = value; + break; + case R_SW64_GPRELHIGH: + value = (long)(value - gp + 0x8000) >> 16; + if ((short) value != value) + goto reloc_overflow; + *(u16 *)location = value; + break; + case R_SW64_GPRELLOW: + value -= gp; + *(u16 *)location = value; + break; + case R_SW64_GPREL16: + value -= gp; + if ((short) value != value) + goto reloc_overflow; + *(u16 *)location = value; + break; + default: + pr_err("module %s: Unknown relocation: %lu\n", me->name, r_type); + return -ENOEXEC; +reloc_overflow: + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) + pr_err("module %s: Relocation (type %lu) overflow vs section %d\n", + me->name, r_type, sym->st_shndx); + else + pr_err("module %s: Relocation (type %lu) overflow vs %s\n", + me->name, r_type, strtab + sym->st_name); + return -ENOEXEC; + } + } + + return 0; +} diff --git a/arch/sw_64/kernel/msi.c b/arch/sw_64/kernel/msi.c new file mode 100644 index 000000000000..644e4010af8a --- /dev/null +++ b/arch/sw_64/kernel/msi.c @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/module.h> +#include <linux/irq.h> +#include <linux/kernel.h> +#include <linux/msi.h> +#include <linux/pci.h> +#include <linux/cpumask.h> +#include <asm/sw64io.h> +#include <asm/msi.h> +#include <asm/pci.h> + + +int msi_compose_msg(unsigned int irq, struct msi_msg *msg) +{ + msg->address_hi = (unsigned int)(MSIX_MSG_ADDR >> 32); + msg->address_lo = (unsigned int)(MSIX_MSG_ADDR & 0xffffffff); + msg->data = irq; + return irq; +} + +void sw64_irq_noop(struct irq_data *d) +{ +} + +void destroy_irq(unsigned int irq) +{ +#if 0 + int pos; + + irq_init_desc(irq); + + if (irq < RC1_FIRST_MSI_VECTOR) { + pos = irq - RC0_FIRST_MSI_VECTOR; + clear_bit(pos, msi0_irq_in_use); + } else { + pos = irq - 
RC1_FIRST_MSI_VECTOR; + clear_bit(pos, msi1_irq_in_use); + } +#endif +} + +void arch_teardown_msi_irq(unsigned int irq) +{ + destroy_irq(irq); +} + +static int __init msi_init(void) +{ + return 0; +} + +static void __exit msi_exit(void) +{ +} + +module_init(msi_init); +module_exit(msi_exit); +MODULE_LICENSE("GPL v2"); diff --git a/arch/sw_64/kernel/pci-noop.c b/arch/sw_64/kernel/pci-noop.c new file mode 100644 index 000000000000..4ef694e629e8 --- /dev/null +++ b/arch/sw_64/kernel/pci-noop.c @@ -0,0 +1,145 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * linux/arch/sw/kernel/pci-noop.c + * + * Stub PCI interfaces for NO PCI kernels. + */ + +#include <linux/pci.h> +#include <linux/init.h> +#include <linux/memblock.h> +#include <linux/gfp.h> +#include <linux/capability.h> +#include <linux/mm.h> +#include <linux/errno.h> +#include <linux/sched.h> +#include <linux/dma-mapping.h> +#include <linux/scatterlist.h> +#include <linux/module.h> + +/* + * The PCI controller list. + */ + +struct pci_controller *hose_head, **hose_tail = &hose_head; + +struct pci_controller * __init +alloc_pci_controller(void) +{ + struct pci_controller *hose; + + hose = memblock_alloc(sizeof(*hose), SMP_CACHE_BYTES); + + *hose_tail = hose; + hose_tail = &hose->next; + + return hose; +} + +struct resource * __init +alloc_resource(void) +{ + struct resource *res; + + res = memblock_alloc(sizeof(*res), SMP_CACHE_BYTES); + + return res; +} + +asmlinkage long +sys_pciconfig_iobase(long which, unsigned long bus, unsigned long dfn) +{ + return -ENODEV; +} + +asmlinkage long +sys_pciconfig_read(unsigned long bus, unsigned long dfn, + unsigned long off, unsigned long len, void *buf) +{ + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + else + return -ENODEV; +} + +asmlinkage long +sys_pciconfig_write(unsigned long bus, unsigned long dfn, + unsigned long off, unsigned long len, void *buf) +{ + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + else + return -ENODEV; +} + +static void *sw64_noop_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_handle, gfp_t gfp, + unsigned long attrs) +{ + void *ret; + + if (!dev || *dev->dma_mask >= 0xffffffffUL) + gfp &= ~GFP_DMA; + ret = (void *)__get_free_pages(gfp, get_order(size)); + if (ret) { + memset(ret, 0, size); + *dma_handle = virt_to_phys(ret); + } + return ret; +} + +static void sw64_noop_free_coherent(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_addr, + unsigned long attrs) +{ + free_pages((unsigned long)cpu_addr, get_order(size)); +} + +static dma_addr_t sw64_noop_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + unsigned long attrs) +{ + return page_to_pa(page) + offset; +} + +static int sw64_noop_map_sg(struct device *dev, struct scatterlist *sgl, int nents, + enum dma_data_direction dir, unsigned long attrs) +{ + int i; + struct scatterlist *sg; + + for_each_sg(sgl, sg, nents, i) { + void *va; + + BUG_ON(!sg_page(sg)); + va = sg_virt(sg); + sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va); + sg_dma_len(sg) = sg->length; + } + + return nents; +} + +static int sw64_noop_supported(struct device *dev, u64 mask) +{ + return mask < 0x00ffffffUL ? 
0 : 1;
+}
+
+const struct dma_map_ops sw64_noop_ops = {
+	.alloc		= sw64_noop_alloc_coherent,
+	.free		= sw64_noop_free_coherent,
+	.map_page	= sw64_noop_map_page,
+	.map_sg		= sw64_noop_map_sg,
+	.dma_supported	= sw64_noop_supported,
+};
+
+const struct dma_map_ops *dma_ops = &sw64_noop_ops;
+EXPORT_SYMBOL(dma_ops);
+
+void __init common_init_pci(void)
+{
+}
+
+void __init sw64_init_arch(void) { }
+void __init sw64_init_irq(void) { }
diff --git a/arch/sw_64/kernel/pci-sysfs.c b/arch/sw_64/kernel/pci-sysfs.c
new file mode 100644
index 000000000000..584243922df9
--- /dev/null
+++ b/arch/sw_64/kernel/pci-sysfs.c
@@ -0,0 +1,368 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arch/sw_64/kernel/pci-sysfs.c
+ *
+ * Copyright (C) 2009 Ivan Kokshaysky
+ *
+ * Sw_64 PCI resource files.
+ *
+ * Loosely based on generic HAVE_PCI_MMAP implementation in
+ * drivers/pci/pci-sysfs.c
+ */
+
+#include <linux/sched.h>
+#include <linux/stat.h>
+#include <linux/slab.h>
+#include <linux/pci.h>
+
+static int hose_mmap_page_range(struct pci_controller *hose,
+				struct vm_area_struct *vma,
+				enum pci_mmap_state mmap_type, int sparse)
+{
+	unsigned long base;
+
+	if (mmap_type == pci_mmap_mem)
+		base = sparse ? hose->sparse_mem_base : hose->dense_mem_base;
+	else
+		base = sparse ? hose->sparse_io_base : hose->dense_io_base;
+
+	vma->vm_pgoff |= base >> PAGE_SHIFT;
+
+	return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
+				  vma->vm_end - vma->vm_start,
+				  vma->vm_page_prot);
+}
+
+static int __pci_mmap_fits(struct pci_dev *pdev, int num,
+			   struct vm_area_struct *vma, int sparse)
+{
+	unsigned long nr, start, size;
+	int shift = sparse ? 5 : 0;
+
+	nr = vma_pages(vma);
+	start = vma->vm_pgoff;
+	size = ((pci_resource_len(pdev, num) - 1) >> (PAGE_SHIFT - shift)) + 1;
+
+	if (start < size && size - start >= nr)
+		return 1;
+	WARN(1, "process \"%s\" tried to map%s 0x%08lx-0x%08lx on %s BAR %d "
+		"(size 0x%08lx)\n",
+		current->comm, sparse ? " sparse" : "", start, start + nr,
+		pci_name(pdev), num, size);
+	return 0;
+}
+
+/**
+ * pci_mmap_resource - map a PCI resource into user memory space
+ * @kobj: kobject for mapping
+ * @attr: struct bin_attribute for the file being mapped
+ * @vma: struct vm_area_struct passed into the mmap
+ * @sparse: address space type
+ *
+ * Use the bus mapping routines to map a PCI resource into userspace.
+ */
+static int pci_mmap_resource(struct kobject *kobj,
+			     struct bin_attribute *attr,
+			     struct vm_area_struct *vma, int sparse)
+{
+	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
+	struct resource *res = attr->private;
+	enum pci_mmap_state mmap_type;
+	struct pci_bus_region bar;
+	int i;
+
+	for (i = 0; i < PCI_ROM_RESOURCE; i++)
+		if (res == &pdev->resource[i])
+			break;
+	if (i >= PCI_ROM_RESOURCE)
+		return -ENODEV;
+
+	if (res->flags & IORESOURCE_MEM && iomem_is_exclusive(res->start))
+		return -EINVAL;
+
+	if (!__pci_mmap_fits(pdev, i, vma, sparse))
+		return -EINVAL;
+
+	pcibios_resource_to_bus(pdev->bus, &bar, res);
+	vma->vm_pgoff += bar.start >> (PAGE_SHIFT - (sparse ? 5 : 0));
+	mmap_type = res->flags & IORESOURCE_MEM ?
pci_mmap_mem : pci_mmap_io; + + return hose_mmap_page_range(pdev->sysdata, vma, mmap_type, sparse); +} + +static int pci_mmap_resource_sparse(struct file *filp, struct kobject *kobj, + struct bin_attribute *attr, + struct vm_area_struct *vma) +{ + return pci_mmap_resource(kobj, attr, vma, 1); +} + +static int pci_mmap_resource_dense(struct file *filp, struct kobject *kobj, + struct bin_attribute *attr, + struct vm_area_struct *vma) +{ + return pci_mmap_resource(kobj, attr, vma, 0); +} + +/** + * pci_remove_resource_files - cleanup resource files + * @dev: dev to cleanup + * + * If we created resource files for @dev, remove them from sysfs and + * free their resources. + */ +void pci_remove_resource_files(struct pci_dev *pdev) +{ + int i; + + for (i = 0; i < PCI_ROM_RESOURCE; i++) { + struct bin_attribute *res_attr; + + res_attr = pdev->res_attr[i]; + if (res_attr) { + sysfs_remove_bin_file(&pdev->dev.kobj, res_attr); + kfree(res_attr); + } + + res_attr = pdev->res_attr_wc[i]; + if (res_attr) { + sysfs_remove_bin_file(&pdev->dev.kobj, res_attr); + kfree(res_attr); + } + } +} + +static int sparse_mem_mmap_fits(struct pci_dev *pdev, int num) +{ + struct pci_bus_region bar; + struct pci_controller *hose = pdev->sysdata; + long dense_offset; + unsigned long sparse_size; + + pcibios_resource_to_bus(pdev->bus, &bar, &pdev->resource[num]); + + /* + * All core logic chips have 4G sparse address space, except + * CIA which has 16G (see xxx_SPARSE_MEM and xxx_DENSE_MEM + * definitions in asm/core_xxx.h files). This corresponds + * to 128M or 512M of the bus space. + */ + dense_offset = (long)(hose->dense_mem_base - hose->sparse_mem_base); + sparse_size = dense_offset >= 0x400000000UL ? 0x20000000 : 0x8000000; + + return bar.end < sparse_size; +} + +static int pci_create_one_attr(struct pci_dev *pdev, int num, char *name, + char *suffix, struct bin_attribute *res_attr, + unsigned long sparse) +{ + size_t size = pci_resource_len(pdev, num); + + sprintf(name, "resource%d%s", num, suffix); + res_attr->mmap = sparse ? pci_mmap_resource_sparse : + pci_mmap_resource_dense; + res_attr->attr.name = name; + res_attr->attr.mode = S_IRUSR | S_IWUSR; + res_attr->size = sparse ? 
size << 5 : size;
+	res_attr->private = &pdev->resource[num];
+	return sysfs_create_bin_file(&pdev->dev.kobj, res_attr);
+}
+
+static int pci_create_attr(struct pci_dev *pdev, int num)
+{
+	/* allocate attribute structure, piggyback attribute name */
+	int retval, nlen1, nlen2 = 0, res_count = 1;
+	unsigned long sparse_base, dense_base;
+	struct bin_attribute *attr;
+	struct pci_controller *hose = pdev->sysdata;
+	char *suffix, *attr_name;
+
+	suffix = "";
+	nlen1 = 10;
+
+	if (pdev->resource[num].flags & IORESOURCE_MEM) {
+		sparse_base = hose->sparse_mem_base;
+		dense_base = hose->dense_mem_base;
+		if (sparse_base && !sparse_mem_mmap_fits(pdev, num)) {
+			sparse_base = 0;
+			suffix = "_dense";
+			nlen1 = 16;	/* resourceN_dense */
+		}
+	} else {
+		sparse_base = hose->sparse_io_base;
+		dense_base = hose->dense_io_base;
+	}
+
+	if (sparse_base) {
+		suffix = "_sparse";
+		nlen1 = 17;
+		if (dense_base) {
+			nlen2 = 16;	/* resourceN_dense */
+			res_count = 2;
+		}
+	}
+
+	attr = kzalloc(sizeof(*attr) * res_count + nlen1 + nlen2, GFP_ATOMIC);
+	if (!attr)
+		return -ENOMEM;
+
+	attr_name = (char *)(attr + res_count);
+	pdev->res_attr[num] = attr;
+	retval = pci_create_one_attr(pdev, num, attr_name, suffix, attr,
+				     sparse_base);
+	if (retval || res_count == 1)
+		return retval;
+
+	/* Create dense file */
+	attr_name += nlen1;
+	attr++;
+	pdev->res_attr_wc[num] = attr;
+	return pci_create_one_attr(pdev, num, attr_name, "_dense", attr, 0);
+}
+
+/**
+ * pci_create_resource_files - create resource files in sysfs for @dev
+ * @dev: dev in question
+ *
+ * Walk the resources in @dev creating files for each resource available.
+ */
+int pci_create_resource_files(struct pci_dev *pdev)
+{
+	int i;
+	int retval;
+
+	/* Expose the PCI resources from this device as files */
+	for (i = 0; i < PCI_ROM_RESOURCE; i++) {
+
+		/* skip empty resources */
+		if (!pci_resource_len(pdev, i))
+			continue;
+
+		retval = pci_create_attr(pdev, i);
+		if (retval) {
+			pci_remove_resource_files(pdev);
+			return retval;
+		}
+	}
+	return 0;
+}
+
+/* Legacy I/O bus mapping stuff. */
+
+static int __legacy_mmap_fits(struct pci_controller *hose,
+			      struct vm_area_struct *vma,
+			      unsigned long res_size, int sparse)
+{
+	unsigned long nr, start, size;
+
+	nr = vma_pages(vma);
+	start = vma->vm_pgoff;
+	size = ((res_size - 1) >> PAGE_SHIFT) + 1;
+
+	if (start < size && size - start >= nr)
+		return 1;
+	WARN(1, "process \"%s\" tried to map%s 0x%08lx-0x%08lx on hose %ld "
+		"(size 0x%08lx)\n",
+		current->comm, sparse ? " sparse" : "", start, start + nr,
+		hose->index, size);
+	return 0;
+}
+
+static inline int has_sparse(struct pci_controller *hose,
+			     enum pci_mmap_state mmap_type)
+{
+	unsigned long base;
+
+	base = (mmap_type == pci_mmap_mem) ? hose->sparse_mem_base :
+					     hose->sparse_io_base;
+
+	return base != 0;
+}
+
+int pci_mmap_legacy_page_range(struct pci_bus *bus, struct vm_area_struct *vma,
+			       enum pci_mmap_state mmap_type)
+{
+	struct pci_controller *hose = bus->sysdata;
+	int sparse = has_sparse(hose, mmap_type);
+	unsigned long res_size;
+
+	res_size = (mmap_type == pci_mmap_mem) ? bus->legacy_mem->size :
+						 bus->legacy_io->size;
+	if (!__legacy_mmap_fits(hose, vma, res_size, sparse))
+		return -EINVAL;
+
+	return hose_mmap_page_range(hose, vma, mmap_type, sparse);
+}
+
+/**
+ * pci_adjust_legacy_attr - adjustment of legacy file attributes
+ * @bus: bus to create files under
+ * @mmap_type: I/O port or memory
+ *
+ * Adjust file name and size for sparse mappings.
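 *
 * For example, a 64KB legacy I/O window grows to a 2MB "legacy_io_sparse"
 * file below, because each byte of dense space occupies 32 (1 << 5) bytes
 * of sparse address space.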
+ */
+void pci_adjust_legacy_attr(struct pci_bus *bus, enum pci_mmap_state mmap_type)
+{
+	struct pci_controller *hose = bus->sysdata;
+
+	if (!has_sparse(hose, mmap_type))
+		return;
+
+	if (mmap_type == pci_mmap_mem) {
+		bus->legacy_mem->attr.name = "legacy_mem_sparse";
+		bus->legacy_mem->size <<= 5;
+	} else {
+		bus->legacy_io->attr.name = "legacy_io_sparse";
+		bus->legacy_io->size <<= 5;
+	}
+}
+
+/* Legacy I/O bus read/write functions */
+int pci_legacy_read(struct pci_bus *bus, loff_t port, u32 *val, size_t size)
+{
+	struct pci_controller *hose = bus->sysdata;
+
+	port += hose->io_space->start;
+
+	switch (size) {
+	case 1:
+		*((u8 *)val) = inb(port);
+		return 1;
+	case 2:
+		if (port & 1)
+			return -EINVAL;
+		*((u16 *)val) = inw(port);
+		return 2;
+	case 4:
+		if (port & 3)
+			return -EINVAL;
+		*((u32 *)val) = inl(port);
+		return 4;
+	}
+	return -EINVAL;
+}
+
+int pci_legacy_write(struct pci_bus *bus, loff_t port, u32 val, size_t size)
+{
+	struct pci_controller *hose = bus->sysdata;
+
+	port += hose->io_space->start;
+
+	switch (size) {
+	case 1:
+		outb(val, port);
+		return 1;
+	case 2:
+		if (port & 1)
+			return -EINVAL;
+		outw(val, port);
+		return 2;
+	case 4:
+		if (port & 3)
+			return -EINVAL;
+		outl(val, port);
+		return 4;
+	}
+	return -EINVAL;
+}
diff --git a/arch/sw_64/kernel/pci.c b/arch/sw_64/kernel/pci.c
new file mode 100644
index 000000000000..36616d31f32f
--- /dev/null
+++ b/arch/sw_64/kernel/pci.c
@@ -0,0 +1,733 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * linux/arch/sw_64/kernel/pci.c
+ * Modified by Suweiqiang 2013-9-30
+ */
+
+#include <linux/string.h>
+#include <linux/pci.h>
+#include <linux/init.h>
+#include <linux/ioport.h>
+#include <linux/kernel.h>
+#include <linux/memblock.h>
+#include <linux/module.h>
+#include <linux/cache.h>
+#include <linux/slab.h>
+#include <linux/msi.h>
+#include <linux/irq.h>
+#include <asm/msi.h>
+#include <linux/delay.h>
+#include <linux/syscore_ops.h>
+#include <linux/platform_device.h>
+#include <asm/sw64_init.h>
+#include <asm/pci.h>
+
+#include "pci_impl.h"
+
+unsigned long rc_linkup;
+
+/* Indicate whether we respect the PCI setup left by console. */
+/*
+ * Make this long-lived so that we know when shutting down
+ * whether we probed only or not.
+ */
+int pci_probe_only;
+
+/*
+ * raw_pci_read/write - Platform-specific PCI config space access.
+ */
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+		 int reg, int len, u32 *val)
+{
+	struct pci_bus *bus_tmp = pci_find_bus(domain, bus);
+
+	if (bus_tmp)
+		return bus_tmp->ops->read(bus_tmp, devfn, reg, len, val);
+
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+		  int reg, int len, u32 val)
+{
+	struct pci_bus *bus_tmp = pci_find_bus(domain, bus);
+
+	if (bus_tmp)
+		return bus_tmp->ops->write(bus_tmp, devfn, reg, len, val);
+
+	return -EINVAL;
+}
+
+struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
+{
+	/* Stub: return NULL rather than an uninitialized pointer. */
+	return NULL;
+}
+
+/*
+ * The PCI controller list.
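 *
 * Hoses are appended with the classic tail-pointer idiom used by
 * alloc_pci_controller() further down:
 *
 *	*hose_tail = hose;		// link new hose at the tail
 *	hose_tail = &hose->next;	// tail now points at its next field
 *
 * so a walk over all controllers is simply
 * for (hose = hose_head; hose; hose = hose->next).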
+ */ + +struct pci_controller *hose_head, **hose_tail = &hose_head; +static void __init pcibios_reserve_legacy_regions(struct pci_bus *bus); + +/* Quirks */ +static void quirk_isa_bridge(struct pci_dev *dev) +{ + dev->class = PCI_CLASS_BRIDGE_ISA << 8; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82378, quirk_isa_bridge); + +/* Just declaring that the power-of-ten prefixes are actually the + * power-of-two ones doesn't make it true :) + */ +#define KB 1024 +#define MB (1024*KB) +#define GB (1024*MB) + +resource_size_t pcibios_align_resource(void *data, const struct resource *res, + resource_size_t size, resource_size_t align) +{ + struct pci_dev *dev = data; + struct pci_controller *hose = dev->sysdata; + unsigned long alignto; + resource_size_t start = res->start; + + if (res->flags & IORESOURCE_IO) { + /* Make sure we start at our min on all hoses */ + if (start - hose->io_space->start < PCIBIOS_MIN_IO) + start = PCIBIOS_MIN_IO + hose->io_space->start; + /* + * Put everything into 0x00-0xff region modulo 0x400 + */ + if (start & 0x300) + start = (start + 0x3ff) & ~0x3ff; + } else if (res->flags & IORESOURCE_MEM) { + /* Make sure we start at our min on all hoses */ + if (start - hose->mem_space->start < PCIBIOS_MIN_MEM) + start = PCIBIOS_MIN_MEM + hose->mem_space->start; //0xc0000000- 0xffffffff + /* + * The following holds at least for the Low Cost + * Sw_64 implementation of the PCI interface: + * + * In sparse memory address space, the first + * octant (16MB) of every 128MB segment is + * aliased to the very first 16 MB of the + * address space (i.e., it aliases the ISA + * memory address space). Thus, we try to + * avoid allocating PCI devices in that range. + * Can be allocated in 2nd-7th octant only. + * Devices that need more than 112MB of + * address space must be accessed through + * dense memory space only! + */ + + /* Align to multiple of size of minimum base. */ + alignto = max_t(resource_size_t, 0x1000UL, align); + start = ALIGN(start, alignto); + if (hose->sparse_mem_base && size <= 7 * 16*MB) { + if (((start / (16*MB)) & 0x7) == 0) { + start &= ~(128*MB - 1); + start += 16*MB; + start = ALIGN(start, alignto); + } + if (start/(128*MB) != (start + size - 1)/(128*MB)) { + start &= ~(128*MB - 1); + start += (128 + 16)*MB; + start = ALIGN(start, alignto); + } + } + } + + return start; +} + +#undef KB +#undef MB +#undef GB + +static int __init +pcibios_init(void) +{ + sw64_init_pci(); + return 0; +} + +subsys_initcall(pcibios_init); + +char *pcibios_setup(char *str) +{ + return str; +} + +void pcibios_fixup_bus(struct pci_bus *bus) +{ + /* Propagate hose info into the subordinate devices. */ + + struct pci_controller *hose = bus->sysdata; + struct pci_dev *dev = bus->self; + + if (!dev || bus->number == hose->first_busno) { + /* Root bus. */ + unsigned long end; + + bus->resource[0] = hose->io_space; + bus->resource[1] = hose->mem_space; + bus->resource[2] = hose->pre_mem_space; + } else if (pci_probe_only && + (dev->class >> 8) == PCI_CLASS_BRIDGE_PCI) { + pci_read_bridge_bases(bus); + } +} + +void pcibios_update_irq(struct pci_dev *dev, int irq) +{ + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); +} + +/* Helper for generic DMA-mapping functions. */ +struct pci_dev *sw64_gendev_to_pci(struct device *dev) +{ + if (dev && dev->bus == &pci_bus_type) + return to_pci_dev(dev); + + return NULL; +} + +/* + * If we set up a device for bus mastering, we need to check the latency + * timer as certain firmware forgets to set it properly. 
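 *
 * Example: firmware that leaves PCI_LATENCY_TIMER at 0 lets a bus-mastering
 * device be throttled almost immediately after it acquires the bus; raising
 * the timer to 64 below gives a sane default without per-device special
 * cases.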
+ */
+void pcibios_set_master(struct pci_dev *dev)
+{
+	u8 lat;
+
+	pci_read_config_byte(dev, PCI_LATENCY_TIMER, &lat);
+	if (lat >= 16)
+		return;
+	pr_info("PCI: Setting latency timer of device %s to 64\n", pci_name(dev));
+	pci_write_config_byte(dev, PCI_LATENCY_TIMER, 64);
+}
+
+void __init pcibios_claim_one_bus(struct pci_bus *b)
+{
+	struct pci_dev *dev;
+	struct pci_bus *child_bus;
+
+	list_for_each_entry(dev, &b->devices, bus_list) {
+		int i;
+
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+			struct resource *r = &dev->resource[i];
+
+			if (r->parent || !r->start || !r->flags)
+				continue;
+			if (pci_probe_only || (r->flags & IORESOURCE_PCI_FIXED)) {
+				if (pci_claim_resource(dev, i) == 0)
+					continue;
+
+				pci_claim_bridge_resource(dev, i);
+			}
+		}
+	}
+
+	list_for_each_entry(child_bus, &b->children, node)
+		pcibios_claim_one_bus(child_bus);
+}
+
+static void __init
+pcibios_claim_console_setup(void)
+{
+	struct pci_bus *b;
+
+	list_for_each_entry(b, &pci_root_buses, node)
+		pcibios_claim_one_bus(b);
+}
+
+int __weak chip_pcie_configure(struct pci_controller *hose)
+{
+	return 0;
+}
+
+unsigned char last_bus = PCI0_BUS;
+void __init common_init_pci(void)
+{
+	struct pci_controller *hose;
+	struct pci_host_bridge *bridge;
+	struct pci_bus *bus;
+	unsigned int init_busnr;
+	int need_domain_info = 0;
+	int ret, iov_bus = 0;
+	unsigned long offset;
+
+	/* Scan all of the recorded PCI controllers. */
+	for (hose = hose_head; hose; hose = hose->next) {
+		bridge = pci_alloc_host_bridge(0);
+		if (!bridge)
+			continue;
+		hose->busn_space->start = last_bus;
+		init_busnr = (0xff << 16) + ((last_bus + 1) << 8) + (last_bus);
+		write_rc_conf(hose->node, hose->index, RC_PRIMARY_BUS, init_busnr);
+		if (is_in_host()) {
+			offset = hose->mem_space->start - PCI_32BIT_MEMIO;
+			hose->first_busno = last_bus + 1;
+		} else {
+			offset = hose->mem_space->start - PCI_32BIT_VT_MEMIO;
+			hose->first_busno = last_bus;
+		}
+		pci_add_resource_offset(&bridge->windows, hose->mem_space, offset);
+		pci_add_resource_offset(&bridge->windows, hose->io_space, hose->io_space->start);
+		pci_add_resource_offset(&bridge->windows, hose->pre_mem_space, 0);
+		pci_add_resource_offset(&bridge->windows, hose->busn_space, 0);
+		bridge->dev.parent = NULL;
+		bridge->sysdata = hose;
+		bridge->busnr = hose->busn_space->start;
+		bridge->ops = &sw64_pci_ops;
+		bridge->swizzle_irq = sw64_swizzle;
+		bridge->map_irq = sw64_map_irq;
+
+		ret = pci_scan_root_bus_bridge(bridge);
+		if (ret) {
+			pci_free_host_bridge(bridge);
+			continue;
+		}
+
+		bus = hose->bus = bridge->bus;
+		hose->need_domain_info = need_domain_info;
+		while (pci_find_bus(pci_domain_nr(bus), last_bus))
+			last_bus++;
+
+		if (is_in_host())
+			iov_bus = chip_pcie_configure(hose);
+		last_bus += iov_bus;
+
+		hose->last_busno = hose->busn_space->end = last_bus - 1;
+		init_busnr = read_rc_conf(hose->node, hose->index, RC_PRIMARY_BUS);
+		init_busnr &= ~(0xff << 16);
+		init_busnr |= (last_bus - 1) << 16;
+		write_rc_conf(hose->node, hose->index, RC_PRIMARY_BUS, init_busnr);
+		pci_bus_update_busn_res_end(bus, last_bus - 1);
+	}
+
+	pcibios_claim_console_setup();
+#ifdef CONFIG_SUNWAY_IOMMU
+	register_syscore_ops(&iommu_cpu_syscore_ops);
+#endif
+
+	if (is_in_host()) {
+		list_for_each_entry(bus, &pci_root_buses, node)
+			pcibios_reserve_legacy_regions(bus);
+	}
+
+	pr_info("SW arch assign unassigned resources.\n");
+
+	pci_assign_unassigned_resources();
+
+	for (hose = hose_head; hose; hose = hose->next) {
+		bus = hose->bus;
+		if (bus)
+			pci_bus_add_devices(bus);
+	}
+}
+
+struct
pci_controller * __init +alloc_pci_controller(void) +{ + struct pci_controller *hose; + + hose = memblock_alloc(sizeof(*hose), SMP_CACHE_BYTES); + + *hose_tail = hose; + hose_tail = &hose->next; + + return hose; +} + +struct resource * __init +alloc_resource(void) +{ + struct resource *res; + + res = memblock_alloc(sizeof(*res), SMP_CACHE_BYTES); + + return res; +} + +static struct pci_controller *pci_bus_to_hose(unsigned long bus) +{ + struct pci_controller *hose; + + for (hose = hose_head; hose; hose = hose->next) { + if (bus >= hose->first_busno && bus <= hose->last_busno) + return hose; + } + return NULL; +} + +/* Provide information on locations of various I/O regions in physical + * memory. Do this on a per-card basis so that we choose the right hose. + */ + +asmlinkage long sys_pciconfig_iobase(long which, unsigned long bus, unsigned long dfn) +{ + struct pci_controller *hose; + + hose = pci_bus_to_hose(bus); + if (hose == NULL) + return -ENODEV; + + switch (which & ~IOBASE_FROM_HOSE) { + case IOBASE_HOSE: + return hose->index; + case IOBASE_SPARSE_MEM: + return hose->sparse_mem_base; + case IOBASE_DENSE_MEM: + return hose->dense_mem_base; + case IOBASE_SPARSE_IO: + return hose->sparse_io_base; + case IOBASE_DENSE_IO: + return hose->dense_io_base; + case IOBASE_ROOT_BUS: + return hose->bus->number; + } + + return -EOPNOTSUPP; +} + +/* Destroy an __iomem token. Not copied from lib/iomap.c. */ + +void pci_iounmap(struct pci_dev *dev, void __iomem *addr) +{ + if (__is_mmio(addr)) + iounmap(addr); +} +EXPORT_SYMBOL(pci_iounmap); + +static void __init pcibios_reserve_legacy_regions(struct pci_bus *bus) +{ + struct pci_controller *hose = bus->sysdata; + resource_size_t offset; + struct resource *res; + + pr_debug("Reserving legacy ranges for domain %04x\n", pci_domain_nr(bus)); + + /* Check for IO */ + if (!(hose->io_space->flags & IORESOURCE_IO)) + goto no_io; + offset = (unsigned long)hose->io_space->start; + res = kzalloc(sizeof(struct resource), GFP_KERNEL); + BUG_ON(res == NULL); + res->name = "Legacy IO"; + res->flags = IORESOURCE_IO; + res->start = offset; + res->end = (offset + 0xfff) & 0xfffffffffffffffful; + pr_debug("Candidate legacy IO: %pR\n", res); + if (request_resource(hose->io_space, res)) { + pr_debug("PCI %04x:%02x Cannot reserve Legacy IO %pR\n", + pci_domain_nr(bus), bus->number, res); + kfree(res); + } + +no_io: + return; +} + +/* PCIe RC operations */ +int sw6_pcie_read_rc_cfg(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val) +{ + u32 data; + struct pci_controller *hose = bus->sysdata; + void __iomem *cfg_iobase = (void *)hose->rc_config_space_base; + + if (IS_ENABLED(CONFIG_PCI_DEBUG)) + pr_debug("rc read addr:%px bus %d, devfn %#x, where %#x size=%d\t", + cfg_iobase + ((where & ~3) << 5), bus->number, devfn, where, size); + + if ((uintptr_t)where & (size - 1)) { + *val = 0; + return PCIBIOS_BAD_REGISTER_NUMBER; + } + + if (unlikely(devfn > 0)) { + *val = ~0; + return PCIBIOS_DEVICE_NOT_FOUND; + } + + data = readl(cfg_iobase + ((where & ~3) << 5)); + + switch (size) { + case 1: + *val = (data >> (8 * (where & 0x3))) & 0xff; + break; + case 2: + *val = (data >> (8 * (where & 0x2))) & 0xffff; + break; + default: + *val = data; + break; + } + + if (IS_ENABLED(CONFIG_PCI_DEBUG)) + pr_debug("*val %#x\n ", *val); + + return PCIBIOS_SUCCESSFUL; +} + +int sw6_pcie_write_rc_cfg(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 val) +{ + u32 data; + u32 shift = 8 * (where & 3); + struct pci_controller *hose = bus->sysdata; + void 
__iomem *cfg_iobase = (void *)hose->rc_config_space_base; + + if ((uintptr_t)where & (size - 1)) + return PCIBIOS_BAD_REGISTER_NUMBER; + + switch (size) { + case 1: + data = readl(cfg_iobase + ((where & ~3) << 5)); + data &= ~(0xff << shift); + data |= (val & 0xff) << shift; + break; + case 2: + data = readl(cfg_iobase + ((where & ~3) << 5)); + data &= ~(0xffff << shift); + data |= (val & 0xffff) << shift; + break; + default: + data = val; + break; + } + + if (IS_ENABLED(CONFIG_PCI_DEBUG)) + pr_debug("rc write addr:%px bus %d, devfn %#x, where %#x *val %#x size %d\n", + cfg_iobase + ((where & ~3) << 5), bus->number, devfn, where, val, size); + + writel(data, cfg_iobase + ((where & ~3) << 5)); + + return PCIBIOS_SUCCESSFUL; +} + +int sw6_pcie_config_read(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val) +{ + struct pci_controller *hose = bus->sysdata; + int ret = PCIBIOS_DEVICE_NOT_FOUND; + + if (is_guest_or_emul()) + return pci_generic_config_read(bus, devfn, where, size, val); + + hose->self_busno = hose->busn_space->start; + + if (unlikely(bus->number == hose->self_busno)) { + ret = sw6_pcie_read_rc_cfg(bus, devfn, where, size, val); + } else { + if (test_bit(hose->node * 8 + hose->index, &rc_linkup)) { + ret = pci_generic_config_read(bus, devfn, where, size, val); + } else { + return ret; + } + } + return ret; +} + +int sw6_pcie_config_write(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 val) +{ + struct pci_controller *hose = bus->sysdata; + + if (is_guest_or_emul()) + return pci_generic_config_write(bus, devfn, where, size, val); + + hose->self_busno = hose->busn_space->start; + + if (unlikely(bus->number == hose->self_busno)) + return sw6_pcie_write_rc_cfg(bus, devfn, where, size, val); + else + return pci_generic_config_write(bus, devfn, where, size, val); +} + +/* + *sw6_pcie_valid_device - Check if a valid device is present on bus + *@bus: PCI Bus structure + *@devfn: device/function + * + *Return: 'true' on success and 'false' if invalid device is found + */ +static bool sw6_pcie_valid_device(struct pci_bus *bus, unsigned int devfn) +{ + struct pci_controller *hose = bus->sysdata; + + if (is_in_host()) { + /* Only one device down on each root complex */ + if (bus->number == hose->self_busno && devfn > 0) + return false; + } + + return true; +} + +/* + *sw6_pcie_map_bus - Get configuration base + *@bus: PCI Bus structure + *@devfn: Device/function + *@where: Offset from base + * + *Return: Base address of the configuration space needed to be + *accessed. 
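 *
 * Worked example (illustrative): bus 1, devfn 0x08, offset 0x10 gives
 * relbus = (1 << 24) | (0x08 << 16) | 0x10 = 0x1080010, which is then
 * OR-ed with PCI_EP_CFG and the hose's ep_config_space_base.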
+ */
+static void __iomem *sw6_pcie_map_bus(struct pci_bus *bus,
+				      unsigned int devfn, int where)
+{
+	struct pci_controller *hose = bus->sysdata;
+	void __iomem *cfg_iobase;
+	unsigned long relbus;
+
+	if (!sw6_pcie_valid_device(bus, devfn))
+		return NULL;
+
+	relbus = (bus->number << 24) | (devfn << 16) | where;
+	relbus |= PCI_EP_CFG;
+
+	cfg_iobase = (void *)(hose->ep_config_space_base | relbus);
+
+	if (IS_ENABLED(CONFIG_PCI_DEBUG))
+		pr_debug("addr:%px bus %d, devfn %d, where %d\n",
+			 cfg_iobase, bus->number, devfn, where);
+	return cfg_iobase;
+}
+
+struct pci_ops sw64_pci_ops = {
+	.map_bus = sw6_pcie_map_bus,
+	.read = sw6_pcie_config_read,
+	.write = sw6_pcie_config_write,
+};
+
+int sw64_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
+{
+	return sw64_chip_init->pci_init.map_irq(dev, slot, pin);
+}
+
+unsigned char sw64_swizzle(struct pci_dev *dev, u8 *pinp)
+{
+	return PCI_SLOT(dev->devfn);
+}
+
+static void __init
+sw64_init_host(unsigned long node, unsigned long index)
+{
+	struct pci_controller *hose;
+	int ret = 0;
+
+	hose = alloc_pci_controller();
+	if (!hose) {
+		pr_err("failed to allocate hose for node %ld RC %ld\n", node, index);
+		return;
+	}
+	hose->iommu_enable = false;
+	hose->io_space = alloc_resource();
+	hose->mem_space = alloc_resource();
+	hose->pre_mem_space = alloc_resource();
+	hose->busn_space = alloc_resource();
+	hose->index = index;
+	hose->node = node;
+
+	sw64_chip_init->pci_init.hose_init(hose);
+
+	if (sw64_chip_init->pci_init.set_rc_piu)
+		sw64_chip_init->pci_init.set_rc_piu(node, index);
+
+	ret = sw64_chip_init->pci_init.check_pci_linkup(node, index);
+	if (ret == 0) {
+		/* Root Complex downstream port is link up */
+		set_bit(node * 8 + index, &rc_linkup);	/* 8 bits per node */
+	}
+}
+
+void __init sw64_init_arch(void)
+{
+	if (IS_ENABLED(CONFIG_PCI)) {
+		unsigned long node, cpu_num;
+		unsigned long rc_enable;
+		char id[8], msg[64];
+		int i;
+
+		pr_info("SW arch PCI initialize!\n");
+		cpu_num = sw64_chip->get_cpu_num();
+
+		for (node = 0; node < cpu_num; node++) {
+			rc_enable = sw64_chip_init->pci_init.get_rc_enable(node);
+			if (rc_enable == 0) {
+				pr_info("PCIe is disabled on node %ld\n", node);
+				continue;
+			}
+			for (i = 0; i < MAX_NR_RCS; i++) {
+				if ((rc_enable >> i) & 0x1)
+					sw64_init_host(node, i);
+			}
+			if ((rc_linkup >> node * 8) & 0xff) {
+				memset(msg, 0, 64);
+				sprintf(msg, "Node %ld: RC [ ", node);
+				for (i = 0; i < MAX_NR_RCS; i++) {
+					if ((rc_linkup >> (i + node * 8)) & 1) {
+						memset(id, 0, 8);
+						sprintf(id, "%d ", i);
+						strcat(msg, id);
+					}
+				}
+				strcat(msg, "] link up");
+				pr_info("%s\n", msg);
+			} else {
+				pr_info("Node %ld: no RC link up\n", node);
+			}
+		}
+	}
+}
+
+static void __init sw64_init_intx(struct pci_controller *hose)
+{
+	unsigned long int_conf, node, val_node;
+	unsigned long index;
+	int irq;
+	int rcid;
+
+	node = hose->node;
+	index = hose->index;
+
+	if (!node_online(node))
+		val_node = next_node_in(node, node_online_map);
+	else
+		val_node = node;
+	irq = irq_alloc_descs_from(NR_IRQS_LEGACY, 1, val_node);
+	WARN_ON(irq < 0);
+	irq_set_chip_and_handler(irq, &dummy_irq_chip, handle_level_irq);
+	irq_set_status_flags(irq, IRQ_LEVEL);
+	hose->int_irq = irq;
+	rcid = cpu_to_rcid(0);
+
+	printk_once(KERN_INFO "INTx are directed to node %d core %d.\n",
+		    ((rcid >> 6) & 0x3), (rcid & 0x1f));
+	int_conf = 1UL << 62 | rcid;	/* rebase all intx on the first logical cpu */
+	if (sw64_chip_init->pci_init.set_intx)
+		sw64_chip_init->pci_init.set_intx(node, index, int_conf);
+}
+
+void __init sw64_init_irq(void)
+{
+	struct pci_controller *hose;
+
+void __init sw64_init_irq(void)
+{
+	struct pci_controller *hose;
+
+	/* Scan all of the recorded PCI controllers. */
+	for (hose = hose_head; hose; hose = hose->next)
+		sw64_init_intx(hose);
+}
+
+void __init
+sw64_init_pci(void)
+{
+	common_init_pci();
+}
diff --git a/arch/sw_64/kernel/pci_common.c b/arch/sw_64/kernel/pci_common.c
new file mode 100644
index 000000000000..c8c4bf08a458
--- /dev/null
+++ b/arch/sw_64/kernel/pci_common.c
@@ -0,0 +1,278 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * linux/arch/sw_64/kernel/pci_common.c
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/pci.h>
+#include <linux/gfp.h>
+#include <linux/memblock.h>
+#include <linux/export.h>
+#include <linux/scatterlist.h>
+#include <linux/log2.h>
+#include <linux/dma-mapping.h>
+#include <linux/iommu-helper.h>
+#include <linux/slab.h>
+#include <linux/dma-direct.h>
+#include <linux/swiotlb.h>
+#include <linux/cache.h>
+#include <linux/module.h>
+#include <asm/dma.h>
+#include <asm/io.h>
+
+#include "pci_impl.h"
+
+#define DEBUG_ALLOC 0
+#if DEBUG_ALLOC > 0
+# define DBGA(args...)	printk(KERN_DEBUG args)
+#else
+# define DBGA(args...)
+#endif
+#if DEBUG_ALLOC > 1
+# define DBGA2(args...)	printk(KERN_DEBUG args)
+#else
+# define DBGA2(args...)
+#endif
+
+#define DEBUG_NODIRECT 0
+
+#define ISA_DMA_MASK 0x00ffffff
+
+/*
+ * Map a single buffer of the indicated size for PCI DMA in streaming
+ * mode. The 32-bit PCI bus mastering address to use is returned.
+ * Once the device is given the dma address, the device owns this memory
+ * until either pci_unmap_single or pci_dma_sync_single is performed.
+ */
+
+static dma_addr_t
+pci_direct_map_single_1(struct pci_dev *pdev, void *cpu_addr)
+{
+	struct pci_controller *hose = pdev->sysdata;
+	unsigned long paddr;
+	unsigned long dma_offset;
+
+	if (hose == NULL) {
+		pr_err("%s: hose does not exist!\n", __func__);
+		return 0;
+	}
+
+	dma_offset = read_piu_ior0(hose->node, hose->index, EPDMABAR);
+	paddr = __pa(cpu_addr) + dma_offset;
+	return paddr;
+}
+
+/* Helper for generic DMA-mapping functions. */
+static struct pci_dev *sw64_direct_gendev_to_pci(struct device *dev)
+{
+	if (dev && dev->bus == &pci_bus_type)
+		return to_pci_dev(dev);
+
+	/* This assumes ISA bus master with dma_mask 0xffffff. */
+	return NULL;
+}
+
+static dma_addr_t sw64_direct_map_page(struct device *dev, struct page *page,
+				       unsigned long offset, size_t size,
+				       enum dma_data_direction dir,
+				       unsigned long attrs)
+{
+	struct pci_dev *pdev = sw64_direct_gendev_to_pci(dev);
+
+	if (dir == PCI_DMA_NONE)
+		BUG();
+
+	return pci_direct_map_single_1(pdev, (char *)page_address(page) + offset);
+}
+
+/*
+ * Unmap a single streaming mode DMA translation. The DMA_ADDR and
+ * SIZE must match what was provided for in a previous pci_map_single
+ * call. All other usages are undefined. After this call, reads by
+ * the cpu to the buffer are guaranteed to see whatever the device
+ * wrote there.
+ */
+
+static inline void sw64_direct_unmap_page(struct device *dev, dma_addr_t dma_addr,
+					  size_t size, enum dma_data_direction dir,
+					  unsigned long attrs)
+{
+}
+
+/* Allocate and map kernel buffer using consistent mode DMA for PCI
+ * device. Returns non-NULL cpu-view pointer to the buffer if
+ * successful and sets *DMA_ADDRP to the pci side dma address as well,
+ * else DMA_ADDRP is undefined.
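+ *
+ * In this direct (no-IOMMU) path the bus address is simply
+ * __pa(cpu_addr) plus the per-RC offset read from EPDMABAR, as done in
+ * pci_direct_map_single_1() above.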
+ */ + +static void *sw64_direct_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_addrp, gfp_t gfp, + unsigned long attrs) +{ + struct pci_dev *pdev = sw64_direct_gendev_to_pci(dev); + void *cpu_addr; + long order = get_order(size); + + gfp &= ~GFP_DMA; + +#ifdef CONFIG_ZONE_DMA + if (dev->coherent_dma_mask < DMA_BIT_MASK(32)) + gfp |= GFP_DMA; +#endif + +try_again: + cpu_addr = (void *)__get_free_pages(gfp, order); + if (!cpu_addr) { + pr_info("pci_alloc_consistent: get_free_pages failed from %ps\n", + __builtin_return_address(0)); + /* ??? Really atomic allocation? Otherwise we could play + * with vmalloc and sg if we can't find contiguous memory. + */ + return NULL; + } + memset(cpu_addr, 0, size); + + *dma_addrp = pci_direct_map_single_1(pdev, cpu_addr); + if (*dma_addrp == 0) { + free_pages((unsigned long)cpu_addr, order); + if (gfp & GFP_DMA) + return NULL; + /* The address doesn't fit required mask and we + * do not have iommu. Try again with GFP_DMA. + */ + gfp |= GFP_DMA; + goto try_again; + } + + DBGA2("pci_alloc_consistent: %zx -> [%p,%llx] from %ps\n", + size, cpu_addr, *dma_addrp, __builtin_return_address(0)); + + return cpu_addr; +} + +/* Free and unmap a consistent DMA buffer. CPU_ADDR and DMA_ADDR must + * be values that were returned from pci_alloc_consistent. SIZE must + * be the same as what as passed into pci_alloc_consistent. + * References to the memory and mappings associated with CPU_ADDR or + * DMA_ADDR past this call are illegal. + */ + +static void sw64_direct_free_coherent(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct pci_dev *pdev = sw64_direct_gendev_to_pci(dev); + + pci_unmap_single(pdev, dma_addr, size, PCI_DMA_BIDIRECTIONAL); + free_pages((unsigned long)cpu_addr, get_order(size)); + DBGA2("pci_free_consistent: [%llx,%zx] from %ps\n", + dma_addr, size, __builtin_return_address(0)); +} +#define SG_ENT_VIRT_ADDRESS(SG) (sg_virt((SG))) +#define SG_ENT_PHYS_ADDRESS(SG) __pa(SG_ENT_VIRT_ADDRESS(SG)) + +static dma_addr_t sw64_phys_to_dma(struct device *dev, phys_addr_t pa) +{ + unsigned long dma_offset; + struct pci_dev *pdev = sw64_gendev_to_pci(dev); + struct pci_controller *hose = pdev->sysdata; + + if (hose == NULL) { + pr_err("%s: hose does not exist!\n", __func__); + return 0; + } + + dma_offset = read_piu_ior0(hose->node, hose->index, EPDMABAR); + return pa + dma_offset; +} + +static bool +check_addr(struct device *dev, dma_addr_t dma_addr, size_t size, + const char *caller) +{ + if (unlikely(dev && !dma_capable(dev, dma_addr, size, true))) { + if (!dev->dma_mask) { + dev_err(dev, + "%s: call on device without dma_mask\n", + caller); + return false; + } + + if (*dev->dma_mask >= DMA_BIT_MASK(32)) { + dev_err(dev, + "%s: overflow %pad+%zu of device mask %llx\n", + caller, &dma_addr, size, *dev->dma_mask); + } + return false; + } + return true; +} + +static int sw64_direct_map_sg(struct device *dev, struct scatterlist *sgl, + int nents, enum dma_data_direction dir, unsigned long attrs) +{ + int i; + struct scatterlist *sg; + + for_each_sg(sgl, sg, nents, i) { + BUG_ON(!sg_page(sg)); + + sg_dma_address(sg) = sw64_phys_to_dma(dev, sg_phys(sg)); + if (!check_addr(dev, sg_dma_address(sg), sg->length, __func__)) + return 0; + sg_dma_len(sg) = sg->length; + } + + return nents; +} + +/* Unmap a set of streaming mode DMA translations. Again, cpu read + * rules concerning calls here are the same as for pci_unmap_single() + * above. 
+ */
+
+static inline void sw64_direct_unmap_sg(struct device *dev, struct scatterlist *sg,
+					int nents, enum dma_data_direction dir,
+					unsigned long attrs)
+{
+}
+
+/* Return whether the given PCI device DMA address mask can be
+ * supported properly.
+ */
+
+static int sw64_direct_supported(struct device *dev, u64 mask)
+{
+	struct pci_dev *pdev = sw64_direct_gendev_to_pci(dev);
+	struct pci_controller *hose;
+
+	if ((max_low_pfn << PAGE_SHIFT) - 1 <= mask)
+		return 1;
+
+	/* Check that we have a scatter-gather arena that fits. */
+	hose = pdev->sysdata;
+	if (hose == NULL) {
+		pr_err("%s: hose does not exist!\n", __func__);
+		return 0;
+	}
+
+	/* As last resort try ZONE_DMA. */
+	if (MAX_DMA_ADDRESS - PAGE_OFFSET - 1 <= mask)
+		return 1;
+
+	return 0;
+}
+
+const struct dma_map_ops sw64_dma_direct_ops = {
+	.alloc = sw64_direct_alloc_coherent,
+	.free = sw64_direct_free_coherent,
+	.map_page = sw64_direct_map_page,
+	.unmap_page = sw64_direct_unmap_page,
+	.map_sg = sw64_direct_map_sg,
+	.unmap_sg = sw64_direct_unmap_sg,
+	.dma_supported = sw64_direct_supported,
+};
+
+const struct dma_map_ops *dma_ops = &sw64_dma_direct_ops;
+EXPORT_SYMBOL(dma_ops);
diff --git a/arch/sw_64/kernel/pci_impl.h b/arch/sw_64/kernel/pci_impl.h
new file mode 100644
index 000000000000..0cb6d1b1d1e3
--- /dev/null
+++ b/arch/sw_64/kernel/pci_impl.h
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This file contains declarations and inline functions for interfacing
+ * with the PCI initialization routines.
+ */
+#ifndef _SW64_KERNEL_PCI_IMPL_H
+#define _SW64_KERNEL_PCI_IMPL_H
+
+struct pci_dev;
+struct pci_controller;
+struct pci_iommu_arena;
+
+/*
+ * We can't just blindly use 64K for machines with EISA busses; they
+ * may also have PCI-PCI bridges present, and then we'd configure the
+ * bridge incorrectly.
+ *
+ * Also, we start at 0x8000 or 0x9000, in hopes to get all devices'
+ * IO space areas allocated *before* 0xC000; this is because certain
+ * BIOSes (Millennium for one) use PCI Config space "mechanism #2"
+ * accesses to probe the bus. If a device's registers appear at 0xC000,
+ * it may see an INx/OUTx at that address during BIOS emulation of the
+ * VGA BIOS, and some cards, notably Adaptec 2940UW, take mortal offense.
+ */
+
+#define EISA_DEFAULT_IO_BASE	0x9000	/* start above 8th slot */
+#define DEFAULT_IO_BASE		0x0	/* start at 8th slot */
+
+/*
+ * We try to make the DEFAULT_MEM_BASE addresses *always* have more than
+ * a single bit set. This is so that devices like the broken Myrinet card
+ * will always have a PCI memory address that will never match an IDSEL
+ * address in PCI Config space, which can cause problems with early rev cards.
+ */
+
+#define DEFAULT_MEM_BASE 0
+
+/*
+ * A PCI IOMMU allocation arena. There are typically two of these
+ * regions per bus.
+ * ??? The 8400 has a 32-byte pte entry, and the entire table apparently
+ * lives directly on the host bridge (no tlb?). We don't support this
+ * machine, but if we ever did, we'd need to parameterize all this quite
+ * a bit further. Probably with per-bus operation tables.
+ */ + +struct pci_iommu_arena { + spinlock_t lock; + struct pci_controller *hose; +#define IOMMU_INVALID_PTE 0x2 /* 32:63 bits MBZ */ +#define IOMMU_RESERVED_PTE 0xface + unsigned long *ptes; + dma_addr_t dma_base; + unsigned int size; + unsigned int next_entry; + unsigned int align_entry; +}; + + +/* The hose list. */ +extern struct pci_controller *hose_head, **hose_tail; + +extern void common_init_pci(void); +#define common_swizzle pci_common_swizzle +extern struct pci_controller *alloc_pci_controller(void); +extern struct resource *alloc_resource(void); + +extern unsigned long size_for_memory(unsigned long max); + +extern struct pci_dev *sw64_gendev_to_pci(struct device *dev); +extern const struct dma_map_ops sw64_dma_direct_ops; + +extern struct cma *sw64_kvm_cma; +extern struct gen_pool *sw64_kvm_pool; +#endif diff --git a/arch/sw_64/kernel/pci_iommu.c b/arch/sw_64/kernel/pci_iommu.c new file mode 100644 index 000000000000..79760c4ac6fc --- /dev/null +++ b/arch/sw_64/kernel/pci_iommu.c @@ -0,0 +1,772 @@ +// SPDX-License-Identifier: GPL-2.0 +/* iommu.c: Generic sw_64 IOMMU support for 3231 + */ + +#include <linux/kernel.h> +#include <linux/mm.h> +#include <linux/pci.h> +#include <linux/gfp.h> +#include <linux/memblock.h> +#include <linux/export.h> +#include <linux/scatterlist.h> +#include <linux/log2.h> +#include <linux/dma-mapping.h> +#include <linux/iommu-helper.h> +#include <linux/slab.h> +#include <linux/delay.h> +#include <linux/syscore_ops.h> +#include <linux/swiotlb.h> +#include <linux/cache.h> +#include <linux/module.h> +#include <asm/dma.h> +#include <asm/io.h> +#include <asm/swio.h> +#include <asm/pci.h> + +#include "proto.h" +#include "pci_impl.h" +#include "sw_pci_impl.h" + +#define DEBUG_ALLOC 0 +#if DEBUG_ALLOC > 0 +# define DBGA(args...) printk(KERN_DEBUG args) +#else +# define DBGA(args...) +#endif +#if DEBUG_ALLOC > 1 +# define DBGA2(args...) printk(KERN_DEBUG args) +#else +# define DBGA2(args...) 
+#endif
+
+unsigned long iommu_cmd;
+
+static void sw_iommu_create_new(struct pci_controller *hose, unsigned int error_bus_number,
+				unsigned int error_devfn, unsigned int error_da)
+{
+	unsigned long dtbr;
+	u64 *paddr;
+	u32 ofs;
+	unsigned long dtbbaseaddr, dtbbasecond;
+
+	sw_read_piu_ior0(hose->node, hose->index, DTBASEADDR, &dtbr);
+	dtbr += PAGE_OFFSET;
+	ofs = error_da >> PAGE_SHIFT;
+
+	dtbbaseaddr = dtbr + (error_bus_number << 3);
+	dtbbasecond = (*(u64 *)(dtbbaseaddr)) & (~(SW_IOMMU_ENTRY_VALID)) & PAGE_MASK;
+	dtbbasecond += (error_devfn << 3) + PAGE_OFFSET;
+
+	paddr = (u64 *)get_zeroed_page(GFP_DMA);
+	sw_iommu_map(__pa(paddr), ofs, dtbbasecond, hose, NULL);
+}
+
+irqreturn_t iommu_interrupt(int irq, void *dev)
+{
+	struct pci_controller *hose = (struct pci_controller *)dev;
+	unsigned long iommu_status;
+	unsigned int type, bus_number;
+	unsigned int devfn, error_da;
+
+	sw_read_piu_ior0(hose->node, hose->index, IOMMUEXCPT_STATUS, &iommu_status);
+	if (!(iommu_status >> 63))
+		return IRQ_NONE;
+
+	type = (iommu_status >> 59) & 0x7;
+	bus_number = (iommu_status >> 45) & 0xff;
+	devfn = (iommu_status >> 37) & 0xff;
+	error_da = iommu_status & 0xffffffff;
+
+	if (type == 0x3) {
+		iommu_status &= ~(1UL << 62);
+		iommu_status = iommu_status | (1UL << 63);
+		sw_write_piu_ior0(hose->node, hose->index, IOMMUEXCPT_STATUS, iommu_status);
+		return IRQ_HANDLED;
+	}
+
+	if (type == 0x2)
+		sw_iommu_create_new(hose, bus_number, devfn, error_da);
+
+	udelay(100);
+	sw_write_piu_ior0(hose->node, hose->index, PTLB_FLUSHALL, 0);
+	sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHALL, 0);
+
+	iommu_status = iommu_status | (3UL << 62);
+	sw_write_piu_ior0(hose->node, hose->index, IOMMUEXCPT_STATUS, iommu_status);
+
+	return IRQ_HANDLED;
+}
+
+struct irqaction iommu_irqaction = {
+	.handler = iommu_interrupt,
+	.flags = IRQF_SHARED | IRQF_NO_THREAD,
+	.name = "sw_iommu",
+};
+
+void sw_enable_iommu_func(struct pci_controller *hose)
+{
+	struct irqaction *action;
+	unsigned int iommu_irq;
+	unsigned long iommu_conf, iommu_ctrl;
+
+	iommu_irq = hose->int_irq;
+	action = &iommu_irqaction;
+	action->dev_id = hose;
+	request_irq(iommu_irq, action->handler, action->flags, action->name, action->dev_id);
+	iommu_ctrl = (1UL << 63) | (0x100UL << 10);
+	sw_write_piu_ior0(hose->node, hose->index, IOMMUEXCPT_CTRL, iommu_ctrl);
+	sw_read_piu_ior0(hose->node, hose->index, PIUCONFIG0, &iommu_conf);
+	iommu_conf = iommu_conf | (0x3 << 7);
+	sw_write_piu_ior0(hose->node, hose->index, PIUCONFIG0, iommu_conf);
+	sw_write_piu_ior0(hose->node, hose->index, TIMEOUT_CONFIG, 0xf);
+	sw_read_piu_ior0(hose->node, hose->index, PIUCONFIG0, &iommu_conf);
+	pr_info("SW arch configure node %ld hose-%ld iommu_conf = %#lx\n",
+		hose->node, hose->index, iommu_conf);
+}
+
+struct sw_iommu_dev *pci_to_iommu(struct pci_dev *pdev)
+{
+	struct sw_iommu *iommu;
+	struct pci_controller *hose = (struct pci_controller *)pdev->sysdata;
+	struct sw_iommu_dev *sw_dev;
+	int busnumber, devid;
+
+	iommu = hose->pci_iommu;
+
+	list_for_each_entry(sw_dev, &iommu->dev_list, list) {
+		busnumber = sw_dev->dev_id >> 8;
+		devid = sw_dev->dev_id & 0xff;
+		if ((busnumber == pdev->bus->number) && (devid == pdev->devfn))
+			return sw_dev;
+	}
+
+	return NULL;
+}
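+
+/*
+ * Device table keys: dev_id packs the bus number in bits 15..8 and the
+ * devfn in bits 7..0, matching the decode in pci_to_iommu() above
+ * (e.g. bus 0x03, devfn 0x10 gives dev_id 0x0310).
+ */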
+struct sw_iommu_dev *create_sw_iommu_dev(struct pci_dev *dev, unsigned long *pte, struct sw_iommu *iommu)
+{
+	struct sw_iommu_dev *sw_dev = kzalloc(sizeof(struct sw_iommu_dev), GFP_KERNEL);
+
+	sw_dev->dev_id = (dev->bus->number << 8) + dev->devfn;
+	sw_dev->io_page_base = pte;
+	sw_dev->iommu = iommu;
+
+	list_add_tail(&sw_dev->list, &iommu->dev_list);
+	return sw_dev;
+}
+
+void __sw_pci_iommu_dte_alloc(struct pci_bus *bus, struct sw_iommu *iommu)
+{
+	struct pci_dev *dev;
+	struct sw_iommu_dev *iommu_dev;
+	unsigned long *pte;
+	u64 *dte;
+	u64 dtebaseaddr;
+	u64 dtentry;
+	u64 dtebaseaddr2, ptentry;
+
+	dtebaseaddr = (unsigned long)iommu->iommu_dtbr + (bus->number << 3);
+	dte = (u64 *)get_zeroed_page(GFP_KERNEL);
+	dtentry = (__pa(dte) & PAGE_MASK) | (1UL << 63);
+	*(u64 *)dtebaseaddr = dtentry;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL) {
+			pte = (unsigned long *)get_zeroed_page(GFP_KERNEL);
+			dtebaseaddr2 = ((unsigned long)dte & PAGE_MASK) + ((dev->devfn) << 3);
+			iommu_dev = create_sw_iommu_dev(dev, pte, iommu);
+			ptentry = (__pa(pte) & PAGE_MASK) | (1UL << 63);
+			iommu_dev->iommu_bypass = 0;
+			*(u64 *)dtebaseaddr2 = ptentry;
+			/* legacy VGA frame buffer has occupied 0xA0000-0xBFFFF memory segment */
+			iommu_dev->iommu_area = sw_iommu_area_new(iommu_dev, 0x100000UL);
+		} else if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+			struct pci_bus *b = dev->subordinate;
+
+			if (b)
+				__sw_pci_iommu_dte_alloc(b, iommu);
+		}
+	}
+}
+
+static int iommu_cpu_suspend(void)
+{
+	return 0;
+}
+
+static void iommu_cpu_resume(void)
+{
+}
+
+struct syscore_ops iommu_cpu_syscore_ops = {
+	.suspend = iommu_cpu_suspend,
+	.resume = iommu_cpu_resume,
+};
+
+int sw_iommu_init(struct pci_controller *hose)
+{
+	struct sw_iommu *iommu;
+	unsigned long base;
+	unsigned long rc_mask = 0x1;
+
+	rc_mask <<= (8 * hose->node + hose->index);
+	if (!(iommu_cmd & rc_mask))
+		return 0;
+	sw_write_piu_ior0(hose->node, hose->index, DTLB_FLUSHALL, 0);
+	sw_write_piu_ior0(hose->node, hose->index, PTLB_FLUSHALL, 0);
+	sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHALL, 0);
+	hose->pci_iommu = kzalloc(sizeof(struct sw_iommu), GFP_KERNEL);
+	if (!hose->pci_iommu) {
+		pr_err("Can't alloc memory for pci_iommu!\n");
+		return 0;
+	}
+	iommu = hose->pci_iommu;
+	spin_lock_init(&iommu->dt_lock);
+	iommu->index = hose->index;
+	iommu->enabled = true;
+	iommu->iommu_dtbr = (unsigned long *)get_zeroed_page(GFP_KERNEL);
+	base = __pa(iommu->iommu_dtbr) & PAGE_MASK;
+	sw_write_piu_ior0(hose->node, hose->index, DTBASEADDR, base);
+	INIT_LIST_HEAD(&iommu->dev_list);
+	__sw_pci_iommu_dte_alloc(hose->bus, iommu);
+	sw_write_piu_ior0(hose->node, hose->index, DTLB_FLUSHALL, 0);
+	sw_write_piu_ior0(hose->node, hose->index, PTLB_FLUSHALL, 0);
+	sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHALL, 0);
+	sw_enable_iommu_func(hose);
+	hose->iommu_enable = true;
+
+	return 0;
+}
+
+struct sw_pci_dev_iommu_area *sw_iommu_area_new(struct sw_iommu_dev *iommu_dev, dma_addr_t base)
+{
+	struct sw_pci_dev_iommu_area *iommu_area = kzalloc(sizeof(struct sw_pci_dev_iommu_area), GFP_KERNEL);
+
+	if (!iommu_area) {
+		pr_err("SW arch could not allocate pci iommu dma_area.\n");
+		return NULL;
+	}
+
+	spin_lock_init(&iommu_area->lock);
+	iommu_area->iommu = iommu_dev->iommu;
+	iommu_area->dma_base = base;
+	iommu_area->bitmap = (void *)__get_free_pages(GFP_KERNEL, 3);
+	if (!iommu_area->bitmap) {
+		kfree(iommu_area);
+		pr_err("SW arch could not allocate dma_area->bitmap.\n");
+		return NULL;
+	}
+	memset(iommu_area->bitmap, 0, 8 * PAGE_SIZE);
+	iommu_area->next_address = 0;
+	return iommu_area;
+}
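+
+/*
+ * Layout of the walk used by sw_iommu_map()/sw_iommu_unmap() below:
+ * with dma_ofs the IOVA page frame number, the level-1 slot is
+ * (dma_ofs >> 10) & SW_IOMMU_LEVEL1_OFFSET and the level-2 slot is
+ * dma_ofs & SW_IOMMU_LEVEL2_OFFSET; every slot is an 8-byte entry
+ * tagged with SW_IOMMU_ENTRY_VALID.
+ */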
+
+/**
+ * sw_iommu_map - install one IOVA->physical page translation
+ * @paddr: physical address of the buffer page to map for PCI DMA
+ * @dma_ofs: virtual DMA buffer page frame number allocated from pdev's private DMA zone
+ * @dtbaddr: device table entry address for level 2
+ * @hose: the PCI controller (PCIe host) the device sits behind
+ * @pdev: the PCI device being mapped (NULL for fault-time fixups)
+ */
+int sw_iommu_map(unsigned long paddr, long dma_ofs, unsigned long dtbaddr,
+		 struct pci_controller *hose, struct pci_dev *pdev)
+{
+	unsigned long pde, pte;	/* pde: level-2 page table base addr; pte: page table entry */
+	unsigned long pdebaseaddr;
+	u64 *ptebasesecond, ptebaseaddr;	/* ptebasesecond: level-2 page table pointer */
+	unsigned long pcache_flush_addr;
+
+	pdebaseaddr = ((dma_ofs >> 10) & SW_IOMMU_LEVEL1_OFFSET) << 3;	/* offset of the level-1 page table entry */
+	pdebaseaddr += ((*(volatile u64 *)dtbaddr) & (~(SW_IOMMU_ENTRY_VALID)) & (PAGE_MASK)) + PAGE_OFFSET;
+	pte = (paddr & PAGE_MASK) | SW_IOMMU_ENTRY_VALID | SW_IOMMU_GRN | SW_IOMMU_ENABLE;
+
+	/* If pde exists, no need to allocate a new page */
+	if ((*(volatile u64 *)pdebaseaddr) & SW_IOMMU_ENTRY_VALID) {
+		ptebaseaddr = ((*(volatile u64 *)pdebaseaddr) & (~(SW_IOMMU_ENTRY_VALID)) & (PAGE_MASK)) + PAGE_OFFSET;
+		ptebaseaddr += (dma_ofs & SW_IOMMU_LEVEL2_OFFSET) << 3;
+
+		pcache_flush_addr = __pa(ptebaseaddr) & 0xffffffff80;
+
+		*(volatile u64 *)ptebaseaddr = pte;
+		mb();
+		sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHPADDR, pcache_flush_addr);
+	} else {
+		ptebasesecond = (u64 *)get_zeroed_page(GFP_ATOMIC);
+
+		if (!ptebasesecond) {
+			pr_err("allocating level-2 page table page failed\n");
+			return -ENOMEM;
+		}
+		pde = (__pa(ptebasesecond) & PAGE_MASK) | SW_IOMMU_ENTRY_VALID;
+
+		pcache_flush_addr = __pa(pdebaseaddr) & 0xffffffff80;
+
+		*(volatile u64 *)pdebaseaddr = pde;
+		mb();
+		sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHPADDR, pcache_flush_addr);
+
+		ptebaseaddr = (unsigned long)ptebasesecond + ((dma_ofs & SW_IOMMU_LEVEL2_OFFSET) << 3);
+
+		pcache_flush_addr = __pa(ptebaseaddr) & 0xffffffff80;
+
+		*(volatile u64 *)ptebaseaddr = pte;
+		mb();
+		sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHPADDR, pcache_flush_addr);
+	}
+
+	return 0;
+}
+
+static unsigned long
+sw_iommu_area_alloc(struct sw_pci_dev_iommu_area *area, unsigned int pages,
+		    unsigned long start)
+{
+	unsigned long next_bit = start >> PAGE_SHIFT;
+	unsigned long address = -1;
+	unsigned long boundary_size = ((4UL << 30)) >> PAGE_SHIFT;
+	unsigned long limit = boundary_size - (1UL << 17) - (area->dma_base >> PAGE_SHIFT);
+
+	address = iommu_area_alloc(area->bitmap, limit, next_bit, pages, 0, boundary_size, 0);
+	if (address != -1) {
+		address = address << PAGE_SHIFT;
+		area->next_address = address + (pages << PAGE_SHIFT);
+	}
+
+	return address;
+}
+
+static long
+sw_iommu_alloc_da(struct sw_pci_dev_iommu_area *area, long n)
+{
+	unsigned long address;
+
+	address = sw_iommu_area_alloc(area, n, area->next_address);
+	if (address == -1) {
+		area->next_address = 0;
+		address = sw_iommu_area_alloc(area, n, area->next_address);
+		if (address == -1)
+			pr_err("SW arch failed to allocate device address.\n");
+	}
+
+	return address;
+}
+
+static void sw_iommu_free_da(unsigned long *map, long dma_ofs, long n)
+{
+	bitmap_clear(map, dma_ofs, n);
+}
+
+static void sw_iommu_unmap(struct pci_dev *pdev, long ofs)
+{
+	unsigned long dtbbaseaddr, dtbbasecond;	/* device table base addr, level 1 */
+	/* dtbbasecond: device table base addr, level 2 */
+	unsigned long pde, pte;	/* pde: level-2 page table base addr */
+	/* pte: page table entry */
+	unsigned long tlb_flush_addr, pcache_flush_addr;
+	unsigned long addr;
+	unsigned long pdebaseaddr;	/* level-1 page table pointer */
+	unsigned long ptebaseaddr;	/* level-2 page table pointer */
+	unsigned long ptebaseaddr_full;	/* last slot of the level-2 page */
+	unsigned long ptebaseaddr_offset;
+	struct pci_controller *hose = (struct pci_controller *)pdev->sysdata;
+	int i;
+	u64 per_pte;
+	struct sw_iommu *sw_pci_iommu = hose->pci_iommu;
+
+	addr = (unsigned long)sw_pci_iommu->iommu_dtbr;
+	dtbbaseaddr = addr + (pdev->bus->number << 3);
+
+	dtbbasecond = (*(volatile u64 *)dtbbaseaddr) & (~(SW_IOMMU_ENTRY_VALID)) & PAGE_MASK;
+	dtbbasecond += (pdev->devfn << 3) + PAGE_OFFSET;
+
+	pdebaseaddr = ((*(volatile u64 *)dtbbasecond) & (~(SW_IOMMU_ENTRY_VALID)) & (PAGE_MASK)) + PAGE_OFFSET;
+	pdebaseaddr += ((ofs >> 10) & SW_IOMMU_LEVEL1_OFFSET) << 3;
+
+	pde = *(volatile u64 *)(pdebaseaddr);
+	ptebaseaddr = (pde & (~(SW_IOMMU_ENTRY_VALID)) & PAGE_MASK) + PAGE_OFFSET;
+	ptebaseaddr_offset = ptebaseaddr + ((ofs & SW_IOMMU_LEVEL2_OFFSET) << 3);
+
+	tlb_flush_addr = (pdev->bus->number << 8) | pdev->devfn | (ofs << 16);
+	sw_write_piu_ior0(hose->node, hose->index, PTLB_FLUSHVADDR, tlb_flush_addr);	/* TLB flush */
+
+	pte = *(volatile u64 *)(ptebaseaddr_offset);
+	pte &= ~(SW_IOMMU_ENTRY_VALID);	/* disable the page table entry */
+	pcache_flush_addr = __pa(ptebaseaddr_offset) & 0xffffffff80;
+
+	*(volatile u64 *)(ptebaseaddr_offset) = pte;
+	mb();
+	sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHPADDR, pcache_flush_addr);
+
+	ptebaseaddr_full = ptebaseaddr + 0x1ff8;
+	if (ptebaseaddr_offset == ptebaseaddr_full) {
+		for (i = 0; i < 1024; i++) {
+			per_pte = *(volatile u64 *)(ptebaseaddr + i * 8);
+			if (per_pte & SW_IOMMU_ENTRY_VALID)
+				break;
+		}
+		if (i == 1024) {
+			free_page(ptebaseaddr);
+			pde &= ~(SW_IOMMU_ENTRY_VALID);
+
+			pcache_flush_addr = __pa(pdebaseaddr) & 0xffffffff80;
+
+			*(volatile u64 *)(pdebaseaddr) = pde;
+			mb();
+			sw_write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHPADDR, pcache_flush_addr);
+		}
+	}
+}
+
+static dma_addr_t __sw_map_single(struct pci_dev *pdev, unsigned long paddr,
+				  struct sw_pci_dev_iommu_area *iommu_area, size_t size)
+{
+	long npages, dma_ofs, i, ofs;
+	unsigned long dtbbaseaddr;	/* device table base addr, level 1 */
+	unsigned long dtbbasecond;	/* device table base addr, level 2 */
+	unsigned long addr;
+	dma_addr_t ret = -1;
+	struct pci_controller *hose = pdev->sysdata;
+	struct sw_iommu *sw_pci_iommu;
+	unsigned long flags;
+
+	if (hose == NULL) {
+		pr_err("%s: hose does not exist!\n", __func__);
+		return 0;
+	}
+
+	sw_pci_iommu = hose->pci_iommu;
+	addr = (unsigned long)sw_pci_iommu->iommu_dtbr;
+	dtbbaseaddr = addr + (pdev->bus->number << 3);
+
+	dtbbasecond = (*(volatile u64 *)dtbbaseaddr) & ~(SW_IOMMU_ENTRY_VALID) & PAGE_MASK;
+	dtbbasecond += (pdev->devfn << 3) + PAGE_OFFSET;
+	npages = iommu_num_pages(paddr, size, PAGE_SIZE);
+
+	if (hose->iommu_enable) {
+		spin_lock_irqsave(&iommu_area->lock, flags);
+
+		dma_ofs = sw_iommu_alloc_da(iommu_area, npages);
+		if (dma_ofs == -1) {
+			pr_warn("%s %s failed: could not allocate dma page tables\n",
+				pci_name(pdev), __func__);
+			spin_unlock_irqrestore(&iommu_area->lock, flags);
+			return 0;
+		}
+
+		ret = iommu_area->dma_base + dma_ofs;
+
+		for (i = 0; i < npages; ++i, paddr += PAGE_SIZE) {
+			ofs = (ret >> PAGE_SHIFT) + i;
+			sw_iommu_map(paddr, ofs, dtbbasecond, hose, pdev);
+		}
+
+		spin_unlock_irqrestore(&iommu_area->lock, flags);
+
+		ret += paddr & ~PAGE_MASK;
+	}
+
+	return ret;
+}
+
+/*
+ * Map a single buffer of the indicated size for PCI DMA in streaming
+ * mode. The 32-bit PCI bus mastering address to use is returned.
+ * Once the device is given the dma address, the device owns this memory
+ * until either pci_unmap_single or pci_dma_sync_single is performed.
+ */
+
+static dma_addr_t
+pci_iommu_map_single(struct pci_dev *pdev, void *cpu_addr, size_t size)
+{
+	struct pci_controller *hose = pdev->sysdata;
+	unsigned long paddr;
+
+	if (hose == NULL) {
+		pr_err("%s: hose does not exist!\n", __func__);
+		return 0;
+	}
+
+	if (!hose->iommu_enable) {
+		unsigned long dma_offset;
+
+		sw_read_piu_ior0(hose->node, hose->index, EPDMABAR, &dma_offset);
+		paddr = __pa(cpu_addr) + dma_offset;
+	} else {
+		struct sw_pci_dev_iommu_area *iommu_area;
+		struct sw_iommu_dev *sw_dev = pci_to_iommu(pdev);
+
+		paddr = __pa(cpu_addr);
+		iommu_area = sw_dev->iommu_area;
+		if (!iommu_area) {
+			pr_err("SW arch get iommu_area error!\n");
+			return 0;
+		}
+
+		paddr = __sw_map_single(pdev, paddr, iommu_area, size);
+	}
+
+	return paddr;
+}
+
+static dma_addr_t sw_iommu_map_page(struct device *dev, struct page *page,
+				    unsigned long offset, size_t size,
+				    enum dma_data_direction dir,
+				    unsigned long attrs)
+{
+	struct pci_dev *pdev = sw_gendev_to_pci(dev);
+
+	if (dir == PCI_DMA_NONE)
+		BUG();
+
+	return pci_iommu_map_single(pdev, (char *)page_address(page) + offset, size);
+}
+
+/*
+ * Unmap a single streaming mode DMA translation. The DMA_ADDR and
+ * SIZE must match what was provided for in a previous pci_map_single
+ * call. All other usages are undefined. After this call, reads by
+ * the cpu to the buffer are guaranteed to see whatever the device
+ * wrote there.
+ */
+
+static void sw_iommu_unmap_page(struct device *dev, dma_addr_t dma_addr,
+				size_t size, enum dma_data_direction dir,
+				unsigned long attrs)
+{
+	struct pci_dev *pdev = sw_gendev_to_pci(dev);
+	struct pci_controller *hose = pdev->sysdata;
+	struct sw_iommu_dev *sw_dev;
+	struct sw_pci_dev_iommu_area *iommu_area;
+	long dma_ofs, npages, ofs;
+	unsigned long flags;
+	int i;
+
+	if (hose == NULL) {
+		pr_err("%s: hose does not exist!\n", __func__);
+		return;
+	}
+
+	if (!hose->iommu_enable)
+		return;
+
+	sw_dev = pci_to_iommu(pdev);
+	iommu_area = sw_dev->iommu_area;
+	dma_ofs = (dma_addr - iommu_area->dma_base) >> PAGE_SHIFT;
+	npages = iommu_num_pages(dma_addr, size, PAGE_SIZE);
+
+	spin_lock_irqsave(&iommu_area->lock, flags);
+
+	for (i = 0; i < npages; ++i) {
+		ofs = (dma_addr >> PAGE_SHIFT) + i;
+		sw_iommu_unmap(pdev, ofs);
+	}
+
+	sw_iommu_free_da(iommu_area->bitmap, dma_ofs, npages);
+	spin_unlock_irqrestore(&iommu_area->lock, flags);
+}
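+
+/*
+ * Teardown above is strictly per IOVA page: each page's PTE is
+ * invalidated and flushed in sw_iommu_unmap(), then the whole IOVA
+ * range goes back to the per-device bitmap allocator.
+ */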
+
+/*
+ * Allocate and map kernel buffer using consistent mode DMA for PCI
+ * device. Returns non-NULL cpu-view pointer to the buffer if
+ * successful and sets *DMA_ADDRP to the pci side dma address as well,
+ * else DMA_ADDRP is undefined.
+ */
+static void *sw_iommu_alloc_coherent(struct device *dev, size_t size,
+				     dma_addr_t *dma_addrp, gfp_t gfp,
+				     unsigned long attrs)
+{
+	struct pci_dev *pdev = sw_gendev_to_pci(dev);
+	void *cpu_addr;
+	long order = get_order(size);
+
+	gfp &= ~GFP_DMA;
+
+try_again:
+	cpu_addr = (void *)__get_free_pages(gfp | __GFP_ZERO, order);
+	if (!cpu_addr) {
+		pr_info("pci_alloc_consistent: get_free_pages failed from %ps\n",
+			__builtin_return_address(0));
+		/* ??? Really atomic allocation? Otherwise we could play
+		 * with vmalloc and sg if we can't find contiguous memory.
+		 */
+		return NULL;
+	}
+
+	*dma_addrp = pci_iommu_map_single(pdev, cpu_addr, size);
+	if (*dma_addrp == 0) {
+		free_pages((unsigned long)cpu_addr, order);
+		if (gfp & GFP_DMA)
+			return NULL;
+		/* The address doesn't fit required mask and we
+		 * do not have iommu. Try again with GFP_DMA.
+		 */
+		gfp |= GFP_DMA;
+		goto try_again;
+	}
+
+	DBGA2("pci_alloc_consistent: %zx -> [%p,%llx] from %ps\n",
+	      size, cpu_addr, *dma_addrp, __builtin_return_address(0));
+
+	return cpu_addr;
+}
+
+/* Free and unmap a consistent DMA buffer. CPU_ADDR and DMA_ADDR must
+ * be values that were returned from pci_alloc_consistent. SIZE must
+ * be the same as what as passed into pci_alloc_consistent.
+ * References to the memory and mappings associated with CPU_ADDR or
+ * DMA_ADDR past this call are illegal.
+ */
+
+static void sw_iommu_free_coherent(struct device *dev, size_t size,
+				   void *cpu_addr, dma_addr_t dma_addr,
+				   unsigned long attrs)
+{
+	struct pci_dev *pdev = sw_gendev_to_pci(dev);
+
+	pci_unmap_single(pdev, dma_addr, size, PCI_DMA_BIDIRECTIONAL);
+	free_pages((unsigned long)cpu_addr, get_order(size));
+
+	DBGA2("pci_free_consistent: [%llx,%zx] from %ps\n",
+	      dma_addr, size, __builtin_return_address(0));
+}
+
+#define SG_ENT_VIRT_ADDRESS(SG) (sg_virt((SG)))
+#define SG_ENT_PHYS_ADDRESS(SG) __pa(SG_ENT_VIRT_ADDRESS(SG))
+
+static int sw_iommu_map_sg(struct device *dev, struct scatterlist *sgl,
+			   int nents, enum dma_data_direction dir,
+			   unsigned long attrs)
+{
+	int i;
+	struct scatterlist *sg;
+	struct pci_dev *pdev = sw_gendev_to_pci(dev);
+	int out_nents = 0;
+
+	if (dir == PCI_DMA_NONE)
+		BUG();
+
+	for_each_sg(sgl, sg, nents, i) {
+		BUG_ON(!sg_page(sg));
+
+		sg_dma_address(sg) = pci_iommu_map_single(pdev, SG_ENT_VIRT_ADDRESS(sg), sg->length);
+		if (sg_dma_address(sg) == 0)
+			goto error;
+		sg_dma_len(sg) = sg->length;
+		out_nents++;
+	}
+
+	return nents;
+
+error:
+	pr_warn("pci_map_sg failed: could not allocate dma page tables\n");
+
+	/* Some allocation failed while mapping the scatterlist
+	 * entries. Unmap them now.
+	 */
+	if (out_nents)
+		pci_unmap_sg(pdev, sgl, out_nents, dir);
+	return 0;
+}
+
+/*
+ * Unmap a set of streaming mode DMA translations. Again, cpu read
+ * rules concerning calls here are the same as for pci_unmap_single()
+ * above.
+ */
+static void sw_iommu_unmap_sg(struct device *dev, struct scatterlist *sgl,
+			      int nents, enum dma_data_direction dir,
+			      unsigned long attrs)
+{
+	struct pci_dev *pdev = sw_gendev_to_pci(dev);
+	struct pci_controller *hose = pdev->sysdata;
+	struct scatterlist *sg;
+	int i, j;
+	dma_addr_t dma_addr;
+	struct sw_pci_dev_iommu_area *iommu_area;
+	struct sw_iommu_dev *sw_dev;
+	long dma_ofs, npages, ofs, size;
+	unsigned long flags;
+
+	if (hose == NULL) {
+		pr_err("%s: hose does not exist!\n", __func__);
+		return;
+	}
+
+	if (!hose->iommu_enable)
+		return;
+
+	sw_dev = pci_to_iommu(pdev);
+	iommu_area = sw_dev->iommu_area;
+	for_each_sg(sgl, sg, nents, j) {
+		BUG_ON(!sg_page(sg));
+		dma_addr = sg->dma_address;
+		size = sg->dma_length;
+		if (!size)
+			break;
+		npages = iommu_num_pages(dma_addr, size, PAGE_SIZE);
+		dma_ofs = (dma_addr - iommu_area->dma_base) >> PAGE_SHIFT;
+
+		spin_lock_irqsave(&iommu_area->lock, flags);
+		for (i = 0; i < npages; ++i) {
+			ofs = (dma_addr >> PAGE_SHIFT) + i;
+			sw_iommu_unmap(pdev, ofs);
+		}
+
+		sw_iommu_free_da(iommu_area->bitmap, dma_ofs, npages);
+
+		spin_unlock_irqrestore(&iommu_area->lock, flags);
+	}
+}
+
+/* Return whether the given PCI device DMA address mask can be
+ * supported properly.
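+ * With translation enabled there is no per-device window check here;
+ * anything reachable through ZONE_DMA is accepted.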
+ */
+
+static int sw_iommu_supported(struct device *dev, u64 mask)
+{
+	/* As last resort try ZONE_DMA. */
+	if (MAX_DMA_ADDRESS - PAGE_OFFSET - 1 <= mask)
+		return 1;
+
+	return 0;
+}
+
+static int sw_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+	return dma_addr == 0;
+}
+
+static int iommu_get_option(char **str, unsigned long *pint)
+{
+	char *cur = *str;
+
+	if (!cur || !(*cur))
+		return 0;
+	*pint = simple_strtoul(cur, str, 16);
+
+	return 1;
+}
+
+static int __init iommu_enable_setup(char *s)
+{
+	unsigned long rc_bitmap = 0;
+
+	iommu_get_option(&s, &rc_bitmap);
+	iommu_cmd = rc_bitmap;
+
+	return 1;
+}
+__setup("iommu_enable=", iommu_enable_setup);
+
+const struct dma_map_ops sw_iommu_dma_ops = {
+	.alloc = sw_iommu_alloc_coherent,
+	.free = sw_iommu_free_coherent,
+	.map_page = sw_iommu_map_page,
+	.unmap_page = sw_iommu_unmap_page,
+	.map_sg = sw_iommu_map_sg,
+	.unmap_sg = sw_iommu_unmap_sg,
+	.mapping_error = sw_iommu_mapping_error,
+	.dma_supported = sw_iommu_supported,
+};
+
+const struct dma_map_ops *dma_ops = &sw_iommu_dma_ops;
+EXPORT_SYMBOL(dma_ops);
diff --git a/arch/sw_64/kernel/perf_event.c b/arch/sw_64/kernel/perf_event.c
new file mode 100644
index 000000000000..dac979d4b09a
--- /dev/null
+++ b/arch/sw_64/kernel/perf_event.c
@@ -0,0 +1,763 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Performance events support for SW64 platforms.
+ *
+ * This code is based upon riscv and sparc perf event code.
+ */
+
+#include <linux/perf_event.h>
+#include <linux/kprobes.h>
+#include <linux/kernel.h>
+#include <linux/kdebug.h>
+#include <linux/mutex.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+#include <linux/atomic.h>
+#include <asm/irq.h>
+#include <asm/irq_regs.h>
+#include <asm/hmcall.h>
+#include <asm/hw_irq.h>
+
+/* For tracking PMCs and the hw events they monitor on each CPU. */
+struct cpu_hw_events {
+	/* Number of events currently scheduled onto this cpu.
+	 * This tells how many entries in the arrays below
+	 * are valid.
+	 */
+	int n_events;
+	/* Track counter usage of each counter */
+#define PMC_IN_USE	1
+#define PMC_NOT_USE	0
+	int pmcs[MAX_HWEVENTS];
+	/* Array of events current scheduled on this cpu. */
+	struct perf_event *event[MAX_HWEVENTS];
+};
+
+DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events);
+
+static void sw64_pmu_start(struct perf_event *event, int flags);
+static void sw64_pmu_stop(struct perf_event *event, int flags);
+
+struct sw64_perf_event {
+	/* pmu index */
+	int counter;
+	/* events selector */
+	int event;
+};
+
+/*
+ * A structure to hold the description of the PMCs available on a particular
+ * type of SW64 CPU.
+ */
+struct sw64_pmu_t {
+	/* generic hw/cache events table */
+	const struct sw64_perf_event *hw_events;
+	const struct sw64_perf_event (*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+				     [PERF_COUNT_HW_CACHE_OP_MAX]
+				     [PERF_COUNT_HW_CACHE_RESULT_MAX];
+
+	/* method used to map hw/cache events */
+	const struct sw64_perf_event *(*map_hw_event)(u64 config);
+	const struct sw64_perf_event *(*map_cache_event)(u64 config);
+
+	/* The number of entries in the hw_event_map */
+	int max_events;
+
+	/* The number of counters on this pmu */
+	int num_pmcs;
+
+	/*
+	 * All PMC counters reside in the IBOX register PCTR. This is the
+	 * LSB of the counter.
+	 */
+	int pmc_count_shift[MAX_HWEVENTS];
+
+	/*
+	 * The mask that isolates the PMC bits when the LSB of the counter
+	 * is shifted to bit 0.
+	 */
+	unsigned long pmc_count_mask;
+
+	/* The maximum period the PMC can count.
*/ + unsigned long pmc_max_period; + + /* + * The maximum value that may be written to the counter due to + * hardware restrictions is pmc_max_period - pmc_left. + */ + long pmc_left; + + /* Subroutine for checking validity of a raw event for this PMU. */ + bool (*raw_event_valid)(u64 config); +}; + +/* + * The SW64 PMU description currently in operation. This is set during + * the boot process to the specific CPU of the machine. + */ +static const struct sw64_pmu_t *sw64_pmu; + +/* + * SW64 PMC event types + * + * There is no one-to-one mapping of the possible hw event types to the + * actual codes that are used to program the PMCs hence we introduce our + * own hw event type identifiers. + */ +#define SW64_OP_UNSUP {-1, -1} + +/* Mapping of the hw event types to the perf tool interface */ +static const struct sw64_perf_event core3_hw_event_map[] = { + [PERF_COUNT_HW_CPU_CYCLES] = {PERFMON_PC0, PC0_CPU_CYCLES}, + [PERF_COUNT_HW_INSTRUCTIONS] = {PERFMON_PC0, PC0_INSTRUCTIONS}, + [PERF_COUNT_HW_CACHE_REFERENCES] = {PERFMON_PC0, PC0_SCACHE_REFERENCES}, + [PERF_COUNT_HW_CACHE_MISSES] = {PERFMON_PC1, PC1_SCACHE_MISSES}, + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = {PERFMON_PC0, PC0_BRANCH_INSTRUCTIONS}, + [PERF_COUNT_HW_BRANCH_MISSES] = {PERFMON_PC1, PC1_BRANCH_MISSES}, +}; + +/* Mapping of the hw cache event types to the perf tool interface */ +#define C(x) PERF_COUNT_HW_CACHE_##x +static const struct sw64_perf_event core3_cache_event_map + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { + [C(L1D)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = {PERFMON_PC0, PC0_DCACHE_READ}, + [C(RESULT_MISS)] = {PERFMON_PC1, PC1_DCACHE_MISSES} + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + [C(L1I)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = {PERFMON_PC0, PC0_ICACHE_READ}, + [C(RESULT_MISS)] = {PERFMON_PC1, PC1_ICACHE_READ_MISSES}, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + [C(LL)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + [C(DTLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = {PERFMON_PC0, PC0_DTB_READ}, + [C(RESULT_MISS)] = {PERFMON_PC1, PC1_DTB_SINGLE_MISSES}, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + [C(ITLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = {PERFMON_PC0, PC0_ITB_READ}, + [C(RESULT_MISS)] = {PERFMON_PC1, PC1_ITB_MISSES}, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + [C(BPU)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 
SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + [C(NODE)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = SW64_OP_UNSUP, + [C(RESULT_MISS)] = SW64_OP_UNSUP, + }, + }, + +}; + +static const struct sw64_perf_event *core3_map_hw_event(u64 config) +{ + return &sw64_pmu->hw_events[config]; +} + +static const struct sw64_perf_event *core3_map_cache_event(u64 config) +{ + unsigned int cache_type, cache_op, cache_result; + const struct sw64_perf_event *perf_event; + + cache_type = (config >> 0) & 0xff; + if (cache_type >= PERF_COUNT_HW_CACHE_MAX) + return ERR_PTR(-EINVAL); + + cache_op = (config >> 8) & 0xff; + if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX) + return ERR_PTR(-EINVAL); + + cache_result = (config >> 16) & 0xff; + if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX) + return ERR_PTR(-EINVAL); + + perf_event = &((*sw64_pmu->cache_events)[cache_type][cache_op][cache_result]); + if (perf_event->counter == -1) /* SW64_OP_UNSUP */ + return ERR_PTR(-ENOENT); + + return perf_event; +} + +/* + * r0xx for counter0, r1yy for counter1. + * According to the datasheet, 00 <= xx <= 0F, 00 <= yy <= 37 + */ +static bool core3_raw_event_valid(u64 config) +{ + if ((config >= (PC0_RAW_BASE + PC0_MIN) && config <= (PC0_RAW_BASE + PC0_MAX)) || + (config >= (PC1_RAW_BASE + PC1_MIN) && config <= (PC1_RAW_BASE + PC1_MAX))) { + return true; + } + + pr_info("sw64 pmu: invalid raw event config %#llx\n", config); + return false; +} + +static const struct sw64_pmu_t core3_pmu = { + .max_events = ARRAY_SIZE(core3_hw_event_map), + .hw_events = core3_hw_event_map, + .map_hw_event = core3_map_hw_event, + .cache_events = &core3_cache_event_map, + .map_cache_event = core3_map_cache_event, + .num_pmcs = MAX_HWEVENTS, + .pmc_count_mask = PMC_COUNT_MASK, + .pmc_max_period = PMC_COUNT_MASK, + .pmc_left = 4, + .raw_event_valid = core3_raw_event_valid, +}; + +/* + * Low-level functions: reading/writing counters + */ +static void sw64_write_pmc(int idx, unsigned long val) +{ + if (idx == PERFMON_PC0) + wrperfmon(PERFMON_CMD_WRITE_PC0, val); + else + wrperfmon(PERFMON_CMD_WRITE_PC1, val); +} + +static unsigned long sw64_read_pmc(int idx) +{ + unsigned long val; + + if (idx == PERFMON_PC0) + val = wrperfmon(PERFMON_CMD_READ, PERFMON_READ_PC0); + else + val = wrperfmon(PERFMON_CMD_READ, PERFMON_READ_PC1); + return val; +} + +/* Set a new period to sample over */ +static int sw64_perf_event_set_period(struct perf_event *event, + struct hw_perf_event *hwc, int idx) +{ + long left = local64_read(&hwc->period_left); + long period = hwc->sample_period; + int ret = 0; + + if (unlikely(left <= -period)) { + left = period; + local64_set(&hwc->period_left, left); + hwc->last_period = period; + ret = 1; + } + + if (unlikely(left <= 0)) { + left += period; + local64_set(&hwc->period_left, left); + hwc->last_period = period; + ret = 1; + } + + if (left > (long)sw64_pmu->pmc_max_period) + left = sw64_pmu->pmc_max_period; + + local64_set(&hwc->prev_count, (unsigned long)(-left)); + sw64_write_pmc(idx, (unsigned long)(sw64_pmu->pmc_max_period - left)); + + perf_event_update_userpage(event); + + return ret; +} + +/* + * Calculates the count (the 'delta') since the last time the PMC was read. 
+ * + * As the PMCs' full period can easily be exceeded within the perf system + * sampling period we cannot use any high order bits as a guard bit in the + * PMCs to detect overflow as is done by other architectures. The code here + * calculates the delta on the basis that there is no overflow when ovf is + * zero. The value passed via ovf by the interrupt handler corrects for + * overflow. + * + * This can be racey on rare occasions -- a call to this routine can occur + * with an overflowed counter just before the PMI service routine is called. + * The check for delta negative hopefully always rectifies this situation. + */ +static unsigned long sw64_perf_event_update(struct perf_event *event, + struct hw_perf_event *hwc, int idx, long ovf) +{ + long prev_raw_count, new_raw_count; + long delta; + +again: + prev_raw_count = local64_read(&hwc->prev_count); + new_raw_count = sw64_read_pmc(idx); + + if (local64_cmpxchg(&hwc->prev_count, prev_raw_count, + new_raw_count) != prev_raw_count) + goto again; + + delta = (new_raw_count - (prev_raw_count & sw64_pmu->pmc_count_mask)) + ovf; + + /* It is possible on very rare occasions that the PMC has overflowed + * but the interrupt is yet to come. Detect and fix this situation. + */ + if (unlikely(delta < 0)) + delta += sw64_pmu->pmc_max_period + 1; + + local64_add(delta, &event->count); + local64_sub(delta, &hwc->period_left); + + return new_raw_count; +} + +/* + * State transition functions: + * + * add()/del() & start()/stop() + * + */ + +/* + * pmu->add: add the event to PMU. + */ +static int sw64_pmu_add(struct perf_event *event, int flags) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + struct hw_perf_event *hwc = &event->hw; + int err = 0; + unsigned long irq_flags; + + perf_pmu_disable(event->pmu); + local_irq_save(irq_flags); + + if (cpuc->pmcs[hwc->idx] == PMC_IN_USE) { + err = -ENOSPC; + goto out; + } + + cpuc->pmcs[hwc->idx] = PMC_IN_USE; + cpuc->event[hwc->idx] = event; + + + cpuc->n_events++; + + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; + if (flags & PERF_EF_START) + sw64_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + +out: + local_irq_restore(irq_flags); + perf_pmu_enable(event->pmu); + + return err; +} + +/* + * pmu->del: delete the event from PMU. + */ +static void sw64_pmu_del(struct perf_event *event, int flags) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + struct hw_perf_event *hwc = &event->hw; + unsigned long irq_flags; + + perf_pmu_disable(event->pmu); + local_irq_save(irq_flags); + + if (cpuc->event[hwc->idx] != event) + goto out; + + cpuc->event[hwc->idx] = NULL; + cpuc->pmcs[hwc->idx] = PMC_NOT_USE; + cpuc->n_events--; + + sw64_pmu_stop(event, PERF_EF_UPDATE); + + /* Absorb the final count and turn off the event. */ + perf_event_update_userpage(event); + +out: + local_irq_restore(irq_flags); + perf_pmu_enable(event->pmu); +} + +/* + * pmu->start: start the event. 
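+ *
+ * Reprograms the event selector for the event's fixed counter and
+ * re-enables counting; with PERF_EF_RELOAD the sample period is armed
+ * first via sw64_perf_event_set_period().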
+ */ +static void sw64_pmu_start(struct perf_event *event, int flags) +{ + struct hw_perf_event *hwc = &event->hw; + + if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED))) + return; + + if (flags & PERF_EF_RELOAD) { + WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE)); + sw64_perf_event_set_period(event, hwc, hwc->idx); + } + + hwc->state = 0; + + /* counting in all modes, for both counters */ + wrperfmon(PERFMON_CMD_PM, 4); + if (hwc->idx == PERFMON_PC0) { + wrperfmon(PERFMON_CMD_EVENT_PC0, hwc->event_base); + wrperfmon(PERFMON_CMD_ENABLE, PERFMON_ENABLE_ARGS_PC0); + } else { + wrperfmon(PERFMON_CMD_EVENT_PC1, hwc->event_base); + wrperfmon(PERFMON_CMD_ENABLE, PERFMON_ENABLE_ARGS_PC1); + } +} + +/* + * pmu->stop: stop the counter + */ +static void sw64_pmu_stop(struct perf_event *event, int flags) +{ + struct hw_perf_event *hwc = &event->hw; + + if (!(hwc->state & PERF_HES_STOPPED)) { + hwc->state |= PERF_HES_STOPPED; + barrier(); + } + + if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) { + sw64_perf_event_update(event, hwc, hwc->idx, 0); + hwc->state |= PERF_HES_UPTODATE; + } + + if (hwc->idx == 0) + wrperfmon(PERFMON_CMD_DISABLE, PERFMON_DISABLE_ARGS_PC0); + else + wrperfmon(PERFMON_CMD_DISABLE, PERFMON_DISABLE_ARGS_PC1); + +} + +/* + * pmu->read: read and update the counter + */ +static void sw64_pmu_read(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + sw64_perf_event_update(event, hwc, hwc->idx, 0); +} + +static bool supported_cpu(void) +{ + return true; +} + +static void hw_perf_event_destroy(struct perf_event *event) +{ + /* Nothing to be done! */ +} + +static int __hw_perf_event_init(struct perf_event *event) +{ + struct perf_event_attr *attr = &event->attr; + struct hw_perf_event *hwc = &event->hw; + const struct sw64_perf_event *event_type; + + + /* SW64 do not have per-counter usr/os/guest/host bits */ + if (event->attr.exclude_user || event->attr.exclude_kernel || + event->attr.exclude_hv || event->attr.exclude_idle || + event->attr.exclude_host || event->attr.exclude_guest) + return -EINVAL; + + /* + * SW64 does not support precise ip feature, and system hang when + * detecting precise_ip by perf_event_attr__set_max_precise_ip + * in userspace + */ + if (attr->precise_ip != 0) + return -EOPNOTSUPP; + + /* SW64 has fixed counter for given event type */ + if (attr->type == PERF_TYPE_HARDWARE) { + if (attr->config >= sw64_pmu->max_events) + return -EINVAL; + event_type = sw64_pmu->map_hw_event(attr->config); + hwc->idx = event_type->counter; + hwc->event_base = event_type->event; + } else if (attr->type == PERF_TYPE_HW_CACHE) { + event_type = sw64_pmu->map_cache_event(attr->config); + if (IS_ERR(event_type)) /* */ + return PTR_ERR(event_type); + hwc->idx = event_type->counter; + hwc->event_base = event_type->event; + } else { /* PERF_TYPE_RAW */ + if (!sw64_pmu->raw_event_valid(attr->config)) + return -EINVAL; + hwc->idx = attr->config >> 8; /* counter selector */ + hwc->event_base = attr->config & 0xff; /* event selector */ + } + + hwc->config = attr->config; + + if (!is_sampling_event(event)) + pr_debug("not sampling event\n"); + + event->destroy = hw_perf_event_destroy; + + if (!hwc->sample_period) { + hwc->sample_period = sw64_pmu->pmc_max_period; + hwc->last_period = hwc->sample_period; + local64_set(&hwc->period_left, hwc->sample_period); + } + + return 0; +} + +/* + * Main entry point to initialise a HW performance event. 
+ */
+static int sw64_pmu_event_init(struct perf_event *event)
+{
+	int err;
+
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	switch (event->attr.type) {
+	case PERF_TYPE_RAW:
+	case PERF_TYPE_HARDWARE:
+	case PERF_TYPE_HW_CACHE:
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	if (!sw64_pmu)
+		return -ENODEV;
+
+	/* Do the real initialisation work. */
+	err = __hw_perf_event_init(event);
+
+	return err;
+}
+
+static struct pmu pmu = {
+	.name = "core3-base",
+	.capabilities = PERF_PMU_CAP_NO_NMI,
+	.event_init = sw64_pmu_event_init,
+	.add = sw64_pmu_add,
+	.del = sw64_pmu_del,
+	.start = sw64_pmu_start,
+	.stop = sw64_pmu_stop,
+	.read = sw64_pmu_read,
+};
+
+void perf_event_print_debug(void)
+{
+	unsigned long flags;
+	unsigned long pcr0, pcr1;
+	int cpu;
+
+	if (!supported_cpu())
+		return;
+
+	local_irq_save(flags);
+
+	cpu = smp_processor_id();
+
+	pcr0 = wrperfmon(PERFMON_CMD_READ, PERFMON_READ_PC0);
+	pcr1 = wrperfmon(PERFMON_CMD_READ, PERFMON_READ_PC1);
+
+	pr_info("CPU#%d: PCTR0[%lx] PCTR1[%lx]\n", cpu, pcr0, pcr1);
+
+	local_irq_restore(flags);
+}
+
+static void sw64_perf_event_irq_handler(unsigned long perfmon_num,
+					struct pt_regs *regs)
+{
+	struct cpu_hw_events *cpuc;
+	struct perf_sample_data data;
+	struct perf_event *event;
+	struct hw_perf_event *hwc;
+	int idx;
+
+	__this_cpu_inc(irq_pmi_count);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	idx = perfmon_num;
+
+	event = cpuc->event[idx];
+
+	if (unlikely(!event)) {
+		/* This should never occur! */
+		irq_err_count++;
+		pr_warn("PMI: No event at index %d!\n", idx);
+		wrperfmon(PERFMON_CMD_DISABLE, idx == 0 ? PERFMON_DISABLE_ARGS_PC0 : PERFMON_DISABLE_ARGS_PC1);
+		return;
+	}
+
+	hwc = &event->hw;
+	sw64_perf_event_update(event, hwc, idx, sw64_pmu->pmc_max_period + 1);
+	perf_sample_data_init(&data, 0, hwc->last_period);
+
+	if (sw64_perf_event_set_period(event, hwc, idx)) {
+		if (perf_event_overflow(event, &data, regs)) {
+			/* Interrupts coming too quickly; "throttle" the
+			 * counter, i.e., disable it for a little while.
+ */
+			sw64_pmu_stop(event, 0);
+		}
+	}
+}
+
+bool valid_utext_addr(unsigned long addr)
+{
+	return addr >= current->mm->start_code && addr <= current->mm->end_code;
+}
+
+bool valid_dy_addr(unsigned long addr)
+{
+	bool ret = false;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm = current->mm;
+
+	if (addr > TASK_SIZE || addr < TASK_UNMAPPED_BASE)
+		return ret;
+	vma = find_vma(mm, addr);
+	if (vma && vma->vm_start <= addr && (vma->vm_flags & VM_EXEC))
+		ret = true;
+	return ret;
+}
+
+void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
+			 struct pt_regs *regs)
+{
+	unsigned long usp = current_user_stack_pointer();
+	unsigned long user_addr;
+	int err;
+
+	perf_callchain_store(entry, regs->pc);
+
+	while (entry->nr < entry->max_stack && usp < current->mm->start_stack) {
+		if (!access_ok(usp, 8))
+			break;
+		pagefault_disable();
+		err = __get_user(user_addr, (unsigned long *)usp);
+		pagefault_enable();
+		if (err)
+			break;
+		if (valid_utext_addr(user_addr) || valid_dy_addr(user_addr))
+			perf_callchain_store(entry, user_addr);
+		usp = usp + 8;
+	}
+}
+
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
+			   struct pt_regs *regs)
+{
+	unsigned long *sp = (unsigned long *)current_thread_info()->pcb.ksp;
+	unsigned long addr;
+
+	perf_callchain_store(entry, regs->pc);
+
+	while (!kstack_end(sp) && entry->nr < entry->max_stack) {
+		addr = *sp++;
+		if (__kernel_text_address(addr))
+			perf_callchain_store(entry, addr);
+	}
+}
+
+/*
+ * Init call to initialise performance events at kernel startup.
+ */
+int __init init_hw_perf_events(void)
+{
+	if (!supported_cpu()) {
+		pr_info("Performance events: Unsupported CPU type!\n");
+		return 0;
+	}
+
+	pr_info("Performance events: Supported CPU type!\n");
+
+	/* Override performance counter IRQ vector */
+
+	perf_irq = sw64_perf_event_irq_handler;
+
+	/* And set up PMU specification */
+	sw64_pmu = &core3_pmu;
+
+	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
+
+	return 0;
+}
+early_initcall(init_hw_perf_events);
diff --git a/arch/sw_64/kernel/perf_regs.c b/arch/sw_64/kernel/perf_regs.c
new file mode 100644
index 000000000000..8eec2179eb86
--- /dev/null
+++ b/arch/sw_64/kernel/perf_regs.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/perf_event.h>
+#include <linux/bug.h>
+#include <linux/perf_regs.h>
+#include <asm/ptrace.h>
+
+u64 perf_reg_value(struct pt_regs *regs, int idx)
+{
+	if (WARN_ON_ONCE((u32)idx >= PERF_REG_SW64_MAX))
+		return 0;
+
+	return ((unsigned long *)regs)[idx];
+}
+
+#define REG_RESERVED (~((1ULL << PERF_REG_SW64_MAX) - 1))
+
+int perf_reg_validate(u64 mask)
+{
+	if (!mask || mask & REG_RESERVED)
+		return -EINVAL;
+	return 0;
+}
+
+u64 perf_reg_abi(struct task_struct *task)
+{
+	return PERF_SAMPLE_REGS_ABI_64;
+}
+
+void perf_get_regs_user(struct perf_regs *regs_user,
+			struct pt_regs *regs)
+{
+	regs_user->regs = NULL;
+	regs_user->abi = PERF_SAMPLE_REGS_ABI_NONE;
+}
diff --git a/arch/sw_64/kernel/proc_misc.c b/arch/sw_64/kernel/proc_misc.c
new file mode 100644
index 000000000000..ca107ec1e05e
--- /dev/null
+++ b/arch/sw_64/kernel/proc_misc.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+
+extern const struct seq_operations cpu_active_mask_op;
+static int cpu_active_mask_open(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &cpu_active_mask_op);
+}
+
+static const struct proc_ops proc_cpu_active_mask_operations = {
+	.proc_open = cpu_active_mask_open,
+	.proc_read = seq_read,
+	.proc_lseek = seq_lseek,
+	.proc_release = seq_release,
+};
+
+static int __init proc_cpu_active_mask_init(void)
+{
+	proc_create("cpu_active_mask", 0, NULL, &proc_cpu_active_mask_operations);
+	return 0;
+}
+fs_initcall(proc_cpu_active_mask_init);
diff --git a/arch/sw_64/kernel/process.c b/arch/sw_64/kernel/process.c
new file mode 100644
index 000000000000..8fd493776bec
--- /dev/null
+++ b/arch/sw_64/kernel/process.c
@@ -0,0 +1,352 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file handles the architecture-dependent parts of process handling.
+ */
+
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/sched/debug.h>
+#include <linux/sched/task.h>
+#include <linux/sched/task_stack.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/stddef.h>
+#include <linux/unistd.h>
+#include <linux/ptrace.h>
+#include <linux/user.h>
+#include <linux/time.h>
+#include <linux/major.h>
+#include <linux/stat.h>
+#include <linux/vt.h>
+#include <linux/mman.h>
+#include <linux/elfcore.h>
+#include <linux/reboot.h>
+#include <linux/tty.h>
+#include <linux/console.h>
+#include <linux/slab.h>
+#include <linux/rcupdate.h>
+#include <linux/tick.h>
+#include <linux/random.h>
+#include <linux/uaccess.h>
+
+#include <asm/reg.h>
+#include <asm/io.h>
+#include <asm/pgtable.h>
+#include <asm/fpu.h>
+#include <asm/hcall.h>
+
+#include "proto.h"
+#include "pci_impl.h"
+
+/*
+ * Power off function, if any
+ */
+void (*pm_power_off)(void) = machine_power_off;
+EXPORT_SYMBOL(pm_power_off);
+
+struct halt_info {
+	int mode;
+	char *restart_cmd;
+};
+
+#ifdef CONFIG_HOTPLUG_CPU
+void arch_cpu_idle_dead(void)
+{
+	play_dead();
+}
+#endif
+
+void arch_cpu_idle(void)
+{
+	int i;
+
+	local_irq_enable();
+	cpu_relax();
+
+	if (is_in_guest())
+		hcall(HCALL_HALT, 0, 0, 0);
+	else {
+		for (i = 0; i < 16; i++)
+			asm("nop");
+		asm("halt");
+	}
+}
+
+static void common_shutdown_1(void *generic_ptr)
+{
+	struct halt_info *how = (struct halt_info *)generic_ptr;
+	int cpuid = smp_processor_id();
+
+	/* No point in taking interrupts anymore. */
+	local_irq_disable();
+
+#ifdef CONFIG_SMP
+	/* Secondaries halt here. */
+	if (cpuid != 0) {
+		set_cpu_present(cpuid, false);
+		set_cpu_possible(cpuid, false);
+
+		if (is_in_guest()) {
+			hcall(HCALL_SET_CLOCKEVENT, 0, 0, 0);
+			while (1)
+				asm("nop");
+		} else
+			asm("halt");
+	}
+#endif
+	if (sw64_platform->kill_arch)
+		sw64_platform->kill_arch(how->mode);
+}
+
+static void common_shutdown(int mode, char *restart_cmd)
+{
+	struct halt_info args;
+
+	args.mode = mode;
+	args.restart_cmd = restart_cmd;
+	on_each_cpu(common_shutdown_1, &args, 0);
+}
+
+void machine_restart(char *restart_cmd)
+{
+	common_shutdown(LINUX_REBOOT_CMD_RESTART, restart_cmd);
+}
+
+
+void machine_halt(void)
+{
+	common_shutdown(LINUX_REBOOT_CMD_HALT, NULL);
+}
+
+
+void machine_power_off(void)
+{
+	common_shutdown(LINUX_REBOOT_CMD_POWER_OFF, NULL);
+}
+
+
+/* Used by sysrq-p, among others. I don't believe r9-r15 are ever
+ * saved in the context it's used.
+ */ + +void +show_regs(struct pt_regs *regs) +{ + show_regs_print_info(KERN_DEFAULT); + dik_show_regs(regs, NULL); +} + +/* + * Re-start a thread when doing execve() + */ +void +start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp) +{ + regs->pc = pc; + regs->ps = 8; + wrusp(sp); +} +EXPORT_SYMBOL(start_thread); + + +void +flush_thread(void) +{ + /* Arrange for each exec'ed process to start off with a clean slate + * with respect to the FPU. This is all exceptions disabled. + */ + current_thread_info()->ieee_state = 0; + wrfpcr(FPCR_DYN_NORMAL | ieee_swcr_to_fpcr(0)); + + /* Clean slate for TLS. */ + current_thread_info()->pcb.unique = 0; +} + +void +release_thread(struct task_struct *dead_task) +{ +} + +/* + * Copy architecture-specific thread state + */ + +int +copy_thread(unsigned long clone_flags, unsigned long usp, + unsigned long kthread_arg, struct task_struct *p, + unsigned long tls) +{ + extern void ret_from_fork(void); + extern void ret_from_kernel_thread(void); + + struct thread_info *childti = task_thread_info(p); + struct pt_regs *childregs = task_pt_regs(p); + struct pt_regs *regs = current_pt_regs(); + struct switch_stack *childstack, *stack; + + childstack = ((struct switch_stack *) childregs) - 1; + childti->pcb.ksp = (unsigned long) childstack; + childti->pcb.flags = 7; /* set FEN, clear everything else */ + + if (unlikely(p->flags & PF_KTHREAD)) { + /* kernel thread */ + memset(childstack, 0, + sizeof(struct switch_stack) + sizeof(struct pt_regs)); + childstack->r26 = (unsigned long) ret_from_kernel_thread; + childstack->r9 = usp; /* function */ + childstack->r10 = kthread_arg; + childti->pcb.usp = 0; + return 0; + } + /* + * Note: if CLONE_SETTLS is not set, then we must inherit the + * value from the parent, which will have been set by the block + * copy in dup_task_struct. This is non-intuitive, but is + * required for proper operation in the case of a threaded + * application calling fork. + */ + if (clone_flags & CLONE_SETTLS) + childti->pcb.unique = tls; + else + regs->r20 = 0; + childti->pcb.usp = usp ?: rdusp(); + *childregs = *regs; + childregs->r0 = 0; + childregs->r19 = 0; + stack = ((struct switch_stack *) regs) - 1; + *childstack = *stack; + p->thread = current->thread; + childstack->r26 = (unsigned long) ret_from_fork; + return 0; +} + +/* + * Fill in the user structure for a ELF core dump. + */ +void +dump_elf_thread(elf_greg_t *dest, struct pt_regs *pt, struct thread_info *ti) +{ + /* switch stack follows right below pt_regs: */ + struct switch_stack *sw = ((struct switch_stack *) pt) - 1; + + dest[0] = pt->r0; + dest[1] = pt->r1; + dest[2] = pt->r2; + dest[3] = pt->r3; + dest[4] = pt->r4; + dest[5] = pt->r5; + dest[6] = pt->r6; + dest[7] = pt->r7; + dest[8] = pt->r8; + dest[9] = sw->r9; + dest[10] = sw->r10; + dest[11] = sw->r11; + dest[12] = sw->r12; + dest[13] = sw->r13; + dest[14] = sw->r14; + dest[15] = sw->r15; + dest[16] = pt->r16; + dest[17] = pt->r17; + dest[18] = pt->r18; + dest[19] = pt->r19; + dest[20] = pt->r20; + dest[21] = pt->r21; + dest[22] = pt->r22; + dest[23] = pt->r23; + dest[24] = pt->r24; + dest[25] = pt->r25; + dest[26] = pt->r26; + dest[27] = pt->r27; + dest[28] = pt->r28; + dest[29] = pt->gp; + dest[30] = ti == current_thread_info() ? rdusp() : ti->pcb.usp; + dest[31] = pt->pc; + + /* Once upon a time this was the PS value. Which is stupid + * since that is always 8 for usermode. Usurped for the more + * useful value of the thread's UNIQUE field. 
+ */ + dest[32] = ti->pcb.unique; +} +EXPORT_SYMBOL(dump_elf_thread); + +int +dump_elf_task(elf_greg_t *dest, struct task_struct *task) +{ + dump_elf_thread(dest, task_pt_regs(task), task_thread_info(task)); + return 1; +} +EXPORT_SYMBOL(dump_elf_task); + +int +dump_elf_task_fp(elf_fpreg_t *dest, struct task_struct *task) +{ + memcpy(dest, &task->thread.ctx_fp, 32 * 8); + return 1; +} +EXPORT_SYMBOL(dump_elf_task_fp); + +/* + * Return saved PC of a blocked thread. This assumes the frame + * pointer is the 6th saved long on the kernel stack and that the + * saved return address is the first long in the frame. This all + * holds provided the thread blocked through a call to schedule() ($15 + * is the frame pointer in schedule() and $15 is saved at offset 48 by + * entry.S:do_switch_stack). + * + * Under heavy swap load I've seen this lose in an ugly way. So do + * some extra sanity checking on the ranges we expect these pointers + * to be in so that we can fail gracefully. This is just for ps after + * all. -- r~ + */ + +unsigned long +thread_saved_pc(struct task_struct *t) +{ + unsigned long base = (unsigned long)task_stack_page(t); + unsigned long fp, sp = task_thread_info(t)->pcb.ksp; + + if (sp > base && sp+6*8 < base + 16*1024) { + fp = ((unsigned long *)sp)[6]; + if (fp > sp && fp < base + 16*1024) + return *(unsigned long *)fp; + } + + return 0; +} + +unsigned long +get_wchan(struct task_struct *p) +{ + unsigned long schedule_frame; + unsigned long pc, base, sp; + + if (!p || p == current || p->state == TASK_RUNNING) + return 0; + /* + * This one depends on the frame size of schedule(). Do a + * "disass schedule" in gdb to find the frame size. Also, the + * code assumes that sleep_on() follows immediately after + * interruptible_sleep_on() and that add_timer() follows + * immediately after interruptible_sleep(). Ugly, isn't it? + * Maybe adding a wchan field to task_struct would be better, + * after all... 
+ */ + + pc = thread_saved_pc(p); + if (in_sched_functions(pc)) { + base = (unsigned long)task_stack_page(p); + sp = task_thread_info(p)->pcb.ksp; + schedule_frame = ((unsigned long *)sp)[6]; + if (schedule_frame > sp && schedule_frame < base + 16*1024) + return ((unsigned long *)schedule_frame)[12]; + } + return pc; +} + +unsigned long arch_randomize_brk(struct mm_struct *mm) +{ + return randomize_page(mm->brk, 0x02000000); +} diff --git a/arch/sw_64/kernel/proto.h b/arch/sw_64/kernel/proto.h new file mode 100644 index 000000000000..1a729a8f21c3 --- /dev/null +++ b/arch/sw_64/kernel/proto.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _SW64_KERNEL_PROTO_H +#define _SW64_KERNEL_PROTO_H + +#include <linux/interrupt.h> +#include <linux/io.h> +#include <asm/pgtable.h> + +/* ptrace.c */ +extern int ptrace_set_bpt(struct task_struct *child); +extern int ptrace_cancel_bpt(struct task_struct *child); + +/* traps.c */ +extern void dik_show_regs(struct pt_regs *regs, unsigned long *r9_15); +extern void die_if_kernel(char *str, struct pt_regs *regs, long err, unsigned long *r9_15); + +/* timer.c */ +extern void setup_timer(void); + +extern void __init setup_sched_clock(void); +#ifdef CONFIG_GENERIC_SCHED_CLOCK +extern void __init sw64_sched_clock_init(void); +#endif + +#endif /* _SW64_PROTO_H */ diff --git a/arch/sw_64/kernel/ptrace.c b/arch/sw_64/kernel/ptrace.c new file mode 100644 index 000000000000..5f29c500c8b1 --- /dev/null +++ b/arch/sw_64/kernel/ptrace.c @@ -0,0 +1,707 @@ +// SPDX-License-Identifier: GPL-2.0 +/* ptrace.c */ +/* By Ross Biro 1/23/92 */ +/* edited by Linus Torvalds */ +/* mangled further by Bob Manson (manson@santafe.edu) */ +/* more mutilation by David Mosberger (davidm@azstarnet.com) */ + +#include <linux/kernel.h> +#include <linux/sched.h> +#include <linux/mm.h> +#include <linux/smp.h> +#include <linux/errno.h> +#include <linux/ptrace.h> +#include <linux/user.h> +#include <linux/security.h> +#include <linux/signal.h> +#include <linux/tracehook.h> +#include <linux/seccomp.h> +#include <linux/audit.h> +#include <linux/uaccess.h> + +#include <asm/pgtable.h> +#include <asm/fpu.h> +#include <asm/core.h> +#include "proto.h" + +#define CREATE_TRACE_POINTS +#include <trace/events/syscalls.h> + +#define DEBUG DBG_MEM +#undef DEBUG + +#define DEBUG 0 + +#ifdef DEBUG +enum { + DBG_MEM = (1 << 0), + DBG_BPT = (1 << 1), + DBG_MEM_ALL = (1 << 2) +}; +#define DBG(fac, args) \ +{ \ + if ((fac) & DEBUG) \ + printk args; \ +} +#else +#define DBG(fac, args) +#endif + +#define BREAKINST 0x00000080 /* sys_call bpt */ + +/* + * does not yet catch signals sent when the child dies. + * in exit.c or in signal.c. + */ + +/* + * Processes always block with the following stack-layout: + * + * +================================+ <---- task + 2*PAGE_SIZE + * | HMcode saved frame (ps, pc, | ^ + * | gp, a0, a1, a2) | | + * +================================+ | struct pt_regs + * | | | + * | frame generated by SAVE_ALL | | + * | | v + * +================================+ + * | | ^ + * | frame saved by do_switch_stack | | struct switch_stack + * | | v + * +================================+ + */ + +/* + * The following table maps a register index into the stack offset at + * which the register is saved. Register indices are 0-31 for integer + * regs, 32-63 for fp regs, and 64 for the pc. Notice that sp and + * zero have no stack-slot and need to be treated specially (see + * get_reg/put_reg below). 
+ */ +enum { + REG_R0 = 0, + REG_F0 = 32, + REG_FPCR = 63, + REG_PC = 64, + REG_SP = 30, + REG_PS = 31, + REG_GP = 29 +}; + +#define PT_REG(reg) \ + (PAGE_SIZE * 2 - sizeof(struct pt_regs) + offsetof(struct pt_regs, reg)) + +#define SW_REG(reg) \ + (PAGE_SIZE * 2 - sizeof(struct pt_regs) - sizeof(struct switch_stack) \ + + offsetof(struct switch_stack, reg)) + +#define FP_REG(fp_regno, vector_regno) \ + (fp_regno * 32 + vector_regno * 8) + +static int regoff[] = { + PT_REG(r0), PT_REG(r1), PT_REG(r2), PT_REG(r3), + PT_REG(r4), PT_REG(r5), PT_REG(r6), PT_REG(r7), + PT_REG(r8), SW_REG(r9), SW_REG(r10), SW_REG(r11), + SW_REG(r12), SW_REG(r13), SW_REG(r14), SW_REG(r15), + PT_REG(r16), PT_REG(r17), PT_REG(r18), PT_REG(r19), + PT_REG(r20), PT_REG(r21), PT_REG(r22), PT_REG(r23), + PT_REG(r24), PT_REG(r25), PT_REG(r26), PT_REG(r27), + PT_REG(r28), PT_REG(gp), -1, -1 +}; + +#define PCB_OFF(var) offsetof(struct pcb_struct, var) + +static int pcboff[] = { + [USP] = PCB_OFF(usp), + [UNIQUE] = PCB_OFF(unique), + [DA_MATCH] = PCB_OFF(da_match), + [DA_MASK] = PCB_OFF(da_mask), + [DV_MATCH] = PCB_OFF(dv_match), + [DV_MASK] = PCB_OFF(dv_mask), + [DC_CTL] = PCB_OFF(dc_ctl) +}; + +static unsigned long zero; + +/* + * Get address of register REGNO in task TASK. + */ + +static unsigned long * +get_reg_addr(struct task_struct *task, unsigned long regno) +{ + unsigned long *addr; + int fp_regno, vector_regno; + + switch (regno) { + case USP: + case UNIQUE: + case DA_MATCH: + case DA_MASK: + case DV_MATCH: + case DV_MASK: + case DC_CTL: + addr = (void *)task_thread_info(task) + pcboff[regno]; + break; + case REG_BASE ... REG_END: + addr = (void *)task_thread_info(task) + regoff[regno]; + break; + case FPREG_BASE ... FPREG_END: + fp_regno = regno - FPREG_BASE; + vector_regno = 0; + addr = (void *)((unsigned long)&task->thread.ctx_fp + FP_REG(fp_regno, vector_regno)); + break; + case VECREG_BASE ... VECREG_END: + /* + * return addr for zero value if we catch vectors of f31 + * v0 and v3 of f31 are not in this range so ignore them + */ + if (regno == F31_V1 || regno == F31_V2) { + addr = &zero; + break; + } + fp_regno = (regno - VECREG_BASE) & 0x1f; + vector_regno = 1 + ((regno - VECREG_BASE) >> 5); + addr = (void *)((unsigned long)&task->thread.ctx_fp + FP_REG(fp_regno, vector_regno)); + break; + case FPCR: + addr = (void *)&task->thread.fpcr; + break; + case PC: + addr = (void *)task_thread_info(task) + PT_REG(pc); + break; + default: + addr = &zero; + } + + return addr; +} + +/* + * Get contents of register REGNO in task TASK. + */ +unsigned long +get_reg(struct task_struct *task, unsigned long regno) +{ + return *get_reg_addr(task, regno); +} + +/* + * Write contents of register REGNO in task TASK. + */ +static int +put_reg(struct task_struct *task, unsigned long regno, unsigned long data) +{ + *get_reg_addr(task, regno) = data; + return 0; +} + +static inline int +read_int(struct task_struct *task, unsigned long addr, int *data) +{ + int copied = access_process_vm(task, addr, data, sizeof(int), FOLL_FORCE); + + return (copied == sizeof(int)) ? 0 : -EIO; +} + +static inline int +write_int(struct task_struct *task, unsigned long addr, int data) +{ + int copied = access_process_vm(task, addr, &data, sizeof(int), + FOLL_FORCE | FOLL_WRITE); + return (copied == sizeof(int)) ? 0 : -EIO; +} + +/* + * Set breakpoint. 
+ */ +int +ptrace_set_bpt(struct task_struct *child) +{ + int displ, i, res, reg_b, nsaved = 0; + unsigned int insn, op_code; + unsigned long pc; + + pc = get_reg(child, REG_PC); + res = read_int(child, pc, (int *)&insn); + if (res < 0) + return res; + + op_code = insn >> 26; + /* br bsr beq bne blt ble bgt bge blbc blbs fbeq fbne fblt fble fbgt fbge */ + if ((1UL << op_code) & 0x3fff000000000030UL) { + /* + * It's a branch: instead of trying to figure out + * whether the branch will be taken or not, we'll put + * a breakpoint at either location. This is simpler, + * more reliable, and probably not a whole lot slower + * than the alternative approach of emulating the + * branch (emulation can be tricky for fp branches). + */ + displ = ((s32)(insn << 11)) >> 9; + task_thread_info(child)->bpt_addr[nsaved++] = pc + 4; + if (displ) /* guard against unoptimized code */ + task_thread_info(child)->bpt_addr[nsaved++] + = pc + 4 + displ; + DBG(DBG_BPT, ("execing branch\n")); + /*call ret jmp*/ + } else if (op_code >= 0x1 && op_code <= 0x3) { + reg_b = (insn >> 16) & 0x1f; + task_thread_info(child)->bpt_addr[nsaved++] = get_reg(child, reg_b); + DBG(DBG_BPT, ("execing jump\n")); + } else { + task_thread_info(child)->bpt_addr[nsaved++] = pc + 4; + DBG(DBG_BPT, ("execing normal insn\n")); + } + + /* install breakpoints: */ + for (i = 0; i < nsaved; ++i) { + res = read_int(child, task_thread_info(child)->bpt_addr[i], + (int *)&insn); + if (res < 0) + return res; + task_thread_info(child)->bpt_insn[i] = insn; + DBG(DBG_BPT, (" -> next_pc=%lx\n", + task_thread_info(child)->bpt_addr[i])); + res = write_int(child, task_thread_info(child)->bpt_addr[i], + BREAKINST); + if (res < 0) + return res; + } + task_thread_info(child)->bpt_nsaved = nsaved; + return 0; +} + +/* + * Ensure no single-step breakpoint is pending. Returns non-zero + * value if child was being single-stepped. + */ +int +ptrace_cancel_bpt(struct task_struct *child) +{ + int i, nsaved = task_thread_info(child)->bpt_nsaved; + + task_thread_info(child)->bpt_nsaved = 0; + + if (nsaved > 2) { + printk("%s: bogus nsaved: %d!\n", __func__, nsaved); + nsaved = 2; + } + + for (i = 0; i < nsaved; ++i) { + write_int(child, task_thread_info(child)->bpt_addr[i], + task_thread_info(child)->bpt_insn[i]); + } + return (nsaved != 0); +} + +void user_enable_single_step(struct task_struct *child) +{ + /* Mark single stepping. */ + task_thread_info(child)->bpt_nsaved = -1; +} + +void user_disable_single_step(struct task_struct *child) +{ + ptrace_cancel_bpt(child); +} + +/* + * Called by kernel/ptrace.c when detaching.. + * + * Make sure the single step bit is not set. 
+ */
+void ptrace_disable(struct task_struct *child)
+{
+	user_disable_single_step(child);
+}
+
+int ptrace_getregs(struct task_struct *child, __s64 __user *data)
+{
+	int ret, retval = 0;
+	int i;
+	unsigned long regval;
+
+	if (!access_ok(data, sizeof(long) * 33))
+		return -EIO;
+
+	/* r0-r15 */
+	for (i = 0; i < 16; i++) {
+		regval = get_reg(child, i);
+		retval |= __put_user((long)regval, data + i);
+	}
+	/* r19-r28 */
+	for (i = 19; i < 29; i++) {
+		regval = get_reg(child, i);
+		retval |= __put_user((long)regval, data + i - 3);
+	}
+	/* SP, PS, PC, GP */
+	retval |= __put_user((long)(get_reg(child, REG_SP)), data + EF_SP);
+	retval |= __put_user((long)(get_reg(child, REG_PS)), data + EF_PS);
+	retval |= __put_user((long)(get_reg(child, REG_PC)), data + EF_PC);
+	retval |= __put_user((long)(get_reg(child, REG_GP)), data + EF_GP);
+	/* r16-r18 */
+	retval |= __put_user((long)(get_reg(child, 16)), data + EF_A0);
+	retval |= __put_user((long)(get_reg(child, 17)), data + EF_A1);
+	retval |= __put_user((long)(get_reg(child, 18)), data + EF_A2);
+
+	ret = retval ? -EIO : 0;
+	return ret;
+}
+
+int ptrace_setregs(struct task_struct *child, __s64 __user *data)
+{
+	int ret, retval = 0;
+	int i;
+	unsigned long regval;
+
+	if (!access_ok(data, sizeof(long) * 33))
+		return -EIO;
+
+	/* r0-r15 */
+	for (i = 0; i < 16; i++) {
+		retval |= __get_user(regval, data + i);
+		ret = put_reg(child, i, regval);
+	}
+	/* r19-r28 */
+	for (i = 19; i < 29; i++) {
+		retval |= __get_user(regval, data + i - 3);
+		ret = put_reg(child, i, regval);
+	}
+	/* SP, PS, PC, GP */
+	retval |= __get_user(regval, data + EF_SP);
+	ret = put_reg(child, REG_SP, regval);
+	retval |= __get_user(regval, data + EF_PS);
+	ret = put_reg(child, REG_PS, regval);
+	retval |= __get_user(regval, data + EF_PC);
+	ret = put_reg(child, REG_PC, regval);
+	retval |= __get_user(regval, data + EF_GP);
+	ret = put_reg(child, REG_GP, regval);
+	/* r16-r18 */
+	retval |= __get_user(regval, data + EF_A0);
+	ret = put_reg(child, 16, regval);
+	retval |= __get_user(regval, data + EF_A1);
+	ret = put_reg(child, 17, regval);
+	retval |= __get_user(regval, data + EF_A2);
+	ret = put_reg(child, 18, regval);
+
+	ret = retval ? -EIO : 0;
+	return ret;
+}
+
+int ptrace_getfpregs(struct task_struct *child, __s64 __user *data)
+{
+	int ret, retval = 0;
+	int i;
+	unsigned long regval;
+
+	if (!access_ok(data, sizeof(long) * 32))
+		return -EIO;
+
+	/* fp0-fp31 */
+	for (i = 0; i < 32; i++) {
+		regval = get_reg(child, REG_F0 + i);
+		retval |= __put_user((long)regval, data + i);
+	}
+
+	ret = retval ? -EIO : 0;
+	return ret;
+}
+
+int ptrace_setfpregs(struct task_struct *child, __s64 __user *data)
+{
+	int ret, retval = 0;
+	int i;
+	unsigned long regval;
+
+	if (!access_ok(data, sizeof(long) * 32))
+		return -EIO;
+
+	/* fp0-fp31 */
+	for (i = 0; i < 32; i++) {
+		retval |= __get_user(regval, data + i);
+		ret = put_reg(child, REG_F0 + i, regval);
+	}
+
+	ret = retval ? -EIO : 0;
+	return ret;
+}
+
+long arch_ptrace(struct task_struct *child, long request,
+		 unsigned long addr, unsigned long data)
+{
+	unsigned long tmp;
+	size_t copied;
+	long ret;
+	void __user *datavp = (void __user *) data;
+
+	switch (request) {
+	/* When I and D space are separate, these will need to be fixed. */
+	case PTRACE_PEEKTEXT: /* read word at location addr. */
+	case PTRACE_PEEKDATA:
+		copied = access_process_vm(child, addr, &tmp, sizeof(tmp), FOLL_FORCE);
+		ret = -EIO;
+		if (copied != sizeof(tmp))
+			break;
+
+		force_successful_syscall_return();
+		ret = tmp;
+		break;
+
+	/* Read register number ADDR.
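+	 * [Editor's note: addr is a register index in the mapping handled
+	 * by get_reg_addr() -- 0-31 integer registers, 32-63 floating
+	 * point, 64 the pc, plus pcb-backed slots such as USP and UNIQUE.]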
*/ + case PTRACE_PEEKUSR: + force_successful_syscall_return(); + ret = get_reg(child, addr); + DBG(DBG_MEM, ("peek $%lu->%#lx\n", addr, ret)); + break; + + /* When I and D space are separate, this will have to be fixed. */ + case PTRACE_POKETEXT: /* write the word at location addr. */ + case PTRACE_POKEDATA: + ret = generic_ptrace_pokedata(child, addr, data); + break; + + case PTRACE_POKEUSR: /* write the specified register */ + DBG(DBG_MEM, ("poke $%lu<-%#lx\n", addr, data)); + ret = put_reg(child, addr, data); + break; + case PTRACE_GETREGS: + ret = ptrace_getregs(child, datavp); + break; + case PTRACE_SETREGS: + ret = ptrace_setregs(child, datavp); + break; + case PTRACE_GETFPREGS: + ret = ptrace_getfpregs(child, datavp); + break; + case PTRACE_SETFPREGS: + ret = ptrace_setfpregs(child, datavp); + break; + default: + ret = ptrace_request(child, request, addr, data); + break; + } + return ret; +} + +asmlinkage unsigned long syscall_trace_enter(void) +{ + unsigned long ret = 0; + struct pt_regs *regs = current_pt_regs(); + + if (test_thread_flag(TIF_SYSCALL_TRACE) && + tracehook_report_syscall_entry(current_pt_regs())) + ret = -1UL; + +#ifdef CONFIG_SECCOMP + /* Do seccomp after ptrace, to catch any tracer changes. */ + if (secure_computing() == -1) + return -1; +#endif + + if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT))) + trace_sys_enter(regs, regs->r0); + audit_syscall_entry(regs->r0, regs->r16, regs->r17, regs->r18, regs->r19); + return ret ?: current_pt_regs()->r0; +} + +asmlinkage void +syscall_trace_leave(void) +{ + struct pt_regs *regs = current_pt_regs(); + + audit_syscall_exit(current_pt_regs()); + if (test_thread_flag(TIF_SYSCALL_TRACE)) + tracehook_report_syscall_exit(current_pt_regs(), 0); + if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT))) + trace_sys_exit(regs, regs_return_value(regs)); +} + +static long rwcsr(int rw, unsigned long csr, unsigned long value) +{ + register unsigned long __r0 __asm__("$0"); + register unsigned long __r16 __asm__("$16") = rw; + register unsigned long __r17 __asm__("$17") = csr; + register unsigned long __r18 __asm__("$18") = value; + + __asm__ __volatile__( + "sys_call %4" + : "=r"(__r0), "=r"(__r16), "=r"(__r17), "=r"(__r18) + : "i"(HMC_rwreg), "1"(__r16), "2"(__r17), "3"(__r18) + : "$1", "$22", "$23", "$24", "$25"); + + return __r0; +} + +#define RCSR 0 +#define WCSR 1 + +#define CSR_DA_MATCH 0 +#define CSR_DA_MASK 1 +#define CSR_IA_MATCH 2 +#define CSR_IA_MASK 3 +#define CSR_IDA_MATCH 6 +#define CSR_IDA_MASK 7 +#define CSR_DC_CTL 11 +#define CSR_DV_MATCH 15 +#define CSR_DV_MASK 16 + +#define DV_MATCH_EN_S 19 +#define DAV_MATCH_EN_S 20 + +int do_match(unsigned long address, unsigned long mmcsr, long cause, struct pt_regs *regs) +{ + unsigned long dc_ctl; + unsigned long value; + + printk("%s: pid %d, name = %s,cause = %#lx, mmcsr = %#lx, address = %#lx, pc %#lx\n", + __func__, current->pid, current->comm, cause, mmcsr, address, regs->pc); + + switch (mmcsr) { + case MMCSR__DA_MATCH: + case MMCSR__DV_MATCH: + case MMCSR__DAV_MATCH: + dik_show_regs(regs, (unsigned long *)regs-15); + + if (!(current->ptrace & PT_PTRACED)) { + printk(" pid %d %s not be ptraced, return\n", current->pid, current->comm); + if (mmcsr == MMCSR__DA_MATCH) + rwcsr(WCSR, CSR_DA_MATCH, 0); //clear da_match + if (mmcsr == MMCSR__DV_MATCH) { + value = rwcsr(RCSR, CSR_DV_MATCH, 0); + printk("value is %#lx\n", value); + value = rwcsr(RCSR, CSR_DV_MASK, 0); + printk("value is %#lx\n", value); + dc_ctl = rwcsr(RCSR, CSR_DC_CTL, 0); + dc_ctl &= ~(0x1UL << 
DV_MATCH_EN_S);
+				rwcsr(WCSR, CSR_DC_CTL, dc_ctl);
+			}
+			if (mmcsr == MMCSR__DAV_MATCH) {
+				dc_ctl = rwcsr(RCSR, CSR_DC_CTL, 0);
+				dc_ctl &= ~((0x1UL << DV_MATCH_EN_S) | (0x1UL << DAV_MATCH_EN_S));
+				rwcsr(WCSR, CSR_DC_CTL, dc_ctl);
+				rwcsr(WCSR, CSR_DA_MATCH, 0); //clear da_match
+			}
+			task_thread_info(current)->pcb.da_match = 0;
+			task_thread_info(current)->pcb.dv_match = 0;
+			task_thread_info(current)->pcb.dc_ctl = 0;
+			return 1;
+		}
+
+		if (mmcsr == MMCSR__DA_MATCH) {
+			rwcsr(WCSR, CSR_DA_MATCH, 0); //clear da_match
+			task_thread_info(current)->pcb.da_match = 0;
+		}
+		if (mmcsr == MMCSR__DV_MATCH) {
+			dc_ctl = rwcsr(RCSR, CSR_DC_CTL, 0);
+			dc_ctl &= ~(0x1UL << DV_MATCH_EN_S);
+			rwcsr(WCSR, CSR_DC_CTL, dc_ctl);
+		}
+		if (mmcsr == MMCSR__DAV_MATCH) {
+			dc_ctl = rwcsr(RCSR, CSR_DC_CTL, 0);
+			dc_ctl &= ~((0x1UL << DV_MATCH_EN_S) | (0x1UL << DAV_MATCH_EN_S));
+			rwcsr(WCSR, CSR_DC_CTL, dc_ctl);
+			rwcsr(WCSR, CSR_DA_MATCH, 0); //clear da_match
+		}
+		task_thread_info(current)->pcb.dv_match = 0;
+		task_thread_info(current)->pcb.dc_ctl = 0;
+		printk("do_page_fault: want to send SIGTRAP, pid = %d\n", current->pid);
+		force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void *) address, 0);
+		return 1;
+
+	case MMCSR__IA_MATCH:
+		rwcsr(WCSR, CSR_IA_MATCH, 0); //clear ia_match
+		return 1;
+	case MMCSR__IDA_MATCH:
+		rwcsr(WCSR, CSR_IDA_MATCH, 0); //clear ida_match
+		return 1;
+	}
+
+	return 0;
+}
+
+void restore_da_match_after_sched(void)
+{
+	unsigned long dc_ctl_mode;
+	unsigned long dc_ctl;
+	struct pcb_struct *pcb = &task_thread_info(current)->pcb;
+
+	if (!(pcb->da_match || pcb->da_mask || pcb->dv_match || pcb->dv_mask || pcb->dc_ctl))
+		return;
+	printk("Restore MATCH status, pid: %d\n", current->pid);
+	rwcsr(WCSR, CSR_DA_MATCH, 0);
+	rwcsr(WCSR, CSR_DA_MASK, pcb->da_mask);
+	rwcsr(WCSR, CSR_DA_MATCH, pcb->da_match);
+	dc_ctl_mode = pcb->dc_ctl;
+	dc_ctl = rwcsr(RCSR, CSR_DC_CTL, 0);
+	dc_ctl &= ~((0x1UL << DV_MATCH_EN_S) | (0x1UL << DAV_MATCH_EN_S));
+	dc_ctl |= ((dc_ctl_mode << DV_MATCH_EN_S) & ((0x1UL << DV_MATCH_EN_S) | (0x1UL << DAV_MATCH_EN_S)));
+	if (dc_ctl_mode & 0x1) {
+		rwcsr(WCSR, CSR_DV_MATCH, pcb->dv_match);
+		rwcsr(WCSR, CSR_DV_MASK, pcb->dv_mask);
+		rwcsr(WCSR, CSR_DC_CTL, dc_ctl);
+	}
+}
+
+struct pt_regs_offset {
+	const char *name;
+	int offset;
+};
+
+#define REG_OFFSET_NAME(reg, r) { \
+	.name = #reg, \
+	.offset = offsetof(struct pt_regs, r) \
+}
+
+#define REG_OFFSET_END { \
+	.name = NULL, \
+	.offset = 0 \
+}
+
+static const struct pt_regs_offset regoffset_table[] = {
+	REG_OFFSET_NAME(r0, r0),
+	REG_OFFSET_NAME(r1, r1),
+	REG_OFFSET_NAME(r2, r2),
+	REG_OFFSET_NAME(r3, r3),
+	REG_OFFSET_NAME(r4, r4),
+	REG_OFFSET_NAME(r5, r5),
+	REG_OFFSET_NAME(r6, r6),
+	REG_OFFSET_NAME(r7, r7),
+	REG_OFFSET_NAME(r8, r8),
+	REG_OFFSET_NAME(r19, r19),
+	REG_OFFSET_NAME(r20, r20),
+	REG_OFFSET_NAME(r21, r21),
+	REG_OFFSET_NAME(r22, r22),
+	REG_OFFSET_NAME(r23, r23),
+	REG_OFFSET_NAME(r24, r24),
+	REG_OFFSET_NAME(r25, r25),
+	REG_OFFSET_NAME(r26, r26),
+	REG_OFFSET_NAME(r27, r27),
+	REG_OFFSET_NAME(r28, r28),
+	REG_OFFSET_NAME(hae, hae),
+	REG_OFFSET_NAME(trap_a0, trap_a0),
+	REG_OFFSET_NAME(trap_a1, trap_a1),
+	REG_OFFSET_NAME(trap_a2, trap_a2),
+	REG_OFFSET_NAME(ps, ps),
+	REG_OFFSET_NAME(pc, pc),
+	REG_OFFSET_NAME(gp, gp),
+	REG_OFFSET_NAME(r16, r16),
+	REG_OFFSET_NAME(r17, r17),
+	REG_OFFSET_NAME(r18, r18),
+	REG_OFFSET_END,
+};
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name: the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a
register in struct + * pt_regs from its name. If the name is invalid, this returns -EINVAL; + */ +int regs_query_register_offset(const char *name) +{ + const struct pt_regs_offset *roff; + + for (roff = regoffset_table; roff->name != NULL; roff++) + if (!strcmp(roff->name, name)) + return roff->offset; + return -EINVAL; +} diff --git a/arch/sw_64/kernel/relocate.c b/arch/sw_64/kernel/relocate.c new file mode 100644 index 000000000000..36b16d84d5ab --- /dev/null +++ b/arch/sw_64/kernel/relocate.c @@ -0,0 +1,313 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Support for Kernel relocation at boot time + * + * Copyright (C) 2019 He Sheng + * Authors: He Sheng (hesheng05@gmail.com) + */ +#include <asm/hmcall.h> +#include <asm/setup.h> +#include <asm/sections.h> +#include <linux/mm_types.h> +#include <linux/elf.h> +#include <linux/kernel.h> +#include <linux/libfdt.h> +#include <linux/of_fdt.h> +#include <linux/sched.h> +#include <linux/start_kernel.h> +#include <linux/string.h> +#include <linux/printk.h> +#include <linux/notifier.h> +#include <linux/mm.h> + +#define INITRD_ADDR 0x3000000UL +#define KTEXT_MAX 0xffffffffa0000000UL +#define RELOCATED(x) ((void *)((unsigned long)x + offset)) + +extern unsigned long _got_start[]; +extern unsigned long _got_end[]; +extern char pre_start_kernel[]; + +extern unsigned int _relocation_start[]; /* End kernel image / start relocation table */ +extern unsigned int _relocation_end[]; /* End relocation table */ + +extern unsigned long __start___ex_table; /* Start exception table */ +extern unsigned long __stop___ex_table; /* End exception table */ +extern union thread_union init_thread_union; + +extern void __weak plat_fdt_relocated(void *new_location); + +/* + * This function may be defined for a platform to perform any post-relocation + * fixup necessary. 
+ *
+ * Return non-zero to abort relocation
+ */
+int __weak plat_post_relocation(long offset)
+{
+	return 0;
+}
+
+
+static void __init sync_icache(void)
+{
+	// IC_FLUSH
+	imb();
+}
+
+static int __init apply_r_sw64_refquad(unsigned long *loc_orig, unsigned long *loc_new, unsigned int offset)
+{
+	*(unsigned long *)loc_new += offset;
+
+	return 0;
+}
+
+static int (*reloc_handlers_rel[]) (unsigned long *, unsigned long *, unsigned int) __initdata = {
+	[R_SW64_REFQUAD] = apply_r_sw64_refquad,
+};
+
+int __init do_relocations(void *kbase_old, void *kbase_new, unsigned int offset)
+{
+	unsigned int *r;
+	unsigned long *loc_orig;
+	unsigned long *loc_new;
+	int type;
+	int res;
+
+	for (r = _relocation_start; r < _relocation_end; r++) {
+		/* Sentinel for last relocation */
+		if (*r == 0)
+			break;
+
+		type = (*r >> 24) & 0xff;
+		loc_orig = kbase_old + ((*r & 0x00ffffff) << 2);
+		loc_new = RELOCATED(loc_orig);
+
+		if (reloc_handlers_rel[type] == NULL) {
+			/* Unsupported relocation */
+			pr_err("Unhandled relocation type %d at 0x%pK\n",
+				type, loc_orig);
+			return -ENOEXEC;
+		}
+
+		res = reloc_handlers_rel[type](loc_orig, loc_new, offset);
+		if (res)
+			return res;
+	}
+
+	return 0;
+}
+
+static int __init relocate_got(unsigned int offset)
+{
+	unsigned long *got_start, *got_end, *e;
+
+	got_start = RELOCATED(&_got_start);
+	got_end = RELOCATED(&_got_end);
+
+	for (e = got_start; e < got_end; e++)
+		*e += offset;
+
+	return 0;
+}
+
+#ifdef CONFIG_RANDOMIZE_BASE
+
+static inline __init unsigned long rotate_xor(unsigned long hash,
+					const void *area, size_t size)
+{
+	size_t i;
+	unsigned long start, *ptr;
+	/* Make sure start is 8 byte aligned */
+	start = ALIGN((unsigned long)area, 8);
+	size -= (start - (unsigned long)area);
+	ptr = (unsigned long *) start;
+	for (i = 0; i < size / sizeof(hash); i++) {
+		/* Rotate by odd number of bits and XOR. */
+		hash = (hash << ((sizeof(hash) * 8) - 7)) | (hash >> 7);
+		hash ^= ptr[i];
+	}
+	return hash;
+}
+
+static inline __init unsigned long get_random_boot(void)
+{
+	unsigned long entropy = random_get_entropy();
+	unsigned long hash = 0;
+
+	/* Attempt to create a simple but unpredictable starting entropy. */
+	hash = rotate_xor(hash, linux_banner, strlen(linux_banner));
+
+	/* Add in any runtime entropy we can get */
+	hash = rotate_xor(hash, &entropy, sizeof(entropy));
+
+	return hash;
+}
+
+static inline __init bool kaslr_disabled(void)
+{
+	char *str;
+
+	str = strstr(COMMAND_LINE, "nokaslr");
+	if (str == COMMAND_LINE || (str > COMMAND_LINE && *(str - 1) == ' '))
+		return true;
+
+	return false;
+}
+
+static unsigned long __init determine_relocation_offset(void)
+{
+	/* Choose a new address for the kernel */
+	unsigned long kernel_length;
+	unsigned long offset;
+
+	if (kaslr_disabled())
+		return 0;
+
+	kernel_length = (unsigned long)_end - (unsigned long)(&_text);
+
+	/* TODO: offset is 64K align. maybe 8KB align is okay. */
+	offset = get_random_boot() << 16;
+	offset &= (CONFIG_RANDOMIZE_BASE_MAX_OFFSET - 1);
+	if (offset < kernel_length)
+		offset += ALIGN(kernel_length, 0x10000);
+
+	/* TODO: 119MB is for test */
+	offset = (119 << 20);
+	if ((KTEXT_MAX - (unsigned long)_end) < offset)
+		offset = 0;
+
+	// TODO: new location should not overlap initrd
+
+	return offset;
+}
+
+#else
+
+static inline unsigned long __init determine_relocation_offset(void)
+{
+	/*
+	 * Choose a new address for the kernel
+	 * For now we'll hard code the destination offset.
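+	 *
+	 * [Editor's note on the CONFIG_RANDOMIZE_BASE variant above: the
+	 * random offset is built as get_random_boot() << 16, i.e. 64 KB
+	 * aligned, masked to CONFIG_RANDOMIZE_BASE_MAX_OFFSET and pushed
+	 * past the kernel image if it would land inside it;
+	 * relocation_offset_valid() below re-checks both the 64 KB
+	 * alignment (loc_new & 0x0000ffff) and the no-overlap condition.]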
+ */ + return 0; +} + +#endif + +static inline int __init relocation_offset_valid(unsigned long offset) +{ + unsigned long loc_new = (unsigned long)_text + offset; + + if (loc_new & 0x0000ffff) { + /* Inappropriately aligned new location */ + return 0; + } + if (loc_new < (unsigned long)&_end) { + /* New location overlaps original kernel */ + return 0; + } + return 1; +} + +unsigned int __init relocate_kernel(void) +{ + void *loc_new; + unsigned long kernel_length; + unsigned long bss_length; + unsigned int offset = 0; + int res = 1; + + kernel_length = (unsigned long)(&_relocation_start) - (long)(&_text); + bss_length = (unsigned long)&__bss_stop - (long)&__bss_start; + + offset = determine_relocation_offset(); + + /* Reset the command line now so we don't end up with a duplicate */ + //arcs_cmdline[0] = '\0'; + + /* Sanity check relocation address */ + if (offset && relocation_offset_valid(offset)) { + + loc_new = RELOCATED(&_text); + /* Copy the kernel to it's new location */ + memcpy(loc_new, &_text, kernel_length); + + /* Perform relocations on the new kernel */ + res = do_relocations(&_text, loc_new, offset); + if (res < 0) + goto out; + + /* Sync the caches ready for execution of new kernel */ + sync_icache(); + + res = relocate_got(offset); + if (res < 0) + goto out; + + /* + * The original .bss has already been cleared, and + * some variables such as command line parameters + * stored to it so make a copy in the new location. + */ + memcpy(RELOCATED(&__bss_start), &__bss_start, bss_length); + + /* + * Last chance for the platform to abort relocation. + * This may also be used by the platform to perform any + * initialisation required now that the new kernel is + * resident in memory and ready to be executed. + */ + if (plat_post_relocation(offset)) + goto out; + + /* The current thread is now within the relocated image */ + __current_thread_info = RELOCATED(&init_thread_union); + + /* Return the new kernel's offset */ + //printk("loc_new:%p, start_kernel: %p, gp:%p\n", loc_new, kernel_entry, kgp); + return offset; + } +out: + return 0; +} + +/* + * Show relocation information on panic. + */ +void show_kernel_relocation(const char *level) +{ + unsigned long offset; + + offset = __pa_symbol(_text) - __pa_symbol(_TEXT_START); + + if (IS_ENABLED(CONFIG_RELOCATABLE) && offset > 0) { + printk(level); + pr_cont("Kernel relocated by 0x%pK\n", (void *)offset); + pr_cont(" .text @ 0x%pK\n", _text); + pr_cont(" .data @ 0x%pK\n", _sdata); + pr_cont(" .bss @ 0x%pK\n", __bss_start); + } +} + +static int kernel_location_notifier_fn(struct notifier_block *self, + unsigned long v, void *p) +{ + show_kernel_relocation(KERN_EMERG); + return NOTIFY_DONE; +} + +static struct notifier_block kernel_location_notifier = { + .notifier_call = kernel_location_notifier_fn +}; + +static int __init register_kernel_offset_dumper(void) +{ + atomic_notifier_chain_register(&panic_notifier_list, + &kernel_location_notifier); + return 0; +} +device_initcall(register_kernel_offset_dumper); diff --git a/arch/sw_64/kernel/relocate_kernel.S b/arch/sw_64/kernel/relocate_kernel.S new file mode 100644 index 000000000000..f1a160636212 --- /dev/null +++ b/arch/sw_64/kernel/relocate_kernel.S @@ -0,0 +1,176 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * relocate_kernel.S for kexec + * Created by hesheng05@gmail.com Jul 2 2019 + * + * This source code is licensed under the GNU General Public License, + * Version 2. See the file COPYING for more details. 
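+ *
+ * [Editor's note on the entry encoding consumed by process_entry below,
+ * which follows the generic kexec indirection-page flags: bit 0x1 marks
+ * a destination page, 0x2 an indirection page, 0x4 the final entry and
+ * 0x8 a source page to copy, with the flag bit masked off (bic) to
+ * recover the page address.]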
+ */ + +#include <asm/regdef.h> +#include <asm/page.h> + + .align 3 + .globl relocate_new_kernel + .ent relocate_new_kernel + +relocate_new_kernel: + .prologue 0 + ldl a0, arg0 + ldl a1, arg1 + ldl a2, arg2 + ldl a3, arg3 + + ldl s0, kexec_indirection_page + ldl s1, kexec_start_address + +process_entry: + ldl s2, 0(s0) + addl s0, 8, s0 + + /* + * In case of a kdump/crash kernel, the indirection page is not + * populated as the kernel is directly copied to a reserved location + */ + beq s2, done + + /* destination page */ + and s2, 0x1, s3 + beq s3, 1f + bic s2, 0x1, s4/* store destination addr in s4 */ + br $31, process_entry + +1: + /* indirection page, update s0*/ + and s2, 0x2, s3 + beq s3, 1f + bic s2, 0x2, s0 + br $31, process_entry + +1: + /* done page */ + and s2, 0x4, s3 + beq s3, 1f + br $31, done +1: + /* source page */ + and s2, 0x8, s3 + beq s3, process_entry + bic s2, 0x8, s2 + ldi s6, 0x1 + sll s6, (PAGE_SHIFT - 3), s6 + +copy_word: + /* copy page word by word */ + ldl s5, 0(s2) + stl s5, 0(s4) + addl s4, 8, s4 + addl s2, 8, s2 + subl s6, 1, s6 + beq s6, process_entry + br $31, copy_word + br $31, process_entry + +done: +#ifdef CONFIG_CRASH_SMP /* unsupported now!!!! */ + /* kexec_flag reset is signal to other CPUs what kernel + * was moved to it's location. Note - we need relocated address + * of kexec_flag. + */ + + br ra, 1f +1: mov ra, t1 + ldi t2, 1b + ldi t0, kexec_flag + subl t0, t2, t0 + addl t1, t0, t0 + stl zero, 0(t0) +#endif + memb + jmp ra, (s1) + .end relocate_new_kernel + .size relocate_new_kernel, .-relocate_new_kernel + +#ifdef CONFIG_CRASH_SMP + /* + * Other CPUs should wait until code is relocated and + * then start at entry (?) point. + */ + .align 3 + .globl kexec_smp_wait + .ent kexec_smp_wait +kexec_smp_wait: + ldl a0, s_arg0 + ldl a1, s_arg1 + ldl a2, s_arg2 + ldl a3, s_arg3 + ldl s1, kexec_start_address + + /* Non-relocated address works for args and kexec_start_address (old + * kernel is not overwritten). But we need relocated address of + * kexec_flag. + */ + + bsr ra, 1f +1: mov ra, t1 + ldi t2, 1b + ldi t0, kexec_flag + subl t0, t2, t0 + addl t1, t0, t0 + +1: stl s0, 0(t0) + bne s0, 1b + memb + jmp ra, (s1) + .end kexec_smp_wait + .size kexec_smp_wait, .-kexec_smp_wait +#endif + + .align 3 + + /* All parameters to new kernel are passed in registers a0-a3. + * kexec_args[0..3] are uses to prepare register values. + */ + +kexec_args: + .globl kexec_args +arg0: .quad 0x0 +arg1: .quad 0x0 +arg2: .quad 0x0 +arg3: .quad 0x0 + .size kexec_args, 8*4 + +#ifdef CONFIG_CRASH_SMP + /* + * Secondary CPUs may have different kernel parameters in + * their registers a0-a3. secondary_kexec_args[0..3] are used + * to prepare register values. 
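+ *
+ * [Editor's note: the pc-relative sequence above computes the
+ * *relocated* address of kexec_flag -- "bsr ra, 1f" captures the
+ * runtime pc, "ldi t2, 1b" and "ldi t0, kexec_flag" give the link-time
+ * addresses, and t1 + (kexec_flag - 1b) is where the flag actually
+ * lives after this code has been copied.]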
+ */ +secondary_kexec_args: + .globl secondary_kexec_args +s_arg0: .quad 0x0 +s_arg1: .quad 0x0 +s_arg2: .quad 0x0 +s_arg3: .quad 0x0 + .size secondary_kexec_args, 8*4 + +kexec_flag: + .quad 0x1 +#endif + +kexec_start_address: + .globl kexec_start_address + .quad 0x0 + .size kexec_start_address, 8 + +kexec_indirection_page: + .globl kexec_indirection_page + .quad 0 + .size kexec_indirection_page, 8 + +relocate_new_kernel_end: + +relocate_new_kernel_size: + .global relocate_new_kernel_size + .quad relocate_new_kernel_end - relocate_new_kernel + .size relocate_new_kernel_size, 8 diff --git a/arch/sw_64/kernel/segvdbg.c b/arch/sw_64/kernel/segvdbg.c new file mode 100644 index 000000000000..aee4b3863072 --- /dev/null +++ b/arch/sw_64/kernel/segvdbg.c @@ -0,0 +1,32 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 Zhi Tongze + * Author: Zhi Tongze + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + */ + +#include <linux/kernel.h> +#include <linux/debugfs.h> +#include <linux/seq_file.h> +#include <linux/uaccess.h> +#include <asm/debug.h> + +extern bool segv_debug_enabled; + +static int __init segv_debug_init(void) +{ + struct dentry *segvdbg; + + if (!sw64_debugfs_dir) + return -ENODEV; + + segvdbg = debugfs_create_bool("segv_debug", 0644, + sw64_debugfs_dir, &segv_debug_enabled); + if (!segvdbg) + return -ENOMEM; + return 0; +} +late_initcall(segv_debug_init); diff --git a/arch/sw_64/kernel/setup.c b/arch/sw_64/kernel/setup.c new file mode 100644 index 000000000000..cc33a6f3b4f9 --- /dev/null +++ b/arch/sw_64/kernel/setup.c @@ -0,0 +1,1047 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * linux/arch/sw/kernel/setup.c + * + * Copyright (C) 1995 Linus Torvalds + */ + +/* + * Bootup setup stuff. + */ + +#include <linux/sched.h> +#include <linux/kernel.h> +#include <linux/mm.h> +#include <linux/stddef.h> +#include <linux/unistd.h> +#include <linux/ptrace.h> +#include <linux/slab.h> +#include <linux/user.h> +#include <linux/screen_info.h> +#include <linux/delay.h> +#include <linux/kexec.h> +#include <linux/console.h> +#include <linux/cpu.h> +#include <linux/errno.h> +#include <linux/init.h> +#include <linux/string.h> +#include <linux/ioport.h> +#include <linux/platform_device.h> +#include <linux/memblock.h> +#include <linux/pci.h> +#include <linux/seq_file.h> +#include <linux/root_dev.h> +#include <linux/initrd.h> +#include <linux/eisa.h> +#include <linux/pfn.h> +#ifdef CONFIG_MAGIC_SYSRQ +#include <linux/sysrq.h> +#include <linux/reboot.h> +#endif +#ifdef CONFIG_DEBUG_FS +#include <linux/debugfs.h> +#endif +#include <linux/notifier.h> +#include <linux/log2.h> +#include <linux/export.h> +#include <linux/of_fdt.h> +#include <linux/of_platform.h> +#include <linux/uaccess.h> +#include <linux/cma.h> +#include <linux/genalloc.h> +#include <linux/acpi.h> +#include <asm/setup.h> +#include <asm/smp.h> +#include <asm/sw64_init.h> +#include <asm/pgtable.h> +#include <asm/dma.h> +#include <asm/mmu_context.h> +#include <asm/console.h> +#include <asm/core.h> +#include <asm/hw_init.h> +#include <asm/mmzone.h> +#include <asm/memory.h> +#include <asm/efi.h> +#include <asm/kvm_cma.h> + +#include "proto.h" +#include "pci_impl.h" + +#undef DEBUG_DISCONTIG +#ifdef DEBUG_DISCONTIG +#define DBGDCONT(args...) pr_debug(args) +#else +#define DBGDCONT(args...) 
+#endif
+
+DEFINE_PER_CPU(unsigned long, hard_node_id) = { 0 };
+
+#if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE)
+struct cma *sw64_kvm_cma;
+EXPORT_SYMBOL(sw64_kvm_cma);
+
+static phys_addr_t size_cmdline;
+static phys_addr_t base_cmdline;
+
+struct gen_pool *sw64_kvm_pool;
+EXPORT_SYMBOL(sw64_kvm_pool);
+#endif
+
+static inline int phys_addr_valid(unsigned long addr)
+{
+	/*
+	 * At this point memory probe has not been done such that max_pfn
+	 * and other physical address variables cannot be used, so let's
+	 * roughly judge physical address based on arch specific bit.
+	 */
+	return !(addr >> (cpu_desc.pa_bits - 1));
+}
+
+extern struct atomic_notifier_head panic_notifier_list;
+static int sw64_panic_event(struct notifier_block *, unsigned long, void *);
+static struct notifier_block sw64_panic_block = {
+	sw64_panic_event,
+	NULL,
+	INT_MAX /* try to do it first */
+};
+
+/* the value is IOR: CORE_ONLINE */
+cpumask_t core_start = CPU_MASK_NONE;
+
+static struct resource data_resource = {
+	.name = "Kernel data",
+	.start = 0,
+	.end = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM
+};
+
+static struct resource code_resource = {
+	.name = "Kernel code",
+	.start = 0,
+	.end = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM
+};
+
+static struct resource bss_resource = {
+	.name = "Kernel bss",
+	.start = 0,
+	.end = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM
+};
+
+/* A collection of per-processor data. */
+struct cpuinfo_sw64 cpu_data[NR_CPUS];
+EXPORT_SYMBOL(cpu_data);
+
+struct cpu_desc_t cpu_desc;
+struct socket_desc_t socket_desc[MAX_NUMSOCKETS];
+int memmap_nr;
+struct memmap_entry memmap_map[MAX_NUMMEMMAPS];
+bool memblock_initialized;
+
+cpumask_t cpu_offline = CPU_MASK_NONE;
+
+static char command_line[COMMAND_LINE_SIZE] __initdata;
+#ifdef CONFIG_CMDLINE_BOOL
+static char builtin_cmdline[COMMAND_LINE_SIZE] __initdata = CONFIG_CMDLINE;
+#endif
+
+/* boot_params */
+struct boot_params *sunway_boot_params = (struct boot_params *) (PARAM + 0x100);
+
+/*
+ * The format of "screen_info" is strange, and due to early
+ * i386-setup code. This is just enough to make the console
+ * code think we're on a VGA color display.
+ */
+
+struct screen_info screen_info = {
+	.orig_x = 0,
+	.orig_y = 25,
+	.orig_video_cols = 80,
+	.orig_video_lines = 25,
+	.orig_video_isVGA = 1,
+	.orig_video_points = 16
+};
+EXPORT_SYMBOL(screen_info);
+
+#ifdef CONFIG_KEXEC
+
+void *kexec_control_page;
+
+#define KTEXT_MAX KERNEL_IMAGE_SIZE
+
+static void __init kexec_control_page_init(void)
+{
+	phys_addr_t addr;
+
+	addr = memblock_alloc_base(KEXEC_CONTROL_PAGE_SIZE, PAGE_SIZE, KTEXT_MAX);
+	kexec_control_page = (void *)(__START_KERNEL_map + addr);
+}
+
+/*
+ * reserve_crashkernel() - reserves memory area for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by a dump capture kernel when
+ * the primary kernel is crashing.
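+ *
+ * [Editor's note -- an illustrative, hypothetical invocation: booting
+ * with "crashkernel=256M@1G" makes parse_crashkernel() report
+ * crash_size = 256 MB and crash_base = 1 GB, which this function then
+ * reserves in memblock and publishes as crashk_res.]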
+ */
+static void __init reserve_crashkernel(void)
+{
+	unsigned long long crash_size, crash_base;
+	int ret;
+
+	ret = parse_crashkernel(boot_command_line, mem_desc.size,
+			&crash_size, &crash_base);
+	if (ret || !crash_size)
+		return;
+
+	if (!memblock_is_region_memory(crash_base, crash_size))
+		memblock_add(crash_base, crash_size);
+
+	ret = memblock_reserve(crash_base, crash_size);
+	if (ret < 0) {
+		pr_warn("crashkernel reservation failed - memory is in use [mem %#018llx-%#018llx]\n",
+			crash_base, crash_base + crash_size - 1);
+		return;
+	}
+
+	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
+		(unsigned long)(crash_size >> 20),
+		(unsigned long)(crash_base >> 20),
+		(unsigned long)(mem_desc.size >> 20));
+
+	ret = add_memmap_region(crash_base, crash_size, memmap_crashkernel);
+	if (ret)
+		pr_warn("Add crash kernel area [mem %#018llx-%#018llx] to memmap region failed.\n",
+			crash_base, crash_base + crash_size - 1);
+
+	if (crash_base >= KERNEL_IMAGE_SIZE)
+		pr_warn("Crash base should be less than %#x\n", KERNEL_IMAGE_SIZE);
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+	insert_resource(&iomem_resource, &crashk_res);
+}
+#else /* !defined(CONFIG_KEXEC) */
+static void __init reserve_crashkernel(void) {}
+static void __init kexec_control_page_init(void) {}
+#endif /* !defined(CONFIG_KEXEC) */
+
+/*
+ * I/O resources inherited from PeeCees. Except for perhaps the
+ * turbochannel SWs, everyone has these on some sort of SuperIO chip.
+ *
+ * ??? If this becomes less standard, move the struct out into the
+ * machine vector.
+ */
+
+static void __init
+reserve_std_resources(void)
+{
+	static struct resource standard_io_resources[] = {
+		{ .name = "rtc", .start = -1, .end = -1 },
+		{ .name = "dma1", .start = 0x00, .end = 0x1f },
+		{ .name = "pic1", .start = 0x20, .end = 0x3f },
+		{ .name = "timer", .start = 0x40, .end = 0x5f },
+		{ .name = "keyboard", .start = 0x60, .end = 0x6f },
+		{ .name = "dma page reg", .start = 0x80, .end = 0x8f },
+		{ .name = "pic2", .start = 0xa0, .end = 0xbf },
+		{ .name = "dma2", .start = 0xc0, .end = 0xdf },
+	};
+
+	struct resource *io = &ioport_resource;
+	size_t i;
+
+	if (hose_head) {
+		struct pci_controller *hose;
+
+		for (hose = hose_head; hose; hose = hose->next)
+			if (hose->index == 0) {
+				io = hose->io_space;
+				break;
+			}
+	}
+
+	/* Fix up for the Jensen's queer RTC placement. */
+	standard_io_resources[0].start = RTC_PORT(0);
+	standard_io_resources[0].end = RTC_PORT(0) + 0x10;
+
+	for (i = 0; i < ARRAY_SIZE(standard_io_resources); ++i)
+		request_resource(io, standard_io_resources+i);
+}
+
+static int __init parse_memmap_one(char *p)
+{
+	char *oldp;
+	u64 start_at, mem_size;
+	int ret;
+
+	if (!p)
+		return -EINVAL;
+
+	if (!strncmp(p, "exactmap", 8)) {
+		pr_err("\"memmap=exactmap\" not valid on sw64\n");
+		return 0;
+	}
+
+	oldp = p;
+	mem_size = memparse(p, &p);
+	if (p == oldp)
+		return -EINVAL;
+
+	if (*p == '@') {
+		pr_err("\"memmap=nn@ss\" invalid on sw64\n");
+	} else if (*p == '#') {
+		pr_err("\"memmap=nn#ss\" (force ACPI data) invalid on sw64\n");
+	} else if (*p == '$') {
+		start_at = memparse(p + 1, &p);
+		ret = add_memmap_region(start_at, mem_size, memmap_reserved);
+		if (ret)
+			return ret;
+	} else {
+		return -EINVAL;
+	}
+	return *p == '\0' ?
0 : -EINVAL; +} + +static int __init setup_memmap(char *str) +{ + while (str) { + char *k = strchr(str, ','); + + if (k) + *k++ = 0; + + parse_memmap_one(str); + str = k; + } + + return 0; +} +early_param("memmap", setup_memmap); + +static int __init setup_cpuoffline(char *p) +{ + cpulist_parse(p, &cpu_offline); + cpumask_clear_cpu(0, &cpu_offline); + return 0; +} +early_param("cpuoffline", setup_cpuoffline); + +#ifdef CONFIG_BLK_DEV_INITRD +static void * __init move_initrd(unsigned long mem_limit) +{ + void *start; + unsigned long size; + + size = initrd_end - initrd_start; + start = memblock_alloc_from(PAGE_ALIGN(size), PAGE_SIZE, 0); + if (!start || __pa(start) + size > mem_limit) { + initrd_start = initrd_end = 0; + return NULL; + } + memmove(start, (void *)initrd_start, size); + initrd_start = (unsigned long)start; + initrd_end = initrd_start + size; + pr_info("initrd moved to 0x%px\n", start); + return start; +} +#else +static void * __init move_initrd(unsigned long mem_limit) +{ + return NULL; +} +#endif + +static int __init memmap_range_valid(phys_addr_t base, phys_addr_t size) +{ + if (phys_to_virt(base + size - 1) < phys_to_virt(PFN_PHYS(max_low_pfn))) + return true; + else + return false; +} + +void __init process_memmap(void) +{ + static int i; // Make it static so we won't start over again every time. + int ret; + phys_addr_t base, size; + + if (!memblock_initialized) + return; + + for (; i < memmap_nr; i++) { + base = memmap_map[i].addr; + size = memmap_map[i].size; + switch (memmap_map[i].type) { + case memmap_reserved: + if (!memmap_range_valid(base, size)) { + pr_err("reserved memmap region [mem %#018llx-%#018llx] extends beyond end of memory (%#018llx)\n", + base, base + size - 1, PFN_PHYS(max_low_pfn)); + } else { + pr_info("reserved memmap region [mem %#018llx-%#018llx]\n", + base, base + size - 1); + ret = memblock_remove(base, size); + if (ret) + pr_err("reserve memmap region [mem %#018llx-%#018llx] failed\n", + base, base + size - 1); + } + break; + case memmap_pci: + if (!memmap_range_valid(base, size)) { + pr_info("pci memmap region [mem %#018llx-%#018llx] extends beyond end of memory (%#018llx)\n", + base, base + size - 1, PFN_PHYS(max_low_pfn)); + } else { + pr_info("pci memmap region [mem %#018llx-%#018llx]\n", + base, base + size - 1); + ret = memblock_remove(base, size); + if (ret) + pr_err("reserve memmap region [mem %#018llx-%#018llx] failed\n", + base, base + size - 1); + } + break; + case memmap_initrd: + if (!memmap_range_valid(base, size)) { + base = (unsigned long) move_initrd(PFN_PHYS(max_low_pfn)); + if (!base) { + pr_err("initrd memmap region [mem %#018llx-%#018llx] extends beyond end of memory (%#018llx)\n", + base, base + size - 1, PFN_PHYS(max_low_pfn)); + } else { + memmap_map[i].addr = base; + pr_info("initrd memmap region [mem %#018llx-%#018llx]\n", + base, base + size - 1); + ret = memblock_reserve(base, size); + if (ret) + pr_err("reserve memmap region [mem %#018llx-%#018llx] failed\n", + base, base + size - 1); + } + } else { + pr_info("initrd memmap region [mem %#018llx-%#018llx]\n", base, base + size - 1); + ret = memblock_reserve(base, size); + if (ret) + pr_err("reserve memmap region [mem %#018llx-%#018llx] failed\n", + base, base + size - 1); + } + break; + case memmap_kvm: + case memmap_crashkernel: + /* kvm and crashkernel are handled elsewhere, skip */ + break; + case memmap_acpi: + pr_err("ACPI memmap region is not supported.\n"); + break; + case memmap_use: + pr_err("Force usage memmap region is not supported.\n"); + break; + case 
memmap_protected: + pr_err("Protected memmap region is not supported.\n"); + break; + default: + pr_err("Unknown type of memmap region.\n"); + } + } +} + +int __init add_memmap_region(u64 addr, u64 size, enum memmap_types type) +{ + if (memmap_nr >= ARRAY_SIZE(memmap_map)) { + pr_err("Ooops! Too many entries in the memory map!\n"); + return -EPERM; + } + + if (addr + size <= addr) { + pr_warn("Trying to add an invalid memory region, skipped\n"); + return -EINVAL; + } + + memmap_map[memmap_nr].addr = addr; + memmap_map[memmap_nr].size = size; + memmap_map[memmap_nr].type = type; + memmap_nr++; + + process_memmap(); + + return 0; +} + +static struct resource* __init +insert_ram_resource(u64 start, u64 end, bool reserved) +{ + struct resource *res = + kzalloc(sizeof(struct resource), GFP_ATOMIC); + if (!res) + return NULL; + if (reserved) { + res->name = "reserved"; + res->flags = IORESOURCE_MEM; + } else { + res->name = "System RAM"; + res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + } + res->start = start; + res->end = end; + if (insert_resource(&iomem_resource, res)) { + kfree(res); + return NULL; + } + return res; +} + +static int __init request_standard_resources(void) +{ + int i; + struct memblock_region *mblk; + + extern char _text[], _etext[]; + extern char _sdata[], _edata[]; + extern char __bss_start[], __bss_stop[]; + + for_each_mem_region(mblk) { + insert_ram_resource(mblk->base, mblk->base + mblk->size - 1, 0); + } + + for (i = 0; i < memmap_nr; i++) { + switch (memmap_map[i].type) { + case memmap_crashkernel: + break; + default: + insert_ram_resource(memmap_map[i].addr, + memmap_map[i].addr + memmap_map[i].size - 1, 1); + } + } + + code_resource.start = __pa_symbol(_text); + code_resource.end = __pa_symbol(_etext)-1; + data_resource.start = __pa_symbol(_sdata); + data_resource.end = __pa_symbol(_edata)-1; + bss_resource.start = __pa_symbol(__bss_start); + bss_resource.end = __pa_symbol(__bss_stop)-1; + + insert_resource(&iomem_resource, &code_resource); + insert_resource(&iomem_resource, &data_resource); + insert_resource(&iomem_resource, &bss_resource); + + return 0; +} +subsys_initcall(request_standard_resources); + +#ifdef CONFIG_NUMA +extern void cpu_set_node(void); +#endif + +static void __init show_socket_mem_layout(void) +{ + int i; + phys_addr_t base, size, end; + + base = 0; + + pr_info("Socket memory layout:\n"); + for (i = 0; i < MAX_NUMSOCKETS; i++) { + if (socket_desc[i].is_online) { + size = socket_desc[i].socket_mem; + end = base + size - 1; + pr_info("Socket %d: [mem %#018llx-%#018llx], size %llu\n", + i, base, end, size); + base = end + 1; + } + } + pr_info("Reserved memory size for Socket 0: %#lx\n", NODE0_START); +} + +int page_is_ram(unsigned long pfn) +{ + pfn <<= PAGE_SHIFT; + + return pfn >= mem_desc.base && pfn < (mem_desc.base + mem_desc.size); +} + +static int __init topology_init(void) +{ + int i; + +#ifdef CONFIG_NUMA + for_each_online_node(i) + register_one_node(i); +#endif + + for_each_possible_cpu(i) { + struct cpu *p = kzalloc(sizeof(*p), GFP_KERNEL); + + if (!p) + return -ENOMEM; +#ifdef CONFIG_HOTPLUG_CPU + if (i != 0) + p->hotpluggable = 1; +#endif + register_cpu(p, i); + } + + return 0; +} +subsys_initcall(topology_init); + +static void __init setup_machine_fdt(void) +{ +#ifdef CONFIG_USE_OF + void *dt_virt; + const char *name; + unsigned long phys_addr; + + /* Give a chance to select kernel builtin DTB firstly */ + if (IS_ENABLED(CONFIG_SW64_BUILTIN_DTB)) + dt_virt = (void *)__dtb_start; + else + dt_virt = (void 
*)sunway_boot_params->dtb_start; + + phys_addr = __phys_addr((unsigned long)dt_virt); + if (!phys_addr_valid(phys_addr) || + !early_init_dt_scan(dt_virt)) { + pr_crit("\n" + "Error: invalid device tree blob at virtual address %px\n" + "The dtb must be 8-byte aligned and must not exceed 2 MB in size\n" + "\nPlease check your bootloader.", + dt_virt); + + while (true) + cpu_relax(); + } + + name = of_flat_dt_get_machine_name(); + if (!name) + return; + + pr_info("Machine model: %s\n", name); +#else + pr_info("Kernel disable device tree support.\n"); + return; +#endif +} + +void __init device_tree_init(void) +{ + unflatten_and_copy_device_tree(); + sunway_boot_params->dtb_start = (__u64)initial_boot_params; +} + +static void __init setup_cpu_info(void) +{ + int i; + struct cache_desc *c; + unsigned long val; + + val = cpuid(GET_TABLE_ENTRY, 0); + cpu_desc.model = CPUID_MODEL(val); + cpu_desc.family = CPUID_FAMILY(val); + cpu_desc.chip_var = CPUID_CHIP_VAR(val); + cpu_desc.arch_var = CPUID_ARCH_VAR(val); + cpu_desc.arch_rev = CPUID_ARCH_REV(val); + cpu_desc.pa_bits = CPUID_PA_BITS(val); + cpu_desc.va_bits = CPUID_VA_BITS(val); + cpu_desc.run_mode = HOST_MODE; + + if (*(unsigned long *)MMSIZE) + cpu_desc.run_mode = GUEST_MODE; + + for (i = 0; i < VENDOR_ID_MAX; i++) { + val = cpuid(GET_VENDOR_ID, i); + memcpy(cpu_desc.vendor_id + (i * 8), &val, 8); + } + + for (i = 0; i < MODEL_MAX; i++) { + val = cpuid(GET_MODEL, i); + memcpy(cpu_desc.model_id + (i * 8), &val, 8); + } + + cpu_desc.frequency = cpuid(GET_CPU_FREQ, 0) * 1000UL * 1000UL; + + for (i = 0; i < NR_CPUS; i++) { + c = &(cpu_data[i].icache); + val = cpuid(GET_CACHE_INFO, L1_ICACHE); + c->size = CACHE_SIZE(val); + c->linesz = 1 << (CACHE_LINE_BITS(val)); + c->sets = 1 << (CACHE_INDEX_BITS(val)); + c->ways = c->size / c->sets / c->linesz; + + c = &(cpu_data[i].dcache); + val = cpuid(GET_CACHE_INFO, L1_DCACHE); + c->size = CACHE_SIZE(val); + c->linesz = 1 << (CACHE_LINE_BITS(val)); + c->sets = 1 << (CACHE_INDEX_BITS(val)); + c->ways = c->size / c->sets / c->linesz; + + c = &(cpu_data[i].scache); + val = cpuid(GET_CACHE_INFO, L2_CACHE); + c->size = CACHE_SIZE(val); + c->linesz = 1 << (CACHE_LINE_BITS(val)); + c->sets = 1 << (CACHE_INDEX_BITS(val)); + c->ways = c->size / c->sets / c->linesz; + + c = &(cpu_data[i].tcache); + val = cpuid(GET_CACHE_INFO, L3_CACHE); + c->size = CACHE_SIZE(val); + c->linesz = 1 << (CACHE_LINE_BITS(val)); + c->sets = 1 << (CACHE_INDEX_BITS(val)); + c->ways = c->size / c->sets / c->linesz; + } +} + +static void __init setup_socket_info(void) +{ + int i; + int numsockets = sw64_chip->get_cpu_num(); + + memset(socket_desc, 0, MAX_NUMSOCKETS * sizeof(struct socket_desc_t)); + + for (i = 0; i < numsockets; i++) { + socket_desc[i].is_online = 1; + if (sw64_chip_init->early_init.get_node_mem) + socket_desc[i].socket_mem = sw64_chip_init->early_init.get_node_mem(i); + } +} + +#ifdef CONFIG_BLK_DEV_INITRD +static void __init reserve_mem_for_initrd(void) +{ + int ret; + + initrd_start = sunway_boot_params->initrd_start; + if (initrd_start) { + initrd_start = __pa(initrd_start) + PAGE_OFFSET; + initrd_end = initrd_start + sunway_boot_params->initrd_size; + pr_info("Initial ramdisk at: 0x%px (%llu bytes)\n", + (void *)initrd_start, sunway_boot_params->initrd_size); + + ret = add_memmap_region(__pa(initrd_start), initrd_end - initrd_start, memmap_initrd); + if (ret) + pr_err("Add initrd area [mem %#018lx-%#018lx] to memmap region failed.\n", + __pa(initrd_start), __pa(initrd_end - 1)); + } +} +#endif /* CONFIG_BLK_DEV_INITRD */ 
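+
+/*
+ * [Editor's note -- illustrative arithmetic only, not part of the
+ * original patch: setup_cpu_info() above decodes each cache's geometry
+ * from cpuid(GET_CACHE_INFO, ...) and derives the associativity as
+ * ways = size / sets / linesz; e.g. a hypothetical 32 KB L1 with
+ * 64-byte lines and 128 sets gives 32768 / 128 / 64 = 4 ways.]
+ */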
+
+#if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE)
+static int __init early_kvm_reserved_mem(char *p)
+{
+	if (!p) {
+		pr_err("Config string not provided\n");
+		return -EINVAL;
+	}
+
+	size_cmdline = memparse(p, &p);
+	if (*p != '@')
+		return -EINVAL;
+	base_cmdline = memparse(p + 1, &p);
+	return 0;
+}
+early_param("kvm_mem", early_kvm_reserved_mem);
+
+void __init sw64_kvm_reserve(void)
+{
+	kvm_cma_declare_contiguous(base_cmdline, size_cmdline, 0,
+			PAGE_SIZE, 0, "sw64_kvm_cma", &sw64_kvm_cma);
+}
+#endif
+
+void __init
+setup_arch(char **cmdline_p)
+{
+	setup_cpu_info();
+	sw64_chip->fixup();
+	sw64_chip_init->fixup();
+	setup_socket_info();
+	show_socket_mem_layout();
+	sw64_chip_init->early_init.setup_core_start(&core_start);
+
+	jump_label_init();
+	setup_sched_clock();
+#ifdef CONFIG_GENERIC_SCHED_CLOCK
+	sw64_sched_clock_init();
+#endif
+
+	setup_machine_fdt();
+
+	/* Register a call for panic conditions. */
+	atomic_notifier_chain_register(&panic_notifier_list,
+			&sw64_panic_block);
+
+	callback_init();
+
+	/* command line */
+	if (!sunway_boot_params->cmdline)
+		sunway_boot_params->cmdline = (unsigned long)COMMAND_LINE;
+
+	strlcpy(boot_command_line, (char *)sunway_boot_params->cmdline, COMMAND_LINE_SIZE);
+
+#if IS_ENABLED(CONFIG_CMDLINE_BOOL)
+#if IS_ENABLED(CONFIG_CMDLINE_OVERRIDE)
+	strlcpy(boot_command_line, builtin_cmdline, COMMAND_LINE_SIZE);
+	strlcpy((char *)sunway_boot_params->cmdline, boot_command_line, COMMAND_LINE_SIZE);
+#else
+	if (builtin_cmdline[0]) {
+		/* append builtin to boot loader cmdline */
+		strlcat(boot_command_line, " ", COMMAND_LINE_SIZE);
+		strlcat(boot_command_line, builtin_cmdline, COMMAND_LINE_SIZE);
+	}
+#endif /* CONFIG_CMDLINE_OVERRIDE */
+#endif /* CONFIG_CMDLINE_BOOL */
+	if (IS_ENABLED(CONFIG_SW64_CHIP3_ASIC_DEBUG) &&
+	    IS_ENABLED(CONFIG_SW64_CHIP3)) {
+		unsigned long bmc, cpu_online, node;
+
+		bmc = *(unsigned long *)__va(0x800000);
+		pr_info("bmc = %ld\n", bmc);
+		cpu_online = sw64_chip->get_cpu_num();
+		for (node = 0; node < cpu_online; node++)
+			sw64_io_write(node, SI_FAULT_INT_EN, 0);
+		sprintf(boot_command_line, "root=/dev/sda2 ip=172.16.137.%ld::172.16.137.254:255.255.255.0::eth0:off", 180 + bmc);
+	}
+
+	strlcpy(command_line, boot_command_line, COMMAND_LINE_SIZE);
+	*cmdline_p = command_line;
+
+	/*
+	 * Process command-line arguments.
+	 */
+	parse_early_param();
+
+	/* Find our memory. */
+	mem_detect();
+
+#ifdef CONFIG_PCI
+	reserve_mem_for_pci();
+#endif
+
+#ifdef CONFIG_BLK_DEV_INITRD
+	reserve_mem_for_initrd();
+#endif
+
+	sw64_memblock_init();
+
+	/* Reserve large chunks of memory for use by CMA for KVM. */
+#if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE)
+	sw64_kvm_reserve();
+#endif
+
+	sw64_numa_init();
+
+	memblock_dump_all();
+
+	sparse_init();
+
+	zone_sizes_init();
+
+	paging_init();
+
+	kexec_control_page_init();
+
+	efi_init();
+
+	/* Parse the ACPI tables for possible boot-time configuration */
+	acpi_boot_table_init();
+
+	/*
+	 * Initialize the machine. Usually has to do with setting up
+	 * DMA windows and the like.
+	 */
+	sw64_init_arch();
+
+	reserve_crashkernel();
+	/* Reserve standard resources. */
+	reserve_std_resources();
+
+	/*
+	 * Give us a default console. TGA users will see nothing until
+	 * chr_dev_init is called, rather late in the boot sequence.
+	 */
+
+#ifdef CONFIG_VT
+#if defined(CONFIG_VGA_CONSOLE)
+	conswitchp = &vga_con;
+#elif defined(CONFIG_DUMMY_CONSOLE)
+	conswitchp = &dummy_con;
+#endif
+#endif
+
+	/* Default root filesystem to sda2. */
+	ROOT_DEV = Root_SDA2;
+
+	/*
+	 * Identify the flock of penguins.
+ */ + +#ifdef CONFIG_SMP + setup_smp(); +#endif +#ifdef CONFIG_NUMA + cpu_set_node(); +#endif + if (acpi_disabled) + device_tree_init(); +} + + +static int +show_cpuinfo(struct seq_file *f, void *slot) +{ + int i; + unsigned long cpu_freq; + + cpu_freq = get_cpu_freq() / 1000 / 1000; + + for_each_online_cpu(i) { + /* + * glibc reads /proc/cpuinfo to determine the number of + * online processors, looking for lines beginning with + * "processor". Give glibc what it expects. + */ + seq_printf(f, "processor\t: %u\n" + "vendor_id\t: %s\n" + "cpu family\t: %d\n" + "model\t\t: %u\n" + "model name\t: %s CPU @ %lu.%lu%luGHz\n" + "cpu variation\t: %u\n" + "cpu revision\t: %u\n", + i, cpu_desc.vendor_id, cpu_desc.family, + cpu_desc.model, cpu_desc.model_id, + cpu_freq / 1000, (cpu_freq % 1000) / 100, + (cpu_freq % 100) / 10, + cpu_desc.arch_var, cpu_desc.arch_rev); + seq_printf(f, "cpu MHz\t\t: %lu.00\n" + "cache size\t: %u KB\n" + "physical id\t: %d\n" + "bogomips\t: %lu.%02lu\n", + cpu_freq, cpu_data[i].tcache.size >> 10, + cpu_to_rcid(i), + loops_per_jiffy / (500000/HZ), + (loops_per_jiffy / (5000/HZ)) % 100); + + seq_printf(f, "flags\t\t: fpu simd vpn upn cpuid\n"); + seq_printf(f, "page size\t: %d\n", 8192); + seq_printf(f, "cache_alignment\t: %d\n", cpu_data[i].tcache.linesz); + seq_printf(f, "address sizes\t: %u bits physical, %u bits virtual\n\n", + cpu_desc.pa_bits, cpu_desc.va_bits); + } + return 0; +} + +/* + * We show only CPU #0 info. + */ +static void * +c_start(struct seq_file *f, loff_t *pos) +{ + return *pos < 1 ? (void *)1 : NULL; +} + +static void * +c_next(struct seq_file *f, void *v, loff_t *pos) +{ + return NULL; +} + +static void +c_stop(struct seq_file *f, void *v) +{ +} + +const struct seq_operations cpuinfo_op = { + .start = c_start, + .next = c_next, + .stop = c_stop, + .show = show_cpuinfo, +}; + + +static int +sw64_panic_event(struct notifier_block *this, unsigned long event, void *ptr) +{ + return NOTIFY_DONE; +} + +static __init int add_pcspkr(void) +{ + struct platform_device *pd; + int ret; + + pd = platform_device_alloc("pcspkr", -1); + if (!pd) + return -ENOMEM; + + ret = platform_device_add(pd); + if (ret) + platform_device_put(pd); + + return ret; +} +device_initcall(add_pcspkr); + +#ifdef CONFIG_DEBUG_FS +struct dentry *sw64_debugfs_dir; +static int __init debugfs_sw64(void) +{ + struct dentry *d; + + d = debugfs_create_dir("sw_64", NULL); + if (!d) + return -ENOMEM; + sw64_debugfs_dir = d; + return 0; +} +arch_initcall(debugfs_sw64); +#endif + +#ifdef CONFIG_OF +static int __init sw64_of_init(void) +{ + of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); + return 0; +} +core_initcall(sw64_of_init); +#endif + +#if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE) +static int __init sw64_kvm_pool_init(void) +{ + int status = 0; + unsigned long kvm_pool_virt; + struct page *base_page, *end_page, *p; + + if (!sw64_kvm_cma) + goto out; + + kvm_pool_virt = (unsigned long)base_cmdline; + + sw64_kvm_pool = gen_pool_create(PAGE_SHIFT, -1); + if (!sw64_kvm_pool) + goto out; + + status = gen_pool_add_virt(sw64_kvm_pool, kvm_pool_virt, base_cmdline, + size_cmdline, -1); + if (status < 0) { + pr_err("failed to add memory chunks to sw64 kvm pool\n"); + gen_pool_destroy(sw64_kvm_pool); + sw64_kvm_pool = NULL; + goto out; + } + gen_pool_set_algo(sw64_kvm_pool, gen_pool_best_fit, NULL); + + base_page = pfn_to_page(base_cmdline >> PAGE_SHIFT); + end_page = pfn_to_page((base_cmdline + size_cmdline) >> PAGE_SHIFT); + + p = base_page; + while (page_ref_count(p) == 0 
&& + (unsigned long)p <= (unsigned long)end_page) { + set_page_count(p, 1); + SetPageReserved(p); + p++; + } + + return status; + +out: + return -ENOMEM; +} +core_initcall_sync(sw64_kvm_pool_init); +#endif diff --git a/arch/sw_64/kernel/signal.c b/arch/sw_64/kernel/signal.c new file mode 100644 index 000000000000..74e98063c874 --- /dev/null +++ b/arch/sw_64/kernel/signal.c @@ -0,0 +1,415 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * linux/arch/sw_64/kernel/signal.c + * + * Copyright (C) 1995 Linus Torvalds + * + * 1997-11-02 Modified for POSIX.1b signals by Richard Henderson + */ + +#include <linux/sched.h> +#include <linux/kernel.h> +#include <linux/signal.h> +#include <linux/errno.h> +#include <linux/wait.h> +#include <linux/ptrace.h> +#include <linux/unistd.h> +#include <linux/mm.h> +#include <linux/smp.h> +#include <linux/stddef.h> +#include <linux/tty.h> +#include <linux/binfmts.h> +#include <linux/bitops.h> +#include <linux/syscalls.h> +#include <linux/tracehook.h> +#include <linux/uaccess.h> + +#include <asm/sigcontext.h> +#include <asm/ucontext.h> +#include <asm/vdso.h> + +#include "proto.h" + + +#define DEBUG_SIG 0 + +#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP))) + +asmlinkage void ret_from_sys_call(void); + +/* + * Do a signal return; undo the signal stack. + */ + +#if _NSIG_WORDS > 1 +# error "Non SA_SIGINFO frame needs rearranging" +#endif + +struct rt_sigframe { + struct siginfo info; + struct ucontext uc; + unsigned int retcode[3]; +}; + +/* + * If this changes, userland unwinders that Know Things about our signal + * frame will break. Do not undertake lightly. It also implies an ABI + * change wrt the size of siginfo_t, which may cause some pain. + */ +extern char compile_time_assert + [offsetof(struct rt_sigframe, uc.uc_mcontext) == 176 ? 
1 : -1];
+
+#define INSN_MOV_R30_R16	0x47fe0410
+#define INSN_LDI_R0		0x201f0000
+#define INSN_CALLSYS		0x00000083
+
+static long
+restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
+{
+	unsigned long usp;
+	struct switch_stack *sw = (struct switch_stack *)regs - 1;
+	unsigned long *ctx_fp = (unsigned long *)&current->thread.ctx_fp;
+	long i, err = __get_user(regs->pc, &sc->sc_pc);
+
+	current->restart_block.fn = do_no_restart_syscall;
+
+	sw->r26 = (unsigned long) ret_from_sys_call;
+
+	err |= __get_user(regs->r0, sc->sc_regs+0);
+	err |= __get_user(regs->r1, sc->sc_regs+1);
+	err |= __get_user(regs->r2, sc->sc_regs+2);
+	err |= __get_user(regs->r3, sc->sc_regs+3);
+	err |= __get_user(regs->r4, sc->sc_regs+4);
+	err |= __get_user(regs->r5, sc->sc_regs+5);
+	err |= __get_user(regs->r6, sc->sc_regs+6);
+	err |= __get_user(regs->r7, sc->sc_regs+7);
+	err |= __get_user(regs->r8, sc->sc_regs+8);
+	err |= __get_user(sw->r9, sc->sc_regs+9);
+	err |= __get_user(sw->r10, sc->sc_regs+10);
+	err |= __get_user(sw->r11, sc->sc_regs+11);
+	err |= __get_user(sw->r12, sc->sc_regs+12);
+	err |= __get_user(sw->r13, sc->sc_regs+13);
+	err |= __get_user(sw->r14, sc->sc_regs+14);
+	err |= __get_user(sw->r15, sc->sc_regs+15);
+	err |= __get_user(regs->r16, sc->sc_regs+16);
+	err |= __get_user(regs->r17, sc->sc_regs+17);
+	err |= __get_user(regs->r18, sc->sc_regs+18);
+	err |= __get_user(regs->r19, sc->sc_regs+19);
+	err |= __get_user(regs->r20, sc->sc_regs+20);
+	err |= __get_user(regs->r21, sc->sc_regs+21);
+	err |= __get_user(regs->r22, sc->sc_regs+22);
+	err |= __get_user(regs->r23, sc->sc_regs+23);
+	err |= __get_user(regs->r24, sc->sc_regs+24);
+	err |= __get_user(regs->r25, sc->sc_regs+25);
+	err |= __get_user(regs->r26, sc->sc_regs+26);
+	err |= __get_user(regs->r27, sc->sc_regs+27);
+	err |= __get_user(regs->r28, sc->sc_regs+28);
+	err |= __get_user(regs->gp, sc->sc_regs+29);
+	err |= __get_user(usp, sc->sc_regs+30);
+	wrusp(usp);
+	/* simd-fp */
+	for (i = 0; i < 31 * 4; i++)
+		err |= __get_user(ctx_fp[i], sc->sc_fpregs + i);
+	err |= __get_user(current->thread.fpcr, &sc->sc_fpcr);
+
+	return err;
+}
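Because restore_sigcontext() reads back the frame laid out by setup_sigcontext() below, the sigcontext layout is userspace ABI (note the compile-time assert above pinning uc_mcontext at offset 176). A minimal userspace sketch of a handler inspecting the saved PC, assuming the libc ucontext mirrors the kernel sigcontext as it does on alpha, from which this layout derives:

#include <signal.h>
#include <stdio.h>

static void handler(int sig, siginfo_t *info, void *ctx)
{
	ucontext_t *uc = ctx;

	/* sc_pc is the PC at the point of delivery; printf is not
	 * async-signal-safe, but fine for a demo. */
	printf("signal %d delivered at pc=%#lx\n", sig,
	       (unsigned long)uc->uc_mcontext.sc_pc);
}

int main(void)
{
	struct sigaction sa = { 0 };

	sa.sa_sigaction = handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGUSR1, &sa, NULL);
	raise(SIGUSR1);
	return 0;
}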
+
+/*
+ * Note that this syscall is also used by setcontext(3) to install
+ * a given sigcontext. This is because it's impossible to set *all*
+ * registers and transfer control from userland.
+ */
+
+asmlinkage void
+do_sigreturn(struct sigcontext __user *sc)
+{
+	struct pt_regs *regs = current_pt_regs();
+	sigset_t set;
+
+	/* Verify that it's a good sigcontext before using it */
+	if (!access_ok(sc, sizeof(*sc)))
+		goto give_sigsegv;
+	if (__get_user(set.sig[0], &sc->sc_mask))
+		goto give_sigsegv;
+
+	set_current_blocked(&set);
+
+	if (restore_sigcontext(sc, regs))
+		goto give_sigsegv;
+
+	/* Send SIGTRAP if we're single-stepping: */
+	if (ptrace_cancel_bpt(current)) {
+		send_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *) regs->pc, 0,
+			       current);
+	}
+	return;
+
+give_sigsegv:
+	force_sig(SIGSEGV);
+}
+
+asmlinkage void
+do_rt_sigreturn(struct rt_sigframe __user *frame)
+{
+	struct pt_regs *regs = current_pt_regs();
+	sigset_t set;
+
+	/* Verify that it's a good ucontext_t before using it */
+	if (!access_ok(&frame->uc, sizeof(frame->uc)))
+		goto give_sigsegv;
+	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
+		goto give_sigsegv;
+
+	set_current_blocked(&set);
+
+	if (restore_sigcontext(&frame->uc.uc_mcontext, regs))
+		goto give_sigsegv;
+
+	if (restore_altstack(&frame->uc.uc_stack))
+		goto give_sigsegv;
+
+	/* Send SIGTRAP if we're single-stepping: */
+	if (ptrace_cancel_bpt(current)) {
+		send_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *) regs->pc, 0,
+			       current);
+	}
+	return;
+
+give_sigsegv:
+	force_sig(SIGSEGV);
+}
+
+
+/*
+ * Set up a signal frame.
+ */
+
+static inline void __user *
+get_sigframe(struct ksignal *ksig, unsigned long sp, size_t frame_size)
+{
+	return (void __user *)((sigsp(sp, ksig) - frame_size) & -32ul);
+}
+
+static long
+setup_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs,
+		 unsigned long mask, unsigned long sp)
+{
+	struct switch_stack *sw = (struct switch_stack *)regs - 1;
+	unsigned long *ctx_fp = (unsigned long *)&current->thread.ctx_fp;
+	long i, err = 0;
+
+	err |= __put_user(on_sig_stack((unsigned long)sc), &sc->sc_onstack);
+	err |= __put_user(mask, &sc->sc_mask);
+	err |= __put_user(regs->pc, &sc->sc_pc);
+	err |= __put_user(8, &sc->sc_ps);
+
+	err |= __put_user(regs->r0, sc->sc_regs+0);
+	err |= __put_user(regs->r1, sc->sc_regs+1);
+	err |= __put_user(regs->r2, sc->sc_regs+2);
+	err |= __put_user(regs->r3, sc->sc_regs+3);
+	err |= __put_user(regs->r4, sc->sc_regs+4);
+	err |= __put_user(regs->r5, sc->sc_regs+5);
+	err |= __put_user(regs->r6, sc->sc_regs+6);
+	err |= __put_user(regs->r7, sc->sc_regs+7);
+	err |= __put_user(regs->r8, sc->sc_regs+8);
+	err |= __put_user(sw->r9, sc->sc_regs+9);
+	err |= __put_user(sw->r10, sc->sc_regs+10);
+	err |= __put_user(sw->r11, sc->sc_regs+11);
+	err |= __put_user(sw->r12, sc->sc_regs+12);
+	err |= __put_user(sw->r13, sc->sc_regs+13);
+	err |= __put_user(sw->r14, sc->sc_regs+14);
+	err |= __put_user(sw->r15, sc->sc_regs+15);
+	err |= __put_user(regs->r16, sc->sc_regs+16);
+	err |= __put_user(regs->r17, sc->sc_regs+17);
+	err |= __put_user(regs->r18, sc->sc_regs+18);
+	err |= __put_user(regs->r19, sc->sc_regs+19);
+	err |= __put_user(regs->r20, sc->sc_regs+20);
+	err |= __put_user(regs->r21, sc->sc_regs+21);
+	err |= __put_user(regs->r22, sc->sc_regs+22);
+	err |= __put_user(regs->r23, sc->sc_regs+23);
+	err |= __put_user(regs->r24, sc->sc_regs+24);
+	err |= __put_user(regs->r25, sc->sc_regs+25);
+	err |= __put_user(regs->r26, sc->sc_regs+26);
+	err |= __put_user(regs->r27, sc->sc_regs+27);
+	err |= __put_user(regs->r28, sc->sc_regs+28);
+	err |= __put_user(regs->gp, sc->sc_regs+29);
+	err |= __put_user(sp, sc->sc_regs+30);
+	err |= __put_user(0, sc->sc_regs+31);
+	/* simd-fp */
+ for (i = 0; i < 31 * 4; i++) + err |= __put_user(ctx_fp[i], sc->sc_fpregs + i); + err |= __put_user(current->thread.fpcr, &sc->sc_fpcr); + + err |= __put_user(regs->trap_a0, &sc->sc_traparg_a0); + err |= __put_user(regs->trap_a1, &sc->sc_traparg_a1); + err |= __put_user(regs->trap_a2, &sc->sc_traparg_a2); + + return err; +} + +static int +setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs) +{ + unsigned long oldsp, r26, err = 0; + struct rt_sigframe __user *frame; + + oldsp = rdusp(); + frame = get_sigframe(ksig, oldsp, sizeof(*frame)); + if (!access_ok(frame, sizeof(*frame))) + return -EFAULT; + + err |= copy_siginfo_to_user(&frame->info, &ksig->info); + + /* Create the ucontext. */ + err |= __put_user(0, &frame->uc.uc_flags); + err |= __put_user(0, &frame->uc.uc_link); + err |= __put_user(set->sig[0], &frame->uc.uc_old_sigmask); + err |= __save_altstack(&frame->uc.uc_stack, oldsp); + err |= setup_sigcontext(&frame->uc.uc_mcontext, regs, + set->sig[0], oldsp); + err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set)); + if (err) + return -EFAULT; + + /* Set up to return from userspace. If provided, use a stub + * already in userspace. + */ + r26 = VDSO_SYMBOL(current->mm->context.vdso, rt_sigreturn); + + if (err) + return -EFAULT; + + /* "Return" to the handler */ + regs->r26 = r26; + regs->r27 = regs->pc = (unsigned long) ksig->ka.sa.sa_handler; + regs->r16 = ksig->sig; /* a0: signal number */ + regs->r17 = (unsigned long) &frame->info; /* a1: siginfo pointer */ + regs->r18 = (unsigned long) &frame->uc; /* a2: ucontext pointer */ + wrusp((unsigned long) frame); + +#if DEBUG_SIG + printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n", + current->comm, current->pid, frame, regs->pc, regs->r26); +#endif + + return 0; +} + +/* + * OK, we're invoking a handler. + */ +static inline void +handle_signal(struct ksignal *ksig, struct pt_regs *regs) +{ + sigset_t *oldset = sigmask_to_save(); + int ret; + + ret = setup_rt_frame(ksig, oldset, regs); + + signal_setup_done(ret, ksig, 0); +} + +static inline void +syscall_restart(unsigned long r0, unsigned long r19, + struct pt_regs *regs, struct k_sigaction *ka) +{ + switch (regs->r0) { + case ERESTARTSYS: + if (!(ka->sa.sa_flags & SA_RESTART)) { + regs->r0 = EINTR; + break; + } + /* else: fallthrough */ + case ERESTARTNOINTR: + regs->r0 = r0; /* reset v0 and a3 and replay syscall */ + regs->r19 = r19; + regs->pc -= 4; + break; + case ERESTART_RESTARTBLOCK: + regs->r0 = EINTR; + break; + case ERESTARTNOHAND: + regs->r0 = EINTR; + break; + } +} + + +/* + * Note that 'init' is a special process: it doesn't get signals it doesn't + * want to handle. Thus you cannot kill init even with a SIGKILL even by + * mistake. + * + * Note that we go through the signals twice: once to check the signals that + * the kernel can handle, and then we build all the user-level signal handling + * stack-frames in one go after that. + * + * "r0" and "r19" are the registers we need to restore for system call + * restart. "r0" is also used as an indicator whether we can restart at + * all (if we get here from anything but a syscall return, it will be 0) + */ +static void +do_signal(struct pt_regs *regs, unsigned long r0, unsigned long r19) +{ + unsigned long single_stepping = ptrace_cancel_bpt(current); + struct ksignal ksig; + + /* This lets the debugger run, ... */ + if (get_signal(&ksig)) { + /* ... so re-check the single stepping. */ + single_stepping |= ptrace_cancel_bpt(current); + /* Whee! Actually deliver the signal. 
*/ + if (r0) + syscall_restart(r0, r19, regs, &ksig.ka); + handle_signal(&ksig, regs); + } else { + single_stepping |= ptrace_cancel_bpt(current); + if (r0) { + switch (regs->r0) { + case ERESTARTNOHAND: + case ERESTARTSYS: + case ERESTARTNOINTR: + /* Reset v0 and a3 and replay syscall. */ + regs->r0 = r0; + regs->r19 = r19; + regs->pc -= 4; + break; + case ERESTART_RESTARTBLOCK: + /* Set v0 to the restart_syscall and replay */ + regs->r0 = __NR_restart_syscall; + regs->pc -= 4; + break; + } + } + restore_saved_sigmask(); + } + if (single_stepping) + ptrace_set_bpt(current); /* re-set breakpoint */ +} + +void +do_work_pending(struct pt_regs *regs, unsigned long thread_flags, + unsigned long r0, unsigned long r19) +{ + do { + if (thread_flags & _TIF_NEED_RESCHED) { + schedule(); + } else { + local_irq_enable(); + + if (thread_flags & _TIF_UPROBE) + uprobe_notify_resume(regs); + + if (thread_flags & _TIF_SIGPENDING) { + do_signal(regs, r0, r19); + r0 = 0; + } else { + clear_thread_flag(TIF_NOTIFY_RESUME); + tracehook_notify_resume(regs); + } + } + local_irq_disable(); + thread_flags = current_thread_info()->flags; + } while (thread_flags & _TIF_WORK_MASK); +} diff --git a/arch/sw_64/kernel/smp.c b/arch/sw_64/kernel/smp.c new file mode 100644 index 000000000000..7d9c5c90f1ac --- /dev/null +++ b/arch/sw_64/kernel/smp.c @@ -0,0 +1,810 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * linux/arch/sw_64/kernel/smp.c + */ + +#include <linux/errno.h> +#include <linux/kernel.h> +#include <linux/kernel_stat.h> +#include <linux/module.h> +#include <linux/sched/mm.h> +#include <linux/sched/hotplug.h> +#include <linux/mm.h> +#include <linux/err.h> +#include <linux/threads.h> +#include <linux/smp.h> +#include <linux/interrupt.h> +#include <linux/init.h> +#include <linux/delay.h> +#include <linux/spinlock.h> +#include <linux/irq.h> +#include <linux/cache.h> +#include <linux/profile.h> +#include <linux/bitops.h> +#include <linux/cpu.h> + +#include <asm/ptrace.h> +#include <linux/atomic.h> + +#include <asm/core.h> +#include <asm/io.h> +#include <asm/irq.h> +#include <asm/pgtable.h> +#include <asm/pgalloc.h> +#include <asm/mmu_context.h> +#include <asm/tlbflush.h> +#include <asm/suspend.h> +#include <asm/hcall.h> +#include <asm/sw64io.h> +#include <asm/sw64_init.h> +#include <asm/topology.h> +#include <asm/tc.h> +#include "proto.h" + +struct smp_rcb_struct *smp_rcb; + +extern struct cpuinfo_sw64 cpu_data[NR_CPUS]; + +int smp_booted; + +#define smp_debug 0 +#define DBGS(fmt, arg...) \ + do { if (smp_debug) printk("SMP: " fmt, ## arg); } while (0) + +int __cpu_to_rcid[NR_CPUS]; /* Map logical to physical */ +EXPORT_SYMBOL(__cpu_to_rcid); + +int __rcid_to_cpu[NR_CPUS]; /* Map physical to logical */ +EXPORT_SYMBOL(__rcid_to_cpu); + +unsigned long tidle_pcb[NR_CPUS]; + +/* State of each CPU */ +DEFINE_PER_CPU(int, cpu_state) = { 0 }; + +/* A collection of single bit ipi messages. */ +static struct { + unsigned long bits ____cacheline_aligned; +} ipi_data[NR_CPUS] __cacheline_aligned; + +enum ipi_message_type { + IPI_RESCHEDULE, + IPI_CALL_FUNC, + IPI_CPU_STOP, +}; + +/* Set to a secondary's cpuid when it comes online. */ +static int smp_secondary_alive; + +int smp_num_cpus = 1; /* Number that came online. */ +EXPORT_SYMBOL(smp_num_cpus); + +#define send_sleep_interrupt(cpu) send_ipi((cpu), II_SLEEP) +#define send_wakeup_interrupt(cpu) send_ipi((cpu), II_WAKE) + +/* + * Called by both boot and secondaries to move global data into + * per-processor storage. 
+ */
+static inline void __init
+smp_store_cpu_info(int cpuid)
+{
+	cpu_data[cpuid].loops_per_jiffy = loops_per_jiffy;
+	cpu_data[cpuid].last_asn = ASN_FIRST_VERSION;
+	cpu_data[cpuid].need_new_asn = 0;
+	cpu_data[cpuid].asn_lock = 0;
+}
+
+/*
+ * Ideally sets up per-cpu profiling hooks. Doesn't do much now...
+ */
+static inline void __init
+smp_setup_percpu_timer(int cpuid)
+{
+	setup_timer();
+	cpu_data[cpuid].prof_counter = 1;
+	cpu_data[cpuid].prof_multiplier = 1;
+}
+
+static void __init wait_boot_cpu_to_stop(int cpuid)
+{
+	unsigned long stop = jiffies + 10*HZ;
+
+	while (time_before(jiffies, stop)) {
+		if (!smp_secondary_alive)
+			return;
+		barrier();
+	}
+
+	printk("%s: FAILED on CPU %d, hanging now\n", __func__, cpuid);
+	for (;;)
+		barrier();
+}
+
+void __weak enable_chip_int(void) { }
+
+/*
+ * Where secondaries begin a life of C.
+ */
+void smp_callin(void)
+{
+	int cpuid = smp_processor_id();
+
+	local_irq_disable();
+
+	enable_chip_int();
+
+	if (cpu_online(cpuid)) {
+		printk("??, cpu 0x%x already present??\n", cpuid);
+		BUG();
+	}
+	set_cpu_online(cpuid, true);
+
+	/* clear ksp, usp */
+	wrksp(0);
+	wrusp(0);
+
+	/* Set trap vectors. */
+	trap_init();
+
+	/* Set interrupt vector. */
+	wrent(entInt, 0);
+
+	/* Get our local ticker going. */
+	smp_setup_percpu_timer(cpuid);
+
+	/* All kernel threads share the same mm context. */
+	mmgrab(&init_mm);
+	current->active_mm = &init_mm;
+
+	/* inform the notifiers about the new cpu */
+	notify_cpu_starting(cpuid);
+
+	per_cpu(cpu_state, cpuid) = CPU_ONLINE;
+	per_cpu(hard_node_id, cpuid) = cpu_to_rcid(cpuid) >> CORES_PER_NODE_SHIFT;
+
+	/* Must have completely accurate bogos. */
+	local_irq_enable();
+
+	/* Wait for the boot CPU to stop, with irqs enabled, before
+	 * running calibrate_delay.
+	 */
+	wait_boot_cpu_to_stop(cpuid);
+	mb();
+
+	/* Allow the master to continue only after we have written loops_per_jiffy. */
+	wmb();
+	smp_secondary_alive = 1;
+
+	DBGS("%s: commencing CPU %d (RCID: %d) current %p active_mm %p\n",
+	     __func__, cpuid, cpu_to_rcid(cpuid), current, current->active_mm);
+
+	/* CPU 0 inits preempt_count at start_kernel; other SMP CPUs do it here. */
+	preempt_disable();
+
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+}
+
+
+/*
+ * Set ready for secondary cpu.
+ */
+static inline void set_secondary_ready(int cpuid)
+{
+	smp_rcb->ready = cpuid;
+}
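smp_boot_one_cpu() below and smp_callin() above form a three-state handshake on smp_secondary_alive: the boot CPU parks the secondary at -1, drops the flag to 0 once the start request is acked, and the secondary stores 1 after finishing its own setup. A runnable sketch of the same protocol, with C11 atomics standing in for the kernel's mb()/wmb() ordering:

#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

static atomic_int secondary_alive;	/* plays the role of smp_secondary_alive */

static int secondary(void *unused)
{
	/* wait_boot_cpu_to_stop(): park until the boot CPU writes 0 */
	while (atomic_load(&secondary_alive) != 0)
		thrd_yield();
	/* ...local setup would run here... then report in */
	atomic_store(&secondary_alive, 1);
	return 0;
}

int main(void)
{
	thrd_t t;

	atomic_store(&secondary_alive, -1);	/* "wait a moment" */
	thrd_create(&t, secondary, NULL);
	atomic_store(&secondary_alive, 0);	/* start acked: release the secondary */
	while (atomic_load(&secondary_alive) != 1)
		thrd_yield();			/* wait for "alive" */
	puts("secondary reported in");
	thrd_join(t, NULL);
	return 0;
}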
+
+/*
+ * Convince the hmcode to have a secondary cpu begin execution.
+ */
+static int secondary_cpu_start(int cpuid, struct task_struct *idle)
+{
+	struct pcb_struct *ipcb;
+	unsigned long timeout;
+
+	ipcb = &task_thread_info(idle)->pcb;
+
+	/*
+	 * Initialize the idle's PCB to something just good enough for
+	 * us to get started. Immediately after starting, we'll swpctx
+	 * to the target idle task's pcb. Reuse the stack in the
+	 * meantime. Precalculate the target PCBB.
+	 */
+	ipcb->ksp = (unsigned long)ipcb + sizeof(union thread_union) - 16;
+	ipcb->usp = 0;
+	ipcb->pcc = 0;
+	ipcb->asn = 0;
+	tidle_pcb[cpuid] = ipcb->unique = virt_to_phys(ipcb);
+	ipcb->dv_match = ipcb->dv_mask = 0;
+
+	DBGS("Starting secondary cpu %d: state 0x%lx\n", cpuid, idle->state);
+
+	set_cpu_online(cpuid, false);
+	wmb();
+
+	set_secondary_ready(cpuid);
+
+	/* Wait 10 seconds for secondary cpu. */
+	timeout = jiffies + 10*HZ;
+	while (time_before(jiffies, timeout)) {
+		if (cpu_online(cpuid))
+			goto started;
+		udelay(10);
+		barrier();
+	}
+	pr_err("SMP: Processor %d failed to start.\n", cpuid);
+	return -1;
+
+started:
+	DBGS("%s: SUCCESS for CPU %d!!!\n", __func__, cpuid);
+	store_cpu_topology(cpuid);
+	numa_add_cpu(cpuid);
+	return 0;
+}
+
+/*
+ * Bring one cpu online.
+ */
+static int smp_boot_one_cpu(int cpuid, struct task_struct *idle)
+{
+	unsigned long timeout;
+
+	/* Signal the secondary to wait a moment. */
+	smp_secondary_alive = -1;
+
+	per_cpu(cpu_state, cpuid) = CPU_UP_PREPARE;
+
+	/* Whirrr, whirrr, whirrrrrrrrr... */
+	if (secondary_cpu_start(cpuid, idle))
+		return -1;
+
+	/* Notify the secondary CPU it can run calibrate_delay. */
+	mb();
+	smp_secondary_alive = 0;
+
+	/* We've been acked by the console; wait one second for
+	 * the task to start up for real.
+	 */
+	timeout = jiffies + 1*HZ;
+	while (time_before(jiffies, timeout)) {
+		if (smp_secondary_alive == 1)
+			goto alive;
+		udelay(10);
+		barrier();
+	}
+
+	/* We failed to boot the CPU. */
+
+	pr_err("SMP: Processor %d is stuck.\n", cpuid);
+	return -1;
+
+alive:
+	/* Another "Red Snapper". */
+	return 0;
+}
+
+static void __init process_nr_cpu_ids(void)
+{
+	int i;
+
+	for (i = nr_cpu_ids; i < NR_CPUS; i++) {
+		set_cpu_possible(i, false);
+		set_cpu_present(i, false);
+	}
+
+	nr_cpu_ids = num_possible_cpus();
+}
+
+void __init smp_rcb_init(void)
+{
+	smp_rcb = INIT_SMP_RCB;
+	memset(smp_rcb, 0, sizeof(struct smp_rcb_struct));
+	/* Set up the SMP_RCB fields used to activate secondary CPUs */
+	smp_rcb->restart_entry = __smp_callin;
+	smp_rcb->init_done = 0xDEADBEEFUL;
+	mb();
+}
+
+/*
+ * Called from setup_arch. Detect an SMP system and which processors
+ * are present.
+ */
+void __init setup_smp(void)
+{
+	int i = 0, num = 0;	/* i: physical id, num: logical id */
+
+	init_cpu_possible(cpu_none_mask);
+
+	/* For unified kernel, NR_CPUS is the maximum possible value */
+	for (; i < NR_CPUS; i++) {
+		if (cpumask_test_cpu(i, &core_start)) {
+			__cpu_to_rcid[num] = i;
+			__rcid_to_cpu[i] = num;
+			set_cpu_possible(num, true);
+			smp_store_cpu_info(num);
+			if (!cpumask_test_cpu(i, &cpu_offline))
+				set_cpu_present(num, true);
+			num++;
+		} else
+			__rcid_to_cpu[i] = -1;
+	}
+	/* for sw64, the BSP must be logical core 0 */
+	BUG_ON(cpu_to_rcid(0) != hard_smp_processor_id());
+
+	while (num < NR_CPUS) {
+		__cpu_to_rcid[num] = -1;
+		num++;
+	}
+
+	process_nr_cpu_ids();
+
+	pr_info("Detected %u possible CPU(s), %u CPU(s) are present\n",
+		nr_cpu_ids, num_present_cpus());
+
+	smp_rcb_init();
+}
+/*
+ * Called by smp_init to prepare the secondaries
+ */
+void __init native_smp_prepare_cpus(unsigned int max_cpus)
+{
+	unsigned int cpu;
+	/* Take care of some initial bookkeeping. */
+	memset(ipi_data, 0, sizeof(ipi_data));
+
+	init_cpu_topology();
+	current_thread_info()->cpu = 0;
+	store_cpu_topology(smp_processor_id());
+	numa_add_cpu(smp_processor_id());
+
+	for_each_possible_cpu(cpu) {
+		numa_store_cpu_info(cpu);
+	}
+
+	/* Nothing to do on a UP box, or when told not to.
*/ + if (nr_cpu_ids == 1 || max_cpus == 0) { + init_cpu_possible(cpumask_of(0)); + init_cpu_present(cpumask_of(0)); + pr_info("SMP mode deactivated.\n"); + return; + } + + pr_info("SMP starting up secondaries.\n"); +} + +void native_smp_prepare_boot_cpu(void) +{ + int me = smp_processor_id(); + + per_cpu(cpu_state, me) = CPU_ONLINE; +} + +int native_vt_cpu_up(unsigned int cpu, struct task_struct *tidle) +{ + printk("%s: cpu = %d\n", __func__, cpu); + + wmb(); + smp_rcb->ready = 0; + smp_boot_one_cpu(cpu, tidle); + + return cpu_online(cpu) ? 0 : -ENOSYS; +} + +DECLARE_STATIC_KEY_FALSE(use_tc_as_sched_clock); +int native_cpu_up(unsigned int cpu, struct task_struct *tidle) +{ + if (is_in_guest()) + return native_vt_cpu_up(cpu, tidle); + + wmb(); + smp_rcb->ready = 0; + +#ifdef CONFIG_SW64_SUSPEND_DEEPSLEEP_NONBOOT_CORE + /* send wake up signal */ + send_wakeup_interrupt(cpu); +#endif + /* send reset signal */ + if (smp_booted) { + if (is_in_host()) { + reset_cpu(cpu); + } else { + while (1) { + cpu_relax(); + } + } + } + smp_boot_one_cpu(cpu, tidle); + +#ifdef CONFIG_SW64_SUSPEND_DEEPSLEEP_NONBOOT_CORE + if (static_branch_likely(&use_tc_as_sched_clock)) { + if (smp_booted) { + tc_sync_clear(); + smp_call_function_single(cpu, tc_sync_ready, NULL, 0); + tc_sync_set(); + } + } +#endif + + return cpu_online(cpu) ? 0 : -ENOSYS; +} + +void __init native_smp_cpus_done(unsigned int max_cpus) +{ + int cpu; + unsigned long bogosum = 0; + + for (cpu = 0; cpu < NR_CPUS; cpu++) + if (cpu_online(cpu)) + bogosum += cpu_data[cpu].loops_per_jiffy; + + smp_booted = 1; + pr_info("SMP: Total of %d processors activated (%lu.%02lu BogoMIPS).\n", + num_online_cpus(), + (bogosum + 2500) / (500000/HZ), + ((bogosum + 2500) / (5000/HZ)) % 100); +} + +int setup_profiling_timer(unsigned int multiplier) +{ + return -EINVAL; +} + + +static void send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation) +{ + int i; + + mb(); + for_each_cpu(i, to_whom) + set_bit(operation, &ipi_data[i].bits); + + mb(); + for_each_cpu(i, to_whom) + send_ipi(i, II_II0); +} + +void handle_ipi(struct pt_regs *regs) +{ + int this_cpu = smp_processor_id(); + unsigned long *pending_ipis = &ipi_data[this_cpu].bits; + unsigned long ops; + + mb(); /* Order interrupt and bit testing. */ + while ((ops = xchg(pending_ipis, 0)) != 0) { + mb(); /* Order bit clearing and data access. */ + do { + unsigned long which; + + which = ops & -ops; + ops &= ~which; + which = __ffs(which); + + switch (which) { + case IPI_RESCHEDULE: + scheduler_ipi(); + break; + + case IPI_CALL_FUNC: + irq_enter(); + generic_smp_call_function_interrupt(); + irq_exit(); + break; + + case IPI_CPU_STOP: + local_irq_disable(); + pr_crit("other core panic, now halt...\n"); + while (1) + asm("nop"); + halt(); + + default: + pr_crit("Unknown IPI on CPU %d: %lu\n", this_cpu, which); + break; + } + } while (ops); + + mb(); /* Order data access and bit testing. 
 */
+	}
+
+	cpu_data[this_cpu].ipi_count++;
+}
+
+void native_smp_send_reschedule(int cpu)
+{
+#ifdef DEBUG_IPI_MSG
+	if (cpu == hard_smp_processor_id())
+		pr_warn("smp_send_reschedule: Sending IPI to self.\n");
+#endif
+	send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
+}
+
+static void native_stop_other_cpus(int wait)
+{
+	cpumask_t to_whom;
+
+	cpumask_copy(&to_whom, cpu_possible_mask);
+	cpumask_clear_cpu(smp_processor_id(), &to_whom);
+#ifdef DEBUG_IPI_MSG
+	if (hard_smp_processor_id() != boot_cpu_id)
+		pr_warn("smp_send_stop: Not on boot cpu.\n");
+#endif
+	send_ipi_message(&to_whom, IPI_CPU_STOP);
+
+}
+
+void native_send_call_func_ipi(const struct cpumask *mask)
+{
+	send_ipi_message(mask, IPI_CALL_FUNC);
+}
+
+void native_send_call_func_single_ipi(int cpu)
+{
+	send_ipi_message(cpumask_of(cpu), IPI_CALL_FUNC);
+}
+
+static void
+ipi_imb(void *ignored)
+{
+	imb();
+}
+
+void smp_imb(void)
+{
+	/* Must wait for other processors to flush their icache before continuing. */
+	on_each_cpu(ipi_imb, NULL, 1);
+}
+EXPORT_SYMBOL(smp_imb);
+
+static void ipi_flush_tlb_all(void *ignored)
+{
+	tbia();
+}
+
+void flush_tlb_all(void)
+{
+	/* Although we don't have any data to pass, we do want to
+	 * synchronize with the other processors.
+	 */
+	on_each_cpu(ipi_flush_tlb_all, NULL, 1);
+}
+
+#define asn_locked() (cpu_data[smp_processor_id()].asn_lock)
+
+static void ipi_flush_tlb_mm(void *x)
+{
+	struct mm_struct *mm = (struct mm_struct *) x;
+
+	if (mm == current->mm)
+		flush_tlb_current(mm);
+	else
+		flush_tlb_other(mm);
+}
+
+void flush_tlb_mm(struct mm_struct *mm)
+{
+	preempt_disable();
+
+	/* happens as a result of exit_mmap()
+	 * Shall we clear mm->context.asid[] here?
+	 */
+	if (atomic_read(&mm->mm_users) == 0) {
+		preempt_enable();
+		return;
+	}
+
+	if (mm == current->mm) {
+		flush_tlb_current(mm);
+		if (atomic_read(&mm->mm_users) == 1) {
+			int cpu, this_cpu = smp_processor_id();
+
+			for (cpu = 0; cpu < NR_CPUS; cpu++) {
+				if (!cpu_online(cpu) || cpu == this_cpu)
+					continue;
+				if (mm->context.asid[cpu])
+					mm->context.asid[cpu] = 0;
+			}
+			preempt_enable();
+			return;
+		}
+	} else
+		flush_tlb_other(mm);
+
+	smp_call_function(ipi_flush_tlb_mm, mm, 1);
+
+	preempt_enable();
+}
+EXPORT_SYMBOL(flush_tlb_mm);
+
+struct flush_tlb_page_struct {
+	struct vm_area_struct *vma;
+	struct mm_struct *mm;
+	unsigned long addr;
+};
+
+static void ipi_flush_tlb_page(void *x)
+{
+	struct flush_tlb_page_struct *data = (struct flush_tlb_page_struct *)x;
+	struct mm_struct *mm = data->mm;
+
+	if (mm == current->mm)
+		flush_tlb_current_page(mm, data->vma, data->addr);
+	else
+		flush_tlb_other(mm);
+
+}
+
+void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
+{
+	struct flush_tlb_page_struct data;
+	struct mm_struct *mm = vma->vm_mm;
+
+	preempt_disable();
+
+	if (mm == current->mm) {
+		flush_tlb_current_page(mm, vma, addr);
+		if (atomic_read(&mm->mm_users) == 1) {
+			int cpu, this_cpu = smp_processor_id();
+
+			for (cpu = 0; cpu < NR_CPUS; cpu++) {
+				if (!cpu_online(cpu) || cpu == this_cpu)
+					continue;
+				if (mm->context.asid[cpu])
+					mm->context.asid[cpu] = 0;
+			}
+			preempt_enable();
+			return;
+		}
+	} else
+		flush_tlb_other(mm);
+
+	data.vma = vma;
+	data.mm = mm;
+	data.addr = addr;
+
+	smp_call_function(ipi_flush_tlb_page, &data, 1);
+
+	preempt_enable();
+}
+EXPORT_SYMBOL(flush_tlb_page);
+
+void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+{
+	/* On the SW we always flush the whole user tlb.
*/ + flush_tlb_mm(vma->vm_mm); +} +EXPORT_SYMBOL(flush_tlb_range); + +static void ipi_flush_icache_page(void *x) +{ + struct mm_struct *mm = (struct mm_struct *) x; + + if (mm == current->mm) + __load_new_mm_context(mm); + else + flush_tlb_other(mm); +} + +void flush_icache_user_page(struct vm_area_struct *vma, struct page *page, + unsigned long addr, int len) +{ + struct mm_struct *mm = vma->vm_mm; + + if ((vma->vm_flags & VM_EXEC) == 0) + return; + if (!icache_is_vivt_no_ictag()) + return; + + preempt_disable(); + + if (mm == current->mm) { + __load_new_mm_context(mm); + if (atomic_read(&mm->mm_users) == 1) { + int cpu, this_cpu = smp_processor_id(); + + for (cpu = 0; cpu < NR_CPUS; cpu++) { + if (!cpu_online(cpu) || cpu == this_cpu) + continue; + if (mm->context.asid[cpu]) + mm->context.asid[cpu] = 0; + } + preempt_enable(); + return; + } + } else + flush_tlb_other(mm); + + smp_call_function(ipi_flush_icache_page, mm, 1); + + preempt_enable(); +} + +int native_cpu_disable(void) +{ + int cpu = smp_processor_id(); + + set_cpu_online(cpu, false); + remove_cpu_topology(cpu); + numa_remove_cpu(cpu); +#ifdef CONFIG_HOTPLUG_CPU + clear_tasks_mm_cpumask(cpu); +#endif + return 0; +} + +void native_cpu_die(unsigned int cpu) +{ + /* We don't do anything here: idle task is faking death itself. */ + unsigned int i; + + for (i = 0; i < 10; i++) { + /* They ack this in play_dead by setting CPU_DEAD */ + if (per_cpu(cpu_state, cpu) == CPU_DEAD) { + if (system_state == SYSTEM_RUNNING) + pr_info("CPU %u is now offline\n", cpu); + return; + } + msleep(100); + } + pr_err("CPU %u didn't die...\n", cpu); +} + +static void disable_timer(void) +{ + if (is_in_guest()) + hcall(HCALL_SET_CLOCKEVENT, 0, 0, 0); + else + wrtimer(0); +} + +void native_play_dead(void) +{ + idle_task_exit(); + mb(); + __this_cpu_write(cpu_state, CPU_DEAD); +#ifdef CONFIG_HOTPLUG_CPU + fixup_irqs(); +#endif + local_irq_disable(); + + disable_timer(); + + if (is_in_guest()) + hcall(HCALL_STOP, 0, 0, 0); + +#ifdef CONFIG_SUSPEND + +#ifdef CONFIG_SW64_SUSPEND_DEEPSLEEP_NONBOOT_CORE + sleepen(); + send_sleep_interrupt(smp_processor_id()); + while (1) + asm("nop"); +#else + asm volatile("halt"); + while (1) + asm("nop"); +#endif /* SW64_SUSPEND_DEEPSLEEP */ + + +#else + asm volatile("memb"); + asm volatile("halt"); +#endif +} + +struct smp_ops smp_ops = { + .smp_prepare_boot_cpu = native_smp_prepare_boot_cpu, + .smp_prepare_cpus = native_smp_prepare_cpus, + .smp_cpus_done = native_smp_cpus_done, + + .stop_other_cpus = native_stop_other_cpus, + .smp_send_reschedule = native_smp_send_reschedule, + + .cpu_up = native_cpu_up, + .cpu_die = native_cpu_die, + .cpu_disable = native_cpu_disable, + .play_dead = native_play_dead, + + .send_call_func_ipi = native_send_call_func_ipi, + .send_call_func_single_ipi = native_send_call_func_single_ipi, +}; +EXPORT_SYMBOL_GPL(smp_ops); diff --git a/arch/sw_64/kernel/stacktrace.c b/arch/sw_64/kernel/stacktrace.c new file mode 100644 index 000000000000..bb501c14565b --- /dev/null +++ b/arch/sw_64/kernel/stacktrace.c @@ -0,0 +1,46 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Stack trace management functions + * + * Copyright (C) 2018 snyh xiabin@deepin.com + */ +#include <linux/sched.h> +#include <linux/stacktrace.h> +#include <linux/export.h> +#include <linux/sched/task_stack.h> +#include <linux/sched/debug.h> + + +/* + * Save stack-backtrace addresses into a stack_trace buffer. 
+ */
+void save_stack_trace(struct stack_trace *trace)
+{
+	save_stack_trace_tsk(current, trace);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace);
+
+
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+	unsigned long *sp = (unsigned long *)task_thread_info(tsk)->pcb.ksp;
+	unsigned long addr;
+
+	WARN_ON(trace->nr_entries || !trace->max_entries);
+
+	while (!kstack_end(sp)) {
+		addr = *sp++;
+		if (__kernel_text_address(addr) &&
+		    !in_sched_functions(addr)) {
+			if (trace->skip > 0)
+				trace->skip--;
+			else
+				trace->entries[trace->nr_entries++] = addr;
+			if (trace->nr_entries >= trace->max_entries)
+				break;
+		}
+	}
+	if (trace->nr_entries < trace->max_entries)
+		trace->entries[trace->nr_entries++] = ULONG_MAX;
+}
+EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
diff --git a/arch/sw_64/kernel/suspend.c b/arch/sw_64/kernel/suspend.c
new file mode 100644
index 000000000000..b2b07ac3042b
--- /dev/null
+++ b/arch/sw_64/kernel/suspend.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/suspend.h>
+#include <linux/interrupt.h>
+#include <linux/pm.h>
+#include <linux/irq.h>
+#include <linux/types.h>
+#include <linux/pci.h>
+#include <asm/ptrace.h>
+#include <asm/suspend.h>
+#include <asm/smp.h>
+#include <asm/io.h>
+#include <asm/hmcall.h>
+#include <asm/delay.h>
+#include <asm/sw64io.h>
+#include <asm/sw64_init.h>
+
+struct processor_state suspend_state;
+
+static int native_suspend_state_valid(suspend_state_t pm_state)
+{
+	switch (pm_state) {
+	case PM_SUSPEND_ON:
+	case PM_SUSPEND_STANDBY:
+	case PM_SUSPEND_MEM:
+		return 1;
+	default:
+		return 0;
+	}
+}
+
+void disable_local_timer(void)
+{
+	wrtimer(0);
+}
+
+/*
+ * Boot core will enter suspend state here.
+ */
+void sw64_suspend_enter(void)
+{
+	/* The boot processor goes to deep sleep mode from here;
+	 * after wakeup, the PC resumes at this point.
+	 */
+
+	disable_local_timer();
+#ifdef CONFIG_PCI
+	if (sw64_chip->suspend)
+		sw64_chip->suspend(0);
+#endif
+#ifdef CONFIG_SW64_SUSPEND_DEEPSLEEP_BOOTCORE
+	sw64_suspend_deep_sleep(&suspend_state);
+#else
+	mtinten();
+	asm("halt");
+#endif
+#ifdef CONFIG_PCI
+	if (sw64_chip->suspend)
+		sw64_chip->suspend(1);
+#endif
+	disable_local_timer();
+}
+
+static int native_suspend_enter(suspend_state_t state)
+{
+	/* processor specific suspend */
+	sw64_suspend_enter();
+	return 0;
+}
+
+static const struct platform_suspend_ops native_suspend_ops = {
+	.valid = native_suspend_state_valid,
+	.enter = native_suspend_enter,
+};
+static int __init sw64_pm_init(void)
+{
+	suspend_set_ops(&native_suspend_ops);
+	return 0;
+}
+arch_initcall(sw64_pm_init);
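Since sw64_pm_init() registers these platform_suspend_ops with the PM core, the generic /sys/power/state interface is what ends up calling native_suspend_enter(). A minimal root-only trigger for suspend-to-RAM, for illustration:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/sys/power/state", O_WRONLY);

	if (fd < 0)
		return 1;
	/* "mem" maps to PM_SUSPEND_MEM, accepted by native_suspend_state_valid() */
	write(fd, "mem", 3);
	close(fd);
	return 0;
}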
diff --git a/arch/sw_64/kernel/suspend_asm.S b/arch/sw_64/kernel/suspend_asm.S
new file mode 100644
index 000000000000..73232de4cf19
--- /dev/null
+++ b/arch/sw_64/kernel/suspend_asm.S
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/page.h>
+#include <asm/regdef.h>
+
+	.text
+	.set noat
+ENTRY(sw64_suspend_deep_sleep)
+	/* a0 $16 will be the address of suspend_state */
+	ldi	$1, PSTATE_REGS($16)
+	stl	$9, CALLEE_R9($1)
+	stl	$10, CALLEE_R10($1)
+	stl	$11, CALLEE_R11($1)
+	stl	$12, CALLEE_R12($1)
+	stl	$13, CALLEE_R13($1)
+	stl	$14, CALLEE_R14($1)
+	stl	$15, CALLEE_R15($1)
+	stl	$26, CALLEE_RA($1)
+	/* SIMD-FP */
+	ldi	$1, PSTATE_FPREGS($16)
+	vstd	$f2, CALLEE_F2($1)
+	vstd	$f3, CALLEE_F3($1)
+	vstd	$f4, CALLEE_F4($1)
+	vstd	$f5, CALLEE_F5($1)
+	vstd	$f6, CALLEE_F6($1)
+	vstd	$f7, CALLEE_F7($1)
+	vstd	$f8, CALLEE_F8($1)
+	vstd	$f9, CALLEE_F9($1)
+	rfpcr	$f0
+	fstd	$f0, PSTATE_FPCR($16)
+
+	/* save the address of suspend_state to $18 */
+	mov	$16, $18
+
+	/*
+	 * Now will Go to Deep Sleep
+	 * HMcode should save pc, gp, ps, r16, r17, r18
+	 */
+
+	sys_call HMC_sleepen
+	sys_call HMC_whami
+	bis	$0, $0, $16
+	ldi	$17, 0x2($31)
+	sys_call HMC_sendii
+
+	/* wait for a while to receive interrupt */
+	ldi	$16, 0x1($31)
+	sll	$16, 24, $16
+$subloop:
+	subl	$16, 1, $16
+	bis	$16, $16, $16
+	bis	$16, $16, $16
+	bne	$16, $subloop
+
+	ldi	$8, 0x3fff
+	bic	sp, $8, $8
+
+	ldi	$1, PSTATE_REGS($18)
+	ldl	$9, CALLEE_R9($1)
+	ldl	$10, CALLEE_R10($1)
+	ldl	$11, CALLEE_R11($1)
+	ldl	$12, CALLEE_R12($1)
+	ldl	$13, CALLEE_R13($1)
+	ldl	$14, CALLEE_R14($1)
+	ldl	$15, CALLEE_R15($1)
+	ldl	$26, CALLEE_RA($1)
+	/* SIMD-FP */
+	fldd	$f0, PSTATE_FPCR($18)
+	wfpcr	$f0
+	fimovd	$f0, $2
+	and	$2, 0x3, $2
+	beq	$2, $suspend_setfpec_0
+	subl	$2, 0x1, $2
+	beq	$2, $suspend_setfpec_1
+	subl	$2, 0x1, $2
+	beq	$2, $suspend_setfpec_2
+	setfpec3
+	br	$suspend_setfpec_over
+$suspend_setfpec_0:
+	setfpec0
+	br	$suspend_setfpec_over
+$suspend_setfpec_1:
+	setfpec1
+	br	$suspend_setfpec_over
+$suspend_setfpec_2:
+	setfpec2
+$suspend_setfpec_over:
+	ldi	$1, PSTATE_FPREGS($18)
+	vldd	$f2, CALLEE_F2($1)
+	vldd	$f3, CALLEE_F3($1)
+	vldd	$f4, CALLEE_F4($1)
+	vldd	$f5, CALLEE_F5($1)
+	vldd	$f6, CALLEE_F6($1)
+	vldd	$f7, CALLEE_F7($1)
+	vldd	$f8, CALLEE_F8($1)
+	vldd	$f9, CALLEE_F9($1)
+	ret
+END(sw64_suspend_deep_sleep)
diff --git a/arch/sw_64/kernel/sys_sw64.c b/arch/sw_64/kernel/sys_sw64.c
new file mode 100644
index 000000000000..470c74853d47
--- /dev/null
+++ b/arch/sw_64/kernel/sys_sw64.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/syscalls.h>
+#include <asm/fpu.h>
+
+SYSCALL_DEFINE5(getsysinfo, unsigned long, op, void __user *, buffer,
+		unsigned long, nbytes, int __user *, start, void __user *, arg)
+{
+	unsigned long w;
+
+	switch (op) {
+	case GSI_IEEE_FP_CONTROL:
+		/* Return current software fp control & status bits. */
+		/* Note that DU doesn't verify available space here. */
+
+		w = current_thread_info()->ieee_state & IEEE_SW_MASK;
+		w = swcr_update_status(w, rdfpcr());
+		if (put_user(w, (unsigned long __user *) buffer))
+			return -EFAULT;
+		return 0;
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
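getsysinfo() gives userspace the software IEEE control word, so it can be queried with a raw syscall. A sketch assuming the syscall number from the table below (256); the GSI_IEEE_FP_CONTROL value here is a placeholder, and the real constant lives in the sw64 uapi headers:

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#define NR_getsysinfo		256	/* from syscall.tbl below */
#define GSI_IEEE_FP_CONTROL	45	/* placeholder; use the uapi header value */

int main(void)
{
	unsigned long swcr = 0;

	/* op, buffer, nbytes, start, arg -- matching SYSCALL_DEFINE5 above */
	if (syscall(NR_getsysinfo, GSI_IEEE_FP_CONTROL, &swcr, 0L, NULL, NULL) == 0)
		printf("IEEE software control word: %#lx\n", swcr);
	return 0;
}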
+
+SYSCALL_DEFINE5(setsysinfo, unsigned long, op, void __user *, buffer,
+		unsigned long, nbytes, int __user *, start, void __user *, arg)
+{
+	switch (op) {
+	case SSI_IEEE_FP_CONTROL: {
+		unsigned long swcr, fpcr;
+		unsigned int *state;
+
+		/*
+		 * Sw_64 Architecture Handbook 4.7.7.3:
+		 * To be fully IEEE compliant, we must track the current IEEE
+		 * exception state in software, because spurious bits can be
+		 * set in the trap shadow of a software-complete insn.
+		 */
+
+		if (get_user(swcr, (unsigned long __user *)buffer))
+			return -EFAULT;
+		state = &current_thread_info()->ieee_state;
+
+		/* Update software trap enable bits. */
+		*state = (*state & ~IEEE_SW_MASK) | (swcr & IEEE_SW_MASK);
+
+		/* Update the real fpcr. */
+		fpcr = rdfpcr() & FPCR_DYN_MASK;
+		fpcr |= ieee_swcr_to_fpcr(swcr);
+		wrfpcr(fpcr);
+
+		return 0;
+	}
+
+	case SSI_IEEE_RAISE_EXCEPTION: {
+		unsigned long exc, swcr, fpcr, fex;
+		unsigned int *state;
+
+		if (get_user(exc, (unsigned long __user *)buffer))
+			return -EFAULT;
+		state = &current_thread_info()->ieee_state;
+		exc &= IEEE_STATUS_MASK;
+
+		/* Update software trap enable bits. */
+		swcr = (*state & IEEE_SW_MASK) | exc;
+		*state |= exc;
+
+		/* Update the real fpcr. */
+		fpcr = rdfpcr();
+		fpcr |= ieee_swcr_to_fpcr(swcr);
+		wrfpcr(fpcr);
+
+		/* If any exceptions were set by this call and are unmasked,
+		 * send a signal. Old exceptions are not signaled.
+		 */
+		fex = (exc >> IEEE_STATUS_TO_EXCSUM_SHIFT) & swcr;
+		if (fex) {
+			int si_code = FPE_FLTUNK;
+
+			if (fex & IEEE_TRAP_ENABLE_DNO)
+				si_code = FPE_FLTUND;
+			if (fex & IEEE_TRAP_ENABLE_INE)
+				si_code = FPE_FLTRES;
+			if (fex & IEEE_TRAP_ENABLE_UNF)
+				si_code = FPE_FLTUND;
+			if (fex & IEEE_TRAP_ENABLE_OVF)
+				si_code = FPE_FLTOVF;
+			if (fex & IEEE_TRAP_ENABLE_DZE)
+				si_code = FPE_FLTDIV;
+			if (fex & IEEE_TRAP_ENABLE_INV)
+				si_code = FPE_FLTINV;
+
+			send_sig_fault(SIGFPE, si_code, (void __user *)NULL, 0, current);
+		}
+		return 0;
+	}
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
+
+SYSCALL_DEFINE2(odd_getpriority, int, which, int, who)
+{
+	int prio = sys_getpriority(which, who);
+
+	if (prio >= 0) {
+		/* Return value is the unbiased priority, i.e. 20 - prio.
+		 * This does result in negative return values, so signal
+		 * no error.
+		 */
+		force_successful_syscall_return();
+		prio = 20 - prio;
+	}
+	return prio;
+}
+
+SYSCALL_DEFINE0(getxuid)
+{
+	current_pt_regs()->r20 = sys_geteuid();
+	return sys_getuid();
+}
+
+SYSCALL_DEFINE0(getxgid)
+{
+	current_pt_regs()->r20 = sys_getegid();
+	return sys_getgid();
+}
+
+SYSCALL_DEFINE0(getxpid)
+{
+	current_pt_regs()->r20 = sys_getppid();
+	return sys_getpid();
+}
+
+SYSCALL_DEFINE0(sw64_pipe)
+{
+	int fd[2];
+	int res = do_pipe_flags(fd, 0);
+
+	if (!res) {
+		/* The return values are in $0 and $20. */
+		current_pt_regs()->r20 = fd[1];
+		res = fd[0];
+	}
+	return res;
+}
diff --git a/arch/sw_64/kernel/syscalls/Makefile b/arch/sw_64/kernel/syscalls/Makefile
new file mode 100644
index 000000000000..f466e9400301
--- /dev/null
+++ b/arch/sw_64/kernel/syscalls/Makefile
@@ -0,0 +1,38 @@
+# SPDX-License-Identifier: GPL-2.0
+kapi := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm
+
+_dummy := $(shell [ -d '$(uapi)' ] || mkdir -p '$(uapi)')	\
+	  $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
+
+syscall := $(srctree)/$(src)/syscall.tbl
+syshdr := $(srctree)/$(src)/syscallhdr.sh
+systbl := $(srctree)/$(src)/syscalltbl.sh
+
+quiet_cmd_syshdr = SYSHDR $@
+      cmd_syshdr = $(CONFIG_SHELL) '$(syshdr)' '$<' '$@'	\
+		   '$(syshdr_abis_$(basetarget))'		\
+		   '$(syshdr_pfx_$(basetarget))'		\
+		   '$(syshdr_offset_$(basetarget))'
+
+quiet_cmd_systbl = SYSTBL $@
+      cmd_systbl = $(CONFIG_SHELL) '$(systbl)' '$<' '$@'	\
+		   '$(systbl_abis_$(basetarget))'		\
+		   '$(systbl_abi_$(basetarget))'		\
+		   '$(systbl_offset_$(basetarget))'
+
+$(uapi)/unistd_64.h: $(syscall) $(syshdr)
+	$(call if_changed,syshdr)
+
+$(kapi)/syscall_table.h: $(syscall) $(systbl)
+	$(call if_changed,systbl)
+
+uapisyshdr-y += unistd_64.h
+kapisyshdr-y += syscall_table.h
+
+targets += $(uapisyshdr-y) $(kapisyshdr-y)
+
+PHONY += all
+all: $(addprefix $(uapi)/,$(uapisyshdr-y))
+all: $(addprefix $(kapi)/,$(kapisyshdr-y))
+	@:
diff --git a/arch/sw_64/kernel/syscalls/syscall.tbl b/arch/sw_64/kernel/syscalls/syscall.tbl
new file mode 100644
index 000000000000..37b1e3f9f9e2
--- /dev/null
+++ b/arch/sw_64/kernel/syscalls/syscall.tbl
@@ -0,0 +1,528 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for sw64
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The <abi> is always "common" for this file
+#
+#0 is unused
+1 common exit sys_exit
+2 common fork sys_fork
+3 common read sys_read
+4 common write sys_write
+#5 is unused
+6 common
close sys_close +#7 is unused +#8 is unused +9 common link sys_link +10 common unlink sys_unlink +#11 is unused +12 common chdir sys_chdir +13 common fchdir sys_fchdir +14 common mknod sys_mknod +15 common chmod sys_chmod +16 common chown sys_chown +17 common brk sys_brk +#18 is unused +19 common lseek sys_lseek +20 common getxpid sys_getxpid +#21 is unused +22 common umount2 sys_umount +23 common setuid sys_setuid +24 common getxuid sys_getxuid +#25 is unused +26 common ptrace sys_ptrace +#27 is unused +#28 is unused +#29 is unused +#30 is unused +#31 is unused +#32 is unused +33 common access sys_access +#34 is unused +#35 is unused +36 common sync sys_sync +37 common kill sys_kill +#38 is unused +39 common setpgid sys_setpgid +#40 is unused +41 common dup sys_dup +42 common pipe sys_sw64_pipe +#43 is unused +#44 is unused +45 common open sys_open +#46 is unused +47 common getxgid sys_getxgid +48 common sigprocmask sys_sigprocmask +#49 is unused +#50 is unused +51 common acct sys_acct +52 common sigpending sys_sigpending +#53 is unused +54 common ioctl sys_ioctl +#55 is unused +#56 is unused +57 common symlink sys_symlink +58 common readlink sys_readlink +59 common execve sys_execve +60 common umask sys_umask +61 common chroot sys_chroot +#62 is unused +63 common getpgrp sys_getpgrp +#64 is unused +#65 is unused +66 common vfork sw64_vfork +67 common stat sys_newstat +68 common lstat sys_newlstat +#69 is unused +#70 is unused +71 common mmap sys_mmap +#72 is unused +73 common munmap sys_munmap +74 common mprotect sys_mprotect +75 common madvise sys_madvise +76 common vhangup sys_vhangup +#77 is unused +#78 is unused +79 common getgroups sys_getgroups +80 common setgroups sys_setgroups +#81 is unused +82 common setpgrp sys_setpgid +#83 is unused +#84 is unused +#85 is unused +#86 is unused +87 common gethostname sys_gethostname +88 common sethostname sys_sethostname +#89 is unused +90 common dup2 sys_dup2 +91 common fstat sys_newfstat +92 common fcntl sys_fcntl +#93 is unused +94 common poll sys_poll +95 common fsync sys_fsync +96 common setpriority sys_setpriority +97 common socket sys_socket +98 common connect sys_connect +99 common accept sys_accept +100 common odd_getpriority sys_odd_getpriority +101 common send sys_send +102 common recv sys_recv +103 common sigreturn sys_sigreturn +104 common bind sys_bind +105 common setsockopt sys_setsockopt +106 common listen sys_listen +#107 is unused +#108 is unused +#109 is unused +#110 is unused +111 common sigsuspend sys_sigsuspend +#112 is unused +113 common recvmsg sys_recvmsg +114 common sendmsg sys_sendmsg +#115 is unused +#116 is unused +#117 is unused +118 common getsockopt sys_getsockopt +119 common socketcall sys_socketcall +120 common readv sys_readv +121 common writev sys_writev +#122 is unused +123 common fchown sys_fchown +124 common fchmod sys_fchmod +125 common recvfrom sys_recvfrom +126 common setreuid sys_setreuid +127 common setregid sys_setregid +128 common rename sys_rename +129 common truncate sys_truncate +130 common ftruncate sys_ftruncate +131 common flock sys_flock +132 common setgid sys_setgid +133 common sendto sys_sendto +134 common shutdown sys_shutdown +135 common socketpair sys_socketpair +136 common mkdir sys_mkdir +137 common rmdir sys_rmdir +#138 is unused +#139 is unused +#140 is unused +141 common getpeername sys_getpeername +#142 is unused +#143 is unused +144 common getrlimit sys_getrlimit +145 common setrlimit sys_setrlimit +#146 is unused +147 common setsid sys_setsid +148 common quotactl sys_quotactl 
+#149 is unused +150 common getsockname sys_getsockname +#151 is unused +#152 is unused +#153 is unused +#154 is unused +#155 is unused +156 common sigaction sys_sigaction +#157 is unused +#158 is unused +#159 is unused +#160 is unused +#161 is unused +#162 is unused +#163 is unused +#164 is unused +#165 is unused +166 common setdomainname sys_setdomainname +#167 is unused +#168 is unused +#169 is unused +170 common bpf sys_bpf +171 common userfaultfd sys_userfaultfd +172 common membarrier sys_membarrier +173 common mlock2 sys_mlock2 +174 common getpid sys_getpid +175 common getppid sys_getppid +176 common getuid sys_getuid +177 common geteuid sys_geteuid +178 common getgid sys_getgid +179 common getegid sys_getegid +#180 is unused +#181 is unused +#182 is unused +#183 is unused +#184 is unused +#185 is unused +#186 is unused +#187 is unused +#188 is unused +#189 is unused +#190 is unused +#191 is unused +#192 is unused +#193 is unused +#194 is unused +#195 is unused +#196 is unused +#197 is unused +#198 is unused +#199 is unused +200 common msgctl sys_old_msgctl +201 common msgget sys_msgget +202 common msgrcv sys_msgrcv +203 common msgsnd sys_msgsnd +204 common semctl sys_semctl +205 common semget sys_semget +206 common semop sys_semop +#207 is unused +208 common lchown sys_lchown +209 common shmat sys_shmat +210 common shmctl sys_shmctl +211 common shmdt sys_shmdt +212 common shmget sys_shmget +#213 is unused +#214 is unused +#215 is unused +#216 is unused +217 common msync sys_msync +#218 is unused +#219 is unused +#220 is unused +#221 is unused +#222 is unused +#223 is unused +#224 is unused +#225 is unused +#226 is unused +#227 is unused +#228 is unused +229 common statfs64 sys_statfs64 +230 common fstatfs64 sys_fstatfs64 +#231 is unused +#232 is unused +233 common getpgid sys_getpgid +234 common getsid sys_getsid +235 common sigaltstack sys_sigaltstack +#236 is unused +#237 is unused +#238 is unused +#239 is unused +#240 is unused +#241 is unused +#242 is unused +#243 is unused +#244 is unused +#245 is unused +#246 is unused +#247 is unused +#248 is unused +#249 is unused +#250 is unused +#251 is unused +#252 is unused +#253 is unused +254 common sysfs sys_sysfs +#255 is unused +256 common getsysinfo sys_getsysinfo +257 common setsysinfo sys_setsysinfo +#258 is unused +#259 is unused +#260 is unused +#261 is unused +#262 is unused +#263 is unused +#264 is unused +#265 is unused +#266 is unused +#267 is unused +#268 is unused +#269 is unused +#270 is unused +271 common pidfd_send_signal sys_pidfd_send_signal +272 common io_uring_setup sys_io_uring_setup +273 common io_uring_enter sys_io_uring_enter +274 common io_uring_register sys_io_uring_register +275 common open_tree sys_open_tree +276 common move_mount sys_move_mount +277 common fsopen sys_fsopen +278 common fsconfig sys_fsconfig +279 common fsmount sys_fsmount +280 common fspick sys_fspick +281 common pidfd_open sys_pidfd_open +282 common clone3 sys_clone3 +283 common close_range sys_close_range +284 common openat2 sys_openat2 +285 common pidfd_getfd sys_pidfd_getfd +286 common faccessat2 sys_faccessat2 +287 common process_madvise sys_process_madvise +#288 is unused +#289 is unused +#290 is unused +#291 is unused +#292 is unused +#293 is unused +#294 is unused +#295 is unused +#296 is unused +#297 is unused +298 common getpriority sys_getpriority +#299 is unused +300 common bdflush sys_bdflush +#301 is unused +302 common mount sys_mount +#303 is unused +304 common swapoff sys_swapoff +305 common getdents sys_getdents +306 
common create_module sys_ni_syscall +307 common init_module sys_init_module +308 common delete_module sys_delete_module +309 common get_kernel_syms sys_ni_syscall +310 common syslog sys_syslog +311 common reboot sys_reboot +312 common clone sw64_clone +313 common uselib sys_uselib +314 common mlock sys_mlock +315 common munlock sys_munlock +316 common mlockall sys_mlockall +317 common munlockall sys_munlockall +318 common sysinfo sys_sysinfo +#319 is unused +#320 is unused +321 common oldumount sys_oldumount +322 common swapon sys_swapon +323 common times sys_times +324 common personality sys_personality +325 common setfsuid sys_setfsuid +326 common setfsgid sys_setfsgid +327 common ustat sys_ustat +328 common statfs sys_statfs +329 common fstatfs sys_fstatfs +330 common sched_setparam sys_sched_setparam +331 common sched_getparam sys_sched_getparam +332 common sched_setscheduler sys_sched_setscheduler +333 common sched_getscheduler sys_sched_getscheduler +334 common sched_yield sys_sched_yield +335 common sched_get_priority_max sys_sched_get_priority_max +336 common sched_get_priority_min sys_sched_get_priority_min +337 common sched_rr_get_interval sys_sched_rr_get_interval +338 common afs_syscall sys_ni_syscall +339 common uname sys_newuname +340 common nanosleep sys_nanosleep +341 common mremap sys_mremap +342 common nfsservctl sys_ni_syscall +343 common setresuid sys_setresuid +344 common getresuid sys_getresuid +345 common pciconfig_read sys_pciconfig_read +346 common pciconfig_write sys_pciconfig_write +347 common query_module sys_ni_syscall +348 common prctl sys_prctl +349 common pread64 sys_pread64 +350 common pwrite64 sys_pwrite64 +351 common rt_sigreturn sys_rt_sigreturn +352 common rt_sigaction sys_rt_sigaction +353 common rt_sigprocmask sys_rt_sigprocmask +354 common rt_sigpending sys_rt_sigpending +355 common rt_sigtimedwait sys_rt_sigtimedwait +356 common rt_sigqueueinfo sys_rt_sigqueueinfo +357 common rt_sigsuspend sys_rt_sigsuspend +358 common select sys_select +359 common gettimeofday sys_gettimeofday +360 common settimeofday sys_settimeofday +361 common getitimer sys_getitimer +362 common setitimer sys_setitimer +363 common utimes sys_utimes +364 common getrusage sys_getrusage +365 common wait4 sys_wait4 +366 common adjtimex sys_adjtimex +367 common getcwd sys_getcwd +368 common capget sys_capget +369 common capset sys_capset +370 common sendfile sys_sendfile +371 common setresgid sys_setresgid +372 common getresgid sys_getresgid +373 common dipc sys_ni_syscall +374 common pivot_root sys_pivot_root +375 common mincore sys_mincore +376 common pciconfig_iobase sys_pciconfig_iobase +377 common getdents64 sys_getdents64 +378 common gettid sys_gettid +379 common readahead sys_readahead +#380 is unused +381 common tkill sys_tkill +382 common setxattr sys_setxattr +383 common lsetxattr sys_lsetxattr +384 common fsetxattr sys_fsetxattr +385 common getxattr sys_getxattr +386 common lgetxattr sys_lgetxattr +387 common fgetxattr sys_fgetxattr +388 common listxattr sys_listxattr +389 common llistxattr sys_llistxattr +390 common flistxattr sys_flistxattr +391 common removexattr sys_removexattr +392 common lremovexattr sys_lremovexattr +393 common fremovexattr sys_fremovexattr +394 common futex sys_futex +395 common sched_setaffinity sys_sched_setaffinity +396 common sched_getaffinity sys_sched_getaffinity +397 common tuxcall sys_ni_syscall +398 common io_setup sys_io_setup +399 common io_destroy sys_io_destroy +400 common io_getevents sys_io_getevents +401 common io_submit 
sys_io_submit +402 common io_cancel sys_io_cancel +403 common io_pgetevents sys_io_pgetevents +404 common rseq sys_rseq +405 common exit_group sys_exit_group +406 common lookup_dcookie sys_lookup_dcookie +407 common epoll_create sys_epoll_create +408 common epoll_ctl sys_epoll_ctl +409 common epoll_wait sys_epoll_wait +410 common remap_file_pages sys_remap_file_pages +411 common set_tid_address sys_set_tid_address +412 common restart_syscall sys_restart_syscall +413 common fadvise64 sys_fadvise64 +414 common timer_create sys_timer_create +415 common timer_settime sys_timer_settime +416 common timer_gettime sys_timer_gettime +417 common timer_getoverrun sys_timer_getoverrun +418 common timer_delete sys_timer_delete +419 common clock_settime sys_clock_settime +420 common clock_gettime sys_clock_gettime +421 common clock_getres sys_clock_getres +422 common clock_nanosleep sys_clock_nanosleep +423 common semtimedop sys_semtimedop +424 common tgkill sys_tgkill +425 common stat64 sys_stat64 +426 common lstat64 sys_lstat64 +427 common fstat64 sys_fstat64 +428 common vserver sys_ni_syscall +429 common mbind sys_mbind +430 common get_mempolicy sys_get_mempolicy +431 common set_mempolicy sys_set_mempolicy +432 common mq_open sys_mq_open +433 common mq_unlink sys_mq_unlink +434 common mq_timedsend sys_mq_timedsend +435 common mq_timedreceive sys_mq_timedreceive +436 common mq_notify sys_mq_notify +437 common mq_getsetattr sys_mq_getsetattr +438 common waitid sys_waitid +439 common add_key sys_add_key +440 common request_key sys_request_key +441 common keyctl sys_keyctl +442 common ioprio_set sys_ioprio_set +443 common ioprio_get sys_ioprio_get +444 common inotify_init sys_inotify_init +445 common inotify_add_watch sys_inotify_add_watch +446 common inotify_rm_watch sys_inotify_rm_watch +447 common fdatasync sys_fdatasync +448 common kexec_load sys_kexec_load +449 common migrate_pages sys_migrate_pages +450 common openat sys_openat +451 common mkdirat sys_mkdirat +452 common mknodat sys_mknodat +453 common fchownat sys_fchownat +454 common futimesat sys_futimesat +455 common fstatat64 sys_fstatat64 +456 common unlinkat sys_unlinkat +457 common renameat sys_renameat +458 common linkat sys_linkat +459 common symlinkat sys_symlinkat +460 common readlinkat sys_readlinkat +461 common fchmodat sys_fchmodat +462 common faccessat sys_faccessat +463 common pselect6 sys_pselect6 +464 common ppoll sys_ppoll +465 common unshare sys_unshare +466 common set_robust_list sys_set_robust_list +467 common get_robust_list sys_get_robust_list +468 common splice sys_splice +469 common sync_file_range sys_sync_file_range +470 common tee sys_tee +471 common vmsplice sys_vmsplice +472 common move_pages sys_move_pages +473 common getcpu sys_getcpu +474 common epoll_pwait sys_epoll_pwait +475 common utimensat sys_utimensat +476 common signalfd sys_signalfd +477 common timerfd sys_ni_syscall +478 common eventfd sys_eventfd +479 common recvmmsg sys_recvmmsg +480 common fallocate sys_fallocate +481 common timerfd_create sys_timerfd_create +482 common timerfd_settime sys_timerfd_settime +483 common timerfd_gettime sys_timerfd_gettime +484 common signalfd4 sys_signalfd4 +485 common eventfd2 sys_eventfd2 +486 common epoll_create1 sys_epoll_create1 +487 common dup3 sys_dup3 +488 common pipe2 sys_pipe2 +489 common inotify_init1 sys_inotify_init1 +490 common preadv sys_preadv +491 common pwritev sys_pwritev +492 common rt_tgsigqueueinfo sys_rt_tgsigqueueinfo +493 common perf_event_open sys_perf_event_open +494 common fanotify_init 
sys_fanotify_init +495 common fanotify_mark sys_fanotify_mark +496 common prlimit64 sys_prlimit64 +497 common name_to_handle_at sys_name_to_handle_at +498 common open_by_handle_at sys_open_by_handle_at +499 common clock_adjtime sys_clock_adjtime +500 common syncfs sys_syncfs +501 common setns sys_setns +502 common accept4 sys_accept4 +503 common sendmmsg sys_sendmmsg +504 common process_vm_readv sys_process_vm_readv +505 common process_vm_writev sys_process_vm_writev +506 common kcmp sys_kcmp +507 common finit_module sys_finit_module +508 common sched_setattr sys_sched_setattr +509 common sched_getattr sys_sched_getattr +510 common renameat2 sys_renameat2 +511 common getrandom sys_getrandom +512 common memfd_create sys_memfd_create +513 common execveat sys_execveat +514 common seccomp sys_seccomp +515 common copy_file_range sys_copy_file_range +516 common preadv2 sys_preadv2 +517 common pwritev2 sys_pwritev2 +518 common statx sys_statx diff --git a/arch/sw_64/kernel/syscalls/syscallhdr.sh b/arch/sw_64/kernel/syscalls/syscallhdr.sh new file mode 100644 index 000000000000..959f844498d6 --- /dev/null +++ b/arch/sw_64/kernel/syscalls/syscallhdr.sh @@ -0,0 +1,36 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +in="$1" +out="$2" +my_abis=`echo "($3)" | tr ',' '|'` +prefix="$4" +offset="$5" + +fileguard=_UAPI_ASM_SW64_`basename "$out" | sed \ + -e 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' \ + -e 's/[^A-Z0-9_]/_/g' -e 's/__/_/g'` +grep -E "^[0-9A-Fa-fXx]+[[:space:]]+${my_abis}" "$in" | sort -n | ( + printf "#ifndef %s\n" "${fileguard}" + printf "#define %s\n" "${fileguard}" + printf "\n" + + nxt=0 + while read nr abi name entry ; do + if [ -z "$offset" ]; then + printf "#define __NR_%s%s\t%s\n" \ + "${prefix}" "${name}" "${nr}" + else + printf "#define __NR_%s%s\t(%s + %s)\n" \ + "${prefix}" "${name}" "${offset}" "${nr}" + fi + nxt=$((nr+1)) + done + + printf "\n" + printf "#ifdef __KERNEL__\n" + printf "#define __NR_syscalls\t%s\n" "${nxt}" + printf "#endif\n" + printf "\n" + printf "#endif /* %s */\n" "${fileguard}" +) > "$out" diff --git a/arch/sw_64/kernel/syscalls/syscalltbl.sh b/arch/sw_64/kernel/syscalls/syscalltbl.sh new file mode 100644 index 000000000000..85d78d9309ad --- /dev/null +++ b/arch/sw_64/kernel/syscalls/syscalltbl.sh @@ -0,0 +1,32 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +in="$1" +out="$2" +my_abis=`echo "($3)" | tr ',' '|'` +my_abi="$4" +offset="$5" + +emit() { + t_nxt="$1" + t_nr="$2" + t_entry="$3" + + while [ $t_nxt -lt $t_nr ]; do + printf "__SYSCALL(%s, sys_ni_syscall, )\n" "${t_nxt}" + t_nxt=$((t_nxt+1)) + done + printf "__SYSCALL(%s, %s, )\n" "${t_nxt}" "${t_entry}" +} + +grep -E "^[0-9A-Fa-fXx]+[[:space:]]+${my_abis}" "$in" | sort -n | ( + nxt=0 + if [ -z "$offset" ]; then + offset=0 + fi + + while read nr abi name entry ; do + emit $((nxt+offset)) $((nr+offset)) $entry + nxt=$((nr+1)) + done +) > "$out" diff --git a/arch/sw_64/kernel/systbls.S b/arch/sw_64/kernel/systbls.S new file mode 100644 index 000000000000..66e0f461dbb0 --- /dev/null +++ b/arch/sw_64/kernel/systbls.S @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw_64/kernel/systbls.S + * + * The system call table. 
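The two generator scripts above are driven by the table just listed: syscallhdr.sh turns each row into a __NR_<name> constant (plus __NR_syscalls, here 519, one past statx), while syscalltbl.sh emits one __SYSCALL(nr, entry, ) line per slot, with emit() backfilling unused numbers with sys_ni_syscall so the table stays dense. A rough sketch of what lands in the generated asm/syscall_table.h around the 151-155 hole (illustrative output, not the verbatim file):

__SYSCALL(150, sys_getsockname, )
__SYSCALL(151, sys_ni_syscall, )	/* holes backfilled by emit() */
__SYSCALL(152, sys_ni_syscall, )
/* ... */
__SYSCALL(155, sys_ni_syscall, )
__SYSCALL(156, sys_sigaction, )

systbls.S, whose hunk starts here, then defines __SYSCALL(nr, entry, nargs) as ".quad entry", so every generated line becomes one 8-byte pointer in sys_call_table.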
+ */ + +#include <asm/unistd.h> + +#define __SYSCALL(nr, entry, nargs) .quad entry + .data + .align 3 + .globl sys_call_table +sys_call_table: +#include <asm/syscall_table.h> +#undef __SYSCALL diff --git a/arch/sw_64/kernel/tc.c b/arch/sw_64/kernel/tc.c new file mode 100644 index 000000000000..c047d457e55a --- /dev/null +++ b/arch/sw_64/kernel/tc.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2019, serveros, linyue + */ + + +#include <linux/topology.h> +#include <linux/spinlock.h> +#include <linux/kernel.h> +#include <linux/smp.h> +#include <asm/tc.h> + +/* + * Entry/exit counters that make sure that both CPUs + * run the measurement code at once: + */ +unsigned long time_sync; + +DEFINE_PER_CPU(u64, tc_offset); + +void tc_sync_clear(void) +{ + time_sync = 0; +} + +void tc_sync_ready(void *ignored) +{ + /* make sure we can see time_sync been set to 0 */ + smp_mb(); + while (!time_sync) + cpu_relax(); + + __this_cpu_write(tc_offset, time_sync - rdtc()); +} + +void tc_sync_set(void) +{ + time_sync = rdtc() + __this_cpu_read(tc_offset); +} diff --git a/arch/sw_64/kernel/time.c b/arch/sw_64/kernel/time.c new file mode 100644 index 000000000000..0815d06b03d4 --- /dev/null +++ b/arch/sw_64/kernel/time.c @@ -0,0 +1,251 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/errno.h> +#include <linux/module.h> +#include <linux/sched.h> +#include <linux/kernel.h> +#include <linux/param.h> +#include <linux/string.h> +#include <linux/mm.h> +#include <linux/delay.h> +#include <linux/ioport.h> +#include <linux/irq.h> +#include <linux/interrupt.h> +#include <linux/init.h> +#include <linux/bcd.h> +#include <linux/profile.h> +#include <linux/irq_work.h> +#include <linux/uaccess.h> + +#include <asm/io.h> +#include <asm/sw64io.h> +#include <asm/sw64_init.h> +#include <asm/hw_init.h> +#include <asm/irq_impl.h> +#include <asm/debug.h> + +#include <linux/time.h> +#include <linux/timex.h> +#include <linux/clocksource.h> +#include <linux/clk-provider.h> +#include <linux/sched/clock.h> +#include <linux/sched_clock.h> + +#include "proto.h" + +DEFINE_SPINLOCK(rtc_lock); +EXPORT_SYMBOL(rtc_lock); + +DECLARE_PER_CPU(u64, tc_offset); + +#define TICK_SIZE (tick_nsec / 1000) + +/* + * Shift amount by which scaled_ticks_per_cycle is scaled. Shifting + * by 48 gives us 16 bits for HZ while keeping the accuracy good even + * for large CPU clock rates. 
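The tc.c handshake above synchronizes the per-CPU time counters in three steps: the boot CPU clears time_sync, each secondary spins in tc_sync_ready() until the boot CPU publishes rdtc() plus its own offset through tc_sync_set(), and the secondary then records time_sync - rdtc() as its tc_offset. After calibration a corrected readout is simply the raw counter plus this CPU's offset; a minimal consumer sketch (synced_tc() is a hypothetical helper, not part of this patch):

static inline u64 synced_tc(void)
{
	/* shift this CPU's time counter onto the boot CPU's timeline */
	return rdtc() + __this_cpu_read(tc_offset);
}

sched_clock() in time.c below applies exactly this correction before scaling to nanoseconds.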
+ */ +#define FIX_SHIFT 48 + +unsigned long est_cycle_freq; + +static u64 sc_start; +static u64 sc_shift; +static u64 sc_multi; + +DEFINE_STATIC_KEY_FALSE(use_tc_as_sched_clock); +static int __init sched_clock_setup(char *opt) +{ + if (!opt) + return -EINVAL; + + if (!strncmp(opt, "on", 2)) { + static_branch_enable(&use_tc_as_sched_clock); + pr_info("Using TC instead of jiffies as source of sched_clock()\n"); + } + + return 0; +} +early_param("tc_sched_clock", sched_clock_setup); + +#ifdef CONFIG_IRQ_WORK + +DEFINE_PER_CPU(u8, irq_work_pending); + +#define set_irq_work_pending_flag() __this_cpu_write(irq_work_pending, 1) +#define test_irq_work_pending() __this_cpu_read(irq_work_pending) +#define clear_irq_work_pending() __this_cpu_write(irq_work_pending, 0) + +void arch_irq_work_raise(void) +{ + set_irq_work_pending_flag(); +} + +#else /* CONFIG_IRQ_WORK */ + +#define test_irq_work_pending() 0 +#define clear_irq_work_pending() + +#endif /* CONFIG_IRQ_WORK */ + +#ifndef CONFIG_SMP +static u64 read_tc(struct clocksource *cs) +{ + return rdtc(); +} + +static struct clocksource clocksource_tc = { + .name = "tc", + .rating = 300, + .flags = CLOCK_SOURCE_IS_CONTINUOUS, + .mask = CLOCKSOURCE_MASK(64), + .shift = 22, + .mult = 0, /* To be filled in */ + .read = read_tc, +}; + +void setup_clocksource(void) +{ + clocksource_register_hz(&clocksource_tc, get_cpu_freq()); + pr_info("Setup clocksource TC, mult = %d\n", clocksource_tc.mult); +} + +#else /* !CONFIG_SMP */ +void setup_clocksource(void) +{ + setup_chip_clocksource(); +} +#endif /* !CONFIG_SMP */ + +void __init common_init_rtc(void) +{ + setup_timer(); +} + +void __init +time_init(void) +{ + unsigned long cycle_freq; + + cycle_freq = get_cpu_freq(); + + pr_info("CPU Cycle frequency = %ld Hz\n", cycle_freq); + + /* Register clocksource */ + setup_clocksource(); + of_clk_init(NULL); + /* Startup the timer source. 
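On the !SMP path above, clocksource_register_hz() computes the (mult, shift) pair so that ns = (cycles * mult) >> shift, which is why .mult is initialized to 0 ("To be filled in"). The same calculation can be done by hand with the generic helper; a sketch, where the 2 GHz rate and 600 s conversion range are assumed figures for illustration:

u32 mult, shift;

/* choose mult/shift so (cycles * mult) >> shift == ns, accurate for
 * deltas up to ~600 s at an assumed 2 GHz counter frequency */
clocks_calc_mult_shift(&mult, &shift, 2000000000U, NSEC_PER_SEC, 600);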
*/ + common_init_rtc(); +} + +void calibrate_delay(void) +{ + loops_per_jiffy = get_cpu_freq() / HZ; + pr_info("Clock rate yields %lu.%02lu BogoMIPS (lpj=%lu)\n", + loops_per_jiffy / (500000 / HZ), + (loops_per_jiffy / (5000 / HZ)) % 100, loops_per_jiffy); +} + +static void __init calibrate_sched_clock(void) +{ + sc_start = rdtc(); +} + +void __init setup_sched_clock(void) +{ + unsigned long step; + + sc_shift = 7; + step = 1UL << sc_shift; + sc_multi = step * NSEC_PER_SEC / get_cpu_freq(); + calibrate_sched_clock(); + + pr_info("sched_clock: sc_multi=%llu, sc_shift=%llu\n", sc_multi, sc_shift); +} + +#ifdef CONFIG_GENERIC_SCHED_CLOCK +static u64 notrace sched_clock_read(void) +{ + return (rdtc() - sc_start) >> sc_shift; +} + +void __init sw64_sched_clock_init(void) +{ + sched_clock_register(sched_clock_read, BITS_PER_LONG, get_cpu_freq() >> sc_shift); +} +#else +unsigned long long sched_clock(void) +{ + if (static_branch_likely(&use_tc_as_sched_clock)) + return ((rdtc() - sc_start + __this_cpu_read(tc_offset)) >> sc_shift) * sc_multi; + else + return (jiffies - INITIAL_JIFFIES) * (NSEC_PER_SEC / HZ); +} + +#ifdef CONFIG_DEBUG_FS +static ssize_t sched_clock_status_read(struct file *file, char __user *user_buf, + size_t count, loff_t *ppos) +{ + char buf[2]; + + if (static_key_enabled(&use_tc_as_sched_clock)) + buf[0] = 'Y'; + else + buf[0] = 'N'; + buf[1] = '\n'; + return simple_read_from_buffer(user_buf, count, ppos, buf, 2); +} + +static ssize_t sched_clock_status_write(struct file *file, const char __user *user_buf, + size_t count, loff_t *ppos) +{ + int r; + bool bv; + bool val = static_key_enabled(&use_tc_as_sched_clock); + + r = kstrtobool_from_user(user_buf, count, &bv); + if (!r) { + if (val != bv) { + if (bv) { + static_branch_enable(&use_tc_as_sched_clock); + pr_info("source of sched_clock() switched from jiffies to TC\n"); + } else { + static_branch_disable(&use_tc_as_sched_clock); + pr_info("source of sched_clock() switched from TC to jiffies\n"); + } + } else { + if (val) + pr_info("source of sched_clock() unchanged (using TC)\n"); + else + pr_info("source of sched_clock() unchanged (using jiffies)\n"); + } + } + + return count; +} + +static const struct file_operations sched_clock_status_fops = { + .read = sched_clock_status_read, + .write = sched_clock_status_write, + .open = nonseekable_open, + .llseek = no_llseek, +}; + +static int __init sched_clock_debug_init(void) +{ + struct dentry *sched_clock_status; + + if (!sw64_debugfs_dir) + return -ENODEV; + + sched_clock_status = debugfs_create_file_unsafe("use_tc_as_sched_clock", + 0666, sw64_debugfs_dir, NULL, + &sched_clock_status_fops); + + if (!sched_clock_status) + return -ENOMEM; + + return 0; +} +late_initcall(sched_clock_debug_init); +#endif /* CONFIG_DEBUG_FS */ +#endif /* CONFIG_GENERIC_SCHED_CLOCK */ diff --git a/arch/sw_64/kernel/timer.c b/arch/sw_64/kernel/timer.c new file mode 100644 index 000000000000..c29e7d1b664b --- /dev/null +++ b/arch/sw_64/kernel/timer.c @@ -0,0 +1,149 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Filename: timer.c + * Description: percpu local timer, based on arch/x86/kernel/apic/apic.c + */ + +#include <linux/init.h> +#include <linux/mm.h> +#include <linux/delay.h> +#include <linux/memblock.h> +#include <linux/interrupt.h> +#include <linux/kernel_stat.h> +#include <linux/ioport.h> +#include <linux/cpu.h> +#include <linux/clockchips.h> +#include <linux/acpi_pmtmr.h> +#include <linux/module.h> +#include <linux/dmi.h> +#include <linux/dmar.h> +#include <asm/hcall.h> +#include 
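The TC-backed sched_clock() above scales with shift-then-multiply: sc_multi = (1 << sc_shift) * NSEC_PER_SEC / freq, with sc_shift fixed at 7. Worked at an assumed 2 GHz: sc_multi = 128 * 1000000000 / 2000000000 = 64, and a one-second delta of 2000000000 cycles converts to (2000000000 >> 7) * 64 = 15625000 * 64 = 1000000000 ns, exactly one second. The early shift throws away up to 127 cycles (about 64 ns here) of resolution but keeps the hot path to one shift and one multiply; the conversion in isolation:

static inline u64 tc_delta_to_ns(u64 delta)
{
	/* mirrors sched_clock(): drop sc_shift low bits, then scale */
	return (delta >> sc_shift) * sc_multi;
}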
<asm/hw_init.h> +#include <asm/hardirq.h> + +static int timer_next_event(unsigned long delta, + struct clock_event_device *evt); +static int timer_shutdown(struct clock_event_device *evt); +static int timer_set_oneshot(struct clock_event_device *evt); + +/* + * The local apic timer can be used for any function which is CPU local. + */ +static struct clock_event_device timer_clockevent = { + .name = "timer", + .features = CLOCK_EVT_FEAT_ONESHOT, + .shift = 20, + .mult = 0, + .set_state_shutdown = timer_shutdown, + .set_state_oneshot = timer_set_oneshot, + .set_next_event = timer_next_event, + .rating = 300, + .irq = -1, +}; + +static int vtimer_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + hcall(HCALL_SET_CLOCKEVENT, delta, 0, 0); + return 0; +} + +static int vtimer_shutdown(struct clock_event_device *evt) +{ + hcall(HCALL_SET_CLOCKEVENT, 0, 0, 0); + return 0; +} + +static int vtimer_set_oneshot(struct clock_event_device *evt) +{ + return 0; +} +static struct clock_event_device vtimer_clockevent = { + .name = "vtimer", + .features = CLOCK_EVT_FEAT_ONESHOT, + .shift = 20, + .mult = 0, + .set_state_shutdown = vtimer_shutdown, + .set_state_oneshot = vtimer_set_oneshot, + .set_next_event = vtimer_next_event, + .rating = 300, + .irq = -1, +}; + +static DEFINE_PER_CPU(struct clock_event_device, timer_events); + +/* + * Program the next event, relative to now + */ +static int timer_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + wrtimer(delta); + return 0; +} + +static int timer_shutdown(struct clock_event_device *evt) +{ + wrtimer(0); + return 0; +} + +static int timer_set_oneshot(struct clock_event_device *evt) +{ + /* + * SW-TIMER support CLOCK_EVT_MODE_ONESHOT only, and automatically. + * unlike PIT and HPET, which support ONESHOT or PERIODIC by setting PIT_MOD or HPET_Tn_CFG + * so, nothing to do here ... + */ + return 0; +} + +/* + * Setup the local timer for this CPU. Copy the initilized values + * of the boot CPU and register the clock event in the framework. + */ +void setup_timer(void) +{ + int cpu = smp_processor_id(); + struct clock_event_device *swevt = &per_cpu(timer_events, cpu); + + if (is_in_guest()) { + memcpy(swevt, &vtimer_clockevent, sizeof(*swevt)); + /* + * CUIWEI: This value is very important. + * If it's too small, the timer will timeout when the IER + * haven't been opened. 
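setup_timer() below derives the clockevent's mult with div_sc(), which computes (ticks << shift) / nsec. Worked against the .shift = 20 used in both device templates above, at an assumed 2 GHz:

/* illustrative numbers, not taken from real hardware:
 * mult = div_sc(2000000000, NSEC_PER_SEC, 20)
 *      = (2000000000 << 20) / 1000000000 = 2 << 20 = 2097152
 * so programming N ns costs (N * mult) >> 20 = 2 * N counter ticks,
 * i.e. exactly two ticks per nanosecond at this clock rate.
 */

clockevent_delta2ns() applies the inverse scaling to turn the 64-bit counter limit into max_delta_ns.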
+ */ + swevt->min_delta_ns = 400; + } else { + memcpy(swevt, &timer_clockevent, sizeof(*swevt)); + swevt->min_delta_ns = 100; + } + + swevt->cpumask = cpumask_of(cpu); + swevt->mult = div_sc(get_cpu_freq(), NSEC_PER_SEC, swevt->shift); + swevt->max_delta_ns = clockevent_delta2ns(0xFFFFFFFFFFFFFFFF, swevt); + + swevt->set_state_shutdown(swevt); + + clockevents_register_device(swevt); +} + +void sw64_timer_interrupt(void) +{ + struct clock_event_device *evt = this_cpu_ptr(&timer_events); + + irq_enter(); + if (!evt->event_handler) { + pr_warn("Spurious local timer interrupt on cpu %d\n", + smp_processor_id()); + timer_shutdown(evt); + return; + } + + inc_irq_stat(timer_irqs_event); + + evt->event_handler(evt); + + irq_exit(); +} diff --git a/arch/sw_64/kernel/topology.c b/arch/sw_64/kernel/topology.c new file mode 100644 index 000000000000..e6df86270583 --- /dev/null +++ b/arch/sw_64/kernel/topology.c @@ -0,0 +1,170 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/acpi.h> +#include <linux/arch_topology.h> +#include <linux/cacheinfo.h> +#include <linux/cpu.h> +#include <linux/cpumask.h> +#include <linux/init.h> +#include <linux/percpu.h> +#include <linux/node.h> +#include <linux/nodemask.h> +#include <linux/of.h> +#include <linux/sched.h> +#include <linux/sched/topology.h> +#include <linux/slab.h> +#include <linux/smp.h> +#include <linux/string.h> + +#include <asm/topology.h> + +static int __init parse_dt_topology(void) +{ + return 0; +} + +/* + * cpu topology table + */ +struct cpu_topology cpu_topology[NR_CPUS]; +EXPORT_SYMBOL_GPL(cpu_topology); + +const struct cpumask *cpu_coregroup_mask(int cpu) +{ + const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu)); + + /* Find the smaller of NUMA, core or LLC siblings */ + if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) { + /* not numa in package, lets use the package siblings */ + core_mask = &cpu_topology[cpu].core_sibling; + } + if (cpu_topology[cpu].llc_id != -1) { + if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) + core_mask = &cpu_topology[cpu].llc_sibling; + } + + return core_mask; +} + +static void update_siblings_masks(int cpuid) +{ + struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid]; + int cpu; + + /* update core and thread sibling masks */ + for_each_online_cpu(cpu) { + cpu_topo = &cpu_topology[cpu]; + + if (cpuid_topo->llc_id == cpu_topo->llc_id) { + cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); + cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); + } + + if (cpuid_topo->package_id != cpu_topo->package_id) + continue; + + cpumask_set_cpu(cpuid, &cpu_topo->core_sibling); + cpumask_set_cpu(cpu, &cpuid_topo->core_sibling); + + if (cpuid_topo->core_id != cpu_topo->core_id) + continue; + + cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling); + cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling); + } +} + +void store_cpu_topology(int cpuid) +{ + struct cpu_topology *cpuid_topo = &cpu_topology[cpuid]; + + if (cpuid_topo->package_id != -1) + goto topology_populated; + + cpuid_topo->core_id = cpu_to_rcid(cpuid) & CORE_ID_MASK; + cpuid_topo->package_id = rcid_to_package(cpu_to_rcid(cpuid)); + cpuid_topo->llc_id = rcid_to_package(cpuid); + cpuid_topo->thread_id = (cpu_to_rcid(cpuid) >> THREAD_ID_SHIFT) & THREAD_ID_MASK; + + pr_debug("CPU%u: socket %d core %d thread %d\n", + cpuid, cpuid_topo->package_id, cpuid_topo->core_id, + cpuid_topo->thread_id); + +topology_populated: + update_siblings_masks(cpuid); +} + +static void clear_cpu_topology(int cpu) +{ + struct cpu_topology *cpu_topo = 
&cpu_topology[cpu]; + + cpumask_clear(&cpu_topo->llc_sibling); + cpumask_set_cpu(cpu, &cpu_topo->llc_sibling); + + cpumask_clear(&cpu_topo->core_sibling); + cpumask_set_cpu(cpu, &cpu_topo->core_sibling); + cpumask_clear(&cpu_topo->thread_sibling); + cpumask_set_cpu(cpu, &cpu_topo->thread_sibling); +} + +static void __init reset_cpu_topology(void) +{ + int cpu; + + for_each_possible_cpu(cpu) { + struct cpu_topology *cpu_topo = &cpu_topology[cpu]; + + cpu_topo->thread_id = -1; + cpu_topo->core_id = 0; + cpu_topo->package_id = -1; + cpu_topo->llc_id = -1; + + clear_cpu_topology(cpu); + } +} + +void remove_cpu_topology(int cpu) +{ + int sibling; + + for_each_cpu(sibling, topology_core_cpumask(cpu)) + cpumask_clear_cpu(cpu, topology_core_cpumask(sibling)); + for_each_cpu(sibling, topology_sibling_cpumask(cpu)) + cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling)); + for_each_cpu(sibling, topology_llc_cpumask(cpu)) + cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling)); + + clear_cpu_topology(cpu); +} + +#ifdef CONFIG_ACPI +static bool __init acpi_cpu_is_threaded(int cpu) +{ + return 0; +} + +static int __init parse_acpi_topology(void) +{ + return 0; +} + +#else +static inline int __init parse_acpi_topology(void) +{ + return -EINVAL; +} +#endif + +void __init init_cpu_topology(void) +{ + reset_cpu_topology(); + + /* + * Discard anything that was parsed if we hit an error so we + * don't use partial information. + */ + if (!acpi_disabled && parse_acpi_topology()) + reset_cpu_topology(); + else if (of_have_populated_dt() && parse_dt_topology()) + reset_cpu_topology(); +} diff --git a/arch/sw_64/kernel/traps.c b/arch/sw_64/kernel/traps.c new file mode 100644 index 000000000000..c736a67ef7b8 --- /dev/null +++ b/arch/sw_64/kernel/traps.c @@ -0,0 +1,1651 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * arch/sw_64/kernel/traps.c + * + * (C) Copyright 1994 Linus Torvalds + */ + +/* + * This file initializes the trap entry points + */ + +#include <linux/jiffies.h> +#include <linux/mm.h> +#include <linux/sched/signal.h> +#include <linux/sched/debug.h> +#include <linux/tty.h> +#include <linux/delay.h> +#include <linux/extable.h> +#include <linux/kallsyms.h> +#include <linux/ratelimit.h> +#include <linux/uaccess.h> +#include <linux/perf_event.h> +#include <linux/kdebug.h> +#include <linux/kexec.h> + +#include <asm/gentrap.h> +#include <asm/unaligned.h> +#include <asm/sysinfo.h> +#include <asm/mmu_context.h> +#include <asm/special_insns.h> +#include <asm/fpu.h> +#include <asm/kprobes.h> +#include <asm/uprobes.h> +#include <asm/core.h> + +#include "proto.h" + +void +dik_show_regs(struct pt_regs *regs, unsigned long *r9_15) +{ + printk("pc = [<%016lx>] ra = [<%016lx>] ps = %04lx %s\n", + regs->pc, regs->r26, regs->ps, print_tainted()); + printk("pc is at %pSR\n", (void *)regs->pc); + printk("ra is at %pSR\n", (void *)regs->r26); + printk("v0 = %016lx t0 = %016lx t1 = %016lx\n", + regs->r0, regs->r1, regs->r2); + printk("t2 = %016lx t3 = %016lx t4 = %016lx\n", + regs->r3, regs->r4, regs->r5); + printk("t5 = %016lx t6 = %016lx t7 = %016lx\n", + regs->r6, regs->r7, regs->r8); + + if (r9_15) { + printk("s0 = %016lx s1 = %016lx s2 = %016lx\n", + r9_15[9], r9_15[10], r9_15[11]); + printk("s3 = %016lx s4 = %016lx s5 = %016lx\n", + r9_15[12], r9_15[13], r9_15[14]); + printk("s6 = %016lx\n", r9_15[15]); + } + + printk("a0 = %016lx a1 = %016lx a2 = %016lx\n", + regs->r16, regs->r17, regs->r18); + printk("a3 = %016lx a4 = %016lx a5 = %016lx\n", + regs->r19, regs->r20, regs->r21); + printk("t8 = %016lx t9 = %016lx 
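store_cpu_topology() above slices the hardware rcid into the (package, core, thread) triple; the exact field layout is hidden behind CORE_ID_MASK, THREAD_ID_SHIFT/THREAD_ID_MASK and rcid_to_package() from the arch headers, so the picture below is only an assumed illustration:

/*
 * rcid (assumed layout, for illustration only):
 *
 *	[ package bits | thread bits | core bits ]
 *
 * core_id	= cpu_to_rcid(cpu) & CORE_ID_MASK
 * thread_id	= (cpu_to_rcid(cpu) >> THREAD_ID_SHIFT) & THREAD_ID_MASK
 * package_id	= rcid_to_package(cpu_to_rcid(cpu))
 */

update_siblings_masks() then keeps every relation symmetric: whenever CPU A joins B's llc/core/thread sibling mask, B joins A's in the same pass.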
t10 = %016lx\n", + regs->r22, regs->r23, regs->r24); + printk("t11= %016lx pv = %016lx at = %016lx\n", + regs->r25, regs->r27, regs->r28); + printk("gp = %016lx sp = %p\n", regs->gp, regs+1); +} + +static void +dik_show_code(unsigned int *pc) +{ + long i; + unsigned int insn; + + printk("Code:"); + for (i = -6; i < 2; i++) { + if (__get_user(insn, (unsigned int __user *)pc + i)) + break; + printk("%c%08x%c", i ? ' ' : '<', insn, i ? ' ' : '>'); + } + printk("\n"); +} + +static void +dik_show_trace(unsigned long *sp, const char *loglvl) +{ + long i = 0; + unsigned long tmp; + + printk("%sTrace:\n", loglvl); + while (0x1ff8 & (unsigned long)sp) { + tmp = *sp; + sp++; + if (!__kernel_text_address(tmp)) + continue; + printk("%s[<%lx>] %pSR\n", loglvl, tmp, (void *)tmp); + if (i > 40) { + printk("%s ...", loglvl); + break; + } + } + printk("\n"); +} + +static int kstack_depth_to_print = 24; + +void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl) +{ + unsigned long *stack; + int i; + + /* + * debugging aid: "show_stack(NULL, NULL, KERN_EMERG);" prints the + * back trace for this cpu. + */ + if (sp == NULL) + sp = (unsigned long *)&sp; + + stack = sp; + for (i = 0; i < kstack_depth_to_print; i++) { + if (((long) stack & (THREAD_SIZE-1)) == 0) + break; + if (i && ((i % 4) == 0)) + printk("%s ", loglvl); + printk("%016lx ", *stack++); + } + printk("\n"); + dik_show_trace(sp, loglvl); +} + +void +die_if_kernel(char *str, struct pt_regs *regs, long err, unsigned long *r9_15) +{ + if (regs->ps & 8) + return; +#ifdef CONFIG_SMP + printk("CPU %d ", hard_smp_processor_id()); +#endif + printk("%s(%d): %s %ld\n", current->comm, task_pid_nr(current), str, err); + dik_show_regs(regs, r9_15); + add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE); + dik_show_trace((unsigned long *)(regs+1), KERN_DEFAULT); + dik_show_code((unsigned int *)regs->pc); + + if (test_and_set_thread_flag(TIF_DIE_IF_KERNEL)) { + printk("die_if_kernel recursion detected.\n"); + local_irq_enable(); + while (1) + asm("nop"); + } + + if (kexec_should_crash(current)) + crash_kexec(regs); + + if (panic_on_oops) + panic("Fatal exception"); + + do_exit(SIGSEGV); +} + +#ifndef CONFIG_MATHEMU +static long dummy_emul(void) +{ + return 0; +} + +long (*sw64_fp_emul_imprecise)(struct pt_regs *regs, unsigned long writemask) = (void *)dummy_emul; +EXPORT_SYMBOL_GPL(sw64_fp_emul_imprecise); + +long (*sw64_fp_emul)(unsigned long pc) = (void *)dummy_emul; +EXPORT_SYMBOL_GPL(sw64_fp_emul); +#else +long sw64_fp_emul_imprecise(struct pt_regs *regs, unsigned long writemask); +long sw64_fp_emul(unsigned long pc); +#endif + +asmlinkage void +do_entArith(unsigned long summary, unsigned long write_mask, + struct pt_regs *regs) +{ + long si_code = FPE_FLTINV; + + if (summary & 1) { + /* Software-completion summary bit is set, so try to + * emulate the instruction. If the processor supports + * precise exceptions, we don't have to search. 
+ */ + si_code = sw64_fp_emul(regs->pc - 4); + if (si_code == 0) + return; + } + die_if_kernel("Arithmetic fault", regs, 0, NULL); + + send_sig_fault(SIGFPE, si_code, (void __user *) regs->pc, 0, current); +} + +asmlinkage void +do_entIF(unsigned long inst_type, struct pt_regs *regs) +{ + int signo, code; + unsigned int inst, type; + + type = inst_type & 0xffffffff; + inst = inst_type >> 32; + + if ((regs->ps & ~IPL_MAX) == 0 && type != 4) { + if (type == 1) { + const unsigned int *data + = (const unsigned int *) regs->pc; + printk("Kernel bug at %s:%d\n", + (const char *)(data[1] | (long)data[2] << 32), + data[0]); + } else if (type == 0) { + /* support kgdb */ + notify_die(0, "kgdb trap", regs, 0, 0, SIGTRAP); + return; + } + die_if_kernel((type == 1 ? "Kernel Bug" : "Instruction fault"), + regs, type, NULL); + } + + switch (type) { + case 0: /* breakpoint */ + if (ptrace_cancel_bpt(current)) + regs->pc -= 4; /* make pc point to former bpt */ + + send_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)regs->pc, 0, + current); + return; + + case 1: /* bugcheck */ + send_sig_fault(SIGTRAP, TRAP_UNK, (void __user *)regs->pc, 0, + current); + return; + + case 2: /* gentrap */ + switch ((long)regs->r16) { + case GEN_INTOVF: + signo = SIGFPE; + code = FPE_INTOVF; + break; + case GEN_INTDIV: + signo = SIGFPE; + code = FPE_INTDIV; + break; + case GEN_FLTOVF: + signo = SIGFPE; + code = FPE_FLTOVF; + break; + case GEN_FLTDIV: + signo = SIGFPE; + code = FPE_FLTDIV; + break; + case GEN_FLTUND: + signo = SIGFPE; + code = FPE_FLTUND; + break; + case GEN_FLTINV: + signo = SIGFPE; + code = FPE_FLTINV; + break; + case GEN_FLTINE: + signo = SIGFPE; + code = FPE_FLTRES; + break; + case GEN_ROPRAND: + signo = SIGFPE; + code = FPE_FLTUNK; + break; + + case GEN_DECOVF: + case GEN_DECDIV: + case GEN_DECINV: + case GEN_ASSERTERR: + case GEN_NULPTRERR: + case GEN_STKOVF: + case GEN_STRLENERR: + case GEN_SUBSTRERR: + case GEN_RANGERR: + case GEN_SUBRNG: + case GEN_SUBRNG1: + case GEN_SUBRNG2: + case GEN_SUBRNG3: + case GEN_SUBRNG4: + case GEN_SUBRNG5: + case GEN_SUBRNG6: + case GEN_SUBRNG7: + default: + signo = SIGTRAP; + code = TRAP_UNK; + break; + } + + send_sig_fault(signo, code, (void __user *)regs->pc, 0, + current); + return; + + case 4: /* opDEC */ + switch (inst) { + case BREAK_KPROBE: + if (notify_die(DIE_BREAK, "kprobe", regs, 0, 0, SIGTRAP) == NOTIFY_STOP) + return; + case BREAK_KPROBE_SS: + if (notify_die(DIE_SSTEPBP, "single_step", regs, 0, 0, SIGTRAP) == NOTIFY_STOP) + return; + case UPROBE_BRK_UPROBE: + if (notify_die(DIE_UPROBE, "uprobe", regs, 0, 0, SIGTRAP) == NOTIFY_STOP) + return; + case UPROBE_BRK_UPROBE_XOL: + if (notify_die(DIE_UPROBE_XOL, "uprobe_xol", regs, 0, 0, SIGTRAP) == NOTIFY_STOP) + return; + } + if ((regs->ps & ~IPL_MAX) == 0) + die_if_kernel("Instruction fault", regs, type, NULL); + break; + + case 3: /* FEN fault */ + /* + * Irritating users can call HMC_clrfen to disable the + * FPU for the process. The kernel will then trap in + * do_switch_stack and undo_switch_stack when we try + * to save and restore the FP registers. + + * Given that GCC by default generates code that uses the + * FP registers, HMC_clrfen is not useful except for DoS + * attacks. So turn the bleeding FPU back on and be done + * with it. 
+ */ + current_thread_info()->pcb.flags |= 1; + __reload_thread(¤t_thread_info()->pcb); + return; + + case 5: /* illoc */ + default: /* unexpected instruction-fault type */ + break; + } + + send_sig_fault(SIGILL, ILL_ILLOPC, (void __user *)regs->pc, 0, + current); +} + +/* + * entUna has a different register layout to be reasonably simple. It + * needs access to all the integer registers (the kernel doesn't use + * fp-regs), and it needs to have them in order for simpler access. + * + * Due to the non-standard register layout (and because we don't want + * to handle floating-point regs), user-mode unaligned accesses are + * handled separately by do_entUnaUser below. + * + * Oh, btw, we don't handle the "gp" register correctly, but if we fault + * on a gp-register unaligned load/store, something is _very_ wrong + * in the kernel anyway.. + */ +struct allregs { + unsigned long regs[32]; + unsigned long ps, pc, gp, a0, a1, a2; +}; + +struct unaligned_stat { + unsigned long count, va, pc; +} unaligned[2]; + + +/* Macro for exception fixup code to access integer registers. */ +#define una_reg(r) (_regs[(r) >= 16 && (r) <= 18 ? (r) + 19 : (r)]) + + +asmlinkage void +do_entUna(void *va, unsigned long opcode, unsigned long reg, + struct allregs *regs) +{ + long error; + unsigned long tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7, tmp8; + unsigned long pc = regs->pc - 4; + unsigned long *_regs = regs->regs; + const struct exception_table_entry *fixup; + + unaligned[0].count++; + unaligned[0].va = (unsigned long) va; + unaligned[0].pc = pc; + + /* + * We don't want to use the generic get/put unaligned macros as + * we want to trap exceptions. Only if we actually get an + * exception will we decide whether we should have caught it. + */ + + switch (opcode) { + case 0x21: + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 1(%3)\n" + " extlh %1, %3, %1\n" + " exthh %2, %3, %2\n" + "3:\n" + ".section __ex_table,"a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + + if (error) + goto got_exception; + una_reg(reg) = tmp1 | tmp2; + return; + + case 0x22: + __asm__ __volatile__( + "1: ldl_u %1,0(%3)\n" + "2: ldl_u %2,3(%3)\n" + " extlw %1,%3,%1\n" + " exthw %2,%3,%2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + + if (error) + goto got_exception; + una_reg(reg) = (int)(tmp1 | tmp2); + return; + + case 0x23: /* ldl */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 7(%3)\n" + " extll %1, %3, %1\n" + " exthl %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + + if (error) + goto got_exception; + una_reg(reg) = tmp1 | tmp2; + return; + + case 0x29: /* sth */ + __asm__ __volatile__( + " zap %6, 2, %1\n" + " srl %6, 8, %2\n" + " stb %1, 0x0(%5)\n" + " stb %2, 0x1(%5)\n" + "3:\n" + + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %2, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %1, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), + "=&r"(tmp3), "=&r"(tmp4) + : "r"(va), "r"(una_reg(reg)), "0"(0)); + + if (error) + goto got_exception; + return; + + case 0x2a: /* stw */ + __asm__ __volatile__( + " zapnot %6, 
0x1, %1\n" + " srl %6, 8, %2\n" + " zapnot %2, 0x1,%2\n" + " srl %6, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %6, 24, %4\n" + " zapnot %4, 0x1, %4\n" + "1: stb %1, 0x0(%5)\n" + "2: stb %2, 0x1(%5)\n" + "3: stb %3, 0x2(%5)\n" + "4: stb %4, 0x3(%5)\n" + "5:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi $31, 5b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 5b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 5b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 5b-4b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), + "=&r"(tmp3), "=&r"(tmp4) + : "r"(va), "r"(una_reg(reg)), "0"(0)); + + if (error) + goto got_exception; + return; + + case 0x2b: /* stl */ + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot %5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(va), "r"(una_reg(reg)), "0"(0)); + + if (error) + goto got_exception; + return; + } + + printk("Bad unaligned kernel access at %016lx: %p %lx %lu\n", + pc, va, opcode, reg); + do_exit(SIGSEGV); + +got_exception: + /* Ok, we caught the exception, but we don't want it. Is there + * someone to pass it along to? + */ + fixup = search_exception_tables(pc); + if (fixup != 0) { + unsigned long newpc; + + newpc = fixup_exception(una_reg, fixup, pc); + printk("Forwarding unaligned exception at %lx (%lx)\n", + pc, newpc); + + regs->pc = newpc; + return; + } + + /* + * Yikes! No one to forward the exception to. + * Since the registers are in a weird format, dump them ourselves. 
+ */ + + printk("%s(%d): unhandled unaligned exception\n", + current->comm, task_pid_nr(current)); + + printk("pc = [<%016lx>] ra = [<%016lx>] ps = %04lx\n", + pc, una_reg(26), regs->ps); + printk("r0 = %016lx r1 = %016lx r2 = %016lx\n", + una_reg(0), una_reg(1), una_reg(2)); + printk("r3 = %016lx r4 = %016lx r5 = %016lx\n", + una_reg(3), una_reg(4), una_reg(5)); + printk("r6 = %016lx r7 = %016lx r8 = %016lx\n", + una_reg(6), una_reg(7), una_reg(8)); + printk("r9 = %016lx r10= %016lx r11= %016lx\n", + una_reg(9), una_reg(10), una_reg(11)); + printk("r12= %016lx r13= %016lx r14= %016lx\n", + una_reg(12), una_reg(13), una_reg(14)); + printk("r15= %016lx\n", una_reg(15)); + printk("r16= %016lx r17= %016lx r18= %016lx\n", + una_reg(16), una_reg(17), una_reg(18)); + printk("r19= %016lx r20= %016lx r21= %016lx\n", + una_reg(19), una_reg(20), una_reg(21)); + printk("r22= %016lx r23= %016lx r24= %016lx\n", + una_reg(22), una_reg(23), una_reg(24)); + printk("r25= %016lx r27= %016lx r28= %016lx\n", + una_reg(25), una_reg(27), una_reg(28)); + printk("gp = %016lx sp = %p\n", regs->gp, regs+1); + + dik_show_code((unsigned int *)pc); + dik_show_trace((unsigned long *)(regs+1), KERN_DEFAULT); + + if (test_and_set_thread_flag(TIF_DIE_IF_KERNEL)) { + printk("die_if_kernel recursion detected.\n"); + local_irq_enable(); + while (1) + asm("nop"); + } + do_exit(SIGSEGV); + +} + +/* + * Convert an s-floating point value in memory format to the + * corresponding value in register format. The exponent + * needs to be remapped to preserve non-finite values + * (infinities, not-a-numbers, denormals). + */ +static inline unsigned long +s_mem_to_reg(unsigned long s_mem) +{ + unsigned long frac = (s_mem >> 0) & 0x7fffff; + unsigned long sign = (s_mem >> 31) & 0x1; + unsigned long exp_msb = (s_mem >> 30) & 0x1; + unsigned long exp_low = (s_mem >> 23) & 0x7f; + unsigned long exp; + + exp = (exp_msb << 10) | exp_low; /* common case */ + if (exp_msb) { + if (exp_low == 0x7f) + exp = 0x7ff; + } else { + if (exp_low == 0x00) + exp = 0x000; + else + exp |= (0x7 << 7); + } + return (sign << 63) | (exp << 52) | (frac << 29); +} + +/* + * Convert an s-floating point value in register format to the + * corresponding value in memory format. + */ +static inline unsigned long +s_reg_to_mem(unsigned long s_reg) +{ + return ((s_reg >> 62) << 30) | ((s_reg << 5) >> 34); +} + +/* + * Handle user-level unaligned fault. Handling user-level unaligned + * faults is *extremely* slow and produces nasty messages. A user + * program *should* fix unaligned faults ASAP. + * + * Notice that we have (almost) the regular kernel stack layout here, + * so finding the appropriate registers is a little more difficult + * than in the kernel case. + * + * Finally, we handle regular integer load/stores only. In + * particular, load-linked/store-conditionally and floating point + * load/stores are not supported. The former make no sense with + * unaligned faults (they are guaranteed to fail) and I don't think + * the latter will occur in any decent program. + * + * Sigh. We *do* have to handle some FP operations, because GCC will + * uses them as temporary storage for integer memory to memory copies. + * However, we need to deal with stt/ldt and sts/lds only. 
+ */ +#define OP_INT_MASK (1L << 0x22 | 1L << 0x2a | /* ldw stw */ \ + 1L << 0x23 | 1L << 0x2b | /* ldl stl */ \ + 1L << 0x21 | 1L << 0x29 | /* ldhu sth */ \ + 1L << 0x20 | 1L << 0x28) /* ldbu stb */ + +#define OP_WRITE_MASK (1L << 0x26 | 1L << 0x27 | /* fsts fstd */ \ + 1L << 0x2c | 1L << 0x2d | /* stw stl */ \ + 1L << 0x0d | 1L << 0x0e) /* sth stb */ + +#define R(x) ((size_t) &((struct pt_regs *)0)->x) + +static int unauser_reg_offsets[32] = { + R(r0), R(r1), R(r2), R(r3), R(r4), R(r5), R(r6), R(r7), R(r8), + /* r9 ... r15 are stored in front of regs. */ + -56, -48, -40, -32, -24, -16, -8, + R(r16), R(r17), R(r18), + R(r19), R(r20), R(r21), R(r22), R(r23), R(r24), R(r25), R(r26), + R(r27), R(r28), R(gp), + 0, 0 +}; + +#undef R + +asmlinkage void +do_entUnaUser(void __user *va, unsigned long opcode, + unsigned long reg, struct pt_regs *regs) +{ +#ifdef CONFIG_UNA_PRINT + static DEFINE_RATELIMIT_STATE(ratelimit, 5 * HZ, 5); +#endif + + unsigned long tmp1, tmp2, tmp3, tmp4; + unsigned long fake_reg, *reg_addr = &fake_reg; + int si_code; + long error; + unsigned long tmp, tmp5, tmp6, tmp7, tmp8, vb; + unsigned long fp[4]; + unsigned long instr, instr_op, value; + + /* Check the UAC bits to decide what the user wants us to do + * with the unaliged access. + */ + perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, + 1, regs, regs->pc - 4); + +#ifdef CONFIG_UNA_PRINT + if (!(current_thread_info()->status & TS_UAC_NOPRINT)) { + if (__ratelimit(&ratelimit)) { + printk("%s(%d): unaligned trap at %016lx: %p %lx %ld\n", + current->comm, task_pid_nr(current), + regs->pc - 4, va, opcode, reg); + } + } +#endif + if ((current_thread_info()->status & TS_UAC_SIGBUS)) + goto give_sigbus; + /* Not sure why you'd want to use this, but... */ + if ((current_thread_info()->status & TS_UAC_NOFIX)) + return; + + /* Don't bother reading ds in the access check since we already + * know that this came from the user. Also rely on the fact that + * the page at TASK_SIZE is unmapped and so can't be touched anyway. 
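unauser_reg_offsets[] above encodes where each saved integer register lives relative to the pt_regs base: most entries are plain member offsets via R(), but r9-r15 are spilled just below pt_regs, hence the literal negative offsets -56 through -8. A lookup sketch for r10 (table entry -48):

/* resolve the saved slot for register 10 relative to pt_regs */
unsigned long *r10_addr =
	(unsigned long *)((char *)regs + unauser_reg_offsets[10]);

Registers 30 (usp) and 31 (the zero register) never go through the table; do_entUnaUser() special-cases them with rdusp() and a fake zero slot.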
+ */ + if ((unsigned long)va >= TASK_SIZE) + goto give_sigsegv; + + ++unaligned[1].count; + unaligned[1].va = (unsigned long)va; + unaligned[1].pc = regs->pc - 4; + + if ((1L << opcode) & OP_INT_MASK) { + /* it's an integer load/store */ + if (reg < 30) { + reg_addr = (unsigned long *) + ((char *)regs + unauser_reg_offsets[reg]); + } else if (reg == 30) { + /* usp in HMCODE regs */ + fake_reg = rdusp(); + } else { + /* zero "register" */ + fake_reg = 0; + } + } + + get_user(instr, (__u32 *)(regs->pc - 4)); + instr_op = (instr >> 26) & 0x3f; + + get_user(value, (__u64 *)va); + + switch (instr_op) { + + case 0x0c: /* vlds */ + if ((unsigned long)va<<61 == 0) { + __asm__ __volatile__( + "1: ldl %1, 0(%5)\n" + "2: ldl %2, 8(%5)\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), "=&r"(tmp4) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + sw64_write_simd_fp_reg_s(reg, tmp1, tmp2); + + return; + } else { + __asm__ __volatile__( + "1: ldl_u %1, 0(%6)\n" + "2: ldl_u %2, 7(%6)\n" + "3: ldl_u %3, 15(%6)\n" + " extll %1, %6, %1\n" + " extll %2, %6, %5\n" + " exthl %2, %6, %4\n" + " exthl %3, %6, %3\n" + "4:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 4b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 4b-2b(%0)\n" + " .long 3b - .\n" + " ldi %3, 4b-3b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + tmp1 = tmp1 | tmp4; + tmp2 = tmp5 | tmp3; + + sw64_write_simd_fp_reg_s(reg, tmp1, tmp2); + + return; + } + case 0x0a: /* ldse */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 3(%3)\n" + " extlw %1, %3, %1\n" + " exthw %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + tmp = tmp1 | tmp2; + tmp = tmp | (tmp<<32); + + sw64_write_simd_fp_reg_s(reg, tmp, tmp); + + return; + + case 0x0d: /* vldd */ + if ((unsigned long)va<<61 == 0) { + __asm__ __volatile__( + "1: ldl %1, 0(%5)\n" + "2: ldl %2, 8(%5)\n" + "3: ldl %3, 16(%5)\n" + "4: ldl %4, 24(%5)\n" + "5:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 5b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 5b-2b(%0)\n" + " .long 3b - .\n" + " ldi %3, 5b-3b(%0)\n" + " .long 4b - .\n" + " ldi %4, 5b-4b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), "=&r"(tmp4) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + sw64_write_simd_fp_reg_d(reg, tmp1, tmp2, tmp3, tmp4); + + return; + } else { + __asm__ __volatile__( + "1: ldl_u %1, 0(%6)\n" + "2: ldl_u %2, 7(%6)\n" + "3: ldl_u %3, 15(%6)\n" + " extll %1, %6, %1\n" + " extll %2, %6, %5\n" + " exthl %2, %6, %4\n" + " exthl %3, %6, %3\n" + "4:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 4b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 4b-2b(%0)\n" + " .long 3b - .\n" + " ldi %3, 4b-3b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + tmp7 = tmp1 | tmp4; //f0 + tmp8 = tmp5 | tmp3; //f1 + + vb = ((unsigned long)(va))+16; + + __asm__ __volatile__( + "1: ldl_u %1, 0(%6)\n" + "2: ldl_u %2, 7(%6)\n" + "3: ldl_u %3, 15(%6)\n" + " 
extll %1, %6, %1\n" + " extll %2, %6, %5\n" + " exthl %2, %6, %4\n" + " exthl %3, %6, %3\n" + "4:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 4b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 4b-2b(%0)\n" + " .long 3b - .\n" + " ldi %3, 4b-3b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5) + : "r"(vb), "0"(0)); + + if (error) + goto give_sigsegv; + + tmp = tmp1 | tmp4; // f2 + tmp2 = tmp5 | tmp3; // f3 + + sw64_write_simd_fp_reg_d(reg, tmp7, tmp8, tmp, tmp2); + return; + } + + case 0x0b: /* ldde */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 7(%3)\n" + " extll %1, %3, %1\n" + " exthl %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + tmp = tmp1 | tmp2; + + sw64_write_simd_fp_reg_d(reg, tmp, tmp, tmp, tmp); + return; + + case 0x09: /* ldwe */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 3(%3)\n" + " extlw %1, %3, %1\n" + " exthw %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + + if (error) + goto give_sigsegv; + + sw64_write_simd_fp_reg_ldwe(reg, (int)(tmp1 | tmp2)); + + return; + + case 0x0e: /* vsts */ + sw64_read_simd_fp_m_s(reg, fp); + if ((unsigned long)va<<61 == 0) { + __asm__ __volatile__( + "1: bis %4, %4, %1\n" + "2: bis %5, %5, %2\n" + "3: stl %1, 0(%3)\n" + "4: stl %2, 8(%3)\n" + "5:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi %1, 5b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 5b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 5b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 5b-4b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "r"(fp[0]), "r"(fp[1]), "0"(0)); + + if (error) + goto give_sigsegv; + + return; + } else { + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot %5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(va), "r"(fp[0]), "0"(0)); + + if (error) + goto give_sigsegv; + + + vb = ((unsigned long)va) + 8; + + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot 
%5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(vb), "r"(fp[1]), "0"(0)); + + if (error) + goto give_sigsegv; + + return; + } + + case 0x0f: /* vstd */ + sw64_read_simd_fp_m_d(reg, fp); + if ((unsigned long)va<<61 == 0) { + __asm__ __volatile__( + "1: bis %4, %4, %1\n" + "2: bis %5, %5, %2\n" + "3: stl %1, 0(%3)\n" + "4: stl %2, 8(%3)\n" + "5:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi %1, 5b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 5b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 5b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 5b-4b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "r"(fp[0]), "r"(fp[1]), "0"(0)); + + if (error) + goto give_sigsegv; + + vb = ((unsigned long)va)+16; + + + __asm__ __volatile__( + "1: bis %4, %4, %1\n" + "2: bis %5, %5, %2\n" + "3: stl %1, 0(%3)\n" + "4: stl %2, 8(%3)\n" + "5:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi %1, 5b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 5b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 5b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 5b-4b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(vb), "r"(fp[2]), "r"(fp[3]), "0"(0)); + + if (error) + goto give_sigsegv; + + return; + } else { + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot %5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(va), "r"(fp[0]), "0"(0)); + + if (error) + goto give_sigsegv; + + vb = ((unsigned long)va) + 8; + + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot 
%5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(vb), "r"(fp[1]), "0"(0)); + + if (error) + goto give_sigsegv; + + vb = vb + 8; + + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot %5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(vb), "r"(fp[2]), "0"(0)); + + if (error) + goto give_sigsegv; + + vb = vb + 8; + + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot %5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, "a"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(vb), "r"(fp[3]), "0"(0)); + + if (error) + goto give_sigsegv; + + return; + } + } + switch (opcode) { + case 0x21: /* ldhu */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 1(%3)\n" + " extlh %1, %3, %1\n" + " 
exthh %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + if (error) + goto give_sigsegv; + *reg_addr = tmp1 | tmp2; + break; + + case 0x26: /* flds */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 3(%3)\n" + " extlw %1, %3, %1\n" + " exthw %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + if (error) + goto give_sigsegv; + sw64_write_fp_reg(reg, s_mem_to_reg((int)(tmp1 | tmp2))); + return; + + case 0x27: /* fldd */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 7(%3)\n" + " extll %1, %3, %1\n" + " exthl %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + if (error) + goto give_sigsegv; + sw64_write_fp_reg(reg, tmp1 | tmp2); + return; + + case 0x22: /* ldw */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 3(%3)\n" + " extlw %1, %3, %1\n" + " exthw %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + if (error) + goto give_sigsegv; + *reg_addr = (int)(tmp1 | tmp2); + break; + + case 0x23: /* ldl */ + __asm__ __volatile__( + "1: ldl_u %1, 0(%3)\n" + "2: ldl_u %2, 7(%3)\n" + " extll %1, %3, %1\n" + " exthl %2, %3, %2\n" + "3:\n" + ".section __ex_table, "a"\n" + " .long 1b - .\n" + " ldi %1, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %2, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2) + : "r"(va), "0"(0)); + if (error) + goto give_sigsegv; + *reg_addr = tmp1 | tmp2; + break; + + /* Note that the store sequences do not indicate that they change + * memory because it _should_ be affecting nothing in this context. + * (Otherwise we have other, much larger, problems.) 
+ */ + case 0x29: /* sth with stb */ + __asm__ __volatile__( + " zap %6, 2, %1\n" + " srl %6, 8, %2\n" + "1: stb %1, 0x0(%5)\n" + "2: stb %2, 0x1(%5)\n" + "3:\n" + ".section __ex_table, \"a\"\n" + " .long 1b - .\n" + " ldi %2, 3b-1b(%0)\n" + " .long 2b - .\n" + " ldi %1, 3b-2b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), + "=&r"(tmp3), "=&r"(tmp4) + : "r"(va), "r"(*reg_addr), "0"(0)); + + if (error) + goto give_sigsegv; + return; + + case 0x2e: /* fsts */ + fake_reg = s_reg_to_mem(sw64_read_fp_reg(reg)); + /* FALLTHRU */ + + case 0x2a: /* stw with stb */ + __asm__ __volatile__( + " zapnot %6, 0x1, %1\n" + " srl %6, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %6, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %6, 24, %4\n" + " zapnot %4, 0x1, %4\n" + "1: stb %1, 0x0(%5)\n" + "2: stb %2, 0x1(%5)\n" + "3: stb %3, 0x2(%5)\n" + "4: stb %4, 0x3(%5)\n" + "5:\n" + ".section __ex_table, \"a\"\n" + " .long 1b - .\n" + " ldi $31, 5b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 5b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 5b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 5b-4b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), + "=&r"(tmp3), "=&r"(tmp4) + : "r"(va), "r"(*reg_addr), "0"(0)); + + if (error) + goto give_sigsegv; + return; + + case 0x2f: /* fstd */ + fake_reg = sw64_read_fp_reg(reg); + /* FALLTHRU */ + + case 0x2b: /* stl */ + __asm__ __volatile__( + " zapnot %10, 0x1, %1\n" + " srl %10, 8, %2\n" + " zapnot %2, 0x1, %2\n" + " srl %10, 16, %3\n" + " zapnot %3, 0x1, %3\n" + " srl %10, 24, %4\n" + " zapnot %4, 0x1, %4\n" + " srl %10, 32, %5\n" + " zapnot %5, 0x1, %5\n" + " srl %10, 40, %6\n" + " zapnot %6, 0x1, %6\n" + " srl %10, 48, %7\n" + " zapnot %7, 0x1, %7\n" + " srl %10, 56, %8\n" + " zapnot %8, 0x1, %8\n" + "1: stb %1, 0(%9)\n" + "2: stb %2, 1(%9)\n" + "3: stb %3, 2(%9)\n" + "4: stb %4, 3(%9)\n" + "5: stb %5, 4(%9)\n" + "6: stb %6, 5(%9)\n" + "7: stb %7, 6(%9)\n" + "8: stb %8, 7(%9)\n" + "9:\n" + ".section __ex_table, \"a\"\n\t" + " .long 1b - .\n" + " ldi $31, 9b-1b(%0)\n" + " .long 2b - .\n" + " ldi $31, 9b-2b(%0)\n" + " .long 3b - .\n" + " ldi $31, 9b-3b(%0)\n" + " .long 4b - .\n" + " ldi $31, 9b-4b(%0)\n" + " .long 5b - .\n" + " ldi $31, 9b-5b(%0)\n" + " .long 6b - .\n" + " ldi $31, 9b-6b(%0)\n" + " .long 7b - .\n" + " ldi $31, 9b-7b(%0)\n" + " .long 8b - .\n" + " ldi $31, 9b-8b(%0)\n" + ".previous" + : "=r"(error), "=&r"(tmp1), "=&r"(tmp2), "=&r"(tmp3), + "=&r"(tmp4), "=&r"(tmp5), "=&r"(tmp6), "=&r"(tmp7), "=&r"(tmp8) + : "r"(va), "r"(*reg_addr), "0"(0)); + + if (error) + goto give_sigsegv; + return; + + default: + /* What instruction were you trying to use, exactly? */ + goto give_sigbus; + } + + /* Only integer loads should get here; everyone else returns early. */ + if (reg == 30) + wrusp(fake_reg); + return; + +give_sigsegv: + regs->pc -= 4; /* make pc point to faulting insn */ + + /* We need to replicate some of the logic in mm/fault.c, + * since we don't have access to the fault code in the + * exception handling return path. + */ + if ((unsigned long)va >= TASK_SIZE) + si_code = SEGV_ACCERR; + else { + struct mm_struct *mm = current->mm; + + down_read(&mm->mmap_lock); + if (find_vma(mm, (unsigned long)va)) + si_code = SEGV_ACCERR; + else + si_code = SEGV_MAPERR; + up_read(&mm->mmap_lock); + } + send_sig_fault(SIGSEGV, si_code, va, 0, current); + return; + +give_sigbus: + regs->pc -= 4; + send_sig_fault(SIGBUS, BUS_ADRALN, va, 0, current); +} + +void +trap_init(void) +{ + /* Tell HMcode what global pointer we want in the kernel. 
*/ + register unsigned long gptr __asm__("$29"); + wrkgp(gptr); + + wrent(entArith, 1); + wrent(entMM, 2); + wrent(entIF, 3); + wrent(entUna, 4); + wrent(entSys, 5); +} diff --git a/arch/sw_64/kernel/unaligned.c b/arch/sw_64/kernel/unaligned.c new file mode 100644 index 000000000000..4ec1187d6cd0 --- /dev/null +++ b/arch/sw_64/kernel/unaligned.c @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (C) 2020 Mao Minkai + * Author: Mao Minkai + * + * This code is taken from arch/mips/kernel/segment.c + * Copyright (C) 2013 Imagination Technologies Ltd. + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + */ + +#include <linux/kernel.h> +#include <linux/debugfs.h> +#include <linux/seq_file.h> +#include <asm/unaligned.h> +#include <asm/debug.h> + +static int show_unaligned(struct seq_file *sf, void *v) +{ + extern struct unaligned_stat { + unsigned long count, va, pc; + } unaligned[2]; + + seq_printf(sf, "kernel unaligned acc\t: %ld (pc=%lx, va=%lx)\n", unaligned[0].count, unaligned[0].pc, unaligned[0].va); + seq_printf(sf, "user unaligned acc\t: %ld (pc=%lx, va=%lx)\n", unaligned[1].count, unaligned[1].pc, unaligned[1].va); + + return 0; +} + +static int unaligned_open(struct inode *inode, struct file *file) +{ + return single_open(file, show_unaligned, NULL); +} + +static const struct file_operations unaligned_fops = { + .open = unaligned_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int __init unaligned_info(void) +{ + struct dentry *unaligned; + + if (!sw64_debugfs_dir) + return -ENODEV; + + unaligned = debugfs_create_file("unaligned", S_IRUGO, + sw64_debugfs_dir, NULL, + &unaligned_fops); + if (!unaligned) + return -ENOMEM; + return 0; +} +device_initcall(unaligned_info); diff --git a/arch/sw_64/kernel/uprobes.c b/arch/sw_64/kernel/uprobes.c new file mode 100644 index 000000000000..d10464d0dcdd --- /dev/null +++ b/arch/sw_64/kernel/uprobes.c @@ -0,0 +1,158 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/highmem.h> +#include <linux/kdebug.h> +#include <linux/types.h> +#include <linux/notifier.h> +#include <linux/sched.h> +#include <linux/uprobes.h> +#include <linux/ptrace.h> + +#include <asm/ptrace.h> + +#define UPROBE_TRAP_NR ULONG_MAX + +/** + * arch_uprobe_analyze_insn - instruction analysis including validity and fixups. + * @mm: the probed address space. + * @arch_uprobe: the probepoint information. + * @addr: virtual address at which to install the probepoint + * Return 0 on success or a -ve number on error. + */ +int arch_uprobe_analyze_insn(struct arch_uprobe *aup, + struct mm_struct *mm, unsigned long addr) +{ + u32 inst; + + if (addr & 0x03) + return -EINVAL; + + inst = aup->insn; + + aup->ixol[0] = aup->insn; + aup->ixol[1] = UPROBE_BRK_UPROBE_XOL; /* NOP */ + + return 0; +} + +void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, + void *src, unsigned long len) +{ + unsigned long kaddr, kstart; + + /* Initialize the slot */ + kaddr = (unsigned long)kmap_atomic(page); + kstart = kaddr + (vaddr & ~PAGE_MASK); + memcpy((void *)kstart, src, len); + flush_icache_range(kstart, kstart + len); + kunmap_atomic((void *)kaddr); +} + +/* + * arch_uprobe_pre_xol - prepare to execute out of line. + * @auprobe: the probepoint information. + * @regs: reflects the saved user state of current task. 
+ */ +int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs) +{ + struct uprobe_task *utask = current->utask; + + /* Set the instruction pointer to the out-of-line (XOL) slot */ + instruction_pointer_set(regs, utask->xol_vaddr); + + user_enable_single_step(current); + + return 0; +} + +int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs) +{ + struct uprobe_task *utask = current->utask; + + /* Set the instruction pointer to the insn following the breakpoint */ + instruction_pointer_set(regs, utask->vaddr + 4); + + user_disable_single_step(current); + + return 0; +} + +/* + * If the xol insn itself traps and generates a signal (say, + * SIGILL/SIGSEGV/etc), then detect the case where a singlestepped + * instruction jumps back to its own address. It is assumed that anything + * like do_page_fault/do_trap/etc sets thread.trap_nr != -1. + * + * arch_uprobe_pre_xol/arch_uprobe_post_xol save/restore thread.trap_nr, + * arch_uprobe_xol_was_trapped() simply checks that ->trap_nr is not equal to + * UPROBE_TRAP_NR == -1 set by arch_uprobe_pre_xol(). + */ +bool arch_uprobe_xol_was_trapped(struct task_struct *tsk) +{ + return false; +} + +int arch_uprobe_exception_notify(struct notifier_block *self, + unsigned long val, void *data) +{ + struct die_args *args = data; + struct pt_regs *regs = args->regs; + + /* regs == NULL is a kernel bug */ + if (WARN_ON(!regs)) + return NOTIFY_DONE; + + /* We are only interested in userspace traps */ + if (!user_mode(regs)) + return NOTIFY_DONE; + + switch (val) { + case DIE_UPROBE: + if (uprobe_pre_sstep_notifier(regs)) + return NOTIFY_STOP; + break; + case DIE_UPROBE_XOL: + if (uprobe_post_sstep_notifier(regs)) + return NOTIFY_STOP; + break; + default: + break; + } + + return NOTIFY_DONE; +} + +/* + * This function gets called when the XOL instruction either gets trapped or + * the thread has a fatal signal. Reset the instruction pointer to its + * probed address for the potential restart or for post mortem analysis. + */ +void arch_uprobe_abort_xol(struct arch_uprobe *aup, + struct pt_regs *regs) +{ + struct uprobe_task *utask = current->utask; + + instruction_pointer_set(regs, utask->vaddr); +} + +unsigned long arch_uretprobe_hijack_return_addr( + unsigned long trampoline_vaddr, struct pt_regs *regs) +{ + unsigned long ra; + + ra = regs->r26; + + /* Replace the return address with the trampoline address */ + regs->r26 = trampoline_vaddr; + + return ra; +} + +/* + * See if the instruction can be emulated. + * Returns true if instruction was emulated, false otherwise. + * + * For now we never emulate, so this function just returns false. + */ +bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + return false; +} diff --git a/arch/sw_64/kernel/vdso.c b/arch/sw_64/kernel/vdso.c new file mode 100644 index 000000000000..32ed952748f0 --- /dev/null +++ b/arch/sw_64/kernel/vdso.c @@ -0,0 +1,152 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. 
+ * + */ + +#include <linux/kernel.h> +#include <linux/clocksource.h> +#include <linux/elf.h> +#include <linux/err.h> +#include <linux/errno.h> +#include <linux/gfp.h> +#include <linux/mm.h> +#include <linux/sched.h> +#include <linux/signal.h> +#include <linux/slab.h> +#include <linux/timekeeper_internal.h> +#include <linux/vmalloc.h> + +#include <asm/cacheflush.h> +#include <asm/vdso.h> + +extern char vdso_start, vdso_end; +static unsigned long vdso_pages; +static struct page **vdso_pagelist; + +/* + * The vDSO data page. + */ +static union { + struct vdso_data data; + u8 page[PAGE_SIZE]; +} vdso_data_store __page_aligned_data; +struct vdso_data *vdso_data = &vdso_data_store.data; + +static struct vm_special_mapping vdso_spec[2]; + +static int __init vdso_init(void) +{ + int i; + + if (memcmp(&vdso_start, "\177ELF", 4)) { + pr_err("vDSO is not a valid ELF object!\n"); + return -EINVAL; + } + + vdso_pages = (&vdso_end - &vdso_start) >> PAGE_SHIFT; + pr_info("vdso: %ld pages (%ld code @ %p, %ld data @ %p)\n", + vdso_pages + 1, vdso_pages, &vdso_start, 1L, vdso_data); + + /* Allocate the vDSO pagelist, plus a page for the data. */ + vdso_pagelist = kcalloc(vdso_pages + 1, sizeof(struct page *), + GFP_KERNEL); + if (vdso_pagelist == NULL) + return -ENOMEM; + + /* Grab the vDSO data page. */ + vdso_pagelist[0] = virt_to_page(vdso_data); + + /* Grab the vDSO code pages. */ + for (i = 0; i < vdso_pages; i++) + vdso_pagelist[i + 1] = virt_to_page(&vdso_start + i * PAGE_SIZE); + + /* Populate the special mapping structures */ + vdso_spec[0] = (struct vm_special_mapping) { + .name = "[vvar]", + .pages = vdso_pagelist, + }; + + vdso_spec[1] = (struct vm_special_mapping) { + .name = "[vdso]", + .pages = &vdso_pagelist[1], + }; + + return 0; +} +arch_initcall(vdso_init); + +int arch_setup_additional_pages(struct linux_binprm *bprm, + int uses_interp) +{ + struct mm_struct *mm = current->mm; + unsigned long vdso_base, vdso_text_len, vdso_mapping_len; + void *ret; + + vdso_text_len = vdso_pages << PAGE_SHIFT; + /* Be sure to map the data page */ + vdso_mapping_len = vdso_text_len + PAGE_SIZE; + + if (down_write_killable(&mm->mmap_lock)) + return -EINTR; + vdso_base = get_unmapped_area(NULL, 0, vdso_mapping_len, 0, 0); + if (IS_ERR_VALUE(vdso_base)) { + ret = ERR_PTR(vdso_base); + goto up_fail; + } + ret = _install_special_mapping(mm, vdso_base, PAGE_SIZE, + VM_READ|VM_MAYREAD, + &vdso_spec[0]); + if (IS_ERR(ret)) + goto up_fail; + + vdso_base += PAGE_SIZE; + mm->context.vdso = (void *)vdso_base; + ret = _install_special_mapping(mm, vdso_base, vdso_text_len, + VM_READ|VM_EXEC| + VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, + &vdso_spec[1]); + if (IS_ERR(ret)) + goto up_fail; + + up_write(&mm->mmap_lock); + return 0; + +up_fail: + mm->context.vdso = NULL; + up_write(&mm->mmap_lock); + return PTR_ERR(ret); +} + +void update_vsyscall(struct timekeeper *tk) +{ + vdso_data_write_begin(vdso_data); + + vdso_data->xtime_sec = tk->xtime_sec; + vdso_data->xtime_nsec = tk->tkr_mono.xtime_nsec; + vdso_data->wall_to_mono_sec = tk->wall_to_monotonic.tv_sec; + vdso_data->wall_to_mono_nsec = tk->wall_to_monotonic.tv_nsec; + vdso_data->cs_shift = tk->tkr_mono.shift; + + vdso_data->cs_mult = tk->tkr_mono.mult; + vdso_data->cs_cycle_last = tk->tkr_mono.cycle_last; + vdso_data->cs_mask = tk->tkr_mono.mask; + + vdso_data_write_end(vdso_data); +} + +void update_vsyscall_tz(void) +{ + vdso_data->tz_minuteswest = sys_tz.tz_minuteswest; + vdso_data->tz_dsttime = sys_tz.tz_dsttime; +} diff --git a/arch/sw_64/kernel/vdso/.gitignore 
b/arch/sw_64/kernel/vdso/.gitignore new file mode 100644 index 000000000000..2b6a8b0ed7ca --- /dev/null +++ b/arch/sw_64/kernel/vdso/.gitignore @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0 +vdso.lds +vdso.so.dbg.tmp +vdso-syms.S diff --git a/arch/sw_64/kernel/vdso/Makefile b/arch/sw_64/kernel/vdso/Makefile new file mode 100644 index 000000000000..dfbd81c2c801 --- /dev/null +++ b/arch/sw_64/kernel/vdso/Makefile @@ -0,0 +1,74 @@ +# SPDX-License-Identifier: GPL-2.0 +# Symbols present in the vdso +vdso-syms = rt_sigreturn gettimeofday + +# Files to link into the vdso +obj-vdso = $(patsubst %, v%.o, $(vdso-syms)) + +# Build rules +targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds vdso-syms.S +obj-vdso := $(addprefix $(obj)/, $(obj-vdso)) + +obj-y += vdso.o vdso-syms.o +extra-y += vdso.lds +CPPFLAGS_vdso.lds += -P -C -U$(ARCH) + +# vDSO code runs in userspace and -pg doesn't help with profiling anyway. +CFLAGS_REMOVE_vdso.o = -pg +CFLAGS_REMOVE_vrt_sigreturn.o = -pg +CFLAGS_REMOVE_vgettimeofday.o = -pg + +ifdef CONFIG_FEEDBACK_COLLECT +# vDSO code runs in userspace, not collecting feedback data. +CFLAGS_REMOVE_vdso.o = -ffeedback-generate +CFLAGS_REMOVE_vrt_sigreturn.o = -ffeedback-generate +CFLAGS_REMOVE_vgettimeofday.o = -ffeedback-generate +endif + +# Disable gcov profiling for VDSO code +GCOV_PROFILE := n + +# Force dependency +$(obj)/vdso.o: $(obj)/vdso.so + +# link rule for the .so file, .lds has to be first +SYSCFLAGS_vdso.so.dbg = $(c_flags) +$(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) + $(call if_changed,vdsold) +SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \ + $(call cc-ldoption, -Wl$(comma)--hash-style=both) + +$(obj)/vdso-syms.S: $(obj)/vdso.so FORCE + $(call if_changed,so2s) + +# strip rule for the .so file +$(obj)/%.so: OBJCOPYFLAGS := -S +$(obj)/%.so: $(obj)/%.so.dbg FORCE + $(call if_changed,objcopy) + +# actual build commands +# The DSO images are built using a special linker script +# Add -lgcc so tilepro gets static muldi3 and lshrdi3 definitions. +# Make sure only to export the intended __vdso_xxx symbol offsets. +quiet_cmd_vdsold = VDSOLD $@ + cmd_vdsold = $(CC) $(KCFLAGS) -nostdlib $(SYSCFLAGS_$(@F)) \ + -Wl,-T,$(filter-out FORCE,$^) -o $@.tmp -lgcc && \ + $(CROSS_COMPILE)objcopy \ + $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ + rm $@.tmp + +# Extracts symbol offsets from the VDSO, converting them into an assembly file +# that contains the same symbols at the same offsets. 
+quiet_cmd_so2s = SO2S $@ + cmd_so2s = $(NM) -D $< | $(srctree)/$(src)/so2s.sh > $@ + +# install commands for the unstripped file +quiet_cmd_vdso_install = INSTALL $@ + cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@ + +vdso.so: $(obj)/vdso.so.dbg + @mkdir -p $(MODLIB)/vdso + $(call cmd,vdso_install) + + +vdso_install: vdso.so diff --git a/arch/sw_64/kernel/vdso/so2s.sh b/arch/sw_64/kernel/vdso/so2s.sh new file mode 100755 index 000000000000..8f23ac544d1b --- /dev/null +++ b/arch/sw_64/kernel/vdso/so2s.sh @@ -0,0 +1,5 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0+ +# Copyright 2020 Palmer Dabbelt palmerdabbelt@google.com + +grep -v "LINUX" | sed 's/\([0-9a-f]*\) T \([a-z0-9_]*\)/.globl\t\2\n\2:\n.quad\t0x\1/' diff --git a/arch/sw_64/kernel/vdso/vdso.S b/arch/sw_64/kernel/vdso/vdso.S new file mode 100644 index 000000000000..ce5448d00cf7 --- /dev/null +++ b/arch/sw_64/kernel/vdso/vdso.S @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + * + */ + +#include <linux/init.h> +#include <linux/linkage.h> +#include <linux/const.h> +#include <asm/page.h> + + __PAGE_ALIGNED_DATA + + .globl vdso_start, vdso_end + .balign PAGE_SIZE +vdso_start: + .incbin "arch/sw_64/kernel/vdso/vdso.so" + .balign PAGE_SIZE +vdso_end: + + .previous diff --git a/arch/sw_64/kernel/vdso/vdso.lds.S b/arch/sw_64/kernel/vdso/vdso.lds.S new file mode 100644 index 000000000000..67a635d6dfaf --- /dev/null +++ b/arch/sw_64/kernel/vdso/vdso.lds.S @@ -0,0 +1,89 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * GNU linker script for the VDSO library. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + * + * Heavily based on the vDSO linker scripts for other archs. + */ + +#include <linux/const.h> +#include <asm/page.h> +#include <asm/vdso.h> + +OUTPUT_FORMAT("elf64-sw_64") +OUTPUT_ARCH(sw_64) + +SECTIONS +{ + PROVIDE(_vdso_data = . - PAGE_SIZE); + . = VDSO_LBASE + SIZEOF_HEADERS; + + .hash : { *(.hash) } :text + .gnu.hash : { *(.gnu.hash) } + .dynsym : { *(.dynsym) } + .dynstr : { *(.dynstr) } + .gnu.version : { *(.gnu.version) } + .gnu.version_d : { *(.gnu.version_d) } + .gnu.version_r : { *(.gnu.version_r) } + + .note : { *(.note.*) } :text :note + + . 
= ALIGN(16); + .text : { *(.text*) } + PROVIDE (__etext = .); + PROVIDE (_etext = .); + PROVIDE (etext = .); + + .eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr + .eh_frame : { KEEP (*(.eh_frame)) } :text + + .dynamic : { *(.dynamic) } :text :dynamic + + .rodata : { *(.rodata*) } :text + + _end = .; + PROVIDE(end = .); + + /DISCARD/ : { + *(.note.GNU-stack) + *(.data .data.* .gnu.linkonce.d.* .sdata*) + *(.bss .sbss .dynbss .dynsbss) + } +} + +/* + * We must supply the ELF program headers explicitly to get just one + * PT_LOAD segment, and set the flags explicitly to make segments read-only. + */ +PHDRS +{ + text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */ + dynamic PT_DYNAMIC FLAGS(4); /* PF_R */ + note PT_NOTE FLAGS(4); /* PF_R */ + eh_frame_hdr PT_GNU_EH_FRAME; +} + +/* + * This controls what symbols we export from the DSO. + */ +VERSION +{ + LINUX_2.6.39 { + global: + __vdso_rt_sigreturn; + __vdso_gettimeofday; + __vdso_clock_gettime; + local: *; + }; +} diff --git a/arch/sw_64/kernel/vdso/vgettimeofday.c b/arch/sw_64/kernel/vdso/vgettimeofday.c new file mode 100644 index 000000000000..6ba9ff6e33d5 --- /dev/null +++ b/arch/sw_64/kernel/vdso/vgettimeofday.c @@ -0,0 +1,179 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation, version 2. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for + * more details. + */ + +#include <linux/time.h> +#include <asm/timex.h> +#include <asm/unistd.h> +#include <asm/vdso.h> +#include <asm/io.h> + +static __always_inline int do_realtime_coarse(struct timespec64 *ts, + const struct vdso_data *data) +{ + u32 start_seq; + + do { + start_seq = vdso_data_read_begin(data); + + ts->tv_sec = data->xtime_sec; + ts->tv_nsec = data->xtime_nsec >> data->cs_shift; + } while (vdso_data_read_retry(data, start_seq)); + + return 0; +} + + +static __always_inline int do_monotonic_coarse(struct timespec64 *ts, + const struct vdso_data *data) +{ + u32 start_seq; + u32 to_mono_sec; + u32 to_mono_nsec; + + do { + start_seq = vdso_data_read_begin(data); + + ts->tv_sec = data->xtime_sec; + ts->tv_nsec = data->xtime_nsec >> data->cs_shift; + + to_mono_sec = data->wall_to_mono_sec; + to_mono_nsec = data->wall_to_mono_nsec; + } while (vdso_data_read_retry(data, start_seq)); + + ts->tv_sec += to_mono_sec; + timespec64_add_ns(ts, to_mono_nsec); + + return 0; +} + +static __always_inline u64 read_longtime(void) +{ + register unsigned long __r0 __asm__("$0"); + + __asm__ __volatile__( + "sys_call 0xB1" + : "=r"(__r0) + ::"memory"); + + return __r0; +} + +static __always_inline u64 get_ns(const struct vdso_data *data) +{ + u64 cycle_now, delta, nsec; + + cycle_now = read_longtime(); + delta = (cycle_now - data->cs_cycle_last) & data->cs_mask; + + nsec = (delta * data->cs_mult) + data->xtime_nsec; + nsec >>= data->cs_shift; + + return nsec; +} + + +static __always_inline int do_realtime(struct timespec64 *ts, + const struct vdso_data *data) +{ + u32 start_seq; + u64 ns; + + do { + start_seq = vdso_data_read_begin(data); + + ts->tv_sec = data->xtime_sec; + ns = get_ns(data); + } while (vdso_data_read_retry(data, start_seq)); + + ts->tv_nsec = 0; + timespec64_add_ns(ts, ns); + + return 0; +} 
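[Editorial aside, not part of the patch: do_realtime() above and do_monotonic() below both follow the usual vDSO seqcount discipline — snapshot the timekeeping fields, then retry if the kernel published an update while they were being read. A minimal sketch of that loop, assuming vdso_data_read_begin() spins while an update is in flight and vdso_data_read_retry() reports a changed sequence count:

	u32 seq;
	u64 sec, ns;

	do {
		seq = vdso_data_read_begin(data);	/* wait out an in-flight update */
		sec = data->xtime_sec;			/* coherent snapshot of the fields */
		ns = get_ns(data);			/* cycle delta scaled to nanoseconds */
	} while (vdso_data_read_retry(data, seq));	/* sequence moved: start over */

Because the readers never take a lock, update_vsyscall() can run concurrently on the kernel side without ever blocking a userspace clock_gettime() call.]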
+ +static __always_inline int do_monotonic(struct timespec64 *ts, + const struct vdso_data *data) +{ + u32 start_seq; + u64 ns; + u32 to_mono_sec; + u32 to_mono_nsec; + + do { + start_seq = vdso_data_read_begin(data); + + ts->tv_sec = data->xtime_sec; + ns = get_ns(data); + + to_mono_sec = data->wall_to_mono_sec; + to_mono_nsec = data->wall_to_mono_nsec; + } while (vdso_data_read_retry(data, start_seq)); + + ts->tv_sec += to_mono_sec; + ts->tv_nsec = 0; + timespec64_add_ns(ts, ns + to_mono_nsec); + + return 0; +} + + +int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) +{ + const struct vdso_data *data = get_vdso_data(); + struct timespec64 ts; + int ret; + + ret = do_realtime(&ts, data); + if (ret) + return ret; + + if (tv) { + tv->tv_sec = ts.tv_sec; + tv->tv_usec = ts.tv_nsec / 1000; + } + + if (tz) { + tz->tz_minuteswest = data->tz_minuteswest; + tz->tz_dsttime = data->tz_dsttime; + } + + return 0; +} + +int __vdso_clock_gettime(clockid_t clkid, struct timespec64 *ts) +{ + const struct vdso_data *data = get_vdso_data(); + int ret; + + switch (clkid) { + case CLOCK_REALTIME_COARSE: + ret = do_realtime_coarse(ts, data); + break; + case CLOCK_MONOTONIC_COARSE: + ret = do_monotonic_coarse(ts, data); + break; + case CLOCK_REALTIME: + ret = do_realtime(ts, data); + break; + case CLOCK_MONOTONIC: + ret = do_monotonic(ts, data); + break; + default: + ret = -ENOSYS; + break; + } + + /* If we return -ENOSYS libc should fall back to a syscall. */ + return ret; +} diff --git a/arch/sw_64/kernel/vdso/vrt_sigreturn.S b/arch/sw_64/kernel/vdso/vrt_sigreturn.S new file mode 100644 index 000000000000..c07eb7244d0c --- /dev/null +++ b/arch/sw_64/kernel/vdso/vrt_sigreturn.S @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Sigreturn trampoline for returning from a signal when the SA_RESTORER + * flag is not set. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + * + */ + +#include <linux/linkage.h> +#include <asm/unistd.h> + + .text + +ENTRY(__vdso_rt_sigreturn) + mov $sp, $16 + ldi $0, __NR_rt_sigreturn + sys_call 0x83 +ENDPROC(__vdso_rt_sigreturn) diff --git a/arch/sw_64/kernel/vmlinux.lds.S b/arch/sw_64/kernel/vmlinux.lds.S new file mode 100644 index 000000000000..a106be42121f --- /dev/null +++ b/arch/sw_64/kernel/vmlinux.lds.S @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#define EMITS_PT_NOTE +#define RO_EXCEPTION_TABLE_ALIGN 16 + +#include <asm-generic/vmlinux.lds.h> +#include <asm/thread_info.h> +#include <asm/cache.h> +#include <asm/page.h> +#include <asm/setup.h> + +OUTPUT_FORMAT("elf64-sw_64") +OUTPUT_ARCH(sw_64) +ENTRY(__start) +PHDRS { text PT_LOAD; note PT_NOTE; } +jiffies = jiffies_64; +SECTIONS +{ + . 
= _TEXT_START; + + _text = .; /* Text and read-only data */ + .text : { + HEAD_TEXT + TEXT_TEXT + SCHED_TEXT + CPUIDLE_TEXT + LOCK_TEXT + IRQENTRY_TEXT + SOFTIRQENTRY_TEXT + KPROBES_TEXT + *(.fixup) + *(.gnu.warning) + } :text + _etext = .; /* End of text section */ + + RO_DATA(4096) + + /* Will be freed after init */ + __init_begin = ALIGN(PAGE_SIZE); + INIT_TEXT_SECTION(PAGE_SIZE) + INIT_DATA_SECTION(16) + PERCPU_SECTION(L1_CACHE_BYTES) + + /* + * Align to THREAD_SIZE rather than PAGE_SIZE here so any padding page + * needed for the THREAD_SIZE aligned init_task gets freed after init + */ + . = ALIGN(THREAD_SIZE); + __init_end = .; + /* Freed after init ends here */ + + _sdata = .; /* Start of rw data section */ + _data = .; + RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE) + .got : { +#ifdef CONFIG_RELOCATABLE + _got_start = .; +#endif + *(.got) +#ifdef CONFIG_RELOCATABLE + _got_end = .; +#endif + } + .sdata : { + *(.sdata) + } + _edata = .; /* End of data section */ + +#ifdef CONFIG_RELOCATABLE + . = ALIGN(4); + .data.reloc : { + _relocation_start = .; + /* + * Space for relocation table + * This needs to be filled so that the + * relocs tool can overwrite the content. + * An invalid value is left at the start of the + * section to abort relocation if the table + * has not been filled in. + */ + LONG(0xFFFFFFFF); + FILL(0); + . += CONFIG_RELOCATION_TABLE_SIZE - 4; + _relocation_end = .; + } +#endif + BSS_SECTION(0, 0, 0) + _end = .; + + .mdebug 0 : { + *(.mdebug) + } + .note 0 : { + *(.note) + } + + STABS_DEBUG + DWARF_DEBUG + ELF_DETAILS + + DISCARDS +} diff --git a/arch/sw_64/kvm/Kconfig b/arch/sw_64/kvm/Kconfig new file mode 100644 index 000000000000..230ac526911c --- /dev/null +++ b/arch/sw_64/kvm/Kconfig @@ -0,0 +1,47 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# KVM configuration +# +source "virt/kvm/Kconfig" + +menuconfig VIRTUALIZATION + bool "Virtualization" + help + Say Y here to get to see options for using your Linux host to run + other operating systems inside virtual machines (guests). + This option alone does not add any kernel code. + + If you say N, all options in this submenu will be skipped and disabled. + +if VIRTUALIZATION + +config KVM + tristate "Kernel-based Virtual Machine (KVM) support" + select KVM_SW64_HOST + select PREEMPT_NOTIFIERS + select CMA + depends on NET + select HAVE_KVM_EVENTFD + select HAVE_KVM_IRQCHIP + select HAVE_KVM_IRQ_ROUTING + select HAVE_KVM_IRQFD + select HAVE_KVM_MSI + select KVM_VFIO + select TUN + select GENERIC_ALLOCATOR + help + Support for hosting Guest kernels. + We don't support KVM with 3-level page tables yet. + + If unsure, say N. + +config KVM_SW64_HOST + tristate "KVM for SW64 processors support" + depends on KVM + help + Provides host support for SW64 processors. + To compile this as a module, choose M here. 
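[Editorial aside, illustrative only and not part of the patch: with the dependencies above satisfied, a host .config that builds this support in would carry a fragment along these lines:

	CONFIG_VIRTUALIZATION=y
	CONFIG_KVM=y
	CONFIG_KVM_SW64_HOST=y

Building KVM_SW64_HOST as "m" instead produces the kvm.o module listed in the Makefile that follows.]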
+ +source "drivers/vhost/Kconfig" + +endif # VIRTUALIZATION diff --git a/arch/sw_64/kvm/Makefile b/arch/sw_64/kvm/Makefile new file mode 100644 index 000000000000..48ae938faab7 --- /dev/null +++ b/arch/sw_64/kvm/Makefile @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for Kernel-based Virtual Machine module +# + +KVM := ../../../virt/kvm + +ccflags-y += -Ivirt/kvm -Iarch/sw_64/kvm + +kvm-$(CONFIG_KVM_SW64_HOST) += $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o +kvm-$(CONFIG_KVM_SW64_HOST) += kvm-sw64.o entry.o emulate.o mmio.o kvm_timer.o handle_exit.o + +obj-$(CONFIG_KVM_SW64_HOST) += kvm.o diff --git a/arch/sw_64/kvm/emulate.c b/arch/sw_64/kvm/emulate.c new file mode 100644 index 000000000000..1552119e6346 --- /dev/null +++ b/arch/sw_64/kvm/emulate.c @@ -0,0 +1,115 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2018 - os kernal + * Author: fire3 fire3@example.com yangzh yangzh@gmail.com + * linhn linhn@example.com + */ +#include <asm/kvm_emulate.h> +#include <linux/errno.h> +#include <linux/err.h> + +void sw64_decode(struct kvm_vcpu *vcpu, unsigned int insn, struct kvm_run *run) +{ + int opc, ra; + + opc = (insn >> 26) & 0x3f; + ra = (insn >> 21) & 0x1f; + + switch (opc) { + case 0x20: /* LDBU */ + run->mmio.is_write = 0; + run->mmio.len = 1; + vcpu->arch.mmio_decode.rt = ra; + break; + case 0x21: /* LDHU */ + run->mmio.is_write = 0; + run->mmio.len = 2; + vcpu->arch.mmio_decode.rt = ra; + break; + case 0x22: /* LDW */ + run->mmio.is_write = 0; + run->mmio.len = 4; + vcpu->arch.mmio_decode.rt = ra; + break; + case 0x23: /* LDL */ + run->mmio.is_write = 0; + run->mmio.len = 8; + vcpu->arch.mmio_decode.rt = ra; + break; + case 0x28: /* STB */ + run->mmio.is_write = 1; + *(unsigned long *)run->mmio.data = vcpu_get_reg(vcpu, ra) & 0xffUL; + run->mmio.len = 1; + break; + case 0x29: /* STH */ + run->mmio.is_write = 1; + *(unsigned long *)run->mmio.data = vcpu_get_reg(vcpu, ra) & 0xffffUL; + run->mmio.len = 2; + break; + case 0x2a: /* STW */ + run->mmio.is_write = 1; + *(unsigned long *)run->mmio.data = vcpu_get_reg(vcpu, ra) & 0xffffffffUL; + run->mmio.len = 4; + break; + case 0x2b: /* STL */ + run->mmio.is_write = 1; + *(unsigned long *)run->mmio.data = vcpu_get_reg(vcpu, ra); + run->mmio.len = 8; + break; + default: + printk("Miss done opc %d\n", opc); + break; + } +} + +/* + * Virtual Interrupts. + */ +unsigned int interrupt_pending(struct kvm_vcpu *vcpu, bool *more) +{ + unsigned int irq; + DECLARE_BITMAP(blk, SWVM_IRQS); + + bitmap_copy(blk, vcpu->arch.irqs_pending, SWVM_IRQS); + + irq = find_last_bit(blk, SWVM_IRQS); + + return irq; +} + +void clear_vcpu_irq(struct kvm_vcpu *vcpu) +{ + vcpu->arch.vcb.vcpu_irq = 0xffffffffffffffffUL; +} + +void inject_vcpu_irq(struct kvm_vcpu *vcpu, unsigned int irq) +{ + vcpu->arch.vcb.vcpu_irq = irq; +} + +/* + * This actually diverts the Guest to running an interrupt handler, once an + * interrupt has been identified by interrupt_pending(). + */ +void try_deliver_interrupt(struct kvm_vcpu *vcpu, unsigned int irq, bool more) +{ + BUG_ON(irq >= SWVM_IRQS); + + /* Otherwise we check if they have interrupts disabled. */ + if (vcpu->arch.vcb.vcpu_irq_disabled) { + clear_vcpu_irq(vcpu); + return; + } + + /* If they don't have a handler (yet?), we just ignore it */ + if (vcpu->arch.vcb.ent_int != 0) { + /* OK, mark it no longer pending and deliver it. 
*/ + clear_bit(irq, (vcpu->arch.irqs_pending)); + /* + * set_guest_interrupt() takes the interrupt descriptor and a + * flag to say whether this interrupt pushes an error code onto + * the stack as well: virtual interrupts never do. + */ + inject_vcpu_irq(vcpu, irq); + } +} diff --git a/arch/sw_64/kvm/entry.S b/arch/sw_64/kvm/entry.S new file mode 100644 index 000000000000..76ebdda920cb --- /dev/null +++ b/arch/sw_64/kvm/entry.S @@ -0,0 +1,285 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2018 - os kernal + * Author: fire3 fire3@example.com + */ + .text +#include <linux/linkage.h> +#include <asm/asm-offsets.h> +#include <asm/page.h> +#include <asm/regdef.h> + + .set noat + +ENTRY(__sw64_vcpu_run) + + /* save host fpregs */ + ldl $1, TI_TASK($8) + ldi $1, TASK_THREAD($1) + rfpcr $f0 + fstd $f0, THREAD_FPCR($1) + ldi $1, THREAD_CTX_FP($1) + vstd $f2, CTX_FP_F2($1) + vstd $f3, CTX_FP_F3($1) + vstd $f4, CTX_FP_F4($1) + vstd $f5, CTX_FP_F5($1) + vstd $f6, CTX_FP_F6($1) + vstd $f7, CTX_FP_F7($1) + vstd $f8, CTX_FP_F8($1) + vstd $f9, CTX_FP_F9($1) + + ldi sp, -VCPU_RET_SIZE(sp) + /* r16 = guest kvm_vcpu_arch.vcb struct pointer */ + /* r17 = guest kvm_vcpu_arch.kvm_regs struct pointer */ + /* r18 = hcall args */ + /* save host pt_regs to current kernel stack */ + ldi sp, -PT_REGS_SIZE(sp) + + stl $8, PT_REGS_R8(sp) + stl $26, PT_REGS_R26(sp) + + /* save host switch stack to current kernel stack */ + ldi sp, -SWITCH_STACK_SIZE(sp) + stl $9, SWITCH_STACK_R9(sp) + stl $10, SWITCH_STACK_R10(sp) + stl $11, SWITCH_STACK_R11(sp) + stl $12, SWITCH_STACK_R12(sp) + stl $13, SWITCH_STACK_R13(sp) + stl $14, SWITCH_STACK_R14(sp) + stl $15, SWITCH_STACK_R15(sp) + + /* restore guest switch stack from guest kvm_regs struct */ + ldl $0, KVM_REGS_R0($17) + ldl $1, KVM_REGS_R1($17) + /* restore $2 later */ + ldl $3, KVM_REGS_R3($17) + ldl $4, KVM_REGS_R4($17) + ldl $5, KVM_REGS_R5($17) + ldl $6, KVM_REGS_R6($17) + ldl $7, KVM_REGS_R7($17) + ldl $8, KVM_REGS_R8($17) + ldl $9, KVM_REGS_R9($17) + ldl $10, KVM_REGS_R10($17) + ldl $11, KVM_REGS_R11($17) + ldl $12, KVM_REGS_R12($17) + ldl $13, KVM_REGS_R13($17) + ldl $14, KVM_REGS_R14($17) + ldl $15, KVM_REGS_R15($17) + ldl $19, KVM_REGS_R19($17) + ldl $20, KVM_REGS_R20($17) + ldl $21, KVM_REGS_R21($17) + ldl $22, KVM_REGS_R22($17) + ldl $23, KVM_REGS_R23($17) + ldl $24, KVM_REGS_R24($17) + ldl $25, KVM_REGS_R25($17) + ldl $26, KVM_REGS_R26($17) + ldl $27, KVM_REGS_R27($17) + ldl $28, KVM_REGS_R28($17) + + fldd $f0, KVM_REGS_FPCR($17) + wfpcr $f0 + fimovd $f0, $2 + and $2, 0x3, $2 + beq $2, $g_setfpec_0 + subl $2, 0x1, $2 + beq $2, $g_setfpec_1 + subl $2, 0x1, $2 + beq $2, $g_setfpec_2 + setfpec3 + br $g_setfpec_over +$g_setfpec_0: + setfpec0 + br $g_setfpec_over +$g_setfpec_1: + setfpec1 + br $g_setfpec_over +$g_setfpec_2: + setfpec2 +$g_setfpec_over: + ldl $2, KVM_REGS_R2($17) + vldd $f0, KVM_REGS_F0($17) + vldd $f1, KVM_REGS_F1($17) + vldd $f2, KVM_REGS_F2($17) + vldd $f3, KVM_REGS_F3($17) + vldd $f4, KVM_REGS_F4($17) + vldd $f5, KVM_REGS_F5($17) + vldd $f6, KVM_REGS_F6($17) + vldd $f7, KVM_REGS_F7($17) + vldd $f8, KVM_REGS_F8($17) + vldd $f9, KVM_REGS_F9($17) + vldd $f10, KVM_REGS_F10($17) + vldd $f11, KVM_REGS_F11($17) + vldd $f12, KVM_REGS_F12($17) + vldd $f13, KVM_REGS_F13($17) + vldd $f14, KVM_REGS_F14($17) + vldd $f15, KVM_REGS_F15($17) + vldd $f16, KVM_REGS_F16($17) + vldd $f17, KVM_REGS_F17($17) + vldd $f18, KVM_REGS_F18($17) + vldd $f19, KVM_REGS_F19($17) + vldd $f20, KVM_REGS_F20($17) + vldd $f21, KVM_REGS_F21($17) + vldd $f22, 
KVM_REGS_F22($17) + vldd $f23, KVM_REGS_F23($17) + vldd $f24, KVM_REGS_F24($17) + vldd $f25, KVM_REGS_F25($17) + vldd $f26, KVM_REGS_F26($17) + vldd $f27, KVM_REGS_F27($17) + vldd $f28, KVM_REGS_F28($17) + vldd $f29, KVM_REGS_F29($17) + vldd $f30, KVM_REGS_F30($17) + + ldi $17, KVM_REGS_PS($17) + + /* enter guest */ + /* r16 = guest vcpucb pointer */ + /* r17 = base of guest kvm_regs.ps, saved/restored by hmcode */ + + /* enter guest now */ + sys_call 0x31 + /* exit guest now */ + + ldi $17, -KVM_REGS_PS($17) /* r17: base of kvm_regs */ + + vstd $f0, KVM_REGS_F0($17) + vstd $f1, KVM_REGS_F1($17) + vstd $f2, KVM_REGS_F2($17) + vstd $f3, KVM_REGS_F3($17) + vstd $f4, KVM_REGS_F4($17) + vstd $f5, KVM_REGS_F5($17) + vstd $f6, KVM_REGS_F6($17) + vstd $f7, KVM_REGS_F7($17) + vstd $f8, KVM_REGS_F8($17) + vstd $f9, KVM_REGS_F9($17) + vstd $f10, KVM_REGS_F10($17) + vstd $f11, KVM_REGS_F11($17) + vstd $f12, KVM_REGS_F12($17) + vstd $f13, KVM_REGS_F13($17) + vstd $f14, KVM_REGS_F14($17) + vstd $f15, KVM_REGS_F15($17) + vstd $f16, KVM_REGS_F16($17) + vstd $f17, KVM_REGS_F17($17) + vstd $f18, KVM_REGS_F18($17) + vstd $f19, KVM_REGS_F19($17) + vstd $f20, KVM_REGS_F20($17) + vstd $f21, KVM_REGS_F21($17) + vstd $f22, KVM_REGS_F22($17) + vstd $f23, KVM_REGS_F23($17) + vstd $f24, KVM_REGS_F24($17) + vstd $f25, KVM_REGS_F25($17) + vstd $f26, KVM_REGS_F26($17) + vstd $f27, KVM_REGS_F27($17) + vstd $f28, KVM_REGS_F28($17) + vstd $f29, KVM_REGS_F29($17) + vstd $f30, KVM_REGS_F30($17) + + rfpcr $f0 + fstd $f0, KVM_REGS_FPCR($17) + + /* don't save r0 Hmcode have saved r0 for us */ + stl $1, KVM_REGS_R1($17) + stl $2, KVM_REGS_R2($17) + stl $3, KVM_REGS_R3($17) + stl $4, KVM_REGS_R4($17) + stl $5, KVM_REGS_R5($17) + stl $6, KVM_REGS_R6($17) + stl $7, KVM_REGS_R7($17) + stl $8, KVM_REGS_R8($17) + stl $9, KVM_REGS_R9($17) + stl $10, KVM_REGS_R10($17) + stl $11, KVM_REGS_R11($17) + stl $12, KVM_REGS_R12($17) + stl $13, KVM_REGS_R13($17) + stl $14, KVM_REGS_R14($17) + stl $15, KVM_REGS_R15($17) + stl $19, KVM_REGS_R19($17) + stl $20, KVM_REGS_R20($17) + stl $21, KVM_REGS_R21($17) + stl $22, KVM_REGS_R22($17) + stl $23, KVM_REGS_R23($17) + stl $24, KVM_REGS_R24($17) + stl $25, KVM_REGS_R25($17) + stl $26, KVM_REGS_R26($17) + stl $27, KVM_REGS_R27($17) + stl $28, KVM_REGS_R28($17) + + /* restore host switch stack from host sp */ + ldl $9, SWITCH_STACK_R9(sp) + ldl $10, SWITCH_STACK_R10(sp) + ldl $11, SWITCH_STACK_R11(sp) + ldl $12, SWITCH_STACK_R12(sp) + ldl $13, SWITCH_STACK_R13(sp) + ldl $14, SWITCH_STACK_R14(sp) + ldl $15, SWITCH_STACK_R15(sp) + + ldi sp, SWITCH_STACK_SIZE(sp) + + /* restore host regs from host sp */ + ldl $8, PT_REGS_R8(sp) + ldl $26, PT_REGS_R26(sp) + + ldi sp, PT_REGS_SIZE(sp) + + /* restore host fpregs */ + ldl $1, TI_TASK($8) + ldi $1, TASK_THREAD($1) + fldd $f0, THREAD_FPCR($1) + wfpcr $f0 + fimovd $f0, $2 + and $2, 0x3, $2 + beq $2, $setfpec_0 + subl $2, 0x1, $2 + beq $2, $setfpec_1 + subl $2, 0x1, $2 + beq $2, $setfpec_2 + setfpec3 + br $setfpec_over +$setfpec_0: + setfpec0 + br $setfpec_over +$setfpec_1: + setfpec1 + br $setfpec_over +$setfpec_2: + setfpec2 +$setfpec_over: + ldi $1, THREAD_CTX_FP($1) + vldd $f2, CTX_FP_F2($1) + vldd $f3, CTX_FP_F3($1) + vldd $f4, CTX_FP_F4($1) + vldd $f5, CTX_FP_F5($1) + vldd $f6, CTX_FP_F6($1) + vldd $f7, CTX_FP_F7($1) + vldd $f8, CTX_FP_F8($1) + vldd $f9, CTX_FP_F9($1) + + /* if $0 > 0, handle hcall */ + bgt $0, $ret_to + + stl $26, VCPU_RET_RA(sp) + stl $0, VCPU_RET_R0(sp) + + /* Hmcode will setup in */ + /* restore $16 $17 $18, do interrupt trick */ + ldi 
sp, -(HOST_INT_SIZE + PT_REGS_SIZE + SWITCH_STACK_SIZE)(sp) + ldl $16, HOST_INT_R16(sp) + ldl $17, HOST_INT_R17(sp) + ldl $18, HOST_INT_R18(sp) + ldi sp, (HOST_INT_SIZE + PT_REGS_SIZE + SWITCH_STACK_SIZE)(sp) + + ldi $8, 0x3fff + bic sp, $8, $8 + ldi $19, -PT_REGS_SIZE(sp) + + ldi $26, ret_from_do_entInt_noregs + call $31, do_entInt + + /* ret($0) indicate hcall number */ +ret_from_do_entInt_noregs: + ldl $26, VCPU_RET_RA(sp) + ldl $0, VCPU_RET_R0(sp) + + /* restore r16 - r19 */ +$ret_to: + ldi sp, VCPU_RET_SIZE(sp) /* pop stack */ + ret diff --git a/arch/sw_64/kvm/handle_exit.c b/arch/sw_64/kvm/handle_exit.c new file mode 100644 index 000000000000..0d6806051fc7 --- /dev/null +++ b/arch/sw_64/kvm/handle_exit.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2018 - os kernal + * Author: fire3 fire3@example.com yangzh yangzh@gmail.com + * linhn linhn@example.com + */ +#include <asm/hmcall.h> +#include <asm/kvm_asm.h> +#include <asm/kvm_emulate.h> +#include <asm/kvm_timer.h> +#include <linux/kvm.h> + +int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, + int exception_index, struct hcall_args *hargs) +{ + switch (exception_index) { + case SW64_KVM_EXIT_IO: + return io_mem_abort(vcpu, run, hargs); + case SW64_KVM_EXIT_HALT: + vcpu->arch.halted = 1; + kvm_vcpu_block(vcpu); + return 1; + case SW64_KVM_EXIT_SHUTDOWN: + vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT; + vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SHUTDOWN; + return 0; + case SW64_KVM_EXIT_RESTART: + vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT; + vcpu->run->system_event.type = KVM_SYSTEM_EVENT_RESET; + return 0; + case SW64_KVM_EXIT_TIMER: + set_timer(vcpu, hargs->arg0); + return 1; + case SW64_KVM_EXIT_IPI: + vcpu_send_ipi(vcpu, hargs->arg0); + return 1; + case SW64_KVM_EXIT_FATAL_ERROR: + printk("Guest fatal error: Reason=[%lx], EXC_PC=[%lx], DVA=[%lx]", hargs->arg0, hargs->arg1, hargs->arg2); + vcpu->run->exit_reason = KVM_EXIT_UNKNOWN; + vcpu->run->hw.hardware_exit_reason = hargs->arg0; + return 0; + } + + return 1; +} diff --git a/arch/sw_64/kvm/irq.h b/arch/sw_64/kvm/irq.h new file mode 100644 index 000000000000..ee56d9b97632 --- /dev/null +++ b/arch/sw_64/kvm/irq.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * irq.h: in kernel interrupt controller related definitions + */ + +#ifndef __IRQ_H +#define __IRQ_H +static inline int irqchip_in_kernel(struct kvm *kvm) +{ + return 1; +} +#endif diff --git a/arch/sw_64/kvm/kvm-sw64.c b/arch/sw_64/kvm/kvm-sw64.c new file mode 100644 index 000000000000..1481c3dbb211 --- /dev/null +++ b/arch/sw_64/kvm/kvm-sw64.c @@ -0,0 +1,713 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2018 - os kernal + * Author: fire3 fire3@example.com yangzh yangzh@gmail.com + * linhn linhn@example.com + */ + +#include <linux/cpu.h> +#include <linux/errno.h> +#include <linux/err.h> +#include <linux/kvm_host.h> +#include <linux/module.h> +#include <linux/vmalloc.h> +#include <linux/fs.h> +#include <linux/mman.h> +#include <linux/sched/signal.h> +#include <linux/freezer.h> +#include <linux/smp.h> +#include <linux/kvm.h> +#include <linux/uaccess.h> +#include <linux/genalloc.h> +#include <asm/kvm_emulate.h> +#include <asm/kvm_asm.h> +#include <asm/sw64io.h> + +#include <asm/kvm_timer.h> +#include <asm/kvm_host.h> +#include <asm/kvm_emulate.h> + +#include <asm/page.h> +#include "../kernel/pci_impl.h" + +#include "vmem.c" + +bool set_msi_flag; +unsigned long sw64_kvm_last_vpn[NR_CPUS]; +#define cpu_last_vpn(cpuid) sw64_kvm_last_vpn[cpuid] + 
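[Editorial aside, an illustrative walk-through of the VPN versioning used below, not part of the patch; the constants are those defined next for CONFIG_SUBARCH_C3B. Each per-cpu counter packs the 8-bit hardware VPN in its low byte and a generation count above it, so __get_new_vpn_context() can retire an entire generation with a single tbia() flush:

	/* WIDTH_HARDWARE_VPN == 8, MAX_VPN == 255 */
	unsigned long vpn = 0x2ff;			/* generation 2, hardware VPN 0xff */
	unsigned long next = vpn + 1;
	if ((vpn & HARDWARE_VPN_MASK) >= MAX_VPN)	/* 0xff >= 255: VPNs exhausted */
		next = (vpn & ~HARDWARE_VPN_MASK)	/* keep generation bits: 0x200 */
			+ VPN_FIRST_VERSION + 1;	/* next generation, skip VPN 0: 0x301 */

A vcpu whose cached vpnc carries an old generation number can never collide with a live assignment, which is exactly the check sw64_kvm_switch_vpn() performs with (vpnc ^ vpn) & ~HARDWARE_VPN_MASK.]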
+#ifdef CONFIG_SUBARCH_C3B +#define MAX_VPN 255 +#define WIDTH_HARDWARE_VPN 8 +#endif + +#define VPN_FIRST_VERSION (1UL << WIDTH_HARDWARE_VPN) +#define HARDWARE_VPN_MASK ((1UL << WIDTH_HARDWARE_VPN) - 1) +#define VPN_SHIFT (64 - WIDTH_HARDWARE_VPN) + +int vcpu_interrupt_line(struct kvm_vcpu *vcpu, int number, bool level) +{ + set_bit(number, (vcpu->arch.irqs_pending)); + kvm_vcpu_kick(vcpu); + return 0; +} + +int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, struct kvm *kvm, int irq_source_id, + int level, bool line_status) +{ + int irq = e->msi.data & 0xff; + unsigned int vcpu_idx; + struct kvm_vcpu *vcpu = NULL; + + vcpu_idx = irq % atomic_read(&kvm->online_vcpus); + vcpu = kvm_get_vcpu(kvm, vcpu_idx); + + if (!vcpu) + return -EINVAL; + + return vcpu_interrupt_line(vcpu, irq, true); +} + +extern int __sw64_vcpu_run(struct vcpucb *vcb, struct kvm_regs *regs, struct hcall_args *args); + +static unsigned long get_vpcr(unsigned long machine_mem_offset, unsigned long memory_size, unsigned long vpn) +{ + return (machine_mem_offset >> 23) | ((memory_size >> 23) << 16) | ((vpn & HARDWARE_VPN_MASK) << 44); +} + +static unsigned long __get_new_vpn_context(struct kvm_vcpu *vcpu, long cpu) +{ + unsigned long vpn = cpu_last_vpn(cpu); + unsigned long next = vpn + 1; + + if ((vpn & HARDWARE_VPN_MASK) >= MAX_VPN) { + tbia(); + next = (vpn & ~HARDWARE_VPN_MASK) + VPN_FIRST_VERSION + 1; /* bypass 0 */ + } + cpu_last_vpn(cpu) = next; + return next; +} + +static void sw64_kvm_switch_vpn(struct kvm_vcpu *vcpu) +{ + unsigned long vpn; + unsigned long vpnc; + long cpu = smp_processor_id(); + + vpn = cpu_last_vpn(cpu); + vpnc = vcpu->arch.vpnc[cpu]; + + if ((vpnc ^ vpn) & ~HARDWARE_VPN_MASK) { + /* vpnc and cpu vpn not in the same version, get new vpnc and vpn */ + vpnc = __get_new_vpn_context(vcpu, cpu); + vcpu->arch.vpnc[cpu] = vpnc; + } + + vpn = vpnc & HARDWARE_VPN_MASK; + + /* Always update vpn */ + /* Just setup vcb, hardware CSR will be changed later in HMcode */ + vcpu->arch.vcb.vpcr = ((vcpu->arch.vcb.vpcr) & (~(HARDWARE_VPN_MASK << 44))) | (vpn << 44); + vcpu->arch.vcb.dtb_pcr = ((vcpu->arch.vcb.dtb_pcr) & (~(HARDWARE_VPN_MASK << VPN_SHIFT))) | (vpn << VPN_SHIFT); + + /* + * If vcpu migrate to a new physical cpu, the new physical cpu may keep + * old tlb entries for this vcpu's vpn, upn in the old tlb entries and + * current vcpu's upn may not in the same version. + * For now, we don't know the vcpu's upn version and the current version. + * If we keep track of the vcpu's upn version, the TLB-flush could be less. + * To be safe and correct, flush all tlb entries of current vpn for now. 
+ */ + + if (vcpu->arch.pcpu_id != cpu) { + tbivpn(0, 0, vpn); + vcpu->arch.pcpu_id = cpu; + vcpu->cpu = cpu; + } +} + +struct kvm_stats_debugfs_item debugfs_entries[] = { + { NULL } +}; + +int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu) +{ + return ((!bitmap_empty(vcpu->arch.irqs_pending, SWVM_IRQS) || !vcpu->arch.halted) + && !vcpu->arch.power_off); +} + +int kvm_arch_check_processor_compat(void *opaque) +{ + return 0; +} + +int kvm_arch_hardware_enable(void) +{ + return 0; +} + +void kvm_arch_hardware_unsetup(void) +{ +} + +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu) +{ + return false; +} + +bool kvm_arch_has_vcpu_debugfs(void) +{ + return false; +} + +int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu) +{ + return 0; +} + +int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu) +{ + return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE; +} + +void kvm_arch_commit_memory_region(struct kvm *kvm, + const struct kvm_userspace_memory_region *mem, + struct kvm_memory_slot *old, + const struct kvm_memory_slot *new, + enum kvm_mr_change change) +{ +} + +int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) +{ + int r = 0; + + switch (ext) { + case KVM_CAP_IRQCHIP: + case KVM_CAP_IOEVENTFD: + case KVM_CAP_SYNC_MMU: + r = 1; + break; + case KVM_CAP_NR_VCPUS: + case KVM_CAP_MAX_VCPUS: + r = KVM_MAX_VCPUS; + break; + default: + r = 0; + } + + return r; +} + +int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log) +{ + return 0; +} + +int kvm_sw64_pending_timer(struct kvm_vcpu *vcpu) +{ + return test_bit(SW64_KVM_IRQ_TIMER, &vcpu->arch.irqs_pending); +} + +int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu) +{ + return kvm_sw64_pending_timer(vcpu); +} + +int kvm_arch_hardware_setup(void *opaque) +{ + return 0; +} + +void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) +{ + hrtimer_cancel(&vcpu->arch.hrt); +} + +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) +{ + return 0; +} + +void kvm_arch_destroy_vm(struct kvm *kvm) +{ + int i; + + for (i = 0; i < KVM_MAX_VCPUS; ++i) { + if (kvm->vcpus[i]) { + kvm_vcpu_destroy(kvm->vcpus[i]); + kvm->vcpus[i] = NULL; + } + } + + atomic_set(&kvm->online_vcpus, 0); + +} + +long kvm_arch_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) +{ + return -EINVAL; +} + +int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, + unsigned long npages) +{ + return 0; +} + +int kvm_arch_prepare_memory_region(struct kvm *kvm, + struct kvm_memory_slot *memslot, + const struct kvm_userspace_memory_region *mem, + enum kvm_mr_change change) +{ + unsigned long addr; + struct file *vm_file; + struct vm_area_struct *vma; + struct vmem_info *info; + unsigned long ret; + size_t size; + + if (change == KVM_MR_FLAGS_ONLY) + return 0; + + if (test_bit(IO_MARK_BIT, &(mem->guest_phys_addr))) + return 0; + + if (test_bit(IO_MARK_BIT + 1, &(mem->guest_phys_addr))) + return 0; + + if (!sw64_kvm_pool) + return -ENOMEM; + + pr_info("%s: %#llx %#llx, user addr: %#llx\n", __func__, + mem->guest_phys_addr, mem->memory_size, mem->userspace_addr); + + vma = find_vma(current->mm, mem->userspace_addr); + if (!vma) + return -ENOMEM; + vm_file = vma->vm_file; + + if (!vm_file) { + info = kzalloc(sizeof(struct vmem_info), GFP_KERNEL); + + size = round_up(mem->memory_size, 8<<20); + addr = gen_pool_alloc(sw64_kvm_pool, size); + if (!addr) + return -ENOMEM; + vm_munmap(mem->userspace_addr, mem->memory_size); + ret = vm_mmap(vm_file, mem->userspace_addr, mem->memory_size, + PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_FIXED, 
0); + vma = find_vma(current->mm, mem->userspace_addr); + if (!vma) + return -ENOMEM; + + info->start = addr; + info->size = size; + vma->vm_private_data = (void *) info; + + vma->vm_ops = &vmem_vm_ops; + vma->vm_ops->open(vma); + + remap_pfn_range(vma, mem->userspace_addr, + addr >> PAGE_SHIFT, + mem->memory_size, vma->vm_page_prot); + + if ((long)ret < 0) + return ret; + } else { + info = vm_file->private_data; + addr = info->start; + } + + pr_info("guest phys addr = %#lx, size = %#lx\n", + addr, vma->vm_end - vma->vm_start); + kvm->arch.mem.membank[0].guest_phys_addr = 0; + kvm->arch.mem.membank[0].host_phys_addr = (u64)addr; + kvm->arch.mem.membank[0].size = round_up(mem->memory_size, 8<<20); + + memset((void *)(PAGE_OFFSET + addr), 0, 0x2000000); + + return 0; +} + +int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) +{ + /* Set up the timer for Guest */ + pr_info("vcpu: [%d], regs addr = %#lx, vcpucb = %#lx\n", vcpu->vcpu_id, + (unsigned long)&vcpu->arch.regs, (unsigned long)&vcpu->arch.vcb); + hrtimer_init(&vcpu->arch.hrt, CLOCK_REALTIME, HRTIMER_MODE_ABS); + vcpu->arch.hrt.function = clockdev_fn; + vcpu->arch.tsk = current; + + /* For guest kernel "sys_call HMC_whami", indicate virtual cpu id */ + vcpu->arch.vcb.whami = vcpu->vcpu_id; + vcpu->arch.vcb.vcpu_irq_disabled = 1; + vcpu->arch.vcb.pcbb = vcpu->kvm->arch.mem.membank[0].host_phys_addr; + vcpu->arch.pcpu_id = -1; /* force flush tlb for the first time */ + + return 0; +} + +int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu) +{ + unsigned long addr = vcpu->kvm->arch.mem.membank[0].host_phys_addr; + + vcpu->arch.vcb.whami = vcpu->vcpu_id; + vcpu->arch.vcb.vcpu_irq_disabled = 1; + vcpu->arch.vcb.pcbb = vcpu->kvm->arch.mem.membank[0].host_phys_addr; + vcpu->arch.pcpu_id = -1; /* force flush tlb for the first time */ + vcpu->arch.power_off = 0; + memset(&vcpu->arch.irqs_pending, 0, sizeof(vcpu->arch.irqs_pending)); + + if (vcpu->vcpu_id == 0) + memset((void *)(PAGE_OFFSET + addr), 0, 0x2000000); + + return 0; +} + +int kvm_set_routing_entry(struct kvm *kvm, + struct kvm_kernel_irq_routing_entry *e, + const struct kvm_irq_routing_entry *ue) +{ + int r = -EINVAL; + + switch (ue->type) { + case KVM_IRQ_ROUTING_MSI: + e->set = kvm_set_msi; + e->msi.address_lo = ue->u.msi.address_lo; + e->msi.address_hi = ue->u.msi.address_hi; + e->msi.data = ue->u.msi.data; + e->msi.flags = ue->flags; + e->msi.devid = ue->u.msi.devid; + set_msi_flag = true; + break; + default: + goto out; + } + r = 0; +out: + return r; +} + +int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, + struct kvm_translation *tr) +{ + return -EINVAL; /* not implemented yet */ +} + +int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) +{ + return 0; +} + +void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +{ + vcpu->cpu = cpu; +} + +void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) +{ + /* + * The arch-generic KVM code expects the cpu field of a vcpu to be -1 + * if the vcpu is no longer assigned to a cpu. This is used for the + * optimized make_all_cpus_request path. 
+ */ + vcpu->cpu = -1; +} + +int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, + struct kvm_mp_state *mp_state) +{ + return -ENOIOCTLCMD; +} + +int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, + struct kvm_mp_state *mp_state) +{ + return -ENOIOCTLCMD; +} + +int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + memcpy(&(vcpu->arch.regs), regs, sizeof(struct kvm_regs)); + return 0; +} + +int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) +{ + memcpy(regs, &(vcpu->arch.regs), sizeof(struct kvm_regs)); + return 0; +} + +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg) +{ + return -ENOIOCTLCMD; +} + +void _debug_printk_vcpu(struct kvm_vcpu *vcpu) +{ + unsigned long pc = vcpu->arch.regs.pc; + unsigned long offset = vcpu->kvm->arch.mem.membank[0].host_phys_addr; + unsigned long pc_phys = PAGE_OFFSET | ((pc & 0x7fffffffUL) + offset); + unsigned int insn; + int opc, ra, disp16; + + insn = *(unsigned int *)pc_phys; + + opc = (insn >> 26) & 0x3f; + ra = (insn >> 21) & 0x1f; + disp16 = insn & 0xffff; + + if (opc == 0x06 && disp16 == 0x1000) /* RD_F */ + pr_info("vcpu exit: pc = %#lx (%#lx), insn[%x] : rd_f r%d [%#lx]\n", + pc, pc_phys, insn, ra, vcpu_get_reg(vcpu, ra)); +} + +/* + * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on + * proper exit to userspace. + */ +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) +{ + int ret; + struct kvm_run *run = vcpu->run; + struct vcpucb *vcb = &(vcpu->arch.vcb); + struct hcall_args hargs; + int irq; + bool more; + sigset_t sigsaved; + + /* Set guest vcb */ + /* vpn will update later when vcpu is running */ + if (vcpu->arch.vcb.vpcr == 0) { + vcpu->arch.vcb.vpcr + = get_vpcr(vcpu->kvm->arch.mem.membank[0].host_phys_addr, vcpu->kvm->arch.mem.membank[0].size, 0); + vcpu->arch.vcb.upcr = 0x7; + } + + if (vcpu->sigset_active) + sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved); + + if (run->exit_reason == KVM_EXIT_MMIO) + kvm_handle_mmio_return(vcpu, run); + + run->exit_reason = KVM_EXIT_UNKNOWN; + ret = 1; + + while (ret > 0) { + /* Check conditions before entering the guest */ + cond_resched(); + + preempt_disable(); + local_irq_disable(); + + if (signal_pending(current)) { + ret = -EINTR; + run->exit_reason = KVM_EXIT_INTR; + } + + if (ret <= 0) { + local_irq_enable(); + preempt_enable(); + continue; + } + + memset(&hargs, 0, sizeof(hargs)); + + clear_vcpu_irq(vcpu); + irq = interrupt_pending(vcpu, &more); + if (irq < SWVM_IRQS) + try_deliver_interrupt(vcpu, irq, more); + + vcpu->arch.halted = 0; + + sw64_kvm_switch_vpn(vcpu); + guest_enter_irqoff(); + + /* Enter the guest */ + vcpu->mode = IN_GUEST_MODE; + + ret = __sw64_vcpu_run((struct vcpucb *)__phys_addr((unsigned long)vcb), &(vcpu->arch.regs), &hargs); + + /* Back from guest */ + vcpu->mode = OUTSIDE_GUEST_MODE; + + local_irq_enable(); + guest_exit_irqoff(); + preempt_enable(); + + /* ret = 0 indicate interrupt in guest mode, ret > 0 indicate hcall */ + ret = handle_exit(vcpu, run, ret, &hargs); + } + + if (vcpu->sigset_active) + sigprocmask(SIG_SETMASK, &sigsaved, NULL); + + return ret; +} + +long kvm_arch_vcpu_ioctl(struct file *filp, + unsigned int ioctl, unsigned long arg) +{ + struct kvm_vcpu *vcpu = filp->private_data; + struct vcpucb *kvm_vcb; + + switch (ioctl) { + case KVM_SW64_VCPU_INIT: + return kvm_arch_vcpu_reset(vcpu); + case KVM_SW64_GET_VCB: + if (copy_to_user((void __user *)arg, &(vcpu->arch.vcb), sizeof(struct vcpucb))) + return -EINVAL; + 
break; + case KVM_SW64_SET_VCB: + kvm_vcb = memdup_user((void __user *)arg, sizeof(*kvm_vcb)); + memcpy(&(vcpu->arch.vcb), kvm_vcb, sizeof(struct vcpucb)); + break; + default: + return -EINVAL; + } + return 0; +} + +long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) +{ + struct kvm *kvm __maybe_unused = filp->private_data; + long r; + + switch (ioctl) { + case KVM_CREATE_IRQCHIP: { + struct kvm_irq_routing_entry routing; + + r = -EINVAL; + memset(&routing, 0, sizeof(routing)); + r = kvm_set_irq_routing(kvm, &routing, 0, 0); + break; + } + default: + r = -ENOIOCTLCMD; + } + return r; +} + +int kvm_arch_init(void *opaque) +{ + return 0; +} + +void kvm_arch_exit(void) +{ +} + +void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot) +{ +} + +int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id) +{ + return 0; +} + +int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, + struct kvm_sregs *sregs) +{ + return -ENOIOCTLCMD; +} + +int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, + struct kvm_sregs *sregs) +{ + return -ENOIOCTLCMD; +} + +void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) +{ +} + +int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu) +{ + return -ENOIOCTLCMD; +} + +int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu) +{ + return -ENOIOCTLCMD; +} + +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + +int kvm_dev_ioctl_check_extension(long ext) +{ + int r; + + switch (ext) { + case KVM_CAP_IOEVENTFD: + r = 1; + break; + case KVM_CAP_NR_VCPUS: + case KVM_CAP_MAX_VCPUS: + r = KVM_MAX_VCPUS; + break; + default: + r = 0; + } + + return r; +} + +void vcpu_send_ipi(struct kvm_vcpu *vcpu, int target_vcpuid) +{ + struct kvm_vcpu *target_vcpu = kvm_get_vcpu(vcpu->kvm, target_vcpuid); + + if (target_vcpu != NULL) + vcpu_interrupt_line(target_vcpu, 1, 1); +} + +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level, + bool line_status) +{ + u32 irq = irq_level->irq; + unsigned int vcpu_idx, irq_num; + struct kvm_vcpu *vcpu = NULL; + bool level = irq_level->level; + + vcpu_idx = irq % atomic_read(&kvm->online_vcpus); + irq_num = irq; + + vcpu = kvm_get_vcpu(kvm, vcpu_idx); + if (!vcpu) + return -EINVAL; + + return vcpu_interrupt_line(vcpu, irq_num, level); +} + +static int __init kvm_sw64_init(void) +{ + int i, ret; + + ret = vmem_init(); + if (ret) + return ret; + + for (i = 0; i < NR_CPUS; i++) + sw64_kvm_last_vpn[i] = VPN_FIRST_VERSION; + + ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE); + if (ret) { + vmem_exit(); + return ret; + } + return 0; +} + +static void __exit kvm_sw64_exit(void) +{ + kvm_exit(); + vmem_exit(); +} + +module_init(kvm_sw64_init); +module_exit(kvm_sw64_exit); diff --git a/arch/sw_64/kvm/kvm_timer.c b/arch/sw_64/kvm/kvm_timer.c new file mode 100644 index 000000000000..fea819732af5 --- /dev/null +++ b/arch/sw_64/kvm/kvm_timer.c @@ -0,0 +1,78 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2018 - os kernal + * Author: fire3 fire3@example.com yangzh yangzh@gmail.com + */ +#include <linux/errno.h> +#include <linux/err.h> +#include <linux/kvm_host.h> +#include <asm/kvm_timer.h> + +/* + * The Guest Clock. + * + * There are two sources of virtual interrupts. We saw one in lguest_user.c: + * the Launcher sending interrupts for virtual devices. The other is the Guest + * timer interrupt. 
+ * + * The Guest uses the LHCALL_SET_CLOCKEVENT hypercall to tell us how long until + * the next timer interrupt (in nanoseconds). We use the high-resolution timer + * infrastructure to set a callback at that time. + * + * 0 means "turn off the clock". + */ + +void set_timer(struct kvm_vcpu *vcpu, unsigned long delta) +{ + ktime_t expires; + + if (unlikely(delta == 0)) { + /* Clock event device is shutting down. */ + hrtimer_cancel(&vcpu->arch.hrt); + return; + } + /* + * We use wallclock time here, so the Guest might not be running for + * all the time between now and the timer interrupt it asked for. This + * is almost always the right thing to do. + */ + + expires = ktime_add_ns(ktime_get_real(), delta); + vcpu->arch.timer_next_event = expires; + hrtimer_start(&vcpu->arch.hrt, expires, HRTIMER_MODE_ABS); +} + +/* And this is the routine when we want to set an interrupt for the Guest. */ +void set_interrupt(struct kvm_vcpu *vcpu, unsigned int irq) +{ + /* + * Next time the Guest runs, the core code will see if it can deliver + * this interrupt. + */ + set_bit(irq, (vcpu->arch.irqs_pending)); + + /* + * Make sure it sees it; it might be asleep (e.g. halted), or running + * the Guest right now, in which case kick_process() will knock it out. + */ + kvm_vcpu_kick(vcpu); +} + +enum hrtimer_restart clockdev_fn(struct hrtimer *timer) +{ + struct kvm_vcpu *vcpu; + ktime_t now, delta; + + vcpu = container_of(timer, struct kvm_vcpu, arch.hrt); + + now = ktime_get_real(); + + if (now < vcpu->arch.timer_next_event) { + delta = vcpu->arch.timer_next_event - now; + hrtimer_forward_now(timer, delta); + return HRTIMER_RESTART; + } + + set_interrupt(vcpu, SW64_KVM_IRQ_TIMER); + return HRTIMER_NORESTART; +} diff --git a/arch/sw_64/kvm/mmio.c b/arch/sw_64/kvm/mmio.c new file mode 100644 index 000000000000..340486e8e51b --- /dev/null +++ b/arch/sw_64/kvm/mmio.c @@ -0,0 +1,82 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2018 - os kernel + * Author: fire3 fire3@example.com yangzh yangzh@gmail.com + * linhn linhn@example.com + */ +#include <linux/kvm_host.h> +#include <asm/kvm_mmio.h> +#include <asm/kvm_emulate.h> + +static unsigned long mmio_read_buf(char *buf, unsigned int len) +{ + unsigned long data = 0; + union { + u16 hword; + u32 word; + u64 dword; + } tmp; + + switch (len) { + case 1: + data = buf[0]; + break; + case 2: + memcpy(&tmp.hword, buf, len); + data = tmp.hword; + break; + case 4: + memcpy(&tmp.word, buf, len); + data = tmp.word; + break; + case 8: + memcpy(&tmp.dword, buf, len); + data = tmp.dword; + break; + } + + return data; +} + +int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + unsigned long data; + unsigned int len; + + if (!run->mmio.is_write) { + len = run->mmio.len; + if (len > sizeof(unsigned long)) + return -EINVAL; + + data = mmio_read_buf(run->mmio.data, len); + vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data); + } + + vcpu->arch.regs.pc += 4; + + return 0; +} + +int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run, + struct hcall_args *hargs) +{ + int ret; + + run->mmio.phys_addr = hargs->arg1 & 0xfffffffffffffUL; + sw64_decode(vcpu, hargs->arg2, run); + if (run->mmio.is_write) + ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, run->mmio.phys_addr, + run->mmio.len, run->mmio.data); + else + ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, run->mmio.phys_addr, + run->mmio.len, run->mmio.data); + + if (!ret) { + /* We handled the access successfully in the kernel.
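+ * (kvm_io_bus_write()/kvm_io_bus_read() return 0 when an in-kernel + * user, such as an ioeventfd registered on KVM_MMIO_BUS, claimed the + * access; otherwise the exit is handed to userspace as KVM_EXIT_MMIO + * below.)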
*/ + kvm_handle_mmio_return(vcpu, run); + return 1; + } + + run->exit_reason = KVM_EXIT_MMIO; + return 0; +} diff --git a/arch/sw_64/kvm/vmem.c b/arch/sw_64/kvm/vmem.c new file mode 100644 index 000000000000..b8a585ec1ad1 --- /dev/null +++ b/arch/sw_64/kvm/vmem.c @@ -0,0 +1,158 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/mm.h> +#include <linux/slab.h> +#include <linux/genalloc.h> +#include <linux/miscdevice.h> +#include <linux/kvm_host.h> +#include <asm/kvm_host.h> + +static bool addr_in_pool(struct gen_pool *pool, + unsigned long start, size_t size) +{ + bool found = false; + unsigned long end = start + size - 1; + struct gen_pool_chunk *chunk; + + rcu_read_lock(); + list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk) { + if (start >= chunk->start_addr && start <= chunk->end_addr) { + if (end <= chunk->end_addr) { + found = true; + break; + } + } + } + rcu_read_unlock(); + return found; +} + +static void vmem_vm_open(struct vm_area_struct *vma) +{ + struct vmem_info *info = vma->vm_private_data; + + atomic_inc(&info->refcnt); +} + +static void vmem_vm_close(struct vm_area_struct *vma) +{ + unsigned long addr; + size_t size; + struct vmem_info *info; + + info = vma->vm_private_data; + addr = info->start; + size = info->size; + + if (atomic_dec_and_test(&info->refcnt)) { + if (sw64_kvm_pool && addr_in_pool(sw64_kvm_pool, addr, size)) { + pr_info("gen pool free addr: %#lx, size: %#lx\n", + addr, size); + gen_pool_free(sw64_kvm_pool, addr, size); + } + kfree(info); + } +} + +const struct vm_operations_struct vmem_vm_ops = { + .open = vmem_vm_open, + .close = vmem_vm_close, +}; +EXPORT_SYMBOL_GPL(vmem_vm_ops); + +static int vmem_open(struct inode *inode, struct file *flip) +{ + flip->private_data = NULL; + return 0; +} + +static loff_t vmem_llseek(struct file *filp, loff_t offset, int whence) +{ + loff_t newpos = 256UL << 30; + return newpos; +} + +static int vmem_release(struct inode *inode, struct file *flip) +{ + return 0; +} + +static int vmem_mmap(struct file *flip, struct vm_area_struct *vma) +{ + unsigned long addr; + struct vmem_info *info; + size_t size = vma->vm_end - vma->vm_start; + + if (!(vma->vm_flags & VM_SHARED)) { + pr_err("%s: mapping must be shared\n", __func__); + return -EINVAL; + } + + if (!sw64_kvm_pool) + return -ENOMEM; + + if (flip->private_data == NULL) { + addr = gen_pool_alloc(sw64_kvm_pool, size); + if (!addr) + return -ENOMEM; + + info = kzalloc(sizeof(struct vmem_info), GFP_KERNEL); + if (!info) { + gen_pool_free(sw64_kvm_pool, addr, size); + return -ENOMEM; + } + pr_info("guest phys addr=%#lx, size=%#lx\n", addr, size); + info->start = addr; + info->size = size; + flip->private_data = (void *)info; + } else { + info = flip->private_data; + addr = info->start; + } + + vma->vm_private_data = (void *)info; + vma->vm_ops = &vmem_vm_ops; + vma->vm_ops->open(vma); + + /* to do: reject sizes bigger than vm_mem_size */ + pr_info("sw64_vmem: vm_start=%#lx, size=%#lx\n", vma->vm_start, size); + + /* remap_pfn_range - remap kernel memory to userspace */ + if (remap_pfn_range(vma, vma->vm_start, addr >> PAGE_SHIFT, size, + vma->vm_page_prot)) + return -EAGAIN; + + return 0; +} + +static const struct file_operations vmem_fops = { + .owner = THIS_MODULE, + .open = vmem_open, + .llseek = vmem_llseek, + .release = vmem_release, + .mmap = vmem_mmap, +}; + +static struct miscdevice vmem_dev = { + .minor = MISC_DYNAMIC_MINOR, + .name = "sw64_vmem", + .fops = &vmem_fops, +}; + +static int __init vmem_init(void) +{ + int err; + + err = misc_register(&vmem_dev); + if (err != 0) { + 
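/* + * Typically this means the misc minor space is exhausted. Without + * /dev/sw64_vmem userspace cannot obtain guest memory; the intended + * use is, roughly: + * fd = open("/dev/sw64_vmem", O_RDWR); + * mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + * so treat registration failure as fatal for module init. + */ +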
pr_err("Could not register sw64_vmem device\n"); + return err; + } + return 0; +} + +static void vmem_exit(void) +{ + misc_deregister(&vmem_dev); +} diff --git a/arch/sw_64/lib/Kconfig b/arch/sw_64/lib/Kconfig new file mode 100644 index 000000000000..d1e9cdd3947a --- /dev/null +++ b/arch/sw_64/lib/Kconfig @@ -0,0 +1,40 @@ +# SPDX-License-Identifier: GPL-2.0 +menu "Library optimization options" + +config DEEP_CLEAR_PAGE + bool "Clear Page with SIMD optimization" + default y + help + This option enables the use of SIMD version of clear page routine. + Say N if you want to use the generic version. + +config DEEP_COPY_PAGE + bool "Copy Page with SIMD optimization" + default y + help + This option enables the use of SIMD version of copy page routine. + Say N if you want to use the generic version. + +config DEEP_COPY_USER + bool "Copy User with SIMD optimization" + default y + help + This option enables the use of SIMD version of copy user routine. + Say N if you want to use the generic version. + + +config DEEP_MEMCPY + bool "Memory Copy with SIMD optimization" + default y + help + This option enables the use of SIMD version of memory copy routine. + Say N if you want to use the generic version. + +config DEEP_MEMSET + bool "Memory Set with SIMD optimization" + default y + help + This option enables the use of SIMD version of memory set routine. + Say N if you want to use the generic version. + +endmenu diff --git a/arch/sw_64/lib/Makefile b/arch/sw_64/lib/Makefile new file mode 100644 index 000000000000..bb2e9b52fedc --- /dev/null +++ b/arch/sw_64/lib/Makefile @@ -0,0 +1,49 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for sw-specific library files.. +# + +asflags-y := $(KBUILD_CFLAGS) +ccflags-y := -Werror + +lib-y = __divlu.o __remlu.o __divwu.o __remwu.o \ + udelay.o \ + memmove.o \ + checksum.o \ + csum_partial_copy.o \ + clear_user.o \ + fpreg.o \ + strcpy.o \ + strncpy.o \ + fls.o \ + csum_ipv6_magic.o + +lib-clear_page-y := clear_page.o +lib-clear_page-$(CONFIG_DEEP_CLEAR_PAGE) := deep-clear_page.o + +lib-copy_page-y := copy_page.o +lib-copy_page-$(CONFIG_DEEP_COPY_PAGE) := deep-copy_page.o + +lib-copy_user-y := copy_user.o +lib-copy_user-$(CONFIG_DEEP_COPY_USER) := deep-copy_user.o + +lib-memcpy-y := memcpy.o +lib-memcpy-$(CONFIG_DEEP_MEMCPY) := deep-memcpy.o + +lib-memset-y := memset.o +lib-memset-$(CONFIG_DEEP_MEMSET) := deep-memset.o + +lib-y += $(lib-clear_page-y) $(lib-copy_page-y) $(lib-copy_user-y) $(lib-memcpy-y) $(lib-memset-y) + +obj-y = iomap.o +obj-y += iomap_copy.o + +# The division routines are built from single source, with different defines. +AFLAGS___divlu.o = -DDIV +AFLAGS___remlu.o = -DREM +AFLAGS___divwu.o = -DDIV -DINTSIZE +AFLAGS___remwu.o = -DREM -DINTSIZE + +$(addprefix $(obj)/,__divlu.o __remlu.o __divwu.o __remwu.o): \ + $(src)/divide.S FORCE + $(call if_changed_rule,as_o_S) diff --git a/arch/sw_64/lib/checksum.c b/arch/sw_64/lib/checksum.c new file mode 100644 index 000000000000..561bbac59f8d --- /dev/null +++ b/arch/sw_64/lib/checksum.c @@ -0,0 +1,183 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This file contains network checksum routines that are better done + * in an architecture-specific manner due to speed.. + * Comments in other versions indicate that the algorithms are from RFC1071 + */ +#include <linux/module.h> +#include <linux/string.h> +#include <asm/byteorder.h> + +static inline unsigned short from64to16(unsigned long x) +{ + /* Using extract instructions is a bit more efficient + * than the original shift/bitmask version. 
+ */ + + union { + unsigned long ul; + unsigned int ui[2]; + unsigned short us[4]; + } in_v, tmp_v, out_v; + + in_v.ul = x; + tmp_v.ul = (unsigned long) in_v.ui[0] + (unsigned long) in_v.ui[1]; + + /* Since the bits of tmp_v.us[3] are going to always be zero, + * we don't have to bother to add that in. + */ + out_v.ul = (unsigned long) tmp_v.us[0] + (unsigned long) tmp_v.us[1] + + (unsigned long) tmp_v.us[2]; + + /* Similarly, out_v.us[2] is always zero for the final add. */ + return out_v.us[0] + out_v.us[1]; +} + +/* + * computes the checksum of the TCP/UDP pseudo-header + * returns a 16-bit checksum, already complemented. + */ +__sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr, + __u32 len, __u8 proto, __wsum sum) +{ + return (__force __sum16)~from64to16( + (__force u64)saddr + (__force u64)daddr + + (__force u64)sum + ((len + proto) << 8)); +} +EXPORT_SYMBOL(csum_tcpudp_magic); + +__wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, + __u32 len, __u8 proto, __wsum sum) +{ + unsigned long result; + + result = (__force u64)saddr + (__force u64)daddr + + (__force u64)sum + ((len + proto) << 8); + + /* + * Fold down to 32-bits so we don't lose in the typedef-less + * network stack. + * + * 64 to 33 + */ + result = (result & 0xffffffff) + (result >> 32); + /* 33 to 32 */ + result = (result & 0xffffffff) + (result >> 32); + return (__force __wsum)result; +} +EXPORT_SYMBOL(csum_tcpudp_nofold); + +/* + * Do a 64-bit checksum on an arbitrary memory area.. + * + * This isn't a great routine, but it's not _horrible_ either. The + * inner loop could be unrolled a bit further, and there are better + * ways to do the carry, but this is reasonable. + */ +static inline unsigned long do_csum(const unsigned char *buff, int len) +{ + int odd, count; + unsigned long result = 0; + + if (len <= 0) + goto out; + odd = 1 & (unsigned long) buff; + if (odd) { + result = *buff << 8; + len--; + buff++; + } + count = len >> 1; /* nr of 16-bit words.. */ + if (count) { + if (2 & (unsigned long) buff) { + result += *(unsigned short *) buff; + count--; + len -= 2; + buff += 2; + } + count >>= 1; /* nr of 32-bit words.. */ + if (count) { + if (4 & (unsigned long) buff) { + result += *(unsigned int *) buff; + count--; + len -= 4; + buff += 4; + } + count >>= 1; /* nr of 64-bit words.. */ + if (count) { + unsigned long carry = 0; + + do { + unsigned long w = *(unsigned long *) buff; + + count--; + buff += 8; + result += carry; + result += w; + carry = (w > result); + } while (count); + result += carry; + result = (result & 0xffffffff) + (result >> 32); + } + if (len & 4) { + result += *(unsigned int *) buff; + buff += 4; + } + } + if (len & 2) { + result += *(unsigned short *) buff; + buff += 2; + } + } + if (len & 1) + result += *buff; + result = from64to16(result); + if (odd) + result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); +out: + return result; +} + +/* + * This is a version of ip_compute_csum() optimized for IP headers, + * which always checksum on 4 octet boundaries.
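+ * ihl is the header length in 32-bit words, so the common ihl == 5 + * case checksums 20 bytes; do_csum() produces the one's-complement + * sum and the ~ below turns it into the final header checksum.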
+ */ +__sum16 ip_fast_csum(const void *iph, unsigned int ihl) +{ + return (__force __sum16)~do_csum(iph, ihl*4); +} +EXPORT_SYMBOL(ip_fast_csum); + +/* + * computes the checksum of a memory block at buff, length len, + * and adds in "sum" (32-bit) + * + * returns a 32-bit number suitable for feeding into itself + * or csum_tcpudp_magic + * + * this function must be called with even lengths, except + * for the last fragment, which may be odd + * + * it's best to have buff aligned on a 32-bit boundary + */ +__wsum csum_partial(const void *buff, int len, __wsum sum) +{ + unsigned long result = do_csum(buff, len); + + /* add in old sum, and carry.. */ + result += (__force u32)sum; + /* 32+c bits -> 32 bits */ + result = (result & 0xffffffff) + (result >> 32); + return (__force __wsum)result; +} +EXPORT_SYMBOL(csum_partial); + +/* + * this routine is used for miscellaneous IP-like checksums, mainly + * in icmp.c + */ +__sum16 ip_compute_csum(const void *buff, int len) +{ + return (__force __sum16)~from64to16(do_csum(buff, len)); +} +EXPORT_SYMBOL(ip_compute_csum); diff --git a/arch/sw_64/lib/clear_page.S b/arch/sw_64/lib/clear_page.S new file mode 100644 index 000000000000..e1cc7cddfd2f --- /dev/null +++ b/arch/sw_64/lib/clear_page.S @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Zero an entire page. + */ +#include <asm/export.h> + .text + .align 4 + .global clear_page + .ent clear_page +clear_page: + .prologue 0 + + ldi $0, 64 + +/* Optimize by GUOY from SOC 2013-06-04 */ +1: + + stl_nc $31, 0x0($16) + stl_nc $31, 0x8($16) + stl_nc $31, 0x10($16) + stl_nc $31, 0x18($16) + + stl_nc $31, 0x20($16) + stl_nc $31, 0x28($16) + stl_nc $31, 0x30($16) + stl_nc $31, 0x38($16) + + stl_nc $31, 0x40($16) + stl_nc $31, 0x48($16) + stl_nc $31, 0x50($16) + stl_nc $31, 0x58($16) + + stl_nc $31, 0x60($16) + stl_nc $31, 0x68($16) + subl $0, 1, $0 + + stl_nc $31, 0x70($16) + stl_nc $31, 0x78($16) + addl $16, 128, $16 + bne $0, 1b + + memb + ret + + .end clear_page + EXPORT_SYMBOL(clear_page) diff --git a/arch/sw_64/lib/clear_user.S b/arch/sw_64/lib/clear_user.S new file mode 100644 index 000000000000..88d332032c9d --- /dev/null +++ b/arch/sw_64/lib/clear_user.S @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Contributed by Richard Henderson rth@tamu.edu + * + * Zero user space, handling exceptions as we go. + * + * We have to make sure that $0 is always up-to-date and contains the + * right "bytes left to zero" value (and that it is updated only _after_ + * a successful copy). There is also some rather minor exception setup + * stuff. + * + */ +#include <asm/export.h> +/* Allow an exception for an insn; exit if we get one. */ +#define EX(x,y...) 
\ + 99: x,##y; \ + .section __ex_table,"a"; \ + .long 99b - .; \ + ldi $31, $exception-99b($31); \ + .previous + + .set noat + .set noreorder + .align 4 + + .globl __clear_user + .ent __clear_user + .frame $30, 0, $26 + .prologue 0 + +$loop: + and $1, 3, $4 + beq $4, 1f + +0: EX(stl $31, 0($16)) + subl $0, 8, $0 + subl $4, 1, $4 + addl $16, 8, $16 + bne $4, 0b + unop + +1: bic $1, 3, $1 + beq $1, $tail + +2: EX(stl $31, 0($16)) + subl $0, 8, $0 + EX(stl $31, 8($16)) + subl $0, 8, $0 + EX(stl $31, 16($16)) + subl $0, 8, $0 + EX(stl $31, 24($16)) + subl $0, 8, $0 + subl $1, 4, $1 + addl $16, 32, $16 + bne $1, 2b + +$tail: + bne $2, 1f + ret $31, ($26), 1 + +1: + EX(stb $31, 0($16)) + addl $16, 1, $16 + subl $2, 1, $2 + bne $2, 1b + clr $0 + ret $31, ($26), 1 + +__clear_user: + and $17, $17, $0 + and $16, 7, $4 + beq $0, $zerolength + addl $0, $4, $1 + and $1, 7, $2 + srl $1, 3, $1 + beq $4, $loop + + subl $4, 8, $4 + addl $0, $4, $0 + beq $1, $oneword + +$head: + EX(stb $31, 0($16)) + addl $16, 1, $16 + addl $4, 1, $4 + bne $4, $head + subl $1, 1, $1 + br $loop + unop + +$oneword: + EX(stb $31, 0($16)) + addl $16, 1, $16 + addl $4, 1, $4 + bne $4, $oneword + clr $0 + +$zerolength: +$exception: + ret $31, ($26), 1 + .end __clear_user + EXPORT_SYMBOL(__clear_user) diff --git a/arch/sw_64/lib/copy_page.S b/arch/sw_64/lib/copy_page.S new file mode 100644 index 000000000000..898472c36c80 --- /dev/null +++ b/arch/sw_64/lib/copy_page.S @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw/lib/copy_page.S + * + * Copy an entire page. + */ +#include <asm/export.h> + + .text + .align 4 + .global copy_page + .ent copy_page +copy_page: + .prologue 0 + + ldi $18, 64 + +/* Optimize by GUOY from SOC 2013-06-04 */ +1: + ldl $0, 0($17) + ldl $1, 8($17) + ldl $2, 16($17) + ldl $3, 24($17) + + stl_nc $0, 0($16) + stl_nc $1, 8($16) + stl_nc $2, 16($16) + stl_nc $3, 24($16) + + ldl $4, 32($17) + ldl $5, 40($17) + ldl $6, 48($17) + ldl $7, 56($17) + + stl_nc $4, 32($16) + stl_nc $5, 40($16) + stl_nc $6, 48($16) + stl_nc $7, 56($16) + + ldl $0, 64($17) + ldl $1, 72($17) + ldl $2, 80($17) + ldl $3, 88($17) + + stl_nc $0, 64($16) + stl_nc $1, 72($16) + stl_nc $2, 80($16) + stl_nc $3, 88($16) + + ldl $4, 96($17) + ldl $5, 104($17) + ldl $6, 112($17) + ldl $7, 120($17) + + stl_nc $4, 96($16) + stl_nc $5, 104($16) + stl_nc $6, 112($16) + stl_nc $7, 120($16) + + ldwe $f31, 3 * 0x80($17) + subl $18, 1, $18 + addl $17, 128, $17 + + addl $16, 128, $16 + bne $18, 1b + + memb + ret + + .end copy_page + EXPORT_SYMBOL(copy_page) diff --git a/arch/sw_64/lib/copy_user.S b/arch/sw_64/lib/copy_user.S new file mode 100644 index 000000000000..2c3dd0b5656c --- /dev/null +++ b/arch/sw_64/lib/copy_user.S @@ -0,0 +1,106 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copy to/from user space, handling exceptions as we go.. This + * isn't exactly pretty. + * + * This is essentially the same as "memcpy()", but with a few twists. + * Notably, we have to make sure that $0 is always up-to-date and + * contains the right "bytes left to copy" value (and that it is updated + * only _after_ a successful copy). There is also some rather minor + * exception setup stuff.. + */ +#include <asm/export.h> +/* Allow an exception for an insn; exit if we get one. */ +#define EXI(x,y...) \ + 99: x,##y; \ + .section __ex_table, "a"; \ + .long 99b - .; \ + ldi $31, $exitin-99b($31); \ + .previous + +#define EXO(x,y...) 
\ + 99: x, ##y; \ + .section __ex_table, "a"; \ + .long 99b - .; \ + ldi $31, $exitout-99b($31); \ + .previous + + .set noat + .align 4 + .globl __copy_user + .ent __copy_user +__copy_user: + .prologue 0 + and $18, $18, $0 + and $16, 7, $3 + beq $0, $35 + beq $3, $36 + subl $3, 8, $3 + .align 4 +$37: + EXI(ldbu $1, 0($17)) + EXO(stb $1, 0($16)) + addl $3, 1, $3 + subl $0, 1, $0 + addl $16, 1, $16 + addl $17, 1, $17 + beq $0, $41 + bne $3, $37 +$36: + and $17, 7, $1 + bic $0, 7, $4 + beq $1, $43 + beq $4, $48 + EXI(ldl_u $3, 0($17)) + .align 4 +$50: + EXI(ldl_u $2, 8($17)) + subl $4, 8, $4 + extll $3, $17, $3 + exthl $2, $17, $1 + bis $3, $1, $1 + EXO(stl $1,0($16)) + addl $17, 8, $17 + subl $0, 8, $0 + addl $16, 8, $16 + bis $2, $2, $3 + bne $4, $50 +$48: + beq $0, $41 + .align 4 +$57: + EXI(ldbu $1, 0($17)) + EXO(stb $1, 0($16)) + subl $0, 1, $0 + addl $16, 1, $16 + addl $17, 1, $17 + bne $0, $57 + br $31, $41 + .align 4 +$43: + beq $4, $65 + .align 4 +$66: + EXI(ldl $1, 0($17)) + subl $4, 8, $4 + EXO(stl $1,0($16)) + addl $17, 8, $17 + subl $0, 8, $0 + addl $16, 8, $16 + bne $4, $66 +$65: + beq $0, $41 + EXI(ldbu $1, 0($17)) + EXO(stb $1, 0($16)) + addl $17, 1, $17 + addl $16, 1, $16 + subl $0, 1, $0 + br $31, $65 +$41: +$35: +$exitin: +$exitout: + ret $31, ($26), 1 + + .end __copy_user + EXPORT_SYMBOL(__copy_user) diff --git a/arch/sw_64/lib/csum_ipv6_magic.S b/arch/sw_64/lib/csum_ipv6_magic.S new file mode 100644 index 000000000000..755e1c13cb25 --- /dev/null +++ b/arch/sw_64/lib/csum_ipv6_magic.S @@ -0,0 +1,113 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Contributed by Richard Henderson rth@tamu.edu + * + * unsigned short csum_ipv6_magic(struct in6_addr *saddr, + * struct in6_addr *daddr, __u32 len, + * unsigned short proto, unsigned int csum); + * + * Misalignment handling (which costs 16 instructions / 8 cycles) + * added by Ivan Kokshaysky ink@jurassic.park.msu.ru + */ +#include <asm/export.h> + .globl csum_ipv6_magic + .align 4 + .ent csum_ipv6_magic + .frame $30, 0, $26, 0 +csum_ipv6_magic: + .prologue 0 + + ldl_u $0, 0($16) + zapnot $20, 15, $20 + exthl $18, 1, $4 + ldl_u $21, 7($16) + + extlb $18, 1, $5 + ldl_u $1, 8($16) + extlb $18, 2, $6 + ldl_u $22, 15($16) + + extlb $18, 3, $18 + ldl_u $2, 0($17) + sra $4, 32, $4 + ldl_u $23, 7($17) + + extll $0, $16, $0 + ldl_u $3, 8($17) + exthl $21, $16, $21 + ldl_u $24, 15($17) + + sll $5, 16, $5 + or $0, $21, $0 + extll $1, $16, $1 + addl $20, $0, $20 + + exthl $22, $16, $22 + cmpult $20, $0, $0 + sll $6, 8, $6 + or $1, $22, $1 + + extll $2, $17, $2 + or $4, $18, $18 + exthl $23, $17, $23 + or $5, $6, $5 + + extll $3, $17, $3 + or $2, $23, $2 + exthl $24, $17, $24 + or $18, $5, $18 + + exthh $19, 7, $7 + or $3, $24, $3 + extlb $19, 1, $19 + addl $20, $1, $20 + + or $19, $7, $19 + cmpult $20, $1, $1 + sll $19, 48, $19 + + sra $19, 32, $19 + addl $20, $2, $20 + cmpult $20, $2, $2 + addl $20, $3, $20 + + cmpult $20, $3, $3 + addl $20, $18, $20 + cmpult $20, $18, $18 + addl $20, $19, $20 + + cmpult $20, $19, $19 + addl $0, $1, $0 + addl $2, $3, $2 + addl $18, $19, $18 + + addl $0, $2, $0 + addl $20, $18, $20 + addl $0, $20, $0 + unop + + extlh $0, 2, $2 + zapnot $0, 3, $3 + extlh $0, 4, $1 + addl $2, $3, $3 + + extlh $0, 6, $0 + addl $3, $1, $3 + addl $0, $3, $0 + unop + + extlh $0, 2, $1 + zapnot $0, 3, $0 + addl $0, $1, $0 + unop + + extlh $0, 2, $1 + zapnot $0, 3, $0 + addl $0, $1, $0 + not $0, $0 + + zapnot $0, 3, $0 + ret + + .end csum_ipv6_magic + EXPORT_SYMBOL(csum_ipv6_magic) diff --git a/arch/sw_64/lib/csum_partial_copy.c 
b/arch/sw_64/lib/csum_partial_copy.c new file mode 100644 index 000000000000..678d9aa78d15 --- /dev/null +++ b/arch/sw_64/lib/csum_partial_copy.c @@ -0,0 +1,373 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * csum_partial_copy - do IP checksumming and copy + * + * (C) Copyright 1996 Linus Torvalds + * + * Don't look at this too closely - you'll go mad. The things + * we do for performance.. + */ + +#include <linux/types.h> +#include <linux/string.h> +#include <linux/uaccess.h> + + +#define ldl_u(x, y) \ + __asm__ __volatile__("ldl_u %0, %1":"=r" (x):"m" (*(const unsigned long *)(y))) + +#define stl_u(x, y) \ + __asm__ __volatile__("stl_u %1, %0":"=m" (*(unsigned long *)(y)):"r" (x)) + +static inline void stll_u(unsigned long data, unsigned long *dst) +{ + int i = 0; + unsigned long doff = (unsigned long)dst & 7; + + for (; doff < 8; i++, doff++) + *((char *)dst + i) = *((char *)&data + i); +} + +static inline void sthl_u(unsigned long data, unsigned long *dst) +{ + int i = 0; + unsigned long doff = (unsigned long)dst & 7; + + for (; i < doff; i++) + *((char *)dst + 8 - doff + i) = *((char *)&data + 8 - doff + i); +} + +#define extll(x, y, z) \ + __asm__ __volatile__("extll %1, %2, %0":"=r" (z):"r" (x), "r" (y)) + +#define exthl(x, y, z) \ + __asm__ __volatile__("exthl %1, %2, %0":"=r" (z):"r" (x), "r" (y)) + +#define maskll(x, y, z) \ + __asm__ __volatile__("maskll %1, %2, %0":"=r" (z):"r" (x), "r" (y)) + +#define maskhl(x, y, z) \ + __asm__ __volatile__("maskhl %1, %2, %0":"=r" (z):"r" (x), "r" (y)) + +#define insll(x, y, z) \ + __asm__ __volatile__("insll %1, %2, %0":"=r" (z):"r" (x), "r" (y)) + +#define inshl(x, y, z) \ + __asm__ __volatile__("inshl %1, %2, %0":"=r" (z):"r" (x), "r" (y)) + + +#define __get_word(insn, x, ptr) \ +({ \ + long __guu_err; \ + __asm__ __volatile__( \ + "1: "#insn" %0,%2\n" \ + "2:\n" \ + ".section __ex_table,\"a\"\n" \ + " .long 1b - .\n" \ + " ldi %0,2b-1b(%1)\n" \ + ".previous" \ + : "=r"(x), "=r"(__guu_err) \ + : "m"(__m(ptr)), "1"(0)); \ + __guu_err; \ +}) + +static inline unsigned short from64to16(unsigned long x) +{ + /* Using extract instructions is a bit more efficient + * than the original shift/bitmask version. + */ + + union { + unsigned long ul; + unsigned int ui[2]; + unsigned short us[4]; + } in_v, tmp_v, out_v; + + in_v.ul = x; + tmp_v.ul = (unsigned long) in_v.ui[0] + (unsigned long) in_v.ui[1]; + + /* Since the bits of tmp_v.us[3] are going to always be zero, + * we don't have to bother to add that in. + */ + out_v.ul = (unsigned long) tmp_v.us[0] + (unsigned long) tmp_v.us[1] + + (unsigned long) tmp_v.us[2]; + + /* Similarly, out_v.us[2] is always zero for the final add. */ + return out_v.us[0] + out_v.us[1]; +} + +/* + * Ok. This isn't fun, but this is the EASY case. + */ +static inline unsigned long +csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst, + long len) +{ + unsigned long checksum = ~0U; + unsigned long carry = 0; + + while (len >= 0) { + unsigned long word; + + if (__get_word(ldl, word, src)) + return 0; + checksum += carry; + src++; + checksum += word; + len -= 8; + carry = checksum < word; + *dst = word; + dst++; + } + len += 8; + checksum += carry; + if (len) { + int i = 0; + unsigned long word; + + if (__get_word(ldl, word, src)) + return 0; + maskll(word, len, word); + checksum += word; + carry = checksum < word; + for (; i < len; i++) + *((char *)dst + i) = *((char *)&word + i); + checksum += carry; + } + return checksum; +} + +/* + * This is even less fun, but this is still reasonably + * easy.
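+ * (Destination aligned, source misaligned: each output quadword is + * built from two adjacent source quads, extll taking the low part at + * byte offset soff and exthl the matching high part, roughly + * word = (src[i] >> 8*soff) | (src[i+1] << 8*(8 - soff)); + * with the checksum accumulated exactly as in the aligned case.)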
+ */ +static inline unsigned long +csum_partial_cfu_dest_aligned(const unsigned long __user *src, + unsigned long *dst, unsigned long soff, long len) +{ + unsigned long first; + unsigned long word, carry; + unsigned long lastsrc = 7+len+(unsigned long)src; + unsigned long checksum = ~0U; + + if (__get_word(ldl_u, first, src)) + return 0; + carry = 0; + while (len >= 0) { + unsigned long second; + + if (__get_word(ldl_u, second, src+1)) + return 0; + extll(first, soff, word); + len -= 8; + src++; + exthl(second, soff, first); + checksum += carry; + word |= first; + first = second; + checksum += word; + *dst = word; + dst++; + carry = checksum < word; + } + len += 8; + checksum += carry; + if (len) { + int i = 0; + unsigned long second; + + if (__get_word(ldl_u, second, lastsrc)) + return 0; + extll(first, soff, word); + exthl(second, soff, first); + word |= first; + maskll(word, len, word); + checksum += word; + carry = checksum < word; + for (; i < len; i++) + *((char *)dst + i) = *((char *)&word + i); + checksum += carry; + } + return checksum; +} + +/* + * This is slightly less fun than the above.. + */ +static inline unsigned long +csum_partial_cfu_src_aligned(const unsigned long __user *src, + unsigned long *dst, unsigned long doff, + long len, unsigned long partial_dest) +{ + unsigned long carry = 0; + unsigned long word; + unsigned long second_dest; + int i; + unsigned long checksum = ~0U; + + if (len >= 0) { + if (__get_word(ldl, word, src)) + return 0; + checksum += carry; + checksum += word; + carry = checksum < word; + stll_u(word, dst); + len -= 8; + src++; + dst++; + + inshl(word, doff, partial_dest); + while (len >= 0) { + if (__get_word(ldl, word, src)) + return 0; + len -= 8; + insll(word, doff, second_dest); + checksum += carry; + stl_u(partial_dest | second_dest, dst); + src++; + checksum += word; + inshl(word, doff, partial_dest); + carry = checksum < word; + dst++; + } + sthl_u(word, dst - 1); + } + len += 8; + + if (__get_word(ldl, word, src)) + return 0; + maskll(word, len, word); + checksum += carry; + checksum += word; + carry = checksum < word; + for (i = 0; i < len; i++) + *((char *)dst + i) = *((char *)&word + i); + + checksum += carry; + return checksum; +} + +/* + * This is so totally un-fun that it's frightening. Don't + * look at this too closely, you'll go blind. 
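+ * (Source and destination are both misaligned, by different amounts: + * the source quads are merged with extll/exthl as above, then re-split + * with insll/inshl so the stores stay aligned on the destination side, + * while the ragged first and last destination bytes are patched in by + * hand through stll_u()/sthl_u().)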
+ */ +static inline unsigned long +csum_partial_cfu_unaligned(const unsigned long __user *src, + unsigned long *dst, unsigned long soff, unsigned long doff, + long len, unsigned long partial_dest) +{ + unsigned long carry = 0; + unsigned long first; + unsigned long second, word; + unsigned long second_dest; + int i; + unsigned long checksum = ~0U; + + if (__get_word(ldl_u, first, src)) + return 0; + if (len >= 0) { + extll(first, soff, word); + if (__get_word(ldl_u, second, src+1)) + return 0; + exthl(second, soff, first); + word |= first; + checksum += carry; + checksum += word; + carry = checksum < word; + stll_u(word, dst); + sthl_u(word, dst); + len -= 8; + src++; + dst++; + + if (__get_word(ldl_u, first, src)) + return 0; + ldl_u(partial_dest, dst); + maskll(partial_dest, doff, partial_dest); + while (len >= 0) { + if (__get_word(ldl_u, second, src+1)) + return 0; + extll(first, soff, word); + checksum += carry; + len -= 8; + exthl(second, soff, first); + src++; + word |= first; + first = second; + insll(word, doff, second_dest); + checksum += word; + stl_u(partial_dest | second_dest, dst); + carry = checksum < word; + inshl(word, doff, partial_dest); + dst++; + } + sthl_u(word, dst - 1); + } + len += 8; + + checksum += carry; + if (__get_word(ldl_u, second, src+1)) + return 0; + extll(first, soff, word); + exthl(second, soff, first); + word |= first; + maskll(word, len, word); + checksum += word; + carry = checksum < word; + for (i = 0; i < len; i++) + *((char *)dst + i) = *((char *)&word + i); + + checksum += carry; + return checksum; +} + +static __wsum __csum_and_copy(const void __user *src, void *dst, int len) +{ + unsigned long checksum; + unsigned long soff = 7 & (unsigned long) src; + unsigned long doff = 7 & (unsigned long) dst; + + if (!doff) { + if (!soff) + checksum = csum_partial_cfu_aligned( + (const unsigned long __user *) src, + (unsigned long *) dst, len-8); + else + checksum = csum_partial_cfu_dest_aligned( + (const unsigned long __user *) src, + (unsigned long *) dst, + soff, len-8); + } else { + unsigned long partial_dest; + + ldl_u(partial_dest, dst); + if (!soff) + checksum = csum_partial_cfu_src_aligned( + (const unsigned long __user *) src, + (unsigned long *) dst, + doff, len-8, partial_dest); + else + checksum = csum_partial_cfu_unaligned( + (const unsigned long __user *) src, + (unsigned long *) dst, + soff, doff, len-8, partial_dest); + } + return (__force __wsum)from64to16(checksum); +} + +__wsum +csum_and_copy_from_user(const void __user *src, void *dst, int len) +{ + if (!access_ok(src, len)) + return 0; + return __csum_and_copy(src, dst, len); +} +EXPORT_SYMBOL(csum_and_copy_from_user); + +__wsum +csum_partial_copy_nocheck(const void *src, void *dst, int len) +{ + return __csum_and_copy((__force const void __user *)src, + dst, len); +} +EXPORT_SYMBOL(csum_partial_copy_nocheck); diff --git a/arch/sw_64/lib/deep-clear_page.S b/arch/sw_64/lib/deep-clear_page.S new file mode 100644 index 000000000000..52a3db33fc17 --- /dev/null +++ b/arch/sw_64/lib/deep-clear_page.S @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Zero an entire page. 
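+ * + * The SIMD loop below writes 32 bytes per vstd_nc and 128 bytes per + * trip; with the counter preset to 64, one call clears 64 * 128 = 8192 + * bytes, i.e. the 8 KB page, using non-cacheable stores (hence the + * closing memb barrier).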
+ */ +#include <asm/export.h> + .text + .align 4 + .global clear_page + .ent clear_page +clear_page: + .prologue 0 + + ldi $0,64 + +/* Optimize by GUOY from SOC 2013-06-04 */ +1: + +/* + stl_nc $31,0x0($16) + stl_nc $31,0x8($16) + stl_nc $31,0x10($16) + stl_nc $31,0x18($16) + + stl_nc $31,0x20($16) + stl_nc $31,0x28($16) + stl_nc $31,0x30($16) + stl_nc $31,0x38($16) + + stl_nc $31,0x40($16) + stl_nc $31,0x48($16) + stl_nc $31,0x50($16) + stl_nc $31,0x58($16) + + stl_nc $31,0x60($16) + stl_nc $31,0x68($16) + stl_nc $31,0x70($16) + stl_nc $31,0x78($16) +*/ + + vstd_nc $f31, 0x0($16) + vstd_nc $f31, 0x20($16) + subl $0, 1, $0 + vstd_nc $f31, 0x40($16) + + vstd_nc $f31, 0x60($16) + addl $16, 128, $16 + bne $0, 1b + + memb + ret + + .end clear_page + EXPORT_SYMBOL(clear_page) diff --git a/arch/sw_64/lib/deep-copy_page.S b/arch/sw_64/lib/deep-copy_page.S new file mode 100644 index 000000000000..5061837e537b --- /dev/null +++ b/arch/sw_64/lib/deep-copy_page.S @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/sw/lib/copy_page.S + * + * Copy an entire page. + */ +#include <asm/export.h> + + .text + .align 4 + .global copy_page + .ent copy_page +copy_page: + .prologue 0 + + ldi $18, 64 + subl $sp, 0x60, $sp + ldi $4, 0x40($sp) + stl $4, 0($sp) + bic $4, 0x1f, $4 + vstd $f16, 0($4) + +/* Optimize by GUOY from SOC 2013-06-04 */ +1: + vldd $f16, 0($17) + vstd_nc $f16, 0($16) + + vldd $f16, 32($17) + vstd_nc $f16, 32($16) + + vldd $f16, 64($17) + vstd_nc $f16, 64($16) + + vldd $f16, 96($17) + vstd_nc $f16, 96($16) + + ldwe $f31, 5*0x80($17) + subl $18, 1, $18 + addl $17, 128, $17 + + addl $16, 128, $16 + bne $18, 1b + + memb + ldl $4, 0($sp) + ldi $4, 0x40($sp) + bic $4, 0x1f, $4 + vldd $f16, 0($4) + addl $sp, 0x60, $sp + ret + + .end copy_page + EXPORT_SYMBOL(copy_page) diff --git a/arch/sw_64/lib/deep-copy_user.S b/arch/sw_64/lib/deep-copy_user.S new file mode 100644 index 000000000000..631246c68bab --- /dev/null +++ b/arch/sw_64/lib/deep-copy_user.S @@ -0,0 +1,342 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copy to/from user space, handling exceptions as we go.. This + * isn't exactly pretty. + * + * This is essentially the same as "memcpy()", but with a few twists. + * Notably, we have to make sure that $18 is always up-to-date and + * contains the right "bytes left to copy" value (and that it is updated + * only _after_ a successful copy). There is also some rather minor + * exception setup stuff.. + * + * Inputs: + * length in $18 + * destination address in $16 + * source address in $17 + * return address in $26 + * + * Outputs: + * bytes left to copy in $0 + * + * Clobbers: + * $1,$2,$3,$4,$5,$16,$17 + * + */ + +/* Author: Copy_user simd version 1.1 (20190904) by Gao Xiuwu. +*/ +#include <asm/export.h> + +/* Allow an exception for an insn; exit if we get one. */ +#define EXI(x, y...) \ + 99: x, ##y; \ + .section __ex_table, "a"; \ + .long 99b - .; \ + ldi $31, $exitin-99b($31); \ + .previous + +#define EXO(x,y...) 
\ + 99: x, ##y; \ + .section __ex_table, "a"; \ + .long 99b - .; \ + ldi $31, $exitout-99b($31); \ + .previous + + .set noat + .align 4 + .globl __copy_user + .ent __copy_user + +__copy_user: + .prologue 0 + subl $18, 32, $1 + beq $18, $zerolength + + and $16, 7, $3 + ble $1, $onebyteloop + beq $3, $destaligned + subl $3, 8, $3 +/* + * The fetcher stall also hides the 1 cycle cross-cluster stall for $3 (L --> U) + * This loop aligns the destination a byte at a time + * We know we have at least one trip through this loop + */ +$aligndest: + EXI(ldbu $1, 0($17)) + addl $16, 1, $16 + addl $3, 1, $3 + +/* + * the -1 is to compensate for the inc($16) done in a previous quadpack + * which allows us zero dependencies within either quadpack in the loop + */ + EXO(stb $1, -1($16)) + addl $17, 1, $17 + subl $18, 1, $18 + bne $3, $aligndest + +/* + * If we fell through into here, we have a minimum of 33 - 7 bytes + * If we arrived via branch, we have a minimum of 32 bytes + */ +$destaligned: + and $17, 7, $1 + bic $18, 7, $4 + #EXI(ldl_u $3, 0($17)) + beq $1, $quadaligned + +#ifndef MISQUAD_SCALAR +$misquad: + and $16, 31, $1 + beq $1, $dest32Baligned + +$align_32B: + EXI(ldbu $1, 0($17)) + addl $17, 1, $17 + EXO(stb $1, 0($16)) + subl $18, 1, $18 + addl $16, 1, $16 + and $16, 31, $1 + beq $18, $exitout + bne $1, $align_32B + +$dest32Baligned: + ldi $2, 256($31) + andnot $17, 31, $3 + EXI(vldd $f10, 0($3)) + and $17, 31, $5 + sll $5, 3, $5 + subw $2, $5, $4 + ifmovs $5, $f15 + ifmovs $4, $f14 + + cmple $18, 63, $1 + bne $1, $misalign_tail_simd + +$misalign_body_simd: + EXI(vldd $f11, 32($3)) + fillcs 128*5($3) + + srlow $f10, $f15, $f12 + sllow $f11, $f14, $f13 + #fillde 128*5($16) + vlogfc $f12, $f13, $f31, $f12 + + EXI(vldd $f10, 64($3)) + srlow $f11, $f15, $f22 + sllow $f10, $f14, $f23 + vlogfc $f22, $f23, $f31, $f22 + + EXO(vstd $f12, 0($16)) + EXO(vstd $f22, 32($16)) + + addl $16, 64, $16 + addl $3, 64, $3 + subl $18, 64, $18 + + cmple $18, 63, $1 + beq $1, $misalign_body_simd + br $misalign_tail_simd + +$misalign_tail_simd: + cmple $18, 31, $1 + bne $1, $before_misalign_tail_quads + + EXI(vldd $f11, 32($3)) + srlow $f10, $f15, $f12 + sllow $f11, $f14, $f13 + vlogfc $f12, $f13, $f31, $f12 + + EXO(vstd $f12, 0($16)) + + subl $18, 32, $18 + addl $16, 32, $16 + addl $3, 32, $3 + vfmov $f11, $f10 + +$before_misalign_tail_quads: + srlow $f10, $f15, $f12 + s8subl $18, $4, $1 + ble $1, $tail_quads + + EXI(vldd $f11, 32($3)) + sllow $f11, $f14, $f13 + vlogfc $f12, $f13, $f31, $f12 + +$tail_quads: + subl $18, 8, $1 + blt $1, $less_than_8 + +$move_a_quad: + fimovd $f12, $1 + srlow $f12, 64, $f12 + + EXO(stl $1, 0($16)) + subl $18, 8, $18 + addl $16, 8, $16 + subl $18, 8, $1 + bge $1, $move_a_quad + +$less_than_8: + .align 4 + beq $18, $exitout + fimovd $f12, $1 + +$tail_bytes: + EXO(stb $1, 0($16)) + subl $18, 1, $18 + srl $1, 8, $1 + addl $16, 1, $16 + bgt $18, $tail_bytes + br $exitout +#else + +/* + * In the worst case, we've just executed an ldl_u here from 0($17) + * and we'll repeat it once if we take the branch + */ + +/* Misaligned quadword loop - not unrolled. Leave it that way. 
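(each trip merges two + * source quads, in C terms roughly + * out = (cur >> 8*soff) | (next << 8*(8 - soff)); + * with $18 decremented only after the stl has succeeded.)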
*/ +$misquad: + EXI(ldl_u $2, 8($17)) + subl $4, 8, $4 + extll $3, $17, $3 + exthl $2, $17, $1 + + bis $3, $1, $1 + EXO(stl $1, 0($16)) + addl $17, 8, $17 + subl $18, 8, $18 + + addl $16, 8, $16 + bis $2, $2, $3 + bne $4, $misquad + + beq $18, $zerolength + +/* We know we have at least one trip through the byte loop */ + EXI(ldbu $2, 0($17)) + addl $16, 1, $16 + br $31, $dirtyentry +#endif +/* Do the trailing byte loop load, then hop into the store part of the loop */ + +/* + * A minimum of (33 - 7) bytes to do a quad at a time. + * Based upon the usage context, it's worth the effort to unroll this loop + * $18 - number of bytes to be moved + * $4 - number of bytes to move as quadwords + * $16 is current destination address + * $17 is current source address + */ + +$quadaligned: + and $16, 31, $1 + beq $1, $quadaligned_dest32Baligned + +$quadaligned_align_32B: + EXI(ldl $1, 0($17)) + addl $17, 8, $17 + EXO(stl $1, 0($16)) + subl $18, 8, $18 + subl $4, 8, $4 + addl $16, 8, $16 + and $16, 31, $1 + beq $4, $onebyteloop + bne $1, $quadaligned_align_32B + +$quadaligned_dest32Baligned: + and $17, 31, $2 + bne $2, $dest32Baligned + +$quad32Bailgned: + subl $4, 64, $2 + blt $2, $onequad + +/* + * There is a significant assumption here that the source and destination + * addresses differ by more than 32 bytes. In this particular case, a + * sparsity of registers further bounds this to be a minimum of 8 bytes. + * But if this isn't met, then the output result will be incorrect. + * Furthermore, due to a lack of available registers, we really can't + * unroll this to be an 8x loop (which would enable us to use the wh64 + * instruction memory hint instruction). + */ + +$simd_quadalign_unroll2: + fillcs 128 * 5($17) + EXI(vldd $f22, 0($17)) + EXI(vldd $f23, 32($17)) + EXO(vstd $f22, 0($16)) + EXO(vstd $f23, 32($16)) + #fillde 128 * 5($16) + subl $4, 64, $4 + subl $18, 64, $18 + addl $17, 64, $17 + addl $16, 64, $16 + subl $4, 64, $3 + bge $3, $simd_quadalign_unroll2 + bne $4, $onequad + br $31, $noquads + +$onequad: + EXI(ldl $1, 0($17)) + subl $4, 8, $4 + addl $17, 8, $17 + + EXO(stl $1, 0($16)) + subl $18, 8, $18 + addl $16, 8, $16 + bne $4, $onequad + +$noquads: + beq $18, $zerolength + +/* + * For small copies (or the tail of a larger copy), do a very simple byte loop. + * There's no point in doing a lot of complex alignment calculations to try to + * to quadword stuff for a small amount of data. + * $18 - remaining number of bytes left to copy + * $16 - current dest addr + * $17 - current source addr + */ + +$onebyteloop: + EXI(ldbu $2, 0($17)) + addl $16, 1, $16 + +$dirtyentry: +/* + * the -1 is to compensate for the inc($16) done in a previous quadpack + * which allows us zero dependencies within either quadpack in the loop + */ + EXO(stb $2, -1($16)) + addl $17, 1, $17 + subl $18, 1, $18 + bgt $18, $onebyteloop + +$zerolength: +$exitout: + bis $31, $18, $0 + ret $31, ($26), 1 + +$exitin: + + /* A stupid byte-by-byte zeroing of the rest of the output + * buffer. This cures security holes by never leaving + * random kernel data around to be copied elsewhere. 
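+ * ($18 still holds the not-yet-copied byte count when the fault hits, + * so the loop below zeroes exactly that many destination bytes and + * the same count is returned to the caller in $0.)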
+ */ + + mov $18, $1 + +$101: + EXO(stb $31, 0($16)) + subl $1, 1, $1 + addl $16, 1, $16 + bgt $1, $101 + + bis $31, $18, $0 + ret $31, ($26), 1 + + .end __copy_user + EXPORT_SYMBOL(__copy_user) diff --git a/arch/sw_64/lib/deep-memcpy.S b/arch/sw_64/lib/deep-memcpy.S new file mode 100644 index 000000000000..e847ec3d08df --- /dev/null +++ b/arch/sw_64/lib/deep-memcpy.S @@ -0,0 +1,240 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include <asm/export.h> + .set noreorder + .set noat + + .align 4 + .globl memcpy + .ent memcpy + +memcpy: + .frame $30, 0, $26, 0 + .prologue 0 + + subl $sp, 0xa0, $sp + ldi $4, 0x40($sp) + stl $4, 0($sp) + bic $4, 0x1f, $4 + vstd $f4, 0($4) + vstd $f5, 0x20($4) + + mov $16, $0 + ble $18, $nomoredata + xor $16, $17, $1 + and $1, 7, $1 + + bne $1, $misaligned + + and $16, 7, $1 + beq $1, $both_0mod8 + +$head_align: + ldbu $1, 0($17) + subl $18, 1, $18 + addl $17, 1, $17 + stb $1, 0($16) + addl $16, 1, $16 + and $16, 7, $1 + ble $18, $nomoredata + bne $1, $head_align + +$both_0mod8: + cmple $18, 127, $1 + bne $1, $no_unroll + and $16, 63, $1 + beq $1, $do_unroll + +$single_head_quad: + ldl $1, 0($17) + subl $18, 8, $18 + addl $17, 8, $17 + + stl $1, 0($16) + addl $16, 8, $16 + and $16, 63, $1 + bne $1, $single_head_quad + +$do_unroll: + addl $16, 64, $7 + cmple $18, 127, $1 + bne $1, $tail_quads + +#JJ + and $17, 31, $1 + bne $1, $unroll_body + +$unroll_body_simd: + ldwe $f31,128*5($17) + vldd $f4, 0($17) + vldd $f5, 32($17) + vstd_nc $f4, 0($16) + vstd_nc $f5, 32($16) + addl $16, 64, $16 + subl $18, 64, $18 + addl $17, 64, $17 + cmple $18, 63, $1 + beq $1, $unroll_body_simd + memb + br $no_unroll +#endJJ + +$unroll_body: + #wh64 ($7) + #e_fillcs 0($7) + + ldl $6, 0($17) + #e_fillcs 256($17) + + ldl $4, 8($17) + ldl $5, 16($17) + addl $7, 64, $7 + + ldl $3, 24($17) + addl $16, 64, $1 + + addl $17, 32, $17 + stl_nc $6, 0($16) + + stl_nc $4, 8($16) + stl_nc $5, 16($16) + subl $18, 192, $2 + + stl_nc $3, 24($16) + addl $16, 32, $16 + + ldl $6, 0($17) + ldwe $f31, 4*128($17) + #e_fillcs 288($17) + ldl $4, 8($17) + #cmovlt $2, $1, $7 + sellt $2, $1, $7, $7 + + ldl $5, 16($17) + ldl $3, 24($17) + addl $16, 32, $16 + subl $18, 64, $18 + + addl $17, 32, $17 + stl_nc $6, -32($16) + stl_nc $4, -24($16) + cmple $18, 63, $1 + + stl_nc $5, -16($16) + stl_nc $3, -8($16) + beq $1, $unroll_body + + memb + +$tail_quads: +$no_unroll: + .align 4 + subl $18, 8, $18 + blt $18, $less_than_8 + +$move_a_quad: + ldl $1, 0($17) + subl $18, 8, $18 + addl $17, 8, $17 + + stl $1, 0($16) + addl $16, 8, $16 + bge $18, $move_a_quad + +$less_than_8: + .align 4 + addl $18, 8, $18 + ble $18, $nomoredata + + +$tail_bytes: + subl $18, 1, $18 + ldbu $1, 0($17) + addl $17, 1, $17 + + stb $1, 0($16) + addl $16, 1, $16 + bgt $18, $tail_bytes + + ldi $4, 0x40($sp) + bic $4, 0x1f, $4 + vldd $f4, 0($4) + vldd $f5, 0x20($4) + ldl $4, 0($sp) + addl $sp, 0xa0, $sp + + ret $31, ($26), 1 + +$misaligned: + mov $0, $4 + and $0, 7, $1 + beq $1, $dest_0mod8 + +$aligndest: + ble $18, $nomoredata + ldbu $1, 0($17) + subl $18, 1, $18 + addl $17, 1, $17 + + stb $1, 0($4) + addl $4, 1, $4 + and $4, 7, $1 + bne $1, $aligndest + + +$dest_0mod8: + subl $18, 8, $18 + blt $18, $misalign_tail + ldl_u $3, 0($17) + +$mis_quad: + ldl_u $16, 8($17) + #extql $3, $17, $3 + fillde 256($17) + and $17, 7, $1 + sll $1, 3, $1 + srl $3, $1, $3 + + #extqh $16, $17, $1 + subl $1, 64, $1 + negl $1, $1 + sll $16, $1, $1 + + bis $3, $1, $1 + + subl $18, 8, $18 + addl $17, 8, $17 + fillde 128($4) + stl $1, 0($4) + mov $16, $3 + + addl $4, 8, $4 + 
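# A rough C model of the merge above (s = 8 * (src & 7), nonzero on this path): + # out = (cur >> s) | (next << (64 - s)); + # i.e. the srl/sll pair open-codes what extql/extqh would do. + 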
bge $18, $mis_quad + +$misalign_tail: + addl $18, 8, $18 + ble $18, $nomoredata + +$misalign_byte: + ldbu $1, 0($17) + subl $18, 1, $18 + addl $17, 1, $17 + + stb $1, 0($4) + addl $4, 1, $4 + bgt $18, $misalign_byte + + +$nomoredata: + ldi $4, 0x40($sp) + bic $4, 0x1f, $4 + vldd $f4, 0($4) + vldd $f5, 0x20($4) + ldl $4, 0($sp) + addl $sp, 0xa0, $sp + + ret $31, ($26), 1 + + .end memcpy + EXPORT_SYMBOL(memcpy) +__memcpy = memcpy +.globl __memcpy diff --git a/arch/sw_64/lib/deep-memset.S b/arch/sw_64/lib/deep-memset.S new file mode 100644 index 000000000000..4efba2062e11 --- /dev/null +++ b/arch/sw_64/lib/deep-memset.S @@ -0,0 +1,148 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Optimized memset() for SW64 with SIMD instructions + * + * Copyright (C) Mao Minkai + * Author: Mao Minkai + * + * Fill SIZE bytes pointed to by SRC with CHAR. + * + * Input: + * $16: SRC, clobbered + * $17: CHAR, clobbered + * $18: SIZE, clobbered + * + * Output: + * $0: SRC + * + * Temporaries: + * $1: unaligned parts of addr (0 means aligned addr), tmp data + * $2: tmp data + * $3: tmp data + * $4: tmp data + * $5: compare result + * $f10: 32 bytes data (manually saved) + * + */ + +#include <asm/export.h> + + .set noat + .set noreorder + .text + .align 4 + .globl memset + .globl __memset + .globl ___memset + .globl __memsetw + .ent ___memset +___memset: + .frame $30, 0, $26, 0 + .prologue 0 + +/* expand 1 byte data to 8 bytes */ + and $17, 0xff, $17 + sll $17, 8, $4 + bis $17, $4, $17 + sll $17, 16, $4 + bis $17, $4, $17 + sll $17, 32, $4 + bis $17, $4, $17 + +__constant_c_memset: + bis $31, $16, $0 # set return value + beq $18, $out # return if size is 0 + cmplt $18, 8, $5 # size less than 8, do 1-byte loop + bne $5, $tail_loop + +/* loop until SRC is 8 bytes aligned */ +$head_loop: + and $16, 0x7, $1 + beq $1, $mod8_aligned + stb $17, 0($16) + subl $18, 1, $18 + beq $18, $out + addl $16, 1, $16 + br $31, $head_loop + +$mod8_aligned: + +/* set 8 bytes each time */ +$mod8_loop: + and $16, 0x1f, $1 + beq $1, $mod32_aligned + subl $18, 8, $18 + blt $18, $tail + stl $17, 0($16) + addl $16, 8, $16 + br $31, $mod8_loop + +/* expand data to 32 bytes */ +$mod32_aligned: + subl $sp, 64, $sp + addl $sp, 31, $4 + bic $4, 0x1f, $4 + vstd $f10, 0($4) + ifmovd $17, $f10 + vcpyf $f10, $f10 + +/* set 64 bytes each time */ +$mod32_loop: + subl $18, 64, $18 + blt $18, $mod32_tail + vstd_nc $f10, 0($16) + vstd_nc $f10, 32($16) + addl $16, 64, $16 + br $31, $mod32_loop + +$mod32_tail: + vldd $f10, 0($4) + addl $sp, 64, $sp + addl $18, 64, $18 +$mod32_tail_loop: + subl $18, 8, $18 + blt $18, $tail + stl_nc $17, 0($16) + addl $16, 8, $16 + br $31, $mod32_tail_loop + +$tail: + addl $18, 8, $18 + +/* set one byte each time */ +$tail_loop: + beq $18, $out + stb $17, 0($16) + subl $18, 1, $18 + addl $16, 1, $16 + br $31, $tail_loop + +/* done, return */ +$out: + memb # required for _nc store instructions + ret + + .end ___memset + EXPORT_SYMBOL(___memset) + + .align 5 + .ent __memsetw +__memsetw: + .prologue 0 + + inslh $17, 0, $1 + inslh $17, 2, $2 + inslh $17, 4, $3 + bis $1, $2, $1 + inslh $17, 6, $4 + bis $1, $3, $1 + bis $1, $4, $17 + br $31, __constant_c_memset + + .end __memsetw + EXPORT_SYMBOL(__memsetw) + +memset = ___memset +EXPORT_SYMBOL(memset) +__memset = ___memset +EXPORT_SYMBOL(__memset) diff --git a/arch/sw_64/lib/divide.S b/arch/sw_64/lib/divide.S new file mode 100644 index 000000000000..ceef343a6084 --- /dev/null +++ b/arch/sw_64/lib/divide.S @@ -0,0 +1,190 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * (C) 1995 
Linus Torvalds + * + * The sw64 chip doesn't provide hardware division, so we have to do it + * by hand. The compiler expects the functions + * + * __divlu: 64-bit unsigned long divide + * __remlu: 64-bit unsigned long remainder + * __divls/__remls: signed 64-bit + * __divwu/__remwu: unsigned 32-bit + * __divws/__remws: signed 32-bit + * + * These are not normal C functions: instead of the normal + * calling sequence, these expect their arguments in registers + * $24 and $25, and return the result in $27. Register $28 may + * be clobbered (assembly temporary), anything else must be saved. + * + * In short: painful. + * + * This is a rather simple bit-at-a-time algorithm: it's very good + * at dividing random 64-bit numbers, but the more usual case where + * the divisor is small is handled better by the DEC algorithm + * using lookup tables. This uses much less memory, though, and is + * nicer on the cache.. Besides, I don't know the copyright status + * of the DEC code. + */ + +/* + * My temporaries: + * $0 - current bit + * $1 - shifted divisor + * $2 - modulus/quotient + * + * $23 - return address + * $24 - dividend + * $25 - divisor + * + * $27 - quotient/modulus + * $28 - compare status + */ +#include <asm/export.h> + +#define halt .long 0 + +/* + * Select function type and registers + */ +#define mask $0 +#define divisor $1 +#define compare $28 +#define tmp1 $3 +#define tmp2 $4 + +#ifdef DIV +#define DIV_ONLY(x,y...) x, ##y +#define MOD_ONLY(x,y...) +#define func(x) __div##x +#define modulus $2 +#define quotient $27 +#define GETSIGN(x) xor $24, $25, x +#define STACK 48 +#else +#define DIV_ONLY(x,y...) +#define MOD_ONLY(x,y...) x, ##y +#define func(x) __rem##x +#define modulus $27 +#define quotient $2 +#define GETSIGN(x) bis $24, $24, x +#define STACK 32 +#endif + +/* + * For 32-bit operations, we need to extend to 64-bit + */ +#ifdef INTSIZE +#define ufunction func(wu) +#define sfunction func(w) +#define LONGIFY(x) zapnot x, 15, x +#define SLONGIFY(x) addw x, 0, x +#else +#define ufunction func(lu) +#define sfunction func(l) +#define LONGIFY(x) +#define SLONGIFY(x) +#endif + +.set noat +.align 3 +.globl ufunction +.ent ufunction +ufunction: + subl $30, STACK, $30 + .frame $30, STACK, $23 + .prologue 0 + +7: stl $1, 0($30) + bis $25, $25, divisor + stl $2, 8($30) + bis $24, $24, modulus + stl $0, 16($30) + bis $31, $31, quotient + LONGIFY(divisor) + stl tmp1, 24($30) + LONGIFY(modulus) + bis $31, 1, mask + DIV_ONLY(stl tmp2, 32($30)) + beq divisor, 9f # div by zero + +#ifdef INTSIZE + /* + * shift divisor left, using 3-bit shifts for + * 32-bit divides as we can't overflow. Three-bit + * shifts will result in looping three times less + * here, but can result in two loops more later. + * Thus using a large shift isn't worth it (and + * s8add pairs better than a sll..) + */ +1: cmpult divisor, modulus, compare + s8addl divisor, $31, divisor + s8addl mask, $31, mask + bne compare, 1b +#else +1: cmpult divisor, modulus, compare + blt divisor, 2f + addl divisor, divisor, divisor + addl mask, mask, mask + bne compare, 1b +#endif + + /* ok, start to go right again..
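+ * Each trip below computes one quotient bit; in C terms, roughly: + * if (divisor <= modulus) { modulus -= divisor; quotient |= mask; } + * mask >>= 1; divisor >>= 1; + * i.e. classic restoring division, most-significant bit first.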
*/ +2: DIV_ONLY(addl quotient, mask, tmp2) + srl mask, 1, mask + cmpule divisor, modulus, compare + subl modulus, divisor, tmp1 + DIV_ONLY(selne compare, tmp2, quotient, quotient) + srl divisor, 1, divisor + selne compare, tmp1, modulus, modulus + bne mask, 2b + +9: ldl $1, 0($30) + ldl $2, 8($30) + ldl $0, 16($30) + ldl tmp1, 24($30) + DIV_ONLY(ldl tmp2, 32($30)) + addl $30, STACK, $30 + ret $31, ($23), 1 + .end ufunction + EXPORT_SYMBOL(ufunction) +/* + * Uhh.. Ugly signed division. I'd rather not have it at all, but + * it's needed in some circumstances. There are different ways to + * handle this, really. This does: + * -a / b = a / -b = -(a / b) + * -a % b = -(a % b) + * a % -b = a % b + * which is probably not the best solution, but at least should + * have the property that (x/y)*y + (x%y) = x. + */ +.align 3 +.globl sfunction +.ent sfunction +sfunction: + subl $30, STACK, $30 + .frame $30, STACK, $23 + .prologue 0 + bis $24, $25, $28 + SLONGIFY($28) + bge $28, 7b + stl $24, 0($30) + subl $31, $24, $28 + stl $25, 8($30) + sellt $24, $28, $24, $24 # abs($24) + stl $23, 16($30) + subl $31, $25, $28 + stl tmp1, 24($30) + sellt $25, $28, $25, $25 # abs($25) + bsr $23, ufunction + ldl $24, 0($30) + ldl $25, 8($30) + GETSIGN($28) + subl $31, $27, tmp1 + SLONGIFY($28) + ldl $23, 16($30) + sellt $28, tmp1, $27, $27 + ldl tmp1, 24($30) + addl $30, STACK, $30 + ret $31, ($23), 1 + .end sfunction + EXPORT_SYMBOL(sfunction) diff --git a/arch/sw_64/lib/fls.c b/arch/sw_64/lib/fls.c new file mode 100644 index 000000000000..e960b1c06782 --- /dev/null +++ b/arch/sw_64/lib/fls.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/module.h> +#include <linux/bitops.h> + +/* This is fls(x)-1, except zero is held to zero. This allows most + * efficient input into extbl, plus it allows easy handling of fls(0)=0. 
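+ * For example, __flsm1_tab[0x40] == 6, i.e. fls(0x40) - 1, while the + * leading zero entries make fls(0) come out as 0 with no special case.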
+ */ + +const unsigned char __flsm1_tab[256] = { + 0, + 0, + 1, 1, + 2, 2, 2, 2, + 3, 3, 3, 3, 3, 3, 3, 3, + 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, + + 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, + 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, + + 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, + 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, + 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, + 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, + + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, + 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, +}; +EXPORT_SYMBOL(__flsm1_tab); diff --git a/arch/sw_64/lib/fpreg.c b/arch/sw_64/lib/fpreg.c new file mode 100644 index 000000000000..b88c6be03c6d --- /dev/null +++ b/arch/sw_64/lib/fpreg.c @@ -0,0 +1,992 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * (C) Copyright 1998 Linus Torvalds + */ + +#include <linux/gfp.h> +#include <linux/export.h> + +#define STT(reg, val) \ + asm volatile ("fimovd $f"#reg", %0" : "=r"(val)); +#define STS(reg, val) \ + asm volatile ("fimovs $f"#reg", %0" : "=r"(val)); +#define LDT(reg, val) \ + asm volatile ("ifmovd %0, $f"#reg : : "r"(val)); +#define LDS(reg, val) \ + asm volatile ("ifmovs %0, $f"#reg : : "r"(val)); +#define VLDD(reg, val) \ + asm volatile ("vldd $f"#reg", %0" : : "m"(val) : "memory"); +#define VSTD(reg, val) \ + asm volatile ("vstd $f"#reg", %0" : "=m"(val) : : "memory"); +#define VLDS(reg, val) \ + asm volatile ("vlds $f"#reg", %0" : : "m"(val) : "memory"); +#define LDWE(reg, val) \ + asm volatile ("ldwe $f"#reg", %0" : : "m"(val) : "memory"); +#define VSTS(reg, val) \ + asm volatile ("vsts $f"#reg", %0" : "=m"(val) : : "memory"); +#define STDH(reg, val) \ + asm volatile ("vstd $f"#reg", %0" : "=m"(val) : : "memory"); + +void +sw64_write_simd_fp_reg_s(unsigned long reg, unsigned long f0, unsigned long f1) +{ + + unsigned long tmpa[4] __attribute__((aligned(16))); + + tmpa[0] = f0; + tmpa[1] = f1; + + switch (reg) { + case 0: + VLDS(0, *tmpa); + break; + case 1: + VLDS(1, *tmpa); + break; + case 2: + VLDS(2, *tmpa); + break; + case 3: + VLDS(3, *tmpa); + break; + case 4: + VLDS(4, *tmpa); + break; + case 5: + VLDS(5, *tmpa); + break; + case 6: + VLDS(6, *tmpa); + break; + case 7: + VLDS(7, *tmpa); + break; + case 8: + VLDS(8, *tmpa); + break; + case 9: + VLDS(9, *tmpa); + break; + case 10: + VLDS(10, *tmpa); + break; + case 11: + VLDS(11, *tmpa); + break; + case 12: + VLDS(12, *tmpa); + break; + case 13: + VLDS(13, *tmpa); + break; + case 14: + VLDS(14, *tmpa); + break; + case 15: + VLDS(15, *tmpa); + break; + case 16: + VLDS(16, *tmpa); + break; + case 17: + VLDS(17, *tmpa); + break; + case 18: + VLDS(18, *tmpa); + break; + case 19: + VLDS(19, *tmpa); + break; + case 20: + VLDS(20, *tmpa); + break; + case 21: + VLDS(21, *tmpa); + break; + case 22: + VLDS(22, *tmpa); + break; + case 23: + VLDS(23, *tmpa); + break; + case 24: + VLDS(24, *tmpa); + break; + case 25: + VLDS(25, *tmpa); + break; + case 26: + VLDS(26, *tmpa); + break; + case 27: + VLDS(27, *tmpa); + break; + case 28: + VLDS(28, *tmpa); + break; + case 29: + VLDS(29, *tmpa); + break; + case 30: + VLDS(30, *tmpa); + break; + case 31: + break; + } + +} + + +void sw64_write_simd_fp_reg_d(unsigned long reg, unsigned long f0, + unsigned long 
f1, unsigned long f2, unsigned long f3) +{ + unsigned long tmpa[4] __attribute__((aligned(32))); + + tmpa[0] = f0; + tmpa[1] = f1; + tmpa[2] = f2; + tmpa[3] = f3; + + switch (reg) { + case 0: + VLDD(0, *tmpa); + break; + case 1: + VLDD(1, *tmpa); + break; + case 2: + VLDD(2, *tmpa); + break; + case 3: + VLDD(3, *tmpa); + break; + case 4: + VLDD(4, *tmpa); + break; + case 5: + VLDD(5, *tmpa); + break; + case 6: + VLDD(6, *tmpa); + break; + case 7: + VLDD(7, *tmpa); + break; + case 8: + VLDD(8, *tmpa); + break; + case 9: + VLDD(9, *tmpa); + break; + case 10: + VLDD(10, *tmpa); + break; + case 11: + VLDD(11, *tmpa); + break; + case 12: + VLDD(12, *tmpa); + break; + case 13: + VLDD(13, *tmpa); + break; + case 14: + VLDD(14, *tmpa); + break; + case 15: + VLDD(15, *tmpa); + break; + case 16: + VLDD(16, *tmpa); + break; + case 17: + VLDD(17, *tmpa); + break; + case 18: + VLDD(18, *tmpa); + break; + case 19: + VLDD(19, *tmpa); + break; + case 20: + VLDD(20, *tmpa); + break; + case 21: + VLDD(21, *tmpa); + break; + case 22: + VLDD(22, *tmpa); + break; + case 23: + VLDD(23, *tmpa); + break; + case 24: + VLDD(24, *tmpa); + break; + case 25: + VLDD(25, *tmpa); + break; + case 26: + VLDD(26, *tmpa); + break; + case 27: + VLDD(27, *tmpa); + break; + case 28: + VLDD(28, *tmpa); + break; + case 29: + VLDD(29, *tmpa); + break; + case 30: + VLDD(30, *tmpa); + break; + case 31: + break; + } + + +} + + +void sw64_write_simd_fp_reg_ldwe(unsigned long reg, int a) +{ + switch (reg) { + case 0: + LDWE(0, a); + break; + case 1: + LDWE(1, a); + break; + case 2: + LDWE(2, a); + break; + case 3: + LDWE(3, a); + break; + case 4: + LDWE(4, a); + break; + case 5: + LDWE(5, a); + break; + case 6: + LDWE(6, a); + break; + case 7: + LDWE(7, a); + break; + case 8: + LDWE(8, a); + break; + case 9: + LDWE(9, a); + break; + case 10: + LDWE(10, a); + break; + case 11: + LDWE(11, a); + break; + case 12: + LDWE(12, a); + break; + case 13: + LDWE(13, a); + break; + case 14: + LDWE(14, a); + break; + case 15: + LDWE(15, a); + break; + case 16: + LDWE(16, a); + break; + case 17: + LDWE(17, a); + break; + case 18: + LDWE(18, a); + break; + case 19: + LDWE(19, a); + break; + case 20: + LDWE(20, a); + break; + case 21: + LDWE(21, a); + break; + case 22: + LDWE(22, a); + break; + case 23: + LDWE(23, a); + break; + case 24: + LDWE(24, a); + break; + case 25: + LDWE(25, a); + break; + case 26: + LDWE(26, a); + break; + case 27: + LDWE(27, a); + break; + case 28: + LDWE(28, a); + break; + case 29: + LDWE(29, a); + break; + case 30: + LDWE(30, a); + break; + case 31: + break; + } +} + + +void sw64_read_simd_fp_m_s(unsigned long reg, unsigned long *fp_value) +{ + volatile unsigned long tmpa[2] __attribute__((aligned(16))); + switch (reg) { + case 0: + VSTS(0, *tmpa); + break; + case 1: + VSTS(1, *tmpa); + break; + case 2: + VSTS(2, *tmpa); + break; + case 3: + VSTS(3, *tmpa); + break; + case 4: + VSTS(4, *tmpa); + break; + case 5: + VSTS(5, *tmpa); + break; + case 6: + VSTS(6, *tmpa); + break; + case 7: + VSTS(7, *tmpa); + break; + case 8: + VSTS(8, *tmpa); + break; + case 9: + VSTS(9, *tmpa); + break; + case 10: + VSTS(10, *tmpa); + break; + case 11: + VSTS(11, *tmpa); + break; + case 12: + VSTS(12, *tmpa); + break; + case 13: + VSTS(13, *tmpa); + break; + case 14: + VSTS(14, *tmpa); + break; + case 15: + VSTS(15, *tmpa); + break; + case 16: + VSTS(16, *tmpa); + break; + case 17: + VSTS(17, *tmpa); + break; + case 18: + VSTS(18, *tmpa); + break; + case 19: + VSTS(19, *tmpa); + break; + case 20: + VSTS(20, *tmpa); + break; + case 21: + 
VSTS(21, *tmpa); + break; + case 22: + VSTS(22, *tmpa); + break; + case 23: + VSTS(23, *tmpa); + break; + case 24: + VSTS(24, *tmpa); + break; + case 25: + VSTS(25, *tmpa); + break; + case 26: + VSTS(26, *tmpa); + break; + case 27: + VSTS(27, *tmpa); + break; + case 28: + VSTS(28, *tmpa); + break; + case 29: + VSTS(29, *tmpa); + break; + case 30: + VSTS(30, *tmpa); + break; + case 31: + VSTS(31, *tmpa); + break; + } + + *fp_value = tmpa[0]; + *(fp_value+1) = tmpa[1]; +} + +void sw64_read_simd_fp_m_d(unsigned long reg, unsigned long *fp_value) +{ + volatile unsigned long tmpa[4] __attribute__((aligned(32))); + switch (reg) { + case 0: + VSTD(0, *tmpa); + break; + case 1: + VSTD(1, *tmpa); + break; + case 2: + VSTD(2, *tmpa); + break; + case 3: + VSTD(3, *tmpa); + break; + case 4: + VSTD(4, *tmpa); + break; + case 5: + VSTD(5, *tmpa); + break; + case 6: + VSTD(6, *tmpa); + break; + case 7: + VSTD(7, *tmpa); + break; + case 8: + VSTD(8, *tmpa); + break; + case 9: + VSTD(9, *tmpa); + break; + case 10: + VSTD(10, *tmpa); + break; + case 11: + VSTD(11, *tmpa); + break; + case 12: + VSTD(12, *tmpa); + break; + case 13: + VSTD(13, *tmpa); + break; + case 14: + VSTD(14, *tmpa); + break; + case 15: + VSTD(15, *tmpa); + break; + case 16: + VSTD(16, *tmpa); + break; + case 17: + VSTD(17, *tmpa); + break; + case 18: + VSTD(18, *tmpa); + break; + case 19: + VSTD(19, *tmpa); + break; + case 20: + VSTD(20, *tmpa); + break; + case 21: + VSTD(21, *tmpa); + break; + case 22: + VSTD(22, *tmpa); + break; + case 23: + VSTD(23, *tmpa); + break; + case 24: + VSTD(24, *tmpa); + break; + case 25: + VSTD(25, *tmpa); + break; + case 26: + VSTD(26, *tmpa); + break; + case 27: + VSTD(27, *tmpa); + break; + case 28: + VSTD(28, *tmpa); + break; + case 29: + VSTD(29, *tmpa); + break; + case 30: + VSTD(30, *tmpa); + break; + case 31: + VSTD(31, *tmpa); + break; + } + + *fp_value = tmpa[0]; + *(fp_value+1) = tmpa[1]; + *(fp_value+2) = tmpa[2]; + *(fp_value+3) = tmpa[3]; + + return; +} + +unsigned long sw64_read_fp_reg(unsigned long reg) +{ + unsigned long val; + + switch (reg) { + case 0: + STT(0, val); + break; + case 1: + STT(1, val); + break; + case 2: + STT(2, val); + break; + case 3: + STT(3, val); + break; + case 4: + STT(4, val); + break; + case 5: + STT(5, val); + break; + case 6: + STT(6, val); + break; + case 7: + STT(7, val); + break; + case 8: + STT(8, val); + break; + case 9: + STT(9, val); + break; + case 10: + STT(10, val); + break; + case 11: + STT(11, val); + break; + case 12: + STT(12, val); + break; + case 13: + STT(13, val); + break; + case 14: + STT(14, val); + break; + case 15: + STT(15, val); + break; + case 16: + STT(16, val); + break; + case 17: + STT(17, val); + break; + case 18: + STT(18, val); + break; + case 19: + STT(19, val); + break; + case 20: + STT(20, val); + break; + case 21: + STT(21, val); + break; + case 22: + STT(22, val); + break; + case 23: + STT(23, val); + break; + case 24: + STT(24, val); + break; + case 25: + STT(25, val); + break; + case 26: + STT(26, val); + break; + case 27: + STT(27, val); + break; + case 28: + STT(28, val); + break; + case 29: + STT(29, val); + break; + case 30: + STT(30, val); + break; + case 31: + STT(31, val); + break; + default: + return 0; + } + + return val; +} +EXPORT_SYMBOL(sw64_read_fp_reg); + +void sw64_write_fp_reg(unsigned long reg, unsigned long val) +{ + switch (reg) { + case 0: + LDT(0, val); + break; + case 1: + LDT(1, val); + break; + case 2: + LDT(2, val); + break; + case 3: + LDT(3, val); + break; + case 4: + LDT(4, val); + break; + case 
5: + LDT(5, val); + break; + case 6: + LDT(6, val); + break; + case 7: + LDT(7, val); + break; + case 8: + LDT(8, val); + break; + case 9: + LDT(9, val); + break; + case 10: + LDT(10, val); + break; + case 11: + LDT(11, val); + break; + case 12: + LDT(12, val); + break; + case 13: + LDT(13, val); + break; + case 14: + LDT(14, val); + break; + case 15: + LDT(15, val); + break; + case 16: + LDT(16, val); + break; + case 17: + LDT(17, val); + break; + case 18: + LDT(18, val); + break; + case 19: + LDT(19, val); + break; + case 20: + LDT(20, val); + break; + case 21: + LDT(21, val); + break; + case 22: + LDT(22, val); + break; + case 23: + LDT(23, val); + break; + case 24: + LDT(24, val); + break; + case 25: + LDT(25, val); + break; + case 26: + LDT(26, val); + break; + case 27: + LDT(27, val); + break; + case 28: + LDT(28, val); + break; + case 29: + LDT(29, val); + break; + case 30: + LDT(30, val); + break; + case 31: + LDT(31, val); + break; + } +} +EXPORT_SYMBOL(sw64_write_fp_reg); + +unsigned long sw64_read_fp_reg_s(unsigned long reg) +{ + unsigned long val; + + switch (reg) { + case 0: + STS(0, val); + break; + case 1: + STS(1, val); + break; + case 2: + STS(2, val); + break; + case 3: + STS(3, val); + break; + case 4: + STS(4, val); + break; + case 5: + STS(5, val); + break; + case 6: + STS(6, val); + break; + case 7: + STS(7, val); + break; + case 8: + STS(8, val); + break; + case 9: + STS(9, val); + break; + case 10: + STS(10, val); + break; + case 11: + STS(11, val); + break; + case 12: + STS(12, val); + break; + case 13: + STS(13, val); + break; + case 14: + STS(14, val); + break; + case 15: + STS(15, val); + break; + case 16: + STS(16, val); + break; + case 17: + STS(17, val); + break; + case 18: + STS(18, val); + break; + case 19: + STS(19, val); + break; + case 20: + STS(20, val); + break; + case 21: + STS(21, val); + break; + case 22: + STS(22, val); + break; + case 23: + STS(23, val); + break; + case 24: + STS(24, val); + break; + case 25: + STS(25, val); + break; + case 26: + STS(26, val); + break; + case 27: + STS(27, val); + break; + case 28: + STS(28, val); + break; + case 29: + STS(29, val); + break; + case 30: + STS(30, val); + break; + case 31: + STS(31, val); + break; + default: + return 0; + } + + return val; +} +EXPORT_SYMBOL(sw64_read_fp_reg_s); + +void sw64_write_fp_reg_s(unsigned long reg, unsigned long val) +{ + switch (reg) { + case 0: + LDS(0, val); + break; + case 1: + LDS(1, val); + break; + case 2: + LDS(2, val); + break; + case 3: + LDS(3, val); + break; + case 4: + LDS(4, val); + break; + case 5: + LDS(5, val); + break; + case 6: + LDS(6, val); + break; + case 7: + LDS(7, val); + break; + case 8: + LDS(8, val); + break; + case 9: + LDS(9, val); + break; + case 10: + LDS(10, val); + break; + case 11: + LDS(11, val); + break; + case 12: + LDS(12, val); + break; + case 13: + LDS(13, val); + break; + case 14: + LDS(14, val); + break; + case 15: + LDS(15, val); + break; + case 16: + LDS(16, val); + break; + case 17: + LDS(17, val); + break; + case 18: + LDS(18, val); + break; + case 19: + LDS(19, val); + break; + case 20: + LDS(20, val); + break; + case 21: + LDS(21, val); + break; + case 22: + LDS(22, val); + break; + case 23: + LDS(23, val); + break; + case 24: + LDS(24, val); + break; + case 25: + LDS(25, val); + break; + case 26: + LDS(26, val); + break; + case 27: + LDS(27, val); + break; + case 28: + LDS(28, val); + break; + case 29: + LDS(29, val); + break; + case 30: + LDS(30, val); + break; + case 31: + LDS(31, val); + break; + } +} 
+EXPORT_SYMBOL(sw64_write_fp_reg_s); diff --git a/arch/sw_64/lib/iomap.c b/arch/sw_64/lib/iomap.c new file mode 100644 index 000000000000..30d24923624d --- /dev/null +++ b/arch/sw_64/lib/iomap.c @@ -0,0 +1,508 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Sw_64 IO and memory functions. + */ + +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/string.h> +#include <linux/module.h> +#include <asm/io.h> + +/* + * Here comes the sw64 implementation of the IOMAP interfaces. + */ +unsigned int ioread8(const void __iomem *addr) +{ + return readb(addr); +} +EXPORT_SYMBOL(ioread8); + +unsigned int ioread16(const void __iomem *addr) +{ + return readw(addr); +} +EXPORT_SYMBOL(ioread16); + +unsigned int ioread32(const void __iomem *addr) +{ + return readl(addr); +} +EXPORT_SYMBOL(ioread32); + +void iowrite8(u8 b, void __iomem *addr) +{ + writeb(b, addr); +} +EXPORT_SYMBOL(iowrite8); + +void iowrite16(u16 b, void __iomem *addr) +{ + writew(b, addr); +} +EXPORT_SYMBOL(iowrite16); + +void iowrite32(u32 b, void __iomem *addr) +{ + writel(b, addr); +} +EXPORT_SYMBOL(iowrite32); + +u8 inb(unsigned long port) +{ + return ioread8(ioport_map(port, 1)); +} +EXPORT_SYMBOL(inb); + +u16 inw(unsigned long port) +{ + return ioread16(ioport_map(port, 2)); +} +EXPORT_SYMBOL(inw); + +u32 inl(unsigned long port) +{ + return ioread32(ioport_map(port, 4)); +} +EXPORT_SYMBOL(inl); + +void outb(u8 b, unsigned long port) +{ + iowrite8(b, ioport_map(port, 1)); +} +EXPORT_SYMBOL(outb); + +void outw(u16 b, unsigned long port) +{ + iowrite16(b, ioport_map(port, 2)); +} +EXPORT_SYMBOL(outw); + +void outl(u32 b, unsigned long port) +{ + iowrite32(b, ioport_map(port, 4)); +} +EXPORT_SYMBOL(outl); + + +/* + * Read COUNT 8-bit bytes from port PORT into memory starting at SRC. + */ +void ioread8_rep(const void __iomem *port, void *dst, unsigned long count) +{ + while ((unsigned long)dst & 0x3) { + if (!count) + return; + count--; + *(unsigned char *)dst = ioread8(port); + dst += 1; + } + + while (count >= 4) { + unsigned int w; + + count -= 4; + w = ioread8(port); + w |= ioread8(port) << 8; + w |= ioread8(port) << 16; + w |= ioread8(port) << 24; + *(unsigned int *)dst = w; + dst += 4; + } + + while (count) { + --count; + *(unsigned char *)dst = ioread8(port); + dst += 1; + } +} +EXPORT_SYMBOL(ioread8_rep); + +void insb(unsigned long port, void *dst, unsigned long count) +{ + ioread8_rep(ioport_map(port, 1), dst, count); +} +EXPORT_SYMBOL(insb); + +/* + * Read COUNT 16-bit words from port PORT into memory starting at + * SRC. SRC must be at least short aligned. This is used by the + * IDE driver to read disk sectors. Performance is important, but + * the interfaces seems to be slow: just using the inlined version + * of the inw() breaks things. + */ +void ioread16_rep(const void __iomem *port, void *dst, unsigned long count) +{ + if (unlikely((unsigned long)dst & 0x3)) { + if (!count) + return; + BUG_ON((unsigned long)dst & 0x1); + count--; + *(unsigned short *)dst = ioread16(port); + dst += 2; + } + + while (count >= 2) { + unsigned int w; + + count -= 2; + w = ioread16(port); + w |= ioread16(port) << 16; + *(unsigned int *)dst = w; + dst += 4; + } + + if (count) + *(unsigned short *)dst = ioread16(port); +} +EXPORT_SYMBOL(ioread16_rep); + +void insw(unsigned long port, void *dst, unsigned long count) +{ + ioread16_rep(ioport_map(port, 2), dst, count); +} +EXPORT_SYMBOL(insw); + + +/* + * Read COUNT 32-bit words from port PORT into memory starting at + * SRC. Now works with any alignment in SRC. 
Performance is important, + * but the interfaces seems to be slow: just using the inlined version + * of the inl() breaks things. + */ +void ioread32_rep(const void __iomem *port, void *dst, unsigned long count) +{ + if (unlikely((unsigned long)dst & 0x3)) { + while (count--) { + struct S { int x __attribute__((packed)); }; + ((struct S *)dst)->x = ioread32(port); + dst += 4; + } + } else { + /* Buffer 32-bit aligned. */ + while (count--) { + *(unsigned int *)dst = ioread32(port); + dst += 4; + } + } +} +EXPORT_SYMBOL(ioread32_rep); + +void insl(unsigned long port, void *dst, unsigned long count) +{ + ioread32_rep(ioport_map(port, 4), dst, count); +} +EXPORT_SYMBOL(insl); + + +/* + * Like insb but in the opposite direction. + * Don't worry as much about doing aligned memory transfers: + * doing byte reads the "slow" way isn't nearly as slow as + * doing byte writes the slow way (no r-m-w cycle). + */ +void iowrite8_rep(void __iomem *port, const void *xsrc, unsigned long count) +{ + const unsigned char *src = xsrc; + + while (count--) + iowrite8(*src++, port); +} +EXPORT_SYMBOL(iowrite8_rep); + +void outsb(unsigned long port, const void *src, unsigned long count) +{ + iowrite8_rep(ioport_map(port, 1), src, count); +} +EXPORT_SYMBOL(outsb); + + +/* + * Like insw but in the opposite direction. This is used by the IDE + * driver to write disk sectors. Performance is important, but the + * interfaces seems to be slow: just using the inlined version of the + * outw() breaks things. + */ +void iowrite16_rep(void __iomem *port, const void *src, unsigned long count) +{ + if (unlikely((unsigned long)src & 0x3)) { + if (!count) + return; + BUG_ON((unsigned long)src & 0x1); + iowrite16(*(unsigned short *)src, port); + src += 2; + --count; + } + + while (count >= 2) { + unsigned int w; + + count -= 2; + w = *(unsigned int *)src; + src += 4; + iowrite16(w >> 0, port); + iowrite16(w >> 16, port); + } + + if (count) + iowrite16(*(unsigned short *)src, port); +} +EXPORT_SYMBOL(iowrite16_rep); + +void outsw(unsigned long port, const void *src, unsigned long count) +{ + iowrite16_rep(ioport_map(port, 2), src, count); +} +EXPORT_SYMBOL(outsw); + + +/* + * Like insl but in the opposite direction. This is used by the IDE + * driver to write disk sectors. Works with any alignment in SRC. + * Performance is important, but the interfaces seems to be slow: + * just using the inlined version of the outl() breaks things. + */ +void iowrite32_rep(void __iomem *port, const void *src, unsigned long count) +{ + if (unlikely((unsigned long)src & 0x3)) { + while (count--) { + struct S { int x __attribute__((packed)); }; + iowrite32(((struct S *)src)->x, port); + src += 4; + } + } else { + /* Buffer 32-bit aligned. */ + while (count--) { + iowrite32(*(unsigned int *)src, port); + src += 4; + } + } +} +EXPORT_SYMBOL(iowrite32_rep); + +void outsl(unsigned long port, const void *src, unsigned long count) +{ + iowrite32_rep(ioport_map(port, 4), src, count); +} +EXPORT_SYMBOL(outsl); + + +/* + * Copy data from IO memory space to "real" memory space. + * This needs to be optimized. + */ +void memcpy_fromio(void *to, const volatile void __iomem *from, long count) +{ + /* + * Optimize co-aligned transfers. Everything else gets handled + * a byte at a time. 
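+ *
+ * Co-aligned means both pointers share the same offset within the
+ * access size, e.g. ((u64)to & 7) == ((u64)from & 7) for the
+ * quadword path; only then can the bulk of the transfer run as
+ * whole readq/readl/readw units instead of single-byte reads.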
+ */ + + if (count >= 8 && ((u64)to & 7) == ((u64)from & 7)) { + count -= 8; + do { + *(u64 *)to = __raw_readq(from); + count -= 8; + to += 8; + from += 8; + } while (count >= 0); + count += 8; + } + + if (count >= 4 && ((u64)to & 3) == ((u64)from & 3)) { + count -= 4; + do { + *(u32 *)to = __raw_readl(from); + count -= 4; + to += 4; + from += 4; + } while (count >= 0); + count += 4; + } + + if (count >= 2 && ((u64)to & 1) == ((u64)from & 1)) { + count -= 2; + do { + *(u16 *)to = __raw_readw(from); + count -= 2; + to += 2; + from += 2; + } while (count >= 0); + count += 2; + } + + while (count > 0) { + *(u8 *) to = __raw_readb(from); + count--; + to++; + from++; + } + mb(); +} +EXPORT_SYMBOL(memcpy_fromio); + + +/* + * Copy data from "real" memory space to IO memory space. + * This needs to be optimized. + */ +void memcpy_toio(volatile void __iomem *to, const void *from, long count) +{ + /* + * Optimize co-aligned transfers. Everything else gets handled + * a byte at a time. + * FIXME -- align FROM. + */ + + if (count >= 8 && ((u64)to & 7) == ((u64)from & 7)) { + count -= 8; + do { + __raw_writeq(*(const u64 *)from, to); + count -= 8; + to += 8; + from += 8; + } while (count >= 0); + count += 8; + } + + if (count >= 4 && ((u64)to & 3) == ((u64)from & 3)) { + count -= 4; + do { + __raw_writel(*(const u32 *)from, to); + count -= 4; + to += 4; + from += 4; + } while (count >= 0); + count += 4; + } + + if (count >= 2 && ((u64)to & 1) == ((u64)from & 1)) { + count -= 2; + do { + __raw_writew(*(const u16 *)from, to); + count -= 2; + to += 2; + from += 2; + } while (count >= 0); + count += 2; + } + + while (count > 0) { + __raw_writeb(*(const u8 *) from, to); + count--; + to++; + from++; + } + mb(); +} +EXPORT_SYMBOL(memcpy_toio); + + +/* + * "memset" on IO memory space. + */ +void _memset_c_io(volatile void __iomem *to, unsigned long c, long count) +{ + /* Handle any initial odd byte */ + if (count > 0 && ((u64)to & 1)) { + __raw_writeb(c, to); + to++; + count--; + } + + /* Handle any initial odd halfword */ + if (count >= 2 && ((u64)to & 2)) { + __raw_writew(c, to); + to += 2; + count -= 2; + } + + /* Handle any initial odd word */ + if (count >= 4 && ((u64)to & 4)) { + __raw_writel(c, to); + to += 4; + count -= 4; + } + + /* + * Handle all full-sized quadwords: we're aligned + * (or have a small count) + */ + count -= 8; + if (count >= 0) { + do { + __raw_writeq(c, to); + to += 8; + count -= 8; + } while (count >= 0); + } + count += 8; + + /* The tail is word-aligned if we still have count >= 4 */ + if (count >= 4) { + __raw_writel(c, to); + to += 4; + count -= 4; + } + + /* The tail is half-word aligned if we have count >= 2 */ + if (count >= 2) { + __raw_writew(c, to); + to += 2; + count -= 2; + } + + /* And finally, one last byte.. */ + if (count) + __raw_writeb(c, to); + mb(); +} +EXPORT_SYMBOL(_memset_c_io); + +/* + * A version of memcpy used by the vga console routines to move data around + * arbitrarily between screen and main memory. + */ + +void +scr_memcpyw(u16 *d, const u16 *s, unsigned int count) +{ + const u16 __iomem *ios = (const u16 __iomem *) s; + u16 __iomem *iod = (u16 __iomem *) d; + int s_isio = __is_ioaddr(s); + int d_isio = __is_ioaddr(d); + u16 tmp; + + if (s_isio) { + if (d_isio) { + /* + * FIXME: Should handle unaligned ops and + * operation widening. 
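+ *
+ * Note that count is in bytes: the io-to-io loop below halves it to
+ * move halfwords, while the mixed paths hand the byte count straight
+ * to memcpy_fromio()/memcpy_toio().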
+ */ + + count /= 2; + while (count--) { + tmp = __raw_readw(ios++); + __raw_writew(tmp, iod++); + } + } else + memcpy_fromio(d, ios, count); + } else { + if (d_isio) + memcpy_toio(iod, s, count); + else + memcpy(d, s, count); + } +} +EXPORT_SYMBOL(scr_memcpyw); + +void __iomem *ioport_map(unsigned long port, unsigned int size) +{ + return ioportmap(port); +} +EXPORT_SYMBOL(ioport_map); + +void ioport_unmap(void __iomem *addr) +{ +} +EXPORT_SYMBOL(ioport_unmap); diff --git a/arch/sw_64/lib/iomap_copy.c b/arch/sw_64/lib/iomap_copy.c new file mode 100644 index 000000000000..10e756fffff5 --- /dev/null +++ b/arch/sw_64/lib/iomap_copy.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/export.h> +#include <linux/io.h> + +/** + * __iowrite32_copy - copy data to MMIO space, in 32-bit units + * @to: destination, in MMIO space (must be 32-bit aligned) + * @from: source (must be 32-bit aligned) + * @count: number of 32-bit quantities to copy + * + * Copy data from kernel space to MMIO space, in units of 32 bits at a + * time. Order of access is not guaranteed, nor is a memory barrier + * performed afterwards. + */ +void __iowrite32_copy(void __iomem *to, + const void *from, + size_t count) +{ + u32 __iomem *dst = to; + const u32 *src = from; + const u32 *end = src + count; + + while (src < end) { + __raw_writel(*src++, dst++); + mb(); + } + +} + +/** + * __iowrite64_copy - copy data to MMIO space, in 64-bit or 32-bit units + * @to: destination, in MMIO space (must be 64-bit aligned) + * @from: source (must be 64-bit aligned) + * @count: number of 64-bit quantities to copy + * + * Copy data from kernel space to MMIO space, in units of 32 or 64 bits at a + * time. Order of access is not guaranteed, nor is a memory barrier + * performed afterwards. + */ +void __iowrite64_copy(void __iomem *to, + const void *from, + size_t count) +{ +#ifdef CONFIG_64BIT + u64 __iomem *dst = to; + const u64 *src = from; + const u64 *end = src + count; + + while (src < end) + __raw_writeq(*src++, dst++); + mb(); +#else + __iowrite32_copy(to, from, count * 2); +#endif +} diff --git a/arch/sw_64/lib/memcpy.S b/arch/sw_64/lib/memcpy.S new file mode 100644 index 000000000000..31c422b393ee --- /dev/null +++ b/arch/sw_64/lib/memcpy.S @@ -0,0 +1,201 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Reasonably optimized memcpy() routine for the sw64 + * + * - memory accessed as aligned quadwords only + * - uses bcmpge to compare 8 bytes in parallel + * + * Temp usage notes: + * $1, $2, - scratch + */ +#include <asm/export.h> + .set noreorder + .set noat + + .align 4 + .globl memcpy + .ent memcpy +memcpy: + .frame $30, 0, $26, 0 + .prologue 0 + + mov $16, $0 + ble $18, $nomoredata + xor $16, $17, $1 + and $1, 7, $1 + + bne $1, $misaligned + /* source and dest are same mod 8 address */ + and $16, 7, $1 + beq $1, $both_0mod8 + + /* + * source and dest are same misalignment. move a byte at a time + * until a 0mod8 alignment for both is reached. 
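+ * (For example, if both pointers are 3 mod 8, five head bytes are
+ * moved here and the quadword loops take over at the boundary.)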
+ * At least one byte more to move + */ + +$head_align: + ldbu $1, 0($17) + subl $18, 1, $18 + addl $17, 1, $17 + stb $1, 0($16) + addl $16, 1, $16 + and $16, 7, $1 + ble $18, $nomoredata + bne $1, $head_align + +$both_0mod8: + cmple $18, 127, $1 + bne $1, $no_unroll + and $16, 63, $1 + beq $1, $do_unroll + +$single_head_quad: + ldl $1, 0($17) + subl $18, 8, $18 + addl $17, 8, $17 + + stl $1, 0($16) + addl $16, 8, $16 + and $16, 63, $1 + bne $1, $single_head_quad + +$do_unroll: + addl $16, 64, $7 + cmple $18, 127, $1 + bne $1, $tail_quads + +$unroll_body: + #wh64 ($7) + fillde 0($7) + + ldl $6, 0($17) + + ldl $4, 8($17) + ldl $5, 16($17) + addl $7, 64, $7 + + ldl $3, 24($17) + addl $16, 64, $1 + + addl $17, 32, $17 + stl $6, 0($16) + + stl $4, 8($16) + stl $5, 16($16) + subl $18, 192, $2 + + stl $3, 24($16) + addl $16, 32, $16 + + ldl $6, 0($17) + ldl $4, 8($17) + #cmovlt $2, $1, $7 + sellt $2, $1, $7, $7 + + ldl $5, 16($17) + ldl $3, 24($17) + addl $16, 32, $16 + subl $18, 64, $18 + + addl $17, 32, $17 + stl $6, -32($16) + stl $4, -24($16) + cmple $18, 63, $1 + + stl $5, -16($16) + stl $3, -8($16) + beq $1, $unroll_body + +$tail_quads: +$no_unroll: + .align 4 + subl $18, 8, $18 + blt $18, $less_than_8 + +$move_a_quad: + ldl $1, 0($17) + subl $18, 8, $18 + addl $17, 8, $17 + + stl $1, 0($16) + addl $16, 8, $16 + bge $18, $move_a_quad + +$less_than_8: + .align 4 + addl $18, 8, $18 + ble $18, $nomoredata + + /* Trailing bytes */ +$tail_bytes: + subl $18, 1, $18 + ldbu $1, 0($17) + addl $17, 1, $17 + + stb $1, 0($16) + addl $16, 1, $16 + bgt $18, $tail_bytes + + /* branching to exit takes 3 extra cycles, so replicate exit here */ + ret $31, ($26), 1 + +$misaligned: + mov $0, $4 + and $0, 7, $1 + beq $1, $dest_0mod8 + +$aligndest: + ble $18, $nomoredata + ldbu $1, 0($17) + subl $18, 1, $18 + addl $17, 1, $17 + + stb $1, 0($4) + addl $4, 1, $4 + and $4, 7, $1 + bne $1, $aligndest + + /* Source has unknown alignment, but dest is known to be 0mod8 */ +$dest_0mod8: + subl $18, 8, $18 + blt $18, $misalign_tail + ldl_u $3, 0($17) + +$mis_quad: + ldl_u $16, 8($17) + extll $3, $17, $3 + exthl $16, $17, $1 + bis $3, $1, $1 + + subl $18, 8, $18 + addl $17, 8, $17 + stl $1, 0($4) + mov $16, $3 + + addl $4, 8, $4 + bge $18, $mis_quad + +$misalign_tail: + addl $18, 8, $18 + ble $18, $nomoredata + +$misalign_byte: + ldbu $1, 0($17) + subl $18, 1, $18 + addl $17, 1, $17 + + stb $1, 0($4) + addl $4, 1, $4 + bgt $18, $misalign_byte + + +$nomoredata: + ret $31, ($26), 1 + + .end memcpy + EXPORT_SYMBOL(memcpy) +/* For backwards module compatibility. */ +__memcpy = memcpy +.globl __memcpy diff --git a/arch/sw_64/lib/memmove.S b/arch/sw_64/lib/memmove.S new file mode 100644 index 000000000000..3e34fcd5b217 --- /dev/null +++ b/arch/sw_64/lib/memmove.S @@ -0,0 +1,148 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Barely optimized memmove routine for sw64. + * This is hand-massaged output from the original memcpy.c. We defer to + * memcpy whenever possible; the backwards copy loops are not unrolled. + */ +#include <asm/export.h> + .set noat + .set noreorder + .text + + .align 4 + .globl memmove + .ent memmove +memmove: + ldgp $29, 0($27) + unop + .prologue 1 + + addl $16, $18, $4 + addl $17, $18, $5 + cmpule $4, $17, $1 # dest + n <= src + cmpule $5, $16, $2 # dest >= src + n + + bis $1, $2, $1 + mov $16, $0 + xor $16, $17, $2 + bne $1, memcpy # samegp + + and $2, 7, $2 # Test for src/dest co-alignment. 
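+	# The overlap test above already sent disjoint regions to memcpy;
+	# what is left copies forwards ($memmove_up) when dest <= src and
+	# backwards otherwise, so overlapping bytes are never clobbered.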
+ and $16, 7, $1 + cmpule $16, $17, $3 + bne $3, $memmove_up # dest < src + + and $4, 7, $1 + bne $2, $misaligned_dn + unop + beq $1, $skip_aligned_byte_loop_head_dn + +$aligned_byte_loop_head_dn: + ldi $4, -1($4) + ldi $5, -1($5) + unop + ble $18, $egress + + ldbu $1, 0($5) + ldi $18, -1($18) + stb $1, 0($4) + + and $4, 7, $6 + bne $6, $aligned_byte_loop_head_dn + +$skip_aligned_byte_loop_head_dn: + ldi $18, -8($18) + blt $18, $skip_aligned_word_loop_dn + +$aligned_word_loop_dn: + ldl $1, -8($5) + ldi $5, -8($5) + ldi $18, -8($18) + + stl $1, -8($4) + ldi $4, -8($4) + bge $18, $aligned_word_loop_dn + +$skip_aligned_word_loop_dn: + ldi $18, 8($18) + bgt $18, $byte_loop_tail_dn + unop + ret $31, ($26), 1 + + .align 4 +$misaligned_dn: + fnop + unop + beq $18, $egress + +$byte_loop_tail_dn: + ldbu $1, -1($5) + ldi $5, -1($5) + ldi $4, -1($4) + + ldi $18, -1($18) + stb $1, 0($4) + + bgt $18, $byte_loop_tail_dn + br $egress + +$memmove_up: + mov $16, $4 + mov $17, $5 + bne $2, $misaligned_up + beq $1, $skip_aligned_byte_loop_head_up + +$aligned_byte_loop_head_up: + unop + ble $18, $egress + ldbu $1, 0($5) + + ldi $18, -1($18) + + ldi $5, 1($5) + stb $1, 0($4) + ldi $4, 1($4) + + and $4, 7, $6 + bne $6, $aligned_byte_loop_head_up + +$skip_aligned_byte_loop_head_up: + ldi $18, -8($18) + blt $18, $skip_aligned_word_loop_up + +$aligned_word_loop_up: + ldl $1, 0($5) + ldi $5, 8($5) + ldi $18, -8($18) + + stl $1, 0($4) + ldi $4, 8($4) + bge $18, $aligned_word_loop_up + +$skip_aligned_word_loop_up: + ldi $18, 8($18) + bgt $18, $byte_loop_tail_up + unop + ret $31, ($26), 1 + + .align 4 +$misaligned_up: + fnop + unop + beq $18, $egress + +$byte_loop_tail_up: + ldbu $1, 0($5) + ldi $18, -1($18) + + stb $1, 0($4) + + ldi $5, 1($5) + ldi $4, 1($4) + bgt $18, $byte_loop_tail_up + +$egress: + ret $31, ($26), 1 + + .end memmove + EXPORT_SYMBOL(memmove) diff --git a/arch/sw_64/lib/memset.S b/arch/sw_64/lib/memset.S new file mode 100644 index 000000000000..dbc4d775c7ea --- /dev/null +++ b/arch/sw_64/lib/memset.S @@ -0,0 +1,153 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This is an efficient (and small) implementation of the C library "memset()" + * function for the sw. + * + * (C) Copyright 1996 Linus Torvalds + * + * This routine is "moral-ware": you are free to use it any way you wish, and + * the only obligation I put on you is a moral one: if you make any improvements + * to the routine, please send me your improvements for me to use similarly. + * + * The scheduling comments are according to the documentation (and done by + * hand, so they might well be incorrect, please do tell me about it..) 
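+ *
+ * The entry point below first replicates the fill byte across all
+ * eight byte lanes, the moral equivalent of:
+ *
+ *	c &= 0xff;
+ *	c |= c << 8;  c |= c << 16;  c |= c << 32;
+ *
+ * so the unrolled loops can store whole quadwords of the pattern.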
+ */ + +#include <asm/export.h> + + .set noat + .set noreorder +.text + .globl memset + .globl __memset + .globl ___memset + .globl __memsetw + .globl __constant_c_memset + + .ent ___memset +.align 5 +___memset: + .frame $30, 0, $26, 0 + .prologue 0 + + and $17, 255, $1 + inslb $17, 1, $17 + bis $17, $1, $17 + sll $17, 16, $1 + + bis $17, $1, $17 + sll $17, 32, $1 + bis $17, $1, $17 + ldl_u $31, 0($30) + +.align 5 +__constant_c_memset: + addl $18, $16, $6 + bis $16, $16, $0 + xor $16, $6, $1 + ble $18, end + + bic $1, 7, $1 + beq $1, within_one_quad + and $16, 7, $3 + beq $3, aligned + + bis $16, $16, $5 + subl $3, 8, $3 + addl $18, $3, $18 + subl $16, $3, $16 + + eqv $3, $31, $3 + addl $3, 1, $3 +unaligned_start_loop: + stb $17, 0($5) + subl $3, 1, $3 + addl $5, 1, $5 + bgt $3, unaligned_start_loop + + +.align 4 +aligned: + sra $18, 3, $3 + and $18, 7, $18 + bis $16, $16, $5 + beq $3, no_quad + +/*added by JJ*/ + ldi $3, -8($3) + blt $3, nounrol + +.align 3 +wloop: + fillde 256($5) + stl $17, 0($5) + stl $17, 8($5) + stl $17, 16($5) + stl $17, 24($5) + subl $3, 8, $3 + stl $17, 32($5) + stl $17, 40($5) + stl $17, 48($5) + stl $17, 56($5) + addl $5, 0x40, $5 + bge $3, wloop + +nounrol: + addl $3, 8, $3 + beq $3, no_quad +/*end JJ*/ + +.align 3 +loop: + stl $17, 0($5) + subl $3, 1, $3 + addl $5, 8, $5 + bne $3, loop + +no_quad: + bis $31, $31, $31 + beq $18, end + and $6, 7, $6 +no_quad_loop: + stb $17, 0($5) + subl $6, 1, $6 + addl $5, 1, $5 + bgt $6, no_quad_loop + ret $31, ($26), 1 + +.align 3 +within_one_quad: + bis $18, $18, $1 + bis $16, $16, $5 +within_one_quad_loop: + stb $17, 0($5) + subl $1, 1, $1 + addl $5, 1, $5 + bgt $1, within_one_quad_loop + +end: + ret $31, ($26), 1 + .end ___memset + EXPORT_SYMBOL(___memset) + + .align 5 + .ent __memsetw +__memsetw: + .prologue 0 + + inslh $17, 0, $1 + inslh $17, 2, $2 + inslh $17, 4, $3 + or $1, $2, $1 + inslh $17, 6, $4 + or $1, $3, $1 + or $1, $4, $17 + br __constant_c_memset + + .end __memsetw + EXPORT_SYMBOL(__memsetw) + +memset = ___memset +EXPORT_SYMBOL(memset) +__memset = ___memset +EXPORT_SYMBOL(__memset) diff --git a/arch/sw_64/lib/strcpy.S b/arch/sw_64/lib/strcpy.S new file mode 100644 index 000000000000..61b6141f88e2 --- /dev/null +++ b/arch/sw_64/lib/strcpy.S @@ -0,0 +1,131 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Optimized strcpy() for SW64 + + * Copyright (C) Mao Minkai + * Author: Mao Minkai + * + * Copy a null-terminated string from SRC to DST. 
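+ * The quadword loops rely on cmpgeb for terminator detection:
+ * cmpgeb $31, $4, $5 compares every byte of $4 against zero and sets
+ * the matching bit of $5, so a nonzero $5 means this quadword holds
+ * the final null and the tail loop finishes one byte at a time.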
+ *
+ * Input:
+ *	$16: DST, clobbered
+ *	$17: SRC, clobbered
+ *
+ * Output:
+ *	$0: DST
+ *
+ * Temporaries:
+ *	$1: unaligned parts of addr (0 means aligned addr)
+ *	$4: current data to copy (could have 1 byte or 8 bytes)
+ *	$5: parts of current data, compare result
+ *	$6: number of bytes left to copy
+ *
+ * Tag naming:
+ *	co: SRC and DST are co-aligned
+ *	mis: SRC and DST are not co-aligned
+ *	a: SRC or DST has aligned address
+ *	una: SRC or DST has unaligned address
+ *
+ */

+#include <asm/export.h>
+
+	.text
+	.align 4
+	.globl strcpy
+	.ent strcpy
+strcpy:
+	.frame $30, 0, $26
+	.prologue 0
+
+	bis $31, $16, $0	# set return value
+
+	xor $16, $17, $1
+	and $1, 7, $1
+	bne $1, $mis_aligned
+
+/* src and dst are co-aligned */
+	and $16, 7, $1
+	bne $1, $co_una_head
+
+/* do the copy in loop, for (co)-aligned src and dst with (a)ligned addr */
+$co_a_loop:
+	ldl $4, 0($17)
+	cmpgeb $31, $4, $5
+	bne $5, $tail_loop	# found a null byte
+	stl $4, 0($16)
+	addl $17, 8, $17
+	addl $16, 8, $16
+	br $31, $co_a_loop
+
+/* src and dst are co-aligned but have unaligned address */
+$co_una_head:
+	ldl_u $4, 0($17)
+	extll $4, $16, $4
+	cmpgeb $31, $4, $5
+	bne $5, $tail_loop	# found a null byte
+	ldi $6, 8($31)
+	subl $6, $1, $6
+	addl $17, $6, $17	# prepare addr of middle part
+
+/* copy the unaligned part in loop */
+$co_una_head_loop:
+	stb $4, 0($16)
+	addl $16, 1, $16
+	subl $6, 1, $6
+	beq $6, $co_a_loop
+	srl $4, 8, $4		# advance to the next byte of the quadword
+	br $31, $co_una_head_loop
+
+/* src and dst are not co-aligned */
+$mis_aligned:
+	and $16, 7, $1
+	beq $1, $mis_a_dst
+	ldi $6, 8($31)
+	subl $6, $1, $6
+
+/* copy the first few bytes to make dst aligned */
+$mis_una_head_loop:
+	# $6 = 8 - ($16 & 7) head bytes, computed above
+	ldbu $4, 0($17)
+	stb $4, 0($16)
+	beq $4, $out		# we have reached null, return
+	addl $17, 1, $17
+	addl $16, 1, $16
+	subl $6, 1, $6
+	beq $6, $mis_a_dst
+	br $31, $mis_una_head_loop
+
+/* dst has aligned addr */
+$mis_a_dst:
+	and $17, 7, $1
+
+$mis_a_dst_loop:
+	ldl_u $4, 0($17)
+	ldl_u $5, 7($17)
+	extll $4, $1, $4
+	exthl $5, $1, $5
+	bis $4, $5, $4
+	cmpgeb $31, $4, $5
+	bne $5, $tail_loop	# found a null byte
+	stl $4, 0($16)
+	addl $17, 8, $17
+	addl $16, 8, $16
+	br $31, $mis_a_dst_loop
+
+/* we have found null in the last few bytes, copy one byte at a time */
+$tail_loop:
+	ldbu $4, 0($17)
+	stb $4, 0($16)
+	beq $4, $out		# we have reached null, return
+	addl $17, 1, $17
+	addl $16, 1, $16
+	br $31, $tail_loop
+
+/* copy is done, return */
+$out:
+	ret
+
+	.end strcpy
+	EXPORT_SYMBOL(strcpy)
diff --git a/arch/sw_64/lib/strncpy.S b/arch/sw_64/lib/strncpy.S
new file mode 100644
index 000000000000..f50c70599bb4
--- /dev/null
+++ b/arch/sw_64/lib/strncpy.S
@@ -0,0 +1,156 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Optimized strncpy() for SW64
+ *
+ * Copyright (C) Mao Minkai
+ * Author: Mao Minkai
+ *
+ * Copy a string from SRC to DST. At most SIZE bytes are copied.
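+ * If a null is reached before SIZE bytes have been written, the rest
+ * of DST is zero-filled (see $null_padding below), matching the
+ * usual strncpy() contract.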
+ *
+ * Input:
+ *	$16: DST, clobbered
+ *	$17: SRC, clobbered
+ *	$18: SIZE, clobbered
+ *
+ * Output:
+ *	$0: DST
+ *
+ * Temporaries:
+ *	$1: unaligned parts of addr (0 means aligned addr)
+ *	$4: current data to copy (could have 1 byte or 8 bytes)
+ *	$5: parts of current data, compare result
+ *	$6: number of bytes left to copy in head
+ *
+ * Tag naming:
+ *	co: SRC and DST are co-aligned
+ *	mis: SRC and DST are not co-aligned
+ *	a: SRC or DST has aligned address
+ *	una: SRC or DST has unaligned address
+ *
+ */

+#include <asm/export.h>
+
+	.text
+	.align 4
+	.globl strncpy
+	.ent strncpy
+strncpy:
+	.frame $30, 0, $26
+	.prologue 0
+
+	bis $31, $16, $0	# set return value
+	beq $18, $out		# return if size is 0
+	cmplt $18, 8, $5	# size less than 8, do 1-byte copy
+	bne $5, $tail_loop
+
+	xor $16, $17, $1
+	and $1, 7, $1
+	bne $1, $mis_aligned
+
+/* src and dst are co-aligned */
+	and $16, 7, $1
+	bne $1, $co_una_head
+
+/* do the copy in loop, for (co)-aligned src and dst with (a)ligned addr */
+$co_a_loop:
+	ldl $4, 0($17)
+	cmpgeb $31, $4, $5
+	bne $5, $tail_loop	# found a null byte
+	subl $18, 8, $5
+	blt $5, $tail_loop	# we have fewer than 8 bytes to copy
+	stl $4, 0($16)
+	subl $18, 8, $18
+	beq $18, $out
+	addl $17, 8, $17
+	addl $16, 8, $16
+	br $31, $co_a_loop
+
+/* src and dst are co-aligned but have unaligned address */
+$co_una_head:
+	ldl_u $4, 0($17)
+	extll $4, $16, $4
+	cmpgeb $31, $4, $5
+	bne $5, $tail_loop	# found a null byte
+	ldi $6, 8($31)
+	subl $6, $1, $6
+	addl $17, $6, $17	# prepare addr of middle part
+	subl $18, $6, $18	# subtract the bytes about to be copied
+
+/* copy the unaligned part in loop */
+$co_una_head_loop:
+	stb $4, 0($16)
+	addl $16, 1, $16
+	subl $6, 1, $6
+	beq $6, $co_a_loop
+	srl $4, 8, $4		# advance to the next byte of the quadword
+	br $31, $co_una_head_loop
+
+/* src and dst are not co-aligned */
+$mis_aligned:
+	and $16, 7, $1
+	beq $1, $mis_a_dst
+
+$mis_una_head:
+	ldi $6, 8($31)
+	subl $6, $1, $6
+
+/* copy the first few bytes to make dst aligned */
+$mis_una_head_loop:
+	ldbu $4, 0($17)
+	stb $4, 0($16)
+	subl $18, 1, $18
+	beq $18, $out
+	beq $4, $null_padding	# we have reached null
+	addl $17, 1, $17
+	addl $16, 1, $16
+	subl $6, 1, $6
+	beq $6, $mis_a_dst
+	br $31, $mis_una_head_loop
+
+/* dst has aligned addr */
+$mis_a_dst:
+	and $17, 7, $1
+
+$mis_a_dst_loop:
+	ldl_u $4, 0($17)
+	ldl_u $5, 7($17)
+	extll $4, $1, $4
+	exthl $5, $1, $5
+	bis $4, $5, $4
+	cmpgeb $31, $4, $5
+	bne $5, $tail_loop	# found a null byte
+	subl $18, 8, $5
+	blt $5, $tail_loop	# we have fewer than 8 bytes to copy
+	stl $4, 0($16)
+	subl $18, 8, $18
+	beq $5, $out
+	addl $17, 8, $17
+	addl $16, 8, $16
+	br $31, $mis_a_dst_loop
+
+/* we have found null in the last few bytes, copy one byte at a time */
+$tail_loop:
+	ldbu $4, 0($17)
+	stb $4, 0($16)
+	subl $18, 1, $18
+	beq $18, $out
+	beq $4, $null_padding	# we have reached null
+	addl $17, 1, $17
+	addl $16, 1, $16
+	br $31, $tail_loop
+
+$null_padding:
+	addl $16, 1, $16
+	subl $18, 1, $18
+	stb $31, 0($16)
+	beq $18, $out
+	br $31, $null_padding
+
+/* copy is done, return */
+$out:
+	ret
+
+	.end strncpy
+	EXPORT_SYMBOL(strncpy)
diff --git a/arch/sw_64/lib/udelay.c b/arch/sw_64/lib/udelay.c
new file mode 100644
index 000000000000..595887caa7b3
--- /dev/null
+++ b/arch/sw_64/lib/udelay.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 1993, 2000 Linus Torvalds
+ *
+ * Delay routines, using a pre-computed "loops_per_jiffy" value.
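+ *
+ * udelay()/ndelay() below convert the requested time to CPU cycles
+ * as loops = usecs * get_cpu_freq() / 1000000 and then spin on the
+ * cycle counter; e.g. on a (hypothetical) 2 GHz core, udelay(10)
+ * busy-waits for 20000 cycles.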
+ */ + +#include <linux/module.h> +#include <linux/sched.h> /* for udelay's use of smp_processor_id */ +#include <asm/param.h> +#include <asm/smp.h> +#include <asm/hw_init.h> +#include <linux/delay.h> + +/* + * Use only for very small delays (< 1 msec). + * + * The active part of our cycle counter is only 32-bits wide, and + * we're treating the difference between two marks as signed. On + * a 1GHz box, that's about 2 seconds. + */ +void __delay(unsigned long loops) +{ + unsigned long tmp; + + __asm__ __volatile__( + " rtc %0\n" + " addl %1,%0,%1\n" + "1: rtc %0\n" + " subl %1,%0,%0\n" + " bgt %0,1b" + : "=&r" (tmp), "=r" (loops) : "1"(loops)); +} +EXPORT_SYMBOL(__delay); + +#ifdef CONFIG_SMP +#define LPJ cpu_data[smp_processor_id()].loops_per_jiffy +#else +#define LPJ loops_per_jiffy +#endif + +void udelay(unsigned long usecs) +{ + unsigned long loops = usecs * get_cpu_freq() / 1000000; + unsigned long tmp; + + __asm__ __volatile__( + " rtc %0\n" + " addl %1,%0,%1\n" + "1: rtc %0\n" + " subl %1,%0,%0\n" + " bgt %0,1b" + : "=&r" (tmp), "=r" (loops) : "1"(loops)); +} +EXPORT_SYMBOL(udelay); + +void ndelay(unsigned long nsecs) +{ + unsigned long loops = nsecs * get_cpu_freq() / 1000000000; + unsigned long tmp; + + __asm__ __volatile__( + " rtc %0\n" + " addl %1,%0,%1\n" + "1: rtc %0\n" + " subl %1,%0,%0\n" + " bgt %0,1b" + : "=&r" (tmp), "=r" (loops) : "1"(loops)); +} +EXPORT_SYMBOL(ndelay); diff --git a/arch/sw_64/math-emu/Makefile b/arch/sw_64/math-emu/Makefile new file mode 100644 index 000000000000..72e750d138e6 --- /dev/null +++ b/arch/sw_64/math-emu/Makefile @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the FPU instruction emulation. +# + +ccflags-y := -w + +obj-$(CONFIG_MATHEMU) += math-emu.o + +math-emu-objs := math.o qrnnd.o diff --git a/arch/sw_64/math-emu/math.c b/arch/sw_64/math-emu/math.c new file mode 100644 index 000000000000..3903b421b8f4 --- /dev/null +++ b/arch/sw_64/math-emu/math.c @@ -0,0 +1,2267 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Modify History + * + * who when what + * --- ---- ---- + * stone 2004-09-02 Add SIMD floating emulation code + * fire3 2008-12-27 Add SIMD floating emulation code for SW64 + */ + +#include <linux/module.h> +#include <linux/types.h> +#include <linux/kernel.h> +#include <linux/sched.h> +#include <asm/ptrace.h> + + +#include <linux/uaccess.h> + +#include "sfp-util.h" +#include <math-emu/soft-fp.h> +#include <math-emu/single.h> +#include <math-emu/double.h> + +#define math_debug 0 + +#define DEBUG_INFO(fmt, arg...) 
\ + do { \ + if (math_debug) \ + printk(KERN_DEBUG fmt, ## arg); \ + } while (0) + +/* + * This is for sw64 + */ + +#define IEEE_E_STATUS_MASK IEEE_STATUS_MASK +#define IEEE_E_STATUS_TO_EXCSUM_SHIFT 0 +#define SW64_FP_DENOMAL 1 /* A denormal data */ +#define SW64_FP_NORMAL 0 /* A denormal data */ +#define SW64_FP_NAN 2 + +#define SW64_FP_NAN_S(X, val) \ +do { \ + union _FP_UNION_S *_flo = \ + (union _FP_UNION_S *)(val); \ + \ + X##_f = _flo->bits.frac; \ + X##_e = _flo->bits.exp; \ + X##_s = _flo->bits.sign; \ + \ + switch (X##_e) { \ + case 255: \ + if (_FP_FRAC_ZEROP_1(X)) \ + X##_c = SW64_FP_NORMAL; \ + else \ + X##_c = SW64_FP_NAN; \ + break; \ + default: \ + X##_c = SW64_FP_NORMAL; \ + break; \ + } \ +} while (0) + + +#define SW64_FP_NAN_D(X, val) \ +do { \ + union _FP_UNION_D *_flo = \ + (union _FP_UNION_D *)(val); \ + \ + X##_f = _flo->bits.frac; \ + X##_e = _flo->bits.exp; \ + X##_s = _flo->bits.sign; \ + \ + switch (X##_e) { \ + case 2047: \ + if (_FP_FRAC_ZEROP_1(X)) \ + X##_c = SW64_FP_NORMAL; \ + else \ + X##_c = SW64_FP_NAN; \ + break; \ + default: \ + X##_c = SW64_FP_NORMAL; \ + break; \ + } \ +} while (0) + + + +#define SW64_FP_NORMAL_S(X, val) \ +do { \ + union _FP_UNION_S *_flo = \ + (union _FP_UNION_S *)(val); \ + \ + X##_f = _flo->bits.frac; \ + X##_e = _flo->bits.exp; \ + X##_s = _flo->bits.sign; \ + \ + switch (X##_e) { \ + case 0: \ + if (_FP_FRAC_ZEROP_1(X)) \ + X##_c = SW64_FP_NORMAL; \ + else \ + X##_c = SW64_FP_DENOMAL; \ + break; \ + default: \ + X##_c = SW64_FP_NORMAL; \ + break; \ + } \ +} while (0) + +#define SW64_FP_NORMAL_D(X, val) \ +do { \ + union _FP_UNION_D *_flo = \ + (union _FP_UNION_D *)(val); \ + \ + X##_f = _flo->bits.frac; \ + X##_e = _flo->bits.exp; \ + X##_s = _flo->bits.sign; \ + \ + switch (X##_e) { \ + case 0: \ + if (_FP_FRAC_ZEROP_1(X)) \ + X##_c = SW64_FP_NORMAL; \ + else \ + X##_c = SW64_FP_DENOMAL; \ + break; \ + default: \ + X##_c = SW64_FP_NORMAL; \ + break; \ + } \ +} while (0) + +/* Operation Code for SW64 */ +#define OP_SIMD_1 0x1A +#define OP_SIMD_2 0x1B +#define OP_SIMD_MUL_ADD 0x1B +#define OP_SIMD_NORMAL 0x1A +#define OP_MUL_ADD 0x19 + +#define FNC_FMAS 0x0 +#define FNC_FMAD 0x1 +#define FNC_FMSS 0x2 +#define FNC_FMSD 0x3 +#define FNC_FNMAS 0x4 +#define FNC_FNMAD 0x5 +#define FNC_FNMSS 0x6 +#define FNC_FNMSD 0x7 + +#define FNC_VADDS 0x80 +#define FNC_VADDD 0x81 +#define FNC_VSUBS 0x82 +#define FNC_VSUBD 0x83 +#define FNC_VMULS 0x84 +#define FNC_VMULD 0x85 +#define FNC_VDIVS 0x86 +#define FNC_VDIVD 0x87 +#define FNC_VSQRTS 0x88 +#define FNC_VSQRTD 0x89 + +#define FNC_VFCMPEQ 0x8c +#define FNC_VFCMPLE 0x8d +#define FNC_VFCMPLT 0x8e +#define FNC_VFCMPUN 0x8f + +#define FNC_VCPYS 0x90 +#define FNC_VCPYSE 0x91 +#define FNC_VCPYSN 0x92 + +#define FNC_VMAS 0x0 +#define FNC_VMAD 0x1 +#define FNC_VMSS 0x2 +#define FNC_VMSD 0x3 +#define FNC_VNMAS 0x4 +#define FNC_VNMAD 0x5 +#define FNC_VNMSS 0x6 +#define FNC_VNMSD 0x7 + +long simd_fp_emul_s(unsigned long pc); +long simd_fp_emul_d(unsigned long pc); +long mul_add_fp_emul(unsigned long pc); +long simd_cmp_emul_d(unsigned long pc); + +long simd_mul_add_fp_emul_d(unsigned long pc); +long simd_mul_add_fp_emul_s(unsigned long pc); + +void read_fp_reg_s(unsigned long reg, unsigned long *p0, + unsigned long *p1, unsigned long *p2, unsigned long *p3); +void read_fp_reg_d(unsigned long reg, unsigned long *val_p0, + unsigned long *p1, unsigned long *p2, unsigned long *p3); +void write_fp_reg_s(unsigned long reg, unsigned long val_p0, + unsigned long p1, unsigned long p2, unsigned long p3); +void 
write_fp_reg_d(unsigned long reg, unsigned long val_p0, + unsigned long p1, unsigned long p2, unsigned long p3); +#define LOW_64_WORKING 1 +#define HIGH_64_WORKING 2 + +/* + * End for sw64 + */ + +#define OPC_HMC 0x00 +#define OPC_INTA 0x10 +#define OPC_INTL 0x11 +#define OPC_INTS 0x12 +#define OPC_INTM 0x13 +#define OPC_FLTC 0x14 +#define OPC_FLTV 0x15 +#define OPC_FLTI 0x16 +#define OPC_FLTL 0x17 +#define OPC_MISC 0x18 +#define OPC_JSR 0x1a + +#define FOP_SRC_S 0 +#define FOP_SRC_T 2 +#define FOP_SRC_Q 3 + +#define FOP_FNC_ADDx 0 +#define FOP_FNC_CVTQL 0 +#define FOP_FNC_SUBx 1 +#define FOP_FNC_MULx 2 +#define FOP_FNC_DIVx 3 +#define FOP_FNC_CMPxUN 4 +#define FOP_FNC_CMPxEQ 5 +#define FOP_FNC_CMPxLT 6 +#define FOP_FNC_CMPxLE 7 +#define FOP_FNC_SQRTx 11 +#define FOP_FNC_CVTxS 12 +#define FOP_FNC_CVTxT 14 +#define FOP_FNC_CVTxQ 15 + +/* this is for sw64 added by fire3*/ +#define FOP_FNC_ADDS 0 +#define FOP_FNC_ADDD 1 +#define FOP_FNC_SUBS 2 +#define FOP_FNC_SUBD 3 +#define FOP_FNC_MULS 4 +#define FOP_FNC_MULD 5 +#define FOP_FNC_DIVS 6 +#define FOP_FNC_DIVD 7 +#define FOP_FNC_SQRTS 8 +#define FOP_FNC_SQRTD 9 + +#define FOP_FNC_CMPEQ 0x10 +#define FOP_FNC_CMPLE 0x11 +#define FOP_FNC_CMPLT 0x12 +#define FOP_FNC_CMPUN 0x13 + +#define FOP_FNC_CVTSD 0x20 +#define FOP_FNC_CVTDS 0x21 +#define FOP_FNC_CVTLS 0x2D +#define FOP_FNC_CVTLD 0x2F +#define FOP_FNC_CVTDL 0x27 +#define FOP_FNC_CVTDL_G 0x22 +#define FOP_FNC_CVTDL_P 0x23 +#define FOP_FNC_CVTDL_Z 0x24 +#define FOP_FNC_CVTDL_N 0x25 + +#define FOP_FNC_CVTWL 0x28 +#define FOP_FNC_CVTLW 0x29 + +/* fire3 added end */ + + +#define MISC_TRAPB 0x0000 +#define MISC_EXCB 0x0400 + +extern unsigned long sw64_read_fp_reg(unsigned long reg); +extern void sw64_write_fp_reg(unsigned long reg, unsigned long val); +extern unsigned long sw64_read_fp_reg_s(unsigned long reg); +extern void sw64_write_fp_reg_s(unsigned long reg, unsigned long val); + + +#ifdef MODULE + +MODULE_DESCRIPTION("FP Software completion module"); + +extern long (*sw64_fp_emul_imprecise)(struct pt_regs *, unsigned long); +extern long (*sw64_fp_emul)(unsigned long pc); + +static long (*save_emul_imprecise)(struct pt_regs *, unsigned long); +static long (*save_emul)(unsigned long pc); + +long do_sw_fp_emul_imprecise(struct pt_regs *, unsigned long); +long do_sw_fp_emul(unsigned long); + +int init_module(void) +{ + save_emul_imprecise = sw64_fp_emul_imprecise; + save_emul = sw64_fp_emul; + sw64_fp_emul_imprecise = do_sw_fp_emul_imprecise; + sw64_fp_emul = do_sw_fp_emul; + return 0; +} + +void cleanup_module(void) +{ + sw64_fp_emul_imprecise = save_emul_imprecise; + sw64_fp_emul = save_emul; +} + +#undef sw64_fp_emul_imprecise +#define sw64_fp_emul_imprecise do_sw_fp_emul_imprecise +#undef sw64_fp_emul +#define sw64_fp_emul do_sw_fp_emul + +#endif /* MODULE */ + + +/* + * Emulate the floating point instruction at address PC. Returns -1 if the + * instruction to be emulated is illegal (such as with the opDEC trap), else + * the SI_CODE for a SIGFPE signal, else 0 if everything's ok. + * + * Notice that the kernel does not and cannot use FP regs. This is good + * because it means that instead of saving/restoring all fp regs, we simply + * stick the result of the operation into the appropriate register. 
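+ *
+ * The instruction fields are decoded by hand; for the plain
+ * (non-SIMD) format handled here:
+ *
+ *	opcode = (insn >> 26) & 0x3f;
+ *	fa = (insn >> 21) & 0x1f;
+ *	fb = (insn >> 16) & 0x1f;
+ *	func = (insn >> 5) & 0xff;
+ *	fc = insn & 0x1f;	/* destination */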
+ */ +long sw64_fp_emul(unsigned long pc) +{ + FP_DECL_EX; + FP_DECL_S(SA); FP_DECL_S(SB); FP_DECL_S(SR); + FP_DECL_D(DA); FP_DECL_D(DB); FP_DECL_D(DR); + + unsigned long fa, fb, fc, func, mode, mode_bk, src; + unsigned long res, va, vb, vc, swcr, fpcr; + __u32 insn; + long si_code; + unsigned long opcode; + + get_user(insn, (__u32 *)pc); + opcode = (insn >> 26) & 0x3f; + fc = (insn >> 0) & 0x1f; /* destination register */ + fb = (insn >> 16) & 0x1f; + fa = (insn >> 21) & 0x1f; + func = (insn >> 5) & 0xff; + fpcr = rdfpcr(); + mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3; + DEBUG_INFO("======= Entering Floating mathe emulation =====\n"); + DEBUG_INFO("Floating math emulation insn = %#lx, opcode=%d, func=%d\n", insn, opcode, func); + DEBUG_INFO("SW64 hardware fpcr = %#lx\n", fpcr); + swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr); + DEBUG_INFO("SW64 software swcr = %#lx\n", swcr); + DEBUG_INFO("fa:%#lx,fb:%#lx,fc:%#lx,func:%#lx,mode:%#lx\n", fa, fb, fc, func, mode); + + if (opcode == OP_SIMD_NORMAL) { /* float simd math */ + if (func == FNC_VADDS || func == FNC_VSUBS || func == FNC_VSQRTS + || func == FNC_VMULS || func == FNC_VDIVS) + si_code = simd_fp_emul_s(pc); + if (func == FNC_VADDD || func == FNC_VSUBD || func == FNC_VSQRTD + || func == FNC_VMULD || func == FNC_VDIVD) + si_code = simd_fp_emul_d(pc); + if (func == FNC_VFCMPUN || func == FNC_VFCMPLT || func == FNC_VFCMPLE + || func == FNC_VFCMPEQ) + si_code = simd_cmp_emul_d(pc); + return si_code; + } + if (opcode == OP_SIMD_MUL_ADD) {/* simd mul and add */ + func = (insn >> 10) & 0x3f; + if (func == FNC_VMAS || func == FNC_VMSS || func == FNC_VNMAS + || func == FNC_VNMSS) { + si_code = simd_mul_add_fp_emul_s(pc); + return si_code; + } + + if (func == FNC_VMAD || func == FNC_VMSD || func == FNC_VNMAD + || func == FNC_VNMSD) { + si_code = simd_mul_add_fp_emul_d(pc); + return si_code; + } + func = (insn >> 5) & 0xff; + } + + if (opcode == OP_MUL_ADD) { + si_code = mul_add_fp_emul(pc); + return si_code; + } + switch (func) { + case FOP_FNC_SUBS: + va = sw64_read_fp_reg_s(fa); + vb = sw64_read_fp_reg_s(fb); + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_SUB_S(SR, SA, SB); + goto pack_s; + + case FOP_FNC_SUBD: + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_SUB_D(DR, DA, DB); + goto pack_d; + + case FOP_FNC_ADDS: + va = sw64_read_fp_reg_s(fa); + vb = sw64_read_fp_reg_s(fb); + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_ADD_S(SR, SA, SB); + goto pack_s; + + case FOP_FNC_ADDD: + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_ADD_D(DR, DA, DB); + goto pack_d; + + case FOP_FNC_MULS: + va = sw64_read_fp_reg_s(fa); + vb = sw64_read_fp_reg_s(fb); + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_MUL_S(SR, SA, SB); + goto pack_s; + + case FOP_FNC_MULD: + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_MUL_D(DR, DA, DB); + goto pack_d; + + case FOP_FNC_DIVS: + DEBUG_INFO("FOP_FNC_DIVS\n"); + va = sw64_read_fp_reg_s(fa); + vb = sw64_read_fp_reg_s(fb); + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_DIV_S(SR, SA, SB); + goto pack_s; + + case FOP_FNC_DIVD: + DEBUG_INFO("FOP_FNC_DIVD\n"); + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_DIV_D(DR, DA, DB); + goto pack_d; + + case FOP_FNC_SQRTS: + va = sw64_read_fp_reg_s(fa); + vb = sw64_read_fp_reg_s(fb); + 
FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_SQRT_S(SR, SB); + goto pack_s; + case FOP_FNC_SQRTD: + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_SQRT_D(DR, DB); + goto pack_d; + } + + + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + if ((func & ~0xf) == FOP_FNC_CMPEQ) { + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + + FP_UNPACK_RAW_DP(DA, &va); + FP_UNPACK_RAW_DP(DB, &vb); + if (!DA_e && !_FP_FRAC_ZEROP_1(DA)) { + FP_SET_EXCEPTION(FP_EX_DENORM); + if (FP_DENORM_ZERO) + _FP_FRAC_SET_1(DA, _FP_ZEROFRAC_1); + } + if (!DB_e && !_FP_FRAC_ZEROP_1(DB)) { + FP_SET_EXCEPTION(FP_EX_DENORM); + if (FP_DENORM_ZERO) + _FP_FRAC_SET_1(DB, _FP_ZEROFRAC_1); + } + FP_CMP_D(res, DA, DB, 3); + vc = 0x4000000000000000; + /* CMPTEQ, CMPTUN don't trap on QNaN, while CMPTLT and CMPTLE do */ + if (res == 3 && (((func == FOP_FNC_CMPLT) || (func == FOP_FNC_CMPLE)) + || FP_ISSIGNAN_D(DA) || FP_ISSIGNAN_D(DB))) { + DEBUG_INFO("CMPLT CMPLE:func:%d, trap on QNaN.", func); + FP_SET_EXCEPTION(FP_EX_INVALID); + } + switch (func) { + case FOP_FNC_CMPUN: + if (res != 3) + vc = 0; + break; + case FOP_FNC_CMPEQ: + if (res) + vc = 0; + break; + case FOP_FNC_CMPLT: + if (res != -1) + vc = 0; + break; + case FOP_FNC_CMPLE: + if ((long)res > 0) + vc = 0; + break; + } + goto done_d; + } + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + + if (func == FOP_FNC_CVTSD) { + vb = sw64_read_fp_reg_s(fb); + FP_UNPACK_SP(SB, &vb); + DR_c = DB_c; + DR_s = DB_s; + DR_e = DB_e + (1024 - 128); + DR_f = SB_f << (52 - 23); + goto pack_d; + } + + if (func == FOP_FNC_CVTDS) { + FP_CONV(S, D, 1, 1, SR, DB); + goto pack_s; + } + + if (func == FOP_FNC_CVTDL || func == FOP_FNC_CVTDL_G || func == FOP_FNC_CVTDL_P + || func == FOP_FNC_CVTDL_Z || func == FOP_FNC_CVTDL_N) { + mode_bk = mode; + if (func == FOP_FNC_CVTDL_Z) + mode = 0x0UL; + else if (func == FOP_FNC_CVTDL_N) + mode = 0x1UL; + else if (func == FOP_FNC_CVTDL_G) + mode = 0x2UL; + else if (func == FOP_FNC_CVTDL_P) + mode = 0x3UL; + + if (DB_c == FP_CLS_NAN && (_FP_FRAC_HIGH_RAW_D(DB) & _FP_QNANBIT_D)) { + /* AAHB Table B-2 says QNaN should not trigger INV */ + vc = 0; + } else + FP_TO_INT_ROUND_D(vc, DB, 64, 2); + mode = mode_bk; + goto done_d; + } + + vb = sw64_read_fp_reg(fb); + + switch (func) { + case FOP_FNC_CVTLW: + /* + * Notice: We can get here only due to an integer + * overflow. Such overflows are reported as invalid + * ops. We return the result the hw would have + * computed. 
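+ * (The expression below spreads the longword into the in-register
+ * layout: bits 31:30 land in bits 63:62 and bits 29:0 in bits
+ * 58:29 of the destination.)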
+ */ + vc = ((vb & 0xc0000000) << 32 | /* sign and msb */ + (vb & 0x3fffffff) << 29); /* rest of the int */ + FP_SET_EXCEPTION(FP_EX_INVALID); + goto done_d; + + case FOP_FNC_CVTLS: + FP_FROM_INT_S(SR, ((long)vb), 64, long); + goto pack_s; + + case FOP_FNC_CVTLD: + FP_FROM_INT_D(DR, ((long)vb), 64, long); + goto pack_d; + } + goto bad_insn; + + +pack_s: + FP_PACK_SP(&vc, SR); + + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) + vc = 0; + DEBUG_INFO("SW64 Emulation S-floating _fex=%#lx, va=%#lx, vb=%#lx, vc=%#lx\n", _fex, va, vb, vc); + DEBUG_INFO("SW64 Emulation S-floating mode=%#lx,func=%#lx, swcr=%#lx\n", mode, func, swcr); + sw64_write_fp_reg_s(fc, vc); + goto done; + +pack_d: + FP_PACK_DP(&vc, DR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) + vc = 0; + DEBUG_INFO("SW64 Emulation D-floating _fex=%#lx, va=%#lx, vb=%#lx, vc=%#lx\n", _fex, va, vb, vc); + DEBUG_INFO("SW64 Emulation D-floating mode=%#lx,func=%#lx, swcr=%#lx\n", mode, func, swcr); +done_d: + sw64_write_fp_reg(fc, vc); + goto done; + + /* + * Take the appropriate action for each possible + * floating-point result: + * + * - Set the appropriate bits in the FPCR + * - If the specified exception is enabled in the FPCR, + * return. The caller (entArith) will dispatch + * the appropriate signal to the translated program. + * + * In addition, properly track the exception state in software + * as described in the SW64 Architecture Handbook section 4.7.7.3. + */ +done: + if (_fex) { + /* Record exceptions in software control word. */ + swcr |= (_fex << IEEE_STATUS_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (_fex << IEEE_STATUS_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr |= ieee_swcr_to_fpcr(swcr); + DEBUG_INFO("SW64 before write fpcr = %#lx\n", fpcr); + wrfpcr(fpcr); + + /* Do we generate a signal? */ + _fex = _fex & swcr & IEEE_TRAP_ENABLE_MASK; + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + + return si_code; + } + + /* + * We used to write the destination register here, but DEC FORTRAN + * requires that the result *always* be written... so we do the write + * immediately after the operations above. + */ + + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; +} + +long sw64_fp_emul_imprecise(struct pt_regs *regs, unsigned long write_mask) +{ + unsigned long trigger_pc = regs->pc - 4; + unsigned long insn, opcode, rc, si_code = 0; + + + /* + * Turn off the bits corresponding to registers that are the + * target of instructions that set bits in the exception + * summary register. We have some slack doing this because a + * register that is the target of a trapping instruction can + * be written at most once in the trap shadow. + * + * Branches, jumps, TRAPBs, EXCBs and calls to HMcode all + * bound the trap shadow, so we need not look any further than + * up to the first occurrence of such an instruction. + */ + while (write_mask) { + get_user(insn, (__u32 *)(trigger_pc)); + opcode = insn >> 26; + rc = insn & 0x1f; + + switch (opcode) { + case OPC_HMC: + case OPC_JSR: + case 0x30 ... 
0x3f:	/* branches */
+			goto egress;
+
+		case OPC_MISC:
+			switch (insn & 0xffff) {
+			case MISC_TRAPB:
+			case MISC_EXCB:
+				goto egress;
+
+			default:
+				break;
+			}
+			break;
+
+		case OPC_INTA:
+		case OPC_INTL:
+		case OPC_INTS:
+		case OPC_INTM:
+			write_mask &= ~(1UL << rc);
+			break;
+
+		case OPC_FLTC:
+		case OPC_FLTV:
+		case OPC_FLTI:
+		case OPC_FLTL:
+			write_mask &= ~(1UL << (rc + 32));
+			break;
+		}
+		if (!write_mask) {
+			/* Re-execute insns in the trap-shadow. */
+			regs->pc = trigger_pc + 4;
+			si_code = sw64_fp_emul(trigger_pc);
+			goto egress;
+		}
+		trigger_pc -= 4;
+	}
+
+egress:
+	return si_code;
+}
+
+#define WORKING_PART_0	0
+#define WORKING_PART_1	1
+#define WORKING_PART_2	2
+#define WORKING_PART_3	3
+
+
+/*
+ * SIMD floating-point emulation for SW64.
+ */
+
+long simd_cmp_emul_d(unsigned long pc)
+{
+	FP_DECL_EX;
+	FP_DECL_D(DA); FP_DECL_D(DB); FP_DECL_D(DR); FP_DECL_D(DC);
+	unsigned long fa, fb, fc, func, mode, src;
+	unsigned long res, va, vb, vc, swcr, fpcr;
+	__u32 insn;
+	long si_code;
+
+	unsigned long va_p0, va_p1, va_p2, va_p3;
+	unsigned long vb_p0, vb_p1, vb_p2, vb_p3;
+	unsigned long vc_p0, vc_p1, vc_p2, vc_p3;
+	unsigned long fex_p0, fex_p1, fex_p2, fex_p3;
+
+	int working_part;
+
+	get_user(insn, (__u32 *)pc);
+	fc = (insn >> 0) & 0x1f;	/* destination register */
+	fb = (insn >> 16) & 0x1f;
+	fa = (insn >> 21) & 0x1f;
+	func = (insn >> 5) & 0xff;
+	fpcr = rdfpcr();
+	mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3;
+
+	DEBUG_INFO("======== Entering SIMD floating-CMP math emulation =======\n");
+	DEBUG_INFO("hardware fpcr = %#lx\n", fpcr);
+	swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr);
+	DEBUG_INFO("software swcr = %#lx\n", swcr);
+	DEBUG_INFO("fa:%#lx,fb:%#lx,fc:%#lx,func:%#lx,mode:%#lx\n", fa, fb, fc, func, mode);
+	read_fp_reg_d(fa, &va_p0, &va_p1, &va_p2, &va_p3);
+	read_fp_reg_d(fb, &vb_p0, &vb_p1, &vb_p2, &vb_p3);
+	read_fp_reg_d(fc, &vc_p0, &vc_p1, &vc_p2, &vc_p3);
+	DEBUG_INFO("va_p0:%#lx, va_p1:%#lx, va_p2:%#lx, va_p3:%#lx\n", va_p0, va_p1, va_p2, va_p3);
+	DEBUG_INFO("vb_p0:%#lx, vb_p1:%#lx, vb_p2:%#lx, vb_p3:%#lx\n", vb_p0, vb_p1, vb_p2, vb_p3);
+	DEBUG_INFO("vc_p0:%#lx, vc_p1:%#lx, vc_p2:%#lx, vc_p3:%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3);
+	working_part = WORKING_PART_0;
+simd_working:
+	_fex = 0;
+	switch (working_part) {
+	case WORKING_PART_0:
+		DEBUG_INFO("WORKING_PART_0\n");
+		va = va_p0;
+		vb = vb_p0;
+		vc = vc_p0;
+		break;
+	case WORKING_PART_1:
+		DEBUG_INFO("WORKING_PART_1\n");
+		va = va_p1;
+		vb = vb_p1;
+		vc = vc_p1;
+		break;
+	case WORKING_PART_2:
+		DEBUG_INFO("WORKING_PART_2\n");
+		va = va_p2;
+		vb = vb_p2;
+		vc = vc_p2;
+		break;
+	case WORKING_PART_3:
+		DEBUG_INFO("WORKING_PART_3\n");
+		va = va_p3;
+		vb = vb_p3;
+		vc = vc_p3;
+		break;
+	}
+	DEBUG_INFO("Before unpack va:%#lx, vb:%#lx\n", va, vb);
+	FP_UNPACK_RAW_DP(DA, &va);
+	FP_UNPACK_RAW_DP(DB, &vb);
+	DEBUG_INFO("DA_e:%d, _FP_FRAC_ZEROP_1(DA):%d\n", DA_e, _FP_FRAC_ZEROP_1(DA));
+	DEBUG_INFO("DB_e:%d, _FP_FRAC_ZEROP_1(DB):%d\n", DB_e, _FP_FRAC_ZEROP_1(DB));
+	DEBUG_INFO("DA iszero:%d, DB iszero:%d\n", ((!DA_e && _FP_FRAC_ZEROP_1(DA)) ?
1 : 0), + ((!DB_e && _FP_FRAC_ZEROP_1(DB)))); + if (!DA_e && !_FP_FRAC_ZEROP_1(DA)) { + FP_SET_EXCEPTION(FP_EX_DENORM); + if (FP_DENORM_ZERO) + _FP_FRAC_SET_1(DA, _FP_ZEROFRAC_1); + } + if (!DB_e && !_FP_FRAC_ZEROP_1(DB)) { + FP_SET_EXCEPTION(FP_EX_DENORM); + if (FP_DENORM_ZERO) + _FP_FRAC_SET_1(DB, _FP_ZEROFRAC_1); + } + FP_CMP_D(res, DA, DB, 3); + vc = 0x4000000000000000; + /* CMPTEQ, CMPTUN don't trap on QNaN, while CMPTLT and CMPTLE do */ + if (res == 3 && (((func == FOP_FNC_CMPLT) || (func == FOP_FNC_CMPLE)) + || FP_ISSIGNAN_D(DA) || FP_ISSIGNAN_D(DB))) { + DEBUG_INFO("CMPLT CMPLE:func:%d, trap on QNaN.", func); + FP_SET_EXCEPTION(FP_EX_INVALID); + } + DEBUG_INFO("res:%d\n", res); + switch (func) { + case FNC_VFCMPUN: + if (res != 3) + vc = 0; + break; + case FNC_VFCMPEQ: + if (res) + vc = 0; + break; + case FNC_VFCMPLT: + if (res != -1) + vc = 0; + break; + case FNC_VFCMPLE: + if ((long)res > 0) + vc = 0; + break; + } +next_working_s: + switch (working_part) { + case WORKING_PART_0: + working_part = WORKING_PART_1; + vc_p0 = vc; + fex_p0 = _fex; + goto simd_working; + case WORKING_PART_1: + working_part = WORKING_PART_2; + vc_p1 = vc; + fex_p1 = _fex; + goto simd_working; + case WORKING_PART_2: + working_part = WORKING_PART_3; + vc_p2 = vc; + fex_p2 = _fex; + goto simd_working; + case WORKING_PART_3: + vc_p3 = vc; + fex_p3 = _fex; + goto done; + } +done: + if (fex_p0 || fex_p1 || fex_p2 || fex_p3) { + unsigned long fpcr_p0, fpcr_p1, fpcr_p2, fpcr_p3; + unsigned long swcr_p0, swcr_p1, swcr_p2, swcr_p3; + + fpcr_p0 = fpcr_p1 = fpcr_p2 = fpcr_p3 = 0; + swcr_p0 = swcr_p1 = swcr_p2 = swcr_p3 = swcr; + /* manage fpcr_p0 */ + if (fex_p0) { + swcr_p0 |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p0 = fpcr; + fpcr_p0 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p0 |= ieee_swcr_to_fpcr(swcr_p0); + } + + if (fex_p1) { + swcr_p1 |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p1 = fpcr; + fpcr_p1 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p1 |= ieee_swcr_to_fpcr(swcr_p1); + } + + if (fex_p2) { + swcr_p2 |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p2 = fpcr; + fpcr_p2 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p2 |= ieee_swcr_to_fpcr(swcr_p2); + } + + if (fex_p3) { + swcr_p3 |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p3 = fpcr; + fpcr_p3 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p3 |= ieee_swcr_to_fpcr(swcr_p3); + } + + fpcr = fpcr_p0 | fpcr_p1 | fpcr_p2 | fpcr_p3; + DEBUG_INFO("fex_p0 = %#lx\n", fex_p0); + DEBUG_INFO("fex_p1 = %#lx\n", fex_p1); + DEBUG_INFO("fex_p2 = %#lx\n", fex_p2); + DEBUG_INFO("fex_p3 = %#lx\n", fex_p3); + DEBUG_INFO("SIMD emulation almost finished.before write fpcr = %#lx\n", fpcr); + wrfpcr(fpcr); + DEBUG_INFO("Before write fp: vc_p0=%#lx, vc_p1=%#lx, vc_p2=%#lx, vc_p3=%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + write_fp_reg_d(fc, vc_p0, vc_p1, vc_p2, vc_p3); + + /* Do we generate a signal? 
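+	 *
+	 * Only exceptions whose trap-enable bit is set in swcr may raise
+	 * a signal: each lane's _fex word is masked with
+	 * IEEE_TRAP_ENABLE_MASK and the four results are OR-ed together.
+	 * In the chain of tests below the last match wins, so FPE_FLTINV
+	 * takes precedence when several enabled exceptions are pending.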
*/ + _fex = (fex_p0 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p1 & swcr & IEEE_TRAP_ENABLE_MASK) + | (fex_p2 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p3 & swcr & IEEE_TRAP_ENABLE_MASK); + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return si_code; + + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; +} + + +long simd_fp_emul_d(unsigned long pc) +{ + FP_DECL_EX; + FP_DECL_D(DA); FP_DECL_D(DB); FP_DECL_D(DR); FP_DECL_D(DC); + unsigned long fa, fb, fc, func, mode, src; + unsigned long res, va, vb, vc, swcr, fpcr; + __u32 insn; + long si_code; + + unsigned long va_p0, va_p1, va_p2, va_p3; + unsigned long vb_p0, vb_p1, vb_p2, vb_p3; + unsigned long vc_p0, vc_p1, vc_p2, vc_p3; + unsigned long fex_p0, fex_p1, fex_p2, fex_p3; + + int working_part; + + get_user(insn, (__u32 *)pc); + fc = (insn >> 0) & 0x1f; /* destination register */ + fb = (insn >> 16) & 0x1f; + fa = (insn >> 21) & 0x1f; + func = (insn >> 5) & 0xff; + fpcr = rdfpcr(); + mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3; + + DEBUG_INFO("======== Entering SIMD D-floating math emulation =======\n"); + DEBUG_INFO("hardware fpcr = %#lx\n", fpcr); + swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr); + DEBUG_INFO("software swcr = %#lx\n", swcr); + DEBUG_INFO("fa:%#lx,fb:%#lx,fc:%#lx,func:%#lx,mode:%#lx\n", fa, fb, fc, func, mode); + read_fp_reg_d(fa, &va_p0, &va_p1, &va_p2, &va_p3); + read_fp_reg_d(fb, &vb_p0, &vb_p1, &vb_p2, &vb_p3); + read_fp_reg_d(fc, &vc_p0, &vc_p1, &vc_p2, &vc_p3); + DEBUG_INFO("va_p0:%#lx, va_p1:%#lx, va_p2:%#lx, va_p3:%#lx\n", va_p0, va_p1, va_p2, va_p3); + DEBUG_INFO("vb_p0:%#lx, vb_p1:%#lx, vb_p2:%#lx, vb_p3:%#lx\n", vb_p0, vb_p1, vb_p2, vb_p3); + DEBUG_INFO("vc_p0:%#lx, vc_p1:%#lx, vc_p2:%#lx, vc_p3:%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + working_part = WORKING_PART_0; +simd_working: + _fex = 0; + switch (working_part) { + case WORKING_PART_0: + DEBUG_INFO("WORKING_PART_0\n"); + va = va_p0; + vb = vb_p0; + vc = vc_p0; + if ((fpcr & FPCR_STATUS_MASK0) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("LOW: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + if (((DA_c == SW64_FP_NAN) || (DB_c == SW64_FP_NAN))) + goto next_working_s; + } + break; + case WORKING_PART_1: + DEBUG_INFO("WORKING_PART_1\n"); + va = va_p1; + vb = vb_p1; + vc = vc_p1; + if ((fpcr & FPCR_STATUS_MASK1) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + if (((DA_c == SW64_FP_NAN) || (DB_c == SW64_FP_NAN))) + goto next_working_s; + } + + break; + case WORKING_PART_2: + DEBUG_INFO("WORKING_PART_2\n"); + va = va_p2; + vb = vb_p2; + vc = vc_p2; + if ((fpcr & FPCR_STATUS_MASK2) == 0) { + SW64_FP_NORMAL_D(DA, &va); + 
SW64_FP_NORMAL_D(DB, &vb); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + if (((DA_c == SW64_FP_NAN) || (DB_c == SW64_FP_NAN))) + goto next_working_s; + } + break; + case WORKING_PART_3: + DEBUG_INFO("WORKING_PART_3\n"); + va = va_p3; + vb = vb_p3; + vc = vc_p3; + if ((fpcr & FPCR_STATUS_MASK3) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + if (((DA_c == SW64_FP_NAN) || (DB_c == SW64_FP_NAN))) + goto next_working_s; + } + break; + } + + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + + switch (func) { + case FNC_VSUBD: + DEBUG_INFO("FNC_VSUBD\n"); + FP_SUB_D(DR, DA, DB); + goto pack_d; + case FNC_VMULD: + DEBUG_INFO("FNC_VMULD\n"); + FP_MUL_D(DR, DA, DB); + goto pack_d; + case FNC_VADDD: + DEBUG_INFO("FNC_VADDD\n"); + FP_ADD_D(DR, DA, DB); + goto pack_d; + case FNC_VDIVD: + DEBUG_INFO("FNC_VDIVD\n"); + FP_DIV_D(DR, DA, DB); + goto pack_d; + case FNC_VSQRTD: + DEBUG_INFO("FNC_VSQRTD\n"); + FP_SQRT_D(DR, DB); + goto pack_d; + } +pack_d: + FP_PACK_DP(&vc, DR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) { + DEBUG_INFO("pack_d, vc=0 !!!!\n"); + vc = 0; + } + + DEBUG_INFO("SW64 SIMD Emulation D-floating _fex=%#lx, va=%#lx, vb=%#lx, vc=%#lx\n", _fex, va, vb, vc); + DEBUG_INFO("SW64 SIMD Emulation D-floating mode=%#lx,func=%#lx, swcr=%#lx\n", mode, func, swcr); +next_working_s: + switch (working_part) { + case WORKING_PART_0: + working_part = WORKING_PART_1; + vc_p0 = vc; + fex_p0 = _fex; + goto simd_working; + case WORKING_PART_1: + working_part = WORKING_PART_2; + vc_p1 = vc; + fex_p1 = _fex; + goto simd_working; + case WORKING_PART_2: + working_part = WORKING_PART_3; + vc_p2 = vc; + fex_p2 = _fex; + goto simd_working; + case WORKING_PART_3: + vc_p3 = vc; + fex_p3 = _fex; + goto done; + } +done: + if (fex_p0 || fex_p1 || fex_p2 || fex_p3) { + unsigned long fpcr_p0, fpcr_p1, fpcr_p2, fpcr_p3; + unsigned long swcr_p0, swcr_p1, swcr_p2, swcr_p3; + + fpcr_p0 = fpcr_p1 = fpcr_p2 = fpcr_p3 = 0; + swcr_p0 = swcr_p1 = swcr_p2 = swcr_p3 = swcr; + /* manage fpcr_p0 */ + if (fex_p0) { + swcr_p0 |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p0 = fpcr; + fpcr_p0 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p0 |= ieee_swcr_to_fpcr(swcr_p0); + } + + if (fex_p1) { + swcr_p1 |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p1 = fpcr; + fpcr_p1 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p1 |= ieee_swcr_to_fpcr(swcr_p1); + } + + if (fex_p2) { + swcr_p2 |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p2 = fpcr; + fpcr_p2 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p2 |= ieee_swcr_to_fpcr(swcr_p2); + } + + if (fex_p3) { + swcr_p3 |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + + /* Update hardware control register. 
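+		 *
+		 * Lane 3's exception bits are folded into its own copy of
+		 * the software control word via IEEE_STATUS3_TO_EXCSUM_SHIFT
+		 * and converted to a private fpcr_p3 image; the four
+		 * per-lane images are OR-ed together before the single
+		 * wrfpcr() below.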
*/ + fpcr_p3 = fpcr; + fpcr_p3 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p3 |= ieee_swcr_to_fpcr(swcr_p3); + } + + fpcr = fpcr_p0 | fpcr_p1 | fpcr_p2 | fpcr_p3; + DEBUG_INFO("fex_p0 = %#lx\n", fex_p0); + DEBUG_INFO("fex_p1 = %#lx\n", fex_p1); + DEBUG_INFO("fex_p2 = %#lx\n", fex_p2); + DEBUG_INFO("fex_p3 = %#lx\n", fex_p3); + DEBUG_INFO("SIMD emulation almost finished.before write fpcr = %#lx\n", fpcr); + wrfpcr(fpcr); + DEBUG_INFO("Before write fp: vp_p0=%#lx, vc_p1=%#lx, vc_p2=%#lx, vc_p3=%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + write_fp_reg_d(fc, vc_p0, vc_p1, vc_p2, vc_p3); + + /* Do we generate a signal? */ + _fex = (fex_p0 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p1 & swcr & IEEE_TRAP_ENABLE_MASK) + | (fex_p2 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p3 & swcr & IEEE_TRAP_ENABLE_MASK); + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return si_code; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; +} + +long simd_fp_emul_s(unsigned long pc) +{ + FP_DECL_EX; + FP_DECL_S(SA); FP_DECL_S(SB); FP_DECL_S(SR); + + unsigned long fa, fb, fc, func, mode, src; + unsigned long res, va, vb, vc, swcr, fpcr; + __u32 insn; + long si_code; + + unsigned long va_p0, va_p1, va_p2, va_p3; + unsigned long vb_p0, vb_p1, vb_p2, vb_p3; + unsigned long vc_p0, vc_p1, vc_p2, vc_p3; + unsigned long fex_p0, fex_p1, fex_p2, fex_p3; + + int working_part; + + get_user(insn, (__u32 *)pc); + fc = (insn >> 0) & 0x1f; /* destination register */ + fb = (insn >> 16) & 0x1f; + fa = (insn >> 21) & 0x1f; + func = (insn >> 5) & 0xff; + fpcr = rdfpcr(); + mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3; + + DEBUG_INFO("======== Entering SIMD S-floating math emulation =======\n"); + DEBUG_INFO("hardware fpcr = %#lx\n", fpcr); + swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr); + DEBUG_INFO("software swcr = %#lx\n", swcr); + DEBUG_INFO("fa:%#lx,fb:%#lx,fc:%#lx,func:%#lx,mode:%#lx\n", fa, fb, fc, func, mode); + read_fp_reg_s(fa, &va_p0, &va_p1, &va_p2, &va_p3); + read_fp_reg_s(fb, &vb_p0, &vb_p1, &vb_p2, &vb_p3); + read_fp_reg_s(fc, &vc_p0, &vc_p1, &vc_p2, &vc_p3); + DEBUG_INFO("va_p0:%#lx, va_p1:%#lx, va_p2:%#lx, va_p3:%#lx\n", va_p0, va_p1, va_p2, va_p3); + DEBUG_INFO("vb_p0:%#lx, vb_p1:%#lx, vb_p2:%#lx, vb_p3:%#lx\n", vb_p0, vb_p1, vb_p2, vb_p3); + DEBUG_INFO("vc_p0:%#lx, vc_p1:%#lx, vc_p2:%#lx, vc_p3:%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + working_part = WORKING_PART_0; +simd_working: + _fex = 0; + switch (working_part) { + case WORKING_PART_0: + DEBUG_INFO("WORKING_PART_0\n"); + va = va_p0; + vb = vb_p0; + vc = vc_p0; + if ((fpcr & FPCR_STATUS_MASK0) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("PART0: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_1: + DEBUG_INFO("WORKING_PART_1\n"); + va = va_p1; + vb = vb_p1; + vc = vc_p1; 
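+		/*
+		 * FPCR_STATUS_MASK1 selects the status bits of lane 1 only.
+		 * A lane whose status field is clear and whose operands are
+		 * both normal numbers was computed correctly by hardware,
+		 * so the value already read into vc is kept and the lane
+		 * is skipped.
+		 */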
+ if ((fpcr & FPCR_STATUS_MASK1) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("PART1: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_2: + DEBUG_INFO("WORKING_PART_2\n"); + va = va_p2; + vb = vb_p2; + vc = vc_p2; + if ((fpcr & FPCR_STATUS_MASK2) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("PART2: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_3: + DEBUG_INFO("WORKING_PART_3\n"); + va = va_p3; + vb = vb_p3; + vc = vc_p3; + if ((fpcr & FPCR_STATUS_MASK3) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("PART3: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + + } + + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + + switch (func) { + case FNC_VSUBS: + DEBUG_INFO("FNC_VSUBS\n"); + FP_SUB_S(SR, SA, SB); + goto pack_s; + case FNC_VMULS: + DEBUG_INFO("FNC_VMULS\n"); + FP_MUL_S(SR, SA, SB); + goto pack_s; + case FNC_VADDS: + DEBUG_INFO("FNC_VADDS\n"); + FP_ADD_S(SR, SA, SB); + goto pack_s; + case FNC_VDIVS: + DEBUG_INFO("FNC_VDIVS\n"); + FP_DIV_S(SR, SA, SB); + goto pack_s; + case FNC_VSQRTS: + DEBUG_INFO("FNC_VSQRTS\n"); + FP_SQRT_S(SR, SB); + goto pack_s; + } +pack_s: + FP_PACK_SP(&vc, SR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) { + DEBUG_INFO("pack_s, vc=0 !!!!\n"); + vc = 0; + } + + DEBUG_INFO("SW64 SIMD Emulation S-floating _fex=%#lx, va=%#lx, vb=%#lx, vc=%#lx\n", _fex, va, vb, vc); + DEBUG_INFO("SW64 SIMD Emulation S-floating mode=%#lx,func=%#lx, swcr=%#lx\n", mode, func, swcr); +next_working_s: + switch (working_part) { + case WORKING_PART_0: + working_part = WORKING_PART_1; + vc_p0 = vc; + fex_p0 = _fex; + goto simd_working; + case WORKING_PART_1: + working_part = WORKING_PART_2; + vc_p1 = vc; + fex_p1 = _fex; + goto simd_working; + case WORKING_PART_2: + working_part = WORKING_PART_3; + vc_p2 = vc; + fex_p2 = _fex; + goto simd_working; + case WORKING_PART_3: + vc_p3 = vc; + fex_p3 = _fex; + goto done; + } +done: + if (fex_p0 || fex_p1 || fex_p2 || fex_p3) { + unsigned long fpcr_p0, fpcr_p1, fpcr_p2, fpcr_p3; + unsigned long swcr_p0, swcr_p1, swcr_p2, swcr_p3; + + fpcr_p0 = fpcr_p1 = fpcr_p2 = fpcr_p3 = 0; + swcr_p0 = swcr_p1 = swcr_p2 = swcr_p3 = swcr; + /* manage fpcr_p0 */ + if (fex_p0) { + swcr_p0 |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p0 = fpcr; + fpcr_p0 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p0 |= ieee_swcr_to_fpcr(swcr_p0); + DEBUG_INFO("fex_p0: fpcr_p0:%#lx\n", fpcr_p0); + } + + if (fex_p1) { + swcr_p1 |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + + /* Update hardware control register. 
*/ + fpcr_p1 = fpcr; + fpcr_p1 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p1 |= ieee_swcr_to_fpcr(swcr_p1); + DEBUG_INFO("fex_p1: fpcr_p1:%#lx\n", fpcr_p1); + } + + if (fex_p2) { + swcr_p2 |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p2 = fpcr; + fpcr_p2 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p2 |= ieee_swcr_to_fpcr(swcr_p2); + DEBUG_INFO("fex_p2: fpcr_p2:%#lx\n", fpcr_p2); + } + + if (fex_p3) { + swcr_p3 |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p3 = fpcr; + fpcr_p3 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p3 |= ieee_swcr_to_fpcr(swcr_p3); + DEBUG_INFO("fex_p3: fpcr_p3:%#lx\n", fpcr_p3); + } + + fpcr = fpcr_p0 | fpcr_p1 | fpcr_p2 | fpcr_p3; + DEBUG_INFO("fex_p0 = %#lx\n", fex_p0); + DEBUG_INFO("fex_p1 = %#lx\n", fex_p1); + DEBUG_INFO("fex_p2 = %#lx\n", fex_p2); + DEBUG_INFO("fex_p3 = %#lx\n", fex_p3); + DEBUG_INFO("SIMD emulation almost finished.before write fpcr = %#lx\n", fpcr); + wrfpcr(fpcr); + + DEBUG_INFO("Before write fp: vc_p0=%#lx, vc_p1=%#lx, vc_p2=%#lx, vc_p3=%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + write_fp_reg_s(fc, vc_p0, vc_p1, vc_p2, vc_p3); + + /* Do we generate a signal? */ + _fex = (fex_p0 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p1 & swcr & IEEE_TRAP_ENABLE_MASK) + | (fex_p2 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p3 & swcr & IEEE_TRAP_ENABLE_MASK); + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return si_code; + } + DEBUG_INFO("SIMD finished.. 
si_code:%#lx\n", si_code); + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; + +} + +static inline unsigned long negative_value(unsigned long va) +{ + return (va ^ 0x8000000000000000UL); +} + +static inline unsigned long s_negative_value(unsigned long va) +{ + return (va ^ 0x80000000UL); +} + +/* + * sw64 mul-add floating emulation + */ +long mul_add_fp_emul(unsigned long pc) +{ + FP_DECL_EX; + FP_DECL_S(SA); FP_DECL_S(SB); FP_DECL_S(SC); FP_DECL_S(S_TMP); FP_DECL_S(SR); + FP_DECL_D(DA); FP_DECL_D(DB); FP_DECL_D(DC); FP_DECL_D(D_TMP); FP_DECL_D(DR); + FP_DECL_S(S_ZERO); + FP_DECL_D(D_ZERO); + FP_DECL_S(S_TMP2); + FP_DECL_D(D_TMP2); + + unsigned long fa, fb, fc, fd, func, mode, src; + unsigned long res, va, vb, vc, vd, vtmp, vtmp2, swcr, fpcr; + __u32 insn; + long si_code; + unsigned long vzero = 0; + + get_user(insn, (__u32 *)pc); + fd = (insn >> 0) & 0x1f; /* destination register */ + fc = (insn >> 5) & 0x1f; + fb = (insn >> 16) & 0x1f; + fa = (insn >> 21) & 0x1f; + func = (insn >> 10) & 0x3f; + + fpcr = rdfpcr(); + mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3; + + DEBUG_INFO("===== Entering SW64 MUL-ADD Emulation =====\n"); + DEBUG_INFO("hardware fpcr = %#lx\n", fpcr); + swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr); + DEBUG_INFO("software swcr = %#lx\n", swcr); + + if (func == FNC_FMAS || func == FNC_FMSS || func == FNC_FNMAS || func == FNC_FNMSS) { + va = sw64_read_fp_reg_s(fa); + vb = sw64_read_fp_reg_s(fb); + vc = sw64_read_fp_reg_s(fc); + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_UNPACK_SP(SC, &vc); + FP_UNPACK_SP(S_ZERO, &vzero); + } + if (func == FNC_FMAD || func == FNC_FMSD || func == FNC_FNMAD || func == FNC_FNMSD) { + va = sw64_read_fp_reg(fa); + vb = sw64_read_fp_reg(fb); + vc = sw64_read_fp_reg(fc); + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_UNPACK_DP(DC, &vc); + FP_UNPACK_DP(D_ZERO, &vzero); + } + DEBUG_INFO("va = %#lx, vb = %#lx, vc = %#lx\n", va, vb, vc); + switch (func) { + case FNC_FMAS: + FP_MUL_S(S_TMP, SA, SB); + FP_ADD_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FMSS: + FP_MUL_S(S_TMP, SA, SB); + FP_SUB_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FNMAS: /* (-va*vb) + vc */ + va = s_negative_value(va); + FP_UNPACK_SP(SA, &va); + FP_MUL_S(S_TMP, SA, SB); + FP_ADD_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FNMSS: /* (-va*vb) - vc */ + va = s_negative_value(va); + FP_UNPACK_SP(SA, &va); + FP_MUL_S(S_TMP, SA, SB); + FP_SUB_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FMAD: + FP_MUL_D(D_TMP, DA, DB); + FP_ADD_D(DR, D_TMP, DC); + goto pack_d; + case FNC_FMSD: + FP_MUL_D(D_TMP, DA, DB); + FP_SUB_D(DR, D_TMP, DC); + goto pack_d; + case FNC_FNMAD: + va = negative_value(va); + FP_UNPACK_DP(DA, &va); + FP_MUL_D(D_TMP, DA, DB); + FP_ADD_D(DR, D_TMP, DC); + goto pack_d; + case FNC_FNMSD: + va = negative_value(va); + FP_UNPACK_DP(DA, &va); + FP_MUL_D(D_TMP, DA, DB); + FP_SUB_D(DR, D_TMP, DC); + goto pack_d; + default: + goto bad_insn; + + } +pack_s: + FP_PACK_SP(&vd, SR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) + vd = 0; + sw64_write_fp_reg_s(fd, vd); + goto done; + +pack_d: + FP_PACK_DP(&vd, DR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) + vd = 0; + sw64_write_fp_reg(fd, vd); + +done: + DEBUG_INFO("vd = %#lx\n", vd); + if (_fex) { + /* Record exceptions in software control word. */ + swcr |= (_fex << IEEE_STATUS_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (_fex << IEEE_STATUS_TO_EXCSUM_SHIFT); + + /* Update hardware control register. 
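+		 *
+		 * The mask below preserves the dynamic rounding mode
+		 * (FPCR_DYN_MASK) while clearing the remaining FPCR_MASK
+		 * bits; ieee_swcr_to_fpcr() then translates the updated
+		 * software control word back into hardware FPCR bits.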
*/ + fpcr &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr |= ieee_swcr_to_fpcr(swcr); + wrfpcr(fpcr); /** wrfpcr will destroy vector register! */ + if (func == FNC_FMAS || func == FNC_FMSS || func == FNC_FNMAS || func == FNC_FNMSS) + sw64_write_fp_reg_s(fd, vd); + if (func == FNC_FMAD || func == FNC_FMSD || func == FNC_FNMAD || func == FNC_FNMSD) + sw64_write_fp_reg(fd, vd); + + /* Do we generate a signal? */ + _fex = _fex & swcr & IEEE_TRAP_ENABLE_MASK; + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + + return si_code; + } + + /* + * We used to write the destination register here, but DEC FORTRAN + * requires that the result *always* be written... so we do the write + * immediately after the operations above. + */ + + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; +} + + +long simd_mul_add_fp_emul_s(unsigned long pc) +{ + FP_DECL_EX; + FP_DECL_S(SA); FP_DECL_S(SB); FP_DECL_S(SC); FP_DECL_S(S_TMP); FP_DECL_S(SR); + FP_DECL_S(S_ZERO); + FP_DECL_S(S_TMP2); + + unsigned long fa, fb, fc, fd, func, mode, src; + unsigned long res, va, vb, vc, vd, vtmp, vtmp2, swcr, fpcr; + __u32 insn; + long si_code; + unsigned long vzero = 0; + + get_user(insn, (__u32 *)pc); + fd = (insn >> 0) & 0x1f; /* destination register */ + fc = (insn >> 5) & 0x1f; + fb = (insn >> 16) & 0x1f; + fa = (insn >> 21) & 0x1f; + func = (insn >> 10) & 0x3f; + + fpcr = rdfpcr(); + mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3; + + unsigned long va_p0, va_p1, va_p2, va_p3; + unsigned long vb_p0, vb_p1, vb_p2, vb_p3; + unsigned long vc_p0, vc_p1, vc_p2, vc_p3; + unsigned long vd_p0, vd_p1, vd_p2, vd_p3; + unsigned long fex_p0, fex_p1, fex_p2, fex_p3; + + int working_part; + + DEBUG_INFO("======== Entering SIMD S-floating mul-add emulation =======\n"); + swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr); + DEBUG_INFO("software swcr = %#lx\n", swcr); + DEBUG_INFO("hardware fpcr = %#lx\n", fpcr); + read_fp_reg_s(fa, &va_p0, &va_p1, &va_p2, &va_p3); + read_fp_reg_s(fb, &vb_p0, &vb_p1, &vb_p2, &vb_p3); + read_fp_reg_s(fc, &vc_p0, &vc_p1, &vc_p2, &vc_p3); + read_fp_reg_s(fd, &vd_p0, &vd_p1, &vd_p2, &vd_p3); + DEBUG_INFO("va_p0:%#lx, va_p1:%#lx, va_p2:%#lx, va_p3:%#lx\n", va_p0, va_p1, va_p2, va_p3); + DEBUG_INFO("vb_p0:%#lx, vb_p1:%#lx, vb_p2:%#lx, vb_p3:%#lx\n", vb_p0, vb_p1, vb_p2, vb_p3); + DEBUG_INFO("vc_p0:%#lx, vc_p1:%#lx, vc_p2:%#lx, vc_p3:%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + DEBUG_INFO("vd_p0:%#lx, vd_p1:%#lx, vd_p2:%#lx, vd_p3:%#lx\n", vd_p0, vd_p1, vd_p2, vd_p3); + working_part = WORKING_PART_0; +simd_working: + _fex = 0; + switch (working_part) { + case WORKING_PART_0: + DEBUG_INFO("WORKING_PART_0\n"); + va = va_p0; + vb = vb_p0; + vc = vc_p0; + DEBUG_INFO("FPCR_STATUS_MASK0 : %#lx, fpcr :%#lx\n", FPCR_STATUS_MASK0, fpcr); + if ((fpcr & FPCR_STATUS_MASK0) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + SW64_FP_NORMAL_S(SC, &vc); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL) && (SC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("LOW: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == 
SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_1: + DEBUG_INFO("WORKING_PART_1\n"); + va = va_p1; + vb = vb_p1; + vc = vc_p1; + DEBUG_INFO("FPCR_STATUS_MASK1 : %#lx, fpcr :%#lx\n", FPCR_STATUS_MASK0, fpcr); + if ((fpcr & FPCR_STATUS_MASK1) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + SW64_FP_NORMAL_S(SC, &vc); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL) && (SC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_2: + DEBUG_INFO("WORKING_PART_2\n"); + va = va_p2; + vb = vb_p2; + vc = vc_p2; + if ((fpcr & FPCR_STATUS_MASK2) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + SW64_FP_NORMAL_S(SC, &vc); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL) && (SC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_3: + DEBUG_INFO("WORKING_PART_3\n"); + va = va_p3; + vb = vb_p3; + vc = vc_p3; + if ((fpcr & FPCR_STATUS_MASK3) == 0) { + SW64_FP_NORMAL_S(SA, &va); + SW64_FP_NORMAL_S(SB, &vb); + SW64_FP_NORMAL_S(SC, &vc); + if ((SA_c == SW64_FP_NORMAL) && (SB_c == SW64_FP_NORMAL) && (SC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: SA_c = %#lx, SB_c = %#lx\n", SA_c, SB_c); + } else { + SW64_FP_NAN_S(SA, &va); + SW64_FP_NAN_S(SB, &vb); + if ((SA_c == SW64_FP_NAN) && (SB_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + } + + FP_UNPACK_SP(SA, &va); + FP_UNPACK_SP(SB, &vb); + FP_UNPACK_SP(SC, &vc); + FP_UNPACK_SP(S_ZERO, &vzero); + switch (func) { + case FNC_FMAS: + FP_MUL_S(S_TMP, SA, SB); + FP_ADD_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FMSS: + FP_MUL_S(S_TMP, SA, SB); + FP_SUB_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FNMAS: /* (-va*vb) + vc */ + va = s_negative_value(va); + FP_UNPACK_SP(SA, &va); + FP_MUL_S(S_TMP, SA, SB); + FP_ADD_S(SR, S_TMP, SC); + goto pack_s; + case FNC_FNMSS: /* (-va*vb) - vc */ + va = s_negative_value(va); + FP_UNPACK_SP(SA, &va); + FP_MUL_S(S_TMP, SA, SB); + FP_SUB_S(SR, S_TMP, SC); + goto pack_s; + default: + goto bad_insn; + } + +pack_s: + FP_PACK_SP(&vd, SR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) + vd = 0; + DEBUG_INFO("SW64 SIMD Emulation S-floating _fex=%#lx, va=%#lx, vb=%#lx, vc=%#lx\n", _fex, va, vb, vc); + DEBUG_INFO("SW64 SIMD Emulation S-floating mode=%#lx,func=%#lx, swcr=%#lx\n", mode, func, swcr); +next_working_s: + switch (working_part) { + case WORKING_PART_0: + working_part = WORKING_PART_1; + vd_p0 = vd; + fex_p0 = _fex; + goto simd_working; + case WORKING_PART_1: + working_part = WORKING_PART_2; + vd_p1 = vd; + fex_p1 = _fex; + goto simd_working; + case WORKING_PART_2: + working_part = WORKING_PART_3; + vd_p2 = vd; + fex_p2 = _fex; + goto simd_working; + case WORKING_PART_3: + vd_p3 = vd; + fex_p3 = _fex; + goto done; + } +done: + if (fex_p0 || fex_p1 || fex_p2 || fex_p3) { + unsigned long fpcr_p0, fpcr_p1, fpcr_p2, fpcr_p3; + unsigned long swcr_p0, swcr_p1, swcr_p2, swcr_p3; + + fpcr_p0 = fpcr_p1 = fpcr_p2 = fpcr_p3 = 0; + swcr_p0 = swcr_p1 = swcr_p2 = swcr_p3 = swcr; + /* manage fpcr_p0 */ + if (fex_p0) { + swcr_p0 |= (fex_p0 << 
IEEE_STATUS0_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p0 = fpcr; + fpcr_p0 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p0 |= ieee_swcr_to_fpcr(swcr_p0); + } + + if (fex_p1) { + swcr_p1 |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p1 = fpcr; + fpcr_p1 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p1 |= ieee_swcr_to_fpcr(swcr_p1); + } + + if (fex_p2) { + swcr_p2 |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p2 = fpcr; + fpcr_p2 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p2 |= ieee_swcr_to_fpcr(swcr_p2); + } + + if (fex_p3) { + swcr_p3 |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p3 = fpcr; + fpcr_p3 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p3 |= ieee_swcr_to_fpcr(swcr_p3); + } + + fpcr = fpcr_p0 | fpcr_p1 | fpcr_p2 | fpcr_p3; + DEBUG_INFO("fex_p0 = %#lx\n", fex_p0); + DEBUG_INFO("fex_p1 = %#lx\n", fex_p1); + DEBUG_INFO("fex_p2 = %#lx\n", fex_p2); + DEBUG_INFO("fex_p3 = %#lx\n", fex_p3); + DEBUG_INFO("SIMD emulation almost finished.before write fpcr = %#lx\n", fpcr); + wrfpcr(fpcr); + DEBUG_INFO("Before write fp: vp_p0=%#lx, vc_p1=%#lx, vc_p2=%#lx, vc_p3=%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + write_fp_reg_s(fd, vd_p0, vd_p1, vd_p2, vd_p3); /* write to fd */ + + /* Do we generate a signal? */ + _fex = (fex_p0 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p1 & swcr & IEEE_TRAP_ENABLE_MASK) + | (fex_p2 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p3 & swcr & IEEE_TRAP_ENABLE_MASK); + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return si_code; + + } + DEBUG_INFO("SIMD finished.. 
si_code:%#lx\n", si_code); + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; +} + +long simd_mul_add_fp_emul_d(unsigned long pc) +{ + FP_DECL_EX; + FP_DECL_D(DA); FP_DECL_D(DB); FP_DECL_D(DC); FP_DECL_D(D_TMP); FP_DECL_D(DR); + FP_DECL_D(D_ZERO); + FP_DECL_D(D_TMP2); + + unsigned long fa, fb, fc, fd, func, mode, src; + unsigned long res, va, vb, vc, vd, vtmp, vtmp2, swcr, fpcr; + __u32 insn; + long si_code; + unsigned long vzero = 0; + + get_user(insn, (__u32 *)pc); + fd = (insn >> 0) & 0x1f; /* destination register */ + fc = (insn >> 5) & 0x1f; + fb = (insn >> 16) & 0x1f; + fa = (insn >> 21) & 0x1f; + func = (insn >> 10) & 0x3f; + + fpcr = rdfpcr(); + mode = (fpcr >> FPCR_DYN_SHIFT) & 0x3; + + unsigned long va_p0, va_p1, va_p2, va_p3; + unsigned long vb_p0, vb_p1, vb_p2, vb_p3; + unsigned long vc_p0, vc_p1, vc_p2, vc_p3; + unsigned long vd_p0, vd_p1, vd_p2, vd_p3; + unsigned long fex_p0, fex_p1, fex_p2, fex_p3; + + int working_part; + + DEBUG_INFO("======== Entering SIMD D-floating mul-add emulation =======\n"); + DEBUG_INFO("hardware fpcr = %#lx\n", fpcr); + swcr = swcr_update_status(current_thread_info()->ieee_state, fpcr); + DEBUG_INFO("software swcr = %#lx\n", swcr); + read_fp_reg_d(fa, &va_p0, &va_p1, &va_p2, &va_p3); + read_fp_reg_d(fb, &vb_p0, &vb_p1, &vb_p2, &vb_p3); + read_fp_reg_d(fc, &vc_p0, &vc_p1, &vc_p2, &vc_p3); + read_fp_reg_d(fd, &vd_p0, &vd_p1, &vd_p2, &vd_p3); + DEBUG_INFO("va_p0:%#lx, va_p1:%#lx, va_p2:%#lx, va_p3:%#lx\n", va_p0, va_p1, va_p2, va_p3); + DEBUG_INFO("vb_p0:%#lx, vb_p1:%#lx, vb_p2:%#lx, vb_p3:%#lx\n", vb_p0, vb_p1, vb_p2, vb_p3); + DEBUG_INFO("vc_p0:%#lx, vc_p1:%#lx, vc_p2:%#lx, vc_p3:%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + DEBUG_INFO("vd_p0:%#lx, vd_p1:%#lx, vd_p2:%#lx, vd_p3:%#lx\n", vd_p0, vd_p1, vd_p2, vd_p3); + working_part = WORKING_PART_0; +simd_working: + _fex = 0; + switch (working_part) { + case WORKING_PART_0: + DEBUG_INFO("WORKING_PART_0\n"); + va = va_p0; + vb = vb_p0; + vc = vc_p0; + vd = vd_p0; + if ((fpcr & FPCR_STATUS_MASK0) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + SW64_FP_NORMAL_D(DC, &vc); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL) && (DC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("LOW: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + SW64_FP_NAN_D(DC, &vc); + if ((DA_c == SW64_FP_NAN) && (DB_c == SW64_FP_NAN) && (DC_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_1: + DEBUG_INFO("WORKING_PART_1\n"); + va = va_p1; + vb = vb_p1; + vc = vc_p1; + vd = vd_p1; + if ((fpcr & FPCR_STATUS_MASK1) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + SW64_FP_NORMAL_D(DC, &vc); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL) && (DC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + SW64_FP_NAN_D(DC, &vc); + if ((DA_c == SW64_FP_NAN) && (DB_c == SW64_FP_NAN) && (DC_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_2: + DEBUG_INFO("WORKING_PART_2\n"); + va = va_p2; + vb = vb_p2; + vc = vc_p2; + vd = vd_p2; + if ((fpcr & FPCR_STATUS_MASK2) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + SW64_FP_NORMAL_D(DC, &vc); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL) && (DC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + 
DEBUG_INFO("HIGH: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + SW64_FP_NAN_D(DC, &vc); + if ((DA_c == SW64_FP_NAN) && (DB_c == SW64_FP_NAN) && (DC_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + case WORKING_PART_3: + DEBUG_INFO("WORKING_PART_3\n"); + va = va_p3; + vb = vb_p3; + vc = vc_p3; + vd = vd_p3; + if ((fpcr & FPCR_STATUS_MASK3) == 0) { + SW64_FP_NORMAL_D(DA, &va); + SW64_FP_NORMAL_D(DB, &vb); + SW64_FP_NORMAL_D(DC, &vc); + if ((DA_c == SW64_FP_NORMAL) && (DB_c == SW64_FP_NORMAL) && (DC_c == SW64_FP_NORMAL)) + goto next_working_s; + else + DEBUG_INFO("HIGH: DA_c = %#lx, DB_c = %#lx\n", DA_c, DB_c); + } else { + SW64_FP_NAN_D(DA, &va); + SW64_FP_NAN_D(DB, &vb); + SW64_FP_NAN_D(DC, &vc); + if ((DA_c == SW64_FP_NAN) && (DB_c == SW64_FP_NAN) && (DC_c == SW64_FP_NAN)) + goto next_working_s; + } + break; + } + + FP_UNPACK_DP(DA, &va); + FP_UNPACK_DP(DB, &vb); + FP_UNPACK_DP(DC, &vc); + FP_UNPACK_DP(D_ZERO, &vzero); + + switch (func) { + case FNC_FMAD: + FP_MUL_D(D_TMP, DA, DB); + FP_ADD_D(DR, D_TMP, DC); + goto pack_d; + case FNC_FMSD: + FP_MUL_D(D_TMP, DA, DB); + FP_SUB_D(DR, D_TMP, DC); + goto pack_d; + case FNC_FNMAD: + va = negative_value(va); + FP_UNPACK_DP(DA, &va); + FP_MUL_D(D_TMP, DA, DB); + FP_ADD_D(DR, D_TMP, DC); + goto pack_d; + case FNC_FNMSD: + va = negative_value(va); + FP_UNPACK_DP(DA, &va); + FP_MUL_D(D_TMP, DA, DB); + FP_SUB_D(DR, D_TMP, DC); + + goto pack_d; + default: + goto bad_insn; + } + +pack_d: + FP_PACK_DP(&vd, DR); + if ((_fex & FP_EX_UNDERFLOW) && (swcr & IEEE_MAP_UMZ)) + vd = 0; + DEBUG_INFO("SW64 SIMD Emulation D-floating _fex=%#lx, va=%#lx, vb=%#lx, vc=%#lx\n", _fex, va, vb, vc); + DEBUG_INFO("SW64 SIMD Emulation D-floating mode=%#lx,func=%#lx, swcr=%#lx\n", mode, func, swcr); +next_working_s: + switch (working_part) { + case WORKING_PART_0: + working_part = WORKING_PART_1; + vd_p0 = vd; + fex_p0 = _fex; + goto simd_working; + case WORKING_PART_1: + working_part = WORKING_PART_2; + vd_p1 = vd; + fex_p1 = _fex; + goto simd_working; + case WORKING_PART_2: + working_part = WORKING_PART_3; + vd_p2 = vd; + fex_p2 = _fex; + goto simd_working; + case WORKING_PART_3: + vd_p3 = vd; + fex_p3 = _fex; + goto done; + } +done: + if (fex_p0 || fex_p1 || fex_p2 || fex_p3) { + unsigned long fpcr_p0, fpcr_p1, fpcr_p2, fpcr_p3; + unsigned long swcr_p0, swcr_p1, swcr_p2, swcr_p3; + + fpcr_p0 = fpcr_p1 = fpcr_p2 = fpcr_p3 = 0; + swcr_p0 = swcr_p1 = swcr_p2 = swcr_p3 = swcr; + /* manage fpcr_p0 */ + if (fex_p0) { + swcr_p0 |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p0 << IEEE_STATUS0_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p0 = fpcr; + fpcr_p0 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p0 |= ieee_swcr_to_fpcr(swcr_p0); + } + + if (fex_p1) { + swcr_p1 |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p1 << IEEE_STATUS1_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p1 = fpcr; + fpcr_p1 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p1 |= ieee_swcr_to_fpcr(swcr_p1); + } + + if (fex_p2) { + swcr_p2 |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p2 << IEEE_STATUS2_TO_EXCSUM_SHIFT); + + /* Update hardware control register. 
*/ + fpcr_p2 = fpcr; + fpcr_p2 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p2 |= ieee_swcr_to_fpcr(swcr_p2); + } + + if (fex_p3) { + swcr_p3 |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + current_thread_info()->ieee_state + |= (fex_p3 << IEEE_STATUS3_TO_EXCSUM_SHIFT); + + /* Update hardware control register. */ + fpcr_p3 = fpcr; + fpcr_p3 &= (~FPCR_MASK | FPCR_DYN_MASK); + fpcr_p3 |= ieee_swcr_to_fpcr(swcr_p3); + } + + fpcr = fpcr_p0 | fpcr_p1 | fpcr_p2 | fpcr_p3; + DEBUG_INFO("fex_p0 = %#lx\n", fex_p0); + DEBUG_INFO("fex_p1 = %#lx\n", fex_p1); + DEBUG_INFO("fex_p2 = %#lx\n", fex_p2); + DEBUG_INFO("fex_p3 = %#lx\n", fex_p3); + DEBUG_INFO("SIMD emulation almost finished.before write fpcr = %#lx\n", fpcr); + wrfpcr(fpcr); + + DEBUG_INFO("Before write fp: vp_p0=%#lx, vc_p1=%#lx, vc_p2=%#lx, vc_p3=%#lx\n", vc_p0, vc_p1, vc_p2, vc_p3); + write_fp_reg_d(fd, vd_p0, vd_p1, vd_p2, vd_p3); /* write to fd */ + + /* Do we generate a signal? */ + _fex = (fex_p0 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p1 & swcr & IEEE_TRAP_ENABLE_MASK) + | (fex_p2 & swcr & IEEE_TRAP_ENABLE_MASK) | (fex_p3 & swcr & IEEE_TRAP_ENABLE_MASK); + si_code = 0; + if (_fex) { + if (_fex & IEEE_TRAP_ENABLE_DNO) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_INE) + si_code = FPE_FLTRES; + if (_fex & IEEE_TRAP_ENABLE_UNF) + si_code = FPE_FLTUND; + if (_fex & IEEE_TRAP_ENABLE_OVF) + si_code = FPE_FLTOVF; + if (_fex & IEEE_TRAP_ENABLE_DZE) + si_code = FPE_FLTDIV; + if (_fex & IEEE_TRAP_ENABLE_INV) + si_code = FPE_FLTINV; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return si_code; + } + DEBUG_INFO("SIMD finished.. si_code:%#lx\n", si_code); + return 0; + +bad_insn: + printk(KERN_ERR "%s: Invalid FP insn %#x at %#lx\n", __func__, insn, pc); + return -1; +} + +void read_fp_reg_s(unsigned long reg, unsigned long *val_p0, + unsigned long *val_p1, unsigned long *val_p2, unsigned long *val_p3) +{ + unsigned long fp[2]; + + sw64_read_simd_fp_m_s(reg, fp); + *val_p0 = fp[0] & 0xffffffffUL; + *val_p1 = (fp[0] >> 32) & 0xffffffffUL; + *val_p2 = fp[1] & 0xffffffffUL; + *val_p3 = (fp[1] >> 32) & 0xffffffffUL; +} + +void read_fp_reg_d(unsigned long reg, unsigned long *val_p0, + unsigned long *val_p1, unsigned long *val_p2, unsigned long *val_p3) +{ + unsigned long fp[4]; + + sw64_read_simd_fp_m_d(reg, fp); + *val_p0 = fp[0]; + *val_p1 = fp[1]; + *val_p2 = fp[2]; + *val_p3 = fp[3]; +} + +void write_fp_reg_s(unsigned long reg, unsigned long val_p0, + unsigned long val_p1, unsigned long val_p2, unsigned long val_p3) +{ + unsigned long fp[2]; + + fp[0] = ((val_p1 & 0xffffffffUL) << 32) | (val_p0 & 0xffffffffUL); + fp[1] = ((val_p3 & 0xffffffffUL) << 32) | (val_p2 & 0xffffffffUL); + sw64_write_simd_fp_reg_s(reg, fp[0], fp[1]); +} + +void write_fp_reg_d(unsigned long reg, unsigned long val_p0, + unsigned long val_p1, unsigned long val_p2, unsigned long val_p3) +{ + sw64_write_simd_fp_reg_d(reg, val_p0, val_p1, val_p2, val_p3); +} diff --git a/arch/sw_64/math-emu/qrnnd.S b/arch/sw_64/math-emu/qrnnd.S new file mode 100644 index 000000000000..1e732f2e68c0 --- /dev/null +++ b/arch/sw_64/math-emu/qrnnd.S @@ -0,0 +1,133 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + # __udiv_qrnnd + # Copyright (C) 1992, 1994, 1995, 2000 Free Software Foundation, Inc. + + # This file is part of GCC. 
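+	# __udiv_qrnnd(rem, n1, n0, d) divides the 128-bit value n1:n0
+	# by the 64-bit divisor d, returning the quotient and storing
+	# the remainder through rem. The loops below are a classic
+	# restoring shift-and-subtract division, unrolled four ways, so
+	# the counter in $2 starts at 16 for the 64 quotient bits. One
+	# step, sketched in C ($17 = remainder, $18 = n0, $19 = d):
+	#
+	#	r = (r << 1) | (n0 >> 63);	/* shift in next bit */
+	#	n0 <<= 1;
+	#	if (d <= r) {
+	#		r -= d;
+	#		n0 |= 1;		/* record a quotient bit */
+	#	}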
+ + .set noreorder + .set noat + + .text + + .globl __udiv_qrnnd + .ent __udiv_qrnnd +__udiv_qrnnd: + .frame $30, 0, $26, 0 + .prologue 0 + + # ldiq $2,16 + ldi $2, 16($31) + blt $19, $largedivisor + +$loop1: cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $19, $17, $20 + subl $17, $19, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $19, $17, $20 + subl $17, $19, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $19, $17, $20 + subl $17, $19, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $19, $17, $20 + subl $17, $19, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + subl $2, 1, $2 + bgt $2, $loop1 + stl $17, 0($16) + bis $31, $18, $0 + ret $31, ($26), 1 + +$largedivisor: + and $18, 1, $4 + + srl $18, 1, $18 + sll $17, 63, $3 + or $3, $18, $18 + srl $17, 1, $17 + + and $19, 1, $6 + srl $19, 1, $5 + addl $5, $6, $5 + +$loop2: cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $5, $17, $20 + subl $17, $5, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $5, $17, $20 + subl $17, $5, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $5, $17, $20 + subl $17, $5, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + cmplt $18, 0, $3 + addl $17, $17, $17 + bis $17, $3, $17 + addl $18, $18, $18 + cmpule $5, $17, $20 + subl $17, $5, $3 + selne $20, $3, $17, $17 + bis $18, $20, $18 + subl $2, 1, $2 + bgt $2, $loop2 + + addl $17, $17, $17 + addl $4, $17, $17 + bne $6, $Odd + stl $17, 0($16) + bis $31, $18, $0 + ret $31, ($26), 1 + +$Odd: + # q' in $18. 
r' in $17 + addl $17, $18, $17 + + cmpult $17, $18, $3 # $3 := carry from addl + subl $17, $19, $at + addl $18, $3, $18 + selne $3, $at, $17, $17 + + cmpult $17, $19, $3 + addl $18, 1, $at + seleq $3, $at, $18, $18 + subl $17, $19, $at + seleq $3, $at, $17, $17 + + stl $17, 0($16) + bis $31, $18, $0 + ret $31, ($26), 1 + + .end __udiv_qrnnd diff --git a/arch/sw_64/math-emu/sfp-util.h b/arch/sw_64/math-emu/sfp-util.h new file mode 100644 index 000000000000..63f9685999f3 --- /dev/null +++ b/arch/sw_64/math-emu/sfp-util.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include <linux/kernel.h> +#include <linux/sched.h> +#include <linux/types.h> +#include <asm/byteorder.h> +#include <asm/fpu.h> + +#define add_ssaaaa(sh, sl, ah, al, bh, bl) \ + ((sl) = (al) + (bl), (sh) = (ah) + (bh) + ((sl) < (al))) + +#define sub_ddmmss(sh, sl, ah, al, bh, bl) \ + ((sl) = (al) - (bl), (sh) = (ah) - (bh) - ((al) < (bl))) + +#define umul_ppmm(wh, wl, u, v) \ + __asm__ ("mull %2, %3, %1; umulh %2, %3, %0" \ + : "=r" ((UDItype)(wh)), \ + "=&r" ((UDItype)(wl)) \ + : "r" ((UDItype)(u)), \ + "r" ((UDItype)(v))) + +#define udiv_qrnnd(q, r, n1, n0, d) \ +do { unsigned long __r; \ + (q) = __udiv_qrnnd(&__r, (n1), (n0), (d)); \ + (r) = __r; \ +} while (0) +extern unsigned long __udiv_qrnnd(unsigned long *, unsigned long, + unsigned long, unsigned long); + +#define UDIV_NEEDS_NORMALIZATION 1 + +#define abort() goto bad_insn + +#ifndef __LITTLE_ENDIAN +#define __LITTLE_ENDIAN -1 +#endif +#define __BYTE_ORDER __LITTLE_ENDIAN diff --git a/arch/sw_64/mm/Makefile b/arch/sw_64/mm/Makefile new file mode 100644 index 000000000000..92be882cc82b --- /dev/null +++ b/arch/sw_64/mm/Makefile @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the linux sw_64-specific parts of the memory manager. 
#
+
+#ccflags-y := -Werror
+
+obj-y := init.o fault.o physaddr.o mmap.o
+
+obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
+obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += thp.o
diff --git a/arch/sw_64/mm/fault.c b/arch/sw_64/mm/fault.c
new file mode 100644
index 000000000000..c68be4a40d23
--- /dev/null
+++ b/arch/sw_64/mm/fault.c
@@ -0,0 +1,361 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 1995  Linus Torvalds
+ */
+
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <asm/io.h>
+
+#include <asm/mmu_context.h>
+#include <asm/tlbflush.h>
+#include <asm/core.h>
+
+#include <linux/signal.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/ptrace.h>
+#include <linux/mman.h>
+#include <linux/fs.h>
+#include <linux/smp.h>
+#include <linux/interrupt.h>
+#include <linux/extable.h>
+#include <linux/perf_event.h>
+#include <linux/kprobes.h>
+#include <linux/uaccess.h>
+
+__read_mostly bool segv_debug_enabled;
+
+#ifdef CONFIG_KPROBES
+static inline int notify_page_fault(struct pt_regs *regs, unsigned long mmcsr)
+{
+	int ret = 0;
+	/* kprobe_running() needs smp_processor_id() */
+	if (!user_mode(regs)) {
+		preempt_disable();
+		if (kprobe_running() && kprobe_fault_handler(regs, mmcsr))
+			ret = 1;
+		preempt_enable();
+	}
+	return ret;
+}
+#else
+static inline int notify_page_fault(struct pt_regs *regs, unsigned long mmcsr)
+{
+	return 0;
+}
+#endif
+
+extern void die_if_kernel(char *, struct pt_regs *, long, unsigned long *);
+extern void dik_show_regs(struct pt_regs *regs, unsigned long *r9_15);
+
+void show_all_vma(void)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *tmp;
+
+	unsigned long start = 0;
+	unsigned long end = 0;
+	int i = 0;
+
+	if (mm) {
+		tmp = mm->mmap;
+		while (tmp) {
+			start = tmp->vm_start;
+			end = tmp->vm_end;
+			if (tmp->vm_file)
+				pr_info("vma[%d]: [%#lx, %#lx], len = %#lx, flags = %#lx, file = %s, name = %s\n",
+					i, start, end, (end - start), tmp->vm_flags,
+					tmp->vm_file->f_path.dentry->d_name.name, current->comm);
+			else
+				pr_info("vma[%d]: [%#lx, %#lx], len = %#lx, flags = %#lx, name = %s\n",
+					i, start, end, (end - start), tmp->vm_flags, current->comm);
+			tmp = tmp->vm_next;
+			i++;
+		}
+	}
+}
+
+/*
+ * Force a new ASN for a task.
+ */
+
+#ifndef CONFIG_SMP
+unsigned long last_asn = ASN_FIRST_VERSION;
+#endif
+
+void
+__load_new_mm_context(struct mm_struct *next_mm)
+{
+	unsigned long mmc;
+	struct pcb_struct *pcb;
+
+	mmc = __get_new_mm_context(next_mm, smp_processor_id());
+	next_mm->context.asid[smp_processor_id()] = mmc;
+
+	pcb = &current_thread_info()->pcb;
+	pcb->asn = mmc & HARDWARE_ASN_MASK;
+	pcb->ptbr = ((unsigned long) next_mm->pgd - PAGE_OFFSET) >> PAGE_SHIFT;
+
+	__reload_thread(pcb);
+}
+
+
+/*
+ * This routine handles page faults.  It determines the address,
+ * and the problem, and then passes it off to handle_mm_fault().
+ *
+ * mmcsr:
+ *	0 = translation not valid
+ *	1 = access violation
+ *	2 = fault-on-read
+ *	3 = fault-on-execute
+ *	4 = fault-on-write
+ *
+ * cause:
+ *	-1 = instruction fetch
+ *	0 = load
+ *	1 = store
+ *
+ * Registers $9 through $15 are saved in a block just prior to `regs' and
+ * are saved and restored around the call to allow exception code to
+ * modify them.
+ */
+
+/* Macro for exception fixup code to access integer registers. */
+#define dpf_reg(r)							\
+	(((unsigned long *)regs)[(r) <= 8 ? (r) : (r) <= 15 ? (r)-16 :	\
+				 (r) <= 18 ? (r)+10 : (r)-10])
+unsigned long show_va_to_pa(struct mm_struct *mm, unsigned long addr)
+{
+	pgd_t *pgd = NULL;
+	p4d_t *p4d = NULL;
+	pud_t *pud = NULL;
+	pmd_t *pmd = NULL;
+	pte_t *pte = NULL;
+	unsigned long ret = 0UL;
+
+	pgd = pgd_offset(mm, addr);
+	if (pgd_none(*pgd)) {
+		ret = 0;
+		pr_debug("addr = %#lx, pgd = %#lx\n", addr, pgd_val(*pgd));
+		goto out;
+	}
+	p4d = p4d_offset(pgd, addr);
+	if (p4d_none(*p4d)) {
+		ret = 0;
+		pr_debug("addr = %#lx, pgd = %#lx, p4d = %#lx\n",
+			 addr, pgd_val(*pgd), p4d_val(*p4d));
+		goto out;
+	}
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud)) {
+		ret = 0;
+		pr_debug("addr = %#lx, pgd = %#lx, pud = %#lx\n",
+			 addr, pgd_val(*pgd), pud_val(*pud));
+		goto out;
+	}
+	pmd = pmd_offset(pud, addr);
+	if (pmd_none(*pmd)) {
+		ret = 0;
+		pr_debug("addr = %#lx, pgd = %#lx, pud = %#lx, pmd = %#lx\n",
+			 addr, pgd_val(*pgd), pud_val(*pud), pmd_val(*pmd));
+		goto out;
+	}
+	pte = pte_offset_map(pmd, addr);
+	if (pte_present(*pte)) {
+		ret = ((unsigned long)__va(((pte_val(*pte) >> 32)) << PAGE_SHIFT));
+		pr_debug("addr = %#lx, pgd = %#lx, pud = %#lx, pmd = %#lx, pte = %#lx, ret = %#lx\n",
+			 addr, *(unsigned long *)pgd, *(unsigned long *)pud,
+			 *(unsigned long *)pmd, *(unsigned long *)pte, ret);
+	}
+out:
+	return ret;
+}
+
+extern int do_match(unsigned long address, unsigned long mmcsr, long cause, struct pt_regs *regs);
+
+asmlinkage void notrace
+do_page_fault(unsigned long address, unsigned long mmcsr,
+	      long cause, struct pt_regs *regs)
+{
+	struct vm_area_struct *vma;
+	struct mm_struct *mm = current->mm;
+	const struct exception_table_entry *fixup;
+	int si_code = SEGV_MAPERR;
+	vm_fault_t fault;
+	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+
+	if (notify_page_fault(regs, mmcsr))
+		return;
+
+	if (unlikely(mmcsr >= MMCSR__DA_MATCH)) {
+		if (do_match(address, mmcsr, cause, regs) == 1)
+			return;
+	}
+
+	if (unlikely(mmcsr == MMCSR__ACV1)) {
+		if (!user_mode(regs))
+			goto no_context;
+		else {
+			down_read(&mm->mmap_lock);
+			goto bad_area;
+		}
+	}
+
+	/*
+	 * If we're in an interrupt context, or have no user context,
+	 * we must not take the fault.
+	 */
+	if (!mm || faulthandler_disabled())
+		goto no_context;
+
+	if (user_mode(regs))
+		flags |= FAULT_FLAG_USER;
+
+retry:
+	down_read(&mm->mmap_lock);
+	vma = find_vma(mm, address);
+	if (!vma)
+		goto bad_area;
+
+	if (vma->vm_start <= address)
+		goto good_area;
+	if (!(vma->vm_flags & VM_GROWSDOWN))
+		goto bad_area;
+	if (expand_stack(vma, address))
+		goto bad_area;
+
+	/*
+	 * Ok, we have a good vm_area for this memory access, so
+	 * we can handle it.
+	 */
+good_area:
+	si_code = SEGV_ACCERR;
+	if (cause < 0) {
+		if (!(vma->vm_flags & VM_EXEC))
+			goto bad_area;
+	} else if (!cause) {
+		/* Allow reads even for write-only mappings */
+		if (!(vma->vm_flags & (VM_READ | VM_WRITE)))
+			goto bad_area;
+	} else {
+		if (!(vma->vm_flags & VM_WRITE))
+			goto bad_area;
+		flags |= FAULT_FLAG_WRITE;
+	}
+
+	/*
+	 * If for any reason at all we couldn't handle the fault,
+	 * make sure we exit gracefully rather than endlessly redo
+	 * the fault.
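+	 * The first attempt runs with FAULT_FLAG_ALLOW_RETRY set; if
+	 * the handler dropped mmap_lock and reported VM_FAULT_RETRY,
+	 * the flag is cleared below and the fault is retried, at most
+	 * once more.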
+ */ + fault = handle_mm_fault(vma, address, flags, NULL); + + if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + return; + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); + + if (unlikely(fault & VM_FAULT_ERROR)) { + if (fault & VM_FAULT_OOM) + goto out_of_memory; + else if (fault & VM_FAULT_SIGSEGV) + goto bad_area; + else if (fault & VM_FAULT_SIGBUS) + goto do_sigbus; + BUG(); + } + if (flags & FAULT_FLAG_ALLOW_RETRY) { + if (fault & VM_FAULT_MAJOR) { + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, + regs, address); + current->maj_flt++; + } else { + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, + regs, address); + current->min_flt++; + } + if (fault & VM_FAULT_RETRY) { + flags &= ~FAULT_FLAG_ALLOW_RETRY; + + /* + * No need to up_read(&mm->mmap_lock) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ + + goto retry; + } + } + + up_read(&mm->mmap_lock); + + return; + + /* + * Something tried to access memory that isn't in our memory map. + * Fix it, but check if it's kernel or user first. + */ + bad_area: + up_read(&mm->mmap_lock); + + if (user_mode(regs)) + goto do_sigsegv; + + no_context: + /* Are we prepared to handle this fault as an exception? */ + fixup = search_exception_tables(regs->pc); + if (fixup != 0) { + unsigned long newpc; + + newpc = fixup_exception(dpf_reg, fixup, regs->pc); + regs->pc = newpc; + return; + } + + /* + * Oops. The kernel tried to access some bad page. We'll have to + * terminate things with extreme prejudice. + */ + pr_alert("Unable to handle kernel paging request at virtual address %016lx\n", + address); + die_if_kernel("Oops", regs, cause, (unsigned long *)regs - 16); + do_exit(SIGKILL); + + /* + * We ran out of memory, or some other thing happened to us that + * made us unable to handle the page fault gracefully. + */ + out_of_memory: + up_read(&mm->mmap_lock); + if (!user_mode(regs)) + goto no_context; + pagefault_out_of_memory(); + return; + + do_sigbus: + up_read(&mm->mmap_lock); + /* + * Send a sigbus, regardless of whether we were in kernel + * or user mode. + */ + force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *) address, 0); + if (!user_mode(regs)) + goto no_context; + return; + + do_sigsegv: + force_sig_fault(SIGSEGV, si_code, (void __user *) address, 0); + + if (unlikely(segv_debug_enabled)) { + pr_info("fault: want to send_segv: pid %d, cause = %#lx, mmcsr = %#lx, address = %#lx, pc %#lx\n", + current->pid, cause, mmcsr, address, regs->pc); + dik_show_regs(regs, (unsigned long *)regs-16); + show_all_vma(); + } + + return; +} diff --git a/arch/sw_64/mm/hugetlbpage.c b/arch/sw_64/mm/hugetlbpage.c new file mode 100644 index 000000000000..3c03709d441c --- /dev/null +++ b/arch/sw_64/mm/hugetlbpage.c @@ -0,0 +1,329 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SW64 Huge TLB Page Support for Kernel. + */ + +#include <linux/init.h> +#include <linux/fs.h> +#include <linux/mm.h> +#include <linux/sched/mm.h> +#include <linux/hugetlb.h> +#include <linux/pagemap.h> +#include <linux/err.h> +#include <linux/sysctl.h> +#include <asm/mman.h> +#include <asm/tlb.h> +#include <asm/tlbflush.h> +#include <asm/pgalloc.h> + +/* + * pmd_huge() returns 1 if @pmd is hugetlb related entry, that is normal + * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry. + * Otherwise, returns 0. 
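+ *
+ * The test below encodes this directly: masking pmd_val() with
+ * (_PAGE_VALID | _PAGE_PSE) yields _PAGE_VALID alone only for an
+ * ordinary page-table pointer, so any other non-none value (PSE set,
+ * or not valid at all) is treated as a hugetlb entry.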
+ */ +int pmd_huge(pmd_t pmd) +{ + return !pmd_none(pmd) && + (pmd_val(pmd) & (_PAGE_VALID | _PAGE_PSE)) != _PAGE_VALID; +} + +int pud_huge(pud_t pud) +{ + return 0; +} + +#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE +#define want_pmd_share() (1) +#else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ +#define want_pmd_share() (0) +#endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ + +pte_t *sw64_256m_hugepte_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr) +{ + int i; + struct page *page; + pmd_t *pmd; + pte_t *pte = NULL; + + pmd = pmd_alloc(mm, pud, addr); + if (pmd == NULL) + return NULL; + + pte = pte_alloc_map(mm, pmd, addr); + if (pte == NULL) + return NULL; + + page = virt_to_page(pte); + pmd_val(*pmd) = pmd_val(*pmd) | _PAGE_PSE | _PAGE_PHU; + for (i = 1; i < 32; i++) + pmd_val(*(pmd+i)) = pmd_val(*pmd); + return pte; +} + +pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pte_t *pte = NULL; + + pgd = pgd_offset(mm, addr); + p4d = p4d_alloc(mm, pgd, addr); + pud = pud_alloc(mm, p4d, addr); + if (pud) { + if (sz == PMD_SIZE) { + if (want_pmd_share() && pud_none(*pud)) + pte = huge_pmd_share(mm, addr, pud); + else + pte = (pte_t *)pmd_alloc(mm, pud, addr); + } else if (sz == (PMD_SIZE << 5)) { + pte = sw64_256m_hugepte_alloc(mm, pud, addr); + } else { + printk(" Unsupported page size %lx\n", sz); + return NULL; + } + } + BUG_ON(pte && !pte_none(*pte) && !pte_huge(*pte)); + + return pte; +} + +pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, + unsigned long sz) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd = NULL; + pte_t *pte = NULL; + + pgd = pgd_offset(mm, addr); + if (pgd_present(*pgd)) { + p4d = p4d_offset(pgd, addr); + if (p4d_present(*p4d)) { + pud = pud_offset(p4d, addr); + if (pud_present(*pud)) { + pmd = pmd_offset(pud, addr); + if (!pmd_present(*pmd)) + return NULL; + if (pmd_val(*pmd) & _PAGE_PHU) + pte = pte_offset_map(pmd, addr); + else + pte = (pte_t *) pmd; + } + } + } + return pte; +} + +static inline int sw64_huge_pmd_bad(pmd_t pmd) +{ + return !(((pmd_val(pmd) & ~_PFN_MASK) == _PAGE_TABLE) || + ((pmd_val(pmd) & _PAGE_PHU) == _PAGE_PHU)); +} + +static inline int sw64_huge_pmd_none_or_clear_bad(pmd_t *pmd) +{ + if (pmd_none(*pmd)) + return 1; + if (unlikely(sw64_huge_pmd_bad(*pmd))) { + pmd_clear_bad(pmd); + return 1; + } + return 0; +} + +static void sw64_huge_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd, + unsigned long addr) +{ + if ((((unsigned long)pmd & 0xffUL) == 0) && + ((pmd_val(*pmd) & _PAGE_PHU) == _PAGE_PHU)) { + pgtable_t token = pmd_pgtable(*pmd); + + pmd_clear(pmd); + pte_free_tlb(tlb, token, addr); + mm_dec_nr_ptes(tlb->mm); + } else { + pmd_clear(pmd); + } +} + +static inline void sw64_huge_free_pmd_range(struct mmu_gather *tlb, pud_t *pud, + unsigned long addr, unsigned long end, + unsigned long floor, unsigned long ceiling) +{ + pmd_t *pmd; + unsigned long next; + unsigned long start; + + start = addr; + pmd = pmd_offset(pud, addr); + do { + next = pmd_addr_end(addr, end); + if (sw64_huge_pmd_none_or_clear_bad(pmd)) + continue; + sw64_huge_free_pte_range(tlb, pmd, addr); + } while (pmd++, addr = next, addr != end); + + start &= PUD_MASK; + if (start < floor) + return; + if (ceiling) { + ceiling &= PUD_MASK; + if (!ceiling) + return; + } + if (end - 1 > ceiling - 1) + return; + + pmd = pmd_offset(pud, start); + pud_clear(pud); + pmd_free_tlb(tlb, pmd, start); + mm_dec_nr_pmds(tlb->mm); +} + +static inline void sw64_huge_free_pud_range(struct 
mmu_gather *tlb, p4d_t *p4d, + unsigned long addr, unsigned long end, + unsigned long floor, unsigned long ceiling) +{ + pud_t *pud; + unsigned long next; + unsigned long start; + + start = addr; + pud = pud_offset(p4d, addr); + do { + next = pud_addr_end(addr, end); + if (pud_none_or_clear_bad(pud)) + continue; + sw64_huge_free_pmd_range(tlb, pud, addr, next, floor, ceiling); + } while (pud++, addr = next, addr != end); + + start &= PGDIR_MASK; + if (start < floor) + return; + if (ceiling) { + ceiling &= PGDIR_MASK; + if (!ceiling) + return; + } + if (end - 1 > ceiling - 1) + return; + + pud = pud_offset(p4d, start); + p4d_clear(p4d); + pud_free_tlb(tlb, pud, start); + mm_dec_nr_puds(tlb->mm); +} + +#ifdef CONFIG_HUGETLB_PAGE +static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + struct hstate *h = hstate_file(file); + struct vm_unmapped_area_info info; + + info.flags = 0; + info.length = len; + info.low_limit = current->mm->mmap_legacy_base; + info.high_limit = TASK_SIZE; + info.align_mask = PAGE_MASK & ~huge_page_mask(h); + info.align_offset = 0; + return vm_unmapped_area(&info); +} + +static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file, + unsigned long addr0, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + struct hstate *h = hstate_file(file); + struct vm_unmapped_area_info info; + unsigned long addr; + + info.flags = VM_UNMAPPED_AREA_TOPDOWN; + info.length = len; + info.low_limit = PAGE_SIZE; + info.high_limit = current->mm->mmap_base; + info.align_mask = PAGE_MASK & ~huge_page_mask(h); + info.align_offset = 0; + addr = vm_unmapped_area(&info); + + /* + * A failed mmap() very likely causes application failure, + * so fall back to the bottom-up function here. This scenario + * can happen with large stack limits and large mmap() + * allocations. 
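+	 *
+	 * Note that vm_unmapped_area() reports failure as -ENOMEM,
+	 * which is not page aligned, so the (addr & ~PAGE_MASK) test
+	 * below doubles as the error check before retrying bottom-up
+	 * from TASK_UNMAPPED_BASE.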
+ */ + if (addr & ~PAGE_MASK) { + VM_BUG_ON(addr != -ENOMEM); + info.flags = 0; + info.low_limit = TASK_UNMAPPED_BASE; + info.high_limit = TASK_SIZE; + addr = vm_unmapped_area(&info); + } + + return addr; +} + +unsigned long +hugetlb_get_unmapped_area(struct file *file, unsigned long addr, + unsigned long len, unsigned long pgoff, unsigned long flags) +{ + struct hstate *h = hstate_file(file); + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + + if (len & ~huge_page_mask(h)) + return -EINVAL; + if (len > TASK_SIZE) + return -ENOMEM; + + if (flags & MAP_FIXED) { + if (prepare_hugepage_range(file, addr, len)) + return -EINVAL; + return addr; + } + + if (addr) { + addr = ALIGN(addr, huge_page_size(h)); + vma = find_vma(mm, addr); + if (TASK_SIZE - len >= addr && + (!vma || addr + len <= vma->vm_start)) + return addr; + } + if (mm->get_unmapped_area == arch_get_unmapped_area) + return hugetlb_get_unmapped_area_bottomup(file, addr, len, + pgoff, flags); + else + return hugetlb_get_unmapped_area_topdown(file, addr, len, + pgoff, flags); +} + +#if (defined(CONFIG_FORCE_MAX_ZONEORDER) && (CONFIG_FORCE_MAX_ZONEORDER >= 16)) +static __init int sw64_256m_hugetlb_init(void) +{ + if (!size_to_hstate(1UL << (PMD_SHIFT + 5))) + hugetlb_add_hstate(PMD_SHIFT + 5 - PAGE_SHIFT); + return 0; +} +arch_initcall(sw64_256m_hugetlb_init); +#endif +#endif /* CONFIG_HUGETLB_PAGE */ + +static __init int setup_hugepagesz(char *opt) +{ + unsigned long ps = memparse(opt, &opt); + + if (ps == PMD_SIZE) { + hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT); + } else if (ps == (PMD_SIZE << 5)) { + hugetlb_add_hstate(PMD_SHIFT + 5 - PAGE_SHIFT); + } else { + printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n", + ps >> 20); + return 0; + } + return 1; +} +__setup("hugepagesz=", setup_hugepagesz); diff --git a/arch/sw_64/mm/init.c b/arch/sw_64/mm/init.c new file mode 100644 index 000000000000..d0e934356dd5 --- /dev/null +++ b/arch/sw_64/mm/init.c @@ -0,0 +1,349 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 1995 Linus Torvalds + */ + +/* 2.3.x zone allocator, 1999 Andrea Arcangeli andrea@suse.de */ + +#include <linux/pagemap.h> +#include <linux/signal.h> +#include <linux/sched.h> +#include <linux/kernel.h> +#include <linux/errno.h> +#include <linux/string.h> +#include <linux/types.h> +#include <linux/ptrace.h> +#include <linux/mman.h> +#include <linux/mm.h> +#include <linux/swap.h> +#include <linux/init.h> +#include <linux/vmalloc.h> +#include <linux/gfp.h> +#include <linux/uaccess.h> +#include <linux/memblock.h> +#include <linux/dma-mapping.h> +#include <linux/swiotlb.h> +#include <linux/acpi.h> + +#include <asm/pgtable.h> +#include <asm/pgalloc.h> +#include <asm/dma.h> +#include <asm/mmu_context.h> +#include <asm/console.h> +#include <asm/tlb.h> +#include <asm/setup.h> +#include <asm/sections.h> +#include <asm/memory.h> +#include <asm/hw_init.h> + +extern void die_if_kernel(char *, struct pt_regs *, long); + +struct mem_desc_t mem_desc; +#ifndef CONFIG_NUMA +struct numa_node_desc_t numa_nodes_desc[1]; +#endif /* CONFIG_NUMA */ + +/* + * empty_zero_page is a special page that is used for + * zero-initialized data and COW. 
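+ *
+ * A read fault on an untouched anonymous mapping, for instance, can be
+ * serviced by mapping this page read-only; the first write then COWs a
+ * private copy, which is why the page itself must stay all zeroes.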
+ */
+struct page *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);
+pg_data_t *node_data[MAX_NUMNODES] __read_mostly;
+EXPORT_SYMBOL(node_data);
+
+pgd_t swapper_pg_dir[1024] __attribute__((__aligned__(PAGE_SIZE)));
+static pud_t vmalloc_pud[1024] __attribute__((__aligned__(PAGE_SIZE)));
+
+static phys_addr_t mem_start;
+static phys_addr_t mem_size_limit;
+
+static int __init setup_mem_size(char *p)
+{
+	char *oldp;
+	unsigned long start, size;
+
+	start = 0;
+	oldp = p;
+	size = memparse(p, &p);
+	if (p == oldp)
+		return -EINVAL;
+
+	if (*p == '@')
+		start = memparse(p + 1, &p);
+
+	mem_start = start;
+	mem_size_limit = size;
+	return 0;
+}
+early_param("mem", setup_mem_size);
+
+pgd_t *
+pgd_alloc(struct mm_struct *mm)
+{
+	pgd_t *ret, *init;
+
+	ret = (pgd_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+	init = pgd_offset(&init_mm, 0UL);
+	if (ret)
+		pgd_val(ret[PTRS_PER_PGD-2]) = pgd_val(init[PTRS_PER_PGD-2]);
+
+	return ret;
+}
+
+static inline unsigned long
+load_PCB(struct pcb_struct *pcb)
+{
+	register unsigned long sp __asm__("$30");
+	pcb->ksp = sp;
+	return __reload_thread(pcb);
+}
+
+/* Set up initial PCB, VPTB, and other such niceties. */
+
+static inline void
+switch_to_system_map(void)
+{
+	unsigned long newptbr;
+	unsigned long original_pcb_ptr;
+
+	/*
+	 * Initialize the kernel's page tables. Linux puts the vptb in
+	 * the last slot of the L1 page table.
+	 */
+	memset(swapper_pg_dir, 0, PAGE_SIZE);
+	newptbr = __pa(swapper_pg_dir) >> PAGE_SHIFT;
+
+	/* Also set up the real kernel PCB while we're at it. */
+	init_thread_info.pcb.ptbr = newptbr;
+	init_thread_info.pcb.flags = 1;	/* set FEN, clear everything else */
+	original_pcb_ptr = load_PCB(&init_thread_info.pcb);
+	tbia();
+}
+
+void __init callback_init(void)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+
+	switch_to_system_map();
+
+	/* Allocate one PGD and one PUD. */
+	pgd = pgd_offset_k(VMALLOC_START);
+	p4d = p4d_offset(pgd, VMALLOC_START);
+	p4d_set(p4d, (pud_t *)vmalloc_pud);
+}
+
+void __init zone_sizes_init(void)
+{
+	unsigned long max_zone_pfns[MAX_NR_ZONES];
+	unsigned long dma_pfn;
+
+	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
+
+	dma_pfn = PFN_DOWN(virt_to_phys((void *)MAX_DMA_ADDRESS));
+
+#ifdef CONFIG_ZONE_DMA32
+	max_zone_pfns[ZONE_DMA32] = min(dma_pfn, max_low_pfn);
+#endif
+	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
+
+	free_area_init(max_zone_pfns);
+}
+
+/*
+ * paging_init() sets up the memory map.
+ */
+void __init paging_init(void)
+{
+	void *zero_page;
+
+	zero_page = __va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
+	pr_info("zero page start: %p\n", zero_page);
+	memset(zero_page, 0, PAGE_SIZE);
+	empty_zero_page = virt_to_page(zero_page);
+}
+
+void __init mem_detect(void)
+{
+	int i;
+
+	mem_desc.phys_base = 0;
+	for (i = 0; i < MAX_NUMSOCKETS; i++) {
+		if (socket_desc[i].is_online)
+			mem_desc.phys_size += socket_desc[i].socket_mem;
+	}
+
+	if (mem_start >= NODE0_START) {
+		mem_desc.base = mem_start;
+	} else {
+		mem_desc.base = NODE0_START;
+		mem_size_limit -= NODE0_START - mem_start;
+	}
+
+	if (mem_size_limit && mem_size_limit < mem_desc.phys_size - NODE0_START)
+		mem_desc.size = mem_size_limit;
+	else
+		mem_desc.size = mem_desc.phys_size - NODE0_START;
+}
+
+void __init sw64_memblock_init(void)
+{
+	memblock_add(mem_desc.base, mem_desc.size);
+
+	memblock_remove(1ULL << MAX_PHYSMEM_BITS, PHYS_ADDR_MAX);
+
+	/* Make sure kernel text is in memory range.
*/ + memblock_add(__pa_symbol(_text), (unsigned long)(_end - _text)); + memblock_reserve(__pa_symbol(_text), _end - _text); + + max_pfn = max_low_pfn = PFN_DOWN(memblock_end_of_DRAM()); + + memblock_allow_resize(); + memblock_initialized = true; + process_memmap(); +} + +#ifndef CONFIG_NUMA +void __init sw64_numa_init(void) +{ + const size_t nd_size = roundup(sizeof(pg_data_t), SMP_CACHE_BYTES); + u64 nd_pa; + void *nd; + int tnid; + + memblock_set_node(mem_desc.base, mem_desc.size, &memblock.memory, 0); + nd_pa = memblock_phys_alloc(nd_size, SMP_CACHE_BYTES); + nd = __va(nd_pa); + + /* report and initialize */ + pr_info("NODE_DATA [mem %#018llx-%#018llx]\n", + nd_pa, nd_pa + nd_size - 1); + tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT); + if (tnid != 0) + pr_info("NODE_DATA(%d) on node %d\n", 0, tnid); + + node_data[0] = nd; + memset(NODE_DATA(0), 0, sizeof(pg_data_t)); + NODE_DATA(0)->node_id = 0; + NODE_DATA(0)->node_start_pfn = mem_desc.base >> PAGE_SHIFT; + NODE_DATA(0)->node_spanned_pages = mem_desc.size >> PAGE_SHIFT; + node_set_online(0); +} +#endif /* CONFIG_NUMA */ + +void __init +mem_init(void) +{ + set_max_mapnr(max_low_pfn); + high_memory = (void *) __va(max_low_pfn * PAGE_SIZE); +#ifdef CONFIG_SWIOTLB + swiotlb_init(1); +#endif + memblock_free_all(); + mem_init_print_info(NULL); +} + +#ifdef CONFIG_SPARSEMEM_VMEMMAP +int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, + struct vmem_altmap *altmap) +{ + return vmemmap_populate_basepages(start, end, node, altmap); +} + +void vmemmap_free(unsigned long start, unsigned long end, + struct vmem_altmap *altmap) +{ +} +#endif + +#ifdef CONFIG_DISCONTIGMEM +int pfn_valid(unsigned long pfn) +{ + phys_addr_t addr = pfn << PAGE_SHIFT; + + if ((addr >> PAGE_SHIFT) != pfn) + return 0; + return memblock_is_map_memory(addr); +} +EXPORT_SYMBOL(pfn_valid); +#endif + +#ifdef CONFIG_HAVE_MEMBLOCK +#ifndef MIN_MEMBLOCK_ADDR +#define MIN_MEMBLOCK_ADDR __pa(PAGE_OFFSET) +#endif +#ifndef MAX_MEMBLOCK_ADDR +#define MAX_MEMBLOCK_ADDR ((phys_addr_t)~0) +#endif +void __init early_init_dt_add_memory_arch(u64 base, u64 size) +{ + const u64 phys_offset = MIN_MEMBLOCK_ADDR; + + if (acpi_disabled) { + if (!PAGE_ALIGNED(base)) { + if (size < PAGE_SIZE - (base & ~PAGE_MASK)) { + pr_warn("Ignoring memory block 0x%llx - 0x%llx\n", + base, base + size); + return; + } + size -= PAGE_SIZE - (base & ~PAGE_MASK); + base = PAGE_ALIGN(base); + } + size &= PAGE_MASK; + + if (base > MAX_MEMBLOCK_ADDR) { + pr_warn("Ignoring memory block 0x%llx - 0x%llx\n", + base, base + size); + return; + } + + if (base + size - 1 > MAX_MEMBLOCK_ADDR) { + pr_warn("Ignoring memory range 0x%llx - 0x%llx\n", + ((u64)MAX_MEMBLOCK_ADDR) + 1, base + size); + size = MAX_MEMBLOCK_ADDR - base + 1; + } + + if (base + size < phys_offset) { + pr_warn("Ignoring memory block 0x%llx - 0x%llx\n", + base, base + size); + return; + } + + if (base < phys_offset) { + pr_warn("Ignoring memory range 0x%llx - 0x%llx\n", + base, phys_offset); + size -= phys_offset - base; + base = phys_offset; + } + memblock_add(base, size); + } else + return; +} +#endif + +#ifdef CONFIG_MEMORY_HOTPLUG +int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, + bool want_memblock) +{ + unsigned long start_pfn = start >> PAGE_SHIFT; + unsigned long nr_pages = size >> PAGE_SHIFT; + int ret; + + ret = __add_pages(nid, start_pfn, nr_pages, altmap, want_memblock); + if (ret) + printk("%s: Problem encountered in __add_pages() as ret=%d\n", + __func__, ret); + + return ret; +} + +void 
arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) +{ + unsigned long start_pfn = start >> PAGE_SHIFT; + unsigned long nr_pages = size >> PAGE_SHIFT; + + __remove_pages(start_pfn, nr_pages, altmap); +} +#endif diff --git a/arch/sw_64/mm/mmap.c b/arch/sw_64/mm/mmap.c new file mode 100644 index 000000000000..a7a189fc36d6 --- /dev/null +++ b/arch/sw_64/mm/mmap.c @@ -0,0 +1,102 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/errno.h> +#include <linux/file.h> +#include <linux/mm.h> +#include <linux/mman.h> +#include <linux/random.h> +#include <linux/syscalls.h> + +#include <asm/current.h> + +unsigned long +arch_get_unmapped_area(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, + unsigned long flags) +{ + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + struct vm_unmapped_area_info info; + unsigned long limit; + + /* Support 32 bit heap. */ + if (current->personality & ADDR_LIMIT_32BIT) + limit = 0x80000000; + else + limit = TASK_SIZE; + + if (len > limit) + return -ENOMEM; + + if (flags & MAP_FIXED) { + if (addr + len > TASK_SIZE) + return -EINVAL; + + return addr; + } + + if (addr) { + addr = PAGE_ALIGN(addr); + + vma = find_vma(mm, addr); + if (TASK_SIZE - len >= addr && + (!vma || addr + len <= vm_start_gap(vma))) + return addr; + } + + info.flags = 0; + info.length = len; + info.low_limit = mm->mmap_base; + info.high_limit = limit; + info.align_mask = 0; + info.align_offset = pgoff << PAGE_SHIFT; + + return vm_unmapped_area(&info); +} + +unsigned long arch_mmap_rnd(void) +{ + unsigned long rnd; + + /* 8MB for 32bit, 256MB for 64bit */ + if (current->personality & ADDR_LIMIT_32BIT) + rnd = get_random_long() & 0x7ffffful; + else + rnd = get_random_long() & 0xffffffful; + + return rnd << PAGE_SHIFT; +} + +/* + * This function, called very early during the creation of a new process VM + * image, sets up which VM layout function to use: + */ +void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) +{ + unsigned long random_factor = 0UL; + + if (current->flags & PF_RANDOMIZE) + random_factor = arch_mmap_rnd(); + + /* + * Fall back to the standard layout if the personality bit is set, or + * if the expected stack growth is unlimited: + */ + mm->mmap_base = TASK_UNMAPPED_BASE + random_factor; + mm->get_unmapped_area = arch_get_unmapped_area; +} + +SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len, + unsigned long, prot, unsigned long, flags, unsigned long, fd, + unsigned long, off) +{ + unsigned long ret = -EINVAL; + + if ((off + PAGE_ALIGN(len)) < off) + goto out; + if (off & ~PAGE_MASK) + goto out; + ret = ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT); + out: + return ret; +} diff --git a/arch/sw_64/mm/numa.c b/arch/sw_64/mm/numa.c new file mode 100644 index 000000000000..97288d91d7bb --- /dev/null +++ b/arch/sw_64/mm/numa.c @@ -0,0 +1,460 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * DISCONTIGMEM NUMA sw64 support. 
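+ *
+ * Initialization order, as wired up below: sw64_numa_init() runs
+ * numa_init() with an ACPI, devicetree or manual (per-socket table)
+ * callback; numa_init() allocates the distance table, lets the
+ * callback populate numa_nodes_parsed via numa_add_memblk(), then
+ * registers the parsed nodes and builds the node-to-cpumask map.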
+ */ + +#include <linux/types.h> +#include <linux/kernel.h> +#include <linux/mm.h> +#include <linux/memblock.h> +#include <linux/swap.h> +#include <linux/initrd.h> +#include <linux/pfn.h> +#include <linux/module.h> +#include <linux/cpuset.h> +#include <linux/init.h> +#ifdef CONFIG_PCI +#include <linux/pci.h> +#endif +#include <linux/acpi.h> +#include <linux/of.h> + +#include <asm/pgalloc.h> +#include <asm/sections.h> +#include <asm/sw64_init.h> +#include <asm/hw_init.h> +#include <asm/memory.h> +#include <asm/core.h> + +int cpu_to_node_map[NR_CPUS]; +cpumask_var_t node_to_cpumask_map[MAX_NUMNODES]; +EXPORT_SYMBOL(node_to_cpumask_map); + +struct numa_node_desc_t numa_nodes_desc[MAX_NUMNODES]; +nodemask_t numa_nodes_parsed __initdata; + +static int numa_distance_cnt; +static u8 *numa_distance; +static bool numa_off; + +static __init int numa_setup(char *opt) +{ + if (!opt) + return -EINVAL; + if (!strncmp(opt, "off", 3)) + numa_off = 1; + return 0; +} +early_param("numa", numa_setup); + +/* + * Allocate node_to_cpumask_map based on number of available nodes + * Requires node_possible_map to be valid. + * + * Note: cpumask_of_node() is not valid until after this is done. + * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.) + */ +static void __init setup_node_to_cpumask_map(void) +{ + int node; + + /* setup nr_node_ids if not done yet */ + if (nr_node_ids == MAX_NUMNODES) + setup_nr_node_ids(); + + /* allocate and clear the mapping */ + for (node = 0; node < nr_node_ids; node++) { + alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]); + cpumask_clear(node_to_cpumask_map[node]); + } + + /* cpumask_of_node() will now work */ + pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids); +} + +/** + * numa_add_memblk - Set node id to memblk + * @nid: NUMA node ID of the new memblk + * @start: Start address of the new memblk + * @end: End address of the new memblk + * + * RETURNS: + * 0 on success, -errno on failure. + */ +int __init numa_add_memblk(int nid, u64 start, u64 end) +{ + int ret; + + ret = memblock_set_node(start, (end - start), &memblock.memory, nid); + if (ret < 0) { + pr_err("memblock [0x%llx - 0x%llx] failed to add on node %d\n", + start, (end - 1), nid); + return ret; + } + + node_set(nid, numa_nodes_parsed); + return ret; +} + +/** + * Initialize NODE_DATA for a node on the local memory + */ +static void __init setup_node_data(int nid, unsigned long start_pfn, unsigned long end_pfn) +{ + const size_t nd_size = roundup(sizeof(pg_data_t), SMP_CACHE_BYTES); + u64 nd_pa; + void *nd; + int tnid; + + if (start_pfn >= end_pfn) + pr_info("Initmem setup node %d [<memory-less node>]\n", nid); + + nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid); + nd = __va(nd_pa); + + /* report and initialize */ + pr_info("NODE_DATA [mem %#018llx-%#018llx]\n", + nd_pa, nd_pa + nd_size - 1); + tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT); + if (tnid != nid) + pr_info("NODE_DATA(%d) on node %d\n", nid, tnid); + + node_data[nid] = nd; + memset(NODE_DATA(nid), 0, sizeof(pg_data_t)); + NODE_DATA(nid)->node_id = nid; + NODE_DATA(nid)->node_start_pfn = start_pfn; + NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn; +} + +/** + * numa_free_distance + * + * Free current distance table. 
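+ *
+ * The table is a flat cnt * cnt array of u8, indexed as
+ * numa_distance[from * numa_distance_cnt + to], which is also the
+ * size computation used below.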
+ */ +void __init numa_free_distance(void) +{ + size_t size; + + if (!numa_distance) + return; + + size = numa_distance_cnt * numa_distance_cnt * + sizeof(numa_distance[0]); + + memblock_free(__pa(numa_distance), size); + numa_distance_cnt = 0; + numa_distance = NULL; +} + +/** + * + * Create a new NUMA distance table. + * + */ +static int __init numa_alloc_distance(void) +{ + size_t size; + u64 phys; + int i, j; + + size = nr_node_ids * nr_node_ids * sizeof(numa_distance[0]); + phys = memblock_find_in_range(0, PFN_PHYS(max_pfn), + size, PAGE_SIZE); + if (WARN_ON(!phys)) + return -ENOMEM; + + memblock_reserve(phys, size); + + numa_distance = __va(phys); + numa_distance_cnt = nr_node_ids; + + /* fill with the default distances */ + for (i = 0; i < numa_distance_cnt; i++) + for (j = 0; j < numa_distance_cnt; j++) { + numa_distance[i * numa_distance_cnt + j] = i == j ? + LOCAL_DISTANCE : REMOTE_DISTANCE; + } + + pr_info("Initialized distance table, cnt=%d\n", numa_distance_cnt); + + return 0; +} + +/** + * numa_set_distance - Set inter node NUMA distance from node to node. + * @from: the 'from' node to set distance + * @to: the 'to' node to set distance + * @distance: NUMA distance + * + * Set the distance from node @from to @to to @distance. + * If distance table doesn't exist, a warning is printed. + * + * If @from or @to is higher than the highest known node or lower than zero + * or @distance doesn't make sense, the call is ignored. + * + */ +void __init numa_set_distance(int from, int to, int distance) +{ + if (!numa_distance) { + pr_warn_once("Warning: distance table not allocated yet\n"); + return; + } + + if (from >= numa_distance_cnt || to >= numa_distance_cnt || + from < 0 || to < 0) { + pr_warn_once("Warning: node ids are out of bound, from=%d to=%d distance=%d\n", + from, to, distance); + return; + } + + if ((u8)distance != distance || + (from == to && distance != LOCAL_DISTANCE)) { + pr_warn_once("Warning: invalid distance parameter, from=%d to=%d distance=%d\n", + from, to, distance); + return; + } + + numa_distance[from * numa_distance_cnt + to] = distance; +} + +/** + * Return NUMA distance @from to @to + */ +int __node_distance(int from, int to) +{ + if (from >= numa_distance_cnt || to >= numa_distance_cnt) + return from == to ? 
LOCAL_DISTANCE : REMOTE_DISTANCE; + return numa_distance[from * numa_distance_cnt + to]; +} +EXPORT_SYMBOL(__node_distance); + +static int __init numa_register_nodes(void) +{ + int nid; + struct memblock_region *mblk; + + /* Check that valid nid is set to memblks */ + for_each_mem_region(mblk) { + pr_info("memblk node %d [mem %#018llx-%#018llx]\n", + mblk->nid, mblk->base, + mblk->base + mblk->size - 1); + if (mblk->nid == NUMA_NO_NODE || mblk->nid >= MAX_NUMNODES) { + pr_warn("Warning: invalid memblk node %d [mem %#018llx-%#018llx]\n", + mblk->nid, mblk->base, + mblk->base + mblk->size - 1); + return -EINVAL; + } + } + + /* Finally register nodes */ + for_each_node_mask(nid, numa_nodes_parsed) { + unsigned long start_pfn, end_pfn; + + get_pfn_range_for_nid(nid, &start_pfn, &end_pfn); + setup_node_data(nid, start_pfn, end_pfn); + node_set_online(nid); + } + + /* Setup online nodes to actual nodes */ + node_possible_map = numa_nodes_parsed; + + return 0; +} + +static int __init numa_init(int (*init_func)(void)) +{ + int ret; + + nodes_clear(numa_nodes_parsed); + nodes_clear(node_possible_map); + nodes_clear(node_online_map); + numa_free_distance(); + + ret = numa_alloc_distance(); + if (ret < 0) + return ret; + + ret = init_func(); + if (ret < 0) + return ret; + + if (nodes_empty(numa_nodes_parsed)) { + pr_info("No NUMA configuration found\n"); + return -EINVAL; + } + + ret = numa_register_nodes(); + if (ret < 0) + return ret; + + setup_node_to_cpumask_map(); + + return 0; +} + +static void __init get_numa_info_socket(void) +{ + int i; + + phys_addr_t base = 0; + + for (i = 0; i < MAX_NUMSOCKETS; i++) { + if (socket_desc[i].is_online) { + numa_nodes_desc[i].base = base; + numa_nodes_desc[i].size = socket_desc[i].socket_mem; + base += numa_nodes_desc[i].size; + } + } +} + +static int __init manual_numa_init(void) +{ + int ret, nid; + struct memblock_region *mblk; + phys_addr_t node_base, node_size, node_end; + + if (numa_off) { + pr_info("NUMA disabled\n"); /* Forced off on command line. */ + pr_info("Faking one node at [mem %#018llx-%#018llx]\n", + memblock_start_of_DRAM(), memblock_end_of_DRAM() - 1); + for_each_mem_region(mblk) { + ret = numa_add_memblk(0, mblk->base, mblk->base + mblk->size); + if (!ret) + continue; + + pr_err("NUMA init failed\n"); + return ret; + } + } else { + get_numa_info_socket(); + + for (nid = 0; nid < MAX_NUMNODES; nid++) { + node_base = numa_nodes_desc[nid].base; + node_size = numa_nodes_desc[nid].size; + node_end = node_base + node_size; + ret = 0; + + if (!node_end) + continue; + + for_each_mem_region(mblk) { + if (mblk->base >= node_base && mblk->base < node_end) { + if (mblk->base + mblk->size < node_end) + ret = numa_add_memblk(nid, mblk->base, mblk->base + mblk->size); + else + ret = numa_add_memblk(nid, mblk->base, node_end); + } + } + + if (!node_size) { + memblock_add_node(node_base, node_size, nid); + node_set(nid, numa_nodes_parsed); + pr_info("Setup empty node %d from %#llx\n", nid, node_base); + } + + if (!ret) + continue; + + pr_err("NUMA init failed for node %d, [mem %#018llx-%#018llx]", + nid, node_base, node_end - 1); + } + } + + return 0; +} + +/* We do not have acpi support. 
*/ +int acpi_numa_init(void) +{ + return -1; +} + +void __init sw64_numa_init(void) +{ + if (!numa_off) { + if (!acpi_disabled && !numa_init(acpi_numa_init)) + return; + if (acpi_disabled && !numa_init(of_numa_init)) + return; + } + + numa_init(manual_numa_init); +} + +void cpu_set_node(void) +{ + int i; + + if (numa_off) { + for (i = 0; i < nr_cpu_ids; i++) + cpu_to_node_map[i] = 0; + } else { + int rr, default_node, cid; + + rr = first_node(node_online_map); + for (i = 0; i < nr_cpu_ids; i++) { + cid = cpu_to_rcid(i); + default_node = cid >> CORES_PER_NODE_SHIFT; + if (node_online(default_node)) { + cpu_to_node_map[i] = default_node; + } else { + cpu_to_node_map[i] = rr; + rr = next_node(rr, node_online_map); + if (rr == MAX_NUMNODES) + rr = first_node(node_online_map); + } + } + } + /* + * Setup numa_node for cpu 0 before per_cpu area for booting. + * Actual setup of numa_node will be done in native_smp_prepare_cpus(). + */ + set_cpu_numa_node(0, cpu_to_node_map[0]); +} + +void numa_store_cpu_info(unsigned int cpu) +{ + set_cpu_numa_node(cpu, cpu_to_node_map[cpu]); +} + +/* + * Returns a pointer to the bitmask of CPUs on Node 'node'. + */ +const struct cpumask *cpumask_of_node(int node) +{ + + if (node == NUMA_NO_NODE) + return cpu_all_mask; + + if (WARN_ON(node < 0 || node >= nr_node_ids)) + return cpu_none_mask; + + if (WARN_ON(node_to_cpumask_map[node] == NULL)) + return cpu_online_mask; + + return node_to_cpumask_map[node]; +} +EXPORT_SYMBOL(cpumask_of_node); + +static void numa_update_cpu(unsigned int cpu, bool remove) +{ + int nid = cpu_to_node(cpu); + + if (nid == NUMA_NO_NODE) + return; + + if (remove) + cpumask_clear_cpu(cpu, node_to_cpumask_map[nid]); + else + cpumask_set_cpu(cpu, node_to_cpumask_map[nid]); +} + +void numa_add_cpu(unsigned int cpu) +{ + numa_update_cpu(cpu, false); +} + +void numa_remove_cpu(unsigned int cpu) +{ + numa_update_cpu(cpu, true); +} diff --git a/arch/sw_64/mm/physaddr.c b/arch/sw_64/mm/physaddr.c new file mode 100644 index 000000000000..d5cf83e671ae --- /dev/null +++ b/arch/sw_64/mm/physaddr.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/mmdebug.h> +#include <linux/module.h> +#include <linux/mm.h> +#include <asm/page.h> + +unsigned long __phys_addr(unsigned long x) +{ + unsigned long y = x; + + if (y >= __START_KERNEL_map) { + y -= __START_KERNEL_map; + VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE); + } else { + VIRTUAL_BUG_ON(y < PAGE_OFFSET); + y -= PAGE_OFFSET; + VIRTUAL_BUG_ON(!phys_addr_valid(y)); + } + return y; +} +EXPORT_SYMBOL(__phys_addr); + +bool __virt_addr_valid(unsigned long x) +{ + unsigned long y = x; + + if (y >= __START_KERNEL_map) { + y -= __START_KERNEL_map; + if (y >= KERNEL_IMAGE_SIZE) + return false; + } else { + if (y < PAGE_OFFSET) + return false; + y -= PAGE_OFFSET; + } + + return pfn_valid(y >> PAGE_SHIFT); +} +EXPORT_SYMBOL(__virt_addr_valid); diff --git a/arch/sw_64/mm/thp.c b/arch/sw_64/mm/thp.c new file mode 100644 index 000000000000..68260dd0e837 --- /dev/null +++ b/arch/sw_64/mm/thp.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/init.h> +#include <linux/fs.h> +#include <linux/mm.h> +#include <linux/hugetlb.h> +#include <linux/pagemap.h> +#include <linux/err.h> +#include <linux/sysctl.h> +#include <asm/mman.h> +#include <asm/tlb.h> +#include <asm/tlbflush.h> +#include <asm/pgalloc.h> + +int pmdp_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp, + pmd_t entry, int dirty) +{ + int changed = !pmd_same(*pmdp, entry); + + VM_BUG_ON(address & 
~HPAGE_PMD_MASK); + + if (changed && dirty) { + *pmdp = entry; + flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE); + } + + return changed; +} +int pmdp_test_and_clear_young(struct vm_area_struct *vma, + unsigned long addr, pmd_t *pmdp) +{ + int ret = 0; + + if (pmd_young(*pmdp)) + ret = test_and_clear_bit(_PAGE_BIT_ACCESSED, + (unsigned long *)pmdp); + return ret; +} + +int pmdp_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp) +{ + int young; + + VM_BUG_ON(address & ~HPAGE_PMD_MASK); + + young = pmdp_test_and_clear_young(vma, address, pmdp); + if (young) + flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE); + + return young; +} +void pmdp_splitting_flush(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp) +{ + int set; + + VM_BUG_ON(address & ~HPAGE_PMD_MASK); + set = !test_and_set_bit(_PAGE_BIT_SPLITTING, (unsigned long *)pmdp); + if (set) { + /* need tlb flush only to serialize against gup-fast */ + flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE); + } +} diff --git a/arch/sw_64/net/Makefile b/arch/sw_64/net/Makefile new file mode 100644 index 000000000000..d4663b4bf509 --- /dev/null +++ b/arch/sw_64/net/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Arch-specific network modules +# +obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o diff --git a/arch/sw_64/net/bpf_jit.h b/arch/sw_64/net/bpf_jit.h new file mode 100644 index 000000000000..2bf3ca6f3abd --- /dev/null +++ b/arch/sw_64/net/bpf_jit.h @@ -0,0 +1,343 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * BPF JIT compiler for SW64 + * + * Copyright (C) Mao Minkai + * Author: Mao Minkai + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. 
+ */
+
+#ifndef _SW64_BPF_JIT_H
+#define _SW64_BPF_JIT_H
+
+#define SW64_BPF_OPCODE_OFFSET		26
+#define SW64_BPF_RA_OFFSET		21
+#define SW64_BPF_RB_OFFSET		16
+#define SW64_BPF_SIMPLE_ALU_IMM_OFFSET	13
+#define SW64_BPF_SIMPLE_ALU_FUNC_OFFSET	5
+#define SW64_BPF_SIMPLE_ALU_RC_OFFSET	0
+
+#define SW64_BPF_OPCODE_BR_CALL	0x01
+#define SW64_BPF_OPCODE_BR_RET	0x02
+#define SW64_BPF_OPCODE_BR_JMP	0x03
+#define SW64_BPF_OPCODE_BR_BR	0x04
+#define SW64_BPF_OPCODE_BR_BSR	0x05
+#define SW64_BPF_OPCODE_BR_BEQ	0x30
+#define SW64_BPF_OPCODE_BR_BNE	0x31
+#define SW64_BPF_OPCODE_BR_BLT	0x32
+#define SW64_BPF_OPCODE_BR_BLE	0x33
+#define SW64_BPF_OPCODE_BR_BGT	0x34
+#define SW64_BPF_OPCODE_BR_BGE	0x35
+#define SW64_BPF_OPCODE_BR_BLBC	0x36
+#define SW64_BPF_OPCODE_BR_BLBS	0x37
+
+#define SW64_BPF_OPCODE_LS_LDBU	0x20
+#define SW64_BPF_OPCODE_LS_LDHU	0x21
+#define SW64_BPF_OPCODE_LS_LDW	0x22
+#define SW64_BPF_OPCODE_LS_LDL	0x23
+#define SW64_BPF_OPCODE_LS_STB	0x28
+#define SW64_BPF_OPCODE_LS_STH	0x29
+#define SW64_BPF_OPCODE_LS_STW	0x2A
+#define SW64_BPF_OPCODE_LS_STL	0x2B
+#define SW64_BPF_OPCODE_LS_LDI	0x3E
+#define SW64_BPF_OPCODE_LS_LDIH	0x3F
+
+#define SW64_BPF_OPCODE_ALU_REG	0x10
+#define SW64_BPF_OPCODE_ALU_IMM	0x12
+
+#define SW64_BPF_FUNC_ALU_ADDW	0x00
+#define SW64_BPF_FUNC_ALU_SUBW	0x01
+#define SW64_BPF_FUNC_ALU_ADDL	0x08
+#define SW64_BPF_FUNC_ALU_SUBL	0x09
+#define SW64_BPF_FUNC_ALU_MULW	0x10
+#define SW64_BPF_FUNC_ALU_MULL	0x18
+#define SW64_BPF_FUNC_ALU_ZAP	0x68
+#define SW64_BPF_FUNC_ALU_ZAPNOT	0x69
+#define SW64_BPF_FUNC_ALU_SEXTB	0x6A
+#define SW64_BPF_FUNC_ALU_SEXTH	0x6B
+
+#define SW64_BPF_OPCODE_BS_REG	0x10
+#define SW64_BPF_OPCODE_BS_IMM	0x12
+
+#define SW64_BPF_FUNC_BS_SLL	0x48
+#define SW64_BPF_FUNC_BS_SRL	0x49
+#define SW64_BPF_FUNC_BS_SRA	0x4A
+
+#define SW64_BPF_OPCODE_LOGIC_REG	0x10
+#define SW64_BPF_OPCODE_LOGIC_IMM	0x12
+
+#define SW64_BPF_FUNC_LOGIC_AND	0x38
+#define SW64_BPF_FUNC_LOGIC_BIC	0x39
+#define SW64_BPF_FUNC_LOGIC_BIS	0x3A
+#define SW64_BPF_FUNC_LOGIC_ORNOT	0x3B
+#define SW64_BPF_FUNC_LOGIC_XOR	0x3C
+#define SW64_BPF_FUNC_LOGIC_EQV	0x3D
+
+#define SW64_BPF_OPCODE_CMP_REG	0x10
+#define SW64_BPF_OPCODE_CMP_IMM	0x12
+
+#define SW64_BPF_FUNC_CMP_EQ	0x28
+#define SW64_BPF_FUNC_CMP_LT	0x29
+#define SW64_BPF_FUNC_CMP_LE	0x2A
+#define SW64_BPF_FUNC_CMP_ULT	0x2B
+#define SW64_BPF_FUNC_CMP_ULE	0x2C
+
+/* special instruction used in jit_fill_hole() */
+#define SW64_BPF_ILLEGAL_INSN	((1 << 25) | 0x80)
+
+enum sw64_bpf_registers {
+	SW64_BPF_REG_V0 = 0,	/* keep return value */
+	SW64_BPF_REG_T0 = 1,
+	SW64_BPF_REG_T1 = 2,
+	SW64_BPF_REG_T2 = 3,
+	SW64_BPF_REG_T3 = 4,
+	SW64_BPF_REG_T4 = 5,
+	SW64_BPF_REG_T5 = 6,
+	SW64_BPF_REG_T6 = 7,
+	SW64_BPF_REG_T7 = 8,
+	SW64_BPF_REG_S0 = 9,	/* callee saved */
+	SW64_BPF_REG_S1 = 10,	/* callee saved */
+	SW64_BPF_REG_S2 = 11,	/* callee saved */
+	SW64_BPF_REG_S3 = 12,	/* callee saved */
+	SW64_BPF_REG_S4 = 13,	/* callee saved */
+	SW64_BPF_REG_S5 = 14,	/* callee saved */
+	SW64_BPF_REG_S6 = 15,	/* callee saved */
+	SW64_BPF_REG_FP = 15,	/* frame pointer if necessary */
+	SW64_BPF_REG_A0 = 16,	/* argument 0 */
+	SW64_BPF_REG_A1 = 17,	/* argument 1 */
+	SW64_BPF_REG_A2 = 18,	/* argument 2 */
+	SW64_BPF_REG_A3 = 19,	/* argument 3 */
+	SW64_BPF_REG_A4 = 20,	/* argument 4 */
+	SW64_BPF_REG_A5 = 21,	/* argument 5 */
+	SW64_BPF_REG_T8 = 22,
+	SW64_BPF_REG_T9 = 23,
+	SW64_BPF_REG_T10 = 24,
+	SW64_BPF_REG_T11 = 25,
+	SW64_BPF_REG_RA = 26,	/* callee saved, keep return address */
+	SW64_BPF_REG_T12 = 27,
+	SW64_BPF_REG_PV = 27,
+	SW64_BPF_REG_AT = 28,	/* 
reserved by assembler */ + SW64_BPF_REG_GP = 29, /* global pointer */ + SW64_BPF_REG_SP = 30, /* callee saved, stack pointer */ + SW64_BPF_REG_ZR = 31 /* read 0 */ +}; + +/* SW64 load and store instructions */ +#define SW64_BPF_LDBU(dst, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_LDBU, dst, rb, offset16) +#define SW64_BPF_LDHU(dst, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_LDHU, dst, rb, offset16) +#define SW64_BPF_LDW(dst, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_LDW, dst, rb, offset16) +#define SW64_BPF_LDL(dst, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_LDL, dst, rb, offset16) +#define SW64_BPF_STB(src, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_STB, src, rb, offset16) +#define SW64_BPF_STH(src, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_STH, src, rb, offset16) +#define SW64_BPF_STW(src, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_STW, src, rb, offset16) +#define SW64_BPF_STL(src, rb, offset16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_STL, src, rb, offset16) +#define SW64_BPF_LDI(dst, rb, imm16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_LDI, dst, rb, imm16) +#define SW64_BPF_LDIH(dst, rb, imm16) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_LS_LDIH, dst, rb, imm16) + +/* SW64 ALU instructions REG format */ +#define SW64_BPF_ADDW_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_ADDW) +#define SW64_BPF_ADDL_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_ADDL) +#define SW64_BPF_SUBW_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_SUBW) +#define SW64_BPF_SUBL_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_SUBL) +#define SW64_BPF_MULW_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_MULW) +#define SW64_BPF_MULL_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_MULL) +#define SW64_BPF_ZAP_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_ZAP) +#define SW64_BPF_ZAPNOT_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + ra, rb, dst, SW64_BPF_FUNC_ALU_ZAPNOT) +#define SW64_BPF_SEXTB_REG(rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + 0, rb, dst, SW64_BPF_FUNC_ALU_SEXTB) +#define SW64_BPF_SEXTH_REG(rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_ALU_REG, \ + 0, rb, dst, SW64_BPF_FUNC_ALU_SEXTH) + +/* SW64 ALU instructions IMM format */ +#define SW64_BPF_ADDW_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_ADDW) +#define SW64_BPF_ADDL_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_ADDL) +#define SW64_BPF_SUBW_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_SUBW) +#define SW64_BPF_SUBL_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_SUBL) +#define SW64_BPF_MULW_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, 
imm8, dst, SW64_BPF_FUNC_ALU_MULW) +#define SW64_BPF_MULL_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_MULL) +#define SW64_BPF_ZAP_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_ZAP) +#define SW64_BPF_ZAPNOT_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_ALU_ZAPNOT) +#define SW64_BPF_SEXTB_IMM(imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_ALU_IMM, \ + 0, imm8, dst, SW64_BPF_FUNC_ALU_SEXTB) + +/* SW64 bit shift instructions REG format */ +#define SW64_BPF_SLL_REG(src, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_BS_REG, \ + src, rb, dst, SW64_BPF_FUNC_BS_SLL) +#define SW64_BPF_SRL_REG(src, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_BS_REG, \ + src, rb, dst, SW64_BPF_FUNC_BS_SRL) +#define SW64_BPF_SRA_REG(src, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_BS_REG, \ + src, rb, dst, SW64_BPF_FUNC_BS_SRA) + +/* SW64 bit shift instructions IMM format */ +#define SW64_BPF_SLL_IMM(src, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_BS_IMM, \ + src, imm8, dst, SW64_BPF_FUNC_BS_SLL) +#define SW64_BPF_SRL_IMM(src, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_BS_IMM, \ + src, imm8, dst, SW64_BPF_FUNC_BS_SRL) +#define SW64_BPF_SRA_IMM(src, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_BS_IMM, \ + src, imm8, dst, SW64_BPF_FUNC_BS_SRA) + +/* SW64 control instructions */ +#define SW64_BPF_CALL(ra, rb) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_BR_CALL, ra, rb, 0) +#define SW64_BPF_RET(rb) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_BR_RET, SW64_BPF_REG_ZR, rb, 0) +#define SW64_BPF_JMP(ra, rb) \ + sw64_bpf_gen_format_ls(SW64_BPF_OPCODE_BR_JMP, ra, rb, 0) +#define SW64_BPF_BR(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BR, ra, offset) +#define SW64_BPF_BSR(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BSR, ra, offset) +#define SW64_BPF_BEQ(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BEQ, ra, offset) +#define SW64_BPF_BNE(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BNE, ra, offset) +#define SW64_BPF_BLT(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BLT, ra, offset) +#define SW64_BPF_BLE(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BLE, ra, offset) +#define SW64_BPF_BGT(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BGT, ra, offset) +#define SW64_BPF_BGE(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BGE, ra, offset) +#define SW64_BPF_BLBC(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BLBC, ra, offset) +#define SW64_BPF_BLBS(ra, offset) \ + sw64_bpf_gen_format_br(SW64_BPF_OPCODE_BR_BLBS, ra, offset) + +/* SW64 bit logic instructions REG format */ +#define SW64_BPF_AND_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_LOGIC_REG, \ + ra, rb, dst, SW64_BPF_FUNC_LOGIC_AND) +#define SW64_BPF_ANDNOT_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_LOGIC_REG, \ + ra, rb, dst, SW64_BPF_FUNC_LOGIC_BIC) +#define SW64_BPF_OR_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_LOGIC_REG, \ + ra, rb, dst, SW64_BPF_FUNC_LOGIC_BIS) +#define SW64_BPF_ORNOT_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_LOGIC_REG, \ + ra, rb, dst, SW64_BPF_FUNC_LOGIC_ORNOT) +#define 
SW64_BPF_XOR_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_LOGIC_REG, \ + ra, rb, dst, SW64_BPF_FUNC_LOGIC_XOR) +#define SW64_BPF_EQV_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_LOGIC_REG, \ + ra, rb, dst, SW64_BPF_FUNC_LOGIC_EQV) + +/* SW64 bit logic instructions IMM format */ +#define SW64_BPF_AND_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_LOGIC_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_LOGIC_AND) +#define SW64_BPF_ANDNOT_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_LOGIC_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_LOGIC_BIC) +#define SW64_BPF_OR_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_LOGIC_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_LOGIC_BIS) +#define SW64_BPF_ORNOT_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_LOGIC_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_LOGIC_ORNOT) +#define SW64_BPF_XOR_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_LOGIC_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_LOGIC_XOR) +#define SW64_BPF_EQV_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_LOGIC_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_LOGIC_EQV) + +/* SW64 compare instructions REG format */ +#define SW64_BPF_CMPEQ_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_CMP_REG, \ + ra, rb, dst, SW64_BPF_FUNC_CMP_EQ) +#define SW64_BPF_CMPLT_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_CMP_REG, \ + ra, rb, dst, SW64_BPF_FUNC_CMP_LT) +#define SW64_BPF_CMPLE_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_CMP_REG, \ + ra, rb, dst, SW64_BPF_FUNC_CMP_LE) +#define SW64_BPF_CMPULT_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_CMP_REG, \ + ra, rb, dst, SW64_BPF_FUNC_CMP_ULT) +#define SW64_BPF_CMPULE_REG(ra, rb, dst) \ + sw64_bpf_gen_format_simple_alu_reg(SW64_BPF_OPCODE_CMP_REG, \ + ra, rb, dst, SW64_BPF_FUNC_CMP_ULE) + +/* SW64 compare instructions imm format */ +#define SW64_BPF_CMPEQ_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_CMP_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_CMP_EQ) +#define SW64_BPF_CMPLT_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_CMP_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_CMP_LT) +#define SW64_BPF_CMPLE_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_CMP_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_CMP_LE) +#define SW64_BPF_CMPULT_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_CMP_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_CMP_ULT) +#define SW64_BPF_CMPULE_IMM(ra, imm8, dst) \ + sw64_bpf_gen_format_simple_alu_imm(SW64_BPF_OPCODE_CMP_IMM, \ + ra, imm8, dst, SW64_BPF_FUNC_CMP_ULE) + +#endif /* _SW64_BPF_JIT_H */ diff --git a/arch/sw_64/net/bpf_jit_comp.c b/arch/sw_64/net/bpf_jit_comp.c new file mode 100644 index 000000000000..102de82d69e1 --- /dev/null +++ b/arch/sw_64/net/bpf_jit_comp.c @@ -0,0 +1,973 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * BPF JIT compiler for SW64 + * + * Copyright (C) Mao Minkai + * Author: Mao Minkai + * + * This file is taken from arch/arm64/net/bpf_jit_comp.c + * Copyright (C) 2014-2016 Zi Shen Lim zlim.lnx@gmail.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see http://www.gnu.org/licenses/.
+ */
+
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/printk.h>
+
+#include <asm/cacheflush.h>
+
+#include "bpf_jit.h"
+
+#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
+#define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
+#define TCALL_CNT (MAX_BPF_JIT_REG + 2)
+
+/*
+ * TO-DO List:
+ *	DIV
+ *	MOD
+ */
+
+static const int bpf2sw64[] = {
+	/* return value from in-kernel function, and exit value from eBPF */
+	[BPF_REG_0] = SW64_BPF_REG_V0,
+	/* arguments from eBPF program to in-kernel function */
+	[BPF_REG_1] = SW64_BPF_REG_A1,
+	[BPF_REG_2] = SW64_BPF_REG_A2,
+	[BPF_REG_3] = SW64_BPF_REG_A3,
+	[BPF_REG_4] = SW64_BPF_REG_A4,
+	[BPF_REG_5] = SW64_BPF_REG_A5,
+	/* callee saved registers that in-kernel function will preserve */
+	[BPF_REG_6] = SW64_BPF_REG_S1,
+	[BPF_REG_7] = SW64_BPF_REG_S2,
+	[BPF_REG_8] = SW64_BPF_REG_S3,
+	[BPF_REG_9] = SW64_BPF_REG_S4,
+	/* read-only frame pointer to access stack */
+	[BPF_REG_FP] = SW64_BPF_REG_S0,
+	/* temporary registers for internal BPF JIT */
+	[TMP_REG_1] = SW64_BPF_REG_T1,
+	[TMP_REG_2] = SW64_BPF_REG_T2,
+	/* tail_call_cnt */
+	[TCALL_CNT] = SW64_BPF_REG_S5,
+	/* temporary register for blinding constants */
+	[BPF_REG_AX] = SW64_BPF_REG_T12,
+};
+
+struct jit_ctx {
+	const struct bpf_prog *prog;
+	int idx;		// JITed instruction index
+	int epilogue_offset;
+	int *insn_offset;	// [bpf_insn_idx] = jited_insn_idx
+	u32 *image;		// JITed instruction
+	u32 stack_size;
+};
+
+struct sw64_jit_data {
+	struct bpf_binary_header *header;
+	u8 *image;		// bpf instruction
+	struct jit_ctx ctx;
+};
+
+static inline u32 sw64_bpf_gen_format_br(int opcode, enum sw64_bpf_registers ra, u32 disp)
+{
+	opcode = opcode << SW64_BPF_OPCODE_OFFSET;
+	ra = ra << SW64_BPF_RA_OFFSET;
+	return opcode | ra | disp;
+}
+
+static inline u32 sw64_bpf_gen_format_ls(int opcode, enum sw64_bpf_registers ra,
+		enum sw64_bpf_registers rb, u16 disp)
+{
+	opcode = opcode << SW64_BPF_OPCODE_OFFSET;
+	ra = ra << SW64_BPF_RA_OFFSET;
+	rb = rb << SW64_BPF_RB_OFFSET;
+	return opcode | ra | rb | disp;
+}
+
+static inline u32 sw64_bpf_gen_format_simple_alu_reg(int opcode, enum sw64_bpf_registers ra,
+		enum sw64_bpf_registers rb, enum sw64_bpf_registers rc, int function)
+{
+	opcode = opcode << SW64_BPF_OPCODE_OFFSET;
+	ra = ra << SW64_BPF_RA_OFFSET;
+	rb = rb << SW64_BPF_RB_OFFSET;
+	rc = rc << SW64_BPF_SIMPLE_ALU_RC_OFFSET;
+	function = function << SW64_BPF_SIMPLE_ALU_FUNC_OFFSET;
+	return opcode | ra | rb | function | rc;
+}
+
+/*
+ * Parameter order matches the *_IMM wrapper macros: (ra, imm8, dst).
+ * The shifted immediate is kept in a u32; shifting the u8 parameter
+ * in place would truncate it to zero.
+ */
+static inline u32 sw64_bpf_gen_format_simple_alu_imm(int opcode, enum sw64_bpf_registers ra,
+		u8 imm, enum sw64_bpf_registers rc, int function)
+{
+	u32 imm_field = (u32)imm << SW64_BPF_SIMPLE_ALU_IMM_OFFSET;
+
+	opcode = opcode << SW64_BPF_OPCODE_OFFSET;
+	ra = ra << SW64_BPF_RA_OFFSET;
+	rc = rc << SW64_BPF_SIMPLE_ALU_RC_OFFSET;
+	function = function << SW64_BPF_SIMPLE_ALU_FUNC_OFFSET;
+	return opcode | ra | imm_field | function | rc;
+}
+
+static inline void emit(const u32 insn, struct jit_ctx *ctx)
+{
+	if (ctx->image != NULL)
+		ctx->image[ctx->idx] = insn;
+
+	ctx->idx++;
+}
+
+static inline void emit_sw64_ldu64(const int dst, const u64 imm64, struct jit_ctx *ctx)
+{
+	u16 imm_tmp;
+	int reg_tmp = SW64_BPF_REG_T8;
+
+	imm_tmp = (imm64 >> 60) & 0xf;
+	emit(SW64_BPF_LDI(dst, SW64_BPF_REG_ZR, imm_tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(dst, 60, dst), ctx);
+
+	imm_tmp = (imm64 >> 45) & 0x7fff;
+	emit(SW64_BPF_LDI(reg_tmp, SW64_BPF_REG_ZR, imm_tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(reg_tmp, 45, reg_tmp), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, reg_tmp, dst), ctx);
+
+	imm_tmp = (imm64 >> 30) & 0x7fff;
+	emit(SW64_BPF_LDI(reg_tmp, SW64_BPF_REG_ZR, imm_tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(reg_tmp, 30, reg_tmp), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, reg_tmp, dst), ctx);
+
+	imm_tmp = (imm64 >> 15) & 0x7fff;
+	emit(SW64_BPF_LDI(reg_tmp, SW64_BPF_REG_ZR, imm_tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(reg_tmp, 15, reg_tmp), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, reg_tmp, dst), ctx);
+
+	imm_tmp = imm64 & 0x7fff;
+	emit(SW64_BPF_LDI(dst, dst, imm_tmp), ctx);
+}
+
+static inline void emit_sw64_ldu32(const int dst, const u32 imm32, struct jit_ctx *ctx)
+{
+	u16 imm_tmp;
+	int reg_tmp = SW64_BPF_REG_T8;
+
+	imm_tmp = (imm32 >> 30) & 3;
+	emit(SW64_BPF_LDI(dst, SW64_BPF_REG_ZR, imm_tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(dst, 30, dst), ctx);
+
+	imm_tmp = (imm32 >> 15) & 0x7fff;
+	emit(SW64_BPF_LDI(reg_tmp, SW64_BPF_REG_ZR, imm_tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(reg_tmp, 15, reg_tmp), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, reg_tmp, dst), ctx);
+
+	imm_tmp = imm32 & 0x7fff;
+	emit(SW64_BPF_LDI(dst, dst, imm_tmp), ctx);
+}
+
+static inline void emit_sw64_lds32(const int dst, const s32 imm32, struct jit_ctx *ctx)
+{
+	s16 hi = imm32 >> 16;
+	s16 lo = imm32 & 0xffff;
+	int reg_tmp = SW64_BPF_REG_T8;
+
+	emit(SW64_BPF_LDIH(dst, SW64_BPF_REG_ZR, hi), ctx);
+	if (lo & 0x8000) {	// sign bit is 1
+		lo = lo & 0x7fff;
+		emit(SW64_BPF_LDI(reg_tmp, SW64_BPF_REG_ZR, 1), ctx);
+		emit(SW64_BPF_SLL_IMM(reg_tmp, 15, reg_tmp), ctx);
+		emit(SW64_BPF_ADDL_REG(dst, reg_tmp, dst), ctx);
+		emit(SW64_BPF_LDI(dst, dst, lo), ctx);
+	} else {	// sign bit is 0
+		emit(SW64_BPF_LDI(dst, dst, lo), ctx);
+	}
+}
+
+/* dst = ra / rb */
+static void emit_sw64_div(const int ra, const int rb, const int dst, struct jit_ctx *ctx)
+{
+	pr_err("DIV is not supported for now.\n");
+}
+
+/* dst = ra % rb */
+static void emit_sw64_mod(const int ra, const int rb, const int dst, struct jit_ctx *ctx)
+{
+	pr_err("MOD is not supported for now.\n");
+}
+
+static void emit_sw64_htobe16(const int dst, struct jit_ctx *ctx)
+{
+	int tmp = SW64_BPF_REG_T8;
+
+	emit(SW64_BPF_LDI(tmp, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp, 0x2, tmp), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(dst, 0x1, dst), ctx);
+	/* the shift counts are constants, so use the IMM forms */
+	emit(SW64_BPF_SRL_IMM(tmp, 8, tmp), ctx);
+	emit(SW64_BPF_SLL_IMM(dst, 8, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp, dst), ctx);
+}
+
+static void emit_sw64_htobe32(const int dst, struct jit_ctx *ctx)
+{
+	int tmp1 = SW64_BPF_REG_T8;
+	int tmp2 = SW64_BPF_REG_T9;
+
+	/* swap bytes 0 and 3, keep bytes 1 and 2 in place */
+	emit(SW64_BPF_LDI(tmp1, dst, 0), ctx);
+	emit(SW64_BPF_LDI(tmp2, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp1, 0x1, tmp1), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp2, 0x8, tmp2), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(dst, 0x6, dst), ctx);
+	emit(SW64_BPF_SLL_IMM(tmp1, 24, tmp1), ctx);
+	emit(SW64_BPF_SRL_IMM(tmp2, 24, tmp2), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp2, dst), ctx);
+
+	/* swap bytes 1 and 2, keep bytes 0 and 3 in place */
+	emit(SW64_BPF_LDI(tmp1, dst, 0), ctx);
+	emit(SW64_BPF_LDI(tmp2, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp1, 0x2, tmp1), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp2, 0x4, tmp2), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(dst, 0x9, dst), ctx);
+	emit(SW64_BPF_SLL_IMM(tmp1, 8, tmp1), ctx);
+	emit(SW64_BPF_SRL_IMM(tmp2, 8, tmp2), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp2, dst), ctx);
+}
+
+static void emit_sw64_htobe64(const int dst, struct jit_ctx *ctx)
+{
+	int tmp1 = SW64_BPF_REG_T8;
+	int tmp2 = SW64_BPF_REG_T9;
+
+	/* swap bytes 0 and 7 */
+	emit(SW64_BPF_LDI(tmp1, dst, 0), ctx);
+	emit(SW64_BPF_LDI(tmp2, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp1, 0x1, tmp1), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp2, 0x80, tmp2), ctx);
+	emit(SW64_BPF_ZAP_IMM(dst, 0x81, dst), ctx);
+	emit(SW64_BPF_SLL_IMM(tmp1, 56, tmp1), ctx);
+	emit(SW64_BPF_SRL_IMM(tmp2, 56, tmp2), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp2, dst), ctx);
+
+	/* swap bytes 1 and 6 */
+	emit(SW64_BPF_LDI(tmp1, dst, 0), ctx);
+	emit(SW64_BPF_LDI(tmp2, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp1, 0x2, tmp1), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp2, 0x40, tmp2), ctx);
+	emit(SW64_BPF_ZAP_IMM(dst, 0x42, dst), ctx);
+	emit(SW64_BPF_SLL_IMM(tmp1, 40, tmp1), ctx);
+	emit(SW64_BPF_SRL_IMM(tmp2, 40, tmp2), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp2, dst), ctx);
+
+	/* swap bytes 2 and 5 */
+	emit(SW64_BPF_LDI(tmp1, dst, 0), ctx);
+	emit(SW64_BPF_LDI(tmp2, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp1, 0x4, tmp1), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp2, 0x20, tmp2), ctx);
+	emit(SW64_BPF_ZAP_IMM(dst, 0x24, dst), ctx);
+	emit(SW64_BPF_SLL_IMM(tmp1, 24, tmp1), ctx);
+	emit(SW64_BPF_SRL_IMM(tmp2, 24, tmp2), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp2, dst), ctx);
+
+	/* swap bytes 3 and 4 */
+	emit(SW64_BPF_LDI(tmp1, dst, 0), ctx);
+	emit(SW64_BPF_LDI(tmp2, dst, 0), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp1, 0x8, tmp1), ctx);
+	emit(SW64_BPF_ZAPNOT_IMM(tmp2, 0x10, tmp2), ctx);
+	emit(SW64_BPF_ZAP_IMM(dst, 0x18, dst), ctx);
+	emit(SW64_BPF_SLL_IMM(tmp1, 8, tmp1), ctx);
+	emit(SW64_BPF_SRL_IMM(tmp2, 8, tmp2), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx);
+	emit(SW64_BPF_ADDL_REG(dst, tmp2, dst), ctx);
+}
+
+static void jit_fill_hole(void *area, unsigned int size)
+{
+	u32 *ptr;
+
+	/* write the whole 32-bit illegal instruction, not its low byte */
+	for (ptr = area; size >= sizeof(u32); size -= sizeof(u32))
+		*ptr++ = SW64_BPF_ILLEGAL_INSN;
+}
+
+static int offset_to_epilogue(const struct jit_ctx *ctx)
+{
+	return ctx->epilogue_offset - ctx->idx;
+}
+
+/* For tail call to jump into */
+#define PROLOGUE_OFFSET 8
+
+static void build_prologue(struct jit_ctx *ctx, bool was_classic)
+{
+	const int r6 = bpf2sw64[BPF_REG_6];
+	const int r7 = bpf2sw64[BPF_REG_7];
+	const int r8 = bpf2sw64[BPF_REG_8];
+	const int r9 = bpf2sw64[BPF_REG_9];
+	const int fp = bpf2sw64[BPF_REG_FP];
+	const int tcc = bpf2sw64[TCALL_CNT];
+	const int tmp1 = bpf2sw64[TMP_REG_1];
+
+	/* Save callee-saved registers; 56 is a constant, so use the IMM form */
+	emit(SW64_BPF_SUBL_IMM(SW64_BPF_REG_SP, 56, SW64_BPF_REG_SP), ctx);
+	emit(SW64_BPF_STL(r6, SW64_BPF_REG_SP, 0), ctx);
+	emit(SW64_BPF_STL(r7, SW64_BPF_REG_SP, 8), ctx);
+	emit(SW64_BPF_STL(r8, SW64_BPF_REG_SP, 16), ctx);
+	emit(SW64_BPF_STL(r9, SW64_BPF_REG_SP, 24), ctx);
+	emit(SW64_BPF_STL(fp, SW64_BPF_REG_SP, 32), ctx);
+	emit(SW64_BPF_STL(tcc, SW64_BPF_REG_SP, 40), ctx);
+	emit(SW64_BPF_STL(SW64_BPF_REG_RA, SW64_BPF_REG_SP, 48), ctx);
+
+	/* Set up BPF prog stack base register */
+	emit(SW64_BPF_LDI(fp, SW64_BPF_REG_SP, 0), ctx);
+	if (!was_classic)
+		/* Initialize tail_call_cnt */
+		emit(SW64_BPF_LDI(tcc, SW64_BPF_REG_ZR, 0), ctx);
+
+	/* Set up function call stack */
+	ctx->stack_size = ctx->prog->aux->stack_depth;
+	emit_sw64_ldu32(tmp1, ctx->stack_size, ctx);
+	emit(SW64_BPF_SUBL_REG(SW64_BPF_REG_SP, tmp1, SW64_BPF_REG_SP), ctx);
+}
+
+static void build_epilogue(struct jit_ctx *ctx)
+{
int r6 = bpf2sw64[BPF_REG_6]; + const int r7 = bpf2sw64[BPF_REG_7]; + const int r8 = bpf2sw64[BPF_REG_8]; + const int r9 = bpf2sw64[BPF_REG_9]; + const int fp = bpf2sw64[BPF_REG_FP]; + const int tcc = bpf2sw64[TCALL_CNT]; + const int tmp1 = bpf2sw64[TMP_REG_1]; + + /* Destroy function call stack */ + emit_sw64_ldu32(tmp1, ctx->stack_size, ctx); + emit(SW64_BPF_ADDL_REG(SW64_BPF_REG_SP, tmp1, SW64_BPF_REG_SP), ctx); + + /* Restore callee-saved registers */ + emit(SW64_BPF_LDL(r6, SW64_BPF_REG_SP, 0), ctx); + emit(SW64_BPF_LDL(r7, SW64_BPF_REG_SP, 8), ctx); + emit(SW64_BPF_LDL(r8, SW64_BPF_REG_SP, 16), ctx); + emit(SW64_BPF_LDL(r9, SW64_BPF_REG_SP, 24), ctx); + emit(SW64_BPF_LDL(fp, SW64_BPF_REG_SP, 32), ctx); + emit(SW64_BPF_LDL(tcc, SW64_BPF_REG_SP, 40), ctx); + emit(SW64_BPF_LDL(SW64_BPF_REG_RA, SW64_BPF_REG_SP, 48), ctx); + emit(SW64_BPF_ADDL_REG(SW64_BPF_REG_SP, 56, SW64_BPF_REG_SP), ctx); + + /* Return */ + emit(SW64_BPF_RET(SW64_BPF_REG_RA), ctx); +} + +static int out_offset = -1; /* initialized on the first pass of build_body() */ +static int emit_bpf_tail_call(struct jit_ctx *ctx) +{ + /* bpf_tail_call(void *prog_ctx, struct bpf_array *array, u64 index) */ + const u8 r2 = bpf2sw64[BPF_REG_2]; /* struct bpf_array *array */ + const u8 r3 = bpf2sw64[BPF_REG_3]; /* u64 index */ + + const u8 tmp = bpf2sw64[TMP_REG_1]; + const u8 prg = bpf2sw64[TMP_REG_2]; + const u8 tcc = bpf2sw64[TCALL_CNT]; + const int idx0 = ctx->idx; +#define cur_offset (ctx->idx - idx0) +#define jmp_offset (out_offset - (cur_offset)) + u64 offset; + + /* if (index >= array->map.max_entries) + * goto out; + */ + offset = offsetof(struct bpf_array, map.max_entries); + emit_sw64_ldu64(tmp, offset, ctx); /* tmp = offset */ + emit(SW64_BPF_ADDL_REG(r2, tmp, tmp), ctx); /* tmp = r2 + tmp = &map.max_entries */ + emit(SW64_BPF_LDW(tmp, tmp, 0), ctx); /* tmp = *tmp = map.max_entries */ + emit(SW64_BPF_ZAPNOT_IMM(tmp, 0xf, tmp), ctx); /* map.max_entries is u32 */ + emit(SW64_BPF_SUBL_REG(r3, tmp, tmp), ctx); /* tmp = r3 - tmp = index - map.max_entries */ + emit(SW64_BPF_BGE(tmp, jmp_offset), ctx); + + /* if (tail_call_cnt > MAX_TAIL_CALL_CNT) + * goto out; + * tail_call_cnt++; + */ + emit(SW64_BPF_LDI(tmp, SW64_BPF_REG_ZR, MAX_TAIL_CALL_CNT), ctx); + emit(SW64_BPF_SUBL_REG(tcc, tmp, tmp), ctx); + emit(SW64_BPF_BGT(tmp, jmp_offset), ctx); + emit(SW64_BPF_ADDL_IMM(tcc, 1, tcc), ctx); + + /* prog = array->ptrs[index]; + * if (prog == NULL) + * goto out; + */ + offset = offsetof(struct bpf_array, ptrs); + emit_sw64_ldu64(tmp, offset, ctx); /* tmp = offset of ptrs */ + emit(SW64_BPF_ADDL_REG(r2, tmp, tmp), ctx); /* tmp = r2 + tmp = &ptrs */ + emit(SW64_BPF_SLL_IMM(r3, 3, prg), ctx); /* prg = r3 * 8, ptrs is 8 bit aligned */ + emit(SW64_BPF_ADDL_REG(tmp, prg, prg), ctx); /* prg = tmp + prg = &prog */ + emit(SW64_BPF_LDL(prg, prg, 0), ctx); /* prg = *prg = prog */ + emit(SW64_BPF_BEQ(prg, jmp_offset), ctx); + + /* goto *(prog->bpf_func + prologue_offset); */ + offset = offsetof(struct bpf_prog, bpf_func); + emit_sw64_ldu64(tmp, offset, ctx); /* tmp = offset */ + emit(SW64_BPF_ADDL_REG(prg, tmp, tmp), ctx); /* tmp = prg + tmp = &bpf_func */ + emit(SW64_BPF_LDW(tmp, tmp, 0), ctx); /* tmp = *tmp = bpf_func */ + emit(SW64_BPF_ZAPNOT_IMM(tmp, 0xf, tmp), ctx); /* bpf_func is unsigned int */ + emit(SW64_BPF_ADDL_REG(tmp, sizeof(u32) * PROLOGUE_OFFSET, tmp), ctx); + emit(SW64_BPF_ADDL_REG(SW64_BPF_REG_SP, ctx->stack_size, SW64_BPF_REG_SP), ctx); + emit(SW64_BPF_BR(tmp, 0), ctx); + + /* out */ + if (out_offset == -1) + out_offset = 
cur_offset; + if (cur_offset != out_offset) { + pr_err("tail_call out_offset = %d, expected %d!\n", + cur_offset, out_offset); + return -1; + } + return 0; +#undef cur_offset +#undef jmp_offset +} + +/* JITs an eBPF instruction. + * Returns: + * 0 - successfully JITed an 8-byte eBPF instruction. + * >0 - successfully JITed a 16-byte eBPF instruction. + * <0 - failed to JIT. + */ +static inline int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) +{ + const u8 code = insn->code; + const u8 dst = bpf2sw64[insn->dst_reg]; + const u8 src = bpf2sw64[insn->src_reg]; + const u8 tmp1 = bpf2sw64[TMP_REG_1]; + const u8 tmp2 = bpf2sw64[TMP_REG_2]; + const s16 off = insn->off; + const s32 imm = insn->imm; + int jmp_offset; + u64 func; + struct bpf_insn insn1; + u64 imm64; + + switch (code) { + case BPF_ALU | BPF_MOV | BPF_X: + case BPF_ALU64 | BPF_MOV | BPF_X: + emit(SW64_BPF_LDI(dst, src, 0), ctx); + break; + case BPF_ALU | BPF_ADD | BPF_X: + emit(SW64_BPF_ADDW_REG(dst, src, dst), ctx); + break; + case BPF_ALU64 | BPF_ADD | BPF_X: + emit(SW64_BPF_ADDL_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_SUB | BPF_X: + emit(SW64_BPF_SUBW_REG(dst, src, dst), ctx); + break; + case BPF_ALU64 | BPF_SUB | BPF_X: + emit(SW64_BPF_SUBL_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_MUL | BPF_X: + emit(SW64_BPF_MULW_REG(dst, src, dst), ctx); + break; + case BPF_ALU64 | BPF_MUL | BPF_X: + emit(SW64_BPF_MULL_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_DIV | BPF_X: + case BPF_ALU64 | BPF_DIV | BPF_X: + emit_sw64_div(dst, src, dst, ctx); + return -EINVAL; + case BPF_ALU | BPF_MOD | BPF_X: + case BPF_ALU64 | BPF_MOD | BPF_X: + emit_sw64_mod(dst, src, dst, ctx); + return -EINVAL; + case BPF_ALU | BPF_LSH | BPF_X: + case BPF_ALU64 | BPF_LSH | BPF_X: + emit(SW64_BPF_SLL_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_RSH | BPF_X: + emit(SW64_BPF_ZAPNOT_IMM(dst, 0xf, dst), ctx); + case BPF_ALU64 | BPF_RSH | BPF_X: + emit(SW64_BPF_SRL_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_ARSH | BPF_X: + case BPF_ALU64 | BPF_ARSH | BPF_X: + emit(SW64_BPF_SRA_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_AND | BPF_X: + case BPF_ALU64 | BPF_AND | BPF_X: + emit(SW64_BPF_AND_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_OR | BPF_X: + case BPF_ALU64 | BPF_OR | BPF_X: + emit(SW64_BPF_OR_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_XOR | BPF_X: + case BPF_ALU64 | BPF_XOR | BPF_X: + emit(SW64_BPF_XOR_REG(dst, src, dst), ctx); + break; + case BPF_ALU | BPF_NEG: + case BPF_ALU64 | BPF_NEG: + emit(SW64_BPF_SEXTB_IMM(0xff, tmp1), ctx); + emit(SW64_BPF_XOR_IMM(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_END | BPF_TO_LE: + switch (imm) { + case 16: + emit(SW64_BPF_ZAPNOT_IMM(dst, 0x3, dst), ctx); + break; + case 32: + emit(SW64_BPF_ZAPNOT_IMM(dst, 0xf, dst), ctx); + break; + case 64: + break; + } + case BPF_ALU | BPF_END | BPF_TO_BE: + switch (imm) { + case 16: + emit_sw64_htobe16(dst, ctx); + break; + case 32: + emit_sw64_htobe32(dst, ctx); + break; + case 64: + emit_sw64_htobe64(dst, ctx); + break; + } + + case BPF_ALU | BPF_MOV | BPF_K: + case BPF_ALU64 | BPF_MOV | BPF_K: + emit_sw64_lds32(dst, imm, ctx); + break; + case BPF_ALU | BPF_ADD | BPF_K: + case BPF_ALU64 | BPF_ADD | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_ADDL_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_SUB | BPF_K: + case BPF_ALU64 | BPF_SUB | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_SUBL_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU 
| BPF_MUL | BPF_K: + case BPF_ALU64 | BPF_MUL | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_MULL_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_DIV | BPF_K: + case BPF_ALU64 | BPF_DIV | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit_sw64_div(dst, src, tmp1, ctx); + return -EINVAL; + case BPF_ALU | BPF_MOD | BPF_K: + case BPF_ALU64 | BPF_MOD | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit_sw64_mod(dst, src, tmp1, ctx); + return -EINVAL; + case BPF_ALU | BPF_LSH | BPF_K: + case BPF_ALU64 | BPF_LSH | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_SLL_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_RSH | BPF_K: + emit(SW64_BPF_ZAPNOT_IMM(dst, 0xf, dst), ctx); + case BPF_ALU64 | BPF_RSH | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_SRL_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_ARSH | BPF_K: + case BPF_ALU64 | BPF_ARSH | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_SRA_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_AND | BPF_K: + case BPF_ALU64 | BPF_AND | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_AND_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_OR | BPF_K: + case BPF_ALU64 | BPF_OR | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_OR_REG(dst, tmp1, dst), ctx); + break; + case BPF_ALU | BPF_XOR | BPF_K: + case BPF_ALU64 | BPF_XOR | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + emit(SW64_BPF_XOR_REG(dst, tmp1, dst), ctx); + break; + + case BPF_JMP | BPF_JA: + emit(SW64_BPF_BR(SW64_BPF_REG_RA, off), ctx); + break; + + case BPF_JMP | BPF_JEQ | BPF_X: + case BPF_JMP | BPF_JGT | BPF_X: + case BPF_JMP | BPF_JLT | BPF_X: + case BPF_JMP | BPF_JGE | BPF_X: + case BPF_JMP | BPF_JLE | BPF_X: + case BPF_JMP | BPF_JNE | BPF_X: + case BPF_JMP | BPF_JSGT | BPF_X: + case BPF_JMP | BPF_JSLT | BPF_X: + case BPF_JMP | BPF_JSGE | BPF_X: + case BPF_JMP | BPF_JSLE | BPF_X: + case BPF_JMP | BPF_JSET | BPF_X: + switch (BPF_OP(code)) { + case BPF_JEQ: + emit(SW64_BPF_CMPEQ_REG(dst, src, tmp1), ctx); + break; + case BPF_JGT: + emit(SW64_BPF_CMPULT_REG(src, dst, tmp1), ctx); + break; + case BPF_JLT: + emit(SW64_BPF_CMPULT_REG(dst, src, tmp1), ctx); + break; + case BPF_JGE: + emit(SW64_BPF_CMPULE_REG(src, dst, tmp1), ctx); + break; + case BPF_JLE: + emit(SW64_BPF_CMPULE_REG(dst, src, tmp1), ctx); + break; + case BPF_JNE: + emit(SW64_BPF_CMPEQ_REG(dst, src, tmp1), ctx); + emit(SW64_BPF_XOR_IMM(tmp1, 1, tmp1), ctx); + break; + case BPF_JSGT: + emit(SW64_BPF_CMPLT_REG(src, dst, tmp1), ctx); + break; + case BPF_JSLT: + emit(SW64_BPF_CMPLT_REG(dst, src, tmp1), ctx); + break; + case BPF_JSGE: + emit(SW64_BPF_CMPLE_REG(src, dst, tmp1), ctx); + break; + case BPF_JSLE: + emit(SW64_BPF_CMPLE_REG(dst, src, tmp1), ctx); + break; + case BPF_JSET: + emit(SW64_BPF_AND_REG(dst, src, tmp1), ctx); + break; + } + emit(SW64_BPF_BLBS(tmp1, off), ctx); + break; + + case BPF_JMP | BPF_JEQ | BPF_K: + case BPF_JMP | BPF_JGT | BPF_K: + case BPF_JMP | BPF_JLT | BPF_K: + case BPF_JMP | BPF_JGE | BPF_K: + case BPF_JMP | BPF_JLE | BPF_K: + case BPF_JMP | BPF_JNE | BPF_K: + case BPF_JMP | BPF_JSGT | BPF_K: + case BPF_JMP | BPF_JSLT | BPF_K: + case BPF_JMP | BPF_JSGE | BPF_K: + case BPF_JMP | BPF_JSLE | BPF_K: + case BPF_JMP | BPF_JSET | BPF_K: + emit_sw64_lds32(tmp1, imm, ctx); + switch (BPF_OP(code)) { + case BPF_JEQ: + emit(SW64_BPF_CMPEQ_REG(dst, tmp1, tmp1), ctx); + break; + case BPF_JGT: + emit(SW64_BPF_CMPULT_REG(tmp1, dst, tmp1), ctx); + break; + case BPF_JLT: + emit(SW64_BPF_CMPULT_REG(dst, tmp1, tmp1), ctx); + 
break; + case BPF_JGE: + emit(SW64_BPF_CMPULE_REG(tmp1, dst, tmp1), ctx); + break; + case BPF_JLE: + emit(SW64_BPF_CMPULE_REG(dst, tmp1, tmp1), ctx); + break; + case BPF_JNE: + emit(SW64_BPF_CMPEQ_REG(dst, tmp1, tmp1), ctx); + emit(SW64_BPF_XOR_IMM(tmp1, 1, tmp1), ctx); + break; + case BPF_JSGT: + emit(SW64_BPF_CMPLT_REG(tmp1, dst, tmp1), ctx); + break; + case BPF_JSLT: + emit(SW64_BPF_CMPLT_REG(dst, tmp1, tmp1), ctx); + break; + case BPF_JSGE: + emit(SW64_BPF_CMPLE_REG(tmp1, dst, tmp1), ctx); + break; + case BPF_JSLE: + emit(SW64_BPF_CMPLE_REG(dst, tmp1, tmp1), ctx); + break; + case BPF_JSET: + emit(SW64_BPF_AND_REG(dst, tmp1, tmp1), ctx); + break; + } + emit(SW64_BPF_BLBS(tmp1, off), ctx); + break; + + case BPF_JMP | BPF_CALL: + func = (u64)__bpf_call_base + imm; + emit_sw64_ldu64(tmp1, func, ctx); + emit(SW64_BPF_CALL(SW64_BPF_REG_RA, tmp1), ctx); + break; + + case BPF_JMP | BPF_TAIL_CALL: + if (emit_bpf_tail_call(ctx)) + return -EFAULT; + break; + + case BPF_JMP | BPF_EXIT: + if (insn - ctx->prog->insnsi + 1 == ctx->prog->len) + break; + jmp_offset = (offset_to_epilogue(ctx) - 1) * 4; + // emit(SW64_BPF_BR(SW64_BPF_REG_ZR, jmp_offset), ctx); + // break; + emit_sw64_lds32(tmp1, jmp_offset, ctx); + emit(SW64_BPF_BR(tmp2, 0), ctx); + emit(SW64_BPF_ADDL_REG(tmp1, tmp2, tmp1), ctx); + emit(SW64_BPF_JMP(SW64_BPF_REG_ZR, tmp1), ctx); + break; + + case BPF_LD | BPF_IMM | BPF_DW: + insn1 = insn[1]; + imm64 = (u64)insn1.imm << 32 | (u32)imm; + emit_sw64_ldu64(dst, imm64, ctx); + + return 1; + + /* LDX: dst = *(size *)(src + off) */ + case BPF_LDX | BPF_MEM | BPF_W: + emit(SW64_BPF_LDW(dst, src, off), ctx); + break; + case BPF_LDX | BPF_MEM | BPF_H: + emit(SW64_BPF_LDHU(dst, src, off), ctx); + emit(SW64_BPF_SEXTH_REG(dst, dst), ctx); + break; + case BPF_LDX | BPF_MEM | BPF_B: + emit(SW64_BPF_LDBU(dst, src, off), ctx); + emit(SW64_BPF_SEXTB_REG(dst, dst), ctx); + break; + case BPF_LDX | BPF_MEM | BPF_DW: + emit(SW64_BPF_LDW(dst, src, off), ctx); + break; + + /* ST: *(size *)(dst + off) = imm */ + case BPF_ST | BPF_MEM | BPF_W: + case BPF_ST | BPF_MEM | BPF_H: + case BPF_ST | BPF_MEM | BPF_B: + case BPF_ST | BPF_MEM | BPF_DW: + /* Load imm to a register then store it */ + emit_sw64_lds32(tmp1, imm, ctx); + switch (BPF_SIZE(code)) { + case BPF_W: + emit(SW64_BPF_STW(tmp1, dst, off), ctx); + break; + case BPF_H: + emit(SW64_BPF_STH(tmp1, dst, off), ctx); + break; + case BPF_B: + emit(SW64_BPF_STB(tmp1, dst, off), ctx); + break; + case BPF_DW: + emit(SW64_BPF_STL(tmp1, dst, off), ctx); + break; + } + break; + + /* STX: *(size *)(dst + off) = src */ + case BPF_STX | BPF_MEM | BPF_W: + emit(SW64_BPF_STW(src, dst, off), ctx); + break; + case BPF_STX | BPF_MEM | BPF_H: + emit(SW64_BPF_STW(src, dst, off), ctx); + break; + case BPF_STX | BPF_MEM | BPF_B: + emit(SW64_BPF_STW(src, dst, off), ctx); + break; + case BPF_STX | BPF_MEM | BPF_DW: + emit(SW64_BPF_STW(src, dst, off), ctx); + break; + + /* STX XADD: lock *(u32 *)(dst + off) += src */ + case BPF_STX | BPF_XADD | BPF_W: + emit(SW64_BPF_LDW(tmp1, dst, off), ctx); + emit(SW64_BPF_ADDW_REG(tmp1, src, tmp1), ctx); + emit(SW64_BPF_STW(tmp1, dst, off), ctx); + break; + /* STX XADD: lock *(u64 *)(dst + off) += src */ + case BPF_STX | BPF_XADD | BPF_DW: + emit(SW64_BPF_LDL(tmp1, dst, off), ctx); + emit(SW64_BPF_ADDL_REG(tmp1, src, tmp1), ctx); + emit(SW64_BPF_STL(tmp1, dst, off), ctx); + break; + + default: + pr_err("unknown opcode %02x\n", code); + return -EINVAL; + } + + return 0; +} + +static int build_body(struct jit_ctx *ctx) +{ + const struct bpf_prog 
*prog = ctx->prog; + int i; + + for (i = 0; i < prog->len; i++) { + const struct bpf_insn *insn = &prog->insnsi[i]; + int ret; + + ret = build_insn(insn, ctx); + if (ret > 0) { + i++; + if (ctx->image == NULL) + ctx->insn_offset[i] = ctx->idx; + continue; + } + if (ctx->image == NULL) + ctx->insn_offset[i] = ctx->idx; + if (ret) + return ret; + } + + return 0; +} + +static int validate_code(struct jit_ctx *ctx) +{ + int i; + + for (i = 0; i < ctx->idx; i++) { + if (ctx->image[i] == SW64_BPF_ILLEGAL_INSN) + return -1; + } + + return 0; +} + +static inline void bpf_flush_icache(void *start, void *end) +{ + flush_icache_range((unsigned long)start, (unsigned long)end); +} + +struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) +{ + struct bpf_prog *tmp, *orig_prog = prog; + struct bpf_binary_header *header; + struct sw64_jit_data *jit_data; + bool was_classic = bpf_prog_was_classic(prog); + bool tmp_blinded = false; + bool extra_pass = false; + struct jit_ctx ctx; + int image_size; + u8 *image_ptr; + + if (!prog->jit_requested) + return orig_prog; + + tmp = bpf_jit_blind_constants(prog); + /* If blinding was requested and we failed during blinding, + * we must fall back to the interpreter. + */ + if (IS_ERR(tmp)) + return orig_prog; + if (tmp != prog) { + tmp_blinded = true; + prog = tmp; + } + + jit_data = prog->aux->jit_data; + if (!jit_data) { + jit_data = kzalloc(sizeof(*jit_data), GFP_KERNEL); + if (!jit_data) { + prog = orig_prog; + goto out; + } + prog->aux->jit_data = jit_data; + } + if (jit_data->ctx.insn_offset) { + ctx = jit_data->ctx; + image_ptr = jit_data->image; + header = jit_data->header; + extra_pass = true; + image_size = sizeof(u32) * ctx.idx; + goto skip_init_ctx; + } + memset(&ctx, 0, sizeof(ctx)); + ctx.prog = prog; + + ctx.insn_offset = kcalloc(prog->len, sizeof(int), GFP_KERNEL); + if (ctx.insn_offset == NULL) { + prog = orig_prog; + goto out_off; + } + + /* 1. Initial fake pass to compute ctx->idx. */ + + /* Fake pass to fill in ctx->offset. */ + build_prologue(&ctx, was_classic); + + if (build_body(&ctx)) { + prog = orig_prog; + goto out_off; + } + + ctx.epilogue_offset = ctx.idx; + build_epilogue(&ctx); + + /* Now we know the actual image size. */ + image_size = sizeof(u32) * ctx.idx; + header = bpf_jit_binary_alloc(image_size, &image_ptr, + sizeof(u32), jit_fill_hole); + if (header == NULL) { + prog = orig_prog; + goto out_off; + } + + /* 2. Now, the actual pass. */ + + ctx.image = (u32 *)image_ptr; +skip_init_ctx: + ctx.idx = 0; + + build_prologue(&ctx, was_classic); + + if (build_body(&ctx)) { + bpf_jit_binary_free(header); + prog = orig_prog; + goto out_off; + } + + build_epilogue(&ctx); + + /* 3. Extra pass to validate JITed code. */ + if (validate_code(&ctx)) { + bpf_jit_binary_free(header); + prog = orig_prog; + goto out_off; + } + + /* And we're done. */ + if (bpf_jit_enable > 1) + bpf_jit_dump(prog->len, image_size, 2, ctx.image); + + bpf_flush_icache(header, ctx.image + ctx.idx); + + if (!prog->is_func || extra_pass) { + bpf_jit_binary_lock_ro(header); + } else { + jit_data->ctx = ctx; + jit_data->image = image_ptr; + jit_data->header = header; + } + prog->bpf_func = (void *)ctx.image; + prog->jited = 1; + prog->jited_len = image_size; + + if (!prog->is_func || extra_pass) { +out_off: + kfree(ctx.insn_offset); + kfree(jit_data); + prog->aux->jit_data = NULL; + } +out: + if (tmp_blinded) + bpf_jit_prog_release_other(prog, prog == orig_prog ? 
+ tmp : orig_prog); + return prog; +} diff --git a/arch/sw_64/oprofile/Makefile b/arch/sw_64/oprofile/Makefile new file mode 100644 index 000000000000..4b304a91c3b2 --- /dev/null +++ b/arch/sw_64/oprofile/Makefile @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-2.0 +ccflags-y := -Werror -Wno-sign-compare + +obj-$(CONFIG_OPROFILE) += oprofile.o + +DRIVER_OBJS = $(addprefix ../../../drivers/oprofile/, \ + oprof.o cpu_buffer.o buffer_sync.o \ + event_buffer.o oprofile_files.o \ + oprofilefs.o oprofile_stats.o \ + timer_int.o ) + +oprofile-y := $(DRIVER_OBJS) common.o +oprofile-$(CONFIG_SW64) += op_model_sw2f.o diff --git a/arch/sw_64/oprofile/common.c b/arch/sw_64/oprofile/common.c new file mode 100644 index 000000000000..58ef9fcfdc92 --- /dev/null +++ b/arch/sw_64/oprofile/common.c @@ -0,0 +1,172 @@ +// SPDX-License-Identifier: GPL-2.0 +/** + * @file arch/sw_64/oprofile/common.c + * + * @remark Copyright 2002 OProfile authors + * @remark Read the file COPYING + * + * @author Richard Henderson rth@twiddle.net + */ + +#include <linux/oprofile.h> +#include <linux/init.h> +#include <linux/smp.h> +#include <linux/errno.h> +#include <asm/ptrace.h> +#include <asm/special_insns.h> + +#include "op_impl.h" + +extern struct op_axp_model op_model_sw2 __attribute__((weak)); +extern struct op_axp_model op_model_sw2 __attribute__((weak)); +extern struct op_axp_model op_model_pca56 __attribute__((weak)); +extern struct op_axp_model op_model_sw2 __attribute__((weak)); +extern struct op_axp_model op_model_sw2f __attribute__((weak)); + +static struct op_axp_model *model; + +extern void (*perf_irq)(unsigned long vector, struct pt_regs *regs); +static void (*save_perf_irq)(unsigned long vector, struct pt_regs *regs); + +static struct op_counter_config ctr[20]; +static struct op_system_config sys; +static struct op_register_config reg; + +/* Called from do_entInt to handle the performance monitor interrupt. */ + +static void +op_handle_interrupt(unsigned long which, struct pt_regs *regs) +{ + model->handle_interrupt(which, regs, ctr); + + /* + * If the user has selected an interrupt frequency that is + * not exactly the width of the counter, write a new value + * into the counter such that it'll overflow after N more + * events. + */ + if ((reg.need_reset >> which) & 1) + model->reset_ctr(&reg, which); +} + +static int +op_axp_setup(void) +{ + unsigned long i, e; + + /* Install our interrupt handler into the existing hook. */ + save_perf_irq = perf_irq; + perf_irq = op_handle_interrupt; + + /* Compute the mask of enabled counters. */ + for (i = e = 0; i < model->num_counters; ++i) + if (ctr[i].enabled) + e |= 1 << i; + reg.enable = e; + + /* Pre-compute the values to stuff in the hardware registers. */ + model->reg_setup(&reg, ctr, &sys); + + /* Configure the registers on all cpus. */ + (void)smp_call_function(model->cpu_setup, &reg, 1); + model->cpu_setup(&reg); + return 0; +} + +static void +op_axp_shutdown(void) +{ + /* Remove our interrupt handler. We may be removing this module. */ + perf_irq = save_perf_irq; +} + +static void +op_axp_cpu_start(void *dummy) +{ + wrperfmon(1, reg.enable); +} + +static int +op_axp_start(void) +{ + (void)smp_call_function(op_axp_cpu_start, NULL, 1); + op_axp_cpu_start(NULL); + return 0; +} + +static inline void +op_axp_cpu_stop(void *dummy) +{ + /* Disable performance monitoring for all counters.
*/ + wrperfmon(0, -1); +} + +static void +op_axp_stop(void) +{ + (void)smp_call_function(op_axp_cpu_stop, NULL, 1); + op_axp_cpu_stop(NULL); +} + +static int +op_axp_create_files(struct dentry *root) +{ + int i; + + for (i = 0; i < model->num_counters; ++i) { + struct dentry *dir; + char buf[4]; + + snprintf(buf, sizeof(buf), "%d", i); + dir = oprofilefs_mkdir(root, buf); + + oprofilefs_create_ulong(dir, "enabled", &ctr[i].enabled); + oprofilefs_create_ulong(dir, "event", &ctr[i].event); + oprofilefs_create_ulong(dir, "count", &ctr[i].count); + /* Dummies. */ + oprofilefs_create_ulong(dir, "kernel", &ctr[i].kernel); + oprofilefs_create_ulong(dir, "user", &ctr[i].user); + oprofilefs_create_ulong(dir, "unit_mask", &ctr[i].unit_mask); + } + + if (model->can_set_proc_mode) { + oprofilefs_create_ulong(root, "enable_pal", + &sys.enable_pal); + oprofilefs_create_ulong(root, "enable_kernel", + &sys.enable_kernel); + oprofilefs_create_ulong(root, "enable_user", + &sys.enable_user); + } + + return 0; +} + +int __init +oprofile_arch_init(struct oprofile_operations *ops) +{ + struct op_axp_model *lmodel = NULL; + + lmodel = &op_model_sw2f; + + if (!lmodel) + return -ENODEV; + model = lmodel; + + ops->create_files = op_axp_create_files; + ops->setup = op_axp_setup; + ops->shutdown = op_axp_shutdown; + ops->start = op_axp_start; + ops->stop = op_axp_stop; + ops->cpu_type = lmodel->cpu_type; + + pr_info("oprofile: using %s performance monitoring.\n", + lmodel->cpu_type); + + return 0; +} + + +void +oprofile_arch_exit(void) +{ +} diff --git a/arch/sw_64/oprofile/op_impl.h b/arch/sw_64/oprofile/op_impl.h new file mode 100644 index 000000000000..10bdd455c3dd --- /dev/null +++ b/arch/sw_64/oprofile/op_impl.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/** + * @file arch/sw_64/oprofile/op_impl.h + * + * @remark Copyright 2002 OProfile authors + * @remark Read the file COPYING + * + * @author Richard Henderson rth@twiddle.net + */ + +#ifndef _SW64_OPROFILE_OP_IMPL_H +#define _SW64_OPROFILE_OP_IMPL_H + +/* Per-counter configuration as set via oprofilefs. */ +struct op_counter_config { + unsigned long enabled; + unsigned long event; + unsigned long count; + /* Dummies because I am too lazy to hack the userspace tools. */ + unsigned long kernel; + unsigned long user; + unsigned long unit_mask; +}; + +/* System-wide configuration as set via oprofilefs. */ +struct op_system_config { + unsigned long enable_pal; + unsigned long enable_kernel; + unsigned long enable_user; +}; + +/* Cached values for the various performance monitoring registers. */ +struct op_register_config { + unsigned long enable; + unsigned long mux_select; + unsigned long proc_mode; + unsigned long freq; + unsigned long reset_values; + unsigned long need_reset; +}; + +/* Per-architecture configuration and hooks. 
*/ +struct op_axp_model { + void (*reg_setup)(struct op_register_config *reg, + struct op_counter_config *ctr, + struct op_system_config *sys); + void (*cpu_setup)(void *x); + void (*reset_ctr)(struct op_register_config *reg, unsigned long ctr); + void (*handle_interrupt)(unsigned long which, struct pt_regs *regs, + struct op_counter_config *ctr); + char *cpu_type; + unsigned char num_counters; + unsigned char can_set_proc_mode; +}; + +#endif diff --git a/arch/sw_64/oprofile/op_model_sw2f.c b/arch/sw_64/oprofile/op_model_sw2f.c new file mode 100644 index 000000000000..27140dc9004e --- /dev/null +++ b/arch/sw_64/oprofile/op_model_sw2f.c @@ -0,0 +1,280 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * @file arch/sw_64/oprofile/op_model_sw2f.c + * + * @remark Copyright 2002 OProfile authors + * @remark Read the file COPYING + * + * @author Richard Henderson rth@twiddle.net + * @author Falk Hueffner falk@debian.org + */ + +#include <linux/oprofile.h> +#include <linux/smp.h> +#include <asm/ptrace.h> + +#include "op_impl.h" + + +/* Compute all of the registers in preparation for enabling profiling. */ + +static void +sw2f_reg_setup(struct op_register_config *reg, + struct op_counter_config *ctr, + struct op_system_config *sys) +{ + unsigned long ctl, reset, need_reset, i; + + /* Select desired events. */ + ctl = 1UL << 4; /* Enable ProfileMe mode. */ + + /* + * The event numbers are chosen so we can use them directly if + * PCTR1 is enabled. + */ + if (ctr[1].enabled) { + ctl |= (ctr[1].event & 3) << 2; + } else { + if (ctr[0].event == 0) /* cycles */ + ctl |= 1UL << 2; + } + reg->mux_select = ctl; + + /* Select logging options. */ + /* + * ??? Need to come up with some mechanism to trace only + * selected processes. SW2f does not have a mechanism to + * select kernel or user mode only. For now, enable always. + */ + reg->proc_mode = 0; + + /* + * SW2f cannot change the width of the counters as with the + * other implementations. But fortunately, we can write to + * the counters and set the value such that it will overflow + * at the right time. + */ + reset = need_reset = 0; + for (i = 0; i < 2; ++i) { + unsigned long count = ctr[i].count; + + if (!ctr[i].enabled) + continue; + + if (count > 0x100000) + count = 0x100000; + ctr[i].count = count; + reset |= (0x100000 - count) << (i ? 6 : 28); + if (count != 0x100000) + need_reset |= 1 << i; + } + reg->reset_values = reset; + reg->need_reset = need_reset; +} + +/* Program all of the registers in preparation for enabling profiling. */ + +static void +sw2f_cpu_setup(void *x) +{ + struct op_register_config *reg = x; + + wrperfmon(2, reg->mux_select); + wrperfmon(3, reg->proc_mode); + wrperfmon(6, reg->reset_values | 3); +} + +/* + * CTR is a counter for which the user has requested an interrupt count + * in between one of the widths selectable in hardware. Reset the count + * for CTR to the value stored in REG->RESET_VALUES. + */ + +static void +sw2f_reset_ctr(struct op_register_config *reg, unsigned long ctr) +{ + wrperfmon(6, reg->reset_values | (1 << ctr)); +} + +/* + * ProfileMe conditions which will show up as counters. We can also + * detect the following, but it seems unlikely that anybody is + * interested in counting them: + * - Reset + * - MT_FPCR (write to floating point control register) + * - Arithmetic trap + * - Dstream Fault + * - Machine Check (ECC fault, etc.) 
+ * - OPCDEC (illegal opcode) + * - Floating point disabled + * - Differentiate between DTB single/double misses and 3 or 4 level + * page tables + * - Istream access violation + * - Interrupt + * - Icache Parity Error. + * - Instruction killed (nop, trapb) + + * Unfortunately, there seems to be no way to detect Dcache and Bcache + * misses; the latter could be approximated by making the counter + * count Bcache misses, but that is not precise. + + * We model this as 20 counters: + * - PCTR0 + * - PCTR1 + * - 9 ProfileMe events, induced by PCTR0 + * - 9 ProfileMe events, induced by PCTR1 + */ + +enum profileme_counters { + PM_STALLED, /* Stalled for at least one cycle */ + /* between the fetch and map stages */ + PM_TAKEN, /* Conditional branch taken */ + PM_MISPREDICT, /* Branch caused mispredict trap */ + PM_ITB_MISS, /* ITB miss */ + PM_DTB_MISS, /* DTB miss */ + PM_REPLAY, /* Replay trap */ + PM_LOAD_STORE, /* Load-store order trap */ + PM_ICACHE_MISS, /* Icache miss */ + PM_UNALIGNED, /* Unaligned Load/Store */ + PM_NUM_COUNTERS +}; + +static inline void +op_add_pm(unsigned long pc, int kern, unsigned long counter, + struct op_counter_config *ctr, unsigned long event) +{ + unsigned long fake_counter = 2 + event; + + if (counter == 1) + fake_counter += PM_NUM_COUNTERS; + if (ctr[fake_counter].enabled) + oprofile_add_pc(pc, kern, fake_counter); +} + +static void +sw2f_handle_interrupt(unsigned long which, struct pt_regs *regs, + struct op_counter_config *ctr) +{ + unsigned long pmpc, pctr_ctl; + int kern = !user_mode(regs); + int mispredict = 0; + union { + unsigned long v; + struct { + unsigned reserved: 30; /* 0-29 */ + unsigned overcount: 3; /* 30-32 */ + unsigned icache_miss: 1; /* 33 */ + unsigned trap_type: 4; /* 34-37 */ + unsigned load_store: 1; /* 38 */ + unsigned trap: 1; /* 39 */ + unsigned mispredict: 1; /* 40 */ + } fields; + } i_stat; + + enum trap_types { + TRAP_REPLAY, + TRAP_INVALID0, + TRAP_DTB_DOUBLE_MISS_3, + TRAP_DTB_DOUBLE_MISS_4, + TRAP_FP_DISABLED, + TRAP_UNALIGNED, + TRAP_DTB_SINGLE_MISS, + TRAP_DSTREAM_FAULT, + TRAP_OPCDEC, + TRAP_INVALID1, + TRAP_MACHINE_CHECK, + TRAP_INVALID2, + TRAP_ARITHMETIC, + TRAP_INVALID3, + TRAP_MT_FPCR, + TRAP_RESET + }; + + pmpc = wrperfmon(9, 0); + /* ??? Don't know how to handle physical-mode HMcode address. */ + if (pmpc & 1) + return; + pmpc &= ~2; /* clear reserved bit */ + + i_stat.v = wrperfmon(8, 0); + if (i_stat.fields.trap) { + switch (i_stat.fields.trap_type) { + case TRAP_INVALID1: + case TRAP_INVALID2: + case TRAP_INVALID3: + /* + * Pipeline redirection occurred. PMPC points + * to HMcode. Recognize ITB miss by HMcode + * offset address, and get actual PC from EXC_ADDR. + */ + oprofile_add_pc(regs->pc, kern, which); + if ((pmpc & ((1 << 15) - 1)) == 581) + op_add_pm(regs->pc, kern, which, + ctr, PM_ITB_MISS); + /* + * Most other bit and counter values will be + * those for the first instruction in the + * fault handler, so we're done. + */ + return; + case TRAP_REPLAY: + op_add_pm(pmpc, kern, which, ctr, + (i_stat.fields.load_store + ? PM_LOAD_STORE : PM_REPLAY)); + break; + case TRAP_DTB_DOUBLE_MISS_3: + case TRAP_DTB_DOUBLE_MISS_4: + case TRAP_DTB_SINGLE_MISS: + op_add_pm(pmpc, kern, which, ctr, PM_DTB_MISS); + break; + case TRAP_UNALIGNED: + op_add_pm(pmpc, kern, which, ctr, PM_UNALIGNED); + break; + case TRAP_INVALID0: + case TRAP_FP_DISABLED: + case TRAP_DSTREAM_FAULT: + case TRAP_OPCDEC: + case TRAP_MACHINE_CHECK: + case TRAP_ARITHMETIC: + case TRAP_MT_FPCR: + case TRAP_RESET: + break; + } + + /* + * ??? 
JSR/JMP/RET/COR or HW_JSR/HW_JMP/HW_RET/HW_COR + * mispredicts do not set this bit but can be + * recognized by the presence of one of these + * instructions at the PMPC location with bit 39 + * set. + */ + if (i_stat.fields.mispredict) { + mispredict = 1; + op_add_pm(pmpc, kern, which, ctr, PM_MISPREDICT); + } + } + + oprofile_add_pc(pmpc, kern, which); + + pctr_ctl = wrperfmon(5, 0); + if (pctr_ctl & (1UL << 27)) + op_add_pm(pmpc, kern, which, ctr, PM_STALLED); + + /* + * Unfortunately, TAK is undefined on mispredicted branches. + * ??? It is also undefined for non-cbranch insns, should + * check that. + */ + if (!mispredict && pctr_ctl & (1UL << 0)) + op_add_pm(pmpc, kern, which, ctr, PM_TAKEN); +} + +struct op_axp_model op_model_sw2f = { + .reg_setup = sw2f_reg_setup, + .cpu_setup = sw2f_cpu_setup, + .reset_ctr = sw2f_reset_ctr, + .handle_interrupt = sw2f_handle_interrupt, + .cpu_type = "sw2f", + .num_counters = 20, + .can_set_proc_mode = 0, +}; diff --git a/arch/sw_64/platform/Makefile b/arch/sw_64/platform/Makefile new file mode 100644 index 000000000000..a972b931dea2 --- /dev/null +++ b/arch/sw_64/platform/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_PLATFORM_XUELANG) += platform_xuelang.o diff --git a/arch/sw_64/platform/platform_xuelang.c b/arch/sw_64/platform/platform_xuelang.c new file mode 100644 index 000000000000..f0e33c664b0e --- /dev/null +++ b/arch/sw_64/platform/platform_xuelang.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <asm/platform.h> +#include <asm/sw64_init.h> +#include <linux/reboot.h> + +static void vt_mode_kill_arch(int mode) +{ + hcall(HCALL_SET_CLOCKEVENT, 0, 0, 0); + + switch (mode) { + case LINUX_REBOOT_CMD_RESTART: + hcall(HCALL_RESTART, 0, 0, 0); + mb(); + break; + case LINUX_REBOOT_CMD_HALT: + case LINUX_REBOOT_CMD_POWER_OFF: + hcall(HCALL_SHUTDOWN, 0, 0, 0); + mb(); + break; + default: + break; + } +} + +extern void cpld_write(uint8_t slave_addr, uint8_t reg, uint8_t data); + +static void xuelang_kill_arch(int mode) +{ + if (is_in_host()) { + switch (mode) { + case LINUX_REBOOT_CMD_RESTART: + cpld_write(0x64, 0x00, 0xc3); + mb(); + break; + case LINUX_REBOOT_CMD_HALT: + case LINUX_REBOOT_CMD_POWER_OFF: + cpld_write(0x64, 0x00, 0xf0); + mb(); + break; + default: + break; + } + } else { + vt_mode_kill_arch(mode); + } +} + +static inline void __iomem *xuelang_ioportmap(unsigned long addr) +{ + unsigned long io_offset; + + if (addr < 0x100000) { + io_offset = is_in_host() ? 
LPC_LEGACY_IO : PCI_VT_LEGACY_IO; + addr = addr | io_offset; + } + + return (void __iomem *)(addr | PAGE_OFFSET); +} + +struct sw64_platform_ops xuelang_ops = { + .kill_arch = xuelang_kill_arch, + .ioportmap = xuelang_ioportmap, + .ops_fixup = sw64_init_noop, +}; diff --git a/arch/sw_64/tools/.gitignore b/arch/sw_64/tools/.gitignore new file mode 100644 index 000000000000..f73e86272b76 --- /dev/null +++ b/arch/sw_64/tools/.gitignore @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +relocs diff --git a/arch/sw_64/tools/Makefile b/arch/sw_64/tools/Makefile new file mode 100644 index 000000000000..66f55b035e22 --- /dev/null +++ b/arch/sw_64/tools/Makefile @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-2.0 + +hostprogs += relocs +relocs-objs += relocs.o +relocs-objs += relocs_main.o +PHONY += relocs +relocs: $(obj)/relocs + @: diff --git a/arch/sw_64/tools/relocs.c b/arch/sw_64/tools/relocs.c new file mode 100644 index 000000000000..a8a9e08a0a65 --- /dev/null +++ b/arch/sw_64/tools/relocs.c @@ -0,0 +1,634 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "relocs.h" + +#define ELF_BITS 64 + +#define ELF_MACHINE EM_SW64 +#define ELF_MACHINE_NAME "SW64" +#define SHT_REL_TYPE SHT_RELA +#define Elf_Rel Elf64_Rela + +#define ELF_CLASS ELFCLASS64 +#define ELF_R_SYM(val) ELF64_R_SYM(val) +#define ELF_R_TYPE(val) ELF64_R_TYPE(val) +#define ELF_ST_TYPE(o) ELF64_ST_TYPE(o) +#define ELF_ST_BIND(o) ELF64_ST_BIND(o) +#define ELF_ST_VISIBILITY(o) ELF64_ST_VISIBILITY(o) + +#define ElfW(type) _ElfW(ELF_BITS, type) +#define _ElfW(bits, type) __ElfW(bits, type) +#define __ElfW(bits, type) Elf##bits##_##type + +#define Elf_Addr ElfW(Addr) +#define Elf_Ehdr ElfW(Ehdr) +#define Elf_Phdr ElfW(Phdr) +#define Elf_Shdr ElfW(Shdr) +#define Elf_Sym ElfW(Sym) + +static Elf_Ehdr ehdr; + +struct relocs { + uint32_t *offset; + unsigned long count; + unsigned long size; +}; + +static struct relocs relocs; + +struct section { + Elf_Shdr shdr; + struct section *link; + Elf_Sym *symtab; + Elf_Rel *reltab; + char *strtab; + long shdr_offset; +}; +static struct section *secs; + +static const char * const regex_sym_kernel = { +/* Symbols matching these regex's should never be relocated */ + "^(__crc_)", +}; + +static regex_t sym_regex_c; + +static int regex_skip_reloc(const char *sym_name) +{ + return !regexec(&sym_regex_c, sym_name, 0, NULL, 0); +} + +static void regex_init(void) +{ + char errbuf[128]; + int err; + + err = regcomp(&sym_regex_c, regex_sym_kernel, + REG_EXTENDED|REG_NOSUB); + + if (err) { + regerror(err, &sym_regex_c, errbuf, sizeof(errbuf)); + die("%s", errbuf); + } +} + +static const char *rel_type(unsigned int type) +{ + static const char * const type_name[] = { +#define REL_TYPE(X)[X] = #X + REL_TYPE(R_SW64_NONE), + REL_TYPE(R_SW64_REFQUAD), + REL_TYPE(R_SW64_LITERAL), + REL_TYPE(R_SW64_LITUSE), + REL_TYPE(R_SW64_GPDISP), + REL_TYPE(R_SW64_BRADDR), + REL_TYPE(R_SW64_HINT), + REL_TYPE(R_SW64_SREL32), + REL_TYPE(R_SW64_GPRELHIGH), + REL_TYPE(R_SW64_GPRELLOW), +#undef REL_TYPE + }; + const char *name = "unknown type rel type name"; + + if (type < ARRAY_SIZE(type_name) && type_name[type]) + name = type_name[type]; + return name; +} + +static const char *sec_name(unsigned int shndx) +{ + const char *sec_strtab; + const char *name; + + sec_strtab = secs[ehdr.e_shstrndx].strtab; + if (shndx < ehdr.e_shnum) + name = sec_strtab + secs[shndx].shdr.sh_name; + else if (shndx == SHN_ABS) + name = "ABSOLUTE"; + else if (shndx == SHN_COMMON) + name = "COMMON"; + else + name = "<noname>"; + return name; +} + +static struct 
section *sec_lookup(const char *secname) +{ + int i; + + for (i = 0; i < ehdr.e_shnum; i++) + if (strcmp(secname, sec_name(i)) == 0) + return &secs[i]; + + return NULL; +} + +static const char *sym_name(const char *sym_strtab, Elf_Sym *sym) +{ + const char *name; + + if (sym->st_name) + name = sym_strtab + sym->st_name; + else + name = sec_name(sym->st_shndx); + return name; +} + +#define le16_to_cpu(val) (val) +#define le32_to_cpu(val) (val) +#define le64_to_cpu(val) (val) + +#define cpu_to_le16(val) (val) +#define cpu_to_le32(val) (val) +#define cpu_to_le64(val) (val) + +static uint16_t elf16_to_cpu(uint16_t val) +{ + return le16_to_cpu(val); +} + +static uint32_t elf32_to_cpu(uint32_t val) +{ + return le32_to_cpu(val); +} + +static uint32_t cpu_to_elf32(uint32_t val) +{ + return cpu_to_le32(val); +} + +#define elf_half_to_cpu(x) elf16_to_cpu(x) +#define elf_word_to_cpu(x) elf32_to_cpu(x) + +#if ELF_BITS == 64 +static uint64_t elf64_to_cpu(uint64_t val) +{ + return le64_to_cpu(val); +} +#define elf_addr_to_cpu(x) elf64_to_cpu(x) +#define elf_off_to_cpu(x) elf64_to_cpu(x) +#define elf_xword_to_cpu(x) elf64_to_cpu(x) +#else +#define elf_addr_to_cpu(x) elf32_to_cpu(x) +#define elf_off_to_cpu(x) elf32_to_cpu(x) +#define elf_xword_to_cpu(x) elf32_to_cpu(x) +#endif + +static void read_ehdr(FILE *fp) +{ + if (fread(&ehdr, sizeof(ehdr), 1, fp) != 1) + die("Cannot read ELF header: %s\n", strerror(errno)); + + if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0) + die("No ELF magic\n"); + + if (ehdr.e_ident[EI_CLASS] != ELF_CLASS) + die("Not a %d bit executable\n", ELF_BITS); + + if ((ehdr.e_ident[EI_DATA] != ELFDATA2LSB) && + (ehdr.e_ident[EI_DATA] != ELFDATA2MSB)) + die("Unknown ELF Endianness\n"); + + if (ehdr.e_ident[EI_VERSION] != EV_CURRENT) + die("Unknown ELF version\n"); + + /* Convert the fields to native endian */ + ehdr.e_type = elf_half_to_cpu(ehdr.e_type); + ehdr.e_machine = elf_half_to_cpu(ehdr.e_machine); + ehdr.e_version = elf_word_to_cpu(ehdr.e_version); + ehdr.e_entry = elf_addr_to_cpu(ehdr.e_entry); + ehdr.e_phoff = elf_off_to_cpu(ehdr.e_phoff); + ehdr.e_shoff = elf_off_to_cpu(ehdr.e_shoff); + ehdr.e_flags = elf_word_to_cpu(ehdr.e_flags); + ehdr.e_ehsize = elf_half_to_cpu(ehdr.e_ehsize); + ehdr.e_phentsize = elf_half_to_cpu(ehdr.e_phentsize); + ehdr.e_phnum = elf_half_to_cpu(ehdr.e_phnum); + ehdr.e_shentsize = elf_half_to_cpu(ehdr.e_shentsize); + ehdr.e_shnum = elf_half_to_cpu(ehdr.e_shnum); + ehdr.e_shstrndx = elf_half_to_cpu(ehdr.e_shstrndx); + + if ((ehdr.e_type != ET_EXEC) && (ehdr.e_type != ET_DYN)) + die("Unsupported ELF header type\n"); + + if (ehdr.e_machine != ELF_MACHINE) + die("Not for %s\n", ELF_MACHINE_NAME); + + if (ehdr.e_version != EV_CURRENT) + die("Unknown ELF version\n"); + + if (ehdr.e_ehsize != sizeof(Elf_Ehdr)) + die("Bad Elf header size\n"); + + if (ehdr.e_phentsize != sizeof(Elf_Phdr)) + die("Bad program header entry\n"); + + if (ehdr.e_shentsize != sizeof(Elf_Shdr)) + die("Bad section header entry\n"); + + if (ehdr.e_shstrndx >= ehdr.e_shnum) + die("String table index out of bounds\n"); +} + +static void read_shdrs(FILE *fp) +{ + int i; + Elf_Shdr shdr; + + secs = calloc(ehdr.e_shnum, sizeof(struct section)); + if (!secs) + die("Unable to allocate %d section headers\n", ehdr.e_shnum); + + if (fseek(fp, ehdr.e_shoff, SEEK_SET) < 0) + die("Seek to %d failed: %s\n", ehdr.e_shoff, strerror(errno)); + + for (i = 0; i < ehdr.e_shnum; i++) { + struct section *sec = &secs[i]; + + sec->shdr_offset = ftell(fp); + if (fread(&shdr, sizeof(shdr), 1, fp) != 1) + 
die("Cannot read ELF section headers %d/%d: %s\n", + i, ehdr.e_shnum, strerror(errno)); + sec->shdr.sh_name = elf_word_to_cpu(shdr.sh_name); + sec->shdr.sh_type = elf_word_to_cpu(shdr.sh_type); + sec->shdr.sh_flags = elf_xword_to_cpu(shdr.sh_flags); + sec->shdr.sh_addr = elf_addr_to_cpu(shdr.sh_addr); + sec->shdr.sh_offset = elf_off_to_cpu(shdr.sh_offset); + sec->shdr.sh_size = elf_xword_to_cpu(shdr.sh_size); + sec->shdr.sh_link = elf_word_to_cpu(shdr.sh_link); + sec->shdr.sh_info = elf_word_to_cpu(shdr.sh_info); + sec->shdr.sh_addralign = elf_xword_to_cpu(shdr.sh_addralign); + sec->shdr.sh_entsize = elf_xword_to_cpu(shdr.sh_entsize); + if (sec->shdr.sh_link < ehdr.e_shnum) + sec->link = &secs[sec->shdr.sh_link]; + } +} + +static void read_strtabs(FILE *fp) +{ + int i; + + for (i = 0; i < ehdr.e_shnum; i++) { + struct section *sec = &secs[i]; + + if (sec->shdr.sh_type != SHT_STRTAB) + continue; + + sec->strtab = malloc(sec->shdr.sh_size); + if (!sec->strtab) + die("malloc of %d bytes for strtab failed\n", + sec->shdr.sh_size); + + if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) + die("Seek to %d failed: %s\n", + sec->shdr.sh_offset, strerror(errno)); + + if (fread(sec->strtab, 1, sec->shdr.sh_size, fp) != + sec->shdr.sh_size) + die("Cannot read symbol table: %s\n", strerror(errno)); + } +} + +static void read_symtabs(FILE *fp) +{ + int i, j; + + for (i = 0; i < ehdr.e_shnum; i++) { + struct section *sec = &secs[i]; + + if (sec->shdr.sh_type != SHT_SYMTAB) + continue; + + sec->symtab = malloc(sec->shdr.sh_size); + if (!sec->symtab) + die("malloc of %d bytes for symtab failed\n", + sec->shdr.sh_size); + + if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) + die("Seek to %d failed: %s\n", + sec->shdr.sh_offset, strerror(errno)); + + if (fread(sec->symtab, 1, sec->shdr.sh_size, fp) != + sec->shdr.sh_size) + die("Cannot read symbol table: %s\n", strerror(errno)); + + for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Sym); j++) { + Elf_Sym *sym = &sec->symtab[j]; + + sym->st_name = elf_word_to_cpu(sym->st_name); + sym->st_value = elf_addr_to_cpu(sym->st_value); + sym->st_size = elf_xword_to_cpu(sym->st_size); + sym->st_shndx = elf_half_to_cpu(sym->st_shndx); + } + } +} + +static void read_relocs(FILE *fp) +{ + static unsigned long base; + int i, j; + + if (!base) { + struct section *sec = sec_lookup(".text"); + + if (!sec) + die("Could not find .text section\n"); + + base = sec->shdr.sh_addr; + } + + for (i = 0; i < ehdr.e_shnum; i++) { + struct section *sec = &secs[i]; + + if (sec->shdr.sh_type != SHT_REL_TYPE) + continue; + + sec->reltab = malloc(sec->shdr.sh_size); + if (!sec->reltab) + die("malloc of %d bytes for relocs failed\n", + sec->shdr.sh_size); + + if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) + die("Seek to %d failed: %s\n", + sec->shdr.sh_offset, strerror(errno)); + + if (fread(sec->reltab, 1, sec->shdr.sh_size, fp) != + sec->shdr.sh_size) + die("Cannot read symbol table: %s\n", strerror(errno)); + + for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Rel); j++) { + Elf_Rel *rel = &sec->reltab[j]; + + rel->r_offset = elf_addr_to_cpu(rel->r_offset); + /* Set offset into kernel image */ + rel->r_offset -= base; + /* Convert SW64 RELA format - only the symbol + * index needs converting to native endianness + */ + rel->r_info = elf_xword_to_cpu(rel->r_info); +#if (SHT_REL_TYPE == SHT_RELA) + rel->r_addend = elf_xword_to_cpu(rel->r_addend); +#endif + } + } +} + +static void remove_relocs(FILE *fp) +{ + int i; + Elf_Shdr shdr; + + for (i = 0; i < ehdr.e_shnum; i++) { + struct section *sec = 
&secs[i]; + + if (sec->shdr.sh_type != SHT_REL_TYPE) + continue; + + if (fseek(fp, sec->shdr_offset, SEEK_SET) < 0) + die("Seek to %d failed: %s\n", + sec->shdr_offset, strerror(errno)); + + if (fread(&shdr, sizeof(shdr), 1, fp) != 1) + die("Cannot read ELF section headers %d/%d: %s\n", + i, ehdr.e_shnum, strerror(errno)); + + /* Set relocation section size to 0, effectively removing it. + * This is necessary due to lack of support for relocations + * in objcopy when creating 32bit elf from 64bit elf. + */ + shdr.sh_size = 0; + + if (fseek(fp, sec->shdr_offset, SEEK_SET) < 0) + die("Seek to %d failed: %s\n", + sec->shdr_offset, strerror(errno)); + + if (fwrite(&shdr, sizeof(shdr), 1, fp) != 1) + die("Cannot write ELF section headers %d/%d: %s\n", + i, ehdr.e_shnum, strerror(errno)); + } +} + +static void add_reloc(struct relocs *r, uint32_t offset, unsigned int type) +{ + /* Relocation representation in binary table: + * |76543210|76543210|76543210|76543210| + * | Type | offset from _text >> 2 | + */ + offset >>= 2; + if (offset > 0x00FFFFFF) + die("Kernel image exceeds maximum size for relocation!\n"); + + offset = (offset & 0x00FFFFFF) | ((type & 0xFF) << 24); + + if (r->count == r->size) { + unsigned long newsize = r->size + 50000; + void *mem = realloc(r->offset, newsize * sizeof(r->offset[0])); + + if (!mem) + die("realloc failed\n"); + + r->offset = mem; + r->size = newsize; + } + r->offset[r->count++] = offset; +} + +static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel, + Elf_Sym *sym, const char *symname)) +{ + int i; + + /* Walk through the relocations */ + for (i = 0; i < ehdr.e_shnum; i++) { + char *sym_strtab; + Elf_Sym *sh_symtab; + struct section *sec_applies, *sec_symtab; + int j; + struct section *sec = &secs[i]; + + if (sec->shdr.sh_type != SHT_REL_TYPE) + continue; + sec_symtab = sec->link; + sec_applies = &secs[sec->shdr.sh_info]; + if (!(sec_applies->shdr.sh_flags & SHF_ALLOC)) + continue; + + sh_symtab = sec_symtab->symtab; + sym_strtab = sec_symtab->link->strtab; + for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Rel); j++) { + Elf_Rel *rel = &sec->reltab[j]; + Elf_Sym *sym = &sh_symtab[ELF_R_SYM(rel->r_info)]; + const char *symname = sym_name(sym_strtab, sym); + + process(sec, rel, sym, symname); + } + } +} + +static int do_reloc(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, + const char *symname) +{ + unsigned int r_type = ELF_R_TYPE(rel->r_info); + unsigned int bind = ELF_ST_BIND(sym->st_info); + + if ((bind == STB_WEAK) && (sym->st_value == 0)) { + /* Don't relocate weak symbols without a target */ + return 0; + } + + if (regex_skip_reloc(symname)) + return 0; + + switch (r_type) { + case R_SW64_NONE: + case R_SW64_LITERAL: /* relocated by GOT */ + case R_SW64_LITUSE: + case R_SW64_GPDISP: + case R_SW64_BRADDR: + case R_SW64_HINT: + case R_SW64_SREL32: + case R_SW64_GPRELHIGH: + case R_SW64_GPRELLOW: + /* + * NONE can be ignored and PC relative relocations don't + * need to be adjusted. 
+ */ + break; + + case R_SW64_REFQUAD: + add_reloc(&relocs, rel->r_offset, r_type); + break; + + default: + die("Unsupported relocation type: %s (%d)\n", + rel_type(r_type), r_type); + break; + } + + return 0; +} + +static int write_reloc_as_bin(uint32_t v, FILE *f) +{ + unsigned char buf[4]; + + v = cpu_to_elf32(v); + + memcpy(buf, &v, sizeof(uint32_t)); + return fwrite(buf, 1, 4, f); +} + +static int write_reloc_as_text(uint32_t v, FILE *f) +{ + int res; + + res = fprintf(f, "\t.long 0x%08"PRIx32"\n", v); + if (res < 0) + return res; + else + return sizeof(uint32_t); +} + +static void emit_relocs(int as_text, int as_bin, FILE *outf) +{ + int i; + int (*write_reloc)(uint32_t, FILE *) = write_reloc_as_bin; + int size = 0; + int size_reserved; + struct section *sec_reloc; + + sec_reloc = sec_lookup(".data.reloc"); + if (!sec_reloc) + die("Could not find relocation section\n"); + + size_reserved = sec_reloc->shdr.sh_size; + /* Collect up the relocations */ + walk_relocs(do_reloc); + + /* Print the relocations */ + if (as_text) { + /* Print the relocations in a form suitable that + * gas will like. + */ + printf(".section \".data.reloc\",\"a\"\n"); + printf(".balign 8\n"); + /* Output text to stdout */ + write_reloc = write_reloc_as_text; + outf = stdout; + } else if (as_bin) { + /* Output raw binary to stdout */ + outf = stdout; + } else { + /* + * Seek to offset of the relocation section. + * Each relocation is then written into the + * vmlinux kernel image. + */ + if (fseek(outf, sec_reloc->shdr.sh_offset, SEEK_SET) < 0) { + die("Seek to %d failed: %s\n", + sec_reloc->shdr.sh_offset, strerror(errno)); + } + } + + for (i = 0; i < relocs.count; i++) + size += write_reloc(relocs.offset[i], outf); + + /* Print a stop, but only if we've actually written some relocs */ + if (size) + size += write_reloc(0, outf); + + if (size > size_reserved) + /* + * Die, but suggest a value for CONFIG_RELOCATION_TABLE_SIZE + * which will fix this problem and allow a bit of headroom + * if more kernel features are enabled + */ + die("Relocations overflow available space!\n" \ + "Please adjust CONFIG_RELOCATION_TABLE_SIZE " \ + "to at least 0x%08x\n", (size + 0x1000) & ~0xFFF); +} + +/* + * As an aid to debugging problems with different linkers + * print summary information about the relocs. + * Since different linkers tend to emit the sections in + * different orders we use the section names in the output.
+ */ +static int do_reloc_info(struct section *sec, Elf_Rel *rel, ElfW(Sym) * sym, + const char *symname) +{ + printf("%16s 0x%x %16s %40s %16s\n", + sec_name(sec->shdr.sh_info), + (unsigned int)rel->r_offset, + rel_type(ELF_R_TYPE(rel->r_info)), + symname, + sec_name(sym->st_shndx)); + return 0; +} + +static void print_reloc_info(void) +{ + printf("%16s %10s %16s %40s %16s\n", + "reloc section", + "offset", + "reloc type", + "symbol", + "symbol section"); + walk_relocs(do_reloc_info); +} + +void process(FILE *fp, int as_text, int as_bin, + int show_reloc_info, int keep_relocs) +{ + regex_init(); + read_ehdr(fp); + read_shdrs(fp); + read_strtabs(fp); + read_symtabs(fp); + read_relocs(fp); + if (show_reloc_info) { + print_reloc_info(); + return; + } + emit_relocs(as_text, as_bin, fp); + if (!keep_relocs) + remove_relocs(fp); +} diff --git a/arch/sw_64/tools/relocs.h b/arch/sw_64/tools/relocs.h new file mode 100644 index 000000000000..7273ccaed11f --- /dev/null +++ b/arch/sw_64/tools/relocs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef RELOCS_H +#define RELOCS_H + +#include <stdio.h> +#include <stdarg.h> +#include <stdlib.h> +#include <stdint.h> +#include <inttypes.h> +#include <string.h> +#include <errno.h> +#include <unistd.h> +#include <elf.h> +#include <byteswap.h> +#define USE_BSD +#include <endian.h> +#include <regex.h> + +#define EM_SW64 0x9916 +/* + * SW64 ELF relocation types + */ +#define R_SW64_NONE 0 /* No reloc */ +#define R_SW64_REFLONG 1 /* Direct 32 bit */ +#define R_SW64_REFQUAD 2 /* Direct 64 bit */ +#define R_SW64_GPREL32 3 /* GP relative 32 bit */ +#define R_SW64_LITERAL 4 /* GP relative 16 bit w/optimization */ +#define R_SW64_LITUSE 5 /* Optimization hint for LITERAL */ +#define R_SW64_GPDISP 6 /* Add displacement to GP */ +#define R_SW64_BRADDR 7 /* PC+4 relative 23 bit shifted */ +#define R_SW64_HINT 8 /* PC+4 relative 16 bit shifted */ +#define R_SW64_SREL16 9 /* PC relative 16 bit */ +#define R_SW64_SREL32 10 /* PC relative 32 bit */ +#define R_SW64_SREL64 11 /* PC relative 64 bit */ +#define R_SW64_GPRELHIGH 17 /* GP relative 32 bit, high 16 bits */ +#define R_SW64_GPRELLOW 18 /* GP relative 32 bit, low 16 bits */ +#define R_SW64_GPREL16 19 /* GP relative 16 bit */ +#define R_SW64_COPY 24 /* Copy symbol at runtime */ +#define R_SW64_GLOB_DAT 25 /* Create GOT entry */ +#define R_SW64_JMP_SLOT 26 /* Create PLT entry */ +#define R_SW64_RELATIVE 27 /* Adjust by program base */ +#define R_SW64_BRSGP 28 +#define R_SW64_TLSGD 29 +#define R_SW64_TLS_LDM 30 +#define R_SW64_DTPMOD64 31 +#define R_SW64_GOTDTPREL 32 +#define R_SW64_DTPREL64 33 +#define R_SW64_DTPRELHI 34 +#define R_SW64_DTPRELLO 35 +#define R_SW64_DTPREL16 36 +#define R_SW64_GOTTPREL 37 +#define R_SW64_TPREL64 38 +#define R_SW64_TPRELHI 39 +#define R_SW64_TPRELLO 40 +#define R_SW64_TPREL16 41 + +void die(char *fmt, ...); + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +enum symtype { + S_ABS, + S_REL, + S_SEG, + S_LIN, + S_NSYMTYPES +}; + +void process(FILE *fp, int as_text, int as_bin, + int show_reloc_info, int keep_relocs); +#endif /* RELOCS_H */ diff --git a/arch/sw_64/tools/relocs_main.c b/arch/sw_64/tools/relocs_main.c new file mode 100644 index 000000000000..30a830a070db --- /dev/null +++ b/arch/sw_64/tools/relocs_main.c @@ -0,0 +1,86 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <stdio.h> +#include <stdint.h> +#include <stdarg.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <endian.h> +#include <elf.h> + +#include "relocs.h" + +void 
die(char *fmt, ...) +{ + va_list ap; + + va_start(ap, fmt); + vfprintf(stderr, fmt, ap); + va_end(ap); + exit(1); +} + +static void usage(void) +{ + die("relocs [--reloc-info|--text|--bin|--keep] vmlinux\n"); +} + +int main(int argc, char **argv) +{ + int show_reloc_info, as_text, as_bin, keep_relocs; + const char *fname; + FILE *fp; + int i; + unsigned char e_ident[EI_NIDENT]; + + show_reloc_info = 0; + as_text = 0; + as_bin = 0; + keep_relocs = 0; + fname = NULL; + for (i = 1; i < argc; i++) { + char *arg = argv[i]; + + if (*arg == '-') { + if (strcmp(arg, "--reloc-info") == 0) { + show_reloc_info = 1; + continue; + } + if (strcmp(arg, "--text") == 0) { + as_text = 1; + continue; + } + if (strcmp(arg, "--bin") == 0) { + as_bin = 1; + continue; + } + if (strcmp(arg, "--keep") == 0) { + keep_relocs = 1; + continue; + } + } else if (!fname) { + fname = arg; + continue; + } + usage(); + } + if (!fname) + usage(); + + fp = fopen(fname, "r+"); + if (!fp) + die("Cannot open %s: %s\n", fname, strerror(errno)); + + if (fread(&e_ident, 1, EI_NIDENT, fp) != EI_NIDENT) + die("Cannot read %s: %s", fname, strerror(errno)); + + rewind(fp); + if (e_ident[EI_CLASS] == ELFCLASS64) + process(fp, as_text, as_bin, show_reloc_info, keep_relocs); + else + die("Unsupported ELF class on SW64: %s", fname); + //process_32(fp, as_text, as_bin, show_reloc_info, keep_relocs); + fclose(fp); + return 0; +}
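[Editor's note on the relocation table format used above: each entry packed by add_reloc() is a single 32-bit word holding the relocation type in the top byte and (offset from _text) >> 2 in the low 24 bits, and emit_relocs() terminates the table with a zero word. A minimal host-side decoder, shown only as a sketch; the helper name is made up for illustration and is not part of the tool:

#include <stdint.h>
#include <stdio.h>

/* Unpack one packed entry produced by add_reloc() in relocs.c:
 * bits 31-24 hold the relocation type, bits 23-0 hold the
 * offset from _text shifted right by 2. */
static void decode_reloc_entry(uint32_t entry)
{
	uint8_t type = entry >> 24;
	uint32_t offset = (entry & 0x00FFFFFF) << 2;

	printf("reloc type %u at _text + 0x%x\n", type, offset);
}

A zero entry marks the end of the table, matching the trailing write_reloc(0, outf) in emit_relocs().]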
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
In order to support building deb packages on the sw64 architecture, add a sw_64 case.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- scripts/package/builddeb | 3 +++ scripts/package/buildtar | 3 +++ scripts/package/mkdebian | 2 +- 3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/scripts/package/builddeb b/scripts/package/builddeb index 91a502bb97e8..cd558ad20128 100755 --- a/scripts/package/builddeb +++ b/scripts/package/builddeb @@ -126,6 +126,9 @@ um) parisc|mips|powerpc) installed_image_path="boot/vmlinux-$version" ;; +sw_64) + installed_image_path="boot/vmlinux.bin-$version" + ;; *) installed_image_path="boot/vmlinuz-$version" esac diff --git a/scripts/package/buildtar b/scripts/package/buildtar index 936198a90477..a0e4c7be89b6 100755 --- a/scripts/package/buildtar +++ b/scripts/package/buildtar @@ -92,6 +92,9 @@ case "${ARCH}" in alpha) [ -f "${objtree}/arch/alpha/boot/vmlinux.gz" ] && cp -v -- "${objtree}/arch/alpha/boot/vmlinux.gz" "${tmpdir}/boot/vmlinuz-${KERNELRELEASE}" ;; + sw_64) + [ -f "${objtree}/arch/sw_64/boot/vmlinux.bin" ] && cp -v -- "${objtree}/arch/sw_64/boot/vmlinux.bin" "${tmpdir}/boot/vmlinux-bin-${KERNELRELEASE}" + ;; parisc*) [ -f "${KBUILD_IMAGE}" ] && cp -v -- "${KBUILD_IMAGE}" "${tmpdir}/boot/vmlinux-${KERNELRELEASE}" [ -f "${objtree}/lifimage" ] && cp -v -- "${objtree}/lifimage" "${tmpdir}/boot/lifimage-${KERNELRELEASE}" diff --git a/scripts/package/mkdebian b/scripts/package/mkdebian index 60a2a63a5e90..fdef3f585dc0 100755 --- a/scripts/package/mkdebian +++ b/scripts/package/mkdebian @@ -26,7 +26,7 @@ set_debarch() {
# Attempt to find the correct Debian architecture case "$UTS_MACHINE" in - i386|ia64|alpha|m68k|riscv*) + i386|ia64|alpha|m68k|riscv*|sw_64) debarch="$UTS_MACHINE" ;; x86_64) debarch=amd64 ;;
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
In order to implement ftrace on the sw64 architecture, add sw64 support to recordmcount.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- scripts/recordmcount.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c index cce12e1971d8..84ea65aec015 100644 --- a/scripts/recordmcount.c +++ b/scripts/recordmcount.c @@ -38,6 +38,12 @@ #define R_AARCH64_ABS64 257 #endif
+#ifndef EM_SW64 +#define EM_SW64 0x9916 +#define R_SW_64_NONE 0 +#define R_SW_64_REFQUAD 2 /* Direct 64 bit */ +#endif + #define R_ARM_PC24 1 #define R_ARM_THM_CALL 10 #define R_ARM_CALL 28 @@ -314,6 +320,15 @@ static int make_nop_arm64(void *map, size_t const offset) return 0; }
+static unsigned char ideal_nop4_sw_64[4] = {0x5f, 0x07, 0xff, 0x43}; +static int make_nop_sw_64(void *map, size_t const offset) +{ + /* Convert to nop */ + ulseek(offset, SEEK_SET); + uwrite(ideal_nop, 4); + return 0; +} + static int write_file(const char *fname) { char tmp_file[strlen(fname) + 4]; @@ -556,6 +571,12 @@ static int do_file(char const *const fname) ideal_nop = ideal_nop4_arm64; is_fake_mcount64 = arm64_is_fake_mcount; break; + case EM_SW64: + reltype = R_SW_64_REFQUAD; + make_nop = make_nop_sw_64; + rel_type_nop = R_SW_64_NONE; + ideal_nop = ideal_nop4_sw_64; + break; case EM_IA_64: reltype = R_IA64_IMM64; break; case EM_MIPS: /* reltype: e_class */ break; case EM_PPC: reltype = R_PPC_ADDR32; break; @@ -610,6 +631,9 @@ static int do_file(char const *const fname) Elf64_r_info = MIPS64_r_info; is_fake_mcount64 = MIPS64_is_fake_mcount; } + if (w2(ghdr->e_machine) == EM_SW64) + is_fake_mcount64 = MIPS64_is_fake_mcount; + if (do64(ghdr, fname, reltype) < 0) goto out; break;
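The nop encoding above (0x5f, 0x07, 0xff, 0x43) is what recordmcount writes over each recorded mcount call site at build time. Below is a minimal standalone sketch of that replacement step, operating on an in-memory byte buffer; this is a simplification, since the real tool goes through its ulseek()/uwrite() file helpers and the buffer and offset here are hypothetical.

#include <stdio.h>
#include <string.h>

/* SW64 4-byte nop encoding, copied from ideal_nop4_sw_64 above. */
static const unsigned char sw64_nop[4] = {0x5f, 0x07, 0xff, 0x43};

/* Overwrite the 4-byte call site at 'offset' in 'text' with a nop. */
static void make_nop_sw64_buf(unsigned char *text, size_t offset)
{
	memcpy(text + offset, sw64_nop, sizeof(sw64_nop));
}

int main(void)
{
	unsigned char text[8] = {0};

	make_nop_sw64_buf(text, 4);
	printf("%02x %02x %02x %02x\n", text[4], text[5], text[6], text[7]);
	return 0;
}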
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Add the AUDIT_ARCH_SW64 definition to support syscall_get_arch(), which must be implemented on all architectures in order to extend the generic ptrace API with the PTRACE_GET_SYSCALL_INFO request.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/uapi/linux/audit.h | 1 + include/uapi/linux/elf-em.h | 1 + 2 files changed, 2 insertions(+)
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h index cd2d8279a5e4..320b8645f6fd 100644 --- a/include/uapi/linux/audit.h +++ b/include/uapi/linux/audit.h @@ -434,6 +434,7 @@ enum { #define AUDIT_ARCH_UNICORE (EM_UNICORE|__AUDIT_ARCH_LE) #define AUDIT_ARCH_X86_64 (EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE) #define AUDIT_ARCH_XTENSA (EM_XTENSA) +#define AUDIT_ARCH_SW64 (EM_SW64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
#define AUDIT_PERM_EXEC 1 #define AUDIT_PERM_WRITE 2 diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h index f47e853546fa..9da7992ebffe 100644 --- a/include/uapi/linux/elf-em.h +++ b/include/uapi/linux/elf-em.h @@ -58,6 +58,7 @@ * up with a final number. */ #define EM_ALPHA 0x9026 +#define EM_SW64 0x9916
/* Bogus old m32r magic number, used by old tools. */ #define EM_CYGNUS_M32R 0x9041
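AUDIT_ARCH_* values are what syscall_get_arch() returns so that audit and seccomp can identify the calling ABI. The new constant simply combines the SW64 ELF machine number with the 64-bit and little-endian flag bits already defined in audit.h; a small user-space sketch to show the composed value:

#include <stdio.h>

/* Values as in include/uapi/linux/audit.h and elf-em.h. */
#define EM_SW64            0x9916
#define __AUDIT_ARCH_64BIT 0x80000000U
#define __AUDIT_ARCH_LE    0x40000000U
#define AUDIT_ARCH_SW64    (EM_SW64 | __AUDIT_ARCH_64BIT | __AUDIT_ARCH_LE)

int main(void)
{
	/* Prints 0xc0009916: 64-bit, little-endian, ELF machine 0x9916. */
	printf("AUDIT_ARCH_SW64 = 0x%x\n", AUDIT_ARCH_SW64);
	return 0;
}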
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Add the KEXEC_ARCH_SW64 definition for the kdump/kexec functionality.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/uapi/linux/kexec.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h index d891d8009a17..52b38feeb01a 100644 --- a/include/uapi/linux/kexec.h +++ b/include/uapi/linux/kexec.h @@ -43,6 +43,7 @@ #define KEXEC_ARCH_MIPS_LE (10 << 16) #define KEXEC_ARCH_MIPS ( 8 << 16) #define KEXEC_ARCH_AARCH64 (183 << 16) +#define KEXEC_ARCH_SW64 (0x9916UL << 16)
/* The artificial cap on the number of segments passed to kexec_load. */ #define KEXEC_SEGMENT_MAX 16
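KEXEC_ARCH_* constants live in the high 16 bits of the kexec_load() flags, which is why the new value is (0x9916UL << 16): the SW64 ELF machine number shifted into place. A minimal sketch of how a loader would set and check it, mirroring the kernel's use of KEXEC_ARCH_MASK (0xffff0000 in the same uapi header):

#include <stdio.h>

#define KEXEC_ARCH_MASK 0xffff0000UL
#define KEXEC_ARCH_SW64 (0x9916UL << 16)

int main(void)
{
	unsigned long flags = KEXEC_ARCH_SW64; /* OR'ed with any KEXEC_* load flags */

	if ((flags & KEXEC_ARCH_MASK) == KEXEC_ARCH_SW64)
		printf("flags select the SW64 architecture (0x%lx)\n", flags);
	return 0;
}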
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
This patch fixes a compile failure on the sw64 architecture. More details about the issue can be found in commit 91d35dd93e14 ("moduleparam: fix alpha, ia64 and ppc64 compile failures").
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/moduleparam.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h index 6388eb9734a5..a9694fb48a4e 100644 --- a/include/linux/moduleparam.h +++ b/include/linux/moduleparam.h @@ -276,7 +276,7 @@ struct kparam_array read-only sections (which is part of respective UNIX ABI on these platforms). So 'const' makes no sense and even causes compile failures with some compilers. */ -#if defined(CONFIG_ALPHA) || defined(CONFIG_IA64) || defined(CONFIG_PPC64) +#if defined(CONFIG_ALPHA) || defined(CONFIG_IA64) || defined(CONFIG_PPC64) || defined(CONFIG_SW64) #define __moduleparam_const #else #define __moduleparam_const const
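The qualifier matters because __module_param_call() declares each parameter as "static struct kernel_param __moduleparam_const __param_##name" placed in the __param section; on ABIs where relocations to global data cannot go into read-only sections, the const must expand to nothing. A reduced user-space illustration of the pattern (struct and names simplified from the kernel header, not the real declaration):

#include <stdio.h>

/* On architectures whose ABI forbids relocations into read-only data
 * (alpha, ia64, ppc64, and now sw64), the qualifier expands to nothing. */
#ifdef RELOCS_CANNOT_BE_RDONLY
#define __moduleparam_const
#else
#define __moduleparam_const const
#endif

struct kernel_param_demo { const char *name; int value; };

static struct kernel_param_demo __moduleparam_const demo_param = {
	"demo", 42,
};

int main(void)
{
	printf("%s = %d\n", demo_param.name, demo_param.value);
	return 0;
}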
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Add the KVM_SW64_VCPU_INIT definition for SW64 vcpu initialization. Introduce the KVM_SW64_GET_VCB and KVM_SW64_SET_VCB definitions to save and restore the vcpu context for hot snapshots.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/uapi/linux/kvm.h | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 3c0dcfed4835..3af8b0164f1e 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1415,6 +1415,11 @@ struct kvm_s390_ucas_mapping { #define KVM_SET_PMU_EVENT_FILTER _IOW(KVMIO, 0xb2, struct kvm_pmu_event_filter) #define KVM_PPC_SVM_OFF _IO(KVMIO, 0xb3)
+/* ioctl for SW vcpu init */ +#define KVM_SW64_VCPU_INIT _IO(KVMIO, 0xba) +#define KVM_SW64_GET_VCB _IO(KVMIO, 0xbc) +#define KVM_SW64_SET_VCB _IO(KVMIO, 0xbd) + /* ioctl for vm fd */ #define KVM_CREATE_DEVICE _IOWR(KVMIO, 0xe0, struct kvm_create_device)
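The three new requests are plain _IO() ioctls issued on a vcpu file descriptor. A hedged user-space sketch of invoking them follows; the vcpu setup is the standard KVM API, while the VCB payload layout is architecture-specific and not defined by this uapi change, so the argument is left as 0 here purely for illustration.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	if (kvm < 0) { perror("open /dev/kvm"); return 1; }

	int vm = ioctl(kvm, KVM_CREATE_VM, 0);
	if (vm < 0) { perror("KVM_CREATE_VM"); return 1; }

	int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
	if (vcpu < 0) { perror("KVM_CREATE_VCPU"); return 1; }

#ifdef KVM_SW64_VCPU_INIT
	/* Only defined on sw64 kernels; the real VCB argument is omitted. */
	if (ioctl(vcpu, KVM_SW64_VCPU_INIT, 0) < 0)
		perror("KVM_SW64_VCPU_INIT");
#endif
	close(vcpu);
	close(vm);
	close(kvm);
	return 0;
}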
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/include/uapi/asm/bitsperlong.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/include/uapi/asm/bitsperlong.h b/tools/include/uapi/asm/bitsperlong.h index edba4d93e9e6..e8fc1fdfde2a 100644 --- a/tools/include/uapi/asm/bitsperlong.h +++ b/tools/include/uapi/asm/bitsperlong.h @@ -17,6 +17,8 @@ #include "../../../arch/riscv/include/uapi/asm/bitsperlong.h" #elif defined(__alpha__) #include "../../../arch/alpha/include/uapi/asm/bitsperlong.h" +#elif defined(__sw_64__) +#include "../../../arch/sw_64/include/uapi/asm/bitsperlong.h" #else #include <asm-generic/bitsperlong.h> #endif
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Add perf support for hardware events using the CPU PMU, and add more user-space utilities and features.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/arch/sw_64/include/asm/barrier.h | 9 ++ .../arch/sw_64/include/uapi/asm/bitsperlong.h | 9 ++ tools/arch/sw_64/include/uapi/asm/errno.h | 128 ++++++++++++++++++ tools/arch/sw_64/include/uapi/asm/mman.h | 46 +++++++ tools/arch/sw_64/include/uapi/asm/perf_regs.h | 38 ++++++ tools/build/Makefile.feature | 1 + tools/build/feature/Makefile | 4 + tools/build/feature/test-libunwind-sw64.c | 27 ++++ tools/include/uapi/asm/errno.h | 2 + tools/perf/Makefile.config | 13 ++ tools/perf/arch/sw_64/Build | 2 + tools/perf/arch/sw_64/Makefile | 4 + tools/perf/arch/sw_64/include/arch-tests.h | 12 ++ tools/perf/arch/sw_64/include/perf_regs.h | 30 ++++ tools/perf/arch/sw_64/tests/Build | 3 + tools/perf/arch/sw_64/tests/arch-tests.c | 16 +++ tools/perf/arch/sw_64/tests/dwarf-unwind.c | 64 +++++++++ tools/perf/arch/sw_64/tests/regs_load.S | 38 ++++++ tools/perf/arch/sw_64/util/Build | 4 + tools/perf/arch/sw_64/util/dwarf-regs.c | 90 ++++++++++++ tools/perf/arch/sw_64/util/perf_regs.c | 6 + tools/perf/arch/sw_64/util/unwind-libdw.c | 60 ++++++++ tools/perf/arch/sw_64/util/unwind-libunwind.c | 26 ++++ tools/perf/tests/Build | 2 +- tools/perf/tests/tests.h | 2 +- tools/perf/util/Build | 1 + tools/perf/util/annotate.c | 3 + tools/perf/util/env.c | 2 + tools/perf/util/libunwind/sw64.c | 33 +++++ tools/perf/util/unwind-libunwind.c | 4 + 30 files changed, 677 insertions(+), 2 deletions(-) create mode 100644 tools/arch/sw_64/include/asm/barrier.h create mode 100644 tools/arch/sw_64/include/uapi/asm/bitsperlong.h create mode 100644 tools/arch/sw_64/include/uapi/asm/errno.h create mode 100644 tools/arch/sw_64/include/uapi/asm/mman.h create mode 100644 tools/arch/sw_64/include/uapi/asm/perf_regs.h create mode 100644 tools/build/feature/test-libunwind-sw64.c create mode 100644 tools/perf/arch/sw_64/Build create mode 100644 tools/perf/arch/sw_64/Makefile create mode 100644 tools/perf/arch/sw_64/include/arch-tests.h create mode 100644 tools/perf/arch/sw_64/include/perf_regs.h create mode 100644 tools/perf/arch/sw_64/tests/Build create mode 100644 tools/perf/arch/sw_64/tests/arch-tests.c create mode 100644 tools/perf/arch/sw_64/tests/dwarf-unwind.c create mode 100644 tools/perf/arch/sw_64/tests/regs_load.S create mode 100644 tools/perf/arch/sw_64/util/Build create mode 100644 tools/perf/arch/sw_64/util/dwarf-regs.c create mode 100644 tools/perf/arch/sw_64/util/perf_regs.c create mode 100644 tools/perf/arch/sw_64/util/unwind-libdw.c create mode 100644 tools/perf/arch/sw_64/util/unwind-libunwind.c create mode 100644 tools/perf/util/libunwind/sw64.c
diff --git a/tools/arch/sw_64/include/asm/barrier.h b/tools/arch/sw_64/include/asm/barrier.h new file mode 100644 index 000000000000..bc4aeffeb681 --- /dev/null +++ b/tools/arch/sw_64/include/asm/barrier.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _TOOLS_LINUX_ASM_SW64_BARRIER_H +#define _TOOLS_LINUX_ASM_SW64_BARRIER_H + +#define mb() __asm__ __volatile__("mb" : : : "memory") +#define rmb() __asm__ __volatile__("mb" : : : "memory") +#define wmb() __asm__ __volatile__("mb" : : : "memory") + +#endif /* _TOOLS_LINUX_ASM_SW64_BARRIER_H */ diff --git a/tools/arch/sw_64/include/uapi/asm/bitsperlong.h b/tools/arch/sw_64/include/uapi/asm/bitsperlong.h new file mode 100644 index 000000000000..f6a510c28233 --- /dev/null +++ b/tools/arch/sw_64/include/uapi/asm/bitsperlong.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef __ASM_SW64_BITSPERLONG_H +#define __ASM_SW64_BITSPERLONG_H + +#define __BITS_PER_LONG 64 + +#include <asm-generic/bitsperlong.h> + +#endif /* __ASM_SW64_BITSPERLONG_H */ diff --git a/tools/arch/sw_64/include/uapi/asm/errno.h b/tools/arch/sw_64/include/uapi/asm/errno.h new file mode 100644 index 000000000000..2a43a943581a --- /dev/null +++ b/tools/arch/sw_64/include/uapi/asm/errno.h @@ -0,0 +1,128 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _SW64_ERRNO_H +#define _SW64_ERRNO_H + +#include <asm-generic/errno-base.h> + +#undef EAGAIN /* 11 in errno-base.h */ + +#define EDEADLK 11 /* Resource deadlock would occur */ + +#define EAGAIN 35 /* Try again */ +#define EWOULDBLOCK EAGAIN /* Operation would block */ +#define EINPROGRESS 36 /* Operation now in progress */ +#define EALREADY 37 /* Operation already in progress */ +#define ENOTSOCK 38 /* Socket operation on non-socket */ +#define EDESTADDRREQ 39 /* Destination address required */ +#define EMSGSIZE 40 /* Message too long */ +#define EPROTOTYPE 41 /* Protocol wrong type for socket */ +#define ENOPROTOOPT 42 /* Protocol not available */ +#define EPROTONOSUPPORT 43 /* Protocol not supported */ +#define ESOCKTNOSUPPORT 44 /* Socket type not supported */ +#define EOPNOTSUPP 45 /* Operation not supported on transport endpoint */ +#define EPFNOSUPPORT 46 /* Protocol family not supported */ +#define EAFNOSUPPORT 47 /* Address family not supported by protocol */ +#define EADDRINUSE 48 /* Address already in use */ +#define EADDRNOTAVAIL 49 /* Cannot assign requested address */ +#define ENETDOWN 50 /* Network is down */ +#define ENETUNREACH 51 /* Network is unreachable */ +#define ENETRESET 52 /* Network dropped connection because of reset */ +#define ECONNABORTED 53 /* Software caused connection abort */ +#define ECONNRESET 54 /* Connection reset by peer */ +#define ENOBUFS 55 /* No buffer space available */ +#define EISCONN 56 /* Transport endpoint is already connected */ +#define ENOTCONN 57 /* Transport endpoint is not connected */ +#define ESHUTDOWN 58 /* Cannot send after transport endpoint shutdown */ +#define ETOOMANYREFS 59 /* Too many references: cannot splice */ +#define ETIMEDOUT 60 /* Connection timed out */ +#define ECONNREFUSED 61 /* Connection refused */ +#define ELOOP 62 /* Too many symbolic links encountered */ +#define ENAMETOOLONG 63 /* File name too long */ +#define EHOSTDOWN 64 /* Host is down */ +#define EHOSTUNREACH 65 /* No route to host */ +#define ENOTEMPTY 66 /* Directory not empty */ + +#define EUSERS 68 /* Too many users */ +#define EDQUOT 69 /* Quota exceeded */ +#define ESTALE 70 /* Stale file handle */ +#define EREMOTE 
71 /* Object is remote */ + +#define ENOLCK 77 /* No record locks available */ +#define ENOSYS 78 /* Function not implemented */ + +#define ENOMSG 80 /* No message of desired type */ +#define EIDRM 81 /* Identifier removed */ +#define ENOSR 82 /* Out of streams resources */ +#define ETIME 83 /* Timer expired */ +#define EBADMSG 84 /* Not a data message */ +#define EPROTO 85 /* Protocol error */ +#define ENODATA 86 /* No data available */ +#define ENOSTR 87 /* Device not a stream */ + +#define ENOPKG 92 /* Package not installed */ + +#define EILSEQ 116 /* Illegal byte sequence */ + +/* The following are just random noise.. */ +#define ECHRNG 88 /* Channel number out of range */ +#define EL2NSYNC 89 /* Level 2 not synchronized */ +#define EL3HLT 90 /* Level 3 halted */ +#define EL3RST 91 /* Level 3 reset */ + +#define ELNRNG 93 /* Link number out of range */ +#define EUNATCH 94 /* Protocol driver not attached */ +#define ENOCSI 95 /* No CSI structure available */ +#define EL2HLT 96 /* Level 2 halted */ +#define EBADE 97 /* Invalid exchange */ +#define EBADR 98 /* Invalid request descriptor */ +#define EXFULL 99 /* Exchange full */ +#define ENOANO 100 /* No anode */ +#define EBADRQC 101 /* Invalid request code */ +#define EBADSLT 102 /* Invalid slot */ + +#define EDEADLOCK EDEADLK + +#define EBFONT 104 /* Bad font file format */ +#define ENONET 105 /* Machine is not on the network */ +#define ENOLINK 106 /* Link has been severed */ +#define EADV 107 /* Advertise error */ +#define ESRMNT 108 /* Srmount error */ +#define ECOMM 109 /* Communication error on send */ +#define EMULTIHOP 110 /* Multihop attempted */ +#define EDOTDOT 111 /* RFS specific error */ +#define EOVERFLOW 112 /* Value too large for defined data type */ +#define ENOTUNIQ 113 /* Name not unique on network */ +#define EBADFD 114 /* File descriptor in bad state */ +#define EREMCHG 115 /* Remote address changed */ + +#define EUCLEAN 117 /* Structure needs cleaning */ +#define ENOTNAM 118 /* Not a XENIX named type file */ +#define ENAVAIL 119 /* No XENIX semaphores available */ +#define EISNAM 120 /* Is a named type file */ +#define EREMOTEIO 121 /* Remote I/O error */ + +#define ELIBACC 122 /* Can not access a needed shared library */ +#define ELIBBAD 123 /* Accessing a corrupted shared library */ +#define ELIBSCN 124 /* .lib section in a.out corrupted */ +#define ELIBMAX 125 /* Attempting to link in too many shared libraries */ +#define ELIBEXEC 126 /* Cannot exec a shared library directly */ +#define ERESTART 127 /* Interrupted system call should be restarted */ +#define ESTRPIPE 128 /* Streams pipe error */ + +#define ENOMEDIUM 129 /* No medium found */ +#define EMEDIUMTYPE 130 /* Wrong medium type */ +#define ECANCELED 131 /* Operation Cancelled */ +#define ENOKEY 132 /* Required key not available */ +#define EKEYEXPIRED 133 /* Key has expired */ +#define EKEYREVOKED 134 /* Key has been revoked */ +#define EKEYREJECTED 135 /* Key was rejected by service */ + +/* for robust mutexes */ +#define EOWNERDEAD 136 /* Owner died */ +#define ENOTRECOVERABLE 137 /* State not recoverable */ + +#define ERFKILL 138 /* Operation not possible due to RF-kill */ + +#define EHWPOISON 139 /* Memory page has hardware error */ + +#endif /* _SW64_ERRNO_H */ diff --git a/tools/arch/sw_64/include/uapi/asm/mman.h b/tools/arch/sw_64/include/uapi/asm/mman.h new file mode 100644 index 000000000000..a9603c93a34b --- /dev/null +++ b/tools/arch/sw_64/include/uapi/asm/mman.h @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ 
+#ifndef TOOLS_ARCH_SW64_UAPI_ASM_MMAN_FIX_H +#define TOOLS_ARCH_SW64_UAPI_ASM_MMAN_FIX_H +#define MADV_DODUMP 17 +#define MADV_DOFORK 11 +#define MADV_DONTDUMP 16 +#define MADV_DONTFORK 10 +#define MADV_DONTNEED 6 +#define MADV_FREE 8 +#define MADV_HUGEPAGE 14 +#define MADV_MERGEABLE 12 +#define MADV_NOHUGEPAGE 15 +#define MADV_NORMAL 0 +#define MADV_RANDOM 1 +#define MADV_REMOVE 9 +#define MADV_SEQUENTIAL 2 +#define MADV_UNMERGEABLE 13 +#define MADV_WILLNEED 3 +#define MAP_ANONYMOUS 0x10 +#define MAP_DENYWRITE 0x02000 +#define MAP_EXECUTABLE 0x04000 +#define MAP_FILE 0 +#define MAP_FIXED 0x100 +#define MAP_GROWSDOWN 0x01000 +#define MAP_HUGETLB 0x100000 +#define MAP_LOCKED 0x08000 +#define MAP_NONBLOCK 0x40000 +#define MAP_NORESERVE 0x10000 +#define MAP_POPULATE 0x20000 +#define MAP_STACK 0x80000 +#define PROT_EXEC 0x4 +#define PROT_GROWSDOWN 0x01000000 +#define PROT_GROWSUP 0x02000000 +#define PROT_NONE 0x0 +#define PROT_READ 0x1 +#define PROT_SEM 0x8 +#define PROT_WRITE 0x2 +/* MADV_HWPOISON is undefined on alpha, fix it for perf */ +#define MADV_HWPOISON 100 +/* MADV_SOFT_OFFLINE is undefined on alpha, fix it for perf */ +#define MADV_SOFT_OFFLINE 101 +/* MAP_32BIT is undefined on alpha, fix it for perf */ +#define MAP_32BIT 0 +/* MAP_UNINITIALIZED is undefined on alpha, fix it for perf */ +#define MAP_UNINITIALIZED 0 +#endif /* TOOLS_ARCH_SW64_UAPI_ASM_MMAN_FIX_H */ diff --git a/tools/arch/sw_64/include/uapi/asm/perf_regs.h b/tools/arch/sw_64/include/uapi/asm/perf_regs.h new file mode 100644 index 000000000000..426ae642fcc8 --- /dev/null +++ b/tools/arch/sw_64/include/uapi/asm/perf_regs.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +#ifndef _ASM_SW64_PERF_REGS_H +#define _ASM_SW64_PERF_REGS_H + +enum perf_event_sw64_regs { + PERF_REG_SW64_R0, + PERF_REG_SW64_R1, + PERF_REG_SW64_R2, + PERF_REG_SW64_R3, + PERF_REG_SW64_R4, + PERF_REG_SW64_R5, + PERF_REG_SW64_R6, + PERF_REG_SW64_R7, + PERF_REG_SW64_R8, + PERF_REG_SW64_R19, + PERF_REG_SW64_R20, + PERF_REG_SW64_R21, + PERF_REG_SW64_R22, + PERF_REG_SW64_R23, + PERF_REG_SW64_R24, + PERF_REG_SW64_R25, + PERF_REG_SW64_R26, + PERF_REG_SW64_R27, + PERF_REG_SW64_R28, + PERF_REG_SW64_HAE, + PERF_REG_SW64_TRAP_A0, + PERF_REG_SW64_TRAP_A1, + PERF_REG_SW64_TRAP_A2, + PERF_REG_SW64_PS, + PERF_REG_SW64_PC, + PERF_REG_SW64_GP, + PERF_REG_SW64_R16, + PERF_REG_SW64_R17, + PERF_REG_SW64_R18, + PERF_REG_SW64_MAX, +}; +#endif /* _ASM_SW64_PERF_REGS_H */ diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index e1d2c255669e..c7a60517c229 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -53,6 +53,7 @@ FEATURE_TESTS_BASIC := \ libslang-include-subdir \ libcrypto \ libunwind \ + libunwind-sw_64 \ pthread-attr-setaffinity-np \ pthread-barrier \ reallocarray \ diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 22ea350dab58..d45ec15ac383 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -40,6 +40,7 @@ FILES= \ test-libunwind-x86_64.bin \ test-libunwind-arm.bin \ test-libunwind-aarch64.bin \ + test-libunwind-sw64.bin \ test-libunwind-debug-frame-arm.bin \ test-libunwind-debug-frame-aarch64.bin \ test-pthread-attr-setaffinity-np.bin \ @@ -180,6 +181,9 @@ $(OUTPUT)test-libunwind-arm.bin: $(OUTPUT)test-libunwind-aarch64.bin: $(BUILD) -lelf -lunwind-aarch64
+$(OUTPUT)test-libunwind-sw64.bin: + $(BUILD) -lelf -lunwind-sw_64 + $(OUTPUT)test-libunwind-debug-frame-arm.bin: $(BUILD) -lelf -lunwind-arm
diff --git a/tools/build/feature/test-libunwind-sw64.c b/tools/build/feature/test-libunwind-sw64.c new file mode 100644 index 000000000000..274948b961f4 --- /dev/null +++ b/tools/build/feature/test-libunwind-sw64.c @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <libunwind-sw_64.h> +#include <stdlib.h> + +extern int UNW_OBJ(dwarf_search_unwind_table) (unw_addr_space_t as, + unw_word_t ip, + unw_dyn_info_t *di, + unw_proc_info_t *pi, + int need_unwind_info, void *arg); + +#define dwarf_search_unwind_table UNW_OBJ(dwarf_search_unwind_table) + +static unw_accessors_t accessors; + +int main(void) +{ + unw_addr_space_t addr_space; + + addr_space = unw_create_addr_space(&accessors, 0); + if (addr_space) + return 0; + + unw_init_remote(NULL, addr_space, NULL); + dwarf_search_unwind_table(addr_space, 0, NULL, NULL, 0, NULL); + + return 0; +} diff --git a/tools/include/uapi/asm/errno.h b/tools/include/uapi/asm/errno.h index d30439b4b8ab..32e1e4668d88 100644 --- a/tools/include/uapi/asm/errno.h +++ b/tools/include/uapi/asm/errno.h @@ -9,6 +9,8 @@ #include "../../../arch/alpha/include/uapi/asm/errno.h" #elif defined(__mips__) #include "../../../arch/mips/include/uapi/asm/errno.h" +#elif defined(__sw_64__) +#include "../../../arch/sw_64/include/uapi/asm/errno.h" #elif defined(__xtensa__) #include "../../../arch/xtensa/include/uapi/asm/errno.h" #else diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index df1506aed51a..99266f1da1b2 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -82,6 +82,12 @@ ifeq ($(SRCARCH),csky) NO_PERF_REGS := 0 endif
+ifeq ($(SRCARCH),sw_64) + NO_PERF_REGS := 0 + CFLAGS += -mieee + LIBUNWIND_LIBS = -lunwind -lunwind-sw_64 +endif + ifeq ($(ARCH),s390) NO_PERF_REGS := 0 CFLAGS += -fPIC -I$(OUTPUT)arch/s390/include/generated @@ -597,6 +603,13 @@ ifndef NO_LIBUNWIND CFLAGS += -DNO_LIBUNWIND_DEBUG_FRAME_AARCH64 endif endif + ifeq ($(feature-libunwind-sw_64), 1) + $(call detected,CONFIG_LIBUNWIND_SW64) + CFLAGS += -DHAVE_LIBUNWIND_SW_64_SUPPORT + LDFLAGS += -lunwind-sw_64 + EXTLIBS_LIBUNWIND += -lunwind-sw_64 + have_libunwind = 1 + endif
ifneq ($(feature-libunwind), 1) msg := $(warning No libunwind found. Please install libunwind-dev[el] >= 1.1 and/or set LIBUNWIND_DIR); diff --git a/tools/perf/arch/sw_64/Build b/tools/perf/arch/sw_64/Build new file mode 100644 index 000000000000..36222e64bbf7 --- /dev/null +++ b/tools/perf/arch/sw_64/Build @@ -0,0 +1,2 @@ +perf-y += util/ +perf-$(CONFIG_DWARF_UNWIND) += tests/ diff --git a/tools/perf/arch/sw_64/Makefile b/tools/perf/arch/sw_64/Makefile new file mode 100644 index 000000000000..1aa9dd772489 --- /dev/null +++ b/tools/perf/arch/sw_64/Makefile @@ -0,0 +1,4 @@ +ifndef NO_DWARF +PERF_HAVE_DWARF_REGS := 1 +endif +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1 diff --git a/tools/perf/arch/sw_64/include/arch-tests.h b/tools/perf/arch/sw_64/include/arch-tests.h new file mode 100644 index 000000000000..90ec4c8cb880 --- /dev/null +++ b/tools/perf/arch/sw_64/include/arch-tests.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef ARCH_TESTS_H +#define ARCH_TESTS_H + +#ifdef HAVE_DWARF_UNWIND_SUPPORT +struct thread; +struct perf_sample; +#endif + +extern struct test arch_tests[]; + +#endif diff --git a/tools/perf/arch/sw_64/include/perf_regs.h b/tools/perf/arch/sw_64/include/perf_regs.h new file mode 100644 index 000000000000..44e064c02701 --- /dev/null +++ b/tools/perf/arch/sw_64/include/perf_regs.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef ARCH_PERF_REGS_H +#define ARCH_PERF_REGS_H + +#include <stdlib.h> +#include <linux/types.h> +#include <asm/perf_regs.h> + +void perf_regs_load(u64 *regs); + +#define PERF_REGS_MASK ((1ULL << PERF_REG_SW64_MAX) - 1) +#define PERF_REGS_MAX PERF_REG_SW64_MAX +#define PERF_SAMPLE_REGS_ABI PERF_SAMPLE_REGS_ABI_64 + +#define PERF_REG_IP PERF_REG_SW64_PC +#define PERF_REG_SP PERF_REG_SW64_HAE + +static inline const char *perf_reg_name(int id) +{ + switch (id) { + case PERF_REG_SW64_R0: + return "r0"; + default: + return NULL; + } + + return NULL; +} + +#endif /* ARCH_PERF_REGS_H */ diff --git a/tools/perf/arch/sw_64/tests/Build b/tools/perf/arch/sw_64/tests/Build new file mode 100644 index 000000000000..b8a38eadfb35 --- /dev/null +++ b/tools/perf/arch/sw_64/tests/Build @@ -0,0 +1,3 @@ +perf-y += regs_load.o +perf-y += dwarf-unwind.o +perf-y += arch-tests.o diff --git a/tools/perf/arch/sw_64/tests/arch-tests.c b/tools/perf/arch/sw_64/tests/arch-tests.c new file mode 100644 index 000000000000..5b1543c98022 --- /dev/null +++ b/tools/perf/arch/sw_64/tests/arch-tests.c @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <string.h> +#include "tests/tests.h" +#include "arch-tests.h" + +struct test arch_tests[] = { +#ifdef HAVE_DWARF_UNWIND_SUPPORT + { + .desc = "DWARF unwind", + .func = test__dwarf_unwind, + }, +#endif + { + .func = NULL, + }, +}; diff --git a/tools/perf/arch/sw_64/tests/dwarf-unwind.c b/tools/perf/arch/sw_64/tests/dwarf-unwind.c new file mode 100644 index 000000000000..fae50af5413f --- /dev/null +++ b/tools/perf/arch/sw_64/tests/dwarf-unwind.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <string.h> +#include "perf_regs.h" +#include "thread.h" +#include "map.h" +#include "maps.h" +#include "event.h" +#include "debug.h" +#include "tests/tests.h" + +#define STACK_SIZE 8192 + +static int sample_ustack(struct perf_sample *sample, + struct thread *thread, u64 *regs) +{ + struct stack_dump *stack = &sample->user_stack; + struct map *map; + unsigned long sp; + u64 stack_size, *buf; + + printf("enter %s\n", __func__); + buf = malloc(STACK_SIZE); + if (!buf) { + printf("failed to 
allocate sample uregs data\n"); + return -1; + } + + sp = (unsigned long) regs[30]; + + map = maps__find(thread->maps, (u64)sp); + if (!map) { + printf("failed to get stack map\n"); + free(buf); + return -1; + } + + stack_size = map->end - sp; + stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size; + + memcpy(buf, (void *) sp, stack_size); + stack->data = (char *) buf; + stack->size = stack_size; + return 0; +} + +int test__arch_unwind_sample(struct perf_sample *sample, + struct thread *thread) +{ + struct regs_dump *regs = &sample->user_regs; + u64 *buf; + + buf = calloc(1, sizeof(u64) * PERF_REGS_MAX); + if (!buf) { + printf("failed to allocate sample uregs data\n"); + return -1; + } + + perf_regs_load(buf); + regs->abi = PERF_SAMPLE_REGS_ABI; + regs->regs = buf; + regs->mask = PERF_REGS_MASK; + + return sample_ustack(sample, thread, buf); +} diff --git a/tools/perf/arch/sw_64/tests/regs_load.S b/tools/perf/arch/sw_64/tests/regs_load.S new file mode 100644 index 000000000000..75da34b7b843 --- /dev/null +++ b/tools/perf/arch/sw_64/tests/regs_load.S @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include <linux/linkage.h> + +.text +.set noat +.type perf_regs_load,%function +SYM_FUNC_START(perf_regs_load) + stl $0, 0x0($16); + stl $1, 0x8($16); + stl $2, 0x10($16); + stl $3, 0x18($16); + stl $4, 0x20($16); + stl $5, 0x28($16); + stl $6, 0x30($16); + stl $7, 0x38($16); + stl $8, 0x40($16); + stl $19, 0x48($16); + stl $20, 0x50($16); + stl $21, 0x58($16); + stl $22, 0x60($16); + stl $23, 0x68($16); + stl $24, 0x70($16); + stl $25, 0x78($16); + stl $26, 0x80($16); + stl $27, 0x88($16); + stl $28, 0x90($16); + stl $30, 0x98($16); + stl $20, 0xa0($16); + stl $21, 0xa8($16); + stl $22, 0xb0($16); + stl $23, 0xb8($16); + stl $26, 0xc0($16); + stl $29, 0xc8($16); + stl $16, 0xd0($16); + stl $17, 0xd8($16); + stl $18, 0xe0($16); + ret +SYM_FUNC_END(perf_regs_load) diff --git a/tools/perf/arch/sw_64/util/Build b/tools/perf/arch/sw_64/util/Build new file mode 100644 index 000000000000..39f459b636a0 --- /dev/null +++ b/tools/perf/arch/sw_64/util/Build @@ -0,0 +1,4 @@ +perf-y += perf_regs.o +perf-$(CONFIG_DWARF) += dwarf-regs.o +perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o +perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o diff --git a/tools/perf/arch/sw_64/util/dwarf-regs.c b/tools/perf/arch/sw_64/util/dwarf-regs.c new file mode 100644 index 000000000000..b9471d8f47d2 --- /dev/null +++ b/tools/perf/arch/sw_64/util/dwarf-regs.c @@ -0,0 +1,90 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Mapping of DWARF debug register numbers into register names. + * + * Copyright (C) 2010 Will Deacon, ARM Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ */ + +#include <errno.h> +#include <stddef.h> +#include <string.h> +#include <dwarf-regs.h> +#include <linux/ptrace.h> /* for struct user_pt_regs */ +#include <linux/stringify.h> +#include "util.h" + +struct pt_regs_dwarfnum { + const char *name; + unsigned int dwarfnum; +}; + +#define REG_DWARFNUM_NAME(r, num) {.name = r, .dwarfnum = num} +#define REG_DWARFNUM_END {.name = NULL, .dwarfnum = 0} + +static const struct pt_regs_dwarfnum regdwarfnum_table[] = { + REG_DWARFNUM_NAME("%v0", 0), + REG_DWARFNUM_NAME("%t0", 1), + REG_DWARFNUM_NAME("%t1", 2), + REG_DWARFNUM_NAME("%t2", 3), + REG_DWARFNUM_NAME("%t3", 4), + REG_DWARFNUM_NAME("%t4", 5), + REG_DWARFNUM_NAME("%t5", 6), + REG_DWARFNUM_NAME("%t6", 7), + REG_DWARFNUM_NAME("%t7", 8), + REG_DWARFNUM_NAME("%s0", 9), + REG_DWARFNUM_NAME("%s1", 10), + REG_DWARFNUM_NAME("%s2", 11), + REG_DWARFNUM_NAME("%s3", 12), + REG_DWARFNUM_NAME("%s4", 13), + REG_DWARFNUM_NAME("%s5", 14), + REG_DWARFNUM_NAME("%s6", 15), + REG_DWARFNUM_NAME("%a0", 16), + REG_DWARFNUM_NAME("%a1", 17), + REG_DWARFNUM_NAME("%a2", 18), + REG_DWARFNUM_NAME("%a3", 19), + REG_DWARFNUM_NAME("%a4", 20), + REG_DWARFNUM_NAME("%a5", 21), + REG_DWARFNUM_NAME("%t8", 22), + REG_DWARFNUM_NAME("%t9", 23), + REG_DWARFNUM_NAME("%t10", 24), + REG_DWARFNUM_NAME("%t11", 25), + REG_DWARFNUM_NAME("%ra", 26), + REG_DWARFNUM_NAME("%pv", 27), + REG_DWARFNUM_NAME("%at", 28), + REG_DWARFNUM_NAME("%gp", 29), + REG_DWARFNUM_NAME("%sp", 30), + REG_DWARFNUM_NAME("%zero", 31), + REG_DWARFNUM_END, +}; + +/** + * get_arch_regstr() - lookup register name from it's DWARF register number + * @n: the DWARF register number + * + * get_arch_regstr() returns the name of the register in struct + * regdwarfnum_table from it's DWARF register number. If the register is not + * found in the table, this returns NULL; + */ +const char *get_arch_regstr(unsigned int n) +{ + const struct pt_regs_dwarfnum *roff; + + for (roff = regdwarfnum_table; roff->name != NULL; roff++) + if (roff->dwarfnum == n) + return roff->name; + return NULL; +} + +int regs_query_register_offset(const char *name) +{ + const struct pt_regs_dwarfnum *roff; + + for (roff = regdwarfnum_table; roff->name != NULL; roff++) + if (!strcmp(roff->name, name)) + return roff->dwarfnum; + return -EINVAL; +} diff --git a/tools/perf/arch/sw_64/util/perf_regs.c b/tools/perf/arch/sw_64/util/perf_regs.c new file mode 100644 index 000000000000..2833e101a7c6 --- /dev/null +++ b/tools/perf/arch/sw_64/util/perf_regs.c @@ -0,0 +1,6 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "../../../util/perf_regs.h" + +const struct sample_reg sample_reg_masks[] = { + SMPL_REG_END +}; diff --git a/tools/perf/arch/sw_64/util/unwind-libdw.c b/tools/perf/arch/sw_64/util/unwind-libdw.c new file mode 100644 index 000000000000..3e2b6acc40ac --- /dev/null +++ b/tools/perf/arch/sw_64/util/unwind-libdw.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <elfutils/libdwfl.h> +#include "../../util/unwind-libdw.h" +#include "../../util/perf_regs.h" +#include "../../util/event.h" + +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg) +{ + struct unwind_info *ui = arg; + struct regs_dump *user_regs = &ui->sample->user_regs; + Dwarf_Word dwarf_regs[PERF_REG_SW64_MAX], dwarf_pc; + +#define REG(r) ({ \ + Dwarf_Word val = 0; \ + perf_reg_value(&val, user_regs, PERF_REG_SW64_##r); \ + val; \ +}) + + dwarf_regs[0] = REG(R0); + dwarf_regs[1] = REG(R1); + dwarf_regs[2] = REG(R2); + dwarf_regs[3] = REG(R3); + dwarf_regs[4] = REG(R4); + dwarf_regs[5] = REG(R5); + 
dwarf_regs[6] = REG(R6); + dwarf_regs[7] = REG(R7); + dwarf_regs[8] = REG(R8); + dwarf_regs[9] = REG(R9); + dwarf_regs[10] = REG(R10); + dwarf_regs[11] = REG(R11); + dwarf_regs[12] = REG(R12); + dwarf_regs[13] = REG(R13); + dwarf_regs[14] = REG(R14); + dwarf_regs[15] = REG(R15); + dwarf_regs[16] = REG(R16); + dwarf_regs[17] = REG(R17); + dwarf_regs[18] = REG(R18); + dwarf_regs[19] = REG(R19); + dwarf_regs[20] = REG(R20); + dwarf_regs[21] = REG(R21); + dwarf_regs[22] = REG(R22); + dwarf_regs[23] = REG(R23); + dwarf_regs[24] = REG(R24); + dwarf_regs[25] = REG(R25); + dwarf_regs[26] = REG(R26); + dwarf_regs[27] = REG(R27); + dwarf_regs[28] = REG(R28); + dwarf_regs[29] = REG(R29); + dwarf_regs[30] = REG(R30); + dwarf_regs[31] = REG(R31); + + if (!dwfl_thread_state_registers(thread, 0, PERF_REG_SW64_MAX, + dwarf_regs)) + return false; + + dwarf_pc = REG(PC); + dwfl_thread_state_register_pc(thread, dwarf_pc); + + return true; +} diff --git a/tools/perf/arch/sw_64/util/unwind-libunwind.c b/tools/perf/arch/sw_64/util/unwind-libunwind.c new file mode 100644 index 000000000000..e1dc380610bc --- /dev/null +++ b/tools/perf/arch/sw_64/util/unwind-libunwind.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <errno.h> + +#ifndef REMOTE_UNWIND_LIBUNWIND +#include <libunwind.h> +#include "perf_regs.h" +#include "../../util/unwind.h" +#include "../../util/debug.h" +#endif + +int LIBUNWIND__ARCH_REG_ID(int regnum) +{ + switch (regnum) { + case UNW_SW_64_R26: + return PERF_REG_SW64_R26; + case UNW_SW_64_R30: + return PERF_REG_SW64_HAE; + case UNW_SW_64_PC: + return PERF_REG_SW64_PC; + default: + pr_err("unwind: invalid reg id %d\n", regnum); + return -EINVAL; + } + + return -EINVAL; +} diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build index 4d15bf6041fb..7a94042e337c 100644 --- a/tools/perf/tests/Build +++ b/tools/perf/tests/Build @@ -91,7 +91,7 @@ $(OUTPUT)tests/llvm-src-relocation.c: tests/bpf-script-test-relocation.c tests/B $(Q)sed -e 's/"/\"/g' -e 's/(.*)/"\1\n"/g' $< >> $@ $(Q)echo ';' >> $@
-ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc)) +ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc sw_64)) perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o endif
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h index c85a2c08e407..bab19bab21eb 100644 --- a/tools/perf/tests/tests.h +++ b/tools/perf/tests/tests.h @@ -129,7 +129,7 @@ bool test__bp_signal_is_supported(void); bool test__bp_account_is_supported(void); bool test__wp_is_supported(void);
-#if defined(__arm__) || defined(__aarch64__) +#if defined(__arm__) || defined(__aarch64__) || defined(__sw_64__) #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 0cf27354aa45..661d995f5016 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -168,6 +168,7 @@ perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind-local.o perf-$(CONFIG_LIBUNWIND) += unwind-libunwind.o perf-$(CONFIG_LIBUNWIND_X86) += libunwind/x86_32.o perf-$(CONFIG_LIBUNWIND_AARCH64) += libunwind/arm64.o +perf-$(CONFIG_LIBUNWIND_SW64) += libunwind/sw64.o
perf-$(CONFIG_LIBBABELTRACE) += data-convert-bt.o
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 4aaaf23b4878..e8f188c2de8b 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -170,6 +170,9 @@ static struct arch architectures[] = { .name = "arm64", .init = arm64__annotate_init, }, + { + .name = "sw_64", + }, { .name = "csky", .init = csky__annotate_init, diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c index 342a83025a86..f243d9acf943 100644 --- a/tools/perf/util/env.c +++ b/tools/perf/util/env.c @@ -335,6 +335,8 @@ static const char *normalize_arch(char *arch) return "arm64"; if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110")) return "arm"; + if (!strncmp(arch, "sw_64", 5)) + return "sw_64"; if (!strncmp(arch, "s390", 4)) return "s390"; if (!strncmp(arch, "parisc", 6)) diff --git a/tools/perf/util/libunwind/sw64.c b/tools/perf/util/libunwind/sw64.c new file mode 100644 index 000000000000..45125b1034f8 --- /dev/null +++ b/tools/perf/util/libunwind/sw64.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This file setups defines to compile arch specific binary from the + * generic one. + * + * The function 'LIBUNWIND__ARCH_REG_ID' name is set according to arch + * name and the defination of this function is included directly from + * 'arch/arm64/util/unwind-libunwind.c', to make sure that this function + * is defined no matter what arch the host is. + * + * Finally, the arch specific unwind methods are exported which will + * be assigned to each arm64 thread. + */ + +#define REMOTE_UNWIND_LIBUNWIND + +/* Define arch specific functions & regs for libunwind, should be + * defined before including "unwind.h" + */ +#define LIBUNWIND__ARCH_REG_ID(regnum) libunwind__arm64_reg_id(regnum) +#define LIBUNWIND__ARCH_REG_IP PERF_REG_SW64_PC +#define LIBUNWIND__ARCH_REG_SP PERF_REG_SW64_HAE + +#include "unwind.h" +#include "debug.h" +#include "libunwind-sw_64.h" +#include <../../../../arch/sw_64/include/uapi/asm/perf_regs.h> +#include "../../arch/sw_64/util/unwind-libunwind.c" + +#include "util/unwind-libunwind-local.c" + +struct unwind_libunwind_ops * +sw64_unwind_libunwind_ops = &_unwind_libunwind_ops; diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c index e89a5479b361..2fe4c394bba7 100644 --- a/tools/perf/util/unwind-libunwind.c +++ b/tools/perf/util/unwind-libunwind.c @@ -11,6 +11,7 @@ struct unwind_libunwind_ops __weak *local_unwind_libunwind_ops; struct unwind_libunwind_ops __weak *x86_32_unwind_libunwind_ops; struct unwind_libunwind_ops __weak *arm64_unwind_libunwind_ops; +struct unwind_libunwind_ops __weak *sw64_unwind_libunwind_ops;
static void unwind__register_ops(struct maps *maps, struct unwind_libunwind_ops *ops) { @@ -51,6 +52,9 @@ int unwind__prepare_access(struct maps *maps, struct map *map, bool *initialized } else if (!strcmp(arch, "arm64") || !strcmp(arch, "arm")) { if (dso_type == DSO__TYPE_64BIT) ops = arm64_unwind_libunwind_ops; + } else if (!strcmp(arch, "sw_64")) { + if (dso_type == DSO__TYPE_64BIT) + ops = sw64_unwind_libunwind_ops; }
if (!ops) {
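For reference, the DWARF number-to-name mapping added in dwarf-regs.c above is a simple linear table scan. A standalone sketch of the same lookup, with the table truncated to a few entries taken from the patch:

#include <stdio.h>

struct reg { const char *name; unsigned int dwarfnum; };

/* A few entries from the sw_64 table added above. */
static const struct reg regs[] = {
	{ "%v0", 0 }, { "%a0", 16 }, { "%ra", 26 },
	{ "%gp", 29 }, { "%sp", 30 }, { NULL, 0 },
};

static const char *get_arch_regstr(unsigned int n)
{
	const struct reg *r;

	for (r = regs; r->name; r++)
		if (r->dwarfnum == n)
			return r->name;
	return NULL;
}

int main(void)
{
	printf("DWARF reg 26 -> %s\n", get_arch_regstr(26)); /* "%ra" */
	return 0;
}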
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
As we want to get the ACPI tables to parse and then use the information for system initialization, we should get the RSDP (Root System Description Pointer) first; it then locates the Extended System Description Table (XSDT), which contains the 64-bit physical addresses that point to the other boot-time tables.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/acpi/acpi_apd.c | 19 ++++++- drivers/irqchip/Kconfig | 8 +++ drivers/irqchip/Makefile | 1 + drivers/irqchip/irq-intc-v1.c | 104 ++++++++++++++++++++++++++++++++++ include/acpi/pdc_sw64.h | 34 +++++++++++ 5 files changed, 165 insertions(+), 1 deletion(-) create mode 100644 drivers/irqchip/irq-intc-v1.c create mode 100644 include/acpi/pdc_sw64.h
diff --git a/drivers/acpi/acpi_apd.c b/drivers/acpi/acpi_apd.c index 645e82a66bb0..f1841d1714be 100644 --- a/drivers/acpi/acpi_apd.c +++ b/drivers/acpi/acpi_apd.c @@ -40,7 +40,8 @@ struct apd_private_data { const struct apd_device_desc *dev_desc; };
-#if defined(CONFIG_X86_AMD_PLATFORM_DEVICE) || defined(CONFIG_ARM64) +#if defined(CONFIG_X86_AMD_PLATFORM_DEVICE) || \ +defined(CONFIG_ARM64) || defined(CONFIG_SW64) #define APD_ADDR(desc) ((unsigned long)&desc)
static int acpi_apd_setup(struct apd_private_data *pdata) @@ -174,6 +175,18 @@ static const struct apd_device_desc hip08_spi_desc = { }; #endif /* CONFIG_ARM64 */
+#ifdef CONFIG_SW64 +static const struct apd_device_desc sunway_i2c_desc = { + .setup = acpi_apd_setup, + .fixed_clk_rate = 25000000, +}; + +static const struct apd_device_desc sunway_spi_desc = { + .setup = acpi_apd_setup, + .fixed_clk_rate = 25000000, +}; +#endif + #endif
/** @@ -241,6 +254,10 @@ static const struct acpi_device_id acpi_apd_device_ids[] = { { "HISI02A3", APD_ADDR(hip08_lite_i2c_desc) }, { "HISI0173", APD_ADDR(hip08_spi_desc) }, { "NXP0001", APD_ADDR(nxp_i2c_desc) }, +#endif +#ifdef CONFIG_SW64 + { "HISI02A1", APD_ADDR(sunway_i2c_desc) }, + { "HISI0173", APD_ADDR(sunway_spi_desc) }, #endif { } }; diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig index f52568d46a19..214d7fd1fdd1 100644 --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -1,6 +1,14 @@ # SPDX-License-Identifier: GPL-2.0-only menu "IRQ chip support"
+config SW64_INTC + bool "SW64 Platform-Level Interrupt Controller" + depends on ACPI && SW64 + help + This enables support for the INTC chip found in SW systems. + The INTC controls devices interrupts and connects them to each + core's local interrupt controller. + config IRQCHIP def_bool y depends on OF_IRQ diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile index ed171c68cc6d..4c78b0f64e6c 100644 --- a/drivers/irqchip/Makefile +++ b/drivers/irqchip/Makefile @@ -27,6 +27,7 @@ obj-$(CONFIG_ARCH_SUNXI) += irq-sun4i.o obj-$(CONFIG_ARCH_SUNXI) += irq-sunxi-nmi.o obj-$(CONFIG_ARCH_SPEAR3XX) += spear-shirq.o obj-$(CONFIG_ARM_GIC) += irq-gic.o irq-gic-common.o +obj-$(CONFIG_SW64_INTC) += irq-intc-v1.o obj-$(CONFIG_ARM_GIC_PM) += irq-gic-pm.o obj-$(CONFIG_ARCH_REALVIEW) += irq-gic-realview.o obj-$(CONFIG_ARM_GIC_V2M) += irq-gic-v2m.o diff --git a/drivers/irqchip/irq-intc-v1.c b/drivers/irqchip/irq-intc-v1.c new file mode 100644 index 000000000000..4519e96526fb --- /dev/null +++ b/drivers/irqchip/irq-intc-v1.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/acpi_iort.h> +#include <linux/msi.h> +#include <linux/acpi.h> +#include <linux/irqdomain.h> +#include <linux/interrupt.h> +#include <linux/cpumask.h> +#include <linux/io.h> +#include <linux/percpu.h> +#include <linux/slab.h> +#include <linux/irqchip.h> +#include <asm/sw64io.h> +static void fake_irq_mask(struct irq_data *data) +{ +} + +static void fake_irq_unmask(struct irq_data *data) +{ +} + +static struct irq_chip onchip_intc = { + .name = "SW fake Intc", + .irq_mask = fake_irq_mask, + .irq_unmask = fake_irq_unmask, +}; + +static int sw_intc_domain_map(struct irq_domain *d, unsigned int irq, + irq_hw_number_t hw) +{ + irq_set_chip_and_handler(irq, &onchip_intc, handle_level_irq); + irq_set_status_flags(irq, IRQ_LEVEL); + return 0; +} + +static const struct irq_domain_ops intc_irq_domain_ops = { + .xlate = irq_domain_xlate_onecell, + .map = sw_intc_domain_map, +}; + +#ifdef CONFIG_ACPI + +static int __init +intc_parse_madt(union acpi_subtable_headers *header, + const unsigned long end) +{ + struct acpi_madt_io_sapic *its_entry; + static struct irq_domain *root_domain; + int intc_irqs = 8, irq_base = NR_IRQS_LEGACY; + irq_hw_number_t hwirq_base = 0; + int irq_start = -1; + + its_entry = (struct acpi_madt_io_sapic *)header; + + intc_irqs -= hwirq_base; /* calculate # of irqs to allocate */ + + irq_base = irq_alloc_descs(irq_start, 16, intc_irqs, + numa_node_id()); + if (irq_base < 0) { + WARN(1, "Cannot allocate irq_descs @ IRQ%d, assuming pre-allocated\n", + irq_start); + irq_base = irq_start; + } + + root_domain = irq_domain_add_legacy(NULL, intc_irqs, irq_base, + hwirq_base, &intc_irq_domain_ops, NULL); + + if (!root_domain) + pr_err("Failed to create irqdomain"); + + irq_set_default_host(root_domain); + + sw64_io_write(0, MCU_DVC_INT_EN, 0xff); + + return 0; +} + +static int __init acpi_intc_init(void) +{ + int count = 0; + + count = acpi_table_parse_madt(ACPI_MADT_TYPE_IO_SAPIC, + intc_parse_madt, 0); + + if (count <= 0) { + pr_err("No valid intc entries exist\n"); + return -EINVAL; + } + return 0; +} +#else +static int __init acpi_intc_init(void) +{ + return 0; +} +#endif + +static int __init intc_init(void) +{ + acpi_intc_init(); + + return 0; +} +subsys_initcall(intc_init); diff --git a/include/acpi/pdc_sw64.h b/include/acpi/pdc_sw64.h new file mode 100644 index 000000000000..4724f10e8c6a --- /dev/null +++ b/include/acpi/pdc_sw64.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef 
_ASM_PDC_SW64_H +#define _ASM_PDC_SW64_H + +#define ACPI_PDC_P_FFH (0x0001) +#define ACPI_PDC_C_C1_HALT (0x0002) +#define ACPI_PDC_T_FFH (0x0004) +#define ACPI_PDC_SMP_C1PT (0x0008) +#define ACPI_PDC_SMP_C2C3 (0x0010) +#define ACPI_PDC_SMP_P_SWCOORD (0x0020) +#define ACPI_PDC_SMP_C_SWCOORD (0x0040) +#define ACPI_PDC_SMP_T_SWCOORD (0x0080) +#define ACPI_PDC_C_C1_FFH (0x0100) +#define ACPI_PDC_C_C2C3_FFH (0x0200) +#define ACPI_PDC_SMP_P_HWCOORD (0x0800) + +#define ACPI_PDC_EST_CAPABILITY_SMP (ACPI_PDC_SMP_C1PT | \ + ACPI_PDC_C_C1_HALT | \ + ACPI_PDC_P_FFH) + +#define ACPI_PDC_EST_CAPABILITY_SWSMP (ACPI_PDC_SMP_C1PT | \ + ACPI_PDC_C_C1_HALT | \ + ACPI_PDC_SMP_P_SWCOORD | \ + ACPI_PDC_SMP_P_HWCOORD | \ + ACPI_PDC_P_FFH) + +#define ACPI_PDC_C_CAPABILITY_SMP (ACPI_PDC_SMP_C2C3 | \ + ACPI_PDC_SMP_C1PT | \ + ACPI_PDC_C_C1_HALT | \ + ACPI_PDC_C_C1_FFH | \ + ACPI_PDC_C_C2C3_FFH) + +#endif /* _ASM_PDC_SW64_H */
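The XSDT walk described above is mechanical: the table header's length field gives the total size, and everything past the header is an array of 64-bit physical addresses of the other boot-time tables. A hedged sketch of counting and iterating those entries, with the struct layout per the ACPI specification and physical-memory mapping elided (the fake table in main() exists only to exercise the walk):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Common ACPI table header, per the ACPI specification. */
struct acpi_table_header {
	char     signature[4];
	uint32_t length;	/* total table size, header included */
	uint8_t  revision;
	uint8_t  checksum;
	char     oem_id[6];
	char     oem_table_id[8];
	uint32_t oem_revision;
	char     asl_compiler_id[4];
	uint32_t asl_compiler_revision;
} __attribute__((packed));

/* Everything after the XSDT header is an array of 64-bit physical
 * addresses; memcpy avoids unaligned 8-byte loads. */
static void walk_xsdt(const struct acpi_table_header *xsdt)
{
	const uint8_t *p = (const uint8_t *)(xsdt + 1);
	uint32_t i, count = (xsdt->length - sizeof(*xsdt)) / sizeof(uint64_t);

	for (i = 0; i < count; i++) {
		uint64_t addr;

		memcpy(&addr, p + i * sizeof(addr), sizeof(addr));
		printf("boot-time table %u at physical 0x%llx\n", i,
		       (unsigned long long)addr);
	}
}

int main(void)
{
	/* Fake two-entry XSDT, for demonstration only. */
	struct {
		struct acpi_table_header h;
		uint64_t e[2];
	} __attribute__((packed)) x = {
		.h = { .signature = "XSDT", .length = sizeof(x) },
		.e = { 0x7ff00000, 0x7ff01000 },
	};

	walk_xsdt(&x.h);
	return 0;
}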
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/gpio/Kconfig | 9 + drivers/gpio/Makefile | 1 + drivers/gpio/gpio-sunway.c | 861 ++++++++++++++++++++++ include/linux/platform_data/gpio-sunway.h | 33 + 4 files changed, 904 insertions(+) create mode 100644 drivers/gpio/gpio-sunway.c create mode 100644 include/linux/platform_data/gpio-sunway.h
diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index b6240ea130b9..c91355c89ec6 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -232,6 +232,15 @@ config GPIO_DWAPB Say Y or M here to build support for the Synopsys DesignWare APB GPIO block.
+config GPIO_SUNWAY + tristate "Sunway gpio driver" + depends on SW64 + select GPIO_GENERIC + select GENERIC_IRQ_CHIP + help + Say Y or M here to build support for the Sunway + GPIO block. + config GPIO_EIC_SPRD tristate "Spreadtrum EIC support" depends on ARCH_SPRD || COMPILE_TEST diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile index 260ae251317c..1af147d8ad71 100644 --- a/drivers/gpio/Makefile +++ b/drivers/gpio/Makefile @@ -181,3 +181,4 @@ obj-$(CONFIG_GPIO_XTENSA) += gpio-xtensa.o obj-$(CONFIG_GPIO_ZEVIO) += gpio-zevio.o obj-$(CONFIG_GPIO_ZX) += gpio-zx.o obj-$(CONFIG_GPIO_ZYNQ) += gpio-zynq.o +obj-$(CONFIG_GPIO_SUNWAY) += gpio-sunway.o diff --git a/drivers/gpio/gpio-sunway.c b/drivers/gpio/gpio-sunway.c new file mode 100644 index 000000000000..6ff8b554d6be --- /dev/null +++ b/drivers/gpio/gpio-sunway.c @@ -0,0 +1,861 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2011 Jamie Iles + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * All enquiries to support@picochip.com + */ +#include <linux/acpi.h> +#include <linux/clk.h> +#include <linux/err.h> +#include <linux/gpio/driver.h> +#include <linux/init.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/ioport.h> +#include <linux/irq.h> +#include <linux/irqdomain.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/of_address.h> +#include <linux/of_device.h> +#include <linux/of_irq.h> +#include <linux/platform_device.h> +#include <linux/property.h> +#include <linux/reset.h> +#include <linux/spinlock.h> +#include <linux/platform_data/gpio-sunway.h> +#include <linux/slab.h> + +#include "gpiolib.h" +#include "gpiolib-acpi.h" + + +#define GPIO_SWPORTA_DR (0x00UL<<7) +#define GPIO_SWPORTA_DDR (0X04UL<<7) +#define GPIO_SWPORTB_DR (0X0CUL<<7) +#define GPIO_SWPORTB_DDR (0X10UL<<7) +#define GPIO_SWPORTC_DR (0x18UL<<7) +#define GPIO_SWPORTC_DDR (0x1cUL<<7) +#define GPIO_SWPORTD_DR (0x24UL<<7) +#define GPIO_SWPORTD_DDR (0x28UL<<7) +#define GPIO_INTEN (0x30UL<<7) +#define GPIO_INTMASK (0x34UL<<7) +#define GPIO_INTTYPE_LEVEL (0x38UL<<7) +#define GPIO_INT_POLARITY (0x3cUL<<7) +#define GPIO_INTSTATUS (0x40UL<<7) +#define GPIO_PORTA_DEBOUNCE (0x48UL<<7) +#define GPIO_PORTA_EOI (0x4cUL<<7) +#define GPIO_EXT_PORTA (0x50UL<<7) +#define GPIO_EXT_PORTB (0x54UL<<7) +#define GPIO_EXT_PORTC (0x58UL<<7) +#define GPIO_EXT_PORTD (0x5cUL<<7) + +#define DWAPB_MAX_PORTS 4 +#define GPIO_EXT_PORT_STRIDE 0x04 /* register stride 32 bits */ +#define GPIO_SWPORT_DR_STRIDE 0x0c /* register stride 3*32 bits */ +#define GPIO_SWPORT_DDR_STRIDE 0x0c /* register stride 3*32 bits */ + +#define GPIO_REG_OFFSET_V2 1 + +#define GPIO_INTMASK_V2 0x44 +#define GPIO_INTTYPE_LEVEL_V2 0x34 +#define GPIO_INT_POLARITY_V2 0x38 +#define GPIO_INTSTATUS_V2 0x3c +#define GPIO_PORTA_EOI_V2 0x40 + +struct sunway_gpio; + +#ifdef CONFIG_PM_SLEEP +/* Store GPIO context across system-wide suspend/resume transitions */ +struct sunway_context { + u32 data; + u32 dir; + u32 ext; + u32 int_en; + u32 int_mask; + u32 int_type; + u32 int_pol; + u32 int_deb; + u32 wake_en; +}; +#endif + +struct sunway_gpio_port { + struct gpio_chip gc; + bool is_registered; + struct sunway_gpio *gpio; +#ifdef CONFIG_PM_SLEEP + struct sunway_context *ctx; +#endif + unsigned int idx; +}; + +struct sunway_gpio { + struct device *dev; + void __iomem *regs; + struct sunway_gpio_port *ports; + unsigned int nr_ports; + struct 
irq_domain *domain; + unsigned int flags; + struct reset_control *rst; + struct clk *clk; +}; + +static inline u32 gpio_reg_v2_convert(unsigned int offset) +{ + switch (offset) { + case GPIO_INTMASK: + return GPIO_INTMASK_V2; + case GPIO_INTTYPE_LEVEL: + return GPIO_INTTYPE_LEVEL_V2; + case GPIO_INT_POLARITY: + return GPIO_INT_POLARITY_V2; + case GPIO_INTSTATUS: + return GPIO_INTSTATUS_V2; + case GPIO_PORTA_EOI: + return GPIO_PORTA_EOI_V2; + } + + return offset; +} + +static inline u32 gpio_reg_convert(struct sunway_gpio *gpio, unsigned int offset) +{ + if (gpio->flags & GPIO_REG_OFFSET_V2) + return gpio_reg_v2_convert(offset); + + return offset; +} + +static inline u32 sunway_read(struct sunway_gpio *gpio, unsigned int offset) +{ + struct gpio_chip *gc = &gpio->ports[0].gc; + void __iomem *reg_base = gpio->regs; + + return gc->read_reg(reg_base + gpio_reg_convert(gpio, offset)); +} + +static inline void sunway_write(struct sunway_gpio *gpio, unsigned int offset, + u32 val) +{ + struct gpio_chip *gc = &gpio->ports[0].gc; + void __iomem *reg_base = gpio->regs; + + gc->write_reg(reg_base + gpio_reg_convert(gpio, offset), val); +} + +static int sunway_gpio_to_irq(struct gpio_chip *gc, unsigned int offset) +{ + struct sunway_gpio_port *port = gpiochip_get_data(gc); + struct sunway_gpio *gpio = port->gpio; + + return irq_find_mapping(gpio->domain, offset); +} + +static struct sunway_gpio_port *sunway_offs_to_port(struct sunway_gpio *gpio, unsigned int offs) +{ + struct sunway_gpio_port *port; + int i; + + for (i = 0; i < gpio->nr_ports; i++) { + port = &gpio->ports[i]; + if (port->idx == offs / 32) + return port; + } + + return NULL; +} + +static void sunway_toggle_trigger(struct sunway_gpio *gpio, unsigned int offs) +{ + struct sunway_gpio_port *port = sunway_offs_to_port(gpio, offs); + struct gpio_chip *gc; + u32 pol; + int val; + + if (!port) + return; + gc = &port->gc; + + pol = sunway_read(gpio, GPIO_INT_POLARITY); + /* Just read the current value right out of the data register */ + val = gc->get(gc, offs % 32); + if (val) + pol &= ~BIT(offs); + else + pol |= BIT(offs); + + sunway_write(gpio, GPIO_INT_POLARITY, pol); +} + +static u32 sunway_do_irq(struct sunway_gpio *gpio) +{ + u32 irq_status = sunway_read(gpio, GPIO_INTSTATUS); + u32 ret = irq_status; + + while (irq_status) { + int hwirq = fls(irq_status) - 1; + int gpio_irq = irq_find_mapping(gpio->domain, hwirq); + + generic_handle_irq(gpio_irq); + irq_status &= ~BIT(hwirq); + + if ((irq_get_trigger_type(gpio_irq) & IRQ_TYPE_SENSE_MASK) + == IRQ_TYPE_EDGE_BOTH) + sunway_toggle_trigger(gpio, hwirq); + } + + return ret; +} + +static void sunway_irq_handler(struct irq_desc *desc) +{ + struct sunway_gpio *gpio = irq_desc_get_handler_data(desc); + struct irq_chip *chip = irq_desc_get_chip(desc); + + sunway_do_irq(gpio); + + if (chip->irq_eoi) + chip->irq_eoi(irq_desc_get_irq_data(desc)); +} + +static void sunway_irq_enable(struct irq_data *d) +{ + struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d); + struct sunway_gpio *gpio = igc->private; + struct gpio_chip *gc = &gpio->ports[0].gc; + unsigned long flags; + u32 val; + + spin_lock_irqsave(&gc->bgpio_lock, flags); + val = sunway_read(gpio, GPIO_INTEN); + val |= BIT(d->hwirq); + sunway_write(gpio, GPIO_INTEN, val); + spin_unlock_irqrestore(&gc->bgpio_lock, flags); +} + +static void sunway_irq_disable(struct irq_data *d) +{ + struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d); + struct sunway_gpio *gpio = igc->private; + struct gpio_chip *gc = &gpio->ports[0].gc; + unsigned 
long flags; + u32 val; + + spin_lock_irqsave(&gc->bgpio_lock, flags); + val = sunway_read(gpio, GPIO_INTEN); + val &= ~BIT(d->hwirq); + sunway_write(gpio, GPIO_INTEN, val); + spin_unlock_irqrestore(&gc->bgpio_lock, flags); +} + +static int sunway_irq_reqres(struct irq_data *d) +{ + struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d); + struct sunway_gpio *gpio = igc->private; + struct gpio_chip *gc = &gpio->ports[0].gc; + int ret; + + ret = gpiochip_lock_as_irq(gc, irqd_to_hwirq(d)); + if (ret) { + dev_err(gpio->dev, "unable to lock HW IRQ %lu for IRQ\n", + irqd_to_hwirq(d)); + return ret; + } + return 0; +} + +static void sunway_irq_relres(struct irq_data *d) +{ + struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d); + struct sunway_gpio *gpio = igc->private; + struct gpio_chip *gc = &gpio->ports[0].gc; + + gpiochip_unlock_as_irq(gc, irqd_to_hwirq(d)); +} + +static int sunway_irq_set_type(struct irq_data *d, u32 type) +{ + struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d); + struct sunway_gpio *gpio = igc->private; + struct gpio_chip *gc = &gpio->ports[0].gc; + int bit = d->hwirq; + unsigned long level, polarity, flags; + + if (type & ~(IRQ_TYPE_EDGE_RISING | IRQ_TYPE_EDGE_FALLING | + IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW)) + return -EINVAL; + + spin_lock_irqsave(&gc->bgpio_lock, flags); + level = sunway_read(gpio, GPIO_INTTYPE_LEVEL); + polarity = sunway_read(gpio, GPIO_INT_POLARITY); + + switch (type) { + case IRQ_TYPE_EDGE_BOTH: + level |= BIT(bit); + sunway_toggle_trigger(gpio, bit); + break; + case IRQ_TYPE_EDGE_RISING: + level |= BIT(bit); + polarity |= BIT(bit); + break; + case IRQ_TYPE_EDGE_FALLING: + level |= BIT(bit); + polarity &= ~BIT(bit); + break; + case IRQ_TYPE_LEVEL_HIGH: + level &= ~BIT(bit); + polarity |= BIT(bit); + break; + case IRQ_TYPE_LEVEL_LOW: + level &= ~BIT(bit); + polarity &= ~BIT(bit); + break; + } + + irq_setup_alt_chip(d, type); + + sunway_write(gpio, GPIO_INTTYPE_LEVEL, level); + if (type != IRQ_TYPE_EDGE_BOTH) + sunway_write(gpio, GPIO_INT_POLARITY, polarity); + spin_unlock_irqrestore(&gc->bgpio_lock, flags); + + return 0; +} + +#ifdef CONFIG_PM_SLEEP +static int sunway_irq_set_wake(struct irq_data *d, unsigned int enable) +{ + struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d); + struct sunway_gpio *gpio = igc->private; + struct sunway_context *ctx = gpio->ports[0].ctx; + + if (enable) + ctx->wake_en |= BIT(d->hwirq); + else + ctx->wake_en &= ~BIT(d->hwirq); + + return 0; +} +#endif + +static int sunway_gpio_set_debounce(struct gpio_chip *gc, + unsigned int offset, unsigned int debounce) +{ + struct sunway_gpio_port *port = gpiochip_get_data(gc); + struct sunway_gpio *gpio = port->gpio; + unsigned long flags, val_deb; + unsigned long mask = BIT(offset); + + spin_lock_irqsave(&gc->bgpio_lock, flags); + + val_deb = sunway_read(gpio, GPIO_PORTA_DEBOUNCE); + if (debounce) + sunway_write(gpio, GPIO_PORTA_DEBOUNCE, val_deb | mask); + else + sunway_write(gpio, GPIO_PORTA_DEBOUNCE, val_deb & ~mask); + + spin_unlock_irqrestore(&gc->bgpio_lock, flags); + + return 0; +} + +static int sunway_gpio_set_config(struct gpio_chip *gc, unsigned int offset, + unsigned long config) +{ + u32 debounce; + + if (pinconf_to_config_param(config) != PIN_CONFIG_INPUT_DEBOUNCE) + return -EOPNOTSUPP; + + debounce = pinconf_to_config_argument(config); + return sunway_gpio_set_debounce(gc, offset, debounce); +} + +static irqreturn_t sunway_irq_handler_mfd(int irq, void *dev_id) +{ + u32 worked; + struct sunway_gpio *gpio = dev_id; + + worked = 
sunway_do_irq(gpio); + + return worked ? IRQ_HANDLED : IRQ_NONE; +} + +static void sunway_configure_irqs(struct sunway_gpio *gpio, + struct sunway_gpio_port *port, + struct sunway_port_property *pp) +{ + struct gpio_chip *gc = &port->gc; + struct fwnode_handle *fwnode = pp->fwnode; + struct irq_chip_generic *irq_gc = NULL; + unsigned int hwirq, ngpio = gc->ngpio; + struct irq_chip_type *ct; + int err, i; + + gpio->domain = irq_domain_create_linear(fwnode, ngpio, + &irq_generic_chip_ops, gpio); + if (!gpio->domain) + return; + + err = irq_alloc_domain_generic_chips(gpio->domain, ngpio, 2, + "gpio-dwapb", handle_level_irq, + IRQ_NOREQUEST, 0, + IRQ_GC_INIT_NESTED_LOCK); + if (err) { + dev_info(gpio->dev, "irq_alloc_domain_generic_chips failed\n"); + irq_domain_remove(gpio->domain); + gpio->domain = NULL; + return; + } + + irq_gc = irq_get_domain_generic_chip(gpio->domain, 0); + if (!irq_gc) { + irq_domain_remove(gpio->domain); + gpio->domain = NULL; + return; + } + + irq_gc->reg_base = gpio->regs; + irq_gc->private = gpio; + + for (i = 0; i < 2; i++) { + ct = &irq_gc->chip_types[i]; + ct->chip.irq_ack = irq_gc_ack_set_bit; + ct->chip.irq_mask = irq_gc_mask_set_bit; + ct->chip.irq_unmask = irq_gc_mask_clr_bit; + ct->chip.irq_set_type = sunway_irq_set_type; + ct->chip.irq_enable = sunway_irq_enable; + ct->chip.irq_disable = sunway_irq_disable; + ct->chip.irq_request_resources = sunway_irq_reqres; + ct->chip.irq_release_resources = sunway_irq_relres; +#ifdef CONFIG_PM_SLEEP + ct->chip.irq_set_wake = sunway_irq_set_wake; +#endif + ct->regs.ack = gpio_reg_convert(gpio, GPIO_PORTA_EOI); + ct->regs.mask = gpio_reg_convert(gpio, GPIO_INTMASK); + ct->type = IRQ_TYPE_LEVEL_MASK; + } + + irq_gc->chip_types[0].type = IRQ_TYPE_LEVEL_MASK; + irq_gc->chip_types[1].type = IRQ_TYPE_EDGE_BOTH; + irq_gc->chip_types[1].handler = handle_edge_irq; + + if (!pp->irq_shared) { + int i; + + for (i = 0; i < pp->ngpio; i++) { + if (pp->irq[i] >= 0) + irq_set_chained_handler_and_data(pp->irq[i], + sunway_irq_handler, gpio); + } + } else { + /* + * Request a shared IRQ since where MFD would have devices + * using the same irq pin + */ + err = devm_request_irq(gpio->dev, pp->irq[0], + sunway_irq_handler_mfd, + IRQF_SHARED, "gpio-dwapb-mfd", gpio); + if (err) { + dev_err(gpio->dev, "error requesting IRQ\n"); + irq_domain_remove(gpio->domain); + gpio->domain = NULL; + return; + } + } + + for (hwirq = 0 ; hwirq < ngpio ; hwirq++) + irq_create_mapping(gpio->domain, hwirq); + + port->gc.to_irq = sunway_gpio_to_irq; +} + +static void sunway_irq_teardown(struct sunway_gpio *gpio) +{ + struct sunway_gpio_port *port = &gpio->ports[0]; + struct gpio_chip *gc = &port->gc; + unsigned int ngpio = gc->ngpio; + irq_hw_number_t hwirq; + + if (!gpio->domain) + return; + + for (hwirq = 0 ; hwirq < ngpio ; hwirq++) + irq_dispose_mapping(irq_find_mapping(gpio->domain, hwirq)); + + irq_domain_remove(gpio->domain); + gpio->domain = NULL; +} + +static int sunway_gpio_add_port(struct sunway_gpio *gpio, + struct sunway_port_property *pp, + unsigned int offs) +{ + struct sunway_gpio_port *port; + void __iomem *dat, *set, *dirout; + int err; + + port = &gpio->ports[offs]; + port->gpio = gpio; + port->idx = pp->idx; + +#ifdef CONFIG_PM_SLEEP + port->ctx = devm_kzalloc(gpio->dev, sizeof(*port->ctx), GFP_KERNEL); + if (!port->ctx) + return -ENOMEM; +#endif + + dat = gpio->regs + GPIO_EXT_PORTA + (pp->idx * GPIO_EXT_PORT_STRIDE); + set = gpio->regs + GPIO_SWPORTA_DR + (pp->idx * GPIO_SWPORT_DR_STRIDE); + dirout = gpio->regs + GPIO_SWPORTA_DDR + + 
(pp->idx * GPIO_SWPORT_DDR_STRIDE); + + /* This registers 32 GPIO lines per port */ + err = bgpio_init(&port->gc, gpio->dev, 4, dat, set, NULL, dirout, + NULL, 0); + if (err) { + dev_err(gpio->dev, "failed to init gpio chip for port%d\n", + port->idx); + return err; + } + +#ifdef CONFIG_OF_GPIO + port->gc.of_node = to_of_node(pp->fwnode); +#endif + port->gc.ngpio = pp->ngpio; + port->gc.base = pp->gpio_base; + + /* Only port A supports debounce */ + if (pp->idx == 0) + port->gc.set_config = sunway_gpio_set_config; + + if (pp->has_irq) + sunway_configure_irqs(gpio, port, pp); + + err = gpiochip_add_data(&port->gc, port); + if (err) + dev_err(gpio->dev, "failed to register gpiochip for port%d\n", + port->idx); + else + port->is_registered = true; + + /* Add GPIO-signaled ACPI event support */ + if (pp->has_irq) + acpi_gpiochip_request_interrupts(&port->gc); + + return err; +} + +static void sunway_gpio_unregister(struct sunway_gpio *gpio) +{ + unsigned int m; + + for (m = 0; m < gpio->nr_ports; ++m) + if (gpio->ports[m].is_registered) + gpiochip_remove(&gpio->ports[m].gc); +} + +static struct sunway_platform_data * +sunway_gpio_get_pdata(struct device *dev) +{ + struct fwnode_handle *fwnode; + struct sunway_platform_data *pdata; + struct sunway_port_property *pp; + int nports; + int i, j; + + nports = device_get_child_node_count(dev); + if (nports == 0) + return ERR_PTR(-ENODEV); + + pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL); + if (!pdata) + return ERR_PTR(-ENOMEM); + + pdata->properties = devm_kcalloc(dev, nports, sizeof(*pp), GFP_KERNEL); + if (!pdata->properties) + return ERR_PTR(-ENOMEM); + + pdata->nports = nports; + + i = 0; + device_for_each_child_node(dev, fwnode) { + struct device_node *np = NULL; + + pp = &pdata->properties[i++]; + pp->fwnode = fwnode; + + if (fwnode_property_read_u32(fwnode, "reg", &pp->idx) || + pp->idx >= DWAPB_MAX_PORTS) { + dev_err(dev, + "missing/invalid port index for port%d\n", i); + fwnode_handle_put(fwnode); + return ERR_PTR(-EINVAL); + } + + if (fwnode_property_read_u32(fwnode, "snps,nr-gpios", + &pp->ngpio)) { + dev_info(dev, + "failed to get number of gpios for port%d\n", + i); + pp->ngpio = 32; + } + + pp->irq_shared = false; + pp->gpio_base = -1; + + /* + * Only port A can provide interrupts in all configurations of + * the IP. 
+ */ + if (pp->idx != 0) + continue; + + if (dev->of_node && fwnode_property_read_bool(fwnode, + "interrupt-controller")) { + np = to_of_node(fwnode); + } + + for (j = 0; j < pp->ngpio; j++) { + pp->irq[j] = -ENXIO; + + if (np) + pp->irq[j] = of_irq_get(np, j); + else if (has_acpi_companion(dev)) + pp->irq[j] = platform_get_irq(to_platform_device(dev), j); + + if (pp->irq[j] >= 0) + pp->has_irq = true; + } + + if (!pp->has_irq) + dev_warn(dev, "no irq for port%d\n", pp->idx); + } + + return pdata; +} + +static const struct of_device_id sunway_of_match[] = { + { .compatible = "snps,sw-gpio", .data = (void *)0 }, + { .compatible = "apm,xgene-gpio-v2", .data = (void *)GPIO_REG_OFFSET_V2 }, + { /* Sentinel */ } +}; +MODULE_DEVICE_TABLE(of, sunway_of_match); + +static const struct acpi_device_id sunway_acpi_match[] = { + {"HISI0181", 0}, + {"APMC0D07", 0}, + {"APMC0D81", GPIO_REG_OFFSET_V2}, + { } +}; +MODULE_DEVICE_TABLE(acpi, sunway_acpi_match); + +static int sunway_gpio_probe(struct platform_device *pdev) +{ + unsigned int i; + struct resource *res; + struct sunway_gpio *gpio; + int err; + struct device *dev = &pdev->dev; + struct sunway_platform_data *pdata = dev_get_platdata(dev); + + if (!pdata) { + pdata = sunway_gpio_get_pdata(dev); + if (IS_ERR(pdata)) + return PTR_ERR(pdata); + } + + if (!pdata->nports) + return -ENODEV; + + gpio = devm_kzalloc(&pdev->dev, sizeof(*gpio), GFP_KERNEL); + if (!gpio) + return -ENOMEM; + + gpio->dev = &pdev->dev; + gpio->nr_ports = pdata->nports; + + gpio->rst = devm_reset_control_get_optional_shared(dev, NULL); + if (IS_ERR(gpio->rst)) + return PTR_ERR(gpio->rst); + + reset_control_deassert(gpio->rst); + + gpio->ports = devm_kcalloc(&pdev->dev, gpio->nr_ports, + sizeof(*gpio->ports), GFP_KERNEL); + if (!gpio->ports) + return -ENOMEM; + + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + gpio->regs = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(gpio->regs)) + return PTR_ERR(gpio->regs); + + /* Optional bus clock */ + gpio->clk = devm_clk_get(&pdev->dev, "bus"); + if (!IS_ERR(gpio->clk)) { + err = clk_prepare_enable(gpio->clk); + if (err) { + dev_info(&pdev->dev, "Cannot enable clock\n"); + return err; + } + } + + gpio->flags = 0; + if (dev->of_node) { + gpio->flags = (uintptr_t)of_device_get_match_data(dev); + } else if (has_acpi_companion(dev)) { + const struct acpi_device_id *acpi_id; + + acpi_id = acpi_match_device(sunway_acpi_match, dev); + if (acpi_id) { + if (acpi_id->driver_data) + gpio->flags = acpi_id->driver_data; + } + } + + for (i = 0; i < gpio->nr_ports; i++) { + err = sunway_gpio_add_port(gpio, &pdata->properties[i], i); + if (err) + goto out_unregister; + } + platform_set_drvdata(pdev, gpio); + + return 0; + +out_unregister: + sunway_gpio_unregister(gpio); + sunway_irq_teardown(gpio); + clk_disable_unprepare(gpio->clk); + + return err; +} + +static int sunway_gpio_remove(struct platform_device *pdev) +{ + struct sunway_gpio *gpio = platform_get_drvdata(pdev); + + sunway_gpio_unregister(gpio); + sunway_irq_teardown(gpio); + reset_control_assert(gpio->rst); + clk_disable_unprepare(gpio->clk); + + return 0; +} + +#ifdef CONFIG_PM_SLEEP +static int sunway_gpio_suspend(struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + struct sunway_gpio *gpio = platform_get_drvdata(pdev); + struct gpio_chip *gc = &gpio->ports[0].gc; + unsigned long flags; + int i; + + spin_lock_irqsave(&gc->bgpio_lock, flags); + for (i = 0; i < gpio->nr_ports; i++) { + unsigned int offset; + unsigned int idx = gpio->ports[i].idx; + 
struct sunway_context *ctx = gpio->ports[i].ctx; + + BUG_ON(!ctx); + + offset = GPIO_SWPORTA_DDR + idx * GPIO_SWPORT_DDR_STRIDE; + ctx->dir = sunway_read(gpio, offset); + + offset = GPIO_SWPORTA_DR + idx * GPIO_SWPORT_DR_STRIDE; + ctx->data = sunway_read(gpio, offset); + + offset = GPIO_EXT_PORTA + idx * GPIO_EXT_PORT_STRIDE; + ctx->ext = sunway_read(gpio, offset); + + /* Only port A can provide interrupts */ + if (idx == 0) { + ctx->int_mask = sunway_read(gpio, GPIO_INTMASK); + ctx->int_en = sunway_read(gpio, GPIO_INTEN); + ctx->int_pol = sunway_read(gpio, GPIO_INT_POLARITY); + ctx->int_type = sunway_read(gpio, GPIO_INTTYPE_LEVEL); + ctx->int_deb = sunway_read(gpio, GPIO_PORTA_DEBOUNCE); + + /* Mask out interrupts */ + sunway_write(gpio, GPIO_INTMASK, + 0xffffffff & ~ctx->wake_en); + } + } + spin_unlock_irqrestore(&gc->bgpio_lock, flags); + + clk_disable_unprepare(gpio->clk); + + return 0; +} + +static int sunway_gpio_resume(struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + struct sunway_gpio *gpio = platform_get_drvdata(pdev); + struct gpio_chip *gc = &gpio->ports[0].gc; + unsigned long flags; + int i; + + if (!IS_ERR(gpio->clk)) + clk_prepare_enable(gpio->clk); + + spin_lock_irqsave(&gc->bgpio_lock, flags); + for (i = 0; i < gpio->nr_ports; i++) { + unsigned int offset; + unsigned int idx = gpio->ports[i].idx; + struct sunway_context *ctx = gpio->ports[i].ctx; + + BUG_ON(!ctx); + + offset = GPIO_SWPORTA_DR + idx * GPIO_SWPORT_DR_STRIDE; + sunway_write(gpio, offset, ctx->data); + + offset = GPIO_SWPORTA_DDR + idx * GPIO_SWPORT_DDR_STRIDE; + sunway_write(gpio, offset, ctx->dir); + + offset = GPIO_EXT_PORTA + idx * GPIO_EXT_PORT_STRIDE; + sunway_write(gpio, offset, ctx->ext); + + /* Only port A can provide interrupts */ + if (idx == 0) { + sunway_write(gpio, GPIO_INTTYPE_LEVEL, ctx->int_type); + sunway_write(gpio, GPIO_INT_POLARITY, ctx->int_pol); + sunway_write(gpio, GPIO_PORTA_DEBOUNCE, ctx->int_deb); + sunway_write(gpio, GPIO_INTEN, ctx->int_en); + sunway_write(gpio, GPIO_INTMASK, ctx->int_mask); + + /* Clear out spurious interrupts */ + sunway_write(gpio, GPIO_PORTA_EOI, 0xffffffff); + } + } + spin_unlock_irqrestore(&gc->bgpio_lock, flags); + + return 0; +} +#endif + +static SIMPLE_DEV_PM_OPS(sunway_gpio_pm_ops, sunway_gpio_suspend, + sunway_gpio_resume); + +static struct platform_driver sunway_gpio_driver = { + .driver = { + .name = "gpio-sunway", + .pm = &sunway_gpio_pm_ops, + .of_match_table = of_match_ptr(sunway_of_match), + .acpi_match_table = ACPI_PTR(sunway_acpi_match), + }, + .probe = sunway_gpio_probe, + .remove = sunway_gpio_remove, +}; + +module_platform_driver(sunway_gpio_driver); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Jamie Iles"); +MODULE_DESCRIPTION("Sunway GPIO driver"); diff --git a/include/linux/platform_data/gpio-sunway.h b/include/linux/platform_data/gpio-sunway.h new file mode 100644 index 000000000000..351ca4474b10 --- /dev/null +++ b/include/linux/platform_data/gpio-sunway.h @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright(c) 2014 Intel Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. 
+ */ + +#ifndef GPIO_SUNWAY_H +#define GPIO_SUNWAY_H + +struct sunway_port_property { + struct fwnode_handle *fwnode; + unsigned int idx; + unsigned int ngpio; + unsigned int gpio_base; + int irq[32]; + bool has_irq; + bool irq_shared; +}; + +struct sunway_platform_data { + struct sunway_port_property *properties; + unsigned int nports; +}; + +#endif
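The IRQ_TYPE_EDGE_BOTH support in the driver above deserves a note: the controller latches only one interrupt polarity at a time, so sunway_irq_set_type() programs an edge trigger and sunway_do_irq()/sunway_toggle_trigger() re-arm the polarity against the line's current level after every interrupt. The following standalone sketch shows the idea in isolation; reg_read()/reg_write() and the POLARITY/DATA indices are hypothetical stand-ins, not part of the driver:

#include <linux/bits.h>
#include <linux/types.h>

/* Hypothetical MMIO accessors and register indices, for illustration only. */
extern u32 reg_read(unsigned int reg);
extern void reg_write(unsigned int reg, u32 val);
#define POLARITY	0
#define DATA		1

/*
 * Emulate IRQ_TYPE_EDGE_BOTH on hardware that latches one polarity at a
 * time: after each interrupt, arm the polarity opposite to the line's
 * current level so the next transition, in either direction, fires too.
 */
static void rearm_both_edges(unsigned int bit)
{
	u32 pol = reg_read(POLARITY);

	if (reg_read(DATA) & BIT(bit))
		pol &= ~BIT(bit);	/* line high: wait for the falling edge */
	else
		pol |= BIT(bit);	/* line low: wait for the rising edge */

	reg_write(POLARITY, pol);
}

One inherent limitation of this emulation is the window between an edge firing and the polarity being re-armed: a pulse shorter than that window can be missed.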
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
This patch adds the sw64 arch-specific SMBIOS/DMI driver code, and adds the EFI runtime driver code that allows the kernel to access the EFI runtime services provided by the firmware. However, because the BIOS does not yet provide EFI memory map support, services such as reboot, the real-time clock, and EFI boot variables have not been implemented; we will fix them in the near future. A sketch of how consumers are expected to cope with the missing services follows the diffstat below.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/firmware/efi/Kconfig | 2 +- drivers/firmware/efi/Makefile | 2 + drivers/firmware/efi/efi.c | 2 +- drivers/firmware/efi/sunway-init.c | 244 ++++++++++++++++++++++++++ drivers/firmware/efi/sunway-runtime.c | 84 +++++++++ 5 files changed, 332 insertions(+), 2 deletions(-) create mode 100644 drivers/firmware/efi/sunway-init.c create mode 100644 drivers/firmware/efi/sunway-runtime.c
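Since the runtime services are not yet wired up, in-kernel consumers see them as absent rather than broken: the EFI_RUNTIME_SERVICES flag is only set once sunway_enable_runtime_services() below succeeds, so on firmware without a usable memory map, callers that test the flag first bail out cleanly. A minimal, hypothetical sketch of such a caller (example_read_efi_rtc() is not part of this patch):

#include <linux/efi.h>
#include <linux/errno.h>
#include <linux/printk.h>

/* Hypothetical consumer, for illustration only. */
static int example_read_efi_rtc(void)
{
	efi_time_t tm;

	/* Not set on current firmware, so we bail out cleanly. */
	if (!efi_enabled(EFI_RUNTIME_SERVICES))
		return -ENODEV;

	if (efi.get_time(&tm, NULL) != EFI_SUCCESS)
		return -EIO;

	pr_info("EFI RTC: %04u-%02u-%02u\n", tm.year, tm.month, tm.day);
	return 0;
}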
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig index d9895491ff34..c196b7ef6a2a 100644 --- a/drivers/firmware/efi/Kconfig +++ b/drivers/firmware/efi/Kconfig @@ -270,7 +270,7 @@ config EFI_DEV_PATH_PARSER
config EFI_EARLYCON def_bool y - depends on EFI && SERIAL_EARLYCON && !ARM && !IA64 + depends on EFI && SERIAL_EARLYCON && !ARM && !IA64 && !SW64 select FONT_SUPPORT select ARCH_USE_MEMREMAP_PROT
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile index d6ca2da19339..3a0770d7dd36 100644 --- a/drivers/firmware/efi/Makefile +++ b/drivers/firmware/efi/Makefile @@ -36,6 +36,8 @@ fake_map-$(CONFIG_X86) += x86_fake_mem.o arm-obj-$(CONFIG_EFI) := efi-init.o arm-runtime.o obj-$(CONFIG_ARM) += $(arm-obj-y) obj-$(CONFIG_ARM64) += $(arm-obj-y) +sw64-obj-$(CONFIG_EFI) := sunway-init.o sunway-runtime.o +obj-$(CONFIG_SW64) += $(sw64-obj-y) riscv-obj-$(CONFIG_EFI) := efi-init.o riscv-runtime.o obj-$(CONFIG_RISCV) += $(riscv-obj-y) obj-$(CONFIG_EFI_CAPSULE_LOADER) += capsule-loader.o diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 847f33ffc4ae..4cd03ab9a5a6 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -678,7 +678,7 @@ int __init efi_systab_check_header(const efi_table_hdr_t *systab_hdr, return 0; }
-#ifndef CONFIG_IA64 +#if !defined(CONFIG_IA64) && !defined(CONFIG_SW64) static const efi_char16_t *__init map_fw_vendor(unsigned long fw_vendor, size_t size) { diff --git a/drivers/firmware/efi/sunway-init.c b/drivers/firmware/efi/sunway-init.c new file mode 100644 index 000000000000..9871508df58c --- /dev/null +++ b/drivers/firmware/efi/sunway-init.c @@ -0,0 +1,244 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Extensible Firmware Interface + * + * Based on Extensible Firmware Interface Specification version 2.4 + * + * Copyright (C) 2013 - 2015 Linaro Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#define pr_fmt(fmt) "efi: " fmt + +#include <linux/efi.h> +#include <linux/init.h> +#include <linux/memblock.h> +#include <linux/mm_types.h> +#include <linux/of.h> +#include <linux/of_fdt.h> +#include <linux/platform_device.h> +#include <linux/screen_info.h> + +#include <asm/efi.h> + +extern bool __virt_addr_valid(unsigned long x); + +static int __init is_memory(efi_memory_desc_t *md) +{ + if (md->attribute & (EFI_MEMORY_WB|EFI_MEMORY_WT|EFI_MEMORY_WC)) + return 1; + return 0; +} +static efi_config_table_type_t arch_tables[] __initdata = { + {SMBIOS3_TABLE_GUID, NULL, NULL} +}; + +static int __init uefi_init(u64 efi_system_table) +{ + efi_char16_t *c16; + efi_config_table_t *config_tables; + efi_system_table_t *systab; + size_t table_size; + char vendor[100] = "unknown"; + int i, retval; + + systab = early_memremap(efi_system_table, + sizeof(efi_system_table_t)); + if (systab == NULL) { + pr_warn("Unable to map EFI system table.\n"); + return -ENOMEM; + } + + set_bit(EFI_BOOT, &efi.flags); + if (IS_ENABLED(CONFIG_64BIT)) + set_bit(EFI_64BIT, &efi.flags); + + /* + * Verify the EFI Table + */ + if (systab->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) { + pr_err("System table signature incorrect\n"); + retval = -EINVAL; + goto out; + } + if ((systab->hdr.revision >> 16) < 2) + pr_warn("Warning: EFI system table version %d.%02d, expected 2.00 or greater\n", + systab->hdr.revision >> 16, + systab->hdr.revision & 0xffff); + + efi.runtime = systab->runtime; + efi.runtime_version = systab->hdr.revision; + + /* Show what we know for posterity */ + c16 = early_memremap(systab->fw_vendor, + sizeof(vendor) * sizeof(efi_char16_t)); + if (c16) { + for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i) + vendor[i] = c16[i]; + vendor[i] = '\0'; + early_memunmap(c16, sizeof(vendor) * sizeof(efi_char16_t)); + } + + pr_info("EFI v%u.%.02u by %s\n", + systab->hdr.revision >> 16, + systab->hdr.revision & 0xffff, vendor); + + table_size = sizeof(efi_config_table_64_t) * systab->nr_tables; + config_tables = early_memremap(systab->tables, table_size); + if (config_tables == NULL) { + pr_warn("Unable to map EFI config table array.\n"); + retval = -ENOMEM; + goto out; + } + + retval = efi_config_parse_tables(config_tables, systab->nr_tables, + arch_tables); + + early_memunmap(config_tables, table_size); +out: + early_memunmap(systab, sizeof(efi_system_table_t)); + return retval; +} + +/* + * Return true for regions that can be used as System RAM. 
+ */ +static __init int is_usable_memory(efi_memory_desc_t *md) +{ + switch (md->type) { + case EFI_LOADER_CODE: + case EFI_LOADER_DATA: + case EFI_ACPI_RECLAIM_MEMORY: + case EFI_BOOT_SERVICES_CODE: + case EFI_BOOT_SERVICES_DATA: + case EFI_CONVENTIONAL_MEMORY: + case EFI_PERSISTENT_MEMORY: + /* + * According to the spec, these regions are no longer reserved + * after calling ExitBootServices(). However, we can only use + * them as System RAM if they can be mapped writeback cacheable. + */ + return (md->attribute & EFI_MEMORY_WB); + default: + break; + } + return false; +} +static __initdata char memory_type_name1[][20] = { + "Reserved", + "Loader Code", + "Loader Data", + "Boot Code", + "Boot Data", + "Runtime Code", + "Runtime Data", + "Conventional Memory", + "Unusable Memory", + "ACPI Reclaim Memory", + "ACPI Memory NVS", + "Memory Mapped I/O", + "MMIO Port Space", + "PAL Code", + "Persistent Memory", +}; +static __init void reserve_regions(void) +{ + efi_memory_desc_t *md; + u64 paddr, npages, size; + + if (efi_enabled(EFI_DBG)) + pr_info("Processing EFI memory map:\n"); + + for_each_efi_memory_desc(md) { + paddr = md->phys_addr; + npages = md->num_pages; + + if (!__virt_addr_valid(paddr)) + continue; + + if (md->type >= ARRAY_SIZE(memory_type_name1)) + continue; + + if (md->attribute & ~(EFI_MEMORY_UC | EFI_MEMORY_WC | EFI_MEMORY_WT | + EFI_MEMORY_WB | EFI_MEMORY_UCE | EFI_MEMORY_RO | + EFI_MEMORY_WP | EFI_MEMORY_RP | EFI_MEMORY_XP | + EFI_MEMORY_NV | + EFI_MEMORY_RUNTIME | EFI_MEMORY_MORE_RELIABLE)) + continue; + + if (strncmp(memory_type_name1[md->type], "Reserved", 8) == 0) + continue; + + if (efi_enabled(EFI_DBG)) { + char buf[64]; + + pr_info(" 0x%012llx-0x%012llx %s\n", + paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1, + efi_md_typeattr_format(buf, sizeof(buf), md)); + } + + memrange_efi_to_native(&paddr, &npages); + size = npages << PAGE_SHIFT; + + if (is_memory(md)) { + early_init_dt_add_memory_arch(paddr, size); + + if (!is_usable_memory(md)) + memblock_mark_nomap(paddr, size); + + /* keep ACPI reclaim memory intact for kexec etc. */ + if (md->type == EFI_ACPI_RECLAIM_MEMORY) + memblock_reserve(paddr, size); + } + } +} + +void __init efi_init(void) +{ + struct efi_memory_map_data data; + u64 efi_system_table; + + if (sunway_boot_params->efi_systab == 0) { + pr_info("System table does not exist, disabling EFI.\n"); + return; + } + + /* Grab UEFI information placed in struct boot_params by stub */ + efi_system_table = sunway_boot_params->efi_systab; + if (!efi_system_table) + return; + + data.desc_version = sunway_boot_params->efi_memdesc_version; + data.desc_size = sunway_boot_params->efi_memdesc_size; + data.size = sunway_boot_params->efi_memmap_size; + data.phys_map = sunway_boot_params->efi_memmap; + + if (efi_memmap_init_early(&data) < 0) { + /* + * If we are booting via UEFI, the UEFI memory map is the only + * description of memory we have, so there is little point in + * proceeding if we cannot access it. 
+ */ + panic("Unable to map EFI memory map.\n"); + } + + WARN(efi.memmap.desc_version != 1, + "Unexpected EFI_MEMORY_DESCRIPTOR version %ld", + efi.memmap.desc_version); + + if (uefi_init(efi_system_table) < 0) { + efi_memmap_unmap(); + return; + } + + reserve_regions(); + + memblock_reserve(sunway_boot_params->efi_memmap & PAGE_MASK, + PAGE_ALIGN(sunway_boot_params->efi_memmap_size + + (sunway_boot_params->efi_memmap & ~PAGE_MASK))); + +} diff --git a/drivers/firmware/efi/sunway-runtime.c b/drivers/firmware/efi/sunway-runtime.c new file mode 100644 index 000000000000..46fcb46b06ff --- /dev/null +++ b/drivers/firmware/efi/sunway-runtime.c @@ -0,0 +1,84 @@ +/* + * Extensible Firmware Interface + * + * Based on Extensible Firmware Interface Specification version 2.4 + * + * Copyright (C) 2013, 2014 Linaro Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#include <linux/dmi.h> +#include <linux/efi.h> +#include <linux/io.h> +#include <linux/memblock.h> +#include <linux/mm_types.h> +#include <linux/preempt.h> +#include <linux/rbtree.h> +#include <linux/rwsem.h> +#include <linux/sched.h> +#include <linux/slab.h> +#include <linux/spinlock.h> + +#include <asm/cacheflush.h> +#include <asm/efi.h> +#include <asm/mmu.h> +#include <asm/pgalloc.h> +#include <asm/pgtable.h> + +/* + * Enable the UEFI Runtime Services if all prerequisites are in place, i.e., + * non-early mapping of the UEFI system table and virtual mappings for all + * EFI_MEMORY_RUNTIME regions. + */ +static int __init sunway_enable_runtime_services(void) +{ + u64 mapsize; + + if (!efi_enabled(EFI_BOOT)) { + pr_info("EFI services will not be available.\n"); + return 0; + } + + efi_memmap_unmap(); + + mapsize = efi.memmap.desc_size * efi.memmap.nr_map; + + if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) { + pr_err("Failed to remap EFI memory map\n"); + return 0; + } + + if (efi_runtime_disabled()) { + pr_info("EFI runtime services will be disabled.\n"); + return 0; + } + + if (efi_enabled(EFI_RUNTIME_SERVICES)) { + pr_info("EFI runtime services access via paravirt.\n"); + return 0; + } + + /* Set up runtime services function pointers */ + efi_native_runtime_setup(); + set_bit(EFI_RUNTIME_SERVICES, &efi.flags); + + return 0; +} +early_initcall(sunway_enable_runtime_services); + + +static int __init sunway_dmi_init(void) +{ + /* + * On SW64, DMI depends on UEFI, and dmi_scan_machine() needs to + * be called early because dmi_id_init(), which is an arch_initcall + * itself, depends on dmi_scan_machine() having been called already. + */ + dmi_setup(); + return 0; +} +core_initcall(sunway_dmi_init);
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
This patch adds an I2C adapter driver for the Synopsys DesignWare I2C controller in the MCU of Sunway SW6 (CHIP3). Only master mode is supported; a brief usage sketch follows the diffstat below.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/i2c/busses/Kconfig | 9 + drivers/i2c/busses/Makefile | 4 + drivers/i2c/busses/i2c-sunway-sw6-common.c | 362 ++++++++++ drivers/i2c/busses/i2c-sunway-sw6-core.h | 325 +++++++++ drivers/i2c/busses/i2c-sunway-sw6-master.c | 755 ++++++++++++++++++++ drivers/i2c/busses/i2c-sunway-sw6-platdrv.c | 498 +++++++++++++ 6 files changed, 1953 insertions(+) create mode 100644 drivers/i2c/busses/i2c-sunway-sw6-common.c create mode 100644 drivers/i2c/busses/i2c-sunway-sw6-core.h create mode 100644 drivers/i2c/busses/i2c-sunway-sw6-master.c create mode 100644 drivers/i2c/busses/i2c-sunway-sw6-platdrv.c
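Because the controller is master-only, client drivers on this adapter use the ordinary master-transfer paths (backed by i2c_dw_xfer() below); operating as an I2C target is not possible. A hypothetical consumer sketch, with the register and function names invented:

#include <linux/i2c.h>

/* Hypothetical client read, for illustration only. */
static int example_read_id(struct i2c_client *client)
{
	s32 val = i2c_smbus_read_byte_data(client, 0x00 /* invented register */);

	if (val < 0)
		return val;	/* e.g. -EREMOTEIO when the address is NAKed */

	dev_info(&client->dev, "chip ID: 0x%02x\n", val);
	return 0;
}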
diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 14d45b453a61..036fdcee5eb3 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -553,6 +553,15 @@ config I2C_DESIGNWARE_PLATFORM This driver can also be built as a module. If so, the module will be called i2c-designware-platform.
+config I2C_SUNWAY_SW6 + tristate "SUNWAY SW6 I2C Controller " + depends on SW64_CHIP3 + default y if PLATFORM_XUELANG + help + If you say yes to this option, support will be included for the + Synopsys DesignWare I2C adapter in MCU of CHIP3. Only master mode + is supported. + config I2C_DESIGNWARE_BAYTRAIL bool "Intel Baytrail I2C semaphore support" depends on ACPI diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile index f36a05632f85..e6d5d108e22b 100644 --- a/drivers/i2c/busses/Makefile +++ b/drivers/i2c/busses/Makefile @@ -51,6 +51,10 @@ obj-$(CONFIG_I2C_DAVINCI) += i2c-davinci.o obj-$(CONFIG_I2C_DESIGNWARE_CORE) += i2c-designware-core.o i2c-designware-core-y := i2c-designware-common.o i2c-designware-core-y += i2c-designware-master.o +obj-$(CONFIG_I2C_SUNWAY_SW6) += i2c-sunway-sw6.o +i2c-sunway-sw6-y := i2c-sunway-sw6-common.o +i2c-sunway-sw6-y += i2c-sunway-sw6-master.o +i2c-sunway-sw6-y += i2c-sunway-sw6-platdrv.o i2c-designware-core-$(CONFIG_I2C_DESIGNWARE_SLAVE) += i2c-designware-slave.o obj-$(CONFIG_I2C_DESIGNWARE_PLATFORM) += i2c-designware-platform.o i2c-designware-platform-y := i2c-designware-platdrv.o diff --git a/drivers/i2c/busses/i2c-sunway-sw6-common.c b/drivers/i2c/busses/i2c-sunway-sw6-common.c new file mode 100644 index 000000000000..80a59d199f7d --- /dev/null +++ b/drivers/i2c/busses/i2c-sunway-sw6-common.c @@ -0,0 +1,362 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Synopsys DesignWare I2C adapter driver. + * + * Based on the TI DAVINCI I2C adapter driver. + * + * Copyright (C) 2006 Texas Instruments. + * Copyright (C) 2007 MontaVista Software Inc. + * Copyright (C) 2009 Provigent Ltd. + */ +#include <linux/clk.h> +#include <linux/delay.h> +#include <linux/export.h> +#include <linux/errno.h> +#include <linux/err.h> +#include <linux/i2c.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/module.h> +#include <linux/pm_runtime.h> +#include <linux/swab.h> + +#include "i2c-sunway-sw6-core.h" + +static char *abort_sources[] = { + [ABRT_7B_ADDR_NOACK] = + "slave address not acknowledged (7bit mode)", + [ABRT_10ADDR1_NOACK] = + "first address byte not acknowledged (10bit mode)", + [ABRT_10ADDR2_NOACK] = + "second address byte not acknowledged (10bit mode)", + [ABRT_TXDATA_NOACK] = + "data not acknowledged", + [ABRT_GCALL_NOACK] = + "no acknowledgement for a general call", + [ABRT_GCALL_READ] = + "read after general call", + [ABRT_SBYTE_ACKDET] = + "start byte acknowledged", + [ABRT_SBYTE_NORSTRT] = + "trying to send start byte when restart is disabled", + [ABRT_10B_RD_NORSTRT] = + "trying to read when restart is disabled (10bit mode)", + [ABRT_MASTER_DIS] = + "trying to use disabled adapter", + [ARB_LOST] = + "lost arbitration", + [ABRT_SLAVE_FLUSH_TXFIFO] = + "read command so flush old data in the TX FIFO", + [ABRT_SLAVE_ARBLOST] = + "slave lost the bus while transmitting data to a remote master", + [ABRT_SLAVE_RD_INTX] = + "incorrect slave-transmitter mode configuration", +}; + +u32 dw_readl(struct dw_i2c_dev *dev, int offset) +{ + u32 value; + + offset = offset << 7; + if (dev->flags & ACCESS_16BIT) + value = readw_relaxed(dev->base + offset) | + (readw_relaxed(dev->base + offset + 2) << 16); + else + value = readl_relaxed(dev->base + offset); + + if (dev->flags & ACCESS_SWAP) + return swab32(value); + else + return value; +} + +void dw_writel(struct dw_i2c_dev *dev, u32 b, int offset) +{ + offset = offset << 7; + if (dev->flags & ACCESS_SWAP) + b = swab32(b); + + if (dev->flags & ACCESS_16BIT) { + writew_relaxed((u16)b, 
dev->base + offset); + writew_relaxed((u16)(b >> 16), dev->base + offset + 2); + } else { + writel_relaxed(b, dev->base + offset); + } +} + +/** + * i2c_dw_set_reg_access() - Set register access flags + * @dev: device private data + * + * Autodetects needed register access mode and sets access flags accordingly. + * This must be called before doing any other register access. + */ +int i2c_dw_set_reg_access(struct dw_i2c_dev *dev) +{ + u32 reg; + int ret; + + ret = i2c_dw_acquire_lock(dev); + if (ret) + return ret; + + reg = dw_readl(dev, DW_IC_COMP_TYPE); + i2c_dw_release_lock(dev); + + if (reg == swab32(DW_IC_COMP_TYPE_VALUE)) { + /* Configure register endianness access */ + dev->flags |= ACCESS_SWAP; + } else if (reg == (DW_IC_COMP_TYPE_VALUE & 0x0000ffff)) { + /* Configure register access mode 16bit */ + dev->flags |= ACCESS_16BIT; + } else if (reg != DW_IC_COMP_TYPE_VALUE) { + dev_err(dev->dev, + "Unknown Synopsys component type: 0x%08x\n", reg); + return -ENODEV; + } + + return 0; +} + +u32 i2c_dw_scl_hcnt(u32 ic_clk, u32 tSYMBOL, u32 tf, int cond, int offset) +{ + /* + * The DesignWare I2C core doesn't seem to have a solid strategy to meet + * the tHD;STA timing spec. Configuring _HCNT based on the tHIGH spec + * will result in violation of the tHD;STA spec. + */ + if (cond) + /* + * Conditional expression: + * + * IC_[FS]S_SCL_HCNT + (1+4+3) >= IC_CLK * tHIGH + * + * This is based on the DW manuals, and represents an ideal + * configuration. The resulting I2C bus speed will be + * faster than any of the others. + * + * If your hardware is free from the tHD;STA issue, try this one. + */ + return (ic_clk * tSYMBOL + 500000) / 1000000 - 8 + offset; + else + /* + * Conditional expression: + * + * IC_[FS]S_SCL_HCNT + 3 >= IC_CLK * (tHD;STA + tf) + * + * This is just an experimental rule; the tHD;STA period turned + * out to be proportional to (_HCNT + 3). With this setting, + * we could meet both tHIGH and tHD;STA timing specs. + * + * If unsure, you'd better take this alternative. + * + * The reason why we need to take into account "tf" here, + * is the same as described in i2c_dw_scl_lcnt(). + */ + return (ic_clk * (tSYMBOL + tf) + 500000) / 1000000 + - 3 + offset; +} + +u32 i2c_dw_scl_lcnt(u32 ic_clk, u32 tLOW, u32 tf, int offset) +{ + /* + * Conditional expression: + * + * IC_[FS]S_SCL_LCNT + 1 >= IC_CLK * (tLOW + tf) + * + * The DW I2C core starts counting the SCL CNTs for the LOW period + * of the SCL clock (tLOW) as soon as it pulls the SCL line. + * In order to meet the tLOW timing spec, we need to take into + * account the fall time of the SCL signal (tf). The default tf value + * should be 0.3 us, for safety. + */ + return ((ic_clk * (tLOW + tf) + 500000) / 1000000) - 1 + offset; +} + +int i2c_dw_set_sda_hold(struct dw_i2c_dev *dev) +{ + u32 reg; + int ret; + + ret = i2c_dw_acquire_lock(dev); + if (ret) + return ret; + + /* Configure SDA Hold Time if required */ + reg = dw_readl(dev, DW_IC_COMP_VERSION); + if (reg >= DW_IC_SDA_HOLD_MIN_VERS) { + if (!dev->sda_hold_time) { + /* Keep the previous hold time setting if no one set it */ + dev->sda_hold_time = dw_readl(dev, DW_IC_SDA_HOLD); + } + + /* + * Workaround for avoiding TX arbitration loss in case an I2C + * slave pulls SDA down "too quickly" after the falling edge of + * SCL, by enabling a non-zero SDA RX hold. The specification says it + * extends the incoming SDA low to high transition while SCL is + * high, but it appears to help with the above issue as well. 
+ */ + if (!(dev->sda_hold_time & DW_IC_SDA_HOLD_RX_MASK)) + dev->sda_hold_time |= 1 << DW_IC_SDA_HOLD_RX_SHIFT; + + dev_dbg(dev->dev, "SDA Hold Time TX:RX = %d:%d\n", + dev->sda_hold_time & ~(u32)DW_IC_SDA_HOLD_RX_MASK, + dev->sda_hold_time >> DW_IC_SDA_HOLD_RX_SHIFT); + } else if (dev->sda_hold_time) { + dev_warn(dev->dev, + "Hardware too old to adjust SDA hold time.\n"); + dev->sda_hold_time = 0; + } + + i2c_dw_release_lock(dev); + + return 0; +} + +void __i2c_dw_disable(struct dw_i2c_dev *dev) +{ + int timeout = 100; + + do { + __i2c_dw_disable_nowait(dev); + /* + * The enable status register may be unimplemented, but + * in that case this test reads zero and exits the loop. + */ + if ((dw_readl(dev, DW_IC_ENABLE_STATUS) & 1) == 0) + return; + + /* + * Wait 10 times the signaling period of the highest I2C + * transfer supported by the driver (for 400KHz this is + * 25us) as described in the DesignWare I2C databook. + */ + usleep_range(25, 250); + } while (timeout--); + + dev_warn(dev->dev, "timeout in disabling adapter\n"); +} + +unsigned long i2c_dw_clk_rate(struct dw_i2c_dev *dev) +{ + /* + * Clock is not necessary if we got LCNT/HCNT values directly from + * the platform code. + */ + if (WARN_ON_ONCE(!dev->get_clk_rate_khz)) + return 0; + return dev->get_clk_rate_khz(dev); +} + +int i2c_dw_prepare_clk(struct dw_i2c_dev *dev, bool prepare) +{ + if (IS_ERR(dev->clk)) + return PTR_ERR(dev->clk); + + if (prepare) + return clk_prepare_enable(dev->clk); + + clk_disable_unprepare(dev->clk); + return 0; +} +EXPORT_SYMBOL_GPL(i2c_dw_prepare_clk); + +int i2c_dw_acquire_lock(struct dw_i2c_dev *dev) +{ + int ret; + + if (!dev->acquire_lock) + return 0; + + ret = dev->acquire_lock(dev); + if (!ret) + return 0; + + dev_err(dev->dev, "couldn't acquire bus ownership\n"); + + return ret; +} + +void i2c_dw_release_lock(struct dw_i2c_dev *dev) +{ + if (dev->release_lock) + dev->release_lock(dev); +} + +/* + * Waiting for bus not busy + */ +int i2c_dw_wait_bus_not_busy(struct dw_i2c_dev *dev) +{ + int timeout = TIMEOUT; + + while (dw_readl(dev, DW_IC_STATUS) & DW_IC_STATUS_ACTIVITY) { + if (timeout <= 0) { + dev_warn(dev->dev, "timeout waiting for bus ready\n"); + i2c_recover_bus(&dev->adapter); + + if (dw_readl(dev, DW_IC_STATUS) & DW_IC_STATUS_ACTIVITY) + return -ETIMEDOUT; + return 0; + } + timeout--; + usleep_range(1000, 1100); + } + + return 0; +} + +int i2c_dw_handle_tx_abort(struct dw_i2c_dev *dev) +{ + unsigned long abort_source = dev->abort_source; + int i; + + if (abort_source & DW_IC_TX_ABRT_NOACK) { + for_each_set_bit(i, &abort_source, ARRAY_SIZE(abort_sources)) + dev_dbg(dev->dev, + "%s: %s\n", __func__, abort_sources[i]); + return -EREMOTEIO; + } + + for_each_set_bit(i, &abort_source, ARRAY_SIZE(abort_sources)) + dev_err(dev->dev, "%s: %s\n", __func__, abort_sources[i]); + + if (abort_source & DW_IC_TX_ARB_LOST) + return -EAGAIN; + else if (abort_source & DW_IC_TX_ABRT_GCALL_READ) + return -EINVAL; /* wrong msgs[] data */ + else + return -EIO; +} + +u32 i2c_dw_func(struct i2c_adapter *adap) +{ + struct dw_i2c_dev *dev = i2c_get_adapdata(adap); + + return dev->functionality; +} + +void i2c_dw_disable(struct dw_i2c_dev *dev) +{ + /* Disable controller */ + __i2c_dw_disable(dev); + + /* Disable all interrupts */ + dw_writel(dev, 0, DW_IC_INTR_MASK); + dw_readl(dev, DW_IC_CLR_INTR); +} + +void i2c_dw_disable_int(struct dw_i2c_dev *dev) +{ + dw_writel(dev, 0, DW_IC_INTR_MASK); +} + +u32 i2c_dw_read_comp_param(struct dw_i2c_dev *dev) +{ + return dw_readl(dev, DW_IC_COMP_PARAM_1); +} 
+EXPORT_SYMBOL_GPL(i2c_dw_read_comp_param); + +MODULE_DESCRIPTION("Synopsys DesignWare I2C bus adapter core"); +MODULE_LICENSE("GPL"); diff --git a/drivers/i2c/busses/i2c-sunway-sw6-core.h b/drivers/i2c/busses/i2c-sunway-sw6-core.h new file mode 100644 index 000000000000..8fd0a0b4a5c9 --- /dev/null +++ b/drivers/i2c/busses/i2c-sunway-sw6-core.h @@ -0,0 +1,325 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Synopsys DesignWare I2C adapter driver. + * + * Based on the TI DAVINCI I2C adapter driver. + * + * Copyright (C) 2006 Texas Instruments. + * Copyright (C) 2007 MontaVista Software Inc. + * Copyright (C) 2009 Provigent Ltd. + */ + +#include <linux/i2c.h> +#include <linux/pm_qos.h> + +#define DW_IC_DEFAULT_FUNCTIONALITY (I2C_FUNC_I2C | \ + I2C_FUNC_SMBUS_BYTE | \ + I2C_FUNC_SMBUS_BYTE_DATA | \ + I2C_FUNC_SMBUS_WORD_DATA | \ + I2C_FUNC_SMBUS_BLOCK_DATA | \ + I2C_FUNC_SMBUS_I2C_BLOCK) + +#define DW_IC_CON_MASTER 0x1 +#define DW_IC_CON_SPEED_STD 0x2 +#define DW_IC_CON_SPEED_FAST 0x4 +#define DW_IC_CON_SPEED_HIGH 0x6 +#define DW_IC_CON_SPEED_MASK 0x6 +#define DW_IC_CON_10BITADDR_SLAVE 0x8 +#define DW_IC_CON_10BITADDR_MASTER 0x10 +#define DW_IC_CON_RESTART_EN 0x20 +#define DW_IC_CON_SLAVE_DISABLE 0x40 +#define DW_IC_CON_STOP_DET_IFADDRESSED 0x80 +#define DW_IC_CON_TX_EMPTY_CTRL 0x100 +#define DW_IC_CON_RX_FIFO_FULL_HLD_CTRL 0x200 + +/* + * Registers offset + */ +#define DW_IC_CON 0x0 +#define DW_IC_TAR 0x4 +#define DW_IC_SAR 0x8 +#define DW_IC_DATA_CMD 0x10 +#define DW_IC_SS_SCL_HCNT 0x14 +#define DW_IC_SS_SCL_LCNT 0x18 +#define DW_IC_FS_SCL_HCNT 0x1c +#define DW_IC_FS_SCL_LCNT 0x20 +#define DW_IC_HS_SCL_HCNT 0x24 +#define DW_IC_HS_SCL_LCNT 0x28 +#define DW_IC_INTR_STAT 0x2c +#define DW_IC_INTR_MASK 0x30 +#define DW_IC_RAW_INTR_STAT 0x34 +#define DW_IC_RX_TL 0x38 +#define DW_IC_TX_TL 0x3c +#define DW_IC_CLR_INTR 0x40 +#define DW_IC_CLR_RX_UNDER 0x44 +#define DW_IC_CLR_RX_OVER 0x48 +#define DW_IC_CLR_TX_OVER 0x4c +#define DW_IC_CLR_RD_REQ 0x50 +#define DW_IC_CLR_TX_ABRT 0x54 +#define DW_IC_CLR_RX_DONE 0x58 +#define DW_IC_CLR_ACTIVITY 0x5c +#define DW_IC_CLR_STOP_DET 0x60 +#define DW_IC_CLR_START_DET 0x64 +#define DW_IC_CLR_GEN_CALL 0x68 +#define DW_IC_ENABLE 0x6c +#define DW_IC_STATUS 0x70 +#define DW_IC_TXFLR 0x74 +#define DW_IC_RXFLR 0x78 +#define DW_IC_SDA_HOLD 0x7c +#define DW_IC_TX_ABRT_SOURCE 0x80 +#define DW_IC_ENABLE_STATUS 0x9c +#define DW_IC_CLR_RESTART_DET 0xa8 +#define DW_IC_COMP_PARAM_1 0xf4 +#define DW_IC_COMP_VERSION 0xf8 +#define DW_IC_SDA_HOLD_MIN_VERS 0x3131312A +#define DW_IC_COMP_TYPE 0xfc +#define DW_IC_COMP_TYPE_VALUE 0x44570140 + +#define DW_IC_INTR_RX_UNDER 0x001 +#define DW_IC_INTR_RX_OVER 0x002 +#define DW_IC_INTR_RX_FULL 0x004 +#define DW_IC_INTR_TX_OVER 0x008 +#define DW_IC_INTR_TX_EMPTY 0x010 +#define DW_IC_INTR_RD_REQ 0x020 +#define DW_IC_INTR_TX_ABRT 0x040 +#define DW_IC_INTR_RX_DONE 0x080 +#define DW_IC_INTR_ACTIVITY 0x100 +#define DW_IC_INTR_STOP_DET 0x200 +#define DW_IC_INTR_START_DET 0x400 +#define DW_IC_INTR_GEN_CALL 0x800 +#define DW_IC_INTR_RESTART_DET 0x1000 + +#define DW_IC_INTR_DEFAULT_MASK (DW_IC_INTR_RX_FULL | \ + DW_IC_INTR_TX_ABRT | \ + DW_IC_INTR_STOP_DET) +#define DW_IC_INTR_MASTER_MASK (DW_IC_INTR_DEFAULT_MASK | \ + DW_IC_INTR_TX_EMPTY) +#define DW_IC_INTR_SLAVE_MASK (DW_IC_INTR_DEFAULT_MASK | \ + DW_IC_INTR_RX_DONE | \ + DW_IC_INTR_RX_UNDER | \ + DW_IC_INTR_RD_REQ) + +#define DW_IC_STATUS_ACTIVITY 0x1 +#define DW_IC_STATUS_TFE BIT(2) +#define DW_IC_STATUS_MASTER_ACTIVITY BIT(5) +#define DW_IC_STATUS_SLAVE_ACTIVITY BIT(6) + +#define 
DW_IC_SDA_HOLD_RX_SHIFT 16 +#define DW_IC_SDA_HOLD_RX_MASK GENMASK(23, DW_IC_SDA_HOLD_RX_SHIFT) + +#define DW_IC_ERR_TX_ABRT 0x1 + +#define DW_IC_TAR_10BITADDR_MASTER BIT(12) + +#define DW_IC_COMP_PARAM_1_SPEED_MODE_HIGH (BIT(2) | BIT(3)) +#define DW_IC_COMP_PARAM_1_SPEED_MODE_MASK GENMASK(3, 2) + +/* + * status codes + */ +#define STATUS_IDLE 0x0 +#define STATUS_WRITE_IN_PROGRESS 0x1 +#define STATUS_READ_IN_PROGRESS 0x2 + +#define TIMEOUT 20 /* ms */ + +/* + * operation modes + */ +#define DW_IC_MASTER 0 +#define DW_IC_SLAVE 1 + +/* + * Hardware abort codes from the DW_IC_TX_ABRT_SOURCE register + * + * Only expected abort codes are listed here + * refer to the datasheet for the full list + */ +#define ABRT_7B_ADDR_NOACK 0 +#define ABRT_10ADDR1_NOACK 1 +#define ABRT_10ADDR2_NOACK 2 +#define ABRT_TXDATA_NOACK 3 +#define ABRT_GCALL_NOACK 4 +#define ABRT_GCALL_READ 5 +#define ABRT_SBYTE_ACKDET 7 +#define ABRT_SBYTE_NORSTRT 9 +#define ABRT_10B_RD_NORSTRT 10 +#define ABRT_MASTER_DIS 11 +#define ARB_LOST 12 +#define ABRT_SLAVE_FLUSH_TXFIFO 13 +#define ABRT_SLAVE_ARBLOST 14 +#define ABRT_SLAVE_RD_INTX 15 + +#define DW_IC_TX_ABRT_7B_ADDR_NOACK (1UL << ABRT_7B_ADDR_NOACK) +#define DW_IC_TX_ABRT_10ADDR1_NOACK (1UL << ABRT_10ADDR1_NOACK) +#define DW_IC_TX_ABRT_10ADDR2_NOACK (1UL << ABRT_10ADDR2_NOACK) +#define DW_IC_TX_ABRT_TXDATA_NOACK (1UL << ABRT_TXDATA_NOACK) +#define DW_IC_TX_ABRT_GCALL_NOACK (1UL << ABRT_GCALL_NOACK) +#define DW_IC_TX_ABRT_GCALL_READ (1UL << ABRT_GCALL_READ) +#define DW_IC_TX_ABRT_SBYTE_ACKDET (1UL << ABRT_SBYTE_ACKDET) +#define DW_IC_TX_ABRT_SBYTE_NORSTRT (1UL << ABRT_SBYTE_NORSTRT) +#define DW_IC_TX_ABRT_10B_RD_NORSTRT (1UL << ABRT_10B_RD_NORSTRT) +#define DW_IC_TX_ABRT_MASTER_DIS (1UL << ABRT_MASTER_DIS) +#define DW_IC_TX_ARB_LOST (1UL << ARB_LOST) +#define DW_IC_RX_ABRT_SLAVE_RD_INTX (1UL << ABRT_SLAVE_RD_INTX) +#define DW_IC_RX_ABRT_SLAVE_ARBLOST (1UL << ABRT_SLAVE_ARBLOST) +#define DW_IC_RX_ABRT_SLAVE_FLUSH_TXFIFO (1UL << ABRT_SLAVE_FLUSH_TXFIFO) + +#define DW_IC_TX_ABRT_NOACK (DW_IC_TX_ABRT_7B_ADDR_NOACK | \ + DW_IC_TX_ABRT_10ADDR1_NOACK | \ + DW_IC_TX_ABRT_10ADDR2_NOACK | \ + DW_IC_TX_ABRT_TXDATA_NOACK | \ + DW_IC_TX_ABRT_GCALL_NOACK) + + +/** + * struct dw_i2c_dev - private i2c-designware data + * @dev: driver model device node + * @base: IO registers pointer + * @cmd_complete: tx completion indicator + * @clk: input reference clock + * @slave: represent an I2C slave device + * @cmd_err: run time hadware error code + * @msgs: points to an array of messages currently being transferred + * @msgs_num: the number of elements in msgs + * @msg_write_idx: the element index of the current tx message in the msgs + * array + * @tx_buf_len: the length of the current tx buffer + * @tx_buf: the current tx buffer + * @msg_read_idx: the element index of the current rx message in the msgs + * array + * @rx_buf_len: the length of the current rx buffer + * @rx_buf: the current rx buffer + * @msg_err: error status of the current transfer + * @status: i2c master status, one of STATUS_* + * @abort_source: copy of the TX_ABRT_SOURCE register + * @irq: interrupt number for the i2c master + * @adapter: i2c subsystem adapter node + * @slave_cfg: configuration for the slave device + * @tx_fifo_depth: depth of the hardware tx fifo + * @rx_fifo_depth: depth of the hardware rx fifo + * @rx_outstanding: current master-rx elements in tx fifo + * @timings: bus clock frequency, SDA hold and other timings + * @sda_hold_time: SDA hold value + * @ss_hcnt: standard speed HCNT value + * @ss_lcnt: standard 
speed LCNT value + * @fs_hcnt: fast speed HCNT value + * @fs_lcnt: fast speed LCNT value + * @fp_hcnt: fast plus HCNT value + * @fp_lcnt: fast plus LCNT value + * @hs_hcnt: high speed HCNT value + * @hs_lcnt: high speed LCNT value + * @pm_qos: pm_qos_request used while holding a hardware lock on the bus + * @acquire_lock: function to acquire a hardware lock on the bus + * @release_lock: function to release a hardware lock on the bus + * @pm_disabled: true if power-management should be disabled for this i2c-bus + * @disable: function to disable the controller + * @disable_int: function to disable all interrupts + * @init: function to initialize the I2C hardware + * @mode: operation mode - DW_IC_MASTER or DW_IC_SLAVE + * + * HCNT and LCNT parameters can be used if the platform knows more accurate + * values than the one computed based only on the input clock frequency. + * Leave them to be %0 if not used. + */ +struct dw_i2c_dev { + struct device *dev; + void __iomem *base; + struct completion cmd_complete; + struct clk *clk; + struct reset_control *rst; + struct i2c_client *slave; + u32 (*get_clk_rate_khz)(struct dw_i2c_dev *dev); + struct dw_pci_controller *controller; + int cmd_err; + struct i2c_msg *msgs; + int msgs_num; + int msg_write_idx; + u32 tx_buf_len; + u8 *tx_buf; + int msg_read_idx; + u32 rx_buf_len; + u8 *rx_buf; + int msg_err; + unsigned int status; + u32 abort_source; + int irq; + u32 flags; + struct i2c_adapter adapter; + u32 functionality; + u32 master_cfg; + u32 slave_cfg; + unsigned int tx_fifo_depth; + unsigned int rx_fifo_depth; + int rx_outstanding; + struct i2c_timings timings; + u32 sda_hold_time; + u32 intr_mask_value; + u16 ss_hcnt; + u16 ss_lcnt; + u16 fs_hcnt; + u16 fs_lcnt; + u16 fp_hcnt; + u16 fp_lcnt; + u16 hs_hcnt; + u16 hs_lcnt; + struct pm_qos_request pm_qos; + int (*acquire_lock)(struct dw_i2c_dev *dev); + void (*release_lock)(struct dw_i2c_dev *dev); + bool pm_disabled; + void (*disable)(struct dw_i2c_dev *dev); + void (*disable_int)(struct dw_i2c_dev *dev); + int (*init)(struct dw_i2c_dev *dev); + int mode; + struct i2c_bus_recovery_info rinfo; +}; + +#define ACCESS_SWAP 0x00000001 +#define ACCESS_16BIT 0x00000002 +#define ACCESS_INTR_MASK 0x00000004 + +#define MODEL_CHERRYTRAIL 0x00000100 + +u32 dw_readl(struct dw_i2c_dev *dev, int offset); +void dw_writel(struct dw_i2c_dev *dev, u32 b, int offset); +int i2c_dw_set_reg_access(struct dw_i2c_dev *dev); +u32 i2c_dw_scl_hcnt(u32 ic_clk, u32 tSYMBOL, u32 tf, int cond, int offset); +u32 i2c_dw_scl_lcnt(u32 ic_clk, u32 tLOW, u32 tf, int offset); +int i2c_dw_set_sda_hold(struct dw_i2c_dev *dev); +unsigned long i2c_dw_clk_rate(struct dw_i2c_dev *dev); +int i2c_dw_prepare_clk(struct dw_i2c_dev *dev, bool prepare); +int i2c_dw_acquire_lock(struct dw_i2c_dev *dev); +void i2c_dw_release_lock(struct dw_i2c_dev *dev); +int i2c_dw_wait_bus_not_busy(struct dw_i2c_dev *dev); +int i2c_dw_handle_tx_abort(struct dw_i2c_dev *dev); +u32 i2c_dw_func(struct i2c_adapter *adap); +void i2c_dw_disable(struct dw_i2c_dev *dev); +void i2c_dw_disable_int(struct dw_i2c_dev *dev); + +static inline void __i2c_dw_enable(struct dw_i2c_dev *dev) +{ + dw_writel(dev, 1, DW_IC_ENABLE); +} + +static inline void __i2c_dw_disable_nowait(struct dw_i2c_dev *dev) +{ + dw_writel(dev, 0, DW_IC_ENABLE); +} + +void __i2c_dw_disable(struct dw_i2c_dev *dev); + +extern u32 i2c_dw_read_comp_param(struct dw_i2c_dev *dev); +extern int i2c_dw_probe(struct dw_i2c_dev *dev); +#if IS_ENABLED(CONFIG_I2C_DESIGNWARE_SLAVE) +extern int i2c_dw_probe_slave(struct 
dw_i2c_dev *dev); +#else +static inline int i2c_dw_probe_slave(struct dw_i2c_dev *dev) { return -EINVAL; } +#endif + +#if IS_ENABLED(CONFIG_I2C_DESIGNWARE_BAYTRAIL) +extern int i2c_dw_probe_lock_support(struct dw_i2c_dev *dev); +extern void i2c_dw_remove_lock_support(struct dw_i2c_dev *dev); +#else +static inline int i2c_dw_probe_lock_support(struct dw_i2c_dev *dev) { return 0; } +static inline void i2c_dw_remove_lock_support(struct dw_i2c_dev *dev) {} +#endif diff --git a/drivers/i2c/busses/i2c-sunway-sw6-master.c b/drivers/i2c/busses/i2c-sunway-sw6-master.c new file mode 100644 index 000000000000..eccfd338ab36 --- /dev/null +++ b/drivers/i2c/busses/i2c-sunway-sw6-master.c @@ -0,0 +1,755 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Synopsys DesignWare I2C adapter driver (master only). + * + * Based on the TI DAVINCI I2C adapter driver. + * + * Copyright (C) 2006 Texas Instruments. + * Copyright (C) 2007 MontaVista Software Inc. + * Copyright (C) 2009 Provigent Ltd. + */ +#include <linux/delay.h> +#include <linux/err.h> +#include <linux/errno.h> +#include <linux/export.h> +#include <linux/gpio/consumer.h> +#include <linux/i2c.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/module.h> +#include <linux/pm_runtime.h> +#include <linux/reset.h> + +#include "i2c-sunway-sw6-core.h" + +static void i2c_dw_configure_fifo_master(struct dw_i2c_dev *dev) +{ + /* Configure Tx/Rx FIFO threshold levels */ + dw_writel(dev, dev->tx_fifo_depth / 2, DW_IC_TX_TL); + dw_writel(dev, 0, DW_IC_RX_TL); + + /* Configure the I2C master */ + dw_writel(dev, dev->master_cfg, DW_IC_CON); +} + +static int i2c_dw_set_timings_master(struct dw_i2c_dev *dev) +{ + const char *mode_str, *fp_str = ""; + u32 comp_param1; + u32 sda_falling_time, scl_falling_time; + struct i2c_timings *t = &dev->timings; + u32 ic_clk; + int ret; + + ret = i2c_dw_acquire_lock(dev); + if (ret) + return ret; + comp_param1 = dw_readl(dev, DW_IC_COMP_PARAM_1); + i2c_dw_release_lock(dev); + + /* Set standard and fast speed dividers for high/low periods */ + sda_falling_time = t->sda_fall_ns ?: 300; /* ns */ + scl_falling_time = t->scl_fall_ns ?: 300; /* ns */ + + /* Calculate SCL timing parameters for standard mode if not set */ + if (!dev->ss_hcnt || !dev->ss_lcnt) { + ic_clk = i2c_dw_clk_rate(dev); + dev->ss_hcnt = + i2c_dw_scl_hcnt(ic_clk, + 4000, /* tHD;STA = tHIGH = 4.0 us */ + sda_falling_time, + 0, /* 0: DW default, 1: Ideal */ + 0); /* No offset */ + dev->ss_lcnt = + i2c_dw_scl_lcnt(ic_clk, + 4700, /* tLOW = 4.7 us */ + scl_falling_time, + 0); /* No offset */ + } + dev_dbg(dev->dev, "Standard Mode HCNT:LCNT = %d:%d\n", + dev->ss_hcnt, dev->ss_lcnt); + + /* + * Set SCL timing parameters for fast mode or fast mode plus. Only + * difference is the timing parameter values since the registers are + * the same. + */ + if (t->bus_freq_hz == 1000000) { + /* + * Check are fast mode plus parameters available and use + * fast mode if not. + */ + if (dev->fp_hcnt && dev->fp_lcnt) { + dev->fs_hcnt = dev->fp_hcnt; + dev->fs_lcnt = dev->fp_lcnt; + fp_str = " Plus"; + } + } + /* + * Calculate SCL timing parameters for fast mode if not set. They are + * needed also in high speed mode. 
+ */ + if (!dev->fs_hcnt || !dev->fs_lcnt) { + ic_clk = i2c_dw_clk_rate(dev); + dev->fs_hcnt = + i2c_dw_scl_hcnt(ic_clk, + 600, /* tHD;STA = tHIGH = 0.6 us */ + sda_falling_time, + 0, /* 0: DW default, 1: Ideal */ + 0); /* No offset */ + dev->fs_lcnt = + i2c_dw_scl_lcnt(ic_clk, + 1300, /* tLOW = 1.3 us */ + scl_falling_time, + 0); /* No offset */ + } + dev_dbg(dev->dev, "Fast Mode%s HCNT:LCNT = %d:%d\n", + fp_str, dev->fs_hcnt, dev->fs_lcnt); + + /* Check is high speed possible and fall back to fast mode if not */ + if ((dev->master_cfg & DW_IC_CON_SPEED_MASK) == + DW_IC_CON_SPEED_HIGH) { + if ((comp_param1 & DW_IC_COMP_PARAM_1_SPEED_MODE_MASK) + != DW_IC_COMP_PARAM_1_SPEED_MODE_HIGH) { + dev_err(dev->dev, "High Speed not supported!\n"); + dev->master_cfg &= ~DW_IC_CON_SPEED_MASK; + dev->master_cfg |= DW_IC_CON_SPEED_FAST; + dev->hs_hcnt = 0; + dev->hs_lcnt = 0; + } else if (dev->hs_hcnt && dev->hs_lcnt) { + dev_dbg(dev->dev, "High Speed Mode HCNT:LCNT = %d:%d\n", + dev->hs_hcnt, dev->hs_lcnt); + } + } + + ret = i2c_dw_set_sda_hold(dev); + if (ret) + goto out; + + switch (dev->master_cfg & DW_IC_CON_SPEED_MASK) { + case DW_IC_CON_SPEED_STD: + mode_str = "Standard Mode"; + break; + case DW_IC_CON_SPEED_HIGH: + mode_str = "High Speed Mode"; + break; + default: + mode_str = "Fast Mode"; + } + dev_dbg(dev->dev, "Bus speed: %s%s\n", mode_str, fp_str); + +out: + return ret; +} + +/** + * i2c_dw_init() - Initialize the designware I2C master hardware + * @dev: device private data + * + * This functions configures and enables the I2C master. + * This function is called during I2C init function, and in case of timeout at + * run time. + */ +static int i2c_dw_init_master(struct dw_i2c_dev *dev) +{ + int ret; + + ret = i2c_dw_acquire_lock(dev); + if (ret) + return ret; + + /* Disable the adapter */ + __i2c_dw_disable(dev); + + /* Write standard speed timing parameters */ + dw_writel(dev, dev->ss_hcnt, DW_IC_SS_SCL_HCNT); + dw_writel(dev, dev->ss_lcnt, DW_IC_SS_SCL_LCNT); + + /* Write fast mode/fast mode plus timing parameters */ + dw_writel(dev, dev->fs_hcnt, DW_IC_FS_SCL_HCNT); + dw_writel(dev, dev->fs_lcnt, DW_IC_FS_SCL_LCNT); + + /* Write high speed timing parameters if supported */ + if (dev->hs_hcnt && dev->hs_lcnt) { + dw_writel(dev, dev->hs_hcnt, DW_IC_HS_SCL_HCNT); + dw_writel(dev, dev->hs_lcnt, DW_IC_HS_SCL_LCNT); + } + + /* Write SDA hold time if supported */ + if (dev->sda_hold_time) + dw_writel(dev, dev->sda_hold_time, DW_IC_SDA_HOLD); + + i2c_dw_configure_fifo_master(dev); + i2c_dw_release_lock(dev); + + return 0; +} + +static void i2c_dw_xfer_init(struct dw_i2c_dev *dev) +{ + struct i2c_msg *msgs = dev->msgs; + u32 ic_con, ic_tar = 0; + + /* Disable the adapter */ + __i2c_dw_disable(dev); + + /* If the slave address is ten bit address, enable 10BITADDR */ + ic_con = dw_readl(dev, DW_IC_CON); + if (msgs[dev->msg_write_idx].flags & I2C_M_TEN) { + ic_con |= DW_IC_CON_10BITADDR_MASTER; + /* + * If I2C_DYNAMIC_TAR_UPDATE is set, the 10-bit addressing + * mode has to be enabled via bit 12 of IC_TAR register. + * We set it always as I2C_DYNAMIC_TAR_UPDATE can't be + * detected from registers. + */ + ic_tar = DW_IC_TAR_10BITADDR_MASTER; + } else { + ic_con &= ~DW_IC_CON_10BITADDR_MASTER; + } + + dw_writel(dev, ic_con, DW_IC_CON); + + /* + * Set the slave (target) address and enable 10-bit addressing mode + * if applicable. 
+ */ + dw_writel(dev, msgs[dev->msg_write_idx].addr | ic_tar, DW_IC_TAR); + + /* Enforce disabled interrupts (due to HW issues) */ + i2c_dw_disable_int(dev); + + /* Enable the adapter */ + __i2c_dw_enable(dev); + + /* Dummy read to avoid the register getting stuck on Bay Trail */ + dw_readl(dev, DW_IC_ENABLE_STATUS); + + /* Clear and enable interrupts */ + dw_readl(dev, DW_IC_CLR_INTR); + dev->intr_mask_value = DW_IC_INTR_DEFAULT_MASK; + dw_writel(dev, DW_IC_INTR_MASTER_MASK, DW_IC_INTR_MASK); +} + +/* + * Initiate (and continue) low level master read/write transaction. + * This function is only called from i2c_dw_isr, and pumping i2c_msg + * messages into the tx buffer. Even if the size of i2c_msg data is + * longer than the size of the tx buffer, it handles everything. + */ +static void +i2c_dw_xfer_msg(struct dw_i2c_dev *dev) +{ + struct i2c_msg *msgs = dev->msgs; + u32 intr_mask; + int tx_limit, rx_limit; + u32 addr = msgs[dev->msg_write_idx].addr; + u32 buf_len = dev->tx_buf_len; + u8 *buf = dev->tx_buf; + bool need_restart = false; + + intr_mask = DW_IC_INTR_MASTER_MASK; + + for (; dev->msg_write_idx < dev->msgs_num; dev->msg_write_idx++) { + u32 flags = msgs[dev->msg_write_idx].flags; + + /* + * If target address has changed, we need to + * reprogram the target address in the I2C + * adapter when we are done with this transfer. + */ + if (msgs[dev->msg_write_idx].addr != addr) { + dev_err(dev->dev, + "%s: invalid target address\n", __func__); + dev->msg_err = -EINVAL; + break; + } + + if (!(dev->status & STATUS_WRITE_IN_PROGRESS)) { + /* new i2c_msg */ + buf = msgs[dev->msg_write_idx].buf; + buf_len = msgs[dev->msg_write_idx].len; + + /* If both IC_EMPTYFIFO_HOLD_MASTER_EN and + * IC_RESTART_EN are set, we must manually + * set restart bit between messages. + */ + if ((dev->master_cfg & DW_IC_CON_RESTART_EN) && + (dev->msg_write_idx > 0)) + need_restart = true; + } + + tx_limit = dev->tx_fifo_depth - dw_readl(dev, DW_IC_TXFLR); + rx_limit = dev->rx_fifo_depth - dw_readl(dev, DW_IC_RXFLR); + + while (buf_len > 0 && tx_limit > 0 && rx_limit > 0) { + u32 cmd = 0; + + /* + * If IC_EMPTYFIFO_HOLD_MASTER_EN is set we must + * manually set the stop bit. However, it cannot be + * detected from the registers so we set it always + * when writing/reading the last byte. + */ + + /* + * i2c-core always sets the buffer length of + * I2C_FUNC_SMBUS_BLOCK_DATA to 1. The length will + * be adjusted when receiving the first byte. + * Thus we can't stop the transaction here. + */ + if (dev->msg_write_idx == dev->msgs_num - 1 && + buf_len == 1 && !(flags & I2C_M_RECV_LEN)) + cmd |= BIT(9); + + if (need_restart) { + cmd |= BIT(10); + need_restart = false; + } + + if (msgs[dev->msg_write_idx].flags & I2C_M_RD) { + + /* Avoid rx buffer overrun */ + if (dev->rx_outstanding >= dev->rx_fifo_depth) + break; + + dw_writel(dev, cmd | 0x100, DW_IC_DATA_CMD); + rx_limit--; + dev->rx_outstanding++; + } else + dw_writel(dev, cmd | *buf++, DW_IC_DATA_CMD); + tx_limit--; buf_len--; + } + + dev->tx_buf = buf; + dev->tx_buf_len = buf_len; + + /* + * Because we don't know the buffer length in the + * I2C_FUNC_SMBUS_BLOCK_DATA case, we can't stop + * the transaction here. + */ + if (buf_len > 0 || flags & I2C_M_RECV_LEN) { + /* more bytes to be written */ + dev->status |= STATUS_WRITE_IN_PROGRESS; + break; + } else + dev->status &= ~STATUS_WRITE_IN_PROGRESS; + } + + /* + * If i2c_msg index search is completed, we don't need TX_EMPTY + * interrupt any more. 
+ */ + if (dev->msg_write_idx == dev->msgs_num) + intr_mask &= ~DW_IC_INTR_TX_EMPTY; + + if (dev->msg_err) + intr_mask = 0; + + dev->intr_mask_value = intr_mask; +} + +static u8 +i2c_dw_recv_len(struct dw_i2c_dev *dev, u8 len) +{ + struct i2c_msg *msgs = dev->msgs; + u32 flags = msgs[dev->msg_read_idx].flags; + + /* + * Adjust the buffer length and mask the flag + * after receiving the first byte. + */ + len += (flags & I2C_CLIENT_PEC) ? 2 : 1; + dev->tx_buf_len = len - min_t(u8, len, dev->rx_outstanding); + msgs[dev->msg_read_idx].len = len; + msgs[dev->msg_read_idx].flags &= ~I2C_M_RECV_LEN; + + return len; +} + +static void +i2c_dw_read(struct dw_i2c_dev *dev) +{ + struct i2c_msg *msgs = dev->msgs; + int rx_valid; + + for (; dev->msg_read_idx < dev->msgs_num; dev->msg_read_idx++) { + u32 len; + u8 *buf; + + if (!(msgs[dev->msg_read_idx].flags & I2C_M_RD)) + continue; + + if (!(dev->status & STATUS_READ_IN_PROGRESS)) { + len = msgs[dev->msg_read_idx].len; + buf = msgs[dev->msg_read_idx].buf; + } else { + len = dev->rx_buf_len; + buf = dev->rx_buf; + } + + rx_valid = dw_readl(dev, DW_IC_RXFLR); + + for (; len > 0 && rx_valid > 0; len--, rx_valid--) { + u32 flags = msgs[dev->msg_read_idx].flags; + + *buf = dw_readl(dev, DW_IC_DATA_CMD); + /* Ensure length byte is a valid value */ + if (flags & I2C_M_RECV_LEN && + *buf <= I2C_SMBUS_BLOCK_MAX && *buf > 0) { + len = i2c_dw_recv_len(dev, *buf); + } + buf++; + dev->rx_outstanding--; + } + + if (len > 0) { + dev->status |= STATUS_READ_IN_PROGRESS; + dev->rx_buf_len = len; + dev->rx_buf = buf; + return; + } else + dev->status &= ~STATUS_READ_IN_PROGRESS; + } +} + +/* + * Prepare controller for a transaction and call i2c_dw_xfer_msg. + */ +static int +i2c_dw_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num) +{ + struct dw_i2c_dev *dev = i2c_get_adapdata(adap); + int ret; + + dev_dbg(dev->dev, "%s: msgs: %d\n", __func__, num); + + pm_runtime_get_sync(dev->dev); + + reinit_completion(&dev->cmd_complete); + dev->msgs = msgs; + dev->msgs_num = num; + dev->cmd_err = 0; + dev->msg_write_idx = 0; + dev->msg_read_idx = 0; + dev->msg_err = 0; + dev->status = STATUS_IDLE; + dev->abort_source = 0; + dev->rx_outstanding = 0; + + ret = i2c_dw_acquire_lock(dev); + if (ret) + goto done_nolock; + + ret = i2c_dw_wait_bus_not_busy(dev); + if (ret < 0) + goto done; + + /* Start the transfers */ + i2c_dw_xfer_init(dev); + + /* Wait for tx to complete */ + if (!wait_for_completion_timeout(&dev->cmd_complete, adap->timeout)) { + dev_err(dev->dev, "controller timed out\n"); + /* i2c_dw_init implicitly disables the adapter */ + i2c_recover_bus(&dev->adapter); + i2c_dw_init_master(dev); + ret = -ETIMEDOUT; + goto done; + } + + /* + * We must disable the adapter before returning and signaling the end + * of the current transfer. Otherwise the hardware might continue + * generating interrupts which in turn causes a race condition with + * the following transfer. Needs some more investigation if the + * additional interrupts are a hardware bug or this driver doesn't + * handle them correctly yet. 
+ */ + __i2c_dw_disable_nowait(dev); + + if (dev->msg_err) { + ret = dev->msg_err; + goto done; + } + + /* No error */ + if (likely(!dev->cmd_err && !dev->status)) { + ret = num; + goto done; + } + + /* We have an error */ + if (dev->cmd_err == DW_IC_ERR_TX_ABRT) { + ret = i2c_dw_handle_tx_abort(dev); + goto done; + } + + if (dev->status) + dev_err(dev->dev, + "transfer terminated early - interrupt latency too high?\n"); + + ret = -EIO; + +done: + i2c_dw_release_lock(dev); + +done_nolock: + pm_runtime_mark_last_busy(dev->dev); + pm_runtime_put_autosuspend(dev->dev); + + return ret; +} + +static const struct i2c_algorithm i2c_dw_algo = { + .master_xfer = i2c_dw_xfer, + .functionality = i2c_dw_func, +}; + +static const struct i2c_adapter_quirks i2c_dw_quirks = { + .flags = I2C_AQ_NO_ZERO_LEN, +}; + +static u32 i2c_dw_read_clear_intrbits(struct dw_i2c_dev *dev) +{ + u32 stat; + + /* + * The IC_INTR_STAT register just indicates "enabled" interrupts. + * Ths unmasked raw version of interrupt status bits are available + * in the IC_RAW_INTR_STAT register. + * + * That is, + * stat = dw_readl(IC_INTR_STAT); + * equals to, + * stat = dw_readl(IC_RAW_INTR_STAT) & dw_readl(IC_INTR_MASK); + * + * The raw version might be useful for debugging purposes. + */ + stat = dw_readl(dev, DW_IC_INTR_STAT); + /* Disable all interrupts */ + dw_writel(dev, 0, DW_IC_INTR_MASK); + stat |= dw_readl(dev, DW_IC_RAW_INTR_STAT) & dev->intr_mask_value; /* Avoid to miss interrupt! */ + + + /* + * Do not use the IC_CLR_INTR register to clear interrupts, or + * you'll miss some interrupts, triggered during the period from + * dw_readl(IC_INTR_STAT) to dw_readl(IC_CLR_INTR). + * + * Instead, use the separately-prepared IC_CLR_* registers. + */ + if (stat & DW_IC_INTR_RX_UNDER) + dw_readl(dev, DW_IC_CLR_RX_UNDER); + if (stat & DW_IC_INTR_RX_OVER) + dw_readl(dev, DW_IC_CLR_RX_OVER); + if (stat & DW_IC_INTR_TX_OVER) + dw_readl(dev, DW_IC_CLR_TX_OVER); + if (stat & DW_IC_INTR_RD_REQ) + dw_readl(dev, DW_IC_CLR_RD_REQ); + if (stat & DW_IC_INTR_TX_ABRT) { + /* + * The IC_TX_ABRT_SOURCE register is cleared whenever + * the IC_CLR_TX_ABRT is read. Preserve it beforehand. + */ + dev->abort_source = dw_readl(dev, DW_IC_TX_ABRT_SOURCE); + dw_readl(dev, DW_IC_CLR_TX_ABRT); + } + if (stat & DW_IC_INTR_RX_DONE) + dw_readl(dev, DW_IC_CLR_RX_DONE); + if (stat & DW_IC_INTR_ACTIVITY) + dw_readl(dev, DW_IC_CLR_ACTIVITY); + if (stat & DW_IC_INTR_STOP_DET) + dw_readl(dev, DW_IC_CLR_STOP_DET); + if (stat & DW_IC_INTR_START_DET) + dw_readl(dev, DW_IC_CLR_START_DET); + if (stat & DW_IC_INTR_GEN_CALL) + dw_readl(dev, DW_IC_CLR_GEN_CALL); + + return stat; +} + +/* + * Interrupt service routine. This gets called whenever an I2C master interrupt + * occurs. + */ +static int i2c_dw_irq_handler_master(struct dw_i2c_dev *dev) +{ + u32 stat; + + stat = i2c_dw_read_clear_intrbits(dev); + if (stat & DW_IC_INTR_TX_ABRT) { + dev->cmd_err |= DW_IC_ERR_TX_ABRT; + dev->status = STATUS_IDLE; + + /* + * Anytime TX_ABRT is set, the contents of the tx/rx + * buffers are flushed. Make sure to skip them. + */ + dev->intr_mask_value = 0; + goto tx_aborted; + } + + if (stat & DW_IC_INTR_RX_FULL) + i2c_dw_read(dev); + + if (stat & DW_IC_INTR_TX_EMPTY) + i2c_dw_xfer_msg(dev); + + /* + * No need to modify or disable the interrupt mask here. + * i2c_dw_xfer_msg() will take care of it according to + * the current transmit status. 
+ */ + +tx_aborted: + if ((stat & (DW_IC_INTR_TX_ABRT | DW_IC_INTR_STOP_DET)) || dev->msg_err) + complete(&dev->cmd_complete); + else if (unlikely(dev->flags & ACCESS_INTR_MASK)) { + /* Workaround to trigger pending interrupt */ + stat = dw_readl(dev, DW_IC_INTR_MASK); + i2c_dw_disable_int(dev); + dw_writel(dev, stat, DW_IC_INTR_MASK); + return 0; + } + + dw_writel(dev, dev->intr_mask_value, DW_IC_INTR_MASK); + return 0; + +} + +static irqreturn_t i2c_dw_isr(int this_irq, void *dev_id) +{ + struct dw_i2c_dev *dev = dev_id; + u32 stat, enabled; + + enabled = dw_readl(dev, DW_IC_ENABLE); + stat = dw_readl(dev, DW_IC_RAW_INTR_STAT); + dev_dbg(dev->dev, "enabled=%#x stat=%#x\n", enabled, stat); + if (!enabled || !(stat & ~DW_IC_INTR_ACTIVITY)) + return IRQ_NONE; + + i2c_dw_irq_handler_master(dev); + + return IRQ_HANDLED; +} + +static void i2c_dw_prepare_recovery(struct i2c_adapter *adap) +{ + struct dw_i2c_dev *dev = i2c_get_adapdata(adap); + + i2c_dw_disable(dev); + reset_control_assert(dev->rst); + i2c_dw_prepare_clk(dev, false); +} + +static void i2c_dw_unprepare_recovery(struct i2c_adapter *adap) +{ + struct dw_i2c_dev *dev = i2c_get_adapdata(adap); + + i2c_dw_prepare_clk(dev, true); + reset_control_deassert(dev->rst); + i2c_dw_init_master(dev); +} + +static int i2c_dw_init_recovery_info(struct dw_i2c_dev *dev) +{ + struct i2c_bus_recovery_info *rinfo = &dev->rinfo; + struct i2c_adapter *adap = &dev->adapter; + struct gpio_desc *gpio; + int r; + + gpio = devm_gpiod_get(dev->dev, "scl", GPIOD_OUT_HIGH); + if (IS_ERR(gpio)) { + r = PTR_ERR(gpio); + if (r == -ENOENT || r == -ENOSYS) + return 0; + return r; + } + rinfo->scl_gpiod = gpio; + + gpio = devm_gpiod_get_optional(dev->dev, "sda", GPIOD_IN); + if (IS_ERR(gpio)) + return PTR_ERR(gpio); + rinfo->sda_gpiod = gpio; + + rinfo->recover_bus = i2c_generic_scl_recovery; + rinfo->prepare_recovery = i2c_dw_prepare_recovery; + rinfo->unprepare_recovery = i2c_dw_unprepare_recovery; + adap->bus_recovery_info = rinfo; + + dev_info(dev->dev, "running with gpio recovery mode! scl%s", + rinfo->sda_gpiod ? ",sda" : ""); + + return 0; +} + +int i2c_dw_probe(struct dw_i2c_dev *dev) +{ + struct i2c_adapter *adap = &dev->adapter; + unsigned long irq_flags; + int ret; + + init_completion(&dev->cmd_complete); + + dev->init = i2c_dw_init_master; + dev->disable = i2c_dw_disable; + dev->disable_int = i2c_dw_disable_int; + + ret = i2c_dw_set_reg_access(dev); + if (ret) + return ret; + + ret = i2c_dw_set_timings_master(dev); + if (ret) + return ret; + + ret = dev->init(dev); + if (ret) + return ret; + + snprintf(adap->name, sizeof(adap->name), + "Synopsys DesignWare I2C adapter"); + adap->retries = 3; + adap->algo = &i2c_dw_algo; + adap->quirks = &i2c_dw_quirks; + adap->dev.parent = dev->dev; + i2c_set_adapdata(adap, dev); + + if (dev->pm_disabled) + irq_flags = IRQF_NO_SUSPEND; + else + irq_flags = IRQF_SHARED | IRQF_COND_SUSPEND; + + i2c_dw_disable_int(dev); + ret = devm_request_irq(dev->dev, dev->irq, i2c_dw_isr, irq_flags, + dev_name(dev->dev), dev); + if (ret) { + dev_err(dev->dev, "failure requesting irq %i: %d\n", + dev->irq, ret); + return ret; + } + + ret = i2c_dw_init_recovery_info(dev); + if (ret) + return ret; + + /* + * Increment PM usage count during adapter registration in order to + * avoid possible spurious runtime suspend when adapter device is + * registered to the device core and immediate resume in case bus has + * registered I2C slaves that do I2C transfers in their probe. 
+ */ + pm_runtime_get_noresume(dev->dev); + ret = i2c_add_numbered_adapter(adap); + if (ret) + dev_err(dev->dev, "failure adding adapter: %d\n", ret); + pm_runtime_put_noidle(dev->dev); + + return ret; +} +EXPORT_SYMBOL_GPL(i2c_dw_probe); + +MODULE_DESCRIPTION("Synopsys DesignWare I2C bus master adapter"); +MODULE_LICENSE("GPL"); diff --git a/drivers/i2c/busses/i2c-sunway-sw6-platdrv.c b/drivers/i2c/busses/i2c-sunway-sw6-platdrv.c new file mode 100644 index 000000000000..fd757e243933 --- /dev/null +++ b/drivers/i2c/busses/i2c-sunway-sw6-platdrv.c @@ -0,0 +1,498 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Synopsys DesignWare I2C adapter driver. + * + * Based on the TI DAVINCI I2C adapter driver. + * + * Copyright (C) 2006 Texas Instruments. + * Copyright (C) 2007 MontaVista Software Inc. + * Copyright (C) 2009 Provigent Ltd. + */ +#include <linux/acpi.h> +#include <linux/clk-provider.h> +#include <linux/clk.h> +#include <linux/delay.h> +#include <linux/dmi.h> +#include <linux/err.h> +#include <linux/errno.h> +#include <linux/i2c.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/platform_data/i2c-designware.h> +#include <linux/platform_device.h> +#include <linux/pm.h> +#include <linux/pm_runtime.h> +#include <linux/property.h> +#include <linux/reset.h> +#include <linux/sched.h> +#include <linux/slab.h> +#include <linux/suspend.h> + +#include "i2c-sunway-sw6-core.h" + +static u32 i2c_dw_get_clk_rate_khz(struct dw_i2c_dev *dev) +{ + return clk_get_rate(dev->clk)/1000; +} + +#ifdef CONFIG_ACPI +/* + * The HCNT/LCNT information coming from ACPI should be the most accurate + * for given platform. However, some systems get it wrong. On such systems + * we get better results by calculating those based on the input clock. + */ +static const struct dmi_system_id dw_i2c_no_acpi_params[] = { + { + .ident = "Dell Inspiron 7348", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron 7348"), + }, + }, + { } +}; + +static void dw_i2c_acpi_params(struct platform_device *pdev, char method[], + u16 *hcnt, u16 *lcnt, u32 *sda_hold) +{ + struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER }; + acpi_handle handle = ACPI_HANDLE(&pdev->dev); + union acpi_object *obj; + + if (dmi_check_system(dw_i2c_no_acpi_params)) + return; + + if (ACPI_FAILURE(acpi_evaluate_object(handle, method, NULL, &buf))) + return; + + obj = (union acpi_object *)buf.pointer; + if (obj->type == ACPI_TYPE_PACKAGE && obj->package.count == 3) { + const union acpi_object *objs = obj->package.elements; + + *hcnt = (u16)objs[0].integer.value; + *lcnt = (u16)objs[1].integer.value; + *sda_hold = (u32)objs[2].integer.value; + } + + kfree(buf.pointer); +} + +static int dw_i2c_acpi_configure(struct platform_device *pdev) +{ + struct dw_i2c_dev *dev = platform_get_drvdata(pdev); + struct i2c_timings *t = &dev->timings; + u32 ss_ht = 0, fp_ht = 0, hs_ht = 0, fs_ht = 0; + acpi_handle handle = ACPI_HANDLE(&pdev->dev); + const struct acpi_device_id *id; + struct acpi_device *adev; + const char *uid; + + dev->adapter.nr = -1; + dev->tx_fifo_depth = 32; + dev->rx_fifo_depth = 32; + + /* + * Try to get SDA hold time and *CNT values from an ACPI method for + * selected speed modes. 
+ */ + dw_i2c_acpi_params(pdev, "SSCN", &dev->ss_hcnt, &dev->ss_lcnt, &ss_ht); + dw_i2c_acpi_params(pdev, "FPCN", &dev->fp_hcnt, &dev->fp_lcnt, &fp_ht); + dw_i2c_acpi_params(pdev, "HSCN", &dev->hs_hcnt, &dev->hs_lcnt, &hs_ht); + dw_i2c_acpi_params(pdev, "FMCN", &dev->fs_hcnt, &dev->fs_lcnt, &fs_ht); + + switch (t->bus_freq_hz) { + case 100000: + dev->sda_hold_time = ss_ht; + break; + case 1000000: + dev->sda_hold_time = fp_ht; + break; + case 3400000: + dev->sda_hold_time = hs_ht; + break; + case 400000: + default: + dev->sda_hold_time = fs_ht; + break; + } + + id = acpi_match_device(pdev->dev.driver->acpi_match_table, &pdev->dev); + if (id && id->driver_data) + dev->flags |= (u32)id->driver_data; + + if (acpi_bus_get_device(handle, &adev)) + return -ENODEV; + + /* + * Cherrytrail I2C7 gets used for the PMIC which gets accessed + * through ACPI opregions during late suspend / early resume + * disable pm for it. + */ + uid = adev->pnp.unique_id; + if ((dev->flags & MODEL_CHERRYTRAIL) && !strcmp(uid, "7")) + dev->pm_disabled = true; + + return 0; +} + +static const struct acpi_device_id dw_i2c_acpi_match[] = { + { "INT33C2", 0 }, + { "INT33C3", 0 }, + { "INT3432", 0 }, + { "INT3433", 0 }, + { "80860F41", 0 }, + { "808622C1", MODEL_CHERRYTRAIL }, + { "AMD0010", ACCESS_INTR_MASK }, + { "AMDI0010", ACCESS_INTR_MASK }, + { "AMDI0510", 0 }, + { "APMC0D0F", 0 }, + { "HISI02A1", 0 }, + { "HISI02A2", 0 }, + { } +}; +MODULE_DEVICE_TABLE(acpi, dw_i2c_acpi_match); +#else +static inline int dw_i2c_acpi_configure(struct platform_device *pdev) +{ + return -ENODEV; +} +#endif + +static void i2c_dw_configure_master(struct dw_i2c_dev *dev) +{ + struct i2c_timings *t = &dev->timings; + + dev->functionality = I2C_FUNC_10BIT_ADDR | DW_IC_DEFAULT_FUNCTIONALITY; + + dev->master_cfg = DW_IC_CON_MASTER | DW_IC_CON_SLAVE_DISABLE | + DW_IC_CON_RESTART_EN; + + dev->mode = DW_IC_MASTER; + + switch (t->bus_freq_hz) { + case 100000: + dev->master_cfg |= DW_IC_CON_SPEED_STD; + break; + case 3400000: + dev->master_cfg |= DW_IC_CON_SPEED_HIGH; + break; + default: + dev->master_cfg |= DW_IC_CON_SPEED_FAST; + } +} + +static void i2c_dw_configure_slave(struct dw_i2c_dev *dev) +{ + dev->functionality = I2C_FUNC_SLAVE | DW_IC_DEFAULT_FUNCTIONALITY; + + dev->slave_cfg = DW_IC_CON_RX_FIFO_FULL_HLD_CTRL | + DW_IC_CON_RESTART_EN | DW_IC_CON_STOP_DET_IFADDRESSED; + + dev->mode = DW_IC_SLAVE; +} + +static void dw_i2c_set_fifo_size(struct dw_i2c_dev *dev, int id) +{ + u32 param, tx_fifo_depth, rx_fifo_depth; + + /* + * Try to detect the FIFO depth if not set by interface driver, + * the depth could be from 2 to 256 from HW spec. 
+ */ + param = i2c_dw_read_comp_param(dev); + tx_fifo_depth = ((param >> 16) & 0xff) + 1; + rx_fifo_depth = ((param >> 8) & 0xff) + 1; + if (!dev->tx_fifo_depth) { + dev->tx_fifo_depth = tx_fifo_depth; + dev->rx_fifo_depth = rx_fifo_depth; + dev->adapter.nr = id; + } else if (tx_fifo_depth >= 2) { + dev->tx_fifo_depth = min_t(u32, dev->tx_fifo_depth, + tx_fifo_depth); + dev->rx_fifo_depth = min_t(u32, dev->rx_fifo_depth, + rx_fifo_depth); + } +} + +static void dw_i2c_plat_pm_cleanup(struct dw_i2c_dev *dev) +{ + pm_runtime_disable(dev->dev); + + if (dev->pm_disabled) + pm_runtime_put_noidle(dev->dev); +} + +static int dw_i2c_plat_probe(struct platform_device *pdev) +{ + struct dw_i2c_platform_data *pdata = dev_get_platdata(&pdev->dev); + struct i2c_adapter *adap; + struct dw_i2c_dev *dev; + struct i2c_timings *t; + u32 acpi_speed; + struct resource *mem; + int i, irq, ret; + static const int supported_speeds[] = { + 0, 100000, 400000, 1000000, 3400000 + }; + + irq = platform_get_irq(pdev, 0); + if (irq < 0) + return irq; + + dev = devm_kzalloc(&pdev->dev, sizeof(struct dw_i2c_dev), GFP_KERNEL); + if (!dev) + return -ENOMEM; + + mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); + dev->base = devm_ioremap_resource(&pdev->dev, mem); + if (IS_ERR(dev->base)) + return PTR_ERR(dev->base); + + dev->dev = &pdev->dev; + dev->irq = irq; + platform_set_drvdata(pdev, dev); + + dev->rst = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL); + if (IS_ERR(dev->rst)) { + if (PTR_ERR(dev->rst) == -EPROBE_DEFER) + return -EPROBE_DEFER; + } else { + reset_control_deassert(dev->rst); + } + + t = &dev->timings; + if (pdata) + t->bus_freq_hz = pdata->i2c_scl_freq; + else + i2c_parse_fw_timings(&pdev->dev, t, false); + + acpi_speed = i2c_acpi_find_bus_speed(&pdev->dev); + /* + * Some DSTDs use a non standard speed, round down to the lowest + * standard speed. + */ + for (i = 1; i < ARRAY_SIZE(supported_speeds); i++) { + if (acpi_speed < supported_speeds[i]) + break; + } + acpi_speed = supported_speeds[i - 1]; + + /* + * Find bus speed from the "clock-frequency" device property, ACPI + * or by using fast mode if neither is set. + */ + if (acpi_speed && t->bus_freq_hz) + t->bus_freq_hz = min(t->bus_freq_hz, acpi_speed); + else if (acpi_speed || t->bus_freq_hz) + t->bus_freq_hz = max(t->bus_freq_hz, acpi_speed); + else + t->bus_freq_hz = 400000; + + if (has_acpi_companion(&pdev->dev)) + dw_i2c_acpi_configure(pdev); + + /* + * Only standard mode at 100kHz, fast mode at 400kHz, + * fast mode plus at 1MHz and high speed mode at 3.4MHz are supported. 
+ */ + if (t->bus_freq_hz != 100000 && t->bus_freq_hz != 400000 && + t->bus_freq_hz != 1000000 && t->bus_freq_hz != 3400000) { + dev_err(&pdev->dev, + "%d Hz is unsupported, only 100kHz, 400kHz, 1MHz and 3.4MHz are supported\n", + t->bus_freq_hz); + ret = -EINVAL; + goto exit_reset; + } + + ret = i2c_dw_probe_lock_support(dev); + if (ret) + goto exit_reset; + + if (i2c_detect_slave_mode(&pdev->dev)) + i2c_dw_configure_slave(dev); + else + i2c_dw_configure_master(dev); + + dev->clk = devm_clk_get(&pdev->dev, NULL); + if (!i2c_dw_prepare_clk(dev, true)) { + u64 clk_khz; + + dev->get_clk_rate_khz = i2c_dw_get_clk_rate_khz; + clk_khz = dev->get_clk_rate_khz(dev); + + if (!dev->sda_hold_time && t->sda_hold_ns) + dev->sda_hold_time = + div_u64(clk_khz * t->sda_hold_ns + 500000, 1000000); + } + + dw_i2c_set_fifo_size(dev, pdev->id); + + adap = &dev->adapter; + adap->owner = THIS_MODULE; + adap->class = 0;/*lm_sensors, ...*/ + ACPI_COMPANION_SET(&adap->dev, ACPI_COMPANION(&pdev->dev)); + adap->dev.of_node = pdev->dev.of_node; + + dev_pm_set_driver_flags(&pdev->dev, + DPM_FLAG_SMART_PREPARE | + DPM_FLAG_SMART_SUSPEND | + DPM_FLAG_MAY_SKIP_RESUME); + + /* The code below assumes runtime PM to be disabled. */ + WARN_ON(pm_runtime_enabled(&pdev->dev)); + + pm_runtime_set_autosuspend_delay(&pdev->dev, 1000); + pm_runtime_use_autosuspend(&pdev->dev); + pm_runtime_set_active(&pdev->dev); + + if (dev->pm_disabled) + pm_runtime_get_noresume(&pdev->dev); + + pm_runtime_enable(&pdev->dev); + + if (dev->mode == DW_IC_SLAVE) + ret = i2c_dw_probe_slave(dev); + else + ret = i2c_dw_probe(dev); + + if (ret) + goto exit_probe; + + return ret; + +exit_probe: + dw_i2c_plat_pm_cleanup(dev); +exit_reset: + if (!IS_ERR_OR_NULL(dev->rst)) + reset_control_assert(dev->rst); + return ret; +} + +static int dw_i2c_plat_remove(struct platform_device *pdev) +{ + struct dw_i2c_dev *dev = platform_get_drvdata(pdev); + + pm_runtime_get_sync(&pdev->dev); + + i2c_del_adapter(&dev->adapter); + + dev->disable(dev); + + pm_runtime_dont_use_autosuspend(&pdev->dev); + pm_runtime_put_sync(&pdev->dev); + dw_i2c_plat_pm_cleanup(dev); + + if (!IS_ERR_OR_NULL(dev->rst)) + reset_control_assert(dev->rst); + + i2c_dw_remove_lock_support(dev); + + return 0; +} + +#ifdef CONFIG_OF +static const struct of_device_id dw_i2c_of_match[] = { + { .compatible = "snps,designware-i2c", }, + {}, +}; +MODULE_DEVICE_TABLE(of, dw_i2c_of_match); +#endif + +#ifdef CONFIG_PM_SLEEP +static int dw_i2c_plat_prepare(struct device *dev) +{ + /* + * If the ACPI companion device object is present for this device, it + * may be accessed during suspend and resume of other devices via I2C + * operation regions, so tell the PM core and middle layers to avoid + * skipping system suspend/resume callbacks for it in that case. + */ + return !has_acpi_companion(dev); +} + +static void dw_i2c_plat_complete(struct device *dev) +{ + /* + * The device can only be in runtime suspend at this point if it has not + * been resumed throughout the ending system suspend/resume cycle, so if + * the platform firmware might mess up with it, request the runtime PM + * framework to resume it. 
+ */ + if (pm_runtime_suspended(dev) && pm_resume_via_firmware()) + pm_request_resume(dev); +} +#else +#define dw_i2c_plat_prepare NULL +#define dw_i2c_plat_complete NULL +#endif + +#ifdef CONFIG_PM +static int dw_i2c_plat_suspend(struct device *dev) +{ + struct dw_i2c_dev *i_dev = dev_get_drvdata(dev); + + if (i_dev->pm_disabled) + return 0; + + i_dev->disable(i_dev); + i2c_dw_prepare_clk(i_dev, false); + + return 0; +} + +static int dw_i2c_plat_resume(struct device *dev) +{ + struct dw_i2c_dev *i_dev = dev_get_drvdata(dev); + + if (!i_dev->pm_disabled) + i2c_dw_prepare_clk(i_dev, true); + + i_dev->init(i_dev); + + return 0; +} + +static const struct dev_pm_ops dw_i2c_dev_pm_ops = { + .prepare = dw_i2c_plat_prepare, + .complete = dw_i2c_plat_complete, + SET_LATE_SYSTEM_SLEEP_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume) + SET_RUNTIME_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume, NULL) +}; + +#define DW_I2C_DEV_PMOPS (&dw_i2c_dev_pm_ops) +#else +#define DW_I2C_DEV_PMOPS NULL +#endif + +/* Work with hotplug and coldplug */ +MODULE_ALIAS("platform:i2c_designware"); + +static struct platform_driver dw_i2c_driver = { + .probe = dw_i2c_plat_probe, + .remove = dw_i2c_plat_remove, + .driver = { + .name = "i2c_designware", + .of_match_table = of_match_ptr(dw_i2c_of_match), + .acpi_match_table = ACPI_PTR(dw_i2c_acpi_match), + .pm = DW_I2C_DEV_PMOPS, + }, +}; + +static int __init dw_i2c_init_driver(void) +{ + return platform_driver_register(&dw_i2c_driver); +} +subsys_initcall(dw_i2c_init_driver); + +static void __exit dw_i2c_exit_driver(void) +{ + platform_driver_unregister(&dw_i2c_driver); +} +module_exit(dw_i2c_exit_driver); + +MODULE_AUTHOR("Baruch Siach baruch@tkos.co.il"); +MODULE_DESCRIPTION("Synopsys DesignWare I2C bus adapter"); +MODULE_LICENSE("GPL");
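For illustration only, here is a minimal board-file sketch of how a platform could hand this driver its register window, IRQ and bus speed through the dw_i2c_platform_data path that dw_i2c_plat_probe() reads (t->bus_freq_hz = pdata->i2c_scl_freq). The MMIO base and IRQ number below are placeholders, not real SW6 values.

#include <linux/ioport.h>
#include <linux/platform_device.h>
#include <linux/platform_data/i2c-designware.h>

static struct resource example_dw_i2c_resources[] = {
	DEFINE_RES_MEM(0x12340000, 0x1000),	/* placeholder MMIO window */
	DEFINE_RES_IRQ(42),			/* placeholder IRQ */
};

/* 400 kHz selects Fast Mode in i2c_dw_configure_master() */
static struct dw_i2c_platform_data example_dw_i2c_pdata = {
	.i2c_scl_freq = 400000,
};

static struct platform_device example_dw_i2c_device = {
	.name		= "i2c_designware",	/* matches dw_i2c_driver.driver.name */
	.id		= 0,
	.resource	= example_dw_i2c_resources,
	.num_resources	= ARRAY_SIZE(example_dw_i2c_resources),
	.dev		= {
		.platform_data = &example_dw_i2c_pdata,
	},
};

Registering example_dw_i2c_device with platform_device_register() would bind it to the driver above; with no platform data, the probe falls back to firmware timings and ultimately to 400 kHz fast mode.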
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
On SUBARCH_C3B, when using the original driver for the LSI MegaRAID SAS-3 PCIe card, a kernel panic may arise from accessing an illegal address. Make some changes for SUBARCH_C3B to avoid it, although the problem may not be avoided entirely.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/scsi/megaraid/megaraid_sas_fusion.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index b8ec6db532b0..1cfa2c9db4f2 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -3506,6 +3506,14 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex, d_val.u.high != cpu_to_le32(UINT_MAX)) {
smid = le16_to_cpu(reply_desc->SMID); +#ifdef CONFIG_SUBARCH_C3B + if (smid == 0xffff) { + smid = d_val.u.low >> 16; + if (smid == 0xffff) + break; + } +#endif + cmd_fusion = fusion->cmd_list[smid - 1]; scsi_io_req = (struct MPI2_RAID_SCSI_IO_REQUEST *) cmd_fusion->io_request;
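A minimal standalone sketch of the C3B guard above, assuming (as the commit message suggests but does not state outright) that the panic stems from a torn or invalid 64-bit MMIO read of the reply descriptor. The helper name and simplified types are hypothetical.

#include <linux/types.h>

/*
 * If the 16-bit SMID field reads back as all-ones (likely an invalid
 * read), retry extracting it from bits 31:16 of the descriptor's low
 * dword, and give up on this descriptor if it is still invalid, so we
 * never index cmd_list[] with smid - 1 == 0xfffe out of bounds.
 */
static inline bool c3b_smid_valid(u32 d_val_low, u16 *smid)
{
	if (*smid == 0xffff) {
		*smid = d_val_low >> 16;
		if (*smid == 0xffff)
			return false;	/* caller breaks out of the loop */
	}
	return true;
}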
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
KFD has been verified to function on SW64 systems (SW3231). It should be available as a kernel configuration option on these systems.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig index b3672d10ea54..22331939fac3 100644 --- a/drivers/gpu/drm/amd/amdkfd/Kconfig +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig @@ -5,7 +5,7 @@
config HSA_AMD bool "HSA kernel driver for AMD GPU devices" - depends on DRM_AMDGPU && (X86_64 || ARM64 || PPC64) + depends on DRM_AMDGPU && (X86_64 || ARM64 || PPC64 || SW64 ) imply AMD_IOMMU_V2 if X86_64 select MMU_NOTIFIER help
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Loading the AMD video card driver on the sw64 arch causes the kernel to crash, showing as follows:
Unable to handle kernel paging request at virtual address 0000000000000060 CPU 0 kworker/0:1(56): Oops 0 pc = [<ffffffff809a4b14>] ra = [<ffffffff80923440>] ps = 0000 Not tainted pc is at up_read+0x14/0x80 ra is at do_page_fault+0x380/0x610 v0 = 0000000000000000 t0 = 0000000000000001 t1 = 0000000000000001 t2 = fff0000fd8bbf000 t3 = 0000000000000060 t4 = 00038c3001804009 t5 = 0000002000038d30 t6 = fff0888000040080 t7 = fff0000fd8e48000 s0 = fff0000fd1df8000 s1 = fff0888000040000 s2 = 0000000000000001 s3 = 0000000000000000 s4 = 0000000000000007 s5 = 0000000000000060 s6 = fff0000fd8e4ba60 a0 = 0000000000000060 a1 = 0000000000000000 a2 = 0000000000000001 a3 = fff0000fd8e4ba60 a4 = 0000000000000000 a5 = 0000000000000177 t8 = 0000000000000029 t9 = ffffffff82974bc0 t10 = 0000000000000000 t11 = 0000000000000178 pv = ffffffff809a4b00 at = 0000000000000007 gp = ffffffff82944bc0 sp = (____ptrval____)
Disabling lock debugging due to kernel taint Trace: [<ffffffff80923440>] do_page_fault+0x380/0x610 [<ffffffff80910f70>] entMM+0x90/0xc0 [<ffffffff813da9bc>] dev_printk_emit+0x4c/0x60 [<ffffffff81189180>] radeon_uvd_resume+0x40/0xa0 [<ffffffff81a56eb0>] memcpy+0x0/0x2f0 [<ffffffff81a56fb0>] memcpy+0x100/0x2f0 [<ffffffff81121e04>] cik_startup+0x3a64/0x3c70
The reason is that we use SIMD instructions to implement the memset/memcpy hooks, which causes problems on I/O memory. Sigh, let's correct it.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 16 ++++++++++++---- drivers/gpu/drm/radeon/radeon_vce.c | 8 ++++++-- 2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index b19f7bd37781..8dd7587fea26 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -3721,8 +3721,12 @@ static int gfx_v9_0_kiq_init_queue(struct amdgpu_ring *ring)
if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */ /* reset MQD to a clean status */ - if (adev->gfx.mec.mqd_backup[mqd_idx]) - memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct v9_mqd_allocation)); + if (adev->gfx.mec.mqd_backup[mqd_idx]) { + if (IS_ENABLED(CONFIG_SW64)) + memcpy_toio(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct v9_mqd_allocation)); + else + memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct v9_mqd_allocation)); + }
/* reset ring buffer */ ring->wptr = 0; @@ -3744,8 +3748,12 @@ static int gfx_v9_0_kiq_init_queue(struct amdgpu_ring *ring) soc15_grbm_select(adev, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex);
- if (adev->gfx.mec.mqd_backup[mqd_idx]) - memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct v9_mqd_allocation)); + if (adev->gfx.mec.mqd_backup[mqd_idx]) { + if (IS_ENABLED(CONFIG_SW64)) + memcpy_fromio(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct v9_mqd_allocation)); + else + memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct v9_mqd_allocation)); + } }
return 0; diff --git a/drivers/gpu/drm/radeon/radeon_vce.c b/drivers/gpu/drm/radeon/radeon_vce.c index 5e8006444704..9f59e2b8935b 100644 --- a/drivers/gpu/drm/radeon/radeon_vce.c +++ b/drivers/gpu/drm/radeon/radeon_vce.c @@ -242,8 +242,12 @@ int radeon_vce_resume(struct radeon_device *rdev) memset(cpu_addr, 0, radeon_bo_size(rdev->vce.vcpu_bo)); if (rdev->family < CHIP_BONAIRE) r = vce_v1_0_load_fw(rdev, cpu_addr); - else - memcpy(cpu_addr, rdev->vce_fw->data, rdev->vce_fw->size); + else { + if (IS_ENABLED(CONFIG_SW64)) + memcpy_toio(cpu_addr, rdev->vce_fw->data, rdev->vce_fw->size); + else + memcpy(cpu_addr, rdev->vce_fw->data, rdev->vce_fw->size); + }
radeon_bo_kunmap(rdev->vce.vcpu_bo);
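The same IS_ENABLED(CONFIG_SW64) branch now appears at every affected call site. A small wrapper, sketched below, could centralize it; this helper is hypothetical and not part of the patch, just a refactoring idea under the patch's own assumption that the destination may be I/O memory on SW64.

#include <linux/io.h>
#include <linux/kconfig.h>
#include <linux/string.h>

/*
 * Copy to a destination that may be I/O-mapped: use memcpy_toio() on
 * SW64, where SIMD-optimized memcpy() faults on I/O memory, and plain
 * memcpy() everywhere else.
 */
static inline void sw64_safe_copy_to(void *dst, const void *src, size_t len)
{
	if (IS_ENABLED(CONFIG_SW64))
		memcpy_toio((void __iomem *)dst, src, len);
	else
		memcpy(dst, src, len);
}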
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Driver code that directly calls the SIMD-optimized memset function on I/O memory may raise a dfault under the sw64 architecture, so work around it by turning the 'memset' reference into a '_memset_c_io' call.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/gpu/drm/radeon/radeon_vce.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_vce.c b/drivers/gpu/drm/radeon/radeon_vce.c index 9f59e2b8935b..68cc5a347d3b 100644 --- a/drivers/gpu/drm/radeon/radeon_vce.c +++ b/drivers/gpu/drm/radeon/radeon_vce.c @@ -239,7 +239,11 @@ int radeon_vce_resume(struct radeon_device *rdev) return r; }
+#ifdef __sw_64__ + _memset_c_io(cpu_addr, 0, radeon_bo_size(rdev->vce.vcpu_bo)); +#else memset(cpu_addr, 0, radeon_bo_size(rdev->vce.vcpu_bo)); +#endif if (rdev->family < CHIP_BONAIRE) r = vce_v1_0_load_fw(rdev, cpu_addr); else {
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Loading the AST video card driver on the sw64 arch causes the kernel to crash, showing as follows:
ast 0000:0c:00.0: [drm] fb0: astdrmfb frame buffer device Unable to handle kernel paging request at virtual address fff08d00e0000000 CPU 0 kworker/0:2(370): Oops 1 pc = [<ffffffff81086560>] ra = [<ffffffff811f475c>] ps = 0000 Not tainted pc is at memcpy+0xa0/0x260 ra is at drm_fb_helper_dirty_work+0x18c/0x2c0 v0 = fff08d00e0000000 t0 = 0000000000000000 t1 = 0000000000001000 t2 = 0000000000000000 t3 = fff000010730fd60 t4 = 0000000000000001 t5 = 0000000000000004 t6 = fff08d00e0000040 t7 = fff000010730c000 s0 = fff0000107a218d0 s1 = fffff0000d05a000 s2 = fff00001079e1400 s3 = 0000000000001000 s4 = fff08d00e0000000 s5 = 0000000000000000 s6 = fff000010730fdc8 a0 = fff08d00e0000000 a1 = fffff0000d05a000 a2 = 0000000000001000 a3 = fff000400690ce20 a4 = 0000000000000001 a5 = 0000000000000000 t8 = fff00001079e8740 t9 = fff000400690c800 t10 = fffff7f0800d1800 t11= 000000000000b940 pv = ffffffff810864c0 at = 0000000000000000 gp = ffffffff8211b170 sp = (____ptrval____) Disabling lock debugging due to kernel taint Trace: [<ffffffff819833c8>] ww_mutex_unlock+0x38/0x70 [<ffffffff811ce054>] drm_gem_vram_vmap+0x114/0x1e0 [<ffffffff811ce144>] drm_gem_vram_object_vmap+0x24/0x70 [<ffffffff811f46c0>] drm_fb_helper_dirty_work+0xf0/0x2c0 [<ffffffff8097a500>] process_one_work+0x280/0x5d0 [<ffffffff8097a8c0>] worker_thread+0x70/0x7c0 [<ffffffff8098638c>] kthread+0x1fc/0x210 [<ffffffff8097a850>] worker_thread+0x0/0x7c0 [<ffffffff809862d8>] kthread+0x148/0x210 [<ffffffff80911738>] ret_from_kernel_thread+0x18/0x20 [<ffffffff80986190>] kthread+0x0/0x210
The backtrace indicates that the shadow framebuffer copy in drm_fb_helper_dirty_blit_real() should access the real framebuffer using an I/O access, which on sw64 is typically implemented as a physical (ASI_PHYS) access, so let's fix it.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/gpu/drm/drm_fb_helper.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 8033467db4be..a74ee23685b6 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -398,6 +398,10 @@ static void drm_fb_helper_dirty_blit_real(struct drm_fb_helper *fb_helper, size_t len = (clip->x2 - clip->x1) * cpp; unsigned int y;
+#ifdef CONFIG_SW64 + fb_helper->dev->mode_config.fbdev_use_iomem = true; +#endif + for (y = clip->y1; y < clip->y2; y++) { if (!fb_helper->dev->mode_config.fbdev_use_iomem) memcpy(dst, src, len);
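For clarity, this is the effect of the flag forced on above, paraphrased from the surrounding upstream blit loop whose opening lines appear in the hunk context: with fbdev_use_iomem set, every scanline goes through the I/O-safe accessor instead of a plain memcpy().

	for (y = clip->y1; y < clip->y2; y++) {
		if (!fb_helper->dev->mode_config.fbdev_use_iomem)
			memcpy(dst, src, len);
		else
			memcpy_toio((void __iomem *)dst, src, len);

		src += fb->pitches[0];
		dst += fb->pitches[0];
	}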
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
This adds an IOMMU API implementation for sw64 PCI devices.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/iommu/Kconfig | 1 + drivers/iommu/Makefile | 2 +- drivers/iommu/sw64/Kconfig | 9 + drivers/iommu/sw64/Makefile | 2 + drivers/iommu/sw64/sunway_iommu.c | 1710 +++++++++++++++++++++++++++++ drivers/iommu/sw64/sunway_iommu.h | 78 ++ 6 files changed, 1801 insertions(+), 1 deletion(-) create mode 100644 drivers/iommu/sw64/Kconfig create mode 100644 drivers/iommu/sw64/Makefile create mode 100644 drivers/iommu/sw64/sunway_iommu.c create mode 100644 drivers/iommu/sw64/sunway_iommu.h
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index d6c04563f60d..d97e38bfbe4b 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -136,6 +136,7 @@ config MSM_IOMMU
source "drivers/iommu/amd/Kconfig" source "drivers/iommu/intel/Kconfig" +source "drivers/iommu/sw64/Kconfig"
config IRQ_REMAP bool "Support for Interrupt Remapping" diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 60fafc23dee6..1e7519f56afb 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y += amd/ intel/ arm/ +obj-y += amd/ intel/ arm/ sw64/ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o diff --git a/drivers/iommu/sw64/Kconfig b/drivers/iommu/sw64/Kconfig new file mode 100644 index 000000000000..a313c6e2d11b --- /dev/null +++ b/drivers/iommu/sw64/Kconfig @@ -0,0 +1,9 @@ +# SPDX-License-Identifier: GPL-2.0-only +# SW64 IOMMU SUPPORT +config SUNWAY_IOMMU + bool "Sunway IOMMU Support" + select IOMMU_API + select IOMMU_IOVA + depends on SW64 && PCI + help + Support for IOMMU on SW64 platform. diff --git a/drivers/iommu/sw64/Makefile b/drivers/iommu/sw64/Makefile new file mode 100644 index 000000000000..e23dbd40a74d --- /dev/null +++ b/drivers/iommu/sw64/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +obj-$(CONFIG_SUNWAY_IOMMU) += sunway_iommu.o diff --git a/drivers/iommu/sw64/sunway_iommu.c b/drivers/iommu/sw64/sunway_iommu.c new file mode 100644 index 000000000000..4eb29209e4bc --- /dev/null +++ b/drivers/iommu/sw64/sunway_iommu.c @@ -0,0 +1,1710 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * iommu.c: Generic sw64 IOMMU support + * + * This is designed and tested for 3231. If there are no changes in hardware + * in later chips, then it should work just as well. + * + */ + +#include <linux/kernel.h> +#include <linux/mm.h> +#include <linux/pci.h> +#include <linux/gfp.h> +#include <linux/export.h> +#include <linux/scatterlist.h> +#include <linux/log2.h> +#include <linux/dma-mapping.h> +#include <linux/dma-map-ops.h> +#include <linux/dma-iommu.h> +#include <linux/iommu.h> +#include <linux/iommu-helper.h> +#include <linux/iova.h> +#include <linux/slab.h> +#include <linux/delay.h> +#include <linux/syscore_ops.h> +#include <linux/swiotlb.h> +#include <linux/cache.h> +#include <linux/module.h> +#include <asm/dma.h> +#include <asm/io.h> +#include <asm/sw64io.h> +#include <asm/pci.h> + +#include "sunway_iommu.h" + +#define MAX_DOMAIN_NUM 65536 +#define IOVA_PFN(addr) ((addr) >> PAGE_SHIFT) +#define SW64_DMA_LIMIT (0xe0000000 - 1) +#define SW64_BAR_ADDRESS (IO_BASE | PCI_BASE) + +#define SW64_IOMMU_PGSIZES (((1ULL) << PAGE_SHIFT) | ((1ULL) << PAGE_8M_SHIFT)) + +/* IOMMU Exceptional Status */ +enum exceptype { + DTE_LEVEL1 = 0x0, + DTE_LEVEL2, + PTE_LEVEL1, + PTE_LEVEL2, + UNAUTHORIZED_ACCESS, + ILLEGAL_RESPONSE, + DTE_LEVEL1_VAL, + DTE_LEVEL2_VAL, + PTE_LEVEL1_VAL, + PTE_LEVEL2_VAL, +}; + +u64 iommu_enable_cmd; /* default IOMMU boot param: 0 */ + +unsigned long *sunway_iommu_domain_bitmap; + +static DEFINE_SPINLOCK(domain_bitmap_lock); +static DEFINE_SPINLOCK(sunway_iommu_device_table_lock); +spinlock_t sunway_domain_lock; + +static LLIST_HEAD(dev_data_list); +LIST_HEAD(sunway_domain_list); + +struct dma_domain { + struct sunway_iommu_domain sdomain; + struct iova_domain iovad; +}; +const struct iommu_ops sunway_iommu_ops; +static const struct dma_map_ops sunway_dma_ops; + +struct pci_controller *get_hose_from_domain(struct sunway_iommu_domain *sdomain) +{ + struct pci_controller *hose = NULL; + struct sunway_iommu *iommu; + + if (!sdomain) + return NULL; + + iommu = sdomain->iommu; + + if (!iommu) + return NULL; + + hose = iommu->hose_pt; + + return hose; +} + + +/* flush helpers */ +static void piu_flush_all(struct pci_controller 
*hose) +{ + write_piu_ior0(hose->node, hose->index, DTLB_FLUSHALL, 0); + write_piu_ior0(hose->node, hose->index, PTLB_FLUSHALL, 0); + write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHALL, 0); +} + +struct sunway_iommu *get_iommu_from_device(struct device *dev) +{ + struct sunway_iommu *iommu; + struct pci_controller *hose; + + hose = to_pci_dev(dev)->sysdata; + iommu = hose->pci_iommu; + + return iommu; +} + +void domain_flush_all(struct sunway_iommu_domain *sdomain) +{ + struct pci_controller *hose; + + hose = get_hose_from_domain(sdomain); + if (!hose) + return; + + piu_flush_all(hose); +} + +void domain_flush_ptlb(struct sunway_iommu_domain *sdomain) +{ + struct pci_controller *hose; + + hose = get_hose_from_domain(sdomain); + if (!hose) + return; + + write_piu_ior0(hose->node, hose->index, PTLB_FLUSHALL, 0); + write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHALL, 0); +} + +void dev_flush_dtlb(struct sunway_iommu_domain *sdomain, struct sunway_iommu_dev *sdev_data) +{ + struct pci_controller *hose; + u16 devid; + + hose = get_hose_from_domain(sdomain); + if (!hose) + return; + + devid = sdev_data->devid; + write_piu_ior0(hose->node, hose->index, DTLB_FLUSHDEV, devid); +} + +/* + * This function is designed to support IOMMU code only, + * as it only provides 2 specific types of flush ops + */ +void +flush_device_tlb(struct sunway_iommu_domain *sdomain, + unsigned long flush_addr, unsigned long hflush_addr) +{ + struct pci_controller *hose; + struct pci_dev *pdev; + struct sunway_iommu_dev *sdev_data; + + hose = get_hose_from_domain(sdomain); + if (!hose) + return; + + switch (hflush_addr) { + case PCACHE_FLUSHPADDR: + flush_addr = __pa(flush_addr) & 0xffffffff80; + /* Set memory bar here */ + mb(); + write_piu_ior0(hose->node, hose->index, + hflush_addr, flush_addr); + break; + + case PTLB_FLUSHVADDR: + list_for_each_entry(sdev_data, &sdomain->dev_list, list) { + pdev = sdev_data->pdev; + flush_addr = (pdev->bus->number << 8) + | pdev->devfn | (flush_addr << 16); + write_piu_ior0(hose->node, hose->index, + hflush_addr, flush_addr); + } + break; + + default: + break; + } +} + +/* domain helpers */ +static struct sunway_iommu_domain *to_sunway_domain(struct iommu_domain *dom) +{ + return container_of(dom, struct sunway_iommu_domain, domain); +} + +static struct dma_domain *to_dma_domain(struct sunway_iommu_domain *sdomain) +{ + return container_of(sdomain, struct dma_domain, sdomain); +} + +static void add_domain_to_list(struct sunway_iommu_domain *sdomain) +{ + unsigned long flags; + + spin_lock_irqsave(&sunway_domain_lock, flags); + list_add(&sdomain->list, &sunway_domain_list); + spin_unlock_irqrestore(&sunway_domain_lock, flags); +} + +static void del_domain_from_list(struct sunway_iommu_domain *sdomain) +{ + unsigned long flags; + + spin_lock_irqsave(&sunway_domain_lock, flags); + list_del(&sdomain->list); + spin_unlock_irqrestore(&sunway_domain_lock, flags); +} + +static void free_pagetable(struct sunway_iommu_domain *sdomain) +{ + unsigned long pde; + unsigned long *pde_ptr; + int i, pdes_one_page; + + pde_ptr = sdomain->pt_root; + if (!pde_ptr) + return; + + pdes_one_page = PAGE_SIZE/sizeof(pde); + for (i = 0; i < pdes_one_page; i++, pde_ptr++) { + pde = *pde_ptr; + if ((pde & SW64_IOMMU_ENTRY_VALID) == 0) + continue; + + pde &= ~(SW64_IOMMU_ENTRY_VALID) & PAGE_MASK; + pde |= PAGE_OFFSET; + free_page(pde); + } + + free_page((unsigned long)sdomain->pt_root); +} + +static void domain_id_free(int id) +{ + spin_lock(&domain_bitmap_lock); + if (id > 0) + __clear_bit(id, 
sunway_iommu_domain_bitmap); + spin_unlock(&domain_bitmap_lock); +} + +static void dma_domain_free(struct dma_domain *dma_dom) +{ + if (!dma_dom) + return; + + del_domain_from_list(&dma_dom->sdomain); + put_iova_domain(&dma_dom->iovad); + free_pagetable(&dma_dom->sdomain); + if (dma_dom->sdomain.id) + domain_id_free(dma_dom->sdomain.id); + + kfree(dma_dom); +} + +static void sunway_domain_free(struct sunway_iommu_domain *sdomain) +{ + if (!sdomain) + return; + + del_domain_from_list(sdomain); + if (sdomain->id) + domain_id_free(sdomain->id); + + kfree(sdomain); +} + +static u16 sunway_domain_id_alloc(void) +{ + int id; + + spin_lock(&domain_bitmap_lock); + id = find_first_zero_bit(sunway_iommu_domain_bitmap, MAX_DOMAIN_NUM); + if (id > 0 && id < MAX_DOMAIN_NUM) + __set_bit(id, sunway_iommu_domain_bitmap); + else + id = 0; + spin_unlock(&domain_bitmap_lock); + + return id; +} + +static int sunway_domain_init(struct sunway_iommu_domain *sdomain) +{ + spin_lock_init(&sdomain->lock); + mutex_init(&sdomain->api_lock); + sdomain->id = sunway_domain_id_alloc(); + if (!sdomain->id) + return -ENOMEM; + INIT_LIST_HEAD(&sdomain->dev_list); + + return 1; +} + +static struct sunway_iommu_domain *sunway_domain_alloc(void) +{ + struct sunway_iommu_domain *sdomain; + + sdomain = kzalloc(sizeof(struct sunway_iommu_domain), GFP_KERNEL); + if (!sdomain) + return NULL; + + if (!sunway_domain_init(sdomain)) { + kfree(sdomain); + return NULL; + } + + add_domain_to_list(sdomain); + return sdomain; +} + +static struct dma_domain *dma_domain_alloc(void) +{ + struct dma_domain *dma_dom; + struct page; + + dma_dom = kzalloc(sizeof(struct dma_domain), GFP_KERNEL); + if (!dma_dom) + return NULL; + + sunway_domain_init(&dma_dom->sdomain); + dma_dom->sdomain.type = IOMMU_DOMAIN_DMA; + init_iova_domain(&dma_dom->iovad, PAGE_SIZE, IOVA_PFN(SW64_DMA_START)); + + dma_dom->sdomain.pt_root = (void *)get_zeroed_page(GFP_KERNEL); + + add_domain_to_list(&dma_dom->sdomain); + + return dma_dom; +} + +static void device_flush_all(struct sunway_iommu_dev *sdata) +{ + struct pci_controller *hose = sdata->pdev->sysdata; + + if (hose == NULL) + return; + + write_piu_ior0(hose->node, hose->index, DTLB_FLUSHDEV, sdata->devid); + write_piu_ior0(hose->node, hose->index, PTLB_FLUSHDEV, sdata->devid); + write_piu_ior0(hose->node, hose->index, PCACHE_FLUSHDEV, sdata->devid); +} + +/* iommu_ops device attach/unattach helpers */ +static void +set_dte_entry(struct sunway_iommu_dev *sdev, struct sunway_iommu_domain *sdomain) +{ + struct sunway_iommu *iommu; + struct pci_dev *pdev; + struct page *page; + unsigned long *dte_l1, *dte_l2; + unsigned long dte_l1_val, dte_l2_base, dte_l2_val; + + pdev = sdev->pdev; + if (pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + return; + + sdev->devid = PCI_DEVID(pdev->bus->number, pdev->devfn); + iommu = sdomain->iommu; + sdev->iommu = iommu; + dte_l1 = iommu->iommu_dtbr + (pdev->bus->number); + dte_l1_val = *dte_l1; + + if (!dte_l1_val) { + /* Alloc a new level-2 device table page */ + page = alloc_pages_node(iommu->node, __GFP_ZERO, + get_order(PAGE_SIZE)); + dte_l2_base = (unsigned long)page_address(page); + dte_l1_val = (__pa(dte_l2_base) & PAGE_MASK) | SW64_IOMMU_ENTRY_VALID; + *dte_l1 = dte_l1_val; + } + + dte_l2 = __va(dte_l1_val & ~(SW64_IOMMU_ENTRY_VALID) & PAGE_MASK) + (pdev->devfn << 3); + BUG_ON(!sdomain->pt_root); + dte_l2_val = (__pa(sdomain->pt_root) & PAGE_MASK) | SW64_IOMMU_ENTRY_VALID; + if (sdomain->type == IOMMU_DOMAIN_IDENTITY) { + dte_l2_val |= 0x1; + sdev->passthrough = true; + } + *dte_l2 = 
dte_l2_val; + + device_flush_all(sdev); +} + +static void +do_attach(struct sunway_iommu_dev *sdev_data, struct sunway_iommu_domain *sdomain) +{ + sdev_data->domain = sdomain; + list_add(&sdev_data->list, &sdomain->dev_list); + + sdomain->dev_cnt++; + set_dte_entry(sdev_data, sdomain); + + pr_debug("iommu: device %d add to domain: %d\n", + sdev_data->devid, sdomain->id); +} + +static void do_detach(struct sunway_iommu_dev *sdev_data) +{ + struct sunway_iommu_domain *sdomain = sdev_data->domain; + + sdev_data->domain = NULL; + list_del(&sdev_data->list); + device_flush_all(sdev_data); + + sdomain->dev_cnt--; + pr_debug("iommu: device %d detached from domain %d\n", + sdev_data->devid, sdomain->id); +} + +static int +__attach_device(struct sunway_iommu_dev *sdev_data, struct sunway_iommu_domain *sdomain) +{ + int ret; + + spin_lock(&sdomain->lock); + ret = -EBUSY; + if (sdev_data->domain != NULL) + goto out_unlock; + + do_attach(sdev_data, sdomain); + ret = 0; + +out_unlock: + spin_unlock(&sdomain->lock); + return ret; +} + +static void __detach_device(struct sunway_iommu_dev *sunway_dev_data) +{ + struct sunway_iommu_domain *domain; + + domain = sunway_dev_data->domain; + + spin_lock(&domain->lock); + do_detach(sunway_dev_data); + spin_unlock(&domain->lock); +} + +static int attach_device(struct device *dev, struct sunway_iommu_domain *sdomain) +{ + struct sunway_iommu_dev *sdev; + unsigned long flags; + int ret; + + sdev = dev_iommu_priv_get(dev); + + spin_lock_irqsave(&sunway_iommu_device_table_lock, flags); + ret = __attach_device(sdev, sdomain); + spin_unlock_irqrestore(&sunway_iommu_device_table_lock, flags); + + return ret; +} + +static void detach_device(struct device *dev) +{ + struct sunway_iommu_domain *sunway_domain; + struct sunway_iommu_dev *sdev_data; + unsigned long flags; + + sdev_data = dev_iommu_priv_get(dev); + sunway_domain = sdev_data->domain; + + if (WARN_ON(!sdev_data->domain)) + return; + + spin_lock_irqsave(&sunway_iommu_device_table_lock, flags); + __detach_device(sdev_data); + spin_unlock_irqrestore(&sunway_iommu_device_table_lock, flags); + + if (!dev_is_pci(dev)) + return; +} + +static struct sunway_iommu_dev *search_dev_data(u16 devid) +{ + struct sunway_iommu_dev *sdev_data; + struct llist_node *node; + + if (llist_empty(&dev_data_list)) + return NULL; + + node = dev_data_list.first; + llist_for_each_entry(sdev_data, node, dev_data_list) { + if (sdev_data->devid == devid) + return sdev_data; + } + + return NULL; +} + +/* dma_ops helpers*/ +static struct sunway_iommu_domain *get_sunway_domain(struct device *dev) +{ + struct sunway_iommu_domain *sdomain; + struct iommu_domain *domain; + struct pci_dev *pdev; + struct sunway_iommu_dev *sdev; + + pdev = to_pci_dev(dev); + if (!pdev) + return ERR_PTR(-ENODEV); + + sdev = dev_iommu_priv_get(dev); + sdomain = sdev->domain; + if (sdomain == NULL) { + domain = iommu_get_domain_for_dev(dev); + sdomain = to_sunway_domain(domain); + attach_device(dev, sdomain); + } + + if (sdomain == NULL) + return ERR_PTR(-EBUSY); + + return sdomain; +} + +/********************************************************************** + * + * Following functions describe IOMMU init ops + * + **********************************************************************/ + +static struct sunway_iommu *sunway_iommu_early_init(struct pci_controller *hose) +{ + struct sunway_iommu *iommu; + struct page *page; + unsigned long base; + + hose->pci_iommu = kzalloc(sizeof(struct sunway_iommu), GFP_KERNEL); + if (!hose->pci_iommu) + return 0; + + iommu = 
hose->pci_iommu; + spin_lock_init(&iommu->dt_lock); + + iommu->node = hose->node; + if (!node_online(hose->node)) + iommu->node = -1; + + page = alloc_pages_node(iommu->node, __GFP_ZERO, get_order(PAGE_SIZE)); + iommu->iommu_dtbr = page_address(page); + + iommu->hose_pt = hose; + iommu->index = hose->index; + + iommu->enabled = true; + + base = __pa(iommu->iommu_dtbr) & PAGE_MASK; + write_piu_ior0(hose->node, hose->index, DTBASEADDR, base); + + return iommu; +} + +unsigned long fetch_dte(struct sunway_iommu *iommu, unsigned long devid, + enum exceptype type) +{ + unsigned long *dte_l1, *dte_l2; + unsigned long dte_l1_val, dte_l2_val; + + if (!iommu) + return 0; + dte_l1 = iommu->iommu_dtbr + (devid >> 8); + if (type == DTE_LEVEL1) + return (unsigned long)dte_l1; + + dte_l1_val = *dte_l1; + if (type == DTE_LEVEL1_VAL) + return dte_l1_val; + + dte_l1_val &= (~(SW64_IOMMU_ENTRY_VALID)) & (PAGE_MASK); + dte_l1_val |= PAGE_OFFSET; + dte_l2 = (unsigned long *)(dte_l1_val + ((devid & 0xff) << 3)); + if (type == DTE_LEVEL2) + return (unsigned long)dte_l2; + + dte_l2_val = *dte_l2; + if (type == DTE_LEVEL2_VAL) + return dte_l2_val; + + return dte_l2_val; +} + +unsigned long fetch_pte(struct sunway_iommu_domain *sdomain, dma_addr_t iova, + enum exceptype type) +{ + unsigned long iova_pfn, pte_l1_val, pte_l2_val; + unsigned long *pte_l1, *pte_l2; + unsigned long pte_root; + unsigned long offset; + + if (!sdomain) + return -EINVAL; + + pte_root = __pa(sdomain->pt_root) & PAGE_MASK; + iova_pfn = iova >> PAGE_SHIFT; + pte_root = ((pte_root) & (~(SW64_IOMMU_ENTRY_VALID)) & (PAGE_MASK)); + pte_root |= PAGE_OFFSET; + offset = ((iova_pfn >> 10) & SW64_IOMMU_LEVEL1_OFFSET) << 3; + pte_l1 = (unsigned long *)(pte_root + offset); + if (type == PTE_LEVEL1) + return (unsigned long)pte_l1; + + pte_l1_val = *pte_l1; + if (type == PTE_LEVEL1_VAL) + return pte_l1_val; + + pte_l1_val &= (~(SW64_IOMMU_ENTRY_VALID)) & (PAGE_MASK); + pte_l1_val |= PAGE_OFFSET; + offset = (iova_pfn & SW64_IOMMU_LEVEL2_OFFSET) << 3; + pte_l2 = (unsigned long *)(pte_l1_val + offset); + + if (type == PTE_LEVEL2) + return (unsigned long)pte_l2; + + pte_l2_val = *pte_l2; + if (type == PTE_LEVEL2_VAL) + return pte_l2_val; + + return pte_l2_val; +} + +/* IOMMU Interrupt handle */ +irqreturn_t iommu_interrupt(int irq, void *dev) +{ + struct pci_controller *hose = (struct pci_controller *)dev; + struct sunway_iommu_domain *sdomain; + struct sunway_iommu_dev *sdev; + unsigned long iommu_status; + unsigned long type; + unsigned long devid, dva; + + iommu_status = read_piu_ior0(hose->node, hose->index, IOMMUEXCPT_STATUS); + if (!(iommu_status >> 63)) + return IRQ_NONE; + + type = (iommu_status >> 59) & 0x7; + devid = (iommu_status >> 37) & 0xffff; + dva = iommu_status & 0xffffffff; + sdev = search_dev_data(devid); + sdomain = sdev->domain; + pr_info("%s, iommu_status = %#lx, devid %#lx, dva %#lx, ", + __func__, iommu_status, devid, dva); + switch (type) { + case DTE_LEVEL1: + pr_info("invalid level1 dte, addr:%#lx, val:%#lx\n", + fetch_dte(hose->pci_iommu, devid, DTE_LEVEL1), + fetch_dte(hose->pci_iommu, devid, DTE_LEVEL1_VAL)); + break; + case DTE_LEVEL2: + pr_info("invalid level2 dte, addr:%#lx, val:%#lx\n", + fetch_dte(hose->pci_iommu, devid, DTE_LEVEL2), + fetch_dte(hose->pci_iommu, devid, DTE_LEVEL2_VAL)); + break; + case PTE_LEVEL1: + pr_info("invalid level1 pte, addr: %#lx, val:%#lx\n", + fetch_pte(sdomain, dva, PTE_LEVEL1), + fetch_pte(sdomain, dva, PTE_LEVEL1_VAL)); + break; + case PTE_LEVEL2: + pr_info("invalid level2 pte, addr: %#lx, val: 
%#lx\n", + fetch_pte(sdomain, dva, PTE_LEVEL2), + fetch_pte(sdomain, dva, PTE_LEVEL2_VAL)); + + iommu_status &= ~(1UL << 62); + iommu_status = iommu_status | (1UL << 63); + write_piu_ior0(hose->node, hose->index, + IOMMUEXCPT_STATUS, iommu_status); + break; + + case UNAUTHORIZED_ACCESS: + pr_info("unauthorized access\n"); + break; + case ILLEGAL_RESPONSE: + pr_info("illegal response\n"); + break; + default: + pr_info("unknown error\n"); + break; + } + + return IRQ_HANDLED; +} + +struct irqaction iommu_irqaction = { + .handler = iommu_interrupt, + .flags = IRQF_SHARED | IRQF_NO_THREAD, + .name = "sunway_iommu", +}; + +void sunway_enable_iommu_func(struct pci_controller *hose) +{ + unsigned int iommu_irq, err; + unsigned long iommu_conf, iommu_ctrl; + + iommu_irq = hose->int_irq; + pr_debug("%s node %ld rc %ld iommu_irq %d\n", + __func__, hose->node, hose->index, iommu_irq); + err = request_irq(iommu_irq, iommu_interrupt, + IRQF_SHARED, "sunway_iommu", hose); + if (err < 0) + pr_info("sw iommu request irq failed!\n"); + + iommu_ctrl = (1UL << 63) | (0x100UL << 10); + write_piu_ior0(hose->node, hose->index, IOMMUEXCPT_CTRL, iommu_ctrl); + iommu_conf = read_piu_ior0(hose->node, hose->index, PIUCONFIG0); + iommu_conf = iommu_conf | (0x3 << 7); + write_piu_ior0(hose->node, hose->index, PIUCONFIG0, iommu_conf); + write_piu_ior0(hose->node, hose->index, TIMEOUT_CONFIG, 0xf); + iommu_conf = read_piu_ior0(hose->node, hose->index, PIUCONFIG0); + pr_debug("SW arch configure node %ld hose-%ld iommu_conf = %#lx\n", + hose->node, hose->index, iommu_conf); +} + +static bool is_iommu_enable(struct pci_controller *hose) +{ + u64 rc_mask = 0x1; + + rc_mask <<= (8 * hose->node + hose->index); + if (iommu_enable_cmd & rc_mask) + return true; + + return false; +} + +static struct iommu_domain *sunway_iommu_domain_alloc(unsigned type); + +int sunway_iommu_init(void) +{ + struct pci_controller *hose; + struct sunway_iommu *iommu; + int ret; + int iommu_index = 0; + + sunway_iommu_domain_bitmap = + (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, + get_order(MAX_DOMAIN_NUM / 8)); + if (sunway_iommu_domain_bitmap == NULL) + return 0; + __set_bit(0, sunway_iommu_domain_bitmap); + + /* Do the loop */ + for (hose = hose_head; hose; hose = hose->next) { + if (!is_iommu_enable(hose)) { + hose->iommu_enable = false; + continue; + } + + iommu = sunway_iommu_early_init(hose); + iommu_device_sysfs_add(&iommu->iommu, NULL, NULL, "%d", + iommu_index); + iommu_device_set_ops(&iommu->iommu, &sunway_iommu_ops); + iommu_device_register(&iommu->iommu); + iommu_index++; + sunway_enable_iommu_func(hose); + hose->iommu_enable = true; + } + + ret = iova_cache_get(); + if (ret) + return ret; + + ret = bus_set_iommu(&pci_bus_type, &sunway_iommu_ops); + if (ret) + return ret; + + for (hose = hose_head; hose; hose = hose->next) + if (hose->iommu_enable) + piu_flush_all(hose); + + return 1; +} +device_initcall(sunway_iommu_init); + +/* iommu cpu syscore ops */ +static int iommu_cpu_suspend(void) +{ + return 0; +} + +static void iommu_cpu_resume(void) +{ + +} + +struct syscore_ops iommu_cpu_syscore_ops = { + .suspend = iommu_cpu_suspend, + .resume = iommu_cpu_resume, +}; + +/******************************************************************************* + * + * DMA OPS Functions + * + ******************************************************************************/ + +static unsigned long +sunway_iommu_unmap_page(struct sunway_iommu_domain *sunway_domain, + unsigned long iova, unsigned long page_size) +{ + unsigned long *pte_l2, unmapped; + 
+ pr_debug("%s iova %#lx, page_size %#lx\n", __func__, iova, page_size); + BUG_ON(!is_power_of_2(page_size)); + + unmapped = 0; + while (unmapped < page_size) { + pte_l2 = (unsigned long *)fetch_pte(sunway_domain, iova, PTE_LEVEL2); + *pte_l2 = 0; + + flush_device_tlb(sunway_domain, (unsigned long)pte_l2, PCACHE_FLUSHPADDR); + flush_device_tlb(sunway_domain, (iova >> PAGE_SHIFT), PTLB_FLUSHVADDR); + + iova += PAGE_SIZE; + unmapped += PAGE_SIZE; + } + + return unmapped; +} + +int sunway_iommu_map_page(struct sunway_iommu_domain *sunway_domain, + unsigned long bus_addr, unsigned long paddr, + size_t page_size) +{ + /* + * pde: page table entry + * pte: level 2 page table entry + * pte_root: page table root + */ + struct page *page; + struct sunway_iommu *iommu; + unsigned long pde, pte, iova_pfn; + unsigned long pdebaseaddr; + u64 *ptebasecond, ptebaseaddr; + u64 pte_root = (__pa(sunway_domain->pt_root) & PAGE_MASK); + + iova_pfn = (unsigned long)(bus_addr >> PAGE_SHIFT); + + pdebaseaddr = ((iova_pfn >> 10) & SW64_IOMMU_LEVEL1_OFFSET) << 3; + pdebaseaddr += ((pte_root) & (~(SW64_IOMMU_ENTRY_VALID)) & (PAGE_MASK)) + + PAGE_OFFSET; + + pde = *(unsigned long *)pdebaseaddr; + if (pde) { + ptebaseaddr = (pde & (~SW64_IOMMU_ENTRY_VALID) & PAGE_MASK) + PAGE_OFFSET; + ptebaseaddr += (iova_pfn & SW64_IOMMU_LEVEL2_OFFSET) << 3; + + goto direct_map; + } + + iommu = sunway_domain->iommu; + if (!iommu) + return -1; + page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, 0); + if (!page) { + pr_err("Allocating pages failed.\n"); + return -1; + } + + ptebasecond = page_address(page); + pde = (__pa(ptebasecond) & PAGE_MASK) | SW64_IOMMU_ENTRY_VALID; + + /* + * If pde exists, no need to allocate a new page. + * Atomic compare and exchange, compare the value the pointer points to + * with 0UL. If identical, store pde where the pointer points to, return + * 0UL. Otherwise, return the value the pointer points to. + */ + if (cmpxchg64((volatile u64 *)pdebaseaddr, 0ULL, pde)) { + ptebaseaddr = ((*(volatile u64 *)pdebaseaddr) + & (~SW64_IOMMU_ENTRY_VALID) & PAGE_MASK) + PAGE_OFFSET; + ptebaseaddr += (iova_pfn & SW64_IOMMU_LEVEL2_OFFSET) << 3; + free_page((unsigned long)ptebasecond); + } else { + flush_device_tlb(sunway_domain, pdebaseaddr, PCACHE_FLUSHPADDR); + ptebaseaddr = (unsigned long)ptebasecond + + ((iova_pfn & SW64_IOMMU_LEVEL2_OFFSET) << 3); + } + +direct_map: + /* case 8K */ + if (page_size == (1UL << PAGE_SHIFT)) { + if (*(volatile u64 *)ptebaseaddr) { + pr_err("IOVA 4G overlap. IOVA is %#lx.\n", bus_addr); + return -EFAULT; + } + + pte = (paddr & PAGE_MASK) | SW64_IOMMU_ENTRY_VALID + | SW64_IOMMU_GRN_8K | SW64_IOMMU_ENABLE; + *(volatile u64 *)ptebaseaddr = pte; + flush_device_tlb(sunway_domain, ptebaseaddr, + PCACHE_FLUSHPADDR); + /* case 8M */ + } else if (page_size == (1UL << PAGE_8M_SHIFT)) { + unsigned long *ptr; + int i, ptes_one_page, ptes_one_cache; + + ptr = (unsigned long *)ptebaseaddr; + ptes_one_page = PAGE_SIZE/sizeof(pte); + ptes_one_cache = L1_CACHE_BYTES/sizeof(pte); + + pte = (paddr & PAGE_MASK) | SW64_IOMMU_ENTRY_VALID + | SW64_IOMMU_GRN_8M | SW64_IOMMU_ENABLE; + + for (i = 0; i < ptes_one_page; i++) { + if (*ptr) { + pr_err("IOVA 4G overlap. 
IOVA is %#lx.\n", bus_addr); + return -EFAULT; + } + + *ptr = pte; + + /* just do once flush per cache line */ + if (i % ptes_one_cache == (ptes_one_cache - 1)) + flush_device_tlb(sunway_domain, (unsigned long)ptr, PCACHE_FLUSHPADDR); + ptr++; + } + } +#ifdef CONFIG_SW64_GUEST + flush_device_tlb(sunway_domain, pfn | SW64_IOMMU_MAP_FLAG, + PTLB_FLUSHVADDR); +#endif + return 0; +} + +static unsigned long +sunway_alloc_iova(struct dma_domain *dma_dom, unsigned int pages) +{ + unsigned long pfn = 0; + + pages = __roundup_pow_of_two(pages); + /* IOVA boundary should be 16M ~ 3.5G */ + pfn = alloc_iova_fast(&dma_dom->iovad, pages, + IOVA_PFN(SW64_DMA_LIMIT), true); + if (!pfn) + return 0; + + return (pfn << PAGE_SHIFT); +} + +static void sunway_free_iova(struct dma_domain *dma_dom, + unsigned long address, unsigned int pages) +{ + pages = __roundup_pow_of_two(pages); + address >>= PAGE_SHIFT; + + free_iova_fast(&dma_dom->iovad, address, pages); +} + +static dma_addr_t +__sunway_map_single(struct dma_domain *dma_dom, + struct pci_dev *pdev, phys_addr_t paddr, size_t size) +{ + struct pci_controller *hose = (struct pci_controller *)pdev->sysdata; + dma_addr_t ret, address, start; + long npages; + int i; + + if (hose == NULL) { + pr_err("%s:hose does not exist!\n", __func__); + return 0; + } + + npages = iommu_num_pages(paddr, size, PAGE_SIZE); + + address = sunway_alloc_iova(dma_dom, npages); + if (!address) + return 0; + + start = address; + for (i = 0; i < npages; ++i) { + ret = sunway_iommu_map_page(&dma_dom->sdomain, start, + paddr, PAGE_SIZE); + if (ret) { + pr_info("error when map page.\n"); + goto out_unmap; + } + + start += PAGE_SIZE; + paddr += PAGE_SIZE; + } + + address += paddr & ~PAGE_MASK; + return address; + +out_unmap: + for (--i; i >= 0; --i) { + start -= PAGE_SIZE; + sunway_iommu_unmap_page(&dma_dom->sdomain, start, PAGE_SIZE); + } + + sunway_free_iova(dma_dom, address, npages); + return 0; +} + +static dma_addr_t +pci_iommu_map_single(struct pci_dev *pdev, + struct dma_domain *dma_dom, void *cpu_addr, size_t size) +{ + struct pci_controller *hose = pdev->sysdata; + unsigned long paddr; + + if (hose == NULL) { + pr_err("%s: hose does not exist!\n", __func__); + return 0; + } + + paddr = __sunway_map_single(dma_dom, pdev, __pa(cpu_addr), size); + + pr_debug("pci_alloc_consistent: %zx -> [%px,%lx] from %ps\n", + size, cpu_addr, paddr, __builtin_return_address(0)); + + return paddr; +} + +static void *sunway_alloc_coherent(struct device *dev, + size_t size, + dma_addr_t *dma_addr, gfp_t gfp, + unsigned long attrs) +{ + struct pci_dev *pdev = to_pci_dev(dev); + struct pci_controller *hose; + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + struct sunway_iommu_dev *sdev; + struct page *page; + void *cpu_addr; + + if (!pdev) + return NULL; + + hose = pdev->sysdata; + if (!hose) + return NULL; + + gfp &= ~GFP_DMA; + +try_again: + page = alloc_pages_node(hose->node, gfp | __GFP_ZERO, get_order(size)); + cpu_addr = page_address(page); + if (!cpu_addr) { + pr_info + ("pci_alloc_consistent: get_free_pages failed from %ps\n", + __builtin_return_address(0)); + + return NULL; + } + + *dma_addr = __pa(cpu_addr); + if (!(hose->iommu_enable)) + return cpu_addr; + + sdomain = get_sunway_domain(dev); + sdev = dev_iommu_priv_get(dev); + if (sdev->passthrough) + if (pdev->dma_mask > DMA_BIT_MASK(32)) + return cpu_addr; + + dma_dom = to_dma_domain(sdomain); + if (sdomain->type == IOMMU_DOMAIN_IDENTITY) { + sdomain->type = IOMMU_DOMAIN_DMA; + set_dte_entry(sdev, sdomain); + } + + 
*dma_addr = pci_iommu_map_single(pdev, dma_dom, cpu_addr, size); + if (*dma_addr == 0) { + free_pages((unsigned long)cpu_addr, get_order(size)); + if (gfp & GFP_DMA) + return NULL; + + gfp |= GFP_DMA; + goto try_again; + } + + return cpu_addr; +} + +static void +__sunway_unmap_single(struct dma_domain *dma_dom, dma_addr_t dma_addr, size_t size) +{ + dma_addr_t start; + long npages; + int i; + + npages = iommu_num_pages(dma_addr, size, PAGE_SIZE); + dma_addr &= PAGE_MASK; + start = dma_addr; + + for (i = 0; i < npages; ++i) { + sunway_iommu_unmap_page(&dma_dom->sdomain, start, PAGE_SIZE); + start += PAGE_SIZE; + } + + sunway_free_iova(dma_dom, dma_addr, npages); + pr_debug("pci_free_consistent: %zx -> [%llx] from %ps\n", + size, dma_addr, __builtin_return_address(0)); + +} + +static void +sunway_free_coherent(struct device *dev, size_t size, + void *vaddr, dma_addr_t dma_addr, unsigned long attrs) +{ + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + struct pci_dev *pdev = to_pci_dev(dev); + struct pci_controller *hose; + struct sunway_iommu_dev *sdev; + + if (!pdev) + goto out_unmap; + + hose = pdev->sysdata; + if (!hose || !(hose->iommu_enable)) + goto out_unmap; + + sdev = dev_iommu_priv_get(dev); + if (sdev->passthrough) + goto out_unmap; + + sdomain = get_sunway_domain(dev); + dma_dom = to_dma_domain(sdomain); + __sunway_unmap_single(dma_dom, dma_addr, size); + goto out_free; + +out_unmap: + pci_unmap_single(pdev, dma_addr, size, PCI_DMA_BIDIRECTIONAL); + +out_free: + pr_debug("sunway_free_consistent: [%llx,%zx] from %ps\n", + dma_addr, size, __builtin_return_address(0)); + + free_pages((unsigned long)vaddr, get_order(size)); +} + +static dma_addr_t +sunway_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, unsigned long attrs) +{ + struct pci_dev *pdev = to_pci_dev(dev); + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + struct pci_controller *hose; + struct sunway_iommu_dev *sdev; + phys_addr_t paddr = page_to_phys(page) + offset; + + if (dir == PCI_DMA_NONE) + BUG(); + + if (!pdev) + return 0; + + hose = pdev->sysdata; + if (!hose || !(hose->iommu_enable)) + return paddr; + + sdev = dev_iommu_priv_get(dev); + if (sdev->passthrough) + if (pdev->dma_mask > DMA_BIT_MASK(32)) + return paddr; + + sdomain = get_sunway_domain(dev); + dma_dom = to_dma_domain(sdomain); + if (sdomain->type == IOMMU_DOMAIN_IDENTITY) { + sdomain->type = IOMMU_DOMAIN_DMA; + set_dte_entry(sdev, sdomain); + } + + return pci_iommu_map_single(pdev, dma_dom, + (char *)page_address(page) + offset, size); +} + +static void +sunway_unmap_page(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction dir, unsigned long attrs) +{ + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + struct pci_dev *pdev; + struct pci_controller *hose; + struct sunway_iommu_dev *sdev; + + pdev = to_pci_dev(dev); + if (!pdev) + return; + + hose = pdev->sysdata; + if (hose == NULL) + return; + + if (!hose->iommu_enable) + return; + + sdev = dev_iommu_priv_get(dev); + if (sdev->passthrough) + return; + + sdomain = get_sunway_domain(dev); + dma_dom = to_dma_domain(sdomain); + __sunway_unmap_single(dma_dom, dma_addr, size); +} + +#define SG_ENT_VIRT_ADDRESS(SG) (sg_virt((SG))) +static int +sunway_map_sg(struct device *dev, struct scatterlist *sgl, + int nents, enum dma_data_direction dir, unsigned long attrs) +{ + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom = NULL; + struct 
scatterlist *sg; + struct pci_dev *pdev = to_pci_dev(dev); + struct pci_controller *hose; + struct sunway_iommu_dev *sdev; + int i, out_nents = 0; + + if (dir == PCI_DMA_NONE) + BUG(); + + if (!pdev) + return 0; + + hose = pdev->sysdata; + if (!hose) + return 0; + + sdomain = get_sunway_domain(dev); + dma_dom = to_dma_domain(sdomain); + + for_each_sg(sgl, sg, nents, i) { + BUG_ON(!sg_page(sg)); + + sg_dma_address(sg) = __pa(SG_ENT_VIRT_ADDRESS(sg)); + if (!(hose->iommu_enable)) + goto check; + + sdev = dev_iommu_priv_get(dev); + if (sdev->passthrough) + if (pdev->dma_mask > DMA_BIT_MASK(32)) + goto check; + + if (sdomain->type == IOMMU_DOMAIN_IDENTITY) { + sdomain->type = IOMMU_DOMAIN_DMA; + set_dte_entry(sdev, sdomain); + } + + sg_dma_address(sg) = + pci_iommu_map_single(pdev, dma_dom, + SG_ENT_VIRT_ADDRESS(sg), sg->length); +check: + if (sg_dma_address(sg) == 0) + goto error; + + sg_dma_len(sg) = sg->length; + out_nents++; + } + + return nents; + +error: + pr_warn("pci_map_sg failed:"); + pr_warn("could not allocate dma page tables\n"); + + if (out_nents) + pci_unmap_sg(pdev, sgl, out_nents, dir); + return 0; +} + +static void +sunway_unmap_sg(struct device *dev, struct scatterlist *sgl, + int nents, enum dma_data_direction dir, unsigned long attrs) +{ + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + struct scatterlist *sg; + struct pci_dev *pdev; + struct pci_controller *hose; + struct sunway_iommu_dev *sdev; + dma_addr_t dma_addr; + long size; + int j; + + pdev = to_pci_dev(dev); + if (!pdev) + return; + + hose = pdev->sysdata; + if (!hose->iommu_enable) + return; + + sdev = dev_iommu_priv_get(dev); + if (sdev->passthrough) + return; + + sdomain = get_sunway_domain(dev); + dma_dom = to_dma_domain(sdomain); + + for_each_sg(sgl, sg, nents, j) { + dma_addr = sg->dma_address; + size = sg->dma_length; + if (!size) + break; + + __sunway_unmap_single(dma_dom, dma_addr, size); + } +} + +static int sunway_supported(struct device *dev, u64 mask) +{ + if (MAX_DMA_ADDRESS - PAGE_OFFSET - 1 <= mask) + return 1; + + return 0; +} + +static const struct dma_map_ops sunway_dma_ops = { + .alloc = sunway_alloc_coherent, + .free = sunway_free_coherent, + .map_sg = sunway_map_sg, + .unmap_sg = sunway_unmap_sg, + .map_page = sunway_map_page, + .unmap_page = sunway_unmap_page, + .dma_supported = sunway_supported, +}; + +/********************************************************************** + * + * IOMMU OPS Functions + * + **********************************************************************/ + +static struct iommu_domain *sunway_iommu_domain_alloc(unsigned type) +{ + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + + switch (type) { + case IOMMU_DOMAIN_UNMANAGED: + sdomain = sunway_domain_alloc(); + if (!sdomain) { + pr_err("Allocating sunway_domain failed!\n"); + return NULL; + } + + sdomain->pt_root = (void *)get_zeroed_page(GFP_KERNEL); + + sdomain->domain.geometry.aperture_start = SW64_DMA_START; + sdomain->domain.geometry.aperture_end = (~0ULL); + sdomain->domain.geometry.force_aperture = true; + sdomain->type = IOMMU_DOMAIN_UNMANAGED; + break; + + case IOMMU_DOMAIN_DMA: + dma_dom = dma_domain_alloc(); + if (!dma_dom) { + pr_err("Failed to alloc dma domain!\n"); + return NULL; + } + + sdomain = &dma_dom->sdomain; + break; + + case IOMMU_DOMAIN_IDENTITY: + sdomain = sunway_domain_alloc(); + if (!sdomain) + return NULL; + + sdomain->pt_root = (void *)get_zeroed_page(GFP_KERNEL); + if (!sdomain->pt_root) { + pr_err("Allocating pt_root failed!\n"); + return NULL; 
+ } + + sdomain->type = IOMMU_DOMAIN_IDENTITY; + break; + + default: + return NULL; + } + + return &sdomain->domain; +} + +static void clean_domain(struct sunway_iommu_domain *sdomain) +{ + struct sunway_iommu_dev *entry; + unsigned long flags; + + spin_lock_irqsave(&sunway_iommu_device_table_lock, flags); + + while (!list_empty(&sdomain->dev_list)) { + entry = list_first_entry(&sdomain->dev_list, + struct sunway_iommu_dev, list); + + BUG_ON(!entry->domain); + __detach_device(entry); + } + + spin_unlock_irqrestore(&sunway_iommu_device_table_lock, flags); +} + +static void sunway_iommu_domain_free(struct iommu_domain *dom) +{ + struct sunway_iommu_domain *sdomain; + struct dma_domain *dma_dom; + + sdomain = to_sunway_domain(dom); + + if (sdomain->dev_cnt > 0) + clean_domain(sdomain); + + BUG_ON(sdomain->dev_cnt != 0); + + if (!dom) + return; + + switch (dom->type) { + case IOMMU_DOMAIN_DMA: + dma_dom = to_dma_domain(sdomain); + dma_domain_free(dma_dom); + break; + + default: + free_pagetable(sdomain); + sunway_domain_free(sdomain); + break; + } + +} + +static int sunway_iommu_attach_device(struct iommu_domain *dom, struct device *dev) +{ + struct sunway_iommu_domain *sdomain = to_sunway_domain(dom); + struct sunway_iommu_dev *sdev_data; + struct pci_dev *pdev; + struct pci_controller *hose; + int ret; + + pdev = to_pci_dev(dev); + if (!pdev) + return -EINVAL; + + hose = pdev->sysdata; + if (!hose) + return -EINVAL; + + if (!hose->iommu_enable) + return -EINVAL; + + if (!sdomain->iommu) + sdomain->iommu = hose->pci_iommu; + + sdev_data = dev_iommu_priv_get(dev); + if (!sdev_data) + return -EINVAL; + + if (sdev_data->domain) + detach_device(dev); + + ret = attach_device(dev, sdomain); + + return ret; +} + +static void sunway_iommu_detach_device(struct iommu_domain *dom, struct device *dev) +{ + struct sunway_iommu_dev *sdev; + struct pci_dev *pdev = to_pci_dev(dev); + + if (!pdev) + return; + + sdev = dev_iommu_priv_get(dev); + if (sdev->domain != NULL) + detach_device(dev); +} + +static phys_addr_t +sunway_iommu_iova_to_phys(struct iommu_domain *dom, dma_addr_t iova) +{ + struct sunway_iommu_domain *sdomain = to_sunway_domain(dom); + unsigned long paddr, grn; + + paddr = fetch_pte(sdomain, iova, PTE_LEVEL2_VAL); + + if ((paddr & SW64_IOMMU_ENTRY_VALID) == 0) + return 0; + + paddr &= ~SW64_IOMMU_ENTRY_VALID; + grn = paddr & SW64_PTE_GRN_MASK; /* get page granularity */ + paddr &= PAGE_MASK; + + switch (grn) { + case SW64_IOMMU_GRN_8M: + paddr += (iova & ~HPAGE_MASK); + break; + case SW64_IOMMU_GRN_8K: + default: + paddr += (iova & ~PAGE_MASK); + break; + } + + return paddr; +} + +static int +sunway_iommu_map(struct iommu_domain *dom, unsigned long iova, + phys_addr_t paddr, size_t page_size, int iommu_prot, gfp_t gfp) +{ + struct sunway_iommu_domain *sdomain = to_sunway_domain(dom); + int ret; + + /* + * As VFIO cannot distinguish between normal DMA request + * and pci device BAR, check should be introduced manually + * to avoid VFIO trying to map pci config space. 
+ */ + if (iova > SW64_BAR_ADDRESS) + return -EINVAL; + + mutex_lock(&sdomain->api_lock); + ret = sunway_iommu_map_page(sdomain, iova, paddr, page_size); + mutex_unlock(&sdomain->api_lock); + + return ret; +} + +static size_t +sunway_iommu_unmap(struct iommu_domain *dom, unsigned long iova, + size_t page_size, + struct iommu_iotlb_gather *gather) +{ + struct sunway_iommu_domain *sdomain = to_sunway_domain(dom); + size_t unmap_size; + + if (iova > SW64_BAR_ADDRESS) + return page_size; + + mutex_lock(&sdomain->api_lock); + unmap_size = sunway_iommu_unmap_page(sdomain, iova, page_size); + mutex_unlock(&sdomain->api_lock); + + return unmap_size; +} + +static struct iommu_group *sunway_iommu_device_group(struct device *dev) +{ + return pci_device_group(dev); +} + +static int iommu_init_device(struct device *dev) +{ + struct sunway_iommu_dev *sdev; + struct pci_dev *pdev; + struct pci_controller *hose; + + if (dev_iommu_priv_get(dev)) + return 0; + + sdev = kzalloc(sizeof(struct sunway_iommu_dev), GFP_KERNEL); + if (!sdev) + return -ENOMEM; + + pdev = to_pci_dev(dev); + hose = pdev->sysdata; + llist_add(&sdev->dev_data_list, &dev_data_list); + sdev->pdev = pdev; + + dev_iommu_priv_set(dev, sdev); + + return 0; +} + +static void iommu_uninit_device(struct device *dev) +{ + struct sunway_iommu_dev *sdev; + + sdev = dev_iommu_priv_get(dev); + if (!sdev) + return; + + if (sdev->domain) + detach_device(dev); + + dev_iommu_priv_set(dev, NULL); +} + +static void sunway_iommu_release_device(struct device *dev) +{ + struct pci_dev *pdev; + struct pci_controller *hose; + + pdev = to_pci_dev(dev); + if (!pdev) + return; + + hose = pdev->sysdata; + if (!hose->iommu_enable) + return; + + iommu_uninit_device(dev); +} + +static struct iommu_device *sunway_iommu_probe_device(struct device *dev) +{ + struct pci_dev *pdev; + struct pci_controller *hose; + struct sunway_iommu *iommu; + int ret; + + pdev = to_pci_dev(dev); + if (!pdev) + return ERR_PTR(-ENODEV); + + if (pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + return ERR_PTR(-ENODEV); + + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) + return ERR_PTR(-ENODEV); + + hose = pdev->sysdata; + if (!hose) + return ERR_PTR(-ENODEV); + + if (!hose->iommu_enable) + return ERR_PTR(-ENODEV); + + if (dev_iommu_priv_get(dev)) + return &iommu->iommu; + + ret = iommu_init_device(dev); + if (ret) + return ERR_PTR(ret); + + iommu = hose->pci_iommu; + + return &iommu->iommu; +} + +static int sunway_iommu_def_domain_type(struct device *dev) +{ + struct sunway_iommu_dev *sdev; + + sdev = dev_iommu_priv_get(dev); + if (!sdev->domain) + return 0; + + return sdev->domain->type; +} + +static bool sunway_iommu_capable(enum iommu_cap cap) +{ + switch (cap) { + case IOMMU_CAP_INTR_REMAP: + return true; + default: + return false; + } +} + +static void sunway_iommu_probe_finalize(struct device *dev) +{ + struct iommu_domain *domain; + + domain = iommu_get_domain_for_dev(dev); + if (domain) + set_dma_ops(dev, &sunway_dma_ops); +} + +const struct iommu_ops sunway_iommu_ops = { + .capable = sunway_iommu_capable, + .domain_alloc = sunway_iommu_domain_alloc, + .domain_free = sunway_iommu_domain_free, + .attach_dev = sunway_iommu_attach_device, + .detach_dev = sunway_iommu_detach_device, + .probe_device = sunway_iommu_probe_device, + .probe_finalize = sunway_iommu_probe_finalize, + .release_device = sunway_iommu_release_device, + .map = sunway_iommu_map, + .unmap = sunway_iommu_unmap, + .iova_to_phys = sunway_iommu_iova_to_phys, + .device_group = sunway_iommu_device_group, + .pgsize_bitmap = 
SW64_IOMMU_PGSIZES, + .def_domain_type = sunway_iommu_def_domain_type, +}; + +/***************************************************************************** + * + * Boot param handle + * + *****************************************************************************/ +static int __init iommu_enable_setup(char *str) +{ + int ret; + unsigned long rc_bitmap = 0xffffffffUL; + + ret = kstrtoul(str, 16, &rc_bitmap); + iommu_enable_cmd = rc_bitmap; + + return ret; +} +__setup("iommu_enable=", iommu_enable_setup); diff --git a/drivers/iommu/sw64/sunway_iommu.h b/drivers/iommu/sw64/sunway_iommu.h new file mode 100644 index 000000000000..5ad1dc7c406f --- /dev/null +++ b/drivers/iommu/sw64/sunway_iommu.h @@ -0,0 +1,78 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * This file contains declarations and inline functions for interfacing + * with the PCI initialization routines. + */ +#include <linux/iommu.h> +#include <linux/iova.h> +#include <linux/spinlock_types.h> +#include <linux/mutex.h> + +struct sunway_iommu_bypass_id { + unsigned int vendor; + unsigned int device; +}; + +struct sunway_iommu { + int index; + bool enabled; + unsigned long *iommu_dtbr; + spinlock_t dt_lock; /* Device Table Lock */ + int node; /* NUMA node */ + + struct pci_controller *hose_pt; + struct pci_dev *pdev; /* PCI device to this IOMMU */ + struct iommu_device iommu; /* IOMMU core code handle */ +}; + +struct sunway_iommu_dev { + struct list_head list; /* For domain->dev_list */ + struct llist_node dev_data_list; /* Global device list */ + u16 devid; + int alias; + bool passthrough; + struct sunway_iommu *iommu; + struct pci_dev *pdev; + + spinlock_t lock; /* Lock the page table mainly */ + struct sunway_iommu_domain *domain; /* Domain device is bound to */ +}; + +struct sunway_iommu_domain { + unsigned type; + spinlock_t lock; + struct mutex api_lock; + u16 id; /* Domain ID */ + struct list_head list; /* For list of all SW domains */ + struct list_head dev_list; /* List of devices in this domain */ + struct iommu_domain domain; /* IOMMU domain handle */ + unsigned long *pt_root; /* Page Table root */ + unsigned int dev_cnt; /* Number of devices in this domain */ + struct sunway_iommu *iommu; +}; + +struct sw64dev_table_entry { + u64 data; +}; + +struct sunway_iommu_group { + struct pci_dev *dev; + struct iommu_group *group; +}; + +#define SW64_IOMMU_ENTRY_VALID ((1UL) << 63) +#define SW64_DMA_START 0x1000000 +#define SW64_IOMMU_GRN_8K ((0UL) << 4) /* page size as 8KB */ +#define SW64_IOMMU_GRN_8M ((0x2UL) << 4) /* page size as 8MB */ +#define SW64_PTE_GRN_MASK ((0x3UL) << 4) +#define PAGE_8M_SHIFT 23 +#define SW64_IOMMU_ENABLE 3 +#define SW64_IOMMU_DISABLE 0 +#define SW64_IOMMU_LEVEL1_OFFSET 0x1ff +#define SW64_IOMMU_LEVEL2_OFFSET 0x3ff +#define SW64_IOMMU_LEVEL3_OFFSET 0x3ff +#define SW64_IOMMU_BYPASS 0x1 +#define SW64_IOMMU_MAP_FLAG ((0x1UL) << 20) + +#define PAGE_SHIFT_IOMMU 18 +#define PAGE_SIZE_IOMMU (_AC(1, UL) << PAGE_SHIFT_IOMMU)
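A note on the iommu_enable= parameter handled above: the value is parsed as a hex bitmap, and is_iommu_enable() tests bit (8 * node + index) for each root complex. A minimal sketch of that decoding, using a hypothetical helper name:

/*
 * Sketch only: booting with iommu_enable=101 (hex) sets bits 0 and 8,
 * enabling the IOMMU on node 0 / RC 0 and node 1 / RC 0.
 */
static bool rc_enabled(unsigned long cmd, long node, long index)
{
	return cmd & (1UL << (8 * node + index));
}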
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
The node id of hose denotes the topological physical id of the PCI host bridge. However, the machine may be booted with 'numa=off' on the command line to disable NUMA, in which case the node id passed to alloc_pages_node may trigger an exception. Use dev_to_node(dev) instead to make sure the parameter passed in is always valid.
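As an illustration of why this is safe, a minimal sketch follows (the wrapper name alloc_dma_page is hypothetical): dev_to_node() returns NUMA_NO_NODE for a device with no node assigned, e.g. when booting with 'numa=off', and alloc_pages_node() treats NUMA_NO_NODE as "allocate on the current node", so an out-of-range node id can no longer reach the allocator.

/* Hypothetical wrapper sketching the fixed allocation path */
static struct page *alloc_dma_page(struct device *dev, gfp_t gfp, int order)
{
	/*
	 * dev_to_node() yields NUMA_NO_NODE when the device has no
	 * NUMA node; alloc_pages_node() maps that to the current node.
	 */
	return alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, order);
}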
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/iommu/sw64/sunway_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/sw64/sunway_iommu.c b/drivers/iommu/sw64/sunway_iommu.c index 4eb29209e4bc..dd3382ee007f 100644 --- a/drivers/iommu/sw64/sunway_iommu.c +++ b/drivers/iommu/sw64/sunway_iommu.c @@ -1044,7 +1044,7 @@ static void *sunway_alloc_coherent(struct device *dev, gfp &= ~GFP_DMA;
try_again: - page = alloc_pages_node(hose->node, gfp | __GFP_ZERO, get_order(size)); + page = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, get_order(size)); cpu_addr = page_address(page); if (!cpu_addr) { pr_info
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
The LPC bridge function of the chip3 provides support for the System Management Bus and General Purpose I/O.
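As background for the MFD cell table added below: mfd_add_devices() registers each mfd_cell as a platform device named after the cell, so the Super I/O child driver binds by name alone. A minimal sketch, with demo_* identifiers that are illustrative only:

#include <linux/platform_device.h>

/* Sketch: a child platform driver matching the MFD cell by name */
static int demo_probe(struct platform_device *pdev)
{
	dev_info(&pdev->dev, "bound to LPC MFD cell\n");
	return 0;
}

static struct platform_driver demo_child_driver = {
	.probe = demo_probe,
	.driver = {
		.name = "sunway_superio_ast2400",
	},
};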
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/mfd/Kconfig | 16 ++ drivers/mfd/Makefile | 3 + drivers/mfd/lpc_sunway_chip3.c | 414 +++++++++++++++++++++++++++++++++ drivers/mfd/sunway_ast2400.c | 220 ++++++++++++++++++ 4 files changed, 653 insertions(+) create mode 100644 drivers/mfd/lpc_sunway_chip3.c create mode 100644 drivers/mfd/sunway_ast2400.c
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig index b8847ae04d93..15680c3c9279 100644 --- a/drivers/mfd/Kconfig +++ b/drivers/mfd/Kconfig @@ -568,6 +568,22 @@ config LPC_SCH LPC bridge function of the Intel SCH provides support for System Management Bus and General Purpose I/O.
+config LPC_CHIP3 + tristate "CHIP3 LPC" + depends on SW64_CHIP3 + select MFD_CORE + help + LPC bridge function of the chip3 provides support for + System Management Bus and General Purpose I/O. + +config SUNWAY_SUPERIO_AST2400 + tristate "SUNWAY SUPERIO AST2400" + depends on SW64 + select MFD_CORE + help + Nuvoton AST2400 Super I/O chip platform driver written + for SUNWAY LPC controller. + config INTEL_SOC_PMIC bool "Support for Crystal Cove PMIC" depends on ACPI && HAS_IOMEM && I2C=y && GPIOLIB && COMMON_CLK diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile index 1780019d2474..fb1df45a301e 100644 --- a/drivers/mfd/Makefile +++ b/drivers/mfd/Makefile @@ -264,6 +264,9 @@ obj-$(CONFIG_MFD_ROHM_BD718XX) += rohm-bd718x7.o obj-$(CONFIG_MFD_STMFX) += stmfx.o obj-$(CONFIG_MFD_KHADAS_MCU) += khadas-mcu.o
+obj-$(CONFIG_LPC_CHIP3) += lpc_sunway_chip3.o +obj-$(CONFIG_SUNWAY_SUPERIO_AST2400) += sunway_ast2400.o + obj-$(CONFIG_SGI_MFD_IOC3) += ioc3.o obj-$(CONFIG_MFD_SIMPLE_MFD_I2C) += simple-mfd-i2c.o obj-$(CONFIG_MFD_INTEL_M10_BMC) += intel-m10-bmc.o diff --git a/drivers/mfd/lpc_sunway_chip3.c b/drivers/mfd/lpc_sunway_chip3.c new file mode 100644 index 000000000000..793b557195c2 --- /dev/null +++ b/drivers/mfd/lpc_sunway_chip3.c @@ -0,0 +1,414 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * lpc_sunway_chip3.c - LPC interface for SUNWAY CHIP3 + * + * LPC bridge function contains many other functional units, + * such as Interrupt controllers, Timers, Power Management, + * System Management, GPIO, RTC, and LPC Configuration + * Registers. + * + * Copyright (c) 2014 JN + * Author: Weiqiang Su paper_purple@163.com + * + */ + +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/acpi.h> +#include <linux/irq.h> +#include <linux/interrupt.h> +#include <linux/pci.h> +#include <linux/mfd/core.h> +#include <linux/slab.h> +#include <linux/sizes.h> +#include <linux/mtd/physmap.h> + +#define DEBUG_LPC 0 +#if DEBUG_LPC +#define DBG_LPC(x...) printk(x) +#else +#define DBG_LPC(x...) +#endif + +enum features { + LPC_USE_MSI = (1 << 0), + LPC_USE_INTX = (1 << 1), +}; + +enum { + LPC_HST_BAR = 0, + LPC_MEM_BAR = 2, + LPC_FWH_BAR = 4, +}; + +enum { + LPC_CTL = 0x0, + LPC_IRQ = 0x4, + LPC_IRQ_MASK = 0x8, + LPC_STAT = 0xc, + LPC_ERR_INF = 0x10, + LPC_MEM_HADDR = 0x14, + LPC_FWH_IDSEL_R1 = 0x18, + LPC_FWH_IDSEL_R2 = 0x1c, + LPC_FWH_IDSEL_R3 = 0x20, + LPC_FWH_IDSEL_R4 = 0x24, + LPC_FWH_IDSEL_R5 = 0x28, + LPC_FWH_DEC_EN1 = 0x2c, + LPC_FWH_DEC_EN2 = 0x30, + LPC_DMA_CTL = 0x34, + LPC_CH_STAT = 0x38, + LPC_CH0_ADDR = 0x3c, + LPC_CH1_ADDR = 0x40, + LPC_CH2_ADDR = 0x44, + LPC_CH3_ADDR = 0x48, + LPC_CH0_LENG = 0x4c, + LPC_CH1_LENG = 0x50, + LPC_CH2_LENG = 0x54, + LPC_CH3_LENG = 0x58, + LPC_CH0_MODE = 0x5c, + LPC_CH1_MODE = 0x60, + LPC_CH2_MODE = 0x64, + LPC_CH3_MODE = 0x68, + LPC_CH_MASK = 0x6c, + LPC_DMA_SWRST = 0x70, +}; + +enum { + LPC_IRQ0 = 0, /* 8254 Timer */ + LPC_IRQ1, /* Keyboard */ + LPC_IRQ2, /* Reserved */ + LPC_IRQ3, /* UART */ + LPC_IRQ4, /* UART */ + LPC_IRQ5, /* LPC Parallel Port2 */ + LPC_IRQ6, /* FDC-Floppy Disk Controller */ + LPC_IRQ7, /* LPT-Parallel Port1 */ + LPC_NR_IRQS, + LPC_IRQ8, /* RTC */ + LPC_IRQ9, /* Undefined */ + LPC_IRQ10, /* Undefined */ + LPC_IRQ11, /* Undefined */ + LPC_IRQ12, /* Mouse */ + LPC_IRQ13, /* Undefined */ + LPC_IRQ14, /* Undefined */ + LPC_IRQ15, /* Undefined */ +}; + +struct lpc_chip3_adapter { + void __iomem *hst_regs; + struct device *dev; + int irq; + struct irq_chip_generic *gc; + unsigned int features; +}; + +static struct resource superio_chip3_resources[] = { + { + .flags = IORESOURCE_IO, + }, { + .start = LPC_IRQ1, + .flags = IORESOURCE_IRQ, + .name = "i8042_kbd_irq", + }, { + .start = LPC_IRQ12, + .flags = IORESOURCE_IRQ, + .name = "i8042_aux_irq", + }, { + .start = LPC_IRQ5, + .flags = IORESOURCE_IRQ, + .name = "uart0_irq", + }, { + .start = LPC_IRQ4, + .flags = IORESOURCE_IRQ, + .name = "uart1_irq", + } +}; + +static struct resource mem_flash_resource = { + .flags = IORESOURCE_MEM, +}; + +static struct resource fw_flash_resource = { + .flags = IORESOURCE_MEM, +}; + +static struct physmap_flash_data mem_flash_data = { + .width = 1, +}; + +static struct physmap_flash_data fw_flash_data = { + .width = 1, +}; + +static struct mfd_cell lpc_chip3_cells[] = { + { + .name = "sunway_superio_ast2400", + .id 
= 0, + .num_resources = ARRAY_SIZE(superio_chip3_resources), + .resources = superio_chip3_resources, + }, + { + .name = "chip3-flash", + .id = 0, + .num_resources = 1, + .resources = &mem_flash_resource, + .platform_data = &mem_flash_data, + .pdata_size = sizeof(mem_flash_data), + }, + { + .name = "chip3_fwh-flash", + .id = 0, + .num_resources = 1, + .resources = &fw_flash_resource, + .platform_data = &fw_flash_data, + .pdata_size = sizeof(fw_flash_data), + } +}; + +static inline void lpc_writel(void *address, int reg_base, int value) +{ + unsigned long addr = (unsigned long)address + reg_base; + + writel(value, (void *)addr); +} + +static inline int lpc_readl(void *address, int reg_base) +{ + unsigned long addr = (unsigned long)address + reg_base; + int value = readl((void *)addr); + + return value; +} + +static void lpc_enable(struct lpc_chip3_adapter *lpc_adapter) +{ + unsigned int value; + + value = lpc_readl(lpc_adapter->hst_regs, LPC_CTL); + value |= 0x1600; + + /* LPC host enable */ + lpc_writel(lpc_adapter->hst_regs, LPC_CTL, value); +} + +static void lpc_mem_flash_init(struct platform_device *pdev, + struct lpc_chip3_adapter *lpc_adapter) +{ + mem_flash_resource.start = + (((unsigned long)(lpc_adapter->hst_regs) & (~(0xfUL << 28))) | (0x2UL << 28)); + mem_flash_resource.end = mem_flash_resource.start + SZ_256M - 1; + + writel(0x1f, lpc_adapter->hst_regs + LPC_MEM_HADDR); +} + +static void lpc_fw_flash_init(struct platform_device *pdev, + struct lpc_chip3_adapter *lpc_adapter) +{ + fw_flash_resource.start = + (((unsigned long)(lpc_adapter->hst_regs) & (~(0xfUL << 28))) | (0x3UL << 28)); + fw_flash_resource.end = fw_flash_resource.start + SZ_256M - 1; + + writel(0xff0f, lpc_adapter->hst_regs + LPC_FWH_DEC_EN1); + writel(0xffff11ff, lpc_adapter->hst_regs + LPC_FWH_IDSEL_R5); + writel(0xffffffff, lpc_adapter->hst_regs + LPC_FWH_IDSEL_R4); + writel(0xffffffff, lpc_adapter->hst_regs + LPC_FWH_IDSEL_R3); + writel(0xffffffff, lpc_adapter->hst_regs + LPC_FWH_IDSEL_R2); + writel(0xffffffff, lpc_adapter->hst_regs + LPC_FWH_IDSEL_R1); + +} + +static u32 lpc_do_irq(struct lpc_chip3_adapter *lpc_adapter) +{ + u32 irq_status = readl_relaxed(lpc_adapter->hst_regs + LPC_IRQ); + u32 ret = irq_status; + + DBG_LPC("%s irq_status=%#x\n", __func__, irq_status); + while (irq_status) { + int hwirq = fls(irq_status) - 1; + + generic_handle_irq(hwirq); + irq_status &= ~BIT(hwirq); + } + + lpc_writel(lpc_adapter->hst_regs, LPC_IRQ, ret); + return 1; +} + +static void lpc_irq_handler_mfd(struct irq_desc *desc) +{ + unsigned int irq = irq_desc_get_irq(desc); + struct lpc_chip3_adapter *lpc_adapter = irq_get_handler_data(irq); + u32 worked = 0; + + DBG_LPC("enter %s line:%d\n", __func__, __LINE__); + + worked = lpc_do_irq(lpc_adapter); + if (worked == IRQ_HANDLED) + dev_dbg(lpc_adapter->dev, "LPC irq handled.\n"); + + DBG_LPC("leave %s line:%d\n", __func__, __LINE__); +} + +static void lpc_unmask_interrupt_all(struct lpc_chip3_adapter *lpc_adapter) +{ + lpc_writel(lpc_adapter->hst_regs, LPC_IRQ_MASK, 0); +} + +static void lpc_irq_mask_ack(struct irq_data *d) +{ + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); + struct irq_chip_type *ct = irq_data_get_chip_type(d); + u32 mask = d->mask; + + irq_gc_lock(gc); + *ct->mask_cache |= mask; + irq_reg_writel(gc, *ct->mask_cache, ct->regs.mask); + irq_reg_writel(gc, mask, ct->regs.ack); + irq_gc_unlock(gc); +} + +static void lpc_enable_irqs(struct lpc_chip3_adapter *lpc_adapter) +{ + int interrupt = 0; + + lpc_unmask_interrupt_all(lpc_adapter); + + 
interrupt = lpc_readl(lpc_adapter->hst_regs, LPC_IRQ); + + lpc_writel(lpc_adapter->hst_regs, LPC_CTL, 0x1600); + interrupt = lpc_readl(lpc_adapter->hst_regs, LPC_IRQ); +} + +static int lpc_chip3_probe(struct platform_device *pdev) +{ + int ret; + int num_ct = 1; + int irq_base; + struct irq_chip_generic *gc; + struct irq_chip_type *ct; + struct lpc_chip3_adapter *lpc_adapter; + struct resource *mem; + + lpc_adapter = kzalloc(sizeof(*lpc_adapter), GFP_KERNEL); + if (lpc_adapter == NULL) { + dev_err(&pdev->dev, "%s kzalloc failed !\n", __func__); + return -ENOMEM; + } + + /* Get basic io resource and map it */ + mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (!mem) { + dev_err(&pdev->dev, "no mem resource?\n"); + return -EINVAL; + } + + lpc_adapter->hst_regs = devm_ioremap_resource(&pdev->dev, mem); + if (IS_ERR(lpc_adapter->hst_regs)) { + dev_err(&pdev->dev, "lpc region map failed\n"); + return PTR_ERR(lpc_adapter->hst_regs); + } + + lpc_adapter->dev = &pdev->dev; + lpc_adapter->features = 0; + + lpc_adapter->irq = platform_get_irq(pdev, 0); + if (lpc_adapter->irq < 0) { + dev_err(&pdev->dev, "no irq resource?\n"); + return lpc_adapter->irq; /* -ENXIO */ + } + + irq_base = LPC_IRQ0; + gc = irq_alloc_generic_chip("LPC_CHIP3", num_ct, irq_base, + lpc_adapter->hst_regs, handle_level_irq); + + ct = gc->chip_types; + ct->regs.mask = LPC_IRQ_MASK; + ct->regs.ack = LPC_IRQ; + ct->chip.irq_mask = irq_gc_mask_set_bit; + ct->chip.irq_unmask = irq_gc_mask_clr_bit; + ct->chip.irq_ack = irq_gc_ack_set_bit; + ct->chip.irq_mask_ack = lpc_irq_mask_ack; + irq_setup_generic_chip(gc, IRQ_MSK(LPC_NR_IRQS), 0, 0, + IRQ_NOPROBE | IRQ_LEVEL); + + lpc_adapter->gc = gc; + + irq_set_handler_data(lpc_adapter->irq, lpc_adapter); + irq_set_chained_handler(lpc_adapter->irq, + (irq_flow_handler_t) lpc_irq_handler_mfd); + + lpc_enable(lpc_adapter); + + lpc_mem_flash_init(pdev, lpc_adapter); + lpc_fw_flash_init(pdev, lpc_adapter); + + ret = mfd_add_devices(&pdev->dev, 0, + lpc_chip3_cells, ARRAY_SIZE(lpc_chip3_cells), + NULL, 0, NULL); + if (ret) + goto out_dev; + + dev_info(lpc_adapter->dev, "probe succeed !\n"); + lpc_enable_irqs(lpc_adapter); + + return ret; + +out_dev: + dev_info(lpc_adapter->dev, "probe failed !\n"); + + mfd_remove_devices(&pdev->dev); + kfree(lpc_adapter); + + return ret; +} + +static int lpc_chip3_remove(struct platform_device *pdev) +{ + struct lpc_chip3_adapter *lpc_adapter = platform_get_drvdata(pdev); + + mfd_remove_devices(&pdev->dev); + iounmap(lpc_adapter->hst_regs); + kfree(lpc_adapter); + + return 0; +} + +static const struct of_device_id chip3_lpc_of_match[] = { + { .compatible = "sunway,chip3_lpc", }, + { /* end of table */ } +}; + +MODULE_DEVICE_TABLE(of, chip3_lpc_of_match); + +static struct platform_driver chip3_lpc_platform_driver = { + .driver = { + .name = "chip3_lpc", + .of_match_table = chip3_lpc_of_match, + }, + .remove = lpc_chip3_remove, +}; + +static int __init chip3_lpc_drvinit(void) +{ + return platform_driver_probe(&chip3_lpc_platform_driver, + lpc_chip3_probe); +} + +/* + * lpc controller init configure before serial drivers; + * The lpc & ast2400 should be initialized much before + * the serial initialized functions are called. 
+ */ +subsys_initcall_sync(chip3_lpc_drvinit); + +static void __exit chip3_lpc_drvexit(void) +{ + platform_driver_unregister(&chip3_lpc_platform_driver); +} + +module_exit(chip3_lpc_drvexit); + +MODULE_AUTHOR("Weiqiang Su paper_purple@163.com"); +MODULE_DESCRIPTION("LPC Interface for CHIP3"); +MODULE_LICENSE("GPL"); diff --git a/drivers/mfd/sunway_ast2400.c b/drivers/mfd/sunway_ast2400.c new file mode 100644 index 000000000000..0d75d9ea6692 --- /dev/null +++ b/drivers/mfd/sunway_ast2400.c @@ -0,0 +1,220 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 20014 - 2015 JN + * Author: Weiqiang Su David.suwq@gmail.com + * + * Nuvoton AST2400 Super I/O chip platform driver written for + * SUNWAY LPC controller. + */ + +#include <linux/module.h> +#include <linux/init.h> +#include <linux/sched.h> +#include <linux/spinlock.h> +#include <linux/errno.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/io.h> +#include <linux/slab.h> +#include <linux/platform_device.h> +#include <linux/delay.h> +#include <asm/ast2400.h> + + +static int superio_uart0_irq; +static int superio_uart1_irq; +static void pnp_enable(device_t dev) +{ + pnp_enter_conf_mode(dev); + pnp_set_logical_device(dev); + pnp_set_enable(dev, 1); + pnp_exit_conf_mode(dev); +} + +const struct pnp_mode_ops pnp_conf_mode_8787_aa = { + .enter_conf_mode = pnp_enter_conf_mode_a5a5, + .exit_conf_mode = pnp_exit_conf_mode_aa, +}; + +static struct device_operations ops = { + .enable = pnp_enable, + .ops_pnp_mode = &pnp_conf_mode_8787_aa, +}; + +static struct pnp_info pnp_dev_info[] = { + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_FDC}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_PP }, + { true, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_SP1}, + { true, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_SP2}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_KBC}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_CIR}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_ACPI}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_HWM_FPLED}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_VID}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_CIRWKUP}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO_PP_OD}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_SVID}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_DSLP}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIOA_LDN}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_WDT1}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIOBASE}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO0}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO1}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO2}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO3}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO4}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO5}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO6}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO7}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO8}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIO9}, + { false, {SUPERIO_PNP_PORT, 0, &ops}, AST2400_GPIOA}, +}; + +static void superio_com1_init(struct pnp_device *device) +{ + pnp_enter_conf_mode(device); + pnp_set_logical_device(device); + pnp_set_enable(device, 1); + + pnp_write_config(device, 0x60, 0x3); + pnp_write_config(device, 0x61, 0xf8); + + pnp_write_config(device, 0x70, superio_uart0_irq); + pnp_write_config(device, 0x71, 0x1); + + pnp_write_config(device, 0xf0, 0x0); + + pnp_exit_conf_mode(device); +} + +static void superio_com2_init(struct 
pnp_device *device) +{ + pnp_enter_conf_mode(device); + pnp_set_logical_device(device); + pnp_set_enable(device, 1); + + pnp_write_config(device, 0x60, 0x2); + pnp_write_config(device, 0x61, 0xf8); + + pnp_write_config(device, 0x70, superio_uart1_irq); + pnp_write_config(device, 0x71, 0x1); + + pnp_write_config(device, 0xf0, 0x0); + + pnp_exit_conf_mode(device); +} + +static void pnp_enable_devices(superio_device_t superio_device, + struct device_operations *ops, + unsigned int functions, struct pnp_info *info) +{ + int i = 0; + struct pnp_info *each_info; + struct pnp_device *each_device; + + /* Setup the ops and resources on the newly allocated devices. */ + for (i = 0; i < functions; i++) { + each_info = info + i; + each_device = &each_info->pnp_device; + + /* Skip logical devices this Super I/O doesn't enable. */ + if (each_info->enabled == false) + continue; + + each_device->device = each_info->function; + each_device->ops = ops; + each_device->port = superio_device->superio_ast2400_efir; + + switch (each_device->device) { + case AST2400_SP1: + each_device->ops->init = superio_com1_init; + break; + case AST2400_SP2: + each_device->ops->init = superio_com2_init; + break; + } + + if (each_device->ops->init) + each_device->ops->init(each_device); + } +} + +static void superio_enable_devices(superio_device_t superio_device) +{ + pnp_enable_devices(superio_device, &ops, + ARRAY_SIZE(pnp_dev_info), pnp_dev_info); +} + +static int superio_ast2400_probe(struct platform_device *pdev) +{ + int err = 0; + superio_device_t superio_device; + struct resource *res; + resource_size_t physaddr = 0; + + /* allocate space for device info */ + superio_device = kzalloc(sizeof(struct superio_ast2400_device), GFP_KERNEL); + if (superio_device == NULL) { + err = -ENOMEM; + return err; + } + + res = platform_get_resource(pdev, IORESOURCE_IO, 1); + if (res) { + physaddr = res->start; + dev_info(&pdev->dev, "request memory region %pR\n", res); + } + + superio_device->dev = &pdev->dev; + superio_device->enabled = 1; + superio_device->superio_ast2400_efir = physaddr + SUPERIO_PNP_PORT; + superio_device->superio_ast2400_efdr = physaddr + SUPERIO_PNP_PORT + 1; + superio_uart0_irq = platform_get_irq_byname(pdev, "uart0_irq"); + superio_uart1_irq = platform_get_irq_byname(pdev, "uart1_irq"); + + superio_enable_devices(superio_device); + + platform_set_drvdata(pdev, superio_device); + + dev_info(superio_device->dev, "probe succeed !\n"); + + return 0; +} + +static int superio_ast2400_remove(struct platform_device *pdev) +{ + superio_device_t superio_device = platform_get_drvdata(pdev); + + platform_set_drvdata(pdev, NULL); + + kfree(superio_device); + + return 0; +} + +static struct platform_driver superio_nuvoton_ast2400_driver = { + .probe = superio_ast2400_probe, + .remove = superio_ast2400_remove, + .driver = { + .name = "sunway_superio_ast2400" + }, +}; + +static int __init superio_nuvoton_ast2400_init(void) +{ + return platform_driver_register(&superio_nuvoton_ast2400_driver); +} + +subsys_initcall_sync(superio_nuvoton_ast2400_init); + +static void __exit superio_nuvoton_ast2400_exit(void) +{ + platform_driver_unregister(&superio_nuvoton_ast2400_driver); +} + +module_exit(superio_nuvoton_ast2400_exit); + +MODULE_DESCRIPTION("NUVOTON AST2400 Super I/O DRIVER"); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Weiqiang Su");
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
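Add SW64 to the set of architectures for which the kgdb test suite (kgdbts) emulates single stepping instead of relying on hardware single-step support.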
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/misc/kgdbts.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/kgdbts.c b/drivers/misc/kgdbts.c index 49489153cd16..f76468436715 100644 --- a/drivers/misc/kgdbts.c +++ b/drivers/misc/kgdbts.c @@ -129,7 +129,8 @@ static int hw_break_val2; static int cont_instead_of_sstep; static unsigned long cont_thread_id; static unsigned long sstep_thread_id; -#if defined(CONFIG_ARM) || defined(CONFIG_MIPS) || defined(CONFIG_SPARC) +#if defined(CONFIG_ARM) || defined(CONFIG_MIPS) || defined(CONFIG_SPARC) \ + || defined(CONFIG_SW64) static int arch_needs_sstep_emulation = 1; #else static int arch_needs_sstep_emulation;
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
With this support, RTC synchronization can be enabled for KVM guests.
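For a quick functional check from inside a guest, a minimal userspace sketch follows (assuming this driver registers as rtc0):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/rtc.h>

int main(void)
{
	struct rtc_time tm;
	int fd = open("/dev/rtc0", O_RDONLY);

	/* RTC_RD_TIME reads the time through the driver's read_time hook */
	if (fd < 0 || ioctl(fd, RTC_RD_TIME, &tm) < 0)
		return 1;
	printf("%04d-%02d-%02d %02d:%02d:%02d\n", tm.tm_year + 1900,
	       tm.tm_mon + 1, tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec);
	return 0;
}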
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/rtc/Kconfig | 7 ++++ drivers/rtc/Makefile | 5 +++ drivers/rtc/rtc-sw64-virt-platform.c | 26 +++++++++++++ drivers/rtc/rtc-sw64-virt.c | 55 ++++++++++++++++++++++++++++ 4 files changed, 93 insertions(+) create mode 100644 drivers/rtc/rtc-sw64-virt-platform.c create mode 100644 drivers/rtc/rtc-sw64-virt.c
diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig index 54cf5ec8f401..d5512b18a3ae 100644 --- a/drivers/rtc/Kconfig +++ b/drivers/rtc/Kconfig @@ -973,6 +973,13 @@ config RTC_DRV_ALPHA Direct support for the real-time clock found on every Alpha system, specifically MC146818 compatibles. If in doubt, say Y.
+config RTC_DRV_SW64_VIRT + bool "SW64 Hypervisor based RTC" + depends on SW64 + default y + help + Get support for the Hypervisor based RTC on SW64 systems. + config RTC_DRV_VRTC tristate "Virtual RTC for Intel MID platforms" depends on X86_INTEL_MID diff --git a/drivers/rtc/Makefile b/drivers/rtc/Makefile index bfb57464118d..fd1d53e789b7 100644 --- a/drivers/rtc/Makefile +++ b/drivers/rtc/Makefile @@ -11,6 +11,10 @@ obj-$(CONFIG_RTC_CLASS) += rtc-core.o obj-$(CONFIG_RTC_MC146818_LIB) += rtc-mc146818-lib.o rtc-core-y := class.o interface.o
+ifdef CONFIG_RTC_DRV_SW64_VIRT +rtc-core-y += rtc-sw64-virt-platform.o +endif + rtc-core-$(CONFIG_RTC_NVMEM) += nvmem.o rtc-core-$(CONFIG_RTC_INTF_DEV) += dev.o rtc-core-$(CONFIG_RTC_INTF_PROC) += proc.o @@ -164,6 +168,7 @@ obj-$(CONFIG_RTC_DRV_ST_LPC) += rtc-st-lpc.o obj-$(CONFIG_RTC_DRV_STM32) += rtc-stm32.o obj-$(CONFIG_RTC_DRV_STMP) += rtc-stmp3xxx.o obj-$(CONFIG_RTC_DRV_SUN4V) += rtc-sun4v.o +obj-$(CONFIG_RTC_DRV_SW64_VIRT) += rtc-sw64-virt.o obj-$(CONFIG_RTC_DRV_SUN6I) += rtc-sun6i.o obj-$(CONFIG_RTC_DRV_SUNXI) += rtc-sunxi.o obj-$(CONFIG_RTC_DRV_TEGRA) += rtc-tegra.o diff --git a/drivers/rtc/rtc-sw64-virt-platform.c b/drivers/rtc/rtc-sw64-virt-platform.c new file mode 100644 index 000000000000..d488a8f9458d --- /dev/null +++ b/drivers/rtc/rtc-sw64-virt-platform.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0 + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/efi.h> +#include <linux/platform_device.h> + +static struct platform_device rtc_sw64_virt_device = { + .name = "rtc_sw64_virt", + .id = -1, +}; + +static int __init rtc_sw64_virt_init(void) +{ + if (is_in_host()) + return 0; + + if (platform_device_register(&rtc_sw64_virt_device) < 0) + pr_err("unable to register rtc device...\n"); + /* not necessarily an error */ + return 0; +} +module_init(rtc_sw64_virt_init); diff --git a/drivers/rtc/rtc-sw64-virt.c b/drivers/rtc/rtc-sw64-virt.c new file mode 100644 index 000000000000..549d2e2d8a01 --- /dev/null +++ b/drivers/rtc/rtc-sw64-virt.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +/* rtc-sw64-virt.c: Hypervisor based RTC for SW64 systems. + * + * Copyright (C) 2021 Lu Feifei luff@gmail.com + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/delay.h> +#include <linux/init.h> +#include <linux/rtc.h> +#include <linux/platform_device.h> + +#define RTC_IO_ADDR (0x804910000000ULL) + +static int sw64_virt_read_time(struct device *dev, struct rtc_time *tm) +{ + unsigned long *ioaddr; + + ioaddr = ioremap(RTC_IO_ADDR, sizeof(long)); + rtc_time64_to_tm(*ioaddr, tm); + return 0; +} + +static const struct rtc_class_ops rtc_sw64_virt_ops = { + .read_time = sw64_virt_read_time, +}; + +static int __init rtc_sw64_virt_probe(struct platform_device *pdev) +{ + struct rtc_device *rtc; + + rtc = devm_rtc_device_register(&pdev->dev, "sw64_virt", + &rtc_sw64_virt_ops, THIS_MODULE); + if (IS_ERR(rtc)) + return PTR_ERR(rtc); + + platform_set_drvdata(pdev, rtc); + return 0; +} + +static struct platform_driver rtc_sw64_virt_driver = { + .driver = { + .name = "rtc_sw64_virt", + }, +}; + +module_platform_driver_probe(rtc_sw64_virt_driver, rtc_sw64_virt_probe); + +MODULE_AUTHOR("Lu Feifei luff@gmail.com"); +MODULE_DESCRIPTION("Sunway virtual RTC driver"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS("platform:rtc_sw64_virt");
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Implement the controller's memory-operation hook functions, supporting both transmit and receive modes.
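For context, a sketch of the kind of spi-mem operation the new exec_op hook services. The values are illustrative only (opcode 0x03, 3-byte address, 16-byte read); a read op like this goes through the controller's EEPROM-read mode:

#include <linux/spi/spi-mem.h>

/* Hypothetical helper: read 16 bytes at flash offset 0 via spi-mem */
static int demo_flash_read(struct spi_mem *mem, u8 *buf)
{
	struct spi_mem_op op =
		SPI_MEM_OP(SPI_MEM_OP_CMD(0x03, 1),
			   SPI_MEM_OP_ADDR(3, 0, 1),
			   SPI_MEM_OP_NO_DUMMY,
			   SPI_MEM_OP_DATA_IN(16, buf, 1));

	return spi_mem_exec_op(mem, &op);
}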
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/spi/Kconfig | 6 + drivers/spi/Makefile | 1 + drivers/spi/spi-chip3-mmio.c | 147 +++++++++++++ drivers/spi/spi-chip3.c | 404 +++++++++++++++++++++++++++++++++++ drivers/spi/spi-chip3.h | 245 +++++++++++++++++++++ 5 files changed, 803 insertions(+) create mode 100644 drivers/spi/spi-chip3-mmio.c create mode 100644 drivers/spi/spi-chip3.c create mode 100644 drivers/spi/spi-chip3.h
diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig index 2e284cae97e3..1bc68c0547c1 100644 --- a/drivers/spi/Kconfig +++ b/drivers/spi/Kconfig @@ -980,6 +980,12 @@ config SPI_AMD # # Add new SPI master controllers in alphabetical order above this line # +config SPI_CHIP3 + tristate "Memory-mapped I/O interface driver for SUNWAY CHIP3 SPI core" + depends on SW64 + help + General driver for the SPI controller core from DesignWare +
comment "SPI Multiplexer support"
diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile index 04291ff89e16..e1f88bd47ded 100644 --- a/drivers/spi/Makefile +++ b/drivers/spi/Makefile @@ -41,6 +41,7 @@ spi-dw-y := spi-dw-core.o spi-dw-$(CONFIG_SPI_DW_DMA) += spi-dw-dma.o obj-$(CONFIG_SPI_DW_BT1) += spi-dw-bt1.o obj-$(CONFIG_SPI_DW_MMIO) += spi-dw-mmio.o +obj-$(CONFIG_SPI_CHIP3) += spi-chip3.o spi-chip3-mmio.o obj-$(CONFIG_SPI_DW_PCI) += spi-dw-pci.o obj-$(CONFIG_SPI_EFM32) += spi-efm32.o obj-$(CONFIG_SPI_EP93XX) += spi-ep93xx.o diff --git a/drivers/spi/spi-chip3-mmio.c b/drivers/spi/spi-chip3-mmio.c new file mode 100644 index 000000000000..3a76382e0fd9 --- /dev/null +++ b/drivers/spi/spi-chip3-mmio.c @@ -0,0 +1,147 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Memory-mapped interface driver for SUNWAY CHIP3 SPI Core + */ + +#include <linux/clk.h> +#include <linux/err.h> +#include <linux/interrupt.h> +#include <linux/platform_device.h> +#include <linux/slab.h> +#include <linux/spi/spi.h> +#include <linux/scatterlist.h> +#include <linux/mfd/syscon.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/of_gpio.h> +#include <linux/of_platform.h> +#include <linux/property.h> +#include <linux/regmap.h> + +#include "spi-chip3.h" + +#define DRIVER_NAME "sunway_chip3_spi" + +struct chip3_spi_mmio { + struct chip3_spi dws; + struct clk *clk; + void *priv; +}; + +static int chip3_spi_mmio_probe(struct platform_device *pdev) +{ + int (*init_func)(struct platform_device *pdev, + struct chip3_spi_mmio *dwsmmio); + struct chip3_spi_mmio *dwsmmio; + struct chip3_spi *dws; + struct resource *mem; + int ret; + int num_cs; + + dwsmmio = devm_kzalloc(&pdev->dev, sizeof(struct chip3_spi_mmio), + GFP_KERNEL); + if (!dwsmmio) + return -ENOMEM; + + dws = &dwsmmio->dws; + + /* Get basic io resource and map it */ + mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); + dws->regs = devm_ioremap_resource(&pdev->dev, mem); + if (IS_ERR(dws->regs)) { + dev_err(&pdev->dev, "SPI region map failed\n"); + return PTR_ERR(dws->regs); + } + + dws->irq = platform_get_irq(pdev, 0); + if (dws->irq < 0) { + dev_err(&pdev->dev, "no irq resource?\n"); + return dws->irq; /* -ENXIO */ + } + + dwsmmio->clk = devm_clk_get(&pdev->dev, NULL); + if (IS_ERR(dwsmmio->clk)) + return PTR_ERR(dwsmmio->clk); + ret = clk_prepare_enable(dwsmmio->clk); + if (ret) + return ret; + + dws->bus_num = pdev->id; + dws->max_freq = clk_get_rate(dwsmmio->clk); + + device_property_read_u32(&pdev->dev, "reg-io-width", + &dws->reg_io_width); + + num_cs = 4; + device_property_read_u32(&pdev->dev, "num-cs", &num_cs); + dws->num_cs = num_cs; + + if (pdev->dev.of_node) { + int i; + + for (i = 0; i < dws->num_cs; i++) { + int cs_gpio = of_get_named_gpio(pdev->dev.of_node, + "cs-gpios", i); + + if (cs_gpio == -EPROBE_DEFER) { + ret = cs_gpio; + goto out; + } + + if (gpio_is_valid(cs_gpio)) { + ret = devm_gpio_request(&pdev->dev, cs_gpio, + dev_name(&pdev->dev)); + if (ret) + goto out; + } + } + } + + init_func = device_get_match_data(&pdev->dev); + if (init_func) { + ret = init_func(pdev, dwsmmio); + if (ret) + goto out; + } + + ret = chip3_spi_add_host(&pdev->dev, dws); + if (ret) + goto out; + + platform_set_drvdata(pdev, dwsmmio); + + return 0; +out: + clk_disable_unprepare(dwsmmio->clk); + return ret; +} + +static int chip3_spi_mmio_remove(struct platform_device *pdev) +{ + struct chip3_spi_mmio *dwsmmio = platform_get_drvdata(pdev); + + chip3_spi_remove_host(&dwsmmio->dws); + clk_disable_unprepare(dwsmmio->clk); + + return 0; +} + +static const struct of_device_id 
chip3_spi_mmio_of_match[] = { + { .compatible = "sunway,chip3-spi", }, + { /* end of table */} +}; +MODULE_DEVICE_TABLE(of, chip3_spi_mmio_of_match); + +static struct platform_driver chip3_spi_mmio_driver = { + .probe = chip3_spi_mmio_probe, + .remove = chip3_spi_mmio_remove, + .driver = { + .name = DRIVER_NAME, + .of_match_table = chip3_spi_mmio_of_match, + }, +}; +module_platform_driver(chip3_spi_mmio_driver); + +MODULE_AUTHOR("Platform@wiat.com"); +MODULE_DESCRIPTION("Memory-mapped I/O interface driver for Sunway CHIP3"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/spi/spi-chip3.c b/drivers/spi/spi-chip3.c new file mode 100644 index 000000000000..568bcc65845d --- /dev/null +++ b/drivers/spi/spi-chip3.c @@ -0,0 +1,404 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SUNWAY CHIP3 SPI core controller driver + */ + +#include <linux/dma-mapping.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/highmem.h> +#include <linux/delay.h> +#include <linux/slab.h> +#include <linux/spi/spi.h> +#include <linux/spi/spi-mem.h> +#include <linux/gpio.h> +#include <linux/of.h> +#include <linux/mtd/spi-nor.h> +#include <linux/kernel.h> + +#include "spi-chip3.h" + +/* Slave spi_dev related */ +struct chip_data { + u8 tmode; /* TR/TO/RO/EEPROM */ + u8 type; /* SPI/SSP/MicroWire */ + + u8 poll_mode; /* 1 means use poll mode */ + + u16 clk_div; /* baud rate divider */ + u32 speed_hz; /* baud rate */ + void (*cs_control)(u32 command); +}; + +static void chip3_spi_handle_err(struct spi_controller *master, + struct spi_message *msg) +{ + struct chip3_spi *dws = spi_controller_get_devdata(master); + + spi_reset_chip(dws); +} + +static size_t chip3_spi_max_length(struct spi_device *spi) +{ + struct chip3_spi *dws = spi_controller_get_devdata(spi->master); + + return dws->fifo_len; +} + +static int chip3_spi_transfer_one_message(struct spi_controller *master, + struct spi_message *m) +{ + struct chip3_spi *dws = spi_controller_get_devdata(master); + struct spi_transfer *t = NULL; + u16 clk_div; + u32 freq; + u32 speed_hz; + u32 status; + u32 len = 0; + int ret = 0; + int i = 0; + + spi_enable_chip(dws, 0); + + /* Handle per transfer options for bpw and speed. 
*/ + freq = clamp(m->spi->max_speed_hz, 0U, dws->max_freq); + clk_div = (DIV_ROUND_UP(dws->max_freq, freq) + 1) & 0xfffe; + speed_hz = dws->max_freq / clk_div; + + if (dws->current_freq != speed_hz) { + spi_set_clk(dws, clk_div); + dws->current_freq = speed_hz; + } + + dws->n_bytes = 1; + + /* For poll mode just disable all interrupts */ + spi_mask_intr(dws, 0xff); + + chip3_writel(dws, CHIP3_SPI_CTRL0, SPI_TRANSMIT_RECEIVE); + + spi_enable_chip(dws, 1); + + list_for_each_entry(t, &m->transfers, transfer_list) { + len += t->len; + /* Judge if data is overflow */ + if (len > dws->fifo_len) { + pr_err("SPI transfer overflow.\n"); + m->actual_length = 0; + m->status = -EIO; + ret = -EIO; + goto way_out; + } + + if (t->tx_buf) + memcpy(&dws->buf[len], t->tx_buf, t->len); + else + memset(&dws->buf[len], 0, t->len); + } + + chip3_writel(dws, CHIP3_SPI_SER, 0x0); + for (i = 0; i < len; i++) + chip3_writel(dws, CHIP3_SPI_DR, dws->buf[i]); + chip3_writel(dws, CHIP3_SPI_SER, BIT(m->spi->chip_select)); + + do { + status = chip3_readl(dws, CHIP3_SPI_SR); + } while (status & SR_BUSY); + + list_for_each_entry(t, &m->transfers, transfer_list) { + if (t->rx_buf) { + for (i = 0; i < t->len; i++, t->rx_buf += 1) + *(u8 *)t->rx_buf = chip3_readl(dws, CHIP3_SPI_DR); + } else { + for (i = 0; i < t->len; i++) + chip3_readl(dws, CHIP3_SPI_DR); + } + } + + m->actual_length = len; + m->status = 0; + spi_finalize_current_message(master); + +way_out: + return ret; +} + +static int chip3_spi_adjust_mem_op_size(struct spi_mem *mem, + struct spi_mem_op *op) +{ + struct chip3_spi *dws = spi_controller_get_devdata(mem->spi->controller); + size_t len; + + len = sizeof(op->cmd.opcode) + op->addr.nbytes + op->dummy.nbytes; + + op->data.nbytes = min((size_t)op->data.nbytes, (dws->fifo_len - len)); + if (!op->data.nbytes) + return -EINVAL; + + return 0; +} + +static int chip3_spi_init_mem_buf(struct chip3_spi *dws, + const struct spi_mem_op *op) +{ + int ret = 0; + int i, j, len; + + /* Calculate the total length of the transfer. */ + len = sizeof(op->cmd.opcode) + op->addr.nbytes + op->dummy.nbytes; + + /* Judge if data is overflow */ + if (len + op->data.nbytes > dws->fifo_len) { + ret = -EIO; + goto way_out; + } + + /* + * Collect the operation code, address and dummy bytes into the single + * buffer. If it's a transfer with data to be sent, also copy it into + * the single buffer. + */ + for (i = 0; i < sizeof(op->cmd.opcode); i++) + dws->buf[i] = op->cmd.opcode; + for (j = 0; j < op->addr.nbytes; i++, j++) + dws->buf[i] = op->addr.val >> (8 * (op->addr.nbytes - i)); + for (j = 0; j < op->dummy.nbytes; i++, j++) + dws->buf[i] = 0xff; + + if (op->data.dir == SPI_MEM_DATA_OUT) { + memcpy(&dws->buf[i], op->data.buf.out, op->data.nbytes); + len += op->data.nbytes; + } + + dws->tx_len = len; + + if (op->data.dir == SPI_MEM_DATA_IN) { + dws->rx = op->data.buf.in; + dws->rx_len = op->data.nbytes; + } else { + dws->rx = NULL; + dws->rx_len = 0; + } + +way_out: + return ret; +} + +static int chip3_spi_exec_mem_op(struct spi_mem *mem, + const struct spi_mem_op *op) +{ + struct chip3_spi *dws = spi_controller_get_devdata(mem->spi->controller); + u16 clk_div; + int ret = 0; + int i; + unsigned short value; + u32 freq; + u32 speed_hz; + + ret = chip3_spi_init_mem_buf(dws, op); + if (ret) + return ret; + + spi_enable_chip(dws, 0); + + /* Handle per transfer options for bpw and speed. 
*/ + freq = clamp(mem->spi->max_speed_hz, 0U, dws->max_freq); + clk_div = (DIV_ROUND_UP(dws->max_freq, freq) + 1) & 0xfffe; + speed_hz = dws->max_freq / clk_div; + + if (dws->current_freq != speed_hz) { + spi_set_clk(dws, clk_div); + dws->current_freq = speed_hz; + } + + dws->n_bytes = 1; + + /* For poll mode just disable all interrupts */ + spi_mask_intr(dws, 0xff); + + if ((dws->tx_len != 0) && (dws->rx_len != 0)) { + chip3_writel(dws, CHIP3_SPI_CTRL0, SPI_EEPROM_READ); + chip3_writel(dws, CHIP3_SPI_CTRL1, (dws->rx_len - 1)); + } else { + chip3_writel(dws, CHIP3_SPI_CTRL0, SPI_TRANSMIT_ONLY); + } + + spi_enable_chip(dws, 1); + + chip3_writel(dws, CHIP3_SPI_SER, 0x0); + for (i = 0; i < dws->tx_len; i++) + chip3_writel(dws, CHIP3_SPI_DR, dws->buf[i]); + chip3_writel(dws, CHIP3_SPI_SER, BIT(mem->spi->chip_select)); + + value = chip3_readl(dws, CHIP3_SPI_SR); + while (value & SR_BUSY) + value = chip3_readl(dws, CHIP3_SPI_SR); + + for (i = 0; i < dws->rx_len; dws->rx += dws->n_bytes, i++) + *(u8 *)dws->rx = chip3_readl(dws, CHIP3_SPI_DR); + + return ret; +} + +/* This may be called twice for each spi dev */ +static int chip3_spi_setup(struct spi_device *spi) +{ + struct chip3_spi_chip *chip_info = NULL; + struct chip_data *chip; + u32 poll_mode = 0; + struct device_node *np = spi->dev.of_node; + + /* Only alloc on first setup */ + chip = spi_get_ctldata(spi); + if (!chip) { + chip = kzalloc(sizeof(struct chip_data), GFP_KERNEL); + if (!chip) + return -ENOMEM; + spi_set_ctldata(spi, chip); + } + + /* + * Protocol drivers may change the chip settings, so... + * if chip_info exists, use it + */ + chip_info = spi->controller_data; + + /* chip_info doesn't always exist */ + if (chip_info) { + if (chip_info->cs_control) + chip->cs_control = chip_info->cs_control; + + chip->poll_mode = chip_info->poll_mode; + chip->type = chip_info->type; + } else { + if (np) { + of_property_read_u32(np, "poll_mode", &poll_mode); + chip->poll_mode = poll_mode; + } + + } + + chip->tmode = SPI_TMOD_TR; + return 0; +} + +static void chip3_spi_cleanup(struct spi_device *spi) +{ + struct chip_data *chip = spi_get_ctldata(spi); + + kfree(chip); + spi_set_ctldata(spi, NULL); +} + +/* Restart the controller, disable all interrupts, clean rx fifo */ +static void spi_hw_init(struct device *dev, struct chip3_spi *dws) +{ + spi_reset_chip(dws); + + /* + * Try to detect the FIFO depth if not set by interface driver, + * the depth could be from 2 to 256 from HW spec + */ + if (!dws->fifo_len) { + u32 fifo; + + for (fifo = 1; fifo < 256; fifo++) { + chip3_writel(dws, CHIP3_SPI_TXFLTR, fifo); + if (fifo != chip3_readl(dws, CHIP3_SPI_TXFLTR)) + break; + } + chip3_writel(dws, CHIP3_SPI_TXFLTR, 0); + + dws->fifo_len = (fifo == 1) ? 
0 : fifo; + dev_info(dev, "Detected FIFO size: %u bytes\n", dws->fifo_len); + } +} + +static const struct spi_controller_mem_ops chip3_mem_ops = { + .adjust_op_size = chip3_spi_adjust_mem_op_size, + .exec_op = chip3_spi_exec_mem_op, +}; + + +int chip3_spi_add_host(struct device *dev, struct chip3_spi *dws) +{ + struct spi_controller *master; + int ret; + + BUG_ON(dws == NULL); + + master = spi_alloc_master(dev, 0); + if (!master) + return -ENOMEM; + + dws->master = master; + dws->type = SSI_MOTO_SPI; + + spi_controller_set_devdata(master, dws); + + master->mode_bits = SPI_CPOL | SPI_CPHA; + master->bits_per_word_mask = SPI_BPW_MASK(8) | SPI_BPW_MASK(16); + master->bus_num = dws->bus_num; + master->num_chipselect = dws->num_cs; + master->setup = chip3_spi_setup; + master->cleanup = chip3_spi_cleanup; + master->transfer_one_message = chip3_spi_transfer_one_message; + master->handle_err = chip3_spi_handle_err; + master->max_speed_hz = dws->max_freq; + master->dev.of_node = dev->of_node; + master->flags = SPI_MASTER_GPIO_SS; + master->max_transfer_size = chip3_spi_max_length; + master->max_message_size = chip3_spi_max_length; + + master->mem_ops = &chip3_mem_ops; + + /* Basic HW init */ + spi_hw_init(dev, dws); + + ret = devm_spi_register_controller(dev, master); + if (ret) { + dev_err(&master->dev, "problem registering spi master\n"); + spi_enable_chip(dws, 0); + free_irq(dws->irq, master); + } + + return ret; +} +EXPORT_SYMBOL_GPL(chip3_spi_add_host); + +void chip3_spi_remove_host(struct chip3_spi *dws) +{ + spi_shutdown_chip(dws); + + free_irq(dws->irq, dws->master); +} +EXPORT_SYMBOL_GPL(chip3_spi_remove_host); + +int chip3_spi_suspend_host(struct chip3_spi *dws) +{ + int ret; + + ret = spi_controller_suspend(dws->master); + if (ret) + return ret; + + spi_shutdown_chip(dws); + return 0; +} +EXPORT_SYMBOL_GPL(chip3_spi_suspend_host); + +int chip3_spi_resume_host(struct chip3_spi *dws) +{ + int ret; + + spi_hw_init(&dws->master->dev, dws); + ret = spi_controller_resume(dws->master); + if (ret) + dev_err(&dws->master->dev, "fail to start queue (%d)\n", ret); + return ret; +} +EXPORT_SYMBOL_GPL(chip3_spi_resume_host); + +MODULE_AUTHOR("Platform@wiat.com"); +MODULE_DESCRIPTION("Driver for Sunway CHIP3 SPI controller core"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/spi/spi-chip3.h b/drivers/spi/spi-chip3.h new file mode 100644 index 000000000000..88e49a9091a5 --- /dev/null +++ b/drivers/spi/spi-chip3.h @@ -0,0 +1,245 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef CHIP3_SPI_HEADER_H +#define CHIP3_SPI_HEADER_H + +#include <linux/io.h> +#include <linux/scatterlist.h> +#include <linux/gpio.h> +#include <linux/spi/spi.h> + +/* Register offsets */ +#define CHIP3_SPI_CTRL0 (0x00<<7) +#define CHIP3_SPI_CTRL1 (0x04<<7) +#define CHIP3_SPI_SSIENR (0x08<<7) +#define CHIP3_SPI_MWCR (0x0c<<7) +#define CHIP3_SPI_SER (0x10<<7) +#define CHIP3_SPI_BAUDR (0x14<<7) +#define CHIP3_SPI_TXFLTR (0x18<<7) +#define CHIP3_SPI_RXFLTR (0x1c<<7) +#define CHIP3_SPI_TXFLR (0x20<<7) +#define CHIP3_SPI_RXFLR (0x24<<7) +#define CHIP3_SPI_SR (0x28<<7) +#define CHIP3_SPI_IMR (0x2c<<7) +#define CHIP3_SPI_ISR (0x30<<7) +#define CHIP3_SPI_RISR (0x34<<7) +#define CHIP3_SPI_TXOICR (0x38<<7) +#define CHIP3_SPI_RXOICR (0x3c<<7) +#define CHIP3_SPI_RXUICR (0x40<<7) +#define CHIP3_SPI_MSTICR (0x44<<7) +#define CHIP3_SPI_ICR (0x48<<7) +#define CHIP3_SPI_DMACR (0x4c<<7) +#define CHIP3_SPI_DMATDLR (0x50<<7) +#define CHIP3_SPI_DMARDLR (0x54<<7) +#define CHIP3_SPI_IDR (0x58<<7) +#define CHIP3_SPI_VERSION (0x5c<<7) +#define CHIP3_SPI_DR
(0x60<<7) + +/* Bit fields in CTRLR0 */ +#define SPI_DFS_OFFSET 0 + +#define SPI_FRF_OFFSET 4 +#define SPI_FRF_SPI 0x0 +#define SPI_FRF_SSP 0x1 +#define SPI_FRF_MICROWIRE 0x2 +#define SPI_FRF_RESV 0x3 + +#define SPI_MODE_OFFSET 6 +#define SPI_SCPH_OFFSET 6 +#define SPI_SCOL_OFFSET 7 + +#define SPI_TMOD_OFFSET 8 +#define SPI_TMOD_MASK (0x3 << SPI_TMOD_OFFSET) +#define SPI_TMOD_TR 0x0 /* xmit & recv */ +#define SPI_TMOD_TO 0x1 /* xmit only */ +#define SPI_TMOD_RO 0x2 /* recv only */ +#define SPI_TMOD_EPROMREAD 0x3 /* eeprom read mode */ + +#define SPI_SLVOE_OFFSET 10 +#define SPI_SRL_OFFSET 11 +#define SPI_CFS_OFFSET 12 + +/* Bit fields in SR, 7 bits */ +#define SR_MASK 0x7f /* cover 7 bits */ +#define SR_BUSY (1 << 0) +#define SR_TF_NOT_FULL (1 << 1) +#define SR_TF_EMPT (1 << 2) +#define SR_RF_NOT_EMPT (1 << 3) +#define SR_RF_FULL (1 << 4) +#define SR_TX_ERR (1 << 5) +#define SR_DCOL (1 << 6) + +/* Bit fields in ISR, IMR, RISR, 7 bits */ +#define SPI_INT_TXEI (1 << 0) +#define SPI_INT_TXOI (1 << 1) +#define SPI_INT_RXUI (1 << 2) +#define SPI_INT_RXOI (1 << 3) +#define SPI_INT_RXFI (1 << 4) +#define SPI_INT_MSTI (1 << 5) + +/* Bit fields in DMACR */ +#define SPI_DMA_RDMAE (1 << 0) +#define SPI_DMA_TDMAE (1 << 1) + +/* TX RX interrupt level threshold, max can be 256 */ +#define SPI_INT_THRESHOLD 32 + +/* The depth of the FIFO buffer is 256, so the max transfer length is 256. */ +#define MAX_LEN 256 + +/* The mode of spi controller. */ +#define SPI_TRANSMIT_RECEIVE 0x0c7 +#define SPI_EEPROM_READ 0x3c7 +#define SPI_TRANSMIT_ONLY 0x1c7 + +enum chip3_ssi_type { + SSI_MOTO_SPI = 0, + SSI_TI_SSP, + SSI_NS_MICROWIRE, +}; + +struct chip3_spi; + +struct chip3_spi { + struct spi_controller *master; + enum chip3_ssi_type type; + + void __iomem *regs; + unsigned long paddr; + int irq; + u32 fifo_len; /* depth of the FIFO buffer */ + u32 max_freq; /* max bus freq supported */ + + u32 reg_io_width; /* DR I/O width in bytes */ + u16 bus_num; + u16 num_cs; /* supported slave numbers */ + void (*set_cs)(struct spi_device *spi, bool enable); + + /* Current message transfer state info */ + size_t len; + void *tx; + unsigned int tx_len; + void *rx; + unsigned int rx_len; + u8 n_bytes; /* current is a 1/2 bytes op */ + u32 current_freq; /* frequency in hz */ + + u8 buf[MAX_LEN]; + + /* Bus interface info */ + void *priv; +#ifdef CONFIG_DEBUG_FS + struct dentry *debugfs; +#endif +}; + +static inline u32 chip3_readl(struct chip3_spi *dws, u32 offset) +{ + return __raw_readl(dws->regs + offset); +} + +static inline u16 chip3_readw(struct chip3_spi *dws, u32 offset) +{ + return __raw_readw(dws->regs + offset); +} + +static inline void chip3_writel(struct chip3_spi *dws, u32 offset, u32 val) +{ + __raw_writel(val, dws->regs + offset); +} + +static inline void chip3_writew(struct chip3_spi *dws, u32 offset, u16 val) +{ + __raw_writew(val, dws->regs + offset); +} + +static inline u32 chip3_read_io_reg(struct chip3_spi *dws, u32 offset) +{ + switch (dws->reg_io_width) { + case 2: + return chip3_readw(dws, offset); + case 4: + default: + return chip3_readl(dws, offset); + } +} + +static inline void chip3_write_io_reg(struct chip3_spi *dws, u32 offset, u32 val) +{ + switch (dws->reg_io_width) { + case 2: + chip3_writew(dws, offset, val); + break; + case 4: + default: + chip3_writel(dws, offset, val); + break; + } +} + +static inline void spi_enable_chip(struct chip3_spi *dws, int enable) +{ + chip3_writel(dws, CHIP3_SPI_SSIENR, (enable ? 
1 : 0)); +} + +static inline void spi_set_clk(struct chip3_spi *dws, u16 div) +{ + chip3_writel(dws, CHIP3_SPI_BAUDR, div); +} + +/* Disable IRQ bits */ +static inline void spi_mask_intr(struct chip3_spi *dws, u32 mask) +{ + u32 new_mask; + + new_mask = chip3_readl(dws, CHIP3_SPI_IMR) & ~mask; + chip3_writel(dws, CHIP3_SPI_IMR, new_mask); +} + +/* Enable IRQ bits */ +static inline void spi_umask_intr(struct chip3_spi *dws, u32 mask) +{ + u32 new_mask; + + new_mask = chip3_readl(dws, CHIP3_SPI_IMR) | mask; + chip3_writel(dws, CHIP3_SPI_IMR, new_mask); +} + +/* + * This disables the SPI controller and its interrupts, then re-enables the + * controller. Transmit and receive FIFO buffers are cleared when the + * device is disabled. + */ +static inline void spi_reset_chip(struct chip3_spi *dws) +{ + spi_enable_chip(dws, 0); + spi_mask_intr(dws, 0xff); + spi_enable_chip(dws, 1); +} + +static inline void spi_shutdown_chip(struct chip3_spi *dws) +{ + spi_enable_chip(dws, 0); + spi_set_clk(dws, 0); +} + +/* + * Each SPI slave device working with the chip3 controller should + * have such a structure declaring its working mode (poll or PIO/DMA), + * which can be saved in the "controller_data" member of the + * struct spi_device. + */ +struct chip3_spi_chip { + u8 poll_mode; /* 1 for controller polling mode */ + u8 type; /* SPI/SSP/MicroWire */ + u8 chip_select; + void (*cs_control)(u32 command); +}; + +extern int chip3_spi_add_host(struct device *dev, struct chip3_spi *dws); +extern void chip3_spi_remove_host(struct chip3_spi *dws); +extern int chip3_spi_suspend_host(struct chip3_spi *dws); +extern int chip3_spi_resume_host(struct chip3_spi *dws); + +/* platform related setup */ +extern int chip3_spi_mid_init(struct chip3_spi *dws); /* Intel MID platforms */ +#endif /* CHIP3_SPI_HEADER_H */
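The FIFO probe in spi_hw_init() above deserves a remark: the TX FIFO threshold register can only hold values smaller than the FIFO depth, so writing increasing values until one fails to read back unchanged reveals the depth. Below is a minimal user-space model of that technique; the simulated register and the FIFO_DEPTH constant stand in for the real CHIP3_SPI_TXFLTR accessors and are illustrative only.

#include <stdio.h>

#define FIFO_DEPTH 64	/* hidden hardware property the probe discovers */

static unsigned int txfltr;	/* simulated CHIP3_SPI_TXFLTR register */

static void reg_write(unsigned int v)
{
	/* hardware ignores threshold values that exceed its FIFO depth */
	if (v < FIFO_DEPTH)
		txfltr = v;
}

static unsigned int reg_read(void)
{
	return txfltr;
}

int main(void)
{
	unsigned int fifo;

	for (fifo = 1; fifo < 256; fifo++) {
		reg_write(fifo);
		if (fifo != reg_read())
			break;	/* write was clamped: depth found */
	}
	reg_write(0);	/* restore the default threshold */

	printf("detected FIFO depth: %u\n", (fifo == 1) ? 0 : fifo);
	return 0;
}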
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/tty/serial/8250/8250_sunway.c | 785 ++++++++++++++++++++++++++ drivers/tty/serial/8250/Kconfig | 7 + drivers/tty/serial/8250/Makefile | 1 + 3 files changed, 793 insertions(+) create mode 100644 drivers/tty/serial/8250/8250_sunway.c
diff --git a/drivers/tty/serial/8250/8250_sunway.c b/drivers/tty/serial/8250/8250_sunway.c new file mode 100644 index 000000000000..7cac6f63dab7 --- /dev/null +++ b/drivers/tty/serial/8250/8250_sunway.c @@ -0,0 +1,785 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Synopsys SUNWAY 8250 driver. + * + * Copyright 2011 Picochip, Jamie Iles. + * Copyright 2013 Intel Corporation + * + * The Synopsys SUNWAY 8250 has an extra feature whereby it detects if the + * LCR is written whilst busy. If it is, then a busy detect interrupt is + * raised, the LCR needs to be rewritten and the uart status register read. + */ +#include <linux/delay.h> +#include <linux/device.h> +#include <linux/io.h> +#include <linux/module.h> +#include <linux/serial_8250.h> +#include <linux/serial_reg.h> +#include <linux/of.h> +#include <linux/of_irq.h> +#include <linux/of_platform.h> +#include <linux/platform_device.h> +#include <linux/slab.h> +#include <linux/acpi.h> +#include <linux/clk.h> +#include <linux/reset.h> +#include <linux/pm_runtime.h> + +#include <asm/byteorder.h> + +#include "8250.h" + +/* Offsets for the DesignWare specific registers */ +#define SUNWAY_UART_USR 0x1f /* UART Status Register */ +#define SUNWAY_UART_DLF 0xc0 /* Divisor Latch Fraction Register */ +#define SUNWAY_UART_CPR 0xf4 /* Component Parameter Register */ +#define SUNWAY_UART_UCV 0xf8 /* UART Component Version */ + +/* Component Parameter Register bits */ +#define SUNWAY_UART_CPR_ABP_DATA_WIDTH (3 << 0) +#define SUNWAY_UART_CPR_AFCE_MODE (1 << 4) +#define SUNWAY_UART_CPR_THRE_MODE (1 << 5) +#define SUNWAY_UART_CPR_SIR_MODE (1 << 6) +#define SUNWAY_UART_CPR_SIR_LP_MODE (1 << 7) +#define SUNWAY_UART_CPR_ADDITIONAL_FEATURES (1 << 8) +#define SUNWAY_UART_CPR_FIFO_ACCESS (1 << 9) +#define SUNWAY_UART_CPR_FIFO_STAT (1 << 10) +#define SUNWAY_UART_CPR_SHADOW (1 << 11) +#define SUNWAY_UART_CPR_ENCODED_PARMS (1 << 12) +#define SUNWAY_UART_CPR_DMA_EXTRA (1 << 13) +#define SUNWAY_UART_CPR_FIFO_MODE (0xff << 16) +/* Helper for fifo size calculation */ +#define SUNWAY_UART_CPR_FIFO_SIZE(a) (((a >> 16) & 0xff) * 16) + +/* DesignWare specific register fields */ +#define SUNWAY_UART_MCR_SIRE BIT(6) + +struct sunway8250_data { + u8 usr_reg; + u8 dlf_size; + int line; + int msr_mask_on; + int msr_mask_off; + struct clk *clk; + struct clk *pclk; + struct reset_control *rst; + struct uart_8250_dma dma; + + unsigned int skip_autocfg:1; + unsigned int uart_16550_compatible:1; +}; + +static inline u32 sunway8250_readl_ext(struct uart_port *p, int offset) +{ + if (p->iotype == UPIO_MEM32BE) + return ioread32be(p->membase + offset); + return readl(p->membase + offset); +} + +static inline void sunway8250_writel_ext(struct uart_port *p, int offset, u32 reg) +{ + if (p->iotype == UPIO_MEM32BE) + iowrite32be(reg, p->membase + offset); + else + writel(reg, p->membase + offset); +} + +static inline int sunway8250_modify_msr(struct uart_port *p, int offset, int value) +{ + struct sunway8250_data *d = p->private_data; + + /* Override any modem control signals if needed */ + if (offset == UART_MSR) { + value |= d->msr_mask_on; + value &= ~d->msr_mask_off; + } + + return value; +} + +static void sunway8250_force_idle(struct uart_port *p) +{ + struct uart_8250_port *up = up_to_u8250p(p); + + serial8250_clear_and_reinit_fifos(up); + (void)p->serial_in(p, UART_RX); +} + +static void sunway8250_check_lcr(struct uart_port *p, int value) +{ + void __iomem *offset = p->membase + (UART_LCR << p->regshift); + int tries = 1000; + + /* Make sure LCR write wasn't ignored */ + while 
(tries--) { + unsigned int lcr = p->serial_in(p, UART_LCR); + + if ((value & ~UART_LCR_SPAR) == (lcr & ~UART_LCR_SPAR)) + return; + + sunway8250_force_idle(p); + +#ifdef CONFIG_64BIT + if (p->type == PORT_OCTEON) + __raw_writeq(value & 0xff, offset); + else +#endif + if (p->iotype == UPIO_MEM32) + writel(value, offset); + else if (p->iotype == UPIO_MEM32BE) + iowrite32be(value, offset); + else + writeb(value, offset); + } + /* + * FIXME: this deadlocks if port->lock is already held + * dev_err(p->dev, "Couldn't set LCR to %d\n", value); + */ +} + +/* Returns once the transmitter is empty or we run out of retries */ +static void sunway8250_tx_wait_empty(struct uart_port *p) +{ + unsigned int tries = 20000; + unsigned int delay_threshold = tries - 1000; + unsigned int lsr; + + while (tries--) { + lsr = readb(p->membase + (UART_LSR << p->regshift)); + if (lsr & UART_LSR_TEMT) + break; + + /* + * The device is first given a chance to empty without delay, + * to avoid slowdowns at high bitrates. If after 1000 tries + * the buffer has still not emptied, allow more time for low- + * speed links. + */ + if (tries < delay_threshold) + udelay(1); + } +} + +static void sunway8250_serial_out38x(struct uart_port *p, int offset, int value) +{ + struct sunway8250_data *d = p->private_data; + + /* Allow the TX to drain before we reconfigure */ + if (offset == UART_LCR) + sunway8250_tx_wait_empty(p); + + writeb(value, p->membase + (offset << p->regshift)); + + if (offset == UART_LCR && !d->uart_16550_compatible) + sunway8250_check_lcr(p, value); +} + + +static void sunway8250_serial_out(struct uart_port *p, int offset, int value) +{ + struct sunway8250_data *d = p->private_data; + + writeb(value, p->membase + (offset << p->regshift)); + + if (offset == UART_LCR && !d->uart_16550_compatible) + sunway8250_check_lcr(p, value); +} + +static unsigned int sunway8250_serial_in(struct uart_port *p, int offset) +{ + unsigned int value = readb(p->membase + (offset << p->regshift)); + + return sunway8250_modify_msr(p, offset, value); +} + +#ifdef CONFIG_64BIT +static unsigned int sunway8250_serial_inq(struct uart_port *p, int offset) +{ + unsigned int value; + + value = (u8)__raw_readq(p->membase + (offset << p->regshift)); + + return sunway8250_modify_msr(p, offset, value); +} + +static void sunway8250_serial_outq(struct uart_port *p, int offset, int value) +{ + struct sunway8250_data *d = p->private_data; + + value &= 0xff; + __raw_writeq(value, p->membase + (offset << p->regshift)); + /* Read back to ensure register write ordering. 
*/ + __raw_readq(p->membase + (UART_LCR << p->regshift)); + + if (offset == UART_LCR && !d->uart_16550_compatible) + sunway8250_check_lcr(p, value); +} +#endif /* CONFIG_64BIT */ + +static void sunway8250_serial_out32(struct uart_port *p, int offset, int value) +{ + struct sunway8250_data *d = p->private_data; + + writel(value, p->membase + (offset << p->regshift)); + + if (offset == UART_LCR && !d->uart_16550_compatible) + sunway8250_check_lcr(p, value); +} + +static unsigned int sunway8250_serial_in32(struct uart_port *p, int offset) +{ + unsigned int value = readl(p->membase + (offset << p->regshift)); + + return sunway8250_modify_msr(p, offset, value); +} + +static void sunway8250_serial_out32be(struct uart_port *p, int offset, int value) +{ + struct sunway8250_data *d = p->private_data; + + iowrite32be(value, p->membase + (offset << p->regshift)); + + if (offset == UART_LCR && !d->uart_16550_compatible) + sunway8250_check_lcr(p, value); +} + +static unsigned int sunway8250_serial_in32be(struct uart_port *p, int offset) +{ + unsigned int value = ioread32be(p->membase + (offset << p->regshift)); + + return sunway8250_modify_msr(p, offset, value); +} + + +static int sunway8250_handle_irq(struct uart_port *p) +{ + struct uart_8250_port *up = up_to_u8250p(p); + struct sunway8250_data *d = p->private_data; + unsigned int iir = p->serial_in(p, UART_IIR); + unsigned int status; + unsigned long flags; + + /* + * There are ways to get Designware-based UARTs into a state where + * they are asserting UART_IIR_RX_TIMEOUT but there is no actual + * data available. If we see such a case then we'll do a bogus + * read. If we don't do this then the "RX TIMEOUT" interrupt will + * fire forever. + * + * This problem has only been observed so far when not in DMA mode + * so we limit the workaround only to non-DMA mode. 
+ */ + if (!up->dma && ((iir & 0x3f) == UART_IIR_RX_TIMEOUT)) { + spin_lock_irqsave(&p->lock, flags); + status = p->serial_in(p, UART_LSR); + + if (!(status & (UART_LSR_DR | UART_LSR_BI))) + (void) p->serial_in(p, UART_RX); + + spin_unlock_irqrestore(&p->lock, flags); + } + + if (serial8250_handle_irq(p, iir)) + return 1; + + if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) { + /* Clear the USR */ + (void)p->serial_in(p, d->usr_reg); + + return 1; + } + + return 0; +} + +static void +sunway8250_do_pm(struct uart_port *port, unsigned int state, unsigned int old) +{ + if (!state) + pm_runtime_get_sync(port->dev); + + serial8250_do_pm(port, state, old); + + if (state) + pm_runtime_put_sync_suspend(port->dev); +} + +static void sunway8250_set_termios(struct uart_port *p, struct ktermios *termios, + struct ktermios *old) +{ + unsigned int baud = tty_termios_baud_rate(termios); + struct sunway8250_data *d = p->private_data; + long rate; + int ret; + + if (IS_ERR(d->clk)) + goto out; + + clk_disable_unprepare(d->clk); + rate = clk_round_rate(d->clk, baud * 16); + if (rate < 0) + ret = rate; + else if (rate == 0) + ret = -ENOENT; + else + ret = clk_set_rate(d->clk, rate); + clk_prepare_enable(d->clk); + + if (!ret) + p->uartclk = rate; + +out: + p->status &= ~UPSTAT_AUTOCTS; + if (termios->c_cflag & CRTSCTS) + p->status |= UPSTAT_AUTOCTS; + + serial8250_do_set_termios(p, termios, old); +} + +static void sunway8250_set_ldisc(struct uart_port *p, struct ktermios *termios) +{ + struct uart_8250_port *up = up_to_u8250p(p); + unsigned int mcr = p->serial_in(p, UART_MCR); + + if (up->capabilities & UART_CAP_IRDA) { + if (termios->c_line == N_IRDA) + mcr |= SUNWAY_UART_MCR_SIRE; + else + mcr &= ~SUNWAY_UART_MCR_SIRE; + + p->serial_out(p, UART_MCR, mcr); + } + serial8250_do_set_ldisc(p, termios); +} + +/* + * sunway8250_fallback_dma_filter will prevent the UART from getting just any free + * channel on platforms that have DMA engines, but don't have any channels + * assigned to the UART. + * + * REVISIT: This is a work around for limitation in the DMA Engine API. Once the + * core problem is fixed, this function is no longer needed. 
+ */ +static bool sunway8250_fallback_dma_filter(struct dma_chan *chan, void *param) +{ + return false; +} + +static bool sunway8250_idma_filter(struct dma_chan *chan, void *param) +{ + return param == chan->device->dev; +} + +/* + * divisor = div(I) + div(F) + * "I" means integer, "F" means fractional + * quot = div(I) = clk / (16 * baud) + * frac = div(F) * 2^dlf_size + * + * let rem = clk % (16 * baud) + * we have: div(F) * (16 * baud) = rem + * so frac = 2^dlf_size * rem / (16 * baud) = (rem << dlf_size) / (16 * baud) + */ +static unsigned int sunway8250_get_divisor(struct uart_port *p, + unsigned int baud, + unsigned int *frac) +{ + unsigned int quot, rem, base_baud = baud * 16; + struct sunway8250_data *d = p->private_data; + + quot = p->uartclk / base_baud; + rem = p->uartclk % base_baud; + *frac = DIV_ROUND_CLOSEST(rem << d->dlf_size, base_baud); + + return quot; +} + +static void sunway8250_set_divisor(struct uart_port *p, unsigned int baud, + unsigned int quot, unsigned int quot_frac) +{ + sunway8250_writel_ext(p, SUNWAY_UART_DLF, quot_frac); + serial8250_do_set_divisor(p, baud, quot, quot_frac); +} + +static void sunway8250_quirks(struct uart_port *p, struct sunway8250_data *data) +{ + if (p->dev->of_node) { + struct device_node *np = p->dev->of_node; + int id; + + /* get index of serial line, if found in DT aliases */ + id = of_alias_get_id(np, "serial"); + if (id >= 0) + p->line = id; +#ifdef CONFIG_64BIT + if (of_device_is_compatible(np, "cavium,octeon-3860-uart")) { + p->serial_in = sunway8250_serial_inq; + p->serial_out = sunway8250_serial_outq; + p->flags = UPF_SKIP_TEST | UPF_SHARE_IRQ | UPF_FIXED_TYPE; + p->type = PORT_OCTEON; + data->usr_reg = 0x27; + data->skip_autocfg = true; + } +#endif + if (of_device_is_big_endian(p->dev->of_node)) { + p->iotype = UPIO_MEM32BE; + p->serial_in = sunway8250_serial_in32be; + p->serial_out = sunway8250_serial_out32be; + } + if (of_device_is_compatible(np, "marvell,armada-38x-uart")) + p->serial_out = sunway8250_serial_out38x; + + } else if (acpi_dev_present("APMC0D08", NULL, -1)) { + p->iotype = UPIO_MEM32; + p->regshift = 2; + p->serial_in = sunway8250_serial_in32; + data->uart_16550_compatible = true; + } + + /* Platforms with iDMA 64-bit */ + if (platform_get_resource_byname(to_platform_device(p->dev), + IORESOURCE_MEM, "lpss_priv")) { + data->dma.rx_param = p->dev->parent; + data->dma.tx_param = p->dev->parent; + data->dma.fn = sunway8250_idma_filter; + } +} + +static void sunway8250_setup_port(struct uart_port *p) +{ + struct uart_8250_port *up = up_to_u8250p(p); + u32 reg; + + /* + * If the Component Version Register returns zero, we know that + * ADDITIONAL_FEATURES are not enabled. No need to go any further. 
+ */ + reg = sunway8250_readl_ext(p, SUNWAY_UART_UCV); + if (!reg) + return; + + dev_dbg(p->dev, "Designware UART version %c.%c%c\n", + (reg >> 24) & 0xff, (reg >> 16) & 0xff, (reg >> 8) & 0xff); + + sunway8250_writel_ext(p, SUNWAY_UART_DLF, ~0U); + reg = sunway8250_readl_ext(p, SUNWAY_UART_DLF); + sunway8250_writel_ext(p, SUNWAY_UART_DLF, 0); + + if (reg) { + struct sunway8250_data *d = p->private_data; + + d->dlf_size = fls(reg); + p->get_divisor = sunway8250_get_divisor; + p->set_divisor = sunway8250_set_divisor; + } + + reg = sunway8250_readl_ext(p, SUNWAY_UART_CPR); + if (!reg) + return; + + /* Select the type based on fifo */ + if (reg & SUNWAY_UART_CPR_FIFO_MODE) { + p->type = PORT_16550A; + p->flags |= UPF_FIXED_TYPE; + p->fifosize = SUNWAY_UART_CPR_FIFO_SIZE(reg); + up->capabilities = UART_CAP_FIFO; + } + + if (reg & SUNWAY_UART_CPR_AFCE_MODE) + up->capabilities |= UART_CAP_AFE; + + if (reg & SUNWAY_UART_CPR_SIR_MODE) + up->capabilities |= UART_CAP_IRDA; +} + +static int sunway8250_probe(struct platform_device *pdev) +{ + struct uart_8250_port uart = {}; + struct resource *regs = platform_get_resource(pdev, IORESOURCE_MEM, 0); + int irq = platform_get_irq(pdev, 0); + struct uart_port *p = &uart.port; + struct device *dev = &pdev->dev; + struct sunway8250_data *data; + int err; + u32 val; + + if (!regs) { + dev_err(dev, "no registers defined\n"); + return -EINVAL; + } + + if (irq < 0) { + if (irq != -EPROBE_DEFER) + dev_err(dev, "cannot get irq\n"); + irq = 0; // Set serial poll mode + } + + spin_lock_init(&p->lock); + p->mapbase = regs->start; + p->irq = irq; + p->handle_irq = sunway8250_handle_irq; + p->pm = sunway8250_do_pm; + p->type = PORT_8250; + p->flags = UPF_SHARE_IRQ | UPF_FIXED_PORT; + p->dev = dev; + p->iotype = UPIO_MEM; + p->serial_in = sunway8250_serial_in; + p->serial_out = sunway8250_serial_out; + p->set_ldisc = sunway8250_set_ldisc; + p->set_termios = sunway8250_set_termios; + + p->membase = devm_ioremap(dev, regs->start, resource_size(regs)); + if (!p->membase) + return -ENOMEM; + + data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); + if (!data) + return -ENOMEM; + + data->dma.fn = sunway8250_fallback_dma_filter; + data->usr_reg = SUNWAY_UART_USR; + p->private_data = data; + + data->uart_16550_compatible = device_property_read_bool(dev, + "snps,uart-16550-compatible"); + + err = device_property_read_u32(dev, "reg-shift", &val); + if (!err) + p->regshift = val; + + err = device_property_read_u32(dev, "reg-io-width", &val); + if (!err && val == 4) { + p->iotype = UPIO_MEM32; + p->serial_in = sunway8250_serial_in32; + p->serial_out = sunway8250_serial_out32; + } + + if (device_property_read_bool(dev, "dcd-override")) { + /* Always report DCD as active */ + data->msr_mask_on |= UART_MSR_DCD; + data->msr_mask_off |= UART_MSR_DDCD; + } + + if (device_property_read_bool(dev, "dsr-override")) { + /* Always report DSR as active */ + data->msr_mask_on |= UART_MSR_DSR; + data->msr_mask_off |= UART_MSR_DDSR; + } + + if (device_property_read_bool(dev, "cts-override")) { + /* Always report CTS as active */ + data->msr_mask_on |= UART_MSR_CTS; + data->msr_mask_off |= UART_MSR_DCTS; + } + + if (device_property_read_bool(dev, "ri-override")) { + /* Always report Ring indicator as inactive */ + data->msr_mask_off |= UART_MSR_RI; + data->msr_mask_off |= UART_MSR_TERI; + } + + /* Always ask for fixed clock rate from a property. */ + device_property_read_u32(dev, "clock-frequency", &p->uartclk); + + /* If there is separate baudclk, get the rate from it. 
*/ + data->clk = devm_clk_get(dev, "baudclk"); + if (IS_ERR(data->clk) && PTR_ERR(data->clk) != -EPROBE_DEFER) + data->clk = devm_clk_get(dev, NULL); + if (IS_ERR(data->clk) && PTR_ERR(data->clk) == -EPROBE_DEFER) + return -EPROBE_DEFER; + if (!IS_ERR_OR_NULL(data->clk)) { + err = clk_prepare_enable(data->clk); + if (err) + dev_warn(dev, "could not enable optional baudclk: %d\n", + err); + else + p->uartclk = clk_get_rate(data->clk); + } + + /* If no clock rate is defined, fail. */ + if (!p->uartclk) { + dev_err(dev, "clock rate not defined\n"); + err = -EINVAL; + goto err_clk; + } + + data->pclk = devm_clk_get(dev, "apb_pclk"); + if (IS_ERR(data->pclk) && PTR_ERR(data->pclk) == -EPROBE_DEFER) { + err = -EPROBE_DEFER; + goto err_clk; + } + if (!IS_ERR(data->pclk)) { + err = clk_prepare_enable(data->pclk); + if (err) { + dev_err(dev, "could not enable apb_pclk\n"); + goto err_clk; + } + } + + data->rst = devm_reset_control_get_optional_exclusive(dev, NULL); + if (IS_ERR(data->rst)) { + err = PTR_ERR(data->rst); + goto err_pclk; + } + reset_control_deassert(data->rst); + + sunway8250_quirks(p, data); + + /* If the Busy Functionality is not implemented, don't handle it */ + if (data->uart_16550_compatible) + p->handle_irq = NULL; + + if (!data->skip_autocfg) + sunway8250_setup_port(p); + + /* If we have a valid fifosize, try hooking up DMA */ + if (p->fifosize) { + data->dma.rxconf.src_maxburst = p->fifosize / 4; + data->dma.txconf.dst_maxburst = p->fifosize / 4; + uart.dma = &data->dma; + } + + data->line = serial8250_register_8250_port(&uart); + if (data->line < 0) { + err = data->line; + goto err_reset; + } + + platform_set_drvdata(pdev, data); + + pm_runtime_set_active(dev); + pm_runtime_enable(dev); + + return 0; + +err_reset: + reset_control_assert(data->rst); + +err_pclk: + if (!IS_ERR(data->pclk)) + clk_disable_unprepare(data->pclk); + +err_clk: + if (!IS_ERR(data->clk)) + clk_disable_unprepare(data->clk); + + return err; +} + +static int sunway8250_remove(struct platform_device *pdev) +{ + struct sunway8250_data *data = platform_get_drvdata(pdev); + + pm_runtime_get_sync(&pdev->dev); + + serial8250_unregister_port(data->line); + + reset_control_assert(data->rst); + + if (!IS_ERR(data->pclk)) + clk_disable_unprepare(data->pclk); + + if (!IS_ERR(data->clk)) + clk_disable_unprepare(data->clk); + + pm_runtime_disable(&pdev->dev); + pm_runtime_put_noidle(&pdev->dev); + + return 0; +} + +#ifdef CONFIG_PM_SLEEP +static int sunway8250_suspend(struct device *dev) +{ + struct sunway8250_data *data = dev_get_drvdata(dev); + + serial8250_suspend_port(data->line); + + return 0; +} + +static int sunway8250_resume(struct device *dev) +{ + struct sunway8250_data *data = dev_get_drvdata(dev); + + serial8250_resume_port(data->line); + + return 0; +} +#endif /* CONFIG_PM_SLEEP */ + +#ifdef CONFIG_PM +static int sunway8250_runtime_suspend(struct device *dev) +{ + struct sunway8250_data *data = dev_get_drvdata(dev); + + if (!IS_ERR(data->clk)) + clk_disable_unprepare(data->clk); + + if (!IS_ERR(data->pclk)) + clk_disable_unprepare(data->pclk); + + return 0; +} + +static int sunway8250_runtime_resume(struct device *dev) +{ + struct sunway8250_data *data = dev_get_drvdata(dev); + + if (!IS_ERR(data->pclk)) + clk_prepare_enable(data->pclk); + + if (!IS_ERR(data->clk)) + clk_prepare_enable(data->clk); + + return 0; +} +#endif + +static const struct dev_pm_ops sunway8250_pm_ops = { + SET_SYSTEM_SLEEP_PM_OPS(sunway8250_suspend, sunway8250_resume) + SET_RUNTIME_PM_OPS(sunway8250_runtime_suspend, 
sunway8250_runtime_resume, NULL) +}; + +static const struct of_device_id sunway8250_of_match[] = { + { .compatible = "sw6,sunway-apb-uart" }, + { .compatible = "cavium,octeon-3860-uart" }, + { .compatible = "marvell,armada-38x-uart" }, + { .compatible = "renesas,rzn1-uart" }, + { /* Sentinel */ } +}; +MODULE_DEVICE_TABLE(of, sunway8250_of_match); + +static const struct acpi_device_id sunway8250_acpi_match[] = { + { "INT33C4", 0 }, + { "INT33C5", 0 }, + { "INT3434", 0 }, + { "INT3435", 0 }, + { "80860F0A", 0 }, + { "8086228A", 0 }, + { "APMC0D08", 0}, + { "AMD0020", 0 }, + { "AMDI0020", 0 }, + { "BRCM2032", 0 }, + { "HISI0031", 0 }, + { }, +}; +MODULE_DEVICE_TABLE(acpi, sunway8250_acpi_match); + +static struct platform_driver sunway8250_platform_driver = { + .driver = { + .name = "sunway-apb-uart", + .pm = &sunway8250_pm_ops, + .of_match_table = sunway8250_of_match, + .acpi_match_table = ACPI_PTR(sunway8250_acpi_match), + }, + .probe = sunway8250_probe, + .remove = sunway8250_remove, +}; + +module_platform_driver(sunway8250_platform_driver); + +MODULE_AUTHOR("Jamie Iles"); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Synopsys DesignWare 8250 serial port driver"); +MODULE_ALIAS("platform:sunway-apb-uart"); diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig index 603137da4736..26a5d72391d9 100644 --- a/drivers/tty/serial/8250/Kconfig +++ b/drivers/tty/serial/8250/Kconfig @@ -374,6 +374,13 @@ config SERIAL_8250_DW Selecting this option will enable handling of the extra features present in the Synopsys DesignWare APB UART.
+config SERIAL_8250_SUNWAY + tristate "Support for SW6B Builtin Synopsys DesignWare 8250 quirks" + depends on SERIAL_8250 && SW64 + help + Selecting this option will enable handling of the extra features + present in the Synopsys DesignWare APB UART of SW6. + config SERIAL_8250_EM tristate "Support for Emma Mobile integrated serial port" depends on SERIAL_8250 && ARM && HAVE_CLK diff --git a/drivers/tty/serial/8250/Makefile b/drivers/tty/serial/8250/Makefile index a8bfb654d490..939c06da250f 100644 --- a/drivers/tty/serial/8250/Makefile +++ b/drivers/tty/serial/8250/Makefile @@ -27,6 +27,7 @@ obj-$(CONFIG_SERIAL_8250_HUB6) += 8250_hub6.o obj-$(CONFIG_SERIAL_8250_FSL) += 8250_fsl.o obj-$(CONFIG_SERIAL_8250_MEN_MCB) += 8250_men_mcb.o obj-$(CONFIG_SERIAL_8250_DW) += 8250_dw.o +obj-$(CONFIG_SERIAL_8250_SUNWAY) += 8250_sunway.o obj-$(CONFIG_SERIAL_8250_EM) += 8250_em.o obj-$(CONFIG_SERIAL_8250_IOC3) += 8250_ioc3.o obj-$(CONFIG_SERIAL_8250_OMAP) += 8250_omap.o
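The fractional-divisor comment in sunway8250_get_divisor() above can be checked numerically. The sketch below reruns the same formula with an assumed 24 MHz uartclk, 115200 baud and a 6-bit divisor latch fraction; the numbers are illustrative and not taken from the patch.

#include <stdio.h>

#define DIV_ROUND_CLOSEST(x, d)	(((x) + (d) / 2) / (d))

int main(void)
{
	unsigned int uartclk = 24000000, baud = 115200, dlf_size = 6;
	unsigned int base = baud * 16;		/* 1843200 */
	unsigned int quot = uartclk / base;	/* 13 */
	unsigned int rem = uartclk % base;	/* 38400 */
	unsigned int frac = DIV_ROUND_CLOSEST(rem << dlf_size, base);	/* 1 */

	/* effective divisor = quot + frac / 2^dlf_size = 13 + 1/64 */
	double div = quot + (double)frac / (1u << dlf_size);

	printf("quot=%u frac=%u -> actual baud %.1f\n",
	       quot, frac, uartclk / (16.0 * div));	/* ~115246 */
	return 0;
}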
From: Gu Zitao guzitao@wxiat.com
Sunway inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4SPZD CVE: NA
-------------------------------
The Sunway IOMMU driver is compatible with the notion of a type-1 IOMMU in VFIO.
This patch allows VFIO_IOMMU_TYPE1 to be selected if SW64=y.
Signed-off-by: Gu Zitao guzitao@wxiat.com #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/vfio/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index e44bf736e2b2..985bbb09c8d9 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -22,7 +22,7 @@ config VFIO_VIRQFD menuconfig VFIO tristate "VFIO Non-Privileged userspace driver framework" select IOMMU_API - select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) + select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64 || SW64) help VFIO provides a framework for secure userspace device drivers. See Documentation/driver-api/vfio.rst for more details.
From: Mao HongBo maohongbo@phytium.com.cn
phytium inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ
----------------------------------------------
Fix an IOMMU issue with device access in virtualization scenarios on FT2000PLUS and S2500.
Also convert to the new Phytium cputype macro naming.
Signed-off-by: Mao HongBo maohongbo@phytium.com.cn Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/include/asm/cputype.h | 14 +++++-- .../arm64/include/asm/phytium_machine_types.h | 37 +++++++++++++++++++ arch/arm64/kernel/topology.c | 2 +- drivers/iommu/arm/arm-smmu/arm-smmu.c | 32 +++++++++++++--- drivers/irqchip/irq-gic-v3-its.c | 9 +++++ drivers/pci/quirks.c | 6 +++ drivers/usb/host/xhci-pci.c | 2 +- 7 files changed, 90 insertions(+), 12 deletions(-) create mode 100644 arch/arm64/include/asm/phytium_machine_types.h
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index 4e5b34c8ff54..896c2635c411 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -101,8 +101,11 @@ #define HISI_CPU_PART_TSV110 0xD01 #define HISI_CPU_PART_TSV200 0xD02
-#define PHYTIUM_CPU_PART_FTC662 0x662 -#define PHYTIUM_CPU_PART_FTC663 0x663 +#define PHYTIUM_CPU_PART_1500A 0X660 +#define PHYTIUM_CPU_PART_2000AHK 0X661 +#define PHYTIUM_CPU_PART_2000PLUS 0X662 +#define PHYTIUM_CPU_PART_2004 0X663 +#define PHYTIUM_CPU_PART_2500 0X663
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) @@ -133,8 +136,11 @@ #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX) #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110) #define MIDR_HISI_TSV200 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV200) -#define MIDR_PHYTIUM_FT2000PLUS MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_FTC662) -#define MIDR_PHYTIUM_FT2500 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_FTC663) +#define MIDR_FT_1500A MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_1500A) +#define MIDR_FT_2000AHK MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2000AHK) +#define MIDR_FT_2000PLUS MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2000PLUS) +#define MIDR_FT_2004 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2004) +#define MIDR_FT_2500 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2500)
/* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */ #define MIDR_FUJITSU_ERRATUM_010001 MIDR_FUJITSU_A64FX diff --git a/arch/arm64/include/asm/phytium_machine_types.h b/arch/arm64/include/asm/phytium_machine_types.h new file mode 100644 index 000000000000..68dc9402f93e --- /dev/null +++ b/arch/arm64/include/asm/phytium_machine_types.h @@ -0,0 +1,37 @@ +/* + * Authors: Wang Yinfeng wangyinfenng@phytium.com.cn + * + * Copyright (C) 2021, PHYTIUM Information Technology Co., Ltd. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see http://www.gnu.org/licenses/. + */ + +#ifndef __PHYTIUM_MACHINE_TYPE_H__ +#define __PHYTIUM_MACHINE_TYPE_H__ + +#include <asm/cputype.h> +#include <linux/types.h> + +static inline bool phytium_part(u32 cpuid) +{ + return ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) == cpuid); +} + +#define typeof_ft1500a() phytium_part(MIDR_FT_1500A) +#define typeof_ft2000ahk() phytium_part(MIDR_FT_2000AHK) +#define typeof_ft2000plus() phytium_part(MIDR_FT_2000PLUS) +#define typeof_ft2004() phytium_part(MIDR_FT_2004) +#define typeof_s2500() phytium_part(MIDR_FT_2500) + +#endif diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 6f1a8edd79ef..3084a81250fe 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -55,7 +55,7 @@ void store_cpu_topology(unsigned int cpuid) cpuid_topo->package_id = cpu_to_node(cpuid);
/* Some PHYTIUM FT2000PLUS platform firmware has no PPTT table */ - if ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2000PLUS + if ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) == MIDR_FT_2000PLUS && cpu_to_node(cpuid) == NUMA_NO_NODE) { cpuid_topo->thread_id = 0; cpuid_topo->package_id = 0; diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 7d28c426a2f8..ff9a1def04fd 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -43,6 +43,9 @@
#include "arm-smmu.h"
+#ifdef CONFIG_ARCH_PHYTIUM +#include <asm/phytium_machine_types.h> +#endif /* * Apparently, some Qualcomm arm64 platforms which appear to expose their SMMU * global register space are still, in fact, using a hypervisor to mediate it @@ -54,6 +57,7 @@
#define MSI_IOVA_BASE 0x8000000 #define MSI_IOVA_LENGTH 0x100000 +#define SMR_MASK_SHIFT 16
static int force_stage; module_param(force_stage, int, S_IRUGO); @@ -1372,6 +1376,20 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) return ERR_PTR(-ENODEV); }
+#ifdef CONFIG_ARCH_PHYTIUM + /* ft2000+ */ + if (typeof_ft2000plus()) { + int num = fwspec->num_ids; + + for (i = 0; i < num; i++) { +#define FWID_READ(id) (((u16)(id) >> 3) | (((id) >> SMR_MASK_SHIFT | 0x7000) << SMR_MASK_SHIFT)) + u32 fwid = FWID_READ(fwspec->ids[i]); + + iommu_fwspec_add_ids(dev, &fwid, 1); + } + } +#endif + ret = -EINVAL; for (i = 0; i < fwspec->num_ids; i++) { u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]); @@ -1460,6 +1478,12 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) if (group && smmu->s2crs[idx].group && group != smmu->s2crs[idx].group) return ERR_PTR(-EINVAL); +#ifdef CONFIG_ARCH_PHYTIUM + if (typeof_s2500()) + break; + if (typeof_ft2000plus() && !smmu->s2crs[idx].group) + continue; +#endif
group = smmu->s2crs[idx].group; } @@ -1585,14 +1609,10 @@ static void arm_smmu_get_resv_regions(struct device *dev, iommu_dma_get_resv_regions(dev, head); }
-#ifdef CONFIG_ARM64 -#include <asm/cputype.h> +#ifdef CONFIG_ARCH_PHYTIUM static bool cpu_using_identity_iommu_domain(struct device *dev) { - u32 midr = read_cpuid_id(); - - if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2000PLUS) - || ((midr & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2500)) + if (typeof_ft2000plus() || typeof_s2500()) return true;
return false; diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 7b6ef0d0c4f6..6b46cfdcb402 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -37,6 +37,10 @@ #include <asm/cputype.h> #include <asm/exception.h>
+#ifdef CONFIG_ARCH_PHYTIUM +#include <asm/phytium_machine_types.h> +#endif + #include "irq-gic-common.h"
#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0) @@ -1696,6 +1700,11 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg) msg->address_hi = upper_32_bits(addr); msg->data = its_get_event_id(d);
+#ifdef CONFIG_ARCH_PHYTIUM + if (typeof_ft2000plus()) + return; +#endif + iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg); }
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 9d93e681d4c5..ae8782e3fae5 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4955,6 +4955,12 @@ static const struct pci_dev_acs_enabled { { PCI_VENDOR_ID_NXP, 0x8d9b, pci_quirk_nxp_rp_acs }, /* Zhaoxin Root/Downstream Ports */ { PCI_VENDOR_ID_ZHAOXIN, PCI_ANY_ID, pci_quirk_zhaoxin_pcie_ports_acs }, +#ifdef CONFIG_ARCH_PHYTIUM + /* because PLX switch Vendor id is 0x10b5 on phytium cpu */ + { 0x10b5, PCI_ANY_ID, pci_quirk_xgene_acs }, + /* because rootcomplex Vendor id is 0x17cd on phytium cpu */ + { 0x17cd, PCI_ANY_ID, pci_quirk_xgene_acs }, +#endif { 0 } };
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index 188dcfa2c7ee..d2dddc6e8a53 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -405,7 +405,7 @@ static void phytium_xhci_pci_workaround(struct pci_dev *dev) u32 midr = read_cpuid_id();
/* Firmware bug, DMA mask is not reported by the firmware */ - if ((midr & MIDR_CPU_MODEL_MASK) == MIDR_PHYTIUM_FT2000PLUS) + if ((midr & MIDR_CPU_MODEL_MASK) == MIDR_FT_2000PLUS) dma_set_mask(&dev->dev, DMA_BIT_MASK(64)); } #else
From: Wang Yinfeng wangyinfeng@phytium.com.cn
phytium inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4RK58 CVE: NA
--------------------------------
The system hangs when the Phytium S2500 communicates with some BMCs after several rounds of transactions, unless the controller's timeout counter is manually reset by calling into firmware through SMC.
Signed-off-by: Wang Yinfeng wangyinfeng@phytium.com.cn Signed-off-by: Chen Baozi chenbaozi@phytium.com.cn #openEuler_contributor Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/ipmi/ipmi_si_mem_io.c | 76 ++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+)
diff --git a/drivers/char/ipmi/ipmi_si_mem_io.c b/drivers/char/ipmi/ipmi_si_mem_io.c index 86b92e93a70d..8ad0c78e51dd 100644 --- a/drivers/char/ipmi/ipmi_si_mem_io.c +++ b/drivers/char/ipmi/ipmi_si_mem_io.c @@ -3,9 +3,77 @@ #include <linux/io.h> #include "ipmi_si.h"
+#ifdef CONFIG_ARM_GIC_PHYTIUM_2500 +#include <linux/arm-smccc.h> + +#define CTL_RST_FUNC_ID 0xC2000011 + +static bool apply_phytium2500_workaround; + +struct ipmi_workaround_oem_info { + char oem_id[ACPI_OEM_ID_SIZE + 1]; +}; + +#ifdef CONFIG_ACPI +static struct ipmi_workaround_oem_info wa_info[] = { + { + .oem_id = "KPSVVJ", + } +}; +#endif + +static void ipmi_check_phytium_workaround(void) +{ +#ifdef CONFIG_ACPI + struct acpi_table_header tbl; + int i; + + if (ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_DSDT, 0, &tbl))) + return; + + for (i = 0; i < ARRAY_SIZE(wa_info); i++) { + if (strncmp(wa_info[i].oem_id, tbl.oem_id, ACPI_OEM_ID_SIZE)) + continue; + + apply_phytium2500_workaround = true; + break; + } +#endif +} + +static void ctl_smc(unsigned long arg0, unsigned long arg1, + unsigned long arg2, unsigned long arg3) +{ + struct arm_smccc_res res; + + arm_smccc_smc(arg0, arg1, arg2, arg3, 0, 0, 0, 0, &res); + if (res.a0 != 0) + pr_err("Error: Firmware call SMC reset Failed: %d, addr: 0x%lx\n", + (int)res.a0, arg2); +} + +static void ctl_timeout_reset(void) +{ + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x28100208, 0x1); + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x2810020C, 0x1); +} + +static inline void ipmi_phytium_workaround(void) +{ + if (apply_phytium2500_workaround) + ctl_timeout_reset(); +} + +#else +static inline void ipmi_check_phytium_workaround(void) {} +static inline void ipmi_phytium_workaround(void) {} +#endif + static unsigned char intf_mem_inb(const struct si_sm_io *io, unsigned int offset) { + ipmi_phytium_workaround(); + return readb((io->addr)+(offset * io->regspacing)); }
@@ -18,6 +86,8 @@ static void intf_mem_outb(const struct si_sm_io *io, unsigned int offset, static unsigned char intf_mem_inw(const struct si_sm_io *io, unsigned int offset) { + ipmi_phytium_workaround(); + return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift) & 0xff; } @@ -31,6 +101,8 @@ static void intf_mem_outw(const struct si_sm_io *io, unsigned int offset, static unsigned char intf_mem_inl(const struct si_sm_io *io, unsigned int offset) { + ipmi_phytium_workaround(); + return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift) & 0xff; } @@ -44,6 +116,8 @@ static void intf_mem_outl(const struct si_sm_io *io, unsigned int offset, #ifdef readq static unsigned char mem_inq(const struct si_sm_io *io, unsigned int offset) { + ipmi_phytium_workaround(); + return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift) & 0xff; } @@ -81,6 +155,8 @@ int ipmi_si_mem_setup(struct si_sm_io *io) if (!addr) return -ENODEV;
+ ipmi_check_phytium_workaround(); + /* * Figure out the actual readb/readw/readl/etc routine to use based * upon the register size.
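The structure of the workaround above is the classic detect-once/gate-per-access pattern: the ACPI OEM-ID check runs a single time at interface setup and latches a flag, and the hot MMIO accessors test only that flag. A user-space model of the pattern, with illustrative names rather than kernel API:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool apply_workaround;

/* one-time probe, analogous to ipmi_check_phytium_workaround() */
static void check_platform(const char *oem_id)
{
	apply_workaround = (strncmp(oem_id, "KPSVVJ", 6) == 0);
}

/* per-access gate, analogous to ipmi_phytium_workaround() in intf_mem_inb() */
static unsigned char reg_read(unsigned char raw)
{
	if (apply_workaround)
		puts("reset controller timeout counter via SMC");
	return raw;	/* stands in for readb()/readw()/readl() */
}

int main(void)
{
	check_platform("KPSVVJ");
	return reg_read(0x42) == 0x42 ? 0 : 1;
}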
From: Mao HongBo maohongbo@phytium.com.cn
phytium inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ CVE: NA
--------------------------------
Add support for kdump vmcore generation on the Phytium S2500 platform.
Signed-off-by: Mao HongBo maohongbo@phytium.com.cn Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/irqchip/irq-gic-phytium-2500-its.c | 37 ++++++++++++++++++++++ drivers/irqchip/irq-gic-phytium-2500.c | 21 ++++++++++++ 2 files changed, 58 insertions(+)
diff --git a/drivers/irqchip/irq-gic-phytium-2500-its.c b/drivers/irqchip/irq-gic-phytium-2500-its.c index 227fbc00c3da..4d2758fbad22 100644 --- a/drivers/irqchip/irq-gic-phytium-2500-its.c +++ b/drivers/irqchip/irq-gic-phytium-2500-its.c @@ -1679,6 +1679,23 @@ static int its_cpumask_select(struct its_device *its_dev, cpus = cpu; }
+ if (is_kdump_kernel()) { + skt = (cpu_logical_map(cpu) >> 16) & 0xff; + if (skt_id == skt) { + return cpu; + } + for (i = 0; i < nr_cpu_ids; i++) { + skt = (cpu_logical_map(i) >> 16) & 0xff; + if ((skt >= 0) && (skt < MAX_MARS3_SKT_COUNT)) { + if (skt_id == skt) { + return i; + } + } else if (0xff != skt) { + pr_err("socket address: %d is out of range.", skt); + } + } + } + return cpus; }
@@ -3012,6 +3029,9 @@ static bool enabled_lpis_allowed(void) phys_addr_t addr; u64 val;
+ if (is_kdump_kernel()) + return true; + /* Check whether the property table is in a reserved region */ val = gicr_read_propbaser(gic_data_rdist_rd_base() + GICR_PROPBASER); addr = val & GENMASK_ULL(51, 12); @@ -3655,6 +3675,23 @@ static int its_cpumask_first(struct its_device *its_dev, cpus = cpu; }
+ if (is_kdump_kernel()) { + skt = (cpu_logical_map(cpu) >> 16) & 0xff; + if (skt_id == skt) { + return cpu; + } + for (i = 0; i < nr_cpu_ids; i++) { + skt = (cpu_logical_map(i) >> 16) & 0xff; + if ((skt >= 0) && (skt < MAX_MARS3_SKT_COUNT)) { + if (skt_id == skt) { + return i; + } + } else if (0xff != skt) { + pr_err("socket address: %d is out of range.", skt); + } + } + } + return cpus; }
diff --git a/drivers/irqchip/irq-gic-phytium-2500.c b/drivers/irqchip/irq-gic-phytium-2500.c index ba0545fcee56..dbdb778b5b4b 100644 --- a/drivers/irqchip/irq-gic-phytium-2500.c +++ b/drivers/irqchip/irq-gic-phytium-2500.c @@ -22,6 +22,7 @@ #include <linux/acpi.h> #include <linux/cpu.h> #include <linux/cpu_pm.h> +#include <linux/crash_dump.h> #include <linux/delay.h> #include <linux/interrupt.h> #include <linux/irqdomain.h> @@ -1348,6 +1349,23 @@ static int gic_cpumask_select(struct irq_data *d, const struct cpumask *mask_val cpus = cpu; }
+ if (is_kdump_kernel()) { + skt = (cpu_logical_map(cpu) >> 16) & 0xff; + if (irq_skt == skt) { + return cpu; + } + for (i = 0; i < nr_cpu_ids; i++) { + skt = (cpu_logical_map(i) >> 16) & 0xff; + if ((skt >= 0) && (skt < MAX_MARS3_SOC_COUNT)) { + if (irq_skt == skt) { + return i; + } + } else if (0xff != skt) { + pr_err("socket address: %d is out of range.", skt); + } + } + } + return cpus; }
@@ -2440,6 +2458,9 @@ gic_acpi_init(union acpi_subtable_headers *header, const unsigned long end)
#ifdef CONFIG_ACPI mars3_sockets_bitmap = gic_mars3_sockets_bitmap(); + if (is_kdump_kernel()) { + mars3_sockets_bitmap = 0x3; + } if (mars3_sockets_bitmap == 0) { mars3_sockets_bitmap = 0x1; pr_err("No socket, please check cpus MPIDR_AFFINITY_LEVEL!!!");
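The kdump branches above pick a target CPU by comparing bits [23:16] of each CPU's MPIDR (affinity level 2, which encodes the socket on this platform) against the socket the interrupt belongs to. A minimal sketch of that selection, with made-up MPIDR values:

#include <stdio.h>
#include <stdint.h>

#define MPIDR_AFF2(mpidr)	(((mpidr) >> 16) & 0xff)

int main(void)
{
	/* hypothetical logical-cpu -> MPIDR map: two sockets, two cores each */
	uint64_t cpu_logical_map[] = { 0x000000, 0x000100, 0x010000, 0x010100 };
	unsigned int irq_skt = 1, i;

	for (i = 0; i < 4; i++) {
		if (MPIDR_AFF2(cpu_logical_map[i]) == irq_skt) {
			printf("route to logical cpu %u\n", i);	/* cpu 2 */
			break;
		}
	}
	return 0;
}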
From: Mao HongBo maohongbo@phytium.com.cn
phytium inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ
--------------------------------------
On Phytium S2500 multi-socket servers, for example a 2-socket (2P) system with socket0 and socket1: if the storage device (such as a SAS controller and the disks the vmcore is saved to) is installed on socket1 and the second kernel brings up two CPUs, both on socket0, with nr_cpus=2, then the vmcore will fail to be saved to disk, because interrupts such as SPIs and LPIs (but not SGIs) cannot be delivered across CPU sockets on this server platform.
To avoid this issue, bypass all CPUs other than each socket's cpu0, ensuring that cpu0 on every socket can boot up and handle interrupts when the second kernel boots.
Signed-off-by: Mao HongBo maohongbo@phytium.com.cn Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/kernel/smp.c | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 55ce4a23f0f2..fc099cda70a3 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -36,6 +36,7 @@
#include <linux/kvm_host.h> #include <linux/perf/arm_pmu.h> +#include <linux/crash_dump.h>
#include <asm/alternative.h> #include <asm/atomic.h> @@ -702,6 +703,34 @@ static bool bootcpu_valid __initdata; static unsigned int cpu_count = 1;
#ifdef CONFIG_ACPI + +#ifdef CONFIG_ARCH_PHYTIUM +/* + * On phytium S2500 multi-socket server, for example 2-socket(2P), there are + * socket0 and socket1 on the server: + * If storage device(like SAS controller and disks to save vmcore into) is + * installed on socket1 and second kernel brings up 2 CPUs both on socket0 with + * nr_cpus=2, then vmcore will fail to be saved into the disk as interrupts like + * SPI and LPI(except SGI) can't communicate across cpu sockets in this server + * platform. + * To avoid this issue, bypass all non-cpu0 CPUs to ensure that cpu0 on each + * socket can boot up and handle interrupts when booting the second kernel. + */ +static bool __init is_phytium_kdump_cpu_need_bypass(u64 hwid) +{ + if ((read_cpuid_id() & MIDR_CPU_MODEL_MASK) != MIDR_FT_2500) + return false; + + /* + * Bypass other non-cpu0 to ensure second kernel can bring up each cpu0 + * on each socket + */ + if (is_kdump_kernel() && (hwid & 0xffff) != (cpu_logical_map(0) & 0xffff)) + return true; + return false; +} +#endif + static struct acpi_madt_generic_interrupt cpu_madt_gicc[NR_CPUS];
struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu) @@ -748,6 +777,11 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor) if (cpu_count >= NR_CPUS) return;
+#ifdef CONFIG_ARCH_PHYTIUM + if (is_phytium_kdump_cpu_need_bypass(hwid)) + return; +#endif + /* map the logical cpu id to cpu MPIDR */ set_cpu_logical_map(cpu_count, hwid);
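The bypass rule in is_phytium_kdump_cpu_need_bypass() keeps a CPU only when the low 16 MPIDR bits (Aff0/Aff1) match the boot CPU's, so exactly one CPU, the local cpu0, survives per socket. A sketch with made-up MPIDR values:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t boot = 0x000000;	/* socket 0, core 0 */
	uint64_t hwids[] = { 0x000000, 0x000100, 0x010000, 0x010100 };
	int i;

	for (i = 0; i < 4; i++) {
		int bypass = (hwids[i] & 0xffff) != (boot & 0xffff);

		printf("hwid 0x%06llx -> %s\n",
		       (unsigned long long)hwids[i],
		       bypass ? "bypassed" : "booted");
	}
	return 0;	/* 0x000000 and 0x010000 boot; the rest are bypassed */
}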
From: Hangyu Hua hbh25y@gmail.com
mainline inclusion from mainline-v5.17-rc2 commit 29eb31542787e1019208a2e1047bb7c76c069536 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4U4NY CVE: CVE-2022-24959
-------------------------------------------------
ym needs to be freed when ym->cmd != SIOCYAMSMCS.
Fixes: 0781168e23a2 ("yam: fix a missing-check bug") Signed-off-by: Hangyu Hua hbh25y@gmail.com Signed-off-by: David S. Miller davem@davemloft.net
Conflict: the bug is in yam_siocdevprivate() in mainline, but here it is in yam_ioctl(), because the function was renamed by mainline commit 25ec92fbdd ("hamradio: use ndo_siocdevprivate").
Signed-off-by: Lu Wei luwei32@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/net/hamradio/yam.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/hamradio/yam.c b/drivers/net/hamradio/yam.c index 5ab53e9942f3..5d30b3e1806a 100644 --- a/drivers/net/hamradio/yam.c +++ b/drivers/net/hamradio/yam.c @@ -951,9 +951,7 @@ static int yam_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) sizeof(struct yamdrv_ioctl_mcs)); if (IS_ERR(ym)) return PTR_ERR(ym); - if (ym->cmd != SIOCYAMSMCS) - return -EINVAL; - if (ym->bitrate > YAM_MAXBITRATE) { + if (ym->cmd != SIOCYAMSMCS || ym->bitrate > YAM_MAXBITRATE) { kfree(ym); return -EINVAL; }
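The pattern behind this fix generalizes: once memdup_user() has returned, every early exit owns the allocation and must free it, which is why the patch folds both checks into one error path. A small user-space model, with malloc()/memcpy() standing in for memdup_user() and an illustrative struct:

#include <stdlib.h>
#include <string.h>
#include <errno.h>

struct mcs { int cmd; unsigned int bitrate; };

#define SIOCYAMSMCS	1
#define YAM_MAXBITRATE	57600

static int handle_ioctl(const struct mcs *user_arg)
{
	struct mcs *ym = malloc(sizeof(*ym));	/* models memdup_user() */

	if (!ym)
		return -ENOMEM;
	memcpy(ym, user_arg, sizeof(*ym));

	/* single error path: both rejects free the copy (the fixed leak) */
	if (ym->cmd != SIOCYAMSMCS || ym->bitrate > YAM_MAXBITRATE) {
		free(ym);
		return -EINVAL;
	}

	/* ... act on ym ... */
	free(ym);
	return 0;
}

int main(void)
{
	struct mcs bad = { .cmd = 0, .bitrate = 9600 };

	return handle_ioctl(&bad) == -EINVAL ? 0 : 1;
}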