From: Liu Shixin <liushixin2@huawei.com>
hulk inclusion
category: feature
bugzilla: 186704, https://gitee.com/openeuler/kernel/issues/I58V3W
CVE: NA
--------------------------------
Memory error handling for 1GB hugepages was disabled by commit 31286a8484a8 because it may lead to a kernel panic.
However, that commit causes a more troublesome downstream problem, so we have to revert it in some situations. At the same time, we backport commit 15494520b776, which resolves the kernel panic described in commit 31286a8484a8.
We add a new kernel command-line parameter, 'hugetlb_hwpoison_full', to enable memory error handling for 1GB hugepages. By default, memory error handling for 1GB hugepages remains disabled.
Note that the kernel panic may not have been completely resolved!
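Because the parameter is opt-in, it can be useful to confirm from userspace whether a running kernel was booted with it. The following is a minimal sketch, not part of this patch: the helper names cmdline_has_param() and running_kernel_has_param() are hypothetical, and the token matching is a simplification (it ignores quoting and the kernel's '-'/'_' equivalence). A "name=value" form is also treated as a match, on the assumption that the handler still runs in that case, since the handler in this patch ignores its argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if the whitespace-separated command line contains the
 * token `name`, either bare or followed by '='. Simplified matching:
 * no quote handling, no '-'/'_' equivalence.
 */
bool cmdline_has_param(const char *cmdline, const char *name)
{
	const char *p = cmdline;
	size_t len;

	assert(cmdline && name && *name);
	len = strlen(name);

	while ((p = strstr(p, name)) != NULL) {
		/* The match must begin at the start of a token ... */
		bool starts = (p == cmdline) || p[-1] == ' ' || p[-1] == '\t';
		/* ... and end at the end of a token or at '='. */
		char end = p[len];
		bool ends = end == '\0' || end == ' ' || end == '\t' ||
			    end == '\n' || end == '=';

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}

/* Check the command line of the running kernel via /proc/cmdline. */
bool running_kernel_has_param(const char *name)
{
	char buf[4096];
	bool found = false;
	FILE *f = fopen("/proc/cmdline", "r");

	if (!f)
		return false;
	if (fgets(buf, sizeof(buf), f))
		found = cmdline_has_param(buf, name);
	fclose(f);
	return found;
}
```

Calling running_kernel_has_param("hugetlb_hwpoison_full") returns true only when the token appears on the boot command line. It says nothing about whether the running kernel actually contains this patch.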
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 mm/memory-failure.c                             | 12 +++++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2702a1369c58..98199d3ae741 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1613,6 +1613,9 @@
 			off: Disable the feature
 			Equivalent to: nohugevmalloc
 
+	hugetlb_hwpoison_full
+			[HW] Enable memory error handling of 1GB hugepage.
+
 	hung_task_panic=
 			[KNL] Should the hung task detector generate panics.
 			Format: 0 | 1
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 509fe34a0421..63bacfcca122 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1148,6 +1148,15 @@ static int try_to_split_thp_page(struct page *page, const char *msg)
 	return 0;
 }
 
+static bool hugetlb_hwpoison_full;
+
+static int __init enable_hugetlb_hwpoison_full(char *str)
+{
+	hugetlb_hwpoison_full = true;
+	return 0;
+}
+early_param("hugetlb_hwpoison_full", enable_hugetlb_hwpoison_full);
+
 static int memory_failure_hugetlb(unsigned long pfn, int flags)
 {
 	struct page *p = pfn_to_page(pfn);
@@ -1197,7 +1206,8 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	 *   - other mm code walking over page table is aware of pud-aligned
 	 *     hwpoison entries.
 	 */
-	if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
+	if (!hugetlb_hwpoison_full &&
+	    huge_page_size(page_hstate(head)) > PMD_SIZE) {
 		action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
 		res = -EBUSY;
 		goto out;
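
The effect of the changed condition in memory_failure_hugetlb() can be summarized with a small userspace model. This is an illustrative sketch only, not kernel code: model_hugetlb_gate(), the enum, and the MODEL_* constants are hypothetical names, and the sizes assume x86-64 with 4 KiB base pages (2 MiB PMD-level and 1 GiB PUD-level hugepages).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's size constants (x86-64 assumed). */
#define MODEL_PMD_SIZE (1UL << 21) /* 2 MiB hugepage */
#define MODEL_PUD_SIZE (1UL << 30) /* 1 GiB hugepage */

enum mf_model_outcome {
	MF_MODEL_HANDLE,       /* proceed with memory error handling */
	MF_MODEL_IGNORE_EBUSY, /* report MF_IGNORED and return -EBUSY */
};

/*
 * Mirror the patched check: hugepages larger than PMD size are skipped
 * unless the boot parameter was given.
 */
enum mf_model_outcome model_hugetlb_gate(bool hwpoison_full,
					 unsigned long hpage_size)
{
	if (!hwpoison_full && hpage_size > MODEL_PMD_SIZE)
		return MF_MODEL_IGNORE_EBUSY;
	return MF_MODEL_HANDLE;
}
```

In this model, 2 MiB hugepages are handled regardless of the flag, while 1 GiB hugepages are handled only when the flag is set, which matches the default-off behavior described in the commit message.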