From: Charan Teja Reddy <charante@codeaurora.org>
stable inclusion from linux-4.19.142 commit c666936d8d8b0ace4f3260d71a4eedefd53011d9
--------------------------------
commit 88e8ac11d2ea3acc003cf01bb5a38c8aa76c3cfd upstream.
The following race is observed when memory blocks of the movable zone are repeatedly onlined and offlined, with a delay between two successive online operations.
P1                                      P2

Online the first memory block in
the movable zone. The pcp struct
values are initialized to their
default values, i.e., pcp->high = 0
and pcp->batch = 1.

                                        Allocate pages from the
                                        movable zone.

Try to online the second memory
block in the movable zone; it has
entered online_pages() but has yet
to call zone_pcp_update().
                                        This process enters the exit
                                        path and thus tries to release
                                        the order-0 pages to the pcp
                                        lists through
                                        free_unref_page_commit().
                                        As pcp->high = 0 and
                                        pcp->count = 1, it proceeds to
                                        call free_pcppages_bulk().
Update the pcp values; the new
values are, say, pcp->high = 378
and pcp->batch = 63.
                                        Read the pcp's batch value
                                        using READ_ONCE() and pass it
                                        to free_pcppages_bulk(). The
                                        values passed here are
                                        batch = 63, count = 1.

                                        Since the number of pages on
                                        the pcp lists is less than
                                        ->batch, it gets stuck in the
                                        while (list_empty(list)) loop
                                        with interrupts disabled,
                                        hanging the core.
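The interleaving above can be replayed deterministically in user space. The following is a simplified single-threaded sketch, not kernel code: struct pcp_model and replay_race() are hypothetical names introduced for illustration, and 378 and 63 are the example values from the description.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for a few fields of struct per_cpu_pages. */
struct pcp_model {
	int count;	/* pages currently on the pcp lists */
	int high;	/* bulk free is triggered once count >= high */
	int batch;	/* chunk size handed to the bulk free */
};

/*
 * Replay the P1/P2 interleaving in one thread and return the count that
 * P2 would pass to the bulk free.
 */
static int replay_race(struct pcp_model *pcp)
{
	int to_free = 0;

	/* P2: one order-0 page is released onto the pcp list. */
	pcp->count++;
	assert(pcp->count > 0);

	/* P2: the check runs against the default high = 0. */
	if (pcp->count >= pcp->high) {
		/* P1: the pcp update lands between the check and the read. */
		pcp->high = 378;
		pcp->batch = 63;

		/* P2: batch is read only now, so it sees the new value. */
		to_free = pcp->batch;
	}

	return to_free;
}
```

Starting from { .count = 0, .high = 0, .batch = 1 }, the model asks the bulk free for 63 pages while only one page is on the list.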
Avoid this by ensuring free_pcppages_bulk() is called with proper count of pcp list pages.
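The effect of clamping the count can be sketched with a reduced user-space model of the bulk-free loop. This is an illustration under stated assumptions, not the kernel implementation: struct pcp_model, bulk_free_model(), NR_LISTS and SPIN_LIMIT are hypothetical, each list is reduced to a page counter, one page is freed per scan, and the endless scan is replaced by a bounded one so the hang can be observed.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LISTS	3	/* hypothetical: number of pcp lists modelled */
#define SPIN_LIMIT	1000	/* hypothetical: stand-in for "spins forever" */

/* Hypothetical model: each pcp list is reduced to its page count. */
struct pcp_model {
	int count;		/* total pages across all lists */
	int lists[NR_LISTS];	/* pages on each list */
};

/*
 * Model of the bulk-free loop. Returns the number of pages freed, or -1
 * if the scan for a non-empty list exceeds SPIN_LIMIT, which corresponds
 * to the hang described in this patch.
 */
static int bulk_free_model(struct pcp_model *pcp, int count, bool clamp)
{
	int freed = 0;
	int idx = 0;

	/* The fix: never ask for more pages than the lists hold. */
	if (clamp)
		count = count < pcp->count ? count : pcp->count;

	while (count) {
		int spins = 0;

		/* Round-robin scan for a non-empty list. */
		do {
			if (++spins > SPIN_LIMIT)
				return -1;
			idx = (idx + 1) % NR_LISTS;
		} while (pcp->lists[idx] == 0);

		pcp->lists[idx]--;
		pcp->count--;
		freed++;
		count--;
	}

	assert(pcp->count >= 0);
	return freed;
}
```

With one page on the lists and a requested count of 63, the unclamped call runs into the spin limit after freeing the only page, while the clamped call frees that page and returns.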
The mentioned race is somewhat easily reproducible without [1] because the pcp values are not updated when the first memory block is onlined, and thus there is a large enough race window for P2 between alloc+free and the update of the pcp struct values through onlining of the second memory block.
With [1], the race still exists but it is very narrow, as we update the pcp struct values when the first memory block itself is onlined.
This is not limited to the movable zone; it could also happen in cases with the normal zone (e.g., hotplug to a node that only has DMA memory, or no other memory yet).
[1]: https://patchwork.kernel.org/patch/11696389/
Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type")
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org> [2.6+]
Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeauro...
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 mm/page_alloc.c | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5818f5f2fe88..e394ae5ba2a7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1116,6 +1116,11 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	struct page *page, *tmp;
 	LIST_HEAD(head);
 
+	/*
+	 * Ensure proper count is passed which otherwise would stuck in the
+	 * below while (list_empty(list)) loop.
+	 */
+	count = min(pcp->count, count);
 	while (count) {
 		struct list_head *list;
 