From: Chuanhua Han <hanchuanhua@oppo.com>
mainline inclusion
from mainline-v6.11-rc1
commit ebfba0045176cb013f49cb3e5bd9f0b16eba203c
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IAJ5MT
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Patch series "large folios swap-in: handle refault cases first", v5.
This patchset is extracted from the large folio swap-in series[1], primarily addressing the handling of large folios that are already in the swap cache. Currently, it focuses in particular on the refault of an mTHP that is still undergoing reclamation. Handling this case first aims to streamline code review and expedite the integration of this segment into the MM tree.
It relies on Ryan's swap-out series[2], leveraging the helper function swap_pte_batch() introduced by that series.
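For reference, below is a minimal sketch of how that helper is used; the swap_pte_batch() signature follows [2], while the surrounding locals (ptep, addr, end) are hypothetical page-table-walker state:

	/*
	 * Illustrative sketch only, not part of this patch. Given a PTE
	 * holding a swap entry, swap_pte_batch() (from [2]) returns how
	 * many consecutive PTEs hold consecutive swap entries of the
	 * same type, capped at max_nr, so they can be handled as one batch.
	 */
	pte_t pte = ptep_get(ptep);
	int max_nr = (end - addr) >> PAGE_SHIFT;
	int nr = swap_pte_batch(ptep, max_nr, pte);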
Presently, do_swap_page() only encounters a large folio in the swap cache before that folio is released by vmscan. However, the code should remain equally useful once we support large folio swap-in via swapin_readahead(). This approach can effectively reduce page faults and eliminate most of the redundant checks and early exits for MTE restoration in the recent MTE patchset[3].
The large folio swap-in for SWP_SYNCHRONOUS_IO and swapin_readahead() will be split into separate patch sets and sent at a later time.
[1] https://lore.kernel.org/linux-mm/20240304081348.197341-1-21cnbao@gmail.com/
[2] https://lore.kernel.org/linux-mm/20240408183946.2991168-1-ryan.roberts@arm.c...
[3] https://lore.kernel.org/linux-mm/20240322114136.61386-1-21cnbao@gmail.com/
This patch (of 6):
While swapping in a large folio, we need to free the swap entries for the whole folio. To avoid frequently acquiring and releasing swap locks, it is better to introduce an API for batched freeing. The new function, swap_free_nr(), is designed to efficiently handle the various scenarios of releasing a specified number, nr, of swap entries.
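As an illustration (not taken from this patch), a caller that has just swapped in a large folio could then replace a per-entry loop with a single batched call:

	/*
	 * Illustrative sketch only; "entry" is the first swap entry of
	 * the folio being swapped in.
	 *
	 * Without batching, each entry pays its own lock round-trip:
	 *
	 *	for (i = 0; i < nr_pages; i++)
	 *		swap_free(swp_entry(swp_type(entry),
	 *				    swp_offset(entry) + i));
	 *
	 * With the new helper, one call covers the whole folio:
	 */
	swap_free_nr(entry, folio_nr_pages(folio));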
Link: https://lkml.kernel.org/r/20240529082824.150954-1-21cnbao@gmail.com
Link: https://lkml.kernel.org/r/20240529082824.150954-2-21cnbao@gmail.com
Signed-off-by: Chuanhua Han <hanchuanhua@oppo.com>
Co-developed-by: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
---
 include/linux/swap.h |  5 +++++
 mm/swapfile.c        | 47 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index eb1a190ec5215..b8f2ff0fd9e14 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -511,6 +511,7 @@ extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
 extern void swap_free(swp_entry_t);
+extern void swap_free_nr(swp_entry_t entry, int nr_pages);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
 extern void free_swap_and_cache_nr(swp_entry_t entry, int nr);
 int swap_type_of(dev_t device, sector_t offset);
@@ -601,6 +602,10 @@ static inline void swap_free(swp_entry_t swp)
 {
 }
 
+static inline void swap_free_nr(swp_entry_t entry, int nr_pages)
+{
+}
+
 static inline void put_swap_folio(struct folio *folio, swp_entry_t swp)
 {
 }
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6fa8fd11e911b..5d508f808b0aa 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1439,6 +1439,53 @@ void swap_free(swp_entry_t entry)
 		__swap_entry_free(p, entry);
 }
 
+static void cluster_swap_free_nr(struct swap_info_struct *sis,
+		unsigned long offset, int nr_pages)
+{
+	struct swap_cluster_info *ci;
+	DECLARE_BITMAP(to_free, BITS_PER_LONG) = { 0 };
+	int i, nr;
+
+	ci = lock_cluster_or_swap_info(sis, offset);
+	while (nr_pages) {
+		nr = min(BITS_PER_LONG, nr_pages);
+		for (i = 0; i < nr; i++) {
+			if (!__swap_entry_free_locked(sis, offset + i, 1))
+				bitmap_set(to_free, i, 1);
+		}
+		if (!bitmap_empty(to_free, BITS_PER_LONG)) {
+			unlock_cluster_or_swap_info(sis, ci);
+			for_each_set_bit(i, to_free, BITS_PER_LONG)
+				free_swap_slot(swp_entry(sis->type, offset + i));
+			if (nr == nr_pages)
+				return;
+			bitmap_clear(to_free, 0, BITS_PER_LONG);
+			ci = lock_cluster_or_swap_info(sis, offset);
+		}
+		offset += nr;
+		nr_pages -= nr;
+	}
+	unlock_cluster_or_swap_info(sis, ci);
+}
+
+void swap_free_nr(swp_entry_t entry, int nr_pages)
+{
+	int nr;
+	struct swap_info_struct *sis;
+	unsigned long offset = swp_offset(entry);
+
+	sis = _swap_info_get(entry);
+	if (!sis)
+		return;
+
+	while (nr_pages) {
+		nr = min_t(int, nr_pages, SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
+		cluster_swap_free_nr(sis, offset, nr);
+		offset += nr;
+		nr_pages -= nr;
+	}
+}
+
 /*
  * Called after dropping swapcache to decrease refcnt to swap entries.
  */
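To make the splitting in swap_free_nr() concrete, here is a worked example; SWAPFILE_CLUSTER == 256 is an assumption based on its definition in mm/swapfile.c:

	/*
	 * Worked example (illustrative, assuming SWAPFILE_CLUSTER == 256):
	 * swap_free_nr(entry, 300) with swp_offset(entry) == 200 splits into
	 *
	 *	nr = min(300, 256 - 200 % 256) =  56  ->  offsets 200..255
	 *	nr = min(244, 256 - 256 % 256) = 244  ->  offsets 256..499
	 *
	 * so each cluster_swap_free_nr() call stays within a single cluster
	 * and lock_cluster_or_swap_info() never spans a cluster boundary.
	 * Within a cluster, entries are chunked BITS_PER_LONG at a time via
	 * the on-stack bitmap, and the lock is only dropped when some
	 * entries' swap counts reached zero and must go to free_swap_slot().
	 */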