This series converts ext4's buffered IO path from buffer_head to iomap
and enables large folios by default.

01-14: iomap: map multiple blocks per ->map_blocks call, by Christoph,
backported from [1] (a rough sketch of the idea follows this list).
15: a small debugging improvement to the previous iomap ->map_blocks
series [2].
16-24: fix a stale zero data issue in xfs and make iomap_zero_range()
not increase i_size [3].
25-29: the first part of the preparatory changes, which has already
been merged upstream [4].
30-36: the second part of the preparatory changes, adding support for
inserting multiple delalloc blocks at once [5].
37-51: convert the buffered IO path from buffer_head to iomap; these
are picked up from my v3 series [6].
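For readers unfamiliar with the iomap writeback hooks, below is a
minimal, hedged sketch of what mapping multiple blocks per
->map_blocks means: the filesystem caches a whole extent in wpc->iomap
so the generic writeback code can place every dirty block covered by
that extent without calling back into the filesystem. The callback
signature (including the dirty length argument) reflects my reading of
the series in [1], and example_lookup_extent() is a made-up stand-in
for the filesystem-specific extent lookup, not a real kernel function.

#include <linux/iomap.h>

/*
 * Hedged sketch, not code from this series: a ->map_blocks hook that
 * maps a whole extent at once.  If the mapping cached in wpc->iomap
 * still covers @offset, there is nothing to do; otherwise look up the
 * extent and cache it so the following dirty blocks reuse it.
 */
static int example_map_blocks(struct iomap_writepage_ctx *wpc,
			      struct inode *inode, loff_t offset,
			      unsigned int dirty_len)
{
	/* Cached mapping still valid for this block? */
	if (offset >= wpc->iomap.offset &&
	    offset < wpc->iomap.offset + wpc->iomap.length)
		return 0;

	/*
	 * example_lookup_extent() is hypothetical; it stands in for the
	 * filesystem-specific lookup/allocation that fills wpc->iomap
	 * with the full extent covering [offset, offset + dirty_len).
	 */
	return example_lookup_extent(inode, offset, dirty_len, &wpc->iomap);
}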
[1] https://lore.kernel.org/linux-fsdevel/20231207072710.176093-1-hch@lst.de/
[2] https://lore.kernel.org/linux-fsdevel/20240220115759.3445025-1-yi.zhang@hua…
[3] https://lore.kernel.org/linux-xfs/20240320110548.2200662-1-yi.zhang@huaweic…
[4] https://lore.kernel.org/linux-ext4/20240105033018.1665752-1-yi.zhang@huawei…
[5] https://lore.kernel.org/linux-ext4/20240330120236.3789589-1-yi.zhang@huawei…
[6] https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huawei…
Thanks,
Yi.
Christoph Hellwig (14):
iomap: clear the per-folio dirty bits on all writeback failures
iomap: treat inline data in iomap_writepage_map as an I/O error
iomap: move the io_folios field out of struct iomap_ioend
iomap: move the PF_MEMALLOC check to iomap_writepages
iomap: factor out a iomap_writepage_handle_eof helper
iomap: move all remaining per-folio logic into iomap_writepage_map
iomap: clean up the iomap_alloc_ioend calling convention
iomap: move the iomap_sector sector calculation out of
iomap_add_to_ioend
iomap: don't chain bios
iomap: only call mapping_set_error once for each failed bio
iomap: factor out a iomap_writepage_map_block helper
iomap: submit ioends immediately
iomap: map multiple blocks at a time
iomap: pass the length of the dirty region to ->map_blocks
Zhang Yi (37):
iomap: add pos and dirty_len into trace_iomap_writepage_map
xfs: match lock mode in xfs_buffered_write_iomap_begin()
xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
xfs: convert delayed extents to unwritten when zeroing post eof blocks
iomap: drop the write failure handles when unsharing and zeroing
iomap: don't increase i_size if it's not a write operation
iomap: use a new variable to handle the written bytes in
iomap_write_iter()
iomap: make iomap_write_end() return a boolean
iomap: do some small logical cleanup in buffered write
ext4: refactor ext4_da_map_blocks()
ext4: convert to exclusive lock while inserting delalloc extents
ext4: add a hole extent entry in cache after punch
ext4: make ext4_map_blocks() distinguish delalloc only extent
ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC map type
ext4: trim delalloc extent
ext4: drop iblock parameter
ext4: make ext4_es_insert_delayed_block() insert multi-blocks
ext4: make ext4_da_reserve_space() reserve multi-clusters
ext4: factor out check for whether a cluster is allocated
ext4: make ext4_insert_delayed_block() insert multi-blocks
ext4: make ext4_da_map_blocks() buffer_head unaware
ext4: use reserved metadata blocks when splitting extent on endio
ext4: factor out ext4_map_{create|query}_blocks()
ext4: introduce seq counter for the extent status entry
ext4: add a new iomap aops for regular file's buffered IO path
ext4: implement buffered read iomap path
ext4: implement buffered write iomap path
ext4: implement writeback iomap path
ext4: implement mmap iomap path
ext4: implement zero_range iomap path
ext4: writeback partial blocks before zeroing out range
ext4: fall back to buffer_head path for defrag
ext4: partial enable iomap for regular file's buffered IO path
filemap: support disable large folios on active inode
ext4: enable large folio for regular file with iomap buffered IO path
ext4: add mount option for buffered IO iomap path
block/fops.c | 2 +-
fs/ext4/ext4.h | 15 +-
fs/ext4/ext4_jbd2.c | 6 +
fs/ext4/extents.c | 42 +-
fs/ext4/extents_status.c | 76 ++-
fs/ext4/extents_status.h | 5 +-
fs/ext4/file.c | 19 +-
fs/ext4/ialloc.c | 5 +
fs/ext4/inode.c | 935 +++++++++++++++++++++++++++---------
fs/ext4/move_extent.c | 35 ++
fs/ext4/page-io.c | 107 +++++
fs/ext4/super.c | 21 +
fs/gfs2/bmap.c | 2 +-
fs/iomap/buffered-io.c | 683 +++++++++++++-------------
fs/iomap/trace.h | 43 +-
fs/xfs/libxfs/xfs_bmap.c | 40 +-
fs/xfs/xfs_aops.c | 63 +--
fs/xfs/xfs_iomap.c | 39 +-
fs/zonefs/file.c | 3 +-
include/linux/iomap.h | 19 +-
include/linux/pagemap.h | 14 +
include/trace/events/ext4.h | 42 +-
mm/readahead.c | 6 +-
23 files changed, 1550 insertions(+), 672 deletions(-)
--
2.39.2
From: Johannes Berg <johannes.berg@intel.com>
stable inclusion
from stable-v5.15.149
commit 0c7478a2da3f5fe106b4658338873d50c86ac7ab
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9DNRR
CVE: CVE-2023-52633
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit abe4eaa8618bb36c2b33e9cdde0499296a23448c ]
In 'basic' time-travel mode (without =inf-cpu or =ext), we
still get timer interrupts. These can happen at arbitrary
points in time, i.e. while in timer_read(), which pushes
time forward just a little bit. Then, if we happen to get
the interrupt after calculating the new time to push to,
but before actually finishing that, the interrupt will set
the time to a value that's incompatible with the forward,
and we'll crash because time goes backwards when we do the
forwarding.
Fix this by reading the time_travel_time, calculating the
adjustment, and doing the adjustment all with interrupts
disabled.
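For quick reference, the heart of the fix is the helper introduced by
the hunk below; here it is condensed on its own (identical names and
logic as in the patch, only the comment is paraphrased):

static void time_travel_update_time_rel(unsigned long long offs)
{
	unsigned long flags;

	/*
	 * Keep the read of time_travel_time, the computation of the
	 * target time, and the update as one uninterruptible step so a
	 * timer signal cannot push time forward in between.
	 */
	local_irq_save(flags);
	time_travel_update_time(time_travel_time + offs, false);
	local_irq_restore(flags);
}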
Reported-by: Vincent Whitchurch <Vincent.Whitchurch@axis.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
---
arch/um/kernel/time.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c
index 8dafc3f2add4..a853d8a29476 100644
--- a/arch/um/kernel/time.c
+++ b/arch/um/kernel/time.c
@@ -374,9 +374,29 @@ static void time_travel_update_time(unsigned long long next, bool idle)
time_travel_del_event(&ne);
}
+static void time_travel_update_time_rel(unsigned long long offs)
+{
+ unsigned long flags;
+
+ /*
+ * Disable interrupts before calculating the new time so
+ * that a real timer interrupt (signal) can't happen at
+ * a bad time e.g. after we read time_travel_time but
+ * before we've completed updating the time.
+ */
+ local_irq_save(flags);
+ time_travel_update_time(time_travel_time + offs, false);
+ local_irq_restore(flags);
+}
+
void time_travel_ndelay(unsigned long nsec)
{
- time_travel_update_time(time_travel_time + nsec, false);
+ /*
+ * Not strictly needed to use _rel() version since this is
+ * only used in INFCPU/EXT modes, but it doesn't hurt and
+ * is more readable too.
+ */
+ time_travel_update_time_rel(nsec);
}
EXPORT_SYMBOL(time_travel_ndelay);
@@ -479,7 +499,11 @@ static int time_travel_connect_external(const char *socket)
#define time_travel_start 0
#define time_travel_time 0
-static inline void time_travel_update_time(unsigned long long ns, bool retearly)
+static inline void time_travel_update_time(unsigned long long ns, bool idle)
+{
+}
+
+static inline void time_travel_update_time_rel(unsigned long long offs)
{
}
@@ -624,9 +648,7 @@ static u64 timer_read(struct clocksource *cs)
* to return from time_travel_update_time().
*/
if (!irqs_disabled() && !in_interrupt() && !in_softirq())
- time_travel_update_time(time_travel_time +
- TIMER_MULTIPLIER,
- false);
+ time_travel_update_time_rel(TIMER_MULTIPLIER);
return time_travel_time / TIMER_MULTIPLIER;
}
--
2.17.1