V3: change the patch header of the second patch.
V2: merge some missing patches.
V1: fix main error.
Chao Yu (3): f2fs: fix to invalidate META_MAPPING before DIO write f2fs: invalidate meta pages only for post_read required inode f2fs: fix to truncate meta inode pages forcely
Hyeong-Jun Kim (1): f2fs: invalidate META_MAPPING before IPU/DIO write
fs/f2fs/checkpoint.c | 5 +++-- fs/f2fs/data.c | 4 ++-- fs/f2fs/f2fs.h | 27 +++++++++++++++++++++++++++ fs/f2fs/gc.c | 3 +-- fs/f2fs/segment.c | 17 ++++++++++++----- include/linux/f2fs_fs.h | 1 + 6 files changed, 46 insertions(+), 11 deletions(-)
From: Hyeong-Jun Kim hj514.kim@samsung.com
mainline inclusion from mainline-v5.16-rc1 commit e3b49ea36802053f312013fd4ccb6e59920a9f76 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HKE5 CVE: CVE-2024-26869
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
Encrypted pages during GC are read and cached in META_MAPPING. However, due to cached pages in META_MAPPING, there is an issue where newly written pages are lost by IPU or DIO writes.
Thread A - f2fs_gc() Thread B /* phase 3 */ down_write(i_gc_rwsem) ra_data_block() ---- (a) up_write(i_gc_rwsem) f2fs_direct_IO() : - down_read(i_gc_rwsem) - __blockdev_direct_io() - get_data_block_dio_write() - f2fs_dio_submit_bio() ---- (b) - up_read(i_gc_rwsem) /* phase 4 */ down_write(i_gc_rwsem) move_data_block() ---- (c) up_write(i_gc_rwsem)
(a) In phase 3 of f2fs_gc(), up-to-date page is read from storage and cached in META_MAPPING. (b) In thread B, writing new data by IPU or DIO write on same blkaddr as read in (a). cached page in META_MAPPING become out-dated. (c) In phase 4 of f2fs_gc(), out-dated page in META_MAPPING is copied to new blkaddr. In conclusion, the newly written data in (b) is lost.
To address this issue, invalidating pages in META_MAPPING before IPU or DIO write.
Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC") Signed-off-by: Hyeong-Jun Kim hj514.kim@samsung.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Conflicts: fs/f2fs/data.c fs/f2fs/segment.c Signed-off-by: Zizhi Wo wozizhi@huawei.com --- fs/f2fs/data.c | 5 ++++- fs/f2fs/segment.c | 3 +++ 2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index e0533cffbb07..5cdf4434ce11 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1715,9 +1715,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, sync_out:
/* for hardware encryption, but to avoid potential issue in future */ - if (flag == F2FS_GET_BLOCK_DIO && map->m_flags & F2FS_MAP_MAPPED) + if (flag == F2FS_GET_BLOCK_DIO && map->m_flags & F2FS_MAP_MAPPED) { f2fs_wait_on_block_writeback_range(inode, map->m_pblk, map->m_len); + invalidate_mapping_pages(META_MAPPING(sbi), + map->m_pblk, map->m_pblk); + }
if (flag == F2FS_GET_BLOCK_PRECACHE) { if (map->m_flags & F2FS_MAP_MAPPED) { diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index a27a93429271..81894f7abf94 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -3554,6 +3554,9 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio) return -EFSCORRUPTED; }
+ invalidate_mapping_pages(META_MAPPING(sbi), + fio->new_blkaddr, fio->new_blkaddr); + stat_inc_inplace_blocks(fio->sbi);
if (fio->bio && !(SM_I(sbi)->ipu_policy & (1 << F2FS_IPU_NOCACHE)))
From: Chao Yu chao@kernel.org
mainline inclusion from mainline-v6.0-rc1 commit 67ca06872eb02944b4c6f92cffa9242e92c63109 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HKE5 CVE: CVE-2024-26869
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
Quoted from commit e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write")
" Encrypted pages during GC are read and cached in META_MAPPING. However, due to cached pages in META_MAPPING, there is an issue where newly written pages are lost by IPU or DIO writes.
Thread A - f2fs_gc() Thread B /* phase 3 */ down_write(i_gc_rwsem) ra_data_block() ---- (a) up_write(i_gc_rwsem) f2fs_direct_IO() : - down_read(i_gc_rwsem) - __blockdev_direct_io() - get_data_block_dio_write() - f2fs_dio_submit_bio() ---- (b) - up_read(i_gc_rwsem) /* phase 4 */ down_write(i_gc_rwsem) move_data_block() ---- (c) up_write(i_gc_rwsem)
(a) In phase 3 of f2fs_gc(), up-to-date page is read from storage and cached in META_MAPPING. (b) In thread B, writing new data by IPU or DIO write on same blkaddr as read in (a). cached page in META_MAPPING become out-dated. (c) In phase 4 of f2fs_gc(), out-dated page in META_MAPPING is copied to new blkaddr. In conclusion, the newly written data in (b) is lost.
To address this issue, invalidating pages in META_MAPPING before IPU or DIO write. "
In previous commit, we missed to cover extent cache hit case, and passed wrong value for parameter @end of invalidate_mapping_pages(), fix both issues.
Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC") Fixes: e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write") Cc: Hyeong-Jun Kim hj514.kim@samsung.com Signed-off-by: Chao Yu chao.yu@oppo.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Conflicts: fs/f2fs/data.c Signed-off-by: Zizhi Wo wozizhi@huawei.com --- fs/f2fs/data.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 5cdf4434ce11..1d4566a2d042 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1536,9 +1536,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, *map->m_next_extent = pgofs + map->m_len;
/* for hardware encryption, but to avoid potential issue in future */ - if (flag == F2FS_GET_BLOCK_DIO) + if (flag == F2FS_GET_BLOCK_DIO) { f2fs_wait_on_block_writeback_range(inode, map->m_pblk, map->m_len); + invalidate_mapping_pages(META_MAPPING(sbi), + map->m_pblk, map->m_pblk + map->m_len - 1); + } goto out; }
@@ -1719,7 +1722,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, f2fs_wait_on_block_writeback_range(inode, map->m_pblk, map->m_len); invalidate_mapping_pages(META_MAPPING(sbi), - map->m_pblk, map->m_pblk); + map->m_pblk, map->m_pblk + map->m_len - 1); }
if (flag == F2FS_GET_BLOCK_PRECACHE) {
From: Chao Yu chao@kernel.org
mainline inclusion from mainline-v6.0-rc1 commit 0d5b9d8156396bbe1c982708b38ab9e188c45ec9 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HKE5 CVE: CVE-2024-26869
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
After commit e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write"), invalidate_mapping_pages() will be called to avoid race condition in between IPU/DIO and readahead for GC.
However, readahead flow is only used for post_read required inode, so this patch adds check condition to avoids unnecessary page cache invalidating for non-post_read inode.
Signed-off-by: Chao Yu chao.yu@oppo.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Conflicts: fs/f2fs/data.c fs/f2fs/segment.c Signed-off-by: Zizhi Wo wozizhi@huawei.com --- fs/f2fs/data.c | 11 +++-------- fs/f2fs/f2fs.h | 1 + fs/f2fs/segment.c | 9 ++++++++- 3 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 1d4566a2d042..24e9b9229d08 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1536,12 +1536,9 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, *map->m_next_extent = pgofs + map->m_len;
/* for hardware encryption, but to avoid potential issue in future */ - if (flag == F2FS_GET_BLOCK_DIO) { + if (flag == F2FS_GET_BLOCK_DIO) f2fs_wait_on_block_writeback_range(inode, map->m_pblk, map->m_len); - invalidate_mapping_pages(META_MAPPING(sbi), - map->m_pblk, map->m_pblk + map->m_len - 1); - } goto out; }
@@ -1718,12 +1715,9 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, sync_out:
/* for hardware encryption, but to avoid potential issue in future */ - if (flag == F2FS_GET_BLOCK_DIO && map->m_flags & F2FS_MAP_MAPPED) { + if (flag == F2FS_GET_BLOCK_DIO && map->m_flags & F2FS_MAP_MAPPED) f2fs_wait_on_block_writeback_range(inode, map->m_pblk, map->m_len); - invalidate_mapping_pages(META_MAPPING(sbi), - map->m_pblk, map->m_pblk + map->m_len - 1); - }
if (flag == F2FS_GET_BLOCK_PRECACHE) { if (map->m_flags & F2FS_MAP_MAPPED) { @@ -2805,6 +2799,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted, .submitted = false, .compr_blocks = compr_blocks, .need_lock = LOCK_RETRY, + .post_read = f2fs_post_read_required(inode), .io_type = io_type, .io_wbc = wbc, .bio = bio, diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 83ebc860508b..c58d8d5554fc 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1117,6 +1117,7 @@ struct f2fs_io_info { bool retry; /* need to reallocate block address */ int compr_blocks; /* # of compressed block addresses */ bool encrypted; /* indicate file is encrypted */ + bool post_read; /* require post read */ enum iostat_type io_type; /* io type */ struct writeback_control *io_wbc; /* writeback control */ struct bio **bio; /* bio for ipu */ diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 81894f7abf94..2945361f3d89 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -3554,7 +3554,8 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio) return -EFSCORRUPTED; }
- invalidate_mapping_pages(META_MAPPING(sbi), + if (fio->post_read) + invalidate_mapping_pages(META_MAPPING(sbi), fio->new_blkaddr, fio->new_blkaddr);
stat_inc_inplace_blocks(fio->sbi); @@ -3723,10 +3724,16 @@ void f2fs_wait_on_block_writeback(struct inode *inode, block_t blkaddr) void f2fs_wait_on_block_writeback_range(struct inode *inode, block_t blkaddr, block_t len) { + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); block_t i;
+ if (!f2fs_post_read_required(inode)) + return; + for (i = 0; i < len; i++) f2fs_wait_on_block_writeback(inode, blkaddr + i); + + invalidate_mapping_pages(META_MAPPING(sbi), blkaddr, blkaddr + len - 1); }
static int read_compacted_summaries(struct f2fs_sb_info *sbi)
From: Chao Yu chao@kernel.org
stable inclusion from stable-v6.6.23 commit c92f2927df860a60ba815d3ee610a944b92a8694 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HKE5 CVE: CVE-2024-26869
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 9f0c4a46be1fe9b97dbe66d49204c1371e3ece65 ]
Below race case can cause data corruption:
Thread A GC thread - gc_data_segment - ra_data_block - locked meta_inode page - f2fs_inplace_write_data - invalidate_mapping_pages : fail to invalidate meta_inode page due to lock failure or dirty|writeback status - f2fs_submit_page_bio : write last dirty data to old blkaddr - move_data_block - load old data from meta_inode page - f2fs_submit_page_write : write old data to new blkaddr
Because invalidate_mapping_pages() will skip invalidating page which has unclear status including locked, dirty, writeback and so on, so we need to use truncate_inode_pages_range() instead of invalidate_mapping_pages() to make sure meta_inode page will be dropped.
Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC") Fixes: e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write") Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org
Conflicts: fs/f2fs/data.c fs/f2fs/gc.c fs/f2fs/segment.c fs/f2fs/f2fs.h Signed-off-by: Zizhi Wo wozizhi@huawei.com --- fs/f2fs/checkpoint.c | 5 +++-- fs/f2fs/data.c | 3 +-- fs/f2fs/f2fs.h | 26 ++++++++++++++++++++++++++ fs/f2fs/gc.c | 3 +-- fs/f2fs/segment.c | 13 +++++-------- include/linux/f2fs_fs.h | 1 + 6 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 8ca549cc975e..8a94bf3611d3 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -1551,8 +1551,9 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc) */ if (f2fs_sb_has_encrypt(sbi) || f2fs_sb_has_verity(sbi) || f2fs_sb_has_compression(sbi)) - invalidate_mapping_pages(META_MAPPING(sbi), - MAIN_BLKADDR(sbi), MAX_BLKADDR(sbi) - 1); + f2fs_bug_on(sbi, + invalidate_inode_pages2_range(META_MAPPING(sbi), + MAIN_BLKADDR(sbi), MAX_BLKADDR(sbi) - 1));
f2fs_release_ino_entry(sbi, false);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 24e9b9229d08..28beb7d7c7e1 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1421,8 +1421,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) f2fs_allocate_data_block(sbi, NULL, old_blkaddr, &dn->data_blkaddr, &sum, seg_type, NULL); if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO) - invalidate_mapping_pages(META_MAPPING(sbi), - old_blkaddr, old_blkaddr); + f2fs_truncate_meta_inode_pages(sbi, old_blkaddr, 1); f2fs_update_data_blkaddr(dn, dn->data_blkaddr);
/* diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index c58d8d5554fc..a174736931d1 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4127,6 +4127,32 @@ static inline bool is_journalled_quota(struct f2fs_sb_info *sbi) return false; }
+static inline void f2fs_truncate_meta_inode_pages(struct f2fs_sb_info *sbi, + block_t blkaddr, unsigned int cnt) +{ + bool need_submit = false; + int i = 0; + + do { + struct page *page; + + page = find_get_page(META_MAPPING(sbi), blkaddr + i); + if (page) { + if (PageWriteback(page)) + need_submit = true; + f2fs_put_page(page, 0); + } + } while (++i < cnt && !need_submit); + + if (need_submit) + f2fs_submit_merged_write_cond(sbi, sbi->meta_inode, + NULL, 0, DATA); + + truncate_inode_pages_range(META_MAPPING(sbi), + F2FS_BLK_TO_BYTES((loff_t)blkaddr), + F2FS_BLK_END_BYTES((loff_t)(blkaddr + cnt - 1))); +} + #define EFSBADCRC EBADMSG /* Bad CRC detected */ #define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index b908dfaafc9c..65f1363b3608 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -1257,8 +1257,7 @@ static int move_data_block(struct inode *inode, block_t bidx, memcpy(page_address(fio.encrypted_page), page_address(mpage), PAGE_SIZE); f2fs_put_page(mpage, 1); - invalidate_mapping_pages(META_MAPPING(fio.sbi), - fio.old_blkaddr, fio.old_blkaddr); + f2fs_truncate_meta_inode_pages(fio.sbi, fio.old_blkaddr, 1);
set_page_dirty(fio.encrypted_page); if (clear_page_dirty_for_io(fio.encrypted_page)) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 2945361f3d89..b87de275a357 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -2311,7 +2311,7 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr) if (addr == NEW_ADDR || addr == COMPRESS_ADDR) return;
- invalidate_mapping_pages(META_MAPPING(sbi), addr, addr); + f2fs_truncate_meta_inode_pages(sbi, addr, 1);
/* add it into sit main buffer */ down_write(&sit_i->sentry_lock); @@ -3468,8 +3468,7 @@ static void do_write_page(struct f2fs_summary *sum, struct f2fs_io_info *fio) f2fs_allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr, &fio->new_blkaddr, sum, type, fio); if (GET_SEGNO(fio->sbi, fio->old_blkaddr) != NULL_SEGNO) - invalidate_mapping_pages(META_MAPPING(fio->sbi), - fio->old_blkaddr, fio->old_blkaddr); + f2fs_truncate_meta_inode_pages(fio->sbi, fio->old_blkaddr, 1);
/* writeout dirty page into bdev */ f2fs_submit_page_write(fio); @@ -3555,8 +3554,7 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio) }
if (fio->post_read) - invalidate_mapping_pages(META_MAPPING(sbi), - fio->new_blkaddr, fio->new_blkaddr); + f2fs_truncate_meta_inode_pages(sbi, fio->new_blkaddr, 1);
stat_inc_inplace_blocks(fio->sbi);
@@ -3644,8 +3642,7 @@ void f2fs_do_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum, update_sit_entry(sbi, new_blkaddr, 1); } if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO) { - invalidate_mapping_pages(META_MAPPING(sbi), - old_blkaddr, old_blkaddr); + f2fs_truncate_meta_inode_pages(sbi, old_blkaddr, 1); if (!from_gc) update_segment_mtime(sbi, old_blkaddr, 0); update_sit_entry(sbi, old_blkaddr, -1); @@ -3733,7 +3730,7 @@ void f2fs_wait_on_block_writeback_range(struct inode *inode, block_t blkaddr, for (i = 0; i < len; i++) f2fs_wait_on_block_writeback(inode, blkaddr + i);
- invalidate_mapping_pages(META_MAPPING(sbi), blkaddr, blkaddr + len - 1); + f2fs_truncate_meta_inode_pages(sbi, blkaddr, len); }
static int read_compacted_summaries(struct f2fs_sb_info *sbi) diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h index a5dbb57a687f..7fd4bea6be64 100644 --- a/include/linux/f2fs_fs.h +++ b/include/linux/f2fs_fs.h @@ -27,6 +27,7 @@
#define F2FS_BYTES_TO_BLK(bytes) ((bytes) >> F2FS_BLKSIZE_BITS) #define F2FS_BLK_TO_BYTES(blk) ((blk) << F2FS_BLKSIZE_BITS) +#define F2FS_BLK_END_BYTES(blk) (F2FS_BLK_TO_BYTES(blk + 1) - 1)
/* 0, 1(node nid), 2(meta nid) are reserved node id */ #define F2FS_RESERVED_NODE_NUM 3
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/6597 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/V...
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/6597 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/V...