From: Mingzhe Zou mingzhe.zou@easystack.cn
mainline inclusion from v5.18-rc1 commit 7b1002f7cfe581930f63787a0b3de0144e61ed55 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I59A5L?from=project-issue CVE: N/A
--------------------------------------
When attaching a cached device (a.k.a backing device) to a cache device, bch_sectors_dirty_init() is called to count dirty sectors and stripes (see what bcache_dev_sectors_dirty_add() does) on the cache device.
When bcache_dev_sectors_dirty_add() is called, set_bit(stripe, d->full_dirty_stripes) or clear_bit(stripe, d->full_dirty_stripes) operation will always be performed. In full_dirty_stripes, each 1bit represents stripe_size (8192) sectors (512B), so 1bit=4MB (8192*512), and each CPU cache line=64B=512bit=2048MB. When 20 threads process a cached disk with 100G dirty data, a single thread processes about 23M at a time, and 20 threads total 460M. These full_dirty_stripes bits corresponding to the 460M data is likely to fall in the same CPU cache line. When one of these threads performs a set_bit or clear_bit operation, the same CPU cache line of other threads will become invalid and must read the full_dirty_stripes from the main memory again. Compared with single thread, the time of a bcache_dev_sectors_dirty_add() call is increased by about 50 times in our test (100G dirty data, 20 threads, bcache_dev_sectors_dirty_add() is called more than 20 million times).
This patch tries to test_bit before set_bit or clear_bit operation. Therefore, a lot of force set and clear operations will be avoided, and most of bcache_dev_sectors_dirty_add() calls will only read CPU cache line.
Signed-off-by: Mingzhe Zou mingzhe.zou@easystack.cn Signed-off-by: Coly Li colyli@suse.de Reviewed-by: Jason Yan yanaijie@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/md/bcache/writeback.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c index 8120da278161..e93a09ccaddd 100644 --- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -585,10 +585,13 @@ void bcache_dev_sectors_dirty_add(struct cache_set *c, unsigned int inode,
sectors_dirty = atomic_add_return(s, d->stripe_sectors_dirty + stripe); - if (sectors_dirty == d->stripe_size) - set_bit(stripe, d->full_dirty_stripes); - else - clear_bit(stripe, d->full_dirty_stripes); + if (sectors_dirty == d->stripe_size) { + if (!test_bit(stripe, d->full_dirty_stripes)) + set_bit(stripe, d->full_dirty_stripes); + } else { + if (test_bit(stripe, d->full_dirty_stripes)) + clear_bit(stripe, d->full_dirty_stripes); + }
nr_sectors -= s; stripe_offset = 0;