
Kernel

kernel@openeuler.org

June 2023

  • 61 participants
  • 246 discussions
[PATCH OLK-5.10 0/1] x86/fpu: KABI_BROKEN_REMOVE "union fpregs_state state" from struct fpu
by Zheng Zengkai, 24 Jun '23
5a2451f10550 ("x86/fpu: Avoid kabi change caused by struct fpu") will lead to
performance degradation in the libmicro pthread_create testcase. Replace the
kabi fix macro KABI_DEPRECATE with KABI_BROKEN_REMOVE for the element
"union fpregs_state state" of struct fpu.

Zheng Zengkai (1):
  x86/fpu: KABI_BROKEN_REMOVE "union fpregs_state state" from struct fpu

 arch/x86/include/asm/fpu/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.20.1
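The cover letter above ships only a diffstat, so here is a minimal, self-contained sketch of the tradeoff it describes. The struct members and the 4 KiB size below are stand-ins, not the real kernel definitions: a KABI_DEPRECATE-style fix keeps the dead member as reserved padding (layout preserved, allocation unchanged), while a KABI_BROKEN_REMOVE-style fix deletes it outright, shrinking every task's FPU state at the cost of a broken kABI checksum.

#include <stdio.h>

/* Stand-in for the real union; the size is illustrative only. */
union fpregs_state { char regs[4096]; };

/* KABI_DEPRECATE-style: the member stays as dead reserved storage,
 * so sizeof() and the offsets of later members are unchanged. */
struct fpu_deprecated {
	unsigned int last_cpu;
	union fpregs_state deprecated_state;	/* unused, still allocated */
};

/* KABI_BROKEN_REMOVE-style: the member is gone and the struct shrinks,
 * which is what restores the pthread_create numbers. */
struct fpu_removed {
	unsigned int last_cpu;
	/* union fpregs_state state;  -- removed */
};

int main(void)
{
	printf("deprecated: %zu bytes, removed: %zu bytes\n",
	       sizeof(struct fpu_deprecated), sizeof(struct fpu_removed));
	return 0;
}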
[PATCH OLK-5.10] sched/rt: Fix possible warn when push_rt_task
by Hui Tang, 24 Jun '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FJB5

-------------------------------

A warn may be triggered during reboot, as follows:

reboot
  ->kernel_restart
    ->machine_restart
      ->smp_send_stop   --- ipi handler set_cpu_online(cpu, false)

balance_callback
  -> __balance_callback
    ->push_rt_task
      -> find_lock_lowest_rq   --- offline cpu in vec->mask is not cleared
        -> find_lowest_rq
          -> cpupri_find
            -> cpupri_find_fitness
              -> __cpupri_find   [cpumask_and(..., vec->mask)]
      -> set_task_cpu(next_task, lowest_rq->cpu)   --- WARN_ON(!cpu_online(cpu))

So add a !cpu_online(lowest_rq->cpu) check before set_task_cpu(). This does
not completely fix the problem, since cpu_online_mask may still be cleared
after the check.

Fixes: 4ff9083b8a9a8 ("sched/core: WARN() when migrating to an offline CPU")
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
---
 kernel/sched/rt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0f349d8d076d..ca868c04ff24 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1941,6 +1941,9 @@ static int push_rt_task(struct rq *rq)
 		goto retry;
 	}
 
+	if (unlikely(!cpu_online(lowest_rq->cpu)))
+		goto out;
+
 	deactivate_task(rq, next_task, 0);
 	set_task_cpu(next_task, lowest_rq->cpu);
 	activate_task(lowest_rq, next_task, 0);
--
2.17.1
[PATCH OLK-5.10] sched/rt: Fix possible warn when push_rt_task
by Hui Tang, 24 Jun '23
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FJB5

-------------------------------

A warn may be triggered during reboot, as follows:

reboot
  ->kernel_restart
    ->machine_restart
      ->smp_send_stop   --- ipi handler set_cpu_online(cpu, false)

balance_callback
  -> __balance_callback
    ->push_rt_task
      -> find_lock_lowest_rq   --- offline cpu in vec->mask is not cleared
        -> find_lowest_rq
          -> cpupri_find
            -> cpupri_find_fitness
              -> __cpupri_find   [cpumask_and(..., vec->mask)]
      -> set_task_cpu(next_task, lowest_rq->cpu)   --- WARN_ON(!cpu_online(cpu))

So add a !cpu_online(lowest_rq->cpu) check before set_task_cpu(). This does
not completely fix the problem, since cpu_online_mask may still be cleared
after the check.

Fixes: 4ff9083b8a9a8 ("sched/core: WARN() when migrating to an offline CPU")
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
---
 kernel/sched/rt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0f349d8d076d..ca868c04ff24 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1941,6 +1941,9 @@ static int push_rt_task(struct rq *rq)
 		goto retry;
 	}
 
+	if (unlikely(!cpu_online(lowest_rq->cpu)))
+		goto out;
+
 	deactivate_task(rq, next_task, 0);
 	set_task_cpu(next_task, lowest_rq->cpu);
 	activate_task(lowest_rq, next_task, 0);
--
2.17.1
[PATCH OLK-5.10 0/3] dm: requeue IO if mapping table not yet available
by Li Lingfeng, 24 Jun '23
It's not proper to just abort the IO when the map is not ready, so revert
this behavior and requeue the IO instead, to keep consistent with the
mainline community, and fix the deadlock introduced by the reverted patch.

Li Lingfeng (2):
  Revert "dm: make sure dm_table is binded before queue request"
  dm: don't lock fs when the map is NULL during suspend or resume

Mike Snitzer (1):
  dm: requeue IO if mapping table not yet available

 drivers/md/dm-rq.c |  6 ++----
 drivers/md/dm.c    | 15 +++++++--------
 2 files changed, 9 insertions(+), 12 deletions(-)

--
2.31.1
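No diff appears in this cover letter, so the following userspace sketch only models the policy change it describes; the enum mirrors the kernel's blk_status_t names, but none of this is the actual dm-rq code. The point is the return value: when no mapping table is bound yet, a "resource" status makes the block layer requeue and retry the request later, whereas an error status fails the IO outright.

#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for blk_status_t; values are illustrative. */
enum blk_status { BLK_STS_OK, BLK_STS_IOERR, BLK_STS_RESOURCE };

/* Reverted behavior: no bound table -> fail the IO immediately. */
static enum blk_status queue_rq_abort(bool map_ready)
{
	return map_ready ? BLK_STS_OK : BLK_STS_IOERR;
}

/* Behavior after this series: no bound table -> ask for a retry. */
static enum blk_status queue_rq_requeue(bool map_ready)
{
	return map_ready ? BLK_STS_OK : BLK_STS_RESOURCE;
}

int main(void)
{
	printf("abort policy: %d, requeue policy: %d\n",
	       queue_rq_abort(false), queue_rq_requeue(false));
	return 0;
}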
[PATCH OLK-5.10] dm thin metadata: check fail_io before using data_sm
by Li Lingfeng, 24 Jun '23
mainline inclusion
from mainline-v6.4-rc8
commit cb65b282c9640c27d3129e2e04b711ce1b352838
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FITX
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

----------------------------------------

Must check pmd->fail_io before using pmd->data_sm, since pmd->data_sm may be
destroyed by other processes:

P1(kworker)                                     P2(message)
do_worker
 process_prepared
  process_prepared_discard_passdown_pt2
   dm_pool_dec_data_range
                                                pool_message
                                                 commit
                                                  dm_pool_commit_metadata
                                                  ↓ // commit failed
                                                  metadata_operation_failed
                                                   abort_transaction
                                                    dm_pool_abort_metadata
                                                     __open_or_format_metadata
                                                     ↓ dm_sm_disk_open
                                                     ↓ // open failed
                                                     // pmd->data_sm is NULL
    dm_sm_dec_blocks
    ↓ // try to access pmd->data_sm --> UAF

As shown above, if dm_pool_commit_metadata() and dm_pool_abort_metadata()
fail in the pool_message process, the kworker may trigger a UAF.

Fixes: be500ed721a6 ("dm space maps: improve performance with inc/dec on ranges of blocks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
Conflicts:
	drivers/md/dm-thin-metadata.c
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
 drivers/md/dm-thin-metadata.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 85e80fc17641..a3eb430cb7dd 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -1771,13 +1771,15 @@ int dm_thin_remove_range(struct dm_thin_device *td,
 
 int dm_pool_block_is_shared(struct dm_pool_metadata *pmd, dm_block_t b, bool *result)
 {
-	int r;
+	int r = -EINVAL;
 	uint32_t ref_count;
 
 	down_read(&pmd->root_lock);
-	r = dm_sm_get_count(pmd->data_sm, b, &ref_count);
-	if (!r)
-		*result = (ref_count > 1);
+	if (!pmd->fail_io) {
+		r = dm_sm_get_count(pmd->data_sm, b, &ref_count);
+		if (!r)
+			*result = (ref_count > 1);
+	}
 	up_read(&pmd->root_lock);
 
 	return r;
@@ -1785,13 +1787,15 @@ int dm_pool_block_is_shared(struct dm_pool_metadata *pmd, dm_block_t b, bool *re
 
 int dm_pool_inc_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_t e)
 {
-	int r = 0;
+	int r = -EINVAL;
 
 	pmd_write_lock(pmd);
 	for (; b != e; b++) {
-		r = dm_sm_inc_block(pmd->data_sm, b);
-		if (r)
-			break;
+		if (!pmd->fail_io) {
+			r = dm_sm_inc_block(pmd->data_sm, b);
+			if (r)
+				break;
+		}
 	}
 	pmd_write_unlock(pmd);
 
@@ -1800,13 +1804,15 @@ int dm_pool_inc_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_
 
 int dm_pool_dec_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_t e)
 {
-	int r = 0;
+	int r = -EINVAL;
 
 	pmd_write_lock(pmd);
 	for (; b != e; b++) {
-		r = dm_sm_dec_block(pmd->data_sm, b);
-		if (r)
-			break;
+		if (!pmd->fail_io) {
+			r = dm_sm_dec_block(pmd->data_sm, b);
+			if (r)
+				break;
+		}
 	}
 	pmd_write_unlock(pmd);
--
2.31.1
[OLK-5.10 0/3] dm: requeue IO if mapping table not yet available
by Li Lingfeng, 24 Jun '23
It's not proper to just abort the IO when the map is not ready, so revert
this behavior and requeue the IO instead, to keep consistent with the
mainline community, and fix the deadlock introduced by the reverted patch.

Li Lingfeng (2):
  Revert "dm: make sure dm_table is binded before queue request"
  dm: don't lock fs when the map is NULL during suspend or resume

Mike Snitzer (1):
  dm: requeue IO if mapping table not yet available

 drivers/md/dm-rq.c |  6 ++----
 drivers/md/dm.c    | 15 +++++++--------
 2 files changed, 9 insertions(+), 12 deletions(-)

--
2.31.1
[PATCH openEuler-1.0-LTS 1/2] nbd: validate the block size in nbd_set_size
by Yongqiang Liu, 21 Jun '23
From: Christoph Hellwig <hch(a)lst.de>

mainline inclusion
from mainline-v5.1-rc1
commit dcbddf541f18e367ac9cdad8e223d382cd303161
category: bugfix
bugzilla: 188268, https://gitee.com/openeuler/kernel/issues/I6DC67
CVE: NA

----------------------------------------

Move the validation of the block size from the callers into nbd_set_size.

Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>

conflict:
	drivers/block/nbd.c

Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
 drivers/block/nbd.c | 46 +++++++++++++++++----------------------------
 1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a08f35946718..41bafd5094c3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -298,16 +298,21 @@ static void nbd_size_clear(struct nbd_device *nbd)
 	}
 }
 
-static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
+static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 		loff_t blksize)
 {
 	struct block_device *bdev;
 
+	if (!blksize)
+		blksize = NBD_DEF_BLKSIZE;
+	if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
+		return -EINVAL;
+
 	nbd->config->bytesize = bytesize;
 	nbd->config->blksize = blksize;
 
 	if (!nbd->pid)
-		return;
+		return 0;
 
 	if (nbd->config->flags & NBD_FLAG_SEND_TRIM) {
 		nbd->disk->queue->limits.discard_granularity = blksize;
@@ -327,6 +332,7 @@ static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 		bdput(bdev);
 	}
 	kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
+	return 0;
 }
 
 static void nbd_complete_rq(struct request *req)
@@ -1329,8 +1335,7 @@ static int nbd_start_device(struct nbd_device *nbd)
 		args->index = i;
 		queue_work(nbd->recv_workq, &args->work);
 	}
-	nbd_set_size(nbd, config->bytesize, config->blksize);
-	return error;
+	return nbd_set_size(nbd, config->bytesize, config->blksize);
 }
 
 static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
@@ -1377,14 +1382,6 @@ static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
 	nbd_config_put(nbd);
 }
 
-static bool nbd_is_valid_blksize(unsigned long blksize)
-{
-	if (!blksize || !is_power_of_2(blksize) || blksize < 512 ||
-	    blksize > PAGE_SIZE)
-		return false;
-	return true;
-}
-
 /* Must be called with config_lock held */
 static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 		       unsigned int cmd, unsigned long arg)
@@ -1401,20 +1398,13 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 	case NBD_SET_SOCK:
 		return nbd_add_socket(nbd, arg, false);
 	case NBD_SET_BLKSIZE:
-		if (!arg)
-			arg = NBD_DEF_BLKSIZE;
-		if (!nbd_is_valid_blksize(arg))
-			return -EINVAL;
-		nbd_set_size(nbd, config->bytesize, arg);
-		return 0;
+		return nbd_set_size(nbd, config->bytesize, arg);
 	case NBD_SET_SIZE:
-		nbd_set_size(nbd, arg, config->blksize);
-		return 0;
+		return nbd_set_size(nbd, arg, config->blksize);
 	case NBD_SET_SIZE_BLOCKS:
 		if (check_mul_overflow((loff_t)arg, config->blksize,
 				       &bytesize))
 			return -EINVAL;
-		nbd_set_size(nbd, bytesize, config->blksize);
-		return 0;
+		return nbd_set_size(nbd, bytesize, config->blksize);
 	case NBD_SET_TIMEOUT:
 		if (arg) {
 			nbd->tag_set.timeout = arg * HZ;
@@ -1946,18 +1936,16 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 	if (info->attrs[NBD_ATTR_SIZE_BYTES]) {
 		u64 bytes = nla_get_u64(info->attrs[NBD_ATTR_SIZE_BYTES]);
 
-		nbd_set_size(nbd, bytes, config->blksize);
+		ret = nbd_set_size(nbd, bytes, config->blksize);
+		if (ret)
+			goto out;
 	}
 	if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]) {
 		u64 bsize =
 			nla_get_u64(info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]);
-		if (!bsize)
-			bsize = NBD_DEF_BLKSIZE;
-		if (!nbd_is_valid_blksize(bsize)) {
-			ret = -EINVAL;
+		ret = nbd_set_size(nbd, config->bytesize, bsize);
+		if (ret)
 			goto out;
-		}
-		nbd_set_size(nbd, config->bytesize, bsize);
 	}
 	if (info->attrs[NBD_ATTR_TIMEOUT]) {
 		u64 timeout = nla_get_u64(info->attrs[NBD_ATTR_TIMEOUT]);
--
2.25.1
[PATCH] mm/memory_hotplug: extend offline_and_remove_memory() to handle more than one memory block
by Wupeng Ma, 21 Jun '23
From: David Hildenbrand <david(a)redhat.com>

mainline inclusion
from mainline-v5.11-rc1
commit 8dc4bb58a146655eb057247d7c9d19e73928715b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7F3HQ
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

virtio-mem soon wants to use offline_and_remove_memory() on memory that
exceeds a single Linux memory block (memory_block_size_bytes()). Let's
remove that restriction.

Let's remember the old state and try to restore that if anything goes
wrong. While re-onlining can, in general, fail, it's highly unlikely to
happen (usually only when a notifier fails to allocate memory, and these
are rather rare).

This will be used by virtio-mem to offline+remove memory ranges that are
bigger than a single memory block - for example, with a device block size
of 1 GiB (e.g., gigantic pages in the hypervisor) and a Linux memory block
size of 128MB.

While we could compress the state into 2 bits, using 8 bits is much easier.

This handling is similar, but different to acpi_scan_try_to_offline():

a) We don't try to offline twice. I am not sure if this CONFIG_MEMCG
   optimization is still relevant - it should only apply to ZONE_NORMAL
   (where we have no guarantees). If relevant, we can always add it.

b) acpi_scan_try_to_offline() simply onlines all memory in case something
   goes wrong. It doesn't restore previous online type. Let's do that, so
   we won't overwrite what e.g., user space configured.

Reviewed-by: Wei Yang <richard.weiyang(a)linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst(a)redhat.com>
Cc: Jason Wang <jasowang(a)redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux(a)gmail.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Yang <richard.weiyang(a)linux.alibaba.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-28-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Acked-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
---
 mm/memory_hotplug.c | 105 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 89 insertions(+), 16 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d2dd2bfcaac3..203c4eb59557 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1675,39 +1675,112 @@ int remove_memory(int nid, u64 start, u64 size)
 }
 EXPORT_SYMBOL_GPL(remove_memory);
 
+static int try_offline_memory_block(struct memory_block *mem, void *arg)
+{
+	uint8_t online_type = MMOP_ONLINE_KERNEL;
+	uint8_t **online_types = arg;
+	struct page *page;
+	int rc;
+
+	/*
+	 * Sense the online_type via the zone of the memory block. Offlining
+	 * with multiple zones within one memory block will be rejected
+	 * by offlining code ... so we don't care about that.
+	 */
+	page = pfn_to_online_page(section_nr_to_pfn(mem->start_section_nr));
+	if (page && zone_idx(page_zone(page)) == ZONE_MOVABLE)
+		online_type = MMOP_ONLINE_MOVABLE;
+
+	rc = device_offline(&mem->dev);
+	/*
+	 * Default is MMOP_OFFLINE - change it only if offlining succeeded,
+	 * so try_reonline_memory_block() can do the right thing.
+	 */
+	if (!rc)
+		**online_types = online_type;
+
+	(*online_types)++;
+	/* Ignore if already offline. */
+	return rc < 0 ? rc : 0;
+}
+
+static int try_reonline_memory_block(struct memory_block *mem, void *arg)
+{
+	uint8_t **online_types = arg;
+	int rc;
+
+	if (**online_types != MMOP_OFFLINE) {
+		mem->online_type = **online_types;
+		rc = device_online(&mem->dev);
+		if (rc < 0)
+			pr_warn("%s: Failed to re-online memory: %d",
+				__func__, rc);
+	}
+
+	/* Continue processing all remaining memory blocks. */
+	(*online_types)++;
+	return 0;
+}
+
 /*
- * Try to offline and remove a memory block. Might take a long time to
- * finish in case memory is still in use. Primarily useful for memory devices
- * that logically unplugged all memory (so it's no longer in use) and want to
- * offline + remove the memory block.
+ * Try to offline and remove memory. Might take a long time to finish in case
+ * memory is still in use. Primarily useful for memory devices that logically
+ * unplugged all memory (so it's no longer in use) and want to offline + remove
+ * that memory.
 */
 int offline_and_remove_memory(int nid, u64 start, u64 size)
 {
-	struct memory_block *mem;
-	int rc = -EINVAL;
+	const unsigned long mb_count = size / memory_block_size_bytes();
+	uint8_t *online_types, *tmp;
+	int rc;
 
 	if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
-	    size != memory_block_size_bytes())
-		return rc;
+	    !IS_ALIGNED(size, memory_block_size_bytes()) || !size)
+		return -EINVAL;
+
+	/*
+	 * We'll remember the old online type of each memory block, so we can
+	 * try to revert whatever we did when offlining one memory block fails
+	 * after offlining some others succeeded.
+	 */
+	online_types = kmalloc_array(mb_count, sizeof(*online_types),
+				     GFP_KERNEL);
+	if (!online_types)
+		return -ENOMEM;
+	/*
+	 * Initialize all states to MMOP_OFFLINE, so when we abort processing in
+	 * try_offline_memory_block(), we'll skip all unprocessed blocks in
+	 * try_reonline_memory_block().
+	 */
+	memset(online_types, MMOP_OFFLINE, mb_count);
 
 	lock_device_hotplug();
-	mem = find_memory_block(__pfn_to_section(PFN_DOWN(start)));
-	if (mem)
-		rc = device_offline(&mem->dev);
-	/* Ignore if the device is already offline. */
-	if (rc > 0)
-		rc = 0;
+
+	tmp = online_types;
+	rc = walk_memory_blocks(start, size, &tmp, try_offline_memory_block);
 
 	/*
-	 * In case we succeeded to offline the memory block, remove it.
+	 * In case we succeeded to offline all memory, remove it.
 	 * This cannot fail as it cannot get onlined in the meantime.
 	 */
 	if (!rc) {
 		rc = try_remove_memory(nid, start, size);
-		WARN_ON_ONCE(rc);
+		if (rc)
+			pr_err("%s: Failed to remove memory: %d", __func__, rc);
+	}
+
+	/*
+	 * Rollback what we did. While memory onlining might theoretically fail
+	 * (nacked by a notifier), it barely ever happens.
+	 */
+	if (rc) {
+		tmp = online_types;
+		walk_memory_blocks(start, size, &tmp,
+				   try_reonline_memory_block);
 	}
 	unlock_device_hotplug();
 
+	kfree(online_types);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(offline_and_remove_memory);
--
2.25.1
[PATCH OLK-5.10] iommu/iova: increase the iova_rcache depot max size to 128
by Zhang Zekun, 21 Jun '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ASVH
CVE: NA

---------------------------------------

In a fio test with iodepth=256 and allowed cpus 0-255, we observed a severe
performance decrease, and the cpu_rcache hit rate statistics were relatively
low. Here are some statistics about the iova_cpu_rcache of all cpus:

iova alloc order             0      1      2      3      4      5
----------------------------------------------------------------------
average cpu_rcache hit rate  0.9941 0.7408 0.8109 0.8854 0.9082 0.8887

Jobs: 12 (f=12): [R(12)][20.0%][r=1091MiB/s][r=279k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][22.2%][r=1426MiB/s][r=365k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][25.0%][r=1607MiB/s][r=411k IOPS][eta 00m:27s]
Jobs: 12 (f=12): [R(12)][27.8%][r=1501MiB/s][r=384k IOPS][eta 00m:26s]
Jobs: 12 (f=12): [R(12)][30.6%][r=1486MiB/s][r=380k IOPS][eta 00m:25s]
Jobs: 12 (f=12): [R(12)][33.3%][r=1393MiB/s][r=357k IOPS][eta 00m:24s]
Jobs: 12 (f=12): [R(12)][36.1%][r=1550MiB/s][r=397k IOPS][eta 00m:23s]
Jobs: 12 (f=12): [R(12)][38.9%][r=1485MiB/s][r=380k IOPS][eta 00m:22s]

The underlying hisi sas driver has 16 thread irqs to free iovas, but these
irq callback functions will only free iovas on 16 certain cpus
(cpu{0,16,32,...,240}). For example, the thread irq whose smp affinity is
0-15 will only free iovas on cpu 0. However, the driver allocs iovas on all
cpus (cpu{0-255}), so cpus without free iovas in their local cpu_rcache need
to get free iovas from iova_rcache->depot.

The current iova_rcache->depot max size is 32, which seems too small for 256
users (16 cpus put iovas into iova_rcache->depot and 240 cpus try to get
iovas from it). Setting the iova_rcache->depot max size to 128 fixes the
performance issue, and performance returns to normal:

iova alloc order             0      1      2      3      4      5
----------------------------------------------------------------------
average cpu_rcache hit rate  0.9925 0.9736 0.9789 0.9867 0.9889 0.9906

Jobs: 12 (f=12): [R(12)][12.9%][r=7526MiB/s][r=1927k IOPS][eta 04m:30s]
Jobs: 12 (f=12): [R(12)][13.2%][r=7527MiB/s][r=1927k IOPS][eta 04m:29s]
Jobs: 12 (f=12): [R(12)][13.5%][r=7529MiB/s][r=1927k IOPS][eta 04m:28s]
Jobs: 12 (f=12): [R(12)][13.9%][r=7531MiB/s][r=1928k IOPS][eta 04m:27s]
Jobs: 12 (f=12): [R(12)][14.2%][r=7529MiB/s][r=1928k IOPS][eta 04m:26s]
Jobs: 12 (f=12): [R(12)][14.5%][r=7528MiB/s][r=1927k IOPS][eta 04m:25s]
Jobs: 12 (f=12): [R(12)][14.8%][r=7527MiB/s][r=1927k IOPS][eta 04m:24s]
Jobs: 12 (f=12): [R(12)][15.2%][r=7525MiB/s][r=1926k IOPS][eta 04m:23s]

Signed-off-by: Zhang Zekun <zhangzekun11(a)huawei.com>
---
 include/linux/iova.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/iova.h b/include/linux/iova.h
index dfa51ae49666..5f7029f3d0f2 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -26,7 +26,7 @@ struct iova_magazine;
 struct iova_cpu_rcache;
 
 #define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
-#define MAX_GLOBAL_MAGS 32	/* magazines per bin */
+#define MAX_GLOBAL_MAGS 128	/* magazines per bin */
 
 struct iova_rcache {
 	spinlock_t lock;
--
2.17.1
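As a toy model of the starvation mechanism described above (every name and size here is invented for illustration; the real logic lives in drivers/iommu/iova.c): freed IOVAs fill per-CPU magazines that overflow into a fixed-size global depot, so with 16 producer CPUs and 240 consumer CPUs a small depot throws away most freed magazines and consumers fall through to the slow path.

#include <stdbool.h>
#include <stdio.h>

#define DEPOT_SLOTS 4	/* stand-in for MAX_GLOBAL_MAGS: compare 32 vs 128 */

struct depot { int mags; };	/* full magazines parked globally */

/* An irq CPU frees a full magazine; a full depot discards it. */
static bool depot_push(struct depot *d)
{
	if (d->mags >= DEPOT_SLOTS)
		return false;	/* freed back via the slow path */
	d->mags++;
	return true;
}

/* A submitting CPU with an empty local magazine tries the depot. */
static bool depot_pop(struct depot *d)
{
	if (d->mags == 0)
		return false;	/* miss: fall back to the rbtree slow path */
	d->mags--;
	return true;
}

int main(void)
{
	struct depot d = { 0 };
	int i, hits = 0;

	for (i = 0; i < 16; i++)	/* 16 irq CPUs produce */
		depot_push(&d);
	for (i = 0; i < 240; i++)	/* 240 CPUs consume */
		hits += depot_pop(&d);

	printf("depot hits: %d of 240 allocs\n", hits);
	return 0;
}

With DEPOT_SLOTS at 4, only 4 of the 16 freed magazines survive to serve the 240 allocating CPUs; raising it retains all 16, which is the direction the hit-rate tables above move when MAX_GLOBAL_MAGS goes from 32 to 128.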
[PATCH v2 openEuler-22.03-LTS-SP2] arm64: kernel: disable CNP on LINXICORE9100
by Tong Tiangen, 21 Jun '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7F28R
CVE: NA

--------------------------------

On Hisilicon LINXICORE9100 cores, sharing TLB entries between two cores when
TTBRx.CNP=1 differs from the standard ARM core, which causes issues when TLB
entries are shared between CPU cores. Avoid these issues by disabling the CNP
feature on Hisilicon LINXICORE9100 cores.

Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
---
 Documentation/arm64/silicon-errata.rst |  2 ++
 arch/arm64/Kconfig                     | 11 +++++++++++
 arch/arm64/configs/openeuler_defconfig |  1 +
 arch/arm64/include/asm/cpucaps.h       |  1 +
 arch/arm64/include/asm/cputype.h       |  2 ++
 arch/arm64/kernel/cpu_errata.c         | 14 ++++++++++++++
 arch/arm64/kernel/cpufeature.c         |  3 +++
 7 files changed, 34 insertions(+)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 2305def38396..41728b336105 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -151,6 +151,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | Hisilicon      | Hip09           | #162100801      | HISILICON_ERRATUM_162100801 |
 +----------------+-----------------+-----------------+-----------------------------+
+| Hisilicon      | LINXICORE9100   | #162100125      | HISILICON_ERRATUM_162100125 |
++----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
 | Qualcomm Tech. | Kryo/Falkor v1  | E1003           | QCOM_FALKOR_ERRATUM_1003    |
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e0143a3a9937..9a238d088245 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -812,6 +812,17 @@ config HISILICON_ERRATUM_162100801
 
 	  If unsure, say Y.
 
+config HISILICON_ERRATUM_162100125
+	bool "Hisilicon erratum 162100125"
+	default y
+	help
+	  On Hisilicon LINXICORE9100 cores, sharing tlb entries on two cores when
+	  TTBRx.CNP=1 differs from the standard ARM core. This causes issues when
+	  tlb entries sharing between CPU cores. Avoid these issues by disabling
+	  CNP support for Hisilicon LINXICORE9100 cores.
+
+	  If unsure, say Y.
+
 config QCOM_FALKOR_ERRATUM_1003
 	bool "Falkor E1003: Incorrect translation due to ASID change"
 	default y
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 49e6d4734c59..eb4ee0522446 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -391,6 +391,7 @@ CONFIG_SOCIONEXT_SYNQUACER_PREITS=y
 CONFIG_HISILICON_ERRATUM_HIP08_RU_PREFETCH=y
 # CONFIG_HISILICON_HIP08_RU_PREFETCH_DEFAULT_OFF is not set
 CONFIG_HISILICON_ERRATUM_162100801=y
+CONFIG_HISILICON_ERRATUM_162100125=y
 # end of ARM errata workarounds via the alternatives framework
 
 CONFIG_ARM64_4K_PAGES=y
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 37240e1f553c..b43f8e374114 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -74,6 +74,7 @@
 #define ARM64_SPECTRE_BHB			66
 #define ARM64_WORKAROUND_1742098		67
 #define ARM64_HAS_WFXT				68
+#define ARM64_WORKAROUND_HISILICON_ERRATUM_162100125	69
 
 #define ARM64_NCAPS				80
 
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 812781fba3f9..e6b0ec40932d 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -111,6 +111,7 @@
 #define HISI_CPU_PART_TSV110		0xD01
 #define HISI_CPU_PART_TSV200		0xD02
+#define HISI_CPU_PART_LINXICORE9100	0xD02
 
 #define PHYTIUM_CPU_PART_1500A	0X660
 #define PHYTIUM_CPU_PART_2000AHK	0X661
@@ -161,6 +162,7 @@
 #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
 #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
 #define MIDR_HISI_TSV200 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV200)
+#define MIDR_HISI_LINXICORE9100 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_LINXICORE9100)
 #define MIDR_FT_1500A	MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_1500A)
 #define MIDR_FT_2000AHK	MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2000AHK)
 #define MIDR_FT_2000PLUS	MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2000PLUS)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 0955af96391b..7f175b3aac15 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -324,6 +324,13 @@ static const struct midr_range cavium_erratum_30115_cpus[] = {
 };
 #endif
 
+#ifdef CONFIG_HISILICON_ERRATUM_162100125
+static const struct midr_range hisilicon_erratum_162100125_cpus[] = {
+	MIDR_REV(MIDR_HISI_LINXICORE9100, 0, 0),
+	{},
+};
+#endif
+
 #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
 static const struct arm64_cpu_capabilities qcom_erratum_1003_list[] = {
 	{
@@ -519,6 +526,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		.cpu_enable = hisilicon_1980005_enable,
 	},
 #endif
+#ifdef CONFIG_HISILICON_ERRATUM_162100125
+	{
+		.desc = "Hisilicon erratum 162100125",
+		.capability = ARM64_WORKAROUND_HISILICON_ERRATUM_162100125,
+		ERRATA_MIDR_RANGE_LIST(hisilicon_erratum_162100125_cpus),
+	},
+#endif
 #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
 	{
 		.desc = "Qualcomm Technologies Falkor/Kryo erratum 1003",
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 3b746db0f40c..57631fa553f6 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1330,6 +1330,9 @@ has_useable_cnp(const struct arm64_cpu_capabilities *entry, int scope)
 	if (is_kdump_kernel())
 		return false;
 
+	if (cpus_have_const_cap(ARM64_WORKAROUND_HISILICON_ERRATUM_162100125))
+		return false;
+
 	return has_cpuid_feature(entry, scope);
 }
 
--
2.25.1

Powered by HyperKitty