Kernel

kernel@openeuler.org

  • 52 participants
  • 18279 discussions
[PATCH OLK-5.10] mm/memory_hotplug: extend offline_and_remove_memory() to handle more than one memory block
by Wupeng Ma 25 Jun '23

From: David Hildenbrand <david(a)redhat.com>

mainline inclusion
from mainline-v5.11-rc1
commit 8dc4bb58a146655eb057247d7c9d19e73928715b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7F3HQ
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

virtio-mem soon wants to use offline_and_remove_memory() for memory that
exceeds a single Linux memory block (memory_block_size_bytes()). Let's
remove that restriction.

Let's remember the old state and try to restore that if anything goes
wrong. While re-onlining can, in general, fail, it's highly unlikely to
happen (usually only when a notifier fails to allocate memory, and these
are rather rare).

This will be used by virtio-mem to offline+remove memory ranges that are
bigger than a single memory block - for example, with a device block size
of 1 GiB (e.g., gigantic pages in the hypervisor) and a Linux memory block
size of 128 MiB.

While we could compress the state into 2 bit, using 8 bit is much easier.

This handling is similar, but different to acpi_scan_try_to_offline():

a) We don't try to offline twice. I am not sure if this CONFIG_MEMCG
   optimization is still relevant - it should only apply to ZONE_NORMAL
   (where we have no guarantees). If relevant, we can always add it.

b) acpi_scan_try_to_offline() simply onlines all memory in case something
   goes wrong. It doesn't restore the previous online type. Let's do that,
   so we won't overwrite what e.g., user space configured.

Reviewed-by: Wei Yang <richard.weiyang(a)linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst(a)redhat.com>
Cc: Jason Wang <jasowang(a)redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux(a)gmail.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Yang <richard.weiyang(a)linux.alibaba.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-28-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Acked-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
---
 mm/memory_hotplug.c | 105 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 89 insertions(+), 16 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d2dd2bfcaac3..203c4eb59557 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1675,39 +1675,112 @@ int remove_memory(int nid, u64 start, u64 size)
 }
 EXPORT_SYMBOL_GPL(remove_memory);

+static int try_offline_memory_block(struct memory_block *mem, void *arg)
+{
+        uint8_t online_type = MMOP_ONLINE_KERNEL;
+        uint8_t **online_types = arg;
+        struct page *page;
+        int rc;
+
+        /*
+         * Sense the online_type via the zone of the memory block. Offlining
+         * with multiple zones within one memory block will be rejected
+         * by offlining code ... so we don't care about that.
+         */
+        page = pfn_to_online_page(section_nr_to_pfn(mem->start_section_nr));
+        if (page && zone_idx(page_zone(page)) == ZONE_MOVABLE)
+                online_type = MMOP_ONLINE_MOVABLE;
+
+        rc = device_offline(&mem->dev);
+        /*
+         * Default is MMOP_OFFLINE - change it only if offlining succeeded,
+         * so try_reonline_memory_block() can do the right thing.
+         */
+        if (!rc)
+                **online_types = online_type;
+
+        (*online_types)++;
+        /* Ignore if already offline. */
+        return rc < 0 ? rc : 0;
+}
+
+static int try_reonline_memory_block(struct memory_block *mem, void *arg)
+{
+        uint8_t **online_types = arg;
+        int rc;
+
+        if (**online_types != MMOP_OFFLINE) {
+                mem->online_type = **online_types;
+                rc = device_online(&mem->dev);
+                if (rc < 0)
+                        pr_warn("%s: Failed to re-online memory: %d",
+                                __func__, rc);
+        }
+
+        /* Continue processing all remaining memory blocks. */
+        (*online_types)++;
+        return 0;
+}
+
 /*
- * Try to offline and remove a memory block. Might take a long time to
- * finish in case memory is still in use. Primarily useful for memory devices
- * that logically unplugged all memory (so it's no longer in use) and want to
- * offline + remove the memory block.
+ * Try to offline and remove memory. Might take a long time to finish in case
+ * memory is still in use. Primarily useful for memory devices that logically
+ * unplugged all memory (so it's no longer in use) and want to offline + remove
+ * that memory.
  */
 int offline_and_remove_memory(int nid, u64 start, u64 size)
 {
-        struct memory_block *mem;
-        int rc = -EINVAL;
+        const unsigned long mb_count = size / memory_block_size_bytes();
+        uint8_t *online_types, *tmp;
+        int rc;

         if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
-            size != memory_block_size_bytes())
-                return rc;
+            !IS_ALIGNED(size, memory_block_size_bytes()) || !size)
+                return -EINVAL;
+
+        /*
+         * We'll remember the old online type of each memory block, so we can
+         * try to revert whatever we did when offlining one memory block fails
+         * after offlining some others succeeded.
+         */
+        online_types = kmalloc_array(mb_count, sizeof(*online_types),
+                                     GFP_KERNEL);
+        if (!online_types)
+                return -ENOMEM;
+        /*
+         * Initialize all states to MMOP_OFFLINE, so when we abort processing in
+         * try_offline_memory_block(), we'll skip all unprocessed blocks in
+         * try_reonline_memory_block().
+         */
+        memset(online_types, MMOP_OFFLINE, mb_count);

         lock_device_hotplug();
-        mem = find_memory_block(__pfn_to_section(PFN_DOWN(start)));
-        if (mem)
-                rc = device_offline(&mem->dev);
-        /* Ignore if the device is already offline. */
-        if (rc > 0)
-                rc = 0;
+
+        tmp = online_types;
+        rc = walk_memory_blocks(start, size, &tmp, try_offline_memory_block);

         /*
-         * In case we succeeded to offline the memory block, remove it.
+         * In case we succeeded to offline all memory, remove it.
          * This cannot fail as it cannot get onlined in the meantime.
          */
         if (!rc) {
                 rc = try_remove_memory(nid, start, size);
-                WARN_ON_ONCE(rc);
+                if (rc)
+                        pr_err("%s: Failed to remove memory: %d", __func__, rc);
+        }
+
+        /*
+         * Rollback what we did. While memory onlining might theoretically fail
+         * (nacked by a notifier), it barely ever happens.
+         */
+        if (rc) {
+                tmp = online_types;
+                walk_memory_blocks(start, size, &tmp,
+                                   try_reonline_memory_block);
         }
         unlock_device_hotplug();

+        kfree(online_types);
         return rc;
 }
 EXPORT_SYMBOL_GPL(offline_and_remove_memory);
--
2.25.1
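Editorial note: as context for the change above, the sketch below shows how a memory device driver such as virtio-mem might call the extended helper over a range spanning several Linux memory blocks. The driver-side wrapper (my_dev_unplug_range) and the 1 GiB device-block scenario are illustrative assumptions, not part of the patch; only offline_and_remove_memory(), memory_block_size_bytes() and the alignment rules come from the patched code.

/*
 * Illustrative sketch only - not part of the patch above.
 */
#include <linux/memory.h>
#include <linux/memory_hotplug.h>

static int my_dev_unplug_range(int nid, u64 start, u64 size)
{
        int rc;

        /* The range must cover whole, aligned Linux memory blocks. */
        if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
            !IS_ALIGNED(size, memory_block_size_bytes()) || !size)
                return -EINVAL;

        /*
         * Offline + remove e.g. a 1 GiB device block made up of eight
         * 128 MiB Linux memory blocks in a single call. On failure, the
         * helper re-onlines the blocks it already offlined, restoring
         * their previous online type.
         */
        rc = offline_and_remove_memory(nid, start, size);
        if (rc)
                pr_debug("unplug of %#llx+%#llx failed: %d\n",
                         start, size, rc);
        return rc;
}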
[PATCH openEuler-22.03-LTS-SP2 0/1] openEuler: introduced OPENEULER_LTS to identify LTS Release
by Xie XiuQi 24 Jun '23

In most cases, an out-of-tree module needs to identify the openEuler release
version for interface adaptation. The existing OPENEULER_VERSION() and
OPENEULER_VERSION_CODE() macros cannot distinguish between LTS releases and
innovation releases, so a new macro, OPENEULER_LTS, is introduced.

Xie XiuQi (1):
  openEuler: introduced OPENEULER_LTS to identify LTS Release

 Makefile | 2 ++
 1 file changed, 2 insertions(+)

--
2.20.1
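Editorial note: a hypothetical usage sketch of the new macro follows. The cover letter does not show how OPENEULER_LTS is defined, so the #ifdef test and the MYDRV_* names are assumptions, not the documented interface.

/*
 * Hypothetical sketch - OPENEULER_LTS is assumed here to be defined only
 * on LTS releases; the actual definition is not shown in this cover letter.
 */
#if defined(OPENEULER_VERSION_CODE) && defined(OPENEULER_LTS)
/* openEuler LTS release: stick to the KABI-stable interface. */
#define MYDRV_USE_STABLE_API    1
#else
/* Innovation release or non-openEuler kernel. */
#define MYDRV_USE_STABLE_API    0
#endif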
[PATCH OLK-5.10 0/1] x86/fpu: KABI_BROKEN_REMOVE "union fpregs_state state" from struct fpu
by Zheng Zengkai 24 Jun '23

Commit 5a2451f10550 ("x86/fpu: Avoid kabi change caused by struct fpu") leads
to a performance degradation in the libmicro pthread_create test case.
Replace the kabi fix macro for the "union fpregs_state state" member of
struct fpu from KABI_DEPRECATE to KABI_BROKEN_REMOVE.

Zheng Zengkai (1):
  x86/fpu: KABI_BROKEN_REMOVE "union fpregs_state state" from struct fpu

 arch/x86/include/asm/fpu/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.20.1
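Editorial note: the trade-off between the two kabi approaches can be pictured schematically as below. This is an illustration only, not the actual include/linux/kabi.h macro definitions and not the real struct fpu layout; member names here are placeholders.

/*
 * Illustration only. A KABI_DEPRECATE-style fix keeps the old member around
 * as padding, so structure size and offsets (the KABI) are preserved, but
 * every struct fpu still carries the large, unused union fpregs_state -
 * which is what slowed down libmicro's pthread_create case. A
 * KABI_BROKEN_REMOVE-style fix really drops the member, shrinking the
 * structure and restoring performance at the cost of an acknowledged
 * KABI break.
 */
struct fpu_with_deprecated_member {             /* KABI preserved, memory wasted */
        unsigned int last_cpu;
        /* ... */
        union fpregs_state deprecated_state;    /* kept only as padding */
};

struct fpu_with_member_removed {                /* KABI broken, memory reclaimed */
        unsigned int last_cpu;
        /* ... */
        /* union fpregs_state state;               removed */
};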
[PATCH OLK-5.10] sched/rt: Fix possible warn when push_rt_task
by Hui Tang 24 Jun '23

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FJB5

-------------------------------

A warning may be triggered during reboot, as follows:

  reboot
    -> kernel_restart
      -> machine_restart
        -> smp_send_stop           --- IPI handler: set_cpu_online(cpu, false)

  balance_callback
    -> __balance_callback
      -> push_rt_task
        -> find_lock_lowest_rq     --- offline cpu in vec->mask not yet cleared
          -> find_lowest_rq
            -> cpupri_find
              -> cpupri_find_fitness
                -> __cpupri_find   [cpumask_and(..., vec->mask)]
        -> set_task_cpu(next_task, lowest_rq->cpu)
                                   --- WARN_ON(!cpu_online(cpu))

So add a !cpu_online(lowest_rq->cpu) check before set_task_cpu(). This does
not completely fix the problem, since cpu_online_mask may still be cleared
after the check.

Fixes: 4ff9083b8a9a8 ("sched/core: WARN() when migrating to an offline CPU")
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
---
 kernel/sched/rt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0f349d8d076d..ca868c04ff24 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1941,6 +1941,9 @@ static int push_rt_task(struct rq *rq)
                 goto retry;
         }

+        if (unlikely(!cpu_online(lowest_rq->cpu)))
+                goto out;
+
         deactivate_task(rq, next_task, 0);
         set_task_cpu(next_task, lowest_rq->cpu);
         activate_task(lowest_rq, next_task, 0);
--
2.17.1
[PATCH OLK-5.10] sched/rt: Fix possible warn when push_rt_task
by Hui Tang 24 Jun '23

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FJB5

-------------------------------

A warning may be triggered during reboot, as follows:

  reboot
    -> kernel_restart
      -> machine_restart
        -> smp_send_stop           --- IPI handler: set_cpu_online(cpu, false)

  balance_callback
    -> __balance_callback
      -> push_rt_task
        -> find_lock_lowest_rq     --- offline cpu in vec->mask not yet cleared
          -> find_lowest_rq
            -> cpupri_find
              -> cpupri_find_fitness
                -> __cpupri_find   [cpumask_and(..., vec->mask)]
        -> set_task_cpu(next_task, lowest_rq->cpu)
                                   --- WARN_ON(!cpu_online(cpu))

So add a !cpu_online(lowest_rq->cpu) check before set_task_cpu(). This does
not completely fix the problem, since cpu_online_mask may still be cleared
after the check.

Fixes: 4ff9083b8a9a8 ("sched/core: WARN() when migrating to an offline CPU")
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
---
 kernel/sched/rt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0f349d8d076d..ca868c04ff24 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1941,6 +1941,9 @@ static int push_rt_task(struct rq *rq)
                 goto retry;
         }

+        if (unlikely(!cpu_online(lowest_rq->cpu)))
+                goto out;
+
         deactivate_task(rq, next_task, 0);
         set_task_cpu(next_task, lowest_rq->cpu);
         activate_task(lowest_rq, next_task, 0);
--
2.17.1
[PATCH OLK-5.10 0/3] dm: requeue IO if mapping table not yet available
by Li Lingfeng 24 Jun '23

It's not proper to simply abort IO when the mapping table is not ready, so
revert this and requeue the IO to stay consistent with the upstream
community, and fix the deadlock introduced by that patch.

Li Lingfeng (2):
  Revert "dm: make sure dm_table is binded before queue request"
  dm: don't lock fs when the map is NULL during suspend or resume

Mike Snitzer (1):
  dm: requeue IO if mapping table not yet available

 drivers/md/dm-rq.c |  6 ++----
 drivers/md/dm.c    | 15 +++++++--------
 2 files changed, 9 insertions(+), 12 deletions(-)

--
2.31.1
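Editorial note: the cover letter only carries the diffstat, so as a rough illustration of the "requeue instead of abort" idea in the request-based path, a conceptual sketch follows. It is not the actual backport; the function name is invented, and it assumes the drivers/md internal helpers dm_get_live_table()/dm_put_live_table() and the usual blk-mq retry semantics of BLK_STS_RESOURCE.

/*
 * Conceptual sketch only. blk-mq re-dispatches a request whose ->queue_rq
 * handler returns BLK_STS_RESOURCE, so returning it while no mapping table
 * is bound requeues the IO until a table is loaded, instead of aborting it.
 */
static blk_status_t sketch_queue_rq(struct blk_mq_hw_ctx *hctx,
                                    const struct blk_mq_queue_data *bd)
{
        struct mapped_device *md = hctx->queue->queuedata;
        struct dm_table *map;
        int srcu_idx;

        map = dm_get_live_table(md, &srcu_idx);
        if (unlikely(!map)) {
                /* No table bound yet: ask blk-mq to retry this request later. */
                dm_put_live_table(md, srcu_idx);
                return BLK_STS_RESOURCE;
        }

        /* ... normal clone-and-dispatch path elided ... */
        dm_put_live_table(md, srcu_idx);
        return BLK_STS_OK;
}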
[PATCH OLK-5.10] dm thin metadata: check fail_io before using data_sm
by Li Lingfeng 24 Jun '23

mainline inclusion
from mainline-v6.4-rc8
commit cb65b282c9640c27d3129e2e04b711ce1b352838
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FITX
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

----------------------------------------

Must check pmd->fail_io before using pmd->data_sm since pmd->data_sm may be
destroyed by other processes.

  P1(kworker)                               P2(message)
  do_worker
    process_prepared
      process_prepared_discard_passdown_pt2
        dm_pool_dec_data_range
                                            pool_message
                                              commit
                                                dm_pool_commit_metadata
                                                  |
                                                // commit failed
                                              metadata_operation_failed
                                                abort_transaction
                                                  dm_pool_abort_metadata
                                                    __open_or_format_metadata
                                                      |
                                                    dm_sm_disk_open
                                                      |
                                                    // open failed
                                                    // pmd->data_sm is NULL
          dm_sm_dec_blocks
            |
          // try to access pmd->data_sm --> UAF

As shown above, if dm_pool_commit_metadata() and dm_pool_abort_metadata()
fail in the pool_message process, the kworker may trigger a UAF.

Fixes: be500ed721a6 ("dm space maps: improve performance with inc/dec on ranges of blocks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>

Conflicts:
  drivers/md/dm-thin-metadata.c

Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
 drivers/md/dm-thin-metadata.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 85e80fc17641..a3eb430cb7dd 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -1771,13 +1771,15 @@ int dm_thin_remove_range(struct dm_thin_device *td,

 int dm_pool_block_is_shared(struct dm_pool_metadata *pmd, dm_block_t b, bool *result)
 {
-        int r;
+        int r = -EINVAL;
         uint32_t ref_count;

         down_read(&pmd->root_lock);
-        r = dm_sm_get_count(pmd->data_sm, b, &ref_count);
-        if (!r)
-                *result = (ref_count > 1);
+        if (!pmd->fail_io) {
+                r = dm_sm_get_count(pmd->data_sm, b, &ref_count);
+                if (!r)
+                        *result = (ref_count > 1);
+        }
         up_read(&pmd->root_lock);

         return r;
@@ -1785,13 +1787,15 @@ int dm_pool_block_is_shared(struct dm_pool_metadata *pmd, dm_block_t b, bool *re

 int dm_pool_inc_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_t e)
 {
-        int r = 0;
+        int r = -EINVAL;

         pmd_write_lock(pmd);
         for (; b != e; b++) {
-                r = dm_sm_inc_block(pmd->data_sm, b);
-                if (r)
-                        break;
+                if (!pmd->fail_io) {
+                        r = dm_sm_inc_block(pmd->data_sm, b);
+                        if (r)
+                                break;
+                }
         }
         pmd_write_unlock(pmd);

@@ -1800,13 +1804,15 @@ int dm_pool_inc_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_

 int dm_pool_dec_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_t e)
 {
-        int r = 0;
+        int r = -EINVAL;

         pmd_write_lock(pmd);
         for (; b != e; b++) {
-                r = dm_sm_dec_block(pmd->data_sm, b);
-                if (r)
-                        break;
+                if (!pmd->fail_io) {
+                        r = dm_sm_dec_block(pmd->data_sm, b);
+                        if (r)
+                                break;
+                }
         }
         pmd_write_unlock(pmd);
--
2.31.1
[OLK-5.10 0/3] dm: requeue IO if mapping table not yet available
by Li Lingfeng 24 Jun '23

It's not proper to simply abort IO when the mapping table is not ready, so
revert this and requeue the IO to stay consistent with the upstream
community, and fix the deadlock introduced by that patch.

Li Lingfeng (2):
  Revert "dm: make sure dm_table is binded before queue request"
  dm: don't lock fs when the map is NULL during suspend or resume

Mike Snitzer (1):
  dm: requeue IO if mapping table not yet available

 drivers/md/dm-rq.c |  6 ++----
 drivers/md/dm.c    | 15 +++++++--------
 2 files changed, 9 insertions(+), 12 deletions(-)

--
2.31.1
[PATCH openEuler-1.0-LTS 1/2] nbd: validate the block size in nbd_set_size
by Yongqiang Liu 21 Jun '23

From: Christoph Hellwig <hch(a)lst.de>

mainline inclusion
from mainline-v5.1-rc1
commit dcbddf541f18e367ac9cdad8e223d382cd303161
category: bugfix
bugzilla: 188268, https://gitee.com/openeuler/kernel/issues/I6DC67
CVE: NA

----------------------------------------

Move the validation of the block size from the callers into nbd_set_size().

Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>

conflict:
  drivers/block/nbd.c

Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
 drivers/block/nbd.c | 46 +++++++++++++++++----------------------------
 1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a08f35946718..41bafd5094c3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -298,16 +298,21 @@ static void nbd_size_clear(struct nbd_device *nbd)
         }
 }

-static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
+static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
                 loff_t blksize)
 {
         struct block_device *bdev;

+        if (!blksize)
+                blksize = NBD_DEF_BLKSIZE;
+        if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
+                return -EINVAL;
+
         nbd->config->bytesize = bytesize;
         nbd->config->blksize = blksize;

         if (!nbd->pid)
-                return;
+                return 0;

         if (nbd->config->flags & NBD_FLAG_SEND_TRIM) {
                 nbd->disk->queue->limits.discard_granularity = blksize;
@@ -327,6 +332,7 @@ static void nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
                 bdput(bdev);
         }
         kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
+        return 0;
 }

 static void nbd_complete_rq(struct request *req)
@@ -1329,8 +1335,7 @@ static int nbd_start_device(struct nbd_device *nbd)
                 args->index = i;
                 queue_work(nbd->recv_workq, &args->work);
         }
-        nbd_set_size(nbd, config->bytesize, config->blksize);
-        return error;
+        return nbd_set_size(nbd, config->bytesize, config->blksize);
 }

 static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
@@ -1377,14 +1382,6 @@ static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
         nbd_config_put(nbd);
 }

-static bool nbd_is_valid_blksize(unsigned long blksize)
-{
-        if (!blksize || !is_power_of_2(blksize) || blksize < 512 ||
-            blksize > PAGE_SIZE)
-                return false;
-        return true;
-}
-
 /* Must be called with config_lock held */
 static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
                        unsigned int cmd, unsigned long arg)
@@ -1401,20 +1398,13 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
         case NBD_SET_SOCK:
                 return nbd_add_socket(nbd, arg, false);
         case NBD_SET_BLKSIZE:
-                if (!arg)
-                        arg = NBD_DEF_BLKSIZE;
-                if (!nbd_is_valid_blksize(arg))
-                        return -EINVAL;
-                nbd_set_size(nbd, config->bytesize, arg);
-                return 0;
+                return nbd_set_size(nbd, config->bytesize, arg);
         case NBD_SET_SIZE:
-                nbd_set_size(nbd, arg, config->blksize);
-                return 0;
+                return nbd_set_size(nbd, arg, config->blksize);
         case NBD_SET_SIZE_BLOCKS:
                 if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize))
                         return -EINVAL;
-                nbd_set_size(nbd, bytesize, config->blksize);
-                return 0;
+                return nbd_set_size(nbd, bytesize, config->blksize);
         case NBD_SET_TIMEOUT:
                 if (arg) {
                         nbd->tag_set.timeout = arg * HZ;
@@ -1946,18 +1936,16 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
         if (info->attrs[NBD_ATTR_SIZE_BYTES]) {
                 u64 bytes = nla_get_u64(info->attrs[NBD_ATTR_SIZE_BYTES]);

-                nbd_set_size(nbd, bytes, config->blksize);
+                ret = nbd_set_size(nbd, bytes, config->blksize);
+                if (ret)
+                        goto out;
         }
         if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]) {
                 u64 bsize = nla_get_u64(info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]);
-                if (!bsize)
-                        bsize = NBD_DEF_BLKSIZE;
-                if (!nbd_is_valid_blksize(bsize)) {
-                        ret = -EINVAL;
+                ret = nbd_set_size(nbd, config->bytesize, bsize);
+                if (ret)
                         goto out;
-                }
-                nbd_set_size(nbd, config->bytesize, bsize);
         }
         if (info->attrs[NBD_ATTR_TIMEOUT]) {
                 u64 timeout = nla_get_u64(info->attrs[NBD_ATTR_TIMEOUT]);
--
2.25.1
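Editorial note: to make the effect of the centralised check concrete, a small userspace illustration follows. It is not part of the patch; /dev/nbd0 and the block sizes are illustrative, and the described behaviour assumes the patched kernel, where every path that sets the block size funnels through nbd_set_size().

/*
 * Userspace illustration. With validation moved into nbd_set_size(), an
 * invalid block size is rejected with EINVAL whether it arrives via ioctl
 * or via the netlink interface.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nbd.h>

int main(void)
{
        int fd = open("/dev/nbd0", O_RDWR);

        if (fd < 0) {
                perror("open /dev/nbd0");
                return 1;
        }

        /* 4096 is a power of two within [512, PAGE_SIZE]: accepted. */
        if (ioctl(fd, NBD_SET_BLKSIZE, 4096UL) < 0)
                perror("NBD_SET_BLKSIZE 4096");

        /* 768 is not a power of two: rejected by the common check. */
        if (ioctl(fd, NBD_SET_BLKSIZE, 768UL) < 0)
                perror("NBD_SET_BLKSIZE 768");

        close(fd);
        return 0;
}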
[PATCH] mm/memory_hotplug: extend offline_and_remove_memory() to handle more than one memory block
by Wupeng Ma 21 Jun '23

From: David Hildenbrand <david(a)redhat.com>

mainline inclusion
from mainline-v5.11-rc1
commit 8dc4bb58a146655eb057247d7c9d19e73928715b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7F3HQ
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

virtio-mem soon wants to use offline_and_remove_memory() for memory that
exceeds a single Linux memory block (memory_block_size_bytes()). Let's
remove that restriction.

Let's remember the old state and try to restore that if anything goes
wrong. While re-onlining can, in general, fail, it's highly unlikely to
happen (usually only when a notifier fails to allocate memory, and these
are rather rare).

This will be used by virtio-mem to offline+remove memory ranges that are
bigger than a single memory block - for example, with a device block size
of 1 GiB (e.g., gigantic pages in the hypervisor) and a Linux memory block
size of 128 MiB.

While we could compress the state into 2 bit, using 8 bit is much easier.

This handling is similar, but different to acpi_scan_try_to_offline():

a) We don't try to offline twice. I am not sure if this CONFIG_MEMCG
   optimization is still relevant - it should only apply to ZONE_NORMAL
   (where we have no guarantees). If relevant, we can always add it.

b) acpi_scan_try_to_offline() simply onlines all memory in case something
   goes wrong. It doesn't restore the previous online type. Let's do that,
   so we won't overwrite what e.g., user space configured.

Reviewed-by: Wei Yang <richard.weiyang(a)linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst(a)redhat.com>
Cc: Jason Wang <jasowang(a)redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux(a)gmail.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Yang <richard.weiyang(a)linux.alibaba.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-28-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Acked-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
---
 mm/memory_hotplug.c | 105 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 89 insertions(+), 16 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d2dd2bfcaac3..203c4eb59557 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1675,39 +1675,112 @@ int remove_memory(int nid, u64 start, u64 size)
 }
 EXPORT_SYMBOL_GPL(remove_memory);

+static int try_offline_memory_block(struct memory_block *mem, void *arg)
+{
+        uint8_t online_type = MMOP_ONLINE_KERNEL;
+        uint8_t **online_types = arg;
+        struct page *page;
+        int rc;
+
+        /*
+         * Sense the online_type via the zone of the memory block. Offlining
+         * with multiple zones within one memory block will be rejected
+         * by offlining code ... so we don't care about that.
+         */
+        page = pfn_to_online_page(section_nr_to_pfn(mem->start_section_nr));
+        if (page && zone_idx(page_zone(page)) == ZONE_MOVABLE)
+                online_type = MMOP_ONLINE_MOVABLE;
+
+        rc = device_offline(&mem->dev);
+        /*
+         * Default is MMOP_OFFLINE - change it only if offlining succeeded,
+         * so try_reonline_memory_block() can do the right thing.
+         */
+        if (!rc)
+                **online_types = online_type;
+
+        (*online_types)++;
+        /* Ignore if already offline. */
+        return rc < 0 ? rc : 0;
+}
+
+static int try_reonline_memory_block(struct memory_block *mem, void *arg)
+{
+        uint8_t **online_types = arg;
+        int rc;
+
+        if (**online_types != MMOP_OFFLINE) {
+                mem->online_type = **online_types;
+                rc = device_online(&mem->dev);
+                if (rc < 0)
+                        pr_warn("%s: Failed to re-online memory: %d",
+                                __func__, rc);
+        }
+
+        /* Continue processing all remaining memory blocks. */
+        (*online_types)++;
+        return 0;
+}
+
 /*
- * Try to offline and remove a memory block. Might take a long time to
- * finish in case memory is still in use. Primarily useful for memory devices
- * that logically unplugged all memory (so it's no longer in use) and want to
- * offline + remove the memory block.
+ * Try to offline and remove memory. Might take a long time to finish in case
+ * memory is still in use. Primarily useful for memory devices that logically
+ * unplugged all memory (so it's no longer in use) and want to offline + remove
+ * that memory.
  */
 int offline_and_remove_memory(int nid, u64 start, u64 size)
 {
-        struct memory_block *mem;
-        int rc = -EINVAL;
+        const unsigned long mb_count = size / memory_block_size_bytes();
+        uint8_t *online_types, *tmp;
+        int rc;

         if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
-            size != memory_block_size_bytes())
-                return rc;
+            !IS_ALIGNED(size, memory_block_size_bytes()) || !size)
+                return -EINVAL;
+
+        /*
+         * We'll remember the old online type of each memory block, so we can
+         * try to revert whatever we did when offlining one memory block fails
+         * after offlining some others succeeded.
+         */
+        online_types = kmalloc_array(mb_count, sizeof(*online_types),
+                                     GFP_KERNEL);
+        if (!online_types)
+                return -ENOMEM;
+        /*
+         * Initialize all states to MMOP_OFFLINE, so when we abort processing in
+         * try_offline_memory_block(), we'll skip all unprocessed blocks in
+         * try_reonline_memory_block().
+         */
+        memset(online_types, MMOP_OFFLINE, mb_count);

         lock_device_hotplug();
-        mem = find_memory_block(__pfn_to_section(PFN_DOWN(start)));
-        if (mem)
-                rc = device_offline(&mem->dev);
-        /* Ignore if the device is already offline. */
-        if (rc > 0)
-                rc = 0;
+
+        tmp = online_types;
+        rc = walk_memory_blocks(start, size, &tmp, try_offline_memory_block);

         /*
-         * In case we succeeded to offline the memory block, remove it.
+         * In case we succeeded to offline all memory, remove it.
          * This cannot fail as it cannot get onlined in the meantime.
          */
         if (!rc) {
                 rc = try_remove_memory(nid, start, size);
-                WARN_ON_ONCE(rc);
+                if (rc)
+                        pr_err("%s: Failed to remove memory: %d", __func__, rc);
+        }
+
+        /*
+         * Rollback what we did. While memory onlining might theoretically fail
+         * (nacked by a notifier), it barely ever happens.
+         */
+        if (rc) {
+                tmp = online_types;
+                walk_memory_blocks(start, size, &tmp,
+                                   try_reonline_memory_block);
         }
         unlock_device_hotplug();

+        kfree(online_types);
         return rc;
 }
 EXPORT_SYMBOL_GPL(offline_and_remove_memory);
--
2.25.1