mailweb.openeuler.org
Manage this list

Keyboard Shortcuts

Thread View

  • j: Next unread message
  • k: Previous unread message
  • j a: Jump to all threads
  • j l: Jump to MailingList overview

Kernel

Threads by month
  • ----- 2025 -----
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2024 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2023 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2022 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2021 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2020 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2019 -----
  • December
kernel@openeuler.org

  • 28 participants
  • 18558 discussions
[PATCH kernel-4.19 38/57] reiserfs: update reiserfs_xattrs_initialized() condition
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp> commit 5e46d1b78a03d52306f21f77a4e4a144b6d31486 upstream. syzbot is reporting NULL pointer dereference at reiserfs_security_init() [1], for commit ab17c4f02156c4f7 ("reiserfs: fixup xattr_root caching") is assuming that REISERFS_SB(s)->xattr_root != NULL in reiserfs_xattr_jcreate_nblocks() despite that commit made REISERFS_SB(sb)->priv_root != NULL && REISERFS_SB(s)->xattr_root == NULL case possible. I guess that commit 6cb4aff0a77cc0e6 ("reiserfs: fix oops while creating privroot with selinux enabled") wanted to check xattr_root != NULL before reiserfs_xattr_jcreate_nblocks(), for the changelog is talking about the xattr root. The issue is that while creating the privroot during mount reiserfs_security_init calls reiserfs_xattr_jcreate_nblocks which dereferences the xattr root. The xattr root doesn't exist, so we get an oops. Therefore, update reiserfs_xattrs_initialized() to check both the privroot and the xattr root. Link: https://syzkaller.appspot.com/bug?id=8abaedbdeb32c861dc5340544284167dd0e46c… # [1] Reported-and-tested-by: syzbot <syzbot+690cb1e51970435f9775(a)syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp> Fixes: 6cb4aff0a77c ("reiserfs: fix oops while creating privroot with selinux enabled") Acked-by: Jeff Mahoney <jeffm(a)suse.com> Acked-by: Jan Kara <jack(a)suse.com> Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- fs/reiserfs/xattr.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/reiserfs/xattr.h b/fs/reiserfs/xattr.h index c764352447ba1..81bec2c80b25c 100644 --- a/fs/reiserfs/xattr.h +++ b/fs/reiserfs/xattr.h @@ -43,7 +43,7 @@ void reiserfs_security_free(struct reiserfs_security_handle *sec); static inline int reiserfs_xattrs_initialized(struct super_block *sb) { - return REISERFS_SB(sb)->priv_root != NULL; + return REISERFS_SB(sb)->priv_root && REISERFS_SB(sb)->xattr_root; } #define xattr_size(size) ((size) + sizeof(struct reiserfs_xattr_header)) -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 37/57] drm/amdgpu: check alignment on CPU page for bo map
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Xℹ Ruoyao <xry111(a)mengyan1223.wang> commit e3512fb67093fabdf27af303066627b921ee9bd8 upstream. The page table of AMDGPU requires an alignment to CPU page so we should check ioctl parameters for it. Return -EINVAL if some parameter is unaligned to CPU page, instead of corrupt the page table sliently. Reviewed-by: Christian König <christian.koenig(a)amd.com> Signed-off-by: Xi Ruoyao <xry111(a)mengyan1223.wang> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 50cb4c0a44709..6a1f5df4bc07e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2076,8 +2076,8 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev, uint64_t eaddr; /* validate the parameters */ - if (saddr & AMDGPU_GPU_PAGE_MASK || offset & AMDGPU_GPU_PAGE_MASK || - size == 0 || size & AMDGPU_GPU_PAGE_MASK) + if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK || + size == 0 || size & ~PAGE_MASK) return -EINVAL; /* make sure object fit at this offset */ @@ -2141,8 +2141,8 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev, int r; /* validate the parameters */ - if (saddr & AMDGPU_GPU_PAGE_MASK || offset & AMDGPU_GPU_PAGE_MASK || - size == 0 || size & AMDGPU_GPU_PAGE_MASK) + if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK || + size == 0 || size & ~PAGE_MASK) return -EINVAL; /* make sure object fit at this offset */ -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 36/57] drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Nirmoy Das <nirmoy.das(a)amd.com> commit 5e61b84f9d3ddfba73091f9fbc940caae1c9eb22 upstream. Offset calculation wasn't correct as start addresses are in pfn not in bytes. CC: stable(a)vger.kernel.org Signed-off-by: Nirmoy Das <nirmoy.das(a)amd.com> Reviewed-by: Christian König <christian.koenig(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index f67c332b16a42..50cb4c0a44709 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2286,7 +2286,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev, after->start = eaddr + 1; after->last = tmp->last; after->offset = tmp->offset; - after->offset += after->start - tmp->start; + after->offset += (after->start - tmp->start) << PAGE_SHIFT; after->flags = tmp->flags; after->bo_va = tmp->bo_va; list_add(&after->list, &tmp->bo_va->invalids); -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 35/57] mm: fix race by making init_zero_pfn() early_initcall
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Ilya Lipnitskiy <ilya.lipnitskiy(a)gmail.com> commit e720e7d0e983bf05de80b231bccc39f1487f0f16 upstream. There are code paths that rely on zero_pfn to be fully initialized before core_initcall. For example, wq_sysfs_init() is a core_initcall function that eventually results in a call to kernel_execve, which causes a page fault with a subsequent mmput. If zero_pfn is not initialized by then it may not get cleaned up properly and result in an error: BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1 Here is an analysis of the race as seen on a MIPS device. On this particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until initialized, at which point it becomes PFN 5120: 1. wq_sysfs_init calls into kobject_uevent_env at core_initcall: kobject_uevent_env+0x7e4/0x7ec kset_register+0x68/0x88 bus_register+0xdc/0x34c subsys_virtual_register+0x34/0x78 wq_sysfs_init+0x1c/0x4c do_one_initcall+0x50/0x1a8 kernel_init_freeable+0x230/0x2c8 kernel_init+0x10/0x100 ret_from_kernel_thread+0x14/0x1c 2. kobject_uevent_env() calls call_usermodehelper_exec() which executes kernel_execve asynchronously. 3. Memory allocations in kernel_execve cause a page fault, bumping the MM reference counter: add_mm_counter_fast+0xb4/0xc0 handle_mm_fault+0x6e4/0xea0 __get_user_pages.part.78+0x190/0x37c __get_user_pages_remote+0x128/0x360 get_arg_page+0x34/0xa0 copy_string_kernel+0x194/0x2a4 kernel_execve+0x11c/0x298 call_usermodehelper_exec_async+0x114/0x194 4. In case zero_pfn has not been initialized yet, zap_pte_range does not decrement the MM_ANONPAGES RSS counter and the BUG message is triggered shortly afterwards when __mmdrop checks the ref counters: __mmdrop+0x98/0x1d0 free_bprm+0x44/0x118 kernel_execve+0x160/0x1d8 call_usermodehelper_exec_async+0x114/0x194 ret_from_kernel_thread+0x14/0x1c To avoid races such as described above, initialize init_zero_pfn at early_initcall level. Depending on the architecture, ZERO_PAGE is either constant or gets initialized even earlier, at paging_init, so there is no issue with initializing zero_pfn earlier. Link: https://lkml.kernel.org/r/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niAXOpqEbt… Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy(a)gmail.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: "Eric W. Biederman" <ebiederm(a)xmission.com> Cc: stable(a)vger.kernel.org Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie(a)wanyeetech.com> Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- mm/memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memory.c b/mm/memory.c index bbfe9b500d17c..f19c2aa517b17 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -150,7 +150,7 @@ static int __init init_zero_pfn(void) zero_pfn = page_to_pfn(ZERO_PAGE(0)); return 0; } -core_initcall(init_zero_pfn); +early_initcall(init_zero_pfn); #if defined(SPLIT_RSS_COUNTING) -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 34/57] tracing: Fix stack trace event size
by Yang Yingliang 15 Apr '21

15 Apr '21
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> commit 9deb193af69d3fd6dd8e47f292b67c805a787010 upstream. Commit cbc3b92ce037 fixed an issue to modify the macros of the stack trace event so that user space could parse it properly. Originally the stack trace format to user space showed that the called stack was a dynamic array. But it is not actually a dynamic array, in the way that other dynamic event arrays worked, and this broke user space parsing for it. The update was to make the array look to have 8 entries in it. Helper functions were added to make it parse it correctly, as the stack was dynamic, but was determined by the size of the event stored. Although this fixed user space on how it read the event, it changed the internal structure used for the stack trace event. It changed the array size from [0] to [8] (added 8 entries). This increased the size of the stack trace event by 8 words. The size reserved on the ring buffer was the size of the stack trace event plus the number of stack entries found in the stack trace. That commit caused the amount to be 8 more than what was needed because it did not expect the caller field to have any size. This produced 8 entries of garbage (and reading random data) from the stack trace event: <idle>-0 [002] d... 1976396.837549: <stack trace> => trace_event_raw_event_sched_switch => __traceiter_sched_switch => __schedule => schedule_idle => do_idle => cpu_startup_entry => secondary_startup_64_no_verify => 0xc8c5e150ffff93de => 0xffff93de => 0 => 0 => 0xc8c5e17800000000 => 0x1f30affff93de => 0x00000004 => 0x200000000 Instead, subtract the size of the caller field from the size of the event to make sure that only the amount needed to store the stack trace is reserved. Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.ho… Cc: stable(a)vger.kernel.org Fixes: cbc3b92ce037 ("tracing: Set kernel_stack's caller size properly") Reported-by: Vasily Gorbik <gor(a)linux.ibm.com> Tested-by: Vasily Gorbik <gor(a)linux.ibm.com> Acked-by: Vasily Gorbik <gor(a)linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 2780a98682078..032e62c42a5c2 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2645,7 +2645,8 @@ static void __ftrace_trace_stack(struct ring_buffer *buffer, size *= sizeof(unsigned long); event = __trace_buffer_lock_reserve(buffer, TRACE_STACK, - sizeof(*entry) + size, flags, pc); + (sizeof(*entry) - sizeof(entry->caller)) + size, + flags, pc); if (!event) goto out; entry = ring_buffer_event_data(event); -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 33/57] PM: runtime: Fix ordering in pm_runtime_get_suppliers()
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Adrian Hunter <adrian.hunter(a)intel.com> commit c0c33442f7203704aef345647e14c2fb86071001 upstream. rpm_active indicates how many times the supplier usage_count has been incremented. Consequently it must be updated after pm_runtime_get_sync() of the supplier, not before. Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance") Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com> Cc: 5.1+ <stable(a)vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/base/power/runtime.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index f1cac08fa8bb5..eaae4adf9ce4b 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1572,8 +1572,8 @@ void pm_runtime_get_suppliers(struct device *dev) list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) if (link->flags & DL_FLAG_PM_RUNTIME) { link->supplier_preactivated = true; - refcount_inc(&link->rpm_active); pm_runtime_get_sync(link->supplier); + refcount_inc(&link->rpm_active); } device_links_read_unlock(idx); -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 32/57] PM: runtime: Fix race getting/putting suppliers at probe
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Adrian Hunter <adrian.hunter(a)intel.com> commit 9dfacc54a8661bc8be6e08cffee59596ec59f263 upstream. pm_runtime_put_suppliers() must not decrement rpm_active unless the consumer is suspended. That is because, otherwise, it could suspend suppliers for an active consumer. That can happen as follows: static int driver_probe_device(struct device_driver *drv, struct device *dev) { int ret = 0; if (!device_is_registered(dev)) return -ENODEV; dev->can_match = true; pr_debug("bus: '%s': %s: matched device %s with driver %s\n", drv->bus->name, __func__, dev_name(dev), drv->name); pm_runtime_get_suppliers(dev); if (dev->parent) pm_runtime_get_sync(dev->parent); At this point, dev can runtime suspend so rpm_put_suppliers() can run, rpm_active becomes 1 (the lowest value). pm_runtime_barrier(dev); if (initcall_debug) ret = really_probe_debug(dev, drv); else ret = really_probe(dev, drv); Probe callback can have runtime resumed dev, and then runtime put so dev is awaiting autosuspend, but rpm_active is 2. pm_request_idle(dev); if (dev->parent) pm_runtime_put(dev->parent); pm_runtime_put_suppliers(dev); Now pm_runtime_put_suppliers() will put the supplier i.e. rpm_active 2 -> 1, but consumer can still be active. return ret; } Fix by checking the runtime status. For any status other than RPM_SUSPENDED, rpm_active can be considered to be "owned" by rpm_[get/put]_suppliers() and pm_runtime_put_suppliers() need do nothing. Reported-by: Asutosh Das <asutoshd(a)codeaurora.org> Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance") Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com> Cc: 5.1+ <stable(a)vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/base/power/runtime.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 2c99f93020bc9..f1cac08fa8bb5 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1586,6 +1586,8 @@ void pm_runtime_get_suppliers(struct device *dev) void pm_runtime_put_suppliers(struct device *dev) { struct device_link *link; + unsigned long flags; + bool put; int idx; idx = device_links_read_lock(); @@ -1593,7 +1595,11 @@ void pm_runtime_put_suppliers(struct device *dev) list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) if (link->supplier_preactivated) { link->supplier_preactivated = false; - if (refcount_dec_not_one(&link->rpm_active)) + spin_lock_irqsave(&dev->power.lock, flags); + put = pm_runtime_status_suspended(dev) && + refcount_dec_not_one(&link->rpm_active); + spin_unlock_irqrestore(&dev->power.lock, flags); + if (put) pm_runtime_put(link->supplier); } -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 31/57] ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Hui Wang <hui.wang(a)canonical.com> commit e54f30befa7990b897189b44a56c1138c6bfdbb5 upstream. We found the alc_update_headset_mode() is not called on some machines when unplugging the headset, as a result, the mode of the ALC_HEADSET_MODE_UNPLUGGED can't be set, then the current_headset_type is not cleared, if users plug a differnt type of headset next time, the determine_headset_type() will not be called and the audio jack is set to the headset type of previous time. On the Dell machines which connect the dmic to the PCH, if we open the gnome-sound-setting and unplug the headset, this issue will happen. Those machines disable the auto-mute by ucm and has no internal mic in the input source, so the update_headset_mode() will not be called by cap_sync_hook or automute_hook when unplugging, and because the gnome-sound-setting is opened, the codec will not enter the runtime_suspend state, so the update_headset_mode() will not be called by alc_resume when unplugging. In this case the hp_automute_hook is called when unplugging, so add update_headset_mode() calling to this function. Cc: <stable(a)vger.kernel.org> Signed-off-by: Hui Wang <hui.wang(a)canonical.com> Link: https://lore.kernel.org/r/20210320091542.6748-2-hui.wang@canonical.com Signed-off-by: Takashi Iwai <tiwai(a)suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index d94cfae2920f4..f456e5f67824c 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5054,6 +5054,7 @@ static void alc_update_headset_jack_cb(struct hda_codec *codec, struct alc_spec *spec = codec->spec; spec->current_headset_type = ALC_HEADSET_TYPE_UNKNOWN; snd_hda_gen_hp_automute(codec, jack); + alc_update_headset_mode(codec); } static void alc_probe_headset_mode(struct hda_codec *codec) -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 30/57] ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Hui Wang <hui.wang(a)canonical.com> commit febf22565549ea7111e7d45e8f2d64373cc66b11 upstream. We found a recording issue on a Dell AIO, users plug a headset-mic and select headset-mic from UI, but can't record any sound from headset-mic. The root cause is the determine_headset_type() returns a wrong type, e.g. users plug a ctia type headset, but that function returns omtp type. On this machine, the internal mic is not connected to the codec, the "Input Source" is headset mic by default. And when users plug a headset, the determine_headset_type() will be called immediately, the codec on this AIO is alc274, the delay time for this codec in the determine_headset_type() is only 80ms, the delay is too short to correctly determine the headset type, the fail rate is nearly 99% when users plug the headset with the normal speed. Other codecs set several hundred ms delay time, so here I change the delay time to 850ms for alc2x4 series, after this change, the fail rate is zero unless users plug the headset slowly on purpose. Cc: <stable(a)vger.kernel.org> Signed-off-by: Hui Wang <hui.wang(a)canonical.com> Link: https://lore.kernel.org/r/20210320091542.6748-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai <tiwai(a)suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 6a419b23adbc8..d94cfae2920f4 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4870,7 +4870,7 @@ static void alc_determine_headset_type(struct hda_codec *codec) case 0x10ec0274: case 0x10ec0294: alc_process_coef_fw(codec, coef0274); - msleep(80); + msleep(850); val = alc_read_coef_idx(codec, 0x46); is_ctia = (val & 0x00f0) == 0x00f0; break; -- 2.25.1
1 0
0 0
[PATCH kernel-4.19 28/57] bpf: Remove MTU check in __bpf_skb_max_len
by Yang Yingliang 15 Apr '21

15 Apr '21
From: Jesper Dangaard Brouer <brouer(a)redhat.com> commit 6306c1189e77a513bf02720450bb43bd4ba5d8ae upstream. Multiple BPF-helpers that can manipulate/increase the size of the SKB uses __bpf_skb_max_len() as the max-length. This function limit size against the current net_device MTU (skb->dev->mtu). When a BPF-prog grow the packet size, then it should not be limited to the MTU. The MTU is a transmit limitation, and software receiving this packet should be allowed to increase the size. Further more, current MTU check in __bpf_skb_max_len uses the MTU from ingress/current net_device, which in case of redirects uses the wrong net_device. This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit is elsewhere in the system. Jesper's testing[1] showed it was not possible to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting factor is the define KMALLOC_MAX_CACHE_SIZE which is 8192 for SLUB-allocator (CONFIG_SLUB) in-case PAGE_SIZE is 4096. This define is in-effect due to this being called from softirq context see code __gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed that frames above 16KiB can cause NICs to reset (but not crash). Keep this sanity limit at this level as memory layer can differ based on kernel config. [1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests Signed-off-by: Jesper Dangaard Brouer <brouer(a)redhat.com> Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net> Acked-by: John Fastabend <john.fastabend(a)gmail.com> Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@f… Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- net/core/filter.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index a1077e879aa42..0aeb130aa0708 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2836,18 +2836,14 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 len_diff) return 0; } -static u32 __bpf_skb_max_len(const struct sk_buff *skb) -{ - return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len : - SKB_MAX_ALLOC; -} +#define BPF_SKB_MAX_LEN SKB_MAX_ALLOC static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff) { bool trans_same = skb->transport_header == skb->network_header; u32 len_cur, len_diff_abs = abs(len_diff); u32 len_min = bpf_skb_net_base_len(skb); - u32 len_max = __bpf_skb_max_len(skb); + u32 len_max = BPF_SKB_MAX_LEN; __be16 proto = skb_protocol(skb, true); bool shrink = len_diff < 0; int ret; @@ -2926,7 +2922,7 @@ static int bpf_skb_trim_rcsum(struct sk_buff *skb, unsigned int new_len) static inline int __bpf_skb_change_tail(struct sk_buff *skb, u32 new_len, u64 flags) { - u32 max_len = __bpf_skb_max_len(skb); + u32 max_len = BPF_SKB_MAX_LEN; u32 min_len = __bpf_skb_min_len(skb); int ret; @@ -3002,7 +2998,7 @@ static const struct bpf_func_proto sk_skb_change_tail_proto = { static inline int __bpf_skb_change_head(struct sk_buff *skb, u32 head_room, u64 flags) { - u32 max_len = __bpf_skb_max_len(skb); + u32 max_len = BPF_SKB_MAX_LEN; u32 new_len = skb->len + head_room; int ret; -- 2.25.1
1 0
0 0
  • ← Newer
  • 1
  • ...
  • 1782
  • 1783
  • 1784
  • 1785
  • 1786
  • 1787
  • 1788
  • ...
  • 1856
  • Older →

HyperKitty Powered by HyperKitty