From: Ard Biesheuvel ardb@kernel.org
stable inclusion from stable-4.19.245 commit 0569702c238290924fc1c7c6954258aa4a5fd649 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 0dc14aa94ccd8ba35eb17a0f9b123d1566efd39e ]
The Spectre-BHB mitigations were inadvertently left disabled for Cortex-A15, due to the fact that cpu_v7_bugs_init() is not called in that case. So fix that.
Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- arch/arm/mm/proc-v7-bugs.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm/mm/proc-v7-bugs.c b/arch/arm/mm/proc-v7-bugs.c index 87a93b76e148..8c7d00cb63ef 100644 --- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -299,6 +299,7 @@ void cpu_v7_ca15_ibe(void) { if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0))) cpu_v7_spectre_v2_init(); + cpu_v7_spectre_bhb_init(); }
void cpu_v7_bugs_init(void)
From: Ard Biesheuvel ardb@kernel.org
stable inclusion from stable-4.19.245 commit 047794b3cfaff4313f379d0cb0509f1c0b972e26 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 3cfb3019979666bdf33a1010147363cf05e0f17b ]
In Thumb2, 'b . + 4' produces a branch instruction that uses a narrow encoding, and so it does not jump to the following instruction as expected. So use W(b) instead.
Fixes: 6c7cb60bff7a ("ARM: fix Thumb2 regression with Spectre BHB") Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- arch/arm/kernel/entry-armv.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index c2d36d31e908..8de1e1da9144 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -1076,7 +1076,7 @@ vector_bhb_loop8_\name:
@ bhb workaround mov r0, #8 -3: b . + 4 +3: W(b) . + 4 subs r0, r0, #1 bne 3b dsb
From: Andrew Lunn andrew@lunn.ch
stable inclusion from stable-4.19.245 commit b1f86c34b2720efc2d3899da572309e515db5190 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit fbb3abdf2223cd0dfc07de85fe5a43ba7f435bdf ]
It is possible to stack bridges on top of each other. Consider the following which makes use of an Ethernet switch:
br1 / \ / \ / \ br0.11 wlan0 | br0 / | \ p1 p2 p3
br0 is offloaded to the switch. Above br0 is a vlan interface, for vlan 11. This vlan interface is then a slave of br1. br1 also has a wireless interface as a slave. This setup trunks wireless lan traffic over the copper network inside a VLAN.
A frame received on p1 which is passed up to the bridge has the skb->offload_fwd_mark flag set to true, indicating that the switch has dealt with forwarding the frame out ports p2 and p3 as needed. This flag instructs the software bridge it does not need to pass the frame back down again. However, the flag is not getting reset when the frame is passed upwards. As a result br1 sees the flag, wrongly interprets it, and fails to forward the frame to wlan0.
When passing a frame upwards, clear the flag. This is the Rx equivalent of br_switchdev_frame_unmark() in br_dev_xmit().
Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") Signed-off-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Ido Schimmel idosch@nvidia.com Tested-by: Ido Schimmel idosch@nvidia.com Acked-by: Nikolay Aleksandrov razor@blackwall.org Link: https://lore.kernel.org/r/20220518005840.771575-1-andrew@lunn.ch Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/bridge/br_input.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 2532c1a19645..14c2fdc268ea 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -47,6 +47,13 @@ static int br_pass_frame_up(struct sk_buff *skb) u64_stats_update_end(&brstats->syncp);
vg = br_vlan_group_rcu(br); + + /* Reset the offload_fwd_mark because there could be a stacked + * bridge above, and it should not think this bridge it doing + * that bridge's work forwarding out its ports. + */ + br_switchdev_frame_unmark(skb); + /* Bridge is just like any other port. Make sure the * packet is allowed except in promisc modue when someone * may be running packet capture.
From: Thomas Gleixner tglx@linutronix.de
stable inclusion from stable-4.19.246 commit 06293f7d7b7c1dcb2b744d0ac53571b6ff53010a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 7e0815b3e09986d2fe651199363e135b9358132a upstream.
When a XEN_HVM guest uses the XEN PIRQ/Eventchannel mechanism, then PCI/MSI[-X] masking is solely controlled by the hypervisor, but contrary to XEN_PV guests this does not disable PCI/MSI[-X] masking in the PCI/MSI layer.
This can lead to a situation where the PCI/MSI layer masks an MSI[-X] interrupt and the hypervisor grants the write despite the fact that it already requested the interrupt. As a consequence interrupt delivery on the affected device is not happening ever.
Set pci_msi_ignore_mask to prevent that like it's done for XEN_PV guests already.
Fixes: 809f9267bbab ("xen: map MSIs into pirqs") Reported-by: Jeremi Piotrowski jpiotrowski@linux.microsoft.com Reported-by: Dusty Mabe dustymabe@redhat.com Reported-by: Salvatore Bonaccorso carnil@debian.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Noah Meyerhans noahm@debian.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87tuaduxj5.ffs@tglx [nmeyerha@amazon.com: backported to 4.19] Signed-off-by: Noah Meyerhans nmeyerha@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- arch/x86/pci/xen.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c index 9112d1cb397b..910ffe04fb99 100644 --- a/arch/x86/pci/xen.c +++ b/arch/x86/pci/xen.c @@ -440,6 +440,11 @@ void __init xen_msi_init(void)
x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs; x86_msi.teardown_msi_irq = xen_teardown_msi_irq; + /* + * With XEN PIRQ/Eventchannels in use PCI/MSI[-X] masking is solely + * controlled by the hypervisor. + */ + pci_msi_ignore_mask = 1; } #endif
From: Thomas Bartschies thomas.bartschies@cvk.de
stable inclusion from stable-4.19.246 commit 539d5deba06e0700807840e8d77507fcb6e4be3c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 015c44d7bff3f44d569716117becd570c179ca32 ]
Since the recent introduction supporting the SM3 and SM4 hash algos for IPsec, the kernel produces invalid pfkey acquire messages, when these encryption modules are disabled. This happens because the availability of the algos wasn't checked in all necessary functions. This patch adds these checks.
Signed-off-by: Thomas Bartschies thomas.bartschies@cvk.de Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/key/af_key.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/key/af_key.c b/net/key/af_key.c index 8ce513e08fd7..0107cc10c4a2 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -2908,7 +2908,7 @@ static int count_ah_combs(const struct xfrm_tmpl *t) break; if (!aalg->pfkey_supported) continue; - if (aalg_tmpl_set(t, aalg)) + if (aalg_tmpl_set(t, aalg) && aalg->available) sz += sizeof(struct sadb_comb); } return sz + sizeof(struct sadb_prop); @@ -2926,7 +2926,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t) if (!ealg->pfkey_supported) continue;
- if (!(ealg_tmpl_set(t, ealg))) + if (!(ealg_tmpl_set(t, ealg) && ealg->available)) continue;
for (k = 1; ; k++) { @@ -2937,7 +2937,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t) if (!aalg->pfkey_supported) continue;
- if (aalg_tmpl_set(t, aalg)) + if (aalg_tmpl_set(t, aalg) && aalg->available) sz += sizeof(struct sadb_comb); } }
From: Stephen Brennan stephen.s.brennan@oracle.com
stable inclusion from stable-4.19.246 commit 5b18856296423473cf0d8a6af8aef5df66ae1075 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit d1dc87763f406d4e67caf16dbe438a5647692395 upstream.
A rare BUG_ON triggered in assoc_array_gc:
[3430308.818153] kernel BUG at lib/assoc_array.c:1609!
Which corresponded to the statement currently at line 1593 upstream:
BUG_ON(assoc_array_ptr_is_meta(p));
Using the data from the core dump, I was able to generate a userspace reproducer[1] and determine the cause of the bug.
[1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc
After running the iterator on the entire branch, an internal tree node looked like the following:
NODE (nr_leaves_on_branch: 3) SLOT [0] NODE (2 leaves) SLOT [1] NODE (1 leaf) SLOT [2..f] NODE (empty)
In the userspace reproducer, the pr_devel output when compressing this node was:
-- compress node 0x5607cc089380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] after: 3
At slot 0, an internal node with 2 leaves could not be folded into the node, because there was only one available slot (slot 0). Thus, the internal node was retained. At slot 1, the node had one leaf, and was able to be folded in successfully. The remaining nodes had no leaves, and so were removed. By the end of the compression stage, there were 14 free slots, and only 3 leaf nodes. The tree was ascended and then its parent node was compressed. When this node was seen, it could not be folded, due to the internal node it contained.
The invariant for compression in this function is: whenever nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all leaf nodes. The compression step currently cannot guarantee this, given the corner case shown above.
To fix this issue, retry compression whenever we have retained a node, and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second compression will then allow the node in slot 1 to be folded in, satisfying the invariant. Below is the output of the reproducer once the fix is applied:
-- compress node 0x560e9c562380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] internal nodes remain despite enough space, retrying -- compress node 0x560e9c562380 -- free=14, leaves=1 [0] fold node 2/15 [nx 0] after: 3
Changes
======= DH: - Use false instead of 0. - Reorder the inserted lines in a couple of places to put retained before next_slot.
ver #2) - Fix typo in pr_devel, correct comparison to "<="
Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Cc: stable@vger.kernel.org Signed-off-by: Stephen Brennan stephen.s.brennan@oracle.com Signed-off-by: David Howells dhowells@redhat.com cc: Andrew Morton akpm@linux-foundation.org cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.c... # v1 Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.c... # v2 Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- lib/assoc_array.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/lib/assoc_array.c b/lib/assoc_array.c index 59875eb278ea..3b1ff063ceca 100644 --- a/lib/assoc_array.c +++ b/lib/assoc_array.c @@ -1465,6 +1465,7 @@ int assoc_array_gc(struct assoc_array *array, struct assoc_array_ptr *cursor, *ptr; struct assoc_array_ptr *new_root, *new_parent, **new_ptr_pp; unsigned long nr_leaves_on_tree; + bool retained; int keylen, slot, nr_free, next_slot, i;
pr_devel("-->%s()\n", __func__); @@ -1541,6 +1542,7 @@ int assoc_array_gc(struct assoc_array *array, goto descend; }
+retry_compress: pr_devel("-- compress node %p --\n", new_n);
/* Count up the number of empty slots in this node and work out the @@ -1558,6 +1560,7 @@ int assoc_array_gc(struct assoc_array *array, pr_devel("free=%d, leaves=%lu\n", nr_free, new_n->nr_leaves_on_branch);
/* See what we can fold in */ + retained = false; next_slot = 0; for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) { struct assoc_array_shortcut *s; @@ -1607,9 +1610,14 @@ int assoc_array_gc(struct assoc_array *array, pr_devel("[%d] retain node %lu/%d [nx %d]\n", slot, child->nr_leaves_on_branch, nr_free + 1, next_slot); + retained = true; } }
+ if (retained && new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) { + pr_devel("internal nodes remain despite enough space, retrying\n"); + goto retry_compress; + } pr_devel("after: %lu\n", new_n->nr_leaves_on_branch);
nr_leaves_on_tree = new_n->nr_leaves_on_branch;
From: Florian Westphal fw@strlen.de
stable inclusion from stable-4.19.246 commit 92a999d1963eed0df666284e20055136ceabd12f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 56b14ecec97f39118bf85c9ac2438c5a949509ed upstream.
In case the conntrack is clashing, insertion can free skb->_nfct and set skb->_nfct to the already-confirmed entry.
This wasn't found before because the conntrack entry and the extension space used to free'd after an rcu grace period, plus the race needs events enabled to trigger.
Reported-by: syzbot+793a590957d9c1b96620@syzkaller.appspotmail.com Fixes: 71d8c47fc653 ("netfilter: conntrack: introduce clash resolution on insertion race") Fixes: 2ad9d7747c10 ("netfilter: conntrack: free extension area immediately") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- include/net/netfilter/nf_conntrack_core.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h index 2a3e0974a6af..4e3fff9f929b 100644 --- a/include/net/netfilter/nf_conntrack_core.h +++ b/include/net/netfilter/nf_conntrack_core.h @@ -58,8 +58,13 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb) int ret = NF_ACCEPT;
if (ct) { - if (!nf_ct_is_confirmed(ct)) + if (!nf_ct_is_confirmed(ct)) { ret = __nf_conntrack_confirm(skb); + + if (ret == NF_ACCEPT) + ct = (struct nf_conn *)skb_nfct(skb); + } + if (likely(ret == NF_ACCEPT)) nf_ct_deliver_cached_events(ct); }
From: Sultan Alsawaf sultan@kerneltoast.com
stable inclusion from stable-4.19.246 commit 645996efc2ae391246d595832aaa6f9d3cc338c7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 2505a981114dcb715f8977b8433f7540854851d8 upstream.
The asynchronous zspage free worker tries to lock a zspage's entire page list without defending against page migration. Since pages which haven't yet been locked can concurrently migrate off the zspage page list while lock_zspage() churns away, lock_zspage() can suffer from a few different lethal races.
It can lock a page which no longer belongs to the zspage and unsafely dereference page_private(), it can unsafely dereference a torn pointer to the next page (since there's a data race), and it can observe a spurious NULL pointer to the next page and thus not lock all of the zspage's pages (since a single page migration will reconstruct the entire page list, and create_page_chain() unconditionally zeroes out each list pointer in the process).
Fix the races by using migrate_read_lock() in lock_zspage() to synchronize with page migration.
Link: https://lkml.kernel.org/r/20220509024703.243847-1-sultan@kerneltoast.com Fixes: 77ff465799c602 ("zsmalloc: zs_page_migrate: skip unnecessary loops but not return -EBUSY if zspage is not inuse") Signed-off-by: Sultan Alsawaf sultan@kerneltoast.com Acked-by: Minchan Kim minchan@kernel.org Cc: Nitin Gupta ngupta@vflare.org Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- mm/zsmalloc.c | 37 +++++++++++++++++++++++++++++++++---- 1 file changed, 33 insertions(+), 4 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 05d3d587209b..e0ac8b80a6ac 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -1814,11 +1814,40 @@ static enum fullness_group putback_zspage(struct size_class *class, */ static void lock_zspage(struct zspage *zspage) { - struct page *page = get_first_page(zspage); + struct page *curr_page, *page;
- do { - lock_page(page); - } while ((page = get_next_page(page)) != NULL); + /* + * Pages we haven't locked yet can be migrated off the list while we're + * trying to lock them, so we need to be careful and only attempt to + * lock each page under migrate_read_lock(). Otherwise, the page we lock + * may no longer belong to the zspage. This means that we may wait for + * the wrong page to unlock, so we must take a reference to the page + * prior to waiting for it to unlock outside migrate_read_lock(). + */ + while (1) { + migrate_read_lock(zspage); + page = get_first_page(zspage); + if (trylock_page(page)) + break; + get_page(page); + migrate_read_unlock(zspage); + wait_on_page_locked(page); + put_page(page); + } + + curr_page = page; + while ((page = get_next_page(curr_page))) { + if (trylock_page(page)) { + curr_page = page; + } else { + get_page(page); + migrate_read_unlock(zspage); + wait_on_page_locked(page); + put_page(page); + migrate_read_lock(zspage); + } + } + migrate_read_unlock(zspage); }
static struct dentry *zs_mount(struct file_system_type *fs_type,
From: Mikulas Patocka mpatocka@redhat.com
stable inclusion from stable-4.19.246 commit fdf6803caf6ebd94f9ead27f888073c4def13715 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit bfe2b0146c4d0230b68f5c71a64380ff8d361f8b upstream.
dm-stats can be used with a very large number of entries (it is only limited by 1/4 of total system memory), so add rescheduling points to the loops that iterate over the entries.
Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/md/dm-stats.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c index 21de30b4e2a1..3d59f3e208c5 100644 --- a/drivers/md/dm-stats.c +++ b/drivers/md/dm-stats.c @@ -224,6 +224,7 @@ void dm_stats_cleanup(struct dm_stats *stats) atomic_read(&shared->in_flight[READ]), atomic_read(&shared->in_flight[WRITE])); } + cond_resched(); } dm_stat_free(&s->rcu_head); } @@ -313,6 +314,7 @@ static int dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end, for (ni = 0; ni < n_entries; ni++) { atomic_set(&s->stat_shared[ni].in_flight[READ], 0); atomic_set(&s->stat_shared[ni].in_flight[WRITE], 0); + cond_resched(); }
if (s->n_histogram_entries) { @@ -325,6 +327,7 @@ static int dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end, for (ni = 0; ni < n_entries; ni++) { s->stat_shared[ni].tmp.histogram = hi; hi += s->n_histogram_entries + 1; + cond_resched(); } }
@@ -345,6 +348,7 @@ static int dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end, for (ni = 0; ni < n_entries; ni++) { p[ni].histogram = hi; hi += s->n_histogram_entries + 1; + cond_resched(); } } } @@ -474,6 +478,7 @@ static int dm_stats_list(struct dm_stats *stats, const char *program, } DMEMIT("\n"); } + cond_resched(); } mutex_unlock(&stats->mutex);
@@ -750,6 +755,7 @@ static void __dm_stat_clear(struct dm_stat *s, size_t idx_start, size_t idx_end, local_irq_enable(); } } + cond_resched(); } }
@@ -865,6 +871,8 @@ static int dm_stats_print(struct dm_stats *stats, int id,
if (unlikely(sz + 1 >= maxlen)) goto buffer_overflow; + + cond_resched(); }
if (clear)
From: Liu Jian liujian56@huawei.com
stable inclusion from stable-4.19.246 commit 22771d3095deeca0d2aefd55999fa8a50caea3cd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 45969b4152c1752089351cd6836a42a566d49bcf upstream.
The data length of skb frags + frag_list may be greater than 0xffff, and skb_header_pointer can not handle negative offset. So, here INT_MAX is used to check the validity of offset. Add the same change to the related function skb_store_bytes.
Fixes: 05c74e5e53f6 ("bpf: add bpf_skb_load_bytes helper") Signed-off-by: Liu Jian liujian56@huawei.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20220416105801.88708-2-liujian56@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/core/filter.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c index 0730395918e0..7a3655e14764 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -1666,7 +1666,7 @@ BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
if (unlikely(flags & ~(BPF_F_RECOMPUTE_CSUM | BPF_F_INVALIDATE_HASH))) return -EINVAL; - if (unlikely(offset > 0xffff)) + if (unlikely(offset > INT_MAX)) return -EFAULT; if (unlikely(bpf_try_make_writable(skb, offset + len))) return -EFAULT; @@ -1701,7 +1701,7 @@ BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset, { void *ptr;
- if (unlikely(offset > 0xffff)) + if (unlikely(offset > INT_MAX)) goto err_clear;
ptr = skb_header_pointer(skb, offset, len, to);
From: "Smith, Kyle Miller (Nimble Kernel)" kyles@hpe.com
stable inclusion from stable-4.19.247 commit 8da2b7bdb47e94bbc4062a3978c708926bcb022c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit da42761181627e9bdc37d18368b827948a583929 ]
In nvme_alloc_admin_tags, the admin_q can be set to an error (typically -ENOMEM) if the blk_mq_init_queue call fails to set up the queue, which is checked immediately after the call. However, when we return the error message up the stack, to nvme_reset_work the error takes us to nvme_remove_dead_ctrl() nvme_dev_disable() nvme_suspend_queue(&dev->queues[0]).
Here, we only check that the admin_q is non-NULL, rather than not an error or NULL, and begin quiescing a queue that never existed, leading to bad / NULL pointer dereference.
Signed-off-by: Kyle Smith kyles@hpe.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Reviewed-by: Hannes Reinecke hare@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/nvme/host/pci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 4357b870bd82..f66c0062736e 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1523,6 +1523,7 @@ static int nvme_alloc_admin_tags(struct nvme_dev *dev) dev->ctrl.admin_q = blk_mq_init_queue(&dev->admin_tagset); if (IS_ERR(dev->ctrl.admin_q)) { blk_mq_free_tag_set(&dev->admin_tagset); + dev->ctrl.admin_q = NULL; return -ENOMEM; } if (!blk_get_queue(dev->ctrl.admin_q)) {
From: OGAWA Hirofumi hirofumi@mail.parknet.co.jp
stable inclusion from stable-4.19.247 commit 2d03231e5cc5f1baf97090d82eac81fd53ab1b32 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 183c3237c928109d2008c0456dff508baf692b20 ]
fat*_ent_bread() can be the cause of too many report on I/O error path. So use fat_msg_ratelimit() instead.
Link: https://lkml.kernel.org/r/87bkxogfeq.fsf@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi hirofumi@mail.parknet.co.jp Reported-by: qianfan qianfanguijin@163.com Tested-by: qianfan qianfanguijin@163.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- fs/fat/fatent.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c index 4c6c635bc8aa..5e35307a3d6b 100644 --- a/fs/fat/fatent.c +++ b/fs/fat/fatent.c @@ -93,7 +93,8 @@ static int fat12_ent_bread(struct super_block *sb, struct fat_entry *fatent, err_brelse: brelse(bhs[0]); err: - fat_msg(sb, KERN_ERR, "FAT read failed (blocknr %llu)", (llu)blocknr); + fat_msg_ratelimit(sb, KERN_ERR, "FAT read failed (blocknr %llu)", + (llu)blocknr); return -EIO; }
@@ -106,8 +107,8 @@ static int fat_ent_bread(struct super_block *sb, struct fat_entry *fatent, fatent->fat_inode = MSDOS_SB(sb)->fat_inode; fatent->bhs[0] = sb_bread(sb, blocknr); if (!fatent->bhs[0]) { - fat_msg(sb, KERN_ERR, "FAT read failed (blocknr %llu)", - (llu)blocknr); + fat_msg_ratelimit(sb, KERN_ERR, "FAT read failed (blocknr %llu)", + (llu)blocknr); return -EIO; } fatent->nr_bhs = 1;
From: Yicong Yang yangyicong@hisilicon.com
stable inclusion from stable-4.19.247 commit aed6d4d519210c28817948f34c53b6e058e0456c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit a91ee0e9fca9d7501286cfbced9b30a33e52740a ]
The sysfs sriov_numvfs_store() path acquires the device lock before the config space access lock:
sriov_numvfs_store device_lock # A (1) acquire device lock sriov_configure vfio_pci_sriov_configure # (for example) vfio_pci_core_sriov_configure pci_disable_sriov sriov_disable pci_cfg_access_lock pci_wait_cfg # B (4) wait for dev->block_cfg_access == 0
Previously, pci_dev_lock() acquired the config space access lock before the device lock:
pci_dev_lock pci_cfg_access_lock dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1 device_lock # A (3) wait for device lock
Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may deadlock with sriov_numvfs_store() if the operations occur in the sequence (1) (2) (3) (4).
Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires the device lock before the config space access lock, the same as the sriov_numvfs_store() path.
[bhelgaas: combined and adapted commit log from Jay Zhou's independent subsequent posting: https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com] Link: https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyico... Also-posted-by: Jay Zhou jianjay.zhou@huawei.com Signed-off-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/pci/pci.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index e74ebff1e29f..2020452d1f17 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4759,18 +4759,18 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, int probe)
static void pci_dev_lock(struct pci_dev *dev) { - pci_cfg_access_lock(dev); /* block PM suspend, driver probe, etc. */ device_lock(&dev->dev); + pci_cfg_access_lock(dev); }
/* Return 1 on successful lock, 0 on contention */ static int pci_dev_trylock(struct pci_dev *dev) { - if (pci_cfg_access_trylock(dev)) { - if (device_trylock(&dev->dev)) + if (device_trylock(&dev->dev)) { + if (pci_cfg_access_trylock(dev)) return 1; - pci_cfg_access_unlock(dev); + device_unlock(&dev->dev); }
return 0; @@ -4778,8 +4778,8 @@ static int pci_dev_trylock(struct pci_dev *dev)
static void pci_dev_unlock(struct pci_dev *dev) { - device_unlock(&dev->dev); pci_cfg_access_unlock(dev); + device_unlock(&dev->dev); }
static void pci_dev_save_and_disable(struct pci_dev *dev)
From: Amir Goldstein amir73il@gmail.com
stable inclusion from stable-4.19.247 commit 72632015277b56d5f8fd666ccd24cb0ed7ef1d72 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 623af4f538b5df9b416e1b82f720af7371b4c771 ]
Commit 6960b0d909cd ("fsnotify: change locking order") changed some of the mark_mutex locks in direct reclaim path to use: mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
This change is explained: "...It uses nested locking to avoid deadlock in case we do the final iput() on an inode which still holds marks and thus would take the mutex again when calling fsnotify_inode_delete() in destroy_inode()."
The problem is that the mutex_lock_nested() is not a nested lock at all. In fact, it has the opposite effect of preventing lockdep from warning about a very possible deadlock.
Due to these wrong annotations, a deadlock that was introduced with nfsd filecache in kernel v5.4 went unnoticed in v5.4.y for over two years until it was reported recently by Khazhismel Kumykov, only to find out that the deadlock was already fixed in kernel v5.5.
Fix the wrong lockdep annotations.
Cc: Khazhismel Kumykov khazhy@google.com Fixes: 6960b0d909cd ("fsnotify: change locking order") Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Link: https://lore.kernel.org/r/20220422120327.3459282-4-amir73il@gmail.com Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- fs/notify/mark.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/notify/mark.c b/fs/notify/mark.c index 09535f6423fc..3afd58170984 100644 --- a/fs/notify/mark.c +++ b/fs/notify/mark.c @@ -434,7 +434,7 @@ void fsnotify_free_mark(struct fsnotify_mark *mark) void fsnotify_destroy_mark(struct fsnotify_mark *mark, struct fsnotify_group *group) { - mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING); + mutex_lock(&group->mark_mutex); fsnotify_detach_mark(mark); mutex_unlock(&group->mark_mutex); fsnotify_free_mark(mark); @@ -703,7 +703,7 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group, * move marks to free to to_free list in one go and then free marks in * to_free list one by one. */ - mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING); + mutex_lock(&group->mark_mutex); list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) { if ((1U << mark->connector->type) & type_mask) list_move(&mark->g_list, &to_free); @@ -712,7 +712,7 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
clear: while (1) { - mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING); + mutex_lock(&group->mark_mutex); if (list_empty(head)) { mutex_unlock(&group->mark_mutex); break;
From: Miaohe Lin linmiaohe@huawei.com
stable inclusion from stable-4.19.247 commit f76ddc8fcf6d81fe89bfa4d3efcbc4fe69a91d48 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit da63dc84befaa9e6079a0bc363ff0eaa975f9073 ]
Compaction sysfs file is created via compaction_register_node in register_node. But we forgot to remove it in unregister_node. Thus compaction sysfs file is leaked. Using compaction_unregister_node to fix this issue.
Link: https://lkml.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com Fixes: ed4a6d7f0676 ("mm: compaction: add /sys trigger for per-node memory compaction") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Rafael J. Wysocki rafael@kernel.org Cc: Mel Gorman mel@csn.ul.ie Cc: Minchan Kim minchan.kim@gmail.com Cc: KAMEZAWA Hiroyuki kamezawa.hiroyu@jp.fujitsu.com Cc: KOSAKI Motohiro kosaki.motohiro@jp.fujitsu.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/base/node.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/base/node.c b/drivers/base/node.c index abc517660430..105a18b9788d 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -643,6 +643,7 @@ static int register_node(struct node *node, int num) */ void unregister_node(struct node *node) { + compaction_unregister_node(node); hugetlb_unregister_node(node); /* no-op, if memoryless node */ node_remove_accesses(node); node_remove_caches(node);
From: Alexey Dobriyan adobriyan@gmail.com
stable inclusion from stable-4.19.247 commit 22b5a48ac899a138552fa05b3fc69a3a0588fdbc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 7055197705709c59b8ab77e6a5c7d46d61edd96e ]
When a process exits, /proc/${pid}, and /proc/${pid}/net dentries are flushed. However some leaf dentries like /proc/${pid}/net/arp_cache aren't. That's because respective PDEs have proc_misc_d_revalidate() hook which returns 1 and leaves dentries/inodes in the LRU.
Force revalidation/lookup on everything under /proc/${pid}/net by inheriting proc_net_dentry_ops.
[akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/YjdVHgildbWO7diJ@localhost.localdomain Fixes: c6c75deda813 ("proc: fix lookup in /proc/net subdirectories after setns(2)") Signed-off-by: Alexey Dobriyan adobriyan@gmail.com Reported-by: hui li juanfengpy@gmail.com Cc: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- fs/proc/generic.c | 3 +++ fs/proc/proc_net.c | 3 +++ 2 files changed, 6 insertions(+)
diff --git a/fs/proc/generic.c b/fs/proc/generic.c index bab10368a04d..d8b3c6a7173f 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -445,6 +445,9 @@ static struct proc_dir_entry *__proc_create(struct proc_dir_entry **parent, proc_set_user(ent, (*parent)->uid, (*parent)->gid);
ent->proc_dops = &proc_misc_dentry_ops; + /* Revalidate everything under /proc/${pid}/net */ + if ((*parent)->proc_dops == &proc_net_dentry_ops) + pde_force_lookup(ent);
out: return ent; diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c index 096bcc1e7a8a..68ce173d5590 100644 --- a/fs/proc/proc_net.c +++ b/fs/proc/proc_net.c @@ -342,6 +342,9 @@ static __net_init int proc_net_ns_init(struct net *net)
proc_set_user(netd, uid, gid);
+ /* Seed dentry revalidation for /proc/${pid}/net */ + pde_force_lookup(netd); + err = -EEXIST; net_statd = proc_net_mkdir(net, "stat", netd); if (!net_statd)
From: Jan Kara jack@suse.cz
stable inclusion from stable-4.19.247 commit 78398c2b2cc14f9a9c8592cf6d334c5a479ed611 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 46c116b920ebec58031f0a78c5ea9599b0d2a371 upstream.
Before splitting a directory block verify its directory entries are sane so that the splitting code does not access memory it should not.
Cc: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20220518093332.13986-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- fs/ext4/namei.c | 32 +++++++++++++++++++++----------- 1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 96f908f41bbb..4bdf87567467 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -274,9 +274,9 @@ static struct dx_frame *dx_probe(struct ext4_filename *fname, struct dx_hash_info *hinfo, struct dx_frame *frame); static void dx_release(struct dx_frame *frames); -static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de, - unsigned blocksize, struct dx_hash_info *hinfo, - struct dx_map_entry map[]); +static int dx_make_map(struct inode *dir, struct buffer_head *bh, + struct dx_hash_info *hinfo, + struct dx_map_entry *map_tail); static void dx_sort_map(struct dx_map_entry *map, unsigned count); static struct ext4_dir_entry_2 *dx_move_dirents(char *from, char *to, struct dx_map_entry *offsets, int count, unsigned blocksize); @@ -1205,15 +1205,23 @@ static inline int search_dirblock(struct buffer_head *bh, * Create map of hash values, offsets, and sizes, stored at end of block. * Returns number of entries mapped. */ -static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de, - unsigned blocksize, struct dx_hash_info *hinfo, +static int dx_make_map(struct inode *dir, struct buffer_head *bh, + struct dx_hash_info *hinfo, struct dx_map_entry *map_tail) { int count = 0; - char *base = (char *) de; + struct ext4_dir_entry_2 *de = (struct ext4_dir_entry_2 *)bh->b_data; + unsigned int buflen = bh->b_size; + char *base = bh->b_data; struct dx_hash_info h = *hinfo;
- while ((char *) de < base + blocksize) { + if (ext4_has_metadata_csum(dir->i_sb)) + buflen -= sizeof(struct ext4_dir_entry_tail); + + while ((char *) de < base + buflen) { + if (ext4_check_dir_entry(dir, NULL, de, bh, base, buflen, + ((char *)de) - base)) + return -EFSCORRUPTED; if (de->name_len && de->inode) { ext4fs_dirhash(de->name, de->name_len, &h); map_tail--; @@ -1223,8 +1231,7 @@ static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de, count++; cond_resched(); } - /* XXX: do we need to check rec_len == 0 case? -Chris */ - de = ext4_next_entry(de, blocksize); + de = ext4_next_entry(de, dir->i_sb->s_blocksize); } return count; } @@ -1732,8 +1739,11 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
/* create map in the end of data2 block */ map = (struct dx_map_entry *) (data2 + blocksize); - count = dx_make_map(dir, (struct ext4_dir_entry_2 *) data1, - blocksize, hinfo, map); + count = dx_make_map(dir, *bh, hinfo, map); + if (count < 0) { + err = count; + goto journal_error; + } map -= count; dx_sort_map(map, count); /* Ensure that neither split block is over half full */
From: Jan Kara jack@suse.cz
stable inclusion from stable-4.19.247 commit b3ad9ff6f06c1dc6abf7437691c88ca3d6da3ac0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 3ba733f879c2a88910744647e41edeefbc0d92b2 upstream.
A maliciously corrupted filesystem can contain cycles in the h-tree stored inside a directory. That can easily lead to the kernel corrupting tree nodes that were already verified under its hands while doing a node split and consequently accessing unallocated memory. Fix the problem by verifying traversed block numbers are unique.
Cc: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20220518093332.13986-2-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- fs/ext4/namei.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 4bdf87567467..6d7a44213464 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -750,12 +750,14 @@ static struct dx_frame * dx_probe(struct ext4_filename *fname, struct inode *dir, struct dx_hash_info *hinfo, struct dx_frame *frame_in) { - unsigned count, indirect; + unsigned count, indirect, level, i; struct dx_entry *at, *entries, *p, *q, *m; struct dx_root *root; struct dx_frame *frame = frame_in; struct dx_frame *ret_err = ERR_PTR(ERR_BAD_DX_DIR); u32 hash; + ext4_lblk_t block; + ext4_lblk_t blocks[EXT4_HTREE_LEVEL];
memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0])); frame->bh = ext4_read_dirblock(dir, 0, INDEX); @@ -811,6 +813,8 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, }
dxtrace(printk("Look up %x", hash)); + level = 0; + blocks[0] = 0; while (1) { count = dx_get_count(entries); if (!count || count > dx_get_limit(entries)) { @@ -852,15 +856,27 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, dx_get_block(at))); frame->entries = entries; frame->at = at; - if (!indirect--) + + block = dx_get_block(at); + for (i = 0; i <= level; i++) { + if (blocks[i] == block) { + ext4_warning_inode(dir, + "dx entry: tree cycle block %u points back to block %u", + blocks[level], block); + goto fail; + } + } + if (++level > indirect) return frame; + blocks[level] = block; frame++; - frame->bh = ext4_read_dirblock(dir, dx_get_block(at), INDEX); + frame->bh = ext4_read_dirblock(dir, block, INDEX); if (IS_ERR(frame->bh)) { ret_err = (struct dx_frame *) frame->bh; frame->bh = NULL; goto fail; } + entries = ((struct dx_node *) frame->bh->b_data)->entries;
if (dx_get_limit(entries) != dx_node_limit(dir)) {
From: Xiaomeng Tong xiam0nd.tong@gmail.com
stable inclusion from stable-4.19.247 commit fcb51189503b2f5a254910b85ad1b03295e09aab category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit fc8738343eefc4ea8afb6122826dea48eacde514 upstream.
The bug is here: if (!rdev)
The list iterator value 'rdev' will *always* be set and non-NULL by rdev_for_each(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element found. Otherwise it will bypass the NULL check and lead to invalid memory access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator, while using the original variable 'rdev' as a dedicated pointer to point to the found element.
Cc: stable@vger.kernel.org Fixes: 2aa82191ac36 ("md-cluster: Perform a lazy update") Acked-by: Guoqing Jiang guoqing.jiang@linux.dev Signed-off-by: Xiaomeng Tong xiam0nd.tong@gmail.com Acked-by: Goldwyn Rodrigues rgoldwyn@suse.com Signed-off-by: Song Liu song@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/md/md.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 8a68ff71fbbf..1aa40509f90e 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2483,14 +2483,16 @@ static void sync_sbs(struct mddev *mddev, int nospares)
static bool does_sb_need_changing(struct mddev *mddev) { - struct md_rdev *rdev; + struct md_rdev *rdev = NULL, *iter; struct mdp_superblock_1 *sb; int role;
/* Find a good rdev */ - rdev_for_each(rdev, mddev) - if ((rdev->raid_disk >= 0) && !test_bit(Faulty, &rdev->flags)) + rdev_for_each(iter, mddev) + if ((iter->raid_disk >= 0) && !test_bit(Faulty, &iter->flags)) { + rdev = iter; break; + }
/* No good device found. */ if (!rdev)
From: Xiaomeng Tong xiam0nd.tong@gmail.com
stable inclusion from stable-4.19.247 commit 3a18aa6faa4c3b9b84d56188ac94099ba22a4f11 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 64c54d9244a4efe9bc6e9c98e13c4bbb8bb39083 upstream.
The bug is here: if (!rdev || rdev->desc_nr != nr) {
The list iterator value 'rdev' will *always* be set and non-NULL by rdev_for_each_rcu(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element found (In fact, it will be a bogus pointer to an invalid struct object containing the HEAD). Otherwise it will bypass the check and lead to invalid memory access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator, while using the original variable 'pdev' as a dedicated pointer to point to the found element.
Cc: stable@vger.kernel.org Fixes: 70bcecdb1534 ("md-cluster: Improve md_reload_sb to be less error prone") Signed-off-by: Xiaomeng Tong xiam0nd.tong@gmail.com Signed-off-by: Song Liu song@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/md/md.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 1aa40509f90e..9c94190769b8 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9402,16 +9402,18 @@ static int read_rdev(struct mddev *mddev, struct md_rdev *rdev)
void md_reload_sb(struct mddev *mddev, int nr) { - struct md_rdev *rdev; + struct md_rdev *rdev = NULL, *iter; int err;
/* Find the rdev */ - rdev_for_each_rcu(rdev, mddev) { - if (rdev->desc_nr == nr) + rdev_for_each_rcu(iter, mddev) { + if (iter->desc_nr == nr) { + rdev = iter; break; + } }
- if (!rdev || rdev->desc_nr != nr) { + if (!rdev) { pr_warn("%s: %d Could not find rdev with nr %d\n", __func__, __LINE__, nr); return; }
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
stable inclusion from stable-4.19.247 commit e78a321c14ad4e96f3d215aad8bcb6cb55ddc442 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit af0179270977508df6986b51242825d7edd59caf ]
SER_RS485_RTS_ON_SEND and SER_RS485_RTS_AFTER_SEND relate to behavior within RS485 operation. The driver checks if they have the same value which is not possible to realize with the hardware. The check is taken regardless of SER_RS485_ENABLED flag and -EINVAL is returned when the check fails, which creates problems.
This check makes it unnecessarily complicated to turn RS485 mode off as simple zeroed serial_rs485 struct will trigger that equal values check. In addition, the driver itself memsets its rs485 structure to zero when RS485 is disabled but if userspace would try to make an TIOCSRS485 ioctl() call with the very same struct, it would end up failing with -EINVAL which doesn't make much sense.
Resolve the problem by moving the check inside SER_RS485_ENABLED block.
Fixes: 7ecc77011c6f ("serial: 8250_fintek: Return -EINVAL on invalid configuration") Cc: Ricardo Ribalda Delgado ricardo.ribalda@gmail.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/035c738-8ea5-8b17-b1d7-84a7b3aeaa51@linux.intel.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/tty/serial/8250/8250_fintek.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_fintek.c b/drivers/tty/serial/8250/8250_fintek.c index 79a4958b3f5c..440023069f4f 100644 --- a/drivers/tty/serial/8250/8250_fintek.c +++ b/drivers/tty/serial/8250/8250_fintek.c @@ -197,12 +197,12 @@ static int fintek_8250_rs485_config(struct uart_port *port, if (!pdata) return -EINVAL;
- /* Hardware do not support same RTS level on send and receive */ - if (!(rs485->flags & SER_RS485_RTS_ON_SEND) == - !(rs485->flags & SER_RS485_RTS_AFTER_SEND)) - return -EINVAL;
if (rs485->flags & SER_RS485_ENABLED) { + /* Hardware do not support same RTS level on send and receive */ + if (!(rs485->flags & SER_RS485_RTS_ON_SEND) == + !(rs485->flags & SER_RS485_RTS_AFTER_SEND)) + return -EINVAL; memset(rs485->padding, 0, sizeof(rs485->padding)); config |= RS485_URA; } else {
From: Eric Dumazet edumazet@google.com
stable inclusion from stable-4.19.247 commit 58bd38cbc961fd799842b7be8c5222310f04b908 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 0a375c822497ed6ad6b5da0792a12a6f1af10c0b ]
Laurent reported the enclosed report [1]
This bug triggers with following coditions:
0) Kernel built with CONFIG_DEBUG_PREEMPT=y
1) A new passive FastOpen TCP socket is created. This FO socket waits for an ACK coming from client to be a complete ESTABLISHED one. 2) A socket operation on this socket goes through lock_sock() release_sock() dance. 3) While the socket is owned by the user in step 2), a retransmit of the SYN is received and stored in socket backlog. 4) At release_sock() time, the socket backlog is processed while in process context. 5) A SYNACK packet is cooked in response of the SYN retransmit. 6) -> tcp_rtx_synack() is called in process context.
Before blamed commit, tcp_rtx_synack() was always called from BH handler, from a timer handler.
Fix this by using TCP_INC_STATS() & NET_INC_STATS() which do not assume caller is in non preemptible context.
[1] BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180 caller is tcp_rtx_synack.part.0+0x36/0xc0 CPU: 10 PID: 2180 Comm: epollpep Tainted: G OE 5.16.0-0.bpo.4-amd64 #1 Debian 5.16.12-1~bpo11+1 Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021 Call Trace: <TASK> dump_stack_lvl+0x48/0x5e check_preemption_disabled+0xde/0xe0 tcp_rtx_synack.part.0+0x36/0xc0 tcp_rtx_synack+0x8d/0xa0 ? kmem_cache_alloc+0x2e0/0x3e0 ? apparmor_file_alloc_security+0x3b/0x1f0 inet_rtx_syn_ack+0x16/0x30 tcp_check_req+0x367/0x610 tcp_rcv_state_process+0x91/0xf60 ? get_nohz_timer_target+0x18/0x1a0 ? lock_timer_base+0x61/0x80 ? preempt_count_add+0x68/0xa0 tcp_v4_do_rcv+0xbd/0x270 __release_sock+0x6d/0xb0 release_sock+0x2b/0x90 sock_setsockopt+0x138/0x1140 ? __sys_getsockname+0x7e/0xc0 ? aa_sk_perm+0x3e/0x1a0 __sys_setsockopt+0x198/0x1e0 __x64_sys_setsockopt+0x21/0x30 do_syscall_64+0x38/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Laurent Fasnacht laurent.fasnacht@proton.ch Acked-by: Neal Cardwell ncardwell@google.com Link: https://lore.kernel.org/r/20220530213713.601888-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/ipv4/tcp_output.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 0cd1386d45b5..c9d02047852d 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3858,8 +3858,8 @@ int tcp_rtx_synack(const struct sock *sk, struct request_sock *req) tcp_rsk(req)->txhash = net_tx_rndhash(); res = af_ops->send_synack(sk, NULL, &fl, req, NULL, TCP_SYNACK_NORMAL); if (!res) { - __TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS); - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS); + TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS); + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS); if (unlikely(tcp_passive_fastopen(sk))) tcp_sk(sk)->total_retrans++; trace_tcp_retransmit_synack(sk, req);
From: Trond Myklebust trond.myklebust@hammerspace.com
stable inclusion from stable-4.19.247 commit 6b3fc1496e7227cd6a39a80bbfb7588ef7c7a010 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 6949493884fe88500de4af182588e071cf1544ee ]
When doing layoutget as part of the open() compound, we have to be careful to release the layout locks before we can call any further RPC calls, such as setattr(). The reason is that those calls could trigger a recall, which could deadlock.
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- fs/nfs/nfs4proc.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 24df1c068eba..fa90d83197e3 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2964,6 +2964,10 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata, }
out: + if (opendata->lgp) { + nfs4_lgopen_release(opendata->lgp); + opendata->lgp = NULL; + } if (!opendata->cancelled) nfs4_sequence_free_slot(&opendata->o_res.seq_res); return ret;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-4.19.247 commit 95f0ba806277733bf6024e23e27e1be773701cca category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 662a80946ce13633ae90a55379f1346c10f0c432 ]
unix_dgram_poll() calls unix_dgram_peer_wake_me() without `other`'s lock held and check if its receive queue is full. Here we need to use unix_recvq_full_lockless() instead of unix_recvq_full(), otherwise KCSAN will report a data-race.
Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20220605232325.11804-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/unix/af_unix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 82279dbd2f62..e79c32942796 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -445,7 +445,7 @@ static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other) * -ECONNREFUSED. Otherwise, if we haven't queued any skbs * to other and its full, we will hang waiting for POLLOUT. */ - if (unix_recvq_full(other) && !sock_flag(other, SOCK_DEAD)) + if (unix_recvq_full_lockless(other) && !sock_flag(other, SOCK_DEAD)) return 1;
if (connected)
From: Chuck Lever chuck.lever@oracle.com
stable inclusion from stable-4.19.247 commit b42b281ee6faf97eddfc5ba3d3f61b6e274893dd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 6c254bf3b637dd4ef4f78eb78c7447419c0161d7 ]
I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up right at the end of the page array. xdr_get_next_encode_buffer() does not compute the value of xdr->end correctly:
* The check to see if we're on the final available page in xdr->buf needs to account for the space consumed by @nbytes.
* The new xdr->end value needs to account for the portion of @nbytes that is to be encoded into the previous buffer.
Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries") Signed-off-by: Chuck Lever chuck.lever@oracle.com Reviewed-by: NeilBrown neilb@suse.de Reviewed-by: J. Bruce Fields bfields@fieldses.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/sunrpc/xdr.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index c306f242ce34..155fdaf44fcb 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -544,7 +544,11 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, */ xdr->p = (void *)p + frag2bytes; space_left = xdr->buf->buflen - xdr->buf->len; - xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE); + if (space_left - nbytes >= PAGE_SIZE) + xdr->end = (void *)p + PAGE_SIZE; + else + xdr->end = (void *)p + space_left - frag1bytes; + xdr->buf->page_len += frag2bytes; xdr->buf->len += nbytes; return p;
From: Masahiro Yamada masahiroy@kernel.org
stable inclusion from stable-4.19.247 commit 31f3c6a4dcd3260a386e62cef2d5b36e902600a1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 4a388f08d8784af48f352193d2b72aaf167a57a1 ]
EXPORT_SYMBOL and __init is a bad combination because the .init.text section is freed up after the initialization. Hence, modules cannot use symbols annotated __init. The access to a freed symbol may end up with kernel panic.
modpost used to detect it, but it has been broken for a decade.
Recently, I fixed modpost so it started to warn it again, then this showed up in linux-next builds.
There are two ways to fix it:
- Remove __init - Remove EXPORT_SYMBOL
I chose the latter for this case because the only in-tree call-site, net/ipv4/xfrm4_policy.c is never compiled as modular. (CONFIG_XFRM is boolean)
Fixes: 2f32b51b609f ("xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly") Reported-by: Stephen Rothwell sfr@canb.auug.org.au Signed-off-by: Masahiro Yamada masahiroy@kernel.org Acked-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/ipv4/xfrm4_protocol.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/net/ipv4/xfrm4_protocol.c b/net/ipv4/xfrm4_protocol.c index 8dd0e6ab8606..0e1f5dc2766b 100644 --- a/net/ipv4/xfrm4_protocol.c +++ b/net/ipv4/xfrm4_protocol.c @@ -297,4 +297,3 @@ void __init xfrm4_protocol_init(void) { xfrm_input_register_afinfo(&xfrm4_input_afinfo); } -EXPORT_SYMBOL(xfrm4_protocol_init);
From: Willem de Bruijn willemb@google.com
stable inclusion from stable-4.19.247 commit 7596bd7920985f7fc8579a92e48bc53ce4475b21 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 8d21e9963bec1aad2280cdd034c8993033ef2948 ]
GRE with TUNNEL_CSUM will apply local checksum offload on CHECKSUM_PARTIAL packets.
ipgre_xmit must validate csum_start after an optional skb_pull, else lco_csum may trigger an overflow. The original check was
if (csum && skb_checksum_start(skb) < skb->data) return -EINVAL;
This had false positives when skb_checksum_start is undefined: when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement was straightforward
if (csum && skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_start(skb) < skb->data) return -EINVAL;
But was eventually revised more thoroughly: - restrict the check to the only branch where needed, in an uncommon GRE path that uses header_ops and calls skb_pull. - test skb_transport_header, which is set along with csum_start in skb_partial_csum_set in the normal header_ops datapath.
Turns out skbs can arrive in this branch without the transport header set, e.g., through BPF redirection.
Revise the check back to check csum_start directly, and only if CHECKSUM_PARTIAL. Do leave the check in the updated location. Check field regardless of whether TUNNEL_CSUM is configured.
Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/ Link: https://lore.kernel.org/all/20210902193447.94039-2-willemdebruijn.kernel@gma... Fixes: 8a0ed250f911 ("ip_gre: validate csum_start only on pull") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Willem de Bruijn willemb@google.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Link: https://lore.kernel.org/r/20220606132107.3582565-1-willemdebruijn.kernel@gma... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/ipv4/ip_gre.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 41d0f9bb5191..cf60d0e07965 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -678,21 +678,20 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb, }
if (dev->header_ops) { - const int pull_len = tunnel->hlen + sizeof(struct iphdr); - if (skb_cow_head(skb, 0)) goto free_skb;
tnl_params = (const struct iphdr *)skb->data;
- if (pull_len > skb_transport_offset(skb)) - goto free_skb; - /* Pull skb since ip_tunnel_xmit() needs skb->data pointing * to gre header. */ - skb_pull(skb, pull_len); + skb_pull(skb, tunnel->hlen + sizeof(struct iphdr)); skb_reset_mac_header(skb); + + if (skb->ip_summed == CHECKSUM_PARTIAL && + skb_checksum_start(skb) < skb->data) + goto free_skb; } else { if (skb_cow_head(skb, dev->needed_headroom)) goto free_skb;
From: Yu Kuai yukuai3@huawei.com
stable inclusion from stable-4.19.247 commit 013a79f1b5c89290e2e97f1ebf14b14e0cf5fe5c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 06c4da89c24e7023ea448cadf8e9daf06a0aae6e ]
Otherwise there may be race between module removal and the handling of netlink command, which can lead to the oops as shown below:
BUG: kernel NULL pointer dereference, address: 0000000000000098 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 31299 Comm: nbd-client Tainted: G E 5.14.0-rc4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:down_write+0x1a/0x50 Call Trace: start_creating+0x89/0x130 debugfs_create_dir+0x1b/0x130 nbd_start_device+0x13d/0x390 [nbd] nbd_genl_connect+0x42f/0x748 [nbd] genl_family_rcv_msg_doit.isra.0+0xec/0x150 genl_rcv_msg+0xe5/0x1e0 netlink_rcv_skb+0x55/0x100 genl_rcv+0x29/0x40 netlink_unicast+0x1a8/0x250 netlink_sendmsg+0x21b/0x430 ____sys_sendmsg+0x2a4/0x2d0 ___sys_sendmsg+0x81/0xc0 __sys_sendmsg+0x62/0xb0 __x64_sys_sendmsg+0x1f/0x30 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Modules linked in: nbd(E-)
Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Josef Bacik josef@toxicpanda.com Link: https://lore.kernel.org/r/20220521073749.3146892-2-yukuai3@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/block/nbd.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 43a9acaa2e86..09e9379e47c0 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -2461,6 +2461,12 @@ static void __exit nbd_cleanup(void) struct nbd_device *nbd; LIST_HEAD(del_list);
+ /* + * Unregister netlink interface prior to waiting + * for the completion of netlink commands. + */ + genl_unregister_family(&nbd_genl_family); + nbd_dbg_close();
mutex_lock(&nbd_index_mutex); @@ -2476,7 +2482,6 @@ static void __exit nbd_cleanup(void) }
idr_destroy(&nbd_index_idr); - genl_unregister_family(&nbd_genl_family); unregister_blkdev(NBD_MAJOR, "nbd"); }
From: Yu Kuai yukuai3@huawei.com
stable inclusion from stable-4.19.247 commit 2573f2375b64280be977431701ed5d33b75b9ad0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit c55b2b983b0fa012942c3eb16384b2b722caa810 ]
When nbd module is being removing, nbd_alloc_config() may be called concurrently by nbd_genl_connect(), although try_module_get() will return false, but nbd_alloc_config() doesn't handle it.
The race may lead to the leak of nbd_config and its related resources (e.g, recv_workq) and oops in nbd_read_stat() due to the unload of nbd module as shown below:
BUG: kernel NULL pointer dereference, address: 0000000000000040 Oops: 0000 [#1] SMP PTI CPU: 5 PID: 13840 Comm: kworker/u17:33 Not tainted 5.14.0+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: knbd16-recv recv_work [nbd] RIP: 0010:nbd_read_stat.cold+0x130/0x1a4 [nbd] Call Trace: recv_work+0x3b/0xb0 [nbd] process_one_work+0x1ed/0x390 worker_thread+0x4a/0x3d0 kthread+0x12a/0x150 ret_from_fork+0x22/0x30
Fixing it by checking the return value of try_module_get() in nbd_alloc_config(). As nbd_alloc_config() may return ERR_PTR(-ENODEV), assign nbd->config only when nbd_alloc_config() succeeds to ensure the value of nbd->config is binary (valid or NULL).
Also adding a debug message to check the reference counter of nbd_config during module removal.
Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Josef Bacik josef@toxicpanda.com Link: https://lore.kernel.org/r/20220521073749.3146892-3-yukuai3@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/block/nbd.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 09e9379e47c0..568debcfc0bb 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1474,15 +1474,20 @@ static struct nbd_config *nbd_alloc_config(void) { struct nbd_config *config;
+ if (!try_module_get(THIS_MODULE)) + return ERR_PTR(-ENODEV); + config = kzalloc(sizeof(struct nbd_config), GFP_NOFS); - if (!config) - return NULL; + if (!config) { + module_put(THIS_MODULE); + return ERR_PTR(-ENOMEM); + } + atomic_set(&config->recv_threads, 0); init_waitqueue_head(&config->recv_wq); init_waitqueue_head(&config->conn_wait); config->blksize = NBD_DEF_BLKSIZE; atomic_set(&config->live_connections, 0); - try_module_get(THIS_MODULE); return config; }
@@ -1509,12 +1514,13 @@ static int nbd_open(struct block_device *bdev, fmode_t mode) mutex_unlock(&nbd->config_lock); goto out; } - config = nbd->config = nbd_alloc_config(); - if (!config) { - ret = -ENOMEM; + config = nbd_alloc_config(); + if (IS_ERR(config)) { + ret = PTR_ERR(config); mutex_unlock(&nbd->config_lock); goto out; } + nbd->config = config; refcount_set(&nbd->config_refs, 1); refcount_inc(&nbd->refs); mutex_unlock(&nbd->config_lock); @@ -1925,13 +1931,14 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info) nbd_put(nbd); return -EINVAL; } - config = nbd->config = nbd_alloc_config(); - if (!nbd->config) { + config = nbd_alloc_config(); + if (IS_ERR(config)) { mutex_unlock(&nbd->config_lock); nbd_put(nbd); printk(KERN_ERR "nbd: couldn't allocate config\n"); - return -ENOMEM; + return PTR_ERR(config); } + nbd->config = config; refcount_set(&nbd->config_refs, 1); set_bit(NBD_RT_BOUND, &config->runtime_flags);
@@ -2476,6 +2483,9 @@ static void __exit nbd_cleanup(void) while (!list_empty(&del_list)) { nbd = list_first_entry(&del_list, struct nbd_device, list); list_del_init(&nbd->list); + if (refcount_read(&nbd->config_refs)) + printk(KERN_ERR "nbd: possibly leaking nbd_config (ref %d)\n", + refcount_read(&nbd->config_refs)); if (refcount_read(&nbd->refs) != 1) printk(KERN_ERR "nbd: possibly leaking a device\n"); nbd_put(nbd);
From: Yu Kuai yukuai3@huawei.com
stable inclusion from stable-4.19.247 commit 62d227f67a8c25d5e16f40e5290607f9306d2188 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 09dadb5985023e27d4740ebd17e6fea4640110e5 ]
In our tests, "qemu-nbd" triggers a io hung:
INFO: task qemu-nbd:11445 blocked for more than 368 seconds. Not tainted 5.18.0-rc3-next-20220422-00003-g2176915513ca #884 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:qemu-nbd state:D stack: 0 pid:11445 ppid: 1 flags:0x00000000 Call Trace: <TASK> __schedule+0x480/0x1050 ? _raw_spin_lock_irqsave+0x3e/0xb0 schedule+0x9c/0x1b0 blk_mq_freeze_queue_wait+0x9d/0xf0 ? ipi_rseq+0x70/0x70 blk_mq_freeze_queue+0x2b/0x40 nbd_add_socket+0x6b/0x270 [nbd] nbd_ioctl+0x383/0x510 [nbd] blkdev_ioctl+0x18e/0x3e0 __x64_sys_ioctl+0xac/0x120 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fd8ff706577 RSP: 002b:00007fd8fcdfebf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000040000000 RCX: 00007fd8ff706577 RDX: 000000000000000d RSI: 000000000000ab00 RDI: 000000000000000f RBP: 000000000000000f R08: 000000000000fbe8 R09: 000055fe497c62b0 R10: 00000002aff20000 R11: 0000000000000246 R12: 000000000000006d R13: 0000000000000000 R14: 00007ffe82dc5e70 R15: 00007fd8fcdff9c0
"qemu-ndb -d" will call ioctl 'NBD_DISCONNECT' first, however, following message was found:
block nbd0: Send disconnect failed -32
Which indicate that something is wrong with the server. Then, "qemu-nbd -d" will call ioctl 'NBD_CLEAR_SOCK', however ioctl can't clear requests after commit 2516ab1543fd("nbd: only clear the queue on device teardown"). And in the meantime, request can't complete through timeout because nbd_xmit_timeout() will always return 'BLK_EH_RESET_TIMER', which means such request will never be completed in this situation.
Now that the flag 'NBD_CMD_INFLIGHT' can make sure requests won't complete multiple times, switch back to call nbd_clear_sock() in nbd_clear_sock_ioctl(), so that inflight requests can be cleared.
Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Josef Bacik josef@toxicpanda.com Link: https://lore.kernel.org/r/20220521073749.3146892-5-yukuai3@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/block/nbd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 568debcfc0bb..833525f41005 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1364,7 +1364,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b static void nbd_clear_sock_ioctl(struct nbd_device *nbd, struct block_device *bdev) { - sock_shutdown(nbd); + nbd_clear_sock(nbd); __invalidate_device(bdev, true); nbd_bdev_reset(bdev); if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
From: Eric Dumazet edumazet@google.com
stable inclusion from stable-4.19.247 commit f2845e1504a3bc4f3381394f057e8b63cb5f3f7a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 11825765291a93d8e7f44230da67b9f607c777bf upstream.
syzbot got a new report [1] finally pointing to a very old bug, added in initial support for MTU probing.
tcp_mtu_probe() has checks about starting an MTU probe if tcp_snd_cwnd(tp) >= 11.
But nothing prevents tcp_snd_cwnd(tp) to be reduced later and before the MTU probe succeeds.
This bug would lead to potential zero-divides.
Debugging added in commit 40570375356c ("tcp: add accessors to read/set tp->snd_cwnd") has paid off :)
While we are at it, address potential overflows in this code.
[1] WARNING: CPU: 1 PID: 14132 at include/net/tcp.h:1219 tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712 Modules linked in: CPU: 1 PID: 14132 Comm: syz-executor.2 Not tainted 5.18.0-syzkaller-07857-gbabf0bb978e3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_snd_cwnd_set include/net/tcp.h:1219 [inline] RIP: 0010:tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712 Code: 74 08 48 89 ef e8 da 80 17 f9 48 8b 45 00 65 48 ff 80 80 03 00 00 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 aa b0 c5 f8 <0f> 0b e9 16 fe ff ff 48 8b 4c 24 08 80 e1 07 38 c1 0f 8c c7 fc ff RSP: 0018:ffffc900079e70f8 EFLAGS: 00010287 RAX: ffffffff88c0f7f6 RBX: ffff8880756e7a80 RCX: 0000000000040000 RDX: ffffc9000c6c4000 RSI: 0000000000031f9e RDI: 0000000000031f9f RBP: 0000000000000000 R08: ffffffff88c0f606 R09: ffffc900079e7520 R10: ffffed101011226d R11: 1ffff1101011226c R12: 1ffff1100eadcf50 R13: ffff8880756e72c0 R14: 1ffff1100eadcf89 R15: dffffc0000000000 FS: 00007f643236e700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ab3f1e2a0 CR3: 0000000064fe7000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_clean_rtx_queue+0x223a/0x2da0 net/ipv4/tcp_input.c:3356 tcp_ack+0x1962/0x3c90 net/ipv4/tcp_input.c:3861 tcp_rcv_established+0x7c8/0x1ac0 net/ipv4/tcp_input.c:5973 tcp_v6_do_rcv+0x57b/0x1210 net/ipv6/tcp_ipv6.c:1476 sk_backlog_rcv include/net/sock.h:1061 [inline] __release_sock+0x1d8/0x4c0 net/core/sock.c:2849 release_sock+0x5d/0x1c0 net/core/sock.c:3404 sk_stream_wait_memory+0x700/0xdc0 net/core/stream.c:145 tcp_sendmsg_locked+0x111d/0x3fc0 net/ipv4/tcp.c:1410 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1448 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] __sys_sendto+0x439/0x5c0 net/socket.c:2119 __do_sys_sendto net/socket.c:2131 [inline] __se_sys_sendto net/socket.c:2127 [inline] __x64_sys_sendto+0xda/0xf0 net/socket.c:2127 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f6431289109 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f643236e168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f643139c100 RCX: 00007f6431289109 RDX: 00000000d0d0c2ac RSI: 0000000020000080 RDI: 000000000000000a RBP: 00007f64312e308d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff372533af R14: 00007f643236e300 R15: 0000000000022000
Fixes: 5d424d5a674f ("[TCP]: MTU probing") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Acked-by: Yuchung Cheng ycheng@google.com Acked-by: Neal Cardwell ncardwell@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- net/ipv4/tcp_input.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 3216cb28fdef..e30533a84f62 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2570,12 +2570,15 @@ static void tcp_mtup_probe_success(struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk); struct inet_connection_sock *icsk = inet_csk(sk); + u64 val;
- /* FIXME: breaks with very large cwnd */ tp->prior_ssthresh = tcp_current_ssthresh(sk); - tp->snd_cwnd = tp->snd_cwnd * - tcp_mss_to_mtu(sk, tp->mss_cache) / - icsk->icsk_mtup.probe_size; + + val = (u64)tp->snd_cwnd * tcp_mss_to_mtu(sk, tp->mss_cache); + do_div(val, icsk->icsk_mtup.probe_size); + WARN_ON_ONCE((u32)val != val); + tp->snd_cwnd = max_t(u32, 1U, val); + tp->snd_cwnd_cnt = 0; tp->snd_cwnd_stamp = tcp_jiffies32; tp->snd_ssthresh = tcp_current_ssthresh(sk);
From: Lorenzo Pieralisi lorenzo.pieralisi@arm.com
stable inclusion from stable-4.19.247 commit b322e833697c61a1ee1c239dd4a42b38f6ad531 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
commit 1bbc21785b7336619fb6a67f1fff5afdaf229acc upstream.
Currently the sysfs interface maps the BERT error region as "memory" (through acpi_os_map_memory()) in order to copy the error records into memory buffers through memory operations (eg memory_read_from_buffer()).
The OS system cannot detect whether the BERT error region is part of system RAM or it is "device memory" (eg BMC memory) and therefore it cannot detect which memory attributes the bus to memory support (and corresponding kernel mapping, unless firmware provides the required information).
The acpi_os_map_memory() arch backend implementation determines the mapping attributes. On arm64, if the BERT error region is not present in the EFI memory map, the error region is mapped as device-nGnRnE; this triggers alignment faults since memcpy unaligned accesses are not allowed in device-nGnRnE regions.
The ACPI sysfs code cannot therefore map by default the BERT error region with memory semantics but should use a safer default.
Change the sysfs code to map the BERT error region as MMIO (through acpi_os_map_iomem()) and use the memcpy_fromio() interface to read the error region into the kernel buffer.
Link: https://lore.kernel.org/linux-arm-kernel/31ffe8fc-f5ee-2858-26c5-0fd8bdd6870... Link: https://lore.kernel.org/linux-acpi/CAJZ5v0g+OVbhuUUDrLUCfX_mVqY_e8ubgLTU98=j... Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Tested-by: Veronika Kabatova vkabatov@redhat.com Tested-by: Aristeu Rozanski aris@redhat.com Acked-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Conflicts: drivers/acpi/sysfs.c [wangxiongfeng: fix conflicts caused by commit bdd56d7d89 ACPI: sysfs: Make sparse happy about address space in use] Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- drivers/acpi/sysfs.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c index 0a8eb8961770..f0c64b6837d3 100644 --- a/drivers/acpi/sysfs.c +++ b/drivers/acpi/sysfs.c @@ -439,18 +439,29 @@ static ssize_t acpi_data_show(struct file *filp, struct kobject *kobj, { struct acpi_data_attr *data_attr; void __iomem *base; - ssize_t rc; + ssize_t size;
data_attr = container_of(bin_attr, struct acpi_data_attr, attr); + size = data_attr->attr.size; + + if (offset < 0) + return -EINVAL; + + if (offset >= size) + return 0;
- base = acpi_os_map_memory(data_attr->addr, data_attr->attr.size); + if (count > size - offset) + count = size - offset; + + base = acpi_os_map_iomem(data_attr->addr, size); if (!base) return -ENOMEM; - rc = memory_read_from_buffer(buf, count, &offset, base, - data_attr->attr.size); - acpi_os_unmap_memory(base, data_attr->attr.size);
- return rc; + memcpy_fromio(buf, base + offset, count); + + acpi_os_unmap_iomem(base, size); + + return count; }
static int acpi_bert_data_init(void *th, struct acpi_data_attr *data_attr)
From: Ammar Faizi ammarfaizi2@gnuweeb.org
stable inclusion from stable-4.19.247 commit 12ffed97ae3303c3c4bc772c9329a9977a9941d6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit b86eb74098a92afd789da02699b4b0dd3f73b889 ]
The asm constraint does not reflect the fact that the asm statement can modify the value of the local variable loops. Which it does.
Specifying the wrong constraint may lead to undefined behavior, it may clobber random stuff (e.g. local variable, important temporary value in regs, etc.). This is especially dangerous when the compiler decides to inline the function and since it doesn't know that the value gets modified, it might decide to use it from a register directly without reloading it.
Change the constraint to "+a" to denote that the first argument is an input and an output argument.
[ bp: Fix typo, massage commit message. ]
Fixes: e01b70ef3eb3 ("x86: fix bug in arch/i386/lib/delay.c file, delay_loop function") Signed-off-by: Ammar Faizi ammarfaizi2@gnuweeb.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lore.kernel.org/r/20220329104705.65256-2-ammarfaizi2@gnuweeb.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- arch/x86/lib/delay.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c index 614c2c6b1959..68ca883abfdb 100644 --- a/arch/x86/lib/delay.c +++ b/arch/x86/lib/delay.c @@ -43,8 +43,8 @@ static void delay_loop(unsigned long loops) " jnz 2b \n" "3: dec %0 \n"
- : /* we don't need output */ - :"a" (loops) + : "+a" (loops) + : ); }
From: Randy Dunlap rdunlap@infradead.org
stable inclusion from stable-4.19.247 commit 61967ac7ba2814fefd033eb3979058057a18edc1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5FNPY CVE: NA
--------------------------------
[ Upstream commit 12441ccdf5e2f5a01a46e344976cbbd3d46845c9 ]
__setup() handlers should return 1 to obsolete_checksetup() in init/main.c to indicate that the boot option has been handled. A return of 0 causes the boot option/value to be listed as an Unknown kernel parameter and added to init's (limited) argument (no '=') or environment (with '=') strings. So return 1 from these x86 __setup handlers.
Examples:
Unknown kernel command line parameters "apicpmtimer BOOT_IMAGE=/boot/bzImage-517rc8 vdso=1 ring3mwait=disable", will be passed to user space.
Run /sbin/init as init process with arguments: /sbin/init apicpmtimer with environment: HOME=/ TERM=linux BOOT_IMAGE=/boot/bzImage-517rc8 vdso=1 ring3mwait=disable
Fixes: 2aae950b21e4 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu") Fixes: 77b52b4c5c66 ("x86: add "debugpat" boot option") Fixes: e16fd002afe2 ("x86/cpufeature: Enable RING3MWAIT for Knights Landing") Fixes: b8ce33590687 ("x86_64: convert to clock events") Reported-by: Igor Zhbanov i.zhbanov@omprussia.ru Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Link: https://lore.kernel.org/r/20220314012725.26661-1-rdunlap@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com Signed-off-by: Laibin Qiu qiulaibin@huawei.com --- arch/x86/entry/vdso/vma.c | 2 +- arch/x86/kernel/apic/apic.c | 2 +- arch/x86/kernel/cpu/intel.c | 2 +- arch/x86/mm/pat.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c index 46caca4d9141..a1c31bb23170 100644 --- a/arch/x86/entry/vdso/vma.c +++ b/arch/x86/entry/vdso/vma.c @@ -329,7 +329,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) static __init int vdso_setup(char *s) { vdso64_enabled = simple_strtoul(s, NULL, 0); - return 0; + return 1; } __setup("vdso=", vdso_setup); #endif diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index cfe893ccd6ab..4bb3c2abab92 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -165,7 +165,7 @@ static __init int setup_apicpmtimer(char *s) { apic_calibrate_pmtmr = 1; notsc_setup(NULL); - return 0; + return 1; } __setup("apicpmtimer", setup_apicpmtimer); #endif diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index a5287b18a63f..76692590eff8 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -71,7 +71,7 @@ static bool ring3mwait_disabled __read_mostly; static int __init ring3mwait_disable(char *__unused) { ring3mwait_disabled = true; - return 0; + return 1; } __setup("ring3mwait=disable", ring3mwait_disable);
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 324e26d0607b..b33304c0042a 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -74,7 +74,7 @@ int pat_debug_enable; static int __init pat_debug_setup(char *str) { pat_debug_enable = 1; - return 0; + return 1; } __setup("debugpat", pat_debug_setup);