Anant Thazhemadam (2): Bluetooth: hci_h5: close serdev device and free hu in h5_close misc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doorbells()
Boqun Feng (1): fcntl: Fix potential deadlock in send_sig{io, urg}()
Christophe Leroy (1): powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
Damien Le Moal (1): null_blk: Fix zone size initialization
Dinghao Liu (1): rtc: sun6i: Fix memleak in sun6i_rtc_clk_init
Eric Auger (1): vfio/pci: Move dummy_resources_list init in vfio_pci_probe()
Eric Biggers (4): fscrypt: add fscrypt_is_nokey_name() ext4: prevent creating duplicate encrypted filenames f2fs: prevent creating duplicate encrypted filenames ubifs: prevent creating duplicate encrypted filenames
Greg Kroah-Hartman (1): Linux 4.19.165
Hyeongseok Kim (1): dm verity: skip verity work if I/O error when system is shutting down
Jan Kara (2): ext4: don't remount read-only with errors=continue on reboot quota: Don't overflow quota file offsets
Jessica Yu (1): module: delay kobject uevent until after module init call
Johan Hovold (1): of: fix linker-section match-table corruption
Kevin Vigor (1): md/raid10: initialize r10_bio->read_slot before use.
Mauro Carvalho Chehab (1): media: gp8psk: initialize stats at power control logic
Miroslav Benes (1): module: set MODULE_STATE_GOING state when a module fails to load
Paolo Bonzini (2): KVM: SVM: relax conditions for allowing MSR_IA32_SPEC_CTRL accesses KVM: x86: reinstate vendor-agnostic check on SPEC_CTRL cpuid bits
Petr Vorel (1): uapi: move constants from <linux/kernel.h> to <linux/const.h>
Qinglang Miao (1): powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()
Rustam Kovhaev (1): reiserfs: add check for an invalid ih_entry_count
Souptick Joarder (1): xen/gntdev.c: Mark pages as dirty
Takashi Iwai (3): ALSA: seq: Use bool for snd_seq_queue internal flags ALSA: rawmidi: Access runtime->avail always in spinlock ALSA: pcm: Clear the full allocated memory at hw_params
Trond Myklebust (1): NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode
Makefile | 2 +- arch/powerpc/include/asm/bitops.h | 23 +++++++++++- arch/powerpc/sysdev/mpic_msgr.c | 2 +- arch/x86/kvm/cpuid.h | 14 +++++++ arch/x86/kvm/svm.c | 9 ++--- arch/x86/kvm/vmx.c | 6 +-- drivers/block/null_blk_zoned.c | 20 ++++++---- drivers/bluetooth/hci_h5.c | 8 +++- drivers/md/dm-verity-target.c | 12 +++++- drivers/md/raid10.c | 3 +- drivers/media/usb/dvb-usb/gp8psk.c | 2 +- drivers/misc/vmw_vmci/vmci_context.c | 2 +- drivers/rtc/rtc-sun6i.c | 8 ++-- drivers/vfio/pci/vfio_pci.c | 3 +- drivers/xen/gntdev.c | 17 ++++++--- fs/crypto/hooks.c | 10 ++--- fs/ext4/namei.c | 3 ++ fs/ext4/super.c | 15 +++----- fs/f2fs/f2fs.h | 2 + fs/fcntl.c | 10 +++-- fs/nfs/nfs4super.c | 2 +- fs/nfs/pnfs.c | 33 ++++++++++++++++- fs/nfs/pnfs.h | 5 +++ fs/quota/quota_tree.c | 8 ++-- fs/reiserfs/stree.c | 6 +++ fs/ubifs/dir.c | 17 +++++++-- include/linux/fscrypt_notsupp.h | 5 +++ include/linux/fscrypt_supp.h | 29 +++++++++++++++ include/linux/of.h | 1 + include/uapi/linux/const.h | 5 +++ include/uapi/linux/ethtool.h | 2 +- include/uapi/linux/kernel.h | 9 +---- include/uapi/linux/lightnvm.h | 2 +- include/uapi/linux/mroute6.h | 2 +- include/uapi/linux/netfilter/x_tables.h | 2 +- include/uapi/linux/netlink.h | 2 +- include/uapi/linux/sysctl.h | 2 +- kernel/module.c | 6 ++- sound/core/pcm_native.c | 9 ++++- sound/core/rawmidi.c | 49 ++++++++++++++++++------- sound/core/seq/seq_queue.h | 8 ++-- 41 files changed, 274 insertions(+), 101 deletions(-)
From: Kevin Vigor kvigor@gmail.com
commit 93decc563637c4288380912eac0eb42fb246cc04 upstream.
In __make_request() a new r10bio is allocated and passed to raid10_read_request(). The read_slot member of the bio is not initialized, and the raid10_read_request() uses it to index an array. This leads to occasional panics.
Fix by initializing the field to invalid value and checking for valid value in raid10_read_request().
Cc: stable@vger.kernel.org Signed-off-by: Kevin Vigor kvigor@gmail.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/raid10.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 7c091b640a8a..db47579f3b0d 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1138,7 +1138,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio, struct md_rdev *err_rdev = NULL; gfp_t gfp = GFP_NOIO;
- if (r10_bio->devs[slot].rdev) { + if (slot >= 0 && r10_bio->devs[slot].rdev) { /* * This is an error retry, but we cannot * safely dereference the rdev in the r10_bio, @@ -1547,6 +1547,7 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors) r10_bio->mddev = mddev; r10_bio->sector = bio->bi_iter.bi_sector; r10_bio->state = 0; + r10_bio->read_slot = -1; memset(r10_bio->devs, 0, sizeof(r10_bio->devs[0]) * conf->copies);
if (bio_data_dir(bio) == READ)
From: Eric Biggers ebiggers@google.com
commit 159e1de201b6fca10bfec50405a3b53a561096a8 upstream.
It's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the encryption key.
Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or sys_symlink()) can lookup the target filename while the directory's encryption key hasn't been added yet, resulting in a negative no-key dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or ->symlink()) because the dentry is negative. Normally, ->create() would return -ENOKEY due to the directory's key being unavailable. However, if the key was added between the dentry lookup and ->create(), then the filesystem will go ahead and try to create the file.
If the target filename happens to already exist as a normal name (not a no-key name), a duplicate filename may be added to the directory.
In order to fix this, we need to fix the filesystems to prevent ->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names. (->rename() and ->link() need it too, but those are already handled correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)
In preparation for this, add a helper function fscrypt_is_nokey_name() that filesystems can use to do this check. Use this helper function for the existing checks that fs/crypto/ does for rename and link.
Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/crypto/hooks.c | 10 +++++----- include/linux/fscrypt_notsupp.h | 5 +++++ include/linux/fscrypt_supp.h | 29 +++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c index 042d5b44f4ed..aa86cb2db823 100644 --- a/fs/crypto/hooks.c +++ b/fs/crypto/hooks.c @@ -58,8 +58,8 @@ int __fscrypt_prepare_link(struct inode *inode, struct inode *dir, if (err) return err;
- /* ... in case we looked up ciphertext name before key was added */ - if (dentry->d_flags & DCACHE_ENCRYPTED_NAME) + /* ... in case we looked up no-key name before key was added */ + if (fscrypt_is_nokey_name(dentry)) return -ENOKEY;
if (!fscrypt_has_permitted_context(dir, inode)) @@ -83,9 +83,9 @@ int __fscrypt_prepare_rename(struct inode *old_dir, struct dentry *old_dentry, if (err) return err;
- /* ... in case we looked up ciphertext name(s) before key was added */ - if ((old_dentry->d_flags | new_dentry->d_flags) & - DCACHE_ENCRYPTED_NAME) + /* ... in case we looked up no-key name(s) before key was added */ + if (fscrypt_is_nokey_name(old_dentry) || + fscrypt_is_nokey_name(new_dentry)) return -ENOKEY;
if (old_dir != new_dir) { diff --git a/include/linux/fscrypt_notsupp.h b/include/linux/fscrypt_notsupp.h index 24b261e49dc1..93304cfeb601 100644 --- a/include/linux/fscrypt_notsupp.h +++ b/include/linux/fscrypt_notsupp.h @@ -24,6 +24,11 @@ static inline bool fscrypt_dummy_context_enabled(struct inode *inode) return false; }
+static inline bool fscrypt_is_nokey_name(const struct dentry *dentry) +{ + return false; +} + /* crypto.c */ static inline void fscrypt_enqueue_decrypt_work(struct work_struct *work) { diff --git a/include/linux/fscrypt_supp.h b/include/linux/fscrypt_supp.h index 8641e20694ce..0409c14ae1de 100644 --- a/include/linux/fscrypt_supp.h +++ b/include/linux/fscrypt_supp.h @@ -58,6 +58,35 @@ static inline bool fscrypt_dummy_context_enabled(struct inode *inode) inode->i_sb->s_cop->dummy_context(inode); }
+/** + * fscrypt_is_nokey_name() - test whether a dentry is a no-key name + * @dentry: the dentry to check + * + * This returns true if the dentry is a no-key dentry. A no-key dentry is a + * dentry that was created in an encrypted directory that hasn't had its + * encryption key added yet. Such dentries may be either positive or negative. + * + * When a filesystem is asked to create a new filename in an encrypted directory + * and the new filename's dentry is a no-key dentry, it must fail the operation + * with ENOKEY. This includes ->create(), ->mkdir(), ->mknod(), ->symlink(), + * ->rename(), and ->link(). (However, ->rename() and ->link() are already + * handled by fscrypt_prepare_rename() and fscrypt_prepare_link().) + * + * This is necessary because creating a filename requires the directory's + * encryption key, but just checking for the key on the directory inode during + * the final filesystem operation doesn't guarantee that the key was available + * during the preceding dentry lookup. And the key must have already been + * available during the dentry lookup in order for it to have been checked + * whether the filename already exists in the directory and for the new file's + * dentry not to be invalidated due to it incorrectly having the no-key flag. + * + * Return: %true if the dentry is a no-key name + */ +static inline bool fscrypt_is_nokey_name(const struct dentry *dentry) +{ + return dentry->d_flags & DCACHE_ENCRYPTED_NAME; +} + /* crypto.c */ extern void fscrypt_enqueue_decrypt_work(struct work_struct *); extern struct fscrypt_ctx *fscrypt_get_ctx(const struct inode *, gfp_t);
From: Eric Biggers ebiggers@google.com
commit 75d18cd1868c2aee43553723872c35d7908f240f upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the directory's encryption key.
Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry().
Note that the duplicate check in ext4_find_dest_de() sometimes prevented this bug. However in many cases it didn't, since ext4_find_dest_de() doesn't examine every dentry.
Fixes: 4461471107b7 ("ext4 crypto: enable filename encryption") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-3-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/namei.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 762eb6913240..c755f47b7174 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -2111,6 +2111,9 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry, if (!dentry->d_name.len) return -EINVAL;
+ if (fscrypt_is_nokey_name(dentry)) + return -ENOKEY; + retval = ext4_fname_setup_filename(dir, &dentry->d_name, 0, &fname); if (retval) return retval;
From: Eric Biggers ebiggers@google.com
commit bfc2b7e8518999003a61f91c1deb5e88ed77b07d upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the directory's encryption key.
Fix this bug on f2fs by rejecting no-key dentries in f2fs_add_link().
Note that the weird check for the current task in f2fs_do_add_link() seems to make this bug difficult to reproduce on f2fs.
Fixes: 9ea97163c6da ("f2fs crypto: add filename encryption for f2fs_add_link") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-4-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/f2fs.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index f353c04e26a4..29a89c6c01e0 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -2857,6 +2857,8 @@ bool f2fs_empty_dir(struct inode *dir);
static inline int f2fs_add_link(struct dentry *dentry, struct inode *inode) { + if (fscrypt_is_nokey_name(dentry)) + return -ENOKEY; return f2fs_do_add_link(d_inode(dentry->d_parent), &dentry->d_name, inode, inode->i_ino, inode->i_mode); }
From: Eric Biggers ebiggers@google.com
commit 76786a0f083473de31678bdb259a3d4167cf756d upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the directory's encryption key.
Fix this bug on ubifs by rejecting no-key dentries in ubifs_create(), ubifs_mkdir(), ubifs_mknod(), and ubifs_symlink().
Note that ubifs doesn't actually report the duplicate filenames from readdir, but rather it seems to replace the original dentry with a new one (which is still wrong, just a different effect from ext4).
On ubifs, this fixes xfstest generic/595 as well as the new xfstest I wrote specifically for this bug.
Fixes: f4f61d2cc6d8 ("ubifs: Implement encrypted filenames") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-5-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ubifs/dir.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 10aab5dccaee..8fe2ee5462a0 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -290,6 +290,15 @@ static struct dentry *ubifs_lookup(struct inode *dir, struct dentry *dentry, return d_splice_alias(inode, dentry); }
+static int ubifs_prepare_create(struct inode *dir, struct dentry *dentry, + struct fscrypt_name *nm) +{ + if (fscrypt_is_nokey_name(dentry)) + return -ENOKEY; + + return fscrypt_setup_filename(dir, &dentry->d_name, 0, nm); +} + static int ubifs_create(struct inode *dir, struct dentry *dentry, umode_t mode, bool excl) { @@ -313,7 +322,7 @@ static int ubifs_create(struct inode *dir, struct dentry *dentry, umode_t mode, if (err) return err;
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm); + err = ubifs_prepare_create(dir, dentry, &nm); if (err) goto out_budg;
@@ -977,7 +986,7 @@ static int ubifs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) if (err) return err;
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm); + err = ubifs_prepare_create(dir, dentry, &nm); if (err) goto out_budg;
@@ -1062,7 +1071,7 @@ static int ubifs_mknod(struct inode *dir, struct dentry *dentry, return err; }
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm); + err = ubifs_prepare_create(dir, dentry, &nm); if (err) { kfree(dev); goto out_budg; @@ -1146,7 +1155,7 @@ static int ubifs_symlink(struct inode *dir, struct dentry *dentry, if (err) return err;
- err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm); + err = ubifs_prepare_create(dir, dentry, &nm); if (err) goto out_budg;
From: Eric Auger eric.auger@redhat.com
[ Upstream commit 16b8fe4caf499ae8e12d2ab1b1324497e36a7b83 ]
In case an error occurs in vfio_pci_enable() before the call to vfio_pci_probe_mmaps(), vfio_pci_disable() will try to iterate on an uninitialized list and cause a kernel panic.
Lets move to the initialization to vfio_pci_probe() to fix the issue.
Signed-off-by: Eric Auger eric.auger@redhat.com Fixes: 05f0c03fbac1 ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive") CC: Stable stable@vger.kernel.org # v4.7+ Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/pci/vfio_pci.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 5e23e4aa5b0a..c48e1d84efb6 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -118,8 +118,6 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev) int bar; struct vfio_pci_dummy_resource *dummy_res;
- INIT_LIST_HEAD(&vdev->dummy_resources_list); - for (bar = PCI_STD_RESOURCES; bar <= PCI_STD_RESOURCE_END; bar++) { res = vdev->pdev->resource + bar;
@@ -1522,6 +1520,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) mutex_init(&vdev->igate); spin_lock_init(&vdev->irqlock); mutex_init(&vdev->ioeventfds_lock); + INIT_LIST_HEAD(&vdev->dummy_resources_list); INIT_LIST_HEAD(&vdev->ioeventfds_list); mutex_init(&vdev->vma_lock); INIT_LIST_HEAD(&vdev->vma_list);
From: Jan Kara jack@suse.cz
[ Upstream commit b08070eca9e247f60ab39d79b2c25d274750441f ]
ext4_handle_error() with errors=continue mount option can accidentally remount the filesystem read-only when the system is rebooting. Fix that.
Fixes: 1dc1097ff60e ("ext4: avoid panic during forced reboot") Signed-off-by: Jan Kara jack@suse.cz Reviewed-by: Andreas Dilger adilger@dilger.ca Cc: stable@kernel.org Link: https://lore.kernel.org/r/20201127113405.26867-2-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org Conflicts: fs/ext4/super.c [yyl: adjust context] --- fs/ext4/super.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 1a0d494e57ae..15af76d547e0 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -499,20 +499,17 @@ static void ext4_netlink_send_info(struct super_block *sb, int ext4_errno)
static void ext4_handle_error(struct super_block *sb) { + journal_t *journal = EXT4_SB(sb)->s_journal; + if (test_opt(sb, WARN_ON_ERROR)) WARN_ON_ONCE(1);
- if (sb_rdonly(sb)) + if (sb_rdonly(sb) || test_opt(sb, ERRORS_CONT)) return;
- if (!test_opt(sb, ERRORS_CONT)) { - journal_t *journal = EXT4_SB(sb)->s_journal; - - EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED; - if (journal) - jbd2_journal_abort(journal, -EIO); - } - + EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED; + if (journal) + jbd2_journal_abort(journal, -EIO); /* * We force ERRORS_RO behavior when system is rebooting. Otherwise we * could panic during 'reboot -f' as the underlying device got already
From: Petr Vorel petr.vorel@gmail.com
commit a85cbe6159ffc973e5702f70a3bd5185f8f3c38d upstream.
and include <linux/const.h> in UAPI headers instead of <linux/kernel.h>.
The reason is to avoid indirect <linux/sysinfo.h> include when using some network headers: <linux/netlink.h> or others -> <linux/kernel.h> -> <linux/sysinfo.h>.
This indirect include causes on MUSL redefinition of struct sysinfo when included both <sys/sysinfo.h> and some of UAPI headers:
In file included from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5, from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5, from ../include/tst_netlink.h:14, from tst_crypto.c:13: x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: redefinition of `struct sysinfo' struct sysinfo { ^~~~~~~ In file included from ../include/tst_safe_macros.h:15, from ../include/tst_test.h:93, from tst_crypto.c:11: x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: originally defined here
Link: https://lkml.kernel.org/r/20201015190013.8901-1-petr.vorel@gmail.com Signed-off-by: Petr Vorel petr.vorel@gmail.com Suggested-by: Rich Felker dalias@aerifal.cx Acked-by: Rich Felker dalias@libc.org Cc: Peter Korsgaard peter@korsgaard.com Cc: Baruch Siach baruch@tkos.co.il Cc: Florian Weimer fweimer@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/const.h | 5 +++++ include/uapi/linux/ethtool.h | 2 +- include/uapi/linux/kernel.h | 9 +-------- include/uapi/linux/lightnvm.h | 2 +- include/uapi/linux/mroute6.h | 2 +- include/uapi/linux/netfilter/x_tables.h | 2 +- include/uapi/linux/netlink.h | 2 +- include/uapi/linux/sysctl.h | 2 +- 8 files changed, 12 insertions(+), 14 deletions(-)
diff --git a/include/uapi/linux/const.h b/include/uapi/linux/const.h index 5ed721ad5b19..af2a44c08683 100644 --- a/include/uapi/linux/const.h +++ b/include/uapi/linux/const.h @@ -28,4 +28,9 @@ #define _BITUL(x) (_UL(1) << (x)) #define _BITULL(x) (_ULL(1) << (x))
+#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1) +#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask)) + +#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) + #endif /* _UAPI_LINUX_CONST_H */ diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index dc69391d2bba..fc21d3726b59 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -14,7 +14,7 @@ #ifndef _UAPI_LINUX_ETHTOOL_H #define _UAPI_LINUX_ETHTOOL_H
-#include <linux/kernel.h> +#include <linux/const.h> #include <linux/types.h> #include <linux/if_ether.h>
diff --git a/include/uapi/linux/kernel.h b/include/uapi/linux/kernel.h index 0ff8f7477847..fadf2db71fe8 100644 --- a/include/uapi/linux/kernel.h +++ b/include/uapi/linux/kernel.h @@ -3,13 +3,6 @@ #define _UAPI_LINUX_KERNEL_H
#include <linux/sysinfo.h> - -/* - * 'kernel.h' contains some often-used function prototypes etc - */ -#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1) -#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask)) - -#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) +#include <linux/const.h>
#endif /* _UAPI_LINUX_KERNEL_H */ diff --git a/include/uapi/linux/lightnvm.h b/include/uapi/linux/lightnvm.h index f9a1be7fc696..ead2e72e5c88 100644 --- a/include/uapi/linux/lightnvm.h +++ b/include/uapi/linux/lightnvm.h @@ -21,7 +21,7 @@ #define _UAPI_LINUX_LIGHTNVM_H
#ifdef __KERNEL__ -#include <linux/kernel.h> +#include <linux/const.h> #include <linux/ioctl.h> #else /* __KERNEL__ */ #include <stdio.h> diff --git a/include/uapi/linux/mroute6.h b/include/uapi/linux/mroute6.h index 9999cc006390..1617eb9949a5 100644 --- a/include/uapi/linux/mroute6.h +++ b/include/uapi/linux/mroute6.h @@ -2,7 +2,7 @@ #ifndef _UAPI__LINUX_MROUTE6_H #define _UAPI__LINUX_MROUTE6_H
-#include <linux/kernel.h> +#include <linux/const.h> #include <linux/types.h> #include <linux/sockios.h> #include <linux/in6.h> /* For struct sockaddr_in6. */ diff --git a/include/uapi/linux/netfilter/x_tables.h b/include/uapi/linux/netfilter/x_tables.h index a8283f7dbc51..b8c6bb233ac1 100644 --- a/include/uapi/linux/netfilter/x_tables.h +++ b/include/uapi/linux/netfilter/x_tables.h @@ -1,7 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ #ifndef _UAPI_X_TABLES_H #define _UAPI_X_TABLES_H -#include <linux/kernel.h> +#include <linux/const.h> #include <linux/types.h>
#define XT_FUNCTION_MAXNAMELEN 30 diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h index ecaefeeb35cd..1d675de314d7 100644 --- a/include/uapi/linux/netlink.h +++ b/include/uapi/linux/netlink.h @@ -2,7 +2,7 @@ #ifndef _UAPI__LINUX_NETLINK_H #define _UAPI__LINUX_NETLINK_H
-#include <linux/kernel.h> +#include <linux/const.h> #include <linux/socket.h> /* for __kernel_sa_family_t */ #include <linux/types.h>
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h index 87aa2a6d9125..cc453ed0e65e 100644 --- a/include/uapi/linux/sysctl.h +++ b/include/uapi/linux/sysctl.h @@ -23,7 +23,7 @@ #ifndef _UAPI_LINUX_SYSCTL_H #define _UAPI_LINUX_SYSCTL_H
-#include <linux/kernel.h> +#include <linux/const.h> #include <linux/types.h> #include <linux/compiler.h>
From: Paolo Bonzini pbonzini@redhat.com
[ Upstream commit df7e8818926eb4712b67421442acf7d568fe2645 ]
Userspace that does not know about the AMD_IBRS bit might still allow the guest to protect itself with MSR_IA32_SPEC_CTRL using the Intel SPEC_CTRL bit. However, svm.c disallows this and will cause a #GP in the guest when writing to the MSR. Fix this by loosening the test and allowing the Intel CPUID bit, and in fact allow the AMD_STIBP bit as well since it allows writing to MSR_IA32_SPEC_CTRL too.
Reported-by: Zhiyi Guo zhguo@redhat.com Analyzed-by: Dr. David Alan Gilbert dgilbert@redhat.com Analyzed-by: Laszlo Ersek lersek@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/svm.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a0c3d1b4b295..f513110983d4 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -4209,6 +4209,8 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) break; case MSR_IA32_SPEC_CTRL: if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) && + !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) && !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) && !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD)) return 1; @@ -4312,6 +4314,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) break; case MSR_IA32_SPEC_CTRL: if (!msr->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) && + !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) && !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) && !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD)) return 1;
From: Paolo Bonzini pbonzini@redhat.com
[ Upstream commit 39485ed95d6b83b62fa75c06c2c4d33992e0d971 ]
Until commit e7c587da1252 ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD. Testing only Intel bits on VMX processors, or only AMD bits on SVM processors, fails if the guests are created with the "opposite" vendor as the host.
While at it, also tweak the host CPU check to use the vendor-agnostic feature bit X86_FEATURE_IBPB, since we only care about the availability of the MSR on the host here and not about specific CPUID bits.
Fixes: e7c587da1252 ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP") Cc: stable@vger.kernel.org Reported-by: Denis V. Lunev den@openvz.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/cpuid.h | 14 ++++++++++++++ arch/x86/kvm/svm.c | 13 +++---------- arch/x86/kvm/vmx.c | 6 +++--- 3 files changed, 20 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index d78a61408243..7dec43b2c420 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -154,6 +154,20 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu) return x86_stepping(best->eax); }
+static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu) +{ + return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) || + guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) || + guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) || + guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD)); +} + +static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu) +{ + return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) || + guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB)); +} + static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu) { return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT; diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index f513110983d4..d2dc734f5bd0 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -4209,10 +4209,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) break; case MSR_IA32_SPEC_CTRL: if (!msr_info->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD)) + !guest_has_spec_ctrl_msr(vcpu)) return 1;
msr_info->data = svm->spec_ctrl; @@ -4314,10 +4311,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) break; case MSR_IA32_SPEC_CTRL: if (!msr->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD)) + !guest_has_spec_ctrl_msr(vcpu)) return 1;
/* The STIBP bit doesn't fault even if it's not advertised */ @@ -4344,12 +4338,11 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) break; case MSR_IA32_PRED_CMD: if (!msr->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB)) + !guest_has_pred_cmd_msr(vcpu)) return 1;
if (data & ~PRED_CMD_IBPB) return 1; - if (!data) break;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 6ca72e3ee372..26ae454af6f4 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -4066,7 +4066,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return kvm_get_msr_common(vcpu, msr_info); case MSR_IA32_SPEC_CTRL: if (!msr_info->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) + !guest_has_spec_ctrl_msr(vcpu)) return 1;
msr_info->data = to_vmx(vcpu)->spec_ctrl; @@ -4180,7 +4180,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) break; case MSR_IA32_SPEC_CTRL: if (!msr_info->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) + !guest_has_spec_ctrl_msr(vcpu)) return 1;
/* The STIBP bit doesn't fault even if it's not advertised */ @@ -4210,7 +4210,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) break; case MSR_IA32_PRED_CMD: if (!msr_info->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) + !guest_has_pred_cmd_msr(vcpu)) return 1;
if (data & ~PRED_CMD_IBPB)
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 1891ef21d92c4801ea082ee8ed478e304ddc6749 ]
fls() and fls64() are using __builtin_ctz() and _builtin_ctzll(). On powerpc, those builtins trivially use ctlzw and ctlzd power instructions.
Allthough those instructions provide the expected result with input argument 0, __builtin_ctz() and __builtin_ctzll() are documented as undefined for value 0.
The easiest fix would be to use fls() and fls64() functions defined in include/asm-generic/bitops/builtin-fls.h and include/asm-generic/bitops/fls64.h, but GCC output is not optimal:
00000388 <testfls>: 388: 2c 03 00 00 cmpwi r3,0 38c: 41 82 00 10 beq 39c <testfls+0x14> 390: 7c 63 00 34 cntlzw r3,r3 394: 20 63 00 20 subfic r3,r3,32 398: 4e 80 00 20 blr 39c: 38 60 00 00 li r3,0 3a0: 4e 80 00 20 blr
000003b0 <testfls64>: 3b0: 2c 03 00 00 cmpwi r3,0 3b4: 40 82 00 1c bne 3d0 <testfls64+0x20> 3b8: 2f 84 00 00 cmpwi cr7,r4,0 3bc: 38 60 00 00 li r3,0 3c0: 4d 9e 00 20 beqlr cr7 3c4: 7c 83 00 34 cntlzw r3,r4 3c8: 20 63 00 20 subfic r3,r3,32 3cc: 4e 80 00 20 blr 3d0: 7c 63 00 34 cntlzw r3,r3 3d4: 20 63 00 40 subfic r3,r3,64 3d8: 4e 80 00 20 blr
When the input of fls(x) is a constant, just check x for nullity and return either 0 or __builtin_clz(x). Otherwise, use cntlzw instruction directly.
For fls64() on PPC64, do the same but with __builtin_clzll() and cntlzd instruction. On PPC32, lets take the generic fls64() which will use our fls(). The result is as expected:
00000388 <testfls>: 388: 7c 63 00 34 cntlzw r3,r3 38c: 20 63 00 20 subfic r3,r3,32 390: 4e 80 00 20 blr
000003a0 <testfls64>: 3a0: 2c 03 00 00 cmpwi r3,0 3a4: 40 82 00 10 bne 3b4 <testfls64+0x14> 3a8: 7c 83 00 34 cntlzw r3,r4 3ac: 20 63 00 20 subfic r3,r3,32 3b0: 4e 80 00 20 blr 3b4: 7c 63 00 34 cntlzw r3,r3 3b8: 20 63 00 40 subfic r3,r3,64 3bc: 4e 80 00 20 blr
Fixes: 2fcff790dcb4 ("powerpc: Use builtin functions for fls()/__fls()/fls64()") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Acked-by: Segher Boessenkool segher@kernel.crashing.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/348c2d3f19ffcff8abe50d52513f989c4581d000.160337552... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/bitops.h | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h index ff71566dadee..76db1c5000bd 100644 --- a/arch/powerpc/include/asm/bitops.h +++ b/arch/powerpc/include/asm/bitops.h @@ -221,15 +221,34 @@ static __inline__ void __clear_bit_unlock(int nr, volatile unsigned long *addr) */ static __inline__ int fls(unsigned int x) { - return 32 - __builtin_clz(x); + int lz; + + if (__builtin_constant_p(x)) + return x ? 32 - __builtin_clz(x) : 0; + asm("cntlzw %0,%1" : "=r" (lz) : "r" (x)); + return 32 - lz; }
#include <asm-generic/bitops/builtin-__fls.h>
+/* + * 64-bit can do this using one cntlzd (count leading zeroes doubleword) + * instruction; for 32-bit we use the generic version, which does two + * 32-bit fls calls. + */ +#ifdef CONFIG_PPC64 static __inline__ int fls64(__u64 x) { - return 64 - __builtin_clzll(x); + int lz; + + if (__builtin_constant_p(x)) + return x ? 64 - __builtin_clzll(x) : 0; + asm("cntlzd %0,%1" : "=r" (lz) : "r" (x)); + return 64 - lz; } +#else +#include <asm-generic/bitops/fls64.h> +#endif
#ifdef CONFIG_PPC64 unsigned int __arch_hweight8(unsigned int w);
From: Souptick Joarder jrdr.linux@gmail.com
commit 779055842da5b2e508f3ccf9a8153cb1f704f566 upstream.
There seems to be a bug in the original code when gntdev_get_page() is called with writeable=true then the page needs to be marked dirty before being put.
To address this, a bool writeable is added in gnt_dev_copy_batch, set it in gntdev_grant_copy_seg() (and drop `writeable` argument to gntdev_get_page()) and then, based on batch->writeable, use set_page_dirty_lock().
Fixes: a4cdb556cae0 (xen/gntdev: add ioctl for grant copy) Suggested-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Souptick Joarder jrdr.linux@gmail.com Cc: John Hubbard jhubbard@nvidia.com Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Juergen Gross jgross@suse.com Cc: David Vrabel david.vrabel@citrix.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1599375114-32360-1-git-send-email-jrdr.linux@gmail... Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com [jinoh: backport accounting for missing commit 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a write 'bool'")] Signed-off-by: Jinoh Kang jinoh.kang.kr@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/gntdev.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index 9d8e02cfd480..3cfbec482efb 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -842,17 +842,18 @@ struct gntdev_copy_batch { s16 __user *status[GNTDEV_COPY_BATCH]; unsigned int nr_ops; unsigned int nr_pages; + bool writeable; };
static int gntdev_get_page(struct gntdev_copy_batch *batch, void __user *virt, - bool writeable, unsigned long *gfn) + unsigned long *gfn) { unsigned long addr = (unsigned long)virt; struct page *page; unsigned long xen_pfn; int ret;
- ret = get_user_pages_fast(addr, 1, writeable, &page); + ret = get_user_pages_fast(addr, 1, batch->writeable, &page); if (ret < 0) return ret;
@@ -868,9 +869,13 @@ static void gntdev_put_pages(struct gntdev_copy_batch *batch) { unsigned int i;
- for (i = 0; i < batch->nr_pages; i++) + for (i = 0; i < batch->nr_pages; i++) { + if (batch->writeable && !PageDirty(batch->pages[i])) + set_page_dirty_lock(batch->pages[i]); put_page(batch->pages[i]); + } batch->nr_pages = 0; + batch->writeable = false; }
static int gntdev_copy(struct gntdev_copy_batch *batch) @@ -959,8 +964,9 @@ static int gntdev_grant_copy_seg(struct gntdev_copy_batch *batch, virt = seg->source.virt + copied; off = (unsigned long)virt & ~XEN_PAGE_MASK; len = min(len, (size_t)XEN_PAGE_SIZE - off); + batch->writeable = false;
- ret = gntdev_get_page(batch, virt, false, &gfn); + ret = gntdev_get_page(batch, virt, &gfn); if (ret < 0) return ret;
@@ -978,8 +984,9 @@ static int gntdev_grant_copy_seg(struct gntdev_copy_batch *batch, virt = seg->dest.virt + copied; off = (unsigned long)virt & ~XEN_PAGE_MASK; len = min(len, (size_t)XEN_PAGE_SIZE - off); + batch->writeable = true;
- ret = gntdev_get_page(batch, virt, true, &gfn); + ret = gntdev_get_page(batch, virt, &gfn); if (ret < 0) return ret;
From: Damien Le Moal damien.lemoal@wdc.com
commit 0ebcdd702f49aeb0ad2e2d894f8c124a0acc6e23 upstream.
For a null_blk device with zoned mode enabled is currently initialized with a number of zones equal to the device capacity divided by the zone size, without considering if the device capacity is a multiple of the zone size. If the zone size is not a divisor of the capacity, the zones end up not covering the entire capacity, potentially resulting is out of bounds accesses to the zone array.
Fix this by adding one last smaller zone with a size equal to the remainder of the disk capacity divided by the zone size if the capacity is not a multiple of the zone size. For such smaller last zone, the zone capacity is also checked so that it does not exceed the smaller zone size.
Reported-by: Naohiro Aota naohiro.aota@wdc.com Fixes: ca4b2a011948 ("null_blk: add zone support") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/null_blk_zoned.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c index d1725ac636c0..079ed33fd806 100644 --- a/drivers/block/null_blk_zoned.c +++ b/drivers/block/null_blk_zoned.c @@ -1,9 +1,9 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/vmalloc.h> +#include <linux/sizes.h> #include "null_blk.h"
-/* zone_size in MBs to sectors. */ -#define ZONE_SIZE_SHIFT 11 +#define MB_TO_SECTS(mb) (((sector_t)mb * SZ_1M) >> SECTOR_SHIFT)
static inline unsigned int null_zone_no(struct nullb_device *dev, sector_t sect) { @@ -12,7 +12,7 @@ static inline unsigned int null_zone_no(struct nullb_device *dev, sector_t sect)
int null_zone_init(struct nullb_device *dev) { - sector_t dev_size = (sector_t)dev->size * 1024 * 1024; + sector_t dev_capacity_sects; sector_t sector = 0; unsigned int i;
@@ -25,9 +25,12 @@ int null_zone_init(struct nullb_device *dev) return -EINVAL; }
- dev->zone_size_sects = dev->zone_size << ZONE_SIZE_SHIFT; - dev->nr_zones = dev_size >> - (SECTOR_SHIFT + ilog2(dev->zone_size_sects)); + dev_capacity_sects = MB_TO_SECTS(dev->size); + dev->zone_size_sects = MB_TO_SECTS(dev->zone_size); + dev->nr_zones = dev_capacity_sects >> ilog2(dev->zone_size_sects); + if (dev_capacity_sects & (dev->zone_size_sects - 1)) + dev->nr_zones++; + dev->zones = kvmalloc_array(dev->nr_zones, sizeof(struct blk_zone), GFP_KERNEL | __GFP_ZERO); if (!dev->zones) @@ -37,7 +40,10 @@ int null_zone_init(struct nullb_device *dev) struct blk_zone *zone = &dev->zones[i];
zone->start = zone->wp = sector; - zone->len = dev->zone_size_sects; + if (zone->start + dev->zone_size_sects > dev_capacity_sects) + zone->len = dev_capacity_sects - zone->start; + else + zone->len = dev->zone_size_sects; zone->type = BLK_ZONE_TYPE_SEQWRITE_REQ; zone->cond = BLK_ZONE_COND_EMPTY;
From: Johan Hovold johan@kernel.org
commit 5812b32e01c6d86ba7a84110702b46d8a8531fe9 upstream.
Specify type alignment when declaring linker-section match-table entries to prevent gcc from increasing alignment and corrupting the various tables with padding (e.g. timers, irqchips, clocks, reserved memory).
This is specifically needed on x86 where gcc (typically) aligns larger objects like struct of_device_id with static extent on 32-byte boundaries which at best prevents matching on anything but the first entry. Specifying alignment when declaring variables suppresses this optimisation.
Here's a 64-bit example where all entries are corrupt as 16 bytes of padding has been inserted before the first entry:
ffffffff8266b4b0 D __clk_of_table ffffffff8266b4c0 d __of_table_fixed_factor_clk ffffffff8266b5a0 d __of_table_fixed_clk ffffffff8266b680 d __clk_of_table_sentinel
And here's a 32-bit example where the 8-byte-aligned table happens to be placed on a 32-byte boundary so that all but the first entry are corrupt due to the 28 bytes of padding inserted between entries:
812b3ec0 D __irqchip_of_table 812b3ec0 d __of_table_irqchip1 812b3fa0 d __of_table_irqchip2 812b4080 d __of_table_irqchip3 812b4160 d irqchip_of_match_end
Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte alignment), and on arm using gcc-7.2.
Note that there are no in-tree users of these tables on x86 currently (even if they are included in the image).
Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations") Fixes: f6e916b82022 ("irqchip: add basic infrastructure") Cc: stable stable@vger.kernel.org # 3.9 Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org [ johan: adjust context to 5.4 ] Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/of.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/linux/of.h b/include/linux/of.h index d4f14b0302b6..6429f00341d1 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -1258,6 +1258,7 @@ static inline int of_get_available_child_count(const struct device_node *np) #define _OF_DECLARE(table, name, compat, fn, fn_type) \ static const struct of_device_id __of_table_##name \ __used __section(__##table##_of_table) \ + __aligned(__alignof__(struct of_device_id)) \ = { .compatible = compat, \ .data = (fn == (fn_type)NULL) ? fn : fn } #else
From: Anant Thazhemadam anant.thazhemadam@gmail.com
commit 70f259a3f4276b71db365b1d6ff1eab805ea6ec3 upstream.
When h5_close() gets called, the memory allocated for the hu gets freed only if hu->serdev doesn't exist. This leads to a memory leak. So when h5_close() is requested, close the serdev device instance and free the memory allocated to the hu entirely instead.
Fixes: https://syzkaller.appspot.com/bug?extid=6ce141c55b2f7aafd1c4 Reported-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com Tested-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com Signed-off-by: Anant Thazhemadam anant.thazhemadam@gmail.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/hci_h5.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c index 7ffeb37e8f20..a017322dba82 100644 --- a/drivers/bluetooth/hci_h5.c +++ b/drivers/bluetooth/hci_h5.c @@ -263,8 +263,12 @@ static int h5_close(struct hci_uart *hu) if (h5->vnd && h5->vnd->close) h5->vnd->close(h5);
- if (!hu->serdev) - kfree(h5); + if (hu->serdev) + serdev_device_close(hu->serdev); + + kfree_skb(h5->rx_skb); + kfree(h5); + h5 = NULL;
return 0; }
From: Rustam Kovhaev rkovhaev@gmail.com
commit d24396c5290ba8ab04ba505176874c4e04a2d53c upstream.
when directory item has an invalid value set for ih_entry_count it might trigger use-after-free or out-of-bounds read in bin_search_in_dir_item()
ih_entry_count * IH_SIZE for directory item should not be larger than ih_item_len
Link: https://lore.kernel.org/r/20201101140958.3650143-1-rkovhaev@gmail.com Reported-and-tested-by: syzbot+83b6f7cf9922cae5c4d7@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=83b6f7cf9922cae5c4d7 Signed-off-by: Rustam Kovhaev rkovhaev@gmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/reiserfs/stree.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/fs/reiserfs/stree.c b/fs/reiserfs/stree.c index 2946713cb00d..5229038852ca 100644 --- a/fs/reiserfs/stree.c +++ b/fs/reiserfs/stree.c @@ -454,6 +454,12 @@ static int is_leaf(char *buf, int blocksize, struct buffer_head *bh) "(second one): %h", ih); return 0; } + if (is_direntry_le_ih(ih) && (ih_item_len(ih) < (ih_entry_count(ih) * IH_SIZE))) { + reiserfs_warning(NULL, "reiserfs-5093", + "item entry count seems wrong %h", + ih); + return 0; + } prev_location = ih_location(ih); }
From: Anant Thazhemadam anant.thazhemadam@gmail.com
commit 31dcb6c30a26d32650ce134820f27de3c675a45a upstream.
A kernel-infoleak was reported by syzbot, which was caused because dbells was left uninitialized. Using kzalloc() instead of kmalloc() fixes this issue.
Reported-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com Tested-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com Signed-off-by: Anant Thazhemadam anant.thazhemadam@gmail.com Link: https://lore.kernel.org/r/20201122224534.333471-1-anant.thazhemadam@gmail.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/vmw_vmci/vmci_context.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/vmw_vmci/vmci_context.c b/drivers/misc/vmw_vmci/vmci_context.c index bc089e634a75..26e20b091160 100644 --- a/drivers/misc/vmw_vmci/vmci_context.c +++ b/drivers/misc/vmw_vmci/vmci_context.c @@ -751,7 +751,7 @@ static int vmci_ctx_get_chkpt_doorbells(struct vmci_ctx *context, return VMCI_ERROR_MORE_DATA; }
- dbells = kmalloc(data_size, GFP_ATOMIC); + dbells = kzalloc(data_size, GFP_ATOMIC); if (!dbells) return VMCI_ERROR_NO_MEM;
From: Mauro Carvalho Chehab mchehab+huawei@kernel.org
commit d0ac1a26ed5943127cb0156148735f5f52a07075 upstream.
As reported on: https://lore.kernel.org/linux-media/20190627222020.45909-1-willemdebruijn.ke...
if gp8psk_usb_in_op() returns an error, the status var is not initialized. Yet, this var is used later on, in order to identify: - if the device was already started; - if firmware has loaded; - if the LNBf was powered on.
Using status = 0 seems to ensure that everything will be properly powered up.
So, instead of the proposed solution, let's just set status = 0.
Reported-by: syzbot syzkaller@googlegroups.com Reported-by: Willem de Bruijn willemb@google.com Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/usb/dvb-usb/gp8psk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/usb/dvb-usb/gp8psk.c b/drivers/media/usb/dvb-usb/gp8psk.c index 13e96b0aeb0f..d97eab01cb8c 100644 --- a/drivers/media/usb/dvb-usb/gp8psk.c +++ b/drivers/media/usb/dvb-usb/gp8psk.c @@ -185,7 +185,7 @@ static int gp8psk_load_bcm4500fw(struct dvb_usb_device *d)
static int gp8psk_power_ctrl(struct dvb_usb_device *d, int onoff) { - u8 status, buf; + u8 status = 0, buf; int gp_product_id = le16_to_cpu(d->udev->descriptor.idProduct);
if (onoff) {
From: Takashi Iwai tiwai@suse.de
commit 4ebd47037027c4beae99680bff3b20fdee5d7c1e upstream.
The snd_seq_queue struct contains various flags in the bit fields. Those are categorized to two different use cases, both of which are protected by different spinlocks. That implies that there are still potential risks of the bad operations for bit fields by concurrent accesses.
For addressing the problem, this patch rearranges those flags to be a standard bool instead of a bit field.
Reported-by: syzbot+63cbe31877bb80ef58f5@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20201206083456.21110-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/seq/seq_queue.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/sound/core/seq/seq_queue.h b/sound/core/seq/seq_queue.h index e006fc8e3a36..6b634cfb42ed 100644 --- a/sound/core/seq/seq_queue.h +++ b/sound/core/seq/seq_queue.h @@ -40,10 +40,10 @@ struct snd_seq_queue { struct snd_seq_timer *timer; /* time keeper for this queue */ int owner; /* client that 'owns' the timer */ - unsigned int locked:1, /* timer is only accesibble by owner if set */ - klocked:1, /* kernel lock (after START) */ - check_again:1, - check_blocked:1; + bool locked; /* timer is only accesibble by owner if set */ + bool klocked; /* kernel lock (after START) */ + bool check_again; /* concurrent access happened during check */ + bool check_blocked; /* queue being checked */
unsigned int flags; /* status flags */ unsigned int info_flags; /* info for sync */
From: Takashi Iwai tiwai@suse.de
commit 88a06d6fd6b369d88cec46c62db3e2604a2f50d5 upstream.
The runtime->avail field may be accessed concurrently while some places refer to it without taking the runtime->lock spinlock, as detected by KCSAN. Usually this isn't a big problem, but for consistency and safety, we should take the spinlock at each place referencing this field.
Reported-by: syzbot+a23a6f1215c84756577c@syzkaller.appspotmail.com Reported-by: syzbot+3d367d1df1d2b67f5c19@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20201206083527.21163-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/rawmidi.c | 49 +++++++++++++++++++++++++++++++------------- 1 file changed, 35 insertions(+), 14 deletions(-)
diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index 9b26973fe697..f4f855d7a791 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -87,11 +87,21 @@ static inline unsigned short snd_rawmidi_file_flags(struct file *file) } }
-static inline int snd_rawmidi_ready(struct snd_rawmidi_substream *substream) +static inline bool __snd_rawmidi_ready(struct snd_rawmidi_runtime *runtime) +{ + return runtime->avail >= runtime->avail_min; +} + +static bool snd_rawmidi_ready(struct snd_rawmidi_substream *substream) { struct snd_rawmidi_runtime *runtime = substream->runtime; + unsigned long flags; + bool ready;
- return runtime->avail >= runtime->avail_min; + spin_lock_irqsave(&runtime->lock, flags); + ready = __snd_rawmidi_ready(runtime); + spin_unlock_irqrestore(&runtime->lock, flags); + return ready; }
static inline int snd_rawmidi_ready_append(struct snd_rawmidi_substream *substream, @@ -960,7 +970,7 @@ int snd_rawmidi_receive(struct snd_rawmidi_substream *substream, if (result > 0) { if (runtime->event) schedule_work(&runtime->event_work); - else if (snd_rawmidi_ready(substream)) + else if (__snd_rawmidi_ready(runtime)) wake_up(&runtime->sleep); } spin_unlock_irqrestore(&runtime->lock, flags); @@ -1039,7 +1049,7 @@ static ssize_t snd_rawmidi_read(struct file *file, char __user *buf, size_t coun result = 0; while (count > 0) { spin_lock_irq(&runtime->lock); - while (!snd_rawmidi_ready(substream)) { + while (!__snd_rawmidi_ready(runtime)) { wait_queue_entry_t wait;
if ((file->f_flags & O_NONBLOCK) != 0 || result > 0) { @@ -1056,9 +1066,11 @@ static ssize_t snd_rawmidi_read(struct file *file, char __user *buf, size_t coun return -ENODEV; if (signal_pending(current)) return result > 0 ? result : -ERESTARTSYS; - if (!runtime->avail) - return result > 0 ? result : -EIO; spin_lock_irq(&runtime->lock); + if (!runtime->avail) { + spin_unlock_irq(&runtime->lock); + return result > 0 ? result : -EIO; + } } spin_unlock_irq(&runtime->lock); count1 = snd_rawmidi_kernel_read1(substream, @@ -1196,7 +1208,7 @@ int __snd_rawmidi_transmit_ack(struct snd_rawmidi_substream *substream, int coun runtime->avail += count; substream->bytes += count; if (count > 0) { - if (runtime->drain || snd_rawmidi_ready(substream)) + if (runtime->drain || __snd_rawmidi_ready(runtime)) wake_up(&runtime->sleep); } return count; @@ -1363,9 +1375,11 @@ static ssize_t snd_rawmidi_write(struct file *file, const char __user *buf, return -ENODEV; if (signal_pending(current)) return result > 0 ? result : -ERESTARTSYS; - if (!runtime->avail && !timeout) - return result > 0 ? result : -EIO; spin_lock_irq(&runtime->lock); + if (!runtime->avail && !timeout) { + spin_unlock_irq(&runtime->lock); + return result > 0 ? result : -EIO; + } } spin_unlock_irq(&runtime->lock); count1 = snd_rawmidi_kernel_write1(substream, buf, NULL, count); @@ -1445,6 +1459,7 @@ static void snd_rawmidi_proc_info_read(struct snd_info_entry *entry, struct snd_rawmidi *rmidi; struct snd_rawmidi_substream *substream; struct snd_rawmidi_runtime *runtime; + unsigned long buffer_size, avail, xruns;
rmidi = entry->private_data; snd_iprintf(buffer, "%s\n\n", rmidi->name); @@ -1463,13 +1478,16 @@ static void snd_rawmidi_proc_info_read(struct snd_info_entry *entry, " Owner PID : %d\n", pid_vnr(substream->pid)); runtime = substream->runtime; + spin_lock_irq(&runtime->lock); + buffer_size = runtime->buffer_size; + avail = runtime->avail; + spin_unlock_irq(&runtime->lock); snd_iprintf(buffer, " Mode : %s\n" " Buffer size : %lu\n" " Avail : %lu\n", runtime->oss ? "OSS compatible" : "native", - (unsigned long) runtime->buffer_size, - (unsigned long) runtime->avail); + buffer_size, avail); } } } @@ -1487,13 +1505,16 @@ static void snd_rawmidi_proc_info_read(struct snd_info_entry *entry, " Owner PID : %d\n", pid_vnr(substream->pid)); runtime = substream->runtime; + spin_lock_irq(&runtime->lock); + buffer_size = runtime->buffer_size; + avail = runtime->avail; + xruns = runtime->xruns; + spin_unlock_irq(&runtime->lock); snd_iprintf(buffer, " Buffer size : %lu\n" " Avail : %lu\n" " Overruns : %lu\n", - (unsigned long) runtime->buffer_size, - (unsigned long) runtime->avail, - (unsigned long) runtime->xruns); + buffer_size, avail, xruns); } } }
From: Boqun Feng boqun.feng@gmail.com
commit 8d1ddb5e79374fb277985a6b3faa2ed8631c5b4c upstream.
Syzbot reports a potential deadlock found by the newly added recursive read deadlock detection in lockdep:
[...] ======================================================== [...] WARNING: possible irq lock inversion dependency detected [...] 5.9.0-rc2-syzkaller #0 Not tainted [...] -------------------------------------------------------- [...] syz-executor.1/10214 just changed the state of lock: [...] ffff88811f506338 (&f->f_owner.lock){.+..}-{2:2}, at: send_sigurg+0x1d/0x200 [...] but this lock was taken by another, HARDIRQ-safe lock in the past: [...] (&dev->event_lock){-...}-{2:2} [...] [...] [...] and interrupts could create inverse lock ordering between them. [...] [...] [...] other info that might help us debug this: [...] Chain exists of: [...] &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock [...] [...] Possible interrupt unsafe locking scenario: [...] [...] CPU0 CPU1 [...] ---- ---- [...] lock(&f->f_owner.lock); [...] local_irq_disable(); [...] lock(&dev->event_lock); [...] lock(&new->fa_lock); [...] <Interrupt> [...] lock(&dev->event_lock); [...] [...] *** DEADLOCK ***
The corresponding deadlock case is as followed:
CPU 0 CPU 1 CPU 2 read_lock(&fown->lock); spin_lock_irqsave(&dev->event_lock, ...) write_lock_irq(&filp->f_owner.lock); // wait for the lock read_lock(&fown-lock); // have to wait until the writer release // due to the fairness <interrupted> spin_lock_irqsave(&dev->event_lock); // wait for the lock
The lock dependency on CPU 1 happens if there exists a call sequence:
input_inject_event(): spin_lock_irqsave(&dev->event_lock,...); input_handle_event(): input_pass_values(): input_to_handler(): handler->event(): // evdev_event() evdev_pass_values(): spin_lock(&client->buffer_lock); __pass_event(): kill_fasync(): kill_fasync_rcu(): read_lock(&fa->fa_lock); send_sigio(): read_lock(&fown->lock);
To fix this, make the reader in send_sigurg() and send_sigio() use read_lock_irqsave() and read_lock_irqrestore().
Reported-by: syzbot+22e87cdf94021b984aa6@syzkaller.appspotmail.com Reported-by: syzbot+c5e32344981ad9f33750@syzkaller.appspotmail.com Signed-off-by: Boqun Feng boqun.feng@gmail.com Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fcntl.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/fcntl.c b/fs/fcntl.c index 0c70a8ed2a98..2e54fdc14e51 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -799,9 +799,10 @@ void send_sigio(struct fown_struct *fown, int fd, int band) { struct task_struct *p; enum pid_type type; + unsigned long flags; struct pid *pid; - read_lock(&fown->lock); + read_lock_irqsave(&fown->lock, flags);
type = fown->pid_type; pid = fown->pid; @@ -822,7 +823,7 @@ void send_sigio(struct fown_struct *fown, int fd, int band) read_unlock(&tasklist_lock); } out_unlock_fown: - read_unlock(&fown->lock); + read_unlock_irqrestore(&fown->lock, flags); }
static void send_sigurg_to_task(struct task_struct *p, @@ -837,9 +838,10 @@ int send_sigurg(struct fown_struct *fown) struct task_struct *p; enum pid_type type; struct pid *pid; + unsigned long flags; int ret = 0; - read_lock(&fown->lock); + read_lock_irqsave(&fown->lock, flags);
type = fown->pid_type; pid = fown->pid; @@ -862,7 +864,7 @@ int send_sigurg(struct fown_struct *fown) read_unlock(&tasklist_lock); } out_unlock_fown: - read_unlock(&fown->lock); + read_unlock_irqrestore(&fown->lock, flags); return ret; }
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit 28d211919e422f58c1e6c900e5810eee4f1ce4c8 ]
When clk_hw_register_fixed_rate_with_accuracy() fails, clk_data should be freed. It's the same for the subsequent two error paths, but we should also unregister the already registered clocks in them.
Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Link: https://lore.kernel.org/r/20201020061226.6572-1-dinghao.liu@zju.edu.cn Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-sun6i.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c index 2cd5a7b1a2e3..e85abe805606 100644 --- a/drivers/rtc/rtc-sun6i.c +++ b/drivers/rtc/rtc-sun6i.c @@ -232,7 +232,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node) 300000000); if (IS_ERR(rtc->int_osc)) { pr_crit("Couldn't register the internal oscillator\n"); - return; + goto err; }
parents[0] = clk_hw_get_name(rtc->int_osc); @@ -248,7 +248,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node) rtc->losc = clk_register(NULL, &rtc->hw); if (IS_ERR(rtc->losc)) { pr_crit("Couldn't register the LOSC clock\n"); - return; + goto err_register; }
of_property_read_string_index(node, "clock-output-names", 1, @@ -259,7 +259,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node) &rtc->lock); if (IS_ERR(rtc->ext_losc)) { pr_crit("Couldn't register the LOSC external gate\n"); - return; + goto err_register; }
clk_data->num = 2; @@ -268,6 +268,8 @@ static void __init sun6i_rtc_clk_init(struct device_node *node) of_clk_add_hw_provider(node, of_clk_hw_onecell_get, clk_data); return;
+err_register: + clk_hw_unregister_fixed_rate(rtc->int_osc); err: kfree(clk_data); }
From: Miroslav Benes mbenes@suse.cz
[ Upstream commit 5e8ed280dab9eeabc1ba0b2db5dbe9fe6debb6b5 ]
If a module fails to load due to an error in prepare_coming_module(), the following error handling in load_module() runs with MODULE_STATE_COMING in module's state. Fix it by correctly setting MODULE_STATE_GOING under "bug_cleanup" label.
Signed-off-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Jessica Yu jeyu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/module.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/module.c b/kernel/module.c index e0dec03bbdd6..75ab11707a41 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3855,6 +3855,7 @@ static int load_module(struct load_info *info, const char __user *uargs, MODULE_STATE_GOING, mod); klp_module_going(mod); bug_cleanup: + mod->state = MODULE_STATE_GOING; /* module_bug_cleanup needs module_mutex protection */ mutex_lock(&module_mutex); module_bug_cleanup(mod);
From: Jan Kara jack@suse.cz
[ Upstream commit 10f04d40a9fa29785206c619f80d8beedb778837 ]
The on-disk quota format supports quota files with upto 2^32 blocks. Be careful when computing quota file offsets in the quota files from block numbers as they can overflow 32-bit types. Since quota files larger than 4GB would require ~26 millions of quota users, this is mostly a theoretical concern now but better be careful, fuzzers would find the problem sooner or later anyway...
Reviewed-by: Andreas Dilger adilger@dilger.ca Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/quota/quota_tree.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c index bb3f59bcfcf5..656f9ff63edd 100644 --- a/fs/quota/quota_tree.c +++ b/fs/quota/quota_tree.c @@ -61,7 +61,7 @@ static ssize_t read_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
memset(buf, 0, info->dqi_usable_bs); return sb->s_op->quota_read(sb, info->dqi_type, buf, - info->dqi_usable_bs, blk << info->dqi_blocksize_bits); + info->dqi_usable_bs, (loff_t)blk << info->dqi_blocksize_bits); }
static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf) @@ -70,7 +70,7 @@ static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf) ssize_t ret;
ret = sb->s_op->quota_write(sb, info->dqi_type, buf, - info->dqi_usable_bs, blk << info->dqi_blocksize_bits); + info->dqi_usable_bs, (loff_t)blk << info->dqi_blocksize_bits); if (ret != info->dqi_usable_bs) { quota_error(sb, "dquota write failed"); if (ret >= 0) @@ -283,7 +283,7 @@ static uint find_free_dqentry(struct qtree_mem_dqinfo *info, blk); goto out_buf; } - dquot->dq_off = (blk << info->dqi_blocksize_bits) + + dquot->dq_off = ((loff_t)blk << info->dqi_blocksize_bits) + sizeof(struct qt_disk_dqdbheader) + i * info->dqi_entry_size; kfree(buf); @@ -558,7 +558,7 @@ static loff_t find_block_dqentry(struct qtree_mem_dqinfo *info, ret = -EIO; goto out_buf; } else { - ret = (blk << info->dqi_blocksize_bits) + sizeof(struct + ret = ((loff_t)blk << info->dqi_blocksize_bits) + sizeof(struct qt_disk_dqdbheader) + i * info->dqi_entry_size; } out_buf:
From: Qinglang Miao miaoqinglang@huawei.com
[ Upstream commit ffa1797040c5da391859a9556be7b735acbe1242 ]
I noticed that iounmap() of msgr_block_addr before return from mpic_msgr_probe() in the error handling case is missing. So use devm_ioremap() instead of just ioremap() when remapping the message register block, so the mapping will be automatically released on probe failure.
Signed-off-by: Qinglang Miao miaoqinglang@huawei.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/sysdev/mpic_msgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/sysdev/mpic_msgr.c b/arch/powerpc/sysdev/mpic_msgr.c index 280e964e1aa8..497e86cfb12e 100644 --- a/arch/powerpc/sysdev/mpic_msgr.c +++ b/arch/powerpc/sysdev/mpic_msgr.c @@ -196,7 +196,7 @@ static int mpic_msgr_probe(struct platform_device *dev)
/* IO map the message register block. */ of_address_to_resource(np, 0, &rsrc); - msgr_block_addr = ioremap(rsrc.start, resource_size(&rsrc)); + msgr_block_addr = devm_ioremap(&dev->dev, rsrc.start, resource_size(&rsrc)); if (!msgr_block_addr) { dev_err(&dev->dev, "Failed to iomap MPIC message registers"); return -EFAULT;
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit b6d49ecd1081740b6e632366428b960461f8158b ]
When returning the layout in nfs4_evict_inode(), we need to ensure that the layout is actually done being freed before we can proceed to free the inode itself.
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4super.c | 2 +- fs/nfs/pnfs.c | 33 +++++++++++++++++++++++++++++++-- fs/nfs/pnfs.h | 5 +++++ 3 files changed, 37 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/nfs4super.c b/fs/nfs/nfs4super.c index 6fb7cb6b3f4b..e7a10f5f5405 100644 --- a/fs/nfs/nfs4super.c +++ b/fs/nfs/nfs4super.c @@ -95,7 +95,7 @@ static void nfs4_evict_inode(struct inode *inode) nfs_inode_return_delegation_noreclaim(inode); /* Note that above delegreturn would trigger pnfs return-on-close */ pnfs_return_layout(inode); - pnfs_destroy_layout(NFS_I(inode)); + pnfs_destroy_layout_final(NFS_I(inode)); /* First call standard NFS clear_inode() code */ nfs_clear_inode(inode); } diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 2b9e139a2997..a253384a4710 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -294,6 +294,7 @@ void pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo) { struct inode *inode; + unsigned long i_state;
if (!lo) return; @@ -304,8 +305,12 @@ pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo) if (!list_empty(&lo->plh_segs)) WARN_ONCE(1, "NFS: BUG unfreed layout segments.\n"); pnfs_detach_layout_hdr(lo); + i_state = inode->i_state; spin_unlock(&inode->i_lock); pnfs_free_layout_hdr(lo); + /* Notify pnfs_destroy_layout_final() that we're done */ + if (i_state & (I_FREEING | I_CLEAR)) + wake_up_var(lo); } }
@@ -713,8 +718,7 @@ pnfs_free_lseg_list(struct list_head *free_me) } }
-void -pnfs_destroy_layout(struct nfs_inode *nfsi) +static struct pnfs_layout_hdr *__pnfs_destroy_layout(struct nfs_inode *nfsi) { struct pnfs_layout_hdr *lo; LIST_HEAD(tmp_list); @@ -732,9 +736,34 @@ pnfs_destroy_layout(struct nfs_inode *nfsi) pnfs_put_layout_hdr(lo); } else spin_unlock(&nfsi->vfs_inode.i_lock); + return lo; +} + +void pnfs_destroy_layout(struct nfs_inode *nfsi) +{ + __pnfs_destroy_layout(nfsi); } EXPORT_SYMBOL_GPL(pnfs_destroy_layout);
+static bool pnfs_layout_removed(struct nfs_inode *nfsi, + struct pnfs_layout_hdr *lo) +{ + bool ret; + + spin_lock(&nfsi->vfs_inode.i_lock); + ret = nfsi->layout != lo; + spin_unlock(&nfsi->vfs_inode.i_lock); + return ret; +} + +void pnfs_destroy_layout_final(struct nfs_inode *nfsi) +{ + struct pnfs_layout_hdr *lo = __pnfs_destroy_layout(nfsi); + + if (lo) + wait_var_event(lo, pnfs_layout_removed(nfsi, lo)); +} + static bool pnfs_layout_add_bulk_destroy_list(struct inode *inode, struct list_head *layout_list) diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h index 3ba44819a88a..80fafa29e567 100644 --- a/fs/nfs/pnfs.h +++ b/fs/nfs/pnfs.h @@ -254,6 +254,7 @@ struct pnfs_layout_segment *pnfs_layout_process(struct nfs4_layoutget *lgp); void pnfs_layoutget_free(struct nfs4_layoutget *lgp); void pnfs_free_lseg_list(struct list_head *tmp_list); void pnfs_destroy_layout(struct nfs_inode *); +void pnfs_destroy_layout_final(struct nfs_inode *); void pnfs_destroy_all_layouts(struct nfs_client *); int pnfs_destroy_layouts_byfsid(struct nfs_client *clp, struct nfs_fsid *fsid, @@ -645,6 +646,10 @@ static inline void pnfs_destroy_layout(struct nfs_inode *nfsi) { }
+static inline void pnfs_destroy_layout_final(struct nfs_inode *nfsi) +{ +} + static inline struct pnfs_layout_segment * pnfs_get_lseg(struct pnfs_layout_segment *lseg) {
From: Jessica Yu jeyu@kernel.org
[ Upstream commit 38dc717e97153e46375ee21797aa54777e5498f3 ]
Apparently there has been a longstanding race between udev/systemd and the module loader. Currently, the module loader sends a uevent right after sysfs initialization, but before the module calls its init function. However, some udev rules expect that the module has initialized already upon receiving the uevent.
This race has been triggered recently (see link in references) in some systemd mount unit files. For instance, the configfs module creates the /sys/kernel/config mount point in its init function, however the module loader issues the uevent before this happens. sys-kernel-config.mount expects to be able to mount /sys/kernel/config upon receipt of the module loading uevent, but if the configfs module has not called its init function yet, then this directory will not exist and the mount unit fails. A similar situation exists for sys-fs-fuse-connections.mount, as the fuse sysfs mount point is created during the fuse module's init function. If udev is faster than module initialization then the mount unit would fail in a similar fashion.
To fix this race, delay the module KOBJ_ADD uevent until after the module has finished calling its init routine.
References: https://github.com/systemd/systemd/issues/17586 Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Tested-By: Nicolas Morey-Chaisemartin nmoreychaisemartin@suse.com Signed-off-by: Jessica Yu jeyu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/module.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c index 75ab11707a41..0292104f1867 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1806,7 +1806,6 @@ static int mod_sysfs_init(struct module *mod) if (err) mod_kobject_put(mod);
- /* delay uevent until full sysfs population */ out: return err; } @@ -1843,7 +1842,6 @@ static int mod_sysfs_setup(struct module *mod, add_sect_attrs(mod, info); add_notes_attrs(mod, info);
- kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD); return 0;
out_unreg_modinfo_attrs: @@ -3513,6 +3511,9 @@ static noinline int do_init_module(struct module *mod) blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_LIVE, mod);
+ /* Delay uevent until module has finished its init routine */ + kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD); + /* * We need to finish all async code before the module init sequence * is done. This has potential to deadlock. For example, a newly
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 618de0f4ef11acd8cf26902e65493d46cc20cc89 ]
The PCM hw_params core function tries to clear up the PCM buffer before actually using for avoiding the information leak from the previous usages or the usage before a new allocation. It performs the memset() with runtime->dma_bytes, but this might still leave some remaining bytes untouched; namely, the PCM buffer size is aligned in page size for mmap, hence runtime->dma_bytes doesn't necessarily cover all PCM buffer pages, and the remaining bytes are exposed via mmap.
This patch changes the memory clearance to cover the all buffer pages if the stream is supposed to be mmap-ready (that guarantees that the buffer size is aligned in page size).
Reviewed-by: Lars-Peter Clausen lars@metafoo.de Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/pcm_native.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 7c12b0deb4eb..db62dbe7eaa8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -753,8 +753,13 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream, runtime->boundary *= 2;
/* clear the buffer for avoiding possible kernel info leaks */ - if (runtime->dma_area && !substream->ops->copy_user) - memset(runtime->dma_area, 0, runtime->dma_bytes); + if (runtime->dma_area && !substream->ops->copy_user) { + size_t size = runtime->dma_bytes; + + if (runtime->info & SNDRV_PCM_INFO_MMAP) + size = PAGE_ALIGN(size); + memset(runtime->dma_area, 0, size); + }
snd_pcm_timer_resolution_change(substream); snd_pcm_set_state(substream, SNDRV_PCM_STATE_SETUP);
From: Hyeongseok Kim hyeongseok@gmail.com
[ Upstream commit 252bd1256396cebc6fc3526127fdb0b317601318 ]
If emergency system shutdown is called, like by thermal shutdown, a dm device could be alive when the block device couldn't process I/O requests anymore. In this state, the handling of I/O errors by new dm I/O requests or by those already in-flight can lead to a verity corruption state, which is a misjudgment.
So, skip verity work in response to I/O error when system is shutting down.
Signed-off-by: Hyeongseok Kim hyeongseok@gmail.com Reviewed-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-verity-target.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c index 50ab1af8ac8a..6ec3e97814b1 100644 --- a/drivers/md/dm-verity-target.c +++ b/drivers/md/dm-verity-target.c @@ -533,6 +533,15 @@ static int verity_verify_io(struct dm_verity_io *io) return 0; }
+/* + * Skip verity work in response to I/O error when system is shutting down. + */ +static inline bool verity_is_system_shutting_down(void) +{ + return system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF + || system_state == SYSTEM_RESTART; +} + /* * End one "io" structure with a given error. */ @@ -560,7 +569,8 @@ static void verity_end_io(struct bio *bio) { struct dm_verity_io *io = bio->bi_private;
- if (bio->bi_status && !verity_fec_is_enabled(io->v)) { + if (bio->bi_status && + (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) { verity_finish_io(io, bio->bi_status); return; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
Merge 30 patches from 4.19.165 stable branch (30 total) beside 0 already merged patches.
Tested-by: Pavel Machek (CIP) pavel@denx.de Tested-by: Jon Hunter jonathanh@nvidia.com Tested-by: Guenter Roeck linux@roeck-us.net Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/20210105090818.518271884@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile index d02af6881a5f..e636c2143295 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 19 -SUBLEVEL = 164 +SUBLEVEL = 165 EXTRAVERSION = NAME = "People's Front"