CVE-2021-27365 CVE-2021-27363 CVE-2021-27364
Chris Leech (2):
  scsi: iscsi: Ensure sysfs attributes are limited to PAGE_SIZE
  scsi: iscsi: Verify lengths on passthrough PDUs

Christoph Hellwig (1):
  mm/swapfile.c: fix a comment in sys_swapon()

Darrick J. Wong (2):
  mm: set S_SWAPFILE on blockdev swap devices
  vfs: don't allow writes to swap files

Domenico Andreoli (1):
  hibernate: Allow uswsusp to write to swap

Jan Beulich (10):
  Xen/x86: don't bail early from clear_foreign_p2m_mapping()
  Xen/x86: also check kernel mapping in set_foreign_p2m_mapping()
  Xen/gntdev: correct dev_bus_addr handling in gntdev_map_grant_pages()
  Xen/gntdev: correct error checking in gntdev_map_grant_pages()
  xen-blkback: don't "handle" error by BUG()
  xen-netback: don't "handle" error by BUG()
  xen-scsiback: don't "handle" error by BUG()
  xen-blkback: fix error handling in xen_blkbk_map()
  Xen/gnttab: handle p2m update errors on a per-slot basis
  xen-netback: respect gnttab_map_refs()'s return value

Joe Perches (1):
  sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output

Lee Duncan (1):
  scsi: iscsi: Restrict sessions and handles to admin capabilities

Miaohe Lin (1):
  mm/swapfile.c: fix potential memory leak in sys_swapon

Miklos Szeredi (6):
  ovl: pass correct flags for opening real directory
  ovl: switch to mounter creds in readdir
  ovl: verify permissions in ovl_path_open()
  ovl: call secutiry hook in ovl_real_ioctl()
  ovl: check permission to open real file
  ovl: do not fail because of O_NOATIME

Naohiro Aota (1):
  mm/swapfile.c: move inode_lock out of claim_swapfile

Stefano Stabellini (1):
  xen/arm: don't ignore return errors from set_phys_to_machine

Wenchao Hao (2):
  nvme: register ns_id attributes as default sysfs groups
  virtio-blk: modernize sysfs attribute creation

Yang Yingliang (1):
  sysfs: fix kabi broken when add sysfs_emit and sysfs_emit_at

Ye Bin (1):
  ext4: Fix not report exception message when mount with errors=continue

zhangyi (F) (1):
  block_dump: remove block_dump feature when dirting inode
 Documentation/filesystems/sysfs.txt |   8 +-
 arch/arm/xen/p2m.c                  |  33 ++++++-
 arch/x86/xen/p2m.c                  |  59 ++++++++---
 drivers/block/virtio_blk.c          |  67 +++++++------
 drivers/block/xen-blkback/blkback.c |  30 +++---
 drivers/net/xen-netback/netback.c   |  10 +-
 drivers/nvme/host/core.c            |  20 ++--
 drivers/nvme/host/lightnvm.c        | 105 +++++++++-----------
 drivers/nvme/host/multipath.c       |  11 +--
 drivers/nvme/host/nvme.h            |  10 +-
 drivers/scsi/libiscsi.c             | 148 ++++++++++++++--------------
 drivers/scsi/scsi_transport_iscsi.c |  38 +++++--
 drivers/xen/gntdev.c                |  37 +++----
 drivers/xen/xen-scsiback.c          |   4 +-
 fs/block_dev.c                      |   5 +
 fs/ext4/super.c                     |   6 +-
 fs/fs-writeback.c                   |  25 -----
 fs/overlayfs/file.c                 |  28 ++++--
 fs/overlayfs/readdir.c              |  37 +++++--
 fs/overlayfs/util.c                 |  27 ++++-
 fs/sysfs/file.c                     |  55 +++++++++++
 include/linux/fs.h                  |  11 +++
 include/linux/sysfs.h               |  16 +++
 include/xen/grant_table.h           |   1 +
 mm/filemap.c                        |   3 +
 mm/memory.c                         |   4 +
 mm/mmap.c                           |   8 +-
 mm/swapfile.c                       |  72 ++++++++------
 security/security.c                 |   1 +
 29 files changed, 552 insertions(+), 327 deletions(-)
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit dfed59ee4b41b0937163dfed36752d29e72d0712
CVE: CVE-2021-26932
--------------------------------
commit a35f2ef3b7376bfd0a57f7844bd7454389aae1fc upstream.
Its sibling (set_foreign_p2m_mapping()) as well as the sibling of its only caller (gnttab_map_refs()) don't clean up after themselves in case of error. Higher level callers are expected to do so. However, in order for that to really clean up any partially set up state, the operation should not terminate upon encountering an entry in unexpected state. It is particularly relevant to notice here that set_foreign_p2m_mapping() would skip setting up a p2m entry if its grant mapping failed, but it would continue to set up further p2m entries as long as their mappings succeeded.
Arguably down the road set_foreign_p2m_mapping() may want its page state related WARN_ON() also converted to an error return.
This is part of XSA-361.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/x86/xen/p2m.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index 159a897151d6..7ecbc63f1506 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -746,17 +746,15 @@ int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
 		unsigned long mfn = __pfn_to_mfn(page_to_pfn(pages[i]));
 		unsigned long pfn = page_to_pfn(pages[i]);
 
-		if (mfn == INVALID_P2M_ENTRY || !(mfn & FOREIGN_FRAME_BIT)) {
+		if (mfn != INVALID_P2M_ENTRY && (mfn & FOREIGN_FRAME_BIT))
+			set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
+		else
 			ret = -EINVAL;
-			goto out;
-		}
-
-		set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
 	}
 	if (kunmap_ops)
 		ret = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
-						kunmap_ops, count);
-out:
+						kunmap_ops, count) ?: ret;
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(clear_foreign_p2m_mapping);
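The control-flow change in this patch can be illustrated outside the kernel. The following is only a minimal userspace sketch: `clear_slots()`, `SLOT_INVALID`, `SLOT_FOREIGN` and the `batch_status` parameter (standing in for the hypercall result) are hypothetical names, not kernel API. It shows a loop that records -EINVAL for an entry in unexpected state yet keeps clearing the remaining entries, and a return value in which a non-zero batch status takes precedence over the recorded error.

```c
#include <errno.h>
#include <stddef.h>

#define SLOT_INVALID (~0UL)     /* hypothetical "no mapping" marker */
#define SLOT_FOREIGN (1UL << 0) /* hypothetical "foreign frame" flag */

/*
 * Clear every foreign slot. An entry in unexpected state is reported
 * as -EINVAL, but the loop does not stop there, so later entries are
 * still cleaned up.
 */
static int clear_slots(unsigned long *slots, size_t count, int batch_status)
{
    int ret = 0;

    for (size_t i = 0; i < count; i++) {
        if (slots[i] != SLOT_INVALID && (slots[i] & SLOT_FOREIGN))
            slots[i] = SLOT_INVALID;
        else
            ret = -EINVAL;
    }

    /* Portable spelling of the GNU "a ?: b" shorthand used in the patch. */
    return batch_status ? batch_status : ret;
}
```

With the slots `{1, 2, 3}` the middle entry lacks the foreign flag, so the call returns -EINVAL while the first and last entries are still cleared.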
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit c3d586afdb4474f9389eeddf6c9259e33cc0a321
CVE: CVE-2021-26932
--------------------------------
commit b512e1b077e5ccdbd6e225b15d934ab12453b70a upstream.
We should not set up further state if either mapping failed; paying attention to just the user mapping's status isn't enough.
Also use GNTST_okay instead of implying its value (zero).
This is part of XSA-361.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/x86/xen/p2m.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index 7ecbc63f1506..e8ef994c7243 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -708,7 +708,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 		unsigned long mfn, pfn;
 
 		/* Do not add to override if the map failed. */
-		if (map_ops[i].status)
+		if (map_ops[i].status != GNTST_okay ||
+		    (kmap_ops && kmap_ops[i].status != GNTST_okay))
 			continue;
 
 		if (map_ops[i].flags & GNTMAP_contains_pte) {
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit ba75f4393225c4049797388329313d1d9a5ef480
CVE: CVE-2021-26932
--------------------------------
commit dbe5283605b3bc12ca45def09cc721a0a5c853a2 upstream.
We may not skip setting the field in the unmap structure when GNTMAP_device_map is in use - such an unmap would fail to release the respective resources (a page ref in the hypervisor). On the other hand, the field doesn't need setting at all when GNTMAP_device_map is not in use.
To record the value for unmapping, we had also better not use our local p2m: in particular, after a subsequent change it may not have been updated for all the batch elements. Instead, the value can simply be taken from the respective map's results.
We can additionally avoid playing this game altogether for the kernel part of the mappings in (x86) PV mode.
This is part of XSA-361.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/xen/gntdev.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 9d8e02cfd480..8a03087ecb26 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -323,18 +323,25 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
 		 * to the kernel linear addresses of the struct pages.
 		 * These ptes are completely different from the user ptes dealt
 		 * with find_grant_ptes.
+		 * Note that GNTMAP_device_map isn't needed here: The
+		 * dev_bus_addr output field gets consumed only from ->map_ops,
+		 * and by not requesting it when mapping we also avoid needing
+		 * to mirror dev_bus_addr into ->unmap_ops (and holding an extra
+		 * reference to the page in the hypervisor).
 		 */
+		unsigned int flags = (map->flags & ~GNTMAP_device_map) |
+				     GNTMAP_host_map;
+
 		for (i = 0; i < map->count; i++) {
 			unsigned long address = (unsigned long)
 				pfn_to_kaddr(page_to_pfn(map->pages[i]));
 			BUG_ON(PageHighMem(map->pages[i]));
 
-			gnttab_set_map_op(&map->kmap_ops[i], address,
-				map->flags | GNTMAP_host_map,
+			gnttab_set_map_op(&map->kmap_ops[i], address, flags,
 				map->grants[i].ref,
 				map->grants[i].domid);
 			gnttab_set_unmap_op(&map->kunmap_ops[i], address,
-				map->flags | GNTMAP_host_map, -1);
+				flags, -1);
 		}
 	}
 
@@ -350,17 +357,12 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
 			continue;
 		}
 
+		if (map->flags & GNTMAP_device_map)
+			map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
+
 		map->unmap_ops[i].handle = map->map_ops[i].handle;
 		if (use_ptemod)
 			map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
-#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
-		else if (map->dma_vaddr) {
-			unsigned long bfn;
-
-			bfn = pfn_to_bfn(page_to_pfn(map->pages[i]));
-			map->unmap_ops[i].dev_bus_addr = __pfn_to_phys(bfn);
-		}
-#endif
 	}
 	return err;
 }
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit e07f06f6bbeed5bf47fed79ac6a57ec62b33304a
CVE: CVE-2021-26932
--------------------------------
commit ebee0eab08594b2bd5db716288a4f1ae5936e9bc upstream.
Failure of the kernel part of the mapping operation should also be indicated as an error to the caller, or else it may assume the respective kernel VA is okay to access.
Furthermore gnttab_map_refs() failing still requires recording successfully mapped handles, so they can be unmapped subsequently. This in turn requires there to be a way to tell full hypercall failure from partial success - preset map_op status fields such that they won't "happen" to look as if the operation succeeded.
Also again use GNTST_okay instead of implying its value (zero).
This is part of XSA-361.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/xen/gntdev.c      | 17 +++++++++--------
 include/xen/grant_table.h |  1 +
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 8a03087ecb26..a8b589b5f362 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -348,21 +348,22 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
 	pr_debug("map %d+%d\n", map->index, map->count);
 	err = gnttab_map_refs(map->map_ops, use_ptemod ? map->kmap_ops : NULL,
 			map->pages, map->count);
-	if (err)
-		return err;
 
 	for (i = 0; i < map->count; i++) {
-		if (map->map_ops[i].status) {
+		if (map->map_ops[i].status == GNTST_okay)
+			map->unmap_ops[i].handle = map->map_ops[i].handle;
+		else if (!err)
 			err = -EINVAL;
-			continue;
-		}
 
 		if (map->flags & GNTMAP_device_map)
 			map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
 
-		map->unmap_ops[i].handle = map->map_ops[i].handle;
-		if (use_ptemod)
-			map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
+		if (use_ptemod) {
+			if (map->kmap_ops[i].status == GNTST_okay)
+				map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
+			else if (!err)
+				err = -EINVAL;
+		}
 	}
 	return err;
 }
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 9bc5bc07d4d3..a9978350b45b 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -157,6 +157,7 @@ gnttab_set_map_op(struct gnttab_map_grant_ref *map, phys_addr_t addr,
 	map->flags = flags;
 	map->ref = ref;
 	map->dom = domid;
+	map->status = 1; /* arbitrary positive value */
 }
 
 static inline void
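The reason for presetting the status field can be shown with a small userspace model. This is a sketch under stated assumptions, not the kernel code: `struct map_op`, `struct unmap_op`, `set_map_op()`, `fake_map()` and `record_handles()` are hypothetical stand-ins, and `fake_map()` simply pretends that the batch operation stopped after `done` entries. Because every status starts out as a positive value, entries the operation never reached cannot be mistaken for successes, while handles of entries that did succeed are still recorded for later unmapping.

```c
#include <errno.h>
#include <stddef.h>

#define ST_OKAY 0 /* mirrors GNTST_okay being zero */

struct map_op   { int status; int handle; };
struct unmap_op { int handle; };

/* Preset the status so an untouched op never looks successful. */
static void set_map_op(struct map_op *op)
{
    op->status = 1; /* arbitrary positive value, as in the patch */
}

/* Hypothetical batch operation: handles only the first `done` ops. */
static int fake_map(struct map_op *ops, size_t count, size_t done)
{
    for (size_t i = 0; i < count && i < done; i++) {
        ops[i].status = ST_OKAY;
        ops[i].handle = (int)(100 + i);
    }
    return done < count ? -1 : 0;
}

/* Record handles of successful ops even if the batch as a whole failed. */
static int record_handles(const struct map_op *ops, struct unmap_op *unmap,
                          size_t count, int err)
{
    for (size_t i = 0; i < count; i++) {
        if (ops[i].status == ST_OKAY)
            unmap[i].handle = ops[i].handle;
        else if (!err)
            err = -EINVAL; /* keep an earlier error if there is one */
    }
    return err;
}
```

The `else if (!err)` form matters here: an error already reported by the batch operation is preserved instead of being overwritten by the per-entry check.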
From: Stefano Stabellini <stefano.stabellini@xilinx.com>

stable inclusion
from linux-4.19.177
commit 271a3984f73c485f4c1b796a61cc5bd3994a0463
CVE: CVE-2021-26932
--------------------------------
commit 36bf1dfb8b266e089afa9b7b984217f17027bf35 upstream.
set_phys_to_machine can fail due to lack of memory, see the kzalloc call in arch/arm/xen/p2m.c:__set_phys_to_machine_multi.
Don't ignore the potential return error in set_foreign_p2m_mapping, returning it to the caller instead.
This is part of XSA-361.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Cc: stable@vger.kernel.org
Reviewed-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm/xen/p2m.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index 0641ba54ab62..ce538c51fa3f 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -93,8 +93,10 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 	for (i = 0; i < count; i++) {
 		if (map_ops[i].status)
 			continue;
-		set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
-				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT);
+		if (unlikely(!set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT))) {
+			return -ENOMEM;
+		}
 	}
 
 	return 0;
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit a01b49a9bf91a723f541139c063c1ff681ac536a
CVE: CVE-2021-26931
--------------------------------
commit 5a264285ed1cd32e26d9de4f3c8c6855e467fd63 upstream.
In particular, -ENOMEM may come back here from set_foreign_p2m_mapping(). Don't make matters worse by crashing, all the more so because handling elsewhere (together with the map's status fields now indicating when a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/block/xen-blkback/blkback.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index b18f0162cb9c..432f9359b57c 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -867,10 +867,8 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 			break;
 	}
 
-	if (segs_to_map) {
+	if (segs_to_map)
 		ret = gnttab_map_refs(map, NULL, pages_to_gnt, segs_to_map);
-		BUG_ON(ret);
-	}
 
 	/*
 	 * Now swizzle the MFN in our domain with the MFN from the other domain
@@ -885,7 +883,7 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 				pr_debug("invalid buffer -- could not remap it\n");
 				put_free_pages(ring, &pages[seg_idx]->page, 1);
 				pages[seg_idx]->handle = BLKBACK_INVALID_HANDLE;
-				ret |= 1;
+				ret |= !ret;
 				goto next;
 			}
 			pages[seg_idx]->handle = map[new_map_idx].handle;
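The switch from `ret |= 1` to `ret |= !ret` in the second hunk is easy to miss. Once `ret` can carry a negative errno from gnttab_map_refs(), OR-ing in bit 0 alters that value: on a two's-complement machine -12 (-ENOMEM on Linux) becomes -11 (-EAGAIN). The new form only turns a zero into 1 and leaves any existing non-zero value alone. A minimal sketch, with hypothetical helper names chosen for illustration:

```c
/* Old idiom: unconditionally set bit 0, which can change an errno value. */
static int flag_failure_old(int ret)
{
    return ret | 1;
}

/* New idiom: flag a failure only if no error has been recorded yet. */
static int flag_failure_new(int ret)
{
    ret |= !ret; /* !ret is 1 when ret == 0, otherwise 0 */
    return ret;
}
```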
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit 717faa776ca2163119239ea58bb78c4d732d8a4f
CVE: CVE-2021-26931
--------------------------------
commit 3194a1746e8aabe86075fd3c5e7cf1f4632d7f16 upstream.
In particular, -ENOMEM may come back here from set_foreign_p2m_mapping(). Don't make matters worse by crashing, all the more so because handling elsewhere (together with the map's status fields now indicating when a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/net/xen-netback/netback.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f228298c3bd0..b29a1b279fff 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1326,13 +1326,11 @@ int xenvif_tx_action(struct xenvif_queue *queue, int budget)
 		return 0;
 
 	gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
-	if (nr_mops != 0) {
+	if (nr_mops != 0)
 		ret = gnttab_map_refs(queue->tx_map_ops,
 				      NULL,
 				      queue->pages_to_map,
 				      nr_mops);
-		BUG_ON(ret);
-	}
 
 	work_done = xenvif_tx_submit(queue);
 
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit f84c00fbd27b043fa42a56eaaa14e293877bc69b
CVE: CVE-2021-26931
--------------------------------
commit 7c77474b2d22176d2bfb592ec74e0f2cb71352c9 upstream.
In particular, -ENOMEM may come back here from set_foreign_p2m_mapping(). Don't make matters worse by crashing, all the more so because handling elsewhere (together with the map's status fields now indicating when a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/xen/xen-scsiback.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 1abc0a55b8d9..614d067ffe12 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -422,12 +422,12 @@ static int scsiback_gnttab_data_map_batch(struct gnttab_map_grant_ref *map,
 		return 0;
 
 	err = gnttab_map_refs(map, NULL, pg, cnt);
-	BUG_ON(err);
 	for (i = 0; i < cnt; i++) {
 		if (unlikely(map[i].status != GNTST_okay)) {
 			pr_err("invalid buffer -- could not remap it\n");
 			map[i].handle = SCSIBACK_INVALID_HANDLE;
-			err = -ENOMEM;
+			if (!err)
+				err = -ENOMEM;
 		} else {
 			get_page(pg[i]);
 		}
From: Jan Beulich <jbeulich@suse.com>

stable inclusion
from linux-4.19.177
commit 98f16e171e2849dba76e2e0346e914452c030dc5
CVE: CVE-2021-26930
--------------------------------
commit 871997bc9e423f05c7da7c9178e62dde5df2a7f8 upstream.
The function uses a goto-based loop, which may lead to an earlier error getting discarded by a later iteration. Exit this ad-hoc loop when an error was encountered.
The out-of-memory error path additionally fails to fill a structure field looked at by xen_blkbk_unmap_prepare() before inspecting the handle which does get properly set (to BLKBACK_INVALID_HANDLE).
Since the earlier exiting from the ad-hoc loop requires the same field filling (invalidation) as that on the out-of-memory path, fold both paths. While doing so, drop the pr_alert(), as extra log messages aren't going to help the situation (the kernel will log oom conditions already anyway).
This is XSA-365.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/block/xen-blkback/blkback.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 432f9359b57c..208f3eea3641 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -850,8 +850,11 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 			pages[i]->page = persistent_gnt->page;
 			pages[i]->persistent_gnt = persistent_gnt;
 		} else {
-			if (get_free_page(ring, &pages[i]->page))
-				goto out_of_memory;
+			if (get_free_page(ring, &pages[i]->page)) {
+				put_free_pages(ring, pages_to_gnt, segs_to_map);
+				ret = -ENOMEM;
+				goto out;
+			}
 			addr = vaddr(pages[i]->page);
 			pages_to_gnt[segs_to_map] = pages[i]->page;
 			pages[i]->persistent_gnt = NULL;
@@ -935,17 +938,18 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 	}
 	segs_to_map = 0;
 	last_map = map_until;
-	if (map_until != num)
+	if (!ret && map_until != num)
 		goto again;
 
-	return ret;
-
-out_of_memory:
-	pr_alert("%s: out of memory\n", __func__);
-	put_free_pages(ring, pages_to_gnt, segs_to_map);
-	for (i = last_map; i < num; i++)
+out:
+	for (i = last_map; i < num; i++) {
+		/* Don't zap current batch's valid persistent grants. */
+		if(i >= last_map + segs_to_map)
+			pages[i]->persistent_gnt = NULL;
 		pages[i]->handle = BLKBACK_INVALID_HANDLE;
-	return -ENOMEM;
+	}
+
+	return ret;
 }
 
 static int xen_blkbk_map_seg(struct pending_req *pending_req)
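The shape of the fix, leaving a goto-based batch loop as soon as an error is seen and invalidating every entry that was not successfully processed, can be sketched in plain C. All names here (`map_batch()`, `map_all()`, `BATCH`, `INVALID_HANDLE`) are hypothetical, and the per-batch worker is a toy that fails on a negative input; this is an illustration of the pattern, not the driver's logic.

```c
#include <stddef.h>

#define BATCH 2
#define INVALID_HANDLE (-1)

/* Hypothetical per-batch worker: fails when it meets a negative input. */
static int map_batch(const int *in, int *handles, size_t from, size_t until)
{
    for (size_t i = from; i < until; i++) {
        if (in[i] < 0)
            return -1;
        handles[i] = in[i];
    }
    return 0;
}

/* Process `num` entries in batches; stop at the first failing batch. */
static int map_all(const int *in, int *handles, size_t num)
{
    size_t last = 0, until;
    int ret;

again:
    until = last + BATCH < num ? last + BATCH : num;
    ret = map_batch(in, handles, last, until);
    if (!ret) {
        last = until;
        if (last != num)
            goto again; /* loop only while no error was recorded */
        return 0;
    }

    /* Error path: invalidate the failing batch and everything after it. */
    for (size_t i = last; i < num; i++)
        handles[i] = INVALID_HANDLE;
    return ret;
}
```

Without the `if (!ret)` guard, a later successful batch would overwrite `ret` and the caller would never learn about the earlier failure, which is the problem the commit message describes.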
From: Ye Bin <yebin10@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 50614
CVE: NA
-----------------------------------------------
Fixes: 49af7ecfab9a ("ext4: don't remount read-only with errors=continue on reboot")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 fs/ext4/super.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 15f8aeda9ee7..18870ae874ab 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -509,9 +509,12 @@ static void ext4_handle_error(struct super_block *sb)
 	if (test_opt(sb, WARN_ON_ERROR))
 		WARN_ON_ONCE(1);
 
-	if (sb_rdonly(sb) || test_opt(sb, ERRORS_CONT))
+	if (sb_rdonly(sb))
 		return;
 
+	if (test_opt(sb, ERRORS_CONT))
+		goto out;
+
 	EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED;
 	if (journal)
 		jbd2_journal_abort(journal, -EIO);
@@ -533,6 +536,7 @@ static void ext4_handle_error(struct super_block *sb)
 			sb->s_id);
 	}
 
+out:
 	ext4_netlink_send_info(sb, 1);
 }
 
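Read from the diff alone, the resulting control flow is: a read-only filesystem still returns early, whereas errors=continue now skips only the abort handling and still reaches the notification at the end. A minimal userspace model of that flow follows; `struct fs_state` and `handle_error()` are hypothetical, and the `notifications` counter merely stands in for the ext4_netlink_send_info() call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant superblock state. */
struct fs_state {
    bool read_only;
    bool errors_continue;
    bool aborted;
    int  notifications; /* models calls to the netlink notifier */
};

static void handle_error(struct fs_state *fs)
{
    if (fs->read_only)
        return; /* unchanged: nothing to do, nothing reported */

    if (fs->errors_continue)
        goto out; /* skip the abort, but still report the error */

    fs->aborted = true;
out:
    fs->notifications++;
}
```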
From: Wenchao Hao <haowenchao@huawei.com>

euleros/rtos inclusion
category: bugfix
bugzilla: NA
--------------------------------
We should be registering the ns_id attribute as default sysfs attribute groups, otherwise we have a race condition between the uevent and the attributes appearing in sysfs.
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Reviewed-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/nvme/host/core.c      |  20 +++----
 drivers/nvme/host/lightnvm.c  | 105 ++++++++++++++--------------------
 drivers/nvme/host/multipath.c |  11 +---
 drivers/nvme/host/nvme.h      |  10 +---
 4 files changed, 58 insertions(+), 88 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 8c760493354f..779306e640e9 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2966,6 +2966,14 @@ const struct attribute_group nvme_ns_id_attr_group = { .is_visible = nvme_ns_id_attrs_are_visible, };
+const struct attribute_group *nvme_ns_id_attr_groups[] = { + &nvme_ns_id_attr_group, +#ifdef CONFIG_NVM + &nvme_nvm_attr_group, +#endif + NULL, +}; + #define nvme_show_str_function(field) \ static ssize_t field##_show(struct device *dev, \ struct device_attribute *attr, char *buf) \ @@ -3331,14 +3339,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
nvme_get_ctrl(ctrl);
+ disk_to_dev(ns->disk)->groups = nvme_ns_id_attr_groups; device_add_disk(ctrl->device, ns->disk); - if (sysfs_create_group(&disk_to_dev(ns->disk)->kobj, - &nvme_ns_id_attr_group)) - pr_warn("%s: failed to create sysfs group for identification\n", - ns->disk->disk_name); - if (ns->ndev && nvme_nvm_register_sysfs(ns)) - pr_warn("%s: failed to register lightnvm sysfs group for identification\n", - ns->disk->disk_name);
nvme_mpath_add_disk(ns, id); nvme_fault_inject_init(ns); @@ -3378,10 +3380,6 @@ static void nvme_ns_remove(struct nvme_ns *ns) synchronize_srcu(&ns->head->srcu); /* wait for concurrent submissions */
if (ns->disk && ns->disk->flags & GENHD_FL_UP) { - sysfs_remove_group(&disk_to_dev(ns->disk)->kobj, - &nvme_ns_id_attr_group); - if (ns->ndev) - nvme_nvm_unregister_sysfs(ns); del_gendisk(ns->disk); blk_cleanup_queue(ns->queue); if (blk_get_integrity(ns->disk)) diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c index a69553e75f38..d10257b9c523 100644 --- a/drivers/nvme/host/lightnvm.c +++ b/drivers/nvme/host/lightnvm.c @@ -1193,10 +1193,29 @@ static NVM_DEV_ATTR_12_RO(multiplane_modes); static NVM_DEV_ATTR_12_RO(media_capabilities); static NVM_DEV_ATTR_12_RO(max_phys_secs);
-static struct attribute *nvm_dev_attrs_12[] = { +/* 2.0 values */ +static NVM_DEV_ATTR_20_RO(groups); +static NVM_DEV_ATTR_20_RO(punits); +static NVM_DEV_ATTR_20_RO(chunks); +static NVM_DEV_ATTR_20_RO(clba); +static NVM_DEV_ATTR_20_RO(ws_min); +static NVM_DEV_ATTR_20_RO(ws_opt); +static NVM_DEV_ATTR_20_RO(maxoc); +static NVM_DEV_ATTR_20_RO(maxocpu); +static NVM_DEV_ATTR_20_RO(mw_cunits); +static NVM_DEV_ATTR_20_RO(write_typ); +static NVM_DEV_ATTR_20_RO(write_max); +static NVM_DEV_ATTR_20_RO(reset_typ); +static NVM_DEV_ATTR_20_RO(reset_max); + +static struct attribute *nvm_dev_attrs[] = { + /* version agnostic attrs */ &dev_attr_version.attr, &dev_attr_capabilities.attr, + &dev_attr_read_typ.attr, + &dev_attr_read_max.attr,
+ /* 1.2 attrs */ &dev_attr_vendor_opcode.attr, &dev_attr_device_mode.attr, &dev_attr_media_manager.attr, @@ -1211,8 +1230,6 @@ static struct attribute *nvm_dev_attrs_12[] = { &dev_attr_page_size.attr, &dev_attr_hw_sector_size.attr, &dev_attr_oob_sector_size.attr, - &dev_attr_read_typ.attr, - &dev_attr_read_max.attr, &dev_attr_prog_typ.attr, &dev_attr_prog_max.attr, &dev_attr_erase_typ.attr, @@ -1221,33 +1238,7 @@ static struct attribute *nvm_dev_attrs_12[] = { &dev_attr_media_capabilities.attr, &dev_attr_max_phys_secs.attr,
- NULL, -}; - -static const struct attribute_group nvm_dev_attr_group_12 = { - .name = "lightnvm", - .attrs = nvm_dev_attrs_12, -}; - -/* 2.0 values */ -static NVM_DEV_ATTR_20_RO(groups); -static NVM_DEV_ATTR_20_RO(punits); -static NVM_DEV_ATTR_20_RO(chunks); -static NVM_DEV_ATTR_20_RO(clba); -static NVM_DEV_ATTR_20_RO(ws_min); -static NVM_DEV_ATTR_20_RO(ws_opt); -static NVM_DEV_ATTR_20_RO(maxoc); -static NVM_DEV_ATTR_20_RO(maxocpu); -static NVM_DEV_ATTR_20_RO(mw_cunits); -static NVM_DEV_ATTR_20_RO(write_typ); -static NVM_DEV_ATTR_20_RO(write_max); -static NVM_DEV_ATTR_20_RO(reset_typ); -static NVM_DEV_ATTR_20_RO(reset_max); - -static struct attribute *nvm_dev_attrs_20[] = { - &dev_attr_version.attr, - &dev_attr_capabilities.attr, - + /* 2.0 attrs */ &dev_attr_groups.attr, &dev_attr_punits.attr, &dev_attr_chunks.attr, @@ -1258,8 +1249,6 @@ static struct attribute *nvm_dev_attrs_20[] = { &dev_attr_maxocpu.attr, &dev_attr_mw_cunits.attr,
- &dev_attr_read_typ.attr, - &dev_attr_read_max.attr, &dev_attr_write_typ.attr, &dev_attr_write_max.attr, &dev_attr_reset_typ.attr, @@ -1268,44 +1257,38 @@ static struct attribute *nvm_dev_attrs_20[] = { NULL, };
-static const struct attribute_group nvm_dev_attr_group_20 = { - .name = "lightnvm", - .attrs = nvm_dev_attrs_20, -}; - -int nvme_nvm_register_sysfs(struct nvme_ns *ns) +static umode_t nvm_dev_attrs_visible(struct kobject *kobj, + struct attribute *attr, int index) { + struct device *dev = container_of(kobj, struct device, kobj); + struct gendisk *disk = dev_to_disk(dev); + struct nvme_ns *ns = disk->private_data; struct nvm_dev *ndev = ns->ndev; - struct nvm_geo *geo = &ndev->geo; + struct device_attribute *dev_attr = + container_of(attr, typeof(*dev_attr), attr);
if (!ndev) - return -EINVAL; - - switch (geo->major_ver_id) { - case 1: - return sysfs_create_group(&disk_to_dev(ns->disk)->kobj, - &nvm_dev_attr_group_12); - case 2: - return sysfs_create_group(&disk_to_dev(ns->disk)->kobj, - &nvm_dev_attr_group_20); - } - - return -EINVAL; -} + return 0;
-void nvme_nvm_unregister_sysfs(struct nvme_ns *ns) -{ - struct nvm_dev *ndev = ns->ndev; - struct nvm_geo *geo = &ndev->geo; + if (dev_attr->show == nvm_dev_attr_show) + return attr->mode;
- switch (geo->major_ver_id) { + switch (ndev->geo.major_ver_id) { case 1: - sysfs_remove_group(&disk_to_dev(ns->disk)->kobj, - &nvm_dev_attr_group_12); + if (dev_attr->show == nvm_dev_attr_show_12) + return attr->mode; break; case 2: - sysfs_remove_group(&disk_to_dev(ns->disk)->kobj, - &nvm_dev_attr_group_20); + if (dev_attr->show == nvm_dev_attr_show_20) + return attr->mode; break; } + + return 0; } + +const struct attribute_group nvme_nvm_attr_group = { + .name = "lightnvm", + .attrs = nvm_dev_attrs, + .is_visible = nvm_dev_attrs_visible, +}; diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index e61c5eec971b..e13ff4dfa3df 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -345,11 +345,9 @@ static void nvme_mpath_set_live(struct nvme_ns *ns) return;
if (!test_and_set_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) { + WARN_ON(disk_to_dev(head->disk)->groups); + disk_to_dev(head->disk)->groups = nvme_ns_id_attr_groups; device_add_disk(&head->subsys->dev, head->disk); - if (sysfs_create_group(&disk_to_dev(head->disk)->kobj, - &nvme_ns_id_attr_group)) - dev_warn(&head->subsys->dev, - "failed to create id group.\n"); }
synchronize_srcu(&head->srcu); @@ -556,11 +554,8 @@ void nvme_mpath_remove_disk(struct nvme_ns_head *head) { if (!head->disk) return; - if (head->disk->flags & GENHD_FL_UP) { - sysfs_remove_group(&disk_to_dev(head->disk)->kobj, - &nvme_ns_id_attr_group); + if (head->disk->flags & GENHD_FL_UP) del_gendisk(head->disk); - } blk_set_queue_dying(head->disk->queue); /* make sure all pending bios are cleaned up */ kblockd_schedule_work(&head->requeue_work); diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index a2f784d9e091..4617168aa73f 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -470,7 +470,7 @@ int nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl); int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, void *log, size_t size, u64 offset);
-extern const struct attribute_group nvme_ns_id_attr_group; +extern const struct attribute_group *nvme_ns_id_attr_groups[]; extern const struct block_device_operations nvme_ns_head_ops;
#ifdef CONFIG_NVME_MULTIPATH @@ -603,8 +603,7 @@ static inline void nvme_mpath_update_disk_size(struct gendisk *disk) void nvme_nvm_update_nvm_info(struct nvme_ns *ns); int nvme_nvm_register(struct nvme_ns *ns, char *disk_name, int node); void nvme_nvm_unregister(struct nvme_ns *ns); -int nvme_nvm_register_sysfs(struct nvme_ns *ns); -void nvme_nvm_unregister_sysfs(struct nvme_ns *ns); +extern const struct attribute_group nvme_nvm_attr_group; int nvme_nvm_ioctl(struct nvme_ns *ns, unsigned int cmd, unsigned long arg); #else static inline void nvme_nvm_update_nvm_info(struct nvme_ns *ns) {}; @@ -615,11 +614,6 @@ static inline int nvme_nvm_register(struct nvme_ns *ns, char *disk_name, }
static inline void nvme_nvm_unregister(struct nvme_ns *ns) {}; -static inline int nvme_nvm_register_sysfs(struct nvme_ns *ns) -{ - return 0; -} -static inline void nvme_nvm_unregister_sysfs(struct nvme_ns *ns) {}; static inline int nvme_nvm_ioctl(struct nvme_ns *ns, unsigned int cmd, unsigned long arg) {
From: Wenchao Hao haowenchao@huawei.com
euleros/rtos inclusion category: bugfix bugzilla: NA
--------------------------------
Register default sysfs groups during device_add_disk() to avoid a race condition with udev during startup.
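As a rough illustration of the attribute-group approach used here, the following is a minimal userspace model, not kernel code. The names `model_attr`, `model_attrs` and `model_visible_mode` are hypothetical stand-ins. The sketch only shows how a single visibility callback can decide the mode of each attribute up front, so that the whole group can be handed to the driver core before the disk is added.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sysfs attribute: a name and a default mode. */
struct model_attr {
    const char *name;
    unsigned int mode;
};

static const struct model_attr model_attrs[] = {
    { "serial",     0444 },
    { "cache_type", 0644 },
};

/*
 * Models the idea of an is_visible() callback: return the mode the
 * attribute should be created with. cache_type is downgraded to
 * read-only when the modelled device lacks the write-cache-enable
 * feature (has_wce == 0); every other attribute keeps its default mode.
 */
static unsigned int model_visible_mode(const struct model_attr *a, int has_wce)
{
    if (strcmp(a->name, "cache_type") == 0 && !has_wce)
        return 0444;
    return a->mode;
}
```

Because the mode is computed per attribute at registration time, the driver no longer needs separate read-only and read-write variants of the same attribute, nor an error path for a late file-creation failure.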
Signed-off-by: Wenchao Hao haowenchao@huawei.com Reviewed-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- drivers/block/virtio_blk.c | 67 ++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 9c3dca8f5d21..3b00ee46b9d9 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -418,8 +418,8 @@ static int minor_to_index(int minor) return minor >> PART_BITS; }
-static ssize_t virtblk_serial_show(struct device *dev, - struct device_attribute *attr, char *buf) +static ssize_t serial_show(struct device *dev, + struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); int err; @@ -438,7 +438,7 @@ static ssize_t virtblk_serial_show(struct device *dev, return err; }
-static DEVICE_ATTR(serial, 0444, virtblk_serial_show, NULL); +static DEVICE_ATTR_RO(serial);
/* The queue's logical block size must be set before calling this */ static void virtblk_update_capacity(struct virtio_blk *vblk, bool resize) @@ -614,8 +614,8 @@ static const char *const virtblk_cache_types[] = { };
static ssize_t -virtblk_cache_type_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) +cache_type_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) { struct gendisk *disk = dev_to_disk(dev); struct virtio_blk *vblk = disk->private_data; @@ -633,8 +633,7 @@ virtblk_cache_type_store(struct device *dev, struct device_attribute *attr, }
static ssize_t -virtblk_cache_type_show(struct device *dev, struct device_attribute *attr, - char *buf) +cache_type_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); struct virtio_blk *vblk = disk->private_data; @@ -644,12 +643,38 @@ virtblk_cache_type_show(struct device *dev, struct device_attribute *attr, return snprintf(buf, 40, "%s\n", virtblk_cache_types[writeback]); }
-static const struct device_attribute dev_attr_cache_type_ro = - __ATTR(cache_type, 0444, - virtblk_cache_type_show, NULL); -static const struct device_attribute dev_attr_cache_type_rw = - __ATTR(cache_type, 0644, - virtblk_cache_type_show, virtblk_cache_type_store); +static DEVICE_ATTR_RW(cache_type); + +static struct attribute *virtblk_attrs[] = { + &dev_attr_serial.attr, + &dev_attr_cache_type.attr, + NULL, +}; + +static umode_t virtblk_attrs_are_visible(struct kobject *kobj, + struct attribute *a, int n) +{ + struct device *dev = container_of(kobj, struct device, kobj); + struct gendisk *disk = dev_to_disk(dev); + struct virtio_blk *vblk = disk->private_data; + struct virtio_device *vdev = vblk->vdev; + + if (a == &dev_attr_cache_type.attr && + !virtio_has_feature(vdev, VIRTIO_BLK_F_CONFIG_WCE)) + return S_IRUGO; + + return a->mode; +} + +static const struct attribute_group virtblk_attr_group = { + .attrs = virtblk_attrs, + .is_visible = virtblk_attrs_are_visible, +}; + +static const struct attribute_group *virtblk_attr_groups[] = { + &virtblk_attr_group, + NULL, +};
static int virtblk_init_request(struct blk_mq_tag_set *set, struct request *rq, unsigned int hctx_idx, unsigned int numa_node) @@ -853,24 +878,10 @@ static int virtblk_probe(struct virtio_device *vdev) virtblk_update_capacity(vblk, false); virtio_device_ready(vdev);
+ disk_to_dev(vblk->disk)->groups = virtblk_attr_groups; device_add_disk(&vdev->dev, vblk->disk); - err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial); - if (err) - goto out_del_disk; - - if (virtio_has_feature(vdev, VIRTIO_BLK_F_CONFIG_WCE)) - err = device_create_file(disk_to_dev(vblk->disk), - &dev_attr_cache_type_rw); - else - err = device_create_file(disk_to_dev(vblk->disk), - &dev_attr_cache_type_ro); - if (err) - goto out_del_disk; return 0;
-out_del_disk: - del_gendisk(vblk->disk); - blk_cleanup_queue(vblk->disk->queue); out_free_tags: blk_mq_free_tag_set(&vblk->tag_set); out_put_disk:
From: "zhangyi (F)" yi.zhang@huawei.com
hulk inclusion category: bugfix bugzilla: 48166 CVE: NA ---------------------------
block_dump is an old debugging interface; one of its functions is to dump which process writes which file on disk. If block_dump is enabled, we can turn on the debug log level and gather the writing process name and the file name from kmsg. This is done by block_dump___mark_inode_dirty(), which prints a kernel message directly when an inode is marked dirty, so it can easily trigger a log storm.
After tracepoints were introduced into the kernel, we got trace_writeback_mark_inode_dirty() in __mark_inode_dirty(), which is a better replacement for block_dump___mark_inode_dirty(). The only downside is that it traces only the inode number and not the file name, but this is not a big deal because the file name dumped by block_dump is not accurate in some cases, and we can still find the file through the inode number and device id. So this patch removes the block_dump hook from __mark_inode_dirty().
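For readers wondering how a file can be recovered from an inode number and device id, here is a minimal userspace sketch. It is an assumption about how one might do the lookup, not part of the patch. The helper name `find_name_by_inode` is hypothetical, and it scans only a single directory rather than a whole filesystem.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Search one directory (non-recursively) for an entry whose device and
 * inode numbers match. Returns 0 and fills 'out' with the path on
 * success, -1 if nothing matched or the directory could not be opened.
 * stat() follows symlinks, so a symlink to the target also matches.
 */
static int find_name_by_inode(const char *dir, dev_t dev, ino_t ino,
                              char *out, size_t outlen)
{
    DIR *d = opendir(dir);
    struct dirent *de;
    int ret = -1;

    if (!d)
        return -1;

    while ((de = readdir(d)) != NULL) {
        char path[4096];
        struct stat st;
        int n;

        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;           /* path too long, skip it */
        if (stat(path, &st) != 0)
            continue;
        /* Inode numbers are only unique per filesystem, so compare both. */
        if (st.st_dev == dev && st.st_ino == ino) {
            snprintf(out, outlen, "%s", path);
            ret = 0;
            break;
        }
    }
    closedir(d);
    return ret;
}
```

A full search would recurse into subdirectories; that is left out here to keep the sketch short.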
Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: Ye bin yebin10@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/fs-writeback.c | 25 ------------------------- 1 file changed, 25 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index a9c7522e367c..73b4047c996c 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -2117,28 +2117,6 @@ int dirtytime_interval_handler(struct ctl_table *table, int write, return ret; }
-static noinline void block_dump___mark_inode_dirty(struct inode *inode) -{ - if (inode->i_ino || strcmp(inode->i_sb->s_id, "bdev")) { - struct dentry *dentry; - const char *name = "?"; - - dentry = d_find_alias(inode); - if (dentry) { - spin_lock(&dentry->d_lock); - name = (const char *) dentry->d_name.name; - } - printk(KERN_DEBUG - "%s(%d): dirtied inode %lu (%s) on %s\n", - current->comm, task_pid_nr(current), inode->i_ino, - name, inode->i_sb->s_id); - if (dentry) { - spin_unlock(&dentry->d_lock); - dput(dentry); - } - } -} - /** * __mark_inode_dirty - internal function * @@ -2198,9 +2176,6 @@ void __mark_inode_dirty(struct inode *inode, int flags) (dirtytime && (inode->i_state & I_DIRTY_INODE))) return;
- if (unlikely(block_dump)) - block_dump___mark_inode_dirty(inode); - spin_lock(&inode->i_lock); if (dirtytime && (inode->i_state & I_DIRTY_INODE)) goto out_unlock_inode;
From: "Darrick J. Wong" darrick.wong@oracle.com
mainline inclusion from mainline-5.4-rc1 commit 1638045c36772b47a0765f7dca07cb90267e4942 category: bugfix bugzilla: 50612 CVE: NA ---------------------------
Set S_SWAPFILE on block device inodes so that they have the same protections as a swap file.
Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: yangerkun yangerkun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- mm/swapfile.c | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c index 13171e764c56..06df64e59d9c 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -2474,9 +2474,8 @@ add_swap_extent(struct swap_info_struct *sis, unsigned long start_page, * requirements, they are simply tossed out - we will never use those blocks * for swapping. * - * For S_ISREG swapfiles we set S_SWAPFILE across the life of the swapon. This - * prevents root from shooting her foot off by ftruncating an in-use swapfile, - * which will scribble on the fs. + * For all swap devices we set S_SWAPFILE across the life of the swapon. This + * prevents users from writing to the swap device, which will corrupt memory. * * The amount of disk space which a single swap extent represents varies. * Typically it is in the 1-4 megabyte range. So we can have hundreds of @@ -2767,13 +2766,14 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile) inode = mapping->host; if (S_ISBLK(inode->i_mode)) { struct block_device *bdev = I_BDEV(inode); + set_blocksize(bdev, old_block_size); blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL); - } else { - inode_lock(inode); - inode->i_flags &= ~S_SWAPFILE; - inode_unlock(inode); } + + inode_lock(inode); + inode->i_flags &= ~S_SWAPFILE; + inode_unlock(inode); filp_close(swap_file, NULL);
/* @@ -2999,11 +2999,11 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode) p->flags |= SWP_BLKDEV; } else if (S_ISREG(inode->i_mode)) { p->bdev = inode->i_sb->s_bdev; - inode_lock(inode); - if (IS_SWAPFILE(inode)) - return -EBUSY; - } else - return -EINVAL; + } + + inode_lock(inode); + if (IS_SWAPFILE(inode)) + return -EBUSY;
return 0; } @@ -3404,8 +3404,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) atomic_inc(&proc_poll_event); wake_up_interruptible(&proc_poll_wait);
- if (S_ISREG(inode->i_mode)) - inode->i_flags |= S_SWAPFILE; + inode->i_flags |= S_SWAPFILE; error = 0; goto out; bad_swap: @@ -3427,7 +3426,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) if (inced_nr_rotate_swap) atomic_dec(&nr_rotate_swap); if (swap_file) { - if (inode && S_ISREG(inode->i_mode)) { + if (inode) { inode_unlock(inode); inode = NULL; } @@ -3440,7 +3439,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) } if (name) putname(name); - if (inode && S_ISREG(inode->i_mode)) + if (inode) inode_unlock(inode); if (!error) enable_swap_slots_cache();
From: "Darrick J. Wong" darrick.wong@oracle.com
mainline inclusion from mainline-5.4-rc1 commit dc617f29dbe5ef0c8ced65ce62c464af1daaab3d category: bugfix bugzilla: 50612 CVE: NA ---------------------------
Don't let userspace write to an active swap file because the kernel effectively has a long term lease on the storage and things could get seriously corrupted if we let this happen.
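The checks this patch adds can be pictured with a small userspace model. This is a sketch under stated assumptions: `model_inode`, `MODEL_S_SWAPFILE`, `model_write_checks` and `model_mmap_shared` are hypothetical names, and the flag value is arbitrary rather than the kernel's. It only mirrors the order of the checks visible in the diff.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_S_SWAPFILE 0x1u  /* hypothetical flag bit, not the kernel's value */

struct model_inode {
    unsigned int flags;
};

/*
 * Modelled write path: refuse the write while the inode is active swap.
 * As in the diff, the swap check comes before the zero-length shortcut,
 * so even an empty write is rejected.
 */
static long model_write_checks(const struct model_inode *inode, size_t count)
{
    if (inode->flags & MODEL_S_SWAPFILE)
        return -ETXTBSY;
    if (count == 0)
        return 0;
    return (long)count;
}

/*
 * Modelled shared mmap: a writable mapping needs a file opened for
 * writing and must not target an active swap file.
 */
static int model_mmap_shared(const struct model_inode *inode,
                             int want_write, int opened_for_write)
{
    if (want_write) {
        if (!opened_for_write)
            return -EACCES;
        if (inode->flags & MODEL_S_SWAPFILE)
            return -ETXTBSY;
    }
    return 0;
}
```

Read-only mappings and reads are unaffected in this model, which matches the intent of blocking only modifications to storage the kernel is using for swap.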
Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de
Conflict: include/linux/fs.h mm/filemap.c Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/block_dev.c | 3 +++ include/linux/fs.h | 11 +++++++++++ mm/filemap.c | 3 +++ mm/memory.c | 4 ++++ mm/mmap.c | 8 ++++++-- mm/swapfile.c | 12 +++++++++++- 6 files changed, 38 insertions(+), 3 deletions(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c index 2db79b6d5e6b..5f58e1a604a0 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -2007,6 +2007,9 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from) if (bdev_read_only(I_BDEV(bd_inode))) return -EPERM;
+ if (IS_SWAPFILE(bd_inode)) + return -ETXTBSY; + if (!iov_iter_count(from)) return 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h index 787c8cd420a0..118021c316da 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3550,4 +3550,15 @@ extern int vfs_fadvise(struct file *file, loff_t offset, loff_t len, extern int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice);
+/* + * Flush file data before changing attributes. Caller must hold any locks + * required to prevent further writes to this file until we're done setting + * flags. + */ +static inline int inode_drain_writes(struct inode *inode) +{ + inode_dio_wait(inode); + return filemap_write_and_wait(inode->i_mapping); +} + #endif /* _LINUX_FS_H */ diff --git a/mm/filemap.c b/mm/filemap.c index 52e888f8de49..b4f919d487d1 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3052,6 +3052,9 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from) unsigned long limit = rlimit(RLIMIT_FSIZE); loff_t pos;
+ if (IS_SWAPFILE(inode)) + return -ETXTBSY; + if (!iov_iter_count(from)) return 0;
diff --git a/mm/memory.c b/mm/memory.c index d146d4231686..0ff363795cbd 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2226,6 +2226,10 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
+ if (vmf->vma->vm_file && + IS_SWAPFILE(vmf->vma->vm_file->f_mapping->host)) + return VM_FAULT_SIGBUS; + ret = vmf->vma->vm_ops->page_mkwrite(vmf); /* Restore original flags so that caller is not surprised */ vmf->flags = old_flags; diff --git a/mm/mmap.c b/mm/mmap.c index 04e34c022775..3fcfed26d298 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1486,8 +1486,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr, case MAP_SHARED_VALIDATE: if (flags & ~flags_mask) return -EOPNOTSUPP; - if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE)) - return -EACCES; + if (prot & PROT_WRITE) { + if (!(file->f_mode & FMODE_WRITE)) + return -EACCES; + if (IS_SWAPFILE(file->f_mapping->host)) + return -ETXTBSY; + }
/* * Make sure we don't allow writing to an append-only diff --git a/mm/swapfile.c b/mm/swapfile.c index 06df64e59d9c..c03de4f1ee77 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3384,6 +3384,17 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) if (error) goto bad_swap;
+ /* + * Flush any pending IO and dirty mappings before we start using this + * swap device. + */ + inode->i_flags |= S_SWAPFILE; + error = inode_drain_writes(inode); + if (error) { + inode->i_flags &= ~S_SWAPFILE; + goto bad_swap; + } + mutex_lock(&swapon_mutex); prio = -1; if (swap_flags & SWAP_FLAG_PREFER) @@ -3404,7 +3415,6 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) atomic_inc(&proc_poll_event); wake_up_interruptible(&proc_poll_wait);
- inode->i_flags |= S_SWAPFILE; error = 0; goto out; bad_swap:
From: Christoph Hellwig hch@lst.de
mainline inclusion from mainline-5.6-rc3 commit fed98ef4d8b665316479dd35cbd92d3e2ff470a3 category: bugfix bugzilla: 50612 CVE: NA ---------------------------
claim_swapfile now always takes i_rwsem.
Link: http://lkml.kernel.org/r/20200114161225.309792-2-hch@lst.de Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- mm/swapfile.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c index c03de4f1ee77..074a724df169 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3259,7 +3259,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) mapping = swap_file->f_mapping; inode = mapping->host;
- /* If S_ISREG(inode->i_mode) will do inode_lock(inode); */ + /* will take i_rwsem; */ error = claim_swapfile(p, inode); if (unlikely(error)) goto bad_swap;
From: Naohiro Aota naohiro.aota@wdc.com
mainline inclusion from mainline-5.6 commit d795a90e2ba024dbf2f22107ae89c210b98b08b8 category: bugfix bugzilla: 50612 CVE: NA ---------------------------
claim_swapfile() currently keeps the inode locked when it succeeds or when the file is already a swapfile (returning -EBUSY). In the other error cases it does not lock the inode.
This inconsistency between the lock state and the return value is confusing, and it actually causes the bad unlock balance shown below in the "bad_swap" section of __do_sys_swapon().
This commit fixes the issue by moving the inode_lock() and the IS_SWAPFILE check out of claim_swapfile(). The inode is unlocked in the "bad_swap_unlock_inode" section, so the inode is guaranteed to be unlocked by the time "bad_swap" is reached. Thus, error handling code after the locking now jumps to "bad_swap_unlock_inode" instead of "bad_swap".
=====================================
WARNING: bad unlock balance detected!
5.5.0-rc7+ #176 Not tainted
-------------------------------------
swapon/4294 is trying to release lock (&sb->s_type->i_mutex_key) at:
__do_sys_swapon+0x94b/0x3550
but there are no more locks to release!

other info that might help us debug this:
no locks held by swapon/4294.

stack backtrace:
CPU: 5 PID: 4294 Comm: swapon Not tainted 5.5.0-rc7-BTRFS-ZNS+ #176
Hardware name: ASUS All Series/H87-PRO, BIOS 2102 07/29/2014
Call Trace:
 dump_stack+0xa1/0xea
 print_unlock_imbalance_bug.cold+0x114/0x123
 lock_release+0x562/0xed0
 up_write+0x2d/0x490
 __do_sys_swapon+0x94b/0x3550
 __x64_sys_swapon+0x54/0x80
 do_syscall_64+0xa4/0x4b0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f15da0a0dc7
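The label layout described in this commit message can be sketched in plain C. This is a simplified userspace model, not the kernel code: `model_lock`, `model_swapon` and the `fail_at` parameter are hypothetical, and unlike the real swapon path the model drops the lock directly on success. The point is only that each error label undoes exactly what has been done so far.

```c
#include <errno.h>

/* Hypothetical lock that records misuse instead of crashing. */
struct model_lock {
    int depth;      /* how many times the lock is currently held */
    int underflow;  /* set if release is called while not held */
};

static void model_lock_acquire(struct model_lock *l)
{
    l->depth++;
}

static void model_lock_release(struct model_lock *l)
{
    if (l->depth == 0)
        l->underflow = 1;   /* the "bad unlock balance" case */
    else
        l->depth--;
}

/*
 * fail_at selects where the modelled setup fails:
 * 0 = no failure, 1 = before the lock is taken, 2 = after it is taken.
 */
static int model_swapon(struct model_lock *l, int fail_at)
{
    int error;

    if (fail_at == 1) {
        error = -EINVAL;
        goto bad;           /* lock not held yet: skip the unlock */
    }

    model_lock_acquire(l);
    if (fail_at == 2) {
        error = -EBUSY;
        goto bad_unlock;    /* lock held: release it first */
    }

    model_lock_release(l);
    return 0;

bad_unlock:
    model_lock_release(l);
bad:
    return error;
}
```

With the labels stacked in this order, a jump to the later label can never release a lock that was not taken, which is the property the lockdep report showed being violated.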
Fixes: 1638045c3677 ("mm: set S_SWAPFILE on blockdev swap devices") Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Tested-by: Qais Youef qais.yousef@arm.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Darrick J. Wong darrick.wong@oracle.com Cc: Christoph Hellwig hch@infradead.org Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20200206090132.154869-1-naohiro.aota@wdc.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- mm/swapfile.c | 41 ++++++++++++++++++++--------------------- 1 file changed, 20 insertions(+), 21 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c index 074a724df169..c2a672301410 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3001,10 +3001,6 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode) p->bdev = inode->i_sb->s_bdev; }
- inode_lock(inode); - if (IS_SWAPFILE(inode)) - return -EBUSY; - return 0; }
@@ -3259,36 +3255,41 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) mapping = swap_file->f_mapping; inode = mapping->host;
- /* will take i_rwsem; */ error = claim_swapfile(p, inode); if (unlikely(error)) goto bad_swap;
+ inode_lock(inode); + if (IS_SWAPFILE(inode)) { + error = -EBUSY; + goto bad_swap_unlock_inode; + } + /* * Read the swap header. */ if (!mapping->a_ops->readpage) { error = -EINVAL; - goto bad_swap; + goto bad_swap_unlock_inode; } page = read_mapping_page(mapping, 0, swap_file); if (IS_ERR(page)) { error = PTR_ERR(page); - goto bad_swap; + goto bad_swap_unlock_inode; } swap_header = kmap(page);
maxpages = read_swap_header(p, swap_header, inode); if (unlikely(!maxpages)) { error = -EINVAL; - goto bad_swap; + goto bad_swap_unlock_inode; }
/* OK, set up the swap map and apply the bad block list */ swap_map = vzalloc(maxpages); if (!swap_map) { error = -ENOMEM; - goto bad_swap; + goto bad_swap_unlock_inode; }
if (bdi_cap_stable_pages_required(inode_to_bdi(inode))) @@ -3313,7 +3314,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) GFP_KERNEL); if (!cluster_info) { error = -ENOMEM; - goto bad_swap; + goto bad_swap_unlock_inode; }
for (ci = 0; ci < nr_cluster; ci++) @@ -3322,7 +3323,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) p->percpu_cluster = alloc_percpu(struct percpu_cluster); if (!p->percpu_cluster) { error = -ENOMEM; - goto bad_swap; + goto bad_swap_unlock_inode; } for_each_possible_cpu(cpu) { struct percpu_cluster *cluster; @@ -3336,13 +3337,13 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
error = swap_cgroup_swapon(p->type, maxpages); if (error) - goto bad_swap; + goto bad_swap_unlock_inode;
nr_extents = setup_swap_map_and_extents(p, swap_header, swap_map, cluster_info, maxpages, &span); if (unlikely(nr_extents < 0)) { error = nr_extents; - goto bad_swap; + goto bad_swap_unlock_inode; } /* frontswap enabled? set up bit-per-page map for frontswap */ if (IS_ENABLED(CONFIG_FRONTSWAP)) @@ -3382,7 +3383,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
error = init_swap_address_space(p->type, maxpages); if (error) - goto bad_swap; + goto bad_swap_unlock_inode;
/* * Flush any pending IO and dirty mappings before we start using this @@ -3392,7 +3393,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) error = inode_drain_writes(inode); if (error) { inode->i_flags &= ~S_SWAPFILE; - goto bad_swap; + goto bad_swap_unlock_inode; }
mutex_lock(&swapon_mutex); @@ -3417,6 +3418,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
error = 0; goto out; +bad_swap_unlock_inode: + inode_unlock(inode); bad_swap: free_percpu(p->percpu_cluster); p->percpu_cluster = NULL; @@ -3424,6 +3427,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) set_blocksize(p->bdev, p->old_block_size); blkdev_put(p->bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL); } + inode = NULL; destroy_swap_extents(p); swap_cgroup_swapoff(p->type); spin_lock(&swap_lock); @@ -3435,13 +3439,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) kvfree(frontswap_map); if (inced_nr_rotate_swap) atomic_dec(&nr_rotate_swap); - if (swap_file) { - if (inode) { - inode_unlock(inode); - inode = NULL; - } + if (swap_file) filp_close(swap_file, NULL); - } out: if (page && !IS_ERR(page)) { kunmap(page);
From: Domenico Andreoli domenico.andreoli@linux.com
mainline inclusion from mainline-5.7-rc1 commit 56939e014a6c212b317414faa307029e2e80c3b9 category: bugfix bugzilla: 50612 CVE: NA ---------------------------
It turns out that there is one use case for programs being able to write to swap devices, and that is the userspace hibernation code.
Quick fix: disable the S_SWAPFILE check if hibernation is configured.
Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files") Reported-by: Domenico Andreoli domenico.andreoli@linux.com Reported-by: Marian Klein mkleinsoft@gmail.com Signed-off-by: Domenico Andreoli domenico.andreoli@linux.com Reviewed-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/block_dev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c index 5f58e1a604a0..a90bfc36c6da 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -34,6 +34,7 @@ #include <linux/task_io_accounting_ops.h> #include <linux/falloc.h> #include <linux/uaccess.h> +#include <linux/suspend.h> #include "internal.h"
struct bdev_inode { @@ -2007,7 +2008,8 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from) if (bdev_read_only(I_BDEV(bd_inode))) return -EPERM;
- if (IS_SWAPFILE(bd_inode)) + /* uswsusp needs write permission to the swap */ + if (IS_SWAPFILE(bd_inode) && !hibernation_available()) return -ETXTBSY;
if (!iov_iter_count(from))
From: Miaohe Lin linmiaohe@huawei.com
mainline inclusion from mainline-5.10-rc1 commit 822bca52ee7eb279acfba261a423ed7ac47d6f73 category: bugfix bugzilla: 50612 CVE: NA ---------------------------
If draining the inode fails, the swap address space allocated by init_swap_address_space() above is never freed.
Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Darrick J. Wong darrick.wong@oracle.com Link: https://lkml.kernel.org/r/20200930101803.53884-1-linmiaohe@huawei.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: zhangyi (F) yi.zhang@huawei.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- mm/swapfile.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c index c2a672301410..c54b0afd8c87 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3393,7 +3393,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) error = inode_drain_writes(inode); if (error) { inode->i_flags &= ~S_SWAPFILE; - goto bad_swap_unlock_inode; + goto free_swap_address_space; }
mutex_lock(&swapon_mutex); @@ -3418,6 +3418,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
error = 0; goto out; +free_swap_address_space: + exit_swap_address_space(p->type); bad_swap_unlock_inode: inode_unlock(inode); bad_swap:
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.8-rc1 commit 130fdbc3d1f9966dd4230709c30f3768bccd3065 category: bugfix bugzilla: NA CVE: CVE-2020-16120
--------------------------------
The three instances of ovl_path_open() in overlayfs/readdir.c do three different things:
- pass f_flags from overlay file
- pass O_RDONLY | O_DIRECTORY
- pass just O_RDONLY
The value of f_flags can be (other than O_RDONLY):
O_WRONLY    - not possible for a directory
O_RDWR      - not possible for a directory
O_CREAT     - masked out by dentry_open()
O_EXCL      - masked out by dentry_open()
O_NOCTTY    - masked out by dentry_open()
O_TRUNC     - masked out by dentry_open()
O_APPEND    - no effect on directory ops
O_NDELAY    - no effect on directory ops
O_NONBLOCK  - no effect on directory ops
__O_SYNC    - no effect on directory ops
O_DSYNC     - no effect on directory ops
FASYNC      - no effect on directory ops
O_DIRECT    - no effect on directory ops
O_LARGEFILE - ?
O_DIRECTORY - only affects lookup
O_NOFOLLOW  - only affects lookup
O_NOATIME   - overlay sets this unconditionally in ovl_path_open()
O_CLOEXEC   - only affects fd allocation
O_PATH      - no effect on directory ops
__O_TMPFILE - not possible for a directory
For non-merge directories we use the underlying filesystem's iterate; in this case honor O_LARGEFILE from the original file to make sure that open doesn't get rejected.
For merge directories it's safe to pass O_LARGEFILE unconditionally since userspace will only see the artificial offsets created by overlayfs.
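The two flag policies described above can be written down as a small model. This is a sketch only: the `MODEL_O_*` constants are hypothetical values chosen for illustration (userspace O_LARGEFILE may be defined as 0 on 64-bit systems, which would hide the effect), and the two helper names do not exist in overlayfs.

```c
/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define MODEL_O_RDONLY    0x00000u
#define MODEL_O_APPEND    0x00400u
#define MODEL_O_LARGEFILE 0x08000u
#define MODEL_O_DIRECTORY 0x10000u

/*
 * Non-merge directory: the real directory is iterated directly, so only
 * O_LARGEFILE is carried over from the overlay file's flags. Everything
 * else is dropped in favour of plain read-only access.
 */
static unsigned int model_real_dir_flags(unsigned int overlay_f_flags)
{
    return MODEL_O_RDONLY | (overlay_f_flags & MODEL_O_LARGEFILE);
}

/*
 * Merge directory: userspace only sees offsets made up by overlayfs,
 * so O_LARGEFILE can be passed unconditionally.
 */
static unsigned int model_merge_dir_flags(void)
{
    return MODEL_O_RDONLY | MODEL_O_LARGEFILE;
}
```

Masking rather than forwarding the caller's flags means that flags with no meaning for directory operations never reach the underlying filesystem.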
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/overlayfs/readdir.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c index ae99b90a8b98..b98df843ac96 100644 --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -300,7 +300,7 @@ static inline int ovl_dir_read(struct path *realpath, struct file *realfile; int err;
- realfile = ovl_path_open(realpath, O_RDONLY | O_DIRECTORY); + realfile = ovl_path_open(realpath, O_RDONLY | O_LARGEFILE); if (IS_ERR(realfile)) return PTR_ERR(realfile);
@@ -823,6 +823,12 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin) return res; }
+static struct file *ovl_dir_open_realfile(struct file *file, + struct path *realpath) +{ + return ovl_path_open(realpath, O_RDONLY | (file->f_flags & O_LARGEFILE)); +} + static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end, int datasync) { @@ -845,7 +851,7 @@ static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end, struct path upperpath;
ovl_path_upper(dentry, &upperpath); - realfile = ovl_path_open(&upperpath, O_RDONLY); + realfile = ovl_dir_open_realfile(file, &upperpath);
inode_lock(inode); if (!od->upperfile) { @@ -896,7 +902,7 @@ static int ovl_dir_open(struct inode *inode, struct file *file) return -ENOMEM;
type = ovl_path_real(file->f_path.dentry, &realpath); - realfile = ovl_path_open(&realpath, file->f_flags); + realfile = ovl_dir_open_realfile(file, &realpath); if (IS_ERR(realfile)) { kfree(od); return PTR_ERR(realfile);
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.8-rc1 commit 48bd024b8a40d73ad6b086de2615738da0c7004f category: bugfix bugzilla: NA CVE: CVE-2020-16120
--------------------------------
In preparation for more permission checking, override credentials for directory operations on the underlying filesystems.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/overlayfs/readdir.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c index b98df843ac96..75a9a04eb56a 100644 --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -735,8 +735,10 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx) struct ovl_dir_file *od = file->private_data; struct dentry *dentry = file->f_path.dentry; struct ovl_cache_entry *p; + const struct cred *old_cred; int err;
+ old_cred = ovl_override_creds(dentry->d_sb); if (!ctx->pos) ovl_dir_reset(file);
@@ -750,17 +752,20 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx) (ovl_same_fs(dentry->d_sb) && (ovl_is_impure_dir(file) || OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent))))) { - return ovl_iterate_real(file, ctx); + err = ovl_iterate_real(file, ctx); + } else { + err = iterate_dir(od->realfile, ctx); } - return iterate_dir(od->realfile, ctx); + goto out; }
if (!od->cache) { struct ovl_dir_cache *cache;
cache = ovl_cache_get(dentry); + err = PTR_ERR(cache); if (IS_ERR(cache)) - return PTR_ERR(cache); + goto out;
od->cache = cache; ovl_seek_cursor(od, ctx->pos); @@ -772,7 +777,7 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx) if (!p->ino) { err = ovl_cache_update_ino(&file->f_path, p); if (err) - return err; + goto out; } if (!dir_emit(ctx, p->name, p->len, p->ino, p->type)) break; @@ -780,7 +785,10 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx) od->cursor = p->l_node.next; ctx->pos++; } - return 0; + err = 0; +out: + revert_creds(old_cred); + return err; }
static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin) @@ -826,7 +834,14 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin) static struct file *ovl_dir_open_realfile(struct file *file, struct path *realpath) { - return ovl_path_open(realpath, O_RDONLY | (file->f_flags & O_LARGEFILE)); + struct file *res; + const struct cred *old_cred; + + old_cred = ovl_override_creds(file_inode(file)->i_sb); + res = ovl_path_open(realpath, O_RDONLY | (file->f_flags & O_LARGEFILE)); + revert_creds(old_cred); + + return res; }
static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end,
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.8-rc1 commit 56230d956739b9cb1cbde439d76227d77979a04d category: bugfix bugzilla: NA CVE: CVE-2020-16120
--------------------------------
Check permission before opening a real file.
ovl_path_open() is used by readdir and copy-up routines.
ovl_permission() theoretically already checked copy up permissions, but it doesn't hurt to re-do these checks during the actual copy-up.
For directory reading, ovl_permission() only checks access to the topmost underlying layer. Readdir on a merged directory accesses layers below the topmost one as well. Permission wasn't checked for these layers.
Note: modifying ovl_permission() to perform this check would be far more complex and hence more bug prone. The result is less precise permissions returned in access(2). If this turns out to be an issue, we can revisit this bug.
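The mapping from open flags to the permission mask that gets checked can be modelled as follows. This is a userspace sketch, not the patched function: all `MODEL_*` constants and `model_open_acc_mode` are hypothetical, and where the patch calls BUG() on unexpected flags the model returns -EINVAL so that it can be exercised safely.

```c
#include <errno.h>

/* Hypothetical values for illustration; not the kernel's definitions. */
#define MODEL_O_RDONLY    0x0000u
#define MODEL_O_WRONLY    0x0001u
#define MODEL_O_RDWR      0x0002u
#define MODEL_O_ACCMODE   0x0003u
#define MODEL_O_LARGEFILE 0x8000u

#define MODEL_MAY_WRITE 0x02
#define MODEL_MAY_READ  0x04
#define MODEL_MAY_OPEN  0x20

/*
 * Translate open flags into the permission mask to check before opening.
 * Only an access mode plus O_LARGEFILE is accepted, and only read-only
 * or write-only access modes are mapped. Anything else is rejected
 * (the patch treats those cases as a kernel bug instead).
 */
static int model_open_acc_mode(unsigned int flags)
{
    if (flags & ~(MODEL_O_ACCMODE | MODEL_O_LARGEFILE))
        return -EINVAL;

    switch (flags & MODEL_O_ACCMODE) {
    case MODEL_O_RDONLY:
        return MODEL_MAY_READ | MODEL_MAY_OPEN;
    case MODEL_O_WRONLY:
        return MODEL_MAY_WRITE | MODEL_MAY_OPEN;
    default:
        return -EINVAL;
    }
}
```

The resulting mask is what would be handed to the permission check, so each real-file open is verified against the layer it actually touches.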
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/overlayfs/util.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c index d0570ac2788b..eb9411461b69 100644 --- a/fs/overlayfs/util.c +++ b/fs/overlayfs/util.c @@ -466,7 +466,32 @@ bool ovl_is_whiteout(struct dentry *dentry)
struct file *ovl_path_open(struct path *path, int flags) { - return dentry_open(path, flags | O_NOATIME, current_cred()); + struct inode *inode = d_inode(path->dentry); + int err, acc_mode; + + if (flags & ~(O_ACCMODE | O_LARGEFILE)) + BUG(); + + switch (flags & O_ACCMODE) { + case O_RDONLY: + acc_mode = MAY_READ; + break; + case O_WRONLY: + acc_mode = MAY_WRITE; + break; + default: + BUG(); + } + + err = inode_permission(inode, acc_mode | MAY_OPEN); + if (err) + return ERR_PTR(err); + + /* O_NOATIME is an optimization, don't fail if not permitted */ + if (inode_owner_or_capable(inode)) + flags |= O_NOATIME; + + return dentry_open(path, flags, current_cred()); }
/* Caller should hold ovl_inode->lock */
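The access-mode mapping and the optional O_NOATIME handling in ovl_path_open() above can be sketched in userspace as follows. The SKETCH_* constants and both function names are assumptions made for illustration. The sketch returns -EINVAL where the patch calls BUG(), and it does not reproduce the patch's check for flags outside O_ACCMODE | O_LARGEFILE.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical mask bits, named after the kernel's MAY_* flags. */
#define SKETCH_MAY_WRITE	0x02
#define SKETCH_MAY_READ		0x04
#define SKETCH_MAY_OPEN		0x20

/* Hypothetical stand-in for O_NOATIME, which needs _GNU_SOURCE in userspace. */
#define SKETCH_O_NOATIME	0x40000

/*
 * Map the open(2) access mode in flags to a permission mask.
 * Returns the mask, or -EINVAL for access modes the sketch does not
 * handle (the patch uses BUG() in that case).
 */
int sketch_open_mask(int flags)
{
	switch (flags & O_ACCMODE) {
	case O_RDONLY:
		return SKETCH_MAY_READ | SKETCH_MAY_OPEN;
	case O_WRONLY:
		return SKETCH_MAY_WRITE | SKETCH_MAY_OPEN;
	default:
		return -EINVAL;
	}
}

/*
 * Add an optional flag only when the caller is permitted to use it,
 * instead of failing the whole open when it is not.
 */
int sketch_add_optional(int flags, int optional_flag, int permitted)
{
	assert(optional_flag != 0);
	return permitted ? (flags | optional_flag) : flags;
}
```

Treating the flag as an optimization means a caller without the required privilege still gets a usable file, only without the access-time suppression.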
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.8-rc1 commit 292f902a40c11f043a5ca1305a114da0e523eaa3 category: bugfix bugzilla: NA CVE: CVE-2020-16120
--------------------------------
Verify LSM permissions for underlying file, since vfs_ioctl() doesn't do it.
[Stephen Rothwell] export security_file_ioctl
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Conflicts: fs/overlayfs/file.c [yyl: adjust context] Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/overlayfs/file.c | 5 ++++- security/security.c | 1 + 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 83cc52871307..fb5595e680d1 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -12,6 +12,7 @@ #include <linux/xattr.h> #include <linux/uio.h> #include <linux/uaccess.h> +#include <linux/security.h> #include "overlayfs.h"
static char ovl_whatisit(struct inode *inode, struct inode *realinode) @@ -403,7 +404,9 @@ static long ovl_real_ioctl(struct file *file, unsigned int cmd, return ret;
old_cred = ovl_override_creds(file_inode(file)->i_sb); - ret = vfs_ioctl(real.file, cmd, arg); + ret = security_file_ioctl(real.file, cmd, arg); + if (!ret) + ret = vfs_ioctl(real.file, cmd, arg); revert_creds(old_cred);
fdput(real); diff --git a/security/security.c b/security/security.c index 5ce2448f3a45..9e4d6c999c79 100644 --- a/security/security.c +++ b/security/security.c @@ -893,6 +893,7 @@ int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { return call_int_hook(file_ioctl, 0, file, cmd, arg); } +EXPORT_SYMBOL_GPL(security_file_ioctl);
static inline unsigned long mmap_prot(struct file *file, unsigned long prot) {
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.8-rc1 commit 05acefb4872dae89e772729efb194af754c877e8 category: bugfix bugzilla: NA CVE: CVE-2020-16120
--------------------------------
Call inode_permission() on real inode before opening regular file on one of the underlying layers.
In some cases ovl_permission() already checks access to an underlying file, but it misses the metacopy case, and possibly other ones as well.
Removing the redundant permission check from ovl_permission() should be considered later.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Conflicts: fs/overlayfs/file.c [yyl: adjust context] Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/overlayfs/file.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index fb5595e680d1..f464a23c95af 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -35,10 +35,22 @@ static struct file *ovl_open_realfile(const struct file *file, struct file *realfile; const struct cred *old_cred; int flags = file->f_flags | OVL_OPEN_FLAGS; + int acc_mode = ACC_MODE(flags); + int err; + + if (flags & O_APPEND) + acc_mode |= MAY_APPEND;
old_cred = ovl_override_creds(inode->i_sb); - realfile = open_with_fake_path(&file->f_path, flags, realinode, - current_cred()); + err = inode_permission(realinode, MAY_OPEN | acc_mode); + if (err) { + realfile = ERR_PTR(err); + } else if (!inode_owner_or_capable(realinode)) { + realfile = ERR_PTR(-EPERM); + } else { + realfile = open_with_fake_path(&file->f_path, flags, realinode, + current_cred()); + } revert_creds(old_cred);
pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
From: Miklos Szeredi mszeredi@redhat.com
mainline inclusion from mainline-v5.11-rc1 commit b6650dab404c701d7fe08a108b746542a934da84 category: bugfix bugzilla: NA CVE: CVE-2020-16120
--------------------------------
In case the file cannot be opened with O_NOATIME because of lack of capabilities, then clear O_NOATIME instead of failing.
Remove WARN_ON(), since it would now trigger if O_NOATIME was cleared. Noticed by Amir Goldstein.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/overlayfs/file.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index f464a23c95af..cd1c94f77dee 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -45,9 +45,10 @@ static struct file *ovl_open_realfile(const struct file *file, err = inode_permission(realinode, MAY_OPEN | acc_mode); if (err) { realfile = ERR_PTR(err); - } else if (!inode_owner_or_capable(realinode)) { - realfile = ERR_PTR(-EPERM); } else { + if (!inode_owner_or_capable(realinode)) + flags &= ~O_NOATIME; + realfile = open_with_fake_path(&file->f_path, flags, realinode, current_cred()); } @@ -67,12 +68,6 @@ static int ovl_change_flags(struct file *file, unsigned int flags) struct inode *inode = file_inode(file); int err;
- flags |= OVL_OPEN_FLAGS; - - /* If some flag changed that cannot be changed then something's amiss */ - if (WARN_ON((file->f_flags ^ flags) & ~OVL_SETFL_MASK)) - return -EIO; - flags &= OVL_SETFL_MASK;
if (((flags ^ file->f_flags) & O_APPEND) && IS_APPEND(inode))
From: Lee Duncan lduncan@suse.com
stable inclusion from linux-4.19.179 commit ae84b246a76c4ace5997e5ca7e9fde3e1a526bc3 CVE: CVE-2021-27364/CVE-2021-27363
--------------------------------
commit 688e8128b7a92df982709a4137ea4588d16f24aa upstream.
Protect the iSCSI transport handle, available in sysfs, by requiring CAP_SYS_ADMIN to read it. Also protect the netlink socket by restricting reception of messages to ones sent with CAP_SYS_ADMIN. This prevents normal users from ending arbitrary iSCSI sessions.
Cc: stable@vger.kernel.org Reported-by: Adam Nichols adam@grimm-co.com Reviewed-by: Chris Leech cleech@redhat.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Lee Duncan lduncan@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Reviewed-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- drivers/scsi/scsi_transport_iscsi.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 698347301198..174bab398202 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -119,6 +119,9 @@ show_transport_handle(struct device *dev, struct device_attribute *attr, char *buf) { struct iscsi_internal *priv = dev_to_iscsi_internal(dev); + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; return sprintf(buf, "%llu\n", (unsigned long long)iscsi_handle(priv->iscsi_transport)); } static DEVICE_ATTR(handle, S_IRUGO, show_transport_handle, NULL); @@ -3504,6 +3507,9 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group) struct iscsi_cls_conn *conn; struct iscsi_endpoint *ep = NULL;
+ if (!netlink_capable(skb, CAP_SYS_ADMIN)) + return -EPERM; + if (nlh->nlmsg_type == ISCSI_UEVENT_PATH_UPDATE) *group = ISCSI_NL_GRP_UIP; else
From: Joe Perches joe@perches.com
stable inclusion from linux-4.19.179 commit cb1f69d53ac8a417fc42df013526b54735194c14 CVE: CVE-2021-27365
Prepare for CVE-2021-27365 --------------------------------
commit 2efc459d06f1630001e3984854848a5647086232 upstream.
Output defects can exist in sysfs content using sprintf and snprintf.
sprintf does not know the PAGE_SIZE maximum of the temporary buffer used for outputting sysfs content and it's possible to overrun the PAGE_SIZE buffer length.
Add a generic sysfs_emit function that knows the size of the temporary buffer and ensures that no overrun is done.
Add a generic sysfs_emit_at function that can be used in multiple call situations that also ensures that no overrun is done.
Validate the output buffer argument to be page aligned. Validate the offset len argument to be within the PAGE_SIZE buf.
Signed-off-by: Joe Perches joe@perches.com Link: https://lore.kernel.org/r/884235202216d464d61ee975f7465332c86f76b2.160028592... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- Documentation/filesystems/sysfs.txt | 8 ++--- fs/sysfs/file.c | 55 +++++++++++++++++++++++++++++ include/linux/sysfs.h | 16 +++++++++ 3 files changed, 74 insertions(+), 5 deletions(-)
diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index a1426cabcef1..2e38fafc1b63 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -211,12 +211,10 @@ Other notes: is 4096.
- show() methods should return the number of bytes printed into the - buffer. This is the return value of scnprintf(). + buffer.
-- show() must not use snprintf() when formatting the value to be - returned to user space. If you can guarantee that an overflow - will never happen you can use sprintf() otherwise you must use - scnprintf(). +- show() should only use sysfs_emit() or sysfs_emit_at() when formatting + the value to be returned to user space.
- store() should return the number of bytes used from the buffer. If the entire buffer has been used, just return the count argument. diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index 7a220626ea6a..74de104f8f33 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -15,6 +15,7 @@ #include <linux/list.h> #include <linux/mutex.h> #include <linux/seq_file.h> +#include <linux/mm.h>
#include "sysfs.h" #include "../kernfs/kernfs-internal.h" @@ -558,3 +559,57 @@ void sysfs_remove_bin_file(struct kobject *kobj, kernfs_remove_by_name(kobj->sd, attr->attr.name); } EXPORT_SYMBOL_GPL(sysfs_remove_bin_file); + +/** + * sysfs_emit - scnprintf equivalent, aware of PAGE_SIZE buffer. + * @buf: start of PAGE_SIZE buffer. + * @fmt: format + * @...: optional arguments to @format + * + * + * Returns number of characters written to @buf. + */ +int sysfs_emit(char *buf, const char *fmt, ...) +{ + va_list args; + int len; + + if (WARN(!buf || offset_in_page(buf), + "invalid sysfs_emit: buf:%p\n", buf)) + return 0; + + va_start(args, fmt); + len = vscnprintf(buf, PAGE_SIZE, fmt, args); + va_end(args); + + return len; +} +EXPORT_SYMBOL_GPL(sysfs_emit); + +/** + * sysfs_emit_at - scnprintf equivalent, aware of PAGE_SIZE buffer. + * @buf: start of PAGE_SIZE buffer. + * @at: offset in @buf to start write in bytes + * @at must be >= 0 && < PAGE_SIZE + * @fmt: format + * @...: optional arguments to @fmt + * + * + * Returns number of characters written starting at &@buf[@at]. + */ +int sysfs_emit_at(char *buf, int at, const char *fmt, ...) +{ + va_list args; + int len; + + if (WARN(!buf || offset_in_page(buf) || at < 0 || at >= PAGE_SIZE, + "invalid sysfs_emit_at: buf:%p at:%d\n", buf, at)) + return 0; + + va_start(args, fmt); + len = vscnprintf(buf + at, PAGE_SIZE - at, fmt, args); + va_end(args); + + return len; +} +EXPORT_SYMBOL_GPL(sysfs_emit_at); diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h index 987cefa337de..1cd7bad56075 100644 --- a/include/linux/sysfs.h +++ b/include/linux/sysfs.h @@ -299,6 +299,11 @@ static inline void sysfs_enable_ns(struct kernfs_node *kn) return kernfs_enable_ns(kn); }
+__printf(2, 3) +int sysfs_emit(char *buf, const char *fmt, ...); +__printf(3, 4) +int sysfs_emit_at(char *buf, int at, const char *fmt, ...); + #else /* CONFIG_SYSFS */
static inline int sysfs_create_dir_ns(struct kobject *kobj, const void *ns) @@ -505,6 +510,17 @@ static inline void sysfs_enable_ns(struct kernfs_node *kn) { }
+__printf(2, 3) +static inline int sysfs_emit(char *buf, const char *fmt, ...) +{ + return 0; +} + +__printf(3, 4) +static inline int sysfs_emit_at(char *buf, int at, const char *fmt, ...) +{ + return 0; +} #endif /* CONFIG_SYSFS */
static inline int __must_check sysfs_create_file(struct kobject *kobj,
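The bounded-formatting idea behind sysfs_emit() and sysfs_emit_at() above can be approximated in userspace as follows. This is a sketch under stated assumptions, not the kernel implementation: SKETCH_PAGE_SIZE is assumed to be 4096, the sketch_* names are hypothetical, vsnprintf() stands in for the kernel's vscnprintf(), and invalid arguments simply return 0 without the WARN() used in the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for the sketch */

/* Like vsnprintf(), but returns the number of bytes actually stored. */
static int sketch_vscnprintf(char *buf, size_t size, const char *fmt,
			     va_list args)
{
	int n = vsnprintf(buf, size, fmt, args);

	if (n < 0)
		return 0;
	if ((size_t)n < size)
		return n;
	return size ? (int)(size - 1) : 0;	/* output was truncated */
}

/* Format into a page-aligned, page-sized buffer; never writes past it. */
int sketch_emit(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(fmt != NULL);
	if (!buf || ((uintptr_t)buf % SKETCH_PAGE_SIZE) != 0)
		return 0;

	va_start(args, fmt);
	len = sketch_vscnprintf(buf, SKETCH_PAGE_SIZE, fmt, args);
	va_end(args);
	return len;
}

/* Same, but start writing at byte offset "at" inside the page. */
int sketch_emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(fmt != NULL);
	if (!buf || ((uintptr_t)buf % SKETCH_PAGE_SIZE) != 0 ||
	    at < 0 || at >= SKETCH_PAGE_SIZE)
		return 0;

	va_start(args, fmt);
	len = sketch_vscnprintf(buf + at, (size_t)(SKETCH_PAGE_SIZE - at),
				fmt, args);
	va_end(args);
	return len;
}
```

Because the return value is the number of bytes stored rather than the number that would have been stored, a caller can add it to a running offset and pass that offset to the next sketch_emit_at() call without ever pointing past the page.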
From: Chris Leech cleech@redhat.com
stable inclusion from linux-4.19.179 commit b2957d7baff77b399c7408dc12bacc7f63765897 CVE: CVE-2021-27365
--------------------------------
commit ec98ea7070e94cc25a422ec97d1421e28d97b7ee upstream.
As the iSCSI parameters are exported back through sysfs, the kernel should enforce that they are never longer than PAGE_SIZE (which should be more than enough) before accepting updates through netlink.
Change all iSCSI sysfs attributes to use sysfs_emit().
Cc: stable@vger.kernel.org Reported-by: Adam Nichols adam@grimm-co.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Chris Leech cleech@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Reviewed-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- drivers/scsi/libiscsi.c | 148 ++++++++++++++-------------- drivers/scsi/scsi_transport_iscsi.c | 23 +++-- 2 files changed, 89 insertions(+), 82 deletions(-)
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index c4e02d97058a..d6f3c4ed63e8 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -3409,125 +3409,125 @@ int iscsi_session_get_param(struct iscsi_cls_session *cls_session,
switch(param) { case ISCSI_PARAM_FAST_ABORT: - len = sprintf(buf, "%d\n", session->fast_abort); + len = sysfs_emit(buf, "%d\n", session->fast_abort); break; case ISCSI_PARAM_ABORT_TMO: - len = sprintf(buf, "%d\n", session->abort_timeout); + len = sysfs_emit(buf, "%d\n", session->abort_timeout); break; case ISCSI_PARAM_LU_RESET_TMO: - len = sprintf(buf, "%d\n", session->lu_reset_timeout); + len = sysfs_emit(buf, "%d\n", session->lu_reset_timeout); break; case ISCSI_PARAM_TGT_RESET_TMO: - len = sprintf(buf, "%d\n", session->tgt_reset_timeout); + len = sysfs_emit(buf, "%d\n", session->tgt_reset_timeout); break; case ISCSI_PARAM_INITIAL_R2T_EN: - len = sprintf(buf, "%d\n", session->initial_r2t_en); + len = sysfs_emit(buf, "%d\n", session->initial_r2t_en); break; case ISCSI_PARAM_MAX_R2T: - len = sprintf(buf, "%hu\n", session->max_r2t); + len = sysfs_emit(buf, "%hu\n", session->max_r2t); break; case ISCSI_PARAM_IMM_DATA_EN: - len = sprintf(buf, "%d\n", session->imm_data_en); + len = sysfs_emit(buf, "%d\n", session->imm_data_en); break; case ISCSI_PARAM_FIRST_BURST: - len = sprintf(buf, "%u\n", session->first_burst); + len = sysfs_emit(buf, "%u\n", session->first_burst); break; case ISCSI_PARAM_MAX_BURST: - len = sprintf(buf, "%u\n", session->max_burst); + len = sysfs_emit(buf, "%u\n", session->max_burst); break; case ISCSI_PARAM_PDU_INORDER_EN: - len = sprintf(buf, "%d\n", session->pdu_inorder_en); + len = sysfs_emit(buf, "%d\n", session->pdu_inorder_en); break; case ISCSI_PARAM_DATASEQ_INORDER_EN: - len = sprintf(buf, "%d\n", session->dataseq_inorder_en); + len = sysfs_emit(buf, "%d\n", session->dataseq_inorder_en); break; case ISCSI_PARAM_DEF_TASKMGMT_TMO: - len = sprintf(buf, "%d\n", session->def_taskmgmt_tmo); + len = sysfs_emit(buf, "%d\n", session->def_taskmgmt_tmo); break; case ISCSI_PARAM_ERL: - len = sprintf(buf, "%d\n", session->erl); + len = sysfs_emit(buf, "%d\n", session->erl); break; case ISCSI_PARAM_TARGET_NAME: - len = sprintf(buf, "%s\n", 
session->targetname); + len = sysfs_emit(buf, "%s\n", session->targetname); break; case ISCSI_PARAM_TARGET_ALIAS: - len = sprintf(buf, "%s\n", session->targetalias); + len = sysfs_emit(buf, "%s\n", session->targetalias); break; case ISCSI_PARAM_TPGT: - len = sprintf(buf, "%d\n", session->tpgt); + len = sysfs_emit(buf, "%d\n", session->tpgt); break; case ISCSI_PARAM_USERNAME: - len = sprintf(buf, "%s\n", session->username); + len = sysfs_emit(buf, "%s\n", session->username); break; case ISCSI_PARAM_USERNAME_IN: - len = sprintf(buf, "%s\n", session->username_in); + len = sysfs_emit(buf, "%s\n", session->username_in); break; case ISCSI_PARAM_PASSWORD: - len = sprintf(buf, "%s\n", session->password); + len = sysfs_emit(buf, "%s\n", session->password); break; case ISCSI_PARAM_PASSWORD_IN: - len = sprintf(buf, "%s\n", session->password_in); + len = sysfs_emit(buf, "%s\n", session->password_in); break; case ISCSI_PARAM_IFACE_NAME: - len = sprintf(buf, "%s\n", session->ifacename); + len = sysfs_emit(buf, "%s\n", session->ifacename); break; case ISCSI_PARAM_INITIATOR_NAME: - len = sprintf(buf, "%s\n", session->initiatorname); + len = sysfs_emit(buf, "%s\n", session->initiatorname); break; case ISCSI_PARAM_BOOT_ROOT: - len = sprintf(buf, "%s\n", session->boot_root); + len = sysfs_emit(buf, "%s\n", session->boot_root); break; case ISCSI_PARAM_BOOT_NIC: - len = sprintf(buf, "%s\n", session->boot_nic); + len = sysfs_emit(buf, "%s\n", session->boot_nic); break; case ISCSI_PARAM_BOOT_TARGET: - len = sprintf(buf, "%s\n", session->boot_target); + len = sysfs_emit(buf, "%s\n", session->boot_target); break; case ISCSI_PARAM_AUTO_SND_TGT_DISABLE: - len = sprintf(buf, "%u\n", session->auto_snd_tgt_disable); + len = sysfs_emit(buf, "%u\n", session->auto_snd_tgt_disable); break; case ISCSI_PARAM_DISCOVERY_SESS: - len = sprintf(buf, "%u\n", session->discovery_sess); + len = sysfs_emit(buf, "%u\n", session->discovery_sess); break; case ISCSI_PARAM_PORTAL_TYPE: - len = sprintf(buf, "%s\n", 
session->portal_type); + len = sysfs_emit(buf, "%s\n", session->portal_type); break; case ISCSI_PARAM_CHAP_AUTH_EN: - len = sprintf(buf, "%u\n", session->chap_auth_en); + len = sysfs_emit(buf, "%u\n", session->chap_auth_en); break; case ISCSI_PARAM_DISCOVERY_LOGOUT_EN: - len = sprintf(buf, "%u\n", session->discovery_logout_en); + len = sysfs_emit(buf, "%u\n", session->discovery_logout_en); break; case ISCSI_PARAM_BIDI_CHAP_EN: - len = sprintf(buf, "%u\n", session->bidi_chap_en); + len = sysfs_emit(buf, "%u\n", session->bidi_chap_en); break; case ISCSI_PARAM_DISCOVERY_AUTH_OPTIONAL: - len = sprintf(buf, "%u\n", session->discovery_auth_optional); + len = sysfs_emit(buf, "%u\n", session->discovery_auth_optional); break; case ISCSI_PARAM_DEF_TIME2WAIT: - len = sprintf(buf, "%d\n", session->time2wait); + len = sysfs_emit(buf, "%d\n", session->time2wait); break; case ISCSI_PARAM_DEF_TIME2RETAIN: - len = sprintf(buf, "%d\n", session->time2retain); + len = sysfs_emit(buf, "%d\n", session->time2retain); break; case ISCSI_PARAM_TSID: - len = sprintf(buf, "%u\n", session->tsid); + len = sysfs_emit(buf, "%u\n", session->tsid); break; case ISCSI_PARAM_ISID: - len = sprintf(buf, "%02x%02x%02x%02x%02x%02x\n", + len = sysfs_emit(buf, "%02x%02x%02x%02x%02x%02x\n", session->isid[0], session->isid[1], session->isid[2], session->isid[3], session->isid[4], session->isid[5]); break; case ISCSI_PARAM_DISCOVERY_PARENT_IDX: - len = sprintf(buf, "%u\n", session->discovery_parent_idx); + len = sysfs_emit(buf, "%u\n", session->discovery_parent_idx); break; case ISCSI_PARAM_DISCOVERY_PARENT_TYPE: if (session->discovery_parent_type) - len = sprintf(buf, "%s\n", + len = sysfs_emit(buf, "%s\n", session->discovery_parent_type); else - len = sprintf(buf, "\n"); + len = sysfs_emit(buf, "\n"); break; default: return -ENOSYS; @@ -3559,16 +3559,16 @@ int iscsi_conn_get_addr_param(struct sockaddr_storage *addr, case ISCSI_PARAM_CONN_ADDRESS: case ISCSI_HOST_PARAM_IPADDRESS: if (sin) - len = sprintf(buf, 
"%pI4\n", &sin->sin_addr.s_addr); + len = sysfs_emit(buf, "%pI4\n", &sin->sin_addr.s_addr); else - len = sprintf(buf, "%pI6\n", &sin6->sin6_addr); + len = sysfs_emit(buf, "%pI6\n", &sin6->sin6_addr); break; case ISCSI_PARAM_CONN_PORT: case ISCSI_PARAM_LOCAL_PORT: if (sin) - len = sprintf(buf, "%hu\n", be16_to_cpu(sin->sin_port)); + len = sysfs_emit(buf, "%hu\n", be16_to_cpu(sin->sin_port)); else - len = sprintf(buf, "%hu\n", + len = sysfs_emit(buf, "%hu\n", be16_to_cpu(sin6->sin6_port)); break; default: @@ -3587,88 +3587,88 @@ int iscsi_conn_get_param(struct iscsi_cls_conn *cls_conn,
switch(param) { case ISCSI_PARAM_PING_TMO: - len = sprintf(buf, "%u\n", conn->ping_timeout); + len = sysfs_emit(buf, "%u\n", conn->ping_timeout); break; case ISCSI_PARAM_RECV_TMO: - len = sprintf(buf, "%u\n", conn->recv_timeout); + len = sysfs_emit(buf, "%u\n", conn->recv_timeout); break; case ISCSI_PARAM_MAX_RECV_DLENGTH: - len = sprintf(buf, "%u\n", conn->max_recv_dlength); + len = sysfs_emit(buf, "%u\n", conn->max_recv_dlength); break; case ISCSI_PARAM_MAX_XMIT_DLENGTH: - len = sprintf(buf, "%u\n", conn->max_xmit_dlength); + len = sysfs_emit(buf, "%u\n", conn->max_xmit_dlength); break; case ISCSI_PARAM_HDRDGST_EN: - len = sprintf(buf, "%d\n", conn->hdrdgst_en); + len = sysfs_emit(buf, "%d\n", conn->hdrdgst_en); break; case ISCSI_PARAM_DATADGST_EN: - len = sprintf(buf, "%d\n", conn->datadgst_en); + len = sysfs_emit(buf, "%d\n", conn->datadgst_en); break; case ISCSI_PARAM_IFMARKER_EN: - len = sprintf(buf, "%d\n", conn->ifmarker_en); + len = sysfs_emit(buf, "%d\n", conn->ifmarker_en); break; case ISCSI_PARAM_OFMARKER_EN: - len = sprintf(buf, "%d\n", conn->ofmarker_en); + len = sysfs_emit(buf, "%d\n", conn->ofmarker_en); break; case ISCSI_PARAM_EXP_STATSN: - len = sprintf(buf, "%u\n", conn->exp_statsn); + len = sysfs_emit(buf, "%u\n", conn->exp_statsn); break; case ISCSI_PARAM_PERSISTENT_PORT: - len = sprintf(buf, "%d\n", conn->persistent_port); + len = sysfs_emit(buf, "%d\n", conn->persistent_port); break; case ISCSI_PARAM_PERSISTENT_ADDRESS: - len = sprintf(buf, "%s\n", conn->persistent_address); + len = sysfs_emit(buf, "%s\n", conn->persistent_address); break; case ISCSI_PARAM_STATSN: - len = sprintf(buf, "%u\n", conn->statsn); + len = sysfs_emit(buf, "%u\n", conn->statsn); break; case ISCSI_PARAM_MAX_SEGMENT_SIZE: - len = sprintf(buf, "%u\n", conn->max_segment_size); + len = sysfs_emit(buf, "%u\n", conn->max_segment_size); break; case ISCSI_PARAM_KEEPALIVE_TMO: - len = sprintf(buf, "%u\n", conn->keepalive_tmo); + len = sysfs_emit(buf, "%u\n", 
conn->keepalive_tmo); break; case ISCSI_PARAM_LOCAL_PORT: - len = sprintf(buf, "%u\n", conn->local_port); + len = sysfs_emit(buf, "%u\n", conn->local_port); break; case ISCSI_PARAM_TCP_TIMESTAMP_STAT: - len = sprintf(buf, "%u\n", conn->tcp_timestamp_stat); + len = sysfs_emit(buf, "%u\n", conn->tcp_timestamp_stat); break; case ISCSI_PARAM_TCP_NAGLE_DISABLE: - len = sprintf(buf, "%u\n", conn->tcp_nagle_disable); + len = sysfs_emit(buf, "%u\n", conn->tcp_nagle_disable); break; case ISCSI_PARAM_TCP_WSF_DISABLE: - len = sprintf(buf, "%u\n", conn->tcp_wsf_disable); + len = sysfs_emit(buf, "%u\n", conn->tcp_wsf_disable); break; case ISCSI_PARAM_TCP_TIMER_SCALE: - len = sprintf(buf, "%u\n", conn->tcp_timer_scale); + len = sysfs_emit(buf, "%u\n", conn->tcp_timer_scale); break; case ISCSI_PARAM_TCP_TIMESTAMP_EN: - len = sprintf(buf, "%u\n", conn->tcp_timestamp_en); + len = sysfs_emit(buf, "%u\n", conn->tcp_timestamp_en); break; case ISCSI_PARAM_IP_FRAGMENT_DISABLE: - len = sprintf(buf, "%u\n", conn->fragment_disable); + len = sysfs_emit(buf, "%u\n", conn->fragment_disable); break; case ISCSI_PARAM_IPV4_TOS: - len = sprintf(buf, "%u\n", conn->ipv4_tos); + len = sysfs_emit(buf, "%u\n", conn->ipv4_tos); break; case ISCSI_PARAM_IPV6_TC: - len = sprintf(buf, "%u\n", conn->ipv6_traffic_class); + len = sysfs_emit(buf, "%u\n", conn->ipv6_traffic_class); break; case ISCSI_PARAM_IPV6_FLOW_LABEL: - len = sprintf(buf, "%u\n", conn->ipv6_flow_label); + len = sysfs_emit(buf, "%u\n", conn->ipv6_flow_label); break; case ISCSI_PARAM_IS_FW_ASSIGNED_IPV6: - len = sprintf(buf, "%u\n", conn->is_fw_assigned_ipv6); + len = sysfs_emit(buf, "%u\n", conn->is_fw_assigned_ipv6); break; case ISCSI_PARAM_TCP_XMIT_WSF: - len = sprintf(buf, "%u\n", conn->tcp_xmit_wsf); + len = sysfs_emit(buf, "%u\n", conn->tcp_xmit_wsf); break; case ISCSI_PARAM_TCP_RECV_WSF: - len = sprintf(buf, "%u\n", conn->tcp_recv_wsf); + len = sysfs_emit(buf, "%u\n", conn->tcp_recv_wsf); break; case ISCSI_PARAM_LOCAL_IPADDR: - len = 
sprintf(buf, "%s\n", conn->local_ipaddr); + len = sysfs_emit(buf, "%s\n", conn->local_ipaddr); break; default: return -ENOSYS; @@ -3686,13 +3686,13 @@ int iscsi_host_get_param(struct Scsi_Host *shost, enum iscsi_host_param param,
switch (param) { case ISCSI_HOST_PARAM_NETDEV_NAME: - len = sprintf(buf, "%s\n", ihost->netdev); + len = sysfs_emit(buf, "%s\n", ihost->netdev); break; case ISCSI_HOST_PARAM_HWADDRESS: - len = sprintf(buf, "%s\n", ihost->hwaddress); + len = sysfs_emit(buf, "%s\n", ihost->hwaddress); break; case ISCSI_HOST_PARAM_INITIATOR_NAME: - len = sprintf(buf, "%s\n", ihost->initiatorname); + len = sysfs_emit(buf, "%s\n", ihost->initiatorname); break; default: return -ENOSYS; diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 174bab398202..94c13caf13c3 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -122,7 +122,8 @@ show_transport_handle(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN)) return -EACCES; - return sprintf(buf, "%llu\n", (unsigned long long)iscsi_handle(priv->iscsi_transport)); + return sysfs_emit(buf, "%llu\n", + (unsigned long long)iscsi_handle(priv->iscsi_transport)); } static DEVICE_ATTR(handle, S_IRUGO, show_transport_handle, NULL);
@@ -132,7 +133,7 @@ show_transport_##name(struct device *dev, \ struct device_attribute *attr,char *buf) \ { \ struct iscsi_internal *priv = dev_to_iscsi_internal(dev); \ - return sprintf(buf, format"\n", priv->iscsi_transport->name); \ + return sysfs_emit(buf, format"\n", priv->iscsi_transport->name);\ } \ static DEVICE_ATTR(name, S_IRUGO, show_transport_##name, NULL);
@@ -173,7 +174,7 @@ static ssize_t show_ep_handle(struct device *dev, struct device_attribute *attr, char *buf) { struct iscsi_endpoint *ep = iscsi_dev_to_endpoint(dev); - return sprintf(buf, "%llu\n", (unsigned long long) ep->id); + return sysfs_emit(buf, "%llu\n", (unsigned long long) ep->id); } static ISCSI_ATTR(ep, handle, S_IRUGO, show_ep_handle, NULL);
@@ -2766,6 +2767,9 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) struct iscsi_cls_session *session; int err = 0, value = 0;
+ if (ev->u.set_param.len > PAGE_SIZE) + return -EINVAL; + session = iscsi_session_lookup(ev->u.set_param.sid); conn = iscsi_conn_lookup(ev->u.set_param.sid, ev->u.set_param.cid); if (!conn || !session) @@ -2913,6 +2917,9 @@ iscsi_set_host_param(struct iscsi_transport *transport, if (!transport->set_host_param) return -ENOSYS;
+ if (ev->u.set_host_param.len > PAGE_SIZE) + return -EINVAL; + shost = scsi_host_lookup(ev->u.set_host_param.host_no); if (!shost) { printk(KERN_ERR "set_host_param could not find host no %u\n", @@ -4023,7 +4030,7 @@ show_priv_session_state(struct device *dev, struct device_attribute *attr, char *buf) { struct iscsi_cls_session *session = iscsi_dev_to_session(dev->parent); - return sprintf(buf, "%s\n", iscsi_session_state_name(session->state)); + return sysfs_emit(buf, "%s\n", iscsi_session_state_name(session->state)); } static ISCSI_CLASS_ATTR(priv_sess, state, S_IRUGO, show_priv_session_state, NULL); @@ -4032,7 +4039,7 @@ show_priv_session_creator(struct device *dev, struct device_attribute *attr, char *buf) { struct iscsi_cls_session *session = iscsi_dev_to_session(dev->parent); - return sprintf(buf, "%d\n", session->creator); + return sysfs_emit(buf, "%d\n", session->creator); } static ISCSI_CLASS_ATTR(priv_sess, creator, S_IRUGO, show_priv_session_creator, NULL); @@ -4041,7 +4048,7 @@ show_priv_session_target_id(struct device *dev, struct device_attribute *attr, char *buf) { struct iscsi_cls_session *session = iscsi_dev_to_session(dev->parent); - return sprintf(buf, "%d\n", session->target_id); + return sysfs_emit(buf, "%d\n", session->target_id); } static ISCSI_CLASS_ATTR(priv_sess, target_id, S_IRUGO, show_priv_session_target_id, NULL); @@ -4054,8 +4061,8 @@ show_priv_session_##field(struct device *dev, \ struct iscsi_cls_session *session = \ iscsi_dev_to_session(dev->parent); \ if (session->field == -1) \ - return sprintf(buf, "off\n"); \ - return sprintf(buf, format"\n", session->field); \ + return sysfs_emit(buf, "off\n"); \ + return sysfs_emit(buf, format"\n", session->field); \ }
#define iscsi_priv_session_attr_store(field) \
From: Chris Leech cleech@redhat.com
stable inclusion from linux-4.19.179 commit 23e2942885e8db57311cb4f9a719fd0306073c40 CVE: CVE-2021-27365
--------------------------------
commit f9dbdf97a5bd92b1a49cee3d591b55b11fd7a6d5 upstream.
Open-iSCSI sends passthrough PDUs over netlink, but the kernel should verify that the provided PDU header and data lengths fall within the netlink message, to prevent accesses beyond the end of that message in memory.
Cc: stable@vger.kernel.org Reported-by: Adam Nichols adam@grimm-co.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Chris Leech cleech@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Reviewed-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- drivers/scsi/scsi_transport_iscsi.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 94c13caf13c3..e340b05278b6 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -3507,6 +3507,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group) { int err = 0; u32 portid; + u32 pdu_len; struct iscsi_uevent *ev = nlmsg_data(nlh); struct iscsi_transport *transport = NULL; struct iscsi_internal *priv; @@ -3624,6 +3625,14 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group) err = -EINVAL; break; case ISCSI_UEVENT_SEND_PDU: + pdu_len = nlh->nlmsg_len - sizeof(*nlh) - sizeof(*ev); + + if ((ev->u.send_pdu.hdr_size > pdu_len) || + (ev->u.send_pdu.data_size > (pdu_len - ev->u.send_pdu.hdr_size))) { + err = -EINVAL; + break; + } + conn = iscsi_conn_lookup(ev->u.send_pdu.sid, ev->u.send_pdu.cid); if (conn) ev->r.retcode = transport->send_pdu(conn,
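The length validation added for ISCSI_UEVENT_SEND_PDU above can be sketched in userspace as follows. The function name and parameters are hypothetical; fixed_len stands for the combined size of the netlink header and the event structure. The initial msg_len < fixed_len guard is an addition of this sketch and is not part of the hunk shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when hdr_size and data_size both fit in the payload
 * that follows fixed_len bytes of fixed headers inside a message of
 * msg_len bytes.
 */
bool sketch_pdu_lengths_ok(uint32_t msg_len, uint32_t fixed_len,
			   uint32_t hdr_size, uint32_t data_size)
{
	uint32_t payload_len;

	/* Sketch-only guard: avoid unsigned wraparound in the subtraction. */
	if (msg_len < fixed_len)
		return false;
	payload_len = msg_len - fixed_len;

	if (hdr_size > payload_len)
		return false;

	/* Subtract instead of adding, so hdr_size + data_size cannot overflow. */
	assert(hdr_size <= payload_len);
	return data_size <= payload_len - hdr_size;
}
```

Checking hdr_size first and then comparing data_size against the remainder keeps every intermediate value within the range of the unsigned type, which a naive hdr_size + data_size <= payload_len comparison would not guarantee.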
From: Yang Yingliang yangyingliang@huawei.com
hulk inclusion category: bugfix bugzilla: NA CVE: CVE-2021-27365
---------------------------
The kABI breakage was introduced by 5bf67c8c2d947 ("sysfs: Add sysfs_emit and..."). Fix it by removing the include of mm.h and defining offset_in_page() locally in fs/sysfs/file.c.
Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/sysfs/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 74de104f8f33..ba8a7daa994f 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -15,7 +15,6 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/seq_file.h>
-#include <linux/mm.h>
 
 #include "sysfs.h"
 #include "../kernfs/kernfs-internal.h"
@@ -560,6 +559,7 @@ void sysfs_remove_bin_file(struct kobject *kobj,
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_bin_file);
 
+#define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 /**
  * sysfs_emit - scnprintf equivalent, aware of PAGE_SIZE buffer.
  * @buf:	start of PAGE_SIZE buffer.
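As a rough user-space sketch of what the locally defined macro computes: offset_in_page() masks off the page-number bits of an address and leaves the byte offset within the page, which presumably lets sysfs_emit() verify that buf really is the start of a PAGE_SIZE buffer. The 4 KiB page size, the SKETCH_* names and the helper is_page_aligned_buf() are assumptions made for this example only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel gets PAGE_SIZE and PAGE_MASK
 * from the architecture headers; these SKETCH_* names are made up. */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))

/* Same shape as the macro added by the patch. */
#define offset_in_page(p)	((unsigned long)(p) & ~SKETCH_PAGE_MASK)

/* Hypothetical helper: true if buf could be the start of a
 * page-sized buffer, i.e. non-NULL with a zero in-page offset. */
static bool is_page_aligned_buf(const void *buf)
{
	return buf != NULL && offset_in_page(buf) == 0;
}
```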
From: Jan Beulich jbeulich@suse.com
stable inclusion
from linux-4.19.179
commit 1a999d25ef536a14f6a7c25778836857adfba3f8
CVE: CVE-2021-28038
--------------------------------
commit 8310b77b48c5558c140e7a57a702e7819e62f04e upstream.
Bailing immediately from set_foreign_p2m_mapping() upon a p2m updating error leaves the full batch in an ambiguous state as far as the caller is concerned. Instead, flag the respective slots as bad and unmap what was mapped there right away.
HYPERVISOR_grant_table_op()'s return value and the individual unmap slots' status fields get used only for a one-time log message; there's not much else we can do in case of a failure.
Note that there's no GNTST_enomem or the like, so GNTST_general_error gets used.
The map ops' handle fields get overwritten just to be on the safe side.
This is part of XSA-367.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich jbeulich@suse.com
Reviewed-by: Juergen Gross jgross@suse.com
Link: https://lore.kernel.org/r/96cccf5d-e756-5f53-b91a-ea269bfb9be0@suse.com
Signed-off-by: Juergen Gross jgross@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Yang Yingliang yangyingliang@huawei.com
Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com
Signed-off-by: Yang Yingliang yangyingliang@huawei.com
Signed-off-by: Cheng Jian cj.chengjian@huawei.com
---
 arch/arm/xen/p2m.c | 35 +++++++++++++++++++++++++++++++----
 arch/x86/xen/p2m.c | 44 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 72 insertions(+), 7 deletions(-)
diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index ce538c51fa3f..8a8a388549e7 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -91,12 +91,39 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 	int i;
 
 	for (i = 0; i < count; i++) {
+		struct gnttab_unmap_grant_ref unmap;
+		int rc;
+
 		if (map_ops[i].status)
 			continue;
-		if (unlikely(!set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
-				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT))) {
-			return -ENOMEM;
-		}
+		if (likely(set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT)))
+			continue;
+
+		/*
+		 * Signal an error for this slot. This in turn requires
+		 * immediate unmapping.
+		 */
+		map_ops[i].status = GNTST_general_error;
+		unmap.host_addr = map_ops[i].host_addr,
+		unmap.handle = map_ops[i].handle;
+		map_ops[i].handle = ~0;
+		if (map_ops[i].flags & GNTMAP_device_map)
+			unmap.dev_bus_addr = map_ops[i].dev_bus_addr;
+		else
+			unmap.dev_bus_addr = 0;
+
+		/*
+		 * Pre-populate the status field, to be recognizable in
+		 * the log message below.
+		 */
+		unmap.status = 1;
+
+		rc = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+					       &unmap, 1);
+		if (rc || unmap.status != GNTST_okay)
+			pr_err_once("gnttab unmap failed: rc=%d st=%d\n",
+				    rc, unmap.status);
 	}
 
 	return 0;
diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index e8ef994c7243..82577eec6d0a 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -706,6 +706,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 
 	for (i = 0; i < count; i++) {
 		unsigned long mfn, pfn;
+		struct gnttab_unmap_grant_ref unmap[2];
+		int rc;
 
 		/* Do not add to override if the map failed. */
 		if (map_ops[i].status != GNTST_okay ||
@@ -723,10 +725,46 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 
 		WARN(pfn_to_mfn(pfn) != INVALID_P2M_ENTRY, "page must be ballooned");
 
-		if (unlikely(!set_phys_to_machine(pfn, FOREIGN_FRAME(mfn)))) {
-			ret = -ENOMEM;
-			goto out;
+		if (likely(set_phys_to_machine(pfn, FOREIGN_FRAME(mfn))))
+			continue;
+
+		/*
+		 * Signal an error for this slot. This in turn requires
+		 * immediate unmapping.
+		 */
+		map_ops[i].status = GNTST_general_error;
+		unmap[0].host_addr = map_ops[i].host_addr,
+		unmap[0].handle = map_ops[i].handle;
+		map_ops[i].handle = ~0;
+		if (map_ops[i].flags & GNTMAP_device_map)
+			unmap[0].dev_bus_addr = map_ops[i].dev_bus_addr;
+		else
+			unmap[0].dev_bus_addr = 0;
+
+		if (kmap_ops) {
+			kmap_ops[i].status = GNTST_general_error;
+			unmap[1].host_addr = kmap_ops[i].host_addr,
+			unmap[1].handle = kmap_ops[i].handle;
+			kmap_ops[i].handle = ~0;
+			if (kmap_ops[i].flags & GNTMAP_device_map)
+				unmap[1].dev_bus_addr = kmap_ops[i].dev_bus_addr;
+			else
+				unmap[1].dev_bus_addr = 0;
 		}
+
+		/*
+		 * Pre-populate both status fields, to be recognizable in
+		 * the log message below.
+		 */
+		unmap[0].status = 1;
+		unmap[1].status = 1;
+
+		rc = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref,
+					       unmap, 1 + !!kmap_ops);
+		if (rc || unmap[0].status != GNTST_okay ||
+		    unmap[1].status != GNTST_okay)
+			pr_err_once("gnttab unmap failed: rc=%d st0=%d st1=%d\n",
+				    rc, unmap[0].status, unmap[1].status);
 	}
 
 out:
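The per-slot behaviour described in the commit message above can be sketched outside the kernel roughly like this. It is a simplified model under stated assumptions: struct slot, update_all_slots() and the stub update_fails_for_slot_one() are hypothetical names, the callback stands in for set_phys_to_machine() (non-zero on success), and the immediate unmap hypercall is reduced to a comment.

```c
#include <stddef.h>

#define GNTST_okay		0
#define GNTST_general_error	(-1)

/* Hypothetical, stripped-down stand-in for a map operation slot. */
struct slot {
	int status;		/* GNTST_* value for this slot */
	unsigned int handle;	/* grant handle, invalidated on failure */
};

/* Stand-in for set_phys_to_machine(): non-zero means success. */
typedef int (*p2m_update_fn)(size_t index);

/* Example stub: the p2m update fails for slot 1 only. */
static int update_fails_for_slot_one(size_t index)
{
	return index != 1;
}

/* Walk the whole batch; a failing slot is flagged individually
 * instead of aborting the loop with -ENOMEM. */
static int update_all_slots(struct slot *slots, size_t count,
			    p2m_update_fn update)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (slots[i].status != GNTST_okay)
			continue;	/* mapping failed earlier */
		if (update(i))
			continue;	/* p2m update succeeded */

		/* Flag just this slot; the real code also unmaps the
		 * grant here before moving on. */
		slots[i].status = GNTST_general_error;
		slots[i].handle = ~0u;
	}

	return 0;
}
```

After the loop, the caller can tell from each slot's status which mappings are usable, which is the point of not returning early.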
From: Jan Beulich jbeulich@suse.com
stable inclusion
from linux-4.19.179
commit b62d8b5c814be957ce164453ddf4852167908841
CVE: CVE-2021-28038
--------------------------------
commit 2991397d23ec597405b116d96de3813420bdcbc3 upstream.
Commit 3194a1746e8a ("xen-netback: don't "handle" error by BUG()") dropped the respective BUG_ON() without noticing that, with this, the variable's value would no longer be consumed. With gnttab_set_map_op() setting all status fields to a non-zero value, in case of an error no slot should have a status of GNTST_okay (zero).
This is part of XSA-367.
Cc: stable@vger.kernel.org
Reported-by: kernel test robot lkp@intel.com
Signed-off-by: Jan Beulich jbeulich@suse.com
Reviewed-by: Juergen Gross jgross@suse.com
Link: https://lore.kernel.org/r/d933f495-619a-0086-5fb4-1ec3cf81a8fc@suse.com
Signed-off-by: Juergen Gross jgross@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Yang Yingliang yangyingliang@huawei.com
Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com
Signed-off-by: Yang Yingliang yangyingliang@huawei.com
Signed-off-by: Cheng Jian cj.chengjian@huawei.com
---
 drivers/net/xen-netback/netback.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index b29a1b279fff..41bdfb684d46 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1326,11 +1326,21 @@ int xenvif_tx_action(struct xenvif_queue *queue, int budget)
 		return 0;
 
 	gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
-	if (nr_mops != 0)
+	if (nr_mops != 0) {
 		ret = gnttab_map_refs(queue->tx_map_ops,
 				      NULL,
 				      queue->pages_to_map,
 				      nr_mops);
+		if (ret) {
+			unsigned int i;
+
+			netdev_err(queue->vif->dev, "Map fail: nr %u ret %d\n",
+				   nr_mops, ret);
+			for (i = 0; i < nr_mops; ++i)
+				WARN_ON_ONCE(queue->tx_map_ops[i].status ==
+					     GNTST_okay);
+		}
+	}
 
 	work_done = xenvif_tx_submit(queue);
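The invariant that the new WARN_ON_ONCE() loop checks can be modelled with a small stand-alone sketch. It assumes only what the commit message states: every status field starts out non-zero, so after a failed batch no slot should read GNTST_okay. The function names preset_status() and failed_batch_is_consistent() and the plain int array are hypothetical simplifications.

```c
#include <stdbool.h>
#include <stddef.h>

#define GNTST_okay 0

/* Hypothetical model: one status value per map operation, preset
 * to a non-zero value before the batch is submitted. */
static void preset_status(int *status, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		status[i] = 1;	/* any non-zero value will do here */
}

/* Returns true when a failed batch left no slot marked as mapped,
 * i.e. the condition the warning loop expects to hold. */
static bool failed_batch_is_consistent(const int *status, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (status[i] == GNTST_okay)
			return false;

	return true;
}
```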