From: Yufen Yu yuyufen@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
-------------------------------------------------
In some scenarios, such as spark-sql, almost all meta files are smaller than 2MB and applications read these small files randomly. That means multiple random IOs may be issued to a rotating disk, which can cause performance degradation.
To improve random reads of small files, we try to read the first 2MB of the file into the pagecache on the first read. This avoids issuing multiple random IOs later.
In fact, applications can call the fadvise system call with POSIX_FADV_WILLNEED to achieve this goal, but some applications cannot easily do that. So we provide a new file flag, FMODE_CTL_WILLNEED.
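For comparison, the existing userspace route mentioned above looks roughly like this (a sketch; the path is a placeholder):

  #include <fcntl.h>

  int fd = open("/path/to/meta_file", O_RDONLY);

  /* hint the kernel to populate the pagecache with the first 2MB */
  posix_fadvise(fd, 0, 2 * 1024 * 1024, POSIX_FADV_WILLNEED);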
Signed-off-by: Yufen Yu yuyufen@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Conflicts: include/linux/fs.h Value '0x40000000' has been used for flag FMODE_BUF_RASYNC. Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/fs.h | 7 +++++++ mm/readahead.c | 40 +++++++++++++++++++++++++++++++++++++++- 2 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h index db632747781a..dd023a3023b5 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -183,6 +183,12 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, /* File supports async buffered reads */ #define FMODE_BUF_RASYNC ((__force fmode_t)0x40000000)
+/* File mode control flag, expect random access pattern */ +#define FMODE_CTL_RANDOM ((__force fmode_t)0x1) + +/* File mode control flag, will try to read head of the file into pagecache */ +#define FMODE_CTL_WILLNEED ((__force fmode_t)0x2) + /* * Attribute flags. These should be or-ed together to figure out what * has been changed! @@ -947,6 +953,7 @@ struct file { atomic_long_t f_count; unsigned int f_flags; fmode_t f_mode; + fmode_t f_ctl_mode; struct mutex f_pos_lock; loff_t f_pos; struct fown_struct f_owner; diff --git a/mm/readahead.c b/mm/readahead.c index c5b0457415be..ed23d5dec123 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -26,6 +26,7 @@
#include "internal.h"
+#define READAHEAD_FIRST_SIZE (2 * 1024 * 1024) /* * Initialise a struct file's readahead state. Assumes that the caller has * memset *ra to zero. @@ -549,10 +550,41 @@ static void ondemand_readahead(struct readahead_control *ractl, do_page_cache_ra(ractl, ra->size, ra->async_size); }
+/* + * Try to read first @ra_size from head of the file. + */ +static bool page_cache_readahead_from_head(struct address_space *mapping, + struct file *filp, pgoff_t offset, + unsigned long req_size, + unsigned long ra_size) +{ + struct backing_dev_info *bdi = inode_to_bdi(mapping->host); + struct file_ra_state *ra = &filp->f_ra; + unsigned long size = min_t(unsigned long, ra_size, + file_inode(filp)->i_size); + unsigned long nrpages = (size + PAGE_SIZE - 1) / PAGE_SIZE; + unsigned long max_pages; + unsigned int offs = 0; + + /* Cannot read data over target size, back to normal way */ + if (offset + req_size > nrpages) + return false; + + max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages); + max_pages = min(max_pages, nrpages); + while (offs < nrpages) { + force_page_cache_readahead(mapping, filp, offs, max_pages); + offs += max_pages; + } + return true; +} + void page_cache_sync_ra(struct readahead_control *ractl, struct file_ra_state *ra, unsigned long req_count) { - bool do_forced_ra = ractl->file && (ractl->file->f_mode & FMODE_RANDOM); + bool do_forced_ra = ractl->file && + ((ractl->file->f_mode & FMODE_RANDOM) || + (ractl->file->f_ctl_mode & FMODE_CTL_RANDOM));
/* * Even if read-ahead is disabled, issue this request as read-ahead @@ -567,6 +599,12 @@ void page_cache_sync_ra(struct readahead_control *ractl, do_forced_ra = true; }
+ /* try to read first READAHEAD_FIRST_SIZE into pagecache */ + if (ractl->file && (ractl->file->f_ctl_mode & FMODE_CTL_WILLNEED) && + page_cache_readahead_from_head(ractl->mapping, ractl->file, + ractl->_index, req_count, READAHEAD_FIRST_SIZE)) + return; + /* be dumb */ if (do_forced_ra) { force_page_cache_ra(ractl, ra, req_count);
From: Qais Yousef qais.yousef@arm.com
mainline inclusion from mainline-5.12 commit 6939f4ef16d48f2093f337162cfc041d0e30ed25 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Some subsystems only have bare tracepoints (a tracepoint with no associated trace event) to avoid the problem of trace events being an ABI that can't be changed.
From the bpf perspective, bare tracepoints are what it calls RAW_TRACEPOINT().
Since bpf assumed a 1:1 mapping, it relied on hooking into the DEFINE_EVENT() macro to create its bpf mapping of the tracepoints. Since bare tracepoints use DECLARE_TRACE() to create the tracepoint, bpf had no knowledge of their existence.
By teaching bpf_probe.h to parse DECLARE_TRACE() in a similar fashion to DEFINE_EVENT(), bpf can find and attach to the new raw tracepoints.
Enabling that comes with the contract that changes to raw tracepoints don't constitute a regression if they break existing bpf programs. We need the ability to continue to morph and modify these raw tracepoints without worrying about any ABI.
Update Documentation/bpf/bpf_design_QA.rst to document this contract.
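For illustration, a bare tracepoint only goes through DECLARE_TRACE(), and with this change a bpf raw tracepoint program can attach to it by name. A minimal sketch (pelt_cfs_tp is an existing bare scheduler tracepoint; the probe body is elided):

  /* kernel side: bare tracepoint, no trace event is generated */
  DECLARE_TRACE(pelt_cfs_tp,
          TP_PROTO(struct cfs_rq *cfs_rq),
          TP_ARGS(cfs_rq));

  /* bpf side: attach a raw tracepoint program to it */
  SEC("raw_tp/pelt_cfs_tp")
  int probe_pelt_cfs(struct bpf_raw_tracepoint_args *ctx)
  {
          /* ctx->args[0] carries the struct cfs_rq * argument */
          return 0;
  }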
Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210119122237.2426878-2-qais.yousef@arm.com Signed-off-by: Hou Tao houtao1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- Documentation/bpf/bpf_design_QA.rst | 6 ++++++ include/trace/bpf_probe.h | 12 ++++++++++-- 2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst index 2df7b067ab93..0e15f9b05c9d 100644 --- a/Documentation/bpf/bpf_design_QA.rst +++ b/Documentation/bpf/bpf_design_QA.rst @@ -208,6 +208,12 @@ data structures and compile with kernel internal headers. Both of these kernel internals are subject to change and can break with newer kernels such that the program needs to be adapted accordingly.
+Q: Are tracepoints part of the stable ABI? +------------------------------------------ +A: NO. Tracepoints are tied to internal implementation details hence they are +subject to change and can break with newer kernels. BPF programs need to change +accordingly when this happens. + Q: How much stack space a BPF program uses? ------------------------------------------- A: Currently all program types are limited to 512 bytes of stack diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h index cd74bffed5c6..a23be89119aa 100644 --- a/include/trace/bpf_probe.h +++ b/include/trace/bpf_probe.h @@ -55,8 +55,7 @@ /* tracepoints with more than 12 arguments will hit build error */ #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
-#undef DECLARE_EVENT_CLASS -#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ +#define __BPF_DECLARE_TRACE(call, proto, args) \ static notrace void \ __bpf_trace_##call(void *__data, proto) \ { \ @@ -64,6 +63,10 @@ __bpf_trace_##call(void *__data, proto) \ CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \ }
+#undef DECLARE_EVENT_CLASS +#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) + /* * This part is compiled out, it is only here as a build time check * to make sure that if the tracepoint handling changes, the @@ -111,6 +114,11 @@ __DEFINE_EVENT(template, call, PARAMS(proto), PARAMS(args), size) #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
+#undef DECLARE_TRACE +#define DECLARE_TRACE(call, proto, args) \ + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) \ + __DEFINE_EVENT(call, call, PARAMS(proto), PARAMS(args), 0) + #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
#undef DEFINE_EVENT_WRITABLE
From: Hou Tao houtao1@huawei.com
mainline inclusion from mainline-5.16 commit 65223741ae1b759a14cab84ba88888bb025f816d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Commit 9df1c28bb752 ("bpf: add writable context for raw tracepoints") supports writable context for tracepoints, but it misses support for bare tracepoints, which have no associated trace event.
A bare tracepoint is defined by DECLARE_TRACE(), so add a corresponding DECLARE_TRACE_WRITABLE() macro to generate a definition in the __bpf_raw_tp_map section for bare tracepoints, in a similar way to DEFINE_EVENT_WRITABLE().
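With the new macro, a writable bare tracepoint could be declared as below (a hypothetical sketch: my_write_tp and struct my_ctx are made-up names; the size argument must equal sizeof(*first argument), which __CHECK_WRITABLE_BUF_SIZE enforces at build time):

  DECLARE_TRACE_WRITABLE(my_write_tp,
          TP_PROTO(struct my_ctx *ctx, int version),
          TP_ARGS(ctx, version),
          sizeof(struct my_ctx));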
Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211004094857.30868-2-hotforest@gmail.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/trace/bpf_probe.h | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h index a23be89119aa..a8e97f84b652 100644 --- a/include/trace/bpf_probe.h +++ b/include/trace/bpf_probe.h @@ -93,8 +93,7 @@ __section("__bpf_raw_tp_map") = { \
#define FIRST(x, ...) x
-#undef DEFINE_EVENT_WRITABLE -#define DEFINE_EVENT_WRITABLE(template, call, proto, args, size) \ +#define __CHECK_WRITABLE_BUF_SIZE(call, proto, args, size) \ static inline void bpf_test_buffer_##call(void) \ { \ /* BUILD_BUG_ON() is ignored if the code is completely eliminated, but \ @@ -103,8 +102,12 @@ static inline void bpf_test_buffer_##call(void) \ */ \ FIRST(proto); \ (void)BUILD_BUG_ON_ZERO(size != sizeof(*FIRST(args))); \ -} \ -__DEFINE_EVENT(template, call, PARAMS(proto), PARAMS(args), size) +} + +#undef DEFINE_EVENT_WRITABLE +#define DEFINE_EVENT_WRITABLE(template, call, proto, args, size) \ + __CHECK_WRITABLE_BUF_SIZE(call, PARAMS(proto), PARAMS(args), size) \ + __DEFINE_EVENT(template, call, PARAMS(proto), PARAMS(args), size)
#undef DEFINE_EVENT #define DEFINE_EVENT(template, call, proto, args) \ @@ -119,9 +122,17 @@ __DEFINE_EVENT(template, call, PARAMS(proto), PARAMS(args), size) __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) \ __DEFINE_EVENT(call, call, PARAMS(proto), PARAMS(args), 0)
+#undef DECLARE_TRACE_WRITABLE +#define DECLARE_TRACE_WRITABLE(call, proto, args, size) \ + __CHECK_WRITABLE_BUF_SIZE(call, PARAMS(proto), PARAMS(args), size) \ + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) \ + __DEFINE_EVENT(call, call, PARAMS(proto), PARAMS(args), size) + #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+#undef DECLARE_TRACE_WRITABLE #undef DEFINE_EVENT_WRITABLE +#undef __CHECK_WRITABLE_BUF_SIZE #undef __DEFINE_EVENT #undef FIRST
From: Hou Tao houtao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Add a writable bare tracepoint fs_file_read() and a bare tracepoint fs_file_release().
A version field is added to fs_file_read() to support extension of fs_file_read_ctx in the future.
These two tracepoints need to be exported and will be used by filesystem kernel modules.
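Since the tracepoints are exported, a filesystem module can also attach an ordinary probe to them. A minimal sketch (my_release_probe is a made-up name):

  static void my_release_probe(void *data, struct inode *inode,
                               struct file *filp)
  {
          /* e.g. drop any per-file state keyed by filp */
  }

  /* in module init/exit */
  register_trace_fs_file_release(my_release_probe, NULL);
  ...
  unregister_trace_fs_file_release(my_release_probe, NULL);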
Signed-off-by: Hou Tao houtao1@huawei.com Acked-by: fang wei fangwei1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/read_write.c | 5 +++++ include/linux/fs.h | 17 +++++++++++++++++ include/trace/events/fs.h | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 55 insertions(+) create mode 100644 include/trace/events/fs.h
diff --git a/fs/read_write.c b/fs/read_write.c index 88f445da7515..433ca8ab7c91 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -24,6 +24,8 @@
#include <linux/uaccess.h> #include <asm/unistd.h> +#define CREATE_TRACE_POINTS +#include <trace/events/fs.h>
const struct file_operations generic_ro_fops = { .llseek = generic_file_llseek, @@ -1679,3 +1681,6 @@ int generic_file_rw_checks(struct file *file_in, struct file *file_out)
return 0; } + +EXPORT_TRACEPOINT_SYMBOL_GPL(fs_file_read); +EXPORT_TRACEPOINT_SYMBOL_GPL(fs_file_release); diff --git a/include/linux/fs.h b/include/linux/fs.h index dd023a3023b5..1ed02fc8bdd0 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3562,4 +3562,21 @@ static inline int inode_drain_writes(struct inode *inode) return filemap_write_and_wait(inode->i_mapping); }
+struct fs_file_read_ctx { + const unsigned char *name; + unsigned int f_ctl_mode; + unsigned int rsvd; + /* clear from f_ctl_mode */ + unsigned int clr_f_ctl_mode; + /* set into f_ctl_mode */ + unsigned int set_f_ctl_mode; + unsigned long key; + /* file size */ + long long i_size; + /* previous page index */ + long long prev_index; + /* current page index */ + long long index; +}; + #endif /* _LINUX_FS_H */ diff --git a/include/trace/events/fs.h b/include/trace/events/fs.h new file mode 100644 index 000000000000..ee82dad9d9da --- /dev/null +++ b/include/trace/events/fs.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM fs + +#if !defined(_TRACE_FS_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_FS_H + +#include <linux/types.h> +#include <linux/tracepoint.h> +#include <linux/fs.h> + +#undef FS_DECLARE_TRACE +#ifdef DECLARE_TRACE_WRITABLE +#define FS_DECLARE_TRACE(call, proto, args, size) \ + DECLARE_TRACE_WRITABLE(call, PARAMS(proto), PARAMS(args), size) +#else +#define FS_DECLARE_TRACE(call, proto, args, size) \ + DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) +#endif + +FS_DECLARE_TRACE(fs_file_read, + TP_PROTO(struct fs_file_read_ctx *ctx, int version), + TP_ARGS(ctx, version), + sizeof(struct fs_file_read_ctx)); + +DECLARE_TRACE(fs_file_release, + TP_PROTO(struct inode *inode, struct file *filp), + TP_ARGS(inode, filp)); + +#endif /* _TRACE_FS_H */ + +/* This part must be outside protection */ +#include <trace/define_trace.h>
From: Hou Tao houtao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
fs_file_read_do_trace() uses the writable tracepoint fs_file_read to update f_ctl_mode in the file read procedure. Also export fs_file_read_update_args_by_trace() so that it can be used by filesystem kernel modules.
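A filesystem is expected to call the helper at the start of its buffered read path, as the following patches do for xfs and ext4. A minimal sketch (my_file_read_iter is a made-up name):

  static ssize_t my_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
  {
          /* let an attached eBPF program adjust f_ctl_mode for this file */
          fs_file_read_do_trace(iocb);
          return generic_file_read_iter(iocb, to);
  }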
Signed-off-by: Hou Tao houtao1@huawei.com Acked-by: fang wei fangwei1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/read_write.c | 33 +++++++++++++++++++++++++++++++++ include/linux/fs.h | 13 +++++++++++++ 2 files changed, 46 insertions(+)
diff --git a/fs/read_write.c b/fs/read_write.c index 433ca8ab7c91..d175b5e8d3d3 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1682,5 +1682,38 @@ int generic_file_rw_checks(struct file *file_in, struct file *file_out) return 0; }
+#ifdef CONFIG_TRACEPOINTS +static void fs_file_read_ctx_init(struct fs_file_read_ctx *ctx, + struct file *filp, loff_t pos) +{ + memset(ctx, 0, sizeof(*ctx)); + ctx->name = file_dentry(filp)->d_name.name; + ctx->f_ctl_mode = filp->f_ctl_mode; + ctx->key = (unsigned long)filp; + ctx->i_size = file_inode(filp)->i_size; + ctx->prev_index = filp->f_ra.prev_pos >> PAGE_SHIFT; + ctx->index = pos >> PAGE_SHIFT; +} + +#define FS_FILE_READ_VERSION 1 +#define FS_FILE_READ_MODE_MASK (FMODE_CTL_RANDOM | FMODE_CTL_WILLNEED) + +void fs_file_read_update_args_by_trace(struct kiocb *iocb) +{ + struct file *filp = iocb->ki_filp; + struct fs_file_read_ctx ctx; + + fs_file_read_ctx_init(&ctx, filp, iocb->ki_pos); + trace_fs_file_read(&ctx, FS_FILE_READ_VERSION); + + if (!ctx.set_f_ctl_mode && !ctx.clr_f_ctl_mode) + return; + + filp->f_ctl_mode |= ctx.set_f_ctl_mode & FS_FILE_READ_MODE_MASK; + filp->f_ctl_mode &= ~(ctx.clr_f_ctl_mode & FS_FILE_READ_MODE_MASK); +} +EXPORT_SYMBOL_GPL(fs_file_read_update_args_by_trace); +#endif + EXPORT_TRACEPOINT_SYMBOL_GPL(fs_file_read); EXPORT_TRACEPOINT_SYMBOL_GPL(fs_file_release); diff --git a/include/linux/fs.h b/include/linux/fs.h index 1ed02fc8bdd0..44ee66086b94 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -41,6 +41,7 @@ #include <linux/build_bug.h> #include <linux/stddef.h> #include <linux/mount.h> +#include <linux/tracepoint-defs.h>
#include <asm/byteorder.h> #include <uapi/linux/fs.h> @@ -3579,4 +3580,16 @@ struct fs_file_read_ctx { long long index; };
+#ifdef CONFIG_TRACEPOINTS +DECLARE_TRACEPOINT(fs_file_read); +extern void fs_file_read_update_args_by_trace(struct kiocb *iocb); +#else +static inline void fs_file_read_update_args_by_trace(struct kiocb *iocb) {} +#endif + +static inline void fs_file_read_do_trace(struct kiocb *iocb) +{ + if (tracepoint_enabled(fs_file_read)) + fs_file_read_update_args_by_trace(iocb); +} #endif /* _LINUX_FS_H */
From: Hou Tao houtao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Hook the xfs buffered read and file release paths with fs_file_read_do_trace() and trace_fs_file_release().
Signed-off-by: Hou Tao houtao1@huawei.com Acked-by: fang wei fangwei1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/xfs/xfs_file.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 2fb8cf556f8d..80adec66744b 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -29,6 +29,7 @@ #include <linux/backing-dev.h> #include <linux/mman.h> #include <linux/fadvise.h> +#include <trace/events/fs.h>
static const struct vm_operations_struct xfs_file_vm_ops;
@@ -289,6 +290,7 @@ xfs_file_buffered_aio_read( ssize_t ret;
trace_xfs_file_buffered_read(ip, iov_iter_count(to), iocb->ki_pos); + fs_file_read_do_trace(iocb);
if (iocb->ki_flags & IOCB_NOWAIT) { if (!xfs_ilock_nowait(ip, XFS_IOLOCK_SHARED)) @@ -1197,6 +1199,7 @@ xfs_file_release( struct inode *inode, struct file *filp) { + trace_fs_file_release(inode, filp); return xfs_release(XFS_I(inode)); }
From: Hou Tao houtao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4H3JT CVE: NA backport: openEuler-22.03-LTS
---------------------------
Hook the ext4 buffered read and file release paths with fs_file_read_do_trace() and trace_fs_file_release().
Signed-off-by: Hou Tao houtao1@huawei.com Acked-by: fang wei fangwei1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ext4/file.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 3b09ddbe8970..6f78f7fbf419 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -30,6 +30,7 @@ #include <linux/uio.h> #include <linux/mman.h> #include <linux/backing-dev.h> +#include <trace/events/fs.h> #include "ext4.h" #include "ext4_jbd2.h" #include "xattr.h" @@ -128,6 +129,7 @@ static ssize_t ext4_file_read_iter(struct kiocb *iocb, struct iov_iter *to) if (iocb->ki_flags & IOCB_DIRECT) return ext4_dio_read_iter(iocb, to);
+ fs_file_read_do_trace(iocb); return generic_file_read_iter(iocb, to); }
@@ -138,6 +140,8 @@ static ssize_t ext4_file_read_iter(struct kiocb *iocb, struct iov_iter *to) */ static int ext4_release_file(struct inode *inode, struct file *filp) { + trace_fs_file_release(inode, filp); + if (ext4_test_inode_state(inode, EXT4_STATE_DA_ALLOC_CLOSE)) { ext4_alloc_da_blocks(inode); ext4_clear_inode_state(inode, EXT4_STATE_DA_ALLOC_CLOSE);
From: Hou Tao houtao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Identify writable tracepoint programs by the section prefix raw_tracepoint.w/.
The correct way would be to backport commit ccaf12d6215a ("libbpf: Support detecting and attaching of writable tracepoint program"), but the refactoring of libbpf makes that hard, so use the same section prefix as ccaf12d6215a and post a home-made patch instead.
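After this change, placing a program in a raw_tracepoint.w/ section is enough for libbpf to load it as BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE. A minimal sketch:

  SEC("raw_tracepoint.w/fs_file_read")
  int handle_fs_file_read(void *ctx)
  {
          /* may write through the tracepoint context */
          return 0;
  }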
Signed-off-by: Hou Tao houtao1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/lib/bpf/libbpf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index b337d6f29098..2dfe07a872fc 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8320,6 +8320,7 @@ static const struct bpf_sec_def section_defs[] = { .attach_fn = attach_tp), SEC_DEF("raw_tracepoint/", RAW_TRACEPOINT, .attach_fn = attach_raw_tp), + BPF_PROG_SEC("raw_tracepoint.w/", BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE), SEC_DEF("raw_tp/", RAW_TRACEPOINT, .attach_fn = attach_raw_tp), SEC_DEF("tp_btf/", TRACING,
From: Hou Tao houtao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Add a selftest that attaches eBPF programs to fs_file_read() and fs_file_release() respectively. The program for fs_file_read() records the read history, calculates the read pattern and sets f_ctl_mode for a specific file, and the program for fs_file_release() cleans up the saved read history.
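The pattern detection in the program below boils down to a window-based heuristic, condensed here for reference (files of at most 4MB are simply marked FMODE_CTL_WILLNEED; larger files are classified per 500ms-or-10-reads window):

  /* seq_nr: reads at or adjacent to the previous page index;
   * tot_nr: all reads in the current window */
  if (now - hist->last_nsec >= 500000000ULL || hist->tot_nr >= 10) {
          if (hist->tot_nr >= 10) {
                  if (hist->seq_nr <= hist->tot_nr * 3 / 10)
                          rd_ctx->set_f_ctl_mode = FMODE_CTL_RANDOM;
                  else if (hist->seq_nr >= hist->tot_nr * 7 / 10)
                          rd_ctx->clr_f_ctl_mode = FMODE_CTL_RANDOM;
          }
          /* restart the window */
          hist->last_nsec = now;
          hist->tot_nr = 0;
          hist->seq_nr = 0;
  }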
Signed-off-by: Hou Tao houtao1@huawei.com Reviewed-by: Kuohai Xu xukuohai@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- tools/testing/selftests/bpf/Makefile | 1 + .../testing/selftests/bpf/file_read_pattern.c | 73 +++++++++ .../bpf/progs/file_read_pattern_prog.c | 142 ++++++++++++++++++ 3 files changed, 216 insertions(+) create mode 100644 tools/testing/selftests/bpf/file_read_pattern.c create mode 100644 tools/testing/selftests/bpf/progs/file_read_pattern_prog.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 1d9155533360..6fabcc2f739b 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -38,6 +38,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test test_netcnt test_tcpnotify_user test_sysctl \ test_progs-no_alu32 \ test_current_pid_tgid_new_ns +TEST_GEN_PROGS += file_read_pattern
# Also test bpf-gcc, if present ifneq ($(BPF_GCC),) diff --git a/tools/testing/selftests/bpf/file_read_pattern.c b/tools/testing/selftests/bpf/file_read_pattern.c new file mode 100644 index 000000000000..81e3a49f0424 --- /dev/null +++ b/tools/testing/selftests/bpf/file_read_pattern.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <unistd.h> +#include <linux/bpf.h> +#include <linux/err.h> +#include <bpf/bpf.h> +#include <bpf/libbpf.h> + +#include "bpf_rlimit.h" + +#define READ_TP_NAME "fs_file_read" +#define RELEASE_TP_NAME "fs_file_release" + +int main(int argc, char *argv[]) +{ + const char *name = "./file_read_pattern_prog.o"; + struct bpf_object *obj; + const char *prog_name; + struct bpf_program *prog; + int unused; + int err; + int read_fd; + int release_fd; + + err = bpf_prog_load(name, BPF_PROG_TYPE_UNSPEC, &obj, &unused); + if (err) { + printf("Failed to load program\n"); + return err; + } + + prog_name = "raw_tracepoint.w/" READ_TP_NAME; + prog = bpf_object__find_program_by_title(obj, prog_name); + if (!prog) { + printf("no prog %s\n", prog_name); + err = -EINVAL; + goto out; + } + + read_fd = bpf_raw_tracepoint_open(READ_TP_NAME, bpf_program__fd(prog)); + if (read_fd < 0) { + err = -errno; + printf("Failed to attach raw tracepoint %s\n", READ_TP_NAME); + goto out; + } + + prog_name = "raw_tracepoint/" RELEASE_TP_NAME; + prog = bpf_object__find_program_by_title(obj, prog_name); + if (!prog) { + printf("no prog %s\n", prog_name); + err = -EINVAL; + goto out; + } + + release_fd = bpf_raw_tracepoint_open(RELEASE_TP_NAME, + bpf_program__fd(prog)); + if (release_fd < 0) { + err = -errno; + printf("Failed to attach raw tracepoint %s\n", RELEASE_TP_NAME); + goto out; + } + + pause(); + + close(release_fd); + close(read_fd); +out: + bpf_object__close(obj); + return err; +} diff --git a/tools/testing/selftests/bpf/progs/file_read_pattern_prog.c b/tools/testing/selftests/bpf/progs/file_read_pattern_prog.c new file mode 100644 index 000000000000..cd2dcd7c64bb --- /dev/null +++ b/tools/testing/selftests/bpf/progs/file_read_pattern_prog.c @@ -0,0 +1,142 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */ +#include <stdbool.h> +#include <string.h> +#include <linux/bpf.h> + +#include <bpf/bpf_helpers.h> + +#ifndef __always_inline +#define __always_inline inline __attribute__((always_inline)) +#endif + +/* Need to keep consistent with definitions in include/linux/fs.h */ +#define FMODE_CTL_RANDOM 0x1 +#define FMODE_CTL_WILLNEED 0x2 + +struct fs_file_read_ctx { + const unsigned char *name; + unsigned int f_ctl_mode; + unsigned int rsvd; + /* clear from f_ctl_mode */ + unsigned int clr_f_ctl_mode; + /* set into f_ctl_mode */ + unsigned int set_f_ctl_mode; + unsigned long key; + /* file size */ + long long i_size; + /* previous page index */ + long long prev_index; + /* current page index */ + long long index; +}; + +struct fs_file_read_args { + struct fs_file_read_ctx *ctx; + int version; +}; + +struct fs_file_release_args { + void *inode; + void *filp; +}; + +struct file_rd_hist { + __u64 last_nsec; + __u32 seq_nr; + __u32 tot_nr; +}; + +struct bpf_map_def SEC("maps") htab = { + .type = BPF_MAP_TYPE_HASH, + .key_size = sizeof(long), + .value_size = sizeof(struct file_rd_hist), + .max_entries = 10000, +}; + +static __always_inline bool is_expected_file(void *name) +{ + char prefix[5]; + int err; + + err = bpf_probe_read_str(&prefix, sizeof(prefix), name); + if (err <= 0) + return false; + return !strncmp(prefix, "blk_", 4); +} + +SEC("raw_tracepoint.w/fs_file_read") +int fs_file_read(struct fs_file_read_args *args) +{ + const char fmt[] = "elapsed %llu, seq %u, tot %u\n"; + struct fs_file_read_ctx *rd_ctx = args->ctx; + struct file_rd_hist *hist; + struct file_rd_hist new_hist; + __u64 key; + __u64 now; + bool first; + + if (!is_expected_file((void *)rd_ctx->name)) + return 0; + + if (rd_ctx->i_size <= (4 << 20)) { + rd_ctx->set_f_ctl_mode = FMODE_CTL_WILLNEED; + return 0; + } + + first = false; + now = bpf_ktime_get_ns(); + key = rd_ctx->key; + hist = bpf_map_lookup_elem(&htab, &key); + if (!hist) { + __builtin_memset(&new_hist, 0, sizeof(new_hist)); + new_hist.last_nsec = now; + first = true; + hist = &new_hist; + } + + if (rd_ctx->index >= rd_ctx->prev_index && + rd_ctx->index - rd_ctx->prev_index <= 1) + hist->seq_nr += 1; + hist->tot_nr += 1; + + bpf_trace_printk(fmt, sizeof(fmt), now - hist->last_nsec, + hist->seq_nr, hist->tot_nr); + + if (first) { + bpf_map_update_elem(&htab, &key, hist, 0); + return 0; + } + + /* 500ms or 10 read */ + if (now - hist->last_nsec >= 500000000ULL || hist->tot_nr >= 10) { + if (hist->tot_nr >= 10) { + if (hist->seq_nr <= hist->tot_nr * 3 / 10) + rd_ctx->set_f_ctl_mode = FMODE_CTL_RANDOM; + else if (hist->seq_nr >= hist->tot_nr * 7 / 10) + rd_ctx->clr_f_ctl_mode = FMODE_CTL_RANDOM; + } + + hist->last_nsec = now; + hist->tot_nr = 0; + hist->seq_nr = 0; + } + + return 0; + } + +SEC("raw_tracepoint/fs_file_release") +int fs_file_release(struct fs_file_release_args *args) +{ + __u64 key = (unsigned long)args->filp; + void *value; + + value = bpf_map_lookup_elem(&htab, &key); + if (value) + bpf_map_delete_elem(&htab, &key); + + return 0; +} + +char _license[] SEC("license") = "GPL"; +__u32 _version SEC("version") = 1;
From: Zhihao Cheng chengzhihao1@huawei.com
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53R0H CVE: NA backport: openEuler-22.03-LTS
---------------------------
Move f_ctl_mode of struct file into a union with the reserved field kabi_reserved1, so that the earlier addition of f_ctl_mode does not break KABI.
Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/fs.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
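For context, the diff below uses the usual KABI trick: during the __GENKSYMS__ symbol-CRC pass the struct still shows the original KABI_RESERVE(1) placeholder, while real builds overlay the new field on the reserved slot. In outline:

  #ifndef __GENKSYMS__
          union {
                  fmode_t f_ctl_mode;     /* new use of the slot */
                  u64 kabi_reserved1;     /* original placeholder */
          };
  #else
          KABI_RESERVE(1)
  #endif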
diff --git a/include/linux/fs.h b/include/linux/fs.h index 44ee66086b94..18259e38dcd7 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -954,7 +954,6 @@ struct file { atomic_long_t f_count; unsigned int f_flags; fmode_t f_mode; - fmode_t f_ctl_mode; struct mutex f_pos_lock; loff_t f_pos; struct fown_struct f_owner; @@ -976,8 +975,14 @@ struct file { struct address_space *f_mapping; errseq_t f_wb_err; errseq_t f_sb_err; /* for syncfs */ - +#ifndef __GENKSYMS__ + union { + fmode_t f_ctl_mode; + u64 kabi_reserved1; + }; +#else KABI_RESERVE(1) +#endif } __randomize_layout __attribute__((aligned(4))); /* lest something weird decides that 2 is OK */
From: Liu Shixin liushixin2@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I576NI CVE: NA backport: openEuler-22.03-LTS
--------------------------------
There is a build warning when !CONFIG_CGROUPS:
In file included from ./include/linux/memcontrol.h:25, from ./include/linux/swap.h:10, from ./include/linux/suspend.h:5, from drivers/cpuidle/cpuidle.c:23: ./include/linux/dynamic_hugetlb.h:115:47: warning: ‘struct cftype’ declared inside parameter list will not be visible outside of this definition or declaration 115 | static inline bool dhugetlb_hide_files(struct cftype *cft) | ^~~~~~
Since the function is only invoked when CONFIG_CGROUPS is enabled, we can fix it by restricting its definition to CONFIG_CGROUPS.
Signed-off-by: Liu Shixin liushixin2@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/dynamic_hugetlb.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/include/linux/dynamic_hugetlb.h b/include/linux/dynamic_hugetlb.h index 237a7329ff64..5dcba8e8b933 100644 --- a/include/linux/dynamic_hugetlb.h +++ b/include/linux/dynamic_hugetlb.h @@ -112,10 +112,12 @@ void free_huge_page_to_dhugetlb_pool(struct page *page, bool restore_reserve);
struct dhugetlb_pool {};
+#ifdef CONFIG_CGROUPS static inline bool dhugetlb_hide_files(struct cftype *cft) { return false; } +#endif static inline void hugetlb_pool_inherit(struct mem_cgroup *memcg, struct mem_cgroup *parent) { }
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.18-rc1 commit eb90686d5d10fef9cadd9c0eb30f3fee66d2b2a5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I50ABL
--------------------------------
Stand-alone implementation of the SM3 algorithm. It is designed to have as few dependencies as possible. In other cases you should generally use the hash APIs from include/crypto/hash.h, especially when hashing large amounts of data, as those APIs may be hw-accelerated. In the new stand-alone SM3 library, sm3_transform() has also been optimized, instead of simply reusing the code from sm3-generic.
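A minimal usage sketch of the new library interface (one-shot hashing; data and len are placeholders):

  #include <crypto/sm3.h>

  u8 digest[SM3_DIGEST_SIZE];
  struct sm3_state sctx;

  sm3_init(&sctx);
  sm3_update(&sctx, data, len);
  sm3_final(&sctx, digest);       /* also zeroizes sctx */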
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Reviewed-by: Gilad Ben-Yossef gilad@benyossef.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/crypto/sm3.h | 32 ++++++ lib/crypto/Kconfig | 3 + lib/crypto/Makefile | 3 + lib/crypto/sm3.c | 246 +++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 284 insertions(+) create mode 100644 lib/crypto/sm3.c
diff --git a/include/crypto/sm3.h b/include/crypto/sm3.h index 42ea21289ba9..b5fb6d1bf247 100644 --- a/include/crypto/sm3.h +++ b/include/crypto/sm3.h @@ -1,5 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ /* * Common values for SM3 algorithm + * + * Copyright (C) 2017 ARM Limited or its affiliates. + * Copyright (C) 2017 Gilad Ben-Yossef gilad@benyossef.com + * Copyright (C) 2021 Tianjia Zhang tianjia.zhang@linux.alibaba.com */
#ifndef _CRYPTO_SM3_H @@ -39,4 +44,31 @@ extern int crypto_sm3_final(struct shash_desc *desc, u8 *out);
extern int crypto_sm3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *hash); + +/* + * Stand-alone implementation of the SM3 algorithm. It is designed to + * have as little dependencies as possible so it can be used in the + * kexec_file purgatory. In other cases you should generally use the + * hash APIs from include/crypto/hash.h. Especially when hashing large + * amounts of data as those APIs may be hw-accelerated. + * + * For details see lib/crypto/sm3.c + */ + +static inline void sm3_init(struct sm3_state *sctx) +{ + sctx->state[0] = SM3_IVA; + sctx->state[1] = SM3_IVB; + sctx->state[2] = SM3_IVC; + sctx->state[3] = SM3_IVD; + sctx->state[4] = SM3_IVE; + sctx->state[5] = SM3_IVF; + sctx->state[6] = SM3_IVG; + sctx->state[7] = SM3_IVH; + sctx->count = 0; +} + +void sm3_update(struct sm3_state *sctx, const u8 *data, unsigned int len); +void sm3_final(struct sm3_state *sctx, u8 *out); + #endif diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig index 545ccbddf6a1..36963e8f4eaa 100644 --- a/lib/crypto/Kconfig +++ b/lib/crypto/Kconfig @@ -129,5 +129,8 @@ config CRYPTO_LIB_CHACHA20POLY1305 config CRYPTO_LIB_SHA256 tristate
+config CRYPTO_LIB_SM3 + tristate + config CRYPTO_LIB_SM4 tristate diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile index 73205ed269ba..8149bc00b627 100644 --- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -38,6 +38,9 @@ libpoly1305-y += poly1305.o obj-$(CONFIG_CRYPTO_LIB_SHA256) += libsha256.o libsha256-y := sha256.o
+obj-$(CONFIG_CRYPTO_LIB_SM3) += libsm3.o +libsm3-y := sm3.o + obj-$(CONFIG_CRYPTO_LIB_SM4) += libsm4.o libsm4-y := sm4.o
diff --git a/lib/crypto/sm3.c b/lib/crypto/sm3.c new file mode 100644 index 000000000000..d473e358a873 --- /dev/null +++ b/lib/crypto/sm3.c @@ -0,0 +1,246 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * SM3 secure hash, as specified by OSCCA GM/T 0004-2012 SM3 and described + * at https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02 + * + * Copyright (C) 2017 ARM Limited or its affiliates. + * Copyright (C) 2017 Gilad Ben-Yossef gilad@benyossef.com + * Copyright (C) 2021 Tianjia Zhang tianjia.zhang@linux.alibaba.com + */ + +#include <linux/module.h> +#include <asm/unaligned.h> +#include <crypto/sm3.h> + +static const u32 ____cacheline_aligned K[64] = { + 0x79cc4519, 0xf3988a32, 0xe7311465, 0xce6228cb, + 0x9cc45197, 0x3988a32f, 0x7311465e, 0xe6228cbc, + 0xcc451979, 0x988a32f3, 0x311465e7, 0x6228cbce, + 0xc451979c, 0x88a32f39, 0x11465e73, 0x228cbce6, + 0x9d8a7a87, 0x3b14f50f, 0x7629ea1e, 0xec53d43c, + 0xd8a7a879, 0xb14f50f3, 0x629ea1e7, 0xc53d43ce, + 0x8a7a879d, 0x14f50f3b, 0x29ea1e76, 0x53d43cec, + 0xa7a879d8, 0x4f50f3b1, 0x9ea1e762, 0x3d43cec5, + 0x7a879d8a, 0xf50f3b14, 0xea1e7629, 0xd43cec53, + 0xa879d8a7, 0x50f3b14f, 0xa1e7629e, 0x43cec53d, + 0x879d8a7a, 0x0f3b14f5, 0x1e7629ea, 0x3cec53d4, + 0x79d8a7a8, 0xf3b14f50, 0xe7629ea1, 0xcec53d43, + 0x9d8a7a87, 0x3b14f50f, 0x7629ea1e, 0xec53d43c, + 0xd8a7a879, 0xb14f50f3, 0x629ea1e7, 0xc53d43ce, + 0x8a7a879d, 0x14f50f3b, 0x29ea1e76, 0x53d43cec, + 0xa7a879d8, 0x4f50f3b1, 0x9ea1e762, 0x3d43cec5 +}; + +/* + * Transform the message X which consists of 16 32-bit-words. See + * GM/T 004-2012 for details. + */ +#define R(i, a, b, c, d, e, f, g, h, t, w1, w2) \ + do { \ + ss1 = rol32((rol32((a), 12) + (e) + (t)), 7); \ + ss2 = ss1 ^ rol32((a), 12); \ + d += FF ## i(a, b, c) + ss2 + ((w1) ^ (w2)); \ + h += GG ## i(e, f, g) + ss1 + (w1); \ + b = rol32((b), 9); \ + f = rol32((f), 19); \ + h = P0((h)); \ + } while (0) + +#define R1(a, b, c, d, e, f, g, h, t, w1, w2) \ + R(1, a, b, c, d, e, f, g, h, t, w1, w2) +#define R2(a, b, c, d, e, f, g, h, t, w1, w2) \ + R(2, a, b, c, d, e, f, g, h, t, w1, w2) + +#define FF1(x, y, z) (x ^ y ^ z) +#define FF2(x, y, z) ((x & y) | (x & z) | (y & z)) + +#define GG1(x, y, z) FF1(x, y, z) +#define GG2(x, y, z) ((x & y) | (~x & z)) + +/* Message expansion */ +#define P0(x) ((x) ^ rol32((x), 9) ^ rol32((x), 17)) +#define P1(x) ((x) ^ rol32((x), 15) ^ rol32((x), 23)) +#define I(i) (W[i] = get_unaligned_be32(data + i * 4)) +#define W1(i) (W[i & 0x0f]) +#define W2(i) (W[i & 0x0f] = \ + P1(W[i & 0x0f] \ + ^ W[(i-9) & 0x0f] \ + ^ rol32(W[(i-3) & 0x0f], 15)) \ + ^ rol32(W[(i-13) & 0x0f], 7) \ + ^ W[(i-6) & 0x0f]) + +static void sm3_transform(struct sm3_state *sctx, u8 const *data, u32 W[16]) +{ + u32 a, b, c, d, e, f, g, h, ss1, ss2; + + a = sctx->state[0]; + b = sctx->state[1]; + c = sctx->state[2]; + d = sctx->state[3]; + e = sctx->state[4]; + f = sctx->state[5]; + g = sctx->state[6]; + h = sctx->state[7]; + + R1(a, b, c, d, e, f, g, h, K[0], I(0), I(4)); + R1(d, a, b, c, h, e, f, g, K[1], I(1), I(5)); + R1(c, d, a, b, g, h, e, f, K[2], I(2), I(6)); + R1(b, c, d, a, f, g, h, e, K[3], I(3), I(7)); + R1(a, b, c, d, e, f, g, h, K[4], W1(4), I(8)); + R1(d, a, b, c, h, e, f, g, K[5], W1(5), I(9)); + R1(c, d, a, b, g, h, e, f, K[6], W1(6), I(10)); + R1(b, c, d, a, f, g, h, e, K[7], W1(7), I(11)); + R1(a, b, c, d, e, f, g, h, K[8], W1(8), I(12)); + R1(d, a, b, c, h, e, f, g, K[9], W1(9), I(13)); + R1(c, d, a, b, g, h, e, f, K[10], W1(10), I(14)); + R1(b, c, d, a, f, g, h, e, K[11], W1(11), I(15)); + R1(a, b, c, d, e, f, g, h, K[12], W1(12), W2(16)); + R1(d, a, b, c, h, e, f, g, K[13], W1(13), W2(17)); + R1(c, d, a, b, g, h, e, f, K[14], W1(14), W2(18)); + R1(b, c, d, a, f, g, h, e, K[15], W1(15), W2(19)); + + R2(a, b, c, d, e, f, g, h, K[16], W1(16), W2(20)); + R2(d, a, b, c, h, e, f, g, K[17], W1(17), W2(21)); + R2(c, d, a, b, g, h, e, f, K[18], W1(18), W2(22)); + R2(b, c, d, a, f, g, h, e, K[19], W1(19), W2(23)); + R2(a, b, c, d, e, f, g, h, K[20], W1(20), W2(24)); + R2(d, a, b, c, h, e, f, g, K[21], W1(21), W2(25)); + R2(c, d, a, b, g, h, e, f, K[22], W1(22), W2(26)); + R2(b, c, d, a, f, g, h, e, K[23], W1(23), W2(27)); + R2(a, b, c, d, e, f, g, h, K[24], W1(24), W2(28)); + R2(d, a, b, c, h, e, f, g, K[25], W1(25), W2(29)); + R2(c, d, a, b, g, h, e, f, K[26], W1(26), W2(30)); + R2(b, c, d, a, f, g, h, e, K[27], W1(27), W2(31)); + R2(a, b, c, d, e, f, g, h, K[28], W1(28), W2(32)); + R2(d, a, b, c, h, e, f, g, K[29], W1(29), W2(33)); + R2(c, d, a, b, g, h, e, f, K[30], W1(30), W2(34)); + R2(b, c, d, a, f, g, h, e, K[31], W1(31), W2(35)); + + R2(a, b, c, d, e, f, g, h, K[32], W1(32), W2(36)); + R2(d, a, b, c, h, e, f, g, K[33], W1(33), W2(37)); + R2(c, d, a, b, g, h, e, f, K[34], W1(34), W2(38)); + R2(b, c, d, a, f, g, h, e, K[35], W1(35), W2(39)); + R2(a, b, c, d, e, f, g, h, K[36], W1(36), W2(40)); + R2(d, a, b, c, h, e, f, g, K[37], W1(37), W2(41)); + R2(c, d, a, b, g, h, e, f, K[38], W1(38), W2(42)); + R2(b, c, d, a, f, g, h, e, K[39], W1(39), W2(43)); + R2(a, b, c, d, e, f, g, h, K[40], W1(40), W2(44)); + R2(d, a, b, c, h, e, f, g, K[41], W1(41), W2(45)); + R2(c, d, a, b, g, h, e, f, K[42], W1(42), W2(46)); + R2(b, c, d, a, f, g, h, e, K[43], W1(43), W2(47)); + R2(a, b, c, d, e, f, g, h, K[44], W1(44), W2(48)); + R2(d, a, b, c, h, e, f, g, K[45], W1(45), W2(49)); + R2(c, d, a, b, g, h, e, f, K[46], W1(46), W2(50)); + R2(b, c, d, a, f, g, h, e, K[47], W1(47), W2(51)); + + R2(a, b, c, d, e, f, g, h, K[48], W1(48), W2(52)); + R2(d, a, b, c, h, e, f, g, K[49], W1(49), W2(53)); + R2(c, d, a, b, g, h, e, f, K[50], W1(50), W2(54)); + R2(b, c, d, a, f, g, h, e, K[51], W1(51), W2(55)); + R2(a, b, c, d, e, f, g, h, K[52], W1(52), W2(56)); + R2(d, a, b, c, h, e, f, g, K[53], W1(53), W2(57)); + R2(c, d, a, b, g, h, e, f, K[54], W1(54), W2(58)); + R2(b, c, d, a, f, g, h, e, K[55], W1(55), W2(59)); + R2(a, b, c, d, e, f, g, h, K[56], W1(56), W2(60)); + R2(d, a, b, c, h, e, f, g, K[57], W1(57), W2(61)); + R2(c, d, a, b, g, h, e, f, K[58], W1(58), W2(62)); + R2(b, c, d, a, f, g, h, e, K[59], W1(59), W2(63)); + R2(a, b, c, d, e, f, g, h, K[60], W1(60), W2(64)); + R2(d, a, b, c, h, e, f, g, K[61], W1(61), W2(65)); + R2(c, d, a, b, g, h, e, f, K[62], W1(62), W2(66)); + R2(b, c, d, a, f, g, h, e, K[63], W1(63), W2(67)); + + sctx->state[0] ^= a; + sctx->state[1] ^= b; + sctx->state[2] ^= c; + sctx->state[3] ^= d; + sctx->state[4] ^= e; + sctx->state[5] ^= f; + sctx->state[6] ^= g; + sctx->state[7] ^= h; +} +#undef R +#undef R1 +#undef R2 +#undef I +#undef W1 +#undef W2 + +static inline void sm3_block(struct sm3_state *sctx, + u8 const *data, int blocks, u32 W[16]) +{ + while (blocks--) { + sm3_transform(sctx, data, W); + data += SM3_BLOCK_SIZE; + } +} + +void sm3_update(struct sm3_state *sctx, const u8 *data, unsigned int len) +{ + unsigned int partial = sctx->count % SM3_BLOCK_SIZE; + u32 W[16]; + + sctx->count += len; + + if ((partial + len) >= SM3_BLOCK_SIZE) { + int blocks; + + if (partial) { + int p = SM3_BLOCK_SIZE - partial; + + memcpy(sctx->buffer + partial, data, p); + data += p; + len -= p; + + sm3_block(sctx, sctx->buffer, 1, W); + } + + blocks = len / SM3_BLOCK_SIZE; + len %= SM3_BLOCK_SIZE; + + if (blocks) { + sm3_block(sctx, data, blocks, W); + data += blocks * SM3_BLOCK_SIZE; + } + + memzero_explicit(W, sizeof(W)); + + partial = 0; + } + if (len) + memcpy(sctx->buffer + partial, data, len); +} +EXPORT_SYMBOL_GPL(sm3_update); + +void sm3_final(struct sm3_state *sctx, u8 *out) +{ + const int bit_offset = SM3_BLOCK_SIZE - sizeof(u64); + __be64 *bits = (__be64 *)(sctx->buffer + bit_offset); + __be32 *digest = (__be32 *)out; + unsigned int partial = sctx->count % SM3_BLOCK_SIZE; + u32 W[16]; + int i; + + sctx->buffer[partial++] = 0x80; + if (partial > bit_offset) { + memset(sctx->buffer + partial, 0, SM3_BLOCK_SIZE - partial); + partial = 0; + + sm3_block(sctx, sctx->buffer, 1, W); + } + + memset(sctx->buffer + partial, 0, bit_offset - partial); + *bits = cpu_to_be64(sctx->count << 3); + sm3_block(sctx, sctx->buffer, 1, W); + + for (i = 0; i < 8; i++) + put_unaligned_be32(sctx->state[i], digest++); + + /* Zeroize sensitive information. */ + memzero_explicit(W, sizeof(W)); + memzero_explicit(sctx, sizeof(*sctx)); +} +EXPORT_SYMBOL_GPL(sm3_final); + +MODULE_DESCRIPTION("Generic SM3 library"); +MODULE_LICENSE("GPL v2");
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.18-rc1 commit f3a03d319dbdbb206530ebfce977c334ee2f8765 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I50ABL
--------------------------------
The SM3 generic library is a stand-alone implementation, so sm3-ce can depend on the SM3 library instead of sm3-generic.
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/arm64/crypto/Kconfig | 2 +- arch/arm64/crypto/sm3-ce-glue.c | 28 ++++++++++++++++++++-------- 2 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index 55f19450091b..764b64837b34 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -45,7 +45,7 @@ config CRYPTO_SM3_ARM64_CE tristate "SM3 digest algorithm (ARMv8.2 Crypto Extensions)" depends on KERNEL_MODE_NEON select CRYPTO_HASH - select CRYPTO_SM3 + select CRYPTO_LIB_SM3
config CRYPTO_SM4_ARM64_CE tristate "SM4 symmetric cipher (ARMv8.2 Crypto Extensions)" diff --git a/arch/arm64/crypto/sm3-ce-glue.c b/arch/arm64/crypto/sm3-ce-glue.c index d71faca322f2..ee98954ae8ca 100644 --- a/arch/arm64/crypto/sm3-ce-glue.c +++ b/arch/arm64/crypto/sm3-ce-glue.c @@ -26,8 +26,10 @@ asmlinkage void sm3_ce_transform(struct sm3_state *sst, u8 const *src, static int sm3_ce_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - if (!crypto_simd_usable()) - return crypto_sm3_update(desc, data, len); + if (!crypto_simd_usable()) { + sm3_update(shash_desc_ctx(desc), data, len); + return 0; + }
kernel_neon_begin(); sm3_base_do_update(desc, data, len, sm3_ce_transform); @@ -38,8 +40,10 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
static int sm3_ce_final(struct shash_desc *desc, u8 *out) { - if (!crypto_simd_usable()) - return crypto_sm3_finup(desc, NULL, 0, out); + if (!crypto_simd_usable()) { + sm3_final(shash_desc_ctx(desc), out); + return 0; + }
kernel_neon_begin(); sm3_base_do_finalize(desc, sm3_ce_transform); @@ -51,14 +55,22 @@ static int sm3_ce_final(struct shash_desc *desc, u8 *out) static int sm3_ce_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - if (!crypto_simd_usable()) - return crypto_sm3_finup(desc, data, len, out); + if (!crypto_simd_usable()) { + struct sm3_state *sctx = shash_desc_ctx(desc); + + if (len) + sm3_update(sctx, data, len); + sm3_final(sctx, out); + return 0; + }
kernel_neon_begin(); - sm3_base_do_update(desc, data, len, sm3_ce_transform); + if (len) + sm3_base_do_update(desc, data, len, sm3_ce_transform); + sm3_base_do_finalize(desc, sm3_ce_transform); kernel_neon_end();
- return sm3_ce_final(desc, out); + return sm3_base_finish(desc, out); }
static struct shash_alg sm3_alg = {
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.18-rc1 commit 114004696bf23499ca834e784d91bd82de195d76 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I50ABL
--------------------------------
The SM3 generic library is a stand-alone implementation, so the calculation of the sm2 z digest can depend on the SM3 library instead of sm3-generic.
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- crypto/Kconfig | 2 +- crypto/sm2.c | 38 +++++++++++++++++++------------------- 2 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/crypto/Kconfig b/crypto/Kconfig index 9a4878cb0141..5ebb24b4c644 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -272,7 +272,7 @@ config CRYPTO_ECRDSA
config CRYPTO_SM2 tristate "SM2 algorithm" - select CRYPTO_SM3 + select CRYPTO_LIB_SM3 select CRYPTO_AKCIPHER select CRYPTO_MANAGER select MPILIB diff --git a/crypto/sm2.c b/crypto/sm2.c index db8a4a265669..ae3f77a66070 100644 --- a/crypto/sm2.c +++ b/crypto/sm2.c @@ -13,7 +13,7 @@ #include <crypto/internal/akcipher.h> #include <crypto/akcipher.h> #include <crypto/hash.h> -#include <crypto/sm3_base.h> +#include <crypto/sm3.h> #include <crypto/rng.h> #include <crypto/sm2.h> #include "sm2signature.asn1.h" @@ -213,7 +213,7 @@ int sm2_get_signature_s(void *context, size_t hdrlen, unsigned char tag, return 0; }
-static int sm2_z_digest_update(struct shash_desc *desc, +static int sm2_z_digest_update(struct sm3_state *sctx, MPI m, unsigned int pbytes) { static const unsigned char zero[32]; @@ -226,20 +226,20 @@ static int sm2_z_digest_update(struct shash_desc *desc,
if (inlen < pbytes) { /* padding with zero */ - crypto_sm3_update(desc, zero, pbytes - inlen); - crypto_sm3_update(desc, in, inlen); + sm3_update(sctx, zero, pbytes - inlen); + sm3_update(sctx, in, inlen); } else if (inlen > pbytes) { /* skip the starting zero */ - crypto_sm3_update(desc, in + inlen - pbytes, pbytes); + sm3_update(sctx, in + inlen - pbytes, pbytes); } else { - crypto_sm3_update(desc, in, inlen); + sm3_update(sctx, in, inlen); }
kfree(in); return 0; }
-static int sm2_z_digest_update_point(struct shash_desc *desc, +static int sm2_z_digest_update_point(struct sm3_state *sctx, MPI_POINT point, struct mpi_ec_ctx *ec, unsigned int pbytes) { MPI x, y; @@ -249,8 +249,8 @@ static int sm2_z_digest_update_point(struct shash_desc *desc, y = mpi_new(0);
if (!mpi_ec_get_affine(x, y, point, ec) && - !sm2_z_digest_update(desc, x, pbytes) && - !sm2_z_digest_update(desc, y, pbytes)) + !sm2_z_digest_update(sctx, x, pbytes) && + !sm2_z_digest_update(sctx, y, pbytes)) ret = 0;
mpi_free(x); @@ -265,7 +265,7 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm, struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm); uint16_t bits_len; unsigned char entl[2]; - SHASH_DESC_ON_STACK(desc, NULL); + struct sm3_state sctx; unsigned int pbytes;
if (id_len > (USHRT_MAX / 8) || !ec->Q) @@ -278,17 +278,17 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm, pbytes = MPI_NBYTES(ec->p);
/* ZA = H256(ENTLA | IDA | a | b | xG | yG | xA | yA) */ - sm3_base_init(desc); - crypto_sm3_update(desc, entl, 2); - crypto_sm3_update(desc, id, id_len); - - if (sm2_z_digest_update(desc, ec->a, pbytes) || - sm2_z_digest_update(desc, ec->b, pbytes) || - sm2_z_digest_update_point(desc, ec->G, ec, pbytes) || - sm2_z_digest_update_point(desc, ec->Q, ec, pbytes)) + sm3_init(&sctx); + sm3_update(&sctx, entl, 2); + sm3_update(&sctx, id, id_len); + + if (sm2_z_digest_update(&sctx, ec->a, pbytes) || + sm2_z_digest_update(&sctx, ec->b, pbytes) || + sm2_z_digest_update_point(&sctx, ec->G, ec, pbytes) || + sm2_z_digest_update_point(&sctx, ec->Q, ec, pbytes)) return -EINVAL;
- crypto_sm3_final(desc, dgst); + sm3_final(&sctx, dgst); return 0; } EXPORT_SYMBOL(sm2_compute_z_digest);
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.18-rc1 commit b4784a45ea69577f21f89898c71127774a090a2a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I50ABL
--------------------------------
The SM3 generic library is a stand-alone implementation, so make the sm3-generic implementation depend on the SM3 library. The crypto_sm3_*() functions provided by sm3_generic are no longer exported.
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- crypto/Kconfig | 1 + crypto/sm3_generic.c | 142 +++++-------------------------------------- include/crypto/sm3.h | 10 --- 3 files changed, 16 insertions(+), 137 deletions(-)
diff --git a/crypto/Kconfig b/crypto/Kconfig index 5ebb24b4c644..48dffd1e1b20 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -1041,6 +1041,7 @@ config CRYPTO_SHA3 config CRYPTO_SM3 tristate "SM3 digest algorithm" select CRYPTO_HASH + select CRYPTO_LIB_SM3 help SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3). It is part of the Chinese Commercial Cryptography suite. diff --git a/crypto/sm3_generic.c b/crypto/sm3_generic.c index 193c4584bd00..a215c1c37e73 100644 --- a/crypto/sm3_generic.c +++ b/crypto/sm3_generic.c @@ -5,6 +5,7 @@ * * Copyright (C) 2017 ARM Limited or its affiliates. * Written by Gilad Ben-Yossef gilad@benyossef.com + * Copyright (C) 2021 Tianjia Zhang tianjia.zhang@linux.alibaba.com */
#include <crypto/internal/hash.h> @@ -26,143 +27,29 @@ const u8 sm3_zero_message_hash[SM3_DIGEST_SIZE] = { }; EXPORT_SYMBOL_GPL(sm3_zero_message_hash);
-static inline u32 p0(u32 x) -{ - return x ^ rol32(x, 9) ^ rol32(x, 17); -} - -static inline u32 p1(u32 x) -{ - return x ^ rol32(x, 15) ^ rol32(x, 23); -} - -static inline u32 ff(unsigned int n, u32 a, u32 b, u32 c) -{ - return (n < 16) ? (a ^ b ^ c) : ((a & b) | (a & c) | (b & c)); -} - -static inline u32 gg(unsigned int n, u32 e, u32 f, u32 g) -{ - return (n < 16) ? (e ^ f ^ g) : ((e & f) | ((~e) & g)); -} - -static inline u32 t(unsigned int n) -{ - return (n < 16) ? SM3_T1 : SM3_T2; -} - -static void sm3_expand(u32 *t, u32 *w, u32 *wt) -{ - int i; - unsigned int tmp; - - /* load the input */ - for (i = 0; i <= 15; i++) - w[i] = get_unaligned_be32((__u32 *)t + i); - - for (i = 16; i <= 67; i++) { - tmp = w[i - 16] ^ w[i - 9] ^ rol32(w[i - 3], 15); - w[i] = p1(tmp) ^ (rol32(w[i - 13], 7)) ^ w[i - 6]; - } - - for (i = 0; i <= 63; i++) - wt[i] = w[i] ^ w[i + 4]; -} - -static void sm3_compress(u32 *w, u32 *wt, u32 *m) -{ - u32 ss1; - u32 ss2; - u32 tt1; - u32 tt2; - u32 a, b, c, d, e, f, g, h; - int i; - - a = m[0]; - b = m[1]; - c = m[2]; - d = m[3]; - e = m[4]; - f = m[5]; - g = m[6]; - h = m[7]; - - for (i = 0; i <= 63; i++) { - - ss1 = rol32((rol32(a, 12) + e + rol32(t(i), i & 31)), 7); - - ss2 = ss1 ^ rol32(a, 12); - - tt1 = ff(i, a, b, c) + d + ss2 + *wt; - wt++; - - tt2 = gg(i, e, f, g) + h + ss1 + *w; - w++; - - d = c; - c = rol32(b, 9); - b = a; - a = tt1; - h = g; - g = rol32(f, 19); - f = e; - e = p0(tt2); - } - - m[0] = a ^ m[0]; - m[1] = b ^ m[1]; - m[2] = c ^ m[2]; - m[3] = d ^ m[3]; - m[4] = e ^ m[4]; - m[5] = f ^ m[5]; - m[6] = g ^ m[6]; - m[7] = h ^ m[7]; - - a = b = c = d = e = f = g = h = ss1 = ss2 = tt1 = tt2 = 0; -} - -static void sm3_transform(struct sm3_state *sst, u8 const *src) -{ - unsigned int w[68]; - unsigned int wt[64]; - - sm3_expand((u32 *)src, w, wt); - sm3_compress(w, wt, sst->state); - - memzero_explicit(w, sizeof(w)); - memzero_explicit(wt, sizeof(wt)); -} - -static void sm3_generic_block_fn(struct sm3_state *sst, u8 const *src, - int blocks) -{ - while (blocks--) { - sm3_transform(sst, src); - src += SM3_BLOCK_SIZE; - } -} - -int crypto_sm3_update(struct shash_desc *desc, const u8 *data, +static int crypto_sm3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return sm3_base_do_update(desc, data, len, sm3_generic_block_fn); + sm3_update(shash_desc_ctx(desc), data, len); + return 0; } -EXPORT_SYMBOL(crypto_sm3_update);
-int crypto_sm3_final(struct shash_desc *desc, u8 *out) +static int crypto_sm3_final(struct shash_desc *desc, u8 *out) { - sm3_base_do_finalize(desc, sm3_generic_block_fn); - return sm3_base_finish(desc, out); + sm3_final(shash_desc_ctx(desc), out); + return 0; } -EXPORT_SYMBOL(crypto_sm3_final);
-int crypto_sm3_finup(struct shash_desc *desc, const u8 *data, +static int crypto_sm3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *hash) { - sm3_base_do_update(desc, data, len, sm3_generic_block_fn); - return crypto_sm3_final(desc, hash); + struct sm3_state *sctx = shash_desc_ctx(desc); + + if (len) + sm3_update(sctx, data, len); + sm3_final(sctx, hash); + return 0; } -EXPORT_SYMBOL(crypto_sm3_finup);
static struct shash_alg sm3_alg = { .digestsize = SM3_DIGEST_SIZE, @@ -174,6 +61,7 @@ static struct shash_alg sm3_alg = { .base = { .cra_name = "sm3", .cra_driver_name = "sm3-generic", + .cra_priority = 100, .cra_blocksize = SM3_BLOCK_SIZE, .cra_module = THIS_MODULE, } diff --git a/include/crypto/sm3.h b/include/crypto/sm3.h index b5fb6d1bf247..1f021ad0533f 100644 --- a/include/crypto/sm3.h +++ b/include/crypto/sm3.h @@ -35,16 +35,6 @@ struct sm3_state { u8 buffer[SM3_BLOCK_SIZE]; };
-struct shash_desc; - -extern int crypto_sm3_update(struct shash_desc *desc, const u8 *data, - unsigned int len); - -extern int crypto_sm3_final(struct shash_desc *desc, u8 *out); - -extern int crypto_sm3_finup(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *hash); - /* * Stand-alone implementation of the SM3 algorithm. It is designed to * have as little dependencies as possible so it can be used in the
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.18-rc1 commit 930ab34d906d9c44727c9dcfeafcfcd33e3639e7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I50ABL
--------------------------------
This patch adds an AVX assembly accelerated implementation of the SM3 secure hash algorithm. From the benchmark data, the performance increase is up to 38% compared to the pure software implementation sm3-generic.
The main algorithm implementation is based on the SM3 AES/BMI2 accelerated work by libgcrypt at: https://gnupg.org/software/libgcrypt/index.html
Benchmark on an Intel i5-6200U 2.30GHz, with performance data for the two implementations: pure software sm3-generic and sm3-avx acceleration. The data comes from modes 326 and 422 of tcrypt. The columns are different per-update lengths. The data is tabulated and the unit is Mb/s:
update-size |     16      64     256    1024    2048    4096    8192
------------+-------------------------------------------------------
sm3-generic | 105.97  129.60  182.12  189.62  188.06  193.66  194.88
sm3-avx     | 119.87  163.05  244.44  260.92  257.60  264.87  265.88
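As a usage note, selection between the two providers is driven by cra_priority: on an AVX/BMI2-capable CPU, sm3-avx (priority 300) is picked over sm3-generic (priority 100). A minimal kernel-side sketch of requesting "sm3" through the shash API, with error paths trimmed:

#include <crypto/hash.h>
#include <linux/err.h>

static int sm3_digest_via_shash(const u8 *data, unsigned int len, u8 *out)
{
	struct crypto_shash *tfm;
	int ret;

	tfm = crypto_alloc_shash("sm3", 0, 0);	/* highest-priority "sm3" */
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	pr_info("sm3 provider: %s\n",
		crypto_tfm_alg_driver_name(crypto_shash_tfm(tfm)));

	{
		SHASH_DESC_ON_STACK(desc, tfm);

		desc->tfm = tfm;
		ret = crypto_shash_digest(desc, data, len, out);
	}

	crypto_free_shash(tfm);
	return ret;
}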
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/crypto/Makefile | 3 + arch/x86/crypto/sm3-avx-asm_64.S | 517 +++++++++++++++++++++++++++++++ arch/x86/crypto/sm3_avx_glue.c | 134 ++++++++ crypto/Kconfig | 13 + 4 files changed, 667 insertions(+) create mode 100644 arch/x86/crypto/sm3-avx-asm_64.S create mode 100644 arch/x86/crypto/sm3_avx_glue.c
diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile index 697b8ccfb763..f03ded83dfa2 100644 --- a/arch/x86/crypto/Makefile +++ b/arch/x86/crypto/Makefile @@ -92,6 +92,9 @@ nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o
+obj-$(CONFIG_CRYPTO_SM3_AVX_X86_64) += sm3-avx-x86_64.o +sm3-avx-x86_64-y := sm3-avx-asm_64.o sm3_avx_glue.o + obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o
diff --git a/arch/x86/crypto/sm3-avx-asm_64.S b/arch/x86/crypto/sm3-avx-asm_64.S new file mode 100644 index 000000000000..71e6aae23e17 --- /dev/null +++ b/arch/x86/crypto/sm3-avx-asm_64.S @@ -0,0 +1,517 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * SM3 AVX accelerated transform. + * specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02 + * + * Copyright (C) 2021 Jussi Kivilinna jussi.kivilinna@iki.fi + * Copyright (C) 2021 Tianjia Zhang tianjia.zhang@linux.alibaba.com + */ + +/* Based on SM3 AES/BMI2 accelerated work by libgcrypt at: + * https://gnupg.org/software/libgcrypt/index.html + */ + +#include <linux/linkage.h> +#include <asm/frame.h> + +/* Context structure */ + +#define state_h0 0 +#define state_h1 4 +#define state_h2 8 +#define state_h3 12 +#define state_h4 16 +#define state_h5 20 +#define state_h6 24 +#define state_h7 28 + +/* Constants */ + +/* Round constant macros */ + +#define K0 2043430169 /* 0x79cc4519 */ +#define K1 -208106958 /* 0xf3988a32 */ +#define K2 -416213915 /* 0xe7311465 */ +#define K3 -832427829 /* 0xce6228cb */ +#define K4 -1664855657 /* 0x9cc45197 */ +#define K5 965255983 /* 0x3988a32f */ +#define K6 1930511966 /* 0x7311465e */ +#define K7 -433943364 /* 0xe6228cbc */ +#define K8 -867886727 /* 0xcc451979 */ +#define K9 -1735773453 /* 0x988a32f3 */ +#define K10 823420391 /* 0x311465e7 */ +#define K11 1646840782 /* 0x6228cbce */ +#define K12 -1001285732 /* 0xc451979c */ +#define K13 -2002571463 /* 0x88a32f39 */ +#define K14 289824371 /* 0x11465e73 */ +#define K15 579648742 /* 0x228cbce6 */ +#define K16 -1651869049 /* 0x9d8a7a87 */ +#define K17 991229199 /* 0x3b14f50f */ +#define K18 1982458398 /* 0x7629ea1e */ +#define K19 -330050500 /* 0xec53d43c */ +#define K20 -660100999 /* 0xd8a7a879 */ +#define K21 -1320201997 /* 0xb14f50f3 */ +#define K22 1654563303 /* 0x629ea1e7 */ +#define K23 -985840690 /* 0xc53d43ce */ +#define K24 -1971681379 /* 0x8a7a879d */ +#define K25 351604539 /* 0x14f50f3b */ +#define K26 703209078 /* 0x29ea1e76 */ +#define K27 1406418156 /* 0x53d43cec */ +#define K28 -1482130984 /* 0xa7a879d8 */ +#define K29 1330705329 /* 0x4f50f3b1 */ +#define K30 -1633556638 /* 0x9ea1e762 */ +#define K31 1027854021 /* 0x3d43cec5 */ +#define K32 2055708042 /* 0x7a879d8a */ +#define K33 -183551212 /* 0xf50f3b14 */ +#define K34 -367102423 /* 0xea1e7629 */ +#define K35 -734204845 /* 0xd43cec53 */ +#define K36 -1468409689 /* 0xa879d8a7 */ +#define K37 1358147919 /* 0x50f3b14f */ +#define K38 -1578671458 /* 0xa1e7629e */ +#define K39 1137624381 /* 0x43cec53d */ +#define K40 -2019718534 /* 0x879d8a7a */ +#define K41 255530229 /* 0x0f3b14f5 */ +#define K42 511060458 /* 0x1e7629ea */ +#define K43 1022120916 /* 0x3cec53d4 */ +#define K44 2044241832 /* 0x79d8a7a8 */ +#define K45 -206483632 /* 0xf3b14f50 */ +#define K46 -412967263 /* 0xe7629ea1 */ +#define K47 -825934525 /* 0xcec53d43 */ +#define K48 -1651869049 /* 0x9d8a7a87 */ +#define K49 991229199 /* 0x3b14f50f */ +#define K50 1982458398 /* 0x7629ea1e */ +#define K51 -330050500 /* 0xec53d43c */ +#define K52 -660100999 /* 0xd8a7a879 */ +#define K53 -1320201997 /* 0xb14f50f3 */ +#define K54 1654563303 /* 0x629ea1e7 */ +#define K55 -985840690 /* 0xc53d43ce */ +#define K56 -1971681379 /* 0x8a7a879d */ +#define K57 351604539 /* 0x14f50f3b */ +#define K58 703209078 /* 0x29ea1e76 */ +#define K59 1406418156 /* 0x53d43cec */ +#define K60 -1482130984 /* 0xa7a879d8 */ +#define K61 1330705329 /* 0x4f50f3b1 */ +#define K62 -1633556638 /* 0x9ea1e762 */ +#define K63 1027854021 /* 0x3d43cec5 */ + 
+/* Register macros */ + +#define RSTATE %rdi +#define RDATA %rsi +#define RNBLKS %rdx + +#define t0 %eax +#define t1 %ebx +#define t2 %ecx + +#define a %r8d +#define b %r9d +#define c %r10d +#define d %r11d +#define e %r12d +#define f %r13d +#define g %r14d +#define h %r15d + +#define W0 %xmm0 +#define W1 %xmm1 +#define W2 %xmm2 +#define W3 %xmm3 +#define W4 %xmm4 +#define W5 %xmm5 + +#define XTMP0 %xmm6 +#define XTMP1 %xmm7 +#define XTMP2 %xmm8 +#define XTMP3 %xmm9 +#define XTMP4 %xmm10 +#define XTMP5 %xmm11 +#define XTMP6 %xmm12 + +#define BSWAP_REG %xmm15 + +/* Stack structure */ + +#define STACK_W_SIZE (32 * 2 * 3) +#define STACK_REG_SAVE_SIZE (64) + +#define STACK_W (0) +#define STACK_REG_SAVE (STACK_W + STACK_W_SIZE) +#define STACK_SIZE (STACK_REG_SAVE + STACK_REG_SAVE_SIZE) + +/* Instruction helpers. */ + +#define roll2(v, reg) \ + roll $(v), reg; + +#define roll3mov(v, src, dst) \ + movl src, dst; \ + roll $(v), dst; + +#define roll3(v, src, dst) \ + rorxl $(32-(v)), src, dst; + +#define addl2(a, out) \ + leal (a, out), out; + +/* Round function macros. */ + +#define GG1(x, y, z, o, t) \ + movl x, o; \ + xorl y, o; \ + xorl z, o; + +#define FF1(x, y, z, o, t) GG1(x, y, z, o, t) + +#define GG2(x, y, z, o, t) \ + andnl z, x, o; \ + movl y, t; \ + andl x, t; \ + addl2(t, o); + +#define FF2(x, y, z, o, t) \ + movl y, o; \ + xorl x, o; \ + movl y, t; \ + andl x, t; \ + andl z, o; \ + xorl t, o; + +#define R(i, a, b, c, d, e, f, g, h, round, widx, wtype) \ + /* rol(a, 12) => t0 */ \ + roll3mov(12, a, t0); /* rorxl here would reduce perf by 6% on zen3 */ \ + /* rol (t0 + e + t), 7) => t1 */ \ + leal K##round(t0, e, 1), t1; \ + roll2(7, t1); \ + /* h + w1 => h */ \ + addl wtype##_W1_ADDR(round, widx), h; \ + /* h + t1 => h */ \ + addl2(t1, h); \ + /* t1 ^ t0 => t0 */ \ + xorl t1, t0; \ + /* w1w2 + d => d */ \ + addl wtype##_W1W2_ADDR(round, widx), d; \ + /* FF##i(a,b,c) => t1 */ \ + FF##i(a, b, c, t1, t2); \ + /* d + t1 => d */ \ + addl2(t1, d); \ + /* GG#i(e,f,g) => t2 */ \ + GG##i(e, f, g, t2, t1); \ + /* h + t2 => h */ \ + addl2(t2, h); \ + /* rol (f, 19) => f */ \ + roll2(19, f); \ + /* d + t0 => d */ \ + addl2(t0, d); \ + /* rol (b, 9) => b */ \ + roll2(9, b); \ + /* P0(h) => h */ \ + roll3(9, h, t2); \ + roll3(17, h, t1); \ + xorl t2, h; \ + xorl t1, h; + +#define R1(a, b, c, d, e, f, g, h, round, widx, wtype) \ + R(1, a, b, c, d, e, f, g, h, round, widx, wtype) + +#define R2(a, b, c, d, e, f, g, h, round, widx, wtype) \ + R(2, a, b, c, d, e, f, g, h, round, widx, wtype) + +/* Input expansion macros. */ + +/* Byte-swapped input address. */ +#define IW_W_ADDR(round, widx, offs) \ + (STACK_W + ((round) / 4) * 64 + (offs) + ((widx) * 4))(%rsp) + +/* Expanded input address. */ +#define XW_W_ADDR(round, widx, offs) \ + (STACK_W + ((((round) / 3) - 4) % 2) * 64 + (offs) + ((widx) * 4))(%rsp) + +/* Rounds 1-12, byte-swapped input block addresses. */ +#define IW_W1_ADDR(round, widx) IW_W_ADDR(round, widx, 0) +#define IW_W1W2_ADDR(round, widx) IW_W_ADDR(round, widx, 32) + +/* Rounds 1-12, expanded input block addresses. */ +#define XW_W1_ADDR(round, widx) XW_W_ADDR(round, widx, 0) +#define XW_W1W2_ADDR(round, widx) XW_W_ADDR(round, widx, 32) + +/* Input block loading. 
*/ +#define LOAD_W_XMM_1() \ + vmovdqu 0*16(RDATA), XTMP0; /* XTMP0: w3, w2, w1, w0 */ \ + vmovdqu 1*16(RDATA), XTMP1; /* XTMP1: w7, w6, w5, w4 */ \ + vmovdqu 2*16(RDATA), XTMP2; /* XTMP2: w11, w10, w9, w8 */ \ + vmovdqu 3*16(RDATA), XTMP3; /* XTMP3: w15, w14, w13, w12 */ \ + vpshufb BSWAP_REG, XTMP0, XTMP0; \ + vpshufb BSWAP_REG, XTMP1, XTMP1; \ + vpshufb BSWAP_REG, XTMP2, XTMP2; \ + vpshufb BSWAP_REG, XTMP3, XTMP3; \ + vpxor XTMP0, XTMP1, XTMP4; \ + vpxor XTMP1, XTMP2, XTMP5; \ + vpxor XTMP2, XTMP3, XTMP6; \ + leaq 64(RDATA), RDATA; \ + vmovdqa XTMP0, IW_W1_ADDR(0, 0); \ + vmovdqa XTMP4, IW_W1W2_ADDR(0, 0); \ + vmovdqa XTMP1, IW_W1_ADDR(4, 0); \ + vmovdqa XTMP5, IW_W1W2_ADDR(4, 0); + +#define LOAD_W_XMM_2() \ + vmovdqa XTMP2, IW_W1_ADDR(8, 0); \ + vmovdqa XTMP6, IW_W1W2_ADDR(8, 0); + +#define LOAD_W_XMM_3() \ + vpshufd $0b00000000, XTMP0, W0; /* W0: xx, w0, xx, xx */ \ + vpshufd $0b11111001, XTMP0, W1; /* W1: xx, w3, w2, w1 */ \ + vmovdqa XTMP1, W2; /* W2: xx, w6, w5, w4 */ \ + vpalignr $12, XTMP1, XTMP2, W3; /* W3: xx, w9, w8, w7 */ \ + vpalignr $8, XTMP2, XTMP3, W4; /* W4: xx, w12, w11, w10 */ \ + vpshufd $0b11111001, XTMP3, W5; /* W5: xx, w15, w14, w13 */ + +/* Message scheduling. Note: 3 words per XMM register. */ +#define SCHED_W_0(round, w0, w1, w2, w3, w4, w5) \ + /* Load (w[i - 16]) => XTMP0 */ \ + vpshufd $0b10111111, w0, XTMP0; \ + vpalignr $12, XTMP0, w1, XTMP0; /* XTMP0: xx, w2, w1, w0 */ \ + /* Load (w[i - 13]) => XTMP1 */ \ + vpshufd $0b10111111, w1, XTMP1; \ + vpalignr $12, XTMP1, w2, XTMP1; \ + /* w[i - 9] == w3 */ \ + /* XMM3 ^ XTMP0 => XTMP0 */ \ + vpxor w3, XTMP0, XTMP0; + +#define SCHED_W_1(round, w0, w1, w2, w3, w4, w5) \ + /* w[i - 3] == w5 */ \ + /* rol(XMM5, 15) ^ XTMP0 => XTMP0 */ \ + vpslld $15, w5, XTMP2; \ + vpsrld $(32-15), w5, XTMP3; \ + vpxor XTMP2, XTMP3, XTMP3; \ + vpxor XTMP3, XTMP0, XTMP0; \ + /* rol(XTMP1, 7) => XTMP1 */ \ + vpslld $7, XTMP1, XTMP5; \ + vpsrld $(32-7), XTMP1, XTMP1; \ + vpxor XTMP5, XTMP1, XTMP1; \ + /* XMM4 ^ XTMP1 => XTMP1 */ \ + vpxor w4, XTMP1, XTMP1; \ + /* w[i - 6] == XMM4 */ \ + /* P1(XTMP0) ^ XTMP1 => XMM0 */ \ + vpslld $15, XTMP0, XTMP5; \ + vpsrld $(32-15), XTMP0, XTMP6; \ + vpslld $23, XTMP0, XTMP2; \ + vpsrld $(32-23), XTMP0, XTMP3; \ + vpxor XTMP0, XTMP1, XTMP1; \ + vpxor XTMP6, XTMP5, XTMP5; \ + vpxor XTMP3, XTMP2, XTMP2; \ + vpxor XTMP2, XTMP5, XTMP5; \ + vpxor XTMP5, XTMP1, w0; + +#define SCHED_W_2(round, w0, w1, w2, w3, w4, w5) \ + /* W1 in XMM12 */ \ + vpshufd $0b10111111, w4, XTMP4; \ + vpalignr $12, XTMP4, w5, XTMP4; \ + vmovdqa XTMP4, XW_W1_ADDR((round), 0); \ + /* W1 ^ W2 => XTMP1 */ \ + vpxor w0, XTMP4, XTMP1; \ + vmovdqa XTMP1, XW_W1W2_ADDR((round), 0); + + +.section .rodata.cst16, "aM", @progbits, 16 +.align 16 + +.Lbe32mask: + .long 0x00010203, 0x04050607, 0x08090a0b, 0x0c0d0e0f + +.text + +/* + * Transform nblocks*64 bytes (nblocks*16 32-bit words) at DATA. 
+ * + * void sm3_transform_avx(struct sm3_state *state, + * const u8 *data, int nblocks); + */ +.align 16 +SYM_FUNC_START(sm3_transform_avx) + /* input: + * %rdi: ctx, CTX + * %rsi: data (64*nblks bytes) + * %rdx: nblocks + */ + vzeroupper; + + pushq %rbp; + movq %rsp, %rbp; + + movq %rdx, RNBLKS; + + subq $STACK_SIZE, %rsp; + andq $(~63), %rsp; + + movq %rbx, (STACK_REG_SAVE + 0 * 8)(%rsp); + movq %r15, (STACK_REG_SAVE + 1 * 8)(%rsp); + movq %r14, (STACK_REG_SAVE + 2 * 8)(%rsp); + movq %r13, (STACK_REG_SAVE + 3 * 8)(%rsp); + movq %r12, (STACK_REG_SAVE + 4 * 8)(%rsp); + + vmovdqa .Lbe32mask (%rip), BSWAP_REG; + + /* Get the values of the chaining variables. */ + movl state_h0(RSTATE), a; + movl state_h1(RSTATE), b; + movl state_h2(RSTATE), c; + movl state_h3(RSTATE), d; + movl state_h4(RSTATE), e; + movl state_h5(RSTATE), f; + movl state_h6(RSTATE), g; + movl state_h7(RSTATE), h; + +.align 16 +.Loop: + /* Load data part1. */ + LOAD_W_XMM_1(); + + leaq -1(RNBLKS), RNBLKS; + + /* Transform 0-3 + Load data part2. */ + R1(a, b, c, d, e, f, g, h, 0, 0, IW); LOAD_W_XMM_2(); + R1(d, a, b, c, h, e, f, g, 1, 1, IW); + R1(c, d, a, b, g, h, e, f, 2, 2, IW); + R1(b, c, d, a, f, g, h, e, 3, 3, IW); LOAD_W_XMM_3(); + + /* Transform 4-7 + Precalc 12-14. */ + R1(a, b, c, d, e, f, g, h, 4, 0, IW); + R1(d, a, b, c, h, e, f, g, 5, 1, IW); + R1(c, d, a, b, g, h, e, f, 6, 2, IW); SCHED_W_0(12, W0, W1, W2, W3, W4, W5); + R1(b, c, d, a, f, g, h, e, 7, 3, IW); SCHED_W_1(12, W0, W1, W2, W3, W4, W5); + + /* Transform 8-11 + Precalc 12-17. */ + R1(a, b, c, d, e, f, g, h, 8, 0, IW); SCHED_W_2(12, W0, W1, W2, W3, W4, W5); + R1(d, a, b, c, h, e, f, g, 9, 1, IW); SCHED_W_0(15, W1, W2, W3, W4, W5, W0); + R1(c, d, a, b, g, h, e, f, 10, 2, IW); SCHED_W_1(15, W1, W2, W3, W4, W5, W0); + R1(b, c, d, a, f, g, h, e, 11, 3, IW); SCHED_W_2(15, W1, W2, W3, W4, W5, W0); + + /* Transform 12-14 + Precalc 18-20 */ + R1(a, b, c, d, e, f, g, h, 12, 0, XW); SCHED_W_0(18, W2, W3, W4, W5, W0, W1); + R1(d, a, b, c, h, e, f, g, 13, 1, XW); SCHED_W_1(18, W2, W3, W4, W5, W0, W1); + R1(c, d, a, b, g, h, e, f, 14, 2, XW); SCHED_W_2(18, W2, W3, W4, W5, W0, W1); + + /* Transform 15-17 + Precalc 21-23 */ + R1(b, c, d, a, f, g, h, e, 15, 0, XW); SCHED_W_0(21, W3, W4, W5, W0, W1, W2); + R2(a, b, c, d, e, f, g, h, 16, 1, XW); SCHED_W_1(21, W3, W4, W5, W0, W1, W2); + R2(d, a, b, c, h, e, f, g, 17, 2, XW); SCHED_W_2(21, W3, W4, W5, W0, W1, W2); + + /* Transform 18-20 + Precalc 24-26 */ + R2(c, d, a, b, g, h, e, f, 18, 0, XW); SCHED_W_0(24, W4, W5, W0, W1, W2, W3); + R2(b, c, d, a, f, g, h, e, 19, 1, XW); SCHED_W_1(24, W4, W5, W0, W1, W2, W3); + R2(a, b, c, d, e, f, g, h, 20, 2, XW); SCHED_W_2(24, W4, W5, W0, W1, W2, W3); + + /* Transform 21-23 + Precalc 27-29 */ + R2(d, a, b, c, h, e, f, g, 21, 0, XW); SCHED_W_0(27, W5, W0, W1, W2, W3, W4); + R2(c, d, a, b, g, h, e, f, 22, 1, XW); SCHED_W_1(27, W5, W0, W1, W2, W3, W4); + R2(b, c, d, a, f, g, h, e, 23, 2, XW); SCHED_W_2(27, W5, W0, W1, W2, W3, W4); + + /* Transform 24-26 + Precalc 30-32 */ + R2(a, b, c, d, e, f, g, h, 24, 0, XW); SCHED_W_0(30, W0, W1, W2, W3, W4, W5); + R2(d, a, b, c, h, e, f, g, 25, 1, XW); SCHED_W_1(30, W0, W1, W2, W3, W4, W5); + R2(c, d, a, b, g, h, e, f, 26, 2, XW); SCHED_W_2(30, W0, W1, W2, W3, W4, W5); + + /* Transform 27-29 + Precalc 33-35 */ + R2(b, c, d, a, f, g, h, e, 27, 0, XW); SCHED_W_0(33, W1, W2, W3, W4, W5, W0); + R2(a, b, c, d, e, f, g, h, 28, 1, XW); SCHED_W_1(33, W1, W2, W3, W4, W5, W0); + R2(d, a, b, c, h, e, f, g, 29, 2, XW); SCHED_W_2(33, W1, W2, W3, W4, W5, W0); + 
+ /* Transform 30-32 + Precalc 36-38 */ + R2(c, d, a, b, g, h, e, f, 30, 0, XW); SCHED_W_0(36, W2, W3, W4, W5, W0, W1); + R2(b, c, d, a, f, g, h, e, 31, 1, XW); SCHED_W_1(36, W2, W3, W4, W5, W0, W1); + R2(a, b, c, d, e, f, g, h, 32, 2, XW); SCHED_W_2(36, W2, W3, W4, W5, W0, W1); + + /* Transform 33-35 + Precalc 39-41 */ + R2(d, a, b, c, h, e, f, g, 33, 0, XW); SCHED_W_0(39, W3, W4, W5, W0, W1, W2); + R2(c, d, a, b, g, h, e, f, 34, 1, XW); SCHED_W_1(39, W3, W4, W5, W0, W1, W2); + R2(b, c, d, a, f, g, h, e, 35, 2, XW); SCHED_W_2(39, W3, W4, W5, W0, W1, W2); + + /* Transform 36-38 + Precalc 42-44 */ + R2(a, b, c, d, e, f, g, h, 36, 0, XW); SCHED_W_0(42, W4, W5, W0, W1, W2, W3); + R2(d, a, b, c, h, e, f, g, 37, 1, XW); SCHED_W_1(42, W4, W5, W0, W1, W2, W3); + R2(c, d, a, b, g, h, e, f, 38, 2, XW); SCHED_W_2(42, W4, W5, W0, W1, W2, W3); + + /* Transform 39-41 + Precalc 45-47 */ + R2(b, c, d, a, f, g, h, e, 39, 0, XW); SCHED_W_0(45, W5, W0, W1, W2, W3, W4); + R2(a, b, c, d, e, f, g, h, 40, 1, XW); SCHED_W_1(45, W5, W0, W1, W2, W3, W4); + R2(d, a, b, c, h, e, f, g, 41, 2, XW); SCHED_W_2(45, W5, W0, W1, W2, W3, W4); + + /* Transform 42-44 + Precalc 48-50 */ + R2(c, d, a, b, g, h, e, f, 42, 0, XW); SCHED_W_0(48, W0, W1, W2, W3, W4, W5); + R2(b, c, d, a, f, g, h, e, 43, 1, XW); SCHED_W_1(48, W0, W1, W2, W3, W4, W5); + R2(a, b, c, d, e, f, g, h, 44, 2, XW); SCHED_W_2(48, W0, W1, W2, W3, W4, W5); + + /* Transform 45-47 + Precalc 51-53 */ + R2(d, a, b, c, h, e, f, g, 45, 0, XW); SCHED_W_0(51, W1, W2, W3, W4, W5, W0); + R2(c, d, a, b, g, h, e, f, 46, 1, XW); SCHED_W_1(51, W1, W2, W3, W4, W5, W0); + R2(b, c, d, a, f, g, h, e, 47, 2, XW); SCHED_W_2(51, W1, W2, W3, W4, W5, W0); + + /* Transform 48-50 + Precalc 54-56 */ + R2(a, b, c, d, e, f, g, h, 48, 0, XW); SCHED_W_0(54, W2, W3, W4, W5, W0, W1); + R2(d, a, b, c, h, e, f, g, 49, 1, XW); SCHED_W_1(54, W2, W3, W4, W5, W0, W1); + R2(c, d, a, b, g, h, e, f, 50, 2, XW); SCHED_W_2(54, W2, W3, W4, W5, W0, W1); + + /* Transform 51-53 + Precalc 57-59 */ + R2(b, c, d, a, f, g, h, e, 51, 0, XW); SCHED_W_0(57, W3, W4, W5, W0, W1, W2); + R2(a, b, c, d, e, f, g, h, 52, 1, XW); SCHED_W_1(57, W3, W4, W5, W0, W1, W2); + R2(d, a, b, c, h, e, f, g, 53, 2, XW); SCHED_W_2(57, W3, W4, W5, W0, W1, W2); + + /* Transform 54-56 + Precalc 60-62 */ + R2(c, d, a, b, g, h, e, f, 54, 0, XW); SCHED_W_0(60, W4, W5, W0, W1, W2, W3); + R2(b, c, d, a, f, g, h, e, 55, 1, XW); SCHED_W_1(60, W4, W5, W0, W1, W2, W3); + R2(a, b, c, d, e, f, g, h, 56, 2, XW); SCHED_W_2(60, W4, W5, W0, W1, W2, W3); + + /* Transform 57-59 + Precalc 63 */ + R2(d, a, b, c, h, e, f, g, 57, 0, XW); SCHED_W_0(63, W5, W0, W1, W2, W3, W4); + R2(c, d, a, b, g, h, e, f, 58, 1, XW); + R2(b, c, d, a, f, g, h, e, 59, 2, XW); SCHED_W_1(63, W5, W0, W1, W2, W3, W4); + + /* Transform 60-62 + Precalc 63 */ + R2(a, b, c, d, e, f, g, h, 60, 0, XW); + R2(d, a, b, c, h, e, f, g, 61, 1, XW); SCHED_W_2(63, W5, W0, W1, W2, W3, W4); + R2(c, d, a, b, g, h, e, f, 62, 2, XW); + + /* Transform 63 */ + R2(b, c, d, a, f, g, h, e, 63, 0, XW); + + /* Update the chaining variables. 
*/ + xorl state_h0(RSTATE), a; + xorl state_h1(RSTATE), b; + xorl state_h2(RSTATE), c; + xorl state_h3(RSTATE), d; + movl a, state_h0(RSTATE); + movl b, state_h1(RSTATE); + movl c, state_h2(RSTATE); + movl d, state_h3(RSTATE); + xorl state_h4(RSTATE), e; + xorl state_h5(RSTATE), f; + xorl state_h6(RSTATE), g; + xorl state_h7(RSTATE), h; + movl e, state_h4(RSTATE); + movl f, state_h5(RSTATE); + movl g, state_h6(RSTATE); + movl h, state_h7(RSTATE); + + cmpq $0, RNBLKS; + jne .Loop; + + vzeroall; + + movq (STACK_REG_SAVE + 0 * 8)(%rsp), %rbx; + movq (STACK_REG_SAVE + 1 * 8)(%rsp), %r15; + movq (STACK_REG_SAVE + 2 * 8)(%rsp), %r14; + movq (STACK_REG_SAVE + 3 * 8)(%rsp), %r13; + movq (STACK_REG_SAVE + 4 * 8)(%rsp), %r12; + + vmovdqa %xmm0, IW_W1_ADDR(0, 0); + vmovdqa %xmm0, IW_W1W2_ADDR(0, 0); + vmovdqa %xmm0, IW_W1_ADDR(4, 0); + vmovdqa %xmm0, IW_W1W2_ADDR(4, 0); + vmovdqa %xmm0, IW_W1_ADDR(8, 0); + vmovdqa %xmm0, IW_W1W2_ADDR(8, 0); + + movq %rbp, %rsp; + popq %rbp; + ret; +SYM_FUNC_END(sm3_transform_avx) diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c new file mode 100644 index 000000000000..661b6f22ffcd --- /dev/null +++ b/arch/x86/crypto/sm3_avx_glue.c @@ -0,0 +1,134 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * SM3 Secure Hash Algorithm, AVX assembler accelerated. + * specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02 + * + * Copyright (C) 2021 Tianjia Zhang tianjia.zhang@linux.alibaba.com + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <crypto/internal/hash.h> +#include <crypto/internal/simd.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/types.h> +#include <crypto/sm3.h> +#include <crypto/sm3_base.h> +#include <asm/simd.h> + +asmlinkage void sm3_transform_avx(struct sm3_state *state, + const u8 *data, int nblocks); + +static int sm3_avx_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + struct sm3_state *sctx = shash_desc_ctx(desc); + + if (!crypto_simd_usable() || + (sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) { + sm3_update(sctx, data, len); + return 0; + } + + /* + * Make sure struct sm3_state begins directly with the SM3 + * 256-bit internal state, as this is what the asm functions expect. 
+ */ + BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0); + + kernel_fpu_begin(); + sm3_base_do_update(desc, data, len, sm3_transform_avx); + kernel_fpu_end(); + + return 0; +} + +static int sm3_avx_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + if (!crypto_simd_usable()) { + struct sm3_state *sctx = shash_desc_ctx(desc); + + if (len) + sm3_update(sctx, data, len); + + sm3_final(sctx, out); + return 0; + } + + kernel_fpu_begin(); + if (len) + sm3_base_do_update(desc, data, len, sm3_transform_avx); + sm3_base_do_finalize(desc, sm3_transform_avx); + kernel_fpu_end(); + + return sm3_base_finish(desc, out); +} + +static int sm3_avx_final(struct shash_desc *desc, u8 *out) +{ + if (!crypto_simd_usable()) { + sm3_final(shash_desc_ctx(desc), out); + return 0; + } + + kernel_fpu_begin(); + sm3_base_do_finalize(desc, sm3_transform_avx); + kernel_fpu_end(); + + return sm3_base_finish(desc, out); +} + +static struct shash_alg sm3_avx_alg = { + .digestsize = SM3_DIGEST_SIZE, + .init = sm3_base_init, + .update = sm3_avx_update, + .final = sm3_avx_final, + .finup = sm3_avx_finup, + .descsize = sizeof(struct sm3_state), + .base = { + .cra_name = "sm3", + .cra_driver_name = "sm3-avx", + .cra_priority = 300, + .cra_blocksize = SM3_BLOCK_SIZE, + .cra_module = THIS_MODULE, + } +}; + +static int __init sm3_avx_mod_init(void) +{ + const char *feature_name; + + if (!boot_cpu_has(X86_FEATURE_AVX)) { + pr_info("AVX instruction are not detected.\n"); + return -ENODEV; + } + + if (!boot_cpu_has(X86_FEATURE_BMI2)) { + pr_info("BMI2 instruction are not detected.\n"); + return -ENODEV; + } + + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, + &feature_name)) { + pr_info("CPU feature '%s' is not supported.\n", feature_name); + return -ENODEV; + } + + return crypto_register_shash(&sm3_avx_alg); +} + +static void __exit sm3_avx_mod_exit(void) +{ + crypto_unregister_shash(&sm3_avx_alg); +} + +module_init(sm3_avx_mod_init); +module_exit(sm3_avx_mod_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Tianjia Zhang tianjia.zhang@linux.alibaba.com"); +MODULE_DESCRIPTION("SM3 Secure Hash Algorithm, AVX assembler accelerated"); +MODULE_ALIAS_CRYPTO("sm3"); +MODULE_ALIAS_CRYPTO("sm3-avx"); diff --git a/crypto/Kconfig b/crypto/Kconfig index 48dffd1e1b20..2d2473806567 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -1050,6 +1050,19 @@ config CRYPTO_SM3 http://www.oscca.gov.cn/UpFile/20101222141857786.pdf https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash
+config CRYPTO_SM3_AVX_X86_64 + tristate "SM3 digest algorithm (x86_64/AVX)" + depends on X86 && 64BIT + select CRYPTO_HASH + select CRYPTO_LIB_SM3 + help + SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3). + It is part of the Chinese Commercial Cryptography suite. This is + SM3 optimized implementation using Advanced Vector Extensions (AVX) + when available. + + If unsure, say N. + config CRYPTO_STREEBOG tristate "Streebog Hash Function" select CRYPTO_HASH
From: Tianjia Zhang tianjia.zhang@linux.alibaba.com
mainline inclusion from mainline-v5.18-rc1 commit ba2c149d0812cee653a186c9cfe451699b211c91 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I50ABL
--------------------------------
tcrypt supports testing of SM3 hash algorithms that use AVX instruction acceleration.
In order to add the sm3 asynchronous test at an appropriate position, shift the multi-buffer testcase sequence numbers backward so that they start from 450.
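As a usage sketch (assuming tcrypt is built as a module), the renumbered tests are reached like this:

modprobe tcrypt mode=422 sec=1           # new sm3 ahash speed test
modprobe tcrypt mode=453 sec=1 num_mb=8  # multi-buffer sm3, formerly mode 425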
Signed-off-by: Tianjia Zhang tianjia.zhang@linux.alibaba.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: GUO Zihua guozihua@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- crypto/tcrypt.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c index 41b3443f56a0..50efdfbe71e3 100644 --- a/crypto/tcrypt.c +++ b/crypto/tcrypt.c @@ -2619,31 +2619,35 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb) if (mode > 400 && mode < 500) break; fallthrough; case 422: + test_ahash_speed("sm3", sec, generic_hash_speed_template); + if (mode > 400 && mode < 500) break; + fallthrough; + case 450: test_mb_ahash_speed("sha1", sec, generic_hash_speed_template, num_mb); if (mode > 400 && mode < 500) break; fallthrough; - case 423: + case 451: test_mb_ahash_speed("sha256", sec, generic_hash_speed_template, num_mb); if (mode > 400 && mode < 500) break; fallthrough; - case 424: + case 452: test_mb_ahash_speed("sha512", sec, generic_hash_speed_template, num_mb); if (mode > 400 && mode < 500) break; fallthrough; - case 425: + case 453: test_mb_ahash_speed("sm3", sec, generic_hash_speed_template, num_mb); if (mode > 400 && mode < 500) break; fallthrough; - case 426: + case 454: test_mb_ahash_speed("streebog256", sec, generic_hash_speed_template, num_mb); if (mode > 400 && mode < 500) break; fallthrough; - case 427: + case 455: test_mb_ahash_speed("streebog512", sec, generic_hash_speed_template, num_mb); if (mode > 400 && mode < 500) break;
From: Xu Qiang xuqiang36@huawei.com
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I554T5 CVE: NA
------------
On the Ascend platform, GICRs that are not managed by the OS also need to be initialized by the OS.
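As a usage note grounded in the hunks below: the feature is gated behind CONFIG_ASCEND_INIT_ALL_GICR (under ASCEND_FEATURES, default n) and remains inert unless the kernel is booted with the init_all_gicr command-line parameter, which the __setup("init_all_gicr", ...) handler turns into init_all_gicr = true.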
Signed-off-by: Xu Qiang xuqiang36@huawei.com Reviewed-by: Liao Chang liaochang1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/irqchip/Kconfig | 10 ++ drivers/irqchip/irq-gic-v3-its.c | 217 ++++++++++++++++++++++++++++- drivers/irqchip/irq-gic-v3.c | 102 +++++++++++++- include/linux/irqchip/arm-gic-v3.h | 5 + 4 files changed, 326 insertions(+), 8 deletions(-)
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig index 214d7fd1fdd1..375ba3b30b72 100644 --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -178,6 +178,16 @@ config HISILICON_IRQ_MBIGEN select ARM_GIC_V3 select ARM_GIC_V3_ITS
+if ASCEND_FEATURES + +config ASCEND_INIT_ALL_GICR + bool "Enable init all GICR for Ascend" + depends on ARM_GIC_V3 + depends on ARM_GIC_V3_ITS + default n + +endif + config IMGPDC_IRQ bool select GENERIC_IRQ_CHIP diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 6b46cfdcb402..e04e0166eac3 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -195,6 +195,14 @@ static DEFINE_RAW_SPINLOCK(vmovp_lock);
static DEFINE_IDA(its_vpeid_ida);
+#ifdef CONFIG_ASCEND_INIT_ALL_GICR +static bool init_all_gicr; +static int nr_gicr; +#else +#define init_all_gicr false +#define nr_gicr 0 +#endif + #define gic_data_rdist() (raw_cpu_ptr(gic_rdists->rdist)) #define gic_data_rdist_cpu(cpu) (per_cpu_ptr(gic_rdists->rdist, cpu)) #define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base) @@ -1640,6 +1648,26 @@ static int its_select_cpu(struct irq_data *d, return cpu; }
+#ifdef CONFIG_ASCEND_INIT_ALL_GICR +static int its_select_cpu_other(const struct cpumask *mask_val) +{ + int cpu; + + if (!init_all_gicr) + return -EINVAL; + + cpu = find_first_bit(cpumask_bits(mask_val), NR_CPUS); + if (cpu >= nr_gicr) + cpu = -EINVAL; + return cpu; +} +#else +static int its_select_cpu_other(const struct cpumask *mask_val) +{ + return -EINVAL; +} +#endif + static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val, bool force) { @@ -1661,6 +1689,9 @@ static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val, cpu = cpumask_pick_least_loaded(d, mask_val);
if (cpu < 0 || cpu >= nr_cpu_ids) + cpu = its_select_cpu_other(mask_val); + + if (cpu < 0) goto err;
/* don't set the affinity when the target cpu is same as current one */ @@ -2928,8 +2959,12 @@ static int allocate_vpe_l1_table(void) static int its_alloc_collections(struct its_node *its) { int i; + int cpu_nr = nr_cpu_ids; + + if (init_all_gicr) + cpu_nr = CONFIG_NR_CPUS;
- its->collections = kcalloc(nr_cpu_ids, sizeof(*its->collections), + its->collections = kcalloc(cpu_nr, sizeof(*its->collections), GFP_KERNEL); if (!its->collections) return -ENOMEM; @@ -3225,6 +3260,186 @@ static void its_cpu_init_collections(void) raw_spin_unlock(&its_lock); }
+#ifdef CONFIG_ASCEND_INIT_ALL_GICR +void its_set_gicr_nr(int nr) +{ + nr_gicr = nr; +} + +static int __init its_enable_init_all_gicr(char *str) +{ + init_all_gicr = true; + return 1; +} + +__setup("init_all_gicr", its_enable_init_all_gicr); + +bool its_init_all_gicr(void) +{ + return init_all_gicr; +} + +static void its_cpu_init_lpis_others(void __iomem *rbase, int cpu) +{ + struct page *pend_page; + phys_addr_t paddr; + u64 val, tmp; + + if (!init_all_gicr) + return; + + val = readl_relaxed(rbase + GICR_CTLR); + if ((gic_rdists->flags & RDIST_FLAGS_RD_TABLES_PREALLOCATED) && + (val & GICR_CTLR_ENABLE_LPIS)) { + /* + * Check that we get the same property table on all + * RDs. If we don't, this is hopeless. + */ + paddr = gicr_read_propbaser(rbase + GICR_PROPBASER); + paddr &= GENMASK_ULL(51, 12); + if (WARN_ON(gic_rdists->prop_table_pa != paddr)) + add_taint(TAINT_CRAP, LOCKDEP_STILL_OK); + + paddr = gicr_read_pendbaser(rbase + GICR_PENDBASER); + paddr &= GENMASK_ULL(51, 16); + + WARN_ON(!gic_check_reserved_range(paddr, LPI_PENDBASE_SZ)); + + goto out; + } + + /* If we didn't allocate the pending table yet, do it now */ + pend_page = its_allocate_pending_table(GFP_NOWAIT); + if (!pend_page) { + pr_err("Failed to allocate PENDBASE for GICR:%p\n", rbase); + return; + } + + paddr = page_to_phys(pend_page); + pr_info("GICR:%p using LPI pending table @%pa\n", + rbase, &paddr); + + WARN_ON(gic_reserve_range(paddr, LPI_PENDBASE_SZ)); + + /* Disable LPIs */ + val = readl_relaxed(rbase + GICR_CTLR); + val &= ~GICR_CTLR_ENABLE_LPIS; + writel_relaxed(val, rbase + GICR_CTLR); + + /* + * Make sure any change to the table is observable by the GIC. + */ + dsb(sy); + + /* set PROPBASE */ + val = (gic_rdists->prop_table_pa | + GICR_PROPBASER_InnerShareable | + GICR_PROPBASER_RaWaWb | + ((LPI_NRBITS - 1) & GICR_PROPBASER_IDBITS_MASK)); + + gicr_write_propbaser(val, rbase + GICR_PROPBASER); + tmp = gicr_read_propbaser(rbase + GICR_PROPBASER); + + if ((tmp ^ val) & GICR_PROPBASER_SHAREABILITY_MASK) { + if (!(tmp & GICR_PROPBASER_SHAREABILITY_MASK)) { + /* + * The HW reports non-shareable, we must + * remove the cacheability attributes as + * well. + */ + val &= ~(GICR_PROPBASER_SHAREABILITY_MASK | + GICR_PROPBASER_CACHEABILITY_MASK); + val |= GICR_PROPBASER_nC; + gicr_write_propbaser(val, rbase + GICR_PROPBASER); + } + pr_info_once("GIC: using cache flushing for LPI property table\n"); + gic_rdists->flags |= RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING; + } + + /* set PENDBASE */ + val = (page_to_phys(pend_page) | + GICR_PENDBASER_InnerShareable | + GICR_PENDBASER_RaWaWb); + + gicr_write_pendbaser(val, rbase + GICR_PENDBASER); + tmp = gicr_read_pendbaser(rbase + GICR_PENDBASER); + + if (!(tmp & GICR_PENDBASER_SHAREABILITY_MASK)) { + /* + * The HW reports non-shareable, we must remove the + * cacheability attributes as well. + */ + val &= ~(GICR_PENDBASER_SHAREABILITY_MASK | + GICR_PENDBASER_CACHEABILITY_MASK); + val |= GICR_PENDBASER_nC; + gicr_write_pendbaser(val, rbase + GICR_PENDBASER); + } + + /* Enable LPIs */ + val = readl_relaxed(rbase + GICR_CTLR); + val |= GICR_CTLR_ENABLE_LPIS; + writel_relaxed(val, rbase + GICR_CTLR); + + /* Make sure the GIC has seen the above */ + dsb(sy); +out: + pr_info("GICv3: CPU%d: using %s LPI pending table @%pa\n", + cpu, pend_page ? 
"allocated" : "reserved", &paddr); +} + +static void its_cpu_init_collection_others(void __iomem *rbase, + phys_addr_t phys_base, int cpu) +{ + struct its_node *its; + + if (!init_all_gicr) + return; + + raw_spin_lock(&its_lock); + + list_for_each_entry(its, &its_nodes, entry) { + u64 target; + + /* + * We now have to bind each collection to its target + * redistributor. + */ + if (gic_read_typer(its->base + GITS_TYPER) & GITS_TYPER_PTA) { + /* + * This ITS wants the physical address of the + * redistributor. + */ + target = phys_base; + } else { + /* + * This ITS wants a linear CPU number. + */ + target = gic_read_typer(rbase + GICR_TYPER); + target = GICR_TYPER_CPU_NUMBER(target) << 16; + } + + /* Perform collection mapping */ + its->collections[cpu].target_address = target; + its->collections[cpu].col_id = cpu; + + its_send_mapc(its, &its->collections[cpu], 1); + its_send_invall(its, &its->collections[cpu]); + } + + raw_spin_unlock(&its_lock); +} + +int its_cpu_init_others(void __iomem *base, phys_addr_t phys_base, int cpu) +{ + if (!list_empty(&its_nodes)) { + its_cpu_init_lpis_others(base, cpu); + its_cpu_init_collection_others(base, phys_base, cpu); + } + + return 0; +} +#endif + static struct its_device *its_find_device(struct its_node *its, u32 dev_id) { struct its_device *its_dev = NULL, *tmp; diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 5418c398b6a5..aa8facbdf407 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -244,17 +244,11 @@ static u64 __maybe_unused gic_read_iar(void) } #endif
-static void gic_enable_redist(bool enable) +static void __gic_enable_redist(void __iomem *rbase, bool enable) { - void __iomem *rbase; u32 count = 1000000; /* 1s! */ u32 val;
- if (gic_data.flags & FLAGS_WORKAROUND_GICR_WAKER_MSM8996) - return; - - rbase = gic_data_rdist_rd_base(); - val = readl_relaxed(rbase + GICR_WAKER); if (enable) /* Wake up this CPU redistributor */ @@ -281,6 +275,14 @@ static void gic_enable_redist(bool enable) enable ? "wakeup" : "sleep"); }
+static void gic_enable_redist(bool enable) +{ + if (gic_data.flags & FLAGS_WORKAROUND_GICR_WAKER_MSM8996) + return; + + __gic_enable_redist(gic_data_rdist_rd_base(), enable); +} + /* * Routines to disable, enable, EOI and route interrupts */ @@ -1142,6 +1144,89 @@ static void gic_cpu_init(void) gic_cpu_sys_reg_init(); }
+#ifdef CONFIG_ASCEND_INIT_ALL_GICR +static int __gic_compute_nr_gicr(struct redist_region *region, void __iomem *ptr) +{ + static int gicr_nr = 0; + + its_set_gicr_nr(++gicr_nr); + + return 1; +} + +static void gic_compute_nr_gicr(void) +{ + gic_iterate_rdists(__gic_compute_nr_gicr); +} + +static int gic_rdist_cpu(void __iomem *ptr, unsigned int cpu) +{ + unsigned long mpidr = cpu_logical_map(cpu); + u64 typer; + u32 aff; + + /* + * Convert affinity to a 32bit value that can be matched to + * GICR_TYPER bits [63:32]. + */ + aff = (MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24 | + MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16 | + MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8 | + MPIDR_AFFINITY_LEVEL(mpidr, 0)); + + typer = gic_read_typer(ptr + GICR_TYPER); + if ((typer >> 32) == aff) + return 0; + + return 1; +} + +static int gic_rdist_cpus(void __iomem *ptr) +{ + unsigned int i; + + for (i = 0; i < nr_cpu_ids; i++) { + if (gic_rdist_cpu(ptr, i) == 0) + return 0; + } + + return 1; +} + +static int gic_cpu_init_other(struct redist_region *region, void __iomem *ptr) +{ + u64 offset; + phys_addr_t phys_base; + static int cpu = 0; + + if (cpu == 0) + cpu = nr_cpu_ids; + + if (gic_rdist_cpus(ptr) == 1) { + offset = ptr - region->redist_base; + phys_base = region->phys_base + offset; + __gic_enable_redist(ptr, true); + if (gic_dist_supports_lpis()) + its_cpu_init_others(ptr, phys_base, cpu); + cpu++; + } + + return 1; +} + +static void gic_cpu_init_others(void) +{ + if (!its_init_all_gicr()) + return; + + gic_iterate_rdists(gic_cpu_init_other); +} +#else +static inline void gic_compute_nr_gicr(void) {} + +static inline void gic_cpu_init_others(void) {} +#endif + #ifdef CONFIG_SMP
#define MPIDR_TO_SGI_RS(mpidr) (MPIDR_RS(mpidr) << ICC_SGI1R_RS_SHIFT) @@ -1763,6 +1848,7 @@ static int __init gic_init_bases(void __iomem *dist_base, gic_data.rdists.has_vlpis = true; gic_data.rdists.has_direct_lpi = true; gic_data.rdists.has_vpend_valid_dirty = true; + gic_compute_nr_gicr();
if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdists.rdist)) { err = -ENOMEM; @@ -1799,6 +1885,8 @@ static int __init gic_init_bases(void __iomem *dist_base, gicv2m_init(handle, gic_data.domain); }
+ gic_cpu_init_others(); + return 0;
out_free: diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h index f6d092fdb93d..380c83b7a89f 100644 --- a/include/linux/irqchip/arm-gic-v3.h +++ b/include/linux/irqchip/arm-gic-v3.h @@ -689,6 +689,11 @@ struct rdists { struct irq_domain; struct fwnode_handle; int its_cpu_init(void); +#ifdef CONFIG_ASCEND_INIT_ALL_GICR +void its_set_gicr_nr(int nr); +bool its_init_all_gicr(void); +int its_cpu_init_others(void __iomem *base, phys_addr_t phys_base, int idx); +#endif int its_init(struct fwnode_handle *handle, struct rdists *rdists, struct irq_domain *domain); int mbi_init(struct fwnode_handle *fwnode, struct irq_domain *parent);
From: Xu Qiang xuqiang36@huawei.com
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I554T5 CVE: NA
------------
On FPGA, we need to check whether the GICR has been cut; if it has, it cannot be initialized.
Signed-off-by: Xu Qiang xuqiang36@huawei.com Reviewed-by: Liao Chang liaochang1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/irqchip/irq-gic-v3-its.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index e04e0166eac3..54662b36dd6a 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -3390,6 +3390,7 @@ static void its_cpu_init_lpis_others(void __iomem *rbase, int cpu) static void its_cpu_init_collection_others(void __iomem *rbase, phys_addr_t phys_base, int cpu) { + u32 count; struct its_node *its;
if (!init_all_gicr) @@ -3418,6 +3419,32 @@ static void its_cpu_init_collection_others(void __iomem *rbase, target = GICR_TYPER_CPU_NUMBER(target) << 16; }
+ dsb(sy); + + /* In FPGA, We need to check if the gicr has been cut, + * and if it is, it can't be initialized + */ + count = 2000; + while (1) { + if (readl_relaxed(rbase + GICR_SYNCR) == 0) + break; + + count--; + if (!count) { + pr_err("this gicr does not exist, or it's abnormal:%pK\n", + &phys_base); + break; + } + cpu_relax(); + udelay(1); + } + + if (count == 0) + break; + + pr_info("its init other collection table, ITS:%pK, GICR:%pK, coreId:%u\n", + &its->phys_base, &phys_base, cpu); + /* Perform collection mapping */ its->collections[cpu].target_address = target; its->collections[cpu].col_id = cpu;
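As a design note, the open-coded wait above could also be expressed with the iopoll helpers. A sketch, assuming the atomic variant is the right fit here (the collection loop runs under its_lock, a raw spinlock, so sleeping is not an option):

#include <linux/iopoll.h>

/* returns 0 once GICR_SYNCR reads zero, -ETIMEDOUT after ~2ms */
static int gicr_wait_synced(void __iomem *rbase)
{
	u32 val;

	return readl_relaxed_poll_timeout_atomic(rbase + GICR_SYNCR, val,
						 !val, 1, 2000);
}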
From: Duoming Zhou duoming@zju.edu.cn
stable inclusion from stable-v5.10.115 commit 1961c5a688edb53fe3bc25cbda57f47adf12563c category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I584YD CVE: CVE-2022-1734
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit d270453a0d9ec10bb8a802a142fb1b3601a83098 upstream.
There are destructive operations such as nfcmrvl_fw_dnld_abort and gpio_free in nfcmrvl_nci_unregister_dev. Resources such as the firmware and gpio could be released while upper layer functions such as nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame are executing, which leads to double-free, use-after-free and null-ptr-deref bugs.
There are three situations that could lead to double-free bugs.
The first situation is shown below:
(Thread 1)                 |  (Thread 2)
nfcmrvl_fw_dnld_start      |
 ...                       |  nfcmrvl_nci_unregister_dev
 release_firmware()        |    nfcmrvl_fw_dnld_abort
   kfree(fw) //(1)         |      fw_dnld_over
                           |        release_firmware
 ...                       |          kfree(fw) //(2)
                           |        ...
The second situation is shown below:
(Thread 1)                 |  (Thread 2)
nfcmrvl_fw_dnld_start      |
 ...                       |
 mod_timer                 |
 (wait a time)             |
 fw_dnld_timeout           |  nfcmrvl_nci_unregister_dev
   fw_dnld_over            |    nfcmrvl_fw_dnld_abort
     release_firmware      |      fw_dnld_over
       kfree(fw) //(1)     |        release_firmware
 ...                       |          kfree(fw) //(2)
The third situation is shown below:
(Thread 1)                        |  (Thread 2)
nfcmrvl_nci_recv_frame            |
 if(..->fw_download_in_progress)  |
  nfcmrvl_fw_dnld_recv_frame      |
   queue_work                     |
                                  |
 fw_dnld_rx_work                  |  nfcmrvl_nci_unregister_dev
  fw_dnld_over                    |    nfcmrvl_fw_dnld_abort
   release_firmware               |      fw_dnld_over
    kfree(fw) //(1)               |        release_firmware
                                  |          kfree(fw) //(2)
The firmware struct is freed at position (1) and then freed again at position (2).
The crash trace triggered by the PoC looks like this:
BUG: KASAN: double-free or invalid-free in fw_dnld_over
Call Trace:
 kfree
 fw_dnld_over
 nfcmrvl_nci_unregister_dev
 nci_uart_tty_close
 tty_ldisc_kill
 tty_ldisc_hangup
 __tty_hangup.part.0
 tty_release
 ...
What's more, there are also use-after-free and null-ptr-deref bugs in nfcmrvl_fw_dnld_start. If nfcmrvl_nci_unregister_dev frees the firmware struct or the gpio, or sets the members of priv->fw_dnld to NULL, while nfcmrvl_fw_dnld_start is still dereferencing them, UAF or NPD bugs will happen.
This patch reorders the destructive operations to after nci_unregister_device in order to synchronize the cleanup routine with the firmware download routine.

nci_unregister_device is well synchronized. If the device is detaching, the firmware download routine will bail out with an error. If the firmware download routine is executing, nci_unregister_device will wait until it has finished.
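After the reorder, the teardown reads roughly as follows (a sketch assembled from the hunks below; the nfcmrvl_fw_dnld_deinit() call is assumed from context the diff elides):

void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
{
	struct nci_dev *ndev = priv->ndev;

	/* quiesce the NCI core first: once this returns, no new frame
	 * or firmware-download work can reference priv's resources */
	nci_unregister_device(ndev);
	if (priv->ndev->nfc_dev->fw_download_in_progress)
		nfcmrvl_fw_dnld_abort(priv);

	nfcmrvl_fw_dnld_deinit(priv);	/* assumed from elided context */
	if (gpio_is_valid(priv->config.reset_n_io))
		gpio_free(priv->config.reset_n_io);

	nci_free_device(ndev);
	kfree(priv);
}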
Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support") Signed-off-by: Duoming Zhou duoming@zju.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Baisong Zhong zhongbaisong@huawei.com Reviewed-by: Wei Yongjun weiyongjun1@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/nfc/nfcmrvl/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c index 529be35ac178..54d228acc0f5 100644 --- a/drivers/nfc/nfcmrvl/main.c +++ b/drivers/nfc/nfcmrvl/main.c @@ -194,6 +194,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) { struct nci_dev *ndev = priv->ndev;
+ nci_unregister_device(ndev); if (priv->ndev->nfc_dev->fw_download_in_progress) nfcmrvl_fw_dnld_abort(priv);
@@ -202,7 +203,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv) if (gpio_is_valid(priv->config.reset_n_io)) gpio_free(priv->config.reset_n_io);
- nci_unregister_device(ndev); nci_free_device(ndev); kfree(priv); }
From: Hugh Dickins hughd@google.com
mainline inclusion from mainline-v5.15 commit 050dcb5c85bb47f8151175ca5833aa882cc7fe0c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I58GJ0 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
Patch series "huge tmpfs: shmem_is_huge() fixes and cleanups".
A series of huge tmpfs fixes and cleanups.
This patch (of 9):
shmem_fallocate() goes to a lot of trouble to leave its newly allocated pages !Uptodate, partly to identify and undo them on failure, partly to leave the overhead of clearing them until later. But the huge page case did not skip to the end of the extent, walked through the tail pages one by one, and appeared to work just fine: but in doing so, cleared and Uptodated the huge page, so there was no way to undo it on failure.
And by setting Uptodate too soon, it messed up both its nr_falloced and nr_unswapped counts, so that the intended "time to give up" heuristic did not work at all.
Now advance immediately to the end of the huge extent, with a comment on why this is more than just an optimization. But although this speeds up huge tmpfs fallocation, it does leave the clearing until first use, and some users may have come to appreciate slow fallocate but fast first use: if they complain, then we can consider adding a pass to clear at the end.
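To make the index advance in the hunk below concrete, a worked sketch (assuming 4K pages, so HPAGE_PMD_NR == 512; this is commentary, not the kernel code): after allocating the page at index 3 of a huge extent,

index++;					/* 3 -> 4 */
if (PageTransCompound(page)) {
	index = round_up(index, HPAGE_PMD_NR);	/* 4 -> 512: skip the tails */
	/* a 32-bit pgoff_t can wrap to 0 when rounding near the top;
	 * step back so the "index < end" loop still terminates */
	if (!index)
		index--;
}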
Link: https://lkml.kernel.org/r/da632211-8e3e-6b1-aee-ab24734429a0@google.com Link: https://lkml.kernel.org/r/16201bd2-70e-37e2-e89b-5f929430da@google.com Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Signed-off-by: Hugh Dickins hughd@google.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: Shakeel Butt shakeelb@google.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Miaohe Lin linmiaohe@huawei.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Michal Hocko mhocko@suse.com Cc: Rik van Riel riel@surriel.com Cc: Matthew Wilcox willy@infradead.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Chen Wandun chenwandun@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- mm/shmem.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c index 701e9cd02adc..4414d9cbeb98 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2788,7 +2788,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, inode->i_private = &shmem_falloc; spin_unlock(&inode->i_lock);
- for (index = start; index < end; index++) { + for (index = start; index < end; ) { struct page *page;
/* @@ -2811,13 +2811,26 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, goto undone; }
+ index++; + /* + * Here is a more important optimization than it appears: + * a second SGP_FALLOC on the same huge page will clear it, + * making it PageUptodate and un-undoable if we fail later. + */ + if (PageTransCompound(page)) { + index = round_up(index, HPAGE_PMD_NR); + /* Beware 32-bit wraparound */ + if (!index) + index--; + } + /* * Inform shmem_writepage() how far we have reached. * No need for lock or barrier: we have the page lock. */ - shmem_falloc.next++; if (!PageUptodate(page)) - shmem_falloc.nr_falloced++; + shmem_falloc.nr_falloced += index - shmem_falloc.next; + shmem_falloc.next = index;
/* * If !PageUptodate, leave it that way so that freeable pages
From: Hugh Dickins hughd@google.com
mainline inclusion from mainline-v5.15 commit d144bf6205342a4b5fed5d204ae18849a4de741b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I58GJ0 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
A successful shmem_fallocate() guarantees that the extent has been reserved, even beyond i_size when the FALLOC_FL_KEEP_SIZE flag was used. But that guarantee is broken by shmem_unused_huge_shrink()'s attempts to split huge pages and free their excess beyond i_size; and by other uses of split_huge_page() near i_size.
It's sad to add a shmem inode field just for this, but I did not find a better way to keep the guarantee. A flag to say KEEP_SIZE has been used would be cheaper, but I'm averse to unclearable flags. The fallocend field is not perfect either (many disjoint ranges might be fallocated), but good enough; and gains another use later on.
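A worked example of the guarantee, assuming 4K pages: fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 4 << 20) on an empty tmpfs file leaves i_size at 0 but records fallocend = 1024; shmem_fallocend() then returns max(eof, 1024), so split_huge_page() keeps the reserved pages between EOF and index 1024 instead of freeing them.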
Link: https://lkml.kernel.org/r/ca9a146-3a59-6cd3-7f28-e9a044bb1052@google.com Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure") Signed-off-by: Hugh Dickins hughd@google.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Matthew Wilcox willy@infradead.org Cc: Miaohe Lin linmiaohe@huawei.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Rik van Riel riel@surriel.com Cc: Shakeel Butt shakeelb@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Chen Wandun chenwandun@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- include/linux/shmem_fs.h | 13 +++++++++++++ mm/huge_memory.c | 6 ++++-- mm/shmem.c | 15 ++++++++++++++- 3 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index a5a5d1d4d7b1..93240799a404 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -18,6 +18,7 @@ struct shmem_inode_info { unsigned long flags; unsigned long alloced; /* data pages alloced to file */ unsigned long swapped; /* subtotal assigned to swap */ + pgoff_t fallocend; /* highest fallocate endindex */ struct list_head shrinklist; /* shrinkable hpage inodes */ struct list_head swaplist; /* chain of maybes on swap */ struct shared_policy policy; /* NUMA memory alloc policy */ @@ -115,6 +116,18 @@ static inline bool shmem_file(struct file *file) return shmem_mapping(file->f_mapping); }
+/* + * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages + * beyond i_size's notion of EOF, which fallocate has committed to reserving: + * which split_huge_page() must therefore not delete. This use of a single + * "fallocend" per inode errs on the side of not deleting a reservation when + * in doubt: there are plenty of cases when it preserves unreserved pages. + */ +static inline pgoff_t shmem_fallocend(struct inode *inode, pgoff_t eof) +{ + return max(eof, SHMEM_I(inode)->fallocend); +} + extern bool shmem_charge(struct inode *inode, long pages); extern void shmem_uncharge(struct inode *inode, long pages);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 37704a21b3dc..bfe079e294cb 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2501,11 +2501,11 @@ static void __split_huge_page(struct page *page, struct list_head *list,
for (i = nr - 1; i >= 1; i--) { __split_huge_page_tail(head, i, lruvec, list); - /* Some pages can be beyond i_size: drop them from page cache */ + /* Some pages can be beyond EOF: drop them from page cache */ if (head[i].index >= end) { ClearPageDirty(head + i); __delete_from_page_cache(head + i, NULL); - if (IS_ENABLED(CONFIG_SHMEM) && PageSwapBacked(head)) + if (shmem_mapping(head->mapping)) shmem_uncharge(head->mapping->host, 1); put_page(head + i); } else if (!PageAnon(page)) { @@ -2733,6 +2733,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) * head page lock is good enough to serialize the trimming. */ end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); + if (shmem_mapping(mapping)) + end = shmem_fallocend(mapping->host, end); }
/* diff --git a/mm/shmem.c b/mm/shmem.c index 4414d9cbeb98..c45606bd8434 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -939,6 +939,9 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend, if (lend == -1) end = -1; /* unsigned, so actually very big */
+ if (info->fallocend > start && info->fallocend <= end && !unfalloc) + info->fallocend = start; + pagevec_init(&pvec); index = start; while (index < end && find_lock_entries(mapping, index, end - 1, @@ -2719,7 +2722,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb); struct shmem_inode_info *info = SHMEM_I(inode); struct shmem_falloc shmem_falloc; - pgoff_t start, index, end; + pgoff_t start, index, end, undo_fallocend; int error;
if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) @@ -2788,6 +2791,15 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, inode->i_private = &shmem_falloc; spin_unlock(&inode->i_lock);
+ /* + * info->fallocend is only relevant when huge pages might be + * involved: to prevent split_huge_page() freeing fallocated + * pages when FALLOC_FL_KEEP_SIZE committed beyond i_size. + */ + undo_fallocend = info->fallocend; + if (info->fallocend < end) + info->fallocend = end; + for (index = start; index < end; ) { struct page *page;
@@ -2802,6 +2814,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, else error = shmem_getpage(inode, index, &page, SGP_FALLOC); if (error) { + info->fallocend = undo_fallocend; /* Remove the !PageUptodate pages we added */ if (index > start) { shmem_undo_range(inode,
From: Hugh Dickins hughd@google.com
mainline inclusion from mainline-v5.15 commit 2b5bbcb1c9c2cd05a06dcf54df77255a8c406a7b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I58GJ0 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
There's a block of code in shmem_setattr() to add the inode to shmem_unused_huge_shrink()'s shrinklist when lowering i_size: it dates from before 5.7 changed truncation to do split_huge_page() for itself, and should have been removed at that time.
I am over-stating that: split_huge_page() can fail (notably if there's an extra reference to the page at that time), so there might be value in retrying. But there were already retries as truncation worked through the tails, and this addition risks repeating unsuccessful retries indefinitely: I'd rather remove it now, and work on reducing the chance of split_huge_page() failures separately, if we need to.
Link: https://lkml.kernel.org/r/b73b3492-8822-18f9-83e2-938528cdde94@google.com Fixes: 71725ed10c40 ("mm: huge tmpfs: try to split_huge_page() when punching hole") Signed-off-by: Hugh Dickins hughd@google.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Matthew Wilcox willy@infradead.org Cc: Miaohe Lin linmiaohe@huawei.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Rik van Riel riel@surriel.com Cc: Shakeel Butt shakeelb@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org conflicts: mm/shmem.c Signed-off-by: Chen Wandun chenwandun@huawei.com Reviewed-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- mm/shmem.c | 19 ------------------- 1 file changed, 19 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c index c45606bd8434..9df016296347 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1093,7 +1093,6 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = d_inode(dentry); struct shmem_inode_info *info = SHMEM_I(inode); - struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb); int error;
error = setattr_prepare(dentry, attr); @@ -1129,24 +1128,6 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr) if (oldsize > holebegin) unmap_mapping_range(inode->i_mapping, holebegin, 0, 1); - - /* - * Part of the huge page can be beyond i_size: subject - * to shrink under memory pressure. - */ - if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) { - spin_lock(&sbinfo->shrinklist_lock); - /* - * _careful to defend against unlocked access to - * ->shrink_list in shmem_unused_huge_shrink() - */ - if (list_empty_careful(&info->shrinklist)) { - list_add_tail(&info->shrinklist, - &sbinfo->shrinklist); - sbinfo->shrinklist_len++; - } - spin_unlock(&sbinfo->shrinklist_lock); - } } }
From: Chen Jun chenjun102@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I584X2 CVE: NA
--------------------------------
The name passed to cpuhp_setup_state_multi() in hisi_uncore_l3c_pmu.c was modified unexpectedly. Restore it to match the CPUHP_AP_PERF_ARM_HISI_L3_ONLINE state: "AP_PERF_ARM_HISI_L3_ONLINE".
Fixes: 744d0990ad13 ("perf: hisi: Add support for HiSilicon SoC L3TPMU") Signed-off-by: Chen Jun chenjun102@huawei.com Reviewed-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c index 8b35f7cb4f38..2c71a8971723 100644 --- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c @@ -608,7 +608,7 @@ static int __init hisi_l3c_pmu_module_init(void) int ret;
ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, - "AP_PERF_ARM_HISI_L3T_ONLINE", + "AP_PERF_ARM_HISI_L3_ONLINE", hisi_uncore_pmu_online_cpu, hisi_uncore_pmu_offline_cpu); if (ret) {
From: Ye Bin yebin10@huawei.com
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I58KLD CVE: NA
---------------------------
We got the following issue:

WARNING: CPU: 2 PID: 1936 at fs/ext4/inode.c:1511 ext4_da_release_space+0x1b9/0x266
Modules linked in:
CPU: 2 PID: 1936 Comm: dd Not tainted 5.10.0+ #344
RIP: 0010:ext4_da_release_space+0x1b9/0x266
RSP: 0018:ffff888127307848 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff843f67cc
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffed1024e60ed9
RBP: ffff888124dc8140 R08: 0000000000000083 R09: ffffed1075da6d23
R10: ffff8883aed36917 R11: ffffed1075da6d22 R12: ffff888124dc83f0
R13: ffff888124dc844c R14: ffff888124dc8168 R15: 000000000000000c
FS:  00007f6b7247d740(0000) GS:ffff8883aed00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc1a0b7dd8 CR3: 00000001065ce000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ext4_es_remove_extent+0x187/0x230
 mpage_release_unused_pages+0x3af/0x470
 ext4_writepages+0xb9b/0x1160
 do_writepages+0xbb/0x1e0
 __filemap_fdatawrite_range+0x1b1/0x1f0
 file_write_and_wait_range+0x80/0xe0
 ext4_sync_file+0x13d/0x800
 vfs_fsync_range+0x75/0x140
 do_fsync+0x4d/0x90
 __x64_sys_fsync+0x1d/0x30
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
The above issue may happen as follows:

 process1                                  process2
 ext4_da_write_begin
   ext4_da_reserve_space
     ext4_es_insert_delayed_block[1/1]
                                           ext4_da_write_begin
                                             ext4_es_insert_delayed_block[0/1]
 ext4_writepages
  ****Delayed block allocation failed****
   mpage_release_unused_pages
     ext4_es_remove_extent[1/1]
       ext4_da_release_space [reserved 0]
 ext4_da_write_begin
   ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)
   -> since the [0/1] delayed extent still exists, this
      returns true, so no new space is reserved
                                           ext4_writepages
                                            ****Delayed block allocation failed****
                                             mpage_release_unused_pages
                                               ext4_es_remove_extent[0/1]
                                                 ext4_da_release_space [reserved 1]
                                                   ei->i_reserved_data_blocks [1 -> 0]
   ext4_es_insert_delayed_block[1/1]
 ext4_writepages
  ****Delayed block allocation failed****
   mpage_release_unused_pages
     ext4_es_remove_extent[1/1]
       ext4_da_release_space [reserved 1]
         ei->i_reserved_data_blocks [0 -> -1]
 -> ei->i_reserved_data_blocks is already zero while to_free is 1,
    which triggers the warning.
To solve the above issue, introduce i_clu_lock to serialize inserting and removing delayed blocks while the filesystem is in cluster (bigalloc) delayed-allocation mode.
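The rule the lock enforces can be modeled in isolation: the "does this cluster already hold a delayed block?" check and the reservation it guards must be atomic with respect to a concurrent remove-and-release, otherwise one reservation can be released twice, as in the diagram above. A toy userspace model of that rule (stand-in names only, not ext4 code; clu_lock plays the role of i_clu_lock, and the two counters stand in for the extent-status tree and ei->i_reserved_data_blocks):

    #include <pthread.h>

    struct toy_inode {
            pthread_mutex_t clu_lock;   /* models i_clu_lock */
            int delayed_in_cluster;     /* delayed blocks left in one cluster */
            int reserved;               /* models ei->i_reserved_data_blocks */
    };

    /* models the bigalloc branch of ext4_insert_delayed_block() */
    static void insert_delayed(struct toy_inode *inode)
    {
            pthread_mutex_lock(&inode->clu_lock);
            /* reserve space only for the first delayed block in the cluster */
            if (inode->delayed_in_cluster == 0)
                    inode->reserved++;
            inode->delayed_in_cluster++;
            pthread_mutex_unlock(&inode->clu_lock);
    }

    /* models ext4_es_remove_extent() plus ext4_da_release_space() */
    static void remove_delayed(struct toy_inode *inode)
    {
            pthread_mutex_lock(&inode->clu_lock);
            /* release the reservation only with the last delayed block */
            if (--inode->delayed_in_cluster == 0)
                    inode->reserved--;
            pthread_mutex_unlock(&inode->clu_lock);
    }

Because both paths take the same mutex, the check in insert_delayed() can no longer interleave with the decrement in remove_delayed(), which closes exactly the window the diagram walks through.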
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 fs/ext4/ext4.h           |  3 +++
 fs/ext4/extents_status.c |  5 +++++
 fs/ext4/inode.c          | 11 +++++++++--
 fs/ext4/super.c          |  1 +
 4 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b32d559252b9..c11a23d73c79 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1164,6 +1164,9 @@ struct ext4_inode_info {
 	__u32 i_csum_seed;

 	kprojid_t i_projid;
+
+	/* Protect concurrent add cluster delayed block and remove block */
+	struct mutex i_clu_lock;
 };

 /*
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 9a3a8996aacf..dd679014db98 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1433,6 +1433,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 			  ext4_lblk_t len)
 {
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	ext4_lblk_t end;
 	int err = 0;
 	int reserved = 0;
@@ -1455,9 +1456,13 @@ int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	 * so that we are sure __es_shrink() is done with the inode before it
 	 * is reclaimed.
 	 */
+	if (sbi->s_cluster_ratio != 1)
+		mutex_lock(&EXT4_I(inode)->i_clu_lock);
 	write_lock(&EXT4_I(inode)->i_es_lock);
 	err = __es_remove_extent(inode, lblk, end, &reserved);
 	write_unlock(&EXT4_I(inode)->i_es_lock);
+	if (sbi->s_cluster_ratio != 1)
+		mutex_unlock(&EXT4_I(inode)->i_clu_lock);
 	ext4_es_print_tree(inode);
 	ext4_da_release_space(inode, reserved);
 	return err;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6cf2eecef96b..a057b9f54bda 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1648,17 +1648,22 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 			goto errout;
 		reserved = true;
 	} else {   /* bigalloc */
+		mutex_lock(&EXT4_I(inode)->i_clu_lock);
 		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
 			if (!ext4_es_scan_clu(inode,
 					      &ext4_es_is_mapped, lblk)) {
 				ret = ext4_clu_mapped(inode,
 						      EXT4_B2C(sbi, lblk));
-				if (ret < 0)
+				if (ret < 0) {
+					mutex_unlock(&EXT4_I(inode)->i_clu_lock);
 					goto errout;
+				}
 				if (ret == 0) {
 					ret = ext4_da_reserve_space(inode);
-					if (ret != 0)   /* ENOSPC */
+					if (ret != 0) {   /* ENOSPC */
+						mutex_unlock(&EXT4_I(inode)->i_clu_lock);
 						goto errout;
+					}
 					reserved = true;
 				} else {
 					allocated = true;
@@ -1670,6 +1675,8 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 	}

 	ret = ext4_es_insert_delayed_block(inode, lblk, allocated);
+	if (sbi->s_cluster_ratio != 1)
+		mutex_unlock(&EXT4_I(inode)->i_clu_lock);
 	if (ret && reserved)
 		ext4_da_release_space(inode, 1);

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9c004269f8d6..5e992775fc30 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1367,6 +1367,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
 	INIT_WORK(&ei->i_rsv_conversion_work, ext4_end_io_rsv_work);
 	ext4_fc_init_inode(&ei->vfs_inode);
 	mutex_init(&ei->i_fc_lock);
+	mutex_init(&ei->i_clu_lock);
 	return &ei->vfs_inode;
 }
From: Liu Shixin <liushixin2@huawei.com>
hulk inclusion
category: feature
bugzilla: 186704, https://gitee.com/openeuler/kernel/issues/I58V3W
CVE: NA
--------------------------------
Memory error handling on 1GB hugepages was disabled by commit 31286a8484a8 because it may lead to a kernel panic.

However, that commit causes a more troublesome problem downstream, so in some situations we have to revert it. At the same time, we backport commit 15494520b776, which resolves the kernel panic described in commit 31286a8484a8.
We add a new kernel command-line parameter, 'hugetlb_hwpoison_full', to enable memory error handling on 1GB hugepages. By default, memory error handling on 1GB hugepages remains disabled.
Note that the kernel panic may not have been completely resolved!
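For reference, enabling the handling only requires appending the flag to the kernel command line; an illustrative GRUB fragment (adapt to whatever bootloader is actually in use) would be:

    GRUB_CMDLINE_LINUX="... hugetlb_hwpoison_full"

After reboot, the active setting can be confirmed with 'grep -o hugetlb_hwpoison_full /proc/cmdline'.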
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 mm/memory-failure.c                             | 12 +++++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2702a1369c58..98199d3ae741 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1613,6 +1613,9 @@
 			off: Disable the feature
 			Equivalent to: nohugevmalloc

+	hugetlb_hwpoison_full
+			[HW] Enable memory error handling of 1GB hugepage.
+
 	hung_task_panic=
 			[KNL] Should the hung task detector generate panics.
 			Format: 0 | 1
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 509fe34a0421..63bacfcca122 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1148,6 +1148,15 @@ static int try_to_split_thp_page(struct page *page, const char *msg)
 	return 0;
 }

+static bool hugetlb_hwpoison_full;
+
+static int __init enable_hugetlb_hwpoison_full(char *str)
+{
+	hugetlb_hwpoison_full = true;
+	return 0;
+}
+early_param("hugetlb_hwpoison_full", enable_hugetlb_hwpoison_full);
+
 static int memory_failure_hugetlb(unsigned long pfn, int flags)
 {
 	struct page *p = pfn_to_page(pfn);
@@ -1197,7 +1206,8 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	 * - other mm code walking over page table is aware of pud-aligned
 	 *   hwpoison entries.
 	 */
-	if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
+	if (!hugetlb_hwpoison_full &&
+	    huge_page_size(page_hstate(head)) > PMD_SIZE) {
 		action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
 		res = -EBUSY;
 		goto out;