
Support xcall prefetch.

Changes in v4:
- Enable FAST_SYSCALL/IRQ and XCALL_PREFETCH by default.
- Fix xint sleeping function called from invalid context bug.
- Call free_prefetch_item() in file_free() instead of filp_close(), which fixes the race between alloc-queue and free.
- Fix kernel_read() warning for FMODE_READ not set.
- Add sock_from_file() limit for prefetch.
- Check NULL for rc_work.
- Simplify the cpumask interface code, and rename it to "/proc/xcall/mask_list".
- Remove the xcall_cache_pages_order interface.
- tracepoint update: fd -> file.
- Simplify the xcall_read() function again.
- Handle the copy_to_user() return value.
- Remove unused XCALL_CACHE_QUEUED.
- Update the commit message.

Changes in v3:
- Add XCALL_PREFETCH config to isolate the feature code.
- Split the /proc/xxx interface code out into independent patches, which makes it clearer.
- Update the cpumask interface to "/proc/xcall/numa_mask", which can set the mask of all NUMA nodes at once.
- Add xcall select count to make xcall_cache_pages_order adjustment safe.
- Introduce xcall_read_start/end() to make it clearer.
- Simplify the xcall_read() function.
- Use cpumask_next() instead of the get_nth_cpu_in_cpumask() function.
- Use an independent cpu select policy function.
- Remove some unnecessary pr_err() calls.
- Update the commit message.

Changes in v2:
- Update the xcall prefetch state machine: remove the queued state and add prefetch and cancel states.
- Remove the pfi lock and use atomic variables.
- Change the 'xcall select' semantics and simplify the code a lot.
- Remove keep_running, remove unrelated code.
- Remove the count in struct read_cache_entry, and use percpu hit/miss counts.
- Remove sync mode, so remove struct read_cache_entry from struct task_struct.
- Use a hash table to find the prefetch item for a file, which does not change the file struct KABI.
- Use a rwlock instead of a spinlock for the hash table.
- Use alloc_page() instead of kmalloc() to get 4KB alignment.
- Update the commit message.
Jinjie Ruan (7):
  arm64: Introduce Xint software solution
  arm64: Add debugfs dir for xint
  eventpoll: xcall: Support async prefetch data in epoll
  xcall: Add /proc/xcall/prefetch dir for performance tuning
  xcall: Add /proc/xcall/mask_list for performance tuning
  xcall: eventpoll: add tracepoint
  config: Enable FAST_SYSCALL/IRQ and XCALL_PREFETCH by default

Liao Chen (1):
  revert kpti bypass

Yipeng Zou (3):
  arm64: Introduce xcall a faster svc exception handling
  arm64: Faster SVC exception handler with xcall
  xcall: Introduce xcall_select to mark special syscall

 arch/Kconfig                           |  80 +++++
 arch/arm64/Kconfig                     |   2 +
 arch/arm64/configs/openeuler_defconfig |   7 +
 arch/arm64/include/asm/cpucaps.h       |   2 +
 arch/arm64/include/asm/exception.h     |   3 +
 arch/arm64/kernel/asm-offsets.c        |   3 +
 arch/arm64/kernel/cpufeature.c         |  54 ++++
 arch/arm64/kernel/entry-common.c       |  22 ++
 arch/arm64/kernel/entry.S              | 183 +++++++++++-
 arch/arm64/kernel/syscall.c            |  57 ++++
 drivers/irqchip/irq-gic-v3.c           | 130 +++++++++
 fs/eventpoll.c                         | 387 +++++++++++++++++++++++++
 fs/file_table.c                        |   1 +
 fs/proc/base.c                         | 152 ++++++++++
 fs/read_write.c                        |   6 +
 include/linux/fs.h                     |  35 +++
 include/linux/hardirq.h                |   5 +
 include/linux/irqchip/arm-gic-v3.h     |  13 +
 include/linux/sched.h                  |   5 +
 include/trace/events/fs.h              |  93 ++++++
 kernel/fork.c                          |  32 ++
 kernel/irq/debugfs.c                   |  33 +++
 kernel/irq/internals.h                 |  18 ++
 kernel/irq/irqdesc.c                   |  19 ++
 kernel/irq/proc.c                      |  10 +
 kernel/softirq.c                       |  73 +++++
 26 files changed, 1423 insertions(+), 2 deletions(-)
-- 
2.34.1