Backport BPF CO-RE support patches for openEuler-22.09
Alan Maguire (5): libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types() libbpf: BTF dumper support for typed data libbpf: Clarify/fix unaligned data issues for btf typed dump libbpf: Avoid use of __int128 in typed dump display libbpf: Propagate errors when retrieving enum value for typed data display
Alexei Starovoitov (47): bpf: Optimize program stats bpf: Run sleepable programs with migration disabled bpf: Compute program stats for sleepable programs bpf: Add per-program recursion prevention mechanism selftest/bpf: Add a recursion test bpf: Count the number of times recursion was prevented selftests/bpf: Improve recursion selftest bpf: Allows per-cpu maps and map-in-map in sleepable programs selftests/bpf: Add a test for map-in-map and per-cpu maps in sleepable progs bpf: Clear per_cpu pointers during bpf_prog_realloc bpf: Dont allow vmlinux BTF to be used in map_create and prog_load. libbpf: Remove unused field. bpf: Introduce bpf_sys_bpf() helper and program type. bpf: Introduce bpfptr_t user/kernel pointer. bpf: Prepare bpf syscall to be used from kernel and user space. libbpf: Support for syscall program type bpf: Make btf_load command to be bpfptr_t compatible. bpf: Introduce fd_idx bpf: Add bpf_btf_find_by_name_kind() helper. bpf: Add bpf_sys_close() helper. libbpf: Change the order of data and text relocations. libbpf: Add bpf_object pointer to kernel_supports(). libbpf: Preliminary support for fd_idx libbpf: Generate loader program out of BPF ELF file. libbpf: Cleanup temp FDs when intermediate sys_bpf fails. libbpf: Introduce bpf_map__initial_value(). bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command. bpf: Add cmd alias BPF_PROG_RUN bpf: Prepare bpf_prog_put() to be called from irq context. bpf: Factor out bpf_spin_lock into helpers. libbpf: Cleanup the layering between CORE and bpf_program. libbpf: Split bpf_core_apply_relo() into bpf_program independent helper. libbpf: Move CO-RE types into relo_core.h. libbpf: Split CO-RE logic into relo_core.c. libbpf: Make gen_loader data aligned. libbpf: Replace btf__type_by_id() with btf_type_by_id(). bpf: Rename btf_member accessors. bpf: Prepare relo_core.c for kernel duty. bpf: Define enum bpf_core_relo_kind as uapi. bpf: Pass a set of bpf_core_relo-s to prog_load command. bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn(). libbpf: Use CO-RE in the kernel in light skeleton. libbpf: Support init of inner maps in light skeleton. libbpf: Clean gen_loader's attach kind. libbpf: Reduce bpf_core_apply_relo_insn() stack usage. bpf: Silence purge_cand_cache build warning. libbpf: Fix gen_loader assumption on number of programs.
Andrei Matei (1): libbpf: Fail early when loading programs with unspecified type
Andrew Delgadillo (1): selftests/bpf: Drop the need for LLVM's llc
Andrii Nakryiko (149): libbpf: Factor out common operations in BTF writing APIs selftest/bpf: Relax btf_dedup test checks libbpf: Unify and speed up BTF string deduplication libbpf: Implement basic split BTF support selftests/bpf: Add split BTF basic test selftests/bpf: Add checking of raw type dump in BTF writer APIs selftests libbpf: Support BTF dedup of split BTFs libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays selftests/bpf: Add split BTF dedup selftests tools/bpftool: Add bpftool support for split BTF bpf: Add in-kernel split BTF support bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO kbuild: Build kernel module BTFs if BTF is enabled and pahole supports it bpf: Load and verify kernel module BTFs tools/bpftool: Add support for in-kernel and named BTF in `btf show` bpf: Compile out btf_parse_module() if module BTF is not enabled bpf: Sanitize BTF data pointer after module is loaded tools/bpftool: Emit name <anon> for anonymous BTFs libbpf: Add base BTF accessor tools/bpftool: Auto-detect split BTFs in common cases bpf: Keep module's btf_data_size intact after load libbpf: Add internal helper to load BTF data by FD libbpf: Refactor CO-RE relocs to not assume a single BTF object libbpf: Add kernel module BTF support for CO-RE relocations selftests/bpf: Add bpf_testmod kernel module for testing selftests/bpf: Add support for marking sub-tests as skipped selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier bpf: Allow to specify kernel module BTFs when attaching BPF programs libbpf: Factor out low-level BPF program loading helper libbpf: Support attachment of BPF tracing programs to kernel modules selftests/bpf: Add tp_btf CO-RE reloc test for modules selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module kbuild: Skip module BTF generation for out-of-tree external modules selftests/bpf: fix bpf_testmod.ko recompilation logic libbpf: Support modules in bpf_program__set_attach_target() API selftests/bpf: Work-around EBUSY errors from hashmap update/delete bpf: Allow empty module BTFs libbpf: Add user-space variants of BPF_CORE_READ() family of macros libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family selftests/bpf: Add tests for user- and non-CO-RE BPF_CORE_READ() variants libbpf: Clarify kernel type use with USER variants of CORE reading macros selftests/bpf: Sync RCU before unloading bpf_testmod bpf: Support BPF ksym variables in kernel modules libbpf: Support kernel module ksym externs selftests/bpf: Test kernel module ksym externs selftests/bpf: Don't exit on failed bpf_testmod unload libbpf: Stop using feature-detection Makefiles libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h libbpf: Expose btf_type_by_id() internally libbpf: Generalize BTF and BTF.ext type ID and strings iteration libbpf: Rename internal memory-management helpers libbpf: Extract internal set-of-strings datastructure APIs libbpf: Add generic BTF type shallow copy API libbpf: Add BPF static linker APIs libbpf: Add BPF static linker BTF and BTF.ext support bpftool: Add ability to specify custom skeleton object name bpftool: Add `gen object` command to perform BPF static linking selftests/bpf: Pass all BPF .o's through BPF static linker selftests/bpf: Add multi-file statically linked BPF object file test libbpf: Skip BTF fixup if object file has no BTF libbpf: Constify few bpf_program getters libbpf: Preserve empty DATASEC BTFs during static linking libbpf: Fix memory leak when emitting final btf_ext libbpf: Add bpf_map__inner_map API libbpf: Suppress compiler warning when using SEC() macro with externs libbpf: Mark BPF subprogs with hidden visibility as static for BPF verifier libbpf: Allow gaps in BPF program sections to support overriden weak functions libbpf: Refactor BTF map definition parsing libbpf: Factor out symtab and relos sanity checks libbpf: Make few internal helpers available outside of libbpf.c libbpf: Extend sanity checking ELF symbols with externs validation libbpf: Tighten BTF type ID rewriting with error checking libbpf: Add linker extern resolution support for functions and global variables libbpf: Support extern resolution for BTF-defined maps in .maps section libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE bpftool: Strip const/volatile/restrict modifiers from .bss and .data vars libbpf: Add per-file linker opts selftests/bpf: Stop using static variables for passing data to/from user-space bpftool: Stop emitting static variables in BPF skeleton libbpf: Fix ELF symbol visibility update logic libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions libbpf: Reject static maps libbpf: Reject static entry-point BPF programs libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviors libbpf: Streamline error reporting for low-level APIs libbpf: Streamline error reporting for high-level APIs bpftool: Set errno on skeleton failures and propagate errors libbpf: Move few APIs from 0.4 to 0.5 version libbpf: Refactor header installation portions of Makefile libbpf: Install skel_internal.h header used from light skeletons selftests/bpf: Add remaining ASSERT_xxx() variants libbpf: Fix build with latest gcc/binutils with LTO libbpf: Make libbpf_version.h non-auto-generated selftests/bpf: Update selftests to always provide "struct_ops" SEC libbpf: Ensure BPF prog types are set before relocations libbpf: Simplify BPF program auto-attach code libbpf: Minimize explicit iterator of section definition array libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id() selftests/bpf: Stop using relaxed_core_relocs which has no effect libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target() libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7 libbpf: Constify all high-level program attach APIs selftests/bpf: Turn on libbpf 1.0 mode and fix all IS_ERR checks selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API libbpf: Add "tc" SEC_DEF which is a better name for "classifier" libbpf: Add API that copies all BTF types from one BTF object to another libbpf: Deprecate btf__finalize_data() and move it into libbpf.c libbpf: Extract ELF processing state into separate struct libbpf: Refactor internal sec_def handling to enable pluggability libbpf: Reduce reliance of attach_fns on sec_def internals libbpf: Use Elf64-specific types explicitly for dealing with ELF libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps bpftool: Support multiple .rodata/.data internal maps in skeleton bpftool: Improve skeleton generation for data maps without DATASEC type libbpf: Support multiple .rodata.* and .data.* BPF maps selftests/bpf: Demonstrate use of custom .rodata/.data sections libbpf: Simplify look up by name of internal maps selftests/bpf: Switch to ".bss"/".rodata"/".data" lookups for internal maps libbpf: Fix off-by-one bug in bpf_core_apply_relo() libbpf: Add ability to fetch bpf_program's underlying instructions libbpf: Deprecate multi-instance bpf_program APIs libbpf: Deprecate ambiguously-named bpf_program__size() API libbpf: Detect corrupted ELF symbols section libbpf: Improve sanity checking during BTF fix up libbpf: Validate that .BTF and .BTF.ext sections contain data libbpf: Fix section counting logic libbpf: Improve ELF relo sanitization libbpf: Deprecate bpf_program__load() API libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS selftests/bpf: Pass sanitizer flags to linker through LDFLAGS libbpf: Free up resources used by inner map definition selftests/bpf: Fix memory leaks in btf_type_c_dump() helper selftests/bpf: Free per-cpu values array in bpf_iter selftest selftests/bpf: Free inner strings index in btf selftest selftests/bpf: Avoid duplicate btf__parse() call libbpf: Load global data maps lazily on legacy kernels libbpf: Fix potential misaligned memory access in btf_ext__new() libbpf: Don't call libc APIs with NULL pointers libbpf: Fix glob_syms memory leak in bpf_linker libbpf: Fix using invalidated memory in bpf_linker selftests/bpf: Fix possible NULL passed to memcpy() with zero size selftests/bpf: Prevent misaligned memory access in get_stack_raw_tp test selftests/bpf: Fix misaligned memory access in queue_stack_map test selftests/bpf: Prevent out-of-bounds stack access in test_bpffs selftests/bpf: Fix misaligned accesses in xdp and xdp_bpf2bpf tests libbpf: Cleanup struct bpf_core_cand. libbpf: Fix non-C89 loop variable declaration in gen_loader.c
Arnaldo Carvalho de Melo (1): libbpf: Provide GELF_ST_VISIBILITY() define for older libelf
Brendan Jackman (15): tools/resolve_btfids: Fix some error messages bpf: Fix cold build of test_progs-no_alu32 bpf: Clarify return value of probe str helpers bpf: x86: Factor out emission of ModR/M for *(reg + off) bpf: x86: Factor out emission of REX byte bpf: x86: Factor out a lookup table for some ALU opcodes bpf: Rename BPF_XADD and prepare to encode other atomics in .imm bpf: Move BPF_STX reserved field check into BPF_STX verifier code bpf: Add BPF_FETCH field / create atomic_fetch_add instruction bpf: Add instructions for atomic_[cmp]xchg bpf: Pull out a macro for interpreting atomic ALU operations bpf: Add bitwise atomic instructions bpf: Add tests for new BPF atomic operations bpf: Document new atomic instructions bpf: Rename fixup_bpf_calls and add some comments
Cong Wang (1): bpf: Clear percpu pointers in bpf_prog_clone_free()
Daniel Xu (1): libbpf: Do not close un-owned FD 0 on errors
Dave Marchevsky (1): bpf: Add verified_insns to bpf_prog_info and fdinfo
Dmitrii Banshchikov (7): bpf: Rename bpf_reg_state variables bpf: Extract nullable reg type conversion into a helper function bpf: Support pointers in global func args selftests/bpf: Add unit tests for pointers in global functions bpf: Drop imprecise log message selftests/bpf: Fix a compiler warning in global func test bpf: Use MAX_BPF_FUNC_REG_ARGS macro
Florent Revest (7): bpf: Be less specific about socket cookies guarantees selftests/bpf: Fix the ASSERT_ERR_PTR macro bpf: Factorize bpf_trace_printk and bpf_seq_printf bpf: Add a ARG_PTR_TO_CONST_STR argument type bpf: Add a bpf_snprintf helper libbpf: Introduce a BPF_SNPRINTF helper macro libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h
Florian Lehner (2): selftests/bpf: Print reason when a tester could not run a program selftests/bpf: Avoid errno clobbering
Gary Lin (3): bpf,x64: Pad NOPs to make images converge more easily test_bpf: Remove EXPECTED_FAIL flag from bpf_fill_maxinsns11 selftests/bpf: Add verifier tests for x64 jit jump padding
Hao Luo (1): libbpf: Support weak typed ksyms.
Hengqi Chen (8): libbpf: Fix KERNEL_VERSION macro tools/resolve_btfids: Emit warnings and patch zero id for missing symbols libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf libbpf: Support uniform BTF-defined key/value specification across all BPF maps libbpf: Deprecate bpf_object__unload() API since v0.6 libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7 selftests/bpf: Switch to new bpf_object__next_{map,program} APIs libbpf: Support static initialization of BPF_MAP_TYPE_PROG_ARRAY
Ian Rogers (3): bpf, libbpf: Avoid unused function warning on bpf_tail_call_static tools/bpftool: Add -Wall when building BPF programs libbpf: Add NULL check to add_dummy_ksym_var
Ilya Leoshkevich (5): selftests/bpf: Copy extras in out-of-srctree builds bpf: Add BTF_KIND_FLOAT to uapi libbpf: Fix whitespace in btf_add_composite() comment libbpf: Add BTF_KIND_FLOAT support bpf: Generate BTF_KIND_FLOAT when linking vmlinux
Jason Wang (1): libbpf: Fix comment typo
Jean-Philippe Brucker (12): tools: Factor HOSTCC, HOSTLD, HOSTAR definitions tools/runqslower: Use Makefile.include tools/runqslower: Enable out-of-tree build tools/runqslower: Build bpftool using HOSTCC tools/bpftool: Fix build slowdown selftests/bpf: Enable cross-building selftests/bpf: Fix out-of-tree build selftests/bpf: Move generated test files to $(TEST_GEN_FILES) selftests/bpf: Fix installation of urandom_read selftests/bpf: Install btf_dump test cases tools/bpftool: Fix cross-build tools/runqslower: Fix cross-build
Jiri Olsa (3): tools/resolve_btfids: Warn when having multiple IDs for single type libbpf: Use string table index from index table if needed perf build: Move feature cleanup under tools/build
Joe Stringer (1): tools: Sync uapi bpf.h header with latest changes
Jonathan Edwards (1): libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading
Kumar Kartikeya Dwivedi (14): libbpf: Add various netlink helpers libbpf: Add low level TC-BPF management API libbpf: Remove unneeded check for flags during tc detach libbpf: Set NLM_F_EXCL when creating qdisc libbpf: Fix segfault in static linker for objects without BTF libbpf: Fix segfault in light skeleton for objects without BTF libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations bpf: Add bpf_kallsyms_lookup_name helper libbpf: Add typeless ksym support to gen_loader libbpf: Add weak ksym support to gen_loader libbpf: Perform map fd cleanup for gen_loader in case of error bpf: Change bpf_kallsyms_lookup_name size type to ARG_CONST_SIZE_OR_ZERO libbpf: Avoid double stores for success/failure case of ksym relocations libbpf: Avoid reload of imm for weak, unresolved, repeating ksym
Lorenz Bauer (2): bpf: Consolidate shared test timing code bpf: Add PROG_TEST_RUN support for sk_lookup programs
Martin KaFai Lau (9): bpf: Simplify freeing logic in linfo and jited_linfo bpf: Refactor btf_check_func_arg_match bpf: Support bpf program calling kernel function bpf: Support kernel function call in x86-32 libbpf: Refactor bpf_object__resolve_ksyms_btf_id libbpf: Refactor codes for finding btf id of a kernel symbol libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR libbpf: Record extern sym relocation first libbpf: Support extern kernel function
Martynas Pumputis (1): selftests/bpf: Check inner map deletion
Matt Smith (3): libbpf: Change bpf_object_skeleton data field to const pointer bpftool: Provide a helper method for accessing skeleton's embedded ELF data selftests/bpf: Add checks for X__elf_bytes() skeleton helper
Mauricio Vásquez (1): libbpf: Fix memory leak in btf__dedup()
Michal Suchanek (1): libbpf: Fix pr_warn type warnings on 32bit
Pedro Tammela (2): libbpf: Avoid inline hint definition from 'linux/stddef.h' libbpf: Clarify flags in ringbuf helpers
Quentin Monnet (17): libbpf: Return non-null error on failures in libbpf_find_prog_btf_id() libbpf: Rename btf__load() as btf__load_into_kernel() libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id() tools: Free BTF objects at various locations tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id() libbpf: Add split BTF support for btf__load_from_kernel_by_id() tools: bpftool: Support dumping split BTF by id libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations libbpf: Skip re-installing headers file if source is older than target bpftool: Remove unused includes to <bpf/bpf_gen_internal.h> bpftool: Install libbpf headers instead of including the dir tools/resolve_btfids: Install libbpf headers when building tools/runqslower: Install libbpf headers when building bpf: preload: Install libbpf headers when building bpf: iterators: Install libbpf headers when building selftests/bpf: Better clean up for runqslower in test_bpftool_build.sh bpftool: Add install-bin target to install binary only
Rafael David Tinoco (1): libbpf: Add bpf object kern_version attribute setter
Sedat Dilek (1): tools: Factor Clang, LLC and LLVM utils definitions
Shuyi Cheng (2): libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts' libbpf: Add "bool skipped" to struct bpf_map
Song Liu (5): bpf: Use separate lockdep class for each hashtab bpf: Avoid hashtab deadlock with map_locked bpftool: Add Makefile target bootstrap perf build: Support build BPF skeletons with perf perf stat: Enable counting events for BPF programs
Stanislav Fomichev (2): libbpf: Cap retries in sys_bpf_prog_load libbpf: Skip bpf_object__probe_loading for light skeleton
Toke Høiland-Jørgensen (5): bpf: Return target info when a tracing bpf_link is queried libbpf: Restore errno return for functions that were already returning it libbpf: Don't crash on object files with no symbol tables libbpf: Ignore STT_SECTION symbols in 'maps' section libbpf: Properly ignore STT_SECTION symbols in legacy map definitions
Wang Hai (1): libbpf: Simplify the return expression of bpf_object__init_maps function
Wang Qing (1): bpf, btf: Remove the duplicate btf_ids.h include
Wedson Almeida Filho (1): bpf: Refactor check_cfg to use a structured loop.
Yauheni Kaliuta (7): selftests/bpf: test_progs/sockopt_sk: Remove version selftests/bpf: test_progs/sockopt_sk: Convert to use BPF skeleton selftests/bpf: Pass page size from userspace in sockopt_sk selftests/bpf: Pass page size from userspace in map_ptr selftests/bpf: mmap: Use runtime page size selftests/bpf: ringbuf: Use runtime page size selftests/bpf: ringbuf_multi: Use runtime page size
Yonghong Song (27): bpf: Permit cond_resched for some iterators bpf: Permit size-0 datasec bpf: Refactor BPF_PSEUDO_CALL checking as a helper function bpf: Factor out visit_func_call_insn() in check_cfg() bpf: Factor out verbose_invalid_scalar() bpf: Refactor check_func_call() to allow callback function bpf: Change return value of verifier function add_subprog() bpf: Add bpf_for_each_map_elem() helper libbpf: Move function is_ldimm64() earlier in libbpf.c libbpf: Support subprog address relocation selftests/bpf: Fix test_cpp compilation failure with clang bpftool: Fix a clang compilation warning libbpf: Add support for new llvm bpf relocations bpf: Emit better log message if bpf_iter ctx arg btf_id == 0 btf: Change BTF_KIND_* macros to enums bpf: Support for new btf kind BTF_KIND_TAG libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag libbpf: Add support for BTF_KIND_TAG bpftool: Add support for BTF_KIND_TAG bpf: Add BTF_KIND_DECL_TAG typedef support docs/bpf: Update documentation for BTF_KIND_DECL_TAG typedef support bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes libbpf: Support BTF_KIND_TYPE_TAG bpftool: Support BTF_KIND_TYPE_TAG docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support libbpf: Fix a couple of missed btf_type_tag handling in btf.c
Documentation/ABI/testing/sysfs-kernel-btf | 8 + Documentation/bpf/btf.rst | 40 +- Documentation/networking/filter.rst | 61 +- arch/arm/net/bpf_jit_32.c | 7 +- arch/arm64/net/bpf_jit_comp.c | 16 +- arch/mips/net/ebpf_jit.c | 11 +- arch/powerpc/net/bpf_jit_comp64.c | 25 +- arch/riscv/net/bpf_jit_comp32.c | 20 +- arch/riscv/net/bpf_jit_comp64.c | 16 +- arch/s390/net/bpf_jit_comp.c | 27 +- arch/sparc/net/bpf_jit_comp_64.c | 17 +- arch/x86/net/bpf_jit_comp.c | 408 +- arch/x86/net/bpf_jit_comp32.c | 204 +- drivers/net/ethernet/netronome/nfp/bpf/fw.h | 4 +- drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 +- drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +- .../net/ethernet/netronome/nfp/bpf/verifier.c | 15 +- include/linux/bpf.h | 150 +- include/linux/bpf_types.h | 2 + include/linux/bpf_verifier.h | 46 +- include/linux/bpfptr.h | 75 + include/linux/btf.h | 106 +- include/linux/filter.h | 45 +- include/linux/module.h | 4 + include/uapi/linux/bpf.h | 265 +- include/uapi/linux/btf.h | 57 +- kernel/bpf/Makefile | 4 + kernel/bpf/bpf_iter.c | 43 +- kernel/bpf/bpf_struct_ops.c | 6 +- kernel/bpf/btf.c | 1403 ++++- kernel/bpf/core.c | 153 +- kernel/bpf/disasm.c | 56 +- kernel/bpf/hashtab.c | 130 +- kernel/bpf/helpers.c | 326 +- kernel/bpf/preload/Makefile | 25 +- kernel/bpf/preload/iterators/Makefile | 38 +- kernel/bpf/preload/iterators/iterators.bpf.c | 1 - kernel/bpf/syscall.c | 363 +- kernel/bpf/sysfs_btf.c | 2 +- kernel/bpf/task_iter.c | 2 + kernel/bpf/trampoline.c | 77 +- kernel/bpf/verifier.c | 1508 ++++- kernel/module.c | 36 + kernel/trace/bpf_trace.c | 375 +- lib/Kconfig.debug | 9 + lib/test_bpf.c | 21 +- net/bpf/test_run.c | 290 +- net/core/filter.c | 1 + net/ipv4/bpf_tcp_ca.c | 7 +- samples/bpf/bpf_insn.h | 4 +- samples/bpf/cookie_uid_helper_example.c | 8 +- samples/bpf/sock_example.c | 2 +- samples/bpf/test_cgrp2_attach.c | 5 +- samples/bpf/xdp1_user.c | 2 +- samples/bpf/xdp_sample_pkts_user.c | 2 +- scripts/Makefile.modfinal | 25 +- scripts/link-vmlinux.sh | 7 +- .../bpf/bpftool/Documentation/bpftool-gen.rst | 78 +- tools/bpf/bpftool/Makefile | 55 +- tools/bpf/bpftool/bash-completion/bpftool | 17 +- tools/bpf/bpftool/btf.c | 80 +- tools/bpf/bpftool/btf_dumper.c | 6 +- tools/bpf/bpftool/gen.c | 651 +- tools/bpf/bpftool/iter.c | 2 +- tools/bpf/bpftool/main.c | 20 +- tools/bpf/bpftool/main.h | 2 + tools/bpf/bpftool/map.c | 14 +- tools/bpf/bpftool/net.c | 2 +- tools/bpf/bpftool/prog.c | 141 +- tools/bpf/bpftool/xlated_dumper.c | 3 + tools/bpf/resolve_btfids/Makefile | 19 +- tools/bpf/resolve_btfids/main.c | 38 +- tools/bpf/runqslower/Makefile | 71 +- tools/build/Makefile | 8 +- tools/build/Makefile.feature | 4 +- tools/build/feature/Makefile | 4 +- tools/include/linux/filter.h | 24 +- tools/include/uapi/linux/bpf.h | 977 ++- tools/include/uapi/linux/btf.h | 57 +- tools/lib/bpf/.gitignore | 2 - tools/lib/bpf/Build | 2 +- tools/lib/bpf/Makefile | 104 +- tools/lib/bpf/bpf.c | 272 +- tools/lib/bpf/bpf.h | 1 + tools/lib/bpf/bpf_core_read.h | 169 +- tools/lib/bpf/bpf_gen_internal.h | 65 + tools/lib/bpf/bpf_helpers.h | 108 +- tools/lib/bpf/bpf_prog_linfo.c | 18 +- tools/lib/bpf/bpf_tracing.h | 44 +- tools/lib/bpf/btf.c | 2085 ++++--- tools/lib/bpf/btf.h | 99 +- tools/lib/bpf/btf_dump.c | 910 ++- tools/lib/bpf/gen_loader.c | 1126 ++++ tools/lib/bpf/libbpf.c | 5436 +++++++++-------- tools/lib/bpf/libbpf.h | 212 +- tools/lib/bpf/libbpf.map | 54 + tools/lib/bpf/libbpf_common.h | 26 +- tools/lib/bpf/libbpf_errno.c | 7 +- tools/lib/bpf/libbpf_internal.h | 299 +- tools/lib/bpf/libbpf_legacy.h | 60 + tools/lib/bpf/libbpf_version.h | 9 + tools/lib/bpf/linker.c | 2901 +++++++++ tools/lib/bpf/netlink.c | 593 +- tools/lib/bpf/nlattr.h | 48 + tools/lib/bpf/relo_core.c | 1322 ++++ tools/lib/bpf/relo_core.h | 57 + tools/lib/bpf/ringbuf.c | 26 +- tools/lib/bpf/skel_internal.h | 123 + tools/lib/bpf/strset.c | 176 + tools/lib/bpf/strset.h | 21 + tools/lib/bpf/xsk.c | 4 +- tools/perf/Documentation/perf-stat.txt | 18 + tools/perf/Makefile.config | 9 + tools/perf/Makefile.perf | 58 +- tools/perf/builtin-stat.c | 82 +- tools/perf/util/Build | 1 + tools/perf/util/bpf-event.c | 11 +- tools/perf/util/bpf_counter.c | 320 + tools/perf/util/bpf_counter.h | 72 + tools/perf/util/bpf_skel/.gitignore | 3 + .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 + tools/perf/util/evsel.c | 5 + tools/perf/util/evsel.h | 5 + tools/perf/util/python.c | 21 + tools/perf/util/stat-display.c | 4 +- tools/perf/util/stat.c | 2 +- tools/perf/util/target.c | 34 +- tools/perf/util/target.h | 10 + tools/scripts/Makefile.include | 18 + tools/testing/selftests/bpf/.gitignore | 1 + tools/testing/selftests/bpf/Makefile | 183 +- tools/testing/selftests/bpf/bench.c | 1 + .../selftests/bpf/benchs/bench_rename.c | 2 +- .../selftests/bpf/benchs/bench_ringbufs.c | 6 +- .../selftests/bpf/benchs/bench_trigger.c | 2 +- .../selftests/bpf/bpf_testmod/.gitignore | 6 + .../selftests/bpf/bpf_testmod/Makefile | 20 + .../bpf/bpf_testmod/bpf_testmod-events.h | 36 + .../selftests/bpf/bpf_testmod/bpf_testmod.c | 55 + .../selftests/bpf/bpf_testmod/bpf_testmod.h | 14 + tools/testing/selftests/bpf/btf_helpers.c | 264 + tools/testing/selftests/bpf/btf_helpers.h | 19 + .../selftests/bpf/prog_tests/atomics.c | 246 + .../selftests/bpf/prog_tests/attach_probe.c | 12 +- .../selftests/bpf/prog_tests/bpf_iter.c | 34 +- .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 8 +- tools/testing/selftests/bpf/prog_tests/btf.c | 166 +- .../bpf/prog_tests/btf_dedup_split.c | 325 + .../selftests/bpf/prog_tests/btf_dump.c | 10 +- .../selftests/bpf/prog_tests/btf_endian.c | 4 +- .../selftests/bpf/prog_tests/btf_map_in_map.c | 33 - .../selftests/bpf/prog_tests/btf_split.c | 99 + .../selftests/bpf/prog_tests/btf_write.c | 47 +- .../bpf/prog_tests/cg_storage_multi.c | 84 +- .../bpf/prog_tests/cgroup_attach_multi.c | 6 +- .../selftests/bpf/prog_tests/cgroup_link.c | 16 +- .../bpf/prog_tests/cgroup_skb_sk_lookup.c | 2 +- .../selftests/bpf/prog_tests/core_autosize.c | 2 +- .../bpf/prog_tests/core_read_macros.c | 64 + .../selftests/bpf/prog_tests/core_reloc.c | 105 +- .../selftests/bpf/prog_tests/fexit_bpf2bpf.c | 68 +- .../selftests/bpf/prog_tests/fexit_stress.c | 4 +- .../selftests/bpf/prog_tests/flow_dissector.c | 2 +- .../bpf/prog_tests/flow_dissector_reattach.c | 10 +- .../bpf/prog_tests/get_stack_raw_tp.c | 24 +- .../prog_tests/get_stackid_cannot_attach.c | 9 +- .../selftests/bpf/prog_tests/global_data.c | 11 +- .../bpf/prog_tests/global_data_init.c | 2 +- .../bpf/prog_tests/global_func_args.c | 60 + .../selftests/bpf/prog_tests/hashmap.c | 9 +- .../selftests/bpf/prog_tests/kfree_skb.c | 23 +- .../selftests/bpf/prog_tests/ksyms_btf.c | 34 +- .../selftests/bpf/prog_tests/ksyms_module.c | 31 + .../selftests/bpf/prog_tests/link_pinning.c | 7 +- .../selftests/bpf/prog_tests/map_ptr.c | 15 +- tools/testing/selftests/bpf/prog_tests/mmap.c | 24 +- .../selftests/bpf/prog_tests/module_attach.c | 53 + .../selftests/bpf/prog_tests/obj_name.c | 8 +- .../selftests/bpf/prog_tests/perf_branches.c | 4 +- .../selftests/bpf/prog_tests/perf_buffer.c | 2 +- .../bpf/prog_tests/perf_event_stackmap.c | 3 +- .../selftests/bpf/prog_tests/probe_user.c | 7 +- .../selftests/bpf/prog_tests/prog_run_xattr.c | 2 +- .../bpf/prog_tests/queue_stack_map.c | 12 +- .../bpf/prog_tests/raw_tp_test_run.c | 4 +- .../selftests/bpf/prog_tests/rdonly_maps.c | 9 +- .../selftests/bpf/prog_tests/recursion.c | 41 + .../bpf/prog_tests/reference_tracking.c | 2 +- .../selftests/bpf/prog_tests/resolve_btfids.c | 9 +- .../selftests/bpf/prog_tests/ringbuf.c | 17 +- .../selftests/bpf/prog_tests/ringbuf_multi.c | 23 +- .../bpf/prog_tests/select_reuseport.c | 55 +- .../selftests/bpf/prog_tests/send_signal.c | 5 +- .../selftests/bpf/prog_tests/sk_lookup.c | 2 +- .../selftests/bpf/prog_tests/skeleton.c | 41 +- .../selftests/bpf/prog_tests/snprintf_btf.c | 4 +- .../selftests/bpf/prog_tests/sock_fields.c | 14 +- .../selftests/bpf/prog_tests/sockmap_basic.c | 6 +- .../selftests/bpf/prog_tests/sockmap_ktls.c | 2 +- .../selftests/bpf/prog_tests/sockmap_listen.c | 10 +- .../selftests/bpf/prog_tests/sockopt_sk.c | 66 +- .../bpf/prog_tests/stacktrace_build_id_nmi.c | 3 +- .../selftests/bpf/prog_tests/stacktrace_map.c | 2 +- .../bpf/prog_tests/stacktrace_map_raw_tp.c | 5 +- .../selftests/bpf/prog_tests/static_linked.c | 35 + .../bpf/prog_tests/tcp_hdr_options.c | 15 +- .../selftests/bpf/prog_tests/tcp_rtt.c | 2 +- .../selftests/bpf/prog_tests/test_bpffs.c | 4 +- .../bpf/prog_tests/test_global_funcs.c | 8 + .../selftests/bpf/prog_tests/test_overhead.c | 12 +- .../bpf/prog_tests/trampoline_count.c | 18 +- .../selftests/bpf/prog_tests/udp_limit.c | 7 +- tools/testing/selftests/bpf/prog_tests/xdp.c | 11 +- .../selftests/bpf/prog_tests/xdp_bpf2bpf.c | 8 +- .../selftests/bpf/prog_tests/xdp_link.c | 8 +- tools/testing/selftests/bpf/progs/atomics.c | 154 + tools/testing/selftests/bpf/progs/bpf_cubic.c | 6 +- .../bpf/progs/bpf_iter_bpf_hash_map.c | 1 - .../selftests/bpf/progs/bpf_iter_bpf_map.c | 1 - .../selftests/bpf/progs/bpf_iter_ipv6_route.c | 1 - .../selftests/bpf/progs/bpf_iter_netlink.c | 1 - .../selftests/bpf/progs/bpf_iter_task.c | 1 - .../selftests/bpf/progs/bpf_iter_task_btf.c | 1 - .../selftests/bpf/progs/bpf_iter_task_file.c | 1 - .../selftests/bpf/progs/bpf_iter_task_stack.c | 1 - .../selftests/bpf/progs/bpf_iter_tcp4.c | 1 - .../selftests/bpf/progs/bpf_iter_tcp6.c | 1 - .../selftests/bpf/progs/bpf_iter_test_kern4.c | 4 +- .../selftests/bpf/progs/bpf_iter_udp4.c | 1 - .../selftests/bpf/progs/bpf_iter_udp6.c | 1 - .../selftests/bpf/progs/core_reloc_types.h | 17 + tools/testing/selftests/bpf/progs/kfree_skb.c | 4 +- tools/testing/selftests/bpf/progs/lsm.c | 69 + .../selftests/bpf/progs/map_ptr_kern.c | 4 +- tools/testing/selftests/bpf/progs/recursion.c | 46 + .../testing/selftests/bpf/progs/sockopt_sk.c | 11 +- tools/testing/selftests/bpf/progs/tailcall3.c | 2 +- tools/testing/selftests/bpf/progs/tailcall4.c | 2 +- tools/testing/selftests/bpf/progs/tailcall5.c | 2 +- .../selftests/bpf/progs/tailcall_bpf2bpf2.c | 2 +- .../selftests/bpf/progs/tailcall_bpf2bpf4.c | 2 +- .../selftests/bpf/progs/test_cls_redirect.c | 4 +- .../bpf/progs/test_core_read_macros.c | 50 + .../bpf/progs/test_core_reloc_module.c | 96 + .../selftests/bpf/progs/test_global_func10.c | 29 + .../selftests/bpf/progs/test_global_func11.c | 19 + .../selftests/bpf/progs/test_global_func12.c | 21 + .../selftests/bpf/progs/test_global_func13.c | 24 + .../selftests/bpf/progs/test_global_func14.c | 21 + .../selftests/bpf/progs/test_global_func15.c | 22 + .../selftests/bpf/progs/test_global_func16.c | 22 + .../selftests/bpf/progs/test_global_func9.c | 132 + .../bpf/progs/test_global_func_args.c | 91 + .../selftests/bpf/progs/test_ksyms_module.c | 26 + .../selftests/bpf/progs/test_ksyms_weak.c | 56 + .../bpf/progs/test_map_in_map_invalid.c | 26 + tools/testing/selftests/bpf/progs/test_mmap.c | 2 - .../selftests/bpf/progs/test_module_attach.c | 66 + .../selftests/bpf/progs/test_rdonly_maps.c | 6 +- .../selftests/bpf/progs/test_ringbuf.c | 1 - .../selftests/bpf/progs/test_ringbuf_multi.c | 1 - .../selftests/bpf/progs/test_skeleton.c | 20 +- .../selftests/bpf/progs/test_sockmap_listen.c | 2 +- .../selftests/bpf/progs/test_static_linked1.c | 30 + .../selftests/bpf/progs/test_static_linked2.c | 31 + .../selftests/bpf/progs/test_subprogs.c | 13 + .../selftests/bpf/test_bpftool_build.sh | 4 + tools/testing/selftests/bpf/test_btf.h | 3 + .../selftests/bpf/test_cgroup_storage.c | 2 +- tools/testing/selftests/bpf/test_maps.c | 279 +- tools/testing/selftests/bpf/test_progs.c | 79 +- tools/testing/selftests/bpf/test_progs.h | 72 +- .../selftests/bpf/test_tcpnotify_user.c | 7 +- tools/testing/selftests/bpf/test_verifier.c | 103 +- .../selftests/bpf/verifier/atomic_and.c | 77 + .../selftests/bpf/verifier/atomic_cmpxchg.c | 96 + .../selftests/bpf/verifier/atomic_fetch_add.c | 106 + .../selftests/bpf/verifier/atomic_or.c | 77 + .../selftests/bpf/verifier/atomic_xchg.c | 46 + .../selftests/bpf/verifier/atomic_xor.c | 77 + tools/testing/selftests/bpf/verifier/calls.c | 12 +- tools/testing/selftests/bpf/verifier/ctx.c | 7 +- .../selftests/bpf/verifier/dead_code.c | 10 +- .../bpf/verifier/direct_packet_access.c | 4 +- tools/testing/selftests/bpf/verifier/jit.c | 24 + .../testing/selftests/bpf/verifier/leak_ptr.c | 10 +- .../selftests/bpf/verifier/meta_access.c | 4 +- tools/testing/selftests/bpf/verifier/unpriv.c | 3 +- .../bpf/verifier/value_illegal_alu.c | 2 +- tools/testing/selftests/bpf/verifier/xadd.c | 18 +- tools/testing/selftests/bpf/xdping.c | 2 +- tools/testing/selftests/tc-testing/Makefile | 3 +- 292 files changed, 24645 insertions(+), 6370 deletions(-) create mode 100644 include/linux/bpfptr.h create mode 100644 tools/lib/bpf/bpf_gen_internal.h create mode 100644 tools/lib/bpf/gen_loader.c create mode 100644 tools/lib/bpf/libbpf_legacy.h create mode 100644 tools/lib/bpf/libbpf_version.h create mode 100644 tools/lib/bpf/linker.c create mode 100644 tools/lib/bpf/relo_core.c create mode 100644 tools/lib/bpf/relo_core.h create mode 100644 tools/lib/bpf/skel_internal.h create mode 100644 tools/lib/bpf/strset.c create mode 100644 tools/lib/bpf/strset.h create mode 100644 tools/perf/util/bpf_counter.c create mode 100644 tools/perf/util/bpf_counter.h create mode 100644 tools/perf/util/bpf_skel/.gitignore create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c create mode 100644 tools/testing/selftests/bpf/bpf_testmod/.gitignore create mode 100644 tools/testing/selftests/bpf/bpf_testmod/Makefile create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h create mode 100644 tools/testing/selftests/bpf/btf_helpers.c create mode 100644 tools/testing/selftests/bpf/btf_helpers.h create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_split.c create mode 100644 tools/testing/selftests/bpf/prog_tests/core_read_macros.c create mode 100644 tools/testing/selftests/bpf/prog_tests/global_func_args.c create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c create mode 100644 tools/testing/selftests/bpf/prog_tests/module_attach.c create mode 100644 tools/testing/selftests/bpf/prog_tests/recursion.c create mode 100644 tools/testing/selftests/bpf/prog_tests/static_linked.c create mode 100644 tools/testing/selftests/bpf/progs/atomics.c create mode 100644 tools/testing/selftests/bpf/progs/recursion.c create mode 100644 tools/testing/selftests/bpf/progs/test_core_read_macros.c create mode 100644 tools/testing/selftests/bpf/progs/test_core_reloc_module.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func10.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func11.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func12.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func13.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func14.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func15.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func16.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func9.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func_args.c create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_weak.c create mode 100644 tools/testing/selftests/bpf/progs/test_map_in_map_invalid.c create mode 100644 tools/testing/selftests/bpf/progs/test_module_attach.c create mode 100644 tools/testing/selftests/bpf/progs/test_static_linked1.c create mode 100644 tools/testing/selftests/bpf/progs/test_static_linked2.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.11-rc1 commit cf83b2d2e2b64920bd6999b199dfa271d7e94cf8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Commit e679654a704e ("bpf: Fix a rcu_sched stall issue with bpf task/task_file iterator") tries to fix rcu stalls warning which is caused by bpf task_file iterator when running "bpftool prog".
rcu: INFO: rcu_sched self-detected stall on CPU rcu: \x097-....: (20999 ticks this GP) idle=302/1/0x4000000000000000 softirq=1508852/1508852 fqs=4913 \x09(t=21031 jiffies g=2534773 q=179750) NMI backtrace for cpu 7 CPU: 7 PID: 184195 Comm: bpftool Kdump: loaded Tainted: G W 5.8.0-00004-g68bfc7f8c1b4 #6 Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A17 05/03/2019 Call Trace: <IRQ> dump_stack+0x57/0x70 nmi_cpu_backtrace.cold+0x14/0x53 ? lapic_can_unplug_cpu.cold+0x39/0x39 nmi_trigger_cpumask_backtrace+0xb7/0xc7 rcu_dump_cpu_stacks+0xa2/0xd0 rcu_sched_clock_irq.cold+0x1ff/0x3d9 ? tick_nohz_handler+0x100/0x100 update_process_times+0x5b/0x90 tick_sched_timer+0x5e/0xf0 __hrtimer_run_queues+0x12a/0x2a0 hrtimer_interrupt+0x10e/0x280 __sysvec_apic_timer_interrupt+0x51/0xe0 asm_call_on_stack+0xf/0x20 </IRQ> sysvec_apic_timer_interrupt+0x6f/0x80 ... task_file_seq_next+0x52/0xa0 bpf_seq_read+0xb9/0x320 vfs_read+0x9d/0x180 ksys_read+0x5f/0xe0 do_syscall_64+0x38/0x60 entry_SYSCALL_64_after_hwframe+0x44/0xa9
The fix is to limit the number of bpf program runs to be one million. This fixed the program in most cases. But we also found under heavy load, which can increase the wallclock time for bpf_seq_read(), the warning may still be possible.
For example, calling bpf_delay() in the "while" loop of bpf_seq_read(), which will introduce artificial delay, the warning will show up in my qemu run.
static unsigned q; volatile unsigned *p = &q; volatile unsigned long long ll; static void bpf_delay(void) { int i, j;
for (i = 0; i < 10000; i++) for (j = 0; j < 10000; j++) ll += *p; }
There are two ways to fix this issue. One is to reduce the above one million threshold to say 100,000 and hopefully rcu warning will not show up any more. Another is to introduce a target feature which enables bpf_seq_read() calling cond_resched().
This patch took second approach as the first approach may cause more -EAGAIN failures for read() syscalls. Note that not all bpf_iter targets can permit cond_resched() in bpf_seq_read() as some, e.g., netlink seq iterator, rcu read lock critical section spans through seq_ops->next() -> seq_ops->show() -> seq_ops->next().
For the kernel code with the above hack, "bpftool p" roughly takes 38 seconds to finish on my VM with 184 bpf program runs. Using the following command, I am able to collect the number of context switches: perf stat -e context-switches -- ./bpftool p >& log Without this patch, 69 context-switches With this patch, 75 context-switches This patch added additional 6 context switches, roughly every 6 seconds to reschedule, to avoid lengthy no-rescheduling which may cause the above RCU warnings.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201028061054.1411116-1-yhs@fb.com (cherry picked from commit cf83b2d2e2b64920bd6999b199dfa271d7e94cf8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 5 +++++ kernel/bpf/bpf_iter.c | 14 ++++++++++++++ kernel/bpf/task_iter.c | 2 ++ 3 files changed, 21 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3521e557cc10..a64bbce97cdb 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1417,6 +1417,10 @@ typedef void (*bpf_iter_show_fdinfo_t) (const struct bpf_iter_aux_info *aux, typedef int (*bpf_iter_fill_link_info_t)(const struct bpf_iter_aux_info *aux, struct bpf_link_info *info);
+enum bpf_iter_feature { + BPF_ITER_RESCHED = BIT(0), +}; + #define BPF_ITER_CTX_ARG_MAX 2 struct bpf_iter_reg { const char *target; @@ -1425,6 +1429,7 @@ struct bpf_iter_reg { bpf_iter_show_fdinfo_t show_fdinfo; bpf_iter_fill_link_info_t fill_link_info; u32 ctx_arg_info_size; + u32 feature; struct bpf_ctx_arg_aux ctx_arg_info[BPF_ITER_CTX_ARG_MAX]; const struct bpf_iter_seq_info *seq_info; }; diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index e8957e911de3..a0d9eade9c80 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -67,6 +67,15 @@ static void bpf_iter_done_stop(struct seq_file *seq) iter_priv->done_stop = true; }
+static bool bpf_iter_support_resched(struct seq_file *seq) +{ + struct bpf_iter_priv_data *iter_priv; + + iter_priv = container_of(seq->private, struct bpf_iter_priv_data, + target_private); + return iter_priv->tinfo->reg_info->feature & BPF_ITER_RESCHED; +} + /* maximum visited objects before bailing out */ #define MAX_ITER_OBJECTS 1000000
@@ -83,6 +92,7 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size, struct seq_file *seq = file->private_data; size_t n, offs, copied = 0; int err = 0, num_objs = 0; + bool can_resched; void *p;
mutex_lock(&seq->lock); @@ -135,6 +145,7 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size, goto done; }
+ can_resched = bpf_iter_support_resched(seq); while (1) { loff_t pos = seq->index;
@@ -180,6 +191,9 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size, } break; } + + if (can_resched) + cond_resched(); } stop: offs = seq->count; diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index f3d3a562a802..8033ab19138a 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -318,6 +318,7 @@ static const struct bpf_iter_seq_info task_seq_info = {
static struct bpf_iter_reg task_reg_info = { .target = "task", + .feature = BPF_ITER_RESCHED, .ctx_arg_info_size = 1, .ctx_arg_info = { { offsetof(struct bpf_iter__task, task), @@ -335,6 +336,7 @@ static const struct bpf_iter_seq_info task_file_seq_info = {
static struct bpf_iter_reg task_file_reg_info = { .target = "task_file", + .feature = BPF_ITER_RESCHED, .ctx_arg_info_size = 2, .ctx_arg_info = { { offsetof(struct bpf_iter__task_file, task),
From: Song Liu songliubraving@fb.com
mainline inclusion from mainline-5.11-rc1 commit c50eb518e262fa06bd334e6eec172eaf5d7a5bd9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
If a hashtab is accessed in both NMI and non-NMI contexts, it may cause deadlock in bucket->lock. LOCKDEP NMI warning highlighted this issue:
./test_progs -t stacktrace
[ 74.828970] [ 74.828971] ================================ [ 74.828972] WARNING: inconsistent lock state [ 74.828973] 5.9.0-rc8+ #275 Not tainted [ 74.828974] -------------------------------- [ 74.828975] inconsistent {INITIAL USE} -> {IN-NMI} usage. [ 74.828976] taskset/1174 [HC2[2]:SC0[0]:HE0:SE1] takes: [ 74.828977] ffffc90000ee96b0 (&htab->buckets[i].raw_lock){....}-{2:2}, at: htab_map_update_elem+0x271/0x5a0 [ 74.828981] {INITIAL USE} state was registered at: [ 74.828982] lock_acquire+0x137/0x510 [ 74.828983] _raw_spin_lock_irqsave+0x43/0x90 [ 74.828984] htab_map_update_elem+0x271/0x5a0 [ 74.828984] 0xffffffffa0040b34 [ 74.828985] trace_call_bpf+0x159/0x310 [ 74.828986] perf_trace_run_bpf_submit+0x5f/0xd0 [ 74.828987] perf_trace_urandom_read+0x1be/0x220 [ 74.828988] urandom_read_nowarn.isra.0+0x26f/0x380 [ 74.828989] vfs_read+0xf8/0x280 [ 74.828989] ksys_read+0xc9/0x160 [ 74.828990] do_syscall_64+0x33/0x40 [ 74.828991] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 74.828992] irq event stamp: 1766 [ 74.828993] hardirqs last enabled at (1765): [<ffffffff82800ace>] asm_exc_page_fault+0x1e/0x30 [ 74.828994] hardirqs last disabled at (1766): [<ffffffff8267df87>] irqentry_enter+0x37/0x60 [ 74.828995] softirqs last enabled at (856): [<ffffffff81043e7c>] fpu__clear+0xac/0x120 [ 74.828996] softirqs last disabled at (854): [<ffffffff81043df0>] fpu__clear+0x20/0x120 [ 74.828997] [ 74.828998] other info that might help us debug this: [ 74.828999] Possible unsafe locking scenario: [ 74.828999] [ 74.829000] CPU0 [ 74.829001] ---- [ 74.829001] lock(&htab->buckets[i].raw_lock); [ 74.829003] <Interrupt> [ 74.829004] lock(&htab->buckets[i].raw_lock); [ 74.829006] [ 74.829006] *** DEADLOCK *** [ 74.829007] [ 74.829008] 1 lock held by taskset/1174: [ 74.829008] #0: ffff8883ec3fd020 (&cpuctx_lock){-...}-{2:2}, at: perf_event_task_tick+0x101/0x650 [ 74.829012] [ 74.829013] stack backtrace: [ 74.829014] CPU: 0 PID: 1174 Comm: taskset Not tainted 5.9.0-rc8+ #275 [ 74.829015] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 74.829016] Call Trace: [ 74.829016] <NMI> [ 74.829017] dump_stack+0x9a/0xd0 [ 74.829018] lock_acquire+0x461/0x510 [ 74.829019] ? lock_release+0x6b0/0x6b0 [ 74.829020] ? stack_map_get_build_id_offset+0x45e/0x800 [ 74.829021] ? htab_map_update_elem+0x271/0x5a0 [ 74.829022] ? rcu_read_lock_held_common+0x1a/0x50 [ 74.829022] ? rcu_read_lock_held+0x5f/0xb0 [ 74.829023] _raw_spin_lock_irqsave+0x43/0x90 [ 74.829024] ? htab_map_update_elem+0x271/0x5a0 [ 74.829025] htab_map_update_elem+0x271/0x5a0 [ 74.829026] bpf_prog_1fd9e30e1438d3c5_oncpu+0x9c/0xe88 [ 74.829027] bpf_overflow_handler+0x127/0x320 [ 74.829028] ? perf_event_text_poke_output+0x4d0/0x4d0 [ 74.829029] ? sched_clock_cpu+0x18/0x130 [ 74.829030] __perf_event_overflow+0xae/0x190 [ 74.829030] handle_pmi_common+0x34c/0x470 [ 74.829031] ? intel_pmu_save_and_restart+0x90/0x90 [ 74.829032] ? lock_acquire+0x3f8/0x510 [ 74.829033] ? lock_release+0x6b0/0x6b0 [ 74.829034] intel_pmu_handle_irq+0x11e/0x240 [ 74.829034] perf_event_nmi_handler+0x40/0x60 [ 74.829035] nmi_handle+0x110/0x360 [ 74.829036] ? __intel_pmu_enable_all.constprop.0+0x72/0xf0 [ 74.829037] default_do_nmi+0x6b/0x170 [ 74.829038] exc_nmi+0x106/0x130 [ 74.829038] end_repeat_nmi+0x16/0x55 [ 74.829039] RIP: 0010:__intel_pmu_enable_all.constprop.0+0x72/0xf0 [ 74.829042] Code: 2f 1f 03 48 8d bb b8 0c 00 00 e8 29 09 41 00 48 ... [ 74.829043] RSP: 0000:ffff8880a604fc90 EFLAGS: 00000002 [ 74.829044] RAX: 000000070000000f RBX: ffff8883ec2195a0 RCX: 000000000000038f [ 74.829045] RDX: 0000000000000007 RSI: ffffffff82e72c20 RDI: ffff8883ec21a258 [ 74.829046] RBP: 000000070000000f R08: ffffffff8101b013 R09: fffffbfff0a7982d [ 74.829047] R10: ffffffff853cc167 R11: fffffbfff0a7982c R12: 0000000000000000 [ 74.829049] R13: ffff8883ec3f0af0 R14: ffff8883ec3fd120 R15: ffff8883e9c92098 [ 74.829049] ? intel_pmu_lbr_enable_all+0x43/0x240 [ 74.829050] ? __intel_pmu_enable_all.constprop.0+0x72/0xf0 [ 74.829051] ? __intel_pmu_enable_all.constprop.0+0x72/0xf0 [ 74.829052] </NMI> [ 74.829053] perf_event_task_tick+0x48d/0x650 [ 74.829054] scheduler_tick+0x129/0x210 [ 74.829054] update_process_times+0x37/0x70 [ 74.829055] tick_sched_handle.isra.0+0x35/0x90 [ 74.829056] tick_sched_timer+0x8f/0xb0 [ 74.829057] __hrtimer_run_queues+0x364/0x7d0 [ 74.829058] ? tick_sched_do_timer+0xa0/0xa0 [ 74.829058] ? enqueue_hrtimer+0x1e0/0x1e0 [ 74.829059] ? recalibrate_cpu_khz+0x10/0x10 [ 74.829060] ? ktime_get_update_offsets_now+0x1a3/0x360 [ 74.829061] hrtimer_interrupt+0x1bb/0x360 [ 74.829062] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 74.829063] __sysvec_apic_timer_interrupt+0xed/0x3d0 [ 74.829064] sysvec_apic_timer_interrupt+0x3f/0xd0 [ 74.829064] ? asm_sysvec_apic_timer_interrupt+0xa/0x20 [ 74.829065] asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 74.829066] RIP: 0033:0x7fba18d579b4 [ 74.829068] Code: 74 54 44 0f b6 4a 04 41 83 e1 0f 41 80 f9 ... [ 74.829069] RSP: 002b:00007ffc9ba69570 EFLAGS: 00000206 [ 74.829071] RAX: 00007fba192084c0 RBX: 00007fba18c24d28 RCX: 00000000000007a4 [ 74.829072] RDX: 00007fba18c30488 RSI: 0000000000000000 RDI: 000000000000037b [ 74.829073] RBP: 00007fba18ca5760 R08: 00007fba18c248fc R09: 00007fba18c94c30 [ 74.829074] R10: 000000000000002f R11: 0000000000073c30 R12: 00007ffc9ba695e0 [ 74.829075] R13: 00000000000003f3 R14: 00007fba18c21ac8 R15: 00000000000058d6
However, such warning should not apply across multiple hashtabs. The system will not deadlock if one hashtab is used in NMI, while another hashtab is used in non-NMI.
Use separate lockdep class for each hashtab, so that we don't get this false alert.
Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201029071925.3103400-2-songliubraving@fb.com (cherry picked from commit c50eb518e262fa06bd334e6eec172eaf5d7a5bd9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/hashtab.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 6c444e815406..3a7223cef363 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -99,6 +99,7 @@ struct bpf_htab { u32 n_buckets; /* number of hash buckets */ u32 elem_size; /* size of each element in bytes */ u32 hashrnd; + struct lock_class_key lockdep_key; };
/* each htab element is struct htab_elem + key + value */ @@ -136,12 +137,18 @@ static void htab_init_buckets(struct bpf_htab *htab) { unsigned i;
+ lockdep_register_key(&htab->lockdep_key); for (i = 0; i < htab->n_buckets; i++) { INIT_HLIST_NULLS_HEAD(&htab->buckets[i].head, i); - if (htab_use_raw_lock(htab)) + if (htab_use_raw_lock(htab)) { raw_spin_lock_init(&htab->buckets[i].raw_lock); - else + lockdep_set_class(&htab->buckets[i].raw_lock, + &htab->lockdep_key); + } else { spin_lock_init(&htab->buckets[i].lock); + lockdep_set_class(&htab->buckets[i].lock, + &htab->lockdep_key); + } } }
@@ -1338,6 +1345,7 @@ static void htab_map_free(struct bpf_map *map)
free_percpu(htab->extra_elems); bpf_map_area_free(htab->buckets); + lockdep_unregister_key(&htab->lockdep_key); kfree(htab); }
From: Song Liu songliubraving@fb.com
mainline inclusion from mainline-5.11-rc1 commit 20b6cc34ea74b6a84599c1f8a70f3315b56a1883 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
If a hashtab is accessed in both non-NMI and NMI context, the system may deadlock on bucket->lock. Fix this issue with percpu counter map_locked. map_locked rejects concurrent access to the same bucket from the same CPU. To reduce memory overhead, map_locked is not added per bucket. Instead, 8 percpu counters are added to each hashtab. buckets are assigned to these counters based on the lower bits of its hash.
Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201029071925.3103400-3-songliubraving@fb.com (cherry picked from commit 20b6cc34ea74b6a84599c1f8a70f3315b56a1883) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/hashtab.c | 114 +++++++++++++++++++++++++++++++------------ 1 file changed, 82 insertions(+), 32 deletions(-)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 3a7223cef363..a61c40dd5976 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -86,6 +86,9 @@ struct bucket { }; };
+#define HASHTAB_MAP_LOCK_COUNT 8 +#define HASHTAB_MAP_LOCK_MASK (HASHTAB_MAP_LOCK_COUNT - 1) + struct bpf_htab { struct bpf_map map; struct bucket *buckets; @@ -100,6 +103,7 @@ struct bpf_htab { u32 elem_size; /* size of each element in bytes */ u32 hashrnd; struct lock_class_key lockdep_key; + int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT]; };
/* each htab element is struct htab_elem + key + value */ @@ -152,26 +156,41 @@ static void htab_init_buckets(struct bpf_htab *htab) } }
-static inline unsigned long htab_lock_bucket(const struct bpf_htab *htab, - struct bucket *b) +static inline int htab_lock_bucket(const struct bpf_htab *htab, + struct bucket *b, u32 hash, + unsigned long *pflags) { unsigned long flags;
+ hash = hash & HASHTAB_MAP_LOCK_MASK; + + migrate_disable(); + if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) { + __this_cpu_dec(*(htab->map_locked[hash])); + migrate_enable(); + return -EBUSY; + } + if (htab_use_raw_lock(htab)) raw_spin_lock_irqsave(&b->raw_lock, flags); else spin_lock_irqsave(&b->lock, flags); - return flags; + *pflags = flags; + + return 0; }
static inline void htab_unlock_bucket(const struct bpf_htab *htab, - struct bucket *b, + struct bucket *b, u32 hash, unsigned long flags) { + hash = hash & HASHTAB_MAP_LOCK_MASK; if (htab_use_raw_lock(htab)) raw_spin_unlock_irqrestore(&b->raw_lock, flags); else spin_unlock_irqrestore(&b->lock, flags); + __this_cpu_dec(*(htab->map_locked[hash])); + migrate_enable(); }
static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node); @@ -429,8 +448,8 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU); bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC); struct bpf_htab *htab; + int err, i; u64 cost; - int err;
htab = kzalloc(sizeof(*htab), GFP_USER); if (!htab) @@ -487,6 +506,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) if (!htab->buckets) goto free_charge;
+ for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) { + htab->map_locked[i] = __alloc_percpu_gfp(sizeof(int), + sizeof(int), GFP_USER); + if (!htab->map_locked[i]) + goto free_map_locked; + } + if (htab->map.map_flags & BPF_F_ZERO_SEED) htab->hashrnd = 0; else @@ -497,7 +523,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) if (prealloc) { err = prealloc_init(htab); if (err) - goto free_buckets; + goto free_map_locked;
if (!percpu && !lru) { /* lru itself can remove the least used element, so @@ -513,7 +539,9 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
free_prealloc: prealloc_destroy(htab); -free_buckets: +free_map_locked: + for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) + free_percpu(htab->map_locked[i]); bpf_map_area_free(htab->buckets); free_charge: bpf_map_charge_finish(&htab->map.memory); @@ -694,12 +722,15 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node) struct hlist_nulls_node *n; unsigned long flags; struct bucket *b; + int ret;
tgt_l = container_of(node, struct htab_elem, lru_node); b = __select_bucket(htab, tgt_l->hash); head = &b->head;
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags); + if (ret) + return false;
hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) if (l == tgt_l) { @@ -707,7 +738,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node) break; }
- htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, tgt_l->hash, flags);
return l == tgt_l; } @@ -1005,7 +1036,9 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value, */ }
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, hash, &flags); + if (ret) + return ret;
l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -1046,7 +1079,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value, } ret = 0; err: - htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, hash, flags); return ret; }
@@ -1084,7 +1117,9 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value, return -ENOMEM; memcpy(l_new->key + round_up(map->key_size, 8), value, map->value_size);
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, hash, &flags); + if (ret) + return ret;
l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -1103,7 +1138,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value, ret = 0;
err: - htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, hash, flags);
if (ret) bpf_lru_push_free(&htab->lru, &l_new->lru_node); @@ -1138,7 +1173,9 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key, b = __select_bucket(htab, hash); head = &b->head;
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, hash, &flags); + if (ret) + return ret;
l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -1161,7 +1198,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key, } ret = 0; err: - htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, hash, flags); return ret; }
@@ -1201,7 +1238,9 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, return -ENOMEM; }
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, hash, &flags); + if (ret) + return ret;
l_old = lookup_elem_raw(head, hash, key, key_size);
@@ -1223,7 +1262,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, } ret = 0; err: - htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, hash, flags); if (l_new) bpf_lru_push_free(&htab->lru, &l_new->lru_node); return ret; @@ -1251,7 +1290,7 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key) struct htab_elem *l; unsigned long flags; u32 hash, key_size; - int ret = -ENOENT; + int ret;
WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
@@ -1261,17 +1300,20 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key) b = __select_bucket(htab, hash); head = &b->head;
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, hash, &flags); + if (ret) + return ret;
l = lookup_elem_raw(head, hash, key, key_size);
if (l) { hlist_nulls_del_rcu(&l->hash_node); free_htab_elem(htab, l); - ret = 0; + } else { + ret = -ENOENT; }
- htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, hash, flags); return ret; }
@@ -1283,7 +1325,7 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key) struct htab_elem *l; unsigned long flags; u32 hash, key_size; - int ret = -ENOENT; + int ret;
WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
@@ -1293,16 +1335,18 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key) b = __select_bucket(htab, hash); head = &b->head;
- flags = htab_lock_bucket(htab, b); + ret = htab_lock_bucket(htab, b, hash, &flags); + if (ret) + return ret;
l = lookup_elem_raw(head, hash, key, key_size);
- if (l) { + if (l) hlist_nulls_del_rcu(&l->hash_node); - ret = 0; - } + else + ret = -ENOENT;
- htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, hash, flags); if (l) bpf_lru_push_free(&htab->lru, &l->lru_node); return ret; @@ -1328,6 +1372,7 @@ static void delete_all_elements(struct bpf_htab *htab) static void htab_map_free(struct bpf_map *map) { struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + int i;
/* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback. * bpf_free_used_maps() is called after bpf prog is no longer executing. @@ -1346,6 +1391,8 @@ static void htab_map_free(struct bpf_map *map) free_percpu(htab->extra_elems); bpf_map_area_free(htab->buckets); lockdep_unregister_key(&htab->lockdep_key); + for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) + free_percpu(htab->map_locked[i]); kfree(htab); }
@@ -1449,8 +1496,11 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, b = &htab->buckets[batch]; head = &b->head; /* do not grab the lock unless need it (bucket_cnt > 0). */ - if (locked) - flags = htab_lock_bucket(htab, b); + if (locked) { + ret = htab_lock_bucket(htab, b, batch, &flags); + if (ret) + goto next_batch; + }
bucket_cnt = 0; hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) @@ -1467,7 +1517,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, /* Note that since bucket_cnt > 0 here, it is implicit * that the locked was grabbed, so release it. */ - htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, batch, flags); rcu_read_unlock(); bpf_enable_instrumentation(); goto after_loop; @@ -1478,7 +1528,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, /* Note that since bucket_cnt > 0 here, it is implicit * that the locked was grabbed, so release it. */ - htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, batch, flags); rcu_read_unlock(); bpf_enable_instrumentation(); kvfree(keys); @@ -1531,7 +1581,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, dst_val += value_size; }
- htab_unlock_bucket(htab, b, flags); + htab_unlock_bucket(htab, b, batch, flags); locked = false;
while (node_to_free) {
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit c81ed6d81e0560713ceb94917ff1981848d0614e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Factor out commiting of appended type data. Also extract fetching the very last type in the BTF (to append members to). These two operations are common across many APIs and will be easier to refactor with split BTF, if they are extracted into a single place.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-2-andrii@kernel.org (cherry picked from commit c81ed6d81e0560713ceb94917ff1981848d0614e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 123 ++++++++++++++++---------------------------- 1 file changed, 43 insertions(+), 80 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index e6f644cdc9f1..b3633b19cc32 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1557,6 +1557,20 @@ static void btf_type_inc_vlen(struct btf_type *t) t->info = btf_type_info(btf_kind(t), btf_vlen(t) + 1, btf_kflag(t)); }
+static int btf_commit_type(struct btf *btf, int data_sz) +{ + int err; + + err = btf_add_type_idx_entry(btf, btf->hdr->type_len); + if (err) + return err; + + btf->hdr->type_len += data_sz; + btf->hdr->str_off += data_sz; + btf->nr_types++; + return btf->nr_types; +} + /* * Append new BTF_KIND_INT type with: * - *name* - non-empty, non-NULL type name; @@ -1569,7 +1583,7 @@ static void btf_type_inc_vlen(struct btf_type *t) int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding) { struct btf_type *t; - int sz, err, name_off; + int sz, name_off;
/* non-empty name */ if (!name || !name[0]) @@ -1603,14 +1617,7 @@ int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding /* set INT info, we don't allow setting legacy bit offset/size */ *(__u32 *)(t + 1) = (encoding << 24) | (byte_sz * 8);
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* it's completely legal to append BTF types with type IDs pointing forward to @@ -1628,7 +1635,7 @@ static int validate_type_id(int id) static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref_type_id) { struct btf_type *t; - int sz, name_off = 0, err; + int sz, name_off = 0;
if (validate_type_id(ref_type_id)) return -EINVAL; @@ -1651,14 +1658,7 @@ static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref t->info = btf_type_info(kind, 0, 0); t->type = ref_type_id;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* @@ -1686,7 +1686,7 @@ int btf__add_array(struct btf *btf, int index_type_id, int elem_type_id, __u32 n { struct btf_type *t; struct btf_array *a; - int sz, err; + int sz;
if (validate_type_id(index_type_id) || validate_type_id(elem_type_id)) return -EINVAL; @@ -1708,21 +1708,14 @@ int btf__add_array(struct btf *btf, int index_type_id, int elem_type_id, __u32 n a->index_type = index_type_id; a->nelems = nr_elems;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* generic STRUCT/UNION append function */ static int btf_add_composite(struct btf *btf, int kind, const char *name, __u32 bytes_sz) { struct btf_type *t; - int sz, err, name_off = 0; + int sz, name_off = 0;
if (btf_ensure_modifiable(btf)) return -ENOMEM; @@ -1745,14 +1738,7 @@ static int btf_add_composite(struct btf *btf, int kind, const char *name, __u32 t->info = btf_type_info(kind, 0, 0); t->size = bytes_sz;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* @@ -1790,6 +1776,11 @@ int btf__add_union(struct btf *btf, const char *name, __u32 byte_sz) return btf_add_composite(btf, BTF_KIND_UNION, name, byte_sz); }
+static struct btf_type *btf_last_type(struct btf *btf) +{ + return btf_type_by_id(btf, btf__get_nr_types(btf)); +} + /* * Append new field for the current STRUCT/UNION type with: * - *name* - name of the field, can be NULL or empty for anonymous field; @@ -1811,7 +1802,7 @@ int btf__add_field(struct btf *btf, const char *name, int type_id, /* last type should be union/struct */ if (btf->nr_types == 0) return -EINVAL; - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); if (!btf_is_composite(t)) return -EINVAL;
@@ -1846,7 +1837,7 @@ int btf__add_field(struct btf *btf, const char *name, int type_id, m->offset = bit_offset | (bit_size << 24);
/* btf_add_type_mem can invalidate t pointer */ - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); /* update parent type's vlen and kflag */ t->info = btf_type_info(btf_kind(t), btf_vlen(t) + 1, is_bitfield || btf_kflag(t));
@@ -1871,7 +1862,7 @@ int btf__add_field(struct btf *btf, const char *name, int type_id, int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz) { struct btf_type *t; - int sz, err, name_off = 0; + int sz, name_off = 0;
/* byte_sz must be power of 2 */ if (!byte_sz || (byte_sz & (byte_sz - 1)) || byte_sz > 8) @@ -1896,14 +1887,7 @@ int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz) t->info = btf_type_info(BTF_KIND_ENUM, 0, 0); t->size = byte_sz;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* @@ -1923,7 +1907,7 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value) /* last type should be BTF_KIND_ENUM */ if (btf->nr_types == 0) return -EINVAL; - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); if (!btf_is_enum(t)) return -EINVAL;
@@ -1950,7 +1934,7 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value) v->val = value;
/* update parent type's vlen */ - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); btf_type_inc_vlen(t);
btf->hdr->type_len += sz; @@ -2090,7 +2074,7 @@ int btf__add_func(struct btf *btf, const char *name, int btf__add_func_proto(struct btf *btf, int ret_type_id) { struct btf_type *t; - int sz, err; + int sz;
if (validate_type_id(ret_type_id)) return -EINVAL; @@ -2110,14 +2094,7 @@ int btf__add_func_proto(struct btf *btf, int ret_type_id) t->info = btf_type_info(BTF_KIND_FUNC_PROTO, 0, 0); t->type = ret_type_id;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* @@ -2140,7 +2117,7 @@ int btf__add_func_param(struct btf *btf, const char *name, int type_id) /* last type should be BTF_KIND_FUNC_PROTO */ if (btf->nr_types == 0) return -EINVAL; - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); if (!btf_is_func_proto(t)) return -EINVAL;
@@ -2163,7 +2140,7 @@ int btf__add_func_param(struct btf *btf, const char *name, int type_id) p->type = type_id;
/* update parent type's vlen */ - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); btf_type_inc_vlen(t);
btf->hdr->type_len += sz; @@ -2185,7 +2162,7 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id) { struct btf_type *t; struct btf_var *v; - int sz, err, name_off; + int sz, name_off;
/* non-empty name */ if (!name || !name[0]) @@ -2216,14 +2193,7 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id) v = btf_var(t); v->linkage = linkage;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* @@ -2241,7 +2211,7 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id) int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz) { struct btf_type *t; - int sz, err, name_off; + int sz, name_off;
/* non-empty name */ if (!name || !name[0]) @@ -2264,14 +2234,7 @@ int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz) t->info = btf_type_info(BTF_KIND_DATASEC, 0, 0); t->size = byte_sz;
- err = btf_add_type_idx_entry(btf, btf->hdr->type_len); - if (err) - return err; - - btf->hdr->type_len += sz; - btf->hdr->str_off += sz; - btf->nr_types++; - return btf->nr_types; + return btf_commit_type(btf, sz); }
/* @@ -2293,7 +2256,7 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __ /* last type should be BTF_KIND_DATASEC */ if (btf->nr_types == 0) return -EINVAL; - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); if (!btf_is_datasec(t)) return -EINVAL;
@@ -2314,7 +2277,7 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __ v->size = byte_sz;
/* update parent type's vlen */ - t = btf_type_by_id(btf, btf->nr_types); + t = btf_last_type(btf); btf_type_inc_vlen(t);
btf->hdr->type_len += sz;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit d9448f94962bd28554df7d9a342d37c7f13d6232 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove the requirement of a strictly exact string section contents. This used to be true when string deduplication was done through sorting, but with string dedup done through hash table, it's no longer true. So relax test harness to relax strings checks and, consequently, type checks, which now don't have to have exactly the same string offsets.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201105043402.2530976-3-andrii@kernel.org (cherry picked from commit d9448f94962bd28554df7d9a342d37c7f13d6232) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/btf.c | 40 ++++++++++++-------- 1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 93162484c2ca..8ae97e2a4b9d 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -6652,7 +6652,7 @@ static void do_test_dedup(unsigned int test_num) const void *test_btf_data, *expect_btf_data; const char *ret_test_next_str, *ret_expect_next_str; const char *test_strs, *expect_strs; - const char *test_str_cur, *test_str_end; + const char *test_str_cur; const char *expect_str_cur, *expect_str_end; unsigned int raw_btf_size; void *raw_btf; @@ -6719,12 +6719,18 @@ static void do_test_dedup(unsigned int test_num) goto done; }
- test_str_cur = test_strs; - test_str_end = test_strs + test_hdr->str_len; expect_str_cur = expect_strs; expect_str_end = expect_strs + expect_hdr->str_len; - while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) { + while (expect_str_cur < expect_str_end) { size_t test_len, expect_len; + int off; + + off = btf__find_str(test_btf, expect_str_cur); + if (CHECK(off < 0, "exp str '%s' not found: %d\n", expect_str_cur, off)) { + err = -1; + goto done; + } + test_str_cur = btf__str_by_offset(test_btf, off);
test_len = strlen(test_str_cur); expect_len = strlen(expect_str_cur); @@ -6741,15 +6747,8 @@ static void do_test_dedup(unsigned int test_num) err = -1; goto done; } - test_str_cur += test_len + 1; expect_str_cur += expect_len + 1; } - if (CHECK(test_str_cur != test_str_end, - "test_str_cur:%p != test_str_end:%p", - test_str_cur, test_str_end)) { - err = -1; - goto done; - }
test_nr_types = btf__get_nr_types(test_btf); expect_nr_types = btf__get_nr_types(expect_btf); @@ -6775,10 +6774,21 @@ static void do_test_dedup(unsigned int test_num) err = -1; goto done; } - if (CHECK(memcmp((void *)test_type, - (void *)expect_type, - test_size), - "type #%d: contents differ", i)) { + if (CHECK(btf_kind(test_type) != btf_kind(expect_type), + "type %d kind: exp %d != got %u\n", + i, btf_kind(expect_type), btf_kind(test_type))) { + err = -1; + goto done; + } + if (CHECK(test_type->info != expect_type->info, + "type %d info: exp %d != got %u\n", + i, expect_type->info, test_type->info)) { + err = -1; + goto done; + } + if (CHECK(test_type->size != expect_type->size, + "type %d size/type: exp %d != got %u\n", + i, expect_type->size, test_type->size)) { err = -1; goto done; }
From: Andrii Nakryiko andriin@fb.com
mainline inclusion from mainline-5.11-rc1 commit 88a82c2a9ab5b2ce533c3de3f4517853c2f67f53 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Revamp BTF dedup's string deduplication to match the approach of writable BTF string management. This allows to transfer deduplicated strings index back to BTF object after deduplication without expensive extra memory copying and hash map re-construction. It also simplifies the code and speeds it up, because hashmap-based string deduplication is faster than sort + unique approach.
Signed-off-by: Andrii Nakryiko andriin@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-4-andrii@kernel.org (cherry picked from commit 88a82c2a9ab5b2ce533c3de3f4517853c2f67f53) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 263 +++++++++++++++++--------------------------- 1 file changed, 98 insertions(+), 165 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index b3633b19cc32..4b6d184725e2 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -90,6 +90,14 @@ struct btf { struct hashmap *strs_hash; /* whether strings are already deduplicated */ bool strs_deduped; + /* extra indirection layer to make strings hashmap work with stable + * string offsets and ability to transparently choose between + * btf->strs_data or btf_dedup->strs_data as a source of strings. + * This is used for BTF strings dedup to transfer deduplicated strings + * data back to struct btf without re-building strings index. + */ + void **strs_data_ptr; + /* BTF object FD, if loaded into kernel */ int fd;
@@ -1360,17 +1368,19 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
static size_t strs_hash_fn(const void *key, void *ctx) { - struct btf *btf = ctx; - const char *str = btf->strs_data + (long)key; + const struct btf *btf = ctx; + const char *strs = *btf->strs_data_ptr; + const char *str = strs + (long)key;
return str_hash(str); }
static bool strs_hash_equal_fn(const void *key1, const void *key2, void *ctx) { - struct btf *btf = ctx; - const char *str1 = btf->strs_data + (long)key1; - const char *str2 = btf->strs_data + (long)key2; + const struct btf *btf = ctx; + const char *strs = *btf->strs_data_ptr; + const char *str1 = strs + (long)key1; + const char *str2 = strs + (long)key2;
return strcmp(str1, str2) == 0; } @@ -1415,6 +1425,9 @@ static int btf_ensure_modifiable(struct btf *btf) memcpy(types, btf->types_data, btf->hdr->type_len); memcpy(strs, btf->strs_data, btf->hdr->str_len);
+ /* make hashmap below use btf->strs_data as a source of strings */ + btf->strs_data_ptr = &btf->strs_data; + /* build lookup index for all strings */ hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, btf); if (IS_ERR(hash)) { @@ -2821,19 +2834,11 @@ struct btf_dedup { size_t hypot_cap; /* Various option modifying behavior of algorithm */ struct btf_dedup_opts opts; -}; - -struct btf_str_ptr { - const char *str; - __u32 new_off; - bool used; -}; - -struct btf_str_ptrs { - struct btf_str_ptr *ptrs; - const char *data; - __u32 cnt; - __u32 cap; + /* temporary strings deduplication state */ + void *strs_data; + size_t strs_cap; + size_t strs_len; + struct hashmap* strs_hash; };
static long hash_combine(long h, long value) @@ -3060,64 +3065,41 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx) return 0; }
-static int str_sort_by_content(const void *a1, const void *a2) -{ - const struct btf_str_ptr *p1 = a1; - const struct btf_str_ptr *p2 = a2; - - return strcmp(p1->str, p2->str); -} - -static int str_sort_by_offset(const void *a1, const void *a2) -{ - const struct btf_str_ptr *p1 = a1; - const struct btf_str_ptr *p2 = a2; - - if (p1->str != p2->str) - return p1->str < p2->str ? -1 : 1; - return 0; -} - -static int btf_dedup_str_ptr_cmp(const void *str_ptr, const void *pelem) -{ - const struct btf_str_ptr *p = pelem; - - if (str_ptr != p->str) - return (const char *)str_ptr < p->str ? -1 : 1; - return 0; -} - -static int btf_str_mark_as_used(__u32 *str_off_ptr, void *ctx) +static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) { - struct btf_str_ptrs *strs; - struct btf_str_ptr *s; + struct btf_dedup *d = ctx; + long old_off, new_off, len; + const char *s; + void *p; + int err;
if (*str_off_ptr == 0) return 0;
- strs = ctx; - s = bsearch(strs->data + *str_off_ptr, strs->ptrs, strs->cnt, - sizeof(struct btf_str_ptr), btf_dedup_str_ptr_cmp); - if (!s) - return -EINVAL; - s->used = true; - return 0; -} + s = btf__str_by_offset(d->btf, *str_off_ptr); + len = strlen(s) + 1;
-static int btf_str_remap_offset(__u32 *str_off_ptr, void *ctx) -{ - struct btf_str_ptrs *strs; - struct btf_str_ptr *s; + new_off = d->strs_len; + p = btf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off, BTF_MAX_STR_OFFSET, len); + if (!p) + return -ENOMEM;
- if (*str_off_ptr == 0) - return 0; + memcpy(p, s, len);
- strs = ctx; - s = bsearch(strs->data + *str_off_ptr, strs->ptrs, strs->cnt, - sizeof(struct btf_str_ptr), btf_dedup_str_ptr_cmp); - if (!s) - return -EINVAL; - *str_off_ptr = s->new_off; + /* Now attempt to add the string, but only if the string with the same + * contents doesn't exist already (HASHMAP_ADD strategy). If such + * string exists, we'll get its offset in old_off (that's old_key). + */ + err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off, + HASHMAP_ADD, (const void **)&old_off, NULL); + if (err == -EEXIST) { + *str_off_ptr = old_off; + } else if (err) { + return err; + } else { + *str_off_ptr = new_off; + d->strs_len += len; + } return 0; }
@@ -3134,118 +3116,69 @@ static int btf_str_remap_offset(__u32 *str_off_ptr, void *ctx) */ static int btf_dedup_strings(struct btf_dedup *d) { - char *start = d->btf->strs_data; - char *end = start + d->btf->hdr->str_len; - char *p = start, *tmp_strs = NULL; - struct btf_str_ptrs strs = { - .cnt = 0, - .cap = 0, - .ptrs = NULL, - .data = start, - }; - int i, j, err = 0, grp_idx; - bool grp_used; + char *s; + int err;
if (d->btf->strs_deduped) return 0;
- /* build index of all strings */ - while (p < end) { - if (strs.cnt + 1 > strs.cap) { - struct btf_str_ptr *new_ptrs; - - strs.cap += max(strs.cnt / 2, 16U); - new_ptrs = libbpf_reallocarray(strs.ptrs, strs.cap, sizeof(strs.ptrs[0])); - if (!new_ptrs) { - err = -ENOMEM; - goto done; - } - strs.ptrs = new_ptrs; - } - - strs.ptrs[strs.cnt].str = p; - strs.ptrs[strs.cnt].used = false; - - p += strlen(p) + 1; - strs.cnt++; - } - - /* temporary storage for deduplicated strings */ - tmp_strs = malloc(d->btf->hdr->str_len); - if (!tmp_strs) { - err = -ENOMEM; - goto done; - } - - /* mark all used strings */ - strs.ptrs[0].used = true; - err = btf_for_each_str_off(d, btf_str_mark_as_used, &strs); - if (err) - goto done; - - /* sort strings by context, so that we can identify duplicates */ - qsort(strs.ptrs, strs.cnt, sizeof(strs.ptrs[0]), str_sort_by_content); + s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1); + if (!s) + return -ENOMEM; + /* initial empty string */ + s[0] = 0; + d->strs_len = 1;
- /* - * iterate groups of equal strings and if any instance in a group was - * referenced, emit single instance and remember new offset + /* temporarily switch to use btf_dedup's strs_data for strings for hash + * functions; later we'll just transfer hashmap to struct btf as is, + * along the strs_data */ - p = tmp_strs; - grp_idx = 0; - grp_used = strs.ptrs[0].used; - /* iterate past end to avoid code duplication after loop */ - for (i = 1; i <= strs.cnt; i++) { - /* - * when i == strs.cnt, we want to skip string comparison and go - * straight to handling last group of strings (otherwise we'd - * need to handle last group after the loop w/ duplicated code) - */ - if (i < strs.cnt && - !strcmp(strs.ptrs[i].str, strs.ptrs[grp_idx].str)) { - grp_used = grp_used || strs.ptrs[i].used; - continue; - } + d->btf->strs_data_ptr = &d->strs_data;
- /* - * this check would have been required after the loop to handle - * last group of strings, but due to <= condition in a loop - * we avoid that duplication - */ - if (grp_used) { - int new_off = p - tmp_strs; - __u32 len = strlen(strs.ptrs[grp_idx].str); - - memmove(p, strs.ptrs[grp_idx].str, len + 1); - for (j = grp_idx; j < i; j++) - strs.ptrs[j].new_off = new_off; - p += len + 1; - } - - if (i < strs.cnt) { - grp_idx = i; - grp_used = strs.ptrs[i].used; - } + d->strs_hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, d->btf); + if (IS_ERR(d->strs_hash)) { + err = PTR_ERR(d->strs_hash); + d->strs_hash = NULL; + goto err_out; }
- /* replace original strings with deduped ones */ - d->btf->hdr->str_len = p - tmp_strs; - memmove(start, tmp_strs, d->btf->hdr->str_len); - end = start + d->btf->hdr->str_len; - - /* restore original order for further binary search lookups */ - qsort(strs.ptrs, strs.cnt, sizeof(strs.ptrs[0]), str_sort_by_offset); + /* insert empty string; we won't be looking it up during strings + * dedup, but it's good to have it for generic BTF string lookups + */ + err = hashmap__insert(d->strs_hash, (void *)0, (void *)0, + HASHMAP_ADD, NULL, NULL); + if (err) + goto err_out;
/* remap string offsets */ - err = btf_for_each_str_off(d, btf_str_remap_offset, &strs); + err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d); if (err) - goto done; + goto err_out;
- d->btf->hdr->str_len = end - start; + /* replace BTF string data and hash with deduped ones */ + free(d->btf->strs_data); + hashmap__free(d->btf->strs_hash); + d->btf->strs_data = d->strs_data; + d->btf->strs_data_cap = d->strs_cap; + d->btf->hdr->str_len = d->strs_len; + d->btf->strs_hash = d->strs_hash; + /* now point strs_data_ptr back to btf->strs_data */ + d->btf->strs_data_ptr = &d->btf->strs_data; + + d->strs_data = d->strs_hash = NULL; + d->strs_len = d->strs_cap = 0; d->btf->strs_deduped = true; + return 0; + +err_out: + free(d->strs_data); + hashmap__free(d->strs_hash); + d->strs_data = d->strs_hash = NULL; + d->strs_len = d->strs_cap = 0; + + /* restore strings pointer for existing d->btf->strs_hash back */ + d->btf->strs_data_ptr = &d->strs_data;
-done: - free(tmp_strs); - free(strs.ptrs); return err; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit ba451366bf44498f22dd16c31a792083bd6f2ae1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Support split BTF operation, in which one BTF (base BTF) provides basic set of types and strings, while another one (split BTF) builds on top of base's types and strings and adds its own new types and strings. From API standpoint, the fact that the split BTF is built on top of the base BTF is transparent.
Type numeration is transparent. If the base BTF had last type ID #N, then all types in the split BTF start at type ID N+1. Any type in split BTF can reference base BTF types, but not vice versa. Programmatically construction of a split BTF on top of a base BTF is supported: one can create an empty split BTF with btf__new_empty_split() and pass base BTF as an input, or pass raw binary data to btf__new_split(), or use btf__parse_xxx_split() variants to get initial set of split types/strings from the ELF file with .BTF section.
String offsets are similarly transparent and are a logical continuation of base BTF's strings. When building BTF programmatically and adding a new string (explicitly with btf__add_str() or implicitly through appending new types/members), string-to-be-added would first be looked up from the base BTF's string section and re-used if it's there. If not, it will be looked up and/or added to the split BTF string section. Similarly to type IDs, types in split BTF can refer to strings from base BTF absolutely transparently (but not vice versa, of course, because base BTF doesn't "know" about existence of split BTF).
Internal type index is slightly adjusted to be zero-indexed, ignoring a fake [0] VOID type. This allows to handle split/base BTF type lookups transparently by using btf->start_id type ID offset, which is always 1 for base/non-split BTF and equals btf__get_nr_types(base_btf) + 1 for the split BTF.
BTF deduplication is not yet supported for split BTF and support for it will be added in separate patch.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-5-andrii@kernel.org (cherry picked from commit ba451366bf44498f22dd16c31a792083bd6f2ae1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 197 ++++++++++++++++++++++++++++++--------- tools/lib/bpf/btf.h | 8 ++ tools/lib/bpf/libbpf.map | 9 ++ 3 files changed, 169 insertions(+), 45 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 4b6d184725e2..176b73ef699c 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -78,10 +78,32 @@ struct btf { void *types_data; size_t types_data_cap; /* used size stored in hdr->type_len */
- /* type ID to `struct btf_type *` lookup index */ + /* type ID to `struct btf_type *` lookup index + * type_offs[0] corresponds to the first non-VOID type: + * - for base BTF it's type [1]; + * - for split BTF it's the first non-base BTF type. + */ __u32 *type_offs; size_t type_offs_cap; + /* number of types in this BTF instance: + * - doesn't include special [0] void type; + * - for split BTF counts number of types added on top of base BTF. + */ __u32 nr_types; + /* if not NULL, points to the base BTF on top of which the current + * split BTF is based + */ + struct btf *base_btf; + /* BTF type ID of the first type in this BTF instance: + * - for base BTF it's equal to 1; + * - for split BTF it's equal to biggest type ID of base BTF plus 1. + */ + int start_id; + /* logical string offset of this BTF instance: + * - for base BTF it's equal to 0; + * - for split BTF it's equal to total size of base BTF's string section size. + */ + int start_str_off;
void *strs_data; size_t strs_data_cap; /* used size stored in hdr->str_len */ @@ -176,7 +198,7 @@ static int btf_add_type_idx_entry(struct btf *btf, __u32 type_off) __u32 *p;
p = btf_add_mem((void **)&btf->type_offs, &btf->type_offs_cap, sizeof(__u32), - btf->nr_types + 1, BTF_MAX_NR_TYPES, 1); + btf->nr_types, BTF_MAX_NR_TYPES, 1); if (!p) return -ENOMEM;
@@ -249,12 +271,16 @@ static int btf_parse_str_sec(struct btf *btf) const char *start = btf->strs_data; const char *end = start + btf->hdr->str_len;
- if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET || - start[0] || end[-1]) { + if (btf->base_btf && hdr->str_len == 0) + return 0; + if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1]) { + pr_debug("Invalid BTF string section\n"); + return -EINVAL; + } + if (!btf->base_btf && start[0]) { pr_debug("Invalid BTF string section\n"); return -EINVAL; } - return 0; }
@@ -369,19 +395,9 @@ static int btf_parse_type_sec(struct btf *btf) struct btf_header *hdr = btf->hdr; void *next_type = btf->types_data; void *end_type = next_type + hdr->type_len; - int err, i = 0, type_size; - - /* VOID (type_id == 0) is specially handled by btf__get_type_by_id(), - * so ensure we can never properly use its offset from index by - * setting it to a large value - */ - err = btf_add_type_idx_entry(btf, UINT_MAX); - if (err) - return err; + int err, type_size;
while (next_type + sizeof(struct btf_type) <= end_type) { - i++; - if (btf->swapped_endian) btf_bswap_type_base(next_type);
@@ -389,7 +405,7 @@ static int btf_parse_type_sec(struct btf *btf) if (type_size < 0) return type_size; if (next_type + type_size > end_type) { - pr_warn("BTF type [%d] is malformed\n", i); + pr_warn("BTF type [%d] is malformed\n", btf->start_id + btf->nr_types); return -EINVAL; }
@@ -414,7 +430,7 @@ static int btf_parse_type_sec(struct btf *btf)
__u32 btf__get_nr_types(const struct btf *btf) { - return btf->nr_types; + return btf->start_id + btf->nr_types - 1; }
/* internal helper returning non-const pointer to a type */ @@ -422,13 +438,14 @@ static struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id) { if (type_id == 0) return &btf_void; - - return btf->types_data + btf->type_offs[type_id]; + if (type_id < btf->start_id) + return btf_type_by_id(btf->base_btf, type_id); + return btf->types_data + btf->type_offs[type_id - btf->start_id]; }
const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 type_id) { - if (type_id > btf->nr_types) + if (type_id >= btf->start_id + btf->nr_types) return NULL; return btf_type_by_id((struct btf *)btf, type_id); } @@ -437,9 +454,13 @@ static int determine_ptr_size(const struct btf *btf) { const struct btf_type *t; const char *name; - int i; + int i, n;
- for (i = 1; i <= btf->nr_types; i++) { + if (btf->base_btf && btf->base_btf->ptr_sz > 0) + return btf->base_btf->ptr_sz; + + n = btf__get_nr_types(btf); + for (i = 1; i <= n; i++) { t = btf__type_by_id(btf, i); if (!btf_is_int(t)) continue; @@ -722,7 +743,7 @@ void btf__free(struct btf *btf) free(btf); }
-struct btf *btf__new_empty(void) +static struct btf *btf_new_empty(struct btf *base_btf) { struct btf *btf;
@@ -730,12 +751,21 @@ struct btf *btf__new_empty(void) if (!btf) return ERR_PTR(-ENOMEM);
+ btf->nr_types = 0; + btf->start_id = 1; + btf->start_str_off = 0; btf->fd = -1; btf->ptr_sz = sizeof(void *); btf->swapped_endian = false;
+ if (base_btf) { + btf->base_btf = base_btf; + btf->start_id = btf__get_nr_types(base_btf) + 1; + btf->start_str_off = base_btf->hdr->str_len; + } + /* +1 for empty string at offset 0 */ - btf->raw_size = sizeof(struct btf_header) + 1; + btf->raw_size = sizeof(struct btf_header) + (base_btf ? 0 : 1); btf->raw_data = calloc(1, btf->raw_size); if (!btf->raw_data) { free(btf); @@ -749,12 +779,22 @@ struct btf *btf__new_empty(void)
btf->types_data = btf->raw_data + btf->hdr->hdr_len; btf->strs_data = btf->raw_data + btf->hdr->hdr_len; - btf->hdr->str_len = 1; /* empty string at offset 0 */ + btf->hdr->str_len = base_btf ? 0 : 1; /* empty string at offset 0 */
return btf; }
-struct btf *btf__new(const void *data, __u32 size) +struct btf *btf__new_empty(void) +{ + return btf_new_empty(NULL); +} + +struct btf *btf__new_empty_split(struct btf *base_btf) +{ + return btf_new_empty(base_btf); +} + +static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf) { struct btf *btf; int err; @@ -763,6 +803,16 @@ struct btf *btf__new(const void *data, __u32 size) if (!btf) return ERR_PTR(-ENOMEM);
+ btf->nr_types = 0; + btf->start_id = 1; + btf->start_str_off = 0; + + if (base_btf) { + btf->base_btf = base_btf; + btf->start_id = btf__get_nr_types(base_btf) + 1; + btf->start_str_off = base_btf->hdr->str_len; + } + btf->raw_data = malloc(size); if (!btf->raw_data) { err = -ENOMEM; @@ -795,7 +845,13 @@ struct btf *btf__new(const void *data, __u32 size) return btf; }
-struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext) +struct btf *btf__new(const void *data, __u32 size) +{ + return btf_new(data, size, NULL); +} + +static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, + struct btf_ext **btf_ext) { Elf_Data *btf_data = NULL, *btf_ext_data = NULL; int err = 0, fd = -1, idx = 0; @@ -873,7 +929,7 @@ struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext) err = -ENOENT; goto done; } - btf = btf__new(btf_data->d_buf, btf_data->d_size); + btf = btf_new(btf_data->d_buf, btf_data->d_size, base_btf); if (IS_ERR(btf)) goto done;
@@ -918,7 +974,17 @@ struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext) return btf; }
-struct btf *btf__parse_raw(const char *path) +struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext) +{ + return btf_parse_elf(path, NULL, btf_ext); +} + +struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf) +{ + return btf_parse_elf(path, base_btf, NULL); +} + +static struct btf *btf_parse_raw(const char *path, struct btf *base_btf) { struct btf *btf = NULL; void *data = NULL; @@ -972,7 +1038,7 @@ struct btf *btf__parse_raw(const char *path) }
/* finally parse BTF data */ - btf = btf__new(data, sz); + btf = btf_new(data, sz, base_btf);
err_out: free(data); @@ -981,18 +1047,38 @@ struct btf *btf__parse_raw(const char *path) return err ? ERR_PTR(err) : btf; }
-struct btf *btf__parse(const char *path, struct btf_ext **btf_ext) +struct btf *btf__parse_raw(const char *path) +{ + return btf_parse_raw(path, NULL); +} + +struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf) +{ + return btf_parse_raw(path, base_btf); +} + +static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext) { struct btf *btf;
if (btf_ext) *btf_ext = NULL;
- btf = btf__parse_raw(path); + btf = btf_parse_raw(path, base_btf); if (!IS_ERR(btf) || PTR_ERR(btf) != -EPROTO) return btf;
- return btf__parse_elf(path, btf_ext); + return btf_parse_elf(path, base_btf, btf_ext); +} + +struct btf *btf__parse(const char *path, struct btf_ext **btf_ext) +{ + return btf_parse(path, NULL, btf_ext); +} + +struct btf *btf__parse_split(const char *path, struct btf *base_btf) +{ + return btf_parse(path, base_btf, NULL); }
static int compare_vsi_off(const void *_a, const void *_b) @@ -1176,8 +1262,8 @@ static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endi
memcpy(p, btf->types_data, hdr->type_len); if (swap_endian) { - for (i = 1; i <= btf->nr_types; i++) { - t = p + btf->type_offs[i]; + for (i = 0; i < btf->nr_types; i++) { + t = p + btf->type_offs[i]; /* btf_bswap_type_rest() relies on native t->info, so * we swap base type info after we swapped all the * additional information @@ -1220,8 +1306,10 @@ const void *btf__get_raw_data(const struct btf *btf_ro, __u32 *size)
const char *btf__str_by_offset(const struct btf *btf, __u32 offset) { - if (offset < btf->hdr->str_len) - return btf->strs_data + offset; + if (offset < btf->start_str_off) + return btf__str_by_offset(btf->base_btf, offset); + else if (offset - btf->start_str_off < btf->hdr->str_len) + return btf->strs_data + (offset - btf->start_str_off); else return NULL; } @@ -1458,7 +1546,10 @@ static int btf_ensure_modifiable(struct btf *btf) /* if BTF was created from scratch, all strings are guaranteed to be * unique and deduplicated */ - btf->strs_deduped = btf->hdr->str_len <= 1; + if (btf->hdr->str_len == 0) + btf->strs_deduped = true; + if (!btf->base_btf && btf->hdr->str_len == 1) + btf->strs_deduped = true;
/* invalidate raw_data representation */ btf_invalidate_raw_data(btf); @@ -1490,6 +1581,14 @@ int btf__find_str(struct btf *btf, const char *s) long old_off, new_off, len; void *p;
+ if (btf->base_btf) { + int ret; + + ret = btf__find_str(btf->base_btf, s); + if (ret != -ENOENT) + return ret; + } + /* BTF needs to be in a modifiable state to build string lookup index */ if (btf_ensure_modifiable(btf)) return -ENOMEM; @@ -1504,7 +1603,7 @@ int btf__find_str(struct btf *btf, const char *s) memcpy(p, s, len);
if (hashmap__find(btf->strs_hash, (void *)new_off, (void **)&old_off)) - return old_off; + return btf->start_str_off + old_off;
return -ENOENT; } @@ -1520,6 +1619,14 @@ int btf__add_str(struct btf *btf, const char *s) void *p; int err;
+ if (btf->base_btf) { + int ret; + + ret = btf__find_str(btf->base_btf, s); + if (ret != -ENOENT) + return ret; + } + if (btf_ensure_modifiable(btf)) return -ENOMEM;
@@ -1546,12 +1653,12 @@ int btf__add_str(struct btf *btf, const char *s) err = hashmap__insert(btf->strs_hash, (void *)new_off, (void *)new_off, HASHMAP_ADD, (const void **)&old_off, NULL); if (err == -EEXIST) - return old_off; /* duplicated string, return existing offset */ + return btf->start_str_off + old_off; /* duplicated string, return existing offset */ if (err) return err;
btf->hdr->str_len += len; /* new unique string, adjust data length */ - return new_off; + return btf->start_str_off + new_off; }
static void *btf_add_type_mem(struct btf *btf, size_t add_sz) @@ -1581,7 +1688,7 @@ static int btf_commit_type(struct btf *btf, int data_sz) btf->hdr->type_len += data_sz; btf->hdr->str_off += data_sz; btf->nr_types++; - return btf->nr_types; + return btf->start_id + btf->nr_types - 1; }
/* @@ -4164,14 +4271,14 @@ static int btf_dedup_compact_types(struct btf_dedup *d)
memmove(p, btf__type_by_id(d->btf, i), len); d->hypot_map[i] = next_type_id; - d->btf->type_offs[next_type_id] = p - d->btf->types_data; + d->btf->type_offs[next_type_id - 1] = p - d->btf->types_data; p += len; next_type_id++; }
/* shrink struct btf's internal types index and update btf_header */ d->btf->nr_types = next_type_id - 1; - d->btf->type_offs_cap = d->btf->nr_types + 1; + d->btf->type_offs_cap = d->btf->nr_types; d->btf->hdr->type_len = p - d->btf->types_data; new_offs = libbpf_reallocarray(d->btf->type_offs, d->btf->type_offs_cap, sizeof(*new_offs)); diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 9cabc8b620e3..12d36d913168 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -31,11 +31,19 @@ enum btf_endianness { };
LIBBPF_API void btf__free(struct btf *btf); + LIBBPF_API struct btf *btf__new(const void *data, __u32 size); +LIBBPF_API struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf); LIBBPF_API struct btf *btf__new_empty(void); +LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf); + LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext); +LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf); LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext); +LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf); LIBBPF_API struct btf *btf__parse_raw(const char *path); +LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf); + LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); LIBBPF_API int btf__load(struct btf *btf); LIBBPF_API __s32 btf__find_by_name(const struct btf *btf, diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 16393cea53d0..0da243abdcf7 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -340,3 +340,12 @@ LIBBPF_0.2.0 { bpf_program__is_sched; bpf_program__set_sched; } LIBBPF_0.1.0; + +LIBBPF_0.3.0 { + global: + btf__parse_elf_split; + btf__parse_raw_split; + btf__parse_split; + btf__new_empty_split; + btf__new_split; +} LIBBPF_0.2.0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 197389da2fbfbc3cefb229268c32d858d9575c96 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add selftest validating ability to programmatically generate and then dump split BTF.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-6-andrii@kernel.org (cherry picked from commit 197389da2fbfbc3cefb229268c32d858d9575c96) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/btf_split.c | 99 +++++++++++++++++++ tools/testing/selftests/bpf/test_progs.h | 11 +++ 2 files changed, 110 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_split.c
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_split.c b/tools/testing/selftests/bpf/prog_tests/btf_split.c new file mode 100644 index 000000000000..ca7c2a91610a --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/btf_split.c @@ -0,0 +1,99 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include <test_progs.h> +#include <bpf/btf.h> + +static char *dump_buf; +static size_t dump_buf_sz; +static FILE *dump_buf_file; + +static void btf_dump_printf(void *ctx, const char *fmt, va_list args) +{ + vfprintf(ctx, fmt, args); +} + +void test_btf_split() { + struct btf_dump_opts opts; + struct btf_dump *d = NULL; + const struct btf_type *t; + struct btf *btf1, *btf2; + int str_off, i, err; + + btf1 = btf__new_empty(); + if (!ASSERT_OK_PTR(btf1, "empty_main_btf")) + return; + + btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */ + + btf__add_int(btf1, "int", 4, BTF_INT_SIGNED); /* [1] int */ + btf__add_ptr(btf1, 1); /* [2] ptr to int */ + + btf__add_struct(btf1, "s1", 4); /* [3] struct s1 { */ + btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ + /* } */ + + btf2 = btf__new_empty_split(btf1); + if (!ASSERT_OK_PTR(btf2, "empty_split_btf")) + goto cleanup; + + /* pointer size should be "inherited" from main BTF */ + ASSERT_EQ(btf__pointer_size(btf2), 8, "inherit_ptr_sz"); + + str_off = btf__find_str(btf2, "int"); + ASSERT_NEQ(str_off, -ENOENT, "str_int_missing"); + + t = btf__type_by_id(btf2, 1); + if (!ASSERT_OK_PTR(t, "int_type")) + goto cleanup; + ASSERT_EQ(btf_is_int(t), true, "int_kind"); + ASSERT_STREQ(btf__str_by_offset(btf2, t->name_off), "int", "int_name"); + + btf__add_struct(btf2, "s2", 16); /* [4] struct s2 { */ + btf__add_field(btf2, "f1", 3, 0, 0); /* struct s1 f1; */ + btf__add_field(btf2, "f2", 1, 32, 0); /* int f2; */ + btf__add_field(btf2, "f3", 2, 64, 0); /* int *f3; */ + /* } */ + + t = btf__type_by_id(btf1, 4); + ASSERT_NULL(t, "split_type_in_main"); + + t = btf__type_by_id(btf2, 4); + if (!ASSERT_OK_PTR(t, "split_struct_type")) + goto cleanup; + ASSERT_EQ(btf_is_struct(t), true, "split_struct_kind"); + ASSERT_EQ(btf_vlen(t), 3, "split_struct_vlen"); + ASSERT_STREQ(btf__str_by_offset(btf2, t->name_off), "s2", "split_struct_name"); + + /* BTF-to-C dump of split BTF */ + dump_buf_file = open_memstream(&dump_buf, &dump_buf_sz); + if (!ASSERT_OK_PTR(dump_buf_file, "dump_memstream")) + return; + opts.ctx = dump_buf_file; + d = btf_dump__new(btf2, NULL, &opts, btf_dump_printf); + if (!ASSERT_OK_PTR(d, "btf_dump__new")) + goto cleanup; + for (i = 1; i <= btf__get_nr_types(btf2); i++) { + err = btf_dump__dump_type(d, i); + ASSERT_OK(err, "dump_type_ok"); + } + fflush(dump_buf_file); + dump_buf[dump_buf_sz] = 0; /* some libc implementations don't do this */ + ASSERT_STREQ(dump_buf, +"struct s1 {\n" +" int f1;\n" +"};\n" +"\n" +"struct s2 {\n" +" struct s1 f1;\n" +" int f2;\n" +" int *f3;\n" +"};\n\n", "c_dump"); + +cleanup: + if (dump_buf_file) + fclose(dump_buf_file); + free(dump_buf); + btf_dump__free(d); + btf__free(btf1); + btf__free(btf2); +} diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index 238f5f61189e..d6b14853f3bc 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -141,6 +141,17 @@ extern int test__join_cgroup(const char *path); ___ok; \ })
+#define ASSERT_NEQ(actual, expected, name) ({ \ + static int duration = 0; \ + typeof(actual) ___act = (actual); \ + typeof(expected) ___exp = (expected); \ + bool ___ok = ___act != ___exp; \ + CHECK(!___ok, (name), \ + "unexpected %s: actual %lld == expected %lld\n", \ + (name), (long long)(___act), (long long)(___exp)); \ + ___ok; \ +}) + #define ASSERT_STREQ(actual, expected, name) ({ \ static int duration = 0; \ const char *___act = actual; \
From: Andrii Nakryiko andriin@fb.com
mainline inclusion from mainline-5.11-rc1 commit 1306c980cf892bc17e7296d3e9ab8e9082f893a1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add re-usable btf_helpers.{c,h} to provide BTF-related testing routines. Start with adding a raw BTF dumping helpers.
Raw BTF dump is the most succinct and at the same time a very human-friendly way to validate exact contents of BTF types. Cross-validate raw BTF dump and writable BTF in a single selftest. Raw type dump checks also serve as a good self-documentation.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-7-andrii@kernel.org (cherry picked from commit 1306c980cf892bc17e7296d3e9ab8e9082f893a1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 2 +- tools/testing/selftests/bpf/btf_helpers.c | 200 ++++++++++++++++++ tools/testing/selftests/bpf/btf_helpers.h | 12 ++ .../selftests/bpf/prog_tests/btf_write.c | 43 ++++ 4 files changed, 256 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/btf_helpers.c create mode 100644 tools/testing/selftests/bpf/btf_helpers.h
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 6fabcc2f739b..080d9796e8e3 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -394,7 +394,7 @@ TRUNNER_TESTS_DIR := prog_tests TRUNNER_BPF_PROGS_DIR := progs TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ network_helpers.c testing_helpers.c \ - flow_dissector_load.h + btf_helpers.c flow_dissector_load.h TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read \ $(wildcard progs/btf_dump_test_case_*.c) TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE diff --git a/tools/testing/selftests/bpf/btf_helpers.c b/tools/testing/selftests/bpf/btf_helpers.c new file mode 100644 index 000000000000..abc3f6c04cfc --- /dev/null +++ b/tools/testing/selftests/bpf/btf_helpers.c @@ -0,0 +1,200 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include <stdio.h> +#include <errno.h> +#include <bpf/btf.h> + +static const char * const btf_kind_str_mapping[] = { + [BTF_KIND_UNKN] = "UNKNOWN", + [BTF_KIND_INT] = "INT", + [BTF_KIND_PTR] = "PTR", + [BTF_KIND_ARRAY] = "ARRAY", + [BTF_KIND_STRUCT] = "STRUCT", + [BTF_KIND_UNION] = "UNION", + [BTF_KIND_ENUM] = "ENUM", + [BTF_KIND_FWD] = "FWD", + [BTF_KIND_TYPEDEF] = "TYPEDEF", + [BTF_KIND_VOLATILE] = "VOLATILE", + [BTF_KIND_CONST] = "CONST", + [BTF_KIND_RESTRICT] = "RESTRICT", + [BTF_KIND_FUNC] = "FUNC", + [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO", + [BTF_KIND_VAR] = "VAR", + [BTF_KIND_DATASEC] = "DATASEC", +}; + +static const char *btf_kind_str(__u16 kind) +{ + if (kind > BTF_KIND_DATASEC) + return "UNKNOWN"; + return btf_kind_str_mapping[kind]; +} + +static const char *btf_int_enc_str(__u8 encoding) +{ + switch (encoding) { + case 0: + return "(none)"; + case BTF_INT_SIGNED: + return "SIGNED"; + case BTF_INT_CHAR: + return "CHAR"; + case BTF_INT_BOOL: + return "BOOL"; + default: + return "UNKN"; + } +} + +static const char *btf_var_linkage_str(__u32 linkage) +{ + switch (linkage) { + case BTF_VAR_STATIC: + return "static"; + case BTF_VAR_GLOBAL_ALLOCATED: + return "global-alloc"; + default: + return "(unknown)"; + } +} + +static const char *btf_func_linkage_str(const struct btf_type *t) +{ + switch (btf_vlen(t)) { + case BTF_FUNC_STATIC: + return "static"; + case BTF_FUNC_GLOBAL: + return "global"; + case BTF_FUNC_EXTERN: + return "extern"; + default: + return "(unknown)"; + } +} + +static const char *btf_str(const struct btf *btf, __u32 off) +{ + if (!off) + return "(anon)"; + return btf__str_by_offset(btf, off) ?: "(invalid)"; +} + +int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id) +{ + const struct btf_type *t; + int kind, i; + __u32 vlen; + + t = btf__type_by_id(btf, id); + if (!t) + return -EINVAL; + + vlen = btf_vlen(t); + kind = btf_kind(t); + + fprintf(out, "[%u] %s '%s'", id, btf_kind_str(kind), btf_str(btf, t->name_off)); + + switch (kind) { + case BTF_KIND_INT: + fprintf(out, " size=%u bits_offset=%u nr_bits=%u encoding=%s", + t->size, btf_int_offset(t), btf_int_bits(t), + btf_int_enc_str(btf_int_encoding(t))); + break; + case BTF_KIND_PTR: + case BTF_KIND_CONST: + case BTF_KIND_VOLATILE: + case BTF_KIND_RESTRICT: + case BTF_KIND_TYPEDEF: + fprintf(out, " type_id=%u", t->type); + break; + case BTF_KIND_ARRAY: { + const struct btf_array *arr = btf_array(t); + + fprintf(out, " type_id=%u index_type_id=%u nr_elems=%u", + arr->type, arr->index_type, arr->nelems); + break; + } + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: { + const struct btf_member *m = btf_members(t); + + fprintf(out, " size=%u vlen=%u", t->size, vlen); + for (i = 0; i < vlen; i++, m++) { + __u32 bit_off, bit_sz; + + bit_off = btf_member_bit_offset(t, i); + bit_sz = btf_member_bitfield_size(t, i); + fprintf(out, "\n\t'%s' type_id=%u bits_offset=%u", + btf_str(btf, m->name_off), m->type, bit_off); + if (bit_sz) + fprintf(out, " bitfield_size=%u", bit_sz); + } + break; + } + case BTF_KIND_ENUM: { + const struct btf_enum *v = btf_enum(t); + + fprintf(out, " size=%u vlen=%u", t->size, vlen); + for (i = 0; i < vlen; i++, v++) { + fprintf(out, "\n\t'%s' val=%u", + btf_str(btf, v->name_off), v->val); + } + break; + } + case BTF_KIND_FWD: + fprintf(out, " fwd_kind=%s", btf_kflag(t) ? "union" : "struct"); + break; + case BTF_KIND_FUNC: + fprintf(out, " type_id=%u linkage=%s", t->type, btf_func_linkage_str(t)); + break; + case BTF_KIND_FUNC_PROTO: { + const struct btf_param *p = btf_params(t); + + fprintf(out, " ret_type_id=%u vlen=%u", t->type, vlen); + for (i = 0; i < vlen; i++, p++) { + fprintf(out, "\n\t'%s' type_id=%u", + btf_str(btf, p->name_off), p->type); + } + break; + } + case BTF_KIND_VAR: + fprintf(out, " type_id=%u, linkage=%s", + t->type, btf_var_linkage_str(btf_var(t)->linkage)); + break; + case BTF_KIND_DATASEC: { + const struct btf_var_secinfo *v = btf_var_secinfos(t); + + fprintf(out, " size=%u vlen=%u", t->size, vlen); + for (i = 0; i < vlen; i++, v++) { + fprintf(out, "\n\ttype_id=%u offset=%u size=%u", + v->type, v->offset, v->size); + } + break; + } + default: + break; + } + + return 0; +} + +/* Print raw BTF type dump into a local buffer and return string pointer back. + * Buffer *will* be overwritten by subsequent btf_type_raw_dump() calls + */ +const char *btf_type_raw_dump(const struct btf *btf, int type_id) +{ + static char buf[16 * 1024]; + FILE *buf_file; + + buf_file = fmemopen(buf, sizeof(buf) - 1, "w"); + if (!buf_file) { + fprintf(stderr, "Failed to open memstream: %d\n", errno); + return NULL; + } + + fprintf_btf_type_raw(buf_file, btf, type_id); + fflush(buf_file); + fclose(buf_file); + + return buf; +} diff --git a/tools/testing/selftests/bpf/btf_helpers.h b/tools/testing/selftests/bpf/btf_helpers.h new file mode 100644 index 000000000000..2c9ce1b61dc9 --- /dev/null +++ b/tools/testing/selftests/bpf/btf_helpers.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2020 Facebook */ +#ifndef __BTF_HELPERS_H +#define __BTF_HELPERS_H + +#include <stdio.h> +#include <bpf/btf.h> + +int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id); +const char *btf_type_raw_dump(const struct btf *btf, int type_id); + +#endif diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c index 314e1e7c36df..f36da15b134f 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_write.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c @@ -2,6 +2,7 @@ /* Copyright (c) 2020 Facebook */ #include <test_progs.h> #include <bpf/btf.h> +#include "btf_helpers.h"
static int duration = 0;
@@ -39,6 +40,8 @@ void test_btf_write() { ASSERT_EQ(t->size, 4, "int_sz"); ASSERT_EQ(btf_int_encoding(t), BTF_INT_SIGNED, "int_enc"); ASSERT_EQ(btf_int_bits(t), 32, "int_bits"); + ASSERT_STREQ(btf_type_raw_dump(btf, 1), + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", "raw_dump");
/* invalid int size */ id = btf__add_int(btf, "bad sz int", 7, 0); @@ -59,24 +62,32 @@ void test_btf_write() { t = btf__type_by_id(btf, 2); ASSERT_EQ(btf_kind(t), BTF_KIND_PTR, "ptr_kind"); ASSERT_EQ(t->type, 1, "ptr_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 2), + "[2] PTR '(anon)' type_id=1", "raw_dump");
id = btf__add_const(btf, 5); /* points forward to restrict */ ASSERT_EQ(id, 3, "const_id"); t = btf__type_by_id(btf, 3); ASSERT_EQ(btf_kind(t), BTF_KIND_CONST, "const_kind"); ASSERT_EQ(t->type, 5, "const_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 3), + "[3] CONST '(anon)' type_id=5", "raw_dump");
id = btf__add_volatile(btf, 3); ASSERT_EQ(id, 4, "volatile_id"); t = btf__type_by_id(btf, 4); ASSERT_EQ(btf_kind(t), BTF_KIND_VOLATILE, "volatile_kind"); ASSERT_EQ(t->type, 3, "volatile_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 4), + "[4] VOLATILE '(anon)' type_id=3", "raw_dump");
id = btf__add_restrict(btf, 4); ASSERT_EQ(id, 5, "restrict_id"); t = btf__type_by_id(btf, 5); ASSERT_EQ(btf_kind(t), BTF_KIND_RESTRICT, "restrict_kind"); ASSERT_EQ(t->type, 4, "restrict_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 5), + "[5] RESTRICT '(anon)' type_id=4", "raw_dump");
/* ARRAY */ id = btf__add_array(btf, 1, 2, 10); /* int *[10] */ @@ -86,6 +97,8 @@ void test_btf_write() { ASSERT_EQ(btf_array(t)->index_type, 1, "array_index_type"); ASSERT_EQ(btf_array(t)->type, 2, "array_elem_type"); ASSERT_EQ(btf_array(t)->nelems, 10, "array_nelems"); + ASSERT_STREQ(btf_type_raw_dump(btf, 6), + "[6] ARRAY '(anon)' type_id=2 index_type_id=1 nr_elems=10", "raw_dump");
/* STRUCT */ err = btf__add_field(btf, "field", 1, 0, 0); @@ -113,6 +126,10 @@ void test_btf_write() { ASSERT_EQ(m->type, 1, "f2_type"); ASSERT_EQ(btf_member_bit_offset(t, 1), 32, "f2_bit_off"); ASSERT_EQ(btf_member_bitfield_size(t, 1), 16, "f2_bit_sz"); + ASSERT_STREQ(btf_type_raw_dump(btf, 7), + "[7] STRUCT 's1' size=8 vlen=2\n" + "\t'f1' type_id=1 bits_offset=0\n" + "\t'f2' type_id=1 bits_offset=32 bitfield_size=16", "raw_dump");
/* UNION */ id = btf__add_union(btf, "u1", 8); @@ -136,6 +153,9 @@ void test_btf_write() { ASSERT_EQ(m->type, 1, "f1_type"); ASSERT_EQ(btf_member_bit_offset(t, 0), 0, "f1_bit_off"); ASSERT_EQ(btf_member_bitfield_size(t, 0), 16, "f1_bit_sz"); + ASSERT_STREQ(btf_type_raw_dump(btf, 8), + "[8] UNION 'u1' size=8 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0 bitfield_size=16", "raw_dump");
/* ENUM */ id = btf__add_enum(btf, "e1", 4); @@ -156,6 +176,10 @@ void test_btf_write() { v = btf_enum(t) + 1; ASSERT_STREQ(btf__str_by_offset(btf, v->name_off), "v2", "v2_name"); ASSERT_EQ(v->val, 2, "v2_val"); + ASSERT_STREQ(btf_type_raw_dump(btf, 9), + "[9] ENUM 'e1' size=4 vlen=2\n" + "\t'v1' val=1\n" + "\t'v2' val=2", "raw_dump");
/* FWDs */ id = btf__add_fwd(btf, "struct_fwd", BTF_FWD_STRUCT); @@ -164,6 +188,8 @@ void test_btf_write() { ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "struct_fwd", "fwd_name"); ASSERT_EQ(btf_kind(t), BTF_KIND_FWD, "fwd_kind"); ASSERT_EQ(btf_kflag(t), 0, "fwd_kflag"); + ASSERT_STREQ(btf_type_raw_dump(btf, 10), + "[10] FWD 'struct_fwd' fwd_kind=struct", "raw_dump");
id = btf__add_fwd(btf, "union_fwd", BTF_FWD_UNION); ASSERT_EQ(id, 11, "union_fwd_id"); @@ -171,6 +197,8 @@ void test_btf_write() { ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "union_fwd", "fwd_name"); ASSERT_EQ(btf_kind(t), BTF_KIND_FWD, "fwd_kind"); ASSERT_EQ(btf_kflag(t), 1, "fwd_kflag"); + ASSERT_STREQ(btf_type_raw_dump(btf, 11), + "[11] FWD 'union_fwd' fwd_kind=union", "raw_dump");
id = btf__add_fwd(btf, "enum_fwd", BTF_FWD_ENUM); ASSERT_EQ(id, 12, "enum_fwd_id"); @@ -179,6 +207,8 @@ void test_btf_write() { ASSERT_EQ(btf_kind(t), BTF_KIND_ENUM, "enum_fwd_kind"); ASSERT_EQ(btf_vlen(t), 0, "enum_fwd_kind"); ASSERT_EQ(t->size, 4, "enum_fwd_sz"); + ASSERT_STREQ(btf_type_raw_dump(btf, 12), + "[12] ENUM 'enum_fwd' size=4 vlen=0", "raw_dump");
/* TYPEDEF */ id = btf__add_typedef(btf, "typedef1", 1); @@ -187,6 +217,8 @@ void test_btf_write() { ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "typedef1", "typedef_name"); ASSERT_EQ(btf_kind(t), BTF_KIND_TYPEDEF, "typedef_kind"); ASSERT_EQ(t->type, 1, "typedef_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 13), + "[13] TYPEDEF 'typedef1' type_id=1", "raw_dump");
/* FUNC & FUNC_PROTO */ id = btf__add_func(btf, "func1", BTF_FUNC_GLOBAL, 15); @@ -196,6 +228,8 @@ void test_btf_write() { ASSERT_EQ(t->type, 15, "func_type"); ASSERT_EQ(btf_kind(t), BTF_KIND_FUNC, "func_kind"); ASSERT_EQ(btf_vlen(t), BTF_FUNC_GLOBAL, "func_vlen"); + ASSERT_STREQ(btf_type_raw_dump(btf, 14), + "[14] FUNC 'func1' type_id=15 linkage=global", "raw_dump");
id = btf__add_func_proto(btf, 1); ASSERT_EQ(id, 15, "func_proto_id"); @@ -214,6 +248,10 @@ void test_btf_write() { p = btf_params(t) + 1; ASSERT_STREQ(btf__str_by_offset(btf, p->name_off), "p2", "p2_name"); ASSERT_EQ(p->type, 2, "p2_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 15), + "[15] FUNC_PROTO '(anon)' ret_type_id=1 vlen=2\n" + "\t'p1' type_id=1\n" + "\t'p2' type_id=2", "raw_dump");
/* VAR */ id = btf__add_var(btf, "var1", BTF_VAR_GLOBAL_ALLOCATED, 1); @@ -223,6 +261,8 @@ void test_btf_write() { ASSERT_EQ(btf_kind(t), BTF_KIND_VAR, "var_kind"); ASSERT_EQ(t->type, 1, "var_type"); ASSERT_EQ(btf_var(t)->linkage, BTF_VAR_GLOBAL_ALLOCATED, "var_type"); + ASSERT_STREQ(btf_type_raw_dump(btf, 16), + "[16] VAR 'var1' type_id=1, linkage=global-alloc", "raw_dump");
/* DATASECT */ id = btf__add_datasec(btf, "datasec1", 12); @@ -239,6 +279,9 @@ void test_btf_write() { ASSERT_EQ(vi->type, 1, "v1_type"); ASSERT_EQ(vi->offset, 4, "v1_off"); ASSERT_EQ(vi->size, 8, "v1_sz"); + ASSERT_STREQ(btf_type_raw_dump(btf, 17), + "[17] DATASEC 'datasec1' size=12 vlen=1\n" + "\ttype_id=1 offset=4 size=8", "raw_dump");
btf__free(btf); }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit f86524efcf9e3f3a7cf75ebcd82cf8f58ec716cc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add support for deduplication split BTFs. When deduplicating split BTF, base BTF is considered to be immutable and can't be modified or adjusted. 99% of BTF deduplication logic is left intact (module some type numbering adjustments). There are only two differences.
First, each type in base BTF gets hashed (expect VAR and DATASEC, of course, those are always considered to be self-canonical instances) and added into a table of canonical table candidates. Hashing is a shallow, fast operation, so mostly eliminates the overhead of having entire base BTF to be a part of BTF dedup.
Second difference is very critical and subtle. While deduplicating split BTF types, it is possible to discover that one of immutable base BTF BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type from the split BTF part. This is, obviously, can't happen because we can't modify the base BTF types anymore. So because of that, any type in split BTF that directly or indirectly references that newly-to-be-resolved FWD type can't be considered to be equivalent to the corresponding canonical types in base BTF, because that would result in a loss of type resolution information. So in such case, split BTF types will be deduplicated separately and will cause some duplication of type information, which is unavoidable.
With those two changes, the rest of the algorithm manages to deduplicate split BTF correctly, pointing all the duplicates to their canonical counter-parts in base BTF, but also is deduplicating whatever unique types are present in split BTF on their own.
Also, theoretically, split BTF after deduplication could end up with either empty type section or empty string section. This is handled by libbpf correctly in one of previous patches in the series.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-9-andrii@kernel.org (cherry picked from commit f86524efcf9e3f3a7cf75ebcd82cf8f58ec716cc) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 221 +++++++++++++++++++++++++++++++++----------- 1 file changed, 168 insertions(+), 53 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 176b73ef699c..a29b99e030e1 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -2719,6 +2719,7 @@ struct btf_dedup; static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext, const struct btf_dedup_opts *opts); static void btf_dedup_free(struct btf_dedup *d); +static int btf_dedup_prep(struct btf_dedup *d); static int btf_dedup_strings(struct btf_dedup *d); static int btf_dedup_prim_types(struct btf_dedup *d); static int btf_dedup_struct_types(struct btf_dedup *d); @@ -2877,6 +2878,11 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext, if (btf_ensure_modifiable(btf)) return -ENOMEM;
+ err = btf_dedup_prep(d); + if (err) { + pr_debug("btf_dedup_prep failed:%d\n", err); + goto done; + } err = btf_dedup_strings(d); if (err < 0) { pr_debug("btf_dedup_strings failed:%d\n", err); @@ -2939,6 +2945,13 @@ struct btf_dedup { __u32 *hypot_list; size_t hypot_cnt; size_t hypot_cap; + /* Whether hypothetical mapping, if successful, would need to adjust + * already canonicalized types (due to a new forward declaration to + * concrete type resolution). In such case, during split BTF dedup + * candidate type would still be considered as different, because base + * BTF is considered to be immutable. + */ + bool hypot_adjust_canon; /* Various option modifying behavior of algorithm */ struct btf_dedup_opts opts; /* temporary strings deduplication state */ @@ -2986,6 +2999,7 @@ static void btf_dedup_clear_hypot_map(struct btf_dedup *d) for (i = 0; i < d->hypot_cnt; i++) d->hypot_map[d->hypot_list[i]] = BTF_UNPROCESSED_ID; d->hypot_cnt = 0; + d->hypot_adjust_canon = false; }
static void btf_dedup_free(struct btf_dedup *d) @@ -3025,7 +3039,7 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext, { struct btf_dedup *d = calloc(1, sizeof(struct btf_dedup)); hashmap_hash_fn hash_fn = btf_dedup_identity_hash_fn; - int i, err = 0; + int i, err = 0, type_cnt;
if (!d) return ERR_PTR(-ENOMEM); @@ -3045,14 +3059,15 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext, goto done; }
- d->map = malloc(sizeof(__u32) * (1 + btf->nr_types)); + type_cnt = btf__get_nr_types(btf) + 1; + d->map = malloc(sizeof(__u32) * type_cnt); if (!d->map) { err = -ENOMEM; goto done; } /* special BTF "void" type is made canonical immediately */ d->map[0] = 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i < type_cnt; i++) { struct btf_type *t = btf_type_by_id(d->btf, i);
/* VAR and DATASEC are never deduped and are self-canonical */ @@ -3062,12 +3077,12 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext, d->map[i] = BTF_UNPROCESSED_ID; }
- d->hypot_map = malloc(sizeof(__u32) * (1 + btf->nr_types)); + d->hypot_map = malloc(sizeof(__u32) * type_cnt); if (!d->hypot_map) { err = -ENOMEM; goto done; } - for (i = 0; i <= btf->nr_types; i++) + for (i = 0; i < type_cnt; i++) d->hypot_map[i] = BTF_UNPROCESSED_ID;
done: @@ -3091,8 +3106,8 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx) int i, j, r, rec_size; struct btf_type *t;
- for (i = 1; i <= d->btf->nr_types; i++) { - t = btf_type_by_id(d->btf, i); + for (i = 0; i < d->btf->nr_types; i++) { + t = btf_type_by_id(d->btf, d->btf->start_id + i); r = fn(&t->name_off, ctx); if (r) return r; @@ -3175,15 +3190,27 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx) static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) { struct btf_dedup *d = ctx; + __u32 str_off = *str_off_ptr; long old_off, new_off, len; const char *s; void *p; int err;
- if (*str_off_ptr == 0) + /* don't touch empty string or string in main BTF */ + if (str_off == 0 || str_off < d->btf->start_str_off) return 0;
- s = btf__str_by_offset(d->btf, *str_off_ptr); + s = btf__str_by_offset(d->btf, str_off); + if (d->btf->base_btf) { + err = btf__find_str(d->btf->base_btf, s); + if (err >= 0) { + *str_off_ptr = err; + return 0; + } + if (err != -ENOENT) + return err; + } + len = strlen(s) + 1;
new_off = d->strs_len; @@ -3200,11 +3227,11 @@ static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off, HASHMAP_ADD, (const void **)&old_off, NULL); if (err == -EEXIST) { - *str_off_ptr = old_off; + *str_off_ptr = d->btf->start_str_off + old_off; } else if (err) { return err; } else { - *str_off_ptr = new_off; + *str_off_ptr = d->btf->start_str_off + new_off; d->strs_len += len; } return 0; @@ -3229,13 +3256,6 @@ static int btf_dedup_strings(struct btf_dedup *d) if (d->btf->strs_deduped) return 0;
- s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1); - if (!s) - return -ENOMEM; - /* initial empty string */ - s[0] = 0; - d->strs_len = 1; - /* temporarily switch to use btf_dedup's strs_data for strings for hash * functions; later we'll just transfer hashmap to struct btf as is, * along the strs_data @@ -3249,13 +3269,22 @@ static int btf_dedup_strings(struct btf_dedup *d) goto err_out; }
- /* insert empty string; we won't be looking it up during strings - * dedup, but it's good to have it for generic BTF string lookups - */ - err = hashmap__insert(d->strs_hash, (void *)0, (void *)0, - HASHMAP_ADD, NULL, NULL); - if (err) - goto err_out; + if (!d->btf->base_btf) { + s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1); + if (!s) + return -ENOMEM; + /* initial empty string */ + s[0] = 0; + d->strs_len = 1; + + /* insert empty string; we won't be looking it up during strings + * dedup, but it's good to have it for generic BTF string lookups + */ + err = hashmap__insert(d->strs_hash, (void *)0, (void *)0, + HASHMAP_ADD, NULL, NULL); + if (err) + goto err_out; + }
/* remap string offsets */ err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d); @@ -3550,6 +3579,66 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2) return true; }
+/* Prepare split BTF for deduplication by calculating hashes of base BTF's + * types and initializing the rest of the state (canonical type mapping) for + * the fixed base BTF part. + */ +static int btf_dedup_prep(struct btf_dedup *d) +{ + struct btf_type *t; + int type_id; + long h; + + if (!d->btf->base_btf) + return 0; + + for (type_id = 1; type_id < d->btf->start_id; type_id++) { + t = btf_type_by_id(d->btf, type_id); + + /* all base BTF types are self-canonical by definition */ + d->map[type_id] = type_id; + + switch (btf_kind(t)) { + case BTF_KIND_VAR: + case BTF_KIND_DATASEC: + /* VAR and DATASEC are never hash/deduplicated */ + continue; + case BTF_KIND_CONST: + case BTF_KIND_VOLATILE: + case BTF_KIND_RESTRICT: + case BTF_KIND_PTR: + case BTF_KIND_FWD: + case BTF_KIND_TYPEDEF: + case BTF_KIND_FUNC: + h = btf_hash_common(t); + break; + case BTF_KIND_INT: + h = btf_hash_int(t); + break; + case BTF_KIND_ENUM: + h = btf_hash_enum(t); + break; + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + h = btf_hash_struct(t); + break; + case BTF_KIND_ARRAY: + h = btf_hash_array(t); + break; + case BTF_KIND_FUNC_PROTO: + h = btf_hash_fnproto(t); + break; + default: + pr_debug("unknown kind %d for type [%d]\n", btf_kind(t), type_id); + return -EINVAL; + } + if (btf_dedup_table_add(d, h, type_id)) + return -ENOMEM; + } + + return 0; +} + /* * Deduplicate primitive types, that can't reference other types, by calculating * their type signature hash and comparing them with any possible canonical @@ -3643,8 +3732,8 @@ static int btf_dedup_prim_types(struct btf_dedup *d) { int i, err;
- for (i = 1; i <= d->btf->nr_types; i++) { - err = btf_dedup_prim_type(d, i); + for (i = 0; i < d->btf->nr_types; i++) { + err = btf_dedup_prim_type(d, d->btf->start_id + i); if (err) return err; } @@ -3834,6 +3923,9 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, } else { real_kind = cand_kind; fwd_kind = btf_fwd_kind(canon_type); + /* we'd need to resolve base FWD to STRUCT/UNION */ + if (fwd_kind == real_kind && canon_id < d->btf->start_id) + d->hypot_adjust_canon = true; } return fwd_kind == real_kind; } @@ -3871,8 +3963,7 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, return 0; cand_arr = btf_array(cand_type); canon_arr = btf_array(canon_type); - eq = btf_dedup_is_equiv(d, - cand_arr->index_type, canon_arr->index_type); + eq = btf_dedup_is_equiv(d, cand_arr->index_type, canon_arr->index_type); if (eq <= 0) return eq; return btf_dedup_is_equiv(d, cand_arr->type, canon_arr->type); @@ -3955,16 +4046,16 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, */ static void btf_dedup_merge_hypot_map(struct btf_dedup *d) { - __u32 cand_type_id, targ_type_id; + __u32 canon_type_id, targ_type_id; __u16 t_kind, c_kind; __u32 t_id, c_id; int i;
for (i = 0; i < d->hypot_cnt; i++) { - cand_type_id = d->hypot_list[i]; - targ_type_id = d->hypot_map[cand_type_id]; + canon_type_id = d->hypot_list[i]; + targ_type_id = d->hypot_map[canon_type_id]; t_id = resolve_type_id(d, targ_type_id); - c_id = resolve_type_id(d, cand_type_id); + c_id = resolve_type_id(d, canon_type_id); t_kind = btf_kind(btf__type_by_id(d->btf, t_id)); c_kind = btf_kind(btf__type_by_id(d->btf, c_id)); /* @@ -3979,9 +4070,26 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d) * stability is not a requirement for STRUCT/UNION equivalence * checks, though. */ + + /* if it's the split BTF case, we still need to point base FWD + * to STRUCT/UNION in a split BTF, because FWDs from split BTF + * will be resolved against base FWD. If we don't point base + * canonical FWD to the resolved STRUCT/UNION, then all the + * FWDs in split BTF won't be correctly resolved to a proper + * STRUCT/UNION. + */ if (t_kind != BTF_KIND_FWD && c_kind == BTF_KIND_FWD) d->map[c_id] = t_id; - else if (t_kind == BTF_KIND_FWD && c_kind != BTF_KIND_FWD) + + /* if graph equivalence determined that we'd need to adjust + * base canonical types, then we need to only point base FWDs + * to STRUCTs/UNIONs and do no more modifications. For all + * other purposes the type graphs were not equivalent. + */ + if (d->hypot_adjust_canon) + continue; + + if (t_kind == BTF_KIND_FWD && c_kind != BTF_KIND_FWD) d->map[t_id] = c_id;
if ((t_kind == BTF_KIND_STRUCT || t_kind == BTF_KIND_UNION) && @@ -4065,8 +4173,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id) return eq; if (!eq) continue; - new_id = cand_id; btf_dedup_merge_hypot_map(d); + if (d->hypot_adjust_canon) /* not really equivalent */ + continue; + new_id = cand_id; break; }
@@ -4081,8 +4191,8 @@ static int btf_dedup_struct_types(struct btf_dedup *d) { int i, err;
- for (i = 1; i <= d->btf->nr_types; i++) { - err = btf_dedup_struct_type(d, i); + for (i = 0; i < d->btf->nr_types; i++) { + err = btf_dedup_struct_type(d, d->btf->start_id + i); if (err) return err; } @@ -4225,8 +4335,8 @@ static int btf_dedup_ref_types(struct btf_dedup *d) { int i, err;
- for (i = 1; i <= d->btf->nr_types; i++) { - err = btf_dedup_ref_type(d, i); + for (i = 0; i < d->btf->nr_types; i++) { + err = btf_dedup_ref_type(d, d->btf->start_id + i); if (err < 0) return err; } @@ -4250,39 +4360,44 @@ static int btf_dedup_ref_types(struct btf_dedup *d) static int btf_dedup_compact_types(struct btf_dedup *d) { __u32 *new_offs; - __u32 next_type_id = 1; + __u32 next_type_id = d->btf->start_id; + const struct btf_type *t; void *p; - int i, len; + int i, id, len;
/* we are going to reuse hypot_map to store compaction remapping */ d->hypot_map[0] = 0; - for (i = 1; i <= d->btf->nr_types; i++) - d->hypot_map[i] = BTF_UNPROCESSED_ID; + /* base BTF types are not renumbered */ + for (id = 1; id < d->btf->start_id; id++) + d->hypot_map[id] = id; + for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++) + d->hypot_map[id] = BTF_UNPROCESSED_ID;
p = d->btf->types_data;
- for (i = 1; i <= d->btf->nr_types; i++) { - if (d->map[i] != i) + for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++) { + if (d->map[id] != id) continue;
- len = btf_type_size(btf__type_by_id(d->btf, i)); + t = btf__type_by_id(d->btf, id); + len = btf_type_size(t); if (len < 0) return len;
- memmove(p, btf__type_by_id(d->btf, i), len); - d->hypot_map[i] = next_type_id; - d->btf->type_offs[next_type_id - 1] = p - d->btf->types_data; + memmove(p, t, len); + d->hypot_map[id] = next_type_id; + d->btf->type_offs[next_type_id - d->btf->start_id] = p - d->btf->types_data; p += len; next_type_id++; }
/* shrink struct btf's internal types index and update btf_header */ - d->btf->nr_types = next_type_id - 1; + d->btf->nr_types = next_type_id - d->btf->start_id; d->btf->type_offs_cap = d->btf->nr_types; d->btf->hdr->type_len = p - d->btf->types_data; new_offs = libbpf_reallocarray(d->btf->type_offs, d->btf->type_offs_cap, sizeof(*new_offs)); - if (!new_offs) + if (d->btf->type_offs_cap && !new_offs) return -ENOMEM; d->btf->type_offs = new_offs; d->btf->hdr->str_off = d->btf->hdr->type_len; @@ -4414,8 +4529,8 @@ static int btf_dedup_remap_types(struct btf_dedup *d) { int i, r;
- for (i = 1; i <= d->btf->nr_types; i++) { - r = btf_dedup_remap_type(d, i); + for (i = 0; i < d->btf->nr_types; i++) { + r = btf_dedup_remap_type(d, d->btf->start_id + i); if (r < 0) return r; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 6b6e6b1d09aa20c351a1fce0ea6402da436624a4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In some cases compiler seems to generate distinct DWARF types for identical arrays within the same CU. That seems like a bug, but it's already out there and breaks type graph equivalence checks, so accommodate it anyway by checking for identical arrays, regardless of their type ID.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-10-andrii@kernel.org (cherry picked from commit 6b6e6b1d09aa20c351a1fce0ea6402da436624a4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index a29b99e030e1..900b557ab640 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -3786,6 +3786,19 @@ static inline __u16 btf_fwd_kind(struct btf_type *t) return btf_kflag(t) ? BTF_KIND_UNION : BTF_KIND_STRUCT; }
+/* Check if given two types are identical ARRAY definitions */ +static int btf_dedup_identical_arrays(struct btf_dedup *d, __u32 id1, __u32 id2) +{ + struct btf_type *t1, *t2; + + t1 = btf_type_by_id(d->btf, id1); + t2 = btf_type_by_id(d->btf, id2); + if (!btf_is_array(t1) || !btf_is_array(t2)) + return 0; + + return btf_equal_array(t1, t2); +} + /* * Check equivalence of BTF type graph formed by candidate struct/union (we'll * call it "candidate graph" in this description for brevity) to a type graph @@ -3896,8 +3909,18 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, canon_id = resolve_fwd_id(d, canon_id);
hypot_type_id = d->hypot_map[canon_id]; - if (hypot_type_id <= BTF_MAX_NR_TYPES) - return hypot_type_id == cand_id; + if (hypot_type_id <= BTF_MAX_NR_TYPES) { + /* In some cases compiler will generate different DWARF types + * for *identical* array type definitions and use them for + * different fields within the *same* struct. This breaks type + * equivalence check, which makes an assumption that candidate + * types sub-graph has a consistent and deduped-by-compiler + * types within a single CU. So work around that by explicitly + * allowing identical array types here. + */ + return hypot_type_id == cand_id || + btf_dedup_identical_arrays(d, hypot_type_id, cand_id); + }
if (btf_dedup_hypot_map_add(d, canon_id, cand_id)) return -ENOMEM;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 232338fa2fb47726ab7c459419115a6ab6bfb3e3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add selftests validating BTF deduplication for split BTF case. Add a helper macro that allows to validate entire BTF with raw BTF dump, not just type-by-type. This saves tons of code and complexity.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-11-andrii@kernel.org (cherry picked from commit 232338fa2fb47726ab7c459419115a6ab6bfb3e3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/btf_helpers.c | 59 ++++ tools/testing/selftests/bpf/btf_helpers.h | 7 + .../bpf/prog_tests/btf_dedup_split.c | 325 ++++++++++++++++++ 3 files changed, 391 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
diff --git a/tools/testing/selftests/bpf/btf_helpers.c b/tools/testing/selftests/bpf/btf_helpers.c index abc3f6c04cfc..48f90490f922 100644 --- a/tools/testing/selftests/bpf/btf_helpers.c +++ b/tools/testing/selftests/bpf/btf_helpers.c @@ -3,6 +3,8 @@ #include <stdio.h> #include <errno.h> #include <bpf/btf.h> +#include <bpf/libbpf.h> +#include "test_progs.h"
static const char * const btf_kind_str_mapping[] = { [BTF_KIND_UNKN] = "UNKNOWN", @@ -198,3 +200,60 @@ const char *btf_type_raw_dump(const struct btf *btf, int type_id)
return buf; } + +int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[]) +{ + int i; + bool ok = true; + + ASSERT_EQ(btf__get_nr_types(btf), nr_types, "btf_nr_types"); + + for (i = 1; i <= nr_types; i++) { + if (!ASSERT_STREQ(btf_type_raw_dump(btf, i), exp_types[i - 1], "raw_dump")) + ok = false; + } + + return ok; +} + +static void btf_dump_printf(void *ctx, const char *fmt, va_list args) +{ + vfprintf(ctx, fmt, args); +} + +/* Print BTF-to-C dump into a local buffer and return string pointer back. + * Buffer *will* be overwritten by subsequent btf_type_raw_dump() calls + */ +const char *btf_type_c_dump(const struct btf *btf) +{ + static char buf[16 * 1024]; + FILE *buf_file; + struct btf_dump *d = NULL; + struct btf_dump_opts opts = {}; + int err, i; + + buf_file = fmemopen(buf, sizeof(buf) - 1, "w"); + if (!buf_file) { + fprintf(stderr, "Failed to open memstream: %d\n", errno); + return NULL; + } + + opts.ctx = buf_file; + d = btf_dump__new(btf, NULL, &opts, btf_dump_printf); + if (libbpf_get_error(d)) { + fprintf(stderr, "Failed to create btf_dump instance: %ld\n", libbpf_get_error(d)); + return NULL; + } + + for (i = 1; i <= btf__get_nr_types(btf); i++) { + err = btf_dump__dump_type(d, i); + if (err) { + fprintf(stderr, "Failed to dump type [%d]: %d\n", i, err); + return NULL; + } + } + + fflush(buf_file); + fclose(buf_file); + return buf; +} diff --git a/tools/testing/selftests/bpf/btf_helpers.h b/tools/testing/selftests/bpf/btf_helpers.h index 2c9ce1b61dc9..295c0137d9bd 100644 --- a/tools/testing/selftests/bpf/btf_helpers.h +++ b/tools/testing/selftests/bpf/btf_helpers.h @@ -8,5 +8,12 @@
int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id); const char *btf_type_raw_dump(const struct btf *btf, int type_id); +int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[]);
+#define VALIDATE_RAW_BTF(btf, raw_types...) \ + btf_validate_raw(btf, \ + sizeof((const char *[]){raw_types})/sizeof(void *),\ + (const char *[]){raw_types}) + +const char *btf_type_c_dump(const struct btf *btf); #endif diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c new file mode 100644 index 000000000000..64554fd33547 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c @@ -0,0 +1,325 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include <test_progs.h> +#include <bpf/btf.h> +#include "btf_helpers.h" + +static void test_split_simple() { + const struct btf_type *t; + struct btf *btf1, *btf2; + int str_off, err; + + btf1 = btf__new_empty(); + if (!ASSERT_OK_PTR(btf1, "empty_main_btf")) + return; + + btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */ + + btf__add_int(btf1, "int", 4, BTF_INT_SIGNED); /* [1] int */ + btf__add_ptr(btf1, 1); /* [2] ptr to int */ + btf__add_struct(btf1, "s1", 4); /* [3] struct s1 { */ + btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ + /* } */ + + VALIDATE_RAW_BTF( + btf1, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=1", + "[3] STRUCT 's1' size=4 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0"); + + ASSERT_STREQ(btf_type_c_dump(btf1), "\ +struct s1 {\n\ + int f1;\n\ +};\n\n", "c_dump"); + + btf2 = btf__new_empty_split(btf1); + if (!ASSERT_OK_PTR(btf2, "empty_split_btf")) + goto cleanup; + + /* pointer size should be "inherited" from main BTF */ + ASSERT_EQ(btf__pointer_size(btf2), 8, "inherit_ptr_sz"); + + str_off = btf__find_str(btf2, "int"); + ASSERT_NEQ(str_off, -ENOENT, "str_int_missing"); + + t = btf__type_by_id(btf2, 1); + if (!ASSERT_OK_PTR(t, "int_type")) + goto cleanup; + ASSERT_EQ(btf_is_int(t), true, "int_kind"); + ASSERT_STREQ(btf__str_by_offset(btf2, t->name_off), "int", "int_name"); + + btf__add_struct(btf2, "s2", 16); /* [4] struct s2 { */ + btf__add_field(btf2, "f1", 6, 0, 0); /* struct s1 f1; */ + btf__add_field(btf2, "f2", 5, 32, 0); /* int f2; */ + btf__add_field(btf2, "f3", 2, 64, 0); /* int *f3; */ + /* } */ + + /* duplicated int */ + btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [5] int */ + + /* duplicated struct s1 */ + btf__add_struct(btf2, "s1", 4); /* [6] struct s1 { */ + btf__add_field(btf2, "f1", 5, 0, 0); /* int f1; */ + /* } */ + + VALIDATE_RAW_BTF( + btf2, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=1", + "[3] STRUCT 's1' size=4 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0", + "[4] STRUCT 's2' size=16 vlen=3\n" + "\t'f1' type_id=6 bits_offset=0\n" + "\t'f2' type_id=5 bits_offset=32\n" + "\t'f3' type_id=2 bits_offset=64", + "[5] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[6] STRUCT 's1' size=4 vlen=1\n" + "\t'f1' type_id=5 bits_offset=0"); + + ASSERT_STREQ(btf_type_c_dump(btf2), "\ +struct s1 {\n\ + int f1;\n\ +};\n\ +\n\ +struct s1___2 {\n\ + int f1;\n\ +};\n\ +\n\ +struct s2 {\n\ + struct s1___2 f1;\n\ + int f2;\n\ + int *f3;\n\ +};\n\n", "c_dump"); + + err = btf__dedup(btf2, NULL, NULL); + if (!ASSERT_OK(err, "btf_dedup")) + goto cleanup; + + VALIDATE_RAW_BTF( + btf2, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=1", + "[3] STRUCT 's1' size=4 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0", + "[4] STRUCT 's2' size=16 vlen=3\n" + "\t'f1' type_id=3 bits_offset=0\n" + "\t'f2' type_id=1 bits_offset=32\n" + "\t'f3' type_id=2 bits_offset=64"); + + ASSERT_STREQ(btf_type_c_dump(btf2), "\ +struct s1 {\n\ + int f1;\n\ +};\n\ +\n\ +struct s2 {\n\ + struct s1 f1;\n\ + int f2;\n\ + int *f3;\n\ +};\n\n", "c_dump"); + +cleanup: + btf__free(btf2); + btf__free(btf1); +} + +static void test_split_fwd_resolve() { + struct btf *btf1, *btf2; + int err; + + btf1 = btf__new_empty(); + if (!ASSERT_OK_PTR(btf1, "empty_main_btf")) + return; + + btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */ + + btf__add_int(btf1, "int", 4, BTF_INT_SIGNED); /* [1] int */ + btf__add_ptr(btf1, 4); /* [2] ptr to struct s1 */ + btf__add_ptr(btf1, 5); /* [3] ptr to struct s2 */ + btf__add_struct(btf1, "s1", 16); /* [4] struct s1 { */ + btf__add_field(btf1, "f1", 2, 0, 0); /* struct s1 *f1; */ + btf__add_field(btf1, "f2", 3, 64, 0); /* struct s2 *f2; */ + /* } */ + btf__add_struct(btf1, "s2", 4); /* [5] struct s2 { */ + btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ + /* } */ + + VALIDATE_RAW_BTF( + btf1, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=4", + "[3] PTR '(anon)' type_id=5", + "[4] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=2 bits_offset=0\n" + "\t'f2' type_id=3 bits_offset=64", + "[5] STRUCT 's2' size=4 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0"); + + btf2 = btf__new_empty_split(btf1); + if (!ASSERT_OK_PTR(btf2, "empty_split_btf")) + goto cleanup; + + btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [6] int */ + btf__add_ptr(btf2, 10); /* [7] ptr to struct s1 */ + btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT); /* [8] fwd for struct s2 */ + btf__add_ptr(btf2, 8); /* [9] ptr to fwd struct s2 */ + btf__add_struct(btf2, "s1", 16); /* [10] struct s1 { */ + btf__add_field(btf2, "f1", 7, 0, 0); /* struct s1 *f1; */ + btf__add_field(btf2, "f2", 9, 64, 0); /* struct s2 *f2; */ + /* } */ + + VALIDATE_RAW_BTF( + btf2, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=4", + "[3] PTR '(anon)' type_id=5", + "[4] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=2 bits_offset=0\n" + "\t'f2' type_id=3 bits_offset=64", + "[5] STRUCT 's2' size=4 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0", + "[6] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[7] PTR '(anon)' type_id=10", + "[8] FWD 's2' fwd_kind=struct", + "[9] PTR '(anon)' type_id=8", + "[10] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=7 bits_offset=0\n" + "\t'f2' type_id=9 bits_offset=64"); + + err = btf__dedup(btf2, NULL, NULL); + if (!ASSERT_OK(err, "btf_dedup")) + goto cleanup; + + VALIDATE_RAW_BTF( + btf2, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=4", + "[3] PTR '(anon)' type_id=5", + "[4] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=2 bits_offset=0\n" + "\t'f2' type_id=3 bits_offset=64", + "[5] STRUCT 's2' size=4 vlen=1\n" + "\t'f1' type_id=1 bits_offset=0"); + +cleanup: + btf__free(btf2); + btf__free(btf1); +} + +static void test_split_struct_duped() { + struct btf *btf1, *btf2; + int err; + + btf1 = btf__new_empty(); + if (!ASSERT_OK_PTR(btf1, "empty_main_btf")) + return; + + btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */ + + btf__add_int(btf1, "int", 4, BTF_INT_SIGNED); /* [1] int */ + btf__add_ptr(btf1, 5); /* [2] ptr to struct s1 */ + btf__add_fwd(btf1, "s2", BTF_FWD_STRUCT); /* [3] fwd for struct s2 */ + btf__add_ptr(btf1, 3); /* [4] ptr to fwd struct s2 */ + btf__add_struct(btf1, "s1", 16); /* [5] struct s1 { */ + btf__add_field(btf1, "f1", 2, 0, 0); /* struct s1 *f1; */ + btf__add_field(btf1, "f2", 4, 64, 0); /* struct s2 *f2; */ + /* } */ + + VALIDATE_RAW_BTF( + btf1, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=5", + "[3] FWD 's2' fwd_kind=struct", + "[4] PTR '(anon)' type_id=3", + "[5] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=2 bits_offset=0\n" + "\t'f2' type_id=4 bits_offset=64"); + + btf2 = btf__new_empty_split(btf1); + if (!ASSERT_OK_PTR(btf2, "empty_split_btf")) + goto cleanup; + + btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [6] int */ + btf__add_ptr(btf2, 10); /* [7] ptr to struct s1 */ + btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT); /* [8] fwd for struct s2 */ + btf__add_ptr(btf2, 11); /* [9] ptr to struct s2 */ + btf__add_struct(btf2, "s1", 16); /* [10] struct s1 { */ + btf__add_field(btf2, "f1", 7, 0, 0); /* struct s1 *f1; */ + btf__add_field(btf2, "f2", 9, 64, 0); /* struct s2 *f2; */ + /* } */ + btf__add_struct(btf2, "s2", 40); /* [11] struct s2 { */ + btf__add_field(btf2, "f1", 7, 0, 0); /* struct s1 *f1; */ + btf__add_field(btf2, "f2", 9, 64, 0); /* struct s2 *f2; */ + btf__add_field(btf2, "f3", 6, 128, 0); /* int f3; */ + btf__add_field(btf2, "f4", 10, 192, 0); /* struct s1 f4; */ + /* } */ + btf__add_ptr(btf2, 8); /* [12] ptr to fwd struct s2 */ + btf__add_struct(btf2, "s3", 8); /* [13] struct s3 { */ + btf__add_field(btf2, "f1", 12, 0, 0); /* struct s2 *f1; (fwd) */ + /* } */ + + VALIDATE_RAW_BTF( + btf2, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=5", + "[3] FWD 's2' fwd_kind=struct", + "[4] PTR '(anon)' type_id=3", + "[5] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=2 bits_offset=0\n" + "\t'f2' type_id=4 bits_offset=64", + "[6] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[7] PTR '(anon)' type_id=10", + "[8] FWD 's2' fwd_kind=struct", + "[9] PTR '(anon)' type_id=11", + "[10] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=7 bits_offset=0\n" + "\t'f2' type_id=9 bits_offset=64", + "[11] STRUCT 's2' size=40 vlen=4\n" + "\t'f1' type_id=7 bits_offset=0\n" + "\t'f2' type_id=9 bits_offset=64\n" + "\t'f3' type_id=6 bits_offset=128\n" + "\t'f4' type_id=10 bits_offset=192", + "[12] PTR '(anon)' type_id=8", + "[13] STRUCT 's3' size=8 vlen=1\n" + "\t'f1' type_id=12 bits_offset=0"); + + err = btf__dedup(btf2, NULL, NULL); + if (!ASSERT_OK(err, "btf_dedup")) + goto cleanup; + + VALIDATE_RAW_BTF( + btf2, + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", + "[2] PTR '(anon)' type_id=5", + "[3] FWD 's2' fwd_kind=struct", + "[4] PTR '(anon)' type_id=3", + "[5] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=2 bits_offset=0\n" + "\t'f2' type_id=4 bits_offset=64", + "[6] PTR '(anon)' type_id=8", + "[7] PTR '(anon)' type_id=9", + "[8] STRUCT 's1' size=16 vlen=2\n" + "\t'f1' type_id=6 bits_offset=0\n" + "\t'f2' type_id=7 bits_offset=64", + "[9] STRUCT 's2' size=40 vlen=4\n" + "\t'f1' type_id=6 bits_offset=0\n" + "\t'f2' type_id=7 bits_offset=64\n" + "\t'f3' type_id=1 bits_offset=128\n" + "\t'f4' type_id=8 bits_offset=192", + "[10] STRUCT 's3' size=8 vlen=1\n" + "\t'f1' type_id=7 bits_offset=0"); + +cleanup: + btf__free(btf2); + btf__free(btf1); +} + +void test_btf_dedup_split() +{ + if (test__start_subtest("split_simple")) + test_split_simple(); + if (test__start_subtest("split_struct_duped")) + test_split_struct_duped(); + if (test__start_subtest("split_fwd_resolve")) + test_split_fwd_resolve(); +}
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 75fa1777694c245c1e59ac774cb1d58a15ecefeb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add ability to work with split BTF by providing extra -B flag, which allows to specify the path to the base BTF file.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105043402.2530976-12-andrii@kernel.org (cherry picked from commit 75fa1777694c245c1e59ac774cb1d58a15ecefeb) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 9 ++++++--- tools/bpf/bpftool/main.c | 15 ++++++++++++++- tools/bpf/bpftool/main.h | 1 + 3 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 592803af9734..112797055117 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -358,8 +358,12 @@ static int dump_btf_raw(const struct btf *btf, } } else { int cnt = btf__get_nr_types(btf); + int start_id = 1;
- for (i = 1; i <= cnt; i++) { + if (base_btf) + start_id = btf__get_nr_types(base_btf) + 1; + + for (i = start_id; i <= cnt; i++) { t = btf__type_by_id(btf, i); dump_btf_type(btf, i, t); } @@ -438,7 +442,6 @@ static int do_dump(int argc, char **argv) return -1; } src = GET_ARG(); - if (is_prefix(src, "map")) { struct bpf_map_info info = {}; __u32 len = sizeof(info); @@ -499,7 +502,7 @@ static int do_dump(int argc, char **argv) } NEXT_ARG(); } else if (is_prefix(src, "file")) { - btf = btf__parse(*argv, NULL); + btf = btf__parse_split(*argv, base_btf); if (IS_ERR(btf)) { err = -PTR_ERR(btf); btf = NULL; diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c index 1854d6b97860..edc889acd1fe 100644 --- a/tools/bpf/bpftool/main.c +++ b/tools/bpf/bpftool/main.c @@ -11,6 +11,7 @@
#include <bpf/bpf.h> #include <bpf/libbpf.h> +#include <bpf/btf.h>
#include "main.h"
@@ -28,6 +29,7 @@ bool show_pinned; bool block_mount; bool verifier_logs; bool relaxed_maps; +struct btf *base_btf; struct pinned_obj_table prog_table; struct pinned_obj_table map_table; struct pinned_obj_table link_table; @@ -392,6 +394,7 @@ int main(int argc, char **argv) { "mapcompat", no_argument, NULL, 'm' }, { "nomount", no_argument, NULL, 'n' }, { "debug", no_argument, NULL, 'd' }, + { "base-btf", required_argument, NULL, 'B' }, { 0 } }; int opt, ret; @@ -410,7 +413,7 @@ int main(int argc, char **argv) hash_init(link_table.table);
opterr = 0; - while ((opt = getopt_long(argc, argv, "Vhpjfmnd", + while ((opt = getopt_long(argc, argv, "VhpjfmndB:", options, NULL)) >= 0) { switch (opt) { case 'V': @@ -444,6 +447,15 @@ int main(int argc, char **argv) libbpf_set_print(print_all_levels); verifier_logs = true; break; + case 'B': + base_btf = btf__parse(optarg, NULL); + if (libbpf_get_error(base_btf)) { + p_err("failed to parse base BTF at '%s': %ld\n", + optarg, libbpf_get_error(base_btf)); + base_btf = NULL; + return -1; + } + break; default: p_err("unrecognized option '%s'", argv[optind - 1]); if (json_output) @@ -468,6 +480,7 @@ int main(int argc, char **argv) delete_pinned_obj_table(&map_table); delete_pinned_obj_table(&link_table); } + btf__free(base_btf);
return ret; } diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h index c46e52137b87..76e91641262b 100644 --- a/tools/bpf/bpftool/main.h +++ b/tools/bpf/bpftool/main.h @@ -90,6 +90,7 @@ extern bool show_pids; extern bool block_mount; extern bool verifier_logs; extern bool relaxed_maps; +extern struct btf *base_btf; extern struct pinned_obj_table prog_table; extern struct pinned_obj_table map_table; extern struct pinned_obj_table link_table;
From: Wang Qing wangqing@vivo.com
mainline inclusion from mainline-5.11-rc1 commit 666475ccbf1dc99c1e61e47975d5fbf86d6236aa category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove duplicate btf_ids.h header which is included twice.
Signed-off-by: Wang Qing wangqing@vivo.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/1604736650-11197-1-git-send-email-wangqing@vivo.... (cherry picked from commit 666475ccbf1dc99c1e61e47975d5fbf86d6236aa) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 9a0a9895ec62..0fbff23a7ebb 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -22,7 +22,6 @@ #include <linux/skmsg.h> #include <linux/perf_event.h> #include <linux/bsearch.h> -#include <linux/btf_ids.h> #include <net/sock.h>
/* BTF (BPF Type Format) is the meta data format which describes
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 951bb64621b8139c0cd99dcadc13e6510c08aa73 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Adjust in-kernel BTF implementation to support a split BTF mode of operation. Changes are mostly mirroring libbpf split BTF changes, with the exception of start_id being 0 for in-kernel implementation due to simpler read-only mode.
Otherwise, for split BTF logic, most of the logic of jumping to base BTF, where necessary, is encapsulated in few helper functions. Type numbering and string offset in a split BTF are logically continuing where base BTF ends, so most of the high-level logic is kept without changes.
Type verification and size resolution is only doing an added resolution of new split BTF types and relies on already cached size and type resolution results in the base BTF.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201110011932.3201430-2-andrii@kernel.org (cherry picked from commit 951bb64621b8139c0cd99dcadc13e6510c08aa73) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++-------------- 1 file changed, 119 insertions(+), 52 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 0fbff23a7ebb..00bf3346e79f 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -203,12 +203,17 @@ struct btf { const char *strings; void *nohdr_data; struct btf_header hdr; - u32 nr_types; + u32 nr_types; /* includes VOID for base BTF */ u32 types_size; u32 data_size; refcount_t refcnt; u32 id; struct rcu_head rcu; + + /* split BTF support */ + struct btf *base_btf; + u32 start_id; /* first type ID in this BTF (0 for base BTF) */ + u32 start_str_off; /* first string offset (0 for base BTF) */ };
enum verifier_phase { @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; }
+static u32 btf_nr_types_total(const struct btf *btf) +{ + u32 total = 0; + + while (btf) { + total += btf->nr_types; + btf = btf->base_btf; + } + + return total; +} + s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind) { const struct btf_type *t; const char *tname; - u32 i; + u32 i, total;
- for (i = 1; i <= btf->nr_types; i++) { - t = btf->types[i]; + total = btf_nr_types_total(btf); + for (i = 1; i < total; i++) { + t = btf_type_by_id(btf, i); if (BTF_INFO_KIND(t->info) != kind) continue;
@@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
static bool btf_name_offset_valid(const struct btf *btf, u32 offset) { - return BTF_STR_OFFSET_VALID(offset) && - offset < btf->hdr.str_len; + if (!BTF_STR_OFFSET_VALID(offset)) + return false; + + while (offset < btf->start_str_off) + btf = btf->base_btf; + + offset -= btf->start_str_off; + return offset < btf->hdr.str_len; }
static bool __btf_name_char_ok(char c, bool first, bool dot_ok) @@ -614,10 +638,22 @@ static bool __btf_name_char_ok(char c, bool first, bool dot_ok) return true; }
+static const char *btf_str_by_offset(const struct btf *btf, u32 offset) +{ + while (offset < btf->start_str_off) + btf = btf->base_btf; + + offset -= btf->start_str_off; + if (offset < btf->hdr.str_len) + return &btf->strings[offset]; + + return NULL; +} + static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok) { /* offset must be valid */ - const char *src = &btf->strings[offset]; + const char *src = btf_str_by_offset(btf, offset); const char *src_limit;
if (!__btf_name_char_ok(*src, true, dot_ok)) @@ -650,27 +686,28 @@ static bool btf_name_valid_section(const struct btf *btf, u32 offset)
static const char *__btf_name_by_offset(const struct btf *btf, u32 offset) { + const char *name; + if (!offset) return "(anon)"; - else if (offset < btf->hdr.str_len) - return &btf->strings[offset]; - else - return "(invalid-name-offset)"; + + name = btf_str_by_offset(btf, offset); + return name ?: "(invalid-name-offset)"; }
const char *btf_name_by_offset(const struct btf *btf, u32 offset) { - if (offset < btf->hdr.str_len) - return &btf->strings[offset]; - - return NULL; + return btf_str_by_offset(btf, offset); }
const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id) { - if (type_id > btf->nr_types) - return NULL; + while (type_id < btf->start_id) + btf = btf->base_btf;
+ type_id -= btf->start_id; + if (type_id >= btf->nr_types) + return NULL; return btf->types[type_id]; }
@@ -1390,17 +1427,13 @@ static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t) { struct btf *btf = env->btf;
- /* < 2 because +1 for btf_void which is always in btf->types[0]. - * btf_void is not accounted in btf->nr_types because btf_void - * does not come from the BTF file. - */ - if (btf->types_size - btf->nr_types < 2) { + if (btf->types_size == btf->nr_types) { /* Expand 'types' array */
struct btf_type **new_types; u32 expand_by, new_size;
- if (btf->types_size == BTF_MAX_TYPE) { + if (btf->start_id + btf->types_size == BTF_MAX_TYPE) { btf_verifier_log(env, "Exceeded max num of types"); return -E2BIG; } @@ -1414,18 +1447,23 @@ static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t) if (!new_types) return -ENOMEM;
- if (btf->nr_types == 0) - new_types[0] = &btf_void; - else + if (btf->nr_types == 0) { + if (!btf->base_btf) { + /* lazily init VOID type */ + new_types[0] = &btf_void; + btf->nr_types++; + } + } else { memcpy(new_types, btf->types, - sizeof(*btf->types) * (btf->nr_types + 1)); + sizeof(*btf->types) * btf->nr_types); + }
kvfree(btf->types); btf->types = new_types; btf->types_size = new_size; }
- btf->types[++(btf->nr_types)] = t; + btf->types[btf->nr_types++] = t;
return 0; } @@ -1498,18 +1536,17 @@ static int env_resolve_init(struct btf_verifier_env *env) u32 *resolved_ids = NULL; u8 *visit_states = NULL;
- /* +1 for btf_void */ - resolved_sizes = kvcalloc(nr_types + 1, sizeof(*resolved_sizes), + resolved_sizes = kvcalloc(nr_types, sizeof(*resolved_sizes), GFP_KERNEL | __GFP_NOWARN); if (!resolved_sizes) goto nomem;
- resolved_ids = kvcalloc(nr_types + 1, sizeof(*resolved_ids), + resolved_ids = kvcalloc(nr_types, sizeof(*resolved_ids), GFP_KERNEL | __GFP_NOWARN); if (!resolved_ids) goto nomem;
- visit_states = kvcalloc(nr_types + 1, sizeof(*visit_states), + visit_states = kvcalloc(nr_types, sizeof(*visit_states), GFP_KERNEL | __GFP_NOWARN); if (!visit_states) goto nomem; @@ -1561,21 +1598,27 @@ static bool env_type_is_resolve_sink(const struct btf_verifier_env *env, static bool env_type_is_resolved(const struct btf_verifier_env *env, u32 type_id) { - return env->visit_states[type_id] == RESOLVED; + /* base BTF types should be resolved by now */ + if (type_id < env->btf->start_id) + return true; + + return env->visit_states[type_id - env->btf->start_id] == RESOLVED; }
static int env_stack_push(struct btf_verifier_env *env, const struct btf_type *t, u32 type_id) { + const struct btf *btf = env->btf; struct resolve_vertex *v;
if (env->top_stack == MAX_RESOLVE_DEPTH) return -E2BIG;
- if (env->visit_states[type_id] != NOT_VISITED) + if (type_id < btf->start_id + || env->visit_states[type_id - btf->start_id] != NOT_VISITED) return -EEXIST;
- env->visit_states[type_id] = VISITED; + env->visit_states[type_id - btf->start_id] = VISITED;
v = &env->stack[env->top_stack++]; v->t = t; @@ -1605,6 +1648,7 @@ static void env_stack_pop_resolved(struct btf_verifier_env *env, u32 type_id = env->stack[--(env->top_stack)].type_id; struct btf *btf = env->btf;
+ type_id -= btf->start_id; /* adjust to local type id */ btf->resolved_sizes[type_id] = resolved_size; btf->resolved_ids[type_id] = resolved_type_id; env->visit_states[type_id] = RESOLVED; @@ -1709,14 +1753,30 @@ btf_resolve_size(const struct btf *btf, const struct btf_type *type, return __btf_resolve_size(btf, type, type_size, NULL, NULL, NULL, NULL); }
+static u32 btf_resolved_type_id(const struct btf *btf, u32 type_id) +{ + while (type_id < btf->start_id) + btf = btf->base_btf; + + return btf->resolved_ids[type_id - btf->start_id]; +} + /* The input param "type_id" must point to a needs_resolve type */ static const struct btf_type *btf_type_id_resolve(const struct btf *btf, u32 *type_id) { - *type_id = btf->resolved_ids[*type_id]; + *type_id = btf_resolved_type_id(btf, *type_id); return btf_type_by_id(btf, *type_id); }
+static u32 btf_resolved_type_size(const struct btf *btf, u32 type_id) +{ + while (type_id < btf->start_id) + btf = btf->base_btf; + + return btf->resolved_sizes[type_id - btf->start_id]; +} + const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size) { @@ -1731,7 +1791,7 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, if (btf_type_has_size(size_type)) { size = size_type->size; } else if (btf_type_is_array(size_type)) { - size = btf->resolved_sizes[size_type_id]; + size = btf_resolved_type_size(btf, size_type_id); } else if (btf_type_is_ptr(size_type)) { size = sizeof(void *); } else { @@ -1739,14 +1799,14 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, !btf_type_is_var(size_type))) return NULL;
- size_type_id = btf->resolved_ids[size_type_id]; + size_type_id = btf_resolved_type_id(btf, size_type_id); size_type = btf_type_by_id(btf, size_type_id); if (btf_type_nosize_or_null(size_type)) return NULL; else if (btf_type_has_size(size_type)) size = size_type->size; else if (btf_type_is_array(size_type)) - size = btf->resolved_sizes[size_type_id]; + size = btf_resolved_type_size(btf, size_type_id); else if (btf_type_is_ptr(size_type)) size = sizeof(void *); else @@ -3798,7 +3858,7 @@ static int btf_check_all_metas(struct btf_verifier_env *env) cur = btf->nohdr_data + hdr->type_off; end = cur + hdr->type_len;
- env->log_type_id = 1; + env->log_type_id = btf->base_btf ? btf->start_id : 1; while (cur < end) { struct btf_type *t = cur; s32 meta_size; @@ -3825,8 +3885,8 @@ static bool btf_resolve_valid(struct btf_verifier_env *env, return false;
if (btf_type_is_struct(t) || btf_type_is_datasec(t)) - return !btf->resolved_ids[type_id] && - !btf->resolved_sizes[type_id]; + return !btf_resolved_type_id(btf, type_id) && + !btf_resolved_type_size(btf, type_id);
if (btf_type_is_modifier(t) || btf_type_is_ptr(t) || btf_type_is_var(t)) { @@ -3846,7 +3906,7 @@ static bool btf_resolve_valid(struct btf_verifier_env *env, elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size); return elem_type && !btf_type_is_modifier(elem_type) && (array->nelems * elem_size == - btf->resolved_sizes[type_id]); + btf_resolved_type_size(btf, type_id)); }
return false; @@ -3888,7 +3948,8 @@ static int btf_resolve(struct btf_verifier_env *env, static int btf_check_all_types(struct btf_verifier_env *env) { struct btf *btf = env->btf; - u32 type_id; + const struct btf_type *t; + u32 type_id, i; int err;
err = env_resolve_init(env); @@ -3896,8 +3957,9 @@ static int btf_check_all_types(struct btf_verifier_env *env) return err;
env->phase++; - for (type_id = 1; type_id <= btf->nr_types; type_id++) { - const struct btf_type *t = btf_type_by_id(btf, type_id); + for (i = btf->base_btf ? 0 : 1; i < btf->nr_types; i++) { + type_id = btf->start_id + i; + t = btf_type_by_id(btf, type_id);
env->log_type_id = type_id; if (btf_type_needs_resolve(t) && @@ -3934,7 +3996,7 @@ static int btf_parse_type_sec(struct btf_verifier_env *env) return -EINVAL; }
- if (!hdr->type_len) { + if (!env->btf->base_btf && !hdr->type_len) { btf_verifier_log(env, "No type found"); return -EINVAL; } @@ -3961,13 +4023,18 @@ static int btf_parse_str_sec(struct btf_verifier_env *env) return -EINVAL; }
- if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_NAME_OFFSET || - start[0] || end[-1]) { + btf->strings = start; + + if (btf->base_btf && !hdr->str_len) + return 0; + if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_NAME_OFFSET || end[-1]) { + btf_verifier_log(env, "Invalid string section"); + return -EINVAL; + } + if (!btf->base_btf && start[0]) { btf_verifier_log(env, "Invalid string section"); return -EINVAL; } - - btf->strings = start;
return 0; } @@ -4910,7 +4977,7 @@ static int __get_type_size(struct btf *btf, u32 btf_id, while (t && btf_type_is_modifier(t)) t = btf_type_by_id(btf, t->type); if (!t) { - *bad_type = btf->types[0]; + *bad_type = btf_type_by_id(btf, 0); return -EINVAL; } if (btf_type_is_ptr(t))
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 5329722057d41aebc31e391907a501feaa42f7d9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF objects in the system. To allow distinguishing vmlinux BTF (and later kernel module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module BTF). We might want to later allow specifying BTF name for user-provided BTFs as well, if that makes sense. But currently this is reserved only for in-kernel BTFs.
Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require in-kernel BTF type with ability to specify BTF types from kernel modules, not just vmlinux BTF. This will be implemented in a follow up patch set for fentry/fexit/fmod_ret/lsm/etc.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org (cherry picked from commit 5329722057d41aebc31e391907a501feaa42f7d9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 3 +++ kernel/bpf/btf.c | 43 +++++++++++++++++++++++++++++++--- tools/include/uapi/linux/bpf.h | 3 +++ 3 files changed, 46 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d5fbbc28b6a0..dfbd0974e2ed 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4459,6 +4459,9 @@ struct bpf_btf_info { __aligned_u64 btf; __u32 btf_size; __u32 id; + __aligned_u64 name; + __u32 name_len; + __u32 kernel_btf; } __attribute__((aligned(8)));
struct bpf_link_info { diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 00bf3346e79f..f6373d2ad949 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -214,6 +214,8 @@ struct btf { struct btf *base_btf; u32 start_id; /* first type ID in this BTF (0 for base BTF) */ u32 start_str_off; /* first string offset (0 for base BTF) */ + char name[MODULE_NAME_LEN]; + bool kernel_btf; };
enum verifier_phase { @@ -4428,6 +4430,8 @@ struct btf *btf_parse_vmlinux(void)
btf->data = __start_BTF; btf->data_size = __stop_BTF - __start_BTF; + btf->kernel_btf = true; + snprintf(btf->name, sizeof(btf->name), "vmlinux");
err = btf_parse_hdr(env); if (err) @@ -4453,8 +4457,13 @@ struct btf *btf_parse_vmlinux(void)
bpf_struct_ops_init(btf, log);
- btf_verifier_env_free(env); refcount_set(&btf->refcnt, 1); + + err = btf_alloc_id(btf); + if (err) + goto errout; + + btf_verifier_env_free(env); return btf;
errout: @@ -5567,7 +5576,9 @@ int btf_get_info_by_fd(const struct btf *btf, struct bpf_btf_info info; u32 info_copy, btf_copy; void __user *ubtf; - u32 uinfo_len; + char __user *uname; + u32 uinfo_len, uname_len, name_len; + int ret = 0;
uinfo = u64_to_user_ptr(attr->info.info); uinfo_len = attr->info.info_len; @@ -5584,11 +5595,37 @@ int btf_get_info_by_fd(const struct btf *btf, return -EFAULT; info.btf_size = btf->data_size;
+ info.kernel_btf = btf->kernel_btf; + + uname = u64_to_user_ptr(info.name); + uname_len = info.name_len; + if (!uname ^ !uname_len) + return -EINVAL; + + name_len = strlen(btf->name); + info.name_len = name_len; + + if (uname) { + if (uname_len >= name_len + 1) { + if (copy_to_user(uname, btf->name, name_len + 1)) + return -EFAULT; + } else { + char zero = '\0'; + + if (copy_to_user(uname, btf->name, uname_len - 1)) + return -EFAULT; + if (put_user(zero, uname + uname_len - 1)) + return -EFAULT; + /* let user-space know about too short buffer */ + ret = -ENOSPC; + } + } + if (copy_to_user(uinfo, &info, info_copy) || put_user(info_copy, &uattr->info.info_len)) return -EFAULT;
- return 0; + return ret; }
int btf_get_fd_by_id(u32 id) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index b2a0b189b797..934ec0450b7c 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4458,6 +4458,9 @@ struct bpf_btf_info { __aligned_u64 btf; __u32 btf_size; __u32 id; + __aligned_u64 name; + __u32 name_len; + __u32 kernel_btf; } __attribute__((aligned(8)));
struct bpf_link_info {
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 5f9ae91f7c0dbbc4195e2a6c8eedcaeb5b9e4cbb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Detect if pahole supports split BTF generation, and generate BTF for each selected kernel module, if it does. This is exposed to Makefiles and C code as CONFIG_DEBUG_INFO_BTF_MODULES flag.
Kernel module BTF has to be re-generated if either vmlinux's BTF changes or module's .ko changes. To achieve that, I needed a helper similar to if_changed, but that would allow to filter out vmlinux from the list of updated dependencies for .ko building. I've put it next to the only place that uses and needs it, but it might be a better idea to just add it along the other if_changed variants into scripts/Kbuild.include.
Each kernel module's BTF deduplication is pretty fast, as it does only incremental BTF deduplication on top of already deduplicated vmlinux BTF. To show the added build time, I've first ran make only just built kernel (to establish the baseline) and then forced only BTF re-generation, without regenerating .ko files. The build was performed with -j60 parallelization on 56-core machine. The final time also includes bzImage building, so it's not a pure BTF overhead.
$ time make -j60 ... make -j60 27.65s user 10.96s system 782% cpu 4.933 total $ touch ~/linux-build/default/vmlinux && time make -j60 ... make -j60 123.69s user 27.85s system 1566% cpu 9.675 total
So 4.6 seconds real time, with noticeable part spent in compressed vmlinux and bzImage building.
To show size savings, I've built my kernel configuration with about 700 kernel modules with full BTF per each kernel module (without deduplicating against vmlinux) and with split BTF against deduplicated vmlinux (approach in this patch). Below are top 10 modules with biggest BTF sizes. And total size of BTF data across all kernel modules.
It shows that split BTF "compresses" 115MB down to 5MB total. And the biggest kernel modules get a downsize from 500-570KB down to 200-300KB.
FULL BTF
========
$ for f in $(find . -name '*.ko'); do size -A -d $f | grep BTF | awk '{print $2}'; done | awk '{ s += $1 } END { print s }' 115710691
$ for f in $(find . -name '*.ko'); do printf "%s %d\n" $f $(size -A -d $f | grep BTF | awk '{print $2}'); done | sort -nr -k2 | head -n10 ./drivers/gpu/drm/i915/i915.ko 570570 ./drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko 520240 ./drivers/gpu/drm/radeon/radeon.ko 503849 ./drivers/infiniband/hw/mlx5/mlx5_ib.ko 491777 ./fs/xfs/xfs.ko 411544 ./drivers/net/ethernet/intel/i40e/i40e.ko 403904 ./drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko 398754 ./drivers/infiniband/core/ib_core.ko 397224 ./fs/cifs/cifs.ko 386249 ./fs/nfsd/nfsd.ko 379738
SPLIT BTF =========
$ for f in $(find . -name '*.ko'); do size -A -d $f | grep BTF | awk '{print $2}'; done | awk '{ s += $1 } END { print s }' 5194047
$ for f in $(find . -name '*.ko'); do printf "%s %d\n" $f $(size -A -d $f | grep BTF | awk '{print $2}'); done | sort -nr -k2 | head -n10 ./drivers/gpu/drm/i915/i915.ko 293206 ./drivers/gpu/drm/radeon/radeon.ko 282103 ./fs/xfs/xfs.ko 222150 ./drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko 198503 ./drivers/infiniband/hw/mlx5/mlx5_ib.ko 198356 ./drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko 113444 ./fs/cifs/cifs.ko 109379 ./arch/x86/kvm/kvm.ko 100225 ./drivers/gpu/drm/drm.ko 94827 ./drivers/infiniband/core/ib_core.ko 91188
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201110011932.3201430-4-andrii@kernel.org (cherry picked from commit 5f9ae91f7c0dbbc4195e2a6c8eedcaeb5b9e4cbb) Signed-off-by: Wang Yufen wangyufen@huawei.com --- lib/Kconfig.debug | 9 +++++++++ scripts/Makefile.modfinal | 20 ++++++++++++++++++-- 2 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index f906df9db2e2..54ba01323fe6 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -274,6 +274,15 @@ config DEBUG_INFO_BTF Turning this on expects presence of pahole tool, which will convert DWARF type info into equivalent deduplicated BTF type info.
+config PAHOLE_HAS_SPLIT_BTF + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+).([0-9]+)/\1\2/'` -ge "119") + +config DEBUG_INFO_BTF_MODULES + def_bool y + depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF + help + Generate compact split BTF type information for kernel modules. + config GDB_SCRIPTS bool "Provide GDB scripts for kernel debugging" help diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index ae01baf96f4e..02b892421f7a 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -6,6 +6,7 @@ PHONY := __modfinal __modfinal:
+include include/config/auto.conf include $(srctree)/scripts/Kbuild.include
# for c_flags @@ -36,8 +37,23 @@ quiet_cmd_ld_ko_o = LD [M] $@ -T scripts/module.lds -o $@ $(filter %.o, $^); \ $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
-$(modules): %.ko: %.o %.mod.o scripts/module.lds FORCE - +$(call if_changed,ld_ko_o) +quiet_cmd_btf_ko = BTF [M] $@ + cmd_btf_ko = LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@ + +# Same as newer-prereqs, but allows to exclude specified extra dependencies +newer_prereqs_except = $(filter-out $(PHONY) $(1),$?) + +# Same as if_changed, but allows to exclude specified extra dependencies +if_changed_except = $(if $(call newer_prereqs_except,$(2))$(cmd-check), \ + $(cmd); \ + printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd, @:) + +# Re-generate module BTFs if either module's .ko or vmlinux changed +$(modules): %.ko: %.o %.mod.o scripts/module.lds vmlinux FORCE + +$(call if_changed_except,ld_ko_o,vmlinux) +ifdef CONFIG_DEBUG_INFO_BTF_MODULES + +$(if $(newer-prereqs),$(call cmd,btf_ko)) +endif
targets += $(modules) $(modules:.ko=.mod.o)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 36e68442d1afd4f720704ee1ea8486331507e834 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add kernel module listener that will load/validate and unload module BTF. Module BTFs gets ID generated for them, which makes it possible to iterate them with existing BTF iteration API. They are given their respective module's names, which will get reported through GET_OBJ_INFO API. They are also marked as in-kernel BTFs for tooling to distinguish them from user-provided BTFs.
Also, similarly to vmlinux BTF, kernel module BTFs are exposed through sysfs as /sys/kernel/btf/<module-name>. This is convenient for user-space tools to inspect module BTF contents and dump their types with existing tools:
[vmuser@archvm bpf]$ ls -la /sys/kernel/btf total 0 drwxr-xr-x 2 root root 0 Nov 4 19:46 . drwxr-xr-x 13 root root 0 Nov 4 19:46 ..
...
-r--r--r-- 1 root root 888 Nov 4 19:46 irqbypass -r--r--r-- 1 root root 100225 Nov 4 19:46 kvm -r--r--r-- 1 root root 35401 Nov 4 19:46 kvm_intel -r--r--r-- 1 root root 120 Nov 4 19:46 pcspkr -r--r--r-- 1 root root 399 Nov 4 19:46 serio_raw -r--r--r-- 1 root root 4094095 Nov 4 19:46 vmlinux
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org (cherry picked from commit 36e68442d1afd4f720704ee1ea8486331507e834) Signed-off-by: Wang Yufen wangyufen@huawei.com --- Documentation/ABI/testing/sysfs-kernel-btf | 8 + include/linux/bpf.h | 2 + include/linux/module.h | 4 + kernel/bpf/btf.c | 194 +++++++++++++++++++++ kernel/bpf/sysfs_btf.c | 2 +- kernel/module.c | 32 ++++ 6 files changed, 241 insertions(+), 1 deletion(-)
diff --git a/Documentation/ABI/testing/sysfs-kernel-btf b/Documentation/ABI/testing/sysfs-kernel-btf index 2c9744b2cd59..fe96efdc9b6c 100644 --- a/Documentation/ABI/testing/sysfs-kernel-btf +++ b/Documentation/ABI/testing/sysfs-kernel-btf @@ -15,3 +15,11 @@ Description: information with description of all internal kernel types. See Documentation/bpf/btf.rst for detailed description of format itself. + +What: /sys/kernel/btf/<module-name> +Date: Nov 2020 +KernelVersion: 5.11 +Contact: bpf@vger.kernel.org +Description: + Read-only binary attribute exposing kernel module's BTF type + information as an add-on to the kernel's BTF (/sys/kernel/btf/vmlinux). diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a64bbce97cdb..8a0c2bd2b8d5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -37,9 +37,11 @@ struct seq_operations; struct bpf_iter_aux_info; struct bpf_local_storage; struct bpf_local_storage_map; +struct kobject;
extern struct idr btf_idr; extern spinlock_t btf_idr_lock; +extern struct kobject *btf_kobj;
typedef int (*bpf_iter_init_seq_priv_t)(void *private_data, struct bpf_iter_aux_info *aux); diff --git a/include/linux/module.h b/include/linux/module.h index 54cdd20fc3de..c185767a7313 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -481,6 +481,10 @@ struct module { unsigned int num_bpf_raw_events; struct bpf_raw_event_map *bpf_raw_events; #endif +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES + unsigned int btf_data_size; + void *btf_data; +#endif #ifdef CONFIG_JUMP_LABEL struct jump_entry *jump_entries; unsigned int num_jump_entries; diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index f6373d2ad949..0a0c81588e61 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -22,6 +22,8 @@ #include <linux/skmsg.h> #include <linux/perf_event.h> #include <linux/bsearch.h> +#include <linux/kobject.h> +#include <linux/sysfs.h> #include <net/sock.h>
/* BTF (BPF Type Format) is the meta data format which describes @@ -4475,6 +4477,75 @@ struct btf *btf_parse_vmlinux(void) return ERR_PTR(err); }
+static struct btf *btf_parse_module(const char *module_name, const void *data, unsigned int data_size) +{ + struct btf_verifier_env *env = NULL; + struct bpf_verifier_log *log; + struct btf *btf = NULL, *base_btf; + int err; + + base_btf = bpf_get_btf_vmlinux(); + if (IS_ERR(base_btf)) + return base_btf; + if (!base_btf) + return ERR_PTR(-EINVAL); + + env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN); + if (!env) + return ERR_PTR(-ENOMEM); + + log = &env->log; + log->level = BPF_LOG_KERNEL; + + btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN); + if (!btf) { + err = -ENOMEM; + goto errout; + } + env->btf = btf; + + btf->base_btf = base_btf; + btf->start_id = base_btf->nr_types; + btf->start_str_off = base_btf->hdr.str_len; + btf->kernel_btf = true; + snprintf(btf->name, sizeof(btf->name), "%s", module_name); + + btf->data = kvmalloc(data_size, GFP_KERNEL | __GFP_NOWARN); + if (!btf->data) { + err = -ENOMEM; + goto errout; + } + memcpy(btf->data, data, data_size); + btf->data_size = data_size; + + err = btf_parse_hdr(env); + if (err) + goto errout; + + btf->nohdr_data = btf->data + btf->hdr.hdr_len; + + err = btf_parse_str_sec(env); + if (err) + goto errout; + + err = btf_check_all_metas(env); + if (err) + goto errout; + + btf_verifier_env_free(env); + refcount_set(&btf->refcnt, 1); + return btf; + +errout: + btf_verifier_env_free(env); + if (btf) { + kvfree(btf->data); + kvfree(btf->types); + kfree(btf); + } + return ERR_PTR(err); +} + struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog) { struct bpf_prog *tgt_prog = prog->aux->dst_prog; @@ -5665,3 +5736,126 @@ bool btf_id_set_contains(const struct btf_id_set *set, u32 id) { return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL; } + +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES +struct btf_module { + struct list_head list; + struct module *module; + struct btf *btf; + struct bin_attribute *sysfs_attr; +}; + +static LIST_HEAD(btf_modules); +static DEFINE_MUTEX(btf_module_mutex); + +static ssize_t +btf_module_read(struct file *file, struct kobject *kobj, + struct bin_attribute *bin_attr, + char *buf, loff_t off, size_t len) +{ + const struct btf *btf = bin_attr->private; + + memcpy(buf, btf->data + off, len); + return len; +} + +static int btf_module_notify(struct notifier_block *nb, unsigned long op, + void *module) +{ + struct btf_module *btf_mod, *tmp; + struct module *mod = module; + struct btf *btf; + int err = 0; + + if (mod->btf_data_size == 0 || + (op != MODULE_STATE_COMING && op != MODULE_STATE_GOING)) + goto out; + + switch (op) { + case MODULE_STATE_COMING: + btf_mod = kzalloc(sizeof(*btf_mod), GFP_KERNEL); + if (!btf_mod) { + err = -ENOMEM; + goto out; + } + btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); + if (IS_ERR(btf)) { + pr_warn("failed to validate module [%s] BTF: %ld\n", + mod->name, PTR_ERR(btf)); + kfree(btf_mod); + err = PTR_ERR(btf); + goto out; + } + err = btf_alloc_id(btf); + if (err) { + btf_free(btf); + kfree(btf_mod); + goto out; + } + + mutex_lock(&btf_module_mutex); + btf_mod->module = module; + btf_mod->btf = btf; + list_add(&btf_mod->list, &btf_modules); + mutex_unlock(&btf_module_mutex); + + if (IS_ENABLED(CONFIG_SYSFS)) { + struct bin_attribute *attr; + + attr = kzalloc(sizeof(*attr), GFP_KERNEL); + if (!attr) + goto out; + + sysfs_bin_attr_init(attr); + attr->attr.name = btf->name; + attr->attr.mode = 0444; + attr->size = btf->data_size; + attr->private = btf; + attr->read = btf_module_read; + + err = sysfs_create_bin_file(btf_kobj, attr); + if (err) { + pr_warn("failed to register module [%s] BTF in sysfs: %d\n", + mod->name, err); + kfree(attr); + err = 0; + goto out; + } + + btf_mod->sysfs_attr = attr; + } + + break; + case MODULE_STATE_GOING: + mutex_lock(&btf_module_mutex); + list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) { + if (btf_mod->module != module) + continue; + + list_del(&btf_mod->list); + if (btf_mod->sysfs_attr) + sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr); + btf_put(btf_mod->btf); + kfree(btf_mod->sysfs_attr); + kfree(btf_mod); + break; + } + mutex_unlock(&btf_module_mutex); + break; + } +out: + return notifier_from_errno(err); +} + +static struct notifier_block btf_module_nb = { + .notifier_call = btf_module_notify, +}; + +static int __init btf_module_init(void) +{ + register_module_notifier(&btf_module_nb); + return 0; +} + +fs_initcall(btf_module_init); +#endif /* CONFIG_DEBUG_INFO_BTF_MODULES */ diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c index 11b3380887fa..ef6911aee3bb 100644 --- a/kernel/bpf/sysfs_btf.c +++ b/kernel/bpf/sysfs_btf.c @@ -26,7 +26,7 @@ static struct bin_attribute bin_attr_btf_vmlinux __ro_after_init = { .read = btf_vmlinux_read, };
-static struct kobject *btf_kobj; +struct kobject *btf_kobj;
static int __init btf_vmlinux_init(void) { diff --git a/kernel/module.c b/kernel/module.c index 5fdfa29a0738..e7d2a53842da 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -387,6 +387,35 @@ static void *section_objs(const struct load_info *info, return (void *)info->sechdrs[sec].sh_addr; }
+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */ +static unsigned int find_any_sec(const struct load_info *info, const char *name) +{ + unsigned int i; + + for (i = 1; i < info->hdr->e_shnum; i++) { + Elf_Shdr *shdr = &info->sechdrs[i]; + if (strcmp(info->secstrings + shdr->sh_name, name) == 0) + return i; + } + return 0; +} + +/* + * Find a module section, or NULL. Fill in number of "objects" in section. + * Ignores SHF_ALLOC flag. + */ +static __maybe_unused void *any_section_objs(const struct load_info *info, + const char *name, + size_t object_size, + unsigned int *num) +{ + unsigned int sec = find_any_sec(info, name); + + /* Section 0 has sh_addr 0 and sh_size 0. */ + *num = info->sechdrs[sec].sh_size / object_size; + return (void *)info->sechdrs[sec].sh_addr; +} + /* Provided by the linker */ extern const struct kernel_symbol __start___ksymtab[]; extern const struct kernel_symbol __stop___ksymtab[]; @@ -3375,6 +3404,9 @@ static int find_module_sections(struct module *mod, struct load_info *info) sizeof(*mod->bpf_raw_events), &mod->num_bpf_raw_events); #endif +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES + mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size); +#endif #ifdef CONFIG_JUMP_LABEL mod->jump_entries = section_objs(info, "__jump_table", sizeof(*mod->jump_entries),
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit cecaf4a0f2dcbd7e76156f6d63748b84c176b95b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Display vmlinux BTF name and kernel module names when listing available BTFs on the system.
In human-readable output mode, module BTFs are reported with "name [module-name]", while vmlinux BTF will be reported as "name [vmlinux]". Square brackets are added by bpftool and follow kernel convention when displaying modules in human-readable text outputs.
[vmuser@archvm bpf]$ sudo ../../../bpf/bpftool/bpftool btf s 1: name [vmlinux] size 4082281B 6: size 2365B prog_ids 8,6 map_ids 3 7: name [button] size 46895B 8: name [pcspkr] size 42328B 9: name [serio_raw] size 39375B 10: name [floppy] size 57185B 11: name [i2c_core] size 76186B 12: name [crc32c_intel] size 16036B 13: name [i2c_piix4] size 50497B 14: name [irqbypass] size 14124B 15: name [kvm] size 197985B 16: name [kvm_intel] size 123564B 17: name [cryptd] size 42466B 18: name [crypto_simd] size 17187B 19: name [glue_helper] size 39205B 20: name [aesni_intel] size 41034B 25: size 36150B pids bpftool(2519)
In JSON mode, two fields (boolean "kernel" and string "name") are reported for each BTF object. vmlinux BTF is reported with name "vmlinux" (kernel itself returns and empty name for vmlinux BTF).
[vmuser@archvm bpf]$ sudo ../../../bpf/bpftool/bpftool btf s -jp [{ "id": 1, "size": 4082281, "prog_ids": [], "map_ids": [], "kernel": true, "name": "vmlinux" },{ "id": 6, "size": 2365, "prog_ids": [8,6 ], "map_ids": [3 ], "kernel": false },{ "id": 7, "size": 46895, "prog_ids": [], "map_ids": [], "kernel": true, "name": "button" },{
...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Tested-by: Alan Maguire alan.maguire@oracle.com Link: https://lore.kernel.org/bpf/20201110011932.3201430-6-andrii@kernel.org (cherry picked from commit cecaf4a0f2dcbd7e76156f6d63748b84c176b95b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 112797055117..3dc13419a2df 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -746,9 +746,14 @@ show_btf_plain(struct bpf_btf_info *info, int fd, struct btf_attach_table *btf_map_table) { struct btf_attach_point *obj; + const char *name = u64_to_ptr(info->name); int n;
printf("%u: ", info->id); + if (info->kernel_btf) + printf("name [%s] ", name); + else if (name && name[0]) + printf("name %s ", name); printf("size %uB", info->btf_size);
n = 0; @@ -775,6 +780,7 @@ show_btf_json(struct bpf_btf_info *info, int fd, struct btf_attach_table *btf_map_table) { struct btf_attach_point *obj; + const char *name = u64_to_ptr(info->name);
jsonw_start_object(json_wtr); /* btf object */ jsonw_uint_field(json_wtr, "id", info->id); @@ -800,6 +806,11 @@ show_btf_json(struct bpf_btf_info *info, int fd,
emit_obj_refs_json(&refs_table, info->id, json_wtr); /* pids */
+ jsonw_bool_field(json_wtr, "kernel", info->kernel_btf); + + if (name && name[0]) + jsonw_string_field(json_wtr, "name", name); + jsonw_end_object(json_wtr); /* btf object */ }
@@ -807,15 +818,30 @@ static int show_btf(int fd, struct btf_attach_table *btf_prog_table, struct btf_attach_table *btf_map_table) { - struct bpf_btf_info info = {}; + struct bpf_btf_info info; __u32 len = sizeof(info); + char name[64]; int err;
+ memset(&info, 0, sizeof(info)); err = bpf_obj_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get BTF object info: %s", strerror(errno)); return -1; } + /* if kernel support emitting BTF object name, pass name pointer */ + if (info.name_len) { + memset(&info, 0, sizeof(info)); + info.name_len = sizeof(name); + info.name = ptr_to_u64(name); + len = sizeof(info); + + err = bpf_obj_get_info_by_fd(fd, &info, &len); + if (err) { + p_err("can't get BTF object info: %s", strerror(errno)); + return -1; + } + }
if (json_output) show_btf_json(&info, fd, btf_prog_table, btf_map_table);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 7112d127984bd7b0c8ded7973b358829f16735f5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Make sure btf_parse_module() is compiled out if module BTFs are not enabled.
Fixes: 36e68442d1af ("bpf: Load and verify kernel module BTFs") Reported-by: Stephen Rothwell sfr@canb.auug.org.au Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201111040645.903494-1-andrii@kernel.org (cherry picked from commit 7112d127984bd7b0c8ded7973b358829f16735f5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 0a0c81588e61..680f5d0c7fe8 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4477,6 +4477,8 @@ struct btf *btf_parse_vmlinux(void) return ERR_PTR(err); }
+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES + static struct btf *btf_parse_module(const char *module_name, const void *data, unsigned int data_size) { struct btf_verifier_env *env = NULL; @@ -4546,6 +4548,8 @@ static struct btf *btf_parse_module(const char *module_name, const void *data, u return ERR_PTR(err); }
+#endif /* CONFIG_DEBUG_INFO_BTF_MODULES */ + struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog) { struct bpf_prog *tgt_prog = prog->aux->dst_prog;
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.11-rc1 commit c8a950d0d3b926a02c7b2e713850d38217cec3d1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Several Makefiles in tools/ need to define the host toolchain variables. Move their definition to tools/scripts/Makefile.include
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Jiri Olsa jolsa@redhat.com Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://lore.kernel.org/bpf/20201110164310.2600671-2-jean-philippe@linaro.or... (cherry picked from commit c8a950d0d3b926a02c7b2e713850d38217cec3d1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/scripts/Makefile.include | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index a3f99ef6b11b..865e21d4cc12 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -77,6 +77,16 @@ HOSTCC ?= gcc HOSTLD ?= ld endif
+ifneq ($(LLVM),) +HOSTAR ?= llvm-ar +HOSTCC ?= clang +HOSTLD ?= ld.lld +else +HOSTAR ?= ar +HOSTCC ?= gcc +HOSTLD ?= ld +endif + ifeq ($(CC_NO_CLANG), 1) EXTRA_WARNINGS += -Wstrict-aliasing=3 endif
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.11-rc1 commit 3290996e713315dfc9589e2c340a4c4997d90be8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Makefile.include defines variables such as OUTPUT and CC for out-of-tree build and cross-build. Include it into the runqslower Makefile and use its $(QUIET*) helpers.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201110164310.2600671-5-jean-philippe@linaro.or... (cherry picked from commit 3290996e713315dfc9589e2c340a4c4997d90be8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/runqslower/Makefile | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-)
diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index fb1337d69868..bcc4a7396713 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -1,4 +1,6 @@ # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +include ../../scripts/Makefile.include + OUTPUT := .output CLANG ?= clang LLC ?= llc @@ -21,10 +23,8 @@ VMLINUX_BTF_PATH := $(or $(VMLINUX_BTF),$(firstword \ abs_out := $(abspath $(OUTPUT)) ifeq ($(V),1) Q = -msg = else Q = @ -msg = @printf ' %-8s %s%s\n' "$(1)" "$(notdir $(2))" "$(if $(3), $(3))"; MAKEFLAGS += --no-print-directory submake_extras := feature_display=0 endif @@ -37,12 +37,11 @@ all: runqslower runqslower: $(OUTPUT)/runqslower
clean: - $(call msg,CLEAN) + $(call QUIET_CLEAN, runqslower) $(Q)rm -rf $(OUTPUT) runqslower
$(OUTPUT)/runqslower: $(OUTPUT)/runqslower.o $(BPFOBJ) - $(call msg,BINARY,$@) - $(Q)$(CC) $(CFLAGS) $^ -lelf -lz -o $@ + $(QUIET_LINK)$(CC) $(CFLAGS) $^ -lelf -lz -o $@
$(OUTPUT)/runqslower.o: runqslower.h $(OUTPUT)/runqslower.skel.h \ $(OUTPUT)/runqslower.bpf.o @@ -50,31 +49,26 @@ $(OUTPUT)/runqslower.o: runqslower.h $(OUTPUT)/runqslower.skel.h \ $(OUTPUT)/runqslower.bpf.o: $(OUTPUT)/vmlinux.h runqslower.h
$(OUTPUT)/%.skel.h: $(OUTPUT)/%.bpf.o | $(BPFTOOL) - $(call msg,GEN-SKEL,$@) - $(Q)$(BPFTOOL) gen skeleton $< > $@ + $(QUIET_GEN)$(BPFTOOL) gen skeleton $< > $@
$(OUTPUT)/%.bpf.o: %.bpf.c $(BPFOBJ) | $(OUTPUT) - $(call msg,BPF,$@) - $(Q)$(CLANG) -g -O2 -target bpf $(INCLUDES) \ + $(QUIET_GEN)$(CLANG) -g -O2 -target bpf $(INCLUDES) \ -c $(filter %.c,$^) -o $@ && \ $(LLVM_STRIP) -g $@
$(OUTPUT)/%.o: %.c | $(OUTPUT) - $(call msg,CC,$@) - $(Q)$(CC) $(CFLAGS) $(INCLUDES) -c $(filter %.c,$^) -o $@ + $(QUIET_CC)$(CC) $(CFLAGS) $(INCLUDES) -c $(filter %.c,$^) -o $@
$(OUTPUT): - $(call msg,MKDIR,$@) - $(Q)mkdir -p $(OUTPUT) + $(QUIET_MKDIR)mkdir -p $(OUTPUT)
$(OUTPUT)/vmlinux.h: $(VMLINUX_BTF_PATH) | $(OUTPUT) $(BPFTOOL) - $(call msg,GEN,$@) $(Q)if [ ! -e "$(VMLINUX_BTF_PATH)" ] ; then \ echo "Couldn't find kernel BTF; set VMLINUX_BTF to" \ "specify its location." >&2; \ exit 1;\ fi - $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@ + $(QUIET_GEN)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@
$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(OUTPUT) $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) \
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.11-rc1 commit 85e59344d0790379e063bf3cd3dd0fe91ce3b505 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Enable out-of-tree build for runqslower. Only set OUTPUT=.output if it wasn't already set by the user.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201110164310.2600671-6-jean-philippe@linaro.or... (cherry picked from commit 85e59344d0790379e063bf3cd3dd0fe91ce3b505) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/runqslower/Makefile | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index bcc4a7396713..0fc4d4046193 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -1,15 +1,18 @@ # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) include ../../scripts/Makefile.include
-OUTPUT := .output +OUTPUT ?= $(abspath .output)/ + CLANG ?= clang LLC ?= llc LLVM_STRIP ?= llvm-strip -DEFAULT_BPFTOOL := $(OUTPUT)/sbin/bpftool +BPFTOOL_OUTPUT := $(OUTPUT)bpftool/ +DEFAULT_BPFTOOL := $(BPFTOOL_OUTPUT)bpftool BPFTOOL ?= $(DEFAULT_BPFTOOL) LIBBPF_SRC := $(abspath ../../lib/bpf) -BPFOBJ := $(OUTPUT)/libbpf.a -BPF_INCLUDE := $(OUTPUT) +BPFOBJ_OUTPUT := $(OUTPUT)libbpf/ +BPFOBJ := $(BPFOBJ_OUTPUT)libbpf.a +BPF_INCLUDE := $(BPFOBJ_OUTPUT) INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../lib) \ -I$(abspath ../../include/uapi) CFLAGS := -g -Wall @@ -20,7 +23,6 @@ VMLINUX_BTF_PATHS := /sys/kernel/btf/vmlinux /boot/vmlinux-$(KERNEL_REL) VMLINUX_BTF_PATH := $(or $(VMLINUX_BTF),$(firstword \ $(wildcard $(VMLINUX_BTF_PATHS))))
-abs_out := $(abspath $(OUTPUT)) ifeq ($(V),1) Q = else @@ -38,7 +40,11 @@ runqslower: $(OUTPUT)/runqslower
clean: $(call QUIET_CLEAN, runqslower) - $(Q)rm -rf $(OUTPUT) runqslower + $(Q)$(RM) -r $(BPFOBJ_OUTPUT) $(BPFTOOL_OUTPUT) + $(Q)$(RM) $(OUTPUT)*.o $(OUTPUT)*.d + $(Q)$(RM) $(OUTPUT)*.skel.h $(OUTPUT)vmlinux.h + $(Q)$(RM) $(OUTPUT)runqslower + $(Q)$(RM) -r .output
$(OUTPUT)/runqslower: $(OUTPUT)/runqslower.o $(BPFOBJ) $(QUIET_LINK)$(CC) $(CFLAGS) $^ -lelf -lz -o $@ @@ -59,8 +65,8 @@ $(OUTPUT)/%.bpf.o: %.bpf.c $(BPFOBJ) | $(OUTPUT) $(OUTPUT)/%.o: %.c | $(OUTPUT) $(QUIET_CC)$(CC) $(CFLAGS) $(INCLUDES) -c $(filter %.c,$^) -o $@
-$(OUTPUT): - $(QUIET_MKDIR)mkdir -p $(OUTPUT) +$(OUTPUT) $(BPFOBJ_OUTPUT) $(BPFTOOL_OUTPUT): + $(QUIET_MKDIR)mkdir -p $@
$(OUTPUT)/vmlinux.h: $(VMLINUX_BTF_PATH) | $(OUTPUT) $(BPFTOOL) $(Q)if [ ! -e "$(VMLINUX_BTF_PATH)" ] ; then \ @@ -70,10 +76,8 @@ $(OUTPUT)/vmlinux.h: $(VMLINUX_BTF_PATH) | $(OUTPUT) $(BPFTOOL) fi $(QUIET_GEN)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@
-$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(OUTPUT) - $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) \ - OUTPUT=$(abspath $(dir $@))/ $(abspath $@) +$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(BPFOBJ_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(BPFOBJ_OUTPUT) $@
-$(DEFAULT_BPFTOOL): - $(Q)$(MAKE) $(submake_extras) -C ../bpftool \ - prefix= OUTPUT=$(abs_out)/ DESTDIR=$(abs_out) install +$(DEFAULT_BPFTOOL): | $(BPFTOOL_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT)
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.11-rc1 commit 2d9393fefb506fb8c8a483e5dc9bbd7774657b89 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When cross building runqslower for an other architecture, the intermediate bpftool used to generate a skeleton must be built using the host toolchain. Pass HOSTCC and HOSTLD, defined in Makefile.include, to the bpftool Makefile.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201110164310.2600671-7-jean-philippe@linaro.or... (cherry picked from commit 2d9393fefb506fb8c8a483e5dc9bbd7774657b89) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/runqslower/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index 0fc4d4046193..4d5ca54fcd4c 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -80,4 +80,5 @@ $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(BPFOBJ_OU $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(BPFOBJ_OUTPUT) $@
$(DEFAULT_BPFTOOL): | $(BPFTOOL_OUTPUT) - $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT) \ + CC=$(HOSTCC) LD=$(HOSTLD)
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.11-rc1 commit 0639e5e97ad9c58dd15dcf6f6ccf677cfba39f98 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Commit ba2fd563b740 ("tools/bpftool: Support passing BPFTOOL_VERSION to make") changed BPFTOOL_VERSION to a recursively expanded variable, forcing it to be recomputed on every expansion of CFLAGS and dramatically slowing down the bpftool build. Restore BPFTOOL_VERSION as a simply expanded variable, guarded by an ifeq().
Fixes: ba2fd563b740 ("tools/bpftool: Support passing BPFTOOL_VERSION to make") Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201110164310.2600671-8-jean-philippe@linaro.or... (cherry picked from commit 0639e5e97ad9c58dd15dcf6f6ccf677cfba39f98) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index 802c4a113400..ea6704f027e3 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -29,7 +29,9 @@ LIBBPF = $(LIBBPF_PATH)libbpf.a LIBBPF_BOOTSTRAP_OUTPUT = $(BOOTSTRAP_OUTPUT)libbpf/ LIBBPF_BOOTSTRAP = $(LIBBPF_BOOTSTRAP_OUTPUT)libbpf.a
-BPFTOOL_VERSION ?= $(shell make -rR --no-print-directory -sC ../../.. kernelversion) +ifeq ($(BPFTOOL_VERSION),) +BPFTOOL_VERSION := $(shell make -rR --no-print-directory -sC ../../.. kernelversion) +endif
$(LIBBPF_OUTPUT) $(BOOTSTRAP_OUTPUT) $(LIBBPF_BOOTSTRAP_OUTPUT): $(QUIET_MKDIR)mkdir -p $@
From: Alan Maguire alan.maguire@oracle.com
mainline inclusion from mainline-5.11-rc1 commit de91e631bdc7e6411989e1a9ab65501a31527e0b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types to show the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller.
Use btf__get_nr_types() instead.
Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Signed-off-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/1605437195-2175-1-git-send-email-alan.maguire@or... (cherry picked from commit de91e631bdc7e6411989e1a9ab65501a31527e0b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 900b557ab640..7c1f23909be8 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -675,12 +675,12 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
__s32 btf__find_by_name(const struct btf *btf, const char *type_name) { - __u32 i; + __u32 i, nr_types = btf__get_nr_types(btf);
if (!strcmp(type_name, "void")) return 0;
- for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= nr_types; i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name = btf__name_by_offset(btf, t->name_off);
@@ -694,12 +694,12 @@ __s32 btf__find_by_name(const struct btf *btf, const char *type_name) __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, __u32 kind) { - __u32 i; + __u32 i, nr_types = btf__get_nr_types(btf);
if (kind == BTF_KIND_UNKN || !strcmp(type_name, "void")) return 0;
- for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= nr_types; i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 607c543f939d8ca6fed7afe90b3a8d6f6684dd17 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Given .BTF section is not allocatable, it will get trimmed after module is loaded. BPF system handles that properly by creating an independent copy of data. But prevent any accidental misused by resetting the pointer to BTF data.
Fixes: 36e68442d1af ("bpf: Load and verify kernel module BTFs") Suggested-by: Jessica Yu jeyu@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Jessica Yu jeyu@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/bpf/20201121070829.2612884-2-andrii@kernel.org (cherry picked from commit 607c543f939d8ca6fed7afe90b3a8d6f6684dd17) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/module.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/kernel/module.c b/kernel/module.c index e7d2a53842da..f19bbb9e30dc 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3822,6 +3822,11 @@ static noinline int do_init_module(struct module *mod) mod->init_layout.ro_size = 0; mod->init_layout.ro_after_init_size = 0; mod->init_layout.text_size = 0; +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES + /* .BTF is not SHF_ALLOC and will get removed, so sanitize pointer */ + mod->btf_data = NULL; + mod->btf_data_size = 0; +#endif /* * We want to free module_init, but be aware that kallsyms may be * walking this with preempt disabled. In all the failure paths, we
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 71ccb50074f31a50a1da4c1d8306d54da0907b00 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
For consistency of output, emit "name <anon>" for BTFs without the name. This keeps output more consistent and obvious.
Suggested-by: Song Liu songliubraving@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201202065244.530571-2-andrii@kernel.org (cherry picked from commit 71ccb50074f31a50a1da4c1d8306d54da0907b00) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 3dc13419a2df..aec4e92f8ea7 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -754,6 +754,8 @@ show_btf_plain(struct bpf_btf_info *info, int fd, printf("name [%s] ", name); else if (name && name[0]) printf("name %s ", name); + else + printf("name <anon> "); printf("size %uB", info->btf_size);
n = 0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 0cfdcd6378071f383c900e3d8862347e2af1d1ca category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add ability to get base BTF. It can be also used to check if BTF is split BTF.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201202065244.530571-3-andrii@kernel.org (cherry picked from commit 0cfdcd6378071f383c900e3d8862347e2af1d1ca) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 5 +++++ tools/lib/bpf/btf.h | 1 + tools/lib/bpf/libbpf.map | 1 + 3 files changed, 7 insertions(+)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 7c1f23909be8..cdf2f225f944 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -433,6 +433,11 @@ __u32 btf__get_nr_types(const struct btf *btf) return btf->start_id + btf->nr_types - 1; }
+const struct btf *btf__base_btf(const struct btf *btf) +{ + return btf->base_btf; +} + /* internal helper returning non-const pointer to a type */ static struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id) { diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 12d36d913168..5b8a6ea44b38 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -51,6 +51,7 @@ LIBBPF_API __s32 btf__find_by_name(const struct btf *btf, LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, __u32 kind); LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf); +LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf); LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 id); LIBBPF_API size_t btf__pointer_size(const struct btf *btf); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 0da243abdcf7..d56a8268d349 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -343,6 +343,7 @@ LIBBPF_0.2.0 {
LIBBPF_0.3.0 { global: + btf__base_btf; btf__parse_elf_split; btf__parse_raw_split; btf__parse_split;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit fa4528379a51ff8d5271e1bcfa0d5ef71657869f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In case of working with module's split BTF from /sys/kernel/btf/*, auto-substitute /sys/kernel/btf/vmlinux as the base BTF. This makes using bpftool with module BTFs faster and simpler.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201202065244.530571-4-andrii@kernel.org (cherry picked from commit fa4528379a51ff8d5271e1bcfa0d5ef71657869f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index aec4e92f8ea7..1326fff3629b 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -357,11 +357,13 @@ static int dump_btf_raw(const struct btf *btf, dump_btf_type(btf, root_type_ids[i], t); } } else { + const struct btf *base; int cnt = btf__get_nr_types(btf); int start_id = 1;
- if (base_btf) - start_id = btf__get_nr_types(base_btf) + 1; + base = btf__base_btf(btf); + if (base) + start_id = btf__get_nr_types(base) + 1;
for (i = start_id; i <= cnt; i++) { t = btf__type_by_id(btf, i); @@ -428,7 +430,7 @@ static int dump_btf_c(const struct btf *btf,
static int do_dump(int argc, char **argv) { - struct btf *btf = NULL; + struct btf *btf = NULL, *base = NULL; __u32 root_type_ids[2]; int root_type_cnt = 0; bool dump_c = false; @@ -502,7 +504,21 @@ static int do_dump(int argc, char **argv) } NEXT_ARG(); } else if (is_prefix(src, "file")) { - btf = btf__parse_split(*argv, base_btf); + const char sysfs_prefix[] = "/sys/kernel/btf/"; + const char sysfs_vmlinux[] = "/sys/kernel/btf/vmlinux"; + + if (!base_btf && + strncmp(*argv, sysfs_prefix, sizeof(sysfs_prefix) - 1) == 0 && + strcmp(*argv, sysfs_vmlinux) != 0) { + base = btf__parse(sysfs_vmlinux, NULL); + if (libbpf_get_error(base)) { + p_err("failed to parse vmlinux BTF at '%s': %ld\n", + sysfs_vmlinux, libbpf_get_error(base)); + base = NULL; + } + } + + btf = btf__parse_split(*argv, base ?: base_btf); if (IS_ERR(btf)) { err = -PTR_ERR(btf); btf = NULL; @@ -570,6 +586,7 @@ static int do_dump(int argc, char **argv) done: close(fd); btf__free(btf); + btf__free(base); return err; }
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.11-rc1 commit 22e8ebe35a2e30ee19e02c41cacc99c2f896bc4b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add missing newlines and fix polarity of strerror argument.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Jiri Olsa jolsa@redhat.com Link: https://lore.kernel.org/bpf/20201203102234.648540-1-jackmanb@google.com (cherry picked from commit 22e8ebe35a2e30ee19e02c41cacc99c2f896bc4b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/resolve_btfids/main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c index f32c059fbfb4..87bded641afb 100644 --- a/tools/bpf/resolve_btfids/main.c +++ b/tools/bpf/resolve_btfids/main.c @@ -459,7 +459,7 @@ static int symbols_collect(struct object *obj) return -ENOMEM;
if (id->addr_cnt >= ADDR_CNT) { - pr_err("FAILED symbol %s crossed the number of allowed lists", + pr_err("FAILED symbol %s crossed the number of allowed lists\n", id->name); return -1; } @@ -482,8 +482,8 @@ static int symbols_resolve(struct object *obj) btf = btf__parse(obj->btf ?: obj->path, NULL); err = libbpf_get_error(btf); if (err) { - pr_err("FAILED: load BTF from %s: %s", - obj->path, strerror(err)); + pr_err("FAILED: load BTF from %s: %s\n", + obj->path, strerror(-err)); return -1; }
From: Andrei Matei andreimatei1@gmail.com
mainline inclusion from mainline-5.11-rc1 commit 80b2b5c3a701d56de98d00d99bc9cc384fb316d9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Before this patch, a program with unspecified type (BPF_PROG_TYPE_UNSPEC) would be passed to the BPF syscall, only to have the kernel reject it with an opaque invalid argument error. This patch makes libbpf reject such programs with a nicer error message - in particular libbpf now tries to diagnose bad ELF section names at both open time and load time.
Signed-off-by: Andrei Matei andreimatei1@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201203043410.59699-1-andreimatei1@gmail.com (cherry picked from commit 80b2b5c3a701d56de98d00d99bc9cc384fb316d9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2894d837e9f8..76c65fd7206c 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6687,6 +6687,16 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, char *log_buf = NULL; int btf_fd, ret;
+ if (prog->type == BPF_PROG_TYPE_UNSPEC) { + /* + * The program type must be set. Most likely we couldn't find a proper + * section definition at load time, and thus we didn't infer the type. + */ + pr_warn("prog '%s': missing BPF prog type, check ELF section name '%s'\n", + prog->name, prog->sec_name); + return -EINVAL; + } + if (!insns || !insns_cnt) return -EINVAL;
@@ -6982,9 +6992,12 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
bpf_object__for_each_program(prog, obj) { prog->sec_def = find_sec_def(prog->sec_name); - if (!prog->sec_def) + if (!prog->sec_def) { /* couldn't guess, but user might manually specify */ + pr_debug("prog '%s': unrecognized ELF section name '%s'\n", + prog->name, prog->sec_name); continue; + }
if (prog->sec_def->is_sleepable) prog->prog_flags |= BPF_F_SLEEPABLE;
From: Stanislav Fomichev sdf@google.com
mainline inclusion from mainline-5.11-rc1 commit d6d418bd8f92aaa4c7c26d606188147c2ee0dae9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
I've seen a situation, where a process that's under pprof constantly generates SIGPROF which prevents program loading indefinitely. The right thing to do probably is to disable signals in the upper layers while loading, but it still would be nice to get some error from libbpf instead of an endless loop.
Let's add some small retry limit to the program loading: try loading the program 5 (arbitrary) times and give up.
v2: * 10 -> 5 retires (Andrii Nakryiko)
Signed-off-by: Stanislav Fomichev sdf@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201202231332.3923644-1-sdf@google.com (cherry picked from commit d6d418bd8f92aaa4c7c26d606188147c2ee0dae9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 13e08c0d8b1a..fa6dbd5aec3d 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -67,11 +67,12 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size) { + int retries = 5; int fd;
do { fd = sys_bpf(BPF_PROG_LOAD, attr, size); - } while (fd < 0 && errno == EAGAIN); + } while (fd < 0 && errno == EAGAIN && retries-- > 0);
return fd; }
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.11-rc1 commit 58c185b85d0c1753b0b6a9390294bd883faf4d77 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This object lives inside the trunner output dir, i.e. tools/testing/selftests/bpf/no_alu32/btf_data.o
At some point it gets copied into the parent directory during another part of the build, but that doesn't happen when building test_progs-no_alu32 from clean.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Jiri Olsa jolsa@redhat.com Link: https://lore.kernel.org/bpf/20201203120850.859170-1-jackmanb@google.com (cherry picked from commit 58c185b85d0c1753b0b6a9390294bd883faf4d77) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 080d9796e8e3..cceea641ace0 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -385,7 +385,7 @@ $(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS) \ | $(TRUNNER_BINARY)-extras $$(call msg,BINARY,,$$@) $(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@ - $(Q)$(RESOLVE_BTFIDS) --no-fail --btf btf_data.o $$@ + $(Q)$(RESOLVE_BTFIDS) --no-fail --btf $(TRUNNER_OUTPUT)/btf_data.o $$@
endef
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 2fe8890848c799515a881502339a0a7b2b555988 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Having real btf_data_size stored in struct module is benefitial to quickly determine which kernel modules have associated BTF object and which don't. There is no harm in keeping this info, as opposed to keeping invalid pointer.
Fixes: 607c543f939d ("bpf: Sanitize BTF data pointer after module is loaded") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-3-andrii@kernel.org (cherry picked from commit 2fe8890848c799515a881502339a0a7b2b555988) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/module.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/kernel/module.c b/kernel/module.c index f19bbb9e30dc..00d46cc4bfc7 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3825,7 +3825,6 @@ static noinline int do_init_module(struct module *mod) #ifdef CONFIG_DEBUG_INFO_BTF_MODULES /* .BTF is not SHF_ALLOC and will get removed, so sanitize pointer */ mod->btf_data = NULL; - mod->btf_data_size = 0; #endif /* * We want to free module_init, but be aware that kallsyms may be
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit a19f93cfafdf85851c69bc9f677aa4f40c53610f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a btf_get_from_fd() helper, which constructs struct btf from in-kernel BTF data by FD. This is used for loading module BTFs.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-4-andrii@kernel.org (cherry picked from commit a19f93cfafdf85851c69bc9f677aa4f40c53610f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 61 +++++++++++++++++++-------------- tools/lib/bpf/libbpf_internal.h | 1 + 2 files changed, 36 insertions(+), 26 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index cdf2f225f944..6225c403685a 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1324,35 +1324,27 @@ const char *btf__name_by_offset(const struct btf *btf, __u32 offset) return btf__str_by_offset(btf, offset); }
-int btf__get_from_id(__u32 id, struct btf **btf) +struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) { - struct bpf_btf_info btf_info = { 0 }; + struct bpf_btf_info btf_info; __u32 len = sizeof(btf_info); __u32 last_size; - int btf_fd; + struct btf *btf; void *ptr; int err;
- err = 0; - *btf = NULL; - btf_fd = bpf_btf_get_fd_by_id(id); - if (btf_fd < 0) - return 0; - /* we won't know btf_size until we call bpf_obj_get_info_by_fd(). so * let's start with a sane default - 4KiB here - and resize it only if * bpf_obj_get_info_by_fd() needs a bigger buffer. */ - btf_info.btf_size = 4096; - last_size = btf_info.btf_size; + last_size = 4096; ptr = malloc(last_size); - if (!ptr) { - err = -ENOMEM; - goto exit_free; - } + if (!ptr) + return ERR_PTR(-ENOMEM);
- memset(ptr, 0, last_size); + memset(&btf_info, 0, sizeof(btf_info)); btf_info.btf = ptr_to_u64(ptr); + btf_info.btf_size = last_size; err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len);
if (!err && btf_info.btf_size > last_size) { @@ -1361,31 +1353,48 @@ int btf__get_from_id(__u32 id, struct btf **btf) last_size = btf_info.btf_size; temp_ptr = realloc(ptr, last_size); if (!temp_ptr) { - err = -ENOMEM; + btf = ERR_PTR(-ENOMEM); goto exit_free; } ptr = temp_ptr; - memset(ptr, 0, last_size); + + len = sizeof(btf_info); + memset(&btf_info, 0, sizeof(btf_info)); btf_info.btf = ptr_to_u64(ptr); + btf_info.btf_size = last_size; + err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len); }
if (err || btf_info.btf_size > last_size) { - err = errno; + btf = err ? ERR_PTR(-errno) : ERR_PTR(-E2BIG); goto exit_free; }
- *btf = btf__new((__u8 *)(long)btf_info.btf, btf_info.btf_size); - if (IS_ERR(*btf)) { - err = PTR_ERR(*btf); - *btf = NULL; - } + btf = btf_new(ptr, btf_info.btf_size, base_btf);
exit_free: - close(btf_fd); free(ptr); + return btf; +}
- return err; +int btf__get_from_id(__u32 id, struct btf **btf) +{ + struct btf *res; + int btf_fd; + + *btf = NULL; + btf_fd = bpf_btf_get_fd_by_id(id); + if (btf_fd < 0) + return -errno; + + res = btf_get_from_fd(btf_fd, NULL); + close(btf_fd); + if (IS_ERR(res)) + return PTR_ERR(res); + + *btf = res; + return 0; }
int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index d99bc847bf84..e569ae63808e 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -155,6 +155,7 @@ int bpf_object__section_size(const struct bpf_object *obj, const char *name, __u32 *size); int bpf_object__variable_offset(const struct bpf_object *obj, const char *name, __u32 *off); +struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
struct btf_ext_info { /*
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 0f7515ca7cddadabe04e28e20a257b1bbb6cb98a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Refactor CO-RE relocation candidate search to not expect a single BTF, rather return all candidate types with their corresponding BTF objects. This will allow to extend CO-RE relocations to accommodate kernel module BTFs.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20201203204634.1325171-5-andrii@kernel.org (cherry picked from commit 0f7515ca7cddadabe04e28e20a257b1bbb6cb98a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 187 ++++++++++++++++++++++++----------------- 1 file changed, 111 insertions(+), 76 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 76c65fd7206c..07700ba811ff 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -462,11 +462,14 @@ struct bpf_object { struct list_head list;
struct btf *btf; + struct btf_ext *btf_ext; + /* Parse and load BTF vmlinux if any of the programs in the object need * it at load time. */ struct btf *btf_vmlinux; - struct btf_ext *btf_ext; + /* vmlinux BTF override for CO-RE relocations */ + struct btf *btf_vmlinux_override;
void *priv; bpf_object_clear_priv_t clear_priv; @@ -4658,46 +4661,43 @@ static size_t bpf_core_essential_name_len(const char *name) return n; }
-/* dynamically sized list of type IDs */ -struct ids_vec { - __u32 *data; +struct core_cand +{ + const struct btf *btf; + const struct btf_type *t; + const char *name; + __u32 id; +}; + +/* dynamically sized list of type IDs and its associated struct btf */ +struct core_cand_list { + struct core_cand *cands; int len; };
-static void bpf_core_free_cands(struct ids_vec *cand_ids) +static void bpf_core_free_cands(struct core_cand_list *cands) { - free(cand_ids->data); - free(cand_ids); + free(cands->cands); + free(cands); }
-static struct ids_vec *bpf_core_find_cands(const struct btf *local_btf, - __u32 local_type_id, - const struct btf *targ_btf) +static int bpf_core_add_cands(struct core_cand *local_cand, + size_t local_essent_len, + const struct btf *targ_btf, + const char *targ_btf_name, + int targ_start_id, + struct core_cand_list *cands) { - size_t local_essent_len, targ_essent_len; - const char *local_name, *targ_name; - const struct btf_type *t, *local_t; - struct ids_vec *cand_ids; - __u32 *new_ids; - int i, err, n; - - local_t = btf__type_by_id(local_btf, local_type_id); - if (!local_t) - return ERR_PTR(-EINVAL); - - local_name = btf__name_by_offset(local_btf, local_t->name_off); - if (str_is_empty(local_name)) - return ERR_PTR(-EINVAL); - local_essent_len = bpf_core_essential_name_len(local_name); - - cand_ids = calloc(1, sizeof(*cand_ids)); - if (!cand_ids) - return ERR_PTR(-ENOMEM); + struct core_cand *new_cands, *cand; + const struct btf_type *t; + const char *targ_name; + size_t targ_essent_len; + int n, i;
n = btf__get_nr_types(targ_btf); - for (i = 1; i <= n; i++) { + for (i = targ_start_id; i <= n; i++) { t = btf__type_by_id(targ_btf, i); - if (btf_kind(t) != btf_kind(local_t)) + if (btf_kind(t) != btf_kind(local_cand->t)) continue;
targ_name = btf__name_by_offset(targ_btf, t->name_off); @@ -4708,25 +4708,62 @@ static struct ids_vec *bpf_core_find_cands(const struct btf *local_btf, if (targ_essent_len != local_essent_len) continue;
- if (strncmp(local_name, targ_name, local_essent_len) == 0) { - pr_debug("CO-RE relocating [%d] %s %s: found target candidate [%d] %s %s\n", - local_type_id, btf_kind_str(local_t), - local_name, i, btf_kind_str(t), targ_name); - new_ids = libbpf_reallocarray(cand_ids->data, - cand_ids->len + 1, - sizeof(*cand_ids->data)); - if (!new_ids) { - err = -ENOMEM; - goto err_out; - } - cand_ids->data = new_ids; - cand_ids->data[cand_ids->len++] = i; - } + if (strncmp(local_cand->name, targ_name, local_essent_len) != 0) + continue; + + pr_debug("CO-RE relocating [%d] %s %s: found target candidate [%d] %s %s in [%s]\n", + local_cand->id, btf_kind_str(local_cand->t), + local_cand->name, i, btf_kind_str(t), targ_name, + targ_btf_name); + new_cands = libbpf_reallocarray(cands->cands, cands->len + 1, + sizeof(*cands->cands)); + if (!new_cands) + return -ENOMEM; + + cand = &new_cands[cands->len]; + cand->btf = targ_btf; + cand->t = t; + cand->name = targ_name; + cand->id = i; + + cands->cands = new_cands; + cands->len++; } - return cand_ids; -err_out: - bpf_core_free_cands(cand_ids); - return ERR_PTR(err); + return 0; +} + +static struct core_cand_list * +bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 local_type_id) +{ + struct core_cand local_cand = {}; + struct core_cand_list *cands; + size_t local_essent_len; + int err; + + local_cand.btf = local_btf; + local_cand.t = btf__type_by_id(local_btf, local_type_id); + if (!local_cand.t) + return ERR_PTR(-EINVAL); + + local_cand.name = btf__name_by_offset(local_btf, local_cand.t->name_off); + if (str_is_empty(local_cand.name)) + return ERR_PTR(-EINVAL); + local_essent_len = bpf_core_essential_name_len(local_cand.name); + + cands = calloc(1, sizeof(*cands)); + if (!cands) + return ERR_PTR(-ENOMEM); + + /* Attempt to find target candidates in vmlinux BTF first */ + err = bpf_core_add_cands(&local_cand, local_essent_len, + obj->btf_vmlinux_override ?: obj->btf_vmlinux, + "vmlinux", 1, cands); + if (err) { + bpf_core_free_cands(cands); + return ERR_PTR(err); + } + + return cands; }
/* Check two types for compatibility for the purpose of field access @@ -5719,7 +5756,6 @@ static int bpf_core_apply_relo(struct bpf_program *prog, const struct bpf_core_relo *relo, int relo_idx, const struct btf *local_btf, - const struct btf *targ_btf, struct hashmap *cand_cache) { struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; @@ -5727,8 +5763,8 @@ static int bpf_core_apply_relo(struct bpf_program *prog, struct bpf_core_relo_res cand_res, targ_res; const struct btf_type *local_type; const char *local_name; - struct ids_vec *cand_ids; - __u32 local_id, cand_id; + struct core_cand_list *cands = NULL; + __u32 local_id; const char *spec_str; int i, j, err;
@@ -5775,24 +5811,24 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return -EOPNOTSUPP; }
- if (!hashmap__find(cand_cache, type_key, (void **)&cand_ids)) { - cand_ids = bpf_core_find_cands(local_btf, local_id, targ_btf); - if (IS_ERR(cand_ids)) { + if (!hashmap__find(cand_cache, type_key, (void **)&cands)) { + cands = bpf_core_find_cands(prog->obj, local_btf, local_id); + if (IS_ERR(cands)) { pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld", prog->name, relo_idx, local_id, btf_kind_str(local_type), - local_name, PTR_ERR(cand_ids)); - return PTR_ERR(cand_ids); + local_name, PTR_ERR(cands)); + return PTR_ERR(cands); } - err = hashmap__set(cand_cache, type_key, cand_ids, NULL, NULL); + err = hashmap__set(cand_cache, type_key, cands, NULL, NULL); if (err) { - bpf_core_free_cands(cand_ids); + bpf_core_free_cands(cands); return err; } }
- for (i = 0, j = 0; i < cand_ids->len; i++) { - cand_id = cand_ids->data[i]; - err = bpf_core_spec_match(&local_spec, targ_btf, cand_id, &cand_spec); + for (i = 0, j = 0; i < cands->len; i++) { + err = bpf_core_spec_match(&local_spec, cands->cands[i].btf, + cands->cands[i].id, &cand_spec); if (err < 0) { pr_warn("prog '%s': relo #%d: error matching candidate #%d ", prog->name, relo_idx, i); @@ -5836,7 +5872,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return -EINVAL; }
- cand_ids->data[j++] = cand_spec.root_type_id; + cands->cands[j++] = cands->cands[i]; }
/* @@ -5848,7 +5884,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, * depending on relo's kind. */ if (j > 0) - cand_ids->len = j; + cands->len = j;
/* * If no candidates were found, it might be both a programmer error, @@ -5892,20 +5928,19 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) struct hashmap_entry *entry; struct hashmap *cand_cache = NULL; struct bpf_program *prog; - struct btf *targ_btf; const char *sec_name; int i, err = 0, insn_idx, sec_idx;
if (obj->btf_ext->core_relo_info.len == 0) return 0;
- if (targ_btf_path) - targ_btf = btf__parse(targ_btf_path, NULL); - else - targ_btf = obj->btf_vmlinux; - if (IS_ERR_OR_NULL(targ_btf)) { - pr_warn("failed to get target BTF: %ld\n", PTR_ERR(targ_btf)); - return PTR_ERR(targ_btf); + if (targ_btf_path) { + obj->btf_vmlinux_override = btf__parse(targ_btf_path, NULL); + if (IS_ERR_OR_NULL(obj->btf_vmlinux_override)) { + err = PTR_ERR(obj->btf_vmlinux_override); + pr_warn("failed to parse target BTF: %d\n", err); + return err; + } }
cand_cache = hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL); @@ -5957,8 +5992,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) if (!prog->load) continue;
- err = bpf_core_apply_relo(prog, rec, i, obj->btf, - targ_btf, cand_cache); + err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache); if (err) { pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", prog->name, i, err); @@ -5969,8 +6003,9 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
out: /* obj->btf_vmlinux is freed at the end of object load phase */ - if (targ_btf != obj->btf_vmlinux) - btf__free(targ_btf); + btf__free(obj->btf_vmlinux_override); + obj->btf_vmlinux_override = NULL; + if (!IS_ERR_OR_NULL(cand_cache)) { hashmap__for_each_entry(cand_cache, entry, i) { bpf_core_free_cands(entry->value);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 4f33a53d56000cfa67e2e4e8a5dac08f084a979b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Teach libbpf to search for candidate types for CO-RE relocations across kernel modules BTFs, in addition to vmlinux BTF. If at least one candidate type is found in vmlinux BTF, kernel module BTFs are not iterated. If vmlinux BTF has no matching candidates, then find all kernel module BTFs and search for all matching candidates across all of them.
Kernel's support for module BTFs are inferred from the support for BTF name pointer in BPF UAPI.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-6-andrii@kernel.org (cherry picked from commit 4f33a53d56000cfa67e2e4e8a5dac08f084a979b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 179 ++++++++++++++++++++++++++++++++++++++--- 1 file changed, 169 insertions(+), 10 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 07700ba811ff..f64144c48317 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -176,6 +176,8 @@ enum kern_feature_id { FEAT_PROBE_READ_KERN, /* BPF_PROG_BIND_MAP is supported */ FEAT_PROG_BIND_MAP, + /* Kernel support for module BTFs */ + FEAT_MODULE_BTF, __FEAT_CNT, };
@@ -402,6 +404,12 @@ struct extern_desc {
static LIST_HEAD(bpf_objects_list);
+struct module_btf { + struct btf *btf; + char *name; + __u32 id; +}; + struct bpf_object { char name[BPF_OBJ_NAME_LEN]; char license[64]; @@ -470,6 +478,11 @@ struct bpf_object { struct btf *btf_vmlinux; /* vmlinux BTF override for CO-RE relocations */ struct btf *btf_vmlinux_override; + /* Lazily initialized kernel module BTFs */ + struct module_btf *btf_modules; + bool btf_modules_loaded; + size_t btf_module_cnt; + size_t btf_module_cap;
void *priv; bpf_object_clear_priv_t clear_priv; @@ -4003,6 +4016,35 @@ static int probe_prog_bind_map(void) return ret >= 0; }
+static int probe_module_btf(void) +{ + static const char strs[] = "\0int"; + __u32 types[] = { + /* int */ + BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), + }; + struct bpf_btf_info info; + __u32 len = sizeof(info); + char name[16]; + int fd, err; + + fd = libbpf__load_raw_btf((char *)types, sizeof(types), strs, sizeof(strs)); + if (fd < 0) + return 0; /* BTF not supported at all */ + + memset(&info, 0, sizeof(info)); + info.name = ptr_to_u64(name); + info.name_len = sizeof(name); + + /* check that BPF_OBJ_GET_INFO_BY_FD supports specifying name pointer; + * kernel's module BTF support coincides with support for + * name/name_len fields in struct bpf_btf_info. + */ + err = bpf_obj_get_info_by_fd(fd, &info, &len); + close(fd); + return !err; +} + enum kern_feature_result { FEAT_UNKNOWN = 0, FEAT_SUPPORTED = 1, @@ -4046,7 +4088,10 @@ static struct kern_feature_desc { }, [FEAT_PROG_BIND_MAP] = { "BPF_PROG_BIND_MAP support", probe_prog_bind_map, - } + }, + [FEAT_MODULE_BTF] = { + "module BTF support", probe_module_btf, + }, };
static bool kernel_supports(enum kern_feature_id feat_id) @@ -4732,13 +4777,96 @@ static int bpf_core_add_cands(struct core_cand *local_cand, return 0; }
+static int load_module_btfs(struct bpf_object *obj) +{ + struct bpf_btf_info info; + struct module_btf *mod_btf; + struct btf *btf; + char name[64]; + __u32 id = 0, len; + int err, fd; + + if (obj->btf_modules_loaded) + return 0; + + /* don't do this again, even if we find no module BTFs */ + obj->btf_modules_loaded = true; + + /* kernel too old to support module BTFs */ + if (!kernel_supports(FEAT_MODULE_BTF)) + return 0; + + while (true) { + err = bpf_btf_get_next_id(id, &id); + if (err && errno == ENOENT) + return 0; + if (err) { + err = -errno; + pr_warn("failed to iterate BTF objects: %d\n", err); + return err; + } + + fd = bpf_btf_get_fd_by_id(id); + if (fd < 0) { + if (errno == ENOENT) + continue; /* expected race: BTF was unloaded */ + err = -errno; + pr_warn("failed to get BTF object #%d FD: %d\n", id, err); + return err; + } + + len = sizeof(info); + memset(&info, 0, sizeof(info)); + info.name = ptr_to_u64(name); + info.name_len = sizeof(name); + + err = bpf_obj_get_info_by_fd(fd, &info, &len); + if (err) { + err = -errno; + pr_warn("failed to get BTF object #%d info: %d\n", id, err); + close(fd); + return err; + } + + /* ignore non-module BTFs */ + if (!info.kernel_btf || strcmp(name, "vmlinux") == 0) { + close(fd); + continue; + } + + btf = btf_get_from_fd(fd, obj->btf_vmlinux); + close(fd); + if (IS_ERR(btf)) { + pr_warn("failed to load module [%s]'s BTF object #%d: %ld\n", + name, id, PTR_ERR(btf)); + return PTR_ERR(btf); + } + + err = btf_ensure_mem((void **)&obj->btf_modules, &obj->btf_module_cap, + sizeof(*obj->btf_modules), obj->btf_module_cnt + 1); + if (err) + return err; + + mod_btf = &obj->btf_modules[obj->btf_module_cnt++]; + + mod_btf->btf = btf; + mod_btf->id = id; + mod_btf->name = strdup(name); + if (!mod_btf->name) + return -ENOMEM; + } + + return 0; +} + static struct core_cand_list * bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 local_type_id) { struct core_cand local_cand = {}; struct core_cand_list *cands; + const struct btf *main_btf; size_t local_essent_len; - int err; + int err, i;
local_cand.btf = local_btf; local_cand.t = btf__type_by_id(local_btf, local_type_id); @@ -4755,15 +4883,38 @@ bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 l return ERR_PTR(-ENOMEM);
/* Attempt to find target candidates in vmlinux BTF first */ - err = bpf_core_add_cands(&local_cand, local_essent_len, - obj->btf_vmlinux_override ?: obj->btf_vmlinux, - "vmlinux", 1, cands); - if (err) { - bpf_core_free_cands(cands); - return ERR_PTR(err); + main_btf = obj->btf_vmlinux_override ?: obj->btf_vmlinux; + err = bpf_core_add_cands(&local_cand, local_essent_len, main_btf, "vmlinux", 1, cands); + if (err) + goto err_out; + + /* if vmlinux BTF has any candidate, don't got for module BTFs */ + if (cands->len) + return cands; + + /* if vmlinux BTF was overridden, don't attempt to load module BTFs */ + if (obj->btf_vmlinux_override) + return cands; + + /* now look through module BTFs, trying to still find candidates */ + err = load_module_btfs(obj); + if (err) + goto err_out; + + for (i = 0; i < obj->btf_module_cnt; i++) { + err = bpf_core_add_cands(&local_cand, local_essent_len, + obj->btf_modules[i].btf, + obj->btf_modules[i].name, + btf__get_nr_types(obj->btf_vmlinux) + 1, + cands); + if (err) + goto err_out; }
return cands; +err_out: + bpf_core_free_cands(cands); + return ERR_PTR(err); }
/* Check two types for compatibility for the purpose of field access @@ -5814,7 +5965,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, if (!hashmap__find(cand_cache, type_key, (void **)&cands)) { cands = bpf_core_find_cands(prog->obj, local_btf, local_id); if (IS_ERR(cands)) { - pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld", + pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n", prog->name, relo_idx, local_id, btf_kind_str(local_type), local_name, PTR_ERR(cands)); return PTR_ERR(cands); @@ -6002,7 +6153,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) }
out: - /* obj->btf_vmlinux is freed at the end of object load phase */ + /* obj->btf_vmlinux and module BTFs are freed after object load */ btf__free(obj->btf_vmlinux_override); obj->btf_vmlinux_override = NULL;
@@ -7378,6 +7529,14 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) err = err ? : bpf_object__relocate(obj, attr->target_btf_path); err = err ? : bpf_object__load_progs(obj, attr->log_level);
+ /* clean up module BTFs */ + for (i = 0; i < obj->btf_module_cnt; i++) { + btf__free(obj->btf_modules[i].btf); + free(obj->btf_modules[i].name); + } + free(obj->btf_modules); + + /* clean up vmlinux BTF */ btf__free(obj->btf_vmlinux); obj->btf_vmlinux = NULL;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 9f7fa225894c7fcb014f3699a402fcc4d896cb1c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add bpf_testmod module, which is conceptually out-of-tree module and provides ways for selftests/bpf to test various kernel module-related functionality: raw tracepoint, fentry/fexit/fmod_ret, etc. This module will be auto-loaded by test_progs test runner and expected by some of selftests to be present and loaded.
Pahole currently isn't able to generate BTF for static functions in kernel modules, so make sure traced function is global.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20201203204634.1325171-7-andrii@kernel.org (cherry picked from commit 9f7fa225894c7fcb014f3699a402fcc4d896cb1c) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/.gitignore | 1 + tools/testing/selftests/bpf/Makefile | 12 +++- .../selftests/bpf/bpf_testmod/.gitignore | 6 ++ .../selftests/bpf/bpf_testmod/Makefile | 20 +++++++ .../bpf/bpf_testmod/bpf_testmod-events.h | 36 +++++++++++ .../selftests/bpf/bpf_testmod/bpf_testmod.c | 52 ++++++++++++++++ .../selftests/bpf/bpf_testmod/bpf_testmod.h | 14 +++++ tools/testing/selftests/bpf/test_progs.c | 59 +++++++++++++++++++ tools/testing/selftests/bpf/test_progs.h | 1 + 9 files changed, 198 insertions(+), 3 deletions(-) create mode 100644 tools/testing/selftests/bpf/bpf_testmod/.gitignore create mode 100644 tools/testing/selftests/bpf/bpf_testmod/Makefile create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index b1b37dcade9f..d593ceac40ed 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -37,3 +37,4 @@ test_cpp /tools /runqslower /bench +*.ko diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index cceea641ace0..729b4255343a 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -81,7 +81,7 @@ TEST_PROGS_EXTENDED := with_addr.sh \ # Compile but not part of 'make run_tests' TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ - test_lirc_mode2_user xdping test_cpp runqslower bench + test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko
TEST_CUSTOM_PROGS = urandom_read
@@ -105,6 +105,7 @@ OVERRIDE_TARGETS := 1 override define CLEAN $(call msg,CLEAN) $(Q)$(RM) -r $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) $(EXTRA_CLEAN) + $(Q)$(MAKE) -C bpf_testmod clean endef
include ../lib.mk @@ -137,6 +138,11 @@ $(OUTPUT)/urandom_read: urandom_read.c $(call msg,BINARY,,$@) $(Q)$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS) -Wl,--build-id=sha1
+$(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch]) + $(call msg,MOD,,$@) + $(Q)$(MAKE) $(submake_extras) -C bpf_testmod + $(Q)cp bpf_testmod/bpf_testmod.ko $@ + $(OUTPUT)/test_stub.o: test_stub.c $(BPFOBJ) $(call msg,CC,,$@) $(Q)$(CC) -c $(CFLAGS) -o $@ $< @@ -395,7 +401,7 @@ TRUNNER_BPF_PROGS_DIR := progs TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ network_helpers.c testing_helpers.c \ btf_helpers.c flow_dissector_load.h -TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read \ +TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(wildcard progs/btf_dump_test_case_*.c) TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS) @@ -466,4 +472,4 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \ EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) \ prog_tests/tests.h map_tests/tests.h verifier/tests.h \ feature \ - $(addprefix $(OUTPUT)/,*.o *.skel.h no_alu32 bpf_gcc) + $(addprefix $(OUTPUT)/,*.o *.skel.h no_alu32 bpf_gcc bpf_testmod.ko) diff --git a/tools/testing/selftests/bpf/bpf_testmod/.gitignore b/tools/testing/selftests/bpf/bpf_testmod/.gitignore new file mode 100644 index 000000000000..ded513777281 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_testmod/.gitignore @@ -0,0 +1,6 @@ +*.mod +*.mod.c +*.o +.ko +/Module.symvers +/modules.order diff --git a/tools/testing/selftests/bpf/bpf_testmod/Makefile b/tools/testing/selftests/bpf/bpf_testmod/Makefile new file mode 100644 index 000000000000..15cb36c4483a --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile @@ -0,0 +1,20 @@ +BPF_TESTMOD_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST))))) +KDIR ?= $(abspath $(BPF_TESTMOD_DIR)/../../../../..) + +ifeq ($(V),1) +Q = +else +Q = @ +endif + +MODULES = bpf_testmod.ko + +obj-m += bpf_testmod.o +CFLAGS_bpf_testmod.o = -I$(src) + +all: + +$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) modules + +clean: + +$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) clean + diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h new file mode 100644 index 000000000000..b83ea448bc79 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2020 Facebook */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_testmod + +#if !defined(_BPF_TESTMOD_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ) +#define _BPF_TESTMOD_EVENTS_H + +#include <linux/tracepoint.h> +#include "bpf_testmod.h" + +TRACE_EVENT(bpf_testmod_test_read, + TP_PROTO(struct task_struct *task, struct bpf_testmod_test_read_ctx *ctx), + TP_ARGS(task, ctx), + TP_STRUCT__entry( + __field(pid_t, pid) + __array(char, comm, TASK_COMM_LEN) + __field(loff_t, off) + __field(size_t, len) + ), + TP_fast_assign( + __entry->pid = task->pid; + memcpy(__entry->comm, task->comm, TASK_COMM_LEN); + __entry->off = ctx->off; + __entry->len = ctx->len; + ), + TP_printk("pid=%d comm=%s off=%llu len=%zu", + __entry->pid, __entry->comm, __entry->off, __entry->len) +); + +#endif /* _BPF_TESTMOD_EVENTS_H */ + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH . +#define TRACE_INCLUDE_FILE bpf_testmod-events +#include <trace/define_trace.h> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c new file mode 100644 index 000000000000..2df19d73ca49 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ +#include <linux/error-injection.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/sysfs.h> +#include <linux/tracepoint.h> +#include "bpf_testmod.h" + +#define CREATE_TRACE_POINTS +#include "bpf_testmod-events.h" + +noinline ssize_t +bpf_testmod_test_read(struct file *file, struct kobject *kobj, + struct bin_attribute *bin_attr, + char *buf, loff_t off, size_t len) +{ + struct bpf_testmod_test_read_ctx ctx = { + .buf = buf, + .off = off, + .len = len, + }; + + trace_bpf_testmod_test_read(current, &ctx); + + return -EIO; /* always fail */ +} +EXPORT_SYMBOL(bpf_testmod_test_read); +ALLOW_ERROR_INJECTION(bpf_testmod_test_read, ERRNO); + +static struct bin_attribute bin_attr_bpf_testmod_file __ro_after_init = { + .attr = { .name = "bpf_testmod", .mode = 0444, }, + .read = bpf_testmod_test_read, +}; + +static int bpf_testmod_init(void) +{ + return sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file); +} + +static void bpf_testmod_exit(void) +{ + return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file); +} + +module_init(bpf_testmod_init); +module_exit(bpf_testmod_exit); + +MODULE_AUTHOR("Andrii Nakryiko"); +MODULE_DESCRIPTION("BPF selftests module"); +MODULE_LICENSE("Dual BSD/GPL"); + diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h new file mode 100644 index 000000000000..b81adfedb4f6 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2020 Facebook */ +#ifndef _BPF_TESTMOD_H +#define _BPF_TESTMOD_H + +#include <linux/types.h> + +struct bpf_testmod_test_read_ctx { + char *buf; + loff_t off; + size_t len; +}; + +#endif /* _BPF_TESTMOD_H */ diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 4a13477aef9d..8c714cc16c76 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -360,6 +360,58 @@ int extract_build_id(char *build_id, size_t size) return -1; }
+static int finit_module(int fd, const char *param_values, int flags) +{ + return syscall(__NR_finit_module, fd, param_values, flags); +} + +static int delete_module(const char *name, int flags) +{ + return syscall(__NR_delete_module, name, flags); +} + +static void unload_bpf_testmod(void) +{ + if (delete_module("bpf_testmod", 0)) { + if (errno == ENOENT) { + if (env.verbosity > VERBOSE_NONE) + fprintf(stdout, "bpf_testmod.ko is already unloaded.\n"); + return; + } + fprintf(env.stderr, "Failed to unload bpf_testmod.ko from kernel: %d\n", -errno); + exit(1); + } + if (env.verbosity > VERBOSE_NONE) + fprintf(stdout, "Successfully unloaded bpf_testmod.ko.\n"); +} + +static int load_bpf_testmod(void) +{ + int fd; + + /* ensure previous instance of the module is unloaded */ + unload_bpf_testmod(); + + if (env.verbosity > VERBOSE_NONE) + fprintf(stdout, "Loading bpf_testmod.ko...\n"); + + fd = open("bpf_testmod.ko", O_RDONLY); + if (fd < 0) { + fprintf(env.stderr, "Can't find bpf_testmod.ko kernel module: %d\n", -errno); + return -ENOENT; + } + if (finit_module(fd, "", 0)) { + fprintf(env.stderr, "Failed to load bpf_testmod.ko into the kernel: %d\n", -errno); + close(fd); + return -EINVAL; + } + close(fd); + + if (env.verbosity > VERBOSE_NONE) + fprintf(stdout, "Successfully loaded bpf_testmod.ko.\n"); + return 0; +} + /* extern declarations for test funcs */ #define DEFINE_TEST(name) extern void test_##name(void); #include <prog_tests/tests.h> @@ -678,6 +730,11 @@ int main(int argc, char **argv)
save_netns(); stdio_hijack(); + env.has_testmod = true; + if (load_bpf_testmod()) { + fprintf(env.stderr, "WARNING! Selftests relying on bpf_testmod.ko will be skipped.\n"); + env.has_testmod = false; + } for (i = 0; i < prog_test_cnt; i++) { struct prog_test_def *test = &prog_test_defs[i];
@@ -722,6 +779,8 @@ int main(int argc, char **argv) if (test->need_cgroup_cleanup) cleanup_cgroup_environment(); } + if (env.has_testmod) + unload_bpf_testmod(); stdio_restore();
if (env.get_test_cnt) { diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index d6b14853f3bc..115953243f62 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -66,6 +66,7 @@ struct test_env { enum verbosity verbosity;
bool jit_enabled; + bool has_testmod; bool get_test_cnt; bool list_test_names;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 5ed31472b9ad6373a0a24bc21186b5eac999213d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Previously skipped sub-tests would be counted as passing with ":OK" appened in the log. Change that to be accounted as ":SKIP".
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-8-andrii@kernel.org (cherry picked from commit 5ed31472b9ad6373a0a24bc21186b5eac999213d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_progs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 8c714cc16c76..41221f78e074 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -149,15 +149,15 @@ void test__end_subtest()
if (sub_error_cnt) env.fail_cnt++; - else + else if (test->skip_cnt == 0) env.sub_succ_cnt++; skip_account();
dump_test_log(test, sub_error_cnt);
fprintf(env.stdout, "#%d/%d %s:%s\n", - test->test_num, test->subtest_num, - test->subtest_name, sub_error_cnt ? "FAIL" : "OK"); + test->test_num, test->subtest_num, test->subtest_name, + sub_error_cnt ? "FAIL" : (test->skip_cnt ? "SKIP" : "OK"));
free(test->subtest_name); test->subtest_name = NULL;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 6bcd39d366b64318562785d5b47c2837e3a53ae5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a self-tests validating libbpf is able to perform CO-RE relocations against the type defined in kernel module BTF. if bpf_testmod.o is not supported by the kernel (e.g., due to version mismatch), skip tests, instead of failing.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-9-andrii@kernel.org (cherry picked from commit 6bcd39d366b64318562785d5b47c2837e3a53ae5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/core_reloc.c | 79 ++++++++++++++++--- .../selftests/bpf/progs/core_reloc_types.h | 17 ++++ .../bpf/progs/test_core_reloc_module.c | 66 ++++++++++++++++ 3 files changed, 151 insertions(+), 11 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/test_core_reloc_module.c
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index 5b52985cb60e..e2cfbb4b4113 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include <test_progs.h> #include "progs/core_reloc_types.h" +#include "bpf_testmod/bpf_testmod.h" #include <sys/mman.h> #include <sys/syscall.h> #include <bpf/btf.h> @@ -9,6 +10,30 @@ static int duration = 0;
#define STRUCT_TO_CHAR_PTR(struct_name) (const char *)&(struct struct_name)
+#define MODULES_CASE(name, sec_name, tp_name) { \ + .case_name = name, \ + .bpf_obj_file = "test_core_reloc_module.o", \ + .btf_src_file = NULL, /* find in kernel module BTFs */ \ + .input = "", \ + .input_len = 0, \ + .output = STRUCT_TO_CHAR_PTR(core_reloc_module_output) { \ + .read_ctx_sz = sizeof(struct bpf_testmod_test_read_ctx),\ + .read_ctx_exists = true, \ + .buf_exists = true, \ + .len_exists = true, \ + .off_exists = true, \ + .len = 123, \ + .off = 0, \ + .comm = "test_progs", \ + .comm_len = sizeof("test_progs"), \ + }, \ + .output_len = sizeof(struct core_reloc_module_output), \ + .prog_sec_name = sec_name, \ + .raw_tp_name = tp_name, \ + .trigger = trigger_module_test_read, \ + .needs_testmod = true, \ +} + #define FLAVORS_DATA(struct_name) STRUCT_TO_CHAR_PTR(struct_name) { \ .a = 42, \ .b = 0xc001, \ @@ -206,7 +231,7 @@ static int duration = 0; .output = STRUCT_TO_CHAR_PTR(core_reloc_bitfields_output) \ __VA_ARGS__, \ .output_len = sizeof(struct core_reloc_bitfields_output), \ - .direct_raw_tp = true, \ + .prog_sec_name = "tp_btf/sys_enter", \ }
@@ -217,7 +242,7 @@ static int duration = 0; }, { \ BITFIELDS_CASE_COMMON("test_core_reloc_bitfields_direct.o", \ "direct:", name), \ - .direct_raw_tp = true, \ + .prog_sec_name = "tp_btf/sys_enter", \ .fails = true, \ }
@@ -304,6 +329,7 @@ static int duration = 0; struct core_reloc_test_case;
typedef int (*setup_test_fn)(struct core_reloc_test_case *test); +typedef int (*trigger_test_fn)(const struct core_reloc_test_case *test);
struct core_reloc_test_case { const char *case_name; @@ -314,9 +340,12 @@ struct core_reloc_test_case { const char *output; int output_len; bool fails; + bool needs_testmod; bool relaxed_core_relocs; - bool direct_raw_tp; + const char *prog_sec_name; + const char *raw_tp_name; setup_test_fn setup; + trigger_test_fn trigger; };
static int find_btf_type(const struct btf *btf, const char *name, __u32 kind) @@ -446,6 +475,23 @@ static int setup_type_id_case_failure(struct core_reloc_test_case *test) return 0; }
+static int trigger_module_test_read(const struct core_reloc_test_case *test) +{ + struct core_reloc_module_output *exp = (void *)test->output; + int fd, err; + + fd = open("/sys/kernel/bpf_testmod", O_RDONLY); + err = -errno; + if (CHECK(fd < 0, "testmod_file_open", "failed: %d\n", err)) + return err; + + read(fd, NULL, exp->len); /* request expected number of bytes */ + close(fd); + + return 0; +} + + static struct core_reloc_test_case test_cases[] = { /* validate we can find kernel image and use its BTF for relocs */ { @@ -462,6 +508,9 @@ static struct core_reloc_test_case test_cases[] = { .output_len = sizeof(struct core_reloc_kernel_output), },
+ /* validate we can find kernel module BTF types for relocs/attach */ + MODULES_CASE("module", "raw_tp/bpf_testmod_test_read", "bpf_testmod_test_read"), + /* validate BPF program can use multiple flavors to match against * single target BTF type */ @@ -785,6 +834,11 @@ void test_core_reloc(void) if (!test__start_subtest(test_case->case_name)) continue;
+ if (test_case->needs_testmod && !env.has_testmod) { + test__skip(); + continue; + } + if (test_case->setup) { err = test_case->setup(test_case); if (CHECK(err, "test_setup", "test #%d setup failed: %d\n", i, err)) @@ -796,13 +850,11 @@ void test_core_reloc(void) test_case->bpf_obj_file, PTR_ERR(obj))) continue;
- /* for typed raw tracepoints, NULL should be specified */ - if (test_case->direct_raw_tp) { - probe_name = "tp_btf/sys_enter"; - tp_name = NULL; - } else { - probe_name = "raw_tracepoint/sys_enter"; - tp_name = "sys_enter"; + probe_name = "raw_tracepoint/sys_enter"; + tp_name = "sys_enter"; + if (test_case->prog_sec_name) { + probe_name = test_case->prog_sec_name; + tp_name = test_case->raw_tp_name; /* NULL for tp_btf */ }
prog = bpf_object__find_program_by_title(obj, probe_name); @@ -850,7 +902,12 @@ void test_core_reloc(void) goto cleanup;
/* trigger test run */ - usleep(1); + if (test_case->trigger) { + if (!ASSERT_OK(test_case->trigger(test_case), "test_trigger")) + goto cleanup; + } else { + usleep(1); + }
if (data->skip) { test__skip(); diff --git a/tools/testing/selftests/bpf/progs/core_reloc_types.h b/tools/testing/selftests/bpf/progs/core_reloc_types.h index af58ef9a28ca..664eea1013aa 100644 --- a/tools/testing/selftests/bpf/progs/core_reloc_types.h +++ b/tools/testing/selftests/bpf/progs/core_reloc_types.h @@ -15,6 +15,23 @@ struct core_reloc_kernel_output { int comm_len; };
+/* + * MODULE + */ + +struct core_reloc_module_output { + long long len; + long long off; + int read_ctx_sz; + bool read_ctx_exists; + bool buf_exists; + bool len_exists; + bool off_exists; + /* we have test_progs[-flavor], so cut flavor part */ + char comm[sizeof("test_progs")]; + int comm_len; +}; + /* * FLAVORS */ diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_module.c b/tools/testing/selftests/bpf/progs/test_core_reloc_module.c new file mode 100644 index 000000000000..d1840c1a9d36 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_core_reloc_module.c @@ -0,0 +1,66 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> +#include <bpf/bpf_tracing.h> + +char _license[] SEC("license") = "GPL"; + +struct bpf_testmod_test_read_ctx { + /* field order is mixed up */ + size_t len; + char *buf; + loff_t off; +} __attribute__((preserve_access_index)); + +struct { + char in[256]; + char out[256]; + bool skip; + uint64_t my_pid_tgid; +} data = {}; + +struct core_reloc_module_output { + long long len; + long long off; + int read_ctx_sz; + bool read_ctx_exists; + bool buf_exists; + bool len_exists; + bool off_exists; + /* we have test_progs[-flavor], so cut flavor part */ + char comm[sizeof("test_progs")]; + int comm_len; +}; + +SEC("raw_tp/bpf_testmod_test_read") +int BPF_PROG(test_core_module, + struct task_struct *task, + struct bpf_testmod_test_read_ctx *read_ctx) +{ + struct core_reloc_module_output *out = (void *)&data.out; + __u64 pid_tgid = bpf_get_current_pid_tgid(); + __u32 real_tgid = (__u32)(pid_tgid >> 32); + __u32 real_pid = (__u32)pid_tgid; + + if (data.my_pid_tgid != pid_tgid) + return 0; + + if (BPF_CORE_READ(task, pid) != real_pid || BPF_CORE_READ(task, tgid) != real_tgid) + return 0; + + out->len = BPF_CORE_READ(read_ctx, len); + out->off = BPF_CORE_READ(read_ctx, off); + + out->read_ctx_sz = bpf_core_type_size(struct bpf_testmod_test_read_ctx); + out->read_ctx_exists = bpf_core_type_exists(struct bpf_testmod_test_read_ctx); + out->buf_exists = bpf_core_field_exists(read_ctx->buf); + out->off_exists = bpf_core_field_exists(read_ctx->off); + out->len_exists = bpf_core_field_exists(read_ctx->len); + + out->comm_len = BPF_CORE_READ_STR_INTO(&out->comm, task, comm); + + return 0; +}
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 22dc4a0f5ed11b6dc8fd73a0892fa0ea1a4c3cdf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove a permeating assumption thoughout BPF verifier of vmlinux BTF. Instead, wherever BTF type IDs are involved, also track the instance of struct btf that goes along with the type ID. This allows to gradually add support for kernel module BTFs and using/tracking module types across BPF helper calls and registers.
This patch also renames btf_id() function to btf_obj_id() to minimize naming clash with using btf_id to denote BTF *type* ID, rather than BTF *object*'s ID.
Also, altough btf_vmlinux can't get destructed and thus doesn't need refcounting, module BTFs need that, so apply BTF refcounting universally when BPF program is using BTF-powered attachment (tp_btf, fentry/fexit, etc). This makes for simpler clean up code.
Now that BTF type ID is not enough to uniquely identify a BTF type, extend BPF trampoline key to include BTF object ID. To differentiate that from target program BPF ID, set 31st bit of type ID. BTF type IDs (at least currently) are not allowed to take full 32 bits, so there is no danger of confusing that bit with a valid BTF type ID.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-10-andrii@kernel.org (cherry picked from commit 22dc4a0f5ed11b6dc8fd73a0892fa0ea1a4c3cdf) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/syscall.c kernel/bpf/verifier.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 13 ++++-- include/linux/bpf_verifier.h | 28 +++++++++---- include/linux/btf.h | 5 ++- kernel/bpf/btf.c | 65 ++++++++++++++++++++---------- kernel/bpf/syscall.c | 24 +++++++++-- kernel/bpf/verifier.c | 77 ++++++++++++++++++++++-------------- net/ipv4/bpf_tcp_ca.c | 3 +- 7 files changed, 148 insertions(+), 67 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 8a0c2bd2b8d5..966337ba4845 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -480,7 +480,10 @@ struct bpf_insn_access_aux { enum bpf_reg_type reg_type; union { int ctx_field_size; - u32 btf_id; + struct { + struct btf *btf; + u32 btf_id; + }; }; struct bpf_verifier_log *log; /* for verbose logs */ }; @@ -517,6 +520,7 @@ struct bpf_verifier_ops { struct bpf_insn *dst, struct bpf_prog *prog, u32 *target_size); int (*btf_struct_access)(struct bpf_verifier_log *log, + const struct btf *btf, const struct btf_type *t, int off, int size, enum bpf_access_type atype, u32 *next_btf_id); @@ -858,6 +862,7 @@ struct bpf_prog_aux { u32 ctx_arg_info_size; u32 max_rdonly_access; u32 max_rdwr_access; + struct btf *attach_btf; const struct bpf_ctx_arg_aux *ctx_arg_info; struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */ struct bpf_prog *dst_prog; @@ -1094,7 +1099,6 @@ struct bpf_event_entry {
bool bpf_prog_array_compatible(struct bpf_array *array, const struct bpf_prog *fp); int bpf_prog_calc_tag(struct bpf_prog *fp); -const char *kernel_type_name(u32 btf_type_id);
const struct bpf_func_proto *bpf_get_trace_printk_proto(void);
@@ -1555,12 +1559,13 @@ int bpf_prog_test_run_raw_tp(struct bpf_prog *prog, bool btf_ctx_access(int off, int size, enum bpf_access_type type, const struct bpf_prog *prog, struct bpf_insn_access_aux *info); -int btf_struct_access(struct bpf_verifier_log *log, +int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, const struct btf_type *t, int off, int size, enum bpf_access_type atype, u32 *next_btf_id); bool btf_struct_ids_match(struct bpf_verifier_log *log, - int off, u32 id, u32 need_type_id); + const struct btf *btf, u32 id, int off, + const struct btf *need_btf, u32 need_type_id);
int btf_distill_func_proto(struct bpf_verifier_log *log, struct btf *btf, diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 351923aab869..9ee5c7d9e9cb 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -5,6 +5,7 @@ #define _LINUX_BPF_VERIFIER_H 1
#include <linux/bpf.h> /* for enum bpf_reg_type */ +#include <linux/btf.h> /* for struct btf and btf_id() */ #include <linux/filter.h> /* for MAX_BPF_STACK */ #include <linux/tnum.h>
@@ -45,6 +46,8 @@ enum bpf_reg_liveness { struct bpf_reg_state { /* Ordering of fields matters. See states_equal() */ enum bpf_reg_type type; + /* Fixed part of pointer offset, pointer types only */ + s32 off; union { /* valid when type == PTR_TO_PACKET */ u16 range; @@ -54,15 +57,20 @@ struct bpf_reg_state { */ struct bpf_map *map_ptr;
- u32 btf_id; /* for PTR_TO_BTF_ID */ + /* for PTR_TO_BTF_ID */ + struct { + struct btf *btf; + u32 btf_id; + };
u32 mem_size; /* for PTR_TO_MEM | PTR_TO_MEM_OR_NULL */
/* Max size from any of the above. */ - unsigned long raw; + struct { + unsigned long raw1; + unsigned long raw2; + } raw; }; - /* Fixed part of pointer offset, pointer types only */ - s32 off; /* For PTR_TO_PACKET, used to find other pointers with the same variable * offset, so they can share range knowledge. * For PTR_TO_MAP_VALUE_OR_NULL this is used to share which map value we @@ -321,7 +329,10 @@ struct bpf_insn_aux_data { struct { enum bpf_reg_type reg_type; /* type of pseudo_btf_id */ union { - u32 btf_id; /* btf_id for struct typed var */ + struct { + struct btf *btf; + u32 btf_id; /* btf_id for struct typed var */ + }; u32 mem_size; /* mem_size for non-struct typed var */ }; } btf_var; @@ -481,9 +492,12 @@ int check_ptr_off_reg(struct bpf_verifier_env *env,
/* this lives here instead of in bpf.h because it needs to dereference tgt_prog */ static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog, - u32 btf_id) + struct btf *btf, u32 btf_id) { - return tgt_prog ? (((u64)tgt_prog->aux->id) << 32 | btf_id) : btf_id; + if (tgt_prog) + return ((u64)tgt_prog->aux->id << 32) | btf_id; + else + return ((u64)btf_obj_id(btf) << 32) | 0x80000000 | btf_id; }
int bpf_check_attach_target(struct bpf_verifier_log *log, diff --git a/include/linux/btf.h b/include/linux/btf.h index 2bf641829664..fb608e4de076 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -18,6 +18,7 @@ struct btf_show;
extern const struct file_operations btf_fops;
+void btf_get(struct btf *btf); void btf_put(struct btf *btf); int btf_new_fd(const union bpf_attr *attr); struct btf *btf_get_by_fd(int fd); @@ -88,7 +89,7 @@ int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, char *buf, int len, u64 flags);
int btf_get_fd_by_id(u32 id); -u32 btf_id(const struct btf *btf); +u32 btf_obj_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); @@ -206,6 +207,8 @@ static inline const struct btf_var_secinfo *btf_type_var_secinfo( }
#ifdef CONFIG_BPF_SYSCALL +struct bpf_prog; + const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); const char *btf_name_by_offset(const struct btf *btf, u32 offset); struct btf *btf_parse_vmlinux(void); diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 680f5d0c7fe8..f8e4e3729943 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -1524,6 +1524,11 @@ static void btf_free_rcu(struct rcu_head *rcu) btf_free(btf); }
+void btf_get(struct btf *btf) +{ + refcount_inc(&btf->refcnt); +} + void btf_put(struct btf *btf) { if (btf && refcount_dec_and_test(&btf->refcnt)) { @@ -4554,11 +4559,10 @@ struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog) { struct bpf_prog *tgt_prog = prog->aux->dst_prog;
- if (tgt_prog) { + if (tgt_prog) return tgt_prog->aux->btf; - } else { - return btf_vmlinux; - } + else + return prog->aux->attach_btf; }
static bool is_string_ptr(struct btf *btf, const struct btf_type *t) @@ -4702,6 +4706,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
if (ctx_arg_info->offset == off) { info->reg_type = ctx_arg_info->reg_type; + info->btf = btf_vmlinux; info->btf_id = ctx_arg_info->btf_id; return true; } @@ -4718,6 +4723,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
ret = btf_translate_to_vmlinux(log, btf, t, tgt_type, arg); if (ret > 0) { + info->btf = btf_vmlinux; info->btf_id = ret; return true; } else { @@ -4725,6 +4731,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, } }
+ info->btf = btf; info->btf_id = t->type; t = btf_type_by_id(btf, t->type); /* skip modifiers */ @@ -4751,7 +4758,7 @@ enum bpf_struct_walk_result { WALK_STRUCT, };
-static int btf_struct_walk(struct bpf_verifier_log *log, +static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf, const struct btf_type *t, int off, int size, u32 *next_btf_id) { @@ -4762,7 +4769,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, u32 vlen, elem_id, mid;
again: - tname = __btf_name_by_offset(btf_vmlinux, t->name_off); + tname = __btf_name_by_offset(btf, t->name_off); if (!btf_type_is_struct(t)) { bpf_log(log, "Type '%s' is not a struct\n", tname); return -EINVAL; @@ -4779,7 +4786,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, goto error;
member = btf_type_member(t) + vlen - 1; - mtype = btf_type_skip_modifiers(btf_vmlinux, member->type, + mtype = btf_type_skip_modifiers(btf, member->type, NULL); if (!btf_type_is_array(mtype)) goto error; @@ -4795,7 +4802,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, /* Only allow structure for now, can be relaxed for * other types later. */ - t = btf_type_skip_modifiers(btf_vmlinux, array_elem->type, + t = btf_type_skip_modifiers(btf, array_elem->type, NULL); if (!btf_type_is_struct(t)) goto error; @@ -4853,10 +4860,10 @@ static int btf_struct_walk(struct bpf_verifier_log *log,
/* type of the field */ mid = member->type; - mtype = btf_type_by_id(btf_vmlinux, member->type); - mname = __btf_name_by_offset(btf_vmlinux, member->name_off); + mtype = btf_type_by_id(btf, member->type); + mname = __btf_name_by_offset(btf, member->name_off);
- mtype = __btf_resolve_size(btf_vmlinux, mtype, &msize, + mtype = __btf_resolve_size(btf, mtype, &msize, &elem_type, &elem_id, &total_nelems, &mid); if (IS_ERR(mtype)) { @@ -4951,7 +4958,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, mname, moff, tname, off, size); return -EACCES; } - stype = btf_type_skip_modifiers(btf_vmlinux, mtype->type, &id); + stype = btf_type_skip_modifiers(btf, mtype->type, &id); if (btf_type_is_struct(stype)) { *next_btf_id = id; return WALK_PTR; @@ -4977,7 +4984,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, return -EINVAL; }
-int btf_struct_access(struct bpf_verifier_log *log, +int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, const struct btf_type *t, int off, int size, enum bpf_access_type atype __maybe_unused, u32 *next_btf_id) @@ -4986,7 +4993,7 @@ int btf_struct_access(struct bpf_verifier_log *log, u32 id;
do { - err = btf_struct_walk(log, t, off, size, &id); + err = btf_struct_walk(log, btf, t, off, size, &id);
switch (err) { case WALK_PTR: @@ -5002,7 +5009,7 @@ int btf_struct_access(struct bpf_verifier_log *log, * by diving in it. At this point the offset is * aligned with the new type, so set it to 0. */ - t = btf_type_by_id(btf_vmlinux, id); + t = btf_type_by_id(btf, id); off = 0; break; default: @@ -5018,21 +5025,37 @@ int btf_struct_access(struct bpf_verifier_log *log, return -EINVAL; }
+/* Check that two BTF types, each specified as an BTF object + id, are exactly + * the same. Trivial ID check is not enough due to module BTFs, because we can + * end up with two different module BTFs, but IDs point to the common type in + * vmlinux BTF. + */ +static bool btf_types_are_same(const struct btf *btf1, u32 id1, + const struct btf *btf2, u32 id2) +{ + if (id1 != id2) + return false; + if (btf1 == btf2) + return true; + return btf_type_by_id(btf1, id1) == btf_type_by_id(btf2, id2); +} + bool btf_struct_ids_match(struct bpf_verifier_log *log, - int off, u32 id, u32 need_type_id) + const struct btf *btf, u32 id, int off, + const struct btf *need_btf, u32 need_type_id) { const struct btf_type *type; int err;
/* Are we already done? */ - if (need_type_id == id && off == 0) + if (off == 0 && btf_types_are_same(btf, id, need_btf, need_type_id)) return true;
again: - type = btf_type_by_id(btf_vmlinux, id); + type = btf_type_by_id(btf, id); if (!type) return false; - err = btf_struct_walk(log, type, off, 1, &id); + err = btf_struct_walk(log, btf, type, off, 1, &id); if (err != WALK_STRUCT) return false;
@@ -5041,7 +5064,7 @@ bool btf_struct_ids_match(struct bpf_verifier_log *log, * continue the search with offset 0 in the new * type. */ - if (need_type_id != id) { + if (!btf_types_are_same(btf, id, need_btf, need_type_id)) { off = 0; goto again; } @@ -5724,7 +5747,7 @@ int btf_get_fd_by_id(u32 id) return fd; }
-u32 btf_id(const struct btf *btf) +u32 btf_obj_id(const struct btf *btf) { return btf->id; } diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 2f4091da923f..7b1aba88038f 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1754,6 +1754,8 @@ static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred) bpf_prog_kallsyms_del_all(prog); btf_put(prog->aux->btf); bpf_prog_free_linfo(prog); + if (prog->aux->attach_btf) + btf_put(prog->aux->attach_btf);
if (deferred) { if (prog->aux->sleepable) @@ -2178,6 +2180,20 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
prog->expected_attach_type = attr->expected_attach_type; prog->aux->attach_btf_id = attr->attach_btf_id; + + if (attr->attach_btf_id && !attr->attach_prog_fd) { + struct btf *btf; + + btf = bpf_get_btf_vmlinux(); + if (IS_ERR(btf)) + return PTR_ERR(btf); + if (!btf) + return -EINVAL; + + btf_get(btf); + prog->aux->attach_btf = btf; + } + if (attr->attach_prog_fd) { struct bpf_prog *dst_prog;
@@ -2278,6 +2294,8 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) free_prog_sec: security_bpf_prog_free(prog->aux); free_prog_nouncharge: + if (prog->aux->attach_btf) + btf_put(prog->aux->attach_btf); bpf_prog_free(prog); return err; } @@ -2646,7 +2664,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, goto out_put_prog; }
- key = bpf_trampoline_compute_key(tgt_prog, btf_id); + key = bpf_trampoline_compute_key(tgt_prog, NULL, btf_id); }
link = kzalloc(sizeof(*link), GFP_USER); @@ -3631,7 +3649,7 @@ static int bpf_prog_get_info_by_fd(struct file *file, }
if (prog->aux->btf) - info.btf_id = btf_id(prog->aux->btf); + info.btf_id = btf_obj_id(prog->aux->btf);
ulen = info.nr_func_info; info.nr_func_info = prog->aux->func_info_cnt; @@ -3734,7 +3752,7 @@ static int bpf_map_get_info_by_fd(struct file *file, memcpy(info.name, map->name, sizeof(map->name));
if (map->btf) { - info.btf_id = btf_id(map->btf); + info.btf_id = btf_obj_id(map->btf); info.btf_key_type_id = map->btf_key_type_id; info.btf_value_type_id = map->btf_value_type_id; } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index d26104b258ba..8abf2b6bb27b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -239,7 +239,9 @@ struct bpf_call_arg_meta { u64 msize_max_value; int ref_obj_id; int func_id; + struct btf *btf; u32 btf_id; + struct btf *ret_btf; u32 ret_btf_id; };
@@ -560,10 +562,9 @@ static struct bpf_func_state *func(struct bpf_verifier_env *env, return cur->frame[reg->frameno]; }
-const char *kernel_type_name(u32 id) +static const char *kernel_type_name(const struct btf* btf, u32 id) { - return btf_name_by_offset(btf_vmlinux, - btf_type_by_id(btf_vmlinux, id)->name_off); + return btf_name_by_offset(btf, btf_type_by_id(btf, id)->name_off); }
static void print_verifier_state(struct bpf_verifier_env *env, @@ -592,7 +593,7 @@ static void print_verifier_state(struct bpf_verifier_env *env, } else { if (base_type(t) == PTR_TO_BTF_ID || base_type(t) == PTR_TO_PERCPU_BTF_ID) - verbose(env, "%s", kernel_type_name(reg->btf_id)); + verbose(env, "%s", kernel_type_name(reg->btf, reg->btf_id)); verbose(env, "(id=%d", reg->id); if (reg_type_may_be_refcounted_or_null(t)) verbose(env, ",ref_obj_id=%d", reg->ref_obj_id); @@ -1387,7 +1388,8 @@ static void mark_reg_not_init(struct bpf_verifier_env *env,
static void mark_btf_ld_reg(struct bpf_verifier_env *env, struct bpf_reg_state *regs, u32 regno, - enum bpf_reg_type reg_type, u32 btf_id) + enum bpf_reg_type reg_type, + struct btf *btf, u32 btf_id) { if (reg_type == SCALAR_VALUE) { mark_reg_unknown(env, regs, regno); @@ -1395,6 +1397,7 @@ static void mark_btf_ld_reg(struct bpf_verifier_env *env, } mark_reg_known_zero(env, regs, regno); regs[regno].type = PTR_TO_BTF_ID; + regs[regno].btf = btf; regs[regno].btf_id = btf_id; }
@@ -2995,7 +2998,7 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off, /* check access to 'struct bpf_context' fields. Supports fixed offsets only */ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, int size, enum bpf_access_type t, enum bpf_reg_type *reg_type, - u32 *btf_id) + struct btf **btf, u32 *btf_id) { struct bpf_insn_access_aux info = { .reg_type = *reg_type, @@ -3013,10 +3016,12 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, */ *reg_type = info.reg_type;
- if (base_type(*reg_type) == PTR_TO_BTF_ID) + if (base_type(*reg_type) == PTR_TO_BTF_ID) { + *btf = info.btf; *btf_id = info.btf_id; - else + } else { env->insn_aux_data[insn_idx].ctx_field_size = info.ctx_field_size; + } /* remember the offset of last byte accessed in ctx */ if (env->prog->aux->max_ctx_offset < off + size) env->prog->aux->max_ctx_offset = off + size; @@ -3548,8 +3553,8 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env, int value_regno) { struct bpf_reg_state *reg = regs + regno; - const struct btf_type *t = btf_type_by_id(btf_vmlinux, reg->btf_id); - const char *tname = btf_name_by_offset(btf_vmlinux, t->name_off); + const struct btf_type *t = btf_type_by_id(reg->btf, reg->btf_id); + const char *tname = btf_name_by_offset(reg->btf, t->name_off); u32 btf_id; int ret;
@@ -3570,23 +3575,23 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env, }
if (env->ops->btf_struct_access) { - ret = env->ops->btf_struct_access(&env->log, t, off, size, - atype, &btf_id); + ret = env->ops->btf_struct_access(&env->log, reg->btf, t, + off, size, atype, &btf_id); } else { if (atype != BPF_READ) { verbose(env, "only read is supported\n"); return -EACCES; }
- ret = btf_struct_access(&env->log, t, off, size, atype, - &btf_id); + ret = btf_struct_access(&env->log, reg->btf, t, off, size, + atype, &btf_id); }
if (ret < 0) return ret;
if (atype == BPF_READ && value_regno >= 0) - mark_btf_ld_reg(env, regs, value_regno, ret, btf_id); + mark_btf_ld_reg(env, regs, value_regno, ret, reg->btf, btf_id);
return 0; } @@ -3636,12 +3641,12 @@ static int check_ptr_to_map_access(struct bpf_verifier_env *env, return -EACCES; }
- ret = btf_struct_access(&env->log, t, off, size, atype, &btf_id); + ret = btf_struct_access(&env->log, btf_vmlinux, t, off, size, atype, &btf_id); if (ret < 0) return ret;
if (value_regno >= 0) - mark_btf_ld_reg(env, regs, value_regno, ret, btf_id); + mark_btf_ld_reg(env, regs, value_regno, ret, btf_vmlinux, btf_id);
return 0; } @@ -3817,6 +3822,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn mark_reg_unknown(env, regs, value_regno); } else if (reg->type == PTR_TO_CTX) { enum bpf_reg_type reg_type = SCALAR_VALUE; + struct btf *btf = NULL; u32 btf_id = 0;
if (t == BPF_WRITE && value_regno >= 0 && @@ -3829,7 +3835,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn if (err < 0) return err;
- err = check_ctx_access(env, insn_idx, off, size, t, ®_type, &btf_id); + err = check_ctx_access(env, insn_idx, off, size, t, ®_type, &btf, &btf_id); if (err) verbose_linfo(env, insn_idx, "; "); if (!err && t == BPF_READ && value_regno >= 0) { @@ -3850,8 +3856,10 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn * a sub-register. */ regs[value_regno].subreg_def = DEF_NOT_SUBREG; - if (base_type(reg_type) == PTR_TO_BTF_ID) + if (base_type(reg_type) == PTR_TO_BTF_ID) { + regs[value_regno].btf = btf; regs[value_regno].btf_id = btf_id; + } } regs[value_regno].type = reg_type; } @@ -4477,11 +4485,11 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno, arg_btf_id = compatible->btf_id; }
- if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id, - *arg_btf_id)) { + if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off, + btf_vmlinux, *arg_btf_id)) { verbose(env, "R%d is of type %s but %s is expected\n", - regno, kernel_type_name(reg->btf_id), - kernel_type_name(*arg_btf_id)); + regno, kernel_type_name(reg->btf, reg->btf_id), + kernel_type_name(btf_vmlinux, *arg_btf_id)); return -EACCES; } } @@ -4619,6 +4627,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, verbose(env, "Helper has invalid btf_id in R%d\n", regno); return -EACCES; } + meta->ret_btf = reg->btf; meta->ret_btf_id = reg->btf_id; } else if (arg_type == ARG_PTR_TO_SPIN_LOCK) { if (meta->func_id == BPF_FUNC_spin_lock) { @@ -5536,16 +5545,16 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn const struct btf_type *t;
mark_reg_known_zero(env, regs, BPF_REG_0); - t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL); + t = btf_type_skip_modifiers(meta.ret_btf, meta.ret_btf_id, NULL); if (!btf_type_is_struct(t)) { u32 tsize; const struct btf_type *ret; const char *tname;
/* resolve the type size of ksym. */ - ret = btf_resolve_size(btf_vmlinux, t, &tsize); + ret = btf_resolve_size(meta.ret_btf, t, &tsize); if (IS_ERR(ret)) { - tname = btf_name_by_offset(btf_vmlinux, t->name_off); + tname = btf_name_by_offset(meta.ret_btf, t->name_off); verbose(env, "unable to resolve the size of type '%s': %ld\n", tname, PTR_ERR(ret)); return -EINVAL; @@ -5561,6 +5570,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn ret_flag &= ~MEM_RDONLY;
regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag; + regs[BPF_REG_0].btf = meta.ret_btf; regs[BPF_REG_0].btf_id = meta.ret_btf_id; } } else if (base_type(ret_type) == RET_PTR_TO_BTF_ID) { @@ -5575,6 +5585,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn func_id); return -EINVAL; } + /* current BPF helper definitions are only coming from + * built-in code with type IDs from vmlinux BTF + */ + regs[BPF_REG_0].btf = btf_vmlinux; regs[BPF_REG_0].btf_id = ret_btf_id; } else { verbose(env, "unknown return type %u of func %s#%d\n", @@ -6173,7 +6187,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, if (reg_is_pkt_pointer(ptr_reg)) { dst_reg->id = ++env->id_gen; /* something was added to pkt_ptr, set range to zero */ - dst_reg->raw = 0; + memset(&dst_reg->raw, 0, sizeof(dst_reg->raw)); } break; case BPF_SUB: @@ -6233,7 +6247,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, dst_reg->id = ++env->id_gen; /* something was added to pkt_ptr, set range to zero */ if (smin_val < 0) - dst_reg->raw = 0; + memset(&dst_reg->raw, 0, sizeof(dst_reg->raw)); } break; case BPF_AND: @@ -8221,6 +8235,7 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) break; case PTR_TO_BTF_ID: case PTR_TO_PERCPU_BTF_ID: + dst_reg->btf = aux->btf_var.btf; dst_reg->btf_id = aux->btf_var.btf_id; break; default: @@ -10202,6 +10217,7 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, t = btf_type_skip_modifiers(btf_vmlinux, type, NULL); if (percpu) { aux->btf_var.reg_type = PTR_TO_PERCPU_BTF_ID; + aux->btf_var.btf = btf_vmlinux; aux->btf_var.btf_id = type; } else if (!btf_type_is_struct(t)) { const struct btf_type *ret; @@ -10220,6 +10236,7 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, aux->btf_var.mem_size = tsize; } else { aux->btf_var.reg_type = PTR_TO_BTF_ID; + aux->btf_var.btf = btf_vmlinux; aux->btf_var.btf_id = type; } return 0; @@ -12048,7 +12065,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, bpf_log(log, "Tracing programs must provide btf_id\n"); return -EINVAL; } - btf = tgt_prog ? tgt_prog->aux->btf : btf_vmlinux; + btf = tgt_prog ? tgt_prog->aux->btf : prog->aux->attach_btf; if (!btf) { bpf_log(log, "FENTRY/FEXIT program can only be attached to another program annotated with BTF\n"); @@ -12332,7 +12349,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) return ret; }
- key = bpf_trampoline_compute_key(tgt_prog, btf_id); + key = bpf_trampoline_compute_key(tgt_prog, prog->aux->attach_btf, btf_id); tr = bpf_trampoline_get(key, &tgt_info); if (!tr) return -ENOMEM; diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 618954f82764..d520e61649c8 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -95,6 +95,7 @@ static bool bpf_tcp_ca_is_valid_access(int off, int size, }
static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log, + const struct btf *btf, const struct btf_type *t, int off, int size, enum bpf_access_type atype, u32 *next_btf_id) @@ -102,7 +103,7 @@ static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log, size_t end;
if (atype == BPF_READ) - return btf_struct_access(log, t, off, size, atype, next_btf_id); + return btf_struct_access(log, btf, t, off, size, atype, next_btf_id);
if (t != tcp_sock_type) { bpf_log(log, "only read is supported\n");
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 290248a5b7d829871b3ea3c62578613a580a1744 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add ability for user-space programs to specify non-vmlinux BTF when attaching BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this, attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD of a module or vmlinux BTF object. For backwards compatibility reasons, 0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org (cherry picked from commit 290248a5b7d829871b3ea3c62578613a580a1744) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/syscall.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/btf.h | 1 + include/uapi/linux/bpf.h | 7 ++- kernel/bpf/btf.c | 5 +++ kernel/bpf/syscall.c | 82 +++++++++++++++++++++------------- tools/include/uapi/linux/bpf.h | 7 ++- 5 files changed, 69 insertions(+), 33 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h index fb608e4de076..4c200f5d242b 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -90,6 +90,7 @@ int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
int btf_get_fd_by_id(u32 id); u32 btf_obj_id(const struct btf *btf); +bool btf_is_kernel(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index dfbd0974e2ed..241f26f52a64 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -558,7 +558,12 @@ union bpf_attr { __aligned_u64 line_info; /* line info */ __u32 line_info_cnt; /* number of bpf_line_info records */ __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ - __u32 attach_prog_fd; /* 0 to attach to vmlinux */ + union { + /* valid prog_fd to attach to bpf prog */ + __u32 attach_prog_fd; + /* or valid module BTF object fd or 0 to attach to vmlinux */ + __u32 attach_btf_obj_fd; + }; };
struct { /* anonymous struct used by BPF_OBJ_* commands */ diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index f8e4e3729943..3edf141cc146 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5752,6 +5752,11 @@ u32 btf_obj_id(const struct btf *btf) return btf->id; }
+bool btf_is_kernel(const struct btf *btf) +{ + return btf->kernel_btf; +} + static int btf_id_cmp_func(const void *a, const void *b) { const int *pa = a, *pb = b; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7b1aba88038f..2d3b7a75155b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1989,12 +1989,16 @@ static void bpf_prog_load_fixup_attach_type(union bpf_attr *attr) static int bpf_prog_load_check_attach(enum bpf_prog_type prog_type, enum bpf_attach_type expected_attach_type, - u32 btf_id, u32 prog_fd) + struct btf *attach_btf, u32 btf_id, + struct bpf_prog *dst_prog) { if (btf_id) { if (btf_id > BTF_MAX_TYPE) return -EINVAL;
+ if (!attach_btf && !dst_prog) + return -EINVAL; + switch (prog_type) { case BPF_PROG_TYPE_TRACING: case BPF_PROG_TYPE_LSM: @@ -2007,7 +2011,10 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type, } }
- if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING && + if (attach_btf && (!btf_id || dst_prog)) + return -EINVAL; + + if (dst_prog && prog_type != BPF_PROG_TYPE_TRACING && prog_type != BPF_PROG_TYPE_EXT) return -EINVAL;
@@ -2125,7 +2132,8 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type) static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) { enum bpf_prog_type type = attr->prog_type; - struct bpf_prog *prog; + struct bpf_prog *prog, *dst_prog = NULL; + struct btf *attach_btf = NULL; int err; char license[128]; bool is_gpl; @@ -2167,44 +2175,56 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) if (is_perfmon_prog_type(type) && !perfmon_capable()) return -EPERM;
+ /* attach_prog_fd/attach_btf_obj_fd can specify fd of either bpf_prog + * or btf, we need to check which one it is + */ + if (attr->attach_prog_fd) { + dst_prog = bpf_prog_get(attr->attach_prog_fd); + if (IS_ERR(dst_prog)) { + dst_prog = NULL; + attach_btf = btf_get_by_fd(attr->attach_btf_obj_fd); + if (IS_ERR(attach_btf)) + return -EINVAL; + if (!btf_is_kernel(attach_btf)) { + btf_put(attach_btf); + return -EINVAL; + } + } + } else if (attr->attach_btf_id) { + /* fall back to vmlinux BTF, if BTF type ID is specified */ + attach_btf = bpf_get_btf_vmlinux(); + if (IS_ERR(attach_btf)) + return PTR_ERR(attach_btf); + if (!attach_btf) + return -EINVAL; + btf_get(attach_btf); + } + bpf_prog_load_fixup_attach_type(attr); if (bpf_prog_load_check_attach(type, attr->expected_attach_type, - attr->attach_btf_id, - attr->attach_prog_fd)) + attach_btf, attr->attach_btf_id, + dst_prog)) { + if (dst_prog) + bpf_prog_put(dst_prog); + if (attach_btf) + btf_put(attach_btf); return -EINVAL; + }
/* plain bpf_prog allocation */ prog = bpf_prog_alloc(bpf_prog_size(attr->insn_cnt), GFP_USER); - if (!prog) + if (!prog) { + if (dst_prog) + bpf_prog_put(dst_prog); + if (attach_btf) + btf_put(attach_btf); return -ENOMEM; + }
prog->expected_attach_type = attr->expected_attach_type; + prog->aux->attach_btf = attach_btf; prog->aux->attach_btf_id = attr->attach_btf_id; - - if (attr->attach_btf_id && !attr->attach_prog_fd) { - struct btf *btf; - - btf = bpf_get_btf_vmlinux(); - if (IS_ERR(btf)) - return PTR_ERR(btf); - if (!btf) - return -EINVAL; - - btf_get(btf); - prog->aux->attach_btf = btf; - } - - if (attr->attach_prog_fd) { - struct bpf_prog *dst_prog; - - dst_prog = bpf_prog_get(attr->attach_prog_fd); - if (IS_ERR(dst_prog)) { - err = PTR_ERR(dst_prog); - goto free_prog_nouncharge; - } - prog->aux->dst_prog = dst_prog; - } - + prog->aux->dst_prog = dst_prog; prog->aux->offload_requested = !!attr->prog_ifindex; prog->aux->sleepable = attr->prog_flags & BPF_F_SLEEPABLE;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 934ec0450b7c..9d885ea960f0 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -558,7 +558,12 @@ union bpf_attr { __aligned_u64 line_info; /* line info */ __u32 line_info_cnt; /* number of bpf_line_info records */ __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ - __u32 attach_prog_fd; /* 0 to attach to vmlinux */ + union { + /* valid prog_fd to attach to bpf prog */ + __u32 attach_prog_fd; + /* or valid module BTF object fd or 0 to attach to vmlinux */ + __u32 attach_btf_obj_fd; + }; };
struct { /* anonymous struct used by BPF_OBJ_* commands */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 6aef10a481a3f42c8021fe410e07440c0d71a5fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Refactor low-level API for BPF program loading to not rely on public API types. This allows painless extension without constant efforts to cleverly not break backwards compatibility.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-12-andrii@kernel.org (cherry picked from commit 6aef10a481a3f42c8021fe410e07440c0d71a5fc) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/bpf.c tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf.c | 102 +++++++++++++++++++++----------- tools/lib/bpf/libbpf.c | 35 +++++------ tools/lib/bpf/libbpf_internal.h | 29 +++++++++ 3 files changed, 114 insertions(+), 52 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index fa6dbd5aec3d..46c702d8a1d8 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -215,60 +215,52 @@ alloc_zero_tailing_info(const void *orecord, __u32 cnt, return info; }
-int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, - char *log_buf, size_t log_buf_sz) +int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr) { void *finfo = NULL, *linfo = NULL; union bpf_attr attr; - __u32 log_level; int fd;
- if (!load_attr || !log_buf != !log_buf_sz) + if (!load_attr->log_buf != !load_attr->log_buf_sz) return -EINVAL;
- log_level = load_attr->log_level; - if (log_level > (4 | 2 | 1) || (log_level && !log_buf)) + if (load_attr->log_level > (4 | 2 | 1) || (load_attr->log_level && !load_attr->log_buf)) return -EINVAL;
memset(&attr, 0, sizeof(attr)); attr.prog_type = load_attr->prog_type; attr.expected_attach_type = load_attr->expected_attach_type; - if (attr.prog_type == BPF_PROG_TYPE_STRUCT_OPS || - attr.prog_type == BPF_PROG_TYPE_LSM) { - attr.attach_btf_id = load_attr->attach_btf_id; - } else if (attr.prog_type == BPF_PROG_TYPE_TRACING || - attr.prog_type == BPF_PROG_TYPE_EXT || - attr.prog_type == BPF_PROG_TYPE_SCHED) { - attr.attach_btf_id = load_attr->attach_btf_id; - attr.attach_prog_fd = load_attr->attach_prog_fd; - } else { - attr.prog_ifindex = load_attr->prog_ifindex; - attr.kern_version = load_attr->kern_version; - } - attr.insn_cnt = (__u32)load_attr->insns_cnt; + + attr.attach_btf_id = load_attr->attach_btf_id; + attr.attach_prog_fd = load_attr->attach_prog_fd; + + attr.prog_ifindex = load_attr->prog_ifindex; + attr.kern_version = load_attr->kern_version; + + attr.insn_cnt = (__u32)load_attr->insn_cnt; attr.insns = ptr_to_u64(load_attr->insns); attr.license = ptr_to_u64(load_attr->license);
- attr.log_level = log_level; - if (log_level) { - attr.log_buf = ptr_to_u64(log_buf); - attr.log_size = log_buf_sz; - } else { - attr.log_buf = ptr_to_u64(NULL); - attr.log_size = 0; + attr.log_level = load_attr->log_level; + if (attr.log_level) { + attr.log_buf = ptr_to_u64(load_attr->log_buf); + attr.log_size = load_attr->log_buf_sz; }
attr.prog_btf_fd = load_attr->prog_btf_fd; + attr.prog_flags = load_attr->prog_flags; + attr.func_info_rec_size = load_attr->func_info_rec_size; attr.func_info_cnt = load_attr->func_info_cnt; attr.func_info = ptr_to_u64(load_attr->func_info); + attr.line_info_rec_size = load_attr->line_info_rec_size; attr.line_info_cnt = load_attr->line_info_cnt; attr.line_info = ptr_to_u64(load_attr->line_info); + if (load_attr->name) memcpy(attr.prog_name, load_attr->name, - min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1)); - attr.prog_flags = load_attr->prog_flags; + min(strlen(load_attr->name), (size_t)BPF_OBJ_NAME_LEN - 1));
fd = sys_bpf_prog_load(&attr, sizeof(attr)); if (fd >= 0) @@ -308,19 +300,19 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, }
fd = sys_bpf_prog_load(&attr, sizeof(attr)); - if (fd >= 0) goto done; }
- if (log_level || !log_buf) + if (load_attr->log_level || !load_attr->log_buf) goto done;
/* Try again with log */ - attr.log_buf = ptr_to_u64(log_buf); - attr.log_size = log_buf_sz; + attr.log_buf = ptr_to_u64(load_attr->log_buf); + attr.log_size = load_attr->log_buf_sz; attr.log_level = 1; - log_buf[0] = 0; + load_attr->log_buf[0] = 0; + fd = sys_bpf_prog_load(&attr, sizeof(attr)); done: free(finfo); @@ -328,6 +320,50 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, return fd; }
+int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, + char *log_buf, size_t log_buf_sz) +{ + struct bpf_prog_load_params p = {}; + + if (!load_attr || !log_buf != !log_buf_sz) + return -EINVAL; + + p.prog_type = load_attr->prog_type; + p.expected_attach_type = load_attr->expected_attach_type; + switch (p.prog_type) { + case BPF_PROG_TYPE_STRUCT_OPS: + case BPF_PROG_TYPE_LSM: + p.attach_btf_id = load_attr->attach_btf_id; + break; + case BPF_PROG_TYPE_TRACING: + case BPF_PROG_TYPE_EXT: + case BPF_PROG_TYPE_SCHED: + p.attach_btf_id = load_attr->attach_btf_id; + p.attach_prog_fd = load_attr->attach_prog_fd; + break; + default: + p.prog_ifindex = load_attr->prog_ifindex; + p.kern_version = load_attr->kern_version; + } + p.insn_cnt = load_attr->insns_cnt; + p.insns = load_attr->insns; + p.license = load_attr->license; + p.log_level = load_attr->log_level; + p.log_buf = log_buf; + p.log_buf_sz = log_buf_sz; + p.prog_btf_fd = load_attr->prog_btf_fd; + p.func_info_rec_size = load_attr->func_info_rec_size; + p.func_info_cnt = load_attr->func_info_cnt; + p.func_info = load_attr->func_info; + p.line_info_rec_size = load_attr->line_info_rec_size; + p.line_info_cnt = load_attr->line_info_cnt; + p.line_info = load_attr->line_info; + p.name = load_attr->name; + p.prog_flags = load_attr->prog_flags; + + return libbpf__bpf_prog_load(&p); +} + int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns, size_t insns_cnt, const char *license, __u32 kern_version, char *log_buf, diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f64144c48317..be1a5143e0f5 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6867,7 +6867,7 @@ static int load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, char *license, __u32 kern_version, int *pfd) { - struct bpf_load_program_attr load_attr; + struct bpf_prog_load_params load_attr = {}; char *cp, errmsg[STRERR_BUFSIZE]; size_t log_buf_size = 0; char *log_buf = NULL; @@ -6886,7 +6886,6 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, if (!insns || !insns_cnt) return -EINVAL;
- memset(&load_attr, 0, sizeof(struct bpf_load_program_attr)); load_attr.prog_type = prog->type; /* old kernels might not support specifying expected_attach_type */ if (!kernel_supports(FEAT_EXP_ATTACH_TYPE) && prog->sec_def && @@ -6897,20 +6896,14 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, if (kernel_supports(FEAT_PROG_NAME)) load_attr.name = prog->name; load_attr.insns = insns; - load_attr.insns_cnt = insns_cnt; + load_attr.insn_cnt = insns_cnt; load_attr.license = license; - if (prog->type == BPF_PROG_TYPE_STRUCT_OPS || - prog->type == BPF_PROG_TYPE_LSM) { - load_attr.attach_btf_id = prog->attach_btf_id; - } else if (prog->type == BPF_PROG_TYPE_TRACING || - prog->type == BPF_PROG_TYPE_EXT || - prog->type == BPF_PROG_TYPE_SCHED) { - load_attr.attach_prog_fd = prog->attach_prog_fd; - load_attr.attach_btf_id = prog->attach_btf_id; - } else { - load_attr.kern_version = kern_version; - load_attr.prog_ifindex = prog->prog_ifindex; - } + load_attr.attach_btf_id = prog->attach_btf_id; + load_attr.attach_prog_fd = prog->attach_prog_fd; + load_attr.attach_btf_id = prog->attach_btf_id; + load_attr.kern_version = kern_version; + load_attr.prog_ifindex = prog->prog_ifindex; + /* specify func_info/line_info only if kernel supports them */ btf_fd = bpf_object__btf_fd(prog->obj); if (btf_fd >= 0 && kernel_supports(FEAT_BTF_FUNC)) { @@ -6934,7 +6927,9 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, *log_buf = 0; }
- ret = bpf_load_program_xattr(&load_attr, log_buf, log_buf_size); + load_attr.log_buf = log_buf; + load_attr.log_buf_sz = log_buf_size; + ret = libbpf__bpf_prog_load(&load_attr);
if (ret >= 0) { if (log_buf && load_attr.log_level) @@ -6975,9 +6970,9 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, pr_warn("-- BEGIN DUMP LOG ---\n"); pr_warn("\n%s\n", log_buf); pr_warn("-- END LOG --\n"); - } else if (load_attr.insns_cnt >= BPF_MAXINSNS) { + } else if (load_attr.insn_cnt >= BPF_MAXINSNS) { pr_warn("Program too large (%zu insns), at most %d insns\n", - load_attr.insns_cnt, BPF_MAXINSNS); + load_attr.insn_cnt, BPF_MAXINSNS); ret = -LIBBPF_ERRNO__PROG2BIG; } else if (load_attr.prog_type != BPF_PROG_TYPE_KPROBE) { /* Wrong program type? */ @@ -6985,7 +6980,9 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
load_attr.prog_type = BPF_PROG_TYPE_KPROBE; load_attr.expected_attach_type = 0; - fd = bpf_load_program_xattr(&load_attr, NULL, 0); + load_attr.log_buf = NULL; + load_attr.log_buf_sz = 0; + fd = libbpf__bpf_prog_load(&load_attr); if (fd >= 0) { close(fd); ret = -LIBBPF_ERRNO__PROGTYPE; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index e569ae63808e..681073a67ae3 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -151,6 +151,35 @@ int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz); int libbpf__load_raw_btf(const char *raw_types, size_t types_len, const char *str_sec, size_t str_len);
+struct bpf_prog_load_params { + enum bpf_prog_type prog_type; + enum bpf_attach_type expected_attach_type; + const char *name; + const struct bpf_insn *insns; + size_t insn_cnt; + const char *license; + __u32 kern_version; + __u32 attach_prog_fd; + __u32 attach_btf_id; + __u32 prog_ifindex; + __u32 prog_btf_fd; + __u32 prog_flags; + + __u32 func_info_rec_size; + const void *func_info; + __u32 func_info_cnt; + + __u32 line_info_rec_size; + const void *line_info; + __u32 line_info_cnt; + + __u32 log_level; + char *log_buf; + size_t log_buf_sz; +}; + +int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr); + int bpf_object__section_size(const struct bpf_object *obj, const char *name, __u32 *size); int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 91abb4a6d79df6c4dcd86d85632df53c8cca2dcf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Teach libbpf to search for BTF types in kernel modules for tracing BPF programs. This allows attachment of raw_tp/fentry/fexit/fmod_ret/etc BPF program types to tracepoints and functions in kernel modules.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-13-andrii@kernel.org (cherry picked from commit 91abb4a6d79df6c4dcd86d85632df53c8cca2dcf) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf.c | 5 +- tools/lib/bpf/libbpf.c | 138 +++++++++++++++++++++++++------- tools/lib/bpf/libbpf_internal.h | 1 + 3 files changed, 112 insertions(+), 32 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 46c702d8a1d8..1ce17f3528f6 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -231,8 +231,11 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr) attr.prog_type = load_attr->prog_type; attr.expected_attach_type = load_attr->expected_attach_type;
+ if (load_attr->attach_prog_fd) + attr.attach_prog_fd = load_attr->attach_prog_fd; + else + attr.attach_btf_obj_fd = load_attr->attach_btf_obj_fd; attr.attach_btf_id = load_attr->attach_btf_id; - attr.attach_prog_fd = load_attr->attach_prog_fd;
attr.prog_ifindex = load_attr->prog_ifindex; attr.kern_version = load_attr->kern_version; diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index be1a5143e0f5..fdf8441ec8a0 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -278,6 +278,7 @@ struct bpf_program { enum bpf_prog_type type; enum bpf_attach_type expected_attach_type; int prog_ifindex; + __u32 attach_btf_obj_fd; __u32 attach_btf_id; __u32 attach_prog_fd; void *func_info; @@ -408,6 +409,7 @@ struct module_btf { struct btf *btf; char *name; __u32 id; + int fd; };
struct bpf_object { @@ -4824,8 +4826,7 @@ static int load_module_btfs(struct bpf_object *obj) if (err) { err = -errno; pr_warn("failed to get BTF object #%d info: %d\n", id, err); - close(fd); - return err; + goto err_out; }
/* ignore non-module BTFs */ @@ -4835,25 +4836,33 @@ static int load_module_btfs(struct bpf_object *obj) }
btf = btf_get_from_fd(fd, obj->btf_vmlinux); - close(fd); if (IS_ERR(btf)) { pr_warn("failed to load module [%s]'s BTF object #%d: %ld\n", name, id, PTR_ERR(btf)); - return PTR_ERR(btf); + err = PTR_ERR(btf); + goto err_out; }
err = btf_ensure_mem((void **)&obj->btf_modules, &obj->btf_module_cap, sizeof(*obj->btf_modules), obj->btf_module_cnt + 1); if (err) - return err; + goto err_out;
mod_btf = &obj->btf_modules[obj->btf_module_cnt++];
mod_btf->btf = btf; mod_btf->id = id; + mod_btf->fd = fd; mod_btf->name = strdup(name); - if (!mod_btf->name) - return -ENOMEM; + if (!mod_btf->name) { + err = -ENOMEM; + goto err_out; + } + continue; + +err_out: + close(fd); + return err; }
return 0; @@ -6899,7 +6908,10 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, load_attr.insn_cnt = insns_cnt; load_attr.license = license; load_attr.attach_btf_id = prog->attach_btf_id; - load_attr.attach_prog_fd = prog->attach_prog_fd; + if (prog->attach_prog_fd) + load_attr.attach_prog_fd = prog->attach_prog_fd; + else + load_attr.attach_btf_obj_fd = prog->attach_btf_obj_fd; load_attr.attach_btf_id = prog->attach_btf_id; load_attr.kern_version = kern_version; load_attr.prog_ifindex = prog->prog_ifindex; @@ -6995,11 +7007,11 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, return ret; }
-static int libbpf_find_attach_btf_id(struct bpf_program *prog); +static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id);
int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) { - int err = 0, fd, i, btf_id; + int err = 0, fd, i;
if (prog->obj->loaded) { pr_warn("prog '%s': can't load after object was loaded\n", prog->name); @@ -7010,10 +7022,14 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) prog->type == BPF_PROG_TYPE_LSM || prog->type == BPF_PROG_TYPE_EXT || prog->type == BPF_PROG_TYPE_SCHED) && !prog->attach_btf_id) { - btf_id = libbpf_find_attach_btf_id(prog); - if (btf_id <= 0) - return btf_id; - prog->attach_btf_id = btf_id; + int btf_obj_fd = 0, btf_type_id = 0; + + err = libbpf_find_attach_btf_id(prog, &btf_obj_fd, &btf_type_id); + if (err) + return err; + + prog->attach_btf_obj_fd = btf_obj_fd; + prog->attach_btf_id = btf_type_id; }
if (prog->instances.nr < 0 || !prog->instances.fds) { @@ -7528,6 +7544,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
/* clean up module BTFs */ for (i = 0; i < obj->btf_module_cnt; i++) { + close(obj->btf_modules[i].fd); btf__free(obj->btf_modules[i].btf); free(obj->btf_modules[i].name); } @@ -8891,8 +8908,8 @@ static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix, return btf__find_by_name_kind(btf, btf_type_name, kind); }
-static inline int __find_vmlinux_btf_id(struct btf *btf, const char *name, - enum bpf_attach_type attach_type) +static inline int find_attach_btf_id(struct btf *btf, const char *name, + enum bpf_attach_type attach_type) { int err;
@@ -8911,9 +8928,6 @@ static inline int __find_vmlinux_btf_id(struct btf *btf, const char *name, else err = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC);
- if (err <= 0) - pr_warn("%s is not found in vmlinux BTF\n", name); - return err; }
@@ -8929,7 +8943,10 @@ int libbpf_find_vmlinux_btf_id(const char *name, return -EINVAL; }
- err = __find_vmlinux_btf_id(btf, name, attach_type); + err = find_attach_btf_id(btf, name, attach_type); + if (err <= 0) + pr_warn("%s is not found in vmlinux BTF\n", name); + btf__free(btf); return err; } @@ -8967,11 +8984,49 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) return err; }
-static int libbpf_find_attach_btf_id(struct bpf_program *prog) +static int find_kernel_btf_id(struct bpf_object *obj, const char *attach_name, + enum bpf_attach_type attach_type, + int *btf_obj_fd, int *btf_type_id) +{ + int ret, i; + + ret = find_attach_btf_id(obj->btf_vmlinux, attach_name, attach_type); + if (ret > 0) { + *btf_obj_fd = 0; /* vmlinux BTF */ + *btf_type_id = ret; + return 0; + } + if (ret != -ENOENT) + return ret; + + ret = load_module_btfs(obj); + if (ret) + return ret; + + for (i = 0; i < obj->btf_module_cnt; i++) { + const struct module_btf *mod = &obj->btf_modules[i]; + + ret = find_attach_btf_id(mod->btf, attach_name, attach_type); + if (ret > 0) { + *btf_obj_fd = mod->fd; + *btf_type_id = ret; + return 0; + } + if (ret == -ENOENT) + continue; + + return ret; + } + + return -ESRCH; +} + +static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id) { enum bpf_attach_type attach_type = prog->expected_attach_type; __u32 attach_prog_fd = prog->attach_prog_fd; - const char *name = prog->sec_name; + const char *name = prog->sec_name, *attach_name; + const struct bpf_sec_def *sec = NULL; int i, err;
if (!name) @@ -8982,17 +9037,37 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog) continue; if (strncmp(name, section_defs[i].sec, section_defs[i].len)) continue; - if (attach_prog_fd) - err = libbpf_find_prog_btf_id(name + section_defs[i].len, - attach_prog_fd); - else - err = __find_vmlinux_btf_id(prog->obj->btf_vmlinux, - name + section_defs[i].len, - attach_type); + + sec = §ion_defs[i]; + break; + } + + if (!sec) { + pr_warn("failed to identify BTF ID based on ELF section name '%s'\n", name); + return -ESRCH; + } + attach_name = name + sec->len; + + /* BPF program's BTF ID */ + if (attach_prog_fd) { + err = libbpf_find_prog_btf_id(attach_name, attach_prog_fd); + if (err < 0) { + pr_warn("failed to find BPF program (FD %d) BTF ID for '%s': %d\n", + attach_prog_fd, attach_name, err); + return err; + } + *btf_obj_fd = 0; + *btf_type_id = err; + return 0; + } + + /* kernel/module BTF ID */ + err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id); + if (err) { + pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err); return err; } - pr_warn("failed to identify btf_id based on ELF section name '%s'\n", name); - return -ESRCH; + return 0; }
int libbpf_attach_type_by_name(const char *name, @@ -10892,6 +10967,7 @@ int bpf_program__set_attach_target(struct bpf_program *prog, return btf_id;
prog->attach_btf_id = btf_id; + prog->attach_btf_obj_fd = 0; prog->attach_prog_fd = attach_prog_fd; return 0; } diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 681073a67ae3..969d0ac592ba 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -160,6 +160,7 @@ struct bpf_prog_load_params { const char *license; __u32 kern_version; __u32 attach_prog_fd; + __u32 attach_btf_obj_fd; __u32 attach_btf_id; __u32 prog_ifindex; __u32 prog_btf_fd;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit bc9ed69c79ae7577314a24e09c5b0d1c1c314ced category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add another CO-RE relocation test for kernel module relocations. This time for tp_btf with direct memory reads.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-14-andrii@kernel.org (cherry picked from commit bc9ed69c79ae7577314a24e09c5b0d1c1c314ced) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/core_reloc.c | 3 +- .../bpf/progs/test_core_reloc_module.c | 32 ++++++++++++++++++- 2 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index e2cfbb4b4113..a9127478d5e0 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -509,7 +509,8 @@ static struct core_reloc_test_case test_cases[] = { },
/* validate we can find kernel module BTF types for relocs/attach */ - MODULES_CASE("module", "raw_tp/bpf_testmod_test_read", "bpf_testmod_test_read"), + MODULES_CASE("module_probed", "raw_tp/bpf_testmod_test_read", "bpf_testmod_test_read"), + MODULES_CASE("module_direct", "tp_btf/bpf_testmod_test_read", NULL),
/* validate BPF program can use multiple flavors to match against * single target BTF type diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_module.c b/tools/testing/selftests/bpf/progs/test_core_reloc_module.c index d1840c1a9d36..56363959f7b0 100644 --- a/tools/testing/selftests/bpf/progs/test_core_reloc_module.c +++ b/tools/testing/selftests/bpf/progs/test_core_reloc_module.c @@ -36,7 +36,7 @@ struct core_reloc_module_output { };
SEC("raw_tp/bpf_testmod_test_read") -int BPF_PROG(test_core_module, +int BPF_PROG(test_core_module_probed, struct task_struct *task, struct bpf_testmod_test_read_ctx *read_ctx) { @@ -64,3 +64,33 @@ int BPF_PROG(test_core_module,
return 0; } + +SEC("tp_btf/bpf_testmod_test_read") +int BPF_PROG(test_core_module_direct, + struct task_struct *task, + struct bpf_testmod_test_read_ctx *read_ctx) +{ + struct core_reloc_module_output *out = (void *)&data.out; + __u64 pid_tgid = bpf_get_current_pid_tgid(); + __u32 real_tgid = (__u32)(pid_tgid >> 32); + __u32 real_pid = (__u32)pid_tgid; + + if (data.my_pid_tgid != pid_tgid) + return 0; + + if (task->pid != real_pid || task->tgid != real_tgid) + return 0; + + out->len = read_ctx->len; + out->off = read_ctx->off; + + out->read_ctx_sz = bpf_core_type_size(struct bpf_testmod_test_read_ctx); + out->read_ctx_exists = bpf_core_type_exists(struct bpf_testmod_test_read_ctx); + out->buf_exists = bpf_core_field_exists(read_ctx->buf); + out->off_exists = bpf_core_field_exists(read_ctx->off); + out->len_exists = bpf_core_field_exists(read_ctx->len); + + out->comm_len = BPF_CORE_READ_STR_INTO(&out->comm, task, comm); + + return 0; +}
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit 1e38abefcfd65f3ef7b12895dfd48db80aca28da category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add new selftest checking attachment of fentry/fexit/fmod_ret (and raw tracepoint ones for completeness) BPF programs to kernel module function.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201203204634.1325171-15-andrii@kernel.org (cherry picked from commit 1e38abefcfd65f3ef7b12895dfd48db80aca28da) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/module_attach.c | 53 +++++++++++++++ .../selftests/bpf/progs/test_module_attach.c | 66 +++++++++++++++++++ 2 files changed, 119 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/module_attach.c create mode 100644 tools/testing/selftests/bpf/progs/test_module_attach.c
diff --git a/tools/testing/selftests/bpf/prog_tests/module_attach.c b/tools/testing/selftests/bpf/prog_tests/module_attach.c new file mode 100644 index 000000000000..4b65e9918764 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/module_attach.c @@ -0,0 +1,53 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ + +#include <test_progs.h> +#include "test_module_attach.skel.h" + +static int duration; + +static int trigger_module_test_read(int read_sz) +{ + int fd, err; + + fd = open("/sys/kernel/bpf_testmod", O_RDONLY); + err = -errno; + if (CHECK(fd < 0, "testmod_file_open", "failed: %d\n", err)) + return err; + + read(fd, NULL, read_sz); + close(fd); + + return 0; +} + +void test_module_attach(void) +{ + const int READ_SZ = 456; + struct test_module_attach* skel; + struct test_module_attach__bss *bss; + int err; + + skel = test_module_attach__open_and_load(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + bss = skel->bss; + + err = test_module_attach__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* trigger tracepoint */ + ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read"); + + ASSERT_EQ(bss->raw_tp_read_sz, READ_SZ, "raw_tp"); + ASSERT_EQ(bss->tp_btf_read_sz, READ_SZ, "tp_btf"); + ASSERT_EQ(bss->fentry_read_sz, READ_SZ, "fentry"); + ASSERT_EQ(bss->fexit_read_sz, READ_SZ, "fexit"); + ASSERT_EQ(bss->fexit_ret, -EIO, "fexit_tet"); + ASSERT_EQ(bss->fmod_ret_read_sz, READ_SZ, "fmod_ret"); + +cleanup: + test_module_attach__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/test_module_attach.c b/tools/testing/selftests/bpf/progs/test_module_attach.c new file mode 100644 index 000000000000..b563563df172 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_module_attach.c @@ -0,0 +1,66 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_core_read.h> +#include "../bpf_testmod/bpf_testmod.h" + +__u32 raw_tp_read_sz = 0; + +SEC("raw_tp/bpf_testmod_test_read") +int BPF_PROG(handle_raw_tp, + struct task_struct *task, struct bpf_testmod_test_read_ctx *read_ctx) +{ + raw_tp_read_sz = BPF_CORE_READ(read_ctx, len); + return 0; +} + +__u32 tp_btf_read_sz = 0; + +SEC("tp_btf/bpf_testmod_test_read") +int BPF_PROG(handle_tp_btf, + struct task_struct *task, struct bpf_testmod_test_read_ctx *read_ctx) +{ + tp_btf_read_sz = read_ctx->len; + return 0; +} + +__u32 fentry_read_sz = 0; + +SEC("fentry/bpf_testmod_test_read") +int BPF_PROG(handle_fentry, + struct file *file, struct kobject *kobj, + struct bin_attribute *bin_attr, char *buf, loff_t off, size_t len) +{ + fentry_read_sz = len; + return 0; +} + +__u32 fexit_read_sz = 0; +int fexit_ret = 0; + +SEC("fexit/bpf_testmod_test_read") +int BPF_PROG(handle_fexit, + struct file *file, struct kobject *kobj, + struct bin_attribute *bin_attr, char *buf, loff_t off, size_t len, + int ret) +{ + fexit_read_sz = len; + fexit_ret = ret; + return 0; +} + +__u32 fmod_ret_read_sz = 0; + +SEC("fmod_ret/bpf_testmod_test_read") +int BPF_PROG(handle_fmod_ret, + struct file *file, struct kobject *kobj, + struct bin_attribute *bin_attr, char *buf, loff_t off, size_t len) +{ + fmod_ret_read_sz = len; + return 0; /* don't override the exit code */ +} + +char _license[] SEC("license") = "GPL";
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit e732b538f4557cd0a856bbce3cde55d2dfef3b03 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In some modes of operation, Kbuild allows to build modules without having vmlinux image around. In such case, generation of module BTF is impossible. This patch changes the behavior to emit a warning about impossibility of generating kernel module BTF, instead of breaking the build. This is especially important for out-of-tree external module builds.
In vmlinux-less mode:
$ make clean $ make modules_prepare $ touch drivers/acpi/button.c $ make M=drivers/acpi ... CC [M] drivers/acpi/button.o MODPOST drivers/acpi/Module.symvers LD [M] drivers/acpi/button.ko BTF [M] drivers/acpi/button.ko Skipping BTF generation for drivers/acpi/button.ko due to unavailability of vmlinux ... $ readelf -S ~/linux-build/default/drivers/acpi/button.ko | grep BTF -A1 ... empty ...
Now with normal build:
$ make all ... LD [M] drivers/acpi/button.ko BTF [M] drivers/acpi/button.ko ... $ readelf -S ~/linux-build/default/drivers/acpi/button.ko | grep BTF -A1 [60] .BTF PROGBITS 0000000000000000 00029310 000000000000ab3f 0000000000000000 0 0 1
Fixes: 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole supports it") Reported-by: Bruce Allan bruce.w.allan@intel.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Cc: Jessica Yu jeyu@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Masahiro Yamada yamada.masahiro@socionext.com Link: https://lore.kernel.org/bpf/20201121070829.2612884-1-andrii@kernel.org (cherry picked from commit e732b538f4557cd0a856bbce3cde55d2dfef3b03) Signed-off-by: Wang Yufen wangyufen@huawei.com --- scripts/Makefile.modfinal | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index 02b892421f7a..d49ec001825d 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -38,7 +38,12 @@ quiet_cmd_ld_ko_o = LD [M] $@ $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
quiet_cmd_btf_ko = BTF [M] $@ - cmd_btf_ko = LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@ + cmd_btf_ko = \ + if [ -f vmlinux ]; then \ + LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@; \ + else \ + printf "Skipping BTF generation for %s due to unavailability of vmlinux\n" $@ 1>&2; \ + fi;
# Same as newer-prereqs, but allows to exclude specified extra dependencies newer_prereqs_except = $(filter-out $(PHONY) $(1),$?) @@ -49,7 +54,7 @@ if_changed_except = $(if $(call newer_prereqs_except,$(2))$(cmd-check), \ printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd, @:)
# Re-generate module BTFs if either module's .ko or vmlinux changed -$(modules): %.ko: %.o %.mod.o scripts/module.lds vmlinux FORCE +$(modules): %.ko: %.o %.mod.o scripts/module.lds $(if $(KBUILD_BUILTIN),vmlinux) FORCE +$(call if_changed_except,ld_ko_o,vmlinux) ifdef CONFIG_DEBUG_INFO_BTF_MODULES +$(if $(newer-prereqs),$(call cmd,btf_ko))
From: Wedson Almeida Filho wedsonaf@google.com
mainline inclusion from mainline-5.11-rc1 commit 59e2e27d227a0a4e7ec0e22c63ca36a5ad1ab438 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The current implementation uses a number of gotos to implement a loop and different paths within the loop, which makes the code less readable than it would be with an explicit while-loop. This patch also replaces a chain of if/if-elses keyed on the same expression with a switch statement.
No change in behaviour is intended.
Signed-off-by: Wedson Almeida Filho wedsonaf@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20201121015509.3594191-1-wedsonaf@google.com (cherry picked from commit 59e2e27d227a0a4e7ec0e22c63ca36a5ad1ab438) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 179 ++++++++++++++++++++++-------------------- 1 file changed, 95 insertions(+), 84 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8abf2b6bb27b..e8572e669a6d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8549,6 +8549,11 @@ static void init_explored_state(struct bpf_verifier_env *env, int idx) env->insn_aux_data[idx].prune_point = true; }
+enum { + DONE_EXPLORING = 0, + KEEP_EXPLORING = 1, +}; + /* t, w, e - match pseudo-code above: * t - index of current instruction * w - next instruction @@ -8561,10 +8566,10 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, int *insn_state = env->cfg.insn_state;
if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH)) - return 0; + return DONE_EXPLORING;
if (e == BRANCH && insn_state[t] >= (DISCOVERED | BRANCH)) - return 0; + return DONE_EXPLORING;
if (w < 0 || w >= env->prog->len) { verbose_linfo(env, t, "%d: ", t); @@ -8583,10 +8588,10 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, if (env->cfg.cur_stack >= env->prog->len) return -E2BIG; insn_stack[env->cfg.cur_stack++] = w; - return 1; + return KEEP_EXPLORING; } else if ((insn_state[w] & 0xF0) == DISCOVERED) { if (loop_ok && env->bpf_capable) - return 0; + return DONE_EXPLORING; verbose_linfo(env, t, "%d: ", t); verbose_linfo(env, w, "%d: ", w); verbose(env, "back-edge from insn %d to %d\n", t, w); @@ -8598,7 +8603,74 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, verbose(env, "insn state internal bug\n"); return -EFAULT; } - return 0; + return DONE_EXPLORING; +} + +/* Visits the instruction at index t and returns one of the following: + * < 0 - an error occurred + * DONE_EXPLORING - the instruction was fully explored + * KEEP_EXPLORING - there is still work to be done before it is fully explored + */ +static int visit_insn(int t, int insn_cnt, struct bpf_verifier_env *env) +{ + struct bpf_insn *insns = env->prog->insnsi; + int ret; + + /* All non-branch instructions have a single fall-through edge. */ + if (BPF_CLASS(insns[t].code) != BPF_JMP && + BPF_CLASS(insns[t].code) != BPF_JMP32) + return push_insn(t, t + 1, FALLTHROUGH, env, false); + + switch (BPF_OP(insns[t].code)) { + case BPF_EXIT: + return DONE_EXPLORING; + + case BPF_CALL: + ret = push_insn(t, t + 1, FALLTHROUGH, env, false); + if (ret) + return ret; + + if (t + 1 < insn_cnt) + init_explored_state(env, t + 1); + if (insns[t].src_reg == BPF_PSEUDO_CALL) { + init_explored_state(env, t); + ret = push_insn(t, t + insns[t].imm + 1, BRANCH, + env, false); + } + return ret; + + case BPF_JA: + if (BPF_SRC(insns[t].code) != BPF_K) + return -EINVAL; + + /* unconditional jump with single edge */ + ret = push_insn(t, t + insns[t].off + 1, FALLTHROUGH, env, + true); + if (ret) + return ret; + + /* unconditional jmp is not a good pruning point, + * but it's marked, since backtracking needs + * to record jmp history in is_state_visited(). + */ + init_explored_state(env, t + insns[t].off + 1); + /* tell verifier to check for equivalent states + * after every call and jump + */ + if (t + 1 < insn_cnt) + init_explored_state(env, t + 1); + + return ret; + + default: + /* conditional jump with two edges */ + init_explored_state(env, t); + ret = push_insn(t, t + 1, FALLTHROUGH, env, true); + if (ret) + return ret; + + return push_insn(t, t + insns[t].off + 1, BRANCH, env, true); + } }
/* non-recursive depth-first-search to detect loops in BPF program @@ -8606,11 +8678,10 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, */ static int check_cfg(struct bpf_verifier_env *env) { - struct bpf_insn *insns = env->prog->insnsi; int insn_cnt = env->prog->len; int *insn_stack, *insn_state; int ret = 0; - int i, t; + int i;
insn_state = env->cfg.insn_state = kvcalloc(insn_cnt, sizeof(int), GFP_KERNEL); if (!insn_state) @@ -8626,92 +8697,32 @@ static int check_cfg(struct bpf_verifier_env *env) insn_stack[0] = 0; /* 0 is the first instruction */ env->cfg.cur_stack = 1;
-peek_stack: - if (env->cfg.cur_stack == 0) - goto check_state; - t = insn_stack[env->cfg.cur_stack - 1]; - - if (BPF_CLASS(insns[t].code) == BPF_JMP || - BPF_CLASS(insns[t].code) == BPF_JMP32) { - u8 opcode = BPF_OP(insns[t].code); - - if (opcode == BPF_EXIT) { - goto mark_explored; - } else if (opcode == BPF_CALL) { - ret = push_insn(t, t + 1, FALLTHROUGH, env, false); - if (ret == 1) - goto peek_stack; - else if (ret < 0) - goto err_free; - if (t + 1 < insn_cnt) - init_explored_state(env, t + 1); - if (insns[t].src_reg == BPF_PSEUDO_CALL) { - init_explored_state(env, t); - ret = push_insn(t, t + insns[t].imm + 1, BRANCH, - env, false); - if (ret == 1) - goto peek_stack; - else if (ret < 0) - goto err_free; - } - } else if (opcode == BPF_JA) { - if (BPF_SRC(insns[t].code) != BPF_K) { - ret = -EINVAL; - goto err_free; - } - /* unconditional jump with single edge */ - ret = push_insn(t, t + insns[t].off + 1, - FALLTHROUGH, env, true); - if (ret == 1) - goto peek_stack; - else if (ret < 0) - goto err_free; - /* unconditional jmp is not a good pruning point, - * but it's marked, since backtracking needs - * to record jmp history in is_state_visited(). - */ - init_explored_state(env, t + insns[t].off + 1); - /* tell verifier to check for equivalent states - * after every call and jump - */ - if (t + 1 < insn_cnt) - init_explored_state(env, t + 1); - } else { - /* conditional jump with two edges */ - init_explored_state(env, t); - ret = push_insn(t, t + 1, FALLTHROUGH, env, true); - if (ret == 1) - goto peek_stack; - else if (ret < 0) - goto err_free; + while (env->cfg.cur_stack > 0) { + int t = insn_stack[env->cfg.cur_stack - 1];
- ret = push_insn(t, t + insns[t].off + 1, BRANCH, env, true); - if (ret == 1) - goto peek_stack; - else if (ret < 0) - goto err_free; - } - } else { - /* all other non-branch instructions with single - * fall-through edge - */ - ret = push_insn(t, t + 1, FALLTHROUGH, env, false); - if (ret == 1) - goto peek_stack; - else if (ret < 0) + ret = visit_insn(t, insn_cnt, env); + switch (ret) { + case DONE_EXPLORING: + insn_state[t] = EXPLORED; + env->cfg.cur_stack--; + break; + case KEEP_EXPLORING: + break; + default: + if (ret > 0) { + verbose(env, "visit_insn internal bug\n"); + ret = -EFAULT; + } goto err_free; + } }
-mark_explored: - insn_state[t] = EXPLORED; - if (env->cfg.cur_stack-- <= 0) { + if (env->cfg.cur_stack < 0) { verbose(env, "pop stack internal bug\n"); ret = -EFAULT; goto err_free; } - goto peek_stack;
-check_state: for (i = 0; i < insn_cnt; i++) { if (insn_state[i] != EXPLORED) { verbose(env, "unreachable insn %d\n", i);
From: Florian Lehner dev@der-flo.net
mainline inclusion from mainline-5.11-rc1 commit 7d17167244f5415bc6bc90f5bb0074b6d79676b4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Commit 8184d44c9a57 ("selftests/bpf: skip verifier tests for unsupported program types") added a check to skip unsupported program types. As bpf_probe_prog_type can change errno, do_single_test should save it before printing a reason why a supported BPF program type failed to load.
Fixes: 8184d44c9a57 ("selftests/bpf: skip verifier tests for unsupported program types") Signed-off-by: Florian Lehner dev@der-flo.net Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201204181828.11974-2-dev@der-flo.net (cherry picked from commit 7d17167244f5415bc6bc90f5bb0074b6d79676b4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_verifier.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 0fc813235575..0ce1138f6841 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -946,6 +946,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv, int run_errs, run_successes; int map_fds[MAX_NR_MAPS]; const char *expected_err; + int saved_errno; int fixup_skips; __u32 pflags; int i, err; @@ -1007,6 +1008,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv, }
fd_prog = bpf_load_program_xattr(&attr, bpf_vlog, sizeof(bpf_vlog)); + saved_errno = errno;
/* BPF_PROG_TYPE_TRACING requires more setup and * bpf_probe_prog_type won't give correct answer @@ -1023,7 +1025,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv, if (expected_ret == ACCEPT || expected_ret == VERBOSE_ACCEPT) { if (fd_prog < 0) { printf("FAIL\nFailed to load prog '%s'!\n", - strerror(errno)); + strerror(saved_errno)); goto fail_log; } #ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
From: Florian Lehner dev@der-flo.net
mainline inclusion from mainline-5.11-rc1 commit 5f61b7c6975b03e6ace2cfb13d415d5f475c8830 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Print a message when the returned error is about a program type being not supported or because of permission problems. These messages are expected if the program to test was actually executed.
Signed-off-by: Florian Lehner dev@der-flo.net Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201204181828.11974-3-dev@der-flo.net (cherry picked from commit 5f61b7c6975b03e6ace2cfb13d415d5f475c8830) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_verifier.c | 27 +++++++++++++++++---- 1 file changed, 22 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 0ce1138f6841..90c0428bc0e5 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -885,19 +885,36 @@ static int do_prog_test_run(int fd_prog, bool unpriv, uint32_t expected_val, __u8 tmp[TEST_DATA_LEN << 2]; __u32 size_tmp = sizeof(tmp); uint32_t retval; - int err; + int err, saved_errno;
if (unpriv) set_admin(true); err = bpf_prog_test_run(fd_prog, 1, data, size_data, tmp, &size_tmp, &retval, NULL); + saved_errno = errno; + if (unpriv) set_admin(false); - if (err && errno != 524/*ENOTSUPP*/ && errno != EPERM) { - printf("Unexpected bpf_prog_test_run error "); - return err; + + if (err) { + switch (saved_errno) { + case 524/*ENOTSUPP*/: + printf("Did not run the program (not supported) "); + return 0; + case EPERM: + if (unpriv) { + printf("Did not run the program (no permission) "); + return 0; + } + /* fallthrough; */ + default: + printf("FAIL: Unexpected bpf_prog_test_run error (%s) ", + strerror(saved_errno)); + return err; + } } - if (!err && retval != expected_val && + + if (retval != expected_val && expected_val != POINTER_VALUE) { printf("FAIL retval %d != %d ", retval, expected_val); return 1;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit a67079b03165a17f9aceab3dd26b1638af68e0fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
bpf_testmod.ko build rule declared dependency on VMLINUX_BTF, but the variable itself was initialized after the rule was declared, which often caused bpf_testmod.ko to not be re-compiled. Fix by moving VMLINUX_BTF determination sooner.
Also enforce bpf_testmod.ko recompilation when we detect that vmlinux image changed by removing bpf_testmod/bpf_testmod.ko. This is necessary to generate correct module's split BTF. Without it, Kbuild's module build logic might determine that nothing changed on the kernel side and thus bpf_testmod.ko shouldn't be rebuilt, so won't re-generate module BTF, which often leads to module's BTF with wrong string offsets against vmlinux BTF. Removing .ko file forces Kbuild to re-build the module.
Reported-by: Alexei Starovoitov ast@kernel.org Fixes: 9f7fa225894c ("selftests/bpf: Add bpf_testmod kernel module for testing") Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20201211015946.4062098-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit a67079b03165a17f9aceab3dd26b1638af68e0fc) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 729b4255343a..db52c29b073c 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -116,6 +116,16 @@ INCLUDE_DIR := $(SCRATCH_DIR)/include BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a RESOLVE_BTFIDS := $(BUILD_DIR)/resolve_btfids/resolve_btfids
+VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ + $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ + ../../../../vmlinux \ + /sys/kernel/btf/vmlinux \ + /boot/vmlinux-$(shell uname -r) +VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS)))) +ifeq ($(VMLINUX_BTF),) +$(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)") +endif + # Define simple and short `make test_progs`, `make test_sysctl`, etc targets # to build individual tests. # NOTE: Semicolon at the end is critical to override lib.mk's default static @@ -140,6 +150,7 @@ $(OUTPUT)/urandom_read: urandom_read.c
$(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch]) $(call msg,MOD,,$@) + $(Q)$(RM) bpf_testmod/bpf_testmod.ko # force re-compilation $(Q)$(MAKE) $(submake_extras) -C bpf_testmod $(Q)cp bpf_testmod/bpf_testmod.ko $@
@@ -147,16 +158,6 @@ $(OUTPUT)/test_stub.o: test_stub.c $(BPFOBJ) $(call msg,CC,,$@) $(Q)$(CC) -c $(CFLAGS) -o $@ $<
-VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ - $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ - ../../../../vmlinux \ - /sys/kernel/btf/vmlinux \ - /boot/vmlinux-$(shell uname -r) -VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS)))) -ifeq ($(VMLINUX_BTF),) -$(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)") -endif - DEFAULT_BPFTOOL := $(SCRATCH_DIR)/sbin/bpftool
$(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL)
From: Andrew Delgadillo adelg@google.com
mainline inclusion from mainline-5.11-rc1 commit 89ad7420b25c2b40a4d916f4fd43b9ccacd50500 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
LLC is meant for compiler development and debugging. Consequently, it exposes many low level options about its backend. To avoid future bugs introduced by using the raw LLC tool, use clang directly so that all appropriate options are passed to the back end.
Additionally, simplify the Makefile by removing the CLANG_NATIVE_BPF_BUILD_RULE as it is not being use, stop passing dwarfris attr since elfutils/libdw now supports the bpf backend (which should work with any recent pahole), and stop passing alu32 since -mcpu=v3 implies alu32.
Signed-off-by: Andrew Delgadillo adelg@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20201211004344.3355074-1-adelg@google.com (cherry picked from commit 89ad7420b25c2b40a4d916f4fd43b9ccacd50500) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 28 +++++----------------------- 1 file changed, 5 insertions(+), 23 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index db52c29b073c..3362a0afd2d0 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -19,7 +19,6 @@ ifneq ($(wildcard $(GENHDR)),) endif
CLANG ?= clang -LLC ?= llc LLVM_OBJCOPY ?= llvm-objcopy BPF_GCC ?= $(shell command -v bpf-gcc;) SAN_CFLAGS ?= @@ -257,31 +256,19 @@ $(OUTPUT)/flow_dissector_load.o: flow_dissector_load.h # $1 - input .c file # $2 - output .o file # $3 - CFLAGS -# $4 - LDFLAGS define CLANG_BPF_BUILD_RULE - $(call msg,CLNG-LLC,$(TRUNNER_BINARY),$2) - $(Q)($(CLANG) $3 -O2 -target bpf -emit-llvm \ - -c $1 -o - || echo "BPF obj compilation failed") | \ - $(LLC) -mattr=dwarfris -march=bpf -mcpu=v3 $4 -filetype=obj -o $2 + $(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2) + $(Q)$(CLANG) $3 -O2 -target bpf -c $1 -o $2 -mcpu=v3 endef # Similar to CLANG_BPF_BUILD_RULE, but with disabled alu32 define CLANG_NOALU32_BPF_BUILD_RULE - $(call msg,CLNG-LLC,$(TRUNNER_BINARY),$2) - $(Q)($(CLANG) $3 -O2 -target bpf -emit-llvm \ - -c $1 -o - || echo "BPF obj compilation failed") | \ - $(LLC) -march=bpf -mcpu=v2 $4 -filetype=obj -o $2 -endef -# Similar to CLANG_BPF_BUILD_RULE, but using native Clang and bpf LLC -define CLANG_NATIVE_BPF_BUILD_RULE $(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2) - $(Q)($(CLANG) $3 -O2 -emit-llvm \ - -c $1 -o - || echo "BPF obj compilation failed") | \ - $(LLC) -march=bpf -mcpu=v3 $4 -filetype=obj -o $2 + $(Q)$(CLANG) $3 -O2 -target bpf -c $1 -o $2 -mcpu=v2 endef # Build BPF object using GCC define GCC_BPF_BUILD_RULE $(call msg,GCC-BPF,$(TRUNNER_BINARY),$2) - $(Q)$(BPF_GCC) $3 $4 -O2 -c $1 -o $2 + $(Q)$(BPF_GCC) $3 -O2 -c $1 -o $2 endef
SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c @@ -337,8 +324,7 @@ $(TRUNNER_BPF_OBJS): $(TRUNNER_OUTPUT)/%.o: \ $(wildcard $(BPFDIR)/bpf_*.h) \ | $(TRUNNER_OUTPUT) $$(BPFOBJ) $$(call $(TRUNNER_BPF_BUILD_RULE),$$<,$$@, \ - $(TRUNNER_BPF_CFLAGS), \ - $(TRUNNER_BPF_LDFLAGS)) + $(TRUNNER_BPF_CFLAGS))
$(TRUNNER_BPF_SKELS): $(TRUNNER_OUTPUT)/%.skel.h: \ $(TRUNNER_OUTPUT)/%.o \ @@ -406,19 +392,16 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(wildcard progs/btf_dump_test_case_*.c) TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS) -TRUNNER_BPF_LDFLAGS := -mattr=+alu32 $(eval $(call DEFINE_TEST_RUNNER,test_progs))
# Define test_progs-no_alu32 test runner. TRUNNER_BPF_BUILD_RULE := CLANG_NOALU32_BPF_BUILD_RULE -TRUNNER_BPF_LDFLAGS := $(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32))
# Define test_progs BPF-GCC-flavored test runner. ifneq ($(BPF_GCC),) TRUNNER_BPF_BUILD_RULE := GCC_BPF_BUILD_RULE TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(call get_sys_includes,gcc) -TRUNNER_BPF_LDFLAGS := $(eval $(call DEFINE_TEST_RUNNER,test_progs,bpf_gcc)) endif
@@ -429,7 +412,6 @@ TRUNNER_EXTRA_SOURCES := test_maps.c TRUNNER_EXTRA_FILES := TRUNNER_BPF_BUILD_RULE := $$(error no BPF objects should be built) TRUNNER_BPF_CFLAGS := -TRUNNER_BPF_LDFLAGS := $(eval $(call DEFINE_TEST_RUNNER,test_maps))
# Define test_verifier test runner.
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc1 commit fe62de310e2b563c0d303a09d06b020077fe86b4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Support finding kernel targets in kernel modules when using bpf_program__set_attach_target() API. This brings it up to par with what libbpf supports when doing declarative SEC()-based target determination.
Some minor internal refactoring was needed to make sure vmlinux BTF can be loaded before bpf_object's load phase.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20201211215825.3646154-2-andrii@kernel.org (cherry picked from commit fe62de310e2b563c0d303a09d06b020077fe86b4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 64 ++++++++++++++++++++++++++---------------- 1 file changed, 40 insertions(+), 24 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index fdf8441ec8a0..a0a1b7c99f2f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2519,7 +2519,7 @@ static int bpf_object__finalize_btf(struct bpf_object *obj) return 0; }
-static inline bool libbpf_prog_needs_vmlinux_btf(struct bpf_program *prog) +static bool prog_needs_vmlinux_btf(struct bpf_program *prog) { if (prog->type == BPF_PROG_TYPE_STRUCT_OPS || prog->type == BPF_PROG_TYPE_LSM || @@ -2535,37 +2535,43 @@ static inline bool libbpf_prog_needs_vmlinux_btf(struct bpf_program *prog) return false; }
-static int bpf_object__load_vmlinux_btf(struct bpf_object *obj) +static bool obj_needs_vmlinux_btf(const struct bpf_object *obj) { - bool need_vmlinux_btf = false; struct bpf_program *prog; - int i, err; + int i;
/* CO-RE relocations need kernel BTF */ if (obj->btf_ext && obj->btf_ext->core_relo_info.len) - need_vmlinux_btf = true; + return true;
/* Support for typed ksyms needs kernel BTF */ for (i = 0; i < obj->nr_extern; i++) { const struct extern_desc *ext;
ext = &obj->externs[i]; - if (ext->type == EXT_KSYM && ext->ksym.type_id) { - need_vmlinux_btf = true; - break; - } + if (ext->type == EXT_KSYM && ext->ksym.type_id) + return true; }
bpf_object__for_each_program(prog, obj) { if (!prog->load) continue; - if (libbpf_prog_needs_vmlinux_btf(prog)) { - need_vmlinux_btf = true; - break; - } + if (prog_needs_vmlinux_btf(prog)) + return true; }
- if (!need_vmlinux_btf) + return false; +} + +static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force) +{ + int err; + + /* btf_vmlinux could be loaded earlier */ + if (obj->btf_vmlinux) + return 0; + + if (!force && !obj_needs_vmlinux_btf(obj)) return 0;
obj->btf_vmlinux = libbpf_find_kernel_btf(); @@ -7533,7 +7539,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) }
err = bpf_object__probe_loading(obj); - err = err ? : bpf_object__load_vmlinux_btf(obj); + err = err ? : bpf_object__load_vmlinux_btf(obj, false); err = err ? : bpf_object__resolve_externs(obj, obj->kconfig); err = err ? : bpf_object__sanitize_and_load_btf(obj); err = err ? : bpf_object__sanitize_maps(obj); @@ -10951,23 +10957,33 @@ int bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd, const char *attach_func_name) { - int btf_id; + int btf_obj_fd = 0, btf_id = 0, err;
if (!prog || attach_prog_fd < 0 || !attach_func_name) return -EINVAL;
- if (attach_prog_fd) + if (prog->obj->loaded) + return -EINVAL; + + if (attach_prog_fd) { btf_id = libbpf_find_prog_btf_id(attach_func_name, attach_prog_fd); - else - btf_id = libbpf_find_vmlinux_btf_id(attach_func_name, - prog->expected_attach_type); - - if (btf_id < 0) - return btf_id; + if (btf_id < 0) + return btf_id; + } else { + /* load btf_vmlinux, if not yet */ + err = bpf_object__load_vmlinux_btf(prog->obj, true); + if (err) + return err; + err = find_kernel_btf_id(prog->obj, attach_func_name, + prog->expected_attach_type, + &btf_obj_fd, &btf_id); + if (err) + return err; + }
prog->attach_btf_id = btf_id; - prog->attach_btf_obj_fd = 0; + prog->attach_btf_obj_fd = btf_obj_fd; prog->attach_prog_fd = attach_prog_fd; return 0; }
From: Jiri Olsa jolsa@kernel.org
mainline inclusion from mainline-5.11-rc3 commit 67208692802ce3cacfa00fe586dc0cb1bef0a51c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The kernel image can contain multiple types (structs/unions) with the same name. This causes distinct type hierarchies in BTF data and makes resolve_btfids fail with error like:
BTFIDS vmlinux FAILED unresolved symbol udp6_sock
as reported by Qais Yousef [1].
This change adds warning when multiple types of the same name are detected:
BTFIDS vmlinux WARN: multiple IDs found for 'file': 526, 113351 - using 526 WARN: multiple IDs found for 'sk_buff': 2744, 113958 - using 2744
We keep the lower ID for the given type instance and let the build continue.
Also changing the 'nr' variable name to 'nr_types' to avoid confusion.
[1] https://lore.kernel.org/lkml/20201229151352.6hzmjvu3qh6p2qgg@e107158-lin/
Signed-off-by: Jiri Olsa jolsa@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210105234219.970039-1-jolsa@kernel.org (cherry picked from commit 67208692802ce3cacfa00fe586dc0cb1bef0a51c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/resolve_btfids/main.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c index 87bded641afb..7ce4558a932f 100644 --- a/tools/bpf/resolve_btfids/main.c +++ b/tools/bpf/resolve_btfids/main.c @@ -139,6 +139,8 @@ int eprintf(int level, int var, const char *fmt, ...) #define pr_debug2(fmt, ...) pr_debugN(2, pr_fmt(fmt), ##__VA_ARGS__) #define pr_err(fmt, ...) \ eprintf(0, verbose, pr_fmt(fmt), ##__VA_ARGS__) +#define pr_info(fmt, ...) \ + eprintf(0, verbose, pr_fmt(fmt), ##__VA_ARGS__)
static bool is_btf_id(const char *name) { @@ -477,7 +479,7 @@ static int symbols_resolve(struct object *obj) int nr_funcs = obj->nr_funcs; int err, type_id; struct btf *btf; - __u32 nr; + __u32 nr_types;
btf = btf__parse(obj->btf ?: obj->path, NULL); err = libbpf_get_error(btf); @@ -488,12 +490,12 @@ static int symbols_resolve(struct object *obj) }
err = -1; - nr = btf__get_nr_types(btf); + nr_types = btf__get_nr_types(btf);
/* * Iterate all the BTF types and search for collected symbol IDs. */ - for (type_id = 1; type_id <= nr; type_id++) { + for (type_id = 1; type_id <= nr_types; type_id++) { const struct btf_type *type; struct rb_root *root; struct btf_id *id; @@ -531,8 +533,13 @@ static int symbols_resolve(struct object *obj)
id = btf_id__find(root, str); if (id) { - id->id = type_id; - (*nr)--; + if (id->id) { + pr_info("WARN: multiple IDs found for '%s': %d, %d - using %d\n", + str, id->id, type_id, id->id); + } else { + id->id = type_id; + (*nr)--; + } } }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc3 commit 11b844b0b7c7c3dc8e8f4d0bbaad5e798351862c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked") introduced a possibility of getting EBUSY error on lock contention, which seems to happen very deterministically in test_maps when running 1024 threads on low-CPU machine. In libbpf CI case, it's a 2 CPU VM and it's hitting this 100% of the time. Work around by retrying on EBUSY (and EAGAIN, while we are at it) after a small sleep. sched_yield() is too agressive and fails even after 20 retries, so I went with usleep(1) for backoff.
Also log actual error returned to make it easier to see what's going on.
Fixes: 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201223200652.3417075-1-andrii@kernel.org (cherry picked from commit 11b844b0b7c7c3dc8e8f4d0bbaad5e798351862c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_maps.c | 48 +++++++++++++++++++++---- 1 file changed, 42 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c index 179e680e8d13..20eb7301850e 100644 --- a/tools/testing/selftests/bpf/test_maps.c +++ b/tools/testing/selftests/bpf/test_maps.c @@ -1311,22 +1311,58 @@ static void test_map_stress(void) #define DO_UPDATE 1 #define DO_DELETE 0
+#define MAP_RETRIES 20 + +static int map_update_retriable(int map_fd, const void *key, const void *value, + int flags, int attempts) +{ + while (bpf_map_update_elem(map_fd, key, value, flags)) { + if (!attempts || (errno != EAGAIN && errno != EBUSY)) + return -errno; + + usleep(1); + attempts--; + } + + return 0; +} + +static int map_delete_retriable(int map_fd, const void *key, int attempts) +{ + while (bpf_map_delete_elem(map_fd, key)) { + if (!attempts || (errno != EAGAIN && errno != EBUSY)) + return -errno; + + usleep(1); + attempts--; + } + + return 0; +} + static void test_update_delete(unsigned int fn, void *data) { int do_update = ((int *)data)[1]; int fd = ((int *)data)[0]; - int i, key, value; + int i, key, value, err;
for (i = fn; i < MAP_SIZE; i += TASKS) { key = value = i;
if (do_update) { - assert(bpf_map_update_elem(fd, &key, &value, - BPF_NOEXIST) == 0); - assert(bpf_map_update_elem(fd, &key, &value, - BPF_EXIST) == 0); + err = map_update_retriable(fd, &key, &value, BPF_NOEXIST, MAP_RETRIES); + if (err) + printf("error %d %d\n", err, errno); + assert(err == 0); + err = map_update_retriable(fd, &key, &value, BPF_EXIST, MAP_RETRIES); + if (err) + printf("error %d %d\n", err, errno); + assert(err == 0); } else { - assert(bpf_map_delete_elem(fd, &key) == 0); + err = map_delete_retriable(fd, &key, MAP_RETRIES); + if (err) + printf("error %d %d\n", err, errno); + assert(err == 0); } } }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.11-rc5 commit bcc5e6162d66d44f7929f30fce032f95855fc8b4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Some modules don't declare any new types and end up with an empty BTF, containing only valid BTF header and no types or strings sections. This currently causes BTF validation error. There is nothing wrong with such BTF, so fix the issue by allowing module BTFs with no types or strings.
Fixes: 36e68442d1af ("bpf: Load and verify kernel module BTFs") Reported-by: Christopher William Snowhill chris@kode54.net Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210110070341.1380086-1-andrii@kernel.org (cherry picked from commit bcc5e6162d66d44f7929f30fce032f95855fc8b4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 3edf141cc146..86e062b3e32a 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4172,7 +4172,7 @@ static int btf_parse_hdr(struct btf_verifier_env *env) return -ENOTSUPP; }
- if (btf_data_size == hdr->hdr_len) { + if (!btf->base_btf && btf_data_size == hdr->hdr_len) { btf_verifier_log(env, "No data"); return -EINVAL; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 792001f4f7aa036b1f1c1ed7bce44bb49126208a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BPF_CORE_READ_USER(), BPF_CORE_READ_USER_STR() and their _INTO() variations to allow reading CO-RE-relocatable kernel data structures from the user-space. One of such cases is reading input arguments of syscalls, while reaping the benefits of CO-RE relocations w.r.t. handling 32/64 bit conversions and handling missing/new fields in UAPI data structs.
Suggested-by: Gilad Reti gilad.reti@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20201218235614.2284956-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit 792001f4f7aa036b1f1c1ed7bce44bb49126208a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_core_read.h | 98 +++++++++++++++++++++-------------- 1 file changed, 59 insertions(+), 39 deletions(-)
diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h index f05cfc082915..a2d75ad8c706 100644 --- a/tools/lib/bpf/bpf_core_read.h +++ b/tools/lib/bpf/bpf_core_read.h @@ -203,17 +203,20 @@ enum bpf_enum_value_kind { * (local) BTF, used to record relocation. */ #define bpf_core_read(dst, sz, src) \ - bpf_probe_read_kernel(dst, sz, \ - (const void *)__builtin_preserve_access_index(src)) + bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))
+#define bpf_core_read_user(dst, sz, src) \ + bpf_probe_read_user(dst, sz, (const void *)__builtin_preserve_access_index(src)) /* * bpf_core_read_str() is a thin wrapper around bpf_probe_read_str() * additionally emitting BPF CO-RE field relocation for specified source * argument. */ #define bpf_core_read_str(dst, sz, src) \ - bpf_probe_read_kernel_str(dst, sz, \ - (const void *)__builtin_preserve_access_index(src)) + bpf_probe_read_kernel_str(dst, sz, (const void *)__builtin_preserve_access_index(src)) + +#define bpf_core_read_user_str(dst, sz, src) \ + bpf_probe_read_user_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
#define ___concat(a, b) a ## b #define ___apply(fn, n) ___concat(fn, n) @@ -272,30 +275,29 @@ enum bpf_enum_value_kind { read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)
/* "recursively" read a sequence of inner pointers using local __t var */ -#define ___rd_first(src, a) ___read(bpf_core_read, &__t, ___type(src), src, a); -#define ___rd_last(...) \ - ___read(bpf_core_read, &__t, \ - ___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__)); -#define ___rd_p1(...) const void *__t; ___rd_first(__VA_ARGS__) -#define ___rd_p2(...) ___rd_p1(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p3(...) ___rd_p2(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p4(...) ___rd_p3(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p5(...) ___rd_p4(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p6(...) ___rd_p5(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p7(...) ___rd_p6(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p8(...) ___rd_p7(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___rd_p9(...) ___rd_p8(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__) -#define ___read_ptrs(src, ...) \ - ___apply(___rd_p, ___narg(__VA_ARGS__))(src, __VA_ARGS__) - -#define ___core_read0(fn, dst, src, a) \ +#define ___rd_first(fn, src, a) ___read(fn, &__t, ___type(src), src, a); +#define ___rd_last(fn, ...) \ + ___read(fn, &__t, ___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__)); +#define ___rd_p1(fn, ...) const void *__t; ___rd_first(fn, __VA_ARGS__) +#define ___rd_p2(fn, ...) ___rd_p1(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p3(fn, ...) ___rd_p2(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p4(fn, ...) ___rd_p3(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p5(fn, ...) ___rd_p4(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p6(fn, ...) ___rd_p5(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p7(fn, ...) ___rd_p6(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p8(fn, ...) ___rd_p7(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___rd_p9(fn, ...) ___rd_p8(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__) +#define ___read_ptrs(fn, src, ...) \ + ___apply(___rd_p, ___narg(__VA_ARGS__))(fn, src, __VA_ARGS__) + +#define ___core_read0(fn, fn_ptr, dst, src, a) \ ___read(fn, dst, ___type(src), src, a); -#define ___core_readN(fn, dst, src, ...) \ - ___read_ptrs(src, ___nolast(__VA_ARGS__)) \ +#define ___core_readN(fn, fn_ptr, dst, src, ...) \ + ___read_ptrs(fn_ptr, src, ___nolast(__VA_ARGS__)) \ ___read(fn, dst, ___type(src, ___nolast(__VA_ARGS__)), __t, \ ___last(__VA_ARGS__)); -#define ___core_read(fn, dst, src, a, ...) \ - ___apply(___core_read, ___empty(__VA_ARGS__))(fn, dst, \ +#define ___core_read(fn, fn_ptr, dst, src, a, ...) \ + ___apply(___core_read, ___empty(__VA_ARGS__))(fn, fn_ptr, dst, \ src, a, ##__VA_ARGS__)
/* @@ -303,20 +305,32 @@ enum bpf_enum_value_kind { * BPF_CORE_READ(), in which final field is read into user-provided storage. * See BPF_CORE_READ() below for more details on general usage. */ -#define BPF_CORE_READ_INTO(dst, src, a, ...) \ - ({ \ - ___core_read(bpf_core_read, dst, (src), a, ##__VA_ARGS__) \ - }) +#define BPF_CORE_READ_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_core_read, bpf_core_read, \ + dst, (src), a, ##__VA_ARGS__) \ +}) + +/* Variant of BPF_CORE_READ_INTO() for reading from user-space memory */ +#define BPF_CORE_READ_USER_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_core_read_user, bpf_core_read_user, \ + dst, (src), a, ##__VA_ARGS__) \ +})
/* * BPF_CORE_READ_STR_INTO() does same "pointer chasing" as * BPF_CORE_READ() for intermediate pointers, but then executes (and returns * corresponding error code) bpf_core_read_str() for final string read. */ -#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) \ - ({ \ - ___core_read(bpf_core_read_str, dst, (src), a, ##__VA_ARGS__)\ - }) +#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_core_read_str, bpf_core_read, \ + dst, (src), a, ##__VA_ARGS__) \ +}) + +/* Variant of BPF_CORE_READ_STR_INTO() for reading from user-space memory */ +#define BPF_CORE_READ_USER_STR_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_core_read_user_str, bpf_core_read_user, \ + dst, (src), a, ##__VA_ARGS__) \ +})
/* * BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially @@ -342,12 +356,18 @@ enum bpf_enum_value_kind { * N.B. Only up to 9 "field accessors" are supported, which should be more * than enough for any practical purpose. */ -#define BPF_CORE_READ(src, a, ...) \ - ({ \ - ___type((src), a, ##__VA_ARGS__) __r; \ - BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \ - __r; \ - }) +#define BPF_CORE_READ(src, a, ...) ({ \ + ___type((src), a, ##__VA_ARGS__) __r; \ + BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \ + __r; \ +}) + +/* Variant of BPF_CORE_READ() for reading from user-space memory */ +#define BPF_CORE_READ_USER(src, a, ...) ({ \ + ___type((src), a, ##__VA_ARGS__) __r; \ + BPF_CORE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \ + __r; \ +})
#endif
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit a4b09a9ef9451b09d87550720f8db3ae280b3eea category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
BPF_CORE_READ(), in addition to handling CO-RE relocations, also allows much nicer way to read data structures with nested pointers. Instead of writing a sequence of bpf_probe_read() calls to follow links, one can just write BPF_CORE_READ(a, b, c, d) to effectively do a->b->c->d read. This is a welcome ability when porting BCC code, which (in most cases) allows exactly the intuitive a->b->c->d variant.
This patch adds non-CO-RE variants of BPF_CORE_READ() family of macros for cases where CO-RE is not supported (e.g., old kernels). In such cases, the property of shortening a sequence of bpf_probe_read()s to a simple BPF_PROBE_READ(a, b, c, d) invocation is still desirable, especially when porting BCC code to libbpf. Yet, no CO-RE relocation is going to be emitted.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20201218235614.2284956-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit a4b09a9ef9451b09d87550720f8db3ae280b3eea) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_core_read.h | 38 +++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+)
diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h index a2d75ad8c706..c032f435ce35 100644 --- a/tools/lib/bpf/bpf_core_read.h +++ b/tools/lib/bpf/bpf_core_read.h @@ -316,6 +316,18 @@ enum bpf_enum_value_kind { dst, (src), a, ##__VA_ARGS__) \ })
+/* Non-CO-RE variant of BPF_CORE_READ_INTO() */ +#define BPF_PROBE_READ_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_probe_read, bpf_probe_read, \ + dst, (src), a, ##__VA_ARGS__) \ +}) + +/* Non-CO-RE variant of BPF_CORE_READ_USER_INTO() */ +#define BPF_PROBE_READ_USER_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_probe_read_user, bpf_probe_read_user, \ + dst, (src), a, ##__VA_ARGS__) \ +}) + /* * BPF_CORE_READ_STR_INTO() does same "pointer chasing" as * BPF_CORE_READ() for intermediate pointers, but then executes (and returns @@ -332,6 +344,18 @@ enum bpf_enum_value_kind { dst, (src), a, ##__VA_ARGS__) \ })
+/* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */ +#define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_probe_read_str, bpf_probe_read, \ + dst, (src), a, ##__VA_ARGS__) \ +}) + +/* Non-CO-RE variant of BPF_CORE_READ_USER_STR_INTO() */ +#define BPF_PROBE_READ_USER_STR_INTO(dst, src, a, ...) ({ \ + ___core_read(bpf_probe_read_user_str, bpf_probe_read_user, \ + dst, (src), a, ##__VA_ARGS__) \ +}) + /* * BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially * when there are few pointer chasing steps. @@ -369,5 +393,19 @@ enum bpf_enum_value_kind { __r; \ })
+/* Non-CO-RE variant of BPF_CORE_READ() */ +#define BPF_PROBE_READ(src, a, ...) ({ \ + ___type((src), a, ##__VA_ARGS__) __r; \ + BPF_PROBE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \ + __r; \ +}) + +/* Non-CO-RE variant of BPF_CORE_READ_USER() */ +#define BPF_PROBE_READ_USER(src, a, ...) ({ \ + ___type((src), a, ##__VA_ARGS__) __r; \ + BPF_PROBE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \ + __r; \ +}) + #endif
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 9e80114b1a271803767d4c5baa11ea9e8678aac3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add selftests validating that newly added variations of BPF_CORE_READ(), for use with user-space addresses and for non-CO-RE reads, work as expected.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20201218235614.2284956-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit 9e80114b1a271803767d4c5baa11ea9e8678aac3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../bpf/prog_tests/core_read_macros.c | 64 +++++++++++++++++++ .../bpf/progs/test_core_read_macros.c | 50 +++++++++++++++ 2 files changed, 114 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/core_read_macros.c create mode 100644 tools/testing/selftests/bpf/progs/test_core_read_macros.c
diff --git a/tools/testing/selftests/bpf/prog_tests/core_read_macros.c b/tools/testing/selftests/bpf/prog_tests/core_read_macros.c new file mode 100644 index 000000000000..96f5cf3c6fa2 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/core_read_macros.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Facebook */ + +#include <test_progs.h> + +struct callback_head { + struct callback_head *next; + void (*func)(struct callback_head *head); +}; + +/* ___shuffled flavor is just an illusion for BPF code, it doesn't really + * exist and user-space needs to provide data in the memory layout that + * matches callback_head. We just defined ___shuffled flavor to make it easier + * to work with the skeleton + */ +struct callback_head___shuffled { + struct callback_head___shuffled *next; + void (*func)(struct callback_head *head); +}; + +#include "test_core_read_macros.skel.h" + +void test_core_read_macros(void) +{ + int duration = 0, err; + struct test_core_read_macros* skel; + struct test_core_read_macros__bss *bss; + struct callback_head u_probe_in; + struct callback_head___shuffled u_core_in; + + skel = test_core_read_macros__open_and_load(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + bss = skel->bss; + bss->my_pid = getpid(); + + /* next pointers have to be set from the kernel side */ + bss->k_probe_in.func = (void *)(long)0x1234; + bss->k_core_in.func = (void *)(long)0xabcd; + + u_probe_in.next = &u_probe_in; + u_probe_in.func = (void *)(long)0x5678; + bss->u_probe_in = &u_probe_in; + + u_core_in.next = &u_core_in; + u_core_in.func = (void *)(long)0xdbca; + bss->u_core_in = &u_core_in; + + err = test_core_read_macros__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* trigger tracepoint */ + usleep(1); + + ASSERT_EQ(bss->k_probe_out, 0x1234, "k_probe_out"); + ASSERT_EQ(bss->k_core_out, 0xabcd, "k_core_out"); + + ASSERT_EQ(bss->u_probe_out, 0x5678, "u_probe_out"); + ASSERT_EQ(bss->u_core_out, 0xdbca, "u_core_out"); + +cleanup: + test_core_read_macros__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/test_core_read_macros.c b/tools/testing/selftests/bpf/progs/test_core_read_macros.c new file mode 100644 index 000000000000..fd54caa17319 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_core_read_macros.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020 Facebook + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> + +char _license[] SEC("license") = "GPL"; + +/* shuffled layout for relocatable (CO-RE) reads */ +struct callback_head___shuffled { + void (*func)(struct callback_head___shuffled *head); + struct callback_head___shuffled *next; +}; + +struct callback_head k_probe_in = {}; +struct callback_head___shuffled k_core_in = {}; + +struct callback_head *u_probe_in = 0; +struct callback_head___shuffled *u_core_in = 0; + +long k_probe_out = 0; +long u_probe_out = 0; + +long k_core_out = 0; +long u_core_out = 0; + +int my_pid = 0; + +SEC("raw_tracepoint/sys_enter") +int handler(void *ctx) +{ + int pid = bpf_get_current_pid_tgid() >> 32; + + if (my_pid != pid) + return 0; + + /* next pointers for kernel address space have to be initialized from + * BPF side, user-space mmaped addresses are stil user-space addresses + */ + k_probe_in.next = &k_probe_in; + __builtin_preserve_access_index(({k_core_in.next = &k_core_in;})); + + k_probe_out = (long)BPF_PROBE_READ(&k_probe_in, next, next, func); + k_core_out = (long)BPF_CORE_READ(&k_core_in, next, next, func); + u_probe_out = (long)BPF_PROBE_READ_USER(u_probe_in, next, next, func); + u_core_out = (long)BPF_CORE_READ_USER(u_core_in, next, next, func); + + return 0; +}
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit e22d7f05e445165e58feddb4e40cc9c0f94453bc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add comments clarifying that USER variants of CO-RE reading macro are still only going to work with kernel types, defined in kernel or kernel module BTF. This should help preventing invalid use of those macro to read user-defined types (which doesn't work with CO-RE).
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210108194408.3468860-1-andrii@kernel.org (cherry picked from commit e22d7f05e445165e58feddb4e40cc9c0f94453bc) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_core_read.h | 45 ++++++++++++++++++++++++++++++----- 1 file changed, 39 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h index c032f435ce35..e4aa9996a550 100644 --- a/tools/lib/bpf/bpf_core_read.h +++ b/tools/lib/bpf/bpf_core_read.h @@ -205,6 +205,7 @@ enum bpf_enum_value_kind { #define bpf_core_read(dst, sz, src) \ bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))
+/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */ #define bpf_core_read_user(dst, sz, src) \ bpf_probe_read_user(dst, sz, (const void *)__builtin_preserve_access_index(src)) /* @@ -215,6 +216,7 @@ enum bpf_enum_value_kind { #define bpf_core_read_str(dst, sz, src) \ bpf_probe_read_kernel_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
+/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */ #define bpf_core_read_user_str(dst, sz, src) \ bpf_probe_read_user_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
@@ -310,7 +312,11 @@ enum bpf_enum_value_kind { dst, (src), a, ##__VA_ARGS__) \ })
-/* Variant of BPF_CORE_READ_INTO() for reading from user-space memory */ +/* + * Variant of BPF_CORE_READ_INTO() for reading from user-space memory. + * + * NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. + */ #define BPF_CORE_READ_USER_INTO(dst, src, a, ...) ({ \ ___core_read(bpf_core_read_user, bpf_core_read_user, \ dst, (src), a, ##__VA_ARGS__) \ @@ -322,7 +328,11 @@ enum bpf_enum_value_kind { dst, (src), a, ##__VA_ARGS__) \ })
-/* Non-CO-RE variant of BPF_CORE_READ_USER_INTO() */ +/* Non-CO-RE variant of BPF_CORE_READ_USER_INTO(). + * + * As no CO-RE relocations are emitted, source types can be arbitrary and are + * not restricted to kernel types only. + */ #define BPF_PROBE_READ_USER_INTO(dst, src, a, ...) ({ \ ___core_read(bpf_probe_read_user, bpf_probe_read_user, \ dst, (src), a, ##__VA_ARGS__) \ @@ -338,7 +348,11 @@ enum bpf_enum_value_kind { dst, (src), a, ##__VA_ARGS__) \ })
-/* Variant of BPF_CORE_READ_STR_INTO() for reading from user-space memory */ +/* + * Variant of BPF_CORE_READ_STR_INTO() for reading from user-space memory. + * + * NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. + */ #define BPF_CORE_READ_USER_STR_INTO(dst, src, a, ...) ({ \ ___core_read(bpf_core_read_user_str, bpf_core_read_user, \ dst, (src), a, ##__VA_ARGS__) \ @@ -350,7 +364,12 @@ enum bpf_enum_value_kind { dst, (src), a, ##__VA_ARGS__) \ })
-/* Non-CO-RE variant of BPF_CORE_READ_USER_STR_INTO() */ +/* + * Non-CO-RE variant of BPF_CORE_READ_USER_STR_INTO(). + * + * As no CO-RE relocations are emitted, source types can be arbitrary and are + * not restricted to kernel types only. + */ #define BPF_PROBE_READ_USER_STR_INTO(dst, src, a, ...) ({ \ ___core_read(bpf_probe_read_user_str, bpf_probe_read_user, \ dst, (src), a, ##__VA_ARGS__) \ @@ -386,7 +405,16 @@ enum bpf_enum_value_kind { __r; \ })
-/* Variant of BPF_CORE_READ() for reading from user-space memory */ +/* + * Variant of BPF_CORE_READ() for reading from user-space memory. + * + * NOTE: all the source types involved are still *kernel types* and need to + * exist in kernel (or kernel module) BTF, otherwise CO-RE relocation will + * fail. Custom user types are not relocatable with CO-RE. + * The typical situation in which BPF_CORE_READ_USER() might be used is to + * read kernel UAPI types from the user-space memory passed in as a syscall + * input argument. + */ #define BPF_CORE_READ_USER(src, a, ...) ({ \ ___type((src), a, ##__VA_ARGS__) __r; \ BPF_CORE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \ @@ -400,7 +428,12 @@ enum bpf_enum_value_kind { __r; \ })
-/* Non-CO-RE variant of BPF_CORE_READ_USER() */ +/* + * Non-CO-RE variant of BPF_CORE_READ_USER(). + * + * As no CO-RE relocations are emitted, source types can be arbitrary and are + * not restricted to kernel types only. + */ #define BPF_PROBE_READ_USER(src, a, ...) ({ \ ___type((src), a, ##__VA_ARGS__) __r; \ BPF_PROBE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit c6458e72f6fd6ac7e390da0d9abe8446084886e5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When the buffer is too small to contain the input string, these helpers return the length of the buffer, not the length of the original string. This tries to make the docs totally clear about that, since "the length of the [copied ]string" could also refer to the length of the input.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: KP Singh kpsingh@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210112123422.2011234-1-jackmanb@google.com (cherry picked from commit c6458e72f6fd6ac7e390da0d9abe8446084886e5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 10 +++++----- tools/include/uapi/linux/bpf.h | 10 +++++----- 2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 241f26f52a64..04f78c6cd567 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -2994,10 +2994,10 @@ union bpf_attr { * string length is larger than *size*, just *size*-1 bytes are * copied and the last byte is set to NUL. * - * On success, the length of the copied string is returned. This - * makes this helper useful in tracing programs for reading - * strings, and more importantly to get its length at runtime. See - * the following snippet: + * On success, returns the number of bytes that were written, + * including the terminal NUL. This makes this helper useful in + * tracing programs for reading strings, and more importantly to + * get its length at runtime. See the following snippet: * * :: * @@ -3025,7 +3025,7 @@ union bpf_attr { * **->mm->env_start**: using this helper and the return value, * one can quickly iterate at the right offset of the memory area. * Return - * On success, the strictly positive length of the string, + * On success, the strictly positive length of the output string, * including the trailing NUL character. On error, a negative * value. * diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 9d885ea960f0..acdc91cd5d6b 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -2994,10 +2994,10 @@ union bpf_attr { * string length is larger than *size*, just *size*-1 bytes are * copied and the last byte is set to NUL. * - * On success, the length of the copied string is returned. This - * makes this helper useful in tracing programs for reading - * strings, and more importantly to get its length at runtime. See - * the following snippet: + * On success, returns the number of bytes that were written, + * including the terminal NUL. This makes this helper useful in + * tracing programs for reading strings, and more importantly to + * get its length at runtime. See the following snippet: * * :: * @@ -3025,7 +3025,7 @@ union bpf_attr { * **->mm->env_start**: using this helper and the return value, * one can quickly iterate at the right offset of the memory area. * Return - * On success, the strictly positive length of the string, + * On success, the strictly positive length of the output string, * including the trailing NUL character. On error, a negative * value. *
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 635599bace259a2c42741c3ea61bfa7be6f15556 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
If some of the subtests use module BTFs through ksyms, they will cause bpf_prog to take a refcount on bpf_testmod module, which will prevent it from successfully unloading. Module's refcnt is decremented when bpf_prog is freed, which generally happens in RCU callback. So we need to trigger syncronize_rcu() in the kernel, which can be achieved nicely with membarrier(MEMBARRIER_CMD_SHARED) or membarrier(MEMBARRIER_CMD_GLOBAL) syscall. So do that in kernel_sync_rcu() and make it available to other test inside the test_progs. This synchronize_rcu() is called before attempting to unload bpf_testmod.
Fixes: 9f7fa225894c ("selftests/bpf: Add bpf_testmod kernel module for testing") Suggested-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Acked-by: Hao Luo haoluo@google.com Link: https://lore.kernel.org/bpf/20210112075520.4103414-5-andrii@kernel.org (cherry picked from commit 635599bace259a2c42741c3ea61bfa7be6f15556) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/btf_map_in_map.c | 33 ------------------- tools/testing/selftests/bpf/test_progs.c | 11 +++++++ tools/testing/selftests/bpf/test_progs.h | 1 + 3 files changed, 12 insertions(+), 33 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c index 76ebe4c250f1..eb90a6b8850d 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c @@ -20,39 +20,6 @@ static __u32 bpf_map_id(struct bpf_map *map) return info.id; }
-/* - * Trigger synchronize_rcu() in kernel. - * - * ARRAY_OF_MAPS/HASH_OF_MAPS lookup/update operations trigger synchronize_rcu() - * if looking up an existing non-NULL element or updating the map with a valid - * inner map FD. Use this fact to trigger synchronize_rcu(): create map-in-map, - * create a trivial ARRAY map, update map-in-map with ARRAY inner map. Then - * cleanup. At the end, at least one synchronize_rcu() would be called. - */ -static int kern_sync_rcu(void) -{ - int inner_map_fd, outer_map_fd, err, zero = 0; - - inner_map_fd = bpf_create_map(BPF_MAP_TYPE_ARRAY, 4, 4, 1, 0); - if (CHECK(inner_map_fd < 0, "inner_map_create", "failed %d\n", -errno)) - return -1; - - outer_map_fd = bpf_create_map_in_map(BPF_MAP_TYPE_ARRAY_OF_MAPS, NULL, - sizeof(int), inner_map_fd, 1, 0); - if (CHECK(outer_map_fd < 0, "outer_map_create", "failed %d\n", -errno)) { - close(inner_map_fd); - return -1; - } - - err = bpf_map_update_elem(outer_map_fd, &zero, &inner_map_fd, 0); - if (err) - err = -errno; - CHECK(err, "outer_map_update", "failed %d\n", err); - close(inner_map_fd); - close(outer_map_fd); - return err; -} - static void test_lookup_update(void) { int map1_fd, map2_fd, map3_fd, map4_fd, map5_fd, map1_id, map2_id; diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 41221f78e074..fe1367071846 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -11,6 +11,7 @@ #include <signal.h> #include <string.h> #include <execinfo.h> /* backtrace */ +#include <linux/membarrier.h>
#define EXIT_NO_TEST 2 #define EXIT_ERR_SETUP_INFRA 3 @@ -370,8 +371,18 @@ static int delete_module(const char *name, int flags) return syscall(__NR_delete_module, name, flags); }
+/* + * Trigger synchronize_rcu() in kernel. + */ +int kern_sync_rcu(void) +{ + return syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0, 0); +} + static void unload_bpf_testmod(void) { + if (kern_sync_rcu()) + fprintf(env.stderr, "Failed to trigger kernel-side RCU sync!\n"); if (delete_module("bpf_testmod", 0)) { if (errno == ENOENT) { if (env.verbosity > VERBOSE_NONE) diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index 115953243f62..e49e2fdde942 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -219,6 +219,7 @@ int bpf_find_map(const char *test, struct bpf_object *obj, const char *name); int compare_map_keys(int map1_fd, int map2_fd); int compare_stack_ips(int smap_fd, int amap_fd, int stack_trace_len); int extract_build_id(char *build_id, size_t size); +int kern_sync_rcu(void);
#ifdef __x86_64__ #define SYS_NANOSLEEP_KPROBE_NAME "__x64_sys_nanosleep"
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 541c3bad8dc51b253ba8686d0cd7628e6b9b5f4c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add support for directly accessing kernel module variables from BPF programs using special ldimm64 instructions. This functionality builds upon vmlinux ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow specifying kernel module BTF's FD in insn[1].imm field.
During BPF program load time, verifier will resolve FD to BTF object and will take reference on BTF object itself and, for module BTFs, corresponding module as well, to make sure it won't be unloaded from under running BPF program. The mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
One interesting change is also in how per-CPU variable is determined. The logic is to find .data..percpu data section in provided BTF, but both vmlinux and module each have their own .data..percpu entries in BTF. So for module's case, the search for DATASEC record needs to look at only module's added BTF types. This is implemented with custom search function.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Acked-by: Hao Luo haoluo@google.com Link: https://lore.kernel.org/bpf/20210112075520.4103414-6-andrii@kernel.org (cherry picked from commit 541c3bad8dc51b253ba8686d0cd7628e6b9b5f4c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 10 +++ include/linux/bpf_verifier.h | 3 + include/linux/btf.h | 3 + kernel/bpf/btf.c | 31 ++++++- kernel/bpf/core.c | 23 ++++++ kernel/bpf/verifier.c | 154 ++++++++++++++++++++++++++++------- 6 files changed, 194 insertions(+), 30 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 966337ba4845..714dc0148643 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -848,9 +848,15 @@ struct bpf_ctx_arg_aux { u32 btf_id; };
+struct btf_mod_pair { + struct btf *btf; + struct module *module; +}; + struct bpf_prog_aux { atomic64_t refcnt; u32 used_map_cnt; + u32 used_btf_cnt; u32 max_ctx_offset; u32 max_pkt_offset; u32 max_tp_access; @@ -888,6 +894,7 @@ struct bpf_prog_aux { const struct bpf_prog_ops *ops; struct bpf_map **used_maps; struct mutex used_maps_mutex; /* mutex for used_maps and used_map_cnt */ + struct btf_mod_pair *used_btfs; struct bpf_prog *prog; struct user_struct *user; u64 load_time; /* ns since boottime */ @@ -1793,6 +1800,9 @@ static inline bool unprivileged_ebpf_enabled(void)
#endif /* CONFIG_BPF_SYSCALL */
+void __bpf_free_used_btfs(struct bpf_prog_aux *aux, + struct btf_mod_pair *used_btfs, u32 len); + static inline struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type) { diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 9ee5c7d9e9cb..94c65a8f056e 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -350,6 +350,7 @@ struct bpf_insn_aux_data { };
#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ +#define MAX_USED_BTFS 64 /* max number of BTFs accessed by one BPF program */
#define BPF_VERIFIER_TMP_LOG_SIZE 1024
@@ -415,7 +416,9 @@ struct bpf_verifier_env { struct bpf_verifier_state_list **explored_states; /* search pruning optimization */ struct bpf_verifier_state_list *free_list; struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */ + struct btf_mod_pair used_btfs[MAX_USED_BTFS]; /* array of BTF's used by BPF program */ u32 used_map_cnt; /* number of used maps */ + u32 used_btf_cnt; /* number of used BTF objects */ u32 id_gen; /* used to generate unique reg IDs */ bool explore_alu_limits; bool allow_ptr_leaks; diff --git a/include/linux/btf.h b/include/linux/btf.h index 4c200f5d242b..7fabf1428093 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -91,6 +91,9 @@ int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, int btf_get_fd_by_id(u32 id); u32 btf_obj_id(const struct btf *btf); bool btf_is_kernel(const struct btf *btf); +bool btf_is_module(const struct btf *btf); +struct module *btf_try_get_module(const struct btf *btf); +u32 btf_nr_types(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 86e062b3e32a..57f99c23a6fd 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -458,7 +458,7 @@ static bool btf_type_is_datasec(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; }
-static u32 btf_nr_types_total(const struct btf *btf) +u32 btf_nr_types(const struct btf *btf) { u32 total = 0;
@@ -476,7 +476,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind) const char *tname; u32 i, total;
- total = btf_nr_types_total(btf); + total = btf_nr_types(btf); for (i = 1; i < total; i++) { t = btf_type_by_id(btf, i); if (BTF_INFO_KIND(t->info) != kind) @@ -5757,6 +5757,11 @@ bool btf_is_kernel(const struct btf *btf) return btf->kernel_btf; }
+bool btf_is_module(const struct btf *btf) +{ + return btf->kernel_btf && strcmp(btf->name, "vmlinux") != 0; +} + static int btf_id_cmp_func(const void *a, const void *b) { const int *pa = a, *pb = b; @@ -5891,3 +5896,25 @@ static int __init btf_module_init(void)
fs_initcall(btf_module_init); #endif /* CONFIG_DEBUG_INFO_BTF_MODULES */ + +struct module *btf_try_get_module(const struct btf *btf) +{ + struct module *res = NULL; +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES + struct btf_module *btf_mod, *tmp; + + mutex_lock(&btf_module_mutex); + list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) { + if (btf_mod->btf != btf) + continue; + + if (try_module_get(btf_mod->module)) + res = btf_mod->module; + + break; + } + mutex_unlock(&btf_module_mutex); +#endif + + return res; +} diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index d3a1f25f8ec2..e66f59f1ed6b 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2177,6 +2177,28 @@ static void bpf_free_used_maps(struct bpf_prog_aux *aux) kfree(aux->used_maps); }
+void __bpf_free_used_btfs(struct bpf_prog_aux *aux, + struct btf_mod_pair *used_btfs, u32 len) +{ +#ifdef CONFIG_BPF_SYSCALL + struct btf_mod_pair *btf_mod; + u32 i; + + for (i = 0; i < len; i++) { + btf_mod = &used_btfs[i]; + if (btf_mod->module) + module_put(btf_mod->module); + btf_put(btf_mod->btf); + } +#endif +} + +static void bpf_free_used_btfs(struct bpf_prog_aux *aux) +{ + __bpf_free_used_btfs(aux, aux->used_btfs, aux->used_btf_cnt); + kfree(aux->used_btfs); +} + static void bpf_prog_free_deferred(struct work_struct *work) { struct bpf_prog_aux *aux; @@ -2184,6 +2206,7 @@ static void bpf_prog_free_deferred(struct work_struct *work)
aux = container_of(work, struct bpf_prog_aux, work); bpf_free_used_maps(aux); + bpf_free_used_btfs(aux); if (bpf_prog_is_dev_bound(aux)) bpf_prog_offload_destroy(aux->prog); #ifdef CONFIG_PERF_EVENTS diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e8572e669a6d..a362c9f63778 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10164,6 +10164,36 @@ static int do_check(struct bpf_verifier_env *env) return 0; }
+static int find_btf_percpu_datasec(struct btf *btf) +{ + const struct btf_type *t; + const char *tname; + int i, n; + + /* + * Both vmlinux and module each have their own ".data..percpu" + * DATASECs in BTF. So for module's case, we need to skip vmlinux BTF + * types to look at only module's own BTF types. + */ + n = btf_nr_types(btf); + if (btf_is_module(btf)) + i = btf_nr_types(btf_vmlinux); + else + i = 1; + + for(; i < n; i++) { + t = btf_type_by_id(btf, i); + if (BTF_INFO_KIND(t->info) != BTF_KIND_DATASEC) + continue; + + tname = btf_name_by_offset(btf, t->name_off); + if (!strcmp(tname, ".data..percpu")) + return i; + } + + return -ENOENT; +} + /* replace pseudo btf_id with kernel symbol address */ static int check_pseudo_btf_id(struct bpf_verifier_env *env, struct bpf_insn *insn, @@ -10171,48 +10201,57 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, { const struct btf_var_secinfo *vsi; const struct btf_type *datasec; + struct btf_mod_pair *btf_mod; const struct btf_type *t; const char *sym_name; bool percpu = false; u32 type, id = insn->imm; + struct btf *btf; s32 datasec_id; u64 addr; - int i; - - if (!btf_vmlinux) { - verbose(env, "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified in Kconfig.\n"); - return -EINVAL; - } + int i, btf_fd, err;
- if (insn[1].imm != 0) { - verbose(env, "reserved field (insn[1].imm) is used in pseudo_btf_id ldimm64 insn.\n"); - return -EINVAL; + btf_fd = insn[1].imm; + if (btf_fd) { + btf = btf_get_by_fd(btf_fd); + if (IS_ERR(btf)) { + verbose(env, "invalid module BTF object FD specified.\n"); + return -EINVAL; + } + } else { + if (!btf_vmlinux) { + verbose(env, "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified in Kconfig.\n"); + return -EINVAL; + } + btf = btf_vmlinux; + btf_get(btf); }
- t = btf_type_by_id(btf_vmlinux, id); + t = btf_type_by_id(btf, id); if (!t) { verbose(env, "ldimm64 insn specifies invalid btf_id %d.\n", id); - return -ENOENT; + err = -ENOENT; + goto err_put; }
if (!btf_type_is_var(t)) { - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", - id); - return -EINVAL; + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); + err = -EINVAL; + goto err_put; }
- sym_name = btf_name_by_offset(btf_vmlinux, t->name_off); + sym_name = btf_name_by_offset(btf, t->name_off); addr = kallsyms_lookup_name(sym_name); if (!addr) { verbose(env, "ldimm64 failed to find the address for kernel symbol '%s'.\n", sym_name); - return -ENOENT; + err = -ENOENT; + goto err_put; }
- datasec_id = btf_find_by_name_kind(btf_vmlinux, ".data..percpu", - BTF_KIND_DATASEC); + datasec_id = find_btf_percpu_datasec(btf); if (datasec_id > 0) { - datasec = btf_type_by_id(btf_vmlinux, datasec_id); + datasec = btf_type_by_id(btf, datasec_id); for_each_vsi(i, datasec, vsi) { if (vsi->type == id) { percpu = true; @@ -10225,10 +10264,10 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, insn[1].imm = addr >> 32;
type = t->type; - t = btf_type_skip_modifiers(btf_vmlinux, type, NULL); + t = btf_type_skip_modifiers(btf, type, NULL); if (percpu) { aux->btf_var.reg_type = PTR_TO_PERCPU_BTF_ID; - aux->btf_var.btf = btf_vmlinux; + aux->btf_var.btf = btf; aux->btf_var.btf_id = type; } else if (!btf_type_is_struct(t)) { const struct btf_type *ret; @@ -10236,21 +10275,54 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, u32 tsize;
/* resolve the type size of ksym. */ - ret = btf_resolve_size(btf_vmlinux, t, &tsize); + ret = btf_resolve_size(btf, t, &tsize); if (IS_ERR(ret)) { - tname = btf_name_by_offset(btf_vmlinux, t->name_off); + tname = btf_name_by_offset(btf, t->name_off); verbose(env, "ldimm64 unable to resolve the size of type '%s': %ld\n", tname, PTR_ERR(ret)); - return -EINVAL; + err = -EINVAL; + goto err_put; } aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; aux->btf_var.mem_size = tsize; } else { aux->btf_var.reg_type = PTR_TO_BTF_ID; - aux->btf_var.btf = btf_vmlinux; + aux->btf_var.btf = btf; aux->btf_var.btf_id = type; } + + /* check whether we recorded this BTF (and maybe module) already */ + for (i = 0; i < env->used_btf_cnt; i++) { + if (env->used_btfs[i].btf == btf) { + btf_put(btf); + return 0; + } + } + + if (env->used_btf_cnt >= MAX_USED_BTFS) { + err = -E2BIG; + goto err_put; + } + + btf_mod = &env->used_btfs[env->used_btf_cnt]; + btf_mod->btf = btf; + btf_mod->module = NULL; + + /* if we reference variables from kernel module, bump its refcount */ + if (btf_is_module(btf)) { + btf_mod->module = btf_try_get_module(btf); + if (!btf_mod->module) { + err = -ENXIO; + goto err_put; + } + } + + env->used_btf_cnt++; + return 0; +err_put: + btf_put(btf); + return err; }
static int check_map_prealloc(struct bpf_map *map) @@ -10537,6 +10609,13 @@ static void release_maps(struct bpf_verifier_env *env) env->used_map_cnt); }
+/* drop refcnt of maps used by the rejected program */ +static void release_btfs(struct bpf_verifier_env *env) +{ + __bpf_free_used_btfs(env->prog->aux, env->used_btfs, + env->used_btf_cnt); +} + /* convert pseudo BPF_LD_IMM64 into generic BPF_LD_IMM64 */ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env) { @@ -12544,7 +12623,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, goto err_release_maps; }
- if (ret == 0 && env->used_map_cnt) { + if (ret) + goto err_release_maps; + + if (env->used_map_cnt) { /* if program passed verifier, update used_maps in bpf_prog_info */ env->prog->aux->used_maps = kmalloc_array(env->used_map_cnt, sizeof(env->used_maps[0]), @@ -12558,15 +12640,29 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, memcpy(env->prog->aux->used_maps, env->used_maps, sizeof(env->used_maps[0]) * env->used_map_cnt); env->prog->aux->used_map_cnt = env->used_map_cnt; + } + if (env->used_btf_cnt) { + /* if program passed verifier, update used_btfs in bpf_prog_aux */ + env->prog->aux->used_btfs = kmalloc_array(env->used_btf_cnt, + sizeof(env->used_btfs[0]), + GFP_KERNEL); + if (!env->prog->aux->used_btfs) { + ret = -ENOMEM; + goto err_release_maps; + }
+ memcpy(env->prog->aux->used_btfs, env->used_btfs, + sizeof(env->used_btfs[0]) * env->used_btf_cnt); + env->prog->aux->used_btf_cnt = env->used_btf_cnt; + } + if (env->used_map_cnt || env->used_btf_cnt) { /* program is valid. Convert pseudo bpf_ld_imm64 into generic * bpf_ld_imm64 instructions */ convert_pseudo_ld_imm64(env); }
- if (ret == 0) - adjust_btf_func(env); + adjust_btf_func(env);
err_release_maps: if (!env->prog->aux->used_maps) @@ -12574,6 +12670,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, * them now. Otherwise free_used_maps() will release them. */ release_maps(env); + if (!env->prog->aux->used_btfs) + release_btfs(env);
/* extension progs temporarily inherit the attach_type of their targets for verification purposes, so set it back to zero before returning
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 284d2587ea8a96f97ca519a3de683ff226e9e2b3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add support for searching for ksym externs not just in vmlinux BTF, but across all module BTFs, similarly to how it's done for CO-RE relocations. Kernels that expose module BTFs through sysfs are assumed to support new ldimm64 instruction extension with BTF FD provided in insn[1].imm field, so no extra feature detection is performed.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Acked-by: Hao Luo haoluo@google.com Link: https://lore.kernel.org/bpf/20210112075520.4103414-7-andrii@kernel.org (cherry picked from commit 284d2587ea8a96f97ca519a3de683ff226e9e2b3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 50 +++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 18 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a0a1b7c99f2f..668103ebd5cb 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -395,7 +395,8 @@ struct extern_desc { unsigned long long addr;
/* target btf_id of the corresponding kernel var. */ - int vmlinux_btf_id; + int kernel_btf_obj_fd; + int kernel_btf_id;
/* local btf_id of the ksym extern's type. */ __u32 type_id; @@ -6217,7 +6218,8 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) } else /* EXT_KSYM */ { if (ext->ksym.type_id) { /* typed ksyms */ insn[0].src_reg = BPF_PSEUDO_BTF_ID; - insn[0].imm = ext->ksym.vmlinux_btf_id; + insn[0].imm = ext->ksym.kernel_btf_id; + insn[1].imm = ext->ksym.kernel_btf_obj_fd; } else { /* typeless ksyms */ insn[0].imm = (__u32)ext->ksym.addr; insn[1].imm = ext->ksym.addr >> 32; @@ -7377,7 +7379,8 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj) static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) { struct extern_desc *ext; - int i, id; + struct btf *btf; + int i, j, id, btf_fd, err;
for (i = 0; i < obj->nr_extern; i++) { const struct btf_type *targ_var, *targ_type; @@ -7389,10 +7392,25 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) if (ext->type != EXT_KSYM || !ext->ksym.type_id) continue;
- id = btf__find_by_name_kind(obj->btf_vmlinux, ext->name, - BTF_KIND_VAR); + btf = obj->btf_vmlinux; + btf_fd = 0; + id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); + if (id == -ENOENT) { + err = load_module_btfs(obj); + if (err) + return err; + + for (j = 0; j < obj->btf_module_cnt; j++) { + btf = obj->btf_modules[j].btf; + /* we assume module BTF FD is always >0 */ + btf_fd = obj->btf_modules[j].fd; + id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); + if (id != -ENOENT) + break; + } + } if (id <= 0) { - pr_warn("extern (ksym) '%s': failed to find BTF ID in vmlinux BTF.\n", + pr_warn("extern (ksym) '%s': failed to find BTF ID in kernel BTF(s).\n", ext->name); return -ESRCH; } @@ -7401,24 +7419,19 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) local_type_id = ext->ksym.type_id;
/* find target type_id */ - targ_var = btf__type_by_id(obj->btf_vmlinux, id); - targ_var_name = btf__name_by_offset(obj->btf_vmlinux, - targ_var->name_off); - targ_type = skip_mods_and_typedefs(obj->btf_vmlinux, - targ_var->type, - &targ_type_id); + targ_var = btf__type_by_id(btf, id); + targ_var_name = btf__name_by_offset(btf, targ_var->name_off); + targ_type = skip_mods_and_typedefs(btf, targ_var->type, &targ_type_id);
ret = bpf_core_types_are_compat(obj->btf, local_type_id, - obj->btf_vmlinux, targ_type_id); + btf, targ_type_id); if (ret <= 0) { const struct btf_type *local_type; const char *targ_name, *local_name;
local_type = btf__type_by_id(obj->btf, local_type_id); - local_name = btf__name_by_offset(obj->btf, - local_type->name_off); - targ_name = btf__name_by_offset(obj->btf_vmlinux, - targ_type->name_off); + local_name = btf__name_by_offset(obj->btf, local_type->name_off); + targ_name = btf__name_by_offset(btf, targ_type->name_off);
pr_warn("extern (ksym) '%s': incompatible types, expected [%d] %s %s, but kernel has [%d] %s %s\n", ext->name, local_type_id, @@ -7428,7 +7441,8 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) }
ext->is_set = true; - ext->ksym.vmlinux_btf_id = id; + ext->ksym.kernel_btf_obj_fd = btf_fd; + ext->ksym.kernel_btf_id = id; pr_debug("extern (ksym) '%s': resolved to [%d] %s %s\n", ext->name, id, btf_kind_str(targ_var), targ_var_name); }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 430d97a8a7bf1500c081ce756ff2ead73d2303c5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add per-CPU variable to bpf_testmod.ko and use those from new selftest to validate it works end-to-end.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Acked-by: Hao Luo haoluo@google.com Link: https://lore.kernel.org/bpf/20210112075520.4103414-8-andrii@kernel.org (cherry picked from commit 430d97a8a7bf1500c081ce756ff2ead73d2303c5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/bpf_testmod/bpf_testmod.c | 3 ++ .../selftests/bpf/prog_tests/ksyms_module.c | 31 +++++++++++++++++++ .../selftests/bpf/progs/test_ksyms_module.c | 26 ++++++++++++++++ 3 files changed, 60 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c index 2df19d73ca49..0b991e115d1f 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c @@ -3,6 +3,7 @@ #include <linux/error-injection.h> #include <linux/init.h> #include <linux/module.h> +#include <linux/percpu-defs.h> #include <linux/sysfs.h> #include <linux/tracepoint.h> #include "bpf_testmod.h" @@ -10,6 +11,8 @@ #define CREATE_TRACE_POINTS #include "bpf_testmod-events.h"
+DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123; + noinline ssize_t bpf_testmod_test_read(struct file *file, struct kobject *kobj, struct bin_attribute *bin_attr, diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c new file mode 100644 index 000000000000..4c232b456479 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ + +#include <test_progs.h> +#include <bpf/libbpf.h> +#include <bpf/btf.h> +#include "test_ksyms_module.skel.h" + +static int duration; + +void test_ksyms_module(void) +{ + struct test_ksyms_module* skel; + int err; + + skel = test_ksyms_module__open_and_load(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = test_ksyms_module__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + usleep(1); + + ASSERT_EQ(skel->bss->triggered, true, "triggered"); + ASSERT_EQ(skel->bss->out_mod_ksym_global, 123, "global_ksym_val"); + +cleanup: + test_ksyms_module__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_module.c b/tools/testing/selftests/bpf/progs/test_ksyms_module.c new file mode 100644 index 000000000000..d6a0b3086b90 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_ksyms_module.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ + +#include "vmlinux.h" + +#include <bpf/bpf_helpers.h> + +extern const int bpf_testmod_ksym_percpu __ksym; + +int out_mod_ksym_global = 0; +bool triggered = false; + +SEC("raw_tp/sys_enter") +int handler(const void *ctx) +{ + int *val; + __u32 cpu; + + val = (int *)bpf_this_cpu_ptr(&bpf_testmod_ksym_percpu); + out_mod_ksym_global = *val; + triggered = true; + + return 0; +} + +char LICENSE[] SEC("license") = "GPL";
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.12-rc1 commit de11ae4f56fdc53474c08b03142860428e6c5655 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Build bpftool and resolve_btfids using the host toolchain when cross-compiling, since they are executed during build to generate the selftests. Add a host build directory in order to build both host and target version of libbpf. Build host tools using $(HOSTCC) defined in Makefile.include.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210113163319.1516382-2-jean-philippe@linaro.or... (cherry picked from commit de11ae4f56fdc53474c08b03142860428e6c5655) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 46 +++++++++++++++++++++------- 1 file changed, 35 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 3362a0afd2d0..0a39761d8f77 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 include ../../../../scripts/Kbuild.include include ../../../scripts/Makefile.arch +include ../../../scripts/Makefile.include
CXX ?= $(CROSS_COMPILE)g++
@@ -113,7 +114,15 @@ SCRATCH_DIR := $(OUTPUT)/tools BUILD_DIR := $(SCRATCH_DIR)/build INCLUDE_DIR := $(SCRATCH_DIR)/include BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a -RESOLVE_BTFIDS := $(BUILD_DIR)/resolve_btfids/resolve_btfids +ifneq ($(CROSS_COMPILE),) +HOST_BUILD_DIR := $(BUILD_DIR)/host +HOST_SCRATCH_DIR := $(OUTPUT)/host-tools +else +HOST_BUILD_DIR := $(BUILD_DIR) +HOST_SCRATCH_DIR := $(SCRATCH_DIR) +endif +HOST_BPFOBJ := $(HOST_BUILD_DIR)/libbpf/libbpf.a +RESOLVE_BTFIDS := $(HOST_BUILD_DIR)/resolve_btfids/resolve_btfids
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ @@ -135,6 +144,14 @@ $(notdir $(TEST_GEN_PROGS) \ $(TEST_GEN_PROGS_EXTENDED) \ $(TEST_CUSTOM_PROGS)): %: $(OUTPUT)/% ;
+# sort removes libbpf duplicates when not cross-building +MAKE_DIRS := $(sort $(BUILD_DIR)/libbpf $(HOST_BUILD_DIR)/libbpf \ + $(HOST_BUILD_DIR)/bpftool $(HOST_BUILD_DIR)/resolve_btfids \ + $(INCLUDE_DIR)) +$(MAKE_DIRS): + $(call msg,MKDIR,,$@) + $(Q)mkdir -p $@ + $(OUTPUT)/%.o: %.c $(call msg,CC,,$@) $(Q)$(CC) $(CFLAGS) -c $(filter %.c,$^) $(LDLIBS) -o $@ @@ -157,7 +174,7 @@ $(OUTPUT)/test_stub.o: test_stub.c $(BPFOBJ) $(call msg,CC,,$@) $(Q)$(CC) -c $(CFLAGS) -o $@ $<
-DEFAULT_BPFTOOL := $(SCRATCH_DIR)/sbin/bpftool +DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool
$(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL) $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/runqslower \ @@ -183,10 +200,11 @@ $(OUTPUT)/test_sysctl: cgroup_helpers.c
BPFTOOL ?= $(DEFAULT_BPFTOOL) $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ - $(BPFOBJ) | $(BUILD_DIR)/bpftool + $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/bpftool $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) \ - OUTPUT=$(BUILD_DIR)/bpftool/ \ - prefix= DESTDIR=$(SCRATCH_DIR)/ install + CC=$(HOSTCC) LD=$(HOSTLD) \ + OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ + prefix= DESTDIR=$(HOST_SCRATCH_DIR)/ install $(Q)mkdir -p $(BUILD_DIR)/bpftool/Documentation $(Q)RST2MAN_OPTS="--exit-status=1" $(MAKE) $(submake_extras) \ -C $(BPFTOOLDIR)/Documentation \ @@ -199,9 +217,14 @@ $(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/ \ DESTDIR=$(SCRATCH_DIR) prefix= all install_headers
-$(BUILD_DIR)/libbpf $(BUILD_DIR)/bpftool $(BUILD_DIR)/resolve_btfids $(INCLUDE_DIR): - $(call msg,MKDIR,,$@) - $(Q)mkdir -p $@ +ifneq ($(BPFOBJ),$(HOST_BPFOBJ)) +$(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ + ../../../include/uapi/linux/bpf.h \ + | $(INCLUDE_DIR) $(HOST_BUILD_DIR)/libbpf + $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) \ + OUTPUT=$(HOST_BUILD_DIR)/libbpf/ CC=$(HOSTCC) LD=$(HOSTLD) \ + DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers +endif
$(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR) ifeq ($(VMLINUX_H),) @@ -212,7 +235,7 @@ else $(Q)cp "$(VMLINUX_H)" $@ endif
-$(RESOLVE_BTFIDS): $(BPFOBJ) | $(BUILD_DIR)/resolve_btfids \ +$(RESOLVE_BTFIDS): $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/resolve_btfids \ $(TOOLSDIR)/bpf/resolve_btfids/main.c \ $(TOOLSDIR)/lib/rbtree.c \ $(TOOLSDIR)/lib/zalloc.c \ @@ -220,7 +243,8 @@ $(RESOLVE_BTFIDS): $(BPFOBJ) | $(BUILD_DIR)/resolve_btfids \ $(TOOLSDIR)/lib/ctype.c \ $(TOOLSDIR)/lib/str_error_r.c $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/resolve_btfids \ - OUTPUT=$(BUILD_DIR)/resolve_btfids/ BPFOBJ=$(BPFOBJ) + CC=$(HOSTCC) LD=$(HOSTLD) AR=$(HOSTAR) \ + OUTPUT=$(HOST_BUILD_DIR)/resolve_btfids/ BPFOBJ=$(HOST_BPFOBJ)
# Get Clang's default includes on this system, as opposed to those seen by # '-target bpf'. This fixes "missing" files on some architectures/distros, @@ -452,7 +476,7 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \ $(call msg,BINARY,,$@) $(Q)$(CC) $(LDFLAGS) -o $@ $(filter %.a %.o,$^) $(LDLIBS)
-EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) \ +EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \ prog_tests/tests.h map_tests/tests.h verifier/tests.h \ feature \ $(addprefix $(OUTPUT)/,*.o *.skel.h no_alu32 bpf_gcc bpf_testmod.ko)
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.12-rc1 commit 5837cedef6f33e01d1c4ad90ee4e966bcc6f9c30 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When building out-of-tree, the .skel.h files are generated into the $(OUTPUT) directory, rather than $(CURDIR). Add $(OUTPUT) to the include paths.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210113163319.1516382-3-jean-philippe@linaro.or... (cherry picked from commit 5837cedef6f33e01d1c4ad90ee4e966bcc6f9c30) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 0a39761d8f77..a6fae1b9894c 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -25,7 +25,7 @@ BPF_GCC ?= $(shell command -v bpf-gcc;) SAN_CFLAGS ?= CFLAGS += -g -rdynamic -Wall -O2 $(GENFLAGS) $(SAN_CFLAGS) \ -I$(CURDIR) -I$(INCLUDE_DIR) -I$(GENDIR) -I$(LIBDIR) \ - -I$(TOOLSINCDIR) -I$(APIDIR) \ + -I$(TOOLSINCDIR) -I$(APIDIR) -I$(OUTPUT) \ -Dbpf_prog_load=bpf_prog_test_load \ -Dbpf_load_program=bpf_test_load_program LDLIBS += -lcap -lelf -lz -lrt -lpthread
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.12-rc1 commit d6ac8cad50f0c0fed5cfeb61301bb19e8e88985c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
During an out-of-tree build, attempting to install the $(TEST_FILES) into the $(OUTPUT) directory fails, because the objects were already generated into $(OUTPUT):
rsync: [sender] link_stat "tools/testing/selftests/bpf/test_lwt_ip_encap.o" failed: No such file or directory (2) rsync: [sender] link_stat "tools/testing/selftests/bpf/test_tc_edt.o" failed: No such file or directory (2)
Use $(TEST_GEN_FILES) instead.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210113163319.1516382-4-jean-philippe@linaro.or... (cherry picked from commit d6ac8cad50f0c0fed5cfeb61301bb19e8e88985c) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index a6fae1b9894c..4870cb8a311c 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -45,8 +45,7 @@ ifneq ($(BPF_GCC),) TEST_GEN_PROGS += test_progs-bpf_gcc endif
-TEST_GEN_FILES = -TEST_FILES = test_lwt_ip_encap.o \ +TEST_GEN_FILES = test_lwt_ip_encap.o \ test_tc_edt.o
# Order correspond to 'make run_tests' order
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.12-rc1 commit ca1e846711a8f49382005eb5feb82973fa88a849 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
For out-of-tree builds, $(TEST_CUSTOM_PROGS) require the $(OUTPUT) prefix, otherwise the kselftest lib doesn't know how to install them:
rsync: [sender] link_stat "tools/testing/selftests/bpf/urandom_read" failed: No such file or directory (2)
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210113163319.1516382-5-jean-philippe@linaro.or... (cherry picked from commit ca1e846711a8f49382005eb5feb82973fa88a849) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 4870cb8a311c..6dea655b752f 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -82,7 +82,7 @@ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko
-TEST_CUSTOM_PROGS = urandom_read +TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
# Emit succinct information message describing current building step # $1 - generic step name (e.g., CC, LINK, etc);
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.12-rc1 commit b8d1cbef2ea4415edcdb5a825972d93b63fcf63c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The btf_dump test cannot access the original source files for comparison when running the selftests out of tree, causing several failures:
awk: btf_dump_test_case_syntax.c: No such file or directory ...
Add those files to $(TEST_FILES) to have "make install" pick them up.
Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210113163319.1516382-6-jean-philippe@linaro.or... (cherry picked from commit b8d1cbef2ea4415edcdb5a825972d93b63fcf63c) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 6dea655b752f..60a301a93491 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -47,6 +47,7 @@ endif
TEST_GEN_FILES = test_lwt_ip_encap.o \ test_tc_edt.o +TEST_FILES = $(wildcard progs/btf_dump_test_case_*.c)
# Order correspond to 'make run_tests' order TEST_PROGS := test_kmod.sh \
From: Ian Rogers irogers@google.com
mainline inclusion from mainline-5.12-rc1 commit ce5a518e9de53446f10d46fac98640f7ac026100 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add inline to __always_inline making it match the linux/compiler.h. Adding this avoids an unused function warning on bpf_tail_call_static when compining with -Wall.
Signed-off-by: Ian Rogers irogers@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210113223609.3358812-1-irogers@google.com (cherry picked from commit ce5a518e9de53446f10d46fac98640f7ac026100) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_helpers.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 72b251110c4d..ae6c975e0b87 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -30,7 +30,7 @@ #define SEC(NAME) __attribute__((section(NAME), used))
#ifndef __always_inline -#define __always_inline __attribute__((always_inline)) +#define __always_inline inline __attribute__((always_inline)) #endif #ifndef __noinline #define __noinline __attribute__((noinline))
From: Ian Rogers irogers@google.com
mainline inclusion from mainline-5.12-rc1 commit bade5c554f1ac70a50cefe96517957629dbc0d8f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
No additional warnings are generated by enabling this, but having it enabled will help avoid regressions.
Signed-off-by: Ian Rogers irogers@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210113223609.3358812-2-irogers@google.com (cherry picked from commit bade5c554f1ac70a50cefe96517957629dbc0d8f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index ea6704f027e3..752d3e97e077 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -165,7 +165,7 @@ $(OUTPUT)%.bpf.o: skeleton/%.bpf.c $(OUTPUT)vmlinux.h $(LIBBPF) -I$(srctree)/tools/include/uapi/ \ -I$(LIBBPF_PATH) \ -I$(srctree)/tools/lib \ - -g -O2 -target bpf -c $< -o $@ && $(LLVM_STRIP) -g $@ + -g -O2 -Wall -target bpf -c $< -o $@ && $(LLVM_STRIP) -g $@
$(OUTPUT)%.skel.h: $(OUTPUT)%.bpf.o $(BPFTOOL_BOOTSTRAP) $(QUIET_GEN)$(BPFTOOL_BOOTSTRAP) gen skeleton $< > $@
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 11c11d0751fce605090761f9c066bae947a35e76 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The case for JITing atomics is about to get more complicated. Let's factor out some common code to make the review and result more readable.
NB the atomics code doesn't yet use the new helper - a subsequent patch will add its use as a side-effect of other changes.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210114181751.768687-2-jackmanb@google.com (cherry picked from commit 11c11d0751fce605090761f9c066bae947a35e76) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 43 +++++++++++++++++++++---------------- 1 file changed, 25 insertions(+), 18 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 62ed3710fc53..ce742655304a 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -684,6 +684,27 @@ static void emit_mov_reg(u8 **pprog, bool is64, u32 dst_reg, u32 src_reg) *pprog = prog; }
+/* Emit the suffix (ModR/M etc) for addressing *(ptr_reg + off) and val_reg */ +static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off) +{ + u8 *prog = *pprog; + int cnt = 0; + + if (is_imm8(off)) { + /* 1-byte signed displacement. + * + * If off == 0 we could skip this and save one extra byte, but + * special case of x86 R13 which always needs an offset is not + * worth the hassle + */ + EMIT2(add_2reg(0x40, ptr_reg, val_reg), off); + } else { + /* 4-byte signed displacement */ + EMIT1_off32(add_2reg(0x80, ptr_reg, val_reg), off); + } + *pprog = prog; +} + /* LDX: dst_reg = *(u8*)(src_reg + off) */ static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) { @@ -711,15 +732,7 @@ static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) EMIT2(add_2mod(0x48, src_reg, dst_reg), 0x8B); break; } - /* - * If insn->off == 0 we can save one extra byte, but - * special case of x86 R13 which always needs an offset - * is not worth the hassle - */ - if (is_imm8(off)) - EMIT2(add_2reg(0x40, src_reg, dst_reg), off); - else - EMIT1_off32(add_2reg(0x80, src_reg, dst_reg), off); + emit_insn_suffix(&prog, src_reg, dst_reg, off); *pprog = prog; }
@@ -754,10 +767,7 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) EMIT2(add_2mod(0x48, dst_reg, src_reg), 0x89); break; } - if (is_imm8(off)) - EMIT2(add_2reg(0x40, dst_reg, src_reg), off); - else - EMIT1_off32(add_2reg(0x80, dst_reg, src_reg), off); + emit_insn_suffix(&prog, dst_reg, src_reg, off); *pprog = prog; }
@@ -1243,11 +1253,8 @@ st: if (is_imm8(insn->off)) goto xadd; case BPF_STX | BPF_XADD | BPF_DW: EMIT3(0xF0, add_2mod(0x48, dst_reg, src_reg), 0x01); -xadd: if (is_imm8(insn->off)) - EMIT2(add_2reg(0x40, dst_reg, src_reg), insn->off); - else - EMIT1_off32(add_2reg(0x80, dst_reg, src_reg), - insn->off); +xadd: + emit_modrm_dstoff(&prog, dst_reg, src_reg, insn->off); break;
/* call */
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 74007cfc1f71e47394ca173b93d28afd0529fc86 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The JIT case for encoding atomic ops is about to get more complicated. In order to make the review & resulting code easier, let's factor out some shared helpers.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210114181751.768687-3-jackmanb@google.com (cherry picked from commit 74007cfc1f71e47394ca173b93d28afd0529fc86) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 39 ++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 16 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index ce742655304a..df04d0291fb8 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -705,6 +705,21 @@ static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off) *pprog = prog; }
+/* + * Emit a REX byte if it will be necessary to address these registers + */ +static void maybe_emit_mod(u8 **pprog, u32 dst_reg, u32 src_reg, bool is64) +{ + u8 *prog = *pprog; + int cnt = 0; + + if (is64) + EMIT1(add_2mod(0x48, dst_reg, src_reg)); + else if (is_ereg(dst_reg) || is_ereg(src_reg)) + EMIT1(add_2mod(0x40, dst_reg, src_reg)); + *pprog = prog; +} + /* LDX: dst_reg = *(u8*)(src_reg + off) */ static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) { @@ -855,10 +870,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, case BPF_OR: b2 = 0x09; break; case BPF_XOR: b2 = 0x31; break; } - if (BPF_CLASS(insn->code) == BPF_ALU64) - EMIT1(add_2mod(0x48, dst_reg, src_reg)); - else if (is_ereg(dst_reg) || is_ereg(src_reg)) - EMIT1(add_2mod(0x40, dst_reg, src_reg)); + maybe_emit_mod(&prog, dst_reg, src_reg, + BPF_CLASS(insn->code) == BPF_ALU64); EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg)); break;
@@ -1305,20 +1318,16 @@ st: if (is_imm8(insn->off)) case BPF_JMP32 | BPF_JSGE | BPF_X: case BPF_JMP32 | BPF_JSLE | BPF_X: /* cmp dst_reg, src_reg */ - if (BPF_CLASS(insn->code) == BPF_JMP) - EMIT1(add_2mod(0x48, dst_reg, src_reg)); - else if (is_ereg(dst_reg) || is_ereg(src_reg)) - EMIT1(add_2mod(0x40, dst_reg, src_reg)); + maybe_emit_mod(&prog, dst_reg, src_reg, + BPF_CLASS(insn->code) == BPF_JMP); EMIT2(0x39, add_2reg(0xC0, dst_reg, src_reg)); goto emit_cond_jmp;
case BPF_JMP | BPF_JSET | BPF_X: case BPF_JMP32 | BPF_JSET | BPF_X: /* test dst_reg, src_reg */ - if (BPF_CLASS(insn->code) == BPF_JMP) - EMIT1(add_2mod(0x48, dst_reg, src_reg)); - else if (is_ereg(dst_reg) || is_ereg(src_reg)) - EMIT1(add_2mod(0x40, dst_reg, src_reg)); + maybe_emit_mod(&prog, dst_reg, src_reg, + BPF_CLASS(insn->code) == BPF_JMP); EMIT2(0x85, add_2reg(0xC0, dst_reg, src_reg)); goto emit_cond_jmp;
@@ -1354,10 +1363,8 @@ st: if (is_imm8(insn->off)) case BPF_JMP32 | BPF_JSLE | BPF_K: /* test dst_reg, dst_reg to save one extra byte */ if (imm32 == 0) { - if (BPF_CLASS(insn->code) == BPF_JMP) - EMIT1(add_2mod(0x48, dst_reg, dst_reg)); - else if (is_ereg(dst_reg)) - EMIT1(add_2mod(0x40, dst_reg, dst_reg)); + maybe_emit_mod(&prog, dst_reg, dst_reg, + BPF_CLASS(insn->code) == BPF_JMP); EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg)); goto emit_cond_jmp; }
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit e5f02caccfae94f5baf6ec6dbb57ce8a7e9a40e7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
A later commit will need to lookup a subset of these opcodes. To avoid duplicating code, pull out a table.
The shift opcodes won't be needed by that later commit, but they're already duplicated, so fold them into the table anyway.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210114181751.768687-4-jackmanb@google.com (cherry picked from commit e5f02caccfae94f5baf6ec6dbb57ce8a7e9a40e7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 33 +++++++++++++++------------------ 1 file changed, 15 insertions(+), 18 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index df04d0291fb8..d35383a66f15 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -205,6 +205,18 @@ static u8 add_2reg(u8 byte, u32 dst_reg, u32 src_reg) return byte + reg2hex[dst_reg] + (reg2hex[src_reg] << 3); }
+/* Some 1-byte opcodes for binary ALU operations */ +static u8 simple_alu_opcodes[] = { + [BPF_ADD] = 0x01, + [BPF_SUB] = 0x29, + [BPF_AND] = 0x21, + [BPF_OR] = 0x09, + [BPF_XOR] = 0x31, + [BPF_LSH] = 0xE0, + [BPF_RSH] = 0xE8, + [BPF_ARSH] = 0xF8, +}; + static void jit_fill_hole(void *area, unsigned int size) { /* Fill whole space with INT3 instructions */ @@ -863,15 +875,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, case BPF_ALU64 | BPF_AND | BPF_X: case BPF_ALU64 | BPF_OR | BPF_X: case BPF_ALU64 | BPF_XOR | BPF_X: - switch (BPF_OP(insn->code)) { - case BPF_ADD: b2 = 0x01; break; - case BPF_SUB: b2 = 0x29; break; - case BPF_AND: b2 = 0x21; break; - case BPF_OR: b2 = 0x09; break; - case BPF_XOR: b2 = 0x31; break; - } maybe_emit_mod(&prog, dst_reg, src_reg, BPF_CLASS(insn->code) == BPF_ALU64); + b2 = simple_alu_opcodes[BPF_OP(insn->code)]; EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg)); break;
@@ -1051,12 +1057,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, else if (is_ereg(dst_reg)) EMIT1(add_1mod(0x40, dst_reg));
- switch (BPF_OP(insn->code)) { - case BPF_LSH: b3 = 0xE0; break; - case BPF_RSH: b3 = 0xE8; break; - case BPF_ARSH: b3 = 0xF8; break; - } - + b3 = simple_alu_opcodes[BPF_OP(insn->code)]; if (imm32 == 1) EMIT2(0xD1, add_1reg(b3, dst_reg)); else @@ -1090,11 +1091,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, else if (is_ereg(dst_reg)) EMIT1(add_1mod(0x40, dst_reg));
- switch (BPF_OP(insn->code)) { - case BPF_LSH: b3 = 0xE0; break; - case BPF_RSH: b3 = 0xE8; break; - case BPF_ARSH: b3 = 0xF8; break; - } + b3 = simple_alu_opcodes[BPF_OP(insn->code)]; EMIT2(0xD3, add_1reg(b3, dst_reg));
if (src_reg != BPF_REG_4)
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 91c960b0056672e74627776655c926388350fa30 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
A subsequent patch will add additional atomic operations. These new operations will use the same opcode field as the existing XADD, with the immediate discriminating different operations.
In preparation, rename the instruction mode BPF_ATOMIC and start calling the zero immediate BPF_ADD.
This is possible (doesn't break existing valid BPF progs) because the immediate field is currently reserved MBZ and BPF_ADD is zero.
All uses are removed from the tree but the BPF_XADD definition is kept around to avoid breaking builds for people including kernel headers.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Björn Töpel bjorn.topel@gmail.com Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com (cherry picked from commit 91c960b0056672e74627776655c926388350fa30) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: arch/x86/net/bpf_jit_comp.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- Documentation/networking/filter.rst | 30 +++++++----- arch/arm/net/bpf_jit_32.c | 7 ++- arch/arm64/net/bpf_jit_comp.c | 16 +++++-- arch/mips/net/ebpf_jit.c | 11 +++-- arch/powerpc/net/bpf_jit_comp64.c | 25 ++++++++-- arch/riscv/net/bpf_jit_comp32.c | 20 ++++++-- arch/riscv/net/bpf_jit_comp64.c | 16 +++++-- arch/s390/net/bpf_jit_comp.c | 27 ++++++----- arch/sparc/net/bpf_jit_comp_64.c | 17 +++++-- arch/x86/net/bpf_jit_comp.c | 46 ++++++++++++++----- arch/x86/net/bpf_jit_comp32.c | 6 +-- drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 ++++-- drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +- .../net/ethernet/netronome/nfp/bpf/verifier.c | 15 ++++-- include/linux/filter.h | 16 +++++-- include/uapi/linux/bpf.h | 5 +- kernel/bpf/core.c | 31 +++++++++---- kernel/bpf/disasm.c | 6 ++- kernel/bpf/verifier.c | 24 ++++++---- lib/test_bpf.c | 14 +++--- samples/bpf/bpf_insn.h | 4 +- samples/bpf/cookie_uid_helper_example.c | 8 ++-- samples/bpf/sock_example.c | 2 +- samples/bpf/test_cgrp2_attach.c | 5 +- tools/include/linux/filter.h | 15 ++++-- tools/include/uapi/linux/bpf.h | 5 +- .../bpf/prog_tests/cgroup_attach_multi.c | 4 +- .../selftests/bpf/test_cgroup_storage.c | 2 +- tools/testing/selftests/bpf/verifier/ctx.c | 7 ++- .../bpf/verifier/direct_packet_access.c | 4 +- .../testing/selftests/bpf/verifier/leak_ptr.c | 10 ++-- .../selftests/bpf/verifier/meta_access.c | 4 +- tools/testing/selftests/bpf/verifier/unpriv.c | 3 +- .../bpf/verifier/value_illegal_alu.c | 2 +- tools/testing/selftests/bpf/verifier/xadd.c | 18 ++++---- 35 files changed, 291 insertions(+), 152 deletions(-)
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst index debb59e374de..1583d59d806d 100644 --- a/Documentation/networking/filter.rst +++ b/Documentation/networking/filter.rst @@ -1006,13 +1006,13 @@ Size modifier is one of ...
Mode modifier is one of::
- BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */ - BPF_ABS 0x20 - BPF_IND 0x40 - BPF_MEM 0x60 - BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */ - BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */ - BPF_XADD 0xc0 /* eBPF only, exclusive add */ + BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */ + BPF_ABS 0x20 + BPF_IND 0x40 + BPF_MEM 0x60 + BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */ + BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */ + BPF_ATOMIC 0xc0 /* eBPF only, atomic operations */
eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and (BPF_IND | <size> | BPF_LD) which are used to access packet data. @@ -1044,11 +1044,19 @@ Unlike classic BPF instruction set, eBPF has generic load/store operations:: BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg BPF_MEM | <size> | BPF_ST: *(size *) (dst_reg + off) = imm32 BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *) (src_reg + off) - BPF_XADD | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg - BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and -2 byte atomic increments are not supported. +Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. + +It also includes atomic operations, which use the immediate field for extra +encoding. + + .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg + .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg + +Note that 1 and 2 byte atomic operations are not supported. + +You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to +the exclusive-add operation encoded when the immediate field is zero.
eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c index 1214e39aad5e..a903b26cde40 100644 --- a/arch/arm/net/bpf_jit_32.c +++ b/arch/arm/net/bpf_jit_32.c @@ -1642,10 +1642,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) } emit_str_r(dst_lo, tmp2, off, ctx, BPF_SIZE(code)); break; - /* STX XADD: lock *(u32 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_W: - /* STX XADD: lock *(u64 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_DW: + /* Atomic ops */ + case BPF_STX | BPF_ATOMIC | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_DW: goto notyet; /* STX: *(size *)(dst + off) = src */ case BPF_STX | BPF_MEM | BPF_W: diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 9c6cab71ba98..6f04276f3d38 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -888,10 +888,18 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, } break;
- /* STX XADD: lock *(u32 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_W: - /* STX XADD: lock *(u64 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_DW: + case BPF_STX | BPF_ATOMIC | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_DW: + if (insn->imm != BPF_ADD) { + pr_err_once("unknown atomic op code %02x\n", insn->imm); + return -EINVAL; + } + + /* STX XADD: lock *(u32 *)(dst + off) += src + * and + * STX XADD: lock *(u64 *)(dst + off) += src + */ + if (!off) { reg = dst; } else { diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c index b31b91e57c34..3a73e9375712 100644 --- a/arch/mips/net/ebpf_jit.c +++ b/arch/mips/net/ebpf_jit.c @@ -1426,8 +1426,8 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, case BPF_STX | BPF_H | BPF_MEM: case BPF_STX | BPF_W | BPF_MEM: case BPF_STX | BPF_DW | BPF_MEM: - case BPF_STX | BPF_W | BPF_XADD: - case BPF_STX | BPF_DW | BPF_XADD: + case BPF_STX | BPF_W | BPF_ATOMIC: + case BPF_STX | BPF_DW | BPF_ATOMIC: if (insn->dst_reg == BPF_REG_10) { ctx->flags |= EBPF_SEEN_FP; dst = MIPS_R_SP; @@ -1441,7 +1441,12 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp); if (src < 0) return src; - if (BPF_MODE(insn->code) == BPF_XADD) { + if (BPF_MODE(insn->code) == BPF_ATOMIC) { + if (insn->imm != BPF_ADD) { + pr_err("ATOMIC OP %02x NOT HANDLED\n", insn->imm); + return -EINVAL; + } + /* * If mem_off does not fit within the 9 bit ll/sc * instruction immediate field, use a temp reg. diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c index 0d47514e8870..f91d03c57727 100644 --- a/arch/powerpc/net/bpf_jit_comp64.c +++ b/arch/powerpc/net/bpf_jit_comp64.c @@ -756,10 +756,18 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, break;
/* - * BPF_STX XADD (atomic_add) + * BPF_STX ATOMIC (atomic ops) */ - /* *(u32 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_W: + if (insn->imm != BPF_ADD) { + pr_err_ratelimited( + "eBPF filter atomic op code %02x (@%d) unsupported\n", + code, i); + return -ENOTSUPP; + } + + /* *(u32 *)(dst + off) += src */ + /* Get EA into TMP_REG_1 */ EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], dst_reg, off)); tmp_idx = ctx->idx * 4; @@ -772,8 +780,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, /* we're done if this succeeded */ PPC_BCC_SHORT(COND_NE, tmp_idx); break; - /* *(u64 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_DW: + case BPF_STX | BPF_ATOMIC | BPF_DW: + if (insn->imm != BPF_ADD) { + pr_err_ratelimited( + "eBPF filter atomic op code %02x (@%d) unsupported\n", + code, i); + return -ENOTSUPP; + } + /* *(u64 *)(dst + off) += src */ + EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], dst_reg, off)); tmp_idx = ctx->idx * 4; EMIT(PPC_RAW_LDARX(b2p[TMP_REG_2], 0, b2p[TMP_REG_1], 0)); diff --git a/arch/riscv/net/bpf_jit_comp32.c b/arch/riscv/net/bpf_jit_comp32.c index f300f93ba645..e6497424cbf6 100644 --- a/arch/riscv/net/bpf_jit_comp32.c +++ b/arch/riscv/net/bpf_jit_comp32.c @@ -881,7 +881,7 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off, const s8 *rd = bpf_get_reg64(dst, tmp1, ctx); const s8 *rs = bpf_get_reg64(src, tmp2, ctx);
- if (mode == BPF_XADD && size != BPF_W) + if (mode == BPF_ATOMIC && size != BPF_W) return -1;
emit_imm(RV_REG_T0, off, ctx); @@ -899,7 +899,7 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off, case BPF_MEM: emit(rv_sw(RV_REG_T0, 0, lo(rs)), ctx); break; - case BPF_XADD: + case BPF_ATOMIC: /* Only BPF_ADD supported */ emit(rv_amoadd_w(RV_REG_ZERO, lo(rs), RV_REG_T0, 0, 0), ctx); break; @@ -1264,7 +1264,6 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx, case BPF_STX | BPF_MEM | BPF_H: case BPF_STX | BPF_MEM | BPF_W: case BPF_STX | BPF_MEM | BPF_DW: - case BPF_STX | BPF_XADD | BPF_W: if (BPF_CLASS(code) == BPF_ST) { emit_imm32(tmp2, imm, ctx); src = tmp2; @@ -1275,8 +1274,21 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx, return -1; break;
+ case BPF_STX | BPF_ATOMIC | BPF_W: + if (insn->imm != BPF_ADD) { + pr_info_once( + "bpf-jit: not supported: atomic operation %02x ***\n", + insn->imm); + return -EFAULT; + } + + if (emit_store_r64(dst, src, off, ctx, BPF_SIZE(code), + BPF_MODE(code))) + return -1; + break; + /* No hardware support for 8-byte atomics in RV32. */ - case BPF_STX | BPF_XADD | BPF_DW: + case BPF_STX | BPF_ATOMIC | BPF_DW: /* Fallthrough. */
notsupported: diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index c113ae818b14..8ce394fe5177 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -1031,10 +1031,18 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx, emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); emit_sd(RV_REG_T1, 0, rs, ctx); break; - /* STX XADD: lock *(u32 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_W: - /* STX XADD: lock *(u64 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_DW: + case BPF_STX | BPF_ATOMIC | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_DW: + if (insn->imm != BPF_ADD) { + pr_err("bpf-jit: not supported: atomic operation %02x ***\n", + insn->imm); + return -EINVAL; + } + + /* atomic_add: lock *(u32 *)(dst + off) += src + * atomic_add: lock *(u64 *)(dst + off) += src + */ + if (off) { if (is_12b_int(off)) { emit_addi(RV_REG_T1, rd, off, ctx); diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index cd0cbdafedbd..c1fea914fb16 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -1216,18 +1216,23 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, jit->seen |= SEEN_MEM; break; /* - * BPF_STX XADD (atomic_add) + * BPF_ATOMIC */ - case BPF_STX | BPF_XADD | BPF_W: /* *(u32 *)(dst + off) += src */ - /* laal %w0,%src,off(%dst) */ - EMIT6_DISP_LH(0xeb000000, 0x00fa, REG_W0, src_reg, - dst_reg, off); - jit->seen |= SEEN_MEM; - break; - case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */ - /* laalg %w0,%src,off(%dst) */ - EMIT6_DISP_LH(0xeb000000, 0x00ea, REG_W0, src_reg, - dst_reg, off); + case BPF_STX | BPF_ATOMIC | BPF_DW: + case BPF_STX | BPF_ATOMIC | BPF_W: + if (insn->imm != BPF_ADD) { + pr_err("Unknown atomic operation %02x\n", insn->imm); + return -1; + } + + /* *(u32/u64 *)(dst + off) += src + * + * BFW_W: laal %w0,%src,off(%dst) + * BPF_DW: laalg %w0,%src,off(%dst) + */ + EMIT6_DISP_LH(0xeb000000, + BPF_SIZE(insn->code) == BPF_W ? 0x00fa : 0x00ea, + REG_W0, src_reg, dst_reg, off); jit->seen |= SEEN_MEM; break; /* diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c index fef734473c0f..9a2f20cbd48b 100644 --- a/arch/sparc/net/bpf_jit_comp_64.c +++ b/arch/sparc/net/bpf_jit_comp_64.c @@ -1369,12 +1369,18 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) break; }
- /* STX XADD: lock *(u32 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_W: { + case BPF_STX | BPF_ATOMIC | BPF_W: { const u8 tmp = bpf2sparc[TMP_REG_1]; const u8 tmp2 = bpf2sparc[TMP_REG_2]; const u8 tmp3 = bpf2sparc[TMP_REG_3];
+ if (insn->imm != BPF_ADD) { + pr_err_once("unknown atomic op %02x\n", insn->imm); + return -EINVAL; + } + + /* lock *(u32 *)(dst + off) += src */ + if (insn->dst_reg == BPF_REG_FP) ctx->saw_frame_pointer = true;
@@ -1393,11 +1399,16 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) break; } /* STX XADD: lock *(u64 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_DW: { + case BPF_STX | BPF_ATOMIC | BPF_DW: { const u8 tmp = bpf2sparc[TMP_REG_1]; const u8 tmp2 = bpf2sparc[TMP_REG_2]; const u8 tmp3 = bpf2sparc[TMP_REG_3];
+ if (insn->imm != BPF_ADD) { + pr_err_once("unknown atomic op %02x\n", insn->imm); + return -EINVAL; + } + if (insn->dst_reg == BPF_REG_FP) ctx->saw_frame_pointer = true;
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index d35383a66f15..4c154a5f6220 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -798,6 +798,33 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) *pprog = prog; }
+static int emit_atomic(u8 **pprog, u8 atomic_op, + u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size) +{ + u8 *prog = *pprog; + int cnt = 0; + + EMIT1(0xF0); /* lock prefix */ + + maybe_emit_mod(&prog, dst_reg, src_reg, bpf_size == BPF_DW); + + /* emit opcode */ + switch (atomic_op) { + case BPF_ADD: + /* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */ + EMIT1(simple_alu_opcodes[atomic_op]); + break; + default: + pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op); + return -EFAULT; + } + + emit_insn_suffix(&prog, dst_reg, src_reg, off); + + *pprog = prog; + return 0; +} + bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs) { u32 reg = x->fixup >> 8; @@ -840,6 +867,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, int i, cnt = 0, excnt = 0; int proglen = 0; u8 *prog = temp; + int err;
detect_reg_usage(insn, insn_cnt, callee_regs_used, &tail_call_seen); @@ -1253,18 +1281,12 @@ st: if (is_imm8(insn->off)) } break;
- /* STX XADD: lock *(u32*)(dst_reg + off) += src_reg */ - case BPF_STX | BPF_XADD | BPF_W: - /* Emit 'lock add dword ptr [rax + off], eax' */ - if (is_ereg(dst_reg) || is_ereg(src_reg)) - EMIT3(0xF0, add_2mod(0x40, dst_reg, src_reg), 0x01); - else - EMIT2(0xF0, 0x01); - goto xadd; - case BPF_STX | BPF_XADD | BPF_DW: - EMIT3(0xF0, add_2mod(0x48, dst_reg, src_reg), 0x01); -xadd: - emit_modrm_dstoff(&prog, dst_reg, src_reg, insn->off); + case BPF_STX | BPF_ATOMIC | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_DW: + err = emit_atomic(&prog, insn->imm, dst_reg, src_reg, + insn->off, BPF_SIZE(insn->code)); + if (err) + return err; break;
/* call */ diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c index 4bd0f98df700..e6108287eef1 100644 --- a/arch/x86/net/bpf_jit_comp32.c +++ b/arch/x86/net/bpf_jit_comp32.c @@ -2249,10 +2249,8 @@ emit_cond_jmp: jmp_cond = get_cond_jmp_opcode(BPF_OP(code), false); return -EFAULT; } break; - /* STX XADD: lock *(u32 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_W: - /* STX XADD: lock *(u64 *)(dst + off) += src */ - case BPF_STX | BPF_XADD | BPF_DW: + case BPF_STX | BPF_ATOMIC | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_DW: goto notyet; case BPF_JMP | BPF_EXIT: if (seen_exit) { diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c index 0a721f6e8676..e31f8fbbc696 100644 --- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c +++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c @@ -3109,13 +3109,19 @@ mem_xadd(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, bool is64) return 0; }
-static int mem_xadd4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta) +static int mem_atomic4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta) { + if (meta->insn.imm != BPF_ADD) + return -EOPNOTSUPP; + return mem_xadd(nfp_prog, meta, false); }
-static int mem_xadd8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta) +static int mem_atomic8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta) { + if (meta->insn.imm != BPF_ADD) + return -EOPNOTSUPP; + return mem_xadd(nfp_prog, meta, true); }
@@ -3475,8 +3481,8 @@ static const instr_cb_t instr_cb[256] = { [BPF_STX | BPF_MEM | BPF_H] = mem_stx2, [BPF_STX | BPF_MEM | BPF_W] = mem_stx4, [BPF_STX | BPF_MEM | BPF_DW] = mem_stx8, - [BPF_STX | BPF_XADD | BPF_W] = mem_xadd4, - [BPF_STX | BPF_XADD | BPF_DW] = mem_xadd8, + [BPF_STX | BPF_ATOMIC | BPF_W] = mem_atomic4, + [BPF_STX | BPF_ATOMIC | BPF_DW] = mem_atomic8, [BPF_ST | BPF_MEM | BPF_B] = mem_st1, [BPF_ST | BPF_MEM | BPF_H] = mem_st2, [BPF_ST | BPF_MEM | BPF_W] = mem_st4, diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h b/drivers/net/ethernet/netronome/nfp/bpf/main.h index c74620fcc539..16841bb750b7 100644 --- a/drivers/net/ethernet/netronome/nfp/bpf/main.h +++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h @@ -428,9 +428,9 @@ static inline bool is_mbpf_classic_store_pkt(const struct nfp_insn_meta *meta) return is_mbpf_classic_store(meta) && meta->ptr.type == PTR_TO_PACKET; }
-static inline bool is_mbpf_xadd(const struct nfp_insn_meta *meta) +static inline bool is_mbpf_atomic(const struct nfp_insn_meta *meta) { - return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_XADD); + return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_ATOMIC); }
static inline bool is_mbpf_mul(const struct nfp_insn_meta *meta) diff --git a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c index e92ee510fd52..9d235c0ce46a 100644 --- a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c +++ b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c @@ -479,7 +479,7 @@ nfp_bpf_check_ptr(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, pr_vlog(env, "map writes not supported\n"); return -EOPNOTSUPP; } - if (is_mbpf_xadd(meta)) { + if (is_mbpf_atomic(meta)) { err = nfp_bpf_map_mark_used(env, meta, reg, NFP_MAP_USE_ATOMIC_CNT); if (err) @@ -523,12 +523,17 @@ nfp_bpf_check_store(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, }
static int -nfp_bpf_check_xadd(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, - struct bpf_verifier_env *env) +nfp_bpf_check_atomic(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, + struct bpf_verifier_env *env) { const struct bpf_reg_state *sreg = cur_regs(env) + meta->insn.src_reg; const struct bpf_reg_state *dreg = cur_regs(env) + meta->insn.dst_reg;
+ if (meta->insn.imm != BPF_ADD) { + pr_vlog(env, "atomic op not implemented: %d\n", meta->insn.imm); + return -EOPNOTSUPP; + } + if (dreg->type != PTR_TO_MAP_VALUE) { pr_vlog(env, "atomic add not to a map value pointer: %d\n", dreg->type); @@ -655,8 +660,8 @@ int nfp_verify_insn(struct bpf_verifier_env *env, int insn_idx, if (is_mbpf_store(meta)) return nfp_bpf_check_store(nfp_prog, meta, env);
- if (is_mbpf_xadd(meta)) - return nfp_bpf_check_xadd(nfp_prog, meta, env); + if (is_mbpf_atomic(meta)) + return nfp_bpf_check_atomic(nfp_prog, meta, env);
if (is_mbpf_alu(meta)) return nfp_bpf_check_alu(nfp_prog, meta, env); diff --git a/include/linux/filter.h b/include/linux/filter.h index 45ba2a61f9e5..bd45146af0d6 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -264,15 +264,23 @@ static inline bool insn_is_zext(const struct bpf_insn *insn) .off = OFF, \ .imm = 0 })
-/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */
-#define BPF_STX_XADD(SIZE, DST, SRC, OFF) \ +/* + * Atomic operations: + * + * BPF_ADD *(uint *) (dst_reg + off16) += src_reg + */ + +#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \ ((struct bpf_insn) { \ - .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \ + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \ .dst_reg = DST, \ .src_reg = SRC, \ .off = OFF, \ - .imm = 0 }) + .imm = OP }) + +/* Legacy alias */ +#define BPF_STX_XADD(SIZE, DST, SRC, OFF) BPF_ATOMIC_OP(SIZE, BPF_ADD, DST, SRC, OFF)
/* Memory store, *(uint *) (dst_reg + off16) = imm32 */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 04f78c6cd567..5e5730630674 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -19,7 +19,8 @@
/* ld/ldx fields */ #define BPF_DW 0x18 /* double word (64-bit) */ -#define BPF_XADD 0xc0 /* exclusive add */ +#define BPF_ATOMIC 0xc0 /* atomic memory ops - op type in immediate */ +#define BPF_XADD 0xc0 /* exclusive add - legacy name */
/* alu/jmp fields */ #define BPF_MOV 0xb0 /* mov reg to reg */ @@ -2449,7 +2450,7 @@ union bpf_attr { * running simultaneously. * * A user should care about the synchronization by himself. - * For example, by using the **BPF_STX_XADD** instruction to alter + * For example, by using the **BPF_ATOMIC** instructions to alter * the shared data. * Return * A pointer to the local storage area. diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index e66f59f1ed6b..1d26ebb11e6f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1321,8 +1321,8 @@ EXPORT_SYMBOL_GPL(__bpf_call_base); INSN_3(STX, MEM, H), \ INSN_3(STX, MEM, W), \ INSN_3(STX, MEM, DW), \ - INSN_3(STX, XADD, W), \ - INSN_3(STX, XADD, DW), \ + INSN_3(STX, ATOMIC, W), \ + INSN_3(STX, ATOMIC, DW), \ /* Immediate based. */ \ INSN_3(ST, MEM, B), \ INSN_3(ST, MEM, H), \ @@ -1670,13 +1670,25 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) LDX_PROBE(DW, 8) #undef LDX_PROBE
- STX_XADD_W: /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */ - atomic_add((u32) SRC, (atomic_t *)(unsigned long) - (DST + insn->off)); + STX_ATOMIC_W: + switch (IMM) { + case BPF_ADD: + /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */ + atomic_add((u32) SRC, (atomic_t *)(unsigned long) + (DST + insn->off)); + default: + goto default_label; + } CONT; - STX_XADD_DW: /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */ - atomic64_add((u64) SRC, (atomic64_t *)(unsigned long) - (DST + insn->off)); + STX_ATOMIC_DW: + switch (IMM) { + case BPF_ADD: + /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */ + atomic64_add((u64) SRC, (atomic64_t *)(unsigned long) + (DST + insn->off)); + default: + goto default_label; + } CONT;
default_label: @@ -1686,7 +1698,8 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) * * Note, verifier whitelists all opcodes in bpf_opcode_in_insntable(). */ - pr_warn("BPF interpreter: unknown opcode %02x\n", insn->code); + pr_warn("BPF interpreter: unknown opcode %02x (imm: 0x%x)\n", + insn->code, insn->imm); BUG_ON(1); return 0; } diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index ff1dd7d45b58..25536dcce864 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -153,14 +153,16 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs, bpf_ldst_string[BPF_SIZE(insn->code) >> 3], insn->dst_reg, insn->off, insn->src_reg); - else if (BPF_MODE(insn->code) == BPF_XADD) + else if (BPF_MODE(insn->code) == BPF_ATOMIC && + insn->imm == BPF_ADD) { verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) += r%d\n", insn->code, bpf_ldst_string[BPF_SIZE(insn->code) >> 3], insn->dst_reg, insn->off, insn->src_reg); - else + } else { verbose(cbs->private_data, "BUG_%02x\n", insn->code); + } } else if (class == BPF_ST) { if (BPF_MODE(insn->code) == BPF_MEM) { verbose(cbs->private_data, "(%02x) *(%s *)(r%d %+d) = %d\n", diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a362c9f63778..694c4a31626f 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3961,13 +3961,17 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn return err; }
-static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) +static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) { int err;
- if ((BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) || - insn->imm != 0) { - verbose(env, "BPF_XADD uses reserved fields\n"); + if (insn->imm != BPF_ADD) { + verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm); + return -EINVAL; + } + + if (BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) { + verbose(env, "invalid atomic operand size\n"); return -EINVAL; }
@@ -3990,19 +3994,19 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins is_pkt_reg(env, insn->dst_reg) || is_flow_key_reg(env, insn->dst_reg) || is_sk_reg(env, insn->dst_reg)) { - verbose(env, "BPF_XADD stores into R%d %s is not allowed\n", + verbose(env, "BPF_ATOMIC stores into R%d %s is not allowed\n", insn->dst_reg, reg_type_str(env, reg_state(env, insn->dst_reg)->type)); return -EACCES; }
- /* check whether atomic_add can read the memory */ + /* check whether we can read the memory */ err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off, BPF_SIZE(insn->code), BPF_READ, -1, true); if (err) return err;
- /* check whether atomic_add can write into the same memory */ + /* check whether we can write into the same memory */ return check_mem_access(env, insn_idx, insn->dst_reg, insn->off, BPF_SIZE(insn->code), BPF_WRITE, -1, true); } @@ -9985,8 +9989,8 @@ static int do_check(struct bpf_verifier_env *env) } else if (class == BPF_STX) { enum bpf_reg_type *prev_dst_type, dst_reg_type;
- if (BPF_MODE(insn->code) == BPF_XADD) { - err = check_xadd(env, env->insn_idx, insn); + if (BPF_MODE(insn->code) == BPF_ATOMIC) { + err = check_atomic(env, env->insn_idx, insn); if (err) return err; env->insn_idx++; @@ -10461,7 +10465,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
if (BPF_CLASS(insn->code) == BPF_STX && ((BPF_MODE(insn->code) != BPF_MEM && - BPF_MODE(insn->code) != BPF_XADD) || insn->imm != 0)) { + BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) { verbose(env, "BPF_STX uses reserved fields\n"); return -EINVAL; } diff --git a/lib/test_bpf.c b/lib/test_bpf.c index 4a9137c8551a..aba3c7e43ee5 100644 --- a/lib/test_bpf.c +++ b/lib/test_bpf.c @@ -4295,13 +4295,13 @@ static struct bpf_test tests[] = { { { 0, 0xffffffff } }, .stack_depth = 40, }, - /* BPF_STX | BPF_XADD | BPF_W/DW */ + /* BPF_STX | BPF_ATOMIC | BPF_W/DW */ { "STX_XADD_W: Test: 0x12 + 0x10 = 0x22", .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_W, R10, -40, 0x10), - BPF_STX_XADD(BPF_W, R10, R0, -40), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, R10, R0, -40), BPF_LDX_MEM(BPF_W, R0, R10, -40), BPF_EXIT_INSN(), }, @@ -4316,7 +4316,7 @@ static struct bpf_test tests[] = { BPF_ALU64_REG(BPF_MOV, R1, R10), BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_W, R10, -40, 0x10), - BPF_STX_XADD(BPF_W, R10, R0, -40), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, R10, R0, -40), BPF_ALU64_REG(BPF_MOV, R0, R10), BPF_ALU64_REG(BPF_SUB, R0, R1), BPF_EXIT_INSN(), @@ -4331,7 +4331,7 @@ static struct bpf_test tests[] = { .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_W, R10, -40, 0x10), - BPF_STX_XADD(BPF_W, R10, R0, -40), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, R10, R0, -40), BPF_EXIT_INSN(), }, INTERNAL, @@ -4352,7 +4352,7 @@ static struct bpf_test tests[] = { .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_DW, R10, -40, 0x10), - BPF_STX_XADD(BPF_DW, R10, R0, -40), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, R10, R0, -40), BPF_LDX_MEM(BPF_DW, R0, R10, -40), BPF_EXIT_INSN(), }, @@ -4367,7 +4367,7 @@ static struct bpf_test tests[] = { BPF_ALU64_REG(BPF_MOV, R1, R10), BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_DW, R10, -40, 0x10), - BPF_STX_XADD(BPF_DW, R10, R0, -40), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, R10, R0, -40), BPF_ALU64_REG(BPF_MOV, R0, R10), BPF_ALU64_REG(BPF_SUB, R0, R1), BPF_EXIT_INSN(), @@ -4382,7 +4382,7 @@ static struct bpf_test tests[] = { .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_DW, R10, -40, 0x10), - BPF_STX_XADD(BPF_DW, R10, R0, -40), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, R10, R0, -40), BPF_EXIT_INSN(), }, INTERNAL, diff --git a/samples/bpf/bpf_insn.h b/samples/bpf/bpf_insn.h index 544237980582..db67a2847395 100644 --- a/samples/bpf/bpf_insn.h +++ b/samples/bpf/bpf_insn.h @@ -138,11 +138,11 @@ struct bpf_insn;
#define BPF_STX_XADD(SIZE, DST, SRC, OFF) \ ((struct bpf_insn) { \ - .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \ + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \ .dst_reg = DST, \ .src_reg = SRC, \ .off = OFF, \ - .imm = 0 }) + .imm = BPF_ADD })
/* Memory store, *(uint *) (dst_reg + off16) = imm32 */
diff --git a/samples/bpf/cookie_uid_helper_example.c b/samples/bpf/cookie_uid_helper_example.c index deb0e3e0324d..c5ff7a13918c 100644 --- a/samples/bpf/cookie_uid_helper_example.c +++ b/samples/bpf/cookie_uid_helper_example.c @@ -147,12 +147,12 @@ static void prog_load(void) */ BPF_MOV64_REG(BPF_REG_9, BPF_REG_0), BPF_MOV64_IMM(BPF_REG_1, 1), - BPF_STX_XADD(BPF_DW, BPF_REG_9, BPF_REG_1, - offsetof(struct stats, packets)), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_9, BPF_REG_1, + offsetof(struct stats, packets)), BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6, offsetof(struct __sk_buff, len)), - BPF_STX_XADD(BPF_DW, BPF_REG_9, BPF_REG_1, - offsetof(struct stats, bytes)), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_9, BPF_REG_1, + offsetof(struct stats, bytes)), BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct __sk_buff, len)), BPF_EXIT_INSN(), diff --git a/samples/bpf/sock_example.c b/samples/bpf/sock_example.c index 00aae1d33fca..23d1930e1927 100644 --- a/samples/bpf/sock_example.c +++ b/samples/bpf/sock_example.c @@ -54,7 +54,7 @@ static int test_sock(void) BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2), BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */ - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */ + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0), BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */ BPF_EXIT_INSN(), }; diff --git a/samples/bpf/test_cgrp2_attach.c b/samples/bpf/test_cgrp2_attach.c index 20fbd1241db3..390ff38d2ac6 100644 --- a/samples/bpf/test_cgrp2_attach.c +++ b/samples/bpf/test_cgrp2_attach.c @@ -53,7 +53,7 @@ static int prog_load(int map_fd, int verdict) BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2), BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */ - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */ + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
/* Count bytes */ BPF_MOV64_IMM(BPF_REG_0, MAP_KEY_BYTES), /* r0 = 1 */ @@ -64,7 +64,8 @@ static int prog_load(int map_fd, int verdict) BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2), BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6, offsetof(struct __sk_buff, len)), /* r1 = skb->len */ - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */ + + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
BPF_MOV64_IMM(BPF_REG_0, verdict), /* r0 = verdict */ BPF_EXIT_INSN(), diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h index ca28b6ab8db7..e870c9039f0d 100644 --- a/tools/include/linux/filter.h +++ b/tools/include/linux/filter.h @@ -169,15 +169,22 @@ .off = OFF, \ .imm = 0 })
-/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */ +/* + * Atomic operations: + * + * BPF_ADD *(uint *) (dst_reg + off16) += src_reg + */
-#define BPF_STX_XADD(SIZE, DST, SRC, OFF) \ +#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \ ((struct bpf_insn) { \ - .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \ + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \ .dst_reg = DST, \ .src_reg = SRC, \ .off = OFF, \ - .imm = 0 }) + .imm = OP }) + +/* Legacy alias */ +#define BPF_STX_XADD(SIZE, DST, SRC, OFF) BPF_ATOMIC_OP(SIZE, BPF_ADD, DST, SRC, OFF)
/* Memory store, *(uint *) (dst_reg + off16) = imm32 */
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index acdc91cd5d6b..a82f3e4f2f5d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -19,7 +19,8 @@
/* ld/ldx fields */ #define BPF_DW 0x18 /* double word (64-bit) */ -#define BPF_XADD 0xc0 /* exclusive add */ +#define BPF_ATOMIC 0xc0 /* atomic memory ops - op type in immediate */ +#define BPF_XADD 0xc0 /* exclusive add - legacy name */
/* alu/jmp fields */ #define BPF_MOV 0xb0 /* mov reg to reg */ @@ -2449,7 +2450,7 @@ union bpf_attr { * running simultaneously. * * A user should care about the synchronization by himself. - * For example, by using the **BPF_STX_XADD** instruction to alter + * For example, by using the **BPF_ATOMIC** instructions to alter * the shared data. * Return * A pointer to the local storage area. diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c b/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c index b549fcfacc0b..0a1fc9816cef 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c +++ b/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c @@ -45,13 +45,13 @@ static int prog_load_cnt(int verdict, int val) BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2), BPF_MOV64_IMM(BPF_REG_1, val), /* r1 = 1 */ - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */ + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
BPF_LD_MAP_FD(BPF_REG_1, cgroup_storage_fd), BPF_MOV64_IMM(BPF_REG_2, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_local_storage), BPF_MOV64_IMM(BPF_REG_1, val), - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_W, BPF_REG_0, BPF_REG_1, 0, 0), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
BPF_LD_MAP_FD(BPF_REG_1, percpu_cgroup_storage_fd), BPF_MOV64_IMM(BPF_REG_2, 0), diff --git a/tools/testing/selftests/bpf/test_cgroup_storage.c b/tools/testing/selftests/bpf/test_cgroup_storage.c index d946252a25bb..0cda61da5d39 100644 --- a/tools/testing/selftests/bpf/test_cgroup_storage.c +++ b/tools/testing/selftests/bpf/test_cgroup_storage.c @@ -29,7 +29,7 @@ int main(int argc, char **argv) BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_local_storage), BPF_MOV64_IMM(BPF_REG_1, 1), - BPF_STX_XADD(BPF_DW, BPF_REG_0, BPF_REG_1, 0), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0), BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0x1), BPF_MOV64_REG(BPF_REG_0, BPF_REG_1), diff --git a/tools/testing/selftests/bpf/verifier/ctx.c b/tools/testing/selftests/bpf/verifier/ctx.c index 93d6b1641481..23080862aafd 100644 --- a/tools/testing/selftests/bpf/verifier/ctx.c +++ b/tools/testing/selftests/bpf/verifier/ctx.c @@ -10,14 +10,13 @@ .prog_type = BPF_PROG_TYPE_SCHED_CLS, }, { - "context stores via XADD", + "context stores via BPF_ATOMIC", .insns = { BPF_MOV64_IMM(BPF_REG_0, 0), - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_W, BPF_REG_1, - BPF_REG_0, offsetof(struct __sk_buff, mark), 0), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_1, BPF_REG_0, offsetof(struct __sk_buff, mark)), BPF_EXIT_INSN(), }, - .errstr = "BPF_XADD stores into R1 ctx is not allowed", + .errstr = "BPF_ATOMIC stores into R1 ctx is not allowed", .result = REJECT, .prog_type = BPF_PROG_TYPE_SCHED_CLS, }, diff --git a/tools/testing/selftests/bpf/verifier/direct_packet_access.c b/tools/testing/selftests/bpf/verifier/direct_packet_access.c index ae72536603fe..ac1e19d0f520 100644 --- a/tools/testing/selftests/bpf/verifier/direct_packet_access.c +++ b/tools/testing/selftests/bpf/verifier/direct_packet_access.c @@ -333,7 +333,7 @@ BPF_MOV64_REG(BPF_REG_4, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -8), BPF_STX_MEM(BPF_DW, BPF_REG_4, BPF_REG_2, 0), - BPF_STX_XADD(BPF_DW, BPF_REG_4, BPF_REG_5, 0), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_4, BPF_REG_5, 0), BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_4, 0), BPF_STX_MEM(BPF_W, BPF_REG_2, BPF_REG_5, 0), BPF_MOV64_IMM(BPF_REG_0, 0), @@ -488,7 +488,7 @@ BPF_JMP_REG(BPF_JGT, BPF_REG_0, BPF_REG_3, 11), BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_10, -8), BPF_MOV64_IMM(BPF_REG_4, 0xffffffff), - BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_4, -8), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_10, BPF_REG_4, -8), BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -8), BPF_ALU64_IMM(BPF_RSH, BPF_REG_4, 49), BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_2), diff --git a/tools/testing/selftests/bpf/verifier/leak_ptr.c b/tools/testing/selftests/bpf/verifier/leak_ptr.c index d6eec17f2cd2..73f0dea95546 100644 --- a/tools/testing/selftests/bpf/verifier/leak_ptr.c +++ b/tools/testing/selftests/bpf/verifier/leak_ptr.c @@ -5,7 +5,7 @@ BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, offsetof(struct __sk_buff, cb[0])), BPF_LD_MAP_FD(BPF_REG_2, 0), - BPF_STX_XADD(BPF_DW, BPF_REG_1, BPF_REG_2, + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_1, BPF_REG_2, offsetof(struct __sk_buff, cb[0])), BPF_EXIT_INSN(), }, @@ -13,7 +13,7 @@ .errstr_unpriv = "R2 leaks addr into mem", .result_unpriv = REJECT, .result = REJECT, - .errstr = "BPF_XADD stores into R1 ctx is not allowed", + .errstr = "BPF_ATOMIC stores into R1 ctx is not allowed", }, { "leak pointer into ctx 2", @@ -21,14 +21,14 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, offsetof(struct __sk_buff, cb[0])), - BPF_STX_XADD(BPF_DW, BPF_REG_1, BPF_REG_10, + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_1, BPF_REG_10, offsetof(struct __sk_buff, cb[0])), BPF_EXIT_INSN(), }, .errstr_unpriv = "R10 leaks addr into mem", .result_unpriv = REJECT, .result = REJECT, - .errstr = "BPF_XADD stores into R1 ctx is not allowed", + .errstr = "BPF_ATOMIC stores into R1 ctx is not allowed", }, { "leak pointer into ctx 3", @@ -56,7 +56,7 @@ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3), BPF_MOV64_IMM(BPF_REG_3, 0), BPF_STX_MEM(BPF_DW, BPF_REG_0, BPF_REG_3, 0), - BPF_STX_XADD(BPF_DW, BPF_REG_0, BPF_REG_6, 0), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_6, 0), BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, diff --git a/tools/testing/selftests/bpf/verifier/meta_access.c b/tools/testing/selftests/bpf/verifier/meta_access.c index 205292b8dd65..b45e8af41420 100644 --- a/tools/testing/selftests/bpf/verifier/meta_access.c +++ b/tools/testing/selftests/bpf/verifier/meta_access.c @@ -171,7 +171,7 @@ BPF_MOV64_IMM(BPF_REG_5, 42), BPF_MOV64_IMM(BPF_REG_6, 24), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_5, -8), - BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_6, -8), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_10, BPF_REG_6, -8), BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -8), BPF_JMP_IMM(BPF_JGT, BPF_REG_5, 100, 6), BPF_ALU64_REG(BPF_ADD, BPF_REG_3, BPF_REG_5), @@ -196,7 +196,7 @@ BPF_MOV64_IMM(BPF_REG_5, 42), BPF_MOV64_IMM(BPF_REG_6, 24), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_5, -8), - BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_6, -8), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_10, BPF_REG_6, -8), BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -8), BPF_JMP_IMM(BPF_JGT, BPF_REG_5, 100, 6), BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_5), diff --git a/tools/testing/selftests/bpf/verifier/unpriv.c b/tools/testing/selftests/bpf/verifier/unpriv.c index 9dfb68c8c78d..111801aea5e3 100644 --- a/tools/testing/selftests/bpf/verifier/unpriv.c +++ b/tools/testing/selftests/bpf/verifier/unpriv.c @@ -207,7 +207,8 @@ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, -8), BPF_STX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, 0), BPF_MOV64_IMM(BPF_REG_0, 1), - BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_10, BPF_REG_0, -8, 0), + BPF_RAW_INSN(BPF_STX | BPF_ATOMIC | BPF_DW, + BPF_REG_10, BPF_REG_0, -8, BPF_ADD), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_hash_recalc), BPF_EXIT_INSN(), diff --git a/tools/testing/selftests/bpf/verifier/value_illegal_alu.c b/tools/testing/selftests/bpf/verifier/value_illegal_alu.c index ed1c2cea1dea..489062867218 100644 --- a/tools/testing/selftests/bpf/verifier/value_illegal_alu.c +++ b/tools/testing/selftests/bpf/verifier/value_illegal_alu.c @@ -82,7 +82,7 @@ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_0, 0), - BPF_STX_XADD(BPF_DW, BPF_REG_2, BPF_REG_3, 0), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_2, BPF_REG_3, 0), BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0), BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 22), BPF_EXIT_INSN(), diff --git a/tools/testing/selftests/bpf/verifier/xadd.c b/tools/testing/selftests/bpf/verifier/xadd.c index c5de2e62cc8b..b96ef3526815 100644 --- a/tools/testing/selftests/bpf/verifier/xadd.c +++ b/tools/testing/selftests/bpf/verifier/xadd.c @@ -3,7 +3,7 @@ .insns = { BPF_MOV64_IMM(BPF_REG_0, 1), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8), - BPF_STX_XADD(BPF_W, BPF_REG_10, BPF_REG_0, -7), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_10, BPF_REG_0, -7), BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8), BPF_EXIT_INSN(), }, @@ -22,7 +22,7 @@ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1), BPF_EXIT_INSN(), BPF_MOV64_IMM(BPF_REG_1, 1), - BPF_STX_XADD(BPF_W, BPF_REG_0, BPF_REG_1, 3), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_0, BPF_REG_1, 3), BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 3), BPF_EXIT_INSN(), }, @@ -45,13 +45,13 @@ BPF_MOV64_IMM(BPF_REG_0, 1), BPF_ST_MEM(BPF_W, BPF_REG_2, 0, 0), BPF_ST_MEM(BPF_W, BPF_REG_2, 3, 0), - BPF_STX_XADD(BPF_W, BPF_REG_2, BPF_REG_0, 1), - BPF_STX_XADD(BPF_W, BPF_REG_2, BPF_REG_0, 2), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_2, BPF_REG_0, 1), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_2, BPF_REG_0, 2), BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_2, 1), BPF_EXIT_INSN(), }, .result = REJECT, - .errstr = "BPF_XADD stores into R2 pkt is not allowed", + .errstr = "BPF_ATOMIC stores into R2 pkt is not allowed", .prog_type = BPF_PROG_TYPE_XDP, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, }, @@ -62,8 +62,8 @@ BPF_MOV64_REG(BPF_REG_6, BPF_REG_0), BPF_MOV64_REG(BPF_REG_7, BPF_REG_10), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8), - BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_0, -8), - BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_0, -8), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_10, BPF_REG_0, -8), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_10, BPF_REG_0, -8), BPF_JMP_REG(BPF_JNE, BPF_REG_6, BPF_REG_0, 3), BPF_JMP_REG(BPF_JNE, BPF_REG_7, BPF_REG_10, 2), BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8), @@ -82,8 +82,8 @@ BPF_MOV64_REG(BPF_REG_6, BPF_REG_0), BPF_MOV64_REG(BPF_REG_7, BPF_REG_10), BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -8), - BPF_STX_XADD(BPF_W, BPF_REG_10, BPF_REG_0, -8), - BPF_STX_XADD(BPF_W, BPF_REG_10, BPF_REG_0, -8), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_10, BPF_REG_0, -8), + BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_10, BPF_REG_0, -8), BPF_JMP_REG(BPF_JNE, BPF_REG_6, BPF_REG_0, 3), BPF_JMP_REG(BPF_JNE, BPF_REG_7, BPF_REG_10, 2), BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8),
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit c5bcb5eb4db632280b4123135d583a7bc8caea3e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
I can't find a reason why this code is in resolve_pseudo_ldimm64; since I'll be modifying it in a subsequent commit, tidy it up.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210114181751.768687-6-jackmanb@google.com (cherry picked from commit c5bcb5eb4db632280b4123135d583a7bc8caea3e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 694c4a31626f..982c38fb3604 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9989,6 +9989,12 @@ static int do_check(struct bpf_verifier_env *env) } else if (class == BPF_STX) { enum bpf_reg_type *prev_dst_type, dst_reg_type;
+ if (((BPF_MODE(insn->code) != BPF_MEM && + BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) { + verbose(env, "BPF_STX uses reserved fields\n"); + return -EINVAL; + } + if (BPF_MODE(insn->code) == BPF_ATOMIC) { err = check_atomic(env, env->insn_idx, insn); if (err) @@ -10463,13 +10469,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) return -EINVAL; }
- if (BPF_CLASS(insn->code) == BPF_STX && - ((BPF_MODE(insn->code) != BPF_MEM && - BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) { - verbose(env, "BPF_STX uses reserved fields\n"); - return -EINVAL; - } - if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) { struct bpf_insn_aux_data *aux; struct bpf_map *map;
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 5ca419f2864a2c60940dcf4bbaeb69546200e36f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC instructions, in order to have the previous value of the atomically-modified memory location loaded into the src register after an atomic op is carried out.
Suggested-by: Yonghong Song yhs@fb.com Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com (cherry picked from commit 5ca419f2864a2c60940dcf4bbaeb69546200e36f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 4 ++++ include/linux/filter.h | 1 + include/uapi/linux/bpf.h | 3 +++ kernel/bpf/core.c | 13 +++++++++++++ kernel/bpf/disasm.c | 7 +++++++ kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++--------- tools/include/linux/filter.h | 1 + tools/include/uapi/linux/bpf.h | 3 +++ 8 files changed, 56 insertions(+), 9 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 4c154a5f6220..d79585615580 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -814,6 +814,10 @@ static int emit_atomic(u8 **pprog, u8 atomic_op, /* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */ EMIT1(simple_alu_opcodes[atomic_op]); break; + case BPF_ADD | BPF_FETCH: + /* src_reg = atomic_fetch_add(dst_reg + off, src_reg); */ + EMIT2(0x0F, 0xC1); + break; default: pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op); return -EFAULT; diff --git a/include/linux/filter.h b/include/linux/filter.h index bd45146af0d6..9d4563850f54 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -269,6 +269,7 @@ static inline bool insn_is_zext(const struct bpf_insn *insn) * Atomic operations: * * BPF_ADD *(uint *) (dst_reg + off16) += src_reg + * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg); */
#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 5e5730630674..b2c03f930238 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -44,6 +44,9 @@ #define BPF_CALL 0x80 /* function call */ #define BPF_EXIT 0x90 /* function return */
+/* atomic op type fields (stored in immediate) */ +#define BPF_FETCH 0x01 /* fetch previous value into src reg */ + /* Register numbers */ enum { BPF_REG_0 = 0, diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 1d26ebb11e6f..db328aa9908e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1676,16 +1676,29 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */ atomic_add((u32) SRC, (atomic_t *)(unsigned long) (DST + insn->off)); + break; + case BPF_ADD | BPF_FETCH: + SRC = (u32) atomic_fetch_add( + (u32) SRC, + (atomic_t *)(unsigned long) (DST + insn->off)); + break; default: goto default_label; } CONT; + STX_ATOMIC_DW: switch (IMM) { case BPF_ADD: /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */ atomic64_add((u64) SRC, (atomic64_t *)(unsigned long) (DST + insn->off)); + break; + case BPF_ADD | BPF_FETCH: + SRC = (u64) atomic64_fetch_add( + (u64) SRC, + (atomic64_t *)(unsigned long) (DST + insn->off)); + break; default: goto default_label; } diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index 25536dcce864..0f64a9cf20b5 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -160,6 +160,13 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs, bpf_ldst_string[BPF_SIZE(insn->code) >> 3], insn->dst_reg, insn->off, insn->src_reg); + } else if (BPF_MODE(insn->code) == BPF_ATOMIC && + insn->imm == (BPF_ADD | BPF_FETCH)) { + verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_add((%s *)(r%d %+d), r%d)\n", + insn->code, insn->src_reg, + BPF_SIZE(insn->code) == BPF_DW ? "64" : "", + bpf_ldst_string[BPF_SIZE(insn->code) >> 3], + insn->dst_reg, insn->off, insn->src_reg); } else { verbose(cbs->private_data, "BUG_%02x\n", insn->code); } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 982c38fb3604..e9ce54a0a84b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3965,7 +3965,11 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i { int err;
- if (insn->imm != BPF_ADD) { + switch (insn->imm) { + case BPF_ADD: + case BPF_ADD | BPF_FETCH: + break; + default: verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm); return -EINVAL; } @@ -4007,8 +4011,20 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i return err;
/* check whether we can write into the same memory */ - return check_mem_access(env, insn_idx, insn->dst_reg, insn->off, - BPF_SIZE(insn->code), BPF_WRITE, -1, true); + err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off, + BPF_SIZE(insn->code), BPF_WRITE, -1, true); + if (err) + return err; + + if (!(insn->imm & BPF_FETCH)) + return 0; + + /* check and record load of old value into src reg */ + err = check_reg_arg(env, insn->src_reg, DST_OP); + if (err) + return err; + + return 0; }
/* When register 'regno' is used to read the stack (either directly or through @@ -9989,12 +10005,6 @@ static int do_check(struct bpf_verifier_env *env) } else if (class == BPF_STX) { enum bpf_reg_type *prev_dst_type, dst_reg_type;
- if (((BPF_MODE(insn->code) != BPF_MEM && - BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) { - verbose(env, "BPF_STX uses reserved fields\n"); - return -EINVAL; - } - if (BPF_MODE(insn->code) == BPF_ATOMIC) { err = check_atomic(env, env->insn_idx, insn); if (err) @@ -10003,6 +10013,11 @@ static int do_check(struct bpf_verifier_env *env) continue; }
+ if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) { + verbose(env, "BPF_STX uses reserved fields\n"); + return -EINVAL; + } + /* check src1 operand */ err = check_reg_arg(env, insn->src_reg, SRC_OP); if (err) diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h index e870c9039f0d..7211ce9fba53 100644 --- a/tools/include/linux/filter.h +++ b/tools/include/linux/filter.h @@ -173,6 +173,7 @@ * Atomic operations: * * BPF_ADD *(uint *) (dst_reg + off16) += src_reg + * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg); */
#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \ diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index a82f3e4f2f5d..ab3a7b8ecdf0 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -44,6 +44,9 @@ #define BPF_CALL 0x80 /* function call */ #define BPF_EXIT 0x90 /* function return */
+/* atomic op type fields (stored in immediate) */ +#define BPF_FETCH 0x01 /* fetch previous value into src reg */ + /* Register numbers */ enum { BPF_REG_0 = 0,
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 5ffa25502b5ab3d639829a2d1e316cff7f59a41e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This adds two atomic opcodes, both of which include the BPF_FETCH flag. XCHG without the BPF_FETCH flag would naturally encode atomic_set. This is not supported because it would be of limited value to userspace (it doesn't imply any barriers). CMPXCHG without BPF_FETCH woulud be an atomic compare-and-write. We don't have such an operation in the kernel so it isn't provided to BPF either.
There are two significant design decisions made for the CMPXCHG instruction:
- To solve the issue that this operation fundamentally has 3 operands, but we only have two register fields. Therefore the operand we compare against (the kernel's API calls it 'old') is hard-coded to be R0. x86 has similar design (and A64 doesn't have this problem).
A potential alternative might be to encode the other operand's register number in the immediate field.
- The kernel's atomic_cmpxchg returns the old value, while the C11 userspace APIs return a boolean indicating the comparison result. Which should BPF do? A64 returns the old value. x86 returns the old value in the hard-coded register (and also sets a flag). That means return-old-value is easier to JIT, so that's what we use.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com (cherry picked from commit 5ffa25502b5ab3d639829a2d1e316cff7f59a41e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 8 ++++++++ include/linux/filter.h | 2 ++ include/uapi/linux/bpf.h | 4 +++- kernel/bpf/core.c | 20 ++++++++++++++++++++ kernel/bpf/disasm.c | 15 +++++++++++++++ kernel/bpf/verifier.c | 19 +++++++++++++++++-- tools/include/linux/filter.h | 2 ++ tools/include/uapi/linux/bpf.h | 4 +++- 8 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index d79585615580..5b9a7db44229 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -818,6 +818,14 @@ static int emit_atomic(u8 **pprog, u8 atomic_op, /* src_reg = atomic_fetch_add(dst_reg + off, src_reg); */ EMIT2(0x0F, 0xC1); break; + case BPF_XCHG: + /* src_reg = atomic_xchg(dst_reg + off, src_reg); */ + EMIT1(0x87); + break; + case BPF_CMPXCHG: + /* r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg); */ + EMIT2(0x0F, 0xB1); + break; default: pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op); return -EFAULT; diff --git a/include/linux/filter.h b/include/linux/filter.h index 9d4563850f54..0e2f7eaf1854 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -270,6 +270,8 @@ static inline bool insn_is_zext(const struct bpf_insn *insn) * * BPF_ADD *(uint *) (dst_reg + off16) += src_reg * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg); + * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg) + * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg) */
#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b2c03f930238..3eddc8add9b8 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -45,7 +45,9 @@ #define BPF_EXIT 0x90 /* function return */
/* atomic op type fields (stored in immediate) */ -#define BPF_FETCH 0x01 /* fetch previous value into src reg */ +#define BPF_FETCH 0x01 /* not an opcode on its own, used to build others */ +#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */ +#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
/* Register numbers */ enum { diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index db328aa9908e..74abac5bcc3d 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1682,6 +1682,16 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) (u32) SRC, (atomic_t *)(unsigned long) (DST + insn->off)); break; + case BPF_XCHG: + SRC = (u32) atomic_xchg( + (atomic_t *)(unsigned long) (DST + insn->off), + (u32) SRC); + break; + case BPF_CMPXCHG: + BPF_R0 = (u32) atomic_cmpxchg( + (atomic_t *)(unsigned long) (DST + insn->off), + (u32) BPF_R0, (u32) SRC); + break; default: goto default_label; } @@ -1699,6 +1709,16 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) (u64) SRC, (atomic64_t *)(unsigned long) (DST + insn->off)); break; + case BPF_XCHG: + SRC = (u64) atomic64_xchg( + (atomic64_t *)(unsigned long) (DST + insn->off), + (u64) SRC); + break; + case BPF_CMPXCHG: + BPF_R0 = (u64) atomic64_cmpxchg( + (atomic64_t *)(unsigned long) (DST + insn->off), + (u64) BPF_R0, (u64) SRC); + break; default: goto default_label; } diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index 0f64a9cf20b5..42635b7bfeef 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -167,6 +167,21 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs, BPF_SIZE(insn->code) == BPF_DW ? "64" : "", bpf_ldst_string[BPF_SIZE(insn->code) >> 3], insn->dst_reg, insn->off, insn->src_reg); + } else if (BPF_MODE(insn->code) == BPF_ATOMIC && + insn->imm == BPF_CMPXCHG) { + verbose(cbs->private_data, "(%02x) r0 = atomic%s_cmpxchg((%s *)(r%d %+d), r0, r%d)\n", + insn->code, + BPF_SIZE(insn->code) == BPF_DW ? "64" : "", + bpf_ldst_string[BPF_SIZE(insn->code) >> 3], + insn->dst_reg, insn->off, + insn->src_reg); + } else if (BPF_MODE(insn->code) == BPF_ATOMIC && + insn->imm == BPF_XCHG) { + verbose(cbs->private_data, "(%02x) r%d = atomic%s_xchg((%s *)(r%d %+d), r%d)\n", + insn->code, insn->src_reg, + BPF_SIZE(insn->code) == BPF_DW ? "64" : "", + bpf_ldst_string[BPF_SIZE(insn->code) >> 3], + insn->dst_reg, insn->off, insn->src_reg); } else { verbose(cbs->private_data, "BUG_%02x\n", insn->code); } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e9ce54a0a84b..f03a95cebdd3 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3963,11 +3963,14 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) { + int load_reg; int err;
switch (insn->imm) { case BPF_ADD: case BPF_ADD | BPF_FETCH: + case BPF_XCHG: + case BPF_CMPXCHG: break; default: verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm); @@ -3989,6 +3992,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i if (err) return err;
+ if (insn->imm == BPF_CMPXCHG) { + /* Check comparison of R0 with memory location */ + err = check_reg_arg(env, BPF_REG_0, SRC_OP); + if (err) + return err; + } + if (is_pointer_value(env, insn->src_reg)) { verbose(env, "R%d leaks addr into mem\n", insn->src_reg); return -EACCES; @@ -4019,8 +4029,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i if (!(insn->imm & BPF_FETCH)) return 0;
- /* check and record load of old value into src reg */ - err = check_reg_arg(env, insn->src_reg, DST_OP); + if (insn->imm == BPF_CMPXCHG) + load_reg = BPF_REG_0; + else + load_reg = insn->src_reg; + + /* check and record load of old value */ + err = check_reg_arg(env, load_reg, DST_OP); if (err) return err;
diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h index 7211ce9fba53..d75998b0d5ac 100644 --- a/tools/include/linux/filter.h +++ b/tools/include/linux/filter.h @@ -174,6 +174,8 @@ * * BPF_ADD *(uint *) (dst_reg + off16) += src_reg * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg); + * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg) + * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg) */
#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \ diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index ab3a7b8ecdf0..c9ab32ae5d7e 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -45,7 +45,9 @@ #define BPF_EXIT 0x90 /* function return */
/* atomic op type fields (stored in immediate) */ -#define BPF_FETCH 0x01 /* fetch previous value into src reg */ +#define BPF_FETCH 0x01 /* not an opcode on its own, used to build others */ +#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */ +#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
/* Register numbers */ enum {
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 462910670e4ac91509829c5549bd0227668176fb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since the atomic operations that are added in subsequent commits are all isomorphic with BPF_ADD, pull out a macro to avoid the interpreter becoming dominated by lines of atomic-related code.
Note that this sacrificies interpreter performance (combining STX_ATOMIC_W and STX_ATOMIC_DW into single switch case means that we need an extra conditional branch to differentiate them) in favour of compact and (relatively!) simple C code.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210114181751.768687-9-jackmanb@google.com (cherry picked from commit 462910670e4ac91509829c5549bd0227668176fb) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/core.c | 80 +++++++++++++++++++++++------------------------ 1 file changed, 39 insertions(+), 41 deletions(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 74abac5bcc3d..30c7b63d4e3e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1670,55 +1670,53 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) LDX_PROBE(DW, 8) #undef LDX_PROBE
- STX_ATOMIC_W: - switch (IMM) { - case BPF_ADD: - /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */ - atomic_add((u32) SRC, (atomic_t *)(unsigned long) - (DST + insn->off)); - break; - case BPF_ADD | BPF_FETCH: - SRC = (u32) atomic_fetch_add( - (u32) SRC, - (atomic_t *)(unsigned long) (DST + insn->off)); - break; - case BPF_XCHG: - SRC = (u32) atomic_xchg( - (atomic_t *)(unsigned long) (DST + insn->off), - (u32) SRC); - break; - case BPF_CMPXCHG: - BPF_R0 = (u32) atomic_cmpxchg( - (atomic_t *)(unsigned long) (DST + insn->off), - (u32) BPF_R0, (u32) SRC); +#define ATOMIC_ALU_OP(BOP, KOP) \ + case BOP: \ + if (BPF_SIZE(insn->code) == BPF_W) \ + atomic_##KOP((u32) SRC, (atomic_t *)(unsigned long) \ + (DST + insn->off)); \ + else \ + atomic64_##KOP((u64) SRC, (atomic64_t *)(unsigned long) \ + (DST + insn->off)); \ + break; \ + case BOP | BPF_FETCH: \ + if (BPF_SIZE(insn->code) == BPF_W) \ + SRC = (u32) atomic_fetch_##KOP( \ + (u32) SRC, \ + (atomic_t *)(unsigned long) (DST + insn->off)); \ + else \ + SRC = (u64) atomic64_fetch_##KOP( \ + (u64) SRC, \ + (atomic64_t *)(unsigned long) (DST + insn->off)); \ break; - default: - goto default_label; - } - CONT;
STX_ATOMIC_DW: + STX_ATOMIC_W: switch (IMM) { - case BPF_ADD: - /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */ - atomic64_add((u64) SRC, (atomic64_t *)(unsigned long) - (DST + insn->off)); - break; - case BPF_ADD | BPF_FETCH: - SRC = (u64) atomic64_fetch_add( - (u64) SRC, - (atomic64_t *)(unsigned long) (DST + insn->off)); - break; + ATOMIC_ALU_OP(BPF_ADD, add) +#undef ATOMIC_ALU_OP + case BPF_XCHG: - SRC = (u64) atomic64_xchg( - (atomic64_t *)(unsigned long) (DST + insn->off), - (u64) SRC); + if (BPF_SIZE(insn->code) == BPF_W) + SRC = (u32) atomic_xchg( + (atomic_t *)(unsigned long) (DST + insn->off), + (u32) SRC); + else + SRC = (u64) atomic64_xchg( + (atomic64_t *)(unsigned long) (DST + insn->off), + (u64) SRC); break; case BPF_CMPXCHG: - BPF_R0 = (u64) atomic64_cmpxchg( - (atomic64_t *)(unsigned long) (DST + insn->off), - (u64) BPF_R0, (u64) SRC); + if (BPF_SIZE(insn->code) == BPF_W) + BPF_R0 = (u32) atomic_cmpxchg( + (atomic_t *)(unsigned long) (DST + insn->off), + (u32) BPF_R0, (u32) SRC); + else + BPF_R0 = (u64) atomic64_cmpxchg( + (atomic64_t *)(unsigned long) (DST + insn->off), + (u64) BPF_R0, (u64) SRC); break; + default: goto default_label; }
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 981f94c3e92146705baf97fb417a5ed1ab1a79a5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This adds instructions for
atomic[64]_[fetch_]and atomic[64]_[fetch_]or atomic[64]_[fetch_]xor
All these operations are isomorphic enough to implement with the same verifier, interpreter, and x86 JIT code, hence being a single commit.
The main interesting thing here is that x86 doesn't directly support the fetch_ version these operations, so we need to generate a CMPXCHG loop in the JIT. This requires the use of two temporary registers, IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for this purpose.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210114181751.768687-10-jackmanb@google.com (cherry picked from commit 981f94c3e92146705baf97fb417a5ed1ab1a79a5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 50 +++++++++++++++++++++++++++++++++++- include/linux/filter.h | 6 +++++ kernel/bpf/core.c | 3 +++ kernel/bpf/disasm.c | 21 ++++++++++++--- kernel/bpf/verifier.c | 6 +++++ tools/include/linux/filter.h | 6 +++++ 6 files changed, 87 insertions(+), 5 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 5b9a7db44229..5735184f5dbd 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -811,6 +811,10 @@ static int emit_atomic(u8 **pprog, u8 atomic_op, /* emit opcode */ switch (atomic_op) { case BPF_ADD: + case BPF_SUB: + case BPF_AND: + case BPF_OR: + case BPF_XOR: /* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */ EMIT1(simple_alu_opcodes[atomic_op]); break; @@ -1295,8 +1299,52 @@ st: if (is_imm8(insn->off))
case BPF_STX | BPF_ATOMIC | BPF_W: case BPF_STX | BPF_ATOMIC | BPF_DW: + if (insn->imm == (BPF_AND | BPF_FETCH) || + insn->imm == (BPF_OR | BPF_FETCH) || + insn->imm == (BPF_XOR | BPF_FETCH)) { + u8 *branch_target; + bool is64 = BPF_SIZE(insn->code) == BPF_DW; + + /* + * Can't be implemented with a single x86 insn. + * Need to do a CMPXCHG loop. + */ + + /* Will need RAX as a CMPXCHG operand so save R0 */ + emit_mov_reg(&prog, true, BPF_REG_AX, BPF_REG_0); + branch_target = prog; + /* Load old value */ + emit_ldx(&prog, BPF_SIZE(insn->code), + BPF_REG_0, dst_reg, insn->off); + /* + * Perform the (commutative) operation locally, + * put the result in the AUX_REG. + */ + emit_mov_reg(&prog, is64, AUX_REG, BPF_REG_0); + maybe_emit_mod(&prog, AUX_REG, src_reg, is64); + EMIT2(simple_alu_opcodes[BPF_OP(insn->imm)], + add_2reg(0xC0, AUX_REG, src_reg)); + /* Attempt to swap in new value */ + err = emit_atomic(&prog, BPF_CMPXCHG, + dst_reg, AUX_REG, insn->off, + BPF_SIZE(insn->code)); + if (WARN_ON(err)) + return err; + /* + * ZF tells us whether we won the race. If it's + * cleared we need to try again. + */ + EMIT2(X86_JNE, -(prog - branch_target) - 2); + /* Return the pre-modification value */ + emit_mov_reg(&prog, is64, src_reg, BPF_REG_0); + /* Restore R0 after clobbering RAX */ + emit_mov_reg(&prog, true, BPF_REG_0, BPF_REG_AX); + break; + + } + err = emit_atomic(&prog, insn->imm, dst_reg, src_reg, - insn->off, BPF_SIZE(insn->code)); + insn->off, BPF_SIZE(insn->code)); if (err) return err; break; diff --git a/include/linux/filter.h b/include/linux/filter.h index 0e2f7eaf1854..a852b2af281c 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -269,7 +269,13 @@ static inline bool insn_is_zext(const struct bpf_insn *insn) * Atomic operations: * * BPF_ADD *(uint *) (dst_reg + off16) += src_reg + * BPF_AND *(uint *) (dst_reg + off16) &= src_reg + * BPF_OR *(uint *) (dst_reg + off16) |= src_reg + * BPF_XOR *(uint *) (dst_reg + off16) ^= src_reg * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg); + * BPF_AND | BPF_FETCH src_reg = atomic_fetch_and(dst_reg + off16, src_reg); + * BPF_OR | BPF_FETCH src_reg = atomic_fetch_or(dst_reg + off16, src_reg); + * BPF_XOR | BPF_FETCH src_reg = atomic_fetch_xor(dst_reg + off16, src_reg); * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg) * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg) */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 30c7b63d4e3e..dff79f5c0e26 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1694,6 +1694,9 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) STX_ATOMIC_W: switch (IMM) { ATOMIC_ALU_OP(BPF_ADD, add) + ATOMIC_ALU_OP(BPF_AND, and) + ATOMIC_ALU_OP(BPF_OR, or) + ATOMIC_ALU_OP(BPF_XOR, xor) #undef ATOMIC_ALU_OP
case BPF_XCHG: diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index 42635b7bfeef..9d4ebd77705e 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -80,6 +80,13 @@ const char *const bpf_alu_string[16] = { [BPF_END >> 4] = "endian", };
+static const char *const bpf_atomic_alu_string[16] = { + [BPF_ADD >> 4] = "add", + [BPF_AND >> 4] = "and", + [BPF_OR >> 4] = "or", + [BPF_XOR >> 4] = "or", +}; + static const char *const bpf_ldst_string[] = { [BPF_W >> 3] = "u32", [BPF_H >> 3] = "u16", @@ -154,17 +161,23 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs, insn->dst_reg, insn->off, insn->src_reg); else if (BPF_MODE(insn->code) == BPF_ATOMIC && - insn->imm == BPF_ADD) { - verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) += r%d\n", + (insn->imm == BPF_ADD || insn->imm == BPF_ADD || + insn->imm == BPF_OR || insn->imm == BPF_XOR)) { + verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) %s r%d\n", insn->code, bpf_ldst_string[BPF_SIZE(insn->code) >> 3], insn->dst_reg, insn->off, + bpf_alu_string[BPF_OP(insn->imm) >> 4], insn->src_reg); } else if (BPF_MODE(insn->code) == BPF_ATOMIC && - insn->imm == (BPF_ADD | BPF_FETCH)) { - verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_add((%s *)(r%d %+d), r%d)\n", + (insn->imm == (BPF_ADD | BPF_FETCH) || + insn->imm == (BPF_AND | BPF_FETCH) || + insn->imm == (BPF_OR | BPF_FETCH) || + insn->imm == (BPF_XOR | BPF_FETCH))) { + verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_%s((%s *)(r%d %+d), r%d)\n", insn->code, insn->src_reg, BPF_SIZE(insn->code) == BPF_DW ? "64" : "", + bpf_atomic_alu_string[BPF_OP(insn->imm) >> 4], bpf_ldst_string[BPF_SIZE(insn->code) >> 3], insn->dst_reg, insn->off, insn->src_reg); } else if (BPF_MODE(insn->code) == BPF_ATOMIC && diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f03a95cebdd3..8f71cd346361 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3969,6 +3969,12 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i switch (insn->imm) { case BPF_ADD: case BPF_ADD | BPF_FETCH: + case BPF_AND: + case BPF_AND | BPF_FETCH: + case BPF_OR: + case BPF_OR | BPF_FETCH: + case BPF_XOR: + case BPF_XOR | BPF_FETCH: case BPF_XCHG: case BPF_CMPXCHG: break; diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h index d75998b0d5ac..736bdeccdfe4 100644 --- a/tools/include/linux/filter.h +++ b/tools/include/linux/filter.h @@ -173,7 +173,13 @@ * Atomic operations: * * BPF_ADD *(uint *) (dst_reg + off16) += src_reg + * BPF_AND *(uint *) (dst_reg + off16) &= src_reg + * BPF_OR *(uint *) (dst_reg + off16) |= src_reg + * BPF_XOR *(uint *) (dst_reg + off16) ^= src_reg * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg); + * BPF_AND | BPF_FETCH src_reg = atomic_fetch_and(dst_reg + off16, src_reg); + * BPF_OR | BPF_FETCH src_reg = atomic_fetch_or(dst_reg + off16, src_reg); + * BPF_XOR | BPF_FETCH src_reg = atomic_fetch_xor(dst_reg + off16, src_reg); * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg) * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg) */
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit 98d666d05a1d9706bb3fe972157fa6155dbb180f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The prog_test that's added depends on Clang/LLVM features added by Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184).
Note the use of a define called ENABLE_ATOMICS_TESTS: this is used to:
- Avoid breaking the build for people on old versions of Clang - Avoid needing separate lists of test objects for no_alu32, where atomics are not supported even if Clang has the feature.
The atomics_test.o BPF object is built unconditionally both for test_progs and test_progs-no_alu32. For test_progs, if Clang supports atomics, ENABLE_ATOMICS_TESTS is defined, so it includes the proper test code. Otherwise, progs and global vars are defined anyway, as stubs; this means that the skeleton user code still builds.
The atomics_test.o userspace object is built once and used for both test_progs and test_progs-no_alu32. A variable called skip_tests is defined in the BPF object's data section, which tells the userspace object whether to skip the atomics test.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210114181751.768687-11-jackmanb@google.com (cherry picked from commit 98d666d05a1d9706bb3fe972157fa6155dbb180f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 2 + .../selftests/bpf/prog_tests/atomics.c | 246 ++++++++++++++++++ tools/testing/selftests/bpf/progs/atomics.c | 154 +++++++++++ .../selftests/bpf/verifier/atomic_and.c | 77 ++++++ .../selftests/bpf/verifier/atomic_cmpxchg.c | 96 +++++++ .../selftests/bpf/verifier/atomic_fetch_add.c | 106 ++++++++ .../selftests/bpf/verifier/atomic_or.c | 77 ++++++ .../selftests/bpf/verifier/atomic_xchg.c | 46 ++++ .../selftests/bpf/verifier/atomic_xor.c | 77 ++++++ 9 files changed, 881 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c create mode 100644 tools/testing/selftests/bpf/progs/atomics.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 60a301a93491..2ec1fb02b21d 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -416,10 +416,12 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(wildcard progs/btf_dump_test_case_*.c) TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS) +TRUNNER_BPF_CFLAGS += -DENABLE_ATOMICS_TESTS $(eval $(call DEFINE_TEST_RUNNER,test_progs))
# Define test_progs-no_alu32 test runner. TRUNNER_BPF_BUILD_RULE := CLANG_NOALU32_BPF_BUILD_RULE +TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS) $(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32))
# Define test_progs BPF-GCC-flavored test runner. diff --git a/tools/testing/selftests/bpf/prog_tests/atomics.c b/tools/testing/selftests/bpf/prog_tests/atomics.c new file mode 100644 index 000000000000..21efe7bbf10d --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/atomics.c @@ -0,0 +1,246 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <test_progs.h> + +#include "atomics.skel.h" + +static void test_add(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.add); + if (CHECK(IS_ERR(link), "attach(add)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.add); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run add", + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->add64_value, 3, "add64_value"); + ASSERT_EQ(skel->bss->add64_result, 1, "add64_result"); + + ASSERT_EQ(skel->data->add32_value, 3, "add32_value"); + ASSERT_EQ(skel->bss->add32_result, 1, "add32_result"); + + ASSERT_EQ(skel->bss->add_stack_value_copy, 3, "add_stack_value"); + ASSERT_EQ(skel->bss->add_stack_result, 1, "add_stack_result"); + + ASSERT_EQ(skel->data->add_noreturn_value, 3, "add_noreturn_value"); + +cleanup: + bpf_link__destroy(link); +} + +static void test_sub(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.sub); + if (CHECK(IS_ERR(link), "attach(sub)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.sub); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run sub", + "err %d errno %d retval %d duration %d\n", + err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->sub64_value, -1, "sub64_value"); + ASSERT_EQ(skel->bss->sub64_result, 1, "sub64_result"); + + ASSERT_EQ(skel->data->sub32_value, -1, "sub32_value"); + ASSERT_EQ(skel->bss->sub32_result, 1, "sub32_result"); + + ASSERT_EQ(skel->bss->sub_stack_value_copy, -1, "sub_stack_value"); + ASSERT_EQ(skel->bss->sub_stack_result, 1, "sub_stack_result"); + + ASSERT_EQ(skel->data->sub_noreturn_value, -1, "sub_noreturn_value"); + +cleanup: + bpf_link__destroy(link); +} + +static void test_and(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.and); + if (CHECK(IS_ERR(link), "attach(and)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.and); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run and", + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->and64_value, 0x010ull << 32, "and64_value"); + ASSERT_EQ(skel->bss->and64_result, 0x110ull << 32, "and64_result"); + + ASSERT_EQ(skel->data->and32_value, 0x010, "and32_value"); + ASSERT_EQ(skel->bss->and32_result, 0x110, "and32_result"); + + ASSERT_EQ(skel->data->and_noreturn_value, 0x010ull << 32, "and_noreturn_value"); +cleanup: + bpf_link__destroy(link); +} + +static void test_or(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.or); + if (CHECK(IS_ERR(link), "attach(or)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.or); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run or", + "err %d errno %d retval %d duration %d\n", + err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->or64_value, 0x111ull << 32, "or64_value"); + ASSERT_EQ(skel->bss->or64_result, 0x110ull << 32, "or64_result"); + + ASSERT_EQ(skel->data->or32_value, 0x111, "or32_value"); + ASSERT_EQ(skel->bss->or32_result, 0x110, "or32_result"); + + ASSERT_EQ(skel->data->or_noreturn_value, 0x111ull << 32, "or_noreturn_value"); +cleanup: + bpf_link__destroy(link); +} + +static void test_xor(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.xor); + if (CHECK(IS_ERR(link), "attach(xor)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.xor); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run xor", + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->xor64_value, 0x101ull << 32, "xor64_value"); + ASSERT_EQ(skel->bss->xor64_result, 0x110ull << 32, "xor64_result"); + + ASSERT_EQ(skel->data->xor32_value, 0x101, "xor32_value"); + ASSERT_EQ(skel->bss->xor32_result, 0x110, "xor32_result"); + + ASSERT_EQ(skel->data->xor_noreturn_value, 0x101ull << 32, "xor_nxoreturn_value"); +cleanup: + bpf_link__destroy(link); +} + +static void test_cmpxchg(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.cmpxchg); + if (CHECK(IS_ERR(link), "attach(cmpxchg)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.cmpxchg); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run add", + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->cmpxchg64_value, 2, "cmpxchg64_value"); + ASSERT_EQ(skel->bss->cmpxchg64_result_fail, 1, "cmpxchg_result_fail"); + ASSERT_EQ(skel->bss->cmpxchg64_result_succeed, 1, "cmpxchg_result_succeed"); + + ASSERT_EQ(skel->data->cmpxchg32_value, 2, "lcmpxchg32_value"); + ASSERT_EQ(skel->bss->cmpxchg32_result_fail, 1, "cmpxchg_result_fail"); + ASSERT_EQ(skel->bss->cmpxchg32_result_succeed, 1, "cmpxchg_result_succeed"); + +cleanup: + bpf_link__destroy(link); +} + +static void test_xchg(struct atomics *skel) +{ + int err, prog_fd; + __u32 duration = 0, retval; + struct bpf_link *link; + + link = bpf_program__attach(skel->progs.xchg); + if (CHECK(IS_ERR(link), "attach(xchg)", "err: %ld\n", PTR_ERR(link))) + return; + + prog_fd = bpf_program__fd(skel->progs.xchg); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + if (CHECK(err || retval, "test_run add", + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration)) + goto cleanup; + + ASSERT_EQ(skel->data->xchg64_value, 2, "xchg64_value"); + ASSERT_EQ(skel->bss->xchg64_result, 1, "xchg64_result"); + + ASSERT_EQ(skel->data->xchg32_value, 2, "xchg32_value"); + ASSERT_EQ(skel->bss->xchg32_result, 1, "xchg32_result"); + +cleanup: + bpf_link__destroy(link); +} + +void test_atomics(void) +{ + struct atomics *skel; + __u32 duration = 0; + + skel = atomics__open_and_load(); + if (CHECK(!skel, "skel_load", "atomics skeleton failed\n")) + return; + + if (skel->data->skip_tests) { + printf("%s:SKIP:no ENABLE_ATOMICS_TESTS (missing Clang BPF atomics support)", + __func__); + test__skip(); + goto cleanup; + } + + if (test__start_subtest("add")) + test_add(skel); + if (test__start_subtest("sub")) + test_sub(skel); + if (test__start_subtest("and")) + test_and(skel); + if (test__start_subtest("or")) + test_or(skel); + if (test__start_subtest("xor")) + test_xor(skel); + if (test__start_subtest("cmpxchg")) + test_cmpxchg(skel); + if (test__start_subtest("xchg")) + test_xchg(skel); + +cleanup: + atomics__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/atomics.c b/tools/testing/selftests/bpf/progs/atomics.c new file mode 100644 index 000000000000..c245345e41ca --- /dev/null +++ b/tools/testing/selftests/bpf/progs/atomics.c @@ -0,0 +1,154 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <stdbool.h> + +#ifdef ENABLE_ATOMICS_TESTS +bool skip_tests __attribute((__section__(".data"))) = false; +#else +bool skip_tests = true; +#endif + +__u64 add64_value = 1; +__u64 add64_result = 0; +__u32 add32_value = 1; +__u32 add32_result = 0; +__u64 add_stack_value_copy = 0; +__u64 add_stack_result = 0; +__u64 add_noreturn_value = 1; + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(add, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + __u64 add_stack_value = 1; + + add64_result = __sync_fetch_and_add(&add64_value, 2); + add32_result = __sync_fetch_and_add(&add32_value, 2); + add_stack_result = __sync_fetch_and_add(&add_stack_value, 2); + add_stack_value_copy = add_stack_value; + __sync_fetch_and_add(&add_noreturn_value, 2); +#endif + + return 0; +} + +__s64 sub64_value = 1; +__s64 sub64_result = 0; +__s32 sub32_value = 1; +__s32 sub32_result = 0; +__s64 sub_stack_value_copy = 0; +__s64 sub_stack_result = 0; +__s64 sub_noreturn_value = 1; + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(sub, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + __u64 sub_stack_value = 1; + + sub64_result = __sync_fetch_and_sub(&sub64_value, 2); + sub32_result = __sync_fetch_and_sub(&sub32_value, 2); + sub_stack_result = __sync_fetch_and_sub(&sub_stack_value, 2); + sub_stack_value_copy = sub_stack_value; + __sync_fetch_and_sub(&sub_noreturn_value, 2); +#endif + + return 0; +} + +__u64 and64_value = (0x110ull << 32); +__u64 and64_result = 0; +__u32 and32_value = 0x110; +__u32 and32_result = 0; +__u64 and_noreturn_value = (0x110ull << 32); + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(and, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + + and64_result = __sync_fetch_and_and(&and64_value, 0x011ull << 32); + and32_result = __sync_fetch_and_and(&and32_value, 0x011); + __sync_fetch_and_and(&and_noreturn_value, 0x011ull << 32); +#endif + + return 0; +} + +__u64 or64_value = (0x110ull << 32); +__u64 or64_result = 0; +__u32 or32_value = 0x110; +__u32 or32_result = 0; +__u64 or_noreturn_value = (0x110ull << 32); + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(or, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + or64_result = __sync_fetch_and_or(&or64_value, 0x011ull << 32); + or32_result = __sync_fetch_and_or(&or32_value, 0x011); + __sync_fetch_and_or(&or_noreturn_value, 0x011ull << 32); +#endif + + return 0; +} + +__u64 xor64_value = (0x110ull << 32); +__u64 xor64_result = 0; +__u32 xor32_value = 0x110; +__u32 xor32_result = 0; +__u64 xor_noreturn_value = (0x110ull << 32); + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(xor, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + xor64_result = __sync_fetch_and_xor(&xor64_value, 0x011ull << 32); + xor32_result = __sync_fetch_and_xor(&xor32_value, 0x011); + __sync_fetch_and_xor(&xor_noreturn_value, 0x011ull << 32); +#endif + + return 0; +} + +__u64 cmpxchg64_value = 1; +__u64 cmpxchg64_result_fail = 0; +__u64 cmpxchg64_result_succeed = 0; +__u32 cmpxchg32_value = 1; +__u32 cmpxchg32_result_fail = 0; +__u32 cmpxchg32_result_succeed = 0; + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(cmpxchg, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + cmpxchg64_result_fail = __sync_val_compare_and_swap(&cmpxchg64_value, 0, 3); + cmpxchg64_result_succeed = __sync_val_compare_and_swap(&cmpxchg64_value, 1, 2); + + cmpxchg32_result_fail = __sync_val_compare_and_swap(&cmpxchg32_value, 0, 3); + cmpxchg32_result_succeed = __sync_val_compare_and_swap(&cmpxchg32_value, 1, 2); +#endif + + return 0; +} + +__u64 xchg64_value = 1; +__u64 xchg64_result = 0; +__u32 xchg32_value = 1; +__u32 xchg32_result = 0; + +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(xchg, int a) +{ +#ifdef ENABLE_ATOMICS_TESTS + __u64 val64 = 2; + __u32 val32 = 2; + + xchg64_result = __sync_lock_test_and_set(&xchg64_value, val64); + xchg32_result = __sync_lock_test_and_set(&xchg32_value, val32); +#endif + + return 0; +} diff --git a/tools/testing/selftests/bpf/verifier/atomic_and.c b/tools/testing/selftests/bpf/verifier/atomic_and.c new file mode 100644 index 000000000000..600bc5e0f143 --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/atomic_and.c @@ -0,0 +1,77 @@ +{ + "BPF_ATOMIC_AND without fetch", + .insns = { + /* val = 0x110; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110), + /* atomic_and(&val, 0x011); */ + BPF_MOV64_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_DW, BPF_AND, BPF_REG_10, BPF_REG_1, -8), + /* if (val != 0x010) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0x010, 2), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* r1 should not be clobbered, no BPF_FETCH flag */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x011, 1), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC_AND with fetch", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 123), + /* val = 0x110; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110), + /* old = atomic_fetch_and(&val, 0x011); */ + BPF_MOV64_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_DW, BPF_AND | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8), + /* if (old != 0x110) exit(3); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2), + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* if (val != 0x010) exit(2); */ + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x010, 2), + BPF_MOV64_IMM(BPF_REG_1, 2), + BPF_EXIT_INSN(), + /* Check R0 wasn't clobbered (for fear of x86 JIT bug) */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 123, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC_AND with fetch 32bit", + .insns = { + /* r0 = (s64) -1 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1), + /* val = 0x110; */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 0x110), + /* old = atomic_fetch_and(&val, 0x011); */ + BPF_MOV32_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_W, BPF_AND | BPF_FETCH, BPF_REG_10, BPF_REG_1, -4), + /* if (old != 0x110) exit(3); */ + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2), + BPF_MOV32_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* if (val != 0x010) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4), + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x010, 2), + BPF_MOV32_IMM(BPF_REG_1, 2), + BPF_EXIT_INSN(), + /* Check R0 wasn't clobbered (for fear of x86 JIT bug) + * It should be -1 so add 1 to get exit code. + */ + BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c new file mode 100644 index 000000000000..2efd8bcf57a1 --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c @@ -0,0 +1,96 @@ +{ + "atomic compare-and-exchange smoketest - 64bit", + .insns = { + /* val = 3; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3), + /* old = atomic_cmpxchg(&val, 2, 4); */ + BPF_MOV64_IMM(BPF_REG_1, 4), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -8), + /* if (old != 3) exit(2); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* if (val != 3) exit(3); */ + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* old = atomic_cmpxchg(&val, 3, 4); */ + BPF_MOV64_IMM(BPF_REG_1, 4), + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -8), + /* if (old != 3) exit(4); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + BPF_MOV64_IMM(BPF_REG_0, 4), + BPF_EXIT_INSN(), + /* if (val != 4) exit(5); */ + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 4, 2), + BPF_MOV64_IMM(BPF_REG_0, 5), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "atomic compare-and-exchange smoketest - 32bit", + .insns = { + /* val = 3; */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 3), + /* old = atomic_cmpxchg(&val, 2, 4); */ + BPF_MOV32_IMM(BPF_REG_1, 4), + BPF_MOV32_IMM(BPF_REG_0, 2), + BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -4), + /* if (old != 3) exit(2); */ + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + BPF_MOV32_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* if (val != 3) exit(3); */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -4), + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + BPF_MOV32_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* old = atomic_cmpxchg(&val, 3, 4); */ + BPF_MOV32_IMM(BPF_REG_1, 4), + BPF_MOV32_IMM(BPF_REG_0, 3), + BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -4), + /* if (old != 3) exit(4); */ + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + BPF_MOV32_IMM(BPF_REG_0, 4), + BPF_EXIT_INSN(), + /* if (val != 4) exit(5); */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -4), + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 4, 2), + BPF_MOV32_IMM(BPF_REG_0, 5), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV32_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "Can't use cmpxchg on uninit src reg", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3), + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_10, BPF_REG_2, -8), + BPF_EXIT_INSN(), + }, + .result = REJECT, + .errstr = "!read_ok", +}, +{ + "Can't use cmpxchg on uninit memory", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_MOV64_IMM(BPF_REG_2, 4), + BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_10, BPF_REG_2, -8), + BPF_EXIT_INSN(), + }, + .result = REJECT, + .errstr = "invalid read from stack", +}, diff --git a/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c b/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c new file mode 100644 index 000000000000..a91de8cd9def --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c @@ -0,0 +1,106 @@ +{ + "BPF_ATOMIC_FETCH_ADD smoketest - 64bit", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + /* Write 3 to stack */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3), + /* Put a 1 in R1, add it to the 3 on the stack, and load the value back into R1 */ + BPF_MOV64_IMM(BPF_REG_1, 1), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8), + /* Check the value we loaded back was 3 */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 3, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* Load value from stack */ + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8), + /* Check value loaded from stack was 4 */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 4, 1), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC_FETCH_ADD smoketest - 32bit", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + /* Write 3 to stack */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 3), + /* Put a 1 in R1, add it to the 3 on the stack, and load the value back into R1 */ + BPF_MOV32_IMM(BPF_REG_1, 1), + BPF_ATOMIC_OP(BPF_W, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_1, -4), + /* Check the value we loaded back was 3 */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 3, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* Load value from stack */ + BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4), + /* Check value loaded from stack was 4 */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 4, 1), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "Can't use ATM_FETCH_ADD on frame pointer", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_10, -8), + BPF_EXIT_INSN(), + }, + .result = REJECT, + .errstr_unpriv = "R10 leaks addr into mem", + .errstr = "frame pointer is read only", +}, +{ + "Can't use ATM_FETCH_ADD on uninit src reg", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_2, -8), + BPF_EXIT_INSN(), + }, + .result = REJECT, + /* It happens that the address leak check is first, but it would also be + * complain about the fact that we're trying to modify R10. + */ + .errstr = "!read_ok", +}, +{ + "Can't use ATM_FETCH_ADD on uninit dst reg", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_2, BPF_REG_0, -8), + BPF_EXIT_INSN(), + }, + .result = REJECT, + /* It happens that the address leak check is first, but it would also be + * complain about the fact that we're trying to modify R10. + */ + .errstr = "!read_ok", +}, +{ + "Can't use ATM_FETCH_ADD on kernel memory", + .insns = { + /* This is an fentry prog, context is array of the args of the + * kernel function being called. Load first arg into R2. + */ + BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 0), + /* First arg of bpf_fentry_test7 is a pointer to a struct. + * Attempt to modify that struct. Verifier shouldn't let us + * because it's kernel memory. + */ + BPF_MOV64_IMM(BPF_REG_3, 1), + BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_2, BPF_REG_3, 0), + /* Done */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_TRACE_FENTRY, + .kfunc = "bpf_fentry_test7", + .result = REJECT, + .errstr = "only read is supported", +}, diff --git a/tools/testing/selftests/bpf/verifier/atomic_or.c b/tools/testing/selftests/bpf/verifier/atomic_or.c new file mode 100644 index 000000000000..ebe6e51455ba --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/atomic_or.c @@ -0,0 +1,77 @@ +{ + "BPF_ATOMIC OR without fetch", + .insns = { + /* val = 0x110; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110), + /* atomic_or(&val, 0x011); */ + BPF_MOV64_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_DW, BPF_OR, BPF_REG_10, BPF_REG_1, -8), + /* if (val != 0x111) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0x111, 2), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* r1 should not be clobbered, no BPF_FETCH flag */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x011, 1), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC OR with fetch", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 123), + /* val = 0x110; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110), + /* old = atomic_fetch_or(&val, 0x011); */ + BPF_MOV64_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_DW, BPF_OR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8), + /* if (old != 0x110) exit(3); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2), + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* if (val != 0x111) exit(2); */ + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x111, 2), + BPF_MOV64_IMM(BPF_REG_1, 2), + BPF_EXIT_INSN(), + /* Check R0 wasn't clobbered (for fear of x86 JIT bug) */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 123, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC OR with fetch 32bit", + .insns = { + /* r0 = (s64) -1 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1), + /* val = 0x110; */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 0x110), + /* old = atomic_fetch_or(&val, 0x011); */ + BPF_MOV32_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_W, BPF_OR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -4), + /* if (old != 0x110) exit(3); */ + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2), + BPF_MOV32_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* if (val != 0x111) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4), + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x111, 2), + BPF_MOV32_IMM(BPF_REG_1, 2), + BPF_EXIT_INSN(), + /* Check R0 wasn't clobbered (for fear of x86 JIT bug) + * It should be -1 so add 1 to get exit code. + */ + BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, diff --git a/tools/testing/selftests/bpf/verifier/atomic_xchg.c b/tools/testing/selftests/bpf/verifier/atomic_xchg.c new file mode 100644 index 000000000000..33e2d6c973ee --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/atomic_xchg.c @@ -0,0 +1,46 @@ +{ + "atomic exchange smoketest - 64bit", + .insns = { + /* val = 3; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3), + /* old = atomic_xchg(&val, 4); */ + BPF_MOV64_IMM(BPF_REG_1, 4), + BPF_ATOMIC_OP(BPF_DW, BPF_XCHG, BPF_REG_10, BPF_REG_1, -8), + /* if (old != 3) exit(1); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 3, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* if (val != 4) exit(2); */ + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 4, 2), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "atomic exchange smoketest - 32bit", + .insns = { + /* val = 3; */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 3), + /* old = atomic_xchg(&val, 4); */ + BPF_MOV32_IMM(BPF_REG_1, 4), + BPF_ATOMIC_OP(BPF_W, BPF_XCHG, BPF_REG_10, BPF_REG_1, -4), + /* if (old != 3) exit(1); */ + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 3, 2), + BPF_MOV32_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* if (val != 4) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -4), + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 4, 2), + BPF_MOV32_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV32_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, diff --git a/tools/testing/selftests/bpf/verifier/atomic_xor.c b/tools/testing/selftests/bpf/verifier/atomic_xor.c new file mode 100644 index 000000000000..eb791e547b47 --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/atomic_xor.c @@ -0,0 +1,77 @@ +{ + "BPF_ATOMIC XOR without fetch", + .insns = { + /* val = 0x110; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110), + /* atomic_xor(&val, 0x011); */ + BPF_MOV64_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_DW, BPF_XOR, BPF_REG_10, BPF_REG_1, -8), + /* if (val != 0x101) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0x101, 2), + BPF_MOV64_IMM(BPF_REG_0, 2), + BPF_EXIT_INSN(), + /* r1 should not be clobbered, no BPF_FETCH flag */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x011, 1), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC XOR with fetch", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 123), + /* val = 0x110; */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110), + /* old = atomic_fetch_xor(&val, 0x011); */ + BPF_MOV64_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_DW, BPF_XOR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8), + /* if (old != 0x110) exit(3); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2), + BPF_MOV64_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* if (val != 0x101) exit(2); */ + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x101, 2), + BPF_MOV64_IMM(BPF_REG_1, 2), + BPF_EXIT_INSN(), + /* Check R0 wasn't clobbered (fxor fear of x86 JIT bug) */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 123, 2), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + /* exit(0); */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +}, +{ + "BPF_ATOMIC XOR with fetch 32bit", + .insns = { + /* r0 = (s64) -1 */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1), + /* val = 0x110; */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 0x110), + /* old = atomic_fetch_xor(&val, 0x011); */ + BPF_MOV32_IMM(BPF_REG_1, 0x011), + BPF_ATOMIC_OP(BPF_W, BPF_XOR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -4), + /* if (old != 0x110) exit(3); */ + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2), + BPF_MOV32_IMM(BPF_REG_0, 3), + BPF_EXIT_INSN(), + /* if (val != 0x101) exit(2); */ + BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4), + BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x101, 2), + BPF_MOV32_IMM(BPF_REG_1, 2), + BPF_EXIT_INSN(), + /* Check R0 wasn't clobbered (fxor fear of x86 JIT bug) + * It should be -1 so add 1 to get exit code. + */ + BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, +},
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.12-rc1 commit de948576f8e7d7fa1b5db04f56184ffe176177c5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Document new atomic instructions.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210114181751.768687-12-jackmanb@google.com (cherry picked from commit de948576f8e7d7fa1b5db04f56184ffe176177c5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- Documentation/networking/filter.rst | 31 +++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+)
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst index 1583d59d806d..f6d8f90e9a56 100644 --- a/Documentation/networking/filter.rst +++ b/Documentation/networking/filter.rst @@ -1053,8 +1053,39 @@ encoding. .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+The basic atomic operations supported are: + + BPF_ADD + BPF_AND + BPF_OR + BPF_XOR + +Each having equivalent semantics with the ``BPF_ADD`` example, that is: the +memory location addresed by ``dst_reg + off`` is atomically modified, with +``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the +immediate, then these operations also overwrite ``src_reg`` with the +value that was in memory before it was modified. + +The more special operations are: + + BPF_XCHG + +This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg + +off``. + + BPF_CMPXCHG + +This atomically compares the value addressed by ``dst_reg + off`` with +``R0``. If they match it is replaced with ``src_reg``, The value that was there +before is loaded back to ``R0``. + Note that 1 and 2 byte atomic operations are not supported.
+Except ``BPF_ADD`` _without_ ``BPF_FETCH`` (for legacy reasons), all 4 byte +atomic operations require alu32 mode. Clang enables this mode by default in +architecture v3 (``-mcpu=v3``). For older versions it can be enabled with +``-Xclang -target-feature -Xclang +alu32``. + You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to the exclusive-add operation encoded when the immediate field is zero.
From: Gary Lin glin@suse.com
mainline inclusion from mainline-5.12-rc1 commit 93c5aecc35c61414073d848e1ba637fc2cae98a8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The x64 bpf jit expects bpf images converge within the given passes, but it could fail to do so with some corner cases. For example:
l0: ja 40 l1: ja 40
[... repeated ja 40 ]
l39: ja 40 l40: ret #0
This bpf program contains 40 "ja 40" instructions which are effectively NOPs and designed to be replaced with valid code dynamically. Ideally, bpf jit should optimize those "ja 40" instructions out when translating the bpf instructions into x64 machine code. However, do_jit() can only remove one "ja 40" for offset==0 on each pass, so it requires at least 40 runs to eliminate those JMPs and exceeds the current limit of passes(20). In the end, the program got rejected when BPF_JIT_ALWAYS_ON is set even though it's legit as a classic socket filter.
To make bpf images more likely converge within 20 passes, this commit pads some instructions with NOPs in the last 5 passes:
1. conditional jumps A possible size variance comes from the adoption of imm8 JMP. If the offset is imm8, we calculate the size difference of this BPF instruction between the previous and the current pass and fill the gap with NOPs. To avoid the recalculation of jump offset, those NOPs are inserted before the JMP code, so we have to subtract the 2 bytes of imm8 JMP when calculating the NOP number.
2. BPF_JA There are two conditions for BPF_JA. a.) nop jumps If this instruction is not optimized out in the previous pass, instead of removing it, we insert the equivalent size of NOPs. b.) label jumps Similar to condition jumps, we prepend NOPs right before the JMP code.
To make the code concise, emit_nops() is modified to use the signed len and return the number of inserted NOPs.
For bpf-to-bpf, we always enable padding for the extra pass since there is only one extra run and the jump padding doesn't affected the images that converge without padding.
After applying this patch, the corner case was loaded with the following jit code:
flen=45 proglen=77 pass=17 image=ffffffffc03367d4 from=jump pid=10097 JIT code: 00000000: 0f 1f 44 00 00 55 48 89 e5 53 41 55 31 c0 45 31 JIT code: 00000010: ed 48 89 fb eb 30 eb 2e eb 2c eb 2a eb 28 eb 26 JIT code: 00000020: eb 24 eb 22 eb 20 eb 1e eb 1c eb 1a eb 18 eb 16 JIT code: 00000030: eb 14 eb 12 eb 10 eb 0e eb 0c eb 0a eb 08 eb 06 JIT code: 00000040: eb 04 eb 02 66 90 31 c0 41 5d 5b c9 c3
0: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0] 5: 55 push rbp 6: 48 89 e5 mov rbp,rsp 9: 53 push rbx a: 41 55 push r13 c: 31 c0 xor eax,eax e: 45 31 ed xor r13d,r13d 11: 48 89 fb mov rbx,rdi 14: eb 30 jmp 0x46 16: eb 2e jmp 0x46 ... 3e: eb 06 jmp 0x46 40: eb 04 jmp 0x46 42: eb 02 jmp 0x46 44: 66 90 xchg ax,ax 46: 31 c0 xor eax,eax 48: 41 5d pop r13 4a: 5b pop rbx 4b: c9 leave 4c: c3 ret
At the 16th pass, 15 jumps were already optimized out, and one jump was replaced with NOPs at 44 and the image converged at the 17th pass.
v4: - Add the detailed comments about the possible padding bytes
v3: - Copy the instructions of prologue separately or the size calculation of the first BPF instruction would include the prologue. - Replace WARN_ONCE() with pr_err() and EFAULT - Use MAX_PASSES in the for loop condition check - Remove the "padded" flag from x64_jit_data. For the extra pass of subprogs, padding is always enabled since it won't hurt the images that converge without padding.
v2: - Simplify the sample code in the description and provide the jit code - Check the expected padding bytes with WARN_ONCE - Move the 'padded' flag to 'struct x64_jit_data'
Signed-off-by: Gary Lin glin@suse.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210119102501.511-2-glin@suse.com (cherry picked from commit 93c5aecc35c61414073d848e1ba637fc2cae98a8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 140 ++++++++++++++++++++++++++++-------- 1 file changed, 112 insertions(+), 28 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 5735184f5dbd..4c2324295ae2 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -870,8 +870,31 @@ static void detect_reg_usage(struct bpf_insn *insn, int insn_cnt, } }
+static int emit_nops(u8 **pprog, int len) +{ + u8 *prog = *pprog; + int i, noplen, cnt = 0; + + while (len > 0) { + noplen = len; + + if (noplen > ASM_NOP_MAX) + noplen = ASM_NOP_MAX; + + for (i = 0; i < noplen; i++) + EMIT1(ideal_nops[noplen][i]); + len -= noplen; + } + + *pprog = prog; + + return cnt; +} + +#define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp))) + static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, - int oldproglen, struct jit_context *ctx) + int oldproglen, struct jit_context *ctx, bool jmp_padding) { bool tail_call_reachable = bpf_prog->aux->tail_call_reachable; struct bpf_insn *insn = bpf_prog->insnsi; @@ -881,7 +904,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, bool seen_exit = false; u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; int i, cnt = 0, excnt = 0; - int proglen = 0; + int ilen, proglen = 0; u8 *prog = temp; int err;
@@ -895,7 +918,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, bpf_prog_was_classic(bpf_prog), tail_call_reachable, bpf_prog->aux->func_idx != 0); push_callee_regs(&prog, callee_regs_used); - addrs[0] = prog - temp; + + ilen = prog - temp; + if (image) + memcpy(image + proglen, temp, ilen); + proglen += ilen; + addrs[0] = proglen; + prog = temp;
for (i = 1; i <= insn_cnt; i++, insn++) { const s32 imm32 = insn->imm; @@ -904,8 +933,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 b2 = 0, b3 = 0; s64 jmp_offset; u8 jmp_cond; - int ilen; u8 *func; + int nops;
switch (insn->code) { /* ALU */ @@ -1505,6 +1534,30 @@ st: if (is_imm8(insn->off)) } jmp_offset = addrs[i + insn->off] - addrs[i]; if (is_imm8(jmp_offset)) { + if (jmp_padding) { + /* To keep the jmp_offset valid, the extra bytes are + * padded before the jump insn, so we substract the + * 2 bytes of jmp_cond insn from INSN_SZ_DIFF. + * + * If the previous pass already emits an imm8 + * jmp_cond, then this BPF insn won't shrink, so + * "nops" is 0. + * + * On the other hand, if the previous pass emits an + * imm32 jmp_cond, the extra 4 bytes(*) is padded to + * keep the image from shrinking further. + * + * (*) imm32 jmp_cond is 6 bytes, and imm8 jmp_cond + * is 2 bytes, so the size difference is 4 bytes. + */ + nops = INSN_SZ_DIFF - 2; + if (nops != 0 && nops != 4) { + pr_err("unexpected jmp_cond padding: %d bytes\n", + nops); + return -EFAULT; + } + cnt += emit_nops(&prog, nops); + } EMIT2(jmp_cond, jmp_offset); } else if (is_simm32(jmp_offset)) { EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset); @@ -1527,11 +1580,55 @@ st: if (is_imm8(insn->off)) else jmp_offset = addrs[i + insn->off] - addrs[i];
- if (!jmp_offset) - /* Optimize out nop jumps */ + if (!jmp_offset) { + /* + * If jmp_padding is enabled, the extra nops will + * be inserted. Otherwise, optimize out nop jumps. + */ + if (jmp_padding) { + /* There are 3 possible conditions. + * (1) This BPF_JA is already optimized out in + * the previous run, so there is no need + * to pad any extra byte (0 byte). + * (2) The previous pass emits an imm8 jmp, + * so we pad 2 bytes to match the previous + * insn size. + * (3) Similarly, the previous pass emits an + * imm32 jmp, and 5 bytes is padded. + */ + nops = INSN_SZ_DIFF; + if (nops != 0 && nops != 2 && nops != 5) { + pr_err("unexpected nop jump padding: %d bytes\n", + nops); + return -EFAULT; + } + cnt += emit_nops(&prog, nops); + } break; + } emit_jmp: if (is_imm8(jmp_offset)) { + if (jmp_padding) { + /* To avoid breaking jmp_offset, the extra bytes + * are padded before the actual jmp insn, so + * 2 bytes is substracted from INSN_SZ_DIFF. + * + * If the previous pass already emits an imm8 + * jmp, there is nothing to pad (0 byte). + * + * If it emits an imm32 jmp (5 bytes) previously + * and now an imm8 jmp (2 bytes), then we pad + * (5 - 2 = 3) bytes to stop the image from + * shrinking further. + */ + nops = INSN_SZ_DIFF - 2; + if (nops != 0 && nops != 3) { + pr_err("unexpected jump padding: %d bytes\n", + nops); + return -EFAULT; + } + cnt += emit_nops(&prog, INSN_SZ_DIFF - 2); + } EMIT2(0xEB, jmp_offset); } else if (is_simm32(jmp_offset)) { EMIT1_off32(0xE9, jmp_offset); @@ -1687,26 +1784,6 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, return 0; }
-static void emit_nops(u8 **pprog, unsigned int len) -{ - unsigned int i, noplen; - u8 *prog = *pprog; - int cnt = 0; - - while (len > 0) { - noplen = len; - - if (noplen > ASM_NOP_MAX) - noplen = ASM_NOP_MAX; - - for (i = 0; i < noplen; i++) - EMIT1(ideal_nops[noplen][i]); - len -= noplen; - } - - *pprog = prog; -} - static void emit_align(u8 **pprog, u32 align) { u8 *target, *prog = *pprog; @@ -2123,6 +2200,9 @@ struct x64_jit_data { struct jit_context ctx; };
+#define MAX_PASSES 20 +#define PADDING_PASSES (MAX_PASSES - 5) + struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) { struct bpf_binary_header *header = NULL; @@ -2132,6 +2212,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) struct jit_context ctx = {}; bool tmp_blinded = false; bool extra_pass = false; + bool padding = false; u8 *image = NULL; int *addrs; int pass; @@ -2168,6 +2249,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) image = jit_data->image; header = jit_data->header; extra_pass = true; + padding = true; goto skip_init_addrs; } addrs = kvmalloc_array(prog->len + 1, sizeof(*addrs), GFP_KERNEL); @@ -2193,8 +2275,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) * may converge on the last pass. In such case do one more * pass to emit the final image. */ - for (pass = 0; pass < 20 || image; pass++) { - proglen = do_jit(prog, addrs, image, oldproglen, &ctx); + for (pass = 0; pass < MAX_PASSES || image; pass++) { + if (!padding && pass >= PADDING_PASSES) + padding = true; + proglen = do_jit(prog, addrs, image, oldproglen, &ctx, padding); if (proglen <= 0) { out_image: image = NULL;
From: Gary Lin glin@suse.com
mainline inclusion from mainline-5.12-rc1 commit 16a660ef7d8c89787ee4bf352458681439485649 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
With NOPs padding, x64 jit now can handle the jump cases like bpf_fill_maxinsns11().
Signed-off-by: Gary Lin glin@suse.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210119102501.511-3-glin@suse.com (cherry picked from commit 16a660ef7d8c89787ee4bf352458681439485649) Signed-off-by: Wang Yufen wangyufen@huawei.com --- lib/test_bpf.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/lib/test_bpf.c b/lib/test_bpf.c index aba3c7e43ee5..acf825d81671 100644 --- a/lib/test_bpf.c +++ b/lib/test_bpf.c @@ -345,7 +345,7 @@ static int __bpf_fill_ja(struct bpf_test *self, unsigned int len,
static int bpf_fill_maxinsns11(struct bpf_test *self) { - /* Hits 70 passes on x86_64, so cannot get JITed there. */ + /* Hits 70 passes on x86_64 and triggers NOPs padding. */ return __bpf_fill_ja(self, BPF_MAXINSNS, 68); }
@@ -5318,15 +5318,10 @@ static struct bpf_test tests[] = { { "BPF_MAXINSNS: Jump, gap, jump, ...", { }, -#if defined(CONFIG_BPF_JIT_ALWAYS_ON) && defined(CONFIG_X86) - CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL, -#else CLASSIC | FLAG_NO_DATA, -#endif { }, { { 0, 0xababcbac } }, .fill_helper = bpf_fill_maxinsns11, - .expected_errcode = -ENOTSUPP, }, { "BPF_MAXINSNS: jump over MSH",
From: Gary Lin glin@suse.com
mainline inclusion from mainline-5.12-rc1 commit 79d1b684e21533bd417df50704a9692830eb8358 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
There are 3 tests added into verifier's jit tests to trigger x64 jit jump padding.
The first test can be represented as the following assembly code:
1: bpf_call bpf_get_prandom_u32 2: if r0 == 1 goto pc+128 3: if r0 == 2 goto pc+128 ... 129: if r0 == 128 goto pc+128 130: goto pc+128 131: goto pc+127 ... 256: goto pc+2 257: goto pc+1 258: r0 = 1 259: ret
We first store a random number to r0 and add the corresponding conditional jumps (2~129) to make verifier believe that those jump instructions from 130 to 257 are reachable. When the program is sent to x64 jit, it starts to optimize out the NOP jumps backwards from 257. Since there are 128 such jumps, the program easily reaches 15 passes and triggers jump padding.
Here is the x64 jit code of the first test:
0: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0] 5: 66 90 xchg ax,ax 7: 55 push rbp 8: 48 89 e5 mov rbp,rsp b: e8 4c 90 75 e3 call 0xffffffffe375905c 10: 48 83 f8 01 cmp rax,0x1 14: 0f 84 fe 04 00 00 je 0x518 1a: 48 83 f8 02 cmp rax,0x2 1e: 0f 84 f9 04 00 00 je 0x51d ... f6: 48 83 f8 18 cmp rax,0x18 fa: 0f 84 8b 04 00 00 je 0x58b 100: 48 83 f8 19 cmp rax,0x19 104: 0f 84 86 04 00 00 je 0x590 10a: 48 83 f8 1a cmp rax,0x1a 10e: 0f 84 81 04 00 00 je 0x595 ... 500: 0f 84 83 01 00 00 je 0x689 506: 48 81 f8 80 00 00 00 cmp rax,0x80 50d: 0f 84 76 01 00 00 je 0x689 513: e9 71 01 00 00 jmp 0x689 518: e9 6c 01 00 00 jmp 0x689 ... 5fe: e9 86 00 00 00 jmp 0x689 603: e9 81 00 00 00 jmp 0x689 608: 0f 1f 00 nop DWORD PTR [rax] 60b: eb 7c jmp 0x689 60d: eb 7a jmp 0x689 ... 683: eb 04 jmp 0x689 685: eb 02 jmp 0x689 687: 66 90 xchg ax,ax 689: b8 01 00 00 00 mov eax,0x1 68e: c9 leave 68f: c3 ret
As expected, a 3 bytes NOPs is inserted at 608 due to the transition from imm32 jmp to imm8 jmp. A 2 bytes NOPs is also inserted at 687 to replace a NOP jump.
The second test case is tricky. Here is the assembly code:
1: bpf_call bpf_get_prandom_u32 2: if r0 == 1 goto pc+2048 3: if r0 == 2 goto pc+2048 ... 2049: if r0 == 2048 goto pc+2048 2050: goto pc+2048 2051: goto pc+16 2052: goto pc+15 ... 2064: goto pc+3 2065: goto pc+2 2066: goto pc+1 ... [repeat "goto pc+16".."goto pc+1" 127 times] ... 4099: r0 = 2 4100: ret
There are 4 major parts of the program. 1) 1~2049: Those are instructions to make 2050~4098 reachable. Some of them also could generate the padding for jmp_cond. 2) 2050: This is the target instruction for the imm32 nop jmp padding. 3) 2051~4098: The repeated "goto 1~16" instructions are designed to be consumed by the nop jmp optimization. In the end, those instrucitons become 128 continuous 0 offset jmp and are optimized out in 1 pass, and this make insn 2050 an imm32 nop jmp in the next pass, so that we can trigger the 5 bytes padding. 4) 4099~4100: Those are the instructions to end the program.
The x64 jit code is like this:
0: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0] 5: 66 90 xchg ax,ax 7: 55 push rbp 8: 48 89 e5 mov rbp,rsp b: e8 bc 7b d5 d3 call 0xffffffffd3d57bcc 10: 48 83 f8 01 cmp rax,0x1 14: 0f 84 7e 66 00 00 je 0x6698 1a: 48 83 f8 02 cmp rax,0x2 1e: 0f 84 74 66 00 00 je 0x6698 24: 48 83 f8 03 cmp rax,0x3 28: 0f 84 6a 66 00 00 je 0x6698 2e: 48 83 f8 04 cmp rax,0x4 32: 0f 84 60 66 00 00 je 0x6698 38: 48 83 f8 05 cmp rax,0x5 3c: 0f 84 56 66 00 00 je 0x6698 42: 48 83 f8 06 cmp rax,0x6 46: 0f 84 4c 66 00 00 je 0x6698 ... 666c: 48 81 f8 fe 07 00 00 cmp rax,0x7fe 6673: 0f 1f 40 00 nop DWORD PTR [rax+0x0] 6677: 74 1f je 0x6698 6679: 48 81 f8 ff 07 00 00 cmp rax,0x7ff 6680: 0f 1f 40 00 nop DWORD PTR [rax+0x0] 6684: 74 12 je 0x6698 6686: 48 81 f8 00 08 00 00 cmp rax,0x800 668d: 0f 1f 40 00 nop DWORD PTR [rax+0x0] 6691: 74 05 je 0x6698 6693: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0] 6698: b8 02 00 00 00 mov eax,0x2 669d: c9 leave 669e: c3 ret
Since insn 2051~4098 are optimized out right before the padding pass, there are several conditional jumps from the first part are replaced with imm8 jmp_cond, and this triggers the 4 bytes padding, for example at 6673, 6680, and 668d. On the other hand, Insn 2050 is replaced with the 5 bytes nops at 6693.
The third test is to invoke the first and second tests as subprogs to test bpf2bpf. Per the system log, there was one more jit happened with only one pass and the same jit code was produced.
v4: - Add the second test case which triggers jmp_cond padding and imm32 nop jmp padding. - Add the new test case as another subprog
Signed-off-by: Gary Lin glin@suse.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210119102501.511-4-glin@suse.com (cherry picked from commit 79d1b684e21533bd417df50704a9692830eb8358) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_verifier.c | 72 +++++++++++++++++++++ tools/testing/selftests/bpf/verifier/jit.c | 24 +++++++ 2 files changed, 96 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 90c0428bc0e5..cd59e9854e8c 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -297,6 +297,78 @@ static void bpf_fill_scale(struct bpf_test *self) } }
+static int bpf_fill_torturous_jumps_insn_1(struct bpf_insn *insn) +{ + unsigned int len = 259, hlen = 128; + int i; + + insn[0] = BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32); + for (i = 1; i <= hlen; i++) { + insn[i] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, i, hlen); + insn[i + hlen] = BPF_JMP_A(hlen - i); + } + insn[len - 2] = BPF_MOV64_IMM(BPF_REG_0, 1); + insn[len - 1] = BPF_EXIT_INSN(); + + return len; +} + +static int bpf_fill_torturous_jumps_insn_2(struct bpf_insn *insn) +{ + unsigned int len = 4100, jmp_off = 2048; + int i, j; + + insn[0] = BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32); + for (i = 1; i <= jmp_off; i++) { + insn[i] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, i, jmp_off); + } + insn[i++] = BPF_JMP_A(jmp_off); + for (; i <= jmp_off * 2 + 1; i+=16) { + for (j = 0; j < 16; j++) { + insn[i + j] = BPF_JMP_A(16 - j - 1); + } + } + + insn[len - 2] = BPF_MOV64_IMM(BPF_REG_0, 2); + insn[len - 1] = BPF_EXIT_INSN(); + + return len; +} + +static void bpf_fill_torturous_jumps(struct bpf_test *self) +{ + struct bpf_insn *insn = self->fill_insns; + int i = 0; + + switch (self->retval) { + case 1: + self->prog_len = bpf_fill_torturous_jumps_insn_1(insn); + return; + case 2: + self->prog_len = bpf_fill_torturous_jumps_insn_2(insn); + return; + case 3: + /* main */ + insn[i++] = BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 4); + insn[i++] = BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 262); + insn[i++] = BPF_ST_MEM(BPF_B, BPF_REG_10, -32, 0); + insn[i++] = BPF_MOV64_IMM(BPF_REG_0, 3); + insn[i++] = BPF_EXIT_INSN(); + + /* subprog 1 */ + i += bpf_fill_torturous_jumps_insn_1(insn + i); + + /* subprog 2 */ + i += bpf_fill_torturous_jumps_insn_2(insn + i); + + self->prog_len = i; + return; + default: + self->prog_len = 0; + break; + } +} + /* BPF_SK_LOOKUP contains 13 instructions, if you need to fix up maps */ #define BPF_SK_LOOKUP(func) \ /* struct bpf_sock_tuple tuple = {} */ \ diff --git a/tools/testing/selftests/bpf/verifier/jit.c b/tools/testing/selftests/bpf/verifier/jit.c index c33adf344fae..df215e004566 100644 --- a/tools/testing/selftests/bpf/verifier/jit.c +++ b/tools/testing/selftests/bpf/verifier/jit.c @@ -105,3 +105,27 @@ .result = ACCEPT, .retval = 2, }, +{ + "jit: torturous jumps, imm8 nop jmp and pure jump padding", + .insns = { }, + .fill_helper = bpf_fill_torturous_jumps, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .result = ACCEPT, + .retval = 1, +}, +{ + "jit: torturous jumps, imm32 nop jmp and jmp_cond padding", + .insns = { }, + .fill_helper = bpf_fill_torturous_jumps, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .result = ACCEPT, + .retval = 2, +}, +{ + "jit: torturous jumps in subprog", + .insns = { }, + .fill_helper = bpf_fill_torturous_jumps, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .result = ACCEPT, + .retval = 3, +},
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.12-rc1 commit 13ca51d5eb358edcb673afccb48c3440b9fda21b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
llvm patch https://reviews.llvm.org/D84002 permitted to emit empty rodata datasec if the elf .rodata section contains read-only data from local variables. These local variables will be not emitted as BTF_KIND_VARs since llvm converted these local variables as static variables with private linkage without debuginfo types. Such an empty rodata datasec will make skeleton code generation easy since for skeleton a rodata struct will be generated if there is a .rodata elf section. The existence of a rodata btf datasec is also consistent with the existence of a rodata map created by libbpf.
The btf with such an empty rodata datasec will fail in the kernel though as kernel will reject a datasec with zero vlen and zero size. For example, for the below code, int sys_enter(void *ctx) { int fmt[6] = {1, 2, 3, 4, 5, 6}; int dst[6];
bpf_probe_read(dst, sizeof(dst), fmt); return 0; } We got the below btf (bpftool btf dump ./test.o): [1] PTR '(anon)' type_id=0 [2] FUNC_PROTO '(anon)' ret_type_id=3 vlen=1 'ctx' type_id=1 [3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED [4] FUNC 'sys_enter' type_id=2 linkage=global [5] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED [6] ARRAY '(anon)' type_id=5 index_type_id=7 nr_elems=4 [7] INT '__ARRAY_SIZE_TYPE__' size=4 bits_offset=0 nr_bits=32 encoding=(none) [8] VAR '_license' type_id=6, linkage=global-alloc [9] DATASEC '.rodata' size=0 vlen=0 [10] DATASEC 'license' size=0 vlen=1 type_id=8 offset=0 size=4 When loading the ./test.o to the kernel with bpftool, we see the following error: libbpf: Error loading BTF: Invalid argument(22) libbpf: magic: 0xeb9f ... [6] ARRAY (anon) type_id=5 index_type_id=7 nr_elems=4 [7] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none) [8] VAR _license type_id=6 linkage=1 [9] DATASEC .rodata size=24 vlen=0 vlen == 0 libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.
Basically, libbpf changed .rodata datasec size to 24 since elf .rodata section size is 24. The kernel then rejected the BTF since vlen = 0. Note that the above kernel verifier failure can be worked around with changing local variable "fmt" to a static or global, optionally const, variable.
This patch permits a datasec with vlen = 0 in kernel.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210119153519.3901963-1-yhs@fb.com (cherry picked from commit 13ca51d5eb358edcb673afccb48c3440b9fda21b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 5 ----- tools/testing/selftests/bpf/prog_tests/btf.c | 21 ++++++++++++++++++++ 2 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 57f99c23a6fd..a3a32e83a623 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -3540,11 +3540,6 @@ static s32 btf_datasec_check_meta(struct btf_verifier_env *env, return -EINVAL; }
- if (!btf_type_vlen(t)) { - btf_verifier_log_type(env, t, "vlen == 0"); - return -EINVAL; - } - if (!t->size) { btf_verifier_log_type(env, t, "size == 0"); return -EINVAL; diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 8ae97e2a4b9d..055d2c0486ed 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -3509,6 +3509,27 @@ static struct btf_raw_test raw_tests[] = { .value_type_id = 3 /* arr_t */, .max_entries = 4, }, +/* + * elf .rodata section size 4 and btf .rodata section vlen 0. + */ +{ + .descr = "datasec: vlen == 0", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* .rodata section */ + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 0), 4), + /* [2] */ + BTF_END_RAW, + }, + BTF_STR_SEC("\0.rodata"), + .map_type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(int), + .value_size = sizeof(int), + .key_type_id = 1, + .value_type_id = 1, + .max_entries = 1, +},
}; /* struct btf_raw_test raw_tests[] */
From: Jiri Olsa jolsa@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 6095d5a271ad6dce30489526771a9040d78e0895 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
For very large ELF objects (with many sections), we could get special value SHN_XINDEX (65535) for elf object's string table index - e_shstrndx.
Call elf_getshdrstrndx to get the proper string table index, instead of reading it directly from ELF header.
Signed-off-by: Jiri Olsa jolsa@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210121202203.9346-4-jolsa@kernel.org (cherry picked from commit 6095d5a271ad6dce30489526771a9040d78e0895) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 6225c403685a..0b47edee3c93 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -864,6 +864,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, Elf_Scn *scn = NULL; Elf *elf = NULL; GElf_Ehdr ehdr; + size_t shstrndx;
if (elf_version(EV_CURRENT) == EV_NONE) { pr_warn("failed to init libelf for %s\n", path); @@ -888,7 +889,14 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, pr_warn("failed to get EHDR from %s\n", path); goto done; } - if (!elf_rawdata(elf_getscn(elf, ehdr.e_shstrndx), NULL)) { + + if (elf_getshdrstrndx(elf, &shstrndx)) { + pr_warn("failed to get section names section index for %s\n", + path); + goto done; + } + + if (!elf_rawdata(elf_getscn(elf, shstrndx), NULL)) { pr_warn("failed to get e_shstrndx from %s\n", path); goto done; } @@ -903,7 +911,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, idx, path); goto done; } - name = elf_strptr(elf, ehdr.e_shstrndx, sh.sh_name); + name = elf_strptr(elf, shstrndx, sh.sh_name); if (!name) { pr_warn("failed to get section(%d) name from %s\n", idx, path);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 86ce322d21eb032ed8fdd294d0fb095d2debb430 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Fix bug in handling bpf_testmod unloading that will cause test_progs exiting prematurely if bpf_testmod unloading failed. This is especially problematic when running a subset of test_progs that doesn't require root permissions and doesn't rely on bpf_testmod, yet will fail immediately due to exit(1) in unload_bpf_testmod().
Fixes: 9f7fa225894c ("selftests/bpf: Add bpf_testmod kernel module for testing") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210126065019.1268027-1-andrii@kernel.org (cherry picked from commit 86ce322d21eb032ed8fdd294d0fb095d2debb430) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_progs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index fe1367071846..91c52c98bd28 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -390,7 +390,7 @@ static void unload_bpf_testmod(void) return; } fprintf(env.stderr, "Failed to unload bpf_testmod.ko from kernel: %d\n", -errno); - exit(1); + return; } if (env.verbosity > VERBOSE_NONE) fprintf(stdout, "Successfully unloaded bpf_testmod.ko.\n");
From: Sedat Dilek sedat.dilek@gmail.com
mainline inclusion from mainline-5.12-rc1 commit 211a741cd3e124bffdc13ee82e7e65f204e53f60 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When dealing with BPF/BTF/pahole and DWARF v5 I wanted to build bpftool.
While looking into the source code I found duplicate assignments in misc tools for the LLVM eco system, e.g. clang and llvm-objcopy.
Move the Clang, LLC and/or LLVM utils definitions to tools/scripts/Makefile.include file and add missing includes where needed. Honestly, I was inspired by the commit c8a950d0d3b9 ("tools: Factor HOSTCC, HOSTLD, HOSTAR definitions").
I tested with bpftool and perf on Debian/testing AMD64 and LLVM/Clang v11.1.0-rc1.
Build instructions:
[ make and make-options ] MAKE="make V=1" MAKE_OPTS="HOSTCC=clang HOSTCXX=clang++ HOSTLD=ld.lld CC=clang LD=ld.lld LLVM=1 LLVM_IAS=1" MAKE_OPTS="$MAKE_OPTS PAHOLE=/opt/pahole/bin/pahole"
[ clean-up ] $MAKE $MAKE_OPTS -C tools/ clean
[ bpftool ] $MAKE $MAKE_OPTS -C tools/bpf/bpftool/
[ perf ] PYTHON=python3 $MAKE $MAKE_OPTS -C tools/perf/
I was careful with respecting the user's wish to override custom compiler, linker, GNU/binutils and/or LLVM utils settings.
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: Jiri Olsa jolsa@redhat.com # tools/build and tools/perf Link: https://lore.kernel.org/bpf/20210128015117.20515-1-sedat.dilek@gmail.com (cherry picked from commit 211a741cd3e124bffdc13ee82e7e65f204e53f60) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 2 -- tools/bpf/runqslower/Makefile | 3 --- tools/build/feature/Makefile | 4 ++-- tools/perf/Makefile.perf | 1 - tools/scripts/Makefile.include | 7 +++++++ tools/testing/selftests/bpf/Makefile | 2 -- tools/testing/selftests/tc-testing/Makefile | 3 +-- 7 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index 752d3e97e077..df67ef3248e1 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -74,8 +74,6 @@ endif
INSTALL ?= install RM ?= rm -f -CLANG ?= clang -LLVM_STRIP ?= llvm-strip
FEATURE_USER = .bpftool FEATURE_TESTS = libbfd disassembler-four-args reallocarray zlib libcap \ diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index 4d5ca54fcd4c..9d9fb6209be1 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -3,9 +3,6 @@ include ../../scripts/Makefile.include
OUTPUT ?= $(abspath .output)/
-CLANG ?= clang -LLC ?= llc -LLVM_STRIP ?= llvm-strip BPFTOOL_OUTPUT := $(OUTPUT)bpftool/ DEFAULT_BPFTOOL := $(BPFTOOL_OUTPUT)bpftool BPFTOOL ?= $(DEFAULT_BPFTOOL) diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index db3417ff8c5f..cd123c7ad1f4 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -1,4 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 +include ../../scripts/Makefile.include + FILES= \ test-all.bin \ test-backtrace.bin \ @@ -76,8 +78,6 @@ FILES= \ FILES := $(addprefix $(OUTPUT),$(FILES))
PKG_CONFIG ?= $(CROSS_COMPILE)pkg-config -LLVM_CONFIG ?= llvm-config -CLANG ?= clang
all: $(FILES)
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index e41a8f9b99d2..d128b6006405 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -176,7 +176,6 @@ endef LD += $(EXTRA_LDFLAGS)
PKG_CONFIG = $(CROSS_COMPILE)pkg-config -LLVM_CONFIG ?= llvm-config
RM = rm -f LN = ln -f diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index 865e21d4cc12..41ca9473d0f0 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -87,6 +87,13 @@ HOSTCC ?= gcc HOSTLD ?= ld endif
+# Some tools require Clang, LLC and/or LLVM utils +CLANG ?= clang +LLC ?= llc +LLVM_CONFIG ?= llvm-config +LLVM_OBJCOPY ?= llvm-objcopy +LLVM_STRIP ?= llvm-strip + ifeq ($(CC_NO_CLANG), 1) EXTRA_WARNINGS += -Wstrict-aliasing=3 endif diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 2ec1fb02b21d..c654cbc1f622 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -19,8 +19,6 @@ ifneq ($(wildcard $(GENHDR)),) GENFLAGS := -DHAVE_GENHDR endif
-CLANG ?= clang -LLVM_OBJCOPY ?= llvm-objcopy BPF_GCC ?= $(shell command -v bpf-gcc;) SAN_CFLAGS ?= CFLAGS += -g -rdynamic -Wall -O2 $(GENFLAGS) $(SAN_CFLAGS) \ diff --git a/tools/testing/selftests/tc-testing/Makefile b/tools/testing/selftests/tc-testing/Makefile index 91fee5c43274..4d639279f41e 100644 --- a/tools/testing/selftests/tc-testing/Makefile +++ b/tools/testing/selftests/tc-testing/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 +include ../../../scripts/Makefile.include
top_srcdir = $(abspath ../../../..) APIDIR := $(top_scrdir)/include/uapi @@ -7,8 +8,6 @@ TEST_GEN_FILES = action.o KSFT_KHDR_INSTALL := 1 include ../lib.mk
-CLANG ?= clang -LLC ?= llc PROBE := $(shell $(LLC) -march=bpf -mcpu=probe -filetype=null /dev/null 2>&1)
ifeq ($(PROBE),)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 5f10c1aac8b29d225d19a74656865d1ee3db6eaa category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Libbpf's Makefile relies on Linux tools infrastructure's feature detection framework, but libbpf's needs are very modest: it detects the presence of libelf and libz, both of which are mandatory. So it doesn't benefit much from the framework, but pays significant costs in terms of maintainability and debugging experience, when something goes wrong. The other feature detector, testing for the presernce of minimal BPF API in system headers is long obsolete as well, providing no value.
So stop using feature detection and just assume the presence of libelf and libz during build time. Worst case, user will get a clear and actionable linker error, e.g.:
/usr/bin/ld: cannot find -lelf
On the other hand, we completely bypass recurring issues various users reported over time with false negatives of feature detection (libelf or libz not being detected, while they are actually present in the system).
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Randy Dunlap rdunlap@infradead.org Link: https://lore.kernel.org/bpf/20210203203445.3356114-1-andrii@kernel.org (cherry picked from commit 5f10c1aac8b29d225d19a74656865d1ee3db6eaa) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/.gitignore | 1 - tools/lib/bpf/Makefile | 47 ++++------------------------------------ 2 files changed, 4 insertions(+), 44 deletions(-)
diff --git a/tools/lib/bpf/.gitignore b/tools/lib/bpf/.gitignore index 8a81b3679d2b..5d4cfac671d5 100644 --- a/tools/lib/bpf/.gitignore +++ b/tools/lib/bpf/.gitignore @@ -1,7 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only libbpf_version.h libbpf.pc -FEATURE-DUMP.libbpf libbpf.so.* TAGS tags diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index f2a353bba25f..76cf5ab8d00c 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -59,28 +59,7 @@ ifndef VERBOSE VERBOSE = 0 endif
-FEATURE_USER = .libbpf -FEATURE_TESTS = libelf zlib bpf -FEATURE_DISPLAY = libelf zlib bpf - INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/tools/include/uapi -FEATURE_CHECK_CFLAGS-bpf = $(INCLUDES) - -check_feat := 1 -NON_CHECK_FEAT_TARGETS := clean TAGS tags cscope help -ifdef MAKECMDGOALS -ifeq ($(filter-out $(NON_CHECK_FEAT_TARGETS),$(MAKECMDGOALS)),) - check_feat := 0 -endif -endif - -ifeq ($(check_feat),1) -ifeq ($(FEATURES_DUMP),) -include $(srctree)/tools/build/Makefile.feature -else -include $(FEATURES_DUMP) -endif -endif
export prefix libdir src obj
@@ -157,7 +136,7 @@ all: fixdep
all_cmd: $(CMD_TARGETS) check
-$(BPF_IN_SHARED): force elfdep zdep bpfdep $(BPF_HELPER_DEFS) +$(BPF_IN_SHARED): force $(BPF_HELPER_DEFS) @(test -f ../../include/uapi/linux/bpf.h -a -f ../../../include/uapi/linux/bpf.h && ( \ (diff -B ../../include/uapi/linux/bpf.h ../../../include/uapi/linux/bpf.h >/dev/null) || \ echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'" >&2 )) || true @@ -175,7 +154,7 @@ $(BPF_IN_SHARED): force elfdep zdep bpfdep $(BPF_HELPER_DEFS) echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h'" >&2 )) || true $(Q)$(MAKE) $(build)=libbpf OUTPUT=$(SHARED_OBJDIR) CFLAGS="$(CFLAGS) $(SHLIB_FLAGS)"
-$(BPF_IN_STATIC): force elfdep zdep bpfdep $(BPF_HELPER_DEFS) +$(BPF_IN_STATIC): force $(BPF_HELPER_DEFS) $(Q)$(MAKE) $(build)=libbpf OUTPUT=$(STATIC_OBJDIR)
$(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h @@ -264,34 +243,16 @@ install_pkgconfig: $(PC_FILE)
install: install_lib install_pkgconfig install_headers
-### Cleaning rules - -config-clean: - $(call QUIET_CLEAN, feature-detect) - $(Q)$(MAKE) -C $(srctree)/tools/build/feature/ clean >/dev/null - -clean: config-clean +clean: $(call QUIET_CLEAN, libbpf) $(RM) -rf $(CMD_TARGETS) \ *~ .*.d .*.cmd LIBBPF-CFLAGS $(BPF_HELPER_DEFS) \ $(SHARED_OBJDIR) $(STATIC_OBJDIR) \ $(addprefix $(OUTPUT), \ *.o *.a *.so *.so.$(LIBBPF_MAJOR_VERSION) *.pc) - $(call QUIET_CLEAN, core-gen) $(RM) $(OUTPUT)FEATURE-DUMP.libbpf - -
-PHONY += force elfdep zdep bpfdep cscope tags +PHONY += force cscope tags force:
-elfdep: - @if [ "$(feature-libelf)" != "1" ]; then echo "No libelf found"; exit 1 ; fi - -zdep: - @if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi - -bpfdep: - @if [ "$(feature-bpf)" != "1" ]; then echo "BPF API too old"; exit 1 ; fi - cscope: ls *.c *.h > cscope.files cscope -b -q -I $(srctree)/include -f cscope.out
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.12-rc1 commit 23a2d70c7a2f28eb1a8f6bc19d68d23968cad0ce category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
There is no functionality change. This refactoring intends to facilitate next patch change with BPF_PSEUDO_FUNC.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210204234827.1628953-1-yhs@fb.com (cherry picked from commit 23a2d70c7a2f28eb1a8f6bc19d68d23968cad0ce) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 29 +++++++++++++---------------- 1 file changed, 13 insertions(+), 16 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8f71cd346361..492069c92731 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -229,6 +229,12 @@ static void bpf_map_key_store(struct bpf_insn_aux_data *aux, u64 state) (poisoned ? BPF_MAP_KEY_POISON : 0ULL); }
+static bool bpf_pseudo_call(const struct bpf_insn *insn) +{ + return insn->code == (BPF_JMP | BPF_CALL) && + insn->src_reg == BPF_PSEUDO_CALL; +} + struct bpf_call_arg_meta { struct bpf_map *map_ptr; bool raw_mode; @@ -1492,9 +1498,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
/* determine subprog starts. The end is one before the next starts */ for (i = 0; i < insn_cnt; i++) { - if (insn[i].code != (BPF_JMP | BPF_CALL)) - continue; - if (insn[i].src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn + i)) continue; if (!env->bpf_capable) { verbose(env, @@ -3300,9 +3304,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) continue_func: subprog_end = subprog[idx + 1].start; for (; i < subprog_end; i++) { - if (insn[i].code != (BPF_JMP | BPF_CALL)) - continue; - if (insn[i].src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn + i)) continue; /* remember insn and function to return to */ ret_insn[frame] = i + 1; @@ -11301,8 +11303,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) return 0;
for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) { - if (insn->code != (BPF_JMP | BPF_CALL) || - insn->src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn)) continue; /* Upon error here we cannot fall back to interpreter but * need a hard reject of the program. Thus -EFAULT is @@ -11405,8 +11406,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) for (i = 0; i < env->subprog_cnt; i++) { insn = func[i]->insnsi; for (j = 0; j < func[i]->len; j++, insn++) { - if (insn->code != (BPF_JMP | BPF_CALL) || - insn->src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn)) continue; subprog = insn->off; insn->imm = BPF_CAST_CALL(func[subprog]->bpf_func) - @@ -11451,8 +11451,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) * later look the same as if they were interpreted only. */ for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) { - if (insn->code != (BPF_JMP | BPF_CALL) || - insn->src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn)) continue; insn->off = env->insn_aux_data[i].call_imm; subprog = find_subprog(env, i + insn->off + 1); @@ -11489,8 +11488,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) /* cleanup main prog to be interpreted */ prog->jit_requested = 0; for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) { - if (insn->code != (BPF_JMP | BPF_CALL) || - insn->src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn)) continue; insn->off = 0; insn->imm = env->insn_aux_data[i].call_imm; @@ -11525,8 +11523,7 @@ static int fixup_call_args(struct bpf_verifier_env *env) return -EINVAL; } for (i = 0; i < prog->len; i++, insn++) { - if (insn->code != (BPF_JMP | BPF_CALL) || - insn->src_reg != BPF_PSEUDO_CALL) + if (!bpf_pseudo_call(insn)) continue; depth = get_callee_stack_depth(env, insn, i); if (depth < 0)
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 700d4796ef59f5faf240d307839bd419e2b6bdff category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Move bpf_prog_stats from prog->aux into prog to avoid one extra load in critical path of program execution.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-2-alexei.starovoitov@gmail.... (cherry picked from commit 700d4796ef59f5faf240d307839bd419e2b6bdff) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 8 -------- include/linux/filter.h | 14 +++++++++++--- kernel/bpf/core.c | 8 ++++---- kernel/bpf/syscall.c | 2 +- kernel/bpf/trampoline.c | 2 +- kernel/bpf/verifier.c | 2 +- 6 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 714dc0148643..d70eca074c2e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -14,7 +14,6 @@ #include <linux/numa.h> #include <linux/mm_types.h> #include <linux/wait.h> -#include <linux/u64_stats_sync.h> #include <linux/refcount.h> #include <linux/mutex.h> #include <linux/module.h> @@ -574,12 +573,6 @@ enum bpf_cgroup_storage_type { */ #define MAX_BPF_FUNC_ARGS 12
-struct bpf_prog_stats { - u64 cnt; - u64 nsecs; - struct u64_stats_sync syncp; -} __aligned(2 * sizeof(u64)); - struct btf_func_model { u8 ret_size; u8 nr_args; @@ -931,7 +924,6 @@ struct bpf_prog_aux { u32 linfo_idx; u32 num_exentries; struct exception_table_entry *extable; - struct bpf_prog_stats __percpu *stats; union { struct work_struct work; struct rcu_head rcu; diff --git a/include/linux/filter.h b/include/linux/filter.h index a852b2af281c..884759cb2507 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -22,6 +22,7 @@ #include <linux/vmalloc.h> #include <linux/sockptr.h> #include <crypto/sha1.h> +#include <linux/u64_stats_sync.h>
#include <net/sch_generic.h>
@@ -554,6 +555,12 @@ struct bpf_binary_header { u8 image[] __aligned(BPF_IMAGE_ALIGNMENT); };
+struct bpf_prog_stats { + u64 cnt; + u64 nsecs; + struct u64_stats_sync syncp; +} __aligned(2 * sizeof(u64)); + struct bpf_prog { u16 pages; /* Number of allocated pages */ u16 jited:1, /* Is our filter JIT'ed? */ @@ -572,10 +579,11 @@ struct bpf_prog { u32 len; /* Number of filter blocks */ u32 jited_len; /* Size of jited insns in bytes */ u8 tag[BPF_TAG_SIZE]; - struct bpf_prog_aux *aux; /* Auxiliary fields */ - struct sock_fprog_kern *orig_prog; /* Original BPF program */ + struct bpf_prog_stats __percpu *stats; unsigned int (*bpf_func)(const void *ctx, const struct bpf_insn *insn); + struct bpf_prog_aux *aux; /* Auxiliary fields */ + struct sock_fprog_kern *orig_prog; /* Original BPF program */ /* Instructions for interpreter */ struct sock_filter insns[0]; struct bpf_insn insnsi[]; @@ -596,7 +604,7 @@ DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key); struct bpf_prog_stats *__stats; \ u64 __start = sched_clock(); \ __ret = dfunc(ctx, (prog)->insnsi, (prog)->bpf_func); \ - __stats = this_cpu_ptr(prog->aux->stats); \ + __stats = this_cpu_ptr(prog->stats); \ u64_stats_update_begin(&__stats->syncp); \ __stats->cnt++; \ __stats->nsecs += sched_clock() - __start; \ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index dff79f5c0e26..8a5723146ed4 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -116,8 +116,8 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags) if (!prog) return NULL;
- prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags); - if (!prog->aux->stats) { + prog->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags); + if (!prog->stats) { kfree(prog->aux); vfree(prog); return NULL; @@ -126,7 +126,7 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags) for_each_possible_cpu(cpu) { struct bpf_prog_stats *pstats;
- pstats = per_cpu_ptr(prog->aux->stats, cpu); + pstats = per_cpu_ptr(prog->stats, cpu); u64_stats_init(&pstats->syncp); } return prog; @@ -259,10 +259,10 @@ void __bpf_prog_free(struct bpf_prog *fp) if (fp->aux) { mutex_destroy(&fp->aux->used_maps_mutex); mutex_destroy(&fp->aux->dst_mutex); - free_percpu(fp->aux->stats); kfree(fp->aux->poke_tab); kfree(fp->aux); } + free_percpu(fp->stats); vfree(fp); }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 2d3b7a75155b..a265a9ef96d1 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1803,7 +1803,7 @@ static void bpf_prog_get_stats(const struct bpf_prog *prog, unsigned int start; u64 tnsecs, tcnt;
- st = per_cpu_ptr(prog->aux->stats, cpu); + st = per_cpu_ptr(prog->stats, cpu); do { start = u64_stats_fetch_begin_irq(&st->syncp); tnsecs = st->nsecs; diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index cc6ba35a1d14..9bd1731eea0d 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -521,7 +521,7 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start) * Hence check that 'start' is not zero. */ start) { - stats = this_cpu_ptr(prog->aux->stats); + stats = this_cpu_ptr(prog->stats); u64_stats_update_begin(&stats->syncp); stats->cnt++; stats->nsecs += sched_clock() - start; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 492069c92731..2d6dd6c3a59a 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11344,7 +11344,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) /* BPF_PROG_RUN doesn't call subprogs directly, * hence main prog stats include the runtime of subprogs. * subprogs don't have IDs and not reachable via prog_get_next_id - * func[i]->aux->stats will never be accessed and stays NULL + * func[i]->stats will never be accessed and stays NULL */ func[i] = bpf_prog_alloc_no_stats(bpf_prog_size(len), GFP_USER); if (!func[i])
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 031d6e02ddbb8dea747c1abb697d556901f07dd4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In older non-RT kernels migrate_disable() was the same as preempt_disable(). Since commit 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT") migrate_disable() is real and doesn't prevent sleeping.
Running sleepable programs with migration disabled allows to add support for program stats and per-cpu maps later.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: KP Singh kpsingh@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-3-alexei.starovoitov@gmail.... (cherry picked from commit 031d6e02ddbb8dea747c1abb697d556901f07dd4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/trampoline.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index 9bd1731eea0d..e9cc7ac3856f 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -534,11 +534,13 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start) void notrace __bpf_prog_enter_sleepable(void) { rcu_read_lock_trace(); + migrate_disable(); might_fault(); }
void notrace __bpf_prog_exit_sleepable(void) { + migrate_enable(); rcu_read_unlock_trace(); }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit f2dd3b39467411c53703125a111f45b3672c1771 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since sleepable programs don't migrate from the cpu the excution stats can be computed for them as well. Reuse the same infrastructure for both sleepable and non-sleepable programs.
run_cnt -> the number of times the program was executed. run_time_ns -> the program execution time in nanoseconds including the off-cpu time when the program was sleeping.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: KP Singh kpsingh@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-4-alexei.starovoitov@gmail.... (cherry picked from commit f2dd3b39467411c53703125a111f45b3672c1771) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 31 +++++++++++---------------- include/linux/bpf.h | 4 ++-- kernel/bpf/trampoline.c | 42 ++++++++++++++++++++++++------------- 3 files changed, 42 insertions(+), 35 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 4c2324295ae2..a96855ffa6e5 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1735,15 +1735,12 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, u8 *prog = *pprog; int cnt = 0;
- if (p->aux->sleepable) { - if (emit_call(&prog, __bpf_prog_enter_sleepable, prog)) + if (emit_call(&prog, + p->aux->sleepable ? __bpf_prog_enter_sleepable : + __bpf_prog_enter, prog)) return -EINVAL; - } else { - if (emit_call(&prog, __bpf_prog_enter, prog)) - return -EINVAL; - /* remember prog start time returned by __bpf_prog_enter */ - emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0); - } + /* remember prog start time returned by __bpf_prog_enter */ + emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
/* arg1: lea rdi, [rbp - stack_size] */ EMIT4(0x48, 0x8D, 0x7D, -stack_size); @@ -1767,18 +1764,14 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, if (save_ret) emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
- if (p->aux->sleepable) { - if (emit_call(&prog, __bpf_prog_exit_sleepable, prog)) + /* arg1: mov rdi, progs[i] */ + emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p); + /* arg2: mov rsi, rbx <- start time in nsec */ + emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6); + if (emit_call(&prog, + p->aux->sleepable ? __bpf_prog_exit_sleepable : + __bpf_prog_exit, prog)) return -EINVAL; - } else { - /* arg1: mov rdi, progs[i] */ - emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, - (u32) (long) p); - /* arg2: mov rsi, rbx <- start time in nsec */ - emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6); - if (emit_call(&prog, __bpf_prog_exit, prog)) - return -EINVAL; - }
*pprog = prog; return 0; diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d70eca074c2e..d11892380be3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -633,8 +633,8 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *i /* these two functions are called from generated trampoline */ u64 notrace __bpf_prog_enter(void); void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start); -void notrace __bpf_prog_enter_sleepable(void); -void notrace __bpf_prog_exit_sleepable(void); +u64 notrace __bpf_prog_enter_sleepable(void); +void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start); void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr); void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index e9cc7ac3856f..3c171344f629 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -490,56 +490,70 @@ void bpf_trampoline_put(struct bpf_trampoline *tr) mutex_unlock(&trampoline_mutex); }
+#define NO_START_TIME 0 +static u64 notrace bpf_prog_start_time(void) +{ + u64 start = NO_START_TIME; + + if (static_branch_unlikely(&bpf_stats_enabled_key)) + start = sched_clock(); + return start; +} + /* The logic is similar to BPF_PROG_RUN, but with an explicit * rcu_read_lock() and migrate_disable() which are required * for the trampoline. The macro is split into - * call _bpf_prog_enter + * call __bpf_prog_enter * call prog->bpf_func * call __bpf_prog_exit */ u64 notrace __bpf_prog_enter(void) __acquires(RCU) { - u64 start = 0; - rcu_read_lock(); migrate_disable(); - if (static_branch_unlikely(&bpf_stats_enabled_key)) - start = sched_clock(); - return start; + return bpf_prog_start_time(); }
-void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start) - __releases(RCU) +static void notrace update_prog_stats(struct bpf_prog *prog, + u64 start) { struct bpf_prog_stats *stats;
if (static_branch_unlikely(&bpf_stats_enabled_key) && - /* static_key could be enabled in __bpf_prog_enter - * and disabled in __bpf_prog_exit. + /* static_key could be enabled in __bpf_prog_enter* + * and disabled in __bpf_prog_exit*. * And vice versa. - * Hence check that 'start' is not zero. + * Hence check that 'start' is valid. */ - start) { + start > NO_START_TIME) { stats = this_cpu_ptr(prog->stats); u64_stats_update_begin(&stats->syncp); stats->cnt++; stats->nsecs += sched_clock() - start; u64_stats_update_end(&stats->syncp); } +} + +void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start) + __releases(RCU) +{ + update_prog_stats(prog, start); migrate_enable(); rcu_read_unlock(); }
-void notrace __bpf_prog_enter_sleepable(void) +u64 notrace __bpf_prog_enter_sleepable(void) { rcu_read_lock_trace(); migrate_disable(); might_fault(); + return bpf_prog_start_time(); }
-void notrace __bpf_prog_exit_sleepable(void) +void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start) { + update_prog_stats(prog, start); migrate_enable(); rcu_read_unlock_trace(); }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit ca06f55b90020cd97f4cc6d52db95436162e7dcf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since both sleepable and non-sleepable programs execute under migrate_disable add recursion prevention mechanism to both types of programs when they're executed via bpf trampoline.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-5-alexei.starovoitov@gmail.... (cherry picked from commit ca06f55b90020cd97f4cc6d52db95436162e7dcf) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 15 ++++++++++++ include/linux/bpf.h | 6 ++--- include/linux/filter.h | 1 + kernel/bpf/core.c | 8 +++++++ kernel/bpf/trampoline.c | 23 +++++++++++++++---- .../selftests/bpf/prog_tests/fexit_stress.c | 4 ++-- .../bpf/prog_tests/trampoline_count.c | 4 ++-- 7 files changed, 50 insertions(+), 11 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index a96855ffa6e5..4c4c10e7455a 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1733,8 +1733,11 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, struct bpf_prog *p, int stack_size, bool save_ret) { u8 *prog = *pprog; + u8 *jmp_insn; int cnt = 0;
+ /* arg1: mov rdi, progs[i] */ + emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p); if (emit_call(&prog, p->aux->sleepable ? __bpf_prog_enter_sleepable : __bpf_prog_enter, prog)) @@ -1742,6 +1745,14 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, /* remember prog start time returned by __bpf_prog_enter */ emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
+ /* if (__bpf_prog_enter*(prog) == 0) + * goto skip_exec_of_prog; + */ + EMIT3(0x48, 0x85, 0xC0); /* test rax,rax */ + /* emit 2 nops that will be replaced with JE insn */ + jmp_insn = prog; + emit_nops(&prog, 2); + /* arg1: lea rdi, [rbp - stack_size] */ EMIT4(0x48, 0x8D, 0x7D, -stack_size); /* arg2: progs[i]->insnsi for interpreter */ @@ -1764,6 +1775,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, if (save_ret) emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
+ /* replace 2 nops with JE insn, since jmp target is known */ + jmp_insn[0] = X86_JE; + jmp_insn[1] = prog - jmp_insn - 2; + /* arg1: mov rdi, progs[i] */ emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p); /* arg2: mov rsi, rbx <- start time in nsec */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d11892380be3..a6f0daa55096 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -598,7 +598,7 @@ struct btf_func_model { /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50 * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2 */ -#define BPF_MAX_TRAMP_PROGS 40 +#define BPF_MAX_TRAMP_PROGS 38
struct bpf_tramp_progs { struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS]; @@ -631,9 +631,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *i struct bpf_tramp_progs *tprogs, void *orig_call); /* these two functions are called from generated trampoline */ -u64 notrace __bpf_prog_enter(void); +u64 notrace __bpf_prog_enter(struct bpf_prog *prog); void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start); -u64 notrace __bpf_prog_enter_sleepable(void); +u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog); void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start); void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr); void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr); diff --git a/include/linux/filter.h b/include/linux/filter.h index 884759cb2507..52b7b111f939 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -580,6 +580,7 @@ struct bpf_prog { u32 jited_len; /* Size of jited insns in bytes */ u8 tag[BPF_TAG_SIZE]; struct bpf_prog_stats __percpu *stats; + int __percpu *active; unsigned int (*bpf_func)(const void *ctx, const struct bpf_insn *insn); struct bpf_prog_aux *aux; /* Auxiliary fields */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 8a5723146ed4..0b91351c7e41 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -93,6 +93,12 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag vfree(fp); return NULL; } + fp->active = alloc_percpu_gfp(int, GFP_KERNEL_ACCOUNT | gfp_extra_flags); + if (!fp->active) { + vfree(fp); + kfree(aux); + return NULL; + }
fp->pages = size / PAGE_SIZE; fp->aux = aux; @@ -118,6 +124,7 @@ struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags)
prog->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags); if (!prog->stats) { + free_percpu(prog->active); kfree(prog->aux); vfree(prog); return NULL; @@ -263,6 +270,7 @@ void __bpf_prog_free(struct bpf_prog *fp) kfree(fp->aux); } free_percpu(fp->stats); + free_percpu(fp->active); vfree(fp); }
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index 3c171344f629..5bc5aef59261 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -490,13 +490,16 @@ void bpf_trampoline_put(struct bpf_trampoline *tr) mutex_unlock(&trampoline_mutex); }
-#define NO_START_TIME 0 +#define NO_START_TIME 1 static u64 notrace bpf_prog_start_time(void) { u64 start = NO_START_TIME;
- if (static_branch_unlikely(&bpf_stats_enabled_key)) + if (static_branch_unlikely(&bpf_stats_enabled_key)) { start = sched_clock(); + if (unlikely(!start)) + start = NO_START_TIME; + } return start; }
@@ -506,12 +509,20 @@ static u64 notrace bpf_prog_start_time(void) * call __bpf_prog_enter * call prog->bpf_func * call __bpf_prog_exit + * + * __bpf_prog_enter returns: + * 0 - skip execution of the bpf prog + * 1 - execute bpf prog + * [2..MAX_U64] - excute bpf prog and record execution time. + * This is start time. */ -u64 notrace __bpf_prog_enter(void) +u64 notrace __bpf_prog_enter(struct bpf_prog *prog) __acquires(RCU) { rcu_read_lock(); migrate_disable(); + if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) + return 0; return bpf_prog_start_time(); }
@@ -539,21 +550,25 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start) __releases(RCU) { update_prog_stats(prog, start); + __this_cpu_dec(*(prog->active)); migrate_enable(); rcu_read_unlock(); }
-u64 notrace __bpf_prog_enter_sleepable(void) +u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog) { rcu_read_lock_trace(); migrate_disable(); might_fault(); + if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) + return 0; return bpf_prog_start_time(); }
void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start) { update_prog_stats(prog, start); + __this_cpu_dec(*(prog->active)); migrate_enable(); rcu_read_unlock_trace(); } diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c index 3b9dbf7433f0..7c9b62e971f1 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c @@ -2,8 +2,8 @@ /* Copyright (c) 2019 Facebook */ #include <test_progs.h>
-/* x86-64 fits 55 JITed and 43 interpreted progs into half page */ -#define CNT 40 +/* that's kernel internal BPF_MAX_TRAMP_PROGS define */ +#define CNT 38
void test_fexit_stress(void) { diff --git a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c index 781c8d11604b..f3022d934e2d 100644 --- a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c +++ b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c @@ -4,7 +4,7 @@ #include <sys/prctl.h> #include <test_progs.h>
-#define MAX_TRAMP_PROGS 40 +#define MAX_TRAMP_PROGS 38
struct inst { struct bpf_object *obj; @@ -52,7 +52,7 @@ void test_trampoline_count(void) struct bpf_link *link; char comm[16] = {};
- /* attach 'allowed' 40 trampoline programs */ + /* attach 'allowed' trampoline programs */ for (i = 0; i < MAX_TRAMP_PROGS; i++) { obj = bpf_object__open_file(object, NULL); if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj))) {
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 406c557edc5bb903db9f6cdd543cfc282c663ad8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add recursive non-sleepable fentry program as a test. All attach points where sleepable progs can execute are non recursive so far. The recursion protection mechanism for sleepable cannot be activated yet.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-6-alexei.starovoitov@gmail.... (cherry picked from commit 406c557edc5bb903db9f6cdd543cfc282c663ad8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/recursion.c | 33 +++++++++++++ tools/testing/selftests/bpf/progs/recursion.c | 46 +++++++++++++++++++ 2 files changed, 79 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/recursion.c create mode 100644 tools/testing/selftests/bpf/progs/recursion.c
diff --git a/tools/testing/selftests/bpf/prog_tests/recursion.c b/tools/testing/selftests/bpf/prog_tests/recursion.c new file mode 100644 index 000000000000..863757461e3f --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/recursion.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include <test_progs.h> +#include "recursion.skel.h" + +void test_recursion(void) +{ + struct recursion *skel; + int key = 0; + int err; + + skel = recursion__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) + return; + + err = recursion__attach(skel); + if (!ASSERT_OK(err, "skel_attach")) + goto out; + + ASSERT_EQ(skel->bss->pass1, 0, "pass1 == 0"); + bpf_map_lookup_elem(bpf_map__fd(skel->maps.hash1), &key, 0); + ASSERT_EQ(skel->bss->pass1, 1, "pass1 == 1"); + bpf_map_lookup_elem(bpf_map__fd(skel->maps.hash1), &key, 0); + ASSERT_EQ(skel->bss->pass1, 2, "pass1 == 2"); + + ASSERT_EQ(skel->bss->pass2, 0, "pass2 == 0"); + bpf_map_lookup_elem(bpf_map__fd(skel->maps.hash2), &key, 0); + ASSERT_EQ(skel->bss->pass2, 1, "pass2 == 1"); + bpf_map_lookup_elem(bpf_map__fd(skel->maps.hash2), &key, 0); + ASSERT_EQ(skel->bss->pass2, 2, "pass2 == 2"); +out: + recursion__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/recursion.c b/tools/testing/selftests/bpf/progs/recursion.c new file mode 100644 index 000000000000..49f679375b9d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/recursion.c @@ -0,0 +1,46 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +char _license[] SEC("license") = "GPL"; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 1); + __type(key, int); + __type(value, long); +} hash1 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 1); + __type(key, int); + __type(value, long); +} hash2 SEC(".maps"); + +int pass1 = 0; +int pass2 = 0; + +SEC("fentry/__htab_map_lookup_elem") +int BPF_PROG(on_lookup, struct bpf_map *map) +{ + int key = 0; + + if (map == (void *)&hash1) { + pass1++; + return 0; + } + if (map == (void *)&hash2) { + pass2++; + /* htab_map_gen_lookup() will inline below call + * into direct call to __htab_map_lookup_elem() + */ + bpf_map_lookup_elem(&hash2, &key); + return 0; + } + + return 0; +}
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 9ed9e9ba2337205311398a312796c213737bac35 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add per-program counter for number of times recursion prevention mechanism was triggered and expose it via show_fdinfo and bpf_prog_info. Teach bpftool to print it.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.... (cherry picked from commit 9ed9e9ba2337205311398a312796c213737bac35) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/filter.h | 1 + include/uapi/linux/bpf.h | 1 + kernel/bpf/syscall.c | 14 ++++++++++---- kernel/bpf/trampoline.c | 18 ++++++++++++++++-- tools/bpf/bpftool/prog.c | 4 ++++ tools/include/uapi/linux/bpf.h | 1 + 6 files changed, 33 insertions(+), 6 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h index 52b7b111f939..f77cb6c78a27 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -558,6 +558,7 @@ struct bpf_binary_header { struct bpf_prog_stats { u64 cnt; u64 nsecs; + u64 misses; struct u64_stats_sync syncp; } __aligned(2 * sizeof(u64));
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 3eddc8add9b8..bf0717a82bff 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4447,6 +4447,7 @@ struct bpf_prog_info { __aligned_u64 prog_tags; __u64 run_time_ns; __u64 run_cnt; + __u64 recursion_misses; } __attribute__((aligned(8)));
struct bpf_map_info { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index a265a9ef96d1..969d0598434c 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1795,25 +1795,28 @@ static int bpf_prog_release(struct inode *inode, struct file *filp) static void bpf_prog_get_stats(const struct bpf_prog *prog, struct bpf_prog_stats *stats) { - u64 nsecs = 0, cnt = 0; + u64 nsecs = 0, cnt = 0, misses = 0; int cpu;
for_each_possible_cpu(cpu) { const struct bpf_prog_stats *st; unsigned int start; - u64 tnsecs, tcnt; + u64 tnsecs, tcnt, tmisses;
st = per_cpu_ptr(prog->stats, cpu); do { start = u64_stats_fetch_begin_irq(&st->syncp); tnsecs = st->nsecs; tcnt = st->cnt; + tmisses = st->misses; } while (u64_stats_fetch_retry_irq(&st->syncp, start)); nsecs += tnsecs; cnt += tcnt; + misses += tmisses; } stats->nsecs = nsecs; stats->cnt = cnt; + stats->misses = misses; }
#ifdef CONFIG_PROC_FS @@ -1832,14 +1835,16 @@ static void bpf_prog_show_fdinfo(struct seq_file *m, struct file *filp) "memlock:\t%llu\n" "prog_id:\t%u\n" "run_time_ns:\t%llu\n" - "run_cnt:\t%llu\n", + "run_cnt:\t%llu\n" + "recursion_misses:\t%llu\n", prog->type, prog->jited, prog_tag, prog->pages * 1ULL << PAGE_SHIFT, prog->aux->id, stats.nsecs, - stats.cnt); + stats.cnt, + stats.misses); } #endif
@@ -3522,6 +3527,7 @@ static int bpf_prog_get_info_by_fd(struct file *file, bpf_prog_get_stats(prog, &stats); info.run_time_ns = stats.nsecs; info.run_cnt = stats.cnt; + info.recursion_misses = stats.misses;
if (!bpf_capable()) { info.jited_prog_len = 0; diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index 5bc5aef59261..fdb14e2626a4 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -503,6 +503,16 @@ static u64 notrace bpf_prog_start_time(void) return start; }
+static void notrace inc_misses_counter(struct bpf_prog *prog) +{ + struct bpf_prog_stats *stats; + + stats = this_cpu_ptr(prog->stats); + u64_stats_update_begin(&stats->syncp); + stats->misses++; + u64_stats_update_end(&stats->syncp); +} + /* The logic is similar to BPF_PROG_RUN, but with an explicit * rcu_read_lock() and migrate_disable() which are required * for the trampoline. The macro is split into @@ -521,8 +531,10 @@ u64 notrace __bpf_prog_enter(struct bpf_prog *prog) { rcu_read_lock(); migrate_disable(); - if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) + if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) { + inc_misses_counter(prog); return 0; + } return bpf_prog_start_time(); }
@@ -560,8 +572,10 @@ u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog) rcu_read_lock_trace(); migrate_disable(); might_fault(); - if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) + if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) { + inc_misses_counter(prog); return 0; + } return bpf_prog_start_time(); }
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 4e1d8e57d951..88ee5ebabb47 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -371,6 +371,8 @@ static void print_prog_header_json(struct bpf_prog_info *info) jsonw_uint_field(json_wtr, "run_time_ns", info->run_time_ns); jsonw_uint_field(json_wtr, "run_cnt", info->run_cnt); } + if (info->recursion_misses) + jsonw_uint_field(json_wtr, "recursion_misses", info->recursion_misses); }
static void print_prog_json(struct bpf_prog_info *info, int fd) @@ -449,6 +451,8 @@ static void print_prog_header_plain(struct bpf_prog_info *info) if (info->run_time_ns) printf(" run_time_ns %lld run_cnt %lld", info->run_time_ns, info->run_cnt); + if (info->recursion_misses) + printf(" recursion_misses %lld", info->recursion_misses); printf("\n"); }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index c9ab32ae5d7e..04cbc5f7dfb4 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4446,6 +4446,7 @@ struct bpf_prog_info { __aligned_u64 prog_tags; __u64 run_time_ns; __u64 run_cnt; + __u64 recursion_misses; } __attribute__((aligned(8)));
struct bpf_map_info {
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit dcf33b6f4de173818540e3a2a0668c80a1ebdc68 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since recursion_misses counter is available in bpf_prog_info improve the selftest to make sure it's counting correctly.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210210033634.62081-8-alexei.starovoitov@gmail.... (cherry picked from commit dcf33b6f4de173818540e3a2a0668c80a1ebdc68) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/recursion.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/recursion.c b/tools/testing/selftests/bpf/prog_tests/recursion.c index 863757461e3f..0e378d63fe18 100644 --- a/tools/testing/selftests/bpf/prog_tests/recursion.c +++ b/tools/testing/selftests/bpf/prog_tests/recursion.c @@ -5,6 +5,8 @@
void test_recursion(void) { + struct bpf_prog_info prog_info = {}; + __u32 prog_info_len = sizeof(prog_info); struct recursion *skel; int key = 0; int err; @@ -28,6 +30,12 @@ void test_recursion(void) ASSERT_EQ(skel->bss->pass2, 1, "pass2 == 1"); bpf_map_lookup_elem(bpf_map__fd(skel->maps.hash2), &key, 0); ASSERT_EQ(skel->bss->pass2, 2, "pass2 == 2"); + + err = bpf_obj_get_info_by_fd(bpf_program__fd(skel->progs.on_lookup), + &prog_info, &prog_info_len); + if (!ASSERT_OK(err, "get_prog_info")) + goto out; + ASSERT_EQ(prog_info.recursion_misses, 2, "recursion_misses"); out: recursion__destroy(skel); }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 638e4b825d523bed7a55e776c153049fb7716466 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since sleepable programs are now executing under migrate_disable the per-cpu maps are safe to use. The map-in-map were ok to use in sleepable from the time sleepable progs were introduced.
Note that non-preallocated maps are still not safe, since there is no rcu_read_lock yet in sleepable programs and dynamically allocated map elements are relying on rcu protection. The sleepable programs have rcu_read_lock_trace instead. That limitation will be addresses in the future.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: KP Singh kpsingh@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-9-alexei.starovoitov@gmail.... (cherry picked from commit 638e4b825d523bed7a55e776c153049fb7716466) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/hashtab.c | 4 ++-- kernel/bpf/verifier.c | 7 ++++++- 2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index a61c40dd5976..53e641ff3b9e 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -1164,7 +1164,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key, /* unknown flags */ return -EINVAL;
- WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
@@ -1218,7 +1218,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, /* unknown flags */ return -EINVAL;
- WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 2d6dd6c3a59a..606790c02cf4 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10462,9 +10462,14 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env, case BPF_MAP_TYPE_HASH: case BPF_MAP_TYPE_LRU_HASH: case BPF_MAP_TYPE_ARRAY: + case BPF_MAP_TYPE_PERCPU_HASH: + case BPF_MAP_TYPE_PERCPU_ARRAY: + case BPF_MAP_TYPE_LRU_PERCPU_HASH: + case BPF_MAP_TYPE_ARRAY_OF_MAPS: + case BPF_MAP_TYPE_HASH_OF_MAPS: if (!is_preallocated_map(map)) { verbose(env, - "Sleepable programs can only use preallocated hash maps\n"); + "Sleepable programs can only use preallocated maps\n"); return -EINVAL; } break;
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 750e5d7649b1415e27979f91f917fa5e103714d9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a basic test for map-in-map and per-cpu maps in sleepable programs.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: KP Singh kpsingh@kernel.org Link: https://lore.kernel.org/bpf/20210210033634.62081-10-alexei.starovoitov@gmail... (cherry picked from commit 750e5d7649b1415e27979f91f917fa5e103714d9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/progs/lsm.c | 69 +++++++++++++++++++++++++ 1 file changed, 69 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/lsm.c b/tools/testing/selftests/bpf/progs/lsm.c index ff4d343b94b5..33694ef8acfa 100644 --- a/tools/testing/selftests/bpf/progs/lsm.c +++ b/tools/testing/selftests/bpf/progs/lsm.c @@ -30,6 +30,53 @@ struct { __type(value, __u64); } lru_hash SEC(".maps");
+struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} percpu_array SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_HASH); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} percpu_hash SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_LRU_PERCPU_HASH); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} lru_percpu_hash SEC(".maps"); + +struct inner_map { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, int); + __type(value, __u64); +} inner_map SEC(".maps"); + +struct outer_arr { + __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS); + __uint(max_entries, 1); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __array(values, struct inner_map); +} outer_arr SEC(".maps") = { + .values = { [0] = &inner_map }, +}; + +struct outer_hash { + __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS); + __uint(max_entries, 1); + __uint(key_size, sizeof(int)); + __array(values, struct inner_map); +} outer_hash SEC(".maps") = { + .values = { [0] = &inner_map }, +}; + char _license[] SEC("license") = "GPL";
int monitored_pid = 0; @@ -61,6 +108,7 @@ SEC("lsm.s/bprm_committed_creds") int BPF_PROG(test_void_hook, struct linux_binprm *bprm) { __u32 pid = bpf_get_current_pid_tgid() >> 32; + struct inner_map *inner_map; char args[64]; __u32 key = 0; __u64 *value; @@ -80,6 +128,27 @@ int BPF_PROG(test_void_hook, struct linux_binprm *bprm) value = bpf_map_lookup_elem(&lru_hash, &key); if (value) *value = 0; + value = bpf_map_lookup_elem(&percpu_array, &key); + if (value) + *value = 0; + value = bpf_map_lookup_elem(&percpu_hash, &key); + if (value) + *value = 0; + value = bpf_map_lookup_elem(&lru_percpu_hash, &key); + if (value) + *value = 0; + inner_map = bpf_map_lookup_elem(&outer_arr, &key); + if (inner_map) { + value = bpf_map_lookup_elem(inner_map, &key); + if (value) + *value = 0; + } + inner_map = bpf_map_lookup_elem(&outer_hash, &key); + if (inner_map) { + value = bpf_map_lookup_elem(inner_map, &key); + if (value) + *value = 0; + }
return 0; }
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.12-rc1 commit 07881ccbf40cc7893869f3f170301889ddca54ac category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one" socket cookies are not guaranteed to be non-decreasing. The bpf_get_socket_cookie helper descriptions are currently specifying that cookies are non-decreasing but we don't want users to rely on that.
Reported-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: KP Singh kpsingh@kernel.org Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org (cherry picked from commit 07881ccbf40cc7893869f3f170301889ddca54ac) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 8 ++++---- tools/include/uapi/linux/bpf.h | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index bf0717a82bff..601f24eff2b6 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1657,22 +1657,22 @@ union bpf_attr { * networking traffic statistics as it provides a global socket * identifier that can be assumed unique. * Return - * A 8-byte long non-decreasing number on success, or 0 if the - * socket field is missing inside *skb*. + * A 8-byte long unique number on success, or 0 if the socket + * field is missing inside *skb*. * * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx) * Description * Equivalent to bpf_get_socket_cookie() helper that accepts * *skb*, but gets socket from **struct bpf_sock_addr** context. * Return - * A 8-byte long non-decreasing number. + * A 8-byte long unique number. * * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx) * Description * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts * *skb*, but gets socket from **struct bpf_sock_ops** context. * Return - * A 8-byte long non-decreasing number. + * A 8-byte long unique number. * * u32 bpf_get_socket_uid(struct sk_buff *skb) * Return diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 04cbc5f7dfb4..fd065ed61ffe 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1657,22 +1657,22 @@ union bpf_attr { * networking traffic statistics as it provides a global socket * identifier that can be assumed unique. * Return - * A 8-byte long non-decreasing number on success, or 0 if the - * socket field is missing inside *skb*. + * A 8-byte long unique number on success, or 0 if the socket + * field is missing inside *skb*. * * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx) * Description * Equivalent to bpf_get_socket_cookie() helper that accepts * *skb*, but gets socket from **struct bpf_sock_addr** context. * Return - * A 8-byte long non-decreasing number. + * A 8-byte long unique number. * * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx) * Description * Equivalent to **bpf_get_socket_cookie**\ () helper that accepts * *skb*, but gets socket from **struct bpf_sock_ops** context. * Return - * A 8-byte long non-decreasing number. + * A 8-byte long unique number. * * u32 bpf_get_socket_uid(struct sk_buff *skb) * Return
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc1 commit 1336c662474edec3966c96c8de026f794d16b804 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
bpf_prog_realloc copies contents of struct bpf_prog. The pointers have to be cleared before freeing old struct.
Reported-by: Ilya Leoshkevich iii@linux.ibm.com Fixes: 700d4796ef59 ("bpf: Optimize program stats") Fixes: ca06f55b9002 ("bpf: Add per-program recursion prevention mechanism") Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit 1336c662474edec3966c96c8de026f794d16b804) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 0b91351c7e41..557b2a866c6f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -255,6 +255,8 @@ struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size, * reallocated structure. */ fp_old->aux = NULL; + fp_old->stats = NULL; + fp_old->active = NULL; __bpf_prog_free(fp_old); }
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-5.12-rc1 commit feb4adfad575c1e27cbfaa3462f376c13da36942 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Using "reg" for an array of bpf_reg_state and "reg[i + 1]" for an individual bpf_reg_state is error-prone and verbose. Use "regs" for the former and "reg" for the latter as other code nearby does.
Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210212205642.620788-2-me@ubique.spb.ru (cherry picked from commit feb4adfad575c1e27cbfaa3462f376c13da36942) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/btf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index a3a32e83a623..d90a5e308f44 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5305,7 +5305,7 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr * Only PTR_TO_CTX and SCALAR_VALUE states are recognized. */ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, - struct bpf_reg_state *reg) + struct bpf_reg_state *regs) { struct bpf_verifier_log *log = &env->log; struct bpf_prog *prog = env->prog; @@ -5351,17 +5351,19 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, * verifier sees. */ for (i = 0; i < nargs; i++) { + struct bpf_reg_state *reg = ®s[i + 1]; + t = btf_type_by_id(btf, args[i].type); while (btf_type_is_modifier(t)) t = btf_type_by_id(btf, t->type); if (btf_type_is_int(t) || btf_type_is_enum(t)) { - if (reg[i + 1].type == SCALAR_VALUE) + if (reg->type == SCALAR_VALUE) continue; bpf_log(log, "R%d is not a scalar\n", i + 1); goto out; } if (btf_type_is_ptr(t)) { - if (reg[i + 1].type == SCALAR_VALUE) { + if (reg->type == SCALAR_VALUE) { bpf_log(log, "R%d is not a pointer\n", i + 1); goto out; } @@ -5369,13 +5371,13 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, * is passing PTR_TO_CTX. */ if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) { - if (reg[i + 1].type != PTR_TO_CTX) { + if (reg->type != PTR_TO_CTX) { bpf_log(log, "arg#%d expected pointer to ctx, but got %s\n", i, btf_kind_str[BTF_INFO_KIND(t->info)]); goto out; } - if (check_ptr_off_reg(env, ®[i + 1], i + 1)) + if (check_ptr_off_reg(env, reg, i + 1)) goto out; continue; } @@ -5402,7 +5404,7 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, * (either PTR_TO_CTX or SCALAR_VALUE). */ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, - struct bpf_reg_state *reg) + struct bpf_reg_state *regs) { struct bpf_verifier_log *log = &env->log; struct bpf_prog *prog = env->prog; @@ -5473,16 +5475,18 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, * Only PTR_TO_CTX and SCALAR are supported atm. */ for (i = 0; i < nargs; i++) { + struct bpf_reg_state *reg = ®s[i + 1]; + t = btf_type_by_id(btf, args[i].type); while (btf_type_is_modifier(t)) t = btf_type_by_id(btf, t->type); if (btf_type_is_int(t) || btf_type_is_enum(t)) { - reg[i + 1].type = SCALAR_VALUE; + reg->type = SCALAR_VALUE; continue; } if (btf_type_is_ptr(t) && btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { - reg[i + 1].type = PTR_TO_CTX; + reg->type = PTR_TO_CTX; continue; } bpf_log(log, "Arg#%d type %s in %s() is not supported yet.\n",
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-v5.12-rc1 commit 4ddb74165ae580b6dcbb5ab1919d994fc8d03c3f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Extract conversion from a register's nullable type to a type with a value. The helper will be used in mark_ptr_not_null_reg().
Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210212205642.620788-3-me@ubique.spb.ru
Conflicts: kernel/bpf/verifier.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 49 +++++++++++++++++++++++++------------------ 1 file changed, 29 insertions(+), 20 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 606790c02cf4..a34652b0d726 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1081,6 +1081,28 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env, __mark_reg_known_zero(regs + regno); }
+static void mark_ptr_not_null_reg(struct bpf_reg_state *reg) +{ + if (base_type(reg->type) == PTR_TO_MAP_VALUE) { + const struct bpf_map *map = reg->map_ptr; + + if (map->inner_map_meta) { + reg->type = CONST_PTR_TO_MAP; + reg->map_ptr = map->inner_map_meta; + } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { + reg->type = PTR_TO_XDP_SOCK; + } else if (map->map_type == BPF_MAP_TYPE_SOCKMAP || + map->map_type == BPF_MAP_TYPE_SOCKHASH) { + reg->type = PTR_TO_SOCKET; + } else { + reg->type = PTR_TO_MAP_VALUE; + } + return; + } + + reg->type &= ~PTR_MAYBE_NULL; +} + static bool reg_is_pkt_pointer(const struct bpf_reg_state *reg) { return type_is_pkt_pointer(reg->type); @@ -7849,32 +7871,19 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state, } if (is_null) { reg->type = SCALAR_VALUE; - } else if (base_type(reg->type) == PTR_TO_MAP_VALUE) { - const struct bpf_map *map = reg->map_ptr; - - if (map->inner_map_meta) { - reg->type = CONST_PTR_TO_MAP; - reg->map_ptr = map->inner_map_meta; - } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { - reg->type = PTR_TO_XDP_SOCK; - } else if (map->map_type == BPF_MAP_TYPE_SOCKMAP || - map->map_type == BPF_MAP_TYPE_SOCKHASH) { - reg->type = PTR_TO_SOCKET; - } else { - reg->type = PTR_TO_MAP_VALUE; - } - } else { - reg->type &= ~PTR_MAYBE_NULL; - } - - if (is_null) { /* We don't need id and ref_obj_id from this point * onwards anymore, thus we should better reset it, * so that state pruning has chances to take effect. */ reg->id = 0; reg->ref_obj_id = 0; - } else if (!reg_may_point_to_spin_lock(reg)) { + + return; + } + + mark_ptr_not_null_reg(reg); + + if (!reg_may_point_to_spin_lock(reg)) { /* For not-NULL ptr, reg->ref_obj_id will be reset * in release_reg_references(). *
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-v5.12-rc1 commit e5069b9c23b3857db986c58801bebe450cff3392 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Add an ability to pass a pointer to a type with known size in arguments of a global function. Such pointers may be used to overcome the limit on the maximum number of arguments, avoid expensive and tricky workarounds and to have multiple output arguments.
A referenced type may contain pointers but indirect access through them isn't supported.
The implementation consists of two parts. If a global function has an argument that is a pointer to a type with known size then:
1) In btf_check_func_arg_match(): check that the corresponding register points to NULL or to a valid memory region that is large enough to contain the expected argument's type.
2) In btf_prepare_func_args(): set the corresponding register type to PTR_TO_MEM_OR_NULL and its size to the size of the expected type.
Only global functions are supported because allowance of pointers for static functions might break validation. Consider the following scenario. A static function has a pointer argument. A caller passes pointer to its stack memory. Because the callee can change referenced memory verifier cannot longer assume any particular slot type of the caller's stack memory hence the slot type is changed to SLOT_MISC. If there is an operation that relies on slot type other than SLOT_MISC then verifier won't be able to infer safety of the operation.
When verifier sees a static function that has a pointer argument different from PTR_TO_CTX then it skips arguments check and continues with "inline" validation with more information available. The operation that relies on the particular slot type now succeeds.
Because global functions were not allowed to have pointer arguments different from PTR_TO_CTX it's not possible to break existing and valid code.
Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210212205642.620788-4-me@ubique.spb.ru
Conflicts: include/linux/bpf_verifier.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf_verifier.h | 2 ++ kernel/bpf/btf.c | 55 +++++++++++++++++++++++++++++------- kernel/bpf/verifier.c | 30 ++++++++++++++++++++ 3 files changed, 77 insertions(+), 10 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 94c65a8f056e..bea28c220829 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -492,6 +492,8 @@ bpf_prog_offload_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt);
int check_ptr_off_reg(struct bpf_verifier_env *env, const struct bpf_reg_state *reg, int regno); +int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, + u32 regno, u32 mem_size);
/* this lives here instead of in bpf.h because it needs to dereference tgt_prog */ static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index d90a5e308f44..172178f617c6 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5311,9 +5311,10 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, struct bpf_prog *prog = env->prog; struct btf *btf = prog->aux->btf; const struct btf_param *args; - const struct btf_type *t; - u32 i, nargs, btf_id; + const struct btf_type *t, *ref_t; + u32 i, nargs, btf_id, type_size; const char *tname; + bool is_global;
if (!prog->aux->func_info) return -EINVAL; @@ -5347,6 +5348,8 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, bpf_log(log, "Function %s has %d > 5 args\n", tname, nargs); goto out; } + + is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; /* check that BTF function arguments match actual types that the * verifier sees. */ @@ -5363,10 +5366,6 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, goto out; } if (btf_type_is_ptr(t)) { - if (reg->type == SCALAR_VALUE) { - bpf_log(log, "R%d is not a pointer\n", i + 1); - goto out; - } /* If function expects ctx type in BTF check that caller * is passing PTR_TO_CTX. */ @@ -5381,6 +5380,25 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, goto out; continue; } + + if (!is_global) + goto out; + + t = btf_type_skip_modifiers(btf, t->type, NULL); + + ref_t = btf_resolve_size(btf, t, &type_size); + if (IS_ERR(ref_t)) { + bpf_log(log, + "arg#%d reference type('%s %s') size cannot be determined: %ld\n", + i, btf_type_str(t), btf_name_by_offset(btf, t->name_off), + PTR_ERR(ref_t)); + goto out; + } + + if (check_mem_reg(env, reg, i + 1, type_size)) + goto out; + + continue; } bpf_log(log, "Unrecognized arg#%d type %s\n", i, btf_kind_str[BTF_INFO_KIND(t->info)]); @@ -5411,7 +5429,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, enum bpf_prog_type prog_type = prog->type; struct btf *btf = prog->aux->btf; const struct btf_param *args; - const struct btf_type *t; + const struct btf_type *t, *ref_t; u32 i, nargs, btf_id; const char *tname;
@@ -5484,9 +5502,26 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, reg->type = SCALAR_VALUE; continue; } - if (btf_type_is_ptr(t) && - btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { - reg->type = PTR_TO_CTX; + if (btf_type_is_ptr(t)) { + if (btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { + reg->type = PTR_TO_CTX; + continue; + } + + t = btf_type_skip_modifiers(btf, t->type, NULL); + + ref_t = btf_resolve_size(btf, t, ®->mem_size); + if (IS_ERR(ref_t)) { + bpf_log(log, + "arg#%d reference type('%s %s') size cannot be determined: %ld\n", + i, btf_type_str(t), btf_name_by_offset(btf, t->name_off), + PTR_ERR(ref_t)); + return -EINVAL; + } + + reg->type = PTR_TO_MEM | PTR_MAYBE_NULL; + reg->id = ++env->id_gen; + continue; } bpf_log(log, "Arg#%d type %s in %s() is not supported yet.\n", diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a34652b0d726..9d3af209e9c7 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4264,6 +4264,29 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, } }
+int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, + u32 regno, u32 mem_size) +{ + if (register_is_null(reg)) + return 0; + + if (type_may_be_null(reg->type)) { + /* Assuming that the register contains a value check if the memory + * access is safe. Temporarily save and restore the register's state as + * the conversion shouldn't be visible to a caller. + */ + const struct bpf_reg_state saved_reg = *reg; + int rv; + + mark_ptr_not_null_reg(reg); + rv = check_helper_mem_access(env, regno, mem_size, true, NULL); + *reg = saved_reg; + return rv; + } + + return check_helper_mem_access(env, regno, mem_size, true, NULL); +} + /* Implementation details: * bpf_map_lookup returns PTR_TO_MAP_VALUE_OR_NULL * Two bpf_map_lookups (even with the same key) will have different reg->id. @@ -11975,6 +11998,13 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) mark_reg_known_zero(env, regs, i); else if (regs[i].type == SCALAR_VALUE) mark_reg_unknown(env, regs, i); + else if (base_type(regs[i].type) == PTR_TO_MEM) { + const u32 mem_size = regs[i].mem_size; + + mark_reg_known_zero(env, regs, i); + regs[i].mem_size = mem_size; + regs[i].id = ++env->id_gen; + } } } else { /* 1st arg to a function */
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-5.12-rc1 commit 8b08807d039a843163fd4aeca93aec69dfc4fbcf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
test_global_func9 - check valid pointer's scenarios test_global_func10 - check that a smaller type cannot be passed as a larger one test_global_func11 - check that CTX pointer cannot be passed test_global_func12 - check access to a null pointer test_global_func13 - check access to an arbitrary pointer value test_global_func14 - check that an opaque pointer cannot be passed test_global_func15 - check that a variable has an unknown value after it was passed to a global function by pointer test_global_func16 - check access to uninitialized stack memory
test_global_func_args - check read and write operations through a pointer
Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210212205642.620788-5-me@ubique.spb.ru (cherry picked from commit 8b08807d039a843163fd4aeca93aec69dfc4fbcf) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../bpf/prog_tests/global_func_args.c | 60 ++++++++ .../bpf/prog_tests/test_global_funcs.c | 8 ++ .../selftests/bpf/progs/test_global_func10.c | 29 ++++ .../selftests/bpf/progs/test_global_func11.c | 19 +++ .../selftests/bpf/progs/test_global_func12.c | 21 +++ .../selftests/bpf/progs/test_global_func13.c | 24 ++++ .../selftests/bpf/progs/test_global_func14.c | 21 +++ .../selftests/bpf/progs/test_global_func15.c | 22 +++ .../selftests/bpf/progs/test_global_func16.c | 22 +++ .../selftests/bpf/progs/test_global_func9.c | 132 ++++++++++++++++++ .../bpf/progs/test_global_func_args.c | 91 ++++++++++++ 11 files changed, 449 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/global_func_args.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func10.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func11.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func12.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func13.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func14.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func15.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func16.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func9.c create mode 100644 tools/testing/selftests/bpf/progs/test_global_func_args.c
diff --git a/tools/testing/selftests/bpf/prog_tests/global_func_args.c b/tools/testing/selftests/bpf/prog_tests/global_func_args.c new file mode 100644 index 000000000000..8bcc2869102f --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/global_func_args.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "test_progs.h" +#include "network_helpers.h" + +static __u32 duration; + +static void test_global_func_args0(struct bpf_object *obj) +{ + int err, i, map_fd, actual_value; + const char *map_name = "values"; + + map_fd = bpf_find_map(__func__, obj, map_name); + if (CHECK(map_fd < 0, "bpf_find_map", "cannot find BPF map %s: %s\n", + map_name, strerror(errno))) + return; + + struct { + const char *descr; + int expected_value; + } tests[] = { + {"passing NULL pointer", 0}, + {"returning value", 1}, + {"reading local variable", 100 }, + {"writing local variable", 101 }, + {"reading global variable", 42 }, + {"writing global variable", 43 }, + {"writing to pointer-to-pointer", 1 }, + }; + + for (i = 0; i < ARRAY_SIZE(tests); ++i) { + const int expected_value = tests[i].expected_value; + + err = bpf_map_lookup_elem(map_fd, &i, &actual_value); + + CHECK(err || actual_value != expected_value, tests[i].descr, + "err %d result %d expected %d\n", err, actual_value, expected_value); + } +} + +void test_global_func_args(void) +{ + const char *file = "./test_global_func_args.o"; + __u32 retval; + struct bpf_object *obj; + int err, prog_fd; + + err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd); + if (CHECK(err, "load program", "error %d loading %s\n", err, file)) + return; + + err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4), + NULL, NULL, &retval, &duration); + CHECK(err || retval, "pass global func args run", + "err %d errno %d retval %d duration %d\n", + err, errno, retval, duration); + + test_global_func_args0(obj); + + bpf_object__close(obj); +} diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c index 32e4348b714b..7e13129f593a 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c +++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c @@ -61,6 +61,14 @@ void test_test_global_funcs(void) { "test_global_func6.o" , "modified ctx ptr R2" }, { "test_global_func7.o" , "foo() doesn't return scalar" }, { "test_global_func8.o" }, + { "test_global_func9.o" }, + { "test_global_func10.o", "invalid indirect read from stack" }, + { "test_global_func11.o", "Caller passes invalid args into func#1" }, + { "test_global_func12.o", "invalid mem access 'mem_or_null'" }, + { "test_global_func13.o", "Caller passes invalid args into func#1" }, + { "test_global_func14.o", "reference type('FWD S') size cannot be determined" }, + { "test_global_func15.o", "At program exit the register R0 has value" }, + { "test_global_func16.o", "invalid indirect read from stack" }, }; libbpf_print_fn_t old_print_fn = NULL; int err, i, duration = 0; diff --git a/tools/testing/selftests/bpf/progs/test_global_func10.c b/tools/testing/selftests/bpf/progs/test_global_func10.c new file mode 100644 index 000000000000..61c2ae92ce41 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func10.c @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct Small { + int x; +}; + +struct Big { + int x; + int y; +}; + +__noinline int foo(const struct Big *big) +{ + if (big == 0) + return 0; + + return bpf_get_prandom_u32() < big->y; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + const struct Small small = {.x = skb->len }; + + return foo((struct Big *)&small) ? 1 : 0; +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func11.c b/tools/testing/selftests/bpf/progs/test_global_func11.c new file mode 100644 index 000000000000..28488047c849 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func11.c @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct S { + int x; +}; + +__noinline int foo(const struct S *s) +{ + return s ? bpf_get_prandom_u32() < s->x : 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + return foo(skb); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func12.c b/tools/testing/selftests/bpf/progs/test_global_func12.c new file mode 100644 index 000000000000..62343527cc59 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func12.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct S { + int x; +}; + +__noinline int foo(const struct S *s) +{ + return bpf_get_prandom_u32() < s->x; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + const struct S s = {.x = skb->len }; + + return foo(&s); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func13.c b/tools/testing/selftests/bpf/progs/test_global_func13.c new file mode 100644 index 000000000000..ff8897c1ac22 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func13.c @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct S { + int x; +}; + +__noinline int foo(const struct S *s) +{ + if (s) + return bpf_get_prandom_u32() < s->x; + + return 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + const struct S *s = (const struct S *)(0xbedabeda); + + return foo(s); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func14.c b/tools/testing/selftests/bpf/progs/test_global_func14.c new file mode 100644 index 000000000000..698c77199ebf --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func14.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct S; + +__noinline int foo(const struct S *s) +{ + if (s) + return bpf_get_prandom_u32() < *(const int *) s; + + return 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + + return foo(NULL); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func15.c b/tools/testing/selftests/bpf/progs/test_global_func15.c new file mode 100644 index 000000000000..c19c435988d5 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func15.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +__noinline int foo(unsigned int *v) +{ + if (v) + *v = bpf_get_prandom_u32(); + + return 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + unsigned int v = 1; + + foo(&v); + + return v; +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func16.c b/tools/testing/selftests/bpf/progs/test_global_func16.c new file mode 100644 index 000000000000..0312d1e8d8c0 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func16.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +__noinline int foo(int (*arr)[10]) +{ + if (arr) + return (*arr)[9]; + + return 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + int array[10]; + + const int rv = foo(&array); + + return rv ? 1 : 0; +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func9.c b/tools/testing/selftests/bpf/progs/test_global_func9.c new file mode 100644 index 000000000000..bd233ddede98 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func9.c @@ -0,0 +1,132 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct S { + int x; +}; + +struct C { + int x; + int y; +}; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, struct S); +} map SEC(".maps"); + +enum E { + E_ITEM +}; + +static int global_data_x = 100; +static int volatile global_data_y = 500; + +__noinline int foo(const struct S *s) +{ + if (s) + return bpf_get_prandom_u32() < s->x; + + return 0; +} + +__noinline int bar(int *x) +{ + if (x) + *x &= bpf_get_prandom_u32(); + + return 0; +} +__noinline int baz(volatile int *x) +{ + if (x) + *x &= bpf_get_prandom_u32(); + + return 0; +} + +__noinline int qux(enum E *e) +{ + if (e) + return *e; + + return 0; +} + +__noinline int quux(int (*arr)[10]) +{ + if (arr) + return (*arr)[9]; + + return 0; +} + +__noinline int quuz(int **p) +{ + if (p) + *p = NULL; + + return 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + int result = 0; + + { + const struct S s = {.x = skb->len }; + + result |= foo(&s); + } + + { + const __u32 key = 1; + const struct S *s = bpf_map_lookup_elem(&map, &key); + + result |= foo(s); + } + + { + const struct C c = {.x = skb->len, .y = skb->family }; + + result |= foo((const struct S *)&c); + } + + { + result |= foo(NULL); + } + + { + bar(&result); + bar(&global_data_x); + } + + { + result |= baz(&global_data_y); + } + + { + enum E e = E_ITEM; + + result |= qux(&e); + } + + { + int array[10] = {0}; + + result |= quux(&array); + } + + { + int *p; + + result |= quuz(&p); + } + + return result ? 1 : 0; +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func_args.c b/tools/testing/selftests/bpf/progs/test_global_func_args.c new file mode 100644 index 000000000000..cae309538a9e --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func_args.c @@ -0,0 +1,91 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/bpf.h> + +#include <bpf/bpf_helpers.h> + +struct S { + int v; +}; + +static volatile struct S global_variable; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 7); + __type(key, __u32); + __type(value, int); +} values SEC(".maps"); + +static void save_value(__u32 index, int value) +{ + bpf_map_update_elem(&values, &index, &value, 0); +} + +__noinline int foo(__u32 index, struct S *s) +{ + if (s) { + save_value(index, s->v); + return ++s->v; + } + + save_value(index, 0); + + return 1; +} + +__noinline int bar(__u32 index, volatile struct S *s) +{ + if (s) { + save_value(index, s->v); + return ++s->v; + } + + save_value(index, 0); + + return 1; +} + +__noinline int baz(struct S **s) +{ + if (s) + *s = 0; + + return 0; +} + +SEC("cgroup_skb/ingress") +int test_cls(struct __sk_buff *skb) +{ + __u32 index = 0; + + { + const int v = foo(index++, 0); + + save_value(index++, v); + } + + { + struct S s = { .v = 100 }; + + foo(index++, &s); + save_value(index++, s.v); + } + + { + global_variable.v = 42; + bar(index++, &global_variable); + save_value(index++, global_variable.v); + } + + { + struct S v, *p = &v; + + baz(&p); + save_value(index++, !p); + } + + return 0; +} + +char _license[] SEC("license") = "GPL";
From: Song Liu songliubraving@fb.com
mainline inclusion from mainline-5.12-rc1 commit d2032d45101670be3a0fe221c815145a41ae2672 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This target is used to only build the bootstrap bpftool, which will be used to generate bpf skeletons for other tools, like perf.
Signed-off-by: Song Liu songliubraving@fb.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: kernel-team@fb.com Link: http://lore.kernel.org/lkml/20201228174054.907740-2-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com (cherry picked from commit d2032d45101670be3a0fe221c815145a41ae2672) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index df67ef3248e1..f45bc3c7fd07 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -145,6 +145,8 @@ VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ /boot/vmlinux-$(shell uname -r) VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
+bootstrap: $(BPFTOOL_BOOTSTRAP) + ifneq ($(VMLINUX_BTF)$(VMLINUX_H),) ifeq ($(feature-clang-bpf-co-re),1)
From: Song Liu songliubraving@fb.com
mainline inclusion from mainline-5.12-rc1 commit fbcdaa1908e8f61aa56c71a1db9a9deb72110a9d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
BPF programs are useful in perf to profile BPF programs.
BPF skeleton is by far the easiest way to write BPF tools. Enable building BPF skeletons in util/bpf_skel. A dummy bpf skeleton is added. More bpf skeletons will be added for different use cases.
Signed-off-by: Song Liu songliubraving@fb.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Acked-by: Jiri Olsa jolsa@redhat.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: kernel-team@fb.com Link: http://lore.kernel.org/lkml/20201229214214.3413833-3-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com (cherry picked from commit fbcdaa1908e8f61aa56c71a1db9a9deb72110a9d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/build/Makefile.feature | 4 ++- tools/perf/Makefile.config | 9 ++++++ tools/perf/Makefile.perf | 49 +++++++++++++++++++++++++++-- tools/perf/util/bpf_skel/.gitignore | 3 ++ tools/scripts/Makefile.include | 1 + 5 files changed, 63 insertions(+), 3 deletions(-) create mode 100644 tools/perf/util/bpf_skel/.gitignore
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index c7a60517c229..49ef7e3930fd 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -99,7 +99,9 @@ FEATURE_TESTS_EXTRA := \ clang \ libbpf \ libpfm4 \ - libdebuginfod + libdebuginfod \ + clang-bpf-co-re +
FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 78700c7ec7df..1971df3bd90e 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -636,6 +636,15 @@ ifndef NO_LIBBPF endif endif
+ifdef BUILD_BPF_SKEL + $(call feature_check,clang-bpf-co-re) + ifeq ($(feature-clang-bpf-co-re), 0) + dummy := $(error Error: clang too old. Please install recent clang) + endif + $(call detected,CONFIG_PERF_BPF_SKEL) + CFLAGS += -DHAVE_BPF_SKEL +endif + dwarf-post-unwind := 1 dwarf-post-unwind-text := BUG
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index d128b6006405..70242735cb6e 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -126,6 +126,8 @@ include ../scripts/utilities.mak # # Define NO_LIBDEBUGINFOD if you do not want support debuginfod # +# Define BUILD_BPF_SKEL to enable BPF skeletons +#
# As per kernel Makefile, avoid funny character set dependencies unexport LC_ALL @@ -175,6 +177,12 @@ endef
LD += $(EXTRA_LDFLAGS)
+HOSTCC ?= gcc +HOSTLD ?= ld +HOSTAR ?= ar +CLANG ?= clang +LLVM_STRIP ?= llvm-strip + PKG_CONFIG = $(CROSS_COMPILE)pkg-config
RM = rm -f @@ -730,7 +738,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc $(x86_arch_prctl_code_array) \ $(rename_flags_array) \ $(arch_errno_name_array) \ - $(sync_file_range_arrays) + $(sync_file_range_arrays) \ + bpf-skel
$(OUTPUT)%.o: %.c prepare FORCE $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@ @@ -1003,7 +1012,43 @@ config-clean: python-clean: $(python-clean)
-clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean +SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) +SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) +SKELETONS := + +ifdef BUILD_BPF_SKEL +BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool +LIBBPF_SRC := $(abspath ../lib/bpf) +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/.. + +$(SKEL_TMP_OUT): + $(Q)$(MKDIR) -p $@ + +$(BPFTOOL): | $(SKEL_TMP_OUT) + CFLAGS= $(MAKE) -C ../bpf/bpftool \ + OUTPUT=$(SKEL_TMP_OUT)/ bootstrap + +$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT) + $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \ + -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@ + +$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL) + $(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@ + +bpf-skel: $(SKELETONS) + +.PRECIOUS: $(SKEL_TMP_OUT)/%.bpf.o + +else # BUILD_BPF_SKEL + +bpf-skel: + +endif # BUILD_BPF_SKEL + +bpf-skel-clean: + $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) + +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS) $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '.*.cmd' -delete -o -name '.*.d' -delete $(Q)$(RM) $(OUTPUT).config-detected diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore new file mode 100644 index 000000000000..5263e9e6c5d8 --- /dev/null +++ b/tools/perf/util/bpf_skel/.gitignore @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0-only +.tmp +*.skel.h \ No newline at end of file diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index 41ca9473d0f0..70164ea037b9 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -152,6 +152,7 @@ ifneq ($(silent),1) $(MAKE) $(PRINT_DIR) -C $$subdir QUIET_FLEX = @echo ' FLEX '$@; QUIET_BISON = @echo ' BISON '$@; + QUIET_GENSKEL = @echo ' GEN-SKEL '$@;
descend = \ +@echo ' DESCEND '$(1); \
From: Song Liu songliubraving@fb.com
mainline inclusion from mainline-5.12-rc1 commit fa853c4b839ece9cd589e8858819240933cc4d78 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Introduce 'perf stat -b' option, which counts events for BPF programs, like:
[root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 1.487903822 115,200 ref-cycles 1.487903822 86,012 cycles 2.489147029 80,560 ref-cycles 2.489147029 73,784 cycles 3.490341825 60,720 ref-cycles 3.490341825 37,797 cycles 4.491540887 37,120 ref-cycles 4.491540887 31,963 cycles
The example above counts 'cycles' and 'ref-cycles' of BPF program of id 254. This is similar to bpftool-prog-profile command, but more flexible.
'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF programs (monitor-progs) to the target BPF program (target-prog). The monitor-progs read perf_event before and after the target-prog, and aggregate the difference in a BPF map. Then the user space reads data from these maps.
A new 'struct bpf_counter' is introduced to provide a common interface that uses BPF programs/maps to count perf events.
Committer notes:
Removed all but bpf_counter.h includes from evsel.h, not needed at all.
Also BPF map lookups for PERCPU_ARRAYs need to have as its value receive buffer passed to the kernel libbpf_num_possible_cpus() entries, not evsel__nr_cpus(evsel), as the former uses /sys/devices/system/cpu/possible while the later uses /sys/devices/system/cpu/online, which may be less than the 'possible' number making the bpf map lookup overwrite memory and cause hard to debug memory corruption.
We need to continue using evsel__nr_cpus(evsel) when accessing the perf_counts array tho, not to overwrite another are of memory :-)
Signed-off-by: Song Liu songliubraving@fb.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Link: https://lore.kernel.org/lkml/20210120163031.GU12699@kernel.org/ Acked-by: Namhyung Kim namhyung@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: kernel-team@fb.com Link: http://lore.kernel.org/lkml/20201229214214.3413833-4-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com (cherry picked from commit fa853c4b839ece9cd589e8858819240933cc4d78) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/perf/builtin-stat.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/perf/Documentation/perf-stat.txt | 18 + tools/perf/Makefile.perf | 2 +- tools/perf/builtin-stat.c | 82 ++++- tools/perf/util/Build | 1 + tools/perf/util/bpf_counter.c | 314 ++++++++++++++++++ tools/perf/util/bpf_counter.h | 72 ++++ .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 ++++++ tools/perf/util/evsel.c | 5 + tools/perf/util/evsel.h | 5 + tools/perf/util/python.c | 21 ++ tools/perf/util/stat-display.c | 4 +- tools/perf/util/stat.c | 2 +- tools/perf/util/target.c | 34 +- tools/perf/util/target.h | 10 + 14 files changed, 645 insertions(+), 18 deletions(-) create mode 100644 tools/perf/util/bpf_counter.c create mode 100644 tools/perf/util/bpf_counter.h create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index f9bcd95bf352..5bfc0357694b 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -75,6 +75,24 @@ report:: --tid=<tid>:: stat events on existing thread id (comma separated list)
+-b:: +--bpf-prog:: + stat events on existing bpf program id (comma separated list), + requiring root rights. bpftool-prog could be used to find program + id all bpf programs in the system. For example: + + # bpftool prog | head -n 1 + 17247: tracepoint name sys_enter tag 192d548b9d754067 gpl + + # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000 + + Performance counter stats for 'BPF program(s) 17247': + + 85,967 cycles + 28,982 instructions # 0.34 insn per cycle + + 1.102235068 seconds time elapsed + ifdef::HAVE_LIBPFM[] --pfm-events events:: Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net) diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 70242735cb6e..9bfc725db608 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -1014,7 +1014,7 @@ python-clean:
SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) -SKELETONS := +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
ifdef BUILD_BPF_SKEL BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 89e80a3bc9c3..8cf0490d01b6 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -67,6 +67,7 @@ #include "util/top.h" #include "util/affinity.h" #include "util/pfm.h" +#include "util/bpf_counter.h" #include "asm/bug.h"
#include <linux/time64.h> @@ -409,12 +410,32 @@ static int read_affinity_counters(struct timespec *rs) return 0; }
+static int read_bpf_map_counters(void) +{ + struct evsel *counter; + int err; + + evlist__for_each_entry(evsel_list, counter) { + err = bpf_counter__read(counter); + if (err) + return err; + } + return 0; +} + static void read_counters(struct timespec *rs) { struct evsel *counter; + int err;
- if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0)) - return; + if (!stat_config.stop_read_counter) { + if (target__has_bpf(&target)) + err = read_bpf_map_counters(); + else + err = read_affinity_counters(rs); + if (err < 0) + return; + }
evlist__for_each_entry(evsel_list, counter) { if (counter->err) @@ -496,11 +517,22 @@ static bool handle_interval(unsigned int interval, int *times) return false; }
-static void enable_counters(void) +static int enable_counters(void) { + struct evsel *evsel; + int err; + + if (target__has_bpf(&target)) { + evlist__for_each_entry(evsel_list, evsel) { + err = bpf_counter__enable(evsel); + if (err) + return err; + } + } + if (stat_config.initial_delay < 0) { pr_info(EVLIST_DISABLED_MSG); - return; + return 0; }
if (stat_config.initial_delay > 0) { @@ -518,6 +550,7 @@ static void enable_counters(void) if (stat_config.initial_delay > 0) pr_info(EVLIST_ENABLED_MSG); } + return 0; }
static void disable_counters(void) @@ -720,7 +753,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) const bool forks = (argc > 0); bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false; struct affinity affinity; - int i, cpu; + int i, cpu, err; bool second_pass = false;
if (forks) { @@ -738,6 +771,13 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) if (affinity__setup(&affinity) < 0) return -1;
+ if (target__has_bpf(&target)) { + evlist__for_each_entry(evsel_list, counter) { + if (bpf_counter__load(counter, &target)) + return -1; + } + } + evlist__for_each_cpu (evsel_list, i, cpu) { affinity__set(&affinity, cpu);
@@ -851,7 +891,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) }
if (STAT_RECORD) { - int err, fd = perf_data__fd(&perf_stat.data); + int fd = perf_data__fd(&perf_stat.data);
if (is_pipe) { err = perf_header__write_pipe(perf_data__fd(&perf_stat.data)); @@ -877,7 +917,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
if (forks) { perf_evlist__start_workload(evsel_list); - enable_counters(); + err = enable_counters(); + if (err) + return -1;
if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) status = dispatch_events(forks, timeout, interval, ×); @@ -896,7 +938,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) if (WIFSIGNALED(status)) psignal(WTERMSIG(status), argv[0]); } else { - enable_counters(); + err = enable_counters(); + if (err) + return -1; status = dispatch_events(forks, timeout, interval, ×); }
@@ -1087,6 +1131,10 @@ static struct option stat_options[] = { "stat events on existing process id"), OPT_STRING('t', "tid", &target.tid, "tid", "stat events on existing thread id"), +#ifdef HAVE_BPF_SKEL + OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id", + "stat events on existing bpf program id"), +#endif OPT_BOOLEAN('a', "all-cpus", &target.system_wide, "system-wide collection from all CPUs"), OPT_BOOLEAN('g', "group", &group, @@ -2058,11 +2106,12 @@ int cmd_stat(int argc, const char **argv) "perf stat [<options>] [<command>]", NULL }; - int status = -EINVAL, run_idx; + int status = -EINVAL, run_idx, err; const char *mode; FILE *output = stderr; unsigned int interval, timeout; const char * const stat_subcommands[] = { "record", "report" }; + char errbuf[BUFSIZ];
setlocale(LC_ALL, "");
@@ -2173,6 +2222,12 @@ int cmd_stat(int argc, const char **argv) } else if (big_num_opt == 0) /* User passed --no-big-num */ stat_config.big_num = false;
+ err = target__validate(&target); + if (err) { + target__strerror(&target, err, errbuf, BUFSIZ); + pr_warning("%s\n", errbuf); + } + setup_system_wide(argc);
/* @@ -2243,8 +2298,6 @@ int cmd_stat(int argc, const char **argv) goto out; }
- target__validate(&target); - if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide)) target.per_thread = true;
@@ -2375,9 +2428,10 @@ int cmd_stat(int argc, const char **argv) * tools remain -acme */ int fd = perf_data__fd(&perf_stat.data); - int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat, - process_synthesized_event, - &perf_stat.session->machines.host); + + err = perf_event__synthesize_kernel_mmap((void *)&perf_stat, + process_synthesized_event, + &perf_stat.session->machines.host); if (err) { pr_warning("Couldn't synthesize the kernel mmap record, harmless, " "older tools may produce warnings about this file\n."); diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 661d995f5016..83a2ea022dc2 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -135,6 +135,7 @@ perf-y += clockid.o
perf-$(CONFIG_LIBBPF) += bpf-loader.o perf-$(CONFIG_LIBBPF) += bpf_map.o +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o perf-$(CONFIG_LIBELF) += symbol-elf.o perf-$(CONFIG_LIBELF) += probe-file.o diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c new file mode 100644 index 000000000000..04f89120b323 --- /dev/null +++ b/tools/perf/util/bpf_counter.c @@ -0,0 +1,314 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* Copyright (c) 2019 Facebook */ + +#include <assert.h> +#include <limits.h> +#include <unistd.h> +#include <sys/time.h> +#include <sys/resource.h> +#include <linux/err.h> +#include <linux/zalloc.h> +#include <bpf/bpf.h> +#include <bpf/btf.h> +#include <bpf/libbpf.h> + +#include "bpf_counter.h" +#include "counts.h" +#include "debug.h" +#include "evsel.h" +#include "target.h" + +#include "bpf_skel/bpf_prog_profiler.skel.h" + +static inline void *u64_to_ptr(__u64 ptr) +{ + return (void *)(unsigned long)ptr; +} + +static void set_max_rlimit(void) +{ + struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY }; + + setrlimit(RLIMIT_MEMLOCK, &rinf); +} + +static struct bpf_counter *bpf_counter_alloc(void) +{ + struct bpf_counter *counter; + + counter = zalloc(sizeof(*counter)); + if (counter) + INIT_LIST_HEAD(&counter->list); + return counter; +} + +static int bpf_program_profiler__destroy(struct evsel *evsel) +{ + struct bpf_counter *counter, *tmp; + + list_for_each_entry_safe(counter, tmp, + &evsel->bpf_counter_list, list) { + list_del_init(&counter->list); + bpf_prog_profiler_bpf__destroy(counter->skel); + free(counter); + } + assert(list_empty(&evsel->bpf_counter_list)); + + return 0; +} + +static char *bpf_target_prog_name(int tgt_fd) +{ + struct bpf_prog_info_linear *info_linear; + struct bpf_func_info *func_info; + const struct btf_type *t; + char *name = NULL; + struct btf *btf; + + info_linear = bpf_program__get_prog_info_linear( + tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO); + if (IS_ERR_OR_NULL(info_linear)) { + pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd); + return NULL; + } + + if (info_linear->info.btf_id == 0 || + btf__get_from_id(info_linear->info.btf_id, &btf)) { + pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd); + goto out; + } + + func_info = u64_to_ptr(info_linear->info.func_info); + t = btf__type_by_id(btf, func_info[0].type_id); + if (!t) { + pr_debug("btf %d doesn't have type %d\n", + info_linear->info.btf_id, func_info[0].type_id); + goto out; + } + name = strdup(btf__name_by_offset(btf, t->name_off)); +out: + free(info_linear); + return name; +} + +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id) +{ + struct bpf_prog_profiler_bpf *skel; + struct bpf_counter *counter; + struct bpf_program *prog; + char *prog_name; + int prog_fd; + int err; + + prog_fd = bpf_prog_get_fd_by_id(prog_id); + if (prog_fd < 0) { + pr_err("Failed to open fd for bpf prog %u\n", prog_id); + return -1; + } + counter = bpf_counter_alloc(); + if (!counter) { + close(prog_fd); + return -1; + } + + skel = bpf_prog_profiler_bpf__open(); + if (!skel) { + pr_err("Failed to open bpf skeleton\n"); + goto err_out; + } + + skel->rodata->num_cpu = evsel__nr_cpus(evsel); + + bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel)); + bpf_map__resize(skel->maps.fentry_readings, 1); + bpf_map__resize(skel->maps.accum_readings, 1); + + prog_name = bpf_target_prog_name(prog_fd); + if (!prog_name) { + pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id); + goto err_out; + } + + bpf_object__for_each_program(prog, skel->obj) { + err = bpf_program__set_attach_target(prog, prog_fd, prog_name); + if (err) { + pr_err("bpf_program__set_attach_target failed.\n" + "Does bpf prog %u have BTF?\n", prog_id); + goto err_out; + } + } + set_max_rlimit(); + err = bpf_prog_profiler_bpf__load(skel); + if (err) { + pr_err("bpf_prog_profiler_bpf__load failed\n"); + goto err_out; + } + + assert(skel != NULL); + counter->skel = skel; + list_add(&counter->list, &evsel->bpf_counter_list); + close(prog_fd); + return 0; +err_out: + bpf_prog_profiler_bpf__destroy(skel); + free(counter); + close(prog_fd); + return -1; +} + +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target) +{ + char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p; + u32 prog_id; + int ret; + + bpf_str_ = bpf_str = strdup(target->bpf_str); + if (!bpf_str) + return -1; + + while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) { + prog_id = strtoul(tok, &p, 10); + if (prog_id == 0 || prog_id == UINT_MAX || + (*p != '\0' && *p != ',')) { + pr_err("Failed to parse bpf prog ids %s\n", + target->bpf_str); + return -1; + } + + ret = bpf_program_profiler_load_one(evsel, prog_id); + if (ret) { + bpf_program_profiler__destroy(evsel); + free(bpf_str_); + return -1; + } + bpf_str = NULL; + } + free(bpf_str_); + return 0; +} + +static int bpf_program_profiler__enable(struct evsel *evsel) +{ + struct bpf_counter *counter; + int ret; + + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + assert(counter->skel != NULL); + ret = bpf_prog_profiler_bpf__attach(counter->skel); + if (ret) { + bpf_program_profiler__destroy(evsel); + return ret; + } + } + return 0; +} + +static int bpf_program_profiler__read(struct evsel *evsel) +{ + // perf_cpu_map uses /sys/devices/system/cpu/online + int num_cpu = evsel__nr_cpus(evsel); + // BPF_MAP_TYPE_PERCPU_ARRAY uses /sys/devices/system/cpu/possible + // Sometimes possible > online, like on a Ryzen 3900X that has 24 + // threads but its possible showed 0-31 -acme + int num_cpu_bpf = libbpf_num_possible_cpus(); + struct bpf_perf_event_value values[num_cpu_bpf]; + struct bpf_counter *counter; + int reading_map_fd; + __u32 key = 0; + int err, cpu; + + if (list_empty(&evsel->bpf_counter_list)) + return -EAGAIN; + + for (cpu = 0; cpu < num_cpu; cpu++) { + perf_counts(evsel->counts, cpu, 0)->val = 0; + perf_counts(evsel->counts, cpu, 0)->ena = 0; + perf_counts(evsel->counts, cpu, 0)->run = 0; + } + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + struct bpf_prog_profiler_bpf *skel = counter->skel; + + assert(skel != NULL); + reading_map_fd = bpf_map__fd(skel->maps.accum_readings); + + err = bpf_map_lookup_elem(reading_map_fd, &key, values); + if (err) { + pr_err("failed to read value\n"); + return err; + } + + for (cpu = 0; cpu < num_cpu; cpu++) { + perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; + perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled; + perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running; + } + } + return 0; +} + +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu, + int fd) +{ + struct bpf_prog_profiler_bpf *skel; + struct bpf_counter *counter; + int ret; + + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + skel = counter->skel; + assert(skel != NULL); + + ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events), + &cpu, &fd, BPF_ANY); + if (ret) + return ret; + } + return 0; +} + +struct bpf_counter_ops bpf_program_profiler_ops = { + .load = bpf_program_profiler__load, + .enable = bpf_program_profiler__enable, + .read = bpf_program_profiler__read, + .destroy = bpf_program_profiler__destroy, + .install_pe = bpf_program_profiler__install_pe, +}; + +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd) +{ + if (list_empty(&evsel->bpf_counter_list)) + return 0; + return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd); +} + +int bpf_counter__load(struct evsel *evsel, struct target *target) +{ + if (target__has_bpf(target)) + evsel->bpf_counter_ops = &bpf_program_profiler_ops; + + if (evsel->bpf_counter_ops) + return evsel->bpf_counter_ops->load(evsel, target); + return 0; +} + +int bpf_counter__enable(struct evsel *evsel) +{ + if (list_empty(&evsel->bpf_counter_list)) + return 0; + return evsel->bpf_counter_ops->enable(evsel); +} + +int bpf_counter__read(struct evsel *evsel) +{ + if (list_empty(&evsel->bpf_counter_list)) + return -EAGAIN; + return evsel->bpf_counter_ops->read(evsel); +} + +void bpf_counter__destroy(struct evsel *evsel) +{ + if (list_empty(&evsel->bpf_counter_list)) + return; + evsel->bpf_counter_ops->destroy(evsel); + evsel->bpf_counter_ops = NULL; +} diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h new file mode 100644 index 000000000000..2eca210e5dc1 --- /dev/null +++ b/tools/perf/util/bpf_counter.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PERF_BPF_COUNTER_H +#define __PERF_BPF_COUNTER_H 1 + +#include <linux/list.h> + +struct evsel; +struct target; +struct bpf_counter; + +typedef int (*bpf_counter_evsel_op)(struct evsel *evsel); +typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel, + struct target *target); +typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel, + int cpu, + int fd); + +struct bpf_counter_ops { + bpf_counter_evsel_target_op load; + bpf_counter_evsel_op enable; + bpf_counter_evsel_op read; + bpf_counter_evsel_op destroy; + bpf_counter_evsel_install_pe_op install_pe; +}; + +struct bpf_counter { + void *skel; + struct list_head list; +}; + +#ifdef HAVE_BPF_SKEL + +int bpf_counter__load(struct evsel *evsel, struct target *target); +int bpf_counter__enable(struct evsel *evsel); +int bpf_counter__read(struct evsel *evsel); +void bpf_counter__destroy(struct evsel *evsel); +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); + +#else /* HAVE_BPF_SKEL */ + +#include<linux/err.h> + +static inline int bpf_counter__load(struct evsel *evsel __maybe_unused, + struct target *target __maybe_unused) +{ + return 0; +} + +static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused) +{ + return 0; +} + +static inline int bpf_counter__read(struct evsel *evsel __maybe_unused) +{ + return -EAGAIN; +} + +static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused) +{ +} + +static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, + int cpu __maybe_unused, + int fd __maybe_unused) +{ + return 0; +} + +#endif /* HAVE_BPF_SKEL */ + +#endif /* __PERF_BPF_COUNTER_H */ diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c new file mode 100644 index 000000000000..c7cec92d0236 --- /dev/null +++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +// Copyright (c) 2020 Facebook +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +/* map of perf event fds, num_cpu * num_metric entries */ +struct { + __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(int)); +} events SEC(".maps"); + +/* readings at fentry */ +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(struct bpf_perf_event_value)); + __uint(max_entries, 1); +} fentry_readings SEC(".maps"); + +/* accumulated readings */ +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(struct bpf_perf_event_value)); + __uint(max_entries, 1); +} accum_readings SEC(".maps"); + +const volatile __u32 num_cpu = 1; + +SEC("fentry/XXX") +int BPF_PROG(fentry_XXX) +{ + __u32 key = bpf_get_smp_processor_id(); + struct bpf_perf_event_value *ptr; + __u32 zero = 0; + long err; + + /* look up before reading, to reduce error */ + ptr = bpf_map_lookup_elem(&fentry_readings, &zero); + if (!ptr) + return 0; + + err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr)); + if (err) + return 0; + + return 0; +} + +static inline void +fexit_update_maps(struct bpf_perf_event_value *after) +{ + struct bpf_perf_event_value *before, diff, *accum; + __u32 zero = 0; + + before = bpf_map_lookup_elem(&fentry_readings, &zero); + /* only account samples with a valid fentry_reading */ + if (before && before->counter) { + struct bpf_perf_event_value *accum; + + diff.counter = after->counter - before->counter; + diff.enabled = after->enabled - before->enabled; + diff.running = after->running - before->running; + + accum = bpf_map_lookup_elem(&accum_readings, &zero); + if (accum) { + accum->counter += diff.counter; + accum->enabled += diff.enabled; + accum->running += diff.running; + } + } +} + +SEC("fexit/XXX") +int BPF_PROG(fexit_XXX) +{ + struct bpf_perf_event_value reading; + __u32 cpu = bpf_get_smp_processor_id(); + __u32 one = 1, zero = 0; + int err; + + /* read all events before updating the maps, to reduce error */ + err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading)); + if (err) + return 0; + + fexit_update_maps(&reading); + return 0; +} + +char LICENSE[] SEC("license") = "Dual BSD/GPL"; diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 1a1cbd16d76d..8829fed10910 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -25,6 +25,7 @@ #include <stdlib.h> #include <perf/evsel.h> #include "asm/bug.h" +#include "bpf_counter.h" #include "callchain.h" #include "cgroup.h" #include "counts.h" @@ -247,6 +248,7 @@ void evsel__init(struct evsel *evsel, evsel->bpf_obj = NULL; evsel->bpf_fd = -1; INIT_LIST_HEAD(&evsel->config_terms); + INIT_LIST_HEAD(&evsel->bpf_counter_list); perf_evsel__object.init(evsel); evsel->sample_size = __evsel__sample_size(attr->sample_type); evsel__calc_id_pos(evsel); @@ -1374,6 +1376,7 @@ void evsel__exit(struct evsel *evsel) { assert(list_empty(&evsel->core.node)); assert(evsel->evlist == NULL); + bpf_counter__destroy(evsel); evsel__free_counts(evsel); perf_evsel__free_fd(&evsel->core); perf_evsel__free_id(&evsel->core); @@ -1797,6 +1800,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
FD(evsel, cpu, thread) = fd;
+ bpf_counter__install_pe(evsel, cpu, fd); + if (unlikely(test_attr__enabled)) { test_attr__open(&evsel->core.attr, pid, cpus->map[cpu], fd, group_fd, flags); diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index 79a860d8e3ee..e818e558a275 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -17,6 +17,8 @@ struct cgroup; struct perf_counts; struct perf_stat_evsel; union perf_event; +struct bpf_counter_ops; +struct target;
typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
@@ -127,6 +129,8 @@ struct evsel { * See also evsel__has_callchain(). */ __u64 synth_sample_type; + struct list_head bpf_counter_list; + struct bpf_counter_ops *bpf_counter_ops; };
struct perf_missing_features { @@ -423,4 +427,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel) struct perf_env *evsel__env(struct evsel *evsel);
int evsel__store_ids(struct evsel *evsel, struct evlist *evlist); + #endif /* __PERF_EVSEL_H */ diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c index ae8edde7c50e..5dcc30ae6cab 100644 --- a/tools/perf/util/python.c +++ b/tools/perf/util/python.c @@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp, return 0; }
+/* + * XXX: All these evsel destructors need some better mechanism, like a linked + * list of destructors registered when the relevant code indeed is used instead + * of having more and more calls in perf_evsel__delete(). -- acme + * + * For now, add some more: + * + * Not to drag the BPF bandwagon... + */ +void bpf_counter__destroy(struct evsel *evsel); +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); + +void bpf_counter__destroy(struct evsel *evsel __maybe_unused) +{ +} + +int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused) +{ + return 0; +} + /* * Support debug printing even though util/debug.c is not linked. That means * implementing 'verbose' and 'eprintf'. diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index 96fe9c1af336..b6f71ad4b536 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -1027,7 +1027,9 @@ static void print_header(struct perf_stat_config *config, if (!config->csv_output) { fprintf(output, "\n"); fprintf(output, " Performance counter stats for "); - if (_target->system_wide) + if (_target->bpf_str) + fprintf(output, "'BPF program(s) %s", _target->bpf_str); + else if (_target->system_wide) fprintf(output, "'system wide"); else if (_target->cpu_list) fprintf(output, "'CPU(s) %s", _target->cpu_list); diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c index bd0decd6d753..8cd26aa111c1 100644 --- a/tools/perf/util/stat.c +++ b/tools/perf/util/stat.c @@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel, if (leader->core.nr_members > 1) attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
- attr->inherit = !config->no_inherit; + attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list);
/* * Some events get initialized with sample_(period/type) set, diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c index a3db13dea937..0f383418e3df 100644 --- a/tools/perf/util/target.c +++ b/tools/perf/util/target.c @@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target) ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM; }
+ /* BPF and CPU are mutually exclusive */ + if (target->bpf_str && target->cpu_list) { + target->cpu_list = NULL; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_CPU; + } + + /* BPF and PID/TID are mutually exclusive */ + if (target->bpf_str && target->tid) { + target->tid = NULL; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_PID; + } + + /* BPF and UID are mutually exclusive */ + if (target->bpf_str && target->uid_str) { + target->uid_str = NULL; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_UID; + } + + /* BPF and THREADS are mutually exclusive */ + if (target->bpf_str && target->per_thread) { + target->per_thread = false; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD; + } + /* THREAD and SYSTEM/CPU are mutually exclusive */ if (target->per_thread && (target->system_wide || target->cpu_list)) { target->per_thread = false; @@ -109,6 +137,10 @@ static const char *target__error_str[] = { "PID/TID switch overriding SYSTEM", "UID switch overriding SYSTEM", "SYSTEM/CPU switch overriding PER-THREAD", + "BPF switch overriding CPU", + "BPF switch overriding PID/TID", + "BPF switch overriding UID", + "BPF switch overriding THREAD", "Invalid User: %s", "Problems obtaining information for user %s", }; @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
switch (errnum) { case TARGET_ERRNO__PID_OVERRIDE_CPU ... - TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD: + TARGET_ERRNO__BPF_OVERRIDE_THREAD: snprintf(buf, buflen, "%s", msg); break;
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h index 6ef01a83b24e..f132c6c2eef8 100644 --- a/tools/perf/util/target.h +++ b/tools/perf/util/target.h @@ -10,6 +10,7 @@ struct target { const char *tid; const char *cpu_list; const char *uid_str; + const char *bpf_str; uid_t uid; bool system_wide; bool uses_mmap; @@ -36,6 +37,10 @@ enum target_errno { TARGET_ERRNO__PID_OVERRIDE_SYSTEM, TARGET_ERRNO__UID_OVERRIDE_SYSTEM, TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD, + TARGET_ERRNO__BPF_OVERRIDE_CPU, + TARGET_ERRNO__BPF_OVERRIDE_PID, + TARGET_ERRNO__BPF_OVERRIDE_UID, + TARGET_ERRNO__BPF_OVERRIDE_THREAD,
/* for target__parse_uid() */ TARGET_ERRNO__INVALID_UID, @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target) return target->system_wide || target->cpu_list; }
+static inline bool target__has_bpf(struct target *target) +{ + return target->bpf_str; +} + static inline bool target__none(struct target *target) { return !target__has_task(target) && !target__has_cpu(target);
From: Cong Wang cong.wang@bytedance.com
mainline inclusion from mainline-5.12-rc3 commit 53f523f3052ac16bbc7718032aa6b848f971d28c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Similar to bpf_prog_realloc(), bpf_prog_clone_create() also copies the percpu pointers, but the clone still shares them with the original prog, so we have to clear these two percpu pointers in bpf_prog_clone_free(). Otherwise we would get a double free:
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 13 PID: 8140 Comm: kworker/13:247 Kdump: loaded Tainted: G W OE 5.11.0-rc4.bm.1-amd64+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 test_bpf: #1 TXA Workqueue: events bpf_prog_free_deferred RIP: 0010:percpu_ref_get_many.constprop.97+0x42/0xf0 Code: [...] RSP: 0018:ffffa6bce1f9bda0 EFLAGS: 00010002 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000021dfc7b RDX: ffffffffae2eeb90 RSI: 867f92637e338da5 RDI: 0000000000000046 RBP: ffffa6bce1f9bda8 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000046 R11: 0000000000000000 R12: 0000000000000280 R13: 0000000000000000 R14: 0000000000000000 R15: ffff9b5f3ffdedc0 FS: 0000000000000000(0000) GS:ffff9b5f2fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000027c36c002 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: refill_obj_stock+0x5e/0xd0 free_percpu+0xee/0x550 __bpf_prog_free+0x4d/0x60 process_one_work+0x26a/0x590 worker_thread+0x3c/0x390 ? process_one_work+0x590/0x590 kthread+0x130/0x150 ? kthread_park+0x80/0x80 ret_from_fork+0x1f/0x30
This bug is 100% reproducible with test_kmod.sh.
Fixes: 700d4796ef59 ("bpf: Optimize program stats") Fixes: ca06f55b9002 ("bpf: Add per-program recursion prevention mechanism") Reported-by: Jiang Wang jiang.wang@bytedance.com Signed-off-by: Cong Wang cong.wang@bytedance.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Cc: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210218001647.71631-1-xiyou.wangcong@gmail.com (cherry picked from commit 53f523f3052ac16bbc7718032aa6b848f971d28c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 557b2a866c6f..c766bda64fe5 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1130,6 +1130,8 @@ static void bpf_prog_clone_free(struct bpf_prog *fp) * clone is guaranteed to not be locked. */ fp->aux = NULL; + fp->stats = NULL; + fp->active = NULL; __bpf_prog_free(fp); }
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-5.12-rc3 commit f4eda8b6e4a5c7897c6bb992ed63a27061b371ef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Now it is possible for global function to have a pointer argument that points to something different than struct. Drop the irrelevant log message and keep the logic same.
Fixes: e5069b9c23b3 ("bpf: Support pointers in global func args") Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210223090416.333943-1-me@ubique.spb.ru (cherry picked from commit f4eda8b6e4a5c7897c6bb992ed63a27061b371ef) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 172178f617c6..0fbd024a90f3 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4320,8 +4320,6 @@ btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf, * is not supported yet. * BPF_PROG_TYPE_RAW_TRACEPOINT is fine. */ - if (log->level & BPF_LOG_LEVEL) - bpf_log(log, "arg#%d type is not a struct\n", arg); return NULL; } tname = btf_name_by_offset(btf, t->name_off);
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-5.12-rc3 commit c41d81bfbb4579c3e583457e383dd63d026bf947 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add an explicit 'const void *' cast to pass program ctx pointer type into a global function that expects pointer to structure.
warning: incompatible pointer types passing 'struct __sk_buff *' to parameter of type 'const struct S *' [-Wincompatible-pointer-types] return foo(skb); ^~~ progs/test_global_func11.c:10:36: note: passing argument to parameter 's' here __noinline int foo(const struct S *s) ^
Fixes: 8b08807d039a ("selftests/bpf: Add unit tests for pointers in global functions") Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210223082211.302596-1-me@ubique.spb.ru (cherry picked from commit c41d81bfbb4579c3e583457e383dd63d026bf947) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/progs/test_global_func11.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/test_global_func11.c b/tools/testing/selftests/bpf/progs/test_global_func11.c index 28488047c849..ef5277d982d9 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func11.c +++ b/tools/testing/selftests/bpf/progs/test_global_func11.c @@ -15,5 +15,5 @@ __noinline int foo(const struct S *s) SEC("cgroup_skb/ingress") int test_cls(struct __sk_buff *skb) { - return foo(skb); + return foo((const void *)skb); }
From: Jiri Olsa jolsa@kernel.org
mainline inclusion from mainline-5.12-rc3 commit 762323eb39a257c3b9875172d5ee134bd448692c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Arnaldo reported issue for following build command:
$ rm -rf /tmp/krava; mkdir /tmp/krava; make O=/tmp/krava clean CLEAN config /bin/sh: line 0: cd: /tmp/krava/feature/: No such file or directory ../../scripts/Makefile.include:17: *** output directory "/tmp/krava/feature/" does not exist. Stop. make[1]: *** [Makefile.perf:1010: config-clean] Error 2 make: *** [Makefile:90: clean] Error 2
The problem is that now that we include scripts/Makefile.include in feature's Makefile (which is fine and needed), we need to ensure the OUTPUT directory exists, before executing (out of tree) clean command.
Removing the feature's cleanup from perf Makefile and fixing feature's cleanup under build Makefile, so it now checks that there's existing OUTPUT directory before calling the clean.
Fixes: 211a741cd3e1 ("tools: Factor Clang, LLC and LLVM utils definitions") Reported-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Jiri Olsa jolsa@kernel.org Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Tested-by: Sedat Dilek sedat.dilek@gmail.com # LLVM/Clang v13-git Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Mark Rutland mark.rutland@arm.com Cc: Michael Petlan mpetlan@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/20210224150831.409639-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com (cherry picked from commit 762323eb39a257c3b9875172d5ee134bd448692c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/build/Makefile | 8 +++++++- tools/perf/Makefile.perf | 10 +--------- 2 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/tools/build/Makefile b/tools/build/Makefile index bae48e6fa995..5ed41b96fcde 100644 --- a/tools/build/Makefile +++ b/tools/build/Makefile @@ -30,12 +30,18 @@ build := -f $(srctree)/tools/build/Makefile.build dir=. obj
all: $(OUTPUT)fixdep
+# Make sure there's anything to clean, +# feature contains check for existing OUTPUT +TMP_O := $(if $(OUTPUT),$(OUTPUT)/feature,./) + clean: $(call QUIET_CLEAN, fixdep) $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '.*.cmd' -delete -o -name '.*.d' -delete $(Q)rm -f $(OUTPUT)fixdep $(call QUIET_CLEAN, feature-detect) - $(Q)$(MAKE) -C feature/ clean >/dev/null +ifneq ($(wildcard $(TMP_O)),) + $(Q)$(MAKE) -C feature OUTPUT=$(TMP_O) clean >/dev/null +endif
$(OUTPUT)fixdep-in.o: FORCE $(Q)$(MAKE) $(build)=fixdep diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 9bfc725db608..f6e609673de2 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -1001,14 +1001,6 @@ $(INSTALL_DOC_TARGETS):
### Cleaning rules
-# -# This is here, not in Makefile.config, because Makefile.config does -# not get included for the clean target: -# -config-clean: - $(call QUIET_CLEAN, config) - $(Q)$(MAKE) -C $(srctree)/tools/build/feature/ $(if $(OUTPUT),OUTPUT=$(OUTPUT)feature/,) clean >/dev/null - python-clean: $(python-clean)
@@ -1048,7 +1040,7 @@ endif # BUILD_BPF_SKEL bpf-skel-clean: $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
-clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS) $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '.*.cmd' -delete -o -name '.*.d' -delete $(Q)$(RM) $(OUTPUT).config-detected
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.12-rc5 commit 350a5c4dd2452ea999cc5e1d4a8dbf12de2f97ef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The syzbot got FD of vmlinux BTF and passed it into map_create which caused crash in btf_type_id_size() when it tried to access resolved_ids. The vmlinux BTF doesn't have 'resolved_ids' and 'resolved_sizes' initialized to save memory. To avoid such issues disallow using vmlinux BTF in prog_load and map_create commands.
Fixes: 5329722057d4 ("bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO") Reported-by: syzbot+8bab8ed346746e7540e8@syzkaller.appspotmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210307225248.79031-1-alexei.starovoitov@gmail.... (cherry picked from commit 350a5c4dd2452ea999cc5e1d4a8dbf12de2f97ef) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/syscall.c | 5 +++++ kernel/bpf/verifier.c | 4 ++++ 2 files changed, 9 insertions(+)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 969d0598434c..0b9b3f0d772c 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -864,6 +864,11 @@ static int map_create(union bpf_attr *attr) err = PTR_ERR(btf); goto free_map; } + if (btf_is_kernel(btf)) { + btf_put(btf); + err = -EACCES; + goto free_map; + } map->btf = btf;
if (attr->btf_value_type_id) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9d3af209e9c7..f3c20e763156 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9120,6 +9120,10 @@ static int check_btf_info(struct bpf_verifier_env *env, btf = btf_get_by_fd(attr->prog_btf_fd); if (IS_ERR(btf)) return PTR_ERR(btf); + if (btf_is_kernel(btf)) { + btf_put(btf); + return -EACCES; + } env->prog->aux->btf = btf;
err = check_btf_func(env, attr, uattr);
From: Dmitrii Banshchikov me@ubique.spb.ru
mainline inclusion from mainline-5.13-rc1 commit 523a4cf491b3c9e2d546040d57250f1a0ca84f03 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Instead of using integer literal here and there use macro name for better context.
Signed-off-by: Dmitrii Banshchikov me@ubique.spb.ru Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210225202629.585485-1-me@ubique.spb.ru (cherry picked from commit 523a4cf491b3c9e2d546040d57250f1a0ca84f03) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 5 +++++ kernel/bpf/btf.c | 25 ++++++++++++++----------- kernel/bpf/verifier.c | 2 +- 3 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a6f0daa55096..f532d9f61d26 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -573,6 +573,11 @@ enum bpf_cgroup_storage_type { */ #define MAX_BPF_FUNC_ARGS 12
+/* The maximum number of arguments passed through registers + * a single function may have. + */ +#define MAX_BPF_FUNC_REG_ARGS 5 + struct btf_func_model { u8 ret_size; u8 nr_args; diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 0fbd024a90f3..ab190f20e811 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4591,8 +4591,10 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, } arg = off / 8; args = (const struct btf_param *)(t + 1); - /* if (t == NULL) Fall back to default BPF prog with 5 u64 arguments */ - nr_args = t ? btf_type_vlen(t) : 5; + /* if (t == NULL) Fall back to default BPF prog with + * MAX_BPF_FUNC_REG_ARGS u64 arguments. + */ + nr_args = t ? btf_type_vlen(t) : MAX_BPF_FUNC_REG_ARGS; if (prog->aux->attach_btf_trace) { /* skip first 'void *__data' argument in btf_trace_##name typedef */ args++; @@ -4649,7 +4651,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, } } else { if (!t) - /* Default prog with 5 args */ + /* Default prog with MAX_BPF_FUNC_REG_ARGS args */ return true; t = btf_type_by_id(btf, args[arg].type); } @@ -5102,12 +5104,12 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
if (!func) { /* BTF function prototype doesn't match the verifier types. - * Fall back to 5 u64 args. + * Fall back to MAX_BPF_FUNC_REG_ARGS u64 args. */ - for (i = 0; i < 5; i++) + for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) m->arg_size[i] = 8; m->ret_size = 8; - m->nr_args = 5; + m->nr_args = MAX_BPF_FUNC_REG_ARGS; return 0; } args = (const struct btf_param *)(func + 1); @@ -5342,8 +5344,9 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, } args = (const struct btf_param *)(t + 1); nargs = btf_type_vlen(t); - if (nargs > 5) { - bpf_log(log, "Function %s has %d > 5 args\n", tname, nargs); + if (nargs > MAX_BPF_FUNC_REG_ARGS) { + bpf_log(log, "Function %s has %d > %d args\n", tname, nargs, + MAX_BPF_FUNC_REG_ARGS); goto out; }
@@ -5472,9 +5475,9 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, } args = (const struct btf_param *)(t + 1); nargs = btf_type_vlen(t); - if (nargs > 5) { - bpf_log(log, "Global function %s() with %d > 5 args. Buggy compiler.\n", - tname, nargs); + if (nargs > MAX_BPF_FUNC_REG_ARGS) { + bpf_log(log, "Global function %s() with %d > %d args. Buggy compiler.\n", + tname, nargs, MAX_BPF_FUNC_REG_ARGS); return -EINVAL; } /* check that function returns int */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f3c20e763156..a0c3205760aa 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5531,7 +5531,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
meta.func_id = func_id; /* check args */ - for (i = 0; i < 5; i++) { + for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) { err = check_func_arg(env, i, &meta, fn); if (err) return err;
From: Brendan Jackman jackmanb@google.com
mainline inclusion from mainline-5.13-rc1 commit e6ac593372aadacc14e02b198e4a1acfef1db595 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This function has become overloaded, it actually does lots of diverse things in a single pass. Rename it to avoid confusion, and add some concise commentary.
Signed-off-by: Brendan Jackman jackmanb@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210217104509.2423183-1-jackmanb@google.com (cherry picked from commit e6ac593372aadacc14e02b198e4a1acfef1db595) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a0c3205760aa..d78c809926f7 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5871,7 +5871,7 @@ static int update_alu_sanitation_state(struct bpf_insn_aux_data *aux, aux->alu_limit != alu_limit)) return REASON_PATHS;
- /* Corresponding fixup done in fixup_bpf_calls(). */ + /* Corresponding fixup done in do_misc_fixups(). */ aux->alu_state = alu_state; aux->alu_limit = alu_limit; return 0; @@ -11576,12 +11576,10 @@ static int fixup_call_args(struct bpf_verifier_env *env) return err; }
-/* fixup insn->imm field of bpf_call instructions - * and inline eligible helpers as explicit sequence of BPF instructions - * - * this function is called after eBPF program passed verification +/* Do various post-verification rewrites in a single program pass. + * These rewrites simplify JIT and interpreter implementations. */ -static int fixup_bpf_calls(struct bpf_verifier_env *env) +static int do_misc_fixups(struct bpf_verifier_env *env) { struct bpf_prog *prog = env->prog; bool expect_blinding = bpf_jit_blinding_enabled(prog); @@ -11596,6 +11594,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) int i, ret, cnt, delta = 0;
for (i = 0; i < insn_cnt; i++, insn++) { + /* Make divide-by-zero exceptions impossible. */ if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || insn->code == (BPF_ALU | BPF_MOD | BPF_X) || @@ -11636,6 +11635,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) continue; }
+ /* Implement LD_ABS and LD_IND with a rewrite, if supported by the program type. */ if (BPF_CLASS(insn->code) == BPF_LD && (BPF_MODE(insn->code) == BPF_ABS || BPF_MODE(insn->code) == BPF_IND)) { @@ -11655,6 +11655,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) continue; }
+ /* Rewrite pointer arithmetic to mitigate speculation attacks. */ if (insn->code == (BPF_ALU64 | BPF_ADD | BPF_X) || insn->code == (BPF_ALU64 | BPF_SUB | BPF_X)) { const u8 code_add = BPF_ALU64 | BPF_ADD | BPF_X; @@ -11877,6 +11878,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) goto patch_call_imm; }
+ /* Implement bpf_jiffies64 inline. */ if (prog->jit_requested && BITS_PER_LONG == 64 && insn->imm == BPF_FUNC_jiffies64) { struct bpf_insn ld_jiffies_addr[2] = { @@ -12683,7 +12685,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, ret = convert_ctx_accesses(env);
if (ret == 0) - ret = fixup_bpf_calls(env); + ret = do_misc_fixups(env);
/* do 32-bit optimization after insn patching has done so those patched * insns could be handled correctly.
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit efdb22de7dcd3b24b8154b3c3ba496f62afea00c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
During verifier check_cfg(), all instructions are visited to ensure verifier can handle program control flows. This patch factored out function visit_func_call_insn() so it can be reused in later patch to visit callback function calls. There is no functionality change for this patch.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204920.3884136-1-yhs@fb.com (cherry picked from commit efdb22de7dcd3b24b8154b3c3ba496f62afea00c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index d78c809926f7..5e426656ddac 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8681,6 +8681,27 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, return DONE_EXPLORING; }
+static int visit_func_call_insn(int t, int insn_cnt, + struct bpf_insn *insns, + struct bpf_verifier_env *env, + bool visit_callee) +{ + int ret; + + ret = push_insn(t, t + 1, FALLTHROUGH, env, false); + if (ret) + return ret; + + if (t + 1 < insn_cnt) + init_explored_state(env, t + 1); + if (visit_callee) { + init_explored_state(env, t); + ret = push_insn(t, t + insns[t].imm + 1, BRANCH, + env, false); + } + return ret; +} + /* Visits the instruction at index t and returns one of the following: * < 0 - an error occurred * DONE_EXPLORING - the instruction was fully explored @@ -8701,18 +8722,8 @@ static int visit_insn(int t, int insn_cnt, struct bpf_verifier_env *env) return DONE_EXPLORING;
case BPF_CALL: - ret = push_insn(t, t + 1, FALLTHROUGH, env, false); - if (ret) - return ret; - - if (t + 1 < insn_cnt) - init_explored_state(env, t + 1); - if (insns[t].src_reg == BPF_PSEUDO_CALL) { - init_explored_state(env, t); - ret = push_insn(t, t + insns[t].imm + 1, BRANCH, - env, false); - } - return ret; + return visit_func_call_insn(t, insn_cnt, insns, env, + insns[t].src_reg == BPF_PSEUDO_CALL);
case BPF_JA: if (BPF_SRC(insns[t].code) != BPF_K)
From: Ilya Leoshkevich iii@linux.ibm.com
mainline inclusion from mainline-5.13-rc1 commit 86fd166575c38c17ecd3a6b8fb9c64fa19871486 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Building selftests in a separate directory like this:
make O="$BUILD" -C tools/testing/selftests/bpf
and then running:
cd "$BUILD" && ./test_progs -t btf
causes all the non-flavored btf_dump_test_case_*.c tests to fail, because these files are not copied to where test_progs expects to find them.
Fix by not skipping EXT-COPY when the original $(OUTPUT) is not empty (lib.mk sets it to $(shell pwd) in that case) and using rsync instead of cp: cp fails because e.g. urandom_read is being copied into itself, and rsync simply skips such cases. rsync is already used by kselftests and therefore is not a new dependency.
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210224111445.102342-1-iii@linux.ibm.com (cherry picked from commit 86fd166575c38c17ecd3a6b8fb9c64fa19871486) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index c654cbc1f622..cd0e436be9d9 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -387,11 +387,12 @@ $(TRUNNER_EXTRA_OBJS): $(TRUNNER_OUTPUT)/%.o: \ $$(call msg,EXT-OBJ,$(TRUNNER_BINARY),$$@) $(Q)$$(CC) $$(CFLAGS) -c $$< $$(LDLIBS) -o $$@
-# only copy extra resources if in flavored build +# non-flavored in-srctree builds receive special treatment, in particular, we +# do not need to copy extra resources (see e.g. test_btf_dump_case()) $(TRUNNER_BINARY)-extras: $(TRUNNER_EXTRA_FILES) | $(TRUNNER_OUTPUT) -ifneq ($2,) +ifneq ($2:$(OUTPUT),:$(shell pwd)) $$(call msg,EXT-COPY,$(TRUNNER_BINARY),$(TRUNNER_EXTRA_FILES)) - $(Q)cp -a $$^ $(TRUNNER_OUTPUT)/ + $(Q)rsync -aq $$^ $(TRUNNER_OUTPUT)/ endif
$(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS) \
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit bc2591d63fc91bd5a9aaff145148a224d892bdb8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Factor out the function verbose_invalid_scalar() to verbose print if a scalar is not in a tnum range. There is no functionality change and the function will be used by later patch which introduced bpf_for_each_map_elem().
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204922.3884375-1-yhs@fb.com (cherry picked from commit bc2591d63fc91bd5a9aaff145148a224d892bdb8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 5e426656ddac..1a6c7b1c3b23 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -391,6 +391,24 @@ __printf(3, 4) static void verbose_linfo(struct bpf_verifier_env *env, env->prev_linfo = linfo; }
+static void verbose_invalid_scalar(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, + struct tnum *range, const char *ctx, + const char *reg_name) +{ + char tn_buf[48]; + + verbose(env, "At %s the register %s ", ctx, reg_name); + if (!tnum_is_unknown(reg->var_off)) { + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); + verbose(env, "has value %s", tn_buf); + } else { + verbose(env, "has unknown scalar value"); + } + tnum_strn(tn_buf, sizeof(tn_buf), *range); + verbose(env, " should have been in %s\n", tn_buf); +} + static bool type_is_pkt_pointer(enum bpf_reg_type type) { return type == PTR_TO_PACKET || @@ -8544,17 +8562,7 @@ static int check_return_code(struct bpf_verifier_env *env) }
if (!tnum_in(range, reg->var_off)) { - char tn_buf[48]; - - verbose(env, "At program exit the register R0 "); - if (!tnum_is_unknown(reg->var_off)) { - tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); - verbose(env, "has value %s", tn_buf); - } else { - verbose(env, "has unknown scalar value"); - } - tnum_strn(tn_buf, sizeof(tn_buf), range); - verbose(env, " should have been in %s\n", tn_buf); + verbose_invalid_scalar(env, reg, &range, "program exit", "R0"); return -EINVAL; }
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit 1435137573f9c75455903e8cd01f84d6e092ea16 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Later proposed bpf_for_each_map_elem() helper has callback function as one of its arguments. This patch refactored check_func_call() to permit callback function which sets callee state. Different callback functions may have different callee states. There is no functionality change for this patch.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204923.3884627-1-yhs@fb.com (cherry picked from commit 1435137573f9c75455903e8cd01f84d6e092ea16) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 60 +++++++++++++++++++++++++++++++------------ 1 file changed, 43 insertions(+), 17 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 1a6c7b1c3b23..9de48d714239 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5234,13 +5234,19 @@ static void clear_caller_saved_regs(struct bpf_verifier_env *env, } }
-static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, - int *insn_idx) +typedef int (*set_callee_state_fn)(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee, + int insn_idx); + +static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, + int *insn_idx, int subprog, + set_callee_state_fn set_callee_state_cb) { struct bpf_verifier_state *state = env->cur_state; struct bpf_func_info_aux *func_info_aux; struct bpf_func_state *caller, *callee; - int i, err, subprog, target_insn; + int err; bool is_global = false;
if (state->curframe + 1 >= MAX_CALL_FRAMES) { @@ -5249,14 +5255,6 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, return -E2BIG; }
- target_insn = *insn_idx + insn->imm; - subprog = find_subprog(env, target_insn + 1); - if (subprog < 0) { - verbose(env, "verifier bug. No program starts at insn %d\n", - target_insn + 1); - return -EFAULT; - } - caller = state->frame[state->curframe]; if (state->frame[state->curframe + 1]) { verbose(env, "verifier bug. Frame %d already allocated\n", @@ -5311,11 +5309,9 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, if (err) return err;
- /* copy r1 - r5 args that callee can access. The copy includes parent - * pointers, which connects us up to the liveness chain - */ - for (i = BPF_REG_1; i <= BPF_REG_5; i++) - callee->regs[i] = caller->regs[i]; + err = set_callee_state_cb(env, caller, callee, *insn_idx); + if (err) + return err;
clear_caller_saved_regs(env, caller->regs);
@@ -5323,7 +5319,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, state->curframe++;
/* and go analyze first insn of the callee */ - *insn_idx = target_insn; + *insn_idx = env->subprog_info[subprog].start - 1;
if (env->log.level & BPF_LOG_LEVEL) { verbose(env, "caller:\n"); @@ -5334,6 +5330,36 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, return 0; }
+static int set_callee_state(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee, int insn_idx) +{ + int i; + + /* copy r1 - r5 args that callee can access. The copy includes parent + * pointers, which connects us up to the liveness chain + */ + for (i = BPF_REG_1; i <= BPF_REG_5; i++) + callee->regs[i] = caller->regs[i]; + return 0; +} + +static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, + int *insn_idx) +{ + int subprog, target_insn; + + target_insn = *insn_idx + insn->imm + 1; + subprog = find_subprog(env, target_insn); + if (subprog < 0) { + verbose(env, "verifier bug. No program starts at insn %d\n", + target_insn); + return -EFAULT; + } + + return __check_func_call(env, insn, insn_idx, subprog, set_callee_state); +} + static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) { struct bpf_verifier_state *state = env->cur_state;
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit 282a0f46d6cda7cf843cd77c9b53b4d1d9e31302 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Currently, verifier function add_subprog() returns 0 for success and negative value for failure. Change the return value to be the subprog number for success. This functionality will be used in the next patch to save a call to find_subprog().
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204924.3884848-1-yhs@fb.com (cherry picked from commit 282a0f46d6cda7cf843cd77c9b53b4d1d9e31302) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/verifier.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9de48d714239..ae9b0c99e681 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1513,7 +1513,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off) } ret = find_subprog(env, off); if (ret >= 0) - return 0; + return ret; if (env->subprog_cnt >= BPF_MAX_SUBPROGS) { verbose(env, "too many subprograms\n"); return -E2BIG; @@ -1521,7 +1521,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off) env->subprog_info[env->subprog_cnt++].start = off; sort(env->subprog_info, env->subprog_cnt, sizeof(env->subprog_info[0]), cmp_subprogs, NULL); - return 0; + return env->subprog_cnt - 1; }
static int check_subprogs(struct bpf_verifier_env *env)
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit 69c087ba6225b574afb6e505b72cb75242a3d844 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The bpf_for_each_map_elem() helper is introduced which iterates all map elements with a callback function. The helper signature looks like long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags) and for each map element, the callback_fn will be called. For example, like hashmap, the callback signature may look like long callback_fn(map, key, val, callback_ctx)
There are two known use cases for this. One is from upstream ([1]) where a for_each_map_elem helper may help implement a timeout mechanism in a more generic way. Another is from our internal discussion for a firewall use case where a map contains all the rules. The packet data can be compared to all these rules to decide allow or deny the packet.
For array maps, users can already use a bounded loop to traverse elements. Using this helper can avoid using bounded loop. For other type of maps (e.g., hash maps) where bounded loop is hard or impossible to use, this helper provides a convenient way to operate on all elements.
For callback_fn, besides map and map element, a callback_ctx, allocated on caller stack, is also passed to the callback function. This callback_ctx argument can provide additional input and allow to write to caller stack for output.
If the callback_fn returns 0, the helper will iterate through next element if available. If the callback_fn returns 1, the helper will stop iterating and returns to the bpf program. Other return values are not used for now.
Currently, this helper is only available with jit. It is possible to make it work with interpreter with so effort but I leave it as the future work.
[1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com...
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com (cherry picked from commit 69c087ba6225b574afb6e505b72cb75242a3d844) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf.h kernel/bpf/verifier.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- drivers/net/ethernet/netronome/nfp/bpf/fw.h | 4 +- include/linux/bpf.h | 13 ++ include/linux/bpf_verifier.h | 3 + include/uapi/linux/bpf.h | 39 ++++ kernel/bpf/bpf_iter.c | 16 ++ kernel/bpf/helpers.c | 2 + kernel/bpf/verifier.c | 191 ++++++++++++++++++-- kernel/trace/bpf_trace.c | 2 + tools/include/uapi/linux/bpf.h | 39 ++++ 9 files changed, 295 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/fw.h b/drivers/net/ethernet/netronome/nfp/bpf/fw.h index 33f9058ed32e..4268a7e0f344 100644 --- a/drivers/net/ethernet/netronome/nfp/bpf/fw.h +++ b/drivers/net/ethernet/netronome/nfp/bpf/fw.h @@ -13,8 +13,8 @@ */ #define NFP_BPF_SCALAR_VALUE 1 #define NFP_BPF_MAP_VALUE 4 -#define NFP_BPF_STACK 5 -#define NFP_BPF_PACKET_DATA 7 +#define NFP_BPF_STACK 6 +#define NFP_BPF_PACKET_DATA 8
enum bpf_cap_tlv_type { NFP_BPF_CAP_TYPE_FUNC = 1, diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f532d9f61d26..dd2f794dd04e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -37,6 +37,7 @@ struct bpf_iter_aux_info; struct bpf_local_storage; struct bpf_local_storage_map; struct kobject; +struct bpf_func_state;
extern struct idr btf_idr; extern spinlock_t btf_idr_lock; @@ -127,6 +128,13 @@ struct bpf_map_ops { bool (*map_meta_equal)(const struct bpf_map *meta0, const struct bpf_map *meta1);
+ + int (*map_set_for_each_callback_args)(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee); + int (*map_for_each_callback)(struct bpf_map *map, void *callback_fn, + void *callback_ctx, u64 flags); + /* BTF name and id of struct allocated by map_alloc */ const char * const map_btf_name; int *map_btf_id; @@ -324,6 +332,8 @@ enum bpf_arg_type { ARG_CONST_ALLOC_SIZE_OR_ZERO, /* number of allocated bytes requested */ ARG_PTR_TO_BTF_ID_SOCK_COMMON, /* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */ ARG_PTR_TO_PERCPU_BTF_ID, /* pointer to in-kernel percpu type */ + ARG_PTR_TO_FUNC, /* pointer to a bpf program function */ + ARG_PTR_TO_STACK_OR_NULL, /* pointer to stack or NULL */ __BPF_ARG_TYPE_MAX,
/* Extended arg_types. */ @@ -428,6 +438,7 @@ enum bpf_reg_type { PTR_TO_CTX, /* reg points to bpf_context */ CONST_PTR_TO_MAP, /* reg points to struct bpf_map */ PTR_TO_MAP_VALUE, /* reg points to map element value */ + PTR_TO_MAP_KEY, /* reg points to a map element key */ PTR_TO_STACK, /* reg == frame_pointer + offset */ PTR_TO_PACKET_META, /* skb->data - meta_len */ PTR_TO_PACKET, /* reg points to skb->data */ @@ -455,6 +466,7 @@ enum bpf_reg_type { */ PTR_TO_MEM, /* reg points to valid memory region */ PTR_TO_BUF, /* reg points to a read/write buffer */ + PTR_TO_FUNC, /* reg points to a bpf program function */ PTR_TO_PERCPU_BTF_ID, /* reg points to a percpu kernel variable */ __BPF_REG_TYPE_MAX,
@@ -1994,6 +2006,7 @@ extern const struct bpf_func_proto bpf_copy_from_user_proto; extern const struct bpf_func_proto bpf_snprintf_btf_proto; extern const struct bpf_func_proto bpf_per_cpu_ptr_proto; extern const struct bpf_func_proto bpf_this_cpu_ptr_proto; +extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index bea28c220829..ef49a22433b1 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -70,6 +70,8 @@ struct bpf_reg_state { unsigned long raw1; unsigned long raw2; } raw; + + u32 subprogno; /* for PTR_TO_FUNC */ }; /* For PTR_TO_PACKET, used to find other pointers with the same variable * offset, so they can share range knowledge. @@ -206,6 +208,7 @@ struct bpf_func_state { int acquired_refs; struct bpf_reference_state *refs; int allocated_stack; + bool in_callback_fn; struct bpf_stack_state *stack; };
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 601f24eff2b6..e51f97b72f65 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -394,6 +394,15 @@ enum bpf_link_type { * is struct/union. */ #define BPF_PSEUDO_BTF_ID 3 +/* insn[0].src_reg: BPF_PSEUDO_FUNC + * insn[0].imm: insn offset to the func + * insn[1].imm: 0 + * insn[0].off: 0 + * insn[1].off: 0 + * ldimm64 rewrite: address of the function + * verifier type: PTR_TO_FUNC. + */ +#define BPF_PSEUDO_FUNC 4
/* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative * offset to another bpf function @@ -3788,6 +3797,35 @@ union bpf_attr { * to be enabled. * Return * 1 if the sched entity belongs to a cgroup, 0 otherwise. + * + * long bpf_for_each_map_elem(struct bpf_map *map, void *callback_fn, void *callback_ctx, u64 flags) + * Description + * For each element in **map**, call **callback_fn** function with + * **map**, **callback_ctx** and other map-specific parameters. + * The **callback_fn** should be a static function and + * the **callback_ctx** should be a pointer to the stack. + * The **flags** is used to control certain aspects of the helper. + * Currently, the **flags** must be 0. + * + * The following are a list of supported map types and their + * respective expected callback signatures: + * + * BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH, + * BPF_MAP_TYPE_LRU_HASH, BPF_MAP_TYPE_LRU_PERCPU_HASH, + * BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY + * + * long (*callback_fn)(struct bpf_map *map, const void *key, void *value, void *ctx); + * + * For per_cpu maps, the map_value is the value on the cpu where the + * bpf_prog is running. + * + * If **callback_fn** return 0, the helper will continue to the next + * element. If return value is 1, the helper will skip the rest of + * elements and return. Other return values are not used now. + * + * Return + * The number of traversed map elements for success, **-EINVAL** for + * invalid **flags**. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3951,6 +3989,7 @@ union bpf_attr { FN(sched_entity_to_tgidpid), \ FN(sched_entity_to_cgrpid), \ FN(sched_entity_belongs_to_cgrp), \ + FN(for_each_map_elem), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index a0d9eade9c80..931870f9cf56 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -675,3 +675,19 @@ int bpf_iter_run_prog(struct bpf_prog *prog, void *ctx) */ return ret == 0 ? 0 : -EAGAIN; } + +BPF_CALL_4(bpf_for_each_map_elem, struct bpf_map *, map, void *, callback_fn, + void *, callback_ctx, u64, flags) +{ + return map->ops->map_for_each_callback(map, callback_fn, callback_ctx, flags); +} + +const struct bpf_func_proto bpf_for_each_map_elem_proto = { + .func = bpf_for_each_map_elem, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_CONST_MAP_PTR, + .arg2_type = ARG_PTR_TO_FUNC, + .arg3_type = ARG_PTR_TO_STACK_OR_NULL, + .arg4_type = ARG_ANYTHING, +}; diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 4bb5921a7d21..9ff81ff5bf9b 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -697,6 +697,8 @@ bpf_base_func_proto(enum bpf_func_id func_id) return &bpf_ringbuf_discard_proto; case BPF_FUNC_ringbuf_query: return &bpf_ringbuf_query_proto; + case BPF_FUNC_for_each_map_elem: + return &bpf_for_each_map_elem_proto; default: break; } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ae9b0c99e681..f63a736a645a 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -235,6 +235,12 @@ static bool bpf_pseudo_call(const struct bpf_insn *insn) insn->src_reg == BPF_PSEUDO_CALL; }
+static bool bpf_pseudo_func(const struct bpf_insn *insn) +{ + return insn->code == (BPF_LD | BPF_IMM | BPF_DW) && + insn->src_reg == BPF_PSEUDO_FUNC; +} + struct bpf_call_arg_meta { struct bpf_map *map_ptr; bool raw_mode; @@ -249,6 +255,7 @@ struct bpf_call_arg_meta { u32 btf_id; struct btf *ret_btf; u32 ret_btf_id; + u32 subprogno; };
struct btf *btf_vmlinux; @@ -428,6 +435,7 @@ static bool reg_type_not_null(enum bpf_reg_type type) return type == PTR_TO_SOCKET || type == PTR_TO_TCP_SOCK || type == PTR_TO_MAP_VALUE || + type == PTR_TO_MAP_KEY || type == PTR_TO_SOCK_COMMON; }
@@ -538,6 +546,8 @@ static const char *reg_type_str(struct bpf_verifier_env *env, [PTR_TO_PERCPU_BTF_ID] = "percpu_ptr_", [PTR_TO_MEM] = "mem", [PTR_TO_BUF] = "buf", + [PTR_TO_FUNC] = "func", + [PTR_TO_MAP_KEY] = "map_key", };
if (type & PTR_MAYBE_NULL) { @@ -626,6 +636,7 @@ static void print_verifier_state(struct bpf_verifier_env *env, if (type_is_pkt_pointer(t)) verbose(env, ",r=%d", reg->range); else if (base_type(t) == CONST_PTR_TO_MAP || + base_type(t) == PTR_TO_MAP_KEY || base_type(t) == PTR_TO_MAP_VALUE) verbose(env, ",ks=%d,vs=%d", reg->map_ptr->key_size, @@ -1538,6 +1549,19 @@ static int check_subprogs(struct bpf_verifier_env *env)
/* determine subprog starts. The end is one before the next starts */ for (i = 0; i < insn_cnt; i++) { + if (bpf_pseudo_func(insn + i)) { + if (!env->bpf_capable) { + verbose(env, + "function pointers are allowed for CAP_BPF and CAP_SYS_ADMIN\n"); + return -EPERM; + } + ret = add_subprog(env, i + insn[i].imm + 1); + if (ret < 0) + return ret; + /* remember subprog */ + insn[i + 1].imm = ret; + continue; + } if (!bpf_pseudo_call(insn + i)) continue; if (!env->bpf_capable) { @@ -2260,6 +2284,8 @@ static bool is_spillable_regtype(enum bpf_reg_type type) case PTR_TO_BUF: case PTR_TO_PERCPU_BTF_ID: case PTR_TO_MEM: + case PTR_TO_FUNC: + case PTR_TO_MAP_KEY: return true; default: return false; @@ -2840,6 +2866,10 @@ static int __check_mem_access(struct bpf_verifier_env *env, int regno,
reg = &cur_regs(env)[regno]; switch (reg->type) { + case PTR_TO_MAP_KEY: + verbose(env, "invalid access to map key, key_size=%d off=%d size=%d\n", + mem_size, off, size); + break; case PTR_TO_MAP_VALUE: verbose(env, "invalid access to map value, value_size=%d off=%d size=%d\n", mem_size, off, size); @@ -3243,6 +3273,9 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, case PTR_TO_FLOW_KEYS: pointer_desc = "flow keys "; break; + case PTR_TO_MAP_KEY: + pointer_desc = "key "; + break; case PTR_TO_MAP_VALUE: pointer_desc = "value "; break; @@ -3344,7 +3377,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) continue_func: subprog_end = subprog[idx + 1].start; for (; i < subprog_end; i++) { - if (!bpf_pseudo_call(insn + i)) + if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i)) continue; /* remember insn and function to return to */ ret_insn[frame] = i + 1; @@ -3806,7 +3839,19 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn /* for access checks, reg->off is just part of off */ off += reg->off;
- if (reg->type == PTR_TO_MAP_VALUE) { + if (reg->type == PTR_TO_MAP_KEY) { + if (t == BPF_WRITE) { + verbose(env, "write to change key R%d not allowed\n", regno); + return -EACCES; + } + + err = check_mem_region_access(env, regno, off, size, + reg->map_ptr->key_size, false); + if (err) + return err; + if (value_regno >= 0) + mark_reg_unknown(env, regs, value_regno); + } else if (reg->type == PTR_TO_MAP_VALUE) { if (t == BPF_WRITE && value_regno >= 0 && is_pointer_value(env, value_regno)) { verbose(env, "R%d leaks addr into map\n", value_regno); @@ -4239,6 +4284,9 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, case PTR_TO_PACKET_META: return check_packet_access(env, regno, reg->off, access_size, zero_size_allowed); + case PTR_TO_MAP_KEY: + return check_mem_region_access(env, regno, reg->off, access_size, + reg->map_ptr->key_size, false); case PTR_TO_MAP_VALUE: if (check_map_access_type(env, regno, reg->off, access_size, meta && meta->raw_mode ? BPF_WRITE : @@ -4456,6 +4504,7 @@ static const struct bpf_reg_types map_key_value_types = { PTR_TO_STACK, PTR_TO_PACKET, PTR_TO_PACKET_META, + PTR_TO_MAP_KEY, PTR_TO_MAP_VALUE, }, }; @@ -4487,6 +4536,7 @@ static const struct bpf_reg_types mem_types = { PTR_TO_STACK, PTR_TO_PACKET, PTR_TO_PACKET_META, + PTR_TO_MAP_KEY, PTR_TO_MAP_VALUE, PTR_TO_MEM, PTR_TO_MEM | MEM_ALLOC, @@ -4499,6 +4549,7 @@ static const struct bpf_reg_types int_ptr_types = { PTR_TO_STACK, PTR_TO_PACKET, PTR_TO_PACKET_META, + PTR_TO_MAP_KEY, PTR_TO_MAP_VALUE, }, }; @@ -4511,6 +4562,8 @@ static const struct bpf_reg_types const_map_ptr_types = { .types = { CONST_PTR_T static const struct bpf_reg_types btf_ptr_types = { .types = { PTR_TO_BTF_ID } }; static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALUE } }; static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_PERCPU_BTF_ID } }; +static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } }; +static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } };
static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_MAP_KEY] = &map_key_value_types, @@ -4534,6 +4587,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_INT] = &int_ptr_types, [ARG_PTR_TO_LONG] = &int_ptr_types, [ARG_PTR_TO_PERCPU_BTF_ID] = &percpu_btf_ptr_types, + [ARG_PTR_TO_FUNC] = &func_ptr_types, + [ARG_PTR_TO_STACK_OR_NULL] = &stack_ptr_types, };
static int check_reg_type(struct bpf_verifier_env *env, u32 regno, @@ -4746,6 +4801,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, verbose(env, "verifier internal error\n"); return -EFAULT; } + } else if (arg_type == ARG_PTR_TO_FUNC) { + meta->subprogno = reg->subprogno; } else if (arg_type_is_mem_ptr(arg_type)) { /* The access to this pointer is only checked when we hit the * next is_mem_size argument below. @@ -5360,6 +5417,35 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, return __check_func_call(env, insn, insn_idx, subprog, set_callee_state); }
+static int set_map_elem_callback_state(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee, + int insn_idx) +{ + struct bpf_insn_aux_data *insn_aux = &env->insn_aux_data[insn_idx]; + struct bpf_map *map; + int err; + + if (bpf_map_ptr_poisoned(insn_aux)) { + verbose(env, "tail_call abusing map_ptr\n"); + return -EINVAL; + } + + map = BPF_MAP_PTR(insn_aux->map_ptr_state); + if (!map->ops->map_set_for_each_callback_args || + !map->ops->map_for_each_callback) { + verbose(env, "callback function not allowed for map\n"); + return -ENOTSUPP; + } + + err = map->ops->map_set_for_each_callback_args(env, caller, callee); + if (err) + return err; + + callee->in_callback_fn = true; + return 0; +} + static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) { struct bpf_verifier_state *state = env->cur_state; @@ -5382,8 +5468,22 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
state->curframe--; caller = state->frame[state->curframe]; - /* return to the caller whatever r0 had in the callee */ - caller->regs[BPF_REG_0] = *r0; + if (callee->in_callback_fn) { + /* enforce R0 return value range [0, 1]. */ + struct tnum range = tnum_range(0, 1); + + if (r0->type != SCALAR_VALUE) { + verbose(env, "R0 not a scalar value\n"); + return -EACCES; + } + if (!tnum_in(range, r0->var_off)) { + verbose_invalid_scalar(env, r0, &range, "callback return", "R0"); + return -EINVAL; + } + } else { + /* return to the caller whatever r0 had in the callee */ + caller->regs[BPF_REG_0] = *r0; + }
/* Transfer references to the caller */ err = transfer_reference_state(caller, callee); @@ -5438,7 +5538,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, func_id != BPF_FUNC_map_delete_elem && func_id != BPF_FUNC_map_push_elem && func_id != BPF_FUNC_map_pop_elem && - func_id != BPF_FUNC_map_peek_elem) + func_id != BPF_FUNC_map_peek_elem && + func_id != BPF_FUNC_for_each_map_elem) return 0;
if (map == NULL) { @@ -5519,17 +5620,20 @@ static int check_reference_leak(struct bpf_verifier_env *env) return state->acquired_refs ? -EINVAL : 0; }
-static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn_idx) +static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn, + int *insn_idx_p) { const struct bpf_func_proto *fn = NULL; enum bpf_return_type ret_type; enum bpf_type_flag ret_flag; struct bpf_reg_state *regs; struct bpf_call_arg_meta meta; + int insn_idx = *insn_idx_p; bool changes_data; - int i, err; + int i, err, func_id;
/* find function prototype */ + func_id = insn->imm; if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) { verbose(env, "invalid func %s#%d\n", func_id_name(func_id), func_id); @@ -5625,6 +5729,13 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn return -EINVAL; }
+ if (func_id == BPF_FUNC_for_each_map_elem) { + err = __check_func_call(env, insn, insn_idx_p, meta.subprogno, + set_map_elem_callback_state); + if (err < 0) + return -EINVAL; + } + /* reset caller saved regs */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, regs, caller_saved[i]); @@ -8364,6 +8475,24 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) return 0; }
+ if (insn->src_reg == BPF_PSEUDO_FUNC) { + struct bpf_prog_aux *aux = env->prog->aux; + u32 subprogno = insn[1].imm; + + if (!aux->func_info) { + verbose(env, "missing btf func_info\n"); + return -EINVAL; + } + if (aux->func_info_aux[subprogno].linkage != BTF_FUNC_STATIC) { + verbose(env, "callback function not static\n"); + return -EINVAL; + } + + dst_reg->type = PTR_TO_FUNC; + dst_reg->subprogno = subprogno; + return 0; + } + map = env->used_maps[aux->map_index]; dst_reg->map_ptr = map;
@@ -8746,6 +8875,9 @@ static int visit_insn(int t, int insn_cnt, struct bpf_verifier_env *env) struct bpf_insn *insns = env->prog->insnsi; int ret;
+ if (bpf_pseudo_func(insns + t)) + return visit_func_call_insn(t, insn_cnt, insns, env, true); + /* All non-branch instructions have a single fall-through edge. */ if (BPF_CLASS(insns[t].code) != BPF_JMP && BPF_CLASS(insns[t].code) != BPF_JMP32) @@ -9367,6 +9499,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, */ return false; } + case PTR_TO_MAP_KEY: case PTR_TO_MAP_VALUE: /* a PTR_TO_MAP_VALUE could be safe to use as a * PTR_TO_MAP_VALUE_OR_NULL into the same map. @@ -10198,10 +10331,9 @@ static int do_check(struct bpf_verifier_env *env) if (insn->src_reg == BPF_PSEUDO_CALL) err = check_func_call(env, insn, &env->insn_idx); else - err = check_helper_call(env, insn->imm, env->insn_idx); + err = check_helper_call(env, insn, &env->insn_idx); if (err) return err; - } else if (opcode == BPF_JA) { if (BPF_SRC(insn->code) != BPF_K || insn->imm != 0 || @@ -10618,6 +10750,12 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) goto next_insn; }
+ if (insn[0].src_reg == BPF_PSEUDO_FUNC) { + aux = &env->insn_aux_data[i]; + aux->ptr_type = PTR_TO_FUNC; + goto next_insn; + } + /* In final convert_pseudo_ld_imm64() step, this is * converted into regular 64-bit imm load insn. */ @@ -10750,9 +10888,13 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env) int insn_cnt = env->prog->len; int i;
- for (i = 0; i < insn_cnt; i++, insn++) - if (insn->code == (BPF_LD | BPF_IMM | BPF_DW)) - insn->src_reg = 0; + for (i = 0; i < insn_cnt; i++, insn++) { + if (insn->code != (BPF_LD | BPF_IMM | BPF_DW)) + continue; + if (insn->src_reg == BPF_PSEUDO_FUNC) + continue; + insn->src_reg = 0; + } }
/* single env->prog->insni[off] instruction was replaced with the range @@ -11389,6 +11531,12 @@ static int jit_subprogs(struct bpf_verifier_env *env) return 0;
for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) { + if (bpf_pseudo_func(insn)) { + env->insn_aux_data[i].call_imm = insn->imm; + /* subprog is encoded in insn[1].imm */ + continue; + } + if (!bpf_pseudo_call(insn)) continue; /* Upon error here we cannot fall back to interpreter but @@ -11492,6 +11640,12 @@ static int jit_subprogs(struct bpf_verifier_env *env) for (i = 0; i < env->subprog_cnt; i++) { insn = func[i]->insnsi; for (j = 0; j < func[i]->len; j++, insn++) { + if (bpf_pseudo_func(insn)) { + subprog = insn[1].imm; + insn[0].imm = (u32)(long)func[subprog]->bpf_func; + insn[1].imm = ((u64)(long)func[subprog]->bpf_func) >> 32; + continue; + } if (!bpf_pseudo_call(insn)) continue; subprog = insn->off; @@ -11537,6 +11691,11 @@ static int jit_subprogs(struct bpf_verifier_env *env) * later look the same as if they were interpreted only. */ for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) { + if (bpf_pseudo_func(insn)) { + insn[0].imm = env->insn_aux_data[i].call_imm; + insn[1].imm = find_subprog(env, i + insn[0].imm + 1); + continue; + } if (!bpf_pseudo_call(insn)) continue; insn->off = env->insn_aux_data[i].call_imm; @@ -11609,6 +11768,14 @@ static int fixup_call_args(struct bpf_verifier_env *env) return -EINVAL; } for (i = 0; i < prog->len; i++, insn++) { + if (bpf_pseudo_func(insn)) { + /* When JIT fails the progs with callback calls + * have to be rejected, since interpreter doesn't support them yet. + */ + verbose(env, "callbacks are not allowed in non-JITed programs\n"); + return -EINVAL; + } + if (!bpf_pseudo_call(insn)) continue; depth = get_callee_stack_depth(env, insn, i); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 81a7b622c691..fba205086cf8 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1331,6 +1331,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_per_cpu_ptr_proto; case BPF_FUNC_this_cpu_ptr: return &bpf_this_cpu_ptr_proto; + case BPF_FUNC_for_each_map_elem: + return &bpf_for_each_map_elem_proto; default: return NULL; } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index fd065ed61ffe..d236252e1af5 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -394,6 +394,15 @@ enum bpf_link_type { * is struct/union. */ #define BPF_PSEUDO_BTF_ID 3 +/* insn[0].src_reg: BPF_PSEUDO_FUNC + * insn[0].imm: insn offset to the func + * insn[1].imm: 0 + * insn[0].off: 0 + * insn[1].off: 0 + * ldimm64 rewrite: address of the function + * verifier type: PTR_TO_FUNC. + */ +#define BPF_PSEUDO_FUNC 4
/* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative * offset to another bpf function @@ -3788,6 +3797,35 @@ union bpf_attr { * to be enabled. * Return * 1 if the sched entity belongs to a cgroup, 0 otherwise. + * + * long bpf_for_each_map_elem(struct bpf_map *map, void *callback_fn, void *callback_ctx, u64 flags) + * Description + * For each element in **map**, call **callback_fn** function with + * **map**, **callback_ctx** and other map-specific parameters. + * The **callback_fn** should be a static function and + * the **callback_ctx** should be a pointer to the stack. + * The **flags** is used to control certain aspects of the helper. + * Currently, the **flags** must be 0. + * + * The following are a list of supported map types and their + * respective expected callback signatures: + * + * BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH, + * BPF_MAP_TYPE_LRU_HASH, BPF_MAP_TYPE_LRU_PERCPU_HASH, + * BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY + * + * long (*callback_fn)(struct bpf_map *map, const void *key, void *value, void *ctx); + * + * For per_cpu maps, the map_value is the value on the cpu where the + * bpf_prog is running. + * + * If **callback_fn** return 0, the helper will continue to the next + * element. If return value is 1, the helper will skip the rest of + * elements and return. Other return values are not used now. + * + * Return + * The number of traversed map elements for success, **-EINVAL** for + * invalid **flags**. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3951,6 +3989,7 @@ union bpf_attr { FN(sched_entity_to_tgidpid), \ FN(sched_entity_to_cgrpid), \ FN(sched_entity_belongs_to_cgrp), \ + FN(for_each_map_elem), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit b8f871fa32ad392759bc70090fa8c60d9f10625c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Move function is_ldimm64() close to the beginning of libbpf.c, so it can be reused by later code and the next patch as well. There is no functionality change.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204929.3885295-1-yhs@fb.com (cherry picked from commit b8f871fa32ad392759bc70090fa8c60d9f10625c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 668103ebd5cb..6023e6c63521 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -574,6 +574,11 @@ static bool insn_is_subprog_call(const struct bpf_insn *insn) insn->off == 0; }
+static bool is_ldimm64(struct bpf_insn *insn) +{ + return insn->code == (BPF_LD | BPF_IMM | BPF_DW); +} + static int bpf_object__init_prog(struct bpf_object *obj, struct bpf_program *prog, const char *name, size_t sec_idx, const char *sec_name, @@ -3397,7 +3402,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog, return 0; }
- if (insn->code != (BPF_LD | BPF_IMM | BPF_DW)) { + if (!is_ldimm64(insn)) { pr_warn("prog '%s': invalid relo against '%s' for insns[%d].code 0x%x\n", prog->name, sym_name, insn_idx, insn->code); return -LIBBPF_ERRNO__RELOC; @@ -5621,11 +5626,6 @@ static void bpf_core_poison_insn(struct bpf_program *prog, int relo_idx, insn->imm = 195896080; /* => 0xbad2310 => "bad relo" */ }
-static bool is_ldimm64(struct bpf_insn *insn) -{ - return insn->code == (BPF_LD | BPF_IMM | BPF_DW); -} - static int insn_bpf_size_to_bytes(struct bpf_insn *insn) { switch (BPF_SIZE(insn->code)) {
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit 53eddb5e04ac5c53a4ccef9f1f900562a5a75246 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
A new relocation RELO_SUBPROG_ADDR is added to capture subprog addresses loaded with ld_imm64 insns. Such ld_imm64 insns are marked with BPF_PSEUDO_FUNC and will be passed to kernel. For bpf_for_each_map_elem() case, kernel will check that the to-be-used subprog address must be a static function and replace it with proper actual jited func address.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226204930.3885367-1-yhs@fb.com (cherry picked from commit 53eddb5e04ac5c53a4ccef9f1f900562a5a75246) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 64 ++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 61 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 6023e6c63521..69d79c318230 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -188,6 +188,7 @@ enum reloc_type { RELO_CALL, RELO_DATA, RELO_EXTERN, + RELO_SUBPROG_ADDR, };
struct reloc_desc { @@ -579,6 +580,11 @@ static bool is_ldimm64(struct bpf_insn *insn) return insn->code == (BPF_LD | BPF_IMM | BPF_DW); }
+static bool insn_is_pseudo_func(struct bpf_insn *insn) +{ + return is_ldimm64(insn) && insn->src_reg == BPF_PSEUDO_FUNC; +} + static int bpf_object__init_prog(struct bpf_object *obj, struct bpf_program *prog, const char *name, size_t sec_idx, const char *sec_name, @@ -2981,6 +2987,23 @@ static bool sym_is_extern(const GElf_Sym *sym) GELF_ST_TYPE(sym->st_info) == STT_NOTYPE; }
+static bool sym_is_subprog(const GElf_Sym *sym, int text_shndx) +{ + int bind = GELF_ST_BIND(sym->st_info); + int type = GELF_ST_TYPE(sym->st_info); + + /* in .text section */ + if (sym->st_shndx != text_shndx) + return false; + + /* local function */ + if (bind == STB_LOCAL && type == STT_SECTION) + return true; + + /* global function */ + return bind == STB_GLOBAL && type == STT_FUNC; +} + static int find_extern_btf_id(const struct btf *btf, const char *ext_name) { const struct btf_type *t; @@ -3437,6 +3460,23 @@ static int bpf_program__record_reloc(struct bpf_program *prog, return -LIBBPF_ERRNO__RELOC; }
+ /* loading subprog addresses */ + if (sym_is_subprog(sym, obj->efile.text_shndx)) { + /* global_func: sym->st_value = offset in the section, insn->imm = 0. + * local_func: sym->st_value = 0, insn->imm = offset in the section. + */ + if ((sym->st_value % BPF_INSN_SZ) || (insn->imm % BPF_INSN_SZ)) { + pr_warn("prog '%s': bad subprog addr relo against '%s' at offset %zu+%d\n", + prog->name, sym_name, (size_t)sym->st_value, insn->imm); + return -LIBBPF_ERRNO__RELOC; + } + + reloc_desc->type = RELO_SUBPROG_ADDR; + reloc_desc->insn_idx = insn_idx; + reloc_desc->sym_off = sym->st_value; + return 0; + } + type = bpf_object__section_to_libbpf_map_type(obj, shdr_idx); sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));
@@ -6227,6 +6267,10 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) } relo->processed = true; break; + case RELO_SUBPROG_ADDR: + insn[0].src_reg = BPF_PSEUDO_FUNC; + /* will be handled as a follow up pass */ + break; case RELO_CALL: /* will be handled as a follow up pass */ break; @@ -6413,11 +6457,11 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) { insn = &main_prog->insns[prog->sub_insn_off + insn_idx]; - if (!insn_is_subprog_call(insn)) + if (!insn_is_subprog_call(insn) && !insn_is_pseudo_func(insn)) continue;
relo = find_prog_insn_relo(prog, insn_idx); - if (relo && relo->type != RELO_CALL) { + if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) { pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n", prog->name, insn_idx, relo->type); return -LIBBPF_ERRNO__RELOC; @@ -6429,8 +6473,22 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog, * call always has imm = -1, but for static functions * relocation is against STT_SECTION and insn->imm * points to a start of a static function + * + * for subprog addr relocation, the relo->sym_off + insn->imm is + * the byte offset in the corresponding section. */ - sub_insn_idx = relo->sym_off / BPF_INSN_SZ + insn->imm + 1; + if (relo->type == RELO_CALL) + sub_insn_idx = relo->sym_off / BPF_INSN_SZ + insn->imm + 1; + else + sub_insn_idx = (relo->sym_off + insn->imm) / BPF_INSN_SZ; + } else if (insn_is_pseudo_func(insn)) { + /* + * RELO_SUBPROG_ADDR relo is always emitted even if both + * functions are in the same section, so it shouldn't reach here. + */ + pr_warn("prog '%s': missing subprog addr relo for insn #%zu\n", + prog->name, insn_idx); + return -LIBBPF_ERRNO__RELOC; } else { /* if subprogram call is to a static function within * the same ELF section, there won't be any relocation
From: Ilya Leoshkevich iii@linux.ibm.com
mainline inclusion from mainline-5.13-rc1 commit 8fd886911a6a99acf4a8facf619a2e7b5225be78 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a new kind value and expand the kind bitfield.
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210226202256.116518-2-iii@linux.ibm.com (cherry picked from commit 8fd886911a6a99acf4a8facf619a2e7b5225be78) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/btf.h | 5 +++-- tools/include/uapi/linux/btf.h | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index 5a667107ad2c..d27b1708efe9 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -52,7 +52,7 @@ struct btf_type { }; };
-#define BTF_INFO_KIND(info) (((info) >> 24) & 0x0f) +#define BTF_INFO_KIND(info) (((info) >> 24) & 0x1f) #define BTF_INFO_VLEN(info) ((info) & 0xffff) #define BTF_INFO_KFLAG(info) ((info) >> 31)
@@ -72,7 +72,8 @@ struct btf_type { #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ #define BTF_KIND_VAR 14 /* Variable */ #define BTF_KIND_DATASEC 15 /* Section */ -#define BTF_KIND_MAX BTF_KIND_DATASEC +#define BTF_KIND_FLOAT 16 /* Floating point */ +#define BTF_KIND_MAX BTF_KIND_FLOAT #define NR_BTF_KINDS (BTF_KIND_MAX + 1)
/* For some specific BTF_KIND, "struct btf_type" is immediately diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h index 5a667107ad2c..d27b1708efe9 100644 --- a/tools/include/uapi/linux/btf.h +++ b/tools/include/uapi/linux/btf.h @@ -52,7 +52,7 @@ struct btf_type { }; };
-#define BTF_INFO_KIND(info) (((info) >> 24) & 0x0f) +#define BTF_INFO_KIND(info) (((info) >> 24) & 0x1f) #define BTF_INFO_VLEN(info) ((info) & 0xffff) #define BTF_INFO_KFLAG(info) ((info) >> 31)
@@ -72,7 +72,8 @@ struct btf_type { #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ #define BTF_KIND_VAR 14 /* Variable */ #define BTF_KIND_DATASEC 15 /* Section */ -#define BTF_KIND_MAX BTF_KIND_DATASEC +#define BTF_KIND_FLOAT 16 /* Floating point */ +#define BTF_KIND_MAX BTF_KIND_FLOAT #define NR_BTF_KINDS (BTF_KIND_MAX + 1)
/* For some specific BTF_KIND, "struct btf_type" is immediately
From: Ilya Leoshkevich iii@linux.ibm.com
mainline inclusion from mainline-5.13-rc1 commit 1b1ce92b24331b569a444858fc487a1ca19dc778 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove trailing space.
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210226202256.116518-3-iii@linux.ibm.com (cherry picked from commit 1b1ce92b24331b569a444858fc487a1ca19dc778) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 0b47edee3c93..723d8e6f410f 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1889,7 +1889,7 @@ static int btf_add_composite(struct btf *btf, int kind, const char *name, __u32 * - *byte_sz* - size of the struct, in bytes; * * Struct initially has no fields in it. Fields can be added by - * btf__add_field() right after btf__add_struct() succeeds. + * btf__add_field() right after btf__add_struct() succeeds. * * Returns: * - >0, type ID of newly added BTF type;
From: Ilya Leoshkevich iii@linux.ibm.com
mainline inclusion from mainline-5.13-rc1 commit 22541a9eeb0d968c133aaebd95fa59da3208e705 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The logic follows that of BTF_KIND_INT most of the time. Sanitization replaces BTF_KIND_FLOATs with equally-sized empty BTF_KIND_STRUCTs on older kernels, for example, the following:
[4] FLOAT 'float' size=4
becomes the following:
[4] STRUCT '(anon)' size=4 vlen=0
With dwarves patch [1] and this patch, the older kernels, which were failing with the floating-point-related errors, will now start working correctly.
[1] https://github.com/iii-i/dwarves/commit/btf-kind-float-v2
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210226202256.116518-4-iii@linux.ibm.com (cherry picked from commit 22541a9eeb0d968c133aaebd95fa59da3208e705) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 49 +++++++++++++++++++++++++++++++++ tools/lib/bpf/btf.h | 6 ++++ tools/lib/bpf/btf_dump.c | 4 +++ tools/lib/bpf/libbpf.c | 29 ++++++++++++++++++- tools/lib/bpf/libbpf.map | 5 ++++ tools/lib/bpf/libbpf_internal.h | 2 ++ 6 files changed, 94 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 723d8e6f410f..3a944a487e7e 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -297,6 +297,7 @@ static int btf_type_size(const struct btf_type *t) case BTF_KIND_PTR: case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: + case BTF_KIND_FLOAT: return base_size; case BTF_KIND_INT: return base_size + sizeof(__u32); @@ -344,6 +345,7 @@ static int btf_bswap_type_rest(struct btf_type *t) case BTF_KIND_PTR: case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: + case BTF_KIND_FLOAT: return 0; case BTF_KIND_INT: *(__u32 *)(t + 1) = bswap_32(*(__u32 *)(t + 1)); @@ -584,6 +586,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_UNION: case BTF_KIND_ENUM: case BTF_KIND_DATASEC: + case BTF_KIND_FLOAT: size = t->size; goto done; case BTF_KIND_PTR: @@ -627,6 +630,7 @@ int btf__align_of(const struct btf *btf, __u32 id) switch (kind) { case BTF_KIND_INT: case BTF_KIND_ENUM: + case BTF_KIND_FLOAT: return min(btf_ptr_sz(btf), (size_t)t->size); case BTF_KIND_PTR: return btf_ptr_sz(btf); @@ -1762,6 +1766,47 @@ int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding return btf_commit_type(btf, sz); }
+/* + * Append new BTF_KIND_FLOAT type with: + * - *name* - non-empty, non-NULL type name; + * - *sz* - size of the type, in bytes; + * Returns: + * - >0, type ID of newly added BTF type; + * - <0, on error. + */ +int btf__add_float(struct btf *btf, const char *name, size_t byte_sz) +{ + struct btf_type *t; + int sz, name_off; + + /* non-empty name */ + if (!name || !name[0]) + return -EINVAL; + + /* byte_sz must be one of the explicitly allowed values */ + if (byte_sz != 2 && byte_sz != 4 && byte_sz != 8 && byte_sz != 12 && + byte_sz != 16) + return -EINVAL; + + if (btf_ensure_modifiable(btf)) + return -ENOMEM; + + sz = sizeof(struct btf_type); + t = btf_add_type_mem(btf, sz); + if (!t) + return -ENOMEM; + + name_off = btf__add_str(btf, name); + if (name_off < 0) + return name_off; + + t->name_off = name_off; + t->info = btf_type_info(BTF_KIND_FLOAT, 0, 0); + t->size = byte_sz; + + return btf_commit_type(btf, sz); +} + /* it's completely legal to append BTF types with type IDs pointing forward to * types that haven't been appended yet, so we only make sure that id looks * sane, we can't guarantee that ID will always be valid @@ -3632,6 +3677,7 @@ static int btf_dedup_prep(struct btf_dedup *d) case BTF_KIND_FWD: case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: + case BTF_KIND_FLOAT: h = btf_hash_common(t); break; case BTF_KIND_INT: @@ -3728,6 +3774,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id) break;
case BTF_KIND_FWD: + case BTF_KIND_FLOAT: h = btf_hash_common(t); for_each_dedup_cand(d, hash_entry, h) { cand_id = (__u32)(long)hash_entry->value; @@ -3989,6 +4036,7 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, return btf_compat_enum(cand_type, canon_type);
case BTF_KIND_FWD: + case BTF_KIND_FLOAT: return btf_equal_common(cand_type, canon_type);
case BTF_KIND_CONST: @@ -4485,6 +4533,7 @@ static int btf_dedup_remap_type(struct btf_dedup *d, __u32 type_id) switch (btf_kind(t)) { case BTF_KIND_INT: case BTF_KIND_ENUM: + case BTF_KIND_FLOAT: break;
case BTF_KIND_FWD: diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 5b8a6ea44b38..00313159c42e 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -95,6 +95,7 @@ LIBBPF_API int btf__find_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding); +LIBBPF_API int btf__add_float(struct btf *btf, const char *name, size_t byte_sz); LIBBPF_API int btf__add_ptr(struct btf *btf, int ref_type_id); LIBBPF_API int btf__add_array(struct btf *btf, int index_type_id, int elem_type_id, __u32 nr_elems); @@ -295,6 +296,11 @@ static inline bool btf_is_datasec(const struct btf_type *t) return btf_kind(t) == BTF_KIND_DATASEC; }
+static inline bool btf_is_float(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_FLOAT; +} + static inline __u8 btf_int_encoding(const struct btf_type *t) { return BTF_INT_ENCODING(*(__u32 *)(t + 1)); diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index bd22853be4a6..48ab93034dbc 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -279,6 +279,7 @@ static int btf_dump_mark_referenced(struct btf_dump *d) case BTF_KIND_INT: case BTF_KIND_ENUM: case BTF_KIND_FWD: + case BTF_KIND_FLOAT: break;
case BTF_KIND_VOLATILE: @@ -453,6 +454,7 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr)
switch (btf_kind(t)) { case BTF_KIND_INT: + case BTF_KIND_FLOAT: tstate->order_state = ORDERED; return 0;
@@ -1133,6 +1135,7 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id, case BTF_KIND_STRUCT: case BTF_KIND_UNION: case BTF_KIND_TYPEDEF: + case BTF_KIND_FLOAT: goto done; default: pr_warn("unexpected type in decl chain, kind:%u, id:[%u]\n", @@ -1247,6 +1250,7 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
switch (kind) { case BTF_KIND_INT: + case BTF_KIND_FLOAT: btf_dump_emit_mods(d, decls); name = btf_name_of(d, t->name_off); btf_dump_printf(d, "%s", name); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 69d79c318230..5bd3d4416500 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -178,6 +178,8 @@ enum kern_feature_id { FEAT_PROG_BIND_MAP, /* Kernel support for module BTFs */ FEAT_MODULE_BTF, + /* BTF_KIND_FLOAT support */ + FEAT_BTF_FLOAT, __FEAT_CNT, };
@@ -1947,6 +1949,7 @@ static const char *btf_kind_str(const struct btf_type *t) case BTF_KIND_FUNC_PROTO: return "func_proto"; case BTF_KIND_VAR: return "var"; case BTF_KIND_DATASEC: return "datasec"; + case BTF_KIND_FLOAT: return "float"; default: return "unknown"; } } @@ -2396,15 +2399,17 @@ static bool btf_needs_sanitization(struct bpf_object *obj) { bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC); bool has_datasec = kernel_supports(FEAT_BTF_DATASEC); + bool has_float = kernel_supports(FEAT_BTF_FLOAT); bool has_func = kernel_supports(FEAT_BTF_FUNC);
- return !has_func || !has_datasec || !has_func_global; + return !has_func || !has_datasec || !has_func_global || !has_float; }
static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) { bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC); bool has_datasec = kernel_supports(FEAT_BTF_DATASEC); + bool has_float = kernel_supports(FEAT_BTF_FLOAT); bool has_func = kernel_supports(FEAT_BTF_FUNC); struct btf_type *t; int i, j, vlen; @@ -2457,6 +2462,13 @@ static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) } else if (!has_func_global && btf_is_func(t)) { /* replace BTF_FUNC_GLOBAL with BTF_FUNC_STATIC */ t->info = BTF_INFO_ENC(BTF_KIND_FUNC, 0, 0); + } else if (!has_float && btf_is_float(t)) { + /* replace FLOAT with an equally-sized empty STRUCT; + * since C compilers do not accept e.g. "float" as a + * valid struct name, make it anonymous + */ + t->name_off = 0; + t->info = BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 0); } } } @@ -3967,6 +3979,18 @@ static int probe_kern_btf_datasec(void) strs, sizeof(strs))); }
+static int probe_kern_btf_float(void) +{ + static const char strs[] = "\0float"; + __u32 types[] = { + /* float */ + BTF_TYPE_FLOAT_ENC(1, 4), + }; + + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); +} + static int probe_kern_array_mmap(void) { struct bpf_create_map_attr attr = { @@ -4146,6 +4170,9 @@ static struct kern_feature_desc { [FEAT_MODULE_BTF] = { "module BTF support", probe_module_btf, }, + [FEAT_BTF_FLOAT] = { + "BTF_KIND_FLOAT support", probe_kern_btf_float, + }, };
static bool kernel_supports(enum kern_feature_id feat_id) diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index d56a8268d349..84099e305666 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -350,3 +350,8 @@ LIBBPF_0.3.0 { btf__new_empty_split; btf__new_split; } LIBBPF_0.2.0; + +LIBBPF_0.4.0 { + global: + btf__add_float; +} LIBBPF_0.3.0; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 969d0ac592ba..343f6eb05637 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -31,6 +31,8 @@ #define BTF_MEMBER_ENC(name, type, bits_offset) (name), (type), (bits_offset) #define BTF_PARAM_ENC(name, type) (name), (type) #define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size) +#define BTF_TYPE_FLOAT_ENC(name, sz) \ + BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FLOAT, 0, 0), sz)
#ifndef likely #define likely(x) __builtin_expect(!!(x), 1)
From: Joe Stringer joe@cilium.io
mainline inclusion from mainline-5.13-rc1 commit 242029f42691e05ac09b31b98221421bd564375e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Synchronize the header after all of the recent changes.
Signed-off-by: Joe Stringer joe@cilium.io Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Quentin Monnet quentin@isovalent.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210302171947.2268128-16-joe@cilium.io (cherry picked from commit 242029f42691e05ac09b31b98221421bd564375e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/include/uapi/linux/bpf.h | 712 ++++++++++++++++++++++++++++++++- 1 file changed, 711 insertions(+), 1 deletion(-)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index d236252e1af5..ffe578a63de7 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -93,7 +93,717 @@ union bpf_iter_link_info { } map; };
-/* BPF syscall commands, see bpf(2) man-page for details. */ +/* BPF syscall commands, see bpf(2) man-page for more details. */ +/** + * DOC: eBPF Syscall Preamble + * + * The operation to be performed by the **bpf**\ () system call is determined + * by the *cmd* argument. Each operation takes an accompanying argument, + * provided via *attr*, which is a pointer to a union of type *bpf_attr* (see + * below). The size argument is the size of the union pointed to by *attr*. + */ +/** + * DOC: eBPF Syscall Commands + * + * BPF_MAP_CREATE + * Description + * Create a map and return a file descriptor that refers to the + * map. The close-on-exec file descriptor flag (see **fcntl**\ (2)) + * is automatically enabled for the new file descriptor. + * + * Applying **close**\ (2) to the file descriptor returned by + * **BPF_MAP_CREATE** will delete the map (but see NOTES). + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_MAP_LOOKUP_ELEM + * Description + * Look up an element with a given *key* in the map referred to + * by the file descriptor *map_fd*. + * + * The *flags* argument may be specified as one of the + * following: + * + * **BPF_F_LOCK** + * Look up the value of a spin-locked map without + * returning the lock. This must be specified if the + * elements contain a spinlock. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_MAP_UPDATE_ELEM + * Description + * Create or update an element (key/value pair) in a specified map. + * + * The *flags* argument should be specified as one of the + * following: + * + * **BPF_ANY** + * Create a new element or update an existing element. + * **BPF_NOEXIST** + * Create a new element only if it did not exist. + * **BPF_EXIST** + * Update an existing element. + * **BPF_F_LOCK** + * Update a spin_lock-ed map element. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * May set *errno* to **EINVAL**, **EPERM**, **ENOMEM**, + * **E2BIG**, **EEXIST**, or **ENOENT**. + * + * **E2BIG** + * The number of elements in the map reached the + * *max_entries* limit specified at map creation time. + * **EEXIST** + * If *flags* specifies **BPF_NOEXIST** and the element + * with *key* already exists in the map. + * **ENOENT** + * If *flags* specifies **BPF_EXIST** and the element with + * *key* does not exist in the map. + * + * BPF_MAP_DELETE_ELEM + * Description + * Look up and delete an element by key in a specified map. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_MAP_GET_NEXT_KEY + * Description + * Look up an element by key in a specified map and return the key + * of the next element. Can be used to iterate over all elements + * in the map. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * The following cases can be used to iterate over all elements of + * the map: + * + * * If *key* is not found, the operation returns zero and sets + * the *next_key* pointer to the key of the first element. + * * If *key* is found, the operation returns zero and sets the + * *next_key* pointer to the key of the next element. + * * If *key* is the last element, returns -1 and *errno* is set + * to **ENOENT**. + * + * May set *errno* to **ENOMEM**, **EFAULT**, **EPERM**, or + * **EINVAL** on error. + * + * BPF_PROG_LOAD + * Description + * Verify and load an eBPF program, returning a new file + * descriptor associated with the program. + * + * Applying **close**\ (2) to the file descriptor returned by + * **BPF_PROG_LOAD** will unload the eBPF program (but see NOTES). + * + * The close-on-exec file descriptor flag (see **fcntl**\ (2)) is + * automatically enabled for the new file descriptor. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_OBJ_PIN + * Description + * Pin an eBPF program or map referred by the specified *bpf_fd* + * to the provided *pathname* on the filesystem. + * + * The *pathname* argument must not contain a dot ("."). + * + * On success, *pathname* retains a reference to the eBPF object, + * preventing deallocation of the object when the original + * *bpf_fd* is closed. This allow the eBPF object to live beyond + * **close**\ (\ *bpf_fd*\ ), and hence the lifetime of the parent + * process. + * + * Applying **unlink**\ (2) or similar calls to the *pathname* + * unpins the object from the filesystem, removing the reference. + * If no other file descriptors or filesystem nodes refer to the + * same object, it will be deallocated (see NOTES). + * + * The filesystem type for the parent directory of *pathname* must + * be **BPF_FS_MAGIC**. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_OBJ_GET + * Description + * Open a file descriptor for the eBPF object pinned to the + * specified *pathname*. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_PROG_ATTACH + * Description + * Attach an eBPF program to a *target_fd* at the specified + * *attach_type* hook. + * + * The *attach_type* specifies the eBPF attachment point to + * attach the program to, and must be one of *bpf_attach_type* + * (see below). + * + * The *attach_bpf_fd* must be a valid file descriptor for a + * loaded eBPF program of a cgroup, flow dissector, LIRC, sockmap + * or sock_ops type corresponding to the specified *attach_type*. + * + * The *target_fd* must be a valid file descriptor for a kernel + * object which depends on the attach type of *attach_bpf_fd*: + * + * **BPF_PROG_TYPE_CGROUP_DEVICE**, + * **BPF_PROG_TYPE_CGROUP_SKB**, + * **BPF_PROG_TYPE_CGROUP_SOCK**, + * **BPF_PROG_TYPE_CGROUP_SOCK_ADDR**, + * **BPF_PROG_TYPE_CGROUP_SOCKOPT**, + * **BPF_PROG_TYPE_CGROUP_SYSCTL**, + * **BPF_PROG_TYPE_SOCK_OPS** + * + * Control Group v2 hierarchy with the eBPF controller + * enabled. Requires the kernel to be compiled with + * **CONFIG_CGROUP_BPF**. + * + * **BPF_PROG_TYPE_FLOW_DISSECTOR** + * + * Network namespace (eg /proc/self/ns/net). + * + * **BPF_PROG_TYPE_LIRC_MODE2** + * + * LIRC device path (eg /dev/lircN). Requires the kernel + * to be compiled with **CONFIG_BPF_LIRC_MODE2**. + * + * **BPF_PROG_TYPE_SK_SKB**, + * **BPF_PROG_TYPE_SK_MSG** + * + * eBPF map of socket type (eg **BPF_MAP_TYPE_SOCKHASH**). + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_PROG_DETACH + * Description + * Detach the eBPF program associated with the *target_fd* at the + * hook specified by *attach_type*. The program must have been + * previously attached using **BPF_PROG_ATTACH**. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_PROG_TEST_RUN + * Description + * Run the eBPF program associated with the *prog_fd* a *repeat* + * number of times against a provided program context *ctx_in* and + * data *data_in*, and return the modified program context + * *ctx_out*, *data_out* (for example, packet data), result of the + * execution *retval*, and *duration* of the test run. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * **ENOSPC** + * Either *data_size_out* or *ctx_size_out* is too small. + * **ENOTSUPP** + * This command is not supported by the program type of + * the program referred to by *prog_fd*. + * + * BPF_PROG_GET_NEXT_ID + * Description + * Fetch the next eBPF program currently loaded into the kernel. + * + * Looks for the eBPF program with an id greater than *start_id* + * and updates *next_id* on success. If no other eBPF programs + * remain with ids higher than *start_id*, returns -1 and sets + * *errno* to **ENOENT**. + * + * Return + * Returns zero on success. On error, or when no id remains, -1 + * is returned and *errno* is set appropriately. + * + * BPF_MAP_GET_NEXT_ID + * Description + * Fetch the next eBPF map currently loaded into the kernel. + * + * Looks for the eBPF map with an id greater than *start_id* + * and updates *next_id* on success. If no other eBPF maps + * remain with ids higher than *start_id*, returns -1 and sets + * *errno* to **ENOENT**. + * + * Return + * Returns zero on success. On error, or when no id remains, -1 + * is returned and *errno* is set appropriately. + * + * BPF_PROG_GET_FD_BY_ID + * Description + * Open a file descriptor for the eBPF program corresponding to + * *prog_id*. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_MAP_GET_FD_BY_ID + * Description + * Open a file descriptor for the eBPF map corresponding to + * *map_id*. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_OBJ_GET_INFO_BY_FD + * Description + * Obtain information about the eBPF object corresponding to + * *bpf_fd*. + * + * Populates up to *info_len* bytes of *info*, which will be in + * one of the following formats depending on the eBPF object type + * of *bpf_fd*: + * + * * **struct bpf_prog_info** + * * **struct bpf_map_info** + * * **struct bpf_btf_info** + * * **struct bpf_link_info** + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_PROG_QUERY + * Description + * Obtain information about eBPF programs associated with the + * specified *attach_type* hook. + * + * The *target_fd* must be a valid file descriptor for a kernel + * object which depends on the attach type of *attach_bpf_fd*: + * + * **BPF_PROG_TYPE_CGROUP_DEVICE**, + * **BPF_PROG_TYPE_CGROUP_SKB**, + * **BPF_PROG_TYPE_CGROUP_SOCK**, + * **BPF_PROG_TYPE_CGROUP_SOCK_ADDR**, + * **BPF_PROG_TYPE_CGROUP_SOCKOPT**, + * **BPF_PROG_TYPE_CGROUP_SYSCTL**, + * **BPF_PROG_TYPE_SOCK_OPS** + * + * Control Group v2 hierarchy with the eBPF controller + * enabled. Requires the kernel to be compiled with + * **CONFIG_CGROUP_BPF**. + * + * **BPF_PROG_TYPE_FLOW_DISSECTOR** + * + * Network namespace (eg /proc/self/ns/net). + * + * **BPF_PROG_TYPE_LIRC_MODE2** + * + * LIRC device path (eg /dev/lircN). Requires the kernel + * to be compiled with **CONFIG_BPF_LIRC_MODE2**. + * + * **BPF_PROG_QUERY** always fetches the number of programs + * attached and the *attach_flags* which were used to attach those + * programs. Additionally, if *prog_ids* is nonzero and the number + * of attached programs is less than *prog_cnt*, populates + * *prog_ids* with the eBPF program ids of the programs attached + * at *target_fd*. + * + * The following flags may alter the result: + * + * **BPF_F_QUERY_EFFECTIVE** + * Only return information regarding programs which are + * currently effective at the specified *target_fd*. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_RAW_TRACEPOINT_OPEN + * Description + * Attach an eBPF program to a tracepoint *name* to access kernel + * internal arguments of the tracepoint in their raw form. + * + * The *prog_fd* must be a valid file descriptor associated with + * a loaded eBPF program of type **BPF_PROG_TYPE_RAW_TRACEPOINT**. + * + * No ABI guarantees are made about the content of tracepoint + * arguments exposed to the corresponding eBPF program. + * + * Applying **close**\ (2) to the file descriptor returned by + * **BPF_RAW_TRACEPOINT_OPEN** will delete the map (but see NOTES). + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_BTF_LOAD + * Description + * Verify and load BPF Type Format (BTF) metadata into the kernel, + * returning a new file descriptor associated with the metadata. + * BTF is described in more detail at + * https://www.kernel.org/doc/html/latest/bpf/btf.html. + * + * The *btf* parameter must point to valid memory providing + * *btf_size* bytes of BTF binary metadata. + * + * The returned file descriptor can be passed to other **bpf**\ () + * subcommands such as **BPF_PROG_LOAD** or **BPF_MAP_CREATE** to + * associate the BTF with those objects. + * + * Similar to **BPF_PROG_LOAD**, **BPF_BTF_LOAD** has optional + * parameters to specify a *btf_log_buf*, *btf_log_size* and + * *btf_log_level* which allow the kernel to return freeform log + * output regarding the BTF verification process. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_BTF_GET_FD_BY_ID + * Description + * Open a file descriptor for the BPF Type Format (BTF) + * corresponding to *btf_id*. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_TASK_FD_QUERY + * Description + * Obtain information about eBPF programs associated with the + * target process identified by *pid* and *fd*. + * + * If the *pid* and *fd* are associated with a tracepoint, kprobe + * or uprobe perf event, then the *prog_id* and *fd_type* will + * be populated with the eBPF program id and file descriptor type + * of type **bpf_task_fd_type**. If associated with a kprobe or + * uprobe, the *probe_offset* and *probe_addr* will also be + * populated. Optionally, if *buf* is provided, then up to + * *buf_len* bytes of *buf* will be populated with the name of + * the tracepoint, kprobe or uprobe. + * + * The resulting *prog_id* may be introspected in deeper detail + * using **BPF_PROG_GET_FD_BY_ID** and **BPF_OBJ_GET_INFO_BY_FD**. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_MAP_LOOKUP_AND_DELETE_ELEM + * Description + * Look up an element with the given *key* in the map referred to + * by the file descriptor *fd*, and if found, delete the element. + * + * The **BPF_MAP_TYPE_QUEUE** and **BPF_MAP_TYPE_STACK** map types + * implement this command as a "pop" operation, deleting the top + * element rather than one corresponding to *key*. + * The *key* and *key_len* parameters should be zeroed when + * issuing this operation for these map types. + * + * This command is only valid for the following map types: + * * **BPF_MAP_TYPE_QUEUE** + * * **BPF_MAP_TYPE_STACK** + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_MAP_FREEZE + * Description + * Freeze the permissions of the specified map. + * + * Write permissions may be frozen by passing zero *flags*. + * Upon success, no future syscall invocations may alter the + * map state of *map_fd*. Write operations from eBPF programs + * are still possible for a frozen map. + * + * Not supported for maps of type **BPF_MAP_TYPE_STRUCT_OPS**. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_BTF_GET_NEXT_ID + * Description + * Fetch the next BPF Type Format (BTF) object currently loaded + * into the kernel. + * + * Looks for the BTF object with an id greater than *start_id* + * and updates *next_id* on success. If no other BTF objects + * remain with ids higher than *start_id*, returns -1 and sets + * *errno* to **ENOENT**. + * + * Return + * Returns zero on success. On error, or when no id remains, -1 + * is returned and *errno* is set appropriately. + * + * BPF_MAP_LOOKUP_BATCH + * Description + * Iterate and fetch multiple elements in a map. + * + * Two opaque values are used to manage batch operations, + * *in_batch* and *out_batch*. Initially, *in_batch* must be set + * to NULL to begin the batched operation. After each subsequent + * **BPF_MAP_LOOKUP_BATCH**, the caller should pass the resultant + * *out_batch* as the *in_batch* for the next operation to + * continue iteration from the current point. + * + * The *keys* and *values* are output parameters which must point + * to memory large enough to hold *count* items based on the key + * and value size of the map *map_fd*. The *keys* buffer must be + * of *key_size* * *count*. The *values* buffer must be of + * *value_size* * *count*. + * + * The *elem_flags* argument may be specified as one of the + * following: + * + * **BPF_F_LOCK** + * Look up the value of a spin-locked map without + * returning the lock. This must be specified if the + * elements contain a spinlock. + * + * On success, *count* elements from the map are copied into the + * user buffer, with the keys copied into *keys* and the values + * copied into the corresponding indices in *values*. + * + * If an error is returned and *errno* is not **EFAULT**, *count* + * is set to the number of successfully processed elements. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * May set *errno* to **ENOSPC** to indicate that *keys* or + * *values* is too small to dump an entire bucket during + * iteration of a hash-based map type. + * + * BPF_MAP_LOOKUP_AND_DELETE_BATCH + * Description + * Iterate and delete all elements in a map. + * + * This operation has the same behavior as + * **BPF_MAP_LOOKUP_BATCH** with two exceptions: + * + * * Every element that is successfully returned is also deleted + * from the map. This is at least *count* elements. Note that + * *count* is both an input and an output parameter. + * * Upon returning with *errno* set to **EFAULT**, up to + * *count* elements may be deleted without returning the keys + * and values of the deleted elements. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_MAP_UPDATE_BATCH + * Description + * Update multiple elements in a map by *key*. + * + * The *keys* and *values* are input parameters which must point + * to memory large enough to hold *count* items based on the key + * and value size of the map *map_fd*. The *keys* buffer must be + * of *key_size* * *count*. The *values* buffer must be of + * *value_size* * *count*. + * + * Each element specified in *keys* is sequentially updated to the + * value in the corresponding index in *values*. The *in_batch* + * and *out_batch* parameters are ignored and should be zeroed. + * + * The *elem_flags* argument should be specified as one of the + * following: + * + * **BPF_ANY** + * Create new elements or update a existing elements. + * **BPF_NOEXIST** + * Create new elements only if they do not exist. + * **BPF_EXIST** + * Update existing elements. + * **BPF_F_LOCK** + * Update spin_lock-ed map elements. This must be + * specified if the map value contains a spinlock. + * + * On success, *count* elements from the map are updated. + * + * If an error is returned and *errno* is not **EFAULT**, *count* + * is set to the number of successfully processed elements. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * May set *errno* to **EINVAL**, **EPERM**, **ENOMEM**, or + * **E2BIG**. **E2BIG** indicates that the number of elements in + * the map reached the *max_entries* limit specified at map + * creation time. + * + * May set *errno* to one of the following error codes under + * specific circumstances: + * + * **EEXIST** + * If *flags* specifies **BPF_NOEXIST** and the element + * with *key* already exists in the map. + * **ENOENT** + * If *flags* specifies **BPF_EXIST** and the element with + * *key* does not exist in the map. + * + * BPF_MAP_DELETE_BATCH + * Description + * Delete multiple elements in a map by *key*. + * + * The *keys* parameter is an input parameter which must point + * to memory large enough to hold *count* items based on the key + * size of the map *map_fd*, that is, *key_size* * *count*. + * + * Each element specified in *keys* is sequentially deleted. The + * *in_batch*, *out_batch*, and *values* parameters are ignored + * and should be zeroed. + * + * The *elem_flags* argument may be specified as one of the + * following: + * + * **BPF_F_LOCK** + * Look up the value of a spin-locked map without + * returning the lock. This must be specified if the + * elements contain a spinlock. + * + * On success, *count* elements from the map are updated. + * + * If an error is returned and *errno* is not **EFAULT**, *count* + * is set to the number of successfully processed elements. If + * *errno* is **EFAULT**, up to *count* elements may be been + * deleted. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_LINK_CREATE + * Description + * Attach an eBPF program to a *target_fd* at the specified + * *attach_type* hook and return a file descriptor handle for + * managing the link. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_LINK_UPDATE + * Description + * Update the eBPF program in the specified *link_fd* to + * *new_prog_fd*. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_LINK_GET_FD_BY_ID + * Description + * Open a file descriptor for the eBPF Link corresponding to + * *link_id*. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_LINK_GET_NEXT_ID + * Description + * Fetch the next eBPF link currently loaded into the kernel. + * + * Looks for the eBPF link with an id greater than *start_id* + * and updates *next_id* on success. If no other eBPF links + * remain with ids higher than *start_id*, returns -1 and sets + * *errno* to **ENOENT**. + * + * Return + * Returns zero on success. On error, or when no id remains, -1 + * is returned and *errno* is set appropriately. + * + * BPF_ENABLE_STATS + * Description + * Enable eBPF runtime statistics gathering. + * + * Runtime statistics gathering for the eBPF runtime is disabled + * by default to minimize the corresponding performance overhead. + * This command enables statistics globally. + * + * Multiple programs may independently enable statistics. + * After gathering the desired statistics, eBPF runtime statistics + * may be disabled again by calling **close**\ (2) for the file + * descriptor returned by this function. Statistics will only be + * disabled system-wide when all outstanding file descriptors + * returned by prior calls for this subcommand are closed. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_ITER_CREATE + * Description + * Create an iterator on top of the specified *link_fd* (as + * previously created using **BPF_LINK_CREATE**) and return a + * file descriptor that can be used to trigger the iteration. + * + * If the resulting file descriptor is pinned to the filesystem + * using **BPF_OBJ_PIN**, then subsequent **read**\ (2) syscalls + * for that path will trigger the iterator to read kernel state + * using the eBPF program attached to *link_fd*. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * + * BPF_LINK_DETACH + * Description + * Forcefully detach the specified *link_fd* from its + * corresponding attachment point. + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * BPF_PROG_BIND_MAP + * Description + * Bind a map to the lifetime of an eBPF program. + * + * The map identified by *map_fd* is bound to the program + * identified by *prog_fd* and only released when *prog_fd* is + * released. This may be used in cases where metadata should be + * associated with a program which otherwise does not contain any + * references to the map (for example, embedded in the eBPF + * program instructions). + * + * Return + * Returns zero on success. On error, -1 is returned and *errno* + * is set appropriately. + * + * NOTES + * eBPF objects (maps and programs) can be shared between processes. + * + * * After **fork**\ (2), the child inherits file descriptors + * referring to the same eBPF objects. + * * File descriptors referring to eBPF objects can be transferred over + * **unix**\ (7) domain sockets. + * * File descriptors referring to eBPF objects can be duplicated in the + * usual way, using **dup**\ (2) and similar calls. + * * File descriptors referring to eBPF objects can be pinned to the + * filesystem using the **BPF_OBJ_PIN** command of **bpf**\ (2). + * + * An eBPF object is deallocated only after all file descriptors referring + * to the object have been closed and no references remain pinned to the + * filesystem or attached (for example, bound to a program or device). + */ enum bpf_cmd { BPF_MAP_CREATE, BPF_MAP_LOOKUP_ELEM,
From: Lorenz Bauer lmb@cloudflare.com
mainline inclusion from mainline-5.13-rc1 commit 607b9cc92bd7208338d714a22b8082fe83bcb177 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Share the timing / signal interruption logic between different implementations of PROG_TEST_RUN. There is a change in behaviour as well. We check the loop exit condition before checking for pending signals. This resolves an edge case where a signal arrives during the last iteration. Instead of aborting with EINTR we return the successful result to user space.
Signed-off-by: Lorenz Bauer lmb@cloudflare.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210303101816.36774-2-lmb@cloudflare.com (cherry picked from commit 607b9cc92bd7208338d714a22b8082fe83bcb177) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: net/bpf/test_run.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- net/bpf/test_run.c | 142 +++++++++++++++++++++++++-------------------- 1 file changed, 78 insertions(+), 64 deletions(-)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index f266a9453c8e..3a8f5a2d0e74 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -16,16 +16,80 @@ #define CREATE_TRACE_POINTS #include <trace/events/bpf_test_run.h>
+struct bpf_test_timer { + enum { NO_PREEMPT, NO_MIGRATE } mode; + u32 i; + u64 time_start, time_spent; +}; + +static void bpf_test_timer_enter(struct bpf_test_timer *t) + __acquires(rcu) +{ + rcu_read_lock(); + if (t->mode == NO_PREEMPT) + preempt_disable(); + else + migrate_disable(); + + t->time_start = ktime_get_ns(); +} + +static void bpf_test_timer_leave(struct bpf_test_timer *t) + __releases(rcu) +{ + t->time_start = 0; + + if (t->mode == NO_PREEMPT) + preempt_enable(); + else + migrate_enable(); + rcu_read_unlock(); +} + +static bool bpf_test_timer_continue(struct bpf_test_timer *t, u32 repeat, int *err, u32 *duration) + __must_hold(rcu) +{ + t->i++; + if (t->i >= repeat) { + /* We're done. */ + t->time_spent += ktime_get_ns() - t->time_start; + do_div(t->time_spent, t->i); + *duration = t->time_spent > U32_MAX ? U32_MAX : (u32)t->time_spent; + *err = 0; + goto reset; + } + + if (signal_pending(current)) { + /* During iteration: we've been cancelled, abort. */ + *err = -EINTR; + goto reset; + } + + if (need_resched()) { + /* During iteration: we need to reschedule between runs. */ + t->time_spent += ktime_get_ns() - t->time_start; + bpf_test_timer_leave(t); + cond_resched(); + bpf_test_timer_enter(t); + } + + /* Do another round. */ + return true; + +reset: + t->i = 0; + return false; +} + static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *retval, u32 *time, bool xdp) { struct bpf_prog_array_item item = {.prog = prog}; struct bpf_run_ctx *old_ctx; struct bpf_cg_run_ctx run_ctx; + struct bpf_test_timer t = { NO_MIGRATE }; enum bpf_cgroup_storage_type stype; - u64 time_start, time_spent = 0; - int ret = 0; - u32 i; + int ret;
for_each_cgroup_storage_type(stype) { item.cgroup_storage[stype] = bpf_cgroup_storage_alloc(prog, stype); @@ -40,42 +104,17 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, if (!repeat) repeat = 1;
- rcu_read_lock(); - migrate_disable(); - time_start = ktime_get_ns(); + bpf_test_timer_enter(&t); old_ctx = bpf_set_run_ctx(&run_ctx.run_ctx); - for (i = 0; i < repeat; i++) { + do { run_ctx.prog_item = &item; - if (xdp) *retval = bpf_prog_run_xdp(prog, ctx); else *retval = BPF_PROG_RUN(prog, ctx); - - if (signal_pending(current)) { - ret = -EINTR; - break; - } - - if (need_resched()) { - time_spent += ktime_get_ns() - time_start; - migrate_enable(); - rcu_read_unlock(); - - cond_resched(); - - rcu_read_lock(); - migrate_disable(); - time_start = ktime_get_ns(); - } - } + } while (bpf_test_timer_continue(&t, repeat, &ret, time)); bpf_reset_run_ctx(old_ctx); - time_spent += ktime_get_ns() - time_start; - migrate_enable(); - rcu_read_unlock(); - - do_div(time_spent, repeat); - *time = time_spent > U32_MAX ? U32_MAX : (u32)time_spent; + bpf_test_timer_leave(&t);
for_each_cgroup_storage_type(stype) bpf_cgroup_storage_free(item.cgroup_storage[stype]); @@ -691,18 +730,17 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, const union bpf_attr *kattr, union bpf_attr __user *uattr) { + struct bpf_test_timer t = { NO_PREEMPT }; u32 size = kattr->test.data_size_in; struct bpf_flow_dissector ctx = {}; u32 repeat = kattr->test.repeat; struct bpf_flow_keys *user_ctx; struct bpf_flow_keys flow_keys; - u64 time_start, time_spent = 0; const struct ethhdr *eth; unsigned int flags = 0; u32 retval, duration; void *data; int ret; - u32 i;
if (prog->type != BPF_PROG_TYPE_FLOW_DISSECTOR) return -EINVAL; @@ -738,39 +776,15 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, ctx.data = data; ctx.data_end = (__u8 *)data + size;
- rcu_read_lock(); - preempt_disable(); - time_start = ktime_get_ns(); - for (i = 0; i < repeat; i++) { + bpf_test_timer_enter(&t); + do { retval = bpf_flow_dissect(prog, &ctx, eth->h_proto, ETH_HLEN, size, flags); + } while (bpf_test_timer_continue(&t, repeat, &ret, &duration)); + bpf_test_timer_leave(&t);
- if (signal_pending(current)) { - preempt_enable(); - rcu_read_unlock(); - - ret = -EINTR; - goto out; - } - - if (need_resched()) { - time_spent += ktime_get_ns() - time_start; - preempt_enable(); - rcu_read_unlock(); - - cond_resched(); - - rcu_read_lock(); - preempt_disable(); - time_start = ktime_get_ns(); - } - } - time_spent += ktime_get_ns() - time_start; - preempt_enable(); - rcu_read_unlock(); - - do_div(time_spent, repeat); - duration = time_spent > U32_MAX ? U32_MAX : (u32)time_spent; + if (ret < 0) + goto out;
ret = bpf_test_finish(kattr, uattr, &flow_keys, sizeof(flow_keys), retval, duration);
From: Lorenz Bauer lmb@cloudflare.com
mainline inclusion from mainline-5.13-rc1 commit 7c32e8f8bc33a5f4b113a630857e46634e3e143b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Allow to pass sk_lookup programs to PROG_TEST_RUN. User space provides the full bpf_sk_lookup struct as context. Since the context includes a socket pointer that can't be exposed to user space we define that PROG_TEST_RUN returns the cookie of the selected socket or zero in place of the socket pointer.
We don't support testing programs that select a reuseport socket, since this would mean running another (unrelated) BPF program from the sk_lookup test handler.
Signed-off-by: Lorenz Bauer lmb@cloudflare.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com (cherry picked from commit 7c32e8f8bc33a5f4b113a630857e46634e3e143b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 10 ++++ include/uapi/linux/bpf.h | 5 +- net/bpf/test_run.c | 105 +++++++++++++++++++++++++++++++++ net/core/filter.c | 1 + tools/include/uapi/linux/bpf.h | 5 +- 5 files changed, 124 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index dd2f794dd04e..d115f01ef1b6 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1572,6 +1572,9 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, int bpf_prog_test_run_raw_tp(struct bpf_prog *prog, const union bpf_attr *kattr, union bpf_attr __user *uattr); +int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr); bool btf_ctx_access(int off, int size, enum bpf_access_type type, const struct bpf_prog *prog, struct bpf_insn_access_aux *info); @@ -1787,6 +1790,13 @@ static inline int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, return -ENOTSUPP; }
+static inline int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + return -ENOTSUPP; +} + static inline void bpf_map_put(struct bpf_map *map) { } diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index e51f97b72f65..e8ae59d9cb37 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -5101,7 +5101,10 @@ struct bpf_pidns_info {
/* User accessible data for SK_LOOKUP programs. Add new fields at the end. */ struct bpf_sk_lookup { - __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */ + union { + __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */ + __u64 cookie; /* Non-zero if socket was selected in PROG_TEST_RUN */ + };
__u32 family; /* Protocol family (AF_INET, AF_INET6) */ __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 3a8f5a2d0e74..0dfef59cf3de 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -10,8 +10,10 @@ #include <net/bpf_sk_storage.h> #include <net/sock.h> #include <net/tcp.h> +#include <net/net_namespace.h> #include <linux/error-injection.h> #include <linux/smp.h> +#include <linux/sock_diag.h>
#define CREATE_TRACE_POINTS #include <trace/events/bpf_test_run.h> @@ -797,3 +799,106 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, kfree(data); return ret; } + +int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + struct bpf_test_timer t = { NO_PREEMPT }; + struct bpf_prog_array *progs = NULL; + struct bpf_sk_lookup_kern ctx = {}; + u32 repeat = kattr->test.repeat; + struct bpf_sk_lookup *user_ctx; + u32 retval, duration; + int ret = -EINVAL; + + if (prog->type != BPF_PROG_TYPE_SK_LOOKUP) + return -EINVAL; + + if (kattr->test.flags || kattr->test.cpu) + return -EINVAL; + + if (kattr->test.data_in || kattr->test.data_size_in || kattr->test.data_out || + kattr->test.data_size_out) + return -EINVAL; + + if (!repeat) + repeat = 1; + + user_ctx = bpf_ctx_init(kattr, sizeof(*user_ctx)); + if (IS_ERR(user_ctx)) + return PTR_ERR(user_ctx); + + if (!user_ctx) + return -EINVAL; + + if (user_ctx->sk) + goto out; + + if (!range_is_zero(user_ctx, offsetofend(typeof(*user_ctx), local_port), sizeof(*user_ctx))) + goto out; + + if (user_ctx->local_port > U16_MAX || user_ctx->remote_port > U16_MAX) { + ret = -ERANGE; + goto out; + } + + ctx.family = (u16)user_ctx->family; + ctx.protocol = (u16)user_ctx->protocol; + ctx.dport = (u16)user_ctx->local_port; + ctx.sport = (__force __be16)user_ctx->remote_port; + + switch (ctx.family) { + case AF_INET: + ctx.v4.daddr = (__force __be32)user_ctx->local_ip4; + ctx.v4.saddr = (__force __be32)user_ctx->remote_ip4; + break; + +#if IS_ENABLED(CONFIG_IPV6) + case AF_INET6: + ctx.v6.daddr = (struct in6_addr *)user_ctx->local_ip6; + ctx.v6.saddr = (struct in6_addr *)user_ctx->remote_ip6; + break; +#endif + + default: + ret = -EAFNOSUPPORT; + goto out; + } + + progs = bpf_prog_array_alloc(1, GFP_KERNEL); + if (!progs) { + ret = -ENOMEM; + goto out; + } + + progs->items[0].prog = prog; + + bpf_test_timer_enter(&t); + do { + ctx.selected_sk = NULL; + retval = BPF_PROG_SK_LOOKUP_RUN_ARRAY(progs, ctx, BPF_PROG_RUN); + } while (bpf_test_timer_continue(&t, repeat, &ret, &duration)); + bpf_test_timer_leave(&t); + + if (ret < 0) + goto out; + + user_ctx->cookie = 0; + if (ctx.selected_sk) { + if (ctx.selected_sk->sk_reuseport && !ctx.no_reuseport) { + ret = -EOPNOTSUPP; + goto out; + } + + user_ctx->cookie = sock_gen_cookie(ctx.selected_sk); + } + + ret = bpf_test_finish(kattr, uattr, NULL, 0, retval, duration); + if (!ret) + ret = bpf_ctx_finish(kattr, uattr, user_ctx, sizeof(*user_ctx)); + +out: + bpf_prog_array_free(progs); + kfree(user_ctx); + return ret; +} diff --git a/net/core/filter.c b/net/core/filter.c index 933fdf6e6a90..07da29b132f3 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -10393,6 +10393,7 @@ static u32 sk_lookup_convert_ctx_access(enum bpf_access_type type, }
const struct bpf_prog_ops sk_lookup_prog_ops = { + .test_run = bpf_prog_test_run_sk_lookup, };
const struct bpf_verifier_ops sk_lookup_verifier_ops = { diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index ffe578a63de7..35ada5baa156 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5810,7 +5810,10 @@ struct bpf_pidns_info {
/* User accessible data for SK_LOOKUP programs. Add new fields at the end. */ struct bpf_sk_lookup { - __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */ + union { + __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */ + __u64 cookie; /* Non-zero if socket was selected in PROG_TEST_RUN */ + };
__u32 family; /* Protocol family (AF_INET, AF_INET6) */ __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */
From: Pedro Tammela pctammela@gmail.com
mainline inclusion from mainline-5.13-rc1 commit 0205e9de42911404902728911b03fc1469242419 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Linux headers might pull 'linux/stddef.h' which defines '__always_inline' as the following:
#ifndef __always_inline #define __always_inline inline #endif
This becomes an issue if the program picks up the 'linux/stddef.h' definition as the macro now just hints inline to clang.
This change now enforces the proper definition for BPF programs regardless of the include order.
Signed-off-by: Pedro Tammela pctammela@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210314173839.457768-1-pctammela@gmail.com (cherry picked from commit 0205e9de42911404902728911b03fc1469242419) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_helpers.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index ae6c975e0b87..53ff81c49dbd 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -29,9 +29,10 @@ */ #define SEC(NAME) __attribute__((section(NAME), used))
-#ifndef __always_inline +/* Avoid 'linux/stddef.h' definition of '__always_inline'. */ +#undef __always_inline #define __always_inline inline __attribute__((always_inline)) -#endif + #ifndef __noinline #define __noinline __attribute__((noinline)) #endif
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 9ae2c26e43248b722e79fe867be38062c9dd1e5f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Given that vmlinux.h is not compatible with headers like stddef.h, NULL poses an annoying problem: it is defined as #define, so is not captured in BTF, so is not emitted into vmlinux.h. This leads to users either sticking to explicit 0, or defining their own NULL (as progs/skb_pkt_end.c does).
But it's easy for bpf_helpers.h to provide (conditionally) NULL definition. Similarly, KERNEL_VERSION is another commonly missed macro that came up multiple times. So this patch adds both of them, along with offsetof(), that also is typically defined in stddef.h, just like NULL.
This might cause compilation warning for existing BPF applications defining their own NULL and/or KERNEL_VERSION already:
progs/skb_pkt_end.c:7:9: warning: 'NULL' macro redefined [-Wmacro-redefined] #define NULL 0 ^ /tmp/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h:4:9: note: previous definition is here #define NULL ((void *)0) ^
It is trivial to fix, though, so long-term benefits outweight temporary inconveniences.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20210317200510.1354627-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit 9ae2c26e43248b722e79fe867be38062c9dd1e5f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_helpers.h | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 53ff81c49dbd..cc2e51c64a54 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -40,8 +40,22 @@ #define __weak __attribute__((weak)) #endif
+/* When utilizing vmlinux.h with BPF CO-RE, user BPF programs can't include + * any system-level headers (such as stddef.h, linux/version.h, etc), and + * commonly-used macros like NULL and KERNEL_VERSION aren't available through + * vmlinux.h. This just adds unnecessary hurdles and forces users to re-define + * them on their own. So as a convenience, provide such definitions here. + */ +#ifndef NULL +#define NULL ((void *)0) +#endif + +#ifndef KERNEL_VERSION +#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)) +#endif + /* - * Helper macro to manipulate data structures + * Helper macros to manipulate data structures */ #ifndef offsetof #define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit e14ef4bf011192b61b48c4f3a35b3041140073ff category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
btf_type_by_id() is internal-only convenience API returning non-const pointer to struct btf_type. Expose it outside of btf.c for re-use.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-2-andrii@kernel.org (cherry picked from commit e14ef4bf011192b61b48c4f3a35b3041140073ff) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 2 +- tools/lib/bpf/libbpf_internal.h | 5 +++++ 2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 3a944a487e7e..27091759aa81 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -441,7 +441,7 @@ const struct btf *btf__base_btf(const struct btf *btf) }
/* internal helper returning non-const pointer to a type */ -static struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id) +struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id) { if (type_id == 0) return &btf_void; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 343f6eb05637..d09860e435c8 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -107,6 +107,11 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size) return realloc(ptr, total); }
+struct btf; +struct btf_type; + +struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id); + void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t cur_cnt, size_t max_cnt, size_t add_cnt); int btf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit f36e99a45dbe76949eb99bba413c67eda5cd2591 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Extract and generalize the logic to iterate BTF type ID and string offset fields within BTF types and .BTF.ext data. Expose this internally in libbpf for re-use by bpf_linker.
Additionally, complete strings deduplication handling for BTF.ext (e.g., CO-RE access strings), which was previously missing. There previously was no case of deduplicating .BTF.ext data, but bpf_linker is going to use it.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-3-andrii@kernel.org (cherry picked from commit f36e99a45dbe76949eb99bba413c67eda5cd2591) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 393 ++++++++++++++++++-------------- tools/lib/bpf/libbpf_internal.h | 7 + 2 files changed, 228 insertions(+), 172 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 27091759aa81..194d330f6715 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -3161,95 +3161,28 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext, return d; }
-typedef int (*str_off_fn_t)(__u32 *str_off_ptr, void *ctx); - /* * Iterate over all possible places in .BTF and .BTF.ext that can reference * string and pass pointer to it to a provided callback `fn`. */ -static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx) +static int btf_for_each_str_off(struct btf_dedup *d, str_off_visit_fn fn, void *ctx) { - void *line_data_cur, *line_data_end; - int i, j, r, rec_size; - struct btf_type *t; + int i, r;
for (i = 0; i < d->btf->nr_types; i++) { - t = btf_type_by_id(d->btf, d->btf->start_id + i); - r = fn(&t->name_off, ctx); + struct btf_type *t = btf_type_by_id(d->btf, d->btf->start_id + i); + + r = btf_type_visit_str_offs(t, fn, ctx); if (r) return r; - - switch (btf_kind(t)) { - case BTF_KIND_STRUCT: - case BTF_KIND_UNION: { - struct btf_member *m = btf_members(t); - __u16 vlen = btf_vlen(t); - - for (j = 0; j < vlen; j++) { - r = fn(&m->name_off, ctx); - if (r) - return r; - m++; - } - break; - } - case BTF_KIND_ENUM: { - struct btf_enum *m = btf_enum(t); - __u16 vlen = btf_vlen(t); - - for (j = 0; j < vlen; j++) { - r = fn(&m->name_off, ctx); - if (r) - return r; - m++; - } - break; - } - case BTF_KIND_FUNC_PROTO: { - struct btf_param *m = btf_params(t); - __u16 vlen = btf_vlen(t); - - for (j = 0; j < vlen; j++) { - r = fn(&m->name_off, ctx); - if (r) - return r; - m++; - } - break; - } - default: - break; - } }
if (!d->btf_ext) return 0;
- line_data_cur = d->btf_ext->line_info.info; - line_data_end = d->btf_ext->line_info.info + d->btf_ext->line_info.len; - rec_size = d->btf_ext->line_info.rec_size; - - while (line_data_cur < line_data_end) { - struct btf_ext_info_sec *sec = line_data_cur; - struct bpf_line_info_min *line_info; - __u32 num_info = sec->num_info; - - r = fn(&sec->sec_name_off, ctx); - if (r) - return r; - - line_data_cur += sizeof(struct btf_ext_info_sec); - for (i = 0; i < num_info; i++) { - line_info = line_data_cur; - r = fn(&line_info->file_name_off, ctx); - if (r) - return r; - r = fn(&line_info->line_off, ctx); - if (r) - return r; - line_data_cur += rec_size; - } - } + r = btf_ext_visit_str_offs(d->btf_ext, fn, ctx); + if (r) + return r;
return 0; } @@ -4504,15 +4437,18 @@ static int btf_dedup_compact_types(struct btf_dedup *d) * then mapping it to a deduplicated type ID, stored in btf_dedup->hypot_map, * which is populated during compaction phase. */ -static int btf_dedup_remap_type_id(struct btf_dedup *d, __u32 type_id) +static int btf_dedup_remap_type_id(__u32 *type_id, void *ctx) { + struct btf_dedup *d = ctx; __u32 resolved_type_id, new_type_id;
- resolved_type_id = resolve_type_id(d, type_id); + resolved_type_id = resolve_type_id(d, *type_id); new_type_id = d->hypot_map[resolved_type_id]; if (new_type_id > BTF_MAX_NR_TYPES) return -EINVAL; - return new_type_id; + + *type_id = new_type_id; + return 0; }
/* @@ -4525,109 +4461,25 @@ static int btf_dedup_remap_type_id(struct btf_dedup *d, __u32 type_id) * referenced from any BTF type (e.g., struct fields, func proto args, etc) to * their final deduped type IDs. */ -static int btf_dedup_remap_type(struct btf_dedup *d, __u32 type_id) +static int btf_dedup_remap_types(struct btf_dedup *d) { - struct btf_type *t = btf_type_by_id(d->btf, type_id); int i, r;
- switch (btf_kind(t)) { - case BTF_KIND_INT: - case BTF_KIND_ENUM: - case BTF_KIND_FLOAT: - break; - - case BTF_KIND_FWD: - case BTF_KIND_CONST: - case BTF_KIND_VOLATILE: - case BTF_KIND_RESTRICT: - case BTF_KIND_PTR: - case BTF_KIND_TYPEDEF: - case BTF_KIND_FUNC: - case BTF_KIND_VAR: - r = btf_dedup_remap_type_id(d, t->type); - if (r < 0) - return r; - t->type = r; - break; - - case BTF_KIND_ARRAY: { - struct btf_array *arr_info = btf_array(t); - - r = btf_dedup_remap_type_id(d, arr_info->type); - if (r < 0) - return r; - arr_info->type = r; - r = btf_dedup_remap_type_id(d, arr_info->index_type); - if (r < 0) - return r; - arr_info->index_type = r; - break; - } - - case BTF_KIND_STRUCT: - case BTF_KIND_UNION: { - struct btf_member *member = btf_members(t); - __u16 vlen = btf_vlen(t); - - for (i = 0; i < vlen; i++) { - r = btf_dedup_remap_type_id(d, member->type); - if (r < 0) - return r; - member->type = r; - member++; - } - break; - } - - case BTF_KIND_FUNC_PROTO: { - struct btf_param *param = btf_params(t); - __u16 vlen = btf_vlen(t); + for (i = 0; i < d->btf->nr_types; i++) { + struct btf_type *t = btf_type_by_id(d->btf, d->btf->start_id + i);
- r = btf_dedup_remap_type_id(d, t->type); - if (r < 0) + r = btf_type_visit_type_ids(t, btf_dedup_remap_type_id, d); + if (r) return r; - t->type = r; - - for (i = 0; i < vlen; i++) { - r = btf_dedup_remap_type_id(d, param->type); - if (r < 0) - return r; - param->type = r; - param++; - } - break; - } - - case BTF_KIND_DATASEC: { - struct btf_var_secinfo *var = btf_var_secinfos(t); - __u16 vlen = btf_vlen(t); - - for (i = 0; i < vlen; i++) { - r = btf_dedup_remap_type_id(d, var->type); - if (r < 0) - return r; - var->type = r; - var++; - } - break; - } - - default: - return -EINVAL; }
- return 0; -} + if (!d->btf_ext) + return 0;
-static int btf_dedup_remap_types(struct btf_dedup *d) -{ - int i, r; + r = btf_ext_visit_type_ids(d->btf_ext, btf_dedup_remap_type_id, d); + if (r) + return r;
- for (i = 0; i < d->btf->nr_types; i++) { - r = btf_dedup_remap_type(d, d->btf->start_id + i); - if (r < 0) - return r; - } return 0; }
@@ -4681,3 +4533,200 @@ struct btf *libbpf_find_kernel_btf(void) pr_warn("failed to find valid kernel BTF\n"); return ERR_PTR(-ESRCH); } + +int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx) +{ + int i, n, err; + + switch (btf_kind(t)) { + case BTF_KIND_INT: + case BTF_KIND_FLOAT: + case BTF_KIND_ENUM: + return 0; + + case BTF_KIND_FWD: + case BTF_KIND_CONST: + case BTF_KIND_VOLATILE: + case BTF_KIND_RESTRICT: + case BTF_KIND_PTR: + case BTF_KIND_TYPEDEF: + case BTF_KIND_FUNC: + case BTF_KIND_VAR: + return visit(&t->type, ctx); + + case BTF_KIND_ARRAY: { + struct btf_array *a = btf_array(t); + + err = visit(&a->type, ctx); + err = err ?: visit(&a->index_type, ctx); + return err; + } + + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: { + struct btf_member *m = btf_members(t); + + for (i = 0, n = btf_vlen(t); i < n; i++, m++) { + err = visit(&m->type, ctx); + if (err) + return err; + } + return 0; + } + + case BTF_KIND_FUNC_PROTO: { + struct btf_param *m = btf_params(t); + + err = visit(&t->type, ctx); + if (err) + return err; + for (i = 0, n = btf_vlen(t); i < n; i++, m++) { + err = visit(&m->type, ctx); + if (err) + return err; + } + return 0; + } + + case BTF_KIND_DATASEC: { + struct btf_var_secinfo *m = btf_var_secinfos(t); + + for (i = 0, n = btf_vlen(t); i < n; i++, m++) { + err = visit(&m->type, ctx); + if (err) + return err; + } + return 0; + } + + default: + return -EINVAL; + } +} + +int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx) +{ + int i, n, err; + + err = visit(&t->name_off, ctx); + if (err) + return err; + + switch (btf_kind(t)) { + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: { + struct btf_member *m = btf_members(t); + + for (i = 0, n = btf_vlen(t); i < n; i++, m++) { + err = visit(&m->name_off, ctx); + if (err) + return err; + } + break; + } + case BTF_KIND_ENUM: { + struct btf_enum *m = btf_enum(t); + + for (i = 0, n = btf_vlen(t); i < n; i++, m++) { + err = visit(&m->name_off, ctx); + if (err) + return err; + } + break; + } + case BTF_KIND_FUNC_PROTO: { + struct btf_param *m = btf_params(t); + + for (i = 0, n = btf_vlen(t); i < n; i++, m++) { + err = visit(&m->name_off, ctx); + if (err) + return err; + } + break; + } + default: + break; + } + + return 0; +} + +int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx) +{ + const struct btf_ext_info *seg; + struct btf_ext_info_sec *sec; + int i, err; + + seg = &btf_ext->func_info; + for_each_btf_ext_sec(seg, sec) { + struct bpf_func_info_min *rec; + + for_each_btf_ext_rec(seg, sec, i, rec) { + err = visit(&rec->type_id, ctx); + if (err < 0) + return err; + } + } + + seg = &btf_ext->core_relo_info; + for_each_btf_ext_sec(seg, sec) { + struct bpf_core_relo *rec; + + for_each_btf_ext_rec(seg, sec, i, rec) { + err = visit(&rec->type_id, ctx); + if (err < 0) + return err; + } + } + + return 0; +} + +int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx) +{ + const struct btf_ext_info *seg; + struct btf_ext_info_sec *sec; + int i, err; + + seg = &btf_ext->func_info; + for_each_btf_ext_sec(seg, sec) { + err = visit(&sec->sec_name_off, ctx); + if (err) + return err; + } + + seg = &btf_ext->line_info; + for_each_btf_ext_sec(seg, sec) { + struct bpf_line_info_min *rec; + + err = visit(&sec->sec_name_off, ctx); + if (err) + return err; + + for_each_btf_ext_rec(seg, sec, i, rec) { + err = visit(&rec->file_name_off, ctx); + if (err) + return err; + err = visit(&rec->line_off, ctx); + if (err) + return err; + } + } + + seg = &btf_ext->core_relo_info; + for_each_btf_ext_sec(seg, sec) { + struct bpf_core_relo *rec; + + err = visit(&sec->sec_name_off, ctx); + if (err) + return err; + + for_each_btf_ext_rec(seg, sec, i, rec) { + err = visit(&rec->access_str_off, ctx); + if (err) + return err; + } + } + + return 0; +} diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index d09860e435c8..97b6b9cc9839 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -356,4 +356,11 @@ struct bpf_core_relo { enum bpf_core_relo_kind kind; };
+typedef int (*type_id_visit_fn)(__u32 *type_id, void *ctx); +typedef int (*str_off_visit_fn)(__u32 *str_off, void *ctx); +int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx); +int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx); +int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx); +int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx); + #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 3b029e06f624efa90c9a4354e408acf134adb185 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Rename btf_add_mem() and btf_ensure_mem() helpers that abstract away details of dynamically resizable memory to use libbpf_ prefix, as they are not BTF-specific. No functional changes.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-4-andrii@kernel.org (cherry picked from commit 3b029e06f624efa90c9a4354e408acf134adb185) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 24 ++++++++++++------------ tools/lib/bpf/btf_dump.c | 8 ++++---- tools/lib/bpf/libbpf.c | 4 ++-- tools/lib/bpf/libbpf_internal.h | 6 +++--- 4 files changed, 21 insertions(+), 21 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 194d330f6715..f04dbf036014 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -142,8 +142,8 @@ static inline __u64 ptr_to_u64(const void *ptr) * On success, memory pointer to the beginning of unused memory is returned. * On error, NULL is returned. */ -void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, - size_t cur_cnt, size_t max_cnt, size_t add_cnt) +void *libbpf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, + size_t cur_cnt, size_t max_cnt, size_t add_cnt) { size_t new_cnt; void *new_data; @@ -179,14 +179,14 @@ void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, /* Ensure given dynamically allocated memory region has enough allocated space * to accommodate *need_cnt* elements of size *elem_sz* bytes each */ -int btf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt) +int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt) { void *p;
if (need_cnt <= *cap_cnt) return 0;
- p = btf_add_mem(data, cap_cnt, elem_sz, *cap_cnt, SIZE_MAX, need_cnt - *cap_cnt); + p = libbpf_add_mem(data, cap_cnt, elem_sz, *cap_cnt, SIZE_MAX, need_cnt - *cap_cnt); if (!p) return -ENOMEM;
@@ -197,8 +197,8 @@ static int btf_add_type_idx_entry(struct btf *btf, __u32 type_off) { __u32 *p;
- p = btf_add_mem((void **)&btf->type_offs, &btf->type_offs_cap, sizeof(__u32), - btf->nr_types, BTF_MAX_NR_TYPES, 1); + p = libbpf_add_mem((void **)&btf->type_offs, &btf->type_offs_cap, sizeof(__u32), + btf->nr_types, BTF_MAX_NR_TYPES, 1); if (!p) return -ENOMEM;
@@ -1592,8 +1592,8 @@ static int btf_ensure_modifiable(struct btf *btf)
static void *btf_add_str_mem(struct btf *btf, size_t add_sz) { - return btf_add_mem(&btf->strs_data, &btf->strs_data_cap, 1, - btf->hdr->str_len, BTF_MAX_STR_OFFSET, add_sz); + return libbpf_add_mem(&btf->strs_data, &btf->strs_data_cap, 1, + btf->hdr->str_len, BTF_MAX_STR_OFFSET, add_sz); }
/* Find an offset in BTF string section that corresponds to a given string *s*. @@ -1689,8 +1689,8 @@ int btf__add_str(struct btf *btf, const char *s)
static void *btf_add_type_mem(struct btf *btf, size_t add_sz) { - return btf_add_mem(&btf->types_data, &btf->types_data_cap, 1, - btf->hdr->type_len, UINT_MAX, add_sz); + return libbpf_add_mem(&btf->types_data, &btf->types_data_cap, 1, + btf->hdr->type_len, UINT_MAX, add_sz); }
static __u32 btf_type_info(int kind, int vlen, int kflag) @@ -3214,7 +3214,7 @@ static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) len = strlen(s) + 1;
new_off = d->strs_len; - p = btf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off, BTF_MAX_STR_OFFSET, len); + p = libbpf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off, BTF_MAX_STR_OFFSET, len); if (!p) return -ENOMEM;
@@ -3270,7 +3270,7 @@ static int btf_dedup_strings(struct btf_dedup *d) }
if (!d->btf->base_btf) { - s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1); + s = libbpf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1); if (!s) return -ENOMEM; /* initial empty string */ diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 48ab93034dbc..fa27b0bbea0e 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -166,11 +166,11 @@ static int btf_dump_resize(struct btf_dump *d) if (last_id <= d->last_id) return 0;
- if (btf_ensure_mem((void **)&d->type_states, &d->type_states_cap, - sizeof(*d->type_states), last_id + 1)) + if (libbpf_ensure_mem((void **)&d->type_states, &d->type_states_cap, + sizeof(*d->type_states), last_id + 1)) return -ENOMEM; - if (btf_ensure_mem((void **)&d->cached_names, &d->cached_names_cap, - sizeof(*d->cached_names), last_id + 1)) + if (libbpf_ensure_mem((void **)&d->cached_names, &d->cached_names_cap, + sizeof(*d->cached_names), last_id + 1)) return -ENOMEM;
if (d->last_id == 0) { diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 5bd3d4416500..a86590c01010 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4922,8 +4922,8 @@ static int load_module_btfs(struct bpf_object *obj) goto err_out; }
- err = btf_ensure_mem((void **)&obj->btf_modules, &obj->btf_module_cap, - sizeof(*obj->btf_modules), obj->btf_module_cnt + 1); + err = libbpf_ensure_mem((void **)&obj->btf_modules, &obj->btf_module_cap, + sizeof(*obj->btf_modules), obj->btf_module_cnt + 1); if (err) goto err_out;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 97b6b9cc9839..e0787900eeca 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -112,9 +112,9 @@ struct btf_type;
struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id);
-void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, - size_t cur_cnt, size_t max_cnt, size_t add_cnt); -int btf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt); +void *libbpf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, + size_t cur_cnt, size_t max_cnt, size_t add_cnt); +int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
static inline bool libbpf_validate_opts(const char *opts, size_t opts_sz, size_t user_sz,
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 90d76d3ececc74bf43b2a97f178dadfa1e52be54 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Extract BTF logic for maintaining a set of strings data structure, used for BTF strings section construction in writable mode, into separate re-usable API. This data structure is going to be used by bpf_linker to maintains ELF STRTAB section, which has the same layout as BTF strings section.
Suggested-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-5-andrii@kernel.org (cherry picked from commit 90d76d3ececc74bf43b2a97f178dadfa1e52be54) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Build | 2 +- tools/lib/bpf/btf.c | 255 ++++++++++------------------------------- tools/lib/bpf/strset.c | 176 ++++++++++++++++++++++++++++ tools/lib/bpf/strset.h | 21 ++++ 4 files changed, 259 insertions(+), 195 deletions(-) create mode 100644 tools/lib/bpf/strset.c create mode 100644 tools/lib/bpf/strset.h
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build index 190366d05588..8136186a453f 100644 --- a/tools/lib/bpf/Build +++ b/tools/lib/bpf/Build @@ -1,3 +1,3 @@ libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \ netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \ - btf_dump.o ringbuf.o + btf_dump.o ringbuf.o strset.o diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index f04dbf036014..c2116f2ad380 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -21,6 +21,7 @@ #include "libbpf.h" #include "libbpf_internal.h" #include "hashmap.h" +#include "strset.h"
#define BTF_MAX_NR_TYPES 0x7fffffffU #define BTF_MAX_STR_OFFSET 0x7fffffffU @@ -67,7 +68,7 @@ struct btf { * | | | * hdr | | * types_data----+ | - * strs_data------------------+ + * strset__data(strs_set)-----+ * * +----------+---------+-----------+ * | Header | Types | Strings | @@ -105,20 +106,15 @@ struct btf { */ int start_str_off;
+ /* only one of strs_data or strs_set can be non-NULL, depending on + * whether BTF is in a modifiable state (strs_set is used) or not + * (strs_data points inside raw_data) + */ void *strs_data; - size_t strs_data_cap; /* used size stored in hdr->str_len */ - - /* lookup index for each unique string in strings section */ - struct hashmap *strs_hash; + /* a set of unique strings */ + struct strset *strs_set; /* whether strings are already deduplicated */ bool strs_deduped; - /* extra indirection layer to make strings hashmap work with stable - * string offsets and ability to transparently choose between - * btf->strs_data or btf_dedup->strs_data as a source of strings. - * This is used for BTF strings dedup to transfer deduplicated strings - * data back to struct btf without re-building strings index. - */ - void **strs_data_ptr;
/* BTF object FD, if loaded into kernel */ int fd; @@ -744,7 +740,7 @@ void btf__free(struct btf *btf) */ free(btf->hdr); free(btf->types_data); - free(btf->strs_data); + strset__free(btf->strs_set); } free(btf->raw_data); free(btf->raw_data_swapped); @@ -1252,6 +1248,11 @@ void btf__set_fd(struct btf *btf, int fd) btf->fd = fd; }
+static const void *btf_strs_data(const struct btf *btf) +{ + return btf->strs_data ? btf->strs_data : strset__data(btf->strs_set); +} + static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian) { struct btf_header *hdr = btf->hdr; @@ -1292,7 +1293,7 @@ static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endi } p += hdr->type_len;
- memcpy(p, btf->strs_data, hdr->str_len); + memcpy(p, btf_strs_data(btf), hdr->str_len); p += hdr->str_len;
*size = data_sz; @@ -1326,7 +1327,7 @@ const char *btf__str_by_offset(const struct btf *btf, __u32 offset) if (offset < btf->start_str_off) return btf__str_by_offset(btf->base_btf, offset); else if (offset - btf->start_str_off < btf->hdr->str_len) - return btf->strs_data + (offset - btf->start_str_off); + return btf_strs_data(btf) + (offset - btf->start_str_off); else return NULL; } @@ -1480,25 +1481,6 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, return 0; }
-static size_t strs_hash_fn(const void *key, void *ctx) -{ - const struct btf *btf = ctx; - const char *strs = *btf->strs_data_ptr; - const char *str = strs + (long)key; - - return str_hash(str); -} - -static bool strs_hash_equal_fn(const void *key1, const void *key2, void *ctx) -{ - const struct btf *btf = ctx; - const char *strs = *btf->strs_data_ptr; - const char *str1 = strs + (long)key1; - const char *str2 = strs + (long)key2; - - return strcmp(str1, str2) == 0; -} - static void btf_invalidate_raw_data(struct btf *btf) { if (btf->raw_data) { @@ -1517,10 +1499,9 @@ static void btf_invalidate_raw_data(struct btf *btf) */ static int btf_ensure_modifiable(struct btf *btf) { - void *hdr, *types, *strs, *strs_end, *s; - struct hashmap *hash = NULL; - long off; - int err; + void *hdr, *types; + struct strset *set = NULL; + int err = -ENOMEM;
if (btf_is_modifiable(btf)) { /* any BTF modification invalidates raw_data */ @@ -1531,44 +1512,25 @@ static int btf_ensure_modifiable(struct btf *btf) /* split raw data into three memory regions */ hdr = malloc(btf->hdr->hdr_len); types = malloc(btf->hdr->type_len); - strs = malloc(btf->hdr->str_len); - if (!hdr || !types || !strs) + if (!hdr || !types) goto err_out;
memcpy(hdr, btf->hdr, btf->hdr->hdr_len); memcpy(types, btf->types_data, btf->hdr->type_len); - memcpy(strs, btf->strs_data, btf->hdr->str_len); - - /* make hashmap below use btf->strs_data as a source of strings */ - btf->strs_data_ptr = &btf->strs_data;
/* build lookup index for all strings */ - hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, btf); - if (IS_ERR(hash)) { - err = PTR_ERR(hash); - hash = NULL; + set = strset__new(BTF_MAX_STR_OFFSET, btf->strs_data, btf->hdr->str_len); + if (IS_ERR(set)) { + err = PTR_ERR(set); goto err_out; }
- strs_end = strs + btf->hdr->str_len; - for (off = 0, s = strs; s < strs_end; off += strlen(s) + 1, s = strs + off) { - /* hashmap__add() returns EEXIST if string with the same - * content already is in the hash map - */ - err = hashmap__add(hash, (void *)off, (void *)off); - if (err == -EEXIST) - continue; /* duplicate */ - if (err) - goto err_out; - } - /* only when everything was successful, update internal state */ btf->hdr = hdr; btf->types_data = types; btf->types_data_cap = btf->hdr->type_len; - btf->strs_data = strs; - btf->strs_data_cap = btf->hdr->str_len; - btf->strs_hash = hash; + btf->strs_data = NULL; + btf->strs_set = set; /* if BTF was created from scratch, all strings are guaranteed to be * unique and deduplicated */ @@ -1583,17 +1545,10 @@ static int btf_ensure_modifiable(struct btf *btf) return 0;
err_out: - hashmap__free(hash); + strset__free(set); free(hdr); free(types); - free(strs); - return -ENOMEM; -} - -static void *btf_add_str_mem(struct btf *btf, size_t add_sz) -{ - return libbpf_add_mem(&btf->strs_data, &btf->strs_data_cap, 1, - btf->hdr->str_len, BTF_MAX_STR_OFFSET, add_sz); + return err; }
/* Find an offset in BTF string section that corresponds to a given string *s*. @@ -1604,34 +1559,23 @@ static void *btf_add_str_mem(struct btf *btf, size_t add_sz) */ int btf__find_str(struct btf *btf, const char *s) { - long old_off, new_off, len; - void *p; + int off;
if (btf->base_btf) { - int ret; - - ret = btf__find_str(btf->base_btf, s); - if (ret != -ENOENT) - return ret; + off = btf__find_str(btf->base_btf, s); + if (off != -ENOENT) + return off; }
/* BTF needs to be in a modifiable state to build string lookup index */ if (btf_ensure_modifiable(btf)) return -ENOMEM;
- /* see btf__add_str() for why we do this */ - len = strlen(s) + 1; - p = btf_add_str_mem(btf, len); - if (!p) - return -ENOMEM; - - new_off = btf->hdr->str_len; - memcpy(p, s, len); + off = strset__find_str(btf->strs_set, s); + if (off < 0) + return off;
- if (hashmap__find(btf->strs_hash, (void *)new_off, (void **)&old_off)) - return btf->start_str_off + old_off; - - return -ENOENT; + return btf->start_str_off + off; }
/* Add a string s to the BTF string section. @@ -1641,50 +1585,24 @@ int btf__find_str(struct btf *btf, const char *s) */ int btf__add_str(struct btf *btf, const char *s) { - long old_off, new_off, len; - void *p; - int err; + int off;
if (btf->base_btf) { - int ret; - - ret = btf__find_str(btf->base_btf, s); - if (ret != -ENOENT) - return ret; + off = btf__find_str(btf->base_btf, s); + if (off != -ENOENT) + return off; }
if (btf_ensure_modifiable(btf)) return -ENOMEM;
- /* Hashmap keys are always offsets within btf->strs_data, so to even - * look up some string from the "outside", we need to first append it - * at the end, so that it can be addressed with an offset. Luckily, - * until btf->hdr->str_len is incremented, that string is just a piece - * of garbage for the rest of BTF code, so no harm, no foul. On the - * other hand, if the string is unique, it's already appended and - * ready to be used, only a simple btf->hdr->str_len increment away. - */ - len = strlen(s) + 1; - p = btf_add_str_mem(btf, len); - if (!p) - return -ENOMEM; + off = strset__add_str(btf->strs_set, s); + if (off < 0) + return off;
- new_off = btf->hdr->str_len; - memcpy(p, s, len); + btf->hdr->str_len = strset__data_size(btf->strs_set);
- /* Now attempt to add the string, but only if the string with the same - * contents doesn't exist already (HASHMAP_ADD strategy). If such - * string exists, we'll get its offset in old_off (that's old_key). - */ - err = hashmap__insert(btf->strs_hash, (void *)new_off, (void *)new_off, - HASHMAP_ADD, (const void **)&old_off, NULL); - if (err == -EEXIST) - return btf->start_str_off + old_off; /* duplicated string, return existing offset */ - if (err) - return err; - - btf->hdr->str_len += len; /* new unique string, adjust data length */ - return btf->start_str_off + new_off; + return btf->start_str_off + off; }
static void *btf_add_type_mem(struct btf *btf, size_t add_sz) @@ -3022,10 +2940,7 @@ struct btf_dedup { /* Various option modifying behavior of algorithm */ struct btf_dedup_opts opts; /* temporary strings deduplication state */ - void *strs_data; - size_t strs_cap; - size_t strs_len; - struct hashmap* strs_hash; + struct strset *strs_set; };
static long hash_combine(long h, long value) @@ -3191,10 +3106,8 @@ static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) { struct btf_dedup *d = ctx; __u32 str_off = *str_off_ptr; - long old_off, new_off, len; const char *s; - void *p; - int err; + int off, err;
/* don't touch empty string or string in main BTF */ if (str_off == 0 || str_off < d->btf->start_str_off) @@ -3211,29 +3124,11 @@ static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) return err; }
- len = strlen(s) + 1; + off = strset__add_str(d->strs_set, s); + if (off < 0) + return off;
- new_off = d->strs_len; - p = libbpf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off, BTF_MAX_STR_OFFSET, len); - if (!p) - return -ENOMEM; - - memcpy(p, s, len); - - /* Now attempt to add the string, but only if the string with the same - * contents doesn't exist already (HASHMAP_ADD strategy). If such - * string exists, we'll get its offset in old_off (that's old_key). - */ - err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off, - HASHMAP_ADD, (const void **)&old_off, NULL); - if (err == -EEXIST) { - *str_off_ptr = d->btf->start_str_off + old_off; - } else if (err) { - return err; - } else { - *str_off_ptr = d->btf->start_str_off + new_off; - d->strs_len += len; - } + *str_off_ptr = d->btf->start_str_off + off; return 0; }
@@ -3250,39 +3145,23 @@ static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx) */ static int btf_dedup_strings(struct btf_dedup *d) { - char *s; int err;
if (d->btf->strs_deduped) return 0;
- /* temporarily switch to use btf_dedup's strs_data for strings for hash - * functions; later we'll just transfer hashmap to struct btf as is, - * along the strs_data - */ - d->btf->strs_data_ptr = &d->strs_data; - - d->strs_hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, d->btf); - if (IS_ERR(d->strs_hash)) { - err = PTR_ERR(d->strs_hash); - d->strs_hash = NULL; + d->strs_set = strset__new(BTF_MAX_STR_OFFSET, NULL, 0); + if (IS_ERR(d->strs_set)) { + err = PTR_ERR(d->strs_set); goto err_out; }
if (!d->btf->base_btf) { - s = libbpf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1); - if (!s) - return -ENOMEM; - /* initial empty string */ - s[0] = 0; - d->strs_len = 1; - /* insert empty string; we won't be looking it up during strings * dedup, but it's good to have it for generic BTF string lookups */ - err = hashmap__insert(d->strs_hash, (void *)0, (void *)0, - HASHMAP_ADD, NULL, NULL); - if (err) + err = strset__add_str(d->strs_set, ""); + if (err < 0) goto err_out; }
@@ -3292,28 +3171,16 @@ static int btf_dedup_strings(struct btf_dedup *d) goto err_out;
/* replace BTF string data and hash with deduped ones */ - free(d->btf->strs_data); - hashmap__free(d->btf->strs_hash); - d->btf->strs_data = d->strs_data; - d->btf->strs_data_cap = d->strs_cap; - d->btf->hdr->str_len = d->strs_len; - d->btf->strs_hash = d->strs_hash; - /* now point strs_data_ptr back to btf->strs_data */ - d->btf->strs_data_ptr = &d->btf->strs_data; - - d->strs_data = d->strs_hash = NULL; - d->strs_len = d->strs_cap = 0; + strset__free(d->btf->strs_set); + d->btf->hdr->str_len = strset__data_size(d->strs_set); + d->btf->strs_set = d->strs_set; + d->strs_set = NULL; d->btf->strs_deduped = true; return 0;
err_out: - free(d->strs_data); - hashmap__free(d->strs_hash); - d->strs_data = d->strs_hash = NULL; - d->strs_len = d->strs_cap = 0; - - /* restore strings pointer for existing d->btf->strs_hash back */ - d->btf->strs_data_ptr = &d->strs_data; + strset__free(d->strs_set); + d->strs_set = NULL;
return err; } diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c new file mode 100644 index 000000000000..1fb8b49de1d6 --- /dev/null +++ b/tools/lib/bpf/strset.c @@ -0,0 +1,176 @@ +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +/* Copyright (c) 2021 Facebook */ +#include <stdint.h> +#include <stdlib.h> +#include <stdio.h> +#include <errno.h> +#include <linux/err.h> +#include "hashmap.h" +#include "libbpf_internal.h" +#include "strset.h" + +struct strset { + void *strs_data; + size_t strs_data_len; + size_t strs_data_cap; + size_t strs_data_max_len; + + /* lookup index for each unique string in strings set */ + struct hashmap *strs_hash; +}; + +static size_t strset_hash_fn(const void *key, void *ctx) +{ + const struct strset *s = ctx; + const char *str = s->strs_data + (long)key; + + return str_hash(str); +} + +static bool strset_equal_fn(const void *key1, const void *key2, void *ctx) +{ + const struct strset *s = ctx; + const char *str1 = s->strs_data + (long)key1; + const char *str2 = s->strs_data + (long)key2; + + return strcmp(str1, str2) == 0; +} + +struct strset *strset__new(size_t max_data_sz, const char *init_data, size_t init_data_sz) +{ + struct strset *set = calloc(1, sizeof(*set)); + struct hashmap *hash; + int err = -ENOMEM; + + if (!set) + return ERR_PTR(-ENOMEM); + + hash = hashmap__new(strset_hash_fn, strset_equal_fn, set); + if (IS_ERR(hash)) + goto err_out; + + set->strs_data_max_len = max_data_sz; + set->strs_hash = hash; + + if (init_data) { + long off; + + set->strs_data = malloc(init_data_sz); + if (!set->strs_data) + goto err_out; + + memcpy(set->strs_data, init_data, init_data_sz); + set->strs_data_len = init_data_sz; + set->strs_data_cap = init_data_sz; + + for (off = 0; off < set->strs_data_len; off += strlen(set->strs_data + off) + 1) { + /* hashmap__add() returns EEXIST if string with the same + * content already is in the hash map + */ + err = hashmap__add(hash, (void *)off, (void *)off); + if (err == -EEXIST) + continue; /* duplicate */ + if (err) + goto err_out; + } + } + + return set; +err_out: + strset__free(set); + return ERR_PTR(err); +} + +void strset__free(struct strset *set) +{ + if (IS_ERR_OR_NULL(set)) + return; + + hashmap__free(set->strs_hash); + free(set->strs_data); +} + +size_t strset__data_size(const struct strset *set) +{ + return set->strs_data_len; +} + +const char *strset__data(const struct strset *set) +{ + return set->strs_data; +} + +static void *strset_add_str_mem(struct strset *set, size_t add_sz) +{ + return libbpf_add_mem(&set->strs_data, &set->strs_data_cap, 1, + set->strs_data_len, set->strs_data_max_len, add_sz); +} + +/* Find string offset that corresponds to a given string *s*. + * Returns: + * - >0 offset into string data, if string is found; + * - -ENOENT, if string is not in the string data; + * - <0, on any other error. + */ +int strset__find_str(struct strset *set, const char *s) +{ + long old_off, new_off, len; + void *p; + + /* see strset__add_str() for why we do this */ + len = strlen(s) + 1; + p = strset_add_str_mem(set, len); + if (!p) + return -ENOMEM; + + new_off = set->strs_data_len; + memcpy(p, s, len); + + if (hashmap__find(set->strs_hash, (void *)new_off, (void **)&old_off)) + return old_off; + + return -ENOENT; +} + +/* Add a string s to the string data. If the string already exists, return its + * offset within string data. + * Returns: + * - > 0 offset into string data, on success; + * - < 0, on error. + */ +int strset__add_str(struct strset *set, const char *s) +{ + long old_off, new_off, len; + void *p; + int err; + + /* Hashmap keys are always offsets within set->strs_data, so to even + * look up some string from the "outside", we need to first append it + * at the end, so that it can be addressed with an offset. Luckily, + * until set->strs_data_len is incremented, that string is just a piece + * of garbage for the rest of the code, so no harm, no foul. On the + * other hand, if the string is unique, it's already appended and + * ready to be used, only a simple set->strs_data_len increment away. + */ + len = strlen(s) + 1; + p = strset_add_str_mem(set, len); + if (!p) + return -ENOMEM; + + new_off = set->strs_data_len; + memcpy(p, s, len); + + /* Now attempt to add the string, but only if the string with the same + * contents doesn't exist already (HASHMAP_ADD strategy). If such + * string exists, we'll get its offset in old_off (that's old_key). + */ + err = hashmap__insert(set->strs_hash, (void *)new_off, (void *)new_off, + HASHMAP_ADD, (const void **)&old_off, NULL); + if (err == -EEXIST) + return old_off; /* duplicated string, return existing offset */ + if (err) + return err; + + set->strs_data_len += len; /* new unique string, adjust data length */ + return new_off; +} diff --git a/tools/lib/bpf/strset.h b/tools/lib/bpf/strset.h new file mode 100644 index 000000000000..b6ddf77a83c2 --- /dev/null +++ b/tools/lib/bpf/strset.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ + +/* Copyright (c) 2021 Facebook */ +#ifndef __LIBBPF_STRSET_H +#define __LIBBPF_STRSET_H + +#include <stdbool.h> +#include <stddef.h> + +struct strset; + +struct strset *strset__new(size_t max_data_sz, const char *init_data, size_t init_data_sz); +void strset__free(struct strset *set); + +const char *strset__data(const struct strset *set); +size_t strset__data_size(const struct strset *set); + +int strset__find_str(struct strset *set, const char *s); +int strset__add_str(struct strset *set, const char *s); + +#endif /* __LIBBPF_STRSET_H */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 9af44bc5d4d70b37c9ada24d8e0367b34b805bd3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add btf__add_type() API that performs shallow copy of a given BTF type from the source BTF into the destination BTF. All the information and type IDs are preserved, but all the strings encountered are added into the destination BTF and corresponding offsets are rewritten. BTF type IDs are assumed to be correct or such that will be (somehow) modified afterwards.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org (cherry picked from commit 9af44bc5d4d70b37c9ada24d8e0367b34b805bd3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 48 ++++++++++++++++++++++++++++++++++++++++ tools/lib/bpf/btf.h | 2 ++ tools/lib/bpf/libbpf.map | 1 + 3 files changed, 51 insertions(+)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index c2116f2ad380..4c0dd20c0ec9 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1635,6 +1635,54 @@ static int btf_commit_type(struct btf *btf, int data_sz) return btf->start_id + btf->nr_types - 1; }
+struct btf_pipe { + const struct btf *src; + struct btf *dst; +}; + +static int btf_rewrite_str(__u32 *str_off, void *ctx) +{ + struct btf_pipe *p = ctx; + int off; + + if (!*str_off) /* nothing to do for empty strings */ + return 0; + + off = btf__add_str(p->dst, btf__str_by_offset(p->src, *str_off)); + if (off < 0) + return off; + + *str_off = off; + return 0; +} + +int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type) +{ + struct btf_pipe p = { .src = src_btf, .dst = btf }; + struct btf_type *t; + int sz, err; + + sz = btf_type_size(src_type); + if (sz < 0) + return sz; + + /* deconstruct BTF, if necessary, and invalidate raw_data */ + if (btf_ensure_modifiable(btf)) + return -ENOMEM; + + t = btf_add_type_mem(btf, sz); + if (!t) + return -ENOMEM; + + memcpy(t, src_type, sz); + + err = btf_type_visit_str_offs(t, btf_rewrite_str, &p); + if (err) + return err; + + return btf_commit_type(btf, sz); +} + /* * Append new BTF_KIND_INT type with: * - *name* - non-empty, non-NULL type name; diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 00313159c42e..b54f1c3ebd57 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -93,6 +93,8 @@ LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
LIBBPF_API int btf__find_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_str(struct btf *btf, const char *s); +LIBBPF_API int btf__add_type(struct btf *btf, const struct btf *src_btf, + const struct btf_type *src_type);
LIBBPF_API int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding); LIBBPF_API int btf__add_float(struct btf *btf, const char *name, size_t byte_sz); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 84099e305666..79b09a0ed6f5 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -354,4 +354,5 @@ LIBBPF_0.3.0 { LIBBPF_0.4.0 { global: btf__add_float; + btf__add_type; } LIBBPF_0.3.0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit faf6ed321cf61fafa17444fe01e7e336b8e89acc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Introduce BPF static linker APIs to libbpf. BPF static linker allows to perform static linking of multiple BPF object files into a single combined resulting object file, preserving all the BPF programs, maps, global variables, etc.
Data sections (.bss, .data, .rodata, .maps, maps, etc) with the same name are concatenated together. Similarly, code sections are also concatenated. All the symbols and ELF relocations are also concatenated in their respective ELF sections and are adjusted accordingly to the new object file layout.
Static variables and functions are handled correctly as well, adjusting BPF instructions offsets to reflect new variable/function offset within the combined ELF section. Such relocations are referencing STT_SECTION symbols and that stays intact.
Data sections in different files can have different alignment requirements, so that is taken care of as well, adjusting sizes and offsets as necessary to satisfy both old and new alignment requirements.
DWARF data sections are stripped out, currently. As well as LLLVM_ADDRSIG section, which is ignored by libbpf in bpf_object__open() anyways. So, in a way, BPF static linker is an analogue to `llvm-strip -g`, which is a pretty nice property, especially if resulting .o file is then used to generate BPF skeleton.
Original string sections are ignored and instead we construct our own set of unique strings using libbpf-internal `struct strset` API.
To reduce the size of the patch, all the .BTF and .BTF.ext processing was moved into a separate patch.
The high-level API consists of just 4 functions: - bpf_linker__new() creates an instance of BPF static linker. It accepts output filename and (currently empty) options struct; - bpf_linker__add_file() takes input filename and appends it to the already processed ELF data; it can be called multiple times, one for each BPF ELF object file that needs to be linked in; - bpf_linker__finalize() needs to be called to dump final ELF contents into the output file, specified when bpf_linker was created; after bpf_linker__finalize() is called, no more bpf_linker__add_file() and bpf_linker__finalize() calls are allowed, they will return error; - regardless of whether bpf_linker__finalize() was called or not, bpf_linker__free() will free up all the used resources.
Currently, BPF static linker doesn't resolve cross-object file references (extern variables and/or functions). This will be added in the follow up patch set.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org (cherry picked from commit faf6ed321cf61fafa17444fe01e7e336b8e89acc) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Build | 2 +- tools/lib/bpf/libbpf.c | 11 +- tools/lib/bpf/libbpf.h | 13 + tools/lib/bpf/libbpf.map | 4 + tools/lib/bpf/libbpf_internal.h | 20 + tools/lib/bpf/linker.c | 1176 +++++++++++++++++++++++++++++++ 6 files changed, 1215 insertions(+), 11 deletions(-) create mode 100644 tools/lib/bpf/linker.c
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build index 8136186a453f..9b057cc7650a 100644 --- a/tools/lib/bpf/Build +++ b/tools/lib/bpf/Build @@ -1,3 +1,3 @@ libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \ netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \ - btf_dump.o ringbuf.o strset.o + btf_dump.o ringbuf.o strset.o linker.o diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a86590c01010..e1350af34228 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -55,10 +55,6 @@ #include "libbpf_internal.h" #include "hashmap.h"
-#ifndef EM_BPF -#define EM_BPF 247 -#endif - #ifndef BPF_FS_MAGIC #define BPF_FS_MAGIC 0xcafe4a11 #endif @@ -1134,11 +1130,6 @@ static void bpf_object__elf_finish(struct bpf_object *obj) obj->efile.obj_buf_sz = 0; }
-/* if libelf is old and doesn't support mmap(), fall back to read() */ -#ifndef ELF_C_READ_MMAP -#define ELF_C_READ_MMAP ELF_C_READ -#endif - static int bpf_object__elf_init(struct bpf_object *obj) { int err = 0; @@ -2809,7 +2800,7 @@ static bool ignore_elf_section(GElf_Shdr *hdr, const char *name) return true;
/* ignore .llvm_addrsig section as well */ - if (hdr->sh_type == 0x6FFF4C03 /* SHT_LLVM_ADDRSIG */) + if (hdr->sh_type == SHT_LLVM_ADDRSIG) return true;
/* no subprograms will lead to an empty .text section, ignore it */ diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index a011179f705d..7c1fd67c9339 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -763,6 +763,19 @@ enum libbpf_tristate { TRI_MODULE = 2, };
+struct bpf_linker_opts { + /* size of this struct, for forward/backward compatiblity */ + size_t sz; +}; +#define bpf_linker_opts__last_field sz + +struct bpf_linker; + +LIBBPF_API struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts *opts); +LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker, const char *filename); +LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker); +LIBBPF_API void bpf_linker__free(struct bpf_linker *linker); + #ifdef __cplusplus } /* extern "C" */ #endif diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 79b09a0ed6f5..e07bb4fb0923 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -355,4 +355,8 @@ LIBBPF_0.4.0 { global: btf__add_float; btf__add_type; + bpf_linker__add_file; + bpf_linker__finalize; + bpf_linker__free; + bpf_linker__new; } LIBBPF_0.3.0; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index e0787900eeca..6017902c687e 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -20,6 +20,26 @@
#include "libbpf.h"
+#ifndef EM_BPF +#define EM_BPF 247 +#endif + +#ifndef R_BPF_64_64 +#define R_BPF_64_64 1 +#endif +#ifndef R_BPF_64_32 +#define R_BPF_64_32 10 +#endif + +#ifndef SHT_LLVM_ADDRSIG +#define SHT_LLVM_ADDRSIG 0x6FFF4C03 +#endif + +/* if libelf is old and doesn't support mmap(), fall back to read() */ +#ifndef ELF_C_READ_MMAP +#define ELF_C_READ_MMAP ELF_C_READ +#endif + #define BTF_INFO_ENC(kind, kind_flag, vlen) \ ((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN)) #define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type) diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c new file mode 100644 index 000000000000..01b56939a835 --- /dev/null +++ b/tools/lib/bpf/linker.c @@ -0,0 +1,1176 @@ +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +/* + * BPF static linker + * + * Copyright (c) 2021 Facebook + */ +#include <stdbool.h> +#include <stddef.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <errno.h> +#include <linux/err.h> +#include <linux/btf.h> +#include <elf.h> +#include <libelf.h> +#include <gelf.h> +#include <fcntl.h> +#include "libbpf.h" +#include "btf.h" +#include "libbpf_internal.h" +#include "strset.h" + +struct src_sec { + const char *sec_name; + /* positional (not necessarily ELF) index in an array of sections */ + int id; + /* positional (not necessarily ELF) index of a matching section in a final object file */ + int dst_id; + /* section data offset in a matching output section */ + int dst_off; + /* whether section is omitted from the final ELF file */ + bool skipped; + + /* ELF info */ + size_t sec_idx; + Elf_Scn *scn; + Elf64_Shdr *shdr; + Elf_Data *data; +}; + +struct src_obj { + const char *filename; + int fd; + Elf *elf; + /* Section header strings section index */ + size_t shstrs_sec_idx; + /* SYMTAB section index */ + size_t symtab_sec_idx; + + /* List of sections. Slot zero is unused. */ + struct src_sec *secs; + int sec_cnt; + + /* mapping of symbol indices from src to dst ELF */ + int *sym_map; +}; + +struct dst_sec { + char *sec_name; + /* positional (not necessarily ELF) index in an array of sections */ + int id; + + /* ELF info */ + size_t sec_idx; + Elf_Scn *scn; + Elf64_Shdr *shdr; + Elf_Data *data; + + /* final output section size */ + int sec_sz; + /* final output contents of the section */ + void *raw_data; + + /* corresponding STT_SECTION symbol index in SYMTAB */ + int sec_sym_idx; +}; + +struct bpf_linker { + char *filename; + int fd; + Elf *elf; + Elf64_Ehdr *elf_hdr; + + /* Output sections metadata */ + struct dst_sec *secs; + int sec_cnt; + + struct strset *strtab_strs; /* STRTAB unique strings */ + size_t strtab_sec_idx; /* STRTAB section index */ + size_t symtab_sec_idx; /* SYMTAB section index */ +}; + +#define pr_warn_elf(fmt, ...) \ +do { \ + libbpf_print(LIBBPF_WARN, "libbpf: " fmt ": %s\n", ##__VA_ARGS__, elf_errmsg(-1)); \ +} while (0) + +static int init_output_elf(struct bpf_linker *linker, const char *file); + +static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, struct src_obj *obj); +static int linker_sanity_check_elf(struct src_obj *obj); +static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj); +static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj); +static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *obj); + +void bpf_linker__free(struct bpf_linker *linker) +{ + int i; + + if (!linker) + return; + + free(linker->filename); + + if (linker->elf) + elf_end(linker->elf); + + if (linker->fd >= 0) + close(linker->fd); + + strset__free(linker->strtab_strs); + + for (i = 1; i < linker->sec_cnt; i++) { + struct dst_sec *sec = &linker->secs[i]; + + free(sec->sec_name); + free(sec->raw_data); + } + free(linker->secs); + + free(linker); +} + +struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts *opts) +{ + struct bpf_linker *linker; + int err; + + if (!OPTS_VALID(opts, bpf_linker_opts)) + return NULL; + + if (elf_version(EV_CURRENT) == EV_NONE) { + pr_warn_elf("libelf initialization failed"); + return NULL; + } + + linker = calloc(1, sizeof(*linker)); + if (!linker) + return NULL; + + linker->fd = -1; + + err = init_output_elf(linker, filename); + if (err) + goto err_out; + + return linker; + +err_out: + bpf_linker__free(linker); + return NULL; +} + +static struct dst_sec *add_dst_sec(struct bpf_linker *linker, const char *sec_name) +{ + struct dst_sec *secs = linker->secs, *sec; + size_t new_cnt = linker->sec_cnt ? linker->sec_cnt + 1 : 2; + + secs = libbpf_reallocarray(secs, new_cnt, sizeof(*secs)); + if (!secs) + return NULL; + + /* zero out newly allocated memory */ + memset(secs + linker->sec_cnt, 0, (new_cnt - linker->sec_cnt) * sizeof(*secs)); + + linker->secs = secs; + linker->sec_cnt = new_cnt; + + sec = &linker->secs[new_cnt - 1]; + sec->id = new_cnt - 1; + sec->sec_name = strdup(sec_name); + if (!sec->sec_name) + return NULL; + + return sec; +} + +static Elf64_Sym *add_new_sym(struct bpf_linker *linker, size_t *sym_idx) +{ + struct dst_sec *symtab = &linker->secs[linker->symtab_sec_idx]; + Elf64_Sym *syms, *sym; + size_t sym_cnt = symtab->sec_sz / sizeof(*sym); + + syms = libbpf_reallocarray(symtab->raw_data, sym_cnt + 1, sizeof(*sym)); + if (!syms) + return NULL; + + sym = &syms[sym_cnt]; + memset(sym, 0, sizeof(*sym)); + + symtab->raw_data = syms; + symtab->sec_sz += sizeof(*sym); + symtab->shdr->sh_size += sizeof(*sym); + symtab->data->d_size += sizeof(*sym); + + if (sym_idx) + *sym_idx = sym_cnt; + + return sym; +} + +static int init_output_elf(struct bpf_linker *linker, const char *file) +{ + int err, str_off; + Elf64_Sym *init_sym; + struct dst_sec *sec; + + linker->filename = strdup(file); + if (!linker->filename) + return -ENOMEM; + + linker->fd = open(file, O_WRONLY | O_CREAT | O_TRUNC, 0644); + if (linker->fd < 0) { + err = -errno; + pr_warn("failed to create '%s': %d\n", file, err); + return err; + } + + linker->elf = elf_begin(linker->fd, ELF_C_WRITE, NULL); + if (!linker->elf) { + pr_warn_elf("failed to create ELF object"); + return -EINVAL; + } + + /* ELF header */ + linker->elf_hdr = elf64_newehdr(linker->elf); + if (!linker->elf_hdr){ + pr_warn_elf("failed to create ELF header"); + return -EINVAL; + } + + linker->elf_hdr->e_machine = EM_BPF; + linker->elf_hdr->e_type = ET_REL; +#if __BYTE_ORDER == __LITTLE_ENDIAN + linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2LSB; +#elif __BYTE_ORDER == __BIG_ENDIAN + linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2MSB; +#else +#error "Unknown __BYTE_ORDER" +#endif + + /* STRTAB */ + /* initialize strset with an empty string to conform to ELF */ + linker->strtab_strs = strset__new(INT_MAX, "", sizeof("")); + if (libbpf_get_error(linker->strtab_strs)) + return libbpf_get_error(linker->strtab_strs); + + sec = add_dst_sec(linker, ".strtab"); + if (!sec) + return -ENOMEM; + + sec->scn = elf_newscn(linker->elf); + if (!sec->scn) { + pr_warn_elf("failed to create STRTAB section"); + return -EINVAL; + } + + sec->shdr = elf64_getshdr(sec->scn); + if (!sec->shdr) + return -EINVAL; + + sec->data = elf_newdata(sec->scn); + if (!sec->data) { + pr_warn_elf("failed to create STRTAB data"); + return -EINVAL; + } + + str_off = strset__add_str(linker->strtab_strs, sec->sec_name); + if (str_off < 0) + return str_off; + + sec->sec_idx = elf_ndxscn(sec->scn); + linker->elf_hdr->e_shstrndx = sec->sec_idx; + linker->strtab_sec_idx = sec->sec_idx; + + sec->shdr->sh_name = str_off; + sec->shdr->sh_type = SHT_STRTAB; + sec->shdr->sh_flags = SHF_STRINGS; + sec->shdr->sh_offset = 0; + sec->shdr->sh_link = 0; + sec->shdr->sh_info = 0; + sec->shdr->sh_addralign = 1; + sec->shdr->sh_size = sec->sec_sz = 0; + sec->shdr->sh_entsize = 0; + + /* SYMTAB */ + sec = add_dst_sec(linker, ".symtab"); + if (!sec) + return -ENOMEM; + + sec->scn = elf_newscn(linker->elf); + if (!sec->scn) { + pr_warn_elf("failed to create SYMTAB section"); + return -EINVAL; + } + + sec->shdr = elf64_getshdr(sec->scn); + if (!sec->shdr) + return -EINVAL; + + sec->data = elf_newdata(sec->scn); + if (!sec->data) { + pr_warn_elf("failed to create SYMTAB data"); + return -EINVAL; + } + + str_off = strset__add_str(linker->strtab_strs, sec->sec_name); + if (str_off < 0) + return str_off; + + sec->sec_idx = elf_ndxscn(sec->scn); + linker->symtab_sec_idx = sec->sec_idx; + + sec->shdr->sh_name = str_off; + sec->shdr->sh_type = SHT_SYMTAB; + sec->shdr->sh_flags = 0; + sec->shdr->sh_offset = 0; + sec->shdr->sh_link = linker->strtab_sec_idx; + /* sh_info should be one greater than the index of the last local + * symbol (i.e., binding is STB_LOCAL). But why and who cares? + */ + sec->shdr->sh_info = 0; + sec->shdr->sh_addralign = 8; + sec->shdr->sh_entsize = sizeof(Elf64_Sym); + + /* add the special all-zero symbol */ + init_sym = add_new_sym(linker, NULL); + if (!init_sym) + return -EINVAL; + + init_sym->st_name = 0; + init_sym->st_info = 0; + init_sym->st_other = 0; + init_sym->st_shndx = SHN_UNDEF; + init_sym->st_value = 0; + init_sym->st_size = 0; + + return 0; +} + +int bpf_linker__add_file(struct bpf_linker *linker, const char *filename) +{ + struct src_obj obj = {}; + int err = 0; + + if (!linker->elf) + return -EINVAL; + + err = err ?: linker_load_obj_file(linker, filename, &obj); + err = err ?: linker_append_sec_data(linker, &obj); + err = err ?: linker_append_elf_syms(linker, &obj); + err = err ?: linker_append_elf_relos(linker, &obj); + + /* free up src_obj resources */ + free(obj.secs); + free(obj.sym_map); + if (obj.elf) + elf_end(obj.elf); + if (obj.fd >= 0) + close(obj.fd); + + return err; +} + +static bool is_dwarf_sec_name(const char *name) +{ + /* approximation, but the actual list is too long */ + return strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0; +} + +static bool is_ignored_sec(struct src_sec *sec) +{ + Elf64_Shdr *shdr = sec->shdr; + const char *name = sec->sec_name; + + /* no special handling of .strtab */ + if (shdr->sh_type == SHT_STRTAB) + return true; + + /* ignore .llvm_addrsig section as well */ + if (shdr->sh_type == SHT_LLVM_ADDRSIG) + return true; + + /* no subprograms will lead to an empty .text section, ignore it */ + if (shdr->sh_type == SHT_PROGBITS && shdr->sh_size == 0 && + strcmp(sec->sec_name, ".text") == 0) + return true; + + /* DWARF sections */ + if (is_dwarf_sec_name(sec->sec_name)) + return true; + + if (strncmp(name, ".rel", sizeof(".rel") - 1) == 0) { + name += sizeof(".rel") - 1; + /* DWARF section relocations */ + if (is_dwarf_sec_name(name)) + return true; + + /* .BTF and .BTF.ext don't need relocations */ + if (strcmp(name, BTF_ELF_SEC) == 0 || + strcmp(name, BTF_EXT_ELF_SEC) == 0) + return true; + } + + return false; +} + +static struct src_sec *add_src_sec(struct src_obj *obj, const char *sec_name) +{ + struct src_sec *secs = obj->secs, *sec; + size_t new_cnt = obj->sec_cnt ? obj->sec_cnt + 1 : 2; + + secs = libbpf_reallocarray(secs, new_cnt, sizeof(*secs)); + if (!secs) + return NULL; + + /* zero out newly allocated memory */ + memset(secs + obj->sec_cnt, 0, (new_cnt - obj->sec_cnt) * sizeof(*secs)); + + obj->secs = secs; + obj->sec_cnt = new_cnt; + + sec = &obj->secs[new_cnt - 1]; + sec->id = new_cnt - 1; + sec->sec_name = sec_name; + + return sec; +} + +static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, struct src_obj *obj) +{ +#if __BYTE_ORDER == __LITTLE_ENDIAN + const int host_endianness = ELFDATA2LSB; +#elif __BYTE_ORDER == __BIG_ENDIAN + const int host_endianness = ELFDATA2MSB; +#else +#error "Unknown __BYTE_ORDER" +#endif + int err = 0; + Elf_Scn *scn; + Elf_Data *data; + Elf64_Ehdr *ehdr; + Elf64_Shdr *shdr; + struct src_sec *sec; + + pr_debug("linker: adding object file '%s'...\n", filename); + + obj->filename = filename; + + obj->fd = open(filename, O_RDONLY); + if (obj->fd < 0) { + err = -errno; + pr_warn("failed to open file '%s': %d\n", filename, err); + return err; + } + obj->elf = elf_begin(obj->fd, ELF_C_READ_MMAP, NULL); + if (!obj->elf) { + err = -errno; + pr_warn_elf("failed to parse ELF file '%s'", filename); + return err; + } + + /* Sanity check ELF file high-level properties */ + ehdr = elf64_getehdr(obj->elf); + if (!ehdr) { + err = -errno; + pr_warn_elf("failed to get ELF header for %s", filename); + return err; + } + if (ehdr->e_ident[EI_DATA] != host_endianness) { + err = -EOPNOTSUPP; + pr_warn_elf("unsupported byte order of ELF file %s", filename); + return err; + } + if (ehdr->e_type != ET_REL + || ehdr->e_machine != EM_BPF + || ehdr->e_ident[EI_CLASS] != ELFCLASS64) { + err = -EOPNOTSUPP; + pr_warn_elf("unsupported kind of ELF file %s", filename); + return err; + } + + if (elf_getshdrstrndx(obj->elf, &obj->shstrs_sec_idx)) { + err = -errno; + pr_warn_elf("failed to get SHSTRTAB section index for %s", filename); + return err; + } + + scn = NULL; + while ((scn = elf_nextscn(obj->elf, scn)) != NULL) { + size_t sec_idx = elf_ndxscn(scn); + const char *sec_name; + + shdr = elf64_getshdr(scn); + if (!shdr) { + err = -errno; + pr_warn_elf("failed to get section #%zu header for %s", + sec_idx, filename); + return err; + } + + sec_name = elf_strptr(obj->elf, obj->shstrs_sec_idx, shdr->sh_name); + if (!sec_name) { + err = -errno; + pr_warn_elf("failed to get section #%zu name for %s", + sec_idx, filename); + return err; + } + + data = elf_getdata(scn, 0); + if (!data) { + err = -errno; + pr_warn_elf("failed to get section #%zu (%s) data from %s", + sec_idx, sec_name, filename); + return err; + } + + sec = add_src_sec(obj, sec_name); + if (!sec) + return -ENOMEM; + + sec->scn = scn; + sec->shdr = shdr; + sec->data = data; + sec->sec_idx = elf_ndxscn(scn); + + if (is_ignored_sec(sec)) { + sec->skipped = true; + continue; + } + + switch (shdr->sh_type) { + case SHT_SYMTAB: + if (obj->symtab_sec_idx) { + err = -EOPNOTSUPP; + pr_warn("multiple SYMTAB sections found, not supported\n"); + return err; + } + obj->symtab_sec_idx = sec_idx; + break; + case SHT_STRTAB: + /* we'll construct our own string table */ + break; + case SHT_PROGBITS: + if (strcmp(sec_name, BTF_ELF_SEC) == 0) { + sec->skipped = true; + continue; + } + if (strcmp(sec_name, BTF_EXT_ELF_SEC) == 0) { + sec->skipped = true; + continue; + } + + /* data & code */ + break; + case SHT_NOBITS: + /* BSS */ + break; + case SHT_REL: + /* relocations */ + break; + default: + pr_warn("unrecognized section #%zu (%s) in %s\n", + sec_idx, sec_name, filename); + err = -EINVAL; + return err; + } + } + + err = err ?: linker_sanity_check_elf(obj); + + return err; +} + +static bool is_pow_of_2(size_t x) +{ + return x && (x & (x - 1)) == 0; +} + +static int linker_sanity_check_elf(struct src_obj *obj) +{ + struct src_sec *sec, *link_sec; + int i, j, n; + + if (!obj->symtab_sec_idx) { + pr_warn("ELF is missing SYMTAB section in %s\n", obj->filename); + return -EINVAL; + } + if (!obj->shstrs_sec_idx) { + pr_warn("ELF is missing section headers STRTAB section in %s\n", obj->filename); + return -EINVAL; + } + + for (i = 1; i < obj->sec_cnt; i++) { + sec = &obj->secs[i]; + + if (sec->sec_name[0] == '\0') { + pr_warn("ELF section #%zu has empty name in %s\n", sec->sec_idx, obj->filename); + return -EINVAL; + } + + if (sec->shdr->sh_addralign && !is_pow_of_2(sec->shdr->sh_addralign)) + return -EINVAL; + if (sec->shdr->sh_addralign != sec->data->d_align) + return -EINVAL; + + if (sec->shdr->sh_size != sec->data->d_size) + return -EINVAL; + + switch (sec->shdr->sh_type) { + case SHT_SYMTAB: { + Elf64_Sym *sym; + + if (sec->shdr->sh_entsize != sizeof(Elf64_Sym)) + return -EINVAL; + if (sec->shdr->sh_size % sec->shdr->sh_entsize != 0) + return -EINVAL; + + if (!sec->shdr->sh_link || sec->shdr->sh_link >= obj->sec_cnt) { + pr_warn("ELF SYMTAB section #%zu points to missing STRTAB section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); + return -EINVAL; + } + link_sec = &obj->secs[sec->shdr->sh_link]; + if (link_sec->shdr->sh_type != SHT_STRTAB) { + pr_warn("ELF SYMTAB section #%zu points to invalid STRTAB section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); + return -EINVAL; + } + + n = sec->shdr->sh_size / sec->shdr->sh_entsize; + sym = sec->data->d_buf; + for (j = 0; j < n; j++, sym++) { + if (sym->st_shndx + && sym->st_shndx < SHN_LORESERVE + && sym->st_shndx >= obj->sec_cnt) { + pr_warn("ELF sym #%d in section #%zu points to missing section #%zu in %s\n", + j, sec->sec_idx, (size_t)sym->st_shndx, obj->filename); + return -EINVAL; + } + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) { + if (sym->st_value != 0) + return -EINVAL; + } + } + break; + } + case SHT_STRTAB: + break; + case SHT_PROGBITS: + if (sec->shdr->sh_flags & SHF_EXECINSTR) { + if (sec->shdr->sh_size % sizeof(struct bpf_insn) != 0) + return -EINVAL; + } + break; + case SHT_NOBITS: + break; + case SHT_REL: { + Elf64_Rel *relo; + struct src_sec *sym_sec; + + if (sec->shdr->sh_entsize != sizeof(Elf64_Rel)) + return -EINVAL; + if (sec->shdr->sh_size % sec->shdr->sh_entsize != 0) + return -EINVAL; + + /* SHT_REL's sh_link should point to SYMTAB */ + if (sec->shdr->sh_link != obj->symtab_sec_idx) { + pr_warn("ELF relo section #%zu points to invalid SYMTAB section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); + return -EINVAL; + } + + /* SHT_REL's sh_info points to relocated section */ + if (!sec->shdr->sh_info || sec->shdr->sh_info >= obj->sec_cnt) { + pr_warn("ELF relo section #%zu points to missing section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_info, obj->filename); + return -EINVAL; + } + link_sec = &obj->secs[sec->shdr->sh_info]; + + /* .rel<secname> -> <secname> pattern is followed */ + if (strncmp(sec->sec_name, ".rel", sizeof(".rel") - 1) != 0 + || strcmp(sec->sec_name + sizeof(".rel") - 1, link_sec->sec_name) != 0) { + pr_warn("ELF relo section #%zu name has invalid name in %s\n", + sec->sec_idx, obj->filename); + return -EINVAL; + } + + /* don't further validate relocations for ignored sections */ + if (link_sec->skipped) + break; + + /* relocatable section is data or instructions */ + if (link_sec->shdr->sh_type != SHT_PROGBITS + && link_sec->shdr->sh_type != SHT_NOBITS) { + pr_warn("ELF relo section #%zu points to invalid section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_info, obj->filename); + return -EINVAL; + } + + /* check sanity of each relocation */ + n = sec->shdr->sh_size / sec->shdr->sh_entsize; + relo = sec->data->d_buf; + sym_sec = &obj->secs[obj->symtab_sec_idx]; + for (j = 0; j < n; j++, relo++) { + size_t sym_idx = ELF64_R_SYM(relo->r_info); + size_t sym_type = ELF64_R_TYPE(relo->r_info); + + if (sym_type != R_BPF_64_64 && sym_type != R_BPF_64_32) { + pr_warn("ELF relo #%d in section #%zu has unexpected type %zu in %s\n", + j, sec->sec_idx, sym_type, obj->filename); + return -EINVAL; + } + + if (!sym_idx || sym_idx * sizeof(Elf64_Sym) >= sym_sec->shdr->sh_size) { + pr_warn("ELF relo #%d in section #%zu points to invalid symbol #%zu in %s\n", + j, sec->sec_idx, sym_idx, obj->filename); + return -EINVAL; + } + + if (link_sec->shdr->sh_flags & SHF_EXECINSTR) { + if (relo->r_offset % sizeof(struct bpf_insn) != 0) { + pr_warn("ELF relo #%d in section #%zu points to missing symbol #%zu in %s\n", + j, sec->sec_idx, sym_idx, obj->filename); + return -EINVAL; + } + } + } + break; + } + case SHT_LLVM_ADDRSIG: + break; + default: + pr_warn("ELF section #%zu (%s) has unrecognized type %zu in %s\n", + sec->sec_idx, sec->sec_name, (size_t)sec->shdr->sh_type, obj->filename); + return -EINVAL; + } + } + + return 0; +} + +static int init_sec(struct bpf_linker *linker, struct dst_sec *dst_sec, struct src_sec *src_sec) +{ + Elf_Scn *scn; + Elf_Data *data; + Elf64_Shdr *shdr; + int name_off; + + dst_sec->sec_sz = 0; + dst_sec->sec_idx = 0; + + scn = elf_newscn(linker->elf); + if (!scn) + return -1; + data = elf_newdata(scn); + if (!data) + return -1; + shdr = elf64_getshdr(scn); + if (!shdr) + return -1; + + dst_sec->scn = scn; + dst_sec->shdr = shdr; + dst_sec->data = data; + dst_sec->sec_idx = elf_ndxscn(scn); + + name_off = strset__add_str(linker->strtab_strs, src_sec->sec_name); + if (name_off < 0) + return name_off; + + shdr->sh_name = name_off; + shdr->sh_type = src_sec->shdr->sh_type; + shdr->sh_flags = src_sec->shdr->sh_flags; + shdr->sh_size = 0; + /* sh_link and sh_info have different meaning for different types of + * sections, so we leave it up to the caller code to fill them in, if + * necessary + */ + shdr->sh_link = 0; + shdr->sh_info = 0; + shdr->sh_addralign = src_sec->shdr->sh_addralign; + shdr->sh_entsize = src_sec->shdr->sh_entsize; + + data->d_type = src_sec->data->d_type; + data->d_size = 0; + data->d_buf = NULL; + data->d_align = src_sec->data->d_align; + data->d_off = 0; + + return 0; +} + +static struct dst_sec *find_dst_sec_by_name(struct bpf_linker *linker, const char *sec_name) +{ + struct dst_sec *sec; + int i; + + for (i = 1; i < linker->sec_cnt; i++) { + sec = &linker->secs[i]; + + if (strcmp(sec->sec_name, sec_name) == 0) + return sec; + } + + return NULL; +} + +static bool secs_match(struct dst_sec *dst, struct src_sec *src) +{ + if (dst->shdr->sh_type != src->shdr->sh_type) { + pr_warn("sec %s types mismatch\n", dst->sec_name); + return false; + } + if (dst->shdr->sh_flags != src->shdr->sh_flags) { + pr_warn("sec %s flags mismatch\n", dst->sec_name); + return false; + } + if (dst->shdr->sh_entsize != src->shdr->sh_entsize) { + pr_warn("sec %s entsize mismatch\n", dst->sec_name); + return false; + } + + return true; +} + +static bool sec_content_is_same(struct dst_sec *dst_sec, struct src_sec *src_sec) +{ + if (dst_sec->sec_sz != src_sec->shdr->sh_size) + return false; + if (memcmp(dst_sec->raw_data, src_sec->data->d_buf, dst_sec->sec_sz) != 0) + return false; + return true; +} + +static int extend_sec(struct dst_sec *dst, struct src_sec *src) +{ + void *tmp; + size_t dst_align = dst->shdr->sh_addralign; + size_t src_align = src->shdr->sh_addralign; + size_t dst_align_sz, dst_final_sz; + + if (dst_align == 0) + dst_align = 1; + if (dst_align < src_align) + dst_align = src_align; + + dst_align_sz = (dst->sec_sz + dst_align - 1) / dst_align * dst_align; + + /* no need to re-align final size */ + dst_final_sz = dst_align_sz + src->shdr->sh_size; + + if (src->shdr->sh_type != SHT_NOBITS) { + tmp = realloc(dst->raw_data, dst_final_sz); + if (!tmp) + return -ENOMEM; + dst->raw_data = tmp; + + /* pad dst section, if it's alignment forced size increase */ + memset(dst->raw_data + dst->sec_sz, 0, dst_align_sz - dst->sec_sz); + /* now copy src data at a properly aligned offset */ + memcpy(dst->raw_data + dst_align_sz, src->data->d_buf, src->shdr->sh_size); + } + + dst->sec_sz = dst_final_sz; + dst->shdr->sh_size = dst_final_sz; + dst->data->d_size = dst_final_sz; + + dst->shdr->sh_addralign = dst_align; + dst->data->d_align = dst_align; + + src->dst_off = dst_align_sz; + + return 0; +} + +static bool is_data_sec(struct src_sec *sec) +{ + if (!sec || sec->skipped) + return false; + return sec->shdr->sh_type == SHT_PROGBITS || sec->shdr->sh_type == SHT_NOBITS; +} + +static bool is_relo_sec(struct src_sec *sec) +{ + if (!sec || sec->skipped) + return false; + return sec->shdr->sh_type == SHT_REL; +} + +static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj) +{ + int i, err; + + for (i = 1; i < obj->sec_cnt; i++) { + struct src_sec *src_sec; + struct dst_sec *dst_sec; + + src_sec = &obj->secs[i]; + if (!is_data_sec(src_sec)) + continue; + + dst_sec = find_dst_sec_by_name(linker, src_sec->sec_name); + if (!dst_sec) { + dst_sec = add_dst_sec(linker, src_sec->sec_name); + if (!dst_sec) + return -ENOMEM; + err = init_sec(linker, dst_sec, src_sec); + if (err) { + pr_warn("failed to init section '%s'\n", src_sec->sec_name); + return err; + } + } else { + if (!secs_match(dst_sec, src_sec)) { + pr_warn("ELF sections %s are incompatible\n", src_sec->sec_name); + return -1; + } + + /* "license" and "version" sections are deduped */ + if (strcmp(src_sec->sec_name, "license") == 0 + || strcmp(src_sec->sec_name, "version") == 0) { + if (!sec_content_is_same(dst_sec, src_sec)) { + pr_warn("non-identical contents of section '%s' are not supported\n", src_sec->sec_name); + return -EINVAL; + } + src_sec->skipped = true; + src_sec->dst_id = dst_sec->id; + continue; + } + } + + /* record mapped section index */ + src_sec->dst_id = dst_sec->id; + + err = extend_sec(dst_sec, src_sec); + if (err) + return err; + } + + return 0; +} + +static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj) +{ + struct src_sec *symtab = &obj->secs[obj->symtab_sec_idx]; + Elf64_Sym *sym = symtab->data->d_buf, *dst_sym; + int i, n = symtab->shdr->sh_size / symtab->shdr->sh_entsize; + int str_sec_idx = symtab->shdr->sh_link; + + obj->sym_map = calloc(n + 1, sizeof(*obj->sym_map)); + if (!obj->sym_map) + return -ENOMEM; + + for (i = 0; i < n; i++, sym++) { + struct src_sec *src_sec = NULL; + struct dst_sec *dst_sec = NULL; + const char *sym_name; + size_t dst_sym_idx; + int name_off; + + /* we already have all-zero initial symbol */ + if (sym->st_name == 0 && sym->st_info == 0 && + sym->st_other == 0 && sym->st_shndx == SHN_UNDEF && + sym->st_value == 0 && sym->st_size ==0) + continue; + + sym_name = elf_strptr(obj->elf, str_sec_idx, sym->st_name); + if (!sym_name) { + pr_warn("can't fetch symbol name for symbol #%d in '%s'\n", i, obj->filename); + return -1; + } + + if (sym->st_shndx && sym->st_shndx < SHN_LORESERVE) { + src_sec = &obj->secs[sym->st_shndx]; + if (src_sec->skipped) + continue; + dst_sec = &linker->secs[src_sec->dst_id]; + + /* allow only one STT_SECTION symbol per section */ + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION && dst_sec->sec_sym_idx) { + obj->sym_map[i] = dst_sec->sec_sym_idx; + continue; + } + } + + name_off = strset__add_str(linker->strtab_strs, sym_name); + if (name_off < 0) + return name_off; + + dst_sym = add_new_sym(linker, &dst_sym_idx); + if (!dst_sym) + return -ENOMEM; + + dst_sym->st_name = name_off; + dst_sym->st_info = sym->st_info; + dst_sym->st_other = sym->st_other; + dst_sym->st_shndx = src_sec ? dst_sec->sec_idx : sym->st_shndx; + dst_sym->st_value = (src_sec ? src_sec->dst_off : 0) + sym->st_value; + dst_sym->st_size = sym->st_size; + + obj->sym_map[i] = dst_sym_idx; + + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION && dst_sym) { + dst_sec->sec_sym_idx = dst_sym_idx; + dst_sym->st_value = 0; + } + + } + + return 0; +} + +static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *obj) +{ + struct src_sec *src_symtab = &obj->secs[obj->symtab_sec_idx]; + struct dst_sec *dst_symtab = &linker->secs[linker->symtab_sec_idx]; + int i, err; + + for (i = 1; i < obj->sec_cnt; i++) { + struct src_sec *src_sec, *src_linked_sec; + struct dst_sec *dst_sec, *dst_linked_sec; + Elf64_Rel *src_rel, *dst_rel; + int j, n; + + src_sec = &obj->secs[i]; + if (!is_relo_sec(src_sec)) + continue; + + /* shdr->sh_info points to relocatable section */ + src_linked_sec = &obj->secs[src_sec->shdr->sh_info]; + if (src_linked_sec->skipped) + continue; + + dst_sec = find_dst_sec_by_name(linker, src_sec->sec_name); + if (!dst_sec) { + dst_sec = add_dst_sec(linker, src_sec->sec_name); + if (!dst_sec) + return -ENOMEM; + err = init_sec(linker, dst_sec, src_sec); + if (err) { + pr_warn("failed to init section '%s'\n", src_sec->sec_name); + return err; + } + } else if (!secs_match(dst_sec, src_sec)) { + pr_warn("Secs %s are not compatible\n", src_sec->sec_name); + return -1; + } + + /* shdr->sh_link points to SYMTAB */ + dst_sec->shdr->sh_link = linker->symtab_sec_idx; + + /* shdr->sh_info points to relocated section */ + dst_linked_sec = &linker->secs[src_linked_sec->dst_id]; + dst_sec->shdr->sh_info = dst_linked_sec->sec_idx; + + src_sec->dst_id = dst_sec->id; + err = extend_sec(dst_sec, src_sec); + if (err) + return err; + + src_rel = src_sec->data->d_buf; + dst_rel = dst_sec->raw_data + src_sec->dst_off; + n = src_sec->shdr->sh_size / src_sec->shdr->sh_entsize; + for (j = 0; j < n; j++, src_rel++, dst_rel++) { + size_t src_sym_idx = ELF64_R_SYM(src_rel->r_info); + size_t sym_type = ELF64_R_TYPE(src_rel->r_info); + Elf64_Sym *src_sym, *dst_sym; + size_t dst_sym_idx; + + src_sym_idx = ELF64_R_SYM(src_rel->r_info); + src_sym = src_symtab->data->d_buf + sizeof(*src_sym) * src_sym_idx; + + dst_sym_idx = obj->sym_map[src_sym_idx]; + dst_sym = dst_symtab->raw_data + sizeof(*dst_sym) * dst_sym_idx; + dst_rel->r_offset += src_linked_sec->dst_off; + sym_type = ELF64_R_TYPE(src_rel->r_info); + dst_rel->r_info = ELF64_R_INFO(dst_sym_idx, sym_type); + + if (ELF64_ST_TYPE(src_sym->st_info) == STT_SECTION) { + struct src_sec *sec = &obj->secs[src_sym->st_shndx]; + struct bpf_insn *insn; + + if (src_linked_sec->shdr->sh_flags & SHF_EXECINSTR) { + /* calls to the very first static function inside + * .text section at offset 0 will + * reference section symbol, not the + * function symbol. Fix that up, + * otherwise it won't be possible to + * relocate calls to two different + * static functions with the same name + * (rom two different object files) + */ + insn = dst_linked_sec->raw_data + dst_rel->r_offset; + if (insn->code == (BPF_JMP | BPF_CALL)) + insn->imm += sec->dst_off / sizeof(struct bpf_insn); + else + insn->imm += sec->dst_off; + } else { + pr_warn("relocation against STT_SECTION in non-exec section is not supported!\n"); + return -EINVAL; + } + } + + } + } + + return 0; +} + +int bpf_linker__finalize(struct bpf_linker *linker) +{ + struct dst_sec *sec; + size_t strs_sz; + const void *strs; + int err, i; + + if (!linker->elf) + return -EINVAL; + + /* Finalize strings */ + strs_sz = strset__data_size(linker->strtab_strs); + strs = strset__data(linker->strtab_strs); + + sec = &linker->secs[linker->strtab_sec_idx]; + sec->data->d_align = 1; + sec->data->d_off = 0LL; + sec->data->d_buf = (void *)strs; + sec->data->d_type = ELF_T_BYTE; + sec->data->d_size = strs_sz; + sec->shdr->sh_size = strs_sz; + + for (i = 1; i < linker->sec_cnt; i++) { + sec = &linker->secs[i]; + + /* STRTAB is handled specially above */ + if (sec->sec_idx == linker->strtab_sec_idx) + continue; + + sec->data->d_buf = sec->raw_data; + } + + /* Finalize ELF layout */ + if (elf_update(linker->elf, ELF_C_NULL) < 0) { + err = -errno; + pr_warn_elf("failed to finalize ELF layout"); + return err; + } + + /* Write out final ELF contents */ + if (elf_update(linker->elf, ELF_C_WRITE) < 0) { + err = -errno; + pr_warn_elf("failed to write ELF contents"); + return err; + } + + elf_end(linker->elf); + close(linker->fd); + + linker->elf = NULL; + linker->fd = -1; + + return 0; +}
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 8fd27bf69b864b1c2a6e64cf5673603f3959a6ef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add .BTF and .BTF.ext static linking logic.
When multiple BPF object files are linked together, their respective .BTF and .BTF.ext sections are merged together. BTF types are not just concatenated, but also deduplicated. .BTF.ext data is grouped by type (func info, line info, core_relos) and target section names, and then all the records are concatenated together, preserving their relative order. All the BTF type ID references and string offsets are updated as necessary, to take into account possibly deduplicated strings and types.
BTF DATASEC types are handled specially. Their respective var_secinfos are accumulated separately in special per-section data and then final DATASEC types are emitted at the very end during bpf_linker__finalize() operation, just before emitting final ELF output file.
BTF data can also provide "section annotations" for some extern variables. Such concept is missing in ELF, but BTF will have DATASEC types for such special extern datasections (e.g., .kconfig, .ksyms). Such sections are called "ephemeral" internally. Internally linker will keep metadata for each such section, collecting variables information, but those sections won't be emitted into the final ELF file.
Also, given LLVM/Clang during compilation emits BTF DATASECS that are incomplete, missing section size and variable offsets for static variables, BPF static linker will initially fix up such DATASECs, using ELF symbols data. The final DATASECs will preserve section sizes and all variable offsets. This is handled correctly by libbpf already, so won't cause any new issues. On the other hand, it's actually a nice property to have a complete BTF data without runtime adjustments done during bpf_object__open() by libbpf. In that sense, BPF static linker is also a BTF normalizer.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-8-andrii@kernel.org (cherry picked from commit 8fd27bf69b864b1c2a6e64cf5673603f3959a6ef) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 769 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 767 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 01b56939a835..b4fff912dce2 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -32,12 +32,17 @@ struct src_sec { int dst_off; /* whether section is omitted from the final ELF file */ bool skipped; + /* whether section is an ephemeral section, not mapped to an ELF section */ + bool ephemeral;
/* ELF info */ size_t sec_idx; Elf_Scn *scn; Elf64_Shdr *shdr; Elf_Data *data; + + /* corresponding BTF DATASEC type ID */ + int sec_type_id; };
struct src_obj { @@ -49,12 +54,24 @@ struct src_obj { /* SYMTAB section index */ size_t symtab_sec_idx;
- /* List of sections. Slot zero is unused. */ + struct btf *btf; + struct btf_ext *btf_ext; + + /* List of sections (including ephemeral). Slot zero is unused. */ struct src_sec *secs; int sec_cnt;
/* mapping of symbol indices from src to dst ELF */ int *sym_map; + /* mapping from the src BTF type IDs to dst ones */ + int *btf_type_map; +}; + +/* single .BTF.ext data section */ +struct btf_ext_sec_data { + size_t rec_cnt; + __u32 rec_sz; + void *recs; };
struct dst_sec { @@ -75,6 +92,15 @@ struct dst_sec {
/* corresponding STT_SECTION symbol index in SYMTAB */ int sec_sym_idx; + + /* section's DATASEC variable info, emitted on BTF finalization */ + int sec_var_cnt; + struct btf_var_secinfo *sec_vars; + + /* section's .BTF.ext data */ + struct btf_ext_sec_data func_info; + struct btf_ext_sec_data line_info; + struct btf_ext_sec_data core_relo_info; };
struct bpf_linker { @@ -90,6 +116,9 @@ struct bpf_linker { struct strset *strtab_strs; /* STRTAB unique strings */ size_t strtab_sec_idx; /* STRTAB section index */ size_t symtab_sec_idx; /* SYMTAB section index */ + + struct btf *btf; + struct btf_ext *btf_ext; };
#define pr_warn_elf(fmt, ...) \ @@ -101,9 +130,17 @@ static int init_output_elf(struct bpf_linker *linker, const char *file);
static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, struct src_obj *obj); static int linker_sanity_check_elf(struct src_obj *obj); +static int linker_sanity_check_btf(struct src_obj *obj); +static int linker_sanity_check_btf_ext(struct src_obj *obj); +static int linker_fixup_btf(struct src_obj *obj); static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj); static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj); static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *obj); +static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj); +static int linker_append_btf_ext(struct bpf_linker *linker, struct src_obj *obj); + +static int finalize_btf(struct bpf_linker *linker); +static int finalize_btf_ext(struct bpf_linker *linker);
void bpf_linker__free(struct bpf_linker *linker) { @@ -122,11 +159,19 @@ void bpf_linker__free(struct bpf_linker *linker)
strset__free(linker->strtab_strs);
+ btf__free(linker->btf); + btf_ext__free(linker->btf_ext); + for (i = 1; i < linker->sec_cnt; i++) { struct dst_sec *sec = &linker->secs[i];
free(sec->sec_name); free(sec->raw_data); + free(sec->sec_vars); + + free(sec->func_info.recs); + free(sec->line_info.recs); + free(sec->core_relo_info.recs); } free(linker->secs);
@@ -335,6 +380,12 @@ static int init_output_elf(struct bpf_linker *linker, const char *file) sec->shdr->sh_addralign = 8; sec->shdr->sh_entsize = sizeof(Elf64_Sym);
+ /* .BTF */ + linker->btf = btf__new_empty(); + err = libbpf_get_error(linker->btf); + if (err) + return err; + /* add the special all-zero symbol */ init_sym = add_new_sym(linker, NULL); if (!init_sym) @@ -362,8 +413,13 @@ int bpf_linker__add_file(struct bpf_linker *linker, const char *filename) err = err ?: linker_append_sec_data(linker, &obj); err = err ?: linker_append_elf_syms(linker, &obj); err = err ?: linker_append_elf_relos(linker, &obj); + err = err ?: linker_append_btf(linker, &obj); + err = err ?: linker_append_btf_ext(linker, &obj);
/* free up src_obj resources */ + free(obj.btf_type_map); + btf__free(obj.btf); + btf_ext__free(obj.btf_ext); free(obj.secs); free(obj.sym_map); if (obj.elf) @@ -555,10 +611,22 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, break; case SHT_PROGBITS: if (strcmp(sec_name, BTF_ELF_SEC) == 0) { + obj->btf = btf__new(data->d_buf, shdr->sh_size); + err = libbpf_get_error(obj->btf); + if (err) { + pr_warn("failed to parse .BTF from %s: %d\n", filename, err); + return err; + } sec->skipped = true; continue; } if (strcmp(sec_name, BTF_EXT_ELF_SEC) == 0) { + obj->btf_ext = btf_ext__new(data->d_buf, shdr->sh_size); + err = libbpf_get_error(obj->btf_ext); + if (err) { + pr_warn("failed to parse .BTF.ext from '%s': %d\n", filename, err); + return err; + } sec->skipped = true; continue; } @@ -580,6 +648,9 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, }
err = err ?: linker_sanity_check_elf(obj); + err = err ?: linker_sanity_check_btf(obj); + err = err ?: linker_sanity_check_btf_ext(obj); + err = err ?: linker_fixup_btf(obj);
return err; } @@ -753,6 +824,69 @@ static int linker_sanity_check_elf(struct src_obj *obj) return 0; }
+static int check_btf_type_id(__u32 *type_id, void *ctx) +{ + struct btf *btf = ctx; + + if (*type_id > btf__get_nr_types(btf)) + return -EINVAL; + + return 0; +} + +static int check_btf_str_off(__u32 *str_off, void *ctx) +{ + struct btf *btf = ctx; + const char *s; + + s = btf__str_by_offset(btf, *str_off); + + if (!s) + return -EINVAL; + + return 0; +} + +static int linker_sanity_check_btf(struct src_obj *obj) +{ + struct btf_type *t; + int i, n, err = 0; + + if (!obj->btf) + return 0; + + n = btf__get_nr_types(obj->btf); + for (i = 1; i <= n; i++) { + t = btf_type_by_id(obj->btf, i); + + err = err ?: btf_type_visit_type_ids(t, check_btf_type_id, obj->btf); + err = err ?: btf_type_visit_str_offs(t, check_btf_str_off, obj->btf); + if (err) + return err; + } + + return 0; +} + +static int linker_sanity_check_btf_ext(struct src_obj *obj) +{ + int err = 0; + + if (!obj->btf_ext) + return 0; + + /* can't use .BTF.ext without .BTF */ + if (!obj->btf) + return -EINVAL; + + err = err ?: btf_ext_visit_type_ids(obj->btf_ext, check_btf_type_id, obj->btf); + err = err ?: btf_ext_visit_str_offs(obj->btf_ext, check_btf_str_off, obj->btf); + if (err) + return err; + + return 0; +} + static int init_sec(struct bpf_linker *linker, struct dst_sec *dst_sec, struct src_sec *src_sec) { Elf_Scn *scn; @@ -763,6 +897,10 @@ static int init_sec(struct bpf_linker *linker, struct dst_sec *dst_sec, struct s dst_sec->sec_sz = 0; dst_sec->sec_idx = 0;
+ /* ephemeral sections are just thin section shells lacking most parts */ + if (src_sec->ephemeral) + return 0; + scn = elf_newscn(linker->elf); if (!scn) return -1; @@ -891,12 +1029,15 @@ static bool is_data_sec(struct src_sec *sec) { if (!sec || sec->skipped) return false; + /* ephemeral sections are data sections, e.g., .kconfig, .ksyms */ + if (sec->ephemeral) + return true; return sec->shdr->sh_type == SHT_PROGBITS || sec->shdr->sh_type == SHT_NOBITS; }
static bool is_relo_sec(struct src_sec *sec) { - if (!sec || sec->skipped) + if (!sec || sec->skipped || sec->ephemeral) return false; return sec->shdr->sh_type == SHT_REL; } @@ -945,6 +1086,9 @@ static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj /* record mapped section index */ src_sec->dst_id = dst_sec->id;
+ if (src_sec->ephemeral) + continue; + err = extend_sec(dst_sec, src_sec); if (err) return err; @@ -1120,6 +1264,339 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob return 0; }
+static struct src_sec *find_src_sec_by_name(struct src_obj *obj, const char *sec_name) +{ + struct src_sec *sec; + int i; + + for (i = 1; i < obj->sec_cnt; i++) { + sec = &obj->secs[i]; + + if (strcmp(sec->sec_name, sec_name) == 0) + return sec; + } + + return NULL; +} + +static Elf64_Sym *find_sym_by_name(struct src_obj *obj, size_t sec_idx, + int sym_type, const char *sym_name) +{ + struct src_sec *symtab = &obj->secs[obj->symtab_sec_idx]; + Elf64_Sym *sym = symtab->data->d_buf; + int i, n = symtab->shdr->sh_size / symtab->shdr->sh_entsize; + int str_sec_idx = symtab->shdr->sh_link; + const char *name; + + for (i = 0; i < n; i++, sym++) { + if (sym->st_shndx != sec_idx) + continue; + if (ELF64_ST_TYPE(sym->st_info) != sym_type) + continue; + + name = elf_strptr(obj->elf, str_sec_idx, sym->st_name); + if (!name) + return NULL; + + if (strcmp(sym_name, name) != 0) + continue; + + return sym; + } + + return NULL; +} + +static int linker_fixup_btf(struct src_obj *obj) +{ + const char *sec_name; + struct src_sec *sec; + int i, j, n, m; + + n = btf__get_nr_types(obj->btf); + for (i = 1; i <= n; i++) { + struct btf_var_secinfo *vi; + struct btf_type *t; + + t = btf_type_by_id(obj->btf, i); + if (btf_kind(t) != BTF_KIND_DATASEC) + continue; + + sec_name = btf__str_by_offset(obj->btf, t->name_off); + sec = find_src_sec_by_name(obj, sec_name); + if (sec) { + /* record actual section size, unless ephemeral */ + if (sec->shdr) + t->size = sec->shdr->sh_size; + } else { + /* BTF can have some sections that are not represented + * in ELF, e.g., .kconfig and .ksyms, which are used + * for special extern variables. Here we'll + * pre-create "section shells" for them to be able to + * keep track of extra per-section metadata later + * (e.g., BTF variables). + */ + sec = add_src_sec(obj, sec_name); + if (!sec) + return -ENOMEM; + + sec->ephemeral = true; + sec->sec_idx = 0; /* will match UNDEF shndx in ELF */ + } + + /* remember ELF section and its BTF type ID match */ + sec->sec_type_id = i; + + /* fix up variable offsets */ + vi = btf_var_secinfos(t); + for (j = 0, m = btf_vlen(t); j < m; j++, vi++) { + const struct btf_type *vt = btf__type_by_id(obj->btf, vi->type); + const char *var_name = btf__str_by_offset(obj->btf, vt->name_off); + int var_linkage = btf_var(vt)->linkage; + Elf64_Sym *sym; + + /* no need to patch up static or extern vars */ + if (var_linkage != BTF_VAR_GLOBAL_ALLOCATED) + continue; + + sym = find_sym_by_name(obj, sec->sec_idx, STT_OBJECT, var_name); + if (!sym) { + pr_warn("failed to find symbol for variable '%s' in section '%s'\n", var_name, sec_name); + return -ENOENT; + } + + vi->offset = sym->st_value; + } + } + + return 0; +} + +static int remap_type_id(__u32 *type_id, void *ctx) +{ + int *id_map = ctx; + + *type_id = id_map[*type_id]; + + return 0; +} + +static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) +{ + const struct btf_type *t; + int i, j, n, start_id, id; + + if (!obj->btf) + return 0; + + start_id = btf__get_nr_types(linker->btf) + 1; + n = btf__get_nr_types(obj->btf); + + obj->btf_type_map = calloc(n + 1, sizeof(int)); + if (!obj->btf_type_map) + return -ENOMEM; + + for (i = 1; i <= n; i++) { + t = btf__type_by_id(obj->btf, i); + + /* DATASECs are handled specially below */ + if (btf_kind(t) == BTF_KIND_DATASEC) + continue; + + id = btf__add_type(linker->btf, obj->btf, t); + if (id < 0) { + pr_warn("failed to append BTF type #%d from file '%s'\n", i, obj->filename); + return id; + } + + obj->btf_type_map[i] = id; + } + + /* remap all the types except DATASECs */ + n = btf__get_nr_types(linker->btf); + for (i = start_id; i <= n; i++) { + struct btf_type *dst_t = btf_type_by_id(linker->btf, i); + + if (btf_type_visit_type_ids(dst_t, remap_type_id, obj->btf_type_map)) + return -EINVAL; + } + + /* append DATASEC info */ + for (i = 1; i < obj->sec_cnt; i++) { + struct src_sec *src_sec; + struct dst_sec *dst_sec; + const struct btf_var_secinfo *src_var; + struct btf_var_secinfo *dst_var; + + src_sec = &obj->secs[i]; + if (!src_sec->sec_type_id || src_sec->skipped) + continue; + dst_sec = &linker->secs[src_sec->dst_id]; + + t = btf__type_by_id(obj->btf, src_sec->sec_type_id); + src_var = btf_var_secinfos(t); + n = btf_vlen(t); + for (j = 0; j < n; j++, src_var++) { + void *sec_vars = dst_sec->sec_vars; + + sec_vars = libbpf_reallocarray(sec_vars, + dst_sec->sec_var_cnt + 1, + sizeof(*dst_sec->sec_vars)); + if (!sec_vars) + return -ENOMEM; + + dst_sec->sec_vars = sec_vars; + dst_sec->sec_var_cnt++; + + dst_var = &dst_sec->sec_vars[dst_sec->sec_var_cnt - 1]; + dst_var->type = obj->btf_type_map[src_var->type]; + dst_var->size = src_var->size; + dst_var->offset = src_sec->dst_off + src_var->offset; + } + } + + return 0; +} + +static void *add_btf_ext_rec(struct btf_ext_sec_data *ext_data, const void *src_rec) +{ + void *tmp; + + tmp = libbpf_reallocarray(ext_data->recs, ext_data->rec_cnt + 1, ext_data->rec_sz); + if (!tmp) + return NULL; + ext_data->recs = tmp; + + tmp += ext_data->rec_cnt * ext_data->rec_sz; + memcpy(tmp, src_rec, ext_data->rec_sz); + + ext_data->rec_cnt++; + + return tmp; +} + +static int linker_append_btf_ext(struct bpf_linker *linker, struct src_obj *obj) +{ + const struct btf_ext_info_sec *ext_sec; + const char *sec_name, *s; + struct src_sec *src_sec; + struct dst_sec *dst_sec; + int rec_sz, str_off, i; + + if (!obj->btf_ext) + return 0; + + rec_sz = obj->btf_ext->func_info.rec_size; + for_each_btf_ext_sec(&obj->btf_ext->func_info, ext_sec) { + struct bpf_func_info_min *src_rec, *dst_rec; + + sec_name = btf__name_by_offset(obj->btf, ext_sec->sec_name_off); + src_sec = find_src_sec_by_name(obj, sec_name); + if (!src_sec) { + pr_warn("can't find section '%s' referenced from .BTF.ext\n", sec_name); + return -EINVAL; + } + dst_sec = &linker->secs[src_sec->dst_id]; + + if (dst_sec->func_info.rec_sz == 0) + dst_sec->func_info.rec_sz = rec_sz; + if (dst_sec->func_info.rec_sz != rec_sz) { + pr_warn("incompatible .BTF.ext record sizes for section '%s'\n", sec_name); + return -EINVAL; + } + + for_each_btf_ext_rec(&obj->btf_ext->func_info, ext_sec, i, src_rec) { + dst_rec = add_btf_ext_rec(&dst_sec->func_info, src_rec); + if (!dst_rec) + return -ENOMEM; + + dst_rec->insn_off += src_sec->dst_off; + dst_rec->type_id = obj->btf_type_map[dst_rec->type_id]; + } + } + + rec_sz = obj->btf_ext->line_info.rec_size; + for_each_btf_ext_sec(&obj->btf_ext->line_info, ext_sec) { + struct bpf_line_info_min *src_rec, *dst_rec; + + sec_name = btf__name_by_offset(obj->btf, ext_sec->sec_name_off); + src_sec = find_src_sec_by_name(obj, sec_name); + if (!src_sec) { + pr_warn("can't find section '%s' referenced from .BTF.ext\n", sec_name); + return -EINVAL; + } + dst_sec = &linker->secs[src_sec->dst_id]; + + if (dst_sec->line_info.rec_sz == 0) + dst_sec->line_info.rec_sz = rec_sz; + if (dst_sec->line_info.rec_sz != rec_sz) { + pr_warn("incompatible .BTF.ext record sizes for section '%s'\n", sec_name); + return -EINVAL; + } + + for_each_btf_ext_rec(&obj->btf_ext->line_info, ext_sec, i, src_rec) { + dst_rec = add_btf_ext_rec(&dst_sec->line_info, src_rec); + if (!dst_rec) + return -ENOMEM; + + dst_rec->insn_off += src_sec->dst_off; + + s = btf__str_by_offset(obj->btf, src_rec->file_name_off); + str_off = btf__add_str(linker->btf, s); + if (str_off < 0) + return -ENOMEM; + dst_rec->file_name_off = str_off; + + s = btf__str_by_offset(obj->btf, src_rec->line_off); + str_off = btf__add_str(linker->btf, s); + if (str_off < 0) + return -ENOMEM; + dst_rec->line_off = str_off; + + /* dst_rec->line_col is fine */ + } + } + + rec_sz = obj->btf_ext->core_relo_info.rec_size; + for_each_btf_ext_sec(&obj->btf_ext->core_relo_info, ext_sec) { + struct bpf_core_relo *src_rec, *dst_rec; + + sec_name = btf__name_by_offset(obj->btf, ext_sec->sec_name_off); + src_sec = find_src_sec_by_name(obj, sec_name); + if (!src_sec) { + pr_warn("can't find section '%s' referenced from .BTF.ext\n", sec_name); + return -EINVAL; + } + dst_sec = &linker->secs[src_sec->dst_id]; + + if (dst_sec->core_relo_info.rec_sz == 0) + dst_sec->core_relo_info.rec_sz = rec_sz; + if (dst_sec->core_relo_info.rec_sz != rec_sz) { + pr_warn("incompatible .BTF.ext record sizes for section '%s'\n", sec_name); + return -EINVAL; + } + + for_each_btf_ext_rec(&obj->btf_ext->core_relo_info, ext_sec, i, src_rec) { + dst_rec = add_btf_ext_rec(&dst_sec->core_relo_info, src_rec); + if (!dst_rec) + return -ENOMEM; + + dst_rec->insn_off += src_sec->dst_off; + dst_rec->type_id = obj->btf_type_map[dst_rec->type_id]; + + s = btf__str_by_offset(obj->btf, src_rec->access_str_off); + str_off = btf__add_str(linker->btf, s); + if (str_off < 0) + return -ENOMEM; + dst_rec->access_str_off = str_off; + + /* dst_rec->kind is fine */ + } + } + + return 0; +} + int bpf_linker__finalize(struct bpf_linker *linker) { struct dst_sec *sec; @@ -1130,6 +1607,10 @@ int bpf_linker__finalize(struct bpf_linker *linker) if (!linker->elf) return -EINVAL;
+ err = finalize_btf(linker); + if (err) + return err; + /* Finalize strings */ strs_sz = strset__data_size(linker->strtab_strs); strs = strset__data(linker->strtab_strs); @@ -1149,6 +1630,10 @@ int bpf_linker__finalize(struct bpf_linker *linker) if (sec->sec_idx == linker->strtab_sec_idx) continue;
+ /* special ephemeral sections (.ksyms, .kconfig, etc) */ + if (!sec->scn) + continue; + sec->data->d_buf = sec->raw_data; }
@@ -1174,3 +1659,283 @@ int bpf_linker__finalize(struct bpf_linker *linker)
return 0; } + +static int emit_elf_data_sec(struct bpf_linker *linker, const char *sec_name, + size_t align, const void *raw_data, size_t raw_sz) +{ + Elf_Scn *scn; + Elf_Data *data; + Elf64_Shdr *shdr; + int name_off; + + name_off = strset__add_str(linker->strtab_strs, sec_name); + if (name_off < 0) + return name_off; + + scn = elf_newscn(linker->elf); + if (!scn) + return -ENOMEM; + data = elf_newdata(scn); + if (!data) + return -ENOMEM; + shdr = elf64_getshdr(scn); + if (!shdr) + return -EINVAL; + + shdr->sh_name = name_off; + shdr->sh_type = SHT_PROGBITS; + shdr->sh_flags = 0; + shdr->sh_size = raw_sz; + shdr->sh_link = 0; + shdr->sh_info = 0; + shdr->sh_addralign = align; + shdr->sh_entsize = 0; + + data->d_type = ELF_T_BYTE; + data->d_size = raw_sz; + data->d_buf = (void *)raw_data; + data->d_align = align; + data->d_off = 0; + + return 0; +} + +static int finalize_btf(struct bpf_linker *linker) +{ + struct btf *btf = linker->btf; + const void *raw_data; + int i, j, id, err; + __u32 raw_sz; + + /* bail out if no BTF data was produced */ + if (btf__get_nr_types(linker->btf) == 0) + return 0; + + for (i = 1; i < linker->sec_cnt; i++) { + struct dst_sec *sec = &linker->secs[i]; + + if (!sec->sec_var_cnt) + continue; + + id = btf__add_datasec(btf, sec->sec_name, sec->sec_sz); + if (id < 0) { + pr_warn("failed to add consolidated BTF type for datasec '%s': %d\n", + sec->sec_name, id); + return id; + } + + for (j = 0; j < sec->sec_var_cnt; j++) { + struct btf_var_secinfo *vi = &sec->sec_vars[j]; + + if (btf__add_datasec_var_info(btf, vi->type, vi->offset, vi->size)) + return -EINVAL; + } + } + + err = finalize_btf_ext(linker); + if (err) { + pr_warn(".BTF.ext generation failed: %d\n", err); + return err; + } + + err = btf__dedup(linker->btf, linker->btf_ext, NULL); + if (err) { + pr_warn("BTF dedup failed: %d\n", err); + return err; + } + + /* Emit .BTF section */ + raw_data = btf__get_raw_data(linker->btf, &raw_sz); + if (!raw_data) + return -ENOMEM; + + err = emit_elf_data_sec(linker, BTF_ELF_SEC, 8, raw_data, raw_sz); + if (err) { + pr_warn("failed to write out .BTF ELF section: %d\n", err); + return err; + } + + /* Emit .BTF.ext section */ + if (linker->btf_ext) { + raw_data = btf_ext__get_raw_data(linker->btf_ext, &raw_sz); + if (!raw_data) + return -ENOMEM; + + err = emit_elf_data_sec(linker, BTF_EXT_ELF_SEC, 8, raw_data, raw_sz); + if (err) { + pr_warn("failed to write out .BTF.ext ELF section: %d\n", err); + return err; + } + } + + return 0; +} + +static int emit_btf_ext_data(struct bpf_linker *linker, void *output, + const char *sec_name, struct btf_ext_sec_data *sec_data) +{ + struct btf_ext_info_sec *sec_info; + void *cur = output; + int str_off; + size_t sz; + + if (!sec_data->rec_cnt) + return 0; + + str_off = btf__add_str(linker->btf, sec_name); + if (str_off < 0) + return -ENOMEM; + + sec_info = cur; + sec_info->sec_name_off = str_off; + sec_info->num_info = sec_data->rec_cnt; + cur += sizeof(struct btf_ext_info_sec); + + sz = sec_data->rec_cnt * sec_data->rec_sz; + memcpy(cur, sec_data->recs, sz); + cur += sz; + + return cur - output; +} + +static int finalize_btf_ext(struct bpf_linker *linker) +{ + size_t funcs_sz = 0, lines_sz = 0, core_relos_sz = 0, total_sz = 0; + size_t func_rec_sz = 0, line_rec_sz = 0, core_relo_rec_sz = 0; + struct btf_ext_header *hdr; + void *data, *cur; + int i, err, sz; + + /* validate that all sections have the same .BTF.ext record sizes + * and calculate total data size for each type of data (func info, + * line info, core relos) + */ + for (i = 1; i < linker->sec_cnt; i++) { + struct dst_sec *sec = &linker->secs[i]; + + if (sec->func_info.rec_cnt) { + if (func_rec_sz == 0) + func_rec_sz = sec->func_info.rec_sz; + if (func_rec_sz != sec->func_info.rec_sz) { + pr_warn("mismatch in func_info record size %zu != %u\n", + func_rec_sz, sec->func_info.rec_sz); + return -EINVAL; + } + + funcs_sz += sizeof(struct btf_ext_info_sec) + func_rec_sz * sec->func_info.rec_cnt; + } + if (sec->line_info.rec_cnt) { + if (line_rec_sz == 0) + line_rec_sz = sec->line_info.rec_sz; + if (line_rec_sz != sec->line_info.rec_sz) { + pr_warn("mismatch in line_info record size %zu != %u\n", + line_rec_sz, sec->line_info.rec_sz); + return -EINVAL; + } + + lines_sz += sizeof(struct btf_ext_info_sec) + line_rec_sz * sec->line_info.rec_cnt; + } + if (sec->core_relo_info.rec_cnt) { + if (core_relo_rec_sz == 0) + core_relo_rec_sz = sec->core_relo_info.rec_sz; + if (core_relo_rec_sz != sec->core_relo_info.rec_sz) { + pr_warn("mismatch in core_relo_info record size %zu != %u\n", + core_relo_rec_sz, sec->core_relo_info.rec_sz); + return -EINVAL; + } + + core_relos_sz += sizeof(struct btf_ext_info_sec) + core_relo_rec_sz * sec->core_relo_info.rec_cnt; + } + } + + if (!funcs_sz && !lines_sz && !core_relos_sz) + return 0; + + total_sz += sizeof(struct btf_ext_header); + if (funcs_sz) { + funcs_sz += sizeof(__u32); /* record size prefix */ + total_sz += funcs_sz; + } + if (lines_sz) { + lines_sz += sizeof(__u32); /* record size prefix */ + total_sz += lines_sz; + } + if (core_relos_sz) { + core_relos_sz += sizeof(__u32); /* record size prefix */ + total_sz += core_relos_sz; + } + + cur = data = calloc(1, total_sz); + if (!data) + return -ENOMEM; + + hdr = cur; + hdr->magic = BTF_MAGIC; + hdr->version = BTF_VERSION; + hdr->flags = 0; + hdr->hdr_len = sizeof(struct btf_ext_header); + cur += sizeof(struct btf_ext_header); + + /* All offsets are in bytes relative to the end of this header */ + hdr->func_info_off = 0; + hdr->func_info_len = funcs_sz; + hdr->line_info_off = funcs_sz; + hdr->line_info_len = lines_sz; + hdr->core_relo_off = funcs_sz + lines_sz;; + hdr->core_relo_len = core_relos_sz; + + if (funcs_sz) { + *(__u32 *)cur = func_rec_sz; + cur += sizeof(__u32); + + for (i = 1; i < linker->sec_cnt; i++) { + struct dst_sec *sec = &linker->secs[i]; + + sz = emit_btf_ext_data(linker, cur, sec->sec_name, &sec->func_info); + if (sz < 0) + return sz; + + cur += sz; + } + } + + if (lines_sz) { + *(__u32 *)cur = line_rec_sz; + cur += sizeof(__u32); + + for (i = 1; i < linker->sec_cnt; i++) { + struct dst_sec *sec = &linker->secs[i]; + + sz = emit_btf_ext_data(linker, cur, sec->sec_name, &sec->line_info); + if (sz < 0) + return sz; + + cur += sz; + } + } + + if (core_relos_sz) { + *(__u32 *)cur = core_relo_rec_sz; + cur += sizeof(__u32); + + for (i = 1; i < linker->sec_cnt; i++) { + struct dst_sec *sec = &linker->secs[i]; + + sz = emit_btf_ext_data(linker, cur, sec->sec_name, &sec->core_relo_info); + if (sz < 0) + return sz; + + cur += sz; + } + } + + linker->btf_ext = btf_ext__new(data, total_sz); + err = libbpf_get_error(linker->btf_ext); + if (err) { + linker->btf_ext = NULL; + pr_warn("failed to parse final .BTF.ext data: %d\n", err); + return err; + } + + return 0; +}
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit c41226654550b0a8aa75e91ce0a1cdb6ce2316ee category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add optional name OBJECT_NAME parameter to `gen skeleton` command to override default object name, normally derived from input file name. This allows much more flexibility during build time.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-9-andrii@kernel.org (cherry picked from commit c41226654550b0a8aa75e91ce0a1cdb6ce2316ee) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../bpf/bpftool/Documentation/bpftool-gen.rst | 13 +++++---- tools/bpf/bpftool/bash-completion/bpftool | 11 ++++++- tools/bpf/bpftool/gen.c | 29 +++++++++++++++++-- 3 files changed, 44 insertions(+), 9 deletions(-)
diff --git a/tools/bpf/bpftool/Documentation/bpftool-gen.rst b/tools/bpf/bpftool/Documentation/bpftool-gen.rst index 84cf0639696f..d4e7338e22e7 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-gen.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-gen.rst @@ -19,7 +19,7 @@ SYNOPSIS GEN COMMANDS =============
-| **bpftool** **gen skeleton** *FILE* +| **bpftool** **gen skeleton** *FILE* [**name** *OBJECT_NAME*] | **bpftool** **gen help**
DESCRIPTION @@ -75,10 +75,13 @@ DESCRIPTION specific maps, programs, etc.
As part of skeleton, few custom functions are generated. - Each of them is prefixed with object name, derived from - object file name. I.e., if BPF object file name is - **example.o**, BPF object name will be **example**. The - following custom functions are provided in such case: + Each of them is prefixed with object name. Object name can + either be derived from object file name, i.e., if BPF object + file name is **example.o**, BPF object name will be + **example**. Object name can be also specified explicitly + through **name** *OBJECT_NAME* parameter. The following + custom functions are provided (assuming **example** as + the object name):
- **example__open** and **example__open_opts**. These functions are used to instantiate skeleton. It diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool index f783e7c5a8df..5590f6209e3d 100644 --- a/tools/bpf/bpftool/bash-completion/bpftool +++ b/tools/bpf/bpftool/bash-completion/bpftool @@ -982,7 +982,16 @@ _bpftool() gen) case $command in skeleton) - _filedir + case $prev in + $command) + _filedir + return 0 + ;; + *) + _bpftool_once_attr 'name' + return 0 + ;; + esac ;; *) [[ $prev == $object ]] && \ diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 4033c46d83e7..9bff89a66835 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -273,7 +273,7 @@ static int do_skeleton(int argc, char **argv) char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")]; size_t i, map_cnt = 0, prog_cnt = 0, file_sz, mmap_sz; DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts); - char obj_name[MAX_OBJ_NAME_LEN], *obj_data; + char obj_name[MAX_OBJ_NAME_LEN] = "", *obj_data; struct bpf_object *obj = NULL; const char *file, *ident; struct bpf_program *prog; @@ -288,6 +288,28 @@ static int do_skeleton(int argc, char **argv) } file = GET_ARG();
+ while (argc) { + if (!REQ_ARGS(2)) + return -1; + + if (is_prefix(*argv, "name")) { + NEXT_ARG(); + + if (obj_name[0] != '\0') { + p_err("object name already specified"); + return -1; + } + + strncpy(obj_name, *argv, MAX_OBJ_NAME_LEN - 1); + obj_name[MAX_OBJ_NAME_LEN - 1] = '\0'; + } else { + p_err("unknown arg %s", *argv); + return -1; + } + + NEXT_ARG(); + } + if (argc) { p_err("extra unknown arguments"); return -1; @@ -310,7 +332,8 @@ static int do_skeleton(int argc, char **argv) p_err("failed to mmap() %s: %s", file, strerror(errno)); goto out; } - get_obj_name(obj_name, file); + if (obj_name[0] == '\0') + get_obj_name(obj_name, file); opts.object_name = obj_name; obj = bpf_object__open_mem(obj_data, file_sz, &opts); if (IS_ERR(obj)) { @@ -599,7 +622,7 @@ static int do_help(int argc, char **argv) }
fprintf(stderr, - "Usage: %1$s %2$s skeleton FILE\n" + "Usage: %1$s %2$s skeleton FILE [name OBJECT_NAME]\n" " %1$s %2$s help\n" "\n" " " HELP_SPEC_OPTIONS "\n"
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit d80b2fcbe0a023619e0fc73112f2a02c2662f6ab category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add `bpftool gen object <output-file> <input_file>...` command to statically link multiple BPF ELF object files into a single output BPF ELF object file.
This patch also updates bash completions and man page. Man page gets a short section on `gen object` command, but also updates the skeleton example to show off workflow for BPF application with two .bpf.c files, compiled individually with Clang, then resulting object files are linked together with `gen object`, and then final object file is used to generate usable BPF skeleton. This should help new users understand realistic workflow w.r.t. compiling mutli-file BPF application.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Quentin Monnet quentin@isovalent.com Link: https://lore.kernel.org/bpf/20210318194036.3521577-10-andrii@kernel.org (cherry picked from commit d80b2fcbe0a023619e0fc73112f2a02c2662f6ab) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../bpf/bpftool/Documentation/bpftool-gen.rst | 65 +++++++++++++++---- tools/bpf/bpftool/bash-completion/bpftool | 6 +- tools/bpf/bpftool/gen.c | 45 ++++++++++++- 3 files changed, 100 insertions(+), 16 deletions(-)
diff --git a/tools/bpf/bpftool/Documentation/bpftool-gen.rst b/tools/bpf/bpftool/Documentation/bpftool-gen.rst index d4e7338e22e7..7cd6681137f3 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-gen.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-gen.rst @@ -14,16 +14,37 @@ SYNOPSIS
*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] }
- *COMMAND* := { **skeleton** | **help** } + *COMMAND* := { **object** | **skeleton** | **help** }
GEN COMMANDS =============
+| **bpftool** **gen object** *OUTPUT_FILE* *INPUT_FILE* [*INPUT_FILE*...] | **bpftool** **gen skeleton** *FILE* [**name** *OBJECT_NAME*] | **bpftool** **gen help**
DESCRIPTION =========== + **bpftool gen object** *OUTPUT_FILE* *INPUT_FILE* [*INPUT_FILE*...] + Statically link (combine) together one or more *INPUT_FILE*'s + into a single resulting *OUTPUT_FILE*. All the files involved + are BPF ELF object files. + + The rules of BPF static linking are mostly the same as for + user-space object files, but in addition to combining data + and instruction sections, .BTF and .BTF.ext (if present in + any of the input files) data are combined together. .BTF + data is deduplicated, so all the common types across + *INPUT_FILE*'s will only be represented once in the resulting + BTF information. + + BPF static linking allows to partition BPF source code into + individually compiled files that are then linked into + a single resulting BPF object file, which can be used to + generated BPF skeleton (with **gen skeleton** command) or + passed directly into **libbpf** (using **bpf_object__open()** + family of APIs). + **bpftool gen skeleton** *FILE* Generate BPF skeleton C header file for a given *FILE*.
@@ -133,26 +154,19 @@ OPTIONS
EXAMPLES ======== -**$ cat example.c** +**$ cat example1.bpf.c**
::
#include <stdbool.h> #include <linux/ptrace.h> #include <linux/bpf.h> - #include "bpf_helpers.h" + #include <bpf/bpf_helpers.h>
const volatile int param1 = 42; bool global_flag = true; struct { int x; } data = {};
- struct { - __uint(type, BPF_MAP_TYPE_HASH); - __uint(max_entries, 128); - __type(key, int); - __type(value, long); - } my_map SEC(".maps"); - SEC("raw_tp/sys_enter") int handle_sys_enter(struct pt_regs *ctx) { @@ -164,6 +178,21 @@ EXAMPLES return 0; }
+**$ cat example2.bpf.c** + +:: + + #include <linux/ptrace.h> + #include <linux/bpf.h> + #include <bpf/bpf_helpers.h> + + struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 128); + __type(key, int); + __type(value, long); + } my_map SEC(".maps"); + SEC("raw_tp/sys_exit") int handle_sys_exit(struct pt_regs *ctx) { @@ -173,9 +202,17 @@ EXAMPLES }
This is example BPF application with two BPF programs and a mix of BPF maps -and global variables. +and global variables. Source code is split across two source code files.
-**$ bpftool gen skeleton example.o** +**$ clang -target bpf -g example1.bpf.c -o example1.bpf.o** +**$ clang -target bpf -g example2.bpf.c -o example2.bpf.o** +**$ bpftool gen object example.bpf.o example1.bpf.o example2.bpf.o** + +This set of commands compiles *example1.bpf.c* and *example2.bpf.c* +individually and then statically links respective object files into the final +BPF ELF object file *example.bpf.o*. + +**$ bpftool gen skeleton example.bpf.o name example | tee example.skel.h**
::
@@ -230,7 +267,7 @@ and global variables.
#endif /* __EXAMPLE_SKEL_H__ */
-**$ cat example_user.c** +**$ cat example.c**
::
@@ -273,7 +310,7 @@ and global variables. return err; }
-**# ./example_user** +**# ./example**
::
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool index 5590f6209e3d..888758ec5d27 100644 --- a/tools/bpf/bpftool/bash-completion/bpftool +++ b/tools/bpf/bpftool/bash-completion/bpftool @@ -981,6 +981,10 @@ _bpftool() ;; gen) case $command in + object) + _filedir + return 0 + ;; skeleton) case $prev in $command) @@ -995,7 +999,7 @@ _bpftool() ;; *) [[ $prev == $object ]] && \ - COMPREPLY=( $( compgen -W 'skeleton help' -- "$cur" ) ) + COMPREPLY=( $( compgen -W 'object skeleton help' -- "$cur" ) ) ;; esac ;; diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 9bff89a66835..31ade77f5ef8 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -614,6 +614,47 @@ static int do_skeleton(int argc, char **argv) return err; }
+static int do_object(int argc, char **argv) +{ + struct bpf_linker *linker; + const char *output_file, *file; + int err = 0; + + if (!REQ_ARGS(2)) { + usage(); + return -1; + } + + output_file = GET_ARG(); + + linker = bpf_linker__new(output_file, NULL); + if (!linker) { + p_err("failed to create BPF linker instance"); + return -1; + } + + while (argc) { + file = GET_ARG(); + + err = bpf_linker__add_file(linker, file); + if (err) { + p_err("failed to link '%s': %s (%d)", file, strerror(err), err); + goto out; + } + } + + err = bpf_linker__finalize(linker); + if (err) { + p_err("failed to finalize ELF file: %s (%d)", strerror(err), err); + goto out; + } + + err = 0; +out: + bpf_linker__free(linker); + return err; +} + static int do_help(int argc, char **argv) { if (json_output) { @@ -622,7 +663,8 @@ static int do_help(int argc, char **argv) }
fprintf(stderr, - "Usage: %1$s %2$s skeleton FILE [name OBJECT_NAME]\n" + "Usage: %1$s %2$s object OUTPUT_FILE INPUT_FILE [INPUT_FILE...]\n" + " %1$s %2$s skeleton FILE [name OBJECT_NAME]\n" " %1$s %2$s help\n" "\n" " " HELP_SPEC_OPTIONS "\n" @@ -633,6 +675,7 @@ static int do_help(int argc, char **argv) }
static const struct cmd cmds[] = { + { "object", do_object }, { "skeleton", do_skeleton }, { "help", do_help }, { 0 }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 14137f3c62186799b01eea8a338f90c9cbc57f00 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Pass all individual BPF object files (generated from progs/*.c) through `bpftool gen object` command to validate that BPF static linker doesn't corrupt them.
As an additional sanity checks, validate that passing resulting object files through linker again results in identical ELF files. Exact same ELF contents can be guaranteed only after two passes, as after the first pass ELF sections order changes, and thus .BTF.ext data sections order changes. That, in turn, means that strings are added into the final BTF string sections in different order, so .BTF strings data might not be exactly the same. But doing another round of linking afterwards should result in the identical ELF file, which is checked with additional `diff` command.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-12-andrii@kernel.org (cherry picked from commit 14137f3c62186799b01eea8a338f90c9cbc57f00) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index cd0e436be9d9..d730b4bfbccf 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -348,12 +348,13 @@ $(TRUNNER_BPF_OBJS): $(TRUNNER_OUTPUT)/%.o: \ $$(call $(TRUNNER_BPF_BUILD_RULE),$$<,$$@, \ $(TRUNNER_BPF_CFLAGS))
-$(TRUNNER_BPF_SKELS): $(TRUNNER_OUTPUT)/%.skel.h: \ - $(TRUNNER_OUTPUT)/%.o \ - $(BPFTOOL) \ - | $(TRUNNER_OUTPUT) +$(TRUNNER_BPF_SKELS): %.skel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT) $$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@) - $(Q)$$(BPFTOOL) gen skeleton $$< > $$@ + $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked.o) $$< + $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked2.o) $$(<:.o=.linked.o) + $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked3.o) $$(<:.o=.linked2.o) + $(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o) + $(Q)$$(BPFTOOL) gen skeleton $$(<:.o=.linked.o) name $$(notdir $$(<:.o=)) > $$@ endif
# ensure we set up tests.h header generation rule just once
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit a0964f526df6facd4e12a4c416185013026eecf9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add Makefile infra to specify multi-file BPF object files (and derivative skeletons). Add first selftest validating BPF static linker can merge together successfully two independent BPF object files and resulting object and skeleton are correct and usable.
Use the same F(F(F(X))) = F(F(X)) identity test on linked object files as for the case of single BPF object files.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210318194036.3521577-13-andrii@kernel.org (cherry picked from commit a0964f526df6facd4e12a4c416185013026eecf9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 21 ++++++++-- .../selftests/bpf/prog_tests/static_linked.c | 40 +++++++++++++++++++ .../selftests/bpf/progs/test_static_linked1.c | 30 ++++++++++++++ .../selftests/bpf/progs/test_static_linked2.c | 31 ++++++++++++++ 4 files changed, 119 insertions(+), 3 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/static_linked.c create mode 100644 tools/testing/selftests/bpf/progs/test_static_linked1.c create mode 100644 tools/testing/selftests/bpf/progs/test_static_linked2.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index d730b4bfbccf..cc68e3ecec7b 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -295,6 +295,10 @@ endef
SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
+LINKED_SKELS := test_static_linked.skel.h + +test_static_linked.skel.h-deps := test_static_linked1.o test_static_linked2.o + # Set up extra TRUNNER_XXX "temporary" variables in the environment (relies on # $eval()) and pass control to DEFINE_TEST_RUNNER_RULES. # Parameters: @@ -315,6 +319,7 @@ TRUNNER_BPF_OBJS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.o, $$(TRUNNER_BPF_SRCS) TRUNNER_BPF_SKELS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.skel.h, \ $$(filter-out $(SKEL_BLACKLIST), \ $$(TRUNNER_BPF_SRCS))) +TRUNNER_BPF_SKELS_LINKED := $$(addprefix $$(TRUNNER_OUTPUT)/,$(LINKED_SKELS)) TEST_GEN_FILES += $$(TRUNNER_BPF_OBJS)
# Evaluate rules now with extra TRUNNER_XXX variables above already defined @@ -350,11 +355,20 @@ $(TRUNNER_BPF_OBJS): $(TRUNNER_OUTPUT)/%.o: \
$(TRUNNER_BPF_SKELS): %.skel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT) $$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@) - $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked.o) $$< - $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked2.o) $$(<:.o=.linked.o) + $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked1.o) $$< + $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked2.o) $$(<:.o=.linked1.o) $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked3.o) $$(<:.o=.linked2.o) $(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o) - $(Q)$$(BPFTOOL) gen skeleton $$(<:.o=.linked.o) name $$(notdir $$(<:.o=)) > $$@ + $(Q)$$(BPFTOOL) gen skeleton $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$@ + +$(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_BPF_OBJS) $(BPFTOOL) | $(TRUNNER_OUTPUT) + $$(call msg,LINK-BPF,$(TRUNNER_BINARY),$$(@:.skel.h=.o)) + $(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked1.o) $$(addprefix $(TRUNNER_OUTPUT)/,$$($$(@F)-deps)) + $(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked2.o) $$(@:.skel.h=.linked1.o) + $(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked3.o) $$(@:.skel.h=.linked2.o) + $(Q)diff $$(@:.skel.h=.linked2.o) $$(@:.skel.h=.linked3.o) + $$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@) + $(Q)$$(BPFTOOL) gen skeleton $$(@:.skel.h=.linked3.o) name $$(notdir $$(@:.skel.h=)) > $$@ endif
# ensure we set up tests.h header generation rule just once @@ -376,6 +390,7 @@ $(TRUNNER_TEST_OBJS): $(TRUNNER_OUTPUT)/%.test.o: \ $(TRUNNER_EXTRA_HDRS) \ $(TRUNNER_BPF_OBJS) \ $(TRUNNER_BPF_SKELS) \ + $(TRUNNER_BPF_SKELS_LINKED) \ $$(BPFOBJ) | $(TRUNNER_OUTPUT) $$(call msg,TEST-OBJ,$(TRUNNER_BINARY),$$@) $(Q)cd $$(@D) && $$(CC) -I. $$(CFLAGS) -c $(CURDIR)/$$< $$(LDLIBS) -o $$(@F) diff --git a/tools/testing/selftests/bpf/prog_tests/static_linked.c b/tools/testing/selftests/bpf/prog_tests/static_linked.c new file mode 100644 index 000000000000..46556976dccc --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/static_linked.c @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019 Facebook */ + +#include <test_progs.h> +#include "test_static_linked.skel.h" + +void test_static_linked(void) +{ + int err; + struct test_static_linked* skel; + + skel = test_static_linked__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + skel->rodata->rovar1 = 1; + skel->bss->static_var1 = 2; + skel->bss->static_var11 = 3; + + skel->rodata->rovar2 = 4; + skel->bss->static_var2 = 5; + skel->bss->static_var22 = 6; + + err = test_static_linked__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + err = test_static_linked__attach(skel); + if (!ASSERT_OK(err, "skel_attach")) + goto cleanup; + + /* trigger */ + usleep(1); + + ASSERT_EQ(skel->bss->var1, 1 * 2 + 2 + 3, "var1"); + ASSERT_EQ(skel->bss->var2, 4 * 3 + 5 + 6, "var2"); + +cleanup: + test_static_linked__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/test_static_linked1.c b/tools/testing/selftests/bpf/progs/test_static_linked1.c new file mode 100644 index 000000000000..ea1a6c4c7172 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_static_linked1.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +/* 8-byte aligned .bss */ +static volatile long static_var1; +static volatile int static_var11; +int var1 = 0; +/* 4-byte aligned .rodata */ +const volatile int rovar1; + +/* same "subprog" name in both files */ +static __noinline int subprog(int x) +{ + /* but different formula */ + return x * 2; +} + +SEC("raw_tp/sys_enter") +int handler1(const void *ctx) +{ + var1 = subprog(rovar1) + static_var1 + static_var11; + + return 0; +} + +char LICENSE[] SEC("license") = "GPL"; +int VERSION SEC("version") = 1; diff --git a/tools/testing/selftests/bpf/progs/test_static_linked2.c b/tools/testing/selftests/bpf/progs/test_static_linked2.c new file mode 100644 index 000000000000..54d8d1ab577c --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_static_linked2.c @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +/* 4-byte aligned .bss */ +static volatile int static_var2; +static volatile int static_var22; +int var2 = 0; +/* 8-byte aligned .rodata */ +const volatile long rovar2; + +/* same "subprog" name in both files */ +static __noinline int subprog(int x) +{ + /* but different formula */ + return x * 3; +} + +SEC("raw_tp/sys_enter") +int handler2(const void *ctx) +{ + var2 = subprog(rovar2) + static_var2 + static_var22; + + return 0; +} + +/* different name and/or type of the variable doesn't matter */ +char _license[] SEC("license") = "GPL"; +int _version SEC("version") = 1;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 78b226d48106fc91628f941c66545f05273269df category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Skip BTF fixup step when input object file is missing BTF altogether.
Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Jiri Olsa jolsa@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Tested-by: Jiri Olsa jolsa@redhat.com Link: https://lore.kernel.org/bpf/20210319205909.1748642-3-andrii@kernel.org (cherry picked from commit 78b226d48106fc91628f941c66545f05273269df) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index b4fff912dce2..5e0aa2f2c0ca 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1313,6 +1313,9 @@ static int linker_fixup_btf(struct src_obj *obj) struct src_sec *sec; int i, j, n, m;
+ if (!obj->btf) + return 0; + n = btf__get_nr_types(obj->btf); for (i = 1; i <= n; i++) { struct btf_var_secinfo *vi;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit a46410d5e4975d701d526397156fa0815747dc2f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
bpf_program__get_type() and bpf_program__get_expected_attach_type() shouldn't modify given bpf_program, so mark input parameter as const struct bpf_program. This eliminates unnecessary compilation warnings or explicit casts in user programs.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210324172941.2609884-1-andrii@kernel.org (cherry picked from commit a46410d5e4975d701d526397156fa0815747dc2f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 ++-- tools/lib/bpf/libbpf.h | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e1350af34228..32360e75da1c 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8515,7 +8515,7 @@ int bpf_program__nth_fd(const struct bpf_program *prog, int n) return fd; }
-enum bpf_prog_type bpf_program__get_type(struct bpf_program *prog) +enum bpf_prog_type bpf_program__get_type(const struct bpf_program *prog) { return prog->type; } @@ -8561,7 +8561,7 @@ BPF_PROG_TYPE_FNS(sk_lookup, BPF_PROG_TYPE_SK_LOOKUP); BPF_PROG_TYPE_FNS(sched, BPF_PROG_TYPE_SCHED);
enum bpf_attach_type -bpf_program__get_expected_attach_type(struct bpf_program *prog) +bpf_program__get_expected_attach_type(const struct bpf_program *prog) { return prog->expected_attach_type; } diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 7c1fd67c9339..41f4a1c9c432 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -364,12 +364,12 @@ LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog); LIBBPF_API int bpf_program__set_sk_lookup(struct bpf_program *prog); LIBBPF_API int bpf_program__set_sched(struct bpf_program *prog);
-LIBBPF_API enum bpf_prog_type bpf_program__get_type(struct bpf_program *prog); +LIBBPF_API enum bpf_prog_type bpf_program__get_type(const struct bpf_program *prog); LIBBPF_API void bpf_program__set_type(struct bpf_program *prog, enum bpf_prog_type type);
LIBBPF_API enum bpf_attach_type -bpf_program__get_expected_attach_type(struct bpf_program *prog); +bpf_program__get_expected_attach_type(const struct bpf_program *prog); LIBBPF_API void bpf_program__set_expected_attach_type(struct bpf_program *prog, enum bpf_attach_type type);
From: Rafael David Tinoco rafaeldtinoco@ubuntu.com
mainline inclusion from mainline-5.13-rc1 commit 155f556d64b1a48710f01305e14bb860734ed1e3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Unfortunately some distros don't have their kernel version defined accurately in <linux/version.h> due to different long term support reasons.
It is important to have a way to override the bpf kern_version attribute during runtime: some old kernels might still check for kern_version attribute during bpf_prog_load().
Signed-off-by: Rafael David Tinoco rafaeldtinoco@ubuntu.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210323040952.2118241-1-rafaeldtinoco@ubuntu.co... (cherry picked from commit 155f556d64b1a48710f01305e14bb860734ed1e3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 10 ++++++++++ tools/lib/bpf/libbpf.h | 1 + tools/lib/bpf/libbpf.map | 1 + 3 files changed, 12 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 32360e75da1c..0c1c6de1e0b3 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8327,6 +8327,16 @@ int bpf_object__btf_fd(const struct bpf_object *obj) return obj->btf ? btf__fd(obj->btf) : -1; }
+int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_version) +{ + if (obj->loaded) + return -EINVAL; + + obj->kern_version = kern_version; + + return 0; +} + int bpf_object__set_priv(struct bpf_object *obj, void *priv, bpf_object_clear_priv_t clear_priv) { diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 41f4a1c9c432..4bb141b6727d 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -143,6 +143,7 @@ LIBBPF_API int bpf_object__unload(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj); LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj); +LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_version);
struct btf; LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index e07bb4fb0923..0d3355c1b2e1 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -359,4 +359,5 @@ LIBBPF_0.4.0 { bpf_linker__finalize; bpf_linker__free; bpf_linker__new; + bpf_object__set_kversion; } LIBBPF_0.3.0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 36e7985160782bc683001afe09e33a288435def0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Ensure that BPF static linker preserves all DATASEC BTF types, even if some of them might not have any variable information at all. This may happen if the compiler promotes local initialized variable contents into .rodata section and there are no global or static functions in the program.
For example,
$ cat t.c struct t { char a; char b; char c; }; void bar(struct t*); void find() { struct t tmp = {1, 2, 3}; bar(&tmp); }
$ clang -target bpf -O2 -g -S t.c .long 104 # BTF_KIND_DATASEC(id = 8) .long 251658240 # 0xf000000 .long 0
.ascii ".rodata" # string offset=104
$ clang -target bpf -O2 -g -c t.c $ readelf -S t.o | grep data [ 4] .rodata PROGBITS 0000000000000000 00000090
Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org (cherry picked from commit 36e7985160782bc683001afe09e33a288435def0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 5e0aa2f2c0ca..a29d62ff8041 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -94,6 +94,7 @@ struct dst_sec { int sec_sym_idx;
/* section's DATASEC variable info, emitted on BTF finalization */ + bool has_btf; int sec_var_cnt; struct btf_var_secinfo *sec_vars;
@@ -1436,6 +1437,16 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) continue; dst_sec = &linker->secs[src_sec->dst_id];
+ /* Mark section as having BTF regardless of the presence of + * variables. In some cases compiler might generate empty BTF + * with no variables information. E.g., when promoting local + * array/structure variable initial values and BPF object + * file otherwise has no read-only static variables in + * .rodata. We need to preserve such empty BTF and just set + * correct section size. + */ + dst_sec->has_btf = true; + t = btf__type_by_id(obj->btf, src_sec->sec_type_id); src_var = btf_var_secinfos(t); n = btf_vlen(t); @@ -1717,7 +1728,7 @@ static int finalize_btf(struct bpf_linker *linker) for (i = 1; i < linker->sec_cnt; i++) { struct dst_sec *sec = &linker->secs[i];
- if (!sec->sec_var_cnt) + if (!sec->has_btf) continue;
id = btf__add_datasec(btf, sec->sec_name, sec->sec_sz);
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit e16301fbe1837c9594f9c1957c28fd1bb18fbd15 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch simplifies the linfo freeing logic by combining "bpf_prog_free_jited_linfo()" and "bpf_prog_free_unused_jited_linfo()" into the new "bpf_prog_jit_attempt_done()". It is a prep work for the kernel function call support. In a later patch, freeing the kernel function call descriptors will also be done in the "bpf_prog_jit_attempt_done()".
"bpf_prog_free_linfo()" is removed since it is only called by "__bpf_prog_put_noref()". The kvfree() are directly called instead.
It also takes this chance to s/kcalloc/kvcalloc/ for the jited_linfo allocation.
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210325015130.1544323-1-kafai@fb.com (cherry picked from commit e16301fbe1837c9594f9c1957c28fd1bb18fbd15) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/core.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/filter.h | 3 +-- kernel/bpf/core.c | 31 ++++++++++--------------------- kernel/bpf/syscall.c | 3 ++- kernel/bpf/verifier.c | 4 ++-- 4 files changed, 15 insertions(+), 26 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h index f77cb6c78a27..5e49b08c4a1c 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -891,8 +891,7 @@ void bpf_prog_free_linfo(struct bpf_prog *prog); void bpf_prog_fill_jited_linfo(struct bpf_prog *prog, const u32 *insn_to_jit_off); int bpf_prog_alloc_jited_linfo(struct bpf_prog *prog); -void bpf_prog_free_jited_linfo(struct bpf_prog *prog); -void bpf_prog_free_unused_jited_linfo(struct bpf_prog *prog); +void bpf_prog_jit_attempt_done(struct bpf_prog *prog);
struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags); struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index c766bda64fe5..f0e796c46b0e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -145,7 +145,7 @@ int bpf_prog_alloc_jited_linfo(struct bpf_prog *prog) if (!prog->aux->nr_linfo || !prog->jit_requested) return 0;
- prog->aux->jited_linfo = kcalloc(prog->aux->nr_linfo, + prog->aux->jited_linfo = kvcalloc(prog->aux->nr_linfo, sizeof(*prog->aux->jited_linfo), GFP_KERNEL | __GFP_NOWARN); if (!prog->aux->jited_linfo) @@ -154,16 +154,13 @@ int bpf_prog_alloc_jited_linfo(struct bpf_prog *prog) return 0; }
-void bpf_prog_free_jited_linfo(struct bpf_prog *prog) +void bpf_prog_jit_attempt_done(struct bpf_prog *prog) { - kfree(prog->aux->jited_linfo); - prog->aux->jited_linfo = NULL; -} - -void bpf_prog_free_unused_jited_linfo(struct bpf_prog *prog) -{ - if (prog->aux->jited_linfo && !prog->aux->jited_linfo[0]) - bpf_prog_free_jited_linfo(prog); + if (prog->aux->jited_linfo && + (!prog->jited || !prog->aux->jited_linfo[0])) { + kvfree(prog->aux->jited_linfo); + prog->aux->jited_linfo = NULL; + } }
/* The jit engine is responsible to provide an array @@ -219,12 +216,6 @@ void bpf_prog_fill_jited_linfo(struct bpf_prog *prog, insn_to_jit_off[linfo[i].insn_off - insn_start - 1]; }
-void bpf_prog_free_linfo(struct bpf_prog *prog) -{ - bpf_prog_free_jited_linfo(prog); - kvfree(prog->aux->linfo); -} - struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size, gfp_t gfp_extra_flags) { @@ -1924,15 +1915,13 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err) return fp;
fp = bpf_int_jit_compile(fp); - if (!fp->jited) { - bpf_prog_free_jited_linfo(fp); + bpf_prog_jit_attempt_done(fp); #ifdef CONFIG_BPF_JIT_ALWAYS_ON + if (!fp->jited) { *err = -ENOTSUPP; return fp; -#endif - } else { - bpf_prog_free_unused_jited_linfo(fp); } +#endif } else { *err = bpf_prog_offload_compile(fp); if (*err) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 0b9b3f0d772c..9cf791808a7f 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1758,7 +1758,8 @@ static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred) { bpf_prog_kallsyms_del_all(prog); btf_put(prog->aux->btf); - bpf_prog_free_linfo(prog); + kvfree(prog->aux->jited_linfo); + kvfree(prog->aux->linfo); if (prog->aux->attach_btf) btf_put(prog->aux->attach_btf);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f63a736a645a..f920f4f429bc 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11707,7 +11707,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) prog->bpf_func = func[0]->bpf_func; prog->aux->func = func; prog->aux->func_cnt = env->subprog_cnt; - bpf_prog_free_unused_jited_linfo(prog); + bpf_prog_jit_attempt_done(prog); return 0; out_free: /* We failed JIT'ing, so at this point we need to unregister poke @@ -11738,7 +11738,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) insn->off = 0; insn->imm = env->insn_aux_data[i].call_imm; } - bpf_prog_free_jited_linfo(prog); + bpf_prog_jit_attempt_done(prog); return err; }
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit 34747c4120418143097d4343312a0ca96c986d86 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch moved the subprog specific logic from btf_check_func_arg_match() to the new btf_check_subprog_arg_match(). The core logic is left in btf_check_func_arg_match() which will be reused later to check the kernel function call.
The "if (!btf_type_is_ptr(t))" is checked first to improve the indentation which will be useful for a later patch.
Some of the "btf_kind_str[]" usages is replaced with the shortcut "btf_type_str(t)".
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210325015136.1544504-1-kafai@fb.com (cherry picked from commit 34747c4120418143097d4343312a0ca96c986d86) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/btf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 4 +- include/linux/btf.h | 5 ++ kernel/bpf/btf.c | 159 +++++++++++++++++++++++------------------- kernel/bpf/verifier.c | 4 +- 4 files changed, 95 insertions(+), 77 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d115f01ef1b6..cc2fcbeafa19 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1593,8 +1593,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, struct btf_func_model *m);
struct bpf_reg_state; -int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, - struct bpf_reg_state *regs); +int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog, + struct bpf_reg_state *regs); int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, struct bpf_reg_state *reg); int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *prog, diff --git a/include/linux/btf.h b/include/linux/btf.h index 7fabf1428093..93bf2e5225f5 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -140,6 +140,11 @@ static inline bool btf_type_is_enum(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM; }
+static inline bool btf_type_is_scalar(const struct btf_type *t) +{ + return btf_type_is_int(t) || btf_type_is_enum(t); +} + static inline bool btf_type_is_typedef(const struct btf_type *t) { return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF; diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index ab190f20e811..11779e1cca45 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4297,7 +4297,7 @@ static u8 bpf_ctx_convert_map[] = { #undef BPF_LINK_TYPE
static const struct btf_member * -btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf, +btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf, const struct btf_type *t, enum bpf_prog_type prog_type, int arg) { @@ -5297,122 +5297,135 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr return btf_check_func_type_match(log, btf1, t1, btf2, t2); }
-/* Compare BTF of a function with given bpf_reg_state. - * Returns: - * EFAULT - there is a verifier bug. Abort verification. - * EINVAL - there is a type mismatch or BTF is not available. - * 0 - BTF matches with what bpf_reg_state expects. - * Only PTR_TO_CTX and SCALAR_VALUE states are recognized. - */ -int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, - struct bpf_reg_state *regs) +static int btf_check_func_arg_match(struct bpf_verifier_env *env, + const struct btf *btf, u32 func_id, + struct bpf_reg_state *regs, + bool ptr_to_mem_ok) { struct bpf_verifier_log *log = &env->log; - struct bpf_prog *prog = env->prog; - struct btf *btf = prog->aux->btf; - const struct btf_param *args; + const char *func_name, *ref_tname; const struct btf_type *t, *ref_t; - u32 i, nargs, btf_id, type_size; - const char *tname; - bool is_global; - - if (!prog->aux->func_info) - return -EINVAL; - - btf_id = prog->aux->func_info[subprog].type_id; - if (!btf_id) - return -EFAULT; - - if (prog->aux->func_info_aux[subprog].unreliable) - return -EINVAL; + const struct btf_param *args; + u32 i, nargs;
- t = btf_type_by_id(btf, btf_id); + t = btf_type_by_id(btf, func_id); if (!t || !btf_type_is_func(t)) { /* These checks were already done by the verifier while loading * struct bpf_func_info */ - bpf_log(log, "BTF of func#%d doesn't point to KIND_FUNC\n", - subprog); + bpf_log(log, "BTF of func_id %u doesn't point to KIND_FUNC\n", + func_id); return -EFAULT; } - tname = btf_name_by_offset(btf, t->name_off); + func_name = btf_name_by_offset(btf, t->name_off);
t = btf_type_by_id(btf, t->type); if (!t || !btf_type_is_func_proto(t)) { - bpf_log(log, "Invalid BTF of func %s\n", tname); + bpf_log(log, "Invalid BTF of func %s\n", func_name); return -EFAULT; } args = (const struct btf_param *)(t + 1); nargs = btf_type_vlen(t); if (nargs > MAX_BPF_FUNC_REG_ARGS) { - bpf_log(log, "Function %s has %d > %d args\n", tname, nargs, + bpf_log(log, "Function %s has %d > %d args\n", func_name, nargs, MAX_BPF_FUNC_REG_ARGS); - goto out; + return -EINVAL; }
- is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; /* check that BTF function arguments match actual types that the * verifier sees. */ for (i = 0; i < nargs; i++) { - struct bpf_reg_state *reg = ®s[i + 1]; + u32 regno = i + 1; + struct bpf_reg_state *reg = ®s[regno];
- t = btf_type_by_id(btf, args[i].type); - while (btf_type_is_modifier(t)) - t = btf_type_by_id(btf, t->type); - if (btf_type_is_int(t) || btf_type_is_enum(t)) { + t = btf_type_skip_modifiers(btf, args[i].type, NULL); + if (btf_type_is_scalar(t)) { if (reg->type == SCALAR_VALUE) continue; - bpf_log(log, "R%d is not a scalar\n", i + 1); - goto out; + bpf_log(log, "R%d is not a scalar\n", regno); + return -EINVAL; } - if (btf_type_is_ptr(t)) { + + if (!btf_type_is_ptr(t)) { + bpf_log(log, "Unrecognized arg#%d type %s\n", + i, btf_type_str(t)); + return -EINVAL; + } + + ref_t = btf_type_skip_modifiers(btf, t->type, NULL); + ref_tname = btf_name_by_offset(btf, ref_t->name_off); + if (btf_get_prog_ctx_type(log, btf, t, env->prog->type, i)) { /* If function expects ctx type in BTF check that caller * is passing PTR_TO_CTX. */ - if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) { - if (reg->type != PTR_TO_CTX) { - bpf_log(log, - "arg#%d expected pointer to ctx, but got %s\n", - i, btf_kind_str[BTF_INFO_KIND(t->info)]); - goto out; - } - if (check_ptr_off_reg(env, reg, i + 1)) - goto out; - continue; + if (reg->type != PTR_TO_CTX) { + bpf_log(log, + "arg#%d expected pointer to ctx, but got %s\n", + i, btf_type_str(t)); + return -EINVAL; } + if (check_ptr_off_reg(env, reg, regno)) + return -EINVAL; + } else if (ptr_to_mem_ok) { + const struct btf_type *resolve_ret; + u32 type_size;
- if (!is_global) - goto out; - - t = btf_type_skip_modifiers(btf, t->type, NULL); - - ref_t = btf_resolve_size(btf, t, &type_size); - if (IS_ERR(ref_t)) { + resolve_ret = btf_resolve_size(btf, ref_t, &type_size); + if (IS_ERR(resolve_ret)) { bpf_log(log, - "arg#%d reference type('%s %s') size cannot be determined: %ld\n", - i, btf_type_str(t), btf_name_by_offset(btf, t->name_off), - PTR_ERR(ref_t)); - goto out; + "arg#%d reference type('%s %s') size cannot be determined: %ld\n", + i, btf_type_str(ref_t), ref_tname, + PTR_ERR(resolve_ret)); + return -EINVAL; }
- if (check_mem_reg(env, reg, i + 1, type_size)) - goto out; - - continue; + if (check_mem_reg(env, reg, regno, type_size)) + return -EINVAL; + } else { + return -EINVAL; } - bpf_log(log, "Unrecognized arg#%d type %s\n", - i, btf_kind_str[BTF_INFO_KIND(t->info)]); - goto out; } + return 0; -out: +} + +/* Compare BTF of a function with given bpf_reg_state. + * Returns: + * EFAULT - there is a verifier bug. Abort verification. + * EINVAL - there is a type mismatch or BTF is not available. + * 0 - BTF matches with what bpf_reg_state expects. + * Only PTR_TO_CTX and SCALAR_VALUE states are recognized. + */ +int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog, + struct bpf_reg_state *regs) +{ + struct bpf_prog *prog = env->prog; + struct btf *btf = prog->aux->btf; + bool is_global; + u32 btf_id; + int err; + + if (!prog->aux->func_info) + return -EINVAL; + + btf_id = prog->aux->func_info[subprog].type_id; + if (!btf_id) + return -EFAULT; + + if (prog->aux->func_info_aux[subprog].unreliable) + return -EINVAL; + + is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; + err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global); + /* Compiler optimizations can remove arguments from static functions * or mismatched type can be passed into a global function. * In such cases mark the function as unreliable from BTF point of view. */ - prog->aux->func_info_aux[subprog].unreliable = true; - return -EINVAL; + if (err) + prog->aux->func_info_aux[subprog].unreliable = true; + return err; }
/* Convert BTF of a function into bpf_reg_state if possible diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f920f4f429bc..9e00ddff18af 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5322,7 +5322,7 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn func_info_aux = env->prog->aux->func_info_aux; if (func_info_aux) is_global = func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; - err = btf_check_func_arg_match(env, subprog, caller->regs); + err = btf_check_subprog_arg_match(env, subprog, caller->regs); if (err == -EFAULT) return err; if (is_global) { @@ -12228,7 +12228,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) /* 1st arg to a function */ regs[BPF_REG_1].type = PTR_TO_CTX; mark_reg_known_zero(env, regs, BPF_REG_1); - ret = btf_check_func_arg_match(env, subprog, regs); + ret = btf_check_subprog_arg_match(env, subprog, regs); if (ret == -EFAULT) /* unlikely verifier bug. abort. * ret == 0 and ret < 0 are sadly acceptable for
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit e6ac2450d6dee3121cd8bbf2907b78a68a8a353d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch adds support to BPF verifier to allow bpf program calling kernel function directly.
The use case included in this set is to allow bpf-tcp-cc to directly call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those functions have already been used by some kernel tcp-cc implementations.
This set will also allow the bpf-tcp-cc program to directly call the kernel tcp-cc implementation, For example, a bpf_dctcp may only want to implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly from the kernel tcp_dctcp.c instead of reimplementing (or copy-and-pasting) them.
The tcp-cc kernel functions mentioned above will be white listed for the struct_ops bpf-tcp-cc programs to use in a later patch. The white listed functions are not bounded to a fixed ABI contract. Those functions have already been used by the existing kernel tcp-cc. If any of them has changed, both in-tree and out-of-tree kernel tcp-cc implementations have to be changed. The same goes for the struct_ops bpf-tcp-cc programs which have to be adjusted accordingly.
This patch is to make the required changes in the bpf verifier.
First change is in btf.c, it adds a case in "btf_check_func_arg_match()". When the passed in "btf->kernel_btf == true", it means matching the verifier regs' states with a kernel function. This will handle the PTR_TO_BTF_ID reg. It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to its kernel's btf_id.
In the later libbpf patch, the insn calling a kernel function will look like:
insn->code == (BPF_JMP | BPF_CALL) insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */ insn->imm == func_btf_id /* btf_id of the running kernel */
[ For the future calling function-in-kernel-module support, an array of module btf_fds can be passed at the load time and insn->off can be used to index into this array. ]
At the early stage of verifier, the verifier will collect all kernel function calls into "struct bpf_kfunc_desc". Those descriptors are stored in "prog->aux->kfunc_tab" and will be available to the JIT. Since this "add" operation is similar to the current "add_subprog()" and looking for the same insn->code, they are done together in the new "add_subprog_and_kfunc()".
In the "do_check()" stage, the new "check_kfunc_call()" is added to verify the kernel function call instruction: 1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE. A new bpf_verifier_ops "check_kfunc_call" is added to do that. The bpf-tcp-cc struct_ops program will implement this function in a later patch. 2. Call "btf_check_kfunc_args_match()" to ensure the regs can be used as the args of a kernel function. 3. Mark the regs' type, subreg_def, and zext_dst.
At the later do_misc_fixups() stage, the new fixup_kfunc_call() will replace the insn->imm with the function address (relative to __bpf_call_base). If needed, the jit can find the btf_func_model by calling the new bpf_jit_find_kfunc_model(prog, insn). With the imm set to the function address, "bpftool prog dump xlated" will be able to display the kernel function calls the same way as it displays other bpf helper calls.
gpl_compatible program is required to call kernel function.
This feature currently requires JIT.
The verifier selftests are adjusted because of the changes in the verbose log in add_subprog_and_kfunc().
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com (cherry picked from commit e6ac2450d6dee3121cd8bbf2907b78a68a8a353d) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp.c | 5 + include/linux/bpf.h | 23 ++ include/linux/btf.h | 1 + include/linux/filter.h | 1 + include/uapi/linux/bpf.h | 4 + kernel/bpf/btf.c | 65 +++- kernel/bpf/core.c | 18 +- kernel/bpf/disasm.c | 13 +- kernel/bpf/syscall.c | 1 + kernel/bpf/verifier.c | 368 ++++++++++++++++-- tools/include/uapi/linux/bpf.h | 4 + tools/testing/selftests/bpf/verifier/calls.c | 12 +- .../selftests/bpf/verifier/dead_code.c | 10 +- 13 files changed, 479 insertions(+), 46 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 4c4c10e7455a..42a16cb6c3fc 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -2362,3 +2362,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) tmp : orig_prog); return prog; } + +bool bpf_jit_supports_kfunc_call(void) +{ + return true; +} diff --git a/include/linux/bpf.h b/include/linux/bpf.h index cc2fcbeafa19..880ab6a8e1f3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -535,6 +535,7 @@ struct bpf_verifier_ops { const struct btf_type *t, int off, int size, enum bpf_access_type atype, u32 *next_btf_id); + bool (*check_kfunc_call)(u32 kfunc_btf_id); };
struct bpf_prog_offload_ops { @@ -863,6 +864,8 @@ struct btf_mod_pair { struct module *module; };
+struct bpf_kfunc_desc_tab; + struct bpf_prog_aux { atomic64_t refcnt; u32 used_map_cnt; @@ -899,6 +902,7 @@ struct bpf_prog_aux { struct bpf_prog **func; void *jit_data; /* JIT specific data. arch dependent */ struct bpf_jit_poke_descriptor *poke_tab; + struct bpf_kfunc_desc_tab *kfunc_tab; u32 size_poke_tab; struct bpf_ksym ksym; const struct bpf_prog_ops *ops; @@ -1595,6 +1599,9 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, struct bpf_reg_state; int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog, struct bpf_reg_state *regs); +int btf_check_kfunc_arg_match(struct bpf_verifier_env *env, + const struct btf *btf, u32 func_id, + struct bpf_reg_state *regs); int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, struct bpf_reg_state *reg); int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *prog, @@ -1604,6 +1611,10 @@ struct bpf_prog *bpf_prog_by_id(u32 id); struct bpf_link *bpf_link_by_id(u32 id);
const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id); +bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog); +const struct btf_func_model * +bpf_jit_find_kfunc_model(const struct bpf_prog *prog, + const struct bpf_insn *insn);
static inline bool unprivileged_ebpf_enabled(void) { @@ -1812,6 +1823,18 @@ bpf_base_func_proto(enum bpf_func_id func_id) return NULL; }
+static inline bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog) +{ + return false; +} + +static inline const struct btf_func_model * +bpf_jit_find_kfunc_model(const struct bpf_prog *prog, + const struct bpf_insn *insn) +{ + return NULL; +} + static inline bool unprivileged_ebpf_enabled(void) { return false; diff --git a/include/linux/btf.h b/include/linux/btf.h index 93bf2e5225f5..8f6ea0d4d8a1 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -109,6 +109,7 @@ const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf, const struct btf_type * btf_resolve_size(const struct btf *btf, const struct btf_type *type, u32 *type_size); +const char *btf_type_str(const struct btf_type *t);
#define for_each_member(i, struct_type, member) \ for (i = 0, member = btf_type_member(struct_type); \ diff --git a/include/linux/filter.h b/include/linux/filter.h index 5e49b08c4a1c..53753deda3d8 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -932,6 +932,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog); void bpf_jit_compile(struct bpf_prog *prog); bool bpf_jit_needs_zext(void); +bool bpf_jit_supports_kfunc_call(void); bool bpf_helper_changes_pkt_data(void *func);
static inline bool bpf_dump_raw_ok(const struct cred *cred) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index e8ae59d9cb37..4feb43973716 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -408,6 +408,10 @@ enum bpf_link_type { * offset to another bpf function */ #define BPF_PSEUDO_CALL 1 +/* when bpf_call->src_reg == BPF_PSEUDO_KFUNC_CALL, + * bpf_call->imm == btf_id of a BTF_KIND_FUNC in the running kernel + */ +#define BPF_PSEUDO_KFUNC_CALL 2
/* flags for BPF_MAP_UPDATE_ELEM command */ enum { diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 11779e1cca45..692e27ae3083 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -282,7 +282,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_DATASEC] = "DATASEC", };
-static const char *btf_type_str(const struct btf_type *t) +const char *btf_type_str(const struct btf_type *t) { return btf_kind_str[BTF_INFO_KIND(t->info)]; } @@ -5297,6 +5297,14 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr return btf_check_func_type_match(log, btf1, t1, btf2, t2); }
+static u32 *reg2btf_ids[__BPF_REG_TYPE_MAX] = { +#ifdef CONFIG_NET + [PTR_TO_SOCKET] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK], + [PTR_TO_SOCK_COMMON] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON], + [PTR_TO_TCP_SOCK] = &btf_sock_ids[BTF_SOCK_TYPE_TCP], +#endif +}; + static int btf_check_func_arg_match(struct bpf_verifier_env *env, const struct btf *btf, u32 func_id, struct bpf_reg_state *regs, @@ -5306,12 +5314,12 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, const char *func_name, *ref_tname; const struct btf_type *t, *ref_t; const struct btf_param *args; - u32 i, nargs; + u32 i, nargs, ref_id;
t = btf_type_by_id(btf, func_id); if (!t || !btf_type_is_func(t)) { /* These checks were already done by the verifier while loading - * struct bpf_func_info + * struct bpf_func_info or in add_kfunc_call(). */ bpf_log(log, "BTF of func_id %u doesn't point to KIND_FUNC\n", func_id); @@ -5353,9 +5361,49 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, return -EINVAL; }
- ref_t = btf_type_skip_modifiers(btf, t->type, NULL); + ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id); ref_tname = btf_name_by_offset(btf, ref_t->name_off); - if (btf_get_prog_ctx_type(log, btf, t, env->prog->type, i)) { + if (btf_is_kernel(btf)) { + const struct btf_type *reg_ref_t; + const struct btf *reg_btf; + const char *reg_ref_tname; + u32 reg_ref_id; + + if (!btf_type_is_struct(ref_t)) { + bpf_log(log, "kernel function %s args#%d pointer type %s %s is not supported\n", + func_name, i, btf_type_str(ref_t), + ref_tname); + return -EINVAL; + } + + if (reg->type == PTR_TO_BTF_ID) { + reg_btf = reg->btf; + reg_ref_id = reg->btf_id; + } else if (reg2btf_ids[reg->type]) { + reg_btf = btf_vmlinux; + reg_ref_id = *reg2btf_ids[reg->type]; + } else { + bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d is not a pointer to btf_id\n", + func_name, i, + btf_type_str(ref_t), ref_tname, regno); + return -EINVAL; + } + + reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, + ®_ref_id); + reg_ref_tname = btf_name_by_offset(reg_btf, + reg_ref_t->name_off); + if (!btf_struct_ids_match(log, reg_btf, reg_ref_id, + reg->off, btf, ref_id)) { + bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n", + func_name, i, + btf_type_str(ref_t), ref_tname, + regno, btf_type_str(reg_ref_t), + reg_ref_tname); + return -EINVAL; + } + } else if (btf_get_prog_ctx_type(log, btf, t, + env->prog->type, i)) { /* If function expects ctx type in BTF check that caller * is passing PTR_TO_CTX. */ @@ -5428,6 +5476,13 @@ int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog, return err; }
+int btf_check_kfunc_arg_match(struct bpf_verifier_env *env, + const struct btf *btf, u32 func_id, + struct bpf_reg_state *regs) +{ + return btf_check_func_arg_match(env, btf, func_id, regs, false); +} + /* Convert BTF of a function into bpf_reg_state if possible * Returns: * EFAULT - there is a verifier bug. Abort verification. diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index f0e796c46b0e..fae5c9a3843f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -161,6 +161,9 @@ void bpf_prog_jit_attempt_done(struct bpf_prog *prog) kvfree(prog->aux->jited_linfo); prog->aux->jited_linfo = NULL; } + + kfree(prog->aux->kfunc_tab); + prog->aux->kfunc_tab = NULL; }
/* The jit engine is responsible to provide an array @@ -1898,9 +1901,15 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err) /* In case of BPF to BPF calls, verifier did all the prep * work with regards to JITing, etc. */ + bool jit_needed = false; + if (fp->bpf_func) goto finalize;
+ if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) || + bpf_prog_has_kfunc_call(fp)) + jit_needed = true; + bpf_prog_select_func(fp);
/* eBPF JITs can rewrite the program in case constant @@ -1916,12 +1925,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
fp = bpf_int_jit_compile(fp); bpf_prog_jit_attempt_done(fp); -#ifdef CONFIG_BPF_JIT_ALWAYS_ON - if (!fp->jited) { + if (!fp->jited && jit_needed) { *err = -ENOTSUPP; return fp; } -#endif } else { *err = bpf_prog_offload_compile(fp); if (*err) @@ -2402,6 +2409,11 @@ bool __weak bpf_jit_needs_zext(void) return false; }
+bool __weak bpf_jit_supports_kfunc_call(void) +{ + return false; +} + /* To execute LD_ABS/LD_IND instructions __bpf_prog_run() may call * skb_copy_bits(), so provide a weak definition of it for NET-less config. */ diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index 9d4ebd77705e..4e73d567c6cf 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -19,16 +19,23 @@ static const char *__func_get_name(const struct bpf_insn_cbs *cbs, { BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
- if (insn->src_reg != BPF_PSEUDO_CALL && + if (!insn->src_reg && insn->imm >= 0 && insn->imm < __BPF_FUNC_MAX_ID && func_id_str[insn->imm]) return func_id_str[insn->imm];
- if (cbs && cbs->cb_call) - return cbs->cb_call(cbs->private_data, insn); + if (cbs && cbs->cb_call) { + const char *res; + + res = cbs->cb_call(cbs->private_data, insn); + if (res) + return res; + }
if (insn->src_reg == BPF_PSEUDO_CALL) snprintf(buff, len, "%+d", insn->imm); + else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) + snprintf(buff, len, "kernel-function");
return buff; } diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 9cf791808a7f..fb72f2e887ad 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1760,6 +1760,7 @@ static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred) btf_put(prog->aux->btf); kvfree(prog->aux->jited_linfo); kvfree(prog->aux->linfo); + kfree(prog->aux->kfunc_tab); if (prog->aux->attach_btf) btf_put(prog->aux->attach_btf);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9e00ddff18af..915b18d62d1b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -235,6 +235,12 @@ static bool bpf_pseudo_call(const struct bpf_insn *insn) insn->src_reg == BPF_PSEUDO_CALL; }
+static bool bpf_pseudo_kfunc_call(const struct bpf_insn *insn) +{ + return insn->code == (BPF_JMP | BPF_CALL) && + insn->src_reg == BPF_PSEUDO_KFUNC_CALL; +} + static bool bpf_pseudo_func(const struct bpf_insn *insn) { return insn->code == (BPF_LD | BPF_IMM | BPF_DW) && @@ -1529,47 +1535,205 @@ static int add_subprog(struct bpf_verifier_env *env, int off) verbose(env, "too many subprograms\n"); return -E2BIG; } + /* determine subprog starts. The end is one before the next starts */ env->subprog_info[env->subprog_cnt++].start = off; sort(env->subprog_info, env->subprog_cnt, sizeof(env->subprog_info[0]), cmp_subprogs, NULL); return env->subprog_cnt - 1; }
-static int check_subprogs(struct bpf_verifier_env *env) +struct bpf_kfunc_desc { + struct btf_func_model func_model; + u32 func_id; + s32 imm; +}; + +#define MAX_KFUNC_DESCS 256 +struct bpf_kfunc_desc_tab { + struct bpf_kfunc_desc descs[MAX_KFUNC_DESCS]; + u32 nr_descs; +}; + +static int kfunc_desc_cmp_by_id(const void *a, const void *b) +{ + const struct bpf_kfunc_desc *d0 = a; + const struct bpf_kfunc_desc *d1 = b; + + /* func_id is not greater than BTF_MAX_TYPE */ + return d0->func_id - d1->func_id; +} + +static const struct bpf_kfunc_desc * +find_kfunc_desc(const struct bpf_prog *prog, u32 func_id) +{ + struct bpf_kfunc_desc desc = { + .func_id = func_id, + }; + struct bpf_kfunc_desc_tab *tab; + + tab = prog->aux->kfunc_tab; + return bsearch(&desc, tab->descs, tab->nr_descs, + sizeof(tab->descs[0]), kfunc_desc_cmp_by_id); +} + +static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id) +{ + const struct btf_type *func, *func_proto; + struct bpf_kfunc_desc_tab *tab; + struct bpf_prog_aux *prog_aux; + struct bpf_kfunc_desc *desc; + const char *func_name; + unsigned long addr; + int err; + + prog_aux = env->prog->aux; + tab = prog_aux->kfunc_tab; + if (!tab) { + if (!btf_vmlinux) { + verbose(env, "calling kernel function is not supported without CONFIG_DEBUG_INFO_BTF\n"); + return -ENOTSUPP; + } + + if (!env->prog->jit_requested) { + verbose(env, "JIT is required for calling kernel function\n"); + return -ENOTSUPP; + } + + if (!bpf_jit_supports_kfunc_call()) { + verbose(env, "JIT does not support calling kernel function\n"); + return -ENOTSUPP; + } + + if (!env->prog->gpl_compatible) { + verbose(env, "cannot call kernel function from non-GPL compatible program\n"); + return -EINVAL; + } + + tab = kzalloc(sizeof(*tab), GFP_KERNEL); + if (!tab) + return -ENOMEM; + prog_aux->kfunc_tab = tab; + } + + if (find_kfunc_desc(env->prog, func_id)) + return 0; + + if (tab->nr_descs == MAX_KFUNC_DESCS) { + verbose(env, "too many different kernel function calls\n"); + return -E2BIG; + } + + func = btf_type_by_id(btf_vmlinux, func_id); + if (!func || !btf_type_is_func(func)) { + verbose(env, "kernel btf_id %u is not a function\n", + func_id); + return -EINVAL; + } + func_proto = btf_type_by_id(btf_vmlinux, func->type); + if (!func_proto || !btf_type_is_func_proto(func_proto)) { + verbose(env, "kernel function btf_id %u does not have a valid func_proto\n", + func_id); + return -EINVAL; + } + + func_name = btf_name_by_offset(btf_vmlinux, func->name_off); + addr = kallsyms_lookup_name(func_name); + if (!addr) { + verbose(env, "cannot find address for kernel function %s\n", + func_name); + return -EINVAL; + } + + desc = &tab->descs[tab->nr_descs++]; + desc->func_id = func_id; + desc->imm = BPF_CAST_CALL(addr) - __bpf_call_base; + err = btf_distill_func_proto(&env->log, btf_vmlinux, + func_proto, func_name, + &desc->func_model); + if (!err) + sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]), + kfunc_desc_cmp_by_id, NULL); + return err; +} + +static int kfunc_desc_cmp_by_imm(const void *a, const void *b) +{ + const struct bpf_kfunc_desc *d0 = a; + const struct bpf_kfunc_desc *d1 = b; + + if (d0->imm > d1->imm) + return 1; + else if (d0->imm < d1->imm) + return -1; + return 0; +} + +static void sort_kfunc_descs_by_imm(struct bpf_prog *prog) +{ + struct bpf_kfunc_desc_tab *tab; + + tab = prog->aux->kfunc_tab; + if (!tab) + return; + + sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]), + kfunc_desc_cmp_by_imm, NULL); +} + +bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog) +{ + return !!prog->aux->kfunc_tab; +} + +const struct btf_func_model * +bpf_jit_find_kfunc_model(const struct bpf_prog *prog, + const struct bpf_insn *insn) +{ + const struct bpf_kfunc_desc desc = { + .imm = insn->imm, + }; + const struct bpf_kfunc_desc *res; + struct bpf_kfunc_desc_tab *tab; + + tab = prog->aux->kfunc_tab; + res = bsearch(&desc, tab->descs, tab->nr_descs, + sizeof(tab->descs[0]), kfunc_desc_cmp_by_imm); + + return res ? &res->func_model : NULL; +} + +static int add_subprog_and_kfunc(struct bpf_verifier_env *env) { - int i, ret, subprog_start, subprog_end, off, cur_subprog = 0; struct bpf_subprog_info *subprog = env->subprog_info; struct bpf_insn *insn = env->prog->insnsi; - int insn_cnt = env->prog->len; + int i, ret, insn_cnt = env->prog->len;
/* Add entry function. */ ret = add_subprog(env, 0); - if (ret < 0) + if (ret) return ret;
- /* determine subprog starts. The end is one before the next starts */ - for (i = 0; i < insn_cnt; i++) { - if (bpf_pseudo_func(insn + i)) { - if (!env->bpf_capable) { - verbose(env, - "function pointers are allowed for CAP_BPF and CAP_SYS_ADMIN\n"); - return -EPERM; - } - ret = add_subprog(env, i + insn[i].imm + 1); - if (ret < 0) - return ret; - /* remember subprog */ - insn[i + 1].imm = ret; - continue; - } - if (!bpf_pseudo_call(insn + i)) + for (i = 0; i < insn_cnt; i++, insn++) { + if (!bpf_pseudo_func(insn) && !bpf_pseudo_call(insn) && + !bpf_pseudo_kfunc_call(insn)) continue; + if (!env->bpf_capable) { - verbose(env, - "function calls to other bpf functions are allowed for CAP_BPF and CAP_SYS_ADMIN\n"); + verbose(env, "loading/calling other bpf or kernel functions are allowed for CAP_BPF and CAP_SYS_ADMIN\n"); return -EPERM; } - ret = add_subprog(env, i + insn[i].imm + 1); + + if (bpf_pseudo_func(insn)) { + ret = add_subprog(env, i + insn->imm + 1); + if (ret >= 0) + /* remember subprog */ + insn[1].imm = ret; + } else if (bpf_pseudo_call(insn)) { + ret = add_subprog(env, i + insn->imm + 1); + } else { + ret = add_kfunc_call(env, insn->imm); + } + if (ret < 0) return ret; } @@ -1583,6 +1747,16 @@ static int check_subprogs(struct bpf_verifier_env *env) for (i = 0; i < env->subprog_cnt; i++) verbose(env, "func#%d @%d\n", i, subprog[i].start);
+ return 0; +} + +static int check_subprogs(struct bpf_verifier_env *env) +{ + int i, subprog_start, subprog_end, off, cur_subprog = 0; + struct bpf_subprog_info *subprog = env->subprog_info; + struct bpf_insn *insn = env->prog->insnsi; + int insn_cnt = env->prog->len; + /* now check that all jumps are within the same subprog */ subprog_start = subprog[cur_subprog].start; subprog_end = subprog[cur_subprog + 1].start; @@ -1871,6 +2045,17 @@ static int get_prev_insn_idx(struct bpf_verifier_state *st, int i, return i; }
+static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn) +{ + const struct btf_type *func; + + if (insn->src_reg != BPF_PSEUDO_KFUNC_CALL) + return NULL; + + func = btf_type_by_id(btf_vmlinux, insn->imm); + return btf_name_by_offset(btf_vmlinux, func->name_off); +} + /* For given verifier state backtrack_insn() is called from the last insn to * the first insn. Its purpose is to compute a bitmask of registers and * stack slots that needs precision in the parent verifier state. @@ -1879,6 +2064,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, u32 *reg_mask, u64 *stack_mask) { const struct bpf_insn_cbs cbs = { + .cb_call = disasm_kfunc_name, .cb_print = verbose, .private_data = env, }; @@ -5890,6 +6076,98 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn return 0; }
+/* mark_btf_func_reg_size() is used when the reg size is determined by + * the BTF func_proto's return value size and argument. + */ +static void mark_btf_func_reg_size(struct bpf_verifier_env *env, u32 regno, + size_t reg_size) +{ + struct bpf_reg_state *reg = &cur_regs(env)[regno]; + + if (regno == BPF_REG_0) { + /* Function return value */ + reg->live |= REG_LIVE_WRITTEN; + reg->subreg_def = reg_size == sizeof(u64) ? + DEF_NOT_SUBREG : env->insn_idx + 1; + } else { + /* Function argument */ + if (reg_size == sizeof(u64)) { + mark_insn_zext(env, reg); + mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64); + } else { + mark_reg_read(env, reg, reg->parent, REG_LIVE_READ32); + } + } +} + +static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn) +{ + const struct btf_type *t, *func, *func_proto, *ptr_type; + struct bpf_reg_state *regs = cur_regs(env); + const char *func_name, *ptr_type_name; + u32 i, nargs, func_id, ptr_type_id; + const struct btf_param *args; + int err; + + func_id = insn->imm; + func = btf_type_by_id(btf_vmlinux, func_id); + func_name = btf_name_by_offset(btf_vmlinux, func->name_off); + func_proto = btf_type_by_id(btf_vmlinux, func->type); + + if (!env->ops->check_kfunc_call || + !env->ops->check_kfunc_call(func_id)) { + verbose(env, "calling kernel function %s is not allowed\n", + func_name); + return -EACCES; + } + + /* Check the arguments */ + err = btf_check_kfunc_arg_match(env, btf_vmlinux, func_id, regs); + if (err) + return err; + + for (i = 0; i < CALLER_SAVED_REGS; i++) + mark_reg_not_init(env, regs, caller_saved[i]); + + /* Check return type */ + t = btf_type_skip_modifiers(btf_vmlinux, func_proto->type, NULL); + if (btf_type_is_scalar(t)) { + mark_reg_unknown(env, regs, BPF_REG_0); + mark_btf_func_reg_size(env, BPF_REG_0, t->size); + } else if (btf_type_is_ptr(t)) { + ptr_type = btf_type_skip_modifiers(btf_vmlinux, t->type, + &ptr_type_id); + if (!btf_type_is_struct(ptr_type)) { + ptr_type_name = btf_name_by_offset(btf_vmlinux, + ptr_type->name_off); + verbose(env, "kernel function %s returns pointer type %s %s is not supported\n", + func_name, btf_type_str(ptr_type), + ptr_type_name); + return -EINVAL; + } + mark_reg_known_zero(env, regs, BPF_REG_0); + regs[BPF_REG_0].btf = btf_vmlinux; + regs[BPF_REG_0].type = PTR_TO_BTF_ID; + regs[BPF_REG_0].btf_id = ptr_type_id; + mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *)); + } /* else { add_kfunc_call() ensures it is btf_type_is_void(t) } */ + + nargs = btf_type_vlen(func_proto); + args = (const struct btf_param *)(func_proto + 1); + for (i = 0; i < nargs; i++) { + u32 regno = i + 1; + + t = btf_type_skip_modifiers(btf_vmlinux, args[i].type, NULL); + if (btf_type_is_ptr(t)) + mark_btf_func_reg_size(env, regno, sizeof(void *)); + else + /* scalar. ensured by btf_check_kfunc_arg_match() */ + mark_btf_func_reg_size(env, regno, t->size); + } + + return 0; +} + static bool signed_add_overflows(s64 a, s64 b) { /* Do the add in u64, where overflow is well-defined */ @@ -10168,6 +10446,7 @@ static int do_check(struct bpf_verifier_env *env)
if (env->log.level & BPF_LOG_LEVEL) { const struct bpf_insn_cbs cbs = { + .cb_call = disasm_kfunc_name, .cb_print = verbose, .private_data = env, }; @@ -10315,7 +10594,8 @@ static int do_check(struct bpf_verifier_env *env) if (BPF_SRC(insn->code) != BPF_K || insn->off != 0 || (insn->src_reg != BPF_REG_0 && - insn->src_reg != BPF_PSEUDO_CALL) || + insn->src_reg != BPF_PSEUDO_CALL && + insn->src_reg != BPF_PSEUDO_KFUNC_CALL) || insn->dst_reg != BPF_REG_0 || class == BPF_JMP32) { verbose(env, "BPF_CALL uses reserved fields\n"); @@ -10330,6 +10610,8 @@ static int do_check(struct bpf_verifier_env *env) } if (insn->src_reg == BPF_PSEUDO_CALL) err = check_func_call(env, insn, &env->insn_idx); + else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) + err = check_kfunc_call(env, insn); else err = check_helper_call(env, insn, &env->insn_idx); if (err) @@ -11612,6 +11894,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) func[i]->aux->name[0] = 'F'; func[i]->aux->stack_depth = env->subprog_info[i].stack_depth; func[i]->jit_requested = 1; + func[i]->aux->kfunc_tab = prog->aux->kfunc_tab; func[i]->aux->linfo = prog->aux->linfo; func[i]->aux->nr_linfo = prog->aux->nr_linfo; func[i]->aux->jited_linfo = prog->aux->jited_linfo; @@ -11747,6 +12030,7 @@ static int fixup_call_args(struct bpf_verifier_env *env) #ifndef CONFIG_BPF_JIT_ALWAYS_ON struct bpf_prog *prog = env->prog; struct bpf_insn *insn = prog->insnsi; + bool has_kfunc_call = bpf_prog_has_kfunc_call(prog); int i, depth; #endif int err = 0; @@ -11760,6 +12044,10 @@ static int fixup_call_args(struct bpf_verifier_env *env) return err; } #ifndef CONFIG_BPF_JIT_ALWAYS_ON + if (has_kfunc_call) { + verbose(env, "calling kernel functions are not allowed in non-JITed programs\n"); + return -EINVAL; + } if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) { /* When JIT fails the progs with bpf2bpf calls and tail_calls * have to be rejected, since interpreter doesn't support them yet. @@ -11788,6 +12076,26 @@ static int fixup_call_args(struct bpf_verifier_env *env) return err; }
+static int fixup_kfunc_call(struct bpf_verifier_env *env, + struct bpf_insn *insn) +{ + const struct bpf_kfunc_desc *desc; + + /* insn->imm has the btf func_id. Replace it with + * an address (relative to __bpf_base_call). + */ + desc = find_kfunc_desc(env->prog, insn->imm); + if (!desc) { + verbose(env, "verifier internal error: kernel function descriptor not found for func_id %u\n", + insn->imm); + return -EFAULT; + } + + insn->imm = desc->imm; + + return 0; +} + /* Do various post-verification rewrites in a single program pass. * These rewrites simplify JIT and interpreter implementations. */ @@ -11925,6 +12233,12 @@ static int do_misc_fixups(struct bpf_verifier_env *env) continue; if (insn->src_reg == BPF_PSEUDO_CALL) continue; + if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { + ret = fixup_kfunc_call(env, insn); + if (ret) + return ret; + continue; + }
if (insn->imm == BPF_FUNC_get_route_realm) prog->dst_needed = 1; @@ -12146,6 +12460,8 @@ static int do_misc_fixups(struct bpf_verifier_env *env) } }
+ sort_kfunc_descs_by_imm(env->prog); + return 0; }
@@ -12841,6 +13157,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, if (!env->explored_states) goto skip_full_check;
+ ret = add_subprog_and_kfunc(env); + if (ret < 0) + goto skip_full_check; + ret = check_subprogs(env); if (ret < 0) goto skip_full_check; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 35ada5baa156..098dffc26f2e 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1118,6 +1118,10 @@ enum bpf_link_type { * offset to another bpf function */ #define BPF_PSEUDO_CALL 1 +/* when bpf_call->src_reg == BPF_PSEUDO_KFUNC_CALL, + * bpf_call->imm == btf_id of a BTF_KIND_FUNC in the running kernel + */ +#define BPF_PSEUDO_KFUNC_CALL 2
/* flags for BPF_MAP_UPDATE_ELEM command */ enum { diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c index eb888c8479c3..336a749673d1 100644 --- a/tools/testing/selftests/bpf/verifier/calls.c +++ b/tools/testing/selftests/bpf/verifier/calls.c @@ -19,7 +19,7 @@ BPF_MOV64_IMM(BPF_REG_0, 2), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, .retval = 1, @@ -136,7 +136,7 @@ { "calls: wrong src reg", .insns = { - BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 2, 0, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 3, 0, 0), BPF_MOV64_IMM(BPF_REG_0, 1), BPF_EXIT_INSN(), }, @@ -397,7 +397,7 @@ BPF_MOV64_IMM(BPF_REG_0, 1), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .fixup_map_hash_48b = { 3 }, .result_unpriv = REJECT, .result = ACCEPT, @@ -1977,7 +1977,7 @@ BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, }, @@ -2003,7 +2003,7 @@ BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .errstr = "!read_ok", .result = REJECT, }, @@ -2028,7 +2028,7 @@ BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .errstr = "!read_ok", .result = REJECT, }, diff --git a/tools/testing/selftests/bpf/verifier/dead_code.c b/tools/testing/selftests/bpf/verifier/dead_code.c index 721ec9391be5..2c8935b3e65d 100644 --- a/tools/testing/selftests/bpf/verifier/dead_code.c +++ b/tools/testing/selftests/bpf/verifier/dead_code.c @@ -87,7 +87,7 @@ BPF_MOV64_IMM(BPF_REG_0, 12), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, .retval = 7, @@ -105,7 +105,7 @@ BPF_MOV64_IMM(BPF_REG_0, 12), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, .retval = 7, @@ -123,7 +123,7 @@ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, -5), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, .retval = 7, @@ -139,7 +139,7 @@ BPF_MOV64_REG(BPF_REG_0, BPF_REG_1), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, @@ -154,7 +154,7 @@ BPF_MOV64_REG(BPF_REG_0, BPF_REG_1), BPF_EXIT_INSN(), }, - .errstr_unpriv = "function calls to other bpf functions are allowed for", + .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for", .result_unpriv = REJECT, .result = ACCEPT, .retval = 2,
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit 797b84f727bce9c64ea2c85899c1ba283df54c16 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch adds kernel function call support to the x86-32 bpf jit.
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210325015149.1545267-1-kafai@fb.com (cherry picked from commit 797b84f727bce9c64ea2c85899c1ba283df54c16) Signed-off-by: Wang Yufen wangyufen@huawei.com --- arch/x86/net/bpf_jit_comp32.c | 198 ++++++++++++++++++++++++++++++++++ 1 file changed, 198 insertions(+)
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c index e6108287eef1..3bfda5f502cb 100644 --- a/arch/x86/net/bpf_jit_comp32.c +++ b/arch/x86/net/bpf_jit_comp32.c @@ -1390,6 +1390,19 @@ static inline void emit_push_r64(const u8 src[], u8 **pprog) *pprog = prog; }
+static void emit_push_r32(const u8 src[], u8 **pprog) +{ + u8 *prog = *pprog; + int cnt = 0; + + /* mov ecx,dword ptr [ebp+off] */ + EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_lo)); + /* push ecx */ + EMIT1(0x51); + + *pprog = prog; +} + static u8 get_cond_jmp_opcode(const u8 op, bool is_cmp_lo) { u8 jmp_cond; @@ -1459,6 +1472,174 @@ static u8 get_cond_jmp_opcode(const u8 op, bool is_cmp_lo) return jmp_cond; }
+/* i386 kernel compiles with "-mregparm=3". From gcc document: + * + * ==== snippet ==== + * regparm (number) + * On x86-32 targets, the regparm attribute causes the compiler + * to pass arguments number one to (number) if they are of integral + * type in registers EAX, EDX, and ECX instead of on the stack. + * Functions that take a variable number of arguments continue + * to be passed all of their arguments on the stack. + * ==== snippet ==== + * + * The first three args of a function will be considered for + * putting into the 32bit register EAX, EDX, and ECX. + * + * Two 32bit registers are used to pass a 64bit arg. + * + * For example, + * void foo(u32 a, u32 b, u32 c, u32 d): + * u32 a: EAX + * u32 b: EDX + * u32 c: ECX + * u32 d: stack + * + * void foo(u64 a, u32 b, u32 c): + * u64 a: EAX (lo32) EDX (hi32) + * u32 b: ECX + * u32 c: stack + * + * void foo(u32 a, u64 b, u32 c): + * u32 a: EAX + * u64 b: EDX (lo32) ECX (hi32) + * u32 c: stack + * + * void foo(u32 a, u32 b, u64 c): + * u32 a: EAX + * u32 b: EDX + * u64 c: stack + * + * The return value will be stored in the EAX (and EDX for 64bit value). + * + * For example, + * u32 foo(u32 a, u32 b, u32 c): + * return value: EAX + * + * u64 foo(u32 a, u32 b, u32 c): + * return value: EAX (lo32) EDX (hi32) + * + * Notes: + * The verifier only accepts function having integer and pointers + * as its args and return value, so it does not have + * struct-by-value. + * + * emit_kfunc_call() finds out the btf_func_model by calling + * bpf_jit_find_kfunc_model(). A btf_func_model + * has the details about the number of args, size of each arg, + * and the size of the return value. + * + * It first decides how many args can be passed by EAX, EDX, and ECX. + * That will decide what args should be pushed to the stack: + * [first_stack_regno, last_stack_regno] are the bpf regnos + * that should be pushed to the stack. + * + * It will first push all args to the stack because the push + * will need to use ECX. Then, it moves + * [BPF_REG_1, first_stack_regno) to EAX, EDX, and ECX. + * + * When emitting a call (0xE8), it needs to figure out + * the jmp_offset relative to the jit-insn address immediately + * following the call (0xE8) instruction. At this point, it knows + * the end of the jit-insn address after completely translated the + * current (BPF_JMP | BPF_CALL) bpf-insn. It is passed as "end_addr" + * to the emit_kfunc_call(). Thus, it can learn the "immediate-follow-call" + * address by figuring out how many jit-insn is generated between + * the call (0xE8) and the end_addr: + * - 0-1 jit-insn (3 bytes each) to restore the esp pointer if there + * is arg pushed to the stack. + * - 0-2 jit-insns (3 bytes each) to handle the return value. + */ +static int emit_kfunc_call(const struct bpf_prog *bpf_prog, u8 *end_addr, + const struct bpf_insn *insn, u8 **pprog) +{ + const u8 arg_regs[] = { IA32_EAX, IA32_EDX, IA32_ECX }; + int i, cnt = 0, first_stack_regno, last_stack_regno; + int free_arg_regs = ARRAY_SIZE(arg_regs); + const struct btf_func_model *fm; + int bytes_in_stack = 0; + const u8 *cur_arg_reg; + u8 *prog = *pprog; + s64 jmp_offset; + + fm = bpf_jit_find_kfunc_model(bpf_prog, insn); + if (!fm) + return -EINVAL; + + first_stack_regno = BPF_REG_1; + for (i = 0; i < fm->nr_args; i++) { + int regs_needed = fm->arg_size[i] > sizeof(u32) ? 2 : 1; + + if (regs_needed > free_arg_regs) + break; + + free_arg_regs -= regs_needed; + first_stack_regno++; + } + + /* Push the args to the stack */ + last_stack_regno = BPF_REG_0 + fm->nr_args; + for (i = last_stack_regno; i >= first_stack_regno; i--) { + if (fm->arg_size[i - 1] > sizeof(u32)) { + emit_push_r64(bpf2ia32[i], &prog); + bytes_in_stack += 8; + } else { + emit_push_r32(bpf2ia32[i], &prog); + bytes_in_stack += 4; + } + } + + cur_arg_reg = &arg_regs[0]; + for (i = BPF_REG_1; i < first_stack_regno; i++) { + /* mov e[adc]x,dword ptr [ebp+off] */ + EMIT3(0x8B, add_2reg(0x40, IA32_EBP, *cur_arg_reg++), + STACK_VAR(bpf2ia32[i][0])); + if (fm->arg_size[i - 1] > sizeof(u32)) + /* mov e[adc]x,dword ptr [ebp+off] */ + EMIT3(0x8B, add_2reg(0x40, IA32_EBP, *cur_arg_reg++), + STACK_VAR(bpf2ia32[i][1])); + } + + if (bytes_in_stack) + /* add esp,"bytes_in_stack" */ + end_addr -= 3; + + /* mov dword ptr [ebp+off],edx */ + if (fm->ret_size > sizeof(u32)) + end_addr -= 3; + + /* mov dword ptr [ebp+off],eax */ + if (fm->ret_size) + end_addr -= 3; + + jmp_offset = (u8 *)__bpf_call_base + insn->imm - end_addr; + if (!is_simm32(jmp_offset)) { + pr_err("unsupported BPF kernel function jmp_offset:%lld\n", + jmp_offset); + return -EINVAL; + } + + EMIT1_off32(0xE8, jmp_offset); + + if (fm->ret_size) + /* mov dword ptr [ebp+off],eax */ + EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX), + STACK_VAR(bpf2ia32[BPF_REG_0][0])); + + if (fm->ret_size > sizeof(u32)) + /* mov dword ptr [ebp+off],edx */ + EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX), + STACK_VAR(bpf2ia32[BPF_REG_0][1])); + + if (bytes_in_stack) + /* add esp,"bytes_in_stack" */ + EMIT3(0x83, add_1reg(0xC0, IA32_ESP), bytes_in_stack); + + *pprog = prog; + + return 0; +} + static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, int oldproglen, struct jit_context *ctx) { @@ -1894,6 +2075,18 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, if (insn->src_reg == BPF_PSEUDO_CALL) goto notyet;
+ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { + int err; + + err = emit_kfunc_call(bpf_prog, + image + addrs[i], + insn, &prog); + + if (err) + return err; + break; + } + func = (u8 *) __bpf_call_base + imm32; jmp_offset = func - (image + addrs[i]);
@@ -2408,3 +2601,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) tmp : orig_prog); return prog; } + +bool bpf_jit_supports_kfunc_call(void) +{ + return true; +}
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit 933d1aa32409ef4209c8065ee4ede68236659cd2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch refactors most of the logic from bpf_object__resolve_ksyms_btf_id() into a new function bpf_object__resolve_ksym_var_btf_id(). It is to get ready for a later patch adding bpf_object__resolve_ksym_func_btf_id() which resolves a kernel function to the running kernel btf_id.
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com (cherry picked from commit 933d1aa32409ef4209c8065ee4ede68236659cd2) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 124 ++++++++++++++++++++++------------------- 1 file changed, 67 insertions(+), 57 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0c1c6de1e0b3..9a0dec536abf 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7452,75 +7452,85 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj) return err; }
-static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) +static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj, + struct extern_desc *ext) { - struct extern_desc *ext; + const struct btf_type *targ_var, *targ_type; + __u32 targ_type_id, local_type_id; + const char *targ_var_name; + int i, id, btf_fd, err; struct btf *btf; - int i, j, id, btf_fd, err;
- for (i = 0; i < obj->nr_extern; i++) { - const struct btf_type *targ_var, *targ_type; - __u32 targ_type_id, local_type_id; - const char *targ_var_name; - int ret; + btf = obj->btf_vmlinux; + btf_fd = 0; + id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); + if (id == -ENOENT) { + err = load_module_btfs(obj); + if (err) + return err;
- ext = &obj->externs[i]; - if (ext->type != EXT_KSYM || !ext->ksym.type_id) - continue; + for (i = 0; i < obj->btf_module_cnt; i++) { + btf = obj->btf_modules[i].btf; + /* we assume module BTF FD is always >0 */ + btf_fd = obj->btf_modules[i].fd; + id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); + if (id != -ENOENT) + break; + } + } + if (id <= 0) { + pr_warn("extern (var ksym) '%s': failed to find BTF ID in kernel BTF(s).\n", + ext->name); + return -ESRCH; + }
- btf = obj->btf_vmlinux; - btf_fd = 0; - id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); - if (id == -ENOENT) { - err = load_module_btfs(obj); - if (err) - return err; + /* find local type_id */ + local_type_id = ext->ksym.type_id;
- for (j = 0; j < obj->btf_module_cnt; j++) { - btf = obj->btf_modules[j].btf; - /* we assume module BTF FD is always >0 */ - btf_fd = obj->btf_modules[j].fd; - id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); - if (id != -ENOENT) - break; - } - } - if (id <= 0) { - pr_warn("extern (ksym) '%s': failed to find BTF ID in kernel BTF(s).\n", - ext->name); - return -ESRCH; - } + /* find target type_id */ + targ_var = btf__type_by_id(btf, id); + targ_var_name = btf__name_by_offset(btf, targ_var->name_off); + targ_type = skip_mods_and_typedefs(btf, targ_var->type, &targ_type_id);
- /* find local type_id */ - local_type_id = ext->ksym.type_id; + err = bpf_core_types_are_compat(obj->btf, local_type_id, + btf, targ_type_id); + if (err <= 0) { + const struct btf_type *local_type; + const char *targ_name, *local_name;
- /* find target type_id */ - targ_var = btf__type_by_id(btf, id); - targ_var_name = btf__name_by_offset(btf, targ_var->name_off); - targ_type = skip_mods_and_typedefs(btf, targ_var->type, &targ_type_id); + local_type = btf__type_by_id(obj->btf, local_type_id); + local_name = btf__name_by_offset(obj->btf, local_type->name_off); + targ_name = btf__name_by_offset(btf, targ_type->name_off);
- ret = bpf_core_types_are_compat(obj->btf, local_type_id, - btf, targ_type_id); - if (ret <= 0) { - const struct btf_type *local_type; - const char *targ_name, *local_name; + pr_warn("extern (var ksym) '%s': incompatible types, expected [%d] %s %s, but kernel has [%d] %s %s\n", + ext->name, local_type_id, + btf_kind_str(local_type), local_name, targ_type_id, + btf_kind_str(targ_type), targ_name); + return -EINVAL; + }
- local_type = btf__type_by_id(obj->btf, local_type_id); - local_name = btf__name_by_offset(obj->btf, local_type->name_off); - targ_name = btf__name_by_offset(btf, targ_type->name_off); + ext->is_set = true; + ext->ksym.kernel_btf_obj_fd = btf_fd; + ext->ksym.kernel_btf_id = id; + pr_debug("extern (var ksym) '%s': resolved to [%d] %s %s\n", + ext->name, id, btf_kind_str(targ_var), targ_var_name);
- pr_warn("extern (ksym) '%s': incompatible types, expected [%d] %s %s, but kernel has [%d] %s %s\n", - ext->name, local_type_id, - btf_kind_str(local_type), local_name, targ_type_id, - btf_kind_str(targ_type), targ_name); - return -EINVAL; - } + return 0; +} + +static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) +{ + struct extern_desc *ext; + int i, err; + + for (i = 0; i < obj->nr_extern; i++) { + ext = &obj->externs[i]; + if (ext->type != EXT_KSYM || !ext->ksym.type_id) + continue;
- ext->is_set = true; - ext->ksym.kernel_btf_obj_fd = btf_fd; - ext->ksym.kernel_btf_id = id; - pr_debug("extern (ksym) '%s': resolved to [%d] %s %s\n", - ext->name, id, btf_kind_str(targ_var), targ_var_name); + err = bpf_object__resolve_ksym_var_btf_id(obj, ext); + if (err) + return err; } return 0; }
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit 774e132e83d0f10a7ebbfe7db1debdaed6013f83 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch refactors code, that finds kernel btf_id by kind and symbol name, to a new function find_ksym_btf_id().
It also adds a new helper __btf_kind_str() to return a string by the numeric kind value.
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com (cherry picked from commit 774e132e83d0f10a7ebbfe7db1debdaed6013f83) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 44 +++++++++++++++++++++++++++++++----------- 1 file changed, 33 insertions(+), 11 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 9a0dec536abf..efec4119a71f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1921,9 +1921,9 @@ resolve_func_ptr(const struct btf *btf, __u32 id, __u32 *res_id) return btf_is_func_proto(t) ? t : NULL; }
-static const char *btf_kind_str(const struct btf_type *t) +static const char *__btf_kind_str(__u16 kind) { - switch (btf_kind(t)) { + switch (kind) { case BTF_KIND_UNKN: return "void"; case BTF_KIND_INT: return "int"; case BTF_KIND_PTR: return "ptr"; @@ -1945,6 +1945,11 @@ static const char *btf_kind_str(const struct btf_type *t) } }
+static const char *btf_kind_str(const struct btf_type *t) +{ + return __btf_kind_str(btf_kind(t)); +} + /* * Fetch integer attribute of BTF map definition. Such attributes are * represented using a pointer to an array, in which dimensionality of array @@ -7452,18 +7457,17 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj) return err; }
-static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj, - struct extern_desc *ext) +static int find_ksym_btf_id(struct bpf_object *obj, const char *ksym_name, + __u16 kind, struct btf **res_btf, + int *res_btf_fd) { - const struct btf_type *targ_var, *targ_type; - __u32 targ_type_id, local_type_id; - const char *targ_var_name; int i, id, btf_fd, err; struct btf *btf;
btf = obj->btf_vmlinux; btf_fd = 0; - id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); + id = btf__find_by_name_kind(btf, ksym_name, kind); + if (id == -ENOENT) { err = load_module_btfs(obj); if (err) @@ -7473,17 +7477,35 @@ static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj, btf = obj->btf_modules[i].btf; /* we assume module BTF FD is always >0 */ btf_fd = obj->btf_modules[i].fd; - id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR); + id = btf__find_by_name_kind(btf, ksym_name, kind); if (id != -ENOENT) break; } } if (id <= 0) { - pr_warn("extern (var ksym) '%s': failed to find BTF ID in kernel BTF(s).\n", - ext->name); + pr_warn("extern (%s ksym) '%s': failed to find BTF ID in kernel BTF(s).\n", + __btf_kind_str(kind), ksym_name); return -ESRCH; }
+ *res_btf = btf; + *res_btf_fd = btf_fd; + return id; +} + +static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj, + struct extern_desc *ext) +{ + const struct btf_type *targ_var, *targ_type; + __u32 targ_type_id, local_type_id; + const char *targ_var_name; + int id, btf_fd = 0, err; + struct btf *btf = NULL; + + id = find_ksym_btf_id(obj, ext->name, BTF_KIND_VAR, &btf, &btf_fd); + if (id < 0) + return id; + /* find local type_id */ local_type_id = ext->ksym.type_id;
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit 0c091e5c2d37696589a3e0131a809b5499899995 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch renames RELO_EXTERN to RELO_EXTERN_VAR. It is to avoid the confusion with a later patch adding RELO_EXTERN_FUNC.
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com (cherry picked from commit 0c091e5c2d37696589a3e0131a809b5499899995) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index efec4119a71f..43408a6130a9 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -185,7 +185,7 @@ enum reloc_type { RELO_LD64, RELO_CALL, RELO_DATA, - RELO_EXTERN, + RELO_EXTERN_VAR, RELO_SUBPROG_ADDR, };
@@ -3456,7 +3456,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog, } pr_debug("prog '%s': found extern #%d '%s' (sym %d) for insn #%u\n", prog->name, i, ext->name, ext->sym_idx, insn_idx); - reloc_desc->type = RELO_EXTERN; + reloc_desc->type = RELO_EXTERN_VAR; reloc_desc->insn_idx = insn_idx; reloc_desc->sym_off = i; /* sym_off stores extern index */ return 0; @@ -6272,7 +6272,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) insn[0].imm = obj->maps[relo->map_idx].fd; relo->processed = true; break; - case RELO_EXTERN: + case RELO_EXTERN_VAR: ext = &obj->externs[relo->sym_off]; if (ext->type == EXT_KCFG) { insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit aa0b8d43e9537d371cbd3f272d3403f2b15201af category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch records the extern sym relocs first before recording subprog relocs. The later patch will have relocs for extern kernel function call which is also using BPF_JMP | BPF_CALL. It will be easier to handle the extern symbols first in the later patch.
is_call_insn() helper is added. The existing is_ldimm64() helper is renamed to is_ldimm64_insn() for consistency.
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com (cherry picked from commit aa0b8d43e9537d371cbd3f272d3403f2b15201af) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 63 +++++++++++++++++++++++------------------- 1 file changed, 34 insertions(+), 29 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 43408a6130a9..a2400786725f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -573,14 +573,19 @@ static bool insn_is_subprog_call(const struct bpf_insn *insn) insn->off == 0; }
-static bool is_ldimm64(struct bpf_insn *insn) +static bool is_ldimm64_insn(struct bpf_insn *insn) { return insn->code == (BPF_LD | BPF_IMM | BPF_DW); }
+static bool is_call_insn(const struct bpf_insn *insn) +{ + return insn->code == (BPF_JMP | BPF_CALL); +} + static bool insn_is_pseudo_func(struct bpf_insn *insn) { - return is_ldimm64(insn) && insn->src_reg == BPF_PSEUDO_FUNC; + return is_ldimm64_insn(insn) && insn->src_reg == BPF_PSEUDO_FUNC; }
static int @@ -3409,31 +3414,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
reloc_desc->processed = false;
- /* sub-program call relocation */ - if (insn->code == (BPF_JMP | BPF_CALL)) { - if (insn->src_reg != BPF_PSEUDO_CALL) { - pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name); - return -LIBBPF_ERRNO__RELOC; - } - /* text_shndx can be 0, if no default "main" program exists */ - if (!shdr_idx || shdr_idx != obj->efile.text_shndx) { - sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx)); - pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n", - prog->name, sym_name, sym_sec_name); - return -LIBBPF_ERRNO__RELOC; - } - if (sym->st_value % BPF_INSN_SZ) { - pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n", - prog->name, sym_name, (size_t)sym->st_value); - return -LIBBPF_ERRNO__RELOC; - } - reloc_desc->type = RELO_CALL; - reloc_desc->insn_idx = insn_idx; - reloc_desc->sym_off = sym->st_value; - return 0; - } - - if (!is_ldimm64(insn)) { + if (!is_call_insn(insn) && !is_ldimm64_insn(insn)) { pr_warn("prog '%s': invalid relo against '%s' for insns[%d].code 0x%x\n", prog->name, sym_name, insn_idx, insn->code); return -LIBBPF_ERRNO__RELOC; @@ -3462,6 +3443,30 @@ static int bpf_program__record_reloc(struct bpf_program *prog, return 0; }
+ /* sub-program call relocation */ + if (is_call_insn(insn)) { + if (insn->src_reg != BPF_PSEUDO_CALL) { + pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name); + return -LIBBPF_ERRNO__RELOC; + } + /* text_shndx can be 0, if no default "main" program exists */ + if (!shdr_idx || shdr_idx != obj->efile.text_shndx) { + sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx)); + pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n", + prog->name, sym_name, sym_sec_name); + return -LIBBPF_ERRNO__RELOC; + } + if (sym->st_value % BPF_INSN_SZ) { + pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n", + prog->name, sym_name, (size_t)sym->st_value); + return -LIBBPF_ERRNO__RELOC; + } + reloc_desc->type = RELO_CALL; + reloc_desc->insn_idx = insn_idx; + reloc_desc->sym_off = sym->st_value; + return 0; + } + if (!shdr_idx || shdr_idx >= SHN_LORESERVE) { pr_warn("prog '%s': invalid relo against '%s' in special section 0x%x; forgot to initialize global var?..\n", prog->name, sym_name, shdr_idx); @@ -5754,7 +5759,7 @@ static int bpf_core_patch_insn(struct bpf_program *prog, /* poison second part of ldimm64 to avoid confusing error from * verifier about "unknown opcode 00" */ - if (is_ldimm64(insn)) + if (is_ldimm64_insn(insn)) bpf_core_poison_insn(prog, relo_idx, insn_idx + 1, insn + 1); bpf_core_poison_insn(prog, relo_idx, insn_idx, insn); return 0; @@ -5830,7 +5835,7 @@ static int bpf_core_patch_insn(struct bpf_program *prog, case BPF_LD: { __u64 imm;
- if (!is_ldimm64(insn) || + if (!is_ldimm64_insn(insn) || insn[0].src_reg != 0 || insn[0].off != 0 || insn_idx + 1 >= prog->insns_cnt || insn[1].code != 0 || insn[1].dst_reg != 0 ||
From: Martin KaFai Lau kafai@fb.com
mainline inclusion from mainline-5.13-rc1 commit 5bd022ec01f060f30672cc6383b8b04e75a4310d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch is to make libbpf able to handle the following extern kernel function declaration and do the needed relocations before loading the bpf program to the kernel.
extern int foo(struct sock *) __attribute__((section(".ksyms")))
In the collect extern phase, needed changes is made to bpf_object__collect_externs() and find_extern_btf_id() to collect extern function in ".ksyms" section. The func in the BTF datasec also needs to be replaced by an int var. The idea is similar to the existing handling in extern var. In case the BTF may not have a var, a dummy ksym var is added at the beginning of bpf_object__collect_externs() if there is func under ksyms datasec. It will also change the func linkage from extern to global which the kernel can support. It also assigns a param name if it does not have one.
In the collect relo phase, it will record the kernel function call as RELO_EXTERN_FUNC.
bpf_object__resolve_ksym_func_btf_id() is added to find the func btf_id of the running kernel.
During actual relocation, it will patch the BPF_CALL instruction with src_reg = BPF_PSEUDO_FUNC_CALL and insn->imm set to the running kernel func's btf_id.
The required LLVM patch: https://reviews.llvm.org/D93563
Signed-off-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com (cherry picked from commit 5bd022ec01f060f30672cc6383b8b04e75a4310d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 174 ++++++++++++++++++++++++++++++++++++++--- 1 file changed, 162 insertions(+), 12 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a2400786725f..f662059b1ff1 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -186,6 +186,7 @@ enum reloc_type { RELO_CALL, RELO_DATA, RELO_EXTERN_VAR, + RELO_EXTERN_FUNC, RELO_SUBPROG_ADDR, };
@@ -1955,6 +1956,11 @@ static const char *btf_kind_str(const struct btf_type *t) return __btf_kind_str(btf_kind(t)); }
+static enum btf_func_linkage btf_func_linkage(const struct btf_type *t) +{ + return (enum btf_func_linkage)BTF_INFO_VLEN(t->info); +} + /* * Fetch integer attribute of BTF map definition. Such attributes are * represented using a pointer to an array, in which dimensionality of array @@ -3020,7 +3026,7 @@ static bool sym_is_subprog(const GElf_Sym *sym, int text_shndx) static int find_extern_btf_id(const struct btf *btf, const char *ext_name) { const struct btf_type *t; - const char *var_name; + const char *tname; int i, n;
if (!btf) @@ -3030,14 +3036,18 @@ static int find_extern_btf_id(const struct btf *btf, const char *ext_name) for (i = 1; i <= n; i++) { t = btf__type_by_id(btf, i);
- if (!btf_is_var(t)) + if (!btf_is_var(t) && !btf_is_func(t)) continue;
- var_name = btf__name_by_offset(btf, t->name_off); - if (strcmp(var_name, ext_name)) + tname = btf__name_by_offset(btf, t->name_off); + if (strcmp(tname, ext_name)) continue;
- if (btf_var(t)->linkage != BTF_VAR_GLOBAL_EXTERN) + if (btf_is_var(t) && + btf_var(t)->linkage != BTF_VAR_GLOBAL_EXTERN) + return -EINVAL; + + if (btf_is_func(t) && btf_func_linkage(t) != BTF_FUNC_EXTERN) return -EINVAL;
return i; @@ -3150,12 +3160,48 @@ static int find_int_btf_id(const struct btf *btf) return 0; }
+static int add_dummy_ksym_var(struct btf *btf) +{ + int i, int_btf_id, sec_btf_id, dummy_var_btf_id; + const struct btf_var_secinfo *vs; + const struct btf_type *sec; + + sec_btf_id = btf__find_by_name_kind(btf, KSYMS_SEC, + BTF_KIND_DATASEC); + if (sec_btf_id < 0) + return 0; + + sec = btf__type_by_id(btf, sec_btf_id); + vs = btf_var_secinfos(sec); + for (i = 0; i < btf_vlen(sec); i++, vs++) { + const struct btf_type *vt; + + vt = btf__type_by_id(btf, vs->type); + if (btf_is_func(vt)) + break; + } + + /* No func in ksyms sec. No need to add dummy var. */ + if (i == btf_vlen(sec)) + return 0; + + int_btf_id = find_int_btf_id(btf); + dummy_var_btf_id = btf__add_var(btf, + "dummy_ksym", + BTF_VAR_GLOBAL_ALLOCATED, + int_btf_id); + if (dummy_var_btf_id < 0) + pr_warn("cannot create a dummy_ksym var\n"); + + return dummy_var_btf_id; +} + static int bpf_object__collect_externs(struct bpf_object *obj) { struct btf_type *sec, *kcfg_sec = NULL, *ksym_sec = NULL; const struct btf_type *t; struct extern_desc *ext; - int i, n, off; + int i, n, off, dummy_var_btf_id; const char *ext_name, *sec_name; Elf_Scn *scn; GElf_Shdr sh; @@ -3167,6 +3213,10 @@ static int bpf_object__collect_externs(struct bpf_object *obj) if (elf_sec_hdr(obj, scn, &sh)) return -LIBBPF_ERRNO__FORMAT;
+ dummy_var_btf_id = add_dummy_ksym_var(obj->btf); + if (dummy_var_btf_id < 0) + return dummy_var_btf_id; + n = sh.sh_size / sh.sh_entsize; pr_debug("looking for externs among %d symbols...\n", n);
@@ -3211,6 +3261,11 @@ static int bpf_object__collect_externs(struct bpf_object *obj) sec_name = btf__name_by_offset(obj->btf, sec->name_off);
if (strcmp(sec_name, KCONFIG_SEC) == 0) { + if (btf_is_func(t)) { + pr_warn("extern function %s is unsupported under %s section\n", + ext->name, KCONFIG_SEC); + return -ENOTSUP; + } kcfg_sec = sec; ext->type = EXT_KCFG; ext->kcfg.sz = btf__resolve_size(obj->btf, t->type); @@ -3232,6 +3287,11 @@ static int bpf_object__collect_externs(struct bpf_object *obj) return -ENOTSUP; } } else if (strcmp(sec_name, KSYMS_SEC) == 0) { + if (btf_is_func(t) && ext->is_weak) { + pr_warn("extern weak function %s is unsupported\n", + ext->name); + return -ENOTSUP; + } ksym_sec = sec; ext->type = EXT_KSYM; skip_mods_and_typedefs(obj->btf, t->type, @@ -3258,7 +3318,14 @@ static int bpf_object__collect_externs(struct bpf_object *obj) * extern variables in DATASEC */ int int_btf_id = find_int_btf_id(obj->btf); + /* For extern function, a dummy_var added earlier + * will be used to replace the vs->type and + * its name string will be used to refill + * the missing param's name. + */ + const struct btf_type *dummy_var;
+ dummy_var = btf__type_by_id(obj->btf, dummy_var_btf_id); for (i = 0; i < obj->nr_extern; i++) { ext = &obj->externs[i]; if (ext->type != EXT_KSYM) @@ -3277,12 +3344,32 @@ static int bpf_object__collect_externs(struct bpf_object *obj) ext_name = btf__name_by_offset(obj->btf, vt->name_off); ext = find_extern_by_name(obj, ext_name); if (!ext) { - pr_warn("failed to find extern definition for BTF var '%s'\n", - ext_name); + pr_warn("failed to find extern definition for BTF %s '%s'\n", + btf_kind_str(vt), ext_name); return -ESRCH; } - btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED; - vt->type = int_btf_id; + if (btf_is_func(vt)) { + const struct btf_type *func_proto; + struct btf_param *param; + int j; + + func_proto = btf__type_by_id(obj->btf, + vt->type); + param = btf_params(func_proto); + /* Reuse the dummy_var string if the + * func proto does not have param name. + */ + for (j = 0; j < btf_vlen(func_proto); j++) + if (param[j].type && !param[j].name_off) + param[j].name_off = + dummy_var->name_off; + vs->type = dummy_var_btf_id; + vt->info &= ~0xffff; + vt->info |= BTF_FUNC_GLOBAL; + } else { + btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED; + vt->type = int_btf_id; + } vs->offset = off; vs->size = sizeof(int); } @@ -3437,7 +3524,10 @@ static int bpf_program__record_reloc(struct bpf_program *prog, } pr_debug("prog '%s': found extern #%d '%s' (sym %d) for insn #%u\n", prog->name, i, ext->name, ext->sym_idx, insn_idx); - reloc_desc->type = RELO_EXTERN_VAR; + if (insn->code == (BPF_JMP | BPF_CALL)) + reloc_desc->type = RELO_EXTERN_FUNC; + else + reloc_desc->type = RELO_EXTERN_VAR; reloc_desc->insn_idx = insn_idx; reloc_desc->sym_off = i; /* sym_off stores extern index */ return 0; @@ -6295,6 +6385,12 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) } relo->processed = true; break; + case RELO_EXTERN_FUNC: + ext = &obj->externs[relo->sym_off]; + insn[0].src_reg = BPF_PSEUDO_KFUNC_CALL; + insn[0].imm = ext->ksym.kernel_btf_id; + relo->processed = true; + break; case RELO_SUBPROG_ADDR: insn[0].src_reg = BPF_PSEUDO_FUNC; /* will be handled as a follow up pass */ @@ -7418,6 +7514,7 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj) { char sym_type, sym_name[500]; unsigned long long sym_addr; + const struct btf_type *t; struct extern_desc *ext; int ret, err = 0; FILE *f; @@ -7444,6 +7541,10 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj) if (!ext || ext->type != EXT_KSYM) continue;
+ t = btf__type_by_id(obj->btf, ext->btf_id); + if (!btf_is_var(t)) + continue; + if (ext->is_set && ext->ksym.addr != sym_addr) { pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n", sym_name, ext->ksym.addr, sym_addr); @@ -7545,8 +7646,53 @@ static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj, return 0; }
+static int bpf_object__resolve_ksym_func_btf_id(struct bpf_object *obj, + struct extern_desc *ext) +{ + int local_func_proto_id, kfunc_proto_id, kfunc_id; + const struct btf_type *kern_func; + struct btf *kern_btf = NULL; + int ret, kern_btf_fd = 0; + + local_func_proto_id = ext->ksym.type_id; + + kfunc_id = find_ksym_btf_id(obj, ext->name, BTF_KIND_FUNC, + &kern_btf, &kern_btf_fd); + if (kfunc_id < 0) { + pr_warn("extern (func ksym) '%s': not found in kernel BTF\n", + ext->name); + return kfunc_id; + } + + if (kern_btf != obj->btf_vmlinux) { + pr_warn("extern (func ksym) '%s': function in kernel module is not supported\n", + ext->name); + return -ENOTSUP; + } + + kern_func = btf__type_by_id(kern_btf, kfunc_id); + kfunc_proto_id = kern_func->type; + + ret = bpf_core_types_are_compat(obj->btf, local_func_proto_id, + kern_btf, kfunc_proto_id); + if (ret <= 0) { + pr_warn("extern (func ksym) '%s': func_proto [%d] incompatible with kernel [%d]\n", + ext->name, local_func_proto_id, kfunc_proto_id); + return -EINVAL; + } + + ext->is_set = true; + ext->ksym.kernel_btf_obj_fd = kern_btf_fd; + ext->ksym.kernel_btf_id = kfunc_id; + pr_debug("extern (func ksym) '%s': resolved to kernel [%d]\n", + ext->name, kfunc_id); + + return 0; +} + static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) { + const struct btf_type *t; struct extern_desc *ext; int i, err;
@@ -7555,7 +7701,11 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) if (ext->type != EXT_KSYM || !ext->ksym.type_id) continue;
- err = bpf_object__resolve_ksym_var_btf_id(obj, ext); + t = btf__type_by_id(obj->btf, ext->btf_id); + if (btf_is_var(t)) + err = bpf_object__resolve_ksym_var_btf_id(obj, ext); + else + err = bpf_object__resolve_ksym_func_btf_id(obj, ext); if (err) return err; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 05d817031ff9686a8206039b19e37616cf9e1d44 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Free temporary allocated memory used to construct finalized .BTF.ext data. Found by Coverity static analysis on libbpf's Github repo.
Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20210327042502.969745-1-andrii@kernel.org (cherry picked from commit 05d817031ff9686a8206039b19e37616cf9e1d44) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index a29d62ff8041..46b16cbdcda3 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1906,8 +1906,10 @@ static int finalize_btf_ext(struct bpf_linker *linker) struct dst_sec *sec = &linker->secs[i];
sz = emit_btf_ext_data(linker, cur, sec->sec_name, &sec->func_info); - if (sz < 0) - return sz; + if (sz < 0) { + err = sz; + goto out; + }
cur += sz; } @@ -1921,8 +1923,10 @@ static int finalize_btf_ext(struct bpf_linker *linker) struct dst_sec *sec = &linker->secs[i];
sz = emit_btf_ext_data(linker, cur, sec->sec_name, &sec->line_info); - if (sz < 0) - return sz; + if (sz < 0) { + err = sz; + goto out; + }
cur += sz; } @@ -1936,8 +1940,10 @@ static int finalize_btf_ext(struct bpf_linker *linker) struct dst_sec *sec = &linker->secs[i];
sz = emit_btf_ext_data(linker, cur, sec->sec_name, &sec->core_relo_info); - if (sz < 0) - return sz; + if (sz < 0) { + err = sz; + goto out; + }
cur += sz; } @@ -1948,8 +1954,10 @@ static int finalize_btf_ext(struct bpf_linker *linker) if (err) { linker->btf_ext = NULL; pr_warn("failed to parse final .BTF.ext data: %d\n", err); - return err; + goto out; }
- return 0; +out: + free(data); + return err; }
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.13-rc1 commit 1e1032b0c4afaed7739a6681ff6b4cb120b82994 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add missing ')' for KERNEL_VERSION macro.
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210405040119.802188-1-hengqi.chen@gmail.com (cherry picked from commit 1e1032b0c4afaed7739a6681ff6b4cb120b82994) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_helpers.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index cc2e51c64a54..b904128626c2 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -51,7 +51,7 @@ #endif
#ifndef KERNEL_VERSION -#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)) +#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c))) #endif
/*
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit ff182bc572cec4ca757775bf2a33a3ce8611227a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
As pointed by Andrii Nakryiko, _version is useless now, remove it.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-1-yauheni.kaliuta@redhat.co... (cherry picked from commit ff182bc572cec4ca757775bf2a33a3ce8611227a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/progs/sockopt_sk.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c index 712df7b49cb1..a40f9ae83437 100644 --- a/tools/testing/selftests/bpf/progs/sockopt_sk.c +++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c @@ -6,7 +6,6 @@ #include <bpf/bpf_helpers.h>
char _license[] SEC("license") = "GPL"; -__u32 _version SEC("version") = 1;
#ifndef PAGE_SIZE #define PAGE_SIZE 4096
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit cad99cce133dd5377f22da1e75d7eb5c4b886d75 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Switch the test to use BPF skeleton to save some boilerplate and make it easy to access bpf program bss segment.
The latter will be used to pass PAGE_SIZE from userspace since there is no convenient way for bpf program to get it from inside of the kernel.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-2-yauheni.kaliuta@redhat.co... (cherry picked from commit cad99cce133dd5377f22da1e75d7eb5c4b886d75) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/sockopt_sk.c | 66 +++++-------------- 1 file changed, 18 insertions(+), 48 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c index b25c9c45c148..7b6ed00e6b25 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c +++ b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c @@ -2,6 +2,8 @@ #include <test_progs.h> #include "cgroup_helpers.h"
+#include "sockopt_sk.skel.h" + #define SOL_CUSTOM 0xdeadbeef
static int getsetsockopt(void) @@ -163,60 +165,28 @@ static int getsetsockopt(void) return -1; }
-static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title) -{ - enum bpf_attach_type attach_type; - enum bpf_prog_type prog_type; - struct bpf_program *prog; - int err; - - err = libbpf_prog_type_by_name(title, &prog_type, &attach_type); - if (err) { - log_err("Failed to deduct types for %s BPF program", title); - return -1; - } - - prog = bpf_object__find_program_by_title(obj, title); - if (!prog) { - log_err("Failed to find %s BPF program", title); - return -1; - } - - err = bpf_prog_attach(bpf_program__fd(prog), cgroup_fd, - attach_type, 0); - if (err) { - log_err("Failed to attach %s BPF program", title); - return -1; - } - - return 0; -} - static void run_test(int cgroup_fd) { - struct bpf_prog_load_attr attr = { - .file = "./sockopt_sk.o", - }; - struct bpf_object *obj; - int ignored; - int err; - - err = bpf_prog_load_xattr(&attr, &obj, &ignored); - if (CHECK_FAIL(err)) - return; + struct sockopt_sk *skel; + + skel = sockopt_sk__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup;
- err = prog_attach(obj, cgroup_fd, "cgroup/getsockopt"); - if (CHECK_FAIL(err)) - goto close_bpf_object; + skel->links._setsockopt = + bpf_program__attach_cgroup(skel->progs._setsockopt, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links._setsockopt, "setsockopt_link")) + goto cleanup;
- err = prog_attach(obj, cgroup_fd, "cgroup/setsockopt"); - if (CHECK_FAIL(err)) - goto close_bpf_object; + skel->links._getsockopt = + bpf_program__attach_cgroup(skel->progs._getsockopt, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links._getsockopt, "getsockopt_link")) + goto cleanup;
- CHECK_FAIL(getsetsockopt()); + ASSERT_OK(getsetsockopt(), "getsetsockopt");
-close_bpf_object: - bpf_object__close(obj); +cleanup: + sockopt_sk__destroy(skel); }
void test_sockopt_sk(void)
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit 361d32028c7d52d28d7f0562193a2f4a41d10351 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Since there is no convenient way for bpf program to get PAGE_SIZE from inside of the kernel, pass the value from userspace.
Zero-initialize the variable in bpf prog, otherwise it will cause problems on some versions of Clang.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-3-yauheni.kaliuta@redhat.co... (cherry picked from commit 361d32028c7d52d28d7f0562193a2f4a41d10351) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/sockopt_sk.c | 2 ++ tools/testing/selftests/bpf/progs/sockopt_sk.c | 10 ++++------ 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c index 7b6ed00e6b25..6e75937fc95e 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c +++ b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c @@ -173,6 +173,8 @@ static void run_test(int cgroup_fd) if (!ASSERT_OK_PTR(skel, "skel_load")) goto cleanup;
+ skel->bss->page_size = getpagesize(); + skel->links._setsockopt = bpf_program__attach_cgroup(skel->progs._setsockopt, cgroup_fd); if (!ASSERT_OK_PTR(skel->links._setsockopt, "setsockopt_link")) diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c index a40f9ae83437..03dfc875c390 100644 --- a/tools/testing/selftests/bpf/progs/sockopt_sk.c +++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c @@ -7,9 +7,7 @@
char _license[] SEC("license") = "GPL";
-#ifndef PAGE_SIZE -#define PAGE_SIZE 4096 -#endif +int page_size = 0; /* userspace should set it */
#define SOL_CUSTOM 0xdeadbeef
@@ -70,7 +68,7 @@ int _getsockopt(struct bpf_sockopt *ctx) * program can only see the first PAGE_SIZE * bytes of data. */ - if (optval_end - optval != PAGE_SIZE) + if (optval_end - optval != page_size) return 0; /* EPERM, unexpected data size */
return 1; @@ -141,7 +139,7 @@ int _setsockopt(struct bpf_sockopt *ctx)
if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) { /* Original optlen is larger than PAGE_SIZE. */ - if (ctx->optlen != PAGE_SIZE * 2) + if (ctx->optlen != page_size * 2) return 0; /* EPERM, unexpected data size */
if (optval + 1 > optval_end) @@ -155,7 +153,7 @@ int _setsockopt(struct bpf_sockopt *ctx) * program can only see the first PAGE_SIZE * bytes of data. */ - if (optval_end - optval != PAGE_SIZE) + if (optval_end - optval != page_size) return 0; /* EPERM, unexpected data size */
return 1;
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit 7a85e4dfa7f5d052ae5016e96f1ee426a8870e78 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Use ASSERT to check result but keep CHECK where format was used to report error.
Use bpf_map__set_max_entries() to set map size dynamically from userspace according to page size.
Zero-initialize the variable in bpf prog, otherwise it will cause problems on some versions of Clang.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-4-yauheni.kaliuta@redhat.co... (cherry picked from commit 7a85e4dfa7f5d052ae5016e96f1ee426a8870e78) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/map_ptr.c | 15 +++++++++++++-- tools/testing/selftests/bpf/progs/map_ptr_kern.c | 4 ++-- 2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/map_ptr.c b/tools/testing/selftests/bpf/prog_tests/map_ptr.c index c230a573c373..4972f92205c7 100644 --- a/tools/testing/selftests/bpf/prog_tests/map_ptr.c +++ b/tools/testing/selftests/bpf/prog_tests/map_ptr.c @@ -12,11 +12,22 @@ void test_map_ptr(void) __u32 duration = 0, retval; char buf[128]; int err; + int page_size = getpagesize();
- skel = map_ptr_kern__open_and_load(); - if (CHECK(!skel, "skel_open_load", "open_load failed\n")) + skel = map_ptr_kern__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) return;
+ err = bpf_map__set_max_entries(skel->maps.m_ringbuf, page_size); + if (!ASSERT_OK(err, "bpf_map__set_max_entries")) + goto cleanup; + + err = map_ptr_kern__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + skel->bss->page_size = page_size; + err = bpf_prog_test_run(bpf_program__fd(skel->progs.cg_skb), 1, &pkt_v4, sizeof(pkt_v4), buf, NULL, &retval, NULL);
diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c index c325405751e2..e16b33a80de0 100644 --- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c +++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c @@ -12,6 +12,7 @@ _Static_assert(MAX_ENTRIES < LOOP_BOUND, "MAX_ENTRIES must be < LOOP_BOUND");
enum bpf_map_type g_map_type = BPF_MAP_TYPE_UNSPEC; __u32 g_line = 0; +int page_size = 0; /* userspace should set it */
#define VERIFY_TYPE(type, func) ({ \ g_map_type = type; \ @@ -642,7 +643,6 @@ struct bpf_ringbuf_map {
struct { __uint(type, BPF_MAP_TYPE_RINGBUF); - __uint(max_entries, 1 << 12); } m_ringbuf SEC(".maps");
static inline int check_ringbuf(void) @@ -650,7 +650,7 @@ static inline int check_ringbuf(void) struct bpf_ringbuf_map *ringbuf = (struct bpf_ringbuf_map *)&m_ringbuf; struct bpf_map *map = (struct bpf_map *)&m_ringbuf;
- VERIFY(check(&ringbuf->map, map, 0, 0, 1 << 12)); + VERIFY(check(&ringbuf->map, map, 0, 0, page_size));
return 1; }
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit 34090aaf256e469a39401cc04d5cd43275ccd97d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Replace hardcoded 4096 with runtime value in the userspace part of the test and set bpf table sizes dynamically according to the value.
Do not switch to ASSERT macros, keep CHECK, for consistency with the rest of the test. Can be a separate cleanup patch.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-5-yauheni.kaliuta@redhat.co... (cherry picked from commit 34090aaf256e469a39401cc04d5cd43275ccd97d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/mmap.c | 24 +++++++++++++++---- tools/testing/selftests/bpf/progs/test_mmap.c | 2 -- 2 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/mmap.c b/tools/testing/selftests/bpf/prog_tests/mmap.c index 9c3c5c0f068f..37b002ca1167 100644 --- a/tools/testing/selftests/bpf/prog_tests/mmap.c +++ b/tools/testing/selftests/bpf/prog_tests/mmap.c @@ -29,22 +29,36 @@ void test_mmap(void) struct test_mmap *skel; __u64 val = 0;
- skel = test_mmap__open_and_load(); - if (CHECK(!skel, "skel_open_and_load", "skeleton open/load failed\n")) + skel = test_mmap__open(); + if (CHECK(!skel, "skel_open", "skeleton open failed\n")) return;
+ err = bpf_map__set_max_entries(skel->maps.rdonly_map, page_size); + if (CHECK(err != 0, "bpf_map__set_max_entries", "bpf_map__set_max_entries failed\n")) + goto cleanup; + + /* at least 4 pages of data */ + err = bpf_map__set_max_entries(skel->maps.data_map, + 4 * (page_size / sizeof(u64))); + if (CHECK(err != 0, "bpf_map__set_max_entries", "bpf_map__set_max_entries failed\n")) + goto cleanup; + + err = test_mmap__load(skel); + if (CHECK(err != 0, "skel_load", "skeleton load failed\n")) + goto cleanup; + bss_map = skel->maps.bss; data_map = skel->maps.data_map; data_map_fd = bpf_map__fd(data_map);
rdmap_fd = bpf_map__fd(skel->maps.rdonly_map); - tmp1 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, rdmap_fd, 0); + tmp1 = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, rdmap_fd, 0); if (CHECK(tmp1 != MAP_FAILED, "rdonly_write_mmap", "unexpected success\n")) { - munmap(tmp1, 4096); + munmap(tmp1, page_size); goto cleanup; } /* now double-check if it's mmap()'able at all */ - tmp1 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, rdmap_fd, 0); + tmp1 = mmap(NULL, page_size, PROT_READ, MAP_SHARED, rdmap_fd, 0); if (CHECK(tmp1 == MAP_FAILED, "rdonly_read_mmap", "failed: %d\n", errno)) goto cleanup;
diff --git a/tools/testing/selftests/bpf/progs/test_mmap.c b/tools/testing/selftests/bpf/progs/test_mmap.c index 4eb42cff5fe9..5a5cc19a15bf 100644 --- a/tools/testing/selftests/bpf/progs/test_mmap.c +++ b/tools/testing/selftests/bpf/progs/test_mmap.c @@ -9,7 +9,6 @@ char _license[] SEC("license") = "GPL";
struct { __uint(type, BPF_MAP_TYPE_ARRAY); - __uint(max_entries, 4096); __uint(map_flags, BPF_F_MMAPABLE | BPF_F_RDONLY_PROG); __type(key, __u32); __type(value, char); @@ -17,7 +16,6 @@ struct {
struct { __uint(type, BPF_MAP_TYPE_ARRAY); - __uint(max_entries, 512 * 4); /* at least 4 pages of data */ __uint(map_flags, BPF_F_MMAPABLE); __type(key, __u32); __type(value, __u64);
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit 23a65766066bb6e42a44e9097ddf79f72292a19f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Replace hardcoded 4096 with runtime value in the userspace part of the test and set bpf table sizes dynamically according to the value.
Do not switch to ASSERT macros, keep CHECK, for consistency with the rest of the test. Can be a separate cleanup patch.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-6-yauheni.kaliuta@redhat.co... (cherry picked from commit 23a65766066bb6e42a44e9097ddf79f72292a19f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../testing/selftests/bpf/prog_tests/ringbuf.c | 17 +++++++++++++---- .../testing/selftests/bpf/progs/test_ringbuf.c | 1 - 2 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c index fddbc5db5d6a..de78617f6550 100644 --- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c +++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c @@ -87,11 +87,20 @@ void test_ringbuf(void) pthread_t thread; long bg_ret = -1; int err, cnt; + int page_size = getpagesize();
- skel = test_ringbuf__open_and_load(); - if (CHECK(!skel, "skel_open_load", "skeleton open&load failed\n")) + skel = test_ringbuf__open(); + if (CHECK(!skel, "skel_open", "skeleton open failed\n")) return;
+ err = bpf_map__set_max_entries(skel->maps.ringbuf, page_size); + if (CHECK(err != 0, "bpf_map__set_max_entries", "bpf_map__set_max_entries failed\n")) + goto cleanup; + + err = test_ringbuf__load(skel); + if (CHECK(err != 0, "skel_load", "skeleton load failed\n")) + goto cleanup; + /* only trigger BPF program for current process */ skel->bss->pid = getpid();
@@ -110,9 +119,9 @@ void test_ringbuf(void) CHECK(skel->bss->avail_data != 3 * rec_sz, "err_avail_size", "exp %ld, got %ld\n", 3L * rec_sz, skel->bss->avail_data); - CHECK(skel->bss->ring_size != 4096, + CHECK(skel->bss->ring_size != page_size, "err_ring_size", "exp %ld, got %ld\n", - 4096L, skel->bss->ring_size); + (long)page_size, skel->bss->ring_size); CHECK(skel->bss->cons_pos != 0, "err_cons_pos", "exp %ld, got %ld\n", 0L, skel->bss->cons_pos); diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf.c b/tools/testing/selftests/bpf/progs/test_ringbuf.c index 8ba9959b036b..6b3f288b7c63 100644 --- a/tools/testing/selftests/bpf/progs/test_ringbuf.c +++ b/tools/testing/selftests/bpf/progs/test_ringbuf.c @@ -15,7 +15,6 @@ struct sample {
struct { __uint(type, BPF_MAP_TYPE_RINGBUF); - __uint(max_entries, 1 << 12); } ringbuf SEC(".maps");
/* inputs */
From: Andrii Nakryiko andrii.nakryiko@gmail.com
mainline inclusion from mainline-5.13-rc1 commit b3278099b2f6e81771c6c2b70fcf9a56e9ba5d93 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The API gives access to inner map for map in map types (array or hash of map). It will be used to dynamically set max_entries in it.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-7-yauheni.kaliuta@redhat.co... (cherry picked from commit b3278099b2f6e81771c6c2b70fcf9a56e9ba5d93) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 10 ++++++++++ tools/lib/bpf/libbpf.h | 1 + tools/lib/bpf/libbpf.map | 1 + 3 files changed, 12 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f662059b1ff1..2859ab36aa55 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2194,6 +2194,7 @@ static int parse_btf_map_def(struct bpf_object *obj, map->inner_map = calloc(1, sizeof(*map->inner_map)); if (!map->inner_map) return -ENOMEM; + map->inner_map->fd = -1; map->inner_map->sec_idx = obj->efile.btf_maps_shndx; map->inner_map->name = malloc(strlen(map->name) + sizeof(".inner") + 1); @@ -3884,6 +3885,14 @@ __u32 bpf_map__max_entries(const struct bpf_map *map) return map->def.max_entries; }
+struct bpf_map *bpf_map__inner_map(struct bpf_map *map) +{ + if (!bpf_map_type__is_map_in_map(map->def.type)) + return NULL; + + return map->inner_map; +} + int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries) { if (map->fd >= 0) @@ -9545,6 +9554,7 @@ int bpf_map__set_inner_map_fd(struct bpf_map *map, int fd) pr_warn("error: inner_map_fd already specified\n"); return -EINVAL; } + zfree(&map->inner_map); map->inner_map_fd = fd; return 0; } diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 4bb141b6727d..b92012b4a2f7 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -484,6 +484,7 @@ LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path); LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path);
LIBBPF_API int bpf_map__set_inner_map_fd(struct bpf_map *map, int fd); +LIBBPF_API struct bpf_map *bpf_map__inner_map(struct bpf_map *map);
LIBBPF_API long libbpf_get_error(const void *ptr);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 0d3355c1b2e1..238830f75522 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -359,5 +359,6 @@ LIBBPF_0.4.0 { bpf_linker__finalize; bpf_linker__free; bpf_linker__new; + bpf_map__inner_map; bpf_object__set_kversion; } LIBBPF_0.3.0;
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
mainline inclusion from mainline-5.13-rc1 commit f3f4c23e1238783f3d719cfbda6c5e2fd03a48c4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Set bpf table sizes dynamically according to the runtime page size value.
Do not switch to ASSERT macros, keep CHECK, for consistency with the rest of the test. Can be a separate cleanup patch.
Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210408061310.95877-8-yauheni.kaliuta@redhat.co... (cherry picked from commit f3f4c23e1238783f3d719cfbda6c5e2fd03a48c4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/ringbuf_multi.c | 23 ++++++++++++++++--- .../selftests/bpf/progs/test_ringbuf_multi.c | 1 - 2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c b/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c index d37161e59bb2..159de99621c7 100644 --- a/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c +++ b/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c @@ -41,13 +41,30 @@ static int process_sample(void *ctx, void *data, size_t len) void test_ringbuf_multi(void) { struct test_ringbuf_multi *skel; - struct ring_buffer *ringbuf; + struct ring_buffer *ringbuf = NULL; int err; + int page_size = getpagesize();
- skel = test_ringbuf_multi__open_and_load(); - if (CHECK(!skel, "skel_open_load", "skeleton open&load failed\n")) + skel = test_ringbuf_multi__open(); + if (CHECK(!skel, "skel_open", "skeleton open failed\n")) return;
+ err = bpf_map__set_max_entries(skel->maps.ringbuf1, page_size); + if (CHECK(err != 0, "bpf_map__set_max_entries", "bpf_map__set_max_entries failed\n")) + goto cleanup; + + err = bpf_map__set_max_entries(skel->maps.ringbuf2, page_size); + if (CHECK(err != 0, "bpf_map__set_max_entries", "bpf_map__set_max_entries failed\n")) + goto cleanup; + + err = bpf_map__set_max_entries(bpf_map__inner_map(skel->maps.ringbuf_arr), page_size); + if (CHECK(err != 0, "bpf_map__set_max_entries", "bpf_map__set_max_entries failed\n")) + goto cleanup; + + err = test_ringbuf_multi__load(skel); + if (CHECK(err != 0, "skel_load", "skeleton load failed\n")) + goto cleanup; + /* only trigger BPF program for current process */ skel->bss->pid = getpid();
diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c b/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c index edf3b6953533..055c10b2ff80 100644 --- a/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c +++ b/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c @@ -15,7 +15,6 @@ struct sample {
struct ringbuf_map { __uint(type, BPF_MAP_TYPE_RINGBUF); - __uint(max_entries, 1 << 12); } ringbuf1 SEC(".maps"), ringbuf2 SEC(".maps");
From: Pedro Tammela pctammela@gmail.com
mainline inclusion from mainline-5.13-rc1 commit 5c507329000e282dce91e6c98ee6ffa61a8a5e49 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In 'bpf_ringbuf_reserve()' we require the flag to '0' at the moment.
For 'bpf_ringbuf_{discard,submit,output}' a flag of '0' might send a notification to the process if needed.
Signed-off-by: Pedro Tammela pctammela@mojatatu.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210412192434.944343-1-pctammela@mojatatu.com (cherry picked from commit 5c507329000e282dce91e6c98ee6ffa61a8a5e49) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 16 ++++++++++++++++ tools/include/uapi/linux/bpf.h | 16 ++++++++++++++++ 2 files changed, 32 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 4feb43973716..9c39b101b326 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3336,12 +3336,20 @@ union bpf_attr { * of new data availability is sent. * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification * of new data availability is sent unconditionally. + * If **0** is specified in *flags*, an adaptive notification + * of new data availability is sent. + * + * An adaptive notification is a notification sent whenever the user-space + * process has caught up and consumed all available payloads. In case the user-space + * process is still processing a previous payload, then no notification is needed + * as it will process the newly added payload automatically. * Return * 0 on success, or a negative error in case of failure. * * void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags) * Description * Reserve *size* bytes of payload in a ring buffer *ringbuf*. + * *flags* must be 0. * Return * Valid pointer with *size* bytes of memory available; NULL, * otherwise. @@ -3353,6 +3361,10 @@ union bpf_attr { * of new data availability is sent. * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification * of new data availability is sent unconditionally. + * If **0** is specified in *flags*, an adaptive notification + * of new data availability is sent. + * + * See 'bpf_ringbuf_output()' for the definition of adaptive notification. * Return * Nothing. Always succeeds. * @@ -3363,6 +3375,10 @@ union bpf_attr { * of new data availability is sent. * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification * of new data availability is sent unconditionally. + * If **0** is specified in *flags*, an adaptive notification + * of new data availability is sent. + * + * See 'bpf_ringbuf_output()' for the definition of adaptive notification. * Return * Nothing. Always succeeds. * diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 098dffc26f2e..360b06fa4399 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4046,12 +4046,20 @@ union bpf_attr { * of new data availability is sent. * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification * of new data availability is sent unconditionally. + * If **0** is specified in *flags*, an adaptive notification + * of new data availability is sent. + * + * An adaptive notification is a notification sent whenever the user-space + * process has caught up and consumed all available payloads. In case the user-space + * process is still processing a previous payload, then no notification is needed + * as it will process the newly added payload automatically. * Return * 0 on success, or a negative error in case of failure. * * void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags) * Description * Reserve *size* bytes of payload in a ring buffer *ringbuf*. + * *flags* must be 0. * Return * Valid pointer with *size* bytes of memory available; NULL, * otherwise. @@ -4063,6 +4071,10 @@ union bpf_attr { * of new data availability is sent. * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification * of new data availability is sent unconditionally. + * If **0** is specified in *flags*, an adaptive notification + * of new data availability is sent. + * + * See 'bpf_ringbuf_output()' for the definition of adaptive notification. * Return * Nothing. Always succeeds. * @@ -4073,6 +4085,10 @@ union bpf_attr { * of new data availability is sent. * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification * of new data availability is sent unconditionally. + * If **0** is specified in *flags*, an adaptive notification + * of new data availability is sent. + * + * See 'bpf_ringbuf_output()' for the definition of adaptive notification. * Return * Nothing. Always succeeds. *
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.13-rc1 commit d3d93e34bd98e4dbb002310fed08630f4b549a08 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
relo->processed is set, but not used. Remove it.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210415141817.53136-1-alexei.starovoitov@gmail.... (cherry picked from commit d3d93e34bd98e4dbb002310fed08630f4b549a08) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 15 +-------------- 1 file changed, 1 insertion(+), 14 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2859ab36aa55..cad6b9bd13ba 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -195,7 +195,6 @@ struct reloc_desc { int insn_idx; int map_idx; int sym_off; - bool processed; };
struct bpf_sec_def; @@ -3500,8 +3499,6 @@ static int bpf_program__record_reloc(struct bpf_program *prog, const char *sym_sec_name; struct bpf_map *map;
- reloc_desc->processed = false; - if (!is_call_insn(insn) && !is_ldimm64_insn(insn)) { pr_warn("prog '%s': invalid relo against '%s' for insns[%d].code 0x%x\n", prog->name, sym_name, insn_idx, insn->code); @@ -6368,13 +6365,11 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) case RELO_LD64: insn[0].src_reg = BPF_PSEUDO_MAP_FD; insn[0].imm = obj->maps[relo->map_idx].fd; - relo->processed = true; break; case RELO_DATA: insn[0].src_reg = BPF_PSEUDO_MAP_VALUE; insn[1].imm = insn[0].imm + relo->sym_off; insn[0].imm = obj->maps[relo->map_idx].fd; - relo->processed = true; break; case RELO_EXTERN_VAR: ext = &obj->externs[relo->sym_off]; @@ -6392,13 +6387,11 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) insn[1].imm = ext->ksym.addr >> 32; } } - relo->processed = true; break; case RELO_EXTERN_FUNC: ext = &obj->externs[relo->sym_off]; insn[0].src_reg = BPF_PSEUDO_KFUNC_CALL; insn[0].imm = ext->ksym.kernel_btf_id; - relo->processed = true; break; case RELO_SUBPROG_ADDR: insn[0].src_reg = BPF_PSEUDO_FUNC; @@ -6684,9 +6677,6 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog, * different main programs */ insn->imm = subprog->sub_insn_off - (prog->sub_insn_off + insn_idx) - 1;
- if (relo) - relo->processed = true; - pr_debug("prog '%s': insn #%zu relocated, imm %d points to subprog '%s' (now at %zu offset)\n", prog->name, insn_idx, insn->imm, subprog->name, subprog->sub_insn_off); } @@ -6779,7 +6769,7 @@ static int bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog) { struct bpf_program *subprog; - int i, j, err; + int i, err;
/* mark all subprogs as not relocated (yet) within the context of * current main program @@ -6790,9 +6780,6 @@ bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog) continue;
subprog->sub_insn_off = 0; - for (j = 0; j < subprog->nr_reloc; j++) - if (subprog->reloc_desc[j].type == RELO_CALL) - subprog->reloc_desc[j].processed = false; }
err = bpf_object__reloc_code(obj, prog, prog);
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit a22c0c81da644223d911466746ef414a786cb1c8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
With clang compiler: make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 the test_cpp build failed due to the failure: warning: treating 'c-header' input as 'c++-header' when in C++ mode, this behavior is deprecated [-Wdeprecated] clang-13: error: cannot specify -o when generating multiple output files
test_cpp compilation flag looks like: clang++ -g -Og -rdynamic -Wall -I<...> ... \ -Dbpf_prog_load=bpf_prog_test_load -Dbpf_load_program=bpf_test_load_program \ test_cpp.cpp <...>/test_core_extern.skel.h <...>/libbpf.a <...>/test_stub.o \ -lcap -lelf -lz -lrt -lpthread -o <...>/test_cpp
The clang++ compiler complains the header file in the command line and also failed the compilation due to this. Let us remove the header file from the command line which is not intended any way, and this fixed the compilation problem.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210413153424.3028986-1-yhs@fb.com (cherry picked from commit a22c0c81da644223d911466746ef414a786cb1c8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index cc68e3ecec7b..827d37dbd00f 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -473,7 +473,7 @@ $(OUTPUT)/test_verifier: test_verifier.c verifier/tests.h $(BPFOBJ) | $(OUTPUT) # Make sure we are able to include and link libbpf against c++. $(OUTPUT)/test_cpp: test_cpp.cpp $(OUTPUT)/test_core_extern.skel.h $(BPFOBJ) $(call msg,CXX,,$@) - $(Q)$(CXX) $(CFLAGS) $^ $(LDLIBS) -o $@ + $(Q)$(CXX) $(CFLAGS) $(filter %.a %.o %.cpp,$^) $(LDLIBS) -o $@
# Benchmark runner $(OUTPUT)/bench_%.o: benchs/bench_%.c bench.h
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.13-rc1 commit 8af50142763c6e70d426e45278b23d7103e5b7a7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
With clang compiler: make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel # build selftests/bpf or bpftool make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 make -j60 -C tools/bpf/bpftool LLVM=1 LLVM_IAS=1 the following compilation warning showed up, net.c:160:37: warning: comparison of integers of different signs: '__u32' (aka 'unsigned int') and 'int' [-Wsign-compare] for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len); ^~~~~~~~~~~~~~~~~ .../tools/include/uapi/linux/netlink.h:99:24: note: expanded from macro 'NLMSG_OK' (nlh)->nlmsg_len <= (len)) ~~~~~~~~~~~~~~~~ ^ ~~~
In this particular case, "len" is defined as "int" and (nlh)->nlmsg_len is "unsigned int". The macro NLMSG_OK is defined as below in uapi/linux/netlink.h. #define NLMSG_OK(nlh,len) ((len) >= (int)sizeof(struct nlmsghdr) && \ (nlh)->nlmsg_len >= sizeof(struct nlmsghdr) && \ (nlh)->nlmsg_len <= (len))
The clang compiler complains the comparision "(nlh)->nlmsg_len <= (len))", but in bpftool/net.c, it is already ensured that "len > 0" must be true. So theoretically the compiler could deduce that comparison of "(nlh)->nlmsg_len" and "len" is okay, but this really depends on compiler internals. Let us add an explicit type conversion (from "int" to "unsigned int") for "len" in NLMSG_OK to silence this warning right now.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210413153435.3029635-1-yhs@fb.com (cherry picked from commit 8af50142763c6e70d426e45278b23d7103e5b7a7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c index ff3aa0cf3997..f836d115d7d6 100644 --- a/tools/bpf/bpftool/net.c +++ b/tools/bpf/bpftool/net.c @@ -157,7 +157,7 @@ static int netlink_recv(int sock, __u32 nl_pid, __u32 seq, if (len == 0) break;
- for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len); + for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, (unsigned int)len); nh = NLMSG_NEXT(nh, len)) { if (nh->nlmsg_pid != nl_pid) { ret = -LIBBPF_ERRNO__WRNGPID;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 0fec7a3cee1cf8e4f86ff563d229408ccbdc2d66 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When used on externs SEC() macro will trigger compilation warning about inapplicable `__attribute__((used))`. That's expected for extern declarations, so suppress it with the corresponding _Pragma.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-4-andrii@kernel.org (cherry picked from commit 0fec7a3cee1cf8e4f86ff563d229408ccbdc2d66) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_helpers.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index b904128626c2..75c7581b304c 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -25,9 +25,16 @@ /* * Helper macro to place programs, maps, license in * different sections in elf_bpf file. Section names - * are interpreted by elf_bpf loader + * are interpreted by libbpf depending on the context (BPF programs, BPF maps, + * extern variables, etc). + * To allow use of SEC() with externs (e.g., for extern .maps declarations), + * make sure __attribute__((unused)) doesn't trigger compilation warning. */ -#define SEC(NAME) __attribute__((section(NAME), used)) +#define SEC(name) \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored "-Wignored-attributes"") \ + __attribute__((section(name), used)) \ + _Pragma("GCC diagnostic pop") \
/* Avoid 'linux/stddef.h' definition of '__always_inline'. */ #undef __always_inline
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit aea28a602fa19fb4fe66374030ab7e2c8ddf643e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Define __hidden helper macro in bpf_helpers.h, which is a short-hand for __attribute__((visibility("hidden"))). Add libbpf support to mark BPF subprograms marked with __hidden as static in BTF information to enforce BPF verifier's static function validation algorithm, which takes more information (caller's context) into account during a subprogram validation.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-5-andrii@kernel.org (cherry picked from commit aea28a602fa19fb4fe66374030ab7e2c8ddf643e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_helpers.h | 8 ++++++ tools/lib/bpf/btf.c | 5 ---- tools/lib/bpf/libbpf.c | 45 ++++++++++++++++++++++++++++++++- tools/lib/bpf/libbpf_internal.h | 6 +++++ 4 files changed, 58 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 75c7581b304c..9720dc0b4605 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -47,6 +47,14 @@ #define __weak __attribute__((weak)) #endif
+/* + * Use __hidden attribute to mark a non-static BPF subprogram effectively + * static for BPF verifier's verification algorithm purposes, allowing more + * extensive and permissive BPF verification process, taking into account + * subprogram's caller context. + */ +#define __hidden __attribute__((visibility("hidden"))) + /* When utilizing vmlinux.h with BPF CO-RE, user BPF programs can't include * any system-level headers (such as stddef.h, linux/version.h, etc), and * commonly-used macros like NULL and KERNEL_VERSION aren't available through diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 4c0dd20c0ec9..9be6b87f8d52 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1611,11 +1611,6 @@ static void *btf_add_type_mem(struct btf *btf, size_t add_sz) btf->hdr->type_len, UINT_MAX, add_sz); }
-static __u32 btf_type_info(int kind, int vlen, int kflag) -{ - return (kflag << 31) | (kind << 24) | vlen; -} - static void btf_type_inc_vlen(struct btf_type *t) { t->info = btf_type_info(btf_kind(t), btf_vlen(t) + 1, btf_kflag(t)); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index cad6b9bd13ba..f2c667482ce6 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -71,6 +71,7 @@ static struct bpf_map *bpf_object__add_map(struct bpf_object *obj); static const struct btf_type * skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id); +static bool prog_is_subprog(const struct bpf_object *obj, const struct bpf_program *prog);
static int __base_pr(enum libbpf_print_level level, const char *format, va_list args) @@ -274,6 +275,7 @@ struct bpf_program { bpf_program_clear_priv_t clear_priv;
bool load; + bool mark_btf_static; enum bpf_prog_type type; enum bpf_attach_type expected_attach_type; int prog_ifindex; @@ -698,6 +700,15 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, if (err) return err;
+ /* if function is a global/weak symbol, but has hidden + * visibility (STV_HIDDEN), mark its BTF FUNC as static to + * enable more permissive BPF verification mode with more + * outside context available to BPF verifier + */ + if (GELF_ST_BIND(sym.st_info) != STB_LOCAL + && GELF_ST_VISIBILITY(sym.st_other) == STV_HIDDEN) + prog->mark_btf_static = true; + nr_progs++; obj->nr_programs = nr_progs;
@@ -2619,7 +2630,7 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) { struct btf *kern_btf = obj->btf; bool btf_mandatory, sanitize; - int err = 0; + int i, err = 0;
if (!obj->btf) return 0; @@ -2633,6 +2644,38 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) return 0; }
+ /* Even though some subprogs are global/weak, user might prefer more + * permissive BPF verification process that BPF verifier performs for + * static functions, taking into account more context from the caller + * functions. In such case, they need to mark such subprogs with + * __attribute__((visibility("hidden"))) and libbpf will adjust + * corresponding FUNC BTF type to be marked as static and trigger more + * involved BPF verification process. + */ + for (i = 0; i < obj->nr_programs; i++) { + struct bpf_program *prog = &obj->programs[i]; + struct btf_type *t; + const char *name; + int j, n; + + if (!prog->mark_btf_static || !prog_is_subprog(obj, prog)) + continue; + + n = btf__get_nr_types(obj->btf); + for (j = 1; j <= n; j++) { + t = btf_type_by_id(obj->btf, j); + if (!btf_is_func(t) || btf_func_linkage(t) != BTF_FUNC_GLOBAL) + continue; + + name = btf__str_by_offset(obj->btf, t->name_off); + if (strcmp(name, prog->name) != 0) + continue; + + t->info = btf_type_info(BTF_KIND_FUNC, BTF_FUNC_STATIC, 0); + break; + } + } + sanitize = btf_needs_sanitization(obj); if (sanitize) { const void *raw_data; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 6017902c687e..92b7eae10c6d 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -19,6 +19,7 @@ #pragma GCC poison reallocarray
#include "libbpf.h" +#include "btf.h"
#ifndef EM_BPF #define EM_BPF 247 @@ -132,6 +133,11 @@ struct btf_type;
struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id);
+static inline __u32 btf_type_info(int kind, int vlen, int kflag) +{ + return (kflag << 31) | (kind << 24) | vlen; +} + void *libbpf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t cur_cnt, size_t max_cnt, size_t add_cnt); int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 6245947c1b3c6783976e3af113bac30250d0a93e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Currently libbpf is very strict about parsing BPF program instruction sections. No gaps are allowed between sequential BPF programs within a given ELF section. Libbpf enforced that by keeping track of the next section offset that should start a new BPF (sub)program and cross-checks that by searching for a corresponding STT_FUNC ELF symbol.
But this is too restrictive once we allow to have weak BPF programs and link together two or more BPF object files. In such case, some weak BPF programs might be "overridden" by either non-weak BPF program with the same name and signature, or even by another weak BPF program that just happened to be linked first. That, in turn, leaves BPF instructions of the "lost" BPF (sub)program intact, but there is no corresponding ELF symbol, because no one is going to be referencing it.
Libbpf already correctly handles such cases in the sense that it won't append such dead code to actual BPF programs loaded into kernel. So the only change that needs to be done is to relax the logic of parsing BPF instruction sections. Instead of assuming next BPF (sub)program section offset, iterate available STT_FUNC ELF symbols to discover all available BPF subprograms and programs.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-6-andrii@kernel.org (cherry picked from commit 6245947c1b3c6783976e3af113bac30250d0a93e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 58 ++++++++++++++++-------------------------- 1 file changed, 22 insertions(+), 36 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f2c667482ce6..14492e27aa2c 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -502,8 +502,6 @@ static Elf_Scn *elf_sec_by_name(const struct bpf_object *obj, const char *name); static int elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn, GElf_Shdr *hdr); static const char *elf_sec_name(const struct bpf_object *obj, Elf_Scn *scn); static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn); -static int elf_sym_by_sec_off(const struct bpf_object *obj, size_t sec_idx, - size_t off, __u32 sym_type, GElf_Sym *sym);
void bpf_program__unload(struct bpf_program *prog) { @@ -644,25 +642,29 @@ static int bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, const char *sec_name, int sec_idx) { + Elf_Data *symbols = obj->efile.symbols; struct bpf_program *prog, *progs; void *data = sec_data->d_buf; - size_t sec_sz = sec_data->d_size, sec_off, prog_sz; - int nr_progs, err; + size_t sec_sz = sec_data->d_size, sec_off, prog_sz, nr_syms; + int nr_progs, err, i; const char *name; GElf_Sym sym;
progs = obj->programs; nr_progs = obj->nr_programs; + nr_syms = symbols->d_size / sizeof(GElf_Sym); sec_off = 0;
- while (sec_off < sec_sz) { - if (elf_sym_by_sec_off(obj, sec_idx, sec_off, STT_FUNC, &sym)) { - pr_warn("sec '%s': failed to find program symbol at offset %zu\n", - sec_name, sec_off); - return -LIBBPF_ERRNO__FORMAT; - } + for (i = 0; i < nr_syms; i++) { + if (!gelf_getsym(symbols, i, &sym)) + continue; + if (sym.st_shndx != sec_idx) + continue; + if (GELF_ST_TYPE(sym.st_info) != STT_FUNC) + continue;
prog_sz = sym.st_size; + sec_off = sym.st_value;
name = elf_sym_str(obj, sym.st_name); if (!name) { @@ -711,8 +713,6 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data,
nr_progs++; obj->nr_programs = nr_progs; - - sec_off += prog_sz; }
return 0; @@ -2826,26 +2826,6 @@ static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn) return data; }
-static int elf_sym_by_sec_off(const struct bpf_object *obj, size_t sec_idx, - size_t off, __u32 sym_type, GElf_Sym *sym) -{ - Elf_Data *symbols = obj->efile.symbols; - size_t n = symbols->d_size / sizeof(GElf_Sym); - int i; - - for (i = 0; i < n; i++) { - if (!gelf_getsym(symbols, i, sym)) - continue; - if (sym->st_shndx != sec_idx || sym->st_value != off) - continue; - if (GELF_ST_TYPE(sym->st_info) != sym_type) - continue; - return 0; - } - - return -ENOENT; -} - static bool is_sec_name_dwarf(const char *name) { /* approximation, but the actual list is too long */ @@ -3724,11 +3704,16 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data int err, i, nrels; const char *sym_name; __u32 insn_idx; + Elf_Scn *scn; + Elf_Data *scn_data; GElf_Sym sym; GElf_Rel rel;
+ scn = elf_sec_by_idx(obj, sec_idx); + scn_data = elf_sec_data(obj, scn); + relo_sec_name = elf_sec_str(obj, shdr->sh_name); - sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx)); + sec_name = elf_sec_name(obj, scn); if (!relo_sec_name || !sec_name) return -EINVAL;
@@ -3746,7 +3731,8 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data relo_sec_name, (size_t)GELF_R_SYM(rel.r_info), i); return -LIBBPF_ERRNO__FORMAT; } - if (rel.r_offset % BPF_INSN_SZ) { + + if (rel.r_offset % BPF_INSN_SZ || rel.r_offset >= scn_data->d_size) { pr_warn("sec '%s': invalid offset 0x%zx for relo #%d\n", relo_sec_name, (size_t)GELF_R_SYM(rel.r_info), i); return -LIBBPF_ERRNO__FORMAT; @@ -3770,9 +3756,9 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data
prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx); if (!prog) { - pr_warn("sec '%s': relo #%d: program not found in section '%s' for insn #%u\n", + pr_debug("sec '%s': relo #%d: couldn't find program in section '%s' for insn #%u, probably overridden weak function, skipping...\n", relo_sec_name, i, sec_name, insn_idx); - return -LIBBPF_ERRNO__RELOC; + continue; }
relos = libbpf_reallocarray(prog->reloc_desc,
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit c7ef5ec9573f05535370d8716576263681cabec7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Refactor BTF-defined maps parsing logic to allow it to be nicely reused by BPF static linker. Further, at least for BPF static linker, it's important to know which attributes of a BPF map were defined explicitly, so provide a bit set for each known portion of BTF map definition. This allows BPF static linker to do a simple check when dealing with extern map declarations.
The same capabilities allow to distinguish attributes explicitly set to zero (e.g., __uint(max_entries, 0)) vs the case of not specifying it at all (no max_entries attribute at all). Libbpf is currently not utilizing that, but it could be useful for backwards compatibility reasons later.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-7-andrii@kernel.org (cherry picked from commit c7ef5ec9573f05535370d8716576263681cabec7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 257 ++++++++++++++++++-------------- tools/lib/bpf/libbpf_internal.h | 32 ++++ 2 files changed, 178 insertions(+), 111 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 14492e27aa2c..2d5d86928a32 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2025,255 +2025,262 @@ static int build_map_pin_path(struct bpf_map *map, const char *path) return bpf_map__set_pin_path(map, buf); }
- -static int parse_btf_map_def(struct bpf_object *obj, - struct bpf_map *map, - const struct btf_type *def, - bool strict, bool is_inner, - const char *pin_root_path) +int parse_btf_map_def(const char *map_name, struct btf *btf, + const struct btf_type *def_t, bool strict, + struct btf_map_def *map_def, struct btf_map_def *inner_def) { const struct btf_type *t; const struct btf_member *m; + bool is_inner = inner_def == NULL; int vlen, i;
- vlen = btf_vlen(def); - m = btf_members(def); + vlen = btf_vlen(def_t); + m = btf_members(def_t); for (i = 0; i < vlen; i++, m++) { - const char *name = btf__name_by_offset(obj->btf, m->name_off); + const char *name = btf__name_by_offset(btf, m->name_off);
if (!name) { - pr_warn("map '%s': invalid field #%d.\n", map->name, i); + pr_warn("map '%s': invalid field #%d.\n", map_name, i); return -EINVAL; } if (strcmp(name, "type") == 0) { - if (!get_map_field_int(map->name, obj->btf, m, - &map->def.type)) + if (!get_map_field_int(map_name, btf, m, &map_def->map_type)) return -EINVAL; - pr_debug("map '%s': found type = %u.\n", - map->name, map->def.type); + map_def->parts |= MAP_DEF_MAP_TYPE; } else if (strcmp(name, "max_entries") == 0) { - if (!get_map_field_int(map->name, obj->btf, m, - &map->def.max_entries)) + if (!get_map_field_int(map_name, btf, m, &map_def->max_entries)) return -EINVAL; - pr_debug("map '%s': found max_entries = %u.\n", - map->name, map->def.max_entries); + map_def->parts |= MAP_DEF_MAX_ENTRIES; } else if (strcmp(name, "map_flags") == 0) { - if (!get_map_field_int(map->name, obj->btf, m, - &map->def.map_flags)) + if (!get_map_field_int(map_name, btf, m, &map_def->map_flags)) return -EINVAL; - pr_debug("map '%s': found map_flags = %u.\n", - map->name, map->def.map_flags); + map_def->parts |= MAP_DEF_MAP_FLAGS; } else if (strcmp(name, "numa_node") == 0) { - if (!get_map_field_int(map->name, obj->btf, m, &map->numa_node)) + if (!get_map_field_int(map_name, btf, m, &map_def->numa_node)) return -EINVAL; - pr_debug("map '%s': found numa_node = %u.\n", map->name, map->numa_node); + map_def->parts |= MAP_DEF_NUMA_NODE; } else if (strcmp(name, "key_size") == 0) { __u32 sz;
- if (!get_map_field_int(map->name, obj->btf, m, &sz)) + if (!get_map_field_int(map_name, btf, m, &sz)) return -EINVAL; - pr_debug("map '%s': found key_size = %u.\n", - map->name, sz); - if (map->def.key_size && map->def.key_size != sz) { + if (map_def->key_size && map_def->key_size != sz) { pr_warn("map '%s': conflicting key size %u != %u.\n", - map->name, map->def.key_size, sz); + map_name, map_def->key_size, sz); return -EINVAL; } - map->def.key_size = sz; + map_def->key_size = sz; + map_def->parts |= MAP_DEF_KEY_SIZE; } else if (strcmp(name, "key") == 0) { __s64 sz;
- t = btf__type_by_id(obj->btf, m->type); + t = btf__type_by_id(btf, m->type); if (!t) { pr_warn("map '%s': key type [%d] not found.\n", - map->name, m->type); + map_name, m->type); return -EINVAL; } if (!btf_is_ptr(t)) { pr_warn("map '%s': key spec is not PTR: %s.\n", - map->name, btf_kind_str(t)); + map_name, btf_kind_str(t)); return -EINVAL; } - sz = btf__resolve_size(obj->btf, t->type); + sz = btf__resolve_size(btf, t->type); if (sz < 0) { pr_warn("map '%s': can't determine key size for type [%u]: %zd.\n", - map->name, t->type, (ssize_t)sz); + map_name, t->type, (ssize_t)sz); return sz; } - pr_debug("map '%s': found key [%u], sz = %zd.\n", - map->name, t->type, (ssize_t)sz); - if (map->def.key_size && map->def.key_size != sz) { + if (map_def->key_size && map_def->key_size != sz) { pr_warn("map '%s': conflicting key size %u != %zd.\n", - map->name, map->def.key_size, (ssize_t)sz); + map_name, map_def->key_size, (ssize_t)sz); return -EINVAL; } - map->def.key_size = sz; - map->btf_key_type_id = t->type; + map_def->key_size = sz; + map_def->key_type_id = t->type; + map_def->parts |= MAP_DEF_KEY_SIZE | MAP_DEF_KEY_TYPE; } else if (strcmp(name, "value_size") == 0) { __u32 sz;
- if (!get_map_field_int(map->name, obj->btf, m, &sz)) + if (!get_map_field_int(map_name, btf, m, &sz)) return -EINVAL; - pr_debug("map '%s': found value_size = %u.\n", - map->name, sz); - if (map->def.value_size && map->def.value_size != sz) { + if (map_def->value_size && map_def->value_size != sz) { pr_warn("map '%s': conflicting value size %u != %u.\n", - map->name, map->def.value_size, sz); + map_name, map_def->value_size, sz); return -EINVAL; } - map->def.value_size = sz; + map_def->value_size = sz; + map_def->parts |= MAP_DEF_VALUE_SIZE; } else if (strcmp(name, "value") == 0) { __s64 sz;
- t = btf__type_by_id(obj->btf, m->type); + t = btf__type_by_id(btf, m->type); if (!t) { pr_warn("map '%s': value type [%d] not found.\n", - map->name, m->type); + map_name, m->type); return -EINVAL; } if (!btf_is_ptr(t)) { pr_warn("map '%s': value spec is not PTR: %s.\n", - map->name, btf_kind_str(t)); + map_name, btf_kind_str(t)); return -EINVAL; } - sz = btf__resolve_size(obj->btf, t->type); + sz = btf__resolve_size(btf, t->type); if (sz < 0) { pr_warn("map '%s': can't determine value size for type [%u]: %zd.\n", - map->name, t->type, (ssize_t)sz); + map_name, t->type, (ssize_t)sz); return sz; } - pr_debug("map '%s': found value [%u], sz = %zd.\n", - map->name, t->type, (ssize_t)sz); - if (map->def.value_size && map->def.value_size != sz) { + if (map_def->value_size && map_def->value_size != sz) { pr_warn("map '%s': conflicting value size %u != %zd.\n", - map->name, map->def.value_size, (ssize_t)sz); + map_name, map_def->value_size, (ssize_t)sz); return -EINVAL; } - map->def.value_size = sz; - map->btf_value_type_id = t->type; + map_def->value_size = sz; + map_def->value_type_id = t->type; + map_def->parts |= MAP_DEF_VALUE_SIZE | MAP_DEF_VALUE_TYPE; } else if (strcmp(name, "values") == 0) { + char inner_map_name[128]; int err;
if (is_inner) { pr_warn("map '%s': multi-level inner maps not supported.\n", - map->name); + map_name); return -ENOTSUP; } if (i != vlen - 1) { pr_warn("map '%s': '%s' member should be last.\n", - map->name, name); + map_name, name); return -EINVAL; } - if (!bpf_map_type__is_map_in_map(map->def.type)) { + if (!bpf_map_type__is_map_in_map(map_def->map_type)) { pr_warn("map '%s': should be map-in-map.\n", - map->name); + map_name); return -ENOTSUP; } - if (map->def.value_size && map->def.value_size != 4) { + if (map_def->value_size && map_def->value_size != 4) { pr_warn("map '%s': conflicting value size %u != 4.\n", - map->name, map->def.value_size); + map_name, map_def->value_size); return -EINVAL; } - map->def.value_size = 4; - t = btf__type_by_id(obj->btf, m->type); + map_def->value_size = 4; + t = btf__type_by_id(btf, m->type); if (!t) { pr_warn("map '%s': map-in-map inner type [%d] not found.\n", - map->name, m->type); + map_name, m->type); return -EINVAL; } if (!btf_is_array(t) || btf_array(t)->nelems) { pr_warn("map '%s': map-in-map inner spec is not a zero-sized array.\n", - map->name); + map_name); return -EINVAL; } - t = skip_mods_and_typedefs(obj->btf, btf_array(t)->type, - NULL); + t = skip_mods_and_typedefs(btf, btf_array(t)->type, NULL); if (!btf_is_ptr(t)) { pr_warn("map '%s': map-in-map inner def is of unexpected kind %s.\n", - map->name, btf_kind_str(t)); + map_name, btf_kind_str(t)); return -EINVAL; } - t = skip_mods_and_typedefs(obj->btf, t->type, NULL); + t = skip_mods_and_typedefs(btf, t->type, NULL); if (!btf_is_struct(t)) { pr_warn("map '%s': map-in-map inner def is of unexpected kind %s.\n", - map->name, btf_kind_str(t)); + map_name, btf_kind_str(t)); return -EINVAL; }
- map->inner_map = calloc(1, sizeof(*map->inner_map)); - if (!map->inner_map) - return -ENOMEM; - map->inner_map->fd = -1; - map->inner_map->sec_idx = obj->efile.btf_maps_shndx; - map->inner_map->name = malloc(strlen(map->name) + - sizeof(".inner") + 1); - if (!map->inner_map->name) - return -ENOMEM; - sprintf(map->inner_map->name, "%s.inner", map->name); - - err = parse_btf_map_def(obj, map->inner_map, t, strict, - true /* is_inner */, NULL); + snprintf(inner_map_name, sizeof(inner_map_name), "%s.inner", map_name); + err = parse_btf_map_def(inner_map_name, btf, t, strict, inner_def, NULL); if (err) return err; + + map_def->parts |= MAP_DEF_INNER_MAP; } else if (strcmp(name, "pinning") == 0) { __u32 val; - int err;
if (is_inner) { - pr_debug("map '%s': inner def can't be pinned.\n", - map->name); + pr_warn("map '%s': inner def can't be pinned.\n", map_name); return -EINVAL; } - if (!get_map_field_int(map->name, obj->btf, m, &val)) + if (!get_map_field_int(map_name, btf, m, &val)) return -EINVAL; - pr_debug("map '%s': found pinning = %u.\n", - map->name, val); - - if (val != LIBBPF_PIN_NONE && - val != LIBBPF_PIN_BY_NAME) { + if (val != LIBBPF_PIN_NONE && val != LIBBPF_PIN_BY_NAME) { pr_warn("map '%s': invalid pinning value %u.\n", - map->name, val); + map_name, val); return -EINVAL; } - if (val == LIBBPF_PIN_BY_NAME) { - err = build_map_pin_path(map, pin_root_path); - if (err) { - pr_warn("map '%s': couldn't build pin path.\n", - map->name); - return err; - } - } + map_def->pinning = val; + map_def->parts |= MAP_DEF_PINNING; } else { if (strict) { - pr_warn("map '%s': unknown field '%s'.\n", - map->name, name); + pr_warn("map '%s': unknown field '%s'.\n", map_name, name); return -ENOTSUP; } - pr_debug("map '%s': ignoring unknown field '%s'.\n", - map->name, name); + pr_debug("map '%s': ignoring unknown field '%s'.\n", map_name, name); } }
- if (map->def.type == BPF_MAP_TYPE_UNSPEC) { - pr_warn("map '%s': map type isn't specified.\n", map->name); + if (map_def->map_type == BPF_MAP_TYPE_UNSPEC) { + pr_warn("map '%s': map type isn't specified.\n", map_name); return -EINVAL; }
return 0; }
+static void fill_map_from_def(struct bpf_map *map, const struct btf_map_def *def) +{ + map->def.type = def->map_type; + map->def.key_size = def->key_size; + map->def.value_size = def->value_size; + map->def.max_entries = def->max_entries; + map->def.map_flags = def->map_flags; + + map->numa_node = def->numa_node; + map->btf_key_type_id = def->key_type_id; + map->btf_value_type_id = def->value_type_id; + + if (def->parts & MAP_DEF_MAP_TYPE) + pr_debug("map '%s': found type = %u.\n", map->name, def->map_type); + + if (def->parts & MAP_DEF_KEY_TYPE) + pr_debug("map '%s': found key [%u], sz = %u.\n", + map->name, def->key_type_id, def->key_size); + else if (def->parts & MAP_DEF_KEY_SIZE) + pr_debug("map '%s': found key_size = %u.\n", map->name, def->key_size); + + if (def->parts & MAP_DEF_VALUE_TYPE) + pr_debug("map '%s': found value [%u], sz = %u.\n", + map->name, def->value_type_id, def->value_size); + else if (def->parts & MAP_DEF_VALUE_SIZE) + pr_debug("map '%s': found value_size = %u.\n", map->name, def->value_size); + + if (def->parts & MAP_DEF_MAX_ENTRIES) + pr_debug("map '%s': found max_entries = %u.\n", map->name, def->max_entries); + if (def->parts & MAP_DEF_MAP_FLAGS) + pr_debug("map '%s': found map_flags = %u.\n", map->name, def->map_flags); + if (def->parts & MAP_DEF_PINNING) + pr_debug("map '%s': found pinning = %u.\n", map->name, def->pinning); + if (def->parts & MAP_DEF_NUMA_NODE) + pr_debug("map '%s': found numa_node = %u.\n", map->name, def->numa_node); + + if (def->parts & MAP_DEF_INNER_MAP) + pr_debug("map '%s': found inner map definition.\n", map->name); +} + static int bpf_object__init_user_btf_map(struct bpf_object *obj, const struct btf_type *sec, int var_idx, int sec_idx, const Elf_Data *data, bool strict, const char *pin_root_path) { + struct btf_map_def map_def = {}, inner_def = {}; const struct btf_type *var, *def; const struct btf_var_secinfo *vi; const struct btf_var *var_extra; const char *map_name; struct bpf_map *map; + int err;
vi = btf_var_secinfos(sec) + var_idx; var = btf__type_by_id(obj->btf, vi->type); @@ -2327,7 +2334,35 @@ static int bpf_object__init_user_btf_map(struct bpf_object *obj, pr_debug("map '%s': at sec_idx %d, offset %zu.\n", map_name, map->sec_idx, map->sec_offset);
- return parse_btf_map_def(obj, map, def, strict, false, pin_root_path); + err = parse_btf_map_def(map->name, obj->btf, def, strict, &map_def, &inner_def); + if (err) + return err; + + fill_map_from_def(map, &map_def); + + if (map_def.pinning == LIBBPF_PIN_BY_NAME) { + err = build_map_pin_path(map, pin_root_path); + if (err) { + pr_warn("map '%s': couldn't build pin path.\n", map->name); + return err; + } + } + + if (map_def.parts & MAP_DEF_INNER_MAP) { + map->inner_map = calloc(1, sizeof(*map->inner_map)); + if (!map->inner_map) + return -ENOMEM; + map->inner_map->fd = -1; + map->inner_map->sec_idx = sec_idx; + map->inner_map->name = malloc(strlen(map_name) + sizeof(".inner") + 1); + if (!map->inner_map->name) + return -ENOMEM; + sprintf(map->inner_map->name, "%s.inner", map_name); + + fill_map_from_def(map->inner_map, &inner_def); + } + + return 0; }
static int bpf_object__init_user_btf_maps(struct bpf_object *obj, bool strict, diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 92b7eae10c6d..17883073710c 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -138,6 +138,38 @@ static inline __u32 btf_type_info(int kind, int vlen, int kflag) return (kflag << 31) | (kind << 24) | vlen; }
+enum map_def_parts { + MAP_DEF_MAP_TYPE = 0x001, + MAP_DEF_KEY_TYPE = 0x002, + MAP_DEF_KEY_SIZE = 0x004, + MAP_DEF_VALUE_TYPE = 0x008, + MAP_DEF_VALUE_SIZE = 0x010, + MAP_DEF_MAX_ENTRIES = 0x020, + MAP_DEF_MAP_FLAGS = 0x040, + MAP_DEF_NUMA_NODE = 0x080, + MAP_DEF_PINNING = 0x100, + MAP_DEF_INNER_MAP = 0x200, + + MAP_DEF_ALL = 0x3ff, /* combination of all above */ +}; + +struct btf_map_def { + enum map_def_parts parts; + __u32 map_type; + __u32 key_type_id; + __u32 key_size; + __u32 value_type_id; + __u32 value_size; + __u32 max_entries; + __u32 map_flags; + __u32 numa_node; + __u32 pinning; +}; + +int parse_btf_map_def(const char *map_name, struct btf *btf, + const struct btf_type *def_t, bool strict, + struct btf_map_def *map_def, struct btf_map_def *inner_def); + void *libbpf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t cur_cnt, size_t max_cnt, size_t add_cnt); int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit beaa3711ada4e4a0c8e03a78fec72330185213bf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Factor out logic for sanity checking SHT_SYMTAB and SHT_REL sections into separate sections. They are already quite extensive and are suffering from too deep indentation. Subsequent changes will extend SYMTAB sanity checking further, so it's better to factor each into a separate function.
No functional changes are intended.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-8-andrii@kernel.org (cherry picked from commit beaa3711ada4e4a0c8e03a78fec72330185213bf) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 233 ++++++++++++++++++++++------------------- 1 file changed, 127 insertions(+), 106 deletions(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 46b16cbdcda3..b3639bd1ba1e 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -131,6 +131,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file);
static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, struct src_obj *obj); static int linker_sanity_check_elf(struct src_obj *obj); +static int linker_sanity_check_elf_symtab(struct src_obj *obj, struct src_sec *sec); +static int linker_sanity_check_elf_relos(struct src_obj *obj, struct src_sec *sec); static int linker_sanity_check_btf(struct src_obj *obj); static int linker_sanity_check_btf_ext(struct src_obj *obj); static int linker_fixup_btf(struct src_obj *obj); @@ -663,8 +665,8 @@ static bool is_pow_of_2(size_t x)
static int linker_sanity_check_elf(struct src_obj *obj) { - struct src_sec *sec, *link_sec; - int i, j, n; + struct src_sec *sec; + int i, err;
if (!obj->symtab_sec_idx) { pr_warn("ELF is missing SYMTAB section in %s\n", obj->filename); @@ -692,43 +694,11 @@ static int linker_sanity_check_elf(struct src_obj *obj) return -EINVAL;
switch (sec->shdr->sh_type) { - case SHT_SYMTAB: { - Elf64_Sym *sym; - - if (sec->shdr->sh_entsize != sizeof(Elf64_Sym)) - return -EINVAL; - if (sec->shdr->sh_size % sec->shdr->sh_entsize != 0) - return -EINVAL; - - if (!sec->shdr->sh_link || sec->shdr->sh_link >= obj->sec_cnt) { - pr_warn("ELF SYMTAB section #%zu points to missing STRTAB section #%zu in %s\n", - sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); - return -EINVAL; - } - link_sec = &obj->secs[sec->shdr->sh_link]; - if (link_sec->shdr->sh_type != SHT_STRTAB) { - pr_warn("ELF SYMTAB section #%zu points to invalid STRTAB section #%zu in %s\n", - sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); - return -EINVAL; - } - - n = sec->shdr->sh_size / sec->shdr->sh_entsize; - sym = sec->data->d_buf; - for (j = 0; j < n; j++, sym++) { - if (sym->st_shndx - && sym->st_shndx < SHN_LORESERVE - && sym->st_shndx >= obj->sec_cnt) { - pr_warn("ELF sym #%d in section #%zu points to missing section #%zu in %s\n", - j, sec->sec_idx, (size_t)sym->st_shndx, obj->filename); - return -EINVAL; - } - if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) { - if (sym->st_value != 0) - return -EINVAL; - } - } + case SHT_SYMTAB: + err = linker_sanity_check_elf_symtab(obj, sec); + if (err) + return err; break; - } case SHT_STRTAB: break; case SHT_PROGBITS: @@ -739,87 +709,138 @@ static int linker_sanity_check_elf(struct src_obj *obj) break; case SHT_NOBITS: break; - case SHT_REL: { - Elf64_Rel *relo; - struct src_sec *sym_sec; + case SHT_REL: + err = linker_sanity_check_elf_relos(obj, sec); + if (err) + return err; + break; + case SHT_LLVM_ADDRSIG: + break; + default: + pr_warn("ELF section #%zu (%s) has unrecognized type %zu in %s\n", + sec->sec_idx, sec->sec_name, (size_t)sec->shdr->sh_type, obj->filename); + return -EINVAL; + } + }
- if (sec->shdr->sh_entsize != sizeof(Elf64_Rel)) - return -EINVAL; - if (sec->shdr->sh_size % sec->shdr->sh_entsize != 0) - return -EINVAL; + return 0; +}
- /* SHT_REL's sh_link should point to SYMTAB */ - if (sec->shdr->sh_link != obj->symtab_sec_idx) { - pr_warn("ELF relo section #%zu points to invalid SYMTAB section #%zu in %s\n", - sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); - return -EINVAL; - } +static int linker_sanity_check_elf_symtab(struct src_obj *obj, struct src_sec *sec) +{ + struct src_sec *link_sec; + Elf64_Sym *sym; + int i, n;
- /* SHT_REL's sh_info points to relocated section */ - if (!sec->shdr->sh_info || sec->shdr->sh_info >= obj->sec_cnt) { - pr_warn("ELF relo section #%zu points to missing section #%zu in %s\n", - sec->sec_idx, (size_t)sec->shdr->sh_info, obj->filename); - return -EINVAL; - } - link_sec = &obj->secs[sec->shdr->sh_info]; + if (sec->shdr->sh_entsize != sizeof(Elf64_Sym)) + return -EINVAL; + if (sec->shdr->sh_size % sec->shdr->sh_entsize != 0) + return -EINVAL; + + if (!sec->shdr->sh_link || sec->shdr->sh_link >= obj->sec_cnt) { + pr_warn("ELF SYMTAB section #%zu points to missing STRTAB section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); + return -EINVAL; + } + link_sec = &obj->secs[sec->shdr->sh_link]; + if (link_sec->shdr->sh_type != SHT_STRTAB) { + pr_warn("ELF SYMTAB section #%zu points to invalid STRTAB section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); + return -EINVAL; + }
- /* .rel<secname> -> <secname> pattern is followed */ - if (strncmp(sec->sec_name, ".rel", sizeof(".rel") - 1) != 0 - || strcmp(sec->sec_name + sizeof(".rel") - 1, link_sec->sec_name) != 0) { - pr_warn("ELF relo section #%zu name has invalid name in %s\n", - sec->sec_idx, obj->filename); + n = sec->shdr->sh_size / sec->shdr->sh_entsize; + sym = sec->data->d_buf; + for (i = 0; i < n; i++, sym++) { + if (sym->st_shndx + && sym->st_shndx < SHN_LORESERVE + && sym->st_shndx >= obj->sec_cnt) { + pr_warn("ELF sym #%d in section #%zu points to missing section #%zu in %s\n", + i, sec->sec_idx, (size_t)sym->st_shndx, obj->filename); + return -EINVAL; + } + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) { + if (sym->st_value != 0) return -EINVAL; - } + continue; + } + }
- /* don't further validate relocations for ignored sections */ - if (link_sec->skipped) - break; + return 0; +}
- /* relocatable section is data or instructions */ - if (link_sec->shdr->sh_type != SHT_PROGBITS - && link_sec->shdr->sh_type != SHT_NOBITS) { - pr_warn("ELF relo section #%zu points to invalid section #%zu in %s\n", - sec->sec_idx, (size_t)sec->shdr->sh_info, obj->filename); - return -EINVAL; - } +static int linker_sanity_check_elf_relos(struct src_obj *obj, struct src_sec *sec) +{ + struct src_sec *link_sec, *sym_sec; + Elf64_Rel *relo; + int i, n;
- /* check sanity of each relocation */ - n = sec->shdr->sh_size / sec->shdr->sh_entsize; - relo = sec->data->d_buf; - sym_sec = &obj->secs[obj->symtab_sec_idx]; - for (j = 0; j < n; j++, relo++) { - size_t sym_idx = ELF64_R_SYM(relo->r_info); - size_t sym_type = ELF64_R_TYPE(relo->r_info); - - if (sym_type != R_BPF_64_64 && sym_type != R_BPF_64_32) { - pr_warn("ELF relo #%d in section #%zu has unexpected type %zu in %s\n", - j, sec->sec_idx, sym_type, obj->filename); - return -EINVAL; - } + if (sec->shdr->sh_entsize != sizeof(Elf64_Rel)) + return -EINVAL; + if (sec->shdr->sh_size % sec->shdr->sh_entsize != 0) + return -EINVAL;
- if (!sym_idx || sym_idx * sizeof(Elf64_Sym) >= sym_sec->shdr->sh_size) { - pr_warn("ELF relo #%d in section #%zu points to invalid symbol #%zu in %s\n", - j, sec->sec_idx, sym_idx, obj->filename); - return -EINVAL; - } + /* SHT_REL's sh_link should point to SYMTAB */ + if (sec->shdr->sh_link != obj->symtab_sec_idx) { + pr_warn("ELF relo section #%zu points to invalid SYMTAB section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_link, obj->filename); + return -EINVAL; + }
- if (link_sec->shdr->sh_flags & SHF_EXECINSTR) { - if (relo->r_offset % sizeof(struct bpf_insn) != 0) { - pr_warn("ELF relo #%d in section #%zu points to missing symbol #%zu in %s\n", - j, sec->sec_idx, sym_idx, obj->filename); - return -EINVAL; - } - } - } - break; + /* SHT_REL's sh_info points to relocated section */ + if (!sec->shdr->sh_info || sec->shdr->sh_info >= obj->sec_cnt) { + pr_warn("ELF relo section #%zu points to missing section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_info, obj->filename); + return -EINVAL; + } + link_sec = &obj->secs[sec->shdr->sh_info]; + + /* .rel<secname> -> <secname> pattern is followed */ + if (strncmp(sec->sec_name, ".rel", sizeof(".rel") - 1) != 0 + || strcmp(sec->sec_name + sizeof(".rel") - 1, link_sec->sec_name) != 0) { + pr_warn("ELF relo section #%zu name has invalid name in %s\n", + sec->sec_idx, obj->filename); + return -EINVAL; + } + + /* don't further validate relocations for ignored sections */ + if (link_sec->skipped) + return 0; + + /* relocatable section is data or instructions */ + if (link_sec->shdr->sh_type != SHT_PROGBITS && link_sec->shdr->sh_type != SHT_NOBITS) { + pr_warn("ELF relo section #%zu points to invalid section #%zu in %s\n", + sec->sec_idx, (size_t)sec->shdr->sh_info, obj->filename); + return -EINVAL; + } + + /* check sanity of each relocation */ + n = sec->shdr->sh_size / sec->shdr->sh_entsize; + relo = sec->data->d_buf; + sym_sec = &obj->secs[obj->symtab_sec_idx]; + for (i = 0; i < n; i++, relo++) { + size_t sym_idx = ELF64_R_SYM(relo->r_info); + size_t sym_type = ELF64_R_TYPE(relo->r_info); + + if (sym_type != R_BPF_64_64 && sym_type != R_BPF_64_32) { + pr_warn("ELF relo #%d in section #%zu has unexpected type %zu in %s\n", + i, sec->sec_idx, sym_type, obj->filename); + return -EINVAL; } - case SHT_LLVM_ADDRSIG: - break; - default: - pr_warn("ELF section #%zu (%s) has unrecognized type %zu in %s\n", - sec->sec_idx, sec->sec_name, (size_t)sec->shdr->sh_type, obj->filename); + + if (!sym_idx || sym_idx * sizeof(Elf64_Sym) >= sym_sec->shdr->sh_size) { + pr_warn("ELF relo #%d in section #%zu points to invalid symbol #%zu in %s\n", + i, sec->sec_idx, sym_idx, obj->filename); return -EINVAL; } + + if (link_sec->shdr->sh_flags & SHF_EXECINSTR) { + if (relo->r_offset % sizeof(struct bpf_insn) != 0) { + pr_warn("ELF relo #%d in section #%zu points to missing symbol #%zu in %s\n", + i, sec->sec_idx, sym_idx, obj->filename); + return -EINVAL; + } + } }
return 0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 42869d28527695a75346c988ceeedbba7e3880b7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Make skip_mods_and_typedefs(), btf_kind_str(), and btf_func_linkage() helpers available outside of libbpf.c, to be used by static linker code.
Also do few cleanups (error code fixes, comment clean up, etc) that don't deserve their own commit.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-9-andrii@kernel.org (cherry picked from commit 42869d28527695a75346c988ceeedbba7e3880b7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 13 +++---------- tools/lib/bpf/libbpf_internal.h | 7 +++++++ tools/lib/bpf/linker.c | 14 ++++++-------- 3 files changed, 16 insertions(+), 18 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2d5d86928a32..e79f793dd4bf 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -69,8 +69,6 @@ #define __printf(a, b) __attribute__((format(printf, a, b)))
static struct bpf_map *bpf_object__add_map(struct bpf_object *obj); -static const struct btf_type * -skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id); static bool prog_is_subprog(const struct bpf_object *obj, const struct bpf_program *prog);
static int __base_pr(enum libbpf_print_level level, const char *format, @@ -1906,7 +1904,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) return 0; }
-static const struct btf_type * +const struct btf_type * skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id) { const struct btf_type *t = btf__type_by_id(btf, id); @@ -1961,16 +1959,11 @@ static const char *__btf_kind_str(__u16 kind) } }
-static const char *btf_kind_str(const struct btf_type *t) +const char *btf_kind_str(const struct btf_type *t) { return __btf_kind_str(btf_kind(t)); }
-static enum btf_func_linkage btf_func_linkage(const struct btf_type *t) -{ - return (enum btf_func_linkage)BTF_INFO_VLEN(t->info); -} - /* * Fetch integer attribute of BTF map definition. Such attributes are * represented using a pointer to an array, in which dimensionality of array @@ -7090,7 +7083,7 @@ static bool insn_is_helper_call(struct bpf_insn *insn, enum bpf_func_id *func_id return false; }
-static int bpf_object__sanitize_prog(struct bpf_object* obj, struct bpf_program *prog) +static int bpf_object__sanitize_prog(struct bpf_object *obj, struct bpf_program *prog) { struct bpf_insn *insn = prog->insns; enum bpf_func_id func_id; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 17883073710c..ee426226928f 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -132,6 +132,13 @@ struct btf; struct btf_type;
struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id); +const char *btf_kind_str(const struct btf_type *t); +const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id); + +static inline enum btf_func_linkage btf_func_linkage(const struct btf_type *t) +{ + return (enum btf_func_linkage)(int)btf_vlen(t); +}
static inline __u32 btf_type_info(int kind, int vlen, int kflag) { diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index b3639bd1ba1e..9998b151d1d5 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -123,9 +123,7 @@ struct bpf_linker { };
#define pr_warn_elf(fmt, ...) \ -do { \ - libbpf_print(LIBBPF_WARN, "libbpf: " fmt ": %s\n", ##__VA_ARGS__, elf_errmsg(-1)); \ -} while (0) + libbpf_print(LIBBPF_WARN, "libbpf: " fmt ": %s\n", ##__VA_ARGS__, elf_errmsg(-1))
static int init_output_elf(struct bpf_linker *linker, const char *file);
@@ -284,7 +282,7 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
/* ELF header */ linker->elf_hdr = elf64_newehdr(linker->elf); - if (!linker->elf_hdr){ + if (!linker->elf_hdr) { pr_warn_elf("failed to create ELF header"); return -EINVAL; } @@ -925,13 +923,13 @@ static int init_sec(struct bpf_linker *linker, struct dst_sec *dst_sec, struct s
scn = elf_newscn(linker->elf); if (!scn) - return -1; + return -ENOMEM; data = elf_newdata(scn); if (!data) - return -1; + return -ENOMEM; shdr = elf64_getshdr(scn); if (!shdr) - return -1; + return -ENOMEM;
dst_sec->scn = scn; dst_sec->shdr = shdr; @@ -1221,7 +1219,7 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob return err; } } else if (!secs_match(dst_sec, src_sec)) { - pr_warn("Secs %s are not compatible\n", src_sec->sec_name); + pr_warn("sections %s are not compatible\n", src_sec->sec_name); return -1; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 386b1d241e1b975a239d33be836bc183a52ab18c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add logic to validate extern symbols, plus some other minor extra checks, like ELF symbol #0 validation, general symbol visibility and binding validations.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-10-andrii@kernel.org (cherry picked from commit 386b1d241e1b975a239d33be836bc183a52ab18c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 49 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 9998b151d1d5..a5971eb36fae 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -750,14 +750,45 @@ static int linker_sanity_check_elf_symtab(struct src_obj *obj, struct src_sec *s n = sec->shdr->sh_size / sec->shdr->sh_entsize; sym = sec->data->d_buf; for (i = 0; i < n; i++, sym++) { - if (sym->st_shndx - && sym->st_shndx < SHN_LORESERVE - && sym->st_shndx >= obj->sec_cnt) { + int sym_type = ELF64_ST_TYPE(sym->st_info); + int sym_bind = ELF64_ST_BIND(sym->st_info); + int sym_vis = ELF64_ST_VISIBILITY(sym->st_other); + + if (i == 0) { + if (sym->st_name != 0 || sym->st_info != 0 + || sym->st_other != 0 || sym->st_shndx != 0 + || sym->st_value != 0 || sym->st_size != 0) { + pr_warn("ELF sym #0 is invalid in %s\n", obj->filename); + return -EINVAL; + } + continue; + } + if (sym_bind != STB_LOCAL && sym_bind != STB_GLOBAL && sym_bind != STB_WEAK) { + pr_warn("ELF sym #%d in section #%zu has unsupported symbol binding %d\n", + i, sec->sec_idx, sym_bind); + return -EINVAL; + } + if (sym_vis != STV_DEFAULT && sym_vis != STV_HIDDEN) { + pr_warn("ELF sym #%d in section #%zu has unsupported symbol visibility %d\n", + i, sec->sec_idx, sym_vis); + return -EINVAL; + } + if (sym->st_shndx == 0) { + if (sym_type != STT_NOTYPE || sym_bind == STB_LOCAL + || sym->st_value != 0 || sym->st_size != 0) { + pr_warn("ELF sym #%d is invalid extern symbol in %s\n", + i, obj->filename); + + return -EINVAL; + } + continue; + } + if (sym->st_shndx < SHN_LORESERVE && sym->st_shndx >= obj->sec_cnt) { pr_warn("ELF sym #%d in section #%zu points to missing section #%zu in %s\n", i, sec->sec_idx, (size_t)sym->st_shndx, obj->filename); return -EINVAL; } - if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) { + if (sym_type == STT_SECTION) { if (sym->st_value != 0) return -EINVAL; continue; @@ -1135,16 +1166,16 @@ static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj size_t dst_sym_idx; int name_off;
- /* we already have all-zero initial symbol */ - if (sym->st_name == 0 && sym->st_info == 0 && - sym->st_other == 0 && sym->st_shndx == SHN_UNDEF && - sym->st_value == 0 && sym->st_size ==0) + /* We already validated all-zero symbol #0 and we already + * appended it preventively to the final SYMTAB, so skip it. + */ + if (i == 0) continue;
sym_name = elf_strptr(obj->elf, str_sec_idx, sym->st_name); if (!sym_name) { pr_warn("can't fetch symbol name for symbol #%d in '%s'\n", i, obj->filename); - return -1; + return -EINVAL; }
if (sym->st_shndx && sym->st_shndx < SHN_LORESERVE) {
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 83a157279f2125ce1c4d6d93750440853746dff0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It should never fail, but if it does, it's better to know about this rather than end up with nonsensical type IDs.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-11-andrii@kernel.org (cherry picked from commit 83a157279f2125ce1c4d6d93750440853746dff0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index a5971eb36fae..f19a15892654 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1429,6 +1429,13 @@ static int linker_fixup_btf(struct src_obj *obj) static int remap_type_id(__u32 *type_id, void *ctx) { int *id_map = ctx; + int new_id = id_map[*type_id]; + + /* Error out if the type wasn't remapped. Ignore VOID which stays VOID. */ + if (new_id == 0 && *type_id != 0) { + pr_warn("failed to find new ID mapping for original BTF type ID %u\n", *type_id); + return -EINVAL; + }
*type_id = id_map[*type_id];
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit a46349227cd832b33c12f74b85712ea67de3c6c4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BPF static linker logic to resolve extern variables and functions across multiple linked together BPF object files.
For that, linker maintains a separate list of struct glob_sym structures, which keeps track of few pieces of metadata (is it extern or resolved global, is it a weak symbol, which ELF section it belongs to, etc) and ties together BTF type info and ELF symbol information and keeps them in sync.
With adding support for extern variables/funcs, it's now possible for some sections to contain both extern and non-extern definitions. This means that some sections may start out as ephemeral (if only externs are present and thus there is not corresponding ELF section), but will be "upgraded" to actual ELF section as symbols are resolved or new non-extern definitions are appended.
Additional care is taken to not duplicate extern entries in sections like .kconfig and .ksyms.
Given libbpf requires BTF type to always be present for .kconfig/.ksym externs, linker extends this requirement to all the externs, even those that are supposed to be resolved during static linking and which won't be visible to libbpf. With BTF information always present, static linker will check not just ELF symbol matches, but entire BTF type signature match as well. That logic is stricter that BPF CO-RE checks. It probably should be re-used by .ksym resolution logic in libbpf as well, but that's left for follow up patches.
To make it unnecessary to rewrite ELF symbols and minimize BTF type rewriting/removal, ELF symbols that correspond to externs initially will be updated in place once they are resolved. Similarly for BTF type info, VAR/FUNC and var_secinfo's (sec_vars in struct bpf_linker) are staying stable, but types they point to might get replaced when extern is resolved. This might leave some left-over types (even though we try to minimize this for common cases of having extern funcs with not argument names vs concrete function with names properly specified). That can be addresses later with a generic BTF garbage collection. That's left for a follow up as well.
Given BTF type appending phase is separate from ELF symbol appending/resolution, special struct glob_sym->underlying_btf_id variable is used to communicate resolution and rewrite decisions. 0 means underlying_btf_id needs to be appended (it's not yet in final linker->btf), <0 values are used for temporary storage of source BTF type ID (not yet rewritten), so -glob_sym->underlying_btf_id is BTF type id in obj-btf. But by the end of linker_append_btf() phase, that underlying_btf_id will be remapped and will always be > 0. This is the uglies part of the whole process, but keeps the other parts much simpler due to stability of sec_var and VAR/FUNC types, as well as ELF symbol, so please keep that in mind while reviewing.
BTF-defined maps require some extra custom logic and is addressed separate in the next patch, so that to keep this one smaller and easier to review.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-12-andrii@kernel.org (cherry picked from commit a46349227cd832b33c12f74b85712ea67de3c6c4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 849 ++++++++++++++++++++++++++++++++++++++--- 1 file changed, 790 insertions(+), 59 deletions(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index f19a15892654..be521cf3c26e 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -22,6 +22,8 @@ #include "libbpf_internal.h" #include "strset.h"
+#define BTF_EXTERN_SEC ".extern" + struct src_sec { const char *sec_name; /* positional (not necessarily ELF) index in an array of sections */ @@ -74,11 +76,36 @@ struct btf_ext_sec_data { void *recs; };
+struct glob_sym { + /* ELF symbol index */ + int sym_idx; + /* associated section id for .ksyms, .kconfig, etc, but not .extern */ + int sec_id; + /* extern name offset in STRTAB */ + int name_off; + /* optional associated BTF type ID */ + int btf_id; + /* BTF type ID to which VAR/FUNC type is pointing to; used for + * rewriting types when extern VAR/FUNC is resolved to a concrete + * definition + */ + int underlying_btf_id; + /* sec_var index in the corresponding dst_sec, if exists */ + int var_idx; + + /* extern or resolved/global symbol */ + bool is_extern; + /* weak or strong symbol, never goes back from strong to weak */ + bool is_weak; +}; + struct dst_sec { char *sec_name; /* positional (not necessarily ELF) index in an array of sections */ int id;
+ bool ephemeral; + /* ELF info */ size_t sec_idx; Elf_Scn *scn; @@ -120,6 +147,10 @@ struct bpf_linker {
struct btf *btf; struct btf_ext *btf_ext; + + /* global (including extern) ELF symbols */ + int glob_sym_cnt; + struct glob_sym *glob_syms; };
#define pr_warn_elf(fmt, ...) \ @@ -136,6 +167,8 @@ static int linker_sanity_check_btf_ext(struct src_obj *obj); static int linker_fixup_btf(struct src_obj *obj); static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj); static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj); +static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj, + Elf64_Sym *sym, const char *sym_name, int src_sym_idx); static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *obj); static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj); static int linker_append_btf_ext(struct bpf_linker *linker, struct src_obj *obj); @@ -947,6 +980,7 @@ static int init_sec(struct bpf_linker *linker, struct dst_sec *dst_sec, struct s
dst_sec->sec_sz = 0; dst_sec->sec_idx = 0; + dst_sec->ephemeral = src_sec->ephemeral;
/* ephemeral sections are just thin section shells lacking most parts */ if (src_sec->ephemeral) @@ -1010,6 +1044,9 @@ static struct dst_sec *find_dst_sec_by_name(struct bpf_linker *linker, const cha
static bool secs_match(struct dst_sec *dst, struct src_sec *src) { + if (dst->ephemeral || src->ephemeral) + return true; + if (dst->shdr->sh_type != src->shdr->sh_type) { pr_warn("sec %s types mismatch\n", dst->sec_name); return false; @@ -1035,13 +1072,33 @@ static bool sec_content_is_same(struct dst_sec *dst_sec, struct src_sec *src_sec return true; }
-static int extend_sec(struct dst_sec *dst, struct src_sec *src) +static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src_sec *src) { void *tmp; - size_t dst_align = dst->shdr->sh_addralign; - size_t src_align = src->shdr->sh_addralign; + size_t dst_align, src_align; size_t dst_align_sz, dst_final_sz; + int err; + + /* Ephemeral source section doesn't contribute anything to ELF + * section data. + */ + if (src->ephemeral) + return 0; + + /* Some sections (like .maps) can contain both externs (and thus be + * ephemeral) and non-externs (map definitions). So it's possible that + * it has to be "upgraded" from ephemeral to non-ephemeral when the + * first non-ephemeral entity appears. In such case, we add ELF + * section, data, etc. + */ + if (dst->ephemeral) { + err = init_sec(linker, dst, src); + if (err) + return err; + }
+ dst_align = dst->shdr->sh_addralign; + src_align = src->shdr->sh_addralign; if (dst_align == 0) dst_align = 1; if (dst_align < src_align) @@ -1137,10 +1194,7 @@ static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj /* record mapped section index */ src_sec->dst_id = dst_sec->id;
- if (src_sec->ephemeral) - continue; - - err = extend_sec(dst_sec, src_sec); + err = extend_sec(linker, dst_sec, src_sec); if (err) return err; } @@ -1151,21 +1205,16 @@ static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj) { struct src_sec *symtab = &obj->secs[obj->symtab_sec_idx]; - Elf64_Sym *sym = symtab->data->d_buf, *dst_sym; - int i, n = symtab->shdr->sh_size / symtab->shdr->sh_entsize; + Elf64_Sym *sym = symtab->data->d_buf; + int i, n = symtab->shdr->sh_size / symtab->shdr->sh_entsize, err; int str_sec_idx = symtab->shdr->sh_link; + const char *sym_name;
obj->sym_map = calloc(n + 1, sizeof(*obj->sym_map)); if (!obj->sym_map) return -ENOMEM;
for (i = 0; i < n; i++, sym++) { - struct src_sec *src_sec = NULL; - struct dst_sec *dst_sec = NULL; - const char *sym_name; - size_t dst_sym_idx; - int name_off; - /* We already validated all-zero symbol #0 and we already * appended it preventively to the final SYMTAB, so skip it. */ @@ -1178,41 +1227,624 @@ static int linker_append_elf_syms(struct bpf_linker *linker, struct src_obj *obj return -EINVAL; }
- if (sym->st_shndx && sym->st_shndx < SHN_LORESERVE) { - src_sec = &obj->secs[sym->st_shndx]; - if (src_sec->skipped) + err = linker_append_elf_sym(linker, obj, sym, sym_name, i); + if (err) + return err; + } + + return 0; +} + +static Elf64_Sym *get_sym_by_idx(struct bpf_linker *linker, size_t sym_idx) +{ + struct dst_sec *symtab = &linker->secs[linker->symtab_sec_idx]; + Elf64_Sym *syms = symtab->raw_data; + + return &syms[sym_idx]; +} + +static struct glob_sym *find_glob_sym(struct bpf_linker *linker, const char *sym_name) +{ + struct glob_sym *glob_sym; + const char *name; + int i; + + for (i = 0; i < linker->glob_sym_cnt; i++) { + glob_sym = &linker->glob_syms[i]; + name = strset__data(linker->strtab_strs) + glob_sym->name_off; + + if (strcmp(name, sym_name) == 0) + return glob_sym; + } + + return NULL; +} + +static struct glob_sym *add_glob_sym(struct bpf_linker *linker) +{ + struct glob_sym *syms, *sym; + + syms = libbpf_reallocarray(linker->glob_syms, linker->glob_sym_cnt + 1, + sizeof(*linker->glob_syms)); + if (!syms) + return NULL; + + sym = &syms[linker->glob_sym_cnt]; + memset(sym, 0, sizeof(*sym)); + sym->var_idx = -1; + + linker->glob_syms = syms; + linker->glob_sym_cnt++; + + return sym; +} + +static bool glob_sym_btf_matches(const char *sym_name, bool exact, + const struct btf *btf1, __u32 id1, + const struct btf *btf2, __u32 id2) +{ + const struct btf_type *t1, *t2; + bool is_static1, is_static2; + const char *n1, *n2; + int i, n; + +recur: + n1 = n2 = NULL; + t1 = skip_mods_and_typedefs(btf1, id1, &id1); + t2 = skip_mods_and_typedefs(btf2, id2, &id2); + + /* check if only one side is FWD, otherwise handle with common logic */ + if (!exact && btf_is_fwd(t1) != btf_is_fwd(t2)) { + n1 = btf__str_by_offset(btf1, t1->name_off); + n2 = btf__str_by_offset(btf2, t2->name_off); + if (strcmp(n1, n2) != 0) { + pr_warn("global '%s': incompatible forward declaration names '%s' and '%s'\n", + sym_name, n1, n2); + return false; + } + /* validate if FWD kind matches concrete kind */ + if (btf_is_fwd(t1)) { + if (btf_kflag(t1) && btf_is_union(t2)) + return true; + if (!btf_kflag(t1) && btf_is_struct(t2)) + return true; + pr_warn("global '%s': incompatible %s forward declaration and concrete kind %s\n", + sym_name, btf_kflag(t1) ? "union" : "struct", btf_kind_str(t2)); + } else { + if (btf_kflag(t2) && btf_is_union(t1)) + return true; + if (!btf_kflag(t2) && btf_is_struct(t1)) + return true; + pr_warn("global '%s': incompatible %s forward declaration and concrete kind %s\n", + sym_name, btf_kflag(t2) ? "union" : "struct", btf_kind_str(t1)); + } + return false; + } + + if (btf_kind(t1) != btf_kind(t2)) { + pr_warn("global '%s': incompatible BTF kinds %s and %s\n", + sym_name, btf_kind_str(t1), btf_kind_str(t2)); + return false; + } + + switch (btf_kind(t1)) { + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + case BTF_KIND_ENUM: + case BTF_KIND_FWD: + case BTF_KIND_FUNC: + case BTF_KIND_VAR: + n1 = btf__str_by_offset(btf1, t1->name_off); + n2 = btf__str_by_offset(btf2, t2->name_off); + if (strcmp(n1, n2) != 0) { + pr_warn("global '%s': incompatible %s names '%s' and '%s'\n", + sym_name, btf_kind_str(t1), n1, n2); + return false; + } + break; + default: + break; + } + + switch (btf_kind(t1)) { + case BTF_KIND_UNKN: /* void */ + case BTF_KIND_FWD: + return true; + case BTF_KIND_INT: + case BTF_KIND_FLOAT: + case BTF_KIND_ENUM: + /* ignore encoding for int and enum values for enum */ + if (t1->size != t2->size) { + pr_warn("global '%s': incompatible %s '%s' size %u and %u\n", + sym_name, btf_kind_str(t1), n1, t1->size, t2->size); + return false; + } + return true; + case BTF_KIND_PTR: + /* just validate overall shape of the referenced type, so no + * contents comparison for struct/union, and allowd fwd vs + * struct/union + */ + exact = false; + id1 = t1->type; + id2 = t2->type; + goto recur; + case BTF_KIND_ARRAY: + /* ignore index type and array size */ + id1 = btf_array(t1)->type; + id2 = btf_array(t2)->type; + goto recur; + case BTF_KIND_FUNC: + /* extern and global linkages are compatible */ + is_static1 = btf_func_linkage(t1) == BTF_FUNC_STATIC; + is_static2 = btf_func_linkage(t2) == BTF_FUNC_STATIC; + if (is_static1 != is_static2) { + pr_warn("global '%s': incompatible func '%s' linkage\n", sym_name, n1); + return false; + } + + id1 = t1->type; + id2 = t2->type; + goto recur; + case BTF_KIND_VAR: + /* extern and global linkages are compatible */ + is_static1 = btf_var(t1)->linkage == BTF_VAR_STATIC; + is_static2 = btf_var(t2)->linkage == BTF_VAR_STATIC; + if (is_static1 != is_static2) { + pr_warn("global '%s': incompatible var '%s' linkage\n", sym_name, n1); + return false; + } + + id1 = t1->type; + id2 = t2->type; + goto recur; + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: { + const struct btf_member *m1, *m2; + + if (!exact) + return true; + + if (btf_vlen(t1) != btf_vlen(t2)) { + pr_warn("global '%s': incompatible number of %s fields %u and %u\n", + sym_name, btf_kind_str(t1), btf_vlen(t1), btf_vlen(t2)); + return false; + } + + n = btf_vlen(t1); + m1 = btf_members(t1); + m2 = btf_members(t2); + for (i = 0; i < n; i++, m1++, m2++) { + n1 = btf__str_by_offset(btf1, m1->name_off); + n2 = btf__str_by_offset(btf2, m2->name_off); + if (strcmp(n1, n2) != 0) { + pr_warn("global '%s': incompatible field #%d names '%s' and '%s'\n", + sym_name, i, n1, n2); + return false; + } + if (m1->offset != m2->offset) { + pr_warn("global '%s': incompatible field #%d ('%s') offsets\n", + sym_name, i, n1); + return false; + } + if (!glob_sym_btf_matches(sym_name, exact, btf1, m1->type, btf2, m2->type)) + return false; + } + + return true; + } + case BTF_KIND_FUNC_PROTO: { + const struct btf_param *m1, *m2; + + if (btf_vlen(t1) != btf_vlen(t2)) { + pr_warn("global '%s': incompatible number of %s params %u and %u\n", + sym_name, btf_kind_str(t1), btf_vlen(t1), btf_vlen(t2)); + return false; + } + + n = btf_vlen(t1); + m1 = btf_params(t1); + m2 = btf_params(t2); + for (i = 0; i < n; i++, m1++, m2++) { + /* ignore func arg names */ + if (!glob_sym_btf_matches(sym_name, exact, btf1, m1->type, btf2, m2->type)) + return false; + } + + /* now check return type as well */ + id1 = t1->type; + id2 = t2->type; + goto recur; + } + + /* skip_mods_and_typedefs() make this impossible */ + case BTF_KIND_TYPEDEF: + case BTF_KIND_VOLATILE: + case BTF_KIND_CONST: + case BTF_KIND_RESTRICT: + /* DATASECs are never compared with each other */ + case BTF_KIND_DATASEC: + default: + pr_warn("global '%s': unsupported BTF kind %s\n", + sym_name, btf_kind_str(t1)); + return false; + } +} + +static bool glob_syms_match(const char *sym_name, + struct bpf_linker *linker, struct glob_sym *glob_sym, + struct src_obj *obj, Elf64_Sym *sym, size_t sym_idx, int btf_id) +{ + const struct btf_type *src_t; + + /* if we are dealing with externs, BTF types describing both global + * and extern VARs/FUNCs should be completely present in all files + */ + if (!glob_sym->btf_id || !btf_id) { + pr_warn("BTF info is missing for global symbol '%s'\n", sym_name); + return false; + } + + src_t = btf__type_by_id(obj->btf, btf_id); + if (!btf_is_var(src_t) && !btf_is_func(src_t)) { + pr_warn("only extern variables and functions are supported, but got '%s' for '%s'\n", + btf_kind_str(src_t), sym_name); + return false; + } + + if (!glob_sym_btf_matches(sym_name, true /*exact*/, + linker->btf, glob_sym->btf_id, obj->btf, btf_id)) + return false; + + return true; +} + +static bool btf_is_non_static(const struct btf_type *t) +{ + return (btf_is_var(t) && btf_var(t)->linkage != BTF_VAR_STATIC) + || (btf_is_func(t) && btf_func_linkage(t) != BTF_FUNC_STATIC); +} + +static int find_glob_sym_btf(struct src_obj *obj, Elf64_Sym *sym, const char *sym_name, + int *out_btf_sec_id, int *out_btf_id) +{ + int i, j, n = btf__get_nr_types(obj->btf), m, btf_id = 0; + const struct btf_type *t; + const struct btf_var_secinfo *vi; + const char *name; + + for (i = 1; i <= n; i++) { + t = btf__type_by_id(obj->btf, i); + + /* some global and extern FUNCs and VARs might not be associated with any + * DATASEC, so try to detect them in the same pass + */ + if (btf_is_non_static(t)) { + name = btf__str_by_offset(obj->btf, t->name_off); + if (strcmp(name, sym_name) != 0) continue; - dst_sec = &linker->secs[src_sec->dst_id];
- /* allow only one STT_SECTION symbol per section */ - if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION && dst_sec->sec_sym_idx) { - obj->sym_map[i] = dst_sec->sec_sym_idx; + /* remember and still try to find DATASEC */ + btf_id = i; + continue; + } + + if (!btf_is_datasec(t)) + continue; + + vi = btf_var_secinfos(t); + for (j = 0, m = btf_vlen(t); j < m; j++, vi++) { + t = btf__type_by_id(obj->btf, vi->type); + name = btf__str_by_offset(obj->btf, t->name_off); + + if (strcmp(name, sym_name) != 0) + continue; + if (btf_is_var(t) && btf_var(t)->linkage == BTF_VAR_STATIC) + continue; + if (btf_is_func(t) && btf_func_linkage(t) == BTF_FUNC_STATIC) continue; + + if (btf_id && btf_id != vi->type) { + pr_warn("global/extern '%s' BTF is ambiguous: both types #%d and #%u match\n", + sym_name, btf_id, vi->type); + return -EINVAL; } + + *out_btf_sec_id = i; + *out_btf_id = vi->type; + + return 0; } + }
- name_off = strset__add_str(linker->strtab_strs, sym_name); - if (name_off < 0) - return name_off; + /* free-floating extern or global FUNC */ + if (btf_id) { + *out_btf_sec_id = 0; + *out_btf_id = btf_id; + return 0; + }
- dst_sym = add_new_sym(linker, &dst_sym_idx); - if (!dst_sym) - return -ENOMEM; + pr_warn("failed to find BTF info for global/extern symbol '%s'\n", sym_name); + return -ENOENT; +}
- dst_sym->st_name = name_off; - dst_sym->st_info = sym->st_info; - dst_sym->st_other = sym->st_other; - dst_sym->st_shndx = src_sec ? dst_sec->sec_idx : sym->st_shndx; - dst_sym->st_value = (src_sec ? src_sec->dst_off : 0) + sym->st_value; - dst_sym->st_size = sym->st_size; +static struct src_sec *find_src_sec_by_name(struct src_obj *obj, const char *sec_name) +{ + struct src_sec *sec; + int i;
- obj->sym_map[i] = dst_sym_idx; + for (i = 1; i < obj->sec_cnt; i++) { + sec = &obj->secs[i];
- if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION && dst_sym) { - dst_sec->sec_sym_idx = dst_sym_idx; - dst_sym->st_value = 0; + if (strcmp(sec->sec_name, sec_name) == 0) + return sec; + } + + return NULL; +} + +static int complete_extern_btf_info(struct btf *dst_btf, int dst_id, + struct btf *src_btf, int src_id) +{ + struct btf_type *dst_t = btf_type_by_id(dst_btf, dst_id); + struct btf_type *src_t = btf_type_by_id(src_btf, src_id); + struct btf_param *src_p, *dst_p; + const char *s; + int i, n, off; + + /* We already made sure that source and destination types (FUNC or + * VAR) match in terms of types and argument names. + */ + if (btf_is_var(dst_t)) { + btf_var(dst_t)->linkage = BTF_VAR_GLOBAL_ALLOCATED; + return 0; + } + + dst_t->info = btf_type_info(BTF_KIND_FUNC, BTF_FUNC_GLOBAL, 0); + + /* now onto FUNC_PROTO types */ + src_t = btf_type_by_id(src_btf, src_t->type); + dst_t = btf_type_by_id(dst_btf, dst_t->type); + + /* Fill in all the argument names, which for extern FUNCs are missing. + * We'll end up with two copies of FUNCs/VARs for externs, but that + * will be taken care of by BTF dedup at the very end. + * It might be that BTF types for extern in one file has less/more BTF + * information (e.g., FWD instead of full STRUCT/UNION information), + * but that should be (in most cases, subject to BTF dedup rules) + * handled and resolved by BTF dedup algorithm as well, so we won't + * worry about it. Our only job is to make sure that argument names + * are populated on both sides, otherwise BTF dedup will pedantically + * consider them different. + */ + src_p = btf_params(src_t); + dst_p = btf_params(dst_t); + for (i = 0, n = btf_vlen(dst_t); i < n; i++, src_p++, dst_p++) { + if (!src_p->name_off) + continue; + + /* src_btf has more complete info, so add name to dst_btf */ + s = btf__str_by_offset(src_btf, src_p->name_off); + off = btf__add_str(dst_btf, s); + if (off < 0) + return off; + dst_p->name_off = off; + } + return 0; +} + +static void sym_update_bind(Elf64_Sym *sym, int sym_bind) +{ + sym->st_info = ELF64_ST_INFO(sym_bind, ELF64_ST_TYPE(sym->st_info)); +} + +static void sym_update_type(Elf64_Sym *sym, int sym_type) +{ + sym->st_info = ELF64_ST_INFO(ELF64_ST_BIND(sym->st_info), sym_type); +} + +static void sym_update_visibility(Elf64_Sym *sym, int sym_vis) +{ + /* libelf doesn't provide setters for ST_VISIBILITY, + * but it is stored in the lower 2 bits of st_other + */ + sym->st_other &= 0x03; + sym->st_other |= sym_vis; +} + +static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj, + Elf64_Sym *sym, const char *sym_name, int src_sym_idx) +{ + struct src_sec *src_sec = NULL; + struct dst_sec *dst_sec = NULL; + struct glob_sym *glob_sym = NULL; + int name_off, sym_type, sym_bind, sym_vis, err; + int btf_sec_id = 0, btf_id = 0; + size_t dst_sym_idx; + Elf64_Sym *dst_sym; + bool sym_is_extern; + + sym_type = ELF64_ST_TYPE(sym->st_info); + sym_bind = ELF64_ST_BIND(sym->st_info); + sym_vis = ELF64_ST_VISIBILITY(sym->st_other); + sym_is_extern = sym->st_shndx == SHN_UNDEF; + + if (sym_is_extern) { + if (!obj->btf) { + pr_warn("externs without BTF info are not supported\n"); + return -ENOTSUP; + } + } else if (sym->st_shndx < SHN_LORESERVE) { + src_sec = &obj->secs[sym->st_shndx]; + if (src_sec->skipped) + return 0; + dst_sec = &linker->secs[src_sec->dst_id]; + + /* allow only one STT_SECTION symbol per section */ + if (sym_type == STT_SECTION && dst_sec->sec_sym_idx) { + obj->sym_map[src_sym_idx] = dst_sec->sec_sym_idx; + return 0; + } + } + + if (sym_bind == STB_LOCAL) + goto add_sym; + + /* find matching BTF info */ + err = find_glob_sym_btf(obj, sym, sym_name, &btf_sec_id, &btf_id); + if (err) + return err; + + if (sym_is_extern && btf_sec_id) { + const char *sec_name = NULL; + const struct btf_type *t; + + t = btf__type_by_id(obj->btf, btf_sec_id); + sec_name = btf__str_by_offset(obj->btf, t->name_off); + + /* Clang puts unannotated extern vars into + * '.extern' BTF DATASEC. Treat them the same + * as unannotated extern funcs (which are + * currently not put into any DATASECs). + * Those don't have associated src_sec/dst_sec. + */ + if (strcmp(sec_name, BTF_EXTERN_SEC) != 0) { + src_sec = find_src_sec_by_name(obj, sec_name); + if (!src_sec) { + pr_warn("failed to find matching ELF sec '%s'\n", sec_name); + return -ENOENT; + } + dst_sec = &linker->secs[src_sec->dst_id]; + } + } + + glob_sym = find_glob_sym(linker, sym_name); + if (glob_sym) { + /* Preventively resolve to existing symbol. This is + * needed for further relocation symbol remapping in + * the next step of linking. + */ + obj->sym_map[src_sym_idx] = glob_sym->sym_idx; + + /* If both symbols are non-externs, at least one of + * them has to be STB_WEAK, otherwise they are in + * a conflict with each other. + */ + if (!sym_is_extern && !glob_sym->is_extern + && !glob_sym->is_weak && sym_bind != STB_WEAK) { + pr_warn("conflicting non-weak symbol #%d (%s) definition in '%s'\n", + src_sym_idx, sym_name, obj->filename); + return -EINVAL; + } + + if (!glob_syms_match(sym_name, linker, glob_sym, obj, sym, src_sym_idx, btf_id)) + return -EINVAL; + + dst_sym = get_sym_by_idx(linker, glob_sym->sym_idx); + + /* If new symbol is strong, then force dst_sym to be strong as + * well; this way a mix of weak and non-weak extern + * definitions will end up being strong. + */ + if (sym_bind == STB_GLOBAL) { + /* We still need to preserve type (NOTYPE or + * OBJECT/FUNC, depending on whether the symbol is + * extern or not) + */ + sym_update_bind(dst_sym, STB_GLOBAL); + glob_sym->is_weak = false; }
+ /* Non-default visibility is "contaminating", with stricter + * visibility overwriting more permissive ones, even if more + * permissive visibility comes from just an extern definition. + * Currently only STV_DEFAULT and STV_HIDDEN are allowed and + * ensured by ELF symbol sanity checks above. + */ + if (sym_vis > ELF64_ST_VISIBILITY(dst_sym->st_other)) + sym_update_visibility(dst_sym, sym_vis); + + /* If the new symbol is extern, then regardless if + * existing symbol is extern or resolved global, just + * keep the existing one untouched. + */ + if (sym_is_extern) + return 0; + + /* If existing symbol is a strong resolved symbol, bail out, + * because we lost resolution battle have nothing to + * contribute. We already checked abover that there is no + * strong-strong conflict. We also already tightened binding + * and visibility, so nothing else to contribute at that point. + */ + if (!glob_sym->is_extern && sym_bind == STB_WEAK) + return 0; + + /* At this point, new symbol is strong non-extern, + * so overwrite glob_sym with new symbol information. + * Preserve binding and visibility. + */ + sym_update_type(dst_sym, sym_type); + dst_sym->st_shndx = dst_sec->sec_idx; + dst_sym->st_value = src_sec->dst_off + sym->st_value; + dst_sym->st_size = sym->st_size; + + /* see comment below about dst_sec->id vs dst_sec->sec_idx */ + glob_sym->sec_id = dst_sec->id; + glob_sym->is_extern = false; + + if (complete_extern_btf_info(linker->btf, glob_sym->btf_id, + obj->btf, btf_id)) + return -EINVAL; + + /* request updating VAR's/FUNC's underlying BTF type when appending BTF type */ + glob_sym->underlying_btf_id = 0; + + obj->sym_map[src_sym_idx] = glob_sym->sym_idx; + return 0; + } + +add_sym: + name_off = strset__add_str(linker->strtab_strs, sym_name); + if (name_off < 0) + return name_off; + + dst_sym = add_new_sym(linker, &dst_sym_idx); + if (!dst_sym) + return -ENOMEM; + + dst_sym->st_name = name_off; + dst_sym->st_info = sym->st_info; + dst_sym->st_other = sym->st_other; + dst_sym->st_shndx = dst_sec ? dst_sec->sec_idx : sym->st_shndx; + dst_sym->st_value = (src_sec ? src_sec->dst_off : 0) + sym->st_value; + dst_sym->st_size = sym->st_size; + + obj->sym_map[src_sym_idx] = dst_sym_idx; + + if (sym_type == STT_SECTION && dst_sym) { + dst_sec->sec_sym_idx = dst_sym_idx; + dst_sym->st_value = 0; + } + + if (sym_bind != STB_LOCAL) { + glob_sym = add_glob_sym(linker); + if (!glob_sym) + return -ENOMEM; + + glob_sym->sym_idx = dst_sym_idx; + /* we use dst_sec->id (and not dst_sec->sec_idx), because + * ephemeral sections (.kconfig, .ksyms, etc) don't have + * sec_idx (as they don't have corresponding ELF section), but + * still have id. .extern doesn't have even ephemeral section + * associated with it, so dst_sec->id == dst_sec->sec_idx == 0. + */ + glob_sym->sec_id = dst_sec ? dst_sec->id : 0; + glob_sym->name_off = name_off; + /* we will fill btf_id in during BTF merging step */ + glob_sym->btf_id = 0; + glob_sym->is_extern = sym_is_extern; + glob_sym->is_weak = sym_bind == STB_WEAK; }
return 0; @@ -1262,7 +1894,7 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob dst_sec->shdr->sh_info = dst_linked_sec->sec_idx;
src_sec->dst_id = dst_sec->id; - err = extend_sec(dst_sec, src_sec); + err = extend_sec(linker, dst_sec, src_sec); if (err) return err;
@@ -1315,21 +1947,6 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob return 0; }
-static struct src_sec *find_src_sec_by_name(struct src_obj *obj, const char *sec_name) -{ - struct src_sec *sec; - int i; - - for (i = 1; i < obj->sec_cnt; i++) { - sec = &obj->secs[i]; - - if (strcmp(sec->sec_name, sec_name) == 0) - return sec; - } - - return NULL; -} - static Elf64_Sym *find_sym_by_name(struct src_obj *obj, size_t sec_idx, int sym_type, const char *sym_name) { @@ -1384,12 +2001,32 @@ static int linker_fixup_btf(struct src_obj *obj) t->size = sec->shdr->sh_size; } else { /* BTF can have some sections that are not represented - * in ELF, e.g., .kconfig and .ksyms, which are used - * for special extern variables. Here we'll - * pre-create "section shells" for them to be able to - * keep track of extra per-section metadata later - * (e.g., BTF variables). + * in ELF, e.g., .kconfig, .ksyms, .extern, which are used + * for special extern variables. + * + * For all but one such special (ephemeral) + * sections, we pre-create "section shells" to be able + * to keep track of extra per-section metadata later + * (e.g., those BTF extern variables). + * + * .extern is even more special, though, because it + * contains extern variables that need to be resolved + * by static linker, not libbpf and kernel. When such + * externs are resolved, we are going to remove them + * from .extern BTF section and might end up not + * needing it at all. Each resolved extern should have + * matching non-extern VAR/FUNC in other sections. + * + * We do support leaving some of the externs + * unresolved, though, to support cases of building + * libraries, which will later be linked against final + * BPF applications. So if at finalization we still + * see unresolved externs, we'll create .extern + * section on our own. */ + if (strcmp(sec_name, BTF_EXTERN_SEC) == 0) + continue; + sec = add_src_sec(obj, sec_name); if (!sec) return -ENOMEM; @@ -1446,6 +2083,7 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) { const struct btf_type *t; int i, j, n, start_id, id; + const char *name;
if (!obj->btf) return 0; @@ -1458,12 +2096,44 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) return -ENOMEM;
for (i = 1; i <= n; i++) { + struct glob_sym *glob_sym = NULL; + t = btf__type_by_id(obj->btf, i);
/* DATASECs are handled specially below */ if (btf_kind(t) == BTF_KIND_DATASEC) continue;
+ if (btf_is_non_static(t)) { + /* there should be glob_sym already */ + name = btf__str_by_offset(obj->btf, t->name_off); + glob_sym = find_glob_sym(linker, name); + + /* VARs without corresponding glob_sym are those that + * belong to skipped/deduplicated sections (i.e., + * license and version), so just skip them + */ + if (!glob_sym) + continue; + + /* linker_append_elf_sym() might have requested + * updating underlying type ID, if extern was resolved + * to strong symbol or weak got upgraded to non-weak + */ + if (glob_sym->underlying_btf_id == 0) + glob_sym->underlying_btf_id = -t->type; + + /* globals from previous object files that match our + * VAR/FUNC already have a corresponding associated + * BTF type, so just make sure to use it + */ + if (glob_sym->btf_id) { + /* reuse existing BTF type for global var/func */ + obj->btf_type_map[i] = glob_sym->btf_id; + continue; + } + } + id = btf__add_type(linker->btf, obj->btf, t); if (id < 0) { pr_warn("failed to append BTF type #%d from file '%s'\n", i, obj->filename); @@ -1471,6 +2141,12 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) }
obj->btf_type_map[i] = id; + + /* record just appended BTF type for var/func */ + if (glob_sym) { + glob_sym->btf_id = id; + glob_sym->underlying_btf_id = -t->type; + } }
/* remap all the types except DATASECs */ @@ -1482,6 +2158,22 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) return -EINVAL; }
+ /* Rewrite VAR/FUNC underlying types (i.e., FUNC's FUNC_PROTO and VAR's + * actual type), if necessary + */ + for (i = 0; i < linker->glob_sym_cnt; i++) { + struct glob_sym *glob_sym = &linker->glob_syms[i]; + struct btf_type *glob_t; + + if (glob_sym->underlying_btf_id >= 0) + continue; + + glob_sym->underlying_btf_id = obj->btf_type_map[-glob_sym->underlying_btf_id]; + + glob_t = btf_type_by_id(linker->btf, glob_sym->btf_id); + glob_t->type = glob_sym->underlying_btf_id; + } + /* append DATASEC info */ for (i = 1; i < obj->sec_cnt; i++) { struct src_sec *src_sec; @@ -1509,6 +2201,42 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) n = btf_vlen(t); for (j = 0; j < n; j++, src_var++) { void *sec_vars = dst_sec->sec_vars; + int new_id = obj->btf_type_map[src_var->type]; + struct glob_sym *glob_sym = NULL; + + t = btf_type_by_id(linker->btf, new_id); + if (btf_is_non_static(t)) { + name = btf__str_by_offset(linker->btf, t->name_off); + glob_sym = find_glob_sym(linker, name); + if (glob_sym->sec_id != dst_sec->id) { + pr_warn("global '%s': section mismatch %d vs %d\n", + name, glob_sym->sec_id, dst_sec->id); + return -EINVAL; + } + } + + /* If there is already a member (VAR or FUNC) mapped + * to the same type, don't add a duplicate entry. + * This will happen when multiple object files define + * the same extern VARs/FUNCs. + */ + if (glob_sym && glob_sym->var_idx >= 0) { + __s64 sz; + + dst_var = &dst_sec->sec_vars[glob_sym->var_idx]; + /* Because underlying BTF type might have + * changed, so might its size have changed, so + * re-calculate and update it in sec_var. + */ + sz = btf__resolve_size(linker->btf, glob_sym->underlying_btf_id); + if (sz < 0) { + pr_warn("global '%s': failed to resolve size of underlying type: %d\n", + name, (int)sz); + return -EINVAL; + } + dst_var->size = sz; + continue; + }
sec_vars = libbpf_reallocarray(sec_vars, dst_sec->sec_var_cnt + 1, @@ -1523,6 +2251,9 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj) dst_var->type = obj->btf_type_map[src_var->type]; dst_var->size = src_var->size; dst_var->offset = src_sec->dst_off + src_var->offset; + + if (glob_sym) + glob_sym->var_idx = dst_sec->sec_var_cnt - 1; } }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 0a342457b3bd36e6f9b558da3ff520dee35c5363 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add extra logic to handle map externs (only BTF-defined maps are supported for linking). Re-use the map parsing logic used during bpf_object__open(). Map externs are currently restricted to always match complete map definition. So all the specified attributes will be compared (down to pining, map_flags, numa_node, etc). In the future this restriction might be relaxed with no backwards compatibility issues. If any attribute is mismatched between extern and actual map definition, linker will report an error, pointing out which one mismatches.
The original intent was to allow for extern to specify attributes that matters (to user) to enforce. E.g., if you specify just key information and omit value, then any value fits. Similarly, it should have been possible to enforce map_flags, pinning, and any other possible map attribute. Unfortunately, that means that multiple externs can be only partially overlapping with each other, which means linker would need to combine their type definitions to end up with the most restrictive and fullest map definition. This requires an extra amount of BTF manipulation which at this time was deemed unnecessary and would require further extending generic BTF writer APIs. So that is left for future follow ups, if there will be demand for that. But the idea seems intresting and useful, so I want to document it here.
Weak definitions are also supported, but are pretty strict as well, just like externs: all weak map definitions have to match exactly. In the follow up patches this most probably will be relaxed, with __weak map definitions being able to differ between each other (with non-weak definition always winning, of course).
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210423181348.1801389-13-andrii@kernel.org (cherry picked from commit 0a342457b3bd36e6f9b558da3ff520dee35c5363) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 132 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 132 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index be521cf3c26e..f40fa50e9a89 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1471,6 +1471,134 @@ static bool glob_sym_btf_matches(const char *sym_name, bool exact, } }
+static bool map_defs_match(const char *sym_name, + const struct btf *main_btf, + const struct btf_map_def *main_def, + const struct btf_map_def *main_inner_def, + const struct btf *extra_btf, + const struct btf_map_def *extra_def, + const struct btf_map_def *extra_inner_def) +{ + const char *reason; + + if (main_def->map_type != extra_def->map_type) { + reason = "type"; + goto mismatch; + } + + /* check key type/size match */ + if (main_def->key_size != extra_def->key_size) { + reason = "key_size"; + goto mismatch; + } + if (!!main_def->key_type_id != !!extra_def->key_type_id) { + reason = "key type"; + goto mismatch; + } + if ((main_def->parts & MAP_DEF_KEY_TYPE) + && !glob_sym_btf_matches(sym_name, true /*exact*/, + main_btf, main_def->key_type_id, + extra_btf, extra_def->key_type_id)) { + reason = "key type"; + goto mismatch; + } + + /* validate value type/size match */ + if (main_def->value_size != extra_def->value_size) { + reason = "value_size"; + goto mismatch; + } + if (!!main_def->value_type_id != !!extra_def->value_type_id) { + reason = "value type"; + goto mismatch; + } + if ((main_def->parts & MAP_DEF_VALUE_TYPE) + && !glob_sym_btf_matches(sym_name, true /*exact*/, + main_btf, main_def->value_type_id, + extra_btf, extra_def->value_type_id)) { + reason = "key type"; + goto mismatch; + } + + if (main_def->max_entries != extra_def->max_entries) { + reason = "max_entries"; + goto mismatch; + } + if (main_def->map_flags != extra_def->map_flags) { + reason = "map_flags"; + goto mismatch; + } + if (main_def->numa_node != extra_def->numa_node) { + reason = "numa_node"; + goto mismatch; + } + if (main_def->pinning != extra_def->pinning) { + reason = "pinning"; + goto mismatch; + } + + if ((main_def->parts & MAP_DEF_INNER_MAP) != (extra_def->parts & MAP_DEF_INNER_MAP)) { + reason = "inner map"; + goto mismatch; + } + + if (main_def->parts & MAP_DEF_INNER_MAP) { + char inner_map_name[128]; + + snprintf(inner_map_name, sizeof(inner_map_name), "%s.inner", sym_name); + + return map_defs_match(inner_map_name, + main_btf, main_inner_def, NULL, + extra_btf, extra_inner_def, NULL); + } + + return true; + +mismatch: + pr_warn("global '%s': map %s mismatch\n", sym_name, reason); + return false; +} + +static bool glob_map_defs_match(const char *sym_name, + struct bpf_linker *linker, struct glob_sym *glob_sym, + struct src_obj *obj, Elf64_Sym *sym, int btf_id) +{ + struct btf_map_def dst_def = {}, dst_inner_def = {}; + struct btf_map_def src_def = {}, src_inner_def = {}; + const struct btf_type *t; + int err; + + t = btf__type_by_id(obj->btf, btf_id); + if (!btf_is_var(t)) { + pr_warn("global '%s': invalid map definition type [%d]\n", sym_name, btf_id); + return false; + } + t = skip_mods_and_typedefs(obj->btf, t->type, NULL); + + err = parse_btf_map_def(sym_name, obj->btf, t, true /*strict*/, &src_def, &src_inner_def); + if (err) { + pr_warn("global '%s': invalid map definition\n", sym_name); + return false; + } + + /* re-parse existing map definition */ + t = btf__type_by_id(linker->btf, glob_sym->btf_id); + t = skip_mods_and_typedefs(linker->btf, t->type, NULL); + err = parse_btf_map_def(sym_name, linker->btf, t, true /*strict*/, &dst_def, &dst_inner_def); + if (err) { + /* this should not happen, because we already validated it */ + pr_warn("global '%s': invalid dst map definition\n", sym_name); + return false; + } + + /* Currently extern map definition has to be complete and match + * concrete map definition exactly. This restriction might be lifted + * in the future. + */ + return map_defs_match(sym_name, linker->btf, &dst_def, &dst_inner_def, + obj->btf, &src_def, &src_inner_def); +} + static bool glob_syms_match(const char *sym_name, struct bpf_linker *linker, struct glob_sym *glob_sym, struct src_obj *obj, Elf64_Sym *sym, size_t sym_idx, int btf_id) @@ -1492,6 +1620,10 @@ static bool glob_syms_match(const char *sym_name, return false; }
+ /* deal with .maps definitions specially */ + if (glob_sym->sec_id && strcmp(linker->secs[glob_sym->sec_id].sec_name, MAPS_ELF_SEC) == 0) + return glob_map_defs_match(sym_name, linker, glob_sym, obj, sym, btf_id); + if (!glob_sym_btf_matches(sym_name, true /*exact*/, linker->btf, glob_sym->btf_id, obj->btf, btf_id)) return false;
From: Ilya Leoshkevich iii@linux.ibm.com
mainline inclusion from mainline-5.13-rc1 commit db16c1fe92d7ba7d39061faef897842baee2c887 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
pahole v1.21 supports the --btf_gen_floats flag, which makes it generate the information about the floating-point types [1].
Adjust link-vmlinux.sh to pass this flag to pahole in case it's supported, which is determined using a simple version check.
[1] https://lore.kernel.org/dwarves/YHRiXNX1JUF2Az0A@kernel.org/
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210413190043.21918-1-iii@linux.ibm.com (cherry picked from commit db16c1fe92d7ba7d39061faef897842baee2c887) Signed-off-by: Wang Yufen wangyufen@huawei.com --- scripts/link-vmlinux.sh | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 6eded325c837..b7f5119fa1fd 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -140,6 +140,7 @@ vmlinux_link() gen_btf() { local pahole_ver + local extra_paholeopt=
if ! [ -x "$(command -v ${PAHOLE})" ]; then echo >&2 "BTF: ${1}: pahole (${PAHOLE}) is not available" @@ -154,8 +155,12 @@ gen_btf()
vmlinux_link ${1}
+ if [ "${pahole_ver}" -ge "121" ]; then + extra_paholeopt="${extra_paholeopt} --btf_gen_floats" + fi + info "BTF" ${2} - LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${1} + LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${extra_paholeopt} ${1}
# Create ${2} which contains just .BTF section but no symbols. Add # SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all
From: Toke Høiland-Jørgensen toke@redhat.com
mainline inclusion from mainline-5.13-rc1 commit 441e8c66b23e027c00ccebd70df9fd933918eefe category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
There is currently no way to discover the target of a tracing program attachment after the fact. Add this information to bpf_link_info and return it when querying the bpf_link fd.
Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210413091607.58945-1-toke@redhat.com (cherry picked from commit 441e8c66b23e027c00ccebd70df9fd933918eefe) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf_verifier.h | 9 +++++++++ include/uapi/linux/bpf.h | 2 ++ kernel/bpf/syscall.c | 3 +++ tools/include/uapi/linux/bpf.h | 2 ++ 4 files changed, 16 insertions(+)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index ef49a22433b1..099947df8cd6 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -508,6 +508,15 @@ static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog, return ((u64)btf_obj_id(btf) << 32) | 0x80000000 | btf_id; }
+/* unpack the IDs from the key as constructed above */ +static inline void bpf_trampoline_unpack_key(u64 key, u32 *obj_id, u32 *btf_id) +{ + if (obj_id) + *obj_id = key >> 32; + if (btf_id) + *btf_id = key & 0x7FFFFFFF; +} + int bpf_check_attach_target(struct bpf_verifier_log *log, const struct bpf_prog *prog, const struct bpf_prog *tgt_prog, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 9c39b101b326..3f0520a185cf 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4546,6 +4546,8 @@ struct bpf_link_info { } raw_tracepoint; struct { __u32 attach_type; + __u32 target_obj_id; /* prog_id for PROG_EXT, otherwise btf object id */ + __u32 target_btf_id; /* BTF type id inside the object */ } tracing; struct { __u64 cgroup_id; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index fb72f2e887ad..17e8d056602e 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2623,6 +2623,9 @@ static int bpf_tracing_link_fill_link_info(const struct bpf_link *link, container_of(link, struct bpf_tracing_link, link);
info->tracing.attach_type = tr_link->attach_type; + bpf_trampoline_unpack_key(tr_link->trampoline->key, + &info->tracing.target_obj_id, + &info->tracing.target_btf_id);
return 0; } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 360b06fa4399..75a0ee4e10d4 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5255,6 +5255,8 @@ struct bpf_link_info { } raw_tracepoint; struct { __u32 attach_type; + __u32 target_obj_id; /* prog_id for PROG_EXT, otherwise btf object id */ + __u32 target_btf_id; /* BTF type id inside the object */ } tracing; struct { __u64 cgroup_id;
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.13-rc1 commit 1969b3c60db675040ec0d1b09698807647aac7ed category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It is just missing a ';'. This macro is not used by any test yet.
Fixes: 22ba36351631 ("selftests/bpf: Move and extend ASSERT_xxx() testing macros") Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210414155632.737866-1-revest@chromium.org (cherry picked from commit 1969b3c60db675040ec0d1b09698807647aac7ed) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_progs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index e49e2fdde942..5daf25f83f0b 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -200,7 +200,7 @@ extern int test__join_cgroup(const char *path); #define ASSERT_ERR_PTR(ptr, name) ({ \ static int duration = 0; \ const void *___res = (ptr); \ - bool ___ok = IS_ERR(___res) \ + bool ___ok = IS_ERR(___res); \ CHECK(!___ok, (name), "unexpected pointer: %p\n", ___res); \ ___ok; \ })
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.13-rc1 commit d9c9e4db186ab4d81f84e6f22b225d333b9424e3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Two helpers (trace_printk and seq_printf) have very similar implementations of format string parsing and a third one is coming (snprintf). To avoid code duplication and make the code easier to maintain, this moves the operations associated with format string parsing (validation and argument sanitization) into one generic function.
The implementation of the two existing helpers already drifted quite a bit so unifying them entailed a lot of changes:
- bpf_trace_printk always expected fmt[fmt_size] to be the terminating NULL character, this is no longer true, the first 0 is terminating. - bpf_trace_printk now supports %% (which produces the percentage char). - bpf_trace_printk now skips width formating fields. - bpf_trace_printk now supports the X modifier (capital hexadecimal). - bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6 - argument casting on 32 bit has been simplified into one macro and using an enum instead of obscure int increments.
- bpf_seq_printf now uses bpf_trace_copy_string instead of strncpy_from_kernel_nofault and handles the %pks %pus specifiers. - bpf_seq_printf now prints longs correctly on 32 bit architectures.
- both were changed to use a global per-cpu tmp buffer instead of one stack buffer for trace_printk and 6 small buffers for seq_printf. - to avoid per-cpu buffer usage conflict, these helpers disable preemption while the per-cpu buffer is in use. - both helpers now support the %ps and %pS specifiers to print symbols.
The implementation is also moved from bpf_trace.c to helpers.c because the upcoming bpf_snprintf helper will be made available to all BPF programs and will need it.
Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210419155243.1632274-2-revest@chromium.org (cherry picked from commit d9c9e4db186ab4d81f84e6f22b225d333b9424e3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 20 +++ kernel/bpf/helpers.c | 256 +++++++++++++++++++++++++++ kernel/trace/bpf_trace.c | 371 ++++----------------------------------- 3 files changed, 313 insertions(+), 334 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 880ab6a8e1f3..4b2072ade23b 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2157,4 +2157,24 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, struct btf_id_set; bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
+enum bpf_printf_mod_type { + BPF_PRINTF_INT, + BPF_PRINTF_LONG, + BPF_PRINTF_LONG_LONG, +}; + +/* Workaround for getting va_list handling working with different argument type + * combinations generically for 32 and 64 bit archs. + */ +#define BPF_CAST_FMT_ARG(arg_nb, args, mod) \ + (mod[arg_nb] == BPF_PRINTF_LONG_LONG || \ + (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64) \ + ? (u64)args[arg_nb] \ + : (u32)args[arg_nb]) + +int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args, + u64 *final_args, enum bpf_printf_mod_type *mod, + u32 num_args); +void bpf_printf_cleanup(void); + #endif /* _LINUX_BPF_H */ diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 9ff81ff5bf9b..8009d372be02 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -653,6 +653,262 @@ const struct bpf_func_proto bpf_this_cpu_ptr_proto = { .arg1_type = ARG_PTR_TO_PERCPU_BTF_ID, };
+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, + size_t bufsz) +{ + void __user *user_ptr = (__force void __user *)unsafe_ptr; + + buf[0] = 0; + + switch (fmt_ptype) { + case 's': +#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE + if ((unsigned long)unsafe_ptr < TASK_SIZE) + return strncpy_from_user_nofault(buf, user_ptr, bufsz); + fallthrough; +#endif + case 'k': + return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz); + case 'u': + return strncpy_from_user_nofault(buf, user_ptr, bufsz); + } + + return -EINVAL; +} + +/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p + */ +#define MAX_PRINTF_BUF_LEN 512 + +struct bpf_printf_buf { + char tmp_buf[MAX_PRINTF_BUF_LEN]; +}; +static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf); +static DEFINE_PER_CPU(int, bpf_printf_buf_used); + +static int try_get_fmt_tmp_buf(char **tmp_buf) +{ + struct bpf_printf_buf *bufs; + int used; + + if (*tmp_buf) + return 0; + + preempt_disable(); + used = this_cpu_inc_return(bpf_printf_buf_used); + if (WARN_ON_ONCE(used > 1)) { + this_cpu_dec(bpf_printf_buf_used); + preempt_enable(); + return -EBUSY; + } + bufs = this_cpu_ptr(&bpf_printf_buf); + *tmp_buf = bufs->tmp_buf; + + return 0; +} + +void bpf_printf_cleanup(void) +{ + if (this_cpu_read(bpf_printf_buf_used)) { + this_cpu_dec(bpf_printf_buf_used); + preempt_enable(); + } +} + +/* + * bpf_parse_fmt_str - Generic pass on format strings for printf-like helpers + * + * Returns a negative value if fmt is an invalid format string or 0 otherwise. + * + * This can be used in two ways: + * - Format string verification only: when final_args and mod are NULL + * - Arguments preparation: in addition to the above verification, it writes in + * final_args a copy of raw_args where pointers from BPF have been sanitized + * into pointers safe to use by snprintf. This also writes in the mod array + * the size requirement of each argument, usable by BPF_CAST_FMT_ARG for ex. + * + * In argument preparation mode, if 0 is returned, safe temporary buffers are + * allocated and bpf_printf_cleanup should be called to free them after use. + */ +int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args, + u64 *final_args, enum bpf_printf_mod_type *mod, + u32 num_args) +{ + char *unsafe_ptr = NULL, *tmp_buf = NULL, *fmt_end; + size_t tmp_buf_len = MAX_PRINTF_BUF_LEN; + int err, i, num_spec = 0, copy_size; + enum bpf_printf_mod_type cur_mod; + u64 cur_arg; + char fmt_ptype; + + if (!!final_args != !!mod) + return -EINVAL; + + fmt_end = strnchr(fmt, fmt_size, 0); + if (!fmt_end) + return -EINVAL; + fmt_size = fmt_end - fmt; + + for (i = 0; i < fmt_size; i++) { + if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) { + err = -EINVAL; + goto cleanup; + } + + if (fmt[i] != '%') + continue; + + if (fmt[i + 1] == '%') { + i++; + continue; + } + + if (num_spec >= num_args) { + err = -EINVAL; + goto cleanup; + } + + /* The string is zero-terminated so if fmt[i] != 0, we can + * always access fmt[i + 1], in the worst case it will be a 0 + */ + i++; + + /* skip optional "[0 +-][num]" width formatting field */ + while (fmt[i] == '0' || fmt[i] == '+' || fmt[i] == '-' || + fmt[i] == ' ') + i++; + if (fmt[i] >= '1' && fmt[i] <= '9') { + i++; + while (fmt[i] >= '0' && fmt[i] <= '9') + i++; + } + + if (fmt[i] == 'p') { + cur_mod = BPF_PRINTF_LONG; + + if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') && + fmt[i + 2] == 's') { + fmt_ptype = fmt[i + 1]; + i += 2; + goto fmt_str; + } + + if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) || + ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' || + fmt[i + 1] == 'x' || fmt[i + 1] == 'B' || + fmt[i + 1] == 's' || fmt[i + 1] == 'S') { + /* just kernel pointers */ + if (final_args) + cur_arg = raw_args[num_spec]; + goto fmt_next; + } + + /* only support "%pI4", "%pi4", "%pI6" and "%pi6". */ + if ((fmt[i + 1] != 'i' && fmt[i + 1] != 'I') || + (fmt[i + 2] != '4' && fmt[i + 2] != '6')) { + err = -EINVAL; + goto cleanup; + } + + i += 2; + if (!final_args) + goto fmt_next; + + if (try_get_fmt_tmp_buf(&tmp_buf)) { + err = -EBUSY; + goto out; + } + + copy_size = (fmt[i + 2] == '4') ? 4 : 16; + if (tmp_buf_len < copy_size) { + err = -ENOSPC; + goto cleanup; + } + + unsafe_ptr = (char *)(long)raw_args[num_spec]; + err = copy_from_kernel_nofault(tmp_buf, unsafe_ptr, + copy_size); + if (err < 0) + memset(tmp_buf, 0, copy_size); + cur_arg = (u64)(long)tmp_buf; + tmp_buf += copy_size; + tmp_buf_len -= copy_size; + + goto fmt_next; + } else if (fmt[i] == 's') { + cur_mod = BPF_PRINTF_LONG; + fmt_ptype = fmt[i]; +fmt_str: + if (fmt[i + 1] != 0 && + !isspace(fmt[i + 1]) && + !ispunct(fmt[i + 1])) { + err = -EINVAL; + goto cleanup; + } + + if (!final_args) + goto fmt_next; + + if (try_get_fmt_tmp_buf(&tmp_buf)) { + err = -EBUSY; + goto out; + } + + if (!tmp_buf_len) { + err = -ENOSPC; + goto cleanup; + } + + unsafe_ptr = (char *)(long)raw_args[num_spec]; + err = bpf_trace_copy_string(tmp_buf, unsafe_ptr, + fmt_ptype, tmp_buf_len); + if (err < 0) { + tmp_buf[0] = '\0'; + err = 1; + } + + cur_arg = (u64)(long)tmp_buf; + tmp_buf += err; + tmp_buf_len -= err; + + goto fmt_next; + } + + cur_mod = BPF_PRINTF_INT; + + if (fmt[i] == 'l') { + cur_mod = BPF_PRINTF_LONG; + i++; + } + if (fmt[i] == 'l') { + cur_mod = BPF_PRINTF_LONG_LONG; + i++; + } + + if (fmt[i] != 'i' && fmt[i] != 'd' && fmt[i] != 'u' && + fmt[i] != 'x' && fmt[i] != 'X') { + err = -EINVAL; + goto cleanup; + } + + if (final_args) + cur_arg = raw_args[num_spec]; +fmt_next: + if (final_args) { + mod[num_spec] = cur_mod; + final_args[num_spec] = cur_arg; + } + num_spec++; + } + + err = 0; +cleanup: + if (err) + bpf_printf_cleanup(); +out: + return err; +} + const struct bpf_func_proto bpf_get_current_task_proto __weak; const struct bpf_func_proto bpf_probe_read_user_proto __weak; const struct bpf_func_proto bpf_probe_read_user_str_proto __weak; diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index fba205086cf8..3e98255e6ec5 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -357,188 +357,38 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void) return &bpf_probe_write_user_proto; }
-static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, - size_t bufsz) -{ - void __user *user_ptr = (__force void __user *)unsafe_ptr; - - buf[0] = 0; - - switch (fmt_ptype) { - case 's': -#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE - if ((unsigned long)unsafe_ptr < TASK_SIZE) { - strncpy_from_user_nofault(buf, user_ptr, bufsz); - break; - } - fallthrough; -#endif - case 'k': - strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz); - break; - case 'u': - strncpy_from_user_nofault(buf, user_ptr, bufsz); - break; - } -} - static DEFINE_RAW_SPINLOCK(trace_printk_lock);
-#define BPF_TRACE_PRINTK_SIZE 1024 +#define MAX_TRACE_PRINTK_VARARGS 3 +#define BPF_TRACE_PRINTK_SIZE 1024
-static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...) +BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, + u64, arg2, u64, arg3) { + u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 }; + enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS]; static char buf[BPF_TRACE_PRINTK_SIZE]; unsigned long flags; - va_list ap; int ret;
- raw_spin_lock_irqsave(&trace_printk_lock, flags); - va_start(ap, fmt); - ret = vsnprintf(buf, sizeof(buf), fmt, ap); - va_end(ap); - /* vsnprintf() will not append null for zero-length strings */ + ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod, + MAX_TRACE_PRINTK_VARARGS); + if (ret < 0) + return ret; + + ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod), + BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod)); + /* snprintf() will not append null for zero-length strings */ if (ret == 0) buf[0] = '\0'; + + raw_spin_lock_irqsave(&trace_printk_lock, flags); trace_bpf_trace_printk(buf); raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
- return ret; -} - -/* - * Only limited trace_printk() conversion specifiers allowed: - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s - */ -BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, - u64, arg2, u64, arg3) -{ - int i, mod[3] = {}, fmt_cnt = 0; - char buf[64], fmt_ptype; - void *unsafe_ptr = NULL; - bool str_seen = false; + bpf_printf_cleanup();
- /* - * bpf_check()->check_func_arg()->check_stack_boundary() - * guarantees that fmt points to bpf program stack, - * fmt_size bytes of it were initialized and fmt_size > 0 - */ - if (fmt[--fmt_size] != 0) - return -EINVAL; - - /* check format string for allowed specifiers */ - for (i = 0; i < fmt_size; i++) { - if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) - return -EINVAL; - - if (fmt[i] != '%') - continue; - - if (fmt_cnt >= 3) - return -EINVAL; - - /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */ - i++; - if (fmt[i] == 'l') { - mod[fmt_cnt]++; - i++; - } else if (fmt[i] == 'p') { - mod[fmt_cnt]++; - if ((fmt[i + 1] == 'k' || - fmt[i + 1] == 'u') && - fmt[i + 2] == 's') { - fmt_ptype = fmt[i + 1]; - i += 2; - goto fmt_str; - } - - if (fmt[i + 1] == 'B') { - i++; - goto fmt_next; - } - - /* disallow any further format extensions */ - if (fmt[i + 1] != 0 && - !isspace(fmt[i + 1]) && - !ispunct(fmt[i + 1])) - return -EINVAL; - - goto fmt_next; - } else if (fmt[i] == 's') { - mod[fmt_cnt]++; - fmt_ptype = fmt[i]; -fmt_str: - if (str_seen) - /* allow only one '%s' per fmt string */ - return -EINVAL; - str_seen = true; - - if (fmt[i + 1] != 0 && - !isspace(fmt[i + 1]) && - !ispunct(fmt[i + 1])) - return -EINVAL; - - switch (fmt_cnt) { - case 0: - unsafe_ptr = (void *)(long)arg1; - arg1 = (long)buf; - break; - case 1: - unsafe_ptr = (void *)(long)arg2; - arg2 = (long)buf; - break; - case 2: - unsafe_ptr = (void *)(long)arg3; - arg3 = (long)buf; - break; - } - - bpf_trace_copy_string(buf, unsafe_ptr, fmt_ptype, - sizeof(buf)); - goto fmt_next; - } - - if (fmt[i] == 'l') { - mod[fmt_cnt]++; - i++; - } - - if (fmt[i] != 'i' && fmt[i] != 'd' && - fmt[i] != 'u' && fmt[i] != 'x') - return -EINVAL; -fmt_next: - fmt_cnt++; - } - -/* Horrid workaround for getting va_list handling working with different - * argument type combinations generically for 32 and 64 bit archs. - */ -#define __BPF_TP_EMIT() __BPF_ARG3_TP() -#define __BPF_TP(...) \ - bpf_do_trace_printk(fmt, ##__VA_ARGS__) - -#define __BPF_ARG1_TP(...) \ - ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64)) \ - ? __BPF_TP(arg1, ##__VA_ARGS__) \ - : ((mod[0] == 1 || (mod[0] == 0 && __BITS_PER_LONG == 32)) \ - ? __BPF_TP((long)arg1, ##__VA_ARGS__) \ - : __BPF_TP((u32)arg1, ##__VA_ARGS__))) - -#define __BPF_ARG2_TP(...) \ - ((mod[1] == 2 || (mod[1] == 1 && __BITS_PER_LONG == 64)) \ - ? __BPF_ARG1_TP(arg2, ##__VA_ARGS__) \ - : ((mod[1] == 1 || (mod[1] == 0 && __BITS_PER_LONG == 32)) \ - ? __BPF_ARG1_TP((long)arg2, ##__VA_ARGS__) \ - : __BPF_ARG1_TP((u32)arg2, ##__VA_ARGS__))) - -#define __BPF_ARG3_TP(...) \ - ((mod[2] == 2 || (mod[2] == 1 && __BITS_PER_LONG == 64)) \ - ? __BPF_ARG2_TP(arg3, ##__VA_ARGS__) \ - : ((mod[2] == 1 || (mod[2] == 0 && __BITS_PER_LONG == 32)) \ - ? __BPF_ARG2_TP((long)arg3, ##__VA_ARGS__) \ - : __BPF_ARG2_TP((u32)arg3, ##__VA_ARGS__))) - - return __BPF_TP_EMIT(); + return ret; }
static const struct bpf_func_proto bpf_trace_printk_proto = { @@ -566,184 +416,37 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void) }
#define MAX_SEQ_PRINTF_VARARGS 12 -#define MAX_SEQ_PRINTF_MAX_MEMCPY 6 -#define MAX_SEQ_PRINTF_STR_LEN 128 - -struct bpf_seq_printf_buf { - char buf[MAX_SEQ_PRINTF_MAX_MEMCPY][MAX_SEQ_PRINTF_STR_LEN]; -}; -static DEFINE_PER_CPU(struct bpf_seq_printf_buf, bpf_seq_printf_buf); -static DEFINE_PER_CPU(int, bpf_seq_printf_buf_used);
BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size, const void *, data, u32, data_len) { - int err = -EINVAL, fmt_cnt = 0, memcpy_cnt = 0; - int i, buf_used, copy_size, num_args; - u64 params[MAX_SEQ_PRINTF_VARARGS]; - struct bpf_seq_printf_buf *bufs; - const u64 *args = data; - - buf_used = this_cpu_inc_return(bpf_seq_printf_buf_used); - if (WARN_ON_ONCE(buf_used > 1)) { - err = -EBUSY; - goto out; - } - - bufs = this_cpu_ptr(&bpf_seq_printf_buf); - - /* - * bpf_check()->check_func_arg()->check_stack_boundary() - * guarantees that fmt points to bpf program stack, - * fmt_size bytes of it were initialized and fmt_size > 0 - */ - if (fmt[--fmt_size] != 0) - goto out; - - if (data_len & 7) - goto out; - - for (i = 0; i < fmt_size; i++) { - if (fmt[i] == '%') { - if (fmt[i + 1] == '%') - i++; - else if (!data || !data_len) - goto out; - } - } + enum bpf_printf_mod_type mod[MAX_SEQ_PRINTF_VARARGS]; + u64 args[MAX_SEQ_PRINTF_VARARGS]; + int err, num_args;
+ if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 || + (data_len && !data)) + return -EINVAL; num_args = data_len / 8;
- /* check format string for allowed specifiers */ - for (i = 0; i < fmt_size; i++) { - /* only printable ascii for now. */ - if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) { - err = -EINVAL; - goto out; - } - - if (fmt[i] != '%') - continue; - - if (fmt[i + 1] == '%') { - i++; - continue; - } - - if (fmt_cnt >= MAX_SEQ_PRINTF_VARARGS) { - err = -E2BIG; - goto out; - } - - if (fmt_cnt >= num_args) { - err = -EINVAL; - goto out; - } - - /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */ - i++; - - /* skip optional "[0 +-][num]" width formating field */ - while (fmt[i] == '0' || fmt[i] == '+' || fmt[i] == '-' || - fmt[i] == ' ') - i++; - if (fmt[i] >= '1' && fmt[i] <= '9') { - i++; - while (fmt[i] >= '0' && fmt[i] <= '9') - i++; - } - - if (fmt[i] == 's') { - void *unsafe_ptr; - - /* try our best to copy */ - if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) { - err = -E2BIG; - goto out; - } - - unsafe_ptr = (void *)(long)args[fmt_cnt]; - err = strncpy_from_kernel_nofault(bufs->buf[memcpy_cnt], - unsafe_ptr, MAX_SEQ_PRINTF_STR_LEN); - if (err < 0) - bufs->buf[memcpy_cnt][0] = '\0'; - params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt]; - - fmt_cnt++; - memcpy_cnt++; - continue; - } - - if (fmt[i] == 'p') { - if (fmt[i + 1] == 0 || - fmt[i + 1] == 'K' || - fmt[i + 1] == 'x' || - fmt[i + 1] == 'B') { - /* just kernel pointers */ - params[fmt_cnt] = args[fmt_cnt]; - fmt_cnt++; - continue; - } - - /* only support "%pI4", "%pi4", "%pI6" and "%pi6". */ - if (fmt[i + 1] != 'i' && fmt[i + 1] != 'I') { - err = -EINVAL; - goto out; - } - if (fmt[i + 2] != '4' && fmt[i + 2] != '6') { - err = -EINVAL; - goto out; - } - - if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) { - err = -E2BIG; - goto out; - } - - - copy_size = (fmt[i + 2] == '4') ? 4 : 16; - - err = copy_from_kernel_nofault(bufs->buf[memcpy_cnt], - (void *) (long) args[fmt_cnt], - copy_size); - if (err < 0) - memset(bufs->buf[memcpy_cnt], 0, copy_size); - params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt]; - - i += 2; - fmt_cnt++; - memcpy_cnt++; - continue; - } - - if (fmt[i] == 'l') { - i++; - if (fmt[i] == 'l') - i++; - } - - if (fmt[i] != 'i' && fmt[i] != 'd' && - fmt[i] != 'u' && fmt[i] != 'x' && - fmt[i] != 'X') { - err = -EINVAL; - goto out; - } - - params[fmt_cnt] = args[fmt_cnt]; - fmt_cnt++; - } + err = bpf_printf_prepare(fmt, fmt_size, data, args, mod, num_args); + if (err < 0) + return err;
/* Maximumly we can have MAX_SEQ_PRINTF_VARARGS parameter, just give * all of them to seq_printf(). */ - seq_printf(m, fmt, params[0], params[1], params[2], params[3], - params[4], params[5], params[6], params[7], params[8], - params[9], params[10], params[11]); + seq_printf(m, fmt, BPF_CAST_FMT_ARG(0, args, mod), + BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod), + BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod), + BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod), + BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod), + BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod), + BPF_CAST_FMT_ARG(11, args, mod));
- err = seq_has_overflowed(m) ? -EOVERFLOW : 0; -out: - this_cpu_dec(bpf_seq_printf_buf_used); - return err; + bpf_printf_cleanup(); + + return seq_has_overflowed(m) ? -EOVERFLOW : 0; }
BTF_ID_LIST_SINGLE(btf_seq_file_ids, struct, seq_file)
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.13-rc1 commit fff13c4bb646ef849fd74ced87eef54340d28c21 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This type provides the guarantee that an argument is going to be a const pointer to somewhere in a read-only map value. It also checks that this pointer is followed by a zero character before the end of the map value.
Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210419155243.1632274-3-revest@chromium.org (cherry picked from commit fff13c4bb646ef849fd74ced87eef54340d28c21) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 1 + kernel/bpf/verifier.c | 41 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 4b2072ade23b..f7c69d6a4762 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -334,6 +334,7 @@ enum bpf_arg_type { ARG_PTR_TO_PERCPU_BTF_ID, /* pointer to in-kernel percpu type */ ARG_PTR_TO_FUNC, /* pointer to a bpf program function */ ARG_PTR_TO_STACK_OR_NULL, /* pointer to stack or NULL */ + ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */ __BPF_ARG_TYPE_MAX,
/* Extended arg_types. */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 915b18d62d1b..fa21ed4691a5 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4750,6 +4750,7 @@ static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALU static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_PERCPU_BTF_ID } }; static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } }; static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } }; +static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } };
static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_MAP_KEY] = &map_key_value_types, @@ -4775,6 +4776,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_PERCPU_BTF_ID] = &percpu_btf_ptr_types, [ARG_PTR_TO_FUNC] = &func_ptr_types, [ARG_PTR_TO_STACK_OR_NULL] = &stack_ptr_types, + [ARG_PTR_TO_CONST_STR] = &const_str_ptr_types, };
static int check_reg_type(struct bpf_verifier_env *env, u32 regno, @@ -5056,6 +5058,45 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, if (err) return err; err = check_ptr_alignment(env, reg, 0, size, true); + } else if (arg_type == ARG_PTR_TO_CONST_STR) { + struct bpf_map *map = reg->map_ptr; + int map_off; + u64 map_addr; + char *str_ptr; + + if (reg->type != PTR_TO_MAP_VALUE || !map || + !bpf_map_is_rdonly(map)) { + verbose(env, "R%d does not point to a readonly map'\n", regno); + return -EACCES; + } + + if (!tnum_is_const(reg->var_off)) { + verbose(env, "R%d is not a constant address'\n", regno); + return -EACCES; + } + + if (!map->ops->map_direct_value_addr) { + verbose(env, "no direct value access support for this map type\n"); + return -EACCES; + } + + err = check_map_access(env, regno, reg->off, + map->value_size - reg->off, false); + if (err) + return err; + + map_off = reg->off + reg->var_off.value; + err = map->ops->map_direct_value_addr(map, &map_addr, map_off); + if (err) { + verbose(env, "direct value access on string failed\n"); + return err; + } + + str_ptr = (char *)(long)(map_addr); + if (!strnchr(str_ptr + map_off, map->value_size - map_off, 0)) { + verbose(env, "string is not zero-terminated\n"); + return -EINVAL; + } }
return err;
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.13-rc1 commit 7b15523a989b63927c2bb08e9b5b0bbc10b58bef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The implementation takes inspiration from the existing bpf_trace_printk helper but there are a few differences:
To allow for a large number of format-specifiers, parameters are provided in an array, like in bpf_seq_printf.
Because the output string takes two arguments and the array of parameters also takes two arguments, the format string needs to fit in one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to a zero-terminated read-only map so we don't need a format string length arg.
Because the format-string is known at verification time, we also do a first pass of format string validation in the verifier logic. This makes debugging easier.
Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org (cherry picked from commit 7b15523a989b63927c2bb08e9b5b0bbc10b58bef) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 1 + include/uapi/linux/bpf.h | 28 +++++++++++++++++++ kernel/bpf/helpers.c | 50 ++++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 41 ++++++++++++++++++++++++++++ kernel/trace/bpf_trace.c | 2 ++ tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++ 6 files changed, 150 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f7c69d6a4762..bbc3e717c763 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2038,6 +2038,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; extern const struct bpf_func_proto bpf_snprintf_btf_proto; +extern const struct bpf_func_proto bpf_snprintf_proto; extern const struct bpf_func_proto bpf_per_cpu_ptr_proto; extern const struct bpf_func_proto bpf_this_cpu_ptr_proto; extern const struct bpf_func_proto bpf_for_each_map_elem_proto; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 3f0520a185cf..615ca5424058 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3846,6 +3846,33 @@ union bpf_attr { * Return * The number of traversed map elements for success, **-EINVAL** for * invalid **flags**. + * + * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len) + * Description + * Outputs a string into the **str** buffer of size **str_size** + * based on a format string stored in a read-only map pointed by + * **fmt**. + * + * Each format specifier in **fmt** corresponds to one u64 element + * in the **data** array. For strings and pointers where pointees + * are accessed, only the pointer values are stored in the *data* + * array. The *data_len* is the size of *data* in bytes. + * + * Formats **%s** and **%p{i,I}{4,6}** require to read kernel + * memory. Reading kernel memory may fail due to either invalid + * address or valid address but requiring a major memory fault. If + * reading kernel memory fails, the string for **%s** will be an + * empty string, and the ip address for **%p{i,I}{4,6}** will be 0. + * Not returning error to bpf program is consistent with what + * **bpf_trace_printk**\ () does for now. + * + * Return + * The strictly positive length of the formatted string, including + * the trailing zero character. If the return value is greater than + * **str_size**, **str** contains a truncated string, guaranteed to + * be zero-terminated except when **str_size** is 0. + * + * Or **-EBUSY** if the per-CPU memory copy buffer is busy. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4010,6 +4037,7 @@ union bpf_attr { FN(sched_entity_to_cgrpid), \ FN(sched_entity_belongs_to_cgrp), \ FN(for_each_map_elem), \ + FN(snprintf), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 8009d372be02..e49b4cb5f099 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -909,6 +909,54 @@ int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args, return err; }
+#define MAX_SNPRINTF_VARARGS 12 + +BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt, + const void *, data, u32, data_len) +{ + enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS]; + u64 args[MAX_SNPRINTF_VARARGS]; + int err, num_args; + + if (data_len % 8 || data_len > MAX_SNPRINTF_VARARGS * 8 || + (data_len && !data)) + return -EINVAL; + num_args = data_len / 8; + + /* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we + * can safely give an unbounded size. + */ + err = bpf_printf_prepare(fmt, UINT_MAX, data, args, mod, num_args); + if (err < 0) + return err; + + /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give + * all of them to snprintf(). + */ + err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod), + BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod), + BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod), + BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod), + BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod), + BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod), + BPF_CAST_FMT_ARG(11, args, mod)); + + bpf_printf_cleanup(); + + return err + 1; +} + +const struct bpf_func_proto bpf_snprintf_proto = { + .func = bpf_snprintf, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_MEM_OR_NULL, + .arg2_type = ARG_CONST_SIZE_OR_ZERO, + .arg3_type = ARG_PTR_TO_CONST_STR, + .arg4_type = ARG_PTR_TO_MEM_OR_NULL, + .arg5_type = ARG_CONST_SIZE_OR_ZERO, +}; + const struct bpf_func_proto bpf_get_current_task_proto __weak; const struct bpf_func_proto bpf_probe_read_user_proto __weak; const struct bpf_func_proto bpf_probe_read_user_str_proto __weak; @@ -997,6 +1045,8 @@ bpf_base_func_proto(enum bpf_func_id func_id) NULL : &bpf_probe_read_kernel_str_proto; case BPF_FUNC_snprintf_btf: return &bpf_snprintf_btf_proto; + case BPF_FUNC_snprintf: + return &bpf_snprintf_proto; default: return NULL; } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index fa21ed4691a5..fe9799326522 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5847,6 +5847,41 @@ static int check_reference_leak(struct bpf_verifier_env *env) return state->acquired_refs ? -EINVAL : 0; }
+static int check_bpf_snprintf_call(struct bpf_verifier_env *env, + struct bpf_reg_state *regs) +{ + struct bpf_reg_state *fmt_reg = ®s[BPF_REG_3]; + struct bpf_reg_state *data_len_reg = ®s[BPF_REG_5]; + struct bpf_map *fmt_map = fmt_reg->map_ptr; + int err, fmt_map_off, num_args; + u64 fmt_addr; + char *fmt; + + /* data must be an array of u64 */ + if (data_len_reg->var_off.value % 8) + return -EINVAL; + num_args = data_len_reg->var_off.value / 8; + + /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const + * and map_direct_value_addr is set. + */ + fmt_map_off = fmt_reg->off + fmt_reg->var_off.value; + err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr, + fmt_map_off); + if (err) + return err; + fmt = (char *)(long)fmt_addr + fmt_map_off; + + /* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we + * can focus on validating the format specifiers. + */ + err = bpf_printf_prepare(fmt, UINT_MAX, NULL, NULL, NULL, num_args); + if (err < 0) + verbose(env, "Invalid format string\n"); + + return err; +} + static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn, int *insn_idx_p) { @@ -5963,6 +5998,12 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn return -EINVAL; }
+ if (func_id == BPF_FUNC_snprintf) { + err = check_bpf_snprintf_call(env, regs); + if (err < 0) + return err; + } + /* reset caller saved regs */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, regs, caller_saved[i]); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 3e98255e6ec5..c08a504573a3 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1036,6 +1036,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_this_cpu_ptr_proto; case BPF_FUNC_for_each_map_elem: return &bpf_for_each_map_elem_proto; + case BPF_FUNC_snprintf: + return &bpf_snprintf_proto; default: return NULL; } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 75a0ee4e10d4..db550831230c 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4556,6 +4556,33 @@ union bpf_attr { * Return * The number of traversed map elements for success, **-EINVAL** for * invalid **flags**. + * + * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len) + * Description + * Outputs a string into the **str** buffer of size **str_size** + * based on a format string stored in a read-only map pointed by + * **fmt**. + * + * Each format specifier in **fmt** corresponds to one u64 element + * in the **data** array. For strings and pointers where pointees + * are accessed, only the pointer values are stored in the *data* + * array. The *data_len* is the size of *data* in bytes. + * + * Formats **%s** and **%p{i,I}{4,6}** require to read kernel + * memory. Reading kernel memory may fail due to either invalid + * address or valid address but requiring a major memory fault. If + * reading kernel memory fails, the string for **%s** will be an + * empty string, and the ip address for **%p{i,I}{4,6}** will be 0. + * Not returning error to bpf program is consistent with what + * **bpf_trace_printk**\ () does for now. + * + * Return + * The strictly positive length of the formatted string, including + * the trailing zero character. If the return value is greater than + * **str_size**, **str** contains a truncated string, guaranteed to + * be zero-terminated except when **str_size** is 0. + * + * Or **-EBUSY** if the per-CPU memory copy buffer is busy. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4720,6 +4747,7 @@ union bpf_attr { FN(sched_entity_to_cgrpid), \ FN(sched_entity_belongs_to_cgrp), \ FN(for_each_map_elem), \ + FN(snprintf), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.13-rc1 commit 58c2b1f5e0121efd698b6ec8e45e47e58ca9caee category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an array of u64, making it more natural to call the bpf_snprintf helper.
Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org (cherry picked from commit 58c2b1f5e0121efd698b6ec8e45e47e58ca9caee) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_tracing.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h index 1c2e91ee041d..8c954ebc0c7c 100644 --- a/tools/lib/bpf/bpf_tracing.h +++ b/tools/lib/bpf/bpf_tracing.h @@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args) ___param, sizeof(___param)); \ })
+/* + * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of + * an array of u64. + */ +#define BPF_SNPRINTF(out, out_size, fmt, args...) \ +({ \ + static const char ___fmt[] = fmt; \ + unsigned long long ___param[___bpf_narg(args)]; \ + \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored "-Wint-conversion"") \ + ___bpf_fill(___param, args); \ + _Pragma("GCC diagnostic pop") \ + \ + bpf_snprintf(out, out_size, ___fmt, \ + ___param, sizeof(___param)); \ +}) + #endif
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 6709a914c8498f42b1498b3d31f4b078d092fd35 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check. Without this, relocations against float/double fields will fail.
Also adjust one error message to emit instruction index instead of less convenient instruction byte offset.
Fixes: 22541a9eeb0d ("libbpf: Add BTF_KIND_FLOAT support") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Lorenz Bauer lmb@cloudflare.com Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org (cherry picked from commit 6709a914c8498f42b1498b3d31f4b078d092fd35) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e79f793dd4bf..127a137bb6ba 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5169,6 +5169,7 @@ bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 l * least one of enums should be anonymous; * - for ENUMs, check sizes, names are ignored; * - for INT, size and signedness are ignored; + * - any two FLOATs are always compatible; * - for ARRAY, dimensionality is ignored, element types are checked for * compatibility recursively; * - everything else shouldn't be ever a target of relocation. @@ -5195,6 +5196,7 @@ static int bpf_core_fields_are_compat(const struct btf *local_btf,
switch (btf_kind(local_type)) { case BTF_KIND_PTR: + case BTF_KIND_FLOAT: return 1; case BTF_KIND_FWD: case BTF_KIND_ENUM: { @@ -6299,8 +6301,8 @@ static int bpf_core_apply_relo(struct bpf_program *prog, /* bpf_core_patch_insn() should know how to handle missing targ_spec */ err = bpf_core_patch_insn(prog, relo, relo_idx, &targ_res); if (err) { - pr_warn("prog '%s': relo #%d: failed to patch insn at offset %d: %d\n", - prog->name, relo_idx, relo->insn_off, err); + pr_warn("prog '%s': relo #%d: failed to patch insn #%zu: %d\n", + prog->name, relo_idx, relo->insn_off / BPF_INSN_SZ, err); return -EINVAL; }
From: Ian Rogers irogers@google.com
mainline inclusion from mainline-5.13-rc4 commit 9683e5775c75097c46bd24e65411b16ac6c6cbb3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Avoids a segv if btf isn't present. Seen on the call path __bpf_object__open calling bpf_object__collect_externs.
Fixes: 5bd022ec01f0 (libbpf: Support extern kernel function) Suggested-by: Stanislav Fomichev sdf@google.com Suggested-by: Petar Penkov ppenkov@google.com Signed-off-by: Ian Rogers irogers@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210504234910.976501-1-irogers@google.com (cherry picked from commit 9683e5775c75097c46bd24e65411b16ac6c6cbb3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 127a137bb6ba..565998539c09 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3217,6 +3217,9 @@ static int add_dummy_ksym_var(struct btf *btf) const struct btf_var_secinfo *vs; const struct btf_type *sec;
+ if (!btf) + return 0; + sec_btf_id = btf__find_by_name_kind(btf, KSYMS_SEC, BTF_KIND_DATASEC); if (sec_btf_id < 0)
From: Arnaldo Carvalho de Melo acme@kernel.org
mainline inclusion from mainline-5.13-rc4 commit 67e7ec0bd4535fc6e6d3f5d174f80e10a8a80c6e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Where that macro isn't available.
Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org (cherry picked from commit 67e7ec0bd4535fc6e6d3f5d174f80e10a8a80c6e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf_internal.h | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index ee426226928f..acbcf6c7bdf8 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -41,6 +41,11 @@ #define ELF_C_READ_MMAP ELF_C_READ #endif
+/* Older libelf all end up in this expression, for both 32 and 64 bit */ +#ifndef GELF_ST_VISIBILITY +#define GELF_ST_VISIBILITY(o) ((o) & 0x03) +#endif + #define BTF_INFO_ENC(kind, kind_flag, vlen) \ ((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN)) #define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 37f05601eabc29f82c03b461a22d8fafacd736d2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Similarly to .rodata, strip any const/volatile/restrict modifiers when generating BPF skeleton. They are not helpful and actually just get in the way.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210507054119.270888-2-andrii@kernel.org (cherry picked from commit 37f05601eabc29f82c03b461a22d8fafacd736d2) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 31ade77f5ef8..440a2fcb6441 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -106,8 +106,10 @@ static int codegen_datasec_def(struct bpf_object *obj,
if (strcmp(sec_name, ".data") == 0) { sec_ident = "data"; + strip_mods = true; } else if (strcmp(sec_name, ".bss") == 0) { sec_ident = "bss"; + strip_mods = true; } else if (strcmp(sec_name, ".rodata") == 0) { sec_ident = "rodata"; strip_mods = true;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit fdbf5ddeb855a80831af2e5bb9db9218926e6789 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
For better future extensibility add per-file linker options. Currently the set of available options is empty. This changes bpf_linker__add_file() API, but it's not a breaking change as bpf_linker APIs hasn't been released yet.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210507054119.270888-3-andrii@kernel.org (cherry picked from commit fdbf5ddeb855a80831af2e5bb9db9218926e6789) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 2 +- tools/lib/bpf/libbpf.h | 10 +++++++++- tools/lib/bpf/linker.c | 16 ++++++++++++---- 3 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 440a2fcb6441..06fee4a2910a 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -638,7 +638,7 @@ static int do_object(int argc, char **argv) while (argc) { file = GET_ARG();
- err = bpf_linker__add_file(linker, file); + err = bpf_linker__add_file(linker, file, NULL); if (err) { p_err("failed to link '%s': %s (%d)", file, strerror(err), err); goto out; diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index b92012b4a2f7..9f2c588faf3d 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -771,10 +771,18 @@ struct bpf_linker_opts { }; #define bpf_linker_opts__last_field sz
+struct bpf_linker_file_opts { + /* size of this struct, for forward/backward compatiblity */ + size_t sz; +}; +#define bpf_linker_file_opts__last_field sz + struct bpf_linker;
LIBBPF_API struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts *opts); -LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker, const char *filename); +LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker, + const char *filename, + const struct bpf_linker_file_opts *opts); LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker); LIBBPF_API void bpf_linker__free(struct bpf_linker *linker);
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index f40fa50e9a89..b82a78d6d957 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -158,7 +158,9 @@ struct bpf_linker {
static int init_output_elf(struct bpf_linker *linker, const char *file);
-static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, struct src_obj *obj); +static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, + const struct bpf_linker_file_opts *opts, + struct src_obj *obj); static int linker_sanity_check_elf(struct src_obj *obj); static int linker_sanity_check_elf_symtab(struct src_obj *obj, struct src_sec *sec); static int linker_sanity_check_elf_relos(struct src_obj *obj, struct src_sec *sec); @@ -435,15 +437,19 @@ static int init_output_elf(struct bpf_linker *linker, const char *file) return 0; }
-int bpf_linker__add_file(struct bpf_linker *linker, const char *filename) +int bpf_linker__add_file(struct bpf_linker *linker, const char *filename, + const struct bpf_linker_file_opts *opts) { struct src_obj obj = {}; int err = 0;
+ if (!OPTS_VALID(opts, bpf_linker_file_opts)) + return -EINVAL; + if (!linker->elf) return -EINVAL;
- err = err ?: linker_load_obj_file(linker, filename, &obj); + err = err ?: linker_load_obj_file(linker, filename, opts, &obj); err = err ?: linker_append_sec_data(linker, &obj); err = err ?: linker_append_elf_syms(linker, &obj); err = err ?: linker_append_elf_relos(linker, &obj); @@ -529,7 +535,9 @@ static struct src_sec *add_src_sec(struct src_obj *obj, const char *sec_name) return sec; }
-static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, struct src_obj *obj) +static int linker_load_obj_file(struct bpf_linker *linker, const char *filename, + const struct bpf_linker_file_opts *opts, + struct src_obj *obj) { #if __BYTE_ORDER == __LITTLE_ENDIAN const int host_endianness = ELFDATA2LSB;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 256eab48e70c0eaf5b1b9af83c0588491986c7de category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In preparation of skipping emitting static variables in BPF skeletons, switch all current selftests uses of static variables to pass data between BPF and user-space to use global variables.
All non-read-only `static volatile` variables become just plain global variables by dropping `static volatile` part.
Read-only `static volatile const` variables, though, still require `volatile` modifier, otherwise compiler will ignore whatever values are set from user-space.
Few static linker tests are using name-conflicting static variables to validate that static linker still properly handles static variables and doesn't trip up on name conflicts.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210507054119.270888-4-andrii@kernel.org (cherry picked from commit 256eab48e70c0eaf5b1b9af83c0588491986c7de) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/progs/test_check_mtu.c tools/testing/selftests/bpf/progs/test_snprintf_single.c tools/testing/selftests/bpf/progs/test_sockmap_listen.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/send_signal.c | 2 +- tools/testing/selftests/bpf/prog_tests/skeleton.c | 6 ++---- tools/testing/selftests/bpf/prog_tests/static_linked.c | 5 ----- tools/testing/selftests/bpf/progs/bpf_iter_test_kern4.c | 4 ++-- tools/testing/selftests/bpf/progs/kfree_skb.c | 4 ++-- tools/testing/selftests/bpf/progs/tailcall3.c | 2 +- tools/testing/selftests/bpf/progs/tailcall4.c | 2 +- tools/testing/selftests/bpf/progs/tailcall5.c | 2 +- tools/testing/selftests/bpf/progs/tailcall_bpf2bpf2.c | 2 +- tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c | 2 +- tools/testing/selftests/bpf/progs/test_cls_redirect.c | 4 ++-- tools/testing/selftests/bpf/progs/test_global_func_args.c | 2 +- tools/testing/selftests/bpf/progs/test_rdonly_maps.c | 6 +++--- tools/testing/selftests/bpf/progs/test_skeleton.c | 4 ++-- tools/testing/selftests/bpf/progs/test_sockmap_listen.c | 2 +- tools/testing/selftests/bpf/progs/test_static_linked1.c | 8 ++++---- tools/testing/selftests/bpf/progs/test_static_linked2.c | 8 ++++---- 17 files changed, 29 insertions(+), 36 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c index 75b72c751772..1931de5275e5 100644 --- a/tools/testing/selftests/bpf/prog_tests/send_signal.c +++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c @@ -4,7 +4,7 @@ #include <sys/resource.h> #include "test_send_signal_kern.skel.h"
-static volatile int sigusr1_received = 0; +int sigusr1_received = 0;
static void sigusr1_handler(int signum) { diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c index fe87b77af459..f6f130c99b8c 100644 --- a/tools/testing/selftests/bpf/prog_tests/skeleton.c +++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c @@ -82,10 +82,8 @@ void test_skeleton(void) CHECK(data->out2 != 2, "res2", "got %lld != exp %d\n", data->out2, 2); CHECK(bss->out3 != 3, "res3", "got %d != exp %d\n", (int)bss->out3, 3); CHECK(bss->out4 != 4, "res4", "got %lld != exp %d\n", bss->out4, 4); - CHECK(bss->handler_out5.a != 5, "res5", "got %d != exp %d\n", - bss->handler_out5.a, 5); - CHECK(bss->handler_out5.b != 6, "res6", "got %lld != exp %d\n", - bss->handler_out5.b, 6); + CHECK(bss->out5.a != 5, "res5", "got %d != exp %d\n", bss->out5.a, 5); + CHECK(bss->out5.b != 6, "res6", "got %lld != exp %d\n", bss->out5.b, 6); CHECK(bss->out6 != 14, "res7", "got %d != exp %d\n", bss->out6, 14);
CHECK(bss->bpf_syscall != kcfg->CONFIG_BPF_SYSCALL, "ext1", diff --git a/tools/testing/selftests/bpf/prog_tests/static_linked.c b/tools/testing/selftests/bpf/prog_tests/static_linked.c index 46556976dccc..ab6acbaf9d8c 100644 --- a/tools/testing/selftests/bpf/prog_tests/static_linked.c +++ b/tools/testing/selftests/bpf/prog_tests/static_linked.c @@ -14,12 +14,7 @@ void test_static_linked(void) return;
skel->rodata->rovar1 = 1; - skel->bss->static_var1 = 2; - skel->bss->static_var11 = 3; - skel->rodata->rovar2 = 4; - skel->bss->static_var2 = 5; - skel->bss->static_var22 = 6;
err = test_static_linked__load(skel); if (!ASSERT_OK(err, "skel_load")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_test_kern4.c b/tools/testing/selftests/bpf/progs/bpf_iter_test_kern4.c index ee49493dc125..400fdf8d6233 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_test_kern4.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_test_kern4.c @@ -9,8 +9,8 @@ __u32 map1_id = 0, map2_id = 0; __u32 map1_accessed = 0, map2_accessed = 0; __u64 map1_seqnum = 0, map2_seqnum1 = 0, map2_seqnum2 = 0;
-static volatile const __u32 print_len; -static volatile const __u32 ret1; +volatile const __u32 print_len; +volatile const __u32 ret1;
SEC("iter/bpf_map") int dump_bpf_map(struct bpf_iter__bpf_map *ctx) diff --git a/tools/testing/selftests/bpf/progs/kfree_skb.c b/tools/testing/selftests/bpf/progs/kfree_skb.c index a46a264ce24e..55e283050cab 100644 --- a/tools/testing/selftests/bpf/progs/kfree_skb.c +++ b/tools/testing/selftests/bpf/progs/kfree_skb.c @@ -109,10 +109,10 @@ int BPF_PROG(trace_kfree_skb, struct sk_buff *skb, void *location) return 0; }
-static volatile struct { +struct { bool fentry_test_ok; bool fexit_test_ok; -} result; +} result = {};
SEC("fentry/eth_type_trans") int BPF_PROG(fentry_eth_type_trans, struct sk_buff *skb, struct net_device *dev, diff --git a/tools/testing/selftests/bpf/progs/tailcall3.c b/tools/testing/selftests/bpf/progs/tailcall3.c index 739dc2a51e74..910858fe078a 100644 --- a/tools/testing/selftests/bpf/progs/tailcall3.c +++ b/tools/testing/selftests/bpf/progs/tailcall3.c @@ -10,7 +10,7 @@ struct { __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps");
-static volatile int count; +int count = 0;
SEC("classifier/0") int bpf_func_0(struct __sk_buff *skb) diff --git a/tools/testing/selftests/bpf/progs/tailcall4.c b/tools/testing/selftests/bpf/progs/tailcall4.c index f82075b47d7d..bd4be135c39d 100644 --- a/tools/testing/selftests/bpf/progs/tailcall4.c +++ b/tools/testing/selftests/bpf/progs/tailcall4.c @@ -10,7 +10,7 @@ struct { __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps");
-static volatile int selector; +int selector = 0;
#define TAIL_FUNC(x) \ SEC("classifier/" #x) \ diff --git a/tools/testing/selftests/bpf/progs/tailcall5.c b/tools/testing/selftests/bpf/progs/tailcall5.c index ce5450744fd4..adf30a33064e 100644 --- a/tools/testing/selftests/bpf/progs/tailcall5.c +++ b/tools/testing/selftests/bpf/progs/tailcall5.c @@ -10,7 +10,7 @@ struct { __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps");
-static volatile int selector; +int selector = 0;
#define TAIL_FUNC(x) \ SEC("classifier/" #x) \ diff --git a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf2.c b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf2.c index 7b1c04183824..3cc4c12817b5 100644 --- a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf2.c +++ b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf2.c @@ -20,7 +20,7 @@ int subprog_tail(struct __sk_buff *skb) return 1; }
-static volatile int count; +int count = 0;
SEC("classifier/0") int bpf_func_0(struct __sk_buff *skb) diff --git a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c index 9a1b166b7fbe..77df6d4db895 100644 --- a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c +++ b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c @@ -9,7 +9,7 @@ struct { __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps");
-static volatile int count; +int count = 0;
__noinline int subprog_tail_2(struct __sk_buff *skb) diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect.c b/tools/testing/selftests/bpf/progs/test_cls_redirect.c index c9f8464996ea..8eaeef01a925 100644 --- a/tools/testing/selftests/bpf/progs/test_cls_redirect.c +++ b/tools/testing/selftests/bpf/progs/test_cls_redirect.c @@ -39,8 +39,8 @@ char _license[] SEC("license") = "Dual BSD/GPL"; /** * Destination port and IP used for UDP encapsulation. */ -static volatile const __be16 ENCAPSULATION_PORT; -static volatile const __be32 ENCAPSULATION_IP; +volatile const __be16 ENCAPSULATION_PORT; +volatile const __be32 ENCAPSULATION_IP;
typedef struct { uint64_t processed_packets_total; diff --git a/tools/testing/selftests/bpf/progs/test_global_func_args.c b/tools/testing/selftests/bpf/progs/test_global_func_args.c index cae309538a9e..e712bf77daae 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func_args.c +++ b/tools/testing/selftests/bpf/progs/test_global_func_args.c @@ -8,7 +8,7 @@ struct S { int v; };
-static volatile struct S global_variable; +struct S global_variable = {};
struct { __uint(type, BPF_MAP_TYPE_ARRAY); diff --git a/tools/testing/selftests/bpf/progs/test_rdonly_maps.c b/tools/testing/selftests/bpf/progs/test_rdonly_maps.c index ecbeea2df259..fc8e8a34a3db 100644 --- a/tools/testing/selftests/bpf/progs/test_rdonly_maps.c +++ b/tools/testing/selftests/bpf/progs/test_rdonly_maps.c @@ -5,7 +5,7 @@ #include <linux/bpf.h> #include <bpf/bpf_helpers.h>
-static volatile const struct { +const struct { unsigned a[4]; /* * if the struct's size is multiple of 16, compiler will put it into @@ -15,11 +15,11 @@ static volatile const struct { char _y; } rdonly_values = { .a = {2, 3, 4, 5} };
-static volatile struct { +struct { unsigned did_run; unsigned iters; unsigned sum; -} res; +} res = {};
SEC("raw_tracepoint/sys_enter:skip_loop") int skip_loop(struct pt_regs *ctx) diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c index 374ccef704e1..441fa1c552c8 100644 --- a/tools/testing/selftests/bpf/progs/test_skeleton.c +++ b/tools/testing/selftests/bpf/progs/test_skeleton.c @@ -38,11 +38,11 @@ extern int LINUX_KERNEL_VERSION __kconfig; bool bpf_syscall = 0; int kern_ver = 0;
+struct s out5 = {}; + SEC("raw_tp/sys_enter") int handler(const void *ctx) { - static volatile struct s out5; - out1 = in1; out2 = in2; out3 = in3; diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c index a3a366c57ce1..b8539b4d29ac 100644 --- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c +++ b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c @@ -28,7 +28,7 @@ struct { __type(value, unsigned int); } verdict_map SEC(".maps");
-static volatile bool test_sockmap; /* toggled by user-space */ +bool test_sockmap; /* toggled by user-space */
SEC("sk_skb/stream_parser") int prog_skb_parser(struct __sk_buff *skb) diff --git a/tools/testing/selftests/bpf/progs/test_static_linked1.c b/tools/testing/selftests/bpf/progs/test_static_linked1.c index ea1a6c4c7172..cae304045d9c 100644 --- a/tools/testing/selftests/bpf/progs/test_static_linked1.c +++ b/tools/testing/selftests/bpf/progs/test_static_linked1.c @@ -4,9 +4,9 @@ #include <linux/bpf.h> #include <bpf/bpf_helpers.h>
-/* 8-byte aligned .bss */ -static volatile long static_var1; -static volatile int static_var11; +/* 8-byte aligned .data */ +static volatile long static_var1 = 2; +static volatile int static_var2 = 3; int var1 = 0; /* 4-byte aligned .rodata */ const volatile int rovar1; @@ -21,7 +21,7 @@ static __noinline int subprog(int x) SEC("raw_tp/sys_enter") int handler1(const void *ctx) { - var1 = subprog(rovar1) + static_var1 + static_var11; + var1 = subprog(rovar1) + static_var1 + static_var2;
return 0; } diff --git a/tools/testing/selftests/bpf/progs/test_static_linked2.c b/tools/testing/selftests/bpf/progs/test_static_linked2.c index 54d8d1ab577c..c54c4e865ed8 100644 --- a/tools/testing/selftests/bpf/progs/test_static_linked2.c +++ b/tools/testing/selftests/bpf/progs/test_static_linked2.c @@ -4,9 +4,9 @@ #include <linux/bpf.h> #include <bpf/bpf_helpers.h>
-/* 4-byte aligned .bss */ -static volatile int static_var2; -static volatile int static_var22; +/* 4-byte aligned .data */ +static volatile int static_var1 = 5; +static volatile int static_var2 = 6; int var2 = 0; /* 8-byte aligned .rodata */ const volatile long rovar2; @@ -21,7 +21,7 @@ static __noinline int subprog(int x) SEC("raw_tp/sys_enter") int handler2(const void *ctx) { - var2 = subprog(rovar2) + static_var2 + static_var22; + var2 = subprog(rovar2) + static_var1 + static_var2;
return 0; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 31332ccb756274c185cfd458b68b29a9371dceac category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
As discussed in [0], stop emitting static variables in BPF skeletons to avoid issues with name-conflicting static variables across multiple statically-linked BPF object files.
Users using static variables to pass data between BPF programs and user-space should do a trivial one-time switch according to the following simple rules: - read-only `static volatile const` variables should be converted to `volatile const`; - read/write `static volatile` variables should just drop `static volatile` modifiers to become global variables/symbols. To better handle older Clang versions, such newly converted global variables should be explicitly initialized with a specific value or `= 0`/`= {}`, whichever is appropriate.
[0] https://lore.kernel.org/bpf/CAEf4BzZo7_r-hsNvJt3w3kyrmmBJj7ghGY8+k4nvKF0KLjm...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210507054119.270888-5-andrii@kernel.org (cherry picked from commit 31332ccb756274c185cfd458b68b29a9371dceac) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 06fee4a2910a..27dceaf66ecb 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -131,6 +131,10 @@ static int codegen_datasec_def(struct bpf_object *obj, int need_off = sec_var->offset, align_off, align; __u32 var_type_id = var->type;
+ /* static variables are not exposed through BPF skeleton */ + if (btf_var(var)->linkage == BTF_VAR_STATIC) + continue; + if (off > need_off) { p_err("Something is wrong for %s's variable #%d: need offset %d, already at %d.\n", sec_name, i, need_off, off);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 247b8634e6446dbc8024685f803290501cba226f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Fix silly bug in updating ELF symbol's visibility.
Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210507054119.270888-6-andrii@kernel.org (cherry picked from commit 247b8634e6446dbc8024685f803290501cba226f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index b82a78d6d957..488029cb4e42 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1788,7 +1788,7 @@ static void sym_update_visibility(Elf64_Sym *sym, int sym_vis) /* libelf doesn't provide setters for ST_VISIBILITY, * but it is stored in the lower 2 bits of st_other */ - sym->st_other &= 0x03; + sym->st_other &= ~0x03; sym->st_other |= sym_vis; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit e5670fa0293b05e8e24dae7d18481aba281cb85d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Do the same global -> static BTF update for global functions with STV_INTERNAL visibility to turn on static BPF verification mode.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210507054119.270888-7-andrii@kernel.org (cherry picked from commit e5670fa0293b05e8e24dae7d18481aba281cb85d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 565998539c09..d48f2b4fa73b 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -700,13 +700,14 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, if (err) return err;
- /* if function is a global/weak symbol, but has hidden - * visibility (STV_HIDDEN), mark its BTF FUNC as static to - * enable more permissive BPF verification mode with more - * outside context available to BPF verifier + /* if function is a global/weak symbol, but has restricted + * (STV_HIDDEN or STV_INTERNAL) visibility, mark its BTF FUNC + * as static to enable more permissive BPF verification mode + * with more outside context available to BPF verifier */ if (GELF_ST_BIND(sym.st_info) != STB_LOCAL - && GELF_ST_VISIBILITY(sym.st_other) == STV_HIDDEN) + && (GELF_ST_VISIBILITY(sym.st_other) == STV_HIDDEN + || GELF_ST_VISIBILITY(sym.st_other) == STV_INTERNAL)) prog->mark_btf_static = true;
nr_progs++;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit c1cccec9c63637c4c5ee0aa2da2850d983c19e88 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Static maps never really worked with libbpf, because all such maps were always silently resolved to the very first map. Detect static maps (both legacy and BTF-defined) and report user-friendly error.
Tested locally by switching few maps (legacy and BTF-defined) in selftests to static ones and verifying that now libbpf rejects them loudly.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210513233643.194711-2-andrii@kernel.org (cherry picked from commit c1cccec9c63637c4c5ee0aa2da2850d983c19e88) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index d48f2b4fa73b..f909e03a2deb 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1795,7 +1795,6 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) if (!symbols) return -EINVAL;
- scn = elf_sec_by_idx(obj, obj->efile.maps_shndx); data = elf_sec_data(obj, scn); if (!scn || !data) { @@ -1855,6 +1854,12 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) return -LIBBPF_ERRNO__FORMAT; }
+ if (GELF_ST_TYPE(sym.st_info) == STT_SECTION + || GELF_ST_BIND(sym.st_info) == STB_LOCAL) { + pr_warn("map '%s' (legacy): static maps are not supported\n", map_name); + return -ENOTSUP; + } + map->libbpf_type = LIBBPF_MAP_UNSPEC; map->sec_idx = sym.st_shndx; map->sec_offset = sym.st_value; @@ -2262,6 +2267,16 @@ static void fill_map_from_def(struct bpf_map *map, const struct btf_map_def *def pr_debug("map '%s': found inner map definition.\n", map->name); }
+static const char *btf_var_linkage_str(__u32 linkage) +{ + switch (linkage) { + case BTF_VAR_STATIC: return "static"; + case BTF_VAR_GLOBAL_ALLOCATED: return "global"; + case BTF_VAR_GLOBAL_EXTERN: return "extern"; + default: return "unknown"; + } +} + static int bpf_object__init_user_btf_map(struct bpf_object *obj, const struct btf_type *sec, int var_idx, int sec_idx, @@ -2294,10 +2309,9 @@ static int bpf_object__init_user_btf_map(struct bpf_object *obj, map_name, btf_kind_str(var)); return -EINVAL; } - if (var_extra->linkage != BTF_VAR_GLOBAL_ALLOCATED && - var_extra->linkage != BTF_VAR_STATIC) { - pr_warn("map '%s': unsupported var linkage %u.\n", - map_name, var_extra->linkage); + if (var_extra->linkage != BTF_VAR_GLOBAL_ALLOCATED) { + pr_warn("map '%s': unsupported map linkage %s.\n", + map_name, btf_var_linkage_str(var_extra->linkage)); return -EOPNOTSUPP; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 513f485ca5163c6cba869602d076a8e2f04d1ca1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Detect use of static entry-point BPF programs (those with SEC() markings) and emit error message. This is similar to c1cccec9c636 ("libbpf: Reject static maps") but for BPF programs.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20210514195534.1440970-1-andrii@kernel.org (cherry picked from commit 513f485ca5163c6cba869602d076a8e2f04d1ca1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f909e03a2deb..6b57e429ba7d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -677,6 +677,11 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, return -LIBBPF_ERRNO__FORMAT; }
+ if (sec_idx != obj->efile.text_shndx && GELF_ST_BIND(sym.st_info) == STB_LOCAL) { + pr_warn("sec '%s': program '%s' is static and not supported\n", sec_name, name); + return -ENOTSUP; + } + pr_debug("sec '%s': found program '%s' at insn offset %zu (%zu bytes), code size %zu insns (%zu bytes)\n", sec_name, name, sec_off / BPF_INSN_SZ, sec_off, prog_sz / BPF_INSN_SZ, prog_sz);
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.14-rc1 commit 8bbb77b7c7a226803270dac3fc8dd564fd2f5756 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This change introduces a few helpers to wrap open coded attribute preparation in netlink.c. It also adds a libbpf_netlink_send_recv() that is useful to wrap send + recv handling in a generic way. Subsequent patch will also use this function for sending and receiving a netlink response. The libbpf_nl_get_link() helper has been removed instead, moving socket creation into the newly named libbpf_netlink_send_recv().
Every nested attribute's closure must happen using the helper nlattr_end_nested(), which sets its length properly. NLA_F_NESTED is enforced using nlattr_begin_nested() helper. Other simple attributes can be added directly.
The maxsz parameter corresponds to the size of the request structure which is being filled in, so for instance with req being:
struct { struct nlmsghdr nh; struct tcmsg t; char buf[4096]; } req;
Then, maxsz should be sizeof(req).
This change also converts the open coded attribute preparation with these helpers. Note that the only failure the internal call to nlattr_add() could result in the nested helper would be -EMSGSIZE, hence that is what we return to our caller.
The libbpf_netlink_send_recv() call takes care of opening the socket, sending the netlink message, receiving the response, potentially invoking callbacks, and return errors if any, and then finally close the socket. This allows users to avoid identical socket setup code in different places. The only user of libbpf_nl_get_link() has been converted to make use of it. __bpf_set_link_xdp_fd_replace() has also been refactored to use it.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com [ Daniel: major patch cleanup ] Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210512103451.989420-2-memxor@gmail.com
(cherry picked from commit 8bbb77b7c7a226803270dac3fc8dd564fd2f5756) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/netlink.c | 161 ++++++++++++++++++---------------------- tools/lib/bpf/nlattr.h | 48 ++++++++++++ 2 files changed, 120 insertions(+), 89 deletions(-)
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index d2cb28e9ef52..8bbdc6c38f06 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -73,9 +73,14 @@ static int libbpf_netlink_open(__u32 *nl_pid) return ret; }
-static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq, - __dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn, - void *cookie) +static void libbpf_netlink_close(int sock) +{ + close(sock); +} + +static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq, + __dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn, + void *cookie) { bool multipart = true; struct nlmsgerr *err; @@ -131,72 +136,72 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq, return ret; }
+static int libbpf_netlink_send_recv(struct nlmsghdr *nh, + __dump_nlmsg_t parse_msg, + libbpf_dump_nlmsg_t parse_attr, + void *cookie) +{ + __u32 nl_pid = 0; + int sock, ret; + + sock = libbpf_netlink_open(&nl_pid); + if (sock < 0) + return sock; + + nh->nlmsg_pid = 0; + nh->nlmsg_seq = time(NULL); + + if (send(sock, nh, nh->nlmsg_len, 0) < 0) { + ret = -errno; + goto out; + } + + ret = libbpf_netlink_recv(sock, nl_pid, nh->nlmsg_seq, + parse_msg, parse_attr, cookie); +out: + libbpf_netlink_close(sock); + return ret; +} + static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd, __u32 flags) { - int sock, seq = 0, ret; - struct nlattr *nla, *nla_xdp; + struct nlattr *nla; + int ret; struct { struct nlmsghdr nh; struct ifinfomsg ifinfo; char attrbuf[64]; } req; - __u32 nl_pid = 0; - - sock = libbpf_netlink_open(&nl_pid); - if (sock < 0) - return sock;
memset(&req, 0, sizeof(req)); - req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)); - req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; - req.nh.nlmsg_type = RTM_SETLINK; - req.nh.nlmsg_pid = 0; - req.nh.nlmsg_seq = ++seq; + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)); + req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + req.nh.nlmsg_type = RTM_SETLINK; req.ifinfo.ifi_family = AF_UNSPEC; - req.ifinfo.ifi_index = ifindex; - - /* started nested attribute for XDP */ - nla = (struct nlattr *)(((char *)&req) - + NLMSG_ALIGN(req.nh.nlmsg_len)); - nla->nla_type = NLA_F_NESTED | IFLA_XDP; - nla->nla_len = NLA_HDRLEN; - - /* add XDP fd */ - nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len); - nla_xdp->nla_type = IFLA_XDP_FD; - nla_xdp->nla_len = NLA_HDRLEN + sizeof(int); - memcpy((char *)nla_xdp + NLA_HDRLEN, &fd, sizeof(fd)); - nla->nla_len += nla_xdp->nla_len; - - /* if user passed in any flags, add those too */ + req.ifinfo.ifi_index = ifindex; + + nla = nlattr_begin_nested(&req.nh, sizeof(req), IFLA_XDP); + if (!nla) + return -EMSGSIZE; + ret = nlattr_add(&req.nh, sizeof(req), IFLA_XDP_FD, &fd, sizeof(fd)); + if (ret < 0) + return ret; if (flags) { - nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len); - nla_xdp->nla_type = IFLA_XDP_FLAGS; - nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags); - memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags)); - nla->nla_len += nla_xdp->nla_len; + ret = nlattr_add(&req.nh, sizeof(req), IFLA_XDP_FLAGS, &flags, + sizeof(flags)); + if (ret < 0) + return ret; } - if (flags & XDP_FLAGS_REPLACE) { - nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len); - nla_xdp->nla_type = IFLA_XDP_EXPECTED_FD; - nla_xdp->nla_len = NLA_HDRLEN + sizeof(old_fd); - memcpy((char *)nla_xdp + NLA_HDRLEN, &old_fd, sizeof(old_fd)); - nla->nla_len += nla_xdp->nla_len; + ret = nlattr_add(&req.nh, sizeof(req), IFLA_XDP_EXPECTED_FD, + &old_fd, sizeof(old_fd)); + if (ret < 0) + return ret; } + nlattr_end_nested(&req.nh, nla);
- req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len); - - if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) { - ret = -errno; - goto cleanup; - } - ret = bpf_netlink_recv(sock, nl_pid, seq, NULL, NULL, NULL); - -cleanup: - close(sock); - return ret; + return libbpf_netlink_send_recv(&req.nh, NULL, NULL, NULL); }
int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags, @@ -212,9 +217,7 @@ int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags, flags |= XDP_FLAGS_REPLACE; }
- return __bpf_set_link_xdp_fd_replace(ifindex, fd, - old_fd, - flags); + return __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags); }
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags) @@ -231,6 +234,7 @@ static int __dump_link_nlmsg(struct nlmsghdr *nlh,
len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*ifi)); attr = (struct nlattr *) ((void *) ifi + NLMSG_ALIGN(sizeof(*ifi))); + if (libbpf_nla_parse(tb, IFLA_MAX, attr, len, NULL) != 0) return -LIBBPF_ERRNO__NLPARSE;
@@ -282,16 +286,21 @@ static int get_xdp_info(void *cookie, void *msg, struct nlattr **tb) return 0; }
-static int libbpf_nl_get_link(int sock, unsigned int nl_pid, - libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie); - int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, size_t info_size, __u32 flags) { struct xdp_id_md xdp_id = {}; - int sock, ret; - __u32 nl_pid = 0; __u32 mask; + int ret; + struct { + struct nlmsghdr nh; + struct ifinfomsg ifm; + } req = { + .nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)), + .nh.nlmsg_type = RTM_GETLINK, + .nh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST, + .ifm.ifi_family = AF_PACKET, + };
if (flags & ~XDP_FLAGS_MASK || !info_size) return -EINVAL; @@ -302,14 +311,11 @@ int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, if (flags && flags & mask) return -EINVAL;
- sock = libbpf_netlink_open(&nl_pid); - if (sock < 0) - return sock; - xdp_id.ifindex = ifindex; xdp_id.flags = flags;
- ret = libbpf_nl_get_link(sock, nl_pid, get_xdp_info, &xdp_id); + ret = libbpf_netlink_send_recv(&req.nh, __dump_link_nlmsg, + get_xdp_info, &xdp_id); if (!ret) { size_t sz = min(info_size, sizeof(xdp_id.info));
@@ -317,7 +323,6 @@ int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, memset((void *) info + sz, 0, info_size - sz); }
- close(sock); return ret; }
@@ -348,25 +353,3 @@ int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
return ret; } - -int libbpf_nl_get_link(int sock, unsigned int nl_pid, - libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie) -{ - struct { - struct nlmsghdr nlh; - struct ifinfomsg ifm; - } req = { - .nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)), - .nlh.nlmsg_type = RTM_GETLINK, - .nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST, - .ifm.ifi_family = AF_PACKET, - }; - int seq = time(NULL); - - req.nlh.nlmsg_seq = seq; - if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0) - return -errno; - - return bpf_netlink_recv(sock, nl_pid, seq, __dump_link_nlmsg, - dump_link_nlmsg, cookie); -} diff --git a/tools/lib/bpf/nlattr.h b/tools/lib/bpf/nlattr.h index 6cc3ac91690f..3c780ab6d022 100644 --- a/tools/lib/bpf/nlattr.h +++ b/tools/lib/bpf/nlattr.h @@ -10,7 +10,10 @@ #define __LIBBPF_NLATTR_H
#include <stdint.h> +#include <string.h> +#include <errno.h> #include <linux/netlink.h> + /* avoid multiple definition of netlink features */ #define __LINUX_NETLINK_H
@@ -103,4 +106,49 @@ int libbpf_nla_parse_nested(struct nlattr *tb[], int maxtype,
int libbpf_nla_dump_errormsg(struct nlmsghdr *nlh);
+static inline struct nlattr *nla_data(struct nlattr *nla) +{ + return (struct nlattr *)((char *)nla + NLA_HDRLEN); +} + +static inline struct nlattr *nh_tail(struct nlmsghdr *nh) +{ + return (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len)); +} + +static inline int nlattr_add(struct nlmsghdr *nh, size_t maxsz, int type, + const void *data, int len) +{ + struct nlattr *nla; + + if (NLMSG_ALIGN(nh->nlmsg_len) + NLA_ALIGN(NLA_HDRLEN + len) > maxsz) + return -EMSGSIZE; + if (!!data != !!len) + return -EINVAL; + + nla = nh_tail(nh); + nla->nla_type = type; + nla->nla_len = NLA_HDRLEN + len; + if (data) + memcpy(nla_data(nla), data, len); + nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_ALIGN(nla->nla_len); + return 0; +} + +static inline struct nlattr *nlattr_begin_nested(struct nlmsghdr *nh, + size_t maxsz, int type) +{ + struct nlattr *tail; + + tail = nh_tail(nh); + if (nlattr_add(nh, maxsz, type | NLA_F_NESTED, NULL, 0)) + return NULL; + return tail; +} + +static inline void nlattr_end_nested(struct nlmsghdr *nh, struct nlattr *tail) +{ + tail->nla_len = (char *)nh_tail(nh) - (char *)tail; +} + #endif /* __LIBBPF_NLATTR_H */
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.14-rc1 commit 715c5ce454a6a9b94a1a4a3360de6a87eaf0d833 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This adds functions that wrap the netlink API used for adding, manipulating, and removing traffic control filters.
The API summary:
A bpf_tc_hook represents a location where a TC-BPF filter can be attached. This means that creating a hook leads to creation of the backing qdisc, while destruction either removes all filters attached to a hook, or destroys qdisc if requested explicitly (as discussed below).
The TC-BPF API functions operate on this bpf_tc_hook to attach, replace, query, and detach tc filters. All functions return 0 on success, and a negative error code on failure.
bpf_tc_hook_create - Create a hook Parameters: @hook - Cannot be NULL, ifindex > 0, attach_point must be set to proper enum constant. Note that parent must be unset when attach_point is one of BPF_TC_INGRESS or BPF_TC_EGRESS. Note that as an exception BPF_TC_INGRESS|BPF_TC_EGRESS is also a valid value for attach_point.
Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM.
bpf_tc_hook_destroy - Destroy a hook Parameters: @hook - Cannot be NULL. The behaviour depends on value of attach_point. If BPF_TC_INGRESS, all filters attached to the ingress hook will be detached. If BPF_TC_EGRESS, all filters attached to the egress hook will be detached. If BPF_TC_INGRESS|BPF_TC_EGRESS, the clsact qdisc will be deleted, also detaching all filters. As before, parent must be unset for these attach_points, and set for BPF_TC_CUSTOM.
It is advised that if the qdisc is operated on by many programs, then the program at least check that there are no other existing filters before deleting the clsact qdisc. An example is shown below:
DECLARE_LIBBPF_OPTS(bpf_tc_hook, .ifindex = if_nametoindex("lo"), .attach_point = BPF_TC_INGRESS); /* set opts as NULL, as we're not really interested in * getting any info for a particular filter, but just * detecting its presence. */ r = bpf_tc_query(&hook, NULL); if (r == -ENOENT) { /* no filters */ hook.attach_point = BPF_TC_INGRESS|BPF_TC_EGREESS; return bpf_tc_hook_destroy(&hook); } else { /* failed or r == 0, the latter means filters do exist */ return r; }
Note that there is a small race between checking for no filters and deleting the qdisc. This is currently unavoidable.
Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM.
bpf_tc_attach - Attach a filter to a hook Parameters: @hook - Cannot be NULL. Represents the hook the filter will be attached to. Requirements for ifindex and attach_point are same as described in bpf_tc_hook_create, but BPF_TC_CUSTOM is also supported. In that case, parent must be set to the handle where the filter will be attached (using BPF_TC_PARENT). E.g. to set parent to 1:16 like in tc command line, the equivalent would be BPF_TC_PARENT(1, 16).
@opts - Cannot be NULL. The following opts are optional: * handle - The handle of the filter * priority - The priority of the filter Must be >= 0 and <= UINT16_MAX Note that when left unset, they will be auto-allocated by the kernel. The following opts must be set: * prog_fd - The fd of the loaded SCHED_CLS prog The following opts must be unset: * prog_id - The ID of the BPF prog The following opts are optional: * flags - Currently only BPF_TC_F_REPLACE is allowed. It allows replacing an existing filter instead of failing with -EEXIST. The following opts will be filled by bpf_tc_attach on a successful attach operation if they are unset: * handle - The handle of the attached filter * priority - The priority of the attached filter * prog_id - The ID of the attached SCHED_CLS prog This way, the user can know what the auto allocated values for optional opts like handle and priority are for the newly attached filter, if they were unset.
Note that some other attributes are set to fixed default values listed below (this holds for all bpf_tc_* APIs): protocol as ETH_P_ALL, direct action mode, chain index of 0, and class ID of 0 (this can be set by writing to the skb->tc_classid field from the BPF program).
bpf_tc_detach Parameters: @hook - Cannot be NULL. Represents the hook the filter will be detached from. Requirements are same as described above in bpf_tc_attach.
@opts - Cannot be NULL. The following opts must be set: * handle, priority The following opts must be unset: * prog_fd, prog_id, flags
bpf_tc_query Parameters: @hook - Cannot be NULL. Represents the hook where the filter lookup will be performed. Requirements are same as described above in bpf_tc_attach().
@opts - Cannot be NULL. The following opts must be set: * handle, priority The following opts must be unset: * prog_fd, prog_id, flags The following fields will be filled by bpf_tc_query upon a successful lookup: * prog_id
Some usage examples (using BPF skeleton infrastructure):
BPF program (test_tc_bpf.c):
#include <linux/bpf.h> #include <bpf/bpf_helpers.h>
SEC("classifier") int cls(struct __sk_buff *skb) { return 0; }
Userspace loader:
struct test_tc_bpf *skel = NULL; int fd, r;
skel = test_tc_bpf__open_and_load(); if (!skel) return -ENOMEM;
fd = bpf_program__fd(skel->progs.cls);
DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = if_nametoindex("lo"), .attach_point = BPF_TC_INGRESS); /* Create clsact qdisc */ r = bpf_tc_hook_create(&hook); if (r < 0) goto end;
DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts, .prog_fd = fd); r = bpf_tc_attach(&hook, &opts); if (r < 0) goto end; /* Print the auto allocated handle and priority */ printf("Handle=%u", opts.handle); printf("Priority=%u", opts.priority);
opts.prog_fd = opts.prog_id = 0; bpf_tc_detach(&hook, &opts); end: test_tc_bpf__destroy(skel);
This is equivalent to doing the following using tc command line: # tc qdisc add dev lo clsact # tc filter add dev lo ingress bpf obj foo.o sec classifier da # tc filter del dev lo ingress handle <h> prio <p> bpf ... where the handle and priority can be found using: # tc filter show dev lo ingress
Another example replacing a filter (extending prior example):
/* We can also choose both (or one), let's try replacing an * existing filter. */ DECLARE_LIBBPF_OPTS(bpf_tc_opts, replace_opts, .handle = opts.handle, .priority = opts.priority, .prog_fd = fd); r = bpf_tc_attach(&hook, &replace_opts); if (r == -EEXIST) { /* Expected, now use BPF_TC_F_REPLACE to replace it */ replace_opts.flags = BPF_TC_F_REPLACE; return bpf_tc_attach(&hook, &replace_opts); } else if (r < 0) { return r; } /* There must be no existing filter with these * attributes, so cleanup and return an error. */ replace_opts.prog_fd = replace_opts.prog_id = 0; bpf_tc_detach(&hook, &replace_opts); return -1;
To obtain info of a particular filter:
/* Find info for filter with handle 1 and priority 50 */ DECLARE_LIBBPF_OPTS(bpf_tc_opts, info_opts, .handle = 1, .priority = 50); r = bpf_tc_query(&hook, &info_opts); if (r == -ENOENT) printf("Filter not found"); else if (r < 0) return r; printf("Prog ID: %u", info_opts.prog_id); return 0;
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Co-developed-by: Daniel Borkmann daniel@iogearbox.net # libbpf API design [ Daniel: also did major patch cleanup ] Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210512103451.989420-3-memxor@gmail.com
(cherry picked from commit 715c5ce454a6a9b94a1a4a3360de6a87eaf0d833) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.h | 44 ++++ tools/lib/bpf/libbpf.map | 5 + tools/lib/bpf/netlink.c | 421 ++++++++++++++++++++++++++++++++++++++- 3 files changed, 469 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 9f2c588faf3d..e2c0aaf20416 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -502,6 +502,7 @@ LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, LIBBPF_API int bpf_prog_load(const char *file, enum bpf_prog_type type, struct bpf_object **pobj, int *prog_fd);
+/* XDP related API */ struct xdp_link_info { __u32 prog_id; __u32 drv_prog_id; @@ -524,6 +525,49 @@ LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags); LIBBPF_API int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, size_t info_size, __u32 flags);
+/* TC related API */ +enum bpf_tc_attach_point { + BPF_TC_INGRESS = 1 << 0, + BPF_TC_EGRESS = 1 << 1, + BPF_TC_CUSTOM = 1 << 2, +}; + +#define BPF_TC_PARENT(a, b) \ + ((((a) << 16) & 0xFFFF0000U) | ((b) & 0x0000FFFFU)) + +enum bpf_tc_flags { + BPF_TC_F_REPLACE = 1 << 0, +}; + +struct bpf_tc_hook { + size_t sz; + int ifindex; + enum bpf_tc_attach_point attach_point; + __u32 parent; + size_t :0; +}; +#define bpf_tc_hook__last_field parent + +struct bpf_tc_opts { + size_t sz; + int prog_fd; + __u32 flags; + __u32 prog_id; + __u32 handle; + __u32 priority; + size_t :0; +}; +#define bpf_tc_opts__last_field priority + +LIBBPF_API int bpf_tc_hook_create(struct bpf_tc_hook *hook); +LIBBPF_API int bpf_tc_hook_destroy(struct bpf_tc_hook *hook); +LIBBPF_API int bpf_tc_attach(const struct bpf_tc_hook *hook, + struct bpf_tc_opts *opts); +LIBBPF_API int bpf_tc_detach(const struct bpf_tc_hook *hook, + const struct bpf_tc_opts *opts); +LIBBPF_API int bpf_tc_query(const struct bpf_tc_hook *hook, + struct bpf_tc_opts *opts); + /* Ring buffer APIs */ struct ring_buffer;
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 238830f75522..cd09c984ccdc 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -361,4 +361,9 @@ LIBBPF_0.4.0 { bpf_linker__new; bpf_map__inner_map; bpf_object__set_kversion; + bpf_tc_attach; + bpf_tc_detach; + bpf_tc_hook_create; + bpf_tc_hook_destroy; + bpf_tc_query; } LIBBPF_0.3.0; diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index 8bbdc6c38f06..47444588e0d2 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -4,7 +4,10 @@ #include <stdlib.h> #include <memory.h> #include <unistd.h> +#include <arpa/inet.h> #include <linux/bpf.h> +#include <linux/if_ether.h> +#include <linux/pkt_cls.h> #include <linux/rtnetlink.h> #include <sys/socket.h> #include <errno.h> @@ -78,6 +81,12 @@ static void libbpf_netlink_close(int sock) close(sock); }
+enum { + NL_CONT, + NL_NEXT, + NL_DONE, +}; + static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq, __dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn, void *cookie) @@ -89,6 +98,7 @@ static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq, int len, ret;
while (multipart) { +start: multipart = false; len = recv(sock, buf, sizeof(buf), 0); if (len < 0) { @@ -126,8 +136,16 @@ static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq, } if (_fn) { ret = _fn(nh, fn, cookie); - if (ret) + switch (ret) { + case NL_CONT: + break; + case NL_NEXT: + goto start; + case NL_DONE: + return 0; + default: return ret; + } } } } @@ -353,3 +371,404 @@ int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
return ret; } + +typedef int (*qdisc_config_t)(struct nlmsghdr *nh, struct tcmsg *t, + size_t maxsz); + +static int clsact_config(struct nlmsghdr *nh, struct tcmsg *t, size_t maxsz) +{ + t->tcm_parent = TC_H_CLSACT; + t->tcm_handle = TC_H_MAKE(TC_H_CLSACT, 0); + + return nlattr_add(nh, maxsz, TCA_KIND, "clsact", sizeof("clsact")); +} + +static int attach_point_to_config(struct bpf_tc_hook *hook, + qdisc_config_t *config) +{ + switch (OPTS_GET(hook, attach_point, 0)) { + case BPF_TC_INGRESS: + case BPF_TC_EGRESS: + case BPF_TC_INGRESS | BPF_TC_EGRESS: + if (OPTS_GET(hook, parent, 0)) + return -EINVAL; + *config = &clsact_config; + return 0; + case BPF_TC_CUSTOM: + return -EOPNOTSUPP; + default: + return -EINVAL; + } +} + +static int tc_get_tcm_parent(enum bpf_tc_attach_point attach_point, + __u32 *parent) +{ + switch (attach_point) { + case BPF_TC_INGRESS: + case BPF_TC_EGRESS: + if (*parent) + return -EINVAL; + *parent = TC_H_MAKE(TC_H_CLSACT, + attach_point == BPF_TC_INGRESS ? + TC_H_MIN_INGRESS : TC_H_MIN_EGRESS); + break; + case BPF_TC_CUSTOM: + if (!*parent) + return -EINVAL; + break; + default: + return -EINVAL; + } + return 0; +} + +static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags) +{ + qdisc_config_t config; + int ret; + struct { + struct nlmsghdr nh; + struct tcmsg tc; + char buf[256]; + } req; + + ret = attach_point_to_config(hook, &config); + if (ret < 0) + return ret; + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)); + req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | flags; + req.nh.nlmsg_type = cmd; + req.tc.tcm_family = AF_UNSPEC; + req.tc.tcm_ifindex = OPTS_GET(hook, ifindex, 0); + + ret = config(&req.nh, &req.tc, sizeof(req)); + if (ret < 0) + return ret; + + return libbpf_netlink_send_recv(&req.nh, NULL, NULL, NULL); +} + +static int tc_qdisc_create_excl(struct bpf_tc_hook *hook) +{ + return tc_qdisc_modify(hook, RTM_NEWQDISC, NLM_F_CREATE); +} + +static int tc_qdisc_delete(struct bpf_tc_hook *hook) +{ + return tc_qdisc_modify(hook, RTM_DELQDISC, 0); +} + +int bpf_tc_hook_create(struct bpf_tc_hook *hook) +{ + if (!hook || !OPTS_VALID(hook, bpf_tc_hook) || + OPTS_GET(hook, ifindex, 0) <= 0) + return -EINVAL; + + return tc_qdisc_create_excl(hook); +} + +static int __bpf_tc_detach(const struct bpf_tc_hook *hook, + const struct bpf_tc_opts *opts, + const bool flush); + +int bpf_tc_hook_destroy(struct bpf_tc_hook *hook) +{ + if (!hook || !OPTS_VALID(hook, bpf_tc_hook) || + OPTS_GET(hook, ifindex, 0) <= 0) + return -EINVAL; + + switch (OPTS_GET(hook, attach_point, 0)) { + case BPF_TC_INGRESS: + case BPF_TC_EGRESS: + return __bpf_tc_detach(hook, NULL, true); + case BPF_TC_INGRESS | BPF_TC_EGRESS: + return tc_qdisc_delete(hook); + case BPF_TC_CUSTOM: + return -EOPNOTSUPP; + default: + return -EINVAL; + } +} + +struct bpf_cb_ctx { + struct bpf_tc_opts *opts; + bool processed; +}; + +static int __get_tc_info(void *cookie, struct tcmsg *tc, struct nlattr **tb, + bool unicast) +{ + struct nlattr *tbb[TCA_BPF_MAX + 1]; + struct bpf_cb_ctx *info = cookie; + + if (!info || !info->opts) + return -EINVAL; + if (unicast && info->processed) + return -EINVAL; + if (!tb[TCA_OPTIONS]) + return NL_CONT; + + libbpf_nla_parse_nested(tbb, TCA_BPF_MAX, tb[TCA_OPTIONS], NULL); + if (!tbb[TCA_BPF_ID]) + return -EINVAL; + + OPTS_SET(info->opts, prog_id, libbpf_nla_getattr_u32(tbb[TCA_BPF_ID])); + OPTS_SET(info->opts, handle, tc->tcm_handle); + OPTS_SET(info->opts, priority, TC_H_MAJ(tc->tcm_info) >> 16); + + info->processed = true; + return unicast ? NL_NEXT : NL_DONE; +} + +static int get_tc_info(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn, + void *cookie) +{ + struct tcmsg *tc = NLMSG_DATA(nh); + struct nlattr *tb[TCA_MAX + 1]; + + libbpf_nla_parse(tb, TCA_MAX, + (struct nlattr *)((char *)tc + NLMSG_ALIGN(sizeof(*tc))), + NLMSG_PAYLOAD(nh, sizeof(*tc)), NULL); + if (!tb[TCA_KIND]) + return NL_CONT; + return __get_tc_info(cookie, tc, tb, nh->nlmsg_flags & NLM_F_ECHO); +} + +static int tc_add_fd_and_name(struct nlmsghdr *nh, size_t maxsz, int fd) +{ + struct bpf_prog_info info = {}; + __u32 info_len = sizeof(info); + char name[256]; + int len, ret; + + ret = bpf_obj_get_info_by_fd(fd, &info, &info_len); + if (ret < 0) + return ret; + + ret = nlattr_add(nh, maxsz, TCA_BPF_FD, &fd, sizeof(fd)); + if (ret < 0) + return ret; + len = snprintf(name, sizeof(name), "%s:[%u]", info.name, info.id); + if (len < 0) + return -errno; + if (len >= sizeof(name)) + return -ENAMETOOLONG; + return nlattr_add(nh, maxsz, TCA_BPF_NAME, name, len + 1); +} + +int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) +{ + __u32 protocol, bpf_flags, handle, priority, parent, prog_id, flags; + int ret, ifindex, attach_point, prog_fd; + struct bpf_cb_ctx info = {}; + struct nlattr *nla; + struct { + struct nlmsghdr nh; + struct tcmsg tc; + char buf[256]; + } req; + + if (!hook || !opts || + !OPTS_VALID(hook, bpf_tc_hook) || + !OPTS_VALID(opts, bpf_tc_opts)) + return -EINVAL; + + ifindex = OPTS_GET(hook, ifindex, 0); + parent = OPTS_GET(hook, parent, 0); + attach_point = OPTS_GET(hook, attach_point, 0); + + handle = OPTS_GET(opts, handle, 0); + priority = OPTS_GET(opts, priority, 0); + prog_fd = OPTS_GET(opts, prog_fd, 0); + prog_id = OPTS_GET(opts, prog_id, 0); + flags = OPTS_GET(opts, flags, 0); + + if (ifindex <= 0 || !prog_fd || prog_id) + return -EINVAL; + if (priority > UINT16_MAX) + return -EINVAL; + if (flags & ~BPF_TC_F_REPLACE) + return -EINVAL; + + flags = (flags & BPF_TC_F_REPLACE) ? NLM_F_REPLACE : NLM_F_EXCL; + protocol = ETH_P_ALL; + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)); + req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | + NLM_F_ECHO | flags; + req.nh.nlmsg_type = RTM_NEWTFILTER; + req.tc.tcm_family = AF_UNSPEC; + req.tc.tcm_ifindex = ifindex; + req.tc.tcm_handle = handle; + req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol)); + + ret = tc_get_tcm_parent(attach_point, &parent); + if (ret < 0) + return ret; + req.tc.tcm_parent = parent; + + ret = nlattr_add(&req.nh, sizeof(req), TCA_KIND, "bpf", sizeof("bpf")); + if (ret < 0) + return ret; + nla = nlattr_begin_nested(&req.nh, sizeof(req), TCA_OPTIONS); + if (!nla) + return -EMSGSIZE; + ret = tc_add_fd_and_name(&req.nh, sizeof(req), prog_fd); + if (ret < 0) + return ret; + bpf_flags = TCA_BPF_FLAG_ACT_DIRECT; + ret = nlattr_add(&req.nh, sizeof(req), TCA_BPF_FLAGS, &bpf_flags, + sizeof(bpf_flags)); + if (ret < 0) + return ret; + nlattr_end_nested(&req.nh, nla); + + info.opts = opts; + + ret = libbpf_netlink_send_recv(&req.nh, get_tc_info, NULL, &info); + if (ret < 0) + return ret; + if (!info.processed) + return -ENOENT; + return ret; +} + +static int __bpf_tc_detach(const struct bpf_tc_hook *hook, + const struct bpf_tc_opts *opts, + const bool flush) +{ + __u32 protocol = 0, handle, priority, parent, prog_id, flags; + int ret, ifindex, attach_point, prog_fd; + struct { + struct nlmsghdr nh; + struct tcmsg tc; + char buf[256]; + } req; + + if (!hook || + !OPTS_VALID(hook, bpf_tc_hook) || + !OPTS_VALID(opts, bpf_tc_opts)) + return -EINVAL; + + ifindex = OPTS_GET(hook, ifindex, 0); + parent = OPTS_GET(hook, parent, 0); + attach_point = OPTS_GET(hook, attach_point, 0); + + handle = OPTS_GET(opts, handle, 0); + priority = OPTS_GET(opts, priority, 0); + prog_fd = OPTS_GET(opts, prog_fd, 0); + prog_id = OPTS_GET(opts, prog_id, 0); + flags = OPTS_GET(opts, flags, 0); + + if (ifindex <= 0 || flags || prog_fd || prog_id) + return -EINVAL; + if (priority > UINT16_MAX) + return -EINVAL; + if (flags & ~BPF_TC_F_REPLACE) + return -EINVAL; + if (!flush) { + if (!handle || !priority) + return -EINVAL; + protocol = ETH_P_ALL; + } else { + if (handle || priority) + return -EINVAL; + } + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)); + req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + req.nh.nlmsg_type = RTM_DELTFILTER; + req.tc.tcm_family = AF_UNSPEC; + req.tc.tcm_ifindex = ifindex; + if (!flush) { + req.tc.tcm_handle = handle; + req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol)); + } + + ret = tc_get_tcm_parent(attach_point, &parent); + if (ret < 0) + return ret; + req.tc.tcm_parent = parent; + + if (!flush) { + ret = nlattr_add(&req.nh, sizeof(req), TCA_KIND, + "bpf", sizeof("bpf")); + if (ret < 0) + return ret; + } + + return libbpf_netlink_send_recv(&req.nh, NULL, NULL, NULL); +} + +int bpf_tc_detach(const struct bpf_tc_hook *hook, + const struct bpf_tc_opts *opts) +{ + return !opts ? -EINVAL : __bpf_tc_detach(hook, opts, false); +} + +int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) +{ + __u32 protocol, handle, priority, parent, prog_id, flags; + int ret, ifindex, attach_point, prog_fd; + struct bpf_cb_ctx info = {}; + struct { + struct nlmsghdr nh; + struct tcmsg tc; + char buf[256]; + } req; + + if (!hook || !opts || + !OPTS_VALID(hook, bpf_tc_hook) || + !OPTS_VALID(opts, bpf_tc_opts)) + return -EINVAL; + + ifindex = OPTS_GET(hook, ifindex, 0); + parent = OPTS_GET(hook, parent, 0); + attach_point = OPTS_GET(hook, attach_point, 0); + + handle = OPTS_GET(opts, handle, 0); + priority = OPTS_GET(opts, priority, 0); + prog_fd = OPTS_GET(opts, prog_fd, 0); + prog_id = OPTS_GET(opts, prog_id, 0); + flags = OPTS_GET(opts, flags, 0); + + if (ifindex <= 0 || flags || prog_fd || prog_id || + !handle || !priority) + return -EINVAL; + if (priority > UINT16_MAX) + return -EINVAL; + + protocol = ETH_P_ALL; + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)); + req.nh.nlmsg_flags = NLM_F_REQUEST; + req.nh.nlmsg_type = RTM_GETTFILTER; + req.tc.tcm_family = AF_UNSPEC; + req.tc.tcm_ifindex = ifindex; + req.tc.tcm_handle = handle; + req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol)); + + ret = tc_get_tcm_parent(attach_point, &parent); + if (ret < 0) + return ret; + req.tc.tcm_parent = parent; + + ret = nlattr_add(&req.nh, sizeof(req), TCA_KIND, "bpf", sizeof("bpf")); + if (ret < 0) + return ret; + + info.opts = opts; + + ret = libbpf_netlink_send_recv(&req.nh, get_tc_info, NULL, &info); + if (ret < 0) + return ret; + if (!info.processed) + return -ENOENT; + return ret; +}
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 79a7f8bdb159d9914b58740f3d31d602a6e4aca8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add placeholders for bpf_sys_bpf() helper and new program type. Make sure to check that expected_attach_type is zero for future extensibility. Allow tracing helper functions to be used in this program type, since they will only execute from user context via bpf_prog_test_run.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@gmail.... (cherry picked from commit 79a7f8bdb159d9914b58740f3d31d602a6e4aca8) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf_types.h include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 10 +++++++ include/linux/bpf_types.h | 2 ++ include/uapi/linux/bpf.h | 8 +++++ kernel/bpf/syscall.c | 53 ++++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 8 +++++ net/bpf/test_run.c | 43 +++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 8 +++++ 7 files changed, 132 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index bbc3e717c763..97e9aa721509 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1898,6 +1898,9 @@ static inline bool bpf_map_is_dev_bound(struct bpf_map *map)
struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr); void bpf_map_offload_map_free(struct bpf_map *map); +int bpf_prog_test_run_syscall(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr); #else static inline int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) @@ -1923,6 +1926,13 @@ static inline struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr) static inline void bpf_map_offload_map_free(struct bpf_map *map) { } + +static inline int bpf_prog_test_run_syscall(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + return -ENOTSUPP; +} #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */
#if defined(CONFIG_BPF_STREAM_PARSER) diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 5732b485c539..9f2f2315ff42 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -81,6 +81,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm, BPF_PROG_TYPE(BPF_PROG_TYPE_SCHED, bpf_sched, void *, void *) #endif /* CONFIG_BPF_SCHED */ +BPF_PROG_TYPE(BPF_PROG_TYPE_SYSCALL, bpf_syscall, + void *, void *)
BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 615ca5424058..a099d8505221 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -206,6 +206,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_LSM, BPF_PROG_TYPE_SK_LOOKUP, BPF_PROG_TYPE_SCHED, + BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */ };
enum bpf_attach_type { @@ -3873,6 +3874,12 @@ union bpf_attr { * be zero-terminated except when **str_size** is 0. * * Or **-EBUSY** if the per-CPU memory copy buffer is busy. + * + * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size) + * Description + * Execute bpf syscall with given arguments. + * Return + * A syscall result. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4038,6 +4045,7 @@ union bpf_attr { FN(sched_entity_belongs_to_cgrp), \ FN(for_each_map_elem), \ FN(snprintf), \ + FN(sys_bpf), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 17e8d056602e..f67f1a59e1f0 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2079,6 +2079,7 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type, if (expected_attach_type == BPF_SK_LOOKUP) return 0; return -EINVAL; + case BPF_PROG_TYPE_SYSCALL: case BPF_PROG_TYPE_EXT: if (expected_attach_type) return -EINVAL; @@ -4595,3 +4596,55 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
return err; } + +static bool syscall_prog_is_valid_access(int off, int size, + enum bpf_access_type type, + const struct bpf_prog *prog, + struct bpf_insn_access_aux *info) +{ + if (off < 0 || off >= U16_MAX) + return false; + if (off % size != 0) + return false; + return true; +} + +BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size) +{ + return -EINVAL; +} + +const struct bpf_func_proto bpf_sys_bpf_proto = { + .func = bpf_sys_bpf, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_ANYTHING, + .arg2_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_CONST_SIZE, +}; + +const struct bpf_func_proto * __weak +tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + return bpf_base_func_proto(func_id); +} + +static const struct bpf_func_proto * +syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_sys_bpf: + return &bpf_sys_bpf_proto; + default: + return tracing_prog_func_proto(func_id, prog); + } +} + +const struct bpf_verifier_ops bpf_syscall_verifier_ops = { + .get_func_proto = syscall_prog_func_proto, + .is_valid_access = syscall_prog_is_valid_access, +}; + +const struct bpf_prog_ops bpf_syscall_prog_ops = { + .test_run = bpf_prog_test_run_syscall, +}; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index fe9799326522..b1062a7879a4 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -13077,6 +13077,14 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) int ret; u64 key;
+ if (prog->type == BPF_PROG_TYPE_SYSCALL) { + if (prog->aux->sleepable) + /* attach_btf_id checked to be zero already */ + return 0; + verbose(env, "Syscall programs can only be sleepable\n"); + return -EINVAL; + } + if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING && prog->type != BPF_PROG_TYPE_LSM) { verbose(env, "Only fentry/fexit/fmod_ret and lsm programs can be sleepable\n"); diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 0dfef59cf3de..6175e516d413 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -902,3 +902,46 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat kfree(user_ctx); return ret; } + +int bpf_prog_test_run_syscall(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + void __user *ctx_in = u64_to_user_ptr(kattr->test.ctx_in); + __u32 ctx_size_in = kattr->test.ctx_size_in; + void *ctx = NULL; + u32 retval; + int err = 0; + + /* doesn't support data_in/out, ctx_out, duration, or repeat or flags */ + if (kattr->test.data_in || kattr->test.data_out || + kattr->test.ctx_out || kattr->test.duration || + kattr->test.repeat || kattr->test.flags) + return -EINVAL; + + if (ctx_size_in < prog->aux->max_ctx_offset || + ctx_size_in > U16_MAX) + return -EINVAL; + + if (ctx_size_in) { + ctx = kzalloc(ctx_size_in, GFP_USER); + if (!ctx) + return -ENOMEM; + if (copy_from_user(ctx, ctx_in, ctx_size_in)) { + err = -EFAULT; + goto out; + } + } + retval = bpf_prog_run_pin_on_cpu(prog, ctx); + + if (copy_to_user(&uattr->test.retval, &retval, sizeof(u32))) { + err = -EFAULT; + goto out; + } + if (ctx_size_in) + if (copy_to_user(ctx_in, ctx, ctx_size_in)) + err = -EFAULT; +out: + kfree(ctx); + return err; +} diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index db550831230c..4ba5820ef470 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -916,6 +916,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_LSM, BPF_PROG_TYPE_SK_LOOKUP, BPF_PROG_TYPE_SCHED, + BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */ };
enum bpf_attach_type { @@ -4583,6 +4584,12 @@ union bpf_attr { * be zero-terminated except when **str_size** is 0. * * Or **-EBUSY** if the per-CPU memory copy buffer is busy. + * + * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size) + * Description + * Execute bpf syscall with given arguments. + * Return + * A syscall result. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4748,6 +4755,7 @@ union bpf_attr { FN(sched_entity_belongs_to_cgrp), \ FN(for_each_map_elem), \ FN(snprintf), \ + FN(sys_bpf), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit cdf7fb0a9f3d36b279590ac41e61c6b655db0d4a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Similar to sockptr_t introduce bpfptr_t with few additions: make_bpfptr() creates new user/kernel pointer in the same address space as existing user/kernel pointer. bpfptr_add() advances the user/kernel pointer.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-3-alexei.starovoitov@gmail.... (cherry picked from commit cdf7fb0a9f3d36b279590ac41e61c6b655db0d4a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpfptr.h | 75 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) create mode 100644 include/linux/bpfptr.h
diff --git a/include/linux/bpfptr.h b/include/linux/bpfptr.h new file mode 100644 index 000000000000..5cdeab497cb3 --- /dev/null +++ b/include/linux/bpfptr.h @@ -0,0 +1,75 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* A pointer that can point to either kernel or userspace memory. */ +#ifndef _LINUX_BPFPTR_H +#define _LINUX_BPFPTR_H + +#include <linux/sockptr.h> + +typedef sockptr_t bpfptr_t; + +static inline bool bpfptr_is_kernel(bpfptr_t bpfptr) +{ + return bpfptr.is_kernel; +} + +static inline bpfptr_t KERNEL_BPFPTR(void *p) +{ + return (bpfptr_t) { .kernel = p, .is_kernel = true }; +} + +static inline bpfptr_t USER_BPFPTR(void __user *p) +{ + return (bpfptr_t) { .user = p }; +} + +static inline bpfptr_t make_bpfptr(u64 addr, bool is_kernel) +{ + if (is_kernel) + return KERNEL_BPFPTR((void*) (uintptr_t) addr); + else + return USER_BPFPTR(u64_to_user_ptr(addr)); +} + +static inline bool bpfptr_is_null(bpfptr_t bpfptr) +{ + if (bpfptr_is_kernel(bpfptr)) + return !bpfptr.kernel; + return !bpfptr.user; +} + +static inline void bpfptr_add(bpfptr_t *bpfptr, size_t val) +{ + if (bpfptr_is_kernel(*bpfptr)) + bpfptr->kernel += val; + else + bpfptr->user += val; +} + +static inline int copy_from_bpfptr_offset(void *dst, bpfptr_t src, + size_t offset, size_t size) +{ + return copy_from_sockptr_offset(dst, (sockptr_t) src, offset, size); +} + +static inline int copy_from_bpfptr(void *dst, bpfptr_t src, size_t size) +{ + return copy_from_bpfptr_offset(dst, src, 0, size); +} + +static inline int copy_to_bpfptr_offset(bpfptr_t dst, size_t offset, + const void *src, size_t size) +{ + return copy_to_sockptr_offset((sockptr_t) dst, offset, src, size); +} + +static inline void *memdup_bpfptr(bpfptr_t src, size_t len) +{ + return memdup_sockptr((sockptr_t) src, len); +} + +static inline long strncpy_from_bpfptr(char *dst, bpfptr_t src, size_t count) +{ + return strncpy_from_sockptr(dst, (sockptr_t) src, count); +} + +#endif /* _LINUX_BPFPTR_H */
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit af2ac3e13e45752af03c8a933f9b6e18841b128b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
With the help from bpfptr_t prepare relevant bpf syscall commands to be used from kernel and user space.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-4-alexei.starovoitov@gmail.... (cherry picked from commit af2ac3e13e45752af03c8a933f9b6e18841b128b) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/syscall.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 8 +-- kernel/bpf/bpf_iter.c | 13 ++--- kernel/bpf/syscall.c | 113 +++++++++++++++++++++++++++--------------- kernel/bpf/verifier.c | 34 +++++++------ net/bpf/test_run.c | 2 +- 5 files changed, 104 insertions(+), 66 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 97e9aa721509..143c187dc4f0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -20,6 +20,7 @@ #include <linux/kallsyms.h> #include <linux/capability.h> #include <linux/percpu-refcount.h> +#include <linux/bpfptr.h>
struct bpf_verifier_env; struct bpf_verifier_log; @@ -1477,7 +1478,7 @@ struct bpf_iter__bpf_map_elem { int bpf_iter_reg_target(const struct bpf_iter_reg *reg_info); void bpf_iter_unreg_target(const struct bpf_iter_reg *reg_info); bool bpf_iter_prog_supported(struct bpf_prog *prog); -int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); +int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_prog *prog); int bpf_iter_new_fd(struct bpf_link *link); bool bpf_link_is_iter(struct bpf_link *link); struct bpf_prog *bpf_iter_get_info(struct bpf_iter_meta *meta, bool in_stop); @@ -1504,7 +1505,7 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file, int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
int bpf_get_file_flag(int flags); -int bpf_check_uarg_tail_zero(void __user *uaddr, size_t expected_size, +int bpf_check_uarg_tail_zero(bpfptr_t uaddr, size_t expected_size, size_t actual_size);
/* memcpy that is used with 8-byte aligned pointers, power-of-8 size and @@ -1524,8 +1525,7 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size) }
/* verify correctness of eBPF program */ -int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, - union bpf_attr __user *uattr); +int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, bpfptr_t uattr);
#ifndef CONFIG_BPF_JIT_ALWAYS_ON void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index 931870f9cf56..2d4fbdbb194e 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -473,15 +473,16 @@ bool bpf_link_is_iter(struct bpf_link *link) return link->ops == &bpf_iter_link_lops; }
-int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) +int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr, + struct bpf_prog *prog) { - union bpf_iter_link_info __user *ulinfo; struct bpf_link_primer link_primer; struct bpf_iter_target_info *tinfo; union bpf_iter_link_info linfo; struct bpf_iter_link *link; u32 prog_btf_id, linfo_len; bool existed = false; + bpfptr_t ulinfo; int err;
if (attr->link_create.target_fd || attr->link_create.flags) @@ -489,18 +490,18 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
memset(&linfo, 0, sizeof(union bpf_iter_link_info));
- ulinfo = u64_to_user_ptr(attr->link_create.iter_info); + ulinfo = make_bpfptr(attr->link_create.iter_info, uattr.is_kernel); linfo_len = attr->link_create.iter_info_len; - if (!ulinfo ^ !linfo_len) + if (bpfptr_is_null(ulinfo) ^ !linfo_len) return -EINVAL;
- if (ulinfo) { + if (!bpfptr_is_null(ulinfo)) { err = bpf_check_uarg_tail_zero(ulinfo, sizeof(linfo), linfo_len); if (err) return err; linfo_len = min_t(u32, linfo_len, sizeof(linfo)); - if (copy_from_user(&linfo, ulinfo, linfo_len)) + if (copy_from_bpfptr(&linfo, ulinfo, linfo_len)) return -EFAULT; }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index f67f1a59e1f0..7069cb1cd99a 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -74,11 +74,10 @@ static const struct bpf_map_ops * const bpf_map_types[] = { * copy_from_user() call. However, this is not a concern since this function is * meant to be a future-proofing of bits. */ -int bpf_check_uarg_tail_zero(void __user *uaddr, +int bpf_check_uarg_tail_zero(bpfptr_t uaddr, size_t expected_size, size_t actual_size) { - unsigned char __user *addr = uaddr + expected_size; int res;
if (unlikely(actual_size > PAGE_SIZE)) /* silly large */ @@ -87,7 +86,12 @@ int bpf_check_uarg_tail_zero(void __user *uaddr, if (actual_size <= expected_size) return 0;
- res = check_zeroed_user(addr, actual_size - expected_size); + if (uaddr.is_kernel) + res = memchr_inv(uaddr.kernel + expected_size, 0, + actual_size - expected_size) == NULL; + else + res = check_zeroed_user(uaddr.user + expected_size, + actual_size - expected_size); if (res < 0) return res; return res ? 0 : -E2BIG; @@ -1014,6 +1018,17 @@ static void *__bpf_copy_key(void __user *ukey, u64 key_size) return NULL; }
+static void *___bpf_copy_key(bpfptr_t ukey, u64 key_size) +{ + if (key_size) + return memdup_bpfptr(ukey, key_size); + + if (!bpfptr_is_null(ukey)) + return ERR_PTR(-EINVAL); + + return NULL; +} + /* last field in 'union bpf_attr' used by this command */ #define BPF_MAP_LOOKUP_ELEM_LAST_FIELD flags
@@ -1084,10 +1099,10 @@ static int map_lookup_elem(union bpf_attr *attr)
#define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags
-static int map_update_elem(union bpf_attr *attr) +static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr) { - void __user *ukey = u64_to_user_ptr(attr->key); - void __user *uvalue = u64_to_user_ptr(attr->value); + bpfptr_t ukey = make_bpfptr(attr->key, uattr.is_kernel); + bpfptr_t uvalue = make_bpfptr(attr->value, uattr.is_kernel); int ufd = attr->map_fd; struct bpf_map *map; void *key, *value; @@ -1114,7 +1129,7 @@ static int map_update_elem(union bpf_attr *attr) goto err_put; }
- key = __bpf_copy_key(ukey, map->key_size); + key = ___bpf_copy_key(ukey, map->key_size); if (IS_ERR(key)) { err = PTR_ERR(key); goto err_put; @@ -1134,7 +1149,7 @@ static int map_update_elem(union bpf_attr *attr) goto free_key;
err = -EFAULT; - if (copy_from_user(value, uvalue, value_size) != 0) + if (copy_from_bpfptr(value, uvalue, value_size) != 0) goto free_value;
err = bpf_map_update_value(map, f, key, value, attr->flags); @@ -2142,7 +2157,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type) /* last field in 'union bpf_attr' used by this command */ #define BPF_PROG_LOAD_LAST_FIELD attach_prog_fd
-static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) +static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) { enum bpf_prog_type type = attr->prog_type; struct bpf_prog *prog, *dst_prog = NULL; @@ -2167,8 +2182,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) return -EPERM;
/* copy eBPF program license from user space */ - if (strncpy_from_user(license, u64_to_user_ptr(attr->license), - sizeof(license) - 1) < 0) + if (strncpy_from_bpfptr(license, + make_bpfptr(attr->license, uattr.is_kernel), + sizeof(license) - 1) < 0) return -EFAULT; license[sizeof(license) - 1] = 0;
@@ -2252,8 +2268,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) prog->len = attr->insn_cnt;
err = -EFAULT; - if (copy_from_user(prog->insns, u64_to_user_ptr(attr->insns), - bpf_prog_insn_size(prog)) != 0) + if (copy_from_bpfptr(prog->insns, + make_bpfptr(attr->insns, uattr.is_kernel), + bpf_prog_insn_size(prog)) != 0) goto free_prog;
prog->orig_prog = NULL; @@ -3495,7 +3512,7 @@ static int bpf_prog_get_info_by_fd(struct file *file, u32 ulen; int err;
- err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len); + err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len); if (err) return err; info_len = min_t(u32, sizeof(info), info_len); @@ -3774,7 +3791,7 @@ static int bpf_map_get_info_by_fd(struct file *file, u32 info_len = attr->info.info_len; int err;
- err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len); + err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len); if (err) return err; info_len = min_t(u32, sizeof(info), info_len); @@ -3817,7 +3834,7 @@ static int bpf_btf_get_info_by_fd(struct file *file, u32 info_len = attr->info.info_len; int err;
- err = bpf_check_uarg_tail_zero(uinfo, sizeof(*uinfo), info_len); + err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(*uinfo), info_len); if (err) return err;
@@ -3834,7 +3851,7 @@ static int bpf_link_get_info_by_fd(struct file *file, u32 info_len = attr->info.info_len; int err;
- err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len); + err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len); if (err) return err; info_len = min_t(u32, sizeof(info), info_len); @@ -4110,13 +4127,14 @@ static int bpf_map_do_batch(const union bpf_attr *attr, return err; }
-static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) +static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr, + struct bpf_prog *prog) { if (attr->link_create.attach_type != prog->expected_attach_type) return -EINVAL;
if (prog->expected_attach_type == BPF_TRACE_ITER) - return bpf_iter_link_attach(attr, prog); + return bpf_iter_link_attach(attr, uattr, prog); else if (prog->type == BPF_PROG_TYPE_EXT) return bpf_tracing_prog_attach(prog, attr->link_create.target_fd, @@ -4125,7 +4143,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog * }
#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len -static int link_create(union bpf_attr *attr) +static int link_create(union bpf_attr *attr, bpfptr_t uattr) { enum bpf_prog_type ptype; struct bpf_prog *prog; @@ -4144,7 +4162,7 @@ static int link_create(union bpf_attr *attr) goto out;
if (prog->type == BPF_PROG_TYPE_EXT) { - ret = tracing_bpf_link_attach(attr, prog); + ret = tracing_bpf_link_attach(attr, uattr, prog); goto out; }
@@ -4165,7 +4183,7 @@ static int link_create(union bpf_attr *attr) ret = cgroup_bpf_link_attach(attr, prog); break; case BPF_PROG_TYPE_TRACING: - ret = tracing_bpf_link_attach(attr, prog); + ret = tracing_bpf_link_attach(attr, uattr, prog); break; case BPF_PROG_TYPE_FLOW_DISSECTOR: case BPF_PROG_TYPE_SK_LOOKUP: @@ -4453,7 +4471,7 @@ static int bpf_prog_bind_map(union bpf_attr *attr) return ret; }
-SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size) +static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) { union bpf_attr attr; int err; @@ -4468,7 +4486,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
/* copy attributes from user space, may be less than sizeof(bpf_attr) */ memset(&attr, 0, sizeof(attr)); - if (copy_from_user(&attr, uattr, size) != 0) + if (copy_from_bpfptr(&attr, uattr, size) != 0) return -EFAULT;
err = security_bpf(cmd, &attr, size); @@ -4483,7 +4501,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz err = map_lookup_elem(&attr); break; case BPF_MAP_UPDATE_ELEM: - err = map_update_elem(&attr); + err = map_update_elem(&attr, uattr); break; case BPF_MAP_DELETE_ELEM: err = map_delete_elem(&attr); @@ -4510,21 +4528,21 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz err = bpf_prog_detach(&attr); break; case BPF_PROG_QUERY: - err = bpf_prog_query(&attr, uattr); + err = bpf_prog_query(&attr, uattr.user); break; case BPF_PROG_TEST_RUN: - err = bpf_prog_test_run(&attr, uattr); + err = bpf_prog_test_run(&attr, uattr.user); break; case BPF_PROG_GET_NEXT_ID: - err = bpf_obj_get_next_id(&attr, uattr, + err = bpf_obj_get_next_id(&attr, uattr.user, &prog_idr, &prog_idr_lock); break; case BPF_MAP_GET_NEXT_ID: - err = bpf_obj_get_next_id(&attr, uattr, + err = bpf_obj_get_next_id(&attr, uattr.user, &map_idr, &map_idr_lock); break; case BPF_BTF_GET_NEXT_ID: - err = bpf_obj_get_next_id(&attr, uattr, + err = bpf_obj_get_next_id(&attr, uattr.user, &btf_idr, &btf_idr_lock); break; case BPF_PROG_GET_FD_BY_ID: @@ -4534,7 +4552,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz err = bpf_map_get_fd_by_id(&attr); break; case BPF_OBJ_GET_INFO_BY_FD: - err = bpf_obj_get_info_by_fd(&attr, uattr); + err = bpf_obj_get_info_by_fd(&attr, uattr.user); break; case BPF_RAW_TRACEPOINT_OPEN: err = bpf_raw_tracepoint_open(&attr); @@ -4546,26 +4564,26 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz err = bpf_btf_get_fd_by_id(&attr); break; case BPF_TASK_FD_QUERY: - err = bpf_task_fd_query(&attr, uattr); + err = bpf_task_fd_query(&attr, uattr.user); break; case BPF_MAP_LOOKUP_AND_DELETE_ELEM: err = map_lookup_and_delete_elem(&attr); break; case BPF_MAP_LOOKUP_BATCH: - err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH); + err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_LOOKUP_BATCH); break; case BPF_MAP_LOOKUP_AND_DELETE_BATCH: - err = bpf_map_do_batch(&attr, uattr, + err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_LOOKUP_AND_DELETE_BATCH); break; case BPF_MAP_UPDATE_BATCH: - err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH); + err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_UPDATE_BATCH); break; case BPF_MAP_DELETE_BATCH: - err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH); + err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_DELETE_BATCH); break; case BPF_LINK_CREATE: - err = link_create(&attr); + err = link_create(&attr, uattr); break; case BPF_LINK_UPDATE: err = link_update(&attr); @@ -4574,7 +4592,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz err = bpf_link_get_fd_by_id(&attr); break; case BPF_LINK_GET_NEXT_ID: - err = bpf_obj_get_next_id(&attr, uattr, + err = bpf_obj_get_next_id(&attr, uattr.user, &link_idr, &link_idr_lock); break; case BPF_ENABLE_STATS: @@ -4597,6 +4615,11 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz return err; }
+SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size) +{ + return __sys_bpf(cmd, USER_BPFPTR(uattr), size); +} + static bool syscall_prog_is_valid_access(int off, int size, enum bpf_access_type type, const struct bpf_prog *prog, @@ -4611,7 +4634,19 @@ static bool syscall_prog_is_valid_access(int off, int size,
BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size) { - return -EINVAL; + switch (cmd) { + case BPF_MAP_CREATE: + case BPF_MAP_UPDATE_ELEM: + case BPF_MAP_FREEZE: + case BPF_PROG_LOAD: + break; + /* case BPF_PROG_TEST_RUN: + * is not part of this list to prevent recursive test_run + */ + default: + return -EINVAL; + } + return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size); }
const struct bpf_func_proto bpf_sys_bpf_proto = { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b1062a7879a4..592d1b8de6a1 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9374,7 +9374,7 @@ static int check_abnormal_return(struct bpf_verifier_env *env)
static int check_btf_func(struct bpf_verifier_env *env, const union bpf_attr *attr, - union bpf_attr __user *uattr) + bpfptr_t uattr) { const struct btf_type *type, *func_proto, *ret_type; u32 i, nfuncs, urec_size, min_size; @@ -9383,7 +9383,7 @@ static int check_btf_func(struct bpf_verifier_env *env, struct bpf_func_info_aux *info_aux = NULL; struct bpf_prog *prog; const struct btf *btf; - void __user *urecord; + bpfptr_t urecord; u32 prev_offset = 0; bool scalar_return; int ret = -ENOMEM; @@ -9411,7 +9411,7 @@ static int check_btf_func(struct bpf_verifier_env *env, prog = env->prog; btf = prog->aux->btf;
- urecord = u64_to_user_ptr(attr->func_info); + urecord = make_bpfptr(attr->func_info, uattr.is_kernel); min_size = min_t(u32, krec_size, urec_size);
krecord = kvcalloc(nfuncs, krec_size, GFP_KERNEL | __GFP_NOWARN); @@ -9429,13 +9429,15 @@ static int check_btf_func(struct bpf_verifier_env *env, /* set the size kernel expects so loader can zero * out the rest of the record. */ - if (put_user(min_size, &uattr->func_info_rec_size)) + if (copy_to_bpfptr_offset(uattr, + offsetof(union bpf_attr, func_info_rec_size), + &min_size, sizeof(min_size))) ret = -EFAULT; } goto err_free; }
- if (copy_from_user(&krecord[i], urecord, min_size)) { + if (copy_from_bpfptr(&krecord[i], urecord, min_size)) { ret = -EFAULT; goto err_free; } @@ -9487,7 +9489,7 @@ static int check_btf_func(struct bpf_verifier_env *env, }
prev_offset = krecord[i].insn_off; - urecord += urec_size; + bpfptr_add(&urecord, urec_size); }
prog->aux->func_info = krecord; @@ -9519,14 +9521,14 @@ static void adjust_btf_func(struct bpf_verifier_env *env)
static int check_btf_line(struct bpf_verifier_env *env, const union bpf_attr *attr, - union bpf_attr __user *uattr) + bpfptr_t uattr) { u32 i, s, nr_linfo, ncopy, expected_size, rec_size, prev_offset = 0; struct bpf_subprog_info *sub; struct bpf_line_info *linfo; struct bpf_prog *prog; const struct btf *btf; - void __user *ulinfo; + bpfptr_t ulinfo; int err;
nr_linfo = attr->line_info_cnt; @@ -9554,7 +9556,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
s = 0; sub = env->subprog_info; - ulinfo = u64_to_user_ptr(attr->line_info); + ulinfo = make_bpfptr(attr->line_info, uattr.is_kernel); expected_size = sizeof(struct bpf_line_info); ncopy = min_t(u32, expected_size, rec_size); for (i = 0; i < nr_linfo; i++) { @@ -9562,14 +9564,15 @@ static int check_btf_line(struct bpf_verifier_env *env, if (err) { if (err == -E2BIG) { verbose(env, "nonzero tailing record in line_info"); - if (put_user(expected_size, - &uattr->line_info_rec_size)) + if (copy_to_bpfptr_offset(uattr, + offsetof(union bpf_attr, line_info_rec_size), + &expected_size, sizeof(expected_size))) err = -EFAULT; } goto err_free; }
- if (copy_from_user(&linfo[i], ulinfo, ncopy)) { + if (copy_from_bpfptr(&linfo[i], ulinfo, ncopy)) { err = -EFAULT; goto err_free; } @@ -9621,7 +9624,7 @@ static int check_btf_line(struct bpf_verifier_env *env, }
prev_offset = linfo[i].insn_off; - ulinfo += rec_size; + bpfptr_add(&ulinfo, rec_size); }
if (s != env->subprog_cnt) { @@ -9643,7 +9646,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
static int check_btf_info(struct bpf_verifier_env *env, const union bpf_attr *attr, - union bpf_attr __user *uattr) + bpfptr_t uattr) { struct btf *btf; int err; @@ -13163,8 +13166,7 @@ struct btf *bpf_get_btf_vmlinux(void) return btf_vmlinux; }
-int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, - union bpf_attr __user *uattr) +int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) { u64 start_time = ktime_get_ns(); struct bpf_verifier_env *env; diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 6175e516d413..fad7a3c4483f 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -380,7 +380,7 @@ static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size) return ERR_PTR(-ENOMEM);
if (data_in) { - err = bpf_check_uarg_tail_zero(data_in, max_size, size); + err = bpf_check_uarg_tail_zero(USER_BPFPTR(data_in), max_size, size); if (err) { kfree(data); return ERR_PTR(err);
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 5452fc9a17fc26816a683ab04cf1c29131ca27e4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Trivial support for syscall program type.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-5-alexei.starovoitov@gmail.... (cherry picked from commit 5452fc9a17fc26816a683ab04cf1c29131ca27e4) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 6b57e429ba7d..6cc9c77ef131 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8972,6 +8972,8 @@ static const struct bpf_sec_def section_defs[] = { .is_attach_btf = true, .expected_attach_type = BPF_SCHED, .attach_fn = attach_sched), + SEC_DEF("syscall", SYSCALL, + .is_sleepable = true), BPF_EAPROG_SEC("xdp_devmap/", BPF_PROG_TYPE_XDP, BPF_XDP_DEVMAP), BPF_EAPROG_SEC("xdp_cpumap/", BPF_PROG_TYPE_XDP,
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit c571bd752e91602f092823b2f1ee685a74d2726c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Similar to prog_load make btf_load command to be availble to bpf_prog_type_syscall program.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-7-alexei.starovoitov@gmail.... (cherry picked from commit c571bd752e91602f092823b2f1ee685a74d2726c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/btf.h | 2 +- kernel/bpf/btf.c | 8 ++++---- kernel/bpf/syscall.c | 7 ++++--- 3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h index 8f6ea0d4d8a1..6964f39150a0 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -20,7 +20,7 @@ extern const struct file_operations btf_fops;
void btf_get(struct btf *btf); void btf_put(struct btf *btf); -int btf_new_fd(const union bpf_attr *attr); +int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr); struct btf *btf_get_by_fd(int fd); int btf_get_info_by_fd(const struct btf *btf, const union bpf_attr *attr, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 692e27ae3083..390e9186371b 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4179,7 +4179,7 @@ static int btf_parse_hdr(struct btf_verifier_env *env) return 0; }
-static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size, +static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size, u32 log_level, char __user *log_ubuf, u32 log_size) { struct btf_verifier_env *env = NULL; @@ -4227,7 +4227,7 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size, btf->data = data; btf->data_size = btf_data_size;
- if (copy_from_user(data, btf_data, btf_data_size)) { + if (copy_from_bpfptr(data, btf_data, btf_data_size)) { err = -EFAULT; goto errout; } @@ -5716,12 +5716,12 @@ static int __btf_new_fd(struct btf *btf) return anon_inode_getfd("btf", &btf_fops, btf, O_RDONLY | O_CLOEXEC); }
-int btf_new_fd(const union bpf_attr *attr) +int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr) { struct btf *btf; int ret;
- btf = btf_parse(u64_to_user_ptr(attr->btf), + btf = btf_parse(make_bpfptr(attr->btf, uattr.is_kernel), attr->btf_size, attr->btf_log_level, u64_to_user_ptr(attr->btf_log_buf), attr->btf_log_size); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7069cb1cd99a..54130c37b42b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3914,7 +3914,7 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
#define BPF_BTF_LOAD_LAST_FIELD btf_log_level
-static int bpf_btf_load(const union bpf_attr *attr) +static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr) { if (CHECK_ATTR(BPF_BTF_LOAD)) return -EINVAL; @@ -3922,7 +3922,7 @@ static int bpf_btf_load(const union bpf_attr *attr) if (!bpf_capable()) return -EPERM;
- return btf_new_fd(attr); + return btf_new_fd(attr, uattr); }
#define BPF_BTF_GET_FD_BY_ID_LAST_FIELD btf_id @@ -4558,7 +4558,7 @@ static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) err = bpf_raw_tracepoint_open(&attr); break; case BPF_BTF_LOAD: - err = bpf_btf_load(&attr); + err = bpf_btf_load(&attr, uattr); break; case BPF_BTF_GET_FD_BY_ID: err = bpf_btf_get_fd_by_id(&attr); @@ -4639,6 +4639,7 @@ BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size) case BPF_MAP_UPDATE_ELEM: case BPF_MAP_FREEZE: case BPF_PROG_LOAD: + case BPF_BTF_LOAD: break; /* case BPF_PROG_TEST_RUN: * is not part of this list to prevent recursive test_run
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 387544bfa291a22383d60b40f887360e2b931ec6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Typical program loading sequence involves creating bpf maps and applying map FDs into bpf instructions in various places in the bpf program. This job is done by libbpf that is using compiler generated ELF relocations to patch certain instruction after maps are created and BTFs are loaded. The goal of fd_idx is to allow bpf instructions to stay immutable after compilation. At load time the libbpf would still create maps as usual, but it wouldn't need to patch instructions. It would store map_fds into __u32 fd_array[] and would pass that pointer to sys_bpf(BPF_PROG_LOAD).
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-9-alexei.starovoitov@gmail.... (cherry picked from commit 387544bfa291a22383d60b40f887360e2b931ec6) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf_verifier.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf_verifier.h | 1 + include/uapi/linux/bpf.h | 16 ++++++++---- kernel/bpf/syscall.c | 2 +- kernel/bpf/verifier.c | 47 ++++++++++++++++++++++++++-------- tools/include/uapi/linux/bpf.h | 16 ++++++++---- 5 files changed, 61 insertions(+), 21 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 099947df8cd6..154cd9a2c51b 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -460,6 +460,7 @@ struct bpf_verifier_env { u32 peak_states; /* longest register parentage chain walked for liveness marking */ u32 longest_mark_read_walk; + bpfptr_t fd_array; /* buffer used in reg_type_str() to generate reg_type string */ char type_str_buf[TYPE_STR_BUF_LEN]; }; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index a099d8505221..18061388dd1b 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -367,8 +367,8 @@ enum bpf_link_type { /* When BPF ldimm64's insn[0].src_reg != 0 then this can have * the following extensions: * - * insn[0].src_reg: BPF_PSEUDO_MAP_FD - * insn[0].imm: map fd + * insn[0].src_reg: BPF_PSEUDO_MAP_[FD|IDX] + * insn[0].imm: map fd or fd_idx * insn[1].imm: 0 * insn[0].off: 0 * insn[1].off: 0 @@ -376,15 +376,19 @@ enum bpf_link_type { * verifier type: CONST_PTR_TO_MAP */ #define BPF_PSEUDO_MAP_FD 1 -/* insn[0].src_reg: BPF_PSEUDO_MAP_VALUE - * insn[0].imm: map fd +#define BPF_PSEUDO_MAP_IDX 5 + +/* insn[0].src_reg: BPF_PSEUDO_MAP_[IDX_]VALUE + * insn[0].imm: map fd or fd_idx * insn[1].imm: offset into value * insn[0].off: 0 * insn[1].off: 0 * ldimm64 rewrite: address of map[0]+offset * verifier type: PTR_TO_MAP_VALUE */ -#define BPF_PSEUDO_MAP_VALUE 2 +#define BPF_PSEUDO_MAP_VALUE 2 +#define BPF_PSEUDO_MAP_IDX_VALUE 6 + /* insn[0].src_reg: BPF_PSEUDO_BTF_ID * insn[0].imm: kernel btd id of VAR * insn[1].imm: 0 @@ -584,6 +588,8 @@ union bpf_attr { /* or valid module BTF object fd or 0 to attach to vmlinux */ __u32 attach_btf_obj_fd; }; + __u32 :32; /* pad */ + __aligned_u64 fd_array; /* array of FDs */ };
struct { /* anonymous struct used by BPF_OBJ_* commands */ diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 54130c37b42b..cb40f29c4603 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2155,7 +2155,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type) }
/* last field in 'union bpf_attr' used by this command */ -#define BPF_PROG_LOAD_LAST_FIELD attach_prog_fd +#define BPF_PROG_LOAD_LAST_FIELD fd_array
static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 592d1b8de6a1..0e8000bd93ff 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8856,12 +8856,14 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) map = env->used_maps[aux->map_index]; dst_reg->map_ptr = map;
- if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) { + if (insn->src_reg == BPF_PSEUDO_MAP_VALUE || + insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { dst_reg->type = PTR_TO_MAP_VALUE; dst_reg->off = aux->map_off; if (map_value_has_spin_lock(map)) dst_reg->id = ++env->id_gen; - } else if (insn->src_reg == BPF_PSEUDO_MAP_FD) { + } else if (insn->src_reg == BPF_PSEUDO_MAP_FD || + insn->src_reg == BPF_PSEUDO_MAP_IDX) { dst_reg->type = CONST_PTR_TO_MAP; } else { verbose(env, "bpf verifier is misconfigured\n"); @@ -11097,6 +11099,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) struct bpf_map *map; struct fd f; u64 addr; + u32 fd;
if (i == insn_cnt - 1 || insn[1].code != 0 || insn[1].dst_reg != 0 || insn[1].src_reg != 0 || @@ -11126,16 +11129,38 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) /* In final convert_pseudo_ld_imm64() step, this is * converted into regular 64-bit imm load insn. */ - if ((insn[0].src_reg != BPF_PSEUDO_MAP_FD && - insn[0].src_reg != BPF_PSEUDO_MAP_VALUE) || - (insn[0].src_reg == BPF_PSEUDO_MAP_FD && - insn[1].imm != 0)) { - verbose(env, - "unrecognized bpf_ld_imm64 insn\n"); + switch (insn[0].src_reg) { + case BPF_PSEUDO_MAP_VALUE: + case BPF_PSEUDO_MAP_IDX_VALUE: + break; + case BPF_PSEUDO_MAP_FD: + case BPF_PSEUDO_MAP_IDX: + if (insn[1].imm == 0) + break; + fallthrough; + default: + verbose(env, "unrecognized bpf_ld_imm64 insn\n"); return -EINVAL; }
- f = fdget(insn[0].imm); + switch (insn[0].src_reg) { + case BPF_PSEUDO_MAP_IDX_VALUE: + case BPF_PSEUDO_MAP_IDX: + if (bpfptr_is_null(env->fd_array)) { + verbose(env, "fd_idx without fd_array is invalid\n"); + return -EPROTO; + } + if (copy_from_bpfptr_offset(&fd, env->fd_array, + insn[0].imm * sizeof(fd), + sizeof(fd))) + return -EFAULT; + break; + default: + fd = insn[0].imm; + break; + } + + f = fdget(fd); map = __bpf_map_get(f); if (IS_ERR(map)) { verbose(env, "fd %d is not pointing to valid bpf_map\n", @@ -11150,7 +11175,8 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) }
aux = &env->insn_aux_data[i]; - if (insn->src_reg == BPF_PSEUDO_MAP_FD) { + if (insn[0].src_reg == BPF_PSEUDO_MAP_FD || + insn[0].src_reg == BPF_PSEUDO_MAP_IDX) { addr = (unsigned long)map; } else { u32 off = insn[1].imm; @@ -13196,6 +13222,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) env->insn_aux_data[i].orig_idx = i; env->prog = *prog; env->ops = bpf_verifier_ops[env->prog->type]; + env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel); is_priv = bpf_capable();
bpf_get_btf_vmlinux(); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 4ba5820ef470..d7ba96b3bf9e 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1077,8 +1077,8 @@ enum bpf_link_type { /* When BPF ldimm64's insn[0].src_reg != 0 then this can have * the following extensions: * - * insn[0].src_reg: BPF_PSEUDO_MAP_FD - * insn[0].imm: map fd + * insn[0].src_reg: BPF_PSEUDO_MAP_[FD|IDX] + * insn[0].imm: map fd or fd_idx * insn[1].imm: 0 * insn[0].off: 0 * insn[1].off: 0 @@ -1086,15 +1086,19 @@ enum bpf_link_type { * verifier type: CONST_PTR_TO_MAP */ #define BPF_PSEUDO_MAP_FD 1 -/* insn[0].src_reg: BPF_PSEUDO_MAP_VALUE - * insn[0].imm: map fd +#define BPF_PSEUDO_MAP_IDX 5 + +/* insn[0].src_reg: BPF_PSEUDO_MAP_[IDX_]VALUE + * insn[0].imm: map fd or fd_idx * insn[1].imm: offset into value * insn[0].off: 0 * insn[1].off: 0 * ldimm64 rewrite: address of map[0]+offset * verifier type: PTR_TO_MAP_VALUE */ -#define BPF_PSEUDO_MAP_VALUE 2 +#define BPF_PSEUDO_MAP_VALUE 2 +#define BPF_PSEUDO_MAP_IDX_VALUE 6 + /* insn[0].src_reg: BPF_PSEUDO_BTF_ID * insn[0].imm: kernel btd id of VAR * insn[1].imm: 0 @@ -1294,6 +1298,8 @@ union bpf_attr { /* or valid module BTF object fd or 0 to attach to vmlinux */ __u32 attach_btf_obj_fd; }; + __u32 :32; /* pad */ + __aligned_u64 fd_array; /* array of FDs */ };
struct { /* anonymous struct used by BPF_OBJ_* commands */
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 3d78417b60fba249cc555468cb72d96f5cde2964 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add new helper: long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags) Description Find BTF type with given name and kind in vmlinux BTF or in module's BTFs. Return Returns btf_id and btf_obj_fd in lower and upper 32 bits.
It will be used by loader program to find btf_id to attach the program to and to find btf_ids of ksyms.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-10-alexei.starovoitov@gmail... (cherry picked from commit 3d78417b60fba249cc555468cb72d96f5cde2964) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 1 + include/uapi/linux/bpf.h | 7 ++++ kernel/bpf/btf.c | 62 ++++++++++++++++++++++++++++++++++ kernel/bpf/syscall.c | 2 ++ tools/include/uapi/linux/bpf.h | 7 ++++ 5 files changed, 79 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 143c187dc4f0..ce82ba540afa 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2052,6 +2052,7 @@ extern const struct bpf_func_proto bpf_snprintf_proto; extern const struct bpf_func_proto bpf_per_cpu_ptr_proto; extern const struct bpf_func_proto bpf_this_cpu_ptr_proto; extern const struct bpf_func_proto bpf_for_each_map_elem_proto; +extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 18061388dd1b..0e88c9c8552e 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3886,6 +3886,12 @@ union bpf_attr { * Execute bpf syscall with given arguments. * Return * A syscall result. + * + * long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags) + * Description + * Find BTF type with given name and kind in vmlinux BTF or in module's BTFs. + * Return + * Returns btf_id and btf_obj_fd in lower and upper 32 bits. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4052,6 +4058,7 @@ union bpf_attr { FN(for_each_map_elem), \ FN(snprintf), \ FN(sys_bpf), \ + FN(btf_find_by_name_kind), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 390e9186371b..cd3c1a11fef1 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6021,3 +6021,65 @@ struct module *btf_try_get_module(const struct btf *btf)
return res; } + +BPF_CALL_4(bpf_btf_find_by_name_kind, char *, name, int, name_sz, u32, kind, int, flags) +{ + struct btf *btf; + long ret; + + if (flags) + return -EINVAL; + + if (name_sz <= 1 || name[name_sz - 1]) + return -EINVAL; + + btf = bpf_get_btf_vmlinux(); + if (IS_ERR(btf)) + return PTR_ERR(btf); + + ret = btf_find_by_name_kind(btf, name, kind); + /* ret is never zero, since btf_find_by_name_kind returns + * positive btf_id or negative error. + */ + if (ret < 0) { + struct btf *mod_btf; + int id; + + /* If name is not found in vmlinux's BTF then search in module's BTFs */ + spin_lock_bh(&btf_idr_lock); + idr_for_each_entry(&btf_idr, mod_btf, id) { + if (!btf_is_module(mod_btf)) + continue; + /* linear search could be slow hence unlock/lock + * the IDR to avoiding holding it for too long + */ + btf_get(mod_btf); + spin_unlock_bh(&btf_idr_lock); + ret = btf_find_by_name_kind(mod_btf, name, kind); + if (ret > 0) { + int btf_obj_fd; + + btf_obj_fd = __btf_new_fd(mod_btf); + if (btf_obj_fd < 0) { + btf_put(mod_btf); + return btf_obj_fd; + } + return ret | (((u64)btf_obj_fd) << 32); + } + spin_lock_bh(&btf_idr_lock); + btf_put(mod_btf); + } + spin_unlock_bh(&btf_idr_lock); + } + return ret; +} + +const struct bpf_func_proto bpf_btf_find_by_name_kind_proto = { + .func = bpf_btf_find_by_name_kind, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_CONST_SIZE, + .arg3_type = ARG_ANYTHING, + .arg4_type = ARG_ANYTHING, +}; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index cb40f29c4603..a9c1fa6d05a7 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -4671,6 +4671,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) switch (func_id) { case BPF_FUNC_sys_bpf: return &bpf_sys_bpf_proto; + case BPF_FUNC_btf_find_by_name_kind: + return &bpf_btf_find_by_name_kind_proto; default: return tracing_prog_func_proto(func_id, prog); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index d7ba96b3bf9e..f9d82a60d26b 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4596,6 +4596,12 @@ union bpf_attr { * Execute bpf syscall with given arguments. * Return * A syscall result. + * + * long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags) + * Description + * Find BTF type with given name and kind in vmlinux BTF or in module's BTFs. + * Return + * Returns btf_id and btf_obj_fd in lower and upper 32 bits. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4762,6 +4768,7 @@ union bpf_attr { FN(for_each_map_elem), \ FN(snprintf), \ FN(sys_bpf), \ + FN(btf_find_by_name_kind), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 3abea089246f76c1517b054ddb5946f3f1dbd2c0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add bpf_sys_close() helper to be used by the syscall/loader program to close intermediate FDs and other cleanup. Note this helper must never be allowed inside fdget/fdput bracketing.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-11-alexei.starovoitov@gmail... (cherry picked from commit 3abea089246f76c1517b054ddb5946f3f1dbd2c0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 7 +++++++ kernel/bpf/syscall.c | 19 +++++++++++++++++++ tools/include/uapi/linux/bpf.h | 7 +++++++ 3 files changed, 33 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 0e88c9c8552e..d8fc4d8c7bc5 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3892,6 +3892,12 @@ union bpf_attr { * Find BTF type with given name and kind in vmlinux BTF or in module's BTFs. * Return * Returns btf_id and btf_obj_fd in lower and upper 32 bits. + * + * long bpf_sys_close(u32 fd) + * Description + * Execute close syscall for given FD. + * Return + * A syscall result. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4059,6 +4065,7 @@ union bpf_attr { FN(snprintf), \ FN(sys_bpf), \ FN(btf_find_by_name_kind), \ + FN(sys_close), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index a9c1fa6d05a7..b916a6438d7d 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -4665,6 +4665,23 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return bpf_base_func_proto(func_id); }
+BPF_CALL_1(bpf_sys_close, u32, fd) +{ + /* When bpf program calls this helper there should not be + * an fdget() without matching completed fdput(). + * This helper is allowed in the following callchain only: + * sys_bpf->prog_test_run->bpf_prog->bpf_sys_close + */ + return __close_fd(current->files, fd); +} + +const struct bpf_func_proto bpf_sys_close_proto = { + .func = bpf_sys_close, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_ANYTHING, +}; + static const struct bpf_func_proto * syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { @@ -4673,6 +4690,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_sys_bpf_proto; case BPF_FUNC_btf_find_by_name_kind: return &bpf_btf_find_by_name_kind_proto; + case BPF_FUNC_sys_close: + return &bpf_sys_close_proto; default: return tracing_prog_func_proto(func_id, prog); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index f9d82a60d26b..dde6a6042d32 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4602,6 +4602,12 @@ union bpf_attr { * Find BTF type with given name and kind in vmlinux BTF or in module's BTFs. * Return * Returns btf_id and btf_obj_fd in lower and upper 32 bits. + * + * long bpf_sys_close(u32 fd) + * Description + * Execute close syscall for given FD. + * Return + * A syscall result. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4769,6 +4775,7 @@ union bpf_attr { FN(snprintf), \ FN(sys_bpf), \ FN(btf_find_by_name_kind), \ + FN(sys_close), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit b12688267280b223256c8cf912486577d3adce25 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In order to be able to generate loader program in the later patches change the order of data and text relocations. Also improve the test to include data relos.
If the kernel supports "FD array" the map_fd relocations can be processed before text relos since generated loader program won't need to manually patch ld_imm64 insns with map_fd. But ksym and kfunc relocations can only be processed after all calls are relocated, since loader program will consist of a sequence of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id and btf_obj_fd into corresponding ld_imm64 insns. The locations of those ld_imm64 insns are specified in relocations. Hence process all data relocations (maps, ksym, kfunc) together after call relos.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-12-alexei.starovoitov@gmail... (cherry picked from commit b12688267280b223256c8cf912486577d3adce25) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 86 ++++++++++++++++--- .../selftests/bpf/progs/test_subprogs.c | 13 +++ 2 files changed, 85 insertions(+), 14 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 6cc9c77ef131..e061492a9ca5 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6476,11 +6476,15 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) insn[0].imm = ext->ksym.kernel_btf_id; break; case RELO_SUBPROG_ADDR: - insn[0].src_reg = BPF_PSEUDO_FUNC; - /* will be handled as a follow up pass */ + if (insn[0].src_reg != BPF_PSEUDO_FUNC) { + pr_warn("prog '%s': relo #%d: bad insn\n", + prog->name, i); + return -EINVAL; + } + /* handled already */ break; case RELO_CALL: - /* will be handled as a follow up pass */ + /* handled already */ break; default: pr_warn("prog '%s': relo #%d: bad relo type %d\n", @@ -6649,6 +6653,30 @@ static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, si sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx); }
+static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_program *subprog) +{ + int new_cnt = main_prog->nr_reloc + subprog->nr_reloc; + struct reloc_desc *relos; + int i; + + if (main_prog == subprog) + return 0; + relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos)); + if (!relos) + return -ENOMEM; + memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc, + sizeof(*relos) * subprog->nr_reloc); + + for (i = main_prog->nr_reloc; i < new_cnt; i++) + relos[i].insn_idx += subprog->sub_insn_off; + /* After insn_idx adjustment the 'relos' array is still sorted + * by insn_idx and doesn't break bsearch. + */ + main_prog->reloc_desc = relos; + main_prog->nr_reloc = new_cnt; + return 0; +} + static int bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog, struct bpf_program *prog) @@ -6669,6 +6697,11 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog, continue;
relo = find_prog_insn_relo(prog, insn_idx); + if (relo && relo->type == RELO_EXTERN_FUNC) + /* kfunc relocations will be handled later + * in bpf_object__relocate_data() + */ + continue; if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) { pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n", prog->name, insn_idx, relo->type); @@ -6743,6 +6776,10 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog, pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n", main_prog->name, subprog->insns_cnt, subprog->name);
+ /* The subprog insns are now appended. Append its relos too. */ + err = append_subprog_relos(main_prog, subprog); + if (err) + return err; err = bpf_object__reloc_code(obj, main_prog, subprog); if (err) return err; @@ -6876,7 +6913,7 @@ static int bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) { struct bpf_program *prog; - size_t i; + size_t i, j; int err;
if (obj->btf_ext) { @@ -6887,23 +6924,32 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) return err; } } - /* relocate data references first for all programs and sub-programs, - * as they don't change relative to code locations, so subsequent - * subprogram processing won't need to re-calculate any of them + + /* Before relocating calls pre-process relocations and mark + * few ld_imm64 instructions that points to subprogs. + * Otherwise bpf_object__reloc_code() later would have to consider + * all ld_imm64 insns as relocation candidates. That would + * reduce relocation speed, since amount of find_prog_insn_relo() + * would increase and most of them will fail to find a relo. */ for (i = 0; i < obj->nr_programs; i++) { prog = &obj->programs[i]; - err = bpf_object__relocate_data(obj, prog); - if (err) { - pr_warn("prog '%s': failed to relocate data references: %d\n", - prog->name, err); - return err; + for (j = 0; j < prog->nr_reloc; j++) { + struct reloc_desc *relo = &prog->reloc_desc[j]; + struct bpf_insn *insn = &prog->insns[relo->insn_idx]; + + /* mark the insn, so it's recognized by insn_is_pseudo_func() */ + if (relo->type == RELO_SUBPROG_ADDR) + insn[0].src_reg = BPF_PSEUDO_FUNC; } } - /* now relocate subprogram calls and append used subprograms to main + + /* relocate subprogram calls and append used subprograms to main * programs; each copy of subprogram code needs to be relocated * differently for each main program, because its code location might - * have changed + * have changed. + * Append subprog relos to main programs to allow data relos to be + * processed after text is completely relocated. */ for (i = 0; i < obj->nr_programs; i++) { prog = &obj->programs[i]; @@ -6920,6 +6966,18 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) return err; } } + /* Process data relos for main programs */ + for (i = 0; i < obj->nr_programs; i++) { + prog = &obj->programs[i]; + if (prog_is_subprog(obj, prog)) + continue; + err = bpf_object__relocate_data(obj, prog); + if (err) { + pr_warn("prog '%s': failed to relocate data references: %d\n", + prog->name, err); + return err; + } + } /* free up relocation descriptors */ for (i = 0; i < obj->nr_programs; i++) { prog = &obj->programs[i]; diff --git a/tools/testing/selftests/bpf/progs/test_subprogs.c b/tools/testing/selftests/bpf/progs/test_subprogs.c index d3c5673c0218..b7c37ca09544 100644 --- a/tools/testing/selftests/bpf/progs/test_subprogs.c +++ b/tools/testing/selftests/bpf/progs/test_subprogs.c @@ -4,8 +4,18 @@
const char LICENSE[] SEC("license") = "GPL";
+struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} array SEC(".maps"); + __noinline int sub1(int x) { + int key = 0; + + bpf_map_lookup_elem(&array, &key); return x + 1; }
@@ -23,6 +33,9 @@ static __noinline int sub3(int z)
static __noinline int sub4(int w) { + int key = 0; + + bpf_map_lookup_elem(&array, &key); return w + sub3(5) + sub1(6); }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 9ca1f56ababea5f5c714074845ee1c9e4dd75956 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a pointer to 'struct bpf_object' to kernel_supports() helper. It will be used in the next patch. No functional changes.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-13-alexei.starovoitov@gmail... (cherry picked from commit 9ca1f56ababea5f5c714074845ee1c9e4dd75956) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 44 +++++++++++++++++++++--------------------- 1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e061492a9ca5..dbb82494deb0 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -178,7 +178,7 @@ enum kern_feature_id { __FEAT_CNT, };
-static bool kernel_supports(enum kern_feature_id feat_id); +static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id);
enum reloc_type { RELO_LD64, @@ -2463,20 +2463,20 @@ static bool section_have_execinstr(struct bpf_object *obj, int idx)
static bool btf_needs_sanitization(struct bpf_object *obj) { - bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC); - bool has_datasec = kernel_supports(FEAT_BTF_DATASEC); - bool has_float = kernel_supports(FEAT_BTF_FLOAT); - bool has_func = kernel_supports(FEAT_BTF_FUNC); + bool has_func_global = kernel_supports(obj, FEAT_BTF_GLOBAL_FUNC); + bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC); + bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); + bool has_func = kernel_supports(obj, FEAT_BTF_FUNC);
return !has_func || !has_datasec || !has_func_global || !has_float; }
static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) { - bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC); - bool has_datasec = kernel_supports(FEAT_BTF_DATASEC); - bool has_float = kernel_supports(FEAT_BTF_FLOAT); - bool has_func = kernel_supports(FEAT_BTF_FUNC); + bool has_func_global = kernel_supports(obj, FEAT_BTF_GLOBAL_FUNC); + bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC); + bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); + bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); struct btf_type *t; int i, j, vlen;
@@ -2683,7 +2683,7 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) if (!obj->btf) return 0;
- if (!kernel_supports(FEAT_BTF)) { + if (!kernel_supports(obj, FEAT_BTF)) { if (kernel_needs_btf(obj)) { err = -EOPNOTSUPP; goto report; @@ -4352,7 +4352,7 @@ static struct kern_feature_desc { }, };
-static bool kernel_supports(enum kern_feature_id feat_id) +static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id) { struct kern_feature_desc *feat = &feature_probes[feat_id]; int ret; @@ -4476,7 +4476,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
memset(&create_attr, 0, sizeof(create_attr));
- if (kernel_supports(FEAT_PROG_NAME)) + if (kernel_supports(obj, FEAT_PROG_NAME)) create_attr.name = map->name; create_attr.map_ifindex = map->map_ifindex; create_attr.map_type = def->type; @@ -5051,7 +5051,7 @@ static int load_module_btfs(struct bpf_object *obj) obj->btf_modules_loaded = true;
/* kernel too old to support module BTFs */ - if (!kernel_supports(FEAT_MODULE_BTF)) + if (!kernel_supports(obj, FEAT_MODULE_BTF)) return 0;
while (true) { @@ -6575,7 +6575,7 @@ reloc_prog_func_and_line_info(const struct bpf_object *obj, /* no .BTF.ext relocation if .BTF.ext is missing or kernel doesn't * supprot func/line info */ - if (!obj->btf_ext || !kernel_supports(FEAT_BTF_FUNC)) + if (!obj->btf_ext || !kernel_supports(obj, FEAT_BTF_FUNC)) return 0;
/* only attempt func info relocation if main program's func_info @@ -7183,12 +7183,12 @@ static int bpf_object__sanitize_prog(struct bpf_object *obj, struct bpf_program switch (func_id) { case BPF_FUNC_probe_read_kernel: case BPF_FUNC_probe_read_user: - if (!kernel_supports(FEAT_PROBE_READ_KERN)) + if (!kernel_supports(obj, FEAT_PROBE_READ_KERN)) insn->imm = BPF_FUNC_probe_read; break; case BPF_FUNC_probe_read_kernel_str: case BPF_FUNC_probe_read_user_str: - if (!kernel_supports(FEAT_PROBE_READ_KERN)) + if (!kernel_supports(obj, FEAT_PROBE_READ_KERN)) insn->imm = BPF_FUNC_probe_read_str; break; default: @@ -7223,12 +7223,12 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
load_attr.prog_type = prog->type; /* old kernels might not support specifying expected_attach_type */ - if (!kernel_supports(FEAT_EXP_ATTACH_TYPE) && prog->sec_def && + if (!kernel_supports(prog->obj, FEAT_EXP_ATTACH_TYPE) && prog->sec_def && prog->sec_def->is_exp_attach_type_optional) load_attr.expected_attach_type = 0; else load_attr.expected_attach_type = prog->expected_attach_type; - if (kernel_supports(FEAT_PROG_NAME)) + if (kernel_supports(prog->obj, FEAT_PROG_NAME)) load_attr.name = prog->name; load_attr.insns = insns; load_attr.insn_cnt = insns_cnt; @@ -7244,7 +7244,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
/* specify func_info/line_info only if kernel supports them */ btf_fd = bpf_object__btf_fd(prog->obj); - if (btf_fd >= 0 && kernel_supports(FEAT_BTF_FUNC)) { + if (btf_fd >= 0 && kernel_supports(prog->obj, FEAT_BTF_FUNC)) { load_attr.prog_btf_fd = btf_fd; load_attr.func_info = prog->func_info; load_attr.func_info_rec_size = prog->func_info_rec_size; @@ -7274,7 +7274,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, pr_debug("verifier log:\n%s", log_buf);
if (prog->obj->rodata_map_idx >= 0 && - kernel_supports(FEAT_PROG_BIND_MAP)) { + kernel_supports(prog->obj, FEAT_PROG_BIND_MAP)) { struct bpf_map *rodata_map = &prog->obj->maps[prog->obj->rodata_map_idx];
@@ -7635,11 +7635,11 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj) bpf_object__for_each_map(m, obj) { if (!bpf_map__is_internal(m)) continue; - if (!kernel_supports(FEAT_GLOBAL_DATA)) { + if (!kernel_supports(obj, FEAT_GLOBAL_DATA)) { pr_warn("kernel doesn't support global data\n"); return -ENOTSUP; } - if (!kernel_supports(FEAT_ARRAY_MMAP)) + if (!kernel_supports(obj, FEAT_ARRAY_MMAP)) m->def.map_flags ^= BPF_F_MMAPABLE; }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit e2fa0156a434c140998aa16ecad329e4bc19f263 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Prep libbpf to use FD_IDX kernel feature when generating loader program.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-14-alexei.starovoitov@gmail... (cherry picked from commit e2fa0156a434c140998aa16ecad329e4bc19f263) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 31 +++++++++++++++++++++++++------ 1 file changed, 25 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index dbb82494deb0..5f94c744945d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -412,6 +412,8 @@ struct module_btf { int fd; };
+struct bpf_gen; + struct bpf_object { char name[BPF_OBJ_NAME_LEN]; char license[64]; @@ -432,6 +434,8 @@ struct bpf_object { bool loaded; bool has_subcalls;
+ struct bpf_gen *gen_loader; + /* * Information when doing elf related work. Only valid if fd * is valid. @@ -6445,19 +6449,34 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
switch (relo->type) { case RELO_LD64: - insn[0].src_reg = BPF_PSEUDO_MAP_FD; - insn[0].imm = obj->maps[relo->map_idx].fd; + if (obj->gen_loader) { + insn[0].src_reg = BPF_PSEUDO_MAP_IDX; + insn[0].imm = relo->map_idx; + } else { + insn[0].src_reg = BPF_PSEUDO_MAP_FD; + insn[0].imm = obj->maps[relo->map_idx].fd; + } break; case RELO_DATA: - insn[0].src_reg = BPF_PSEUDO_MAP_VALUE; insn[1].imm = insn[0].imm + relo->sym_off; - insn[0].imm = obj->maps[relo->map_idx].fd; + if (obj->gen_loader) { + insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE; + insn[0].imm = relo->map_idx; + } else { + insn[0].src_reg = BPF_PSEUDO_MAP_VALUE; + insn[0].imm = obj->maps[relo->map_idx].fd; + } break; case RELO_EXTERN_VAR: ext = &obj->externs[relo->sym_off]; if (ext->type == EXT_KCFG) { - insn[0].src_reg = BPF_PSEUDO_MAP_VALUE; - insn[0].imm = obj->maps[obj->kconfig_map_idx].fd; + if (obj->gen_loader) { + insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE; + insn[0].imm = obj->kconfig_map_idx; + } else { + insn[0].src_reg = BPF_PSEUDO_MAP_VALUE; + insn[0].imm = obj->maps[obj->kconfig_map_idx].fd; + } insn[1].imm = ext->kcfg.data_off; } else /* EXT_KSYM */ { if (ext->ksym.type_id) { /* typed ksyms */
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 67234743736a6ac31e3e74f6ec5e6d7bb3073676 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The BPF program loading process performed by libbpf is quite complex and consists of the following steps: "open" phase: - parse elf file and remember relocations, sections - collect externs and ksyms including their btf_ids in prog's BTF - patch BTF datasec (since llvm couldn't do it) - init maps (old style map_def, BTF based, global data map, kconfig map) - collect relocations against progs and maps "load" phase: - probe kernel features - load vmlinux BTF - resolve externs (kconfig and ksym) - load program BTF - init struct_ops - create maps - apply CO-RE relocations - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID - reposition subprograms and adjust call insns - sanitize and load progs
During this process libbpf does sys_bpf() calls to load BTF, create maps, populate maps and finally load programs. Instead of actually doing the syscalls generate a trace of what libbpf would have done and represent it as the "loader program". The "loader program" consists of single map with: - union bpf_attr(s) - BTF bytes - map value bytes - insns bytes and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper. Executing such "loader program" via bpf_prog_test_run() command will replay the sequence of syscalls that libbpf would have done which will result the same maps created and programs loaded as specified in the elf file. The "loader program" removes libelf and majority of libbpf dependency from program loading process.
kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
The order of relocate_data and relocate_calls had to change, so that bpf_gen__prog_load() can see all relocations for a given program with correct insn_idx-es.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-15-alexei.starovoitov@gmail... (cherry picked from commit 67234743736a6ac31e3e74f6ec5e6d7bb3073676) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Build | 2 +- tools/lib/bpf/bpf_gen_internal.h | 40 ++ tools/lib/bpf/gen_loader.c | 689 +++++++++++++++++++++++++++++++ tools/lib/bpf/libbpf.c | 233 +++++++++-- tools/lib/bpf/libbpf.h | 12 + tools/lib/bpf/libbpf.map | 1 + tools/lib/bpf/libbpf_internal.h | 2 + tools/lib/bpf/skel_internal.h | 123 ++++++ 8 files changed, 1064 insertions(+), 38 deletions(-) create mode 100644 tools/lib/bpf/bpf_gen_internal.h create mode 100644 tools/lib/bpf/gen_loader.c create mode 100644 tools/lib/bpf/skel_internal.h
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build index 9b057cc7650a..430f6874fa41 100644 --- a/tools/lib/bpf/Build +++ b/tools/lib/bpf/Build @@ -1,3 +1,3 @@ libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \ netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \ - btf_dump.o ringbuf.o strset.o linker.o + btf_dump.o ringbuf.o strset.o linker.o gen_loader.o diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h new file mode 100644 index 000000000000..f42a55efd559 --- /dev/null +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +/* Copyright (c) 2021 Facebook */ +#ifndef __BPF_GEN_INTERNAL_H +#define __BPF_GEN_INTERNAL_H + +struct ksym_relo_desc { + const char *name; + int kind; + int insn_idx; +}; + +struct bpf_gen { + struct gen_loader_opts *opts; + void *data_start; + void *data_cur; + void *insn_start; + void *insn_cur; + __u32 nr_progs; + __u32 nr_maps; + int log_level; + int error; + struct ksym_relo_desc *relos; + int relo_cnt; + char attach_target[128]; + int attach_kind; +}; + +void bpf_gen__init(struct bpf_gen *gen, int log_level); +int bpf_gen__finish(struct bpf_gen *gen); +void bpf_gen__free(struct bpf_gen *gen); +void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size); +void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx); +struct bpf_prog_load_params; +void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx); +void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size); +void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx); +void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type); +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx); + +#endif diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c new file mode 100644 index 000000000000..0fc54b1ca311 --- /dev/null +++ b/tools/lib/bpf/gen_loader.c @@ -0,0 +1,689 @@ +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +/* Copyright (c) 2021 Facebook */ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <linux/filter.h> +#include "btf.h" +#include "bpf.h" +#include "libbpf.h" +#include "libbpf_internal.h" +#include "hashmap.h" +#include "bpf_gen_internal.h" +#include "skel_internal.h" + +#define MAX_USED_MAPS 64 +#define MAX_USED_PROGS 32 + +/* The following structure describes the stack layout of the loader program. + * In addition R6 contains the pointer to context. + * R7 contains the result of the last sys_bpf command (typically error or FD). + * R9 contains the result of the last sys_close command. + * + * Naming convention: + * ctx - bpf program context + * stack - bpf program stack + * blob - bpf_attr-s, strings, insns, map data. + * All the bytes that loader prog will use for read/write. + */ +struct loader_stack { + __u32 btf_fd; + __u32 map_fd[MAX_USED_MAPS]; + __u32 prog_fd[MAX_USED_PROGS]; + __u32 inner_map_fd; +}; + +#define stack_off(field) \ + (__s16)(-sizeof(struct loader_stack) + offsetof(struct loader_stack, field)) + +#define attr_field(attr, field) (attr + offsetof(union bpf_attr, field)) + +static int realloc_insn_buf(struct bpf_gen *gen, __u32 size) +{ + size_t off = gen->insn_cur - gen->insn_start; + void *insn_start; + + if (gen->error) + return gen->error; + if (size > INT32_MAX || off + size > INT32_MAX) { + gen->error = -ERANGE; + return -ERANGE; + } + insn_start = realloc(gen->insn_start, off + size); + if (!insn_start) { + gen->error = -ENOMEM; + free(gen->insn_start); + gen->insn_start = NULL; + return -ENOMEM; + } + gen->insn_start = insn_start; + gen->insn_cur = insn_start + off; + return 0; +} + +static int realloc_data_buf(struct bpf_gen *gen, __u32 size) +{ + size_t off = gen->data_cur - gen->data_start; + void *data_start; + + if (gen->error) + return gen->error; + if (size > INT32_MAX || off + size > INT32_MAX) { + gen->error = -ERANGE; + return -ERANGE; + } + data_start = realloc(gen->data_start, off + size); + if (!data_start) { + gen->error = -ENOMEM; + free(gen->data_start); + gen->data_start = NULL; + return -ENOMEM; + } + gen->data_start = data_start; + gen->data_cur = data_start + off; + return 0; +} + +static void emit(struct bpf_gen *gen, struct bpf_insn insn) +{ + if (realloc_insn_buf(gen, sizeof(insn))) + return; + memcpy(gen->insn_cur, &insn, sizeof(insn)); + gen->insn_cur += sizeof(insn); +} + +static void emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2) +{ + emit(gen, insn1); + emit(gen, insn2); +} + +void bpf_gen__init(struct bpf_gen *gen, int log_level) +{ + gen->log_level = log_level; + emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1)); +} + +static int add_data(struct bpf_gen *gen, const void *data, __u32 size) +{ + void *prev; + + if (realloc_data_buf(gen, size)) + return 0; + prev = gen->data_cur; + memcpy(gen->data_cur, data, size); + gen->data_cur += size; + return prev - gen->data_start; +} + +static int insn_bytes_to_bpf_size(__u32 sz) +{ + switch (sz) { + case 8: return BPF_DW; + case 4: return BPF_W; + case 2: return BPF_H; + case 1: return BPF_B; + default: return -1; + } +} + +/* *(u64 *)(blob + off) = (u64)(void *)(blob + data) */ +static void emit_rel_store(struct bpf_gen *gen, int off, int data) +{ + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, data)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, off)); + emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0)); +} + +/* *(u64 *)(blob + off) = (u64)(void *)(%sp + stack_off) */ +static void emit_rel_store_sp(struct bpf_gen *gen, int off, int stack_off) +{ + emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_10)); + emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, stack_off)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, off)); + emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0)); +} + +static void move_ctx2blob(struct bpf_gen *gen, int off, int size, int ctx_off, + bool check_non_zero) +{ + emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_6, ctx_off)); + if (check_non_zero) + /* If value in ctx is zero don't update the blob. + * For example: when ctx->map.max_entries == 0, keep default max_entries from bpf.c + */ + emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, off)); + emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0)); +} + +static void move_stack2blob(struct bpf_gen *gen, int off, int size, int stack_off) +{ + emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, off)); + emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0)); +} + +static void move_stack2ctx(struct bpf_gen *gen, int ctx_off, int size, int stack_off) +{ + emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off)); + emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_6, BPF_REG_0, ctx_off)); +} + +static void emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size) +{ + emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, attr)); + emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf)); + /* remember the result in R7 */ + emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0)); +} + +static void emit_check_err(struct bpf_gen *gen) +{ + emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2)); + emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7)); + /* TODO: close intermediate FDs in case of error */ + emit(gen, BPF_EXIT_INSN()); +} + +/* reg1 and reg2 should not be R1 - R5. They can be R0, R6 - R10 */ +static void emit_debug(struct bpf_gen *gen, int reg1, int reg2, + const char *fmt, va_list args) +{ + char buf[1024]; + int addr, len, ret; + + if (!gen->log_level) + return; + ret = vsnprintf(buf, sizeof(buf), fmt, args); + if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0) + /* The special case to accommodate common debug_ret(): + * to avoid specifying BPF_REG_7 and adding " r=%%d" to + * prints explicitly. + */ + strcat(buf, " r=%d"); + len = strlen(buf) + 1; + addr = add_data(gen, buf, len); + + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, addr)); + emit(gen, BPF_MOV64_IMM(BPF_REG_2, len)); + if (reg1 >= 0) + emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1)); + if (reg2 >= 0) + emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk)); +} + +static void debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...) +{ + va_list args; + + va_start(args, fmt); + emit_debug(gen, reg1, reg2, fmt, args); + va_end(args); +} + +static void debug_ret(struct bpf_gen *gen, const char *fmt, ...) +{ + va_list args; + + va_start(args, fmt); + emit_debug(gen, BPF_REG_7, -1, fmt, args); + va_end(args); +} + +static void __emit_sys_close(struct bpf_gen *gen) +{ + emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, + /* 2 is the number of the following insns + * * 6 is additional insns in debug_regs + */ + 2 + (gen->log_level ? 6 : 0))); + emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close)); + debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d"); +} + +static void emit_sys_close_stack(struct bpf_gen *gen, int stack_off) +{ + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off)); + __emit_sys_close(gen); +} + +static void emit_sys_close_blob(struct bpf_gen *gen, int blob_off) +{ + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, blob_off)); + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0)); + __emit_sys_close(gen); +} + +int bpf_gen__finish(struct bpf_gen *gen) +{ + int i; + + emit_sys_close_stack(gen, stack_off(btf_fd)); + for (i = 0; i < gen->nr_progs; i++) + move_stack2ctx(gen, + sizeof(struct bpf_loader_ctx) + + sizeof(struct bpf_map_desc) * gen->nr_maps + + sizeof(struct bpf_prog_desc) * i + + offsetof(struct bpf_prog_desc, prog_fd), 4, + stack_off(prog_fd[i])); + for (i = 0; i < gen->nr_maps; i++) + move_stack2ctx(gen, + sizeof(struct bpf_loader_ctx) + + sizeof(struct bpf_map_desc) * i + + offsetof(struct bpf_map_desc, map_fd), 4, + stack_off(map_fd[i])); + emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0)); + emit(gen, BPF_EXIT_INSN()); + pr_debug("gen: finish %d\n", gen->error); + if (!gen->error) { + struct gen_loader_opts *opts = gen->opts; + + opts->insns = gen->insn_start; + opts->insns_sz = gen->insn_cur - gen->insn_start; + opts->data = gen->data_start; + opts->data_sz = gen->data_cur - gen->data_start; + } + return gen->error; +} + +void bpf_gen__free(struct bpf_gen *gen) +{ + if (!gen) + return; + free(gen->data_start); + free(gen->insn_start); + free(gen); +} + +void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, + __u32 btf_raw_size) +{ + int attr_size = offsetofend(union bpf_attr, btf_log_level); + int btf_data, btf_load_attr; + union bpf_attr attr; + + memset(&attr, 0, attr_size); + pr_debug("gen: load_btf: size %d\n", btf_raw_size); + btf_data = add_data(gen, btf_raw_data, btf_raw_size); + + attr.btf_size = btf_raw_size; + btf_load_attr = add_data(gen, &attr, attr_size); + + /* populate union bpf_attr with user provided log details */ + move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_level), 4, + offsetof(struct bpf_loader_ctx, log_level), false); + move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_size), 4, + offsetof(struct bpf_loader_ctx, log_size), false); + move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_buf), 8, + offsetof(struct bpf_loader_ctx, log_buf), false); + /* populate union bpf_attr with a pointer to the BTF data */ + emit_rel_store(gen, attr_field(btf_load_attr, btf), btf_data); + /* emit BTF_LOAD command */ + emit_sys_bpf(gen, BPF_BTF_LOAD, btf_load_attr, attr_size); + debug_ret(gen, "btf_load size %d", btf_raw_size); + emit_check_err(gen); + /* remember btf_fd in the stack, if successful */ + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(btf_fd))); +} + +void bpf_gen__map_create(struct bpf_gen *gen, + struct bpf_create_map_attr *map_attr, int map_idx) +{ + int attr_size = offsetofend(union bpf_attr, btf_vmlinux_value_type_id); + bool close_inner_map_fd = false; + int map_create_attr; + union bpf_attr attr; + + memset(&attr, 0, attr_size); + attr.map_type = map_attr->map_type; + attr.key_size = map_attr->key_size; + attr.value_size = map_attr->value_size; + attr.map_flags = map_attr->map_flags; + memcpy(attr.map_name, map_attr->name, + min((unsigned)strlen(map_attr->name), BPF_OBJ_NAME_LEN - 1)); + attr.numa_node = map_attr->numa_node; + attr.map_ifindex = map_attr->map_ifindex; + attr.max_entries = map_attr->max_entries; + switch (attr.map_type) { + case BPF_MAP_TYPE_PERF_EVENT_ARRAY: + case BPF_MAP_TYPE_CGROUP_ARRAY: + case BPF_MAP_TYPE_STACK_TRACE: + case BPF_MAP_TYPE_ARRAY_OF_MAPS: + case BPF_MAP_TYPE_HASH_OF_MAPS: + case BPF_MAP_TYPE_DEVMAP: + case BPF_MAP_TYPE_DEVMAP_HASH: + case BPF_MAP_TYPE_CPUMAP: + case BPF_MAP_TYPE_XSKMAP: + case BPF_MAP_TYPE_SOCKMAP: + case BPF_MAP_TYPE_SOCKHASH: + case BPF_MAP_TYPE_QUEUE: + case BPF_MAP_TYPE_STACK: + case BPF_MAP_TYPE_RINGBUF: + break; + default: + attr.btf_key_type_id = map_attr->btf_key_type_id; + attr.btf_value_type_id = map_attr->btf_value_type_id; + } + + pr_debug("gen: map_create: %s idx %d type %d value_type_id %d\n", + attr.map_name, map_idx, map_attr->map_type, attr.btf_value_type_id); + + map_create_attr = add_data(gen, &attr, attr_size); + if (attr.btf_value_type_id) + /* populate union bpf_attr with btf_fd saved in the stack earlier */ + move_stack2blob(gen, attr_field(map_create_attr, btf_fd), 4, + stack_off(btf_fd)); + switch (attr.map_type) { + case BPF_MAP_TYPE_ARRAY_OF_MAPS: + case BPF_MAP_TYPE_HASH_OF_MAPS: + move_stack2blob(gen, attr_field(map_create_attr, inner_map_fd), 4, + stack_off(inner_map_fd)); + close_inner_map_fd = true; + break; + default: + break; + } + /* conditionally update max_entries */ + if (map_idx >= 0) + move_ctx2blob(gen, attr_field(map_create_attr, max_entries), 4, + sizeof(struct bpf_loader_ctx) + + sizeof(struct bpf_map_desc) * map_idx + + offsetof(struct bpf_map_desc, max_entries), + true /* check that max_entries != 0 */); + /* emit MAP_CREATE command */ + emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size); + debug_ret(gen, "map_create %s idx %d type %d value_size %d value_btf_id %d", + attr.map_name, map_idx, map_attr->map_type, attr.value_size, + attr.btf_value_type_id); + emit_check_err(gen); + /* remember map_fd in the stack, if successful */ + if (map_idx < 0) { + /* This bpf_gen__map_create() function is called with map_idx >= 0 + * for all maps that libbpf loading logic tracks. + * It's called with -1 to create an inner map. + */ + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, + stack_off(inner_map_fd))); + } else if (map_idx != gen->nr_maps) { + gen->error = -EDOM; /* internal bug */ + return; + } else { + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, + stack_off(map_fd[map_idx]))); + gen->nr_maps++; + } + if (close_inner_map_fd) + emit_sys_close_stack(gen, stack_off(inner_map_fd)); +} + +void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *attach_name, + enum bpf_attach_type type) +{ + const char *prefix; + int kind, ret; + + btf_get_kernel_prefix_kind(type, &prefix, &kind); + gen->attach_kind = kind; + ret = snprintf(gen->attach_target, sizeof(gen->attach_target), "%s%s", + prefix, attach_name); + if (ret == sizeof(gen->attach_target)) + gen->error = -ENOSPC; +} + +static void emit_find_attach_target(struct bpf_gen *gen) +{ + int name, len = strlen(gen->attach_target) + 1; + + pr_debug("gen: find_attach_tgt %s %d\n", gen->attach_target, gen->attach_kind); + name = add_data(gen, gen->attach_target, len); + + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, name)); + emit(gen, BPF_MOV64_IMM(BPF_REG_2, len)); + emit(gen, BPF_MOV64_IMM(BPF_REG_3, gen->attach_kind)); + emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind)); + emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0)); + debug_ret(gen, "find_by_name_kind(%s,%d)", + gen->attach_target, gen->attach_kind); + emit_check_err(gen); + /* if successful, btf_id is in lower 32-bit of R7 and + * btf_obj_fd is in upper 32-bit + */ +} + +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, + int insn_idx) +{ + struct ksym_relo_desc *relo; + + relo = libbpf_reallocarray(gen->relos, gen->relo_cnt + 1, sizeof(*relo)); + if (!relo) { + gen->error = -ENOMEM; + return; + } + gen->relos = relo; + relo += gen->relo_cnt; + relo->name = name; + relo->kind = kind; + relo->insn_idx = insn_idx; + gen->relo_cnt++; +} + +static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns) +{ + int name, insn, len = strlen(relo->name) + 1; + + pr_debug("gen: emit_relo: %s at %d\n", relo->name, relo->insn_idx); + name = add_data(gen, relo->name, len); + + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, name)); + emit(gen, BPF_MOV64_IMM(BPF_REG_2, len)); + emit(gen, BPF_MOV64_IMM(BPF_REG_3, relo->kind)); + emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind)); + emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0)); + debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind); + emit_check_err(gen); + /* store btf_id into insn[insn_idx].imm */ + insn = insns + sizeof(struct bpf_insn) * relo->insn_idx + + offsetof(struct bpf_insn, imm); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, insn)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, 0)); + if (relo->kind == BTF_KIND_VAR) { + /* store btf_obj_fd into insn[insn_idx + 1].imm */ + emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, + sizeof(struct bpf_insn))); + } +} + +static void emit_relos(struct bpf_gen *gen, int insns) +{ + int i; + + for (i = 0; i < gen->relo_cnt; i++) + emit_relo(gen, gen->relos + i, insns); +} + +static void cleanup_relos(struct bpf_gen *gen, int insns) +{ + int i, insn; + + for (i = 0; i < gen->relo_cnt; i++) { + if (gen->relos[i].kind != BTF_KIND_VAR) + continue; + /* close fd recorded in insn[insn_idx + 1].imm */ + insn = insns + + sizeof(struct bpf_insn) * (gen->relos[i].insn_idx + 1) + + offsetof(struct bpf_insn, imm); + emit_sys_close_blob(gen, insn); + } + if (gen->relo_cnt) { + free(gen->relos); + gen->relo_cnt = 0; + gen->relos = NULL; + } +} + +void bpf_gen__prog_load(struct bpf_gen *gen, + struct bpf_prog_load_params *load_attr, int prog_idx) +{ + int attr_size = offsetofend(union bpf_attr, fd_array); + int prog_load_attr, license, insns, func_info, line_info; + union bpf_attr attr; + + memset(&attr, 0, attr_size); + pr_debug("gen: prog_load: type %d insns_cnt %zd\n", + load_attr->prog_type, load_attr->insn_cnt); + /* add license string to blob of bytes */ + license = add_data(gen, load_attr->license, strlen(load_attr->license) + 1); + /* add insns to blob of bytes */ + insns = add_data(gen, load_attr->insns, + load_attr->insn_cnt * sizeof(struct bpf_insn)); + + attr.prog_type = load_attr->prog_type; + attr.expected_attach_type = load_attr->expected_attach_type; + attr.attach_btf_id = load_attr->attach_btf_id; + attr.prog_ifindex = load_attr->prog_ifindex; + attr.kern_version = 0; + attr.insn_cnt = (__u32)load_attr->insn_cnt; + attr.prog_flags = load_attr->prog_flags; + + attr.func_info_rec_size = load_attr->func_info_rec_size; + attr.func_info_cnt = load_attr->func_info_cnt; + func_info = add_data(gen, load_attr->func_info, + attr.func_info_cnt * attr.func_info_rec_size); + + attr.line_info_rec_size = load_attr->line_info_rec_size; + attr.line_info_cnt = load_attr->line_info_cnt; + line_info = add_data(gen, load_attr->line_info, + attr.line_info_cnt * attr.line_info_rec_size); + + memcpy(attr.prog_name, load_attr->name, + min((unsigned)strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1)); + prog_load_attr = add_data(gen, &attr, attr_size); + + /* populate union bpf_attr with a pointer to license */ + emit_rel_store(gen, attr_field(prog_load_attr, license), license); + + /* populate union bpf_attr with a pointer to instructions */ + emit_rel_store(gen, attr_field(prog_load_attr, insns), insns); + + /* populate union bpf_attr with a pointer to func_info */ + emit_rel_store(gen, attr_field(prog_load_attr, func_info), func_info); + + /* populate union bpf_attr with a pointer to line_info */ + emit_rel_store(gen, attr_field(prog_load_attr, line_info), line_info); + + /* populate union bpf_attr fd_array with a pointer to stack where map_fds are saved */ + emit_rel_store_sp(gen, attr_field(prog_load_attr, fd_array), + stack_off(map_fd[0])); + + /* populate union bpf_attr with user provided log details */ + move_ctx2blob(gen, attr_field(prog_load_attr, log_level), 4, + offsetof(struct bpf_loader_ctx, log_level), false); + move_ctx2blob(gen, attr_field(prog_load_attr, log_size), 4, + offsetof(struct bpf_loader_ctx, log_size), false); + move_ctx2blob(gen, attr_field(prog_load_attr, log_buf), 8, + offsetof(struct bpf_loader_ctx, log_buf), false); + /* populate union bpf_attr with btf_fd saved in the stack earlier */ + move_stack2blob(gen, attr_field(prog_load_attr, prog_btf_fd), 4, + stack_off(btf_fd)); + if (gen->attach_kind) { + emit_find_attach_target(gen); + /* populate union bpf_attr with btf_id and btf_obj_fd found by helper */ + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, prog_load_attr)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, + offsetof(union bpf_attr, attach_btf_id))); + emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, + offsetof(union bpf_attr, attach_btf_obj_fd))); + } + emit_relos(gen, insns); + /* emit PROG_LOAD command */ + emit_sys_bpf(gen, BPF_PROG_LOAD, prog_load_attr, attr_size); + debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt); + /* successful or not, close btf module FDs used in extern ksyms and attach_btf_obj_fd */ + cleanup_relos(gen, insns); + if (gen->attach_kind) + emit_sys_close_blob(gen, + attr_field(prog_load_attr, attach_btf_obj_fd)); + emit_check_err(gen); + /* remember prog_fd in the stack, if successful */ + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, + stack_off(prog_fd[gen->nr_progs]))); + gen->nr_progs++; +} + +void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue, + __u32 value_size) +{ + int attr_size = offsetofend(union bpf_attr, flags); + int map_update_attr, value, key; + union bpf_attr attr; + int zero = 0; + + memset(&attr, 0, attr_size); + pr_debug("gen: map_update_elem: idx %d\n", map_idx); + + value = add_data(gen, pvalue, value_size); + key = add_data(gen, &zero, sizeof(zero)); + + /* if (map_desc[map_idx].initial_value) + * copy_from_user(value, initial_value, value_size); + */ + emit(gen, BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_6, + sizeof(struct bpf_loader_ctx) + + sizeof(struct bpf_map_desc) * map_idx + + offsetof(struct bpf_map_desc, initial_value))); + emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0, 4)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, value)); + emit(gen, BPF_MOV64_IMM(BPF_REG_2, value_size)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_copy_from_user)); + + map_update_attr = add_data(gen, &attr, attr_size); + move_stack2blob(gen, attr_field(map_update_attr, map_fd), 4, + stack_off(map_fd[map_idx])); + emit_rel_store(gen, attr_field(map_update_attr, key), key); + emit_rel_store(gen, attr_field(map_update_attr, value), value); + /* emit MAP_UPDATE_ELEM command */ + emit_sys_bpf(gen, BPF_MAP_UPDATE_ELEM, map_update_attr, attr_size); + debug_ret(gen, "update_elem idx %d value_size %d", map_idx, value_size); + emit_check_err(gen); +} + +void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx) +{ + int attr_size = offsetofend(union bpf_attr, map_fd); + int map_freeze_attr; + union bpf_attr attr; + + memset(&attr, 0, attr_size); + pr_debug("gen: map_freeze: idx %d\n", map_idx); + map_freeze_attr = add_data(gen, &attr, attr_size); + move_stack2blob(gen, attr_field(map_freeze_attr, map_fd), 4, + stack_off(map_fd[map_idx])); + /* emit MAP_FREEZE command */ + emit_sys_bpf(gen, BPF_MAP_FREEZE, map_freeze_attr, attr_size); + debug_ret(gen, "map_freeze"); + emit_check_err(gen); +} diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 5f94c744945d..a7ae242f2b4e 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -54,6 +54,7 @@ #include "str_error.h" #include "libbpf_internal.h" #include "hashmap.h" +#include "bpf_gen_internal.h"
#ifndef BPF_FS_MAGIC #define BPF_FS_MAGIC 0xcafe4a11 @@ -412,8 +413,6 @@ struct module_btf { int fd; };
-struct bpf_gen; - struct bpf_object { char name[BPF_OBJ_NAME_LEN]; char license[64]; @@ -2662,7 +2661,7 @@ static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force) int err;
/* btf_vmlinux could be loaded earlier */ - if (obj->btf_vmlinux) + if (obj->btf_vmlinux || obj->gen_loader) return 0;
if (!force && !obj_needs_vmlinux_btf(obj)) @@ -2744,7 +2743,20 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) bpf_object__sanitize_btf(obj, kern_btf); }
- err = btf__load(kern_btf); + if (obj->gen_loader) { + __u32 raw_size = 0; + const void *raw_data = btf__get_raw_data(kern_btf, &raw_size); + + if (!raw_data) + return -ENOMEM; + bpf_gen__load_btf(obj->gen_loader, raw_data, raw_size); + /* Pretend to have valid FD to pass various fd >= 0 checks. + * This fd == 0 will not be used with any syscall and will be reset to -1 eventually. + */ + btf__set_fd(kern_btf, 0); + } else { + err = btf__load(kern_btf); + } if (sanitize) { if (!err) { /* move fd to libbpf's BTF */ @@ -4361,6 +4373,12 @@ static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id f struct kern_feature_desc *feat = &feature_probes[feat_id]; int ret;
+ if (obj->gen_loader) + /* To generate loader program assume the latest kernel + * to avoid doing extra prog_load, map_create syscalls. + */ + return true; + if (READ_ONCE(feat->res) == FEAT_UNKNOWN) { ret = feat->probe(); if (ret > 0) { @@ -4447,6 +4465,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map) char *cp, errmsg[STRERR_BUFSIZE]; int err, zero = 0;
+ if (obj->gen_loader) { + bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps, + map->mmaped, map->def.value_size); + if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG) + bpf_gen__map_freeze(obj->gen_loader, map - obj->maps); + return 0; + } err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0); if (err) { err = -errno; @@ -4472,7 +4497,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
static void bpf_map__destroy(struct bpf_map *map);
-static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map) +static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner) { struct bpf_create_map_attr create_attr; struct bpf_map_def *def = &map->def; @@ -4519,7 +4544,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
if (bpf_map_type__is_map_in_map(def->type)) { if (map->inner_map) { - err = bpf_object__create_map(obj, map->inner_map); + err = bpf_object__create_map(obj, map->inner_map, true); if (err) { pr_warn("map '%s': failed to create inner map: %d\n", map->name, err); @@ -4531,7 +4556,15 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map) create_attr.inner_map_fd = map->inner_map_fd; }
- map->fd = bpf_create_map_xattr(&create_attr); + if (obj->gen_loader) { + bpf_gen__map_create(obj->gen_loader, &create_attr, is_inner ? -1 : map - obj->maps); + /* Pretend to have valid FD to pass various fd >= 0 checks. + * This fd == 0 will not be used with any syscall and will be reset to -1 eventually. + */ + map->fd = 0; + } else { + map->fd = bpf_create_map_xattr(&create_attr); + } if (map->fd < 0 && (create_attr.btf_key_type_id || create_attr.btf_value_type_id)) { char *cp, errmsg[STRERR_BUFSIZE]; @@ -4551,6 +4584,8 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map) err = map->fd < 0 ? -errno : 0;
if (bpf_map_type__is_map_in_map(def->type) && map->inner_map) { + if (obj->gen_loader) + map->inner_map->fd = -1; bpf_map__destroy(map->inner_map); zfree(&map->inner_map); } @@ -4558,11 +4593,11 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map) return err; }
-static int init_map_slots(struct bpf_map *map) +static int init_map_slots(struct bpf_object *obj, struct bpf_map *map) { const struct bpf_map *targ_map; unsigned int i; - int fd, err; + int fd, err = 0;
for (i = 0; i < map->init_slots_sz; i++) { if (!map->init_slots[i]) @@ -4570,7 +4605,13 @@ static int init_map_slots(struct bpf_map *map)
targ_map = map->init_slots[i]; fd = bpf_map__fd(targ_map); - err = bpf_map_update_elem(map->fd, &i, &fd, 0); + if (obj->gen_loader) { + pr_warn("// TODO map_update_elem: idx %ld key %d value==map_idx %ld\n", + map - obj->maps, i, targ_map - obj->maps); + return -ENOTSUP; + } else { + err = bpf_map_update_elem(map->fd, &i, &fd, 0); + } if (err) { err = -errno; pr_warn("map '%s': failed to initialize slot [%d] to map '%s' fd=%d: %d\n", @@ -4621,7 +4662,7 @@ bpf_object__create_maps(struct bpf_object *obj) pr_debug("map '%s': skipping creation (preset fd=%d)\n", map->name, map->fd); } else { - err = bpf_object__create_map(obj, map); + err = bpf_object__create_map(obj, map, false); if (err) goto err_out;
@@ -4637,7 +4678,7 @@ bpf_object__create_maps(struct bpf_object *obj) }
if (map->init_slots_sz) { - err = init_map_slots(map); + err = init_map_slots(obj, map); if (err < 0) { zclose(map->fd); goto err_out; @@ -5051,6 +5092,9 @@ static int load_module_btfs(struct bpf_object *obj) if (obj->btf_modules_loaded) return 0;
+ if (obj->gen_loader) + return 0; + /* don't do this again, even if we find no module BTFs */ obj->btf_modules_loaded = true;
@@ -6198,6 +6242,12 @@ static int bpf_core_apply_relo(struct bpf_program *prog, if (str_is_empty(spec_str)) return -EINVAL;
+ if (prog->obj->gen_loader) { + pr_warn("// TODO core_relo: prog %ld insn[%d] %s %s kind %d\n", + prog - prog->obj->programs, relo->insn_off / 8, + local_name, spec_str, relo->kind); + return -ENOTSUP; + } err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); if (err) { pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", @@ -6928,6 +6978,20 @@ bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog) return 0; }
+static void +bpf_object__free_relocs(struct bpf_object *obj) +{ + struct bpf_program *prog; + int i; + + /* free up relocation descriptors */ + for (i = 0; i < obj->nr_programs; i++) { + prog = &obj->programs[i]; + zfree(&prog->reloc_desc); + prog->nr_reloc = 0; + } +} + static int bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) { @@ -6997,12 +7061,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) return err; } } - /* free up relocation descriptors */ - for (i = 0; i < obj->nr_programs; i++) { - prog = &obj->programs[i]; - zfree(&prog->reloc_desc); - prog->nr_reloc = 0; - } + if (!obj->gen_loader) + bpf_object__free_relocs(obj); return 0; }
@@ -7191,6 +7251,9 @@ static int bpf_object__sanitize_prog(struct bpf_object *obj, struct bpf_program enum bpf_func_id func_id; int i;
+ if (obj->gen_loader) + return 0; + for (i = 0; i < prog->insns_cnt; i++, insn++) { if (!insn_is_helper_call(insn, &func_id)) continue; @@ -7275,6 +7338,12 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, load_attr.log_level = prog->log_level; load_attr.prog_flags = prog->prog_flags;
+ if (prog->obj->gen_loader) { + bpf_gen__prog_load(prog->obj->gen_loader, &load_attr, + prog - prog->obj->programs); + *pfd = -1; + return 0; + } retry_load: if (log_buf_size) { log_buf = malloc(log_buf_size); @@ -7352,6 +7421,38 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, return ret; }
+static int bpf_program__record_externs(struct bpf_program *prog) +{ + struct bpf_object *obj = prog->obj; + int i; + + for (i = 0; i < prog->nr_reloc; i++) { + struct reloc_desc *relo = &prog->reloc_desc[i]; + struct extern_desc *ext = &obj->externs[relo->sym_off]; + + switch (relo->type) { + case RELO_EXTERN_VAR: + if (ext->type != EXT_KSYM) + continue; + if (!ext->ksym.type_id) { + pr_warn("typeless ksym %s is not supported yet\n", + ext->name); + return -ENOTSUP; + } + bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_VAR, + relo->insn_idx); + break; + case RELO_EXTERN_FUNC: + bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_FUNC, + relo->insn_idx); + break; + default: + continue; + } + } + return 0; +} + static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id);
int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) @@ -7398,6 +7499,8 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) pr_warn("prog '%s': inconsistent nr(%d) != 1\n", prog->name, prog->instances.nr); } + if (prog->obj->gen_loader) + bpf_program__record_externs(prog); err = load_program(prog, prog->insns, prog->insns_cnt, license, kern_ver, &fd); if (!err) @@ -7474,6 +7577,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level) if (err) return err; } + if (obj->gen_loader) + bpf_object__free_relocs(obj); return 0; }
@@ -7856,6 +7961,12 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj) if (ext->type != EXT_KSYM || !ext->ksym.type_id) continue;
+ if (obj->gen_loader) { + ext->is_set = true; + ext->ksym.kernel_btf_obj_fd = 0; + ext->ksym.kernel_btf_id = 0; + continue; + } t = btf__type_by_id(obj->btf, ext->btf_id); if (btf_is_var(t)) err = bpf_object__resolve_ksym_var_btf_id(obj, ext); @@ -7970,6 +8081,9 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) return -EINVAL; }
+ if (obj->gen_loader) + bpf_gen__init(obj->gen_loader, attr->log_level); + err = bpf_object__probe_loading(obj); err = err ? : bpf_object__load_vmlinux_btf(obj, false); err = err ? : bpf_object__resolve_externs(obj, obj->kconfig); @@ -7980,6 +8094,15 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) err = err ? : bpf_object__relocate(obj, attr->target_btf_path); err = err ? : bpf_object__load_progs(obj, attr->log_level);
+ if (obj->gen_loader) { + /* reset FDs */ + btf__set_fd(obj->btf, -1); + for (i = 0; i < obj->nr_maps; i++) + obj->maps[i].fd = -1; + if (!err) + err = bpf_gen__finish(obj->gen_loader); + } + /* clean up module BTFs */ for (i = 0; i < obj->btf_module_cnt; i++) { close(obj->btf_modules[i].fd); @@ -8605,6 +8728,7 @@ void bpf_object__close(struct bpf_object *obj) if (obj->clear_priv) obj->clear_priv(obj, obj->priv);
+ bpf_gen__free(obj->gen_loader); bpf_object__elf_finish(obj); bpf_object__unload(obj); btf__free(obj->btf); @@ -8695,6 +8819,22 @@ void *bpf_object__priv(const struct bpf_object *obj) return obj ? obj->priv : ERR_PTR(-EINVAL); }
+int bpf_object__gen_loader(struct bpf_object *obj, struct gen_loader_opts *opts) +{ + struct bpf_gen *gen; + + if (!opts) + return -EFAULT; + if (!OPTS_VALID(opts, gen_loader_opts)) + return -EINVAL; + gen = calloc(sizeof(*gen), 1); + if (!gen) + return -ENOMEM; + gen->opts = opts; + obj->gen_loader = gen; + return 0; +} + static struct bpf_program * __bpf_program__iter(const struct bpf_program *p, const struct bpf_object *obj, bool forward) @@ -9341,6 +9481,32 @@ static int bpf_object__collect_st_ops_relos(struct bpf_object *obj, #define BTF_SCHED_PREFIX "bpf_sched_" #define BTF_MAX_NAME_SIZE 128
+void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type, + const char **prefix, int *kind) +{ + switch (attach_type) { + case BPF_TRACE_RAW_TP: + *prefix = BTF_TRACE_PREFIX; + *kind = BTF_KIND_TYPEDEF; + break; + case BPF_LSM_MAC: + *prefix = BTF_LSM_PREFIX; + *kind = BTF_KIND_FUNC; + break; + case BPF_TRACE_ITER: + *prefix = BTF_ITER_PREFIX; + *kind = BTF_KIND_FUNC; + break; + case BPF_SCHED: + *prefix = BTF_SCHED_PREFIX; + *kind = BTF_KIND_FUNC; + break; + default: + *prefix = ""; + *kind = BTF_KIND_FUNC; + } +} + static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix, const char *name, __u32 kind) { @@ -9361,24 +9527,11 @@ static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix, static inline int find_attach_btf_id(struct btf *btf, const char *name, enum bpf_attach_type attach_type) { - int err; - - if (attach_type == BPF_TRACE_RAW_TP) - err = find_btf_by_prefix_kind(btf, BTF_TRACE_PREFIX, name, - BTF_KIND_TYPEDEF); - else if (attach_type == BPF_LSM_MAC) - err = find_btf_by_prefix_kind(btf, BTF_LSM_PREFIX, name, - BTF_KIND_FUNC); - else if (attach_type == BPF_TRACE_ITER) - err = find_btf_by_prefix_kind(btf, BTF_ITER_PREFIX, name, - BTF_KIND_FUNC); - else if (attach_type == BPF_SCHED) - err = find_btf_by_prefix_kind(btf, BTF_SCHED_PREFIX, name, - BTF_KIND_FUNC); - else - err = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC); + const char *prefix; + int kind;
- return err; + btf_get_kernel_prefix_kind(attach_type, &prefix, &kind); + return find_btf_by_prefix_kind(btf, prefix, name, kind); }
int libbpf_find_vmlinux_btf_id(const char *name, @@ -9477,7 +9630,7 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, __u32 attach_prog_fd = prog->attach_prog_fd; const char *name = prog->sec_name, *attach_name; const struct bpf_sec_def *sec = NULL; - int i, err; + int i, err = 0;
if (!name) return -EINVAL; @@ -9512,7 +9665,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, }
/* kernel/module BTF ID */ - err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id); + if (prog->obj->gen_loader) { + bpf_gen__record_attach_target(prog->obj->gen_loader, attach_name, attach_type); + *btf_obj_fd = 0; + *btf_type_id = 1; + } else { + err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id); + } if (err) { pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err); return err; diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index e2c0aaf20416..3137292b5a6d 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -803,6 +803,18 @@ LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s); LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s); LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
+struct gen_loader_opts { + size_t sz; /* size of this struct, for forward/backward compatiblity */ + const char *data; + const char *insns; + __u32 data_sz; + __u32 insns_sz; +}; + +#define gen_loader_opts__last_field insns_sz +LIBBPF_API int bpf_object__gen_loader(struct bpf_object *obj, + struct gen_loader_opts *opts); + enum libbpf_tristate { TRI_NO = 0, TRI_YES = 1, diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index cd09c984ccdc..240f337c14d8 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -360,6 +360,7 @@ LIBBPF_0.4.0 { bpf_linker__free; bpf_linker__new; bpf_map__inner_map; + bpf_object__gen_loader; bpf_object__set_kversion; bpf_tc_attach; bpf_tc_detach; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index acbcf6c7bdf8..a2cc297edb99 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -263,6 +263,8 @@ int bpf_object__section_size(const struct bpf_object *obj, const char *name, int bpf_object__variable_offset(const struct bpf_object *obj, const char *name, __u32 *off); struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf); +void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type, + const char **prefix, int *kind);
struct btf_ext_info { /* diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h new file mode 100644 index 000000000000..12a126b452c1 --- /dev/null +++ b/tools/lib/bpf/skel_internal.h @@ -0,0 +1,123 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +/* Copyright (c) 2021 Facebook */ +#ifndef __SKEL_INTERNAL_H +#define __SKEL_INTERNAL_H + +#include <unistd.h> +#include <sys/syscall.h> +#include <sys/mman.h> + +/* This file is a base header for auto-generated *.lskel.h files. + * Its contents will change and may become part of auto-generation in the future. + * + * The layout of bpf_[map|prog]_desc and bpf_loader_ctx is feature dependent + * and will change from one version of libbpf to another and features + * requested during loader program generation. + */ +struct bpf_map_desc { + union { + /* input for the loader prog */ + struct { + __aligned_u64 initial_value; + __u32 max_entries; + }; + /* output of the loader prog */ + struct { + int map_fd; + }; + }; +}; +struct bpf_prog_desc { + int prog_fd; +}; + +struct bpf_loader_ctx { + size_t sz; + __u32 log_level; + __u32 log_size; + __u64 log_buf; +}; + +struct bpf_load_and_run_opts { + struct bpf_loader_ctx *ctx; + const void *data; + const void *insns; + __u32 data_sz; + __u32 insns_sz; + const char *errstr; +}; + +static inline int skel_sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, + unsigned int size) +{ + return syscall(__NR_bpf, cmd, attr, size); +} + +static inline int skel_closenz(int fd) +{ + if (fd > 0) + return close(fd); + return -EINVAL; +} + +static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts) +{ + int map_fd = -1, prog_fd = -1, key = 0, err; + union bpf_attr attr; + + map_fd = bpf_create_map_name(BPF_MAP_TYPE_ARRAY, "__loader.map", 4, + opts->data_sz, 1, 0); + if (map_fd < 0) { + opts->errstr = "failed to create loader map"; + err = -errno; + goto out; + } + + err = bpf_map_update_elem(map_fd, &key, opts->data, 0); + if (err < 0) { + opts->errstr = "failed to update loader map"; + err = -errno; + goto out; + } + + memset(&attr, 0, sizeof(attr)); + attr.prog_type = BPF_PROG_TYPE_SYSCALL; + attr.insns = (long) opts->insns; + attr.insn_cnt = opts->insns_sz / sizeof(struct bpf_insn); + attr.license = (long) "Dual BSD/GPL"; + memcpy(attr.prog_name, "__loader.prog", sizeof("__loader.prog")); + attr.fd_array = (long) &map_fd; + attr.log_level = opts->ctx->log_level; + attr.log_size = opts->ctx->log_size; + attr.log_buf = opts->ctx->log_buf; + attr.prog_flags = BPF_F_SLEEPABLE; + prog_fd = skel_sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr)); + if (prog_fd < 0) { + opts->errstr = "failed to load loader prog"; + err = -errno; + goto out; + } + + memset(&attr, 0, sizeof(attr)); + attr.test.prog_fd = prog_fd; + attr.test.ctx_in = (long) opts->ctx; + attr.test.ctx_size_in = opts->ctx->sz; + err = skel_sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); + if (err < 0 || (int)attr.test.retval < 0) { + opts->errstr = "failed to execute loader prog"; + if (err < 0) + err = -errno; + else + err = (int)attr.test.retval; + goto out; + } + err = 0; +out: + if (map_fd >= 0) + close(map_fd); + if (prog_fd >= 0) + close(prog_fd); + return err; +} + +#endif
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 30f51aedabda92b74927979b2b3b50169e285f6b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Fix loader program to close temporary FDs when intermediate sys_bpf command fails.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-16-alexei.starovoitov@gmail... (cherry picked from commit 30f51aedabda92b74927979b2b3b50169e285f6b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_gen_internal.h | 1 + tools/lib/bpf/gen_loader.c | 48 +++++++++++++++++++++++++++++--- 2 files changed, 45 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h index f42a55efd559..615400391e57 100644 --- a/tools/lib/bpf/bpf_gen_internal.h +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -15,6 +15,7 @@ struct bpf_gen { void *data_cur; void *insn_start; void *insn_cur; + ssize_t cleanup_label; __u32 nr_progs; __u32 nr_maps; int log_level; diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 0fc54b1ca311..8df718a6b142 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -101,8 +101,36 @@ static void emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn in
void bpf_gen__init(struct bpf_gen *gen, int log_level) { + size_t stack_sz = sizeof(struct loader_stack); + int i; + gen->log_level = log_level; + /* save ctx pointer into R6 */ emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1)); + + /* bzero stack */ + emit(gen, BPF_MOV64_REG(BPF_REG_1, BPF_REG_10)); + emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -stack_sz)); + emit(gen, BPF_MOV64_IMM(BPF_REG_2, stack_sz)); + emit(gen, BPF_MOV64_IMM(BPF_REG_3, 0)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel)); + + /* jump over cleanup code */ + emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, + /* size of cleanup code below */ + (stack_sz / 4) * 3 + 2)); + + /* remember the label where all error branches will jump to */ + gen->cleanup_label = gen->insn_cur - gen->insn_start; + /* emit cleanup code: close all temp FDs */ + for (i = 0; i < stack_sz; i += 4) { + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -stack_sz + i)); + emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 1)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close)); + } + /* R7 contains the error code from sys_bpf. Copy it into R0 and exit. */ + emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7)); + emit(gen, BPF_EXIT_INSN()); }
static int add_data(struct bpf_gen *gen, const void *data, __u32 size) @@ -187,12 +215,24 @@ static void emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size) emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0)); }
+static bool is_simm16(__s64 value) +{ + return value == (__s64)(__s16)value; +} + static void emit_check_err(struct bpf_gen *gen) { - emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2)); - emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7)); - /* TODO: close intermediate FDs in case of error */ - emit(gen, BPF_EXIT_INSN()); + __s64 off = -(gen->insn_cur - gen->insn_start - gen->cleanup_label) / 8 - 1; + + /* R7 contains result of last sys_bpf command. + * if (R7 < 0) goto cleanup; + */ + if (is_simm16(off)) { + emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, off)); + } else { + gen->error = -ERANGE; + emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, -1)); + } }
/* reg1 and reg2 should not be R1 - R5. They can be R0, R6 - R10 */
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 7723256bf2443d6bd7db3e583953d14107955233 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Introduce bpf_map__initial_value() to read initial contents of mmaped data/rodata/bss maps. Note that bpf_map__set_initial_value() doesn't allow modifying kconfig map while bpf_map__initial_value() allows reading its values.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-17-alexei.starovoitov@gmail... (cherry picked from commit 7723256bf2443d6bd7db3e583953d14107955233) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 8 ++++++++ tools/lib/bpf/libbpf.h | 1 + tools/lib/bpf/libbpf.map | 1 + 3 files changed, 10 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a7ae242f2b4e..dbc51f4af43d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -9828,6 +9828,14 @@ int bpf_map__set_initial_value(struct bpf_map *map, return 0; }
+const void *bpf_map__initial_value(struct bpf_map *map, size_t *psize) +{ + if (!map->mmaped) + return NULL; + *psize = map->def.value_size; + return map->mmaped; +} + bool bpf_map__is_offload_neutral(const struct bpf_map *map) { return map->def.type == BPF_MAP_TYPE_PERF_EVENT_ARRAY; diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 3137292b5a6d..2fee8d11cdd8 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -475,6 +475,7 @@ LIBBPF_API int bpf_map__set_priv(struct bpf_map *map, void *priv, LIBBPF_API void *bpf_map__priv(const struct bpf_map *map); LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map, const void *data, size_t size); +LIBBPF_API const void *bpf_map__initial_value(struct bpf_map *map, size_t *psize); LIBBPF_API bool bpf_map__is_offload_neutral(const struct bpf_map *map); LIBBPF_API bool bpf_map__is_internal(const struct bpf_map *map); LIBBPF_API int bpf_map__set_pin_path(struct bpf_map *map, const char *path); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 240f337c14d8..5355f67e97ff 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -359,6 +359,7 @@ LIBBPF_0.4.0 { bpf_linker__finalize; bpf_linker__free; bpf_linker__new; + bpf_map__initial_value; bpf_map__inner_map; bpf_object__gen_loader; bpf_object__set_kversion;
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit d510296d331accd4afaa13498220c93ae690628a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add -L flag to bpftool to use libbpf gen_trace facility and syscall/loader program for skeleton generation and program loading.
"bpftool gen skeleton -L" command will generate a "light skeleton" or "loader skeleton" that is similar to existing skeleton, but has one major difference: $ bpftool gen skeleton lsm.o > lsm.skel.h $ bpftool gen skeleton -L lsm.o > lsm.lskel.h $ diff lsm.skel.h lsm.lskel.h @@ -5,34 +4,34 @@ #define __LSM_SKEL_H__
#include <stdlib.h> -#include <bpf/libbpf.h> +#include <bpf/bpf.h>
The light skeleton does not use majority of libbpf infrastructure. It doesn't need libelf. It doesn't parse .o file. It only needs few sys_bpf wrappers. All of them are in bpf/bpf.h file. In future libbpf/bpf.c can be inlined into bpf.h, so not even libbpf.a would be needed to work with light skeleton.
"bpftool prog load -L file.o" command is introduced for debugging of syscall/loader program generation. Just like the same command without -L it will try to load the programs from file.o into the kernel. It won't even try to pin them.
"bpftool prog load -L -d file.o" command will provide additional debug messages on how syscall/loader program was generated. Also the execution of syscall/loader program will use bpf_trace_printk() for each step of loading BTF, creating maps, and loading programs. The user can do "cat /.../trace_pipe" for further debug.
An example of fexit_sleep.lskel.h generated from progs/fexit_sleep.c: struct fexit_sleep { struct bpf_loader_ctx ctx; struct { struct bpf_map_desc bss; } maps; struct { struct bpf_prog_desc nanosleep_fentry; struct bpf_prog_desc nanosleep_fexit; } progs; struct { int nanosleep_fentry_fd; int nanosleep_fexit_fd; } links; struct fexit_sleep__bss { int pid; int fentry_cnt; int fexit_cnt; } *bss; };
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210514003623.28033-18-alexei.starovoitov@gmail... (cherry picked from commit d510296d331accd4afaa13498220c93ae690628a) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/bpf/bpftool/xlated_dumper.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 2 +- tools/bpf/bpftool/gen.c | 386 ++++++++++++++++++++++++++++-- tools/bpf/bpftool/main.c | 7 +- tools/bpf/bpftool/main.h | 1 + tools/bpf/bpftool/prog.c | 107 ++++++++- tools/bpf/bpftool/xlated_dumper.c | 3 + 6 files changed, 482 insertions(+), 24 deletions(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index f45bc3c7fd07..c7d13e87b4ed 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -135,7 +135,7 @@ endif
BPFTOOL_BOOTSTRAP := $(BOOTSTRAP_OUTPUT)bpftool
-BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o) +BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o) $(OUTPUT)disasm.o OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 27dceaf66ecb..13b0aa789178 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -18,6 +18,7 @@ #include <sys/stat.h> #include <sys/mman.h> #include <bpf/btf.h> +#include <bpf/bpf_gen_internal.h>
#include "json_writer.h" #include "main.h" @@ -274,6 +275,327 @@ static void codegen(const char *template, ...) free(s); }
+static void print_hex(const char *data, int data_sz) +{ + int i, len; + + for (i = 0, len = 0; i < data_sz; i++) { + int w = data[i] ? 4 : 2; + + len += w; + if (len > 78) { + printf("\\n"); + len = w; + } + if (!data[i]) + printf("\0"); + else + printf("\x%02x", (unsigned char)data[i]); + } +} + +static size_t bpf_map_mmap_sz(const struct bpf_map *map) +{ + long page_sz = sysconf(_SC_PAGE_SIZE); + size_t map_sz; + + map_sz = (size_t)roundup(bpf_map__value_size(map), 8) * bpf_map__max_entries(map); + map_sz = roundup(map_sz, page_sz); + return map_sz; +} + +static void codegen_attach_detach(struct bpf_object *obj, const char *obj_name) +{ + struct bpf_program *prog; + + bpf_object__for_each_program(prog, obj) { + const char *tp_name; + + codegen("\ + \n\ + \n\ + static inline int \n\ + %1$s__%2$s__attach(struct %1$s *skel) \n\ + { \n\ + int prog_fd = skel->progs.%2$s.prog_fd; \n\ + ", obj_name, bpf_program__name(prog)); + + switch (bpf_program__get_type(prog)) { + case BPF_PROG_TYPE_RAW_TRACEPOINT: + tp_name = strchr(bpf_program__section_name(prog), '/') + 1; + printf("\tint fd = bpf_raw_tracepoint_open("%s", prog_fd);\n", tp_name); + break; + case BPF_PROG_TYPE_TRACING: + printf("\tint fd = bpf_raw_tracepoint_open(NULL, prog_fd);\n"); + break; + default: + printf("\tint fd = ((void)prog_fd, 0); /* auto-attach not supported */\n"); + break; + } + codegen("\ + \n\ + \n\ + if (fd > 0) \n\ + skel->links.%1$s_fd = fd; \n\ + return fd; \n\ + } \n\ + ", bpf_program__name(prog)); + } + + codegen("\ + \n\ + \n\ + static inline int \n\ + %1$s__attach(struct %1$s *skel) \n\ + { \n\ + int ret = 0; \n\ + \n\ + ", obj_name); + + bpf_object__for_each_program(prog, obj) { + codegen("\ + \n\ + ret = ret < 0 ? ret : %1$s__%2$s__attach(skel); \n\ + ", obj_name, bpf_program__name(prog)); + } + + codegen("\ + \n\ + return ret < 0 ? ret : 0; \n\ + } \n\ + \n\ + static inline void \n\ + %1$s__detach(struct %1$s *skel) \n\ + { \n\ + ", obj_name); + + bpf_object__for_each_program(prog, obj) { + codegen("\ + \n\ + skel_closenz(skel->links.%1$s_fd); \n\ + ", bpf_program__name(prog)); + } + + codegen("\ + \n\ + } \n\ + "); +} + +static void codegen_destroy(struct bpf_object *obj, const char *obj_name) +{ + struct bpf_program *prog; + struct bpf_map *map; + + codegen("\ + \n\ + static void \n\ + %1$s__destroy(struct %1$s *skel) \n\ + { \n\ + if (!skel) \n\ + return; \n\ + %1$s__detach(skel); \n\ + ", + obj_name); + + bpf_object__for_each_program(prog, obj) { + codegen("\ + \n\ + skel_closenz(skel->progs.%1$s.prog_fd); \n\ + ", bpf_program__name(prog)); + } + + bpf_object__for_each_map(map, obj) { + const char * ident; + + ident = get_map_ident(map); + if (!ident) + continue; + if (bpf_map__is_internal(map) && + (bpf_map__def(map)->map_flags & BPF_F_MMAPABLE)) + printf("\tmunmap(skel->%1$s, %2$zd);\n", + ident, bpf_map_mmap_sz(map)); + codegen("\ + \n\ + skel_closenz(skel->maps.%1$s.map_fd); \n\ + ", ident); + } + codegen("\ + \n\ + free(skel); \n\ + } \n\ + ", + obj_name); +} + +static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *header_guard) +{ + struct bpf_object_load_attr load_attr = {}; + DECLARE_LIBBPF_OPTS(gen_loader_opts, opts); + struct bpf_map *map; + int err = 0; + + err = bpf_object__gen_loader(obj, &opts); + if (err) + return err; + + load_attr.obj = obj; + if (verifier_logs) + /* log_level1 + log_level2 + stats, but not stable UAPI */ + load_attr.log_level = 1 + 2 + 4; + + err = bpf_object__load_xattr(&load_attr); + if (err) { + p_err("failed to load object file"); + goto out; + } + /* If there was no error during load then gen_loader_opts + * are populated with the loader program. + */ + + /* finish generating 'struct skel' */ + codegen("\ + \n\ + }; \n\ + ", obj_name); + + + codegen_attach_detach(obj, obj_name); + + codegen_destroy(obj, obj_name); + + codegen("\ + \n\ + static inline struct %1$s * \n\ + %1$s__open(void) \n\ + { \n\ + struct %1$s *skel; \n\ + \n\ + skel = calloc(sizeof(*skel), 1); \n\ + if (!skel) \n\ + goto cleanup; \n\ + skel->ctx.sz = (void *)&skel->links - (void *)skel; \n\ + ", + obj_name, opts.data_sz); + bpf_object__for_each_map(map, obj) { + const char *ident; + const void *mmap_data = NULL; + size_t mmap_size = 0; + + ident = get_map_ident(map); + if (!ident) + continue; + + if (!bpf_map__is_internal(map) || + !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE)) + continue; + + codegen("\ + \n\ + skel->%1$s = \n\ + mmap(NULL, %2$zd, PROT_READ | PROT_WRITE,\n\ + MAP_SHARED | MAP_ANONYMOUS, -1, 0); \n\ + if (skel->%1$s == (void *) -1) \n\ + goto cleanup; \n\ + memcpy(skel->%1$s, (void *)"\ \n\ + ", ident, bpf_map_mmap_sz(map)); + mmap_data = bpf_map__initial_value(map, &mmap_size); + print_hex(mmap_data, mmap_size); + printf("", %2$zd);\n" + "\tskel->maps.%1$s.initial_value = (__u64)(long)skel->%1$s;\n", + ident, mmap_size); + } + codegen("\ + \n\ + return skel; \n\ + cleanup: \n\ + %1$s__destroy(skel); \n\ + return NULL; \n\ + } \n\ + \n\ + static inline int \n\ + %1$s__load(struct %1$s *skel) \n\ + { \n\ + struct bpf_load_and_run_opts opts = {}; \n\ + int err; \n\ + \n\ + opts.ctx = (struct bpf_loader_ctx *)skel; \n\ + opts.data_sz = %2$d; \n\ + opts.data = (void *)"\ \n\ + ", + obj_name, opts.data_sz); + print_hex(opts.data, opts.data_sz); + codegen("\ + \n\ + "; \n\ + "); + + codegen("\ + \n\ + opts.insns_sz = %d; \n\ + opts.insns = (void *)"\ \n\ + ", + opts.insns_sz); + print_hex(opts.insns, opts.insns_sz); + codegen("\ + \n\ + "; \n\ + err = bpf_load_and_run(&opts); \n\ + if (err < 0) \n\ + return err; \n\ + ", obj_name); + bpf_object__for_each_map(map, obj) { + const char *ident, *mmap_flags; + + ident = get_map_ident(map); + if (!ident) + continue; + + if (!bpf_map__is_internal(map) || + !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE)) + continue; + if (bpf_map__def(map)->map_flags & BPF_F_RDONLY_PROG) + mmap_flags = "PROT_READ"; + else + mmap_flags = "PROT_READ | PROT_WRITE"; + + printf("\tskel->%1$s =\n" + "\t\tmmap(skel->%1$s, %2$zd, %3$s, MAP_SHARED | MAP_FIXED,\n" + "\t\t\tskel->maps.%1$s.map_fd, 0);\n", + ident, bpf_map_mmap_sz(map), mmap_flags); + } + codegen("\ + \n\ + return 0; \n\ + } \n\ + \n\ + static inline struct %1$s * \n\ + %1$s__open_and_load(void) \n\ + { \n\ + struct %1$s *skel; \n\ + \n\ + skel = %1$s__open(); \n\ + if (!skel) \n\ + return NULL; \n\ + if (%1$s__load(skel)) { \n\ + %1$s__destroy(skel); \n\ + return NULL; \n\ + } \n\ + return skel; \n\ + } \n\ + ", obj_name); + + codegen("\ + \n\ + \n\ + #endif /* %s */ \n\ + ", + header_guard); + err = 0; +out: + return err; +} + static int do_skeleton(int argc, char **argv) { char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")]; @@ -283,7 +605,7 @@ static int do_skeleton(int argc, char **argv) struct bpf_object *obj = NULL; const char *file, *ident; struct bpf_program *prog; - int fd, len, err = -1; + int fd, err = -1; struct bpf_map *map; struct btf *btf; struct stat st; @@ -365,7 +687,25 @@ static int do_skeleton(int argc, char **argv) }
get_header_guard(header_guard, obj_name); - codegen("\ + if (use_loader) { + codegen("\ + \n\ + /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ \n\ + /* THIS FILE IS AUTOGENERATED! */ \n\ + #ifndef %2$s \n\ + #define %2$s \n\ + \n\ + #include <stdlib.h> \n\ + #include <bpf/bpf.h> \n\ + #include <bpf/skel_internal.h> \n\ + \n\ + struct %1$s { \n\ + struct bpf_loader_ctx ctx; \n\ + ", + obj_name, header_guard + ); + } else { + codegen("\ \n\ /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ \n\ \n\ @@ -381,7 +721,8 @@ static int do_skeleton(int argc, char **argv) struct bpf_object *obj; \n\ ", obj_name, header_guard - ); + ); + }
if (map_cnt) { printf("\tstruct {\n"); @@ -389,7 +730,10 @@ static int do_skeleton(int argc, char **argv) ident = get_map_ident(map); if (!ident) continue; - printf("\t\tstruct bpf_map *%s;\n", ident); + if (use_loader) + printf("\t\tstruct bpf_map_desc %s;\n", ident); + else + printf("\t\tstruct bpf_map *%s;\n", ident); } printf("\t} maps;\n"); } @@ -397,14 +741,22 @@ static int do_skeleton(int argc, char **argv) if (prog_cnt) { printf("\tstruct {\n"); bpf_object__for_each_program(prog, obj) { - printf("\t\tstruct bpf_program *%s;\n", - bpf_program__name(prog)); + if (use_loader) + printf("\t\tstruct bpf_prog_desc %s;\n", + bpf_program__name(prog)); + else + printf("\t\tstruct bpf_program *%s;\n", + bpf_program__name(prog)); } printf("\t} progs;\n"); printf("\tstruct {\n"); bpf_object__for_each_program(prog, obj) { - printf("\t\tstruct bpf_link *%s;\n", - bpf_program__name(prog)); + if (use_loader) + printf("\t\tint %s_fd;\n", + bpf_program__name(prog)); + else + printf("\t\tstruct bpf_link *%s;\n", + bpf_program__name(prog)); } printf("\t} links;\n"); } @@ -415,6 +767,10 @@ static int do_skeleton(int argc, char **argv) if (err) goto out; } + if (use_loader) { + err = gen_trace(obj, obj_name, header_guard); + goto out; + }
codegen("\ \n\ @@ -584,19 +940,7 @@ static int do_skeleton(int argc, char **argv) file_sz);
/* embed contents of BPF object file */ - for (i = 0, len = 0; i < file_sz; i++) { - int w = obj_data[i] ? 4 : 2; - - len += w; - if (len > 78) { - printf("\\n"); - len = w; - } - if (!obj_data[i]) - printf("\0"); - else - printf("\x%02x", (unsigned char)obj_data[i]); - } + print_hex(obj_data, file_sz);
codegen("\ \n\ diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c index edc889acd1fe..5e6c4da2d53c 100644 --- a/tools/bpf/bpftool/main.c +++ b/tools/bpf/bpftool/main.c @@ -29,6 +29,7 @@ bool show_pinned; bool block_mount; bool verifier_logs; bool relaxed_maps; +bool use_loader; struct btf *base_btf; struct pinned_obj_table prog_table; struct pinned_obj_table map_table; @@ -394,6 +395,7 @@ int main(int argc, char **argv) { "mapcompat", no_argument, NULL, 'm' }, { "nomount", no_argument, NULL, 'n' }, { "debug", no_argument, NULL, 'd' }, + { "use-loader", no_argument, NULL, 'L' }, { "base-btf", required_argument, NULL, 'B' }, { 0 } }; @@ -413,7 +415,7 @@ int main(int argc, char **argv) hash_init(link_table.table);
opterr = 0; - while ((opt = getopt_long(argc, argv, "VhpjfmndB:", + while ((opt = getopt_long(argc, argv, "VhpjfLmndB:", options, NULL)) >= 0) { switch (opt) { case 'V': @@ -456,6 +458,9 @@ int main(int argc, char **argv) return -1; } break; + case 'L': + use_loader = true; + break; default: p_err("unrecognized option '%s'", argv[optind - 1]); if (json_output) diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h index 76e91641262b..c1cf29798b99 100644 --- a/tools/bpf/bpftool/main.h +++ b/tools/bpf/bpftool/main.h @@ -90,6 +90,7 @@ extern bool show_pids; extern bool block_mount; extern bool verifier_logs; extern bool relaxed_maps; +extern bool use_loader; extern struct btf *base_btf; extern struct pinned_obj_table prog_table; extern struct pinned_obj_table map_table; diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 88ee5ebabb47..4187b5c9397a 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -16,6 +16,7 @@ #include <sys/types.h> #include <sys/stat.h> #include <sys/syscall.h> +#include <dirent.h>
#include <linux/err.h> #include <linux/perf_event.h> @@ -24,6 +25,8 @@ #include <bpf/bpf.h> #include <bpf/btf.h> #include <bpf/libbpf.h> +#include <bpf/bpf_gen_internal.h> +#include <bpf/skel_internal.h>
#include "cfg.h" #include "main.h" @@ -1501,7 +1504,7 @@ static int load_with_options(int argc, char **argv, bool first_prog_only) set_max_rlimit();
obj = bpf_object__open_file(file, &open_opts); - if (IS_ERR_OR_NULL(obj)) { + if (libbpf_get_error(obj)) { p_err("failed to open object file"); goto err_free_reuse_maps; } @@ -1647,8 +1650,110 @@ static int load_with_options(int argc, char **argv, bool first_prog_only) return -1; }
+static int count_open_fds(void) +{ + DIR *dp = opendir("/proc/self/fd"); + struct dirent *de; + int cnt = -3; + + if (!dp) + return -1; + + while ((de = readdir(dp))) + cnt++; + + closedir(dp); + return cnt; +} + +static int try_loader(struct gen_loader_opts *gen) +{ + struct bpf_load_and_run_opts opts = {}; + struct bpf_loader_ctx *ctx; + int ctx_sz = sizeof(*ctx) + 64 * max(sizeof(struct bpf_map_desc), + sizeof(struct bpf_prog_desc)); + int log_buf_sz = (1u << 24) - 1; + int err, fds_before, fd_delta; + char *log_buf; + + ctx = alloca(ctx_sz); + memset(ctx, 0, ctx_sz); + ctx->sz = ctx_sz; + ctx->log_level = 1; + ctx->log_size = log_buf_sz; + log_buf = malloc(log_buf_sz); + if (!log_buf) + return -ENOMEM; + ctx->log_buf = (long) log_buf; + opts.ctx = ctx; + opts.data = gen->data; + opts.data_sz = gen->data_sz; + opts.insns = gen->insns; + opts.insns_sz = gen->insns_sz; + fds_before = count_open_fds(); + err = bpf_load_and_run(&opts); + fd_delta = count_open_fds() - fds_before; + if (err < 0) { + fprintf(stderr, "err %d\n%s\n%s", err, opts.errstr, log_buf); + if (fd_delta) + fprintf(stderr, "loader prog leaked %d FDs\n", + fd_delta); + } + free(log_buf); + return err; +} + +static int do_loader(int argc, char **argv) +{ + DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts); + DECLARE_LIBBPF_OPTS(gen_loader_opts, gen); + struct bpf_object_load_attr load_attr = {}; + struct bpf_object *obj; + const char *file; + int err = 0; + + if (!REQ_ARGS(1)) + return -1; + file = GET_ARG(); + + obj = bpf_object__open_file(file, &open_opts); + if (libbpf_get_error(obj)) { + p_err("failed to open object file"); + goto err_close_obj; + } + + err = bpf_object__gen_loader(obj, &gen); + if (err) + goto err_close_obj; + + load_attr.obj = obj; + if (verifier_logs) + /* log_level1 + log_level2 + stats, but not stable UAPI */ + load_attr.log_level = 1 + 2 + 4; + + err = bpf_object__load_xattr(&load_attr); + if (err) { + p_err("failed to load object file"); + goto err_close_obj; + } + + if (verifier_logs) { + struct dump_data dd = {}; + + kernel_syms_load(&dd); + dump_xlated_plain(&dd, (void *)gen.insns, gen.insns_sz, false, false); + kernel_syms_destroy(&dd); + } + err = try_loader(&gen); +err_close_obj: + bpf_object__close(obj); + return err; +} + static int do_load(int argc, char **argv) { + if (use_loader) + return do_loader(argc, argv); return load_with_options(argc, argv, true); }
diff --git a/tools/bpf/bpftool/xlated_dumper.c b/tools/bpf/bpftool/xlated_dumper.c index 8608cd68cdd0..acd63a520464 100644 --- a/tools/bpf/bpftool/xlated_dumper.c +++ b/tools/bpf/bpftool/xlated_dumper.c @@ -196,6 +196,9 @@ static const char *print_imm(void *private_data, else if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), "map[id:%u][0]+%u", insn->imm, (insn + 1)->imm); + else if (insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) + snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), + "map[idx:%u]+%u", insn->imm, (insn + 1)->imm); else snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), "0x%llx", (unsigned long long)full_imm);
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 5d67f349590ddc94b6d4e25f19085728db9de697 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BPF_PROG_RUN command as an alias to BPF_RPOG_TEST_RUN to better indicate the full range of use cases done by the command.
Suggested-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20210519014032.20908-1-alexei.starovoitov@gmail.... (cherry picked from commit 5d67f349590ddc94b6d4e25f19085728db9de697) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 1 + tools/include/uapi/linux/bpf.h | 1 + tools/lib/bpf/skel_internal.h | 2 +- 3 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d8fc4d8c7bc5..b1bc06aea64a 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -106,6 +106,7 @@ enum bpf_cmd { BPF_PROG_ATTACH, BPF_PROG_DETACH, BPF_PROG_TEST_RUN, + BPF_PROG_RUN = BPF_PROG_TEST_RUN, BPF_PROG_GET_NEXT_ID, BPF_MAP_GET_NEXT_ID, BPF_PROG_GET_FD_BY_ID, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index dde6a6042d32..aef4b483bea2 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -816,6 +816,7 @@ enum bpf_cmd { BPF_PROG_ATTACH, BPF_PROG_DETACH, BPF_PROG_TEST_RUN, + BPF_PROG_RUN = BPF_PROG_TEST_RUN, BPF_PROG_GET_NEXT_ID, BPF_MAP_GET_NEXT_ID, BPF_PROG_GET_FD_BY_ID, diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h index 12a126b452c1..b22b50c1b173 100644 --- a/tools/lib/bpf/skel_internal.h +++ b/tools/lib/bpf/skel_internal.h @@ -102,7 +102,7 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts) attr.test.prog_fd = prog_fd; attr.test.ctx_in = (long) opts->ctx; attr.test.ctx_size_in = opts->ctx->sz; - err = skel_sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); + err = skel_sys_bpf(BPF_PROG_RUN, &attr, sizeof(attr)); if (err < 0 || (int)attr.test.retval < 0) { opts->errstr = "failed to execute loader prog"; if (err < 0)
From: Stanislav Fomichev sdf@google.com
mainline inclusion from mainline-5.14-rc1 commit f9bceaa59c5c47a8a08f48e19cbe887e500a1978 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
I'm getting the following error when running 'gen skeleton -L' as regular user:
libbpf: Error in bpf_object__probe_loading():Operation not permitted(1). Couldn't load trivial BPF program. Make sure your kernel supports BPF (CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough value.
Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Stanislav Fomichev sdf@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210521030653.2626513-1-sdf@google.com (cherry picked from commit f9bceaa59c5c47a8a08f48e19cbe887e500a1978) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index dbc51f4af43d..2b01e60f4fc5 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4013,6 +4013,9 @@ bpf_object__probe_loading(struct bpf_object *obj) }; int ret;
+ if (obj->gen_loader) + return 0; + /* make sure basic loading works */
memset(&attr, 0, sizeof(attr));
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.14-rc1 commit 9f0c317f6aa12b160103ee3946d79276c14b95e2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
LLVM patch https://reviews.llvm.org/D102712 narrowed the scope of existing R_BPF_64_64 and R_BPF_64_32 relocations, and added three new relocations, R_BPF_64_ABS64, R_BPF_64_ABS32 and R_BPF_64_NODYLD32. The main motivation is to make relocations linker friendly.
This change, unfortunately, breaks libbpf build, and we will see errors like below: libbpf: ELF relo #0 in section #6 has unexpected type 2 in /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o Error: failed to link '/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o': Unknown error -22 (-22) The new relocation R_BPF_64_ABS64 is generated and libbpf linker sanity check doesn't understand it. Relocation section '.rel.struct_ops' at offset 0x1410 contains 1 entries: Offset Info Type Symbol's Value Symbol's Name 0000000000000018 0000000700000002 R_BPF_64_ABS64 0000000000000000 nogpltcp_init
Look at the selftests/bpf/bpf_tcp_nogpl.c, void BPF_STRUCT_OPS(nogpltcp_init, struct sock *sk) { }
SEC(".struct_ops") struct tcp_congestion_ops bpf_nogpltcp = { .init = (void *)nogpltcp_init, .name = "bpf_nogpltcp", }; The new llvm relocation scheme categorizes 'nogpltcp_init' reference as R_BPF_64_ABS64 instead of R_BPF_64_64 which is used to specify ld_imm64 relocation in the new scheme.
Let us fix the linker sanity checking by including R_BPF_64_ABS64 and R_BPF_64_ABS32. There is no need to check R_BPF_64_NODYLD32 which is used for .BTF and .BTF.ext.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210522162341.3687617-1-yhs@fb.com (cherry picked from commit 9f0c317f6aa12b160103ee3946d79276c14b95e2) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf_internal.h | 6 ++++++ tools/lib/bpf/linker.c | 3 ++- 2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index a2cc297edb99..777b17b230f8 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -28,6 +28,12 @@ #ifndef R_BPF_64_64 #define R_BPF_64_64 1 #endif +#ifndef R_BPF_64_ABS64 +#define R_BPF_64_ABS64 2 +#endif +#ifndef R_BPF_64_ABS32 +#define R_BPF_64_ABS32 3 +#endif #ifndef R_BPF_64_32 #define R_BPF_64_32 10 #endif diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 488029cb4e42..bd11c429cd5b 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -892,7 +892,8 @@ static int linker_sanity_check_elf_relos(struct src_obj *obj, struct src_sec *se size_t sym_idx = ELF64_R_SYM(relo->r_info); size_t sym_type = ELF64_R_TYPE(relo->r_info);
- if (sym_type != R_BPF_64_64 && sym_type != R_BPF_64_32) { + if (sym_type != R_BPF_64_64 && sym_type != R_BPF_64_32 && + sym_type != R_BPF_64_ABS64 && sym_type != R_BPF_64_ABS32) { pr_warn("ELF relo #%d in section #%zu has unexpected type %zu in %s\n", i, sec->sec_idx, sym_type, obj->filename); return -EINVAL;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 5981881d21dff612abf8fce484f8efa67f49aae4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add libbpf_set_strict_mode() API that allows application to simulate libbpf 1.0 breaking changes before libbpf 1.0 is released. This will help users migrate gradually and with confidence.
For now only ALL or NONE options are available, subsequent patches will add more flags. This patch is preliminary for selftests/bpf changes.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210525035935.1461796-2-andrii@kernel.org (cherry picked from commit 5981881d21dff612abf8fce484f8efa67f49aae4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Makefile | 1 + tools/lib/bpf/libbpf.c | 17 +++++++++++++ tools/lib/bpf/libbpf.h | 1 + tools/lib/bpf/libbpf.map | 5 ++++ tools/lib/bpf/libbpf_legacy.h | 47 +++++++++++++++++++++++++++++++++++ 5 files changed, 71 insertions(+) create mode 100644 tools/lib/bpf/libbpf_legacy.h
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 76cf5ab8d00c..914c0afff255 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -230,6 +230,7 @@ install_headers: $(BPF_HELPER_DEFS) $(call do_install,btf.h,$(prefix)/include/bpf,644); \ $(call do_install,libbpf_util.h,$(prefix)/include/bpf,644); \ $(call do_install,libbpf_common.h,$(prefix)/include/bpf,644); \ + $(call do_install,libbpf_legacy.h,$(prefix)/include/bpf,644); \ $(call do_install,xsk.h,$(prefix)/include/bpf,644); \ $(call do_install,bpf_helpers.h,$(prefix)/include/bpf,644); \ $(call do_install,$(BPF_HELPER_DEFS),$(prefix)/include/bpf,644); \ diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2b01e60f4fc5..416fcb161ba8 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -151,6 +151,23 @@ static inline __u64 ptr_to_u64(const void *ptr) return (__u64) (unsigned long) ptr; }
+/* this goes away in libbpf 1.0 */ +enum libbpf_strict_mode libbpf_mode = LIBBPF_STRICT_NONE; + +int libbpf_set_strict_mode(enum libbpf_strict_mode mode) +{ + /* __LIBBPF_STRICT_LAST is the last power-of-2 value used + 1, so to + * get all possible values we compensate last +1, and then (2*x - 1) + * to get the bit mask + */ + if (mode != LIBBPF_STRICT_ALL + && (mode & ~((__LIBBPF_STRICT_LAST - 1) * 2 - 1))) + return errno = EINVAL, -EINVAL; + + libbpf_mode = mode; + return 0; +} + enum kern_feature_id { /* v4.14: kernel support for program & map names. */ FEAT_PROG_NAME, diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 2fee8d11cdd8..e3a6728e868a 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -18,6 +18,7 @@ #include <linux/bpf.h>
#include "libbpf_common.h" +#include "libbpf_legacy.h"
#ifdef __cplusplus extern "C" { diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 5355f67e97ff..6266317ef3a2 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -369,3 +369,8 @@ LIBBPF_0.4.0 { bpf_tc_hook_destroy; bpf_tc_query; } LIBBPF_0.3.0; + +LIBBPF_0.5.0 { + global: + libbpf_set_strict_mode; +} LIBBPF_0.4.0; diff --git a/tools/lib/bpf/libbpf_legacy.h b/tools/lib/bpf/libbpf_legacy.h new file mode 100644 index 000000000000..7482cfe22ab2 --- /dev/null +++ b/tools/lib/bpf/libbpf_legacy.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ + +/* + * Libbpf legacy APIs (either discouraged or deprecated, as mentioned in [0]) + * + * [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54n... + * + * Copyright (C) 2021 Facebook + */ +#ifndef __LIBBPF_LEGACY_BPF_H +#define __LIBBPF_LEGACY_BPF_H + +#include <linux/bpf.h> +#include <stdbool.h> +#include <stddef.h> +#include <stdint.h> +#include "libbpf_common.h" + +#ifdef __cplusplus +extern "C" { +#endif + +enum libbpf_strict_mode { + /* Turn on all supported strict features of libbpf to simulate libbpf + * v1.0 behavior. + * This will be the default behavior in libbpf v1.0. + */ + LIBBPF_STRICT_ALL = 0xffffffff, + + /* + * Disable any libbpf 1.0 behaviors. This is the default before libbpf + * v1.0. It won't be supported anymore in v1.0, please update your + * code so that it handles LIBBPF_STRICT_ALL mode before libbpf v1.0. + */ + LIBBPF_STRICT_NONE = 0x00, + + __LIBBPF_STRICT_LAST, +}; + +LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode); + + +#ifdef __cplusplus +} /* extern "C" */ +#endif + +#endif /* __LIBBPF_LEGACY_BPF_H */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit f12b654327283d158de0af170943ec5dd8cd02e5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Ensure that low-level APIs behave uniformly across the libbpf as follows: - in case of an error, errno is always set to the correct error code; - when libbpf 1.0 mode is enabled with LIBBPF_STRICT_DIRECT_ERRS option to libbpf_set_strict_mode(), return -Exxx error value directly, instead of -1; - by default, until libbpf 1.0 is released, keep returning -1 directly.
More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]).
[0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54n...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210525035935.1461796-4-andrii@kernel.org (cherry picked from commit f12b654327283d158de0af170943ec5dd8cd02e5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf.c | 168 ++++++++++++++++++++++---------- tools/lib/bpf/libbpf_internal.h | 26 +++++ tools/lib/bpf/libbpf_legacy.h | 12 +++ 3 files changed, 156 insertions(+), 50 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 1ce17f3528f6..d6959f1f3ed8 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -80,6 +80,7 @@ static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size) int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr) { union bpf_attr attr; + int fd;
memset(&attr, '\0', sizeof(attr));
@@ -102,7 +103,8 @@ int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr) else attr.inner_map_fd = create_attr->inner_map_fd;
- return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr)); + fd = sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_create_map_node(enum bpf_map_type map_type, const char *name, @@ -160,6 +162,7 @@ int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name, __u32 map_flags, int node) { union bpf_attr attr; + int fd;
memset(&attr, '\0', sizeof(attr));
@@ -178,7 +181,8 @@ int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name, attr.numa_node = node; }
- return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr)); + fd = sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_create_map_in_map(enum bpf_map_type map_type, const char *name, @@ -222,10 +226,10 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr) int fd;
if (!load_attr->log_buf != !load_attr->log_buf_sz) - return -EINVAL; + return libbpf_err(-EINVAL);
if (load_attr->log_level > (4 | 2 | 1) || (load_attr->log_level && !load_attr->log_buf)) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.prog_type = load_attr->prog_type; @@ -281,8 +285,10 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr) load_attr->func_info_cnt, load_attr->func_info_rec_size, attr.func_info_rec_size); - if (!finfo) + if (!finfo) { + errno = E2BIG; goto done; + }
attr.func_info = ptr_to_u64(finfo); attr.func_info_rec_size = load_attr->func_info_rec_size; @@ -293,8 +299,10 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr) load_attr->line_info_cnt, load_attr->line_info_rec_size, attr.line_info_rec_size); - if (!linfo) + if (!linfo) { + errno = E2BIG; goto done; + }
attr.line_info = ptr_to_u64(linfo); attr.line_info_rec_size = load_attr->line_info_rec_size; @@ -318,9 +326,10 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr)
fd = sys_bpf_prog_load(&attr, sizeof(attr)); done: + /* free() doesn't affect errno, so we don't need to restore it */ free(finfo); free(linfo); - return fd; + return libbpf_err_errno(fd); }
int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, @@ -329,7 +338,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, struct bpf_prog_load_params p = {};
if (!load_attr || !log_buf != !log_buf_sz) - return -EINVAL; + return libbpf_err(-EINVAL);
p.prog_type = load_attr->prog_type; p.expected_attach_type = load_attr->expected_attach_type; @@ -392,6 +401,7 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns, int log_level) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.prog_type = type; @@ -405,13 +415,15 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns, attr.kern_version = kern_version; attr.prog_flags = prog_flags;
- return sys_bpf_prog_load(&attr, sizeof(attr)); + fd = sys_bpf_prog_load(&attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd; @@ -419,24 +431,28 @@ int bpf_map_update_elem(int fd, const void *key, const void *value, attr.value = ptr_to_u64(value); attr.flags = flags;
- return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_map_lookup_elem(int fd, const void *key, void *value) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd; attr.key = ptr_to_u64(key); attr.value = ptr_to_u64(value);
- return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd; @@ -444,52 +460,61 @@ int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags) attr.value = ptr_to_u64(value); attr.flags = flags;
- return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd; attr.key = ptr_to_u64(key); attr.value = ptr_to_u64(value);
- return sys_bpf(BPF_MAP_LOOKUP_AND_DELETE_ELEM, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_LOOKUP_AND_DELETE_ELEM, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_map_delete_elem(int fd, const void *key) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd; attr.key = ptr_to_u64(key);
- return sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_map_get_next_key(int fd, const void *key, void *next_key) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd; attr.key = ptr_to_u64(key); attr.next_key = ptr_to_u64(next_key);
- return sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_map_freeze(int fd) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.map_fd = fd;
- return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr)); + ret = sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
static int bpf_map_batch_common(int cmd, int fd, void *in_batch, @@ -501,7 +526,7 @@ static int bpf_map_batch_common(int cmd, int fd, void *in_batch, int ret;
if (!OPTS_VALID(opts, bpf_map_batch_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.batch.map_fd = fd; @@ -516,7 +541,7 @@ static int bpf_map_batch_common(int cmd, int fd, void *in_batch, ret = sys_bpf(cmd, &attr, sizeof(attr)); *count = attr.batch.count;
- return ret; + return libbpf_err_errno(ret); }
int bpf_map_delete_batch(int fd, void *keys, __u32 *count, @@ -553,22 +578,26 @@ int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count, int bpf_obj_pin(int fd, const char *pathname) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.pathname = ptr_to_u64((void *)pathname); attr.bpf_fd = fd;
- return sys_bpf(BPF_OBJ_PIN, &attr, sizeof(attr)); + ret = sys_bpf(BPF_OBJ_PIN, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_obj_get(const char *pathname) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.pathname = ptr_to_u64((void *)pathname);
- return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr)); + fd = sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type, @@ -586,9 +615,10 @@ int bpf_prog_attach_xattr(int prog_fd, int target_fd, const struct bpf_prog_attach_opts *opts) { union bpf_attr attr; + int ret;
if (!OPTS_VALID(opts, bpf_prog_attach_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.target_fd = target_fd; @@ -597,30 +627,35 @@ int bpf_prog_attach_xattr(int prog_fd, int target_fd, attr.attach_flags = OPTS_GET(opts, flags, 0); attr.replace_bpf_fd = OPTS_GET(opts, replace_prog_fd, 0);
- return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr)); + ret = sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_prog_detach(int target_fd, enum bpf_attach_type type) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.target_fd = target_fd; attr.attach_type = type;
- return sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr)); + ret = sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_prog_detach2(int prog_fd, int target_fd, enum bpf_attach_type type) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.target_fd = target_fd; attr.attach_bpf_fd = prog_fd; attr.attach_type = type;
- return sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr)); + ret = sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_link_create(int prog_fd, int target_fd, @@ -629,15 +664,16 @@ int bpf_link_create(int prog_fd, int target_fd, { __u32 target_btf_id, iter_info_len; union bpf_attr attr; + int fd;
if (!OPTS_VALID(opts, bpf_link_create_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
iter_info_len = OPTS_GET(opts, iter_info_len, 0); target_btf_id = OPTS_GET(opts, target_btf_id, 0);
if (iter_info_len && target_btf_id) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.link_create.prog_fd = prog_fd; @@ -653,26 +689,30 @@ int bpf_link_create(int prog_fd, int target_fd, attr.link_create.target_btf_id = target_btf_id; }
- return sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr)); + fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_link_detach(int link_fd) { union bpf_attr attr; + int ret;
memset(&attr, 0, sizeof(attr)); attr.link_detach.link_fd = link_fd;
- return sys_bpf(BPF_LINK_DETACH, &attr, sizeof(attr)); + ret = sys_bpf(BPF_LINK_DETACH, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_link_update(int link_fd, int new_prog_fd, const struct bpf_link_update_opts *opts) { union bpf_attr attr; + int ret;
if (!OPTS_VALID(opts, bpf_link_update_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.link_update.link_fd = link_fd; @@ -680,17 +720,20 @@ int bpf_link_update(int link_fd, int new_prog_fd, attr.link_update.flags = OPTS_GET(opts, flags, 0); attr.link_update.old_prog_fd = OPTS_GET(opts, old_prog_fd, 0);
- return sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr)); + ret = sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr)); + return libbpf_err_errno(ret); }
int bpf_iter_create(int link_fd) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.iter_create.link_fd = link_fd;
- return sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr)); + fd = sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags, @@ -707,10 +750,12 @@ int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags, attr.query.prog_ids = ptr_to_u64(prog_ids);
ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr)); + if (attach_flags) *attach_flags = attr.query.attach_flags; *prog_cnt = attr.query.prog_cnt; - return ret; + + return libbpf_err_errno(ret); }
int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size, @@ -728,13 +773,15 @@ int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size, attr.test.repeat = repeat;
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); + if (size_out) *size_out = attr.test.data_size_out; if (retval) *retval = attr.test.retval; if (duration) *duration = attr.test.duration; - return ret; + + return libbpf_err_errno(ret); }
int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr) @@ -743,7 +790,7 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr) int ret;
if (!test_attr->data_out && test_attr->data_size_out > 0) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.test.prog_fd = test_attr->prog_fd; @@ -758,11 +805,13 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr) attr.test.repeat = test_attr->repeat;
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); + test_attr->data_size_out = attr.test.data_size_out; test_attr->ctx_size_out = attr.test.ctx_size_out; test_attr->retval = attr.test.retval; test_attr->duration = attr.test.duration; - return ret; + + return libbpf_err_errno(ret); }
int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts) @@ -771,7 +820,7 @@ int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts) int ret;
if (!OPTS_VALID(opts, bpf_test_run_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.test.prog_fd = prog_fd; @@ -789,11 +838,13 @@ int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts) attr.test.data_out = ptr_to_u64(OPTS_GET(opts, data_out, NULL));
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); + OPTS_SET(opts, data_size_out, attr.test.data_size_out); OPTS_SET(opts, ctx_size_out, attr.test.ctx_size_out); OPTS_SET(opts, duration, attr.test.duration); OPTS_SET(opts, retval, attr.test.retval); - return ret; + + return libbpf_err_errno(ret); }
static int bpf_obj_get_next_id(__u32 start_id, __u32 *next_id, int cmd) @@ -808,7 +859,7 @@ static int bpf_obj_get_next_id(__u32 start_id, __u32 *next_id, int cmd) if (!err) *next_id = attr.next_id;
- return err; + return libbpf_err_errno(err); }
int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id) @@ -834,41 +885,49 @@ int bpf_link_get_next_id(__u32 start_id, __u32 *next_id) int bpf_prog_get_fd_by_id(__u32 id) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.prog_id = id;
- return sys_bpf(BPF_PROG_GET_FD_BY_ID, &attr, sizeof(attr)); + fd = sys_bpf(BPF_PROG_GET_FD_BY_ID, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_map_get_fd_by_id(__u32 id) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.map_id = id;
- return sys_bpf(BPF_MAP_GET_FD_BY_ID, &attr, sizeof(attr)); + fd = sys_bpf(BPF_MAP_GET_FD_BY_ID, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_btf_get_fd_by_id(__u32 id) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.btf_id = id;
- return sys_bpf(BPF_BTF_GET_FD_BY_ID, &attr, sizeof(attr)); + fd = sys_bpf(BPF_BTF_GET_FD_BY_ID, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_link_get_fd_by_id(__u32 id) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.link_id = id;
- return sys_bpf(BPF_LINK_GET_FD_BY_ID, &attr, sizeof(attr)); + fd = sys_bpf(BPF_LINK_GET_FD_BY_ID, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len) @@ -882,21 +941,24 @@ int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len) attr.info.info = ptr_to_u64(info);
err = sys_bpf(BPF_OBJ_GET_INFO_BY_FD, &attr, sizeof(attr)); + if (!err) *info_len = attr.info.info_len;
- return err; + return libbpf_err_errno(err); }
int bpf_raw_tracepoint_open(const char *name, int prog_fd) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.raw_tracepoint.name = ptr_to_u64(name); attr.raw_tracepoint.prog_fd = prog_fd;
- return sys_bpf(BPF_RAW_TRACEPOINT_OPEN, &attr, sizeof(attr)); + fd = sys_bpf(BPF_RAW_TRACEPOINT_OPEN, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size, @@ -916,12 +978,13 @@ int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_s }
fd = sys_bpf(BPF_BTF_LOAD, &attr, sizeof(attr)); - if (fd == -1 && !do_log && log_buf && log_buf_size) { + + if (fd < 0 && !do_log && log_buf && log_buf_size) { do_log = true; goto retry; }
- return fd; + return libbpf_err_errno(fd); }
int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len, @@ -938,37 +1001,42 @@ int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len, attr.task_fd_query.buf_len = *buf_len;
err = sys_bpf(BPF_TASK_FD_QUERY, &attr, sizeof(attr)); + *buf_len = attr.task_fd_query.buf_len; *prog_id = attr.task_fd_query.prog_id; *fd_type = attr.task_fd_query.fd_type; *probe_offset = attr.task_fd_query.probe_offset; *probe_addr = attr.task_fd_query.probe_addr;
- return err; + return libbpf_err_errno(err); }
int bpf_enable_stats(enum bpf_stats_type type) { union bpf_attr attr; + int fd;
memset(&attr, 0, sizeof(attr)); attr.enable_stats.type = type;
- return sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr)); + fd = sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr)); + return libbpf_err_errno(fd); }
int bpf_prog_bind_map(int prog_fd, int map_fd, const struct bpf_prog_bind_opts *opts) { union bpf_attr attr; + int ret;
if (!OPTS_VALID(opts, bpf_prog_bind_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr)); attr.prog_bind_map.prog_fd = prog_fd; attr.prog_bind_map.map_fd = map_fd; attr.prog_bind_map.flags = OPTS_GET(opts, flags, 0);
- return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr)); + ret = sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr)); + return libbpf_err_errno(ret); } diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 777b17b230f8..2c9745462fc6 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -11,6 +11,9 @@
#include <stdlib.h> #include <limits.h> +#include <errno.h> +#include <linux/err.h> +#include "libbpf_legacy.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */ #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64 @@ -441,4 +444,27 @@ int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ct int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx); int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx);
+extern enum libbpf_strict_mode libbpf_mode; + +/* handle direct returned errors */ +static inline int libbpf_err(int ret) +{ + if (ret < 0) + errno = -ret; + return ret; +} + +/* handle errno-based (e.g., syscall or libc) errors according to libbpf's + * strict mode settings + */ +static inline int libbpf_err_errno(int ret) +{ + if (libbpf_mode & LIBBPF_STRICT_DIRECT_ERRS) + /* errno is already assumed to be set on error */ + return ret < 0 ? -errno : ret; + + /* legacy: on error return -1 directly and don't touch errno */ + return ret; +} + #endif /* __LIBBPF_LIBBPF_INTERNAL_H */ diff --git a/tools/lib/bpf/libbpf_legacy.h b/tools/lib/bpf/libbpf_legacy.h index 7482cfe22ab2..df0d03dcffab 100644 --- a/tools/lib/bpf/libbpf_legacy.h +++ b/tools/lib/bpf/libbpf_legacy.h @@ -33,6 +33,18 @@ enum libbpf_strict_mode { * code so that it handles LIBBPF_STRICT_ALL mode before libbpf v1.0. */ LIBBPF_STRICT_NONE = 0x00, + /* + * Return NULL pointers on error, not ERR_PTR(err). + * Additionally, libbpf also always sets errno to corresponding Exx + * (positive) error code. + */ + LIBBPF_STRICT_CLEAN_PTRS = 0x01, + /* + * Return actual error codes from low-level APIs directly, not just -1. + * Additionally, libbpf also always sets errno to corresponding Exx + * (positive) error code. + */ + LIBBPF_STRICT_DIRECT_ERRS = 0x02,
__LIBBPF_STRICT_LAST, };
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit e9fc3ce99b3485586e7e4803b63df8b4c681f897 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Implement changes to error reporting for high-level libbpf APIs to make them less surprising and less error-prone to users: - in all the cases when error happens, errno is set to an appropriate error value; - in libbpf 1.0 mode, all pointer-returning APIs return NULL on error and error code is communicated through errno; this applies both to APIs that already returned NULL before (so now they communicate more detailed error codes), as well as for many APIs that used ERR_PTR() macro and encoded error numbers as fake pointers. - in legacy (default) mode, those APIs that were returning ERR_PTR(err), continue doing so, but still set errno.
With these changes, errno can be always used to extract actual error, regardless of legacy or libbpf 1.0 modes. This is utilized internally in libbpf in places where libbpf uses it's own high-level APIs. libbpf_get_error() is adapted to handle both cases completely transparently to end-users (and is used by libbpf consistently as well).
More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]).
[0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54n...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210525035935.1461796-5-andrii@kernel.org (cherry picked from commit e9fc3ce99b3485586e7e4803b63df8b4c681f897) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_prog_linfo.c | 18 +- tools/lib/bpf/btf.c | 302 +++++++++---------- tools/lib/bpf/btf_dump.c | 14 +- tools/lib/bpf/libbpf.c | 502 +++++++++++++++++--------------- tools/lib/bpf/libbpf_errno.c | 7 +- tools/lib/bpf/libbpf_internal.h | 27 ++ tools/lib/bpf/linker.c | 22 +- tools/lib/bpf/netlink.c | 81 +++--- tools/lib/bpf/ringbuf.c | 26 +- 9 files changed, 531 insertions(+), 468 deletions(-)
diff --git a/tools/lib/bpf/bpf_prog_linfo.c b/tools/lib/bpf/bpf_prog_linfo.c index 3ed1a27b5f7c..5c503096ef43 100644 --- a/tools/lib/bpf/bpf_prog_linfo.c +++ b/tools/lib/bpf/bpf_prog_linfo.c @@ -106,7 +106,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info) nr_linfo = info->nr_line_info;
if (!nr_linfo) - return NULL; + return errno = EINVAL, NULL;
/* * The min size that bpf_prog_linfo has to access for @@ -114,11 +114,11 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info) */ if (info->line_info_rec_size < offsetof(struct bpf_line_info, file_name_off)) - return NULL; + return errno = EINVAL, NULL;
prog_linfo = calloc(1, sizeof(*prog_linfo)); if (!prog_linfo) - return NULL; + return errno = ENOMEM, NULL;
/* Copy xlated line_info */ prog_linfo->nr_linfo = nr_linfo; @@ -174,7 +174,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
err_free: bpf_prog_linfo__free(prog_linfo); - return NULL; + return errno = EINVAL, NULL; }
const struct bpf_line_info * @@ -186,11 +186,11 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo, const __u64 *jited_linfo;
if (func_idx >= prog_linfo->nr_jited_func) - return NULL; + return errno = ENOENT, NULL;
nr_linfo = prog_linfo->nr_jited_linfo_per_func[func_idx]; if (nr_skip >= nr_linfo) - return NULL; + return errno = ENOENT, NULL;
start = prog_linfo->jited_linfo_func_idx[func_idx] + nr_skip; jited_rec_size = prog_linfo->jited_rec_size; @@ -198,7 +198,7 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo, (start * jited_rec_size); jited_linfo = raw_jited_linfo; if (addr < *jited_linfo) - return NULL; + return errno = ENOENT, NULL;
nr_linfo -= nr_skip; rec_size = prog_linfo->rec_size; @@ -225,13 +225,13 @@ bpf_prog_linfo__lfind(const struct bpf_prog_linfo *prog_linfo,
nr_linfo = prog_linfo->nr_linfo; if (nr_skip >= nr_linfo) - return NULL; + return errno = ENOENT, NULL;
rec_size = prog_linfo->rec_size; raw_linfo = prog_linfo->raw_linfo + (nr_skip * rec_size); linfo = raw_linfo; if (insn_off < linfo->insn_off) - return NULL; + return errno = ENOENT, NULL;
nr_linfo -= nr_skip; for (i = 0; i < nr_linfo; i++) { diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 9be6b87f8d52..1c46d76fefb5 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -449,7 +449,7 @@ struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id) const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 type_id) { if (type_id >= btf->start_id + btf->nr_types) - return NULL; + return errno = EINVAL, NULL; return btf_type_by_id((struct btf *)btf, type_id); }
@@ -516,7 +516,7 @@ size_t btf__pointer_size(const struct btf *btf) int btf__set_pointer_size(struct btf *btf, size_t ptr_sz) { if (ptr_sz != 4 && ptr_sz != 8) - return -EINVAL; + return libbpf_err(-EINVAL); btf->ptr_sz = ptr_sz; return 0; } @@ -543,7 +543,7 @@ enum btf_endianness btf__endianness(const struct btf *btf) int btf__set_endianness(struct btf *btf, enum btf_endianness endian) { if (endian != BTF_LITTLE_ENDIAN && endian != BTF_BIG_ENDIAN) - return -EINVAL; + return libbpf_err(-EINVAL);
btf->swapped_endian = is_host_big_endian() != (endian == BTF_BIG_ENDIAN); if (!btf->swapped_endian) { @@ -574,8 +574,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) int i;
t = btf__type_by_id(btf, type_id); - for (i = 0; i < MAX_RESOLVE_DEPTH && !btf_type_is_void_or_null(t); - i++) { + for (i = 0; i < MAX_RESOLVE_DEPTH && !btf_type_is_void_or_null(t); i++) { switch (btf_kind(t)) { case BTF_KIND_INT: case BTF_KIND_STRUCT: @@ -598,12 +597,12 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_ARRAY: array = btf_array(t); if (nelems && array->nelems > UINT32_MAX / nelems) - return -E2BIG; + return libbpf_err(-E2BIG); nelems *= array->nelems; type_id = array->type; break; default: - return -EINVAL; + return libbpf_err(-EINVAL); }
t = btf__type_by_id(btf, type_id); @@ -611,9 +610,9 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id)
done: if (size < 0) - return -EINVAL; + return libbpf_err(-EINVAL); if (nelems && size > UINT32_MAX / nelems) - return -E2BIG; + return libbpf_err(-E2BIG);
return nelems * size; } @@ -646,7 +645,7 @@ int btf__align_of(const struct btf *btf, __u32 id) for (i = 0; i < vlen; i++, m++) { align = btf__align_of(btf, m->type); if (align <= 0) - return align; + return libbpf_err(align); max_align = max(max_align, align); }
@@ -654,7 +653,7 @@ int btf__align_of(const struct btf *btf, __u32 id) } default: pr_warn("unsupported BTF_KIND:%u\n", btf_kind(t)); - return 0; + return errno = EINVAL, 0; } }
@@ -673,7 +672,7 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id) }
if (depth == MAX_RESOLVE_DEPTH || btf_type_is_void_or_null(t)) - return -EINVAL; + return libbpf_err(-EINVAL);
return type_id; } @@ -693,7 +692,7 @@ __s32 btf__find_by_name(const struct btf *btf, const char *type_name) return i; }
- return -ENOENT; + return libbpf_err(-ENOENT); }
__s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, @@ -715,7 +714,7 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, return i; }
- return -ENOENT; + return libbpf_err(-ENOENT); }
static bool btf_is_modifiable(const struct btf *btf) @@ -791,12 +790,12 @@ static struct btf *btf_new_empty(struct btf *base_btf)
struct btf *btf__new_empty(void) { - return btf_new_empty(NULL); + return libbpf_ptr(btf_new_empty(NULL)); }
struct btf *btf__new_empty_split(struct btf *base_btf) { - return btf_new_empty(base_btf); + return libbpf_ptr(btf_new_empty(base_btf)); }
static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf) @@ -852,7 +851,7 @@ static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf)
struct btf *btf__new(const void *data, __u32 size) { - return btf_new(data, size, NULL); + return libbpf_ptr(btf_new(data, size, NULL)); }
static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, @@ -943,7 +942,8 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, goto done; } btf = btf_new(btf_data->d_buf, btf_data->d_size, base_btf); - if (IS_ERR(btf)) + err = libbpf_get_error(btf); + if (err) goto done;
switch (gelf_getclass(elf)) { @@ -959,9 +959,9 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, }
if (btf_ext && btf_ext_data) { - *btf_ext = btf_ext__new(btf_ext_data->d_buf, - btf_ext_data->d_size); - if (IS_ERR(*btf_ext)) + *btf_ext = btf_ext__new(btf_ext_data->d_buf, btf_ext_data->d_size); + err = libbpf_get_error(*btf_ext); + if (err) goto done; } else if (btf_ext) { *btf_ext = NULL; @@ -971,30 +971,24 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, elf_end(elf); close(fd);
- if (err) - return ERR_PTR(err); - /* - * btf is always parsed before btf_ext, so no need to clean up - * btf_ext, if btf loading failed - */ - if (IS_ERR(btf)) + if (!err) return btf; - if (btf_ext && IS_ERR(*btf_ext)) { - btf__free(btf); - err = PTR_ERR(*btf_ext); - return ERR_PTR(err); - } - return btf; + + if (btf_ext) + btf_ext__free(*btf_ext); + btf__free(btf); + + return ERR_PTR(err); }
struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext) { - return btf_parse_elf(path, NULL, btf_ext); + return libbpf_ptr(btf_parse_elf(path, NULL, btf_ext)); }
struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf) { - return btf_parse_elf(path, base_btf, NULL); + return libbpf_ptr(btf_parse_elf(path, base_btf, NULL)); }
static struct btf *btf_parse_raw(const char *path, struct btf *base_btf) @@ -1062,36 +1056,39 @@ static struct btf *btf_parse_raw(const char *path, struct btf *base_btf)
struct btf *btf__parse_raw(const char *path) { - return btf_parse_raw(path, NULL); + return libbpf_ptr(btf_parse_raw(path, NULL)); }
struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf) { - return btf_parse_raw(path, base_btf); + return libbpf_ptr(btf_parse_raw(path, base_btf)); }
static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext) { struct btf *btf; + int err;
if (btf_ext) *btf_ext = NULL;
btf = btf_parse_raw(path, base_btf); - if (!IS_ERR(btf) || PTR_ERR(btf) != -EPROTO) + err = libbpf_get_error(btf); + if (!err) return btf; - + if (err != -EPROTO) + return ERR_PTR(err); return btf_parse_elf(path, base_btf, btf_ext); }
struct btf *btf__parse(const char *path, struct btf_ext **btf_ext) { - return btf_parse(path, NULL, btf_ext); + return libbpf_ptr(btf_parse(path, NULL, btf_ext)); }
struct btf *btf__parse_split(const char *path, struct btf *base_btf) { - return btf_parse(path, base_btf, NULL); + return libbpf_ptr(btf_parse(path, base_btf, NULL)); }
static int compare_vsi_off(const void *_a, const void *_b) @@ -1184,7 +1181,7 @@ int btf__finalize_data(struct bpf_object *obj, struct btf *btf) } }
- return err; + return libbpf_err(err); }
static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian); @@ -1197,13 +1194,13 @@ int btf__load(struct btf *btf) int err = 0;
if (btf->fd >= 0) - return -EEXIST; + return libbpf_err(-EEXIST);
retry_load: if (log_buf_size) { log_buf = malloc(log_buf_size); if (!log_buf) - return -ENOMEM; + return libbpf_err(-ENOMEM);
*log_buf = 0; } @@ -1235,7 +1232,7 @@ int btf__load(struct btf *btf)
done: free(log_buf); - return err; + return libbpf_err(err); }
int btf__fd(const struct btf *btf) @@ -1311,7 +1308,7 @@ const void *btf__get_raw_data(const struct btf *btf_ro, __u32 *size)
data = btf_get_raw_data(btf, &data_sz, btf->swapped_endian); if (!data) - return NULL; + return errno = -ENOMEM, NULL;
btf->raw_size = data_sz; if (btf->swapped_endian) @@ -1329,7 +1326,7 @@ const char *btf__str_by_offset(const struct btf *btf, __u32 offset) else if (offset - btf->start_str_off < btf->hdr->str_len) return btf_strs_data(btf) + (offset - btf->start_str_off); else - return NULL; + return errno = EINVAL, NULL; }
const char *btf__name_by_offset(const struct btf *btf, __u32 offset) @@ -1394,17 +1391,20 @@ struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) int btf__get_from_id(__u32 id, struct btf **btf) { struct btf *res; - int btf_fd; + int err, btf_fd;
*btf = NULL; btf_fd = bpf_btf_get_fd_by_id(id); if (btf_fd < 0) - return -errno; + return libbpf_err(-errno);
res = btf_get_from_fd(btf_fd, NULL); + err = libbpf_get_error(res); + close(btf_fd); - if (IS_ERR(res)) - return PTR_ERR(res); + + if (err) + return libbpf_err(err);
*btf = res; return 0; @@ -1421,31 +1421,30 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, __s64 key_size, value_size; __s32 container_id;
- if (snprintf(container_name, max_name, "____btf_map_%s", map_name) == - max_name) { + if (snprintf(container_name, max_name, "____btf_map_%s", map_name) == max_name) { pr_warn("map:%s length of '____btf_map_%s' is too long\n", map_name, map_name); - return -EINVAL; + return libbpf_err(-EINVAL); }
container_id = btf__find_by_name(btf, container_name); if (container_id < 0) { pr_debug("map:%s container_name:%s cannot be found in BTF. Missing BPF_ANNOTATE_KV_PAIR?\n", map_name, container_name); - return container_id; + return libbpf_err(container_id); }
container_type = btf__type_by_id(btf, container_id); if (!container_type) { pr_warn("map:%s cannot find BTF type for container_id:%u\n", map_name, container_id); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (!btf_is_struct(container_type) || btf_vlen(container_type) < 2) { pr_warn("map:%s container_name:%s is an invalid container struct\n", map_name, container_name); - return -EINVAL; + return libbpf_err(-EINVAL); }
key = btf_members(container_type); @@ -1454,25 +1453,25 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, key_size = btf__resolve_size(btf, key->type); if (key_size < 0) { pr_warn("map:%s invalid BTF key_type_size\n", map_name); - return key_size; + return libbpf_err(key_size); }
if (expected_key_size != key_size) { pr_warn("map:%s btf_key_type_size:%u != map_def_key_size:%u\n", map_name, (__u32)key_size, expected_key_size); - return -EINVAL; + return libbpf_err(-EINVAL); }
value_size = btf__resolve_size(btf, value->type); if (value_size < 0) { pr_warn("map:%s invalid BTF value_type_size\n", map_name); - return value_size; + return libbpf_err(value_size); }
if (expected_value_size != value_size) { pr_warn("map:%s btf_value_type_size:%u != map_def_value_size:%u\n", map_name, (__u32)value_size, expected_value_size); - return -EINVAL; + return libbpf_err(-EINVAL); }
*key_type_id = key->type; @@ -1569,11 +1568,11 @@ int btf__find_str(struct btf *btf, const char *s)
/* BTF needs to be in a modifiable state to build string lookup index */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
off = strset__find_str(btf->strs_set, s); if (off < 0) - return off; + return libbpf_err(off);
return btf->start_str_off + off; } @@ -1594,11 +1593,11 @@ int btf__add_str(struct btf *btf, const char *s) }
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
off = strset__add_str(btf->strs_set, s); if (off < 0) - return off; + return libbpf_err(off);
btf->hdr->str_len = strset__data_size(btf->strs_set);
@@ -1622,7 +1621,7 @@ static int btf_commit_type(struct btf *btf, int data_sz)
err = btf_add_type_idx_entry(btf, btf->hdr->type_len); if (err) - return err; + return libbpf_err(err);
btf->hdr->type_len += data_sz; btf->hdr->str_off += data_sz; @@ -1659,21 +1658,21 @@ int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_t
sz = btf_type_size(src_type); if (sz < 0) - return sz; + return libbpf_err(sz);
/* deconstruct BTF, if necessary, and invalidate raw_data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
memcpy(t, src_type, sz);
err = btf_type_visit_str_offs(t, btf_rewrite_str, &p); if (err) - return err; + return libbpf_err(err);
return btf_commit_type(btf, sz); } @@ -1694,21 +1693,21 @@ int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding
/* non-empty name */ if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL); /* byte_sz must be power of 2 */ if (!byte_sz || (byte_sz & (byte_sz - 1)) || byte_sz > 16) - return -EINVAL; + return libbpf_err(-EINVAL); if (encoding & ~(BTF_INT_SIGNED | BTF_INT_CHAR | BTF_INT_BOOL)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* deconstruct BTF, if necessary, and invalidate raw_data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type) + sizeof(int); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
/* if something goes wrong later, we might end up with an extra string, * but that shouldn't be a problem, because BTF can't be constructed @@ -1742,20 +1741,20 @@ int btf__add_float(struct btf *btf, const char *name, size_t byte_sz)
/* non-empty name */ if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL);
/* byte_sz must be one of the explicitly allowed values */ if (byte_sz != 2 && byte_sz != 4 && byte_sz != 8 && byte_sz != 12 && byte_sz != 16) - return -EINVAL; + return libbpf_err(-EINVAL);
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
name_off = btf__add_str(btf, name); if (name_off < 0) @@ -1786,15 +1785,15 @@ static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref int sz, name_off = 0;
if (validate_type_id(ref_type_id)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
if (name && name[0]) { name_off = btf__add_str(btf, name); @@ -1837,15 +1836,15 @@ int btf__add_array(struct btf *btf, int index_type_id, int elem_type_id, __u32 n int sz;
if (validate_type_id(index_type_id) || validate_type_id(elem_type_id)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type) + sizeof(struct btf_array); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
t->name_off = 0; t->info = btf_type_info(BTF_KIND_ARRAY, 0, 0); @@ -1866,12 +1865,12 @@ static int btf_add_composite(struct btf *btf, int kind, const char *name, __u32 int sz, name_off = 0;
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
if (name && name[0]) { name_off = btf__add_str(btf, name); @@ -1949,30 +1948,30 @@ int btf__add_field(struct btf *btf, const char *name, int type_id,
/* last type should be union/struct */ if (btf->nr_types == 0) - return -EINVAL; + return libbpf_err(-EINVAL); t = btf_last_type(btf); if (!btf_is_composite(t)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (validate_type_id(type_id)) - return -EINVAL; + return libbpf_err(-EINVAL); /* best-effort bit field offset/size enforcement */ is_bitfield = bit_size || (bit_offset % 8 != 0); if (is_bitfield && (bit_size == 0 || bit_size > 255 || bit_offset > 0xffffff)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* only offset 0 is allowed for unions */ if (btf_is_union(t) && bit_offset) - return -EINVAL; + return libbpf_err(-EINVAL);
/* decompose and invalidate raw data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_member); m = btf_add_type_mem(btf, sz); if (!m) - return -ENOMEM; + return libbpf_err(-ENOMEM);
if (name && name[0]) { name_off = btf__add_str(btf, name); @@ -2014,15 +2013,15 @@ int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz)
/* byte_sz must be power of 2 */ if (!byte_sz || (byte_sz & (byte_sz - 1)) || byte_sz > 8) - return -EINVAL; + return libbpf_err(-EINVAL);
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
if (name && name[0]) { name_off = btf__add_str(btf, name); @@ -2054,25 +2053,25 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value)
/* last type should be BTF_KIND_ENUM */ if (btf->nr_types == 0) - return -EINVAL; + return libbpf_err(-EINVAL); t = btf_last_type(btf); if (!btf_is_enum(t)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* non-empty name */ if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL); if (value < INT_MIN || value > UINT_MAX) - return -E2BIG; + return libbpf_err(-E2BIG);
/* decompose and invalidate raw data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_enum); v = btf_add_type_mem(btf, sz); if (!v) - return -ENOMEM; + return libbpf_err(-ENOMEM);
name_off = btf__add_str(btf, name); if (name_off < 0) @@ -2102,7 +2101,7 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value) int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind) { if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL);
switch (fwd_kind) { case BTF_FWD_STRUCT: @@ -2123,7 +2122,7 @@ int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind) */ return btf__add_enum(btf, name, sizeof(int)); default: - return -EINVAL; + return libbpf_err(-EINVAL); } }
@@ -2138,7 +2137,7 @@ int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind) int btf__add_typedef(struct btf *btf, const char *name, int ref_type_id) { if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL);
return btf_add_ref_kind(btf, BTF_KIND_TYPEDEF, name, ref_type_id); } @@ -2193,10 +2192,10 @@ int btf__add_func(struct btf *btf, const char *name, int id;
if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL); if (linkage != BTF_FUNC_STATIC && linkage != BTF_FUNC_GLOBAL && linkage != BTF_FUNC_EXTERN) - return -EINVAL; + return libbpf_err(-EINVAL);
id = btf_add_ref_kind(btf, BTF_KIND_FUNC, name, proto_type_id); if (id > 0) { @@ -2204,7 +2203,7 @@ int btf__add_func(struct btf *btf, const char *name,
t->info = btf_type_info(BTF_KIND_FUNC, linkage, 0); } - return id; + return libbpf_err(id); }
/* @@ -2225,15 +2224,15 @@ int btf__add_func_proto(struct btf *btf, int ret_type_id) int sz;
if (validate_type_id(ret_type_id)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
/* start out with vlen=0; this will be adjusted when adding enum * values, if necessary @@ -2260,23 +2259,23 @@ int btf__add_func_param(struct btf *btf, const char *name, int type_id) int sz, name_off = 0;
if (validate_type_id(type_id)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* last type should be BTF_KIND_FUNC_PROTO */ if (btf->nr_types == 0) - return -EINVAL; + return libbpf_err(-EINVAL); t = btf_last_type(btf); if (!btf_is_func_proto(t)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* decompose and invalidate raw data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_param); p = btf_add_type_mem(btf, sz); if (!p) - return -ENOMEM; + return libbpf_err(-ENOMEM);
if (name && name[0]) { name_off = btf__add_str(btf, name); @@ -2314,21 +2313,21 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id)
/* non-empty name */ if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL); if (linkage != BTF_VAR_STATIC && linkage != BTF_VAR_GLOBAL_ALLOCATED && linkage != BTF_VAR_GLOBAL_EXTERN) - return -EINVAL; + return libbpf_err(-EINVAL); if (validate_type_id(type_id)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* deconstruct BTF, if necessary, and invalidate raw_data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type) + sizeof(struct btf_var); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
name_off = btf__add_str(btf, name); if (name_off < 0) @@ -2363,15 +2362,15 @@ int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz)
/* non-empty name */ if (!name || !name[0]) - return -EINVAL; + return libbpf_err(-EINVAL);
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_type); t = btf_add_type_mem(btf, sz); if (!t) - return -ENOMEM; + return libbpf_err(-ENOMEM);
name_off = btf__add_str(btf, name); if (name_off < 0) @@ -2403,22 +2402,22 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __
/* last type should be BTF_KIND_DATASEC */ if (btf->nr_types == 0) - return -EINVAL; + return libbpf_err(-EINVAL); t = btf_last_type(btf); if (!btf_is_datasec(t)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (validate_type_id(var_type_id)) - return -EINVAL; + return libbpf_err(-EINVAL);
/* decompose and invalidate raw data */ if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_var_secinfo); v = btf_add_type_mem(btf, sz); if (!v) - return -ENOMEM; + return libbpf_err(-ENOMEM);
v->type = var_type_id; v->offset = offset; @@ -2620,11 +2619,11 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size)
err = btf_ext_parse_hdr(data, size); if (err) - return ERR_PTR(err); + return libbpf_err_ptr(err);
btf_ext = calloc(1, sizeof(struct btf_ext)); if (!btf_ext) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM);
btf_ext->data_size = size; btf_ext->data = malloc(size); @@ -2634,9 +2633,11 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size) } memcpy(btf_ext->data, data, size);
- if (btf_ext->hdr->hdr_len < - offsetofend(struct btf_ext_header, line_info_len)) + if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, line_info_len)) { + err = -EINVAL; goto done; + } + err = btf_ext_setup_func_info(btf_ext); if (err) goto done; @@ -2645,8 +2646,11 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size) if (err) goto done;
- if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len)) + if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len)) { + err = -EINVAL; goto done; + } + err = btf_ext_setup_core_relos(btf_ext); if (err) goto done; @@ -2654,7 +2658,7 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size) done: if (err) { btf_ext__free(btf_ext); - return ERR_PTR(err); + return libbpf_err_ptr(err); }
return btf_ext; @@ -2693,7 +2697,7 @@ static int btf_ext_reloc_info(const struct btf *btf, existing_len = (*cnt) * record_size; data = realloc(*info, existing_len + records_len); if (!data) - return -ENOMEM; + return libbpf_err(-ENOMEM);
memcpy(data + existing_len, sinfo->data, records_len); /* adjust insn_off only, the rest data will be passed @@ -2703,15 +2707,14 @@ static int btf_ext_reloc_info(const struct btf *btf, __u32 *insn_off;
insn_off = data + existing_len + (i * record_size); - *insn_off = *insn_off / sizeof(struct bpf_insn) + - insns_cnt; + *insn_off = *insn_off / sizeof(struct bpf_insn) + insns_cnt; } *info = data; *cnt += sinfo->num_info; return 0; }
- return -ENOENT; + return libbpf_err(-ENOENT); }
int btf_ext__reloc_func_info(const struct btf *btf, @@ -2900,11 +2903,11 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
if (IS_ERR(d)) { pr_debug("btf_dedup_new failed: %ld", PTR_ERR(d)); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (btf_ensure_modifiable(btf)) - return -ENOMEM; + return libbpf_err(-ENOMEM);
err = btf_dedup_prep(d); if (err) { @@ -2944,7 +2947,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
done: btf_dedup_free(d); - return err; + return libbpf_err(err); }
#define BTF_UNPROCESSED_ID ((__u32)-1) @@ -4417,7 +4420,7 @@ struct btf *libbpf_find_kernel_btf(void) char path[PATH_MAX + 1]; struct utsname buf; struct btf *btf; - int i; + int i, err;
uname(&buf);
@@ -4431,17 +4434,16 @@ struct btf *libbpf_find_kernel_btf(void) btf = btf__parse_raw(path); else btf = btf__parse_elf(path, NULL); - - pr_debug("loading kernel BTF '%s': %ld\n", - path, IS_ERR(btf) ? PTR_ERR(btf) : 0); - if (IS_ERR(btf)) + err = libbpf_get_error(btf); + pr_debug("loading kernel BTF '%s': %d\n", path, err); + if (err) continue;
return btf; }
pr_warn("failed to find valid kernel BTF\n"); - return ERR_PTR(-ESRCH); + return libbpf_err_ptr(-ESRCH); }
int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx) diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index fa27b0bbea0e..835d0ef0d67e 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -128,7 +128,7 @@ struct btf_dump *btf_dump__new(const struct btf *btf,
d = calloc(1, sizeof(struct btf_dump)); if (!d) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM);
d->btf = btf; d->btf_ext = btf_ext; @@ -156,7 +156,7 @@ struct btf_dump *btf_dump__new(const struct btf *btf, return d; err: btf_dump__free(d); - return ERR_PTR(err); + return libbpf_err_ptr(err); }
static int btf_dump_resize(struct btf_dump *d) @@ -236,16 +236,16 @@ int btf_dump__dump_type(struct btf_dump *d, __u32 id) int err, i;
if (id > btf__get_nr_types(d->btf)) - return -EINVAL; + return libbpf_err(-EINVAL);
err = btf_dump_resize(d); if (err) - return err; + return libbpf_err(err);
d->emit_queue_cnt = 0; err = btf_dump_order_type(d, id, false); if (err < 0) - return err; + return libbpf_err(err);
for (i = 0; i < d->emit_queue_cnt; i++) btf_dump_emit_type(d, d->emit_queue[i], 0 /*top-level*/); @@ -1075,11 +1075,11 @@ int btf_dump__emit_type_decl(struct btf_dump *d, __u32 id, int lvl, err;
if (!OPTS_VALID(opts, btf_dump_emit_type_decl_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
err = btf_dump_resize(d); if (err) - return -EINVAL; + return libbpf_err(err);
fname = OPTS_GET(opts, field_name, ""); lvl = OPTS_GET(opts, indent_level, 0); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 416fcb161ba8..0e659713c5c7 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2579,16 +2579,14 @@ static int bpf_object__init_btf(struct bpf_object *obj,
if (btf_data) { obj->btf = btf__new(btf_data->d_buf, btf_data->d_size); - if (IS_ERR(obj->btf)) { - err = PTR_ERR(obj->btf); + err = libbpf_get_error(obj->btf); + if (err) { obj->btf = NULL; - pr_warn("Error loading ELF section %s: %d.\n", - BTF_ELF_SEC, err); + pr_warn("Error loading ELF section %s: %d.\n", BTF_ELF_SEC, err); goto out; } /* enforce 8-byte pointers for BPF-targeted BTFs */ btf__set_pointer_size(obj->btf, 8); - err = 0; } if (btf_ext_data) { if (!obj->btf) { @@ -2596,11 +2594,11 @@ static int bpf_object__init_btf(struct bpf_object *obj, BTF_EXT_ELF_SEC, BTF_ELF_SEC); goto out; } - obj->btf_ext = btf_ext__new(btf_ext_data->d_buf, - btf_ext_data->d_size); - if (IS_ERR(obj->btf_ext)) { - pr_warn("Error loading ELF section %s: %ld. Ignored and continue.\n", - BTF_EXT_ELF_SEC, PTR_ERR(obj->btf_ext)); + obj->btf_ext = btf_ext__new(btf_ext_data->d_buf, btf_ext_data->d_size); + err = libbpf_get_error(obj->btf_ext); + if (err) { + pr_warn("Error loading ELF section %s: %d. Ignored and continue.\n", + BTF_EXT_ELF_SEC, err); obj->btf_ext = NULL; goto out; } @@ -2685,8 +2683,8 @@ static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force) return 0;
obj->btf_vmlinux = libbpf_find_kernel_btf(); - if (IS_ERR(obj->btf_vmlinux)) { - err = PTR_ERR(obj->btf_vmlinux); + err = libbpf_get_error(obj->btf_vmlinux); + if (err) { pr_warn("Error loading vmlinux BTF: %d\n", err); obj->btf_vmlinux = NULL; return err; @@ -2752,8 +2750,9 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) /* clone BTF to sanitize a copy and leave the original intact */ raw_data = btf__get_raw_data(obj->btf, &sz); kern_btf = btf__new(raw_data, sz); - if (IS_ERR(kern_btf)) - return PTR_ERR(kern_btf); + err = libbpf_get_error(kern_btf); + if (err) + return err;
/* enforce 8-byte pointers for BPF-targeted BTFs */ btf__set_pointer_size(obj->btf, 8); @@ -3527,7 +3526,7 @@ bpf_object__find_program_by_title(const struct bpf_object *obj, if (pos->sec_name && !strcmp(pos->sec_name, title)) return pos; } - return NULL; + return errno = ENOENT, NULL; }
static bool prog_is_subprog(const struct bpf_object *obj, @@ -3560,7 +3559,7 @@ bpf_object__find_program_by_name(const struct bpf_object *obj, if (!strcmp(prog->name, name)) return prog; } - return NULL; + return errno = ENOENT, NULL; }
static bool bpf_object__shndx_is_data(const struct bpf_object *obj, @@ -3945,11 +3944,11 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd) if (err && errno == EINVAL) err = bpf_get_map_info_from_fdinfo(fd, &info); if (err) - return err; + return libbpf_err(err);
new_name = strdup(info.name); if (!new_name) - return -errno; + return libbpf_err(-errno);
new_fd = open("/", O_RDONLY | O_CLOEXEC); if (new_fd < 0) { @@ -3987,7 +3986,7 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd) close(new_fd); err_free_new_name: free(new_name); - return err; + return libbpf_err(err); }
__u32 bpf_map__max_entries(const struct bpf_map *map) @@ -3998,7 +3997,7 @@ __u32 bpf_map__max_entries(const struct bpf_map *map) struct bpf_map *bpf_map__inner_map(struct bpf_map *map) { if (!bpf_map_type__is_map_in_map(map->def.type)) - return NULL; + return errno = EINVAL, NULL;
return map->inner_map; } @@ -4006,7 +4005,7 @@ struct bpf_map *bpf_map__inner_map(struct bpf_map *map) int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->def.max_entries = max_entries; return 0; } @@ -4014,7 +4013,7 @@ int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries) int bpf_map__resize(struct bpf_map *map, __u32 max_entries) { if (!map || !max_entries) - return -EINVAL; + return libbpf_err(-EINVAL);
return bpf_map__set_max_entries(map, max_entries); } @@ -5160,10 +5159,10 @@ static int load_module_btfs(struct bpf_object *obj) }
btf = btf_get_from_fd(fd, obj->btf_vmlinux); - if (IS_ERR(btf)) { - pr_warn("failed to load module [%s]'s BTF object #%d: %ld\n", - name, id, PTR_ERR(btf)); - err = PTR_ERR(btf); + err = libbpf_get_error(btf); + if (err) { + pr_warn("failed to load module [%s]'s BTF object #%d: %d\n", + name, id, err); goto err_out; }
@@ -6423,8 +6422,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
if (targ_btf_path) { obj->btf_vmlinux_override = btf__parse(targ_btf_path, NULL); - if (IS_ERR_OR_NULL(obj->btf_vmlinux_override)) { - err = PTR_ERR(obj->btf_vmlinux_override); + err = libbpf_get_error(obj->btf_vmlinux_override); + if (err) { pr_warn("failed to parse target BTF: %d\n", err); return err; } @@ -7481,7 +7480,7 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
if (prog->obj->loaded) { pr_warn("prog '%s': can't load after object was loaded\n", prog->name); - return -EINVAL; + return libbpf_err(-EINVAL); }
if ((prog->type == BPF_PROG_TYPE_TRACING || @@ -7492,7 +7491,7 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
err = libbpf_find_attach_btf_id(prog, &btf_obj_fd, &btf_type_id); if (err) - return err; + return libbpf_err(err);
prog->attach_btf_obj_fd = btf_obj_fd; prog->attach_btf_id = btf_type_id; @@ -7502,13 +7501,13 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) if (prog->preprocessor) { pr_warn("Internal error: can't load program '%s'\n", prog->name); - return -LIBBPF_ERRNO__INTERNAL; + return libbpf_err(-LIBBPF_ERRNO__INTERNAL); }
prog->instances.fds = malloc(sizeof(int)); if (!prog->instances.fds) { pr_warn("Not enough memory for BPF fds\n"); - return -ENOMEM; + return libbpf_err(-ENOMEM); } prog->instances.nr = 1; prog->instances.fds[0] = -1; @@ -7567,7 +7566,7 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) pr_warn("failed to load program '%s'\n", prog->name); zfree(&prog->insns); prog->insns_cnt = 0; - return err; + return libbpf_err(err); }
static int @@ -7702,7 +7701,7 @@ __bpf_object__open_xattr(struct bpf_object_open_attr *attr, int flags)
struct bpf_object *bpf_object__open_xattr(struct bpf_object_open_attr *attr) { - return __bpf_object__open_xattr(attr, 0); + return libbpf_ptr(__bpf_object__open_xattr(attr, 0)); }
struct bpf_object *bpf_object__open(const char *path) @@ -7712,18 +7711,18 @@ struct bpf_object *bpf_object__open(const char *path) .prog_type = BPF_PROG_TYPE_UNSPEC, };
- return bpf_object__open_xattr(&attr); + return libbpf_ptr(__bpf_object__open_xattr(&attr, 0)); }
struct bpf_object * bpf_object__open_file(const char *path, const struct bpf_object_open_opts *opts) { if (!path) - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL);
pr_debug("loading %s\n", path);
- return __bpf_object__open(path, NULL, 0, opts); + return libbpf_ptr(__bpf_object__open(path, NULL, 0, opts)); }
struct bpf_object * @@ -7731,9 +7730,9 @@ bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz, const struct bpf_object_open_opts *opts) { if (!obj_buf || obj_buf_sz == 0) - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL);
- return __bpf_object__open(NULL, obj_buf, obj_buf_sz, opts); + return libbpf_ptr(__bpf_object__open(NULL, obj_buf, obj_buf_sz, opts)); }
struct bpf_object * @@ -7748,9 +7747,9 @@ bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz,
/* returning NULL is wrong, but backwards-compatible */ if (!obj_buf || obj_buf_sz == 0) - return NULL; + return errno = EINVAL, NULL;
- return bpf_object__open_mem(obj_buf, obj_buf_sz, &opts); + return libbpf_ptr(__bpf_object__open(NULL, obj_buf, obj_buf_sz, &opts)); }
int bpf_object__unload(struct bpf_object *obj) @@ -7758,7 +7757,7 @@ int bpf_object__unload(struct bpf_object *obj) size_t i;
if (!obj) - return -EINVAL; + return libbpf_err(-EINVAL);
for (i = 0; i < obj->nr_maps; i++) { zclose(obj->maps[i].fd); @@ -8091,14 +8090,14 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) int err, i;
if (!attr) - return -EINVAL; + return libbpf_err(-EINVAL); obj = attr->obj; if (!obj) - return -EINVAL; + return libbpf_err(-EINVAL);
if (obj->loaded) { pr_warn("object '%s': load can't be attempted twice\n", obj->name); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (obj->gen_loader) @@ -8149,7 +8148,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
bpf_object__unload(obj); pr_warn("failed to load object '%s'\n", obj->path); - return err; + return libbpf_err(err); }
int bpf_object__load(struct bpf_object *obj) @@ -8221,28 +8220,28 @@ int bpf_program__pin_instance(struct bpf_program *prog, const char *path,
err = make_parent_dir(path); if (err) - return err; + return libbpf_err(err);
err = check_path(path); if (err) - return err; + return libbpf_err(err);
if (prog == NULL) { pr_warn("invalid program pointer\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (instance < 0 || instance >= prog->instances.nr) { pr_warn("invalid prog instance %d of prog %s (max %d)\n", instance, prog->name, prog->instances.nr); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (bpf_obj_pin(prog->instances.fds[instance], path)) { err = -errno; cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg)); pr_warn("failed to pin program: %s\n", cp); - return err; + return libbpf_err(err); } pr_debug("pinned program '%s'\n", path);
@@ -8256,22 +8255,23 @@ int bpf_program__unpin_instance(struct bpf_program *prog, const char *path,
err = check_path(path); if (err) - return err; + return libbpf_err(err);
if (prog == NULL) { pr_warn("invalid program pointer\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (instance < 0 || instance >= prog->instances.nr) { pr_warn("invalid prog instance %d of prog %s (max %d)\n", instance, prog->name, prog->instances.nr); - return -EINVAL; + return libbpf_err(-EINVAL); }
err = unlink(path); if (err != 0) - return -errno; + return libbpf_err(-errno); + pr_debug("unpinned program '%s'\n", path);
return 0; @@ -8283,20 +8283,20 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
err = make_parent_dir(path); if (err) - return err; + return libbpf_err(err);
err = check_path(path); if (err) - return err; + return libbpf_err(err);
if (prog == NULL) { pr_warn("invalid program pointer\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (prog->instances.nr <= 0) { pr_warn("no instances of prog %s to pin\n", prog->name); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (prog->instances.nr == 1) { @@ -8340,7 +8340,7 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
rmdir(path);
- return err; + return libbpf_err(err); }
int bpf_program__unpin(struct bpf_program *prog, const char *path) @@ -8349,16 +8349,16 @@ int bpf_program__unpin(struct bpf_program *prog, const char *path)
err = check_path(path); if (err) - return err; + return libbpf_err(err);
if (prog == NULL) { pr_warn("invalid program pointer\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (prog->instances.nr <= 0) { pr_warn("no instances of prog %s to pin\n", prog->name); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (prog->instances.nr == 1) { @@ -8372,9 +8372,9 @@ int bpf_program__unpin(struct bpf_program *prog, const char *path)
len = snprintf(buf, PATH_MAX, "%s/%d", path, i); if (len < 0) - return -EINVAL; + return libbpf_err(-EINVAL); else if (len >= PATH_MAX) - return -ENAMETOOLONG; + return libbpf_err(-ENAMETOOLONG);
err = bpf_program__unpin_instance(prog, buf, i); if (err) @@ -8383,7 +8383,7 @@ int bpf_program__unpin(struct bpf_program *prog, const char *path)
err = rmdir(path); if (err) - return -errno; + return libbpf_err(-errno);
return 0; } @@ -8395,14 +8395,14 @@ int bpf_map__pin(struct bpf_map *map, const char *path)
if (map == NULL) { pr_warn("invalid map pointer\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (map->pin_path) { if (path && strcmp(path, map->pin_path)) { pr_warn("map '%s' already has pin path '%s' different from '%s'\n", bpf_map__name(map), map->pin_path, path); - return -EINVAL; + return libbpf_err(-EINVAL); } else if (map->pinned) { pr_debug("map '%s' already pinned at '%s'; not re-pinning\n", bpf_map__name(map), map->pin_path); @@ -8412,10 +8412,10 @@ int bpf_map__pin(struct bpf_map *map, const char *path) if (!path) { pr_warn("missing a path to pin map '%s' at\n", bpf_map__name(map)); - return -EINVAL; + return libbpf_err(-EINVAL); } else if (map->pinned) { pr_warn("map '%s' already pinned\n", bpf_map__name(map)); - return -EEXIST; + return libbpf_err(-EEXIST); }
map->pin_path = strdup(path); @@ -8427,11 +8427,11 @@ int bpf_map__pin(struct bpf_map *map, const char *path)
err = make_parent_dir(map->pin_path); if (err) - return err; + return libbpf_err(err);
err = check_path(map->pin_path); if (err) - return err; + return libbpf_err(err);
if (bpf_obj_pin(map->fd, map->pin_path)) { err = -errno; @@ -8446,7 +8446,7 @@ int bpf_map__pin(struct bpf_map *map, const char *path) out_err: cp = libbpf_strerror_r(-err, errmsg, sizeof(errmsg)); pr_warn("failed to pin map: %s\n", cp); - return err; + return libbpf_err(err); }
int bpf_map__unpin(struct bpf_map *map, const char *path) @@ -8455,29 +8455,29 @@ int bpf_map__unpin(struct bpf_map *map, const char *path)
if (map == NULL) { pr_warn("invalid map pointer\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
if (map->pin_path) { if (path && strcmp(path, map->pin_path)) { pr_warn("map '%s' already has pin path '%s' different from '%s'\n", bpf_map__name(map), map->pin_path, path); - return -EINVAL; + return libbpf_err(-EINVAL); } path = map->pin_path; } else if (!path) { pr_warn("no path to unpin map '%s' from\n", bpf_map__name(map)); - return -EINVAL; + return libbpf_err(-EINVAL); }
err = check_path(path); if (err) - return err; + return libbpf_err(err);
err = unlink(path); if (err != 0) - return -errno; + return libbpf_err(-errno);
map->pinned = false; pr_debug("unpinned map '%s' from '%s'\n", bpf_map__name(map), path); @@ -8492,7 +8492,7 @@ int bpf_map__set_pin_path(struct bpf_map *map, const char *path) if (path) { new = strdup(path); if (!new) - return -errno; + return libbpf_err(-errno); }
free(map->pin_path); @@ -8526,11 +8526,11 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path) int err;
if (!obj) - return -ENOENT; + return libbpf_err(-ENOENT);
if (!obj->loaded) { pr_warn("object not yet loaded; load it first\n"); - return -ENOENT; + return libbpf_err(-ENOENT); }
bpf_object__for_each_map(map, obj) { @@ -8570,7 +8570,7 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path) bpf_map__unpin(map, NULL); }
- return err; + return libbpf_err(err); }
int bpf_object__unpin_maps(struct bpf_object *obj, const char *path) @@ -8579,7 +8579,7 @@ int bpf_object__unpin_maps(struct bpf_object *obj, const char *path) int err;
if (!obj) - return -ENOENT; + return libbpf_err(-ENOENT);
bpf_object__for_each_map(map, obj) { char *pin_path = NULL; @@ -8591,9 +8591,9 @@ int bpf_object__unpin_maps(struct bpf_object *obj, const char *path) len = snprintf(buf, PATH_MAX, "%s/%s", path, bpf_map__name(map)); if (len < 0) - return -EINVAL; + return libbpf_err(-EINVAL); else if (len >= PATH_MAX) - return -ENAMETOOLONG; + return libbpf_err(-ENAMETOOLONG); sanitize_pin_path(buf); pin_path = buf; } else if (!map->pin_path) { @@ -8602,7 +8602,7 @@ int bpf_object__unpin_maps(struct bpf_object *obj, const char *path)
err = bpf_map__unpin(map, pin_path); if (err) - return err; + return libbpf_err(err); }
return 0; @@ -8614,11 +8614,11 @@ int bpf_object__pin_programs(struct bpf_object *obj, const char *path) int err;
if (!obj) - return -ENOENT; + return libbpf_err(-ENOENT);
if (!obj->loaded) { pr_warn("object not yet loaded; load it first\n"); - return -ENOENT; + return libbpf_err(-ENOENT); }
bpf_object__for_each_program(prog, obj) { @@ -8657,7 +8657,7 @@ int bpf_object__pin_programs(struct bpf_object *obj, const char *path) bpf_program__unpin(prog, buf); }
- return err; + return libbpf_err(err); }
int bpf_object__unpin_programs(struct bpf_object *obj, const char *path) @@ -8666,7 +8666,7 @@ int bpf_object__unpin_programs(struct bpf_object *obj, const char *path) int err;
if (!obj) - return -ENOENT; + return libbpf_err(-ENOENT);
bpf_object__for_each_program(prog, obj) { char buf[PATH_MAX]; @@ -8675,13 +8675,13 @@ int bpf_object__unpin_programs(struct bpf_object *obj, const char *path) len = snprintf(buf, PATH_MAX, "%s/%s", path, prog->pin_name); if (len < 0) - return -EINVAL; + return libbpf_err(-EINVAL); else if (len >= PATH_MAX) - return -ENAMETOOLONG; + return libbpf_err(-ENAMETOOLONG);
err = bpf_program__unpin(prog, buf); if (err) - return err; + return libbpf_err(err); }
return 0; @@ -8693,12 +8693,12 @@ int bpf_object__pin(struct bpf_object *obj, const char *path)
err = bpf_object__pin_maps(obj, path); if (err) - return err; + return libbpf_err(err);
err = bpf_object__pin_programs(obj, path); if (err) { bpf_object__unpin_maps(obj, path); - return err; + return libbpf_err(err); }
return 0; @@ -8795,7 +8795,7 @@ bpf_object__next(struct bpf_object *prev)
const char *bpf_object__name(const struct bpf_object *obj) { - return obj ? obj->name : ERR_PTR(-EINVAL); + return obj ? obj->name : libbpf_err_ptr(-EINVAL); }
unsigned int bpf_object__kversion(const struct bpf_object *obj) @@ -8816,7 +8816,7 @@ int bpf_object__btf_fd(const struct bpf_object *obj) int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_version) { if (obj->loaded) - return -EINVAL; + return libbpf_err(-EINVAL);
obj->kern_version = kern_version;
@@ -8836,7 +8836,7 @@ int bpf_object__set_priv(struct bpf_object *obj, void *priv,
void *bpf_object__priv(const struct bpf_object *obj) { - return obj ? obj->priv : ERR_PTR(-EINVAL); + return obj ? obj->priv : libbpf_err_ptr(-EINVAL); }
int bpf_object__gen_loader(struct bpf_object *obj, struct gen_loader_opts *opts) @@ -8872,7 +8872,7 @@ __bpf_program__iter(const struct bpf_program *p, const struct bpf_object *obj,
if (p->obj != obj) { pr_warn("error: program handler doesn't match object\n"); - return NULL; + return errno = EINVAL, NULL; }
idx = (p - obj->programs) + (forward ? 1 : -1); @@ -8918,7 +8918,7 @@ int bpf_program__set_priv(struct bpf_program *prog, void *priv,
void *bpf_program__priv(const struct bpf_program *prog) { - return prog ? prog->priv : ERR_PTR(-EINVAL); + return prog ? prog->priv : libbpf_err_ptr(-EINVAL); }
void bpf_program__set_ifindex(struct bpf_program *prog, __u32 ifindex) @@ -8945,7 +8945,7 @@ const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy) title = strdup(title); if (!title) { pr_warn("failed to strdup program title\n"); - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); } }
@@ -8960,7 +8960,7 @@ bool bpf_program__autoload(const struct bpf_program *prog) int bpf_program__set_autoload(struct bpf_program *prog, bool autoload) { if (prog->obj->loaded) - return -EINVAL; + return libbpf_err(-EINVAL);
prog->load = autoload; return 0; @@ -8982,17 +8982,17 @@ int bpf_program__set_prep(struct bpf_program *prog, int nr_instances, int *instances_fds;
if (nr_instances <= 0 || !prep) - return -EINVAL; + return libbpf_err(-EINVAL);
if (prog->instances.nr > 0 || prog->instances.fds) { pr_warn("Can't set pre-processor after loading\n"); - return -EINVAL; + return libbpf_err(-EINVAL); }
instances_fds = malloc(sizeof(int) * nr_instances); if (!instances_fds) { pr_warn("alloc memory failed for fds\n"); - return -ENOMEM; + return libbpf_err(-ENOMEM); }
/* fill all fd with -1 */ @@ -9009,19 +9009,19 @@ int bpf_program__nth_fd(const struct bpf_program *prog, int n) int fd;
if (!prog) - return -EINVAL; + return libbpf_err(-EINVAL);
if (n >= prog->instances.nr || n < 0) { pr_warn("Can't get the %dth fd from program %s: only %d instances\n", n, prog->name, prog->instances.nr); - return -EINVAL; + return libbpf_err(-EINVAL); }
fd = prog->instances.fds[n]; if (fd < 0) { pr_warn("%dth instance of program '%s' is invalid\n", n, prog->name); - return -ENOENT; + return libbpf_err(-ENOENT); }
return fd; @@ -9047,7 +9047,7 @@ static bool bpf_program__is_type(const struct bpf_program *prog, int bpf_program__set_##NAME(struct bpf_program *prog) \ { \ if (!prog) \ - return -EINVAL; \ + return libbpf_err(-EINVAL); \ bpf_program__set_type(prog, TYPE); \ return 0; \ } \ @@ -9342,7 +9342,7 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type, char *type_names;
if (!name) - return -EINVAL; + return libbpf_err(-EINVAL);
sec_def = find_sec_def(name); if (sec_def) { @@ -9358,7 +9358,7 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type, free(type_names); }
- return -ESRCH; + return libbpf_err(-ESRCH); }
static struct bpf_map *find_struct_ops_map_by_offset(struct bpf_object *obj, @@ -9561,9 +9561,10 @@ int libbpf_find_vmlinux_btf_id(const char *name, int err;
btf = libbpf_find_kernel_btf(); - if (IS_ERR(btf)) { + err = libbpf_get_error(btf); + if (err) { pr_warn("vmlinux BTF is not found\n"); - return -EINVAL; + return libbpf_err(err); }
err = find_attach_btf_id(btf, name, attach_type); @@ -9571,7 +9572,7 @@ int libbpf_find_vmlinux_btf_id(const char *name, pr_warn("%s is not found in vmlinux BTF\n", name);
btf__free(btf); - return err; + return libbpf_err(err); }
static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) @@ -9582,10 +9583,11 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) int err = -EINVAL;
info_linear = bpf_program__get_prog_info_linear(attach_prog_fd, 0); - if (IS_ERR_OR_NULL(info_linear)) { + err = libbpf_get_error(info_linear); + if (err) { pr_warn("failed get_prog_info_linear for FD %d\n", attach_prog_fd); - return -EINVAL; + return err; } info = &info_linear->info; if (!info->btf_id) { @@ -9706,13 +9708,13 @@ int libbpf_attach_type_by_name(const char *name, int i;
if (!name) - return -EINVAL; + return libbpf_err(-EINVAL);
for (i = 0; i < ARRAY_SIZE(section_defs); i++) { if (strncmp(name, section_defs[i].sec, section_defs[i].len)) continue; if (!section_defs[i].is_attachable) - return -EINVAL; + return libbpf_err(-EINVAL); *attach_type = section_defs[i].expected_attach_type; return 0; } @@ -9723,17 +9725,17 @@ int libbpf_attach_type_by_name(const char *name, free(type_names); }
- return -EINVAL; + return libbpf_err(-EINVAL); }
int bpf_map__fd(const struct bpf_map *map) { - return map ? map->fd : -EINVAL; + return map ? map->fd : libbpf_err(-EINVAL); }
const struct bpf_map_def *bpf_map__def(const struct bpf_map *map) { - return map ? &map->def : ERR_PTR(-EINVAL); + return map ? &map->def : libbpf_err_ptr(-EINVAL); }
const char *bpf_map__name(const struct bpf_map *map) @@ -9749,7 +9751,7 @@ enum bpf_map_type bpf_map__type(const struct bpf_map *map) int bpf_map__set_type(struct bpf_map *map, enum bpf_map_type type) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->def.type = type; return 0; } @@ -9762,7 +9764,7 @@ __u32 bpf_map__map_flags(const struct bpf_map *map) int bpf_map__set_map_flags(struct bpf_map *map, __u32 flags) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->def.map_flags = flags; return 0; } @@ -9775,7 +9777,7 @@ __u32 bpf_map__numa_node(const struct bpf_map *map) int bpf_map__set_numa_node(struct bpf_map *map, __u32 numa_node) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->numa_node = numa_node; return 0; } @@ -9788,7 +9790,7 @@ __u32 bpf_map__key_size(const struct bpf_map *map) int bpf_map__set_key_size(struct bpf_map *map, __u32 size) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->def.key_size = size; return 0; } @@ -9801,7 +9803,7 @@ __u32 bpf_map__value_size(const struct bpf_map *map) int bpf_map__set_value_size(struct bpf_map *map, __u32 size) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->def.value_size = size; return 0; } @@ -9820,7 +9822,7 @@ int bpf_map__set_priv(struct bpf_map *map, void *priv, bpf_map_clear_priv_t clear_priv) { if (!map) - return -EINVAL; + return libbpf_err(-EINVAL);
if (map->priv) { if (map->clear_priv) @@ -9834,7 +9836,7 @@ int bpf_map__set_priv(struct bpf_map *map, void *priv,
void *bpf_map__priv(const struct bpf_map *map) { - return map ? map->priv : ERR_PTR(-EINVAL); + return map ? map->priv : libbpf_err_ptr(-EINVAL); }
int bpf_map__set_initial_value(struct bpf_map *map, @@ -9842,7 +9844,7 @@ int bpf_map__set_initial_value(struct bpf_map *map, { if (!map->mmaped || map->libbpf_type == LIBBPF_MAP_KCONFIG || size != map->def.value_size || map->fd >= 0) - return -EINVAL; + return libbpf_err(-EINVAL);
memcpy(map->mmaped, data, size); return 0; @@ -9874,7 +9876,7 @@ __u32 bpf_map__ifindex(const struct bpf_map *map) int bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex) { if (map->fd >= 0) - return -EBUSY; + return libbpf_err(-EBUSY); map->map_ifindex = ifindex; return 0; } @@ -9883,11 +9885,11 @@ int bpf_map__set_inner_map_fd(struct bpf_map *map, int fd) { if (!bpf_map_type__is_map_in_map(map->def.type)) { pr_warn("error: unsupported map type\n"); - return -EINVAL; + return libbpf_err(-EINVAL); } if (map->inner_map_fd != -1) { pr_warn("error: inner_map_fd already specified\n"); - return -EINVAL; + return libbpf_err(-EINVAL); } zfree(&map->inner_map); map->inner_map_fd = fd; @@ -9901,7 +9903,7 @@ __bpf_map__iter(const struct bpf_map *m, const struct bpf_object *obj, int i) struct bpf_map *s, *e;
if (!obj || !obj->maps) - return NULL; + return errno = EINVAL, NULL;
s = obj->maps; e = obj->maps + obj->nr_maps; @@ -9909,7 +9911,7 @@ __bpf_map__iter(const struct bpf_map *m, const struct bpf_object *obj, int i) if ((m < s) || (m >= e)) { pr_warn("error in %s: map handler doesn't belong to object\n", __func__); - return NULL; + return errno = EINVAL, NULL; }
idx = (m - obj->maps) + i; @@ -9948,7 +9950,7 @@ bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name) if (pos->name && !strcmp(pos->name, name)) return pos; } - return NULL; + return errno = ENOENT, NULL; }
int @@ -9960,12 +9962,23 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name) struct bpf_map * bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset) { - return ERR_PTR(-ENOTSUP); + return libbpf_err_ptr(-ENOTSUP); }
long libbpf_get_error(const void *ptr) { - return PTR_ERR_OR_ZERO(ptr); + if (!IS_ERR_OR_NULL(ptr)) + return 0; + + if (IS_ERR(ptr)) + errno = -PTR_ERR(ptr); + + /* If ptr == NULL, then errno should be already set by the failing + * API, because libbpf never returns NULL on success and it now always + * sets errno on error. So no extra errno handling for ptr == NULL + * case. + */ + return -errno; }
int bpf_prog_load(const char *file, enum bpf_prog_type type, @@ -9991,16 +10004,17 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, int err;
if (!attr) - return -EINVAL; + return libbpf_err(-EINVAL); if (!attr->file) - return -EINVAL; + return libbpf_err(-EINVAL);
open_attr.file = attr->file; open_attr.prog_type = attr->prog_type;
obj = bpf_object__open_xattr(&open_attr); - if (IS_ERR_OR_NULL(obj)) - return -ENOENT; + err = libbpf_get_error(obj); + if (err) + return libbpf_err(-ENOENT);
bpf_object__for_each_program(prog, obj) { enum bpf_attach_type attach_type = attr->expected_attach_type; @@ -10020,7 +10034,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, * didn't provide a fallback type, too bad... */ bpf_object__close(obj); - return -EINVAL; + return libbpf_err(-EINVAL); }
prog->prog_ifindex = attr->ifindex; @@ -10038,13 +10052,13 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, if (!first_prog) { pr_warn("object file doesn't contain bpf program\n"); bpf_object__close(obj); - return -ENOENT; + return libbpf_err(-ENOENT); }
err = bpf_object__load(obj); if (err) { bpf_object__close(obj); - return err; + return libbpf_err(err); }
*pobj = obj; @@ -10063,7 +10077,10 @@ struct bpf_link { /* Replace link's underlying BPF program with the new one */ int bpf_link__update_program(struct bpf_link *link, struct bpf_program *prog) { - return bpf_link_update(bpf_link__fd(link), bpf_program__fd(prog), NULL); + int ret; + + ret = bpf_link_update(bpf_link__fd(link), bpf_program__fd(prog), NULL); + return libbpf_err_errno(ret); }
/* Release "ownership" of underlying BPF resource (typically, BPF program @@ -10096,7 +10113,7 @@ int bpf_link__destroy(struct bpf_link *link) free(link->pin_path); free(link);
- return err; + return libbpf_err(err); }
int bpf_link__fd(const struct bpf_link *link) @@ -10111,7 +10128,7 @@ const char *bpf_link__pin_path(const struct bpf_link *link)
static int bpf_link__detach_fd(struct bpf_link *link) { - return close(link->fd); + return libbpf_err_errno(close(link->fd)); }
struct bpf_link *bpf_link__open(const char *path) @@ -10123,13 +10140,13 @@ struct bpf_link *bpf_link__open(const char *path) if (fd < 0) { fd = -errno; pr_warn("failed to open link at %s: %d\n", path, fd); - return ERR_PTR(fd); + return libbpf_err_ptr(fd); }
link = calloc(1, sizeof(*link)); if (!link) { close(fd); - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); } link->detach = &bpf_link__detach_fd; link->fd = fd; @@ -10137,7 +10154,7 @@ struct bpf_link *bpf_link__open(const char *path) link->pin_path = strdup(path); if (!link->pin_path) { bpf_link__destroy(link); - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); }
return link; @@ -10153,22 +10170,22 @@ int bpf_link__pin(struct bpf_link *link, const char *path) int err;
if (link->pin_path) - return -EBUSY; + return libbpf_err(-EBUSY); err = make_parent_dir(path); if (err) - return err; + return libbpf_err(err); err = check_path(path); if (err) - return err; + return libbpf_err(err);
link->pin_path = strdup(path); if (!link->pin_path) - return -ENOMEM; + return libbpf_err(-ENOMEM);
if (bpf_obj_pin(link->fd, link->pin_path)) { err = -errno; zfree(&link->pin_path); - return err; + return libbpf_err(err); }
pr_debug("link fd=%d: pinned at %s\n", link->fd, link->pin_path); @@ -10180,11 +10197,11 @@ int bpf_link__unpin(struct bpf_link *link) int err;
if (!link->pin_path) - return -EINVAL; + return libbpf_err(-EINVAL);
err = unlink(link->pin_path); if (err != 0) - return -errno; + return libbpf_err_errno(err);
pr_debug("link fd=%d: unpinned from %s\n", link->fd, link->pin_path); zfree(&link->pin_path); @@ -10200,11 +10217,10 @@ static int bpf_link__detach_perf_event(struct bpf_link *link) err = -errno;
close(link->fd); - return err; + return libbpf_err(err); }
-struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog, - int pfd) +struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog, int pfd) { char errmsg[STRERR_BUFSIZE]; struct bpf_link *link; @@ -10213,18 +10229,18 @@ struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog, if (pfd < 0) { pr_warn("prog '%s': invalid perf event FD %d\n", prog->name, pfd); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); } prog_fd = bpf_program__fd(prog); if (prog_fd < 0) { pr_warn("prog '%s': can't attach BPF program w/o FD (did you load it?)\n", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
link = calloc(1, sizeof(*link)); if (!link) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); link->detach = &bpf_link__detach_perf_event; link->fd = pfd;
@@ -10236,14 +10252,14 @@ struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog, if (err == -EPROTO) pr_warn("prog '%s': try add PERF_SAMPLE_CALLCHAIN to or remove exclude_callchain_[kernel|user] from pfd %d\n", prog->name, pfd); - return ERR_PTR(err); + return libbpf_err_ptr(err); } if (ioctl(pfd, PERF_EVENT_IOC_ENABLE, 0) < 0) { err = -errno; free(link); pr_warn("prog '%s': failed to enable pfd %d: %s\n", prog->name, pfd, libbpf_strerror_r(err, errmsg, sizeof(errmsg))); - return ERR_PTR(err); + return libbpf_err_ptr(err); } return link; } @@ -10367,16 +10383,16 @@ struct bpf_link *bpf_program__attach_kprobe(struct bpf_program *prog, pr_warn("prog '%s': failed to create %s '%s' perf event: %s\n", prog->name, retprobe ? "kretprobe" : "kprobe", func_name, libbpf_strerror_r(pfd, errmsg, sizeof(errmsg))); - return ERR_PTR(pfd); + return libbpf_err_ptr(pfd); } link = bpf_program__attach_perf_event(prog, pfd); - if (IS_ERR(link)) { + err = libbpf_get_error(link); + if (err) { close(pfd); - err = PTR_ERR(link); pr_warn("prog '%s': failed to attach to %s '%s': %s\n", prog->name, retprobe ? "kretprobe" : "kprobe", func_name, libbpf_strerror_r(err, errmsg, sizeof(errmsg))); - return link; + return libbpf_err_ptr(err); } return link; } @@ -10409,17 +10425,17 @@ struct bpf_link *bpf_program__attach_uprobe(struct bpf_program *prog, prog->name, retprobe ? "uretprobe" : "uprobe", binary_path, func_offset, libbpf_strerror_r(pfd, errmsg, sizeof(errmsg))); - return ERR_PTR(pfd); + return libbpf_err_ptr(pfd); } link = bpf_program__attach_perf_event(prog, pfd); - if (IS_ERR(link)) { + err = libbpf_get_error(link); + if (err) { close(pfd); - err = PTR_ERR(link); pr_warn("prog '%s': failed to attach to %s '%s:0x%zx': %s\n", prog->name, retprobe ? "uretprobe" : "uprobe", binary_path, func_offset, libbpf_strerror_r(err, errmsg, sizeof(errmsg))); - return link; + return libbpf_err_ptr(err); } return link; } @@ -10487,16 +10503,16 @@ struct bpf_link *bpf_program__attach_tracepoint(struct bpf_program *prog, pr_warn("prog '%s': failed to create tracepoint '%s/%s' perf event: %s\n", prog->name, tp_category, tp_name, libbpf_strerror_r(pfd, errmsg, sizeof(errmsg))); - return ERR_PTR(pfd); + return libbpf_err_ptr(pfd); } link = bpf_program__attach_perf_event(prog, pfd); - if (IS_ERR(link)) { + err = libbpf_get_error(link); + if (err) { close(pfd); - err = PTR_ERR(link); pr_warn("prog '%s': failed to attach to tracepoint '%s/%s': %s\n", prog->name, tp_category, tp_name, libbpf_strerror_r(err, errmsg, sizeof(errmsg))); - return link; + return libbpf_err_ptr(err); } return link; } @@ -10509,20 +10525,19 @@ static struct bpf_link *attach_tp(const struct bpf_sec_def *sec,
sec_name = strdup(prog->sec_name); if (!sec_name) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM);
/* extract "tp/<category>/<name>" */ tp_cat = sec_name + sec->len; tp_name = strchr(tp_cat, '/'); if (!tp_name) { - link = ERR_PTR(-EINVAL); - goto out; + free(sec_name); + return libbpf_err_ptr(-EINVAL); } *tp_name = '\0'; tp_name++;
link = bpf_program__attach_tracepoint(prog, tp_cat, tp_name); -out: free(sec_name); return link; } @@ -10537,12 +10552,12 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog, prog_fd = bpf_program__fd(prog); if (prog_fd < 0) { pr_warn("prog '%s': can't attach before loaded\n", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
link = calloc(1, sizeof(*link)); if (!link) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); link->detach = &bpf_link__detach_fd;
pfd = bpf_raw_tracepoint_open(tp_name, prog_fd); @@ -10551,7 +10566,7 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog, free(link); pr_warn("prog '%s': failed to attach to raw tracepoint '%s': %s\n", prog->name, tp_name, libbpf_strerror_r(pfd, errmsg, sizeof(errmsg))); - return ERR_PTR(pfd); + return libbpf_err_ptr(pfd); } link->fd = pfd; return link; @@ -10575,12 +10590,12 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog) prog_fd = bpf_program__fd(prog); if (prog_fd < 0) { pr_warn("prog '%s': can't attach before loaded\n", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
link = calloc(1, sizeof(*link)); if (!link) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); link->detach = &bpf_link__detach_fd;
pfd = bpf_raw_tracepoint_open(NULL, prog_fd); @@ -10589,7 +10604,7 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog) free(link); pr_warn("prog '%s': failed to attach: %s\n", prog->name, libbpf_strerror_r(pfd, errmsg, sizeof(errmsg))); - return ERR_PTR(pfd); + return libbpf_err_ptr(pfd); } link->fd = pfd; return (struct bpf_link *)link; @@ -10628,12 +10643,6 @@ static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec, return bpf_program__attach_lsm(prog); }
-static struct bpf_link *attach_iter(const struct bpf_sec_def *sec, - struct bpf_program *prog) -{ - return bpf_program__attach_iter(prog, NULL); -} - static struct bpf_link * bpf_program__attach_fd(struct bpf_program *prog, int target_fd, int btf_id, const char *target_name) @@ -10648,12 +10657,12 @@ bpf_program__attach_fd(struct bpf_program *prog, int target_fd, int btf_id, prog_fd = bpf_program__fd(prog); if (prog_fd < 0) { pr_warn("prog '%s': can't attach before loaded\n", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
link = calloc(1, sizeof(*link)); if (!link) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); link->detach = &bpf_link__detach_fd;
attach_type = bpf_program__get_expected_attach_type(prog); @@ -10664,7 +10673,7 @@ bpf_program__attach_fd(struct bpf_program *prog, int target_fd, int btf_id, pr_warn("prog '%s': failed to attach to %s: %s\n", prog->name, target_name, libbpf_strerror_r(link_fd, errmsg, sizeof(errmsg))); - return ERR_PTR(link_fd); + return libbpf_err_ptr(link_fd); } link->fd = link_fd; return link; @@ -10697,19 +10706,19 @@ struct bpf_link *bpf_program__attach_freplace(struct bpf_program *prog, if (!!target_fd != !!attach_func_name) { pr_warn("prog '%s': supply none or both of target_fd and attach_func_name\n", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
if (prog->type != BPF_PROG_TYPE_EXT) { pr_warn("prog '%s': only BPF_PROG_TYPE_EXT can attach as freplace", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
if (target_fd) { btf_id = libbpf_find_prog_btf_id(attach_func_name, target_fd); if (btf_id < 0) - return ERR_PTR(btf_id); + return libbpf_err_ptr(btf_id);
return bpf_program__attach_fd(prog, target_fd, btf_id, "freplace"); } else { @@ -10731,7 +10740,7 @@ bpf_program__attach_iter(struct bpf_program *prog, __u32 target_fd = 0;
if (!OPTS_VALID(opts, bpf_iter_attach_opts)) - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL);
link_create_opts.iter_info = OPTS_GET(opts, link_info, (void *)0); link_create_opts.iter_info_len = OPTS_GET(opts, link_info_len, 0); @@ -10739,12 +10748,12 @@ bpf_program__attach_iter(struct bpf_program *prog, prog_fd = bpf_program__fd(prog); if (prog_fd < 0) { pr_warn("prog '%s': can't attach before loaded\n", prog->name); - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL); }
link = calloc(1, sizeof(*link)); if (!link) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM); link->detach = &bpf_link__detach_fd;
link_fd = bpf_link_create(prog_fd, target_fd, BPF_TRACE_ITER, @@ -10754,19 +10763,25 @@ bpf_program__attach_iter(struct bpf_program *prog, free(link); pr_warn("prog '%s': failed to attach to iterator: %s\n", prog->name, libbpf_strerror_r(link_fd, errmsg, sizeof(errmsg))); - return ERR_PTR(link_fd); + return libbpf_err_ptr(link_fd); } link->fd = link_fd; return link; }
+static struct bpf_link *attach_iter(const struct bpf_sec_def *sec, + struct bpf_program *prog) +{ + return bpf_program__attach_iter(prog, NULL); +} + struct bpf_link *bpf_program__attach(struct bpf_program *prog) { const struct bpf_sec_def *sec_def;
sec_def = find_sec_def(prog->sec_name); if (!sec_def || !sec_def->attach_fn) - return ERR_PTR(-ESRCH); + return libbpf_err_ptr(-ESRCH);
return sec_def->attach_fn(sec_def, prog); } @@ -10789,11 +10804,11 @@ struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map) int err;
if (!bpf_map__is_struct_ops(map) || map->fd == -1) - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL);
link = calloc(1, sizeof(*link)); if (!link) - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL);
st_ops = map->st_ops; for (i = 0; i < btf_vlen(st_ops->type); i++) { @@ -10813,7 +10828,7 @@ struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map) if (err) { err = -errno; free(link); - return ERR_PTR(err); + return libbpf_err_ptr(err); }
link->detach = bpf_link__detach_struct_ops; @@ -10867,7 +10882,7 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size, }
ring_buffer_write_tail(header, data_tail); - return ret; + return libbpf_err(ret); }
struct perf_buffer; @@ -11020,7 +11035,7 @@ struct perf_buffer *perf_buffer__new(int map_fd, size_t page_cnt, p.lost_cb = opts ? opts->lost_cb : NULL; p.ctx = opts ? opts->ctx : NULL;
- return __perf_buffer__new(map_fd, page_cnt, &p); + return libbpf_ptr(__perf_buffer__new(map_fd, page_cnt, &p)); }
struct perf_buffer * @@ -11036,7 +11051,7 @@ perf_buffer__new_raw(int map_fd, size_t page_cnt, p.cpus = opts->cpus; p.map_keys = opts->map_keys;
- return __perf_buffer__new(map_fd, page_cnt, &p); + return libbpf_ptr(__perf_buffer__new(map_fd, page_cnt, &p)); }
static struct perf_buffer *__perf_buffer__new(int map_fd, size_t page_cnt, @@ -11257,16 +11272,19 @@ int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms) int i, cnt, err;
cnt = epoll_wait(pb->epoll_fd, pb->events, pb->cpu_cnt, timeout_ms); + if (cnt < 0) + return libbpf_err_errno(cnt); + for (i = 0; i < cnt; i++) { struct perf_cpu_buf *cpu_buf = pb->events[i].data.ptr;
err = perf_buffer__process_records(pb, cpu_buf); if (err) { pr_warn("error while processing records: %d\n", err); - return err; + return libbpf_err(err); } } - return cnt < 0 ? -errno : cnt; + return cnt; }
/* Return number of PERF_EVENT_ARRAY map slots set up by this perf_buffer @@ -11287,11 +11305,11 @@ int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx) struct perf_cpu_buf *cpu_buf;
if (buf_idx >= pb->cpu_cnt) - return -EINVAL; + return libbpf_err(-EINVAL);
cpu_buf = pb->cpu_bufs[buf_idx]; if (!cpu_buf) - return -ENOENT; + return libbpf_err(-ENOENT);
return cpu_buf->fd; } @@ -11309,11 +11327,11 @@ int perf_buffer__consume_buffer(struct perf_buffer *pb, size_t buf_idx) struct perf_cpu_buf *cpu_buf;
if (buf_idx >= pb->cpu_cnt) - return -EINVAL; + return libbpf_err(-EINVAL);
cpu_buf = pb->cpu_bufs[buf_idx]; if (!cpu_buf) - return -ENOENT; + return libbpf_err(-ENOENT);
return perf_buffer__process_records(pb, cpu_buf); } @@ -11331,7 +11349,7 @@ int perf_buffer__consume(struct perf_buffer *pb) err = perf_buffer__process_records(pb, cpu_buf); if (err) { pr_warn("perf_buffer: failed to process records in buffer #%d: %d\n", i, err); - return err; + return libbpf_err(err); } } return 0; @@ -11443,13 +11461,13 @@ bpf_program__get_prog_info_linear(int fd, __u64 arrays) void *ptr;
if (arrays >> BPF_PROG_INFO_LAST_ARRAY) - return ERR_PTR(-EINVAL); + return libbpf_err_ptr(-EINVAL);
/* step 1: get array dimensions */ err = bpf_obj_get_info_by_fd(fd, &info, &info_len); if (err) { pr_debug("can't get prog info: %s", strerror(errno)); - return ERR_PTR(-EFAULT); + return libbpf_err_ptr(-EFAULT); }
/* step 2: calculate total size of all arrays */ @@ -11481,7 +11499,7 @@ bpf_program__get_prog_info_linear(int fd, __u64 arrays) data_len = roundup(data_len, sizeof(__u64)); info_linear = malloc(sizeof(struct bpf_prog_info_linear) + data_len); if (!info_linear) - return ERR_PTR(-ENOMEM); + return libbpf_err_ptr(-ENOMEM);
/* step 4: fill data to info_linear->info */ info_linear->arrays = arrays; @@ -11513,7 +11531,7 @@ bpf_program__get_prog_info_linear(int fd, __u64 arrays) if (err) { pr_debug("can't get prog info: %s", strerror(errno)); free(info_linear); - return ERR_PTR(-EFAULT); + return libbpf_err_ptr(-EFAULT); }
/* step 6: verify the data */ @@ -11592,26 +11610,26 @@ int bpf_program__set_attach_target(struct bpf_program *prog, int btf_obj_fd = 0, btf_id = 0, err;
if (!prog || attach_prog_fd < 0 || !attach_func_name) - return -EINVAL; + return libbpf_err(-EINVAL);
if (prog->obj->loaded) - return -EINVAL; + return libbpf_err(-EINVAL);
if (attach_prog_fd) { btf_id = libbpf_find_prog_btf_id(attach_func_name, attach_prog_fd); if (btf_id < 0) - return btf_id; + return libbpf_err(btf_id); } else { /* load btf_vmlinux, if not yet */ err = bpf_object__load_vmlinux_btf(prog->obj, true); if (err) - return err; + return libbpf_err(err); err = find_kernel_btf_id(prog->obj, attach_func_name, prog->expected_attach_type, &btf_obj_fd, &btf_id); if (err) - return err; + return libbpf_err(err); }
prog->attach_btf_id = btf_id; @@ -11710,7 +11728,7 @@ int libbpf_num_possible_cpus(void)
err = parse_cpu_mask_file(fcpu, &mask, &n); if (err) - return err; + return libbpf_err(err);
tmp_cpus = 0; for (i = 0; i < n; i++) { @@ -11730,7 +11748,7 @@ int bpf_object__open_skeleton(struct bpf_object_skeleton *s, .object_name = s->name, ); struct bpf_object *obj; - int i; + int i, err;
/* Attempt to preserve opts->object_name, unless overriden by user * explicitly. Overwriting object name for skeletons is discouraged, @@ -11745,10 +11763,11 @@ int bpf_object__open_skeleton(struct bpf_object_skeleton *s, }
obj = bpf_object__open_mem(s->data, s->data_sz, &skel_opts); - if (IS_ERR(obj)) { - pr_warn("failed to initialize skeleton BPF object '%s': %ld\n", - s->name, PTR_ERR(obj)); - return PTR_ERR(obj); + err = libbpf_get_error(obj); + if (err) { + pr_warn("failed to initialize skeleton BPF object '%s': %d\n", + s->name, err); + return libbpf_err(err); }
*s->obj = obj; @@ -11761,7 +11780,7 @@ int bpf_object__open_skeleton(struct bpf_object_skeleton *s, *map = bpf_object__find_map_by_name(obj, name); if (!*map) { pr_warn("failed to find skeleton map '%s'\n", name); - return -ESRCH; + return libbpf_err(-ESRCH); }
/* externs shouldn't be pre-setup from user code */ @@ -11776,7 +11795,7 @@ int bpf_object__open_skeleton(struct bpf_object_skeleton *s, *prog = bpf_object__find_program_by_name(obj, name); if (!*prog) { pr_warn("failed to find skeleton program '%s'\n", name); - return -ESRCH; + return libbpf_err(-ESRCH); } }
@@ -11790,7 +11809,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s) err = bpf_object__load(*s->obj); if (err) { pr_warn("failed to load BPF skeleton '%s': %d\n", s->name, err); - return err; + return libbpf_err(err); }
for (i = 0; i < s->map_cnt; i++) { @@ -11829,7 +11848,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s) *mmaped = NULL; pr_warn("failed to re-mmap() map '%s': %d\n", bpf_map__name(map), err); - return err; + return libbpf_err(err); } }
@@ -11838,7 +11857,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s)
int bpf_object__attach_skeleton(struct bpf_object_skeleton *s) { - int i; + int i, err;
for (i = 0; i < s->prog_cnt; i++) { struct bpf_program *prog = *s->progs[i].prog; @@ -11853,10 +11872,11 @@ int bpf_object__attach_skeleton(struct bpf_object_skeleton *s) continue;
*link = sec_def->attach_fn(sec_def, prog); - if (IS_ERR(*link)) { - pr_warn("failed to auto-attach program '%s': %ld\n", - bpf_program__name(prog), PTR_ERR(*link)); - return PTR_ERR(*link); + err = libbpf_get_error(*link); + if (err) { + pr_warn("failed to auto-attach program '%s': %d\n", + bpf_program__name(prog), err); + return libbpf_err(err); } }
diff --git a/tools/lib/bpf/libbpf_errno.c b/tools/lib/bpf/libbpf_errno.c index 0afb51f7a919..96f67a772a1b 100644 --- a/tools/lib/bpf/libbpf_errno.c +++ b/tools/lib/bpf/libbpf_errno.c @@ -12,6 +12,7 @@ #include <string.h>
#include "libbpf.h" +#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */ #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64 @@ -39,7 +40,7 @@ static const char *libbpf_strerror_table[NR_ERRNO] = { int libbpf_strerror(int err, char *buf, size_t size) { if (!buf || !size) - return -1; + return libbpf_err(-EINVAL);
err = err > 0 ? err : -err;
@@ -48,7 +49,7 @@ int libbpf_strerror(int err, char *buf, size_t size)
ret = strerror_r(err, buf, size); buf[size - 1] = '\0'; - return ret; + return libbpf_err_errno(ret); }
if (err < __LIBBPF_ERRNO__END) { @@ -62,5 +63,5 @@ int libbpf_strerror(int err, char *buf, size_t size)
snprintf(buf, size, "Unknown libbpf error %d", err); buf[size - 1] = '\0'; - return -1; + return libbpf_err(-ENOENT); } diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 2c9745462fc6..016ca7cb4f8a 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -467,4 +467,31 @@ static inline int libbpf_err_errno(int ret) return ret; }
+/* handle error for pointer-returning APIs, err is assumed to be < 0 always */ +static inline void *libbpf_err_ptr(int err) +{ + /* set errno on error, this doesn't break anything */ + errno = -err; + + if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS) + return NULL; + + /* legacy: encode err as ptr */ + return ERR_PTR(err); +} + +/* handle pointer-returning APIs' error handling */ +static inline void *libbpf_ptr(void *ret) +{ + /* set errno on error, this doesn't break anything */ + if (IS_ERR(ret)) + errno = -PTR_ERR(ret); + + if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS) + return IS_ERR(ret) ? NULL : ret; + + /* legacy: pass-through original pointer */ + return ret; +} + #endif /* __LIBBPF_LIBBPF_INTERNAL_H */ diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index bd11c429cd5b..0c41468b23bc 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -220,16 +220,16 @@ struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts int err;
if (!OPTS_VALID(opts, bpf_linker_opts)) - return NULL; + return errno = EINVAL, NULL;
if (elf_version(EV_CURRENT) == EV_NONE) { pr_warn_elf("libelf initialization failed"); - return NULL; + return errno = EINVAL, NULL; }
linker = calloc(1, sizeof(*linker)); if (!linker) - return NULL; + return errno = ENOMEM, NULL;
linker->fd = -1;
@@ -241,7 +241,7 @@ struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts
err_out: bpf_linker__free(linker); - return NULL; + return errno = -err, NULL; }
static struct dst_sec *add_dst_sec(struct bpf_linker *linker, const char *sec_name) @@ -444,10 +444,10 @@ int bpf_linker__add_file(struct bpf_linker *linker, const char *filename, int err = 0;
if (!OPTS_VALID(opts, bpf_linker_file_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (!linker->elf) - return -EINVAL; + return libbpf_err(-EINVAL);
err = err ?: linker_load_obj_file(linker, filename, opts, &obj); err = err ?: linker_append_sec_data(linker, &obj); @@ -467,7 +467,7 @@ int bpf_linker__add_file(struct bpf_linker *linker, const char *filename, if (obj.fd >= 0) close(obj.fd);
- return err; + return libbpf_err(err); }
static bool is_dwarf_sec_name(const char *name) @@ -2548,11 +2548,11 @@ int bpf_linker__finalize(struct bpf_linker *linker) int err, i;
if (!linker->elf) - return -EINVAL; + return libbpf_err(-EINVAL);
err = finalize_btf(linker); if (err) - return err; + return libbpf_err(err);
/* Finalize strings */ strs_sz = strset__data_size(linker->strtab_strs); @@ -2584,14 +2584,14 @@ int bpf_linker__finalize(struct bpf_linker *linker) if (elf_update(linker->elf, ELF_C_NULL) < 0) { err = -errno; pr_warn_elf("failed to finalize ELF layout"); - return err; + return libbpf_err(err); }
/* Write out final ELF contents */ if (elf_update(linker->elf, ELF_C_WRITE) < 0) { err = -errno; pr_warn_elf("failed to write ELF contents"); - return err; + return libbpf_err(err); }
elf_end(linker->elf); diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index 47444588e0d2..d743c8721aa7 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -225,22 +225,26 @@ static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd, int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags, const struct bpf_xdp_set_link_opts *opts) { - int old_fd = -1; + int old_fd = -1, ret;
if (!OPTS_VALID(opts, bpf_xdp_set_link_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
if (OPTS_HAS(opts, old_fd)) { old_fd = OPTS_GET(opts, old_fd, -1); flags |= XDP_FLAGS_REPLACE; }
- return __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags); + ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags); + return libbpf_err(ret); }
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags) { - return __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags); + int ret; + + ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags); + return libbpf_err(ret); }
static int __dump_link_nlmsg(struct nlmsghdr *nlh, @@ -321,13 +325,13 @@ int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, };
if (flags & ~XDP_FLAGS_MASK || !info_size) - return -EINVAL; + return libbpf_err(-EINVAL);
/* Check whether the single {HW,DRV,SKB} mode is set */ flags &= (XDP_FLAGS_SKB_MODE | XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE); mask = flags - 1; if (flags && flags & mask) - return -EINVAL; + return libbpf_err(-EINVAL);
xdp_id.ifindex = ifindex; xdp_id.flags = flags; @@ -341,7 +345,7 @@ int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, memset((void *) info + sz, 0, info_size - sz); }
- return ret; + return libbpf_err(ret); }
static __u32 get_xdp_id(struct xdp_link_info *info, __u32 flags) @@ -369,7 +373,7 @@ int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags) if (!ret) *prog_id = get_xdp_id(&info, flags);
- return ret; + return libbpf_err(ret); }
typedef int (*qdisc_config_t)(struct nlmsghdr *nh, struct tcmsg *t, @@ -463,11 +467,14 @@ static int tc_qdisc_delete(struct bpf_tc_hook *hook)
int bpf_tc_hook_create(struct bpf_tc_hook *hook) { + int ret; + if (!hook || !OPTS_VALID(hook, bpf_tc_hook) || OPTS_GET(hook, ifindex, 0) <= 0) - return -EINVAL; + return libbpf_err(-EINVAL);
- return tc_qdisc_create_excl(hook); + ret = tc_qdisc_create_excl(hook); + return libbpf_err(ret); }
static int __bpf_tc_detach(const struct bpf_tc_hook *hook, @@ -478,18 +485,18 @@ int bpf_tc_hook_destroy(struct bpf_tc_hook *hook) { if (!hook || !OPTS_VALID(hook, bpf_tc_hook) || OPTS_GET(hook, ifindex, 0) <= 0) - return -EINVAL; + return libbpf_err(-EINVAL);
switch (OPTS_GET(hook, attach_point, 0)) { case BPF_TC_INGRESS: case BPF_TC_EGRESS: - return __bpf_tc_detach(hook, NULL, true); + return libbpf_err(__bpf_tc_detach(hook, NULL, true)); case BPF_TC_INGRESS | BPF_TC_EGRESS: - return tc_qdisc_delete(hook); + return libbpf_err(tc_qdisc_delete(hook)); case BPF_TC_CUSTOM: - return -EOPNOTSUPP; + return libbpf_err(-EOPNOTSUPP); default: - return -EINVAL; + return libbpf_err(-EINVAL); } }
@@ -574,7 +581,7 @@ int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) if (!hook || !opts || !OPTS_VALID(hook, bpf_tc_hook) || !OPTS_VALID(opts, bpf_tc_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
ifindex = OPTS_GET(hook, ifindex, 0); parent = OPTS_GET(hook, parent, 0); @@ -587,11 +594,11 @@ int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || !prog_fd || prog_id) - return -EINVAL; + return libbpf_err(-EINVAL); if (priority > UINT16_MAX) - return -EINVAL; + return libbpf_err(-EINVAL); if (flags & ~BPF_TC_F_REPLACE) - return -EINVAL; + return libbpf_err(-EINVAL);
flags = (flags & BPF_TC_F_REPLACE) ? NLM_F_REPLACE : NLM_F_EXCL; protocol = ETH_P_ALL; @@ -608,32 +615,32 @@ int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
ret = tc_get_tcm_parent(attach_point, &parent); if (ret < 0) - return ret; + return libbpf_err(ret); req.tc.tcm_parent = parent;
ret = nlattr_add(&req.nh, sizeof(req), TCA_KIND, "bpf", sizeof("bpf")); if (ret < 0) - return ret; + return libbpf_err(ret); nla = nlattr_begin_nested(&req.nh, sizeof(req), TCA_OPTIONS); if (!nla) - return -EMSGSIZE; + return libbpf_err(-EMSGSIZE); ret = tc_add_fd_and_name(&req.nh, sizeof(req), prog_fd); if (ret < 0) - return ret; + return libbpf_err(ret); bpf_flags = TCA_BPF_FLAG_ACT_DIRECT; ret = nlattr_add(&req.nh, sizeof(req), TCA_BPF_FLAGS, &bpf_flags, sizeof(bpf_flags)); if (ret < 0) - return ret; + return libbpf_err(ret); nlattr_end_nested(&req.nh, nla);
info.opts = opts;
ret = libbpf_netlink_send_recv(&req.nh, get_tc_info, NULL, &info); if (ret < 0) - return ret; + return libbpf_err(ret); if (!info.processed) - return -ENOENT; + return libbpf_err(-ENOENT); return ret; }
@@ -708,7 +715,13 @@ static int __bpf_tc_detach(const struct bpf_tc_hook *hook, int bpf_tc_detach(const struct bpf_tc_hook *hook, const struct bpf_tc_opts *opts) { - return !opts ? -EINVAL : __bpf_tc_detach(hook, opts, false); + int ret; + + if (!opts) + return libbpf_err(-EINVAL); + + ret = __bpf_tc_detach(hook, opts, false); + return libbpf_err(ret); }
int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) @@ -725,7 +738,7 @@ int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) if (!hook || !opts || !OPTS_VALID(hook, bpf_tc_hook) || !OPTS_VALID(opts, bpf_tc_opts)) - return -EINVAL; + return libbpf_err(-EINVAL);
ifindex = OPTS_GET(hook, ifindex, 0); parent = OPTS_GET(hook, parent, 0); @@ -739,9 +752,9 @@ int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
if (ifindex <= 0 || flags || prog_fd || prog_id || !handle || !priority) - return -EINVAL; + return libbpf_err(-EINVAL); if (priority > UINT16_MAX) - return -EINVAL; + return libbpf_err(-EINVAL);
protocol = ETH_P_ALL;
@@ -756,19 +769,19 @@ int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
ret = tc_get_tcm_parent(attach_point, &parent); if (ret < 0) - return ret; + return libbpf_err(ret); req.tc.tcm_parent = parent;
ret = nlattr_add(&req.nh, sizeof(req), TCA_KIND, "bpf", sizeof("bpf")); if (ret < 0) - return ret; + return libbpf_err(ret);
info.opts = opts;
ret = libbpf_netlink_send_recv(&req.nh, get_tc_info, NULL, &info); if (ret < 0) - return ret; + return libbpf_err(ret); if (!info.processed) - return -ENOENT; + return libbpf_err(-ENOENT); return ret; } diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c index 86c31c787fb9..0fe7c0935a52 100644 --- a/tools/lib/bpf/ringbuf.c +++ b/tools/lib/bpf/ringbuf.c @@ -69,23 +69,23 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, err = -errno; pr_warn("ringbuf: failed to get map info for fd=%d: %d\n", map_fd, err); - return err; + return libbpf_err(err); }
if (info.type != BPF_MAP_TYPE_RINGBUF) { pr_warn("ringbuf: map fd=%d is not BPF_MAP_TYPE_RINGBUF\n", map_fd); - return -EINVAL; + return libbpf_err(-EINVAL); }
tmp = libbpf_reallocarray(rb->rings, rb->ring_cnt + 1, sizeof(*rb->rings)); if (!tmp) - return -ENOMEM; + return libbpf_err(-ENOMEM); rb->rings = tmp;
tmp = libbpf_reallocarray(rb->events, rb->ring_cnt + 1, sizeof(*rb->events)); if (!tmp) - return -ENOMEM; + return libbpf_err(-ENOMEM); rb->events = tmp;
r = &rb->rings[rb->ring_cnt]; @@ -103,7 +103,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, err = -errno; pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %d\n", map_fd, err); - return err; + return libbpf_err(err); } r->consumer_pos = tmp;
@@ -118,7 +118,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, ringbuf_unmap_ring(rb, r); pr_warn("ringbuf: failed to mmap data pages for map fd=%d: %d\n", map_fd, err); - return err; + return libbpf_err(err); } r->producer_pos = tmp; r->data = tmp + rb->page_size; @@ -133,7 +133,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, ringbuf_unmap_ring(rb, r); pr_warn("ringbuf: failed to epoll add map fd=%d: %d\n", map_fd, err); - return err; + return libbpf_err(err); }
rb->ring_cnt++; @@ -165,11 +165,11 @@ ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx, int err;
if (!OPTS_VALID(opts, ring_buffer_opts)) - return NULL; + return errno = EINVAL, NULL;
rb = calloc(1, sizeof(*rb)); if (!rb) - return NULL; + return errno = ENOMEM, NULL;
rb->page_size = getpagesize();
@@ -188,7 +188,7 @@ ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
err_out: ring_buffer__free(rb); - return NULL; + return errno = -err, NULL; }
static inline int roundup_len(__u32 len) @@ -260,7 +260,7 @@ int ring_buffer__consume(struct ring_buffer *rb)
err = ringbuf_process_ring(ring); if (err < 0) - return err; + return libbpf_err(err); res += err; } if (res > INT_MAX) @@ -279,7 +279,7 @@ int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms); if (cnt < 0) - return -errno; + return libbpf_err(-errno);
for (i = 0; i < cnt; i++) { __u32 ring_id = rb->events[i].data.fd; @@ -287,7 +287,7 @@ int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
err = ringbuf_process_ring(ring); if (err < 0) - return err; + return libbpf_err(err); res += err; } if (res > INT_MAX)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 9c6c0449deb41dbe3a66ab9adfd08020bba6c43d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Follow libbpf's error handling conventions and pass through errors and errno properly. Skeleton code always returned NULL on errors (not ERR_PTR(err)), so there are no backwards compatibility concerns. But now we also set errno properly, so it's possible to distinguish different reasons for failure, if necessary.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210525035935.1461796-6-andrii@kernel.org (cherry picked from commit 9c6c0449deb41dbe3a66ab9adfd08020bba6c43d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 13b0aa789178..1d71ff8c52fa 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -713,6 +713,7 @@ static int do_skeleton(int argc, char **argv) #ifndef %2$s \n\ #define %2$s \n\ \n\ + #include <errno.h> \n\ #include <stdlib.h> \n\ #include <bpf/libbpf.h> \n\ \n\ @@ -793,18 +794,23 @@ static int do_skeleton(int argc, char **argv) %1$s__open_opts(const struct bpf_object_open_opts *opts) \n\ { \n\ struct %1$s *obj; \n\ + int err; \n\ \n\ obj = (struct %1$s *)calloc(1, sizeof(*obj)); \n\ - if (!obj) \n\ + if (!obj) { \n\ + errno = ENOMEM; \n\ return NULL; \n\ - if (%1$s__create_skeleton(obj)) \n\ - goto err; \n\ - if (bpf_object__open_skeleton(obj->skeleton, opts)) \n\ - goto err; \n\ + } \n\ + \n\ + err = %1$s__create_skeleton(obj); \n\ + err = err ?: bpf_object__open_skeleton(obj->skeleton, opts);\n\ + if (err) \n\ + goto err_out; \n\ \n\ return obj; \n\ - err: \n\ + err_out: \n\ %1$s__destroy(obj); \n\ + errno = -err; \n\ return NULL; \n\ } \n\ \n\ @@ -824,12 +830,15 @@ static int do_skeleton(int argc, char **argv) %1$s__open_and_load(void) \n\ { \n\ struct %1$s *obj; \n\ + int err; \n\ \n\ obj = %1$s__open(); \n\ if (!obj) \n\ return NULL; \n\ - if (%1$s__load(obj)) { \n\ + err = %1$s__load(obj); \n\ + if (err) { \n\ %1$s__destroy(obj); \n\ + errno = -err; \n\ return NULL; \n\ } \n\ return obj; \n\ @@ -860,7 +869,7 @@ static int do_skeleton(int argc, char **argv) \n\ s = (struct bpf_object_skeleton *)calloc(1, sizeof(*s));\n\ if (!s) \n\ - return -1; \n\ + goto err; \n\ obj->skeleton = s; \n\ \n\ s->sz = sizeof(*s); \n\ @@ -949,7 +958,7 @@ static int do_skeleton(int argc, char **argv) return 0; \n\ err: \n\ bpf_object__destroy_skeleton(s); \n\ - return -1; \n\ + return -ENOMEM; \n\ } \n\ \n\ #endif /* %s */ \n\
From: Florent Revest revest@chromium.org
mainline inclusion from mainline-5.14-rc1 commit d6a6a55518c16040a369360255b355b7a2a261de category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
These macros are convenient wrappers around the bpf_seq_printf and bpf_snprintf helpers. They are currently provided by bpf_tracing.h which targets low level tracing primitives. bpf_helpers.h is a better fit.
The __bpf_narg and __bpf_apply are needed in both files and provided twice. __bpf_empty isn't used anywhere and is removed from bpf_tracing.h
Reported-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210526164643.2881368-1-revest@chromium.org (cherry picked from commit d6a6a55518c16040a369360255b355b7a2a261de) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/progs/bpf_iter_task_vma.c tools/testing/selftests/bpf/progs/test_snprintf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/preload/iterators/iterators.bpf.c | 1 - tools/lib/bpf/bpf_helpers.h | 66 +++++++++++++++++++ tools/lib/bpf/bpf_tracing.h | 62 +++-------------- .../bpf/progs/bpf_iter_bpf_hash_map.c | 1 - .../selftests/bpf/progs/bpf_iter_bpf_map.c | 1 - .../selftests/bpf/progs/bpf_iter_ipv6_route.c | 1 - .../selftests/bpf/progs/bpf_iter_netlink.c | 1 - .../selftests/bpf/progs/bpf_iter_task.c | 1 - .../selftests/bpf/progs/bpf_iter_task_btf.c | 1 - .../selftests/bpf/progs/bpf_iter_task_file.c | 1 - .../selftests/bpf/progs/bpf_iter_task_stack.c | 1 - .../selftests/bpf/progs/bpf_iter_tcp4.c | 1 - .../selftests/bpf/progs/bpf_iter_tcp6.c | 1 - .../selftests/bpf/progs/bpf_iter_udp4.c | 1 - .../selftests/bpf/progs/bpf_iter_udp6.c | 1 - 15 files changed, 74 insertions(+), 67 deletions(-)
diff --git a/kernel/bpf/preload/iterators/iterators.bpf.c b/kernel/bpf/preload/iterators/iterators.bpf.c index 52aa7b38e8b8..03af863314ea 100644 --- a/kernel/bpf/preload/iterators/iterators.bpf.c +++ b/kernel/bpf/preload/iterators/iterators.bpf.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020 Facebook */ #include <linux/bpf.h> #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h> #include <bpf/bpf_core_read.h>
#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record) diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 9720dc0b4605..b9987c3efa3c 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -158,4 +158,70 @@ enum libbpf_tristate { #define __kconfig __attribute__((section(".kconfig"))) #define __ksym __attribute__((section(".ksyms")))
+#ifndef ___bpf_concat +#define ___bpf_concat(a, b) a ## b +#endif +#ifndef ___bpf_apply +#define ___bpf_apply(fn, n) ___bpf_concat(fn, n) +#endif +#ifndef ___bpf_nth +#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N +#endif +#ifndef ___bpf_narg +#define ___bpf_narg(...) \ + ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) +#endif + +#define ___bpf_fill0(arr, p, x) do {} while (0) +#define ___bpf_fill1(arr, p, x) arr[p] = x +#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args) +#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args) +#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args) +#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args) +#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args) +#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args) +#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args) +#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args) +#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args) +#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args) +#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args) +#define ___bpf_fill(arr, args...) \ + ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args) + +/* + * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values + * in a structure. + */ +#define BPF_SEQ_PRINTF(seq, fmt, args...) \ +({ \ + static const char ___fmt[] = fmt; \ + unsigned long long ___param[___bpf_narg(args)]; \ + \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored "-Wint-conversion"") \ + ___bpf_fill(___param, args); \ + _Pragma("GCC diagnostic pop") \ + \ + bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \ + ___param, sizeof(___param)); \ +}) + +/* + * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of + * an array of u64. + */ +#define BPF_SNPRINTF(out, out_size, fmt, args...) \ +({ \ + static const char ___fmt[] = fmt; \ + unsigned long long ___param[___bpf_narg(args)]; \ + \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored "-Wint-conversion"") \ + ___bpf_fill(___param, args); \ + _Pragma("GCC diagnostic pop") \ + \ + bpf_snprintf(out, out_size, ___fmt, \ + ___param, sizeof(___param)); \ +}) + #endif diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h index 8c954ebc0c7c..c0f3a26aa582 100644 --- a/tools/lib/bpf/bpf_tracing.h +++ b/tools/lib/bpf/bpf_tracing.h @@ -295,13 +295,19 @@ struct pt_regs; (void *)(PT_REGS_FP(ctx) + sizeof(ip))); }) #endif
+#ifndef ___bpf_concat #define ___bpf_concat(a, b) a ## b +#endif +#ifndef ___bpf_apply #define ___bpf_apply(fn, n) ___bpf_concat(fn, n) +#endif +#ifndef ___bpf_nth #define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N +#endif +#ifndef ___bpf_narg #define ___bpf_narg(...) \ ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) -#define ___bpf_empty(...) \ - ___bpf_nth(_, ##__VA_ARGS__, N, N, N, N, N, N, N, N, N, N, 0) +#endif
#define ___bpf_ctx_cast0() ctx #define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0] @@ -413,56 +419,4 @@ typeof(name(0)) name(struct pt_regs *ctx) \ } \ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
-#define ___bpf_fill0(arr, p, x) do {} while (0) -#define ___bpf_fill1(arr, p, x) arr[p] = x -#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args) -#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args) -#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args) -#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args) -#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args) -#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args) -#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args) -#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args) -#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args) -#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args) -#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args) -#define ___bpf_fill(arr, args...) \ - ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args) - -/* - * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values - * in a structure. - */ -#define BPF_SEQ_PRINTF(seq, fmt, args...) \ -({ \ - static const char ___fmt[] = fmt; \ - unsigned long long ___param[___bpf_narg(args)]; \ - \ - _Pragma("GCC diagnostic push") \ - _Pragma("GCC diagnostic ignored "-Wint-conversion"") \ - ___bpf_fill(___param, args); \ - _Pragma("GCC diagnostic pop") \ - \ - bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \ - ___param, sizeof(___param)); \ -}) - -/* - * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of - * an array of u64. - */ -#define BPF_SNPRINTF(out, out_size, fmt, args...) \ -({ \ - static const char ___fmt[] = fmt; \ - unsigned long long ___param[___bpf_narg(args)]; \ - \ - _Pragma("GCC diagnostic push") \ - _Pragma("GCC diagnostic ignored "-Wint-conversion"") \ - ___bpf_fill(___param, args); \ - _Pragma("GCC diagnostic pop") \ - \ - bpf_snprintf(out, out_size, ___fmt, \ - ___param, sizeof(___param)); \ -}) - #endif diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c index 6dfce3fd68bc..0aa3cd34cbe3 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020 Facebook */ #include "bpf_iter.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c index 08651b23edba..6594d77cb7b4 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_map.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020 Facebook */ #include "bpf_iter.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_ipv6_route.c b/tools/testing/selftests/bpf/progs/bpf_iter_ipv6_route.c index d58d9f1642b5..784a610ce039 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_ipv6_route.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_ipv6_route.c @@ -3,7 +3,6 @@ #include "bpf_iter.h" #include "bpf_tracing_net.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_netlink.c b/tools/testing/selftests/bpf/progs/bpf_iter_netlink.c index 95989f4c99b5..a28e51e2dcee 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_netlink.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_netlink.c @@ -3,7 +3,6 @@ #include "bpf_iter.h" #include "bpf_tracing_net.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task.c b/tools/testing/selftests/bpf/progs/bpf_iter_task.c index b7f32c160f4e..c86b93f33b32 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_task.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020 Facebook */ #include "bpf_iter.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c index a1ddc36f13ec..bca8b889cb10 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020, Oracle and/or its affiliates. */ #include "bpf_iter.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h> #include <bpf/bpf_core_read.h>
#include <errno.h> diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c index b2f7c7c5f952..6e7b400888fe 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020 Facebook */ #include "bpf_iter.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_stack.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_stack.c index 50e59a2e142e..8bad127ab5a5 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_task_stack.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_stack.c @@ -2,7 +2,6 @@ /* Copyright (c) 2020 Facebook */ #include "bpf_iter.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.c b/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.c index aa96b604b2b3..92267abb462f 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.c @@ -3,7 +3,6 @@ #include "bpf_iter.h" #include "bpf_tracing_net.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h> #include <bpf/bpf_endian.h>
char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_tcp6.c b/tools/testing/selftests/bpf/progs/bpf_iter_tcp6.c index b4fbddfa4e10..943f7bba180e 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_tcp6.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_tcp6.c @@ -3,7 +3,6 @@ #include "bpf_iter.h" #include "bpf_tracing_net.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h> #include <bpf/bpf_endian.h>
char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_udp4.c b/tools/testing/selftests/bpf/progs/bpf_iter_udp4.c index f258583afbbd..cf0c485b1ed7 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_udp4.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_udp4.c @@ -3,7 +3,6 @@ #include "bpf_iter.h" #include "bpf_tracing_net.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h> #include <bpf/bpf_endian.h>
char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_udp6.c b/tools/testing/selftests/bpf/progs/bpf_iter_udp6.c index 65f93bb03f0f..5031e21c433f 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_udp6.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_udp6.c @@ -3,7 +3,6 @@ #include "bpf_iter.h" #include "bpf_tracing_net.h" #include <bpf/bpf_helpers.h> -#include <bpf/bpf_tracing.h> #include <bpf/bpf_endian.h>
char _license[] SEC("license") = "GPL";
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 16cac0060680c11bb82c325c4fe95cb66fc8dfaf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Official libbpf 0.4 release doesn't include three APIs that were tentatively put into 0.4 section. Fix libbpf.map and move these three APIs:
- bpf_map__initial_value; - bpf_map_lookup_and_delete_elem_flags; - bpf_object__gen_loader.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210603004026.2698513-2-andrii@kernel.org (cherry picked from commit 16cac0060680c11bb82c325c4fe95cb66fc8dfaf) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.map
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.map | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 6266317ef3a2..3bb9ebd8d287 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -359,9 +359,7 @@ LIBBPF_0.4.0 { bpf_linker__finalize; bpf_linker__free; bpf_linker__new; - bpf_map__initial_value; bpf_map__inner_map; - bpf_object__gen_loader; bpf_object__set_kversion; bpf_tc_attach; bpf_tc_detach; @@ -372,5 +370,8 @@ LIBBPF_0.4.0 {
LIBBPF_0.5.0 { global: + bpf_map__initial_value; + bpf_map_lookup_and_delete_elem_flags; + bpf_object__gen_loader; libbpf_set_strict_mode; } LIBBPF_0.4.0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 232c9e8bd5ebfb43563b58f31e685fde06d9441f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
As we gradually get more headers that have to be installed, it's quite annoying to copy/paste long $(call) commands. So extract that logic and do a simple $(foreach) over the list of headers.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210603004026.2698513-3-andrii@kernel.org (cherry picked from commit 232c9e8bd5ebfb43563b58f31e685fde06d9441f) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Makefile | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 914c0afff255..147cc2f45233 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -223,20 +223,14 @@ install_lib: all_cmd $(call do_install_mkdir,$(libdir_SQ)); \ cp -fpR $(LIB_FILE) $(DESTDIR)$(libdir_SQ)
+INSTALL_HEADERS = bpf.h libbpf.h btf.h libbpf_util.h libbpf_common.h libbpf_legacy.h xsk.h \ + bpf_helpers.h $(BPF_HELPER_DEFS) bpf_tracing.h \ + bpf_endian.h bpf_core_read.h + install_headers: $(BPF_HELPER_DEFS) - $(call QUIET_INSTALL, headers) \ - $(call do_install,bpf.h,$(prefix)/include/bpf,644); \ - $(call do_install,libbpf.h,$(prefix)/include/bpf,644); \ - $(call do_install,btf.h,$(prefix)/include/bpf,644); \ - $(call do_install,libbpf_util.h,$(prefix)/include/bpf,644); \ - $(call do_install,libbpf_common.h,$(prefix)/include/bpf,644); \ - $(call do_install,libbpf_legacy.h,$(prefix)/include/bpf,644); \ - $(call do_install,xsk.h,$(prefix)/include/bpf,644); \ - $(call do_install,bpf_helpers.h,$(prefix)/include/bpf,644); \ - $(call do_install,$(BPF_HELPER_DEFS),$(prefix)/include/bpf,644); \ - $(call do_install,bpf_tracing.h,$(prefix)/include/bpf,644); \ - $(call do_install,bpf_endian.h,$(prefix)/include/bpf,644); \ - $(call do_install,bpf_core_read.h,$(prefix)/include/bpf,644); + $(call QUIET_INSTALL, headers) \ + $(foreach hdr,$(INSTALL_HEADERS), \ + $(call do_install,$(hdr),$(prefix)/include/bpf,644);)
install_pkgconfig: $(PC_FILE) $(call QUIET_INSTALL, $(PC_FILE)) \
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit 7d8a819dd31672f02ece93b6a9b9491daba4f0f2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Light skeleton code assumes skel_internal.h header to be installed system-wide by libbpf package. Make sure it is actually installed.
Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210603004026.2698513-4-andrii@kernel.org (cherry picked from commit 7d8a819dd31672f02ece93b6a9b9491daba4f0f2) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 147cc2f45233..4e336b30323c 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -225,7 +225,7 @@ install_lib: all_cmd
INSTALL_HEADERS = bpf.h libbpf.h btf.h libbpf_util.h libbpf_common.h libbpf_legacy.h xsk.h \ bpf_helpers.h $(BPF_HELPER_DEFS) bpf_tracing.h \ - bpf_endian.h bpf_core_read.h + bpf_endian.h bpf_core_read.h skel_internal.h
install_headers: $(BPF_HELPER_DEFS) $(call QUIET_INSTALL, headers) \
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.14-rc1 commit 0779890fed7817725f399d6ec85730e08ebfdeee category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When the bootstrap and final bpftool have different architectures, we need to build two distinct disasm.o objects. Add a recipe for the bootstrap disasm.o.
After commit d510296d331a ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.") cross-building bpftool didn't work anymore, because the bootstrap bpftool was linked using objects from different architectures:
$ make O=/tmp/bpftool ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -C tools/bpf/bpftool/ V=1 [...] aarch64-linux-gnu-gcc ... -c -MMD -o /tmp/bpftool/disasm.o /home/z/src/linux/kernel/bpf/disasm.c gcc ... -c -MMD -o /tmp/bpftool//bootstrap/main.o main.c gcc ... -o /tmp/bpftool//bootstrap/bpftool /tmp/bpftool//bootstrap/main.o ... /tmp/bpftool/disasm.o /usr/bin/ld: /tmp/bpftool/disasm.o: Relocations in generic ELF (EM: 183) /usr/bin/ld: /tmp/bpftool/disasm.o: Relocations in generic ELF (EM: 183) /usr/bin/ld: /tmp/bpftool/disasm.o: Relocations in generic ELF (EM: 183) /usr/bin/ld: /tmp/bpftool/disasm.o: error adding symbols: file in wrong format collect2: error: ld returned 1 exit status [...]
The final bpftool was built for e.g. arm64, while the bootstrap bpftool, executed on the host, was built for x86. The problem here was that disasm.o linked into the bootstrap bpftool was arm64 rather than x86. With the fix we build two disasm.o, one for the target bpftool in arm64, and one for the bootstrap bpftool in x86.
Fixes: d510296d331a ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.") Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210603170515.1854642-1-jean-philippe@linaro.or... (cherry picked from commit 0779890fed7817725f399d6ec85730e08ebfdeee) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index c7d13e87b4ed..e2073c739410 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -135,7 +135,7 @@ endif
BPFTOOL_BOOTSTRAP := $(BOOTSTRAP_OUTPUT)bpftool
-BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o) $(OUTPUT)disasm.o +BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o disasm.o) OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ @@ -179,6 +179,9 @@ endif
CFLAGS += $(if $(BUILD_BPF_SKELS),,-DBPFTOOL_WITHOUT_SKELETONS)
+$(BOOTSTRAP_OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c + $(QUIET_CC)$(HOSTCC) $(CFLAGS) -c -MMD -o $@ $< + $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c $(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -o $@ $<
From: Michal Suchanek msuchanek@suse.de
mainline inclusion from mainline-5.14-rc1 commit edc0571c5f67c7e24958149a8ec6a904ca84840b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The printed value is ptrdiff_t and is formatted wiht %ld. This works on 64bit but produces a warning on 32bit. Fix the format specifier to %td.
Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Michal Suchanek msuchanek@suse.de Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210604112448.32297-1-msuchanek@suse.de (cherry picked from commit edc0571c5f67c7e24958149a8ec6a904ca84840b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0e659713c5c7..3cb3994f91b0 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4625,7 +4625,7 @@ static int init_map_slots(struct bpf_object *obj, struct bpf_map *map) targ_map = map->init_slots[i]; fd = bpf_map__fd(targ_map); if (obj->gen_loader) { - pr_warn("// TODO map_update_elem: idx %ld key %d value==map_idx %ld\n", + pr_warn("// TODO map_update_elem: idx %td key %d value==map_idx %td\n", map - obj->maps, i, targ_map - obj->maps); return -ENOTSUP; } else { @@ -6262,7 +6262,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return -EINVAL;
if (prog->obj->gen_loader) { - pr_warn("// TODO core_relo: prog %ld insn[%d] %s %s kind %d\n", + pr_warn("// TODO core_relo: prog %td insn[%d] %s %s kind %d\n", prog - prog->obj->programs, relo->insn_off / 8, local_name, spec_str, relo->kind); return -ENOTSUP;
From: Wang Hai wanghai38@huawei.com
mainline inclusion from mainline-5.14-rc1 commit 3b3af91cb6893967bbec30f5c14562d0f7f00c2a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
There is no need for special treatment of the 'ret == 0' case. This patch simplifies the return expression.
Signed-off-by: Wang Hai wanghai38@huawei.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210609115651.3392580-1-wanghai38@huawei.com (cherry picked from commit 3b3af91cb6893967bbec30f5c14562d0f7f00c2a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 3cb3994f91b0..64dd637af7db 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2465,10 +2465,8 @@ static int bpf_object__init_maps(struct bpf_object *obj, err = err ?: bpf_object__init_global_data_maps(obj); err = err ?: bpf_object__init_kconfig_map(obj); err = err ?: bpf_object__init_struct_ops_maps(obj); - if (err) - return err;
- return 0; + return err; }
static bool section_have_execinstr(struct bpf_object *obj, int idx)
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.14-rc1 commit 4e164f8716853b879e2b1a21a12d54c57f11372e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Coverity complained about this being unreachable code. It is right because we already enforce flags to be unset, so a check validating the flag value is redundant.
Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210612023502.1283837-2-memxor@gmail.com (cherry picked from commit 4e164f8716853b879e2b1a21a12d54c57f11372e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/netlink.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index d743c8721aa7..efbb50ad59d8 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -675,8 +675,6 @@ static int __bpf_tc_detach(const struct bpf_tc_hook *hook, return -EINVAL; if (priority > UINT16_MAX) return -EINVAL; - if (flags & ~BPF_TC_F_REPLACE) - return -EINVAL; if (!flush) { if (!handle || !priority) return -EINVAL;
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.14-rc1 commit bbf29d3a2e49e482d5267311798aec42f00e88f3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This got lost during the refactoring across versions. We always use NLM_F_EXCL when creating some TC object, so reflect what the function says and set the flag.
Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210612023502.1283837-3-memxor@gmail.com (cherry picked from commit bbf29d3a2e49e482d5267311798aec42f00e88f3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/netlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index efbb50ad59d8..cf9381f03b16 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -457,7 +457,7 @@ static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags)
static int tc_qdisc_create_excl(struct bpf_tc_hook *hook) { - return tc_qdisc_modify(hook, RTM_NEWQDISC, NLM_F_CREATE); + return tc_qdisc_modify(hook, RTM_NEWQDISC, NLM_F_CREATE | NLM_F_EXCL); }
static int tc_qdisc_delete(struct bpf_tc_hook *hook)
From: Jonathan Edwards jonathan.edwards@165gc.onmicrosoft.com
mainline inclusion from mainline-5.14-rc1 commit 5c10a3dbe9220ca7bcee716c13c8a8563bcb010a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only the following program types are supported [1]:
BPF_PROG_TYPE_KPROBE BPF_PROG_TYPE_TRACEPOINT BPF_PROG_TYPE_PERF_EVENT
For libbpf this causes an EINVAL return during the bpf_object__probe_loading call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER can load.
The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some kernels it requires knowledge of the LINUX_VERSION_CODE.
[0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7 [1] https://access.redhat.com/articles/3550581
Signed-off-by: Jonathan Edwards jonathan.edwards@165gc.onmicrosoft.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com (cherry picked from commit 5c10a3dbe9220ca7bcee716c13c8a8563bcb010a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 64dd637af7db..176c06a51461 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4039,6 +4039,10 @@ bpf_object__probe_loading(struct bpf_object *obj) attr.license = "GPL";
ret = bpf_load_program_xattr(&attr, NULL, 0); + if (ret < 0) { + attr.prog_type = BPF_PROG_TYPE_TRACEPOINT; + ret = bpf_load_program_xattr(&attr, NULL, 0); + } if (ret < 0) { ret = errno; cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
From: Toke Høiland-Jørgensen toke@redhat.com
mainline inclusion from mainline-5.14-rc2 commit af0efa050caa66e8f304c42c94c76cb6c480cb7e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The update to streamline libbpf error reporting intended to change all functions to return the errno as a negative return value if LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is *not* set, the return value changes for the two functions that were already returning a negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll().
This is a user-visible API change that breaks applications; so let's revert these two functions back to unconditionally returning a negative errno value.
Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com (cherry picked from commit af0efa050caa66e8f304c42c94c76cb6c480cb7e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 176c06a51461..51de5d012edd 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -10203,7 +10203,7 @@ int bpf_link__unpin(struct bpf_link *link)
err = unlink(link->pin_path); if (err != 0) - return libbpf_err_errno(err); + return -errno;
pr_debug("link fd=%d: unpinned from %s\n", link->fd, link->pin_path); zfree(&link->pin_path); @@ -11275,7 +11275,7 @@ int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms)
cnt = epoll_wait(pb->epoll_fd, pb->events, pb->cpu_cnt, timeout_ms); if (cnt < 0) - return libbpf_err_errno(cnt); + return -errno;
for (i = 0; i < cnt; i++) { struct perf_cpu_buf *cpu_buf = pb->events[i].data.ptr;
From: Daniel Xu dxu@dxuuu.xyz
mainline inclusion from mainline-5.14-rc6 commit c34c338a40e4f3b6f80889cd17fd9281784d1c32 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Before this patch, btf_new() was liable to close an arbitrary FD 0 if BTF parsing failed. This was because:
* btf->fd was initialized to 0 through the calloc() * btf__free() (in the `done` label) closed any FDs >= 0 * btf->fd is left at 0 if parsing fails
This issue was discovered on a system using libbpf v0.3 (without BTF_KIND_FLOAT support) but with a kernel that had BTF_KIND_FLOAT types in BTF. Thus, parsing fails.
While this patch technically doesn't fix any issues b/c upstream libbpf has BTF_KIND_FLOAT support, it'll help prevent issues in the future if more BTF types are added. It also allow the fix to be backported to older libbpf's.
Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Daniel Xu dxu@dxuuu.xyz Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/5969bb991adedb03c6ae93e051fd2a00d293cf25.1627513... (cherry picked from commit c34c338a40e4f3b6f80889cd17fd9281784d1c32) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 1c46d76fefb5..c08b135c022b 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -810,6 +810,7 @@ static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf) btf->nr_types = 0; btf->start_id = 1; btf->start_str_off = 0; + btf->fd = -1;
if (base_btf) { btf->base_btf = base_btf; @@ -838,8 +839,6 @@ static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf) if (err) goto done;
- btf->fd = -1; - done: if (err) { btf__free(btf);
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.15-rc1 commit d809e134be7a1fdd9f5b99ab3291c6da5c0b8240 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Currently bpf_prog_put() is called from the task context only. With addition of bpf timers the timer related helpers will start calling bpf_prog_put() from irq-saved region and in rare cases might drop the refcnt to zero. To address this case, first, convert bpf_prog_free_id() to be irq-save (this is similar to bpf_map_free_id), and, second, defer non irq appropriate calls into work queue. For example: bpf_audit_prog() is calling kmalloc and wake_up_interruptible, bpf_prog_kallsyms_del_all()->bpf_ksym_del()->spin_unlock_bh(). They are not safe with irqs disabled.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Martin KaFai Lau kafai@fb.com Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210715005417.78572-2-alexei.starovoitov@gmail.... (cherry picked from commit d809e134be7a1fdd9f5b99ab3291c6da5c0b8240) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/syscall.c | 32 ++++++++++++++++++++++++++------ 1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index b916a6438d7d..f79a55deb7c8 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1736,6 +1736,8 @@ static int bpf_prog_alloc_id(struct bpf_prog *prog)
void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) { + unsigned long flags; + /* cBPF to eBPF migrations are currently not in the idr store. * Offloaded programs are removed from the store when their device * disappears - even if someone grabs an fd to them they are unusable, @@ -1745,7 +1747,7 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) return;
if (do_idr_lock) - spin_lock_bh(&prog_idr_lock); + spin_lock_irqsave(&prog_idr_lock, flags); else __acquire(&prog_idr_lock);
@@ -1753,7 +1755,7 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) prog->aux->id = 0;
if (do_idr_lock) - spin_unlock_bh(&prog_idr_lock); + spin_unlock_irqrestore(&prog_idr_lock, flags); else __release(&prog_idr_lock); } @@ -1789,14 +1791,32 @@ static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred) } }
+static void bpf_prog_put_deferred(struct work_struct *work) +{ + struct bpf_prog_aux *aux; + struct bpf_prog *prog; + + aux = container_of(work, struct bpf_prog_aux, work); + prog = aux->prog; + perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0); + bpf_audit_prog(prog, BPF_AUDIT_UNLOAD); + __bpf_prog_put_noref(prog, true); +} + static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock) { - if (atomic64_dec_and_test(&prog->aux->refcnt)) { - perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0); - bpf_audit_prog(prog, BPF_AUDIT_UNLOAD); + struct bpf_prog_aux *aux = prog->aux; + + if (atomic64_dec_and_test(&aux->refcnt)) { /* bpf_prog_free_id() must be called first */ bpf_prog_free_id(prog, do_idr_lock); - __bpf_prog_put_noref(prog, true); + + if (in_irq() || irqs_disabled()) { + INIT_WORK(&aux->work, bpf_prog_put_deferred); + schedule_work(&aux->work); + } else { + bpf_prog_put_deferred(&aux->work); + } } }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.15-rc1 commit c1b3fed319d32a721d4b9c17afaeb430444ff773 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Move ____bpf_spin_lock/unlock into helpers to make it more clear that quadruple underscore bpf_spin_lock/unlock are irqsave/restore variants.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Martin KaFai Lau kafai@fb.com Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210715005417.78572-3-alexei.starovoitov@gmail.... (cherry picked from commit c1b3fed319d32a721d4b9c17afaeb430444ff773) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/helpers.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index e49b4cb5f099..1dbbd10743a6 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -278,13 +278,18 @@ static inline void __bpf_spin_unlock(struct bpf_spin_lock *lock)
static DEFINE_PER_CPU(unsigned long, irqsave_flags);
-notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock) +static inline void __bpf_spin_lock_irqsave(struct bpf_spin_lock *lock) { unsigned long flags;
local_irq_save(flags); __bpf_spin_lock(lock); __this_cpu_write(irqsave_flags, flags); +} + +notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock) +{ + __bpf_spin_lock_irqsave(lock); return 0; }
@@ -295,13 +300,18 @@ const struct bpf_func_proto bpf_spin_lock_proto = { .arg1_type = ARG_PTR_TO_SPIN_LOCK, };
-notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock) +static inline void __bpf_spin_unlock_irqrestore(struct bpf_spin_lock *lock) { unsigned long flags;
flags = __this_cpu_read(irqsave_flags); __bpf_spin_unlock(lock); local_irq_restore(flags); +} + +notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock) +{ + __bpf_spin_unlock_irqrestore(lock); return 0; }
@@ -322,9 +332,9 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, else lock = dst + map->spin_lock_off; preempt_disable(); - ____bpf_spin_lock(lock); + __bpf_spin_lock_irqsave(lock); copy_map_value(map, dst, src); - ____bpf_spin_unlock(lock); + __bpf_spin_unlock_irqrestore(lock); preempt_enable(); }
From: Shuyi Cheng chengshuyi@linux.alibaba.com
mainline inclusion from mainline-5.15-rc1 commit 1373ff59955621b7e71e7a1152036c93a5780c11 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
btf_custom_path allows developers to load custom BTF which libbpf will subsequently use for CO-RE relocation instead of vmlinux BTF.
Having btf_custom_path in bpf_object_open_opts one can directly use the skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path parameter, as opposed to using bpf_object__load_xattr() which is slated to be deprecated ([0]).
This work continues previous work started by another developer ([1]).
[0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDK... [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHH...
Signed-off-by: Shuyi Cheng chengshuyi@linux.alibaba.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@li... (cherry picked from commit 1373ff59955621b7e71e7a1152036c93a5780c11) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 28 ++++++++++++++++++++++++---- tools/lib/bpf/libbpf.h | 9 ++++++++- 2 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 51de5d012edd..36d0a6f17a34 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -498,6 +498,10 @@ struct bpf_object { * it at load time. */ struct btf *btf_vmlinux; + /* Path to the custom BTF to be used for BPF CO-RE relocations as an + * override for vmlinux BTF. + */ + char *btf_custom_path; /* vmlinux BTF override for CO-RE relocations */ struct btf *btf_vmlinux_override; /* Lazily initialized kernel module BTFs */ @@ -2646,8 +2650,10 @@ static bool obj_needs_vmlinux_btf(const struct bpf_object *obj) struct bpf_program *prog; int i;
- /* CO-RE relocations need kernel BTF */ - if (obj->btf_ext && obj->btf_ext->core_relo_info.len) + /* CO-RE relocations need kernel BTF, only when btf_custom_path + * is not specified + */ + if (obj->btf_ext && obj->btf_ext->core_relo_info.len && !obj->btf_custom_path) return true;
/* Support for typed ksyms needs kernel BTF */ @@ -7609,7 +7615,7 @@ static struct bpf_object * __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz, const struct bpf_object_open_opts *opts) { - const char *obj_name, *kconfig; + const char *obj_name, *kconfig, *btf_tmp_path; struct bpf_program *prog; struct bpf_object *obj; char tmp_name[64]; @@ -7640,6 +7646,19 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz, if (IS_ERR(obj)) return obj;
+ btf_tmp_path = OPTS_GET(opts, btf_custom_path, NULL); + if (btf_tmp_path) { + if (strlen(btf_tmp_path) >= PATH_MAX) { + err = -ENAMETOOLONG; + goto out; + } + obj->btf_custom_path = strdup(btf_tmp_path); + if (!obj->btf_custom_path) { + err = -ENOMEM; + goto out; + } + } + kconfig = OPTS_GET(opts, kconfig, NULL); if (kconfig) { obj->kconfig = strdup(kconfig); @@ -8112,7 +8131,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) err = err ? : bpf_object__sanitize_maps(obj); err = err ? : bpf_object__init_kern_struct_ops_maps(obj); err = err ? : bpf_object__create_maps(obj); - err = err ? : bpf_object__relocate(obj, attr->target_btf_path); + err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : attr->target_btf_path); err = err ? : bpf_object__load_progs(obj, attr->log_level);
if (obj->gen_loader) { @@ -8759,6 +8778,7 @@ void bpf_object__close(struct bpf_object *obj) for (i = 0; i < obj->nr_maps; i++) bpf_map__destroy(&obj->maps[i]);
+ zfree(&obj->btf_custom_path); zfree(&obj->kconfig); zfree(&obj->externs); obj->nr_extern = 0; diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index e3a6728e868a..f76d640d0fa0 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -94,8 +94,15 @@ struct bpf_object_open_opts { * system Kconfig for CONFIG_xxx externs. */ const char *kconfig; + /* Path to the custom BTF to be used for BPF CO-RE relocations. + * This custom BTF completely replaces the use of vmlinux BTF + * for the purpose of CO-RE relocations. + * NOTE: any other BPF feature (e.g., fentry/fexit programs, + * struct_ops, etc) will need actual kernel BTF at /sys/kernel/btf/vmlinux. + */ + const char *btf_custom_path; }; -#define bpf_object_open_opts__last_field kconfig +#define bpf_object_open_opts__last_field btf_custom_path
LIBBPF_API struct bpf_object *bpf_object__open(const char *path); LIBBPF_API struct bpf_object *
From: Alan Maguire alan.maguire@oracle.com
mainline inclusion from mainline-5.15-rc1 commit 920d16af9b42adfc8524d880b0e8dba66a6cb87d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a BTF dumper for typed data, so that the user can dump a typed version of the data provided.
The API is
int btf_dump__dump_type_data(struct btf_dump *d, __u32 id, void *data, size_t data_sz, const struct btf_dump_type_data_opts *opts);
...where the id is the BTF id of the data pointed to by the "void *" argument; for example the BTF id of "struct sk_buff" for a "struct skb *" data pointer. Options supported are
- a starting indent level (indent_lvl) - a user-specified indent string which will be printed once per indent level; if NULL, tab is chosen but any string <= 32 chars can be provided. - a set of boolean options to control dump display, similar to those used for BPF helper bpf_snprintf_btf(). Options are - compact : omit newlines and other indentation - skip_names: omit member names - emit_zeroes: show zero-value members
Default output format is identical to that dumped by bpf_snprintf_btf(), for example a "struct sk_buff" representation would look like this:
struct sk_buff){ (union){ (struct){ .next = (struct sk_buff *)0xffffffffffffffff, .prev = (struct sk_buff *)0xffffffffffffffff, (union){ .dev = (struct net_device *)0xffffffffffffffff, .dev_scratch = (long unsigned int)18446744073709551615, }, }, ...
If the data structure is larger than the *data_sz* number of bytes that are available in *data*, as much of the data as possible will be dumped and -E2BIG will be returned. This is useful as tracers will sometimes not be able to capture all of the data associated with a type; for example a "struct task_struct" is ~16k. Being able to specify that only a subset is available is important for such cases. On success, the amount of data dumped is returned.
Signed-off-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@o... (cherry picked from commit 920d16af9b42adfc8524d880b0e8dba66a6cb87d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.h | 19 + tools/lib/bpf/btf_dump.c | 819 ++++++++++++++++++++++++++++++++++++++- tools/lib/bpf/libbpf.map | 1 + 3 files changed, 834 insertions(+), 5 deletions(-)
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index b54f1c3ebd57..374e9f15de2e 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -184,6 +184,25 @@ LIBBPF_API int btf_dump__emit_type_decl(struct btf_dump *d, __u32 id, const struct btf_dump_emit_type_decl_opts *opts);
+ +struct btf_dump_type_data_opts { + /* size of this struct, for forward/backward compatibility */ + size_t sz; + const char *indent_str; + int indent_level; + /* below match "show" flags for bpf_show_snprintf() */ + bool compact; /* no newlines/indentation */ + bool skip_names; /* skip member/type names */ + bool emit_zeroes; /* show 0-valued fields */ + size_t :0; +}; +#define btf_dump_type_data_opts__last_field emit_zeroes + +LIBBPF_API int +btf_dump__dump_type_data(struct btf_dump *d, __u32 id, + const void *data, size_t data_sz, + const struct btf_dump_type_data_opts *opts); + /* * A set of helpers for easier BTF types handling */ diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 835d0ef0d67e..ff21ac98ab07 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -10,6 +10,8 @@ #include <stddef.h> #include <stdlib.h> #include <string.h> +#include <ctype.h> +#include <endian.h> #include <errno.h> #include <linux/err.h> #include <linux/btf.h> @@ -53,6 +55,26 @@ struct btf_dump_type_aux_state { __u8 referenced: 1; };
+/* indent string length; one indent string is added for each indent level */ +#define BTF_DATA_INDENT_STR_LEN 32 + +/* + * Common internal data for BTF type data dump operations. + */ +struct btf_dump_data { + const void *data_end; /* end of valid data to show */ + bool compact; + bool skip_names; + bool emit_zeroes; + __u8 indent_lvl; /* base indent level */ + char indent_str[BTF_DATA_INDENT_STR_LEN]; + /* below are used during iteration */ + int depth; + bool is_array_member; + bool is_array_terminated; + bool is_array_char; +}; + struct btf_dump { const struct btf *btf; const struct btf_ext *btf_ext; @@ -60,6 +82,7 @@ struct btf_dump { struct btf_dump_opts opts; int ptr_sz; bool strip_mods; + bool skip_anon_defs; int last_id;
/* per-type auxiliary state */ @@ -89,6 +112,10 @@ struct btf_dump { * name occurrences */ struct hashmap *ident_names; + /* + * data for typed display; allocated if needed. + */ + struct btf_dump_data *typed_dump; };
static size_t str_hash_fn(const void *key, void *ctx) @@ -765,11 +792,11 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id) break; case BTF_KIND_FUNC_PROTO: { const struct btf_param *p = btf_params(t); - __u16 vlen = btf_vlen(t); + __u16 n = btf_vlen(t); int i;
btf_dump_emit_type(d, t->type, cont_id); - for (i = 0; i < vlen; i++, p++) + for (i = 0; i < n; i++, p++) btf_dump_emit_type(d, p->type, cont_id);
break; @@ -852,8 +879,9 @@ static void btf_dump_emit_bit_padding(const struct btf_dump *d, static void btf_dump_emit_struct_fwd(struct btf_dump *d, __u32 id, const struct btf_type *t) { - btf_dump_printf(d, "%s %s", + btf_dump_printf(d, "%s%s%s", btf_is_struct(t) ? "struct" : "union", + t->name_off ? " " : "", btf_dump_type_name(d, id)); }
@@ -1259,7 +1287,7 @@ static void btf_dump_emit_type_chain(struct btf_dump *d, case BTF_KIND_UNION: btf_dump_emit_mods(d, decls); /* inline anonymous struct/union */ - if (t->name_off == 0) + if (t->name_off == 0 && !d->skip_anon_defs) btf_dump_emit_struct_def(d, id, t, lvl); else btf_dump_emit_struct_fwd(d, id, t); @@ -1267,7 +1295,7 @@ static void btf_dump_emit_type_chain(struct btf_dump *d, case BTF_KIND_ENUM: btf_dump_emit_mods(d, decls); /* inline anonymous enum */ - if (t->name_off == 0) + if (t->name_off == 0 && !d->skip_anon_defs) btf_dump_emit_enum_def(d, id, t, lvl); else btf_dump_emit_enum_fwd(d, id, t); @@ -1392,6 +1420,39 @@ static void btf_dump_emit_type_chain(struct btf_dump *d, btf_dump_emit_name(d, fname, last_was_ptr); }
+/* show type name as (type_name) */ +static void btf_dump_emit_type_cast(struct btf_dump *d, __u32 id, + bool top_level) +{ + const struct btf_type *t; + + /* for array members, we don't bother emitting type name for each + * member to avoid the redundancy of + * .name = (char[4])[(char)'f',(char)'o',(char)'o',] + */ + if (d->typed_dump->is_array_member) + return; + + /* avoid type name specification for variable/section; it will be done + * for the associated variable value(s). + */ + t = btf__type_by_id(d->btf, id); + if (btf_is_var(t) || btf_is_datasec(t)) + return; + + if (top_level) + btf_dump_printf(d, "("); + + d->skip_anon_defs = true; + d->strip_mods = true; + btf_dump_emit_type_decl(d, id, "", 0); + d->strip_mods = false; + d->skip_anon_defs = false; + + if (top_level) + btf_dump_printf(d, ")"); +} + /* return number of duplicates (occurrences) of a given name */ static size_t btf_dump_name_dups(struct btf_dump *d, struct hashmap *name_map, const char *orig_name) @@ -1447,3 +1508,751 @@ static const char *btf_dump_ident_name(struct btf_dump *d, __u32 id) { return btf_dump_resolve_name(d, id, d->ident_names); } + +static int btf_dump_dump_type_data(struct btf_dump *d, + const char *fname, + const struct btf_type *t, + __u32 id, + const void *data, + __u8 bits_offset, + __u8 bit_sz); + +static const char *btf_dump_data_newline(struct btf_dump *d) +{ + return d->typed_dump->compact || d->typed_dump->depth == 0 ? "" : "\n"; +} + +static const char *btf_dump_data_delim(struct btf_dump *d) +{ + return d->typed_dump->depth == 0 ? "" : ","; +} + +static void btf_dump_data_pfx(struct btf_dump *d) +{ + int i, lvl = d->typed_dump->indent_lvl + d->typed_dump->depth; + + if (d->typed_dump->compact) + return; + + for (i = 0; i < lvl; i++) + btf_dump_printf(d, "%s", d->typed_dump->indent_str); +} + +/* A macro is used here as btf_type_value[s]() appends format specifiers + * to the format specifier passed in; these do the work of appending + * delimiters etc while the caller simply has to specify the type values + * in the format specifier + value(s). + */ +#define btf_dump_type_values(d, fmt, ...) \ + btf_dump_printf(d, fmt "%s%s", \ + ##__VA_ARGS__, \ + btf_dump_data_delim(d), \ + btf_dump_data_newline(d)) + +static int btf_dump_unsupported_data(struct btf_dump *d, + const struct btf_type *t, + __u32 id) +{ + btf_dump_printf(d, "<unsupported kind:%u>", btf_kind(t)); + return -ENOTSUP; +} + +static void btf_dump_int128(struct btf_dump *d, + const struct btf_type *t, + const void *data) +{ + __int128 num = *(__int128 *)data; + + if ((num >> 64) == 0) + btf_dump_type_values(d, "0x%llx", (long long)num); + else + btf_dump_type_values(d, "0x%llx%016llx", (long long)num >> 32, + (long long)num); +} + +static unsigned __int128 btf_dump_bitfield_get_data(struct btf_dump *d, + const struct btf_type *t, + const void *data, + __u8 bits_offset, + __u8 bit_sz) +{ + __u16 left_shift_bits, right_shift_bits; + __u8 nr_copy_bits, nr_copy_bytes; + unsigned __int128 num = 0, ret; + const __u8 *bytes = data; + int i; + + /* Bitfield value retrieval is done in two steps; first relevant bytes are + * stored in num, then we left/right shift num to eliminate irrelevant bits. + */ + nr_copy_bits = bit_sz + bits_offset; + nr_copy_bytes = t->size; +#if __BYTE_ORDER == __LITTLE_ENDIAN + for (i = nr_copy_bytes - 1; i >= 0; i--) + num = num * 256 + bytes[i]; +#elif __BYTE_ORDER == __BIG_ENDIAN + for (i = 0; i < nr_copy_bytes; i++) + num = num * 256 + bytes[i]; +#else +# error "Unrecognized __BYTE_ORDER__" +#endif + left_shift_bits = 128 - nr_copy_bits; + right_shift_bits = 128 - bit_sz; + + ret = (num << left_shift_bits) >> right_shift_bits; + + return ret; +} + +static int btf_dump_bitfield_check_zero(struct btf_dump *d, + const struct btf_type *t, + const void *data, + __u8 bits_offset, + __u8 bit_sz) +{ + __int128 check_num; + + check_num = btf_dump_bitfield_get_data(d, t, data, bits_offset, bit_sz); + if (check_num == 0) + return -ENODATA; + return 0; +} + +static int btf_dump_bitfield_data(struct btf_dump *d, + const struct btf_type *t, + const void *data, + __u8 bits_offset, + __u8 bit_sz) +{ + unsigned __int128 print_num; + + print_num = btf_dump_bitfield_get_data(d, t, data, bits_offset, bit_sz); + btf_dump_int128(d, t, &print_num); + + return 0; +} + +/* ints, floats and ptrs */ +static int btf_dump_base_type_check_zero(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data) +{ + static __u8 bytecmp[16] = {}; + int nr_bytes; + + /* For pointer types, pointer size is not defined on a per-type basis. + * On dump creation however, we store the pointer size. + */ + if (btf_kind(t) == BTF_KIND_PTR) + nr_bytes = d->ptr_sz; + else + nr_bytes = t->size; + + if (nr_bytes < 1 || nr_bytes > 16) { + pr_warn("unexpected size %d for id [%u]\n", nr_bytes, id); + return -EINVAL; + } + + if (memcmp(data, bytecmp, nr_bytes) == 0) + return -ENODATA; + return 0; +} + +static int btf_dump_int_data(struct btf_dump *d, + const struct btf_type *t, + __u32 type_id, + const void *data, + __u8 bits_offset) +{ + __u8 encoding = btf_int_encoding(t); + bool sign = encoding & BTF_INT_SIGNED; + int sz = t->size; + + if (sz == 0) { + pr_warn("unexpected size %d for id [%u]\n", sz, type_id); + return -EINVAL; + } + + /* handle packed int data - accesses of integers not aligned on + * int boundaries can cause problems on some platforms. + */ + if (((uintptr_t)data) % sz) + return btf_dump_bitfield_data(d, t, data, 0, 0); + + switch (sz) { + case 16: + btf_dump_int128(d, t, data); + break; + case 8: + if (sign) + btf_dump_type_values(d, "%lld", *(long long *)data); + else + btf_dump_type_values(d, "%llu", *(unsigned long long *)data); + break; + case 4: + if (sign) + btf_dump_type_values(d, "%d", *(__s32 *)data); + else + btf_dump_type_values(d, "%u", *(__u32 *)data); + break; + case 2: + if (sign) + btf_dump_type_values(d, "%d", *(__s16 *)data); + else + btf_dump_type_values(d, "%u", *(__u16 *)data); + break; + case 1: + if (d->typed_dump->is_array_char) { + /* check for null terminator */ + if (d->typed_dump->is_array_terminated) + break; + if (*(char *)data == '\0') { + d->typed_dump->is_array_terminated = true; + break; + } + if (isprint(*(char *)data)) { + btf_dump_type_values(d, "'%c'", *(char *)data); + break; + } + } + if (sign) + btf_dump_type_values(d, "%d", *(__s8 *)data); + else + btf_dump_type_values(d, "%u", *(__u8 *)data); + break; + default: + pr_warn("unexpected sz %d for id [%u]\n", sz, type_id); + return -EINVAL; + } + return 0; +} + +union float_data { + long double ld; + double d; + float f; +}; + +static int btf_dump_float_data(struct btf_dump *d, + const struct btf_type *t, + __u32 type_id, + const void *data) +{ + const union float_data *flp = data; + union float_data fl; + int sz = t->size; + + /* handle unaligned data; copy to local union */ + if (((uintptr_t)data) % sz) { + memcpy(&fl, data, sz); + flp = &fl; + } + + switch (sz) { + case 16: + btf_dump_type_values(d, "%Lf", flp->ld); + break; + case 8: + btf_dump_type_values(d, "%lf", flp->d); + break; + case 4: + btf_dump_type_values(d, "%f", flp->f); + break; + default: + pr_warn("unexpected size %d for id [%u]\n", sz, type_id); + return -EINVAL; + } + return 0; +} + +static int btf_dump_var_data(struct btf_dump *d, + const struct btf_type *v, + __u32 id, + const void *data) +{ + enum btf_func_linkage linkage = btf_var(v)->linkage; + const struct btf_type *t; + const char *l; + __u32 type_id; + + switch (linkage) { + case BTF_FUNC_STATIC: + l = "static "; + break; + case BTF_FUNC_EXTERN: + l = "extern "; + break; + case BTF_FUNC_GLOBAL: + default: + l = ""; + break; + } + + /* format of output here is [linkage] [type] [varname] = (type)value, + * for example "static int cpu_profile_flip = (int)1" + */ + btf_dump_printf(d, "%s", l); + type_id = v->type; + t = btf__type_by_id(d->btf, type_id); + btf_dump_emit_type_cast(d, type_id, false); + btf_dump_printf(d, " %s = ", btf_name_of(d, v->name_off)); + return btf_dump_dump_type_data(d, NULL, t, type_id, data, 0, 0); +} + +static int btf_dump_array_data(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data) +{ + const struct btf_array *array = btf_array(t); + const struct btf_type *elem_type; + __u32 i, elem_size = 0, elem_type_id; + bool is_array_member; + + elem_type_id = array->type; + elem_type = skip_mods_and_typedefs(d->btf, elem_type_id, NULL); + elem_size = btf__resolve_size(d->btf, elem_type_id); + if (elem_size <= 0) { + pr_warn("unexpected elem size %d for array type [%u]\n", elem_size, id); + return -EINVAL; + } + + if (btf_is_int(elem_type)) { + /* + * BTF_INT_CHAR encoding never seems to be set for + * char arrays, so if size is 1 and element is + * printable as a char, we'll do that. + */ + if (elem_size == 1) + d->typed_dump->is_array_char = true; + } + + /* note that we increment depth before calling btf_dump_print() below; + * this is intentional. btf_dump_data_newline() will not print a + * newline for depth 0 (since this leaves us with trailing newlines + * at the end of typed display), so depth is incremented first. + * For similar reasons, we decrement depth before showing the closing + * parenthesis. + */ + d->typed_dump->depth++; + btf_dump_printf(d, "[%s", btf_dump_data_newline(d)); + + /* may be a multidimensional array, so store current "is array member" + * status so we can restore it correctly later. + */ + is_array_member = d->typed_dump->is_array_member; + d->typed_dump->is_array_member = true; + for (i = 0; i < array->nelems; i++, data += elem_size) { + if (d->typed_dump->is_array_terminated) + break; + btf_dump_dump_type_data(d, NULL, elem_type, elem_type_id, data, 0, 0); + } + d->typed_dump->is_array_member = is_array_member; + d->typed_dump->depth--; + btf_dump_data_pfx(d); + btf_dump_type_values(d, "]"); + + return 0; +} + +static int btf_dump_struct_data(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data) +{ + const struct btf_member *m = btf_members(t); + __u16 n = btf_vlen(t); + int i, err; + + /* note that we increment depth before calling btf_dump_print() below; + * this is intentional. btf_dump_data_newline() will not print a + * newline for depth 0 (since this leaves us with trailing newlines + * at the end of typed display), so depth is incremented first. + * For similar reasons, we decrement depth before showing the closing + * parenthesis. + */ + d->typed_dump->depth++; + btf_dump_printf(d, "{%s", btf_dump_data_newline(d)); + + for (i = 0; i < n; i++, m++) { + const struct btf_type *mtype; + const char *mname; + __u32 moffset; + __u8 bit_sz; + + mtype = btf__type_by_id(d->btf, m->type); + mname = btf_name_of(d, m->name_off); + moffset = btf_member_bit_offset(t, i); + + bit_sz = btf_member_bitfield_size(t, i); + err = btf_dump_dump_type_data(d, mname, mtype, m->type, data + moffset / 8, + moffset % 8, bit_sz); + if (err < 0) + return err; + } + d->typed_dump->depth--; + btf_dump_data_pfx(d); + btf_dump_type_values(d, "}"); + return err; +} + +static int btf_dump_ptr_data(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data) +{ + btf_dump_type_values(d, "%p", *(void **)data); + return 0; +} + +static int btf_dump_get_enum_value(struct btf_dump *d, + const struct btf_type *t, + const void *data, + __u32 id, + __s64 *value) +{ + int sz = t->size; + + /* handle unaligned enum value */ + if (((uintptr_t)data) % sz) { + *value = (__s64)btf_dump_bitfield_get_data(d, t, data, 0, 0); + return 0; + } + switch (t->size) { + case 8: + *value = *(__s64 *)data; + return 0; + case 4: + *value = *(__s32 *)data; + return 0; + case 2: + *value = *(__s16 *)data; + return 0; + case 1: + *value = *(__s8 *)data; + return 0; + default: + pr_warn("unexpected size %d for enum, id:[%u]\n", t->size, id); + return -EINVAL; + } +} + +static int btf_dump_enum_data(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data) +{ + const struct btf_enum *e; + __s64 value; + int i, err; + + err = btf_dump_get_enum_value(d, t, data, id, &value); + if (err) + return err; + + for (i = 0, e = btf_enum(t); i < btf_vlen(t); i++, e++) { + if (value != e->val) + continue; + btf_dump_type_values(d, "%s", btf_name_of(d, e->name_off)); + return 0; + } + + btf_dump_type_values(d, "%d", value); + return 0; +} + +static int btf_dump_datasec_data(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data) +{ + const struct btf_var_secinfo *vsi; + const struct btf_type *var; + __u32 i; + int err; + + btf_dump_type_values(d, "SEC("%s") ", btf_name_of(d, t->name_off)); + + for (i = 0, vsi = btf_var_secinfos(t); i < btf_vlen(t); i++, vsi++) { + var = btf__type_by_id(d->btf, vsi->type); + err = btf_dump_dump_type_data(d, NULL, var, vsi->type, data + vsi->offset, 0, 0); + if (err < 0) + return err; + btf_dump_printf(d, ";"); + } + return 0; +} + +/* return size of type, or if base type overflows, return -E2BIG. */ +static int btf_dump_type_data_check_overflow(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data, + __u8 bits_offset) +{ + __s64 size = btf__resolve_size(d->btf, id); + + if (size < 0 || size >= INT_MAX) { + pr_warn("unexpected size [%lld] for id [%u]\n", + size, id); + return -EINVAL; + } + + /* Only do overflow checking for base types; we do not want to + * avoid showing part of a struct, union or array, even if we + * do not have enough data to show the full object. By + * restricting overflow checking to base types we can ensure + * that partial display succeeds, while avoiding overflowing + * and using bogus data for display. + */ + t = skip_mods_and_typedefs(d->btf, id, NULL); + if (!t) { + pr_warn("unexpected error skipping mods/typedefs for id [%u]\n", + id); + return -EINVAL; + } + + switch (btf_kind(t)) { + case BTF_KIND_INT: + case BTF_KIND_FLOAT: + case BTF_KIND_PTR: + case BTF_KIND_ENUM: + if (data + bits_offset / 8 + size > d->typed_dump->data_end) + return -E2BIG; + break; + default: + break; + } + return (int)size; +} + +static int btf_dump_type_data_check_zero(struct btf_dump *d, + const struct btf_type *t, + __u32 id, + const void *data, + __u8 bits_offset, + __u8 bit_sz) +{ + __s64 value; + int i, err; + + /* toplevel exceptions; we show zero values if + * - we ask for them (emit_zeros) + * - if we are at top-level so we see "struct empty { }" + * - or if we are an array member and the array is non-empty and + * not a char array; we don't want to be in a situation where we + * have an integer array 0, 1, 0, 1 and only show non-zero values. + * If the array contains zeroes only, or is a char array starting + * with a '\0', the array-level check_zero() will prevent showing it; + * we are concerned with determining zero value at the array member + * level here. + */ + if (d->typed_dump->emit_zeroes || d->typed_dump->depth == 0 || + (d->typed_dump->is_array_member && + !d->typed_dump->is_array_char)) + return 0; + + t = skip_mods_and_typedefs(d->btf, id, NULL); + + switch (btf_kind(t)) { + case BTF_KIND_INT: + if (bit_sz) + return btf_dump_bitfield_check_zero(d, t, data, bits_offset, bit_sz); + return btf_dump_base_type_check_zero(d, t, id, data); + case BTF_KIND_FLOAT: + case BTF_KIND_PTR: + return btf_dump_base_type_check_zero(d, t, id, data); + case BTF_KIND_ARRAY: { + const struct btf_array *array = btf_array(t); + const struct btf_type *elem_type; + __u32 elem_type_id, elem_size; + bool ischar; + + elem_type_id = array->type; + elem_size = btf__resolve_size(d->btf, elem_type_id); + elem_type = skip_mods_and_typedefs(d->btf, elem_type_id, NULL); + + ischar = btf_is_int(elem_type) && elem_size == 1; + + /* check all elements; if _any_ element is nonzero, all + * of array is displayed. We make an exception however + * for char arrays where the first element is 0; these + * are considered zeroed also, even if later elements are + * non-zero because the string is terminated. + */ + for (i = 0; i < array->nelems; i++) { + if (i == 0 && ischar && *(char *)data == 0) + return -ENODATA; + err = btf_dump_type_data_check_zero(d, elem_type, + elem_type_id, + data + + (i * elem_size), + bits_offset, 0); + if (err != -ENODATA) + return err; + } + return -ENODATA; + } + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: { + const struct btf_member *m = btf_members(t); + __u16 n = btf_vlen(t); + + /* if any struct/union member is non-zero, the struct/union + * is considered non-zero and dumped. + */ + for (i = 0; i < n; i++, m++) { + const struct btf_type *mtype; + __u32 moffset; + + mtype = btf__type_by_id(d->btf, m->type); + moffset = btf_member_bit_offset(t, i); + + /* btf_int_bits() does not store member bitfield size; + * bitfield size needs to be stored here so int display + * of member can retrieve it. + */ + bit_sz = btf_member_bitfield_size(t, i); + err = btf_dump_type_data_check_zero(d, mtype, m->type, data + moffset / 8, + moffset % 8, bit_sz); + if (err != ENODATA) + return err; + } + return -ENODATA; + } + case BTF_KIND_ENUM: + if (btf_dump_get_enum_value(d, t, data, id, &value)) + return 0; + if (value == 0) + return -ENODATA; + return 0; + default: + return 0; + } +} + +/* returns size of data dumped, or error. */ +static int btf_dump_dump_type_data(struct btf_dump *d, + const char *fname, + const struct btf_type *t, + __u32 id, + const void *data, + __u8 bits_offset, + __u8 bit_sz) +{ + int size, err; + + size = btf_dump_type_data_check_overflow(d, t, id, data, bits_offset); + if (size < 0) + return size; + err = btf_dump_type_data_check_zero(d, t, id, data, bits_offset, bit_sz); + if (err) { + /* zeroed data is expected and not an error, so simply skip + * dumping such data. Record other errors however. + */ + if (err == -ENODATA) + return size; + return err; + } + btf_dump_data_pfx(d); + + if (!d->typed_dump->skip_names) { + if (fname && strlen(fname) > 0) + btf_dump_printf(d, ".%s = ", fname); + btf_dump_emit_type_cast(d, id, true); + } + + t = skip_mods_and_typedefs(d->btf, id, NULL); + + switch (btf_kind(t)) { + case BTF_KIND_UNKN: + case BTF_KIND_FWD: + case BTF_KIND_FUNC: + case BTF_KIND_FUNC_PROTO: + err = btf_dump_unsupported_data(d, t, id); + break; + case BTF_KIND_INT: + if (bit_sz) + err = btf_dump_bitfield_data(d, t, data, bits_offset, bit_sz); + else + err = btf_dump_int_data(d, t, id, data, bits_offset); + break; + case BTF_KIND_FLOAT: + err = btf_dump_float_data(d, t, id, data); + break; + case BTF_KIND_PTR: + err = btf_dump_ptr_data(d, t, id, data); + break; + case BTF_KIND_ARRAY: + err = btf_dump_array_data(d, t, id, data); + break; + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + err = btf_dump_struct_data(d, t, id, data); + break; + case BTF_KIND_ENUM: + /* handle bitfield and int enum values */ + if (bit_sz) { + unsigned __int128 print_num; + __s64 enum_val; + + print_num = btf_dump_bitfield_get_data(d, t, data, bits_offset, bit_sz); + enum_val = (__s64)print_num; + err = btf_dump_enum_data(d, t, id, &enum_val); + } else + err = btf_dump_enum_data(d, t, id, data); + break; + case BTF_KIND_VAR: + err = btf_dump_var_data(d, t, id, data); + break; + case BTF_KIND_DATASEC: + err = btf_dump_datasec_data(d, t, id, data); + break; + default: + pr_warn("unexpected kind [%u] for id [%u]\n", + BTF_INFO_KIND(t->info), id); + return -EINVAL; + } + if (err < 0) + return err; + return size; +} + +int btf_dump__dump_type_data(struct btf_dump *d, __u32 id, + const void *data, size_t data_sz, + const struct btf_dump_type_data_opts *opts) +{ + const struct btf_type *t; + int ret; + + if (!OPTS_VALID(opts, btf_dump_type_data_opts)) + return libbpf_err(-EINVAL); + + t = btf__type_by_id(d->btf, id); + if (!t) + return libbpf_err(-ENOENT); + + d->typed_dump = calloc(1, sizeof(struct btf_dump_data)); + if (!d->typed_dump) + return libbpf_err(-ENOMEM); + + d->typed_dump->data_end = data + data_sz; + d->typed_dump->indent_lvl = OPTS_GET(opts, indent_level, 0); + /* default indent string is a tab */ + if (!opts->indent_str) + d->typed_dump->indent_str[0] = '\t'; + else + strncat(d->typed_dump->indent_str, opts->indent_str, + sizeof(d->typed_dump->indent_str) - 1); + + d->typed_dump->compact = OPTS_GET(opts, compact, false); + d->typed_dump->skip_names = OPTS_GET(opts, skip_names, false); + d->typed_dump->emit_zeroes = OPTS_GET(opts, emit_zeroes, false); + + ret = btf_dump_dump_type_data(d, NULL, t, id, data, 0, 0); + + free(d->typed_dump); + + return libbpf_err(ret); +} diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 3bb9ebd8d287..9bbb195812fa 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -373,5 +373,6 @@ LIBBPF_0.5.0 { bpf_map__initial_value; bpf_map_lookup_and_delete_elem_flags; bpf_object__gen_loader; + btf_dump__dump_type_data; libbpf_set_strict_mode; } LIBBPF_0.4.0;
From: Alan Maguire alan.maguire@oracle.com
mainline inclusion from mainline-5.15-rc1 commit 8d44c3578b48d5f605eddcfd6a644e3944455a6b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
If data is packed, data structures can store it outside of usual boundaries. For example a 4-byte int can be stored on a unaligned boundary in a case like this:
struct s { char f1; int f2; } __attribute((packed));
...the int is stored at an offset of one byte. Some platforms have problems dereferencing data that is not aligned with its size, and code exists to handle most cases of this for BTF typed data display. However pointer display was missed, and a simple function to test if "ptr_is_aligned(data, data_sz)" would help clarify this code.
Suggested-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@o... (cherry picked from commit 8d44c3578b48d5f605eddcfd6a644e3944455a6b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf_dump.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index ff21ac98ab07..8748d357d4ff 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -1659,6 +1659,11 @@ static int btf_dump_base_type_check_zero(struct btf_dump *d, return 0; }
+static bool ptr_is_aligned(const void *data, int data_sz) +{ + return ((uintptr_t)data) % data_sz == 0; +} + static int btf_dump_int_data(struct btf_dump *d, const struct btf_type *t, __u32 type_id, @@ -1677,7 +1682,7 @@ static int btf_dump_int_data(struct btf_dump *d, /* handle packed int data - accesses of integers not aligned on * int boundaries can cause problems on some platforms. */ - if (((uintptr_t)data) % sz) + if (!ptr_is_aligned(data, sz)) return btf_dump_bitfield_data(d, t, data, 0, 0);
switch (sz) { @@ -1744,7 +1749,7 @@ static int btf_dump_float_data(struct btf_dump *d, int sz = t->size;
/* handle unaligned data; copy to local union */ - if (((uintptr_t)data) % sz) { + if (!ptr_is_aligned(data, sz)) { memcpy(&fl, data, sz); flp = &fl; } @@ -1897,12 +1902,27 @@ static int btf_dump_struct_data(struct btf_dump *d, return err; }
+union ptr_data { + unsigned int p; + unsigned long long lp; +}; + static int btf_dump_ptr_data(struct btf_dump *d, const struct btf_type *t, __u32 id, const void *data) { - btf_dump_type_values(d, "%p", *(void **)data); + if (ptr_is_aligned(data, d->ptr_sz) && d->ptr_sz == sizeof(void *)) { + btf_dump_type_values(d, "%p", *(void **)data); + } else { + union ptr_data pt; + + memcpy(&pt, data, d->ptr_sz); + if (d->ptr_sz == 4) + btf_dump_type_values(d, "0x%x", pt.p); + else + btf_dump_type_values(d, "0x%llx", pt.lp); + } return 0; }
@@ -1915,7 +1935,7 @@ static int btf_dump_get_enum_value(struct btf_dump *d, int sz = t->size;
/* handle unaligned enum value */ - if (((uintptr_t)data) % sz) { + if (!ptr_is_aligned(data, sz)) { *value = (__s64)btf_dump_bitfield_get_data(d, t, data, 0, 0); return 0; }
From: Martynas Pumputis m@lambda.lt
mainline inclusion from mainline-5.15-rc1 commit 08f71a1e39a1f07a464ac782d9b612d6a74c7015 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a test case to check whether an unsuccessful creation of an outer map of a BTF-defined map-in-map destroys the inner map.
As bpf_object__create_map() is a static function, we cannot just call it from the test case and then check whether a map accessible via map->inner_map_fd has been closed. Instead, we iterate over all maps and check whether the map "$MAP_NAME.inner" does not exist.
Signed-off-by: Martynas Pumputis m@lambda.lt Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210719173838.423148-3-m@lambda.lt (cherry picked from commit 08f71a1e39a1f07a464ac782d9b612d6a74c7015) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../bpf/progs/test_map_in_map_invalid.c | 26 ++++++++ tools/testing/selftests/bpf/test_maps.c | 63 ++++++++++++++++++- 2 files changed, 88 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/test_map_in_map_invalid.c
diff --git a/tools/testing/selftests/bpf/progs/test_map_in_map_invalid.c b/tools/testing/selftests/bpf/progs/test_map_in_map_invalid.c new file mode 100644 index 000000000000..703c08e06442 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_map_in_map_invalid.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Isovalent, Inc. */ +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct inner { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, __u32); + __type(value, int); + __uint(max_entries, 4); +}; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS); + __uint(max_entries, 0); /* This will make map creation to fail */ + __uint(key_size, sizeof(__u32)); + __array(values, struct inner); +} mim SEC(".maps"); + +SEC("xdp") +int xdp_noop0(struct xdp_md *ctx) +{ + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c index 20eb7301850e..b241bb07972e 100644 --- a/tools/testing/selftests/bpf/test_maps.c +++ b/tools/testing/selftests/bpf/test_maps.c @@ -1136,12 +1136,16 @@ static void test_sockmap(unsigned int tasks, void *data) }
#define MAPINMAP_PROG "./test_map_in_map.o" +#define MAPINMAP_INVALID_PROG "./test_map_in_map_invalid.o" static void test_map_in_map(void) { struct bpf_object *obj; struct bpf_map *map; int mim_fd, fd, err; int pos = 0; + struct bpf_map_info info = {}; + __u32 len = sizeof(info); + __u32 id = 0;
obj = bpf_object__open(MAPINMAP_PROG);
@@ -1211,11 +1215,68 @@ static void test_map_in_map(void) }
close(fd); + fd = -1; bpf_object__close(obj); + + /* Test that failing bpf_object__create_map() destroys the inner map */ + obj = bpf_object__open(MAPINMAP_INVALID_PROG); + err = libbpf_get_error(obj); + if (err) { + printf("Failed to load %s program: %d %d", + MAPINMAP_INVALID_PROG, err, errno); + goto out_map_in_map; + } + + map = bpf_object__find_map_by_name(obj, "mim"); + if (!map) { + printf("Failed to load array of maps from test prog\n"); + goto out_map_in_map; + } + + err = bpf_object__load(obj); + if (!err) { + printf("Loading obj supposed to fail\n"); + goto out_map_in_map; + } + + /* Iterate over all maps to check whether the internal map + * ("mim.internal") has been destroyed. + */ + while (true) { + err = bpf_map_get_next_id(id, &id); + if (err) { + if (errno == ENOENT) + break; + printf("Failed to get next map: %d", errno); + goto out_map_in_map; + } + + fd = bpf_map_get_fd_by_id(id); + if (fd < 0) { + if (errno == ENOENT) + continue; + printf("Failed to get map by id %u: %d", id, errno); + goto out_map_in_map; + } + + err = bpf_obj_get_info_by_fd(fd, &info, &len); + if (err) { + printf("Failed to get map info by fd %d: %d", fd, + errno); + goto out_map_in_map; + } + + if (!strcmp(info.name, "mim.inner")) { + printf("Inner map mim.inner was not destroyed\n"); + goto out_map_in_map; + } + } + return;
out_map_in_map: - close(fd); + if (fd >= 0) + close(fd); exit(1); }
From: Alan Maguire alan.maguire@oracle.com
mainline inclusion from mainline-5.15-rc1 commit a1d3cc3c5eca598cfabee3a35f30f34fbe2f709b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
__int128 is not supported for some 32-bit platforms (arm and i386). __int128 was used in carrying out computations on bitfields which aid display, but the same calculations could be done with __u64 with the small effect of not supporting 128-bit bitfields.
With these changes, a big-endian issue with casting 128-bit integers to 64-bit for enum bitfields is solved also, as we now use 64-bit integers for bitfield calculations.
Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Reported-by: Linux Kernel Functional Testing lkft@linaro.org Signed-off-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@o... (cherry picked from commit a1d3cc3c5eca598cfabee3a35f30f34fbe2f709b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf_dump.c | 98 ++++++++++++++++++++++++++-------------- 1 file changed, 65 insertions(+), 33 deletions(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 8748d357d4ff..bac3a902d4bb 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -1557,31 +1557,26 @@ static int btf_dump_unsupported_data(struct btf_dump *d, return -ENOTSUP; }
-static void btf_dump_int128(struct btf_dump *d, - const struct btf_type *t, - const void *data) -{ - __int128 num = *(__int128 *)data; - - if ((num >> 64) == 0) - btf_dump_type_values(d, "0x%llx", (long long)num); - else - btf_dump_type_values(d, "0x%llx%016llx", (long long)num >> 32, - (long long)num); -} - -static unsigned __int128 btf_dump_bitfield_get_data(struct btf_dump *d, - const struct btf_type *t, - const void *data, - __u8 bits_offset, - __u8 bit_sz) +static int btf_dump_get_bitfield_value(struct btf_dump *d, + const struct btf_type *t, + const void *data, + __u8 bits_offset, + __u8 bit_sz, + __u64 *value) { __u16 left_shift_bits, right_shift_bits; __u8 nr_copy_bits, nr_copy_bytes; - unsigned __int128 num = 0, ret; const __u8 *bytes = data; + int sz = t->size; + __u64 num = 0; int i;
+ /* Maximum supported bitfield size is 64 bits */ + if (sz > 8) { + pr_warn("unexpected bitfield size %d\n", sz); + return -EINVAL; + } + /* Bitfield value retrieval is done in two steps; first relevant bytes are * stored in num, then we left/right shift num to eliminate irrelevant bits. */ @@ -1596,12 +1591,12 @@ static unsigned __int128 btf_dump_bitfield_get_data(struct btf_dump *d, #else # error "Unrecognized __BYTE_ORDER__" #endif - left_shift_bits = 128 - nr_copy_bits; - right_shift_bits = 128 - bit_sz; + left_shift_bits = 64 - nr_copy_bits; + right_shift_bits = 64 - bit_sz;
- ret = (num << left_shift_bits) >> right_shift_bits; + *value = (num << left_shift_bits) >> right_shift_bits;
- return ret; + return 0; }
static int btf_dump_bitfield_check_zero(struct btf_dump *d, @@ -1610,9 +1605,12 @@ static int btf_dump_bitfield_check_zero(struct btf_dump *d, __u8 bits_offset, __u8 bit_sz) { - __int128 check_num; + __u64 check_num; + int err;
- check_num = btf_dump_bitfield_get_data(d, t, data, bits_offset, bit_sz); + err = btf_dump_get_bitfield_value(d, t, data, bits_offset, bit_sz, &check_num); + if (err) + return err; if (check_num == 0) return -ENODATA; return 0; @@ -1624,10 +1622,14 @@ static int btf_dump_bitfield_data(struct btf_dump *d, __u8 bits_offset, __u8 bit_sz) { - unsigned __int128 print_num; + __u64 print_num; + int err; + + err = btf_dump_get_bitfield_value(d, t, data, bits_offset, bit_sz, &print_num); + if (err) + return err;
- print_num = btf_dump_bitfield_get_data(d, t, data, bits_offset, bit_sz); - btf_dump_int128(d, t, &print_num); + btf_dump_type_values(d, "0x%llx", (unsigned long long)print_num);
return 0; } @@ -1686,9 +1688,29 @@ static int btf_dump_int_data(struct btf_dump *d, return btf_dump_bitfield_data(d, t, data, 0, 0);
switch (sz) { - case 16: - btf_dump_int128(d, t, data); + case 16: { + const __u64 *ints = data; + __u64 lsi, msi; + + /* avoid use of __int128 as some 32-bit platforms do not + * support it. + */ +#if __BYTE_ORDER == __LITTLE_ENDIAN + lsi = ints[0]; + msi = ints[1]; +#elif __BYTE_ORDER == __BIG_ENDIAN + lsi = ints[1]; + msi = ints[0]; +#else +# error "Unrecognized __BYTE_ORDER__" +#endif + if (msi == 0) + btf_dump_type_values(d, "0x%llx", (unsigned long long)lsi); + else + btf_dump_type_values(d, "0x%llx%016llx", (unsigned long long)msi, + (unsigned long long)lsi); break; + } case 8: if (sign) btf_dump_type_values(d, "%lld", *(long long *)data); @@ -1936,9 +1958,16 @@ static int btf_dump_get_enum_value(struct btf_dump *d,
/* handle unaligned enum value */ if (!ptr_is_aligned(data, sz)) { - *value = (__s64)btf_dump_bitfield_get_data(d, t, data, 0, 0); + __u64 val; + int err; + + err = btf_dump_get_bitfield_value(d, t, data, 0, 0, &val); + if (err) + return err; + *value = (__s64)val; return 0; } + switch (t->size) { case 8: *value = *(__s64 *)data; @@ -2214,10 +2243,13 @@ static int btf_dump_dump_type_data(struct btf_dump *d, case BTF_KIND_ENUM: /* handle bitfield and int enum values */ if (bit_sz) { - unsigned __int128 print_num; + __u64 print_num; __s64 enum_val;
- print_num = btf_dump_bitfield_get_data(d, t, data, bits_offset, bit_sz); + err = btf_dump_get_bitfield_value(d, t, data, bits_offset, bit_sz, + &print_num); + if (err) + break; enum_val = (__s64)print_num; err = btf_dump_enum_data(d, t, id, &enum_val); } else
From: Alan Maguire alan.maguire@oracle.com
mainline inclusion from mainline-5.15-rc1 commit 720c29fca9fb87c473148ff666b75314420cdda6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When retrieving the enum value associated with typed data during "is data zero?" checking in btf_dump_type_data_check_zero(), the return value of btf_dump_get_enum_value() is not passed to the caller if the function returns a non-zero (error) value. Currently, 0 is returned if the function returns an error. We should instead propagate the error to the caller.
Signed-off-by: Alan Maguire alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@o... (cherry picked from commit 720c29fca9fb87c473148ff666b75314420cdda6) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf_dump.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index bac3a902d4bb..9937d35c8f36 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -2171,8 +2171,9 @@ static int btf_dump_type_data_check_zero(struct btf_dump *d, return -ENODATA; } case BTF_KIND_ENUM: - if (btf_dump_get_enum_value(d, t, data, id, &value)) - return 0; + err = btf_dump_get_enum_value(d, t, data, id, &value); + if (err) + return err; if (value == 0) return -ENODATA; return 0;
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.15-rc1 commit 6e43b28607848eeb079c033f415b410788569b27 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
CO-RE processing functions don't need to know 'struct bpf_program' details. Cleanup the layering to eventually be able to move CO-RE logic into a separate file.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210721000822.40958-2-alexei.starovoitov@gmail.... (cherry picked from commit 6e43b28607848eeb079c033f415b410788569b27) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 74 ++++++++++++++++++++++-------------------- 1 file changed, 38 insertions(+), 36 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 36d0a6f17a34..ee20ca151b54 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5625,7 +5625,7 @@ static int bpf_core_spec_match(struct bpf_core_spec *local_spec, return 1; }
-static int bpf_core_calc_field_relo(const struct bpf_program *prog, +static int bpf_core_calc_field_relo(const char *prog_name, const struct bpf_core_relo *relo, const struct bpf_core_spec *spec, __u32 *val, __u32 *field_sz, __u32 *type_id, @@ -5669,7 +5669,7 @@ static int bpf_core_calc_field_relo(const struct bpf_program *prog, *val = sz; } else { pr_warn("prog '%s': relo %d at insn #%d can't be applied to array access\n", - prog->name, relo->kind, relo->insn_off / 8); + prog_name, relo->kind, relo->insn_off / 8); return -EINVAL; } if (validate) @@ -5691,7 +5691,7 @@ static int bpf_core_calc_field_relo(const struct bpf_program *prog, if (byte_sz >= 8) { /* bitfield can't be read with 64-bit read */ pr_warn("prog '%s': relo %d at insn #%d can't be satisfied for bitfield\n", - prog->name, relo->kind, relo->insn_off / 8); + prog_name, relo->kind, relo->insn_off / 8); return -E2BIG; } byte_sz *= 2; @@ -5841,7 +5841,7 @@ struct bpf_core_relo_res * with each other. Otherwise, libbpf will refuse to proceed due to ambiguity. * If instruction has to be poisoned, *poison will be set to true. */ -static int bpf_core_calc_relo(const struct bpf_program *prog, +static int bpf_core_calc_relo(const char *prog_name, const struct bpf_core_relo *relo, int relo_idx, const struct bpf_core_spec *local_spec, @@ -5859,10 +5859,10 @@ static int bpf_core_calc_relo(const struct bpf_program *prog, res->orig_type_id = res->new_type_id = 0;
if (core_relo_is_field_based(relo->kind)) { - err = bpf_core_calc_field_relo(prog, relo, local_spec, + err = bpf_core_calc_field_relo(prog_name, relo, local_spec, &res->orig_val, &res->orig_sz, &res->orig_type_id, &res->validate); - err = err ?: bpf_core_calc_field_relo(prog, relo, targ_spec, + err = err ?: bpf_core_calc_field_relo(prog_name, relo, targ_spec, &res->new_val, &res->new_sz, &res->new_type_id, NULL); if (err) @@ -5920,7 +5920,7 @@ static int bpf_core_calc_relo(const struct bpf_program *prog, } else if (err == -EOPNOTSUPP) { /* EOPNOTSUPP means unknown/unsupported relocation */ pr_warn("prog '%s': relo #%d: unrecognized CO-RE relocation %s (%d) at insn #%d\n", - prog->name, relo_idx, core_relo_kind_str(relo->kind), + prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind, relo->insn_off / 8); }
@@ -5931,11 +5931,11 @@ static int bpf_core_calc_relo(const struct bpf_program *prog, * Turn instruction for which CO_RE relocation failed into invalid one with * distinct signature. */ -static void bpf_core_poison_insn(struct bpf_program *prog, int relo_idx, +static void bpf_core_poison_insn(const char *prog_name, int relo_idx, int insn_idx, struct bpf_insn *insn) { pr_debug("prog '%s': relo #%d: substituting insn #%d w/ invalid insn\n", - prog->name, relo_idx, insn_idx); + prog_name, relo_idx, insn_idx); insn->code = BPF_JMP | BPF_CALL; insn->dst_reg = 0; insn->src_reg = 0; @@ -5991,6 +5991,7 @@ static int bpf_core_patch_insn(struct bpf_program *prog, int relo_idx, const struct bpf_core_relo_res *res) { + const char *prog_name = prog->name; __u32 orig_val, new_val; struct bpf_insn *insn; int insn_idx; @@ -6013,8 +6014,8 @@ static int bpf_core_patch_insn(struct bpf_program *prog, * verifier about "unknown opcode 00" */ if (is_ldimm64_insn(insn)) - bpf_core_poison_insn(prog, relo_idx, insn_idx + 1, insn + 1); - bpf_core_poison_insn(prog, relo_idx, insn_idx, insn); + bpf_core_poison_insn(prog_name, relo_idx, insn_idx + 1, insn + 1); + bpf_core_poison_insn(prog_name, relo_idx, insn_idx, insn); return 0; }
@@ -6028,14 +6029,14 @@ static int bpf_core_patch_insn(struct bpf_program *prog, return -EINVAL; if (res->validate && insn->imm != orig_val) { pr_warn("prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %u -> %u\n", - prog->name, relo_idx, + prog_name, relo_idx, insn_idx, insn->imm, orig_val, new_val); return -EINVAL; } orig_val = insn->imm; insn->imm = new_val; pr_debug("prog '%s': relo #%d: patched insn #%d (ALU/ALU64) imm %u -> %u\n", - prog->name, relo_idx, insn_idx, + prog_name, relo_idx, insn_idx, orig_val, new_val); break; case BPF_LDX: @@ -6043,25 +6044,25 @@ static int bpf_core_patch_insn(struct bpf_program *prog, case BPF_STX: if (res->validate && insn->off != orig_val) { pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %u -> %u\n", - prog->name, relo_idx, insn_idx, insn->off, orig_val, new_val); + prog_name, relo_idx, insn_idx, insn->off, orig_val, new_val); return -EINVAL; } if (new_val > SHRT_MAX) { pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) value too big: %u\n", - prog->name, relo_idx, insn_idx, new_val); + prog_name, relo_idx, insn_idx, new_val); return -ERANGE; } if (res->fail_memsz_adjust) { pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) accesses field incorrectly. " "Make sure you are accessing pointers, unsigned integers, or fields of matching type and size.\n", - prog->name, relo_idx, insn_idx); + prog_name, relo_idx, insn_idx); goto poison; }
orig_val = insn->off; insn->off = new_val; pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) off %u -> %u\n", - prog->name, relo_idx, insn_idx, orig_val, new_val); + prog_name, relo_idx, insn_idx, orig_val, new_val);
if (res->new_sz != res->orig_sz) { int insn_bytes_sz, insn_bpf_sz; @@ -6069,20 +6070,20 @@ static int bpf_core_patch_insn(struct bpf_program *prog, insn_bytes_sz = insn_bpf_size_to_bytes(insn); if (insn_bytes_sz != res->orig_sz) { pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) unexpected mem size: got %d, exp %u\n", - prog->name, relo_idx, insn_idx, insn_bytes_sz, res->orig_sz); + prog_name, relo_idx, insn_idx, insn_bytes_sz, res->orig_sz); return -EINVAL; }
insn_bpf_sz = insn_bytes_to_bpf_size(res->new_sz); if (insn_bpf_sz < 0) { pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) invalid new mem size: %u\n", - prog->name, relo_idx, insn_idx, res->new_sz); + prog_name, relo_idx, insn_idx, res->new_sz); return -EINVAL; }
insn->code = BPF_MODE(insn->code) | insn_bpf_sz | BPF_CLASS(insn->code); pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) mem_sz %u -> %u\n", - prog->name, relo_idx, insn_idx, res->orig_sz, res->new_sz); + prog_name, relo_idx, insn_idx, res->orig_sz, res->new_sz); } break; case BPF_LD: { @@ -6094,14 +6095,14 @@ static int bpf_core_patch_insn(struct bpf_program *prog, insn[1].code != 0 || insn[1].dst_reg != 0 || insn[1].src_reg != 0 || insn[1].off != 0) { pr_warn("prog '%s': relo #%d: insn #%d (LDIMM64) has unexpected form\n", - prog->name, relo_idx, insn_idx); + prog_name, relo_idx, insn_idx); return -EINVAL; }
imm = insn[0].imm + ((__u64)insn[1].imm << 32); if (res->validate && imm != orig_val) { pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDIMM64) value: got %llu, exp %u -> %u\n", - prog->name, relo_idx, + prog_name, relo_idx, insn_idx, (unsigned long long)imm, orig_val, new_val); return -EINVAL; @@ -6110,13 +6111,13 @@ static int bpf_core_patch_insn(struct bpf_program *prog, insn[0].imm = new_val; insn[1].imm = 0; /* currently only 32-bit values are supported */ pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %u\n", - prog->name, relo_idx, insn_idx, + prog_name, relo_idx, insn_idx, (unsigned long long)imm, new_val); break; } default: pr_warn("prog '%s': relo #%d: trying to relocate unrecognized insn #%d, code:0x%x, src:0x%x, dst:0x%x, off:0x%x, imm:0x%x\n", - prog->name, relo_idx, insn_idx, insn->code, + prog_name, relo_idx, insn_idx, insn->code, insn->src_reg, insn->dst_reg, insn->off, insn->imm); return -EINVAL; } @@ -6252,6 +6253,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, const struct btf_type *local_type; const char *local_name; struct core_cand_list *cands = NULL; + const char *prog_name = prog->name; __u32 local_id; const char *spec_str; int i, j, err; @@ -6278,13 +6280,13 @@ static int bpf_core_apply_relo(struct bpf_program *prog, err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); if (err) { pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", - prog->name, relo_idx, local_id, btf_kind_str(local_type), + prog_name, relo_idx, local_id, btf_kind_str(local_type), str_is_empty(local_name) ? "<anon>" : local_name, spec_str, err); return -EINVAL; }
- pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog->name, + pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec); libbpf_print(LIBBPF_DEBUG, "\n"); @@ -6301,7 +6303,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, /* libbpf doesn't support candidate search for anonymous types */ if (str_is_empty(spec_str)) { pr_warn("prog '%s': relo #%d: <%s> (%d) relocation doesn't support anonymous types\n", - prog->name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); + prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); return -EOPNOTSUPP; }
@@ -6309,7 +6311,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, cands = bpf_core_find_cands(prog->obj, local_btf, local_id); if (IS_ERR(cands)) { pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n", - prog->name, relo_idx, local_id, btf_kind_str(local_type), + prog_name, relo_idx, local_id, btf_kind_str(local_type), local_name, PTR_ERR(cands)); return PTR_ERR(cands); } @@ -6325,13 +6327,13 @@ static int bpf_core_apply_relo(struct bpf_program *prog, cands->cands[i].id, &cand_spec); if (err < 0) { pr_warn("prog '%s': relo #%d: error matching candidate #%d ", - prog->name, relo_idx, i); + prog_name, relo_idx, i); bpf_core_dump_spec(LIBBPF_WARN, &cand_spec); libbpf_print(LIBBPF_WARN, ": %d\n", err); return err; }
- pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog->name, + pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name, relo_idx, err == 0 ? "non-matching" : "matching", i); bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec); libbpf_print(LIBBPF_DEBUG, "\n"); @@ -6339,7 +6341,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, if (err == 0) continue;
- err = bpf_core_calc_relo(prog, relo, relo_idx, &local_spec, &cand_spec, &cand_res); + err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, &cand_spec, &cand_res); if (err) return err;
@@ -6351,7 +6353,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, * should all resolve to the same bit offset */ pr_warn("prog '%s': relo #%d: field offset ambiguity: %u != %u\n", - prog->name, relo_idx, cand_spec.bit_offset, + prog_name, relo_idx, cand_spec.bit_offset, targ_spec.bit_offset); return -EINVAL; } else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) { @@ -6360,7 +6362,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, * proceed due to ambiguity */ pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n", - prog->name, relo_idx, + prog_name, relo_idx, cand_res.poison ? "failure" : "success", cand_res.new_val, targ_res.poison ? "failure" : "success", targ_res.new_val); return -EINVAL; @@ -6393,10 +6395,10 @@ static int bpf_core_apply_relo(struct bpf_program *prog, */ if (j == 0) { pr_debug("prog '%s': relo #%d: no matching targets found\n", - prog->name, relo_idx); + prog_name, relo_idx);
/* calculate single target relo result explicitly */ - err = bpf_core_calc_relo(prog, relo, relo_idx, &local_spec, NULL, &targ_res); + err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, NULL, &targ_res); if (err) return err; } @@ -6406,7 +6408,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, err = bpf_core_patch_insn(prog, relo, relo_idx, &targ_res); if (err) { pr_warn("prog '%s': relo #%d: failed to patch insn #%zu: %d\n", - prog->name, relo_idx, relo->insn_off / BPF_INSN_SZ, err); + prog_name, relo_idx, relo->insn_off / BPF_INSN_SZ, err); return -EINVAL; }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.15-rc1 commit 3ee4f5335511b5357d3e762b3461b0d13e565ad5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
bpf_core_apply_relo() doesn't need to know bpf_program internals and hashmap details.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210721000822.40958-3-alexei.starovoitov@gmail.... (cherry picked from commit 3ee4f5335511b5357d3e762b3461b0d13e565ad5) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 117 +++++++++++++++++++++++++---------------- 1 file changed, 71 insertions(+), 46 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ee20ca151b54..423844b07ef2 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5986,26 +5986,13 @@ static int insn_bytes_to_bpf_size(__u32 sz) * 5. *(T *)(rX + <off>) = rY, where T is one of {u8, u16, u32, u64}; * 6. *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}. */ -static int bpf_core_patch_insn(struct bpf_program *prog, - const struct bpf_core_relo *relo, - int relo_idx, - const struct bpf_core_relo_res *res) +static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, + int insn_idx, const struct bpf_core_relo *relo, + int relo_idx, const struct bpf_core_relo_res *res) { - const char *prog_name = prog->name; __u32 orig_val, new_val; - struct bpf_insn *insn; - int insn_idx; __u8 class;
- if (relo->insn_off % BPF_INSN_SZ) - return -EINVAL; - insn_idx = relo->insn_off / BPF_INSN_SZ; - /* adjust insn_idx from section frame of reference to the local - * program's frame of reference; (sub-)program code is not yet - * relocated, so it's enough to just subtract in-section offset - */ - insn_idx = insn_idx - prog->sec_insn_off; - insn = &prog->insns[insn_idx]; class = BPF_CLASS(insn->code);
if (res->poison) { @@ -6091,7 +6078,6 @@ static int bpf_core_patch_insn(struct bpf_program *prog,
if (!is_ldimm64_insn(insn) || insn[0].src_reg != 0 || insn[0].off != 0 || - insn_idx + 1 >= prog->insns_cnt || insn[1].code != 0 || insn[1].dst_reg != 0 || insn[1].src_reg != 0 || insn[1].off != 0) { pr_warn("prog '%s': relo #%d: insn #%d (LDIMM64) has unexpected form\n", @@ -6241,19 +6227,17 @@ static void *u32_as_hash_key(__u32 x) * between multiple relocations for the same type ID and is updated as some * of the candidates are pruned due to structural incompatibility. */ -static int bpf_core_apply_relo(struct bpf_program *prog, - const struct bpf_core_relo *relo, - int relo_idx, - const struct btf *local_btf, - struct hashmap *cand_cache) +static int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, + int insn_idx, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct core_cand_list *cands) { struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; - const void *type_key = u32_as_hash_key(relo->type_id); struct bpf_core_relo_res cand_res, targ_res; const struct btf_type *local_type; const char *local_name; - struct core_cand_list *cands = NULL; - const char *prog_name = prog->name; __u32 local_id; const char *spec_str; int i, j, err; @@ -6271,12 +6255,6 @@ static int bpf_core_apply_relo(struct bpf_program *prog, if (str_is_empty(spec_str)) return -EINVAL;
- if (prog->obj->gen_loader) { - pr_warn("// TODO core_relo: prog %td insn[%d] %s %s kind %d\n", - prog - prog->obj->programs, relo->insn_off / 8, - local_name, spec_str, relo->kind); - return -ENOTSUP; - } err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); if (err) { pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", @@ -6307,20 +6285,6 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return -EOPNOTSUPP; }
- if (!hashmap__find(cand_cache, type_key, (void **)&cands)) { - cands = bpf_core_find_cands(prog->obj, local_btf, local_id); - if (IS_ERR(cands)) { - pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n", - prog_name, relo_idx, local_id, btf_kind_str(local_type), - local_name, PTR_ERR(cands)); - return PTR_ERR(cands); - } - err = hashmap__set(cand_cache, type_key, cands, NULL, NULL); - if (err) { - bpf_core_free_cands(cands); - return err; - } - }
for (i = 0, j = 0; i < cands->len; i++) { err = bpf_core_spec_match(&local_spec, cands->cands[i].btf, @@ -6405,7 +6369,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
patch_insn: /* bpf_core_patch_insn() should know how to handle missing targ_spec */ - err = bpf_core_patch_insn(prog, relo, relo_idx, &targ_res); + err = bpf_core_patch_insn(prog_name, insn, insn_idx, relo, relo_idx, &targ_res); if (err) { pr_warn("prog '%s': relo #%d: failed to patch insn #%zu: %d\n", prog_name, relo_idx, relo->insn_off / BPF_INSN_SZ, err); @@ -6415,6 +6379,67 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return 0; }
+static int bpf_core_apply_relo(struct bpf_program *prog, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct hashmap *cand_cache) +{ + const void *type_key = u32_as_hash_key(relo->type_id); + struct core_cand_list *cands = NULL; + const char *prog_name = prog->name; + const struct btf_type *local_type; + const char *local_name; + __u32 local_id = relo->type_id; + struct bpf_insn *insn; + int insn_idx, err; + + if (relo->insn_off % BPF_INSN_SZ) + return -EINVAL; + insn_idx = relo->insn_off / BPF_INSN_SZ; + /* adjust insn_idx from section frame of reference to the local + * program's frame of reference; (sub-)program code is not yet + * relocated, so it's enough to just subtract in-section offset + */ + insn_idx = insn_idx - prog->sec_insn_off; + if (insn_idx > prog->insns_cnt) + return -EINVAL; + insn = &prog->insns[insn_idx]; + + local_type = btf__type_by_id(local_btf, local_id); + if (!local_type) + return -EINVAL; + + local_name = btf__name_by_offset(local_btf, local_type->name_off); + if (!local_name) + return -EINVAL; + + if (prog->obj->gen_loader) { + pr_warn("// TODO core_relo: prog %td insn[%d] %s kind %d\n", + prog - prog->obj->programs, relo->insn_off / 8, + local_name, relo->kind); + return -ENOTSUP; + } + + if (relo->kind != BPF_TYPE_ID_LOCAL && + !hashmap__find(cand_cache, type_key, (void **)&cands)) { + cands = bpf_core_find_cands(prog->obj, local_btf, local_id); + if (IS_ERR(cands)) { + pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n", + prog_name, relo_idx, local_id, btf_kind_str(local_type), + local_name, PTR_ERR(cands)); + return PTR_ERR(cands); + } + err = hashmap__set(cand_cache, type_key, cands, NULL, NULL); + if (err) { + bpf_core_free_cands(cands); + return err; + } + } + + return bpf_core_apply_relo_insn(prog_name, insn, insn_idx, relo, relo_idx, local_btf, cands); +} + static int bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) {
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.15-rc1 commit 301ba4d710284e088d278adc477b7edad834577f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
In order to make a clean split of CO-RE logic move its types into independent header file.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210721000822.40958-4-alexei.starovoitov@gmail.... (cherry picked from commit 301ba4d710284e088d278adc477b7edad834577f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 32 ++++-------- tools/lib/bpf/libbpf_internal.h | 71 +------------------------ tools/lib/bpf/relo_core.h | 92 +++++++++++++++++++++++++++++++++ 3 files changed, 102 insertions(+), 93 deletions(-) create mode 100644 tools/lib/bpf/relo_core.h
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 423844b07ef2..74d786ad4e13 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5036,34 +5036,20 @@ static size_t bpf_core_essential_name_len(const char *name) return n; }
-struct core_cand -{ - const struct btf *btf; - const struct btf_type *t; - const char *name; - __u32 id; -}; - -/* dynamically sized list of type IDs and its associated struct btf */ -struct core_cand_list { - struct core_cand *cands; - int len; -}; - -static void bpf_core_free_cands(struct core_cand_list *cands) +static void bpf_core_free_cands(struct bpf_core_cand_list *cands) { free(cands->cands); free(cands); }
-static int bpf_core_add_cands(struct core_cand *local_cand, +static int bpf_core_add_cands(struct bpf_core_cand *local_cand, size_t local_essent_len, const struct btf *targ_btf, const char *targ_btf_name, int targ_start_id, - struct core_cand_list *cands) + struct bpf_core_cand_list *cands) { - struct core_cand *new_cands, *cand; + struct bpf_core_cand *new_cands, *cand; const struct btf_type *t; const char *targ_name; size_t targ_essent_len; @@ -5199,11 +5185,11 @@ static int load_module_btfs(struct bpf_object *obj) return 0; }
-static struct core_cand_list * +static struct bpf_core_cand_list * bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 local_type_id) { - struct core_cand local_cand = {}; - struct core_cand_list *cands; + struct bpf_core_cand local_cand = {}; + struct bpf_core_cand_list *cands; const struct btf *main_btf; size_t local_essent_len; int err, i; @@ -6232,7 +6218,7 @@ static int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn const struct bpf_core_relo *relo, int relo_idx, const struct btf *local_btf, - struct core_cand_list *cands) + struct bpf_core_cand_list *cands) { struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; struct bpf_core_relo_res cand_res, targ_res; @@ -6386,7 +6372,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, struct hashmap *cand_cache) { const void *type_key = u32_as_hash_key(relo->type_id); - struct core_cand_list *cands = NULL; + struct bpf_core_cand_list *cands = NULL; const char *prog_name = prog->name; const struct btf_type *local_type; const char *local_name; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 016ca7cb4f8a..3178d5685dce 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -14,6 +14,7 @@ #include <errno.h> #include <linux/err.h> #include "libbpf_legacy.h" +#include "relo_core.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */ #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64 @@ -366,76 +367,6 @@ struct bpf_line_info_min { __u32 line_col; };
-/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value - * has to be adjusted by relocations. - */ -enum bpf_core_relo_kind { - BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */ - BPF_FIELD_BYTE_SIZE = 1, /* field size in bytes */ - BPF_FIELD_EXISTS = 2, /* field existence in target kernel */ - BPF_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */ - BPF_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */ - BPF_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */ - BPF_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */ - BPF_TYPE_ID_TARGET = 7, /* type ID in target kernel */ - BPF_TYPE_EXISTS = 8, /* type existence in target kernel */ - BPF_TYPE_SIZE = 9, /* type size in bytes */ - BPF_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */ - BPF_ENUMVAL_VALUE = 11, /* enum value integer value */ -}; - -/* The minimum bpf_core_relo checked by the loader - * - * CO-RE relocation captures the following data: - * - insn_off - instruction offset (in bytes) within a BPF program that needs - * its insn->imm field to be relocated with actual field info; - * - type_id - BTF type ID of the "root" (containing) entity of a relocatable - * type or field; - * - access_str_off - offset into corresponding .BTF string section. String - * interpretation depends on specific relocation kind: - * - for field-based relocations, string encodes an accessed field using - * a sequence of field and array indices, separated by colon (:). It's - * conceptually very close to LLVM's getelementptr ([0]) instruction's - * arguments for identifying offset to a field. - * - for type-based relocations, strings is expected to be just "0"; - * - for enum value-based relocations, string contains an index of enum - * value within its enum type; - * - * Example to provide a better feel. - * - * struct sample { - * int a; - * struct { - * int b[10]; - * }; - * }; - * - * struct sample *s = ...; - * int x = &s->a; // encoded as "0:0" (a is field #0) - * int y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1, - * // b is field #0 inside anon struct, accessing elem #5) - * int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array) - * - * type_id for all relocs in this example will capture BTF type id of - * `struct sample`. - * - * Such relocation is emitted when using __builtin_preserve_access_index() - * Clang built-in, passing expression that captures field address, e.g.: - * - * bpf_probe_read(&dst, sizeof(dst), - * __builtin_preserve_access_index(&src->a.b.c)); - * - * In this case Clang will emit field relocation recording necessary data to - * be able to find offset of embedded `a.b.c` field within `src` struct. - * - * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction - */ -struct bpf_core_relo { - __u32 insn_off; - __u32 type_id; - __u32 access_str_off; - enum bpf_core_relo_kind kind; -};
typedef int (*type_id_visit_fn)(__u32 *type_id, void *ctx); typedef int (*str_off_visit_fn)(__u32 *str_off, void *ctx); diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h new file mode 100644 index 000000000000..ddf20151fe41 --- /dev/null +++ b/tools/lib/bpf/relo_core.h @@ -0,0 +1,92 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +/* Copyright (c) 2019 Facebook */ + +#ifndef __RELO_CORE_H +#define __RELO_CORE_H + +/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value + * has to be adjusted by relocations. + */ +enum bpf_core_relo_kind { + BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */ + BPF_FIELD_BYTE_SIZE = 1, /* field size in bytes */ + BPF_FIELD_EXISTS = 2, /* field existence in target kernel */ + BPF_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */ + BPF_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */ + BPF_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */ + BPF_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */ + BPF_TYPE_ID_TARGET = 7, /* type ID in target kernel */ + BPF_TYPE_EXISTS = 8, /* type existence in target kernel */ + BPF_TYPE_SIZE = 9, /* type size in bytes */ + BPF_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */ + BPF_ENUMVAL_VALUE = 11, /* enum value integer value */ +}; + +/* The minimum bpf_core_relo checked by the loader + * + * CO-RE relocation captures the following data: + * - insn_off - instruction offset (in bytes) within a BPF program that needs + * its insn->imm field to be relocated with actual field info; + * - type_id - BTF type ID of the "root" (containing) entity of a relocatable + * type or field; + * - access_str_off - offset into corresponding .BTF string section. String + * interpretation depends on specific relocation kind: + * - for field-based relocations, string encodes an accessed field using + * a sequence of field and array indices, separated by colon (:). It's + * conceptually very close to LLVM's getelementptr ([0]) instruction's + * arguments for identifying offset to a field. + * - for type-based relocations, strings is expected to be just "0"; + * - for enum value-based relocations, string contains an index of enum + * value within its enum type; + * + * Example to provide a better feel. + * + * struct sample { + * int a; + * struct { + * int b[10]; + * }; + * }; + * + * struct sample *s = ...; + * int x = &s->a; // encoded as "0:0" (a is field #0) + * int y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1, + * // b is field #0 inside anon struct, accessing elem #5) + * int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array) + * + * type_id for all relocs in this example will capture BTF type id of + * `struct sample`. + * + * Such relocation is emitted when using __builtin_preserve_access_index() + * Clang built-in, passing expression that captures field address, e.g.: + * + * bpf_probe_read(&dst, sizeof(dst), + * __builtin_preserve_access_index(&src->a.b.c)); + * + * In this case Clang will emit field relocation recording necessary data to + * be able to find offset of embedded `a.b.c` field within `src` struct. + * + * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction + */ +struct bpf_core_relo { + __u32 insn_off; + __u32 type_id; + __u32 access_str_off; + enum bpf_core_relo_kind kind; +}; + +struct bpf_core_cand +{ + const struct btf *btf; + const struct btf_type *t; + const char *name; + __u32 id; +}; + +/* dynamically sized list of type IDs and its associated struct btf */ +struct bpf_core_cand_list { + struct bpf_core_cand *cands; + int len; +}; + +#endif
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.15-rc1 commit b0588390dbcedcd74fab6ffb8afe8d52380fd8b6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Move CO-RE logic into separate file. The internal interface between libbpf and CO-RE is through bpf_core_apply_relo_insn() function and few structs defined in relo_core.h.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210721000822.40958-5-alexei.starovoitov@gmail.... (cherry picked from commit b0588390dbcedcd74fab6ffb8afe8d52380fd8b6) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Build | 2 +- tools/lib/bpf/libbpf.c | 1297 +------------------------------ tools/lib/bpf/libbpf_internal.h | 10 + tools/lib/bpf/relo_core.c | 1295 ++++++++++++++++++++++++++++++ tools/lib/bpf/relo_core.h | 12 +- 5 files changed, 1319 insertions(+), 1297 deletions(-) create mode 100644 tools/lib/bpf/relo_core.c
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build index 430f6874fa41..94f0a146bb7b 100644 --- a/tools/lib/bpf/Build +++ b/tools/lib/bpf/Build @@ -1,3 +1,3 @@ libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \ netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \ - btf_dump.o ringbuf.o strset.o linker.o gen_loader.o + btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 74d786ad4e13..a0a72fbabde5 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -595,11 +595,6 @@ static bool insn_is_subprog_call(const struct bpf_insn *insn) insn->off == 0; }
-static bool is_ldimm64_insn(struct bpf_insn *insn) -{ - return insn->code == (BPF_LD | BPF_IMM | BPF_DW); -} - static bool is_call_insn(const struct bpf_insn *insn) { return insn->code == (BPF_JMP | BPF_CALL); @@ -4739,279 +4734,6 @@ bpf_object__create_maps(struct bpf_object *obj) return err; }
-#define BPF_CORE_SPEC_MAX_LEN 64 - -/* represents BPF CO-RE field or array element accessor */ -struct bpf_core_accessor { - __u32 type_id; /* struct/union type or array element type */ - __u32 idx; /* field index or array index */ - const char *name; /* field name or NULL for array accessor */ -}; - -struct bpf_core_spec { - const struct btf *btf; - /* high-level spec: named fields and array indices only */ - struct bpf_core_accessor spec[BPF_CORE_SPEC_MAX_LEN]; - /* original unresolved (no skip_mods_or_typedefs) root type ID */ - __u32 root_type_id; - /* CO-RE relocation kind */ - enum bpf_core_relo_kind relo_kind; - /* high-level spec length */ - int len; - /* raw, low-level spec: 1-to-1 with accessor spec string */ - int raw_spec[BPF_CORE_SPEC_MAX_LEN]; - /* raw spec length */ - int raw_len; - /* field bit offset represented by spec */ - __u32 bit_offset; -}; - -static bool str_is_empty(const char *s) -{ - return !s || !s[0]; -} - -static bool is_flex_arr(const struct btf *btf, - const struct bpf_core_accessor *acc, - const struct btf_array *arr) -{ - const struct btf_type *t; - - /* not a flexible array, if not inside a struct or has non-zero size */ - if (!acc->name || arr->nelems > 0) - return false; - - /* has to be the last member of enclosing struct */ - t = btf__type_by_id(btf, acc->type_id); - return acc->idx == btf_vlen(t) - 1; -} - -static const char *core_relo_kind_str(enum bpf_core_relo_kind kind) -{ - switch (kind) { - case BPF_FIELD_BYTE_OFFSET: return "byte_off"; - case BPF_FIELD_BYTE_SIZE: return "byte_sz"; - case BPF_FIELD_EXISTS: return "field_exists"; - case BPF_FIELD_SIGNED: return "signed"; - case BPF_FIELD_LSHIFT_U64: return "lshift_u64"; - case BPF_FIELD_RSHIFT_U64: return "rshift_u64"; - case BPF_TYPE_ID_LOCAL: return "local_type_id"; - case BPF_TYPE_ID_TARGET: return "target_type_id"; - case BPF_TYPE_EXISTS: return "type_exists"; - case BPF_TYPE_SIZE: return "type_size"; - case BPF_ENUMVAL_EXISTS: return "enumval_exists"; - case BPF_ENUMVAL_VALUE: return "enumval_value"; - default: return "unknown"; - } -} - -static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) -{ - switch (kind) { - case BPF_FIELD_BYTE_OFFSET: - case BPF_FIELD_BYTE_SIZE: - case BPF_FIELD_EXISTS: - case BPF_FIELD_SIGNED: - case BPF_FIELD_LSHIFT_U64: - case BPF_FIELD_RSHIFT_U64: - return true; - default: - return false; - } -} - -static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) -{ - switch (kind) { - case BPF_TYPE_ID_LOCAL: - case BPF_TYPE_ID_TARGET: - case BPF_TYPE_EXISTS: - case BPF_TYPE_SIZE: - return true; - default: - return false; - } -} - -static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) -{ - switch (kind) { - case BPF_ENUMVAL_EXISTS: - case BPF_ENUMVAL_VALUE: - return true; - default: - return false; - } -} - -/* - * Turn bpf_core_relo into a low- and high-level spec representation, - * validating correctness along the way, as well as calculating resulting - * field bit offset, specified by accessor string. Low-level spec captures - * every single level of nestedness, including traversing anonymous - * struct/union members. High-level one only captures semantically meaningful - * "turning points": named fields and array indicies. - * E.g., for this case: - * - * struct sample { - * int __unimportant; - * struct { - * int __1; - * int __2; - * int a[7]; - * }; - * }; - * - * struct sample *s = ...; - * - * int x = &s->a[3]; // access string = '0:1:2:3' - * - * Low-level spec has 1:1 mapping with each element of access string (it's - * just a parsed access string representation): [0, 1, 2, 3]. - * - * High-level spec will capture only 3 points: - * - intial zero-index access by pointer (&s->... is the same as &s[0]...); - * - field 'a' access (corresponds to '2' in low-level spec); - * - array element #3 access (corresponds to '3' in low-level spec). - * - * Type-based relocations (TYPE_EXISTS/TYPE_SIZE, - * TYPE_ID_LOCAL/TYPE_ID_TARGET) don't capture any field information. Their - * spec and raw_spec are kept empty. - * - * Enum value-based relocations (ENUMVAL_EXISTS/ENUMVAL_VALUE) use access - * string to specify enumerator's value index that need to be relocated. - */ -static int bpf_core_parse_spec(const struct btf *btf, - __u32 type_id, - const char *spec_str, - enum bpf_core_relo_kind relo_kind, - struct bpf_core_spec *spec) -{ - int access_idx, parsed_len, i; - struct bpf_core_accessor *acc; - const struct btf_type *t; - const char *name; - __u32 id; - __s64 sz; - - if (str_is_empty(spec_str) || *spec_str == ':') - return -EINVAL; - - memset(spec, 0, sizeof(*spec)); - spec->btf = btf; - spec->root_type_id = type_id; - spec->relo_kind = relo_kind; - - /* type-based relocations don't have a field access string */ - if (core_relo_is_type_based(relo_kind)) { - if (strcmp(spec_str, "0")) - return -EINVAL; - return 0; - } - - /* parse spec_str="0:1:2:3:4" into array raw_spec=[0, 1, 2, 3, 4] */ - while (*spec_str) { - if (*spec_str == ':') - ++spec_str; - if (sscanf(spec_str, "%d%n", &access_idx, &parsed_len) != 1) - return -EINVAL; - if (spec->raw_len == BPF_CORE_SPEC_MAX_LEN) - return -E2BIG; - spec_str += parsed_len; - spec->raw_spec[spec->raw_len++] = access_idx; - } - - if (spec->raw_len == 0) - return -EINVAL; - - t = skip_mods_and_typedefs(btf, type_id, &id); - if (!t) - return -EINVAL; - - access_idx = spec->raw_spec[0]; - acc = &spec->spec[0]; - acc->type_id = id; - acc->idx = access_idx; - spec->len++; - - if (core_relo_is_enumval_based(relo_kind)) { - if (!btf_is_enum(t) || spec->raw_len > 1 || access_idx >= btf_vlen(t)) - return -EINVAL; - - /* record enumerator name in a first accessor */ - acc->name = btf__name_by_offset(btf, btf_enum(t)[access_idx].name_off); - return 0; - } - - if (!core_relo_is_field_based(relo_kind)) - return -EINVAL; - - sz = btf__resolve_size(btf, id); - if (sz < 0) - return sz; - spec->bit_offset = access_idx * sz * 8; - - for (i = 1; i < spec->raw_len; i++) { - t = skip_mods_and_typedefs(btf, id, &id); - if (!t) - return -EINVAL; - - access_idx = spec->raw_spec[i]; - acc = &spec->spec[spec->len]; - - if (btf_is_composite(t)) { - const struct btf_member *m; - __u32 bit_offset; - - if (access_idx >= btf_vlen(t)) - return -EINVAL; - - bit_offset = btf_member_bit_offset(t, access_idx); - spec->bit_offset += bit_offset; - - m = btf_members(t) + access_idx; - if (m->name_off) { - name = btf__name_by_offset(btf, m->name_off); - if (str_is_empty(name)) - return -EINVAL; - - acc->type_id = id; - acc->idx = access_idx; - acc->name = name; - spec->len++; - } - - id = m->type; - } else if (btf_is_array(t)) { - const struct btf_array *a = btf_array(t); - bool flex; - - t = skip_mods_and_typedefs(btf, a->type, &id); - if (!t) - return -EINVAL; - - flex = is_flex_arr(btf, acc - 1, a); - if (!flex && access_idx >= a->nelems) - return -EINVAL; - - spec->spec[spec->len].type_id = id; - spec->spec[spec->len].idx = access_idx; - spec->len++; - - sz = btf__resolve_size(btf, id); - if (sz < 0) - return sz; - spec->bit_offset += access_idx * sz * 8; - } else { - pr_warn("relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n", - type_id, spec_str, i, id, btf_kind_str(t)); - return -EINVAL; - } - } - - return 0; -} - static bool bpf_core_is_flavor_sep(const char *s) { /* check X___Y name pattern, where X and Y are not underscores */ @@ -5024,7 +4746,7 @@ static bool bpf_core_is_flavor_sep(const char *s) * before last triple underscore. Struct name part after last triple * underscore is ignored by BPF CO-RE relocation during relocation matching. */ -static size_t bpf_core_essential_name_len(const char *name) +size_t bpf_core_essential_name_len(const char *name) { size_t n = strlen(name); int i; @@ -5243,165 +4965,6 @@ bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 l return ERR_PTR(err); }
-/* Check two types for compatibility for the purpose of field access - * relocation. const/volatile/restrict and typedefs are skipped to ensure we - * are relocating semantically compatible entities: - * - any two STRUCTs/UNIONs are compatible and can be mixed; - * - any two FWDs are compatible, if their names match (modulo flavor suffix); - * - any two PTRs are always compatible; - * - for ENUMs, names should be the same (ignoring flavor suffix) or at - * least one of enums should be anonymous; - * - for ENUMs, check sizes, names are ignored; - * - for INT, size and signedness are ignored; - * - any two FLOATs are always compatible; - * - for ARRAY, dimensionality is ignored, element types are checked for - * compatibility recursively; - * - everything else shouldn't be ever a target of relocation. - * These rules are not set in stone and probably will be adjusted as we get - * more experience with using BPF CO-RE relocations. - */ -static int bpf_core_fields_are_compat(const struct btf *local_btf, - __u32 local_id, - const struct btf *targ_btf, - __u32 targ_id) -{ - const struct btf_type *local_type, *targ_type; - -recur: - local_type = skip_mods_and_typedefs(local_btf, local_id, &local_id); - targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id); - if (!local_type || !targ_type) - return -EINVAL; - - if (btf_is_composite(local_type) && btf_is_composite(targ_type)) - return 1; - if (btf_kind(local_type) != btf_kind(targ_type)) - return 0; - - switch (btf_kind(local_type)) { - case BTF_KIND_PTR: - case BTF_KIND_FLOAT: - return 1; - case BTF_KIND_FWD: - case BTF_KIND_ENUM: { - const char *local_name, *targ_name; - size_t local_len, targ_len; - - local_name = btf__name_by_offset(local_btf, - local_type->name_off); - targ_name = btf__name_by_offset(targ_btf, targ_type->name_off); - local_len = bpf_core_essential_name_len(local_name); - targ_len = bpf_core_essential_name_len(targ_name); - /* one of them is anonymous or both w/ same flavor-less names */ - return local_len == 0 || targ_len == 0 || - (local_len == targ_len && - strncmp(local_name, targ_name, local_len) == 0); - } - case BTF_KIND_INT: - /* just reject deprecated bitfield-like integers; all other - * integers are by default compatible between each other - */ - return btf_int_offset(local_type) == 0 && - btf_int_offset(targ_type) == 0; - case BTF_KIND_ARRAY: - local_id = btf_array(local_type)->type; - targ_id = btf_array(targ_type)->type; - goto recur; - default: - pr_warn("unexpected kind %d relocated, local [%d], target [%d]\n", - btf_kind(local_type), local_id, targ_id); - return 0; - } -} - -/* - * Given single high-level named field accessor in local type, find - * corresponding high-level accessor for a target type. Along the way, - * maintain low-level spec for target as well. Also keep updating target - * bit offset. - * - * Searching is performed through recursive exhaustive enumeration of all - * fields of a struct/union. If there are any anonymous (embedded) - * structs/unions, they are recursively searched as well. If field with - * desired name is found, check compatibility between local and target types, - * before returning result. - * - * 1 is returned, if field is found. - * 0 is returned if no compatible field is found. - * <0 is returned on error. - */ -static int bpf_core_match_member(const struct btf *local_btf, - const struct bpf_core_accessor *local_acc, - const struct btf *targ_btf, - __u32 targ_id, - struct bpf_core_spec *spec, - __u32 *next_targ_id) -{ - const struct btf_type *local_type, *targ_type; - const struct btf_member *local_member, *m; - const char *local_name, *targ_name; - __u32 local_id; - int i, n, found; - - targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id); - if (!targ_type) - return -EINVAL; - if (!btf_is_composite(targ_type)) - return 0; - - local_id = local_acc->type_id; - local_type = btf__type_by_id(local_btf, local_id); - local_member = btf_members(local_type) + local_acc->idx; - local_name = btf__name_by_offset(local_btf, local_member->name_off); - - n = btf_vlen(targ_type); - m = btf_members(targ_type); - for (i = 0; i < n; i++, m++) { - __u32 bit_offset; - - bit_offset = btf_member_bit_offset(targ_type, i); - - /* too deep struct/union/array nesting */ - if (spec->raw_len == BPF_CORE_SPEC_MAX_LEN) - return -E2BIG; - - /* speculate this member will be the good one */ - spec->bit_offset += bit_offset; - spec->raw_spec[spec->raw_len++] = i; - - targ_name = btf__name_by_offset(targ_btf, m->name_off); - if (str_is_empty(targ_name)) { - /* embedded struct/union, we need to go deeper */ - found = bpf_core_match_member(local_btf, local_acc, - targ_btf, m->type, - spec, next_targ_id); - if (found) /* either found or error */ - return found; - } else if (strcmp(local_name, targ_name) == 0) { - /* matching named field */ - struct bpf_core_accessor *targ_acc; - - targ_acc = &spec->spec[spec->len++]; - targ_acc->type_id = targ_id; - targ_acc->idx = i; - targ_acc->name = targ_name; - - *next_targ_id = m->type; - found = bpf_core_fields_are_compat(local_btf, - local_member->type, - targ_btf, m->type); - if (!found) - spec->len--; /* pop accessor */ - return found; - } - /* member turned out not to be what we looked for */ - spec->bit_offset -= bit_offset; - spec->raw_len--; - } - - return 0; -} - /* Check local and target types for compatibility. This check is used for * type-based CO-RE relocations and follow slightly different rules than * field-based relocations. This function assumes that root types were already @@ -5421,8 +4984,8 @@ static int bpf_core_match_member(const struct btf *local_btf, * These rules are not set in stone and probably will be adjusted as we get * more experience with using BPF CO-RE relocations. */ -static int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, - const struct btf *targ_btf, __u32 targ_id) +int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, + const struct btf *targ_btf, __u32 targ_id) { const struct btf_type *local_type, *targ_type; int depth = 32; /* max recursion depth */ @@ -5496,658 +5059,6 @@ static int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id } }
-/* - * Try to match local spec to a target type and, if successful, produce full - * target spec (high-level, low-level + bit offset). - */ -static int bpf_core_spec_match(struct bpf_core_spec *local_spec, - const struct btf *targ_btf, __u32 targ_id, - struct bpf_core_spec *targ_spec) -{ - const struct btf_type *targ_type; - const struct bpf_core_accessor *local_acc; - struct bpf_core_accessor *targ_acc; - int i, sz, matched; - - memset(targ_spec, 0, sizeof(*targ_spec)); - targ_spec->btf = targ_btf; - targ_spec->root_type_id = targ_id; - targ_spec->relo_kind = local_spec->relo_kind; - - if (core_relo_is_type_based(local_spec->relo_kind)) { - return bpf_core_types_are_compat(local_spec->btf, - local_spec->root_type_id, - targ_btf, targ_id); - } - - local_acc = &local_spec->spec[0]; - targ_acc = &targ_spec->spec[0]; - - if (core_relo_is_enumval_based(local_spec->relo_kind)) { - size_t local_essent_len, targ_essent_len; - const struct btf_enum *e; - const char *targ_name; - - /* has to resolve to an enum */ - targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, &targ_id); - if (!btf_is_enum(targ_type)) - return 0; - - local_essent_len = bpf_core_essential_name_len(local_acc->name); - - for (i = 0, e = btf_enum(targ_type); i < btf_vlen(targ_type); i++, e++) { - targ_name = btf__name_by_offset(targ_spec->btf, e->name_off); - targ_essent_len = bpf_core_essential_name_len(targ_name); - if (targ_essent_len != local_essent_len) - continue; - if (strncmp(local_acc->name, targ_name, local_essent_len) == 0) { - targ_acc->type_id = targ_id; - targ_acc->idx = i; - targ_acc->name = targ_name; - targ_spec->len++; - targ_spec->raw_spec[targ_spec->raw_len] = targ_acc->idx; - targ_spec->raw_len++; - return 1; - } - } - return 0; - } - - if (!core_relo_is_field_based(local_spec->relo_kind)) - return -EINVAL; - - for (i = 0; i < local_spec->len; i++, local_acc++, targ_acc++) { - targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, - &targ_id); - if (!targ_type) - return -EINVAL; - - if (local_acc->name) { - matched = bpf_core_match_member(local_spec->btf, - local_acc, - targ_btf, targ_id, - targ_spec, &targ_id); - if (matched <= 0) - return matched; - } else { - /* for i=0, targ_id is already treated as array element - * type (because it's the original struct), for others - * we should find array element type first - */ - if (i > 0) { - const struct btf_array *a; - bool flex; - - if (!btf_is_array(targ_type)) - return 0; - - a = btf_array(targ_type); - flex = is_flex_arr(targ_btf, targ_acc - 1, a); - if (!flex && local_acc->idx >= a->nelems) - return 0; - if (!skip_mods_and_typedefs(targ_btf, a->type, - &targ_id)) - return -EINVAL; - } - - /* too deep struct/union/array nesting */ - if (targ_spec->raw_len == BPF_CORE_SPEC_MAX_LEN) - return -E2BIG; - - targ_acc->type_id = targ_id; - targ_acc->idx = local_acc->idx; - targ_acc->name = NULL; - targ_spec->len++; - targ_spec->raw_spec[targ_spec->raw_len] = targ_acc->idx; - targ_spec->raw_len++; - - sz = btf__resolve_size(targ_btf, targ_id); - if (sz < 0) - return sz; - targ_spec->bit_offset += local_acc->idx * sz * 8; - } - } - - return 1; -} - -static int bpf_core_calc_field_relo(const char *prog_name, - const struct bpf_core_relo *relo, - const struct bpf_core_spec *spec, - __u32 *val, __u32 *field_sz, __u32 *type_id, - bool *validate) -{ - const struct bpf_core_accessor *acc; - const struct btf_type *t; - __u32 byte_off, byte_sz, bit_off, bit_sz, field_type_id; - const struct btf_member *m; - const struct btf_type *mt; - bool bitfield; - __s64 sz; - - *field_sz = 0; - - if (relo->kind == BPF_FIELD_EXISTS) { - *val = spec ? 1 : 0; - return 0; - } - - if (!spec) - return -EUCLEAN; /* request instruction poisoning */ - - acc = &spec->spec[spec->len - 1]; - t = btf__type_by_id(spec->btf, acc->type_id); - - /* a[n] accessor needs special handling */ - if (!acc->name) { - if (relo->kind == BPF_FIELD_BYTE_OFFSET) { - *val = spec->bit_offset / 8; - /* remember field size for load/store mem size */ - sz = btf__resolve_size(spec->btf, acc->type_id); - if (sz < 0) - return -EINVAL; - *field_sz = sz; - *type_id = acc->type_id; - } else if (relo->kind == BPF_FIELD_BYTE_SIZE) { - sz = btf__resolve_size(spec->btf, acc->type_id); - if (sz < 0) - return -EINVAL; - *val = sz; - } else { - pr_warn("prog '%s': relo %d at insn #%d can't be applied to array access\n", - prog_name, relo->kind, relo->insn_off / 8); - return -EINVAL; - } - if (validate) - *validate = true; - return 0; - } - - m = btf_members(t) + acc->idx; - mt = skip_mods_and_typedefs(spec->btf, m->type, &field_type_id); - bit_off = spec->bit_offset; - bit_sz = btf_member_bitfield_size(t, acc->idx); - - bitfield = bit_sz > 0; - if (bitfield) { - byte_sz = mt->size; - byte_off = bit_off / 8 / byte_sz * byte_sz; - /* figure out smallest int size necessary for bitfield load */ - while (bit_off + bit_sz - byte_off * 8 > byte_sz * 8) { - if (byte_sz >= 8) { - /* bitfield can't be read with 64-bit read */ - pr_warn("prog '%s': relo %d at insn #%d can't be satisfied for bitfield\n", - prog_name, relo->kind, relo->insn_off / 8); - return -E2BIG; - } - byte_sz *= 2; - byte_off = bit_off / 8 / byte_sz * byte_sz; - } - } else { - sz = btf__resolve_size(spec->btf, field_type_id); - if (sz < 0) - return -EINVAL; - byte_sz = sz; - byte_off = spec->bit_offset / 8; - bit_sz = byte_sz * 8; - } - - /* for bitfields, all the relocatable aspects are ambiguous and we - * might disagree with compiler, so turn off validation of expected - * value, except for signedness - */ - if (validate) - *validate = !bitfield; - - switch (relo->kind) { - case BPF_FIELD_BYTE_OFFSET: - *val = byte_off; - if (!bitfield) { - *field_sz = byte_sz; - *type_id = field_type_id; - } - break; - case BPF_FIELD_BYTE_SIZE: - *val = byte_sz; - break; - case BPF_FIELD_SIGNED: - /* enums will be assumed unsigned */ - *val = btf_is_enum(mt) || - (btf_int_encoding(mt) & BTF_INT_SIGNED); - if (validate) - *validate = true; /* signedness is never ambiguous */ - break; - case BPF_FIELD_LSHIFT_U64: -#if __BYTE_ORDER == __LITTLE_ENDIAN - *val = 64 - (bit_off + bit_sz - byte_off * 8); -#else - *val = (8 - byte_sz) * 8 + (bit_off - byte_off * 8); -#endif - break; - case BPF_FIELD_RSHIFT_U64: - *val = 64 - bit_sz; - if (validate) - *validate = true; /* right shift is never ambiguous */ - break; - case BPF_FIELD_EXISTS: - default: - return -EOPNOTSUPP; - } - - return 0; -} - -static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo, - const struct bpf_core_spec *spec, - __u32 *val) -{ - __s64 sz; - - /* type-based relos return zero when target type is not found */ - if (!spec) { - *val = 0; - return 0; - } - - switch (relo->kind) { - case BPF_TYPE_ID_TARGET: - *val = spec->root_type_id; - break; - case BPF_TYPE_EXISTS: - *val = 1; - break; - case BPF_TYPE_SIZE: - sz = btf__resolve_size(spec->btf, spec->root_type_id); - if (sz < 0) - return -EINVAL; - *val = sz; - break; - case BPF_TYPE_ID_LOCAL: - /* BPF_TYPE_ID_LOCAL is handled specially and shouldn't get here */ - default: - return -EOPNOTSUPP; - } - - return 0; -} - -static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo, - const struct bpf_core_spec *spec, - __u32 *val) -{ - const struct btf_type *t; - const struct btf_enum *e; - - switch (relo->kind) { - case BPF_ENUMVAL_EXISTS: - *val = spec ? 1 : 0; - break; - case BPF_ENUMVAL_VALUE: - if (!spec) - return -EUCLEAN; /* request instruction poisoning */ - t = btf__type_by_id(spec->btf, spec->spec[0].type_id); - e = btf_enum(t) + spec->spec[0].idx; - *val = e->val; - break; - default: - return -EOPNOTSUPP; - } - - return 0; -} - -struct bpf_core_relo_res -{ - /* expected value in the instruction, unless validate == false */ - __u32 orig_val; - /* new value that needs to be patched up to */ - __u32 new_val; - /* relocation unsuccessful, poison instruction, but don't fail load */ - bool poison; - /* some relocations can't be validated against orig_val */ - bool validate; - /* for field byte offset relocations or the forms: - * *(T *)(rX + <off>) = rY - * rX = *(T *)(rY + <off>), - * we remember original and resolved field size to adjust direct - * memory loads of pointers and integers; this is necessary for 32-bit - * host kernel architectures, but also allows to automatically - * relocate fields that were resized from, e.g., u32 to u64, etc. - */ - bool fail_memsz_adjust; - __u32 orig_sz; - __u32 orig_type_id; - __u32 new_sz; - __u32 new_type_id; -}; - -/* Calculate original and target relocation values, given local and target - * specs and relocation kind. These values are calculated for each candidate. - * If there are multiple candidates, resulting values should all be consistent - * with each other. Otherwise, libbpf will refuse to proceed due to ambiguity. - * If instruction has to be poisoned, *poison will be set to true. - */ -static int bpf_core_calc_relo(const char *prog_name, - const struct bpf_core_relo *relo, - int relo_idx, - const struct bpf_core_spec *local_spec, - const struct bpf_core_spec *targ_spec, - struct bpf_core_relo_res *res) -{ - int err = -EOPNOTSUPP; - - res->orig_val = 0; - res->new_val = 0; - res->poison = false; - res->validate = true; - res->fail_memsz_adjust = false; - res->orig_sz = res->new_sz = 0; - res->orig_type_id = res->new_type_id = 0; - - if (core_relo_is_field_based(relo->kind)) { - err = bpf_core_calc_field_relo(prog_name, relo, local_spec, - &res->orig_val, &res->orig_sz, - &res->orig_type_id, &res->validate); - err = err ?: bpf_core_calc_field_relo(prog_name, relo, targ_spec, - &res->new_val, &res->new_sz, - &res->new_type_id, NULL); - if (err) - goto done; - /* Validate if it's safe to adjust load/store memory size. - * Adjustments are performed only if original and new memory - * sizes differ. - */ - res->fail_memsz_adjust = false; - if (res->orig_sz != res->new_sz) { - const struct btf_type *orig_t, *new_t; - - orig_t = btf__type_by_id(local_spec->btf, res->orig_type_id); - new_t = btf__type_by_id(targ_spec->btf, res->new_type_id); - - /* There are two use cases in which it's safe to - * adjust load/store's mem size: - * - reading a 32-bit kernel pointer, while on BPF - * size pointers are always 64-bit; in this case - * it's safe to "downsize" instruction size due to - * pointer being treated as unsigned integer with - * zero-extended upper 32-bits; - * - reading unsigned integers, again due to - * zero-extension is preserving the value correctly. - * - * In all other cases it's incorrect to attempt to - * load/store field because read value will be - * incorrect, so we poison relocated instruction. - */ - if (btf_is_ptr(orig_t) && btf_is_ptr(new_t)) - goto done; - if (btf_is_int(orig_t) && btf_is_int(new_t) && - btf_int_encoding(orig_t) != BTF_INT_SIGNED && - btf_int_encoding(new_t) != BTF_INT_SIGNED) - goto done; - - /* mark as invalid mem size adjustment, but this will - * only be checked for LDX/STX/ST insns - */ - res->fail_memsz_adjust = true; - } - } else if (core_relo_is_type_based(relo->kind)) { - err = bpf_core_calc_type_relo(relo, local_spec, &res->orig_val); - err = err ?: bpf_core_calc_type_relo(relo, targ_spec, &res->new_val); - } else if (core_relo_is_enumval_based(relo->kind)) { - err = bpf_core_calc_enumval_relo(relo, local_spec, &res->orig_val); - err = err ?: bpf_core_calc_enumval_relo(relo, targ_spec, &res->new_val); - } - -done: - if (err == -EUCLEAN) { - /* EUCLEAN is used to signal instruction poisoning request */ - res->poison = true; - err = 0; - } else if (err == -EOPNOTSUPP) { - /* EOPNOTSUPP means unknown/unsupported relocation */ - pr_warn("prog '%s': relo #%d: unrecognized CO-RE relocation %s (%d) at insn #%d\n", - prog_name, relo_idx, core_relo_kind_str(relo->kind), - relo->kind, relo->insn_off / 8); - } - - return err; -} - -/* - * Turn instruction for which CO_RE relocation failed into invalid one with - * distinct signature. - */ -static void bpf_core_poison_insn(const char *prog_name, int relo_idx, - int insn_idx, struct bpf_insn *insn) -{ - pr_debug("prog '%s': relo #%d: substituting insn #%d w/ invalid insn\n", - prog_name, relo_idx, insn_idx); - insn->code = BPF_JMP | BPF_CALL; - insn->dst_reg = 0; - insn->src_reg = 0; - insn->off = 0; - /* if this instruction is reachable (not a dead code), - * verifier will complain with the following message: - * invalid func unknown#195896080 - */ - insn->imm = 195896080; /* => 0xbad2310 => "bad relo" */ -} - -static int insn_bpf_size_to_bytes(struct bpf_insn *insn) -{ - switch (BPF_SIZE(insn->code)) { - case BPF_DW: return 8; - case BPF_W: return 4; - case BPF_H: return 2; - case BPF_B: return 1; - default: return -1; - } -} - -static int insn_bytes_to_bpf_size(__u32 sz) -{ - switch (sz) { - case 8: return BPF_DW; - case 4: return BPF_W; - case 2: return BPF_H; - case 1: return BPF_B; - default: return -1; - } -} - -/* - * Patch relocatable BPF instruction. - * - * Patched value is determined by relocation kind and target specification. - * For existence relocations target spec will be NULL if field/type is not found. - * Expected insn->imm value is determined using relocation kind and local - * spec, and is checked before patching instruction. If actual insn->imm value - * is wrong, bail out with error. - * - * Currently supported classes of BPF instruction are: - * 1. rX = <imm> (assignment with immediate operand); - * 2. rX += <imm> (arithmetic operations with immediate operand); - * 3. rX = <imm64> (load with 64-bit immediate value); - * 4. rX = *(T *)(rY + <off>), where T is one of {u8, u16, u32, u64}; - * 5. *(T *)(rX + <off>) = rY, where T is one of {u8, u16, u32, u64}; - * 6. *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}. - */ -static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, - int insn_idx, const struct bpf_core_relo *relo, - int relo_idx, const struct bpf_core_relo_res *res) -{ - __u32 orig_val, new_val; - __u8 class; - - class = BPF_CLASS(insn->code); - - if (res->poison) { -poison: - /* poison second part of ldimm64 to avoid confusing error from - * verifier about "unknown opcode 00" - */ - if (is_ldimm64_insn(insn)) - bpf_core_poison_insn(prog_name, relo_idx, insn_idx + 1, insn + 1); - bpf_core_poison_insn(prog_name, relo_idx, insn_idx, insn); - return 0; - } - - orig_val = res->orig_val; - new_val = res->new_val; - - switch (class) { - case BPF_ALU: - case BPF_ALU64: - if (BPF_SRC(insn->code) != BPF_K) - return -EINVAL; - if (res->validate && insn->imm != orig_val) { - pr_warn("prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %u -> %u\n", - prog_name, relo_idx, - insn_idx, insn->imm, orig_val, new_val); - return -EINVAL; - } - orig_val = insn->imm; - insn->imm = new_val; - pr_debug("prog '%s': relo #%d: patched insn #%d (ALU/ALU64) imm %u -> %u\n", - prog_name, relo_idx, insn_idx, - orig_val, new_val); - break; - case BPF_LDX: - case BPF_ST: - case BPF_STX: - if (res->validate && insn->off != orig_val) { - pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %u -> %u\n", - prog_name, relo_idx, insn_idx, insn->off, orig_val, new_val); - return -EINVAL; - } - if (new_val > SHRT_MAX) { - pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) value too big: %u\n", - prog_name, relo_idx, insn_idx, new_val); - return -ERANGE; - } - if (res->fail_memsz_adjust) { - pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) accesses field incorrectly. " - "Make sure you are accessing pointers, unsigned integers, or fields of matching type and size.\n", - prog_name, relo_idx, insn_idx); - goto poison; - } - - orig_val = insn->off; - insn->off = new_val; - pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) off %u -> %u\n", - prog_name, relo_idx, insn_idx, orig_val, new_val); - - if (res->new_sz != res->orig_sz) { - int insn_bytes_sz, insn_bpf_sz; - - insn_bytes_sz = insn_bpf_size_to_bytes(insn); - if (insn_bytes_sz != res->orig_sz) { - pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) unexpected mem size: got %d, exp %u\n", - prog_name, relo_idx, insn_idx, insn_bytes_sz, res->orig_sz); - return -EINVAL; - } - - insn_bpf_sz = insn_bytes_to_bpf_size(res->new_sz); - if (insn_bpf_sz < 0) { - pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) invalid new mem size: %u\n", - prog_name, relo_idx, insn_idx, res->new_sz); - return -EINVAL; - } - - insn->code = BPF_MODE(insn->code) | insn_bpf_sz | BPF_CLASS(insn->code); - pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) mem_sz %u -> %u\n", - prog_name, relo_idx, insn_idx, res->orig_sz, res->new_sz); - } - break; - case BPF_LD: { - __u64 imm; - - if (!is_ldimm64_insn(insn) || - insn[0].src_reg != 0 || insn[0].off != 0 || - insn[1].code != 0 || insn[1].dst_reg != 0 || - insn[1].src_reg != 0 || insn[1].off != 0) { - pr_warn("prog '%s': relo #%d: insn #%d (LDIMM64) has unexpected form\n", - prog_name, relo_idx, insn_idx); - return -EINVAL; - } - - imm = insn[0].imm + ((__u64)insn[1].imm << 32); - if (res->validate && imm != orig_val) { - pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDIMM64) value: got %llu, exp %u -> %u\n", - prog_name, relo_idx, - insn_idx, (unsigned long long)imm, - orig_val, new_val); - return -EINVAL; - } - - insn[0].imm = new_val; - insn[1].imm = 0; /* currently only 32-bit values are supported */ - pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %u\n", - prog_name, relo_idx, insn_idx, - (unsigned long long)imm, new_val); - break; - } - default: - pr_warn("prog '%s': relo #%d: trying to relocate unrecognized insn #%d, code:0x%x, src:0x%x, dst:0x%x, off:0x%x, imm:0x%x\n", - prog_name, relo_idx, insn_idx, insn->code, - insn->src_reg, insn->dst_reg, insn->off, insn->imm); - return -EINVAL; - } - - return 0; -} - -/* Output spec definition in the format: - * [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>, - * where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b - */ -static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec) -{ - const struct btf_type *t; - const struct btf_enum *e; - const char *s; - __u32 type_id; - int i; - - type_id = spec->root_type_id; - t = btf__type_by_id(spec->btf, type_id); - s = btf__name_by_offset(spec->btf, t->name_off); - - libbpf_print(level, "[%u] %s %s", type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s); - - if (core_relo_is_type_based(spec->relo_kind)) - return; - - if (core_relo_is_enumval_based(spec->relo_kind)) { - t = skip_mods_and_typedefs(spec->btf, type_id, NULL); - e = btf_enum(t) + spec->raw_spec[0]; - s = btf__name_by_offset(spec->btf, e->name_off); - - libbpf_print(level, "::%s = %u", s, e->val); - return; - } - - if (core_relo_is_field_based(spec->relo_kind)) { - for (i = 0; i < spec->len; i++) { - if (spec->spec[i].name) - libbpf_print(level, ".%s", spec->spec[i].name); - else if (i > 0 || spec->spec[i].idx > 0) - libbpf_print(level, "[%u]", spec->spec[i].idx); - } - - libbpf_print(level, " ("); - for (i = 0; i < spec->raw_len; i++) - libbpf_print(level, "%s%d", i == 0 ? "" : ":", spec->raw_spec[i]); - - if (spec->bit_offset % 8) - libbpf_print(level, " @ offset %u.%u)", - spec->bit_offset / 8, spec->bit_offset % 8); - else - libbpf_print(level, " @ offset %u)", spec->bit_offset / 8); - return; - } -} - static size_t bpf_core_hash_fn(const void *key, void *ctx) { return (size_t)key; @@ -6163,208 +5074,6 @@ static void *u32_as_hash_key(__u32 x) return (void *)(uintptr_t)x; }
-/* - * CO-RE relocate single instruction. - * - * The outline and important points of the algorithm: - * 1. For given local type, find corresponding candidate target types. - * Candidate type is a type with the same "essential" name, ignoring - * everything after last triple underscore (___). E.g., `sample`, - * `sample___flavor_one`, `sample___flavor_another_one`, are all candidates - * for each other. Names with triple underscore are referred to as - * "flavors" and are useful, among other things, to allow to - * specify/support incompatible variations of the same kernel struct, which - * might differ between different kernel versions and/or build - * configurations. - * - * N.B. Struct "flavors" could be generated by bpftool's BTF-to-C - * converter, when deduplicated BTF of a kernel still contains more than - * one different types with the same name. In that case, ___2, ___3, etc - * are appended starting from second name conflict. But start flavors are - * also useful to be defined "locally", in BPF program, to extract same - * data from incompatible changes between different kernel - * versions/configurations. For instance, to handle field renames between - * kernel versions, one can use two flavors of the struct name with the - * same common name and use conditional relocations to extract that field, - * depending on target kernel version. - * 2. For each candidate type, try to match local specification to this - * candidate target type. Matching involves finding corresponding - * high-level spec accessors, meaning that all named fields should match, - * as well as all array accesses should be within the actual bounds. Also, - * types should be compatible (see bpf_core_fields_are_compat for details). - * 3. It is supported and expected that there might be multiple flavors - * matching the spec. As long as all the specs resolve to the same set of - * offsets across all candidates, there is no error. If there is any - * ambiguity, CO-RE relocation will fail. This is necessary to accomodate - * imprefection of BTF deduplication, which can cause slight duplication of - * the same BTF type, if some directly or indirectly referenced (by - * pointer) type gets resolved to different actual types in different - * object files. If such situation occurs, deduplicated BTF will end up - * with two (or more) structurally identical types, which differ only in - * types they refer to through pointer. This should be OK in most cases and - * is not an error. - * 4. Candidate types search is performed by linearly scanning through all - * types in target BTF. It is anticipated that this is overall more - * efficient memory-wise and not significantly worse (if not better) - * CPU-wise compared to prebuilding a map from all local type names to - * a list of candidate type names. It's also sped up by caching resolved - * list of matching candidates per each local "root" type ID, that has at - * least one bpf_core_relo associated with it. This list is shared - * between multiple relocations for the same type ID and is updated as some - * of the candidates are pruned due to structural incompatibility. - */ -static int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, - int insn_idx, - const struct bpf_core_relo *relo, - int relo_idx, - const struct btf *local_btf, - struct bpf_core_cand_list *cands) -{ - struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; - struct bpf_core_relo_res cand_res, targ_res; - const struct btf_type *local_type; - const char *local_name; - __u32 local_id; - const char *spec_str; - int i, j, err; - - local_id = relo->type_id; - local_type = btf__type_by_id(local_btf, local_id); - if (!local_type) - return -EINVAL; - - local_name = btf__name_by_offset(local_btf, local_type->name_off); - if (!local_name) - return -EINVAL; - - spec_str = btf__name_by_offset(local_btf, relo->access_str_off); - if (str_is_empty(spec_str)) - return -EINVAL; - - err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); - if (err) { - pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", - prog_name, relo_idx, local_id, btf_kind_str(local_type), - str_is_empty(local_name) ? "<anon>" : local_name, - spec_str, err); - return -EINVAL; - } - - pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name, - relo_idx, core_relo_kind_str(relo->kind), relo->kind); - bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec); - libbpf_print(LIBBPF_DEBUG, "\n"); - - /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ - if (relo->kind == BPF_TYPE_ID_LOCAL) { - targ_res.validate = true; - targ_res.poison = false; - targ_res.orig_val = local_spec.root_type_id; - targ_res.new_val = local_spec.root_type_id; - goto patch_insn; - } - - /* libbpf doesn't support candidate search for anonymous types */ - if (str_is_empty(spec_str)) { - pr_warn("prog '%s': relo #%d: <%s> (%d) relocation doesn't support anonymous types\n", - prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); - return -EOPNOTSUPP; - } - - - for (i = 0, j = 0; i < cands->len; i++) { - err = bpf_core_spec_match(&local_spec, cands->cands[i].btf, - cands->cands[i].id, &cand_spec); - if (err < 0) { - pr_warn("prog '%s': relo #%d: error matching candidate #%d ", - prog_name, relo_idx, i); - bpf_core_dump_spec(LIBBPF_WARN, &cand_spec); - libbpf_print(LIBBPF_WARN, ": %d\n", err); - return err; - } - - pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name, - relo_idx, err == 0 ? "non-matching" : "matching", i); - bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec); - libbpf_print(LIBBPF_DEBUG, "\n"); - - if (err == 0) - continue; - - err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, &cand_spec, &cand_res); - if (err) - return err; - - if (j == 0) { - targ_res = cand_res; - targ_spec = cand_spec; - } else if (cand_spec.bit_offset != targ_spec.bit_offset) { - /* if there are many field relo candidates, they - * should all resolve to the same bit offset - */ - pr_warn("prog '%s': relo #%d: field offset ambiguity: %u != %u\n", - prog_name, relo_idx, cand_spec.bit_offset, - targ_spec.bit_offset); - return -EINVAL; - } else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) { - /* all candidates should result in the same relocation - * decision and value, otherwise it's dangerous to - * proceed due to ambiguity - */ - pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n", - prog_name, relo_idx, - cand_res.poison ? "failure" : "success", cand_res.new_val, - targ_res.poison ? "failure" : "success", targ_res.new_val); - return -EINVAL; - } - - cands->cands[j++] = cands->cands[i]; - } - - /* - * For BPF_FIELD_EXISTS relo or when used BPF program has field - * existence checks or kernel version/config checks, it's expected - * that we might not find any candidates. In this case, if field - * wasn't found in any candidate, the list of candidates shouldn't - * change at all, we'll just handle relocating appropriately, - * depending on relo's kind. - */ - if (j > 0) - cands->len = j; - - /* - * If no candidates were found, it might be both a programmer error, - * as well as expected case, depending whether instruction w/ - * relocation is guarded in some way that makes it unreachable (dead - * code) if relocation can't be resolved. This is handled in - * bpf_core_patch_insn() uniformly by replacing that instruction with - * BPF helper call insn (using invalid helper ID). If that instruction - * is indeed unreachable, then it will be ignored and eliminated by - * verifier. If it was an error, then verifier will complain and point - * to a specific instruction number in its log. - */ - if (j == 0) { - pr_debug("prog '%s': relo #%d: no matching targets found\n", - prog_name, relo_idx); - - /* calculate single target relo result explicitly */ - err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, NULL, &targ_res); - if (err) - return err; - } - -patch_insn: - /* bpf_core_patch_insn() should know how to handle missing targ_spec */ - err = bpf_core_patch_insn(prog_name, insn, insn_idx, relo, relo_idx, &targ_res); - if (err) { - pr_warn("prog '%s': relo #%d: failed to patch insn #%zu: %d\n", - prog_name, relo_idx, relo->insn_off / BPF_INSN_SZ, err); - return -EINVAL; - } - - return 0; -} - static int bpf_core_apply_relo(struct bpf_program *prog, const struct bpf_core_relo *relo, int relo_idx, diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 3178d5685dce..f7b691d5f9eb 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -425,4 +425,14 @@ static inline void *libbpf_ptr(void *ret) return ret; }
+static inline bool str_is_empty(const char *s) +{ + return !s || !s[0]; +} + +static inline bool is_ldimm64_insn(struct bpf_insn *insn) +{ + return insn->code == (BPF_LD | BPF_IMM | BPF_DW); +} + #endif /* __LIBBPF_LIBBPF_INTERNAL_H */ diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c new file mode 100644 index 000000000000..4016ed492d0c --- /dev/null +++ b/tools/lib/bpf/relo_core.c @@ -0,0 +1,1295 @@ +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +/* Copyright (c) 2019 Facebook */ + +#include <stdio.h> +#include <string.h> +#include <errno.h> +#include <ctype.h> +#include <linux/err.h> + +#include "libbpf.h" +#include "bpf.h" +#include "btf.h" +#include "str_error.h" +#include "libbpf_internal.h" + +#define BPF_CORE_SPEC_MAX_LEN 64 + +/* represents BPF CO-RE field or array element accessor */ +struct bpf_core_accessor { + __u32 type_id; /* struct/union type or array element type */ + __u32 idx; /* field index or array index */ + const char *name; /* field name or NULL for array accessor */ +}; + +struct bpf_core_spec { + const struct btf *btf; + /* high-level spec: named fields and array indices only */ + struct bpf_core_accessor spec[BPF_CORE_SPEC_MAX_LEN]; + /* original unresolved (no skip_mods_or_typedefs) root type ID */ + __u32 root_type_id; + /* CO-RE relocation kind */ + enum bpf_core_relo_kind relo_kind; + /* high-level spec length */ + int len; + /* raw, low-level spec: 1-to-1 with accessor spec string */ + int raw_spec[BPF_CORE_SPEC_MAX_LEN]; + /* raw spec length */ + int raw_len; + /* field bit offset represented by spec */ + __u32 bit_offset; +}; + +static bool is_flex_arr(const struct btf *btf, + const struct bpf_core_accessor *acc, + const struct btf_array *arr) +{ + const struct btf_type *t; + + /* not a flexible array, if not inside a struct or has non-zero size */ + if (!acc->name || arr->nelems > 0) + return false; + + /* has to be the last member of enclosing struct */ + t = btf__type_by_id(btf, acc->type_id); + return acc->idx == btf_vlen(t) - 1; +} + +static const char *core_relo_kind_str(enum bpf_core_relo_kind kind) +{ + switch (kind) { + case BPF_FIELD_BYTE_OFFSET: return "byte_off"; + case BPF_FIELD_BYTE_SIZE: return "byte_sz"; + case BPF_FIELD_EXISTS: return "field_exists"; + case BPF_FIELD_SIGNED: return "signed"; + case BPF_FIELD_LSHIFT_U64: return "lshift_u64"; + case BPF_FIELD_RSHIFT_U64: return "rshift_u64"; + case BPF_TYPE_ID_LOCAL: return "local_type_id"; + case BPF_TYPE_ID_TARGET: return "target_type_id"; + case BPF_TYPE_EXISTS: return "type_exists"; + case BPF_TYPE_SIZE: return "type_size"; + case BPF_ENUMVAL_EXISTS: return "enumval_exists"; + case BPF_ENUMVAL_VALUE: return "enumval_value"; + default: return "unknown"; + } +} + +static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) +{ + switch (kind) { + case BPF_FIELD_BYTE_OFFSET: + case BPF_FIELD_BYTE_SIZE: + case BPF_FIELD_EXISTS: + case BPF_FIELD_SIGNED: + case BPF_FIELD_LSHIFT_U64: + case BPF_FIELD_RSHIFT_U64: + return true; + default: + return false; + } +} + +static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) +{ + switch (kind) { + case BPF_TYPE_ID_LOCAL: + case BPF_TYPE_ID_TARGET: + case BPF_TYPE_EXISTS: + case BPF_TYPE_SIZE: + return true; + default: + return false; + } +} + +static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) +{ + switch (kind) { + case BPF_ENUMVAL_EXISTS: + case BPF_ENUMVAL_VALUE: + return true; + default: + return false; + } +} + +/* + * Turn bpf_core_relo into a low- and high-level spec representation, + * validating correctness along the way, as well as calculating resulting + * field bit offset, specified by accessor string. Low-level spec captures + * every single level of nestedness, including traversing anonymous + * struct/union members. High-level one only captures semantically meaningful + * "turning points": named fields and array indicies. + * E.g., for this case: + * + * struct sample { + * int __unimportant; + * struct { + * int __1; + * int __2; + * int a[7]; + * }; + * }; + * + * struct sample *s = ...; + * + * int x = &s->a[3]; // access string = '0:1:2:3' + * + * Low-level spec has 1:1 mapping with each element of access string (it's + * just a parsed access string representation): [0, 1, 2, 3]. + * + * High-level spec will capture only 3 points: + * - intial zero-index access by pointer (&s->... is the same as &s[0]...); + * - field 'a' access (corresponds to '2' in low-level spec); + * - array element #3 access (corresponds to '3' in low-level spec). + * + * Type-based relocations (TYPE_EXISTS/TYPE_SIZE, + * TYPE_ID_LOCAL/TYPE_ID_TARGET) don't capture any field information. Their + * spec and raw_spec are kept empty. + * + * Enum value-based relocations (ENUMVAL_EXISTS/ENUMVAL_VALUE) use access + * string to specify enumerator's value index that need to be relocated. + */ +static int bpf_core_parse_spec(const struct btf *btf, + __u32 type_id, + const char *spec_str, + enum bpf_core_relo_kind relo_kind, + struct bpf_core_spec *spec) +{ + int access_idx, parsed_len, i; + struct bpf_core_accessor *acc; + const struct btf_type *t; + const char *name; + __u32 id; + __s64 sz; + + if (str_is_empty(spec_str) || *spec_str == ':') + return -EINVAL; + + memset(spec, 0, sizeof(*spec)); + spec->btf = btf; + spec->root_type_id = type_id; + spec->relo_kind = relo_kind; + + /* type-based relocations don't have a field access string */ + if (core_relo_is_type_based(relo_kind)) { + if (strcmp(spec_str, "0")) + return -EINVAL; + return 0; + } + + /* parse spec_str="0:1:2:3:4" into array raw_spec=[0, 1, 2, 3, 4] */ + while (*spec_str) { + if (*spec_str == ':') + ++spec_str; + if (sscanf(spec_str, "%d%n", &access_idx, &parsed_len) != 1) + return -EINVAL; + if (spec->raw_len == BPF_CORE_SPEC_MAX_LEN) + return -E2BIG; + spec_str += parsed_len; + spec->raw_spec[spec->raw_len++] = access_idx; + } + + if (spec->raw_len == 0) + return -EINVAL; + + t = skip_mods_and_typedefs(btf, type_id, &id); + if (!t) + return -EINVAL; + + access_idx = spec->raw_spec[0]; + acc = &spec->spec[0]; + acc->type_id = id; + acc->idx = access_idx; + spec->len++; + + if (core_relo_is_enumval_based(relo_kind)) { + if (!btf_is_enum(t) || spec->raw_len > 1 || access_idx >= btf_vlen(t)) + return -EINVAL; + + /* record enumerator name in a first accessor */ + acc->name = btf__name_by_offset(btf, btf_enum(t)[access_idx].name_off); + return 0; + } + + if (!core_relo_is_field_based(relo_kind)) + return -EINVAL; + + sz = btf__resolve_size(btf, id); + if (sz < 0) + return sz; + spec->bit_offset = access_idx * sz * 8; + + for (i = 1; i < spec->raw_len; i++) { + t = skip_mods_and_typedefs(btf, id, &id); + if (!t) + return -EINVAL; + + access_idx = spec->raw_spec[i]; + acc = &spec->spec[spec->len]; + + if (btf_is_composite(t)) { + const struct btf_member *m; + __u32 bit_offset; + + if (access_idx >= btf_vlen(t)) + return -EINVAL; + + bit_offset = btf_member_bit_offset(t, access_idx); + spec->bit_offset += bit_offset; + + m = btf_members(t) + access_idx; + if (m->name_off) { + name = btf__name_by_offset(btf, m->name_off); + if (str_is_empty(name)) + return -EINVAL; + + acc->type_id = id; + acc->idx = access_idx; + acc->name = name; + spec->len++; + } + + id = m->type; + } else if (btf_is_array(t)) { + const struct btf_array *a = btf_array(t); + bool flex; + + t = skip_mods_and_typedefs(btf, a->type, &id); + if (!t) + return -EINVAL; + + flex = is_flex_arr(btf, acc - 1, a); + if (!flex && access_idx >= a->nelems) + return -EINVAL; + + spec->spec[spec->len].type_id = id; + spec->spec[spec->len].idx = access_idx; + spec->len++; + + sz = btf__resolve_size(btf, id); + if (sz < 0) + return sz; + spec->bit_offset += access_idx * sz * 8; + } else { + pr_warn("relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n", + type_id, spec_str, i, id, btf_kind_str(t)); + return -EINVAL; + } + } + + return 0; +} + +/* Check two types for compatibility for the purpose of field access + * relocation. const/volatile/restrict and typedefs are skipped to ensure we + * are relocating semantically compatible entities: + * - any two STRUCTs/UNIONs are compatible and can be mixed; + * - any two FWDs are compatible, if their names match (modulo flavor suffix); + * - any two PTRs are always compatible; + * - for ENUMs, names should be the same (ignoring flavor suffix) or at + * least one of enums should be anonymous; + * - for ENUMs, check sizes, names are ignored; + * - for INT, size and signedness are ignored; + * - any two FLOATs are always compatible; + * - for ARRAY, dimensionality is ignored, element types are checked for + * compatibility recursively; + * - everything else shouldn't be ever a target of relocation. + * These rules are not set in stone and probably will be adjusted as we get + * more experience with using BPF CO-RE relocations. + */ +static int bpf_core_fields_are_compat(const struct btf *local_btf, + __u32 local_id, + const struct btf *targ_btf, + __u32 targ_id) +{ + const struct btf_type *local_type, *targ_type; + +recur: + local_type = skip_mods_and_typedefs(local_btf, local_id, &local_id); + targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id); + if (!local_type || !targ_type) + return -EINVAL; + + if (btf_is_composite(local_type) && btf_is_composite(targ_type)) + return 1; + if (btf_kind(local_type) != btf_kind(targ_type)) + return 0; + + switch (btf_kind(local_type)) { + case BTF_KIND_PTR: + case BTF_KIND_FLOAT: + return 1; + case BTF_KIND_FWD: + case BTF_KIND_ENUM: { + const char *local_name, *targ_name; + size_t local_len, targ_len; + + local_name = btf__name_by_offset(local_btf, + local_type->name_off); + targ_name = btf__name_by_offset(targ_btf, targ_type->name_off); + local_len = bpf_core_essential_name_len(local_name); + targ_len = bpf_core_essential_name_len(targ_name); + /* one of them is anonymous or both w/ same flavor-less names */ + return local_len == 0 || targ_len == 0 || + (local_len == targ_len && + strncmp(local_name, targ_name, local_len) == 0); + } + case BTF_KIND_INT: + /* just reject deprecated bitfield-like integers; all other + * integers are by default compatible between each other + */ + return btf_int_offset(local_type) == 0 && + btf_int_offset(targ_type) == 0; + case BTF_KIND_ARRAY: + local_id = btf_array(local_type)->type; + targ_id = btf_array(targ_type)->type; + goto recur; + default: + pr_warn("unexpected kind %d relocated, local [%d], target [%d]\n", + btf_kind(local_type), local_id, targ_id); + return 0; + } +} + +/* + * Given single high-level named field accessor in local type, find + * corresponding high-level accessor for a target type. Along the way, + * maintain low-level spec for target as well. Also keep updating target + * bit offset. + * + * Searching is performed through recursive exhaustive enumeration of all + * fields of a struct/union. If there are any anonymous (embedded) + * structs/unions, they are recursively searched as well. If field with + * desired name is found, check compatibility between local and target types, + * before returning result. + * + * 1 is returned, if field is found. + * 0 is returned if no compatible field is found. + * <0 is returned on error. + */ +static int bpf_core_match_member(const struct btf *local_btf, + const struct bpf_core_accessor *local_acc, + const struct btf *targ_btf, + __u32 targ_id, + struct bpf_core_spec *spec, + __u32 *next_targ_id) +{ + const struct btf_type *local_type, *targ_type; + const struct btf_member *local_member, *m; + const char *local_name, *targ_name; + __u32 local_id; + int i, n, found; + + targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id); + if (!targ_type) + return -EINVAL; + if (!btf_is_composite(targ_type)) + return 0; + + local_id = local_acc->type_id; + local_type = btf__type_by_id(local_btf, local_id); + local_member = btf_members(local_type) + local_acc->idx; + local_name = btf__name_by_offset(local_btf, local_member->name_off); + + n = btf_vlen(targ_type); + m = btf_members(targ_type); + for (i = 0; i < n; i++, m++) { + __u32 bit_offset; + + bit_offset = btf_member_bit_offset(targ_type, i); + + /* too deep struct/union/array nesting */ + if (spec->raw_len == BPF_CORE_SPEC_MAX_LEN) + return -E2BIG; + + /* speculate this member will be the good one */ + spec->bit_offset += bit_offset; + spec->raw_spec[spec->raw_len++] = i; + + targ_name = btf__name_by_offset(targ_btf, m->name_off); + if (str_is_empty(targ_name)) { + /* embedded struct/union, we need to go deeper */ + found = bpf_core_match_member(local_btf, local_acc, + targ_btf, m->type, + spec, next_targ_id); + if (found) /* either found or error */ + return found; + } else if (strcmp(local_name, targ_name) == 0) { + /* matching named field */ + struct bpf_core_accessor *targ_acc; + + targ_acc = &spec->spec[spec->len++]; + targ_acc->type_id = targ_id; + targ_acc->idx = i; + targ_acc->name = targ_name; + + *next_targ_id = m->type; + found = bpf_core_fields_are_compat(local_btf, + local_member->type, + targ_btf, m->type); + if (!found) + spec->len--; /* pop accessor */ + return found; + } + /* member turned out not to be what we looked for */ + spec->bit_offset -= bit_offset; + spec->raw_len--; + } + + return 0; +} + +/* + * Try to match local spec to a target type and, if successful, produce full + * target spec (high-level, low-level + bit offset). + */ +static int bpf_core_spec_match(struct bpf_core_spec *local_spec, + const struct btf *targ_btf, __u32 targ_id, + struct bpf_core_spec *targ_spec) +{ + const struct btf_type *targ_type; + const struct bpf_core_accessor *local_acc; + struct bpf_core_accessor *targ_acc; + int i, sz, matched; + + memset(targ_spec, 0, sizeof(*targ_spec)); + targ_spec->btf = targ_btf; + targ_spec->root_type_id = targ_id; + targ_spec->relo_kind = local_spec->relo_kind; + + if (core_relo_is_type_based(local_spec->relo_kind)) { + return bpf_core_types_are_compat(local_spec->btf, + local_spec->root_type_id, + targ_btf, targ_id); + } + + local_acc = &local_spec->spec[0]; + targ_acc = &targ_spec->spec[0]; + + if (core_relo_is_enumval_based(local_spec->relo_kind)) { + size_t local_essent_len, targ_essent_len; + const struct btf_enum *e; + const char *targ_name; + + /* has to resolve to an enum */ + targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, &targ_id); + if (!btf_is_enum(targ_type)) + return 0; + + local_essent_len = bpf_core_essential_name_len(local_acc->name); + + for (i = 0, e = btf_enum(targ_type); i < btf_vlen(targ_type); i++, e++) { + targ_name = btf__name_by_offset(targ_spec->btf, e->name_off); + targ_essent_len = bpf_core_essential_name_len(targ_name); + if (targ_essent_len != local_essent_len) + continue; + if (strncmp(local_acc->name, targ_name, local_essent_len) == 0) { + targ_acc->type_id = targ_id; + targ_acc->idx = i; + targ_acc->name = targ_name; + targ_spec->len++; + targ_spec->raw_spec[targ_spec->raw_len] = targ_acc->idx; + targ_spec->raw_len++; + return 1; + } + } + return 0; + } + + if (!core_relo_is_field_based(local_spec->relo_kind)) + return -EINVAL; + + for (i = 0; i < local_spec->len; i++, local_acc++, targ_acc++) { + targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, + &targ_id); + if (!targ_type) + return -EINVAL; + + if (local_acc->name) { + matched = bpf_core_match_member(local_spec->btf, + local_acc, + targ_btf, targ_id, + targ_spec, &targ_id); + if (matched <= 0) + return matched; + } else { + /* for i=0, targ_id is already treated as array element + * type (because it's the original struct), for others + * we should find array element type first + */ + if (i > 0) { + const struct btf_array *a; + bool flex; + + if (!btf_is_array(targ_type)) + return 0; + + a = btf_array(targ_type); + flex = is_flex_arr(targ_btf, targ_acc - 1, a); + if (!flex && local_acc->idx >= a->nelems) + return 0; + if (!skip_mods_and_typedefs(targ_btf, a->type, + &targ_id)) + return -EINVAL; + } + + /* too deep struct/union/array nesting */ + if (targ_spec->raw_len == BPF_CORE_SPEC_MAX_LEN) + return -E2BIG; + + targ_acc->type_id = targ_id; + targ_acc->idx = local_acc->idx; + targ_acc->name = NULL; + targ_spec->len++; + targ_spec->raw_spec[targ_spec->raw_len] = targ_acc->idx; + targ_spec->raw_len++; + + sz = btf__resolve_size(targ_btf, targ_id); + if (sz < 0) + return sz; + targ_spec->bit_offset += local_acc->idx * sz * 8; + } + } + + return 1; +} + +static int bpf_core_calc_field_relo(const char *prog_name, + const struct bpf_core_relo *relo, + const struct bpf_core_spec *spec, + __u32 *val, __u32 *field_sz, __u32 *type_id, + bool *validate) +{ + const struct bpf_core_accessor *acc; + const struct btf_type *t; + __u32 byte_off, byte_sz, bit_off, bit_sz, field_type_id; + const struct btf_member *m; + const struct btf_type *mt; + bool bitfield; + __s64 sz; + + *field_sz = 0; + + if (relo->kind == BPF_FIELD_EXISTS) { + *val = spec ? 1 : 0; + return 0; + } + + if (!spec) + return -EUCLEAN; /* request instruction poisoning */ + + acc = &spec->spec[spec->len - 1]; + t = btf__type_by_id(spec->btf, acc->type_id); + + /* a[n] accessor needs special handling */ + if (!acc->name) { + if (relo->kind == BPF_FIELD_BYTE_OFFSET) { + *val = spec->bit_offset / 8; + /* remember field size for load/store mem size */ + sz = btf__resolve_size(spec->btf, acc->type_id); + if (sz < 0) + return -EINVAL; + *field_sz = sz; + *type_id = acc->type_id; + } else if (relo->kind == BPF_FIELD_BYTE_SIZE) { + sz = btf__resolve_size(spec->btf, acc->type_id); + if (sz < 0) + return -EINVAL; + *val = sz; + } else { + pr_warn("prog '%s': relo %d at insn #%d can't be applied to array access\n", + prog_name, relo->kind, relo->insn_off / 8); + return -EINVAL; + } + if (validate) + *validate = true; + return 0; + } + + m = btf_members(t) + acc->idx; + mt = skip_mods_and_typedefs(spec->btf, m->type, &field_type_id); + bit_off = spec->bit_offset; + bit_sz = btf_member_bitfield_size(t, acc->idx); + + bitfield = bit_sz > 0; + if (bitfield) { + byte_sz = mt->size; + byte_off = bit_off / 8 / byte_sz * byte_sz; + /* figure out smallest int size necessary for bitfield load */ + while (bit_off + bit_sz - byte_off * 8 > byte_sz * 8) { + if (byte_sz >= 8) { + /* bitfield can't be read with 64-bit read */ + pr_warn("prog '%s': relo %d at insn #%d can't be satisfied for bitfield\n", + prog_name, relo->kind, relo->insn_off / 8); + return -E2BIG; + } + byte_sz *= 2; + byte_off = bit_off / 8 / byte_sz * byte_sz; + } + } else { + sz = btf__resolve_size(spec->btf, field_type_id); + if (sz < 0) + return -EINVAL; + byte_sz = sz; + byte_off = spec->bit_offset / 8; + bit_sz = byte_sz * 8; + } + + /* for bitfields, all the relocatable aspects are ambiguous and we + * might disagree with compiler, so turn off validation of expected + * value, except for signedness + */ + if (validate) + *validate = !bitfield; + + switch (relo->kind) { + case BPF_FIELD_BYTE_OFFSET: + *val = byte_off; + if (!bitfield) { + *field_sz = byte_sz; + *type_id = field_type_id; + } + break; + case BPF_FIELD_BYTE_SIZE: + *val = byte_sz; + break; + case BPF_FIELD_SIGNED: + /* enums will be assumed unsigned */ + *val = btf_is_enum(mt) || + (btf_int_encoding(mt) & BTF_INT_SIGNED); + if (validate) + *validate = true; /* signedness is never ambiguous */ + break; + case BPF_FIELD_LSHIFT_U64: +#if __BYTE_ORDER == __LITTLE_ENDIAN + *val = 64 - (bit_off + bit_sz - byte_off * 8); +#else + *val = (8 - byte_sz) * 8 + (bit_off - byte_off * 8); +#endif + break; + case BPF_FIELD_RSHIFT_U64: + *val = 64 - bit_sz; + if (validate) + *validate = true; /* right shift is never ambiguous */ + break; + case BPF_FIELD_EXISTS: + default: + return -EOPNOTSUPP; + } + + return 0; +} + +static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo, + const struct bpf_core_spec *spec, + __u32 *val) +{ + __s64 sz; + + /* type-based relos return zero when target type is not found */ + if (!spec) { + *val = 0; + return 0; + } + + switch (relo->kind) { + case BPF_TYPE_ID_TARGET: + *val = spec->root_type_id; + break; + case BPF_TYPE_EXISTS: + *val = 1; + break; + case BPF_TYPE_SIZE: + sz = btf__resolve_size(spec->btf, spec->root_type_id); + if (sz < 0) + return -EINVAL; + *val = sz; + break; + case BPF_TYPE_ID_LOCAL: + /* BPF_TYPE_ID_LOCAL is handled specially and shouldn't get here */ + default: + return -EOPNOTSUPP; + } + + return 0; +} + +static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo, + const struct bpf_core_spec *spec, + __u32 *val) +{ + const struct btf_type *t; + const struct btf_enum *e; + + switch (relo->kind) { + case BPF_ENUMVAL_EXISTS: + *val = spec ? 1 : 0; + break; + case BPF_ENUMVAL_VALUE: + if (!spec) + return -EUCLEAN; /* request instruction poisoning */ + t = btf__type_by_id(spec->btf, spec->spec[0].type_id); + e = btf_enum(t) + spec->spec[0].idx; + *val = e->val; + break; + default: + return -EOPNOTSUPP; + } + + return 0; +} + +struct bpf_core_relo_res +{ + /* expected value in the instruction, unless validate == false */ + __u32 orig_val; + /* new value that needs to be patched up to */ + __u32 new_val; + /* relocation unsuccessful, poison instruction, but don't fail load */ + bool poison; + /* some relocations can't be validated against orig_val */ + bool validate; + /* for field byte offset relocations or the forms: + * *(T *)(rX + <off>) = rY + * rX = *(T *)(rY + <off>), + * we remember original and resolved field size to adjust direct + * memory loads of pointers and integers; this is necessary for 32-bit + * host kernel architectures, but also allows to automatically + * relocate fields that were resized from, e.g., u32 to u64, etc. + */ + bool fail_memsz_adjust; + __u32 orig_sz; + __u32 orig_type_id; + __u32 new_sz; + __u32 new_type_id; +}; + +/* Calculate original and target relocation values, given local and target + * specs and relocation kind. These values are calculated for each candidate. + * If there are multiple candidates, resulting values should all be consistent + * with each other. Otherwise, libbpf will refuse to proceed due to ambiguity. + * If instruction has to be poisoned, *poison will be set to true. + */ +static int bpf_core_calc_relo(const char *prog_name, + const struct bpf_core_relo *relo, + int relo_idx, + const struct bpf_core_spec *local_spec, + const struct bpf_core_spec *targ_spec, + struct bpf_core_relo_res *res) +{ + int err = -EOPNOTSUPP; + + res->orig_val = 0; + res->new_val = 0; + res->poison = false; + res->validate = true; + res->fail_memsz_adjust = false; + res->orig_sz = res->new_sz = 0; + res->orig_type_id = res->new_type_id = 0; + + if (core_relo_is_field_based(relo->kind)) { + err = bpf_core_calc_field_relo(prog_name, relo, local_spec, + &res->orig_val, &res->orig_sz, + &res->orig_type_id, &res->validate); + err = err ?: bpf_core_calc_field_relo(prog_name, relo, targ_spec, + &res->new_val, &res->new_sz, + &res->new_type_id, NULL); + if (err) + goto done; + /* Validate if it's safe to adjust load/store memory size. + * Adjustments are performed only if original and new memory + * sizes differ. + */ + res->fail_memsz_adjust = false; + if (res->orig_sz != res->new_sz) { + const struct btf_type *orig_t, *new_t; + + orig_t = btf__type_by_id(local_spec->btf, res->orig_type_id); + new_t = btf__type_by_id(targ_spec->btf, res->new_type_id); + + /* There are two use cases in which it's safe to + * adjust load/store's mem size: + * - reading a 32-bit kernel pointer, while on BPF + * size pointers are always 64-bit; in this case + * it's safe to "downsize" instruction size due to + * pointer being treated as unsigned integer with + * zero-extended upper 32-bits; + * - reading unsigned integers, again due to + * zero-extension is preserving the value correctly. + * + * In all other cases it's incorrect to attempt to + * load/store field because read value will be + * incorrect, so we poison relocated instruction. + */ + if (btf_is_ptr(orig_t) && btf_is_ptr(new_t)) + goto done; + if (btf_is_int(orig_t) && btf_is_int(new_t) && + btf_int_encoding(orig_t) != BTF_INT_SIGNED && + btf_int_encoding(new_t) != BTF_INT_SIGNED) + goto done; + + /* mark as invalid mem size adjustment, but this will + * only be checked for LDX/STX/ST insns + */ + res->fail_memsz_adjust = true; + } + } else if (core_relo_is_type_based(relo->kind)) { + err = bpf_core_calc_type_relo(relo, local_spec, &res->orig_val); + err = err ?: bpf_core_calc_type_relo(relo, targ_spec, &res->new_val); + } else if (core_relo_is_enumval_based(relo->kind)) { + err = bpf_core_calc_enumval_relo(relo, local_spec, &res->orig_val); + err = err ?: bpf_core_calc_enumval_relo(relo, targ_spec, &res->new_val); + } + +done: + if (err == -EUCLEAN) { + /* EUCLEAN is used to signal instruction poisoning request */ + res->poison = true; + err = 0; + } else if (err == -EOPNOTSUPP) { + /* EOPNOTSUPP means unknown/unsupported relocation */ + pr_warn("prog '%s': relo #%d: unrecognized CO-RE relocation %s (%d) at insn #%d\n", + prog_name, relo_idx, core_relo_kind_str(relo->kind), + relo->kind, relo->insn_off / 8); + } + + return err; +} + +/* + * Turn instruction for which CO_RE relocation failed into invalid one with + * distinct signature. + */ +static void bpf_core_poison_insn(const char *prog_name, int relo_idx, + int insn_idx, struct bpf_insn *insn) +{ + pr_debug("prog '%s': relo #%d: substituting insn #%d w/ invalid insn\n", + prog_name, relo_idx, insn_idx); + insn->code = BPF_JMP | BPF_CALL; + insn->dst_reg = 0; + insn->src_reg = 0; + insn->off = 0; + /* if this instruction is reachable (not a dead code), + * verifier will complain with the following message: + * invalid func unknown#195896080 + */ + insn->imm = 195896080; /* => 0xbad2310 => "bad relo" */ +} + +static int insn_bpf_size_to_bytes(struct bpf_insn *insn) +{ + switch (BPF_SIZE(insn->code)) { + case BPF_DW: return 8; + case BPF_W: return 4; + case BPF_H: return 2; + case BPF_B: return 1; + default: return -1; + } +} + +static int insn_bytes_to_bpf_size(__u32 sz) +{ + switch (sz) { + case 8: return BPF_DW; + case 4: return BPF_W; + case 2: return BPF_H; + case 1: return BPF_B; + default: return -1; + } +} + +/* + * Patch relocatable BPF instruction. + * + * Patched value is determined by relocation kind and target specification. + * For existence relocations target spec will be NULL if field/type is not found. + * Expected insn->imm value is determined using relocation kind and local + * spec, and is checked before patching instruction. If actual insn->imm value + * is wrong, bail out with error. + * + * Currently supported classes of BPF instruction are: + * 1. rX = <imm> (assignment with immediate operand); + * 2. rX += <imm> (arithmetic operations with immediate operand); + * 3. rX = <imm64> (load with 64-bit immediate value); + * 4. rX = *(T *)(rY + <off>), where T is one of {u8, u16, u32, u64}; + * 5. *(T *)(rX + <off>) = rY, where T is one of {u8, u16, u32, u64}; + * 6. *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}. + */ +static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, + int insn_idx, const struct bpf_core_relo *relo, + int relo_idx, const struct bpf_core_relo_res *res) +{ + __u32 orig_val, new_val; + __u8 class; + + class = BPF_CLASS(insn->code); + + if (res->poison) { +poison: + /* poison second part of ldimm64 to avoid confusing error from + * verifier about "unknown opcode 00" + */ + if (is_ldimm64_insn(insn)) + bpf_core_poison_insn(prog_name, relo_idx, insn_idx + 1, insn + 1); + bpf_core_poison_insn(prog_name, relo_idx, insn_idx, insn); + return 0; + } + + orig_val = res->orig_val; + new_val = res->new_val; + + switch (class) { + case BPF_ALU: + case BPF_ALU64: + if (BPF_SRC(insn->code) != BPF_K) + return -EINVAL; + if (res->validate && insn->imm != orig_val) { + pr_warn("prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %u -> %u\n", + prog_name, relo_idx, + insn_idx, insn->imm, orig_val, new_val); + return -EINVAL; + } + orig_val = insn->imm; + insn->imm = new_val; + pr_debug("prog '%s': relo #%d: patched insn #%d (ALU/ALU64) imm %u -> %u\n", + prog_name, relo_idx, insn_idx, + orig_val, new_val); + break; + case BPF_LDX: + case BPF_ST: + case BPF_STX: + if (res->validate && insn->off != orig_val) { + pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %u -> %u\n", + prog_name, relo_idx, insn_idx, insn->off, orig_val, new_val); + return -EINVAL; + } + if (new_val > SHRT_MAX) { + pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) value too big: %u\n", + prog_name, relo_idx, insn_idx, new_val); + return -ERANGE; + } + if (res->fail_memsz_adjust) { + pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) accesses field incorrectly. " + "Make sure you are accessing pointers, unsigned integers, or fields of matching type and size.\n", + prog_name, relo_idx, insn_idx); + goto poison; + } + + orig_val = insn->off; + insn->off = new_val; + pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) off %u -> %u\n", + prog_name, relo_idx, insn_idx, orig_val, new_val); + + if (res->new_sz != res->orig_sz) { + int insn_bytes_sz, insn_bpf_sz; + + insn_bytes_sz = insn_bpf_size_to_bytes(insn); + if (insn_bytes_sz != res->orig_sz) { + pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) unexpected mem size: got %d, exp %u\n", + prog_name, relo_idx, insn_idx, insn_bytes_sz, res->orig_sz); + return -EINVAL; + } + + insn_bpf_sz = insn_bytes_to_bpf_size(res->new_sz); + if (insn_bpf_sz < 0) { + pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) invalid new mem size: %u\n", + prog_name, relo_idx, insn_idx, res->new_sz); + return -EINVAL; + } + + insn->code = BPF_MODE(insn->code) | insn_bpf_sz | BPF_CLASS(insn->code); + pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) mem_sz %u -> %u\n", + prog_name, relo_idx, insn_idx, res->orig_sz, res->new_sz); + } + break; + case BPF_LD: { + __u64 imm; + + if (!is_ldimm64_insn(insn) || + insn[0].src_reg != 0 || insn[0].off != 0 || + insn[1].code != 0 || insn[1].dst_reg != 0 || + insn[1].src_reg != 0 || insn[1].off != 0) { + pr_warn("prog '%s': relo #%d: insn #%d (LDIMM64) has unexpected form\n", + prog_name, relo_idx, insn_idx); + return -EINVAL; + } + + imm = insn[0].imm + ((__u64)insn[1].imm << 32); + if (res->validate && imm != orig_val) { + pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDIMM64) value: got %llu, exp %u -> %u\n", + prog_name, relo_idx, + insn_idx, (unsigned long long)imm, + orig_val, new_val); + return -EINVAL; + } + + insn[0].imm = new_val; + insn[1].imm = 0; /* currently only 32-bit values are supported */ + pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %u\n", + prog_name, relo_idx, insn_idx, + (unsigned long long)imm, new_val); + break; + } + default: + pr_warn("prog '%s': relo #%d: trying to relocate unrecognized insn #%d, code:0x%x, src:0x%x, dst:0x%x, off:0x%x, imm:0x%x\n", + prog_name, relo_idx, insn_idx, insn->code, + insn->src_reg, insn->dst_reg, insn->off, insn->imm); + return -EINVAL; + } + + return 0; +} + +/* Output spec definition in the format: + * [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>, + * where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b + */ +static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec) +{ + const struct btf_type *t; + const struct btf_enum *e; + const char *s; + __u32 type_id; + int i; + + type_id = spec->root_type_id; + t = btf__type_by_id(spec->btf, type_id); + s = btf__name_by_offset(spec->btf, t->name_off); + + libbpf_print(level, "[%u] %s %s", type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s); + + if (core_relo_is_type_based(spec->relo_kind)) + return; + + if (core_relo_is_enumval_based(spec->relo_kind)) { + t = skip_mods_and_typedefs(spec->btf, type_id, NULL); + e = btf_enum(t) + spec->raw_spec[0]; + s = btf__name_by_offset(spec->btf, e->name_off); + + libbpf_print(level, "::%s = %u", s, e->val); + return; + } + + if (core_relo_is_field_based(spec->relo_kind)) { + for (i = 0; i < spec->len; i++) { + if (spec->spec[i].name) + libbpf_print(level, ".%s", spec->spec[i].name); + else if (i > 0 || spec->spec[i].idx > 0) + libbpf_print(level, "[%u]", spec->spec[i].idx); + } + + libbpf_print(level, " ("); + for (i = 0; i < spec->raw_len; i++) + libbpf_print(level, "%s%d", i == 0 ? "" : ":", spec->raw_spec[i]); + + if (spec->bit_offset % 8) + libbpf_print(level, " @ offset %u.%u)", + spec->bit_offset / 8, spec->bit_offset % 8); + else + libbpf_print(level, " @ offset %u)", spec->bit_offset / 8); + return; + } +} + +/* + * CO-RE relocate single instruction. + * + * The outline and important points of the algorithm: + * 1. For given local type, find corresponding candidate target types. + * Candidate type is a type with the same "essential" name, ignoring + * everything after last triple underscore (___). E.g., `sample`, + * `sample___flavor_one`, `sample___flavor_another_one`, are all candidates + * for each other. Names with triple underscore are referred to as + * "flavors" and are useful, among other things, to allow to + * specify/support incompatible variations of the same kernel struct, which + * might differ between different kernel versions and/or build + * configurations. + * + * N.B. Struct "flavors" could be generated by bpftool's BTF-to-C + * converter, when deduplicated BTF of a kernel still contains more than + * one different types with the same name. In that case, ___2, ___3, etc + * are appended starting from second name conflict. But start flavors are + * also useful to be defined "locally", in BPF program, to extract same + * data from incompatible changes between different kernel + * versions/configurations. For instance, to handle field renames between + * kernel versions, one can use two flavors of the struct name with the + * same common name and use conditional relocations to extract that field, + * depending on target kernel version. + * 2. For each candidate type, try to match local specification to this + * candidate target type. Matching involves finding corresponding + * high-level spec accessors, meaning that all named fields should match, + * as well as all array accesses should be within the actual bounds. Also, + * types should be compatible (see bpf_core_fields_are_compat for details). + * 3. It is supported and expected that there might be multiple flavors + * matching the spec. As long as all the specs resolve to the same set of + * offsets across all candidates, there is no error. If there is any + * ambiguity, CO-RE relocation will fail. This is necessary to accomodate + * imprefection of BTF deduplication, which can cause slight duplication of + * the same BTF type, if some directly or indirectly referenced (by + * pointer) type gets resolved to different actual types in different + * object files. If such situation occurs, deduplicated BTF will end up + * with two (or more) structurally identical types, which differ only in + * types they refer to through pointer. This should be OK in most cases and + * is not an error. + * 4. Candidate types search is performed by linearly scanning through all + * types in target BTF. It is anticipated that this is overall more + * efficient memory-wise and not significantly worse (if not better) + * CPU-wise compared to prebuilding a map from all local type names to + * a list of candidate type names. It's also sped up by caching resolved + * list of matching candidates per each local "root" type ID, that has at + * least one bpf_core_relo associated with it. This list is shared + * between multiple relocations for the same type ID and is updated as some + * of the candidates are pruned due to structural incompatibility. + */ +int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, + int insn_idx, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct bpf_core_cand_list *cands) +{ + struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; + struct bpf_core_relo_res cand_res, targ_res; + const struct btf_type *local_type; + const char *local_name; + __u32 local_id; + const char *spec_str; + int i, j, err; + + local_id = relo->type_id; + local_type = btf__type_by_id(local_btf, local_id); + if (!local_type) + return -EINVAL; + + local_name = btf__name_by_offset(local_btf, local_type->name_off); + if (!local_name) + return -EINVAL; + + spec_str = btf__name_by_offset(local_btf, relo->access_str_off); + if (str_is_empty(spec_str)) + return -EINVAL; + + err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); + if (err) { + pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", + prog_name, relo_idx, local_id, btf_kind_str(local_type), + str_is_empty(local_name) ? "<anon>" : local_name, + spec_str, err); + return -EINVAL; + } + + pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name, + relo_idx, core_relo_kind_str(relo->kind), relo->kind); + bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec); + libbpf_print(LIBBPF_DEBUG, "\n"); + + /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ + if (relo->kind == BPF_TYPE_ID_LOCAL) { + targ_res.validate = true; + targ_res.poison = false; + targ_res.orig_val = local_spec.root_type_id; + targ_res.new_val = local_spec.root_type_id; + goto patch_insn; + } + + /* libbpf doesn't support candidate search for anonymous types */ + if (str_is_empty(spec_str)) { + pr_warn("prog '%s': relo #%d: <%s> (%d) relocation doesn't support anonymous types\n", + prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); + return -EOPNOTSUPP; + } + + + for (i = 0, j = 0; i < cands->len; i++) { + err = bpf_core_spec_match(&local_spec, cands->cands[i].btf, + cands->cands[i].id, &cand_spec); + if (err < 0) { + pr_warn("prog '%s': relo #%d: error matching candidate #%d ", + prog_name, relo_idx, i); + bpf_core_dump_spec(LIBBPF_WARN, &cand_spec); + libbpf_print(LIBBPF_WARN, ": %d\n", err); + return err; + } + + pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name, + relo_idx, err == 0 ? "non-matching" : "matching", i); + bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec); + libbpf_print(LIBBPF_DEBUG, "\n"); + + if (err == 0) + continue; + + err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, &cand_spec, &cand_res); + if (err) + return err; + + if (j == 0) { + targ_res = cand_res; + targ_spec = cand_spec; + } else if (cand_spec.bit_offset != targ_spec.bit_offset) { + /* if there are many field relo candidates, they + * should all resolve to the same bit offset + */ + pr_warn("prog '%s': relo #%d: field offset ambiguity: %u != %u\n", + prog_name, relo_idx, cand_spec.bit_offset, + targ_spec.bit_offset); + return -EINVAL; + } else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) { + /* all candidates should result in the same relocation + * decision and value, otherwise it's dangerous to + * proceed due to ambiguity + */ + pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n", + prog_name, relo_idx, + cand_res.poison ? "failure" : "success", cand_res.new_val, + targ_res.poison ? "failure" : "success", targ_res.new_val); + return -EINVAL; + } + + cands->cands[j++] = cands->cands[i]; + } + + /* + * For BPF_FIELD_EXISTS relo or when used BPF program has field + * existence checks or kernel version/config checks, it's expected + * that we might not find any candidates. In this case, if field + * wasn't found in any candidate, the list of candidates shouldn't + * change at all, we'll just handle relocating appropriately, + * depending on relo's kind. + */ + if (j > 0) + cands->len = j; + + /* + * If no candidates were found, it might be both a programmer error, + * as well as expected case, depending whether instruction w/ + * relocation is guarded in some way that makes it unreachable (dead + * code) if relocation can't be resolved. This is handled in + * bpf_core_patch_insn() uniformly by replacing that instruction with + * BPF helper call insn (using invalid helper ID). If that instruction + * is indeed unreachable, then it will be ignored and eliminated by + * verifier. If it was an error, then verifier will complain and point + * to a specific instruction number in its log. + */ + if (j == 0) { + pr_debug("prog '%s': relo #%d: no matching targets found\n", + prog_name, relo_idx); + + /* calculate single target relo result explicitly */ + err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, NULL, &targ_res); + if (err) + return err; + } + +patch_insn: + /* bpf_core_patch_insn() should know how to handle missing targ_spec */ + err = bpf_core_patch_insn(prog_name, insn, insn_idx, relo, relo_idx, &targ_res); + if (err) { + pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n", + prog_name, relo_idx, relo->insn_off / 8, err); + return -EINVAL; + } + + return 0; +} diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index ddf20151fe41..3b9f8f18346c 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -75,8 +75,7 @@ struct bpf_core_relo { enum bpf_core_relo_kind kind; };
-struct bpf_core_cand -{ +struct bpf_core_cand { const struct btf *btf; const struct btf_type *t; const char *name; @@ -89,4 +88,13 @@ struct bpf_core_cand_list { int len; };
+int bpf_core_apply_relo_insn(const char *prog_name, + struct bpf_insn *insn, int insn_idx, + const struct bpf_core_relo *relo, int relo_idx, + const struct btf *local_btf, + struct bpf_core_cand_list *cands); +int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, + const struct btf *targ_btf, __u32 targ_id); + +size_t bpf_core_essential_name_len(const char *name); #endif
From: Jason Wang wangborong@cdjrlc.com
mainline inclusion from mainline-5.15-rc1 commit c139e40a515d2d1e51f7c08bd63ed4d1c7f64163 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove the repeated word 'the' in line 48.
Signed-off-by: Jason Wang wangborong@cdjrlc.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210727115928.74600-1-wangborong@cdjrlc.com (cherry picked from commit c139e40a515d2d1e51f7c08bd63ed4d1c7f64163) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a0a72fbabde5..75fdc19e85c1 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5972,7 +5972,7 @@ static int bpf_object__collect_relos(struct bpf_object *obj)
for (i = 0; i < obj->nr_programs; i++) { struct bpf_program *p = &obj->programs[i]; - + if (!p->nr_reloc) continue;
@@ -8280,7 +8280,7 @@ static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix, ret = snprintf(btf_type_name, sizeof(btf_type_name), "%s%s", prefix, name); /* snprintf returns the number of characters written excluding the - * the terminating null. So, if >= BTF_MAX_NAME_SIZE are written, it + * terminating null. So, if >= BTF_MAX_NAME_SIZE are written, it * indicates truncation. */ if (ret < 0 || ret >= sizeof(btf_type_name)) @@ -8822,7 +8822,7 @@ struct bpf_link { int bpf_link__update_program(struct bpf_link *link, struct bpf_program *prog) { int ret; - + ret = bpf_link_update(bpf_link__fd(link), bpf_program__fd(prog), NULL); return libbpf_err_errno(ret); }
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.15-rc1 commit 5aad03685185b5133a28e1ee1d4e98d3fd3642a3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Kernel functions referenced by .BTF_ids may be changed from global to static and get inlined or get renamed/removed, and thus disappears from BTF. This causes kernel build failure when resolve_btfids do id patch for symbols in .BTF_ids in vmlinux. Update resolve_btfids to emit warning messages and patch zero id for missing symbols instead of aborting kernel build process.
Suggested-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210727132532.2473636-2-hengqi.chen@gmail.com (cherry picked from commit 5aad03685185b5133a28e1ee1d4e98d3fd3642a3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/resolve_btfids/main.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c index 7ce4558a932f..efe0777854ae 100644 --- a/tools/bpf/resolve_btfids/main.c +++ b/tools/bpf/resolve_btfids/main.c @@ -291,7 +291,7 @@ static int compressed_section_fix(Elf *elf, Elf_Scn *scn, GElf_Shdr *sh) sh->sh_addralign = expected;
if (gelf_update_shdr(scn, sh) == 0) { - printf("FAILED cannot update section header: %s\n", + pr_err("FAILED cannot update section header: %s\n", elf_errmsg(-1)); return -1; } @@ -317,6 +317,7 @@ static int elf_collect(struct object *obj)
elf = elf_begin(fd, ELF_C_RDWR_MMAP, NULL); if (!elf) { + close(fd); pr_err("FAILED cannot create ELF descriptor: %s\n", elf_errmsg(-1)); return -1; @@ -485,7 +486,7 @@ static int symbols_resolve(struct object *obj) err = libbpf_get_error(btf); if (err) { pr_err("FAILED: load BTF from %s: %s\n", - obj->path, strerror(-err)); + obj->btf ?: obj->path, strerror(-err)); return -1; }
@@ -556,8 +557,7 @@ static int id_patch(struct object *obj, struct btf_id *id) int i;
if (!id->id) { - pr_err("FAILED unresolved symbol %s\n", id->name); - return -EINVAL; + pr_err("WARN: resolve_btfids: unresolved symbol %s\n", id->name); }
for (i = 0; i < id->addr_cnt; i++) { @@ -735,8 +735,9 @@ int main(int argc, const char **argv)
err = 0; out: - if (obj.efile.elf) + if (obj.efile.elf) { elf_end(obj.efile.elf); - close(obj.efile.fd); + close(obj.efile.fd); + } return err; }
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.15-rc1 commit d36216429ff3e69db4f6ea5e0c86b80010f5f30b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
To avoid kernel build failure due to some missing .BTF-ids referenced functions/types, the patch ([1]) tries to fill btf_id 0 for these types.
In bpf verifier, for percpu variable and helper returning btf_id cases, verifier already emitted proper warning with something like verbose(env, "Helper has invalid btf_id in R%d\n", regno); verbose(env, "invalid return type %d of func %s#%d\n", fn->ret_type, func_id_name(func_id), func_id);
But this is not the case for bpf_iter context arguments. I hacked resolve_btfids to encode btf_id 0 for struct task_struct. With `./test_progs -n 7/5`, I got, 0: (79) r2 = *(u64 *)(r1 +0) func 'bpf_iter_task' arg0 has btf_id 29739 type STRUCT 'bpf_iter_meta' ; struct seq_file *seq = ctx->meta->seq; 1: (79) r6 = *(u64 *)(r2 +0) ; struct task_struct *task = ctx->task; 2: (79) r7 = *(u64 *)(r1 +8) ; if (task == (void *)0) { 3: (55) if r7 != 0x0 goto pc+11 ... ; BPF_SEQ_PRINTF(seq, "%8d %8d\n", task->tgid, task->pid); 26: (61) r1 = *(u32 *)(r7 +1372) Type '(anon)' is not a struct
Basically, verifier will return btf_id 0 for task_struct. Later on, when the code tries to access task->tgid, the verifier correctly complains the type is '(anon)' and it is not a struct. Users still need to backtrace to find out what is going on.
Let us catch the invalid btf_id 0 earlier and provide better message indicating btf_id is wrong. The new error message looks like below: R1 type=ctx expected=fp ; struct seq_file *seq = ctx->meta->seq; 0: (79) r2 = *(u64 *)(r1 +0) func 'bpf_iter_task' arg0 has btf_id 29739 type STRUCT 'bpf_iter_meta' ; struct seq_file *seq = ctx->meta->seq; 1: (79) r6 = *(u64 *)(r2 +0) ; struct task_struct *task = ctx->task; 2: (79) r7 = *(u64 *)(r1 +8) invalid btf_id for context argument offset 8 invalid bpf_context access off=8 size=8
[1] https://lore.kernel.org/bpf/20210727132532.2473636-1-hengqi.chen@gmail.com/
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210728183025.1461750-1-yhs@fb.com (cherry picked from commit d36216429ff3e69db4f6ea5e0c86b80010f5f30b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index cd3c1a11fef1..e9b6e50a6764 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4700,6 +4700,11 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, const struct bpf_ctx_arg_aux *ctx_arg_info = &prog->aux->ctx_arg_info[i];
if (ctx_arg_info->offset == off) { + if (!ctx_arg_info->btf_id) { + bpf_log(log,"invalid btf_id for context argument offset %u\n", off); + return false; + } + info->reg_type = ctx_arg_info->reg_type; info->btf = btf_vmlinux; info->btf_id = ctx_arg_info->btf_id;
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 6d2d73cdd673d493f9f3751188757129b1d23fb7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Variable "err" is initialised to -EINVAL so that this error code is returned when something goes wrong in libbpf_find_prog_btf_id(). However, a recent change in the function made use of the variable in such a way that it is set to 0 if retrieving linear information on the program is successful, and this 0 value remains if we error out on failures at later stages.
Let's fix this by setting err to -EINVAL later in the function.
Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210729162028.29512-2-quentin@isovalent.com (cherry picked from commit 6d2d73cdd673d493f9f3751188757129b1d23fb7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 75fdc19e85c1..21bf1f5f45ec 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8324,7 +8324,7 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) struct bpf_prog_info_linear *info_linear; struct bpf_prog_info *info; struct btf *btf = NULL; - int err = -EINVAL; + int err;
info_linear = bpf_program__get_prog_info_linear(attach_prog_fd, 0); err = libbpf_get_error(info_linear); @@ -8333,6 +8333,8 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) attach_prog_fd); return err; } + + err = -EINVAL; info = &info_linear->info; if (!info->btf_id) { pr_warn("The target program doesn't have BTF\n");
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 3c7e58590600eca3402f08e7fbdf4f2d1e36c5c8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
As part of the effort to move towards a v1.0 for libbpf, rename btf__load() function, used to "upload" BTF information into the kernel, as btf__load_into_kernel(). This new name better reflects what the function does.
References:
- https://github.com/libbpf/libbpf/issues/278 - https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210729162028.29512-3-quentin@isovalent.com (cherry picked from commit 3c7e58590600eca3402f08e7fbdf4f2d1e36c5c8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 3 ++- tools/lib/bpf/btf.h | 1 + tools/lib/bpf/libbpf.c | 2 +- tools/lib/bpf/libbpf.map | 1 + 4 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index c08b135c022b..912de58edeea 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1185,7 +1185,7 @@ int btf__finalize_data(struct bpf_object *obj, struct btf *btf)
static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
-int btf__load(struct btf *btf) +int btf__load_into_kernel(struct btf *btf) { __u32 log_buf_size = 0, raw_size; char *log_buf = NULL; @@ -1233,6 +1233,7 @@ int btf__load(struct btf *btf) free(log_buf); return libbpf_err(err); } +int btf__load(struct btf *) __attribute__((alias("btf__load_into_kernel")));
int btf__fd(const struct btf *btf) { diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 374e9f15de2e..fd8a21d936ef 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -46,6 +46,7 @@ LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_b
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); LIBBPF_API int btf__load(struct btf *btf); +LIBBPF_API int btf__load_into_kernel(struct btf *btf); LIBBPF_API __s32 btf__find_by_name(const struct btf *btf, const char *type_name); LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf, diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 21bf1f5f45ec..25f9d8f24125 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2770,7 +2770,7 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) */ btf__set_fd(kern_btf, 0); } else { - err = btf__load(kern_btf); + err = btf__load_into_kernel(kern_btf); } if (sanitize) { if (!err) { diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 9bbb195812fa..cd70797eebb1 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -373,6 +373,7 @@ LIBBPF_0.5.0 { bpf_map__initial_value; bpf_map_lookup_and_delete_elem_flags; bpf_object__gen_loader; + btf__load_into_kernel; btf_dump__dump_type_data; libbpf_set_strict_mode; } LIBBPF_0.4.0;
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 6cc93e2f2c1c865acadedfea174bde893a2aa376 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Rename function btf__get_from_id() as btf__load_from_kernel_by_id() to better indicate what the function does. Change the new function so that, instead of requiring a pointer to the pointer to update and returning with an error code, it takes a single argument (the id of the BTF object) and returns the corresponding pointer. This is more in line with the existing constructors.
The other tools calling the (soon-to-be) deprecated btf__get_from_id() function will be updated in a future commit.
References:
- https://github.com/libbpf/libbpf/issues/278 - https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210729162028.29512-4-quentin@isovalent.com (cherry picked from commit 6cc93e2f2c1c865acadedfea174bde893a2aa376) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 25 +++++++++++++++++-------- tools/lib/bpf/btf.h | 4 +++- tools/lib/bpf/libbpf.c | 5 +++-- tools/lib/bpf/libbpf.map | 1 + 4 files changed, 24 insertions(+), 11 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 912de58edeea..4ceaa79c8405 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1388,21 +1388,30 @@ struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) return btf; }
-int btf__get_from_id(__u32 id, struct btf **btf) +struct btf *btf__load_from_kernel_by_id(__u32 id) { - struct btf *res; - int err, btf_fd; + struct btf *btf; + int btf_fd;
- *btf = NULL; btf_fd = bpf_btf_get_fd_by_id(id); if (btf_fd < 0) - return libbpf_err(-errno); - - res = btf_get_from_fd(btf_fd, NULL); - err = libbpf_get_error(res); + return libbpf_err_ptr(-errno);
+ btf = btf_get_from_fd(btf_fd, NULL); close(btf_fd);
+ return libbpf_ptr(btf); +} + +int btf__get_from_id(__u32 id, struct btf **btf) +{ + struct btf *res; + int err; + + *btf = NULL; + res = btf__load_from_kernel_by_id(id); + err = libbpf_get_error(res); + if (err) return libbpf_err(err);
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index fd8a21d936ef..5d955329a1f4 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -44,6 +44,9 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b LIBBPF_API struct btf *btf__parse_raw(const char *path); LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
+LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id); +LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf); + LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); LIBBPF_API int btf__load(struct btf *btf); LIBBPF_API int btf__load_into_kernel(struct btf *btf); @@ -67,7 +70,6 @@ LIBBPF_API void btf__set_fd(struct btf *btf, int fd); LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size); LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset); LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset); -LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf); LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, __u32 expected_key_size, __u32 expected_value_size, diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 25f9d8f24125..32868fbaf653 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8323,7 +8323,7 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) { struct bpf_prog_info_linear *info_linear; struct bpf_prog_info *info; - struct btf *btf = NULL; + struct btf *btf; int err;
info_linear = bpf_program__get_prog_info_linear(attach_prog_fd, 0); @@ -8340,7 +8340,8 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) pr_warn("The target program doesn't have BTF\n"); goto out; } - if (btf__get_from_id(info->btf_id, &btf)) { + btf = btf__load_from_kernel_by_id(info->btf_id); + if (libbpf_get_error(btf)) { pr_warn("Failed to get BTF of the program\n"); goto out; } diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index cd70797eebb1..454a8202dc57 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -373,6 +373,7 @@ LIBBPF_0.5.0 { bpf_map__initial_value; bpf_map_lookup_and_delete_elem_flags; bpf_object__gen_loader; + btf__load_from_kernel_by_id; btf__load_into_kernel; btf_dump__dump_type_data; libbpf_set_strict_mode;
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 369e955b3d1c12f6ec2e51a95911bb80ada55d79 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Make sure to call btf__free() (and not simply free(), which does not free all pointers stored in the struct) on pointers to struct btf objects retrieved at various locations.
These were found while updating the calls to btf__get_from_id().
Fixes: 999d82cbc044 ("tools/bpf: enhance test_btf file testing to test func info") Fixes: 254471e57a86 ("tools/bpf: bpftool: add support for func types") Fixes: 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs") Fixes: d56354dc4909 ("perf tools: Save bpf_prog_info and BTF of new BPF programs") Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command") Fixes: fa853c4b839e ("perf stat: Enable counting events for BPF programs") Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210729162028.29512-5-quentin@isovalent.com (cherry picked from commit 369e955b3d1c12f6ec2e51a95911bb80ada55d79) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/prog.c | 5 ++++- tools/perf/util/bpf-event.c | 4 ++-- tools/perf/util/bpf_counter.c | 3 ++- tools/testing/selftests/bpf/prog_tests/btf.c | 1 + 4 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 4187b5c9397a..e850c5d909c0 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -783,6 +783,8 @@ prog_dump(struct bpf_prog_info *info, enum dump_mode mode, kernel_syms_destroy(&dd); }
+ btf__free(btf); + return 0; }
@@ -1976,8 +1978,8 @@ static char *profile_target_name(int tgt_fd) struct bpf_prog_info_linear *info_linear; struct bpf_func_info *func_info; const struct btf_type *t; + struct btf *btf = NULL; char *name = NULL; - struct btf *btf;
info_linear = bpf_program__get_prog_info_linear( tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO); @@ -2001,6 +2003,7 @@ static char *profile_target_name(int tgt_fd) } name = strdup(btf__name_by_offset(btf, t->name_off)); out: + btf__free(btf); free(info_linear); return name; } diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 4eb02762104b..40ef2e61ceb5 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -293,7 +293,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
out: free(info_linear); - free(btf); + btf__free(btf); return err ? -1 : 0; }
@@ -483,7 +483,7 @@ static void perf_env__add_bpf_info(struct perf_env *env, u32 id) perf_env__fetch_btf(env, btf_id, btf);
out: - free(btf); + btf__free(btf); close(fd); }
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c index 04f89120b323..df98270932a6 100644 --- a/tools/perf/util/bpf_counter.c +++ b/tools/perf/util/bpf_counter.c @@ -63,8 +63,8 @@ static char *bpf_target_prog_name(int tgt_fd) struct bpf_prog_info_linear *info_linear; struct bpf_func_info *func_info; const struct btf_type *t; + struct btf *btf = NULL; char *name = NULL; - struct btf *btf;
info_linear = bpf_program__get_prog_info_linear( tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO); @@ -88,6 +88,7 @@ static char *bpf_target_prog_name(int tgt_fd) } name = strdup(btf__name_by_offset(btf, t->name_off)); out: + btf__free(btf); free(info_linear); return name; } diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 055d2c0486ed..9ca65b171e61 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -4254,6 +4254,7 @@ static void do_test_file(unsigned int test_num) fprintf(stderr, "OK");
done: + btf__free(btf); free(func_info); bpf_object__close(obj); }
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 86f4b7f2578f69284fa782be54e700c42c757897 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Replace the calls to function btf__get_from_id(), which we plan to deprecate before the library reaches v1.0, with calls to btf__load_from_kernel_by_id() in tools/ (bpftool, perf, selftests). Update the surrounding code accordingly (instead of passing a pointer to the btf struct, get it as a return value from the function).
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210729162028.29512-6-quentin@isovalent.com (cherry picked from commit 86f4b7f2578f69284fa782be54e700c42c757897) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 8 ++----- tools/bpf/bpftool/btf_dumper.c | 6 +++-- tools/bpf/bpftool/map.c | 14 ++++++------ tools/bpf/bpftool/prog.c | 24 +++++++++++++------- tools/perf/util/bpf-event.c | 7 +++--- tools/perf/util/bpf_counter.c | 9 ++++++-- tools/testing/selftests/bpf/prog_tests/btf.c | 3 ++- 7 files changed, 42 insertions(+), 29 deletions(-)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 1326fff3629b..e58599eb8f4e 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -560,16 +560,12 @@ static int do_dump(int argc, char **argv) }
if (!btf) { - err = btf__get_from_id(btf_id, &btf); + btf = btf__load_from_kernel_by_id(btf_id); + err = libbpf_get_error(btf); if (err) { p_err("get btf by id (%u): %s", btf_id, strerror(err)); goto done; } - if (!btf) { - err = -ENOENT; - p_err("can't find btf with ID (%u)", btf_id); - goto done; - } }
if (dump_c) { diff --git a/tools/bpf/bpftool/btf_dumper.c b/tools/bpf/bpftool/btf_dumper.c index 0e9310727281..922c46d042f1 100644 --- a/tools/bpf/bpftool/btf_dumper.c +++ b/tools/bpf/bpftool/btf_dumper.c @@ -64,8 +64,10 @@ static int dump_prog_id_as_func_ptr(const struct btf_dumper *d, } info = &prog_info->info;
- if (!info->btf_id || !info->nr_func_info || - btf__get_from_id(info->btf_id, &prog_btf)) + if (!info->btf_id || !info->nr_func_info) + goto print; + prog_btf = btf__load_from_kernel_by_id(info->btf_id); + if (libbpf_get_error(prog_btf)) goto print; finfo = u64_to_ptr(info->func_info); func_type = btf__type_by_id(prog_btf, finfo->type_id); diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c index ce6faf1b90e8..a061df6e7a14 100644 --- a/tools/bpf/bpftool/map.c +++ b/tools/bpf/bpftool/map.c @@ -806,10 +806,11 @@ static struct btf *get_map_kv_btf(const struct bpf_map_info *info) } else if (info->btf_value_type_id) { int err;
- err = btf__get_from_id(info->btf_id, &btf); - if (err || !btf) { + btf = btf__load_from_kernel_by_id(info->btf_id); + err = libbpf_get_error(btf); + if (err) { p_err("failed to get btf"); - btf = err ? ERR_PTR(err) : ERR_PTR(-ESRCH); + btf = ERR_PTR(err); } }
@@ -1038,11 +1039,10 @@ static void print_key_value(struct bpf_map_info *info, void *key, void *value) { json_writer_t *btf_wtr; - struct btf *btf = NULL; - int err; + struct btf *btf;
- err = btf__get_from_id(info->btf_id, &btf); - if (err) { + btf = btf__load_from_kernel_by_id(info->btf_id); + if (libbpf_get_error(btf)) { p_err("failed to get btf"); return; } diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index e850c5d909c0..6c80bc729482 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -249,10 +249,10 @@ static void show_prog_metadata(int fd, __u32 num_maps) struct bpf_map_info map_info; struct btf_var_secinfo *vsi; bool printed_header = false; - struct btf *btf = NULL; unsigned int i, vlen; void *value = NULL; const char *name; + struct btf *btf; int err;
if (!num_maps) @@ -263,8 +263,8 @@ static void show_prog_metadata(int fd, __u32 num_maps) if (!value) return;
- err = btf__get_from_id(map_info.btf_id, &btf); - if (err || !btf) + btf = btf__load_from_kernel_by_id(map_info.btf_id); + if (libbpf_get_error(btf)) goto out_free;
t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id); @@ -648,9 +648,12 @@ prog_dump(struct bpf_prog_info *info, enum dump_mode mode, member_len = info->xlated_prog_len; }
- if (info->btf_id && btf__get_from_id(info->btf_id, &btf)) { - p_err("failed to get btf"); - return -1; + if (info->btf_id) { + btf = btf__load_from_kernel_by_id(info->btf_id); + if (libbpf_get_error(btf)) { + p_err("failed to get btf"); + return -1; + } }
func_info = u64_to_ptr(info->func_info); @@ -1988,12 +1991,17 @@ static char *profile_target_name(int tgt_fd) return NULL; }
- if (info_linear->info.btf_id == 0 || - btf__get_from_id(info_linear->info.btf_id, &btf)) { + if (info_linear->info.btf_id == 0) { p_err("prog FD %d doesn't have valid btf", tgt_fd); goto out; }
+ btf = btf__load_from_kernel_by_id(info_linear->info.btf_id); + if (libbpf_get_error(btf)) { + p_err("failed to load btf for prog FD %d", tgt_fd); + goto out; + } + func_info = u64_to_ptr(info_linear->info.func_info); t = btf__type_by_id(btf, func_info[0].type_id); if (!t) { diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 40ef2e61ceb5..81f8bde75a80 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -220,10 +220,10 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session, err = -1; goto out; } - if (btf__get_from_id(info->btf_id, &btf)) { + btf = btf__load_from_kernel_by_id(info->btf_id); + if (libbpf_get_error(btf)) { pr_debug("%s: failed to get BTF of id %u, aborting\n", __func__, info->btf_id); err = -1; - btf = NULL; goto out; } perf_env__fetch_btf(env, info->btf_id, btf); @@ -475,7 +475,8 @@ static void perf_env__add_bpf_info(struct perf_env *env, u32 id) if (btf_id == 0) goto out;
- if (btf__get_from_id(btf_id, &btf)) { + btf = btf__load_from_kernel_by_id(btf_id); + if (libbpf_get_error(btf)) { pr_debug("%s: failed to get BTF of id %u, aborting\n", __func__, btf_id); goto out; diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c index df98270932a6..6cb80a7cd362 100644 --- a/tools/perf/util/bpf_counter.c +++ b/tools/perf/util/bpf_counter.c @@ -73,12 +73,17 @@ static char *bpf_target_prog_name(int tgt_fd) return NULL; }
- if (info_linear->info.btf_id == 0 || - btf__get_from_id(info_linear->info.btf_id, &btf)) { + if (info_linear->info.btf_id == 0) { pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd); goto out; }
+ btf = btf__load_from_kernel_by_id(info_linear->info.btf_id); + if (libbpf_get_error(btf)) { + pr_debug("failed to load btf for prog FD %d\n", tgt_fd); + goto out; + } + func_info = u64_to_ptr(info_linear->info.func_info); t = btf__type_by_id(btf, func_info[0].type_id); if (!t) { diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 9ca65b171e61..01befb614ee9 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -4218,7 +4218,8 @@ static void do_test_file(unsigned int test_num) goto done; }
- err = btf__get_from_id(info.btf_id, &btf); + btf = btf__load_from_kernel_by_id(info.btf_id); + err = libbpf_get_error(btf); if (CHECK(err, "cannot get btf from kernel, err: %d", err)) goto done;
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 61fc51b1d3e5915e356f2c0b67cd3bb13b640413 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a new API function btf__load_from_kernel_by_id_split(), which takes a pointer to a base BTF object in order to support split BTF objects when retrieving BTF information from the kernel.
Reference: https://github.com/libbpf/libbpf/issues/314
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210729162028.29512-8-quentin@isovalent.com (cherry picked from commit 61fc51b1d3e5915e356f2c0b67cd3bb13b640413) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 9 +++++++-- tools/lib/bpf/btf.h | 1 + tools/lib/bpf/libbpf.map | 1 + 3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 4ceaa79c8405..43214070a80a 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1388,7 +1388,7 @@ struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) return btf; }
-struct btf *btf__load_from_kernel_by_id(__u32 id) +struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf) { struct btf *btf; int btf_fd; @@ -1397,12 +1397,17 @@ struct btf *btf__load_from_kernel_by_id(__u32 id) if (btf_fd < 0) return libbpf_err_ptr(-errno);
- btf = btf_get_from_fd(btf_fd, NULL); + btf = btf_get_from_fd(btf_fd, base_btf); close(btf_fd);
return libbpf_ptr(btf); }
+struct btf *btf__load_from_kernel_by_id(__u32 id) +{ + return btf__load_from_kernel_by_id_split(id, NULL); +} + int btf__get_from_id(__u32 id, struct btf **btf) { struct btf *res; diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 5d955329a1f4..596a42c8f4f5 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -45,6 +45,7 @@ LIBBPF_API struct btf *btf__parse_raw(const char *path); LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id); +LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf); LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 454a8202dc57..00a381ab9f11 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -374,6 +374,7 @@ LIBBPF_0.5.0 { bpf_map_lookup_and_delete_elem_flags; bpf_object__gen_loader; btf__load_from_kernel_by_id; + btf__load_from_kernel_by_id_split; btf__load_into_kernel; btf_dump__dump_type_data; libbpf_set_strict_mode;
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.15-rc1 commit 211ab78f7658b50ea10c4569be63ca5009fd39b4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Split BTF objects are typically BTF objects for kernel modules, which are incrementally built on top of kernel BTF instead of redefining all kernel symbols they need. We can use bpftool with its -B command-line option to dump split BTF objects. It works well when the handle provided for the BTF object to dump is a "path" to the BTF object, typically under /sys/kernel/btf, because bpftool internally calls btf__parse_split() which can take a "base_btf" pointer and resolve the BTF reconstruction (although in that case, the "-B" option is unnecessary because bpftool performs autodetection).
However, it did not work so far when passing the BTF object through its id, because bpftool would call btf__get_from_id() which did not provide a way to pass a "base_btf" pointer.
In other words, the following works:
# bpftool btf dump file /sys/kernel/btf/i2c_smbus -B /sys/kernel/btf/vmlinux
But this was not possible:
# bpftool btf dump id 6 -B /sys/kernel/btf/vmlinux
The libbpf API has recently changed, and btf__get_from_id() has been deprecated in favour of btf__load_from_kernel_by_id() and its version with support for split BTF, btf__load_from_kernel_by_id_split(). Let's update bpftool to make it able to dump the BTF object in the second case as well.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210729162028.29512-9-quentin@isovalent.com (cherry picked from commit 211ab78f7658b50ea10c4569be63ca5009fd39b4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index e58599eb8f4e..0ba562625b02 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -560,7 +560,7 @@ static int do_dump(int argc, char **argv) }
if (!btf) { - btf = btf__load_from_kernel_by_id(btf_id); + btf = btf__load_from_kernel_by_id_split(btf_id, base_btf); err = libbpf_get_error(btf); if (err) { p_err("get btf by id (%u): %s", btf_id, strerror(err));
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.15-rc1 commit a710eed386f182fcbfe517b659f60024fdb7c40c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add two new APIs: btf__load_vmlinux_btf and btf__load_module_btf. btf__load_vmlinux_btf is just an alias to the existing API named libbpf_find_kernel_btf, rename to be more precisely and consistent with existing BTF APIs. btf__load_module_btf can be used to load module BTF, add it for completeness. These two APIs are useful for implementing tracing tools and introspection tools. This is part of the effort towards libbpf 1.0 ([0]).
[0] Closes: https://github.com/libbpf/libbpf/issues/280
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210730114012.494408-1-hengqi.chen@gmail.com (cherry picked from commit a710eed386f182fcbfe517b659f60024fdb7c40c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 14 ++++++++++++-- tools/lib/bpf/btf.h | 6 ++++-- tools/lib/bpf/libbpf.c | 4 ++-- tools/lib/bpf/libbpf.map | 2 ++ 4 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 43214070a80a..aab7e4ece0a0 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -4041,7 +4041,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d) */ if (d->hypot_adjust_canon) continue; - + if (t_kind == BTF_KIND_FWD && c_kind != BTF_KIND_FWD) d->map[t_id] = c_id;
@@ -4414,7 +4414,7 @@ static int btf_dedup_remap_types(struct btf_dedup *d) * Probe few well-known locations for vmlinux kernel image and try to load BTF * data out of it to use for target BTF. */ -struct btf *libbpf_find_kernel_btf(void) +struct btf *btf__load_vmlinux_btf(void) { struct { const char *path_fmt; @@ -4460,6 +4460,16 @@ struct btf *libbpf_find_kernel_btf(void) return libbpf_err_ptr(-ESRCH); }
+struct btf *libbpf_find_kernel_btf(void) __attribute__((alias("btf__load_vmlinux_btf"))); + +struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf) +{ + char path[80]; + + snprintf(path, sizeof(path), "/sys/kernel/btf/%s", module_name); + return btf__parse_split(path, vmlinux_btf); +} + int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx) { int i, n, err; diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 596a42c8f4f5..4a711f990904 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -44,6 +44,10 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b LIBBPF_API struct btf *btf__parse_raw(const char *path); LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
+LIBBPF_API struct btf *btf__load_vmlinux_btf(void); +LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf); +LIBBPF_API struct btf *libbpf_find_kernel_btf(void); + LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id); LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf); LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf); @@ -93,8 +97,6 @@ int btf_ext__reloc_line_info(const struct btf *btf, LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext); LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
-LIBBPF_API struct btf *libbpf_find_kernel_btf(void); - LIBBPF_API int btf__find_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_type(struct btf *btf, const struct btf *src_btf, diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 32868fbaf653..7decd67eafc3 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2681,7 +2681,7 @@ static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force) if (!force && !obj_needs_vmlinux_btf(obj)) return 0;
- obj->btf_vmlinux = libbpf_find_kernel_btf(); + obj->btf_vmlinux = btf__load_vmlinux_btf(); err = libbpf_get_error(obj->btf_vmlinux); if (err) { pr_warn("Error loading vmlinux BTF: %d\n", err); @@ -8304,7 +8304,7 @@ int libbpf_find_vmlinux_btf_id(const char *name, struct btf *btf; int err;
- btf = libbpf_find_kernel_btf(); + btf = btf__load_vmlinux_btf(); err = libbpf_get_error(btf); if (err) { pr_warn("vmlinux BTF is not found\n"); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 00a381ab9f11..93319760e3a3 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -376,6 +376,8 @@ LIBBPF_0.5.0 { btf__load_from_kernel_by_id; btf__load_from_kernel_by_id_split; btf__load_into_kernel; + btf__load_module_btf; + btf__load_vmlinux_btf; btf_dump__dump_type_data; libbpf_set_strict_mode; } LIBBPF_0.4.0;
From: Hao Luo haoluo@google.com
mainline inclusion from mainline-5.15-rc1 commit 2211c825e7b6b99bbcabab4e0130a2779275dcc3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Currently weak typeless ksyms have default value zero, when they don't exist in the kernel. However, weak typed ksyms are rejected by libbpf if they can not be resolved. This means that if a bpf object contains the declaration of a nonexistent weak typed ksym, it will be rejected even if there is no program that references the symbol.
Nonexistent weak typed ksyms can also default to zero just like typeless ones. This allows programs that access weak typed ksyms to be accepted by verifier, if the accesses are guarded. For example,
extern const int bpf_link_fops3 __ksym __weak;
/* then in BPF program */
if (&bpf_link_fops3) { /* use bpf_link_fops3 */ }
If actual use of nonexistent typed ksym is not guarded properly, verifier would see that register is not PTR_TO_BTF_ID and wouldn't allow to use it for direct memory reads or passing it to BPF helpers.
Signed-off-by: Hao Luo haoluo@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210812003819.2439037-1-haoluo@google.com (cherry picked from commit 2211c825e7b6b99bbcabab4e0130a2779275dcc3) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 16 +++--- .../selftests/bpf/prog_tests/ksyms_btf.c | 31 ++++++++++ .../selftests/bpf/progs/test_ksyms_weak.c | 56 +++++++++++++++++++ 3 files changed, 96 insertions(+), 7 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_weak.c
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 7decd67eafc3..1781d0edcdf0 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5278,11 +5278,11 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) } insn[1].imm = ext->kcfg.data_off; } else /* EXT_KSYM */ { - if (ext->ksym.type_id) { /* typed ksyms */ + if (ext->ksym.type_id && ext->is_set) { /* typed ksyms */ insn[0].src_reg = BPF_PSEUDO_BTF_ID; insn[0].imm = ext->ksym.kernel_btf_id; insn[1].imm = ext->ksym.kernel_btf_obj_fd; - } else { /* typeless ksyms */ + } else { /* typeless ksyms or unresolved typed ksyms */ insn[0].imm = (__u32)ext->ksym.addr; insn[1].imm = ext->ksym.addr >> 32; } @@ -6610,11 +6610,8 @@ static int find_ksym_btf_id(struct bpf_object *obj, const char *ksym_name, break; } } - if (id <= 0) { - pr_warn("extern (%s ksym) '%s': failed to find BTF ID in kernel BTF(s).\n", - __btf_kind_str(kind), ksym_name); + if (id <= 0) return -ESRCH; - }
*res_btf = btf; *res_btf_fd = btf_fd; @@ -6631,8 +6628,13 @@ static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj, struct btf *btf = NULL;
id = find_ksym_btf_id(obj, ext->name, BTF_KIND_VAR, &btf, &btf_fd); - if (id < 0) + if (id == -ESRCH && ext->is_weak) { + return 0; + } else if (id < 0) { + pr_warn("extern (var ksym) '%s': not found in kernel BTF\n", + ext->name); return id; + }
/* find local type_id */ local_type_id = ext->ksym.type_id; diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c index 97f38d4f6a26..0d36eccd690a 100644 --- a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c @@ -6,6 +6,7 @@ #include <bpf/btf.h> #include "test_ksyms_btf.skel.h" #include "test_ksyms_btf_null_check.skel.h" +#include "test_ksyms_weak.skel.h" #include "test_ksyms_btf_write_check.skel.h"
static int duration; @@ -82,6 +83,33 @@ static void test_null_check(void) test_ksyms_btf_null_check__destroy(skel); }
+static void test_weak_syms(void) +{ + struct test_ksyms_weak *skel; + struct test_ksyms_weak__data *data; + int err; + + skel = test_ksyms_weak__open_and_load(); + if (CHECK(!skel, "test_ksyms_weak__open_and_load", "failed\n")) + return; + + err = test_ksyms_weak__attach(skel); + if (CHECK(err, "test_ksyms_weak__attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* trigger tracepoint */ + usleep(1); + + data = skel->data; + ASSERT_EQ(data->out__existing_typed, 0, "existing typed ksym"); + ASSERT_NEQ(data->out__existing_typeless, -1, "existing typeless ksym"); + ASSERT_EQ(data->out__non_existent_typeless, 0, "nonexistent typeless ksym"); + ASSERT_EQ(data->out__non_existent_typed, 0, "nonexistent typed ksym"); + +cleanup: + test_ksyms_weak__destroy(skel); +} + static void test_write_check(void) { struct test_ksyms_btf_write_check *skel; @@ -118,6 +146,9 @@ void test_ksyms_btf(void) if (test__start_subtest("null_check")) test_null_check();
+ if (test__start_subtest("weak_ksyms")) + test_weak_syms(); + if (test__start_subtest("write_check")) test_write_check(); } diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_weak.c b/tools/testing/selftests/bpf/progs/test_ksyms_weak.c new file mode 100644 index 000000000000..5f8379aadb29 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_ksyms_weak.c @@ -0,0 +1,56 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Test weak ksyms. + * + * Copyright (c) 2021 Google + */ + +#include "vmlinux.h" + +#include <bpf/bpf_helpers.h> + +int out__existing_typed = -1; +__u64 out__existing_typeless = -1; + +__u64 out__non_existent_typeless = -1; +__u64 out__non_existent_typed = -1; + +/* existing weak symbols */ + +/* test existing weak symbols can be resolved. */ +extern const struct rq runqueues __ksym __weak; /* typed */ +extern const void bpf_prog_active __ksym __weak; /* typeless */ + + +/* non-existent weak symbols. */ + +/* typeless symbols, default to zero. */ +extern const void bpf_link_fops1 __ksym __weak; + +/* typed symbols, default to zero. */ +extern const int bpf_link_fops2 __ksym __weak; + +SEC("raw_tp/sys_enter") +int pass_handler(const void *ctx) +{ + struct rq *rq; + + /* tests existing symbols. */ + rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, 0); + if (rq) + out__existing_typed = rq->cpu; + out__existing_typeless = (__u64)&bpf_prog_active; + + /* tests non-existent symbols. */ + out__non_existent_typeless = (__u64)&bpf_link_fops1; + + /* tests non-existent symbols. */ + out__non_existent_typed = (__u64)&bpf_link_fops2; + + if (&bpf_link_fops2) /* can't happen */ + out__non_existent_typed = (__u64)bpf_per_cpu_ptr(&bpf_link_fops2, 0); + + return 0; +} + +char _license[] SEC("license") = "GPL";
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.15-rc4 commit bcfd367c2839f2126c048fe59700ec1b538e2b06 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When a BPF object is compiled without BTF info (without -g), trying to link such objects using bpftool causes a SIGSEGV due to btf__get_nr_types accessing obj->btf which is NULL. Fix this by checking for the NULL pointer, and return error.
Reproducer: $ cat a.bpf.c extern int foo(void); int bar(void) { return foo(); } $ cat b.bpf.c int foo(void) { return 0; } $ clang -O2 -target bpf -c a.bpf.c $ clang -O2 -target bpf -c b.bpf.c $ bpftool gen obj out a.bpf.o b.bpf.o Segmentation fault (core dumped)
After fix: $ bpftool gen obj out a.bpf.o b.bpf.o libbpf: failed to find BTF info for object 'a.bpf.o' Error: failed to link 'a.bpf.o': Unknown error -22 (-22)
Fixes: a46349227cd8 (libbpf: Add linker extern resolution support for functions and global variables) Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com (cherry picked from commit bcfd367c2839f2126c048fe59700ec1b538e2b06) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 0c41468b23bc..f638ba372a9b 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1649,11 +1649,17 @@ static bool btf_is_non_static(const struct btf_type *t) static int find_glob_sym_btf(struct src_obj *obj, Elf64_Sym *sym, const char *sym_name, int *out_btf_sec_id, int *out_btf_id) { - int i, j, n = btf__get_nr_types(obj->btf), m, btf_id = 0; + int i, j, n, m, btf_id = 0; const struct btf_type *t; const struct btf_var_secinfo *vi; const char *name;
+ if (!obj->btf) { + pr_warn("failed to find BTF info for object '%s'\n", obj->filename); + return -EINVAL; + } + + n = btf__get_nr_types(obj->btf); for (i = 1; i <= n; i++) { t = btf__type_by_id(obj->btf, i);
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.15-rc5 commit 4729445b47efebf089da4ccbcd1b116ffa2ad4af category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When fed an empty BPF object, bpftool gen skeleton -L crashes at btf__set_fd() since it assumes presence of obj->btf, however for the sequence below clang adds no .BTF section (hence no BTF).
Reproducer:
$ touch a.bpf.c $ clang -O2 -g -target bpf -c a.bpf.c $ bpftool gen skeleton -L a.bpf.o /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ /* THIS FILE IS AUTOGENERATED! */
struct a_bpf { struct bpf_loader_ctx ctx; Segmentation fault (core dumped)
The same occurs for files compiled without BTF info, i.e. without clang's -g flag.
Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.) Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com (cherry picked from commit 4729445b47efebf089da4ccbcd1b116ffa2ad4af) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 1781d0edcdf0..fa45f5643c9b 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6860,7 +6860,8 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
if (obj->gen_loader) { /* reset FDs */ - btf__set_fd(obj->btf, -1); + if (obj->btf) + btf__set_fd(obj->btf, -1); for (i = 0; i < obj->nr_maps; i++) obj->maps[i].fd = -1; if (!err)
From: Toke Høiland-Jørgensen toke@redhat.com
mainline inclusion from mainline-5.16-rc1 commit 03e601f48b2da6fb44d0f7b86957a8f6bacfb347 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
If libbpf encounters an ELF file that has been stripped of its symbol table, it will crash in bpf_object__add_programs() when trying to dereference the obj->efile.symbols pointer.
Fix this by erroring out of bpf_object__elf_collect() if it is not able able to find the symbol table.
v2: - Move check into bpf_object__elf_collect() and add nice error message
Fixes: 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com (cherry picked from commit 03e601f48b2da6fb44d0f7b86957a8f6bacfb347) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index fa45f5643c9b..4d182a9313f8 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2992,6 +2992,12 @@ static int bpf_object__elf_collect(struct bpf_object *obj) } }
+ if (!obj->efile.symbols) { + pr_warn("elf: couldn't find symbol table in %s, stripped object file?\n", + obj->path); + return -ENOENT; + } + scn = NULL; while ((scn = elf_nextscn(elf, scn)) != NULL) { idx++;
From: Matt Smith alastorze@fb.com
mainline inclusion from mainline-5.16-rc1 commit 08a6f22ef6f843d0ea7252087787b5ab04610bec category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This change was necessary to enforce the implied contract that bpf_object_skeleton->data should not be mutated. The data will be cast to `void *` during assignment to handle the case where a user is compiling with older libbpf headers to avoid a compiler warning of `const void *` data being cast to `void *`
Signed-off-by: Matt Smith alastorze@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210901194439.3853238-2-alastorze@fb.com (cherry picked from commit 08a6f22ef6f843d0ea7252087787b5ab04610bec) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index f76d640d0fa0..12b9d15911dc 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -790,7 +790,7 @@ struct bpf_object_skeleton { size_t sz; /* size of this struct, for forward/backward compatibility */
const char *name; - void *data; + const void *data; size_t data_sz;
struct bpf_object **obj;
From: Matt Smith alastorze@fb.com
mainline inclusion from mainline-5.16-rc1 commit a6cc6b34b93e3660149a7cb947be98a9b239ffce category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This adds a skeleton method X__elf_bytes() which returns the binary data of the compiled and embedded BPF object file. It additionally sets the size of the return data to the provided size_t pointer argument.
The assignment to s->data is cast to void * to ensure no warning is issued if compiled with a previous version of libbpf where the bpf_object_skeleton field is void * instead of const void *
Signed-off-by: Matt Smith alastorze@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210901194439.3853238-3-alastorze@fb.com (cherry picked from commit a6cc6b34b93e3660149a7cb947be98a9b239ffce) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 1d71ff8c52fa..abd85bf11ce5 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -238,8 +238,8 @@ static void codegen(const char *template, ...) } else if (c == '\n') { break; } else { - p_err("unrecognized character at pos %td in template '%s'", - src - template - 1, template); + p_err("unrecognized character at pos %td in template '%s': '%c'", + src - template - 1, template, c); free(s); exit(-1); } @@ -406,7 +406,7 @@ static void codegen_destroy(struct bpf_object *obj, const char *obj_name) }
bpf_object__for_each_map(map, obj) { - const char * ident; + const char *ident;
ident = get_map_ident(map); if (!ident) @@ -862,6 +862,8 @@ static int do_skeleton(int argc, char **argv) codegen("\ \n\ \n\ + static inline const void *%1$s__elf_bytes(size_t *sz); \n\ + \n\ static inline int \n\ %1$s__create_skeleton(struct %1$s *obj) \n\ { \n\ @@ -943,10 +945,20 @@ static int do_skeleton(int argc, char **argv) codegen("\ \n\ \n\ - s->data_sz = %d; \n\ - s->data = (void *)"\ \n\ - ", - file_sz); + s->data = (void *)%2$s__elf_bytes(&s->data_sz); \n\ + \n\ + return 0; \n\ + err: \n\ + bpf_object__destroy_skeleton(s); \n\ + return -ENOMEM; \n\ + } \n\ + \n\ + static inline const void *%2$s__elf_bytes(size_t *sz) \n\ + { \n\ + *sz = %1$d; \n\ + return (const void *)"\ \n\ + " + , file_sz, obj_name);
/* embed contents of BPF object file */ print_hex(obj_data, file_sz); @@ -954,11 +966,6 @@ static int do_skeleton(int argc, char **argv) codegen("\ \n\ "; \n\ - \n\ - return 0; \n\ - err: \n\ - bpf_object__destroy_skeleton(s); \n\ - return -ENOMEM; \n\ } \n\ \n\ #endif /* %s */ \n\
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.13-rc1 commit 7a2fa70aaffc2f8823feca22709a00f5c069a8a9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add ASSERT_TRUE/ASSERT_FALSE for conditions calculated with custom logic to true/false. Also add remaining arithmetical assertions: - ASSERT_LE -- less than or equal; - ASSERT_GT -- greater than; - ASSERT_GE -- greater than or equal. This should cover most scenarios where people fall back to error-prone CHECK()s.
Also extend ASSERT_ERR() to print out errno, in addition to direct error.
Also convert few CHECK() instances to ensure new ASSERT_xxx() variants work as expected. Subsequent patch will also use ASSERT_TRUE/ASSERT_FALSE more extensively.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Lorenz Bauer lmb@cloudflare.com Link: https://lore.kernel.org/bpf/20210426192949.416837-2-andrii@kernel.org (cherry picked from commit 7a2fa70aaffc2f8823feca22709a00f5c069a8a9) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/btf_dump.c | 2 +- .../selftests/bpf/prog_tests/btf_endian.c | 4 +- .../selftests/bpf/prog_tests/cgroup_link.c | 2 +- .../selftests/bpf/prog_tests/kfree_skb.c | 2 +- .../selftests/bpf/prog_tests/resolve_btfids.c | 7 +-- .../selftests/bpf/prog_tests/snprintf_btf.c | 4 +- tools/testing/selftests/bpf/test_progs.h | 50 ++++++++++++++++++- 7 files changed, 56 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c index c60091ee8a21..5e129dc2073c 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c @@ -77,7 +77,7 @@ static int test_btf_dump_case(int n, struct btf_dump_test_case *t)
snprintf(out_file, sizeof(out_file), "/tmp/%s.output.XXXXXX", t->file); fd = mkstemp(out_file); - if (CHECK(fd < 0, "create_tmp", "failed to create file: %d\n", fd)) { + if (!ASSERT_GE(fd, 0, "create_tmp")) { err = fd; goto done; } diff --git a/tools/testing/selftests/bpf/prog_tests/btf_endian.c b/tools/testing/selftests/bpf/prog_tests/btf_endian.c index 8c52d72c876e..8ab5d3e358dd 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_endian.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_endian.c @@ -6,8 +6,6 @@ #include <test_progs.h> #include <bpf/btf.h>
-static int duration = 0; - void test_btf_endian() { #if __BYTE_ORDER == __LITTLE_ENDIAN enum btf_endianness endian = BTF_LITTLE_ENDIAN; @@ -71,7 +69,7 @@ void test_btf_endian() {
/* now modify original BTF */ var_id = btf__add_var(btf, "some_var", BTF_VAR_GLOBAL_ALLOCATED, 1); - CHECK(var_id <= 0, "var_id", "failed %d\n", var_id); + ASSERT_GT(var_id, 0, "var_id");
btf__free(swap_btf); swap_btf = NULL; diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_link.c b/tools/testing/selftests/bpf/prog_tests/cgroup_link.c index 4d9b514b3fd9..736796e56ed1 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgroup_link.c +++ b/tools/testing/selftests/bpf/prog_tests/cgroup_link.c @@ -54,7 +54,7 @@ void test_cgroup_link(void)
for (i = 0; i < cg_nr; i++) { cgs[i].fd = create_and_get_cgroup(cgs[i].path); - if (CHECK(cgs[i].fd < 0, "cg_create", "fail: %d\n", cgs[i].fd)) + if (!ASSERT_GE(cgs[i].fd, 0, "cg_create")) goto cleanup; }
diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c index 42c3a3103c26..d65107919998 100644 --- a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c +++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c @@ -134,7 +134,7 @@ void test_kfree_skb(void) /* make sure kfree_skb program was triggered * and it sent expected skb into ring buffer */ - CHECK_FAIL(!passed); + ASSERT_TRUE(passed, "passed");
err = bpf_map_lookup_elem(bpf_map__fd(global_data), &zero, test_ok); if (CHECK(err, "get_result", diff --git a/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c b/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c index 6ace5e9efec1..d3c2de2c24d1 100644 --- a/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c +++ b/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c @@ -160,11 +160,8 @@ int test_resolve_btfids(void) break;
if (i > 0) { - ret = CHECK(test_set.ids[i - 1] > test_set.ids[i], - "sort_check", - "test_set is not sorted\n"); - if (ret) - break; + if (!ASSERT_LE(test_set.ids[i - 1], test_set.ids[i], "sort_check")) + return -1; } }
diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c index 686b40f11a45..76e1f5fe18fa 100644 --- a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -42,9 +42,7 @@ void test_snprintf_btf(void) * and it set expected return values from bpf_trace_printk()s * and all tests ran. */ - if (CHECK(bss->ret <= 0, - "bpf_snprintf_btf: got return value", - "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + if (!ASSERT_GT(bss->ret, 0, "bpf_snprintf_ret")) goto cleanup;
if (CHECK(bss->ran_subtests == 0, "check if subtests ran", diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index 5daf25f83f0b..a2f3dd14b313 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -131,6 +131,20 @@ extern int test__join_cgroup(const char *path); #define CHECK_ATTR(condition, tag, format...) \ _CHECK(condition, tag, tattr.duration, format)
+#define ASSERT_TRUE(actual, name) ({ \ + static int duration = 0; \ + bool ___ok = (actual); \ + CHECK(!___ok, (name), "unexpected %s: got FALSE\n", (name)); \ + ___ok; \ +}) + +#define ASSERT_FALSE(actual, name) ({ \ + static int duration = 0; \ + bool ___ok = !(actual); \ + CHECK(!___ok, (name), "unexpected %s: got TRUE\n", (name)); \ + ___ok; \ +}) + #define ASSERT_EQ(actual, expected, name) ({ \ static int duration = 0; \ typeof(actual) ___act = (actual); \ @@ -153,6 +167,39 @@ extern int test__join_cgroup(const char *path); ___ok; \ })
+#define ASSERT_LE(actual, expected, name) ({ \ + static int duration = 0; \ + typeof(actual) ___act = (actual); \ + typeof(expected) ___exp = (expected); \ + bool ___ok = ___act <= ___exp; \ + CHECK(!___ok, (name), \ + "unexpected %s: actual %lld > expected %lld\n", \ + (name), (long long)(___act), (long long)(___exp)); \ + ___ok; \ +}) + +#define ASSERT_GT(actual, expected, name) ({ \ + static int duration = 0; \ + typeof(actual) ___act = (actual); \ + typeof(expected) ___exp = (expected); \ + bool ___ok = ___act > ___exp; \ + CHECK(!___ok, (name), \ + "unexpected %s: actual %lld <= expected %lld\n", \ + (name), (long long)(___act), (long long)(___exp)); \ + ___ok; \ +}) + +#define ASSERT_GE(actual, expected, name) ({ \ + static int duration = 0; \ + typeof(actual) ___act = (actual); \ + typeof(expected) ___exp = (expected); \ + bool ___ok = ___act >= ___exp; \ + CHECK(!___ok, (name), \ + "unexpected %s: actual %lld < expected %lld\n", \ + (name), (long long)(___act), (long long)(___exp)); \ + ___ok; \ +}) + #define ASSERT_STREQ(actual, expected, name) ({ \ static int duration = 0; \ const char *___act = actual; \ @@ -168,7 +215,8 @@ extern int test__join_cgroup(const char *path); static int duration = 0; \ long long ___res = (res); \ bool ___ok = ___res == 0; \ - CHECK(!___ok, (name), "unexpected error: %lld\n", ___res); \ + CHECK(!___ok, (name), "unexpected error: %lld (errno %d)\n", \ + ___res, errno); \ ___ok; \ })
From: Matt Smith alastorze@fb.com
mainline inclusion from mainline-5.16-rc1 commit 980a1a4c342f353a62d64174d0a6a9466a741273 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch adds two checks for the X__elf_bytes BPF skeleton helper method. The first asserts that the pointer returned from the helper method is valid, the second asserts that the provided size pointer is set.
Signed-off-by: Matt Smith alastorze@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210901194439.3853238-4-alastorze@fb.com (cherry picked from commit 980a1a4c342f353a62d64174d0a6a9466a741273) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/skeleton.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c index f6f130c99b8c..fe1e204a65c6 100644 --- a/tools/testing/selftests/bpf/prog_tests/skeleton.c +++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c @@ -18,6 +18,8 @@ void test_skeleton(void) struct test_skeleton__data *data; struct test_skeleton__rodata *rodata; struct test_skeleton__kconfig *kcfg; + const void *elf_bytes; + size_t elf_bytes_sz = 0;
skel = test_skeleton__open(); if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) @@ -91,6 +93,10 @@ void test_skeleton(void) CHECK(bss->kern_ver != kcfg->LINUX_KERNEL_VERSION, "ext2", "got %d != exp %d\n", bss->kern_ver, kcfg->LINUX_KERNEL_VERSION);
+ elf_bytes = test_skeleton__elf_bytes(&elf_bytes_sz); + ASSERT_OK_PTR(elf_bytes, "elf_bytes"); + ASSERT_GE(elf_bytes_sz, 0, "elf_bytes_sz"); + cleanup: test_skeleton__destroy(skel); }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 006a5099fc18a43792b01d002e1e45c5541ea666 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
After updating to binutils 2.35, the build began to fail with an assembler error. A bug was opened on the Red Hat Bugzilla a few days later for the same issue.
Work around the problem by using the new `symver` attribute (introduced in GCC 10) as needed instead of assembler directives.
This addresses Red Hat ([0]) and OpenSUSE ([1]) bug reports, as well as libbpf issue ([2]).
[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1863059 [1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1188749 [2]: Closes: https://github.com/libbpf/libbpf/issues/338
Co-developed-by: Patrick McCarty patrick.mccarty@intel.com Co-developed-by: Michal Suchanek msuchanek@suse.de Signed-off-by: Patrick McCarty patrick.mccarty@intel.com Signed-off-by: Michal Suchanek msuchanek@suse.de Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210907221023.2660953-1-andrii@kernel.org (cherry picked from commit 006a5099fc18a43792b01d002e1e45c5541ea666) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf_internal.h | 25 +++++++++++++++++++------ tools/lib/bpf/xsk.c | 4 ++-- 2 files changed, 21 insertions(+), 8 deletions(-)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index f7b691d5f9eb..8de87eb7c979 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -90,17 +90,30 @@ /* Symbol versioning is different between static and shared library. * Properly versioned symbols are needed for shared library, but * only the symbol of the new version is needed for static library. + * Starting with GNU C 10, use symver attribute instead of .symver assembler + * directive, which works better with GCC LTO builds. */ -#ifdef SHARED -# define COMPAT_VERSION(internal_name, api_name, version) \ +#if defined(SHARED) && defined(__GNUC__) && __GNUC__ >= 10 + +#define DEFAULT_VERSION(internal_name, api_name, version) \ + __attribute__((symver(#api_name "@@" #version))) +#define COMPAT_VERSION(internal_name, api_name, version) \ + __attribute__((symver(#api_name "@" #version))) + +#elif defined(SHARED) + +#define COMPAT_VERSION(internal_name, api_name, version) \ asm(".symver " #internal_name "," #api_name "@" #version); -# define DEFAULT_VERSION(internal_name, api_name, version) \ +#define DEFAULT_VERSION(internal_name, api_name, version) \ asm(".symver " #internal_name "," #api_name "@@" #version); -#else -# define COMPAT_VERSION(internal_name, api_name, version) -# define DEFAULT_VERSION(internal_name, api_name, version) \ + +#else /* !SHARED */ + +#define COMPAT_VERSION(internal_name, api_name, version) +#define DEFAULT_VERSION(internal_name, api_name, version) \ extern typeof(internal_name) api_name \ __attribute__((alias(#internal_name))); + #endif
extern void libbpf_print(enum libbpf_print_level level, diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index c4390ef98b19..f7f2f367aca5 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -273,6 +273,7 @@ static int xsk_create_umem_rings(struct xsk_umem *umem, int fd, return err; }
+DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4) int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area, __u64 size, struct xsk_ring_prod *fill, struct xsk_ring_cons *comp, @@ -337,6 +338,7 @@ struct xsk_umem_config_v1 { __u32 frame_headroom; };
+COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2) int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area, __u64 size, struct xsk_ring_prod *fill, struct xsk_ring_cons *comp, @@ -350,8 +352,6 @@ int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area, return xsk_umem__create_v0_0_4(umem_ptr, umem_area, size, fill, comp, &config); } -COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2) -DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4)
static int xsk_load_xdp_prog(struct xsk_socket *xsk) {
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit 0b46b755056043dccfe078b96b256502b88f2464 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare the deprecation of two API functions. This macro marks functions as deprecated when libbpf's version reaches the values passed as an argument.
As part of this change libbpf_version.h header is added with recorded major (LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros. They are now part of libbpf public API and can be relied upon by user code. libbpf_version.h is installed system-wide along other libbpf public headers.
Due to this new build-time auto-generated header, in-kernel applications relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to include libbpf's output directory as part of a list of include search paths. Better fix would be to use libbpf's make_install target to install public API headers, but that clean up is left out as a future improvement. The build changes were tested by building kernel (with KBUILD_OUTPUT and O= specified explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No problems were detected.
Note that because of the constraints of the C preprocessor we have to write a few lines of macro magic for each version used to prepare deprecation (0.6 for now).
Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of btf__get_from_id() and btf__load(), which are replaced by btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively, starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]).
[0] Closes: https://github.com/libbpf/libbpf/issues/278
Co-developed-by: Quentin Monnet quentin@isovalent.com Co-developed-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org (cherry picked from commit 0b46b755056043dccfe078b96b256502b88f2464) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/preload/Makefile | 7 +++++-- tools/bpf/bpftool/Makefile | 4 ++++ tools/bpf/resolve_btfids/Makefile | 6 ++++-- tools/lib/bpf/Makefile | 24 +++++++++++++++++------- tools/lib/bpf/btf.h | 2 ++ tools/lib/bpf/libbpf_common.h | 19 +++++++++++++++++++ tools/testing/selftests/bpf/Makefile | 4 ++-- 7 files changed, 53 insertions(+), 13 deletions(-)
diff --git a/kernel/bpf/preload/Makefile b/kernel/bpf/preload/Makefile index 1951332dd15f..ac29d4e9a384 100644 --- a/kernel/bpf/preload/Makefile +++ b/kernel/bpf/preload/Makefile @@ -10,12 +10,15 @@ LIBBPF_OUT = $(abspath $(obj)) $(LIBBPF_A): $(Q)$(MAKE) -C $(LIBBPF_SRCS) O=$(LIBBPF_OUT)/ OUTPUT=$(LIBBPF_OUT)/ $(LIBBPF_OUT)/libbpf.a
-userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi \ +userccflags += -I$(LIBBPF_OUT) -I $(srctree)/tools/include/ \ + -I $(srctree)/tools/include/uapi \ -I $(srctree)/tools/lib/ -Wno-unused-result
userprogs := bpf_preload_umd
-clean-files := $(userprogs) bpf_helper_defs.h FEATURE-DUMP.libbpf staticobjs/ feature/ +clean-files := $(userprogs) libbpf_version.h bpf_helper_defs.h FEATURE-DUMP.libbpf staticobjs/ feature/ + +$(obj)/iterators/iterators.o: $(LIBBPF_A)
bpf_preload_umd-objs := iterators/iterators.o bpf_preload_umd-userldlibs := $(LIBBPF_A) -lelf -lz diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index e2073c739410..2304fd6bcccc 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -59,6 +59,7 @@ CFLAGS += -W -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers CFLAGS += $(filter-out -Wswitch-enum -Wnested-externs,$(EXTRA_WARNINGS)) CFLAGS += -DPACKAGE='"bpftool"' -D__EXPORTED_HEADERS__ \ -I$(if $(OUTPUT),$(OUTPUT),.) \ + $(if $(LIBBPF_OUTPUT),-I$(LIBBPF_OUTPUT)) \ -I$(srctree)/kernel/bpf/ \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ @@ -136,7 +137,10 @@ endif BPFTOOL_BOOTSTRAP := $(BOOTSTRAP_OUTPUT)bpftool
BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o disasm.o) +$(BOOTSTRAP_OBJS): $(LIBBPF_BOOTSTRAP) + OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o +$(OBJS): $(LIBBPF)
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile index af9f9d3534c9..ea6e250824ed 100644 --- a/tools/bpf/resolve_btfids/Makefile +++ b/tools/bpf/resolve_btfids/Makefile @@ -30,6 +30,7 @@ LIBBPF_SRC := $(srctree)/tools/lib/bpf/ SUBCMD_SRC := $(srctree)/tools/lib/subcmd/
BPFOBJ := $(OUTPUT)/libbpf/libbpf.a +LIBBPF_OUT := $(abspath $(dir $(BPFOBJ)))/ SUBCMDOBJ := $(OUTPUT)/libsubcmd/libsubcmd.a
BINARY := $(OUTPUT)/resolve_btfids @@ -45,11 +46,12 @@ $(SUBCMDOBJ): fixdep FORCE | $(OUTPUT)/libsubcmd $(Q)$(MAKE) -C $(SUBCMD_SRC) OUTPUT=$(abspath $(dir $@))/ $(abspath $@)
$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(OUTPUT)/libbpf - $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(abspath $(dir $@))/ $(abspath $@) + $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(LIBBPF_OUT) $(abspath $@)
CFLAGS := -g \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ + -I$(LIBBPF_OUT) \ -I$(LIBBPF_SRC) \ -I$(SUBCMD_SRC)
@@ -58,7 +60,7 @@ LIBS = -lelf -lz export srctree OUTPUT CFLAGS Q include $(srctree)/tools/build/Makefile.include
-$(BINARY_IN): fixdep FORCE | $(OUTPUT) +$(BINARY_IN): $(BPFOBJ) fixdep FORCE | $(OUTPUT) $(Q)$(MAKE) $(build)=resolve_btfids
$(BINARY): $(BPFOBJ) $(SUBCMDOBJ) $(BINARY_IN) diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 4e336b30323c..8c026bd6b385 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -8,7 +8,8 @@ VERSION_SCRIPT := libbpf.map LIBBPF_VERSION := $(shell \ grep -oE '^LIBBPF_([0-9.]+)' $(VERSION_SCRIPT) | \ sort -rV | head -n1 | cut -d'_' -f2) -LIBBPF_MAJOR_VERSION := $(firstword $(subst ., ,$(LIBBPF_VERSION))) +LIBBPF_MAJOR_VERSION := $(word 1,$(subst ., ,$(LIBBPF_VERSION))) +LIBBPF_MINOR_VERSION := $(word 2,$(subst ., ,$(LIBBPF_VERSION)))
MAKEFLAGS += --no-print-directory
@@ -59,7 +60,8 @@ ifndef VERBOSE VERBOSE = 0 endif
-INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/tools/include/uapi +INCLUDES = -I$(if $(OUTPUT),$(OUTPUT),.) \ + -I$(srctree)/tools/include -I$(srctree)/tools/include/uapi
export prefix libdir src obj
@@ -111,7 +113,9 @@ SHARED_OBJDIR := $(OUTPUT)sharedobjs/ STATIC_OBJDIR := $(OUTPUT)staticobjs/ BPF_IN_SHARED := $(SHARED_OBJDIR)libbpf-in.o BPF_IN_STATIC := $(STATIC_OBJDIR)libbpf-in.o +VERSION_HDR := $(OUTPUT)libbpf_version.h BPF_HELPER_DEFS := $(OUTPUT)bpf_helper_defs.h +BPF_GENERATED := $(BPF_HELPER_DEFS) $(VERSION_HDR)
LIB_TARGET := $(addprefix $(OUTPUT),$(LIB_TARGET)) LIB_FILE := $(addprefix $(OUTPUT),$(LIB_FILE)) @@ -136,7 +140,7 @@ all: fixdep
all_cmd: $(CMD_TARGETS) check
-$(BPF_IN_SHARED): force $(BPF_HELPER_DEFS) +$(BPF_IN_SHARED): force $(BPF_GENERATED) @(test -f ../../include/uapi/linux/bpf.h -a -f ../../../include/uapi/linux/bpf.h && ( \ (diff -B ../../include/uapi/linux/bpf.h ../../../include/uapi/linux/bpf.h >/dev/null) || \ echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'" >&2 )) || true @@ -154,13 +158,19 @@ $(BPF_IN_SHARED): force $(BPF_HELPER_DEFS) echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h'" >&2 )) || true $(Q)$(MAKE) $(build)=libbpf OUTPUT=$(SHARED_OBJDIR) CFLAGS="$(CFLAGS) $(SHLIB_FLAGS)"
-$(BPF_IN_STATIC): force $(BPF_HELPER_DEFS) +$(BPF_IN_STATIC): force $(BPF_GENERATED) $(Q)$(MAKE) $(build)=libbpf OUTPUT=$(STATIC_OBJDIR)
$(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h $(QUIET_GEN)$(srctree)/scripts/bpf_helpers_doc.py --header \ --file $(srctree)/tools/include/uapi/linux/bpf.h > $(BPF_HELPER_DEFS)
+$(VERSION_HDR): force + $(QUIET_GEN)echo "/* This file was auto-generated. */" > $@ + @echo "" >> $@ + @echo "#define LIBBPF_MAJOR_VERSION $(LIBBPF_MAJOR_VERSION)" >> $@ + @echo "#define LIBBPF_MINOR_VERSION $(LIBBPF_MINOR_VERSION)" >> $@ + $(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION)
$(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN_SHARED) $(VERSION_SCRIPT) @@ -224,10 +234,10 @@ install_lib: all_cmd cp -fpR $(LIB_FILE) $(DESTDIR)$(libdir_SQ)
INSTALL_HEADERS = bpf.h libbpf.h btf.h libbpf_util.h libbpf_common.h libbpf_legacy.h xsk.h \ - bpf_helpers.h $(BPF_HELPER_DEFS) bpf_tracing.h \ + bpf_helpers.h $(BPF_GENERATED) bpf_tracing.h \ bpf_endian.h bpf_core_read.h skel_internal.h
-install_headers: $(BPF_HELPER_DEFS) +install_headers: $(BPF_GENERATED) $(call QUIET_INSTALL, headers) \ $(foreach hdr,$(INSTALL_HEADERS), \ $(call do_install,$(hdr),$(prefix)/include/bpf,644);) @@ -240,7 +250,7 @@ install: install_lib install_pkgconfig install_headers
clean: $(call QUIET_CLEAN, libbpf) $(RM) -rf $(CMD_TARGETS) \ - *~ .*.d .*.cmd LIBBPF-CFLAGS $(BPF_HELPER_DEFS) \ + *~ .*.d .*.cmd LIBBPF-CFLAGS $(BPF_GENERATED) \ $(SHARED_OBJDIR) $(STATIC_OBJDIR) \ $(addprefix $(OUTPUT), \ *.o *.a *.so *.so.$(LIBBPF_MAJOR_VERSION) *.pc) diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 4a711f990904..f2e2fab950b7 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -50,9 +50,11 @@ LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id); LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf); +LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead") LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); +LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_into_kernel instead") LIBBPF_API int btf__load(struct btf *btf); LIBBPF_API int btf__load_into_kernel(struct btf *btf); LIBBPF_API __s32 btf__find_by_name(const struct btf *btf, diff --git a/tools/lib/bpf/libbpf_common.h b/tools/lib/bpf/libbpf_common.h index 947d8bd8a7bb..36ac77f2bea2 100644 --- a/tools/lib/bpf/libbpf_common.h +++ b/tools/lib/bpf/libbpf_common.h @@ -10,6 +10,7 @@ #define __LIBBPF_LIBBPF_COMMON_H
#include <string.h> +#include "libbpf_version.h"
#ifndef LIBBPF_API #define LIBBPF_API __attribute__((visibility("default"))) @@ -17,6 +18,24 @@
#define LIBBPF_DEPRECATED(msg) __attribute__((deprecated(msg)))
+/* Mark a symbol as deprecated when libbpf version is >= {major}.{minor} */ +#define LIBBPF_DEPRECATED_SINCE(major, minor, msg) \ + __LIBBPF_MARK_DEPRECATED_ ## major ## _ ## minor \ + (LIBBPF_DEPRECATED("libbpf v" # major "." # minor "+: " msg)) + +#define __LIBBPF_CURRENT_VERSION_GEQ(major, minor) \ + (LIBBPF_MAJOR_VERSION > (major) || \ + (LIBBPF_MAJOR_VERSION == (major) && LIBBPF_MINOR_VERSION >= (minor))) + +/* Add checks for other versions below when planning deprecation of API symbols + * with the LIBBPF_DEPRECATED_SINCE macro. + */ +#if __LIBBPF_CURRENT_VERSION_GEQ(0, 6) +#define __LIBBPF_MARK_DEPRECATED_0_6(X) X +#else +#define __LIBBPF_MARK_DEPRECATED_0_6(X) +#endif + /* Helper macro to declare and initialize libbpf options struct * * This dance with uninitialized declaration, followed by memset to zero, diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 827d37dbd00f..ef611403958a 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -476,14 +476,14 @@ $(OUTPUT)/test_cpp: test_cpp.cpp $(OUTPUT)/test_core_extern.skel.h $(BPFOBJ) $(Q)$(CXX) $(CFLAGS) $(filter %.a %.o %.cpp,$^) $(LDLIBS) -o $@
# Benchmark runner -$(OUTPUT)/bench_%.o: benchs/bench_%.c bench.h +$(OUTPUT)/bench_%.o: benchs/bench_%.c bench.h $(BPFOBJ) $(call msg,CC,,$@) $(Q)$(CC) $(CFLAGS) -c $(filter %.c,$^) $(LDLIBS) -o $@ $(OUTPUT)/bench_rename.o: $(OUTPUT)/test_overhead.skel.h $(OUTPUT)/bench_trigger.o: $(OUTPUT)/trigger_bench.skel.h $(OUTPUT)/bench_ringbufs.o: $(OUTPUT)/ringbuf_bench.skel.h \ $(OUTPUT)/perfbuf_bench.skel.h -$(OUTPUT)/bench.o: bench.h testing_helpers.h +$(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ) $(OUTPUT)/bench: LDLIBS += -lm $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \ $(OUTPUT)/bench_count.o \
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 2f383041278672332ab57c78570c30c47d6f35fd category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Turn previously auto-generated libbpf_version.h header into a normal header file. This prevents various tricky Makefile integration issues, simplifies the overall build process, but also allows to further extend it with some more versioning-related APIs in the future.
To prevent accidental out-of-sync versions as defined by libbpf.map and libbpf_version.h, Makefile checks their consistency at build time.
Simultaneously with this change bump libbpf.map to v0.6.
Also undo adding libbpf's output directory into include path for kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary because libbpf_version.h is just a normal header like any other.
Fixes: 0b46b7550560 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org (cherry picked from commit 2f383041278672332ab57c78570c30c47d6f35fd) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/preload/Makefile | 7 ++----- tools/bpf/bpftool/Makefile | 1 - tools/bpf/resolve_btfids/Makefile | 1 - tools/lib/bpf/.gitignore | 1 - tools/lib/bpf/Makefile | 33 ++++++++++++++++++++----------- tools/lib/bpf/libbpf.map | 3 +++ tools/lib/bpf/libbpf_version.h | 9 +++++++++ 7 files changed, 35 insertions(+), 20 deletions(-) create mode 100644 tools/lib/bpf/libbpf_version.h
diff --git a/kernel/bpf/preload/Makefile b/kernel/bpf/preload/Makefile index ac29d4e9a384..1951332dd15f 100644 --- a/kernel/bpf/preload/Makefile +++ b/kernel/bpf/preload/Makefile @@ -10,15 +10,12 @@ LIBBPF_OUT = $(abspath $(obj)) $(LIBBPF_A): $(Q)$(MAKE) -C $(LIBBPF_SRCS) O=$(LIBBPF_OUT)/ OUTPUT=$(LIBBPF_OUT)/ $(LIBBPF_OUT)/libbpf.a
-userccflags += -I$(LIBBPF_OUT) -I $(srctree)/tools/include/ \ - -I $(srctree)/tools/include/uapi \ +userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi \ -I $(srctree)/tools/lib/ -Wno-unused-result
userprogs := bpf_preload_umd
-clean-files := $(userprogs) libbpf_version.h bpf_helper_defs.h FEATURE-DUMP.libbpf staticobjs/ feature/ - -$(obj)/iterators/iterators.o: $(LIBBPF_A) +clean-files := $(userprogs) bpf_helper_defs.h FEATURE-DUMP.libbpf staticobjs/ feature/
bpf_preload_umd-objs := iterators/iterators.o bpf_preload_umd-userldlibs := $(LIBBPF_A) -lelf -lz diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index 2304fd6bcccc..70fe8656d840 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -59,7 +59,6 @@ CFLAGS += -W -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers CFLAGS += $(filter-out -Wswitch-enum -Wnested-externs,$(EXTRA_WARNINGS)) CFLAGS += -DPACKAGE='"bpftool"' -D__EXPORTED_HEADERS__ \ -I$(if $(OUTPUT),$(OUTPUT),.) \ - $(if $(LIBBPF_OUTPUT),-I$(LIBBPF_OUTPUT)) \ -I$(srctree)/kernel/bpf/ \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile index ea6e250824ed..df8a0f1b6289 100644 --- a/tools/bpf/resolve_btfids/Makefile +++ b/tools/bpf/resolve_btfids/Makefile @@ -51,7 +51,6 @@ $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(OUTPUT)/l CFLAGS := -g \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ - -I$(LIBBPF_OUT) \ -I$(LIBBPF_SRC) \ -I$(SUBCMD_SRC)
diff --git a/tools/lib/bpf/.gitignore b/tools/lib/bpf/.gitignore index 5d4cfac671d5..0da84cb9e66d 100644 --- a/tools/lib/bpf/.gitignore +++ b/tools/lib/bpf/.gitignore @@ -1,5 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only -libbpf_version.h libbpf.pc libbpf.so.* TAGS diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 8c026bd6b385..84b3bd34f6c6 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -113,9 +113,8 @@ SHARED_OBJDIR := $(OUTPUT)sharedobjs/ STATIC_OBJDIR := $(OUTPUT)staticobjs/ BPF_IN_SHARED := $(SHARED_OBJDIR)libbpf-in.o BPF_IN_STATIC := $(STATIC_OBJDIR)libbpf-in.o -VERSION_HDR := $(OUTPUT)libbpf_version.h BPF_HELPER_DEFS := $(OUTPUT)bpf_helper_defs.h -BPF_GENERATED := $(BPF_HELPER_DEFS) $(VERSION_HDR) +BPF_GENERATED := $(BPF_HELPER_DEFS)
LIB_TARGET := $(addprefix $(OUTPUT),$(LIB_TARGET)) LIB_FILE := $(addprefix $(OUTPUT),$(LIB_FILE)) @@ -165,12 +164,6 @@ $(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h $(QUIET_GEN)$(srctree)/scripts/bpf_helpers_doc.py --header \ --file $(srctree)/tools/include/uapi/linux/bpf.h > $(BPF_HELPER_DEFS)
-$(VERSION_HDR): force - $(QUIET_GEN)echo "/* This file was auto-generated. */" > $@ - @echo "" >> $@ - @echo "#define LIBBPF_MAJOR_VERSION $(LIBBPF_MAJOR_VERSION)" >> $@ - @echo "#define LIBBPF_MINOR_VERSION $(LIBBPF_MINOR_VERSION)" >> $@ - $(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION)
$(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN_SHARED) $(VERSION_SCRIPT) @@ -189,7 +182,7 @@ $(OUTPUT)libbpf.pc: -e "s|@VERSION@|$(LIBBPF_VERSION)|" \ < libbpf.pc.template > $@
-check: check_abi +check: check_abi check_version
check_abi: $(OUTPUT)libbpf.so $(VERSION_SCRIPT) @if [ "$(GLOBAL_SYM_COUNT)" != "$(VERSIONED_SYM_COUNT)" ]; then \ @@ -215,6 +208,21 @@ check_abi: $(OUTPUT)libbpf.so $(VERSION_SCRIPT) exit 1; \ fi
+HDR_MAJ_VERSION := $(shell grep -oE '^#define LIBBPF_MAJOR_VERSION ([0-9]+)$$' libbpf_version.h | cut -d' ' -f3) +HDR_MIN_VERSION := $(shell grep -oE '^#define LIBBPF_MINOR_VERSION ([0-9]+)$$' libbpf_version.h | cut -d' ' -f3) + +check_version: $(VERSION_SCRIPT) libbpf_version.h + @if [ "$(HDR_MAJ_VERSION)" != "$(LIBBPF_MAJOR_VERSION)" ]; then \ + echo "Error: libbpf major version mismatch detected: " \ + "'$(HDR_MAJ_VERSION)' != '$(LIBBPF_MAJOR_VERSION)'" >&2; \ + exit 1; \ + fi + @if [ "$(HDR_MIN_VERSION)" != "$(LIBBPF_MINOR_VERSION)" ]; then \ + echo "Error: libbpf minor version mismatch detected: " \ + "'$(HDR_MIN_VERSION)' != '$(LIBBPF_MINOR_VERSION)'" >&2; \ + exit 1; \ + fi + define do_install_mkdir if [ ! -d '$(DESTDIR_SQ)$1' ]; then \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$1'; \ @@ -234,8 +242,9 @@ install_lib: all_cmd cp -fpR $(LIB_FILE) $(DESTDIR)$(libdir_SQ)
INSTALL_HEADERS = bpf.h libbpf.h btf.h libbpf_util.h libbpf_common.h libbpf_legacy.h xsk.h \ - bpf_helpers.h $(BPF_GENERATED) bpf_tracing.h \ - bpf_endian.h bpf_core_read.h skel_internal.h + bpf_helpers.h $(BPF_GENERATED) bpf_tracing.h \ + bpf_endian.h bpf_core_read.h skel_internal.h \ + libbpf_version.h
install_headers: $(BPF_GENERATED) $(call QUIET_INSTALL, headers) \ @@ -255,7 +264,7 @@ clean: $(addprefix $(OUTPUT), \ *.o *.a *.so *.so.$(LIBBPF_MAJOR_VERSION) *.pc)
-PHONY += force cscope tags +PHONY += force cscope tags check check_abi check_version force:
cscope: diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 93319760e3a3..3008cd8e35ab 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -381,3 +381,6 @@ LIBBPF_0.5.0 { btf_dump__dump_type_data; libbpf_set_strict_mode; } LIBBPF_0.4.0; + +LIBBPF_0.6.0 { +} LIBBPF_0.5.0; diff --git a/tools/lib/bpf/libbpf_version.h b/tools/lib/bpf/libbpf_version.h new file mode 100644 index 000000000000..dd56d76f291c --- /dev/null +++ b/tools/lib/bpf/libbpf_version.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +/* Copyright (C) 2021 Facebook */ +#ifndef __LIBBPF_VERSION_H +#define __LIBBPF_VERSION_H + +#define LIBBPF_MAJOR_VERSION 0 +#define LIBBPF_MINOR_VERSION 6 + +#endif /* __LIBBPF_VERSION_H */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 53df63ccdc0258118e53089197d0428c5330cc9c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Update struct_ops selftests to always specify "struct_ops" section prefix. Libbpf will require a proper BPF program type set in the next patch, so this prevents tests breaking.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210914014733.2768-2-andrii@kernel.org (cherry picked from commit 53df63ccdc0258118e53089197d0428c5330cc9c) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/progs/bpf_cubic.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/progs/bpf_cubic.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/bpf_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cubic.c index 6939bfd8690f..4d09cca501e8 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cubic.c @@ -188,10 +188,8 @@ void BPF_PROG(bictcp_init, struct sock *sk) tcp_sk(sk)->snd_ssthresh = initial_ssthresh; }
-/* No prefix in SEC will also work. - * The remaining tcp-cubic functions have an easier way. - */ -SEC("no-sec-prefix-bictcp_cwnd_event") +/* "struct_ops" prefix is a requirement */ +SEC("struct_ops/bictcp_cwnd_event") void BPF_PROG(bictcp_cwnd_event, struct sock *sk, enum tcp_ca_event event) { if (event == CA_EVENT_TX_START) {
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 91b4d1d1d54431c72f3a7ff034f30a635f787426 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Refactor bpf_object__open() sequencing to perform BPF program type detection based on SEC() definitions before we get to relocations collection. This allows to have more information about BPF program by the time we get to, say, struct_ops relocation gathering. This, subsequently, simplifies struct_ops logic and removes the need to perform extra find_sec_def() resolution.
With this patch libbpf will require all struct_ops BPF programs to be marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations. Real-world applications are already doing that through something like selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's internal handling of SEC() definitions and is in the sprit of upcoming libbpf-1.0 section strictness changes ([0]).
[0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org (cherry picked from commit 91b4d1d1d54431c72f3a7ff034f30a635f787426) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 93 +++++++++++++++++++++++------------------- 1 file changed, 51 insertions(+), 42 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 4d182a9313f8..fe15905fd88b 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6339,12 +6339,37 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
static const struct bpf_sec_def *find_sec_def(const char *sec_name);
+static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object_open_opts *opts) +{ + struct bpf_program *prog; + + bpf_object__for_each_program(prog, obj) { + prog->sec_def = find_sec_def(prog->sec_name); + if (!prog->sec_def) { + /* couldn't guess, but user might manually specify */ + pr_debug("prog '%s': unrecognized ELF section name '%s'\n", + prog->name, prog->sec_name); + continue; + } + + if (prog->sec_def->is_sleepable) + prog->prog_flags |= BPF_F_SLEEPABLE; + bpf_program__set_type(prog, prog->sec_def->prog_type); + bpf_program__set_expected_attach_type(prog, prog->sec_def->expected_attach_type); + + if (prog->sec_def->prog_type == BPF_PROG_TYPE_TRACING || + prog->sec_def->prog_type == BPF_PROG_TYPE_EXT) + prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0); + } + + return 0; +} + static struct bpf_object * __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz, const struct bpf_object_open_opts *opts) { const char *obj_name, *kconfig, *btf_tmp_path; - struct bpf_program *prog; struct bpf_object *obj; char tmp_name[64]; int err; @@ -6402,30 +6427,12 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz, err = err ? : bpf_object__collect_externs(obj); err = err ? : bpf_object__finalize_btf(obj); err = err ? : bpf_object__init_maps(obj, opts); + err = err ? : bpf_object_init_progs(obj, opts); err = err ? : bpf_object__collect_relos(obj); if (err) goto out; - bpf_object__elf_finish(obj);
- bpf_object__for_each_program(prog, obj) { - prog->sec_def = find_sec_def(prog->sec_name); - if (!prog->sec_def) { - /* couldn't guess, but user might manually specify */ - pr_debug("prog '%s': unrecognized ELF section name '%s'\n", - prog->name, prog->sec_name); - continue; - } - - if (prog->sec_def->is_sleepable) - prog->prog_flags |= BPF_F_SLEEPABLE; - bpf_program__set_type(prog, prog->sec_def->prog_type); - bpf_program__set_expected_attach_type(prog, - prog->sec_def->expected_attach_type); - - if (prog->sec_def->prog_type == BPF_PROG_TYPE_TRACING || - prog->sec_def->prog_type == BPF_PROG_TYPE_EXT) - prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0); - } + bpf_object__elf_finish(obj);
return obj; out: @@ -8217,35 +8224,37 @@ static int bpf_object__collect_st_ops_relos(struct bpf_object *obj, return -EINVAL; }
- if (prog->type == BPF_PROG_TYPE_UNSPEC) { - const struct bpf_sec_def *sec_def; - - sec_def = find_sec_def(prog->sec_name); - if (sec_def && - sec_def->prog_type != BPF_PROG_TYPE_STRUCT_OPS) { - /* for pr_warn */ - prog->type = sec_def->prog_type; - goto invalid_prog; - } + /* prevent the use of BPF prog with invalid type */ + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS) { + pr_warn("struct_ops reloc %s: prog %s is not struct_ops BPF program\n", + map->name, prog->name); + return -EINVAL; + }
- prog->type = BPF_PROG_TYPE_STRUCT_OPS; + /* if we haven't yet processed this BPF program, record proper + * attach_btf_id and member_idx + */ + if (!prog->attach_btf_id) { prog->attach_btf_id = st_ops->type_id; prog->expected_attach_type = member_idx; - } else if (prog->type != BPF_PROG_TYPE_STRUCT_OPS || - prog->attach_btf_id != st_ops->type_id || - prog->expected_attach_type != member_idx) { - goto invalid_prog; } + + /* struct_ops BPF prog can be re-used between multiple + * .struct_ops as long as it's the same struct_ops struct + * definition and the same function pointer field + */ + if (prog->attach_btf_id != st_ops->type_id || + prog->expected_attach_type != member_idx) { + pr_warn("struct_ops reloc %s: cannot use prog %s in sec %s with type %u attach_btf_id %u expected_attach_type %u for func ptr %s\n", + map->name, prog->name, prog->sec_name, prog->type, + prog->attach_btf_id, prog->expected_attach_type, name); + return -EINVAL; + } + st_ops->progs[member_idx] = prog; }
return 0; - -invalid_prog: - pr_warn("struct_ops reloc %s: cannot use prog %s in sec %s with type %u attach_btf_id %u expected_attach_type %u for func ptr %s\n", - map->name, prog->name, prog->sec_name, prog->type, - prog->attach_btf_id, prog->expected_attach_type, name); - return -EINVAL; }
#define BTF_TRACE_PREFIX "btf_trace_"
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 5532dfd42e4846e84d346a6dfe01e477e35baa65 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF programs, as it is already recorded at bpf_object__open() time for all recognized type of BPF programs. This further reduces number of explicit calls to find_sec_def(), simplifying further refactorings.
No functional changes are done by this patch.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org (cherry picked from commit 5532dfd42e4846e84d346a6dfe01e477e35baa65) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 67 +++++++++++++++--------------------------- 1 file changed, 24 insertions(+), 43 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index fe15905fd88b..6d53dbe1c97a 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -216,8 +216,7 @@ struct reloc_desc {
struct bpf_sec_def;
-typedef struct bpf_link *(*attach_fn_t)(const struct bpf_sec_def *sec, - struct bpf_program *prog); +typedef struct bpf_link *(*attach_fn_t)(struct bpf_program *prog);
struct bpf_sec_def { const char *sec; @@ -7883,20 +7882,13 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog, __VA_ARGS__ \ }
-static struct bpf_link *attach_kprobe(const struct bpf_sec_def *sec, - struct bpf_program *prog); -static struct bpf_link *attach_tp(const struct bpf_sec_def *sec, - struct bpf_program *prog); -static struct bpf_link *attach_raw_tp(const struct bpf_sec_def *sec, - struct bpf_program *prog); -static struct bpf_link *attach_trace(const struct bpf_sec_def *sec, - struct bpf_program *prog); -static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec, - struct bpf_program *prog); -static struct bpf_link *attach_iter(const struct bpf_sec_def *sec, - struct bpf_program *prog); -static struct bpf_link *attach_sched(const struct bpf_sec_def *sec, - struct bpf_program *prog); +static struct bpf_link *attach_kprobe(struct bpf_program *prog); +static struct bpf_link *attach_tp(struct bpf_program *prog); +static struct bpf_link *attach_raw_tp(struct bpf_program *prog); +static struct bpf_link *attach_trace(struct bpf_program *prog); +static struct bpf_link *attach_lsm(struct bpf_program *prog); +static struct bpf_link *attach_iter(struct bpf_program *prog); +static struct bpf_link *attach_sched(struct bpf_program *prog);
static const struct bpf_sec_def section_defs[] = { BPF_PROG_SEC("socket", BPF_PROG_TYPE_SOCKET_FILTER), @@ -9162,14 +9154,13 @@ struct bpf_link *bpf_program__attach_kprobe(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_kprobe(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_kprobe(struct bpf_program *prog) { const char *func_name; bool retprobe;
- func_name = prog->sec_name + sec->len; - retprobe = strcmp(sec->sec, "kretprobe/") == 0; + func_name = prog->sec_name + prog->sec_def->len; + retprobe = strcmp(prog->sec_def->sec, "kretprobe/") == 0;
return bpf_program__attach_kprobe(prog, retprobe, func_name); } @@ -9282,8 +9273,7 @@ struct bpf_link *bpf_program__attach_tracepoint(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_tp(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_tp(struct bpf_program *prog) { char *sec_name, *tp_cat, *tp_name; struct bpf_link *link; @@ -9293,7 +9283,7 @@ static struct bpf_link *attach_tp(const struct bpf_sec_def *sec, return libbpf_err_ptr(-ENOMEM);
/* extract "tp/<category>/<name>" */ - tp_cat = sec_name + sec->len; + tp_cat = sec_name + prog->sec_def->len; tp_name = strchr(tp_cat, '/'); if (!tp_name) { free(sec_name); @@ -9337,10 +9327,9 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_raw_tp(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_raw_tp(struct bpf_program *prog) { - const char *tp_name = prog->sec_name + sec->len; + const char *tp_name = prog->sec_name + prog->sec_def->len;
return bpf_program__attach_raw_tracepoint(prog, tp_name); } @@ -9390,20 +9379,17 @@ struct bpf_link *bpf_program__attach_lsm(struct bpf_program *prog) return bpf_program__attach_btf_id(prog); }
-static struct bpf_link *attach_trace(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_trace(struct bpf_program *prog) { return bpf_program__attach_trace(prog); }
-static struct bpf_link *attach_sched(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_sched(struct bpf_program *prog) { return bpf_program__attach_sched(prog); }
-static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_lsm(struct bpf_program *prog) { return bpf_program__attach_lsm(prog); } @@ -9534,21 +9520,17 @@ bpf_program__attach_iter(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_iter(const struct bpf_sec_def *sec, - struct bpf_program *prog) +static struct bpf_link *attach_iter(struct bpf_program *prog) { return bpf_program__attach_iter(prog, NULL); }
struct bpf_link *bpf_program__attach(struct bpf_program *prog) { - const struct bpf_sec_def *sec_def; - - sec_def = find_sec_def(prog->sec_name); - if (!sec_def || !sec_def->attach_fn) + if (!prog->sec_def || !prog->sec_def->attach_fn) return libbpf_err_ptr(-ESRCH);
- return sec_def->attach_fn(sec_def, prog); + return prog->sec_def->attach_fn(prog); }
static int bpf_link__detach_struct_ops(struct bpf_link *link) @@ -10627,16 +10609,15 @@ int bpf_object__attach_skeleton(struct bpf_object_skeleton *s) for (i = 0; i < s->prog_cnt; i++) { struct bpf_program *prog = *s->progs[i].prog; struct bpf_link **link = s->progs[i].link; - const struct bpf_sec_def *sec_def;
if (!prog->load) continue;
- sec_def = find_sec_def(prog->sec_name); - if (!sec_def || !sec_def->attach_fn) + /* auto-attaching not supported for this program */ + if (!prog->sec_def || !prog->sec_def->attach_fn) continue;
- *link = sec_def->attach_fn(sec_def, prog); + *link = prog->sec_def->attach_fn(prog); err = libbpf_get_error(*link); if (err) { pr_warn("failed to auto-attach program '%s': %d\n",
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit b6291a6f30d35bd4459dc35aac2f30669a4356ac category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove almost all the code that explicitly iterated BPF program section definitions in favor of using find_sec_def(). The only remaining user of section_defs is libbpf_get_type_names that has to iterate all of them to construct its result.
Having one internal API entry point for section definitions will simplify further refactorings around libbpf's program section definitions parsing.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org (cherry picked from commit b6291a6f30d35bd4459dc35aac2f30669a4356ac) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 46 +++++++++++++++++------------------------- 1 file changed, 19 insertions(+), 27 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 6d53dbe1c97a..84528f29b8e1 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8409,22 +8409,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, __u32 attach_prog_fd = prog->attach_prog_fd; const char *name = prog->sec_name, *attach_name; const struct bpf_sec_def *sec = NULL; - int i, err = 0; + int err = 0;
if (!name) return -EINVAL;
- for (i = 0; i < ARRAY_SIZE(section_defs); i++) { - if (!section_defs[i].is_attach_btf) - continue; - if (strncmp(name, section_defs[i].sec, section_defs[i].len)) - continue; - - sec = §ion_defs[i]; - break; - } - - if (!sec) { + sec = find_sec_def(name); + if (!sec || !sec->is_attach_btf) { pr_warn("failed to identify BTF ID based on ELF section name '%s'\n", name); return -ESRCH; } @@ -8462,27 +8453,28 @@ int libbpf_attach_type_by_name(const char *name, enum bpf_attach_type *attach_type) { char *type_names; - int i; + const struct bpf_sec_def *sec_def;
if (!name) return libbpf_err(-EINVAL);
- for (i = 0; i < ARRAY_SIZE(section_defs); i++) { - if (strncmp(name, section_defs[i].sec, section_defs[i].len)) - continue; - if (!section_defs[i].is_attachable) - return libbpf_err(-EINVAL); - *attach_type = section_defs[i].expected_attach_type; - return 0; - } - pr_debug("failed to guess attach type based on ELF section name '%s'\n", name); - type_names = libbpf_get_type_names(true); - if (type_names != NULL) { - pr_debug("attachable section(type) names are:%s\n", type_names); - free(type_names); + sec_def = find_sec_def(name); + if (!sec_def) { + pr_debug("failed to guess attach type based on ELF section name '%s'\n", name); + type_names = libbpf_get_type_names(true); + if (type_names != NULL) { + pr_debug("attachable section(type) names are:%s\n", type_names); + free(type_names); + } + + return libbpf_err(-EINVAL); }
- return libbpf_err(-EINVAL); + if (!sec_def->is_attachable) + return libbpf_err(-EINVAL); + + *attach_type = sec_def->expected_attach_type; + return 0; }
int bpf_map__fd(const struct bpf_map *map)
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit 41ced4cd88020c9d4b71ff7c50d020f081efa4a0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Change BTF_KIND_* macros to enums so they are encoded in dwarf and appear in vmlinux.h. This will make it easier for bpf programs to use these constants without macro definitions.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com (cherry picked from commit 41ced4cd88020c9d4b71ff7c50d020f081efa4a0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/btf.h | 41 ++++++++++++++++++---------------- tools/include/uapi/linux/btf.h | 41 ++++++++++++++++++---------------- 2 files changed, 44 insertions(+), 38 deletions(-)
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index d27b1708efe9..10e401073dd1 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -56,25 +56,28 @@ struct btf_type { #define BTF_INFO_VLEN(info) ((info) & 0xffff) #define BTF_INFO_KFLAG(info) ((info) >> 31)
-#define BTF_KIND_UNKN 0 /* Unknown */ -#define BTF_KIND_INT 1 /* Integer */ -#define BTF_KIND_PTR 2 /* Pointer */ -#define BTF_KIND_ARRAY 3 /* Array */ -#define BTF_KIND_STRUCT 4 /* Struct */ -#define BTF_KIND_UNION 5 /* Union */ -#define BTF_KIND_ENUM 6 /* Enumeration */ -#define BTF_KIND_FWD 7 /* Forward */ -#define BTF_KIND_TYPEDEF 8 /* Typedef */ -#define BTF_KIND_VOLATILE 9 /* Volatile */ -#define BTF_KIND_CONST 10 /* Const */ -#define BTF_KIND_RESTRICT 11 /* Restrict */ -#define BTF_KIND_FUNC 12 /* Function */ -#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ -#define BTF_KIND_VAR 14 /* Variable */ -#define BTF_KIND_DATASEC 15 /* Section */ -#define BTF_KIND_FLOAT 16 /* Floating point */ -#define BTF_KIND_MAX BTF_KIND_FLOAT -#define NR_BTF_KINDS (BTF_KIND_MAX + 1) +enum { + BTF_KIND_UNKN = 0, /* Unknown */ + BTF_KIND_INT = 1, /* Integer */ + BTF_KIND_PTR = 2, /* Pointer */ + BTF_KIND_ARRAY = 3, /* Array */ + BTF_KIND_STRUCT = 4, /* Struct */ + BTF_KIND_UNION = 5, /* Union */ + BTF_KIND_ENUM = 6, /* Enumeration */ + BTF_KIND_FWD = 7, /* Forward */ + BTF_KIND_TYPEDEF = 8, /* Typedef */ + BTF_KIND_VOLATILE = 9, /* Volatile */ + BTF_KIND_CONST = 10, /* Const */ + BTF_KIND_RESTRICT = 11, /* Restrict */ + BTF_KIND_FUNC = 12, /* Function */ + BTF_KIND_FUNC_PROTO = 13, /* Function Proto */ + BTF_KIND_VAR = 14, /* Variable */ + BTF_KIND_DATASEC = 15, /* Section */ + BTF_KIND_FLOAT = 16, /* Floating point */ + + NR_BTF_KINDS, + BTF_KIND_MAX = NR_BTF_KINDS - 1, +};
/* For some specific BTF_KIND, "struct btf_type" is immediately * followed by extra data. diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h index d27b1708efe9..10e401073dd1 100644 --- a/tools/include/uapi/linux/btf.h +++ b/tools/include/uapi/linux/btf.h @@ -56,25 +56,28 @@ struct btf_type { #define BTF_INFO_VLEN(info) ((info) & 0xffff) #define BTF_INFO_KFLAG(info) ((info) >> 31)
-#define BTF_KIND_UNKN 0 /* Unknown */ -#define BTF_KIND_INT 1 /* Integer */ -#define BTF_KIND_PTR 2 /* Pointer */ -#define BTF_KIND_ARRAY 3 /* Array */ -#define BTF_KIND_STRUCT 4 /* Struct */ -#define BTF_KIND_UNION 5 /* Union */ -#define BTF_KIND_ENUM 6 /* Enumeration */ -#define BTF_KIND_FWD 7 /* Forward */ -#define BTF_KIND_TYPEDEF 8 /* Typedef */ -#define BTF_KIND_VOLATILE 9 /* Volatile */ -#define BTF_KIND_CONST 10 /* Const */ -#define BTF_KIND_RESTRICT 11 /* Restrict */ -#define BTF_KIND_FUNC 12 /* Function */ -#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ -#define BTF_KIND_VAR 14 /* Variable */ -#define BTF_KIND_DATASEC 15 /* Section */ -#define BTF_KIND_FLOAT 16 /* Floating point */ -#define BTF_KIND_MAX BTF_KIND_FLOAT -#define NR_BTF_KINDS (BTF_KIND_MAX + 1) +enum { + BTF_KIND_UNKN = 0, /* Unknown */ + BTF_KIND_INT = 1, /* Integer */ + BTF_KIND_PTR = 2, /* Pointer */ + BTF_KIND_ARRAY = 3, /* Array */ + BTF_KIND_STRUCT = 4, /* Struct */ + BTF_KIND_UNION = 5, /* Union */ + BTF_KIND_ENUM = 6, /* Enumeration */ + BTF_KIND_FWD = 7, /* Forward */ + BTF_KIND_TYPEDEF = 8, /* Typedef */ + BTF_KIND_VOLATILE = 9, /* Volatile */ + BTF_KIND_CONST = 10, /* Const */ + BTF_KIND_RESTRICT = 11, /* Restrict */ + BTF_KIND_FUNC = 12, /* Function */ + BTF_KIND_FUNC_PROTO = 13, /* Function Proto */ + BTF_KIND_VAR = 14, /* Variable */ + BTF_KIND_DATASEC = 15, /* Section */ + BTF_KIND_FLOAT = 16, /* Floating point */ + + NR_BTF_KINDS, + BTF_KIND_MAX = NR_BTF_KINDS - 1, +};
/* For some specific BTF_KIND, "struct btf_type" is immediately * followed by extra data.
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit b5ea834dde6b6e7f75e51d5f66dac8cd7c97b5ef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
LLVM14 added support for a new C attribute ([1]) __attribute__((btf_tag("arbitrary_str"))) This attribute will be emitted to dwarf ([2]) and pahole will convert it to BTF. Or for bpf target, this attribute will be emitted to BTF directly ([3], [4]). The attribute is intended to provide additional information for - struct/union type or struct/union member - static/global variables - static/global function or function parameter.
For linux kernel, the btf_tag can be applied in various places to specify user pointer, function pre- or post- condition, function allow/deny in certain context, etc. Such information will be encoded in vmlinux BTF and can be used by verifier.
The btf_tag can also be applied to bpf programs to help global verifiable functions, e.g., specifying preconditions, etc.
This patch added basic parsing and checking support in kernel for new BTF_KIND_TAG kind.
[1] https://reviews.llvm.org/D106614 [2] https://reviews.llvm.org/D106621 [3] https://reviews.llvm.org/D106622 [4] https://reviews.llvm.org/D109560
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com (cherry picked from commit b5ea834dde6b6e7f75e51d5f66dac8cd7c97b5ef) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/btf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/btf.h | 14 +++- kernel/bpf/btf.c | 128 +++++++++++++++++++++++++++++++++ tools/include/uapi/linux/btf.h | 14 +++- 3 files changed, 154 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index 10e401073dd1..642b6ecb37d7 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -43,7 +43,7 @@ struct btf_type { * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO and VAR. + * FUNC, FUNC_PROTO, VAR and TAG. * "type" is a type_id referring to another type. */ union { @@ -74,6 +74,7 @@ enum { BTF_KIND_VAR = 14, /* Variable */ BTF_KIND_DATASEC = 15, /* Section */ BTF_KIND_FLOAT = 16, /* Floating point */ + BTF_KIND_TAG = 17, /* Tag */
NR_BTF_KINDS, BTF_KIND_MAX = NR_BTF_KINDS - 1, @@ -173,4 +174,15 @@ struct btf_var_secinfo { __u32 size; };
+/* BTF_KIND_TAG is followed by a single "struct btf_tag" to describe + * additional information related to the tag applied location. + * If component_idx == -1, the tag is applied to a struct, union, + * variable or function. Otherwise, it is applied to a struct/union + * member or a func argument, and component_idx indicates which member + * or argument (0 ... vlen-1). + */ +struct btf_tag { + __s32 component_idx; +}; + #endif /* _UAPI__LINUX_BTF_H__ */ diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index e9b6e50a6764..5906870fe1d2 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -280,6 +280,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO", [BTF_KIND_VAR] = "VAR", [BTF_KIND_DATASEC] = "DATASEC", + [BTF_KIND_TAG] = "TAG", };
const char *btf_type_str(const struct btf_type *t) @@ -458,6 +459,17 @@ static bool btf_type_is_datasec(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; }
+static bool btf_type_is_tag(const struct btf_type *t) +{ + return BTF_INFO_KIND(t->info) == BTF_KIND_TAG; +} + +static bool btf_type_is_tag_target(const struct btf_type *t) +{ + return btf_type_is_func(t) || btf_type_is_struct(t) || + btf_type_is_var(t); +} + u32 btf_nr_types(const struct btf *btf) { u32 total = 0; @@ -536,6 +548,7 @@ const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf, static bool btf_type_is_resolve_source_only(const struct btf_type *t) { return btf_type_is_var(t) || + btf_type_is_tag(t) || btf_type_is_datasec(t); }
@@ -562,6 +575,7 @@ static bool btf_type_needs_resolve(const struct btf_type *t) btf_type_is_struct(t) || btf_type_is_array(t) || btf_type_is_var(t) || + btf_type_is_tag(t) || btf_type_is_datasec(t); }
@@ -614,6 +628,11 @@ static const struct btf_var *btf_type_var(const struct btf_type *t) return (const struct btf_var *)(t + 1); }
+static const struct btf_tag *btf_type_tag(const struct btf_type *t) +{ + return (const struct btf_tag *)(t + 1); +} + static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t) { return kind_ops[BTF_INFO_KIND(t->info)]; @@ -3675,6 +3694,110 @@ static const struct btf_kind_operations datasec_ops = { .show = btf_datasec_show, };
+static s32 btf_tag_check_meta(struct btf_verifier_env *env, + const struct btf_type *t, + u32 meta_left) +{ + const struct btf_tag *tag; + u32 meta_needed = sizeof(*tag); + s32 component_idx; + const char *value; + + if (meta_left < meta_needed) { + btf_verifier_log_basic(env, t, + "meta_left:%u meta_needed:%u", + meta_left, meta_needed); + return -EINVAL; + } + + value = btf_name_by_offset(env->btf, t->name_off); + if (!value || !value[0]) { + btf_verifier_log_type(env, t, "Invalid value"); + return -EINVAL; + } + + if (btf_type_vlen(t)) { + btf_verifier_log_type(env, t, "vlen != 0"); + return -EINVAL; + } + + if (btf_type_kflag(t)) { + btf_verifier_log_type(env, t, "Invalid btf_info kind_flag"); + return -EINVAL; + } + + component_idx = btf_type_tag(t)->component_idx; + if (component_idx < -1) { + btf_verifier_log_type(env, t, "Invalid component_idx"); + return -EINVAL; + } + + btf_verifier_log_type(env, t, NULL); + + return meta_needed; +} + +static int btf_tag_resolve(struct btf_verifier_env *env, + const struct resolve_vertex *v) +{ + const struct btf_type *next_type; + const struct btf_type *t = v->t; + u32 next_type_id = t->type; + struct btf *btf = env->btf; + s32 component_idx; + u32 vlen; + + next_type = btf_type_by_id(btf, next_type_id); + if (!next_type || !btf_type_is_tag_target(next_type)) { + btf_verifier_log_type(env, v->t, "Invalid type_id"); + return -EINVAL; + } + + if (!env_type_is_resolve_sink(env, next_type) && + !env_type_is_resolved(env, next_type_id)) + return env_stack_push(env, next_type, next_type_id); + + component_idx = btf_type_tag(t)->component_idx; + if (component_idx != -1) { + if (btf_type_is_var(next_type)) { + btf_verifier_log_type(env, v->t, "Invalid component_idx"); + return -EINVAL; + } + + if (btf_type_is_struct(next_type)) { + vlen = btf_type_vlen(next_type); + } else { + /* next_type should be a function */ + next_type = btf_type_by_id(btf, next_type->type); + vlen = btf_type_vlen(next_type); + } + + if ((u32)component_idx >= vlen) { + btf_verifier_log_type(env, v->t, "Invalid component_idx"); + return -EINVAL; + } + } + + env_stack_pop_resolved(env, next_type_id, 0); + + return 0; +} + +static void btf_tag_log(struct btf_verifier_env *env, const struct btf_type *t) +{ + btf_verifier_log(env, "type=%u component_idx=%d", t->type, + btf_type_tag(t)->component_idx); +} + +static const struct btf_kind_operations tag_ops = { + .check_meta = btf_tag_check_meta, + .resolve = btf_tag_resolve, + .check_member = btf_df_check_member, + .check_kflag_member = btf_df_check_kflag_member, + .log_details = btf_tag_log, + .show = btf_df_show, +}; + static int btf_func_proto_check(struct btf_verifier_env *env, const struct btf_type *t) { @@ -3808,6 +3931,7 @@ static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS] = { [BTF_KIND_FUNC_PROTO] = &func_proto_ops, [BTF_KIND_VAR] = &var_ops, [BTF_KIND_DATASEC] = &datasec_ops, + [BTF_KIND_TAG] = &tag_ops, };
static s32 btf_check_meta(struct btf_verifier_env *env, @@ -3892,6 +4016,10 @@ static bool btf_resolve_valid(struct btf_verifier_env *env, return !btf_resolved_type_id(btf, type_id) && !btf_resolved_type_size(btf, type_id);
+ if (btf_type_is_tag(t)) + return btf_resolved_type_id(btf, type_id) && + !btf_resolved_type_size(btf, type_id); + if (btf_type_is_modifier(t) || btf_type_is_ptr(t) || btf_type_is_var(t)) { t = btf_type_id_resolve(btf, &type_id); diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h index 10e401073dd1..642b6ecb37d7 100644 --- a/tools/include/uapi/linux/btf.h +++ b/tools/include/uapi/linux/btf.h @@ -43,7 +43,7 @@ struct btf_type { * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO and VAR. + * FUNC, FUNC_PROTO, VAR and TAG. * "type" is a type_id referring to another type. */ union { @@ -74,6 +74,7 @@ enum { BTF_KIND_VAR = 14, /* Variable */ BTF_KIND_DATASEC = 15, /* Section */ BTF_KIND_FLOAT = 16, /* Floating point */ + BTF_KIND_TAG = 17, /* Tag */
NR_BTF_KINDS, BTF_KIND_MAX = NR_BTF_KINDS - 1, @@ -173,4 +174,15 @@ struct btf_var_secinfo { __u32 size; };
+/* BTF_KIND_TAG is followed by a single "struct btf_tag" to describe + * additional information related to the tag applied location. + * If component_idx == -1, the tag is applied to a struct, union, + * variable or function. Otherwise, it is applied to a struct/union + * member or a func argument, and component_idx indicates which member + * or argument (0 ... vlen-1). + */ +struct btf_tag { + __s32 component_idx; +}; + #endif /* _UAPI__LINUX_BTF_H__ */
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit 30025e8bd80fdc5a4159ec7f9511121ea561f456 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This patch renames functions btf_{hash,equal}_int() to btf_{hash,equal}_int_tag() so they can be reused for BTF_KIND_TAG support. There is no functionality change for this patch.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com (cherry picked from commit 30025e8bd80fdc5a4159ec7f9511121ea561f456) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index aab7e4ece0a0..4b00a8fe3e80 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -3262,8 +3262,8 @@ static bool btf_equal_common(struct btf_type *t1, struct btf_type *t2) t1->size == t2->size; }
-/* Calculate type signature hash of INT. */ -static long btf_hash_int(struct btf_type *t) +/* Calculate type signature hash of INT or TAG. */ +static long btf_hash_int_tag(struct btf_type *t) { __u32 info = *(__u32 *)(t + 1); long h; @@ -3273,8 +3273,8 @@ static long btf_hash_int(struct btf_type *t) return h; }
-/* Check structural equality of two INTs. */ -static bool btf_equal_int(struct btf_type *t1, struct btf_type *t2) +/* Check structural equality of two INTs or TAGs. */ +static bool btf_equal_int_tag(struct btf_type *t1, struct btf_type *t2) { __u32 info1, info2;
@@ -3541,7 +3541,7 @@ static int btf_dedup_prep(struct btf_dedup *d) h = btf_hash_common(t); break; case BTF_KIND_INT: - h = btf_hash_int(t); + h = btf_hash_int_tag(t); break; case BTF_KIND_ENUM: h = btf_hash_enum(t); @@ -3599,11 +3599,11 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id) return 0;
case BTF_KIND_INT: - h = btf_hash_int(t); + h = btf_hash_int_tag(t); for_each_dedup_cand(d, hash_entry, h) { cand_id = (__u32)(long)hash_entry->value; cand = btf_type_by_id(d->btf, cand_id); - if (btf_equal_int(t, cand)) { + if (btf_equal_int_tag(t, cand)) { new_id = cand_id; break; } @@ -3887,7 +3887,7 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
switch (cand_kind) { case BTF_KIND_INT: - return btf_equal_int(cand_type, canon_type); + return btf_equal_int_tag(cand_type, canon_type);
case BTF_KIND_ENUM: if (d->opts.dont_resolve_fwds)
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit 5b84bd10363e36ceb7c4c1ae749a3fc8adf8df45 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BTF_KIND_TAG support for parsing and dedup. Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not supported in the kernel, sanitize it to INTs.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com (cherry picked from commit 5b84bd10363e36ceb7c4c1ae749a3fc8adf8df45) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 68 +++++++++++++++++++++++++++++++++ tools/lib/bpf/btf.h | 15 ++++++++ tools/lib/bpf/btf_dump.c | 3 ++ tools/lib/bpf/libbpf.c | 31 +++++++++++++-- tools/lib/bpf/libbpf.map | 2 + tools/lib/bpf/libbpf_internal.h | 2 + 6 files changed, 118 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 4b00a8fe3e80..9154992dc792 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -310,6 +310,8 @@ static int btf_type_size(const struct btf_type *t) return base_size + sizeof(struct btf_var); case BTF_KIND_DATASEC: return base_size + vlen * sizeof(struct btf_var_secinfo); + case BTF_KIND_TAG: + return base_size + sizeof(struct btf_tag); default: pr_debug("Unsupported BTF_KIND:%u\n", btf_kind(t)); return -EINVAL; @@ -382,6 +384,9 @@ static int btf_bswap_type_rest(struct btf_type *t) v->size = bswap_32(v->size); } return 0; + case BTF_KIND_TAG: + btf_tag(t)->component_idx = bswap_32(btf_tag(t)->component_idx); + return 0; default: pr_debug("Unsupported BTF_KIND:%u\n", btf_kind(t)); return -EINVAL; @@ -592,6 +597,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_CONST: case BTF_KIND_RESTRICT: case BTF_KIND_VAR: + case BTF_KIND_TAG: type_id = t->type; break; case BTF_KIND_ARRAY: @@ -2446,6 +2452,48 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __ return 0; }
+/* + * Append new BTF_KIND_TAG type with: + * - *value* - non-empty/non-NULL string; + * - *ref_type_id* - referenced type ID, it might not exist yet; + * - *component_idx* - -1 for tagging reference type, otherwise struct/union + * member or function argument index; + * Returns: + * - >0, type ID of newly added BTF type; + * - <0, on error. + */ +int btf__add_tag(struct btf *btf, const char *value, int ref_type_id, + int component_idx) +{ + struct btf_type *t; + int sz, value_off; + + if (!value || !value[0] || component_idx < -1) + return libbpf_err(-EINVAL); + + if (validate_type_id(ref_type_id)) + return libbpf_err(-EINVAL); + + if (btf_ensure_modifiable(btf)) + return libbpf_err(-ENOMEM); + + sz = sizeof(struct btf_type) + sizeof(struct btf_tag); + t = btf_add_type_mem(btf, sz); + if (!t) + return libbpf_err(-ENOMEM); + + value_off = btf__add_str(btf, value); + if (value_off < 0) + return value_off; + + t->name_off = value_off; + t->info = btf_type_info(BTF_KIND_TAG, 0, false); + t->type = ref_type_id; + btf_tag(t)->component_idx = component_idx; + + return btf_commit_type(btf, sz); +} + struct btf_ext_sec_setup_param { __u32 off; __u32 len; @@ -3541,6 +3589,7 @@ static int btf_dedup_prep(struct btf_dedup *d) h = btf_hash_common(t); break; case BTF_KIND_INT: + case BTF_KIND_TAG: h = btf_hash_int_tag(t); break; case BTF_KIND_ENUM: @@ -3596,6 +3645,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id) case BTF_KIND_FUNC_PROTO: case BTF_KIND_VAR: case BTF_KIND_DATASEC: + case BTF_KIND_TAG: return 0;
case BTF_KIND_INT: @@ -4216,6 +4266,23 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id) } break;
+ case BTF_KIND_TAG: + ref_type_id = btf_dedup_ref_type(d, t->type); + if (ref_type_id < 0) + return ref_type_id; + t->type = ref_type_id; + + h = btf_hash_int_tag(t); + for_each_dedup_cand(d, hash_entry, h) { + cand_id = (__u32)(long)hash_entry->value; + cand = btf_type_by_id(d->btf, cand_id); + if (btf_equal_int_tag(t, cand)) { + new_id = cand_id; + break; + } + } + break; + case BTF_KIND_ARRAY: { struct btf_array *info = btf_array(t);
@@ -4488,6 +4555,7 @@ int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ct case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_VAR: + case BTF_KIND_TAG: return visit(&t->type, ctx);
case BTF_KIND_ARRAY: { diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index f2e2fab950b7..659ea8a2769b 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -143,6 +143,10 @@ LIBBPF_API int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __u32 byte_sz);
+/* tag construction API */ +LIBBPF_API int btf__add_tag(struct btf *btf, const char *value, int ref_type_id, + int component_idx); + struct btf_dedup_opts { unsigned int dedup_table_size; bool dont_resolve_fwds; @@ -330,6 +334,11 @@ static inline bool btf_is_float(const struct btf_type *t) return btf_kind(t) == BTF_KIND_FLOAT; }
+static inline bool btf_is_tag(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_TAG; +} + static inline __u8 btf_int_encoding(const struct btf_type *t) { return BTF_INT_ENCODING(*(__u32 *)(t + 1)); @@ -398,6 +407,12 @@ btf_var_secinfos(const struct btf_type *t) return (struct btf_var_secinfo *)(t + 1); }
+struct btf_tag; +static inline struct btf_tag *btf_tag(const struct btf_type *t) +{ + return (struct btf_tag *)(t + 1); +} + #ifdef __cplusplus } /* extern "C" */ #endif diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 9937d35c8f36..87a91b51b0d1 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -316,6 +316,7 @@ static int btf_dump_mark_referenced(struct btf_dump *d) case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_VAR: + case BTF_KIND_TAG: d->type_states[t->type].referenced = 1; break;
@@ -583,6 +584,7 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr) case BTF_KIND_FUNC: case BTF_KIND_VAR: case BTF_KIND_DATASEC: + case BTF_KIND_TAG: d->type_states[id].order_state = ORDERED; return 0;
@@ -2220,6 +2222,7 @@ static int btf_dump_dump_type_data(struct btf_dump *d, case BTF_KIND_FWD: case BTF_KIND_FUNC: case BTF_KIND_FUNC_PROTO: + case BTF_KIND_TAG: err = btf_dump_unsupported_data(d, t, id); break; case BTF_KIND_INT: diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 84528f29b8e1..6680cb5d2a89 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -193,6 +193,8 @@ enum kern_feature_id { FEAT_MODULE_BTF, /* BTF_KIND_FLOAT support */ FEAT_BTF_FLOAT, + /* BTF_KIND_TAG support */ + FEAT_BTF_TAG, __FEAT_CNT, };
@@ -1984,6 +1986,7 @@ static const char *__btf_kind_str(__u16 kind) case BTF_KIND_VAR: return "var"; case BTF_KIND_DATASEC: return "datasec"; case BTF_KIND_FLOAT: return "float"; + case BTF_KIND_TAG: return "tag"; default: return "unknown"; } } @@ -2483,8 +2486,9 @@ static bool btf_needs_sanitization(struct bpf_object *obj) bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC); bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); + bool has_tag = kernel_supports(obj, FEAT_BTF_TAG);
- return !has_func || !has_datasec || !has_func_global || !has_float; + return !has_func || !has_datasec || !has_func_global || !has_float || !has_tag; }
static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) @@ -2493,14 +2497,15 @@ static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC); bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); + bool has_tag = kernel_supports(obj, FEAT_BTF_TAG); struct btf_type *t; int i, j, vlen;
for (i = 1; i <= btf__get_nr_types(btf); i++) { t = (struct btf_type *)btf__type_by_id(btf, i);
- if (!has_datasec && btf_is_var(t)) { - /* replace VAR with INT */ + if ((!has_datasec && btf_is_var(t)) || (!has_tag && btf_is_tag(t))) { + /* replace VAR/TAG with INT */ t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0); /* * using size = 1 is the safest choice, 4 will be too @@ -4211,6 +4216,23 @@ static int probe_kern_btf_float(void) strs, sizeof(strs))); }
+static int probe_kern_btf_tag(void) +{ + static const char strs[] = "\0tag"; + __u32 types[] = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* VAR x */ /* [2] */ + BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1), + BTF_VAR_STATIC, + /* attr */ + BTF_TYPE_TAG_ENC(1, 2, -1), + }; + + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); +} + static int probe_kern_array_mmap(void) { struct bpf_create_map_attr attr = { @@ -4393,6 +4415,9 @@ static struct kern_feature_desc { [FEAT_BTF_FLOAT] = { "BTF_KIND_FLOAT support", probe_kern_btf_float, }, + [FEAT_BTF_TAG] = { + "BTF_KIND_TAG support", probe_kern_btf_tag, + }, };
static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id) diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 3008cd8e35ab..e13df375c779 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -383,4 +383,6 @@ LIBBPF_0.5.0 { } LIBBPF_0.4.0;
LIBBPF_0.6.0 { + global: + btf__add_tag; } LIBBPF_0.5.0; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 8de87eb7c979..c9ca612c9449 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -69,6 +69,8 @@ #define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size) #define BTF_TYPE_FLOAT_ENC(name, sz) \ BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FLOAT, 0, 0), sz) +#define BTF_TYPE_TAG_ENC(value, type, component_idx) \ + BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_TAG, 0, 0), type), (component_idx)
#ifndef likely #define likely(x) __builtin_expect(!!(x), 1)
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit 5c07f2fec00361fb5e8cff8ba7fdede7b29f38bd category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Added bpftool support to dump BTF_KIND_TAG information. The new bpftool will be used in later patches to dump btf in the test bpf program object file.
Currently, the tags are not emitted with bpftool btf dump file <path> format c and they are silently ignored. The tag information is mostly used in the kernel for verification purpose and the kernel uses its own btf to check. With adding these tags to vmlinux.h, tags will be encoded in program's btf but they will not be used by the kernel, at least for now. So let us delay adding these tags to format C header files until there is a real need.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210914223031.246951-1-yhs@fb.com (cherry picked from commit 5c07f2fec00361fb5e8cff8ba7fdede7b29f38bd) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/bpf/bpftool/btf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 0ba562625b02..2228136c538a 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -36,6 +36,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO", [BTF_KIND_VAR] = "VAR", [BTF_KIND_DATASEC] = "DATASEC", + [BTF_KIND_TAG] = "TAG", };
struct btf_attach_table { @@ -327,6 +328,17 @@ static int dump_btf_type(const struct btf *btf, __u32 id, jsonw_end_array(w); break; } + case BTF_KIND_TAG: { + const struct btf_tag *tag = (const void *)(t + 1); + + if (json_output) { + jsonw_uint_field(w, "type_id", t->type); + jsonw_int_field(w, "component_idx", tag->component_idx); + } else { + printf(" type_id=%u component_idx=%d", t->type, tag->component_idx); + } + break; + } default: break; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit f11f86a3931b5d533aed1be1720fbd55bd63174d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Don't perform another search for sec_def inside libbpf_find_attach_btf_id(), as each recognized bpf_program already has prog->sec_def set.
Also remove unnecessary NULL check for prog->sec_name, as it can never be NULL.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org (cherry picked from commit f11f86a3931b5d533aed1be1720fbd55bd63174d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 6680cb5d2a89..4cfd6c582f95 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8432,19 +8432,15 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, { enum bpf_attach_type attach_type = prog->expected_attach_type; __u32 attach_prog_fd = prog->attach_prog_fd; - const char *name = prog->sec_name, *attach_name; - const struct bpf_sec_def *sec = NULL; + const char *attach_name; int err = 0;
- if (!name) - return -EINVAL; - - sec = find_sec_def(name); - if (!sec || !sec->is_attach_btf) { - pr_warn("failed to identify BTF ID based on ELF section name '%s'\n", name); + if (!prog->sec_def || !prog->sec_def->is_attach_btf) { + pr_warn("failed to identify BTF ID based on ELF section name '%s'\n", + prog->sec_name); return -ESRCH; } - attach_name = name + sec->len; + attach_name = prog->sec_name + prog->sec_def->len;
/* BPF program's BTF ID */ if (attach_prog_fd) {
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 23a7baaa93882c241ad3464cdeeb8ef0d1d40a12 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
relaxed_core_relocs option hasn't had any effect for a while now, stop specifying it. Next patch marks it as deprecated.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-3-andrii@kernel.org (cherry picked from commit 23a7baaa93882c241ad3464cdeeb8ef0d1d40a12) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/core_reloc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index a9127478d5e0..0288549949fd 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -249,8 +249,7 @@ static int duration = 0; #define SIZE_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_size.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o", \ - .relaxed_core_relocs = true + .btf_src_file = "btf__core_reloc_" #name ".o"
#define SIZE_OUTPUT_DATA(type) \ STRUCT_TO_CHAR_PTR(core_reloc_size_output) { \
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 277641859e833549722eb82c97cbc2d55421df7c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It's relevant and hasn't been doing anything for a long while now. Deprecated it.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org (cherry picked from commit 277641859e833549722eb82c97cbc2d55421df7c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 12b9d15911dc..59cf43955ad9 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -83,6 +83,7 @@ struct bpf_object_open_opts { * Non-relocatable instructions are replaced with invalid ones to * prevent accidental errors. * */ + LIBBPF_DEPRECATED_SINCE(0, 6, "field has no effect") bool relaxed_core_relocs; /* maps that set the 'pinning' attribute in their definition will have * their pin_path attribute set to a file in this directory, and be
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 2d5ec1c66e25f0b4dd895a211e651a12dec2ef4f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Allow to use bpf_program__set_attach_target to only set target attach program FD, while letting libbpf to use target attach function name from SEC() definition. This might be useful for some scenarios where bpf_object contains multiple related freplace BPF programs intended to replace different sub-programs in target BPF program. In such case all programs will have the same attach_prog_fd, but different attach_func_name. It's convenient to specify such target function names declaratively in SEC() definitions, but attach_prog_fd is a dynamic runtime setting.
To simplify such scenario, allow bpf_program__set_attach_target() to delay BTF ID resolution till the BPF program load time by providing NULL attach_func_name. In that case the behavior will be similar to using bpf_object_open_opts.attach_prog_fd (which is marked deprecated since v0.7), but has the benefit of allowing more control by user in what is attached to what. Such setup allows having BPF programs attached to different target attach_prog_fd with target functions still declaratively recorded in BPF source code in SEC() definitions.
Selftests changes in the next patch should make this more obvious.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org (cherry picked from commit 2d5ec1c66e25f0b4dd895a211e651a12dec2ef4f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 4cfd6c582f95..47cfa247b717 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -10369,18 +10369,29 @@ int bpf_program__set_attach_target(struct bpf_program *prog, { int btf_obj_fd = 0, btf_id = 0, err;
- if (!prog || attach_prog_fd < 0 || !attach_func_name) + if (!prog || attach_prog_fd < 0) return libbpf_err(-EINVAL);
if (prog->obj->loaded) return libbpf_err(-EINVAL);
+ if (attach_prog_fd && !attach_func_name) { + /* remember attach_prog_fd and let bpf_program__load() find + * BTF ID during the program load + */ + prog->attach_prog_fd = attach_prog_fd; + return 0; + } + if (attach_prog_fd) { btf_id = libbpf_find_prog_btf_id(attach_func_name, attach_prog_fd); if (btf_id < 0) return libbpf_err(btf_id); } else { + if (!attach_func_name) + return libbpf_err(-EINVAL); + /* load btf_vmlinux, if not yet */ err = bpf_object__load_vmlinux_btf(prog->obj, true); if (err)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 91b555d73e53879fc6d4cf82c8c0e14c00ce212d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption that bpf_object contains either only single freplace BPF program or all of BPF programs in BPF object are freplaces intended to replace different subprograms of the same target BPF program. This seems both a bit confusing, too assuming, and limiting.
We've had bpf_program__set_attach_target() API which allows more fine-grained control over this, on a per-program level. As such, mark open_opts.attach_prog_fd as deprecated starting from v0.7, so that we have one more universal way of setting freplace targets. With previous change to allow NULL attach_func_name argument, and especially combined with BPF skeleton, arguable bpf_program__set_attach_target() is a more convenient and explicit API as well.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org (cherry picked from commit 91b555d73e53879fc6d4cf82c8c0e14c00ce212d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 3 +++ tools/lib/bpf/libbpf.h | 2 ++ tools/lib/bpf/libbpf_common.h | 5 +++++ 3 files changed, 10 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 47cfa247b717..ea8140123d18 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6381,9 +6381,12 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object bpf_program__set_type(prog, prog->sec_def->prog_type); bpf_program__set_expected_attach_type(prog, prog->sec_def->expected_attach_type);
+#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wdeprecated-declarations" if (prog->sec_def->prog_type == BPF_PROG_TYPE_TRACING || prog->sec_def->prog_type == BPF_PROG_TYPE_EXT) prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0); +#pragma GCC diagnostic pop }
return 0; diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 59cf43955ad9..e5c64cdda1e3 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -90,6 +90,8 @@ struct bpf_object_open_opts { * auto-pinned to that path on load; defaults to "/sys/fs/bpf". */ const char *pin_root_path; + + LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__set_attach_target() on each individual bpf_program") __u32 attach_prog_fd; /* Additional kernel config content that augments and overrides * system Kconfig for CONFIG_xxx externs. diff --git a/tools/lib/bpf/libbpf_common.h b/tools/lib/bpf/libbpf_common.h index 36ac77f2bea2..aaa1efbf6f51 100644 --- a/tools/lib/bpf/libbpf_common.h +++ b/tools/lib/bpf/libbpf_common.h @@ -35,6 +35,11 @@ #else #define __LIBBPF_MARK_DEPRECATED_0_6(X) #endif +#if __LIBBPF_CURRENT_VERSION_GEQ(0, 7) +#define __LIBBPF_MARK_DEPRECATED_0_7(X) X +#else +#define __LIBBPF_MARK_DEPRECATED_0_7(X) +#endif
/* Helper macro to declare and initialize libbpf options struct *
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 942025c9f37ee45e69eb5f39a2877afab66d9555 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so change all struct bpf_program and struct bpf_map pointers to const pointers. This is completely backwards compatible with no functional change.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org (cherry picked from commit 942025c9f37ee45e69eb5f39a2877afab66d9555) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c tools/lib/bpf/libbpf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 66 +++++++++++++++++++++--------------------- tools/lib/bpf/libbpf.h | 30 +++++++++---------- 2 files changed, 48 insertions(+), 48 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ea8140123d18..e44bde6c6538 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -218,7 +218,7 @@ struct reloc_desc {
struct bpf_sec_def;
-typedef struct bpf_link *(*attach_fn_t)(struct bpf_program *prog); +typedef struct bpf_link *(*attach_fn_t)(const struct bpf_program *prog);
struct bpf_sec_def { const char *sec; @@ -7910,13 +7910,13 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog, __VA_ARGS__ \ }
-static struct bpf_link *attach_kprobe(struct bpf_program *prog); -static struct bpf_link *attach_tp(struct bpf_program *prog); -static struct bpf_link *attach_raw_tp(struct bpf_program *prog); -static struct bpf_link *attach_trace(struct bpf_program *prog); -static struct bpf_link *attach_lsm(struct bpf_program *prog); -static struct bpf_link *attach_iter(struct bpf_program *prog); -static struct bpf_link *attach_sched(struct bpf_program *prog); +static struct bpf_link *attach_kprobe(const struct bpf_program *prog); +static struct bpf_link *attach_tp(const struct bpf_program *prog); +static struct bpf_link *attach_raw_tp(const struct bpf_program *prog); +static struct bpf_link *attach_trace(const struct bpf_program *prog); +static struct bpf_link *attach_lsm(const struct bpf_program *prog); +static struct bpf_link *attach_iter(const struct bpf_program *prog); +static struct bpf_link *attach_sched(const struct bpf_program *prog);
static const struct bpf_sec_def section_defs[] = { BPF_PROG_SEC("socket", BPF_PROG_TYPE_SOCKET_FILTER), @@ -8993,7 +8993,7 @@ static int bpf_link__detach_perf_event(struct bpf_link *link) return libbpf_err(err); }
-struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog, int pfd) +struct bpf_link *bpf_program__attach_perf_event(const struct bpf_program *prog, int pfd) { char errmsg[STRERR_BUFSIZE]; struct bpf_link *link; @@ -9142,7 +9142,7 @@ static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name, return pfd; }
-struct bpf_link *bpf_program__attach_kprobe(struct bpf_program *prog, +struct bpf_link *bpf_program__attach_kprobe(const struct bpf_program *prog, bool retprobe, const char *func_name) { @@ -9170,7 +9170,7 @@ struct bpf_link *bpf_program__attach_kprobe(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_kprobe(struct bpf_program *prog) +static struct bpf_link *attach_kprobe(const struct bpf_program *prog) { const char *func_name; bool retprobe; @@ -9181,7 +9181,7 @@ static struct bpf_link *attach_kprobe(struct bpf_program *prog) return bpf_program__attach_kprobe(prog, retprobe, func_name); }
-struct bpf_link *bpf_program__attach_uprobe(struct bpf_program *prog, +struct bpf_link *bpf_program__attach_uprobe(const struct bpf_program *prog, bool retprobe, pid_t pid, const char *binary_path, size_t func_offset) @@ -9262,7 +9262,7 @@ static int perf_event_open_tracepoint(const char *tp_category, return pfd; }
-struct bpf_link *bpf_program__attach_tracepoint(struct bpf_program *prog, +struct bpf_link *bpf_program__attach_tracepoint(const struct bpf_program *prog, const char *tp_category, const char *tp_name) { @@ -9289,7 +9289,7 @@ struct bpf_link *bpf_program__attach_tracepoint(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_tp(struct bpf_program *prog) +static struct bpf_link *attach_tp(const struct bpf_program *prog) { char *sec_name, *tp_cat, *tp_name; struct bpf_link *link; @@ -9313,7 +9313,7 @@ static struct bpf_link *attach_tp(struct bpf_program *prog) return link; }
-struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog, +struct bpf_link *bpf_program__attach_raw_tracepoint(const struct bpf_program *prog, const char *tp_name) { char errmsg[STRERR_BUFSIZE]; @@ -9343,7 +9343,7 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_raw_tp(struct bpf_program *prog) +static struct bpf_link *attach_raw_tp(const struct bpf_program *prog) { const char *tp_name = prog->sec_name + prog->sec_def->len;
@@ -9351,7 +9351,7 @@ static struct bpf_link *attach_raw_tp(struct bpf_program *prog) }
/* Common logic for all BPF program types that attach to a btf_id */ -static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog) +static struct bpf_link *bpf_program__attach_btf_id(const struct bpf_program *prog) { char errmsg[STRERR_BUFSIZE]; struct bpf_link *link; @@ -9380,38 +9380,38 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog) return (struct bpf_link *)link; }
-struct bpf_link *bpf_program__attach_trace(struct bpf_program *prog) +struct bpf_link *bpf_program__attach_trace(const struct bpf_program *prog) { return bpf_program__attach_btf_id(prog); }
-struct bpf_link *bpf_program__attach_sched(struct bpf_program *prog) +struct bpf_link *bpf_program__attach_sched(const struct bpf_program *prog) { return bpf_program__attach_btf_id(prog); }
-struct bpf_link *bpf_program__attach_lsm(struct bpf_program *prog) +struct bpf_link *bpf_program__attach_lsm(const struct bpf_program *prog) { return bpf_program__attach_btf_id(prog); }
-static struct bpf_link *attach_trace(struct bpf_program *prog) +static struct bpf_link *attach_trace(const struct bpf_program *prog) { return bpf_program__attach_trace(prog); }
-static struct bpf_link *attach_sched(struct bpf_program *prog) +static struct bpf_link *attach_sched(const struct bpf_program *prog) { return bpf_program__attach_sched(prog); }
-static struct bpf_link *attach_lsm(struct bpf_program *prog) +static struct bpf_link *attach_lsm(const struct bpf_program *prog) { return bpf_program__attach_lsm(prog); }
static struct bpf_link * -bpf_program__attach_fd(struct bpf_program *prog, int target_fd, int btf_id, +bpf_program__attach_fd(const struct bpf_program *prog, int target_fd, int btf_id, const char *target_name) { DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts, @@ -9447,24 +9447,24 @@ bpf_program__attach_fd(struct bpf_program *prog, int target_fd, int btf_id, }
struct bpf_link * -bpf_program__attach_cgroup(struct bpf_program *prog, int cgroup_fd) +bpf_program__attach_cgroup(const struct bpf_program *prog, int cgroup_fd) { return bpf_program__attach_fd(prog, cgroup_fd, 0, "cgroup"); }
struct bpf_link * -bpf_program__attach_netns(struct bpf_program *prog, int netns_fd) +bpf_program__attach_netns(const struct bpf_program *prog, int netns_fd) { return bpf_program__attach_fd(prog, netns_fd, 0, "netns"); }
-struct bpf_link *bpf_program__attach_xdp(struct bpf_program *prog, int ifindex) +struct bpf_link *bpf_program__attach_xdp(const struct bpf_program *prog, int ifindex) { /* target_fd/target_ifindex use the same field in LINK_CREATE */ return bpf_program__attach_fd(prog, ifindex, 0, "xdp"); }
-struct bpf_link *bpf_program__attach_freplace(struct bpf_program *prog, +struct bpf_link *bpf_program__attach_freplace(const struct bpf_program *prog, int target_fd, const char *attach_func_name) { @@ -9497,7 +9497,7 @@ struct bpf_link *bpf_program__attach_freplace(struct bpf_program *prog, }
struct bpf_link * -bpf_program__attach_iter(struct bpf_program *prog, +bpf_program__attach_iter(const struct bpf_program *prog, const struct bpf_iter_attach_opts *opts) { DECLARE_LIBBPF_OPTS(bpf_link_create_opts, link_create_opts); @@ -9536,12 +9536,12 @@ bpf_program__attach_iter(struct bpf_program *prog, return link; }
-static struct bpf_link *attach_iter(struct bpf_program *prog) +static struct bpf_link *attach_iter(const struct bpf_program *prog) { return bpf_program__attach_iter(prog, NULL); }
-struct bpf_link *bpf_program__attach(struct bpf_program *prog) +struct bpf_link *bpf_program__attach(const struct bpf_program *prog) { if (!prog->sec_def || !prog->sec_def->attach_fn) return libbpf_err_ptr(-ESRCH); @@ -9559,7 +9559,7 @@ static int bpf_link__detach_struct_ops(struct bpf_link *link) return 0; }
-struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map) +struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map) { struct bpf_struct_ops *st_ops; struct bpf_link *link; @@ -10644,7 +10644,7 @@ int bpf_object__attach_skeleton(struct bpf_object_skeleton *s) if (!prog->sec_def || !prog->sec_def->attach_fn) continue;
- *link = prog->sec_def->attach_fn(prog); + *link = bpf_program__attach(prog); err = libbpf_get_error(*link); if (err) { pr_warn("failed to auto-attach program '%s': %d\n", diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index e5c64cdda1e3..d9815eee26ac 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -246,42 +246,42 @@ LIBBPF_API int bpf_link__detach(struct bpf_link *link); LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
LIBBPF_API struct bpf_link * -bpf_program__attach(struct bpf_program *prog); +bpf_program__attach(const struct bpf_program *prog); LIBBPF_API struct bpf_link * -bpf_program__attach_perf_event(struct bpf_program *prog, int pfd); +bpf_program__attach_perf_event(const struct bpf_program *prog, int pfd); LIBBPF_API struct bpf_link * -bpf_program__attach_kprobe(struct bpf_program *prog, bool retprobe, +bpf_program__attach_kprobe(const struct bpf_program *prog, bool retprobe, const char *func_name); LIBBPF_API struct bpf_link * -bpf_program__attach_uprobe(struct bpf_program *prog, bool retprobe, +bpf_program__attach_uprobe(const struct bpf_program *prog, bool retprobe, pid_t pid, const char *binary_path, size_t func_offset); LIBBPF_API struct bpf_link * -bpf_program__attach_tracepoint(struct bpf_program *prog, +bpf_program__attach_tracepoint(const struct bpf_program *prog, const char *tp_category, const char *tp_name); LIBBPF_API struct bpf_link * -bpf_program__attach_raw_tracepoint(struct bpf_program *prog, +bpf_program__attach_raw_tracepoint(const struct bpf_program *prog, const char *tp_name); LIBBPF_API struct bpf_link * -bpf_program__attach_trace(struct bpf_program *prog); +bpf_program__attach_trace(const struct bpf_program *prog); LIBBPF_API struct bpf_link * -bpf_program__attach_lsm(struct bpf_program *prog); +bpf_program__attach_lsm(const struct bpf_program *prog); LIBBPF_API struct bpf_link * -bpf_program__attach_cgroup(struct bpf_program *prog, int cgroup_fd); +bpf_program__attach_cgroup(const struct bpf_program *prog, int cgroup_fd); LIBBPF_API struct bpf_link * -bpf_program__attach_netns(struct bpf_program *prog, int netns_fd); +bpf_program__attach_netns(const struct bpf_program *prog, int netns_fd); LIBBPF_API struct bpf_link * -bpf_program__attach_xdp(struct bpf_program *prog, int ifindex); +bpf_program__attach_xdp(const struct bpf_program *prog, int ifindex); LIBBPF_API struct bpf_link * -bpf_program__attach_freplace(struct bpf_program *prog, +bpf_program__attach_freplace(const struct bpf_program *prog, int target_fd, const char *attach_func_name); LIBBPF_API struct bpf_link * -bpf_program__attach_sched(struct bpf_program *prog); +bpf_program__attach_sched(const struct bpf_program *prog);
struct bpf_map;
-LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(struct bpf_map *map); +LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map);
struct bpf_iter_attach_opts { size_t sz; /* size of this struct for forward/backward compatibility */ @@ -291,7 +291,7 @@ struct bpf_iter_attach_opts { #define bpf_iter_attach_opts__last_field link_info_len
LIBBPF_API struct bpf_link * -bpf_program__attach_iter(struct bpf_program *prog, +bpf_program__attach_iter(const struct bpf_program *prog, const struct bpf_iter_attach_opts *opts);
struct bpf_insn;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.14-rc1 commit bad2e478af3b4df9fd84b4db7779ea91bd618c16 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Turn ony libbpf 1.0 mode. Fix all the explicit IS_ERR checks that now will be broken because libbpf returns NULL on error (and sets errno). Fix ASSERT_OK_PTR and ASSERT_ERR_PTR to work for both old mode and new modes and use them throughout selftests. This is trivial to do by using libbpf_get_error() API that all libbpf users are supposed to use, instead of IS_ERR checks.
A bunch of checks also did explicit -1 comparison for various fd-returning APIs. Such checks are replaced with >= 0 or < 0 cases.
There were also few misuses of bpf_object__find_map_by_name() in test_maps. Those are fixed in this patch as well.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210525035935.1461796-3-andrii@kernel.org (cherry picked from commit bad2e478af3b4df9fd84b4db7779ea91bd618c16) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/prog_tests/bpf_iter.c tools/testing/selftests/bpf/prog_tests/check_mtu.c tools/testing/selftests/bpf/prog_tests/prog_run_xattr.c tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/bench.c | 1 + .../selftests/bpf/benchs/bench_rename.c | 2 +- .../selftests/bpf/benchs/bench_ringbufs.c | 6 +- .../selftests/bpf/benchs/bench_trigger.c | 2 +- .../selftests/bpf/prog_tests/attach_probe.c | 12 +- .../selftests/bpf/prog_tests/bpf_iter.c | 26 ++- .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 8 +- tools/testing/selftests/bpf/prog_tests/btf.c | 93 +++++----- .../selftests/bpf/prog_tests/btf_dump.c | 8 +- .../selftests/bpf/prog_tests/btf_write.c | 4 +- .../bpf/prog_tests/cg_storage_multi.c | 84 +++------ .../bpf/prog_tests/cgroup_attach_multi.c | 2 +- .../selftests/bpf/prog_tests/cgroup_link.c | 14 +- .../bpf/prog_tests/cgroup_skb_sk_lookup.c | 2 +- .../selftests/bpf/prog_tests/core_reloc.c | 15 +- .../selftests/bpf/prog_tests/fexit_bpf2bpf.c | 25 +-- .../selftests/bpf/prog_tests/flow_dissector.c | 2 +- .../bpf/prog_tests/flow_dissector_reattach.c | 10 +- .../bpf/prog_tests/get_stack_raw_tp.c | 10 +- .../prog_tests/get_stackid_cannot_attach.c | 9 +- .../selftests/bpf/prog_tests/hashmap.c | 9 +- .../selftests/bpf/prog_tests/kfree_skb.c | 19 +- .../selftests/bpf/prog_tests/ksyms_btf.c | 3 +- .../selftests/bpf/prog_tests/link_pinning.c | 7 +- .../selftests/bpf/prog_tests/obj_name.c | 8 +- .../selftests/bpf/prog_tests/perf_branches.c | 4 +- .../selftests/bpf/prog_tests/perf_buffer.c | 2 +- .../bpf/prog_tests/perf_event_stackmap.c | 3 +- .../selftests/bpf/prog_tests/probe_user.c | 7 +- .../selftests/bpf/prog_tests/prog_run_xattr.c | 2 +- .../bpf/prog_tests/raw_tp_test_run.c | 4 +- .../selftests/bpf/prog_tests/rdonly_maps.c | 7 +- .../bpf/prog_tests/reference_tracking.c | 2 +- .../selftests/bpf/prog_tests/resolve_btfids.c | 2 +- .../bpf/prog_tests/select_reuseport.c | 53 +++--- .../selftests/bpf/prog_tests/send_signal.c | 3 +- .../selftests/bpf/prog_tests/sk_lookup.c | 2 +- .../selftests/bpf/prog_tests/sock_fields.c | 14 +- .../selftests/bpf/prog_tests/sockmap_basic.c | 6 +- .../selftests/bpf/prog_tests/sockmap_ktls.c | 2 +- .../selftests/bpf/prog_tests/sockmap_listen.c | 10 +- .../bpf/prog_tests/stacktrace_build_id_nmi.c | 3 +- .../selftests/bpf/prog_tests/stacktrace_map.c | 2 +- .../bpf/prog_tests/stacktrace_map_raw_tp.c | 5 +- .../bpf/prog_tests/tcp_hdr_options.c | 15 +- .../selftests/bpf/prog_tests/test_overhead.c | 12 +- .../bpf/prog_tests/trampoline_count.c | 14 +- .../selftests/bpf/prog_tests/udp_limit.c | 7 +- .../selftests/bpf/prog_tests/xdp_bpf2bpf.c | 2 +- .../selftests/bpf/prog_tests/xdp_link.c | 8 +- tools/testing/selftests/bpf/test_maps.c | 168 +++++++++--------- tools/testing/selftests/bpf/test_progs.c | 3 + tools/testing/selftests/bpf/test_progs.h | 9 +- .../selftests/bpf/test_tcpnotify_user.c | 7 +- 54 files changed, 341 insertions(+), 418 deletions(-)
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c index 332ed2f7b402..6ea15b93a2f8 100644 --- a/tools/testing/selftests/bpf/bench.c +++ b/tools/testing/selftests/bpf/bench.c @@ -43,6 +43,7 @@ void setup_libbpf() { int err;
+ libbpf_set_strict_mode(LIBBPF_STRICT_ALL); libbpf_set_print(libbpf_print_fn);
err = bump_memlock_rlimit(); diff --git a/tools/testing/selftests/bpf/benchs/bench_rename.c b/tools/testing/selftests/bpf/benchs/bench_rename.c index a967674098ad..c7ec114eca56 100644 --- a/tools/testing/selftests/bpf/benchs/bench_rename.c +++ b/tools/testing/selftests/bpf/benchs/bench_rename.c @@ -65,7 +65,7 @@ static void attach_bpf(struct bpf_program *prog) struct bpf_link *link;
link = bpf_program__attach(prog); - if (IS_ERR(link)) { + if (!link) { fprintf(stderr, "failed to attach program!\n"); exit(1); } diff --git a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c index da87c7f31891..f5fcb5be8933 100644 --- a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c +++ b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c @@ -181,7 +181,7 @@ static void ringbuf_libbpf_setup() }
link = bpf_program__attach(ctx->skel->progs.bench_ringbuf); - if (IS_ERR(link)) { + if (!link) { fprintf(stderr, "failed to attach program!\n"); exit(1); } @@ -271,7 +271,7 @@ static void ringbuf_custom_setup() }
link = bpf_program__attach(ctx->skel->progs.bench_ringbuf); - if (IS_ERR(link)) { + if (!link) { fprintf(stderr, "failed to attach program\n"); exit(1); } @@ -430,7 +430,7 @@ static void perfbuf_libbpf_setup() }
link = bpf_program__attach(ctx->skel->progs.bench_perfbuf); - if (IS_ERR(link)) { + if (!link) { fprintf(stderr, "failed to attach program\n"); exit(1); } diff --git a/tools/testing/selftests/bpf/benchs/bench_trigger.c b/tools/testing/selftests/bpf/benchs/bench_trigger.c index 2a0b6c9885a4..f41a491a8cc0 100644 --- a/tools/testing/selftests/bpf/benchs/bench_trigger.c +++ b/tools/testing/selftests/bpf/benchs/bench_trigger.c @@ -60,7 +60,7 @@ static void attach_bpf(struct bpf_program *prog) struct bpf_link *link;
link = bpf_program__attach(prog); - if (IS_ERR(link)) { + if (!link) { fprintf(stderr, "failed to attach program!\n"); exit(1); } diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c index a0ee87c8e1ea..d9977dfdd752 100644 --- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c +++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c @@ -47,16 +47,14 @@ void test_attach_probe(void) kprobe_link = bpf_program__attach_kprobe(skel->progs.handle_kprobe, false /* retprobe */, SYS_NANOSLEEP_KPROBE_NAME); - if (CHECK(IS_ERR(kprobe_link), "attach_kprobe", - "err %ld\n", PTR_ERR(kprobe_link))) + if (!ASSERT_OK_PTR(kprobe_link, "attach_kprobe")) goto cleanup; skel->links.handle_kprobe = kprobe_link;
kretprobe_link = bpf_program__attach_kprobe(skel->progs.handle_kretprobe, true /* retprobe */, SYS_NANOSLEEP_KPROBE_NAME); - if (CHECK(IS_ERR(kretprobe_link), "attach_kretprobe", - "err %ld\n", PTR_ERR(kretprobe_link))) + if (!ASSERT_OK_PTR(kretprobe_link, "attach_kretprobe")) goto cleanup; skel->links.handle_kretprobe = kretprobe_link;
@@ -65,8 +63,7 @@ void test_attach_probe(void) 0 /* self pid */, "/proc/self/exe", uprobe_offset); - if (CHECK(IS_ERR(uprobe_link), "attach_uprobe", - "err %ld\n", PTR_ERR(uprobe_link))) + if (!ASSERT_OK_PTR(uprobe_link, "attach_uprobe")) goto cleanup; skel->links.handle_uprobe = uprobe_link;
@@ -75,8 +72,7 @@ void test_attach_probe(void) -1 /* any pid */, "/proc/self/exe", uprobe_offset); - if (CHECK(IS_ERR(uretprobe_link), "attach_uretprobe", - "err %ld\n", PTR_ERR(uretprobe_link))) + if (!ASSERT_OK_PTR(uretprobe_link, "attach_uretprobe")) goto cleanup; skel->links.handle_uretprobe = uretprobe_link;
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index 448885b95eed..adc38a6bae14 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -45,7 +45,7 @@ static void do_dummy_read(struct bpf_program *prog) int iter_fd, len;
link = bpf_program__attach_iter(prog, NULL); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) return;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -182,7 +182,7 @@ static int do_btf_read(struct bpf_iter_task_btf *skel) int ret = 0;
link = bpf_program__attach_iter(prog, NULL); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) return ret;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -384,7 +384,7 @@ static void test_file_iter(void) return;
link = bpf_program__attach_iter(skel1->progs.dump_task, NULL); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
/* unlink this path if it exists. */ @@ -490,7 +490,7 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) skel->bss->map2_id = map_info.id;
link = bpf_program__attach_iter(skel->progs.dump_bpf_map, NULL); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto free_map2;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -595,14 +595,12 @@ static void test_bpf_hash_map(void) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.dump_bpf_hash_map, &opts); - if (CHECK(!IS_ERR(link), "attach_iter", - "attach_iter for hashmap2 unexpected succeeded\n")) + if (!ASSERT_ERR_PTR(link, "attach_iter")) goto out;
linfo.map.map_fd = bpf_map__fd(skel->maps.hashmap3); link = bpf_program__attach_iter(skel->progs.dump_bpf_hash_map, &opts); - if (CHECK(!IS_ERR(link), "attach_iter", - "attach_iter for hashmap3 unexpected succeeded\n")) + if (!ASSERT_ERR_PTR(link, "attach_iter")) goto out;
/* hashmap1 should be good, update map values here */ @@ -624,7 +622,7 @@ static void test_bpf_hash_map(void)
linfo.map.map_fd = map_fd; link = bpf_program__attach_iter(skel->progs.dump_bpf_hash_map, &opts); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -715,7 +713,7 @@ static void test_bpf_percpu_hash_map(void) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.dump_bpf_percpu_hash_map, &opts); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -786,7 +784,7 @@ static void test_bpf_array_map(void) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.dump_bpf_array_map, &opts); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -882,7 +880,7 @@ static void test_bpf_percpu_array_map(void) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.dump_bpf_percpu_array_map, &opts); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -950,7 +948,7 @@ static void test_bpf_sk_storage_map(void) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.dump_bpf_sk_storage_map, &opts); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
iter_fd = bpf_iter_create(bpf_link__fd(link)); @@ -1003,7 +1001,7 @@ static void test_rdonly_buf_out_of_bound(void) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.dump_bpf_hash_map, &opts); - if (CHECK(!IS_ERR(link), "attach_iter", "unexpected success\n")) + if (!ASSERT_ERR_PTR(link, "attach_iter")) bpf_link__destroy(link);
bpf_iter_test_kern5__destroy(skel); diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c index 9a8f47fc0b91..634035eb6fb7 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c @@ -80,7 +80,7 @@ static void *server(void *arg) bytes, total_bytes, nr_sent, errno);
done: - if (fd != -1) + if (fd >= 0) close(fd); if (err) { WRITE_ONCE(stop, 1); @@ -189,8 +189,7 @@ static void test_cubic(void) return;
link = bpf_map__attach_struct_ops(cubic_skel->maps.cubic); - if (CHECK(IS_ERR(link), "bpf_map__attach_struct_ops", "err:%ld\n", - PTR_ERR(link))) { + if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) { bpf_cubic__destroy(cubic_skel); return; } @@ -211,8 +210,7 @@ static void test_dctcp(void) return;
link = bpf_map__attach_struct_ops(dctcp_skel->maps.dctcp); - if (CHECK(IS_ERR(link), "bpf_map__attach_struct_ops", "err:%ld\n", - PTR_ERR(link))) { + if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) { bpf_dctcp__destroy(dctcp_skel); return; } diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 01befb614ee9..5a5013221da9 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -3681,7 +3681,7 @@ static void do_test_raw(unsigned int test_num) always_log); free(raw_btf);
- err = ((btf_fd == -1) != test->btf_load_err); + err = ((btf_fd < 0) != test->btf_load_err); if (CHECK(err, "btf_fd:%d test->btf_load_err:%u", btf_fd, test->btf_load_err) || CHECK(test->err_str && !strstr(btf_log_buf, test->err_str), @@ -3690,7 +3690,7 @@ static void do_test_raw(unsigned int test_num) goto done; }
- if (err || btf_fd == -1) + if (err || btf_fd < 0) goto done;
create_attr.name = test->map_name; @@ -3704,16 +3704,16 @@ static void do_test_raw(unsigned int test_num)
map_fd = bpf_create_map_xattr(&create_attr);
- err = ((map_fd == -1) != test->map_create_err); + err = ((map_fd < 0) != test->map_create_err); CHECK(err, "map_fd:%d test->map_create_err:%u", map_fd, test->map_create_err);
done: if (*btf_log_buf && (err || always_log)) fprintf(stderr, "\n%s", btf_log_buf); - if (btf_fd != -1) + if (btf_fd >= 0) close(btf_fd); - if (map_fd != -1) + if (map_fd >= 0) close(map_fd); }
@@ -3811,7 +3811,7 @@ static int test_big_btf_info(unsigned int test_num) btf_fd = bpf_load_btf(raw_btf, raw_btf_size, btf_log_buf, BTF_LOG_BUF_SIZE, always_log); - if (CHECK(btf_fd == -1, "errno:%d", errno)) { + if (CHECK(btf_fd < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -3857,7 +3857,7 @@ static int test_big_btf_info(unsigned int test_num) free(raw_btf); free(user_btf);
- if (btf_fd != -1) + if (btf_fd >= 0) close(btf_fd);
return err; @@ -3899,7 +3899,7 @@ static int test_btf_id(unsigned int test_num) btf_fd[0] = bpf_load_btf(raw_btf, raw_btf_size, btf_log_buf, BTF_LOG_BUF_SIZE, always_log); - if (CHECK(btf_fd[0] == -1, "errno:%d", errno)) { + if (CHECK(btf_fd[0] < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -3913,7 +3913,7 @@ static int test_btf_id(unsigned int test_num) }
btf_fd[1] = bpf_btf_get_fd_by_id(info[0].id); - if (CHECK(btf_fd[1] == -1, "errno:%d", errno)) { + if (CHECK(btf_fd[1] < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -3941,7 +3941,7 @@ static int test_btf_id(unsigned int test_num) create_attr.btf_value_type_id = 2;
map_fd = bpf_create_map_xattr(&create_attr); - if (CHECK(map_fd == -1, "errno:%d", errno)) { + if (CHECK(map_fd < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -3964,7 +3964,7 @@ static int test_btf_id(unsigned int test_num)
/* Test BTF ID is removed from the kernel */ btf_fd[0] = bpf_btf_get_fd_by_id(map_info.btf_id); - if (CHECK(btf_fd[0] == -1, "errno:%d", errno)) { + if (CHECK(btf_fd[0] < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -3975,7 +3975,7 @@ static int test_btf_id(unsigned int test_num) close(map_fd); map_fd = -1; btf_fd[0] = bpf_btf_get_fd_by_id(map_info.btf_id); - if (CHECK(btf_fd[0] != -1, "BTF lingers")) { + if (CHECK(btf_fd[0] >= 0, "BTF lingers")) { err = -1; goto done; } @@ -3987,11 +3987,11 @@ static int test_btf_id(unsigned int test_num) fprintf(stderr, "\n%s", btf_log_buf);
free(raw_btf); - if (map_fd != -1) + if (map_fd >= 0) close(map_fd); for (i = 0; i < 2; i++) { free(user_btf[i]); - if (btf_fd[i] != -1) + if (btf_fd[i] >= 0) close(btf_fd[i]); }
@@ -4036,7 +4036,7 @@ static void do_test_get_info(unsigned int test_num) btf_fd = bpf_load_btf(raw_btf, raw_btf_size, btf_log_buf, BTF_LOG_BUF_SIZE, always_log); - if (CHECK(btf_fd == -1, "errno:%d", errno)) { + if (CHECK(btf_fd <= 0, "errno:%d", errno)) { err = -1; goto done; } @@ -4082,7 +4082,7 @@ static void do_test_get_info(unsigned int test_num) free(raw_btf); free(user_btf);
- if (btf_fd != -1) + if (btf_fd >= 0) close(btf_fd); }
@@ -4119,8 +4119,9 @@ static void do_test_file(unsigned int test_num) return;
btf = btf__parse_elf(test->file, &btf_ext); - if (IS_ERR(btf)) { - if (PTR_ERR(btf) == -ENOENT) { + err = libbpf_get_error(btf); + if (err) { + if (err == -ENOENT) { printf("%s:SKIP: No ELF %s found", __func__, BTF_ELF_SEC); test__skip(); return; @@ -4133,7 +4134,8 @@ static void do_test_file(unsigned int test_num) btf_ext__free(btf_ext);
obj = bpf_object__open(test->file); - if (CHECK(IS_ERR(obj), "obj: %ld", PTR_ERR(obj))) + err = libbpf_get_error(obj); + if (CHECK(err, "obj: %d", err)) return;
prog = bpf_program__next(NULL, obj); @@ -4168,7 +4170,7 @@ static void do_test_file(unsigned int test_num) info_len = sizeof(struct bpf_prog_info); err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
- if (CHECK(err == -1, "invalid get info (1st) errno:%d", errno)) { + if (CHECK(err < 0, "invalid get info (1st) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); err = -1; goto done; @@ -4200,7 +4202,7 @@ static void do_test_file(unsigned int test_num)
err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
- if (CHECK(err == -1, "invalid get info (2nd) errno:%d", errno)) { + if (CHECK(err < 0, "invalid get info (2nd) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); err = -1; goto done; @@ -4758,7 +4760,7 @@ static void do_test_pprint(int test_num) always_log); free(raw_btf);
- if (CHECK(btf_fd == -1, "errno:%d", errno)) { + if (CHECK(btf_fd < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -4773,7 +4775,7 @@ static void do_test_pprint(int test_num) create_attr.btf_value_type_id = test->value_type_id;
map_fd = bpf_create_map_xattr(&create_attr); - if (CHECK(map_fd == -1, "errno:%d", errno)) { + if (CHECK(map_fd < 0, "errno:%d", errno)) { err = -1; goto done; } @@ -4854,7 +4856,7 @@ static void do_test_pprint(int test_num)
err = check_line(expected_line, nexpected_line, sizeof(expected_line), line); - if (err == -1) + if (err < 0) goto done; }
@@ -4870,7 +4872,7 @@ static void do_test_pprint(int test_num) cpu, cmapv); err = check_line(expected_line, nexpected_line, sizeof(expected_line), line); - if (err == -1) + if (err < 0) goto done;
cmapv = cmapv + rounded_value_size; @@ -4908,9 +4910,9 @@ static void do_test_pprint(int test_num) fprintf(stderr, "OK"); if (*btf_log_buf && (err || always_log)) fprintf(stderr, "\n%s", btf_log_buf); - if (btf_fd != -1) + if (btf_fd >= 0) close(btf_fd); - if (map_fd != -1) + if (map_fd >= 0) close(map_fd); if (pin_file) fclose(pin_file); @@ -5822,7 +5824,7 @@ static int test_get_finfo(const struct prog_info_raw_test *test, /* get necessary lens */ info_len = sizeof(struct bpf_prog_info); err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (CHECK(err == -1, "invalid get info (1st) errno:%d", errno)) { + if (CHECK(err < 0, "invalid get info (1st) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); return -1; } @@ -5852,7 +5854,7 @@ static int test_get_finfo(const struct prog_info_raw_test *test, info.func_info_rec_size = rec_size; info.func_info = ptr_to_u64(func_info); err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (CHECK(err == -1, "invalid get info (2nd) errno:%d", errno)) { + if (CHECK(err < 0, "invalid get info (2nd) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); err = -1; goto done; @@ -5916,7 +5918,7 @@ static int test_get_linfo(const struct prog_info_raw_test *test,
info_len = sizeof(struct bpf_prog_info); err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (CHECK(err == -1, "err:%d errno:%d", err, errno)) { + if (CHECK(err < 0, "err:%d errno:%d", err, errno)) { err = -1; goto done; } @@ -5995,7 +5997,7 @@ static int test_get_linfo(const struct prog_info_raw_test *test, * Only recheck the info.*line_info* fields. * Other fields are not the concern of this test. */ - if (CHECK(err == -1 || + if (CHECK(err < 0 || info.nr_line_info != cnt || (jited_cnt && !info.jited_line_info) || info.nr_jited_line_info != jited_cnt || @@ -6132,7 +6134,7 @@ static void do_test_info_raw(unsigned int test_num) always_log); free(raw_btf);
- if (CHECK(btf_fd == -1, "invalid btf_fd errno:%d", errno)) { + if (CHECK(btf_fd < 0, "invalid btf_fd errno:%d", errno)) { err = -1; goto done; } @@ -6145,7 +6147,8 @@ static void do_test_info_raw(unsigned int test_num) patched_linfo = patch_name_tbd(test->line_info, test->str_sec, linfo_str_off, test->str_sec_size, &linfo_size); - if (IS_ERR(patched_linfo)) { + err = libbpf_get_error(patched_linfo); + if (err) { fprintf(stderr, "error in creating raw bpf_line_info"); err = -1; goto done; @@ -6169,7 +6172,7 @@ static void do_test_info_raw(unsigned int test_num) }
prog_fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); - err = ((prog_fd == -1) != test->expected_prog_load_failure); + err = ((prog_fd < 0) != test->expected_prog_load_failure); if (CHECK(err, "prog_fd:%d expected_prog_load_failure:%u errno:%d", prog_fd, test->expected_prog_load_failure, errno) || CHECK(test->err_str && !strstr(btf_log_buf, test->err_str), @@ -6178,7 +6181,7 @@ static void do_test_info_raw(unsigned int test_num) goto done; }
- if (prog_fd == -1) + if (prog_fd < 0) goto done;
err = test_get_finfo(test, prog_fd); @@ -6195,12 +6198,12 @@ static void do_test_info_raw(unsigned int test_num) if (*btf_log_buf && (err || always_log)) fprintf(stderr, "\n%s", btf_log_buf);
- if (btf_fd != -1) + if (btf_fd >= 0) close(btf_fd); - if (prog_fd != -1) + if (prog_fd >= 0) close(prog_fd);
- if (!IS_ERR(patched_linfo)) + if (!libbpf_get_error(patched_linfo)) free(patched_linfo); }
@@ -6691,9 +6694,9 @@ static void do_test_dedup(unsigned int test_num) return;
test_btf = btf__new((__u8 *)raw_btf, raw_btf_size); + err = libbpf_get_error(test_btf); free(raw_btf); - if (CHECK(IS_ERR(test_btf), "invalid test_btf errno:%ld", - PTR_ERR(test_btf))) { + if (CHECK(err, "invalid test_btf errno:%d", err)) { err = -1; goto done; } @@ -6705,9 +6708,9 @@ static void do_test_dedup(unsigned int test_num) if (!raw_btf) return; expect_btf = btf__new((__u8 *)raw_btf, raw_btf_size); + err = libbpf_get_error(expect_btf); free(raw_btf); - if (CHECK(IS_ERR(expect_btf), "invalid expect_btf errno:%ld", - PTR_ERR(expect_btf))) { + if (CHECK(err, "invalid expect_btf errno:%d", err)) { err = -1; goto done; } @@ -6818,10 +6821,8 @@ static void do_test_dedup(unsigned int test_num) }
done: - if (!IS_ERR(test_btf)) - btf__free(test_btf); - if (!IS_ERR(expect_btf)) - btf__free(expect_btf); + btf__free(test_btf); + btf__free(expect_btf); }
void test_btf(void) diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c index 5e129dc2073c..1b90e684ff13 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c @@ -32,8 +32,9 @@ static int btf_dump_all_types(const struct btf *btf, int err = 0, id;
d = btf_dump__new(btf, NULL, opts, btf_dump_printf); - if (IS_ERR(d)) - return PTR_ERR(d); + err = libbpf_get_error(d); + if (err) + return err;
for (id = 1; id <= type_cnt; id++) { err = btf_dump__dump_type(d, id); @@ -56,8 +57,7 @@ static int test_btf_dump_case(int n, struct btf_dump_test_case *t) snprintf(test_file, sizeof(test_file), "%s.o", t->file);
btf = btf__parse_elf(test_file, NULL); - if (CHECK(IS_ERR(btf), "btf_parse_elf", - "failed to load test BTF: %ld\n", PTR_ERR(btf))) { + if (!ASSERT_OK_PTR(btf, "btf_parse_elf")) { err = -PTR_ERR(btf); btf = NULL; goto done; diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c index f36da15b134f..022c7d89d6f4 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_write.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c @@ -4,8 +4,6 @@ #include <bpf/btf.h> #include "btf_helpers.h"
-static int duration = 0; - void test_btf_write() { const struct btf_var_secinfo *vi; const struct btf_type *t; @@ -16,7 +14,7 @@ void test_btf_write() { int id, err, str_off;
btf = btf__new_empty(); - if (CHECK(IS_ERR(btf), "new_empty", "failed: %ld\n", PTR_ERR(btf))) + if (!ASSERT_OK_PTR(btf, "new_empty")) return;
str_off = btf__find_str(btf, "int"); diff --git a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c index 643dfa35419c..876be0ecb654 100644 --- a/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c +++ b/tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c @@ -102,8 +102,7 @@ static void test_egress_only(int parent_cgroup_fd, int child_cgroup_fd) */ parent_link = bpf_program__attach_cgroup(obj->progs.egress, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_link), "parent-cg-attach", - "err %ld", PTR_ERR(parent_link))) + if (!ASSERT_OK_PTR(parent_link, "parent-cg-attach")) goto close_bpf_object; err = connect_send(CHILD_CGROUP); if (CHECK(err, "first-connect-send", "errno %d", errno)) @@ -126,8 +125,7 @@ static void test_egress_only(int parent_cgroup_fd, int child_cgroup_fd) */ child_link = bpf_program__attach_cgroup(obj->progs.egress, child_cgroup_fd); - if (CHECK(IS_ERR(child_link), "child-cg-attach", - "err %ld", PTR_ERR(child_link))) + if (!ASSERT_OK_PTR(child_link, "child-cg-attach")) goto close_bpf_object; err = connect_send(CHILD_CGROUP); if (CHECK(err, "second-connect-send", "errno %d", errno)) @@ -147,10 +145,8 @@ static void test_egress_only(int parent_cgroup_fd, int child_cgroup_fd) goto close_bpf_object;
close_bpf_object: - if (!IS_ERR(parent_link)) - bpf_link__destroy(parent_link); - if (!IS_ERR(child_link)) - bpf_link__destroy(child_link); + bpf_link__destroy(parent_link); + bpf_link__destroy(child_link);
cg_storage_multi_egress_only__destroy(obj); } @@ -176,18 +172,15 @@ static void test_isolated(int parent_cgroup_fd, int child_cgroup_fd) */ parent_egress1_link = bpf_program__attach_cgroup(obj->progs.egress1, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_egress1_link), "parent-egress1-cg-attach", - "err %ld", PTR_ERR(parent_egress1_link))) + if (!ASSERT_OK_PTR(parent_egress1_link, "parent-egress1-cg-attach")) goto close_bpf_object; parent_egress2_link = bpf_program__attach_cgroup(obj->progs.egress2, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_egress2_link), "parent-egress2-cg-attach", - "err %ld", PTR_ERR(parent_egress2_link))) + if (!ASSERT_OK_PTR(parent_egress2_link, "parent-egress2-cg-attach")) goto close_bpf_object; parent_ingress_link = bpf_program__attach_cgroup(obj->progs.ingress, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_ingress_link), "parent-ingress-cg-attach", - "err %ld", PTR_ERR(parent_ingress_link))) + if (!ASSERT_OK_PTR(parent_ingress_link, "parent-ingress-cg-attach")) goto close_bpf_object; err = connect_send(CHILD_CGROUP); if (CHECK(err, "first-connect-send", "errno %d", errno)) @@ -221,18 +214,15 @@ static void test_isolated(int parent_cgroup_fd, int child_cgroup_fd) */ child_egress1_link = bpf_program__attach_cgroup(obj->progs.egress1, child_cgroup_fd); - if (CHECK(IS_ERR(child_egress1_link), "child-egress1-cg-attach", - "err %ld", PTR_ERR(child_egress1_link))) + if (!ASSERT_OK_PTR(child_egress1_link, "child-egress1-cg-attach")) goto close_bpf_object; child_egress2_link = bpf_program__attach_cgroup(obj->progs.egress2, child_cgroup_fd); - if (CHECK(IS_ERR(child_egress2_link), "child-egress2-cg-attach", - "err %ld", PTR_ERR(child_egress2_link))) + if (!ASSERT_OK_PTR(child_egress2_link, "child-egress2-cg-attach")) goto close_bpf_object; child_ingress_link = bpf_program__attach_cgroup(obj->progs.ingress, child_cgroup_fd); - if (CHECK(IS_ERR(child_ingress_link), "child-ingress-cg-attach", - "err %ld", PTR_ERR(child_ingress_link))) + if (!ASSERT_OK_PTR(child_ingress_link, "child-ingress-cg-attach")) goto close_bpf_object; err = connect_send(CHILD_CGROUP); if (CHECK(err, "second-connect-send", "errno %d", errno)) @@ -264,18 +254,12 @@ static void test_isolated(int parent_cgroup_fd, int child_cgroup_fd) goto close_bpf_object;
close_bpf_object: - if (!IS_ERR(parent_egress1_link)) - bpf_link__destroy(parent_egress1_link); - if (!IS_ERR(parent_egress2_link)) - bpf_link__destroy(parent_egress2_link); - if (!IS_ERR(parent_ingress_link)) - bpf_link__destroy(parent_ingress_link); - if (!IS_ERR(child_egress1_link)) - bpf_link__destroy(child_egress1_link); - if (!IS_ERR(child_egress2_link)) - bpf_link__destroy(child_egress2_link); - if (!IS_ERR(child_ingress_link)) - bpf_link__destroy(child_ingress_link); + bpf_link__destroy(parent_egress1_link); + bpf_link__destroy(parent_egress2_link); + bpf_link__destroy(parent_ingress_link); + bpf_link__destroy(child_egress1_link); + bpf_link__destroy(child_egress2_link); + bpf_link__destroy(child_ingress_link);
cg_storage_multi_isolated__destroy(obj); } @@ -301,18 +285,15 @@ static void test_shared(int parent_cgroup_fd, int child_cgroup_fd) */ parent_egress1_link = bpf_program__attach_cgroup(obj->progs.egress1, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_egress1_link), "parent-egress1-cg-attach", - "err %ld", PTR_ERR(parent_egress1_link))) + if (!ASSERT_OK_PTR(parent_egress1_link, "parent-egress1-cg-attach")) goto close_bpf_object; parent_egress2_link = bpf_program__attach_cgroup(obj->progs.egress2, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_egress2_link), "parent-egress2-cg-attach", - "err %ld", PTR_ERR(parent_egress2_link))) + if (!ASSERT_OK_PTR(parent_egress2_link, "parent-egress2-cg-attach")) goto close_bpf_object; parent_ingress_link = bpf_program__attach_cgroup(obj->progs.ingress, parent_cgroup_fd); - if (CHECK(IS_ERR(parent_ingress_link), "parent-ingress-cg-attach", - "err %ld", PTR_ERR(parent_ingress_link))) + if (!ASSERT_OK_PTR(parent_ingress_link, "parent-ingress-cg-attach")) goto close_bpf_object; err = connect_send(CHILD_CGROUP); if (CHECK(err, "first-connect-send", "errno %d", errno)) @@ -338,18 +319,15 @@ static void test_shared(int parent_cgroup_fd, int child_cgroup_fd) */ child_egress1_link = bpf_program__attach_cgroup(obj->progs.egress1, child_cgroup_fd); - if (CHECK(IS_ERR(child_egress1_link), "child-egress1-cg-attach", - "err %ld", PTR_ERR(child_egress1_link))) + if (!ASSERT_OK_PTR(child_egress1_link, "child-egress1-cg-attach")) goto close_bpf_object; child_egress2_link = bpf_program__attach_cgroup(obj->progs.egress2, child_cgroup_fd); - if (CHECK(IS_ERR(child_egress2_link), "child-egress2-cg-attach", - "err %ld", PTR_ERR(child_egress2_link))) + if (!ASSERT_OK_PTR(child_egress2_link, "child-egress2-cg-attach")) goto close_bpf_object; child_ingress_link = bpf_program__attach_cgroup(obj->progs.ingress, child_cgroup_fd); - if (CHECK(IS_ERR(child_ingress_link), "child-ingress-cg-attach", - "err %ld", PTR_ERR(child_ingress_link))) + if (!ASSERT_OK_PTR(child_ingress_link, "child-ingress-cg-attach")) goto close_bpf_object; err = connect_send(CHILD_CGROUP); if (CHECK(err, "second-connect-send", "errno %d", errno)) @@ -375,18 +353,12 @@ static void test_shared(int parent_cgroup_fd, int child_cgroup_fd) goto close_bpf_object;
close_bpf_object: - if (!IS_ERR(parent_egress1_link)) - bpf_link__destroy(parent_egress1_link); - if (!IS_ERR(parent_egress2_link)) - bpf_link__destroy(parent_egress2_link); - if (!IS_ERR(parent_ingress_link)) - bpf_link__destroy(parent_ingress_link); - if (!IS_ERR(child_egress1_link)) - bpf_link__destroy(child_egress1_link); - if (!IS_ERR(child_egress2_link)) - bpf_link__destroy(child_egress2_link); - if (!IS_ERR(child_ingress_link)) - bpf_link__destroy(child_ingress_link); + bpf_link__destroy(parent_egress1_link); + bpf_link__destroy(parent_egress2_link); + bpf_link__destroy(parent_ingress_link); + bpf_link__destroy(child_egress1_link); + bpf_link__destroy(child_egress2_link); + bpf_link__destroy(child_ingress_link);
cg_storage_multi_shared__destroy(obj); } diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c b/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c index 0a1fc9816cef..20bb8831dda6 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c +++ b/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c @@ -167,7 +167,7 @@ void test_cgroup_attach_multi(void) prog_cnt = 2; CHECK_FAIL(bpf_prog_query(cg5, BPF_CGROUP_INET_EGRESS, BPF_F_QUERY_EFFECTIVE, &attach_flags, - prog_ids, &prog_cnt) != -1); + prog_ids, &prog_cnt) >= 0); CHECK_FAIL(errno != ENOSPC); CHECK_FAIL(prog_cnt != 4); /* check that prog_ids are returned even when buffer is too small */ diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_link.c b/tools/testing/selftests/bpf/prog_tests/cgroup_link.c index 736796e56ed1..9091524131d6 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgroup_link.c +++ b/tools/testing/selftests/bpf/prog_tests/cgroup_link.c @@ -65,8 +65,7 @@ void test_cgroup_link(void) for (i = 0; i < cg_nr; i++) { links[i] = bpf_program__attach_cgroup(skel->progs.egress, cgs[i].fd); - if (CHECK(IS_ERR(links[i]), "cg_attach", "i: %d, err: %ld\n", - i, PTR_ERR(links[i]))) + if (!ASSERT_OK_PTR(links[i], "cg_attach")) goto cleanup; }
@@ -121,8 +120,7 @@ void test_cgroup_link(void)
links[last_cg] = bpf_program__attach_cgroup(skel->progs.egress, cgs[last_cg].fd); - if (CHECK(IS_ERR(links[last_cg]), "cg_attach", "err: %ld\n", - PTR_ERR(links[last_cg]))) + if (!ASSERT_OK_PTR(links[last_cg], "cg_attach")) goto cleanup;
ping_and_check(cg_nr + 1, 0); @@ -147,7 +145,7 @@ void test_cgroup_link(void) /* attempt to mix in with multi-attach bpf_link */ tmp_link = bpf_program__attach_cgroup(skel->progs.egress, cgs[last_cg].fd); - if (CHECK(!IS_ERR(tmp_link), "cg_attach_fail", "unexpected success!\n")) { + if (!ASSERT_ERR_PTR(tmp_link, "cg_attach_fail")) { bpf_link__destroy(tmp_link); goto cleanup; } @@ -165,8 +163,7 @@ void test_cgroup_link(void) /* attach back link-based one */ links[last_cg] = bpf_program__attach_cgroup(skel->progs.egress, cgs[last_cg].fd); - if (CHECK(IS_ERR(links[last_cg]), "cg_attach", "err: %ld\n", - PTR_ERR(links[last_cg]))) + if (!ASSERT_OK_PTR(links[last_cg], "cg_attach")) goto cleanup;
ping_and_check(cg_nr, 0); @@ -249,8 +246,7 @@ void test_cgroup_link(void) BPF_CGROUP_INET_EGRESS);
for (i = 0; i < cg_nr; i++) { - if (!IS_ERR(links[i])) - bpf_link__destroy(links[i]); + bpf_link__destroy(links[i]); } test_cgroup_link__destroy(skel);
diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_skb_sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/cgroup_skb_sk_lookup.c index 464edc1c1708..b9dc4ec655b5 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgroup_skb_sk_lookup.c +++ b/tools/testing/selftests/bpf/prog_tests/cgroup_skb_sk_lookup.c @@ -60,7 +60,7 @@ static void run_cgroup_bpf_test(const char *cg_path, int out_sk) goto cleanup;
link = bpf_program__attach_cgroup(skel->progs.ingress_lookup, cgfd); - if (CHECK(IS_ERR(link), "cgroup_attach", "err: %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "cgroup_attach")) goto cleanup;
run_lookup_test(&skel->bss->g_serv_port, out_sk); diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index 0288549949fd..cc003c938efd 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -367,8 +367,7 @@ static int setup_type_id_case_local(struct core_reloc_test_case *test) const char *name; int i;
- if (CHECK(IS_ERR(local_btf), "local_btf", "failed: %ld\n", PTR_ERR(local_btf)) || - CHECK(IS_ERR(targ_btf), "targ_btf", "failed: %ld\n", PTR_ERR(targ_btf))) { + if (!ASSERT_OK_PTR(local_btf, "local_btf") || !ASSERT_OK_PTR(targ_btf, "targ_btf")) { btf__free(local_btf); btf__free(targ_btf); return -EINVAL; @@ -846,8 +845,7 @@ void test_core_reloc(void) }
obj = bpf_object__open_file(test_case->bpf_obj_file, NULL); - if (CHECK(IS_ERR(obj), "obj_open", "failed to open '%s': %ld\n", - test_case->bpf_obj_file, PTR_ERR(obj))) + if (!ASSERT_OK_PTR(obj, "obj_open")) continue;
probe_name = "raw_tracepoint/sys_enter"; @@ -897,8 +895,7 @@ void test_core_reloc(void) data->my_pid_tgid = my_pid_tgid;
link = bpf_program__attach_raw_tracepoint(prog, tp_name); - if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", - PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_raw_tp")) goto cleanup;
/* trigger test run */ @@ -939,10 +936,8 @@ void test_core_reloc(void) CHECK_FAIL(munmap(mmap_data, mmap_sz)); mmap_data = NULL; } - if (!IS_ERR_OR_NULL(link)) { - bpf_link__destroy(link); - link = NULL; - } + bpf_link__destroy(link); + link = NULL; bpf_object__close(obj); } } diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index 5c0448910426..d7082f422b4e 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -116,10 +116,8 @@ static void test_fexit_bpf2bpf_common(const char *obj_file,
close_prog: for (i = 0; i < prog_cnt; i++) - if (!IS_ERR_OR_NULL(link[i])) - bpf_link__destroy(link[i]); - if (!IS_ERR_OR_NULL(obj)) - bpf_object__close(obj); + bpf_link__destroy(link[i]); + bpf_object__close(obj); bpf_object__close(tgt_obj); free(link); free(prog); @@ -201,7 +199,7 @@ static int test_second_attach(struct bpf_object *obj) return err;
link = bpf_program__attach_freplace(prog, tgt_fd, tgt_name); - if (CHECK(IS_ERR(link), "second_link", "failed to attach second link prog_fd %d tgt_fd %d\n", bpf_program__fd(prog), tgt_fd)) + if (!ASSERT_OK_PTR(link, "second_link")) goto out;
err = bpf_prog_test_run(tgt_fd, 1, &pkt_v6, sizeof(pkt_v6), @@ -253,9 +251,7 @@ static void test_fmod_ret_freplace(void) opts.attach_prog_fd = pkt_fd;
freplace_obj = bpf_object__open_file(freplace_name, &opts); - if (CHECK(IS_ERR_OR_NULL(freplace_obj), "freplace_obj_open", - "failed to open %s: %ld\n", freplace_name, - PTR_ERR(freplace_obj))) + if (!ASSERT_OK_PTR(freplace_obj, "freplace_obj_open")) goto out;
err = bpf_object__load(freplace_obj); @@ -264,14 +260,12 @@ static void test_fmod_ret_freplace(void)
prog = bpf_program__next(NULL, freplace_obj); freplace_link = bpf_program__attach_trace(prog); - if (CHECK(IS_ERR(freplace_link), "freplace_attach_trace", "failed to link\n")) + if (!ASSERT_OK_PTR(freplace_link, "freplace_attach_trace")) goto out;
opts.attach_prog_fd = bpf_program__fd(prog); fmod_obj = bpf_object__open_file(fmod_ret_name, &opts); - if (CHECK(IS_ERR_OR_NULL(fmod_obj), "fmod_obj_open", - "failed to open %s: %ld\n", fmod_ret_name, - PTR_ERR(fmod_obj))) + if (!ASSERT_OK_PTR(fmod_obj, "fmod_obj_open")) goto out;
err = bpf_object__load(fmod_obj); @@ -320,9 +314,7 @@ static void test_obj_load_failure_common(const char *obj_file, );
obj = bpf_object__open_file(obj_file, &opts); - if (CHECK(IS_ERR_OR_NULL(obj), "obj_open", - "failed to open %s: %ld\n", obj_file, - PTR_ERR(obj))) + if (!ASSERT_OK_PTR(obj, "obj_open")) goto close_prog;
/* It should fail to load the program */ @@ -331,8 +323,7 @@ static void test_obj_load_failure_common(const char *obj_file, goto close_prog;
close_prog: - if (!IS_ERR_OR_NULL(obj)) - bpf_object__close(obj); + bpf_object__close(obj); bpf_object__close(pkt_obj); }
diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c index cd6dc80edf18..225714f71ac6 100644 --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c @@ -541,7 +541,7 @@ static void test_skb_less_link_create(struct bpf_flow *skel, int tap_fd) return;
link = bpf_program__attach_netns(skel->progs._dissect, net_fd); - if (CHECK(IS_ERR(link), "attach_netns", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_netns")) goto out_close;
run_tests_skb_less(tap_fd, skel->maps.last_dissection); diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c index 172c586b6996..3931ede5c534 100644 --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c @@ -134,9 +134,9 @@ static void test_link_create_link_create(int netns, int prog1, int prog2) /* Expect failure creating link when another link exists */ errno = 0; link2 = bpf_link_create(prog2, netns, BPF_FLOW_DISSECTOR, &opts); - if (CHECK_FAIL(link2 != -1 || errno != E2BIG)) + if (CHECK_FAIL(link2 >= 0 || errno != E2BIG)) perror("bpf_prog_attach(prog2) expected E2BIG"); - if (link2 != -1) + if (link2 >= 0) close(link2); CHECK_FAIL(query_attached_prog_id(netns) != query_prog_id(prog1));
@@ -159,9 +159,9 @@ static void test_prog_attach_link_create(int netns, int prog1, int prog2) /* Expect failure creating link when prog attached */ errno = 0; link = bpf_link_create(prog2, netns, BPF_FLOW_DISSECTOR, &opts); - if (CHECK_FAIL(link != -1 || errno != EEXIST)) + if (CHECK_FAIL(link >= 0 || errno != EEXIST)) perror("bpf_link_create(prog2) expected EEXIST"); - if (link != -1) + if (link >= 0) close(link); CHECK_FAIL(query_attached_prog_id(netns) != query_prog_id(prog1));
@@ -623,7 +623,7 @@ static void run_tests(int netns) } out_close: for (i = 0; i < ARRAY_SIZE(progs); i++) { - if (progs[i] != -1) + if (progs[i] >= 0) CHECK_FAIL(close(progs[i])); } } diff --git a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c index 925722217edf..522237aa4470 100644 --- a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c @@ -121,12 +121,12 @@ void test_get_stack_raw_tp(void) goto close_prog;
link = bpf_program__attach_raw_tracepoint(prog, "sys_enter"); - if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_raw_tp")) goto close_prog;
pb_opts.sample_cb = get_stack_print_output; pb = perf_buffer__new(bpf_map__fd(map), 8, &pb_opts); - if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb))) + if (!ASSERT_OK_PTR(pb, "perf_buf__new")) goto close_prog;
/* trigger some syscall action */ @@ -141,9 +141,7 @@ void test_get_stack_raw_tp(void) }
close_prog: - if (!IS_ERR_OR_NULL(link)) - bpf_link__destroy(link); - if (!IS_ERR_OR_NULL(pb)) - perf_buffer__free(pb); + bpf_link__destroy(link); + perf_buffer__free(pb); bpf_object__close(obj); } diff --git a/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c b/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c index d884b2ed5bc5..8d5a6023a1bb 100644 --- a/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c @@ -48,8 +48,7 @@ void test_get_stackid_cannot_attach(void)
skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu, pmu_fd); - CHECK(!IS_ERR(skel->links.oncpu), "attach_perf_event_no_callchain", - "should have failed\n"); + ASSERT_ERR_PTR(skel->links.oncpu, "attach_perf_event_no_callchain"); close(pmu_fd);
/* add PERF_SAMPLE_CALLCHAIN, attach should succeed */ @@ -65,8 +64,7 @@ void test_get_stackid_cannot_attach(void)
skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu, pmu_fd); - CHECK(IS_ERR(skel->links.oncpu), "attach_perf_event_callchain", - "err: %ld\n", PTR_ERR(skel->links.oncpu)); + ASSERT_OK_PTR(skel->links.oncpu, "attach_perf_event_callchain"); close(pmu_fd);
/* add exclude_callchain_kernel, attach should fail */ @@ -82,8 +80,7 @@ void test_get_stackid_cannot_attach(void)
skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu, pmu_fd); - CHECK(!IS_ERR(skel->links.oncpu), "attach_perf_event_exclude_callchain_kernel", - "should have failed\n"); + ASSERT_ERR_PTR(skel->links.oncpu, "attach_perf_event_exclude_callchain_kernel"); close(pmu_fd);
cleanup: diff --git a/tools/testing/selftests/bpf/prog_tests/hashmap.c b/tools/testing/selftests/bpf/prog_tests/hashmap.c index 428d488830c6..4747ab18f97f 100644 --- a/tools/testing/selftests/bpf/prog_tests/hashmap.c +++ b/tools/testing/selftests/bpf/prog_tests/hashmap.c @@ -48,8 +48,7 @@ static void test_hashmap_generic(void) struct hashmap *map;
map = hashmap__new(hash_fn, equal_fn, NULL); - if (CHECK(IS_ERR(map), "hashmap__new", - "failed to create map: %ld\n", PTR_ERR(map))) + if (!ASSERT_OK_PTR(map, "hashmap__new")) return;
for (i = 0; i < ELEM_CNT; i++) { @@ -267,8 +266,7 @@ static void test_hashmap_multimap(void)
/* force collisions */ map = hashmap__new(collision_hash_fn, equal_fn, NULL); - if (CHECK(IS_ERR(map), "hashmap__new", - "failed to create map: %ld\n", PTR_ERR(map))) + if (!ASSERT_OK_PTR(map, "hashmap__new")) return;
/* set up multimap: @@ -339,8 +337,7 @@ static void test_hashmap_empty()
/* force collisions */ map = hashmap__new(hash_fn, equal_fn, NULL); - if (CHECK(IS_ERR(map), "hashmap__new", - "failed to create map: %ld\n", PTR_ERR(map))) + if (!ASSERT_OK_PTR(map, "hashmap__new")) goto cleanup;
if (CHECK(hashmap__size(map) != 0, "hashmap__size", diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c index d65107919998..ddfb6bf97152 100644 --- a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c +++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c @@ -97,15 +97,13 @@ void test_kfree_skb(void) goto close_prog;
link = bpf_program__attach_raw_tracepoint(prog, NULL); - if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_raw_tp")) goto close_prog; link_fentry = bpf_program__attach_trace(fentry); - if (CHECK(IS_ERR(link_fentry), "attach fentry", "err %ld\n", - PTR_ERR(link_fentry))) + if (!ASSERT_OK_PTR(link_fentry, "attach fentry")) goto close_prog; link_fexit = bpf_program__attach_trace(fexit); - if (CHECK(IS_ERR(link_fexit), "attach fexit", "err %ld\n", - PTR_ERR(link_fexit))) + if (!ASSERT_OK_PTR(link_fexit, "attach fexit")) goto close_prog;
perf_buf_map = bpf_object__find_map_by_name(obj2, "perf_buf_map"); @@ -116,7 +114,7 @@ void test_kfree_skb(void) pb_opts.sample_cb = on_sample; pb_opts.ctx = &passed; pb = perf_buffer__new(bpf_map__fd(perf_buf_map), 1, &pb_opts); - if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb))) + if (!ASSERT_OK_PTR(pb, "perf_buf__new")) goto close_prog;
memcpy(skb.cb, &cb, sizeof(cb)); @@ -144,12 +142,9 @@ void test_kfree_skb(void) CHECK_FAIL(!test_ok[0] || !test_ok[1]); close_prog: perf_buffer__free(pb); - if (!IS_ERR_OR_NULL(link)) - bpf_link__destroy(link); - if (!IS_ERR_OR_NULL(link_fentry)) - bpf_link__destroy(link_fentry); - if (!IS_ERR_OR_NULL(link_fexit)) - bpf_link__destroy(link_fexit); + bpf_link__destroy(link); + bpf_link__destroy(link_fentry); + bpf_link__destroy(link_fexit); bpf_object__close(obj); bpf_object__close(obj2); } diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c index 0d36eccd690a..3e021fc172f5 100644 --- a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c @@ -126,8 +126,7 @@ void test_ksyms_btf(void) struct btf *btf;
btf = libbpf_find_kernel_btf(); - if (CHECK(IS_ERR(btf), "btf_exists", "failed to load kernel BTF: %ld\n", - PTR_ERR(btf))) + if (!ASSERT_OK_PTR(btf, "btf_exists")) return;
percpu_datasec = btf__find_by_name_kind(btf, ".data..percpu", diff --git a/tools/testing/selftests/bpf/prog_tests/link_pinning.c b/tools/testing/selftests/bpf/prog_tests/link_pinning.c index a743288cf384..6fc97c45f71e 100644 --- a/tools/testing/selftests/bpf/prog_tests/link_pinning.c +++ b/tools/testing/selftests/bpf/prog_tests/link_pinning.c @@ -17,7 +17,7 @@ void test_link_pinning_subtest(struct bpf_program *prog, int err, i;
link = bpf_program__attach(prog); - if (CHECK(IS_ERR(link), "link_attach", "err: %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "link_attach")) goto cleanup;
bss->in = 1; @@ -51,7 +51,7 @@ void test_link_pinning_subtest(struct bpf_program *prog,
/* re-open link from BPFFS */ link = bpf_link__open(link_pin_path); - if (CHECK(IS_ERR(link), "link_open", "err: %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "link_open")) goto cleanup;
CHECK(strcmp(link_pin_path, bpf_link__pin_path(link)), "pin_path2", @@ -84,8 +84,7 @@ void test_link_pinning_subtest(struct bpf_program *prog, CHECK(i == 10000, "link_attached", "got to iteration #%d\n", i);
cleanup: - if (!IS_ERR(link)) - bpf_link__destroy(link); + bpf_link__destroy(link); }
void test_link_pinning(void) diff --git a/tools/testing/selftests/bpf/prog_tests/obj_name.c b/tools/testing/selftests/bpf/prog_tests/obj_name.c index e178416bddad..6194b776a28b 100644 --- a/tools/testing/selftests/bpf/prog_tests/obj_name.c +++ b/tools/testing/selftests/bpf/prog_tests/obj_name.c @@ -38,13 +38,13 @@ void test_obj_name(void)
fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); CHECK((tests[i].success && fd < 0) || - (!tests[i].success && fd != -1) || + (!tests[i].success && fd >= 0) || (!tests[i].success && errno != tests[i].expected_errno), "check-bpf-prog-name", "fd %d(%d) errno %d(%d)\n", fd, tests[i].success, errno, tests[i].expected_errno);
- if (fd != -1) + if (fd >= 0) close(fd);
/* test different attr.map_name during BPF_MAP_CREATE */ @@ -59,13 +59,13 @@ void test_obj_name(void) memcpy(attr.map_name, tests[i].name, ncopy); fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr)); CHECK((tests[i].success && fd < 0) || - (!tests[i].success && fd != -1) || + (!tests[i].success && fd >= 0) || (!tests[i].success && errno != tests[i].expected_errno), "check-bpf-map-name", "fd %d(%d) errno %d(%d)\n", fd, tests[i].success, errno, tests[i].expected_errno);
- if (fd != -1) + if (fd >= 0) close(fd); } } diff --git a/tools/testing/selftests/bpf/prog_tests/perf_branches.c b/tools/testing/selftests/bpf/prog_tests/perf_branches.c index e35c444902a7..12c4f45cee1a 100644 --- a/tools/testing/selftests/bpf/prog_tests/perf_branches.c +++ b/tools/testing/selftests/bpf/prog_tests/perf_branches.c @@ -74,7 +74,7 @@ static void test_perf_branches_common(int perf_fd,
/* attach perf_event */ link = bpf_program__attach_perf_event(skel->progs.perf_branches, perf_fd); - if (CHECK(IS_ERR(link), "attach_perf_event", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_perf_event")) goto out_destroy_skel;
/* generate some branches on cpu 0 */ @@ -119,7 +119,7 @@ static void test_perf_branches_hw(void) * Some setups don't support branch records (virtual machines, !x86), * so skip test in this case. */ - if (pfd == -1) { + if (pfd < 0) { if (errno == ENOENT || errno == EOPNOTSUPP) { printf("%s:SKIP:no PERF_SAMPLE_BRANCH_STACK\n", __func__); diff --git a/tools/testing/selftests/bpf/prog_tests/perf_buffer.c b/tools/testing/selftests/bpf/prog_tests/perf_buffer.c index 8d75475408f5..7daaaab13681 100644 --- a/tools/testing/selftests/bpf/prog_tests/perf_buffer.c +++ b/tools/testing/selftests/bpf/prog_tests/perf_buffer.c @@ -80,7 +80,7 @@ void test_perf_buffer(void) pb_opts.sample_cb = on_sample; pb_opts.ctx = &cpu_seen; pb = perf_buffer__new(bpf_map__fd(skel->maps.perf_buf_map), 1, &pb_opts); - if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb))) + if (!ASSERT_OK_PTR(pb, "perf_buf__new")) goto out_close;
CHECK(perf_buffer__epoll_fd(pb) < 0, "epoll_fd", diff --git a/tools/testing/selftests/bpf/prog_tests/perf_event_stackmap.c b/tools/testing/selftests/bpf/prog_tests/perf_event_stackmap.c index 72c3690844fb..33144c9432ae 100644 --- a/tools/testing/selftests/bpf/prog_tests/perf_event_stackmap.c +++ b/tools/testing/selftests/bpf/prog_tests/perf_event_stackmap.c @@ -97,8 +97,7 @@ void test_perf_event_stackmap(void)
skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu, pmu_fd); - if (CHECK(IS_ERR(skel->links.oncpu), "attach_perf_event", - "err %ld\n", PTR_ERR(skel->links.oncpu))) { + if (!ASSERT_OK_PTR(skel->links.oncpu, "attach_perf_event")) { close(pmu_fd); goto cleanup; } diff --git a/tools/testing/selftests/bpf/prog_tests/probe_user.c b/tools/testing/selftests/bpf/prog_tests/probe_user.c index 7aecfd9e87d1..95bd12097358 100644 --- a/tools/testing/selftests/bpf/prog_tests/probe_user.c +++ b/tools/testing/selftests/bpf/prog_tests/probe_user.c @@ -15,7 +15,7 @@ void test_probe_user(void) static const int zero = 0;
obj = bpf_object__open_file(obj_file, &opts); - if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj))) + if (!ASSERT_OK_PTR(obj, "obj_open_file")) return;
kprobe_prog = bpf_object__find_program_by_title(obj, prog_name); @@ -33,11 +33,8 @@ void test_probe_user(void) goto cleanup;
kprobe_link = bpf_program__attach(kprobe_prog); - if (CHECK(IS_ERR(kprobe_link), "attach_kprobe", - "err %ld\n", PTR_ERR(kprobe_link))) { - kprobe_link = NULL; + if (!ASSERT_OK_PTR(kprobe_link, "attach_kprobe")) goto cleanup; - }
memset(&curr, 0, sizeof(curr)); in->sin_family = AF_INET; diff --git a/tools/testing/selftests/bpf/prog_tests/prog_run_xattr.c b/tools/testing/selftests/bpf/prog_tests/prog_run_xattr.c index 935a294f049a..b2daafe2729a 100644 --- a/tools/testing/selftests/bpf/prog_tests/prog_run_xattr.c +++ b/tools/testing/selftests/bpf/prog_tests/prog_run_xattr.c @@ -24,7 +24,7 @@ void test_prog_run_xattr(void) memset(buf, 0, sizeof(buf));
err = bpf_prog_test_run_xattr(&tattr); - CHECK_ATTR(err != -1 || errno != ENOSPC || tattr.retval, "run", + CHECK_ATTR(err >= 0 || errno != ENOSPC || tattr.retval, "run", "err %d errno %d retval %d\n", err, errno, tattr.retval);
CHECK_ATTR(tattr.data_size_out != sizeof(pkt_v4), "data_size_out", diff --git a/tools/testing/selftests/bpf/prog_tests/raw_tp_test_run.c b/tools/testing/selftests/bpf/prog_tests/raw_tp_test_run.c index c5fb191874ac..41720a62c4fa 100644 --- a/tools/testing/selftests/bpf/prog_tests/raw_tp_test_run.c +++ b/tools/testing/selftests/bpf/prog_tests/raw_tp_test_run.c @@ -77,7 +77,7 @@ void test_raw_tp_test_run(void) /* invalid cpu ID should fail with ENXIO */ opts.cpu = 0xffffffff; err = bpf_prog_test_run_opts(prog_fd, &opts); - CHECK(err != -1 || errno != ENXIO, + CHECK(err >= 0 || errno != ENXIO, "test_run_opts_fail", "should failed with ENXIO\n");
@@ -85,7 +85,7 @@ void test_raw_tp_test_run(void) opts.cpu = 1; opts.flags = 0; err = bpf_prog_test_run_opts(prog_fd, &opts); - CHECK(err != -1 || errno != EINVAL, + CHECK(err >= 0 || errno != EINVAL, "test_run_opts_fail", "should failed with EINVAL\n");
diff --git a/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c b/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c index 563e12120e77..5f9eaa3ab584 100644 --- a/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c +++ b/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c @@ -30,7 +30,7 @@ void test_rdonly_maps(void) struct bss bss;
obj = bpf_object__open_file(file, NULL); - if (CHECK(IS_ERR(obj), "obj_open", "err %ld\n", PTR_ERR(obj))) + if (!ASSERT_OK_PTR(obj, "obj_open")) return;
err = bpf_object__load(obj); @@ -58,11 +58,8 @@ void test_rdonly_maps(void) goto cleanup;
link = bpf_program__attach_raw_tracepoint(prog, "sys_enter"); - if (CHECK(IS_ERR(link), "attach_prog", "prog '%s', err %ld\n", - t->prog_name, PTR_ERR(link))) { - link = NULL; + if (!ASSERT_OK_PTR(link, "attach_prog")) goto cleanup; - }
/* trigger probe */ usleep(1); diff --git a/tools/testing/selftests/bpf/prog_tests/reference_tracking.c b/tools/testing/selftests/bpf/prog_tests/reference_tracking.c index ac1ee10cffd8..de2688166696 100644 --- a/tools/testing/selftests/bpf/prog_tests/reference_tracking.c +++ b/tools/testing/selftests/bpf/prog_tests/reference_tracking.c @@ -15,7 +15,7 @@ void test_reference_tracking(void) int err = 0;
obj = bpf_object__open_file(file, &open_opts); - if (CHECK_FAIL(IS_ERR(obj))) + if (!ASSERT_OK_PTR(obj, "obj_open_file")) return;
if (CHECK(strcmp(bpf_object__name(obj), obj_name), "obj_name", diff --git a/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c b/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c index d3c2de2c24d1..f62361306f6d 100644 --- a/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c +++ b/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c @@ -76,7 +76,7 @@ __resolve_symbol(struct btf *btf, int type_id) }
for (i = 0; i < ARRAY_SIZE(test_symbols); i++) { - if (test_symbols[i].id != -1) + if (test_symbols[i].id >= 0) continue;
if (BTF_INFO_KIND(type->info) != test_symbols[i].type) diff --git a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c index 821b4146b7b6..4efd337d6a3c 100644 --- a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c +++ b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c @@ -78,7 +78,7 @@ static int create_maps(enum bpf_map_type inner_type) attr.max_entries = REUSEPORT_ARRAY_SIZE;
reuseport_array = bpf_create_map_xattr(&attr); - RET_ERR(reuseport_array == -1, "creating reuseport_array", + RET_ERR(reuseport_array < 0, "creating reuseport_array", "reuseport_array:%d errno:%d\n", reuseport_array, errno);
/* Creating outer_map */ @@ -89,7 +89,7 @@ static int create_maps(enum bpf_map_type inner_type) attr.max_entries = 1; attr.inner_map_fd = reuseport_array; outer_map = bpf_create_map_xattr(&attr); - RET_ERR(outer_map == -1, "creating outer_map", + RET_ERR(outer_map < 0, "creating outer_map", "outer_map:%d errno:%d\n", outer_map, errno);
return 0; @@ -102,8 +102,9 @@ static int prepare_bpf_obj(void) int err;
obj = bpf_object__open("test_select_reuseport_kern.o"); - RET_ERR(IS_ERR_OR_NULL(obj), "open test_select_reuseport_kern.o", - "obj:%p PTR_ERR(obj):%ld\n", obj, PTR_ERR(obj)); + err = libbpf_get_error(obj); + RET_ERR(err, "open test_select_reuseport_kern.o", + "obj:%p PTR_ERR(obj):%d\n", obj, err);
map = bpf_object__find_map_by_name(obj, "outer_map"); RET_ERR(!map, "find outer_map", "!map\n"); @@ -116,31 +117,31 @@ static int prepare_bpf_obj(void) prog = bpf_program__next(NULL, obj); RET_ERR(!prog, "get first bpf_program", "!prog\n"); select_by_skb_data_prog = bpf_program__fd(prog); - RET_ERR(select_by_skb_data_prog == -1, "get prog fd", + RET_ERR(select_by_skb_data_prog < 0, "get prog fd", "select_by_skb_data_prog:%d\n", select_by_skb_data_prog);
map = bpf_object__find_map_by_name(obj, "result_map"); RET_ERR(!map, "find result_map", "!map\n"); result_map = bpf_map__fd(map); - RET_ERR(result_map == -1, "get result_map fd", + RET_ERR(result_map < 0, "get result_map fd", "result_map:%d\n", result_map);
map = bpf_object__find_map_by_name(obj, "tmp_index_ovr_map"); RET_ERR(!map, "find tmp_index_ovr_map\n", "!map"); tmp_index_ovr_map = bpf_map__fd(map); - RET_ERR(tmp_index_ovr_map == -1, "get tmp_index_ovr_map fd", + RET_ERR(tmp_index_ovr_map < 0, "get tmp_index_ovr_map fd", "tmp_index_ovr_map:%d\n", tmp_index_ovr_map);
map = bpf_object__find_map_by_name(obj, "linum_map"); RET_ERR(!map, "find linum_map", "!map\n"); linum_map = bpf_map__fd(map); - RET_ERR(linum_map == -1, "get linum_map fd", + RET_ERR(linum_map < 0, "get linum_map fd", "linum_map:%d\n", linum_map);
map = bpf_object__find_map_by_name(obj, "data_check_map"); RET_ERR(!map, "find data_check_map", "!map\n"); data_check_map = bpf_map__fd(map); - RET_ERR(data_check_map == -1, "get data_check_map fd", + RET_ERR(data_check_map < 0, "get data_check_map fd", "data_check_map:%d\n", data_check_map);
return 0; @@ -237,7 +238,7 @@ static long get_linum(void) int err;
err = bpf_map_lookup_elem(linum_map, &index_zero, &linum); - RET_ERR(err == -1, "lookup_elem(linum_map)", "err:%d errno:%d\n", + RET_ERR(err < 0, "lookup_elem(linum_map)", "err:%d errno:%d\n", err, errno);
return linum; @@ -254,11 +255,11 @@ static void check_data(int type, sa_family_t family, const struct cmd *cmd, addrlen = sizeof(cli_sa); err = getsockname(cli_fd, (struct sockaddr *)&cli_sa, &addrlen); - RET_IF(err == -1, "getsockname(cli_fd)", "err:%d errno:%d\n", + RET_IF(err < 0, "getsockname(cli_fd)", "err:%d errno:%d\n", err, errno);
err = bpf_map_lookup_elem(data_check_map, &index_zero, &result); - RET_IF(err == -1, "lookup_elem(data_check_map)", "err:%d errno:%d\n", + RET_IF(err < 0, "lookup_elem(data_check_map)", "err:%d errno:%d\n", err, errno);
if (type == SOCK_STREAM) { @@ -347,7 +348,7 @@ static void check_results(void)
for (i = 0; i < NR_RESULTS; i++) { err = bpf_map_lookup_elem(result_map, &i, &results[i]); - RET_IF(err == -1, "lookup_elem(result_map)", + RET_IF(err < 0, "lookup_elem(result_map)", "i:%u err:%d errno:%d\n", i, err, errno); }
@@ -524,12 +525,12 @@ static void test_syncookie(int type, sa_family_t family) */ err = bpf_map_update_elem(tmp_index_ovr_map, &index_zero, &tmp_index, BPF_ANY); - RET_IF(err == -1, "update_elem(tmp_index_ovr_map, 0, 1)", + RET_IF(err < 0, "update_elem(tmp_index_ovr_map, 0, 1)", "err:%d errno:%d\n", err, errno); do_test(type, family, &cmd, PASS); err = bpf_map_lookup_elem(tmp_index_ovr_map, &index_zero, &tmp_index); - RET_IF(err == -1 || tmp_index != -1, + RET_IF(err < 0 || tmp_index >= 0, "lookup_elem(tmp_index_ovr_map)", "err:%d errno:%d tmp_index:%d\n", err, errno, tmp_index); @@ -569,7 +570,7 @@ static void test_detach_bpf(int type, sa_family_t family)
for (i = 0; i < NR_RESULTS; i++) { err = bpf_map_lookup_elem(result_map, &i, &tmp); - RET_IF(err == -1, "lookup_elem(result_map)", + RET_IF(err < 0, "lookup_elem(result_map)", "i:%u err:%d errno:%d\n", i, err, errno); nr_run_before += tmp; } @@ -584,7 +585,7 @@ static void test_detach_bpf(int type, sa_family_t family)
for (i = 0; i < NR_RESULTS; i++) { err = bpf_map_lookup_elem(result_map, &i, &tmp); - RET_IF(err == -1, "lookup_elem(result_map)", + RET_IF(err < 0, "lookup_elem(result_map)", "i:%u err:%d errno:%d\n", i, err, errno); nr_run_after += tmp; } @@ -632,24 +633,24 @@ static void prepare_sk_fds(int type, sa_family_t family, bool inany) SO_ATTACH_REUSEPORT_EBPF, &select_by_skb_data_prog, sizeof(select_by_skb_data_prog)); - RET_IF(err == -1, "setsockopt(SO_ATTACH_REUEPORT_EBPF)", + RET_IF(err < 0, "setsockopt(SO_ATTACH_REUEPORT_EBPF)", "err:%d errno:%d\n", err, errno); }
err = bind(sk_fds[i], (struct sockaddr *)&srv_sa, addrlen); - RET_IF(err == -1, "bind()", "sk_fds[%d] err:%d errno:%d\n", + RET_IF(err < 0, "bind()", "sk_fds[%d] err:%d errno:%d\n", i, err, errno);
if (type == SOCK_STREAM) { err = listen(sk_fds[i], 10); - RET_IF(err == -1, "listen()", + RET_IF(err < 0, "listen()", "sk_fds[%d] err:%d errno:%d\n", i, err, errno); }
err = bpf_map_update_elem(reuseport_array, &i, &sk_fds[i], BPF_NOEXIST); - RET_IF(err == -1, "update_elem(reuseport_array)", + RET_IF(err < 0, "update_elem(reuseport_array)", "sk_fds[%d] err:%d errno:%d\n", i, err, errno);
if (i == first) { @@ -682,7 +683,7 @@ static void setup_per_test(int type, sa_family_t family, bool inany, prepare_sk_fds(type, family, inany); err = bpf_map_update_elem(tmp_index_ovr_map, &index_zero, &ovr, BPF_ANY); - RET_IF(err == -1, "update_elem(tmp_index_ovr_map, 0, -1)", + RET_IF(err < 0, "update_elem(tmp_index_ovr_map, 0, -1)", "err:%d errno:%d\n", err, errno);
/* Install reuseport_array to outer_map? */ @@ -691,7 +692,7 @@ static void setup_per_test(int type, sa_family_t family, bool inany,
err = bpf_map_update_elem(outer_map, &index_zero, &reuseport_array, BPF_ANY); - RET_IF(err == -1, "update_elem(outer_map, 0, reuseport_array)", + RET_IF(err < 0, "update_elem(outer_map, 0, reuseport_array)", "err:%d errno:%d\n", err, errno); }
@@ -720,18 +721,18 @@ static void cleanup_per_test(bool no_inner_map) return;
err = bpf_map_delete_elem(outer_map, &index_zero); - RET_IF(err == -1, "delete_elem(outer_map)", + RET_IF(err < 0, "delete_elem(outer_map)", "err:%d errno:%d\n", err, errno); }
static void cleanup(void) { - if (outer_map != -1) { + if (outer_map >= 0) { close(outer_map); outer_map = -1; }
- if (reuseport_array != -1) { + if (reuseport_array >= 0) { close(reuseport_array); reuseport_array = -1; } diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c index 1931de5275e5..839f7ddaec16 100644 --- a/tools/testing/selftests/bpf/prog_tests/send_signal.c +++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c @@ -107,8 +107,7 @@ static void test_send_signal_common(struct perf_event_attr *attr,
skel->links.send_signal_perf = bpf_program__attach_perf_event(skel->progs.send_signal_perf, pmu_fd); - if (CHECK(IS_ERR(skel->links.send_signal_perf), "attach_perf_event", - "err %ld\n", PTR_ERR(skel->links.send_signal_perf))) + if (!ASSERT_OK_PTR(skel->links.send_signal_perf, "attach_perf_event")) goto disable_pmu; }
diff --git a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c index b4c9f4a96ae4..6db07401bc49 100644 --- a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c +++ b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c @@ -480,7 +480,7 @@ static struct bpf_link *attach_lookup_prog(struct bpf_program *prog) }
link = bpf_program__attach_netns(prog, net_fd); - if (CHECK(IS_ERR(link), "bpf_program__attach_netns", "failed\n")) { + if (!ASSERT_OK_PTR(link, "bpf_program__attach_netns")) { errno = -PTR_ERR(link); log_err("failed to attach program '%s' to netns", bpf_program__name(prog)); diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c index af87118e748e..577d619fb07e 100644 --- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c +++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c @@ -97,12 +97,12 @@ static void check_result(void)
err = bpf_map_lookup_elem(linum_map_fd, &egress_linum_idx, &egress_linum); - CHECK(err == -1, "bpf_map_lookup_elem(linum_map_fd)", + CHECK(err < 0, "bpf_map_lookup_elem(linum_map_fd)", "err:%d errno:%d\n", err, errno);
err = bpf_map_lookup_elem(linum_map_fd, &ingress_linum_idx, &ingress_linum); - CHECK(err == -1, "bpf_map_lookup_elem(linum_map_fd)", + CHECK(err < 0, "bpf_map_lookup_elem(linum_map_fd)", "err:%d errno:%d\n", err, errno);
memcpy(&srv_sk, &skel->bss->srv_sk, sizeof(srv_sk)); @@ -355,14 +355,12 @@ void test_sock_fields(void)
egress_link = bpf_program__attach_cgroup(skel->progs.egress_read_sock_fields, child_cg_fd); - if (CHECK(IS_ERR(egress_link), "attach_cgroup(egress)", "err:%ld\n", - PTR_ERR(egress_link))) + if (!ASSERT_OK_PTR(egress_link, "attach_cgroup(egress)")) goto done;
ingress_link = bpf_program__attach_cgroup(skel->progs.ingress_read_sock_fields, child_cg_fd); - if (CHECK(IS_ERR(ingress_link), "attach_cgroup(ingress)", "err:%ld\n", - PTR_ERR(ingress_link))) + if (!ASSERT_OK_PTR(ingress_link, "attach_cgroup(ingress)")) goto done;
linum_map_fd = bpf_map__fd(skel->maps.linum_map); @@ -375,8 +373,8 @@ void test_sock_fields(void) bpf_link__destroy(egress_link); bpf_link__destroy(ingress_link); test_sock_fields__destroy(skel); - if (child_cg_fd != -1) + if (child_cg_fd >= 0) close(child_cg_fd); - if (parent_cg_fd != -1) + if (parent_cg_fd >= 0) close(parent_cg_fd); } diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 85f73261fab0..6384f3cc5ad0 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -86,11 +86,11 @@ static void test_sockmap_create_update_free(enum bpf_map_type map_type) int s, map, err;
s = connected_socket_v4(); - if (CHECK_FAIL(s == -1)) + if (CHECK_FAIL(s < 0)) return;
map = bpf_create_map(map_type, sizeof(int), sizeof(int), 1, 0); - if (CHECK_FAIL(map == -1)) { + if (CHECK_FAIL(map < 0)) { perror("bpf_create_map"); goto out; } @@ -243,7 +243,7 @@ static void test_sockmap_copy(enum bpf_map_type map_type) opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); link = bpf_program__attach_iter(skel->progs.copy, &opts); - if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + if (!ASSERT_OK_PTR(link, "attach_iter")) goto out;
iter_fd = bpf_iter_create(bpf_link__fd(link)); diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c index 06b86addc181..7a0d64fdc192 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c @@ -98,7 +98,7 @@ static void run_tests(int family, enum bpf_map_type map_type) int map;
map = bpf_create_map(map_type, sizeof(int), sizeof(int), 1, 0); - if (CHECK_FAIL(map == -1)) { + if (CHECK_FAIL(map < 0)) { perror("bpf_map_create"); return; } diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index d7d65a700799..b734bb449827 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -139,7 +139,7 @@ #define xbpf_map_delete_elem(fd, key) \ ({ \ int __ret = bpf_map_delete_elem((fd), (key)); \ - if (__ret == -1) \ + if (__ret < 0) \ FAIL_ERRNO("map_delete"); \ __ret; \ }) @@ -147,7 +147,7 @@ #define xbpf_map_lookup_elem(fd, key, val) \ ({ \ int __ret = bpf_map_lookup_elem((fd), (key), (val)); \ - if (__ret == -1) \ + if (__ret < 0) \ FAIL_ERRNO("map_lookup"); \ __ret; \ }) @@ -155,7 +155,7 @@ #define xbpf_map_update_elem(fd, key, val, flags) \ ({ \ int __ret = bpf_map_update_elem((fd), (key), (val), (flags)); \ - if (__ret == -1) \ + if (__ret < 0) \ FAIL_ERRNO("map_update"); \ __ret; \ }) @@ -164,7 +164,7 @@ ({ \ int __ret = \ bpf_prog_attach((prog), (target), (type), (flags)); \ - if (__ret == -1) \ + if (__ret < 0) \ FAIL_ERRNO("prog_attach(" #type ")"); \ __ret; \ }) @@ -172,7 +172,7 @@ #define xbpf_prog_detach2(prog, target, type) \ ({ \ int __ret = bpf_prog_detach2((prog), (target), (type)); \ - if (__ret == -1) \ + if (__ret < 0) \ FAIL_ERRNO("prog_detach2(" #type ")"); \ __ret; \ }) diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c index 11a769e18f5d..0a91d8d9954b 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c @@ -62,8 +62,7 @@ void test_stacktrace_build_id_nmi(void)
skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu, pmu_fd); - if (CHECK(IS_ERR(skel->links.oncpu), "attach_perf_event", - "err %ld\n", PTR_ERR(skel->links.oncpu))) { + if (!ASSERT_OK_PTR(skel->links.oncpu, "attach_perf_event")) { close(pmu_fd); goto cleanup; } diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c index 37269d23df93..04b476bd62b9 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c @@ -21,7 +21,7 @@ void test_stacktrace_map(void) goto close_prog;
link = bpf_program__attach_tracepoint(prog, "sched", "sched_switch"); - if (CHECK(IS_ERR(link), "attach_tp", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_tp")) goto close_prog;
/* find map fds */ diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c index 404a5498e1a3..4fd30bb651ad 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c @@ -21,7 +21,7 @@ void test_stacktrace_map_raw_tp(void) goto close_prog;
link = bpf_program__attach_raw_tracepoint(prog, "sched_switch"); - if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_raw_tp")) goto close_prog;
/* find map fds */ @@ -59,7 +59,6 @@ void test_stacktrace_map_raw_tp(void) goto close_prog;
close_prog: - if (!IS_ERR_OR_NULL(link)) - bpf_link__destroy(link); + bpf_link__destroy(link); bpf_object__close(obj); } diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c b/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c index c85174cdcb77..990ad5984196 100644 --- a/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c +++ b/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c @@ -353,8 +353,7 @@ static void fastopen_estab(void) return;
link = bpf_program__attach_cgroup(skel->progs.estab, cg_fd); - if (CHECK(IS_ERR(link), "attach_cgroup(estab)", "err: %ld\n", - PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_cgroup(estab)")) return;
if (sk_fds_connect(&sk_fds, true)) { @@ -398,8 +397,7 @@ static void syncookie_estab(void) return;
link = bpf_program__attach_cgroup(skel->progs.estab, cg_fd); - if (CHECK(IS_ERR(link), "attach_cgroup(estab)", "err: %ld\n", - PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_cgroup(estab)")) return;
if (sk_fds_connect(&sk_fds, false)) { @@ -431,8 +429,7 @@ static void fin(void) return;
link = bpf_program__attach_cgroup(skel->progs.estab, cg_fd); - if (CHECK(IS_ERR(link), "attach_cgroup(estab)", "err: %ld\n", - PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_cgroup(estab)")) return;
if (sk_fds_connect(&sk_fds, false)) { @@ -471,8 +468,7 @@ static void __simple_estab(bool exprm) return;
link = bpf_program__attach_cgroup(skel->progs.estab, cg_fd); - if (CHECK(IS_ERR(link), "attach_cgroup(estab)", "err: %ld\n", - PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_cgroup(estab)")) return;
if (sk_fds_connect(&sk_fds, false)) { @@ -509,8 +505,7 @@ static void misc(void) return;
link = bpf_program__attach_cgroup(misc_skel->progs.misc_estab, cg_fd); - if (CHECK(IS_ERR(link), "attach_cgroup(misc_estab)", "err: %ld\n", - PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_cgroup(misc_estab)")) return;
if (sk_fds_connect(&sk_fds, false)) { diff --git a/tools/testing/selftests/bpf/prog_tests/test_overhead.c b/tools/testing/selftests/bpf/prog_tests/test_overhead.c index 9966685866fd..123c68c1917d 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_overhead.c +++ b/tools/testing/selftests/bpf/prog_tests/test_overhead.c @@ -73,7 +73,7 @@ void test_test_overhead(void) return;
obj = bpf_object__open_file("./test_overhead.o", NULL); - if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj))) + if (!ASSERT_OK_PTR(obj, "obj_open_file")) return;
kprobe_prog = bpf_object__find_program_by_title(obj, kprobe_name); @@ -108,7 +108,7 @@ void test_test_overhead(void) /* attach kprobe */ link = bpf_program__attach_kprobe(kprobe_prog, false /* retprobe */, kprobe_func); - if (CHECK(IS_ERR(link), "attach_kprobe", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_kprobe")) goto cleanup; test_run("kprobe"); bpf_link__destroy(link); @@ -116,28 +116,28 @@ void test_test_overhead(void) /* attach kretprobe */ link = bpf_program__attach_kprobe(kretprobe_prog, true /* retprobe */, kprobe_func); - if (CHECK(IS_ERR(link), "attach kretprobe", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_kretprobe")) goto cleanup; test_run("kretprobe"); bpf_link__destroy(link);
/* attach raw_tp */ link = bpf_program__attach_raw_tracepoint(raw_tp_prog, "task_rename"); - if (CHECK(IS_ERR(link), "attach fentry", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_raw_tp")) goto cleanup; test_run("raw_tp"); bpf_link__destroy(link);
/* attach fentry */ link = bpf_program__attach_trace(fentry_prog); - if (CHECK(IS_ERR(link), "attach fentry", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_fentry")) goto cleanup; test_run("fentry"); bpf_link__destroy(link);
/* attach fexit */ link = bpf_program__attach_trace(fexit_prog); - if (CHECK(IS_ERR(link), "attach fexit", "err %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "attach_fexit")) goto cleanup; test_run("fexit"); bpf_link__destroy(link); diff --git a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c index f3022d934e2d..d7f5a931d7f3 100644 --- a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c +++ b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c @@ -55,7 +55,7 @@ void test_trampoline_count(void) /* attach 'allowed' trampoline programs */ for (i = 0; i < MAX_TRAMP_PROGS; i++) { obj = bpf_object__open_file(object, NULL); - if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj))) { + if (!ASSERT_OK_PTR(obj, "obj_open_file")) { obj = NULL; goto cleanup; } @@ -68,14 +68,14 @@ void test_trampoline_count(void)
if (rand() % 2) { link = load(inst[i].obj, fentry_name); - if (CHECK(IS_ERR(link), "attach prog", "err %ld\n", PTR_ERR(link))) { + if (!ASSERT_OK_PTR(link, "attach_prog")) { link = NULL; goto cleanup; } inst[i].link_fentry = link; } else { link = load(inst[i].obj, fexit_name); - if (CHECK(IS_ERR(link), "attach prog", "err %ld\n", PTR_ERR(link))) { + if (!ASSERT_OK_PTR(link, "attach_prog")) { link = NULL; goto cleanup; } @@ -85,7 +85,7 @@ void test_trampoline_count(void)
/* and try 1 extra.. */ obj = bpf_object__open_file(object, NULL); - if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj))) { + if (!ASSERT_OK_PTR(obj, "obj_open_file")) { obj = NULL; goto cleanup; } @@ -96,13 +96,15 @@ void test_trampoline_count(void)
/* ..that needs to fail */ link = load(obj, fentry_name); - if (CHECK(!IS_ERR(link), "cannot attach over the limit", "err %ld\n", PTR_ERR(link))) { + err = libbpf_get_error(link); + if (!ASSERT_ERR_PTR(link, "cannot attach over the limit")) { bpf_link__destroy(link); goto cleanup_extra; }
/* with E2BIG error */ - CHECK(PTR_ERR(link) != -E2BIG, "proper error check", "err %ld\n", PTR_ERR(link)); + ASSERT_EQ(err, -E2BIG, "proper error check"); + ASSERT_EQ(link, NULL, "ptr_is_null");
/* and finaly execute the probe */ if (CHECK_FAIL(prctl(PR_GET_NAME, comm, 0L, 0L, 0L))) diff --git a/tools/testing/selftests/bpf/prog_tests/udp_limit.c b/tools/testing/selftests/bpf/prog_tests/udp_limit.c index 2aba09d4d01b..56c9d6bd38a3 100644 --- a/tools/testing/selftests/bpf/prog_tests/udp_limit.c +++ b/tools/testing/selftests/bpf/prog_tests/udp_limit.c @@ -22,11 +22,10 @@ void test_udp_limit(void) goto close_cgroup_fd;
skel->links.sock = bpf_program__attach_cgroup(skel->progs.sock, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.sock, "cg_attach_sock")) + goto close_skeleton; skel->links.sock_release = bpf_program__attach_cgroup(skel->progs.sock_release, cgroup_fd); - if (CHECK(IS_ERR(skel->links.sock) || IS_ERR(skel->links.sock_release), - "cg-attach", "sock %ld sock_release %ld", - PTR_ERR(skel->links.sock), - PTR_ERR(skel->links.sock_release))) + if (!ASSERT_OK_PTR(skel->links.sock_release, "cg_attach_sock_release")) goto close_skeleton;
/* BPF program enforces a single UDP socket per cgroup, diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c index 2c6c570b21f8..3bd5904b4db5 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c @@ -90,7 +90,7 @@ void test_xdp_bpf2bpf(void) pb_opts.ctx = &passed; pb = perf_buffer__new(bpf_map__fd(ftrace_skel->maps.perf_buf_map), 1, &pb_opts); - if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb))) + if (!ASSERT_OK_PTR(pb, "perf_buf__new")) goto out;
/* Run test program */ diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_link.c b/tools/testing/selftests/bpf/prog_tests/xdp_link.c index 6f814999b395..46eed0a33c23 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_link.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_link.c @@ -51,7 +51,7 @@ void test_xdp_link(void)
/* BPF link is not allowed to replace prog attachment */ link = bpf_program__attach_xdp(skel1->progs.xdp_handler, IFINDEX_LO); - if (CHECK(!IS_ERR(link), "link_attach_fail", "unexpected success\n")) { + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { bpf_link__destroy(link); /* best-effort detach prog */ opts.old_fd = prog_fd1; @@ -67,7 +67,7 @@ void test_xdp_link(void)
/* now BPF link should attach successfully */ link = bpf_program__attach_xdp(skel1->progs.xdp_handler, IFINDEX_LO); - if (CHECK(IS_ERR(link), "link_attach", "failed: %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "link_attach")) goto cleanup; skel1->links.xdp_handler = link;
@@ -95,7 +95,7 @@ void test_xdp_link(void)
/* BPF link is not allowed to replace another BPF link */ link = bpf_program__attach_xdp(skel2->progs.xdp_handler, IFINDEX_LO); - if (CHECK(!IS_ERR(link), "link_attach_fail", "unexpected success\n")) { + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { bpf_link__destroy(link); goto cleanup; } @@ -105,7 +105,7 @@ void test_xdp_link(void)
/* new link attach should succeed */ link = bpf_program__attach_xdp(skel2->progs.xdp_handler, IFINDEX_LO); - if (CHECK(IS_ERR(link), "link_attach", "failed: %ld\n", PTR_ERR(link))) + if (!ASSERT_OK_PTR(link, "link_attach")) goto cleanup; skel2->links.xdp_handler = link;
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c index b241bb07972e..759e5f1935f0 100644 --- a/tools/testing/selftests/bpf/test_maps.c +++ b/tools/testing/selftests/bpf/test_maps.c @@ -53,12 +53,12 @@ static void test_hashmap(unsigned int task, void *data)
value = 0; /* BPF_NOEXIST means add new element if it doesn't exist. */ - assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) < 0 && /* key=1 already exists. */ errno == EEXIST);
/* -1 is an invalid flag. */ - assert(bpf_map_update_elem(fd, &key, &value, -1) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, -1) < 0 && errno == EINVAL);
/* Check that key=1 can be found. */ @@ -66,10 +66,10 @@ static void test_hashmap(unsigned int task, void *data)
key = 2; /* Check that key=2 is not found. */ - assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == ENOENT); + assert(bpf_map_lookup_elem(fd, &key, &value) < 0 && errno == ENOENT);
/* BPF_EXIST means update existing element. */ - assert(bpf_map_update_elem(fd, &key, &value, BPF_EXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_EXIST) < 0 && /* key=2 is not there. */ errno == ENOENT);
@@ -80,7 +80,7 @@ static void test_hashmap(unsigned int task, void *data) * inserted due to max_entries limit. */ key = 0; - assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) < 0 && errno == E2BIG);
/* Update existing element, though the map is full. */ @@ -89,12 +89,12 @@ static void test_hashmap(unsigned int task, void *data) key = 2; assert(bpf_map_update_elem(fd, &key, &value, BPF_ANY) == 0); key = 3; - assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) < 0 && errno == E2BIG);
/* Check that key = 0 doesn't exist. */ key = 0; - assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT); + assert(bpf_map_delete_elem(fd, &key) < 0 && errno == ENOENT);
/* Iterate over two elements. */ assert(bpf_map_get_next_key(fd, NULL, &first_key) == 0 && @@ -104,7 +104,7 @@ static void test_hashmap(unsigned int task, void *data) assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 && (next_key == 1 || next_key == 2) && (next_key != first_key)); - assert(bpf_map_get_next_key(fd, &next_key, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, &next_key, &next_key) < 0 && errno == ENOENT);
/* Delete both elements. */ @@ -112,13 +112,13 @@ static void test_hashmap(unsigned int task, void *data) assert(bpf_map_delete_elem(fd, &key) == 0); key = 2; assert(bpf_map_delete_elem(fd, &key) == 0); - assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT); + assert(bpf_map_delete_elem(fd, &key) < 0 && errno == ENOENT);
key = 0; /* Check that map is empty. */ - assert(bpf_map_get_next_key(fd, NULL, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, NULL, &next_key) < 0 && errno == ENOENT); - assert(bpf_map_get_next_key(fd, &key, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, &key, &next_key) < 0 && errno == ENOENT);
close(fd); @@ -169,12 +169,12 @@ static void test_hashmap_percpu(unsigned int task, void *data) expected_key_mask |= key;
/* BPF_NOEXIST means add new element if it doesn't exist. */ - assert(bpf_map_update_elem(fd, &key, value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, value, BPF_NOEXIST) < 0 && /* key=1 already exists. */ errno == EEXIST);
/* -1 is an invalid flag. */ - assert(bpf_map_update_elem(fd, &key, value, -1) == -1 && + assert(bpf_map_update_elem(fd, &key, value, -1) < 0 && errno == EINVAL);
/* Check that key=1 can be found. Value could be 0 if the lookup @@ -186,10 +186,10 @@ static void test_hashmap_percpu(unsigned int task, void *data)
key = 2; /* Check that key=2 is not found. */ - assert(bpf_map_lookup_elem(fd, &key, value) == -1 && errno == ENOENT); + assert(bpf_map_lookup_elem(fd, &key, value) < 0 && errno == ENOENT);
/* BPF_EXIST means update existing element. */ - assert(bpf_map_update_elem(fd, &key, value, BPF_EXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, value, BPF_EXIST) < 0 && /* key=2 is not there. */ errno == ENOENT);
@@ -202,11 +202,11 @@ static void test_hashmap_percpu(unsigned int task, void *data) * inserted due to max_entries limit. */ key = 0; - assert(bpf_map_update_elem(fd, &key, value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, value, BPF_NOEXIST) < 0 && errno == E2BIG);
/* Check that key = 0 doesn't exist. */ - assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT); + assert(bpf_map_delete_elem(fd, &key) < 0 && errno == ENOENT);
/* Iterate over two elements. */ assert(bpf_map_get_next_key(fd, NULL, &first_key) == 0 && @@ -237,13 +237,13 @@ static void test_hashmap_percpu(unsigned int task, void *data) assert(bpf_map_delete_elem(fd, &key) == 0); key = 2; assert(bpf_map_delete_elem(fd, &key) == 0); - assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT); + assert(bpf_map_delete_elem(fd, &key) < 0 && errno == ENOENT);
key = 0; /* Check that map is empty. */ - assert(bpf_map_get_next_key(fd, NULL, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, NULL, &next_key) < 0 && errno == ENOENT); - assert(bpf_map_get_next_key(fd, &key, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, &key, &next_key) < 0 && errno == ENOENT);
close(fd); @@ -360,7 +360,7 @@ static void test_arraymap(unsigned int task, void *data) assert(bpf_map_update_elem(fd, &key, &value, BPF_ANY) == 0);
value = 0; - assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) < 0 && errno == EEXIST);
/* Check that key=1 can be found. */ @@ -374,11 +374,11 @@ static void test_arraymap(unsigned int task, void *data) * due to max_entries limit. */ key = 2; - assert(bpf_map_update_elem(fd, &key, &value, BPF_EXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_EXIST) < 0 && errno == E2BIG);
/* Check that key = 2 doesn't exist. */ - assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == ENOENT); + assert(bpf_map_lookup_elem(fd, &key, &value) < 0 && errno == ENOENT);
/* Iterate over two elements. */ assert(bpf_map_get_next_key(fd, NULL, &next_key) == 0 && @@ -387,12 +387,12 @@ static void test_arraymap(unsigned int task, void *data) next_key == 0); assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 && next_key == 1); - assert(bpf_map_get_next_key(fd, &next_key, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, &next_key, &next_key) < 0 && errno == ENOENT);
/* Delete shouldn't succeed. */ key = 1; - assert(bpf_map_delete_elem(fd, &key) == -1 && errno == EINVAL); + assert(bpf_map_delete_elem(fd, &key) < 0 && errno == EINVAL);
close(fd); } @@ -418,7 +418,7 @@ static void test_arraymap_percpu(unsigned int task, void *data) assert(bpf_map_update_elem(fd, &key, values, BPF_ANY) == 0);
bpf_percpu(values, 0) = 0; - assert(bpf_map_update_elem(fd, &key, values, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, values, BPF_NOEXIST) < 0 && errno == EEXIST);
/* Check that key=1 can be found. */ @@ -433,11 +433,11 @@ static void test_arraymap_percpu(unsigned int task, void *data)
/* Check that key=2 cannot be inserted due to max_entries limit. */ key = 2; - assert(bpf_map_update_elem(fd, &key, values, BPF_EXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, values, BPF_EXIST) < 0 && errno == E2BIG);
/* Check that key = 2 doesn't exist. */ - assert(bpf_map_lookup_elem(fd, &key, values) == -1 && errno == ENOENT); + assert(bpf_map_lookup_elem(fd, &key, values) < 0 && errno == ENOENT);
/* Iterate over two elements. */ assert(bpf_map_get_next_key(fd, NULL, &next_key) == 0 && @@ -446,12 +446,12 @@ static void test_arraymap_percpu(unsigned int task, void *data) next_key == 0); assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 && next_key == 1); - assert(bpf_map_get_next_key(fd, &next_key, &next_key) == -1 && + assert(bpf_map_get_next_key(fd, &next_key, &next_key) < 0 && errno == ENOENT);
/* Delete shouldn't succeed. */ key = 1; - assert(bpf_map_delete_elem(fd, &key) == -1 && errno == EINVAL); + assert(bpf_map_delete_elem(fd, &key) < 0 && errno == EINVAL);
close(fd); } @@ -555,7 +555,7 @@ static void test_queuemap(unsigned int task, void *data) assert(bpf_map_update_elem(fd, NULL, &vals[i], 0) == 0);
/* Check that element cannot be pushed due to max_entries limit */ - assert(bpf_map_update_elem(fd, NULL, &val, 0) == -1 && + assert(bpf_map_update_elem(fd, NULL, &val, 0) < 0 && errno == E2BIG);
/* Peek element */ @@ -571,12 +571,12 @@ static void test_queuemap(unsigned int task, void *data) val == vals[i]);
/* Check that there are not elements left */ - assert(bpf_map_lookup_and_delete_elem(fd, NULL, &val) == -1 && + assert(bpf_map_lookup_and_delete_elem(fd, NULL, &val) < 0 && errno == ENOENT);
/* Check that non supported functions set errno to EINVAL */ - assert(bpf_map_delete_elem(fd, NULL) == -1 && errno == EINVAL); - assert(bpf_map_get_next_key(fd, NULL, NULL) == -1 && errno == EINVAL); + assert(bpf_map_delete_elem(fd, NULL) < 0 && errno == EINVAL); + assert(bpf_map_get_next_key(fd, NULL, NULL) < 0 && errno == EINVAL);
close(fd); } @@ -613,7 +613,7 @@ static void test_stackmap(unsigned int task, void *data) assert(bpf_map_update_elem(fd, NULL, &vals[i], 0) == 0);
/* Check that element cannot be pushed due to max_entries limit */ - assert(bpf_map_update_elem(fd, NULL, &val, 0) == -1 && + assert(bpf_map_update_elem(fd, NULL, &val, 0) < 0 && errno == E2BIG);
/* Peek element */ @@ -629,12 +629,12 @@ static void test_stackmap(unsigned int task, void *data) val == vals[i]);
/* Check that there are not elements left */ - assert(bpf_map_lookup_and_delete_elem(fd, NULL, &val) == -1 && + assert(bpf_map_lookup_and_delete_elem(fd, NULL, &val) < 0 && errno == ENOENT);
/* Check that non supported functions set errno to EINVAL */ - assert(bpf_map_delete_elem(fd, NULL) == -1 && errno == EINVAL); - assert(bpf_map_get_next_key(fd, NULL, NULL) == -1 && errno == EINVAL); + assert(bpf_map_delete_elem(fd, NULL) < 0 && errno == EINVAL); + assert(bpf_map_get_next_key(fd, NULL, NULL) < 0 && errno == EINVAL);
close(fd); } @@ -835,7 +835,7 @@ static void test_sockmap(unsigned int tasks, void *data) }
bpf_map_rx = bpf_object__find_map_by_name(obj, "sock_map_rx"); - if (IS_ERR(bpf_map_rx)) { + if (!bpf_map_rx) { printf("Failed to load map rx from verdict prog\n"); goto out_sockmap; } @@ -847,7 +847,7 @@ static void test_sockmap(unsigned int tasks, void *data) }
bpf_map_tx = bpf_object__find_map_by_name(obj, "sock_map_tx"); - if (IS_ERR(bpf_map_tx)) { + if (!bpf_map_tx) { printf("Failed to load map tx from verdict prog\n"); goto out_sockmap; } @@ -859,7 +859,7 @@ static void test_sockmap(unsigned int tasks, void *data) }
bpf_map_msg = bpf_object__find_map_by_name(obj, "sock_map_msg"); - if (IS_ERR(bpf_map_msg)) { + if (!bpf_map_msg) { printf("Failed to load map msg from msg_verdict prog\n"); goto out_sockmap; } @@ -871,7 +871,7 @@ static void test_sockmap(unsigned int tasks, void *data) }
bpf_map_break = bpf_object__find_map_by_name(obj, "sock_map_break"); - if (IS_ERR(bpf_map_break)) { + if (!bpf_map_break) { printf("Failed to load map tx from verdict prog\n"); goto out_sockmap; } @@ -1157,7 +1157,7 @@ static void test_map_in_map(void) }
map = bpf_object__find_map_by_name(obj, "mim_array"); - if (IS_ERR(map)) { + if (!map) { printf("Failed to load array of maps from test prog\n"); goto out_map_in_map; } @@ -1168,7 +1168,7 @@ static void test_map_in_map(void) }
map = bpf_object__find_map_by_name(obj, "mim_hash"); - if (IS_ERR(map)) { + if (!map) { printf("Failed to load hash of maps from test prog\n"); goto out_map_in_map; } @@ -1181,7 +1181,7 @@ static void test_map_in_map(void) bpf_object__load(obj);
map = bpf_object__find_map_by_name(obj, "mim_array"); - if (IS_ERR(map)) { + if (!map) { printf("Failed to load array of maps from test prog\n"); goto out_map_in_map; } @@ -1198,7 +1198,7 @@ static void test_map_in_map(void) }
map = bpf_object__find_map_by_name(obj, "mim_hash"); - if (IS_ERR(map)) { + if (!map) { printf("Failed to load hash of maps from test prog\n"); goto out_map_in_map; } @@ -1306,7 +1306,7 @@ static void test_map_large(void) }
key.c = -1; - assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) < 0 && errno == E2BIG);
/* Iterate through all elements. */ @@ -1314,12 +1314,12 @@ static void test_map_large(void) key.c = -1; for (i = 0; i < MAP_SIZE; i++) assert(bpf_map_get_next_key(fd, &key, &key) == 0); - assert(bpf_map_get_next_key(fd, &key, &key) == -1 && errno == ENOENT); + assert(bpf_map_get_next_key(fd, &key, &key) < 0 && errno == ENOENT);
key.c = 0; assert(bpf_map_lookup_elem(fd, &key, &value) == 0 && value == 0); key.a = 1; - assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == ENOENT); + assert(bpf_map_lookup_elem(fd, &key, &value) < 0 && errno == ENOENT);
close(fd); } @@ -1451,7 +1451,7 @@ static void test_map_parallel(void) run_parallel(TASKS, test_update_delete, data);
/* Check that key=0 is already there. */ - assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST) < 0 && errno == EEXIST);
/* Check that all elements were inserted. */ @@ -1459,7 +1459,7 @@ static void test_map_parallel(void) key = -1; for (i = 0; i < MAP_SIZE; i++) assert(bpf_map_get_next_key(fd, &key, &key) == 0); - assert(bpf_map_get_next_key(fd, &key, &key) == -1 && errno == ENOENT); + assert(bpf_map_get_next_key(fd, &key, &key) < 0 && errno == ENOENT);
/* Another check for all elements */ for (i = 0; i < MAP_SIZE; i++) { @@ -1475,8 +1475,8 @@ static void test_map_parallel(void)
/* Nothing should be left. */ key = -1; - assert(bpf_map_get_next_key(fd, NULL, &key) == -1 && errno == ENOENT); - assert(bpf_map_get_next_key(fd, &key, &key) == -1 && errno == ENOENT); + assert(bpf_map_get_next_key(fd, NULL, &key) < 0 && errno == ENOENT); + assert(bpf_map_get_next_key(fd, &key, &key) < 0 && errno == ENOENT); }
static void test_map_rdonly(void) @@ -1494,12 +1494,12 @@ static void test_map_rdonly(void) key = 1; value = 1234; /* Try to insert key=1 element. */ - assert(bpf_map_update_elem(fd, &key, &value, BPF_ANY) == -1 && + assert(bpf_map_update_elem(fd, &key, &value, BPF_ANY) < 0 && errno == EPERM);
/* Check that key=1 is not found. */ - assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == ENOENT); - assert(bpf_map_get_next_key(fd, &key, &value) == -1 && errno == ENOENT); + assert(bpf_map_lookup_elem(fd, &key, &value) < 0 && errno == ENOENT); + assert(bpf_map_get_next_key(fd, &key, &value) < 0 && errno == ENOENT);
close(fd); } @@ -1522,8 +1522,8 @@ static void test_map_wronly_hash(void) assert(bpf_map_update_elem(fd, &key, &value, BPF_ANY) == 0);
/* Check that reading elements and keys from the map is not allowed. */ - assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == EPERM); - assert(bpf_map_get_next_key(fd, &key, &value) == -1 && errno == EPERM); + assert(bpf_map_lookup_elem(fd, &key, &value) < 0 && errno == EPERM); + assert(bpf_map_get_next_key(fd, &key, &value) < 0 && errno == EPERM);
close(fd); } @@ -1550,10 +1550,10 @@ static void test_map_wronly_stack_or_queue(enum bpf_map_type map_type) assert(bpf_map_update_elem(fd, NULL, &value, BPF_ANY) == 0);
/* Peek element should fail */ - assert(bpf_map_lookup_elem(fd, NULL, &value) == -1 && errno == EPERM); + assert(bpf_map_lookup_elem(fd, NULL, &value) < 0 && errno == EPERM);
/* Pop element should fail */ - assert(bpf_map_lookup_and_delete_elem(fd, NULL, &value) == -1 && + assert(bpf_map_lookup_and_delete_elem(fd, NULL, &value) < 0 && errno == EPERM);
close(fd); @@ -1607,7 +1607,7 @@ static void prepare_reuseport_grp(int type, int map_fd, size_t map_elem_size, value = &fd32; } err = bpf_map_update_elem(map_fd, &index0, value, BPF_ANY); - CHECK(err != -1 || errno != EINVAL, + CHECK(err >= 0 || errno != EINVAL, "reuseport array update unbound sk", "sock_type:%d err:%d errno:%d\n", type, err, errno); @@ -1636,7 +1636,7 @@ static void prepare_reuseport_grp(int type, int map_fd, size_t map_elem_size, */ err = bpf_map_update_elem(map_fd, &index0, value, BPF_ANY); - CHECK(err != -1 || errno != EINVAL, + CHECK(err >= 0 || errno != EINVAL, "reuseport array update non-listening sk", "sock_type:%d err:%d errno:%d\n", type, err, errno); @@ -1666,31 +1666,31 @@ static void test_reuseport_array(void)
map_fd = bpf_create_map(BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, sizeof(__u32), sizeof(__u64), array_size, 0); - CHECK(map_fd == -1, "reuseport array create", + CHECK(map_fd < 0, "reuseport array create", "map_fd:%d, errno:%d\n", map_fd, errno);
/* Test lookup/update/delete with invalid index */ err = bpf_map_delete_elem(map_fd, &bad_index); - CHECK(err != -1 || errno != E2BIG, "reuseport array del >=max_entries", + CHECK(err >= 0 || errno != E2BIG, "reuseport array del >=max_entries", "err:%d errno:%d\n", err, errno);
err = bpf_map_update_elem(map_fd, &bad_index, &fd64, BPF_ANY); - CHECK(err != -1 || errno != E2BIG, + CHECK(err >= 0 || errno != E2BIG, "reuseport array update >=max_entries", "err:%d errno:%d\n", err, errno);
err = bpf_map_lookup_elem(map_fd, &bad_index, &map_cookie); - CHECK(err != -1 || errno != ENOENT, + CHECK(err >= 0 || errno != ENOENT, "reuseport array update >=max_entries", "err:%d errno:%d\n", err, errno);
/* Test lookup/delete non existence elem */ err = bpf_map_lookup_elem(map_fd, &index3, &map_cookie); - CHECK(err != -1 || errno != ENOENT, + CHECK(err >= 0 || errno != ENOENT, "reuseport array lookup not-exist elem", "err:%d errno:%d\n", err, errno); err = bpf_map_delete_elem(map_fd, &index3); - CHECK(err != -1 || errno != ENOENT, + CHECK(err >= 0 || errno != ENOENT, "reuseport array del not-exist elem", "err:%d errno:%d\n", err, errno);
@@ -1704,7 +1704,7 @@ static void test_reuseport_array(void) /* BPF_EXIST failure case */ err = bpf_map_update_elem(map_fd, &index3, &grpa_fds64[fds_idx], BPF_EXIST); - CHECK(err != -1 || errno != ENOENT, + CHECK(err >= 0 || errno != ENOENT, "reuseport array update empty elem BPF_EXIST", "sock_type:%d err:%d errno:%d\n", type, err, errno); @@ -1713,7 +1713,7 @@ static void test_reuseport_array(void) /* BPF_NOEXIST success case */ err = bpf_map_update_elem(map_fd, &index3, &grpa_fds64[fds_idx], BPF_NOEXIST); - CHECK(err == -1, + CHECK(err < 0, "reuseport array update empty elem BPF_NOEXIST", "sock_type:%d err:%d errno:%d\n", type, err, errno); @@ -1722,7 +1722,7 @@ static void test_reuseport_array(void) /* BPF_EXIST success case. */ err = bpf_map_update_elem(map_fd, &index3, &grpa_fds64[fds_idx], BPF_EXIST); - CHECK(err == -1, + CHECK(err < 0, "reuseport array update same elem BPF_EXIST", "sock_type:%d err:%d errno:%d\n", type, err, errno); fds_idx = REUSEPORT_FD_IDX(err, fds_idx); @@ -1730,7 +1730,7 @@ static void test_reuseport_array(void) /* BPF_NOEXIST failure case */ err = bpf_map_update_elem(map_fd, &index3, &grpa_fds64[fds_idx], BPF_NOEXIST); - CHECK(err != -1 || errno != EEXIST, + CHECK(err >= 0 || errno != EEXIST, "reuseport array update non-empty elem BPF_NOEXIST", "sock_type:%d err:%d errno:%d\n", type, err, errno); @@ -1739,7 +1739,7 @@ static void test_reuseport_array(void) /* BPF_ANY case (always succeed) */ err = bpf_map_update_elem(map_fd, &index3, &grpa_fds64[fds_idx], BPF_ANY); - CHECK(err == -1, + CHECK(err < 0, "reuseport array update same sk with BPF_ANY", "sock_type:%d err:%d errno:%d\n", type, err, errno);
@@ -1748,32 +1748,32 @@ static void test_reuseport_array(void)
/* The same sk cannot be added to reuseport_array twice */ err = bpf_map_update_elem(map_fd, &index3, &fd64, BPF_ANY); - CHECK(err != -1 || errno != EBUSY, + CHECK(err >= 0 || errno != EBUSY, "reuseport array update same sk with same index", "sock_type:%d err:%d errno:%d\n", type, err, errno);
err = bpf_map_update_elem(map_fd, &index0, &fd64, BPF_ANY); - CHECK(err != -1 || errno != EBUSY, + CHECK(err >= 0 || errno != EBUSY, "reuseport array update same sk with different index", "sock_type:%d err:%d errno:%d\n", type, err, errno);
/* Test delete elem */ err = bpf_map_delete_elem(map_fd, &index3); - CHECK(err == -1, "reuseport array delete sk", + CHECK(err < 0, "reuseport array delete sk", "sock_type:%d err:%d errno:%d\n", type, err, errno);
/* Add it back with BPF_NOEXIST */ err = bpf_map_update_elem(map_fd, &index3, &fd64, BPF_NOEXIST); - CHECK(err == -1, + CHECK(err < 0, "reuseport array re-add with BPF_NOEXIST after del", "sock_type:%d err:%d errno:%d\n", type, err, errno);
/* Test cookie */ err = bpf_map_lookup_elem(map_fd, &index3, &map_cookie); - CHECK(err == -1 || sk_cookie != map_cookie, + CHECK(err < 0 || sk_cookie != map_cookie, "reuseport array lookup re-added sk", "sock_type:%d err:%d errno:%d sk_cookie:0x%llx map_cookie:0x%llxn", type, err, errno, sk_cookie, map_cookie); @@ -1782,7 +1782,7 @@ static void test_reuseport_array(void) for (f = 0; f < ARRAY_SIZE(grpa_fds64); f++) close(grpa_fds64[f]); err = bpf_map_lookup_elem(map_fd, &index3, &map_cookie); - CHECK(err != -1 || errno != ENOENT, + CHECK(err >= 0 || errno != ENOENT, "reuseport array lookup after close()", "sock_type:%d err:%d errno:%d\n", type, err, errno); @@ -1793,7 +1793,7 @@ static void test_reuseport_array(void) CHECK(fd64 == -1, "socket(SOCK_RAW)", "err:%d errno:%d\n", err, errno); err = bpf_map_update_elem(map_fd, &index3, &fd64, BPF_NOEXIST); - CHECK(err != -1 || errno != ENOTSUPP, "reuseport array update SOCK_RAW", + CHECK(err >= 0 || errno != ENOTSUPP, "reuseport array update SOCK_RAW", "err:%d errno:%d\n", err, errno); close(fd64);
@@ -1803,16 +1803,16 @@ static void test_reuseport_array(void) /* Test 32 bit fd */ map_fd = bpf_create_map(BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, sizeof(__u32), sizeof(__u32), array_size, 0); - CHECK(map_fd == -1, "reuseport array create", + CHECK(map_fd < 0, "reuseport array create", "map_fd:%d, errno:%d\n", map_fd, errno); prepare_reuseport_grp(SOCK_STREAM, map_fd, sizeof(__u32), &fd64, &sk_cookie, 1); fd = fd64; err = bpf_map_update_elem(map_fd, &index3, &fd, BPF_NOEXIST); - CHECK(err == -1, "reuseport array update 32 bit fd", + CHECK(err < 0, "reuseport array update 32 bit fd", "err:%d errno:%d\n", err, errno); err = bpf_map_lookup_elem(map_fd, &index3, &map_cookie); - CHECK(err != -1 || errno != ENOSPC, + CHECK(err >= 0 || errno != ENOSPC, "reuseport array lookup 32 bit fd", "err:%d errno:%d\n", err, errno); close(fd); @@ -1858,6 +1858,8 @@ int main(void) { srand(time(NULL));
+ libbpf_set_strict_mode(LIBBPF_STRICT_ALL); + map_flags = 0; run_all_tests();
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 91c52c98bd28..935c90db3dee 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -727,6 +727,9 @@ int main(int argc, char **argv) if (err) return err;
+ /* Use libbpf 1.0 API mode */ + libbpf_set_strict_mode(LIBBPF_STRICT_ALL); + libbpf_set_print(libbpf_print_fn);
srand(time(NULL)); diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index a2f3dd14b313..1a0bf27d42c7 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -239,16 +239,17 @@ extern int test__join_cgroup(const char *path); #define ASSERT_OK_PTR(ptr, name) ({ \ static int duration = 0; \ const void *___res = (ptr); \ - bool ___ok = !IS_ERR_OR_NULL(___res); \ - CHECK(!___ok, (name), \ - "unexpected error: %ld\n", PTR_ERR(___res)); \ + int ___err = libbpf_get_error(___res); \ + bool ___ok = ___err == 0; \ + CHECK(!___ok, (name), "unexpected error: %d\n", ___err); \ ___ok; \ })
#define ASSERT_ERR_PTR(ptr, name) ({ \ static int duration = 0; \ const void *___res = (ptr); \ - bool ___ok = IS_ERR(___res); \ + int ___err = libbpf_get_error(___res); \ + bool ___ok = ___err != 0; \ CHECK(!___ok, (name), "unexpected pointer: %p\n", ___res); \ ___ok; \ }) diff --git a/tools/testing/selftests/bpf/test_tcpnotify_user.c b/tools/testing/selftests/bpf/test_tcpnotify_user.c index 73da7fe8c152..4a39304cc5a6 100644 --- a/tools/testing/selftests/bpf/test_tcpnotify_user.c +++ b/tools/testing/selftests/bpf/test_tcpnotify_user.c @@ -82,6 +82,8 @@ int main(int argc, char **argv) cpu_set_t cpuset; __u32 key = 0;
+ libbpf_set_strict_mode(LIBBPF_STRICT_ALL); + CPU_ZERO(&cpuset); CPU_SET(0, &cpuset); pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset); @@ -116,7 +118,7 @@ int main(int argc, char **argv)
pb_opts.sample_cb = dummyfn; pb = perf_buffer__new(bpf_map__fd(perf_map), 8, &pb_opts); - if (IS_ERR(pb)) + if (!pb) goto err;
pthread_create(&tid, NULL, poller_thread, pb); @@ -163,7 +165,6 @@ int main(int argc, char **argv) bpf_prog_detach(cg_fd, BPF_CGROUP_SOCK_OPS); close(cg_fd); cleanup_cgroup_environment(); - if (!IS_ERR_OR_NULL(pb)) - perf_buffer__free(pb); + perf_buffer__free(pb); return error; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 60aed22076b0d0ec2b7c7f9dba3ccd642520e1f3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------
Switch fexit_bpf2bpf selftest to bpf_program__set_attach_target() instead of using bpf_object_open_opts.attach_prog_fd, which is going to be deprecated. These changes also demonstrate the new mode of set_attach_target() in which it allows NULL when the target is BPF program (attach_prog_fd != 0).
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210916015836.1248906-6-andrii@kernel.org (cherry picked from commit 60aed22076b0d0ec2b7c7f9dba3ccd642520e1f3) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/fexit_bpf2bpf.c | 43 +++++++++++-------- 1 file changed, 26 insertions(+), 17 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index d7082f422b4e..337a1c011d77 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -58,7 +58,7 @@ static void test_fexit_bpf2bpf_common(const char *obj_file, test_cb cb) { struct bpf_object *obj = NULL, *tgt_obj; - struct bpf_program **prog = NULL; + struct bpf_program **prog = NULL, *p; struct bpf_link **link = NULL; __u32 duration = 0, retval; int err, tgt_fd, i; @@ -68,21 +68,23 @@ static void test_fexit_bpf2bpf_common(const char *obj_file, if (CHECK(err, "tgt_prog_load", "file %s err %d errno %d\n", target_obj_file, err, errno)) return; - DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts, - .attach_prog_fd = tgt_fd, - );
link = calloc(sizeof(struct bpf_link *), prog_cnt); prog = calloc(sizeof(struct bpf_program *), prog_cnt); if (CHECK(!link || !prog, "alloc_memory", "failed to alloc memory")) goto close_prog;
- obj = bpf_object__open_file(obj_file, &opts); + obj = bpf_object__open_file(obj_file, NULL); if (CHECK(IS_ERR_OR_NULL(obj), "obj_open", "failed to open %s: %ld\n", obj_file, PTR_ERR(obj))) goto close_prog;
+ bpf_object__for_each_program(p, obj) { + err = bpf_program__set_attach_target(p, tgt_fd, NULL); + ASSERT_OK(err, "set_attach_target"); + } + err = bpf_object__load(obj); if (CHECK(err, "obj_load", "err %d\n", err)) goto close_prog; @@ -240,7 +242,7 @@ static void test_fmod_ret_freplace(void) struct bpf_link *freplace_link = NULL; struct bpf_program *prog; __u32 duration = 0; - int err, pkt_fd; + int err, pkt_fd, attach_prog_fd;
err = bpf_prog_load(tgt_name, BPF_PROG_TYPE_UNSPEC, &pkt_obj, &pkt_fd); @@ -248,26 +250,32 @@ static void test_fmod_ret_freplace(void) if (CHECK(err, "tgt_prog_load", "file %s err %d errno %d\n", tgt_name, err, errno)) return; - opts.attach_prog_fd = pkt_fd;
- freplace_obj = bpf_object__open_file(freplace_name, &opts); + freplace_obj = bpf_object__open_file(freplace_name, NULL); if (!ASSERT_OK_PTR(freplace_obj, "freplace_obj_open")) goto out;
+ prog = bpf_program__next(NULL, freplace_obj); + err = bpf_program__set_attach_target(prog, pkt_fd, NULL); + ASSERT_OK(err, "freplace__set_attach_target"); + err = bpf_object__load(freplace_obj); if (CHECK(err, "freplace_obj_load", "err %d\n", err)) goto out;
- prog = bpf_program__next(NULL, freplace_obj); freplace_link = bpf_program__attach_trace(prog); if (!ASSERT_OK_PTR(freplace_link, "freplace_attach_trace")) goto out;
- opts.attach_prog_fd = bpf_program__fd(prog); - fmod_obj = bpf_object__open_file(fmod_ret_name, &opts); + fmod_obj = bpf_object__open_file(fmod_ret_name, NULL); if (!ASSERT_OK_PTR(fmod_obj, "fmod_obj_open")) goto out;
+ attach_prog_fd = bpf_program__fd(prog); + prog = bpf_program__next(NULL, fmod_obj); + err = bpf_program__set_attach_target(prog, attach_prog_fd, NULL); + ASSERT_OK(err, "fmod_ret_set_attach_target"); + err = bpf_object__load(fmod_obj); if (CHECK(!err, "fmod_obj_load", "loading fmod_ret should fail\n")) goto out; @@ -292,14 +300,14 @@ static void test_func_sockmap_update(void) }
static void test_obj_load_failure_common(const char *obj_file, - const char *target_obj_file) - + const char *target_obj_file) { /* * standalone test that asserts failure to load freplace prog * because of invalid return code. */ struct bpf_object *obj = NULL, *pkt_obj; + struct bpf_program *prog; int err, pkt_fd; __u32 duration = 0;
@@ -309,14 +317,15 @@ static void test_obj_load_failure_common(const char *obj_file, if (CHECK(err, "tgt_prog_load", "file %s err %d errno %d\n", target_obj_file, err, errno)) return; - DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts, - .attach_prog_fd = pkt_fd, - );
- obj = bpf_object__open_file(obj_file, &opts); + obj = bpf_object__open_file(obj_file, NULL); if (!ASSERT_OK_PTR(obj, "obj_open")) goto close_prog;
+ prog = bpf_program__next(NULL, obj); + err = bpf_program__set_attach_target(prog, pkt_fd, NULL); + ASSERT_OK(err, "set_attach_target"); + /* It should fail to load the program */ err = bpf_object__load(obj); if (CHECK(!err, "bpf_obj_load should fail", "err %d\n", err))
From: Toke Høiland-Jørgensen toke@redhat.com
mainline inclusion from mainline-5.16-rc1 commit c3e8c44a90631d2479fec6ecc6ba37e3188f487d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When parsing legacy map definitions, libbpf would error out when encountering an STT_SECTION symbol. This becomes a problem because some versions of binutils will produce SECTION symbols for every section when processing an ELF file, so BPF files run through 'strip' will end up with such symbols, making libbpf refuse to load them.
There's not really any reason why erroring out is strictly necessary, so change libbpf to just ignore SECTION symbols when parsing the ELF.
Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com (cherry picked from commit c3e8c44a90631d2479fec6ecc6ba37e3188f487d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e44bde6c6538..9fef04507aea 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1867,6 +1867,8 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) continue; if (sym.st_shndx != obj->efile.maps_shndx) continue; + if (GELF_ST_TYPE(sym.st_info) == STT_SECTION) + continue;
map = bpf_object__add_map(obj); if (IS_ERR(map)) @@ -1879,8 +1881,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) return -LIBBPF_ERRNO__FORMAT; }
- if (GELF_ST_TYPE(sym.st_info) == STT_SECTION - || GELF_ST_BIND(sym.st_info) == STB_LOCAL) { + if (GELF_ST_BIND(sym.st_info) == STB_LOCAL) { pr_warn("map '%s' (legacy): static maps are not supported\n", map_name); return -ENOTSUP; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 9673268f03ba72efcc00fa95f3fe3744fcae0dd0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF program type. "classifier" is a misleading terminology and should be migrated away from.
[0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.n...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org (cherry picked from commit 9673268f03ba72efcc00fa95f3fe3744fcae0dd0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 9fef04507aea..f21abda0a7f0 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7929,6 +7929,7 @@ static const struct bpf_sec_def section_defs[] = { .attach_fn = attach_kprobe), BPF_PROG_SEC("uretprobe/", BPF_PROG_TYPE_KPROBE), BPF_PROG_SEC("classifier", BPF_PROG_TYPE_SCHED_CLS), + BPF_PROG_SEC("tc", BPF_PROG_TYPE_SCHED_CLS), BPF_PROG_SEC("action", BPF_PROG_TYPE_SCHED_ACT), SEC_DEF("tracepoint/", TRACEPOINT, .attach_fn = attach_tp),
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 66fe33241726d1f872e55d95a35c063d58602ae1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Align gen_loader data to 8 byte boundary to make sure union bpf_attr, bpf_insns and other structs are aligned.
Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com (cherry picked from commit 66fe33241726d1f872e55d95a35c063d58602ae1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 8df718a6b142..80087b13877f 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -5,6 +5,7 @@ #include <string.h> #include <errno.h> #include <linux/filter.h> +#include <sys/param.h> #include "btf.h" #include "bpf.h" #include "libbpf.h" @@ -135,13 +136,17 @@ void bpf_gen__init(struct bpf_gen *gen, int log_level)
static int add_data(struct bpf_gen *gen, const void *data, __u32 size) { + __u32 size8 = roundup(size, 8); + __u64 zero = 0; void *prev;
- if (realloc_data_buf(gen, size)) + if (realloc_data_buf(gen, size8)) return 0; prev = gen->data_cur; memcpy(gen->data_cur, data, size); gen->data_cur += size; + memcpy(gen->data_cur, &zero, size8 - size); + gen->data_cur += size8 - size; return prev - gen->data_start; }
From: Toke Høiland-Jørgensen toke@redhat.com
mainline inclusion from mainline-5.16-rc1 commit 161ecd537948a7003129889b04a3a0858687bc70 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The previous patch to ignore STT_SECTION symbols only added the ignore condition in one of them. This fails if there's more than one map definition in the 'maps' section, because the subsequent modulus check will fail, resulting in error messages like:
libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o
Fix this by also ignoring STT_SECTION in the first loop.
Fixes: c3e8c44a9063 ("libbpf: Ignore STT_SECTION symbols in 'maps' section") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com (cherry picked from commit 161ecd537948a7003129889b04a3a0858687bc70) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f21abda0a7f0..0ecf8ba98a48 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1843,6 +1843,8 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) continue; if (sym.st_shndx != obj->efile.maps_shndx) continue; + if (GELF_ST_TYPE(sym.st_info) == STT_SECTION) + continue; nr_maps++; } /* Assume equally sized map definitions */
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.16-rc1 commit f731052325efc3726577feb743c7495f880ae07d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
A bunch of BPF maps do not support specifying BTF types for key and value. This is non-uniform and inconvenient[0]. Currently, libbpf uses a retry logic which removes BTF type IDs when BPF map creation failed. Instead of retrying, this commit recognizes those specialized maps and removes BTF type IDs when creating BPF map.
[0] Closes: https://github.com/libbpf/libbpf/issues/355
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com (cherry picked from commit f731052325efc3726577feb743c7495f880ae07d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0ecf8ba98a48..2e5111edd6ee 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4611,6 +4611,30 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b create_attr.inner_map_fd = map->inner_map_fd; }
+ switch (def->type) { + case BPF_MAP_TYPE_PERF_EVENT_ARRAY: + case BPF_MAP_TYPE_CGROUP_ARRAY: + case BPF_MAP_TYPE_STACK_TRACE: + case BPF_MAP_TYPE_ARRAY_OF_MAPS: + case BPF_MAP_TYPE_HASH_OF_MAPS: + case BPF_MAP_TYPE_DEVMAP: + case BPF_MAP_TYPE_DEVMAP_HASH: + case BPF_MAP_TYPE_CPUMAP: + case BPF_MAP_TYPE_XSKMAP: + case BPF_MAP_TYPE_SOCKMAP: + case BPF_MAP_TYPE_SOCKHASH: + case BPF_MAP_TYPE_QUEUE: + case BPF_MAP_TYPE_STACK: + case BPF_MAP_TYPE_RINGBUF: + create_attr.btf_fd = 0; + create_attr.btf_key_type_id = 0; + create_attr.btf_value_type_id = 0; + map->btf_key_type_id = 0; + map->btf_value_type_id = 0; + default: + break; + } + if (obj->gen_loader) { bpf_gen__map_create(obj->gen_loader, &create_attr, is_inner ? -1 : map - obj->maps); /* Pretend to have valid FD to pass various fd >= 0 checks.
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.16-rc1 commit 18f4fccbf314fdb07d276f4cd3eaf53f1825550d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This change updates the BPF syscall loader to relocate BTF_KIND_FUNC relocations, with support for weak kfunc relocations. The general idea is to move map_fds to loader map, and also use the data for storing kfunc BTF fds. Since both reuse the fd_array parameter, they need to be kept together.
For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc, we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more chances of being <= INT16_MAX than treating data map as a sparse array and adding fd as needed.
When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse array model, so that as long as it does remain <= INT16_MAX, we pass an index relative to the start of fd_array.
We store all ksyms in an array where we try to avoid calling the bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was already stored. This also speeds up the loading process compared to emitting calls in all cases, in later tests.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com (cherry picked from commit 18f4fccbf314fdb07d276f4cd3eaf53f1825550d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_gen_internal.h | 16 +- tools/lib/bpf/gen_loader.c | 314 +++++++++++++++++++++++++------ tools/lib/bpf/libbpf.c | 8 +- 3 files changed, 280 insertions(+), 58 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h index 615400391e57..70eccbffefb1 100644 --- a/tools/lib/bpf/bpf_gen_internal.h +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -7,6 +7,15 @@ struct ksym_relo_desc { const char *name; int kind; int insn_idx; + bool is_weak; +}; + +struct ksym_desc { + const char *name; + int ref; + int kind; + int off; + int insn; };
struct bpf_gen { @@ -24,6 +33,10 @@ struct bpf_gen { int relo_cnt; char attach_target[128]; int attach_kind; + struct ksym_desc *ksyms; + __u32 nr_ksyms; + int fd_array; + int nr_fd_array; };
void bpf_gen__init(struct bpf_gen *gen, int log_level); @@ -36,6 +49,7 @@ void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_a void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size); void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx); void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type); -void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx); +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, int kind, + int insn_idx);
#endif diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 80087b13877f..937bfc7db41e 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -14,8 +14,10 @@ #include "bpf_gen_internal.h" #include "skel_internal.h"
-#define MAX_USED_MAPS 64 -#define MAX_USED_PROGS 32 +#define MAX_USED_MAPS 64 +#define MAX_USED_PROGS 32 +#define MAX_KFUNC_DESCS 256 +#define MAX_FD_ARRAY_SZ (MAX_USED_PROGS + MAX_KFUNC_DESCS)
/* The following structure describes the stack layout of the loader program. * In addition R6 contains the pointer to context. @@ -30,7 +32,6 @@ */ struct loader_stack { __u32 btf_fd; - __u32 map_fd[MAX_USED_MAPS]; __u32 prog_fd[MAX_USED_PROGS]; __u32 inner_map_fd; }; @@ -143,13 +144,49 @@ static int add_data(struct bpf_gen *gen, const void *data, __u32 size) if (realloc_data_buf(gen, size8)) return 0; prev = gen->data_cur; - memcpy(gen->data_cur, data, size); - gen->data_cur += size; - memcpy(gen->data_cur, &zero, size8 - size); - gen->data_cur += size8 - size; + if (data) { + memcpy(gen->data_cur, data, size); + memcpy(gen->data_cur + size, &zero, size8 - size); + } else { + memset(gen->data_cur, 0, size8); + } + gen->data_cur += size8; return prev - gen->data_start; }
+/* Get index for map_fd/btf_fd slot in reserved fd_array, or in data relative + * to start of fd_array. Caller can decide if it is usable or not. + */ +static int add_map_fd(struct bpf_gen *gen) +{ + if (!gen->fd_array) + gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int)); + if (gen->nr_maps == MAX_USED_MAPS) { + pr_warn("Total maps exceeds %d\n", MAX_USED_MAPS); + gen->error = -E2BIG; + return 0; + } + return gen->nr_maps++; +} + +static int add_kfunc_btf_fd(struct bpf_gen *gen) +{ + int cur; + + if (!gen->fd_array) + gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int)); + if (gen->nr_fd_array == MAX_KFUNC_DESCS) { + cur = add_data(gen, NULL, sizeof(int)); + return (cur - gen->fd_array) / sizeof(int); + } + return MAX_USED_MAPS + gen->nr_fd_array++; +} + +static int blob_fd_array_off(struct bpf_gen *gen, int index) +{ + return gen->fd_array + index * sizeof(int); +} + static int insn_bytes_to_bpf_size(__u32 sz) { switch (sz) { @@ -171,14 +208,22 @@ static void emit_rel_store(struct bpf_gen *gen, int off, int data) emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0)); }
-/* *(u64 *)(blob + off) = (u64)(void *)(%sp + stack_off) */ -static void emit_rel_store_sp(struct bpf_gen *gen, int off, int stack_off) +static void move_blob2blob(struct bpf_gen *gen, int off, int size, int blob_off) { - emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_10)); - emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, stack_off)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, blob_off)); + emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_2, 0)); emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off)); - emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0)); + emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0)); +} + +static void move_blob2ctx(struct bpf_gen *gen, int ctx_off, int size, int blob_off) +{ + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, blob_off)); + emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_1, 0)); + emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_6, BPF_REG_0, ctx_off)); }
static void move_ctx2blob(struct bpf_gen *gen, int off, int size, int ctx_off, @@ -326,11 +371,11 @@ int bpf_gen__finish(struct bpf_gen *gen) offsetof(struct bpf_prog_desc, prog_fd), 4, stack_off(prog_fd[i])); for (i = 0; i < gen->nr_maps; i++) - move_stack2ctx(gen, - sizeof(struct bpf_loader_ctx) + - sizeof(struct bpf_map_desc) * i + - offsetof(struct bpf_map_desc, map_fd), 4, - stack_off(map_fd[i])); + move_blob2ctx(gen, + sizeof(struct bpf_loader_ctx) + + sizeof(struct bpf_map_desc) * i + + offsetof(struct bpf_map_desc, map_fd), 4, + blob_fd_array_off(gen, i)); emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0)); emit(gen, BPF_EXIT_INSN()); pr_debug("gen: finish %d\n", gen->error); @@ -390,7 +435,7 @@ void bpf_gen__map_create(struct bpf_gen *gen, { int attr_size = offsetofend(union bpf_attr, btf_vmlinux_value_type_id); bool close_inner_map_fd = false; - int map_create_attr; + int map_create_attr, idx; union bpf_attr attr;
memset(&attr, 0, attr_size); @@ -467,9 +512,11 @@ void bpf_gen__map_create(struct bpf_gen *gen, gen->error = -EDOM; /* internal bug */ return; } else { - emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, - stack_off(map_fd[map_idx]))); - gen->nr_maps++; + /* add_map_fd does gen->nr_maps++ */ + idx = add_map_fd(gen); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, blob_fd_array_off(gen, idx))); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_7, 0)); } if (close_inner_map_fd) emit_sys_close_stack(gen, stack_off(inner_map_fd)); @@ -511,8 +558,8 @@ static void emit_find_attach_target(struct bpf_gen *gen) */ }
-void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, - int insn_idx) +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, + int kind, int insn_idx) { struct ksym_relo_desc *relo;
@@ -524,38 +571,192 @@ void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, gen->relos = relo; relo += gen->relo_cnt; relo->name = name; + relo->is_weak = is_weak; relo->kind = kind; relo->insn_idx = insn_idx; gen->relo_cnt++; }
-static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns) +/* returns existing ksym_desc with ref incremented, or inserts a new one */ +static struct ksym_desc *get_ksym_desc(struct bpf_gen *gen, struct ksym_relo_desc *relo) { - int name, insn, len = strlen(relo->name) + 1; + struct ksym_desc *kdesc;
- pr_debug("gen: emit_relo: %s at %d\n", relo->name, relo->insn_idx); - name = add_data(gen, relo->name, len); + for (int i = 0; i < gen->nr_ksyms; i++) { + if (!strcmp(gen->ksyms[i].name, relo->name)) { + gen->ksyms[i].ref++; + return &gen->ksyms[i]; + } + } + kdesc = libbpf_reallocarray(gen->ksyms, gen->nr_ksyms + 1, sizeof(*kdesc)); + if (!kdesc) { + gen->error = -ENOMEM; + return NULL; + } + gen->ksyms = kdesc; + kdesc = &gen->ksyms[gen->nr_ksyms++]; + kdesc->name = relo->name; + kdesc->kind = relo->kind; + kdesc->ref = 1; + kdesc->off = 0; + kdesc->insn = 0; + return kdesc; +} + +/* Overwrites BPF_REG_{0, 1, 2, 3, 4, 7} + * Returns result in BPF_REG_7 + */ +static void emit_bpf_find_by_name_kind(struct bpf_gen *gen, struct ksym_relo_desc *relo) +{ + int name_off, len = strlen(relo->name) + 1;
+ name_off = add_data(gen, relo->name, len); emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, - 0, 0, 0, name)); + 0, 0, 0, name_off)); emit(gen, BPF_MOV64_IMM(BPF_REG_2, len)); emit(gen, BPF_MOV64_IMM(BPF_REG_3, relo->kind)); emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0)); emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind)); emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0)); debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind); - emit_check_err(gen); +} + +/* Expects: + * BPF_REG_8 - pointer to instruction + * + * We need to reuse BTF fd for same symbol otherwise each relocation takes a new + * index, while kernel limits total kfunc BTFs to 256. For duplicate symbols, + * this would mean a new BTF fd index for each entry. By pairing symbol name + * with index, we get the insn->imm, insn->off pairing that kernel uses for + * kfunc_tab, which becomes the effective limit even though all of them may + * share same index in fd_array (such that kfunc_btf_tab has 1 element). + */ +static void emit_relo_kfunc_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insn) +{ + struct ksym_desc *kdesc; + int btf_fd_idx; + + kdesc = get_ksym_desc(gen, relo); + if (!kdesc) + return; + /* try to copy from existing bpf_insn */ + if (kdesc->ref > 1) { + move_blob2blob(gen, insn + offsetof(struct bpf_insn, imm), 4, + kdesc->insn + offsetof(struct bpf_insn, imm)); + move_blob2blob(gen, insn + offsetof(struct bpf_insn, off), 2, + kdesc->insn + offsetof(struct bpf_insn, off)); + goto log; + } + /* remember insn offset, so we can copy BTF ID and FD later */ + kdesc->insn = insn; + emit_bpf_find_by_name_kind(gen, relo); + if (!relo->is_weak) + emit_check_err(gen); + /* get index in fd_array to store BTF FD at */ + btf_fd_idx = add_kfunc_btf_fd(gen); + if (btf_fd_idx > INT16_MAX) { + pr_warn("BTF fd off %d for kfunc %s exceeds INT16_MAX, cannot process relocation\n", + btf_fd_idx, relo->name); + gen->error = -E2BIG; + return; + } + kdesc->off = btf_fd_idx; + /* set a default value for imm */ + emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, offsetof(struct bpf_insn, imm), 0)); + /* skip success case store if ret < 0 */ + emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 1)); /* store btf_id into insn[insn_idx].imm */ - insn = insns + sizeof(struct bpf_insn) * relo->insn_idx + - offsetof(struct bpf_insn, imm); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm))); + /* load fd_array slot pointer */ emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, - 0, 0, 0, insn)); - emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, 0)); - if (relo->kind == BTF_KIND_VAR) { - /* store btf_obj_fd into insn[insn_idx + 1].imm */ - emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32)); - emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, - sizeof(struct bpf_insn))); + 0, 0, 0, blob_fd_array_off(gen, btf_fd_idx))); + /* skip store of BTF fd if ret < 0 */ + emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 3)); + /* store BTF fd in slot */ + emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_7)); + emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_9, 32)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_9, 0)); + /* set a default value for off */ + emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), 0)); + /* skip insn->off store if ret < 0 */ + emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 2)); + /* skip if vmlinux BTF */ + emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_9, 0, 1)); + /* store index into insn[insn_idx].off */ + emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), btf_fd_idx)); +log: + if (!gen->log_level) + return; + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_8, + offsetof(struct bpf_insn, imm))); + emit(gen, BPF_LDX_MEM(BPF_H, BPF_REG_9, BPF_REG_8, + offsetof(struct bpf_insn, off))); + debug_regs(gen, BPF_REG_7, BPF_REG_9, " func (%s:count=%d): imm: %%d, off: %%d", + relo->name, kdesc->ref); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, blob_fd_array_off(gen, kdesc->off))); + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_9, BPF_REG_0, 0)); + debug_regs(gen, BPF_REG_9, -1, " func (%s:count=%d): btf_fd", + relo->name, kdesc->ref); +} + +/* Expects: + * BPF_REG_8 - pointer to instruction + */ +static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insn) +{ + struct ksym_desc *kdesc; + + kdesc = get_ksym_desc(gen, relo); + if (!kdesc) + return; + /* try to copy from existing ldimm64 insn */ + if (kdesc->ref > 1) { + move_blob2blob(gen, insn + offsetof(struct bpf_insn, imm), 4, + kdesc->insn + offsetof(struct bpf_insn, imm)); + move_blob2blob(gen, insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 4, + kdesc->insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm)); + goto log; + } + /* remember insn offset, so we can copy BTF ID and FD later */ + kdesc->insn = insn; + emit_bpf_find_by_name_kind(gen, relo); + emit_check_err(gen); + /* store btf_id into insn[insn_idx].imm */ + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm))); + /* store btf_obj_fd into insn[insn_idx + 1].imm */ + emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm))); +log: + if (!gen->log_level) + return; + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_8, + offsetof(struct bpf_insn, imm))); + emit(gen, BPF_LDX_MEM(BPF_H, BPF_REG_9, BPF_REG_8, sizeof(struct bpf_insn) + + offsetof(struct bpf_insn, imm))); + debug_regs(gen, BPF_REG_7, BPF_REG_9, " var (%s:count=%d): imm: %%d, fd: %%d", + relo->name, kdesc->ref); +} + +static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns) +{ + int insn; + + pr_debug("gen: emit_relo (%d): %s at %d\n", relo->kind, relo->name, relo->insn_idx); + insn = insns + sizeof(struct bpf_insn) * relo->insn_idx; + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_8, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, insn)); + switch (relo->kind) { + case BTF_KIND_VAR: + emit_relo_ksym_btf(gen, relo, insn); + break; + case BTF_KIND_FUNC: + emit_relo_kfunc_btf(gen, relo, insn); + break; + default: + pr_warn("Unknown relocation kind '%d'\n", relo->kind); + gen->error = -EDOM; + return; } }
@@ -571,14 +772,22 @@ static void cleanup_relos(struct bpf_gen *gen, int insns) { int i, insn;
- for (i = 0; i < gen->relo_cnt; i++) { - if (gen->relos[i].kind != BTF_KIND_VAR) - continue; - /* close fd recorded in insn[insn_idx + 1].imm */ - insn = insns + - sizeof(struct bpf_insn) * (gen->relos[i].insn_idx + 1) + - offsetof(struct bpf_insn, imm); - emit_sys_close_blob(gen, insn); + for (i = 0; i < gen->nr_ksyms; i++) { + if (gen->ksyms[i].kind == BTF_KIND_VAR) { + /* close fd recorded in insn[insn_idx + 1].imm */ + insn = gen->ksyms[i].insn; + insn += sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm); + emit_sys_close_blob(gen, insn); + } else { /* BTF_KIND_FUNC */ + emit_sys_close_blob(gen, blob_fd_array_off(gen, gen->ksyms[i].off)); + if (gen->ksyms[i].off < MAX_FD_ARRAY_SZ) + gen->nr_fd_array--; + } + } + if (gen->nr_ksyms) { + free(gen->ksyms); + gen->nr_ksyms = 0; + gen->ksyms = NULL; } if (gen->relo_cnt) { free(gen->relos); @@ -637,9 +846,8 @@ void bpf_gen__prog_load(struct bpf_gen *gen, /* populate union bpf_attr with a pointer to line_info */ emit_rel_store(gen, attr_field(prog_load_attr, line_info), line_info);
- /* populate union bpf_attr fd_array with a pointer to stack where map_fds are saved */ - emit_rel_store_sp(gen, attr_field(prog_load_attr, fd_array), - stack_off(map_fd[0])); + /* populate union bpf_attr fd_array with a pointer to data where map_fds are saved */ + emit_rel_store(gen, attr_field(prog_load_attr, fd_array), gen->fd_array);
/* populate union bpf_attr with user provided log details */ move_ctx2blob(gen, attr_field(prog_load_attr, log_level), 4, @@ -706,8 +914,8 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue, emit(gen, BPF_EMIT_CALL(BPF_FUNC_copy_from_user));
map_update_attr = add_data(gen, &attr, attr_size); - move_stack2blob(gen, attr_field(map_update_attr, map_fd), 4, - stack_off(map_fd[map_idx])); + move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4, + blob_fd_array_off(gen, map_idx)); emit_rel_store(gen, attr_field(map_update_attr, key), key); emit_rel_store(gen, attr_field(map_update_attr, value), value); /* emit MAP_UPDATE_ELEM command */ @@ -725,8 +933,8 @@ void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx) memset(&attr, 0, attr_size); pr_debug("gen: map_freeze: idx %d\n", map_idx); map_freeze_attr = add_data(gen, &attr, attr_size); - move_stack2blob(gen, attr_field(map_freeze_attr, map_fd), 4, - stack_off(map_fd[map_idx])); + move_blob2blob(gen, attr_field(map_freeze_attr, map_fd), 4, + blob_fd_array_off(gen, map_idx)); /* emit MAP_FREEZE command */ emit_sys_bpf(gen, BPF_MAP_FREEZE, map_freeze_attr, attr_size); debug_ret(gen, "map_freeze"); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2e5111edd6ee..a304cfbba40d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6245,12 +6245,12 @@ static int bpf_program__record_externs(struct bpf_program *prog) ext->name); return -ENOTSUP; } - bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_VAR, - relo->insn_idx); + bpf_gen__record_extern(obj->gen_loader, ext->name, ext->is_weak, + BTF_KIND_VAR, relo->insn_idx); break; case RELO_EXTERN_FUNC: - bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_FUNC, - relo->insn_idx); + bpf_gen__record_extern(obj->gen_loader, ext->name, ext->is_weak, + BTF_KIND_FUNC, relo->insn_idx); break; default: continue;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 7ca61121598338ab713a5c705a843f3b8fed9f90 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add a bulk copying api, btf__add_btf(), that speeds up and simplifies appending entire contents of one BTF object to another one, taking care of copying BTF type data, adjusting resulting BTF type IDs according to their new locations in the destination BTF object, as well as copying and deduplicating all the referenced strings and updating all the string offsets in new BTF types as appropriate.
This API is intended to be used from tools that are generating and otherwise manipulating BTFs generically, such as pahole. In pahole's case, this API is useful for speeding up parallelized BTF encoding, as it allows pahole to offload all the intricacies of BTF type copying to libbpf and handle the parallelization aspects of the process.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Liu songliubraving@fb.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org (cherry picked from commit 7ca61121598338ab713a5c705a843f3b8fed9f90) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 114 ++++++++++++++++++++++++++++++++++++++- tools/lib/bpf/btf.h | 22 ++++++++ tools/lib/bpf/libbpf.map | 1 + 3 files changed, 135 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 9154992dc792..f1829ad45cb5 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -189,12 +189,17 @@ int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_ return 0; }
+static void *btf_add_type_offs_mem(struct btf *btf, size_t add_cnt) +{ + return libbpf_add_mem((void **)&btf->type_offs, &btf->type_offs_cap, sizeof(__u32), + btf->nr_types, BTF_MAX_NR_TYPES, add_cnt); +} + static int btf_add_type_idx_entry(struct btf *btf, __u32 type_off) { __u32 *p;
- p = libbpf_add_mem((void **)&btf->type_offs, &btf->type_offs_cap, sizeof(__u32), - btf->nr_types, BTF_MAX_NR_TYPES, 1); + p = btf_add_type_offs_mem(btf, 1); if (!p) return -ENOMEM;
@@ -1697,6 +1702,111 @@ int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_t return btf_commit_type(btf, sz); }
+static int btf_rewrite_type_ids(__u32 *type_id, void *ctx) +{ + struct btf *btf = ctx; + + if (!*type_id) /* nothing to do for VOID references */ + return 0; + + /* we haven't updated btf's type count yet, so + * btf->start_id + btf->nr_types - 1 is the type ID offset we should + * add to all newly added BTF types + */ + *type_id += btf->start_id + btf->nr_types - 1; + return 0; +} + +int btf__add_btf(struct btf *btf, const struct btf *src_btf) +{ + struct btf_pipe p = { .src = src_btf, .dst = btf }; + int data_sz, sz, cnt, i, err, old_strs_len; + __u32 *off; + void *t; + + /* appending split BTF isn't supported yet */ + if (src_btf->base_btf) + return libbpf_err(-ENOTSUP); + + /* deconstruct BTF, if necessary, and invalidate raw_data */ + if (btf_ensure_modifiable(btf)) + return libbpf_err(-ENOMEM); + + /* remember original strings section size if we have to roll back + * partial strings section changes + */ + old_strs_len = btf->hdr->str_len; + + data_sz = src_btf->hdr->type_len; + cnt = btf__get_nr_types(src_btf); + + /* pre-allocate enough memory for new types */ + t = btf_add_type_mem(btf, data_sz); + if (!t) + return libbpf_err(-ENOMEM); + + /* pre-allocate enough memory for type offset index for new types */ + off = btf_add_type_offs_mem(btf, cnt); + if (!off) + return libbpf_err(-ENOMEM); + + /* bulk copy types data for all types from src_btf */ + memcpy(t, src_btf->types_data, data_sz); + + for (i = 0; i < cnt; i++) { + sz = btf_type_size(t); + if (sz < 0) { + /* unlikely, has to be corrupted src_btf */ + err = sz; + goto err_out; + } + + /* fill out type ID to type offset mapping for lookups by type ID */ + *off = t - btf->types_data; + + /* add, dedup, and remap strings referenced by this BTF type */ + err = btf_type_visit_str_offs(t, btf_rewrite_str, &p); + if (err) + goto err_out; + + /* remap all type IDs referenced from this BTF type */ + err = btf_type_visit_type_ids(t, btf_rewrite_type_ids, btf); + if (err) + goto err_out; + + /* go to next type data and type offset index entry */ + t += sz; + off++; + } + + /* Up until now any of the copied type data was effectively invisible, + * so if we exited early before this point due to error, BTF would be + * effectively unmodified. There would be extra internal memory + * pre-allocated, but it would not be available for querying. But now + * that we've copied and rewritten all the data successfully, we can + * update type count and various internal offsets and sizes to + * "commit" the changes and made them visible to the outside world. + */ + btf->hdr->type_len += data_sz; + btf->hdr->str_off += data_sz; + btf->nr_types += cnt; + + /* return type ID of the first added BTF type */ + return btf->start_id + btf->nr_types - cnt; +err_out: + /* zero out preallocated memory as if it was just allocated with + * libbpf_add_mem() + */ + memset(btf->types_data + btf->hdr->type_len, 0, data_sz); + memset(btf->strs_data + old_strs_len, 0, btf->hdr->str_len - old_strs_len); + + /* and now restore original strings section size; types data size + * wasn't modified, so doesn't need restoring, see big comment above */ + btf->hdr->str_len = old_strs_len; + + return libbpf_err(err); +} + /* * Append new BTF_KIND_INT type with: * - *name* - non-empty, non-NULL type name; diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 659ea8a2769b..104db2cd699d 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -103,6 +103,28 @@ LIBBPF_API int btf__find_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type); +/** + * @brief **btf__add_btf()** appends all the BTF types from *src_btf* into *btf* + * @param btf BTF object which all the BTF types and strings are added to + * @param src_btf BTF object which all BTF types and referenced strings are copied from + * @return BTF type ID of the first appended BTF type, or negative error code + * + * **btf__add_btf()** can be used to simply and efficiently append the entire + * contents of one BTF object to another one. All the BTF type data is copied + * over, all referenced type IDs are adjusted by adding a necessary ID offset. + * Only strings referenced from BTF types are copied over and deduplicated, so + * if there were some unused strings in *src_btf*, those won't be copied over, + * which is consistent with the general string deduplication semantics of BTF + * writing APIs. + * + * If any error is encountered during this process, the contents of *btf* is + * left intact, which means that **btf__add_btf()** follows the transactional + * semantics and the operation as a whole is all-or-nothing. + * + * *src_btf* has to be non-split BTF, as of now copying types from split BTF + * is not supported and will result in -ENOTSUP error code returned. + */ +LIBBPF_API int btf__add_btf(struct btf *btf, const struct btf *src_btf);
LIBBPF_API int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding); LIBBPF_API int btf__add_float(struct btf *btf, const char *name, size_t byte_sz); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index e13df375c779..ee32c52400b9 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -384,5 +384,6 @@ LIBBPF_0.5.0 {
LIBBPF_0.6.0 { global: + btf__add_btf; btf__add_tag; } LIBBPF_0.5.0;
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.16-rc1 commit 4a404a7e8a3902fc560527241a611186605efb4e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
BPF objects are not reloadable after unload. Users are expected to use bpf_object__close() to unload and free up resources in one operation. No need to expose bpf_object__unload() as a public API, deprecate it ([0]). Add bpf_object__unload() as an alias to internal bpf_object_unload() and replace all bpf_object__unload() uses to avoid compilation errors.
[0] Closes: https://github.com/libbpf/libbpf/issues/290
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com (cherry picked from commit 4a404a7e8a3902fc560527241a611186605efb4e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 8 +++++--- tools/lib/bpf/libbpf.h | 1 + 2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a304cfbba40d..e4456e578049 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6562,7 +6562,7 @@ bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz, return libbpf_ptr(__bpf_object__open(NULL, obj_buf, obj_buf_sz, &opts)); }
-int bpf_object__unload(struct bpf_object *obj) +static int bpf_object_unload(struct bpf_object *obj) { size_t i;
@@ -6581,6 +6581,8 @@ int bpf_object__unload(struct bpf_object *obj) return 0; }
+int bpf_object__unload(struct bpf_object *obj) __attribute__((alias("bpf_object_unload"))); + static int bpf_object__sanitize_maps(struct bpf_object *obj) { struct bpf_map *m; @@ -6959,7 +6961,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) if (obj->maps[i].pinned && !obj->maps[i].reused) bpf_map__unpin(&obj->maps[i], NULL);
- bpf_object__unload(obj); + bpf_object_unload(obj); pr_warn("failed to load object '%s'\n", obj->path); return libbpf_err(err); } @@ -7563,7 +7565,7 @@ void bpf_object__close(struct bpf_object *obj)
bpf_gen__free(obj->gen_loader); bpf_object__elf_finish(obj); - bpf_object__unload(obj); + bpf_object_unload(obj); btf__free(obj->btf); btf_ext__free(obj->btf_ext);
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index d9815eee26ac..061962310fb8 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -150,6 +150,7 @@ struct bpf_object_load_attr { /* Load/unload object into/from kernel */ LIBBPF_API int bpf_object__load(struct bpf_object *obj); LIBBPF_API int bpf_object__load_xattr(struct bpf_object_load_attr *attr); +LIBBPF_DEPRECATED_SINCE(0, 6, "bpf_object__unload() is deprecated, use bpf_object__close() instead") LIBBPF_API int bpf_object__unload(struct bpf_object *obj);
LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj);
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.16-rc1 commit 2088a3a71d870115fdfb799c0f7de76d7383ba03 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with a new set of APIs named bpf_object__{prev,next}_{program,map} which follow the libbpf API naming convention ([0]). No functionality changes.
[0] Closes: https://github.com/libbpf/libbpf/issues/296
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com (cherry picked from commit 2088a3a71d870115fdfb799c0f7de76d7383ba03) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 24 ++++++++++++++++++++++++ tools/lib/bpf/libbpf.h | 35 +++++++++++++++++++++++------------ tools/lib/bpf/libbpf.map | 4 ++++ 3 files changed, 51 insertions(+), 12 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e4456e578049..ba4eb8da2b65 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7699,6 +7699,12 @@ __bpf_program__iter(const struct bpf_program *p, const struct bpf_object *obj,
struct bpf_program * bpf_program__next(struct bpf_program *prev, const struct bpf_object *obj) +{ + return bpf_object__next_program(obj, prev); +} + +struct bpf_program * +bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev) { struct bpf_program *prog = prev;
@@ -7711,6 +7717,12 @@ bpf_program__next(struct bpf_program *prev, const struct bpf_object *obj)
struct bpf_program * bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj) +{ + return bpf_object__prev_program(obj, next); +} + +struct bpf_program * +bpf_object__prev_program(const struct bpf_object *obj, struct bpf_program *next) { struct bpf_program *prog = next;
@@ -8725,6 +8737,12 @@ __bpf_map__iter(const struct bpf_map *m, const struct bpf_object *obj, int i)
struct bpf_map * bpf_map__next(const struct bpf_map *prev, const struct bpf_object *obj) +{ + return bpf_object__next_map(obj, prev); +} + +struct bpf_map * +bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *prev) { if (prev == NULL) return obj->maps; @@ -8734,6 +8752,12 @@ bpf_map__next(const struct bpf_map *prev, const struct bpf_object *obj)
struct bpf_map * bpf_map__prev(const struct bpf_map *next, const struct bpf_object *obj) +{ + return bpf_object__prev_map(obj, next); +} + +struct bpf_map * +bpf_object__prev_map(const struct bpf_object *obj, const struct bpf_map *next) { if (next == NULL) { if (!obj->nr_maps) diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 061962310fb8..3c43b2d5d065 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -190,16 +190,22 @@ LIBBPF_API int libbpf_find_vmlinux_btf_id(const char *name,
/* Accessors of bpf_program */ struct bpf_program; -LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog, - const struct bpf_object *obj); +LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__next_program() instead") +struct bpf_program *bpf_program__next(struct bpf_program *prog, + const struct bpf_object *obj); +LIBBPF_API struct bpf_program * +bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prog);
-#define bpf_object__for_each_program(pos, obj) \ - for ((pos) = bpf_program__next(NULL, (obj)); \ - (pos) != NULL; \ - (pos) = bpf_program__next((pos), (obj))) +#define bpf_object__for_each_program(pos, obj) \ + for ((pos) = bpf_object__next_program((obj), NULL); \ + (pos) != NULL; \ + (pos) = bpf_object__next_program((obj), (pos)))
-LIBBPF_API struct bpf_program *bpf_program__prev(struct bpf_program *prog, - const struct bpf_object *obj); +LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__prev_program() instead") +struct bpf_program *bpf_program__prev(struct bpf_program *prog, + const struct bpf_object *obj); +LIBBPF_API struct bpf_program * +bpf_object__prev_program(const struct bpf_object *obj, struct bpf_program *prog);
typedef void (*bpf_program_clear_priv_t)(struct bpf_program *, void *);
@@ -437,16 +443,21 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name); LIBBPF_API struct bpf_map * bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset);
+LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__next_map() instead") +struct bpf_map *bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj); LIBBPF_API struct bpf_map * -bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj); +bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *map); + #define bpf_object__for_each_map(pos, obj) \ - for ((pos) = bpf_map__next(NULL, (obj)); \ + for ((pos) = bpf_object__next_map((obj), NULL); \ (pos) != NULL; \ - (pos) = bpf_map__next((pos), (obj))) + (pos) = bpf_object__next_map((obj), (pos))) #define bpf_map__for_each bpf_object__for_each_map
+LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__prev_map() instead") +struct bpf_map *bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj); LIBBPF_API struct bpf_map * -bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj); +bpf_object__prev_map(const struct bpf_object *obj, const struct bpf_map *map);
/* get/set map FD */ LIBBPF_API int bpf_map__fd(const struct bpf_map *map); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index ee32c52400b9..087a80d1613f 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -384,6 +384,10 @@ LIBBPF_0.5.0 {
LIBBPF_0.6.0 { global: + bpf_object__next_map; + bpf_object__next_program; + bpf_object__prev_map; + bpf_object__prev_program; btf__add_btf; btf__add_tag; } LIBBPF_0.5.0;
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.16-rc1 commit 6f2b219b62a4376ca2da15c503de79d0650c8155 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Replace deprecated bpf_{map,program}__next APIs with newly added bpf_object__next_{map,program} APIs, so that no compilation warnings emit.
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211003165844.4054931-3-hengqi.chen@gmail.com (cherry picked from commit 6f2b219b62a4376ca2da15c503de79d0650c8155) Signed-off-by: Wang Yufen wangyufen@huawei.com --- samples/bpf/xdp1_user.c | 2 +- samples/bpf/xdp_sample_pkts_user.c | 2 +- tools/bpf/bpftool/iter.c | 2 +- tools/bpf/bpftool/prog.c | 2 +- tools/testing/selftests/bpf/prog_tests/btf.c | 2 +- tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c | 6 +++--- tools/testing/selftests/bpf/prog_tests/select_reuseport.c | 2 +- tools/testing/selftests/bpf/prog_tests/tcp_rtt.c | 2 +- tools/testing/selftests/bpf/xdping.c | 2 +- 9 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c index c447ad9e3a1d..460db097c3bb 100644 --- a/samples/bpf/xdp1_user.c +++ b/samples/bpf/xdp1_user.c @@ -134,7 +134,7 @@ int main(int argc, char **argv) if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd)) return 1;
- map = bpf_map__next(NULL, obj); + map = bpf_object__next_map(obj, NULL); if (!map) { printf("finding a map in obj file failed\n"); return 1; diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c index 4b2a300c750c..ee841f157045 100644 --- a/samples/bpf/xdp_sample_pkts_user.c +++ b/samples/bpf/xdp_sample_pkts_user.c @@ -159,7 +159,7 @@ int main(int argc, char **argv) return 1; }
- map = bpf_map__next(NULL, obj); + map = bpf_object__next_map(obj, NULL); if (!map) { printf("finding a map in obj file failed\n"); return 1; diff --git a/tools/bpf/bpftool/iter.c b/tools/bpf/bpftool/iter.c index 3b1aad7535dd..0925d58d3d8e 100644 --- a/tools/bpf/bpftool/iter.c +++ b/tools/bpf/bpftool/iter.c @@ -57,7 +57,7 @@ static int do_pin(int argc, char **argv) goto close_obj; }
- prog = bpf_program__next(NULL, obj); + prog = bpf_object__next_program(obj, NULL); if (!prog) { p_err("can't find bpf program in objfile %s", objfile); goto close_obj; diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 6c80bc729482..0041987fd5c9 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -1603,7 +1603,7 @@ static int load_with_options(int argc, char **argv, bool first_prog_only) goto err_close_obj;
if (first_prog_only) { - prog = bpf_program__next(NULL, obj); + prog = bpf_object__next_program(obj, NULL); if (!prog) { p_err("object file doesn't contain any bpf program"); goto err_close_obj; diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 5a5013221da9..0cbffc0276c2 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -4138,7 +4138,7 @@ static void do_test_file(unsigned int test_num) if (CHECK(err, "obj: %d", err)) return;
- prog = bpf_program__next(NULL, obj); + prog = bpf_object__next_program(obj, NULL); if (CHECK(!prog, "Cannot find bpf_prog")) { err = -1; goto done; diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index 337a1c011d77..dc3a1275cf26 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -255,7 +255,7 @@ static void test_fmod_ret_freplace(void) if (!ASSERT_OK_PTR(freplace_obj, "freplace_obj_open")) goto out;
- prog = bpf_program__next(NULL, freplace_obj); + prog = bpf_object__next_program(freplace_obj, NULL); err = bpf_program__set_attach_target(prog, pkt_fd, NULL); ASSERT_OK(err, "freplace__set_attach_target");
@@ -272,7 +272,7 @@ static void test_fmod_ret_freplace(void) goto out;
attach_prog_fd = bpf_program__fd(prog); - prog = bpf_program__next(NULL, fmod_obj); + prog = bpf_object__next_program(fmod_obj, NULL); err = bpf_program__set_attach_target(prog, attach_prog_fd, NULL); ASSERT_OK(err, "fmod_ret_set_attach_target");
@@ -322,7 +322,7 @@ static void test_obj_load_failure_common(const char *obj_file, if (!ASSERT_OK_PTR(obj, "obj_open")) goto close_prog;
- prog = bpf_program__next(NULL, obj); + prog = bpf_object__next_program(obj, NULL); err = bpf_program__set_attach_target(prog, pkt_fd, NULL); ASSERT_OK(err, "set_attach_target");
diff --git a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c index 4efd337d6a3c..d40e9156c48d 100644 --- a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c +++ b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c @@ -114,7 +114,7 @@ static int prepare_bpf_obj(void) err = bpf_object__load(obj); RET_ERR(err, "load bpf_object", "err:%d\n", err);
- prog = bpf_program__next(NULL, obj); + prog = bpf_object__next_program(obj, NULL); RET_ERR(!prog, "get first bpf_program", "!prog\n"); select_by_skb_data_prog = bpf_program__fd(prog); RET_ERR(select_by_skb_data_prog < 0, "get prog fd", diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c b/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c index d207e968e6b1..265b4fe33ec3 100644 --- a/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c +++ b/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c @@ -109,7 +109,7 @@ static int run_test(int cgroup_fd, int server_fd) return -1; }
- map = bpf_map__next(NULL, obj); + map = bpf_object__next_map(obj, NULL); map_fd = bpf_map__fd(map);
err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0); diff --git a/tools/testing/selftests/bpf/xdping.c b/tools/testing/selftests/bpf/xdping.c index 842d9155d36c..ed8ab32404b9 100644 --- a/tools/testing/selftests/bpf/xdping.c +++ b/tools/testing/selftests/bpf/xdping.c @@ -188,7 +188,7 @@ int main(int argc, char **argv) return 1; }
- map = bpf_map__next(NULL, obj); + map = bpf_object__next_map(obj, NULL); if (map) map_fd = bpf_map__fd(map); if (!map || map_fd < 0) {
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit b79c2ce3baa99beea7f8410ce3154cc23e26dbd8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The "install_headers" target in libbpf's Makefile would unconditionally export all API headers to the target directory. When those headers are installed to compile another application, this means that make always finds newer dependencies for the source files relying on those headers, and deduces that the targets should be rebuilt.
Avoid that by making "install_headers" depend on the source header files, and (re-)install them only when necessary.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-2-quentin@isovalent.com (cherry picked from commit b79c2ce3baa99beea7f8410ce3154cc23e26dbd8) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/Makefile | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 84b3bd34f6c6..46ed82ef676e 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -241,15 +241,24 @@ install_lib: all_cmd $(call do_install_mkdir,$(libdir_SQ)); \ cp -fpR $(LIB_FILE) $(DESTDIR)$(libdir_SQ)
-INSTALL_HEADERS = bpf.h libbpf.h btf.h libbpf_util.h libbpf_common.h libbpf_legacy.h xsk.h \ - bpf_helpers.h $(BPF_GENERATED) bpf_tracing.h \ - bpf_endian.h bpf_core_read.h skel_internal.h \ - libbpf_version.h - -install_headers: $(BPF_GENERATED) - $(call QUIET_INSTALL, headers) \ - $(foreach hdr,$(INSTALL_HEADERS), \ - $(call do_install,$(hdr),$(prefix)/include/bpf,644);) +SRC_HDRS := bpf.h libbpf.h btf.h libbpf_util.h libbpf_common.h libbpf_legacy.h xsk.h \ + bpf_helpers.h bpf_tracing.h bpf_endian.h bpf_core_read.h \ + skel_internal.h libbpf_version.h +GEN_HDRS := $(BPF_GENERATED) + +INSTALL_PFX := $(DESTDIR)$(prefix)/include/bpf +INSTALL_SRC_HDRS := $(addprefix $(INSTALL_PFX)/, $(SRC_HDRS)) +INSTALL_GEN_HDRS := $(addprefix $(INSTALL_PFX)/, $(notdir $(GEN_HDRS))) + +$(INSTALL_SRC_HDRS): $(INSTALL_PFX)/%.h: %.h + $(call QUIET_INSTALL, $@) \ + $(call do_install,$<,$(prefix)/include/bpf,644) + +$(INSTALL_GEN_HDRS): $(INSTALL_PFX)/%.h: $(OUTPUT)%.h + $(call QUIET_INSTALL, $@) \ + $(call do_install,$<,$(prefix)/include/bpf,644) + +install_headers: $(BPF_GENERATED) $(INSTALL_SRC_HDRS) $(INSTALL_GEN_HDRS)
install_pkgconfig: $(PC_FILE) $(call QUIET_INSTALL, $(PC_FILE)) \
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit c66a248f1950d41502fb67624147281d9de0e868 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It seems that the header file was never necessary to compile bpftool, and it is not part of the headers exported from libbpf. Let's remove the includes from prog.c and gen.c.
Fixes: d510296d331a ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.") Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-3-quentin@isovalent.com (cherry picked from commit c66a248f1950d41502fb67624147281d9de0e868) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 1 - tools/bpf/bpftool/prog.c | 1 - 2 files changed, 2 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index abd85bf11ce5..f795907b4061 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -18,7 +18,6 @@ #include <sys/stat.h> #include <sys/mman.h> #include <bpf/btf.h> -#include <bpf/bpf_gen_internal.h>
#include "json_writer.h" #include "main.h" diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 0041987fd5c9..e717c91c525a 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -25,7 +25,6 @@ #include <bpf/bpf.h> #include <bpf/btf.h> #include <bpf/libbpf.h> -#include <bpf/bpf_gen_internal.h> #include <bpf/skel_internal.h>
#include "cfg.h"
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit f012ade10b34c461663bc3dd957636be06804b0d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Bpftool relies on libbpf, therefore it relies on a number of headers from the library and must be linked against the library. The Makefile for bpftool exposes these objects by adding tools/lib as an include directory ("-I$(srctree)/tools/lib"). This is a working solution, but this is not the cleanest one. The risk is to involuntarily include objects that are not intended to be exposed by the libbpf.
The headers needed to compile bpftool should in fact be "installed" from libbpf, with its "install_headers" Makefile target. In addition, there is one header which is internal to the library and not supposed to be used by external applications, but that bpftool uses anyway.
Adjust the Makefile in order to install the header files properly before compiling bpftool. Also copy the additional internal header file (nlattr.h), but call it out explicitly. Build (and install headers) in a subdirectory under bpftool/ instead of tools/lib/bpf/. When descending from a parent Makefile, this is configurable by setting the OUTPUT, LIBBPF_OUTPUT and LIBBPF_DESTDIR variables.
Also adjust the Makefile for BPF selftests, so as to reuse the (host) libbpf compiled earlier and to avoid compiling a separate version of the library just for bpftool.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-4-quentin@isovalent.com (cherry picked from commit f012ade10b34c461663bc3dd957636be06804b0d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/Makefile | 33 ++++++++++++++++++---------- tools/testing/selftests/bpf/Makefile | 2 ++ 2 files changed, 23 insertions(+), 12 deletions(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index 70fe8656d840..db705c7baa34 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -16,19 +16,23 @@ endif BPF_DIR = $(srctree)/tools/lib/bpf/
ifneq ($(OUTPUT),) - LIBBPF_OUTPUT = $(OUTPUT)/libbpf/ - LIBBPF_PATH = $(LIBBPF_OUTPUT) - BOOTSTRAP_OUTPUT = $(OUTPUT)/bootstrap/ + _OUTPUT := $(OUTPUT) else - LIBBPF_OUTPUT = - LIBBPF_PATH = $(BPF_DIR) - BOOTSTRAP_OUTPUT = $(CURDIR)/bootstrap/ + _OUTPUT := $(CURDIR) endif +BOOTSTRAP_OUTPUT := $(_OUTPUT)/bootstrap/ +LIBBPF_OUTPUT := $(_OUTPUT)/libbpf/ +LIBBPF_DESTDIR := $(LIBBPF_OUTPUT) +LIBBPF_INCLUDE := $(LIBBPF_DESTDIR)/include
-LIBBPF = $(LIBBPF_PATH)libbpf.a +LIBBPF = $(LIBBPF_OUTPUT)libbpf.a LIBBPF_BOOTSTRAP_OUTPUT = $(BOOTSTRAP_OUTPUT)libbpf/ LIBBPF_BOOTSTRAP = $(LIBBPF_BOOTSTRAP_OUTPUT)libbpf.a
+# We need to copy nlattr.h which is not otherwise exported by libbpf, but still +# required by bpftool. +LIBBPF_INTERNAL_HDRS := nlattr.h + ifeq ($(BPFTOOL_VERSION),) BPFTOOL_VERSION := $(shell make -rR --no-print-directory -sC ../../.. kernelversion) endif @@ -37,7 +41,13 @@ $(LIBBPF_OUTPUT) $(BOOTSTRAP_OUTPUT) $(LIBBPF_BOOTSTRAP_OUTPUT): $(QUIET_MKDIR)mkdir -p $@
$(LIBBPF): FORCE | $(LIBBPF_OUTPUT) - $(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_OUTPUT) $(LIBBPF_OUTPUT)libbpf.a + $(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_OUTPUT) \ + DESTDIR=$(LIBBPF_DESTDIR) prefix= $(LIBBPF) install_headers + +$(LIBBPF_INCLUDE)/bpf/$(LIBBPF_INTERNAL_HDRS): \ + $(addprefix $(BPF_DIR),$(LIBBPF_INTERNAL_HDRS)) $(LIBBPF) + $(call QUIET_INSTALL, bpf/$(notdir $@)) + $(Q)install -m 644 -t $(LIBBPF_INCLUDE)/bpf/ $(BPF_DIR)$(notdir $@)
$(LIBBPF_BOOTSTRAP): FORCE | $(LIBBPF_BOOTSTRAP_OUTPUT) $(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_BOOTSTRAP_OUTPUT) \ @@ -59,10 +69,10 @@ CFLAGS += -W -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers CFLAGS += $(filter-out -Wswitch-enum -Wnested-externs,$(EXTRA_WARNINGS)) CFLAGS += -DPACKAGE='"bpftool"' -D__EXPORTED_HEADERS__ \ -I$(if $(OUTPUT),$(OUTPUT),.) \ + -I$(LIBBPF_INCLUDE) \ -I$(srctree)/kernel/bpf/ \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ - -I$(srctree)/tools/lib \ -I$(srctree)/tools/perf CFLAGS += -DBPFTOOL_VERSION='"$(BPFTOOL_VERSION)"' ifneq ($(EXTRA_CFLAGS),) @@ -139,7 +149,7 @@ BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o g $(BOOTSTRAP_OBJS): $(LIBBPF_BOOTSTRAP)
OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o -$(OBJS): $(LIBBPF) +$(OBJS): $(LIBBPF) $(LIBBPF_INCLUDE)/bpf/$(LIBBPF_INTERNAL_HDRS)
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ @@ -166,8 +176,7 @@ $(OUTPUT)%.bpf.o: skeleton/%.bpf.c $(OUTPUT)vmlinux.h $(LIBBPF) $(QUIET_CLANG)$(CLANG) \ -I$(if $(OUTPUT),$(OUTPUT),.) \ -I$(srctree)/tools/include/uapi/ \ - -I$(LIBBPF_PATH) \ - -I$(srctree)/tools/lib \ + -I$(LIBBPF_INCLUDE) \ -g -O2 -Wall -target bpf -c $< -o $@ && $(LLVM_STRIP) -g $@
$(OUTPUT)%.skel.h: $(OUTPUT)%.bpf.o $(BPFTOOL_BOOTSTRAP) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index ef611403958a..fc5600d35f14 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -202,6 +202,8 @@ $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) \ CC=$(HOSTCC) LD=$(HOSTLD) \ OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ + LIBBPF_OUTPUT=$(HOST_BUILD_DIR)/libbpf/ \ + LIBBPF_DESTDIR=$(HOST_SCRATCH_DIR)/ \ prefix= DESTDIR=$(HOST_SCRATCH_DIR)/ install $(Q)mkdir -p $(BUILD_DIR)/bpftool/Documentation $(Q)RST2MAN_OPTS="--exit-status=1" $(MAKE) $(submake_extras) \
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit 1478994aad82810d833bf9c816fb4e9845553e9b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
API headers from libbpf should not be accessed directly from the library's source directory. Instead, they should be exported with "make install_headers". Let's make sure that resolve_btfids installs the headers properly when building.
When descending from a parent Makefile, the specific output directories for building the library and exporting the headers are configurable with LIBBPF_OUT and LIBBPF_DESTDIR, respectively. This is in addition to OUTPUT, on top of which those variables are constructed by default.
Also adjust the Makefile for the BPF selftests in order to point to the (target) libbpf shared with other tools, instead of building a version specific to resolve_btfids. Remove libbpf's order-only dependencies on the include directories (they are created by libbpf and don't need to exist beforehand).
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-5-quentin@isovalent.com (cherry picked from commit 1478994aad82810d833bf9c816fb4e9845553e9b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/resolve_btfids/Makefile | 16 +++++++++++----- tools/bpf/resolve_btfids/main.c | 4 ++-- tools/testing/selftests/bpf/Makefile | 7 +++++-- 3 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile index df8a0f1b6289..0292fe94fc70 100644 --- a/tools/bpf/resolve_btfids/Makefile +++ b/tools/bpf/resolve_btfids/Makefile @@ -33,25 +33,30 @@ BPFOBJ := $(OUTPUT)/libbpf/libbpf.a LIBBPF_OUT := $(abspath $(dir $(BPFOBJ)))/ SUBCMDOBJ := $(OUTPUT)/libsubcmd/libsubcmd.a
+LIBBPF_DESTDIR := $(LIBBPF_OUT) +LIBBPF_INCLUDE := $(LIBBPF_DESTDIR)include + BINARY := $(OUTPUT)/resolve_btfids BINARY_IN := $(BINARY)-in.o
all: $(BINARY)
-$(OUTPUT) $(OUTPUT)/libbpf $(OUTPUT)/libsubcmd: +$(OUTPUT) $(OUTPUT)/libsubcmd $(LIBBPF_OUT): $(call msg,MKDIR,,$@) $(Q)mkdir -p $(@)
$(SUBCMDOBJ): fixdep FORCE | $(OUTPUT)/libsubcmd $(Q)$(MAKE) -C $(SUBCMD_SRC) OUTPUT=$(abspath $(dir $@))/ $(abspath $@)
-$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(OUTPUT)/libbpf - $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(LIBBPF_OUT) $(abspath $@) +$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(LIBBPF_OUT) + $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(LIBBPF_OUT) \ + DESTDIR=$(LIBBPF_DESTDIR) prefix= \ + $(abspath $@) install_headers
CFLAGS := -g \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ - -I$(LIBBPF_SRC) \ + -I$(LIBBPF_INCLUDE) \ -I$(SUBCMD_SRC)
LIBS = -lelf -lz @@ -69,7 +74,8 @@ $(BINARY): $(BPFOBJ) $(SUBCMDOBJ) $(BINARY_IN) clean_objects := $(wildcard $(OUTPUT)/*.o \ $(OUTPUT)/.*.o.cmd \ $(OUTPUT)/.*.o.d \ - $(OUTPUT)/libbpf \ + $(LIBBPF_OUT) \ + $(LIBBPF_DESTDIR) \ $(OUTPUT)/libsubcmd \ $(OUTPUT)/resolve_btfids)
diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c index efe0777854ae..e0bfbcc20103 100644 --- a/tools/bpf/resolve_btfids/main.c +++ b/tools/bpf/resolve_btfids/main.c @@ -60,8 +60,8 @@ #include <linux/rbtree.h> #include <linux/zalloc.h> #include <linux/err.h> -#include <btf.h> -#include <libbpf.h> +#include <bpf/btf.h> +#include <bpf/libbpf.h> #include <parse-options.h>
#define BTF_IDS_SECTION ".BTF_ids" diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index fc5600d35f14..4f8272e6370b 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -115,9 +115,11 @@ BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a ifneq ($(CROSS_COMPILE),) HOST_BUILD_DIR := $(BUILD_DIR)/host HOST_SCRATCH_DIR := $(OUTPUT)/host-tools +HOST_INCLUDE_DIR := $(HOST_SCRATCH_DIR)/include else HOST_BUILD_DIR := $(BUILD_DIR) HOST_SCRATCH_DIR := $(SCRATCH_DIR) +HOST_INCLUDE_DIR := $(INCLUDE_DIR) endif HOST_BPFOBJ := $(HOST_BUILD_DIR)/libbpf/libbpf.a RESOLVE_BTFIDS := $(HOST_BUILD_DIR)/resolve_btfids/resolve_btfids @@ -213,14 +215,14 @@ $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
$(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ ../../../include/uapi/linux/bpf.h \ - | $(INCLUDE_DIR) $(BUILD_DIR)/libbpf + | $(BUILD_DIR)/libbpf $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/ \ DESTDIR=$(SCRATCH_DIR) prefix= all install_headers
ifneq ($(BPFOBJ),$(HOST_BPFOBJ)) $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ ../../../include/uapi/linux/bpf.h \ - | $(INCLUDE_DIR) $(HOST_BUILD_DIR)/libbpf + | $(HOST_BUILD_DIR)/libbpf $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) \ OUTPUT=$(HOST_BUILD_DIR)/libbpf/ CC=$(HOSTCC) LD=$(HOSTLD) \ DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers @@ -244,6 +246,7 @@ $(RESOLVE_BTFIDS): $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/resolve_btfids \ $(TOOLSDIR)/lib/str_error_r.c $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/resolve_btfids \ CC=$(HOSTCC) LD=$(HOSTLD) AR=$(HOSTAR) \ + LIBBPF_INCLUDE=$(HOST_INCLUDE_DIR) \ OUTPUT=$(HOST_BUILD_DIR)/resolve_btfids/ BPFOBJ=$(HOST_BPFOBJ)
# Get Clang's default includes on this system, as opposed to those seen by
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit be79505caf3f99a2f9cca5946261085b333f7034 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
API headers from libbpf should not be accessed directly from the library's source directory. Instead, they should be exported with "make install_headers". Let's make sure that runqslower installs the headers properly when building.
We use a libbpf_hdrs target to mark the logical dependency on libbpf's headers export for a number of object files, even though the headers should have been exported at this time (since bpftool needs them, and is required to generate the skeleton or the vmlinux.h).
When descending from a parent Makefile, the specific output directories for building the library and exporting the headers are configurable with BPFOBJ_OUTPUT and BPF_DESTDIR, respectively. This is in addition to OUTPUT, on top of which those variables are constructed by default.
Also adjust the Makefile for the BPF selftests. We pass a number of variables to the "make" invocation, because we want to point runqslower to the (target) libbpf shared with other tools, instead of building its own version. In addition, runqslower relies on (target) bpftool, and we also want to pass the proper variables to its Makefile so that bpftool itself reuses the same libbpf.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-6-quentin@isovalent.com (cherry picked from commit be79505caf3f99a2f9cca5946261085b333f7034) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/runqslower/Makefile | 22 +++++++++++++--------- tools/testing/selftests/bpf/Makefile | 15 +++++++++------ 2 files changed, 22 insertions(+), 15 deletions(-)
diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index 9d9fb6209be1..1e9a34ba98e2 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -9,9 +9,9 @@ BPFTOOL ?= $(DEFAULT_BPFTOOL) LIBBPF_SRC := $(abspath ../../lib/bpf) BPFOBJ_OUTPUT := $(OUTPUT)libbpf/ BPFOBJ := $(BPFOBJ_OUTPUT)libbpf.a -BPF_INCLUDE := $(BPFOBJ_OUTPUT) -INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../lib) \ - -I$(abspath ../../include/uapi) +BPF_DESTDIR := $(BPFOBJ_OUTPUT) +BPF_INCLUDE := $(BPF_DESTDIR)/include +INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../include/uapi) CFLAGS := -g -Wall
# Try to detect best kernel BTF source @@ -30,7 +30,7 @@ endif
.DELETE_ON_ERROR:
-.PHONY: all clean runqslower +.PHONY: all clean runqslower libbpf_hdrs all: runqslower
runqslower: $(OUTPUT)/runqslower @@ -43,13 +43,15 @@ clean: $(Q)$(RM) $(OUTPUT)runqslower $(Q)$(RM) -r .output
+libbpf_hdrs: $(BPFOBJ) + $(OUTPUT)/runqslower: $(OUTPUT)/runqslower.o $(BPFOBJ) $(QUIET_LINK)$(CC) $(CFLAGS) $^ -lelf -lz -o $@
$(OUTPUT)/runqslower.o: runqslower.h $(OUTPUT)/runqslower.skel.h \ - $(OUTPUT)/runqslower.bpf.o + $(OUTPUT)/runqslower.bpf.o | libbpf_hdrs
-$(OUTPUT)/runqslower.bpf.o: $(OUTPUT)/vmlinux.h runqslower.h +$(OUTPUT)/runqslower.bpf.o: $(OUTPUT)/vmlinux.h runqslower.h | libbpf_hdrs
$(OUTPUT)/%.skel.h: $(OUTPUT)/%.bpf.o | $(BPFTOOL) $(QUIET_GEN)$(BPFTOOL) gen skeleton $< > $@ @@ -74,8 +76,10 @@ $(OUTPUT)/vmlinux.h: $(VMLINUX_BTF_PATH) | $(OUTPUT) $(BPFTOOL) $(QUIET_GEN)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@
$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(BPFOBJ_OUTPUT) - $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(BPFOBJ_OUTPUT) $@ + $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(BPFOBJ_OUTPUT) \ + DESTDIR=$(BPFOBJ_OUTPUT) prefix= $(abspath $@) install_headers
-$(DEFAULT_BPFTOOL): | $(BPFTOOL_OUTPUT) +$(DEFAULT_BPFTOOL): $(BPFOBJ) | $(BPFTOOL_OUTPUT) $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT) \ - CC=$(HOSTCC) LD=$(HOSTLD) + LIBBPF_OUTPUT=$(BPFOBJ_OUTPUT) \ + LIBBPF_DESTDIR=$(BPF_DESTDIR) CC=$(HOSTCC) LD=$(HOSTLD) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 4f8272e6370b..b491166f9be5 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -123,6 +123,7 @@ HOST_INCLUDE_DIR := $(INCLUDE_DIR) endif HOST_BPFOBJ := $(HOST_BUILD_DIR)/libbpf/libbpf.a RESOLVE_BTFIDS := $(HOST_BUILD_DIR)/resolve_btfids/resolve_btfids +RUNQSLOWER_OUTPUT := $(BUILD_DIR)/runqslower/
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ @@ -147,7 +148,7 @@ $(notdir $(TEST_GEN_PROGS) \ # sort removes libbpf duplicates when not cross-building MAKE_DIRS := $(sort $(BUILD_DIR)/libbpf $(HOST_BUILD_DIR)/libbpf \ $(HOST_BUILD_DIR)/bpftool $(HOST_BUILD_DIR)/resolve_btfids \ - $(INCLUDE_DIR)) + $(RUNQSLOWER_OUTPUT) $(INCLUDE_DIR)) $(MAKE_DIRS): $(call msg,MKDIR,,$@) $(Q)mkdir -p $@ @@ -176,11 +177,13 @@ $(OUTPUT)/test_stub.o: test_stub.c $(BPFOBJ)
DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool
-$(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL) - $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/runqslower \ - OUTPUT=$(SCRATCH_DIR)/ VMLINUX_BTF=$(VMLINUX_BTF) \ - BPFOBJ=$(BPFOBJ) BPF_INCLUDE=$(INCLUDE_DIR) && \ - cp $(SCRATCH_DIR)/runqslower $@ +$(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL) $(RUNQSLOWER_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/runqslower \ + OUTPUT=$(RUNQSLOWER_OUTPUT) VMLINUX_BTF=$(VMLINUX_BTF) \ + BPFTOOL_OUTPUT=$(BUILD_DIR)/bpftool/ \ + BPFOBJ_OUTPUT=$(BUILD_DIR)/libbpf \ + BPFOBJ=$(BPFOBJ) BPF_INCLUDE=$(INCLUDE_DIR) && \ + cp $(RUNQSLOWER_OUTPUT)runqslower $@
$(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED): $(OUTPUT)/test_stub.o $(BPFOBJ)
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit bf60791741d430e8a3e2f8b4a3941d392bf838c2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
API headers from libbpf should not be accessed directly from the library's source directory. Instead, they should be exported with "make install_headers". Let's make sure that bpf/preload/Makefile installs the headers properly when building.
Note that we declare an additional dependency for iterators/iterators.o: having $(LIBBPF_A) as a dependency to "$(obj)/bpf_preload_umd" is not sufficient, as it makes it required only at the linking step. But we need libbpf to be compiled, and in particular its headers to be exported, before we attempt to compile iterators.o. The issue would not occur before this commit, because libbpf's headers were not exported and were always available under tools/lib/bpf.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-7-quentin@isovalent.com (cherry picked from commit bf60791741d430e8a3e2f8b4a3941d392bf838c2) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/preload/Makefile | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/preload/Makefile b/kernel/bpf/preload/Makefile index 1951332dd15f..469d35e890eb 100644 --- a/kernel/bpf/preload/Makefile +++ b/kernel/bpf/preload/Makefile @@ -1,21 +1,36 @@ # SPDX-License-Identifier: GPL-2.0
LIBBPF_SRCS = $(srctree)/tools/lib/bpf/ -LIBBPF_A = $(obj)/libbpf.a -LIBBPF_OUT = $(abspath $(obj)) +LIBBPF_OUT = $(abspath $(obj))/libbpf +LIBBPF_A = $(LIBBPF_OUT)/libbpf.a +LIBBPF_DESTDIR = $(LIBBPF_OUT) +LIBBPF_INCLUDE = $(LIBBPF_DESTDIR)/include
# Although not in use by libbpf's Makefile, set $(O) so that the "dummy" test # in tools/scripts/Makefile.include always succeeds when building the kernel # with $(O) pointing to a relative path, as in "make O=build bindeb-pkg". -$(LIBBPF_A): - $(Q)$(MAKE) -C $(LIBBPF_SRCS) O=$(LIBBPF_OUT)/ OUTPUT=$(LIBBPF_OUT)/ $(LIBBPF_OUT)/libbpf.a +$(LIBBPF_A): | $(LIBBPF_OUT) + $(Q)$(MAKE) -C $(LIBBPF_SRCS) O=$(LIBBPF_OUT)/ OUTPUT=$(LIBBPF_OUT)/ \ + DESTDIR=$(LIBBPF_DESTDIR) prefix= \ + $(LIBBPF_OUT)/libbpf.a install_headers + +libbpf_hdrs: $(LIBBPF_A) + +.PHONY: libbpf_hdrs + +$(LIBBPF_OUT): + $(call msg,MKDIR,$@) + $(Q)mkdir -p $@
userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi \ - -I $(srctree)/tools/lib/ -Wno-unused-result + -I $(LIBBPF_INCLUDE) -Wno-unused-result
userprogs := bpf_preload_umd
clean-files := $(userprogs) bpf_helper_defs.h FEATURE-DUMP.libbpf staticobjs/ feature/ +clean-files += $(LIBBPF_OUT) $(LIBBPF_DESTDIR) + +$(obj)/iterators/iterators.o: | libbpf_hdrs
bpf_preload_umd-objs := iterators/iterators.o bpf_preload_umd-userldlibs := $(LIBBPF_A) -lelf -lz
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit 7bf731dcc641f7d7c71d1932678e0de8ea472612 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
API headers from libbpf should not be accessed directly from the library's source directory. Instead, they should be exported with "make install_headers". Let's make sure that bpf/preload/iterators/Makefile installs the headers properly when building.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-8-quentin@isovalent.com (cherry picked from commit 7bf731dcc641f7d7c71d1932678e0de8ea472612) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/preload/iterators/Makefile | 38 ++++++++++++++++++--------- 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/kernel/bpf/preload/iterators/Makefile b/kernel/bpf/preload/iterators/Makefile index 28fa8c1440f4..a4aedc7b0728 100644 --- a/kernel/bpf/preload/iterators/Makefile +++ b/kernel/bpf/preload/iterators/Makefile @@ -1,18 +1,26 @@ # SPDX-License-Identifier: GPL-2.0 OUTPUT := .output +abs_out := $(abspath $(OUTPUT)) + CLANG ?= clang LLC ?= llc LLVM_STRIP ?= llvm-strip + +TOOLS_PATH := $(abspath ../../../../tools) +BPFTOOL_SRC := $(TOOLS_PATH)/bpf/bpftool +BPFTOOL_OUTPUT := $(abs_out)/bpftool DEFAULT_BPFTOOL := $(OUTPUT)/sbin/bpftool BPFTOOL ?= $(DEFAULT_BPFTOOL) -LIBBPF_SRC := $(abspath ../../../../tools/lib/bpf) -BPFOBJ := $(OUTPUT)/libbpf.a -BPF_INCLUDE := $(OUTPUT) -INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../../../tools/lib) \ - -I$(abspath ../../../../tools/include/uapi) + +LIBBPF_SRC := $(TOOLS_PATH)/lib/bpf +LIBBPF_OUTPUT := $(abs_out)/libbpf +LIBBPF_DESTDIR := $(LIBBPF_OUTPUT) +LIBBPF_INCLUDE := $(LIBBPF_DESTDIR)/include +BPFOBJ := $(LIBBPF_OUTPUT)/libbpf.a + +INCLUDES := -I$(OUTPUT) -I$(LIBBPF_INCLUDE) -I$(TOOLS_PATH)/include/uapi CFLAGS := -g -Wall
-abs_out := $(abspath $(OUTPUT)) ifeq ($(V),1) Q = msg = @@ -44,14 +52,18 @@ $(OUTPUT)/iterators.bpf.o: iterators.bpf.c $(BPFOBJ) | $(OUTPUT) -c $(filter %.c,$^) -o $@ && \ $(LLVM_STRIP) -g $@
-$(OUTPUT): +$(OUTPUT) $(LIBBPF_OUTPUT) $(BPFTOOL_OUTPUT): $(call msg,MKDIR,$@) - $(Q)mkdir -p $(OUTPUT) + $(Q)mkdir -p $@
-$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(OUTPUT) +$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(LIBBPF_OUTPUT) $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) \ - OUTPUT=$(abspath $(dir $@))/ $(abspath $@) + OUTPUT=$(abspath $(dir $@))/ prefix= \ + DESTDIR=$(LIBBPF_DESTDIR) $(abspath $@) install_headers
-$(DEFAULT_BPFTOOL): - $(Q)$(MAKE) $(submake_extras) -C ../../../../tools/bpf/bpftool \ - prefix= OUTPUT=$(abs_out)/ DESTDIR=$(abs_out) install +$(DEFAULT_BPFTOOL): $(BPFOBJ) | $(BPFTOOL_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOL_SRC) \ + OUTPUT=$(BPFTOOL_OUTPUT)/ \ + LIBBPF_OUTPUT=$(LIBBPF_OUTPUT)/ \ + LIBBPF_DESTDIR=$(LIBBPF_DESTDIR)/ \ + prefix= DESTDIR=$(abs_out)/ install
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit 87ee33bfdd4f74edc1548c7f0140800cfcc33039 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The script test_bpftool_build.sh attempts to build bpftool in the various supported ways, to make sure nothing breaks.
One of those ways is to run "make tools/bpf" from the root of the kernel repository. This command builds bpftool, along with the other tools under tools/bpf, and runqslower in particular. After running the command and upon a successful bpftool build, the script attempts to cleanup the generated objects. However, after building with this target and in the case of runqslower, the files are not cleaned up as expected.
This is because the "tools/bpf" target sets $(OUTPUT) to .../tools/bpf/runqslower/ when building the tool, causing the object files to be placed directly under the runqslower directory. But when running "cd tools/bpf; make clean", the value for $(OUTPUT) is set to ".output" (relative to the runqslower directory) by runqslower's Makefile, and this is where the Makefile looks for files to clean up.
We cannot easily fix in the root Makefile (where "tools/bpf" is defined) or in tools/scripts/Makefile.include (setting $(OUTPUT)), where changing the way the output variables are passed would likely have consequences elsewhere. We could change runqslower's Makefile to build in the repository instead of in a dedicated ".output/", but doing so just to accommodate a test script doesn't sound great. Instead, let's just make sure that we clean up runqslower properly by adding the correct command to the script.
This will attempt to clean runqslower twice: the first try with command "cd tools/bpf; make clean" will search for tools/bpf/runqslower/.output and fail to clean it (but will still clean the other tools, in particular bpftool), the second one (added in this commit) sets the $(OUTPUT) variable like for building with the "tool/bpf" target and should succeed.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-12-quentin@isovalent.com (cherry picked from commit 87ee33bfdd4f74edc1548c7f0140800cfcc33039) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/test_bpftool_build.sh | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_bpftool_build.sh b/tools/testing/selftests/bpf/test_bpftool_build.sh index 2db3c60e1e61..4e27a90aac29 100755 --- a/tools/testing/selftests/bpf/test_bpftool_build.sh +++ b/tools/testing/selftests/bpf/test_bpftool_build.sh @@ -107,6 +107,10 @@ echo -e "... through kbuild\n"
if [ -f ".config" ] ; then make_and_clean tools/bpf + ## "make tools/bpf" sets $(OUTPUT) to ...tools/bpf/runqslower for + ## runqslower, but the default (used for the "clean" target) is .output. + ## Let's make sure we clean runqslower's directory properly. + make -C tools/bpf/runqslower OUTPUT=${KDIR_ROOT_DIR}/tools/bpf/runqslower/ clean
## $OUTPUT is overwritten in kbuild Makefile, and thus cannot be passed ## down from toplevel Makefile to bpftool's Makefile.
From: Quentin Monnet quentin@isovalent.com
mainline inclusion from mainline-5.16-rc1 commit d7db0a4e8d95101ebb545444578ba7085c270e5f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
With "make install", bpftool installs its binary and its bash completion file. Usually, this is what we want. But a few components in the kernel repository (namely, BPF iterators and selftests) also install bpftool locally before using it. In such a case, bash completion is not necessary and is just a useless build artifact.
Let's add an "install-bin" target to bpftool, to offer a way to install the binary only.
Signed-off-by: Quentin Monnet quentin@isovalent.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211007194438.34443-13-quentin@isovalent.com (cherry picked from commit d7db0a4e8d95101ebb545444578ba7085c270e5f) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/preload/iterators/Makefile | 2 +- tools/bpf/bpftool/Makefile | 6 ++++-- tools/testing/selftests/bpf/Makefile | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/preload/iterators/Makefile b/kernel/bpf/preload/iterators/Makefile index a4aedc7b0728..b8bd60511227 100644 --- a/kernel/bpf/preload/iterators/Makefile +++ b/kernel/bpf/preload/iterators/Makefile @@ -66,4 +66,4 @@ $(DEFAULT_BPFTOOL): $(BPFOBJ) | $(BPFTOOL_OUTPUT) OUTPUT=$(BPFTOOL_OUTPUT)/ \ LIBBPF_OUTPUT=$(LIBBPF_OUTPUT)/ \ LIBBPF_DESTDIR=$(LIBBPF_DESTDIR)/ \ - prefix= DESTDIR=$(abs_out)/ install + prefix= DESTDIR=$(abs_out)/ install-bin diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index db705c7baa34..ee035ef316b4 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -225,10 +225,12 @@ clean: $(LIBBPF)-clean $(LIBBPF_BOOTSTRAP)-clean feature-detect-clean $(Q)$(RM) -- $(OUTPUT)FEATURE-DUMP.bpftool $(Q)$(RM) -r -- $(OUTPUT)feature/
-install: $(OUTPUT)bpftool +install-bin: $(OUTPUT)bpftool $(call QUIET_INSTALL, bpftool) $(Q)$(INSTALL) -m 0755 -d $(DESTDIR)$(prefix)/sbin $(Q)$(INSTALL) $(OUTPUT)bpftool $(DESTDIR)$(prefix)/sbin/bpftool + +install: install-bin $(Q)$(INSTALL) -m 0755 -d $(DESTDIR)$(bash_compdir) $(Q)$(INSTALL) -m 0644 bash-completion/bpftool $(DESTDIR)$(bash_compdir)
@@ -255,6 +257,6 @@ zdep: @if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi
.SECONDARY: -.PHONY: all FORCE clean install uninstall zdep +.PHONY: all FORCE clean install-bin install uninstall zdep .PHONY: doc doc-clean doc-install doc-uninstall .DEFAULT_GOAL := all diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index b491166f9be5..3b9d4e066a31 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -209,7 +209,7 @@ $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ LIBBPF_OUTPUT=$(HOST_BUILD_DIR)/libbpf/ \ LIBBPF_DESTDIR=$(HOST_SCRATCH_DIR)/ \ - prefix= DESTDIR=$(HOST_SCRATCH_DIR)/ install + prefix= DESTDIR=$(HOST_SCRATCH_DIR)/ install-bin $(Q)mkdir -p $(BUILD_DIR)/bpftool/Documentation $(Q)RST2MAN_OPTS="--exit-status=1" $(MAKE) $(submake_extras) \ -C $(BPFTOOLDIR)/Documentation \
From: Dave Marchevsky davemarchevsky@fb.com
mainline inclusion from mainline-5.16-rc1 commit aba64c7da98330141dcdadd5612f088043a83696 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This stat is currently printed in the verifier log and not stored anywhere. To ease consumption of this data, add a field to bpf_prog_aux so it can be exposed via BPF_OBJ_GET_INFO_BY_FD and fdinfo.
Signed-off-by: Dave Marchevsky davemarchevsky@fb.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20211020074818.1017682-2-davemarchevsky@fb.com (cherry picked from commit aba64c7da98330141dcdadd5612f088043a83696) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 1 + include/uapi/linux/bpf.h | 1 + kernel/bpf/syscall.c | 8 ++++++-- kernel/bpf/verifier.c | 1 + tools/include/uapi/linux/bpf.h | 1 + 5 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index ce82ba540afa..57a92e493ea3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -914,6 +914,7 @@ struct bpf_prog_aux { struct bpf_prog *prog; struct user_struct *user; u64 load_time; /* ns since boottime */ + u32 verified_insns; struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE]; char name[BPF_OBJ_NAME_LEN]; #ifdef CONFIG_SECURITY diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b1bc06aea64a..d9097d60fe3e 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4564,6 +4564,7 @@ struct bpf_prog_info { __u64 run_time_ns; __u64 run_cnt; __u64 recursion_misses; + __u32 verified_insns; } __attribute__((aligned(8)));
struct bpf_map_info { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index f79a55deb7c8..9a5c0ac72e68 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1878,7 +1878,8 @@ static void bpf_prog_show_fdinfo(struct seq_file *m, struct file *filp) "prog_id:\t%u\n" "run_time_ns:\t%llu\n" "run_cnt:\t%llu\n" - "recursion_misses:\t%llu\n", + "recursion_misses:\t%llu\n" + "verified_insns:\t%u\n", prog->type, prog->jited, prog_tag, @@ -1886,7 +1887,8 @@ static void bpf_prog_show_fdinfo(struct seq_file *m, struct file *filp) prog->aux->id, stats.nsecs, stats.cnt, - stats.misses); + stats.misses, + prog->aux->verified_insns); } #endif
@@ -3577,6 +3579,8 @@ static int bpf_prog_get_info_by_fd(struct file *file, info.run_cnt = stats.cnt; info.recursion_misses = stats.misses;
+ info.verified_insns = prog->aux->verified_insns; + if (!bpf_capable()) { info.jited_prog_len = 0; info.xlated_prog_len = 0; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 0e8000bd93ff..e236b845f93e 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -13352,6 +13352,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
env->verification_time = ktime_get_ns() - start_time; print_verification_stats(env); + env->prog->aux->verified_insns = env->insn_processed;
if (log->level && bpf_verifier_log_full(log)) ret = -ENOSPC; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index aef4b483bea2..3356ef1e2272 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5273,6 +5273,7 @@ struct bpf_prog_info { __u64 run_time_ns; __u64 run_cnt; __u64 recursion_misses; + __u32 verified_insns; } __attribute__((aligned(8)));
struct bpf_map_info {
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit b96c07f3b5ae6944eb52fd96a322340aa80aef5d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
There isn't a good use case where anyone but libbpf itself needs to call btf__finalize_data(). It was implemented for internal use and it's not clear why it was made into public API in the first place. To function, it requires active ELF data, which is stored inside bpf_object for the duration of opening phase only. But the only BTF that needs bpf_object's ELF is that bpf_object's BTF itself, which libbpf fixes up automatically during bpf_object__open() operation anyways. There is no need for any additional fix up and no reasonable scenario where it's useful and appropriate.
Thus, btf__finalize_data() is just an API atavism and is better removed. So this patch marks it as deprecated immediately (v0.6+) and moves the code from btf.c into libbpf.c where it's used in the context of bpf_object opening phase. Such code co-location allows to make code structure more straightforward and remove bpf_object__section_size() and bpf_object__variable_offset() internal helpers from libbpf_internal.h, making them static. Their naming is also adjusted to not create a wrong illusion that they are some sort of method of bpf_object. They are internal helpers and are called appropriately.
This is part of libbpf 1.0 effort ([0]).
[0] Closes: https://github.com/libbpf/libbpf/issues/276
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org (cherry picked from commit b96c07f3b5ae6944eb52fd96a322340aa80aef5d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 93 ---------------------------- tools/lib/bpf/btf.h | 1 + tools/lib/bpf/libbpf.c | 106 ++++++++++++++++++++++++++++++-- tools/lib/bpf/libbpf_internal.h | 4 -- 4 files changed, 102 insertions(+), 102 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index f1829ad45cb5..7544ea1b89cf 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -1101,99 +1101,6 @@ struct btf *btf__parse_split(const char *path, struct btf *base_btf) return libbpf_ptr(btf_parse(path, base_btf, NULL)); }
-static int compare_vsi_off(const void *_a, const void *_b) -{ - const struct btf_var_secinfo *a = _a; - const struct btf_var_secinfo *b = _b; - - return a->offset - b->offset; -} - -static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf, - struct btf_type *t) -{ - __u32 size = 0, off = 0, i, vars = btf_vlen(t); - const char *name = btf__name_by_offset(btf, t->name_off); - const struct btf_type *t_var; - struct btf_var_secinfo *vsi; - const struct btf_var *var; - int ret; - - if (!name) { - pr_debug("No name found in string section for DATASEC kind.\n"); - return -ENOENT; - } - - /* .extern datasec size and var offsets were set correctly during - * extern collection step, so just skip straight to sorting variables - */ - if (t->size) - goto sort_vars; - - ret = bpf_object__section_size(obj, name, &size); - if (ret || !size || (t->size && t->size != size)) { - pr_debug("Invalid size for section %s: %u bytes\n", name, size); - return -ENOENT; - } - - t->size = size; - - for (i = 0, vsi = btf_var_secinfos(t); i < vars; i++, vsi++) { - t_var = btf__type_by_id(btf, vsi->type); - var = btf_var(t_var); - - if (!btf_is_var(t_var)) { - pr_debug("Non-VAR type seen in section %s\n", name); - return -EINVAL; - } - - if (var->linkage == BTF_VAR_STATIC) - continue; - - name = btf__name_by_offset(btf, t_var->name_off); - if (!name) { - pr_debug("No name found in string section for VAR kind\n"); - return -ENOENT; - } - - ret = bpf_object__variable_offset(obj, name, &off); - if (ret) { - pr_debug("No offset found in symbol table for VAR %s\n", - name); - return -ENOENT; - } - - vsi->offset = off; - } - -sort_vars: - qsort(btf_var_secinfos(t), vars, sizeof(*vsi), compare_vsi_off); - return 0; -} - -int btf__finalize_data(struct bpf_object *obj, struct btf *btf) -{ - int err = 0; - __u32 i; - - for (i = 1; i <= btf->nr_types; i++) { - struct btf_type *t = btf_type_by_id(btf, i); - - /* Loader needs to fix up some of the things compiler - * couldn't get its hands on while emitting BTF. This - * is section size and global variable offset. We use - * the info from the ELF itself for this purpose. - */ - if (btf_is_datasec(t)) { - err = btf_fixup_datasec(obj, btf, t); - if (err) - break; - } - } - - return libbpf_err(err); -} - static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
int btf__load_into_kernel(struct btf *btf) diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 104db2cd699d..bddc1d05f120 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -53,6 +53,7 @@ LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *b LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead") LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
+LIBBPF_DEPRECATED_SINCE(0, 6, "intended for internal libbpf use only") LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_into_kernel instead") LIBBPF_API int btf__load(struct btf *btf); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ba4eb8da2b65..16c731bf945b 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1289,8 +1289,7 @@ static bool bpf_map_type__is_map_in_map(enum bpf_map_type type) return false; }
-int bpf_object__section_size(const struct bpf_object *obj, const char *name, - __u32 *size) +static int find_elf_sec_sz(const struct bpf_object *obj, const char *name, __u32 *size) { int ret = -ENOENT;
@@ -1322,8 +1321,7 @@ int bpf_object__section_size(const struct bpf_object *obj, const char *name, return *size ? 0 : ret; }
-int bpf_object__variable_offset(const struct bpf_object *obj, const char *name, - __u32 *off) +static int find_elf_var_offset(const struct bpf_object *obj, const char *name, __u32 *off) { Elf_Data *symbols = obj->efile.symbols; const char *sname; @@ -2615,6 +2613,104 @@ static int bpf_object__init_btf(struct bpf_object *obj, return 0; }
+static int compare_vsi_off(const void *_a, const void *_b) +{ + const struct btf_var_secinfo *a = _a; + const struct btf_var_secinfo *b = _b; + + return a->offset - b->offset; +} + +static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf, + struct btf_type *t) +{ + __u32 size = 0, off = 0, i, vars = btf_vlen(t); + const char *name = btf__name_by_offset(btf, t->name_off); + const struct btf_type *t_var; + struct btf_var_secinfo *vsi; + const struct btf_var *var; + int ret; + + if (!name) { + pr_debug("No name found in string section for DATASEC kind.\n"); + return -ENOENT; + } + + /* .extern datasec size and var offsets were set correctly during + * extern collection step, so just skip straight to sorting variables + */ + if (t->size) + goto sort_vars; + + ret = find_elf_sec_sz(obj, name, &size); + if (ret || !size || (t->size && t->size != size)) { + pr_debug("Invalid size for section %s: %u bytes\n", name, size); + return -ENOENT; + } + + t->size = size; + + for (i = 0, vsi = btf_var_secinfos(t); i < vars; i++, vsi++) { + t_var = btf__type_by_id(btf, vsi->type); + var = btf_var(t_var); + + if (!btf_is_var(t_var)) { + pr_debug("Non-VAR type seen in section %s\n", name); + return -EINVAL; + } + + if (var->linkage == BTF_VAR_STATIC) + continue; + + name = btf__name_by_offset(btf, t_var->name_off); + if (!name) { + pr_debug("No name found in string section for VAR kind\n"); + return -ENOENT; + } + + ret = find_elf_var_offset(obj, name, &off); + if (ret) { + pr_debug("No offset found in symbol table for VAR %s\n", + name); + return -ENOENT; + } + + vsi->offset = off; + } + +sort_vars: + qsort(btf_var_secinfos(t), vars, sizeof(*vsi), compare_vsi_off); + return 0; +} + +static int btf_finalize_data(struct bpf_object *obj, struct btf *btf) +{ + int err = 0; + __u32 i, n = btf__get_nr_types(btf); + + for (i = 1; i <= n; i++) { + struct btf_type *t = btf_type_by_id(btf, i); + + /* Loader needs to fix up some of the things compiler + * couldn't get its hands on while emitting BTF. This + * is section size and global variable offset. We use + * the info from the ELF itself for this purpose. + */ + if (btf_is_datasec(t)) { + err = btf_fixup_datasec(obj, btf, t); + if (err) + break; + } + } + + return libbpf_err(err); +} + +int btf__finalize_data(struct bpf_object *obj, struct btf *btf) +{ + return btf_finalize_data(obj, btf); +} + static int bpf_object__finalize_btf(struct bpf_object *obj) { int err; @@ -2622,7 +2718,7 @@ static int bpf_object__finalize_btf(struct bpf_object *obj) if (!obj->btf) return 0;
- err = btf__finalize_data(obj, obj->btf); + err = btf_finalize_data(obj, obj->btf); if (err) { pr_warn("Error finalizing %s: %d.\n", BTF_ELF_SEC, err); return err; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index c9ca612c9449..f7a7c5fa26cd 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -283,10 +283,6 @@ struct bpf_prog_load_params {
int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
-int bpf_object__section_size(const struct bpf_object *obj, const char *name, - __u32 *size); -int bpf_object__variable_offset(const struct bpf_object *obj, const char *name, - __u32 *off); struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf); void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type, const char **prefix, int *kind);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 29a30ff501518a49282754909543cef1ef49e4bc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Name currently anonymous internal struct that keeps ELF-related state for bpf_object. Just a bit of clean up, no functional changes.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org (cherry picked from commit 29a30ff501518a49282754909543cef1ef49e4bc) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 70 ++++++++++++++++++++---------------------- 1 file changed, 34 insertions(+), 36 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 16c731bf945b..9a226a1710fc 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -431,6 +431,35 @@ struct module_btf { int fd; };
+struct elf_state { + int fd; + const void *obj_buf; + size_t obj_buf_sz; + Elf *elf; + GElf_Ehdr ehdr; + Elf_Data *symbols; + Elf_Data *data; + Elf_Data *rodata; + Elf_Data *bss; + Elf_Data *st_ops_data; + size_t shstrndx; /* section index for section name strings */ + size_t strtabidx; + struct { + GElf_Shdr shdr; + Elf_Data *data; + } *reloc_sects; + int nr_reloc_sects; + int maps_shndx; + int btf_maps_shndx; + __u32 btf_maps_sec_btf_id; + int text_shndx; + int symbols_shndx; + int data_shndx; + int rodata_shndx; + int bss_shndx; + int st_ops_shndx; +}; + struct bpf_object { char name[BPF_OBJ_NAME_LEN]; char license[64]; @@ -453,40 +482,10 @@ struct bpf_object {
struct bpf_gen *gen_loader;
+ /* Information when doing ELF related work. Only valid if efile.elf is not NULL */ + struct elf_state efile; /* - * Information when doing elf related work. Only valid if fd - * is valid. - */ - struct { - int fd; - const void *obj_buf; - size_t obj_buf_sz; - Elf *elf; - GElf_Ehdr ehdr; - Elf_Data *symbols; - Elf_Data *data; - Elf_Data *rodata; - Elf_Data *bss; - Elf_Data *st_ops_data; - size_t shstrndx; /* section index for section name strings */ - size_t strtabidx; - struct { - GElf_Shdr shdr; - Elf_Data *data; - } *reloc_sects; - int nr_reloc_sects; - int maps_shndx; - int btf_maps_shndx; - __u32 btf_maps_sec_btf_id; - int text_shndx; - int symbols_shndx; - int data_shndx; - int rodata_shndx; - int bss_shndx; - int st_ops_shndx; - } efile; - /* - * All loaded bpf_object is linked in a list, which is + * All loaded bpf_object are linked in a list, which is * hidden to caller. bpf_objects__<func> handlers deal with * all objects. */ @@ -516,7 +515,6 @@ struct bpf_object {
char path[]; }; -#define obj_elf_valid(o) ((o)->efile.elf)
static const char *elf_sym_str(const struct bpf_object *obj, size_t off); static const char *elf_sec_str(const struct bpf_object *obj, size_t off); @@ -1150,7 +1148,7 @@ static struct bpf_object *bpf_object__new(const char *path,
static void bpf_object__elf_finish(struct bpf_object *obj) { - if (!obj_elf_valid(obj)) + if (!obj->efile.elf) return;
if (obj->efile.elf) { @@ -1175,7 +1173,7 @@ static int bpf_object__elf_init(struct bpf_object *obj) int err = 0; GElf_Ehdr *ep;
- if (obj_elf_valid(obj)) { + if (obj->efile.elf) { pr_warn("elf: init internal error\n"); return -LIBBPF_ERRNO__LIBELF; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 12d9466d8bf3d1d4b4fd0f5733b6fa0cc5ee1013 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Refactor internals of libbpf to allow adding custom SEC() handling logic easily from outside of libbpf. To that effect, each SEC()-handling registration sets mandatory program type/expected attach type for a given prefix and can provide three callbacks called at different points of BPF program lifetime:
- init callback for right after bpf_program is initialized and prog_type/expected_attach_type is set. This happens during bpf_object__open() step, close to the very end of constructing bpf_object, so all the libbpf APIs for querying and updating bpf_program properties should be available;
- pre-load callback is called right before BPF_PROG_LOAD command is called in the kernel. This callbacks has ability to set both bpf_program properties, as well as program load attributes, overriding and augmenting the standard libbpf handling of them;
- optional auto-attach callback, which makes a given SEC() handler support auto-attachment of a BPF program through bpf_program__attach() API and/or BPF skeletons <skel>__attach() method.
Each callbacks gets a `long cookie` parameter passed in, which is specified during SEC() handling. This can be used by callbacks to lookup whatever additional information is necessary.
This is not yet completely ready to be exposed to the outside world, mainly due to non-public nature of struct bpf_prog_load_params. Instead of making it part of public API, we'll wait until the planned low-level libbpf API improvements for BPF_PROG_LOAD and other typical bpf() syscall APIs, at which point we'll have a public, probably OPTS-based, way to fully specify BPF program load parameters, which will be used as an interface for custom pre-load callbacks.
But this change itself is already a good first step to unify the BPF program hanling logic even within the libbpf itself. As one example, all the extra per-program type handling (sleepable bit, attach_btf_id resolution, unsetting optional expected attach type) is now more obvious and is gathered in one place.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Dave Marchevsky davemarchevsky@fb.com Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org (cherry picked from commit 12d9466d8bf3d1d4b4fd0f5733b6fa0cc5ee1013) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 135 +++++++++++++++++++++++++++-------------- 1 file changed, 90 insertions(+), 45 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 9a226a1710fc..c3b2f3395e40 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -218,7 +218,9 @@ struct reloc_desc {
struct bpf_sec_def;
-typedef struct bpf_link *(*attach_fn_t)(const struct bpf_program *prog); +typedef int (*init_fn_t)(struct bpf_program *prog, long cookie); +typedef int (*preload_fn_t)(struct bpf_program *prog, struct bpf_prog_load_params *attr, long cookie); +typedef struct bpf_link *(*attach_fn_t)(const struct bpf_program *prog, long cookie);
struct bpf_sec_def { const char *sec; @@ -229,7 +231,11 @@ struct bpf_sec_def { bool is_attachable; bool is_attach_btf; bool is_sleepable; + + init_fn_t init_fn; + preload_fn_t preload_fn; attach_fn_t attach_fn; + long cookie; };
/* @@ -6180,6 +6186,45 @@ static int bpf_object__sanitize_prog(struct bpf_object *obj, struct bpf_program return 0; }
+static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id); + +/* this is called as prog->sec_def->preload_fn for libbpf-supported sec_defs */ +static int libbpf_preload_prog(struct bpf_program *prog, + struct bpf_prog_load_params *attr, long cookie) +{ + /* old kernels might not support specifying expected_attach_type */ + if (prog->sec_def->is_exp_attach_type_optional && + !kernel_supports(prog->obj, FEAT_EXP_ATTACH_TYPE)) + attr->expected_attach_type = 0; + + if (prog->sec_def->is_sleepable) + attr->prog_flags |= BPF_F_SLEEPABLE; + + if ((prog->type == BPF_PROG_TYPE_TRACING || + prog->type == BPF_PROG_TYPE_LSM || + prog->type == BPF_PROG_TYPE_EXT || + prog->type == BPF_PROG_TYPE_SCHED) && !prog->attach_btf_id) { + int btf_obj_fd = 0, btf_type_id = 0, err; + + err = libbpf_find_attach_btf_id(prog, &btf_obj_fd, &btf_type_id); + if (err) + return err; + + /* cache resolved BTF FD and BTF type ID in the prog */ + prog->attach_btf_obj_fd = btf_obj_fd; + prog->attach_btf_id = btf_type_id; + + /* but by now libbpf common logic is not utilizing + * prog->atach_btf_obj_fd/prog->attach_btf_id anymore because + * this callback is called after attrs were populated by + * libbpf, so this callback has to update attr explicitly here + */ + attr->attach_btf_obj_fd = btf_obj_fd; + attr->attach_btf_id = btf_type_id; + } + return 0; +} + static int load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, char *license, __u32 kern_version, int *pfd) @@ -6188,7 +6233,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, char *cp, errmsg[STRERR_BUFSIZE]; size_t log_buf_size = 0; char *log_buf = NULL; - int btf_fd, ret; + int btf_fd, ret, err;
if (prog->type == BPF_PROG_TYPE_UNSPEC) { /* @@ -6204,22 +6249,15 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, return -EINVAL;
load_attr.prog_type = prog->type; - /* old kernels might not support specifying expected_attach_type */ - if (!kernel_supports(prog->obj, FEAT_EXP_ATTACH_TYPE) && prog->sec_def && - prog->sec_def->is_exp_attach_type_optional) - load_attr.expected_attach_type = 0; - else - load_attr.expected_attach_type = prog->expected_attach_type; + load_attr.expected_attach_type = prog->expected_attach_type; if (kernel_supports(prog->obj, FEAT_PROG_NAME)) load_attr.name = prog->name; load_attr.insns = insns; load_attr.insn_cnt = insns_cnt; load_attr.license = license; load_attr.attach_btf_id = prog->attach_btf_id; - if (prog->attach_prog_fd) - load_attr.attach_prog_fd = prog->attach_prog_fd; - else - load_attr.attach_btf_obj_fd = prog->attach_btf_obj_fd; + load_attr.attach_prog_fd = prog->attach_prog_fd; + load_attr.attach_btf_obj_fd = prog->attach_btf_obj_fd; load_attr.attach_btf_id = prog->attach_btf_id; load_attr.kern_version = kern_version; load_attr.prog_ifindex = prog->prog_ifindex; @@ -6238,6 +6276,16 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, load_attr.log_level = prog->log_level; load_attr.prog_flags = prog->prog_flags;
+ /* adjust load_attr if sec_def provides custom preload callback */ + if (prog->sec_def && prog->sec_def->preload_fn) { + err = prog->sec_def->preload_fn(prog, &load_attr, prog->sec_def->cookie); + if (err < 0) { + pr_warn("prog '%s': failed to prepare load attributes: %d\n", + prog->name, err); + return err; + } + } + if (prog->obj->gen_loader) { bpf_gen__prog_load(prog->obj->gen_loader, &load_attr, prog - prog->obj->programs); @@ -6353,8 +6401,6 @@ static int bpf_program__record_externs(struct bpf_program *prog) return 0; }
-static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id); - int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) { int err = 0, fd, i; @@ -6364,20 +6410,6 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) return libbpf_err(-EINVAL); }
- if ((prog->type == BPF_PROG_TYPE_TRACING || - prog->type == BPF_PROG_TYPE_LSM || - prog->type == BPF_PROG_TYPE_EXT || - prog->type == BPF_PROG_TYPE_SCHED) && !prog->attach_btf_id) { - int btf_obj_fd = 0, btf_type_id = 0; - - err = libbpf_find_attach_btf_id(prog, &btf_obj_fd, &btf_type_id); - if (err) - return libbpf_err(err); - - prog->attach_btf_obj_fd = btf_obj_fd; - prog->attach_btf_id = btf_type_id; - } - if (prog->instances.nr < 0 || !prog->instances.fds) { if (prog->preprocessor) { pr_warn("Internal error: can't load program '%s'\n", @@ -6487,6 +6519,7 @@ static const struct bpf_sec_def *find_sec_def(const char *sec_name); static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object_open_opts *opts) { struct bpf_program *prog; + int err;
bpf_object__for_each_program(prog, obj) { prog->sec_def = find_sec_def(prog->sec_name); @@ -6497,8 +6530,6 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object continue; }
- if (prog->sec_def->is_sleepable) - prog->prog_flags |= BPF_F_SLEEPABLE; bpf_program__set_type(prog, prog->sec_def->prog_type); bpf_program__set_expected_attach_type(prog, prog->sec_def->expected_attach_type);
@@ -6508,6 +6539,18 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object prog->sec_def->prog_type == BPF_PROG_TYPE_EXT) prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0); #pragma GCC diagnostic pop + + /* sec_def can have custom callback which should be called + * after bpf_program is initialized to adjust its properties + */ + if (prog->sec_def->init_fn) { + err = prog->sec_def->init_fn(prog, prog->sec_def->cookie); + if (err < 0) { + pr_warn("prog '%s': failed to initialize: %d\n", + prog->name, err); + return err; + } + } }
return 0; @@ -8016,6 +8059,7 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog, .is_exp_attach_type_optional = eatype_optional, \ .is_attachable = attachable, \ .is_attach_btf = attach_btf, \ + .preload_fn = libbpf_preload_prog, \ }
/* Programs that can NOT be attached. */ @@ -8042,16 +8086,17 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog, .sec = sec_pfx, \ .len = sizeof(sec_pfx) - 1, \ .prog_type = BPF_PROG_TYPE_##ptype, \ + .preload_fn = libbpf_preload_prog, \ __VA_ARGS__ \ }
-static struct bpf_link *attach_kprobe(const struct bpf_program *prog); -static struct bpf_link *attach_tp(const struct bpf_program *prog); -static struct bpf_link *attach_raw_tp(const struct bpf_program *prog); -static struct bpf_link *attach_trace(const struct bpf_program *prog); -static struct bpf_link *attach_lsm(const struct bpf_program *prog); -static struct bpf_link *attach_iter(const struct bpf_program *prog); -static struct bpf_link *attach_sched(const struct bpf_program *prog); +static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_tp(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_raw_tp(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_trace(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_lsm(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_iter(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_sched(const struct bpf_program *prog, long cookie);
static const struct bpf_sec_def section_defs[] = { BPF_PROG_SEC("socket", BPF_PROG_TYPE_SOCKET_FILTER), @@ -9318,7 +9363,7 @@ struct bpf_link *bpf_program__attach_kprobe(const struct bpf_program *prog, return link; }
-static struct bpf_link *attach_kprobe(const struct bpf_program *prog) +static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cookie) { const char *func_name; bool retprobe; @@ -9437,7 +9482,7 @@ struct bpf_link *bpf_program__attach_tracepoint(const struct bpf_program *prog, return link; }
-static struct bpf_link *attach_tp(const struct bpf_program *prog) +static struct bpf_link *attach_tp(const struct bpf_program *prog, long cookie) { char *sec_name, *tp_cat, *tp_name; struct bpf_link *link; @@ -9491,7 +9536,7 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(const struct bpf_program *pr return link; }
-static struct bpf_link *attach_raw_tp(const struct bpf_program *prog) +static struct bpf_link *attach_raw_tp(const struct bpf_program *prog, long cookie) { const char *tp_name = prog->sec_name + prog->sec_def->len;
@@ -9543,17 +9588,17 @@ struct bpf_link *bpf_program__attach_lsm(const struct bpf_program *prog) return bpf_program__attach_btf_id(prog); }
-static struct bpf_link *attach_trace(const struct bpf_program *prog) +static struct bpf_link *attach_trace(const struct bpf_program *prog, long cookie) { return bpf_program__attach_trace(prog); }
-static struct bpf_link *attach_sched(const struct bpf_program *prog) +static struct bpf_link *attach_sched(const struct bpf_program *prog, long cookie) { return bpf_program__attach_sched(prog); }
-static struct bpf_link *attach_lsm(const struct bpf_program *prog) +static struct bpf_link *attach_lsm(const struct bpf_program *prog, long cookie) { return bpf_program__attach_lsm(prog); } @@ -9684,7 +9729,7 @@ bpf_program__attach_iter(const struct bpf_program *prog, return link; }
-static struct bpf_link *attach_iter(const struct bpf_program *prog) +static struct bpf_link *attach_iter(const struct bpf_program *prog, long cookie) { return bpf_program__attach_iter(prog, NULL); } @@ -9694,7 +9739,7 @@ struct bpf_link *bpf_program__attach(const struct bpf_program *prog) if (!prog->sec_def || !prog->sec_def->attach_fn) return libbpf_err_ptr(-ESRCH);
- return prog->sec_def->attach_fn(prog); + return prog->sec_def->attach_fn(prog, prog->sec_def->cookie); }
static int bpf_link__detach_struct_ops(struct bpf_link *link)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 13d35a0cf1741431333ba4aa9bce9c5bbc88f63b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Move closer to not relying on bpf_sec_def internals that won't be part of public API, when pluggable SEC() handlers will be allowed. Drop pre-calculated prefix length, and in various helpers don't rely on this prefix length availability. Also minimize reliance on knowing bpf_sec_def's prefix for few places where section prefix shortcuts are supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint).
Given checking some string for having a given string-constant prefix is such a common operation and so annoying to be done with pure C code, add a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c where prefix comparison is performed. With __builtin_constant_p() it's possible to have a convenient helper that checks some string for having a given prefix, where prefix is either string literal (or compile-time known string due to compiler optimization) or just a runtime string pointer, which is quite convenient and saves a lot of typing and string literal duplication.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Dave Marchevsky davemarchevsky@fb.com Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org (cherry picked from commit 13d35a0cf1741431333ba4aa9bce9c5bbc88f63b) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 41 ++++++++++++++++++--------------- tools/lib/bpf/libbpf_internal.h | 7 ++++++ 2 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index c3b2f3395e40..0f8f1e41ff22 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -224,7 +224,6 @@ typedef struct bpf_link *(*attach_fn_t)(const struct bpf_program *prog, long coo
struct bpf_sec_def { const char *sec; - size_t len; enum bpf_prog_type prog_type; enum bpf_attach_type expected_attach_type; bool is_exp_attach_type_optional; @@ -1665,7 +1664,7 @@ static int bpf_object__process_kconfig_line(struct bpf_object *obj, void *ext_val; __u64 num;
- if (strncmp(buf, "CONFIG_", 7)) + if (!str_has_pfx(buf, "CONFIG_")) return 0;
sep = strchr(buf, '='); @@ -3015,7 +3014,7 @@ static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn) static bool is_sec_name_dwarf(const char *name) { /* approximation, but the actual list is too long */ - return strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0; + return str_has_pfx(name, ".debug_"); }
static bool ignore_elf_section(GElf_Shdr *hdr, const char *name) @@ -3037,7 +3036,7 @@ static bool ignore_elf_section(GElf_Shdr *hdr, const char *name) if (is_sec_name_dwarf(name)) return true;
- if (strncmp(name, ".rel", sizeof(".rel") - 1) == 0) { + if (str_has_pfx(name, ".rel")) { name += sizeof(".rel") - 1; /* DWARF section relocations */ if (is_sec_name_dwarf(name)) @@ -6979,8 +6978,7 @@ static int bpf_object__resolve_externs(struct bpf_object *obj, if (err) return err; pr_debug("extern (kcfg) %s=0x%x\n", ext->name, kver); - } else if (ext->type == EXT_KCFG && - strncmp(ext->name, "CONFIG_", 7) == 0) { + } else if (ext->type == EXT_KCFG && str_has_pfx(ext->name, "CONFIG_")) { need_config = true; } else if (ext->type == EXT_KSYM) { if (ext->ksym.type_id) @@ -8053,7 +8051,6 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog, attachable, attach_btf) \ { \ .sec = string, \ - .len = sizeof(string) - 1, \ .prog_type = ptype, \ .expected_attach_type = eatype, \ .is_exp_attach_type_optional = eatype_optional, \ @@ -8084,7 +8081,6 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog,
#define SEC_DEF(sec_pfx, ptype, ...) { \ .sec = sec_pfx, \ - .len = sizeof(sec_pfx) - 1, \ .prog_type = BPF_PROG_TYPE_##ptype, \ .preload_fn = libbpf_preload_prog, \ __VA_ARGS__ \ @@ -8262,10 +8258,8 @@ static const struct bpf_sec_def *find_sec_def(const char *sec_name) int i, n = ARRAY_SIZE(section_defs);
for (i = 0; i < n; i++) { - if (strncmp(sec_name, - section_defs[i].sec, section_defs[i].len)) - continue; - return §ion_defs[i]; + if (str_has_pfx(sec_name, section_defs[i].sec)) + return §ion_defs[i]; } return NULL; } @@ -8624,7 +8618,7 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, prog->sec_name); return -ESRCH; } - attach_name = prog->sec_name + prog->sec_def->len; + attach_name = prog->sec_name + strlen(prog->sec_def->sec);
/* BPF program's BTF ID */ if (attach_prog_fd) { @@ -9368,8 +9362,11 @@ static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cooki const char *func_name; bool retprobe;
- func_name = prog->sec_name + prog->sec_def->len; - retprobe = strcmp(prog->sec_def->sec, "kretprobe/") == 0; + retprobe = str_has_pfx(prog->sec_name, "kretprobe/"); + if (retprobe) + func_name = prog->sec_name + sizeof("kretprobe/") - 1; + else + func_name = prog->sec_name + sizeof("kprobe/") - 1;
return bpf_program__attach_kprobe(prog, retprobe, func_name); } @@ -9491,8 +9488,11 @@ static struct bpf_link *attach_tp(const struct bpf_program *prog, long cookie) if (!sec_name) return libbpf_err_ptr(-ENOMEM);
- /* extract "tp/<category>/<name>" */ - tp_cat = sec_name + prog->sec_def->len; + /* extract "tp/<category>/<name>" or "tracepoint/<category>/<name>" */ + if (str_has_pfx(prog->sec_name, "tp/")) + tp_cat = sec_name + sizeof("tp/") - 1; + else + tp_cat = sec_name + sizeof("tracepoint/") - 1; tp_name = strchr(tp_cat, '/'); if (!tp_name) { free(sec_name); @@ -9538,7 +9538,12 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(const struct bpf_program *pr
static struct bpf_link *attach_raw_tp(const struct bpf_program *prog, long cookie) { - const char *tp_name = prog->sec_name + prog->sec_def->len; + const char *tp_name; + + if (str_has_pfx(prog->sec_name, "raw_tp/")) + tp_name = prog->sec_name + sizeof("raw_tp/") - 1; + else + tp_name = prog->sec_name + sizeof("raw_tracepoint/") - 1;
return bpf_program__attach_raw_tracepoint(prog, tp_name); } diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index f7a7c5fa26cd..63d0d4c7ea1d 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -89,6 +89,13 @@ (offsetof(TYPE, FIELD) + sizeof(((TYPE *)0)->FIELD)) #endif
+/* Check whether a string `str` has prefix `pfx`, regardless if `pfx` is + * a string literal known at compilation time or char * pointer known only at + * runtime. + */ +#define str_has_pfx(str, pfx) \ + (strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0) + /* Symbol versioning is different between static and shared library. * Properly versioned symbols are needed for shared library, but * only the symbol of the new version is needed for static library.
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit ad23b7238474c6319bf692ae6ce037d9696df1d1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Minimize the usage of class-agnostic gelf_xxx() APIs from libelf. These APIs require copying ELF data structures into local GElf_xxx structs and have a more cumbersome API. BPF ELF file is defined to be always 64-bit ELF object, even when intended to be run on 32-bit host architectures, so there is no need to do class-agnostic conversions everywhere. BPF static linker implementation within libbpf has been using Elf64-specific types since initial implementation.
Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more succinct direct access to ELF symbol and relocation records within ELF data itself and switch all the GElf_xxx usage into Elf64_xxx equivalents. The only remaining place within libbpf.c that's still using gelf API is gelf_getclass(), as there doesn't seem to be a direct way to get underlying ELF bitness.
No functional changes intended.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org (cherry picked from commit ad23b7238474c6319bf692ae6ce037d9696df1d1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 353 ++++++++++++++++++-------------- tools/lib/bpf/libbpf_internal.h | 4 +- tools/lib/bpf/linker.c | 1 - 3 files changed, 196 insertions(+), 162 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0f8f1e41ff22..9413360be003 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -441,7 +441,7 @@ struct elf_state { const void *obj_buf; size_t obj_buf_sz; Elf *elf; - GElf_Ehdr ehdr; + Elf64_Ehdr *ehdr; Elf_Data *symbols; Elf_Data *data; Elf_Data *rodata; @@ -450,7 +450,7 @@ struct elf_state { size_t shstrndx; /* section index for section name strings */ size_t strtabidx; struct { - GElf_Shdr shdr; + Elf64_Shdr *shdr; Elf_Data *data; } *reloc_sects; int nr_reloc_sects; @@ -525,9 +525,11 @@ static const char *elf_sym_str(const struct bpf_object *obj, size_t off); static const char *elf_sec_str(const struct bpf_object *obj, size_t off); static Elf_Scn *elf_sec_by_idx(const struct bpf_object *obj, size_t idx); static Elf_Scn *elf_sec_by_name(const struct bpf_object *obj, const char *name); -static int elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn, GElf_Shdr *hdr); +static Elf64_Shdr *elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn); static const char *elf_sec_name(const struct bpf_object *obj, Elf_Scn *scn); static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn); +static Elf64_Sym *elf_sym_by_idx(const struct bpf_object *obj, size_t idx); +static Elf64_Rel *elf_rel_by_idx(Elf_Data *data, size_t idx);
void bpf_program__unload(struct bpf_program *prog) { @@ -669,25 +671,25 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, size_t sec_sz = sec_data->d_size, sec_off, prog_sz, nr_syms; int nr_progs, err, i; const char *name; - GElf_Sym sym; + Elf64_Sym *sym;
progs = obj->programs; nr_progs = obj->nr_programs; - nr_syms = symbols->d_size / sizeof(GElf_Sym); + nr_syms = symbols->d_size / sizeof(Elf64_Sym); sec_off = 0;
for (i = 0; i < nr_syms; i++) { - if (!gelf_getsym(symbols, i, &sym)) - continue; - if (sym.st_shndx != sec_idx) + sym = elf_sym_by_idx(obj, i); + + if (sym->st_shndx != sec_idx) continue; - if (GELF_ST_TYPE(sym.st_info) != STT_FUNC) + if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC) continue;
- prog_sz = sym.st_size; - sec_off = sym.st_value; + prog_sz = sym->st_size; + sec_off = sym->st_value;
- name = elf_sym_str(obj, sym.st_name); + name = elf_sym_str(obj, sym->st_name); if (!name) { pr_warn("sec '%s': failed to get symbol name for offset %zu\n", sec_name, sec_off); @@ -700,7 +702,7 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, return -LIBBPF_ERRNO__FORMAT; }
- if (sec_idx != obj->efile.text_shndx && GELF_ST_BIND(sym.st_info) == STB_LOCAL) { + if (sec_idx != obj->efile.text_shndx && ELF64_ST_BIND(sym->st_info) == STB_LOCAL) { pr_warn("sec '%s': program '%s' is static and not supported\n", sec_name, name); return -ENOTSUP; } @@ -733,9 +735,9 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, * as static to enable more permissive BPF verification mode * with more outside context available to BPF verifier */ - if (GELF_ST_BIND(sym.st_info) != STB_LOCAL - && (GELF_ST_VISIBILITY(sym.st_other) == STV_HIDDEN - || GELF_ST_VISIBILITY(sym.st_other) == STV_INTERNAL)) + if (ELF64_ST_BIND(sym->st_info) != STB_LOCAL + && (ELF64_ST_VISIBILITY(sym->st_other) == STV_HIDDEN + || ELF64_ST_VISIBILITY(sym->st_other) == STV_INTERNAL)) prog->mark_btf_static = true;
nr_progs++; @@ -1175,8 +1177,9 @@ static void bpf_object__elf_finish(struct bpf_object *obj)
static int bpf_object__elf_init(struct bpf_object *obj) { + Elf64_Ehdr *ehdr; int err = 0; - GElf_Ehdr *ep; + Elf *elf;
if (obj->efile.elf) { pr_warn("elf: init internal error\n"); @@ -1188,8 +1191,7 @@ static int bpf_object__elf_init(struct bpf_object *obj) * obj_buf should have been validated by * bpf_object__open_buffer(). */ - obj->efile.elf = elf_memory((char *)obj->efile.obj_buf, - obj->efile.obj_buf_sz); + elf = elf_memory((char *)obj->efile.obj_buf, obj->efile.obj_buf_sz); } else { obj->efile.fd = open(obj->path, O_RDONLY); if (obj->efile.fd < 0) { @@ -1201,23 +1203,37 @@ static int bpf_object__elf_init(struct bpf_object *obj) return err; }
- obj->efile.elf = elf_begin(obj->efile.fd, ELF_C_READ_MMAP, NULL); + elf = elf_begin(obj->efile.fd, ELF_C_READ_MMAP, NULL); }
- if (!obj->efile.elf) { + if (!elf) { pr_warn("elf: failed to open %s as ELF file: %s\n", obj->path, elf_errmsg(-1)); err = -LIBBPF_ERRNO__LIBELF; goto errout; }
- if (!gelf_getehdr(obj->efile.elf, &obj->efile.ehdr)) { + obj->efile.elf = elf; + + if (elf_kind(elf) != ELF_K_ELF) { + err = -LIBBPF_ERRNO__FORMAT; + pr_warn("elf: '%s' is not a proper ELF object\n", obj->path); + goto errout; + } + + if (gelf_getclass(elf) != ELFCLASS64) { + err = -LIBBPF_ERRNO__FORMAT; + pr_warn("elf: '%s' is not a 64-bit ELF object\n", obj->path); + goto errout; + } + + obj->efile.ehdr = ehdr = elf64_getehdr(elf); + if (!obj->efile.ehdr) { pr_warn("elf: failed to get ELF header from %s: %s\n", obj->path, elf_errmsg(-1)); err = -LIBBPF_ERRNO__FORMAT; goto errout; } - ep = &obj->efile.ehdr;
- if (elf_getshdrstrndx(obj->efile.elf, &obj->efile.shstrndx)) { + if (elf_getshdrstrndx(elf, &obj->efile.shstrndx)) { pr_warn("elf: failed to get section names section index for %s: %s\n", obj->path, elf_errmsg(-1)); err = -LIBBPF_ERRNO__FORMAT; @@ -1225,7 +1241,7 @@ static int bpf_object__elf_init(struct bpf_object *obj) }
/* Elf is corrupted/truncated, avoid calling elf_strptr. */ - if (!elf_rawdata(elf_getscn(obj->efile.elf, obj->efile.shstrndx), NULL)) { + if (!elf_rawdata(elf_getscn(elf, obj->efile.shstrndx), NULL)) { pr_warn("elf: failed to get section names strings from %s: %s\n", obj->path, elf_errmsg(-1)); err = -LIBBPF_ERRNO__FORMAT; @@ -1233,8 +1249,7 @@ static int bpf_object__elf_init(struct bpf_object *obj) }
/* Old LLVM set e_machine to EM_NONE */ - if (ep->e_type != ET_REL || - (ep->e_machine && ep->e_machine != EM_BPF)) { + if (ehdr->e_type != ET_REL || (ehdr->e_machine && ehdr->e_machine != EM_BPF)) { pr_warn("elf: %s is not a valid eBPF object file\n", obj->path); err = -LIBBPF_ERRNO__FORMAT; goto errout; @@ -1249,10 +1264,10 @@ static int bpf_object__elf_init(struct bpf_object *obj) static int bpf_object__check_endianness(struct bpf_object *obj) { #if __BYTE_ORDER == __LITTLE_ENDIAN - if (obj->efile.ehdr.e_ident[EI_DATA] == ELFDATA2LSB) + if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2LSB) return 0; #elif __BYTE_ORDER == __BIG_ENDIAN - if (obj->efile.ehdr.e_ident[EI_DATA] == ELFDATA2MSB) + if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2MSB) return 0; #else # error "Unrecognized __BYTE_ORDER__" @@ -1333,23 +1348,20 @@ static int find_elf_var_offset(const struct bpf_object *obj, const char *name, _ if (!name || !off) return -EINVAL;
- for (si = 0; si < symbols->d_size / sizeof(GElf_Sym); si++) { - GElf_Sym sym; + for (si = 0; si < symbols->d_size / sizeof(Elf64_Sym); si++) { + Elf64_Sym *sym = elf_sym_by_idx(obj, si);
- if (!gelf_getsym(symbols, si, &sym)) - continue; - if (GELF_ST_BIND(sym.st_info) != STB_GLOBAL || - GELF_ST_TYPE(sym.st_info) != STT_OBJECT) + if (ELF64_ST_BIND(sym->st_info) != STB_GLOBAL || + ELF64_ST_TYPE(sym->st_info) != STT_OBJECT) continue;
- sname = elf_sym_str(obj, sym.st_name); + sname = elf_sym_str(obj, sym->st_name); if (!sname) { - pr_warn("failed to get sym name string for var %s\n", - name); + pr_warn("failed to get sym name string for var %s\n", name); return -EIO; } if (strcmp(name, sname) == 0) { - *off = sym.st_value; + *off = sym->st_value; return 0; } } @@ -1836,15 +1848,13 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) * * TODO: Detect array of map and report error. */ - nr_syms = symbols->d_size / sizeof(GElf_Sym); + nr_syms = symbols->d_size / sizeof(Elf64_Sym); for (i = 0; i < nr_syms; i++) { - GElf_Sym sym; + Elf64_Sym *sym = elf_sym_by_idx(obj, i);
- if (!gelf_getsym(symbols, i, &sym)) + if (sym->st_shndx != obj->efile.maps_shndx) continue; - if (sym.st_shndx != obj->efile.maps_shndx) - continue; - if (GELF_ST_TYPE(sym.st_info) == STT_SECTION) + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) continue; nr_maps++; } @@ -1861,40 +1871,38 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
/* Fill obj->maps using data in "maps" section. */ for (i = 0; i < nr_syms; i++) { - GElf_Sym sym; + Elf64_Sym *sym = elf_sym_by_idx(obj, i); const char *map_name; struct bpf_map_def *def; struct bpf_map *map;
- if (!gelf_getsym(symbols, i, &sym)) - continue; - if (sym.st_shndx != obj->efile.maps_shndx) + if (sym->st_shndx != obj->efile.maps_shndx) continue; - if (GELF_ST_TYPE(sym.st_info) == STT_SECTION) + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) continue;
map = bpf_object__add_map(obj); if (IS_ERR(map)) return PTR_ERR(map);
- map_name = elf_sym_str(obj, sym.st_name); + map_name = elf_sym_str(obj, sym->st_name); if (!map_name) { pr_warn("failed to get map #%d name sym string for obj %s\n", i, obj->path); return -LIBBPF_ERRNO__FORMAT; }
- if (GELF_ST_BIND(sym.st_info) == STB_LOCAL) { + if (ELF64_ST_BIND(sym->st_info) == STB_LOCAL) { pr_warn("map '%s' (legacy): static maps are not supported\n", map_name); return -ENOTSUP; }
map->libbpf_type = LIBBPF_MAP_UNSPEC; - map->sec_idx = sym.st_shndx; - map->sec_offset = sym.st_value; + map->sec_idx = sym->st_shndx; + map->sec_offset = sym->st_value; pr_debug("map '%s' (legacy): at sec_idx %d, offset %zu.\n", map_name, map->sec_idx, map->sec_offset); - if (sym.st_value + map_def_sz > data->d_size) { + if (sym->st_value + map_def_sz > data->d_size) { pr_warn("corrupted maps section in %s: last map "%s" too small\n", obj->path, map_name); return -EINVAL; @@ -1906,7 +1914,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) return -ENOMEM; } pr_debug("map %d is "%s"\n", i, map->name); - def = (struct bpf_map_def *)(data->d_buf + sym.st_value); + def = (struct bpf_map_def *)(data->d_buf + sym->st_value); /* * If the definition of the map in the object file fits in * bpf_map_def, copy it. Any extra fields in our version @@ -2476,12 +2484,13 @@ static int bpf_object__init_maps(struct bpf_object *obj,
static bool section_have_execinstr(struct bpf_object *obj, int idx) { - GElf_Shdr sh; + Elf64_Shdr *sh;
- if (elf_sec_hdr(obj, elf_sec_by_idx(obj, idx), &sh)) + sh = elf_sec_hdr(obj, elf_sec_by_idx(obj, idx)); + if (!sh) return false;
- return sh.sh_flags & SHF_EXECINSTR; + return sh->sh_flags & SHF_EXECINSTR; }
static bool btf_needs_sanitization(struct bpf_object *obj) @@ -2958,32 +2967,36 @@ static Elf_Scn *elf_sec_by_name(const struct bpf_object *obj, const char *name) return NULL; }
-static int elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn, GElf_Shdr *hdr) +static Elf64_Shdr *elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn) { + Elf64_Shdr *shdr; + if (!scn) - return -EINVAL; + return NULL;
- if (gelf_getshdr(scn, hdr) != hdr) { + shdr = elf64_getshdr(scn); + if (!shdr) { pr_warn("elf: failed to get section(%zu) header from %s: %s\n", elf_ndxscn(scn), obj->path, elf_errmsg(-1)); - return -EINVAL; + return NULL; }
- return 0; + return shdr; }
static const char *elf_sec_name(const struct bpf_object *obj, Elf_Scn *scn) { const char *name; - GElf_Shdr sh; + Elf64_Shdr *sh;
if (!scn) return NULL;
- if (elf_sec_hdr(obj, scn, &sh)) + sh = elf_sec_hdr(obj, scn); + if (!sh) return NULL;
- name = elf_sec_str(obj, sh.sh_name); + name = elf_sec_str(obj, sh->sh_name); if (!name) { pr_warn("elf: failed to get section(%zu) name from %s: %s\n", elf_ndxscn(scn), obj->path, elf_errmsg(-1)); @@ -3011,13 +3024,29 @@ static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn) return data; }
+static Elf64_Sym *elf_sym_by_idx(const struct bpf_object *obj, size_t idx) +{ + if (idx >= obj->efile.symbols->d_size / sizeof(Elf64_Sym)) + return NULL; + + return (Elf64_Sym *)obj->efile.symbols->d_buf + idx; +} + +static Elf64_Rel *elf_rel_by_idx(Elf_Data *data, size_t idx) +{ + if (idx >= data->d_size / sizeof(Elf64_Rel)) + return NULL; + + return (Elf64_Rel *)data->d_buf + idx; +} + static bool is_sec_name_dwarf(const char *name) { /* approximation, but the actual list is too long */ return str_has_pfx(name, ".debug_"); }
-static bool ignore_elf_section(GElf_Shdr *hdr, const char *name) +static bool ignore_elf_section(Elf64_Shdr *hdr, const char *name) { /* no special handling of .strtab */ if (hdr->sh_type == SHT_STRTAB) @@ -3072,17 +3101,18 @@ static int bpf_object__elf_collect(struct bpf_object *obj) const char *name; Elf_Data *data; Elf_Scn *scn; - GElf_Shdr sh; + Elf64_Shdr *sh;
/* a bunch of ELF parsing functionality depends on processing symbols, * so do the first pass and find the symbol table */ scn = NULL; while ((scn = elf_nextscn(elf, scn)) != NULL) { - if (elf_sec_hdr(obj, scn, &sh)) + sh = elf_sec_hdr(obj, scn); + if (!sh) return -LIBBPF_ERRNO__FORMAT;
- if (sh.sh_type == SHT_SYMTAB) { + if (sh->sh_type == SHT_SYMTAB) { if (obj->efile.symbols) { pr_warn("elf: multiple symbol tables in %s\n", obj->path); return -LIBBPF_ERRNO__FORMAT; @@ -3094,7 +3124,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
obj->efile.symbols = data; obj->efile.symbols_shndx = elf_ndxscn(scn); - obj->efile.strtabidx = sh.sh_link; + obj->efile.strtabidx = sh->sh_link; } }
@@ -3108,14 +3138,15 @@ static int bpf_object__elf_collect(struct bpf_object *obj) while ((scn = elf_nextscn(elf, scn)) != NULL) { idx++;
- if (elf_sec_hdr(obj, scn, &sh)) + sh = elf_sec_hdr(obj, scn); + if (!sh) return -LIBBPF_ERRNO__FORMAT;
- name = elf_sec_str(obj, sh.sh_name); + name = elf_sec_str(obj, sh->sh_name); if (!name) return -LIBBPF_ERRNO__FORMAT;
- if (ignore_elf_section(&sh, name)) + if (ignore_elf_section(sh, name)) continue;
data = elf_sec_data(obj, scn); @@ -3124,8 +3155,8 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
pr_debug("elf: section(%d) %s, size %ld, link %d, flags %lx, type=%d\n", idx, name, (unsigned long)data->d_size, - (int)sh.sh_link, (unsigned long)sh.sh_flags, - (int)sh.sh_type); + (int)sh->sh_link, (unsigned long)sh->sh_flags, + (int)sh->sh_type);
if (strcmp(name, "license") == 0) { err = bpf_object__init_license(obj, data->d_buf, data->d_size); @@ -3143,10 +3174,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj) btf_data = data; } else if (strcmp(name, BTF_EXT_ELF_SEC) == 0) { btf_ext_data = data; - } else if (sh.sh_type == SHT_SYMTAB) { + } else if (sh->sh_type == SHT_SYMTAB) { /* already processed during the first pass above */ - } else if (sh.sh_type == SHT_PROGBITS && data->d_size > 0) { - if (sh.sh_flags & SHF_EXECINSTR) { + } else if (sh->sh_type == SHT_PROGBITS && data->d_size > 0) { + if (sh->sh_flags & SHF_EXECINSTR) { if (strcmp(name, ".text") == 0) obj->efile.text_shndx = idx; err = bpf_object__add_programs(obj, data, name, idx); @@ -3165,10 +3196,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj) pr_info("elf: skipping unrecognized data section(%d) %s\n", idx, name); } - } else if (sh.sh_type == SHT_REL) { + } else if (sh->sh_type == SHT_REL) { int nr_sects = obj->efile.nr_reloc_sects; void *sects = obj->efile.reloc_sects; - int sec = sh.sh_info; /* points to other section */ + int sec = sh->sh_info; /* points to other section */
/* Only do relo for section with exec instructions */ if (!section_have_execinstr(obj, sec) && @@ -3190,12 +3221,12 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
obj->efile.reloc_sects[nr_sects].shdr = sh; obj->efile.reloc_sects[nr_sects].data = data; - } else if (sh.sh_type == SHT_NOBITS && strcmp(name, BSS_SEC) == 0) { + } else if (sh->sh_type == SHT_NOBITS && strcmp(name, BSS_SEC) == 0) { obj->efile.bss = data; obj->efile.bss_shndx = idx; } else { pr_info("elf: skipping section(%d) %s (size %zu)\n", idx, name, - (size_t)sh.sh_size); + (size_t)sh->sh_size); } }
@@ -3211,19 +3242,19 @@ static int bpf_object__elf_collect(struct bpf_object *obj) return bpf_object__init_btf(obj, btf_data, btf_ext_data); }
-static bool sym_is_extern(const GElf_Sym *sym) +static bool sym_is_extern(const Elf64_Sym *sym) { - int bind = GELF_ST_BIND(sym->st_info); + int bind = ELF64_ST_BIND(sym->st_info); /* externs are symbols w/ type=NOTYPE, bind=GLOBAL|WEAK, section=UND */ return sym->st_shndx == SHN_UNDEF && (bind == STB_GLOBAL || bind == STB_WEAK) && - GELF_ST_TYPE(sym->st_info) == STT_NOTYPE; + ELF64_ST_TYPE(sym->st_info) == STT_NOTYPE; }
-static bool sym_is_subprog(const GElf_Sym *sym, int text_shndx) +static bool sym_is_subprog(const Elf64_Sym *sym, int text_shndx) { - int bind = GELF_ST_BIND(sym->st_info); - int type = GELF_ST_TYPE(sym->st_info); + int bind = ELF64_ST_BIND(sym->st_info); + int type = ELF64_ST_TYPE(sym->st_info);
/* in .text section */ if (sym->st_shndx != text_shndx) @@ -3421,30 +3452,31 @@ static int bpf_object__collect_externs(struct bpf_object *obj) int i, n, off, dummy_var_btf_id; const char *ext_name, *sec_name; Elf_Scn *scn; - GElf_Shdr sh; + Elf64_Shdr *sh;
if (!obj->efile.symbols) return 0;
scn = elf_sec_by_idx(obj, obj->efile.symbols_shndx); - if (elf_sec_hdr(obj, scn, &sh)) + sh = elf_sec_hdr(obj, scn); + if (!sh) return -LIBBPF_ERRNO__FORMAT;
dummy_var_btf_id = add_dummy_ksym_var(obj->btf); if (dummy_var_btf_id < 0) return dummy_var_btf_id;
- n = sh.sh_size / sh.sh_entsize; + n = sh->sh_size / sh->sh_entsize; pr_debug("looking for externs among %d symbols...\n", n);
for (i = 0; i < n; i++) { - GElf_Sym sym; + Elf64_Sym *sym = elf_sym_by_idx(obj, i);
- if (!gelf_getsym(obj->efile.symbols, i, &sym)) + if (!sym) return -LIBBPF_ERRNO__FORMAT; - if (!sym_is_extern(&sym)) + if (!sym_is_extern(sym)) continue; - ext_name = elf_sym_str(obj, sym.st_name); + ext_name = elf_sym_str(obj, sym->st_name); if (!ext_name || !ext_name[0]) continue;
@@ -3466,7 +3498,7 @@ static int bpf_object__collect_externs(struct bpf_object *obj) t = btf__type_by_id(obj->btf, ext->btf_id); ext->name = btf__name_by_offset(obj->btf, t->name_off); ext->sym_idx = i; - ext->is_weak = GELF_ST_BIND(sym.st_info) == STB_WEAK; + ext->is_weak = ELF64_ST_BIND(sym->st_info) == STB_WEAK;
ext->sec_btf_id = find_extern_sec_btf_id(obj->btf, ext->btf_id); if (ext->sec_btf_id <= 0) { @@ -3706,7 +3738,7 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx) static int bpf_program__record_reloc(struct bpf_program *prog, struct reloc_desc *reloc_desc, __u32 insn_idx, const char *sym_name, - const GElf_Sym *sym, const GElf_Rel *rel) + const Elf64_Sym *sym, const Elf64_Rel *rel) { struct bpf_insn *insn = &prog->insns[insn_idx]; size_t map_idx, nr_maps = prog->obj->nr_maps; @@ -3723,7 +3755,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog, }
if (sym_is_extern(sym)) { - int sym_idx = GELF_R_SYM(rel->r_info); + int sym_idx = ELF64_R_SYM(rel->r_info); int i, n = obj->nr_extern; struct extern_desc *ext;
@@ -3888,9 +3920,8 @@ static struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, }
static int -bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data *data) +bpf_object__collect_prog_relos(struct bpf_object *obj, Elf64_Shdr *shdr, Elf_Data *data) { - Elf_Data *symbols = obj->efile.symbols; const char *relo_sec_name, *sec_name; size_t sec_idx = shdr->sh_info; struct bpf_program *prog; @@ -3900,8 +3931,8 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data __u32 insn_idx; Elf_Scn *scn; Elf_Data *scn_data; - GElf_Sym sym; - GElf_Rel rel; + Elf64_Sym *sym; + Elf64_Rel *rel;
scn = elf_sec_by_idx(obj, sec_idx); scn_data = elf_sec_data(obj, scn); @@ -3916,33 +3947,36 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data nrels = shdr->sh_size / shdr->sh_entsize;
for (i = 0; i < nrels; i++) { - if (!gelf_getrel(data, i, &rel)) { + rel = elf_rel_by_idx(data, i); + if (!rel) { pr_warn("sec '%s': failed to get relo #%d\n", relo_sec_name, i); return -LIBBPF_ERRNO__FORMAT; } - if (!gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &sym)) { + + sym = elf_sym_by_idx(obj, ELF64_R_SYM(rel->r_info)); + if (!sym) { pr_warn("sec '%s': symbol 0x%zx not found for relo #%d\n", - relo_sec_name, (size_t)GELF_R_SYM(rel.r_info), i); + relo_sec_name, (size_t)ELF64_R_SYM(rel->r_info), i); return -LIBBPF_ERRNO__FORMAT; }
- if (rel.r_offset % BPF_INSN_SZ || rel.r_offset >= scn_data->d_size) { + if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) { pr_warn("sec '%s': invalid offset 0x%zx for relo #%d\n", - relo_sec_name, (size_t)GELF_R_SYM(rel.r_info), i); + relo_sec_name, (size_t)ELF64_R_SYM(rel->r_info), i); return -LIBBPF_ERRNO__FORMAT; }
- insn_idx = rel.r_offset / BPF_INSN_SZ; + insn_idx = rel->r_offset / BPF_INSN_SZ; /* relocations against static functions are recorded as * relocations against the section that contains a function; * in such case, symbol will be STT_SECTION and sym.st_name * will point to empty string (0), so fetch section name * instead */ - if (GELF_ST_TYPE(sym.st_info) == STT_SECTION && sym.st_name == 0) - sym_name = elf_sec_name(obj, elf_sec_by_idx(obj, sym.st_shndx)); + if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION && sym->st_name == 0) + sym_name = elf_sec_name(obj, elf_sec_by_idx(obj, sym->st_shndx)); else - sym_name = elf_sym_str(obj, sym.st_name); + sym_name = elf_sym_str(obj, sym->st_name); sym_name = sym_name ?: "<?";
pr_debug("sec '%s': relo #%d: insn #%u against '%s'\n", @@ -3964,7 +3998,7 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, GElf_Shdr *shdr, Elf_Data /* adjust insn_idx to local BPF program frame of reference */ insn_idx -= prog->sec_insn_off; err = bpf_program__record_reloc(prog, &relos[prog->nr_reloc], - insn_idx, sym_name, &sym, &rel); + insn_idx, sym_name, sym, rel); if (err) return err;
@@ -5972,10 +6006,10 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) }
static int bpf_object__collect_st_ops_relos(struct bpf_object *obj, - GElf_Shdr *shdr, Elf_Data *data); + Elf64_Shdr *shdr, Elf_Data *data);
static int bpf_object__collect_map_relos(struct bpf_object *obj, - GElf_Shdr *shdr, Elf_Data *data) + Elf64_Shdr *shdr, Elf_Data *data) { const int bpf_ptr_sz = 8, host_ptr_sz = sizeof(void *); int i, j, nrels, new_sz; @@ -5984,10 +6018,9 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, struct bpf_map *map = NULL, *targ_map; const struct btf_member *member; const char *name, *mname; - Elf_Data *symbols; unsigned int moff; - GElf_Sym sym; - GElf_Rel rel; + Elf64_Sym *sym; + Elf64_Rel *rel; void *tmp;
if (!obj->efile.btf_maps_sec_btf_id || !obj->btf) @@ -5996,28 +6029,30 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, if (!sec) return -EINVAL;
- symbols = obj->efile.symbols; nrels = shdr->sh_size / shdr->sh_entsize; for (i = 0; i < nrels; i++) { - if (!gelf_getrel(data, i, &rel)) { + rel = elf_rel_by_idx(data, i); + if (!rel) { pr_warn(".maps relo #%d: failed to get ELF relo\n", i); return -LIBBPF_ERRNO__FORMAT; } - if (!gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &sym)) { + + sym = elf_sym_by_idx(obj, ELF64_R_SYM(rel->r_info)); + if (!sym) { pr_warn(".maps relo #%d: symbol %zx not found\n", - i, (size_t)GELF_R_SYM(rel.r_info)); + i, (size_t)ELF64_R_SYM(rel->r_info)); return -LIBBPF_ERRNO__FORMAT; } - name = elf_sym_str(obj, sym.st_name) ?: "<?>"; - if (sym.st_shndx != obj->efile.btf_maps_shndx) { + name = elf_sym_str(obj, sym->st_name) ?: "<?>"; + if (sym->st_shndx != obj->efile.btf_maps_shndx) { pr_warn(".maps relo #%d: '%s' isn't a BTF-defined map\n", i, name); return -LIBBPF_ERRNO__RELOC; }
- pr_debug(".maps relo #%d: for %zd value %zd rel.r_offset %zu name %d ('%s')\n", - i, (ssize_t)(rel.r_info >> 32), (size_t)sym.st_value, - (size_t)rel.r_offset, sym.st_name, name); + pr_debug(".maps relo #%d: for %zd value %zd rel->r_offset %zu name %d ('%s')\n", + i, (ssize_t)(rel->r_info >> 32), (size_t)sym->st_value, + (size_t)rel->r_offset, sym->st_name, name);
for (j = 0; j < obj->nr_maps; j++) { map = &obj->maps[j]; @@ -6025,13 +6060,13 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, continue;
vi = btf_var_secinfos(sec) + map->btf_var_idx; - if (vi->offset <= rel.r_offset && - rel.r_offset + bpf_ptr_sz <= vi->offset + vi->size) + if (vi->offset <= rel->r_offset && + rel->r_offset + bpf_ptr_sz <= vi->offset + vi->size) break; } if (j == obj->nr_maps) { - pr_warn(".maps relo #%d: cannot find map '%s' at rel.r_offset %zu\n", - i, name, (size_t)rel.r_offset); + pr_warn(".maps relo #%d: cannot find map '%s' at rel->r_offset %zu\n", + i, name, (size_t)rel->r_offset); return -EINVAL; }
@@ -6058,10 +6093,10 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, return -EINVAL;
moff = btf_member_bit_offset(def, btf_vlen(def) - 1) / 8; - if (rel.r_offset - vi->offset < moff) + if (rel->r_offset - vi->offset < moff) return -EINVAL;
- moff = rel.r_offset - vi->offset - moff; + moff = rel->r_offset - vi->offset - moff; /* here we use BPF pointer size, which is always 64 bit, as we * are parsing ELF that was built for BPF target */ @@ -6107,7 +6142,7 @@ static int bpf_object__collect_relos(struct bpf_object *obj) int i, err;
for (i = 0; i < obj->efile.nr_reloc_sects; i++) { - GElf_Shdr *shdr = &obj->efile.reloc_sects[i].shdr; + Elf64_Shdr *shdr = obj->efile.reloc_sects[i].shdr; Elf_Data *data = obj->efile.reloc_sects[i].data; int idx = shdr->sh_info;
@@ -8336,7 +8371,7 @@ static struct bpf_map *find_struct_ops_map_by_offset(struct bpf_object *obj,
/* Collect the reloc from ELF and populate the st_ops->progs[] */ static int bpf_object__collect_st_ops_relos(struct bpf_object *obj, - GElf_Shdr *shdr, Elf_Data *data) + Elf64_Shdr *shdr, Elf_Data *data) { const struct btf_member *member; struct bpf_struct_ops *st_ops; @@ -8344,58 +8379,58 @@ static int bpf_object__collect_st_ops_relos(struct bpf_object *obj, unsigned int shdr_idx; const struct btf *btf; struct bpf_map *map; - Elf_Data *symbols; unsigned int moff, insn_idx; const char *name; __u32 member_idx; - GElf_Sym sym; - GElf_Rel rel; + Elf64_Sym *sym; + Elf64_Rel *rel; int i, nrels;
- symbols = obj->efile.symbols; btf = obj->btf; nrels = shdr->sh_size / shdr->sh_entsize; for (i = 0; i < nrels; i++) { - if (!gelf_getrel(data, i, &rel)) { + rel = elf_rel_by_idx(data, i); + if (!rel) { pr_warn("struct_ops reloc: failed to get %d reloc\n", i); return -LIBBPF_ERRNO__FORMAT; }
- if (!gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &sym)) { + sym = elf_sym_by_idx(obj, ELF64_R_SYM(rel->r_info)); + if (!sym) { pr_warn("struct_ops reloc: symbol %zx not found\n", - (size_t)GELF_R_SYM(rel.r_info)); + (size_t)ELF64_R_SYM(rel->r_info)); return -LIBBPF_ERRNO__FORMAT; }
- name = elf_sym_str(obj, sym.st_name) ?: "<?>"; - map = find_struct_ops_map_by_offset(obj, rel.r_offset); + name = elf_sym_str(obj, sym->st_name) ?: "<?>"; + map = find_struct_ops_map_by_offset(obj, rel->r_offset); if (!map) { - pr_warn("struct_ops reloc: cannot find map at rel.r_offset %zu\n", - (size_t)rel.r_offset); + pr_warn("struct_ops reloc: cannot find map at rel->r_offset %zu\n", + (size_t)rel->r_offset); return -EINVAL; }
- moff = rel.r_offset - map->sec_offset; - shdr_idx = sym.st_shndx; + moff = rel->r_offset - map->sec_offset; + shdr_idx = sym->st_shndx; st_ops = map->st_ops; - pr_debug("struct_ops reloc %s: for %lld value %lld shdr_idx %u rel.r_offset %zu map->sec_offset %zu name %d ('%s')\n", + pr_debug("struct_ops reloc %s: for %lld value %lld shdr_idx %u rel->r_offset %zu map->sec_offset %zu name %d ('%s')\n", map->name, - (long long)(rel.r_info >> 32), - (long long)sym.st_value, - shdr_idx, (size_t)rel.r_offset, - map->sec_offset, sym.st_name, name); + (long long)(rel->r_info >> 32), + (long long)sym->st_value, + shdr_idx, (size_t)rel->r_offset, + map->sec_offset, sym->st_name, name);
if (shdr_idx >= SHN_LORESERVE) { - pr_warn("struct_ops reloc %s: rel.r_offset %zu shdr_idx %u unsupported non-static function\n", - map->name, (size_t)rel.r_offset, shdr_idx); + pr_warn("struct_ops reloc %s: rel->r_offset %zu shdr_idx %u unsupported non-static function\n", + map->name, (size_t)rel->r_offset, shdr_idx); return -LIBBPF_ERRNO__RELOC; } - if (sym.st_value % BPF_INSN_SZ) { + if (sym->st_value % BPF_INSN_SZ) { pr_warn("struct_ops reloc %s: invalid target program offset %llu\n", - map->name, (unsigned long long)sym.st_value); + map->name, (unsigned long long)sym->st_value); return -LIBBPF_ERRNO__FORMAT; } - insn_idx = sym.st_value / BPF_INSN_SZ; + insn_idx = sym->st_value / BPF_INSN_SZ;
member = find_member_by_offset(st_ops->type, moff * 8); if (!member) { diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 63d0d4c7ea1d..604a9a26773c 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -52,8 +52,8 @@ #endif
/* Older libelf all end up in this expression, for both 32 and 64 bit */ -#ifndef GELF_ST_VISIBILITY -#define GELF_ST_VISIBILITY(o) ((o) & 0x03) +#ifndef ELF64_ST_VISIBILITY +#define ELF64_ST_VISIBILITY(o) ((o) & 0x03) #endif
#define BTF_INFO_ENC(kind, kind_flag, vlen) \ diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index f638ba372a9b..db97d3bfdc5e 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -15,7 +15,6 @@ #include <linux/btf.h> #include <elf.h> #include <libelf.h> -#include <gelf.h> #include <fcntl.h> #include "libbpf.h" #include "btf.h"
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 25bbbd7a444b1624000389830d46ffdc5b809ee8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove internal libbpf assumption that there can be only one .rodata, .data, and .bss map per BPF object. To achieve that, extend and generalize the scheme that was used for keeping track of relocation ELF sections. Now each ELF section has a temporary extra index that keeps track of logical type of ELF section (relocations, data, read-only data, BSS). Switch relocation to this scheme, as well as .rodata/.data/.bss handling.
We don't yet allow multiple .rodata, .data, and .bss sections, but no libbpf internal code makes an assumption that there can be only one of each and thus they can be explicitly referenced by a single index. Next patches will actually allow multiple .rodata and .data sections.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org (cherry picked from commit 25bbbd7a444b1624000389830d46ffdc5b809ee8) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 258 ++++++++++++++++++++++------------------- 1 file changed, 139 insertions(+), 119 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 9413360be003..41eac2e6540a 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -436,6 +436,20 @@ struct module_btf { int fd; };
+enum sec_type { + SEC_UNUSED = 0, + SEC_RELO, + SEC_BSS, + SEC_DATA, + SEC_RODATA, +}; + +struct elf_sec_desc { + enum sec_type sec_type; + Elf64_Shdr *shdr; + Elf_Data *data; +}; + struct elf_state { int fd; const void *obj_buf; @@ -443,25 +457,16 @@ struct elf_state { Elf *elf; Elf64_Ehdr *ehdr; Elf_Data *symbols; - Elf_Data *data; - Elf_Data *rodata; - Elf_Data *bss; Elf_Data *st_ops_data; size_t shstrndx; /* section index for section name strings */ size_t strtabidx; - struct { - Elf64_Shdr *shdr; - Elf_Data *data; - } *reloc_sects; - int nr_reloc_sects; + struct elf_sec_desc *secs; + int sec_cnt; int maps_shndx; int btf_maps_shndx; __u32 btf_maps_sec_btf_id; int text_shndx; int symbols_shndx; - int data_shndx; - int rodata_shndx; - int bss_shndx; int st_ops_shndx; };
@@ -480,10 +485,10 @@ struct bpf_object { struct extern_desc *externs; int nr_extern; int kconfig_map_idx; - int rodata_map_idx;
bool loaded; bool has_subcalls; + bool has_rodata;
struct bpf_gen *gen_loader;
@@ -1138,12 +1143,8 @@ static struct bpf_object *bpf_object__new(const char *path, obj->efile.obj_buf_sz = obj_buf_sz; obj->efile.maps_shndx = -1; obj->efile.btf_maps_shndx = -1; - obj->efile.data_shndx = -1; - obj->efile.rodata_shndx = -1; - obj->efile.bss_shndx = -1; obj->efile.st_ops_shndx = -1; obj->kconfig_map_idx = -1; - obj->rodata_map_idx = -1;
obj->kern_version = get_kernel_version(); obj->loaded = false; @@ -1163,13 +1164,10 @@ static void bpf_object__elf_finish(struct bpf_object *obj) obj->efile.elf = NULL; } obj->efile.symbols = NULL; - obj->efile.data = NULL; - obj->efile.rodata = NULL; - obj->efile.bss = NULL; obj->efile.st_ops_data = NULL;
- zfree(&obj->efile.reloc_sects); - obj->efile.nr_reloc_sects = 0; + zfree(&obj->efile.secs); + obj->efile.sec_cnt = 0; zclose(obj->efile.fd); obj->efile.obj_buf = NULL; obj->efile.obj_buf_sz = 0; @@ -1310,30 +1308,18 @@ static bool bpf_map_type__is_map_in_map(enum bpf_map_type type) static int find_elf_sec_sz(const struct bpf_object *obj, const char *name, __u32 *size) { int ret = -ENOENT; + Elf_Data *data; + Elf_Scn *scn;
*size = 0; - if (!name) { + if (!name) return -EINVAL; - } else if (!strcmp(name, DATA_SEC)) { - if (obj->efile.data) - *size = obj->efile.data->d_size; - } else if (!strcmp(name, BSS_SEC)) { - if (obj->efile.bss) - *size = obj->efile.bss->d_size; - } else if (!strcmp(name, RODATA_SEC)) { - if (obj->efile.rodata) - *size = obj->efile.rodata->d_size; - } else if (!strcmp(name, STRUCT_OPS_SEC)) { - if (obj->efile.st_ops_data) - *size = obj->efile.st_ops_data->d_size; - } else { - Elf_Scn *scn = elf_sec_by_name(obj, name); - Elf_Data *data = elf_sec_data(obj, scn);
- if (data) { - ret = 0; /* found it */ - *size = data->d_size; - } + scn = elf_sec_by_name(obj, name); + data = elf_sec_data(obj, scn); + if (data) { + ret = 0; /* found it */ + *size = data->d_size; }
return *size ? 0 : ret; @@ -1486,34 +1472,39 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
static int bpf_object__init_global_data_maps(struct bpf_object *obj) { - int err; + struct elf_sec_desc *sec_desc; + int err = 0, sec_idx;
/* * Populate obj->maps with libbpf internal maps. */ - if (obj->efile.data_shndx >= 0) { - err = bpf_object__init_internal_map(obj, LIBBPF_MAP_DATA, - obj->efile.data_shndx, - obj->efile.data->d_buf, - obj->efile.data->d_size); - if (err) - return err; - } - if (obj->efile.rodata_shndx >= 0) { - err = bpf_object__init_internal_map(obj, LIBBPF_MAP_RODATA, - obj->efile.rodata_shndx, - obj->efile.rodata->d_buf, - obj->efile.rodata->d_size); - if (err) - return err; - - obj->rodata_map_idx = obj->nr_maps - 1; - } - if (obj->efile.bss_shndx >= 0) { - err = bpf_object__init_internal_map(obj, LIBBPF_MAP_BSS, - obj->efile.bss_shndx, - NULL, - obj->efile.bss->d_size); + for (sec_idx = 1; sec_idx < obj->efile.sec_cnt; sec_idx++) { + sec_desc = &obj->efile.secs[sec_idx]; + + switch (sec_desc->sec_type) { + case SEC_DATA: + err = bpf_object__init_internal_map(obj, LIBBPF_MAP_DATA, + sec_idx, + sec_desc->data->d_buf, + sec_desc->data->d_size); + break; + case SEC_RODATA: + obj->has_rodata = true; + err = bpf_object__init_internal_map(obj, LIBBPF_MAP_RODATA, + sec_idx, + sec_desc->data->d_buf, + sec_desc->data->d_size); + break; + case SEC_BSS: + err = bpf_object__init_internal_map(obj, LIBBPF_MAP_BSS, + sec_idx, + NULL, + sec_desc->data->d_size); + break; + default: + /* skip */ + break; + } if (err) return err; } @@ -3094,6 +3085,7 @@ static int cmp_progs(const void *_a, const void *_b)
static int bpf_object__elf_collect(struct bpf_object *obj) { + struct elf_sec_desc *sec_desc; Elf *elf = obj->efile.elf; Elf_Data *btf_ext_data = NULL; Elf_Data *btf_data = NULL; @@ -3103,6 +3095,15 @@ static int bpf_object__elf_collect(struct bpf_object *obj) Elf_Scn *scn; Elf64_Shdr *sh;
+ /* ELF section indices are 1-based, so allocate +1 element to keep + * indexing simple. Also include 0th invalid section into sec_cnt for + * simpler and more traditional iteration logic. + */ + obj->efile.sec_cnt = 1 + obj->efile.ehdr->e_shnum; + obj->efile.secs = calloc(obj->efile.sec_cnt, sizeof(*obj->efile.secs)); + if (!obj->efile.secs) + return -ENOMEM; + /* a bunch of ELF parsing functionality depends on processing symbols, * so do the first pass and find the symbol table */ @@ -3122,8 +3123,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj) if (!data) return -LIBBPF_ERRNO__FORMAT;
+ idx = elf_ndxscn(scn); + obj->efile.symbols = data; - obj->efile.symbols_shndx = elf_ndxscn(scn); + obj->efile.symbols_shndx = idx; obj->efile.strtabidx = sh->sh_link; } } @@ -3136,7 +3139,8 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
scn = NULL; while ((scn = elf_nextscn(elf, scn)) != NULL) { - idx++; + idx = elf_ndxscn(scn); + sec_desc = &obj->efile.secs[idx];
sh = elf_sec_hdr(obj, scn); if (!sh) @@ -3184,11 +3188,13 @@ static int bpf_object__elf_collect(struct bpf_object *obj) if (err) return err; } else if (strcmp(name, DATA_SEC) == 0) { - obj->efile.data = data; - obj->efile.data_shndx = idx; + sec_desc->sec_type = SEC_DATA; + sec_desc->shdr = sh; + sec_desc->data = data; } else if (strcmp(name, RODATA_SEC) == 0) { - obj->efile.rodata = data; - obj->efile.rodata_shndx = idx; + sec_desc->sec_type = SEC_RODATA; + sec_desc->shdr = sh; + sec_desc->data = data; } else if (strcmp(name, STRUCT_OPS_SEC) == 0) { obj->efile.st_ops_data = data; obj->efile.st_ops_shndx = idx; @@ -3197,33 +3203,25 @@ static int bpf_object__elf_collect(struct bpf_object *obj) idx, name); } } else if (sh->sh_type == SHT_REL) { - int nr_sects = obj->efile.nr_reloc_sects; - void *sects = obj->efile.reloc_sects; - int sec = sh->sh_info; /* points to other section */ + int targ_sec_idx = sh->sh_info; /* points to other section */
/* Only do relo for section with exec instructions */ - if (!section_have_execinstr(obj, sec) && + if (!section_have_execinstr(obj, targ_sec_idx) && strcmp(name, ".rel" STRUCT_OPS_SEC) && strcmp(name, ".rel" MAPS_ELF_SEC)) { pr_info("elf: skipping relo section(%d) %s for section(%d) %s\n", - idx, name, sec, - elf_sec_name(obj, elf_sec_by_idx(obj, sec)) ?: "<?>"); + idx, name, targ_sec_idx, + elf_sec_name(obj, elf_sec_by_idx(obj, targ_sec_idx)) ?: "<?>"); continue; }
- sects = libbpf_reallocarray(sects, nr_sects + 1, - sizeof(*obj->efile.reloc_sects)); - if (!sects) - return -ENOMEM; - - obj->efile.reloc_sects = sects; - obj->efile.nr_reloc_sects++; - - obj->efile.reloc_sects[nr_sects].shdr = sh; - obj->efile.reloc_sects[nr_sects].data = data; + sec_desc->sec_type = SEC_RELO; + sec_desc->shdr = sh; + sec_desc->data = data; } else if (sh->sh_type == SHT_NOBITS && strcmp(name, BSS_SEC) == 0) { - obj->efile.bss = data; - obj->efile.bss_shndx = idx; + sec_desc->sec_type = SEC_BSS; + sec_desc->shdr = sh; + sec_desc->data = data; } else { pr_info("elf: skipping section(%d) %s (size %zu)\n", idx, name, (size_t)sh->sh_size); @@ -3708,9 +3706,14 @@ bpf_object__find_program_by_name(const struct bpf_object *obj, static bool bpf_object__shndx_is_data(const struct bpf_object *obj, int shndx) { - return shndx == obj->efile.data_shndx || - shndx == obj->efile.bss_shndx || - shndx == obj->efile.rodata_shndx; + switch (obj->efile.secs[shndx].sec_type) { + case SEC_BSS: + case SEC_DATA: + case SEC_RODATA: + return true; + default: + return false; + } }
static bool bpf_object__shndx_is_maps(const struct bpf_object *obj, @@ -3723,16 +3726,19 @@ static bool bpf_object__shndx_is_maps(const struct bpf_object *obj, static enum libbpf_map_type bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx) { - if (shndx == obj->efile.data_shndx) - return LIBBPF_MAP_DATA; - else if (shndx == obj->efile.bss_shndx) + if (shndx == obj->efile.symbols_shndx) + return LIBBPF_MAP_KCONFIG; + + switch (obj->efile.secs[shndx].sec_type) { + case SEC_BSS: return LIBBPF_MAP_BSS; - else if (shndx == obj->efile.rodata_shndx) + case SEC_DATA: + return LIBBPF_MAP_DATA; + case SEC_RODATA: return LIBBPF_MAP_RODATA; - else if (shndx == obj->efile.symbols_shndx) - return LIBBPF_MAP_KCONFIG; - else + default: return LIBBPF_MAP_UNSPEC; + } }
static int bpf_program__record_reloc(struct bpf_program *prog, @@ -3868,7 +3874,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog, } for (map_idx = 0; map_idx < nr_maps; map_idx++) { map = &obj->maps[map_idx]; - if (map->libbpf_type != type) + if (map->libbpf_type != type || map->sec_idx != sym->st_shndx) continue; pr_debug("prog '%s': found data map %zd (%s, sec %d, off %zu) for insn %u\n", prog->name, map_idx, map->name, map->sec_idx, @@ -6141,10 +6147,18 @@ static int bpf_object__collect_relos(struct bpf_object *obj) { int i, err;
- for (i = 0; i < obj->efile.nr_reloc_sects; i++) { - Elf64_Shdr *shdr = obj->efile.reloc_sects[i].shdr; - Elf_Data *data = obj->efile.reloc_sects[i].data; - int idx = shdr->sh_info; + for (i = 0; i < obj->efile.sec_cnt; i++) { + struct elf_sec_desc *sec_desc = &obj->efile.secs[i]; + Elf64_Shdr *shdr; + Elf_Data *data; + int idx; + + if (sec_desc->sec_type != SEC_RELO) + continue; + + shdr = sec_desc->shdr; + data = sec_desc->data; + idx = shdr->sh_info;
if (shdr->sh_type != SHT_REL) { pr_warn("internal error at %d\n", __LINE__); @@ -6264,6 +6278,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, char *license, __u32 kern_version, int *pfd) { struct bpf_prog_load_params load_attr = {}; + struct bpf_object *obj = prog->obj; char *cp, errmsg[STRERR_BUFSIZE]; size_t log_buf_size = 0; char *log_buf = NULL; @@ -6284,7 +6299,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
load_attr.prog_type = prog->type; load_attr.expected_attach_type = prog->expected_attach_type; - if (kernel_supports(prog->obj, FEAT_PROG_NAME)) + if (kernel_supports(obj, FEAT_PROG_NAME)) load_attr.name = prog->name; load_attr.insns = insns; load_attr.insn_cnt = insns_cnt; @@ -6297,8 +6312,8 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, load_attr.prog_ifindex = prog->prog_ifindex;
/* specify func_info/line_info only if kernel supports them */ - btf_fd = bpf_object__btf_fd(prog->obj); - if (btf_fd >= 0 && kernel_supports(prog->obj, FEAT_BTF_FUNC)) { + btf_fd = bpf_object__btf_fd(obj); + if (btf_fd >= 0 && kernel_supports(obj, FEAT_BTF_FUNC)) { load_attr.prog_btf_fd = btf_fd; load_attr.func_info = prog->func_info; load_attr.func_info_rec_size = prog->func_info_rec_size; @@ -6320,9 +6335,9 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, } }
- if (prog->obj->gen_loader) { - bpf_gen__prog_load(prog->obj->gen_loader, &load_attr, - prog - prog->obj->programs); + if (obj->gen_loader) { + bpf_gen__prog_load(obj->gen_loader, &load_attr, + prog - obj->programs); *pfd = -1; return 0; } @@ -6343,16 +6358,21 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, if (log_buf && load_attr.log_level) pr_debug("verifier log:\n%s", log_buf);
- if (prog->obj->rodata_map_idx >= 0 && - kernel_supports(prog->obj, FEAT_PROG_BIND_MAP)) { - struct bpf_map *rodata_map = - &prog->obj->maps[prog->obj->rodata_map_idx]; + if (obj->has_rodata && kernel_supports(obj, FEAT_PROG_BIND_MAP)) { + struct bpf_map *map; + int i; + + for (i = 0; i < obj->nr_maps; i++) { + map = &prog->obj->maps[i]; + if (map->libbpf_type != LIBBPF_MAP_RODATA) + continue;
- if (bpf_prog_bind_map(ret, bpf_map__fd(rodata_map), NULL)) { - cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); - pr_warn("prog '%s': failed to bind .rodata map: %s\n", - prog->name, cp); - /* Don't fail hard if can't bind rodata. */ + if (bpf_prog_bind_map(ret, bpf_map__fd(map), NULL)) { + cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); + pr_warn("prog '%s': failed to bind .rodata map: %s\n", + prog->name, cp); + /* Don't fail hard if can't bind rodata. */ + } } }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 8654b4d35e6c915ef456c14320ec8720383e81a7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove the assumption about only single instance of each of .rodata and .data internal maps. Nothing changes for '.rodata' and '.data' maps, but new '.rodata.something' map will get 'rodata_something' section in BPF skeleton for them (as well as having struct bpf_map * field in maps section with the same field name).
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-6-andrii@kernel.org (cherry picked from commit 8654b4d35e6c915ef456c14320ec8720383e81a7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 107 ++++++++++++++++++++++------------------ 1 file changed, 60 insertions(+), 47 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index f795907b4061..58b4ddb5545f 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -33,6 +33,11 @@ static void sanitize_identifier(char *name) name[i] = '_'; }
+static bool str_has_prefix(const char *str, const char *prefix) +{ + return strncmp(str, prefix, strlen(prefix)) == 0; +} + static bool str_has_suffix(const char *str, const char *suffix) { size_t i, n1 = strlen(str), n2 = strlen(suffix); @@ -67,23 +72,47 @@ static void get_header_guard(char *guard, const char *obj_name) guard[i] = toupper(guard[i]); }
-static const char *get_map_ident(const struct bpf_map *map) +static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz) { + static const char *sfxs[] = { ".data", ".rodata", ".bss", ".kconfig" }; const char *name = bpf_map__name(map); + int i, n; + + if (!bpf_map__is_internal(map)) { + snprintf(buf, buf_sz, "%s", name); + return true; + } + + for (i = 0, n = ARRAY_SIZE(sfxs); i < n; i++) { + const char *sfx = sfxs[i], *p; + + p = strstr(name, sfx); + if (p) { + snprintf(buf, buf_sz, "%s", p + 1); + sanitize_identifier(buf); + return true; + } + }
- if (!bpf_map__is_internal(map)) - return name; - - if (str_has_suffix(name, ".data")) - return "data"; - else if (str_has_suffix(name, ".rodata")) - return "rodata"; - else if (str_has_suffix(name, ".bss")) - return "bss"; - else if (str_has_suffix(name, ".kconfig")) - return "kconfig"; - else - return NULL; + return false; +} + +static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz) +{ + static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" }; + int i, n; + + for (i = 0, n = ARRAY_SIZE(pfxs); i < n; i++) { + const char *pfx = pfxs[i]; + + if (str_has_prefix(sec_name, pfx)) { + snprintf(buf, buf_sz, "%s", sec_name + 1); + sanitize_identifier(buf); + return true; + } + } + + return false; }
static void codegen_btf_dump_printf(void *ctx, const char *fmt, va_list args) @@ -100,24 +129,14 @@ static int codegen_datasec_def(struct bpf_object *obj, const char *sec_name = btf__name_by_offset(btf, sec->name_off); const struct btf_var_secinfo *sec_var = btf_var_secinfos(sec); int i, err, off = 0, pad_cnt = 0, vlen = btf_vlen(sec); - const char *sec_ident; - char var_ident[256]; + char var_ident[256], sec_ident[256]; bool strip_mods = false;
- if (strcmp(sec_name, ".data") == 0) { - sec_ident = "data"; - strip_mods = true; - } else if (strcmp(sec_name, ".bss") == 0) { - sec_ident = "bss"; - strip_mods = true; - } else if (strcmp(sec_name, ".rodata") == 0) { - sec_ident = "rodata"; - strip_mods = true; - } else if (strcmp(sec_name, ".kconfig") == 0) { - sec_ident = "kconfig"; - } else { + if (!get_datasec_ident(sec_name, sec_ident, sizeof(sec_ident))) return 0; - } + + if (strcmp(sec_name, ".kconfig") != 0) + strip_mods = true;
printf(" struct %s__%s {\n", obj_name, sec_ident); for (i = 0; i < vlen; i++, sec_var++) { @@ -385,6 +404,7 @@ static void codegen_destroy(struct bpf_object *obj, const char *obj_name) { struct bpf_program *prog; struct bpf_map *map; + char ident[256];
codegen("\ \n\ @@ -405,10 +425,7 @@ static void codegen_destroy(struct bpf_object *obj, const char *obj_name) }
bpf_object__for_each_map(map, obj) { - const char *ident; - - ident = get_map_ident(map); - if (!ident) + if (!get_map_ident(map, ident, sizeof(ident))) continue; if (bpf_map__is_internal(map) && (bpf_map__def(map)->map_flags & BPF_F_MMAPABLE)) @@ -432,6 +449,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h struct bpf_object_load_attr load_attr = {}; DECLARE_LIBBPF_OPTS(gen_loader_opts, opts); struct bpf_map *map; + char ident[256]; int err = 0;
err = bpf_object__gen_loader(obj, &opts); @@ -477,12 +495,10 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h ", obj_name, opts.data_sz); bpf_object__for_each_map(map, obj) { - const char *ident; const void *mmap_data = NULL; size_t mmap_size = 0;
- ident = get_map_ident(map); - if (!ident) + if (!get_map_ident(map, ident, sizeof(ident))) continue;
if (!bpf_map__is_internal(map) || @@ -544,15 +560,15 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h return err; \n\ ", obj_name); bpf_object__for_each_map(map, obj) { - const char *ident, *mmap_flags; + const char *mmap_flags;
- ident = get_map_ident(map); - if (!ident) + if (!get_map_ident(map, ident, sizeof(ident))) continue;
if (!bpf_map__is_internal(map) || !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE)) continue; + if (bpf_map__def(map)->map_flags & BPF_F_RDONLY_PROG) mmap_flags = "PROT_READ"; else @@ -602,7 +618,8 @@ static int do_skeleton(int argc, char **argv) DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts); char obj_name[MAX_OBJ_NAME_LEN] = "", *obj_data; struct bpf_object *obj = NULL; - const char *file, *ident; + const char *file; + char ident[256]; struct bpf_program *prog; int fd, err = -1; struct bpf_map *map; @@ -673,8 +690,7 @@ static int do_skeleton(int argc, char **argv) }
bpf_object__for_each_map(map, obj) { - ident = get_map_ident(map); - if (!ident) { + if (!get_map_ident(map, ident, sizeof(ident))) { p_err("ignoring unrecognized internal map '%s'...", bpf_map__name(map)); continue; @@ -727,8 +743,7 @@ static int do_skeleton(int argc, char **argv) if (map_cnt) { printf("\tstruct {\n"); bpf_object__for_each_map(map, obj) { - ident = get_map_ident(map); - if (!ident) + if (!get_map_ident(map, ident, sizeof(ident))) continue; if (use_loader) printf("\t\tstruct bpf_map_desc %s;\n", ident); @@ -894,9 +909,7 @@ static int do_skeleton(int argc, char **argv) ); i = 0; bpf_object__for_each_map(map, obj) { - ident = get_map_ident(map); - - if (!ident) + if (!get_map_ident(map, ident, sizeof(ident))) continue;
codegen("\
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit ef9356d392f980b3b192668fa05b2eaaad127da1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It can happen that some data sections (e.g., .rodata.cst16, containing compiler populated string constants) won't have a corresponding BTF DATASEC type. Now that libbpf supports .rodata.* and .data.* sections, situation like that will cause invalid BPF skeleton to be generated that won't compile successfully, as some parts of skeleton would assume memory-mapped struct definitions for each special data section.
Fix this by generating empty struct definitions for such data sections.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-7-andrii@kernel.org (cherry picked from commit ef9356d392f980b3b192668fa05b2eaaad127da1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/gen.c | 51 ++++++++++++++++++++++++++++++++++++----- 1 file changed, 45 insertions(+), 6 deletions(-)
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index 58b4ddb5545f..e9ae3030d148 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -213,22 +213,61 @@ static int codegen_datasecs(struct bpf_object *obj, const char *obj_name) struct btf *btf = bpf_object__btf(obj); int n = btf__get_nr_types(btf); struct btf_dump *d; + struct bpf_map *map; + const struct btf_type *sec; + char sec_ident[256], map_ident[256]; int i, err = 0;
d = btf_dump__new(btf, NULL, NULL, codegen_btf_dump_printf); if (IS_ERR(d)) return PTR_ERR(d);
- for (i = 1; i <= n; i++) { - const struct btf_type *t = btf__type_by_id(btf, i); + bpf_object__for_each_map(map, obj) { + /* only generate definitions for memory-mapped internal maps */ + if (!bpf_map__is_internal(map)) + continue; + if (!(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE)) + continue;
- if (!btf_is_datasec(t)) + if (!get_map_ident(map, map_ident, sizeof(map_ident))) continue;
- err = codegen_datasec_def(obj, btf, d, t, obj_name); - if (err) - goto out; + sec = NULL; + for (i = 1; i <= n; i++) { + const struct btf_type *t = btf__type_by_id(btf, i); + const char *name; + + if (!btf_is_datasec(t)) + continue; + + name = btf__str_by_offset(btf, t->name_off); + if (!get_datasec_ident(name, sec_ident, sizeof(sec_ident))) + continue; + + if (strcmp(sec_ident, map_ident) == 0) { + sec = t; + break; + } + } + + /* In some cases (e.g., sections like .rodata.cst16 containing + * compiler allocated string constants only) there will be + * special internal maps with no corresponding DATASEC BTF + * type. In such case, generate empty structs for each such + * map. It will still be memory-mapped and its contents + * accessible from user-space through BPF skeleton. + */ + if (!sec) { + printf(" struct %s__%s {\n", obj_name, map_ident); + printf(" } *%s;\n", map_ident); + } else { + err = codegen_datasec_def(obj, btf, d, sec, obj_name); + if (err) + goto out; + } } + + out: btf_dump__free(d); return err;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit aed659170a3171e425913ae259d46396fb9c10ef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add support for having multiple .rodata and .data data sections ([0]). .rodata/.data are supported like the usual, but now also .rodata.<whatever> and .data.<whatever> are also supported. Each such section will get its own backing BPF_MAP_TYPE_ARRAY, just like .rodata and .data.
Multiple .bss maps are not supported, as the whole '.bss' name is confusing and might be deprecated soon, as well as user would need to specify custom ELF section with SEC() attribute anyway, so might as well stick to just .data.* and .rodata.* convention.
User-visible map name for such new maps is going to be just their ELF section names.
[0] https://github.com/libbpf/libbpf/issues/274
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org (cherry picked from commit aed659170a3171e425913ae259d46396fb9c10ef) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 130 ++++++++++++++++++++++++++++++++--------- 1 file changed, 101 insertions(+), 29 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 41eac2e6540a..9cc25e528d30 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -350,15 +350,14 @@ enum libbpf_map_type { LIBBPF_MAP_KCONFIG, };
-static const char * const libbpf_type_to_btf_name[] = { - [LIBBPF_MAP_DATA] = DATA_SEC, - [LIBBPF_MAP_BSS] = BSS_SEC, - [LIBBPF_MAP_RODATA] = RODATA_SEC, - [LIBBPF_MAP_KCONFIG] = KCONFIG_SEC, -}; - struct bpf_map { char *name; + /* real_name is defined for special internal maps (.rodata*, + * .data*, .bss, .kconfig) and preserves their original ELF section + * name. This is important to be be able to find corresponding BTF + * DATASEC information. + */ + char *real_name; int fd; int sec_idx; size_t sec_offset; @@ -1399,17 +1398,55 @@ static size_t bpf_map_mmap_sz(const struct bpf_map *map) return map_sz; }
-static char *internal_map_name(struct bpf_object *obj, - enum libbpf_map_type type) +static char *internal_map_name(struct bpf_object *obj, const char *real_name) { char map_name[BPF_OBJ_NAME_LEN], *p; - const char *sfx = libbpf_type_to_btf_name[type]; - int sfx_len = max((size_t)7, strlen(sfx)); - int pfx_len = min((size_t)BPF_OBJ_NAME_LEN - sfx_len - 1, - strlen(obj->name)); + int pfx_len, sfx_len = max((size_t)7, strlen(real_name)); + + /* This is one of the more confusing parts of libbpf for various + * reasons, some of which are historical. The original idea for naming + * internal names was to include as much of BPF object name prefix as + * possible, so that it can be distinguished from similar internal + * maps of a different BPF object. + * As an example, let's say we have bpf_object named 'my_object_name' + * and internal map corresponding to '.rodata' ELF section. The final + * map name advertised to user and to the kernel will be + * 'my_objec.rodata', taking first 8 characters of object name and + * entire 7 characters of '.rodata'. + * Somewhat confusingly, if internal map ELF section name is shorter + * than 7 characters, e.g., '.bss', we still reserve 7 characters + * for the suffix, even though we only have 4 actual characters, and + * resulting map will be called 'my_objec.bss', not even using all 15 + * characters allowed by the kernel. Oh well, at least the truncated + * object name is somewhat consistent in this case. But if the map + * name is '.kconfig', we'll still have entirety of '.kconfig' added + * (8 chars) and thus will be left with only first 7 characters of the + * object name ('my_obje'). Happy guessing, user, that the final map + * name will be "my_obje.kconfig". + * Now, with libbpf starting to support arbitrarily named .rodata.* + * and .data.* data sections, it's possible that ELF section name is + * longer than allowed 15 chars, so we now need to be careful to take + * only up to 15 first characters of ELF name, taking no BPF object + * name characters at all. So '.rodata.abracadabra' will result in + * '.rodata.abracad' kernel and user-visible name. + * We need to keep this convoluted logic intact for .data, .bss and + * .rodata maps, but for new custom .data.custom and .rodata.custom + * maps we use their ELF names as is, not prepending bpf_object name + * in front. We still need to truncate them to 15 characters for the + * kernel. Full name can be recovered for such maps by using DATASEC + * BTF type associated with such map's value type, though. + */ + if (sfx_len >= BPF_OBJ_NAME_LEN) + sfx_len = BPF_OBJ_NAME_LEN - 1; + + /* if there are two or more dots in map name, it's a custom dot map */ + if (strchr(real_name + 1, '.') != NULL) + pfx_len = 0; + else + pfx_len = min((size_t)BPF_OBJ_NAME_LEN - sfx_len - 1, strlen(obj->name));
snprintf(map_name, sizeof(map_name), "%.*s%.*s", pfx_len, obj->name, - sfx_len, libbpf_type_to_btf_name[type]); + sfx_len, real_name);
/* sanitise map name to characters allowed by kernel */ for (p = map_name; *p && p < map_name + sizeof(map_name); p++) @@ -1421,7 +1458,7 @@ static char *internal_map_name(struct bpf_object *obj,
static int bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, - int sec_idx, void *data, size_t data_sz) + const char *real_name, int sec_idx, void *data, size_t data_sz) { struct bpf_map_def *def; struct bpf_map *map; @@ -1434,9 +1471,11 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, map->libbpf_type = type; map->sec_idx = sec_idx; map->sec_offset = 0; - map->name = internal_map_name(obj, type); - if (!map->name) { - pr_warn("failed to alloc map name\n"); + map->real_name = strdup(real_name); + map->name = internal_map_name(obj, real_name); + if (!map->real_name || !map->name) { + zfree(&map->real_name); + zfree(&map->name); return -ENOMEM; }
@@ -1459,6 +1498,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, map->mmaped = NULL; pr_warn("failed to alloc map '%s' content buffer: %d\n", map->name, err); + zfree(&map->real_name); zfree(&map->name); return err; } @@ -1473,6 +1513,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, static int bpf_object__init_global_data_maps(struct bpf_object *obj) { struct elf_sec_desc *sec_desc; + const char *sec_name; int err = 0, sec_idx;
/* @@ -1483,21 +1524,24 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
switch (sec_desc->sec_type) { case SEC_DATA: + sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx)); err = bpf_object__init_internal_map(obj, LIBBPF_MAP_DATA, - sec_idx, + sec_name, sec_idx, sec_desc->data->d_buf, sec_desc->data->d_size); break; case SEC_RODATA: obj->has_rodata = true; + sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx)); err = bpf_object__init_internal_map(obj, LIBBPF_MAP_RODATA, - sec_idx, + sec_name, sec_idx, sec_desc->data->d_buf, sec_desc->data->d_size); break; case SEC_BSS: + sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx)); err = bpf_object__init_internal_map(obj, LIBBPF_MAP_BSS, - sec_idx, + sec_name, sec_idx, NULL, sec_desc->data->d_size); break; @@ -1801,7 +1845,7 @@ static int bpf_object__init_kconfig_map(struct bpf_object *obj)
map_sz = last_ext->kcfg.data_off + last_ext->kcfg.sz; err = bpf_object__init_internal_map(obj, LIBBPF_MAP_KCONFIG, - obj->efile.symbols_shndx, + ".kconfig", obj->efile.symbols_shndx, NULL, map_sz); if (err) return err; @@ -1901,7 +1945,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
map->name = strdup(map_name); if (!map->name) { - pr_warn("failed to alloc map name\n"); + pr_warn("map '%s': failed to alloc map name\n", map_name); return -ENOMEM; } pr_debug("map %d is "%s"\n", i, map->name); @@ -3187,11 +3231,13 @@ static int bpf_object__elf_collect(struct bpf_object *obj) err = bpf_object__add_programs(obj, data, name, idx); if (err) return err; - } else if (strcmp(name, DATA_SEC) == 0) { + } else if (strcmp(name, DATA_SEC) == 0 || + str_has_pfx(name, DATA_SEC ".")) { sec_desc->sec_type = SEC_DATA; sec_desc->shdr = sh; sec_desc->data = data; - } else if (strcmp(name, RODATA_SEC) == 0) { + } else if (strcmp(name, RODATA_SEC) == 0 || + str_has_pfx(name, RODATA_SEC ".")) { sec_desc->sec_type = SEC_RODATA; sec_desc->shdr = sh; sec_desc->data = data; @@ -4036,8 +4082,7 @@ static int bpf_map_find_btf_info(struct bpf_object *obj, struct bpf_map *map) * LLVM annotates global data differently in BTF, that is, * only as '.data', '.bss' or '.rodata'. */ - ret = btf__find_by_name(obj->btf, - libbpf_type_to_btf_name[map->libbpf_type]); + ret = btf__find_by_name(obj->btf, map->real_name); } if (ret < 0) return ret; @@ -7737,6 +7782,7 @@ static void bpf_map__destroy(struct bpf_map *map) }
zfree(&map->name); + zfree(&map->real_name); zfree(&map->pin_path);
if (map->fd >= 0) @@ -8741,9 +8787,30 @@ const struct bpf_map_def *bpf_map__def(const struct bpf_map *map) return map ? &map->def : libbpf_err_ptr(-EINVAL); }
+static bool map_uses_real_name(const struct bpf_map *map) +{ + /* Since libbpf started to support custom .data.* and .rodata.* maps, + * their user-visible name differs from kernel-visible name. Users see + * such map's corresponding ELF section name as a map name. + * This check distinguishes .data/.rodata from .data.* and .rodata.* + * maps to know which name has to be returned to the user. + */ + if (map->libbpf_type == LIBBPF_MAP_DATA && strcmp(map->real_name, DATA_SEC) != 0) + return true; + if (map->libbpf_type == LIBBPF_MAP_RODATA && strcmp(map->real_name, RODATA_SEC) != 0) + return true; + return false; +} + const char *bpf_map__name(const struct bpf_map *map) { - return map ? map->name : NULL; + if (!map) + return NULL; + + if (map_uses_real_name(map)) + return map->real_name; + + return map->name; }
enum bpf_map_type bpf_map__type(const struct bpf_map *map) @@ -8962,7 +9029,12 @@ bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name) struct bpf_map *pos;
bpf_object__for_each_map(pos, obj) { - if (pos->name && !strcmp(pos->name, name)) + if (map_uses_real_name(pos)) { + if (strcmp(pos->real_name, name) == 0) + return pos; + continue; + } + if (strcmp(pos->name, name) == 0) return pos; } return errno = ENOENT, NULL;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 30c5bd96476ced0d3e08372be5186ff7f421d10c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Enhance existing selftests to demonstrate the use of custom .data/.rodata sections.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-9-andrii@kernel.org (cherry picked from commit 30c5bd96476ced0d3e08372be5186ff7f421d10c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/skeleton.c | 29 +++++++++++++++++++ .../selftests/bpf/progs/test_skeleton.c | 18 ++++++++++++ 2 files changed, 47 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c index fe1e204a65c6..180afd632f4c 100644 --- a/tools/testing/selftests/bpf/prog_tests/skeleton.c +++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c @@ -16,10 +16,13 @@ void test_skeleton(void) struct test_skeleton* skel; struct test_skeleton__bss *bss; struct test_skeleton__data *data; + struct test_skeleton__data_dyn *data_dyn; struct test_skeleton__rodata *rodata; + struct test_skeleton__rodata_dyn *rodata_dyn; struct test_skeleton__kconfig *kcfg; const void *elf_bytes; size_t elf_bytes_sz = 0; + int i;
skel = test_skeleton__open(); if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) @@ -30,7 +33,12 @@ void test_skeleton(void)
bss = skel->bss; data = skel->data; + data_dyn = skel->data_dyn; rodata = skel->rodata; + rodata_dyn = skel->rodata_dyn; + + ASSERT_STREQ(bpf_map__name(skel->maps.rodata_dyn), ".rodata.dyn", "rodata_dyn_name"); + ASSERT_STREQ(bpf_map__name(skel->maps.data_dyn), ".data.dyn", "data_dyn_name");
/* validate values are pre-initialized correctly */ CHECK(data->in1 != -1, "in1", "got %d != exp %d\n", data->in1, -1); @@ -46,6 +54,12 @@ void test_skeleton(void) CHECK(rodata->in.in6 != 0, "in6", "got %d != exp %d\n", rodata->in.in6, 0); CHECK(bss->out6 != 0, "out6", "got %d != exp %d\n", bss->out6, 0);
+ ASSERT_EQ(rodata_dyn->in_dynarr_sz, 0, "in_dynarr_sz"); + for (i = 0; i < 4; i++) + ASSERT_EQ(rodata_dyn->in_dynarr[i], -(i + 1), "in_dynarr"); + for (i = 0; i < 4; i++) + ASSERT_EQ(data_dyn->out_dynarr[i], i + 1, "out_dynarr"); + /* validate we can pre-setup global variables, even in .bss */ data->in1 = 10; data->in2 = 11; @@ -53,6 +67,10 @@ void test_skeleton(void) bss->in4 = 13; rodata->in.in6 = 14;
+ rodata_dyn->in_dynarr_sz = 4; + for (i = 0; i < 4; i++) + rodata_dyn->in_dynarr[i] = i + 10; + err = test_skeleton__load(skel); if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) goto cleanup; @@ -64,6 +82,10 @@ void test_skeleton(void) CHECK(bss->in4 != 13, "in4", "got %lld != exp %lld\n", bss->in4, 13LL); CHECK(rodata->in.in6 != 14, "in6", "got %d != exp %d\n", rodata->in.in6, 14);
+ ASSERT_EQ(rodata_dyn->in_dynarr_sz, 4, "in_dynarr_sz"); + for (i = 0; i < 4; i++) + ASSERT_EQ(rodata_dyn->in_dynarr[i], i + 10, "in_dynarr"); + /* now set new values and attach to get them into outX variables */ data->in1 = 1; data->in2 = 2; @@ -73,6 +95,8 @@ void test_skeleton(void) bss->in5.b = 6; kcfg = skel->kconfig;
+ skel->data_read_mostly->read_mostly_var = 123; + err = test_skeleton__attach(skel); if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) goto cleanup; @@ -93,6 +117,11 @@ void test_skeleton(void) CHECK(bss->kern_ver != kcfg->LINUX_KERNEL_VERSION, "ext2", "got %d != exp %d\n", bss->kern_ver, kcfg->LINUX_KERNEL_VERSION);
+ for (i = 0; i < 4; i++) + ASSERT_EQ(data_dyn->out_dynarr[i], i + 10, "out_dynarr"); + + ASSERT_EQ(skel->bss->out_mostly_var, 123, "out_mostly_var"); + elf_bytes = test_skeleton__elf_bytes(&elf_bytes_sz); ASSERT_OK_PTR(elf_bytes, "elf_bytes"); ASSERT_GE(elf_bytes_sz, 0, "elf_bytes_sz"); diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c index 441fa1c552c8..1b1187d2967b 100644 --- a/tools/testing/selftests/bpf/progs/test_skeleton.c +++ b/tools/testing/selftests/bpf/progs/test_skeleton.c @@ -5,6 +5,8 @@ #include <linux/bpf.h> #include <bpf/bpf_helpers.h>
+#define __read_mostly SEC(".data.read_mostly") + struct s { int a; long long b; @@ -40,9 +42,20 @@ int kern_ver = 0;
struct s out5 = {};
+ +const volatile int in_dynarr_sz SEC(".rodata.dyn"); +const volatile int in_dynarr[4] SEC(".rodata.dyn") = { -1, -2, -3, -4 }; + +int out_dynarr[4] SEC(".data.dyn") = { 1, 2, 3, 4 }; + +int read_mostly_var __read_mostly; +int out_mostly_var; + SEC("raw_tp/sys_enter") int handler(const void *ctx) { + int i; + out1 = in1; out2 = in2; out3 = in3; @@ -53,6 +66,11 @@ int handler(const void *ctx) bpf_syscall = CONFIG_BPF_SYSCALL; kern_ver = LINUX_KERNEL_VERSION;
+ for (i = 0; i < in_dynarr_sz; i++) + out_dynarr[i] = in_dynarr[i]; + + out_mostly_var = read_mostly_var; + return 0; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 26071635ac5ecd8276bf3bdfc3ea1128c93ac722 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Map name that's assigned to internal maps (.rodata, .data, .bss, etc) consist of a small prefix of bpf_object's name and ELF section name as a suffix. This makes it hard for users to "guess" the name to use for looking up by name with bpf_object__find_map_by_name() API.
One proposal was to drop object name prefix from the map name and just use ".rodata", ".data", etc, names. One downside called out was that when multiple BPF applications are active on the host, it will be hard to distinguish between multiple instances of .rodata and know which BPF object (app) they belong to. Having few first characters, while quite limiting, still can give a bit of a clue, in general.
Note, though, that btf_value_type_id for such global data maps (ARRAY) points to DATASEC type, which encodes full ELF name, so tools like bpftool can take advantage of this fact to "recover" full original name of the map. This is also the reason why for custom .data.* and .rodata.* maps libbpf uses only their ELF names and doesn't prepend object name at all.
Another downside of such approach is that it is not backwards compatible and, among direct use of bpf_object__find_map_by_name() API, will break any BPF skeleton generated using bpftool that was compiled with older libbpf version.
Instead of causing all this pain, libbpf will still generate map name using a combination of object name and ELF section name, but it will allow looking such maps up by their natural names, which correspond to their respective ELF section names. This means non-truncated ELF section names longer than 15 characters are going to be expected and supported.
With such set up, we get the best of both worlds: leave small bits of a clue about BPF application that instantiated such maps, as well as making it easy for user apps to lookup such maps at runtime. In this sense it closes corresponding libbpf 1.0 issue ([0]).
BPF skeletons will continue using full names for lookups.
[0] Closes: https://github.com/libbpf/libbpf/issues/275
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-10-andrii@kernel.org (cherry picked from commit 26071635ac5ecd8276bf3bdfc3ea1128c93ac722) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 9cc25e528d30..6b7948b9b226 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -9029,6 +9029,16 @@ bpf_object__find_map_by_name(const struct bpf_object *obj, const char *name) struct bpf_map *pos;
bpf_object__for_each_map(pos, obj) { + /* if it's a special internal map name (which always starts + * with dot) then check if that special name matches the + * real map name (ELF section name) + */ + if (name[0] == '.') { + if (pos->real_name && strcmp(pos->real_name, name) == 0) + return pos; + continue; + } + /* otherwise map name has to be an exact match */ if (map_uses_real_name(pos)) { if (strcmp(pos->real_name, name) == 0) return pos;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 4f2511e1990985103929ab799fb3ebca81969b77 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Utilize libbpf's feature of allowing to lookup internal maps by their ELF section names. No need to guess or calculate the exact truncated prefix taken from the object name.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211021014404.2635234-11-andrii@kernel.org (cherry picked from commit 4f2511e1990985103929ab799fb3ebca81969b77) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../testing/selftests/bpf/prog_tests/core_autosize.c | 2 +- tools/testing/selftests/bpf/prog_tests/core_reloc.c | 2 +- tools/testing/selftests/bpf/prog_tests/global_data.c | 11 +++++++++-- .../selftests/bpf/prog_tests/global_data_init.c | 2 +- tools/testing/selftests/bpf/prog_tests/kfree_skb.c | 2 +- tools/testing/selftests/bpf/prog_tests/rdonly_maps.c | 2 +- 6 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/core_autosize.c b/tools/testing/selftests/bpf/prog_tests/core_autosize.c index 981c251453d9..ce7ac5c3fbbb 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_autosize.c +++ b/tools/testing/selftests/bpf/prog_tests/core_autosize.c @@ -164,7 +164,7 @@ void test_core_autosize(void)
usleep(1);
- bss_map = bpf_object__find_map_by_name(skel->obj, "test_cor.bss"); + bss_map = bpf_object__find_map_by_name(skel->obj, ".bss"); if (!ASSERT_OK_PTR(bss_map, "bss_map_find")) goto cleanup;
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index cc003c938efd..b8e46da0e1a0 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -877,7 +877,7 @@ void test_core_reloc(void) goto cleanup; }
- data_map = bpf_object__find_map_by_name(obj, "test_cor.bss"); + data_map = bpf_object__find_map_by_name(obj, ".bss"); if (CHECK(!data_map, "find_data_map", "data map not found\n")) goto cleanup;
diff --git a/tools/testing/selftests/bpf/prog_tests/global_data.c b/tools/testing/selftests/bpf/prog_tests/global_data.c index 9efa7e50eab2..afd8639f9a94 100644 --- a/tools/testing/selftests/bpf/prog_tests/global_data.c +++ b/tools/testing/selftests/bpf/prog_tests/global_data.c @@ -103,11 +103,18 @@ static void test_global_data_struct(struct bpf_object *obj, __u32 duration) static void test_global_data_rdonly(struct bpf_object *obj, __u32 duration) { int err = -ENOMEM, map_fd, zero = 0; - struct bpf_map *map; + struct bpf_map *map, *map2; __u8 *buff;
map = bpf_object__find_map_by_name(obj, "test_glo.rodata"); - if (CHECK_FAIL(!map || !bpf_map__is_internal(map))) + if (!ASSERT_OK_PTR(map, "map")) + return; + if (!ASSERT_TRUE(bpf_map__is_internal(map), "is_internal")) + return; + + /* ensure we can lookup internal maps by their ELF names */ + map2 = bpf_object__find_map_by_name(obj, ".rodata"); + if (!ASSERT_EQ(map, map2, "same_maps")) return;
map_fd = bpf_map__fd(map); diff --git a/tools/testing/selftests/bpf/prog_tests/global_data_init.c b/tools/testing/selftests/bpf/prog_tests/global_data_init.c index ee46b11f1f9a..1db86eab101b 100644 --- a/tools/testing/selftests/bpf/prog_tests/global_data_init.c +++ b/tools/testing/selftests/bpf/prog_tests/global_data_init.c @@ -16,7 +16,7 @@ void test_global_data_init(void) if (CHECK_FAIL(err)) return;
- map = bpf_object__find_map_by_name(obj, "test_glo.rodata"); + map = bpf_object__find_map_by_name(obj, ".rodata"); if (CHECK_FAIL(!map || !bpf_map__is_internal(map))) goto out;
diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c index ddfb6bf97152..a9e00960127c 100644 --- a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c +++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c @@ -92,7 +92,7 @@ void test_kfree_skb(void) if (CHECK(!fexit, "find_prog", "prog eth_type_trans not found\n")) goto close_prog;
- global_data = bpf_object__find_map_by_name(obj2, "kfree_sk.bss"); + global_data = bpf_object__find_map_by_name(obj2, ".bss"); if (CHECK(!global_data, "find global data", "not found\n")) goto close_prog;
diff --git a/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c b/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c index 5f9eaa3ab584..fd5d2ddfb062 100644 --- a/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c +++ b/tools/testing/selftests/bpf/prog_tests/rdonly_maps.c @@ -37,7 +37,7 @@ void test_rdonly_maps(void) if (CHECK(err, "obj_load", "err %d errno %d\n", err, errno)) goto cleanup;
- bss_map = bpf_object__find_map_by_name(obj, "test_rdo.bss"); + bss_map = bpf_object__find_map_by_name(obj, ".bss"); if (CHECK(!bss_map, "find_bss_map", "failed\n")) goto cleanup;
From: Mauricio Vásquez mauricio@kinvolk.io
mainline inclusion from mainline-5.16-rc1 commit 1000298c76830bc291358e98e8fa5baa3baa9b3a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Free btf_dedup if btf_ensure_modifiable() returns error.
Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Mauricio Vásquez mauricio@kinvolk.io Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211022202035.48868-1-mauricio@kinvolk.io (cherry picked from commit 1000298c76830bc291358e98e8fa5baa3baa9b3a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 7544ea1b89cf..d115133b10ad 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -2985,8 +2985,10 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext, return libbpf_err(-EINVAL); }
- if (btf_ensure_modifiable(btf)) - return libbpf_err(-ENOMEM); + if (btf_ensure_modifiable(btf)) { + err = -ENOMEM; + goto done; + }
err = btf_dedup_prep(d); if (err) {
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit bd16dee66ae4de3f1726c69ac901d2b7a53b0c86 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The llvm patches ([1], [2]) added support to attach btf_decl_tag attributes to typedef declarations. This patch added support in kernel.
[1] https://reviews.llvm.org/D110127 [2] https://reviews.llvm.org/D112259
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211021195628.4018847-1-yhs@fb.com (cherry picked from commit bd16dee66ae4de3f1726c69ac901d2b7a53b0c86) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 5906870fe1d2..b7c334bccb88 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -467,7 +467,7 @@ static bool btf_type_is_tag(const struct btf_type *t) static bool btf_type_is_tag_target(const struct btf_type *t) { return btf_type_is_func(t) || btf_type_is_struct(t) || - btf_type_is_var(t); + btf_type_is_var(t) || btf_type_is_typedef(t); }
u32 btf_nr_types(const struct btf *btf) @@ -3759,7 +3759,7 @@ static int btf_tag_resolve(struct btf_verifier_env *env,
component_idx = btf_type_tag(t)->component_idx; if (component_idx != -1) { - if (btf_type_is_var(next_type)) { + if (btf_type_is_var(next_type) || btf_type_is_typedef(next_type)) { btf_verifier_log_type(env, v->t, "Invalid component_idx"); return -EINVAL; }
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit 5a8671349dd1cfe6255c524d37d6265df7483f36 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BTF_KIND_DECL_TAG typedef support in btf.rst.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211021195649.4020514-1-yhs@fb.com (cherry picked from commit 5a8671349dd1cfe6255c524d37d6265df7483f36) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: Documentation/bpf/btf.rst
Signed-off-by: Wang Yufen wangyufen@huawei.com --- Documentation/bpf/btf.rst | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst index 44dc789de2b4..f82137aa4bd9 100644 --- a/Documentation/bpf/btf.rst +++ b/Documentation/bpf/btf.rst @@ -452,6 +452,32 @@ map definition. * ``offset``: the in-section offset of the variable * ``size``: the size of the variable in bytes
+2.2.17 BTF_KIND_DECL_TAG +~~~~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: offset to a non-empty string + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_DECL_TAG + * ``info.vlen``: 0 + * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef`` + +``btf_type`` is followed by ``struct btf_decl_tag``.:: + + struct btf_decl_tag { + __u32 component_idx; + }; + +The ``name_off`` encodes btf_decl_tag attribute string. +The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``. +For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``. +For the other three types, if the btf_decl_tag attribute is +applied to the ``struct``, ``union`` or ``func`` itself, +``btf_decl_tag.component_idx`` must be ``-1``. Otherwise, +the attribute is applied to a ``struct``/``union`` member or +a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a +valid index (starting from 0) pointing to a member or an argument. + 3. BTF Kernel API *****************
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit de5d0dcef602de39070c31c7e56c58249c56ba37 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Fix instruction index validity check which has off-by-one error.
Fixes: 3ee4f5335511 ("libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211025224531.1088894-2-andrii@kernel.org (cherry picked from commit de5d0dcef602de39070c31c7e56c58249c56ba37) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 6b7948b9b226..03a4f01f02be 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5338,7 +5338,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, * relocated, so it's enough to just subtract in-section offset */ insn_idx = insn_idx - prog->sec_insn_off; - if (insn_idx > prog->insns_cnt) + if (insn_idx >= prog->insns_cnt) return -EINVAL; insn = &prog->insns[insn_idx];
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit 65a7fa2e4e5381d205d3b0098da0fc8471fbbfb6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add APIs providing read-only access to bpf_program BPF instructions ([0]). This is useful for diagnostics purposes, but it also allows a cleaner support for cloning BPF programs after libbpf did all the FD resolution and CO-RE relocations, subprog instructions appending, etc. Currently, cloning BPF program is possible only through hijacking a half-broken bpf_program__set_prep() API, which doesn't really work well for anything but most primitive programs. For instance, set_prep() API doesn't allow adjusting BPF program load parameters which are necessary for loading fentry/fexit BPF programs (the case where BPF program cloning is a necessity if doing some sort of mass-attachment functionality).
Given bpf_program__set_prep() API is set to be deprecated, having a cleaner alternative is a must. libbpf internally already keeps track of linear array of struct bpf_insn, so it's not hard to expose it. The only gotcha is that libbpf previously freed instructions array during bpf_object load time, which would make this API much less useful overall, because in between bpf_object__open() and bpf_object__load() a lot of changes to instructions are done by libbpf.
So this patch makes libbpf hold onto prog->insns array even after BPF program loading. I think this is a small price for added functionality and improved introspection of BPF program code.
See retsnoop PR ([1]) for how it can be used in practice and code savings compared to relying on bpf_program__set_prep().
[0] Closes: https://github.com/libbpf/libbpf/issues/298 [1] https://github.com/anakryiko/retsnoop/pull/1
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211025224531.1088894-3-andrii@kernel.org (cherry picked from commit 65a7fa2e4e5381d205d3b0098da0fc8471fbbfb6) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 12 ++++++++++-- tools/lib/bpf/libbpf.h | 36 ++++++++++++++++++++++++++++++++++-- tools/lib/bpf/libbpf.map | 2 ++ 3 files changed, 46 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 03a4f01f02be..ffc7e072055c 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6576,8 +6576,6 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) out: if (err) pr_warn("failed to load program '%s'\n", prog->name); - zfree(&prog->insns); - prog->insns_cnt = 0; return libbpf_err(err); }
@@ -8040,6 +8038,16 @@ size_t bpf_program__size(const struct bpf_program *prog) return prog->insns_cnt * BPF_INSN_SZ; }
+const struct bpf_insn *bpf_program__insns(const struct bpf_program *prog) +{ + return prog->insns; +} + +size_t bpf_program__insn_cnt(const struct bpf_program *prog) +{ + return prog->insns_cnt; +} + int bpf_program__set_prep(struct bpf_program *prog, int nr_instances, bpf_program_prep_t prep) { diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 3c43b2d5d065..6ecc96a76091 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -226,6 +226,40 @@ LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload /* returns program size in bytes */ LIBBPF_API size_t bpf_program__size(const struct bpf_program *prog);
+struct bpf_insn; + +/** + * @brief **bpf_program__insns()** gives read-only access to BPF program's + * underlying BPF instructions. + * @param prog BPF program for which to return instructions + * @return a pointer to an array of BPF instructions that belong to the + * specified BPF program + * + * Returned pointer is always valid and not NULL. Number of `struct bpf_insn` + * pointed to can be fetched using **bpf_program__insn_cnt()** API. + * + * Keep in mind, libbpf can modify and append/delete BPF program's + * instructions as it processes BPF object file and prepares everything for + * uploading into the kernel. So depending on the point in BPF object + * lifetime, **bpf_program__insns()** can return different sets of + * instructions. As an example, during BPF object load phase BPF program + * instructions will be CO-RE-relocated, BPF subprograms instructions will be + * appended, ldimm64 instructions will have FDs embedded, etc. So instructions + * returned before **bpf_object__load()** and after it might be quite + * different. + */ +LIBBPF_API const struct bpf_insn *bpf_program__insns(const struct bpf_program *prog); +/** + * @brief **bpf_program__insn_cnt()** returns number of `struct bpf_insn`'s + * that form specified BPF program. + * @param prog BPF program for which to return number of BPF instructions + * + * See **bpf_program__insns()** documentation for notes on how libbpf can + * change instructions and their count during different phases of + * **bpf_object** lifetime. + */ +LIBBPF_API size_t bpf_program__insn_cnt(const struct bpf_program *prog); + LIBBPF_API int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_version); LIBBPF_API int bpf_program__fd(const struct bpf_program *prog); @@ -301,8 +335,6 @@ LIBBPF_API struct bpf_link * bpf_program__attach_iter(const struct bpf_program *prog, const struct bpf_iter_attach_opts *opts);
-struct bpf_insn; - /* * Libbpf allows callers to adjust BPF programs before being loaded * into kernel. One program in an object file can be transformed into diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 087a80d1613f..4b78550810e5 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -388,6 +388,8 @@ LIBBPF_0.6.0 { bpf_object__next_program; bpf_object__prev_map; bpf_object__prev_program; + bpf_program__insn_cnt; + bpf_program__insns; btf__add_btf; btf__add_tag; } LIBBPF_0.5.0;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit e21d585cb3db65a207cd338c74b9886090ef1ceb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Schedule deprecation of a set of APIs that are related to multi-instance bpf_programs: - bpf_program__set_prep() ([0]); - bpf_program__{set,unset}_instance() ([1]); - bpf_program__nth_fd().
These APIs are obscure, very niche, and don't seem to be used much in practice. bpf_program__set_prep() is pretty useless for anything but the simplest BPF programs, as it doesn't allow to adjust BPF program load attributes, among other things. In short, it already bitrotted and will bitrot some more if not removed.
With bpf_program__insns() API, which gives access to post-processed BPF program instructions of any given entry-point BPF program, it's now possible to do whatever necessary adjustments were possible with set_prep() API before, but also more. Given any such use case is automatically an advanced use case, requiring users to stick to low-level bpf_prog_load() APIs and managing their own prog FDs is reasonable.
[0] Closes: https://github.com/libbpf/libbpf/issues/299 [1] Closes: https://github.com/libbpf/libbpf/issues/300
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211025224531.1088894-4-andrii@kernel.org (cherry picked from commit e21d585cb3db65a207cd338c74b9886090ef1ceb) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 22 +++++++++++++--------- tools/lib/bpf/libbpf.h | 6 +++++- 2 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ffc7e072055c..13f80ec3c779 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7260,8 +7260,7 @@ static int check_path(const char *path) return err; }
-int bpf_program__pin_instance(struct bpf_program *prog, const char *path, - int instance) +static int bpf_program_pin_instance(struct bpf_program *prog, const char *path, int instance) { char *cp, errmsg[STRERR_BUFSIZE]; int err; @@ -7296,8 +7295,7 @@ int bpf_program__pin_instance(struct bpf_program *prog, const char *path, return 0; }
-int bpf_program__unpin_instance(struct bpf_program *prog, const char *path, - int instance) +static int bpf_program_unpin_instance(struct bpf_program *prog, const char *path, int instance) { int err;
@@ -7325,6 +7323,12 @@ int bpf_program__unpin_instance(struct bpf_program *prog, const char *path, return 0; }
+__attribute__((alias("bpf_program_pin_instance"))) +int bpf_object__pin_instance(struct bpf_program *prog, const char *path, int instance); + +__attribute__((alias("bpf_program_unpin_instance"))) +int bpf_program__unpin_instance(struct bpf_program *prog, const char *path, int instance); + int bpf_program__pin(struct bpf_program *prog, const char *path) { int i, err; @@ -7349,7 +7353,7 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
if (prog->instances.nr == 1) { /* don't create subdirs when pinning single instance */ - return bpf_program__pin_instance(prog, path, 0); + return bpf_program_pin_instance(prog, path, 0); }
for (i = 0; i < prog->instances.nr; i++) { @@ -7365,7 +7369,7 @@ int bpf_program__pin(struct bpf_program *prog, const char *path) goto err_unpin; }
- err = bpf_program__pin_instance(prog, buf, i); + err = bpf_program_pin_instance(prog, buf, i); if (err) goto err_unpin; } @@ -7383,7 +7387,7 @@ int bpf_program__pin(struct bpf_program *prog, const char *path) else if (len >= PATH_MAX) continue;
- bpf_program__unpin_instance(prog, buf, i); + bpf_program_unpin_instance(prog, buf, i); }
rmdir(path); @@ -7411,7 +7415,7 @@ int bpf_program__unpin(struct bpf_program *prog, const char *path)
if (prog->instances.nr == 1) { /* don't create subdirs when pinning single instance */ - return bpf_program__unpin_instance(prog, path, 0); + return bpf_program_unpin_instance(prog, path, 0); }
for (i = 0; i < prog->instances.nr; i++) { @@ -7424,7 +7428,7 @@ int bpf_program__unpin(struct bpf_program *prog, const char *path) else if (len >= PATH_MAX) return libbpf_err(-ENAMETOOLONG);
- err = bpf_program__unpin_instance(prog, buf, i); + err = bpf_program_unpin_instance(prog, buf, i); if (err) return err; } diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 6ecc96a76091..02b25a85f3e8 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -263,9 +263,11 @@ LIBBPF_API size_t bpf_program__insn_cnt(const struct bpf_program *prog); LIBBPF_API int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_version); LIBBPF_API int bpf_program__fd(const struct bpf_program *prog); +LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") LIBBPF_API int bpf_program__pin_instance(struct bpf_program *prog, const char *path, int instance); +LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") LIBBPF_API int bpf_program__unpin_instance(struct bpf_program *prog, const char *path, int instance); @@ -363,7 +365,7 @@ bpf_program__attach_iter(const struct bpf_program *prog, * one instance. In this case bpf_program__fd(prog) is equal to * bpf_program__nth_fd(prog, 0). */ - +LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insns() for getting bpf_program instructions") struct bpf_prog_prep_result { /* * If not NULL, load new instruction array. @@ -392,9 +394,11 @@ typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n, struct bpf_insn *insns, int insns_cnt, struct bpf_prog_prep_result *res);
+LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insns() for getting bpf_program instructions") LIBBPF_API int bpf_program__set_prep(struct bpf_program *prog, int nr_instance, bpf_program_prep_t prep);
+LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") LIBBPF_API int bpf_program__nth_fd(const struct bpf_program *prog, int n);
/*
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.16-rc1 commit c4813e969ac471af730902377a2656b6b1b92c4d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The name of the API doesn't convey clearly that this size is in number of bytes (there needed to be a separate comment to make this clear in libbpf.h). Further, measuring the size of BPF program in bytes is not exactly the best fit, because BPF programs always consist of 8-byte instructions. As such, bpf_program__insn_cnt() is a better alternative in pretty much any imaginable case.
So schedule bpf_program__size() deprecation starting from v0.7 and it will be removed in libbpf 1.0.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211025224531.1088894-5-andrii@kernel.org (cherry picked from commit c4813e969ac471af730902377a2656b6b1b92c4d) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 02b25a85f3e8..74ea79a75364 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -224,6 +224,7 @@ LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog); LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload);
/* returns program size in bytes */ +LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead") LIBBPF_API size_t bpf_program__size(const struct bpf_program *prog);
struct bpf_insn;
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.16-rc1 commit d6aef08a872b9e23eecc92d0e92393473b13c497 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This helper allows us to get the address of a kernel symbol from inside a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can relocate typeless ksym vars.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com (cherry picked from commit d6aef08a872b9e23eecc92d0e92393473b13c497) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf.h include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 1 + include/uapi/linux/bpf.h | 16 ++++++++++++++++ kernel/bpf/syscall.c | 27 +++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 16 ++++++++++++++++ 4 files changed, 60 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 57a92e493ea3..930d0032285b 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2054,6 +2054,7 @@ extern const struct bpf_func_proto bpf_per_cpu_ptr_proto; extern const struct bpf_func_proto bpf_this_cpu_ptr_proto; extern const struct bpf_func_proto bpf_for_each_map_elem_proto; extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; +extern const struct bpf_func_proto bpf_kallsyms_lookup_name_proto;
const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d9097d60fe3e..c3da9fdb6d88 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3899,6 +3899,21 @@ union bpf_attr { * Execute close syscall for given FD. * Return * A syscall result. + * + * long bpf_kallsyms_lookup_name(const char *name, int name_sz, int flags, u64 *res) + * Description + * Get the address of a kernel symbol, returned in *res*. *res* is + * set to 0 if the symbol is not found. + * Return + * On success, zero. On error, a negative value. + * + * **-EINVAL** if *flags* is not zero. + * + * **-EINVAL** if string *name* is not the same size as *name_sz*. + * + * **-ENOENT** if symbol is not found. + * + * **-EPERM** if caller does not have permission to obtain kernel address. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4067,6 +4082,7 @@ union bpf_attr { FN(sys_bpf), \ FN(btf_find_by_name_kind), \ FN(sys_close), \ + FN(kallsyms_lookup_name), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 9a5c0ac72e68..9ee73a94b5a9 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -4706,6 +4706,31 @@ const struct bpf_func_proto bpf_sys_close_proto = { .arg1_type = ARG_ANYTHING, };
+BPF_CALL_4(bpf_kallsyms_lookup_name, const char *, name, int, name_sz, int, flags, u64 *, res) +{ + if (flags) + return -EINVAL; + + if (name_sz <= 1 || name[name_sz - 1]) + return -EINVAL; + + if (!bpf_dump_raw_ok(current_cred())) + return -EPERM; + + *res = kallsyms_lookup_name(name); + return *res ? 0 : -ENOENT; +} + +const struct bpf_func_proto bpf_kallsyms_lookup_name_proto = { + .func = bpf_kallsyms_lookup_name, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_CONST_SIZE, + .arg3_type = ARG_ANYTHING, + .arg4_type = ARG_PTR_TO_LONG, +}; + static const struct bpf_func_proto * syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { @@ -4716,6 +4741,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_btf_find_by_name_kind_proto; case BPF_FUNC_sys_close: return &bpf_sys_close_proto; + case BPF_FUNC_kallsyms_lookup_name: + return &bpf_kallsyms_lookup_name_proto; default: return tracing_prog_func_proto(func_id, prog); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 3356ef1e2272..639206344c07 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4609,6 +4609,21 @@ union bpf_attr { * Execute close syscall for given FD. * Return * A syscall result. + * + * long bpf_kallsyms_lookup_name(const char *name, int name_sz, int flags, u64 *res) + * Description + * Get the address of a kernel symbol, returned in *res*. *res* is + * set to 0 if the symbol is not found. + * Return + * On success, zero. On error, a negative value. + * + * **-EINVAL** if *flags* is not zero. + * + * **-EINVAL** if string *name* is not the same size as *name_sz*. + * + * **-ENOENT** if symbol is not found. + * + * **-EPERM** if caller does not have permission to obtain kernel address. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4777,6 +4792,7 @@ union bpf_attr { FN(sys_bpf), \ FN(btf_find_by_name_kind), \ FN(sys_close), \ + FN(kallsyms_lookup_name), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.16-rc1 commit c24941cd3766b6de682dbe6809bd6af12271ab5b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This uses the bpf_kallsyms_lookup_name helper added in previous patches to relocate typeless ksyms. The return value ENOENT can be ignored, and the value written to 'res' can be directly stored to the insn, as it is overwritten to 0 on lookup failure. For repeating symbols, we can simply copy the previously populated bpf_insn.
Also, we need to take care to not close fds for typeless ksym_desc, so reuse the 'off' member's space to add a marker for typeless ksym and use that to skip them in cleanup_relos.
We add a emit_ksym_relo_log helper that avoids duplicating common logging instructions between typeless and weak ksym (for future commit).
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com (cherry picked from commit c24941cd3766b6de682dbe6809bd6af12271ab5b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_gen_internal.h | 12 +++- tools/lib/bpf/gen_loader.c | 97 ++++++++++++++++++++++++++++---- tools/lib/bpf/libbpf.c | 13 ++--- 3 files changed, 99 insertions(+), 23 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h index 70eccbffefb1..9f328d044636 100644 --- a/tools/lib/bpf/bpf_gen_internal.h +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -8,13 +8,19 @@ struct ksym_relo_desc { int kind; int insn_idx; bool is_weak; + bool is_typeless; };
struct ksym_desc { const char *name; int ref; int kind; - int off; + union { + /* used for kfunc */ + int off; + /* used for typeless ksym */ + bool typeless; + }; int insn; };
@@ -49,7 +55,7 @@ void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_a void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size); void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx); void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type); -void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, int kind, - int insn_idx); +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, + bool is_typeless, int kind, int insn_idx);
#endif diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 937bfc7db41e..11172a868180 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -559,7 +559,7 @@ static void emit_find_attach_target(struct bpf_gen *gen) }
void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, - int kind, int insn_idx) + bool is_typeless, int kind, int insn_idx) { struct ksym_relo_desc *relo;
@@ -572,6 +572,7 @@ void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, relo += gen->relo_cnt; relo->name = name; relo->is_weak = is_weak; + relo->is_typeless = is_typeless; relo->kind = kind; relo->insn_idx = insn_idx; gen->relo_cnt++; @@ -621,6 +622,29 @@ static void emit_bpf_find_by_name_kind(struct bpf_gen *gen, struct ksym_relo_des debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind); }
+/* Overwrites BPF_REG_{0, 1, 2, 3, 4, 7} + * Returns result in BPF_REG_7 + * Returns u64 symbol addr in BPF_REG_9 + */ +static void emit_bpf_kallsyms_lookup_name(struct bpf_gen *gen, struct ksym_relo_desc *relo) +{ + int name_off, len = strlen(relo->name) + 1, res_off; + + name_off = add_data(gen, relo->name, len); + res_off = add_data(gen, NULL, 8); /* res is u64 */ + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, name_off)); + emit(gen, BPF_MOV64_IMM(BPF_REG_2, len)); + emit(gen, BPF_MOV64_IMM(BPF_REG_3, 0)); + emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_4, BPF_PSEUDO_MAP_IDX_VALUE, + 0, 0, 0, res_off)); + emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_4)); + emit(gen, BPF_EMIT_CALL(BPF_FUNC_kallsyms_lookup_name)); + emit(gen, BPF_LDX_MEM(BPF_DW, BPF_REG_9, BPF_REG_7, 0)); + emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0)); + debug_ret(gen, "kallsyms_lookup_name(%s,%d)", relo->name, relo->kind); +} + /* Expects: * BPF_REG_8 - pointer to instruction * @@ -700,6 +724,58 @@ static void emit_relo_kfunc_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo relo->name, kdesc->ref); }
+static void emit_ksym_relo_log(struct bpf_gen *gen, struct ksym_relo_desc *relo, + int ref) +{ + if (!gen->log_level) + return; + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_8, + offsetof(struct bpf_insn, imm))); + emit(gen, BPF_LDX_MEM(BPF_H, BPF_REG_9, BPF_REG_8, sizeof(struct bpf_insn) + + offsetof(struct bpf_insn, imm))); + debug_regs(gen, BPF_REG_7, BPF_REG_9, " var t=%d w=%d (%s:count=%d): imm[0]: %%d, imm[1]: %%d", + relo->is_typeless, relo->is_weak, relo->name, ref); + emit(gen, BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_8, offsetofend(struct bpf_insn, code))); + debug_regs(gen, BPF_REG_9, -1, " var t=%d w=%d (%s:count=%d): insn.reg", + relo->is_typeless, relo->is_weak, relo->name, ref); +} + +/* Expects: + * BPF_REG_8 - pointer to instruction + */ +static void emit_relo_ksym_typeless(struct bpf_gen *gen, + struct ksym_relo_desc *relo, int insn) +{ + struct ksym_desc *kdesc; + + kdesc = get_ksym_desc(gen, relo); + if (!kdesc) + return; + /* try to copy from existing ldimm64 insn */ + if (kdesc->ref > 1) { + move_blob2blob(gen, insn + offsetof(struct bpf_insn, imm), 4, + kdesc->insn + offsetof(struct bpf_insn, imm)); + move_blob2blob(gen, insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 4, + kdesc->insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm)); + goto log; + } + /* remember insn offset, so we can copy ksym addr later */ + kdesc->insn = insn; + /* skip typeless ksym_desc in fd closing loop in cleanup_relos */ + kdesc->typeless = true; + emit_bpf_kallsyms_lookup_name(gen, relo); + emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_7, -ENOENT, 1)); + emit_check_err(gen); + /* store lower half of addr into insn[insn_idx].imm */ + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_9, offsetof(struct bpf_insn, imm))); + /* store upper half of addr into insn[insn_idx + 1].imm */ + emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_9, 32)); + emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_9, + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm))); +log: + emit_ksym_relo_log(gen, relo, kdesc->ref); +} + /* Expects: * BPF_REG_8 - pointer to instruction */ @@ -729,14 +805,7 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm))); log: - if (!gen->log_level) - return; - emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_8, - offsetof(struct bpf_insn, imm))); - emit(gen, BPF_LDX_MEM(BPF_H, BPF_REG_9, BPF_REG_8, sizeof(struct bpf_insn) + - offsetof(struct bpf_insn, imm))); - debug_regs(gen, BPF_REG_7, BPF_REG_9, " var (%s:count=%d): imm: %%d, fd: %%d", - relo->name, kdesc->ref); + emit_ksym_relo_log(gen, relo, kdesc->ref); }
static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns) @@ -748,7 +817,10 @@ static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insn emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_8, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, insn)); switch (relo->kind) { case BTF_KIND_VAR: - emit_relo_ksym_btf(gen, relo, insn); + if (relo->is_typeless) + emit_relo_ksym_typeless(gen, relo, insn); + else + emit_relo_ksym_btf(gen, relo, insn); break; case BTF_KIND_FUNC: emit_relo_kfunc_btf(gen, relo, insn); @@ -773,12 +845,13 @@ static void cleanup_relos(struct bpf_gen *gen, int insns) int i, insn;
for (i = 0; i < gen->nr_ksyms; i++) { - if (gen->ksyms[i].kind == BTF_KIND_VAR) { + /* only close fds for typed ksyms and kfuncs */ + if (gen->ksyms[i].kind == BTF_KIND_VAR && !gen->ksyms[i].typeless) { /* close fd recorded in insn[insn_idx + 1].imm */ insn = gen->ksyms[i].insn; insn += sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm); emit_sys_close_blob(gen, insn); - } else { /* BTF_KIND_FUNC */ + } else if (gen->ksyms[i].kind == BTF_KIND_FUNC) { emit_sys_close_blob(gen, blob_fd_array_off(gen, gen->ksyms[i].off)); if (gen->ksyms[i].off < MAX_FD_ARRAY_SZ) gen->nr_fd_array--; diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 13f80ec3c779..048bf7689961 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6481,17 +6481,14 @@ static int bpf_program__record_externs(struct bpf_program *prog) case RELO_EXTERN_VAR: if (ext->type != EXT_KSYM) continue; - if (!ext->ksym.type_id) { - pr_warn("typeless ksym %s is not supported yet\n", - ext->name); - return -ENOTSUP; - } - bpf_gen__record_extern(obj->gen_loader, ext->name, ext->is_weak, + bpf_gen__record_extern(obj->gen_loader, ext->name, + ext->is_weak, !ext->ksym.type_id, BTF_KIND_VAR, relo->insn_idx); break; case RELO_EXTERN_FUNC: - bpf_gen__record_extern(obj->gen_loader, ext->name, ext->is_weak, - BTF_KIND_FUNC, relo->insn_idx); + bpf_gen__record_extern(obj->gen_loader, ext->name, + ext->is_weak, false, BTF_KIND_FUNC, + relo->insn_idx); break; default: continue;
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.16-rc1 commit 585a3571981d8a93e2211e1ac835d31a63e68cd8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
This extends existing ksym relocation code to also support relocating weak ksyms. Care needs to be taken to zero out the src_reg (currently BPF_PSEUOD_BTF_ID, always set for gen_loader by bpf_object__relocate_data) when the BTF ID lookup fails at runtime. This is not a problem for libbpf as it only sets ext->is_set when BTF ID lookup succeeds (and only proceeds in case of failure if ext->is_weak, leading to src_reg remaining as 0 for weak unresolved ksym).
A pattern similar to emit_relo_kfunc_btf is followed of first storing the default values and then jumping over actual stores in case of an error. For src_reg adjustment, we also need to perform it when copying the populated instruction, so depending on if copied insn[0].imm is 0 or not, we decide to jump over the adjustment.
We cannot reach that point unless the ksym was weak and resolved and zeroed out, as the emit_check_err will cause us to jump to cleanup label, so we do not need to recheck whether the ksym is weak before doing the adjustment after copying BTF ID and BTF FD.
This is consistent with how libbpf relocates weak ksym. Logging statements are added to show the relocation result and aid debugging.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211028063501.2239335-4-memxor@gmail.com (cherry picked from commit 585a3571981d8a93e2211e1ac835d31a63e68cd8) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 35 ++++++++++++++++++++++++++++++++--- 1 file changed, 32 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 11172a868180..1c404752e565 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -13,6 +13,7 @@ #include "hashmap.h" #include "bpf_gen_internal.h" #include "skel_internal.h" +#include <asm/byteorder.h>
#define MAX_USED_MAPS 64 #define MAX_USED_PROGS 32 @@ -776,12 +777,24 @@ static void emit_relo_ksym_typeless(struct bpf_gen *gen, emit_ksym_relo_log(gen, relo, kdesc->ref); }
+static __u32 src_reg_mask(void) +{ +#if defined(__LITTLE_ENDIAN_BITFIELD) + return 0x0f; /* src_reg,dst_reg,... */ +#elif defined(__BIG_ENDIAN_BITFIELD) + return 0xf0; /* dst_reg,src_reg,... */ +#else +#error "Unsupported bit endianness, cannot proceed" +#endif +} + /* Expects: * BPF_REG_8 - pointer to instruction */ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insn) { struct ksym_desc *kdesc; + __u32 reg_mask;
kdesc = get_ksym_desc(gen, relo); if (!kdesc) @@ -792,19 +805,35 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, kdesc->insn + offsetof(struct bpf_insn, imm)); move_blob2blob(gen, insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 4, kdesc->insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm)); - goto log; + emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_9, BPF_REG_8, offsetof(struct bpf_insn, imm))); + /* jump over src_reg adjustment if imm is not 0 */ + emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_9, 0, 3)); + goto clear_src_reg; } /* remember insn offset, so we can copy BTF ID and FD later */ kdesc->insn = insn; emit_bpf_find_by_name_kind(gen, relo); - emit_check_err(gen); + if (!relo->is_weak) + emit_check_err(gen); + /* set default values as 0 */ + emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, offsetof(struct bpf_insn, imm), 0)); + emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 0)); + /* skip success case stores if ret < 0 */ + emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 4)); /* store btf_id into insn[insn_idx].imm */ emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm))); /* store btf_obj_fd into insn[insn_idx + 1].imm */ emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32)); emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm))); -log: + emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 3)); +clear_src_reg: + /* clear bpf_object__relocate_data's src_reg assignment, otherwise we get a verifier failure */ + reg_mask = src_reg_mask(); + emit(gen, BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_8, offsetofend(struct bpf_insn, code))); + emit(gen, BPF_ALU32_IMM(BPF_AND, BPF_REG_9, reg_mask)); + emit(gen, BPF_STX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, offsetofend(struct bpf_insn, code))); + emit_ksym_relo_log(gen, relo, kdesc->ref); }
From: Jean-Philippe Brucker jean-philippe@linaro.org
mainline inclusion from mainline-5.16-rc2 commit e4ac80ef8198636a23866a59575917550328886f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Commit be79505caf3f ("tools/runqslower: Install libbpf headers when building") uses the target libbpf to build the host bpftool, which doesn't work when cross-building:
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -C tools/bpf/runqslower O=/tmp/runqslower ... LINK /tmp/runqslower/bpftool/bpftool /usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a(libbpf-in.o): Relocations in generic ELF (EM: 183) /usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a: error adding symbols: file in wrong format collect2: error: ld returned 1 exit status
When cross-building, the target architecture differs from the host. The bpftool used for building runqslower is executed on the host, and thus must use a different libbpf than that used for runqslower itself. Remove the LIBBPF_OUTPUT and LIBBPF_DESTDIR parameters, so the bpftool build makes its own library if necessary.
In the selftests, pass the host bpftool, already a prerequisite for the runqslower recipe, as BPFTOOL_OUTPUT. The runqslower Makefile will use the bpftool that's already built for selftests instead of making a new one.
Fixes: be79505caf3f ("tools/runqslower: Install libbpf headers when building") Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Quentin Monnet quentin@isovalent.com Link: https://lore.kernel.org/bpf/20211112155128.565680-1-jean-philippe@linaro.org Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit e4ac80ef8198636a23866a59575917550328886f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/runqslower/Makefile | 3 +-- tools/testing/selftests/bpf/Makefile | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index 1e9a34ba98e2..8a3cbb7e504d 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -81,5 +81,4 @@ $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(BPFOBJ_OU
$(DEFAULT_BPFTOOL): $(BPFOBJ) | $(BPFTOOL_OUTPUT) $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT) \ - LIBBPF_OUTPUT=$(BPFOBJ_OUTPUT) \ - LIBBPF_DESTDIR=$(BPF_DESTDIR) CC=$(HOSTCC) LD=$(HOSTLD) + CC=$(HOSTCC) LD=$(HOSTLD) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 3b9d4e066a31..c51db9acb0e9 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -180,7 +180,7 @@ DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool $(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL) $(RUNQSLOWER_OUTPUT) $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/runqslower \ OUTPUT=$(RUNQSLOWER_OUTPUT) VMLINUX_BTF=$(VMLINUX_BTF) \ - BPFTOOL_OUTPUT=$(BUILD_DIR)/bpftool/ \ + BPFTOOL_OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ BPFOBJ_OUTPUT=$(BUILD_DIR)/libbpf \ BPFOBJ=$(BPFOBJ) BPF_INCLUDE=$(INCLUDE_DIR) && \ cp $(RUNQSLOWER_OUTPUT)runqslower $@
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.16-rc2 commit ba05fd36b8512d6aeefe9c2c5b6a25b726c4bfff category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Alexei reported a fd leak issue in gen loader (when invoked from bpftool) [0]. When adding ksym support, map fd allocation was moved from stack to loader map, however I missed closing these fds (relevant when cleanup label is jumped to on error). For the success case, the allocated fd is returned in loader ctx, hence this problem is not noticed.
Make three changes, first MAX_USED_MAPS in MAX_FD_ARRAY_SZ instead of MAX_USED_PROGS, the braino was not a problem until now for this case as we didn't try to close map fds (otherwise use of it would have tried closing 32 additional fds in ksym btf fd range). Then, do a cleanup for all nr_maps fds in cleanup label code, so that in case of error all temporary map fds from bpf_gen__map_create are closed.
Then, adjust the cleanup label to only generate code for the required number of program and map fds. To trim code for remaining program fds, lay out prog_fd array in stack in the end, so that we can directly skip the remaining instances. Still stack size remains same, since changing that would require changes in a lot of places (including adjustment of stack_off macro), so nr_progs_sz variable is only used to track required number of iterations (and jump over cleanup size calculated from that), stack offset calculation remains unaffected.
The difference for test_ksyms_module.o is as follows: libbpf: //prog cleanup iterations: before = 34, after = 5 libbpf: //maps cleanup iterations: before = 64, after = 2
Also, move allocation of gen->fd_array offset to bpf_gen__init. Since offset can now be 0, and we already continue even if add_data returns 0 in case of failure, we do not need to distinguish between 0 offset and failure case 0, as we rely on bpf_gen__finish to check errors. We can also skip check for gen->fd_array in add_*_fd functions, since bpf_gen__init will take care of it.
[0]: https://lore.kernel.org/bpf/CAADnVQJ6jSitKSNKyxOrUzwY2qDRX0sPkJ=VLGHuCLVJ=qO...
Fixes: 18f4fccbf314 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations") Reported-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211112232022.899074-1-memxor@gmail.com (cherry picked from commit ba05fd36b8512d6aeefe9c2c5b6a25b726c4bfff) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_gen_internal.h | 4 +-- tools/lib/bpf/gen_loader.c | 47 ++++++++++++++++++++------------ tools/lib/bpf/libbpf.c | 4 +-- 3 files changed, 34 insertions(+), 21 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h index 9f328d044636..d507e2841df5 100644 --- a/tools/lib/bpf/bpf_gen_internal.h +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -45,8 +45,8 @@ struct bpf_gen { int nr_fd_array; };
-void bpf_gen__init(struct bpf_gen *gen, int log_level); -int bpf_gen__finish(struct bpf_gen *gen); +void bpf_gen__init(struct bpf_gen *gen, int log_level, int nr_progs, int nr_maps); +int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps); void bpf_gen__free(struct bpf_gen *gen); void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size); void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx); diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 1c404752e565..c6e065c493b8 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -18,7 +18,7 @@ #define MAX_USED_MAPS 64 #define MAX_USED_PROGS 32 #define MAX_KFUNC_DESCS 256 -#define MAX_FD_ARRAY_SZ (MAX_USED_PROGS + MAX_KFUNC_DESCS) +#define MAX_FD_ARRAY_SZ (MAX_USED_MAPS + MAX_KFUNC_DESCS)
/* The following structure describes the stack layout of the loader program. * In addition R6 contains the pointer to context. @@ -33,8 +33,8 @@ */ struct loader_stack { __u32 btf_fd; - __u32 prog_fd[MAX_USED_PROGS]; __u32 inner_map_fd; + __u32 prog_fd[MAX_USED_PROGS]; };
#define stack_off(field) \ @@ -42,6 +42,11 @@ struct loader_stack {
#define attr_field(attr, field) (attr + offsetof(union bpf_attr, field))
+static int blob_fd_array_off(struct bpf_gen *gen, int index) +{ + return gen->fd_array + index * sizeof(int); +} + static int realloc_insn_buf(struct bpf_gen *gen, __u32 size) { size_t off = gen->insn_cur - gen->insn_start; @@ -102,11 +107,15 @@ static void emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn in emit(gen, insn2); }
-void bpf_gen__init(struct bpf_gen *gen, int log_level) +static int add_data(struct bpf_gen *gen, const void *data, __u32 size); +static void emit_sys_close_blob(struct bpf_gen *gen, int blob_off); + +void bpf_gen__init(struct bpf_gen *gen, int log_level, int nr_progs, int nr_maps) { - size_t stack_sz = sizeof(struct loader_stack); + size_t stack_sz = sizeof(struct loader_stack), nr_progs_sz; int i;
+ gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int)); gen->log_level = log_level; /* save ctx pointer into R6 */ emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1)); @@ -118,19 +127,27 @@ void bpf_gen__init(struct bpf_gen *gen, int log_level) emit(gen, BPF_MOV64_IMM(BPF_REG_3, 0)); emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));
+ /* amount of stack actually used, only used to calculate iterations, not stack offset */ + nr_progs_sz = offsetof(struct loader_stack, prog_fd[nr_progs]); /* jump over cleanup code */ emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, - /* size of cleanup code below */ - (stack_sz / 4) * 3 + 2)); + /* size of cleanup code below (including map fd cleanup) */ + (nr_progs_sz / 4) * 3 + 2 + + /* 6 insns for emit_sys_close_blob, + * 6 insns for debug_regs in emit_sys_close_blob + */ + nr_maps * (6 + (gen->log_level ? 6 : 0))));
/* remember the label where all error branches will jump to */ gen->cleanup_label = gen->insn_cur - gen->insn_start; /* emit cleanup code: close all temp FDs */ - for (i = 0; i < stack_sz; i += 4) { + for (i = 0; i < nr_progs_sz; i += 4) { emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -stack_sz + i)); emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 1)); emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close)); } + for (i = 0; i < nr_maps; i++) + emit_sys_close_blob(gen, blob_fd_array_off(gen, i)); /* R7 contains the error code from sys_bpf. Copy it into R0 and exit. */ emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7)); emit(gen, BPF_EXIT_INSN()); @@ -160,8 +177,6 @@ static int add_data(struct bpf_gen *gen, const void *data, __u32 size) */ static int add_map_fd(struct bpf_gen *gen) { - if (!gen->fd_array) - gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int)); if (gen->nr_maps == MAX_USED_MAPS) { pr_warn("Total maps exceeds %d\n", MAX_USED_MAPS); gen->error = -E2BIG; @@ -174,8 +189,6 @@ static int add_kfunc_btf_fd(struct bpf_gen *gen) { int cur;
- if (!gen->fd_array) - gen->fd_array = add_data(gen, NULL, MAX_FD_ARRAY_SZ * sizeof(int)); if (gen->nr_fd_array == MAX_KFUNC_DESCS) { cur = add_data(gen, NULL, sizeof(int)); return (cur - gen->fd_array) / sizeof(int); @@ -183,11 +196,6 @@ static int add_kfunc_btf_fd(struct bpf_gen *gen) return MAX_USED_MAPS + gen->nr_fd_array++; }
-static int blob_fd_array_off(struct bpf_gen *gen, int index) -{ - return gen->fd_array + index * sizeof(int); -} - static int insn_bytes_to_bpf_size(__u32 sz) { switch (sz) { @@ -359,10 +367,15 @@ static void emit_sys_close_blob(struct bpf_gen *gen, int blob_off) __emit_sys_close(gen); }
-int bpf_gen__finish(struct bpf_gen *gen) +int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps) { int i;
+ if (nr_progs != gen->nr_progs || nr_maps != gen->nr_maps) { + pr_warn("progs/maps mismatch\n"); + gen->error = -EFAULT; + return gen->error; + } emit_sys_close_stack(gen, stack_off(btf_fd)); for (i = 0; i < gen->nr_progs; i++) move_stack2ctx(gen, diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 048bf7689961..97d215ba7bb5 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7145,7 +7145,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) }
if (obj->gen_loader) - bpf_gen__init(obj->gen_loader, attr->log_level); + bpf_gen__init(obj->gen_loader, attr->log_level, obj->nr_programs, obj->nr_maps);
err = bpf_object__probe_loading(obj); err = err ? : bpf_object__load_vmlinux_btf(obj, false); @@ -7164,7 +7164,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) for (i = 0; i < obj->nr_maps; i++) obj->maps[i].fd = -1; if (!err) - err = bpf_gen__finish(obj->gen_loader); + err = bpf_gen__finish(obj->gen_loader, obj->nr_programs, obj->nr_maps); }
/* clean up module BTFs */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 833907876be55205d0ec153dcd819c014404ee16 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Prevent divide-by-zero if ELF is corrupted and has zero sh_entsize. Reported by oss-fuzz project.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20211103173213.1376990-2-andrii@kernel.org (cherry picked from commit 833907876be55205d0ec153dcd819c014404ee16) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 97d215ba7bb5..0e582758fdb4 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3503,7 +3503,7 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
scn = elf_sec_by_idx(obj, obj->efile.symbols_shndx); sh = elf_sec_hdr(obj, scn); - if (!sh) + if (!sh || sh->sh_entsize != sizeof(Elf64_Sym)) return -LIBBPF_ERRNO__FORMAT;
dummy_var_btf_id = add_dummy_ksym_var(obj->btf);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 88918dc12dc357a06d8d722a684617b1c87a4654 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
If BTF is corrupted DATASEC's variable type ID might be incorrect. Prevent this easy to detect situation with extra NULL check. Reported by oss-fuzz project.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20211103173213.1376990-3-andrii@kernel.org (cherry picked from commit 88918dc12dc357a06d8d722a684617b1c87a4654) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0e582758fdb4..7a46fd502eb2 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2699,13 +2699,12 @@ static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf,
for (i = 0, vsi = btf_var_secinfos(t); i < vars; i++, vsi++) { t_var = btf__type_by_id(btf, vsi->type); - var = btf_var(t_var); - - if (!btf_is_var(t_var)) { + if (!t_var || !btf_is_var(t_var)) { pr_debug("Non-VAR type seen in section %s\n", name); return -EINVAL; }
+ var = btf_var(t_var); if (var->linkage == BTF_VAR_STATIC) continue;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 62554d52e71797eefa3fc15b54008038837bb2d4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
.BTF and .BTF.ext ELF sections should have SHT_PROGBITS type and contain data. If they are not, ELF is invalid or corrupted, so bail out. Otherwise this can lead to data->d_buf being NULL and SIGSEGV later on. Reported by oss-fuzz project.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20211103173213.1376990-4-andrii@kernel.org (cherry picked from commit 62554d52e71797eefa3fc15b54008038837bb2d4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 7a46fd502eb2..8188d5113d5d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3218,8 +3218,12 @@ static int bpf_object__elf_collect(struct bpf_object *obj) } else if (strcmp(name, MAPS_ELF_SEC) == 0) { obj->efile.btf_maps_shndx = idx; } else if (strcmp(name, BTF_ELF_SEC) == 0) { + if (sh->sh_type != SHT_PROGBITS) + return -LIBBPF_ERRNO__FORMAT; btf_data = data; } else if (strcmp(name, BTF_EXT_ELF_SEC) == 0) { + if (sh->sh_type != SHT_PROGBITS) + return -LIBBPF_ERRNO__FORMAT; btf_ext_data = data; } else if (sh->sh_type == SHT_SYMTAB) { /* already processed during the first pass above */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 0d6988e16a12ebd41d3e268992211b0ceba44ed7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
e_shnum does include section #0 and as such is exactly the number of ELF sections that we need to allocate memory for to use section indices as array indices. Fix the off-by-one error.
This is purely accounting fix, previously we were overallocating one too many array items. But no correctness errors otherwise.
Fixes: 25bbbd7a444b ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20211103173213.1376990-5-andrii@kernel.org (cherry picked from commit 0d6988e16a12ebd41d3e268992211b0ceba44ed7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 8188d5113d5d..d37a29beecfe 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3138,11 +3138,11 @@ static int bpf_object__elf_collect(struct bpf_object *obj) Elf_Scn *scn; Elf64_Shdr *sh;
- /* ELF section indices are 1-based, so allocate +1 element to keep - * indexing simple. Also include 0th invalid section into sec_cnt for - * simpler and more traditional iteration logic. + /* ELF section indices are 0-based, but sec #0 is special "invalid" + * section. e_shnum does include sec #0, so e_shnum is the necessary + * size of an array to keep all the sections. */ - obj->efile.sec_cnt = 1 + obj->efile.ehdr->e_shnum; + obj->efile.sec_cnt = obj->efile.ehdr->e_shnum; obj->efile.secs = calloc(obj->efile.sec_cnt, sizeof(*obj->efile.secs)); if (!obj->efile.secs) return -ENOMEM;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit b7332d2820d394dd2ac127df1567b4da597355a1 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add few sanity checks for relocations to prevent div-by-zero and out-of-bounds array accesses in libbpf.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20211103173213.1376990-6-andrii@kernel.org (cherry picked from commit b7332d2820d394dd2ac127df1567b4da597355a1) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index d37a29beecfe..ff6f6ac8a793 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3254,6 +3254,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj) } else if (sh->sh_type == SHT_REL) { int targ_sec_idx = sh->sh_info; /* points to other section */
+ if (sh->sh_entsize != sizeof(Elf64_Rel) || + targ_sec_idx >= obj->efile.sec_cnt) + return -LIBBPF_ERRNO__FORMAT; + /* Only do relo for section with exec instructions */ if (!section_have_execinstr(obj, targ_sec_idx) && strcmp(name, ".rel" STRUCT_OPS_SEC) && @@ -3978,7 +3982,7 @@ static int bpf_object__collect_prog_relos(struct bpf_object *obj, Elf64_Shdr *shdr, Elf_Data *data) { const char *relo_sec_name, *sec_name; - size_t sec_idx = shdr->sh_info; + size_t sec_idx = shdr->sh_info, sym_idx; struct bpf_program *prog; struct reloc_desc *relos; int err, i, nrels; @@ -3989,6 +3993,9 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, Elf64_Shdr *shdr, Elf_Dat Elf64_Sym *sym; Elf64_Rel *rel;
+ if (sec_idx >= obj->efile.sec_cnt) + return -EINVAL; + scn = elf_sec_by_idx(obj, sec_idx); scn_data = elf_sec_data(obj, scn);
@@ -4008,16 +4015,23 @@ bpf_object__collect_prog_relos(struct bpf_object *obj, Elf64_Shdr *shdr, Elf_Dat return -LIBBPF_ERRNO__FORMAT; }
- sym = elf_sym_by_idx(obj, ELF64_R_SYM(rel->r_info)); + sym_idx = ELF64_R_SYM(rel->r_info); + sym = elf_sym_by_idx(obj, sym_idx); if (!sym) { - pr_warn("sec '%s': symbol 0x%zx not found for relo #%d\n", - relo_sec_name, (size_t)ELF64_R_SYM(rel->r_info), i); + pr_warn("sec '%s': symbol #%zu not found for relo #%d\n", + relo_sec_name, sym_idx, i); + return -LIBBPF_ERRNO__FORMAT; + } + + if (sym->st_shndx >= obj->efile.sec_cnt) { + pr_warn("sec '%s': corrupted symbol #%zu pointing to invalid section #%zu for relo #%d\n", + relo_sec_name, sym_idx, (size_t)sym->st_shndx, i); return -LIBBPF_ERRNO__FORMAT; }
if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) { pr_warn("sec '%s': invalid offset 0x%zx for relo #%d\n", - relo_sec_name, (size_t)ELF64_R_SYM(rel->r_info), i); + relo_sec_name, (size_t)rel->r_offset, i); return -LIBBPF_ERRNO__FORMAT; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit be2f2d1680dfb36793ea8d3110edd4a1db496352 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Mark bpf_program__load() as deprecated ([0]) since v0.6. Also rename few internal program loading bpf_object helper functions to have more consistent naming.
[0] Closes: https://github.com/libbpf/libbpf/issues/301
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211103051449.1884903-1-andrii@kernel.org (cherry picked from commit be2f2d1680dfb36793ea8d3110edd4a1db496352) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 36 ++++++++++++++++++++++-------------- tools/lib/bpf/libbpf.h | 4 ++-- 2 files changed, 24 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ff6f6ac8a793..c1800ecee079 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6335,12 +6335,12 @@ static int libbpf_preload_prog(struct bpf_program *prog, return 0; }
-static int -load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, - char *license, __u32 kern_version, int *pfd) +static int bpf_object_load_prog_instance(struct bpf_object *obj, struct bpf_program *prog, + struct bpf_insn *insns, int insns_cnt, + const char *license, __u32 kern_version, + int *prog_fd) { struct bpf_prog_load_params load_attr = {}; - struct bpf_object *obj = prog->obj; char *cp, errmsg[STRERR_BUFSIZE]; size_t log_buf_size = 0; char *log_buf = NULL; @@ -6400,7 +6400,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, if (obj->gen_loader) { bpf_gen__prog_load(obj->gen_loader, &load_attr, prog - obj->programs); - *pfd = -1; + *prog_fd = -1; return 0; } retry_load: @@ -6438,7 +6438,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, } }
- *pfd = ret; + *prog_fd = ret; ret = 0; goto out; } @@ -6514,11 +6514,12 @@ static int bpf_program__record_externs(struct bpf_program *prog) return 0; }
-int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) +static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog, + const char *license, __u32 kern_ver) { int err = 0, fd, i;
- if (prog->obj->loaded) { + if (obj->loaded) { pr_warn("prog '%s': can't load after object was loaded\n", prog->name); return libbpf_err(-EINVAL); } @@ -6544,10 +6545,11 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) pr_warn("prog '%s': inconsistent nr(%d) != 1\n", prog->name, prog->instances.nr); } - if (prog->obj->gen_loader) + if (obj->gen_loader) bpf_program__record_externs(prog); - err = load_program(prog, prog->insns, prog->insns_cnt, - license, kern_ver, &fd); + err = bpf_object_load_prog_instance(obj, prog, + prog->insns, prog->insns_cnt, + license, kern_ver, &fd); if (!err) prog->instances.fds[0] = fd; goto out; @@ -6575,8 +6577,9 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) continue; }
- err = load_program(prog, result.new_insn_ptr, - result.new_insn_cnt, license, kern_ver, &fd); + err = bpf_object_load_prog_instance(obj, prog, + result.new_insn_ptr, result.new_insn_cnt, + license, kern_ver, &fd); if (err) { pr_warn("Loading the %dth instance of program '%s' failed\n", i, prog->name); @@ -6593,6 +6596,11 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver) return libbpf_err(err); }
+int bpf_program__load(struct bpf_program *prog, const char *license, __u32 kern_ver) +{ + return bpf_object_load_prog(prog->obj, prog, license, kern_ver); +} + static int bpf_object__load_progs(struct bpf_object *obj, int log_level) { @@ -6616,7 +6624,7 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level) continue; } prog->log_level |= log_level; - err = bpf_program__load(prog, obj->license, obj->kern_version); + err = bpf_object_load_prog(obj, prog, obj->license, obj->kern_version); if (err) return err; } diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 74ea79a75364..44d58c934446 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -261,8 +261,8 @@ LIBBPF_API const struct bpf_insn *bpf_program__insns(const struct bpf_program *p */ LIBBPF_API size_t bpf_program__insn_cnt(const struct bpf_program *prog);
-LIBBPF_API int bpf_program__load(struct bpf_program *prog, char *license, - __u32 kern_version); +LIBBPF_DEPRECATED_SINCE(0, 6, "use bpf_object__load() instead") +LIBBPF_API int bpf_program__load(struct bpf_program *prog, const char *license, __u32 kern_version); LIBBPF_API int bpf_program__fd(const struct bpf_program *prog); LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") LIBBPF_API int bpf_program__pin_instance(struct bpf_program *prog,
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit be80e9cdbca8ac66d09e0e24e0bd41d992362a0b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It's confusing that libbpf-provided helper macro doesn't start with LIBBPF. Also "declare" vs "define" is confusing terminology, I can never remember and always have to look up previous examples.
Bypass both issues by renaming DECLARE_LIBBPF_OPTS into a short and clean LIBBPF_OPTS. To avoid breaking existing code, provide:
#define DECLARE_LIBBPF_OPTS LIBBPF_OPTS
in libbpf_legacy.h. We can decide later if we ever want to remove it or we'll keep it forever because it doesn't add any maintainability burden.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Dave Marchevsky davemarchevsky@fb.com Link: https://lore.kernel.org/bpf/20211103220845.2676888-2-andrii@kernel.org (cherry picked from commit be80e9cdbca8ac66d09e0e24e0bd41d992362a0b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf.h | 1 + tools/lib/bpf/libbpf_common.h | 2 +- tools/lib/bpf/libbpf_legacy.h | 1 + 3 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 875dde20d56e..b281376a4f1c 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -29,6 +29,7 @@ #include <stdint.h>
#include "libbpf_common.h" +#include "libbpf_legacy.h"
#ifdef __cplusplus extern "C" { diff --git a/tools/lib/bpf/libbpf_common.h b/tools/lib/bpf/libbpf_common.h index aaa1efbf6f51..0967112b933a 100644 --- a/tools/lib/bpf/libbpf_common.h +++ b/tools/lib/bpf/libbpf_common.h @@ -54,7 +54,7 @@ * including any extra padding, it with memset() and then assigns initial * values provided by users in struct initializer-syntax as varargs. */ -#define DECLARE_LIBBPF_OPTS(TYPE, NAME, ...) \ +#define LIBBPF_OPTS(TYPE, NAME, ...) \ struct TYPE NAME = ({ \ memset(&NAME, 0, sizeof(struct TYPE)); \ (struct TYPE) { \ diff --git a/tools/lib/bpf/libbpf_legacy.h b/tools/lib/bpf/libbpf_legacy.h index df0d03dcffab..5914867c509a 100644 --- a/tools/lib/bpf/libbpf_legacy.h +++ b/tools/lib/bpf/libbpf_legacy.h @@ -51,6 +51,7 @@ enum libbpf_strict_mode {
LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode);
+#define DECLARE_LIBBPF_OPTS LIBBPF_OPTS
#ifdef __cplusplus } /* extern "C" */
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 2a2cb45b727b7a1041f3d3d93414b774e66454bb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When adding -fsanitize=address to SAN_CFLAGS, it has to be passed both to compiler through CFLAGS as well as linker through LDFLAGS. Add SAN_CFLAGS into LDFLAGS to allow building selftests with ASAN.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Hengqi Chen hengqi.chen@gmail.com Link: https://lore.kernel.org/bpf/20211107165521.9240-2-andrii@kernel.org (cherry picked from commit 2a2cb45b727b7a1041f3d3d93414b774e66454bb) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/testing/selftests/bpf/Makefile
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index c51db9acb0e9..20b7a5663c21 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -26,6 +26,7 @@ CFLAGS += -g -rdynamic -Wall -O2 $(GENFLAGS) $(SAN_CFLAGS) \ -I$(TOOLSINCDIR) -I$(APIDIR) -I$(OUTPUT) \ -Dbpf_prog_load=bpf_prog_test_load \ -Dbpf_load_program=bpf_test_load_program +LDFLAGS += $(SAN_CFLAGS) LDLIBS += -lcap -lelf -lz -lrt -lpthread
# Order correspond to 'make run_tests' order
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 8f7b239ea8cfdc8e64c875ee417fed41431a1f37 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
It's not enough to just free(map->inner_map), as inner_map itself can have extra memory allocated, like map name.
Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Hengqi Chen hengqi.chen@gmail.com Link: https://lore.kernel.org/bpf/20211107165521.9240-3-andrii@kernel.org (cherry picked from commit 8f7b239ea8cfdc8e64c875ee417fed41431a1f37) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index c1800ecee079..27b49e5a8e1d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8995,7 +8995,10 @@ int bpf_map__set_inner_map_fd(struct bpf_map *map, int fd) pr_warn("error: inner_map_fd already specified\n"); return libbpf_err(-EINVAL); } - zfree(&map->inner_map); + if (map->inner_map) { + bpf_map__destroy(map->inner_map); + zfree(&map->inner_map); + } map->inner_map_fd = fd; return 0; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 8ba285874913da21ca39a46376e9cc5ce0f45f94 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Free up memory and resources used by temporary allocated memstream and btf_dump instance.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Hengqi Chen hengqi.chen@gmail.com Link: https://lore.kernel.org/bpf/20211107165521.9240-4-andrii@kernel.org (cherry picked from commit 8ba285874913da21ca39a46376e9cc5ce0f45f94) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/btf_helpers.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/btf_helpers.c b/tools/testing/selftests/bpf/btf_helpers.c index 48f90490f922..95f4e308170e 100644 --- a/tools/testing/selftests/bpf/btf_helpers.c +++ b/tools/testing/selftests/bpf/btf_helpers.c @@ -242,18 +242,23 @@ const char *btf_type_c_dump(const struct btf *btf) d = btf_dump__new(btf, NULL, &opts, btf_dump_printf); if (libbpf_get_error(d)) { fprintf(stderr, "Failed to create btf_dump instance: %ld\n", libbpf_get_error(d)); - return NULL; + goto err_out; }
for (i = 1; i <= btf__get_nr_types(btf); i++) { err = btf_dump__dump_type(d, i); if (err) { fprintf(stderr, "Failed to dump type [%d]: %d\n", i, err); - return NULL; + goto err_out; } }
+ btf_dump__free(d); fflush(buf_file); fclose(buf_file); return buf; +err_out: + btf_dump__free(d); + fclose(buf_file); + return NULL; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit b8b26e585f3a0fbcee1032c622f046787da57390 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Array holding per-cpu values wasn't freed. Fix that.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211107165521.9240-5-andrii@kernel.org (cherry picked from commit b8b26e585f3a0fbcee1032c622f046787da57390) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index adc38a6bae14..66da14500635 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -674,14 +674,13 @@ static void test_bpf_percpu_hash_map(void) char buf[64]; void *val;
- val = malloc(8 * bpf_num_possible_cpus()); - skel = bpf_iter_bpf_percpu_hash_map__open(); if (CHECK(!skel, "bpf_iter_bpf_percpu_hash_map__open", "skeleton open failed\n")) return;
skel->rodata->num_cpus = bpf_num_possible_cpus(); + val = malloc(8 * bpf_num_possible_cpus());
err = bpf_iter_bpf_percpu_hash_map__load(skel); if (CHECK(!skel, "bpf_iter_bpf_percpu_hash_map__load", @@ -746,6 +745,7 @@ static void test_bpf_percpu_hash_map(void) bpf_link__destroy(link); out: bpf_iter_bpf_percpu_hash_map__destroy(skel); + free(val); }
static void test_bpf_array_map(void) @@ -846,14 +846,13 @@ static void test_bpf_percpu_array_map(void) void *val; int len;
- val = malloc(8 * bpf_num_possible_cpus()); - skel = bpf_iter_bpf_percpu_array_map__open(); if (CHECK(!skel, "bpf_iter_bpf_percpu_array_map__open", "skeleton open failed\n")) return;
skel->rodata->num_cpus = bpf_num_possible_cpus(); + val = malloc(8 * bpf_num_possible_cpus());
err = bpf_iter_bpf_percpu_array_map__load(skel); if (CHECK(!skel, "bpf_iter_bpf_percpu_array_map__load", @@ -909,6 +908,7 @@ static void test_bpf_percpu_array_map(void) bpf_link__destroy(link); out: bpf_iter_bpf_percpu_array_map__destroy(skel); + free(val); }
static void test_bpf_sk_storage_map(void)
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 5309b516bcc6f76dda0e44a7a1824324277093d6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Inner array of allocated strings wasn't freed on success. Now it's always freed.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Hengqi Chen hengqi.chen@gmail.com Link: https://lore.kernel.org/bpf/20211107165521.9240-6-andrii@kernel.org (cherry picked from commit 5309b516bcc6f76dda0e44a7a1824324277093d6) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/btf.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 0cbffc0276c2..e999d68c0f45 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -3637,11 +3637,9 @@ static void *btf_raw_create(const struct btf_header *hdr, next_str_idx < strs_cnt ? strs_idx[next_str_idx] : NULL;
done: + free(strs_idx); if (err) { - if (raw_btf) - free(raw_btf); - if (strs_idx) - free(strs_idx); + free(raw_btf); return NULL; } return raw_btf;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit f92321d706a810b89a905e04658e38931c4bb0e0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
btf__parse() is repeated after successful setup, leaving the first instance leaked. Remove redundant and premature call.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Hengqi Chen hengqi.chen@gmail.com Link: https://lore.kernel.org/bpf/20211107165521.9240-8-andrii@kernel.org (cherry picked from commit f92321d706a810b89a905e04658e38931c4bb0e0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/core_reloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index b8e46da0e1a0..a47cd67c9451 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -432,7 +432,7 @@ static int setup_type_id_case_local(struct core_reloc_test_case *test)
static int setup_type_id_case_success(struct core_reloc_test_case *test) { struct core_reloc_type_id_output *exp = (void *)test->output; - struct btf *targ_btf = btf__parse(test->btf_src_file, NULL); + struct btf *targ_btf; int err;
err = setup_type_id_case_local(test);
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.16-rc1 commit 223f903e9c832699f4e5f422281a60756c1c6cfe category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Patch set [1] introduced BTF_KIND_TAG to allow tagging declarations for struct/union, struct/union field, var, func and func arguments and these tags will be encoded into dwarf. They are also encoded to btf by llvm for the bpf target.
After BTF_KIND_TAG is introduced, we intended to use it for kernel __user attributes. But kernel __user is actually a type attribute. Upstream and internal discussion showed it is not a good idea to mix declaration attribute and type attribute. So we proposed to introduce btf_type_tag as a type attribute and existing btf_tag renamed to btf_decl_tag ([2]).
This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some other declarations with *_tag to *_decl_tag to make it clear the tag is for declaration. In the future, BTF_KIND_TYPE_TAG might be introduced per [3].
[1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/ [2] https://reviews.llvm.org/D111588 [3] https://reviews.llvm.org/D111199
Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG") Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG") Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG") Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com (cherry picked from commit 223f903e9c832699f4e5f422281a60756c1c6cfe) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: Documentation/bpf/btf.rst kernel/bpf/btf.c tools/bpf/bpftool/btf.c tools/lib/bpf/libbpf.c tools/testing/selftests/bpf/README.rst tools/testing/selftests/bpf/btf_helpers.c tools/testing/selftests/bpf/prog_tests/btf.c tools/testing/selftests/bpf/prog_tests/btf_write.c tools/testing/selftests/bpf/progs/tag.c tools/testing/selftests/bpf/test_btf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- Documentation/bpf/btf.rst | 3 +- include/uapi/linux/btf.h | 8 ++--- kernel/bpf/btf.c | 44 +++++++++++++------------- tools/bpf/bpftool/btf.c | 6 ++-- tools/include/uapi/linux/btf.h | 8 ++--- tools/lib/bpf/btf.c | 36 ++++++++++----------- tools/lib/bpf/btf.h | 12 +++---- tools/lib/bpf/btf_dump.c | 6 ++-- tools/lib/bpf/libbpf.c | 24 +++++++------- tools/lib/bpf/libbpf.map | 2 +- tools/lib/bpf/libbpf_internal.h | 4 +-- tools/testing/selftests/bpf/test_btf.h | 3 ++ 12 files changed, 80 insertions(+), 76 deletions(-)
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst index f82137aa4bd9..672aa0abf49f 100644 --- a/Documentation/bpf/btf.rst +++ b/Documentation/bpf/btf.rst @@ -84,6 +84,7 @@ sequentially and type id is assigned to each recognized type starting from id #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ #define BTF_KIND_VAR 14 /* Variable */ #define BTF_KIND_DATASEC 15 /* Section */ + #define BTF_KIND_DECL_TAG 17 /* Decl Tag */
Note that the type section encodes debug info, not just pure types. ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram. @@ -105,7 +106,7 @@ Each type contains the following common data:: * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC and FUNC_PROTO. + * FUNC, FUNC_PROTO and DECL_TAG. * "type" is a type_id referring to another type. */ union { diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index 642b6ecb37d7..deb12f755f0f 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -43,7 +43,7 @@ struct btf_type { * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO, VAR and TAG. + * FUNC, FUNC_PROTO, VAR and DECL_TAG. * "type" is a type_id referring to another type. */ union { @@ -74,7 +74,7 @@ enum { BTF_KIND_VAR = 14, /* Variable */ BTF_KIND_DATASEC = 15, /* Section */ BTF_KIND_FLOAT = 16, /* Floating point */ - BTF_KIND_TAG = 17, /* Tag */ + BTF_KIND_DECL_TAG = 17, /* Decl Tag */
NR_BTF_KINDS, BTF_KIND_MAX = NR_BTF_KINDS - 1, @@ -174,14 +174,14 @@ struct btf_var_secinfo { __u32 size; };
-/* BTF_KIND_TAG is followed by a single "struct btf_tag" to describe +/* BTF_KIND_DECL_TAG is followed by a single "struct btf_decl_tag" to describe * additional information related to the tag applied location. * If component_idx == -1, the tag is applied to a struct, union, * variable or function. Otherwise, it is applied to a struct/union * member or a func argument, and component_idx indicates which member * or argument (0 ... vlen-1). */ -struct btf_tag { +struct btf_decl_tag { __s32 component_idx; };
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index b7c334bccb88..650ed461872d 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -280,7 +280,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO", [BTF_KIND_VAR] = "VAR", [BTF_KIND_DATASEC] = "DATASEC", - [BTF_KIND_TAG] = "TAG", + [BTF_KIND_DECL_TAG] = "DECL_TAG", };
const char *btf_type_str(const struct btf_type *t) @@ -459,12 +459,12 @@ static bool btf_type_is_datasec(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; }
-static bool btf_type_is_tag(const struct btf_type *t) +static bool btf_type_is_decl_tag(const struct btf_type *t) { - return BTF_INFO_KIND(t->info) == BTF_KIND_TAG; + return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG; }
-static bool btf_type_is_tag_target(const struct btf_type *t) +static bool btf_type_is_decl_tag_target(const struct btf_type *t) { return btf_type_is_func(t) || btf_type_is_struct(t) || btf_type_is_var(t) || btf_type_is_typedef(t); @@ -548,7 +548,7 @@ const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf, static bool btf_type_is_resolve_source_only(const struct btf_type *t) { return btf_type_is_var(t) || - btf_type_is_tag(t) || + btf_type_is_decl_tag(t) || btf_type_is_datasec(t); }
@@ -575,7 +575,7 @@ static bool btf_type_needs_resolve(const struct btf_type *t) btf_type_is_struct(t) || btf_type_is_array(t) || btf_type_is_var(t) || - btf_type_is_tag(t) || + btf_type_is_decl_tag(t) || btf_type_is_datasec(t); }
@@ -628,9 +628,9 @@ static const struct btf_var *btf_type_var(const struct btf_type *t) return (const struct btf_var *)(t + 1); }
-static const struct btf_tag *btf_type_tag(const struct btf_type *t) +static const struct btf_decl_tag *btf_type_decl_tag(const struct btf_type *t) { - return (const struct btf_tag *)(t + 1); + return (const struct btf_decl_tag *)(t + 1); }
static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t) @@ -3694,11 +3694,11 @@ static const struct btf_kind_operations datasec_ops = { .show = btf_datasec_show, };
-static s32 btf_tag_check_meta(struct btf_verifier_env *env, +static s32 btf_decl_tag_check_meta(struct btf_verifier_env *env, const struct btf_type *t, u32 meta_left) { - const struct btf_tag *tag; + const struct btf_decl_tag *tag; u32 meta_needed = sizeof(*tag); s32 component_idx; const char *value; @@ -3726,7 +3726,7 @@ static s32 btf_tag_check_meta(struct btf_verifier_env *env, return -EINVAL; }
- component_idx = btf_type_tag(t)->component_idx; + component_idx = btf_type_decl_tag(t)->component_idx; if (component_idx < -1) { btf_verifier_log_type(env, t, "Invalid component_idx"); return -EINVAL; @@ -3737,7 +3737,7 @@ static s32 btf_tag_check_meta(struct btf_verifier_env *env, return meta_needed; }
-static int btf_tag_resolve(struct btf_verifier_env *env, +static int btf_decl_tag_resolve(struct btf_verifier_env *env, const struct resolve_vertex *v) { const struct btf_type *next_type; @@ -3748,7 +3748,7 @@ static int btf_tag_resolve(struct btf_verifier_env *env, u32 vlen;
next_type = btf_type_by_id(btf, next_type_id); - if (!next_type || !btf_type_is_tag_target(next_type)) { + if (!next_type || !btf_type_is_decl_tag_target(next_type)) { btf_verifier_log_type(env, v->t, "Invalid type_id"); return -EINVAL; } @@ -3757,7 +3757,7 @@ static int btf_tag_resolve(struct btf_verifier_env *env, !env_type_is_resolved(env, next_type_id)) return env_stack_push(env, next_type, next_type_id);
- component_idx = btf_type_tag(t)->component_idx; + component_idx = btf_type_decl_tag(t)->component_idx; if (component_idx != -1) { if (btf_type_is_var(next_type) || btf_type_is_typedef(next_type)) { btf_verifier_log_type(env, v->t, "Invalid component_idx"); @@ -3783,18 +3783,18 @@ static int btf_tag_resolve(struct btf_verifier_env *env, return 0; }
-static void btf_tag_log(struct btf_verifier_env *env, const struct btf_type *t) +static void btf_decl_tag_log(struct btf_verifier_env *env, const struct btf_type *t) { btf_verifier_log(env, "type=%u component_idx=%d", t->type, - btf_type_tag(t)->component_idx); + btf_type_decl_tag(t)->component_idx); }
-static const struct btf_kind_operations tag_ops = { - .check_meta = btf_tag_check_meta, - .resolve = btf_tag_resolve, +static const struct btf_kind_operations decl_tag_ops = { + .check_meta = btf_decl_tag_check_meta, + .resolve = btf_decl_tag_resolve, .check_member = btf_df_check_member, .check_kflag_member = btf_df_check_kflag_member, - .log_details = btf_tag_log, + .log_details = btf_decl_tag_log, .show = btf_df_show, };
@@ -3931,7 +3931,7 @@ static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS] = { [BTF_KIND_FUNC_PROTO] = &func_proto_ops, [BTF_KIND_VAR] = &var_ops, [BTF_KIND_DATASEC] = &datasec_ops, - [BTF_KIND_TAG] = &tag_ops, + [BTF_KIND_DECL_TAG] = &decl_tag_ops, };
static s32 btf_check_meta(struct btf_verifier_env *env, @@ -4016,7 +4016,7 @@ static bool btf_resolve_valid(struct btf_verifier_env *env, return !btf_resolved_type_id(btf, type_id) && !btf_resolved_type_size(btf, type_id);
- if (btf_type_is_tag(t)) + if (btf_type_is_decl_tag(t)) return btf_resolved_type_id(btf, type_id) && !btf_resolved_type_size(btf, type_id);
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 2228136c538a..52cb61ef7331 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -36,7 +36,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO", [BTF_KIND_VAR] = "VAR", [BTF_KIND_DATASEC] = "DATASEC", - [BTF_KIND_TAG] = "TAG", + [BTF_KIND_DECL_TAG] = "DECL_TAG", };
struct btf_attach_table { @@ -328,8 +328,8 @@ static int dump_btf_type(const struct btf *btf, __u32 id, jsonw_end_array(w); break; } - case BTF_KIND_TAG: { - const struct btf_tag *tag = (const void *)(t + 1); + case BTF_KIND_DECL_TAG: { + const struct btf_decl_tag *tag = (const void *)(t + 1);
if (json_output) { jsonw_uint_field(w, "type_id", t->type); diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h index 642b6ecb37d7..deb12f755f0f 100644 --- a/tools/include/uapi/linux/btf.h +++ b/tools/include/uapi/linux/btf.h @@ -43,7 +43,7 @@ struct btf_type { * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO, VAR and TAG. + * FUNC, FUNC_PROTO, VAR and DECL_TAG. * "type" is a type_id referring to another type. */ union { @@ -74,7 +74,7 @@ enum { BTF_KIND_VAR = 14, /* Variable */ BTF_KIND_DATASEC = 15, /* Section */ BTF_KIND_FLOAT = 16, /* Floating point */ - BTF_KIND_TAG = 17, /* Tag */ + BTF_KIND_DECL_TAG = 17, /* Decl Tag */
NR_BTF_KINDS, BTF_KIND_MAX = NR_BTF_KINDS - 1, @@ -174,14 +174,14 @@ struct btf_var_secinfo { __u32 size; };
-/* BTF_KIND_TAG is followed by a single "struct btf_tag" to describe +/* BTF_KIND_DECL_TAG is followed by a single "struct btf_decl_tag" to describe * additional information related to the tag applied location. * If component_idx == -1, the tag is applied to a struct, union, * variable or function. Otherwise, it is applied to a struct/union * member or a func argument, and component_idx indicates which member * or argument (0 ... vlen-1). */ -struct btf_tag { +struct btf_decl_tag { __s32 component_idx; };
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index d115133b10ad..f7963e208198 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -315,8 +315,8 @@ static int btf_type_size(const struct btf_type *t) return base_size + sizeof(struct btf_var); case BTF_KIND_DATASEC: return base_size + vlen * sizeof(struct btf_var_secinfo); - case BTF_KIND_TAG: - return base_size + sizeof(struct btf_tag); + case BTF_KIND_DECL_TAG: + return base_size + sizeof(struct btf_decl_tag); default: pr_debug("Unsupported BTF_KIND:%u\n", btf_kind(t)); return -EINVAL; @@ -389,8 +389,8 @@ static int btf_bswap_type_rest(struct btf_type *t) v->size = bswap_32(v->size); } return 0; - case BTF_KIND_TAG: - btf_tag(t)->component_idx = bswap_32(btf_tag(t)->component_idx); + case BTF_KIND_DECL_TAG: + btf_decl_tag(t)->component_idx = bswap_32(btf_decl_tag(t)->component_idx); return 0; default: pr_debug("Unsupported BTF_KIND:%u\n", btf_kind(t)); @@ -602,7 +602,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_CONST: case BTF_KIND_RESTRICT: case BTF_KIND_VAR: - case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: type_id = t->type; break; case BTF_KIND_ARRAY: @@ -2470,7 +2470,7 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __ }
/* - * Append new BTF_KIND_TAG type with: + * Append new BTF_KIND_DECL_TAG type with: * - *value* - non-empty/non-NULL string; * - *ref_type_id* - referenced type ID, it might not exist yet; * - *component_idx* - -1 for tagging reference type, otherwise struct/union @@ -2479,7 +2479,7 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __ * - >0, type ID of newly added BTF type; * - <0, on error. */ -int btf__add_tag(struct btf *btf, const char *value, int ref_type_id, +int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id, int component_idx) { struct btf_type *t; @@ -2494,7 +2494,7 @@ int btf__add_tag(struct btf *btf, const char *value, int ref_type_id, if (btf_ensure_modifiable(btf)) return libbpf_err(-ENOMEM);
- sz = sizeof(struct btf_type) + sizeof(struct btf_tag); + sz = sizeof(struct btf_type) + sizeof(struct btf_decl_tag); t = btf_add_type_mem(btf, sz); if (!t) return libbpf_err(-ENOMEM); @@ -2504,9 +2504,9 @@ int btf__add_tag(struct btf *btf, const char *value, int ref_type_id, return value_off;
t->name_off = value_off; - t->info = btf_type_info(BTF_KIND_TAG, 0, false); + t->info = btf_type_info(BTF_KIND_DECL_TAG, 0, false); t->type = ref_type_id; - btf_tag(t)->component_idx = component_idx; + btf_decl_tag(t)->component_idx = component_idx;
return btf_commit_type(btf, sz); } @@ -3330,7 +3330,7 @@ static bool btf_equal_common(struct btf_type *t1, struct btf_type *t2) }
/* Calculate type signature hash of INT or TAG. */ -static long btf_hash_int_tag(struct btf_type *t) +static long btf_hash_int_decl_tag(struct btf_type *t) { __u32 info = *(__u32 *)(t + 1); long h; @@ -3608,8 +3608,8 @@ static int btf_dedup_prep(struct btf_dedup *d) h = btf_hash_common(t); break; case BTF_KIND_INT: - case BTF_KIND_TAG: - h = btf_hash_int_tag(t); + case BTF_KIND_DECL_TAG: + h = btf_hash_int_decl_tag(t); break; case BTF_KIND_ENUM: h = btf_hash_enum(t); @@ -3664,11 +3664,11 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id) case BTF_KIND_FUNC_PROTO: case BTF_KIND_VAR: case BTF_KIND_DATASEC: - case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: return 0;
case BTF_KIND_INT: - h = btf_hash_int_tag(t); + h = btf_hash_int_decl_tag(t); for_each_dedup_cand(d, hash_entry, h) { cand_id = (__u32)(long)hash_entry->value; cand = btf_type_by_id(d->btf, cand_id); @@ -4285,13 +4285,13 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id) } break;
- case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: ref_type_id = btf_dedup_ref_type(d, t->type); if (ref_type_id < 0) return ref_type_id; t->type = ref_type_id;
- h = btf_hash_int_tag(t); + h = btf_hash_int_decl_tag(t); for_each_dedup_cand(d, hash_entry, h) { cand_id = (__u32)(long)hash_entry->value; cand = btf_type_by_id(d->btf, cand_id); @@ -4574,7 +4574,7 @@ int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ct case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_VAR: - case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: return visit(&t->type, ctx);
case BTF_KIND_ARRAY: { diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index bddc1d05f120..f91a91b4f0d2 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -167,7 +167,7 @@ LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __u32 byte_sz);
/* tag construction API */ -LIBBPF_API int btf__add_tag(struct btf *btf, const char *value, int ref_type_id, +LIBBPF_API int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id, int component_idx);
struct btf_dedup_opts { @@ -357,9 +357,9 @@ static inline bool btf_is_float(const struct btf_type *t) return btf_kind(t) == BTF_KIND_FLOAT; }
-static inline bool btf_is_tag(const struct btf_type *t) +static inline bool btf_is_decl_tag(const struct btf_type *t) { - return btf_kind(t) == BTF_KIND_TAG; + return btf_kind(t) == BTF_KIND_DECL_TAG; }
static inline __u8 btf_int_encoding(const struct btf_type *t) @@ -430,10 +430,10 @@ btf_var_secinfos(const struct btf_type *t) return (struct btf_var_secinfo *)(t + 1); }
-struct btf_tag; -static inline struct btf_tag *btf_tag(const struct btf_type *t) +struct btf_decl_tag; +static inline struct btf_decl_tag *btf_decl_tag(const struct btf_type *t) { - return (struct btf_tag *)(t + 1); + return (struct btf_decl_tag *)(t + 1); }
#ifdef __cplusplus diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 87a91b51b0d1..c4372186b18e 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -316,7 +316,7 @@ static int btf_dump_mark_referenced(struct btf_dump *d) case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_VAR: - case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: d->type_states[t->type].referenced = 1; break;
@@ -584,7 +584,7 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr) case BTF_KIND_FUNC: case BTF_KIND_VAR: case BTF_KIND_DATASEC: - case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: d->type_states[id].order_state = ORDERED; return 0;
@@ -2222,7 +2222,7 @@ static int btf_dump_dump_type_data(struct btf_dump *d, case BTF_KIND_FWD: case BTF_KIND_FUNC: case BTF_KIND_FUNC_PROTO: - case BTF_KIND_TAG: + case BTF_KIND_DECL_TAG: err = btf_dump_unsupported_data(d, t, id); break; case BTF_KIND_INT: diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 27b49e5a8e1d..0aee7e24e822 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -193,8 +193,8 @@ enum kern_feature_id { FEAT_MODULE_BTF, /* BTF_KIND_FLOAT support */ FEAT_BTF_FLOAT, - /* BTF_KIND_TAG support */ - FEAT_BTF_TAG, + /* BTF_KIND_DECL_TAG support */ + FEAT_BTF_DECL_TAG, __FEAT_CNT, };
@@ -2033,7 +2033,7 @@ static const char *__btf_kind_str(__u16 kind) case BTF_KIND_VAR: return "var"; case BTF_KIND_DATASEC: return "datasec"; case BTF_KIND_FLOAT: return "float"; - case BTF_KIND_TAG: return "tag"; + case BTF_KIND_DECL_TAG: return "decl_tag"; default: return "unknown"; } } @@ -2534,9 +2534,9 @@ static bool btf_needs_sanitization(struct bpf_object *obj) bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC); bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); - bool has_tag = kernel_supports(obj, FEAT_BTF_TAG); + bool has_decl_tag = kernel_supports(obj, FEAT_BTF_DECL_TAG);
- return !has_func || !has_datasec || !has_func_global || !has_float || !has_tag; + return !has_func || !has_datasec || !has_func_global || !has_float || !has_decl_tag; }
static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) @@ -2545,15 +2545,15 @@ static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC); bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); - bool has_tag = kernel_supports(obj, FEAT_BTF_TAG); + bool has_decl_tag = kernel_supports(obj, FEAT_BTF_DECL_TAG); struct btf_type *t; int i, j, vlen;
for (i = 1; i <= btf__get_nr_types(btf); i++) { t = (struct btf_type *)btf__type_by_id(btf, i);
- if ((!has_datasec && btf_is_var(t)) || (!has_tag && btf_is_tag(t))) { - /* replace VAR/TAG with INT */ + if ((!has_datasec && btf_is_var(t)) || (!has_decl_tag && btf_is_decl_tag(t))) { + /* replace VAR/DECL_TAG with INT */ t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0); /* * using size = 1 is the safest choice, 4 will be too @@ -4420,7 +4420,7 @@ static int probe_kern_btf_float(void) strs, sizeof(strs))); }
-static int probe_kern_btf_tag(void) +static int probe_kern_btf_decl_tag(void) { static const char strs[] = "\0tag"; __u32 types[] = { @@ -4430,7 +4430,7 @@ static int probe_kern_btf_tag(void) BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1), BTF_VAR_STATIC, /* attr */ - BTF_TYPE_TAG_ENC(1, 2, -1), + BTF_TYPE_DECL_TAG_ENC(1, 2, -1), };
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), @@ -4619,8 +4619,8 @@ static struct kern_feature_desc { [FEAT_BTF_FLOAT] = { "BTF_KIND_FLOAT support", probe_kern_btf_float, }, - [FEAT_BTF_TAG] = { - "BTF_KIND_TAG support", probe_kern_btf_tag, + [FEAT_BTF_DECL_TAG] = { + "BTF_KIND_DECL_TAG support", probe_kern_btf_decl_tag, }, };
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 4b78550810e5..5353be0a6625 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -391,5 +391,5 @@ LIBBPF_0.6.0 { bpf_program__insn_cnt; bpf_program__insns; btf__add_btf; - btf__add_tag; + btf__add_decl_tag; } LIBBPF_0.5.0; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 604a9a26773c..4d1095c4cadf 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -69,8 +69,8 @@ #define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size) #define BTF_TYPE_FLOAT_ENC(name, sz) \ BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FLOAT, 0, 0), sz) -#define BTF_TYPE_TAG_ENC(value, type, component_idx) \ - BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_TAG, 0, 0), type), (component_idx) +#define BTF_TYPE_DECL_TAG_ENC(value, type, component_idx) \ + BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_DECL_TAG, 0, 0), type), (component_idx)
#ifndef likely #define likely(x) __builtin_expect(!!(x), 1) diff --git a/tools/testing/selftests/bpf/test_btf.h b/tools/testing/selftests/bpf/test_btf.h index 2023725f1962..ed3fcdcf08be 100644 --- a/tools/testing/selftests/bpf/test_btf.h +++ b/tools/testing/selftests/bpf/test_btf.h @@ -66,4 +66,7 @@ #define BTF_FUNC_ENC(name, func_proto) \ BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 0), func_proto)
+#define BTF_DECL_TAG_ENC(value, type, component_idx) \ + BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_DECL_TAG, 0, 0), type), (component_idx) + #endif /* _TEST_BTF_H */
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.17-rc1 commit 8c42d2fa4eeab6c37a0b1b1aa7a2715248ef4f34 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
LLVM patches ([1] for clang, [2] and [3] for BPF backend) added support for btf_type_tag attributes. This patch added support for the kernel.
The main motivation for btf_type_tag is to bring kernel annotations __user, __rcu etc. to btf. With such information available in btf, bpf verifier can detect mis-usages and reject the program. For example, for __user tagged pointer, developers can then use proper helper like bpf_probe_read_user() etc. to read the data.
BTF_KIND_TYPE_TAG may also useful for other tracing facility where instead of to require user to specify kernel/user address type, the kernel can detect it by itself with btf.
[1] https://reviews.llvm.org/D111199 [2] https://reviews.llvm.org/D113222 [3] https://reviews.llvm.org/D113496
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com (cherry picked from commit 8c42d2fa4eeab6c37a0b1b1aa7a2715248ef4f34) Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/btf.h | 3 ++- kernel/bpf/btf.c | 14 +++++++++++++- tools/include/uapi/linux/btf.h | 3 ++- 3 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index deb12f755f0f..b0d8fea1951d 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -43,7 +43,7 @@ struct btf_type { * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO, VAR and DECL_TAG. + * FUNC, FUNC_PROTO, VAR, DECL_TAG and TYPE_TAG. * "type" is a type_id referring to another type. */ union { @@ -75,6 +75,7 @@ enum { BTF_KIND_DATASEC = 15, /* Section */ BTF_KIND_FLOAT = 16, /* Floating point */ BTF_KIND_DECL_TAG = 17, /* Decl Tag */ + BTF_KIND_TYPE_TAG = 18, /* Type Tag */
NR_BTF_KINDS, BTF_KIND_MAX = NR_BTF_KINDS - 1, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 650ed461872d..0921e2b0c101 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -281,6 +281,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_VAR] = "VAR", [BTF_KIND_DATASEC] = "DATASEC", [BTF_KIND_DECL_TAG] = "DECL_TAG", + [BTF_KIND_TYPE_TAG] = "TYPE_TAG", };
const char *btf_type_str(const struct btf_type *t) @@ -417,6 +418,7 @@ static bool btf_type_is_modifier(const struct btf_type *t) case BTF_KIND_VOLATILE: case BTF_KIND_CONST: case BTF_KIND_RESTRICT: + case BTF_KIND_TYPE_TAG: return true; }
@@ -1735,6 +1737,7 @@ __btf_resolve_size(const struct btf *btf, const struct btf_type *type, case BTF_KIND_VOLATILE: case BTF_KIND_CONST: case BTF_KIND_RESTRICT: + case BTF_KIND_TYPE_TAG: id = type->type; type = btf_type_by_id(btf, type->type); break; @@ -2343,6 +2346,8 @@ static int btf_ref_type_check_meta(struct btf_verifier_env *env, const struct btf_type *t, u32 meta_left) { + const char *value; + if (btf_type_vlen(t)) { btf_verifier_log_type(env, t, "vlen != 0"); return -EINVAL; @@ -2358,7 +2363,7 @@ static int btf_ref_type_check_meta(struct btf_verifier_env *env, return -EINVAL; }
- /* typedef type must have a valid name, and other ref types, + /* typedef/type_tag type must have a valid name, and other ref types, * volatile, const, restrict, should have a null name. */ if (BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF) { @@ -2367,6 +2372,12 @@ static int btf_ref_type_check_meta(struct btf_verifier_env *env, btf_verifier_log_type(env, t, "Invalid name"); return -EINVAL; } + } else if (BTF_INFO_KIND(t->info) == BTF_KIND_TYPE_TAG) { + value = btf_name_by_offset(env->btf, t->name_off); + if (!value || !value[0]) { + btf_verifier_log_type(env, t, "Invalid name"); + return -EINVAL; + } } else { if (t->name_off) { btf_verifier_log_type(env, t, "Invalid name"); @@ -3932,6 +3943,7 @@ static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS] = { [BTF_KIND_VAR] = &var_ops, [BTF_KIND_DATASEC] = &datasec_ops, [BTF_KIND_DECL_TAG] = &decl_tag_ops, + [BTF_KIND_TYPE_TAG] = &modifier_ops, };
static s32 btf_check_meta(struct btf_verifier_env *env, diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h index deb12f755f0f..b0d8fea1951d 100644 --- a/tools/include/uapi/linux/btf.h +++ b/tools/include/uapi/linux/btf.h @@ -43,7 +43,7 @@ struct btf_type { * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO, VAR and DECL_TAG. + * FUNC, FUNC_PROTO, VAR, DECL_TAG and TYPE_TAG. * "type" is a type_id referring to another type. */ union { @@ -75,6 +75,7 @@ enum { BTF_KIND_DATASEC = 15, /* Section */ BTF_KIND_FLOAT = 16, /* Floating point */ BTF_KIND_DECL_TAG = 17, /* Decl Tag */ + BTF_KIND_TYPE_TAG = 18, /* Type Tag */
NR_BTF_KINDS, BTF_KIND_MAX = NR_BTF_KINDS - 1,
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.17-rc1 commit 2dc1e488e5cdfd937554ca81fd46ad874d244b3f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add libbpf support for BTF_KIND_TYPE_TAG.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211112012614.1505315-1-yhs@fb.com (cherry picked from commit 2dc1e488e5cdfd937554ca81fd46ad874d244b3f) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.map
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 23 +++++++++++++++++++++++ tools/lib/bpf/btf.h | 9 ++++++++- tools/lib/bpf/btf_dump.c | 9 +++++++++ tools/lib/bpf/libbpf.c | 31 ++++++++++++++++++++++++++++++- tools/lib/bpf/libbpf.map | 1 + tools/lib/bpf/libbpf_internal.h | 2 ++ 6 files changed, 73 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index f7963e208198..881fae3f62c6 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -299,6 +299,7 @@ static int btf_type_size(const struct btf_type *t) case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_FLOAT: + case BTF_KIND_TYPE_TAG: return base_size; case BTF_KIND_INT: return base_size + sizeof(__u32); @@ -349,6 +350,7 @@ static int btf_bswap_type_rest(struct btf_type *t) case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_FLOAT: + case BTF_KIND_TYPE_TAG: return 0; case BTF_KIND_INT: *(__u32 *)(t + 1) = bswap_32(*(__u32 *)(t + 1)); @@ -644,6 +646,7 @@ int btf__align_of(const struct btf *btf, __u32 id) case BTF_KIND_VOLATILE: case BTF_KIND_CONST: case BTF_KIND_RESTRICT: + case BTF_KIND_TYPE_TAG: return btf__align_of(btf, t->type); case BTF_KIND_ARRAY: return btf__align_of(btf, btf_array(t)->type); @@ -2215,6 +2218,22 @@ int btf__add_restrict(struct btf *btf, int ref_type_id) return btf_add_ref_kind(btf, BTF_KIND_RESTRICT, NULL, ref_type_id); }
+/* + * Append new BTF_KIND_TYPE_TAG type with: + * - *value*, non-empty/non-NULL tag value; + * - *ref_type_id* - referenced type ID, it might not exist yet; + * Returns: + * - >0, type ID of newly added BTF type; + * - <0, on error. + */ +int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id) +{ + if (!value|| !value[0]) + return libbpf_err(-EINVAL); + + return btf_add_ref_kind(btf, BTF_KIND_TYPE_TAG, value, ref_type_id); +} + /* * Append new BTF_KIND_FUNC type with: * - *name*, non-empty/non-NULL name; @@ -3605,6 +3624,7 @@ static int btf_dedup_prep(struct btf_dedup *d) case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_FLOAT: + case BTF_KIND_TYPE_TAG: h = btf_hash_common(t); break; case BTF_KIND_INT: @@ -3665,6 +3685,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id) case BTF_KIND_VAR: case BTF_KIND_DATASEC: case BTF_KIND_DECL_TAG: + case BTF_KIND_TYPE_TAG: return 0;
case BTF_KIND_INT: @@ -4269,6 +4290,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id) case BTF_KIND_PTR: case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: + case BTF_KIND_TYPE_TAG: ref_type_id = btf_dedup_ref_type(d, t->type); if (ref_type_id < 0) return ref_type_id; @@ -4575,6 +4597,7 @@ int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ct case BTF_KIND_FUNC: case BTF_KIND_VAR: case BTF_KIND_DECL_TAG: + case BTF_KIND_TYPE_TAG: return visit(&t->type, ctx);
case BTF_KIND_ARRAY: { diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index f91a91b4f0d2..b436a4f6ae2e 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -153,6 +153,7 @@ LIBBPF_API int btf__add_typedef(struct btf *btf, const char *name, int ref_type_ LIBBPF_API int btf__add_volatile(struct btf *btf, int ref_type_id); LIBBPF_API int btf__add_const(struct btf *btf, int ref_type_id); LIBBPF_API int btf__add_restrict(struct btf *btf, int ref_type_id); +LIBBPF_API int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id);
/* func and func_proto construction APIs */ LIBBPF_API int btf__add_func(struct btf *btf, const char *name, @@ -329,7 +330,8 @@ static inline bool btf_is_mod(const struct btf_type *t)
return kind == BTF_KIND_VOLATILE || kind == BTF_KIND_CONST || - kind == BTF_KIND_RESTRICT; + kind == BTF_KIND_RESTRICT || + kind == BTF_KIND_TYPE_TAG; }
static inline bool btf_is_func(const struct btf_type *t) @@ -362,6 +364,11 @@ static inline bool btf_is_decl_tag(const struct btf_type *t) return btf_kind(t) == BTF_KIND_DECL_TAG; }
+static inline bool btf_is_type_tag(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_TYPE_TAG; +} + static inline __u8 btf_int_encoding(const struct btf_type *t) { return BTF_INT_ENCODING(*(__u32 *)(t + 1)); diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index c4372186b18e..3c8e98e34e9d 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -317,6 +317,7 @@ static int btf_dump_mark_referenced(struct btf_dump *d) case BTF_KIND_FUNC: case BTF_KIND_VAR: case BTF_KIND_DECL_TAG: + case BTF_KIND_TYPE_TAG: d->type_states[t->type].referenced = 1; break;
@@ -560,6 +561,7 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr) case BTF_KIND_VOLATILE: case BTF_KIND_CONST: case BTF_KIND_RESTRICT: + case BTF_KIND_TYPE_TAG: return btf_dump_order_type(d, t->type, through_ptr);
case BTF_KIND_FUNC_PROTO: { @@ -734,6 +736,7 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id) case BTF_KIND_VOLATILE: case BTF_KIND_CONST: case BTF_KIND_RESTRICT: + case BTF_KIND_TYPE_TAG: btf_dump_emit_type(d, t->type, cont_id); break; case BTF_KIND_ARRAY: @@ -1154,6 +1157,7 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id, case BTF_KIND_CONST: case BTF_KIND_RESTRICT: case BTF_KIND_FUNC_PROTO: + case BTF_KIND_TYPE_TAG: id = t->type; break; case BTF_KIND_ARRAY: @@ -1322,6 +1326,11 @@ static void btf_dump_emit_type_chain(struct btf_dump *d, case BTF_KIND_RESTRICT: btf_dump_printf(d, " restrict"); break; + case BTF_KIND_TYPE_TAG: + btf_dump_emit_mods(d, decls); + name = btf_name_of(d, t->name_off); + btf_dump_printf(d, " __attribute__((btf_type_tag("%s")))", name); + break; case BTF_KIND_ARRAY: { const struct btf_array *a = btf_array(t); const struct btf_type *next_t; diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0aee7e24e822..e2a1670f2bd7 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -195,6 +195,8 @@ enum kern_feature_id { FEAT_BTF_FLOAT, /* BTF_KIND_DECL_TAG support */ FEAT_BTF_DECL_TAG, + /* BTF_KIND_TYPE_TAG support */ + FEAT_BTF_TYPE_TAG, __FEAT_CNT, };
@@ -2034,6 +2036,7 @@ static const char *__btf_kind_str(__u16 kind) case BTF_KIND_DATASEC: return "datasec"; case BTF_KIND_FLOAT: return "float"; case BTF_KIND_DECL_TAG: return "decl_tag"; + case BTF_KIND_TYPE_TAG: return "type_tag"; default: return "unknown"; } } @@ -2535,8 +2538,10 @@ static bool btf_needs_sanitization(struct bpf_object *obj) bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); bool has_decl_tag = kernel_supports(obj, FEAT_BTF_DECL_TAG); + bool has_type_tag = kernel_supports(obj, FEAT_BTF_TYPE_TAG);
- return !has_func || !has_datasec || !has_func_global || !has_float || !has_decl_tag; + return !has_func || !has_datasec || !has_func_global || !has_float || + !has_decl_tag || !has_type_tag; }
static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) @@ -2546,6 +2551,7 @@ static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT); bool has_func = kernel_supports(obj, FEAT_BTF_FUNC); bool has_decl_tag = kernel_supports(obj, FEAT_BTF_DECL_TAG); + bool has_type_tag = kernel_supports(obj, FEAT_BTF_TYPE_TAG); struct btf_type *t; int i, j, vlen;
@@ -2604,6 +2610,10 @@ static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) */ t->name_off = 0; t->info = BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 0); + } else if (!has_type_tag && btf_is_type_tag(t)) { + /* replace TYPE_TAG with a CONST */ + t->name_off = 0; + t->info = BTF_INFO_ENC(BTF_KIND_CONST, 0, 0); } } } @@ -4437,6 +4447,22 @@ static int probe_kern_btf_decl_tag(void) strs, sizeof(strs))); }
+static int probe_kern_btf_type_tag(void) +{ + static const char strs[] = "\0tag"; + __u32 types[] = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* attr */ + BTF_TYPE_TYPE_TAG_ENC(1, 1), /* [2] */ + /* ptr */ + BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 2), /* [3] */ + }; + + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); +} + static int probe_kern_array_mmap(void) { struct bpf_create_map_attr attr = { @@ -4622,6 +4648,9 @@ static struct kern_feature_desc { [FEAT_BTF_DECL_TAG] = { "BTF_KIND_DECL_TAG support", probe_kern_btf_decl_tag, }, + [FEAT_BTF_TYPE_TAG] = { + "BTF_KIND_TYPE_TAG support", probe_kern_btf_type_tag, + }, };
static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id) diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 5353be0a6625..a0ada6815597 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -392,4 +392,5 @@ LIBBPF_0.6.0 { bpf_program__insns; btf__add_btf; btf__add_decl_tag; + btf__add_type_tag; } LIBBPF_0.5.0; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 4d1095c4cadf..bdb9c7bc6382 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -71,6 +71,8 @@ BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FLOAT, 0, 0), sz) #define BTF_TYPE_DECL_TAG_ENC(value, type, component_idx) \ BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_DECL_TAG, 0, 0), type), (component_idx) +#define BTF_TYPE_TYPE_TAG_ENC(value, type) \ + BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_TYPE_TAG, 0, 0), type)
#ifndef likely #define likely(x) __builtin_expect(!!(x), 1)
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.17-rc1 commit 3da5ba6f0509ace03cad38b554c89797129e90be category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add bpftool support for BTF_KIND_TYPE_TAG.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211112012620.1505506-1-yhs@fb.com (cherry picked from commit 3da5ba6f0509ace03cad38b554c89797129e90be) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/bpf/bpftool/btf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 52cb61ef7331..e46c60a1d7aa 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -37,6 +37,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_VAR] = "VAR", [BTF_KIND_DATASEC] = "DATASEC", [BTF_KIND_DECL_TAG] = "DECL_TAG", + [BTF_KIND_TYPE_TAG] = "TYPE_TAG", };
struct btf_attach_table { @@ -141,6 +142,7 @@ static int dump_btf_type(const struct btf *btf, __u32 id, case BTF_KIND_VOLATILE: case BTF_KIND_RESTRICT: case BTF_KIND_TYPEDEF: + case BTF_KIND_TYPE_TAG: if (json_output) jsonw_uint_field(w, "type_id", t->type); else
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.17-rc1 commit d52f5c639dd8605d2563b77b190e278f615a2b8a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add BTF_KIND_TYPE_TAG documentation in btf.rst.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211112012656.1509082-1-yhs@fb.com (cherry picked from commit d52f5c639dd8605d2563b77b190e278f615a2b8a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- Documentation/bpf/btf.rst | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst index 672aa0abf49f..183f398c68d6 100644 --- a/Documentation/bpf/btf.rst +++ b/Documentation/bpf/btf.rst @@ -85,6 +85,7 @@ sequentially and type id is assigned to each recognized type starting from id #define BTF_KIND_VAR 14 /* Variable */ #define BTF_KIND_DATASEC 15 /* Section */ #define BTF_KIND_DECL_TAG 17 /* Decl Tag */ + #define BTF_KIND_TYPE_TAG 18 /* Type Tag */
Note that the type section encodes debug info, not just pure types. ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram. @@ -106,7 +107,7 @@ Each type contains the following common data:: * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC, FUNC_PROTO and DECL_TAG. + * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG. * "type" is a type_id referring to another type. */ union { @@ -479,6 +480,16 @@ the attribute is applied to a ``struct``/``union`` member or a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a valid index (starting from 0) pointing to a member or an argument.
+2.2.17 BTF_KIND_TYPE_TAG +~~~~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: offset to a non-empty string + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_TYPE_TAG + * ``info.vlen``: 0 + * ``type``: the type with ``btf_type_tag`` attribute + 3. BTF Kernel API *****************
From: Yonghong Song yhs@fb.com
mainline inclusion from mainline-5.17-rc1 commit 69a055d546156adc6f7727ec981f721d5ba9231a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Commit 2dc1e488e5cd ("libbpf: Support BTF_KIND_TYPE_TAG") added the BTF_KIND_TYPE_TAG support. But to test vmlinux build with ...
#define __user __attribute__((btf_type_tag("user")))
... I needed to sync libbpf repo and manually copy libbpf sources to pahole. To simplify process, I used BTF_KIND_RESTRICT to simulate the BTF_KIND_TYPE_TAG with vmlinux build as "restrict" modifier is barely used in kernel.
But this approach missed one case in dedup with structures where BTF_KIND_RESTRICT is handled and BTF_KIND_TYPE_TAG is not handled in btf_dedup_is_equiv(), and this will result in a pahole dedup failure. This patch fixed this issue and a selftest is added in the subsequent patch to test this scenario.
The other missed handling is in btf__resolve_size(). Currently the compiler always emit like PTR->TYPE_TAG->... so in practice we don't hit the missing BTF_KIND_TYPE_TAG handling issue with compiler generated code. But lets add case BTF_KIND_TYPE_TAG in the switch statement to be future proof.
Fixes: 2dc1e488e5cd ("libbpf: Support BTF_KIND_TYPE_TAG") Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211115163937.3922235-1-yhs@fb.com (cherry picked from commit 69a055d546156adc6f7727ec981f721d5ba9231a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 881fae3f62c6..da1f7934fedc 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -605,6 +605,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_RESTRICT: case BTF_KIND_VAR: case BTF_KIND_DECL_TAG: + case BTF_KIND_TYPE_TAG: type_id = t->type; break; case BTF_KIND_ARRAY: @@ -3995,6 +3996,7 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, case BTF_KIND_PTR: case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: + case BTF_KIND_TYPE_TAG: if (cand_type->info != canon_type->info) return 0; return btf_dedup_is_equiv(d, cand_type->type, canon_type->type);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 16e0c35c6f7a2e90d52f3035ecf942af21417b7b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Load global data maps lazily, if kernel is too old to support global data. Make sure that programs are still correct by detecting if any of the to-be-loaded programs have relocation against any of such maps.
This allows to solve the issue ([0]) with bpf_printk() and Clang generating unnecessary and unreferenced .rodata.strX.Y sections, but it also goes further along the CO-RE lines, allowing to have a BPF object in which some code can work on very old kernels and relies only on BPF maps explicitly, while other BPF programs might enjoy global variable support. If such programs are correctly set to not load at runtime on old kernels, bpf_object will load and function correctly now.
[0] https://lore.kernel.org/bpf/CAK-59YFPU3qO+_pXWOH+c1LSA=8WA1yabJZfREjOEXNHAqg...
Fixes: aed659170a31 ("libbpf: Support multiple .rodata.* and .data.* BPF maps") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211123200105.387855-1-andrii@kernel.org (cherry picked from commit 16e0c35c6f7a2e90d52f3035ecf942af21417b7b) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e2a1670f2bd7..9b904130cd3e 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4950,6 +4950,24 @@ bpf_object__create_maps(struct bpf_object *obj) for (i = 0; i < obj->nr_maps; i++) { map = &obj->maps[i];
+ /* To support old kernels, we skip creating global data maps + * (.rodata, .data, .kconfig, etc); later on, during program + * loading, if we detect that at least one of the to-be-loaded + * programs is referencing any global data map, we'll error + * out with program name and relocation index logged. + * This approach allows to accommodate Clang emitting + * unnecessary .rodata.str1.1 sections for string literals, + * but also it allows to have CO-RE applications that use + * global variables in some of BPF programs, but not others. + * If those global variable-using programs are not loaded at + * runtime due to bpf_program__set_autoload(prog, false), + * bpf_object loading will succeed just fine even on old + * kernels. + */ + if (bpf_map__is_internal(map) && + !kernel_supports(obj, FEAT_GLOBAL_DATA)) + continue; + retried = false; retry: if (map->pin_path) { @@ -5549,6 +5567,14 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE; insn[0].imm = relo->map_idx; } else { + const struct bpf_map *map = &obj->maps[relo->map_idx]; + + if (bpf_map__is_internal(map) && + !kernel_supports(obj, FEAT_GLOBAL_DATA)) { + pr_warn("prog '%s': relo #%d: kernel doesn't support global data\n", + prog->name, i); + return -ENOTSUP; + } insn[0].src_reg = BPF_PSEUDO_MAP_VALUE; insn[0].imm = obj->maps[relo->map_idx].fd; } @@ -6077,6 +6103,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) */ if (prog_is_subprog(obj, prog)) continue; + if (!prog->load) + continue;
err = bpf_object__relocate_calls(obj, prog); if (err) { @@ -6090,6 +6118,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) prog = &obj->programs[i]; if (prog_is_subprog(obj, prog)) continue; + if (!prog->load) + continue; err = bpf_object__relocate_data(obj, prog); if (err) { pr_warn("prog '%s': failed to relocate data references: %d\n", @@ -6875,10 +6905,6 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj) bpf_object__for_each_map(m, obj) { if (!bpf_map__is_internal(m)) continue; - if (!kernel_supports(obj, FEAT_GLOBAL_DATA)) { - pr_warn("kernel doesn't support global data\n"); - return -ENOTSUP; - } if (!kernel_supports(obj, FEAT_ARRAY_MMAP)) m->def.map_flags ^= BPF_F_MMAPABLE; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 401891a9debaf0a684502f2aaecf53448cee9414 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Perform a memory copy before we do the sanity checks of btf_ext_hdr. This prevents misaligned memory access if raw btf_ext data is not 4-byte aligned ([0]).
While at it, also add missing const qualifier.
[0] Closes: https://github.com/libbpf/libbpf/issues/391
Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Reported-by: Evgeny Vereshchagin evvers@ya.ru Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-3-andrii@kernel.org (cherry picked from commit 401891a9debaf0a684502f2aaecf53448cee9414) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 10 +++++----- tools/lib/bpf/btf.h | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index da1f7934fedc..deccd04e377b 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -2711,15 +2711,11 @@ void btf_ext__free(struct btf_ext *btf_ext) free(btf_ext); }
-struct btf_ext *btf_ext__new(__u8 *data, __u32 size) +struct btf_ext *btf_ext__new(const __u8 *data, __u32 size) { struct btf_ext *btf_ext; int err;
- err = btf_ext_parse_hdr(data, size); - if (err) - return libbpf_err_ptr(err); - btf_ext = calloc(1, sizeof(struct btf_ext)); if (!btf_ext) return libbpf_err_ptr(-ENOMEM); @@ -2732,6 +2728,10 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size) } memcpy(btf_ext->data, data, size);
+ err = btf_ext_parse_hdr(btf_ext->data, size); + if (err) + goto done; + if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, line_info_len)) { err = -EINVAL; goto done; diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index b436a4f6ae2e..d6c135de9a8a 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -83,7 +83,7 @@ LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, __u32 expected_value_size, __u32 *key_type_id, __u32 *value_type_id);
-LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size); +LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size); LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext); LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 2a6a9bf26170b4e156c18706cd230934ebd2f95f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Sanitizer complains about qsort(), bsearch(), and memcpy() being called with NULL pointer. This can only happen when the associated number of elements is zero, so no harm should be done. But still prevent this from happening to keep sanitizer runs clean from extra noise.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-5-andrii@kernel.org (cherry picked from commit 2a6a9bf26170b4e156c18706cd230934ebd2f95f) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 9b904130cd3e..5be79487cb9d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3298,7 +3298,8 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
/* sort BPF programs by section name and in-section instruction offset * for faster search */ - qsort(obj->programs, obj->nr_programs, sizeof(*obj->programs), cmp_progs); + if (obj->nr_programs) + qsort(obj->programs, obj->nr_programs, sizeof(*obj->programs), cmp_progs);
return bpf_object__init_btf(obj, btf_data, btf_ext_data); } @@ -5780,6 +5781,8 @@ static int cmp_relo_by_insn_idx(const void *key, const void *elem)
static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, size_t insn_idx) { + if (!prog->nr_reloc) + return NULL; return bsearch(&insn_idx, prog->reloc_desc, prog->nr_reloc, sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx); } @@ -5795,8 +5798,9 @@ static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_progra relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos)); if (!relos) return -ENOMEM; - memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc, - sizeof(*relos) * subprog->nr_reloc); + if (subprog->nr_reloc) + memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc, + sizeof(*relos) * subprog->nr_reloc);
for (i = main_prog->nr_reloc; i < new_cnt; i++) relos[i].insn_idx += subprog->sub_insn_off;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 8cb125566c40b7141d8842c534f0ea5820ee3d5c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
glob_syms array wasn't freed on bpf_link__free(). Fix that.
Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-6-andrii@kernel.org (cherry picked from commit 8cb125566c40b7141d8842c534f0ea5820ee3d5c) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index db97d3bfdc5e..c03aab74ca9a 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -210,6 +210,7 @@ void bpf_linker__free(struct bpf_linker *linker) } free(linker->secs);
+ free(linker->glob_syms); free(linker); }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 593835377f24ca1bb98008ec1dc3baefe491ad6e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
add_dst_sec() can invalidate bpf_linker's section index making dst_symtab pointer pointing into unallocated memory. Reinitialize dst_symtab pointer on each iteration to make sure it's always valid.
Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-7-andrii@kernel.org (cherry picked from commit 593835377f24ca1bb98008ec1dc3baefe491ad6e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/linker.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index c03aab74ca9a..3dd7360b7fec 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -2000,7 +2000,7 @@ static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj, static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *obj) { struct src_sec *src_symtab = &obj->secs[obj->symtab_sec_idx]; - struct dst_sec *dst_symtab = &linker->secs[linker->symtab_sec_idx]; + struct dst_sec *dst_symtab; int i, err;
for (i = 1; i < obj->sec_cnt; i++) { @@ -2033,6 +2033,9 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob return -1; }
+ /* add_dst_sec() above could have invalidated linker->secs */ + dst_symtab = &linker->secs[linker->symtab_sec_idx]; + /* shdr->sh_link points to SYMTAB */ dst_sec->shdr->sh_link = linker->symtab_sec_idx;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 3bd0233f388e061c44d36a1ac614a3bb4a851b7e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Prevent sanitizer from complaining about passing NULL into memcpy(), even if it happens with zero size.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-9-andrii@kernel.org (cherry picked from commit 3bd0233f388e061c44d36a1ac614a3bb4a851b7e) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/core_reloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index a47cd67c9451..636ea0287e83 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -891,7 +891,8 @@ void test_core_reloc(void) data = mmap_data;
memset(mmap_data, 0, sizeof(*data)); - memcpy(data->in, test_case->input, test_case->input_len); + if (test_case->input_len) + memcpy(data->in, test_case->input, test_case->input_len); data->my_pid_tgid = my_pid_tgid;
link = bpf_program__attach_raw_tracepoint(prog, tp_name);
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 6c4dedb7550aafd094f7d803668fd039545f4e57 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Perfbuf doesn't guarantee 8-byte alignment of the data like BPF ringbuf does, so struct get_stack_trace_t can arrive not properly aligned for subsequent u64 accesses. Easiest fix is to just copy data locally.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-10-andrii@kernel.org (cherry picked from commit 6c4dedb7550aafd094f7d803668fd039545f4e57) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/get_stack_raw_tp.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c index 522237aa4470..47f1020fa5b3 100644 --- a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c @@ -24,13 +24,19 @@ static void get_stack_print_output(void *ctx, int cpu, void *data, __u32 size) { bool good_kern_stack = false, good_user_stack = false; const char *nonjit_func = "___bpf_prog_run"; - struct get_stack_trace_t *e = data; + /* perfbuf-submitted data is 4-byte aligned, but we need 8-byte + * alignment, so copy data into a local variable, for simplicity + */ + struct get_stack_trace_t e; int i, num_stack; static __u64 cnt; struct ksym *ks;
cnt++;
+ memset(&e, 0, sizeof(e)); + memcpy(&e, data, size <= sizeof(e) ? size : sizeof(e)); + if (size < sizeof(struct get_stack_trace_t)) { __u64 *raw_data = data; bool found = false; @@ -57,19 +63,19 @@ static void get_stack_print_output(void *ctx, int cpu, void *data, __u32 size) good_user_stack = true; } } else { - num_stack = e->kern_stack_size / sizeof(__u64); + num_stack = e.kern_stack_size / sizeof(__u64); if (env.jit_enabled) { good_kern_stack = num_stack > 0; } else { for (i = 0; i < num_stack; i++) { - ks = ksym_search(e->kern_stack[i]); + ks = ksym_search(e.kern_stack[i]); if (ks && (strcmp(ks->name, nonjit_func) == 0)) { good_kern_stack = true; break; } } } - if (e->user_stack_size > 0 && e->user_stack_buildid_size > 0) + if (e.user_stack_size > 0 && e.user_stack_buildid_size > 0) good_user_stack = true; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit e2e0d90c550a2588ebed7aa2753adaac0f633989 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Copy over iphdr into a local variable before accessing its fields.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-11-andrii@kernel.org (cherry picked from commit e2e0d90c550a2588ebed7aa2753adaac0f633989) Signed-off-by: Wang Yufen wangyufen@huawei.com --- .../selftests/bpf/prog_tests/queue_stack_map.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/queue_stack_map.c b/tools/testing/selftests/bpf/prog_tests/queue_stack_map.c index f47e7b1cb32c..cb3f64ef5a7a 100644 --- a/tools/testing/selftests/bpf/prog_tests/queue_stack_map.c +++ b/tools/testing/selftests/bpf/prog_tests/queue_stack_map.c @@ -14,7 +14,7 @@ static void test_queue_stack_map_by_type(int type) int i, err, prog_fd, map_in_fd, map_out_fd; char file[32], buf[128]; struct bpf_object *obj; - struct iphdr *iph = (void *)buf + sizeof(struct ethhdr); + struct iphdr iph;
/* Fill test values to be used */ for (i = 0; i < MAP_SIZE; i++) @@ -60,15 +60,17 @@ static void test_queue_stack_map_by_type(int type)
err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4), buf, &size, &retval, &duration); - if (err || retval || size != sizeof(pkt_v4) || - iph->daddr != val) + if (err || retval || size != sizeof(pkt_v4)) + break; + memcpy(&iph, buf + sizeof(struct ethhdr), sizeof(iph)); + if (iph.daddr != val) break; }
- CHECK(err || retval || size != sizeof(pkt_v4) || iph->daddr != val, + CHECK(err || retval || size != sizeof(pkt_v4) || iph.daddr != val, "bpf_map_pop_elem", "err %d errno %d retval %d size %d iph->daddr %u\n", - err, errno, retval, size, iph->daddr); + err, errno, retval, size, iph.daddr);
/* Queue is empty, program should return TC_ACT_SHOT */ err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 57428298b5acf2ba2dd98359c532774f6eaeecb3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Buf can be not zero-terminated leading to strstr() to access data beyond the intended buf[] array. Fix by forcing zero termination.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-12-andrii@kernel.org (cherry picked from commit 57428298b5acf2ba2dd98359c532774f6eaeecb3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/test_bpffs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpffs.c b/tools/testing/selftests/bpf/prog_tests/test_bpffs.c index 172c999e523c..506e098b17db 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_bpffs.c +++ b/tools/testing/selftests/bpf/prog_tests/test_bpffs.c @@ -18,11 +18,13 @@ static int read_iter(char *file) fd = open(file, 0); if (fd < 0) return -1; - while ((len = read(fd, buf, sizeof(buf))) > 0) + while ((len = read(fd, buf, sizeof(buf))) > 0) { + buf[sizeof(buf) - 1] = '\0'; if (strstr(buf, "iter")) { close(fd); return 0; } + } close(fd); return -1; }
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 8f6f41f39348f25db843f2fcb2f1c166b4bfa2d7 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Similar to previous patch, just copy over necessary struct into local stack variable before checking its fields.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211124002325.1737739-14-andrii@kernel.org (cherry picked from commit 8f6f41f39348f25db843f2fcb2f1c166b4bfa2d7) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/testing/selftests/bpf/prog_tests/xdp.c | 11 ++++++----- tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c | 6 +++--- 2 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp.c b/tools/testing/selftests/bpf/prog_tests/xdp.c index 48921ff74850..d50e92c89840 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp.c @@ -11,8 +11,8 @@ void test_xdp(void) const char *file = "./test_xdp.o"; struct bpf_object *obj; char buf[128]; - struct ipv6hdr *iph6 = (void *)buf + sizeof(struct ethhdr); - struct iphdr *iph = (void *)buf + sizeof(struct ethhdr); + struct ipv6hdr iph6; + struct iphdr iph; __u32 duration, retval, size; int err, prog_fd, map_fd;
@@ -28,16 +28,17 @@ void test_xdp(void)
err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4), buf, &size, &retval, &duration); - + memcpy(&iph, buf + sizeof(struct ethhdr), sizeof(iph)); CHECK(err || retval != XDP_TX || size != 74 || - iph->protocol != IPPROTO_IPIP, "ipv4", + iph.protocol != IPPROTO_IPIP, "ipv4", "err %d errno %d retval %d size %d\n", err, errno, retval, size);
err = bpf_prog_test_run(prog_fd, 1, &pkt_v6, sizeof(pkt_v6), buf, &size, &retval, &duration); + memcpy(&iph6, buf + sizeof(struct ethhdr), sizeof(iph6)); CHECK(err || retval != XDP_TX || size != 114 || - iph6->nexthdr != IPPROTO_IPV6, "ipv6", + iph6.nexthdr != IPPROTO_IPV6, "ipv6", "err %d errno %d retval %d size %d\n", err, errno, retval, size); out: diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c index 3bd5904b4db5..4e7a22b984a8 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c @@ -42,7 +42,7 @@ void test_xdp_bpf2bpf(void) char buf[128]; int err, pkt_fd, map_fd; bool passed = false; - struct iphdr *iph = (void *)buf + sizeof(struct ethhdr); + struct iphdr iph; struct iptnl_info value4 = {.family = AF_INET}; struct test_xdp *pkt_skel = NULL; struct test_xdp_bpf2bpf *ftrace_skel = NULL; @@ -96,9 +96,9 @@ void test_xdp_bpf2bpf(void) /* Run test program */ err = bpf_prog_test_run(pkt_fd, 1, &pkt_v4, sizeof(pkt_v4), buf, &size, &retval, &duration); - + memcpy(&iph, buf + sizeof(struct ethhdr), sizeof(iph)); if (CHECK(err || retval != XDP_TX || size != 74 || - iph->protocol != IPPROTO_IPIP, "ipv4", + iph.protocol != IPPROTO_IPIP, "ipv4", "err %d errno %d retval %d size %d\n", err, errno, retval, size)) goto out;
From: Hengqi Chen hengqi.chen@gmail.com
mainline inclusion from mainline-5.17-rc1 commit 341ac5ffc4bd859103899c876902caf07cc97ea4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Support static initialization of BPF_MAP_TYPE_PROG_ARRAY with a syntax similar to map-in-map initialization ([0]):
SEC("socket") int tailcall_1(void *ctx) { return 0; }
struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 2); __uint(key_size, sizeof(__u32)); __array(values, int (void *)); } prog_array_init SEC(".maps") = { .values = { [1] = (void *)&tailcall_1, }, };
Here's the relevant part of libbpf debug log showing what's going on with prog-array initialization:
libbpf: sec '.relsocket': collecting relocation for section(3) 'socket' libbpf: sec '.relsocket': relo #0: insn #2 against 'prog_array_init' libbpf: prog 'entry': found map 0 (prog_array_init, sec 4, off 0) for insn #0 libbpf: .maps relo #0: for 3 value 0 rel->r_offset 32 name 53 ('tailcall_1') libbpf: .maps relo #0: map 'prog_array_init' slot [1] points to prog 'tailcall_1' libbpf: map 'prog_array_init': created successfully, fd=5 libbpf: map 'prog_array_init': slot [1] set to prog 'tailcall_1' fd=6
[0] Closes: https://github.com/libbpf/libbpf/issues/354
Signed-off-by: Hengqi Chen hengqi.chen@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211128141633.502339-2-hengqi.chen@gmail.com (cherry picked from commit 341ac5ffc4bd859103899c876902caf07cc97ea4) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 154 ++++++++++++++++++++++++++++++++--------- 1 file changed, 121 insertions(+), 33 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 5be79487cb9d..06fc745325de 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2216,6 +2216,9 @@ int parse_btf_map_def(const char *map_name, struct btf *btf, map_def->parts |= MAP_DEF_VALUE_SIZE | MAP_DEF_VALUE_TYPE; } else if (strcmp(name, "values") == 0) { + bool is_map_in_map = bpf_map_type__is_map_in_map(map_def->map_type); + bool is_prog_array = map_def->map_type == BPF_MAP_TYPE_PROG_ARRAY; + const char *desc = is_map_in_map ? "map-in-map inner" : "prog-array value"; char inner_map_name[128]; int err;
@@ -2229,8 +2232,8 @@ int parse_btf_map_def(const char *map_name, struct btf *btf, map_name, name); return -EINVAL; } - if (!bpf_map_type__is_map_in_map(map_def->map_type)) { - pr_warn("map '%s': should be map-in-map.\n", + if (!is_map_in_map && !is_prog_array) { + pr_warn("map '%s': should be map-in-map or prog-array.\n", map_name); return -ENOTSUP; } @@ -2242,22 +2245,30 @@ int parse_btf_map_def(const char *map_name, struct btf *btf, map_def->value_size = 4; t = btf__type_by_id(btf, m->type); if (!t) { - pr_warn("map '%s': map-in-map inner type [%d] not found.\n", - map_name, m->type); + pr_warn("map '%s': %s type [%d] not found.\n", + map_name, desc, m->type); return -EINVAL; } if (!btf_is_array(t) || btf_array(t)->nelems) { - pr_warn("map '%s': map-in-map inner spec is not a zero-sized array.\n", - map_name); + pr_warn("map '%s': %s spec is not a zero-sized array.\n", + map_name, desc); return -EINVAL; } t = skip_mods_and_typedefs(btf, btf_array(t)->type, NULL); if (!btf_is_ptr(t)) { - pr_warn("map '%s': map-in-map inner def is of unexpected kind %s.\n", - map_name, btf_kind_str(t)); + pr_warn("map '%s': %s def is of unexpected kind %s.\n", + map_name, desc, btf_kind_str(t)); return -EINVAL; } t = skip_mods_and_typedefs(btf, t->type, NULL); + if (is_prog_array) { + if (!btf_is_func_proto(t)) { + pr_warn("map '%s': prog-array value def is of unexpected kind %s.\n", + map_name, btf_kind_str(t)); + return -EINVAL; + } + continue; + } if (!btf_is_struct(t)) { pr_warn("map '%s': map-in-map inner def is of unexpected kind %s.\n", map_name, btf_kind_str(t)); @@ -4903,7 +4914,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b return err; }
-static int init_map_slots(struct bpf_object *obj, struct bpf_map *map) +static int init_map_in_map_slots(struct bpf_object *obj, struct bpf_map *map) { const struct bpf_map *targ_map; unsigned int i; @@ -4915,6 +4926,7 @@ static int init_map_slots(struct bpf_object *obj, struct bpf_map *map)
targ_map = map->init_slots[i]; fd = bpf_map__fd(targ_map); + if (obj->gen_loader) { pr_warn("// TODO map_update_elem: idx %td key %d value==map_idx %td\n", map - obj->maps, i, targ_map - obj->maps); @@ -4925,8 +4937,7 @@ static int init_map_slots(struct bpf_object *obj, struct bpf_map *map) if (err) { err = -errno; pr_warn("map '%s': failed to initialize slot [%d] to map '%s' fd=%d: %d\n", - map->name, i, targ_map->name, - fd, err); + map->name, i, targ_map->name, fd, err); return err; } pr_debug("map '%s': slot [%d] set to map '%s' fd=%d\n", @@ -4939,6 +4950,59 @@ static int init_map_slots(struct bpf_object *obj, struct bpf_map *map) return 0; }
+static int init_prog_array_slots(struct bpf_object *obj, struct bpf_map *map) +{ + const struct bpf_program *targ_prog; + unsigned int i; + int fd, err; + + if (obj->gen_loader) + return -ENOTSUP; + + for (i = 0; i < map->init_slots_sz; i++) { + if (!map->init_slots[i]) + continue; + + targ_prog = map->init_slots[i]; + fd = bpf_program__fd(targ_prog); + + err = bpf_map_update_elem(map->fd, &i, &fd, 0); + if (err) { + err = -errno; + pr_warn("map '%s': failed to initialize slot [%d] to prog '%s' fd=%d: %d\n", + map->name, i, targ_prog->name, fd, err); + return err; + } + pr_debug("map '%s': slot [%d] set to prog '%s' fd=%d\n", + map->name, i, targ_prog->name, fd); + } + + zfree(&map->init_slots); + map->init_slots_sz = 0; + + return 0; +} + +static int bpf_object_init_prog_arrays(struct bpf_object *obj) +{ + struct bpf_map *map; + int i, err; + + for (i = 0; i < obj->nr_maps; i++) { + map = &obj->maps[i]; + + if (!map->init_slots_sz || map->def.type != BPF_MAP_TYPE_PROG_ARRAY) + continue; + + err = init_prog_array_slots(obj, map); + if (err < 0) { + zclose(map->fd); + return err; + } + } + return 0; +} + static int bpf_object__create_maps(struct bpf_object *obj) { @@ -5005,8 +5069,8 @@ bpf_object__create_maps(struct bpf_object *obj) } }
- if (map->init_slots_sz) { - err = init_map_slots(obj, map); + if (map->init_slots_sz && map->def.type != BPF_MAP_TYPE_PROG_ARRAY) { + err = init_map_in_map_slots(obj, map); if (err < 0) { zclose(map->fd); goto err_out; @@ -6146,9 +6210,11 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, int i, j, nrels, new_sz; const struct btf_var_secinfo *vi = NULL; const struct btf_type *sec, *var, *def; - struct bpf_map *map = NULL, *targ_map; + struct bpf_map *map = NULL, *targ_map = NULL; + struct bpf_program *targ_prog = NULL; + bool is_prog_array, is_map_in_map; const struct btf_member *member; - const char *name, *mname; + const char *name, *mname, *type; unsigned int moff; Elf64_Sym *sym; Elf64_Rel *rel; @@ -6175,11 +6241,6 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, return -LIBBPF_ERRNO__FORMAT; } name = elf_sym_str(obj, sym->st_name) ?: "<?>"; - if (sym->st_shndx != obj->efile.btf_maps_shndx) { - pr_warn(".maps relo #%d: '%s' isn't a BTF-defined map\n", - i, name); - return -LIBBPF_ERRNO__RELOC; - }
pr_debug(".maps relo #%d: for %zd value %zd rel->r_offset %zu name %d ('%s')\n", i, (ssize_t)(rel->r_info >> 32), (size_t)sym->st_value, @@ -6201,19 +6262,45 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, return -EINVAL; }
- if (!bpf_map_type__is_map_in_map(map->def.type)) - return -EINVAL; - if (map->def.type == BPF_MAP_TYPE_HASH_OF_MAPS && - map->def.key_size != sizeof(int)) { - pr_warn(".maps relo #%d: hash-of-maps '%s' should have key size %zu.\n", - i, map->name, sizeof(int)); + is_map_in_map = bpf_map_type__is_map_in_map(map->def.type); + is_prog_array = map->def.type == BPF_MAP_TYPE_PROG_ARRAY; + type = is_map_in_map ? "map" : "prog"; + if (is_map_in_map) { + if (sym->st_shndx != obj->efile.btf_maps_shndx) { + pr_warn(".maps relo #%d: '%s' isn't a BTF-defined map\n", + i, name); + return -LIBBPF_ERRNO__RELOC; + } + if (map->def.type == BPF_MAP_TYPE_HASH_OF_MAPS && + map->def.key_size != sizeof(int)) { + pr_warn(".maps relo #%d: hash-of-maps '%s' should have key size %zu.\n", + i, map->name, sizeof(int)); + return -EINVAL; + } + targ_map = bpf_object__find_map_by_name(obj, name); + if (!targ_map) { + pr_warn(".maps relo #%d: '%s' isn't a valid map reference\n", + i, name); + return -ESRCH; + } + } else if (is_prog_array) { + targ_prog = bpf_object__find_program_by_name(obj, name); + if (!targ_prog) { + pr_warn(".maps relo #%d: '%s' isn't a valid program reference\n", + i, name); + return -ESRCH; + } + if (targ_prog->sec_idx != sym->st_shndx || + targ_prog->sec_insn_off * 8 != sym->st_value || + prog_is_subprog(obj, targ_prog)) { + pr_warn(".maps relo #%d: '%s' isn't an entry-point program\n", + i, name); + return -LIBBPF_ERRNO__RELOC; + } + } else { return -EINVAL; }
- targ_map = bpf_object__find_map_by_name(obj, name); - if (!targ_map) - return -ESRCH; - var = btf__type_by_id(obj->btf, vi->type); def = skip_mods_and_typedefs(obj->btf, var->type, NULL); if (btf_vlen(def) == 0) @@ -6244,10 +6331,10 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, (new_sz - map->init_slots_sz) * host_ptr_sz); map->init_slots_sz = new_sz; } - map->init_slots[moff] = targ_map; + map->init_slots[moff] = is_map_in_map ? (void *)targ_map : (void *)targ_prog;
- pr_debug(".maps relo #%d: map '%s' slot [%d] points to map '%s'\n", - i, map->name, moff, name); + pr_debug(".maps relo #%d: map '%s' slot [%d] points to %s '%s'\n", + i, map->name, moff, type, name); }
return 0; @@ -7240,6 +7327,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr) err = err ? : bpf_object__create_maps(obj); err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : attr->target_btf_path); err = err ? : bpf_object__load_progs(obj, attr->log_level); + err = err ? : bpf_object_init_prog_arrays(obj);
if (obj->gen_loader) { /* reset FDs */
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.17-rc1 commit d4efb170861827290f7f571020001a60d001faaf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Andrii mentioned in [0] that switching to ARG_CONST_SIZE_OR_ZERO lets user avoid having to prove that string size at runtime is not zero and helps with not having to supress clang optimizations.
[0]: https://lore.kernel.org/bpf/CAEf4BzZa_vhXB3c8atNcTS6=krQvC25H7K7c3WWZhM=27ro...
Suggested-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211122235733.634914-2-memxor@gmail.com (cherry picked from commit d4efb170861827290f7f571020001a60d001faaf) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/syscall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 9ee73a94b5a9..3e9c8b0ec6c9 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -4726,7 +4726,7 @@ const struct bpf_func_proto bpf_kallsyms_lookup_name_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_MEM, - .arg2_type = ARG_CONST_SIZE, + .arg2_type = ARG_CONST_SIZE_OR_ZERO, .arg3_type = ARG_ANYTHING, .arg4_type = ARG_PTR_TO_LONG, };
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.17-rc1 commit 0270090d396a8e7e7f42adae13fdfa48ffb85144 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Instead, jump directly to success case stores in case ret >= 0, else do the default 0 value store and jump over the success case. This is better in terms of readability. Readjust the code for kfunc relocation as well to follow a similar pattern, also leads to easier to follow code now.
Suggested-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211122235733.634914-3-memxor@gmail.com (cherry picked from commit 0270090d396a8e7e7f42adae13fdfa48ffb85144) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index c6e065c493b8..69f88efa271c 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -699,27 +699,29 @@ static void emit_relo_kfunc_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo return; } kdesc->off = btf_fd_idx; - /* set a default value for imm */ + /* jump to success case */ + emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 3)); + /* set value for imm, off as 0 */ emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, offsetof(struct bpf_insn, imm), 0)); - /* skip success case store if ret < 0 */ - emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 1)); + emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), 0)); + /* skip success case for ret < 0 */ + emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 10)); /* store btf_id into insn[insn_idx].imm */ emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm))); + /* obtain fd in BPF_REG_9 */ + emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_7)); + emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_9, 32)); + /* jump to fd_array store if fd denotes module BTF */ + emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_9, 0, 2)); + /* set the default value for off */ + emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), 0)); + /* skip BTF fd store for vmlinux BTF */ + emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 4)); /* load fd_array slot pointer */ emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, blob_fd_array_off(gen, btf_fd_idx))); - /* skip store of BTF fd if ret < 0 */ - emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 3)); /* store BTF fd in slot */ - emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_7)); - emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_9, 32)); emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_9, 0)); - /* set a default value for off */ - emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), 0)); - /* skip insn->off store if ret < 0 */ - emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 2)); - /* skip if vmlinux BTF */ - emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_9, 0, 1)); /* store index into insn[insn_idx].off */ emit(gen, BPF_ST_MEM(BPF_H, BPF_REG_8, offsetof(struct bpf_insn, off), btf_fd_idx)); log: @@ -828,17 +830,20 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, emit_bpf_find_by_name_kind(gen, relo); if (!relo->is_weak) emit_check_err(gen); - /* set default values as 0 */ + /* jump to success case */ + emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 3)); + /* set values for insn[insn_idx].imm, insn[insn_idx + 1].imm as 0 */ emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, offsetof(struct bpf_insn, imm), 0)); emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_8, sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 0)); - /* skip success case stores if ret < 0 */ - emit(gen, BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0, 4)); + /* skip success case for ret < 0 */ + emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 4)); /* store btf_id into insn[insn_idx].imm */ emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, offsetof(struct bpf_insn, imm))); /* store btf_obj_fd into insn[insn_idx + 1].imm */ emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32)); emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_7, sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm))); + /* skip src_reg adjustment */ emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 3)); clear_src_reg: /* clear bpf_object__relocate_data's src_reg assignment, otherwise we get a verifier failure */
From: Kumar Kartikeya Dwivedi memxor@gmail.com
mainline inclusion from mainline-5.17-rc1 commit d995816b77eb826e0f6d7adf4471ec191b362be0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Alexei pointed out that we can use BPF_REG_0 which already contains imm from move_blob2blob computation. Note that we now compare the second insn's imm, but this should not matter, since both will be zeroed out for the error case for the insn populated earlier.
Suggested-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20211122235733.634914-4-memxor@gmail.com (cherry picked from commit d995816b77eb826e0f6d7adf4471ec191b362be0) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 69f88efa271c..baf74a83ebdc 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -820,9 +820,8 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, kdesc->insn + offsetof(struct bpf_insn, imm)); move_blob2blob(gen, insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm), 4, kdesc->insn + sizeof(struct bpf_insn) + offsetof(struct bpf_insn, imm)); - emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_9, BPF_REG_8, offsetof(struct bpf_insn, imm))); - /* jump over src_reg adjustment if imm is not 0 */ - emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_9, 0, 3)); + /* jump over src_reg adjustment if imm is not 0, reuse BPF_REG_0 from move_blob2blob */ + emit(gen, BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3)); goto clear_src_reg; } /* remember insn offset, so we can copy BTF ID and FD later */
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 74753e1462e77349525daf9eb60ea21ed92d3a97 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
To prepare relo_core.c to be compiled in the kernel and the user space replace btf__type_by_id with btf_type_by_id.
In libbpf btf__type_by_id and btf_type_by_id have different behavior.
bpf_core_apply_relo_insn() needs behavior of uapi btf__type_by_id vs internal btf_type_by_id, but type_id range check is already done in bpf_core_apply_relo(), so it's safe to replace it everywhere. The kernel btf_type_by_id() does the check anyway. It doesn't hurt.
Suggested-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-2-alexei.starovoitov@gmail.... (cherry picked from commit 74753e1462e77349525daf9eb60ea21ed92d3a97) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/btf.c | 2 +- tools/lib/bpf/libbpf_internal.h | 2 +- tools/lib/bpf/relo_core.c | 19 ++++++++----------- 3 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index deccd04e377b..315556bd4b32 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -449,7 +449,7 @@ const struct btf *btf__base_btf(const struct btf *btf) }
/* internal helper returning non-const pointer to a type */ -struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id) +struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id) { if (type_id == 0) return &btf_void; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index bdb9c7bc6382..0d16f01376da 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -170,7 +170,7 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size) struct btf; struct btf_type;
-struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id); +struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id); const char *btf_kind_str(const struct btf_type *t); const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 4016ed492d0c..8cdd462d4e49 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -51,7 +51,7 @@ static bool is_flex_arr(const struct btf *btf, return false;
/* has to be the last member of enclosing struct */ - t = btf__type_by_id(btf, acc->type_id); + t = btf_type_by_id(btf, acc->type_id); return acc->idx == btf_vlen(t) - 1; }
@@ -388,7 +388,7 @@ static int bpf_core_match_member(const struct btf *local_btf, return 0;
local_id = local_acc->type_id; - local_type = btf__type_by_id(local_btf, local_id); + local_type = btf_type_by_id(local_btf, local_id); local_member = btf_members(local_type) + local_acc->idx; local_name = btf__name_by_offset(local_btf, local_member->name_off);
@@ -580,7 +580,7 @@ static int bpf_core_calc_field_relo(const char *prog_name, return -EUCLEAN; /* request instruction poisoning */
acc = &spec->spec[spec->len - 1]; - t = btf__type_by_id(spec->btf, acc->type_id); + t = btf_type_by_id(spec->btf, acc->type_id);
/* a[n] accessor needs special handling */ if (!acc->name) { @@ -729,7 +729,7 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo, case BPF_ENUMVAL_VALUE: if (!spec) return -EUCLEAN; /* request instruction poisoning */ - t = btf__type_by_id(spec->btf, spec->spec[0].type_id); + t = btf_type_by_id(spec->btf, spec->spec[0].type_id); e = btf_enum(t) + spec->spec[0].idx; *val = e->val; break; @@ -805,8 +805,8 @@ static int bpf_core_calc_relo(const char *prog_name, if (res->orig_sz != res->new_sz) { const struct btf_type *orig_t, *new_t;
- orig_t = btf__type_by_id(local_spec->btf, res->orig_type_id); - new_t = btf__type_by_id(targ_spec->btf, res->new_type_id); + orig_t = btf_type_by_id(local_spec->btf, res->orig_type_id); + new_t = btf_type_by_id(targ_spec->btf, res->new_type_id);
/* There are two use cases in which it's safe to * adjust load/store's mem size: @@ -1054,7 +1054,7 @@ static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec) int i;
type_id = spec->root_type_id; - t = btf__type_by_id(spec->btf, type_id); + t = btf_type_by_id(spec->btf, type_id); s = btf__name_by_offset(spec->btf, t->name_off);
libbpf_print(level, "[%u] %s %s", type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s); @@ -1158,10 +1158,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, int i, j, err;
local_id = relo->type_id; - local_type = btf__type_by_id(local_btf, local_id); - if (!local_type) - return -EINVAL; - + local_type = btf_type_by_id(local_btf, local_id); local_name = btf__name_by_offset(local_btf, local_type->name_off); if (!local_name) return -EINVAL;
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 8293eb995f349aed28006792cad4cb48091919dd category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Rename btf_member_bit_offset() and btf_member_bitfield_size() to avoid conflicts with similarly named helpers in libbpf's btf.h. Rename the kernel helpers, since libbpf helpers are part of uapi.
Suggested-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-3-alexei.starovoitov@gmail.... (cherry picked from commit 8293eb995f349aed28006792cad4cb48091919dd) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: net/ipv4/bpf_tcp_ca.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/btf.h | 8 ++++---- kernel/bpf/bpf_struct_ops.c | 6 +++--- kernel/bpf/btf.c | 18 +++++++++--------- net/ipv4/bpf_tcp_ca.c | 4 ++-- 4 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h index 6964f39150a0..14e9f02ba741 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -191,15 +191,15 @@ static inline bool btf_type_kflag(const struct btf_type *t) return BTF_INFO_KFLAG(t->info); }
-static inline u32 btf_member_bit_offset(const struct btf_type *struct_type, - const struct btf_member *member) +static inline u32 __btf_member_bit_offset(const struct btf_type *struct_type, + const struct btf_member *member) { return btf_type_kflag(struct_type) ? BTF_MEMBER_BIT_OFFSET(member->offset) : member->offset; }
-static inline u32 btf_member_bitfield_size(const struct btf_type *struct_type, - const struct btf_member *member) +static inline u32 __btf_member_bitfield_size(const struct btf_type *struct_type, + const struct btf_member *member) { return btf_type_kflag(struct_type) ? BTF_MEMBER_BITFIELD_SIZE(member->offset) : 0; diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c index ac283f9b2332..9efd465d81c6 100644 --- a/kernel/bpf/bpf_struct_ops.c +++ b/kernel/bpf/bpf_struct_ops.c @@ -161,7 +161,7 @@ void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log) break; }
- if (btf_member_bitfield_size(t, member)) { + if (__btf_member_bitfield_size(t, member)) { pr_warn("bit field member %s in struct %s is not supported\n", mname, st_ops->name); break; @@ -292,7 +292,7 @@ static int check_zero_holes(const struct btf_type *t, void *data) const struct btf_type *mtype;
for_each_member(i, t, member) { - moff = btf_member_bit_offset(t, member) / 8; + moff = __btf_member_bit_offset(t, member) / 8; if (moff > prev_mend && memchr_inv(data + prev_mend, 0, moff - prev_mend)) return -EINVAL; @@ -369,7 +369,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key, u32 moff; u32 flags;
- moff = btf_member_bit_offset(t, member) / 8; + moff = __btf_member_bit_offset(t, member) / 8; ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL); if (ptype == module_type) { if (*(void **)(udata + moff)) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 0921e2b0c101..7b74ba18f477 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -2967,7 +2967,7 @@ static s32 btf_struct_check_meta(struct btf_verifier_env *env, return -EINVAL; }
- offset = btf_member_bit_offset(t, member); + offset = __btf_member_bit_offset(t, member); if (is_union && offset) { btf_verifier_log_member(env, t, member, "Invalid member bits_offset"); @@ -3099,7 +3099,7 @@ int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t) if (off != -ENOENT) /* only one 'struct bpf_spin_lock' is allowed */ return -E2BIG; - off = btf_member_bit_offset(t, member); + off = __btf_member_bit_offset(t, member); if (off % 8) /* valid C code cannot generate such BTF */ return -EINVAL; @@ -3133,8 +3133,8 @@ static void __btf_struct_show(const struct btf *btf, const struct btf_type *t,
btf_show_start_member(show, member);
- member_offset = btf_member_bit_offset(t, member); - bitfield_size = btf_member_bitfield_size(t, member); + member_offset = __btf_member_bit_offset(t, member); + bitfield_size = __btf_member_bitfield_size(t, member); bytes_offset = BITS_ROUNDDOWN_BYTES(member_offset); bits8_offset = BITS_PER_BYTE_MASKED(member_offset); if (bitfield_size) { @@ -4935,7 +4935,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf, if (array_elem->nelems != 0) goto error;
- moff = btf_member_bit_offset(t, member) / 8; + moff = __btf_member_bit_offset(t, member) / 8; if (off < moff) goto error;
@@ -4958,14 +4958,14 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
for_each_member(i, t, member) { /* offset of the field in bytes */ - moff = btf_member_bit_offset(t, member) / 8; + moff = __btf_member_bit_offset(t, member) / 8; if (off + size <= moff) /* won't find anything, field is already too far */ break;
- if (btf_member_bitfield_size(t, member)) { - u32 end_bit = btf_member_bit_offset(t, member) + - btf_member_bitfield_size(t, member); + if (__btf_member_bitfield_size(t, member)) { + u32 end_bit = __btf_member_bit_offset(t, member) + + __btf_member_bitfield_size(t, member);
/* off <= moff instead of off == moff because clang * does not generate a BTF member for anonymous diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index d520e61649c8..4ea1b7f652d1 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -196,7 +196,7 @@ static int bpf_tcp_ca_init_member(const struct btf_type *t, utcp_ca = (const struct tcp_congestion_ops *)udata; tcp_ca = (struct tcp_congestion_ops *)kdata;
- moff = btf_member_bit_offset(t, member) / 8; + moff = __btf_member_bit_offset(t, member) / 8; switch (moff) { case offsetof(struct tcp_congestion_ops, flags): if (utcp_ca->flags & ~TCP_CONG_MASK) @@ -226,7 +226,7 @@ static int bpf_tcp_ca_init_member(const struct btf_type *t, static int bpf_tcp_ca_check_member(const struct btf_type *t, const struct btf_member *member) { - if (is_unsupported(btf_member_bit_offset(t, member) / 8)) + if (is_unsupported(__btf_member_bit_offset(t, member) / 8)) return -ENOTSUPP; return 0; }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 29db4bea1d10b73749d7992c1fc9ac13499e8871 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Make relo_core.c to be compiled for the kernel and for user space libbpf.
Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32. This is the maximum number of nested structs and arrays. For example: struct sample { int a; struct { int b[10]; }; };
struct sample *s = ...; int *y = &s->b[5]; This field access is encoded as "0:1:0:5" and spec len is 4.
The follow up patch might bump it back to 64.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-4-alexei.starovoitov@gmail.... (cherry picked from commit 29db4bea1d10b73749d7992c1fc9ac13499e8871) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: kernel/bpf/btf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/btf.h | 81 +++++++++++++++++++++++++++++++++++++++ kernel/bpf/Makefile | 4 ++ kernel/bpf/btf.c | 26 +++++++++++++ tools/lib/bpf/relo_core.c | 76 ++++++++++++++++++++++++++++++------ 4 files changed, 176 insertions(+), 11 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h index 14e9f02ba741..2300bca4c42f 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -141,6 +141,53 @@ static inline bool btf_type_is_enum(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM; }
+static inline bool str_is_empty(const char *s) +{ + return !s || !s[0]; +} + +static inline u16 btf_kind(const struct btf_type *t) +{ + return BTF_INFO_KIND(t->info); +} + +static inline bool btf_is_enum(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_ENUM; +} + +static inline bool btf_is_composite(const struct btf_type *t) +{ + u16 kind = btf_kind(t); + + return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION; +} + +static inline bool btf_is_array(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_ARRAY; +} + +static inline bool btf_is_int(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_INT; +} + +static inline bool btf_is_ptr(const struct btf_type *t) +{ + return btf_kind(t) == BTF_KIND_PTR; +} + +static inline u8 btf_int_offset(const struct btf_type *t) +{ + return BTF_INT_OFFSET(*(u32 *)(t + 1)); +} + +static inline u8 btf_int_encoding(const struct btf_type *t) +{ + return BTF_INT_ENCODING(*(u32 *)(t + 1)); +} + static inline bool btf_type_is_scalar(const struct btf_type *t) { return btf_type_is_int(t) || btf_type_is_enum(t); @@ -181,6 +228,11 @@ static inline u16 btf_type_vlen(const struct btf_type *t) return BTF_INFO_VLEN(t->info); }
+static inline u16 btf_vlen(const struct btf_type *t) +{ + return btf_type_vlen(t); +} + static inline u16 btf_func_linkage(const struct btf_type *t) { return BTF_INFO_VLEN(t->info); @@ -205,11 +257,40 @@ static inline u32 __btf_member_bitfield_size(const struct btf_type *struct_type, : 0; }
+static inline struct btf_member *btf_members(const struct btf_type *t) +{ + return (struct btf_member *)(t + 1); +} + +static inline u32 btf_member_bit_offset(const struct btf_type *t, u32 member_idx) +{ + const struct btf_member *m = btf_members(t) + member_idx; + + return __btf_member_bit_offset(t, m); +} + +static inline u32 btf_member_bitfield_size(const struct btf_type *t, u32 member_idx) +{ + const struct btf_member *m = btf_members(t) + member_idx; + + return __btf_member_bitfield_size(t, m); +} + static inline const struct btf_member *btf_type_member(const struct btf_type *t) { return (const struct btf_member *)(t + 1); }
+static inline struct btf_array *btf_array(const struct btf_type *t) +{ + return (struct btf_array *)(t + 1); +} + +static inline struct btf_enum *btf_enum(const struct btf_type *t) +{ + return (struct btf_enum *)(t + 1); +} + static inline const struct btf_var_secinfo *btf_type_var_secinfo( const struct btf_type *t) { diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index c1b9f71ee6aa..6996f0e7e152 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -36,3 +36,7 @@ obj-$(CONFIG_BPF_SYSCALL) += bpf_struct_ops.o obj-${CONFIG_BPF_LSM} += bpf_lsm.o endif obj-$(CONFIG_BPF_PRELOAD) += preload/ + +obj-$(CONFIG_BPF_SYSCALL) += relo_core.o +$(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE + $(call if_changed_rule,cc_o_c) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 7b74ba18f477..68331879aa11 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6228,3 +6228,29 @@ const struct bpf_func_proto bpf_btf_find_by_name_kind_proto = { .arg3_type = ARG_ANYTHING, .arg4_type = ARG_ANYTHING, }; + +int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, + const struct btf *targ_btf, __u32 targ_id) +{ + return -EOPNOTSUPP; +} + +static bool bpf_core_is_flavor_sep(const char *s) +{ + /* check X___Y name pattern, where X and Y are not underscores */ + return s[0] != '_' && /* X */ + s[1] == '_' && s[2] == '_' && s[3] == '_' && /* ___ */ + s[4] != '_'; /* Y */ +} + +size_t bpf_core_essential_name_len(const char *name) +{ + size_t n = strlen(name); + int i; + + for (i = n - 5; i >= 0; i--) { + if (bpf_core_is_flavor_sep(name + i)) + return i + 1; + } + return n; +} diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 8cdd462d4e49..c27c02c6e2fd 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -1,6 +1,60 @@ // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) /* Copyright (c) 2019 Facebook */
+#ifdef __KERNEL__ +#include <linux/bpf.h> +#include <linux/btf.h> +#include <linux/string.h> +#include <linux/bpf_verifier.h> +#include "relo_core.h" + +static const char *btf_kind_str(const struct btf_type *t) +{ + return btf_type_str(t); +} + +static bool is_ldimm64_insn(struct bpf_insn *insn) +{ + return insn->code == (BPF_LD | BPF_IMM | BPF_DW); +} + +static const struct btf_type * +skip_mods_and_typedefs(const struct btf *btf, u32 id, u32 *res_id) +{ + return btf_type_skip_modifiers(btf, id, res_id); +} + +static const char *btf__name_by_offset(const struct btf *btf, u32 offset) +{ + return btf_name_by_offset(btf, offset); +} + +static s64 btf__resolve_size(const struct btf *btf, u32 type_id) +{ + const struct btf_type *t; + int size; + + t = btf_type_by_id(btf, type_id); + t = btf_resolve_size(btf, t, &size); + if (IS_ERR(t)) + return PTR_ERR(t); + return size; +} + +enum libbpf_print_level { + LIBBPF_WARN, + LIBBPF_INFO, + LIBBPF_DEBUG, +}; + +#undef pr_warn +#undef pr_info +#undef pr_debug +#define pr_warn(fmt, log, ...) bpf_log((void *)log, fmt, "", ##__VA_ARGS__) +#define pr_info(fmt, log, ...) bpf_log((void *)log, fmt, "", ##__VA_ARGS__) +#define pr_debug(fmt, log, ...) bpf_log((void *)log, fmt, "", ##__VA_ARGS__) +#define libbpf_print(level, fmt, ...) bpf_log((void *)prog_name, fmt, ##__VA_ARGS__) +#else #include <stdio.h> #include <string.h> #include <errno.h> @@ -12,8 +66,9 @@ #include "btf.h" #include "str_error.h" #include "libbpf_internal.h" +#endif
-#define BPF_CORE_SPEC_MAX_LEN 64 +#define BPF_CORE_SPEC_MAX_LEN 32
/* represents BPF CO-RE field or array element accessor */ struct bpf_core_accessor { @@ -150,7 +205,7 @@ static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) * Enum value-based relocations (ENUMVAL_EXISTS/ENUMVAL_VALUE) use access * string to specify enumerator's value index that need to be relocated. */ -static int bpf_core_parse_spec(const struct btf *btf, +static int bpf_core_parse_spec(const char *prog_name, const struct btf *btf, __u32 type_id, const char *spec_str, enum bpf_core_relo_kind relo_kind, @@ -272,8 +327,8 @@ static int bpf_core_parse_spec(const struct btf *btf, return sz; spec->bit_offset += access_idx * sz * 8; } else { - pr_warn("relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n", - type_id, spec_str, i, id, btf_kind_str(t)); + pr_warn("prog '%s': relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n", + prog_name, type_id, spec_str, i, id, btf_kind_str(t)); return -EINVAL; } } @@ -346,8 +401,6 @@ static int bpf_core_fields_are_compat(const struct btf *local_btf, targ_id = btf_array(targ_type)->type; goto recur; default: - pr_warn("unexpected kind %d relocated, local [%d], target [%d]\n", - btf_kind(local_type), local_id, targ_id); return 0; } } @@ -1045,7 +1098,7 @@ static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, * [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>, * where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b */ -static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec) +static void bpf_core_dump_spec(const char *prog_name, int level, const struct bpf_core_spec *spec) { const struct btf_type *t; const struct btf_enum *e; @@ -1167,7 +1220,8 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, if (str_is_empty(spec_str)) return -EINVAL;
- err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); + err = bpf_core_parse_spec(prog_name, local_btf, local_id, spec_str, + relo->kind, &local_spec); if (err) { pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", prog_name, relo_idx, local_id, btf_kind_str(local_type), @@ -1178,7 +1232,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); - bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec); + bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, &local_spec); libbpf_print(LIBBPF_DEBUG, "\n");
/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ @@ -1204,14 +1258,14 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, if (err < 0) { pr_warn("prog '%s': relo #%d: error matching candidate #%d ", prog_name, relo_idx, i); - bpf_core_dump_spec(LIBBPF_WARN, &cand_spec); + bpf_core_dump_spec(prog_name, LIBBPF_WARN, &cand_spec); libbpf_print(LIBBPF_WARN, ": %d\n", err); return err; }
pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name, relo_idx, err == 0 ? "non-matching" : "matching", i); - bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec); + bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, &cand_spec); libbpf_print(LIBBPF_DEBUG, "\n");
if (err == 0)
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 46334a0cd21bed70d6f1ddef1464f75a0ebe1774 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
enum bpf_core_relo_kind is generated by llvm and processed by libbpf. It's a de-facto uapi. With CO-RE in the kernel the bpf_core_relo_kind values become uapi de-jure. Also rename them with BPF_CORE_ prefix to distinguish from conflicting names in bpf_core_read.h. The enums bpf_field_info_kind, bpf_type_id_kind, bpf_type_info_kind, bpf_enum_value_kind are passing different values from bpf program into llvm.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-5-alexei.starovoitov@gmail.... (cherry picked from commit 46334a0cd21bed70d6f1ddef1464f75a0ebe1774) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/relo_core.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/uapi/linux/bpf.h | 19 ++++++++ tools/include/uapi/linux/bpf.h | 19 ++++++++ tools/lib/bpf/libbpf.c | 2 +- tools/lib/bpf/relo_core.c | 84 +++++++++++++++++----------------- tools/lib/bpf/relo_core.h | 18 +------- 5 files changed, 82 insertions(+), 60 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c3da9fdb6d88..13fa6e081687 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -5242,4 +5242,23 @@ enum { BTF_F_ZERO = (1ULL << 3), };
+/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value + * has to be adjusted by relocations. It is emitted by llvm and passed to + * libbpf and later to the kernel. + */ +enum bpf_core_relo_kind { + BPF_CORE_FIELD_BYTE_OFFSET = 0, /* field byte offset */ + BPF_CORE_FIELD_BYTE_SIZE = 1, /* field size in bytes */ + BPF_CORE_FIELD_EXISTS = 2, /* field existence in target kernel */ + BPF_CORE_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */ + BPF_CORE_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */ + BPF_CORE_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */ + BPF_CORE_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */ + BPF_CORE_TYPE_ID_TARGET = 7, /* type ID in target kernel */ + BPF_CORE_TYPE_EXISTS = 8, /* type existence in target kernel */ + BPF_CORE_TYPE_SIZE = 9, /* type size in bytes */ + BPF_CORE_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */ + BPF_CORE_ENUMVAL_VALUE = 11, /* enum value integer value */ +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 639206344c07..1cea96fe92dc 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5951,4 +5951,23 @@ enum { BTF_F_ZERO = (1ULL << 3), };
+/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value + * has to be adjusted by relocations. It is emitted by llvm and passed to + * libbpf and later to the kernel. + */ +enum bpf_core_relo_kind { + BPF_CORE_FIELD_BYTE_OFFSET = 0, /* field byte offset */ + BPF_CORE_FIELD_BYTE_SIZE = 1, /* field size in bytes */ + BPF_CORE_FIELD_EXISTS = 2, /* field existence in target kernel */ + BPF_CORE_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */ + BPF_CORE_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */ + BPF_CORE_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */ + BPF_CORE_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */ + BPF_CORE_TYPE_ID_TARGET = 7, /* type ID in target kernel */ + BPF_CORE_TYPE_EXISTS = 8, /* type existence in target kernel */ + BPF_CORE_TYPE_SIZE = 9, /* type size in bytes */ + BPF_CORE_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */ + BPF_CORE_ENUMVAL_VALUE = 11, /* enum value integer value */ +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 06fc745325de..36cdac95ba9b 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5486,7 +5486,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return -ENOTSUP; }
- if (relo->kind != BPF_TYPE_ID_LOCAL && + if (relo->kind != BPF_CORE_TYPE_ID_LOCAL && !hashmap__find(cand_cache, type_key, (void **)&cands)) { cands = bpf_core_find_cands(prog->obj, local_btf, local_id); if (IS_ERR(cands)) { diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index c27c02c6e2fd..35a38fccc837 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -113,18 +113,18 @@ static bool is_flex_arr(const struct btf *btf, static const char *core_relo_kind_str(enum bpf_core_relo_kind kind) { switch (kind) { - case BPF_FIELD_BYTE_OFFSET: return "byte_off"; - case BPF_FIELD_BYTE_SIZE: return "byte_sz"; - case BPF_FIELD_EXISTS: return "field_exists"; - case BPF_FIELD_SIGNED: return "signed"; - case BPF_FIELD_LSHIFT_U64: return "lshift_u64"; - case BPF_FIELD_RSHIFT_U64: return "rshift_u64"; - case BPF_TYPE_ID_LOCAL: return "local_type_id"; - case BPF_TYPE_ID_TARGET: return "target_type_id"; - case BPF_TYPE_EXISTS: return "type_exists"; - case BPF_TYPE_SIZE: return "type_size"; - case BPF_ENUMVAL_EXISTS: return "enumval_exists"; - case BPF_ENUMVAL_VALUE: return "enumval_value"; + case BPF_CORE_FIELD_BYTE_OFFSET: return "byte_off"; + case BPF_CORE_FIELD_BYTE_SIZE: return "byte_sz"; + case BPF_CORE_FIELD_EXISTS: return "field_exists"; + case BPF_CORE_FIELD_SIGNED: return "signed"; + case BPF_CORE_FIELD_LSHIFT_U64: return "lshift_u64"; + case BPF_CORE_FIELD_RSHIFT_U64: return "rshift_u64"; + case BPF_CORE_TYPE_ID_LOCAL: return "local_type_id"; + case BPF_CORE_TYPE_ID_TARGET: return "target_type_id"; + case BPF_CORE_TYPE_EXISTS: return "type_exists"; + case BPF_CORE_TYPE_SIZE: return "type_size"; + case BPF_CORE_ENUMVAL_EXISTS: return "enumval_exists"; + case BPF_CORE_ENUMVAL_VALUE: return "enumval_value"; default: return "unknown"; } } @@ -132,12 +132,12 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind) static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) { switch (kind) { - case BPF_FIELD_BYTE_OFFSET: - case BPF_FIELD_BYTE_SIZE: - case BPF_FIELD_EXISTS: - case BPF_FIELD_SIGNED: - case BPF_FIELD_LSHIFT_U64: - case BPF_FIELD_RSHIFT_U64: + case BPF_CORE_FIELD_BYTE_OFFSET: + case BPF_CORE_FIELD_BYTE_SIZE: + case BPF_CORE_FIELD_EXISTS: + case BPF_CORE_FIELD_SIGNED: + case BPF_CORE_FIELD_LSHIFT_U64: + case BPF_CORE_FIELD_RSHIFT_U64: return true; default: return false; @@ -147,10 +147,10 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) { switch (kind) { - case BPF_TYPE_ID_LOCAL: - case BPF_TYPE_ID_TARGET: - case BPF_TYPE_EXISTS: - case BPF_TYPE_SIZE: + case BPF_CORE_TYPE_ID_LOCAL: + case BPF_CORE_TYPE_ID_TARGET: + case BPF_CORE_TYPE_EXISTS: + case BPF_CORE_TYPE_SIZE: return true; default: return false; @@ -160,8 +160,8 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) { switch (kind) { - case BPF_ENUMVAL_EXISTS: - case BPF_ENUMVAL_VALUE: + case BPF_CORE_ENUMVAL_EXISTS: + case BPF_CORE_ENUMVAL_VALUE: return true; default: return false; @@ -624,7 +624,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
*field_sz = 0;
- if (relo->kind == BPF_FIELD_EXISTS) { + if (relo->kind == BPF_CORE_FIELD_EXISTS) { *val = spec ? 1 : 0; return 0; } @@ -637,7 +637,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
/* a[n] accessor needs special handling */ if (!acc->name) { - if (relo->kind == BPF_FIELD_BYTE_OFFSET) { + if (relo->kind == BPF_CORE_FIELD_BYTE_OFFSET) { *val = spec->bit_offset / 8; /* remember field size for load/store mem size */ sz = btf__resolve_size(spec->btf, acc->type_id); @@ -645,7 +645,7 @@ static int bpf_core_calc_field_relo(const char *prog_name, return -EINVAL; *field_sz = sz; *type_id = acc->type_id; - } else if (relo->kind == BPF_FIELD_BYTE_SIZE) { + } else if (relo->kind == BPF_CORE_FIELD_BYTE_SIZE) { sz = btf__resolve_size(spec->btf, acc->type_id); if (sz < 0) return -EINVAL; @@ -697,36 +697,36 @@ static int bpf_core_calc_field_relo(const char *prog_name, *validate = !bitfield;
switch (relo->kind) { - case BPF_FIELD_BYTE_OFFSET: + case BPF_CORE_FIELD_BYTE_OFFSET: *val = byte_off; if (!bitfield) { *field_sz = byte_sz; *type_id = field_type_id; } break; - case BPF_FIELD_BYTE_SIZE: + case BPF_CORE_FIELD_BYTE_SIZE: *val = byte_sz; break; - case BPF_FIELD_SIGNED: + case BPF_CORE_FIELD_SIGNED: /* enums will be assumed unsigned */ *val = btf_is_enum(mt) || (btf_int_encoding(mt) & BTF_INT_SIGNED); if (validate) *validate = true; /* signedness is never ambiguous */ break; - case BPF_FIELD_LSHIFT_U64: + case BPF_CORE_FIELD_LSHIFT_U64: #if __BYTE_ORDER == __LITTLE_ENDIAN *val = 64 - (bit_off + bit_sz - byte_off * 8); #else *val = (8 - byte_sz) * 8 + (bit_off - byte_off * 8); #endif break; - case BPF_FIELD_RSHIFT_U64: + case BPF_CORE_FIELD_RSHIFT_U64: *val = 64 - bit_sz; if (validate) *validate = true; /* right shift is never ambiguous */ break; - case BPF_FIELD_EXISTS: + case BPF_CORE_FIELD_EXISTS: default: return -EOPNOTSUPP; } @@ -747,20 +747,20 @@ static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo, }
switch (relo->kind) { - case BPF_TYPE_ID_TARGET: + case BPF_CORE_TYPE_ID_TARGET: *val = spec->root_type_id; break; - case BPF_TYPE_EXISTS: + case BPF_CORE_TYPE_EXISTS: *val = 1; break; - case BPF_TYPE_SIZE: + case BPF_CORE_TYPE_SIZE: sz = btf__resolve_size(spec->btf, spec->root_type_id); if (sz < 0) return -EINVAL; *val = sz; break; - case BPF_TYPE_ID_LOCAL: - /* BPF_TYPE_ID_LOCAL is handled specially and shouldn't get here */ + case BPF_CORE_TYPE_ID_LOCAL: + /* BPF_CORE_TYPE_ID_LOCAL is handled specially and shouldn't get here */ default: return -EOPNOTSUPP; } @@ -776,10 +776,10 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo, const struct btf_enum *e;
switch (relo->kind) { - case BPF_ENUMVAL_EXISTS: + case BPF_CORE_ENUMVAL_EXISTS: *val = spec ? 1 : 0; break; - case BPF_ENUMVAL_VALUE: + case BPF_CORE_ENUMVAL_VALUE: if (!spec) return -EUCLEAN; /* request instruction poisoning */ t = btf_type_by_id(spec->btf, spec->spec[0].type_id); @@ -1236,7 +1236,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, libbpf_print(LIBBPF_DEBUG, "\n");
/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ - if (relo->kind == BPF_TYPE_ID_LOCAL) { + if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) { targ_res.validate = true; targ_res.poison = false; targ_res.orig_val = local_spec.root_type_id; @@ -1302,7 +1302,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, }
/* - * For BPF_FIELD_EXISTS relo or when used BPF program has field + * For BPF_CORE_FIELD_EXISTS relo or when used BPF program has field * existence checks or kernel version/config checks, it's expected * that we might not find any candidates. In this case, if field * wasn't found in any candidate, the list of candidates shouldn't diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index 3b9f8f18346c..3d0b86e7f439 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -4,23 +4,7 @@ #ifndef __RELO_CORE_H #define __RELO_CORE_H
-/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value - * has to be adjusted by relocations. - */ -enum bpf_core_relo_kind { - BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */ - BPF_FIELD_BYTE_SIZE = 1, /* field size in bytes */ - BPF_FIELD_EXISTS = 2, /* field existence in target kernel */ - BPF_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */ - BPF_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */ - BPF_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */ - BPF_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */ - BPF_TYPE_ID_TARGET = 7, /* type ID in target kernel */ - BPF_TYPE_EXISTS = 8, /* type existence in target kernel */ - BPF_TYPE_SIZE = 9, /* type size in bytes */ - BPF_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */ - BPF_ENUMVAL_VALUE = 11, /* enum value integer value */ -}; +#include <linux/bpf.h>
/* The minimum bpf_core_relo checked by the loader *
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit fbd94c7afcf99c9f3b1ba1168657ecc428eb2c8d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
struct bpf_core_relo is generated by llvm and processed by libbpf. It's a de-facto uapi. With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure. Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command and let the kernel perform CO-RE relocations.
Note the struct bpf_line_info and struct bpf_func_info have the same layout when passed from LLVM to libbpf and from libbpf to the kernel except "insn_off" fields means "byte offset" when LLVM generates it. Then libbpf converts it to "insn index" to pass to the kernel. The struct bpf_core_relo's "insn_off" field is always "byte offset".
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-6-alexei.starovoitov@gmail.... (cherry picked from commit fbd94c7afcf99c9f3b1ba1168657ecc428eb2c8d) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: include/linux/bpf.h
Signed-off-by: Wang Yufen wangyufen@huawei.com --- include/linux/bpf.h | 8 ++++ include/uapi/linux/bpf.h | 59 +++++++++++++++++++++++++- kernel/bpf/btf.c | 6 +++ kernel/bpf/syscall.c | 2 +- kernel/bpf/verifier.c | 76 ++++++++++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 59 +++++++++++++++++++++++++- tools/lib/bpf/relo_core.h | 53 ------------------------ 7 files changed, 207 insertions(+), 56 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 930d0032285b..d9e54d6479fa 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1618,6 +1618,14 @@ const struct btf_func_model * bpf_jit_find_kfunc_model(const struct bpf_prog *prog, const struct bpf_insn *insn);
+struct bpf_core_ctx { + struct bpf_verifier_log *log; + const struct btf *btf; +}; + +int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, + int relo_idx, void *insn); + static inline bool unprivileged_ebpf_enabled(void) { return !sysctl_unprivileged_bpf_disabled; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 13fa6e081687..8fd143849999 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -589,8 +589,10 @@ union bpf_attr { /* or valid module BTF object fd or 0 to attach to vmlinux */ __u32 attach_btf_obj_fd; }; - __u32 :32; /* pad */ + __u32 core_relo_cnt; /* number of bpf_core_relo */ __aligned_u64 fd_array; /* array of FDs */ + __aligned_u64 core_relos; + __u32 core_relo_rec_size; /* sizeof(struct bpf_core_relo) */ };
struct { /* anonymous struct used by BPF_OBJ_* commands */ @@ -5261,4 +5263,59 @@ enum bpf_core_relo_kind { BPF_CORE_ENUMVAL_VALUE = 11, /* enum value integer value */ };
+/* + * "struct bpf_core_relo" is used to pass relocation data form LLVM to libbpf + * and from libbpf to the kernel. + * + * CO-RE relocation captures the following data: + * - insn_off - instruction offset (in bytes) within a BPF program that needs + * its insn->imm field to be relocated with actual field info; + * - type_id - BTF type ID of the "root" (containing) entity of a relocatable + * type or field; + * - access_str_off - offset into corresponding .BTF string section. String + * interpretation depends on specific relocation kind: + * - for field-based relocations, string encodes an accessed field using + * a sequence of field and array indices, separated by colon (:). It's + * conceptually very close to LLVM's getelementptr ([0]) instruction's + * arguments for identifying offset to a field. + * - for type-based relocations, strings is expected to be just "0"; + * - for enum value-based relocations, string contains an index of enum + * value within its enum type; + * - kind - one of enum bpf_core_relo_kind; + * + * Example: + * struct sample { + * int a; + * struct { + * int b[10]; + * }; + * }; + * + * struct sample *s = ...; + * int *x = &s->a; // encoded as "0:0" (a is field #0) + * int *y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1, + * // b is field #0 inside anon struct, accessing elem #5) + * int *z = &s[10]->b; // encoded as "10:1" (ptr is used as an array) + * + * type_id for all relocs in this example will capture BTF type id of + * `struct sample`. + * + * Such relocation is emitted when using __builtin_preserve_access_index() + * Clang built-in, passing expression that captures field address, e.g.: + * + * bpf_probe_read(&dst, sizeof(dst), + * __builtin_preserve_access_index(&src->a.b.c)); + * + * In this case Clang will emit field relocation recording necessary data to + * be able to find offset of embedded `a.b.c` field within `src` struct. + * + * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction + */ +struct bpf_core_relo { + __u32 insn_off; + __u32 type_id; + __u32 access_str_off; + enum bpf_core_relo_kind kind; +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 68331879aa11..192d9aa21aab 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6254,3 +6254,9 @@ size_t bpf_core_essential_name_len(const char *name) } return n; } + +int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, + int relo_idx, void *insn) +{ + return -EOPNOTSUPP; +} diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 3e9c8b0ec6c9..df9e2c0365f2 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2177,7 +2177,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type) }
/* last field in 'union bpf_attr' used by this command */ -#define BPF_PROG_LOAD_LAST_FIELD fd_array +#define BPF_PROG_LOAD_LAST_FIELD core_relo_rec_size
static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e236b845f93e..5dac4d420a02 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9646,6 +9646,78 @@ static int check_btf_line(struct bpf_verifier_env *env, return err; }
+#define MIN_CORE_RELO_SIZE sizeof(struct bpf_core_relo) +#define MAX_CORE_RELO_SIZE MAX_FUNCINFO_REC_SIZE + +static int check_core_relo(struct bpf_verifier_env *env, + const union bpf_attr *attr, + bpfptr_t uattr) +{ + u32 i, nr_core_relo, ncopy, expected_size, rec_size; + struct bpf_core_relo core_relo = {}; + struct bpf_prog *prog = env->prog; + const struct btf *btf = prog->aux->btf; + struct bpf_core_ctx ctx = { + .log = &env->log, + .btf = btf, + }; + bpfptr_t u_core_relo; + int err; + + nr_core_relo = attr->core_relo_cnt; + if (!nr_core_relo) + return 0; + if (nr_core_relo > INT_MAX / sizeof(struct bpf_core_relo)) + return -EINVAL; + + rec_size = attr->core_relo_rec_size; + if (rec_size < MIN_CORE_RELO_SIZE || + rec_size > MAX_CORE_RELO_SIZE || + rec_size % sizeof(u32)) + return -EINVAL; + + u_core_relo = make_bpfptr(attr->core_relos, uattr.is_kernel); + expected_size = sizeof(struct bpf_core_relo); + ncopy = min_t(u32, expected_size, rec_size); + + /* Unlike func_info and line_info, copy and apply each CO-RE + * relocation record one at a time. + */ + for (i = 0; i < nr_core_relo; i++) { + /* future proofing when sizeof(bpf_core_relo) changes */ + err = bpf_check_uarg_tail_zero(u_core_relo, expected_size, rec_size); + if (err) { + if (err == -E2BIG) { + verbose(env, "nonzero tailing record in core_relo"); + if (copy_to_bpfptr_offset(uattr, + offsetof(union bpf_attr, core_relo_rec_size), + &expected_size, sizeof(expected_size))) + err = -EFAULT; + } + break; + } + + if (copy_from_bpfptr(&core_relo, u_core_relo, ncopy)) { + err = -EFAULT; + break; + } + + if (core_relo.insn_off % 8 || core_relo.insn_off / 8 >= prog->len) { + verbose(env, "Invalid core_relo[%u].insn_off:%u prog->len:%u\n", + i, core_relo.insn_off, prog->len); + err = -EINVAL; + break; + } + + err = bpf_core_apply(&ctx, &core_relo, i, + &prog->insnsi[core_relo.insn_off / 8]); + if (err) + break; + bpfptr_add(&u_core_relo, rec_size); + } + return err; +} + static int check_btf_info(struct bpf_verifier_env *env, const union bpf_attr *attr, bpfptr_t uattr) @@ -9676,6 +9748,10 @@ static int check_btf_info(struct bpf_verifier_env *env, if (err) return err;
+ err = check_core_relo(env, attr, uattr); + if (err) + return err; + return 0; }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 1cea96fe92dc..e805b725e676 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1299,8 +1299,10 @@ union bpf_attr { /* or valid module BTF object fd or 0 to attach to vmlinux */ __u32 attach_btf_obj_fd; }; - __u32 :32; /* pad */ + __u32 core_relo_cnt; /* number of bpf_core_relo */ __aligned_u64 fd_array; /* array of FDs */ + __aligned_u64 core_relos; + __u32 core_relo_rec_size; /* sizeof(struct bpf_core_relo) */ };
struct { /* anonymous struct used by BPF_OBJ_* commands */ @@ -5970,4 +5972,59 @@ enum bpf_core_relo_kind { BPF_CORE_ENUMVAL_VALUE = 11, /* enum value integer value */ };
+/* + * "struct bpf_core_relo" is used to pass relocation data form LLVM to libbpf + * and from libbpf to the kernel. + * + * CO-RE relocation captures the following data: + * - insn_off - instruction offset (in bytes) within a BPF program that needs + * its insn->imm field to be relocated with actual field info; + * - type_id - BTF type ID of the "root" (containing) entity of a relocatable + * type or field; + * - access_str_off - offset into corresponding .BTF string section. String + * interpretation depends on specific relocation kind: + * - for field-based relocations, string encodes an accessed field using + * a sequence of field and array indices, separated by colon (:). It's + * conceptually very close to LLVM's getelementptr ([0]) instruction's + * arguments for identifying offset to a field. + * - for type-based relocations, strings is expected to be just "0"; + * - for enum value-based relocations, string contains an index of enum + * value within its enum type; + * - kind - one of enum bpf_core_relo_kind; + * + * Example: + * struct sample { + * int a; + * struct { + * int b[10]; + * }; + * }; + * + * struct sample *s = ...; + * int *x = &s->a; // encoded as "0:0" (a is field #0) + * int *y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1, + * // b is field #0 inside anon struct, accessing elem #5) + * int *z = &s[10]->b; // encoded as "10:1" (ptr is used as an array) + * + * type_id for all relocs in this example will capture BTF type id of + * `struct sample`. + * + * Such relocation is emitted when using __builtin_preserve_access_index() + * Clang built-in, passing expression that captures field address, e.g.: + * + * bpf_probe_read(&dst, sizeof(dst), + * __builtin_preserve_access_index(&src->a.b.c)); + * + * In this case Clang will emit field relocation recording necessary data to + * be able to find offset of embedded `a.b.c` field within `src` struct. + * + * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction + */ +struct bpf_core_relo { + __u32 insn_off; + __u32 type_id; + __u32 access_str_off; + enum bpf_core_relo_kind kind; +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index 3d0b86e7f439..f410691cc4e5 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -6,59 +6,6 @@
#include <linux/bpf.h>
-/* The minimum bpf_core_relo checked by the loader - * - * CO-RE relocation captures the following data: - * - insn_off - instruction offset (in bytes) within a BPF program that needs - * its insn->imm field to be relocated with actual field info; - * - type_id - BTF type ID of the "root" (containing) entity of a relocatable - * type or field; - * - access_str_off - offset into corresponding .BTF string section. String - * interpretation depends on specific relocation kind: - * - for field-based relocations, string encodes an accessed field using - * a sequence of field and array indices, separated by colon (:). It's - * conceptually very close to LLVM's getelementptr ([0]) instruction's - * arguments for identifying offset to a field. - * - for type-based relocations, strings is expected to be just "0"; - * - for enum value-based relocations, string contains an index of enum - * value within its enum type; - * - * Example to provide a better feel. - * - * struct sample { - * int a; - * struct { - * int b[10]; - * }; - * }; - * - * struct sample *s = ...; - * int x = &s->a; // encoded as "0:0" (a is field #0) - * int y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1, - * // b is field #0 inside anon struct, accessing elem #5) - * int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array) - * - * type_id for all relocs in this example will capture BTF type id of - * `struct sample`. - * - * Such relocation is emitted when using __builtin_preserve_access_index() - * Clang built-in, passing expression that captures field address, e.g.: - * - * bpf_probe_read(&dst, sizeof(dst), - * __builtin_preserve_access_index(&src->a.b.c)); - * - * In this case Clang will emit field relocation recording necessary data to - * be able to find offset of embedded `a.b.c` field within `src` struct. - * - * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction - */ -struct bpf_core_relo { - __u32 insn_off; - __u32 type_id; - __u32 access_str_off; - enum bpf_core_relo_kind kind; -}; - struct bpf_core_cand { const struct btf *btf; const struct btf_type *t;
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 03d5b99138dd8c7bfb838396acb180bd515ebf06 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Remove two redundant fields from struct bpf_core_cand.
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-8-alexei.starovoitov@gmail.... (cherry picked from commit 03d5b99138dd8c7bfb838396acb180bd515ebf06) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 30 +++++++++++++++++------------- tools/lib/bpf/relo_core.h | 2 -- 2 files changed, 17 insertions(+), 15 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 36cdac95ba9b..2ab33c4d68b6 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5142,15 +5142,18 @@ static int bpf_core_add_cands(struct bpf_core_cand *local_cand, struct bpf_core_cand_list *cands) { struct bpf_core_cand *new_cands, *cand; - const struct btf_type *t; - const char *targ_name; + const struct btf_type *t, *local_t; + const char *targ_name, *local_name; size_t targ_essent_len; int n, i;
+ local_t = btf__type_by_id(local_cand->btf, local_cand->id); + local_name = btf__str_by_offset(local_cand->btf, local_t->name_off); + n = btf__get_nr_types(targ_btf); for (i = targ_start_id; i <= n; i++) { t = btf__type_by_id(targ_btf, i); - if (btf_kind(t) != btf_kind(local_cand->t)) + if (btf_kind(t) != btf_kind(local_t)) continue;
targ_name = btf__name_by_offset(targ_btf, t->name_off); @@ -5161,12 +5164,12 @@ static int bpf_core_add_cands(struct bpf_core_cand *local_cand, if (targ_essent_len != local_essent_len) continue;
- if (strncmp(local_cand->name, targ_name, local_essent_len) != 0) + if (strncmp(local_name, targ_name, local_essent_len) != 0) continue;
pr_debug("CO-RE relocating [%d] %s %s: found target candidate [%d] %s %s in [%s]\n", - local_cand->id, btf_kind_str(local_cand->t), - local_cand->name, i, btf_kind_str(t), targ_name, + local_cand->id, btf_kind_str(local_t), + local_name, i, btf_kind_str(t), targ_name, targ_btf_name); new_cands = libbpf_reallocarray(cands->cands, cands->len + 1, sizeof(*cands->cands)); @@ -5175,8 +5178,6 @@ static int bpf_core_add_cands(struct bpf_core_cand *local_cand,
cand = &new_cands[cands->len]; cand->btf = targ_btf; - cand->t = t; - cand->name = targ_name; cand->id = i;
cands->cands = new_cands; @@ -5283,18 +5284,21 @@ bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 l struct bpf_core_cand local_cand = {}; struct bpf_core_cand_list *cands; const struct btf *main_btf; + const struct btf_type *local_t; + const char *local_name; size_t local_essent_len; int err, i;
local_cand.btf = local_btf; - local_cand.t = btf__type_by_id(local_btf, local_type_id); - if (!local_cand.t) + local_cand.id = local_type_id; + local_t = btf__type_by_id(local_btf, local_type_id); + if (!local_t) return ERR_PTR(-EINVAL);
- local_cand.name = btf__name_by_offset(local_btf, local_cand.t->name_off); - if (str_is_empty(local_cand.name)) + local_name = btf__name_by_offset(local_btf, local_t->name_off); + if (str_is_empty(local_name)) return ERR_PTR(-EINVAL); - local_essent_len = bpf_core_essential_name_len(local_cand.name); + local_essent_len = bpf_core_essential_name_len(local_name);
cands = calloc(1, sizeof(*cands)); if (!cands) diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index f410691cc4e5..4f864b8e33b7 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -8,8 +8,6 @@
struct bpf_core_cand { const struct btf *btf; - const struct btf_type *t; - const char *name; __u32 id; };
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 1e89106da25390826608ad6ac0edfb7c9952eff3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Given BPF program's BTF root type name perform the following steps: . search in vmlinux candidate cache. . if (present in cache and candidate list >= 1) return candidate list. . do a linear search through kernel BTFs for possible candidates. . regardless of number of candidates found populate vmlinux cache. . if (candidate list >= 1) return candidate list. . search in module candidate cache. . if (present in cache) return candidate list (even if list is empty). . do a linear search through BTFs of all kernel modules collecting candidates from all of them. . regardless of number of candidates found populate module cache. . return candidate list. Then wire the result into bpf_core_apply_relo_insn().
When BPF program is trying to CO-RE relocate a type that doesn't exist in either vmlinux BTF or in modules BTFs these steps will perform 2 cache lookups when cache is hit.
Note the cache doesn't prevent the abuse by the program that might have lots of relocations that cannot be resolved. Hence cond_resched().
CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-9-alexei.starovoitov@gmail.... (cherry picked from commit 1e89106da25390826608ad6ac0edfb7c9952eff3) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 346 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 345 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 192d9aa21aab..525e68a4816c 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -25,6 +25,7 @@ #include <linux/kobject.h> #include <linux/sysfs.h> #include <net/sock.h> +#include "../tools/lib/bpf/relo_core.h"
/* BTF (BPF Type Format) is the meta data format which describes * the data types of BPF program/map. Hence, it basically focus @@ -6044,6 +6045,8 @@ btf_module_read(struct file *file, struct kobject *kobj, return len; }
+static void purge_cand_cache(struct btf *btf); + static int btf_module_notify(struct notifier_block *nb, unsigned long op, void *module) { @@ -6078,6 +6081,7 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, goto out; }
+ purge_cand_cache(NULL); mutex_lock(&btf_module_mutex); btf_mod->module = module; btf_mod->btf = btf; @@ -6120,6 +6124,7 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, list_del(&btf_mod->list); if (btf_mod->sysfs_attr) sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr); + purge_cand_cache(btf_mod->btf); btf_put(btf_mod->btf); kfree(btf_mod->sysfs_attr); kfree(btf_mod); @@ -6255,8 +6260,347 @@ size_t bpf_core_essential_name_len(const char *name) return n; }
+struct bpf_cand_cache { + const char *name; + u32 name_len; + u16 kind; + u16 cnt; + struct { + const struct btf *btf; + u32 id; + } cands[]; +}; + +static void bpf_free_cands(struct bpf_cand_cache *cands) +{ + if (!cands->cnt) + /* empty candidate array was allocated on stack */ + return; + kfree(cands); +} + +static void bpf_free_cands_from_cache(struct bpf_cand_cache *cands) +{ + kfree(cands->name); + kfree(cands); +} + +#define VMLINUX_CAND_CACHE_SIZE 31 +static struct bpf_cand_cache *vmlinux_cand_cache[VMLINUX_CAND_CACHE_SIZE]; + +#define MODULE_CAND_CACHE_SIZE 31 +static struct bpf_cand_cache *module_cand_cache[MODULE_CAND_CACHE_SIZE]; + +static DEFINE_MUTEX(cand_cache_mutex); + +static void __print_cand_cache(struct bpf_verifier_log *log, + struct bpf_cand_cache **cache, + int cache_size) +{ + struct bpf_cand_cache *cc; + int i, j; + + for (i = 0; i < cache_size; i++) { + cc = cache[i]; + if (!cc) + continue; + bpf_log(log, "[%d]%s(", i, cc->name); + for (j = 0; j < cc->cnt; j++) { + bpf_log(log, "%d", cc->cands[j].id); + if (j < cc->cnt - 1) + bpf_log(log, " "); + } + bpf_log(log, "), "); + } +} + +static void print_cand_cache(struct bpf_verifier_log *log) +{ + mutex_lock(&cand_cache_mutex); + bpf_log(log, "vmlinux_cand_cache:"); + __print_cand_cache(log, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE); + bpf_log(log, "\nmodule_cand_cache:"); + __print_cand_cache(log, module_cand_cache, MODULE_CAND_CACHE_SIZE); + bpf_log(log, "\n"); + mutex_unlock(&cand_cache_mutex); +} + +static u32 hash_cands(struct bpf_cand_cache *cands) +{ + return jhash(cands->name, cands->name_len, 0); +} + +static struct bpf_cand_cache *check_cand_cache(struct bpf_cand_cache *cands, + struct bpf_cand_cache **cache, + int cache_size) +{ + struct bpf_cand_cache *cc = cache[hash_cands(cands) % cache_size]; + + if (cc && cc->name_len == cands->name_len && + !strncmp(cc->name, cands->name, cands->name_len)) + return cc; + return NULL; +} + +static size_t sizeof_cands(int cnt) +{ + return offsetof(struct bpf_cand_cache, cands[cnt]); +} + +static struct bpf_cand_cache *populate_cand_cache(struct bpf_cand_cache *cands, + struct bpf_cand_cache **cache, + int cache_size) +{ + struct bpf_cand_cache **cc = &cache[hash_cands(cands) % cache_size], *new_cands; + + if (*cc) { + bpf_free_cands_from_cache(*cc); + *cc = NULL; + } + new_cands = kmalloc(sizeof_cands(cands->cnt), GFP_KERNEL); + if (!new_cands) { + bpf_free_cands(cands); + return ERR_PTR(-ENOMEM); + } + memcpy(new_cands, cands, sizeof_cands(cands->cnt)); + /* strdup the name, since it will stay in cache. + * the cands->name points to strings in prog's BTF and the prog can be unloaded. + */ + new_cands->name = kmemdup_nul(cands->name, cands->name_len, GFP_KERNEL); + bpf_free_cands(cands); + if (!new_cands->name) { + kfree(new_cands); + return ERR_PTR(-ENOMEM); + } + *cc = new_cands; + return new_cands; +} + +static void __purge_cand_cache(struct btf *btf, struct bpf_cand_cache **cache, + int cache_size) +{ + struct bpf_cand_cache *cc; + int i, j; + + for (i = 0; i < cache_size; i++) { + cc = cache[i]; + if (!cc) + continue; + if (!btf) { + /* when new module is loaded purge all of module_cand_cache, + * since new module might have candidates with the name + * that matches cached cands. + */ + bpf_free_cands_from_cache(cc); + cache[i] = NULL; + continue; + } + /* when module is unloaded purge cache entries + * that match module's btf + */ + for (j = 0; j < cc->cnt; j++) + if (cc->cands[j].btf == btf) { + bpf_free_cands_from_cache(cc); + cache[i] = NULL; + break; + } + } + +} + +static void purge_cand_cache(struct btf *btf) +{ + mutex_lock(&cand_cache_mutex); + __purge_cand_cache(btf, module_cand_cache, MODULE_CAND_CACHE_SIZE); + mutex_unlock(&cand_cache_mutex); +} + +static struct bpf_cand_cache * +bpf_core_add_cands(struct bpf_cand_cache *cands, const struct btf *targ_btf, + int targ_start_id) +{ + struct bpf_cand_cache *new_cands; + const struct btf_type *t; + const char *targ_name; + size_t targ_essent_len; + int n, i; + + n = btf_nr_types(targ_btf); + for (i = targ_start_id; i < n; i++) { + t = btf_type_by_id(targ_btf, i); + if (btf_kind(t) != cands->kind) + continue; + + targ_name = btf_name_by_offset(targ_btf, t->name_off); + if (!targ_name) + continue; + + /* the resched point is before strncmp to make sure that search + * for non-existing name will have a chance to schedule(). + */ + cond_resched(); + + if (strncmp(cands->name, targ_name, cands->name_len) != 0) + continue; + + targ_essent_len = bpf_core_essential_name_len(targ_name); + if (targ_essent_len != cands->name_len) + continue; + + /* most of the time there is only one candidate for a given kind+name pair */ + new_cands = kmalloc(sizeof_cands(cands->cnt + 1), GFP_KERNEL); + if (!new_cands) { + bpf_free_cands(cands); + return ERR_PTR(-ENOMEM); + } + + memcpy(new_cands, cands, sizeof_cands(cands->cnt)); + bpf_free_cands(cands); + cands = new_cands; + cands->cands[cands->cnt].btf = targ_btf; + cands->cands[cands->cnt].id = i; + cands->cnt++; + } + return cands; +} + +static struct bpf_cand_cache * +bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id) +{ + struct bpf_cand_cache *cands, *cc, local_cand = {}; + const struct btf *local_btf = ctx->btf; + const struct btf_type *local_type; + const struct btf *main_btf; + size_t local_essent_len; + struct btf *mod_btf; + const char *name; + int id; + + main_btf = bpf_get_btf_vmlinux(); + if (IS_ERR(main_btf)) + return (void *)main_btf; + + local_type = btf_type_by_id(local_btf, local_type_id); + if (!local_type) + return ERR_PTR(-EINVAL); + + name = btf_name_by_offset(local_btf, local_type->name_off); + if (str_is_empty(name)) + return ERR_PTR(-EINVAL); + local_essent_len = bpf_core_essential_name_len(name); + + cands = &local_cand; + cands->name = name; + cands->kind = btf_kind(local_type); + cands->name_len = local_essent_len; + + cc = check_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE); + /* cands is a pointer to stack here */ + if (cc) { + if (cc->cnt) + return cc; + goto check_modules; + } + + /* Attempt to find target candidates in vmlinux BTF first */ + cands = bpf_core_add_cands(cands, main_btf, 1); + if (IS_ERR(cands)) + return cands; + + /* cands is a pointer to kmalloced memory here if cands->cnt > 0 */ + + /* populate cache even when cands->cnt == 0 */ + cc = populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE); + if (IS_ERR(cc)) + return cc; + + /* if vmlinux BTF has any candidate, don't go for module BTFs */ + if (cc->cnt) + return cc; + +check_modules: + /* cands is a pointer to stack here and cands->cnt == 0 */ + cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE); + if (cc) + /* if cache has it return it even if cc->cnt == 0 */ + return cc; + + /* If candidate is not found in vmlinux's BTF then search in module's BTFs */ + spin_lock_bh(&btf_idr_lock); + idr_for_each_entry(&btf_idr, mod_btf, id) { + if (!btf_is_module(mod_btf)) + continue; + /* linear search could be slow hence unlock/lock + * the IDR to avoiding holding it for too long + */ + btf_get(mod_btf); + spin_unlock_bh(&btf_idr_lock); + cands = bpf_core_add_cands(cands, mod_btf, btf_nr_types(main_btf)); + if (IS_ERR(cands)) { + btf_put(mod_btf); + return cands; + } + spin_lock_bh(&btf_idr_lock); + btf_put(mod_btf); + } + spin_unlock_bh(&btf_idr_lock); + /* cands is a pointer to kmalloced memory here if cands->cnt > 0 + * or pointer to stack if cands->cnd == 0. + * Copy it into the cache even when cands->cnt == 0 and + * return the result. + */ + return populate_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE); +} + int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, int relo_idx, void *insn) { - return -EOPNOTSUPP; + bool need_cands = relo->kind != BPF_CORE_TYPE_ID_LOCAL; + struct bpf_core_cand_list cands = {}; + int err; + + if (need_cands) { + struct bpf_cand_cache *cc; + int i; + + mutex_lock(&cand_cache_mutex); + cc = bpf_core_find_cands(ctx, relo->type_id); + if (IS_ERR(cc)) { + bpf_log(ctx->log, "target candidate search failed for %d\n", + relo->type_id); + err = PTR_ERR(cc); + goto out; + } + if (cc->cnt) { + cands.cands = kcalloc(cc->cnt, sizeof(*cands.cands), GFP_KERNEL); + if (!cands.cands) { + err = -ENOMEM; + goto out; + } + } + for (i = 0; i < cc->cnt; i++) { + bpf_log(ctx->log, + "CO-RE relocating %s %s: found target candidate [%d]\n", + btf_kind_str[cc->kind], cc->name, cc->cands[i].id); + cands.cands[i].btf = cc->cands[i].btf; + cands.cands[i].id = cc->cands[i].id; + } + cands.len = cc->cnt; + /* cand_cache_mutex needs to span the cache lookup and + * copy of btf pointer into bpf_core_cand_list, + * since module can be unloaded while bpf_core_apply_relo_insn + * is working with module's btf. + */ + } + + err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8, + relo, relo_idx, ctx->btf, &cands); +out: + if (need_cands) { + kfree(cands.cands); + mutex_unlock(&cand_cache_mutex); + if (ctx->log->level & BPF_LOG_LEVEL2) + print_cand_cache(ctx->log); + } + return err; }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit d0e928876e30b18411b80fd2445424bc00e95745 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Without lskel the CO-RE relocations are processed by libbpf before any other work is done. Instead, when lskel is needed, remember relocation as RELO_CORE kind. Then when loader prog is generated for a given bpf program pass CO-RE relos of that program to gen loader via bpf_gen__record_relo_core(). The gen loader will remember them as-is and pass it later as-is into the kernel.
The normal libbpf flow is to process CO-RE early before call relos happen. In case of gen_loader the core relos have to be added to other relos to be copied together when bpf static function is appended in different places to other main bpf progs. During the copy the append_subprog_relos() will adjust insn_idx for normal relos and for RELO_CORE kind too. When that is done each struct reloc_desc has good relos for specific main prog.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-10-alexei.starovoitov@gmail... (cherry picked from commit d0e928876e30b18411b80fd2445424bc00e95745) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/gen_loader.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_gen_internal.h | 3 + tools/lib/bpf/gen_loader.c | 38 ++++++++++- tools/lib/bpf/libbpf.c | 109 ++++++++++++++++++++++--------- 3 files changed, 118 insertions(+), 32 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h index d507e2841df5..a5f332f0f495 100644 --- a/tools/lib/bpf/bpf_gen_internal.h +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -37,6 +37,8 @@ struct bpf_gen { int error; struct ksym_relo_desc *relos; int relo_cnt; + struct bpf_core_relo *core_relos; + int core_relo_cnt; char attach_target[128]; int attach_kind; struct ksym_desc *ksyms; @@ -57,5 +59,6 @@ void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx); void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type); void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, bool is_typeless, int kind, int insn_idx); +void bpf_gen__record_relo_core(struct bpf_gen *gen, const struct bpf_core_relo *core_relo);
#endif diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index baf74a83ebdc..2bc783050060 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -854,6 +854,22 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo, emit_ksym_relo_log(gen, relo, kdesc->ref); }
+void bpf_gen__record_relo_core(struct bpf_gen *gen, + const struct bpf_core_relo *core_relo) +{ + struct bpf_core_relo *relos; + + relos = libbpf_reallocarray(gen->core_relos, gen->core_relo_cnt + 1, sizeof(*relos)); + if (!relos) { + gen->error = -ENOMEM; + return; + } + gen->core_relos = relos; + relos += gen->core_relo_cnt; + memcpy(relos, core_relo, sizeof(*relos)); + gen->core_relo_cnt++; +} + static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns) { int insn; @@ -886,6 +902,15 @@ static void emit_relos(struct bpf_gen *gen, int insns) emit_relo(gen, gen->relos + i, insns); }
+static void cleanup_core_relo(struct bpf_gen *gen) +{ + if (!gen->core_relo_cnt) + return; + free(gen->core_relos); + gen->core_relo_cnt = 0; + gen->core_relos = NULL; +} + static void cleanup_relos(struct bpf_gen *gen, int insns) { int i, insn; @@ -913,13 +938,14 @@ static void cleanup_relos(struct bpf_gen *gen, int insns) gen->relo_cnt = 0; gen->relos = NULL; } + cleanup_core_relo(gen); }
void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx) { - int attr_size = offsetofend(union bpf_attr, fd_array); - int prog_load_attr, license, insns, func_info, line_info; + int prog_load_attr, license, insns, func_info, line_info, core_relos; + int attr_size = offsetofend(union bpf_attr, core_relo_rec_size); union bpf_attr attr;
memset(&attr, 0, attr_size); @@ -949,6 +975,11 @@ void bpf_gen__prog_load(struct bpf_gen *gen, line_info = add_data(gen, load_attr->line_info, attr.line_info_cnt * attr.line_info_rec_size);
+ attr.core_relo_rec_size = sizeof(struct bpf_core_relo); + attr.core_relo_cnt = gen->core_relo_cnt; + core_relos = add_data(gen, gen->core_relos, + attr.core_relo_cnt * attr.core_relo_rec_size); + memcpy(attr.prog_name, load_attr->name, min((unsigned)strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1)); prog_load_attr = add_data(gen, &attr, attr_size); @@ -965,6 +996,9 @@ void bpf_gen__prog_load(struct bpf_gen *gen, /* populate union bpf_attr with a pointer to line_info */ emit_rel_store(gen, attr_field(prog_load_attr, line_info), line_info);
+ /* populate union bpf_attr with a pointer to core_relos */ + emit_rel_store(gen, attr_field(prog_load_attr, core_relos), core_relos); + /* populate union bpf_attr fd_array with a pointer to data where map_fds are saved */ emit_rel_store(gen, attr_field(prog_load_attr, fd_array), gen->fd_array);
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2ab33c4d68b6..c716f1a8c0a3 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -209,13 +209,19 @@ enum reloc_type { RELO_EXTERN_VAR, RELO_EXTERN_FUNC, RELO_SUBPROG_ADDR, + RELO_CORE, };
struct reloc_desc { enum reloc_type type; int insn_idx; - int map_idx; - int sym_off; + union { + const struct bpf_core_relo *core_relo; /* used when type == RELO_CORE */ + struct { + int map_idx; + int sym_off; + }; + }; };
struct bpf_sec_def; @@ -5448,6 +5454,24 @@ static void *u32_as_hash_key(__u32 x) return (void *)(uintptr_t)x; }
+static int record_relo_core(struct bpf_program *prog, + const struct bpf_core_relo *core_relo, int insn_idx) +{ + struct reloc_desc *relos, *relo; + + relos = libbpf_reallocarray(prog->reloc_desc, + prog->nr_reloc + 1, sizeof(*relos)); + if (!relos) + return -ENOMEM; + relo = &relos[prog->nr_reloc]; + relo->type = RELO_CORE; + relo->insn_idx = insn_idx; + relo->core_relo = core_relo; + prog->reloc_desc = relos; + prog->nr_reloc++; + return 0; +} + static int bpf_core_apply_relo(struct bpf_program *prog, const struct bpf_core_relo *relo, int relo_idx, @@ -5484,10 +5508,12 @@ static int bpf_core_apply_relo(struct bpf_program *prog, return -EINVAL;
if (prog->obj->gen_loader) { - pr_warn("// TODO core_relo: prog %td insn[%d] %s kind %d\n", + const char *spec_str = btf__name_by_offset(local_btf, relo->access_str_off); + + pr_debug("record_relo_core: prog %td insn[%d] %s %s %s final insn_idx %d\n", prog - prog->obj->programs, relo->insn_off / 8, - local_name, relo->kind); - return -ENOTSUP; + btf_kind_str(local_type), local_name, spec_str, insn_idx); + return record_relo_core(prog, relo, insn_idx); }
if (relo->kind != BPF_CORE_TYPE_ID_LOCAL && @@ -5686,6 +5712,9 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) case RELO_CALL: /* handled already */ break; + case RELO_CORE: + /* will be handled by bpf_program_record_relos() */ + break; default: pr_warn("prog '%s': relo #%d: bad relo type %d\n", prog->name, i, relo->type); @@ -6126,6 +6155,35 @@ bpf_object__free_relocs(struct bpf_object *obj) } }
+static int cmp_relocs(const void *_a, const void *_b) +{ + const struct reloc_desc *a = _a; + const struct reloc_desc *b = _b; + + if (a->insn_idx != b->insn_idx) + return a->insn_idx < b->insn_idx ? -1 : 1; + + /* no two relocations should have the same insn_idx, but ... */ + if (a->type != b->type) + return a->type < b->type ? -1 : 1; + + return 0; +} + +static void bpf_object__sort_relos(struct bpf_object *obj) +{ + int i; + + for (i = 0; i < obj->nr_programs; i++) { + struct bpf_program *p = &obj->programs[i]; + + if (!p->nr_reloc) + continue; + + qsort(p->reloc_desc, p->nr_reloc, sizeof(*p->reloc_desc), cmp_relocs); + } +} + static int bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) { @@ -6140,6 +6198,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) err); return err; } + if (obj->gen_loader) + bpf_object__sort_relos(obj); }
/* Before relocating calls pre-process relocations and mark @@ -6344,21 +6404,6 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj, return 0; }
-static int cmp_relocs(const void *_a, const void *_b) -{ - const struct reloc_desc *a = _a; - const struct reloc_desc *b = _b; - - if (a->insn_idx != b->insn_idx) - return a->insn_idx < b->insn_idx ? -1 : 1; - - /* no two relocations should have the same insn_idx, but ... */ - if (a->type != b->type) - return a->type < b->type ? -1 : 1; - - return 0; -} - static int bpf_object__collect_relos(struct bpf_object *obj) { int i, err; @@ -6391,14 +6436,7 @@ static int bpf_object__collect_relos(struct bpf_object *obj) return err; }
- for (i = 0; i < obj->nr_programs; i++) { - struct bpf_program *p = &obj->programs[i]; - - if (!p->nr_reloc) - continue; - - qsort(p->reloc_desc, p->nr_reloc, sizeof(*p->reloc_desc), cmp_relocs); - } + bpf_object__sort_relos(obj); return 0; }
@@ -6639,7 +6677,7 @@ static int bpf_object_load_prog_instance(struct bpf_object *obj, struct bpf_prog return ret; }
-static int bpf_program__record_externs(struct bpf_program *prog) +static int bpf_program_record_relos(struct bpf_program *prog) { struct bpf_object *obj = prog->obj; int i; @@ -6661,6 +6699,17 @@ static int bpf_program__record_externs(struct bpf_program *prog) ext->is_weak, false, BTF_KIND_FUNC, relo->insn_idx); break; + case RELO_CORE: { + struct bpf_core_relo cr = { + .insn_off = relo->insn_idx * 8, + .type_id = relo->core_relo->type_id, + .access_str_off = relo->core_relo->access_str_off, + .kind = relo->core_relo->kind, + }; + + bpf_gen__record_relo_core(obj->gen_loader, &cr); + break; + } default: continue; } @@ -6700,7 +6749,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog prog->name, prog->instances.nr); } if (obj->gen_loader) - bpf_program__record_externs(prog); + bpf_program_record_relos(prog); err = bpf_object_load_prog_instance(obj, prog, prog->insns, prog->insns_cnt, license, kern_ver, &fd);
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit be05c94476f3cf4fdc29feab4ed1053187323296 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Add ability to initialize inner maps in light skeleton.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-11-alexei.starovoitov@gmail... (cherry picked from commit be05c94476f3cf4fdc29feab4ed1053187323296) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/bpf_gen_internal.h | 1 + tools/lib/bpf/gen_loader.c | 27 +++++++++++++++++++++++++++ tools/lib/bpf/libbpf.c | 6 +++--- 3 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h index a5f332f0f495..aeaae7ed09ca 100644 --- a/tools/lib/bpf/bpf_gen_internal.h +++ b/tools/lib/bpf/bpf_gen_internal.h @@ -60,5 +60,6 @@ void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum b void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, bool is_typeless, int kind, int insn_idx); void bpf_gen__record_relo_core(struct bpf_gen *gen, const struct bpf_core_relo *core_relo); +void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int key, int inner_map_idx);
#endif diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 2bc783050060..722b1f94f8b6 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -1077,6 +1077,33 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue, emit_check_err(gen); }
+void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int slot, + int inner_map_idx) +{ + int attr_size = offsetofend(union bpf_attr, flags); + int map_update_attr, key; + union bpf_attr attr; + + memset(&attr, 0, attr_size); + pr_debug("gen: populate_outer_map: outer %d key %d inner %d\n", + outer_map_idx, slot, inner_map_idx); + + key = add_data(gen, &slot, sizeof(slot)); + + map_update_attr = add_data(gen, &attr, attr_size); + move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4, + blob_fd_array_off(gen, outer_map_idx)); + emit_rel_store(gen, attr_field(map_update_attr, key), key); + emit_rel_store(gen, attr_field(map_update_attr, value), + blob_fd_array_off(gen, inner_map_idx)); + + /* emit MAP_UPDATE_ELEM command */ + emit_sys_bpf(gen, BPF_MAP_UPDATE_ELEM, map_update_attr, attr_size); + debug_ret(gen, "populate_outer_map outer %d key %d inner %d", + outer_map_idx, slot, inner_map_idx); + emit_check_err(gen); +} + void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx) { int attr_size = offsetofend(union bpf_attr, map_fd); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index c716f1a8c0a3..02530e19958d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4934,9 +4934,9 @@ static int init_map_in_map_slots(struct bpf_object *obj, struct bpf_map *map) fd = bpf_map__fd(targ_map);
if (obj->gen_loader) { - pr_warn("// TODO map_update_elem: idx %td key %d value==map_idx %td\n", - map - obj->maps, i, targ_map - obj->maps); - return -ENOTSUP; + bpf_gen__populate_outer_map(obj->gen_loader, + map - obj->maps, i, + targ_map - obj->maps); } else { err = bpf_map_update_elem(map->fd, &i, &fd, 0); }
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 19250f5fc0c283892a61f3abf9d65e6325f63897 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
The gen_loader has to clear attach_kind otherwise the programs without attach_btf_id will fail load if they follow programs with attach_btf_id.
Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211201181040.23337-12-alexei.starovoitov@gmail... (cherry picked from commit 19250f5fc0c283892a61f3abf9d65e6325f63897) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/gen_loader.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 722b1f94f8b6..f955e2c895ea 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -1029,9 +1029,11 @@ void bpf_gen__prog_load(struct bpf_gen *gen, debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt); /* successful or not, close btf module FDs used in extern ksyms and attach_btf_obj_fd */ cleanup_relos(gen, insns); - if (gen->attach_kind) + if (gen->attach_kind) { emit_sys_close_blob(gen, attr_field(prog_load_attr, attach_btf_obj_fd)); + gen->attach_kind = 0; + } emit_check_err(gen); /* remember prog_fd in the stack, if successful */ emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7,
From: Andrii Nakryiko andrii@kernel.org
mainline inclusion from mainline-5.17-rc1 commit b8b5cb55f5d3f03cc1479a3768d68173a10359ad category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Fix the `int i` declaration inside the for statement. This is non-C89 compliant. See [0] for user report breaking BCC build.
[0] https://github.com/libbpf/libbpf/issues/403
Fixes: 18f4fccbf314 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations") Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Kumar Kartikeya Dwivedi memxor@gmail.com Link: https://lore.kernel.org/bpf/20211105191055.3324874-1-andrii@kernel.org (cherry picked from commit b8b5cb55f5d3f03cc1479a3768d68173a10359ad) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index f955e2c895ea..515b4c5d4ffd 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -596,8 +596,9 @@ void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak, static struct ksym_desc *get_ksym_desc(struct bpf_gen *gen, struct ksym_relo_desc *relo) { struct ksym_desc *kdesc; + int i;
- for (int i = 0; i < gen->nr_ksyms; i++) { + for (i = 0; i < gen->nr_ksyms; i++) { if (!strcmp(gen->ksyms[i].name, relo->name)) { gen->ksyms[i].ref++; return &gen->ksyms[i];
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 78c1f8d0634cc35da613d844eda7c849fc50f643 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Reduce bpf_core_apply_relo_insn() stack usage and bump BPF_CORE_SPEC_MAX_LEN limit back to 64.
Fixes: 29db4bea1d10 ("bpf: Prepare relo_core.c for kernel duty.") Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211203182836.16646-1-alexei.starovoitov@gmail.... (cherry picked from commit 78c1f8d0634cc35da613d844eda7c849fc50f643) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 11 ++++++- tools/lib/bpf/libbpf.c | 4 ++- tools/lib/bpf/relo_core.c | 62 ++++++++++++--------------------------- tools/lib/bpf/relo_core.h | 30 ++++++++++++++++++- 4 files changed, 61 insertions(+), 46 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 525e68a4816c..36eb24958607 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6557,8 +6557,16 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, { bool need_cands = relo->kind != BPF_CORE_TYPE_ID_LOCAL; struct bpf_core_cand_list cands = {}; + struct bpf_core_spec *specs; int err;
+ /* ~4k of temp memory necessary to convert LLVM spec like "0:1:0:5" + * into arrays of btf_ids of struct fields and array indices. + */ + specs = kcalloc(3, sizeof(*specs), GFP_KERNEL); + if (!specs) + return -ENOMEM; + if (need_cands) { struct bpf_cand_cache *cc; int i; @@ -6594,8 +6602,9 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, }
err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8, - relo, relo_idx, ctx->btf, &cands); + relo, relo_idx, ctx->btf, &cands, specs); out: + kfree(specs); if (need_cands) { kfree(cands.cands); mutex_unlock(&cand_cache_mutex); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 02530e19958d..c0596e4b6aab 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5478,6 +5478,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, const struct btf *local_btf, struct hashmap *cand_cache) { + struct bpf_core_spec specs_scratch[3] = {}; const void *type_key = u32_as_hash_key(relo->type_id); struct bpf_core_cand_list *cands = NULL; const char *prog_name = prog->name; @@ -5532,7 +5533,8 @@ static int bpf_core_apply_relo(struct bpf_program *prog, } }
- return bpf_core_apply_relo_insn(prog_name, insn, insn_idx, relo, relo_idx, local_btf, cands); + return bpf_core_apply_relo_insn(prog_name, insn, insn_idx, relo, + relo_idx, local_btf, cands, specs_scratch); }
static int diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 35a38fccc837..462d6d9bb0cd 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -68,33 +68,6 @@ enum libbpf_print_level { #include "libbpf_internal.h" #endif
-#define BPF_CORE_SPEC_MAX_LEN 32 - -/* represents BPF CO-RE field or array element accessor */ -struct bpf_core_accessor { - __u32 type_id; /* struct/union type or array element type */ - __u32 idx; /* field index or array index */ - const char *name; /* field name or NULL for array accessor */ -}; - -struct bpf_core_spec { - const struct btf *btf; - /* high-level spec: named fields and array indices only */ - struct bpf_core_accessor spec[BPF_CORE_SPEC_MAX_LEN]; - /* original unresolved (no skip_mods_or_typedefs) root type ID */ - __u32 root_type_id; - /* CO-RE relocation kind */ - enum bpf_core_relo_kind relo_kind; - /* high-level spec length */ - int len; - /* raw, low-level spec: 1-to-1 with accessor spec string */ - int raw_spec[BPF_CORE_SPEC_MAX_LEN]; - /* raw spec length */ - int raw_len; - /* field bit offset represented by spec */ - __u32 bit_offset; -}; - static bool is_flex_arr(const struct btf *btf, const struct bpf_core_accessor *acc, const struct btf_array *arr) @@ -1200,9 +1173,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, const struct bpf_core_relo *relo, int relo_idx, const struct btf *local_btf, - struct bpf_core_cand_list *cands) + struct bpf_core_cand_list *cands, + struct bpf_core_spec *specs_scratch) { - struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; + struct bpf_core_spec *local_spec = &specs_scratch[0]; + struct bpf_core_spec *cand_spec = &specs_scratch[1]; + struct bpf_core_spec *targ_spec = &specs_scratch[2]; struct bpf_core_relo_res cand_res, targ_res; const struct btf_type *local_type; const char *local_name; @@ -1221,7 +1197,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, return -EINVAL;
err = bpf_core_parse_spec(prog_name, local_btf, local_id, spec_str, - relo->kind, &local_spec); + relo->kind, local_spec); if (err) { pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", prog_name, relo_idx, local_id, btf_kind_str(local_type), @@ -1232,15 +1208,15 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); - bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, &local_spec); + bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, local_spec); libbpf_print(LIBBPF_DEBUG, "\n");
/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) { targ_res.validate = true; targ_res.poison = false; - targ_res.orig_val = local_spec.root_type_id; - targ_res.new_val = local_spec.root_type_id; + targ_res.orig_val = local_spec->root_type_id; + targ_res.new_val = local_spec->root_type_id; goto patch_insn; }
@@ -1253,38 +1229,38 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
for (i = 0, j = 0; i < cands->len; i++) { - err = bpf_core_spec_match(&local_spec, cands->cands[i].btf, - cands->cands[i].id, &cand_spec); + err = bpf_core_spec_match(local_spec, cands->cands[i].btf, + cands->cands[i].id, cand_spec); if (err < 0) { pr_warn("prog '%s': relo #%d: error matching candidate #%d ", prog_name, relo_idx, i); - bpf_core_dump_spec(prog_name, LIBBPF_WARN, &cand_spec); + bpf_core_dump_spec(prog_name, LIBBPF_WARN, cand_spec); libbpf_print(LIBBPF_WARN, ": %d\n", err); return err; }
pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name, relo_idx, err == 0 ? "non-matching" : "matching", i); - bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, &cand_spec); + bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, cand_spec); libbpf_print(LIBBPF_DEBUG, "\n");
if (err == 0) continue;
- err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, &cand_spec, &cand_res); + err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, cand_spec, &cand_res); if (err) return err;
if (j == 0) { targ_res = cand_res; - targ_spec = cand_spec; - } else if (cand_spec.bit_offset != targ_spec.bit_offset) { + *targ_spec = *cand_spec; + } else if (cand_spec->bit_offset != targ_spec->bit_offset) { /* if there are many field relo candidates, they * should all resolve to the same bit offset */ pr_warn("prog '%s': relo #%d: field offset ambiguity: %u != %u\n", - prog_name, relo_idx, cand_spec.bit_offset, - targ_spec.bit_offset); + prog_name, relo_idx, cand_spec->bit_offset, + targ_spec->bit_offset); return -EINVAL; } else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) { /* all candidates should result in the same relocation @@ -1328,7 +1304,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, prog_name, relo_idx);
/* calculate single target relo result explicitly */ - err = bpf_core_calc_relo(prog_name, relo, relo_idx, &local_spec, NULL, &targ_res); + err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, &targ_res); if (err) return err; } diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index 4f864b8e33b7..17799819ad7c 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -17,11 +17,39 @@ struct bpf_core_cand_list { int len; };
+#define BPF_CORE_SPEC_MAX_LEN 64 + +/* represents BPF CO-RE field or array element accessor */ +struct bpf_core_accessor { + __u32 type_id; /* struct/union type or array element type */ + __u32 idx; /* field index or array index */ + const char *name; /* field name or NULL for array accessor */ +}; + +struct bpf_core_spec { + const struct btf *btf; + /* high-level spec: named fields and array indices only */ + struct bpf_core_accessor spec[BPF_CORE_SPEC_MAX_LEN]; + /* original unresolved (no skip_mods_or_typedefs) root type ID */ + __u32 root_type_id; + /* CO-RE relocation kind */ + enum bpf_core_relo_kind relo_kind; + /* high-level spec length */ + int len; + /* raw, low-level spec: 1-to-1 with accessor spec string */ + int raw_spec[BPF_CORE_SPEC_MAX_LEN]; + /* raw spec length */ + int raw_len; + /* field bit offset represented by spec */ + __u32 bit_offset; +}; + int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, int insn_idx, const struct bpf_core_relo *relo, int relo_idx, const struct btf *local_btf, - struct bpf_core_cand_list *cands); + struct bpf_core_cand_list *cands, + struct bpf_core_spec *specs_scratch); int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, const struct btf *targ_btf, __u32 targ_id);
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 29f2e5bd9439445fe14ba8570b1c9a7ad682df84 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
When CONFIG_DEBUG_INFO_BTF_MODULES is not set the following warning can be seen: kernel/bpf/btf.c:6588:13: warning: 'purge_cand_cache' defined but not used [-Wunused-function] Fix it.
Fixes: 1e89106da253 ("bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().") Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20211207014839.6976-1-alexei.starovoitov@gmail.c... (cherry picked from commit 29f2e5bd9439445fe14ba8570b1c9a7ad682df84) Signed-off-by: Wang Yufen wangyufen@huawei.com --- kernel/bpf/btf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 36eb24958607..3cc3f59beeda 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6376,6 +6376,7 @@ static struct bpf_cand_cache *populate_cand_cache(struct bpf_cand_cache *cands, return new_cands; }
+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES static void __purge_cand_cache(struct btf *btf, struct bpf_cand_cache **cache, int cache_size) { @@ -6414,6 +6415,7 @@ static void purge_cand_cache(struct btf *btf) __purge_cand_cache(btf, module_cand_cache, MODULE_CAND_CACHE_SIZE); mutex_unlock(&cand_cache_mutex); } +#endif
static struct bpf_cand_cache * bpf_core_add_cands(struct bpf_cand_cache *cands, const struct btf *targ_btf,
From: Shuyi Cheng chengshuyi@linux.alibaba.com
mainline inclusion from mainline-5.17-rc1 commit 229fae38d0fc0d6ff58d57cbeb1432da55e58d4f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
Fix error: "failed to pin map: Bad file descriptor, path: /sys/fs/bpf/_rodata_str1_1."
In the old kernel, the global data map will not be created, see [0]. So we should skip the pinning of the global data map to avoid bpf_object__pin_maps returning error. Therefore, when the map is not created, we mark “map->skipped" as true and then check during relocation and during pinning.
Fixes: 16e0c35c6f7a ("libbpf: Load global data maps lazily on legacy kernels") Signed-off-by: Shuyi Cheng chengshuyi@linux.alibaba.com Signed-off-by: Andrii Nakryiko andrii@kernel.org (cherry picked from commit 229fae38d0fc0d6ff58d57cbeb1432da55e58d4f) Signed-off-by: Wang Yufen wangyufen@huawei.com
Conflicts: tools/lib/bpf/libbpf.c
Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/libbpf.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index c0596e4b6aab..c8b5a8f6e84f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -388,6 +388,7 @@ struct bpf_map { char *pin_path; bool pinned; bool reused; + bool skipped; };
enum extern_type { @@ -5036,8 +5037,10 @@ bpf_object__create_maps(struct bpf_object *obj) * kernels. */ if (bpf_map__is_internal(map) && - !kernel_supports(obj, FEAT_GLOBAL_DATA)) + !kernel_supports(obj, FEAT_GLOBAL_DATA)) { + map->skipped = true; continue; + }
retried = false; retry: @@ -5666,8 +5669,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog) } else { const struct bpf_map *map = &obj->maps[relo->map_idx];
- if (bpf_map__is_internal(map) && - !kernel_supports(obj, FEAT_GLOBAL_DATA)) { + if (map->skipped) { pr_warn("prog '%s': relo #%d: kernel doesn't support global data\n", prog->name, i); return -ENOTSUP; @@ -7813,6 +7815,9 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path) char *pin_path = NULL; char buf[PATH_MAX];
+ if (map->skipped) + continue; + if (path) { int len;
From: Alexei Starovoitov ast@kernel.org
mainline inclusion from mainline-5.17-rc1 commit 259172bb6514758ce3be1610c500b51a9f44212a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5EUVD CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------------
libbpf's obj->nr_programs includes static and global functions. That number could be higher than the actual number of bpf programs going be loaded by gen_loader. Passing larger nr_programs to bpf_gen__init() doesn't hurt. Those exra stack slots will stay as zero. bpf_gen__finish() needs to check that actual number of progs that gen_loader saw is less than or equal to obj->nr_programs.
Fixes: ba05fd36b851 ("libbpf: Perform map fd cleanup for gen_loader in case of error") Signed-off-by: Alexei Starovoitov ast@kernel.org (cherry picked from commit 259172bb6514758ce3be1610c500b51a9f44212a) Signed-off-by: Wang Yufen wangyufen@huawei.com --- tools/lib/bpf/gen_loader.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 515b4c5d4ffd..fbccd20798d6 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -371,8 +371,9 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps) { int i;
- if (nr_progs != gen->nr_progs || nr_maps != gen->nr_maps) { - pr_warn("progs/maps mismatch\n"); + if (nr_progs < gen->nr_progs || nr_maps != gen->nr_maps) { + pr_warn("nr_progs %d/%d nr_maps %d/%d mismatch\n", + nr_progs, gen->nr_progs, nr_maps, gen->nr_maps); gen->error = -EFAULT; return gen->error; }