This series enables perf branch stack sampling support on arm64 platform
via a new arch feature called Branch Record Buffer Extension (BRBE). All
the relevant register definitions could be accessed here.
https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
This series applies on 6.10-rc3.
Also this series is being hosted below for quick access, review and test.
https://git.gitlab.arm.com/linux-arm/linux-anshuman.git (brbe_v18)
- Anshuman
========== Perf Branch Stack Sampling Support (arm64 platforms) ===========
Currently arm64 platform does not support perf branch stack sampling. Hence
any event requesting for branch stack records i.e PERF_SAMPLE_BRANCH_STACK
marked in event->attr.sample_type, will be rejected in armpmu_event_init().
static int armpmu_event_init(struct perf_event *event)
{
........
/* does not support taken branch sampling */
if (has_branch_stack(event))
return -EOPNOTSUPP;
........
}
$perf record -j any,u,k ls
Error:
cycles:P: PMU Hardware or event type doesn't support branch stack sampling.
-------------------- CONFIG_ARM64_BRBE and FEAT_BRBE ----------------------
After this series, perf branch stack sampling feature gets enabled on arm64
platforms where FEAT_BRBE HW feature is supported, and CONFIG_ARM64_BRBE is
also selected during build. Let's observe all all possible scenarios here.
1. Feature not built (!CONFIG_ARM64_BRBE):
Falls back to the current behaviour i.e event gets rejected.
2. Feature built but HW not supported (CONFIG_ARM64_BRBE && !FEAT_BRBE):
Falls back to the current behaviour i.e event gets rejected.
3. Feature built and HW supported (CONFIG_ARM64_BRBE && FEAT_BRBE):
Platform supports branch stack sampling requests. Let's observe through a
simple example here.
$perf record -j any_call,u,k,save_type ls
[Please refer perf-record man pages for all possible branch filter options]
$perf report
-------------------------- Snip ----------------------
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ............................................ ............................................ ..................
#
3.52% ls [kernel.kallsyms] [k] sched_clock_noinstr [k] arch_counter_get_cntpct 16
3.52% ls [kernel.kallsyms] [k] sched_clock [k] sched_clock_noinstr 9
1.85% ls [kernel.kallsyms] [k] sched_clock_cpu [k] sched_clock 5
1.80% ls [kernel.kallsyms] [k] irqtime_account_irq [k] sched_clock_cpu 20
1.58% ls [kernel.kallsyms] [k] gic_handle_irq [k] generic_handle_domain_irq 19
1.58% ls [kernel.kallsyms] [k] call_on_irq_stack [k] gic_handle_irq 9
1.58% ls [kernel.kallsyms] [k] do_interrupt_handler [k] call_on_irq_stack 23
1.58% ls [kernel.kallsyms] [k] generic_handle_domain_irq [k] __irq_resolve_mapping 6
1.58% ls [kernel.kallsyms] [k] __irq_resolve_mapping [k] __rcu_read_lock 10
-------------------------- Snip ----------------------
$perf report -D | grep cycles
-------------------------- Snip ----------------------
..... 1: ffff800080dd3334 -> ffff800080dd759c 39 cycles P 0 IND_CALL
..... 2: ffff800080ffaea0 -> ffff800080ffb688 16 cycles P 0 IND_CALL
..... 3: ffff800080139918 -> ffff800080ffae64 9 cycles P 0 CALL
..... 4: ffff800080dd3324 -> ffff8000801398f8 7 cycles P 0 CALL
..... 5: ffff8000800f8548 -> ffff800080dd330c 21 cycles P 0 IND_CALL
..... 6: ffff8000800f864c -> ffff8000800f84ec 6 cycles P 0 CALL
..... 7: ffff8000800f86dc -> ffff8000800f8638 11 cycles P 0 CALL
..... 8: ffff8000800f86d4 -> ffff800081008630 16 cycles P 0 CALL
-------------------------- Snip ----------------------
perf script and other tooling can also be applied on the captured perf.data
Similarly branch stack sampling records can be collected via direct system
call i.e perf_event_open() method after setting 'struct perf_event_attr' as
required.
event->attr.sample_type |= PERF_SAMPLE_BRANCH_STACK
event->attr.branch_sample_type |= PERF_SAMPLE_BRANCH_<FILTER_1> |
PERF_SAMPLE_BRANCH_<FILTER_2> |
PERF_SAMPLE_BRANCH_<FILTER_3> |
...............................
But all branch filters might not be supported on the platform.
----------------------- BRBE Branch Filters Support -----------------------
- Following branch filters are supported on arm64.
PERF_SAMPLE_BRANCH_USER /* Branch privilege filters */
PERF_SAMPLE_BRANCH_KERNEL
PERF_SAMPLE_BRANCH_HV
PERF_SAMPLE_BRANCH_ANY /* Branch type filters */
PERF_SAMPLE_BRANCH_ANY_CALL
PERF_SAMPLE_BRANCH_ANY_RETURN
PERF_SAMPLE_BRANCH_IND_CALL
PERF_SAMPLE_BRANCH_COND
PERF_SAMPLE_BRANCH_IND_JUMP
PERF_SAMPLE_BRANCH_CALL
PERF_SAMPLE_BRANCH_NO_FLAGS /* Branch record flags */
PERF_SAMPLE_BRANCH_NO_CYCLES
PERF_SAMPLE_BRANCH_TYPE_SAVE
PERF_SAMPLE_BRANCH_HW_INDEX
PERF_SAMPLE_BRANCH_PRIV_SAVE
- Following branch filters are not supported on arm64.
PERF_SAMPLE_BRANCH_ABORT_TX
PERF_SAMPLE_BRANCH_IN_TX
PERF_SAMPLE_BRANCH_NO_TX
PERF_SAMPLE_BRANCH_CALL_STACK
Events requesting above non-supported branch filters get rejected.
--------------------------- Virtualisation support ------------------------
- No guest support
-------------------------------- Testing ---------------------------------
- Cross compiled for both arm64 and arm32 platforms
- Passes all branch tests with 'perf test branch' on arm64
Changes in V18:
- Changed BRBIDR0_EL1 register fields CC and FORMAT, updated the commit message
- Replaced BRBIDR0_EL1_FORMAT_0 as BRBIDR0_EL1_FORMAT_FORMAT_0 in BRBE driver
- Dropped ifdef CONFIG_ARM64_BRBE around __init_el2_brbe()
- Updated in code comment around __init_el2_brbe()
- Dropped the write up for EL2->EL1 transition, also moved up the EL3 write up
- Unconditionally capture branch record type and privilege information
- Scan valid branch stack events in armpmu_start() to create merged filter
- Dropped branch_sample_type override in armv8pmu_branch_stack_add()
- Dropped branch filter mismatch between PMU and event in read_branch_records()
- Added SW filtering framework in read_branch_records() during filter mismatch
- Added SW filtering for privilege modes
- Used host_data_ptr() to access host_debug_state.brbcr_el1 register
- Changed DEBUG_STATE_SAVE_BRBE to use BIT(7)
- Reverted back iflags as u8
https://lore.kernel.org/all/20240405024639.1179064-1-anshuman.khandual@arm.…
Changes in V17:
- Added back Reviewed-by tags from Mark Brown
- Updated the commit message regarding the field BRBINFx_EL1_TYPE_IMPDEF_TRAP_EL3
- Added leading 0s for all values as BRBIDR0_EL1.NUMREC is a 8 bit field
- Added leading 0s for all values as BRBFCR_EL1.BANK is a 2 bit field
- Reordered BRBCR_EL1/BRBCR_EL12/BRBCR_EL2 registers as per sysreg encodings
- Renamed s/FIRST/BANK_0 and s/SECOND/BANK_1 in BRBFCR_EL1.BANK
- Renamed s/UNCOND_DIRECT/DIRECT_UNCOND in BRBINFx_EL1.TYPE
- Renamed s/COND_DIRECT/DIRECT_COND in BRBINFx_EL1.TYPE
- Dropped __SYS_BRBINF/__SYS_BRBSRC/__SYS_BRBTGT and their expansions
- Moved all existing BRBE registers from sysreg.h header to tools/sysreg format
- Updated the commit message including about sys_insn_descs[]
- Changed KVM to use existing SYS_BRBSRC/TGT/INF_EL1(n) format
- Moved the BRBE instructions into sys_insn_descs[] array
- ARM PMUV3 changes have been moved into the BRBE driver patch instead
- Moved down branch_stack_add() in armpmu_add() after event's basic checks
- Added new callbacks in struct arm_pmu e.g branch_stack_[init|add|del]()
- Renamed struct arm_pmu callback branch_reset() as branch_stack_reset()
- Dropped the comment in armpmu_event_init()
- Renamed 'pmu_hw_events' elements from 'brbe_' to more generic 'branch_'
- Separated out from the BRBE driver implementation patch
- Dropped the comment in __init_el2_brbe()
- Updated __init_el2_brbe() with BRBCR_EL2.MPRED requirements
- Updated __init_el2_brbe() with __check_hvhe() constructs
- Updated booting.rst regarding MPRED, MDCR_EL3 and fine grained control
- Dropped Documentation/arch/arm64/brbe.rst
- Renamed armv8pmu_branch_reset() as armv8pmu_branch_stack_reset()
- Separated out booting.rst and EL2 boot requirements into a new patch
- Dropped process_branch_aborts() completely
- Added an warning if transaction states get detected unexpectedly
- Dropped enum brbe_bank_idx from the driver
- Defined armv8pmu_branch_stack_init/add/del() callbacks in the driver
- Changed BRBE driver to use existing SYS_BRBSRC/TGT/INF_EL1(n) format
- Dropped isb() call sites in __debug_[save|restore]_brbe()
- Changed to [read|write]_sysreg_el1() accessors in __debug_[save|restore]_brbe()
Changes in V16
https://lore.kernel.org/all/20240125094119.2542332-1-anshuman.khandual@arm.…
- Updated BRBINFx_EL1.TYPE = 0b110000 as field IMPDEF_TRAP_EL3
- Updated BRBCR_ELx[9] as field FZPSS
- Updated BRBINFINJ_EL1 to use sysreg field BRBINFx_EL1
- Added BRB_INF_SRC_TGT_EL1 macro for corresponding BRB_[INF|SRC|TGT] expansion
- Renamed arm_brbe.h as arm_pmuv3_branch.h
- Updated perf_sample_save_brstack()'s new argument requirements with NULL
- Fixed typo (s/informations/information) in Documentation/arch/arm64/brbe.rst
- Added SPDX-License-Identifier in Documentation/arch/arm64/brbe.rst
- Added new PERF_SAMPLE_BRANCH_COUNTERS into BRBE_EXCLUDE_BRANCH_FILTERS
- Dropped BRBFCR_EL1 and BRBCR_EL1 from enum vcpu_sysreg
- Reverted back the KVM NVHE patch - use host_debug_state based 'brbcr_el1'
element and dropped the previous dependency on Jame's coresight series
Changes in V15:
https://lore.kernel.org/all/20231201053906.1261704-1-anshuman.khandual@arm.…
- Added a comment for armv8pmu_branch_probe() regarding single cpu probe
- Added a text in brbe.rst regarding single cpu probe
- Dropped runtime BRBE enable for setting DEBUG_STATE_SAVE_BRBE
- Dropped zero_branch_stack based zero branch records mechanism
- Replaced BRBFCR_EL1_DEFAULT_CONFIG with BRBFCR_EL1_CONFIG_MASK
- Added BRBFCR_EL1_CONFIG_MASK masking in branch_type_to_brbfcr()
- Moved BRBE helpers from arm_brbe.h into arm_brbe.c
- Moved armv8_pmu_xxx() declaration inside arm_brbe.h for arm64 (CONFIG_ARM64_BRBE)
- Moved armv8_pmu_xxx() stub definitions inside arm_brbe.h for arm32 (!CONFIG_ARM64_BRBE)
- Included arm_brbe.h header both in arm_pmuv3.c and arm_brbe.c
- Dropped BRBE custom pr_fmt()
- Dropped CONFIG_PERF_EVENTS wrapping from header entries
- Flush branch records when a cpu bound event follows a task bound event
- Dropped BRBFCR_EL1 from __debug_save_brbe()/__debug_restore_brbe()
- Always save the live SYS_BRBCR_EL1 in host context and then check if
BRBE was enabled before resetting SYS_BRBCR_EL1 for the host
Changes in V14:
https://lore.kernel.org/all/20231114051329.327572-1-anshuman.khandual@arm.c…
- This series has been reorganised as suggested during V13
- There are just eight patches now i.e 5 enablement and 3 perf branch tests
- Fixed brackets problem in __SYS_BRBINFO/BRBSRC/BRBTGT() macros
- Renamed the macro i.e s/__SYS_BRBINFO/__SYS_BRBINF/
- Renamed s/BRB_IALL/BRB_IALL_INSN and s/BRBE_INJ/BRB_INJ_INSN
- Moved BRB_IALL_INSN and SYS_BRB_INSN instructions to sysreg patch
- Changed E1BRE as ExBRE in sysreg fields inside BRBCR_ELx
- Used BRBCR_ELx for defining all BRBCR_EL1, BRBCR_EL2, and BRBCR_EL12 (new)
- Folded the following three patches into a single patch i.e [PATCH 3/8]
drivers: perf: arm_pmu: Add new sched_task() callback
arm64/perf: Add branch stack support in struct arm_pmu
arm64/perf: Add branch stack support in struct pmu_hw_events
arm64/perf: Add branch stack support in ARMV8 PMU
arm64/perf: Add PERF_ATTACH_TASK_DATA to events with has_branch_stack()
- All armv8pmu_branch_xxxx() stub definitions have been moved inside
include/linux/perf/arm_pmuv3.h for easy access from both arm32 and arm64
- Added brbe_users, brbe_context and brbe_sample_type in struct pmu_hw_events
- Added comments for all the above new elements in struct pmu_hw_events
- Added branch_reset() and sched_task() callbacks
- Changed and optimized branch records processing during a PMU IRQ
- NO branch records get captured for event with mismatched brbe_sample_type
- Branch record context is tracked from armpmu_del() & armpmu_add()
- Branch record hardware is driven from armv8pmu_start() & armv8pmu_stop()
- Dropped NULL check for 'pmu_ctx' inside armv8pmu_sched_task()
- Moved down PERF_ATTACH_TASK_DATA assignment with a preceding comment
- In conflicting branch sample type requests, first event takes precedence
- Folded the following five patches from V13 into a single patch i.e
[PATCH 4/8]
arm64/perf: Enable branch stack events via FEAT_BRBE
arm64/perf: Add struct brbe_regset helper functions
arm64/perf: Implement branch records save on task sched out
arm64/perf: Implement branch records save on PMU IRQ
- Fixed the year in copyright statement
- Added Documentation/arch/arm64/brbe.rst
- Updated Documentation/arch/arm64/booting.rst (BRBCR_EL2.CC for EL1 entry)
- Added __init_el2_brbe() which enables branch record cycle count support
- Disabled EL2 traps in __init_el2_fgt() while accessing BRBE registers and
executing instructions
- Changed CONFIG_ARM64_BRBE user visible description
- Fixed a typo in CONFIG_ARM64_BRBE config option description text
- Added BUILD_BUG_ON() co-relating BRBE_BANK_MAX_ENTRIES and MAX_BRANCH_RECORDS
- Dropped arm64_create_brbe_task_ctx_kmem_cache()
- Moved down comment for PERF_SAMPLE_BRANCH_KERNEL in branch_type_to_brbcr()
- Renamed BRBCR_ELx_DEFAULT_CONFIG as BRBCR_ELx_CONFIG_MASK
- Replaced BRBCR_ELx_DEFAULT_TS with BRBCR_ELx_TS_MASK in BRBCR_ELx_CONFIG_MASK
- Replaced BRBCR_ELx_E1BRE instances with BRBCR_ELx_ExBRE
- Added BRBE specific branch stack sampling perf test patches into the series
- Added a patch to prevent guest accesses into BRBE registers and instructions
- Added a patch to save the BRBE host context in NVHE environment
- Updated most commit messages
Changes in V13:
https://lore.kernel.org/all/20230711082455.215983-1-anshuman.khandual@arm.c…https://lore.kernel.org/all/20230622065351.1092893-1-anshuman.khandual@arm.…
- Added branch callback stubs for aarch32 pmuv3 based platforms
- Updated the comments for capture_brbe_regset()
- Deleted the comments in __read_brbe_regset()
- Reversed the arguments order in capture_brbe_regset() and brbe_branch_save()
- Fixed BRBE_BANK[0|1]_IDX_MAX indices comparison in armv8pmu_branch_read()
- Fixed BRBE_BANK[0|1]_IDX_MAX indices comparison in capture_brbe_regset()
Changes in V12:
https://lore.kernel.org/all/20230615133239.442736-1-anshuman.khandual@arm.c…
- Replaced branch types with complete DIRECT/INDIRECT prefixes/suffixes
- Replaced branch types with complete INSN/ALIGN prefixes/suffixes
- Replaced return branch types as simple RET/ERET
- Replaced time field GST_PHYSICAL as GUEST_PHYSICAL
- Added 0 padding for BRBIDR0_EL1.NUMREC enum values
- Dropped helper arm_pmu_branch_stack_supported()
- Renamed armv8pmu_branch_valid() as armv8pmu_branch_attr_valid()
- Separated perf_task_ctx_cache setup from arm_pmu private allocation
- Collected changes to branch_records_alloc() in a single patch [5/10]
- Reworked and cleaned up branch_records_alloc()
- Reworked armv8pmu_branch_read() with new loop iterations in patch [6/10]
- Reworked capture_brbe_regset() with new loop iterations in patch [8/10]
- Updated the comment in branch_type_to_brbcr()
- Fixed the comment before stitch_stored_live_entries()
- Fixed BRBINFINJ_EL1 definition for VALID_FULL enum field
- Factored out helper __read_brbe_regset() from capture_brbe_regset()
- Dropped the helper copy_brbe_regset()
- Simplified stitch_stored_live_entries() with memcpy(), memmove()
- Reworked armv8pmu_probe_pmu() to bail out early with !probe.present
- Rework brbe_attributes_probe() without 'struct brbe_hw_attr'
- Dropped 'struct brbe_hw_attr' argument from capture_brbe_regset()
- Dropped 'struct brbe_hw_attr' argument from brbe_branch_save()
- Dropped arm_pmu->private and added arm_pmu->reg_trbidr instead
Changes in V11:
https://lore.kernel.org/all/20230531040428.501523-1-anshuman.khandual@arm.c…
- Fixed the crash for per-cpu events without event->pmu_ctx->task_ctx_data
Changes in V10:
https://lore.kernel.org/all/20230517022410.722287-1-anshuman.khandual@arm.c…
- Rebased the series on v6.4-rc2
- Moved ARMV8 PMUV3 changes inside drivers/perf/arm_pmuv3.c
- Moved BRBE driver changes inside drivers/perf/arm_brbe.[c|h]
- Moved the WARN_ON() inside the if condition in armv8pmu_handle_irq()
Changes in V9:
https://lore.kernel.org/all/20230315051444.1683170-1-anshuman.khandual@arm.…
- Fixed build problem with has_branch_stack() in arm64 header
- BRBINF_EL1 definition has been changed from 'Sysreg' to 'SysregFields'
- Renamed all BRBINF_EL1 call sites as BRBINFx_EL1
- Dropped static const char branch_filter_error_msg[]
- Implemented a positive list check for BRBE supported perf branch filters
- Added a comment in armv8pmu_handle_irq()
- Implemented per-cpu allocation for struct branch_record records
- Skipped looping through bank 1 if an invalid record is detected in bank 0
- Added comment in armv8pmu_branch_read() explaining prohibited region etc
- Added comment warning about erroneously marking transactions as aborted
- Replaced the first argument (perf_branch_entry) in capture_brbe_flags()
- Dropped the last argument (idx) in capture_brbe_flags()
- Dropped the brbcr argument from capture_brbe_flags()
- Used perf_sample_save_brstack() to capture branch records for perf_sample_data
- Added comment explaining rationale for setting BRBCR_EL1_FZP for user only traces
- Dropped BRBE prohibited state mechanism while in armv8pmu_branch_read()
- Implemented event task context based branch records save mechanism
Changes in V8:
https://lore.kernel.org/all/20230123125956.1350336-1-anshuman.khandual@arm.…
- Replaced arm_pmu->features as arm_pmu->has_branch_stack, updated its helper
- Added a comment and line break before arm_pmu->private element
- Added WARN_ON_ONCE() in helpers i.e armv8pmu_branch_[read|valid|enable|disable]()
- Dropped comments in armv8pmu_enable_event() and armv8pmu_disable_event()
- Replaced open bank encoding in BRBFCR_EL1 with SYS_FIELD_PREP()
- Changed brbe_hw_attr->brbe_version from 'bool' to 'int'
- Updated pr_warn() as pr_warn_once() with values in brbe_get_perf_[type|priv]()
- Replaced all pr_warn_once() as pr_debug_once() in armv8pmu_branch_valid()
- Added a comment in branch_type_to_brbcr() for the BRBCR_EL1 privilege settings
- Modified the comment related to BRBINFx_EL1.LASTFAILED in capture_brbe_flags()
- Modified brbe_get_perf_entry_type() as brbe_set_perf_entry_type()
- Renamed brbe_valid() as brbe_record_is_complete()
- Renamed brbe_source() as brbe_record_is_source_only()
- Renamed brbe_target() as brbe_record_is_target_only()
- Inverted checks for !brbe_record_is_[target|source]_only() for info capture
- Replaced 'fetch' with 'get' in all helpers that extract field value
- Dropped 'static int brbe_current_bank' optimization in select_brbe_bank()
- Dropped select_brbe_bank_index() completely, added capture_branch_entry()
- Process captured branch entries in two separate loops one for each BRBE bank
- Moved branch_records_alloc() inside armv8pmu_probe_pmu()
- Added a forward declaration for the helper has_branch_stack()
- Added new callbacks armv8pmu_private_alloc() and armv8pmu_private_free()
- Updated armv8pmu_probe_pmu() to allocate the private structure before SMP call
Changes in V7:
https://lore.kernel.org/all/20230105031039.207972-1-anshuman.khandual@arm.c…
- Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
- Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
- Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
- Defined BRBCR_EL1_DEFAULT_TS in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
- Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
- Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
- Also set BRBE in paused state in armv8pmu_branch_disable()
- Dropped brbe_paused(), set_brbe_paused() helpers
- Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
- Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
- Added valid_brbe_[cc, format, version]() helpers
- Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
- Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
- Defined enum brbe_bank_idx with possible values for BRBE bank indices
- Changed armpmu->hw_attr into armpmu->private
- Added missing space in stub definition for armv8pmu_branch_valid()
- Replaced both kmalloc() with kzalloc()
- Added BRBE_BANK_MAX_ENTRIES
- Updated comment for capture_brbe_flags()
- Updated comment for struct brbe_hw_attr
- Dropped space after type cast in couple of places
- Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
- Captured cpuc->branches->branch_entries[idx] in a local variable
- Dropped saved_priv from armv8pmu_branch_read()
- Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
- Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
- Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
- Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
- Reorganized brbe_valid_nr() and dropped the pr_warn() message
- Changed probe sequence in brbe_attributes_probe()
- Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
- Disable BRBE before disabling the PMU event counter
- Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
- Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()
Changes in V6:
https://lore.kernel.org/linux-arm-kernel/20221208084402.863310-1-anshuman.k…
- Restore the exception level privilege after reading the branch records
- Unpause the buffer after reading the branch records
- Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
- Reworked BRBE implementation and branch stack sampling support on arm pmu
- BRBE implementation is now part of overall ARMV8 PMU implementation
- BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
- CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
- File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
- File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
- BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
- BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
- BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
- Added sched_task() callback into struct arm_pmu
- Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
- Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
- Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
- Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
- Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
- armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
- Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
- Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
- Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
- Dropped set_brbe_disabled() as well
- Reformated armv8pmu_branch_valid() warnings while rejecting unsupported events
Changes in V5:
https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.…
- Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
- Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
- Changed config ARM_BRBE_PMU from 'tristate' to 'bool'
Changes in V4:
https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.c…
- Changed ../tools/sysreg declarations as suggested
- Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
- Dropped perfmon_capable() check in armpmu_event_init()
- s/pr_warn_once/pr_info in armpmu_event_init()
- Added brbe_format element into struct pmu_hw_events
- Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
- Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning
Changes in V3:
https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.c…
- Moved brbe_stack from the stack and now dynamically allocated
- Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
- Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
- Created dummy BRBINF_EL1 field definitions in tools/sysreg
- Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
- Both exception and exception return branche records are now captured
only if the event has PERF_SAMPLE_BRANCH_KERNEL which would already
been checked in generic perf via perf_allow_kernel()
Changes in V2:
https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.c…
- Dropped branch sample filter helpers consolidation patch from this series
- Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
- Use cached perfmon_capable() while configuring BRBE branch record filters
Changes in V1:
https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.k…
- Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
- Process new perf branch types via PERF_BR_EXTEND_ABI
Changes in RFC V2:
https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.k…
- Added branch_sample_priv() while consolidating other branch sample filter helpers
- Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
- Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
- Added documentation for struct arm_pmu changes, updated commit message
- Updated commit message for BRBE detection infrastructure patch
- PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
- Branch privilege state capture mechanism has now moved inside the driver
Changes in RFC V1:
https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khan…
Anshuman Khandual (6):
arm64/sysreg: Add BRBE registers and fields
KVM: arm64: Explicitly handle BRBE traps as UNDEFINED
drivers: perf: arm_pmu: Add infrastructure for branch stack sampling
arm64/boot: Enable EL2 requirements for BRBE
drivers: perf: arm_pmuv3: Enable branch stack sampling via FEAT_BRBE
KVM: arm64: nvhe: Disable branch generation in nVHE guests
James Clark (3):
perf: test: Speed up running brstack test on an Arm model
perf: test: Remove empty lines from branch filter test output
perf: test: Extend branch stack sampling test for Arm64 BRBE
Junhao He (1):
arm64/perf: Enable branch stack sampling
Marc Zyngier (2):
KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flag
KVM: arm64: Fix TRFCR_EL1/PMSCR_EL1 access in hVHE mode
Miguel Luis (2):
arm64: Add missing _EL12 encodings
arm64: Add missing _EL2 encodings
Documentation/arch/arm64/booting.rst | 21 +
arch/arm64/configs/openeuler_defconfig | 1 +
arch/arm64/include/asm/el2_setup.h | 87 +-
arch/arm64/include/asm/kvm_host.h | 5 +-
arch/arm64/include/asm/sysreg.h | 55 +-
arch/arm64/kvm/debug.c | 5 +
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 43 +-
arch/arm64/kvm/hyp/vhe/switch.c | 7 +-
arch/arm64/kvm/sys_regs.c | 56 ++
arch/arm64/tools/sysreg | 131 +++
drivers/perf/Kconfig | 11 +
drivers/perf/Makefile | 1 +
drivers/perf/arm_brbe.c | 1198 ++++++++++++++++++++++++
drivers/perf/arm_pmu.c | 42 +-
drivers/perf/arm_pmuv3.c | 160 +++-
drivers/perf/arm_pmuv3_branch.h | 83 ++
include/linux/perf/arm_pmu.h | 37 +-
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/shell/test_brstack.sh | 57 +-
tools/perf/tests/tests.h | 1 +
tools/perf/tests/workloads/Build | 2 +
tools/perf/tests/workloads/traploop.c | 39 +
22 files changed, 2007 insertions(+), 36 deletions(-)
create mode 100644 drivers/perf/arm_brbe.c
create mode 100644 drivers/perf/arm_pmuv3_branch.h
create mode 100644 tools/perf/tests/workloads/traploop.c
--
2.33.0
From: David Fernandez Gonzalez <david.fernandez.gonzalez(a)oracle.com>
mainline inclusion
from mainline-v6.11-rc7
commit 48b9a8dabcc3cf5f961b2ebcd8933bf9204babb7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARY1L
CVE: CVE-2024-46738
Reference: https://lore.kernel.org/lkml/20240828154338.754746-1-david.fernandez.gonzal…
--------------------------------
When removing a resource from vmci_resource_table in
vmci_resource_remove(), the search is performed using the resource
handle by comparing context and resource fields.
It is possible though to create two resources with different types
but same handle (same context and resource fields).
When trying to remove one of the resources, vmci_resource_remove()
may not remove the intended one, but the object will still be freed
as in the case of the datagram type in vmci_datagram_destroy_handle().
vmci_resource_table will still hold a pointer to this freed resource
leading to a use-after-free vulnerability.
BUG: KASAN: use-after-free in vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
BUG: KASAN: use-after-free in vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
Read of size 4 at addr ffff88801c16d800 by task syz-executor197/1592
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x82/0xa9 lib/dump_stack.c:106
print_address_description.constprop.0+0x21/0x366 mm/kasan/report.c:239
__kasan_report.cold+0x7f/0x132 mm/kasan/report.c:425
kasan_report+0x38/0x51 mm/kasan/report.c:442
vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
vmci_qp_broker_detach+0x89a/0x11b9 drivers/misc/vmw_vmci/vmci_queue_pair.c:2182
ctx_free_ctx+0x473/0xbe1 drivers/misc/vmw_vmci/vmci_context.c:444
kref_put include/linux/kref.h:65 [inline]
vmci_ctx_put drivers/misc/vmw_vmci/vmci_context.c:497 [inline]
vmci_ctx_destroy+0x170/0x1d6 drivers/misc/vmw_vmci/vmci_context.c:195
vmci_host_close+0x125/0x1ac drivers/misc/vmw_vmci/vmci_host.c:143
__fput+0x261/0xa34 fs/file_table.c:282
task_work_run+0xf0/0x194 kernel/task_work.c:164
tracehook_notify_resume include/linux/tracehook.h:189 [inline]
exit_to_user_mode_loop+0x184/0x189 kernel/entry/common.c:187
exit_to_user_mode_prepare+0x11b/0x123 kernel/entry/common.c:220
__syscall_exit_to_user_mode_work kernel/entry/common.c:302 [inline]
syscall_exit_to_user_mode+0x18/0x42 kernel/entry/common.c:313
do_syscall_64+0x41/0x85 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x6e/0x0
This change ensures the type is also checked when removing
the resource from vmci_resource_table in vmci_resource_remove().
Fixes: bc63dedb7d46 ("VMCI: resource object implementation.")
Cc: stable(a)vger.kernel.org
Reported-by: George Kennedy <george.kennedy(a)oracle.com>
Signed-off-by: David Fernandez Gonzalez <david.fernandez.gonzalez(a)oracle.com>
Signed-off-by: Zhang Kunbo <zhangkunbo(a)huawei.com>
---
drivers/misc/vmw_vmci/vmci_resource.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/vmw_vmci/vmci_resource.c b/drivers/misc/vmw_vmci/vmci_resource.c
index 692daa9eff34..19c9d2cdd277 100644
--- a/drivers/misc/vmw_vmci/vmci_resource.c
+++ b/drivers/misc/vmw_vmci/vmci_resource.c
@@ -144,7 +144,8 @@ void vmci_resource_remove(struct vmci_resource *resource)
spin_lock(&vmci_resource_table.lock);
hlist_for_each_entry(r, &vmci_resource_table.entries[idx], node) {
- if (vmci_handle_is_equal(r->handle, resource->handle)) {
+ if (vmci_handle_is_equal(r->handle, resource->handle) &&
+ resource->type == r->type) {
hlist_del_init_rcu(&r->node);
break;
}
--
2.34.1
From: Stephen Hemminger <stephen(a)networkplumber.org>
stable inclusion
from stable-v5.10.226
commit 98c75d76187944296068d685dfd8a1e9fd8c4fdc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARX29
CVE: CVE-2024-46800
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 3b3a2a9c6349e25a025d2330f479bc33a6ccb54a upstream.
If netem_dequeue() enqueues packet to inner qdisc and that qdisc
returns __NET_XMIT_STOLEN. The packet is dropped but
qdisc_tree_reduce_backlog() is not called to update the parent's
q.qlen, leading to the similar use-after-free as Commit
e04991a48dbaf382 ("netem: fix return value if duplicate enqueue
fails")
Commands to trigger KASAN UaF:
ip link add type dummy
ip link set lo up
ip link set dummy0 up
tc qdisc add dev lo parent root handle 1: drr
tc filter add dev lo parent 1: basic classid 1:1
tc class add dev lo classid 1:1 drr
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2: handle 3: drr
tc filter add dev lo parent 3: basic classid 3:1 action mirred egress
redirect dev dummy0
tc class add dev lo classid 3:1 drr
ping -c1 -W0.01 localhost # Trigger bug
tc class del dev lo classid 1:1
tc class add dev lo classid 1:1 drr
ping -c1 -W0.01 localhost # UaF
Fixes: 50612537e9ab ("netem: fix classful handling")
Reported-by: Budimir Markovic <markovicbudimir(a)gmail.com>
Signed-off-by: Stephen Hemminger <stephen(a)networkplumber.org>
Link: https://patch.msgid.link/20240901182438.4992-1-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Liang <wangliang74(a)huawei.com>
---
net/sched/sch_netem.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index b72b308fe406..8321ab36357c 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -722,11 +722,10 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
err = qdisc_enqueue(skb, q->qdisc, &to_free);
kfree_skb_list(to_free);
- if (err != NET_XMIT_SUCCESS &&
- net_xmit_drop_count(err)) {
- qdisc_qstats_drop(sch);
- qdisc_tree_reduce_backlog(sch, 1,
- pkt_len);
+ if (err != NET_XMIT_SUCCESS) {
+ if (net_xmit_drop_count(err))
+ qdisc_qstats_drop(sch);
+ qdisc_tree_reduce_backlog(sch, 1, pkt_len);
}
goto tfifo_dequeue;
}
--
2.34.1
From: David Fernandez Gonzalez <david.fernandez.gonzalez(a)oracle.com>
mainline inclusion
from mainline-v6.11-rc7
commit 48b9a8dabcc3cf5f961b2ebcd8933bf9204babb7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARY1L
CVE: CVE-2024-46738
Reference: https://lore.kernel.org/lkml/20240828154338.754746-1-david.fernandez.gonzal…
--------------------------------
When removing a resource from vmci_resource_table in
vmci_resource_remove(), the search is performed using the resource
handle by comparing context and resource fields.
It is possible though to create two resources with different types
but same handle (same context and resource fields).
When trying to remove one of the resources, vmci_resource_remove()
may not remove the intended one, but the object will still be freed
as in the case of the datagram type in vmci_datagram_destroy_handle().
vmci_resource_table will still hold a pointer to this freed resource
leading to a use-after-free vulnerability.
BUG: KASAN: use-after-free in vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
BUG: KASAN: use-after-free in vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
Read of size 4 at addr ffff88801c16d800 by task syz-executor197/1592
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x82/0xa9 lib/dump_stack.c:106
print_address_description.constprop.0+0x21/0x366 mm/kasan/report.c:239
__kasan_report.cold+0x7f/0x132 mm/kasan/report.c:425
kasan_report+0x38/0x51 mm/kasan/report.c:442
vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
vmci_qp_broker_detach+0x89a/0x11b9 drivers/misc/vmw_vmci/vmci_queue_pair.c:2182
ctx_free_ctx+0x473/0xbe1 drivers/misc/vmw_vmci/vmci_context.c:444
kref_put include/linux/kref.h:65 [inline]
vmci_ctx_put drivers/misc/vmw_vmci/vmci_context.c:497 [inline]
vmci_ctx_destroy+0x170/0x1d6 drivers/misc/vmw_vmci/vmci_context.c:195
vmci_host_close+0x125/0x1ac drivers/misc/vmw_vmci/vmci_host.c:143
__fput+0x261/0xa34 fs/file_table.c:282
task_work_run+0xf0/0x194 kernel/task_work.c:164
tracehook_notify_resume include/linux/tracehook.h:189 [inline]
exit_to_user_mode_loop+0x184/0x189 kernel/entry/common.c:187
exit_to_user_mode_prepare+0x11b/0x123 kernel/entry/common.c:220
__syscall_exit_to_user_mode_work kernel/entry/common.c:302 [inline]
syscall_exit_to_user_mode+0x18/0x42 kernel/entry/common.c:313
do_syscall_64+0x41/0x85 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x6e/0x0
This change ensures the type is also checked when removing
the resource from vmci_resource_table in vmci_resource_remove().
Fixes: bc63dedb7d46 ("VMCI: resource object implementation.")
Cc: stable(a)vger.kernel.org
Reported-by: George Kennedy <george.kennedy(a)oracle.com>
Signed-off-by: David Fernandez Gonzalez <david.fernandez.gonzalez(a)oracle.com>
Signed-off-by: Zhang Kunbo <zhangkunbo(a)huawei.com>
---
drivers/misc/vmw_vmci/vmci_resource.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/vmw_vmci/vmci_resource.c b/drivers/misc/vmw_vmci/vmci_resource.c
index 692daa9eff34..19c9d2cdd277 100644
--- a/drivers/misc/vmw_vmci/vmci_resource.c
+++ b/drivers/misc/vmw_vmci/vmci_resource.c
@@ -144,7 +144,8 @@ void vmci_resource_remove(struct vmci_resource *resource)
spin_lock(&vmci_resource_table.lock);
hlist_for_each_entry(r, &vmci_resource_table.entries[idx], node) {
- if (vmci_handle_is_equal(r->handle, resource->handle)) {
+ if (vmci_handle_is_equal(r->handle, resource->handle) &&
+ resource->type == r->type) {
hlist_del_init_rcu(&r->node);
break;
}
--
2.34.1
From: David Fernandez Gonzalez <david.fernandez.gonzalez(a)oracle.com>
mainline inclusion
from mainline-v6.11-rc7
commit 48b9a8dabcc3cf5f961b2ebcd8933bf9204babb7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARY1L
CVE: CVE-2024-46738
Reference: https://lore.kernel.org/lkml/20240828154338.754746-1-david.fernandez.gonzal…
--------------------------------
When removing a resource from vmci_resource_table in
vmci_resource_remove(), the search is performed using the resource
handle by comparing context and resource fields.
It is possible though to create two resources with different types
but same handle (same context and resource fields).
When trying to remove one of the resources, vmci_resource_remove()
may not remove the intended one, but the object will still be freed
as in the case of the datagram type in vmci_datagram_destroy_handle().
vmci_resource_table will still hold a pointer to this freed resource
leading to a use-after-free vulnerability.
BUG: KASAN: use-after-free in vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
BUG: KASAN: use-after-free in vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
Read of size 4 at addr ffff88801c16d800 by task syz-executor197/1592
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x82/0xa9 lib/dump_stack.c:106
print_address_description.constprop.0+0x21/0x366 mm/kasan/report.c:239
__kasan_report.cold+0x7f/0x132 mm/kasan/report.c:425
kasan_report+0x38/0x51 mm/kasan/report.c:442
vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
vmci_qp_broker_detach+0x89a/0x11b9 drivers/misc/vmw_vmci/vmci_queue_pair.c:2182
ctx_free_ctx+0x473/0xbe1 drivers/misc/vmw_vmci/vmci_context.c:444
kref_put include/linux/kref.h:65 [inline]
vmci_ctx_put drivers/misc/vmw_vmci/vmci_context.c:497 [inline]
vmci_ctx_destroy+0x170/0x1d6 drivers/misc/vmw_vmci/vmci_context.c:195
vmci_host_close+0x125/0x1ac drivers/misc/vmw_vmci/vmci_host.c:143
__fput+0x261/0xa34 fs/file_table.c:282
task_work_run+0xf0/0x194 kernel/task_work.c:164
tracehook_notify_resume include/linux/tracehook.h:189 [inline]
exit_to_user_mode_loop+0x184/0x189 kernel/entry/common.c:187
exit_to_user_mode_prepare+0x11b/0x123 kernel/entry/common.c:220
__syscall_exit_to_user_mode_work kernel/entry/common.c:302 [inline]
syscall_exit_to_user_mode+0x18/0x42 kernel/entry/common.c:313
do_syscall_64+0x41/0x85 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x6e/0x0
This change ensures the type is also checked when removing
the resource from vmci_resource_table in vmci_resource_remove().
Fixes: bc63dedb7d46 ("VMCI: resource object implementation.")
Cc: stable(a)vger.kernel.org
Reported-by: George Kennedy <george.kennedy(a)oracle.com>
Signed-off-by: David Fernandez Gonzalez <david.fernandez.gonzalez(a)oracle.com>
Signed-off-by: Zhang Kunbo <zhangkunbo(a)huawei.com>
---
drivers/misc/vmw_vmci/vmci_resource.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/vmw_vmci/vmci_resource.c b/drivers/misc/vmw_vmci/vmci_resource.c
index 692daa9eff34..19c9d2cdd277 100644
--- a/drivers/misc/vmw_vmci/vmci_resource.c
+++ b/drivers/misc/vmw_vmci/vmci_resource.c
@@ -144,7 +144,8 @@ void vmci_resource_remove(struct vmci_resource *resource)
spin_lock(&vmci_resource_table.lock);
hlist_for_each_entry(r, &vmci_resource_table.entries[idx], node) {
- if (vmci_handle_is_equal(r->handle, resource->handle)) {
+ if (vmci_handle_is_equal(r->handle, resource->handle) &&
+ resource->type == r->type) {
hlist_del_init_rcu(&r->node);
break;
}
--
2.34.1
From: Stephen Hemminger <stephen(a)networkplumber.org>
stable inclusion
from stable-v4.19.322
commit f0bddb4de043399f16d1969dad5ee5b984a64e7b
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARX29
CVE: CVE-2024-46800
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 3b3a2a9c6349e25a025d2330f479bc33a6ccb54a upstream.
If netem_dequeue() enqueues packet to inner qdisc and that qdisc
returns __NET_XMIT_STOLEN. The packet is dropped but
qdisc_tree_reduce_backlog() is not called to update the parent's
q.qlen, leading to the similar use-after-free as Commit
e04991a48dbaf382 ("netem: fix return value if duplicate enqueue
fails")
Commands to trigger KASAN UaF:
ip link add type dummy
ip link set lo up
ip link set dummy0 up
tc qdisc add dev lo parent root handle 1: drr
tc filter add dev lo parent 1: basic classid 1:1
tc class add dev lo classid 1:1 drr
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2: handle 3: drr
tc filter add dev lo parent 3: basic classid 3:1 action mirred egress
redirect dev dummy0
tc class add dev lo classid 3:1 drr
ping -c1 -W0.01 localhost # Trigger bug
tc class del dev lo classid 1:1
tc class add dev lo classid 1:1 drr
ping -c1 -W0.01 localhost # UaF
Fixes: 50612537e9ab ("netem: fix classful handling")
Reported-by: Budimir Markovic <markovicbudimir(a)gmail.com>
Signed-off-by: Stephen Hemminger <stephen(a)networkplumber.org>
Link: https://patch.msgid.link/20240901182438.4992-1-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Liang <wangliang74(a)huawei.com>
---
net/sched/sch_netem.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 95832934e965..8e33bcd69edd 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -697,11 +697,10 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
err = qdisc_enqueue(skb, q->qdisc, &to_free);
kfree_skb_list(to_free);
- if (err != NET_XMIT_SUCCESS &&
- net_xmit_drop_count(err)) {
- qdisc_qstats_drop(sch);
- qdisc_tree_reduce_backlog(sch, 1,
- pkt_len);
+ if (err != NET_XMIT_SUCCESS) {
+ if (net_xmit_drop_count(err))
+ qdisc_qstats_drop(sch);
+ qdisc_tree_reduce_backlog(sch, 1, pkt_len);
}
goto tfifo_dequeue;
}
--
2.34.1
tree: https://gitee.com/openeuler/kernel.git OLK-6.6
head: d25e57f47750555950b12145f98d5168319e6712
commit: 3ce4cb81ef2b148f6c830c7debb4405e26cded1c [13560/14103] drivers/crypto/ccp: support TKM run on CSV
config: x86_64-buildonly-randconfig-006-20240925 (https://download.01.org/0day-ci/archive/20240925/202409250819.8xNdiVIP-lkp@…)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240925/202409250819.8xNdiVIP-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409250819.8xNdiVIP-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/x86/kvm/svm/svm.c:25:
>> include/linux/psp-hygon.h:257:1: error: conflicting types for 'vpsp_try_do_cmd'
257 | vpsp_try_do_cmd(int cmd, phys_addr_t phy_addr,
| ^
include/linux/psp-hygon.h:253:1: note: previous definition is here
253 | vpsp_try_do_cmd(uint32_t vid, int cmd,
| ^
include/linux/psp-hygon.h:285:5: warning: no previous prototype for function 'psp_register_cmd_notifier' [-Wmissing-prototypes]
285 | int psp_register_cmd_notifier(uint32_t cmd_id, p2c_notifier_t notifier) { return -ENODEV; }
| ^
include/linux/psp-hygon.h:285:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
285 | int psp_register_cmd_notifier(uint32_t cmd_id, p2c_notifier_t notifier) { return -ENODEV; }
| ^
| static
include/linux/psp-hygon.h:286:5: warning: no previous prototype for function 'psp_unregister_cmd_notifier' [-Wmissing-prototypes]
286 | int psp_unregister_cmd_notifier(uint32_t cmd_id, p2c_notifier_t notifier) { return -ENODEV; }
| ^
include/linux/psp-hygon.h:286:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
286 | int psp_unregister_cmd_notifier(uint32_t cmd_id, p2c_notifier_t notifier) { return -ENODEV; }
| ^
| static
2 warnings and 1 error generated.
vim +/vpsp_try_do_cmd +257 include/linux/psp-hygon.h
247
248 static inline int
249 vpsp_try_get_result(uint8_t prio,
250 uint32_t index, phys_addr_t phy_addr, struct vpsp_ret *psp_ret) { return -ENODEV; }
251
252 static inline int
253 vpsp_try_do_cmd(uint32_t vid, int cmd,
254 void *data, struct vpsp_ret *psp_ret) { return -ENODEV; }
255
256 static inline int
> 257 vpsp_try_do_cmd(int cmd, phys_addr_t phy_addr,
258 struct vpsp_ret *psp_ret) { return -ENODEV; }
259
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
tree: https://gitee.com/openeuler/kernel.git OLK-6.6
head: d25e57f47750555950b12145f98d5168319e6712
commit: 3ce4cb81ef2b148f6c830c7debb4405e26cded1c [13560/14103] drivers/crypto/ccp: support TKM run on CSV
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20240925/202409250631.b3SgU0xM-lkp@…)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240925/202409250631.b3SgU0xM-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409250631.b3SgU0xM-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/crypto/ccp/hygon/vpsp.c:91: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Copy the guest data to the host kernel buffer
>> drivers/crypto/ccp/hygon/vpsp.c:294: warning: Cannot understand * @brief Directly convert the gpa address into hpa and forward it to PSP,
on line 294 - I thought it was a doc line
>> drivers/crypto/ccp/hygon/vpsp.c:381: warning: Cannot understand * @brief copy data in gpa to host memory and send it to psp for processing.
on line 381 - I thought it was a doc line
vim +91 drivers/crypto/ccp/hygon/vpsp.c
89
90 /**
> 91 * Copy the guest data to the host kernel buffer
92 * and record the host buffer address in 'hbuf'.
93 * This 'hbuf' is used to restore context information
94 * during asynchronous processing.
95 */
96 static int kvm_pv_psp_cmd_pre_op(struct kvm_vpsp *vpsp, gpa_t data_gpa,
97 struct vpsp_hbuf_wrapper *hbuf)
98 {
99 int ret = 0;
100 void *data = NULL;
101 struct psp_cmdresp_head psp_head;
102 uint32_t data_size;
103
104 if (unlikely(vpsp->read_guest(vpsp->kvm, data_gpa, &psp_head,
105 sizeof(struct psp_cmdresp_head))))
106 return -EFAULT;
107
108 data_size = psp_head.buf_size;
109 if (check_psp_mem_range(NULL, data_gpa, data_size))
110 return -EFAULT;
111
112 data = kzalloc(data_size, GFP_KERNEL);
113 if (!data)
114 return -ENOMEM;
115
116 if (unlikely(vpsp->read_guest(vpsp->kvm, data_gpa, data, data_size))) {
117 ret = -EFAULT;
118 goto end;
119 }
120
121 hbuf->data = data;
122 hbuf->data_size = data_size;
123
124 end:
125 return ret;
126 }
127
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki