Kernel
August 2022: 11 participants, 82 discussions

[PATCH openEuler-1.0-LTS] net_sched: cls_route: remove from list when handle is 0
by Yongqiang Liu 16 Aug '22
From: Thadeu Lima de Souza Cascardo <cascardo(a)canonical.com>
mainline inclusion
from mainline-v6.0-rc1
commit 9ad36309e2719a884f946678e0296be10f0bb4c1
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5LJLR
CVE: CVE-2022-2588
--------------------------------
When a route filter is replaced and the old filter has a 0 handle, the old
one won't be removed from the hashtable, while it will still be freed.
The test was there since before commit 1109c00547fc ("net: sched: RCU
cls_route"), when a new filter was not allocated when there was an old one.
The old filter was reused and the reinserting would only be necessary if an
old filter was replaced. That was still wrong for the same case where the
old handle was 0.
Remove the old filter from the list independently from its handle value.
This fixes CVE-2022-2588, also reported as ZDI-CAN-17440.
Reported-by: Zhenpeng Lin <zplin(a)u.northwestern.edu>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)canonical.com>
Reviewed-by: Kamal Mostafa <kamal(a)canonical.com>
Cc: <stable(a)vger.kernel.org>
Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
Link: https://lore.kernel.org/r/20220809170518.164662-1-cascardo@canonical.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Xu Jia <xujia39(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/sched/cls_route.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index 0a0a22a0666d..c13903441ac1 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -528,7 +528,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
rcu_assign_pointer(f->next, f1);
rcu_assign_pointer(*fp, f);
- if (fold && fold->handle && f->handle != fold->handle) {
+ if (fold) {
th = to_hash(fold->handle);
h = from_hash(fold->handle >> 16);
b = rtnl_dereference(head->table[th]);
--
2.25.1

[PATCH openEuler-1.0-LTS] Revert "x86/unwind/orc: Change REG_SP_INDIRECT"
by Yongqiang Liu 15 Aug '22
From: Yipeng Zou <zouyipeng(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5MC71
CVE: NA
---------------------------------
This reverts commit 657a6bec17a04e9223a099608084ab5adc2582df.
The reverted commit was backported from mainline, where it fixes the handling
of the REG_SP_INDIRECT type in the ORC unwinder together with a matching
objtool change. That objtool commit has not been merged into hulk-4.19, so
with only the kernel-side change applied, the SP value parsed from the ORC
data was wrong. Revert it.
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
arch/x86/kernel/unwind_orc.c | 5 +----
tools/objtool/orc_dump.c | 2 +-
2 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 3ff76f88e220..6c5d3b22ac61 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -450,7 +450,7 @@ bool unwind_next_frame(struct unwind_state *state)
break;
case ORC_REG_SP_INDIRECT:
- sp = state->sp;
+ sp = state->sp + orc->sp_offset;
indirect = true;
break;
@@ -500,9 +500,6 @@ bool unwind_next_frame(struct unwind_state *state)
if (indirect) {
if (!deref_stack_reg(state, sp, &sp))
goto err;
-
- if (orc->sp_reg == ORC_REG_SP_INDIRECT)
- sp += orc->sp_offset;
}
/* Find IP, SP and possibly regs: */
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c
index ba28830aace2..faa444270ee3 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -64,7 +64,7 @@ static void print_reg(unsigned int reg, int offset)
if (reg == ORC_REG_BP_INDIRECT)
printf("(bp%+d)", offset);
else if (reg == ORC_REG_SP_INDIRECT)
- printf("(sp)%+d", offset);
+ printf("(sp%+d)", offset);
else if (reg == ORC_REG_UNDEFINED)
printf("(und)");
else
--
2.25.1

[PATCH openEuler-1.0-LTS 1/3] scsi: iscsi: Add helper functions to manage iscsi_cls_conn
by Yongqiang Liu 15 Aug '22
From: Wenchao Hao <haowenchao(a)huawei.com>
mainline inclusion
from mainline-v5.18-rc1
commit ad515cada7dac3cdf5e1ad77a0ed696f5f34e0ab
category: bugfix
bugzilla: 187381, https://gitee.com/openeuler/kernel/issues/I5LBFL
CVE: NA
--------------------------------
- iscsi_alloc_conn(): Allocate and initialize iscsi_cls_conn
- iscsi_add_conn(): Expose iscsi_cls_conn to userspace via sysfs
- iscsi_remove_conn(): Remove iscsi_cls_conn from sysfs
Link: https://lore.kernel.org/r/20220310015759.3296841-2-haowenchao@huawei.com
Reviewed-by: Mike Christie <michael.christie(a)oracle.com>
Signed-off-by: Wenchao Hao <haowenchao(a)huawei.com>
Signed-off-by: Wu Bo <wubo40(a)huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Conflict: drivers/scsi/scsi_transport_iscsi.c
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/scsi/scsi_transport_iscsi.c | 89 +++++++++++++++++++++++++++++
include/scsi/scsi_transport_iscsi.h | 4 ++
2 files changed, 93 insertions(+)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index bd699906828f..eba74ac58735 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2201,6 +2201,95 @@ void iscsi_free_session(struct iscsi_cls_session *session)
}
EXPORT_SYMBOL_GPL(iscsi_free_session);
+/**
+ * iscsi_alloc_conn - alloc iscsi class connection
+ * @session: iscsi cls session
+ * @dd_size: private driver data size
+ * @cid: connection id
+ */
+struct iscsi_cls_conn *
+iscsi_alloc_conn(struct iscsi_cls_session *session, int dd_size, uint32_t cid)
+{
+ struct iscsi_transport *transport = session->transport;
+ struct iscsi_cls_conn *conn;
+
+ conn = kzalloc(sizeof(*conn) + dd_size, GFP_KERNEL);
+ if (!conn)
+ return NULL;
+ if (dd_size)
+ conn->dd_data = &conn[1];
+
+ mutex_init(&conn->ep_mutex);
+ INIT_LIST_HEAD(&conn->conn_list);
+ conn->transport = transport;
+ conn->cid = cid;
+
+ /* this is released in the dev's release function */
+ if (!get_device(&session->dev))
+ goto free_conn;
+
+ dev_set_name(&conn->dev, "connection%d:%u", session->sid, cid);
+ device_initialize(&conn->dev);
+ conn->dev.parent = &session->dev;
+ conn->dev.release = iscsi_conn_release;
+
+ return conn;
+
+free_conn:
+ kfree(conn);
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(iscsi_alloc_conn);
+
+/**
+ * iscsi_add_conn - add iscsi class connection
+ * @conn: iscsi cls connection
+ *
+ * This will expose iscsi_cls_conn to sysfs so make sure the related
+ * resources for sysfs attributes are initialized before calling this.
+ */
+int iscsi_add_conn(struct iscsi_cls_conn *conn)
+{
+ int err;
+ unsigned long flags;
+ struct iscsi_cls_session *session = iscsi_dev_to_session(conn->dev.parent);
+
+ err = device_add(&conn->dev);
+ if (err) {
+ iscsi_cls_session_printk(KERN_ERR, session,
+ "could not register connection's dev\n");
+ return err;
+ }
+ transport_register_device(&conn->dev);
+
+ spin_lock_irqsave(&connlock, flags);
+ list_add(&conn->conn_list, &connlist);
+ spin_unlock_irqrestore(&connlock, flags);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(iscsi_add_conn);
+
+/**
+ * iscsi_remove_conn - remove iscsi class connection from sysfs
+ * @conn: iscsi cls connection
+ *
+ * Remove iscsi_cls_conn from sysfs, and wait for previous
+ * read/write of iscsi_cls_conn's attributes in sysfs to finish.
+ */
+void iscsi_remove_conn(struct iscsi_cls_conn *conn)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&connlock, flags);
+ list_del(&conn->conn_list);
+ spin_unlock_irqrestore(&connlock, flags);
+
+ transport_unregister_device(&conn->dev);
+ device_del(&conn->dev);
+}
+EXPORT_SYMBOL_GPL(iscsi_remove_conn);
+
/**
* iscsi_create_conn - create iscsi class connection
* @session: iscsi cls session
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 848ba2822a21..105458b20cd8 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -446,6 +446,10 @@ extern struct iscsi_cls_session *iscsi_create_session(struct Scsi_Host *shost,
unsigned int target_id);
extern void iscsi_remove_session(struct iscsi_cls_session *session);
extern void iscsi_free_session(struct iscsi_cls_session *session);
+extern struct iscsi_cls_conn *iscsi_alloc_conn(struct iscsi_cls_session *sess,
+ int dd_size, uint32_t cid);
+extern int iscsi_add_conn(struct iscsi_cls_conn *conn);
+extern void iscsi_remove_conn(struct iscsi_cls_conn *conn);
extern struct iscsi_cls_conn *iscsi_create_conn(struct iscsi_cls_session *sess,
int dd_size, uint32_t cid);
extern void iscsi_put_conn(struct iscsi_cls_conn *conn);
--
2.25.1
From: Zheng Zengkai <zhengzengkai(a)huawei.com>
phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ
--------------------------------------
Use CONFIG_ARCH_PHYTIUM to control phytium ACS quirks.
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
---
drivers/pci/quirks.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 99657b9bc82e..c389cef5c7bd 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4646,10 +4646,12 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_ZHAOXIN, 0x9083, pci_quirk_mf_endpoint_acs },
/* Zhaoxin Root/Downstream Ports */
{ PCI_VENDOR_ID_ZHAOXIN, PCI_ANY_ID, pci_quirk_zhaoxin_pcie_ports_acs },
+#ifdef CONFIG_ARCH_PHYTIUM
/* because PLX switch Vendor id is 0x10b5 on phytium cpu */
{ 0x10b5, PCI_ANY_ID, pci_quirk_xgene_acs },
/* because rootcomplex Vendor id is 0x17cd on phytium cpu */
{ 0x17cd, PCI_ANY_ID, pci_quirk_xgene_acs },
+#endif
{ 0 }
};
--
2.25.1

[PATCH openEuler-1.0-LTS 1/2] sched: Fix null-ptr-deref in free_fair_sched_group
by Yongqiang Liu 15 Aug '22
From: Hui Tang <tanghui20(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 187419, https://gitee.com/openeuler/kernel/issues/I5LIPL
CVE: NA
-------------------------------
==================================================================
BUG: KASAN: null-ptr-deref in rq_of kernel/sched/sched.h:1118 [inline]
BUG: KASAN: null-ptr-deref in unthrottle_qos_sched_group kernel/sched/fair.c:7619 [inline]
BUG: KASAN: null-ptr-deref in free_fair_sched_group+0x124/0x320 kernel/sched/fair.c:12131
Read of size 8 at addr 0000000000000130 by task syz-executor100/223
CPU: 3 PID: 223 Comm: syz-executor100 Not tainted 5.10.0 #6
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x40c arch/arm64/kernel/stacktrace.c:132
show_stack+0x30/0x40 arch/arm64/kernel/stacktrace.c:196
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b4/0x248 lib/dump_stack.c:118
__kasan_report mm/kasan/report.c:551 [inline]
kasan_report+0x18c/0x210 mm/kasan/report.c:564
check_memory_region_inline mm/kasan/generic.c:187 [inline]
__asan_load8+0x98/0xc0 mm/kasan/generic.c:253
rq_of kernel/sched/sched.h:1118 [inline]
unthrottle_qos_sched_group kernel/sched/fair.c:7619 [inline]
free_fair_sched_group+0x124/0x320 kernel/sched/fair.c:12131
sched_free_group kernel/sched/core.c:7767 [inline]
sched_create_group+0x48/0xc0 kernel/sched/core.c:7798
cpu_cgroup_css_alloc+0x18/0x40 kernel/sched/core.c:7930
css_create+0x7c/0x4a0 kernel/cgroup/cgroup.c:5328
cgroup_apply_control_enable+0x288/0x340 kernel/cgroup/cgroup.c:3135
cgroup_apply_control kernel/cgroup/cgroup.c:3217 [inline]
cgroup_subtree_control_write+0x668/0x8b0 kernel/cgroup/cgroup.c:3375
cgroup_file_write+0x1a8/0x37c kernel/cgroup/cgroup.c:3909
kernfs_fop_write_iter+0x220/0x2f4 fs/kernfs/file.c:296
call_write_iter include/linux/fs.h:1960 [inline]
new_sync_write+0x260/0x370 fs/read_write.c:515
vfs_write+0x3dc/0x4ac fs/read_write.c:602
ksys_write+0xfc/0x200 fs/read_write.c:655
__do_sys_write fs/read_write.c:667 [inline]
__se_sys_write fs/read_write.c:664 [inline]
__arm64_sys_write+0x50/0x60 fs/read_write.c:664
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common.constprop.0+0xf4/0x414 arch/arm64/kernel/syscall.c:155
do_el0_svc+0x50/0x11c arch/arm64/kernel/syscall.c:217
el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353
el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369
el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683
So add a check for tg->cfs_rq[i] before unthrottle_qos_sched_group() is called.
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
Reviewed-by: Zhang Qiao <zhangqiao22(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bcc72537b6fa..a34ca843bf0a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10878,7 +10878,7 @@ void free_fair_sched_group(struct task_group *tg)
for_each_possible_cpu(i) {
#ifdef CONFIG_QOS_SCHED
- if (tg->cfs_rq)
+ if (tg->cfs_rq && tg->cfs_rq[i])
unthrottle_qos_sched_group(tg->cfs_rq[i]);
#endif
if (tg->cfs_rq)
--
2.25.1
Backport BPF CO-RE support patches for openEuler-22.09
Alan Maguire (5):
libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types()
libbpf: BTF dumper support for typed data
libbpf: Clarify/fix unaligned data issues for btf typed dump
libbpf: Avoid use of __int128 in typed dump display
libbpf: Propagate errors when retrieving enum value for typed data
display
Alexei Starovoitov (47):
bpf: Optimize program stats
bpf: Run sleepable programs with migration disabled
bpf: Compute program stats for sleepable programs
bpf: Add per-program recursion prevention mechanism
selftest/bpf: Add a recursion test
bpf: Count the number of times recursion was prevented
selftests/bpf: Improve recursion selftest
bpf: Allows per-cpu maps and map-in-map in sleepable programs
selftests/bpf: Add a test for map-in-map and per-cpu maps in sleepable
progs
bpf: Clear per_cpu pointers during bpf_prog_realloc
bpf: Dont allow vmlinux BTF to be used in map_create and prog_load.
libbpf: Remove unused field.
bpf: Introduce bpf_sys_bpf() helper and program type.
bpf: Introduce bpfptr_t user/kernel pointer.
bpf: Prepare bpf syscall to be used from kernel and user space.
libbpf: Support for syscall program type
bpf: Make btf_load command to be bpfptr_t compatible.
bpf: Introduce fd_idx
bpf: Add bpf_btf_find_by_name_kind() helper.
bpf: Add bpf_sys_close() helper.
libbpf: Change the order of data and text relocations.
libbpf: Add bpf_object pointer to kernel_supports().
libbpf: Preliminary support for fd_idx
libbpf: Generate loader program out of BPF ELF file.
libbpf: Cleanup temp FDs when intermediate sys_bpf fails.
libbpf: Introduce bpf_map__initial_value().
bpftool: Use syscall/loader program in "prog load" and "gen skeleton"
command.
bpf: Add cmd alias BPF_PROG_RUN
bpf: Prepare bpf_prog_put() to be called from irq context.
bpf: Factor out bpf_spin_lock into helpers.
libbpf: Cleanup the layering between CORE and bpf_program.
libbpf: Split bpf_core_apply_relo() into bpf_program independent
helper.
libbpf: Move CO-RE types into relo_core.h.
libbpf: Split CO-RE logic into relo_core.c.
libbpf: Make gen_loader data aligned.
libbpf: Replace btf__type_by_id() with btf_type_by_id().
bpf: Rename btf_member accessors.
bpf: Prepare relo_core.c for kernel duty.
bpf: Define enum bpf_core_relo_kind as uapi.
bpf: Pass a set of bpf_core_relo-s to prog_load command.
bpf: Add bpf_core_add_cands() and wire it into
bpf_core_apply_relo_insn().
libbpf: Use CO-RE in the kernel in light skeleton.
libbpf: Support init of inner maps in light skeleton.
libbpf: Clean gen_loader's attach kind.
libbpf: Reduce bpf_core_apply_relo_insn() stack usage.
bpf: Silence purge_cand_cache build warning.
libbpf: Fix gen_loader assumption on number of programs.
Andrei Matei (1):
libbpf: Fail early when loading programs with unspecified type
Andrew Delgadillo (1):
selftests/bpf: Drop the need for LLVM's llc
Andrii Nakryiko (149):
libbpf: Factor out common operations in BTF writing APIs
selftest/bpf: Relax btf_dedup test checks
libbpf: Unify and speed up BTF string deduplication
libbpf: Implement basic split BTF support
selftests/bpf: Add split BTF basic test
selftests/bpf: Add checking of raw type dump in BTF writer APIs
selftests
libbpf: Support BTF dedup of split BTFs
libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays
selftests/bpf: Add split BTF dedup selftests
tools/bpftool: Add bpftool support for split BTF
bpf: Add in-kernel split BTF support
bpf: Assign ID to vmlinux BTF and return extra info for BTF in
GET_OBJ_INFO
kbuild: Build kernel module BTFs if BTF is enabled and pahole supports
it
bpf: Load and verify kernel module BTFs
tools/bpftool: Add support for in-kernel and named BTF in `btf show`
bpf: Compile out btf_parse_module() if module BTF is not enabled
bpf: Sanitize BTF data pointer after module is loaded
tools/bpftool: Emit name <anon> for anonymous BTFs
libbpf: Add base BTF accessor
tools/bpftool: Auto-detect split BTFs in common cases
bpf: Keep module's btf_data_size intact after load
libbpf: Add internal helper to load BTF data by FD
libbpf: Refactor CO-RE relocs to not assume a single BTF object
libbpf: Add kernel module BTF support for CO-RE relocations
selftests/bpf: Add bpf_testmod kernel module for testing
selftests/bpf: Add support for marking sub-tests as skipped
selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF
bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
bpf: Allow to specify kernel module BTFs when attaching BPF programs
libbpf: Factor out low-level BPF program loading helper
libbpf: Support attachment of BPF tracing programs to kernel modules
selftests/bpf: Add tp_btf CO-RE reloc test for modules
selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module
kbuild: Skip module BTF generation for out-of-tree external modules
selftests/bpf: fix bpf_testmod.ko recompilation logic
libbpf: Support modules in bpf_program__set_attach_target() API
selftests/bpf: Work-around EBUSY errors from hashmap update/delete
bpf: Allow empty module BTFs
libbpf: Add user-space variants of BPF_CORE_READ() family of macros
libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family
selftests/bpf: Add tests for user- and non-CO-RE BPF_CORE_READ()
variants
libbpf: Clarify kernel type use with USER variants of CORE reading
macros
selftests/bpf: Sync RCU before unloading bpf_testmod
bpf: Support BPF ksym variables in kernel modules
libbpf: Support kernel module ksym externs
selftests/bpf: Test kernel module ksym externs
selftests/bpf: Don't exit on failed bpf_testmod unload
libbpf: Stop using feature-detection Makefiles
libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h
libbpf: Expose btf_type_by_id() internally
libbpf: Generalize BTF and BTF.ext type ID and strings iteration
libbpf: Rename internal memory-management helpers
libbpf: Extract internal set-of-strings datastructure APIs
libbpf: Add generic BTF type shallow copy API
libbpf: Add BPF static linker APIs
libbpf: Add BPF static linker BTF and BTF.ext support
bpftool: Add ability to specify custom skeleton object name
bpftool: Add `gen object` command to perform BPF static linking
selftests/bpf: Pass all BPF .o's through BPF static linker
selftests/bpf: Add multi-file statically linked BPF object file test
libbpf: Skip BTF fixup if object file has no BTF
libbpf: Constify few bpf_program getters
libbpf: Preserve empty DATASEC BTFs during static linking
libbpf: Fix memory leak when emitting final btf_ext
libbpf: Add bpf_map__inner_map API
libbpf: Suppress compiler warning when using SEC() macro with externs
libbpf: Mark BPF subprogs with hidden visibility as static for BPF
verifier
libbpf: Allow gaps in BPF program sections to support overriden weak
functions
libbpf: Refactor BTF map definition parsing
libbpf: Factor out symtab and relos sanity checks
libbpf: Make few internal helpers available outside of libbpf.c
libbpf: Extend sanity checking ELF symbols with externs validation
libbpf: Tighten BTF type ID rewriting with error checking
libbpf: Add linker extern resolution support for functions and global
variables
libbpf: Support extern resolution for BTF-defined maps in .maps
section
libbpf: Support BTF_KIND_FLOAT during type compatibility checks in
CO-RE
bpftool: Strip const/volatile/restrict modifiers from .bss and .data
vars
libbpf: Add per-file linker opts
selftests/bpf: Stop using static variables for passing data to/from
user-space
bpftool: Stop emitting static variables in BPF skeleton
libbpf: Fix ELF symbol visibility update logic
libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions
libbpf: Reject static maps
libbpf: Reject static entry-point BPF programs
libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0
behaviors
libbpf: Streamline error reporting for low-level APIs
libbpf: Streamline error reporting for high-level APIs
bpftool: Set errno on skeleton failures and propagate errors
libbpf: Move few APIs from 0.4 to 0.5 version
libbpf: Refactor header installation portions of Makefile
libbpf: Install skel_internal.h header used from light skeletons
selftests/bpf: Add remaining ASSERT_xxx() variants
libbpf: Fix build with latest gcc/binutils with LTO
libbpf: Make libbpf_version.h non-auto-generated
selftests/bpf: Update selftests to always provide "struct_ops" SEC
libbpf: Ensure BPF prog types are set before relocations
libbpf: Simplify BPF program auto-attach code
libbpf: Minimize explicit iterator of section definition array
libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
selftests/bpf: Stop using relaxed_core_relocs which has no effect
libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
libbpf: Allow skipping attach_func_name in
bpf_program__set_attach_target()
libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
libbpf: Constify all high-level program attach APIs
selftests/bpf: Turn on libbpf 1.0 mode and fix all IS_ERR checks
selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target()
API
libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
libbpf: Add API that copies all BTF types from one BTF object to
another
libbpf: Deprecate btf__finalize_data() and move it into libbpf.c
libbpf: Extract ELF processing state into separate struct
libbpf: Refactor internal sec_def handling to enable pluggability
libbpf: Reduce reliance of attach_fns on sec_def internals
libbpf: Use Elf64-specific types explicitly for dealing with ELF
libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps
bpftool: Support multiple .rodata/.data internal maps in skeleton
bpftool: Improve skeleton generation for data maps without DATASEC
type
libbpf: Support multiple .rodata.* and .data.* BPF maps
selftests/bpf: Demonstrate use of custom .rodata/.data sections
libbpf: Simplify look up by name of internal maps
selftests/bpf: Switch to ".bss"/".rodata"/".data" lookups for internal
maps
libbpf: Fix off-by-one bug in bpf_core_apply_relo()
libbpf: Add ability to fetch bpf_program's underlying instructions
libbpf: Deprecate multi-instance bpf_program APIs
libbpf: Deprecate ambiguously-named bpf_program__size() API
libbpf: Detect corrupted ELF symbols section
libbpf: Improve sanity checking during BTF fix up
libbpf: Validate that .BTF and .BTF.ext sections contain data
libbpf: Fix section counting logic
libbpf: Improve ELF relo sanitization
libbpf: Deprecate bpf_program__load() API
libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS
selftests/bpf: Pass sanitizer flags to linker through LDFLAGS
libbpf: Free up resources used by inner map definition
selftests/bpf: Fix memory leaks in btf_type_c_dump() helper
selftests/bpf: Free per-cpu values array in bpf_iter selftest
selftests/bpf: Free inner strings index in btf selftest
selftests/bpf: Avoid duplicate btf__parse() call
libbpf: Load global data maps lazily on legacy kernels
libbpf: Fix potential misaligned memory access in btf_ext__new()
libbpf: Don't call libc APIs with NULL pointers
libbpf: Fix glob_syms memory leak in bpf_linker
libbpf: Fix using invalidated memory in bpf_linker
selftests/bpf: Fix possible NULL passed to memcpy() with zero size
selftests/bpf: Prevent misaligned memory access in get_stack_raw_tp
test
selftests/bpf: Fix misaligned memory access in queue_stack_map test
selftests/bpf: Prevent out-of-bounds stack access in test_bpffs
selftests/bpf: Fix misaligned accesses in xdp and xdp_bpf2bpf tests
libbpf: Cleanup struct bpf_core_cand.
libbpf: Fix non-C89 loop variable declaration in gen_loader.c
Arnaldo Carvalho de Melo (1):
libbpf: Provide GELF_ST_VISIBILITY() define for older libelf
Brendan Jackman (15):
tools/resolve_btfids: Fix some error messages
bpf: Fix cold build of test_progs-no_alu32
bpf: Clarify return value of probe str helpers
bpf: x86: Factor out emission of ModR/M for *(reg + off)
bpf: x86: Factor out emission of REX byte
bpf: x86: Factor out a lookup table for some ALU opcodes
bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
bpf: Move BPF_STX reserved field check into BPF_STX verifier code
bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
bpf: Add instructions for atomic_[cmp]xchg
bpf: Pull out a macro for interpreting atomic ALU operations
bpf: Add bitwise atomic instructions
bpf: Add tests for new BPF atomic operations
bpf: Document new atomic instructions
bpf: Rename fixup_bpf_calls and add some comments
Cong Wang (1):
bpf: Clear percpu pointers in bpf_prog_clone_free()
Daniel Xu (1):
libbpf: Do not close un-owned FD 0 on errors
Dave Marchevsky (1):
bpf: Add verified_insns to bpf_prog_info and fdinfo
Dmitrii Banshchikov (7):
bpf: Rename bpf_reg_state variables
bpf: Extract nullable reg type conversion into a helper function
bpf: Support pointers in global func args
selftests/bpf: Add unit tests for pointers in global functions
bpf: Drop imprecise log message
selftests/bpf: Fix a compiler warning in global func test
bpf: Use MAX_BPF_FUNC_REG_ARGS macro
Florent Revest (7):
bpf: Be less specific about socket cookies guarantees
selftests/bpf: Fix the ASSERT_ERR_PTR macro
bpf: Factorize bpf_trace_printk and bpf_seq_printf
bpf: Add a ARG_PTR_TO_CONST_STR argument type
bpf: Add a bpf_snprintf helper
libbpf: Introduce a BPF_SNPRINTF helper macro
libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h
Florian Lehner (2):
selftests/bpf: Print reason when a tester could not run a program
selftests/bpf: Avoid errno clobbering
Gary Lin (3):
bpf,x64: Pad NOPs to make images converge more easily
test_bpf: Remove EXPECTED_FAIL flag from bpf_fill_maxinsns11
selftests/bpf: Add verifier tests for x64 jit jump padding
Hao Luo (1):
libbpf: Support weak typed ksyms.
Hengqi Chen (8):
libbpf: Fix KERNEL_VERSION macro
tools/resolve_btfids: Emit warnings and patch zero id for missing
symbols
libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
libbpf: Support uniform BTF-defined key/value specification across all
BPF maps
libbpf: Deprecate bpf_object__unload() API since v0.6
libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7
selftests/bpf: Switch to new bpf_object__next_{map,program} APIs
libbpf: Support static initialization of BPF_MAP_TYPE_PROG_ARRAY
Ian Rogers (3):
bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
tools/bpftool: Add -Wall when building BPF programs
libbpf: Add NULL check to add_dummy_ksym_var
Ilya Leoshkevich (5):
selftests/bpf: Copy extras in out-of-srctree builds
bpf: Add BTF_KIND_FLOAT to uapi
libbpf: Fix whitespace in btf_add_composite() comment
libbpf: Add BTF_KIND_FLOAT support
bpf: Generate BTF_KIND_FLOAT when linking vmlinux
Jason Wang (1):
libbpf: Fix comment typo
Jean-Philippe Brucker (12):
tools: Factor HOSTCC, HOSTLD, HOSTAR definitions
tools/runqslower: Use Makefile.include
tools/runqslower: Enable out-of-tree build
tools/runqslower: Build bpftool using HOSTCC
tools/bpftool: Fix build slowdown
selftests/bpf: Enable cross-building
selftests/bpf: Fix out-of-tree build
selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
selftests/bpf: Fix installation of urandom_read
selftests/bpf: Install btf_dump test cases
tools/bpftool: Fix cross-build
tools/runqslower: Fix cross-build
Jiri Olsa (3):
tools/resolve_btfids: Warn when having multiple IDs for single type
libbpf: Use string table index from index table if needed
perf build: Move feature cleanup under tools/build
Joe Stringer (1):
tools: Sync uapi bpf.h header with latest changes
Jonathan Edwards (1):
libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading
Kumar Kartikeya Dwivedi (14):
libbpf: Add various netlink helpers
libbpf: Add low level TC-BPF management API
libbpf: Remove unneeded check for flags during tc detach
libbpf: Set NLM_F_EXCL when creating qdisc
libbpf: Fix segfault in static linker for objects without BTF
libbpf: Fix segfault in light skeleton for objects without BTF
libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
bpf: Add bpf_kallsyms_lookup_name helper
libbpf: Add typeless ksym support to gen_loader
libbpf: Add weak ksym support to gen_loader
libbpf: Perform map fd cleanup for gen_loader in case of error
bpf: Change bpf_kallsyms_lookup_name size type to
ARG_CONST_SIZE_OR_ZERO
libbpf: Avoid double stores for success/failure case of ksym
relocations
libbpf: Avoid reload of imm for weak, unresolved, repeating ksym
Lorenz Bauer (2):
bpf: Consolidate shared test timing code
bpf: Add PROG_TEST_RUN support for sk_lookup programs
Martin KaFai Lau (9):
bpf: Simplify freeing logic in linfo and jited_linfo
bpf: Refactor btf_check_func_arg_match
bpf: Support bpf program calling kernel function
bpf: Support kernel function call in x86-32
libbpf: Refactor bpf_object__resolve_ksyms_btf_id
libbpf: Refactor codes for finding btf id of a kernel symbol
libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
libbpf: Record extern sym relocation first
libbpf: Support extern kernel function
Martynas Pumputis (1):
selftests/bpf: Check inner map deletion
Matt Smith (3):
libbpf: Change bpf_object_skeleton data field to const pointer
bpftool: Provide a helper method for accessing skeleton's embedded ELF
data
selftests/bpf: Add checks for X__elf_bytes() skeleton helper
Mauricio Vásquez (1):
libbpf: Fix memory leak in btf__dedup()
Michal Suchanek (1):
libbpf: Fix pr_warn type warnings on 32bit
Pedro Tammela (2):
libbpf: Avoid inline hint definition from 'linux/stddef.h'
libbpf: Clarify flags in ringbuf helpers
Quentin Monnet (17):
libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
libbpf: Rename btf__load() as btf__load_into_kernel()
libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
tools: Free BTF objects at various locations
tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
libbpf: Add split BTF support for btf__load_from_kernel_by_id()
tools: bpftool: Support dumping split BTF by id
libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API
deprecations
libbpf: Skip re-installing headers file if source is older than target
bpftool: Remove unused includes to <bpf/bpf_gen_internal.h>
bpftool: Install libbpf headers instead of including the dir
tools/resolve_btfids: Install libbpf headers when building
tools/runqslower: Install libbpf headers when building
bpf: preload: Install libbpf headers when building
bpf: iterators: Install libbpf headers when building
selftests/bpf: Better clean up for runqslower in test_bpftool_build.sh
bpftool: Add install-bin target to install binary only
Rafael David Tinoco (1):
libbpf: Add bpf object kern_version attribute setter
Sedat Dilek (1):
tools: Factor Clang, LLC and LLVM utils definitions
Shuyi Cheng (2):
libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
libbpf: Add "bool skipped" to struct bpf_map
Song Liu (5):
bpf: Use separate lockdep class for each hashtab
bpf: Avoid hashtab deadlock with map_locked
bpftool: Add Makefile target bootstrap
perf build: Support build BPF skeletons with perf
perf stat: Enable counting events for BPF programs
Stanislav Fomichev (2):
libbpf: Cap retries in sys_bpf_prog_load
libbpf: Skip bpf_object__probe_loading for light skeleton
Toke Høiland-Jørgensen (5):
bpf: Return target info when a tracing bpf_link is queried
libbpf: Restore errno return for functions that were already returning
it
libbpf: Don't crash on object files with no symbol tables
libbpf: Ignore STT_SECTION symbols in 'maps' section
libbpf: Properly ignore STT_SECTION symbols in legacy map definitions
Wang Hai (1):
libbpf: Simplify the return expression of bpf_object__init_maps
function
Wang Qing (1):
bpf, btf: Remove the duplicate btf_ids.h include
Wedson Almeida Filho (1):
bpf: Refactor check_cfg to use a structured loop.
Yauheni Kaliuta (7):
selftests/bpf: test_progs/sockopt_sk: Remove version
selftests/bpf: test_progs/sockopt_sk: Convert to use BPF skeleton
selftests/bpf: Pass page size from userspace in sockopt_sk
selftests/bpf: Pass page size from userspace in map_ptr
selftests/bpf: mmap: Use runtime page size
selftests/bpf: ringbuf: Use runtime page size
selftests/bpf: ringbuf_multi: Use runtime page size
Yonghong Song (27):
bpf: Permit cond_resched for some iterators
bpf: Permit size-0 datasec
bpf: Refactor BPF_PSEUDO_CALL checking as a helper function
bpf: Factor out visit_func_call_insn() in check_cfg()
bpf: Factor out verbose_invalid_scalar()
bpf: Refactor check_func_call() to allow callback function
bpf: Change return value of verifier function add_subprog()
bpf: Add bpf_for_each_map_elem() helper
libbpf: Move function is_ldimm64() earlier in libbpf.c
libbpf: Support subprog address relocation
selftests/bpf: Fix test_cpp compilation failure with clang
bpftool: Fix a clang compilation warning
libbpf: Add support for new llvm bpf relocations
bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
btf: Change BTF_KIND_* macros to enums
bpf: Support for new btf kind BTF_KIND_TAG
libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
libbpf: Add support for BTF_KIND_TAG
bpftool: Add support for BTF_KIND_TAG
bpf: Add BTF_KIND_DECL_TAG typedef support
docs/bpf: Update documentation for BTF_KIND_DECL_TAG typedef support
bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes
libbpf: Support BTF_KIND_TYPE_TAG
bpftool: Support BTF_KIND_TYPE_TAG
docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support
libbpf: Fix a couple of missed btf_type_tag handling in btf.c
Documentation/ABI/testing/sysfs-kernel-btf | 8 +
Documentation/bpf/btf.rst | 40 +-
Documentation/networking/filter.rst | 61 +-
arch/arm/net/bpf_jit_32.c | 7 +-
arch/arm64/net/bpf_jit_comp.c | 16 +-
arch/mips/net/ebpf_jit.c | 11 +-
arch/powerpc/net/bpf_jit_comp64.c | 25 +-
arch/riscv/net/bpf_jit_comp32.c | 20 +-
arch/riscv/net/bpf_jit_comp64.c | 16 +-
arch/s390/net/bpf_jit_comp.c | 27 +-
arch/sparc/net/bpf_jit_comp_64.c | 17 +-
arch/x86/net/bpf_jit_comp.c | 408 +-
arch/x86/net/bpf_jit_comp32.c | 204 +-
drivers/net/ethernet/netronome/nfp/bpf/fw.h | 4 +-
drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 +-
drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +-
.../net/ethernet/netronome/nfp/bpf/verifier.c | 15 +-
include/linux/bpf.h | 150 +-
include/linux/bpf_types.h | 2 +
include/linux/bpf_verifier.h | 46 +-
include/linux/bpfptr.h | 75 +
include/linux/btf.h | 106 +-
include/linux/filter.h | 45 +-
include/linux/module.h | 4 +
include/uapi/linux/bpf.h | 265 +-
include/uapi/linux/btf.h | 57 +-
kernel/bpf/Makefile | 4 +
kernel/bpf/bpf_iter.c | 43 +-
kernel/bpf/bpf_struct_ops.c | 6 +-
kernel/bpf/btf.c | 1403 ++++-
kernel/bpf/core.c | 153 +-
kernel/bpf/disasm.c | 56 +-
kernel/bpf/hashtab.c | 130 +-
kernel/bpf/helpers.c | 326 +-
kernel/bpf/preload/Makefile | 25 +-
kernel/bpf/preload/iterators/Makefile | 38 +-
kernel/bpf/preload/iterators/iterators.bpf.c | 1 -
kernel/bpf/syscall.c | 363 +-
kernel/bpf/sysfs_btf.c | 2 +-
kernel/bpf/task_iter.c | 2 +
kernel/bpf/trampoline.c | 77 +-
kernel/bpf/verifier.c | 1508 ++++-
kernel/module.c | 36 +
kernel/trace/bpf_trace.c | 375 +-
lib/Kconfig.debug | 9 +
lib/test_bpf.c | 21 +-
net/bpf/test_run.c | 290 +-
net/core/filter.c | 1 +
net/ipv4/bpf_tcp_ca.c | 7 +-
samples/bpf/bpf_insn.h | 4 +-
samples/bpf/cookie_uid_helper_example.c | 8 +-
samples/bpf/sock_example.c | 2 +-
samples/bpf/test_cgrp2_attach.c | 5 +-
samples/bpf/xdp1_user.c | 2 +-
samples/bpf/xdp_sample_pkts_user.c | 2 +-
scripts/Makefile.modfinal | 25 +-
scripts/link-vmlinux.sh | 7 +-
.../bpf/bpftool/Documentation/bpftool-gen.rst | 78 +-
tools/bpf/bpftool/Makefile | 55 +-
tools/bpf/bpftool/bash-completion/bpftool | 17 +-
tools/bpf/bpftool/btf.c | 80 +-
tools/bpf/bpftool/btf_dumper.c | 6 +-
tools/bpf/bpftool/gen.c | 651 +-
tools/bpf/bpftool/iter.c | 2 +-
tools/bpf/bpftool/main.c | 20 +-
tools/bpf/bpftool/main.h | 2 +
tools/bpf/bpftool/map.c | 14 +-
tools/bpf/bpftool/net.c | 2 +-
tools/bpf/bpftool/prog.c | 141 +-
tools/bpf/bpftool/xlated_dumper.c | 3 +
tools/bpf/resolve_btfids/Makefile | 19 +-
tools/bpf/resolve_btfids/main.c | 38 +-
tools/bpf/runqslower/Makefile | 71 +-
tools/build/Makefile | 8 +-
tools/build/Makefile.feature | 4 +-
tools/build/feature/Makefile | 4 +-
tools/include/linux/filter.h | 24 +-
tools/include/uapi/linux/bpf.h | 977 ++-
tools/include/uapi/linux/btf.h | 57 +-
tools/lib/bpf/.gitignore | 2 -
tools/lib/bpf/Build | 2 +-
tools/lib/bpf/Makefile | 104 +-
tools/lib/bpf/bpf.c | 272 +-
tools/lib/bpf/bpf.h | 1 +
tools/lib/bpf/bpf_core_read.h | 169 +-
tools/lib/bpf/bpf_gen_internal.h | 65 +
tools/lib/bpf/bpf_helpers.h | 108 +-
tools/lib/bpf/bpf_prog_linfo.c | 18 +-
tools/lib/bpf/bpf_tracing.h | 44 +-
tools/lib/bpf/btf.c | 2085 ++++---
tools/lib/bpf/btf.h | 99 +-
tools/lib/bpf/btf_dump.c | 910 ++-
tools/lib/bpf/gen_loader.c | 1126 ++++
tools/lib/bpf/libbpf.c | 5436 +++++++++--------
tools/lib/bpf/libbpf.h | 212 +-
tools/lib/bpf/libbpf.map | 54 +
tools/lib/bpf/libbpf_common.h | 26 +-
tools/lib/bpf/libbpf_errno.c | 7 +-
tools/lib/bpf/libbpf_internal.h | 299 +-
tools/lib/bpf/libbpf_legacy.h | 60 +
tools/lib/bpf/libbpf_version.h | 9 +
tools/lib/bpf/linker.c | 2901 +++++++++
tools/lib/bpf/netlink.c | 593 +-
tools/lib/bpf/nlattr.h | 48 +
tools/lib/bpf/relo_core.c | 1322 ++++
tools/lib/bpf/relo_core.h | 57 +
tools/lib/bpf/ringbuf.c | 26 +-
tools/lib/bpf/skel_internal.h | 123 +
tools/lib/bpf/strset.c | 176 +
tools/lib/bpf/strset.h | 21 +
tools/lib/bpf/xsk.c | 4 +-
tools/perf/Documentation/perf-stat.txt | 18 +
tools/perf/Makefile.config | 9 +
tools/perf/Makefile.perf | 58 +-
tools/perf/builtin-stat.c | 82 +-
tools/perf/util/Build | 1 +
tools/perf/util/bpf-event.c | 11 +-
tools/perf/util/bpf_counter.c | 320 +
tools/perf/util/bpf_counter.h | 72 +
tools/perf/util/bpf_skel/.gitignore | 3 +
.../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 +
tools/perf/util/evsel.c | 5 +
tools/perf/util/evsel.h | 5 +
tools/perf/util/python.c | 21 +
tools/perf/util/stat-display.c | 4 +-
tools/perf/util/stat.c | 2 +-
tools/perf/util/target.c | 34 +-
tools/perf/util/target.h | 10 +
tools/scripts/Makefile.include | 18 +
tools/testing/selftests/bpf/.gitignore | 1 +
tools/testing/selftests/bpf/Makefile | 183 +-
tools/testing/selftests/bpf/bench.c | 1 +
.../selftests/bpf/benchs/bench_rename.c | 2 +-
.../selftests/bpf/benchs/bench_ringbufs.c | 6 +-
.../selftests/bpf/benchs/bench_trigger.c | 2 +-
.../selftests/bpf/bpf_testmod/.gitignore | 6 +
.../selftests/bpf/bpf_testmod/Makefile | 20 +
.../bpf/bpf_testmod/bpf_testmod-events.h | 36 +
.../selftests/bpf/bpf_testmod/bpf_testmod.c | 55 +
.../selftests/bpf/bpf_testmod/bpf_testmod.h | 14 +
tools/testing/selftests/bpf/btf_helpers.c | 264 +
tools/testing/selftests/bpf/btf_helpers.h | 19 +
.../selftests/bpf/prog_tests/atomics.c | 246 +
.../selftests/bpf/prog_tests/attach_probe.c | 12 +-
.../selftests/bpf/prog_tests/bpf_iter.c | 34 +-
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 8 +-
tools/testing/selftests/bpf/prog_tests/btf.c | 166 +-
.../bpf/prog_tests/btf_dedup_split.c | 325 +
.../selftests/bpf/prog_tests/btf_dump.c | 10 +-
.../selftests/bpf/prog_tests/btf_endian.c | 4 +-
.../selftests/bpf/prog_tests/btf_map_in_map.c | 33 -
.../selftests/bpf/prog_tests/btf_split.c | 99 +
.../selftests/bpf/prog_tests/btf_write.c | 47 +-
.../bpf/prog_tests/cg_storage_multi.c | 84 +-
.../bpf/prog_tests/cgroup_attach_multi.c | 6 +-
.../selftests/bpf/prog_tests/cgroup_link.c | 16 +-
.../bpf/prog_tests/cgroup_skb_sk_lookup.c | 2 +-
.../selftests/bpf/prog_tests/core_autosize.c | 2 +-
.../bpf/prog_tests/core_read_macros.c | 64 +
.../selftests/bpf/prog_tests/core_reloc.c | 105 +-
.../selftests/bpf/prog_tests/fexit_bpf2bpf.c | 68 +-
.../selftests/bpf/prog_tests/fexit_stress.c | 4 +-
.../selftests/bpf/prog_tests/flow_dissector.c | 2 +-
.../bpf/prog_tests/flow_dissector_reattach.c | 10 +-
.../bpf/prog_tests/get_stack_raw_tp.c | 24 +-
.../prog_tests/get_stackid_cannot_attach.c | 9 +-
.../selftests/bpf/prog_tests/global_data.c | 11 +-
.../bpf/prog_tests/global_data_init.c | 2 +-
.../bpf/prog_tests/global_func_args.c | 60 +
.../selftests/bpf/prog_tests/hashmap.c | 9 +-
.../selftests/bpf/prog_tests/kfree_skb.c | 23 +-
.../selftests/bpf/prog_tests/ksyms_btf.c | 34 +-
.../selftests/bpf/prog_tests/ksyms_module.c | 31 +
.../selftests/bpf/prog_tests/link_pinning.c | 7 +-
.../selftests/bpf/prog_tests/map_ptr.c | 15 +-
tools/testing/selftests/bpf/prog_tests/mmap.c | 24 +-
.../selftests/bpf/prog_tests/module_attach.c | 53 +
.../selftests/bpf/prog_tests/obj_name.c | 8 +-
.../selftests/bpf/prog_tests/perf_branches.c | 4 +-
.../selftests/bpf/prog_tests/perf_buffer.c | 2 +-
.../bpf/prog_tests/perf_event_stackmap.c | 3 +-
.../selftests/bpf/prog_tests/probe_user.c | 7 +-
.../selftests/bpf/prog_tests/prog_run_xattr.c | 2 +-
.../bpf/prog_tests/queue_stack_map.c | 12 +-
.../bpf/prog_tests/raw_tp_test_run.c | 4 +-
.../selftests/bpf/prog_tests/rdonly_maps.c | 9 +-
.../selftests/bpf/prog_tests/recursion.c | 41 +
.../bpf/prog_tests/reference_tracking.c | 2 +-
.../selftests/bpf/prog_tests/resolve_btfids.c | 9 +-
.../selftests/bpf/prog_tests/ringbuf.c | 17 +-
.../selftests/bpf/prog_tests/ringbuf_multi.c | 23 +-
.../bpf/prog_tests/select_reuseport.c | 55 +-
.../selftests/bpf/prog_tests/send_signal.c | 5 +-
.../selftests/bpf/prog_tests/sk_lookup.c | 2 +-
.../selftests/bpf/prog_tests/skeleton.c | 41 +-
.../selftests/bpf/prog_tests/snprintf_btf.c | 4 +-
.../selftests/bpf/prog_tests/sock_fields.c | 14 +-
.../selftests/bpf/prog_tests/sockmap_basic.c | 6 +-
.../selftests/bpf/prog_tests/sockmap_ktls.c | 2 +-
.../selftests/bpf/prog_tests/sockmap_listen.c | 10 +-
.../selftests/bpf/prog_tests/sockopt_sk.c | 66 +-
.../bpf/prog_tests/stacktrace_build_id_nmi.c | 3 +-
.../selftests/bpf/prog_tests/stacktrace_map.c | 2 +-
.../bpf/prog_tests/stacktrace_map_raw_tp.c | 5 +-
.../selftests/bpf/prog_tests/static_linked.c | 35 +
.../bpf/prog_tests/tcp_hdr_options.c | 15 +-
.../selftests/bpf/prog_tests/tcp_rtt.c | 2 +-
.../selftests/bpf/prog_tests/test_bpffs.c | 4 +-
.../bpf/prog_tests/test_global_funcs.c | 8 +
.../selftests/bpf/prog_tests/test_overhead.c | 12 +-
.../bpf/prog_tests/trampoline_count.c | 18 +-
.../selftests/bpf/prog_tests/udp_limit.c | 7 +-
tools/testing/selftests/bpf/prog_tests/xdp.c | 11 +-
.../selftests/bpf/prog_tests/xdp_bpf2bpf.c | 8 +-
.../selftests/bpf/prog_tests/xdp_link.c | 8 +-
tools/testing/selftests/bpf/progs/atomics.c | 154 +
tools/testing/selftests/bpf/progs/bpf_cubic.c | 6 +-
.../bpf/progs/bpf_iter_bpf_hash_map.c | 1 -
.../selftests/bpf/progs/bpf_iter_bpf_map.c | 1 -
.../selftests/bpf/progs/bpf_iter_ipv6_route.c | 1 -
.../selftests/bpf/progs/bpf_iter_netlink.c | 1 -
.../selftests/bpf/progs/bpf_iter_task.c | 1 -
.../selftests/bpf/progs/bpf_iter_task_btf.c | 1 -
.../selftests/bpf/progs/bpf_iter_task_file.c | 1 -
.../selftests/bpf/progs/bpf_iter_task_stack.c | 1 -
.../selftests/bpf/progs/bpf_iter_tcp4.c | 1 -
.../selftests/bpf/progs/bpf_iter_tcp6.c | 1 -
.../selftests/bpf/progs/bpf_iter_test_kern4.c | 4 +-
.../selftests/bpf/progs/bpf_iter_udp4.c | 1 -
.../selftests/bpf/progs/bpf_iter_udp6.c | 1 -
.../selftests/bpf/progs/core_reloc_types.h | 17 +
tools/testing/selftests/bpf/progs/kfree_skb.c | 4 +-
tools/testing/selftests/bpf/progs/lsm.c | 69 +
.../selftests/bpf/progs/map_ptr_kern.c | 4 +-
tools/testing/selftests/bpf/progs/recursion.c | 46 +
.../testing/selftests/bpf/progs/sockopt_sk.c | 11 +-
tools/testing/selftests/bpf/progs/tailcall3.c | 2 +-
tools/testing/selftests/bpf/progs/tailcall4.c | 2 +-
tools/testing/selftests/bpf/progs/tailcall5.c | 2 +-
.../selftests/bpf/progs/tailcall_bpf2bpf2.c | 2 +-
.../selftests/bpf/progs/tailcall_bpf2bpf4.c | 2 +-
.../selftests/bpf/progs/test_cls_redirect.c | 4 +-
.../bpf/progs/test_core_read_macros.c | 50 +
.../bpf/progs/test_core_reloc_module.c | 96 +
.../selftests/bpf/progs/test_global_func10.c | 29 +
.../selftests/bpf/progs/test_global_func11.c | 19 +
.../selftests/bpf/progs/test_global_func12.c | 21 +
.../selftests/bpf/progs/test_global_func13.c | 24 +
.../selftests/bpf/progs/test_global_func14.c | 21 +
.../selftests/bpf/progs/test_global_func15.c | 22 +
.../selftests/bpf/progs/test_global_func16.c | 22 +
.../selftests/bpf/progs/test_global_func9.c | 132 +
.../bpf/progs/test_global_func_args.c | 91 +
.../selftests/bpf/progs/test_ksyms_module.c | 26 +
.../selftests/bpf/progs/test_ksyms_weak.c | 56 +
.../bpf/progs/test_map_in_map_invalid.c | 26 +
tools/testing/selftests/bpf/progs/test_mmap.c | 2 -
.../selftests/bpf/progs/test_module_attach.c | 66 +
.../selftests/bpf/progs/test_rdonly_maps.c | 6 +-
.../selftests/bpf/progs/test_ringbuf.c | 1 -
.../selftests/bpf/progs/test_ringbuf_multi.c | 1 -
.../selftests/bpf/progs/test_skeleton.c | 20 +-
.../selftests/bpf/progs/test_sockmap_listen.c | 2 +-
.../selftests/bpf/progs/test_static_linked1.c | 30 +
.../selftests/bpf/progs/test_static_linked2.c | 31 +
.../selftests/bpf/progs/test_subprogs.c | 13 +
.../selftests/bpf/test_bpftool_build.sh | 4 +
tools/testing/selftests/bpf/test_btf.h | 3 +
.../selftests/bpf/test_cgroup_storage.c | 2 +-
tools/testing/selftests/bpf/test_maps.c | 279 +-
tools/testing/selftests/bpf/test_progs.c | 79 +-
tools/testing/selftests/bpf/test_progs.h | 72 +-
.../selftests/bpf/test_tcpnotify_user.c | 7 +-
tools/testing/selftests/bpf/test_verifier.c | 103 +-
.../selftests/bpf/verifier/atomic_and.c | 77 +
.../selftests/bpf/verifier/atomic_cmpxchg.c | 96 +
.../selftests/bpf/verifier/atomic_fetch_add.c | 106 +
.../selftests/bpf/verifier/atomic_or.c | 77 +
.../selftests/bpf/verifier/atomic_xchg.c | 46 +
.../selftests/bpf/verifier/atomic_xor.c | 77 +
tools/testing/selftests/bpf/verifier/calls.c | 12 +-
tools/testing/selftests/bpf/verifier/ctx.c | 7 +-
.../selftests/bpf/verifier/dead_code.c | 10 +-
.../bpf/verifier/direct_packet_access.c | 4 +-
tools/testing/selftests/bpf/verifier/jit.c | 24 +
.../testing/selftests/bpf/verifier/leak_ptr.c | 10 +-
.../selftests/bpf/verifier/meta_access.c | 4 +-
tools/testing/selftests/bpf/verifier/unpriv.c | 3 +-
.../bpf/verifier/value_illegal_alu.c | 2 +-
tools/testing/selftests/bpf/verifier/xadd.c | 18 +-
tools/testing/selftests/bpf/xdping.c | 2 +-
tools/testing/selftests/tc-testing/Makefile | 3 +-
292 files changed, 24645 insertions(+), 6370 deletions(-)
create mode 100644 include/linux/bpfptr.h
create mode 100644 tools/lib/bpf/bpf_gen_internal.h
create mode 100644 tools/lib/bpf/gen_loader.c
create mode 100644 tools/lib/bpf/libbpf_legacy.h
create mode 100644 tools/lib/bpf/libbpf_version.h
create mode 100644 tools/lib/bpf/linker.c
create mode 100644 tools/lib/bpf/relo_core.c
create mode 100644 tools/lib/bpf/relo_core.h
create mode 100644 tools/lib/bpf/skel_internal.h
create mode 100644 tools/lib/bpf/strset.c
create mode 100644 tools/lib/bpf/strset.h
create mode 100644 tools/perf/util/bpf_counter.c
create mode 100644 tools/perf/util/bpf_counter.h
create mode 100644 tools/perf/util/bpf_skel/.gitignore
create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
create mode 100644 tools/testing/selftests/bpf/bpf_testmod/.gitignore
create mode 100644 tools/testing/selftests/bpf/bpf_testmod/Makefile
create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
create mode 100644 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
create mode 100644 tools/testing/selftests/bpf/btf_helpers.c
create mode 100644 tools/testing/selftests/bpf/btf_helpers.h
create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_split.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/core_read_macros.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/global_func_args.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/module_attach.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/recursion.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/static_linked.c
create mode 100644 tools/testing/selftests/bpf/progs/atomics.c
create mode 100644 tools/testing/selftests/bpf/progs/recursion.c
create mode 100644 tools/testing/selftests/bpf/progs/test_core_read_macros.c
create mode 100644 tools/testing/selftests/bpf/progs/test_core_reloc_module.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func10.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func11.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func12.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func13.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func14.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func15.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func16.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func9.c
create mode 100644 tools/testing/selftests/bpf/progs/test_global_func_args.c
create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c
create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_weak.c
create mode 100644 tools/testing/selftests/bpf/progs/test_map_in_map_invalid.c
create mode 100644 tools/testing/selftests/bpf/progs/test_module_attach.c
create mode 100644 tools/testing/selftests/bpf/progs/test_static_linked1.c
create mode 100644 tools/testing/selftests/bpf/progs/test_static_linked2.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c
--
2.20.1

12 Aug '22
From: Xu Kuohai <xukuohai(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5M05G
CVE: NA
-------------------------------------------------------
BMC is an in-kernel key-value cache implemented in BPF, proposed by
paper [1]. The paper applied BMC to memcached and obtained at least a
6x performance speedup.
This patch implements a sample BMC for Redis. Paper [1] implements BMC
in XDP, bypassing the kernel network stack entirely. Since Redis is based
on TCP, and it is almost impossible to fully process TCP traffic in XDP,
this patch implements BMC in sockmap, which sits at the top of the
kernel network stack. Because the network stack is not bypassed, the
speedup is less significant. In any case, this is only a sample
implementation, and its performance can be optimized further.
See [2] for details on how to build samples/bpf.
Output files:
samples/bpf/bmctool
samples/bpf/bmc/bpf.o
Sample usage:
bmctool prog load -p 6379 ./bmc/bpf.o # load bmc bpf prog and attach it
# to sockets with listen port 6379
bmctool stat # dump bmc status
bmctool prog unload # detach and unload bmc prog
[1] https://www.usenix.org/conference/nsdi21/presentation/ghigoff
[2] https://www.kernel.org/doc/readme/samples-bpf-README.rst
Signed-off-by: Xu Kuohai <xukuohai(a)huawei.com>
Reviewed-by: Yang Jihong <yangjihong1(a)huawei.com>
---
samples/bpf/Makefile | 3 +
samples/bpf/bmc/bpf.c | 144 ++++++++
samples/bpf/bmc/common.h | 21 ++
samples/bpf/bmc/redis.h | 648 ++++++++++++++++++++++++++++++++++
samples/bpf/bmc/tool.c | 733 +++++++++++++++++++++++++++++++++++++++
5 files changed, 1549 insertions(+)
create mode 100644 samples/bpf/bmc/bpf.c
create mode 100644 samples/bpf/bmc/common.h
create mode 100644 samples/bpf/bmc/redis.h
create mode 100644 samples/bpf/bmc/tool.c
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index aeebf5d12f32..f9bb6bdad6ce 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -54,6 +54,7 @@ tprogs-y += task_fd_query
tprogs-y += xdp_sample_pkts
tprogs-y += ibumad
tprogs-y += hbm
+tprogs-y += bmctool
# Libbpf dependencies
LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -111,6 +112,7 @@ task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
ibumad-objs := bpf_load.o ibumad_user.o $(TRACE_HELPERS)
hbm-objs := bpf_load.o hbm.o $(CGROUP_HELPERS)
+bmctool-objs := bmc/tool.o
# Tell kbuild to always build the programs
always-y := $(tprogs-y)
@@ -172,6 +174,7 @@ always-y += ibumad_kern.o
always-y += hbm_out_kern.o
always-y += hbm_edt_kern.o
always-y += xdpsock_kern.o
+always-y += bmc/bpf.o
ifeq ($(ARCH), arm)
# Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
diff --git a/samples/bpf/bmc/bpf.c b/samples/bpf/bmc/bpf.c
new file mode 100644
index 000000000000..127260c611f8
--- /dev/null
+++ b/samples/bpf/bmc/bpf.c
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Huawei Technologies Co., Ltd. 2022-2022. All rights reserved.
+ *
+ * Description: BPF program to accelerate Redis. The idea is to add a kernel
+ * cache for Redis data. When new Redis request is received, the kernel cache
+ * is checked, and if the requested data is found in the cache, a Redis reply
+ * message is constructed and sent back directly.
+ */
+
+#include <uapi/linux/in.h>
+#include <uapi/linux/if_ether.h>
+#include <uapi/linux/ip.h>
+#include <uapi/linux/tcp.h>
+#include <uapi/linux/bpf.h>
+#include <uapi/linux/pkt_cls.h>
+
+#include <bpf/bpf_helpers.h>
+
+#define debug(fmt, ...) \
+do { \
+ char ___fmt[] = fmt; \
+ bpf_trace_printk(___fmt, sizeof(___fmt), ##__VA_ARGS__); \
+} while (0)
+
+struct tcp_key {
+ __u32 family;
+ __be32 local_ip4;
+ __be32 remote_ip4;
+ __be32 local_port;
+ __be32 remote_port;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_SOCKHASH);
+ __uint(key_size, sizeof(struct tcp_key));
+ __uint(value_size, sizeof(u64));
+ __uint(max_entries, 1024);
+} bmc_socks SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(key_size, sizeof(u32));
+ __uint(value_size, sizeof(u32));
+ __uint(max_entries, 16);
+} bmc_ports SEC(".maps");
+
+SEC("bmc/sock_parser")
+int sock_parser(struct __sk_buff *skb)
+{
+ return skb->len;
+}
+
+static void init_tcp_key(struct tcp_key *key, struct bpf_sock *sk)
+{
+ if (sk != NULL) {
+ key->family = sk->family;
+ key->local_ip4 = sk->src_ip4;
+ key->remote_ip4 = sk->dst_ip4;
+ key->local_port = htonl(sk->src_port);
+ key->remote_port = htonl((u32)ntohs(sk->dst_port));
+ }
+}
+
+static int sock_redirect(struct __sk_buff *skb)
+{
+ struct tcp_key key;
+ struct bpf_sock *sk;
+
+ sk = skb->sk;
+ if (sk == NULL)
+ return SK_PASS;
+
+ init_tcp_key(&key, sk);
+ return bpf_sk_redirect_hash(skb, &bmc_socks, &key, 0);
+}
+
+#include "redis.h"
+
+SEC("bmc/sock_verdict")
+int sock_verdict(struct __sk_buff *skb)
+{
+ return bmc_process(skb);
+}
+
+static bool is_bmc_port(u32 port)
+{
+ u32 *val = bpf_map_lookup_elem(&bmc_ports, &port);
+
+ return val != NULL && *val != 0;
+}
+
+static void add_bmc_sock(struct bpf_sock_ops *skops, struct bpf_sock *sk)
+{
+ struct tcp_key key;
+
+ init_tcp_key(&key, sk);
+ bpf_sock_hash_update(skops, &bmc_socks, &key, BPF_ANY);
+}
+
+static void delete_bmc_sock(struct bpf_sock_ops *skops, struct bpf_sock *sk)
+{
+ struct tcp_key key;
+
+ init_tcp_key(&key, sk);
+ bpf_map_delete_elem(&bmc_socks, &key);
+}
+
+SEC("bmc/sock_ops")
+int sock_ops(struct bpf_sock_ops *skops)
+{
+ int op;
+ u16 local_port;
+ struct tcp_key key;
+ struct bpf_sock *sk;
+
+ sk = skops->sk;
+ if (skops->family != AF_INET || sk == NULL)
+ return 0;
+
+ local_port = ntohs((u16)sk->src_port);
+
+ switch ((int)skops->op) {
+ case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
+ case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
+ if (is_bmc_port(local_port)) {
+ bpf_sock_ops_cb_flags_set(skops, BPF_SOCK_OPS_STATE_CB_FLAG);
+ add_bmc_sock(skops, sk);
+ }
+ break;
+
+ case BPF_SOCK_OPS_STATE_CB:
+ if ((int)skops->args[1] == BPF_TCP_CLOSE)
+ delete_bmc_sock(skops, sk);
+ break;
+
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/samples/bpf/bmc/common.h b/samples/bpf/bmc/common.h
new file mode 100644
index 000000000000..51c8623ab4f8
--- /dev/null
+++ b/samples/bpf/bmc/common.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) Huawei Technologies Co., Ltd. 2022-2022. All rights reserved.
+ * Description: common header for both user prog and bpf kernel prog
+ */
+#ifndef __REDIS_BMC_COMMON_H__
+#define __REDIS_BMC_COMMON_H__
+
+#define REDIS_GET_PROG_INDEX 0
+#define REDIS_SET_PROG_INDEX 1
+
+struct redis_bmc_stat {
+ __u64 total_get_requests;
+ __u64 hit_get_requests;
+ __u64 drop_get_requests;
+ __u64 total_set_requests;
+ __u64 hit_set_requests;
+ __u64 drop_set_requests;
+};
+
+#endif
diff --git a/samples/bpf/bmc/redis.h b/samples/bpf/bmc/redis.h
new file mode 100644
index 000000000000..6e739ce3d81a
--- /dev/null
+++ b/samples/bpf/bmc/redis.h
@@ -0,0 +1,648 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) Huawei Technologies Co., Ltd. 2022-2022. All rights reserved.
+ *
+ * Description: This file parses REDIS commands. When SET command is received,
+ * the KEY and VALUE fields are extracted from the message and are stored to
+ * bmc_storage. When GET command is received, we lookup bmc_storage with the
+ * KEY received, and if success we fill the reply message with the found VALUE
+ * and send it back to the client.
+ *
+ * Here is a sample redis SET and GET session:
+ * (C: is the client, S: the server)
+ *
+ * C: "*3\r\n$3\r\nset\r\n$5\r\nkey01\r\n$5\r\nval01\r\n"
+ * S: "+OK\r\n"
+ * C: "*2\r\n$3\r\nget\r\n$5\r\nkey01\r\n"
+ * S: "$5\r\nval01\r\n"
+ *
+ * See [0] for RESP protocol details.
+ * [0] https://redis.io/docs/reference/protocol-spec/
+ */
+
+#include "common.h"
+
+#define BMC_MAX_REDIS_KEY_LEN 448 /* total key size should be less than 512 */
+#define BMC_MAX_REDIS_VALUE_LEN 2048
+#define BMC_MAX_CPUS 512 // NR_CPUS
+
+struct redis_key {
+ u32 len;
+ u8 data[BMC_MAX_REDIS_KEY_LEN];
+};
+
+struct redis_value {
+ u32 len;
+ u8 data[BMC_MAX_REDIS_VALUE_LEN];
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_LRU_HASH);
+ __uint(key_size, sizeof(struct redis_key));
+ __uint(value_size, sizeof(struct redis_value));
+ __uint(max_entries, 10000);
+} bmc_storage SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+ __uint(key_size, sizeof(u32));
+ __uint(value_size, sizeof(u32));
+ __uint(max_entries, 2);
+} bmc_jump_table SEC(".maps");
+
+struct redis_ctx {
+ struct redis_key key;
+ struct redis_value value;
+ u32 offset;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(key_size, sizeof(u32));
+ __uint(value_size, sizeof(struct redis_ctx));
+ __uint(max_entries, BMC_MAX_CPUS);
+} ctxmap SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(key_size, sizeof(u32));
+ __uint(value_size, sizeof(struct redis_bmc_stat));
+ __uint(max_entries, BMC_MAX_CPUS);
+} bmc_stats SEC(".maps");
+
+static int bmc_copy_from_skb(void *dst, void *dend,
+ struct __sk_buff *skb,
+ u32 skb_off, u32 len)
+{
+ u32 i;
+ u32 off = 0;
+ void *data = (void *)(long)skb->data;
+ void *data_end = (void *)(long)skb->data_end;
+
+ if (len > 2047)
+ return -1;
+
+ if (len >= 1024 && dst + off + 1024 < dend &&
+ data + skb_off + off + 1024 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 1024))
+ return -1;
+ off += 1024;
+ len -= 1024;
+ }
+ if (len >= 512 && dst + off + 512 < dend &&
+ data + skb_off + off + 512 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 512))
+ return -1;
+ off += 512;
+ len -= 512;
+ }
+ if (len >= 256 && dst + off + 256 < dend &&
+ data + skb_off + off + 256 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 256))
+ return -1;
+ off += 256;
+ len -= 256;
+ }
+ if (len >= 128 && dst + off + 128 < dend &&
+ data + skb_off + off + 128 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 128))
+ return -1;
+ off += 128;
+ len -= 128;
+ }
+ if (len >= 64 && dst + off + 64 < dend &&
+ data + skb_off + off + 64 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 64))
+ return -1;
+ off += 64;
+ len -= 64;
+ }
+ if (len >= 32 && dst + off + 32 < dend &&
+ data + skb_off + off + 32 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 32))
+ return -1;
+ off += 32;
+ len -= 32;
+ }
+ if (len >= 16 && dst + off + 16 < dend &&
+ data + skb_off + off + 16 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 16))
+ return -1;
+ off += 16;
+ len -= 16;
+ }
+
+ if (len >= 8 && dst + off + 8 < dend &&
+ data + skb_off + off + 8 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 8))
+ return -1;
+ off += 8;
+ len -= 8;
+ }
+
+ if (len >= 4 && dst + off + 4 < dend &&
+ data + skb_off + off + 4 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 4))
+ return -1;
+ off += 4;
+ len -= 4;
+ }
+
+ if (len >= 2 && dst + off + 2 < dend &&
+ data + skb_off + off + 2 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 2))
+ return -1;
+ off += 2;
+ len -= 2;
+ }
+
+ if (len >= 1 && dst + off + 1 < dend &&
+ data + skb_off + off + 1 < data_end) {
+ if (bpf_skb_load_bytes(skb, skb_off + off, dst + off, 1))
+ return -1;
+ off += 1;
+ len -= 1;
+ }
+
+ return len == 0 ? 0 : -1;
+}
+
+static int bmc_copy_to_skb(struct __sk_buff *skb, u32 skb_off,
+ void *dst, void *dend, u32 len)
+{
+ u32 i;
+ u32 off = 0;
+ void *data = (void *)(long)skb->data;
+ void *data_end = (void *)(long)skb->data_end;
+
+ if (len > 2047)
+ return -1;
+
+ if (len >= 1024 && dst + off + 1024 < dend &&
+ data + skb_off + off + 1024 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 1024, 0))
+ return -1;
+ off += 1024;
+ len -= 1024;
+ }
+ if (len >= 512 && dst + off + 512 < dend &&
+ data + skb_off + off + 512 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 512, 0))
+ return -1;
+ off += 512;
+ len -= 512;
+ }
+ if (len >= 256 && dst + off + 256 < dend &&
+ data + skb_off + off + 256 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 256, 0))
+ return -1;
+ off += 256;
+ len -= 256;
+ }
+ if (len >= 128 && dst + off + 128 < dend &&
+ data + skb_off + off + 128 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 128, 0))
+ return -1;
+ off += 128;
+ len -= 128;
+ }
+ if (len >= 64 && dst + off + 64 < dend &&
+ data + skb_off + off + 64 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 64, 0))
+ return -1;
+ off += 64;
+ len -= 64;
+ }
+ if (len >= 32 && dst + off + 32 < dend &&
+ data + skb_off + off + 32 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 32, 0))
+ return -1;
+ off += 32;
+ len -= 32;
+ }
+ if (len >= 16 && dst + off + 16 < dend &&
+ data + skb_off + off + 16 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 16, 0))
+ return -1;
+ off += 16;
+ len -= 16;
+ }
+
+ if (len >= 8 && dst + off + 8 < dend &&
+ data + skb_off + off + 8 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 8, 0))
+ return -1;
+ off += 8;
+ len -= 8;
+ }
+
+ if (len >= 4 && dst + off + 4 < dend &&
+ data + skb_off + off + 4 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 4, 0))
+ return -1;
+ off += 4;
+ len -= 4;
+ }
+
+ if (len >= 2 && dst + off + 2 < dend &&
+ data + skb_off + off + 2 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 2, 0))
+ return -1;
+ off += 2;
+ len -= 2;
+ }
+
+ if (len >= 1 && dst + off + 1 < dend &&
+ data + skb_off + off + 1 < data_end) {
+ if (bpf_skb_store_bytes(skb, skb_off + off, dst + off, 1, 0))
+ return -1;
+ off += 1;
+ len -= 1;
+ }
+
+ return len == 0 ? 0 : -1;
+}
+
+static inline struct redis_ctx *get_ctx(void)
+{
+ u32 cpu = bpf_get_smp_processor_id();
+
+ if (cpu >= BMC_MAX_CPUS)
+ return NULL;
+ return bpf_map_lookup_elem(&ctxmap, &cpu);
+}
+
+static inline struct redis_bmc_stat *get_stat(void)
+{
+ u32 cpu = bpf_get_smp_processor_id();
+
+ if (cpu >= BMC_MAX_CPUS)
+ return NULL;
+ return bpf_map_lookup_elem(&bmc_stats, &cpu);
+}
+
+static int do_redis_get_handler(struct __sk_buff *skb, struct redis_ctx *ctx)
+{
+ int i;
+ u32 n;
+	int err = 0;	/* stays 0 when n == skb->len and no resize is needed */
+ char *p;
+ char *data;
+ char *data_end;
+ char buf[5];
+ struct redis_value *val;
+
+ ctx = get_ctx();
+ if (!ctx)
+ return BPF_OK;
+
+ val = bpf_map_lookup_elem(&bmc_storage, &ctx->key);
+ if (val == NULL || val->len == 0 || val->len > sizeof(val->data))
+ return BPF_OK;
+
+ n = val->len;
+
+ i = 0;
+ while (i < 5) {
+ buf[i] = '0' + n % 10;
+ n = n / 10;
+ i++;
+ if (n == 0)
+ break;
+ }
+
+ if (i >= 5)
+ return BPF_OK;
+
+ /* $ LEN \r \n VALUE \r \n */
+ n = 1 + i + 2 + val->len + 2;
+
+ if (n > skb->len)
+ /* extend head space */
+ err = bpf_skb_change_head(skb, n - skb->len, 0);
+ else if (n < skb->len)
+ /* shrink head space */
+ err = bpf_skb_adjust_room(skb, -(skb->len - n), 0, 0);
+
+ if (err)
+ return BPF_DROP;
+
+ data = (char *)(long)skb->data;
+ data_end = (char *)(long)skb->data_end;
+ p = data;
+ /* 3 is '$' and "\r\n"*/
+ if (p + i + 3 > data_end)
+ return BPF_DROP;
+
+ *p++ = '$';
+ while (p < data_end && --i >= 0)
+ *p++ = buf[i];
+ *p++ = '\r';
+ *p++ = '\n';
+
+ n = val->len;
+ if (n == 0 || n > sizeof(val->data) || p + n + 2 > data_end)
+ return BPF_DROP;
+
+ if (bmc_copy_to_skb(skb, p - data, val->data,
+ val->data + sizeof(val->data), n))
+ return BPF_DROP;
+
+ p += n;
+ char end_mark[] = { '\r', '\n'};
+
+ bpf_skb_store_bytes(skb, p - data, end_mark, sizeof(end_mark), 0);
+
+ return BPF_REDIRECT;
+}
+
+static int do_redis_set_handler(struct __sk_buff *skb, struct redis_ctx *ctx)
+{
+ int err;
+ u32 off = 0;
+ u32 value_len;
+ char *data = (char *)(long)skb->data;
+ char *data_end = (char *)(long)skb->data_end;
+
+ if (data + 1 > data_end || data[0] != '$')
+ return BPF_OK;
+ off++;
+ data++;
+
+ value_len = 0;
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ value_len = value_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ value_len = value_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ value_len = value_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ value_len = value_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+
+ if (data + 2 > data_end || data[0] != '\r' || data[1] != '\n')
+ return BPF_OK;
+ off += 2;
+ data += 2;
+
+ if (data > data_end)
+ return BPF_OK;
+
+ /* format error */
+ if (value_len <= 0 || value_len > sizeof(ctx->value.data) ||
+ value_len >= data_end - data) {
+ return BPF_OK;
+ }
+
+ if (bmc_copy_from_skb(ctx->value.data,
+ ctx->value.data + sizeof(ctx->value.data),
+ skb, off, value_len))
+ return BPF_OK;
+
+ ctx->value.len = value_len;
+
+ if (bpf_map_update_elem(&bmc_storage, &ctx->key, &ctx->value, BPF_ANY)) {
+ bpf_map_delete_elem(&bmc_storage, &ctx->key);
+ return BPF_OK;
+ }
+
+ char reply[] = { '+', 'O', 'K', '\r', '\n'};
+
+ if (skb->len < sizeof(reply))
+ /* extend head space */
+ err = bpf_skb_change_head(skb, sizeof(reply) - skb->len, 0);
+ else
+ /* shrink head space */
+ err = bpf_skb_adjust_room(skb, -(skb->len - sizeof(reply)), 0, 0);
+
+ if (err)
+ return BPF_OK;
+
+ bpf_skb_store_bytes(skb, 0, reply, sizeof(reply), 0);
+
+ return BPF_REDIRECT;
+}
+
+SEC("bmc/redis_get_handler")
+int redis_get_handler(struct __sk_buff *skb)
+{
+ int err;
+ struct redis_bmc_stat *stat;
+ struct redis_ctx *ctx;
+
+ stat = get_stat();
+ if (!stat)
+ return SK_PASS;
+
+ stat->total_get_requests++;
+
+ ctx = get_ctx();
+ if (!ctx)
+ return SK_PASS;
+
+ err = do_redis_get_handler(skb, ctx);
+ if (err == BPF_REDIRECT) {
+ stat->hit_get_requests++;
+ return sock_redirect(skb);
+ }
+
+ if (err == BPF_DROP) {
+ stat->drop_get_requests++;
+ return SK_DROP;
+ }
+
+ return SK_PASS;
+}
+
+SEC("bmc/redis_set_handler")
+int redis_set_handler(struct __sk_buff *skb)
+{
+ int err;
+ struct redis_bmc_stat *stat;
+ struct redis_ctx *ctx;
+
+ stat = get_stat();
+ if (!stat)
+ return SK_PASS;
+
+ stat->total_set_requests++;
+
+ ctx = get_ctx();
+ if (!ctx)
+ return SK_PASS;
+
+ err = do_redis_set_handler(skb, ctx);
+ if (err == BPF_REDIRECT) {
+ stat->hit_set_requests++;
+ return sock_redirect(skb);
+ }
+
+ if (err == BPF_DROP) {
+ stat->drop_set_requests++;
+ return SK_DROP;
+ }
+
+ err = bpf_skb_adjust_room(skb, ctx->offset, 0, 0);
+ if (!err)
+ return SK_PASS;
+
+ stat->drop_set_requests++;
+ return SK_DROP;
+}
+
+static inline int bmc_process(struct __sk_buff *skb)
+{
+ u32 off;
+ int err;
+ u32 key_len;
+ char *data;
+ char *data_end;
+ int expect_get = 0;
+ int is_get = 0;
+ struct redis_ctx *ctx;
+
+ ctx = get_ctx();
+ if (ctx == NULL)
+ return SK_PASS;
+
+ err = bpf_skb_pull_data(skb, skb->len);
+ if (err)
+ return SK_PASS;
+
+ off = 0;
+ data = (char *)(long)skb->data;
+ data_end = (char *)(long)skb->data_end;
+
+ /*
+ * SET message format:
+ * "*3\r\n" // this is an array with 3 elements
+ * "$3\r\n" // the first element is a string with 3 characters
+ * "set\r\n" // the string is "set"
+ * "$5\r\n" // the second element is a string with 5 characters
+ * "key01\r\n" // the string is "key01"
+ * "$5\r\n" // the third element is a string with 5 characters
+	 * "val01\r\n"	// the string is "val01"
+ *
+ * GET message format:
+	 * "*2\r\n"	// this is an array with 2 elements
+ * "$3\r\n" // the first element is a string with 3 characters
+ * "get\r\n" // the string is "get"
+ * "$5\r\n" // the second element is a string with 5 characters
+ * "key01\r\n" // the string is "key01"
+ */
+ if (data + 4 > data_end)
+ return SK_PASS;
+
+ /* Not GET, Not SET */
+ if (data[0] != '*' || (data[1] != '2' && data[1] != '3') ||
+ data[2] != '\r' || data[3] != '\n')
+ return SK_PASS;
+
+ expect_get = (data[1] == '2');
+ off += 4;
+ data += 4;
+
+ if (data + 4 > data_end)
+ return SK_PASS;
+
+ if (data[0] != '$' || data[1] != '3' || data[2] != '\r' ||
+ data[3] != '\n')
+ return SK_PASS;
+
+ off += 4;
+ data += 4;
+
+ if (data + 5 > data_end)
+ return SK_PASS;
+
+ switch (data[0]) {
+	case 'g':
+		is_get = 1;
+		/* fallthrough */
+	case 's':
+ if (data[1] != 'e' || data[2] != 't' ||
+ data[3] != '\r' || data[4] != '\n')
+ return SK_PASS;
+ break;
+	case 'G':
+		is_get = 1;
+		/* fallthrough */
+	case 'S':
+ if (data[1] != 'E' || data[2] != 'T' ||
+ data[3] != '\r' || data[4] != '\n')
+ return SK_PASS;
+ break;
+ default:
+ return SK_PASS;
+ }
+ off += 5;
+ data += 5;
+
+ if (expect_get != is_get)
+ return SK_PASS;
+
+ if (data + 1 > data_end || data[0] != '$')
+ return SK_PASS;
+ off++;
+ data++;
+
+ key_len = 0;
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ key_len = key_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ key_len = key_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ key_len = key_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data < data_end && data[0] >= '0' && data[0] <= '9') {
+ key_len = key_len * 10 + data[0] - '0';
+ off++;
+ data++;
+ }
+ if (data + 2 > data_end || data[0] != '\r' || data[1] != '\n')
+ return SK_PASS;
+ off += 2;
+ data += 2;
+
+ if (data > data_end)
+ return SK_PASS;
+
+ if (key_len == 0 || key_len > sizeof(ctx->key.data) ||
+ key_len >= data_end - data)
+ return SK_PASS;
+
+ ctx->offset = off + key_len + 2;
+ ctx->key.len = key_len;
+
+ if (bmc_copy_from_skb(ctx->key.data,
+ ctx->key.data + sizeof(ctx->key.data),
+ skb, off, key_len))
+ return SK_PASS;
+
+ if (is_get) {
+ bpf_tail_call(skb, &bmc_jump_table, REDIS_GET_PROG_INDEX);
+ } else {
+ err = bpf_skb_adjust_room(skb, -ctx->offset, 0, 0);
+ if (err)
+ return SK_PASS;
+ bpf_tail_call(skb, &bmc_jump_table, REDIS_SET_PROG_INDEX);
+ }
+ return SK_PASS;
+}
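For reference, the exact RESP framing that bmc_process() above checks byte by byte can be reproduced with a short userspace sketch. This is an illustration only, not part of the patch; the helper names are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the RESP messages that bmc_process() expects:
 * GET: *2, $3 "get", $<len> <key>
 * SET: *3, $3 "set", $<len> <key>, $<len> <value>
 */
static int resp_format_get(char *buf, size_t size, const char *key)
{
	return snprintf(buf, size, "*2\r\n$3\r\nget\r\n$%zu\r\n%s\r\n",
			strlen(key), key);
}

static int resp_format_set(char *buf, size_t size,
			   const char *key, const char *val)
{
	return snprintf(buf, size,
			"*3\r\n$3\r\nset\r\n$%zu\r\n%s\r\n$%zu\r\n%s\r\n",
			strlen(key), key, strlen(val), val);
}
```

A message built this way for key "key01" walks through every branch of the parser above: the `*2`/`*3` header, the `$3` command length, the command word, and the `$<len>` key header.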
diff --git a/samples/bpf/bmc/tool.c b/samples/bpf/bmc/tool.c
new file mode 100644
index 000000000000..e45be64a2819
--- /dev/null
+++ b/samples/bpf/bmc/tool.c
@@ -0,0 +1,733 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Huawei Technologies Co., Ltd. 2022-2022. All rights reserved.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <limits.h>
+#include <errno.h>
+
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/select.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <fcntl.h>
+
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+#include "common.h"
+
+#define DEFAULT_CGROUP_PATH "/sys/fs/cgroup"
+#define DEFAULT_REDIS_PORT 6379
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))
+#endif
+
+struct {
+ char *cgroup_path;
+ char *bpf_path;
+ int cgroup_fd;
+ int map_socks_fd;
+ int map_ports_fd;
+ int map_storage_fd;
+ int map_jump_table_fd;
+ int map_stats_fd;
+ int sock_parser_prog_fd;
+ int sock_verdict_prog_fd;
+ int sock_ops_prog_fd;
+ int redis_get_prog_fd;
+ int redis_set_prog_fd;
+ uint16_t listen_port;
+} bmc;
+
+struct bmc_prog_info {
+ const char *sec_name;
+ enum bpf_prog_type prog_type;
+ enum bpf_attach_type attach_type;
+ int *p_prog_fd;
+ int *p_attach_fd;
+ unsigned int attach_flags;
+ const char *pin_path;
+ struct bpf_program *prog;
+};
+
+struct bmc_map_info {
+ const char *map_name;
+ int *p_map_fd;
+ char *pin_path;
+ struct bpf_map *map;
+ bool is_stat_map;
+};
+
+static struct bmc_prog_info prog_infos[] = {
+ {
+ .sec_name = "bmc/sock_parser",
+ .prog_type = BPF_PROG_TYPE_SK_SKB,
+ .attach_type = BPF_SK_SKB_STREAM_PARSER,
+ .p_prog_fd = &bmc.sock_parser_prog_fd,
+ .p_attach_fd = &bmc.map_socks_fd,
+ .attach_flags = 0,
+ .pin_path = "/sys/fs/bpf/bmc/prog_sock_parser"
+ },
+ {
+ .sec_name = "bmc/sock_verdict",
+ .prog_type = BPF_PROG_TYPE_SK_SKB,
+ .attach_type = BPF_SK_SKB_STREAM_VERDICT,
+ .p_prog_fd = &bmc.sock_verdict_prog_fd,
+ .p_attach_fd = &bmc.map_socks_fd,
+ .attach_flags = 0,
+ .pin_path = "/sys/fs/bpf/bmc/prog_sock_verdict"
+ },
+ {
+ .sec_name = "bmc/sock_ops",
+ .prog_type = BPF_PROG_TYPE_SOCK_OPS,
+ .attach_type = BPF_CGROUP_SOCK_OPS,
+ .p_prog_fd = &bmc.sock_ops_prog_fd,
+ .p_attach_fd = &bmc.cgroup_fd,
+ .attach_flags = 0,
+ .pin_path = "/sys/fs/bpf/bmc/prog_sock_ops"
+ },
+ {
+ .sec_name = "bmc/redis_get_handler",
+ .prog_type = BPF_PROG_TYPE_SK_SKB,
+ .p_prog_fd = &bmc.redis_get_prog_fd,
+ .p_attach_fd = NULL,
+ .attach_flags = 0,
+ .pin_path = "/sys/fs/bpf/bmc/prog_redis_get_handler"
+
+ },
+ {
+ .sec_name = "bmc/redis_set_handler",
+ .prog_type = BPF_PROG_TYPE_SK_SKB,
+ .p_prog_fd = &bmc.redis_set_prog_fd,
+ .p_attach_fd = NULL,
+ .attach_flags = 0,
+ .pin_path = "/sys/fs/bpf/bmc/prog_redis_set_handler"
+
+ }
+};
+
+static struct bmc_map_info map_infos[] = {
+ {
+ .map_name = "bmc_socks",
+ .p_map_fd = &bmc.map_socks_fd,
+ .pin_path = "/sys/fs/bpf/bmc/map_socks"
+ },
+ {
+ .map_name = "bmc_ports",
+ .p_map_fd = &bmc.map_ports_fd,
+ .pin_path = "/sys/fs/bpf/bmc/map_ports"
+ },
+ {
+ .map_name = "bmc_storage",
+ .p_map_fd = &bmc.map_storage_fd,
+ .pin_path = "/sys/fs/bpf/bmc/map_storage"
+ },
+ {
+ .map_name = "bmc_jump_table",
+ .p_map_fd = &bmc.map_jump_table_fd,
+ .pin_path = "/sys/fs/bpf/bmc/map_jump_table"
+ },
+ {
+ .map_name = "bmc_stats",
+ .p_map_fd = &bmc.map_stats_fd,
+ .pin_path = "/sys/fs/bpf/bmc/stats",
+ .is_stat_map = true,
+ },
+};
+
+static int find_type_by_sec_name(const char *sec_name,
+ enum bpf_prog_type *p_prog_type,
+ enum bpf_attach_type *p_attach_type)
+{
+ int i;
+
+ if (sec_name == NULL) {
+ fprintf(stderr, "sec_name is NULL\n");
+ return -1;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(prog_infos); i++) {
+ if (!strcmp(prog_infos[i].sec_name, sec_name)) {
+ *p_prog_type = prog_infos[i].prog_type;
+ *p_attach_type = prog_infos[i].attach_type;
+ return 0;
+ }
+ }
+
+ fprintf(stderr, "unknown prog %s\n", sec_name);
+
+ return -1;
+}
+
+static int set_prog_type(struct bpf_object *obj)
+{
+ const char *sec_name;
+ struct bpf_program *prog;
+ enum bpf_prog_type prog_type;
+ enum bpf_attach_type attach_type;
+
+ bpf_object__for_each_program(prog, obj) {
+ sec_name = bpf_program__section_name(prog);
+ if (find_type_by_sec_name(sec_name, &prog_type, &attach_type))
+ return -1;
+ bpf_program__set_type(prog, prog_type);
+ bpf_program__set_expected_attach_type(prog, attach_type);
+ }
+
+ return 0;
+}
+
+static struct bpf_object *load_bpf_file(const char *bpf_file)
+{
+ int err;
+ char err_buf[256];
+ struct bpf_object *obj;
+
+ obj = bpf_object__open(bpf_file);
+ err = libbpf_get_error(obj);
+ if (err) {
+ libbpf_strerror(err, err_buf, sizeof(err_buf));
+ fprintf(stderr, "unable to open bpf file %s : %s\n", bpf_file,
+ err_buf);
+ return NULL;
+ }
+
+ if (set_prog_type(obj)) {
+ bpf_object__close(obj);
+ return NULL;
+ }
+
+ err = bpf_object__load(obj);
+ if (err) {
+ fprintf(stderr, "load bpf object failed\n");
+ bpf_object__close(obj);
+ return NULL;
+ }
+
+ return obj;
+}
+
+static int find_prog(struct bpf_object *obj, const char *sec_name,
+ struct bpf_program **p_prog, int *p_prog_fd)
+{
+ int fd;
+ struct bpf_program *prog;
+
+ prog = bpf_object__find_program_by_title(obj, sec_name);
+ if (!prog) {
+ fprintf(stderr, "failed to find prog %s\n", sec_name);
+ return -1;
+ }
+
+ fd = bpf_program__fd(prog);
+ if (fd < 0) {
+ fprintf(stderr, "failed to get fd of prog %s\n", sec_name);
+ return -1;
+ }
+
+ *p_prog = prog;
+ *p_prog_fd = fd;
+
+ return 0;
+}
+
+static void unpin_progs(int n)
+{
+ int i;
+
+ for (i = 0; i < n; i++)
+ bpf_program__unpin(prog_infos[i].prog, prog_infos[i].pin_path);
+}
+
+static int find_progs(struct bpf_object *obj)
+{
+ int i;
+ struct bmc_prog_info *info;
+
+ for (i = 0; i < ARRAY_SIZE(prog_infos); i++) {
+ info = &prog_infos[i];
+
+ if (find_prog(obj, info->sec_name, &info->prog, info->p_prog_fd))
+ goto error_find_prog;
+
+ if (bpf_program__pin(info->prog, info->pin_path))
+ goto error_find_prog;
+ }
+
+ return 0;
+
+error_find_prog:
+ unpin_progs(i);
+ return -1;
+}
+
+static int find_map(struct bpf_object *obj, const char *map_name,
+ struct bpf_map **p_map, int *p_map_fd)
+{
+ int fd;
+ struct bpf_map *map;
+
+ map = bpf_object__find_map_by_name(obj, map_name);
+ if (!map) {
+ fprintf(stderr, "failed to find map %s\n", map_name);
+ return -1;
+ }
+
+ fd = bpf_map__fd(map);
+ if (fd < 0) {
+ fprintf(stderr, "failed to get fd of map %s\n", map_name);
+ return -1;
+ }
+
+ *p_map = map;
+ *p_map_fd = fd;
+
+ return 0;
+}
+
+static void unpin_maps(int n)
+{
+ int i;
+
+ for (i = 0; i < n; i++)
+ bpf_map__unpin(map_infos[i].map, map_infos[i].pin_path);
+}
+
+static int find_maps(struct bpf_object *obj)
+{
+ int i;
+ struct bmc_map_info *info;
+
+ for (i = 0; i < ARRAY_SIZE(map_infos); i++) {
+ info = &map_infos[i];
+
+ if (find_map(obj, info->map_name, &info->map, info->p_map_fd))
+ goto error_find_map;
+
+ if (bpf_map__pin(info->map, info->pin_path)) {
+ fprintf(stderr, "failed to pin map %s to path %s\n",
+ info->map_name, info->pin_path);
+ goto error_find_map;
+ }
+ }
+
+ return 0;
+
+error_find_map:
+ unpin_maps(i);
+ return -1;
+}
+
+static void detach_progs(int n)
+{
+ int i;
+ struct bmc_prog_info *info;
+
+ for (i = 0; i < n; i++) {
+ info = &prog_infos[i];
+ bpf_prog_detach(*info->p_prog_fd, info->attach_type);
+ }
+}
+
+static int attach_progs(struct bpf_object *obj)
+{
+ int i;
+ int prog_fd;
+ int attach_fd;
+ unsigned int flags;
+ enum bpf_attach_type type;
+ struct bmc_prog_info *info;
+
+ for (i = 0; i < ARRAY_SIZE(prog_infos); i++) {
+ info = &prog_infos[i];
+ if (!info->p_attach_fd)
+ continue;
+ prog_fd = *info->p_prog_fd;
+ attach_fd = *info->p_attach_fd;
+ type = info->attach_type;
+ flags = info->attach_flags;
+
+ if (bpf_prog_attach(prog_fd, attach_fd, type, flags)) {
+ fprintf(stderr, "attach prog %s failed!\n",
+ info->sec_name);
+ goto error_attach_prog;
+ }
+ }
+
+ return 0;
+
+error_attach_prog:
+ detach_progs(i);
+
+ return -1;
+}
+
+static int add_bmc_port(void)
+{
+ int ret;
+ int map_fd = bmc.map_ports_fd;
+ uint16_t port = htons(bmc.listen_port);
+ uint32_t key = (uint32_t)port;
+ uint32_t value = 1;
+
+ ret = bpf_map_update_elem(map_fd, &key, &value, 0);
+ if (ret)
+		fprintf(stderr, "failed to add port %u\n", bmc.listen_port);
+
+ return ret;
+}
+
+static int add_tail_call(void)
+{
+ int ret;
+ int map_fd = bmc.map_jump_table_fd;
+ __u32 key;
+ __u32 value;
+
+ key = REDIS_GET_PROG_INDEX;
+ value = bmc.redis_get_prog_fd;
+ ret = bpf_map_update_elem(map_fd, &key, &value, 0);
+ if (ret) {
+ fprintf(stderr, "failed to add redis get tail call prog\n");
+ return -1;
+ }
+
+ key = REDIS_SET_PROG_INDEX;
+ value = bmc.redis_set_prog_fd;
+ ret = bpf_map_update_elem(map_fd, &key, &value, 0);
+ if (ret) {
+ fprintf(stderr, "failed to add redis set tail call prog\n");
+ key = REDIS_GET_PROG_INDEX;
+ bpf_map_delete_elem(map_fd, &key);
+ }
+
+ return ret;
+}
+
+static int setup_bpf(void)
+{
+ struct bpf_object *obj;
+
+	bmc.cgroup_fd = open(bmc.cgroup_path, O_DIRECTORY | O_RDONLY);
+ if (bmc.cgroup_fd < 0) {
+ fprintf(stderr, "failed to open cgroup %s: %s\n",
+ bmc.cgroup_path, strerror(errno));
+ return -1;
+ }
+
+ obj = load_bpf_file(bmc.bpf_path);
+ if (!obj)
+ goto error_load_object;
+
+ if (find_progs(obj))
+ goto error_load_object;
+
+ if (find_maps(obj))
+ goto error_find_maps;
+
+ if (attach_progs(obj))
+ goto error_attach_progs;
+
+ if (add_bmc_port())
+ goto error_add_port;
+
+ if (add_tail_call())
+ goto error_attach_progs;
+
+ return 0;
+
+error_add_port:
+ detach_progs(ARRAY_SIZE(prog_infos));
+error_attach_progs:
+ unpin_maps(ARRAY_SIZE(map_infos));
+error_find_maps:
+ unpin_progs(ARRAY_SIZE(prog_infos));
+error_load_object:
+ bpf_object__close(obj);
+ close(bmc.cgroup_fd);
+ return -1;
+}
+
+static int parse_load_args(int argc, char *argv[])
+{
+ int opt;
+ int port;
+
+ bmc.cgroup_path = DEFAULT_CGROUP_PATH;
+ bmc.listen_port = DEFAULT_REDIS_PORT;
+
+ while ((opt = getopt(argc, argv, "c:p:")) != -1) {
+ switch (opt) {
+ case 'c':
+ bmc.cgroup_path = optarg;
+ break;
+ case 'p':
+ port = atoi(optarg);
+			if (port <= 0 || port > USHRT_MAX) {
+ fprintf(stderr, "invalid port: %s\n", optarg);
+ return -1;
+ }
+ bmc.listen_port = port;
+ break;
+ default:
+ fprintf(stderr, "unknown option %c\n", opt);
+ return -1;
+ }
+ }
+
+ if (optind >= argc) {
+ fprintf(stderr, "no bpf prog file found\n");
+ return -1;
+ }
+
+ bmc.bpf_path = argv[optind];
+
+ printf("bpf file: %s\n", bmc.bpf_path);
+ printf("cgroup path: %s\n", bmc.cgroup_path);
+ printf("listen port: %d\n", bmc.listen_port);
+
+ return 0;
+}
+
+struct cmd {
+ const char *name;
+ int (*func)(int argc, char *argv[]);
+};
+
+static int do_prog(int argc, char *argv[]);
+static int do_stat(int argc, char *argv[]);
+
+static int do_prog_load(int argc, char *argv[]);
+static int do_prog_unload(int argc, char *argv[]);
+
+static struct cmd main_cmds[] = {
+ { "prog", do_prog },
+ { "stat", do_stat },
+};
+
+static struct cmd prog_cmds[] = {
+ { "load", do_prog_load },
+ { "unload", do_prog_unload },
+};
+
+static char *elf_name;
+
+static int dispatch_cmd(struct cmd cmds[], int ncmd, int argc,
+ char *argv[], void (*help)(void))
+{
+ int i;
+ int ret;
+
+ if (argc <= 0) {
+ help();
+ return -1;
+ }
+
+ for (i = 0; i < ncmd; i++) {
+ if (!strcmp(argv[0], cmds[i].name)) {
+ ret = cmds[i].func(argc - 1, argv + 1);
+ if (ret == -2) {
+ help();
+ ret = -1;
+ }
+ return ret;
+ }
+ }
+
+ help();
+
+ return -1;
+}
+
+static int do_prog_load(int argc, char *argv[])
+{
+ if (parse_load_args(argc + 1, argv - 1) < 0)
+ return -2;
+
+ if (setup_bpf())
+ return -1;
+
+ return 0;
+}
+
+static int do_prog_unload(int argc, char *argv[])
+{
+ int i;
+ int prog_fd;
+ int cgroup_fd;
+ char *cgroup_path = DEFAULT_CGROUP_PATH;
+
+	if (argc >= 1)
+ cgroup_path = argv[0];
+
+	cgroup_fd = open(cgroup_path, O_DIRECTORY | O_RDONLY);
+ if (cgroup_fd < 0) {
+ fprintf(stderr, "failed to open cgroup path: %s\n",
+ cgroup_path);
+ return -1;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(prog_infos); i++) {
+ if (prog_infos[i].attach_type == BPF_CGROUP_SOCK_OPS) {
+ prog_fd = bpf_obj_get(prog_infos[i].pin_path);
+ if (prog_fd >= 0)
+ bpf_prog_detach2(prog_fd, cgroup_fd,
+ BPF_CGROUP_SOCK_OPS);
+ }
+ unlink(prog_infos[i].pin_path);
+ }
+
+ for (i = 0; i < ARRAY_SIZE(map_infos); i++)
+ unlink(map_infos[i].pin_path);
+
+ return 0;
+}
+
+static void do_prog_help(void)
+{
+ fprintf(stderr,
+ "Usage: %s prog load [-c CGROUP_PATH] [-p LISTEN_PORT] {BPF_FILE}\n"
+ " %s prog unload [CGROUP_PATH]\n",
+ elf_name, elf_name);
+}
+
+static int do_prog(int argc, char *argv[])
+{
+ return dispatch_cmd(prog_cmds, ARRAY_SIZE(prog_cmds),
+ argc, argv, do_prog_help);
+}
+
+static int do_stat(int argc, char *argv[])
+{
+ int i;
+ int fd;
+ int err;
+ int ncpu;
+ bool found = false;
+ struct bmc_map_info *info;
+ struct bpf_map_info map = {};
+ struct redis_bmc_stat stat = {};
+ __u32 len = sizeof(map);
+
+ ncpu = sysconf(_SC_NPROCESSORS_ONLN);
+ if (ncpu < 0) {
+ fprintf(stderr, "sysconf failed: %s\n", strerror(errno));
+ return -1;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(map_infos); i++) {
+ info = &map_infos[i];
+ if (info->is_stat_map) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ fprintf(stderr, "no stats map found\n");
+ return -1;
+ }
+
+ fd = bpf_obj_get(info->pin_path);
+ if (fd < 0) {
+ fprintf(stderr, "failed to open %s\n",
+ info->pin_path);
+ return -1;
+ }
+
+ err = bpf_obj_get_info_by_fd(fd, &map, &len);
+ if (err) {
+ fprintf(stderr, "failed to get map info\n");
+ goto error;
+ }
+
+ if (map.type != BPF_MAP_TYPE_ARRAY) {
+ fprintf(stderr, "unexpected map type: %d\n", map.type);
+ goto error;
+ }
+
+ if (map.key_size != sizeof(__u32)) {
+ fprintf(stderr, "unexpected map key_size: %u\n", map.key_size);
+ goto error;
+ }
+
+ if (map.value_size != sizeof(struct redis_bmc_stat)) {
+		fprintf(stderr, "unexpected map value_size: %u\n",
+			map.value_size);
+ goto error;
+ }
+
+ for (int i = 0; i < ncpu; i++) {
+ __u32 key = i;
+ struct redis_bmc_stat value;
+
+ err = bpf_map_lookup_elem(fd, &key, &value);
+ if (err) {
+ fprintf(stderr, "lookup cpu stat failed, cpu=%u\n", i);
+ goto error;
+ }
+ stat.total_get_requests += value.total_get_requests;
+ stat.hit_get_requests += value.hit_get_requests;
+ stat.drop_get_requests += value.drop_get_requests;
+ stat.total_set_requests += value.total_set_requests;
+ stat.hit_set_requests += value.hit_set_requests;
+ stat.drop_set_requests += value.drop_set_requests;
+ }
+
+ printf("Total GET Requests: %llu\n", stat.total_get_requests);
+ printf("Hit GET Requests: %llu (%.2f%%)\n", stat.hit_get_requests,
+ stat.total_get_requests == 0 ? 0 :
+ (double)stat.hit_get_requests /
+ (double)stat.total_get_requests *
+ 100);
+ printf("Dropped GET Requests: %llu (%.2lf%%)\n", stat.drop_get_requests,
+ stat.total_get_requests == 0 ? 0 :
+ (double)stat.drop_get_requests /
+ (double)stat.total_get_requests *
+ 100);
+
+ printf("Total SET Requests: %llu\n", stat.total_set_requests);
+ printf("Hit SET Requests: %llu (%.2f%%)\n", stat.hit_set_requests,
+ stat.total_set_requests == 0 ? 0 :
+ (double)stat.hit_set_requests /
+ (double)stat.total_set_requests *
+ 100);
+ printf("Dropped SET Requests: %llu (%.2lf%%)\n", stat.drop_set_requests,
+ stat.total_set_requests == 0 ? 0 :
+ (double)stat.drop_set_requests /
+ (double)stat.total_set_requests *
+ 100);
+
+ close(fd);
+
+ return 0;
+
+error:
+ close(fd);
+ return -1;
+}
+
+static void do_main_help(void)
+{
+ fprintf(stderr,
+ "Usage: %s OBJECT { COMMAND | help }\n"
+ " OBJECT := { prog | stat }\n",
+ elf_name);
+}
+
+int main(int argc, char *argv[])
+{
+ elf_name = argv[0];
+
+ return dispatch_cmd(main_cmds, ARRAY_SIZE(main_cmds),
+ argc - 1, argv + 1, do_main_help);
+}
--
2.20.1
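A side note on the chunked copies in bmc_copy_from_skb()/bmc_copy_to_skb() in the patch above: they decompose an arbitrary length below 2048 into a fixed cascade of power-of-two chunks (1024 down to 1), because the BPF verifier only accepts constant sizes for bpf_skb_load_bytes()/bpf_skb_store_bytes(). The decomposition can be modeled in plain userspace C (illustration only, hypothetical helper name):

```c
#include <assert.h>

/* Count how many constant-size chunks the cascade in the patch uses
 * for a given length. Each chunk size is taken at most once, so the
 * count equals the number of set bits in len; any len < 2048 is
 * covered completely. */
static int bmc_chunk_count(unsigned int len)
{
	static const unsigned int chunks[] = {
		1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1
	};
	unsigned int i;
	int n = 0;

	if (len > 2047)
		return -1;	/* mirrors the len > 2047 guard in the patch */

	for (i = 0; i < sizeof(chunks) / sizeof(chunks[0]); i++) {
		if (len >= chunks[i]) {
			len -= chunks[i];
			n++;
		}
	}
	return len == 0 ? n : -1;
}
```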
[PATCH openEuler-1.0-LTS 1/2] mm/slub: add missing TID updates on slab deactivation
by Yongqiang Liu 12 Aug '22
From: Jann Horn <jannh(a)google.com>
stable inclusion
from stable-4.19.252
commit e2b2f0e2e34d71ae6c2a1114fd3c525930e84bc7
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5LJH1
CVE: NA
--------------------------------
commit eeaa345e128515135ccb864c04482180c08e3259 upstream.
The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
the TID stays the same. However, two places in __slab_alloc() currently
don't update the TID when deactivating the CPU slab.
If multiple operations race the right way, this could lead to an object
getting lost; or, in an even more unlikely situation, it could even lead to
an object being freed onto the wrong slab's freelist, messing up the
`inuse` counter and eventually causing a page to be freed to the page
allocator while it still contains slab objects.
(I haven't actually tested these cases though, this is just based on
looking at the code. Writing testcases for this stuff seems like it'd be
a pain...)
The race leading to state inconsistency is (all operations on the same CPU
and kmem_cache):
- task A: begin do_slab_free():
- read TID
- read pcpu freelist (==NULL)
- check `slab == c->slab` (true)
- [PREEMPT A->B]
- task B: begin slab_alloc_node():
- fastpath fails (`c->freelist` is NULL)
- enter __slab_alloc()
- slub_get_cpu_ptr() (disables preemption)
- enter ___slab_alloc()
- take local_lock_irqsave()
- read c->freelist as NULL
- get_freelist() returns NULL
- write `c->slab = NULL`
- drop local_unlock_irqrestore()
- goto new_slab
- slub_percpu_partial() is NULL
- get_partial() returns NULL
- slub_put_cpu_ptr() (enables preemption)
- [PREEMPT B->A]
- task A: finish do_slab_free():
- this_cpu_cmpxchg_double() succeeds()
- [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]
From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label,
set c->slab, then jump to load_freelist, which clobbers c->freelist.
But if we instead continue as follows, we get worse corruption:
- task A: run __slab_free() on object from other struct slab:
- CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
- task A: run slab_alloc_node() with NUMA node constraint:
- fastpath fails (c->slab is NULL)
- call __slab_alloc()
- slub_get_cpu_ptr() (disables preemption)
- enter ___slab_alloc()
- c->slab is NULL: goto new_slab
- slub_percpu_partial() is non-NULL
- set c->slab to slub_percpu_partial(c)
- [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
from slab-2]
- goto redo
- node_match() fails
- goto deactivate_slab
- existing c->freelist is passed into deactivate_slab()
- inuse count of slab-1 is decremented to account for object from
slab-2
At this point, the inuse count of slab-1 is 1 lower than it should be.
This means that if we free all allocated objects in slab-1 except for one,
SLUB will think that slab-1 is completely unused, and may free its page,
leading to use-after-free.
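The invariant being restored can be shown with a toy userspace model (not kernel code; the struct and helpers below are simplified stand-ins): do_slab_free() commits with a cmpxchg-double over (freelist, tid), so any deactivation that bumps the TID makes a stale in-flight commit fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified per-CPU slab state: just the two words the fastpath
 * cmpxchg-double covers. */
struct cpu_slab {
	void *freelist;
	uint64_t tid;
};

/* Simulated this_cpu_cmpxchg_double(): succeeds only if BOTH the
 * freelist and the TID still match the caller's snapshot. */
static int cmpxchg_double(struct cpu_slab *c,
			  void *old_fl, uint64_t old_tid,
			  void *new_fl, uint64_t new_tid)
{
	if (c->freelist != old_fl || c->tid != old_tid)
		return 0;
	c->freelist = new_fl;
	c->tid = new_tid;
	return 1;
}

/* Deactivation path with this fix applied: clearing the CPU slab
 * state also advances the TID (next_tid()). */
static void deactivate(struct cpu_slab *c)
{
	c->freelist = NULL;
	c->tid += 1;
}
```

Without the `c->tid = next_tid(c->tid)` in deactivate(), the stale commit in the race above would succeed against the cleared state, which is exactly the corruption described.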
Fixes: c17dda40a6a4e ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
Fixes: 03e404af26dc2 ("slub: fast release on full slab")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Acked-by: Christoph Lameter <cl(a)linux.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo(a)gmail.com>
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
mm/slub.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 7b5630ca9274..4bc29bcd0d5d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2168,6 +2168,7 @@ static void deactivate_slab(struct kmem_cache *s, struct page *page,
c->page = NULL;
c->freelist = NULL;
+ c->tid = next_tid(c->tid);
}
/*
@@ -2301,8 +2302,6 @@ static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
{
stat(s, CPUSLAB_FLUSH);
deactivate_slab(s, c->page, c->freelist, c);
-
- c->tid = next_tid(c->tid);
}
/*
@@ -2589,6 +2588,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
if (!freelist) {
c->page = NULL;
+ c->tid = next_tid(c->tid);
stat(s, DEACTIVATE_BYPASS);
goto new_slab;
}
--
2.25.1
Hello!
Kernel SIG invites you to the Zoom meeting (auto-recorded) to be held at 2022-08-12 14:00.
Subject: openEuler Kernel SIG regular meeting
Agenda:
1. Introduction to the new PMU features of the Intel Sapphire Rapids platform
2. Review of the BPF in-kernel caching feature for accelerating Redis
3. Review of the BPF CO-RE (Compile Once - Run Everywhere) feature
Meeting link: https://us06web.zoom.us/j/89836175849?pwd=ODlUNVhldkdndnN0b21VRUIxNkg0dz09
Meeting minutes: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change your participant name after joining the meeting, or to use your ID from gitee.com.
More information: https://openeuler.org/en/
10 Aug '22
From: Juergen Gross <jgross(a)suse.com>
stable inclusion
from stable-v5.10.132
commit 136d7987fcfdeca73ee3c6a29e48f99fdd0f4d87
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5JTYM
CVE: CVE-2022-36123
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 38fa5479b41376dc9d7f57e71c83514285a25ca0 ]
The .brk section has the same properties as .bss: it is an alloc-only
section and should be cleared before being used.
Not doing so is especially a problem for Xen PV guests, as the
hypervisor will validate page tables (check for writable page tables
and hypervisor private bits) before accepting them to be used.
Make sure .brk is initially zero by letting clear_bss() clear the brk
area, too.
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Link: https://lore.kernel.org/r/20220630071441.28576-3-jgross@suse.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: GONG, Ruiqi <gongruiqi1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/x86/kernel/head64.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 05e117137b45..efe13ab366f4 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -419,6 +419,8 @@ static void __init clear_bss(void)
{
memset(__bss_start, 0,
(unsigned long) __bss_stop - (unsigned long) __bss_start);
+ memset(__brk_base, 0,
+ (unsigned long) __brk_limit - (unsigned long) __brk_base);
}
static unsigned long get_cmd_line_ptr(void)
--
2.20.1
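The effect of the hunk above can be modeled in userspace (illustration only; the arrays below are stand-ins for the linker-provided `__bss_start`/`__bss_stop` and `__brk_base`/`__brk_limit` symbols):

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the .bss and .brk ranges. */
static unsigned char fake_bss[16];
static unsigned char fake_brk[8];

/* Model of clear_bss() after the fix: the .brk range is zeroed
 * alongside .bss, so page tables later allocated from .brk start
 * out clean (which Xen PV validates before accepting them). */
static void model_clear_bss(void)
{
	memset(fake_bss, 0, sizeof(fake_bss));
	memset(fake_brk, 0, sizeof(fake_brk));	/* the added hunk */
}
```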