Kernel
Threads by month
- ----- 2025 -----
- April
- March
- February
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
October 2021
- 26 participants
- 189 discussions

Re: [PATCH openEuler-21.03] dm verity: fix require_signatures module_param permissions
by chengjian (D) 26 Oct '21
by chengjian (D) 26 Oct '21
26 Oct '21
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
在 2021/10/23 22:40, huangzhuoli 写道:
> From: John Keeping <john(a)metanate.com>
>
> stable inclusion
> from stable-v5.10.44
> commit 90547d5db50bcb2705709e420e0af51535109113
> bugzilla:https://bugzilla.openeuler.org/show_bug.cgi?id=426
> CVE: NA
>
> -------------------------------------------------
>
> [ Upstream commit 0c1f3193b1cdd21e7182f97dc9bca7d284d18a15 ]
>
> The third parameter of module_param() is permissions for the sysfs node
> but it looks like it is being used as the initial value of the parameter
> here. In fact, false here equates to omitting the file from sysfs and
> does not affect the value of require_signatures.
>
> Making the parameter writable is not simple because going from
> false->true is fine but it should not be possible to remove the
> requirement to verify a signature. But it can be useful to inspect the
> value of this parameter from userspace, so change the permissions to
> make a read-only file in sysfs.
>
> Signed-off-by: John Keeping <john(a)metanate.com>
> Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> Signed-off-by: huangzhuoli <bioagr_huangzl(a)163.com>
> ---
> drivers/md/dm-verity-verify-sig.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-verity-verify-sig.c b/drivers/md/dm-verity-verify-sig.c
> index 614e43db93aa..919154ae4cae 100644
> --- a/drivers/md/dm-verity-verify-sig.c
> +++ b/drivers/md/dm-verity-verify-sig.c
> @@ -15,7 +15,7 @@
> #define DM_VERITY_VERIFY_ERR(s) DM_VERITY_ROOT_HASH_VERIFICATION " " s
>
> static bool require_signatures;
> -module_param(require_signatures, bool, false);
> +module_param(require_signatures, bool, 0444);
> MODULE_PARM_DESC(require_signatures,
> "Verify the roothash of dm-verity hash tree");
>
1
0

Re: [PATCH openEuler-21.03] dm verity: fix require_signatures module_param permissions
by chengjian (D) 26 Oct '21
by chengjian (D) 26 Oct '21
26 Oct '21
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
在 2021/10/23 22:40, huangzhuoli 写道:
> From: John Keeping <john(a)metanate.com>
>
> stable inclusion
> from stable-v5.10.44
> commit 90547d5db50bcb2705709e420e0af51535109113
> bugzilla:https://bugzilla.openeuler.org/show_bug.cgi?id=426
> CVE: NA
>
> -------------------------------------------------
>
> [ Upstream commit 0c1f3193b1cdd21e7182f97dc9bca7d284d18a15 ]
>
> The third parameter of module_param() is permissions for the sysfs node
> but it looks like it is being used as the initial value of the parameter
> here. In fact, false here equates to omitting the file from sysfs and
> does not affect the value of require_signatures.
>
> Making the parameter writable is not simple because going from
> false->true is fine but it should not be possible to remove the
> requirement to verify a signature. But it can be useful to inspect the
> value of this parameter from userspace, so change the permissions to
> make a read-only file in sysfs.
>
> Signed-off-by: John Keeping <john(a)metanate.com>
> Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> Signed-off-by: huangzhuoli <bioagr_huangzl(a)163.com>
> ---
> drivers/md/dm-verity-verify-sig.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-verity-verify-sig.c b/drivers/md/dm-verity-verify-sig.c
> index 614e43db93aa..919154ae4cae 100644
> --- a/drivers/md/dm-verity-verify-sig.c
> +++ b/drivers/md/dm-verity-verify-sig.c
> @@ -15,7 +15,7 @@
> #define DM_VERITY_VERIFY_ERR(s) DM_VERITY_ROOT_HASH_VERIFICATION " " s
>
> static bool require_signatures;
> -module_param(require_signatures, bool, false);
> +module_param(require_signatures, bool, 0444);
> MODULE_PARM_DESC(require_signatures,
> "Verify the roothash of dm-verity hash tree");
>
1
0

26 Oct '21
您好,感谢参与 openEuler kernel 开发
您的 signed-off-by 没有加。
另外您的邮箱用户名可能不正确:显示为 yourname。
在 2021/10/23 23:05, yourname 写道:
> From: Takashi Iwai <tiwai(a)suse.de>
>
> stable inclusion
> from stable-v5.10.44
> commit bd7d88b0874f82f7b29d1a53e574cedaf23166ba
> bugzilla:https://bugzilla.openeuler.org/show_bug.cgi?id=445
> CVE: NA
>
> -------------------------------------------------
>
> commit 83e197a8414c0ba545e7e3916ce05f836f349273 upstream.
>
> The timer instance per queue is exclusive, and snd_seq_timer_open()
> should have managed the concurrent accesses. It looks as if it's
> checking the already existing timer instance at the beginning, but
> it's not right, because there is no protection, hence any later
> concurrent call of snd_seq_timer_open() may override the timer
> instance easily. This may result in UAF, as the leftover timer
> instance can keep running while the queue itself gets closed, as
> spotted by syzkaller recently.
>
> For avoiding the race, add a proper check at the assignment of
> tmr->timeri again, and return -EBUSY if it's been already registered.
>
> Reported-by: syzbot+ddc1260a83ed1cbf6fb5(a)syzkaller.appspotmail.com
> Cc: <stable(a)vger.kernel.org>
> Link: https://lore.kernel.org/r/000000000000dce34f05c42f110c@google.com
> Link: https://lore.kernel.org/r/20210610152059.24633-1-tiwai@suse.de
> Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> Signed-off-by: yourname <625267121(a)qq.com>
> ---
> sound/core/seq/seq_timer.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/sound/core/seq/seq_timer.c b/sound/core/seq/seq_timer.c
> index 1645e4142e30..9863be6fd43e 100644
> --- a/sound/core/seq/seq_timer.c
> +++ b/sound/core/seq/seq_timer.c
> @@ -297,8 +297,16 @@ int snd_seq_timer_open(struct snd_seq_queue *q)
> return err;
> }
> spin_lock_irq(&tmr->lock);
> - tmr->timeri = t;
> + if (tmr->timeri)
> + err = -EBUSY;
> + else
> + tmr->timeri = t;
> spin_unlock_irq(&tmr->lock);
> + if (err < 0) {
> + snd_timer_close(t);
> + snd_timer_instance_free(t);
> + return err;
> + }
> return 0;
> }
>
1
0

26 Oct '21
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
在 2021/10/23 23:05, yourname 写道:
> From: Dan Carpenter <dan.carpenter(a)oracle.com>
>
> stable inclusion
> from stable-v5.10.44
> commit be23c4af3d8a1b986fe9b43b8966797653a76ca4
> bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=341
> CVE: NA
>
> --------------------------------
>
> [ Upstream commit 1dde47a66d4fb181830d6fa000e5ea86907b639e ]
>
> We spotted a bug recently during a review where a driver was
> unregistering a bus that wasn't registered, which would trigger this
> BUG_ON(). Let's handle that situation more gracefully, and just print
> a warning and return.
>
> Reported-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
> Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
> Reviewed-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
> Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> Signed-off-by: wangqing <wangqing(a)uniontech.com>
> Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
> Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
您好,感谢参与 openEuler kernel 开发
您的 signed-off-by 没有加。
另外您的邮箱用户名可能不正确:显示为 yourname。
> drivers/net/phy/mdio_bus.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index 757e950fb745..b848439fa837 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -608,7 +608,8 @@ void mdiobus_unregister(struct mii_bus *bus)
> struct mdio_device *mdiodev;
> int i;
>
> - BUG_ON(bus->state != MDIOBUS_REGISTERED);
> + if (WARN_ON_ONCE(bus->state != MDIOBUS_REGISTERED))
> + return;
> bus->state = MDIOBUS_UNREGISTERED;
>
> for (i = 0; i < PHY_MAX_ADDR; i++) {
1
0

[PATCH openEuler-21.03] usb: gadget: f_fs: Ensure io_completion_wq is idle during unbind
by liuhao 26 Oct '21
by liuhao 26 Oct '21
26 Oct '21
From: Wesley Cheng <wcheng(a)codeaurora.org>
stable inclusion
from stable-v5.10.44
commit 5cead896962d8b25dee8a8efc85b076572732b86
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=406
CVE: NA
-------------------------------------------------
commit 6fc1db5e6211e30fbb1cee8d7925d79d4ed2ae14 upstream.
During unbind, ffs_func_eps_disable() will be executed, resulting in
completion callbacks for any pending USB requests. When using AIO,
irrespective of the completion status, io_data work is queued to
io_completion_wq to evaluate and handle the completed requests. Since
work runs asynchronously to the unbind() routine, there can be a
scenario where the work runs after the USB gadget has been fully
removed, resulting in accessing of a resource which has been already
freed. (i.e. usb_ep_free_request() accessing the USB ep structure)
Explicitly drain the io_completion_wq, instead of relying on the
destroy_workqueue() (in ffs_data_put()) to make sure no pending
completion work items are running.
Signed-off-by: Wesley Cheng <wcheng(a)codeaurora.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/1621644261-1236-1-git-send-email-wcheng@codeauror…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: liuhao <qq1107732331(a)qq.com>
---
drivers/usb/gadget/function/f_fs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index ffe67d836b0c..7df180b110af 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -3566,6 +3566,9 @@ static void ffs_func_unbind(struct usb_configuration *c,
ffs->func = NULL;
}
+ /* Drain any pending AIO completions */
+ drain_workqueue(ffs->io_completion_wq);
+
if (!--opts->refcnt)
functionfs_unbind(ffs);
--
2.23.0
2
1

Re: [PATCH openEuler-21.03] NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error.
by chengjian (D) 26 Oct '21
by chengjian (D) 26 Oct '21
26 Oct '21
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
在 2021/10/24 16:58, wangyongpan 写道:
> From: Dai Ngo <dai.ngo(a)oracle.com>
>
> stable inclusion
> from stable-v5.10.44
> commit 6e13b9bc66f0e34238aa7b9486a0575177fb7955
> bugzilla:https://bugzilla.openeuler.org/show_bug.cgi?id=414
> CVE: NA
>
> -------------------------------------------------
>
> commit f8849e206ef52b584cd9227255f4724f0cc900bb upstream.
>
> Currently if __nfs4_proc_set_acl fails with NFS4ERR_BADOWNER it
> re-enables the idmapper by clearing NFS_CAP_UIDGID_NOMAP before
> retrying again. The NFS_CAP_UIDGID_NOMAP remains cleared even if
> the retry fails. This causes problem for subsequent setattr
> requests for v4 server that does not have idmapping configured.
>
> This patch modifies nfs4_proc_set_acl to detect NFS4ERR_BADOWNER
> and NFS4ERR_BADNAME and skips the retry, since the kernel isn't
> involved in encoding the ACEs, and return -EINVAL.
>
> Steps to reproduce the problem:
>
> # mount -o vers=4.1,sec=sys server:/export/test /tmp/mnt
> # touch /tmp/mnt/file1
> # chown 99 /tmp/mnt/file1
> # nfs4_setfacl -a A::unknown.user@xyz.com:wrtncy /tmp/mnt/file1
> Failed setxattr operation: Invalid argument
> # chown 99 /tmp/mnt/file1
> chown: changing ownership of ‘/tmp/mnt/file1’: Invalid argument
> # umount /tmp/mnt
> # mount -o vers=4.1,sec=sys server:/export/test /tmp/mnt
> # chown 99 /tmp/mnt/file1
> #
>
> v2: detect NFS4ERR_BADOWNER and NFS4ERR_BADNAME and skip retry
> in nfs4_proc_set_acl.
> Signed-off-by: Dai Ngo <dai.ngo(a)oracle.com>
> Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> Signed-off-by: wangyongpan <1071630525(a)qq.com>
> ---
> fs/nfs/nfs4proc.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index eedcbe6832fb..f387b34bc5e5 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5942,6 +5942,14 @@ static int nfs4_proc_set_acl(struct inode *inode, const void *buf, size_t buflen
> do {
> err = __nfs4_proc_set_acl(inode, buf, buflen);
> trace_nfs4_set_acl(inode, err);
> + if (err == -NFS4ERR_BADOWNER || err == -NFS4ERR_BADNAME) {
> + /*
> + * no need to retry since the kernel
> + * isn't involved in encoding the ACEs.
> + */
> + err = -EINVAL;
> + break;
> + }
> err = nfs4_handle_exception(NFS_SERVER(inode), err,
> &exception);
> } while (exception.retry);
1
0
From: Feng Tang <feng.tang(a)intel.com>
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMMB?from=project-issue
CVE: NA
------------------------------------
End users frequently want to know what features their processor
supports, independent of what the kernel supports.
/proc/cpuinfo is great. It is omnipresent and since it is provided by
the kernel it is always as up to date as the kernel. But, it could be
ambiguous about processor features which can be disabled by the kernel
at boot-time or compile-time.
There are some user space tools showing more raw features, but they are
not bound with kernel, and go with distros. Many end users are still
using old distros with new kernels (upgraded by themselves), and may
not upgrade the distros only to get a newer tool.
So here arise the need for a new tool, which
* shows raw CPU features read from the CPUID instruction
* will be easier to update compared to existing userspace
tooling (perhaps distributed like perf)
* inherits "modern" kernel development process, in contrast to some
of the existing userspace CPUID tools which are still being
developed
without git and distributed in tarballs from non-https sites.
* Can produce output consistent with /proc/cpuinfo to make comparison
easier.
The CPUID leaf definitions are kept in an .csv file which allows for
updating only that file to add support for new feature leafs.
This is based on prototype code from Borislav Petkov
(http://sr71.net/~dave/intel/stupid-cpuid.c)
[ bp:
- Massage, add #define _GNU_SOURCE to fix implicit declaration of
function ‘strcasestr' warning
- remove superfluous newlines
- fallback to cpuid.csv in the current dir if none found
- fix typos
- move comments over the lines instead of sideways. ]
Originally-from: Borislav Petkov <bp(a)alien8.de>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Suggested-by: Borislav Petkov <bp(a)alien8.de>
Signed-off-by: Feng Tang <feng.tang(a)intel.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
tools/arch/x86/kcpuid/Makefile | 24 ++
tools/arch/x86/kcpuid/cpuid.csv | 400 +++++++++++++++++++
tools/arch/x86/kcpuid/kcpuid.c | 657 ++++++++++++++++++++++++++++++++
3 files changed, 1081 insertions(+)
create mode 100644 tools/arch/x86/kcpuid/Makefile
create mode 100644 tools/arch/x86/kcpuid/cpuid.csv
create mode 100644 tools/arch/x86/kcpuid/kcpuid.c
diff --git a/tools/arch/x86/kcpuid/Makefile b/tools/arch/x86/kcpuid/Makefile
new file mode 100644
index 000000000000..a2f1e855092c
--- /dev/null
+++ b/tools/arch/x86/kcpuid/Makefile
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for x86/kcpuid tool
+
+kcpuid : kcpuid.c
+
+CFLAGS = -Wextra
+
+BINDIR ?= /usr/sbin
+
+HWDATADIR ?= /usr/share/misc/
+
+override CFLAGS += -O2 -Wall -I../../../include
+
+%: %.c
+ $(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
+
+.PHONY : clean
+clean :
+ @rm -f kcpuid
+
+install : kcpuid
+ install -d $(DESTDIR)$(BINDIR)
+ install -m 755 -p kcpuid $(DESTDIR)$(BINDIR)/kcpuid
+ install -m 444 -p cpuid.csv $(HWDATADIR)/cpuid.csv
diff --git a/tools/arch/x86/kcpuid/cpuid.csv b/tools/arch/x86/kcpuid/cpuid.csv
new file mode 100644
index 000000000000..e422fad9a98e
--- /dev/null
+++ b/tools/arch/x86/kcpuid/cpuid.csv
@@ -0,0 +1,400 @@
+# The basic row format is:
+# LEAF, SUBLEAF, register_name, bits, short_name, long_description
+
+# Leaf 00H
+ 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+
+# Leaf 01H
+ 1, 0, EAX, 3:0, stepping, Stepping ID
+ 1, 0, EAX, 7:4, model, Model
+ 1, 0, EAX, 11:8, family, Family ID
+ 1, 0, EAX, 13:12, processor, Processor Type
+ 1, 0, EAX, 19:16, model_ext, Extended Model ID
+ 1, 0, EAX, 27:20, family_ext, Extended Family ID
+
+ 1, 0, EBX, 7:0, brand, Brand Index
+ 1, 0, EBX, 15:8, clflush_size, CLFLUSH line size (value * 8) in bytes
+ 1, 0, EBX, 23:16, max_cpu_id, Maxim number of addressable logic cpu in this package
+ 1, 0, EBX, 31:24, apic_id, Initial APIC ID
+
+ 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ 1, 0, ECX, 1, pclmulqdq, PCLMULQDQ instruction supported
+ 1, 0, ECX, 2, dtes64, DS area uses 64-bit layout
+ 1, 0, ECX, 3, mwait, MONITOR/MWAIT supported
+ 1, 0, ECX, 4, ds_cpl, CPL Qualified Debug Store which allows for branch message storage qualified by CPL
+ 1, 0, ECX, 5, vmx, Virtual Machine Extensions supported
+ 1, 0, ECX, 6, smx, Safer Mode Extension supported
+ 1, 0, ECX, 7, eist, Enhanced Intel SpeedStep Technology
+ 1, 0, ECX, 8, tm2, Thermal Monitor 2
+ 1, 0, ECX, 9, ssse3, Supplemental Streaming SIMD Extensions 3 (SSSE3)
+ 1, 0, ECX, 10, l1_ctx_id, L1 data cache could be set to either adaptive mode or shared mode (check IA32_MISC_ENABLE bit 24 definition)
+ 1, 0, ECX, 11, sdbg, IA32_DEBUG_INTERFACE MSR for silicon debug supported
+ 1, 0, ECX, 12, fma, FMA extensions using YMM state supported
+ 1, 0, ECX, 13, cmpxchg16b, 'CMPXCHG16B - Compare and Exchange Bytes' supported
+ 1, 0, ECX, 14, xtpr_update, xTPR Update Control supported
+ 1, 0, ECX, 15, pdcm, Perfmon and Debug Capability present
+ 1, 0, ECX, 17, pcid, Process-Context Identifiers feature present
+ 1, 0, ECX, 18, dca, Prefetching data from a memory mapped device supported
+ 1, 0, ECX, 19, sse4_1, SSE4.1 feature present
+ 1, 0, ECX, 20, sse4_2, SSE4.2 feature present
+ 1, 0, ECX, 21, x2apic, x2APIC supported
+ 1, 0, ECX, 22, movbe, MOVBE instruction supported
+ 1, 0, ECX, 23, popcnt, POPCNT instruction supported
+ 1, 0, ECX, 24, tsc_deadline_timer, LAPIC supports one-shot operation using a TSC deadline value
+ 1, 0, ECX, 25, aesni, AESNI instruction supported
+ 1, 0, ECX, 26, xsave, XSAVE/XRSTOR processor extended states (XSETBV/XGETBV/XCR0)
+ 1, 0, ECX, 27, osxsave, OS has set CR4.OSXSAVE bit to enable XSETBV/XGETBV/XCR0
+ 1, 0, ECX, 28, avx, AVX instruction supported
+ 1, 0, ECX, 29, f16c, 16-bit floating-point conversion instruction supported
+ 1, 0, ECX, 30, rdrand, RDRAND instruction supported
+
+ 1, 0, EDX, 0, fpu, x87 FPU on chip
+ 1, 0, EDX, 1, vme, Virtual-8086 Mode Enhancement
+ 1, 0, EDX, 2, de, Debugging Extensions
+ 1, 0, EDX, 3, pse, Page Size Extensions
+ 1, 0, EDX, 4, tsc, Time Stamp Counter
+ 1, 0, EDX, 5, msr, RDMSR and WRMSR Support
+ 1, 0, EDX, 6, pae, Physical Address Extensions
+ 1, 0, EDX, 7, mce, Machine Check Exception
+ 1, 0, EDX, 8, cx8, CMPXCHG8B instr
+ 1, 0, EDX, 9, apic, APIC on Chip
+ 1, 0, EDX, 11, sep, SYSENTER and SYSEXIT instrs
+ 1, 0, EDX, 12, mtrr, Memory Type Range Registers
+ 1, 0, EDX, 13, pge, Page Global Bit
+ 1, 0, EDX, 14, mca, Machine Check Architecture
+ 1, 0, EDX, 15, cmov, Conditional Move Instrs
+ 1, 0, EDX, 16, pat, Page Attribute Table
+ 1, 0, EDX, 17, pse36, 36-Bit Page Size Extension
+ 1, 0, EDX, 18, psn, Processor Serial Number
+ 1, 0, EDX, 19, clflush, CLFLUSH instr
+# 1, 0, EDX, 20,
+ 1, 0, EDX, 21, ds, Debug Store
+ 1, 0, EDX, 22, acpi, Thermal Monitor and Software Controlled Clock Facilities
+ 1, 0, EDX, 23, mmx, Intel MMX Technology
+ 1, 0, EDX, 24, fxsr, XSAVE and FXRSTOR Instrs
+ 1, 0, EDX, 25, sse, SSE
+ 1, 0, EDX, 26, sse2, SSE2
+ 1, 0, EDX, 27, ss, Self Snoop
+ 1, 0, EDX, 28, hit, Max APIC IDs
+ 1, 0, EDX, 29, tm, Thermal Monitor
+# 1, 0, EDX, 30,
+ 1, 0, EDX, 31, pbe, Pending Break Enable
+
+# Leaf 02H
+# cache and TLB descriptor info
+
+# Leaf 03H
+# Precessor Serial Number, introduced on Pentium III, not valid for
+# latest models
+
+# Leaf 04H
+# thread/core and cache topology
+ 4, 0, EAX, 4:0, cache_type, Cache type like instr/data or unified
+ 4, 0, EAX, 7:5, cache_level, Cache Level (starts at 1)
+ 4, 0, EAX, 8, cache_self_init, Cache Self Initialization
+ 4, 0, EAX, 9, fully_associate, Fully Associative cache
+# 4, 0, EAX, 13:10, resvd, resvd
+ 4, 0, EAX, 25:14, max_logical_id, Max number of addressable IDs for logical processors sharing the cache
+ 4, 0, EAX, 31:26, max_phy_id, Max number of addressable IDs for processors in phy package
+
+ 4, 0, EBX, 11:0, cache_linesize, Size of a cache line in bytes
+ 4, 0, EBX, 21:12, cache_partition, Physical Line partitions
+ 4, 0, EBX, 31:22, cache_ways, Ways of associativity
+ 4, 0, ECX, 31:0, cache_sets, Number of Sets - 1
+ 4, 0, EDX, 0, c_wbinvd, 1 means WBINVD/INVD is not ganranteed to act upon lower level caches of non-originating threads sharing this cache
+ 4, 0, EDX, 1, c_incl, Whether cache is inclusive of lower cache level
+ 4, 0, EDX, 2, c_comp_index, Complex Cache Indexing
+
+# Leaf 05H
+# MONITOR/MWAIT
+ 5, 0, EAX, 15:0, min_mon_size, Smallest monitor line size in bytes
+ 5, 0, EBX, 15:0, max_mon_size, Largest monitor line size in bytes
+ 5, 0, ECX, 0, mwait_ext, Enum of Monitor-Mwait extensions supported
+ 5, 0, ECX, 1, mwait_irq_break, Largest monitor line size in bytes
+ 5, 0, EDX, 3:0, c0_sub_stats, Number of C0* sub C-states supported using MWAIT
+ 5, 0, EDX, 7:4, c1_sub_stats, Number of C1* sub C-states supported using MWAIT
+ 5, 0, EDX, 11:8, c2_sub_stats, Number of C2* sub C-states supported using MWAIT
+ 5, 0, EDX, 15:12, c3_sub_stats, Number of C3* sub C-states supported using MWAIT
+ 5, 0, EDX, 19:16, c4_sub_stats, Number of C4* sub C-states supported using MWAIT
+ 5, 0, EDX, 23:20, c5_sub_stats, Number of C5* sub C-states supported using MWAIT
+ 5, 0, EDX, 27:24, c6_sub_stats, Number of C6* sub C-states supported using MWAIT
+ 5, 0, EDX, 31:28, c7_sub_stats, Number of C7* sub C-states supported using MWAIT
+
+# Leaf 06H
+# Thermal & Power Management
+
+ 6, 0, EAX, 0, dig_temp, Digital temperature sensor supported
+ 6, 0, EAX, 1, turbo, Intel Turbo Boost
+ 6, 0, EAX, 2, arat, Always running APIC timer
+# 6, 0, EAX, 3, resv, Reserved
+ 6, 0, EAX, 4, pln, Power limit notifications supported
+ 6, 0, EAX, 5, ecmd, Clock modulation duty cycle extension supported
+ 6, 0, EAX, 6, ptm, Package thermal management supported
+ 6, 0, EAX, 7, hwp, HWP base register
+ 6, 0, EAX, 8, hwp_notify, HWP notification
+ 6, 0, EAX, 9, hwp_act_window, HWP activity window
+ 6, 0, EAX, 10, hwp_energy, HWP energy performance preference
+ 6, 0, EAX, 11, hwp_pkg_req, HWP package level request
+# 6, 0, EAX, 12, resv, Reserved
+ 6, 0, EAX, 13, hdc, HDC base registers supported
+ 6, 0, EAX, 14, turbo3, Turbo Boost Max 3.0
+ 6, 0, EAX, 15, hwp_cap, Highest Performance change supported
+ 6, 0, EAX, 16, hwp_peci, HWP PECI override is supported
+ 6, 0, EAX, 17, hwp_flex, Flexible HWP is supported
+ 6, 0, EAX, 18, hwp_fast, Fast access mode for the IA32_HWP_REQUEST MSR is supported
+# 6, 0, EAX, 19, resv, Reserved
+ 6, 0, EAX, 20, hwp_ignr, Ignoring Idle Logical Processor HWP request is supported
+
+ 6, 0, EBX, 3:0, therm_irq_thresh, Number of Interrupt Thresholds in Digital Thermal Sensor
+ 6, 0, ECX, 0, aperfmperf, Presence of IA32_MPERF and IA32_APERF
+ 6, 0, ECX, 3, energ_bias, Performance-energy bias preference supported
+
+# Leaf 07H
+# ECX == 0
+# AVX512 refers to https://en.wikipedia.org/wiki/AVX-512
+# XXX: Do we really need to enumerate each and every AVX512 sub features
+
+ 7, 0, EBX, 0, fsgsbase, RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE supported
+ 7, 0, EBX, 1, tsc_adjust, TSC_ADJUST MSR supported
+ 7, 0, EBX, 2, sgx, Software Guard Extensions
+ 7, 0, EBX, 3, bmi1, BMI1
+ 7, 0, EBX, 4, hle, Hardware Lock Elision
+ 7, 0, EBX, 5, avx2, AVX2
+# 7, 0, EBX, 6, fdp_excp_only, x87 FPU Data Pointer updated only on x87 exceptions
+ 7, 0, EBX, 7, smep, Supervisor-Mode Execution Prevention
+ 7, 0, EBX, 8, bmi2, BMI2
+ 7, 0, EBX, 9, rep_movsb, Enhanced REP MOVSB/STOSB
+ 7, 0, EBX, 10, invpcid, INVPCID instruction
+ 7, 0, EBX, 11, rtm, Restricted Transactional Memory
+ 7, 0, EBX, 12, rdt_m, Intel RDT Monitoring capability
+ 7, 0, EBX, 13, depc_fpu_cs_ds, Deprecates FPU CS and FPU DS
+ 7, 0, EBX, 14, mpx, Memory Protection Extensions
+ 7, 0, EBX, 15, rdt_a, Intel RDT Allocation capability
+ 7, 0, EBX, 16, avx512f, AVX512 Foundation instr
+ 7, 0, EBX, 17, avx512dq, AVX512 Double and Quadword AVX512 instr
+ 7, 0, EBX, 18, rdseed, RDSEED instr
+ 7, 0, EBX, 19, adx, ADX instr
+ 7, 0, EBX, 20, smap, Supervisor Mode Access Prevention
+ 7, 0, EBX, 21, avx512ifma, AVX512 Integer Fused Multiply Add
+# 7, 0, EBX, 22, resvd, resvd
+ 7, 0, EBX, 23, clflushopt, CLFLUSHOPT instr
+ 7, 0, EBX, 24, clwb, CLWB instr
+ 7, 0, EBX, 25, intel_pt, Intel Processor Trace instr
+ 7, 0, EBX, 26, avx512pf, Prefetch
+ 7, 0, EBX, 27, avx512er, AVX512 Exponent Reciproca instr
+ 7, 0, EBX, 28, avx512cd, AVX512 Conflict Detection instr
+ 7, 0, EBX, 29, sha, Intel Secure Hash Algorithm Extensions instr
+ 7, 0, EBX, 26, avx512bw, AVX512 Byte & Word instr
+ 7, 0, EBX, 28, avx512vl, AVX512 Vector Length Extentions (VL)
+ 7, 0, ECX, 0, prefetchwt1, X
+ 7, 0, ECX, 1, avx512vbmi, AVX512 Vector Byte Manipulation Instructions
+ 7, 0, ECX, 2, umip, User-mode Instruction Prevention
+
+ 7, 0, ECX, 3, pku, Protection Keys for User-mode pages
+ 7, 0, ECX, 4, ospke, CR4 PKE set to enable protection keys
+# 7, 0, ECX, 16:5, resvd, resvd
+ 7, 0, ECX, 21:17, mawau, The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode
+ 7, 0, ECX, 22, rdpid, RDPID and IA32_TSC_AUX
+# 7, 0, ECX, 29:23, resvd, resvd
+ 7, 0, ECX, 30, sgx_lc, SGX Launch Configuration
+# 7, 0, ECX, 31, resvd, resvd
+
+# Leaf 08H
+#
+
+
+# Leaf 09H
+# Direct Cache Access (DCA) information
+ 9, 0, ECX, 31:0, dca_cap, The value of IA32_PLATFORM_DCA_CAP
+
+# Leaf 0AH
+# Architectural Performance Monitoring
+#
+# Do we really need to print out the PMU related stuff?
+# Does normal user really care about it?
+#
+ 0xA, 0, EAX, 7:0, pmu_ver, Performance Monitoring Unit version
+ 0xA, 0, EAX, 15:8, pmu_gp_cnt_num, Numer of general-purose PMU counters per logical CPU
+ 0xA, 0, EAX, 23:16, pmu_cnt_bits, Bit wideth of PMU counter
+ 0xA, 0, EAX, 31:24, pmu_ebx_bits, Length of EBX bit vector to enumerate PMU events
+
+ 0xA, 0, EBX, 0, pmu_no_core_cycle_evt, Core cycle event not available
+ 0xA, 0, EBX, 1, pmu_no_instr_ret_evt, Instruction retired event not available
+ 0xA, 0, EBX, 2, pmu_no_ref_cycle_evt, Reference cycles event not available
+ 0xA, 0, EBX, 3, pmu_no_llc_ref_evt, Last-level cache reference event not available
+ 0xA, 0, EBX, 4, pmu_no_llc_mis_evt, Last-level cache misses event not available
+ 0xA, 0, EBX, 5, pmu_no_br_instr_ret_evt, Branch instruction retired event not available
+ 0xA, 0, EBX, 6, pmu_no_br_mispredict_evt, Branch mispredict retired event not available
+
+ 0xA, 0, ECX, 4:0, pmu_fixed_cnt_num, Performance Monitoring Unit version
+ 0xA, 0, ECX, 12:5, pmu_fixed_cnt_bits, Numer of PMU counters per logical CPU
+
+# Leaf 0BH
+# Extended Topology Enumeration Leaf
+#
+
+ 0xB, 0, EAX, 4:0, id_shift, Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type
+ 0xB, 0, EBX, 15:0, cpu_nr, Number of logical processors at this level type
+ 0xB, 0, ECX, 15:8, lvl_type, 0-Invalid 1-SMT 2-Core
+ 0xB, 0, EDX, 31:0, x2apic_id, x2APIC ID the current logical processor
+
+
+# Leaf 0DH
+# Processor Extended State
+
+ 0xD, 0, EAX, 0, x87, X87 state
+ 0xD, 0, EAX, 1, sse, SSE state
+ 0xD, 0, EAX, 2, avx, AVX state
+ 0xD, 0, EAX, 4:3, mpx, MPX state
+ 0xD, 0, EAX, 7:5, avx512, AVX-512 state
+ 0xD, 0, EAX, 9, pkru, PKRU state
+
+ 0xD, 0, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 0, ECX, 31:0, max_sz_xsave, Maximum size (bytes) of the XSAVE/XRSTOR save area
+
+ 0xD, 1, EAX, 0, xsaveopt, XSAVEOPT available
+ 0xD, 1, EAX, 1, xsavec, XSAVEC and compacted form supported
+ 0xD, 1, EAX, 2, xgetbv, XGETBV supported
+ 0xD, 1, EAX, 3, xsaves, XSAVES/XRSTORS and IA32_XSS supported
+
+ 0xD, 1, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 1, ECX, 8, pt, PT state
+ 0xD, 1, ECX, 11, cet_usr, CET user state
+ 0xD, 1, ECX, 12, cet_supv, CET supervisor state
+ 0xD, 1, ECX, 13, hdc, HDC state
+ 0xD, 1, ECX, 16, hwp, HWP state
+
+# Leaf 0FH
+# Intel RDT Monitoring
+
+ 0xF, 0, EBX, 31:0, rmid_range, Maximum range (zero-based) of RMID within this physical processor of all types
+ 0xF, 0, EDX, 1, l3c_rdt_mon, L3 Cache RDT Monitoring supported
+
+ 0xF, 1, ECX, 31:0, rmid_range, Maximum range (zero-based) of RMID of this types
+ 0xF, 1, EDX, 0, l3c_ocp_mon, L3 Cache occupancy Monitoring supported
+ 0xF, 1, EDX, 1, l3c_tbw_mon, L3 Cache Total Bandwidth Monitoring supported
+ 0xF, 1, EDX, 2, l3c_lbw_mon, L3 Cache Local Bandwidth Monitoring supported
+
+# Leaf 10H
+# Intel RDT Allocation
+
+ 0x10, 0, EBX, 1, l3c_rdt_alloc, L3 Cache Allocation supported
+ 0x10, 0, EBX, 2, l2c_rdt_alloc, L2 Cache Allocation supported
+ 0x10, 0, EBX, 3, mem_bw_alloc, Memory Bandwidth Allocation supported
+
+
+# Leaf 12H
+# SGX Capability
+#
+# Some detailed SGX features not added yet
+
+ 0x12, 0, EAX, 0, sgx1, L3 Cache Allocation supported
+ 0x12, 1, EAX, 0, sgx2, L3 Cache Allocation supported
+
+
+# Leaf 14H
+# Intel Processor Tracer
+#
+
+# Leaf 15H
+# Time Stamp Counter and Nominal Core Crystal Clock Information
+
+ 0x15, 0, EAX, 31:0, tsc_denominator, The denominator of the TSC/”core crystal clock” ratio
+ 0x15, 0, EBX, 31:0, tsc_numerator, The numerator of the TSC/”core crystal clock” ratio
+ 0x15, 0, ECX, 31:0, nom_freq, Nominal frequency of the core crystal clock in Hz
+
+# Leaf 16H
+# Processor Frequency Information
+
+ 0x16, 0, EAX, 15:0, cpu_base_freq, Processor Base Frequency in MHz
+ 0x16, 0, EBX, 15:0, cpu_max_freq, Maximum Frequency in MHz
+ 0x16, 0, ECX, 15:0, bus_freq, Bus (Reference) Frequency in MHz
+
+# Leaf 17H
+# System-On-Chip Vendor Attribute
+
+ 0x17, 0, EAX, 31:0, max_socid, Maximum input value of supported sub-leaf
+ 0x17, 0, EBX, 15:0, soc_vid, SOC Vendor ID
+ 0x17, 0, EBX, 16, std_vid, SOC Vendor ID is assigned via an industry standard scheme
+ 0x17, 0, ECX, 31:0, soc_pid, SOC Project ID assigned by vendor
+ 0x17, 0, EDX, 31:0, soc_sid, SOC Stepping ID
+
+# Leaf 18H
+# Deterministic Address Translation Parameters
+
+
+# Leaf 19H
+# Key Locker Leaf
+
+
+# Leaf 1AH
+# Hybrid Information
+
+ 0x1A, 0, EAX, 31:24, core_type, 20H-Intel_Atom 40H-Intel_Core
+
+
+# Leaf 1FH
+# V2 Extended Topology - A preferred superset to leaf 0BH
+
+
+# According to SDM
+# 40000000H - 4FFFFFFFH is invalid range
+
+
+# Leaf 80000001H
+# Extended Processor Signature and Feature Bits
+
+0x80000001, 0, ECX, 0, lahf_lm, LAHF/SAHF available in 64-bit mode
+0x80000001, 0, ECX, 5, lzcnt, LZCNT
+0x80000001, 0, ECX, 8, prefetchw, PREFETCHW
+
+0x80000001, 0, EDX, 11, sysret, SYSCALL/SYSRET supported
+0x80000001, 0, EDX, 20, exec_dis, Execute Disable Bit available
+0x80000001, 0, EDX, 26, 1gb_page, 1GB page supported
+0x80000001, 0, EDX, 27, rdtscp, RDTSCP and IA32_TSC_AUX are available
+#0x80000001, 0, EDX, 29, 64b, 64b Architecture supported
+
+# Leaf 80000002H/80000003H/80000004H
+# Processor Brand String
+
+# Leaf 80000005H
+# Reserved
+
+# Leaf 80000006H
+# Extended L2 Cache Features
+
+0x80000006, 0, ECX, 7:0, clsize, Cache Line size in bytes
+0x80000006, 0, ECX, 15:12, l2c_assoc, L2 Associativity
+0x80000006, 0, ECX, 31:16, csize, Cache size in 1K units
+
+
+# Leaf 80000007H
+
+0x80000007, 0, EDX, 8, nonstop_tsc, Invariant TSC available
+
+
+# Leaf 80000008H
+
+0x80000008, 0, EAX, 7:0, phy_adr_bits, Physical Address Bits
+0x80000008, 0, EAX, 15:8, lnr_adr_bits, Linear Address Bits
+0x80000007, 0, EBX, 9, wbnoinvd, WBNOINVD
+
+# 0x8000001E
+# EAX: Extended APIC ID
+0x8000001E, 0, EAX, 31:0, extended_apic_id, Extended APIC ID
+# EBX: Core Identifiers
+0x8000001E, 0, EBX, 7:0, core_id, Identifies the logical core ID
+0x8000001E, 0, EBX, 15:8, threads_per_core, The number of threads per core is threads_per_core + 1
+# ECX: Node Identifiers
+0x8000001E, 0, ECX, 7:0, node_id, Node ID
+0x8000001E, 0, ECX, 10:8, nodes_per_processor, Nodes per processor { 0: 1 node, else reserved }
+
+# 8000001F: AMD Secure Encryption
+0x8000001F, 0, EAX, 0, sme, Secure Memory Encryption
+0x8000001F, 0, EAX, 1, sev, Secure Encrypted Virtualization
+0x8000001F, 0, EAX, 2, vmpgflush, VM Page Flush MSR
+0x8000001F, 0, EAX, 3, seves, SEV Encrypted State
+0x8000001F, 0, EBX, 5:0, c-bit, Page table bit number used to enable memory encryption
+0x8000001F, 0, EBX, 11:6, mem_encrypt_physaddr_width, Reduction of physical address space in bits with SME enabled
+0x8000001F, 0, ECX, 31:0, num_encrypted_guests, Maximum ASID value that may be used for an SEV-enabled guest
+0x8000001F, 0, EDX, 31:0, minimum_sev_asid, Minimum ASID value that must be used for an SEV-enabled, SEV-ES-disabled guest
\ No newline at end of file
diff --git a/tools/arch/x86/kcpuid/kcpuid.c b/tools/arch/x86/kcpuid/kcpuid.c
new file mode 100644
index 000000000000..936a3a2aad04
--- /dev/null
+++ b/tools/arch/x86/kcpuid/kcpuid.c
@@ -0,0 +1,657 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+char *def_csv = "/usr/share/misc/cpuid.csv";
+char *user_csv;
+
+
+/* Cover both single-bit flag and multiple-bits fields */
+struct bits_desc {
+ /* start and end bits */
+ int start, end;
+ /* 0 or 1 for 1-bit flag */
+ int value;
+ char simp[32];
+ char detail[256];
+};
+
+/* descriptor info for eax/ebx/ecx/edx */
+struct reg_desc {
+ /* number of valid entries */
+ int nr;
+ struct bits_desc descs[32];
+};
+
+enum {
+ R_EAX = 0,
+ R_EBX,
+ R_ECX,
+ R_EDX,
+ NR_REGS
+};
+
+struct subleaf {
+ u32 index;
+ u32 sub;
+ u32 eax, ebx, ecx, edx;
+ struct reg_desc info[NR_REGS];
+};
+
+/* Represent one leaf (basic or extended) */
+struct cpuid_func {
+ /*
+ * Array of subleafs for this func, if there is no subleafs
+ * then the leafs[0] is the main leaf
+ */
+ struct subleaf *leafs;
+ int nr;
+};
+
+struct cpuid_range {
+ /* array of main leafs */
+ struct cpuid_func *funcs;
+ /* number of valid leafs */
+ int nr;
+ bool is_ext;
+};
+
+/*
+ * basic: basic functions range: [0... ]
+ * ext: extended functions range: [0x80000000... ]
+ */
+struct cpuid_range *leafs_basic, *leafs_ext;
+
+static int num_leafs;
+static bool is_amd;
+static bool show_details;
+static bool show_raw;
+static bool show_flags_only = true;
+static u32 user_index = 0xFFFFFFFF;
+static u32 user_sub = 0xFFFFFFFF;
+static int flines;
+
+static inline void cpuid(u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+{
+ /* ecx is often an input as well as an output. */
+ asm volatile("cpuid"
+ : "=a" (*eax),
+ "=b" (*ebx),
+ "=c" (*ecx),
+ "=d" (*edx)
+ : "0" (*eax), "2" (*ecx));
+}
+
+static inline bool has_subleafs(u32 f)
+{
+ if (f == 0x7 || f == 0xd)
+ return true;
+
+ if (is_amd) {
+ if (f == 0x8000001d)
+ return true;
+ return false;
+ }
+
+ switch (f) {
+ case 0x4:
+ case 0xb:
+ case 0xf:
+ case 0x10:
+ case 0x14:
+ case 0x18:
+ case 0x1f:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static void leaf_print_raw(struct subleaf *leaf)
+{
+ if (has_subleafs(leaf->index)) {
+ if (leaf->sub == 0)
+ printf("0x%08x: subleafs:\n", leaf->index);
+
+ printf(" %2d: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->sub, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ } else {
+ printf("0x%08x: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->index, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ }
+}
+
+/* Return true is the input eax/ebx/ecx/edx are all zero */
+static bool cpuid_store(struct cpuid_range *range, u32 f, int subleaf,
+ u32 a, u32 b, u32 c, u32 d)
+{
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ int s = 0;
+
+ if (a == 0 && b == 0 && c == 0 && d == 0)
+ return true;
+
+ /*
+ * Cut off vendor-prefix from CPUID function as we're using it as an
+ * index into ->funcs.
+ */
+ func = &range->funcs[f & 0xffff];
+
+ if (!func->leafs) {
+ func->leafs = malloc(sizeof(struct subleaf));
+ if (!func->leafs)
+ perror("malloc func leaf");
+
+ func->nr = 1;
+ } else {
+ s = func->nr;
+ func->leafs = realloc(func->leafs, (s + 1) * sizeof(*leaf));
+ if (!func->leafs)
+ perror("realloc f->leafs");
+
+ func->nr++;
+ }
+
+ leaf = &func->leafs[s];
+
+ leaf->index = f;
+ leaf->sub = subleaf;
+ leaf->eax = a;
+ leaf->ebx = b;
+ leaf->ecx = c;
+ leaf->edx = d;
+
+ return false;
+}
+
+static void raw_dump_range(struct cpuid_range *range)
+{
+ u32 f;
+ int i;
+
+ printf("%s Leafs :\n", range->is_ext ? "Extended" : "Basic");
+ printf("================\n");
+
+ for (f = 0; (int)f < range->nr; f++) {
+ struct cpuid_func *func = &range->funcs[f];
+ u32 index = f;
+
+ if (range->is_ext)
+ index += 0x80000000;
+
+ /* Skip leaf without valid items */
+ if (!func->nr)
+ continue;
+
+ /* First item is the main leaf, followed by all subleafs */
+ for (i = 0; i < func->nr; i++)
+ leaf_print_raw(&func->leafs[i]);
+ }
+}
+
+#define MAX_SUBLEAF_NUM 32
+struct cpuid_range *setup_cpuid_range(u32 input_eax)
+{
+ u32 max_func, idx_func;
+ int subleaf;
+ struct cpuid_range *range;
+ u32 eax, ebx, ecx, edx;
+ u32 f = input_eax;
+ int max_subleaf;
+ bool allzero;
+
+ eax = input_eax;
+ ebx = ecx = edx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ max_func = eax;
+ idx_func = (max_func & 0xffff) + 1;
+
+ range = malloc(sizeof(struct cpuid_range));
+ if (!range)
+ perror("malloc range");
+
+ if (input_eax & 0x80000000)
+ range->is_ext = true;
+ else
+ range->is_ext = false;
+
+ range->funcs = malloc(sizeof(struct cpuid_func) * idx_func);
+ if (!range->funcs)
+ perror("malloc range->funcs");
+
+ range->nr = idx_func;
+ memset(range->funcs, 0, sizeof(struct cpuid_func) * idx_func);
+
+ for (; f <= max_func; f++) {
+ eax = f;
+ subleaf = ecx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf, eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+
+ if (!has_subleafs(f))
+ continue;
+
+ max_subleaf = MAX_SUBLEAF_NUM;
+
+ /*
+ * Some can provide the exact number of subleafs,
+ * others have to be tried (0xf)
+ */
+ if (f == 0x7 || f == 0x14 || f == 0x17 || f == 0x18)
+ max_subleaf = (eax & 0xff) + 1;
+
+ if (f == 0xb)
+ max_subleaf = 2;
+
+ for (subleaf = 1; subleaf < max_subleaf; subleaf++) {
+ eax = f;
+ ecx = subleaf;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf,
+ eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+ }
+
+ }
+
+ return range;
+}
+
+/*
+ * The basic row format for cpuid.csv is
+ * LEAF,SUBLEAF,register_name,bits,short name,long description
+ *
+ * like:
+ * 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+ * 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ */
+static int parse_line(char *line)
+{
+ char *str;
+ int i;
+ struct cpuid_range *range;
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ u32 index;
+ u32 sub;
+ char buffer[512];
+ char *buf;
+ /*
+ * Tokens:
+ * 1. leaf
+ * 2. subleaf
+ * 3. register
+ * 4. bits
+ * 5. short name
+ * 6. long detail
+ */
+ char *tokens[6];
+ struct reg_desc *reg;
+ struct bits_desc *bdesc;
+ int reg_index;
+ char *start, *end;
+
+ /* Skip comments and NULL line */
+ if (line[0] == '#' || line[0] == '\n')
+ return 0;
+
+ strncpy(buffer, line, 511);
+ buffer[511] = 0;
+ str = buffer;
+ for (i = 0; i < 5; i++) {
+ tokens[i] = strtok(str, ",");
+ if (!tokens[i])
+ goto err_exit;
+ str = NULL;
+ }
+ tokens[5] = strtok(str, "\n");
+ if (!tokens[5])
+ goto err_exit;
+
+ /* index/main-leaf */
+ index = strtoull(tokens[0], NULL, 0);
+
+ if (index & 0x80000000)
+ range = leafs_ext;
+ else
+ range = leafs_basic;
+
+ index &= 0x7FFFFFFF;
+ /* Skip line parsing for non-existing indexes */
+ if ((int)index >= range->nr)
+ return -1;
+
+ func = &range->funcs[index];
+
+ /* Return if the index has no valid item on this platform */
+ if (!func->nr)
+ return 0;
+
+ /* subleaf */
+ sub = strtoul(tokens[1], NULL, 0);
+ if ((int)sub > func->nr)
+ return -1;
+
+ leaf = &func->leafs[sub];
+ buf = tokens[2];
+
+ if (strcasestr(buf, "EAX"))
+ reg_index = R_EAX;
+ else if (strcasestr(buf, "EBX"))
+ reg_index = R_EBX;
+ else if (strcasestr(buf, "ECX"))
+ reg_index = R_ECX;
+ else if (strcasestr(buf, "EDX"))
+ reg_index = R_EDX;
+ else
+ goto err_exit;
+
+ reg = &leaf->info[reg_index];
+ bdesc = ®->descs[reg->nr++];
+
+ /* bit flag or bits field */
+ buf = tokens[3];
+
+ end = strtok(buf, ":");
+ bdesc->end = strtoul(end, NULL, 0);
+ bdesc->start = bdesc->end;
+
+ /* start != NULL means it is bit fields */
+ start = strtok(NULL, ":");
+ if (start)
+ bdesc->start = strtoul(start, NULL, 0);
+
+ strcpy(bdesc->simp, tokens[4]);
+ strcpy(bdesc->detail, tokens[5]);
+ return 0;
+
+err_exit:
+ printf("Warning: wrong line format:\n");
+ printf("\tline[%d]: %s\n", flines, line);
+ return -1;
+}
+
+/* Parse csv file, and construct the array of all leafs and subleafs */
+static void parse_text(void)
+{
+ FILE *file;
+ char *filename, *line = NULL;
+ size_t len = 0;
+ int ret;
+
+ if (show_raw)
+ return;
+
+ filename = user_csv ? user_csv : def_csv;
+ file = fopen(filename, "r");
+ if (!file) {
+ /* Fallback to a csv in the same dir */
+ file = fopen("./cpuid.csv", "r");
+ }
+
+ if (!file) {
+ printf("Fail to open '%s'\n", filename);
+ return;
+ }
+
+ while (1) {
+ ret = getline(&line, &len, file);
+ flines++;
+ if (ret > 0)
+ parse_line(line);
+
+ if (feof(file))
+ break;
+ }
+
+ fclose(file);
+}
+
+
+/* Decode every eax/ebx/ecx/edx */
+static void decode_bits(u32 value, struct reg_desc *rdesc)
+{
+ struct bits_desc *bdesc;
+ int start, end, i;
+ u32 mask;
+
+ for (i = 0; i < rdesc->nr; i++) {
+ bdesc = &rdesc->descs[i];
+
+ start = bdesc->start;
+ end = bdesc->end;
+ if (start == end) {
+ /* single bit flag */
+ if (value & (1 << start))
+ printf("\t%-20s %s%s\n",
+ bdesc->simp,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ } else {
+ /* bit fields */
+ if (show_flags_only)
+ continue;
+
+ mask = ((u64)1 << (end - start + 1)) - 1;
+ printf("\t%-20s\t: 0x%-8x\t%s%s\n",
+ bdesc->simp,
+ (value >> start) & mask,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ }
+ }
+}
+
+static void show_leaf(struct subleaf *leaf)
+{
+ if (!leaf)
+ return;
+
+ if (show_raw)
+ leaf_print_raw(leaf);
+
+ decode_bits(leaf->eax, &leaf->info[R_EAX]);
+ decode_bits(leaf->ebx, &leaf->info[R_EBX]);
+ decode_bits(leaf->ecx, &leaf->info[R_ECX]);
+ decode_bits(leaf->edx, &leaf->info[R_EDX]);
+}
+
+static void show_func(struct cpuid_func *func)
+{
+ int i;
+
+ if (!func)
+ return;
+
+ for (i = 0; i < func->nr; i++)
+ show_leaf(&func->leafs[i]);
+}
+
+static void show_range(struct cpuid_range *range)
+{
+ int i;
+
+ for (i = 0; i < range->nr; i++)
+ show_func(&range->funcs[i]);
+}
+
+static inline struct cpuid_func *index_to_func(u32 index)
+{
+ struct cpuid_range *range;
+
+ range = (index & 0x80000000) ? leafs_ext : leafs_basic;
+ index &= 0x7FFFFFFF;
+
+ if (((index & 0xFFFF) + 1) > (u32)range->nr) {
+ printf("ERR: invalid input index (0x%x)\n", index);
+ return NULL;
+ }
+ return &range->funcs[index];
+}
+
+static void show_info(void)
+{
+ struct cpuid_func *func;
+
+ if (show_raw) {
+ /* Show all of the raw output of 'cpuid' instr */
+ raw_dump_range(leafs_basic);
+ raw_dump_range(leafs_ext);
+ return;
+ }
+
+ if (user_index != 0xFFFFFFFF) {
+ /* Only show specific leaf/subleaf info */
+ func = index_to_func(user_index);
+ if (!func)
+ return;
+
+ /* Dump the raw data also */
+ show_raw = true;
+
+ if (user_sub != 0xFFFFFFFF) {
+ if (user_sub + 1 <= (u32)func->nr) {
+ show_leaf(&func->leafs[user_sub]);
+ return;
+ }
+
+ printf("ERR: invalid input subleaf (0x%x)\n", user_sub);
+ }
+
+ show_func(func);
+ return;
+ }
+
+ printf("CPU features:\n=============\n\n");
+ show_range(leafs_basic);
+ show_range(leafs_ext);
+}
+
+static void setup_platform_cpuid(void)
+{
+ u32 eax, ebx, ecx, edx;
+
+ /* Check vendor */
+ eax = ebx = ecx = edx = 0;
+ cpuid(&eax, &ebx, &ecx, &edx);
+
+ /* "htuA" */
+ if (ebx == 0x68747541)
+ is_amd = true;
+
+ /* Setup leafs for the basic and extended range */
+ leafs_basic = setup_cpuid_range(0x0);
+ leafs_ext = setup_cpuid_range(0x80000000);
+}
+
+static void usage(void)
+{
+ printf("kcpuid [-abdfhr] [-l leaf] [-s subleaf]\n"
+ "\t-a|--all Show both bit flags and complex bit fields info\n"
+ "\t-b|--bitflags Show boolean flags only\n"
+ "\t-d|--detail Show details of the flag/fields (default)\n"
+ "\t-f|--flags Specify the cpuid csv file\n"
+ "\t-h|--help Show usage info\n"
+ "\t-l|--leaf=index Specify the leaf you want to check\n"
+ "\t-r|--raw Show raw cpuid data\n"
+ "\t-s|--subleaf=sub Specify the subleaf you want to check\n"
+ );
+}
+
+static struct option opts[] = {
+ { "all", no_argument, NULL, 'a' }, /* show both bit flags and fields */
+ { "bitflags", no_argument, NULL, 'b' }, /* only show bit flags, default on */
+ { "detail", no_argument, NULL, 'd' }, /* show detail descriptions */
+ { "file", required_argument, NULL, 'f' }, /* use user's cpuid file */
+ { "help", no_argument, NULL, 'h'}, /* show usage */
+ { "leaf", required_argument, NULL, 'l'}, /* only check a specific leaf */
+ { "raw", no_argument, NULL, 'r'}, /* show raw CPUID leaf data */
+ { "subleaf", required_argument, NULL, 's'}, /* check a specific subleaf */
+ { NULL, 0, NULL, 0 }
+};
+
+static int parse_options(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt_long(argc, argv, "abdf:hl:rs:",
+ opts, NULL)) != -1)
+ switch (c) {
+ case 'a':
+ show_flags_only = false;
+ break;
+ case 'b':
+ show_flags_only = true;
+ break;
+ case 'd':
+ show_details = true;
+ break;
+ case 'f':
+ user_csv = optarg;
+ break;
+ case 'h':
+ usage();
+ exit(1);
+ break;
+ case 'l':
+ /* main leaf */
+ user_index = strtoul(optarg, NULL, 0);
+ break;
+ case 'r':
+ show_raw = true;
+ break;
+ case 's':
+ /* subleaf */
+ user_sub = strtoul(optarg, NULL, 0);
+ break;
+ default:
+ printf("%s: Invalid option '%c'\n", argv[0], optopt);
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * Do 4 things in turn:
+ * 1. Parse user options
+ * 2. Parse and store all the CPUID leaf data supported on this platform
+ * 2. Parse the csv file, while skipping leafs which are not available
+ * on this platform
+ * 3. Print leafs info based on user options
+ */
+int main(int argc, char *argv[])
+{
+ if (parse_options(argc, argv))
+ return -1;
+
+ /* Setup the cpuid leafs of current platform */
+ setup_platform_cpuid();
+
+ /* Read and parse the 'cpuid.csv' */
+ parse_text();
+
+ show_info();
+ return 0;
+}
\ No newline at end of file
--
2.30.0
1
0
From: Feng Tang <feng.tang(a)intel.com>
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMMB?from=project-issue
CVE: NA
------------------------------------
End users frequently want to know what features their processor
supports, independent of what the kernel supports.
/proc/cpuinfo is great. It is omnipresent and since it is provided by
the kernel it is always as up to date as the kernel. But, it could be
ambiguous about processor features which can be disabled by the kernel
at boot-time or compile-time.
There are some user space tools showing more raw features, but they are
not bound with kernel, and go with distros. Many end users are still
using old distros with new kernels (upgraded by themselves), and may
not upgrade the distros only to get a newer tool.
So here arise the need for a new tool, which
* shows raw CPU features read from the CPUID instruction
* will be easier to update compared to existing userspace
tooling (perhaps distributed like perf)
* inherits "modern" kernel development process, in contrast to some
of the existing userspace CPUID tools which are still being
developed
without git and distributed in tarballs from non-https sites.
* Can produce output consistent with /proc/cpuinfo to make comparison
easier.
The CPUID leaf definitions are kept in an .csv file which allows for
updating only that file to add support for new feature leafs.
This is based on prototype code from Borislav Petkov
(http://sr71.net/~dave/intel/stupid-cpuid.c)
[ bp:
- Massage, add #define _GNU_SOURCE to fix implicit declaration of
function ‘strcasestr' warning
- remove superfluous newlines
- fallback to cpuid.csv in the current dir if none found
- fix typos
- move comments over the lines instead of sideways. ]
Originally-from: Borislav Petkov <bp(a)alien8.de>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Suggested-by: Borislav Petkov <bp(a)alien8.de>
Signed-off-by: Feng Tang <feng.tang(a)intel.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
tools/arch/x86/kcpuid/Makefile | 24 ++
tools/arch/x86/kcpuid/cpuid.csv | 400 +++++++++++++++++++
tools/arch/x86/kcpuid/kcpuid.c | 657 ++++++++++++++++++++++++++++++++
3 files changed, 1081 insertions(+)
create mode 100644 tools/arch/x86/kcpuid/Makefile
create mode 100644 tools/arch/x86/kcpuid/cpuid.csv
create mode 100644 tools/arch/x86/kcpuid/kcpuid.c
diff --git a/tools/arch/x86/kcpuid/Makefile b/tools/arch/x86/kcpuid/Makefile
new file mode 100644
index 000000000000..a2f1e855092c
--- /dev/null
+++ b/tools/arch/x86/kcpuid/Makefile
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for x86/kcpuid tool
+
+kcpuid : kcpuid.c
+
+CFLAGS = -Wextra
+
+BINDIR ?= /usr/sbin
+
+HWDATADIR ?= /usr/share/misc/
+
+override CFLAGS += -O2 -Wall -I../../../include
+
+%: %.c
+ $(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
+
+.PHONY : clean
+clean :
+ @rm -f kcpuid
+
+install : kcpuid
+ install -d $(DESTDIR)$(BINDIR)
+ install -m 755 -p kcpuid $(DESTDIR)$(BINDIR)/kcpuid
+ install -m 444 -p cpuid.csv $(HWDATADIR)/cpuid.csv
diff --git a/tools/arch/x86/kcpuid/cpuid.csv b/tools/arch/x86/kcpuid/cpuid.csv
new file mode 100644
index 000000000000..e422fad9a98e
--- /dev/null
+++ b/tools/arch/x86/kcpuid/cpuid.csv
@@ -0,0 +1,400 @@
+# The basic row format is:
+# LEAF, SUBLEAF, register_name, bits, short_name, long_description
+
+# Leaf 00H
+ 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+
+# Leaf 01H
+ 1, 0, EAX, 3:0, stepping, Stepping ID
+ 1, 0, EAX, 7:4, model, Model
+ 1, 0, EAX, 11:8, family, Family ID
+ 1, 0, EAX, 13:12, processor, Processor Type
+ 1, 0, EAX, 19:16, model_ext, Extended Model ID
+ 1, 0, EAX, 27:20, family_ext, Extended Family ID
+
+ 1, 0, EBX, 7:0, brand, Brand Index
+ 1, 0, EBX, 15:8, clflush_size, CLFLUSH line size (value * 8) in bytes
+ 1, 0, EBX, 23:16, max_cpu_id, Maxim number of addressable logic cpu in this package
+ 1, 0, EBX, 31:24, apic_id, Initial APIC ID
+
+ 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ 1, 0, ECX, 1, pclmulqdq, PCLMULQDQ instruction supported
+ 1, 0, ECX, 2, dtes64, DS area uses 64-bit layout
+ 1, 0, ECX, 3, mwait, MONITOR/MWAIT supported
+ 1, 0, ECX, 4, ds_cpl, CPL Qualified Debug Store which allows for branch message storage qualified by CPL
+ 1, 0, ECX, 5, vmx, Virtual Machine Extensions supported
+ 1, 0, ECX, 6, smx, Safer Mode Extension supported
+ 1, 0, ECX, 7, eist, Enhanced Intel SpeedStep Technology
+ 1, 0, ECX, 8, tm2, Thermal Monitor 2
+ 1, 0, ECX, 9, ssse3, Supplemental Streaming SIMD Extensions 3 (SSSE3)
+ 1, 0, ECX, 10, l1_ctx_id, L1 data cache could be set to either adaptive mode or shared mode (check IA32_MISC_ENABLE bit 24 definition)
+ 1, 0, ECX, 11, sdbg, IA32_DEBUG_INTERFACE MSR for silicon debug supported
+ 1, 0, ECX, 12, fma, FMA extensions using YMM state supported
+ 1, 0, ECX, 13, cmpxchg16b, 'CMPXCHG16B - Compare and Exchange Bytes' supported
+ 1, 0, ECX, 14, xtpr_update, xTPR Update Control supported
+ 1, 0, ECX, 15, pdcm, Perfmon and Debug Capability present
+ 1, 0, ECX, 17, pcid, Process-Context Identifiers feature present
+ 1, 0, ECX, 18, dca, Prefetching data from a memory mapped device supported
+ 1, 0, ECX, 19, sse4_1, SSE4.1 feature present
+ 1, 0, ECX, 20, sse4_2, SSE4.2 feature present
+ 1, 0, ECX, 21, x2apic, x2APIC supported
+ 1, 0, ECX, 22, movbe, MOVBE instruction supported
+ 1, 0, ECX, 23, popcnt, POPCNT instruction supported
+ 1, 0, ECX, 24, tsc_deadline_timer, LAPIC supports one-shot operation using a TSC deadline value
+ 1, 0, ECX, 25, aesni, AESNI instruction supported
+ 1, 0, ECX, 26, xsave, XSAVE/XRSTOR processor extended states (XSETBV/XGETBV/XCR0)
+ 1, 0, ECX, 27, osxsave, OS has set CR4.OSXSAVE bit to enable XSETBV/XGETBV/XCR0
+ 1, 0, ECX, 28, avx, AVX instruction supported
+ 1, 0, ECX, 29, f16c, 16-bit floating-point conversion instruction supported
+ 1, 0, ECX, 30, rdrand, RDRAND instruction supported
+
+ 1, 0, EDX, 0, fpu, x87 FPU on chip
+ 1, 0, EDX, 1, vme, Virtual-8086 Mode Enhancement
+ 1, 0, EDX, 2, de, Debugging Extensions
+ 1, 0, EDX, 3, pse, Page Size Extensions
+ 1, 0, EDX, 4, tsc, Time Stamp Counter
+ 1, 0, EDX, 5, msr, RDMSR and WRMSR Support
+ 1, 0, EDX, 6, pae, Physical Address Extensions
+ 1, 0, EDX, 7, mce, Machine Check Exception
+ 1, 0, EDX, 8, cx8, CMPXCHG8B instr
+ 1, 0, EDX, 9, apic, APIC on Chip
+ 1, 0, EDX, 11, sep, SYSENTER and SYSEXIT instrs
+ 1, 0, EDX, 12, mtrr, Memory Type Range Registers
+ 1, 0, EDX, 13, pge, Page Global Bit
+ 1, 0, EDX, 14, mca, Machine Check Architecture
+ 1, 0, EDX, 15, cmov, Conditional Move Instrs
+ 1, 0, EDX, 16, pat, Page Attribute Table
+ 1, 0, EDX, 17, pse36, 36-Bit Page Size Extension
+ 1, 0, EDX, 18, psn, Processor Serial Number
+ 1, 0, EDX, 19, clflush, CLFLUSH instr
+# 1, 0, EDX, 20,
+ 1, 0, EDX, 21, ds, Debug Store
+ 1, 0, EDX, 22, acpi, Thermal Monitor and Software Controlled Clock Facilities
+ 1, 0, EDX, 23, mmx, Intel MMX Technology
+ 1, 0, EDX, 24, fxsr, XSAVE and FXRSTOR Instrs
+ 1, 0, EDX, 25, sse, SSE
+ 1, 0, EDX, 26, sse2, SSE2
+ 1, 0, EDX, 27, ss, Self Snoop
+ 1, 0, EDX, 28, hit, Max APIC IDs
+ 1, 0, EDX, 29, tm, Thermal Monitor
+# 1, 0, EDX, 30,
+ 1, 0, EDX, 31, pbe, Pending Break Enable
+
+# Leaf 02H
+# cache and TLB descriptor info
+
+# Leaf 03H
+# Precessor Serial Number, introduced on Pentium III, not valid for
+# latest models
+
+# Leaf 04H
+# thread/core and cache topology
+ 4, 0, EAX, 4:0, cache_type, Cache type like instr/data or unified
+ 4, 0, EAX, 7:5, cache_level, Cache Level (starts at 1)
+ 4, 0, EAX, 8, cache_self_init, Cache Self Initialization
+ 4, 0, EAX, 9, fully_associate, Fully Associative cache
+# 4, 0, EAX, 13:10, resvd, resvd
+ 4, 0, EAX, 25:14, max_logical_id, Max number of addressable IDs for logical processors sharing the cache
+ 4, 0, EAX, 31:26, max_phy_id, Max number of addressable IDs for processors in phy package
+
+ 4, 0, EBX, 11:0, cache_linesize, Size of a cache line in bytes
+ 4, 0, EBX, 21:12, cache_partition, Physical Line partitions
+ 4, 0, EBX, 31:22, cache_ways, Ways of associativity
+ 4, 0, ECX, 31:0, cache_sets, Number of Sets - 1
+ 4, 0, EDX, 0, c_wbinvd, 1 means WBINVD/INVD is not ganranteed to act upon lower level caches of non-originating threads sharing this cache
+ 4, 0, EDX, 1, c_incl, Whether cache is inclusive of lower cache level
+ 4, 0, EDX, 2, c_comp_index, Complex Cache Indexing
+
+# Leaf 05H
+# MONITOR/MWAIT
+ 5, 0, EAX, 15:0, min_mon_size, Smallest monitor line size in bytes
+ 5, 0, EBX, 15:0, max_mon_size, Largest monitor line size in bytes
+ 5, 0, ECX, 0, mwait_ext, Enum of Monitor-Mwait extensions supported
+ 5, 0, ECX, 1, mwait_irq_break, Largest monitor line size in bytes
+ 5, 0, EDX, 3:0, c0_sub_stats, Number of C0* sub C-states supported using MWAIT
+ 5, 0, EDX, 7:4, c1_sub_stats, Number of C1* sub C-states supported using MWAIT
+ 5, 0, EDX, 11:8, c2_sub_stats, Number of C2* sub C-states supported using MWAIT
+ 5, 0, EDX, 15:12, c3_sub_stats, Number of C3* sub C-states supported using MWAIT
+ 5, 0, EDX, 19:16, c4_sub_stats, Number of C4* sub C-states supported using MWAIT
+ 5, 0, EDX, 23:20, c5_sub_stats, Number of C5* sub C-states supported using MWAIT
+ 5, 0, EDX, 27:24, c6_sub_stats, Number of C6* sub C-states supported using MWAIT
+ 5, 0, EDX, 31:28, c7_sub_stats, Number of C7* sub C-states supported using MWAIT
+
+# Leaf 06H
+# Thermal & Power Management
+
+ 6, 0, EAX, 0, dig_temp, Digital temperature sensor supported
+ 6, 0, EAX, 1, turbo, Intel Turbo Boost
+ 6, 0, EAX, 2, arat, Always running APIC timer
+# 6, 0, EAX, 3, resv, Reserved
+ 6, 0, EAX, 4, pln, Power limit notifications supported
+ 6, 0, EAX, 5, ecmd, Clock modulation duty cycle extension supported
+ 6, 0, EAX, 6, ptm, Package thermal management supported
+ 6, 0, EAX, 7, hwp, HWP base register
+ 6, 0, EAX, 8, hwp_notify, HWP notification
+ 6, 0, EAX, 9, hwp_act_window, HWP activity window
+ 6, 0, EAX, 10, hwp_energy, HWP energy performance preference
+ 6, 0, EAX, 11, hwp_pkg_req, HWP package level request
+# 6, 0, EAX, 12, resv, Reserved
+ 6, 0, EAX, 13, hdc, HDC base registers supported
+ 6, 0, EAX, 14, turbo3, Turbo Boost Max 3.0
+ 6, 0, EAX, 15, hwp_cap, Highest Performance change supported
+ 6, 0, EAX, 16, hwp_peci, HWP PECI override is supported
+ 6, 0, EAX, 17, hwp_flex, Flexible HWP is supported
+ 6, 0, EAX, 18, hwp_fast, Fast access mode for the IA32_HWP_REQUEST MSR is supported
+# 6, 0, EAX, 19, resv, Reserved
+ 6, 0, EAX, 20, hwp_ignr, Ignoring Idle Logical Processor HWP request is supported
+
+ 6, 0, EBX, 3:0, therm_irq_thresh, Number of Interrupt Thresholds in Digital Thermal Sensor
+ 6, 0, ECX, 0, aperfmperf, Presence of IA32_MPERF and IA32_APERF
+ 6, 0, ECX, 3, energ_bias, Performance-energy bias preference supported
+
+# Leaf 07H
+# ECX == 0
+# AVX512 refers to https://en.wikipedia.org/wiki/AVX-512
+# XXX: Do we really need to enumerate each and every AVX512 sub features
+
+ 7, 0, EBX, 0, fsgsbase, RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE supported
+ 7, 0, EBX, 1, tsc_adjust, TSC_ADJUST MSR supported
+ 7, 0, EBX, 2, sgx, Software Guard Extensions
+ 7, 0, EBX, 3, bmi1, BMI1
+ 7, 0, EBX, 4, hle, Hardware Lock Elision
+ 7, 0, EBX, 5, avx2, AVX2
+# 7, 0, EBX, 6, fdp_excp_only, x87 FPU Data Pointer updated only on x87 exceptions
+ 7, 0, EBX, 7, smep, Supervisor-Mode Execution Prevention
+ 7, 0, EBX, 8, bmi2, BMI2
+ 7, 0, EBX, 9, rep_movsb, Enhanced REP MOVSB/STOSB
+ 7, 0, EBX, 10, invpcid, INVPCID instruction
+ 7, 0, EBX, 11, rtm, Restricted Transactional Memory
+ 7, 0, EBX, 12, rdt_m, Intel RDT Monitoring capability
+ 7, 0, EBX, 13, depc_fpu_cs_ds, Deprecates FPU CS and FPU DS
+ 7, 0, EBX, 14, mpx, Memory Protection Extensions
+ 7, 0, EBX, 15, rdt_a, Intel RDT Allocation capability
+ 7, 0, EBX, 16, avx512f, AVX512 Foundation instr
+ 7, 0, EBX, 17, avx512dq, AVX512 Double and Quadword AVX512 instr
+ 7, 0, EBX, 18, rdseed, RDSEED instr
+ 7, 0, EBX, 19, adx, ADX instr
+ 7, 0, EBX, 20, smap, Supervisor Mode Access Prevention
+ 7, 0, EBX, 21, avx512ifma, AVX512 Integer Fused Multiply Add
+# 7, 0, EBX, 22, resvd, resvd
+ 7, 0, EBX, 23, clflushopt, CLFLUSHOPT instr
+ 7, 0, EBX, 24, clwb, CLWB instr
+ 7, 0, EBX, 25, intel_pt, Intel Processor Trace instr
+ 7, 0, EBX, 26, avx512pf, Prefetch
+ 7, 0, EBX, 27, avx512er, AVX512 Exponent Reciproca instr
+ 7, 0, EBX, 28, avx512cd, AVX512 Conflict Detection instr
+ 7, 0, EBX, 29, sha, Intel Secure Hash Algorithm Extensions instr
+ 7, 0, EBX, 26, avx512bw, AVX512 Byte & Word instr
+ 7, 0, EBX, 28, avx512vl, AVX512 Vector Length Extentions (VL)
+ 7, 0, ECX, 0, prefetchwt1, X
+ 7, 0, ECX, 1, avx512vbmi, AVX512 Vector Byte Manipulation Instructions
+ 7, 0, ECX, 2, umip, User-mode Instruction Prevention
+
+ 7, 0, ECX, 3, pku, Protection Keys for User-mode pages
+ 7, 0, ECX, 4, ospke, CR4 PKE set to enable protection keys
+# 7, 0, ECX, 16:5, resvd, resvd
+ 7, 0, ECX, 21:17, mawau, The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode
+ 7, 0, ECX, 22, rdpid, RDPID and IA32_TSC_AUX
+# 7, 0, ECX, 29:23, resvd, resvd
+ 7, 0, ECX, 30, sgx_lc, SGX Launch Configuration
+# 7, 0, ECX, 31, resvd, resvd
+
+# Leaf 08H
+#
+
+
+# Leaf 09H
+# Direct Cache Access (DCA) information
+ 9, 0, ECX, 31:0, dca_cap, The value of IA32_PLATFORM_DCA_CAP
+
+# Leaf 0AH
+# Architectural Performance Monitoring
+#
+# Do we really need to print out the PMU related stuff?
+# Does normal user really care about it?
+#
+ 0xA, 0, EAX, 7:0, pmu_ver, Performance Monitoring Unit version
+ 0xA, 0, EAX, 15:8, pmu_gp_cnt_num, Numer of general-purose PMU counters per logical CPU
+ 0xA, 0, EAX, 23:16, pmu_cnt_bits, Bit wideth of PMU counter
+ 0xA, 0, EAX, 31:24, pmu_ebx_bits, Length of EBX bit vector to enumerate PMU events
+
+ 0xA, 0, EBX, 0, pmu_no_core_cycle_evt, Core cycle event not available
+ 0xA, 0, EBX, 1, pmu_no_instr_ret_evt, Instruction retired event not available
+ 0xA, 0, EBX, 2, pmu_no_ref_cycle_evt, Reference cycles event not available
+ 0xA, 0, EBX, 3, pmu_no_llc_ref_evt, Last-level cache reference event not available
+ 0xA, 0, EBX, 4, pmu_no_llc_mis_evt, Last-level cache misses event not available
+ 0xA, 0, EBX, 5, pmu_no_br_instr_ret_evt, Branch instruction retired event not available
+ 0xA, 0, EBX, 6, pmu_no_br_mispredict_evt, Branch mispredict retired event not available
+
+ 0xA, 0, ECX, 4:0, pmu_fixed_cnt_num, Performance Monitoring Unit version
+ 0xA, 0, ECX, 12:5, pmu_fixed_cnt_bits, Numer of PMU counters per logical CPU
+
+# Leaf 0BH
+# Extended Topology Enumeration Leaf
+#
+
+ 0xB, 0, EAX, 4:0, id_shift, Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type
+ 0xB, 0, EBX, 15:0, cpu_nr, Number of logical processors at this level type
+ 0xB, 0, ECX, 15:8, lvl_type, 0-Invalid 1-SMT 2-Core
+ 0xB, 0, EDX, 31:0, x2apic_id, x2APIC ID the current logical processor
+
+
+# Leaf 0DH
+# Processor Extended State
+
+ 0xD, 0, EAX, 0, x87, X87 state
+ 0xD, 0, EAX, 1, sse, SSE state
+ 0xD, 0, EAX, 2, avx, AVX state
+ 0xD, 0, EAX, 4:3, mpx, MPX state
+ 0xD, 0, EAX, 7:5, avx512, AVX-512 state
+ 0xD, 0, EAX, 9, pkru, PKRU state
+
+ 0xD, 0, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 0, ECX, 31:0, max_sz_xsave, Maximum size (bytes) of the XSAVE/XRSTOR save area
+
+ 0xD, 1, EAX, 0, xsaveopt, XSAVEOPT available
+ 0xD, 1, EAX, 1, xsavec, XSAVEC and compacted form supported
+ 0xD, 1, EAX, 2, xgetbv, XGETBV supported
+ 0xD, 1, EAX, 3, xsaves, XSAVES/XRSTORS and IA32_XSS supported
+
+ 0xD, 1, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 1, ECX, 8, pt, PT state
+ 0xD, 1, ECX, 11, cet_usr, CET user state
+ 0xD, 1, ECX, 12, cet_supv, CET supervisor state
+ 0xD, 1, ECX, 13, hdc, HDC state
+ 0xD, 1, ECX, 16, hwp, HWP state
+
+# Leaf 0FH
+# Intel RDT Monitoring
+
+ 0xF, 0, EBX, 31:0, rmid_range, Maximum range (zero-based) of RMID within this physical processor of all types
+ 0xF, 0, EDX, 1, l3c_rdt_mon, L3 Cache RDT Monitoring supported
+
+ 0xF, 1, ECX, 31:0, rmid_range, Maximum range (zero-based) of RMID of this types
+ 0xF, 1, EDX, 0, l3c_ocp_mon, L3 Cache occupancy Monitoring supported
+ 0xF, 1, EDX, 1, l3c_tbw_mon, L3 Cache Total Bandwidth Monitoring supported
+ 0xF, 1, EDX, 2, l3c_lbw_mon, L3 Cache Local Bandwidth Monitoring supported
+
+# Leaf 10H
+# Intel RDT Allocation
+
+ 0x10, 0, EBX, 1, l3c_rdt_alloc, L3 Cache Allocation supported
+ 0x10, 0, EBX, 2, l2c_rdt_alloc, L2 Cache Allocation supported
+ 0x10, 0, EBX, 3, mem_bw_alloc, Memory Bandwidth Allocation supported
+
+
+# Leaf 12H
+# SGX Capability
+#
+# Some detailed SGX features not added yet
+
+ 0x12, 0, EAX, 0, sgx1, L3 Cache Allocation supported
+ 0x12, 1, EAX, 0, sgx2, L3 Cache Allocation supported
+
+
+# Leaf 14H
+# Intel Processor Tracer
+#
+
+# Leaf 15H
+# Time Stamp Counter and Nominal Core Crystal Clock Information
+
+ 0x15, 0, EAX, 31:0, tsc_denominator, The denominator of the TSC/”core crystal clock” ratio
+ 0x15, 0, EBX, 31:0, tsc_numerator, The numerator of the TSC/”core crystal clock” ratio
+ 0x15, 0, ECX, 31:0, nom_freq, Nominal frequency of the core crystal clock in Hz
+
+# Leaf 16H
+# Processor Frequency Information
+
+ 0x16, 0, EAX, 15:0, cpu_base_freq, Processor Base Frequency in MHz
+ 0x16, 0, EBX, 15:0, cpu_max_freq, Maximum Frequency in MHz
+ 0x16, 0, ECX, 15:0, bus_freq, Bus (Reference) Frequency in MHz
+
+# Leaf 17H
+# System-On-Chip Vendor Attribute
+
+ 0x17, 0, EAX, 31:0, max_socid, Maximum input value of supported sub-leaf
+ 0x17, 0, EBX, 15:0, soc_vid, SOC Vendor ID
+ 0x17, 0, EBX, 16, std_vid, SOC Vendor ID is assigned via an industry standard scheme
+ 0x17, 0, ECX, 31:0, soc_pid, SOC Project ID assigned by vendor
+ 0x17, 0, EDX, 31:0, soc_sid, SOC Stepping ID
+
+# Leaf 18H
+# Deterministic Address Translation Parameters
+
+
+# Leaf 19H
+# Key Locker Leaf
+
+
+# Leaf 1AH
+# Hybrid Information
+
+ 0x1A, 0, EAX, 31:24, core_type, 20H-Intel_Atom 40H-Intel_Core
+
+
+# Leaf 1FH
+# V2 Extended Topology - A preferred superset to leaf 0BH
+
+
+# According to SDM
+# 40000000H - 4FFFFFFFH is invalid range
+
+
+# Leaf 80000001H
+# Extended Processor Signature and Feature Bits
+
+0x80000001, 0, ECX, 0, lahf_lm, LAHF/SAHF available in 64-bit mode
+0x80000001, 0, ECX, 5, lzcnt, LZCNT
+0x80000001, 0, ECX, 8, prefetchw, PREFETCHW
+
+0x80000001, 0, EDX, 11, sysret, SYSCALL/SYSRET supported
+0x80000001, 0, EDX, 20, exec_dis, Execute Disable Bit available
+0x80000001, 0, EDX, 26, 1gb_page, 1GB page supported
+0x80000001, 0, EDX, 27, rdtscp, RDTSCP and IA32_TSC_AUX are available
+#0x80000001, 0, EDX, 29, 64b, 64b Architecture supported
+
+# Leaf 80000002H/80000003H/80000004H
+# Processor Brand String
+
+# Leaf 80000005H
+# Reserved
+
+# Leaf 80000006H
+# Extended L2 Cache Features
+
+0x80000006, 0, ECX, 7:0, clsize, Cache Line size in bytes
+0x80000006, 0, ECX, 15:12, l2c_assoc, L2 Associativity
+0x80000006, 0, ECX, 31:16, csize, Cache size in 1K units
+
+
+# Leaf 80000007H
+
+0x80000007, 0, EDX, 8, nonstop_tsc, Invariant TSC available
+
+
+# Leaf 80000008H
+
+0x80000008, 0, EAX, 7:0, phy_adr_bits, Physical Address Bits
+0x80000008, 0, EAX, 15:8, lnr_adr_bits, Linear Address Bits
+0x80000007, 0, EBX, 9, wbnoinvd, WBNOINVD
+
+# 0x8000001E
+# EAX: Extended APIC ID
+0x8000001E, 0, EAX, 31:0, extended_apic_id, Extended APIC ID
+# EBX: Core Identifiers
+0x8000001E, 0, EBX, 7:0, core_id, Identifies the logical core ID
+0x8000001E, 0, EBX, 15:8, threads_per_core, The number of threads per core is threads_per_core + 1
+# ECX: Node Identifiers
+0x8000001E, 0, ECX, 7:0, node_id, Node ID
+0x8000001E, 0, ECX, 10:8, nodes_per_processor, Nodes per processor { 0: 1 node, else reserved }
+
+# 8000001F: AMD Secure Encryption
+0x8000001F, 0, EAX, 0, sme, Secure Memory Encryption
+0x8000001F, 0, EAX, 1, sev, Secure Encrypted Virtualization
+0x8000001F, 0, EAX, 2, vmpgflush, VM Page Flush MSR
+0x8000001F, 0, EAX, 3, seves, SEV Encrypted State
+0x8000001F, 0, EBX, 5:0, c-bit, Page table bit number used to enable memory encryption
+0x8000001F, 0, EBX, 11:6, mem_encrypt_physaddr_width, Reduction of physical address space in bits with SME enabled
+0x8000001F, 0, ECX, 31:0, num_encrypted_guests, Maximum ASID value that may be used for an SEV-enabled guest
+0x8000001F, 0, EDX, 31:0, minimum_sev_asid, Minimum ASID value that must be used for an SEV-enabled, SEV-ES-disabled guest
\ No newline at end of file
diff --git a/tools/arch/x86/kcpuid/kcpuid.c b/tools/arch/x86/kcpuid/kcpuid.c
new file mode 100644
index 000000000000..936a3a2aad04
--- /dev/null
+++ b/tools/arch/x86/kcpuid/kcpuid.c
@@ -0,0 +1,657 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+char *def_csv = "/usr/share/misc/cpuid.csv";
+char *user_csv;
+
+
+/* Cover both single-bit flag and multiple-bits fields */
+struct bits_desc {
+ /* start and end bits */
+ int start, end;
+ /* 0 or 1 for 1-bit flag */
+ int value;
+ char simp[32];
+ char detail[256];
+};
+
+/* descriptor info for eax/ebx/ecx/edx */
+struct reg_desc {
+ /* number of valid entries */
+ int nr;
+ struct bits_desc descs[32];
+};
+
+enum {
+ R_EAX = 0,
+ R_EBX,
+ R_ECX,
+ R_EDX,
+ NR_REGS
+};
+
+struct subleaf {
+ u32 index;
+ u32 sub;
+ u32 eax, ebx, ecx, edx;
+ struct reg_desc info[NR_REGS];
+};
+
+/* Represent one leaf (basic or extended) */
+struct cpuid_func {
+ /*
+ * Array of subleafs for this func, if there is no subleafs
+ * then the leafs[0] is the main leaf
+ */
+ struct subleaf *leafs;
+ int nr;
+};
+
+struct cpuid_range {
+ /* array of main leafs */
+ struct cpuid_func *funcs;
+ /* number of valid leafs */
+ int nr;
+ bool is_ext;
+};
+
+/*
+ * basic: basic functions range: [0... ]
+ * ext: extended functions range: [0x80000000... ]
+ */
+struct cpuid_range *leafs_basic, *leafs_ext;
+
+static int num_leafs;
+static bool is_amd;
+static bool show_details;
+static bool show_raw;
+static bool show_flags_only = true;
+static u32 user_index = 0xFFFFFFFF;
+static u32 user_sub = 0xFFFFFFFF;
+static int flines;
+
+static inline void cpuid(u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+{
+ /* ecx is often an input as well as an output. */
+ asm volatile("cpuid"
+ : "=a" (*eax),
+ "=b" (*ebx),
+ "=c" (*ecx),
+ "=d" (*edx)
+ : "0" (*eax), "2" (*ecx));
+}
+
+static inline bool has_subleafs(u32 f)
+{
+ if (f == 0x7 || f == 0xd)
+ return true;
+
+ if (is_amd) {
+ if (f == 0x8000001d)
+ return true;
+ return false;
+ }
+
+ switch (f) {
+ case 0x4:
+ case 0xb:
+ case 0xf:
+ case 0x10:
+ case 0x14:
+ case 0x18:
+ case 0x1f:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static void leaf_print_raw(struct subleaf *leaf)
+{
+ if (has_subleafs(leaf->index)) {
+ if (leaf->sub == 0)
+ printf("0x%08x: subleafs:\n", leaf->index);
+
+ printf(" %2d: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->sub, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ } else {
+ printf("0x%08x: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->index, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ }
+}
+
+/* Return true is the input eax/ebx/ecx/edx are all zero */
+static bool cpuid_store(struct cpuid_range *range, u32 f, int subleaf,
+ u32 a, u32 b, u32 c, u32 d)
+{
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ int s = 0;
+
+ if (a == 0 && b == 0 && c == 0 && d == 0)
+ return true;
+
+ /*
+ * Cut off vendor-prefix from CPUID function as we're using it as an
+ * index into ->funcs.
+ */
+ func = &range->funcs[f & 0xffff];
+
+ if (!func->leafs) {
+ func->leafs = malloc(sizeof(struct subleaf));
+ if (!func->leafs)
+ perror("malloc func leaf");
+
+ func->nr = 1;
+ } else {
+ s = func->nr;
+ func->leafs = realloc(func->leafs, (s + 1) * sizeof(*leaf));
+ if (!func->leafs)
+ perror("realloc f->leafs");
+
+ func->nr++;
+ }
+
+ leaf = &func->leafs[s];
+
+ leaf->index = f;
+ leaf->sub = subleaf;
+ leaf->eax = a;
+ leaf->ebx = b;
+ leaf->ecx = c;
+ leaf->edx = d;
+
+ return false;
+}
+
+static void raw_dump_range(struct cpuid_range *range)
+{
+ u32 f;
+ int i;
+
+ printf("%s Leafs :\n", range->is_ext ? "Extended" : "Basic");
+ printf("================\n");
+
+ for (f = 0; (int)f < range->nr; f++) {
+ struct cpuid_func *func = &range->funcs[f];
+ u32 index = f;
+
+ if (range->is_ext)
+ index += 0x80000000;
+
+ /* Skip leaf without valid items */
+ if (!func->nr)
+ continue;
+
+ /* First item is the main leaf, followed by all subleafs */
+ for (i = 0; i < func->nr; i++)
+ leaf_print_raw(&func->leafs[i]);
+ }
+}
+
+#define MAX_SUBLEAF_NUM 32
+struct cpuid_range *setup_cpuid_range(u32 input_eax)
+{
+ u32 max_func, idx_func;
+ int subleaf;
+ struct cpuid_range *range;
+ u32 eax, ebx, ecx, edx;
+ u32 f = input_eax;
+ int max_subleaf;
+ bool allzero;
+
+ eax = input_eax;
+ ebx = ecx = edx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ max_func = eax;
+ idx_func = (max_func & 0xffff) + 1;
+
+ range = malloc(sizeof(struct cpuid_range));
+ if (!range)
+ perror("malloc range");
+
+ if (input_eax & 0x80000000)
+ range->is_ext = true;
+ else
+ range->is_ext = false;
+
+ range->funcs = malloc(sizeof(struct cpuid_func) * idx_func);
+ if (!range->funcs)
+ perror("malloc range->funcs");
+
+ range->nr = idx_func;
+ memset(range->funcs, 0, sizeof(struct cpuid_func) * idx_func);
+
+ for (; f <= max_func; f++) {
+ eax = f;
+ subleaf = ecx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf, eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+
+ if (!has_subleafs(f))
+ continue;
+
+ max_subleaf = MAX_SUBLEAF_NUM;
+
+ /*
+ * Some can provide the exact number of subleafs,
+ * others have to be tried (0xf)
+ */
+ if (f == 0x7 || f == 0x14 || f == 0x17 || f == 0x18)
+ max_subleaf = (eax & 0xff) + 1;
+
+ if (f == 0xb)
+ max_subleaf = 2;
+
+ for (subleaf = 1; subleaf < max_subleaf; subleaf++) {
+ eax = f;
+ ecx = subleaf;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf,
+ eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+ }
+
+ }
+
+ return range;
+}
+
+/*
+ * The basic row format for cpuid.csv is
+ * LEAF,SUBLEAF,register_name,bits,short name,long description
+ *
+ * like:
+ * 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+ * 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ */
+static int parse_line(char *line)
+{
+ char *str;
+ int i;
+ struct cpuid_range *range;
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ u32 index;
+ u32 sub;
+ char buffer[512];
+ char *buf;
+ /*
+ * Tokens:
+ * 1. leaf
+ * 2. subleaf
+ * 3. register
+ * 4. bits
+ * 5. short name
+ * 6. long detail
+ */
+ char *tokens[6];
+ struct reg_desc *reg;
+ struct bits_desc *bdesc;
+ int reg_index;
+ char *start, *end;
+
+ /* Skip comments and NULL line */
+ if (line[0] == '#' || line[0] == '\n')
+ return 0;
+
+ strncpy(buffer, line, 511);
+ buffer[511] = 0;
+ str = buffer;
+ for (i = 0; i < 5; i++) {
+ tokens[i] = strtok(str, ",");
+ if (!tokens[i])
+ goto err_exit;
+ str = NULL;
+ }
+ tokens[5] = strtok(str, "\n");
+ if (!tokens[5])
+ goto err_exit;
+
+ /* index/main-leaf */
+ index = strtoull(tokens[0], NULL, 0);
+
+ if (index & 0x80000000)
+ range = leafs_ext;
+ else
+ range = leafs_basic;
+
+ index &= 0x7FFFFFFF;
+ /* Skip line parsing for non-existing indexes */
+ if ((int)index >= range->nr)
+ return -1;
+
+ func = &range->funcs[index];
+
+ /* Return if the index has no valid item on this platform */
+ if (!func->nr)
+ return 0;
+
+ /* subleaf */
+ sub = strtoul(tokens[1], NULL, 0);
+ if ((int)sub > func->nr)
+ return -1;
+
+ leaf = &func->leafs[sub];
+ buf = tokens[2];
+
+ if (strcasestr(buf, "EAX"))
+ reg_index = R_EAX;
+ else if (strcasestr(buf, "EBX"))
+ reg_index = R_EBX;
+ else if (strcasestr(buf, "ECX"))
+ reg_index = R_ECX;
+ else if (strcasestr(buf, "EDX"))
+ reg_index = R_EDX;
+ else
+ goto err_exit;
+
+ reg = &leaf->info[reg_index];
+ bdesc = ®->descs[reg->nr++];
+
+ /* bit flag or bits field */
+ buf = tokens[3];
+
+ end = strtok(buf, ":");
+ bdesc->end = strtoul(end, NULL, 0);
+ bdesc->start = bdesc->end;
+
+ /* start != NULL means it is bit fields */
+ start = strtok(NULL, ":");
+ if (start)
+ bdesc->start = strtoul(start, NULL, 0);
+
+ strcpy(bdesc->simp, tokens[4]);
+ strcpy(bdesc->detail, tokens[5]);
+ return 0;
+
+err_exit:
+ printf("Warning: wrong line format:\n");
+ printf("\tline[%d]: %s\n", flines, line);
+ return -1;
+}
+
+/* Parse csv file, and construct the array of all leafs and subleafs */
+static void parse_text(void)
+{
+ FILE *file;
+ char *filename, *line = NULL;
+ size_t len = 0;
+ int ret;
+
+ if (show_raw)
+ return;
+
+ filename = user_csv ? user_csv : def_csv;
+ file = fopen(filename, "r");
+ if (!file) {
+ /* Fallback to a csv in the same dir */
+ file = fopen("./cpuid.csv", "r");
+ }
+
+ if (!file) {
+ printf("Fail to open '%s'\n", filename);
+ return;
+ }
+
+ while (1) {
+ ret = getline(&line, &len, file);
+ flines++;
+ if (ret > 0)
+ parse_line(line);
+
+ if (feof(file))
+ break;
+ }
+
+ fclose(file);
+}
+
+
+/* Decode every eax/ebx/ecx/edx */
+static void decode_bits(u32 value, struct reg_desc *rdesc)
+{
+ struct bits_desc *bdesc;
+ int start, end, i;
+ u32 mask;
+
+ for (i = 0; i < rdesc->nr; i++) {
+ bdesc = &rdesc->descs[i];
+
+ start = bdesc->start;
+ end = bdesc->end;
+ if (start == end) {
+ /* single bit flag */
+ if (value & (1 << start))
+ printf("\t%-20s %s%s\n",
+ bdesc->simp,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ } else {
+ /* bit fields */
+ if (show_flags_only)
+ continue;
+
+ mask = ((u64)1 << (end - start + 1)) - 1;
+ printf("\t%-20s\t: 0x%-8x\t%s%s\n",
+ bdesc->simp,
+ (value >> start) & mask,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ }
+ }
+}
+
+static void show_leaf(struct subleaf *leaf)
+{
+ if (!leaf)
+ return;
+
+ if (show_raw)
+ leaf_print_raw(leaf);
+
+ decode_bits(leaf->eax, &leaf->info[R_EAX]);
+ decode_bits(leaf->ebx, &leaf->info[R_EBX]);
+ decode_bits(leaf->ecx, &leaf->info[R_ECX]);
+ decode_bits(leaf->edx, &leaf->info[R_EDX]);
+}
+
+static void show_func(struct cpuid_func *func)
+{
+ int i;
+
+ if (!func)
+ return;
+
+ for (i = 0; i < func->nr; i++)
+ show_leaf(&func->leafs[i]);
+}
+
+static void show_range(struct cpuid_range *range)
+{
+ int i;
+
+ for (i = 0; i < range->nr; i++)
+ show_func(&range->funcs[i]);
+}
+
+static inline struct cpuid_func *index_to_func(u32 index)
+{
+ struct cpuid_range *range;
+
+ range = (index & 0x80000000) ? leafs_ext : leafs_basic;
+ index &= 0x7FFFFFFF;
+
+ if (((index & 0xFFFF) + 1) > (u32)range->nr) {
+ printf("ERR: invalid input index (0x%x)\n", index);
+ return NULL;
+ }
+ return &range->funcs[index];
+}
+
+static void show_info(void)
+{
+ struct cpuid_func *func;
+
+ if (show_raw) {
+ /* Show all of the raw output of 'cpuid' instr */
+ raw_dump_range(leafs_basic);
+ raw_dump_range(leafs_ext);
+ return;
+ }
+
+ if (user_index != 0xFFFFFFFF) {
+ /* Only show specific leaf/subleaf info */
+ func = index_to_func(user_index);
+ if (!func)
+ return;
+
+ /* Dump the raw data also */
+ show_raw = true;
+
+ if (user_sub != 0xFFFFFFFF) {
+ if (user_sub + 1 <= (u32)func->nr) {
+ show_leaf(&func->leafs[user_sub]);
+ return;
+ }
+
+ printf("ERR: invalid input subleaf (0x%x)\n", user_sub);
+ }
+
+ show_func(func);
+ return;
+ }
+
+ printf("CPU features:\n=============\n\n");
+ show_range(leafs_basic);
+ show_range(leafs_ext);
+}
+
+static void setup_platform_cpuid(void)
+{
+ u32 eax, ebx, ecx, edx;
+
+ /* Check vendor */
+ eax = ebx = ecx = edx = 0;
+ cpuid(&eax, &ebx, &ecx, &edx);
+
+ /* "htuA" */
+ if (ebx == 0x68747541)
+ is_amd = true;
+
+ /* Setup leafs for the basic and extended range */
+ leafs_basic = setup_cpuid_range(0x0);
+ leafs_ext = setup_cpuid_range(0x80000000);
+}
+
+static void usage(void)
+{
+ printf("kcpuid [-abdfhr] [-l leaf] [-s subleaf]\n"
+ "\t-a|--all Show both bit flags and complex bit fields info\n"
+ "\t-b|--bitflags Show boolean flags only\n"
+ "\t-d|--detail Show details of the flag/fields (default)\n"
+ "\t-f|--flags Specify the cpuid csv file\n"
+ "\t-h|--help Show usage info\n"
+ "\t-l|--leaf=index Specify the leaf you want to check\n"
+ "\t-r|--raw Show raw cpuid data\n"
+ "\t-s|--subleaf=sub Specify the subleaf you want to check\n"
+ );
+}
+
+static struct option opts[] = {
+ { "all", no_argument, NULL, 'a' }, /* show both bit flags and fields */
+ { "bitflags", no_argument, NULL, 'b' }, /* only show bit flags, default on */
+ { "detail", no_argument, NULL, 'd' }, /* show detail descriptions */
+ { "file", required_argument, NULL, 'f' }, /* use user's cpuid file */
+ { "help", no_argument, NULL, 'h'}, /* show usage */
+ { "leaf", required_argument, NULL, 'l'}, /* only check a specific leaf */
+ { "raw", no_argument, NULL, 'r'}, /* show raw CPUID leaf data */
+ { "subleaf", required_argument, NULL, 's'}, /* check a specific subleaf */
+ { NULL, 0, NULL, 0 }
+};
+
+static int parse_options(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt_long(argc, argv, "abdf:hl:rs:",
+ opts, NULL)) != -1)
+ switch (c) {
+ case 'a':
+ show_flags_only = false;
+ break;
+ case 'b':
+ show_flags_only = true;
+ break;
+ case 'd':
+ show_details = true;
+ break;
+ case 'f':
+ user_csv = optarg;
+ break;
+ case 'h':
+ usage();
+ exit(1);
+ break;
+ case 'l':
+ /* main leaf */
+ user_index = strtoul(optarg, NULL, 0);
+ break;
+ case 'r':
+ show_raw = true;
+ break;
+ case 's':
+ /* subleaf */
+ user_sub = strtoul(optarg, NULL, 0);
+ break;
+ default:
+ printf("%s: Invalid option '%c'\n", argv[0], optopt);
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * Do 4 things in turn:
+ * 1. Parse user options
+ * 2. Parse and store all the CPUID leaf data supported on this platform
+ * 2. Parse the csv file, while skipping leafs which are not available
+ * on this platform
+ * 3. Print leafs info based on user options
+ */
+int main(int argc, char *argv[])
+{
+ if (parse_options(argc, argv))
+ return -1;
+
+ /* Setup the cpuid leafs of current platform */
+ setup_platform_cpuid();
+
+ /* Read and parse the 'cpuid.csv' */
+ parse_text();
+
+ show_info();
+ return 0;
+}
\ No newline at end of file
--
2.30.0
1
0
From: Feng Tang <feng.tang(a)intel.com>
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMMB?from=project-issue
CVE: NA
------------------------------------
End users frequently want to know what features their processor
supports, independent of what the kernel supports.
/proc/cpuinfo is great. It is omnipresent and since it is provided by
the kernel it is always as up to date as the kernel. But, it could be
ambiguous about processor features which can be disabled by the kernel
at boot-time or compile-time.
There are some user space tools showing more raw features, but they are
not bound with kernel, and go with distros. Many end users are still
using old distros with new kernels (upgraded by themselves), and may
not upgrade the distros only to get a newer tool.
So here arise the need for a new tool, which
* shows raw CPU features read from the CPUID instruction
* will be easier to update compared to existing userspace
tooling (perhaps distributed like perf)
* inherits "modern" kernel development process, in contrast to some
of the existing userspace CPUID tools which are still being
developed
without git and distributed in tarballs from non-https sites.
* Can produce output consistent with /proc/cpuinfo to make comparison
easier.
The CPUID leaf definitions are kept in an .csv file which allows for
updating only that file to add support for new feature leafs.
This is based on prototype code from Borislav Petkov
(http://sr71.net/~dave/intel/stupid-cpuid.c)
[ bp:
- Massage, add #define _GNU_SOURCE to fix implicit declaration of
function ‘strcasestr' warning
- remove superfluous newlines
- fallback to cpuid.csv in the current dir if none found
- fix typos
- move comments over the lines instead of sideways. ]
Originally-from: Borislav Petkov <bp(a)alien8.de>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Suggested-by: Borislav Petkov <bp(a)alien8.de>
Signed-off-by: Feng Tang <feng.tang(a)intel.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
tools/arch/x86/kcpuid/Makefile | 24 ++
tools/arch/x86/kcpuid/cpuid.csv | 400 +++++++++++++++++++
tools/arch/x86/kcpuid/kcpuid.c | 657 ++++++++++++++++++++++++++++++++
3 files changed, 1081 insertions(+)
create mode 100644 tools/arch/x86/kcpuid/Makefile
create mode 100644 tools/arch/x86/kcpuid/cpuid.csv
create mode 100644 tools/arch/x86/kcpuid/kcpuid.c
diff --git a/tools/arch/x86/kcpuid/Makefile b/tools/arch/x86/kcpuid/Makefile
new file mode 100644
index 000000000000..a2f1e855092c
--- /dev/null
+++ b/tools/arch/x86/kcpuid/Makefile
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for x86/kcpuid tool
+
+kcpuid : kcpuid.c
+
+CFLAGS = -Wextra
+
+BINDIR ?= /usr/sbin
+
+HWDATADIR ?= /usr/share/misc/
+
+override CFLAGS += -O2 -Wall -I../../../include
+
+%: %.c
+ $(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
+
+.PHONY : clean
+clean :
+ @rm -f kcpuid
+
+install : kcpuid
+ install -d $(DESTDIR)$(BINDIR)
+ install -m 755 -p kcpuid $(DESTDIR)$(BINDIR)/kcpuid
+ install -m 444 -p cpuid.csv $(HWDATADIR)/cpuid.csv
diff --git a/tools/arch/x86/kcpuid/cpuid.csv b/tools/arch/x86/kcpuid/cpuid.csv
new file mode 100644
index 000000000000..e422fad9a98e
--- /dev/null
+++ b/tools/arch/x86/kcpuid/cpuid.csv
@@ -0,0 +1,400 @@
+# The basic row format is:
+# LEAF, SUBLEAF, register_name, bits, short_name, long_description
+
+# Leaf 00H
+ 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+
+# Leaf 01H
+ 1, 0, EAX, 3:0, stepping, Stepping ID
+ 1, 0, EAX, 7:4, model, Model
+ 1, 0, EAX, 11:8, family, Family ID
+ 1, 0, EAX, 13:12, processor, Processor Type
+ 1, 0, EAX, 19:16, model_ext, Extended Model ID
+ 1, 0, EAX, 27:20, family_ext, Extended Family ID
+
+ 1, 0, EBX, 7:0, brand, Brand Index
+ 1, 0, EBX, 15:8, clflush_size, CLFLUSH line size (value * 8) in bytes
+ 1, 0, EBX, 23:16, max_cpu_id, Maxim number of addressable logic cpu in this package
+ 1, 0, EBX, 31:24, apic_id, Initial APIC ID
+
+ 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ 1, 0, ECX, 1, pclmulqdq, PCLMULQDQ instruction supported
+ 1, 0, ECX, 2, dtes64, DS area uses 64-bit layout
+ 1, 0, ECX, 3, mwait, MONITOR/MWAIT supported
+ 1, 0, ECX, 4, ds_cpl, CPL Qualified Debug Store which allows for branch message storage qualified by CPL
+ 1, 0, ECX, 5, vmx, Virtual Machine Extensions supported
+ 1, 0, ECX, 6, smx, Safer Mode Extension supported
+ 1, 0, ECX, 7, eist, Enhanced Intel SpeedStep Technology
+ 1, 0, ECX, 8, tm2, Thermal Monitor 2
+ 1, 0, ECX, 9, ssse3, Supplemental Streaming SIMD Extensions 3 (SSSE3)
+ 1, 0, ECX, 10, l1_ctx_id, L1 data cache could be set to either adaptive mode or shared mode (check IA32_MISC_ENABLE bit 24 definition)
+ 1, 0, ECX, 11, sdbg, IA32_DEBUG_INTERFACE MSR for silicon debug supported
+ 1, 0, ECX, 12, fma, FMA extensions using YMM state supported
+ 1, 0, ECX, 13, cmpxchg16b, 'CMPXCHG16B - Compare and Exchange Bytes' supported
+ 1, 0, ECX, 14, xtpr_update, xTPR Update Control supported
+ 1, 0, ECX, 15, pdcm, Perfmon and Debug Capability present
+ 1, 0, ECX, 17, pcid, Process-Context Identifiers feature present
+ 1, 0, ECX, 18, dca, Prefetching data from a memory mapped device supported
+ 1, 0, ECX, 19, sse4_1, SSE4.1 feature present
+ 1, 0, ECX, 20, sse4_2, SSE4.2 feature present
+ 1, 0, ECX, 21, x2apic, x2APIC supported
+ 1, 0, ECX, 22, movbe, MOVBE instruction supported
+ 1, 0, ECX, 23, popcnt, POPCNT instruction supported
+ 1, 0, ECX, 24, tsc_deadline_timer, LAPIC supports one-shot operation using a TSC deadline value
+ 1, 0, ECX, 25, aesni, AESNI instruction supported
+ 1, 0, ECX, 26, xsave, XSAVE/XRSTOR processor extended states (XSETBV/XGETBV/XCR0)
+ 1, 0, ECX, 27, osxsave, OS has set CR4.OSXSAVE bit to enable XSETBV/XGETBV/XCR0
+ 1, 0, ECX, 28, avx, AVX instruction supported
+ 1, 0, ECX, 29, f16c, 16-bit floating-point conversion instruction supported
+ 1, 0, ECX, 30, rdrand, RDRAND instruction supported
+
+ 1, 0, EDX, 0, fpu, x87 FPU on chip
+ 1, 0, EDX, 1, vme, Virtual-8086 Mode Enhancement
+ 1, 0, EDX, 2, de, Debugging Extensions
+ 1, 0, EDX, 3, pse, Page Size Extensions
+ 1, 0, EDX, 4, tsc, Time Stamp Counter
+ 1, 0, EDX, 5, msr, RDMSR and WRMSR Support
+ 1, 0, EDX, 6, pae, Physical Address Extensions
+ 1, 0, EDX, 7, mce, Machine Check Exception
+ 1, 0, EDX, 8, cx8, CMPXCHG8B instr
+ 1, 0, EDX, 9, apic, APIC on Chip
+ 1, 0, EDX, 11, sep, SYSENTER and SYSEXIT instrs
+ 1, 0, EDX, 12, mtrr, Memory Type Range Registers
+ 1, 0, EDX, 13, pge, Page Global Bit
+ 1, 0, EDX, 14, mca, Machine Check Architecture
+ 1, 0, EDX, 15, cmov, Conditional Move Instrs
+ 1, 0, EDX, 16, pat, Page Attribute Table
+ 1, 0, EDX, 17, pse36, 36-Bit Page Size Extension
+ 1, 0, EDX, 18, psn, Processor Serial Number
+ 1, 0, EDX, 19, clflush, CLFLUSH instr
+# 1, 0, EDX, 20,
+ 1, 0, EDX, 21, ds, Debug Store
+ 1, 0, EDX, 22, acpi, Thermal Monitor and Software Controlled Clock Facilities
+ 1, 0, EDX, 23, mmx, Intel MMX Technology
+ 1, 0, EDX, 24, fxsr, XSAVE and FXRSTOR Instrs
+ 1, 0, EDX, 25, sse, SSE
+ 1, 0, EDX, 26, sse2, SSE2
+ 1, 0, EDX, 27, ss, Self Snoop
+ 1, 0, EDX, 28, hit, Max APIC IDs
+ 1, 0, EDX, 29, tm, Thermal Monitor
+# 1, 0, EDX, 30,
+ 1, 0, EDX, 31, pbe, Pending Break Enable
+
+# Leaf 02H
+# cache and TLB descriptor info
+
+# Leaf 03H
+# Precessor Serial Number, introduced on Pentium III, not valid for
+# latest models
+
+# Leaf 04H
+# thread/core and cache topology
+ 4, 0, EAX, 4:0, cache_type, Cache type like instr/data or unified
+ 4, 0, EAX, 7:5, cache_level, Cache Level (starts at 1)
+ 4, 0, EAX, 8, cache_self_init, Cache Self Initialization
+ 4, 0, EAX, 9, fully_associate, Fully Associative cache
+# 4, 0, EAX, 13:10, resvd, resvd
+ 4, 0, EAX, 25:14, max_logical_id, Max number of addressable IDs for logical processors sharing the cache
+ 4, 0, EAX, 31:26, max_phy_id, Max number of addressable IDs for processors in phy package
+
+ 4, 0, EBX, 11:0, cache_linesize, Size of a cache line in bytes
+ 4, 0, EBX, 21:12, cache_partition, Physical Line partitions
+ 4, 0, EBX, 31:22, cache_ways, Ways of associativity
+ 4, 0, ECX, 31:0, cache_sets, Number of Sets - 1
+ 4, 0, EDX, 0, c_wbinvd, 1 means WBINVD/INVD is not ganranteed to act upon lower level caches of non-originating threads sharing this cache
+ 4, 0, EDX, 1, c_incl, Whether cache is inclusive of lower cache level
+ 4, 0, EDX, 2, c_comp_index, Complex Cache Indexing
+
+# Leaf 05H
+# MONITOR/MWAIT
+ 5, 0, EAX, 15:0, min_mon_size, Smallest monitor line size in bytes
+ 5, 0, EBX, 15:0, max_mon_size, Largest monitor line size in bytes
+ 5, 0, ECX, 0, mwait_ext, Enum of Monitor-Mwait extensions supported
+ 5, 0, ECX, 1, mwait_irq_break, Largest monitor line size in bytes
+ 5, 0, EDX, 3:0, c0_sub_stats, Number of C0* sub C-states supported using MWAIT
+ 5, 0, EDX, 7:4, c1_sub_stats, Number of C1* sub C-states supported using MWAIT
+ 5, 0, EDX, 11:8, c2_sub_stats, Number of C2* sub C-states supported using MWAIT
+ 5, 0, EDX, 15:12, c3_sub_stats, Number of C3* sub C-states supported using MWAIT
+ 5, 0, EDX, 19:16, c4_sub_stats, Number of C4* sub C-states supported using MWAIT
+ 5, 0, EDX, 23:20, c5_sub_stats, Number of C5* sub C-states supported using MWAIT
+ 5, 0, EDX, 27:24, c6_sub_stats, Number of C6* sub C-states supported using MWAIT
+ 5, 0, EDX, 31:28, c7_sub_stats, Number of C7* sub C-states supported using MWAIT
+
+# Leaf 06H
+# Thermal & Power Management
+
+ 6, 0, EAX, 0, dig_temp, Digital temperature sensor supported
+ 6, 0, EAX, 1, turbo, Intel Turbo Boost
+ 6, 0, EAX, 2, arat, Always running APIC timer
+# 6, 0, EAX, 3, resv, Reserved
+ 6, 0, EAX, 4, pln, Power limit notifications supported
+ 6, 0, EAX, 5, ecmd, Clock modulation duty cycle extension supported
+ 6, 0, EAX, 6, ptm, Package thermal management supported
+ 6, 0, EAX, 7, hwp, HWP base register
+ 6, 0, EAX, 8, hwp_notify, HWP notification
+ 6, 0, EAX, 9, hwp_act_window, HWP activity window
+ 6, 0, EAX, 10, hwp_energy, HWP energy performance preference
+ 6, 0, EAX, 11, hwp_pkg_req, HWP package level request
+# 6, 0, EAX, 12, resv, Reserved
+ 6, 0, EAX, 13, hdc, HDC base registers supported
+ 6, 0, EAX, 14, turbo3, Turbo Boost Max 3.0
+ 6, 0, EAX, 15, hwp_cap, Highest Performance change supported
+ 6, 0, EAX, 16, hwp_peci, HWP PECI override is supported
+ 6, 0, EAX, 17, hwp_flex, Flexible HWP is supported
+ 6, 0, EAX, 18, hwp_fast, Fast access mode for the IA32_HWP_REQUEST MSR is supported
+# 6, 0, EAX, 19, resv, Reserved
+ 6, 0, EAX, 20, hwp_ignr, Ignoring Idle Logical Processor HWP request is supported
+
+ 6, 0, EBX, 3:0, therm_irq_thresh, Number of Interrupt Thresholds in Digital Thermal Sensor
+ 6, 0, ECX, 0, aperfmperf, Presence of IA32_MPERF and IA32_APERF
+ 6, 0, ECX, 3, energ_bias, Performance-energy bias preference supported
+
+# Leaf 07H
+# ECX == 0
+# AVX512 refers to https://en.wikipedia.org/wiki/AVX-512
+# XXX: Do we really need to enumerate each and every AVX512 sub features
+
+ 7, 0, EBX, 0, fsgsbase, RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE supported
+ 7, 0, EBX, 1, tsc_adjust, TSC_ADJUST MSR supported
+ 7, 0, EBX, 2, sgx, Software Guard Extensions
+ 7, 0, EBX, 3, bmi1, BMI1
+ 7, 0, EBX, 4, hle, Hardware Lock Elision
+ 7, 0, EBX, 5, avx2, AVX2
+# 7, 0, EBX, 6, fdp_excp_only, x87 FPU Data Pointer updated only on x87 exceptions
+ 7, 0, EBX, 7, smep, Supervisor-Mode Execution Prevention
+ 7, 0, EBX, 8, bmi2, BMI2
+ 7, 0, EBX, 9, rep_movsb, Enhanced REP MOVSB/STOSB
+ 7, 0, EBX, 10, invpcid, INVPCID instruction
+ 7, 0, EBX, 11, rtm, Restricted Transactional Memory
+ 7, 0, EBX, 12, rdt_m, Intel RDT Monitoring capability
+ 7, 0, EBX, 13, depc_fpu_cs_ds, Deprecates FPU CS and FPU DS
+ 7, 0, EBX, 14, mpx, Memory Protection Extensions
+ 7, 0, EBX, 15, rdt_a, Intel RDT Allocation capability
+ 7, 0, EBX, 16, avx512f, AVX512 Foundation instr
+ 7, 0, EBX, 17, avx512dq, AVX512 Double and Quadword AVX512 instr
+ 7, 0, EBX, 18, rdseed, RDSEED instr
+ 7, 0, EBX, 19, adx, ADX instr
+ 7, 0, EBX, 20, smap, Supervisor Mode Access Prevention
+ 7, 0, EBX, 21, avx512ifma, AVX512 Integer Fused Multiply Add
+# 7, 0, EBX, 22, resvd, resvd
+ 7, 0, EBX, 23, clflushopt, CLFLUSHOPT instr
+ 7, 0, EBX, 24, clwb, CLWB instr
+ 7, 0, EBX, 25, intel_pt, Intel Processor Trace instr
+ 7, 0, EBX, 26, avx512pf, Prefetch
+ 7, 0, EBX, 27, avx512er, AVX512 Exponent Reciproca instr
+ 7, 0, EBX, 28, avx512cd, AVX512 Conflict Detection instr
+ 7, 0, EBX, 29, sha, Intel Secure Hash Algorithm Extensions instr
+ 7, 0, EBX, 26, avx512bw, AVX512 Byte & Word instr
+ 7, 0, EBX, 28, avx512vl, AVX512 Vector Length Extentions (VL)
+ 7, 0, ECX, 0, prefetchwt1, X
+ 7, 0, ECX, 1, avx512vbmi, AVX512 Vector Byte Manipulation Instructions
+ 7, 0, ECX, 2, umip, User-mode Instruction Prevention
+
+ 7, 0, ECX, 3, pku, Protection Keys for User-mode pages
+ 7, 0, ECX, 4, ospke, CR4 PKE set to enable protection keys
+# 7, 0, ECX, 16:5, resvd, resvd
+ 7, 0, ECX, 21:17, mawau, The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode
+ 7, 0, ECX, 22, rdpid, RDPID and IA32_TSC_AUX
+# 7, 0, ECX, 29:23, resvd, resvd
+ 7, 0, ECX, 30, sgx_lc, SGX Launch Configuration
+# 7, 0, ECX, 31, resvd, resvd
+
+# Leaf 08H
+#
+
+
+# Leaf 09H
+# Direct Cache Access (DCA) information
+ 9, 0, ECX, 31:0, dca_cap, The value of IA32_PLATFORM_DCA_CAP
+
+# Leaf 0AH
+# Architectural Performance Monitoring
+#
+# Do we really need to print out the PMU related stuff?
+# Does normal user really care about it?
+#
+ 0xA, 0, EAX, 7:0, pmu_ver, Performance Monitoring Unit version
+ 0xA, 0, EAX, 15:8, pmu_gp_cnt_num, Numer of general-purose PMU counters per logical CPU
+ 0xA, 0, EAX, 23:16, pmu_cnt_bits, Bit wideth of PMU counter
+ 0xA, 0, EAX, 31:24, pmu_ebx_bits, Length of EBX bit vector to enumerate PMU events
+
+ 0xA, 0, EBX, 0, pmu_no_core_cycle_evt, Core cycle event not available
+ 0xA, 0, EBX, 1, pmu_no_instr_ret_evt, Instruction retired event not available
+ 0xA, 0, EBX, 2, pmu_no_ref_cycle_evt, Reference cycles event not available
+ 0xA, 0, EBX, 3, pmu_no_llc_ref_evt, Last-level cache reference event not available
+ 0xA, 0, EBX, 4, pmu_no_llc_mis_evt, Last-level cache misses event not available
+ 0xA, 0, EBX, 5, pmu_no_br_instr_ret_evt, Branch instruction retired event not available
+ 0xA, 0, EBX, 6, pmu_no_br_mispredict_evt, Branch mispredict retired event not available
+
+ 0xA, 0, ECX, 4:0, pmu_fixed_cnt_num, Performance Monitoring Unit version
+ 0xA, 0, ECX, 12:5, pmu_fixed_cnt_bits, Numer of PMU counters per logical CPU
+
+# Leaf 0BH
+# Extended Topology Enumeration Leaf
+#
+
+ 0xB, 0, EAX, 4:0, id_shift, Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type
+ 0xB, 0, EBX, 15:0, cpu_nr, Number of logical processors at this level type
+ 0xB, 0, ECX, 15:8, lvl_type, 0-Invalid 1-SMT 2-Core
+ 0xB, 0, EDX, 31:0, x2apic_id, x2APIC ID the current logical processor
+
+
+# Leaf 0DH
+# Processor Extended State
+
+ 0xD, 0, EAX, 0, x87, X87 state
+ 0xD, 0, EAX, 1, sse, SSE state
+ 0xD, 0, EAX, 2, avx, AVX state
+ 0xD, 0, EAX, 4:3, mpx, MPX state
+ 0xD, 0, EAX, 7:5, avx512, AVX-512 state
+ 0xD, 0, EAX, 9, pkru, PKRU state
+
+ 0xD, 0, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 0, ECX, 31:0, max_sz_xsave, Maximum size (bytes) of the XSAVE/XRSTOR save area
+
+ 0xD, 1, EAX, 0, xsaveopt, XSAVEOPT available
+ 0xD, 1, EAX, 1, xsavec, XSAVEC and compacted form supported
+ 0xD, 1, EAX, 2, xgetbv, XGETBV supported
+ 0xD, 1, EAX, 3, xsaves, XSAVES/XRSTORS and IA32_XSS supported
+
+ 0xD, 1, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 1, ECX, 8, pt, PT state
+ 0xD, 1, ECX, 11, cet_usr, CET user state
+ 0xD, 1, ECX, 12, cet_supv, CET supervisor state
+ 0xD, 1, ECX, 13, hdc, HDC state
+ 0xD, 1, ECX, 16, hwp, HWP state
+
+# Leaf 0FH
+# Intel RDT Monitoring
+
+ 0xF, 0, EBX, 31:0, rmid_range, Maximum range (zero-based) of RMID within this physical processor of all types
+ 0xF, 0, EDX, 1, l3c_rdt_mon, L3 Cache RDT Monitoring supported
+
+ 0xF, 1, ECX, 31:0, rmid_range, Maximum range (zero-based) of RMID of this types
+ 0xF, 1, EDX, 0, l3c_ocp_mon, L3 Cache occupancy Monitoring supported
+ 0xF, 1, EDX, 1, l3c_tbw_mon, L3 Cache Total Bandwidth Monitoring supported
+ 0xF, 1, EDX, 2, l3c_lbw_mon, L3 Cache Local Bandwidth Monitoring supported
+
+# Leaf 10H
+# Intel RDT Allocation
+
+ 0x10, 0, EBX, 1, l3c_rdt_alloc, L3 Cache Allocation supported
+ 0x10, 0, EBX, 2, l2c_rdt_alloc, L2 Cache Allocation supported
+ 0x10, 0, EBX, 3, mem_bw_alloc, Memory Bandwidth Allocation supported
+
+
+# Leaf 12H
+# SGX Capability
+#
+# Some detailed SGX features not added yet
+
+ 0x12, 0, EAX, 0, sgx1, L3 Cache Allocation supported
+ 0x12, 1, EAX, 0, sgx2, L3 Cache Allocation supported
+
+
+# Leaf 14H
+# Intel Processor Tracer
+#
+
+# Leaf 15H
+# Time Stamp Counter and Nominal Core Crystal Clock Information
+
+ 0x15, 0, EAX, 31:0, tsc_denominator, The denominator of the TSC/”core crystal clock” ratio
+ 0x15, 0, EBX, 31:0, tsc_numerator, The numerator of the TSC/”core crystal clock” ratio
+ 0x15, 0, ECX, 31:0, nom_freq, Nominal frequency of the core crystal clock in Hz
+
+# Leaf 16H
+# Processor Frequency Information
+
+ 0x16, 0, EAX, 15:0, cpu_base_freq, Processor Base Frequency in MHz
+ 0x16, 0, EBX, 15:0, cpu_max_freq, Maximum Frequency in MHz
+ 0x16, 0, ECX, 15:0, bus_freq, Bus (Reference) Frequency in MHz
+
+# Leaf 17H
+# System-On-Chip Vendor Attribute
+
+ 0x17, 0, EAX, 31:0, max_socid, Maximum input value of supported sub-leaf
+ 0x17, 0, EBX, 15:0, soc_vid, SOC Vendor ID
+ 0x17, 0, EBX, 16, std_vid, SOC Vendor ID is assigned via an industry standard scheme
+ 0x17, 0, ECX, 31:0, soc_pid, SOC Project ID assigned by vendor
+ 0x17, 0, EDX, 31:0, soc_sid, SOC Stepping ID
+
+# Leaf 18H
+# Deterministic Address Translation Parameters
+
+
+# Leaf 19H
+# Key Locker Leaf
+
+
+# Leaf 1AH
+# Hybrid Information
+
+ 0x1A, 0, EAX, 31:24, core_type, 20H-Intel_Atom 40H-Intel_Core
+
+
+# Leaf 1FH
+# V2 Extended Topology - A preferred superset to leaf 0BH
+
+
+# According to SDM
+# 40000000H - 4FFFFFFFH is invalid range
+
+
+# Leaf 80000001H
+# Extended Processor Signature and Feature Bits
+
+0x80000001, 0, ECX, 0, lahf_lm, LAHF/SAHF available in 64-bit mode
+0x80000001, 0, ECX, 5, lzcnt, LZCNT
+0x80000001, 0, ECX, 8, prefetchw, PREFETCHW
+
+0x80000001, 0, EDX, 11, sysret, SYSCALL/SYSRET supported
+0x80000001, 0, EDX, 20, exec_dis, Execute Disable Bit available
+0x80000001, 0, EDX, 26, 1gb_page, 1GB page supported
+0x80000001, 0, EDX, 27, rdtscp, RDTSCP and IA32_TSC_AUX are available
+#0x80000001, 0, EDX, 29, 64b, 64b Architecture supported
+
+# Leaf 80000002H/80000003H/80000004H
+# Processor Brand String
+
+# Leaf 80000005H
+# Reserved
+
+# Leaf 80000006H
+# Extended L2 Cache Features
+
+0x80000006, 0, ECX, 7:0, clsize, Cache Line size in bytes
+0x80000006, 0, ECX, 15:12, l2c_assoc, L2 Associativity
+0x80000006, 0, ECX, 31:16, csize, Cache size in 1K units
+
+
+# Leaf 80000007H
+
+0x80000007, 0, EDX, 8, nonstop_tsc, Invariant TSC available
+
+
+# Leaf 80000008H
+
+0x80000008, 0, EAX, 7:0, phy_adr_bits, Physical Address Bits
+0x80000008, 0, EAX, 15:8, lnr_adr_bits, Linear Address Bits
+0x80000007, 0, EBX, 9, wbnoinvd, WBNOINVD
+
+# 0x8000001E
+# EAX: Extended APIC ID
+0x8000001E, 0, EAX, 31:0, extended_apic_id, Extended APIC ID
+# EBX: Core Identifiers
+0x8000001E, 0, EBX, 7:0, core_id, Identifies the logical core ID
+0x8000001E, 0, EBX, 15:8, threads_per_core, The number of threads per core is threads_per_core + 1
+# ECX: Node Identifiers
+0x8000001E, 0, ECX, 7:0, node_id, Node ID
+0x8000001E, 0, ECX, 10:8, nodes_per_processor, Nodes per processor { 0: 1 node, else reserved }
+
+# 8000001F: AMD Secure Encryption
+0x8000001F, 0, EAX, 0, sme, Secure Memory Encryption
+0x8000001F, 0, EAX, 1, sev, Secure Encrypted Virtualization
+0x8000001F, 0, EAX, 2, vmpgflush, VM Page Flush MSR
+0x8000001F, 0, EAX, 3, seves, SEV Encrypted State
+0x8000001F, 0, EBX, 5:0, c-bit, Page table bit number used to enable memory encryption
+0x8000001F, 0, EBX, 11:6, mem_encrypt_physaddr_width, Reduction of physical address space in bits with SME enabled
+0x8000001F, 0, ECX, 31:0, num_encrypted_guests, Maximum ASID value that may be used for an SEV-enabled guest
+0x8000001F, 0, EDX, 31:0, minimum_sev_asid, Minimum ASID value that must be used for an SEV-enabled, SEV-ES-disabled guest
\ No newline at end of file
diff --git a/tools/arch/x86/kcpuid/kcpuid.c b/tools/arch/x86/kcpuid/kcpuid.c
new file mode 100644
index 000000000000..936a3a2aad04
--- /dev/null
+++ b/tools/arch/x86/kcpuid/kcpuid.c
@@ -0,0 +1,657 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+char *def_csv = "/usr/share/misc/cpuid.csv";
+char *user_csv;
+
+
+/* Cover both single-bit flag and multiple-bits fields */
+struct bits_desc {
+ /* start and end bits */
+ int start, end;
+ /* 0 or 1 for 1-bit flag */
+ int value;
+ char simp[32];
+ char detail[256];
+};
+
+/* descriptor info for eax/ebx/ecx/edx */
+struct reg_desc {
+ /* number of valid entries */
+ int nr;
+ struct bits_desc descs[32];
+};
+
+enum {
+ R_EAX = 0,
+ R_EBX,
+ R_ECX,
+ R_EDX,
+ NR_REGS
+};
+
+struct subleaf {
+ u32 index;
+ u32 sub;
+ u32 eax, ebx, ecx, edx;
+ struct reg_desc info[NR_REGS];
+};
+
+/* Represent one leaf (basic or extended) */
+struct cpuid_func {
+ /*
+ * Array of subleafs for this func, if there is no subleafs
+ * then the leafs[0] is the main leaf
+ */
+ struct subleaf *leafs;
+ int nr;
+};
+
+struct cpuid_range {
+ /* array of main leafs */
+ struct cpuid_func *funcs;
+ /* number of valid leafs */
+ int nr;
+ bool is_ext;
+};
+
+/*
+ * basic: basic functions range: [0... ]
+ * ext: extended functions range: [0x80000000... ]
+ */
+struct cpuid_range *leafs_basic, *leafs_ext;
+
+static int num_leafs;
+static bool is_amd;
+static bool show_details;
+static bool show_raw;
+static bool show_flags_only = true;
+static u32 user_index = 0xFFFFFFFF;
+static u32 user_sub = 0xFFFFFFFF;
+static int flines;
+
+static inline void cpuid(u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+{
+ /* ecx is often an input as well as an output. */
+ asm volatile("cpuid"
+ : "=a" (*eax),
+ "=b" (*ebx),
+ "=c" (*ecx),
+ "=d" (*edx)
+ : "0" (*eax), "2" (*ecx));
+}
+
+static inline bool has_subleafs(u32 f)
+{
+ if (f == 0x7 || f == 0xd)
+ return true;
+
+ if (is_amd) {
+ if (f == 0x8000001d)
+ return true;
+ return false;
+ }
+
+ switch (f) {
+ case 0x4:
+ case 0xb:
+ case 0xf:
+ case 0x10:
+ case 0x14:
+ case 0x18:
+ case 0x1f:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static void leaf_print_raw(struct subleaf *leaf)
+{
+ if (has_subleafs(leaf->index)) {
+ if (leaf->sub == 0)
+ printf("0x%08x: subleafs:\n", leaf->index);
+
+ printf(" %2d: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->sub, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ } else {
+ printf("0x%08x: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->index, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ }
+}
+
+/* Return true is the input eax/ebx/ecx/edx are all zero */
+static bool cpuid_store(struct cpuid_range *range, u32 f, int subleaf,
+ u32 a, u32 b, u32 c, u32 d)
+{
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ int s = 0;
+
+ if (a == 0 && b == 0 && c == 0 && d == 0)
+ return true;
+
+ /*
+ * Cut off vendor-prefix from CPUID function as we're using it as an
+ * index into ->funcs.
+ */
+ func = &range->funcs[f & 0xffff];
+
+ if (!func->leafs) {
+ func->leafs = malloc(sizeof(struct subleaf));
+ if (!func->leafs)
+ perror("malloc func leaf");
+
+ func->nr = 1;
+ } else {
+ s = func->nr;
+ func->leafs = realloc(func->leafs, (s + 1) * sizeof(*leaf));
+ if (!func->leafs)
+ perror("realloc f->leafs");
+
+ func->nr++;
+ }
+
+ leaf = &func->leafs[s];
+
+ leaf->index = f;
+ leaf->sub = subleaf;
+ leaf->eax = a;
+ leaf->ebx = b;
+ leaf->ecx = c;
+ leaf->edx = d;
+
+ return false;
+}
+
+static void raw_dump_range(struct cpuid_range *range)
+{
+ u32 f;
+ int i;
+
+ printf("%s Leafs :\n", range->is_ext ? "Extended" : "Basic");
+ printf("================\n");
+
+ for (f = 0; (int)f < range->nr; f++) {
+ struct cpuid_func *func = &range->funcs[f];
+ u32 index = f;
+
+ if (range->is_ext)
+ index += 0x80000000;
+
+ /* Skip leaf without valid items */
+ if (!func->nr)
+ continue;
+
+ /* First item is the main leaf, followed by all subleafs */
+ for (i = 0; i < func->nr; i++)
+ leaf_print_raw(&func->leafs[i]);
+ }
+}
+
+#define MAX_SUBLEAF_NUM 32
+struct cpuid_range *setup_cpuid_range(u32 input_eax)
+{
+ u32 max_func, idx_func;
+ int subleaf;
+ struct cpuid_range *range;
+ u32 eax, ebx, ecx, edx;
+ u32 f = input_eax;
+ int max_subleaf;
+ bool allzero;
+
+ eax = input_eax;
+ ebx = ecx = edx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ max_func = eax;
+ idx_func = (max_func & 0xffff) + 1;
+
+ range = malloc(sizeof(struct cpuid_range));
+ if (!range)
+ perror("malloc range");
+
+ if (input_eax & 0x80000000)
+ range->is_ext = true;
+ else
+ range->is_ext = false;
+
+ range->funcs = malloc(sizeof(struct cpuid_func) * idx_func);
+ if (!range->funcs)
+ perror("malloc range->funcs");
+
+ range->nr = idx_func;
+ memset(range->funcs, 0, sizeof(struct cpuid_func) * idx_func);
+
+ for (; f <= max_func; f++) {
+ eax = f;
+ subleaf = ecx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf, eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+
+ if (!has_subleafs(f))
+ continue;
+
+ max_subleaf = MAX_SUBLEAF_NUM;
+
+ /*
+ * Some can provide the exact number of subleafs,
+ * others have to be tried (0xf)
+ */
+ if (f == 0x7 || f == 0x14 || f == 0x17 || f == 0x18)
+ max_subleaf = (eax & 0xff) + 1;
+
+ if (f == 0xb)
+ max_subleaf = 2;
+
+ for (subleaf = 1; subleaf < max_subleaf; subleaf++) {
+ eax = f;
+ ecx = subleaf;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf,
+ eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+ }
+
+ }
+
+ return range;
+}
+
+/*
+ * The basic row format for cpuid.csv is
+ * LEAF,SUBLEAF,register_name,bits,short name,long description
+ *
+ * like:
+ * 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+ * 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ */
+static int parse_line(char *line)
+{
+ char *str;
+ int i;
+ struct cpuid_range *range;
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ u32 index;
+ u32 sub;
+ char buffer[512];
+ char *buf;
+ /*
+ * Tokens:
+ * 1. leaf
+ * 2. subleaf
+ * 3. register
+ * 4. bits
+ * 5. short name
+ * 6. long detail
+ */
+ char *tokens[6];
+ struct reg_desc *reg;
+ struct bits_desc *bdesc;
+ int reg_index;
+ char *start, *end;
+
+ /* Skip comments and NULL line */
+ if (line[0] == '#' || line[0] == '\n')
+ return 0;
+
+ strncpy(buffer, line, 511);
+ buffer[511] = 0;
+ str = buffer;
+ for (i = 0; i < 5; i++) {
+ tokens[i] = strtok(str, ",");
+ if (!tokens[i])
+ goto err_exit;
+ str = NULL;
+ }
+ tokens[5] = strtok(str, "\n");
+ if (!tokens[5])
+ goto err_exit;
+
+ /* index/main-leaf */
+ index = strtoull(tokens[0], NULL, 0);
+
+ if (index & 0x80000000)
+ range = leafs_ext;
+ else
+ range = leafs_basic;
+
+ index &= 0x7FFFFFFF;
+ /* Skip line parsing for non-existing indexes */
+ if ((int)index >= range->nr)
+ return -1;
+
+ func = &range->funcs[index];
+
+ /* Return if the index has no valid item on this platform */
+ if (!func->nr)
+ return 0;
+
+ /* subleaf */
+ sub = strtoul(tokens[1], NULL, 0);
+ if ((int)sub > func->nr)
+ return -1;
+
+ leaf = &func->leafs[sub];
+ buf = tokens[2];
+
+ if (strcasestr(buf, "EAX"))
+ reg_index = R_EAX;
+ else if (strcasestr(buf, "EBX"))
+ reg_index = R_EBX;
+ else if (strcasestr(buf, "ECX"))
+ reg_index = R_ECX;
+ else if (strcasestr(buf, "EDX"))
+ reg_index = R_EDX;
+ else
+ goto err_exit;
+
+ reg = &leaf->info[reg_index];
+ bdesc = ®->descs[reg->nr++];
+
+ /* bit flag or bits field */
+ buf = tokens[3];
+
+ end = strtok(buf, ":");
+ bdesc->end = strtoul(end, NULL, 0);
+ bdesc->start = bdesc->end;
+
+ /* start != NULL means it is bit fields */
+ start = strtok(NULL, ":");
+ if (start)
+ bdesc->start = strtoul(start, NULL, 0);
+
+ strcpy(bdesc->simp, tokens[4]);
+ strcpy(bdesc->detail, tokens[5]);
+ return 0;
+
+err_exit:
+ printf("Warning: wrong line format:\n");
+ printf("\tline[%d]: %s\n", flines, line);
+ return -1;
+}
+
+/* Parse csv file, and construct the array of all leafs and subleafs */
+static void parse_text(void)
+{
+ FILE *file;
+ char *filename, *line = NULL;
+ size_t len = 0;
+ int ret;
+
+ if (show_raw)
+ return;
+
+ filename = user_csv ? user_csv : def_csv;
+ file = fopen(filename, "r");
+ if (!file) {
+ /* Fallback to a csv in the same dir */
+ file = fopen("./cpuid.csv", "r");
+ }
+
+ if (!file) {
+ printf("Fail to open '%s'\n", filename);
+ return;
+ }
+
+ while (1) {
+ ret = getline(&line, &len, file);
+ flines++;
+ if (ret > 0)
+ parse_line(line);
+
+ if (feof(file))
+ break;
+ }
+
+ fclose(file);
+}
+
+
+/* Decode every eax/ebx/ecx/edx */
+static void decode_bits(u32 value, struct reg_desc *rdesc)
+{
+ struct bits_desc *bdesc;
+ int start, end, i;
+ u32 mask;
+
+ for (i = 0; i < rdesc->nr; i++) {
+ bdesc = &rdesc->descs[i];
+
+ start = bdesc->start;
+ end = bdesc->end;
+ if (start == end) {
+ /* single bit flag */
+ if (value & (1 << start))
+ printf("\t%-20s %s%s\n",
+ bdesc->simp,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ } else {
+ /* bit fields */
+ if (show_flags_only)
+ continue;
+
+ mask = ((u64)1 << (end - start + 1)) - 1;
+ printf("\t%-20s\t: 0x%-8x\t%s%s\n",
+ bdesc->simp,
+ (value >> start) & mask,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ }
+ }
+}
+
+static void show_leaf(struct subleaf *leaf)
+{
+ if (!leaf)
+ return;
+
+ if (show_raw)
+ leaf_print_raw(leaf);
+
+ decode_bits(leaf->eax, &leaf->info[R_EAX]);
+ decode_bits(leaf->ebx, &leaf->info[R_EBX]);
+ decode_bits(leaf->ecx, &leaf->info[R_ECX]);
+ decode_bits(leaf->edx, &leaf->info[R_EDX]);
+}
+
+static void show_func(struct cpuid_func *func)
+{
+ int i;
+
+ if (!func)
+ return;
+
+ for (i = 0; i < func->nr; i++)
+ show_leaf(&func->leafs[i]);
+}
+
+static void show_range(struct cpuid_range *range)
+{
+ int i;
+
+ for (i = 0; i < range->nr; i++)
+ show_func(&range->funcs[i]);
+}
+
+static inline struct cpuid_func *index_to_func(u32 index)
+{
+ struct cpuid_range *range;
+
+ range = (index & 0x80000000) ? leafs_ext : leafs_basic;
+ index &= 0x7FFFFFFF;
+
+ if (((index & 0xFFFF) + 1) > (u32)range->nr) {
+ printf("ERR: invalid input index (0x%x)\n", index);
+ return NULL;
+ }
+ return &range->funcs[index];
+}
+
+static void show_info(void)
+{
+ struct cpuid_func *func;
+
+ if (show_raw) {
+ /* Show all of the raw output of 'cpuid' instr */
+ raw_dump_range(leafs_basic);
+ raw_dump_range(leafs_ext);
+ return;
+ }
+
+ if (user_index != 0xFFFFFFFF) {
+ /* Only show specific leaf/subleaf info */
+ func = index_to_func(user_index);
+ if (!func)
+ return;
+
+ /* Dump the raw data also */
+ show_raw = true;
+
+ if (user_sub != 0xFFFFFFFF) {
+ if (user_sub + 1 <= (u32)func->nr) {
+ show_leaf(&func->leafs[user_sub]);
+ return;
+ }
+
+ printf("ERR: invalid input subleaf (0x%x)\n", user_sub);
+ }
+
+ show_func(func);
+ return;
+ }
+
+ printf("CPU features:\n=============\n\n");
+ show_range(leafs_basic);
+ show_range(leafs_ext);
+}
+
+static void setup_platform_cpuid(void)
+{
+ u32 eax, ebx, ecx, edx;
+
+ /* Check vendor */
+ eax = ebx = ecx = edx = 0;
+ cpuid(&eax, &ebx, &ecx, &edx);
+
+ /* "htuA" */
+ if (ebx == 0x68747541)
+ is_amd = true;
+
+ /* Setup leafs for the basic and extended range */
+ leafs_basic = setup_cpuid_range(0x0);
+ leafs_ext = setup_cpuid_range(0x80000000);
+}
+
+static void usage(void)
+{
+ printf("kcpuid [-abdfhr] [-l leaf] [-s subleaf]\n"
+ "\t-a|--all Show both bit flags and complex bit fields info\n"
+ "\t-b|--bitflags Show boolean flags only\n"
+ "\t-d|--detail Show details of the flag/fields (default)\n"
+ "\t-f|--flags Specify the cpuid csv file\n"
+ "\t-h|--help Show usage info\n"
+ "\t-l|--leaf=index Specify the leaf you want to check\n"
+ "\t-r|--raw Show raw cpuid data\n"
+ "\t-s|--subleaf=sub Specify the subleaf you want to check\n"
+ );
+}
+
+static struct option opts[] = {
+ { "all", no_argument, NULL, 'a' }, /* show both bit flags and fields */
+ { "bitflags", no_argument, NULL, 'b' }, /* only show bit flags, default on */
+ { "detail", no_argument, NULL, 'd' }, /* show detail descriptions */
+ { "file", required_argument, NULL, 'f' }, /* use user's cpuid file */
+ { "help", no_argument, NULL, 'h'}, /* show usage */
+ { "leaf", required_argument, NULL, 'l'}, /* only check a specific leaf */
+ { "raw", no_argument, NULL, 'r'}, /* show raw CPUID leaf data */
+ { "subleaf", required_argument, NULL, 's'}, /* check a specific subleaf */
+ { NULL, 0, NULL, 0 }
+};
+
+static int parse_options(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt_long(argc, argv, "abdf:hl:rs:",
+ opts, NULL)) != -1)
+ switch (c) {
+ case 'a':
+ show_flags_only = false;
+ break;
+ case 'b':
+ show_flags_only = true;
+ break;
+ case 'd':
+ show_details = true;
+ break;
+ case 'f':
+ user_csv = optarg;
+ break;
+ case 'h':
+ usage();
+ exit(1);
+ break;
+ case 'l':
+ /* main leaf */
+ user_index = strtoul(optarg, NULL, 0);
+ break;
+ case 'r':
+ show_raw = true;
+ break;
+ case 's':
+ /* subleaf */
+ user_sub = strtoul(optarg, NULL, 0);
+ break;
+ default:
+ printf("%s: Invalid option '%c'\n", argv[0], optopt);
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * Do 4 things in turn:
+ * 1. Parse user options
+ * 2. Parse and store all the CPUID leaf data supported on this platform
+ * 2. Parse the csv file, while skipping leafs which are not available
+ * on this platform
+ * 3. Print leafs info based on user options
+ */
+int main(int argc, char *argv[])
+{
+ if (parse_options(argc, argv))
+ return -1;
+
+ /* Setup the cpuid leafs of current platform */
+ setup_platform_cpuid();
+
+ /* Read and parse the 'cpuid.csv' */
+ parse_text();
+
+ show_info();
+ return 0;
+}
\ No newline at end of file
--
2.30.0
1
0
From: Feng Tang <feng.tang(a)intel.com>
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMMB?from=project-issue
CVE: NA
------------------------------------
End users frequently want to know what features their processor
supports, independent of what the kernel supports.
/proc/cpuinfo is great. It is omnipresent and since it is provided by
the kernel it is always as up to date as the kernel. But, it could be
ambiguous about processor features which can be disabled by the kernel
at boot-time or compile-time.
There are some user space tools showing more raw features, but they are
not bound with kernel, and go with distros. Many end users are still
using old distros with new kernels (upgraded by themselves), and may
not upgrade the distros only to get a newer tool.
So here arise the need for a new tool, which
* shows raw CPU features read from the CPUID instruction
* will be easier to update compared to existing userspace
tooling (perhaps distributed like perf)
* inherits "modern" kernel development process, in contrast to some
of the existing userspace CPUID tools which are still being
developed
without git and distributed in tarballs from non-https sites.
* Can produce output consistent with /proc/cpuinfo to make comparison
easier.
The CPUID leaf definitions are kept in an .csv file which allows for
updating only that file to add support for new feature leafs.
This is based on prototype code from Borislav Petkov
(http://sr71.net/~dave/intel/stupid-cpuid.c)
[ bp:
- Massage, add #define _GNU_SOURCE to fix implicit declaration of
function ‘strcasestr' warning
- remove superfluous newlines
- fallback to cpuid.csv in the current dir if none found
- fix typos
- move comments over the lines instead of sideways. ]
Originally-from: Borislav Petkov <bp(a)alien8.de>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Suggested-by: Borislav Petkov <bp(a)alien8.de>
Signed-off-by: Feng Tang <feng.tang(a)intel.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
tools/arch/x86/kcpuid/Makefile | 24 ++
tools/arch/x86/kcpuid/cpuid.csv | 400 +++++++++++++++++++
tools/arch/x86/kcpuid/kcpuid.c | 657 ++++++++++++++++++++++++++++++++
3 files changed, 1081 insertions(+)
create mode 100644 tools/arch/x86/kcpuid/Makefile
create mode 100644 tools/arch/x86/kcpuid/cpuid.csv
create mode 100644 tools/arch/x86/kcpuid/kcpuid.c
diff --git a/tools/arch/x86/kcpuid/Makefile b/tools/arch/x86/kcpuid/Makefile
new file mode 100644
index 000000000000..a2f1e855092c
--- /dev/null
+++ b/tools/arch/x86/kcpuid/Makefile
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for x86/kcpuid tool
+
+kcpuid : kcpuid.c
+
+CFLAGS = -Wextra
+
+BINDIR ?= /usr/sbin
+
+HWDATADIR ?= /usr/share/misc/
+
+override CFLAGS += -O2 -Wall -I../../../include
+
+%: %.c
+ $(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
+
+.PHONY : clean
+clean :
+ @rm -f kcpuid
+
+install : kcpuid
+ install -d $(DESTDIR)$(BINDIR)
+ install -m 755 -p kcpuid $(DESTDIR)$(BINDIR)/kcpuid
+ install -m 444 -p cpuid.csv $(HWDATADIR)/cpuid.csv
diff --git a/tools/arch/x86/kcpuid/cpuid.csv b/tools/arch/x86/kcpuid/cpuid.csv
new file mode 100644
index 000000000000..e422fad9a98e
--- /dev/null
+++ b/tools/arch/x86/kcpuid/cpuid.csv
@@ -0,0 +1,400 @@
+# The basic row format is:
+# LEAF, SUBLEAF, register_name, bits, short_name, long_description
+
+# Leaf 00H
+ 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+
+# Leaf 01H
+ 1, 0, EAX, 3:0, stepping, Stepping ID
+ 1, 0, EAX, 7:4, model, Model
+ 1, 0, EAX, 11:8, family, Family ID
+ 1, 0, EAX, 13:12, processor, Processor Type
+ 1, 0, EAX, 19:16, model_ext, Extended Model ID
+ 1, 0, EAX, 27:20, family_ext, Extended Family ID
+
+ 1, 0, EBX, 7:0, brand, Brand Index
+ 1, 0, EBX, 15:8, clflush_size, CLFLUSH line size (value * 8) in bytes
+ 1, 0, EBX, 23:16, max_cpu_id, Maxim number of addressable logic cpu in this package
+ 1, 0, EBX, 31:24, apic_id, Initial APIC ID
+
+ 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ 1, 0, ECX, 1, pclmulqdq, PCLMULQDQ instruction supported
+ 1, 0, ECX, 2, dtes64, DS area uses 64-bit layout
+ 1, 0, ECX, 3, mwait, MONITOR/MWAIT supported
+ 1, 0, ECX, 4, ds_cpl, CPL Qualified Debug Store which allows for branch message storage qualified by CPL
+ 1, 0, ECX, 5, vmx, Virtual Machine Extensions supported
+ 1, 0, ECX, 6, smx, Safer Mode Extension supported
+ 1, 0, ECX, 7, eist, Enhanced Intel SpeedStep Technology
+ 1, 0, ECX, 8, tm2, Thermal Monitor 2
+ 1, 0, ECX, 9, ssse3, Supplemental Streaming SIMD Extensions 3 (SSSE3)
+ 1, 0, ECX, 10, l1_ctx_id, L1 data cache could be set to either adaptive mode or shared mode (check IA32_MISC_ENABLE bit 24 definition)
+ 1, 0, ECX, 11, sdbg, IA32_DEBUG_INTERFACE MSR for silicon debug supported
+ 1, 0, ECX, 12, fma, FMA extensions using YMM state supported
+ 1, 0, ECX, 13, cmpxchg16b, 'CMPXCHG16B - Compare and Exchange Bytes' supported
+ 1, 0, ECX, 14, xtpr_update, xTPR Update Control supported
+ 1, 0, ECX, 15, pdcm, Perfmon and Debug Capability present
+ 1, 0, ECX, 17, pcid, Process-Context Identifiers feature present
+ 1, 0, ECX, 18, dca, Prefetching data from a memory mapped device supported
+ 1, 0, ECX, 19, sse4_1, SSE4.1 feature present
+ 1, 0, ECX, 20, sse4_2, SSE4.2 feature present
+ 1, 0, ECX, 21, x2apic, x2APIC supported
+ 1, 0, ECX, 22, movbe, MOVBE instruction supported
+ 1, 0, ECX, 23, popcnt, POPCNT instruction supported
+ 1, 0, ECX, 24, tsc_deadline_timer, LAPIC supports one-shot operation using a TSC deadline value
+ 1, 0, ECX, 25, aesni, AESNI instruction supported
+ 1, 0, ECX, 26, xsave, XSAVE/XRSTOR processor extended states (XSETBV/XGETBV/XCR0)
+ 1, 0, ECX, 27, osxsave, OS has set CR4.OSXSAVE bit to enable XSETBV/XGETBV/XCR0
+ 1, 0, ECX, 28, avx, AVX instruction supported
+ 1, 0, ECX, 29, f16c, 16-bit floating-point conversion instruction supported
+ 1, 0, ECX, 30, rdrand, RDRAND instruction supported
+
+ 1, 0, EDX, 0, fpu, x87 FPU on chip
+ 1, 0, EDX, 1, vme, Virtual-8086 Mode Enhancement
+ 1, 0, EDX, 2, de, Debugging Extensions
+ 1, 0, EDX, 3, pse, Page Size Extensions
+ 1, 0, EDX, 4, tsc, Time Stamp Counter
+ 1, 0, EDX, 5, msr, RDMSR and WRMSR Support
+ 1, 0, EDX, 6, pae, Physical Address Extensions
+ 1, 0, EDX, 7, mce, Machine Check Exception
+ 1, 0, EDX, 8, cx8, CMPXCHG8B instr
+ 1, 0, EDX, 9, apic, APIC on Chip
+ 1, 0, EDX, 11, sep, SYSENTER and SYSEXIT instrs
+ 1, 0, EDX, 12, mtrr, Memory Type Range Registers
+ 1, 0, EDX, 13, pge, Page Global Bit
+ 1, 0, EDX, 14, mca, Machine Check Architecture
+ 1, 0, EDX, 15, cmov, Conditional Move Instrs
+ 1, 0, EDX, 16, pat, Page Attribute Table
+ 1, 0, EDX, 17, pse36, 36-Bit Page Size Extension
+ 1, 0, EDX, 18, psn, Processor Serial Number
+ 1, 0, EDX, 19, clflush, CLFLUSH instr
+# 1, 0, EDX, 20,
+ 1, 0, EDX, 21, ds, Debug Store
+ 1, 0, EDX, 22, acpi, Thermal Monitor and Software Controlled Clock Facilities
+ 1, 0, EDX, 23, mmx, Intel MMX Technology
+ 1, 0, EDX, 24, fxsr, XSAVE and FXRSTOR Instrs
+ 1, 0, EDX, 25, sse, SSE
+ 1, 0, EDX, 26, sse2, SSE2
+ 1, 0, EDX, 27, ss, Self Snoop
+ 1, 0, EDX, 28, hit, Max APIC IDs
+ 1, 0, EDX, 29, tm, Thermal Monitor
+# 1, 0, EDX, 30,
+ 1, 0, EDX, 31, pbe, Pending Break Enable
+
+# Leaf 02H
+# cache and TLB descriptor info
+
+# Leaf 03H
+# Precessor Serial Number, introduced on Pentium III, not valid for
+# latest models
+
+# Leaf 04H
+# thread/core and cache topology
+ 4, 0, EAX, 4:0, cache_type, Cache type like instr/data or unified
+ 4, 0, EAX, 7:5, cache_level, Cache Level (starts at 1)
+ 4, 0, EAX, 8, cache_self_init, Cache Self Initialization
+ 4, 0, EAX, 9, fully_associate, Fully Associative cache
+# 4, 0, EAX, 13:10, resvd, resvd
+ 4, 0, EAX, 25:14, max_logical_id, Max number of addressable IDs for logical processors sharing the cache
+ 4, 0, EAX, 31:26, max_phy_id, Max number of addressable IDs for processors in phy package
+
+ 4, 0, EBX, 11:0, cache_linesize, Size of a cache line in bytes
+ 4, 0, EBX, 21:12, cache_partition, Physical Line partitions
+ 4, 0, EBX, 31:22, cache_ways, Ways of associativity
+ 4, 0, ECX, 31:0, cache_sets, Number of Sets - 1
+ 4, 0, EDX, 0, c_wbinvd, 1 means WBINVD/INVD is not ganranteed to act upon lower level caches of non-originating threads sharing this cache
+ 4, 0, EDX, 1, c_incl, Whether cache is inclusive of lower cache level
+ 4, 0, EDX, 2, c_comp_index, Complex Cache Indexing
+
+# Leaf 05H
+# MONITOR/MWAIT
+ 5, 0, EAX, 15:0, min_mon_size, Smallest monitor line size in bytes
+ 5, 0, EBX, 15:0, max_mon_size, Largest monitor line size in bytes
+ 5, 0, ECX, 0, mwait_ext, Enum of Monitor-Mwait extensions supported
+ 5, 0, ECX, 1, mwait_irq_break, Largest monitor line size in bytes
+ 5, 0, EDX, 3:0, c0_sub_stats, Number of C0* sub C-states supported using MWAIT
+ 5, 0, EDX, 7:4, c1_sub_stats, Number of C1* sub C-states supported using MWAIT
+ 5, 0, EDX, 11:8, c2_sub_stats, Number of C2* sub C-states supported using MWAIT
+ 5, 0, EDX, 15:12, c3_sub_stats, Number of C3* sub C-states supported using MWAIT
+ 5, 0, EDX, 19:16, c4_sub_stats, Number of C4* sub C-states supported using MWAIT
+ 5, 0, EDX, 23:20, c5_sub_stats, Number of C5* sub C-states supported using MWAIT
+ 5, 0, EDX, 27:24, c6_sub_stats, Number of C6* sub C-states supported using MWAIT
+ 5, 0, EDX, 31:28, c7_sub_stats, Number of C7* sub C-states supported using MWAIT
+
+# Leaf 06H
+# Thermal & Power Management
+
+ 6, 0, EAX, 0, dig_temp, Digital temperature sensor supported
+ 6, 0, EAX, 1, turbo, Intel Turbo Boost
+ 6, 0, EAX, 2, arat, Always running APIC timer
+# 6, 0, EAX, 3, resv, Reserved
+ 6, 0, EAX, 4, pln, Power limit notifications supported
+ 6, 0, EAX, 5, ecmd, Clock modulation duty cycle extension supported
+ 6, 0, EAX, 6, ptm, Package thermal management supported
+ 6, 0, EAX, 7, hwp, HWP base register
+ 6, 0, EAX, 8, hwp_notify, HWP notification
+ 6, 0, EAX, 9, hwp_act_window, HWP activity window
+ 6, 0, EAX, 10, hwp_energy, HWP energy performance preference
+ 6, 0, EAX, 11, hwp_pkg_req, HWP package level request
+# 6, 0, EAX, 12, resv, Reserved
+ 6, 0, EAX, 13, hdc, HDC base registers supported
+ 6, 0, EAX, 14, turbo3, Turbo Boost Max 3.0
+ 6, 0, EAX, 15, hwp_cap, Highest Performance change supported
+ 6, 0, EAX, 16, hwp_peci, HWP PECI override is supported
+ 6, 0, EAX, 17, hwp_flex, Flexible HWP is supported
+ 6, 0, EAX, 18, hwp_fast, Fast access mode for the IA32_HWP_REQUEST MSR is supported
+# 6, 0, EAX, 19, resv, Reserved
+ 6, 0, EAX, 20, hwp_ignr, Ignoring Idle Logical Processor HWP request is supported
+
+ 6, 0, EBX, 3:0, therm_irq_thresh, Number of Interrupt Thresholds in Digital Thermal Sensor
+ 6, 0, ECX, 0, aperfmperf, Presence of IA32_MPERF and IA32_APERF
+ 6, 0, ECX, 3, energ_bias, Performance-energy bias preference supported
+
+# Leaf 07H
+# ECX == 0
+# AVX512 refers to https://en.wikipedia.org/wiki/AVX-512
+# XXX: Do we really need to enumerate each and every AVX512 sub features
+
+ 7, 0, EBX, 0, fsgsbase, RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE supported
+ 7, 0, EBX, 1, tsc_adjust, TSC_ADJUST MSR supported
+ 7, 0, EBX, 2, sgx, Software Guard Extensions
+ 7, 0, EBX, 3, bmi1, BMI1
+ 7, 0, EBX, 4, hle, Hardware Lock Elision
+ 7, 0, EBX, 5, avx2, AVX2
+# 7, 0, EBX, 6, fdp_excp_only, x87 FPU Data Pointer updated only on x87 exceptions
+ 7, 0, EBX, 7, smep, Supervisor-Mode Execution Prevention
+ 7, 0, EBX, 8, bmi2, BMI2
+ 7, 0, EBX, 9, rep_movsb, Enhanced REP MOVSB/STOSB
+ 7, 0, EBX, 10, invpcid, INVPCID instruction
+ 7, 0, EBX, 11, rtm, Restricted Transactional Memory
+ 7, 0, EBX, 12, rdt_m, Intel RDT Monitoring capability
+ 7, 0, EBX, 13, depc_fpu_cs_ds, Deprecates FPU CS and FPU DS
+ 7, 0, EBX, 14, mpx, Memory Protection Extensions
+ 7, 0, EBX, 15, rdt_a, Intel RDT Allocation capability
+ 7, 0, EBX, 16, avx512f, AVX512 Foundation instr
+ 7, 0, EBX, 17, avx512dq, AVX512 Double and Quadword AVX512 instr
+ 7, 0, EBX, 18, rdseed, RDSEED instr
+ 7, 0, EBX, 19, adx, ADX instr
+ 7, 0, EBX, 20, smap, Supervisor Mode Access Prevention
+ 7, 0, EBX, 21, avx512ifma, AVX512 Integer Fused Multiply Add
+# 7, 0, EBX, 22, resvd, resvd
+ 7, 0, EBX, 23, clflushopt, CLFLUSHOPT instr
+ 7, 0, EBX, 24, clwb, CLWB instr
+ 7, 0, EBX, 25, intel_pt, Intel Processor Trace instr
+ 7, 0, EBX, 26, avx512pf, Prefetch
+ 7, 0, EBX, 27, avx512er, AVX512 Exponent Reciproca instr
+ 7, 0, EBX, 28, avx512cd, AVX512 Conflict Detection instr
+ 7, 0, EBX, 29, sha, Intel Secure Hash Algorithm Extensions instr
+ 7, 0, EBX, 26, avx512bw, AVX512 Byte & Word instr
+ 7, 0, EBX, 28, avx512vl, AVX512 Vector Length Extentions (VL)
+ 7, 0, ECX, 0, prefetchwt1, X
+ 7, 0, ECX, 1, avx512vbmi, AVX512 Vector Byte Manipulation Instructions
+ 7, 0, ECX, 2, umip, User-mode Instruction Prevention
+
+ 7, 0, ECX, 3, pku, Protection Keys for User-mode pages
+ 7, 0, ECX, 4, ospke, CR4 PKE set to enable protection keys
+# 7, 0, ECX, 16:5, resvd, resvd
+ 7, 0, ECX, 21:17, mawau, The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode
+ 7, 0, ECX, 22, rdpid, RDPID and IA32_TSC_AUX
+# 7, 0, ECX, 29:23, resvd, resvd
+ 7, 0, ECX, 30, sgx_lc, SGX Launch Configuration
+# 7, 0, ECX, 31, resvd, resvd
+
+# Leaf 08H
+#
+
+
+# Leaf 09H
+# Direct Cache Access (DCA) information
+ 9, 0, ECX, 31:0, dca_cap, The value of IA32_PLATFORM_DCA_CAP
+
+# Leaf 0AH
+# Architectural Performance Monitoring
+#
+# Do we really need to print out the PMU related stuff?
+# Does normal user really care about it?
+#
+ 0xA, 0, EAX, 7:0, pmu_ver, Performance Monitoring Unit version
+ 0xA, 0, EAX, 15:8, pmu_gp_cnt_num, Numer of general-purose PMU counters per logical CPU
+ 0xA, 0, EAX, 23:16, pmu_cnt_bits, Bit wideth of PMU counter
+ 0xA, 0, EAX, 31:24, pmu_ebx_bits, Length of EBX bit vector to enumerate PMU events
+
+ 0xA, 0, EBX, 0, pmu_no_core_cycle_evt, Core cycle event not available
+ 0xA, 0, EBX, 1, pmu_no_instr_ret_evt, Instruction retired event not available
+ 0xA, 0, EBX, 2, pmu_no_ref_cycle_evt, Reference cycles event not available
+ 0xA, 0, EBX, 3, pmu_no_llc_ref_evt, Last-level cache reference event not available
+ 0xA, 0, EBX, 4, pmu_no_llc_mis_evt, Last-level cache misses event not available
+ 0xA, 0, EBX, 5, pmu_no_br_instr_ret_evt, Branch instruction retired event not available
+ 0xA, 0, EBX, 6, pmu_no_br_mispredict_evt, Branch mispredict retired event not available
+
+ 0xA, 0, ECX, 4:0, pmu_fixed_cnt_num, Performance Monitoring Unit version
+ 0xA, 0, ECX, 12:5, pmu_fixed_cnt_bits, Numer of PMU counters per logical CPU
+
+# Leaf 0BH
+# Extended Topology Enumeration Leaf
+#
+
+ 0xB, 0, EAX, 4:0, id_shift, Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type
+ 0xB, 0, EBX, 15:0, cpu_nr, Number of logical processors at this level type
+ 0xB, 0, ECX, 15:8, lvl_type, 0-Invalid 1-SMT 2-Core
+ 0xB, 0, EDX, 31:0, x2apic_id, x2APIC ID the current logical processor
+
+
+# Leaf 0DH
+# Processor Extended State
+
+ 0xD, 0, EAX, 0, x87, X87 state
+ 0xD, 0, EAX, 1, sse, SSE state
+ 0xD, 0, EAX, 2, avx, AVX state
+ 0xD, 0, EAX, 4:3, mpx, MPX state
+ 0xD, 0, EAX, 7:5, avx512, AVX-512 state
+ 0xD, 0, EAX, 9, pkru, PKRU state
+
+ 0xD, 0, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 0, ECX, 31:0, max_sz_xsave, Maximum size (bytes) of the XSAVE/XRSTOR save area
+
+ 0xD, 1, EAX, 0, xsaveopt, XSAVEOPT available
+ 0xD, 1, EAX, 1, xsavec, XSAVEC and compacted form supported
+ 0xD, 1, EAX, 2, xgetbv, XGETBV supported
+ 0xD, 1, EAX, 3, xsaves, XSAVES/XRSTORS and IA32_XSS supported
+
+ 0xD, 1, EBX, 31:0, max_sz_xcr0, Maximum size (bytes) required by enabled features in XCR0
+ 0xD, 1, ECX, 8, pt, PT state
+ 0xD, 1, ECX, 11, cet_usr, CET user state
+ 0xD, 1, ECX, 12, cet_supv, CET supervisor state
+ 0xD, 1, ECX, 13, hdc, HDC state
+ 0xD, 1, ECX, 16, hwp, HWP state
+
+# Leaf 0FH
+# Intel RDT Monitoring
+
+ 0xF, 0, EBX, 31:0, rmid_range, Maximum range (zero-based) of RMID within this physical processor of all types
+ 0xF, 0, EDX, 1, l3c_rdt_mon, L3 Cache RDT Monitoring supported
+
+ 0xF, 1, ECX, 31:0, rmid_range, Maximum range (zero-based) of RMID of this types
+ 0xF, 1, EDX, 0, l3c_ocp_mon, L3 Cache occupancy Monitoring supported
+ 0xF, 1, EDX, 1, l3c_tbw_mon, L3 Cache Total Bandwidth Monitoring supported
+ 0xF, 1, EDX, 2, l3c_lbw_mon, L3 Cache Local Bandwidth Monitoring supported
+
+# Leaf 10H
+# Intel RDT Allocation
+
+ 0x10, 0, EBX, 1, l3c_rdt_alloc, L3 Cache Allocation supported
+ 0x10, 0, EBX, 2, l2c_rdt_alloc, L2 Cache Allocation supported
+ 0x10, 0, EBX, 3, mem_bw_alloc, Memory Bandwidth Allocation supported
+
+
+# Leaf 12H
+# SGX Capability
+#
+# Some detailed SGX features not added yet
+
+ 0x12, 0, EAX, 0, sgx1, L3 Cache Allocation supported
+ 0x12, 1, EAX, 0, sgx2, L3 Cache Allocation supported
+
+
+# Leaf 14H
+# Intel Processor Tracer
+#
+
+# Leaf 15H
+# Time Stamp Counter and Nominal Core Crystal Clock Information
+
+ 0x15, 0, EAX, 31:0, tsc_denominator, The denominator of the TSC/”core crystal clock” ratio
+ 0x15, 0, EBX, 31:0, tsc_numerator, The numerator of the TSC/”core crystal clock” ratio
+ 0x15, 0, ECX, 31:0, nom_freq, Nominal frequency of the core crystal clock in Hz
+
+# Leaf 16H
+# Processor Frequency Information
+
+ 0x16, 0, EAX, 15:0, cpu_base_freq, Processor Base Frequency in MHz
+ 0x16, 0, EBX, 15:0, cpu_max_freq, Maximum Frequency in MHz
+ 0x16, 0, ECX, 15:0, bus_freq, Bus (Reference) Frequency in MHz
+
+# Leaf 17H
+# System-On-Chip Vendor Attribute
+
+ 0x17, 0, EAX, 31:0, max_socid, Maximum input value of supported sub-leaf
+ 0x17, 0, EBX, 15:0, soc_vid, SOC Vendor ID
+ 0x17, 0, EBX, 16, std_vid, SOC Vendor ID is assigned via an industry standard scheme
+ 0x17, 0, ECX, 31:0, soc_pid, SOC Project ID assigned by vendor
+ 0x17, 0, EDX, 31:0, soc_sid, SOC Stepping ID
+
+# Leaf 18H
+# Deterministic Address Translation Parameters
+
+
+# Leaf 19H
+# Key Locker Leaf
+
+
+# Leaf 1AH
+# Hybrid Information
+
+ 0x1A, 0, EAX, 31:24, core_type, 20H-Intel_Atom 40H-Intel_Core
+
+
+# Leaf 1FH
+# V2 Extended Topology - A preferred superset to leaf 0BH
+
+
+# According to SDM
+# 40000000H - 4FFFFFFFH is invalid range
+
+
+# Leaf 80000001H
+# Extended Processor Signature and Feature Bits
+
+0x80000001, 0, ECX, 0, lahf_lm, LAHF/SAHF available in 64-bit mode
+0x80000001, 0, ECX, 5, lzcnt, LZCNT
+0x80000001, 0, ECX, 8, prefetchw, PREFETCHW
+
+0x80000001, 0, EDX, 11, sysret, SYSCALL/SYSRET supported
+0x80000001, 0, EDX, 20, exec_dis, Execute Disable Bit available
+0x80000001, 0, EDX, 26, 1gb_page, 1GB page supported
+0x80000001, 0, EDX, 27, rdtscp, RDTSCP and IA32_TSC_AUX are available
+#0x80000001, 0, EDX, 29, 64b, 64b Architecture supported
+
+# Leaf 80000002H/80000003H/80000004H
+# Processor Brand String
+
+# Leaf 80000005H
+# Reserved
+
+# Leaf 80000006H
+# Extended L2 Cache Features
+
+0x80000006, 0, ECX, 7:0, clsize, Cache Line size in bytes
+0x80000006, 0, ECX, 15:12, l2c_assoc, L2 Associativity
+0x80000006, 0, ECX, 31:16, csize, Cache size in 1K units
+
+
+# Leaf 80000007H
+
+0x80000007, 0, EDX, 8, nonstop_tsc, Invariant TSC available
+
+
+# Leaf 80000008H
+
+0x80000008, 0, EAX, 7:0, phy_adr_bits, Physical Address Bits
+0x80000008, 0, EAX, 15:8, lnr_adr_bits, Linear Address Bits
+0x80000007, 0, EBX, 9, wbnoinvd, WBNOINVD
+
+# 0x8000001E
+# EAX: Extended APIC ID
+0x8000001E, 0, EAX, 31:0, extended_apic_id, Extended APIC ID
+# EBX: Core Identifiers
+0x8000001E, 0, EBX, 7:0, core_id, Identifies the logical core ID
+0x8000001E, 0, EBX, 15:8, threads_per_core, The number of threads per core is threads_per_core + 1
+# ECX: Node Identifiers
+0x8000001E, 0, ECX, 7:0, node_id, Node ID
+0x8000001E, 0, ECX, 10:8, nodes_per_processor, Nodes per processor { 0: 1 node, else reserved }
+
+# 8000001F: AMD Secure Encryption
+0x8000001F, 0, EAX, 0, sme, Secure Memory Encryption
+0x8000001F, 0, EAX, 1, sev, Secure Encrypted Virtualization
+0x8000001F, 0, EAX, 2, vmpgflush, VM Page Flush MSR
+0x8000001F, 0, EAX, 3, seves, SEV Encrypted State
+0x8000001F, 0, EBX, 5:0, c-bit, Page table bit number used to enable memory encryption
+0x8000001F, 0, EBX, 11:6, mem_encrypt_physaddr_width, Reduction of physical address space in bits with SME enabled
+0x8000001F, 0, ECX, 31:0, num_encrypted_guests, Maximum ASID value that may be used for an SEV-enabled guest
+0x8000001F, 0, EDX, 31:0, minimum_sev_asid, Minimum ASID value that must be used for an SEV-enabled, SEV-ES-disabled guest
\ No newline at end of file
diff --git a/tools/arch/x86/kcpuid/kcpuid.c b/tools/arch/x86/kcpuid/kcpuid.c
new file mode 100644
index 000000000000..936a3a2aad04
--- /dev/null
+++ b/tools/arch/x86/kcpuid/kcpuid.c
@@ -0,0 +1,657 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+char *def_csv = "/usr/share/misc/cpuid.csv";
+char *user_csv;
+
+
+/* Cover both single-bit flag and multiple-bits fields */
+struct bits_desc {
+ /* start and end bits */
+ int start, end;
+ /* 0 or 1 for 1-bit flag */
+ int value;
+ char simp[32];
+ char detail[256];
+};
+
+/* descriptor info for eax/ebx/ecx/edx */
+struct reg_desc {
+ /* number of valid entries */
+ int nr;
+ struct bits_desc descs[32];
+};
+
+enum {
+ R_EAX = 0,
+ R_EBX,
+ R_ECX,
+ R_EDX,
+ NR_REGS
+};
+
+struct subleaf {
+ u32 index;
+ u32 sub;
+ u32 eax, ebx, ecx, edx;
+ struct reg_desc info[NR_REGS];
+};
+
+/* Represent one leaf (basic or extended) */
+struct cpuid_func {
+ /*
+ * Array of subleafs for this func, if there is no subleafs
+ * then the leafs[0] is the main leaf
+ */
+ struct subleaf *leafs;
+ int nr;
+};
+
+struct cpuid_range {
+ /* array of main leafs */
+ struct cpuid_func *funcs;
+ /* number of valid leafs */
+ int nr;
+ bool is_ext;
+};
+
+/*
+ * basic: basic functions range: [0... ]
+ * ext: extended functions range: [0x80000000... ]
+ */
+struct cpuid_range *leafs_basic, *leafs_ext;
+
+static int num_leafs;
+static bool is_amd;
+static bool show_details;
+static bool show_raw;
+static bool show_flags_only = true;
+static u32 user_index = 0xFFFFFFFF;
+static u32 user_sub = 0xFFFFFFFF;
+static int flines;
+
+static inline void cpuid(u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+{
+ /* ecx is often an input as well as an output. */
+ asm volatile("cpuid"
+ : "=a" (*eax),
+ "=b" (*ebx),
+ "=c" (*ecx),
+ "=d" (*edx)
+ : "0" (*eax), "2" (*ecx));
+}
+
+static inline bool has_subleafs(u32 f)
+{
+ if (f == 0x7 || f == 0xd)
+ return true;
+
+ if (is_amd) {
+ if (f == 0x8000001d)
+ return true;
+ return false;
+ }
+
+ switch (f) {
+ case 0x4:
+ case 0xb:
+ case 0xf:
+ case 0x10:
+ case 0x14:
+ case 0x18:
+ case 0x1f:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static void leaf_print_raw(struct subleaf *leaf)
+{
+ if (has_subleafs(leaf->index)) {
+ if (leaf->sub == 0)
+ printf("0x%08x: subleafs:\n", leaf->index);
+
+ printf(" %2d: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->sub, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ } else {
+ printf("0x%08x: EAX=0x%08x, EBX=0x%08x, ECX=0x%08x, EDX=0x%08x\n",
+ leaf->index, leaf->eax, leaf->ebx, leaf->ecx, leaf->edx);
+ }
+}
+
+/* Return true is the input eax/ebx/ecx/edx are all zero */
+static bool cpuid_store(struct cpuid_range *range, u32 f, int subleaf,
+ u32 a, u32 b, u32 c, u32 d)
+{
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ int s = 0;
+
+ if (a == 0 && b == 0 && c == 0 && d == 0)
+ return true;
+
+ /*
+ * Cut off vendor-prefix from CPUID function as we're using it as an
+ * index into ->funcs.
+ */
+ func = &range->funcs[f & 0xffff];
+
+ if (!func->leafs) {
+ func->leafs = malloc(sizeof(struct subleaf));
+ if (!func->leafs)
+ perror("malloc func leaf");
+
+ func->nr = 1;
+ } else {
+ s = func->nr;
+ func->leafs = realloc(func->leafs, (s + 1) * sizeof(*leaf));
+ if (!func->leafs)
+ perror("realloc f->leafs");
+
+ func->nr++;
+ }
+
+ leaf = &func->leafs[s];
+
+ leaf->index = f;
+ leaf->sub = subleaf;
+ leaf->eax = a;
+ leaf->ebx = b;
+ leaf->ecx = c;
+ leaf->edx = d;
+
+ return false;
+}
+
+static void raw_dump_range(struct cpuid_range *range)
+{
+ u32 f;
+ int i;
+
+ printf("%s Leafs :\n", range->is_ext ? "Extended" : "Basic");
+ printf("================\n");
+
+ for (f = 0; (int)f < range->nr; f++) {
+ struct cpuid_func *func = &range->funcs[f];
+ u32 index = f;
+
+ if (range->is_ext)
+ index += 0x80000000;
+
+ /* Skip leaf without valid items */
+ if (!func->nr)
+ continue;
+
+ /* First item is the main leaf, followed by all subleafs */
+ for (i = 0; i < func->nr; i++)
+ leaf_print_raw(&func->leafs[i]);
+ }
+}
+
+#define MAX_SUBLEAF_NUM 32
+struct cpuid_range *setup_cpuid_range(u32 input_eax)
+{
+ u32 max_func, idx_func;
+ int subleaf;
+ struct cpuid_range *range;
+ u32 eax, ebx, ecx, edx;
+ u32 f = input_eax;
+ int max_subleaf;
+ bool allzero;
+
+ eax = input_eax;
+ ebx = ecx = edx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ max_func = eax;
+ idx_func = (max_func & 0xffff) + 1;
+
+ range = malloc(sizeof(struct cpuid_range));
+ if (!range)
+ perror("malloc range");
+
+ if (input_eax & 0x80000000)
+ range->is_ext = true;
+ else
+ range->is_ext = false;
+
+ range->funcs = malloc(sizeof(struct cpuid_func) * idx_func);
+ if (!range->funcs)
+ perror("malloc range->funcs");
+
+ range->nr = idx_func;
+ memset(range->funcs, 0, sizeof(struct cpuid_func) * idx_func);
+
+ for (; f <= max_func; f++) {
+ eax = f;
+ subleaf = ecx = 0;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf, eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+
+ if (!has_subleafs(f))
+ continue;
+
+ max_subleaf = MAX_SUBLEAF_NUM;
+
+ /*
+ * Some can provide the exact number of subleafs,
+ * others have to be tried (0xf)
+ */
+ if (f == 0x7 || f == 0x14 || f == 0x17 || f == 0x18)
+ max_subleaf = (eax & 0xff) + 1;
+
+ if (f == 0xb)
+ max_subleaf = 2;
+
+ for (subleaf = 1; subleaf < max_subleaf; subleaf++) {
+ eax = f;
+ ecx = subleaf;
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ allzero = cpuid_store(range, f, subleaf,
+ eax, ebx, ecx, edx);
+ if (allzero)
+ continue;
+ num_leafs++;
+ }
+
+ }
+
+ return range;
+}
+
+/*
+ * The basic row format for cpuid.csv is
+ * LEAF,SUBLEAF,register_name,bits,short name,long description
+ *
+ * like:
+ * 0, 0, EAX, 31:0, max_basic_leafs, Max input value for supported subleafs
+ * 1, 0, ECX, 0, sse3, Streaming SIMD Extensions 3(SSE3)
+ */
+static int parse_line(char *line)
+{
+ char *str;
+ int i;
+ struct cpuid_range *range;
+ struct cpuid_func *func;
+ struct subleaf *leaf;
+ u32 index;
+ u32 sub;
+ char buffer[512];
+ char *buf;
+ /*
+ * Tokens:
+ * 1. leaf
+ * 2. subleaf
+ * 3. register
+ * 4. bits
+ * 5. short name
+ * 6. long detail
+ */
+ char *tokens[6];
+ struct reg_desc *reg;
+ struct bits_desc *bdesc;
+ int reg_index;
+ char *start, *end;
+
+ /* Skip comments and NULL line */
+ if (line[0] == '#' || line[0] == '\n')
+ return 0;
+
+ strncpy(buffer, line, 511);
+ buffer[511] = 0;
+ str = buffer;
+ for (i = 0; i < 5; i++) {
+ tokens[i] = strtok(str, ",");
+ if (!tokens[i])
+ goto err_exit;
+ str = NULL;
+ }
+ tokens[5] = strtok(str, "\n");
+ if (!tokens[5])
+ goto err_exit;
+
+ /* index/main-leaf */
+ index = strtoull(tokens[0], NULL, 0);
+
+ if (index & 0x80000000)
+ range = leafs_ext;
+ else
+ range = leafs_basic;
+
+ index &= 0x7FFFFFFF;
+ /* Skip line parsing for non-existing indexes */
+ if ((int)index >= range->nr)
+ return -1;
+
+ func = &range->funcs[index];
+
+ /* Return if the index has no valid item on this platform */
+ if (!func->nr)
+ return 0;
+
+ /* subleaf */
+ sub = strtoul(tokens[1], NULL, 0);
+ if ((int)sub > func->nr)
+ return -1;
+
+ leaf = &func->leafs[sub];
+ buf = tokens[2];
+
+ if (strcasestr(buf, "EAX"))
+ reg_index = R_EAX;
+ else if (strcasestr(buf, "EBX"))
+ reg_index = R_EBX;
+ else if (strcasestr(buf, "ECX"))
+ reg_index = R_ECX;
+ else if (strcasestr(buf, "EDX"))
+ reg_index = R_EDX;
+ else
+ goto err_exit;
+
+ reg = &leaf->info[reg_index];
+ bdesc = ®->descs[reg->nr++];
+
+ /* bit flag or bits field */
+ buf = tokens[3];
+
+ end = strtok(buf, ":");
+ bdesc->end = strtoul(end, NULL, 0);
+ bdesc->start = bdesc->end;
+
+ /* start != NULL means it is bit fields */
+ start = strtok(NULL, ":");
+ if (start)
+ bdesc->start = strtoul(start, NULL, 0);
+
+ strcpy(bdesc->simp, tokens[4]);
+ strcpy(bdesc->detail, tokens[5]);
+ return 0;
+
+err_exit:
+ printf("Warning: wrong line format:\n");
+ printf("\tline[%d]: %s\n", flines, line);
+ return -1;
+}
+
+/* Parse csv file, and construct the array of all leafs and subleafs */
+static void parse_text(void)
+{
+ FILE *file;
+ char *filename, *line = NULL;
+ size_t len = 0;
+ int ret;
+
+ if (show_raw)
+ return;
+
+ filename = user_csv ? user_csv : def_csv;
+ file = fopen(filename, "r");
+ if (!file) {
+ /* Fallback to a csv in the same dir */
+ file = fopen("./cpuid.csv", "r");
+ }
+
+ if (!file) {
+ printf("Fail to open '%s'\n", filename);
+ return;
+ }
+
+ while (1) {
+ ret = getline(&line, &len, file);
+ flines++;
+ if (ret > 0)
+ parse_line(line);
+
+ if (feof(file))
+ break;
+ }
+
+ fclose(file);
+}
+
+
+/* Decode every eax/ebx/ecx/edx */
+static void decode_bits(u32 value, struct reg_desc *rdesc)
+{
+ struct bits_desc *bdesc;
+ int start, end, i;
+ u32 mask;
+
+ for (i = 0; i < rdesc->nr; i++) {
+ bdesc = &rdesc->descs[i];
+
+ start = bdesc->start;
+ end = bdesc->end;
+ if (start == end) {
+ /* single bit flag */
+ if (value & (1 << start))
+ printf("\t%-20s %s%s\n",
+ bdesc->simp,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ } else {
+ /* bit fields */
+ if (show_flags_only)
+ continue;
+
+ mask = ((u64)1 << (end - start + 1)) - 1;
+ printf("\t%-20s\t: 0x%-8x\t%s%s\n",
+ bdesc->simp,
+ (value >> start) & mask,
+ show_details ? "-" : "",
+ show_details ? bdesc->detail : ""
+ );
+ }
+ }
+}
+
+static void show_leaf(struct subleaf *leaf)
+{
+ if (!leaf)
+ return;
+
+ if (show_raw)
+ leaf_print_raw(leaf);
+
+ decode_bits(leaf->eax, &leaf->info[R_EAX]);
+ decode_bits(leaf->ebx, &leaf->info[R_EBX]);
+ decode_bits(leaf->ecx, &leaf->info[R_ECX]);
+ decode_bits(leaf->edx, &leaf->info[R_EDX]);
+}
+
+static void show_func(struct cpuid_func *func)
+{
+ int i;
+
+ if (!func)
+ return;
+
+ for (i = 0; i < func->nr; i++)
+ show_leaf(&func->leafs[i]);
+}
+
+static void show_range(struct cpuid_range *range)
+{
+ int i;
+
+ for (i = 0; i < range->nr; i++)
+ show_func(&range->funcs[i]);
+}
+
+static inline struct cpuid_func *index_to_func(u32 index)
+{
+ struct cpuid_range *range;
+
+ range = (index & 0x80000000) ? leafs_ext : leafs_basic;
+ index &= 0x7FFFFFFF;
+
+ if (((index & 0xFFFF) + 1) > (u32)range->nr) {
+ printf("ERR: invalid input index (0x%x)\n", index);
+ return NULL;
+ }
+ return &range->funcs[index];
+}
+
+static void show_info(void)
+{
+ struct cpuid_func *func;
+
+ if (show_raw) {
+ /* Show all of the raw output of 'cpuid' instr */
+ raw_dump_range(leafs_basic);
+ raw_dump_range(leafs_ext);
+ return;
+ }
+
+ if (user_index != 0xFFFFFFFF) {
+ /* Only show specific leaf/subleaf info */
+ func = index_to_func(user_index);
+ if (!func)
+ return;
+
+ /* Dump the raw data also */
+ show_raw = true;
+
+ if (user_sub != 0xFFFFFFFF) {
+ if (user_sub + 1 <= (u32)func->nr) {
+ show_leaf(&func->leafs[user_sub]);
+ return;
+ }
+
+ printf("ERR: invalid input subleaf (0x%x)\n", user_sub);
+ }
+
+ show_func(func);
+ return;
+ }
+
+ printf("CPU features:\n=============\n\n");
+ show_range(leafs_basic);
+ show_range(leafs_ext);
+}
+
+static void setup_platform_cpuid(void)
+{
+ u32 eax, ebx, ecx, edx;
+
+ /* Check vendor */
+ eax = ebx = ecx = edx = 0;
+ cpuid(&eax, &ebx, &ecx, &edx);
+
+ /* "htuA" */
+ if (ebx == 0x68747541)
+ is_amd = true;
+
+ /* Setup leafs for the basic and extended range */
+ leafs_basic = setup_cpuid_range(0x0);
+ leafs_ext = setup_cpuid_range(0x80000000);
+}
+
+static void usage(void)
+{
+ printf("kcpuid [-abdfhr] [-l leaf] [-s subleaf]\n"
+ "\t-a|--all Show both bit flags and complex bit fields info\n"
+ "\t-b|--bitflags Show boolean flags only\n"
+ "\t-d|--detail Show details of the flag/fields (default)\n"
+ "\t-f|--flags Specify the cpuid csv file\n"
+ "\t-h|--help Show usage info\n"
+ "\t-l|--leaf=index Specify the leaf you want to check\n"
+ "\t-r|--raw Show raw cpuid data\n"
+ "\t-s|--subleaf=sub Specify the subleaf you want to check\n"
+ );
+}
+
+static struct option opts[] = {
+ { "all", no_argument, NULL, 'a' }, /* show both bit flags and fields */
+ { "bitflags", no_argument, NULL, 'b' }, /* only show bit flags, default on */
+ { "detail", no_argument, NULL, 'd' }, /* show detail descriptions */
+ { "file", required_argument, NULL, 'f' }, /* use user's cpuid file */
+ { "help", no_argument, NULL, 'h'}, /* show usage */
+ { "leaf", required_argument, NULL, 'l'}, /* only check a specific leaf */
+ { "raw", no_argument, NULL, 'r'}, /* show raw CPUID leaf data */
+ { "subleaf", required_argument, NULL, 's'}, /* check a specific subleaf */
+ { NULL, 0, NULL, 0 }
+};
+
+static int parse_options(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt_long(argc, argv, "abdf:hl:rs:",
+ opts, NULL)) != -1)
+ switch (c) {
+ case 'a':
+ show_flags_only = false;
+ break;
+ case 'b':
+ show_flags_only = true;
+ break;
+ case 'd':
+ show_details = true;
+ break;
+ case 'f':
+ user_csv = optarg;
+ break;
+ case 'h':
+ usage();
+ exit(1);
+ break;
+ case 'l':
+ /* main leaf */
+ user_index = strtoul(optarg, NULL, 0);
+ break;
+ case 'r':
+ show_raw = true;
+ break;
+ case 's':
+ /* subleaf */
+ user_sub = strtoul(optarg, NULL, 0);
+ break;
+ default:
+ printf("%s: Invalid option '%c'\n", argv[0], optopt);
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * Do 4 things in turn:
+ * 1. Parse user options
+ * 2. Parse and store all the CPUID leaf data supported on this platform
+ * 2. Parse the csv file, while skipping leafs which are not available
+ * on this platform
+ * 3. Print leafs info based on user options
+ */
+int main(int argc, char *argv[])
+{
+ if (parse_options(argc, argv))
+ return -1;
+
+ /* Setup the cpuid leafs of current platform */
+ setup_platform_cpuid();
+
+ /* Read and parse the 'cpuid.csv' */
+ parse_text();
+
+ show_info();
+ return 0;
+}
\ No newline at end of file
--
2.30.0
1
0

[PATCH kernel-4.19 1/3] net: hns3: use ae_dev->ops->reset_event to do reset.
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Yonglong Liu <liuyonglong(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: NA
CVE: NA
----------------------------
IT products want to know whether a reset is happened, so need to
use ae_dev->ops->reset_event to notify them.
Signed-off-by: Yonglong Liu <liuyonglong(a)huawei.com>
Reviewed-by: li yongxin <liyongxin1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 98a1d3d2870a9..8507eb60450fe 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3933,6 +3933,7 @@ static void hclge_set_def_reset_request(struct hnae3_ae_dev *ae_dev,
static void hclge_reset_timer(struct timer_list *t)
{
struct hclge_dev *hdev = from_timer(hdev, t, reset_timer);
+ struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev);
/* if default_reset_request has no value, it means that this reset
* request has already be handled, so just return here
@@ -3942,7 +3943,9 @@ static void hclge_reset_timer(struct timer_list *t)
dev_info(&hdev->pdev->dev,
"triggering reset in reset timer\n");
- hclge_reset_event(hdev->pdev, NULL);
+
+ if (ae_dev->ops->reset_event)
+ ae_dev->ops->reset_event(hdev->pdev, NULL);
}
static bool hclge_reset_end(struct hnae3_handle *handle, bool done)
--
2.25.1
1
2

[PATCH kernel-4.19] media: firewire: firedtv-avc: fix a buffer overflow in avc_ca_pmt()
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Dan Carpenter <dan.carpenter(a)oracle.com>
mainline inclusion
from mainline-v5.16
commit 35d2969ea3c7d32aee78066b1f3cf61a0d935a4e
category: bugfix
bugzilla: NA
CVE: CVE-2021-42739
-------------------------------------------------
The bounds checking in avc_ca_pmt() is not strict enough. It should
be checking "read_pos + 4" because it's reading 5 bytes. If the
"es_info_length" is non-zero then it reads a 6th byte so there needs to
be an additional check for that.
I also added checks for the "write_pos". I don't think these are
required because "read_pos" and "write_pos" are tied together so
checking one ought to be enough. But they make the code easier to
understand for me. The check on write_pos is:
if (write_pos + 4 >= sizeof(c->operand) - 4) {
The first "+ 4" is because we're writing 5 bytes and the last " - 4"
is to leave space for the CRC.
The other problem is that "length" can be invalid. It comes from
"data_length" in fdtv_ca_pmt().
Cc: stable(a)vger.kernel.org
Reported-by: Luo Likang <luolikang(a)nsfocus.com>
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/media/firewire/firedtv-avc.c | 14 +++++++++++---
drivers/media/firewire/firedtv-ci.c | 2 ++
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/media/firewire/firedtv-avc.c b/drivers/media/firewire/firedtv-avc.c
index 3ef5df1648d77..8c31cf90c5907 100644
--- a/drivers/media/firewire/firedtv-avc.c
+++ b/drivers/media/firewire/firedtv-avc.c
@@ -1169,7 +1169,11 @@ int avc_ca_pmt(struct firedtv *fdtv, char *msg, int length)
read_pos += program_info_length;
write_pos += program_info_length;
}
- while (read_pos < length) {
+ while (read_pos + 4 < length) {
+ if (write_pos + 4 >= sizeof(c->operand) - 4) {
+ ret = -EINVAL;
+ goto out;
+ }
c->operand[write_pos++] = msg[read_pos++];
c->operand[write_pos++] = msg[read_pos++];
c->operand[write_pos++] = msg[read_pos++];
@@ -1181,13 +1185,17 @@ int avc_ca_pmt(struct firedtv *fdtv, char *msg, int length)
c->operand[write_pos++] = es_info_length >> 8;
c->operand[write_pos++] = es_info_length & 0xff;
if (es_info_length > 0) {
+ if (read_pos >= length) {
+ ret = -EINVAL;
+ goto out;
+ }
pmt_cmd_id = msg[read_pos++];
if (pmt_cmd_id != 1 && pmt_cmd_id != 4)
dev_err(fdtv->device, "invalid pmt_cmd_id %d at stream level\n",
pmt_cmd_id);
- if (es_info_length > sizeof(c->operand) - 4 -
- write_pos) {
+ if (es_info_length > sizeof(c->operand) - 4 - write_pos ||
+ es_info_length > length - read_pos) {
ret = -EINVAL;
goto out;
}
diff --git a/drivers/media/firewire/firedtv-ci.c b/drivers/media/firewire/firedtv-ci.c
index 8dc5a7495abee..14f779812d250 100644
--- a/drivers/media/firewire/firedtv-ci.c
+++ b/drivers/media/firewire/firedtv-ci.c
@@ -138,6 +138,8 @@ static int fdtv_ca_pmt(struct firedtv *fdtv, void *arg)
} else {
data_length = msg->msg[3];
}
+ if (data_length > sizeof(msg->msg) - data_pos)
+ return -EINVAL;
return avc_ca_pmt(fdtv, &msg->msg[data_pos], data_length);
}
--
2.25.1
1
0
From: wuhuaye <wuhuaye(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DLVR
CVE: NA
-------------------------------------------------
Hi everyone,
This patch add some special gpio register configuration.
It is necessary to configure some gpio register
when gpio-dwapb function is enable in Ascend platform.
Signed-off-by: wuhuaye <wuhuaye(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/mm/init.c | 17 ++++++++++
drivers/gpio/gpio-dwapb.c | 66 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 82 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5449ae2d26bee..55fc6a0206796 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -941,6 +941,12 @@ __setup("keepinitrd", keepinitrd_setup);
void ascend_enable_all_features(void)
{
+#ifdef CONFIG_GPIO_DWAPB
+ extern bool enable_ascend_gpio_dwapb;
+
+ enable_ascend_gpio_dwapb = true;
+#endif
+
if (IS_ENABLED(CONFIG_ASCEND_DVPP_MMAP))
enable_mmap_dvpp = 1;
@@ -974,6 +980,17 @@ static int __init ascend_enable_setup(char *__unused)
}
early_param("ascend_enable_all", ascend_enable_setup);
+
+static int __init ascend_mini_enable_setup(char *s)
+{
+#ifdef CONFIG_GPIO_DWAPB
+ extern bool enable_ascend_mini_gpio_dwapb;
+
+ enable_ascend_mini_gpio_dwapb = true;
+#endif
+ return 1;
+}
+__setup("ascend_mini_enable", ascend_mini_enable_setup);
#endif
diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 044888fd96a1f..2cdbb4a11075f 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -50,6 +50,9 @@
#define GPIO_EXT_PORTB 0x54
#define GPIO_EXT_PORTC 0x58
#define GPIO_EXT_PORTD 0x5c
+#define GPIO_INTCOMB_MASK 0xffc
+#define GPIO_INT_MASK_REG 0x3804
+#define MEM_PERI_SUBCTRL_IOBASE 0x1
#define DWAPB_MAX_PORTS 4
#define GPIO_EXT_PORT_STRIDE 0x04 /* register stride 32 bits */
@@ -64,6 +67,8 @@
#define GPIO_INTSTATUS_V2 0x3c
#define GPIO_PORTA_EOI_V2 0x40
+bool enable_ascend_mini_gpio_dwapb;
+bool enable_ascend_gpio_dwapb;
struct dwapb_gpio;
#ifdef CONFIG_PM_SLEEP
@@ -77,10 +82,11 @@ struct dwapb_context {
u32 int_type;
u32 int_pol;
u32 int_deb;
+ u32 int_comb_mask;
u32 wake_en;
};
#endif
-
+static void __iomem *peri_subctrl_base_addr;
struct dwapb_gpio_port {
struct gpio_chip gc;
bool is_registered;
@@ -232,6 +238,11 @@ static void dwapb_irq_enable(struct irq_data *d)
val = dwapb_read(gpio, GPIO_INTEN);
val |= BIT(d->hwirq);
dwapb_write(gpio, GPIO_INTEN, val);
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTMASK);
+ val &= ~BIT(d->hwirq);
+ dwapb_write(gpio, GPIO_INTMASK, val);
+ }
spin_unlock_irqrestore(&gc->bgpio_lock, flags);
}
@@ -244,6 +255,11 @@ static void dwapb_irq_disable(struct irq_data *d)
u32 val;
spin_lock_irqsave(&gc->bgpio_lock, flags);
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTMASK);
+ val |= BIT(d->hwirq);
+ dwapb_write(gpio, GPIO_INTMASK, val);
+ }
val = dwapb_read(gpio, GPIO_INTEN);
val &= ~BIT(d->hwirq);
dwapb_write(gpio, GPIO_INTEN, val);
@@ -393,6 +409,7 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
unsigned int hwirq, ngpio = gc->ngpio;
struct irq_chip_type *ct;
int err, i;
+ u32 val;
gpio->domain = irq_domain_create_linear(fwnode, ngpio,
&irq_generic_chip_ops, gpio);
@@ -470,6 +487,12 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
irq_create_mapping(gpio->domain, hwirq);
port->gc.to_irq = dwapb_gpio_to_irq;
+
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTCOMB_MASK);
+ val |= BIT(0);
+ dwapb_write(gpio, GPIO_INTCOMB_MASK, val);
+ }
}
static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
@@ -478,6 +501,7 @@ static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
struct gpio_chip *gc = &port->gc;
unsigned int ngpio = gc->ngpio;
irq_hw_number_t hwirq;
+ u32 val;
if (!gpio->domain)
return;
@@ -487,6 +511,12 @@ static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
irq_domain_remove(gpio->domain);
gpio->domain = NULL;
+
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTCOMB_MASK);
+ val &= ~BIT(0);
+ dwapb_write(gpio, GPIO_INTCOMB_MASK, val);
+ }
}
static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
@@ -660,6 +690,22 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
int err;
struct device *dev = &pdev->dev;
struct dwapb_platform_data *pdata = dev_get_platdata(dev);
+ struct device_node *np = dev->of_node;
+ unsigned int value;
+
+ if (enable_ascend_mini_gpio_dwapb && enable_ascend_gpio_dwapb) {
+ peri_subctrl_base_addr = of_iomap(np, MEM_PERI_SUBCTRL_IOBASE);
+ if (!peri_subctrl_base_addr) {
+ dev_err(&pdev->dev, "sysctrl iomap not find!\n");
+ } else {
+ dev_dbg(&pdev->dev, "sysctrl iomap find!\n");
+ value = readl(peri_subctrl_base_addr +
+ GPIO_INT_MASK_REG);
+ value &= ~1UL;
+ writel(value, peri_subctrl_base_addr +
+ GPIO_INT_MASK_REG);
+ }
+ }
if (!pdata) {
pdata = dwapb_gpio_get_pdata(dev);
@@ -742,6 +788,10 @@ static int dwapb_gpio_remove(struct platform_device *pdev)
reset_control_assert(gpio->rst);
clk_disable_unprepare(gpio->clk);
+ if ((peri_subctrl_base_addr != NULL) && enable_ascend_mini_gpio_dwapb &&
+ enable_ascend_gpio_dwapb)
+ iounmap(peri_subctrl_base_addr);
+
return 0;
}
@@ -778,6 +828,9 @@ static int dwapb_gpio_suspend(struct device *dev)
ctx->int_pol = dwapb_read(gpio, GPIO_INT_POLARITY);
ctx->int_type = dwapb_read(gpio, GPIO_INTTYPE_LEVEL);
ctx->int_deb = dwapb_read(gpio, GPIO_PORTA_DEBOUNCE);
+ if (enable_ascend_gpio_dwapb)
+ ctx->int_comb_mask =
+ dwapb_read(gpio, GPIO_INTCOMB_MASK);
/* Mask out interrupts */
dwapb_write(gpio, GPIO_INTMASK,
@@ -798,6 +851,7 @@ static int dwapb_gpio_resume(struct device *dev)
struct gpio_chip *gc = &gpio->ports[0].gc;
unsigned long flags;
int i;
+ unsigned int value;
if (!IS_ERR(gpio->clk))
clk_prepare_enable(gpio->clk);
@@ -826,6 +880,9 @@ static int dwapb_gpio_resume(struct device *dev)
dwapb_write(gpio, GPIO_PORTA_DEBOUNCE, ctx->int_deb);
dwapb_write(gpio, GPIO_INTEN, ctx->int_en);
dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);
+ if (enable_ascend_gpio_dwapb)
+ dwapb_write(gpio,
+ GPIO_INTCOMB_MASK, ctx->int_comb_mask);
/* Clear out spurious interrupts */
dwapb_write(gpio, GPIO_PORTA_EOI, 0xffffffff);
@@ -833,6 +890,13 @@ static int dwapb_gpio_resume(struct device *dev)
}
spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+ if ((peri_subctrl_base_addr != NULL) && enable_ascend_mini_gpio_dwapb &&
+ enable_ascend_gpio_dwapb) {
+ value = readl(peri_subctrl_base_addr + GPIO_INT_MASK_REG);
+ value &= ~1UL;
+ writel(value, peri_subctrl_base_addr + GPIO_INT_MASK_REG);
+ }
+
return 0;
}
#endif
--
2.25.1
1
0
From: wuhuaye <wuhuaye(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DLVR
CVE: NA
-------------------------------------------------
Hi everyone,
This patch add some special gpio register configuration.
It is necessary to configure some gpio register
when gpio-dwapb function is enable in Ascend platform.
Signed-off-by: wuhuaye <wuhuaye(a)huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/mm/init.c | 17 ++++++++++
drivers/gpio/gpio-dwapb.c | 66 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 82 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 2399a257eaf33..8cdf92626c2c6 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -944,6 +944,12 @@ __setup("keepinitrd", keepinitrd_setup);
void ascend_enable_all_features(void)
{
+#ifdef CONFIG_GPIO_DWAPB
+ extern bool enable_ascend_gpio_dwapb;
+
+ enable_ascend_gpio_dwapb = true;
+#endif
+
if (IS_ENABLED(CONFIG_ASCEND_DVPP_MMAP))
enable_mmap_dvpp = 1;
@@ -981,6 +987,17 @@ static int __init ascend_enable_setup(char *__unused)
}
early_param("ascend_enable_all", ascend_enable_setup);
+
+static int __init ascend_mini_enable_setup(char *s)
+{
+#ifdef CONFIG_GPIO_DWAPB
+ extern bool enable_ascend_mini_gpio_dwapb;
+
+ enable_ascend_mini_gpio_dwapb = true;
+#endif
+ return 1;
+}
+__setup("ascend_mini_enable", ascend_mini_enable_setup);
#endif
diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 2a56efced7988..4c332f0ed8524 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -50,6 +50,9 @@
#define GPIO_EXT_PORTB 0x54
#define GPIO_EXT_PORTC 0x58
#define GPIO_EXT_PORTD 0x5c
+#define GPIO_INTCOMB_MASK 0xffc
+#define GPIO_INT_MASK_REG 0x3804
+#define MEM_PERI_SUBCTRL_IOBASE 0x1
#define DWAPB_DRIVER_NAME "gpio-dwapb"
#define DWAPB_MAX_PORTS 4
@@ -66,6 +69,8 @@
#define GPIO_INTSTATUS_V2 0x3c
#define GPIO_PORTA_EOI_V2 0x40
+bool enable_ascend_mini_gpio_dwapb;
+bool enable_ascend_gpio_dwapb;
struct dwapb_gpio;
#ifdef CONFIG_PM_SLEEP
@@ -79,10 +84,11 @@ struct dwapb_context {
u32 int_type;
u32 int_pol;
u32 int_deb;
+ u32 int_comb_mask;
u32 wake_en;
};
#endif
-
+static void __iomem *peri_subctrl_base_addr;
struct dwapb_gpio_port {
struct gpio_chip gc;
bool is_registered;
@@ -234,6 +240,11 @@ static void dwapb_irq_enable(struct irq_data *d)
val = dwapb_read(gpio, GPIO_INTEN);
val |= BIT(d->hwirq);
dwapb_write(gpio, GPIO_INTEN, val);
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTMASK);
+ val &= ~BIT(d->hwirq);
+ dwapb_write(gpio, GPIO_INTMASK, val);
+ }
spin_unlock_irqrestore(&gc->bgpio_lock, flags);
}
@@ -246,6 +257,11 @@ static void dwapb_irq_disable(struct irq_data *d)
u32 val;
spin_lock_irqsave(&gc->bgpio_lock, flags);
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTMASK);
+ val |= BIT(d->hwirq);
+ dwapb_write(gpio, GPIO_INTMASK, val);
+ }
val = dwapb_read(gpio, GPIO_INTEN);
val &= ~BIT(d->hwirq);
dwapb_write(gpio, GPIO_INTEN, val);
@@ -395,6 +411,7 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
unsigned int hwirq, ngpio = gc->ngpio;
struct irq_chip_type *ct;
int err, i;
+ u32 val;
gpio->domain = irq_domain_create_linear(fwnode, ngpio,
&irq_generic_chip_ops, gpio);
@@ -472,6 +489,12 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
irq_create_mapping(gpio->domain, hwirq);
port->gc.to_irq = dwapb_gpio_to_irq;
+
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTCOMB_MASK);
+ val |= BIT(0);
+ dwapb_write(gpio, GPIO_INTCOMB_MASK, val);
+ }
}
static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
@@ -480,6 +503,7 @@ static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
struct gpio_chip *gc = &port->gc;
unsigned int ngpio = gc->ngpio;
irq_hw_number_t hwirq;
+ u32 val;
if (!gpio->domain)
return;
@@ -489,6 +513,12 @@ static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
irq_domain_remove(gpio->domain);
gpio->domain = NULL;
+
+ if (enable_ascend_gpio_dwapb) {
+ val = dwapb_read(gpio, GPIO_INTCOMB_MASK);
+ val &= ~BIT(0);
+ dwapb_write(gpio, GPIO_INTCOMB_MASK, val);
+ }
}
static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
@@ -669,6 +699,22 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
int err;
struct device *dev = &pdev->dev;
struct dwapb_platform_data *pdata = dev_get_platdata(dev);
+ struct device_node *np = dev->of_node;
+ unsigned int value;
+
+ if (enable_ascend_mini_gpio_dwapb && enable_ascend_gpio_dwapb) {
+ peri_subctrl_base_addr = of_iomap(np, MEM_PERI_SUBCTRL_IOBASE);
+ if (!peri_subctrl_base_addr) {
+ dev_err(&pdev->dev, "sysctrl iomap not find!\n");
+ } else {
+ dev_dbg(&pdev->dev, "sysctrl iomap find!\n");
+ value = readl(peri_subctrl_base_addr +
+ GPIO_INT_MASK_REG);
+ value &= ~1UL;
+ writel(value, peri_subctrl_base_addr +
+ GPIO_INT_MASK_REG);
+ }
+ }
if (!pdata) {
pdata = dwapb_gpio_get_pdata(dev);
@@ -751,6 +797,10 @@ static int dwapb_gpio_remove(struct platform_device *pdev)
reset_control_assert(gpio->rst);
clk_disable_unprepare(gpio->clk);
+ if ((peri_subctrl_base_addr != NULL) && enable_ascend_mini_gpio_dwapb &&
+ enable_ascend_gpio_dwapb)
+ iounmap(peri_subctrl_base_addr);
+
return 0;
}
@@ -787,6 +837,9 @@ static int dwapb_gpio_suspend(struct device *dev)
ctx->int_pol = dwapb_read(gpio, GPIO_INT_POLARITY);
ctx->int_type = dwapb_read(gpio, GPIO_INTTYPE_LEVEL);
ctx->int_deb = dwapb_read(gpio, GPIO_PORTA_DEBOUNCE);
+ if (enable_ascend_gpio_dwapb)
+ ctx->int_comb_mask =
+ dwapb_read(gpio, GPIO_INTCOMB_MASK);
/* Mask out interrupts */
dwapb_write(gpio, GPIO_INTMASK,
@@ -807,6 +860,7 @@ static int dwapb_gpio_resume(struct device *dev)
struct gpio_chip *gc = &gpio->ports[0].gc;
unsigned long flags;
int i;
+ unsigned int value;
if (!IS_ERR(gpio->clk))
clk_prepare_enable(gpio->clk);
@@ -835,6 +889,9 @@ static int dwapb_gpio_resume(struct device *dev)
dwapb_write(gpio, GPIO_PORTA_DEBOUNCE, ctx->int_deb);
dwapb_write(gpio, GPIO_INTEN, ctx->int_en);
dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);
+ if (enable_ascend_gpio_dwapb)
+ dwapb_write(gpio,
+ GPIO_INTCOMB_MASK, ctx->int_comb_mask);
/* Clear out spurious interrupts */
dwapb_write(gpio, GPIO_PORTA_EOI, 0xffffffff);
@@ -842,6 +899,13 @@ static int dwapb_gpio_resume(struct device *dev)
}
spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+ if ((peri_subctrl_base_addr != NULL) && enable_ascend_mini_gpio_dwapb &&
+ enable_ascend_gpio_dwapb) {
+ value = readl(peri_subctrl_base_addr + GPIO_INT_MASK_REG);
+ value &= ~1UL;
+ writel(value, peri_subctrl_base_addr + GPIO_INT_MASK_REG);
+ }
+
return 0;
}
#endif
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS 1/9] iommu/arm-smmu-v3: Add support to configure mpam in STE/CD context
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
-------------------------------------------------
To support limiting qos of device, the partid and pmg need to be set
into the SMMU STE/CD context. This introduce support of SMMU mpam
feature and add interface to set mpam configuration in STE/CD.
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Signed-off-by: Rui Zhu <zhurui3(a)huawei.com>
Reviewed-by: Yingtai Xie <xieyingtai(a)huawei.com>
Reviewed-by: Zhen Lei <thunder.leizhen(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/iommu/arm-smmu-v3-context.c | 25 +++++
drivers/iommu/arm-smmu-v3.c | 151 ++++++++++++++++++++++++++++
include/linux/ascend_smmu.h | 9 ++
3 files changed, 185 insertions(+)
create mode 100644 include/linux/ascend_smmu.h
diff --git a/drivers/iommu/arm-smmu-v3-context.c b/drivers/iommu/arm-smmu-v3-context.c
index 2351de86d31f1..b30a93e076608 100644
--- a/drivers/iommu/arm-smmu-v3-context.c
+++ b/drivers/iommu/arm-smmu-v3-context.c
@@ -66,6 +66,9 @@
#define CTXDESC_CD_1_TTB0_MASK GENMASK_ULL(51, 4)
+#define CTXDESC_CD_5_PARTID_MASK GENMASK_ULL(47, 32)
+#define CTXDESC_CD_5_PMG_MASK GENMASK_ULL(55, 48)
+
/* Convert between AArch64 (CPU) TCR format and SMMU CD format */
#define ARM_SMMU_TCR2CD(tcr, fld) FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \
FIELD_GET(ARM64_TCR_##fld, tcr))
@@ -563,6 +566,28 @@ static int arm_smmu_set_cd(struct iommu_pasid_table_ops *ops, int pasid,
return arm_smmu_write_ctx_desc(tbl, pasid, cd);
}
+int arm_smmu_set_cd_mpam(struct iommu_pasid_table_ops *ops,
+ int ssid, int partid, int pmg)
+{
+ struct arm_smmu_cd_tables *tbl = pasid_ops_to_tables(ops);
+ u64 val;
+ __le64 *cdptr = arm_smmu_get_cd_ptr(tbl, ssid);
+
+ if (!cdptr)
+ return -ENOMEM;
+
+ val = le64_to_cpu(cdptr[5]);
+ val &= ~CTXDESC_CD_5_PARTID_MASK;
+ val |= FIELD_PREP(CTXDESC_CD_5_PARTID_MASK, partid);
+ val &= ~CTXDESC_CD_5_PMG_MASK;
+ val |= FIELD_PREP(CTXDESC_CD_5_PMG_MASK, pmg);
+ WRITE_ONCE(cdptr[5], cpu_to_le64(val));
+
+ iommu_pasid_flush(&tbl->pasid, ssid, true);
+
+ return 0;
+}
+
static void arm_smmu_clear_cd(struct iommu_pasid_table_ops *ops, int pasid,
struct iommu_pasid_entry *entry)
{
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 6731b47558f4d..c282fc129222f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -88,6 +88,10 @@
#define IDR1_SSIDSIZE GENMASK(10, 6)
#define IDR1_SIDSIZE GENMASK(5, 0)
+#define ARM_SMMU_IDR3 0xc
+#define IDR3_MPAM (1 << 7)
+#define ARM_SMMU_IDR3_CFG 0x140C
+
#define ARM_SMMU_IDR5 0x14
#define IDR5_STALL_MAX GENMASK(31, 16)
#define IDR5_GRAN64K (1 << 6)
@@ -186,6 +190,10 @@
#define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
#define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
+#define ARM_SMMU_MPAMIDR 0x130
+#define MPAMIDR_PMG_MAX GENMASK(23, 16)
+#define MPAMIDR_PARTID_MAX GENMASK(15, 0)
+
/* Common MSI config fields */
#define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2)
#define MSI_CFG2_SH GENMASK(5, 4)
@@ -250,6 +258,7 @@
#define STRTAB_STE_1_S1COR GENMASK_ULL(5, 4)
#define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6)
+#define STRTAB_STE_1_S1MPAM (1UL << 26)
#define STRTAB_STE_1_S1STALLD (1UL << 27)
#define STRTAB_STE_1_EATS GENMASK_ULL(29, 28)
@@ -273,6 +282,11 @@
#define STRTAB_STE_3_S2TTB_MASK GENMASK_ULL(51, 4)
+#define STRTAB_STE_4_PARTID_MASK GENMASK_ULL(31, 16)
+
+#define STRTAB_STE_5_MPAM_NS (1UL << 8)
+#define STRTAB_STE_5_PMG_MASK GENMASK_ULL(7, 0)
+
/* Command queue */
#define CMDQ_ENT_SZ_SHIFT 4
#define CMDQ_ENT_DWORDS ((1 << CMDQ_ENT_SZ_SHIFT) >> 3)
@@ -634,6 +648,7 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_SVA (1 << 17)
#define ARM_SMMU_FEAT_HA (1 << 18)
#define ARM_SMMU_FEAT_HD (1 << 19)
+#define ARM_SMMU_FEAT_MPAM (1 << 20)
u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
@@ -672,6 +687,9 @@ struct arm_smmu_device {
struct mutex streams_mutex;
struct iopf_queue *iopf_queue;
+
+ unsigned int mpam_partid_max;
+ unsigned int mpam_pmg_max;
};
struct arm_smmu_stream {
@@ -3977,6 +3995,25 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
if (smmu->sid_bits <= STRTAB_SPLIT)
smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+ /* IDR3 */
+ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+
+ if (!(reg & IDR3_MPAM)) {
+ reg |= FIELD_PREP(IDR3_MPAM, 1);
+ writel(reg, smmu->base + ARM_SMMU_IDR3_CFG);
+ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+ if (!(reg & IDR3_MPAM))
+ dev_warn(smmu->dev, "enable smmu mpam failed\n");
+ }
+
+ if (reg & IDR3_MPAM) {
+ reg = readl_relaxed(smmu->base + ARM_SMMU_MPAMIDR);
+ smmu->mpam_partid_max = FIELD_GET(MPAMIDR_PARTID_MAX, reg);
+ smmu->mpam_pmg_max = FIELD_GET(MPAMIDR_PMG_MAX, reg);
+ if (smmu->mpam_partid_max || smmu->mpam_pmg_max)
+ smmu->features |= ARM_SMMU_FEAT_MPAM;
+ }
+
/* IDR5 */
reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
@@ -4124,6 +4161,120 @@ static unsigned long arm_smmu_resource_size(struct arm_smmu_device *smmu)
return SZ_128K;
}
+static int arm_smmu_set_ste_mpam(struct arm_smmu_device *smmu,
+ int sid, int partid, int pmg, int s1mpam)
+{
+ u64 val;
+ __le64 *ste;
+
+ if (!arm_smmu_sid_in_range(smmu, sid))
+ return -ERANGE;
+
+ /* get ste ptr */
+ ste = arm_smmu_get_step_for_sid(smmu, sid);
+
+ /* write s1mpam to ste */
+ val = le64_to_cpu(ste[1]);
+ val &= ~STRTAB_STE_1_S1MPAM;
+ val |= FIELD_PREP(STRTAB_STE_1_S1MPAM, s1mpam);
+ WRITE_ONCE(ste[1], cpu_to_le64(val));
+
+ val = le64_to_cpu(ste[4]);
+ val &= ~STRTAB_STE_4_PARTID_MASK;
+ val |= FIELD_PREP(STRTAB_STE_4_PARTID_MASK, partid);
+ WRITE_ONCE(ste[4], cpu_to_le64(val));
+
+ val = le64_to_cpu(ste[5]);
+ val &= ~STRTAB_STE_5_PMG_MASK;
+ val |= FIELD_PREP(STRTAB_STE_5_PMG_MASK, pmg);
+ WRITE_ONCE(ste[5], cpu_to_le64(val));
+
+ arm_smmu_sync_ste_for_sid(smmu, sid);
+
+ return 0;
+}
+
+int arm_smmu_set_cd_mpam(struct iommu_pasid_table_ops *ops,
+ int ssid, int partid, int pmg);
+
+static int arm_smmu_set_mpam(struct arm_smmu_device *smmu,
+ int sid, int ssid, int partid, int pmg, int s1mpam)
+{
+ struct arm_smmu_master_data *master = arm_smmu_find_master(smmu, sid);
+ struct arm_smmu_s1_cfg *cfg = master ? master->ste.s1_cfg : NULL;
+ struct arm_smmu_domain *domain = master ? master->domain : NULL;
+ int ret;
+
+ struct arm_smmu_cmdq_ent prefetch_cmd = {
+ .opcode = CMDQ_OP_PREFETCH_CFG,
+ .prefetch = {
+ .sid = sid,
+ },
+ };
+
+ if (!(smmu->features & ARM_SMMU_FEAT_MPAM))
+ return -ENODEV;
+
+ if (WARN_ON(!domain))
+ return -EINVAL;
+
+ if (WARN_ON(!cfg))
+ return -EINVAL;
+
+ if (WARN_ON(ssid >= (1 << master->ssid_bits)))
+ return -E2BIG;
+
+ if (partid > smmu->mpam_partid_max || pmg > smmu->mpam_pmg_max) {
+ dev_err(smmu->dev,
+ "mpam rmid out of range: partid[0, %d] pmg[0, %d]\n",
+ smmu->mpam_partid_max, smmu->mpam_pmg_max);
+ return -ERANGE;
+ }
+
+ ret = arm_smmu_set_ste_mpam(smmu, sid, partid, pmg, s1mpam);
+ if (ret < 0) {
+ dev_err(smmu->dev, "set ste mpam configuration error %d\n",
+ ret);
+ return ret;
+ }
+
+ /* do not modify cd table which owned by guest */
+ if (domain->stage == ARM_SMMU_DOMAIN_NESTED) {
+ dev_err(smmu->dev,
+ "mpam: smmu cd is owned by guest, not modified\n");
+ return 0;
+ }
+
+ ret = arm_smmu_set_cd_mpam(cfg->ops, ssid, partid, pmg);
+ if (s1mpam && ret < 0) {
+ dev_err(smmu->dev, "set cd mpam configuration error %d\n",
+ ret);
+ return ret;
+ }
+
+ /* It's likely that we'll want to use the new STE soon */
+ if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
+ arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+
+ dev_info(smmu->dev, "partid %d, pmg %d\n", partid, pmg);
+
+ return 0;
+}
+
+/**
+ * arm_smmu_set_dev_mpam() - Set mpam configuration to SMMU STE/CD
+ */
+int arm_smmu_set_dev_mpam(struct device *dev, int ssid, int partid, int pmg,
+ int s1mpam)
+{
+ struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
+ struct arm_smmu_device *smmu = master->domain->smmu;
+ int sid = master->streams->id;
+
+ return arm_smmu_set_mpam(smmu, sid, ssid, partid, pmg, s1mpam);
+}
+EXPORT_SYMBOL(arm_smmu_set_dev_mpam);
+
static int arm_smmu_device_probe(struct platform_device *pdev)
{
int irq, ret;
diff --git a/include/linux/ascend_smmu.h b/include/linux/ascend_smmu.h
new file mode 100644
index 0000000000000..bd0edd25057e1
--- /dev/null
+++ b/include/linux/ascend_smmu.h
@@ -0,0 +1,9 @@
+#ifndef __LINUX_ASCEND_SMMU_H
+#define __LINUX_ASCEND_SMMU_H
+
+#include <linux/device.h>
+
+extern int arm_smmu_set_dev_mpam(struct device *dev, int ssid, int partid,
+ int pmg, int s1mpam);
+
+#endif /* __LINUX_ASCEND_SMMU_H */
--
2.25.1
1
8

[PATCH kernel-4.19 1/9] iommu/arm-smmu-v3: Add support to configure mpam in STE/CD context
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
-------------------------------------------------
To support limiting qos of device, the partid and pmg need to be set
into the SMMU STE/CD context. This introduce support of SMMU mpam
feature and add interface to set mpam configuration in STE/CD.
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Signed-off-by: Rui Zhu <zhurui3(a)huawei.com>
Reviewed-by: Yingtai Xie <xieyingtai(a)huawei.com>
Reviewed-by: Zhen Lei <thunder.leizhen(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/iommu/arm-smmu-v3-context.c | 25 +++++
drivers/iommu/arm-smmu-v3.c | 151 ++++++++++++++++++++++++++++
include/linux/ascend_smmu.h | 9 ++
3 files changed, 185 insertions(+)
create mode 100644 include/linux/ascend_smmu.h
diff --git a/drivers/iommu/arm-smmu-v3-context.c b/drivers/iommu/arm-smmu-v3-context.c
index 2351de86d31f1..b30a93e076608 100644
--- a/drivers/iommu/arm-smmu-v3-context.c
+++ b/drivers/iommu/arm-smmu-v3-context.c
@@ -66,6 +66,9 @@
#define CTXDESC_CD_1_TTB0_MASK GENMASK_ULL(51, 4)
+#define CTXDESC_CD_5_PARTID_MASK GENMASK_ULL(47, 32)
+#define CTXDESC_CD_5_PMG_MASK GENMASK_ULL(55, 48)
+
/* Convert between AArch64 (CPU) TCR format and SMMU CD format */
#define ARM_SMMU_TCR2CD(tcr, fld) FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \
FIELD_GET(ARM64_TCR_##fld, tcr))
@@ -563,6 +566,28 @@ static int arm_smmu_set_cd(struct iommu_pasid_table_ops *ops, int pasid,
return arm_smmu_write_ctx_desc(tbl, pasid, cd);
}
+int arm_smmu_set_cd_mpam(struct iommu_pasid_table_ops *ops,
+ int ssid, int partid, int pmg)
+{
+ struct arm_smmu_cd_tables *tbl = pasid_ops_to_tables(ops);
+ u64 val;
+ __le64 *cdptr = arm_smmu_get_cd_ptr(tbl, ssid);
+
+ if (!cdptr)
+ return -ENOMEM;
+
+ val = le64_to_cpu(cdptr[5]);
+ val &= ~CTXDESC_CD_5_PARTID_MASK;
+ val |= FIELD_PREP(CTXDESC_CD_5_PARTID_MASK, partid);
+ val &= ~CTXDESC_CD_5_PMG_MASK;
+ val |= FIELD_PREP(CTXDESC_CD_5_PMG_MASK, pmg);
+ WRITE_ONCE(cdptr[5], cpu_to_le64(val));
+
+ iommu_pasid_flush(&tbl->pasid, ssid, true);
+
+ return 0;
+}
+
static void arm_smmu_clear_cd(struct iommu_pasid_table_ops *ops, int pasid,
struct iommu_pasid_entry *entry)
{
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 8b5083c3e0a16..6634d596f7efc 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -88,6 +88,10 @@
#define IDR1_SSIDSIZE GENMASK(10, 6)
#define IDR1_SIDSIZE GENMASK(5, 0)
+#define ARM_SMMU_IDR3 0xc
+#define IDR3_MPAM (1 << 7)
+#define ARM_SMMU_IDR3_CFG 0x140C
+
#define ARM_SMMU_IDR5 0x14
#define IDR5_STALL_MAX GENMASK(31, 16)
#define IDR5_GRAN64K (1 << 6)
@@ -186,6 +190,10 @@
#define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
#define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
+#define ARM_SMMU_MPAMIDR 0x130
+#define MPAMIDR_PMG_MAX GENMASK(23, 16)
+#define MPAMIDR_PARTID_MAX GENMASK(15, 0)
+
/* Common MSI config fields */
#define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2)
#define MSI_CFG2_SH GENMASK(5, 4)
@@ -250,6 +258,7 @@
#define STRTAB_STE_1_S1COR GENMASK_ULL(5, 4)
#define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6)
+#define STRTAB_STE_1_S1MPAM (1UL << 26)
#define STRTAB_STE_1_S1STALLD (1UL << 27)
#define STRTAB_STE_1_EATS GENMASK_ULL(29, 28)
@@ -273,6 +282,11 @@
#define STRTAB_STE_3_S2TTB_MASK GENMASK_ULL(51, 4)
+#define STRTAB_STE_4_PARTID_MASK GENMASK_ULL(31, 16)
+
+#define STRTAB_STE_5_MPAM_NS (1UL << 8)
+#define STRTAB_STE_5_PMG_MASK GENMASK_ULL(7, 0)
+
/* Command queue */
#define CMDQ_ENT_SZ_SHIFT 4
#define CMDQ_ENT_DWORDS ((1 << CMDQ_ENT_SZ_SHIFT) >> 3)
@@ -634,6 +648,7 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_SVA (1 << 17)
#define ARM_SMMU_FEAT_HA (1 << 18)
#define ARM_SMMU_FEAT_HD (1 << 19)
+#define ARM_SMMU_FEAT_MPAM (1 << 20)
u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
@@ -672,6 +687,9 @@ struct arm_smmu_device {
struct mutex streams_mutex;
struct iopf_queue *iopf_queue;
+
+ unsigned int mpam_partid_max;
+ unsigned int mpam_pmg_max;
};
struct arm_smmu_stream {
@@ -3980,6 +3998,25 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
if (smmu->sid_bits <= STRTAB_SPLIT)
smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+ /* IDR3 */
+ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+
+ if (!(reg & IDR3_MPAM)) {
+ reg |= FIELD_PREP(IDR3_MPAM, 1);
+ writel(reg, smmu->base + ARM_SMMU_IDR3_CFG);
+ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+ if (!(reg & IDR3_MPAM))
+ dev_warn(smmu->dev, "enable smmu mpam failed\n");
+ }
+
+ if (reg & IDR3_MPAM) {
+ reg = readl_relaxed(smmu->base + ARM_SMMU_MPAMIDR);
+ smmu->mpam_partid_max = FIELD_GET(MPAMIDR_PARTID_MAX, reg);
+ smmu->mpam_pmg_max = FIELD_GET(MPAMIDR_PMG_MAX, reg);
+ if (smmu->mpam_partid_max || smmu->mpam_pmg_max)
+ smmu->features |= ARM_SMMU_FEAT_MPAM;
+ }
+
/* IDR5 */
reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
@@ -4127,6 +4164,120 @@ static unsigned long arm_smmu_resource_size(struct arm_smmu_device *smmu)
return SZ_128K;
}
+static int arm_smmu_set_ste_mpam(struct arm_smmu_device *smmu,
+ int sid, int partid, int pmg, int s1mpam)
+{
+ u64 val;
+ __le64 *ste;
+
+ if (!arm_smmu_sid_in_range(smmu, sid))
+ return -ERANGE;
+
+ /* get ste ptr */
+ ste = arm_smmu_get_step_for_sid(smmu, sid);
+
+ /* write s1mpam to ste */
+ val = le64_to_cpu(ste[1]);
+ val &= ~STRTAB_STE_1_S1MPAM;
+ val |= FIELD_PREP(STRTAB_STE_1_S1MPAM, s1mpam);
+ WRITE_ONCE(ste[1], cpu_to_le64(val));
+
+ val = le64_to_cpu(ste[4]);
+ val &= ~STRTAB_STE_4_PARTID_MASK;
+ val |= FIELD_PREP(STRTAB_STE_4_PARTID_MASK, partid);
+ WRITE_ONCE(ste[4], cpu_to_le64(val));
+
+ val = le64_to_cpu(ste[5]);
+ val &= ~STRTAB_STE_5_PMG_MASK;
+ val |= FIELD_PREP(STRTAB_STE_5_PMG_MASK, pmg);
+ WRITE_ONCE(ste[5], cpu_to_le64(val));
+
+ arm_smmu_sync_ste_for_sid(smmu, sid);
+
+ return 0;
+}
+
+int arm_smmu_set_cd_mpam(struct iommu_pasid_table_ops *ops,
+ int ssid, int partid, int pmg);
+
+static int arm_smmu_set_mpam(struct arm_smmu_device *smmu,
+ int sid, int ssid, int partid, int pmg, int s1mpam)
+{
+ struct arm_smmu_master_data *master = arm_smmu_find_master(smmu, sid);
+ struct arm_smmu_s1_cfg *cfg = master ? master->ste.s1_cfg : NULL;
+ struct arm_smmu_domain *domain = master ? master->domain : NULL;
+ int ret;
+
+ struct arm_smmu_cmdq_ent prefetch_cmd = {
+ .opcode = CMDQ_OP_PREFETCH_CFG,
+ .prefetch = {
+ .sid = sid,
+ },
+ };
+
+ if (!(smmu->features & ARM_SMMU_FEAT_MPAM))
+ return -ENODEV;
+
+ if (WARN_ON(!domain))
+ return -EINVAL;
+
+ if (WARN_ON(!cfg))
+ return -EINVAL;
+
+ if (WARN_ON(ssid >= (1 << master->ssid_bits)))
+ return -E2BIG;
+
+ if (partid > smmu->mpam_partid_max || pmg > smmu->mpam_pmg_max) {
+ dev_err(smmu->dev,
+ "mpam rmid out of range: partid[0, %d] pmg[0, %d]\n",
+ smmu->mpam_partid_max, smmu->mpam_pmg_max);
+ return -ERANGE;
+ }
+
+ ret = arm_smmu_set_ste_mpam(smmu, sid, partid, pmg, s1mpam);
+ if (ret < 0) {
+ dev_err(smmu->dev, "set ste mpam configuration error %d\n",
+ ret);
+ return ret;
+ }
+
+ /* do not modify cd table which owned by guest */
+ if (domain->stage == ARM_SMMU_DOMAIN_NESTED) {
+ dev_err(smmu->dev,
+ "mpam: smmu cd is owned by guest, not modified\n");
+ return 0;
+ }
+
+ ret = arm_smmu_set_cd_mpam(cfg->ops, ssid, partid, pmg);
+ if (s1mpam && ret < 0) {
+ dev_err(smmu->dev, "set cd mpam configuration error %d\n",
+ ret);
+ return ret;
+ }
+
+ /* It's likely that we'll want to use the new STE soon */
+ if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
+ arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+
+ dev_info(smmu->dev, "partid %d, pmg %d\n", partid, pmg);
+
+ return 0;
+}
+
+/**
+ * arm_smmu_set_dev_mpam() - Set mpam configuration to SMMU STE/CD
+ */
+int arm_smmu_set_dev_mpam(struct device *dev, int ssid, int partid, int pmg,
+ int s1mpam)
+{
+ struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
+ struct arm_smmu_device *smmu = master->domain->smmu;
+ int sid = master->streams->id;
+
+ return arm_smmu_set_mpam(smmu, sid, ssid, partid, pmg, s1mpam);
+}
+EXPORT_SYMBOL(arm_smmu_set_dev_mpam);
+
static int arm_smmu_device_probe(struct platform_device *pdev)
{
int irq, ret;
diff --git a/include/linux/ascend_smmu.h b/include/linux/ascend_smmu.h
new file mode 100644
index 0000000000000..bd0edd25057e1
--- /dev/null
+++ b/include/linux/ascend_smmu.h
@@ -0,0 +1,9 @@
+#ifndef __LINUX_ASCEND_SMMU_H
+#define __LINUX_ASCEND_SMMU_H
+
+#include <linux/device.h>
+
+extern int arm_smmu_set_dev_mpam(struct device *dev, int ssid, int partid,
+ int pmg, int s1mpam);
+
+#endif /* __LINUX_ASCEND_SMMU_H */
--
2.25.1
1
8

[PATCH openEuler-1.0-LTS 0/6] Fix the problem that the number of tcp timeout retransmissions is lost
by jiazhenyuan 25 Oct '21
by jiazhenyuan 25 Oct '21
25 Oct '21
From: Jiazhenyuan <jiazhenyuan(a)uniontech.com>
issue: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue
jiazhenyuan (6):
tcp: switch tcp and sch_fq to new earliest departure time
net_sched: sch_fq: ensure maxrate fq parameter applies to EDT flows
From 9efdda4e3abed13f0903b7b6e4d4c2102019440a Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com> Date: Sat, 24 Nov 2018
09:12:24 -0800 Subject: [PATCH] tcp: address problems caused by EDT
misshaps
From 7ae189759cc48cf8b54beebff566e9fd2d4e7d7c Mon Sep 17 00:00:00 2001
From: Yuchung Cheng <ycheng(a)google.com> Date: Wed, 16 Jan 2019
15:05:30 -0800 Subject: [PATCH] tcp: always set retrans_stamp on
recovery
From 01a523b071618abbc634d1958229fe3bd2dfa5fa Mon Sep 17 00:00:00 2001
From: Yuchung Cheng <ycheng(a)google.com> Date: Wed, 16 Jan 2019
15:05:32 -0800 Subject: [PATCH] tcp: create a helper to model
exponential backoff
From 3256a2d6ab1f71f9a1bd2d7f6f18eb8108c48d17 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com> Date: Mon, 30 Sep 2019
15:44:44 -0700 Subject: [PATCH] tcp: adjust rto_base in
retransmits_timed_out()
net/ipv4/tcp_bbr.c | 7 +++--
net/ipv4/tcp_input.c | 17 +++++++-----
net/ipv4/tcp_output.c | 29 ++++++++++++++------
net/ipv4/tcp_timer.c | 64 ++++++++++++++++++++-----------------------
net/sched/sch_fq.c | 46 ++++++++++++++++++-------------
5 files changed, 92 insertions(+), 71 deletions(-)
--
2.27.0
2
7

[PATCH openEuler-1.0-LTS] nvme-rdma: destroy cm id before destroy qp to avoid use after free
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Ruozhu Li <liruozhu(a)huawei.com>
mainline inclusion
from mainline-v5.15-rc2
commit 9817d763dbe15327b9b3ff4404fa6f27f927e744
category: bugfix
bugzilla: NA
CVE: NA
Link: https://gitee.com/openeuler/kernel/issues/I1WGZE
We got a panic when host received a rej cm event soon after a connect
error cm event.
When host get connect error cm event, it will destroy qp immediately.
But cm_id is still valid then.Another cm event rise here, try to access
the qp which was destroyed.Then we got a kernel panic blow:
[87816.777089] [20473] ib_cm:cm_rep_handler:2343: cm_rep_handler: Stale connection. local_comm_id -154357094, remote_comm_id -1133609861
[87816.777223] [20473] ib_cm:cm_init_qp_rtr_attr:4162: cm_init_qp_rtr_attr: local_id -1150387077, cm_id_priv->id.state: 13
[87816.777225] [20473] rdma_cm:cma_rep_recv:1871: RDMA CM: CONNECT_ERROR: failed to handle reply. status -22
[87816.777395] [20473] ib_cm:ib_send_cm_rej:2781: ib_send_cm_rej: local_id -1150387077, cm_id->state: 13
[87816.777398] [20473] nvme_rdma:nvme_rdma_cm_handler:1718: nvme nvme278: connect error (6): status -22 id 00000000c3809aff
[87816.801155] [20473] nvme_rdma:nvme_rdma_cm_handler:1742: nvme nvme278: CM error event 6
[87816.801160] [20473] rdma_cm:cma_ib_handler:1947: RDMA CM: REJECTED: consumer defined
[87816.801163] nvme nvme278: rdma connection establishment failed (-104)
[87816.801168] BUG: unable to handle kernel NULL pointer dereference at 0000000000000370
[87816.801201] RIP: 0010:_ib_modify_qp+0x6e/0x3a0 [ib_core]
[87816.801215] Call Trace:
[87816.801223] cma_modify_qp_err+0x52/0x80 [rdma_cm]
[87816.801228] ? __dynamic_pr_debug+0x8a/0xb0
[87816.801232] cma_ib_handler+0x25a/0x2f0 [rdma_cm]
[87816.801235] cm_process_work+0x60/0xe0 [ib_cm]
[87816.801238] cm_work_handler+0x13b/0x1b97 [ib_cm]
[87816.801243] ? __switch_to_asm+0x35/0x70
[87816.801244] ? __switch_to_asm+0x41/0x70
[87816.801246] ? __switch_to_asm+0x35/0x70
[87816.801248] ? __switch_to_asm+0x41/0x70
[87816.801252] ? __switch_to+0x8c/0x480
[87816.801254] ? __switch_to_asm+0x41/0x70
[87816.801256] ? __switch_to_asm+0x35/0x70
[87816.801259] process_one_work+0x1a7/0x3b0
[87816.801263] worker_thread+0x30/0x390
[87816.801266] ? create_worker+0x1a0/0x1a0
[87816.801268] kthread+0x112/0x130
[87816.801270] ? kthread_flush_work_fn+0x10/0x10
[87816.801272] ret_from_fork+0x35/0x40
-------------------------------------------------
We should always destroy cm_id before destroy qp to avoid to get cma
event after qp was destroyed, which may lead to use after free.
In RDMA connection establishment error flow, don't destroy qp in cm
event handler.Just report cm_error to upper level, qp will be destroy
in nvme_rdma_alloc_queue() after destroy cm id.
Signed-off-by: Ruozhu Li <liruozhu(a)huawei.com>
Reviewed-by: Max Gurtovoy <mgurtovoy(a)nvidia.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Conflicts:
drivers/nvme/host/rdma.c
[lrz: adjust context]
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/nvme/host/rdma.c | 16 +++-------------
1 file changed, 3 insertions(+), 13 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index b8e0d637ddcfc..049edb9ed1858 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -575,8 +575,8 @@ static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
if (!test_and_clear_bit(NVME_RDMA_Q_ALLOCATED, &queue->flags))
return;
- nvme_rdma_destroy_queue_ib(queue);
rdma_destroy_id(queue->cm_id);
+ nvme_rdma_destroy_queue_ib(queue);
mutex_destroy(&queue->queue_lock);
}
@@ -1509,14 +1509,10 @@ static int nvme_rdma_conn_established(struct nvme_rdma_queue *queue)
for (i = 0; i < queue->queue_size; i++) {
ret = nvme_rdma_post_recv(queue, &queue->rsp_ring[i]);
if (ret)
- goto out_destroy_queue_ib;
+ return ret;
}
return 0;
-
-out_destroy_queue_ib:
- nvme_rdma_destroy_queue_ib(queue);
- return ret;
}
static int nvme_rdma_conn_rejected(struct nvme_rdma_queue *queue,
@@ -1608,14 +1604,10 @@ static int nvme_rdma_route_resolved(struct nvme_rdma_queue *queue)
if (ret) {
dev_err(ctrl->ctrl.device,
"rdma_connect failed (%d).\n", ret);
- goto out_destroy_queue_ib;
+ return ret;
}
return 0;
-
-out_destroy_queue_ib:
- nvme_rdma_destroy_queue_ib(queue);
- return ret;
}
static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
@@ -1646,8 +1638,6 @@ static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
case RDMA_CM_EVENT_ROUTE_ERROR:
case RDMA_CM_EVENT_CONNECT_ERROR:
case RDMA_CM_EVENT_UNREACHABLE:
- nvme_rdma_destroy_queue_ib(queue);
- /* fall through */
case RDMA_CM_EVENT_ADDR_ERROR:
dev_dbg(queue->ctrl->ctrl.device,
"CM error event %d\n", ev->event);
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS] arm64: Errata: fix kabi changed by cpu_errata
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Weilong Chen <chenweilong(a)huawei.com>
ascend inclusion
category: feature
bugzilla: 46922
CVE: NA
-------------------------------------
Patch "cache: Workaround HiSilicon Taishan DC CVAU"
breaks the kabi symbols:
cpu_hwcaps
cpu_hwcap_keys
Fix by using CONFIG_HISILICON_ERRATUM_1980005 to protect it.
Signed-off-by: Weilong Chen <chenweilong(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/include/asm/cpucaps.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index d6c863d2cf984..69515f9077e34 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -56,8 +56,12 @@
#define ARM64_WORKAROUND_1463225 35
#define ARM64_HAS_CRC32 36
#define ARM64_SSBS 37
+#ifdef CONFIG_HISILICON_ERRATUM_1980005
#define ARM64_WORKAROUND_HISILICON_1980005 38
#define ARM64_NCAPS 39
+#else
+#define ARM64_NCAPS 38
+#endif
#endif /* __ASM_CPUCAPS_H */
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS 01/45] af_unix: fix garbage collect vs MSG_PEEK
by Yang Yingliang 25 Oct '21
by Yang Yingliang 25 Oct '21
25 Oct '21
From: Miklos Szeredi <mszeredi(a)redhat.com>
stable inclusion
from linux-4.19.200
commit 1dabafa9f61118b1377fde424d9a94bf8dbf2813
--------------------------------
commit cbcf01128d0a92e131bd09f1688fe032480b65ca upstream.
unix_gc() assumes that candidate sockets can never gain an external
reference (i.e. be installed into an fd) while the unix_gc_lock is
held. Except for MSG_PEEK this is guaranteed by modifying inflight
count under the unix_gc_lock.
MSG_PEEK does not touch any variable protected by unix_gc_lock (file
count is not), yet it needs to be serialized with garbage collection.
Do this by locking/unlocking unix_gc_lock:
1) increment file count
2) lock/unlock barrier to make sure incremented file count is visible
to garbage collection
3) install file into fd
This is a lock barrier (unlike smp_mb()) that ensures that garbage
collection is run completely before or completely after the barrier.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/unix/af_unix.c | 51 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 49 insertions(+), 2 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 337c4797ab167..98c253afa0db2 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1517,6 +1517,53 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer)
return err;
}
+static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb)
+{
+ scm->fp = scm_fp_dup(UNIXCB(skb).fp);
+
+ /*
+ * Garbage collection of unix sockets starts by selecting a set of
+ * candidate sockets which have reference only from being in flight
+ * (total_refs == inflight_refs). This condition is checked once during
+ * the candidate collection phase, and candidates are marked as such, so
+ * that non-candidates can later be ignored. While inflight_refs is
+ * protected by unix_gc_lock, total_refs (file count) is not, hence this
+ * is an instantaneous decision.
+ *
+ * Once a candidate, however, the socket must not be reinstalled into a
+ * file descriptor while the garbage collection is in progress.
+ *
+ * If the above conditions are met, then the directed graph of
+ * candidates (*) does not change while unix_gc_lock is held.
+ *
+ * Any operations that changes the file count through file descriptors
+ * (dup, close, sendmsg) does not change the graph since candidates are
+ * not installed in fds.
+ *
+ * Dequeing a candidate via recvmsg would install it into an fd, but
+ * that takes unix_gc_lock to decrement the inflight count, so it's
+ * serialized with garbage collection.
+ *
+ * MSG_PEEK is special in that it does not change the inflight count,
+ * yet does install the socket into an fd. The following lock/unlock
+ * pair is to ensure serialization with garbage collection. It must be
+ * done between incrementing the file count and installing the file into
+ * an fd.
+ *
+ * If garbage collection starts after the barrier provided by the
+ * lock/unlock, then it will see the elevated refcount and not mark this
+ * as a candidate. If a garbage collection is already in progress
+ * before the file count was incremented, then the lock/unlock pair will
+ * ensure that garbage collection is finished before progressing to
+ * installing the fd.
+ *
+ * (*) A -> B where B is on the queue of A or B is on the queue of C
+ * which is on the queue of listening socket A.
+ */
+ spin_lock(&unix_gc_lock);
+ spin_unlock(&unix_gc_lock);
+}
+
static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds)
{
int err = 0;
@@ -2142,7 +2189,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg,
sk_peek_offset_fwd(sk, size);
if (UNIXCB(skb).fp)
- scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+ unix_peek_fds(&scm, skb);
}
err = (flags & MSG_TRUNC) ? skb->len - skip : size;
@@ -2383,7 +2430,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
/* It is questionable, see note in unix_dgram_recvmsg.
*/
if (UNIXCB(skb).fp)
- scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+ unix_peek_fds(&scm, skb);
sk_peek_offset_fwd(sk, chunk);
--
2.25.1
1
44

[PATCH kernel-4.19 1/9] iommu/arm-smmu-v3: Add support to configure mpam in STE/CD context
by Yang Yingliang 22 Oct '21
by Yang Yingliang 22 Oct '21
22 Oct '21
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
-------------------------------------------------
To support limiting qos of device, the partid and pmg need to be set
into the SMMU STE/CD context. This introduce support of SMMU mpam
feature and add interface to set mpam configuration in STE/CD.
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Signed-off-by: Rui Zhu <zhurui3(a)huawei.com>
Reviewed-by: Yingtai Xie <xieyingtai(a)huawei.com>
Reviewed-by: Zhen Lei <thunder.leizhen(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/iommu/arm-smmu-v3-context.c | 25 +++++
drivers/iommu/arm-smmu-v3.c | 151 ++++++++++++++++++++++++++++
include/linux/ascend_smmu.h | 9 ++
3 files changed, 185 insertions(+)
create mode 100644 include/linux/ascend_smmu.h
diff --git a/drivers/iommu/arm-smmu-v3-context.c b/drivers/iommu/arm-smmu-v3-context.c
index 2351de86d31f1..b30a93e076608 100644
--- a/drivers/iommu/arm-smmu-v3-context.c
+++ b/drivers/iommu/arm-smmu-v3-context.c
@@ -66,6 +66,9 @@
#define CTXDESC_CD_1_TTB0_MASK GENMASK_ULL(51, 4)
+#define CTXDESC_CD_5_PARTID_MASK GENMASK_ULL(47, 32)
+#define CTXDESC_CD_5_PMG_MASK GENMASK_ULL(55, 48)
+
/* Convert between AArch64 (CPU) TCR format and SMMU CD format */
#define ARM_SMMU_TCR2CD(tcr, fld) FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \
FIELD_GET(ARM64_TCR_##fld, tcr))
@@ -563,6 +566,28 @@ static int arm_smmu_set_cd(struct iommu_pasid_table_ops *ops, int pasid,
return arm_smmu_write_ctx_desc(tbl, pasid, cd);
}
+int arm_smmu_set_cd_mpam(struct iommu_pasid_table_ops *ops,
+ int ssid, int partid, int pmg)
+{
+ struct arm_smmu_cd_tables *tbl = pasid_ops_to_tables(ops);
+ u64 val;
+ __le64 *cdptr = arm_smmu_get_cd_ptr(tbl, ssid);
+
+ if (!cdptr)
+ return -ENOMEM;
+
+ val = le64_to_cpu(cdptr[5]);
+ val &= ~CTXDESC_CD_5_PARTID_MASK;
+ val |= FIELD_PREP(CTXDESC_CD_5_PARTID_MASK, partid);
+ val &= ~CTXDESC_CD_5_PMG_MASK;
+ val |= FIELD_PREP(CTXDESC_CD_5_PMG_MASK, pmg);
+ WRITE_ONCE(cdptr[5], cpu_to_le64(val));
+
+ iommu_pasid_flush(&tbl->pasid, ssid, true);
+
+ return 0;
+}
+
static void arm_smmu_clear_cd(struct iommu_pasid_table_ops *ops, int pasid,
struct iommu_pasid_entry *entry)
{
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 8b5083c3e0a16..6634d596f7efc 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -88,6 +88,10 @@
#define IDR1_SSIDSIZE GENMASK(10, 6)
#define IDR1_SIDSIZE GENMASK(5, 0)
+#define ARM_SMMU_IDR3 0xc
+#define IDR3_MPAM (1 << 7)
+#define ARM_SMMU_IDR3_CFG 0x140C
+
#define ARM_SMMU_IDR5 0x14
#define IDR5_STALL_MAX GENMASK(31, 16)
#define IDR5_GRAN64K (1 << 6)
@@ -186,6 +190,10 @@
#define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
#define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
+#define ARM_SMMU_MPAMIDR 0x130
+#define MPAMIDR_PMG_MAX GENMASK(23, 16)
+#define MPAMIDR_PARTID_MAX GENMASK(15, 0)
+
/* Common MSI config fields */
#define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2)
#define MSI_CFG2_SH GENMASK(5, 4)
@@ -250,6 +258,7 @@
#define STRTAB_STE_1_S1COR GENMASK_ULL(5, 4)
#define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6)
+#define STRTAB_STE_1_S1MPAM (1UL << 26)
#define STRTAB_STE_1_S1STALLD (1UL << 27)
#define STRTAB_STE_1_EATS GENMASK_ULL(29, 28)
@@ -273,6 +282,11 @@
#define STRTAB_STE_3_S2TTB_MASK GENMASK_ULL(51, 4)
+#define STRTAB_STE_4_PARTID_MASK GENMASK_ULL(31, 16)
+
+#define STRTAB_STE_5_MPAM_NS (1UL << 8)
+#define STRTAB_STE_5_PMG_MASK GENMASK_ULL(7, 0)
+
/* Command queue */
#define CMDQ_ENT_SZ_SHIFT 4
#define CMDQ_ENT_DWORDS ((1 << CMDQ_ENT_SZ_SHIFT) >> 3)
@@ -634,6 +648,7 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_SVA (1 << 17)
#define ARM_SMMU_FEAT_HA (1 << 18)
#define ARM_SMMU_FEAT_HD (1 << 19)
+#define ARM_SMMU_FEAT_MPAM (1 << 20)
u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
@@ -672,6 +687,9 @@ struct arm_smmu_device {
struct mutex streams_mutex;
struct iopf_queue *iopf_queue;
+
+ unsigned int mpam_partid_max;
+ unsigned int mpam_pmg_max;
};
struct arm_smmu_stream {
@@ -3980,6 +3998,25 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
if (smmu->sid_bits <= STRTAB_SPLIT)
smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+ /* IDR3 */
+ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+
+ if (!(reg & IDR3_MPAM)) {
+ reg |= FIELD_PREP(IDR3_MPAM, 1);
+ writel(reg, smmu->base + ARM_SMMU_IDR3_CFG);
+ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+ if (!(reg & IDR3_MPAM))
+ dev_warn(smmu->dev, "enable smmu mpam failed\n");
+ }
+
+ if (reg & IDR3_MPAM) {
+ reg = readl_relaxed(smmu->base + ARM_SMMU_MPAMIDR);
+ smmu->mpam_partid_max = FIELD_GET(MPAMIDR_PARTID_MAX, reg);
+ smmu->mpam_pmg_max = FIELD_GET(MPAMIDR_PMG_MAX, reg);
+ if (smmu->mpam_partid_max || smmu->mpam_pmg_max)
+ smmu->features |= ARM_SMMU_FEAT_MPAM;
+ }
+
/* IDR5 */
reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
@@ -4127,6 +4164,120 @@ static unsigned long arm_smmu_resource_size(struct arm_smmu_device *smmu)
return SZ_128K;
}
+static int arm_smmu_set_ste_mpam(struct arm_smmu_device *smmu,
+ int sid, int partid, int pmg, int s1mpam)
+{
+ u64 val;
+ __le64 *ste;
+
+ if (!arm_smmu_sid_in_range(smmu, sid))
+ return -ERANGE;
+
+ /* get ste ptr */
+ ste = arm_smmu_get_step_for_sid(smmu, sid);
+
+ /* write s1mpam to ste */
+ val = le64_to_cpu(ste[1]);
+ val &= ~STRTAB_STE_1_S1MPAM;
+ val |= FIELD_PREP(STRTAB_STE_1_S1MPAM, s1mpam);
+ WRITE_ONCE(ste[1], cpu_to_le64(val));
+
+ val = le64_to_cpu(ste[4]);
+ val &= ~STRTAB_STE_4_PARTID_MASK;
+ val |= FIELD_PREP(STRTAB_STE_4_PARTID_MASK, partid);
+ WRITE_ONCE(ste[4], cpu_to_le64(val));
+
+ val = le64_to_cpu(ste[5]);
+ val &= ~STRTAB_STE_5_PMG_MASK;
+ val |= FIELD_PREP(STRTAB_STE_5_PMG_MASK, pmg);
+ WRITE_ONCE(ste[5], cpu_to_le64(val));
+
+ arm_smmu_sync_ste_for_sid(smmu, sid);
+
+ return 0;
+}
+
+int arm_smmu_set_cd_mpam(struct iommu_pasid_table_ops *ops,
+ int ssid, int partid, int pmg);
+
+static int arm_smmu_set_mpam(struct arm_smmu_device *smmu,
+ int sid, int ssid, int partid, int pmg, int s1mpam)
+{
+ struct arm_smmu_master_data *master = arm_smmu_find_master(smmu, sid);
+ struct arm_smmu_s1_cfg *cfg = master ? master->ste.s1_cfg : NULL;
+ struct arm_smmu_domain *domain = master ? master->domain : NULL;
+ int ret;
+
+ struct arm_smmu_cmdq_ent prefetch_cmd = {
+ .opcode = CMDQ_OP_PREFETCH_CFG,
+ .prefetch = {
+ .sid = sid,
+ },
+ };
+
+ if (!(smmu->features & ARM_SMMU_FEAT_MPAM))
+ return -ENODEV;
+
+ if (WARN_ON(!domain))
+ return -EINVAL;
+
+ if (WARN_ON(!cfg))
+ return -EINVAL;
+
+ if (WARN_ON(ssid >= (1 << master->ssid_bits)))
+ return -E2BIG;
+
+ if (partid > smmu->mpam_partid_max || pmg > smmu->mpam_pmg_max) {
+ dev_err(smmu->dev,
+ "mpam rmid out of range: partid[0, %d] pmg[0, %d]\n",
+ smmu->mpam_partid_max, smmu->mpam_pmg_max);
+ return -ERANGE;
+ }
+
+ ret = arm_smmu_set_ste_mpam(smmu, sid, partid, pmg, s1mpam);
+ if (ret < 0) {
+ dev_err(smmu->dev, "set ste mpam configuration error %d\n",
+ ret);
+ return ret;
+ }
+
+ /* do not modify cd table which owned by guest */
+ if (domain->stage == ARM_SMMU_DOMAIN_NESTED) {
+ dev_err(smmu->dev,
+ "mpam: smmu cd is owned by guest, not modified\n");
+ return 0;
+ }
+
+ ret = arm_smmu_set_cd_mpam(cfg->ops, ssid, partid, pmg);
+ if (s1mpam && ret < 0) {
+ dev_err(smmu->dev, "set cd mpam configuration error %d\n",
+ ret);
+ return ret;
+ }
+
+ /* It's likely that we'll want to use the new STE soon */
+ if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
+ arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+
+ dev_info(smmu->dev, "partid %d, pmg %d\n", partid, pmg);
+
+ return 0;
+}
+
+/**
+ * arm_smmu_set_dev_mpam() - Set mpam configuration to SMMU STE/CD
+ */
+int arm_smmu_set_dev_mpam(struct device *dev, int ssid, int partid, int pmg,
+ int s1mpam)
+{
+ struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
+ struct arm_smmu_device *smmu = master->domain->smmu;
+ int sid = master->streams->id;
+
+ return arm_smmu_set_mpam(smmu, sid, ssid, partid, pmg, s1mpam);
+}
+EXPORT_SYMBOL(arm_smmu_set_dev_mpam);
+
static int arm_smmu_device_probe(struct platform_device *pdev)
{
int irq, ret;
diff --git a/include/linux/ascend_smmu.h b/include/linux/ascend_smmu.h
new file mode 100644
index 0000000000000..bd0edd25057e1
--- /dev/null
+++ b/include/linux/ascend_smmu.h
@@ -0,0 +1,9 @@
+#ifndef __LINUX_ASCEND_SMMU_H
+#define __LINUX_ASCEND_SMMU_H
+
+#include <linux/device.h>
+
+extern int arm_smmu_set_dev_mpam(struct device *dev, int ssid, int partid,
+ int pmg, int s1mpam);
+
+#endif /* __LINUX_ASCEND_SMMU_H */
--
2.25.1
1
8

[PATCH kernel-4.19] efi: Change down_interruptible() in virt_efi_reset_system() to down_trylock()
by Yang Yingliang 22 Oct '21
by Yang Yingliang 22 Oct '21
22 Oct '21
From: Zhang Jianhua <chris.zjh(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 182892, https://gitee.com/openeuler/kernel/issues/I4F04R
--------
While reboot the system by sysrq, the following bug will be occur.
BUG: sleeping function called from invalid context at kernel/locking/semaphore.c:90
in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 10052, name: rc.shutdown
CPU: 3 PID: 10052 Comm: rc.shutdown Tainted: G W O 5.10.0 #1
Call trace:
dump_backtrace+0x0/0x1c8
show_stack+0x18/0x28
dump_stack+0xd0/0x110
___might_sleep+0x14c/0x160
__might_sleep+0x74/0x88
down_interruptible+0x40/0x118
virt_efi_reset_system+0x3c/0xd0
efi_reboot+0xd4/0x11c
machine_restart+0x60/0x9c
emergency_restart+0x1c/0x2c
sysrq_handle_reboot+0x1c/0x2c
__handle_sysrq+0xd0/0x194
write_sysrq_trigger+0xbc/0xe4
proc_reg_write+0xd4/0xf0
vfs_write+0xa8/0x148
ksys_write+0x6c/0xd8
__arm64_sys_write+0x18/0x28
el0_svc_common.constprop.3+0xe4/0x16c
do_el0_svc+0x1c/0x2c
el0_svc+0x20/0x30
el0_sync_handler+0x80/0x17c
el0_sync+0x158/0x180
The reason for this problem is that irq has been disabled in
machine_restart() and then it calls down_interruptible() in
virt_efi_reset_system(), which would occur sleep in irq context,
it is dangerous! Commit 99409b935c9a("locking/semaphore: Add
might_sleep() to down_*() family") add might_sleep() in
down_interruptible(), so the bug info is here. down_trylock()
can solve this problem, cause there is no might_sleep.
Signed-off-by: Zhang Jianhua <chris.zjh(a)huawei.com>
Reviewed-by: Chen Lifu <chenlifu(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Reviewed-by: Liao Chang <liaochang1(a)huawei.com>
Reviewed-by: He Ying <heying24(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Reviewed-by: Yihang Xu <xuyihang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/firmware/efi/runtime-wrappers.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index 04288719721fc..ecee810f7a6fa 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -408,7 +408,7 @@ static void virt_efi_reset_system(int reset_type,
unsigned long data_size,
efi_char16_t *data)
{
- if (down_interruptible(&efi_runtime_lock)) {
+ if (down_trylock(&efi_runtime_lock)) {
pr_warn("failed to invoke the reset_system() runtime service:\n"
"could not get exclusive access to the firmware\n");
return;
--
2.25.1
1
0

22 Oct '21
From: Lijun Fang <fanglijun3(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4DZYE
CVE: NA
---------------------------
svm_mmap use the pgoff flag for the security requirement nid, which used
as cdm node.
Signed-off-by: Lijun Fang <fanglijun3(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/char/svm.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c
index f312b7e02263c..62092fbaa0104 100644
--- a/drivers/char/svm.c
+++ b/drivers/char/svm.c
@@ -1820,9 +1820,17 @@ static int svm_mmap(struct file *file, struct vm_area_struct *vma)
if ((vma->vm_end < vma->vm_start) || (vm_size > MMAP_PHY32_MAX))
return -EINVAL;
- page = alloc_pages(GFP_KERNEL | GFP_DMA32, get_order(vm_size));
+ /* vma->vm_pgoff transfer the nid */
+ if (vma->vm_pgoff == 0)
+ page = alloc_pages(GFP_KERNEL | GFP_DMA32,
+ get_order(vm_size));
+ else
+ page = alloc_pages_node((int)vma->vm_pgoff,
+ GFP_KERNEL | __GFP_THISNODE,
+ get_order(vm_size));
if (!page) {
- dev_err(sdev->dev, "fail to alloc page\n");
+ dev_err(sdev->dev, "fail to alloc page on node 0x%lx\n",
+ vma->vm_pgoff);
return -ENOMEM;
}
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS] Ascend/hugetlb:support alloc normal and buddy hugepage
by Yang Yingliang 22 Oct '21
by Yang Yingliang 22 Oct '21
22 Oct '21
From: Zhou Guanghui <zhouguanghui1(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I
CVE: NA
----------------------------------------------------
The current function hugetlb_alloc_hugepage implements the allocation
from static hugepages first. When the static hugepage is used up, it
attempts to apply for hugepages from buddy system. Two additional modes
are supported: static hugepages only and buddy hugepages only.
Signed-off-by: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Signed-off-by: Guo Mengqi <guomengqi3(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/hugetlb.h | 11 +++++++++--
mm/hugetlb.c | 35 +++++++++++++++++++++++++++++++++--
2 files changed, 42 insertions(+), 4 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 830c41a5ca70e..de6cdfa51694c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -374,8 +374,15 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
pgoff_t idx);
#ifdef CONFIG_ASCEND_FEATURES
+#define HUGETLB_ALLOC_NONE 0x00
+#define HUGETLB_ALLOC_NORMAL 0x01 /* normal hugepage */
+#define HUGETLB_ALLOC_BUDDY 0x02 /* buddy hugepage */
+#define HUGETLB_ALLOC_MASK (HUGETLB_ALLOC_NONE | \
+ HUGETLB_ALLOC_NORMAL | \
+ HUGETLB_ALLOC_BUDDY)
+
const struct hstate *hugetlb_get_hstate(void);
-struct page *hugetlb_alloc_hugepage(int nid);
+struct page *hugetlb_alloc_hugepage(int nid, int flag);
int hugetlb_insert_hugepage_pte(struct mm_struct *mm, unsigned long addr,
pgprot_t prot, struct page *hpage);
#else
@@ -384,7 +391,7 @@ static inline const struct hstate *hugetlb_get_hstate(void)
return NULL;
}
-static inline struct page *hugetlb_alloc_hugepage(int nid)
+static inline struct page *hugetlb_alloc_hugepage(int nid, int flag)
{
return NULL;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3091c61bb63e3..907b1351b0f5f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5233,17 +5233,48 @@ const struct hstate *hugetlb_get_hstate(void)
}
EXPORT_SYMBOL_GPL(hugetlb_get_hstate);
+static struct page *hugetlb_alloc_hugepage_normal(struct hstate *h,
+ gfp_t gfp_mask, int nid)
+{
+ struct page *page = NULL;
+
+ spin_lock(&hugetlb_lock);
+ if (h->free_huge_pages - h->resv_huge_pages > 0)
+ page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL, NULL);
+ spin_unlock(&hugetlb_lock);
+
+ return page;
+}
+
/*
* Allocate hugepage without reserve
*/
-struct page *hugetlb_alloc_hugepage(int nid)
+struct page *hugetlb_alloc_hugepage(int nid, int flag)
{
+ struct hstate *h = &default_hstate;
+ gfp_t gfp_mask = htlb_alloc_mask(h);
+ struct page *page = NULL;
+
VM_WARN_ON(nid < 0 || nid >= MAX_NUMNODES);
+ if (flag & ~HUGETLB_ALLOC_MASK)
+ return NULL;
+
if (nid == NUMA_NO_NODE)
nid = numa_mem_id();
- return alloc_huge_page_node(&default_hstate, nid);
+ gfp_mask |= __GFP_THISNODE;
+
+ if (flag & HUGETLB_ALLOC_NORMAL)
+ page = hugetlb_alloc_hugepage_normal(h, gfp_mask, nid);
+ else if (flag & HUGETLB_ALLOC_BUDDY) {
+ if (enable_charge_mighp)
+ gfp_mask |= __GFP_ACCOUNT;
+ page = alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
+ } else
+ page = alloc_huge_page_node(h, nid);
+
+ return page;
}
EXPORT_SYMBOL_GPL(hugetlb_alloc_hugepage);
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS] Ascend/memcg: Use CONFIG_ASCEND_FEATURES for customized interfaces
by Yang Yingliang 22 Oct '21
by Yang Yingliang 22 Oct '21
22 Oct '21
From: Zhou Guanghui <zhouguanghui1(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I
CVE: NA
-------------------------------------------------------------
The following functions are used only in the ascend scenario:
hugetlb_get_hstate,
hugetlb_alloc_hugepage,
hugetlb_insert_hugepage_pte,
hugetlb_insert_hugepage_pte_by_pa
Remove unused interface hugetlb_insert_hugepage
Signed-off-by: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Signed-off-by: Guo Mengqi <guomengqi3(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/hugetlb.h | 33 ++++++++++++++++++++++++++-------
mm/hugetlb.c | 2 +-
2 files changed, 27 insertions(+), 8 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 6831b24defd64..830c41a5ca70e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -373,11 +373,27 @@ struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
pgoff_t idx);
-#ifdef CONFIG_ARM64
+#ifdef CONFIG_ASCEND_FEATURES
const struct hstate *hugetlb_get_hstate(void);
struct page *hugetlb_alloc_hugepage(int nid);
int hugetlb_insert_hugepage_pte(struct mm_struct *mm, unsigned long addr,
pgprot_t prot, struct page *hpage);
+#else
+static inline const struct hstate *hugetlb_get_hstate(void)
+{
+ return NULL;
+}
+
+static inline struct page *hugetlb_alloc_hugepage(int nid)
+{
+ return NULL;
+}
+
+static inline int hugetlb_insert_hugepage_pte(struct mm_struct *mm,
+ unsigned long addr, pgprot_t prot, struct page *hpage)
+{
+ return -EPERM;
+}
#endif
/* arch callback */
@@ -614,12 +630,6 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr
{
}
-static inline int hugetlb_insert_hugepage_pte_by_pa(struct mm_struct *mm,
- unsigned long vir_addr,
- pgprot_t prot, unsigned long phy_addr)
-{
- return 0;
-}
#endif /* CONFIG_HUGETLB_PAGE */
static inline spinlock_t *huge_pte_lock(struct hstate *h,
@@ -632,4 +642,13 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
return ptl;
}
+#ifndef CONFIG_ASCEND_FEATURES
+static inline int hugetlb_insert__hugepage_pte_by_pa(struct mm_struct *mm,
+ unsigned long vir_addr,
+ pgprot_t prot, unsigned long phy_addr)
+{
+ return -EPERM;
+}
+#endif
+
#endif /* _LINUX_HUGETLB_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5132fefb60c86..3091c61bb63e3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5226,7 +5226,7 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
}
}
-#ifdef CONFIG_ARM64
+#ifdef CONFIG_ASCEND_FEATURES
const struct hstate *hugetlb_get_hstate(void)
{
return &default_hstate;
--
2.25.1
1
0