mailweb.openeuler.org
Linuxarm

linuxarm@openeuler.org

January 2021

  • 52 participants
  • 77 discussions
[PATCH] percpu: set the pointer to NULL after free the memory in free_percpu
by Tian Tao 26 Jan '21

Set the pointer to NULL after freeing the memory in free_percpu(), so that
there is no need to set the pointer to NULL again in the driver after
calling free_percpu().

Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
---
 mm/percpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/percpu.c b/mm/percpu.c
index f53b89b..af944cc 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2121,6 +2121,8 @@ void free_percpu(void __percpu *ptr)

 	if (need_balance)
 		pcpu_schedule_balance_work();
+
+	ptr = NULL;
 }
 EXPORT_SYMBOL_GPL(free_percpu);
--
2.7.4
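A note on the C semantics involved: function arguments are passed by value, so an assignment to `ptr` inside a function changes only the callee's local copy. The sketch below is not kernel code; `clear_by_value()` and `clear_by_reference()` are hypothetical helpers that use a plain `int *` to show the difference. Clearing the caller's variable needs the address of that variable.

```c
#include <stddef.h>

/*
 * Hypothetical helper: assigns NULL to its own parameter.
 * The caller's pointer is a separate object and keeps its value.
 */
void clear_by_value(int *ptr)
{
	ptr = NULL;	/* only the local copy changes */
	(void)ptr;	/* silence "set but not used" warnings */
}

/*
 * Hypothetical helper: receives the address of the caller's pointer,
 * so the assignment is visible to the caller.
 */
void clear_by_reference(int **pptr)
{
	*pptr = NULL;
}
```

With `int x; int *p = &x;`, calling `clear_by_value(p)` leaves `p` pointing at `x`, while `clear_by_reference(&p)` leaves `p` equal to NULL.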
[GIT PULL] arm64: dts: hisilicon dts updates for v5.12
by Wei Xu 26 Jan '21

Hi ARM-SoC team,

Please consider pulling the following changes.
Thanks!

Best Regards,
Wei

---

The following changes since commit 5c8fe583cce542aa0b84adc939ce85293de36e5e:

  Linux 5.11-rc1 (2020-12-27 15:30:22 -0800)

are available in the git repository at:

  git://github.com/hisilicon/linux-hisi.git tags/hisi-arm64-dt-for-5.12

for you to fetch changes up to 9091f9b9d2965983ee38dd68609002daeacf5199:

  arm64: dts: hisilicon: hi3670.dtsi: add I2C settings (2021-01-26 15:48:22 +0800)

----------------------------------------------------------------
ARM64: DT: Hisilicon ARM64 DT updates for 5.12

- Further cleanups of the hisilicon DTS to align with the dtschema
- Add or update the I2C, pinctrl and reset nodes for Hikey970

----------------------------------------------------------------
Mauro Carvalho Chehab (3):
      arm64: dts: hisilicon: hi3670.dtsi: add iomcu_rst
      arm64: dts: hisilicon: hikey970-pinctrl.dtsi: add missing pinctrl settings
      arm64: dts: hisilicon: hi3670.dtsi: add I2C settings

Zhen Lei (7):
      arm64: dts: hisilicon: correct vendor prefix hisi to hisilicon
      arm64: dts: hisilicon: separate each group of data in the property "ranges"
      arm64: dts: hisilicon: place clock-names "bus" before "core"
      arm64: dts: hisilicon: normalize the node name of the module thermal
      arm64: dts: hisilicon: normalize the node name of the localbus
      arm64: dts: hisilicon: avoid irrelevant nodes being mistakenly identified as PHY nodes
      arm64: dts: hisilicon: delete unused property smmu-cb-memtype

 arch/arm64/boot/dts/hisilicon/hi3660.dtsi          |  10 +-
 arch/arm64/boot/dts/hisilicon/hi3670.dtsi          |  79 ++-
 arch/arm64/boot/dts/hisilicon/hi3798cv200.dtsi     |   8 +-
 arch/arm64/boot/dts/hisilicon/hi6220.dtsi          |   8 +-
 .../arm64/boot/dts/hisilicon/hikey970-pinctrl.dtsi | 632 ++++++++++++++++++++-
 arch/arm64/boot/dts/hisilicon/hip05.dtsi           |   2 +-
 arch/arm64/boot/dts/hisilicon/hip06.dtsi           |   6 +-
 arch/arm64/boot/dts/hisilicon/hip07.dtsi           |   9 +-
 8 files changed, 717 insertions(+), 37 deletions(-)
Re: [PATCH v2 plinth/topic-sas-5.10 3/4] {topost} scsi: hisi_sas: enable DFX by default
by John Garry 26 Jan '21

On 16/12/2020 10:38, Luo Jiaxing wrote:
> This patch add a config option "CONFIG_SCSI_HISI_DEBUGFS_DEFAULT_ENABLE"

I just noticed that this should really be
CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE, so I will change it to be like
that.
[RFC PATCH] sched/fair: first try to fix the scheduling impact of NUMA diameter > 2
by Barry Song 26 Jan '21

This patch is a follow-up of the 3-hops issue reported by Valentin Schneider:
[1] https://lore.kernel.org/lkml/jhjtux5edo2.mognet@arm.com/
[2] https://lore.kernel.org/lkml/20201110184300.15673-1-valentin.schneider@arm.…

Here is a brief summary of the background:
For a NUMA system with 3-hops, sched_group for NUMA 2-hops could be not a
subset of sched_domain.
For example, for a system with the below topology(two cpus in each NUMA
node):
node   0   1   2   3
  0:  10  12  20  22
  1:  12  10  22  24
  2:  20  22  10  12
  3:  22  24  12  10

For CPU0, domain-2 will span 0-5, but its group will span 0-3, 4-7.
4-7 isn't a subset of 0-5.

CPU0 attaching sched-domain(s):
 domain-0: span=0-1 level=MC
  groups: 0:{ span=0 cap=989 }, 1:{ span=1 cap=1016 }
  domain-1: span=0-3 level=NUMA
   groups: 0:{ span=0-1 cap=2005 }, 2:{ span=2-3 cap=2028 }
   domain-2: span=0-5 level=NUMA
    groups: 0:{ span=0-3 cap=4033 }, 4:{ span=4-7 cap=3909 }
ERROR: groups don't span domain->span
    domain-3: span=0-7 level=NUMA
     groups: 0:{ span=0-5 mask=0-1 cap=6062 }, 6:{ span=4-7 mask=6-7 cap=3928 }

All other cpus also have the same issue: sched_group could be not a subset
of sched_domain.

Here I am trying to figure out the scheduling impact of this issue from
two aspects:
1. find busiest cpu in load_balance
2. find idlest cpu in fork/exec/wake balance

For case 1, load_balance() seems to be handling this issue correctly as it
only fills cpus in sched_domain to the cpus of lb_env. Also,
find_busiest_group() and find_busiest_queue() will result in scanning cpus
within env.cpus only:

static int load_balance(int this_cpu, struct rq *this_rq,
			struct sched_domain *sd, enum cpu_idle_type idle,
			int *continue_balancing)
{
	...
	struct lb_env env = {
		...
		.cpus		= cpus,
		.fbq_type	= all,
		.tasks		= LIST_HEAD_INIT(env.tasks),
	};

	/* added by barry: only cpus in sched_domain are put in lb_env */
	cpumask_and(cpus, sched_domain_span(sd), cpu_active_mask);
	...
	/*
	 * added by barry: the below functions are only scanning cpus
	 * in env.cpus
	 */
	group = find_busiest_group(&env);
	...
	busiest = find_busiest_queue(&env, group);
	...
}

But one thing which looks wrong is that update_sg_lb_stats() is only
counting tasks in sched_domain, but sgs->group_capacity and
sgs->group_weight are counting all cpus in the sched_group. Then finally,
update_sg_lb_stats() uses the load of cpus which are in the sched_domain to
calculate group_type and avg_load which can be seriously underestimated.
This is explained in detail as the comments added by me in the code:

static inline void update_sg_lb_stats()
{
	int i, nr_running, local_group;

	/* added by barry: here it only counts cpu in the sched_domain */
	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
		...
		sgs->group_load += cpu_load(rq);
		sgs->group_util += cpu_util(i);
		sgs->group_runnable += cpu_runnable(rq);
		sgs->sum_h_nr_running += rq->cfs.h_nr_running;
		nr_running = rq->nr_running;
		sgs->sum_nr_running += nr_running;
		...
	}

	...
	/* added by barry: here it count all cpus which might not be in the domain */
	sgs->group_capacity = group->sgc->capacity;

	sgs->group_weight = group->group_weight;

	/* added by barry: finally the group_type and avg_load could be wrong */
	sgs->group_type = group_classify(env->sd->imbalance_pct, group, sgs);

	if (sgs->group_type == group_overloaded)
		sgs->avg_load = (sgs->group_load * SCHED_CAPACITY_SCALE) /
				sgs->group_capacity;
	...
}

For example, if we have 2 cpus in sched_domain and 4 cpus in sched_group,
the code is using the load of 2 cpus to calculate the group_type and
avg_load of 4 cpus, the sched_group is likely to get much lower load than
the real case.
This patch fixed it by only counting cpus within sched_domain for
group_capacity and group_weight.

For case 2, find_idlest_group() and find_idlest_group_cpu() don't use
sched_domain for scanning at all. They are scanning all cpus in the
sched_group though sched_group isn't a subset of sched_domain. So they can
result in picking an idle cpu outside the sched_domain but inside the
sched_group.
This patch moved to only scan cpus within the sched_domain, which would be
similar with load_balance().

For this moment, this is pretty much PoC code to get feedback.

Signed-off-by: Barry Song <song.bao.hua(a)hisilicon.com>
---
 kernel/sched/fair.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..f183dba4961e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5901,7 +5901,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu);
  * find_idlest_group_cpu - find the idlest CPU among the CPUs in the group.
  */
 static int
-find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+find_idlest_group_cpu(struct sched_domain *sd, struct sched_group *group, struct task_struct *p, int this_cpu)
 {
 	unsigned long load, min_load = ULONG_MAX;
 	unsigned int min_exit_latency = UINT_MAX;
@@ -5916,6 +5916,10 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this

 	/* Traverse only the allowed CPUs */
 	for_each_cpu_and(i, sched_group_span(group), p->cpus_ptr) {
+		/* when sched_group isn't a subset of sched_domain */
+		if (!cpumask_test_cpu(i, sched_domain_span(sd)))
+			continue;
+
 		if (sched_idle_cpu(i))
 			return i;

@@ -5984,7 +5988,7 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
 			continue;
 		}

-		new_cpu = find_idlest_group_cpu(group, p, cpu);
+		new_cpu = find_idlest_group_cpu(sd, group, p, cpu);
 		if (new_cpu == cpu) {
 			/* Now try balancing at a lower domain level of 'cpu': */
 			sd = sd->child;
@@ -8416,6 +8420,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq, false))
 			env->flags |= LBF_NOHZ_AGAIN;

+		sgs->group_capacity += capacity_of(i);
+		sgs->group_weight++;
 		sgs->group_load += cpu_load(rq);
 		sgs->group_util += cpu_util(i);
 		sgs->group_runnable += cpu_runnable(rq);
@@ -8462,10 +8468,6 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		sgs->group_asym_packing = 1;
 	}

-	sgs->group_capacity = group->sgc->capacity;
-
-	sgs->group_weight = group->group_weight;
-
 	sgs->group_type = group_classify(env->sd->imbalance_pct, group, sgs);

 	/* Computing avg_load makes sense only when group is overloaded */
@@ -8688,10 +8690,12 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,

 	memset(sgs, 0, sizeof(*sgs));

-	for_each_cpu(i, sched_group_span(group)) {
+	for_each_cpu_and(i, sched_group_span(group), sched_domain_span(sd)) {
 		struct rq *rq = cpu_rq(i);
 		unsigned int local;

+		sgs->group_capacity += capacity_of(i);
+		sgs->group_weight++;
 		sgs->group_load += cpu_load_without(rq, p);
 		sgs->group_util += cpu_util_without(i, p);
 		sgs->group_runnable += cpu_runnable_without(rq, p);
@@ -8715,10 +8719,6 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
 		sgs->group_misfit_task_load = 1;
 	}

-	sgs->group_capacity = group->sgc->capacity;
-
-	sgs->group_weight = group->group_weight;
-
 	sgs->group_type = group_classify(sd->imbalance_pct, group, sgs);

 	/*
--
2.25.1
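The accounting change in this RFC can be illustrated outside the kernel with plain bitmasks. The following is a minimal user-space sketch, not the scheduler's code: `cpu_range()`, `stats_in_domain()` and `struct group_stats` are hypothetical names, a `uint64_t` stands in for a cpumask, and the per-CPU capacities are made-up values. It sums weight and capacity only over CPUs that are in both the group span and the domain span, as the patch does for `group_weight` and `group_capacity`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two sg_lb_stats fields discussed above. */
struct group_stats {
	unsigned int weight;	/* number of CPUs counted */
	unsigned long capacity;	/* sum of the counted CPUs' capacities */
};

/* Hypothetical helper: mask with bits first..last (inclusive) set. */
uint64_t cpu_range(unsigned int first, unsigned int last)
{
	uint64_t mask = 0;

	for (unsigned int cpu = first; cpu <= last && cpu < 64; cpu++)
		mask |= UINT64_C(1) << cpu;
	return mask;
}

/*
 * Count weight and capacity only for CPUs present in both spans.
 * cpu_capacity must have an entry for every CPU set in either mask.
 */
struct group_stats stats_in_domain(uint64_t group_span, uint64_t domain_span,
				   const unsigned long *cpu_capacity)
{
	struct group_stats s = { 0, 0 };
	uint64_t both = group_span & domain_span;

	for (unsigned int cpu = 0; cpu < 64; cpu++) {
		if (both & (UINT64_C(1) << cpu)) {
			s.weight++;
			s.capacity += cpu_capacity[cpu];
		}
	}
	return s;
}
```

Using the spans from the example above, a group spanning CPUs 4-7 inside a domain spanning CPUs 0-5 contributes only CPUs 4 and 5, so its weight is 2 rather than 4.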
[PATCH v2 8/8] rasdaemon: Modify confiure.ac for Hisilicon Kunpeng errors
by Shiju Jose 26 Jan '21

Modify
HIP07 SAS HW errors : $USE_HISI_NS_DECODE
to
HISI Kunpeng errors : $USE_HISI_NS_DECODE.

Signed-off-by: Shiju Jose <shiju.jose(a)huawei.com>
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 9893bb4..3a8b0c7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -191,7 +191,7 @@ compile time options summary
     EXTLOG              : $USE_EXTLOG
     CPER non-standard   : $USE_NON_STANDARD
     ABRT report         : $USE_ABRT_REPORT
-    HIP07 SAS HW errors : $USE_HISI_NS_DECODE
+    HISI Kunpeng errors : $USE_HISI_NS_DECODE
     ARM events          : $USE_ARM
     DEVLINK             : $USE_DEVLINK
     Disk I/O errors     : $USE_DISKERROR
--
2.17.1
[PATCH v2 7/8] rasdaemon: ras-mc-ctl: Add support for HiSilicon Kunpeng9xx common errors
by Shiju Jose 26 Jan '21

Add support for the HiSilicon Kunpeng9xx platforms common errors.

Signed-off-by: Shiju Jose <shiju.jose(a)huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei(a)huawei.com>
---
 util/ras-mc-ctl.in | 44 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index 8befc5d..37a5042 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1519,6 +1519,7 @@ sub errors
 # Definitions of the vendor platform IDs.
 use constant {
     HISILICON_KUNPENG_920 => "Kunpeng920",
+    HISILICON_KUNPENG_9XX => "Kunpeng9xx",
 };

 sub vendor_errors_summary
@@ -1526,7 +1527,7 @@ sub vendor_errors_summary
     require DBI;
     my ($num_args, $platform_id);
     my ($query, $query_handle, $count, $out);
-    my ($module_id, $sub_module_id, $err_severity, $err_sev);
+    my ($module_id, $sub_module_id, $err_severity, $err_sev, $err_info);

     $num_args = $#ARGV + 1;
     $platform_id = 0;
@@ -1601,6 +1602,24 @@ sub vendor_errors_summary
         $query_handle->finish;
     }

+    # HiSilicon Kunpeng9xx common errors
+    if ($platform_id eq HISILICON_KUNPENG_9XX) {
+        $query = "select err_info, count(*) from hisi_common_section";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($err_info, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "\terrors: $count\n";
+        }
+        if ($out ne "") {
+            print "HiSilicon Kunpeng9xx common error events summary:\n$out\n";
+        } else {
+            print "No HiSilicon Kunpeng9xx common errors.\n\n";
+        }
+        $query_handle->finish;
+    }
+
     undef($dbh);
 }

@@ -1610,7 +1629,7 @@ sub vendor_errors
     my ($num_args, $platform_id);
     my ($query, $query_handle, $id, $timestamp, $out);
     my ($version, $soc_id, $socket_id, $nimbus_id, $core_id, $port_id);
-    my ($module_id, $sub_module_id, $err_severity, $err_type, $regs);
+    my ($module_id, $sub_module_id, $err_severity, $err_type, $err_info, $regs);

     $num_args = $#ARGV + 1;
     $platform_id = 0;
@@ -1696,6 +1715,26 @@ sub vendor_errors
         $query_handle->finish;
     }

+    # HiSilicon Kunpeng9xx common errors
+    if ($platform_id eq HISILICON_KUNPENG_9XX) {
+        $query = "select id, timestamp, err_info, regs_dump from hisi_common_section order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $timestamp, $err_info, $regs));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id. $timestamp ";
+            $out .= "Error Info:$err_info \n" if ($err_info);
+            $out .= "Error Registers: $regs\n\n" if ($regs);
+        }
+        if ($out ne "") {
+            print "HiSilicon Kunpeng9xx common error events:\n$out\n";
+        } else {
+            print "No HiSilicon Kunpeng9xx common errors.\n";
+        }
+        $query_handle->finish;
+    }
+
     undef($dbh);
 }

@@ -1703,6 +1742,7 @@ sub vendor_platforms
 {
     print "\nSupported platforms for the vendor-specific errors:\n";
     print "\tHiSilicon Kunpeng920, platform-id=\"", HISILICON_KUNPENG_920, "\"\n";
+    print "\tHiSilicon Kunpeng9xx, platform-id=\"", HISILICON_KUNPENG_9XX, "\"\n";
     print "\n";
 }

--
2.17.1
[PATCH v2 6/8] rasdaemon: ras-mc-ctl: Add support for HiSilicon Kunpeng920 errors
by Shiju Jose 26 Jan '21

Add support for the HiSilicon Kunpeng920 errors. Supported error formats: OEM type 1, OEM typ2 and PCIe controller error formats. Signed-off-by: Shiju Jose <shiju.jose(a)huawei.com> Reviewed-by: Xiaofei Tan <tanxiaofei(a)huawei.com> --- util/ras-mc-ctl.in | 149 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 149 insertions(+) diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in index 6820823..8befc5d 100755 --- a/util/ras-mc-ctl.in +++ b/util/ras-mc-ctl.in @@ -1516,10 +1516,17 @@ sub errors undef($dbh); } +# Definitions of the vendor platform IDs. +use constant { + HISILICON_KUNPENG_920 => "Kunpeng920", +}; + sub vendor_errors_summary { require DBI; my ($num_args, $platform_id); + my ($query, $query_handle, $count, $out); + my ($module_id, $sub_module_id, $err_severity, $err_sev); $num_args = $#ARGV + 1; $platform_id = 0; @@ -1531,6 +1538,69 @@ sub vendor_errors_summary my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {}); + # HiSilicon Kunpeng920 errors + if ($platform_id eq HISILICON_KUNPENG_920) { + $query = "select err_severity, module_id, count(*) from hip08_oem_type1_event_v2 group by err_severity, module_id"; + $query_handle = $dbh->prepare($query); + $query_handle->execute(); + $query_handle->bind_columns(\($err_severity, $module_id, $count)); + $out = ""; + $err_sev = ""; + while($query_handle->fetch()) { + if ($err_severity ne $err_sev) { + $out .= "$err_severity errors:\n"; + $err_sev = $err_severity; + } + $out .= "\t$module_id: $count\n"; + } + if ($out ne "") { + print "HiSilicon Kunpeng920 OEM type1 error events summary:\n$out\n"; + } else { + print "No HiSilicon Kunpeng920 OEM type1 errors.\n\n"; + } + $query_handle->finish; + + $query = "select err_severity, module_id, count(*) from hip08_oem_type2_event_v2 group by err_severity, module_id"; + $query_handle = $dbh->prepare($query); + $query_handle->execute(); + $query_handle->bind_columns(\($err_severity, $module_id, $count)); + $out = ""; + $err_sev = ""; + 
while($query_handle->fetch()) { + if ($err_severity ne $err_sev) { + $out .= "$err_severity errors:\n"; + $err_sev = $err_severity; + } + $out .= "\t$module_id: $count\n"; + } + if ($out ne "") { + print "HiSilicon Kunpeng920 OEM type2 error events summary:\n$out\n"; + } else { + print "No HiSilicon Kunpeng920 OEM type2 errors.\n\n"; + } + $query_handle->finish; + + $query = "select err_severity, sub_module_id, count(*) from hip08_pcie_local_event_v2 group by err_severity, sub_module_id"; + $query_handle = $dbh->prepare($query); + $query_handle->execute(); + $query_handle->bind_columns(\($err_severity, $sub_module_id, $count)); + $out = ""; + $err_sev = ""; + while($query_handle->fetch()) { + if ($err_severity ne $err_sev) { + $out .= "$err_severity errors:\n"; + $err_sev = $err_severity; + } + $out .= "\t$sub_module_id: $count\n"; + } + if ($out ne "") { + print "HiSilicon Kunpeng920 PCIe controller error events summary:\n$out\n"; + } else { + print "No HiSilicon Kunpeng920 PCIe controller errors.\n\n"; + } + $query_handle->finish; + } + undef($dbh); } @@ -1538,6 +1608,9 @@ sub vendor_errors { require DBI; my ($num_args, $platform_id); + my ($query, $query_handle, $id, $timestamp, $out); + my ($version, $soc_id, $socket_id, $nimbus_id, $core_id, $port_id); + my ($module_id, $sub_module_id, $err_severity, $err_type, $regs); $num_args = $#ARGV + 1; $platform_id = 0; @@ -1549,12 +1622,88 @@ sub vendor_errors my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {}); + # HiSilicon Kunpeng920 errors + if ($platform_id eq HISILICON_KUNPENG_920) { + $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, module_id, sub_module_id, err_severity, regs_dump from hip08_oem_type1_event_v2 order by id, module_id, err_severity"; + $query_handle = $dbh->prepare($query); + $query_handle->execute(); + $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $module_id, $sub_module_id, $err_severity, $regs)); + $out = ""; + 
while($query_handle->fetch()) { + $out .= "$id. $timestamp Error Info: "; + $out .= "version=$version, "; + $out .= "soc_id=$soc_id, " if ($soc_id); + $out .= "socket_id=$socket_id, " if ($socket_id); + $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id); + $out .= "module_id=$module_id, " if ($module_id); + $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id); + $out .= "err_severity=$err_severity, \n" if ($err_severity); + $out .= "Error Registers: $regs\n\n" if ($regs); + } + if ($out ne "") { + print "HiSilicon Kunpeng920 OEM type1 error events:\n$out\n"; + } else { + print "No HiSilicon Kunpeng920 OEM type1 errors.\n"; + } + $query_handle->finish; + + $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, module_id, sub_module_id, err_severity, regs_dump from hip08_oem_type2_event_v2 order by id, module_id, err_severity"; + $query_handle = $dbh->prepare($query); + $query_handle->execute(); + $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $module_id, $sub_module_id, $err_severity, $regs)); + $out = ""; + while($query_handle->fetch()) { + $out .= "$id. 
$timestamp Error Info: "; + $out .= "version=$version, "; + $out .= "soc_id=$soc_id, " if ($soc_id); + $out .= "socket_id=$socket_id, " if ($socket_id); + $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id); + $out .= "module_id=$module_id, " if ($module_id); + $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id); + $out .= "err_severity=$err_severity, \n" if ($err_severity); + $out .= "Error Registers: $regs\n\n" if ($regs); + } + if ($out ne "") { + print "HiSilicon Kunpeng920 OEM type2 error events:\n$out\n"; + } else { + print "No HiSilicon Kunpeng920 OEM type2 errors.\n"; + } + $query_handle->finish; + + $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, sub_module_id, core_id, port_id, err_severity, err_type, regs_dump from hip08_pcie_local_event_v2 order by id, sub_module_id, err_severity"; + $query_handle = $dbh->prepare($query); + $query_handle->execute(); + $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $sub_module_id, $core_id, $port_id, $err_severity, $err_type, $regs)); + $out = ""; + while($query_handle->fetch()) { + $out .= "$id. 
$timestamp Error Info: "; + $out .= "version=$version, "; + $out .= "soc_id=$soc_id, " if ($soc_id); + $out .= "socket_id=$socket_id, " if ($socket_id); + $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id); + $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id); + $out .= "core_id=$core_id, " if ($core_id); + $out .= "port_id=$port_id, " if ($port_id); + $out .= "err_severity=$err_severity, " if ($err_severity); + $out .= "err_type=$err_type, \n" if ($err_type); + $out .= "Error Registers: $regs\n\n" if ($regs); + } + if ($out ne "") { + print "HiSilicon Kunpeng920 PCIe controller error events:\n$out\n"; + } else { + print "No HiSilicon Kunpeng920 PCIe controller errors.\n"; + } + $query_handle->finish; + } + undef($dbh); } sub vendor_platforms { print "\nSupported platforms for the vendor-specific errors:\n"; + print "\tHiSilicon Kunpeng920, platform-id=\"", HISILICON_KUNPENG_920, "\"\n"; + print "\n"; } sub log_msg { print STDERR "$prog: ", @_ unless $conf{opt}{quiet}; } -- 2.17.1
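The summary loops in this patch print a "<severity> errors:" heading whenever the severity of the current row differs from the previous one, followed by one line per module. As a rough C analogue of that loop (the patch itself is Perl), the sketch below formats an in-memory array of rows. `struct summary_row` and `format_summary()` are hypothetical names, and the module names in the usage note are made up. The heading logic assumes rows with the same severity are adjacent.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical row type: one (severity, module, count) result row. */
struct summary_row {
	const char *severity;
	const char *module;
	int count;
};

/*
 * Append a heading each time the severity changes, then one line per row.
 * Stops at the last row that fits completely; returns the number of bytes
 * written, not counting the terminating NUL.
 */
size_t format_summary(const struct summary_row *rows, size_t n,
		      char *buf, size_t size)
{
	const char *prev = "";
	size_t used = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int len;

		if (strcmp(rows[i].severity, prev) != 0) {
			len = snprintf(buf + used, size - used,
				       "%s errors:\n", rows[i].severity);
			if (len < 0 || (size_t)len >= size - used) {
				buf[used] = '\0';	/* drop the partial line */
				break;
			}
			used += (size_t)len;
			prev = rows[i].severity;
		}

		len = snprintf(buf + used, size - used, "\t%s: %d\n",
			       rows[i].module, rows[i].count);
		if (len < 0 || (size_t)len >= size - used) {
			buf[used] = '\0';	/* drop the partial line */
			break;
		}
		used += (size_t)len;
	}
	return used;
}
```

For rows `{"corrected", "MN", 2}`, `{"corrected", "PLL", 1}`, `{"fatal", "MN", 3}`, the buffer holds a "corrected errors:" heading with two indented lines, then a "fatal errors:" heading with one.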
[PATCH v2 5/8] rasdaemon: ras-mc-ctl: Add support for the vendor-specific errors
by Shiju Jose 26 Jan '21

Add commands to support logging the vendor-specific error info in the
ras-mc-ctl.

Signed-off-by: Shiju Jose <shiju.jose(a)huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei(a)huawei.com>
---
 util/ras-mc-ctl.in | 64 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index 97b1fa4..6820823 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -87,6 +87,9 @@ Usage: $prog [OPTIONS...]
  --summary  Presents a summary of the logged errors.
  --errors  Shows the errors stored at the error database.
  --error-count  Shows the corrected and uncorrected error counts using sysfs.
+ --vendor-errors-summary <platform-id>  Presents a summary of the vendor-specific logged errors.
+ --vendor-errors <platform-id>  Shows the vendor-specific errors stored in the error database.
+ --vendor-platforms  Shows the supported platforms with platform-ids for the vendor-specific errors.
  --help  This help message.
 EOF

@@ -134,6 +137,18 @@ if ($conf{opt}{errors}) {
     errors ();
 }

+if ($conf{opt}{vendor_errors_summary}) {
+    vendor_errors_summary ();
+}
+
+if ($conf{opt}{vendor_errors}) {
+    vendor_errors ();
+}
+
+if ($conf{opt}{vendor_platforms}) {
+    vendor_platforms ();
+}
+
 exit (0);

 sub parse_cmdline
@@ -149,6 +164,9 @@ sub parse_cmdline
     $conf{opt}{summary} = 0;
     $conf{opt}{errors} = 0;
     $conf{opt}{error_count} = 0;
+    $conf{opt}{vendor_errors_summary} = 0;
+    $conf{opt}{vendor_errors} = 0;
+    $conf{opt}{vendor_platforms} = 0;

     my $rref = \$conf{opt}{report};
     my $mref = \$conf{opt}{mainboard};
@@ -166,7 +184,10 @@ sub parse_cmdline
         "layout" => \$conf{opt}{display_memory_layout},
         "summary" => \$conf{opt}{summary},
         "errors" => \$conf{opt}{errors},
-        "error-count" => \$conf{opt}{error_count}
+        "error-count" => \$conf{opt}{error_count},
+        "vendor-errors-summary" => \$conf{opt}{vendor_errors_summary},
+        "vendor-errors" => \$conf{opt}{vendor_errors},
+        "vendor-platforms" => \$conf{opt}{vendor_platforms},
     );

     usage(1) if !$rc;
@@ -1495,6 +1516,47 @@ sub errors
     undef($dbh);
 }

+sub vendor_errors_summary
+{
+    require DBI;
+    my ($num_args, $platform_id);
+
+    $num_args = $#ARGV + 1;
+    $platform_id = 0;
+    if ($num_args ne 0) {
+        $platform_id = $ARGV[0];
+    } else {
+        return;
+    }
+
+    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});
+
+    undef($dbh);
+}
+
+sub vendor_errors
+{
+    require DBI;
+    my ($num_args, $platform_id);
+
+    $num_args = $#ARGV + 1;
+    $platform_id = 0;
+    if ($num_args ne 0) {
+        $platform_id = $ARGV[0];
+    } else {
+        return;
+    }
+
+    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});
+
+    undef($dbh);
+}
+
+sub vendor_platforms
+{
+    print "\nSupported platforms for the vendor-specific errors:\n";
+}
+
 sub log_msg { print STDERR "$prog: ", @_ unless $conf{opt}{quiet}; }
 sub log_error { log_msg ("Error: @_"); }
--
2.17.1
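The option handling added here treats the three new options as boolean flags and takes the platform id from the first remaining argument. A rough C analogue of that pattern, using `getopt_long()` from `<getopt.h>`, is sketched below. It is not part of rasdaemon: `struct vendor_opts` and `parse_vendor_opts()` are hypothetical names, and only the option strings come from the patch.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical result type for the three flags and the platform id. */
struct vendor_opts {
	int errors_summary;
	int errors;
	int platforms;
	const char *platform_id;	/* first non-option argument, or NULL */
};

/* Returns 0 on success, -1 if an unknown option is seen. */
int parse_vendor_opts(int argc, char **argv, struct vendor_opts *out)
{
	int summary = 0, errors = 0, platforms = 0;
	const struct option longopts[] = {
		{ "vendor-errors-summary", no_argument, &summary,   1 },
		{ "vendor-errors",         no_argument, &errors,    1 },
		{ "vendor-platforms",      no_argument, &platforms, 1 },
		{ NULL, 0, NULL, 0 },
	};
	int c;

	optind = 1;	/* restart scanning so the function can be called again */
	opterr = 0;	/* do not print diagnostics from getopt_long itself */
	while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
		if (c == '?')
			return -1;
	}

	out->errors_summary = summary;
	out->errors = errors;
	out->platforms = platforms;
	/* Like $ARGV[0] in the Perl code: the first argument left over. */
	out->platform_id = (optind < argc) ? argv[optind] : NULL;
	return 0;
}
```

Called with `ras-mc-ctl --vendor-errors Kunpeng920`, this sets `errors` to 1 and `platform_id` to "Kunpeng920"; a caller could then return early when `platform_id` is NULL, as the Perl subroutines do.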
[PATCH v2 4/8] rasdaemon: ras-mc-ctl: Fix for exception when an event is not enabled
by Shiju Jose 26 Jan '21

When an event is not enabled in the build and thus the event's table is not present in the SQLite DB, then the DBI would detect exception and ras-mc-ctl exit without read and log remaining event's information. Following is the error log when the devlink_event is not enabled, "DBD::SQLite::db prepare failed: no such table: devlink_event at ./ras-mc-ctl line 1198. Can't call method "execute" on an undefined value at ./ras-mc-ctl line 1199" Add an extra check, whether an event is enabled in the build, before try reading the tables. Reported-by: Xiaofei Tan <tanxiaofei(a)huawei.com> Suggested-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org> Signed-off-by: Shiju Jose <shiju.jose(a)huawei.com> --- configure.ac | 7 + util/ras-mc-ctl.in | 506 ++++++++++++++++++++++++--------------------- 2 files changed, 278 insertions(+), 235 deletions(-) diff --git a/configure.ac b/configure.ac index a6251d4..9893bb4 100644 --- a/configure.ac +++ b/configure.ac @@ -42,6 +42,7 @@ AC_SUBST([SQLITE3_LIBS]) AC_ARG_ENABLE([aer], AS_HELP_STRING([--enable-aer], [enable PCIe AER events (currently experimental)])) +AC_SUBST([enable_aer]) AS_IF([test "x$enable_aer" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_AER,1,"have PCIe AER events collect") @@ -63,6 +64,7 @@ AM_COND_IF([WITH_NON_STANDARD], [USE_NON_STANDARD="yes"], [USE_NON_STANDARD="no" AC_ARG_ENABLE([arm], AS_HELP_STRING([--enable-arm], [enable ARM events (currently experimental)])) +AC_SUBST([enable_arm]) AS_IF([test "x$enable_arm" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_ARM,1,"have ARM events collect") @@ -73,6 +75,7 @@ AM_COND_IF([WITH_ARM], [USE_ARM="yes"], [USE_ARM="no"]) AC_ARG_ENABLE([mce], AS_HELP_STRING([--enable-mce], [enable MCE events (currently experimental)])) +AC_SUBST([enable_mce]) AS_IF([test "x$enable_mce" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_MCE,1,"have PCIe MCE events collect") @@ -83,6 +86,7 @@ AM_COND_IF([WITH_MCE], [USE_MCE="yes"], [USE_MCE="no"]) 
AC_ARG_ENABLE([extlog], AS_HELP_STRING([--enable-extlog], [enable EXTLOG events (currently experimental)])) +AC_SUBST([enable_extlog]) AS_IF([test "x$enable_extlog" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_EXTLOG,1,"have EXTLOG events collect") @@ -93,6 +97,7 @@ AM_COND_IF([WITH_EXTLOG], [USE_EXTLOG="yes"], [USE_EXTLOG="no"]) AC_ARG_ENABLE([devlink], AS_HELP_STRING([--enable-devlink], [enable devlink health events (currently experimental)])) +AC_SUBST([enable_devlink]) AS_IF([test "x$enable_devlink" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_DEVLINK,1,"have devlink health events collect") @@ -103,6 +108,7 @@ AM_COND_IF([WITH_DEVLINK], [USE_DEVLINK="yes"], [USE_DEVLINK="no"]) AC_ARG_ENABLE([diskerror], AS_HELP_STRING([--enable-diskerror], [enable disk I/O error events (currently experimental)])) +AC_SUBST([enable_diskerror]) AS_IF([test "x$enable_diskerror" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_DISKERROR,1,"have disk I/O errors collect") @@ -113,6 +119,7 @@ AM_COND_IF([WITH_DISKERROR], [USE_DISKERROR="yes"], [USE_DISKERROR="no"]) AC_ARG_ENABLE([memory_failure], AS_HELP_STRING([--enable-memory-failure], [enable memory failure events (currently experimental)])) +AC_SUBST([enable_memory_failure]) AS_IF([test "x$enable_memory_failure" = "xyes" || test "x$enable_all" == "xyes"], [ AC_DEFINE(HAVE_MEMORY_FAILURE,1,"have memory failure events collect") diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in index eebcc4e..97b1fa4 100755 --- a/util/ras-mc-ctl.in +++ b/util/ras-mc-ctl.in @@ -65,6 +65,14 @@ $conf{mbconfig} = "$sysconfdir/ras/mainboard"; my $status = 0; +my $enable_aer = "@enable_aer@"; +my $enable_arm = "@enable_arm@"; +my $enable_mce = "@enable_mce@"; +my $enable_extlog = "@enable_extlog@"; +my $enable_devlink = "@enable_devlink@"; +my $enable_diskerror = "@enable_diskerror@"; +my $enable_mem_failure = "@enable_memory_failure@"; + my $usage = <<EOF; Usage: $prog [OPTIONS...] --quiet Quiet operation. 
@@ -1144,118 +1152,132 @@ sub summary
     $query_handle->finish;

     # PCIe AER aer_event errors
-    $query = "select err_type, err_msg, count(*) from aer_event group by err_type, err_msg";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($err_type, $msg, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "\t$count $err_type errors: $msg\n";
-    }
-    if ($out ne "") {
-        print "PCIe AER events summary:\n$out\n";
-    } else {
-        print "No PCIe AER errors.\n\n";
+    if ($enable_aer eq "yes") {
+        $query = "select err_type, err_msg, count(*) from aer_event group by err_type, err_msg";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($err_type, $msg, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "\t$count $err_type errors: $msg\n";
+        }
+        if ($out ne "") {
+            print "PCIe AER events summary:\n$out\n";
+        } else {
+            print "No PCIe AER errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # ARM processor arm_event errors
-    $query = "select mpidr, count(*) from arm_event group by mpidr";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($mpidr, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= sprintf "\tCPU(mpidr=0x%x) has %d errors\n", $mpidr, $count;
-    }
-    if ($out ne "") {
-        print "ARM processor events summary:\n$out\n";
-    } else {
-        print "No ARM processor errors.\n\n";
+    if ($enable_arm eq "yes") {
+        $query = "select mpidr, count(*) from arm_event group by mpidr";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($mpidr, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= sprintf "\tCPU(mpidr=0x%x) has %d errors\n", $mpidr, $count;
+        }
+        if ($out ne "") {
+            print "ARM processor events summary:\n$out\n";
+        } else {
+            print "No ARM processor errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # extlog errors
-    $query = "select etype, severity, count(*) from extlog_event group by etype, severity";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($etype, $severity, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $etype_string = get_extlog_type($etype);
-        $severity_string = get_extlog_severity($severity);
-        $out .= "\t$count $etype_string $severity_string errors\n";
-    }
-    if ($out ne "") {
-        print "Extlog records summary:\n$out";
-    } else {
-        print "No Extlog errors.\n\n";
+    if ($enable_extlog eq "yes") {
+        $query = "select etype, severity, count(*) from extlog_event group by etype, severity";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($etype, $severity, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $etype_string = get_extlog_type($etype);
+            $severity_string = get_extlog_severity($severity);
+            $out .= "\t$count $etype_string $severity_string errors\n";
+        }
+        if ($out ne "") {
+            print "Extlog records summary:\n$out";
+        } else {
+            print "No Extlog errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # devlink errors
-    $query = "select dev_name, count(*) from devlink_event group by dev_name";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($dev_name, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "\t$dev_name has $count errors\n";
-    }
-    if ($out ne "") {
-        print "Devlink records summary:\n$out";
-    } else {
-        print "No devlink errors.\n";
+    if ($enable_devlink eq "yes") {
+        $query = "select dev_name, count(*) from devlink_event group by dev_name";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($dev_name, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "\t$dev_name has $count errors\n";
+        }
+        if ($out ne "") {
+            print "Devlink records summary:\n$out";
+        } else {
+            print "No devlink errors.\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # Disk errors
-    $query = "select dev, count(*) from disk_errors group by dev";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($dev, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "\t$dev has $count errors\n";
-    }
-    if ($out ne "") {
-        print "Disk errors summary:\n$out";
-    } else {
-        print "No disk errors.\n";
+    if ($enable_diskerror eq "yes") {
+        $query = "select dev, count(*) from disk_errors group by dev";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($dev, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "\t$dev has $count errors\n";
+        }
+        if ($out ne "") {
+            print "Disk errors summary:\n$out";
+        } else {
+            print "No disk errors.\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # Memory failure errors
-    $query = "select action_result, count(*) from memory_failure_event group by action_result";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($action_result, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "\t$action_result errors: $count\n";
-    }
-    if ($out ne "") {
-        print "Memory failure events summary:\n$out\n";
-    } else {
-        print "No Memory failure errors.\n\n";
+    if ($enable_mem_failure eq "yes") {
+        $query = "select action_result, count(*) from memory_failure_event group by action_result";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($action_result, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "\t$action_result errors: $count\n";
+        }
+        if ($out ne "") {
+            print "Memory failure events summary:\n$out\n";
+        } else {
+            print "No Memory failure errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # MCE mce_record errors
-    $query = "select error_msg, count(*) from mce_record group by error_msg";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($msg, $count));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "\t$count $msg errors\n";
-    }
-    if ($out ne "") {
-        print "MCE records summary:\n$out";
-    } else {
-        print "No MCE errors.\n";
+    if ($enable_mce eq "yes") {
+        $query = "select error_msg, count(*) from mce_record group by error_msg";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($msg, $count));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "\t$count $msg errors\n";
+        }
+        if ($out ne "") {
+            print "MCE records summary:\n$out";
+        } else {
+            print "No MCE errors.\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     undef($dbh);
 }
@@ -1294,167 +1316,181 @@ sub errors
     $query_handle->finish;

     # PCIe AER aer_event errors
-    $query = "select id, timestamp, dev_name, err_type, err_msg from aer_event order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $time, $devname, $type, $msg));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "$id $time $devname $type error: $msg\n";
-    }
-    if ($out ne "") {
-        print "PCIe AER events:\n$out\n";
-    } else {
-        print "No PCIe AER errors.\n\n";
+    if ($enable_aer eq "yes") {
+        $query = "select id, timestamp, dev_name, err_type, err_msg from aer_event order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $time, $devname, $type, $msg));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id $time $devname $type error: $msg\n";
+        }
+        if ($out ne "") {
+            print "PCIe AER events:\n$out\n";
+        } else {
+            print "No PCIe AER errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # ARM processor arm_event errors
-    $query = "select id, timestamp, error_count, affinity, mpidr, running_state, psci_state from arm_event order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $timestamp, $error_count, $affinity, $mpidr, $r_state, $psci_state));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "$id $timestamp error: ";
-        $out .= "error_count=$error_count, " if ($error_count);
-        $out .= "affinity_level=$affinity, ";
-        $out .= sprintf "mpidr=0x%x, ", $mpidr;
-        $out .= sprintf "running_state=0x%x, ", $r_state;
-        $out .= sprintf "psci_state=0x%x", $psci_state;
-        $out .= "\n";
-    }
-    if ($out ne "") {
-        print "ARM processor events:\n$out\n";
-    } else {
-        print "No ARM processor errors.\n\n";
+    if ($enable_arm eq "yes") {
+        $query = "select id, timestamp, error_count, affinity, mpidr, running_state, psci_state from arm_event order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $timestamp, $error_count, $affinity, $mpidr, $r_state, $psci_state));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id $timestamp error: ";
+            $out .= "error_count=$error_count, " if ($error_count);
+            $out .= "affinity_level=$affinity, ";
+            $out .= sprintf "mpidr=0x%x, ", $mpidr;
+            $out .= sprintf "running_state=0x%x, ", $r_state;
+            $out .= sprintf "psci_state=0x%x", $psci_state;
+            $out .= "\n";
+        }
+        if ($out ne "") {
+            print "ARM processor events:\n$out\n";
+        } else {
+            print "No ARM processor errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # Extlog errors
-    $query = "select id, timestamp, etype, severity, address, fru_id, fru_text, cper_data from extlog_event order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $timestamp, $etype, $severity, $addr, $fru_id, $fru_text, $cper_data));
-    $out = "";
-    while($query_handle->fetch()) {
-        $etype_string = get_extlog_type($etype);
-        $severity_string = get_extlog_severity($severity);
-        $out .= "$id $timestamp error: ";
-        $out .= "type=$etype_string, ";
-        $out .= "severity=$severity_string, ";
-        $out .= sprintf "address=0x%08x, ", $addr;
-        $out .= sprintf "fru_id=%s, ", get_uuid_le($fru_id);
-        $out .= "fru_text='$fru_text', ";
-        $out .= get_cper_data_text($cper_data) if ($cper_data);
-        $out .= "\n";
-    }
-    if ($out ne "") {
-        print "Extlog events:\n$out\n";
-    } else {
-        print "No Extlog errors.\n\n";
+    if ($enable_extlog eq "yes") {
+        $query = "select id, timestamp, etype, severity, address, fru_id, fru_text, cper_data from extlog_event order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $timestamp, $etype, $severity, $addr, $fru_id, $fru_text, $cper_data));
+        $out = "";
+        while($query_handle->fetch()) {
+            $etype_string = get_extlog_type($etype);
+            $severity_string = get_extlog_severity($severity);
+            $out .= "$id $timestamp error: ";
+            $out .= "type=$etype_string, ";
+            $out .= "severity=$severity_string, ";
+            $out .= sprintf "address=0x%08x, ", $addr;
+            $out .= sprintf "fru_id=%s, ", get_uuid_le($fru_id);
+            $out .= "fru_text='$fru_text', ";
+            $out .= get_cper_data_text($cper_data) if ($cper_data);
+            $out .= "\n";
+        }
+        if ($out ne "") {
+            print "Extlog events:\n$out\n";
+        } else {
+            print "No Extlog errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # devlink errors
-    $query = "select id, timestamp, bus_name, dev_name, driver_name, reporter_name, msg from devlink_event order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $timestamp, $bus_name, $dev_name, $driver_name, $reporter_name, $msg));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "$id $timestamp error: ";
-        $out .= "bus_name=$bus_name, ";
-        $out .= "dev_name=$dev_name, ";
-        $out .= "driver_name=$driver_name, ";
-        $out .= "reporter_name=$reporter_name, ";
-        $out .= "message='$msg', ";
-        $out .= "\n";
-    }
-    if ($out ne "") {
-        print "Devlink events:\n$out\n";
-    } else {
-        print "No devlink errors.\n\n";
+    if ($enable_devlink eq "yes") {
+        $query = "select id, timestamp, bus_name, dev_name, driver_name, reporter_name, msg from devlink_event order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $timestamp, $bus_name, $dev_name, $driver_name, $reporter_name, $msg));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id $timestamp error: ";
+            $out .= "bus_name=$bus_name, ";
+            $out .= "dev_name=$dev_name, ";
+            $out .= "driver_name=$driver_name, ";
+            $out .= "reporter_name=$reporter_name, ";
+            $out .= "message='$msg', ";
+            $out .= "\n";
+        }
+        if ($out ne "") {
+            print "Devlink events:\n$out\n";
+        } else {
+            print "No devlink errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # Disk errors
-    $query = "select id, timestamp, dev, sector, nr_sector, error, rwbs, cmd from disk_errors order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $timestamp, $dev, $sector, $nr_sector, $error, $rwbs, $cmd));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "$id $timestamp error: ";
-        $out .= "dev=$dev, ";
-        $out .= "sector=$sector, ";
-        $out .= "nr_sector=$nr_sector, ";
-        $out .= "error='$error', ";
-        $out .= "rwbs='$rwbs', ";
-        $out .= "cmd='$cmd', ";
-        $out .= "\n";
-    }
-    if ($out ne "") {
-        print "Disk errors\n$out\n";
-    } else {
-        print "No disk errors.\n\n";
+    if ($enable_diskerror eq "yes") {
+        $query = "select id, timestamp, dev, sector, nr_sector, error, rwbs, cmd from disk_errors order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $timestamp, $dev, $sector, $nr_sector, $error, $rwbs, $cmd));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id $timestamp error: ";
+            $out .= "dev=$dev, ";
+            $out .= "sector=$sector, ";
+            $out .= "nr_sector=$nr_sector, ";
+            $out .= "error='$error', ";
+            $out .= "rwbs='$rwbs', ";
+            $out .= "cmd='$cmd', ";
+            $out .= "\n";
+        }
+        if ($out ne "") {
+            print "Disk errors\n$out\n";
+        } else {
+            print "No disk errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # Memory failure errors
-    $query = "select id, timestamp, pfn, page_type, action_result from memory_failure_event order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $timestamp, $pfn, $page_type, $action_result));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "$id $timestamp error: ";
-        $out .= "pfn=$pfn, page_type=$page_type, action_result=$action_result\n";
-    }
-    if ($out ne "") {
-        print "Memory failure events:\n$out\n";
-    } else {
-        print "No Memory failure errors.\n\n";
+    if ($enable_mem_failure eq "yes") {
+        $query = "select id, timestamp, pfn, page_type, action_result from memory_failure_event order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $timestamp, $pfn, $page_type, $action_result));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id $timestamp error: ";
+            $out .= "pfn=$pfn, page_type=$page_type, action_result=$action_result\n";
+        }
+        if ($out ne "") {
+            print "Memory failure events:\n$out\n";
+        } else {
+            print "No Memory failure errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     # MCE mce_record errors
-    $query = "select id, timestamp, mcgcap, mcgstatus, status, addr, misc, ip, tsc, walltime, cpu, cpuid, apicid, socketid, cs, bank, cpuvendor, bank_name, error_msg, mcgstatus_msg, mcistatus_msg, user_action, mc_location from mce_record order by id";
-    $query_handle = $dbh->prepare($query);
-    $query_handle->execute();
-    $query_handle->bind_columns(\($id, $time, $mcgcap,$mcgstatus, $status, $addr, $misc, $ip, $tsc, $walltime, $cpu, $cpuid, $apicid, $socketid, $cs, $bank, $cpuvendor, $bank_name, $msg, $mcgstatus_msg, $mcistatus_msg, $user_action, $mc_location));
-    $out = "";
-    while($query_handle->fetch()) {
-        $out .= "$id $time error: $msg";
-        $out .= ", CPU $cpuvendor" if ($cpuvendor);
-        $out .= ", bank $bank_name" if ($bank_name);
-        $out .= ", mcg $mcgstatus_msg" if ($mcgstatus_msg);
-        $out .= ", mci $mcistatus_msg" if ($mcistatus_msg);
-        $out .= ", $mc_location" if ($mc_location);
-        $out .= ", $user_action" if ($user_action);
-        $out .= sprintf ", mcgcap=0x%08x", $mcgcap if ($mcgcap);
-        $out .= sprintf ", mcgstatus=0x%08x", $mcgstatus if ($mcgstatus);
-        $out .= sprintf ", status=0x%08x", $status if ($status);
-        $out .= sprintf ", addr=0x%08x", $addr if ($addr);
-        $out .= sprintf ", misc=0x%08x", $misc if ($misc);
-        $out .= sprintf ", ip=0x%08x", $ip if ($ip);
-        $out .= sprintf ", tsc=0x%08x", $tsc if ($tsc);
-        $out .= sprintf ", walltime=0x%08x", $walltime if ($walltime);
-        $out .= sprintf ", cpu=0x%08x", $cpu if ($cpu);
-        $out .= sprintf ", cpuid=0x%08x", $cpuid if ($cpuid);
-        $out .= sprintf ", apicid=0x%08x", $apicid if ($apicid);
-        $out .= sprintf ", socketid=0x%08x", $socketid if ($socketid);
-        $out .= sprintf ", cs=0x%08x", $cs if ($cs);
-        $out .= sprintf ", bank=0x%08x", $bank if ($bank);
-
-        $out .= "\n";
-    }
-    if ($out ne "") {
-        print "MCE events:\n$out\n";
-    } else {
-        print "No MCE errors.\n\n";
+    if ($enable_mce eq "yes") {
+        $query = "select id, timestamp, mcgcap, mcgstatus, status, addr, misc, ip, tsc, walltime, cpu, cpuid, apicid, socketid, cs, bank, cpuvendor, bank_name, error_msg, mcgstatus_msg, mcistatus_msg, user_action, mc_location from mce_record order by id";
+        $query_handle = $dbh->prepare($query);
+        $query_handle->execute();
+        $query_handle->bind_columns(\($id, $time, $mcgcap,$mcgstatus, $status, $addr, $misc, $ip, $tsc, $walltime, $cpu, $cpuid, $apicid, $socketid, $cs, $bank, $cpuvendor, $bank_name, $msg, $mcgstatus_msg, $mcistatus_msg, $user_action, $mc_location));
+        $out = "";
+        while($query_handle->fetch()) {
+            $out .= "$id $time error: $msg";
+            $out .= ", CPU $cpuvendor" if ($cpuvendor);
+            $out .= ", bank $bank_name" if ($bank_name);
+            $out .= ", mcg $mcgstatus_msg" if ($mcgstatus_msg);
+            $out .= ", mci $mcistatus_msg" if ($mcistatus_msg);
+            $out .= ", $mc_location" if ($mc_location);
+            $out .= ", $user_action" if ($user_action);
+            $out .= sprintf ", mcgcap=0x%08x", $mcgcap if ($mcgcap);
+            $out .= sprintf ", mcgstatus=0x%08x", $mcgstatus if ($mcgstatus);
+            $out .= sprintf ", status=0x%08x", $status if ($status);
+            $out .= sprintf ", addr=0x%08x", $addr if ($addr);
+            $out .= sprintf ", misc=0x%08x", $misc if ($misc);
+            $out .= sprintf ", ip=0x%08x", $ip if ($ip);
+            $out .= sprintf ", tsc=0x%08x", $tsc if ($tsc);
+            $out .= sprintf ", walltime=0x%08x", $walltime if ($walltime);
+            $out .= sprintf ", cpu=0x%08x", $cpu if ($cpu);
+            $out .= sprintf ", cpuid=0x%08x", $cpuid if ($cpuid);
+            $out .= sprintf ", apicid=0x%08x", $apicid if ($apicid);
+            $out .= sprintf ", socketid=0x%08x", $socketid if ($socketid);
+            $out .= sprintf ", cs=0x%08x", $cs if ($cs);
+            $out .= sprintf ", bank=0x%08x", $bank if ($bank);
+
+            $out .= "\n";
+        }
+        if ($out ne "") {
+            print "MCE events:\n$out\n";
+        } else {
+            print "No MCE errors.\n\n";
+        }
+        $query_handle->finish;
     }
-    $query_handle->finish;

     undef($dbh);
 }
--
2.17.1
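The patch above wraps every report section in an `if ($enable_... eq "yes")` block, so a section is queried only when its flag is set. As a rough illustration of that gating idea (not the tool's code), the same control flow can be written table-driven in Python. The names `ENABLED`, `SECTIONS`, `summary` and `demo_db` are hypothetical, the column types and sample rows are made up, and only the two SQL queries and the output strings are taken from the patch.

```python
import sqlite3

# Hypothetical stand-ins for the $enable_* variables used in the patch.
ENABLED = {"aer": "yes", "diskerror": "no"}

# One entry per section: (flag, heading, message when empty, query, row formatter).
# The queries and message strings are copied from the patch.
SECTIONS = [
    ("aer", "PCIe AER events summary:", "No PCIe AER errors.",
     "select err_type, err_msg, count(*) from aer_event group by err_type, err_msg",
     lambda r: f"\t{r[2]} {r[0]} errors: {r[1]}"),
    ("diskerror", "Disk errors summary:", "No disk errors.",
     "select dev, count(*) from disk_errors group by dev",
     lambda r: f"\t{r[0]} has {r[1]} errors"),
]


def summary(conn, enabled):
    """Return the report lines for every section whose flag is "yes"."""
    lines = []
    for flag, heading, empty, query, fmt in SECTIONS:
        if enabled.get(flag) != "yes":
            continue  # a disabled section is never queried
        rows = conn.execute(query).fetchall()
        if rows:
            lines.append(heading)
            lines.extend(fmt(r) for r in rows)
        else:
            lines.append(empty)
    return lines


def demo_db():
    """In-memory database that contains only the aer_event table."""
    conn = sqlite3.connect(":memory:")
    # Column names follow the patch's queries; the types are an assumption.
    conn.execute(
        "create table aer_event (id integer primary key, timestamp text, "
        "dev_name text, err_type text, err_msg text)"
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "insert into aer_event (dev_name, err_type, err_msg) values (?, ?, ?)",
        [("0000:00:01.0", "Corrected", "Bad TLP"),
         ("0000:00:01.0", "Corrected", "Bad TLP")],
    )
    return conn


if __name__ == "__main__":
    print("\n".join(summary(demo_db(), ENABLED)))
```

In this sketch the `disk_errors` table is never created, and the run still succeeds because the disabled section is skipped before any query is issued.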
[PATCH v2 3/8] rasdaemon: ras-mc-ctl: Add memory failure events
by Shiju Jose 26 Jan '21

Add support for memory failure errors (memory_failure_event) to the
ras-mc-ctl tool.

Sample Log,
ras-mc-ctl --summary
...
Memory failure events summary:
	Delayed errors: 4
	Failed errors: 1
...

ras-mc-ctl --errors
...
Memory failure events:
1 2020-10-28 23:20:41 -0800 error: pfn=0x204000000, page_type=free buddy page, action_result=Delayed
2 2020-10-28 23:31:38 -0800 error: pfn=0x204000000, page_type=free buddy page, action_result=Delayed
3 2020-10-28 23:54:54 -0800 error: pfn=0x205000000, page_type=free buddy page, action_result=Delayed
4 2020-10-29 00:12:25 -0800 error: pfn=0x204000000, page_type=free buddy page, action_result=Delayed
5 2020-10-29 00:26:36 -0800 error: pfn=0x204000000, page_type=free buddy page, action_result=Failed

Signed-off-by: Shiju Jose <shiju.jose(a)huawei.com>
---
 util/ras-mc-ctl.in | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index d8abdbd..eebcc4e 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1120,7 +1120,7 @@ sub summary
 {
     require DBI;
     my ($query, $query_handle, $out);
-    my ($err_type, $label, $mc, $top, $mid, $low, $count, $msg);
+    my ($err_type, $label, $mc, $top, $mid, $low, $count, $msg, $action_result);
     my ($etype, $severity, $etype_string, $severity_string);
     my ($dev_name, $dev);
     my ($mpidr);
@@ -1225,6 +1225,22 @@ sub summary
     }
     $query_handle->finish;

+    # Memory failure errors
+    $query = "select action_result, count(*) from memory_failure_event group by action_result";
+    $query_handle = $dbh->prepare($query);
+    $query_handle->execute();
+    $query_handle->bind_columns(\($action_result, $count));
+    $out = "";
+    while($query_handle->fetch()) {
+        $out .= "\t$action_result errors: $count\n";
+    }
+    if ($out ne "") {
+        print "Memory failure events summary:\n$out\n";
+    } else {
+        print "No Memory failure errors.\n\n";
+    }
+    $query_handle->finish;
+
     # MCE mce_record errors
     $query = "select error_msg, count(*) from mce_record group by error_msg";
     $query_handle = $dbh->prepare($query);
@@ -1253,6 +1269,7 @@ sub errors
     my ($bus_name, $dev_name, $driver_name, $reporter_name);
     my ($dev, $sector, $nr_sector, $error, $rwbs, $cmd);
     my ($error_count, $affinity, $mpidr, $r_state, $psci_state);
+    my ($pfn, $page_type, $action_result);

     my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});

@@ -1384,6 +1401,23 @@ sub errors
     }
     $query_handle->finish;

+    # Memory failure errors
+    $query = "select id, timestamp, pfn, page_type, action_result from memory_failure_event order by id";
+    $query_handle = $dbh->prepare($query);
+    $query_handle->execute();
+    $query_handle->bind_columns(\($id, $timestamp, $pfn, $page_type, $action_result));
+    $out = "";
+    while($query_handle->fetch()) {
+        $out .= "$id $timestamp error: ";
+        $out .= "pfn=$pfn, page_type=$page_type, action_result=$action_result\n";
+    }
+    if ($out ne "") {
+        print "Memory failure events:\n$out\n";
+    } else {
+        print "No Memory failure errors.\n\n";
+    }
+    $query_handle->finish;
+
     # MCE mce_record errors
     $query = "select id, timestamp, mcgcap, mcgstatus, status, addr, misc, ip, tsc, walltime, cpu, cpuid, apicid, socketid, cs, bank, cpuvendor, bank_name, error_msg, mcgstatus_msg, mcistatus_msg, user_action, mc_location from mce_record order by id";
     $query_handle = $dbh->prepare($query);
--
2.17.1
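To see how the summary query in this patch turns the five sample events into the two summary lines from the sample log, the following sketch replays it against an in-memory SQLite database using Python's standard `sqlite3` module. It is only an illustration under stated assumptions: `ROWS`, `build_db` and `memory_failure_summary` are hypothetical names, the column types are guessed (only the column names come from the patch), and the rows are copied from the sample log in the commit message.

```python
import sqlite3

# Rows copied from the sample log in the commit message:
# (id, timestamp, pfn, page_type, action_result)
ROWS = [
    (1, "2020-10-28 23:20:41 -0800", "0x204000000", "free buddy page", "Delayed"),
    (2, "2020-10-28 23:31:38 -0800", "0x204000000", "free buddy page", "Delayed"),
    (3, "2020-10-28 23:54:54 -0800", "0x205000000", "free buddy page", "Delayed"),
    (4, "2020-10-29 00:12:25 -0800", "0x204000000", "free buddy page", "Delayed"),
    (5, "2020-10-29 00:26:36 -0800", "0x204000000", "free buddy page", "Failed"),
]


def build_db():
    """Create an in-memory table shaped like memory_failure_event."""
    conn = sqlite3.connect(":memory:")
    # Column names follow the patch's queries; the types are an assumption.
    conn.execute(
        "create table memory_failure_event (id integer primary key, "
        "timestamp text, pfn text, page_type text, action_result text)"
    )
    conn.executemany(
        "insert into memory_failure_event values (?, ?, ?, ?, ?)", ROWS
    )
    return conn


def memory_failure_summary(conn):
    """Run the patch's summary query and return {action_result: count}."""
    query = ("select action_result, count(*) from memory_failure_event "
             "group by action_result")
    return {result: count for result, count in conn.execute(query)}


if __name__ == "__main__":
    counts = memory_failure_summary(build_db())
    # Sorted here only to make the printed order deterministic.
    for result, count in sorted(counts.items()):
        print(f"\t{result} errors: {count}")
```

The grouping yields a count of 4 for `Delayed` and 1 for `Failed`, matching the "Memory failure events summary" block in the sample log.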