From: Steven Rostedt <rostedt@goodmis.org>
mainline inclusion
from mainline-v5.14-rc1
commit 62de4f29e9174e67beb8d34ef5ced6730e087a31
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA
-------------------------------------------------
To display nanosecond values in a more human-readable form, it is nicer to convert them to a seconds format (XXX.YYYYYYYYY). To do so, the numbers must be divided by NSEC_PER_SEC, and the remainder taken as well. But as these numbers are 64-bit, this cannot be done simply with the '/' and '%' operators on every architecture (32-bit ones in particular); do_div() must be used instead.
Instead of performing the expensive do_div() in the hot path of the tracepoint, it is more efficient to perform it during the output phase. But passing do_div() directly into the print format can confuse the parser, and do_div() doesn't work exactly like a normal C function: it modifies the number in place, and we don't want to modify the actual values in the ring buffer.
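The in-place behaviour described above can be illustrated in user space. The sketch below is not the kernel's do_div(); sketch_do_div(), sketch_ns_to_secs() and SKETCH_NSEC_PER_SEC are hypothetical names used only for illustration. It assumes the semantics stated in this message (the dividend is replaced by the quotient and the remainder is returned) and takes a pointer, since a plain C function cannot modify its argument the way the macro does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; mirrors the value of the kernel's NSEC_PER_SEC. */
#define SKETCH_NSEC_PER_SEC 1000000000u

/*
 * Hypothetical user-space stand-in for do_div(): the dividend is
 * overwritten with the quotient and the remainder is returned.
 */
static inline uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	assert(base != 0);
	uint32_t rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/*
 * Divide a local copy so that the caller's value (think: the entry
 * stored in the ring buffer) is left untouched.
 */
static inline uint64_t sketch_ns_to_secs(uint64_t nsec)
{
	uint64_t val = nsec;

	(void)sketch_do_div(&val, SKETCH_NSEC_PER_SEC);
	return val;
}
```

Calling sketch_do_div() on a variable holding 1500000000 leaves 1 in that variable and returns 500000000, whereas sketch_ns_to_secs() leaves its argument unchanged because it only divides the copy.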
Two helper functions are now created:
__print_ns_to_secs() and __print_ns_without_secs()
Both take a value in nanoseconds. The former returns that number divided by NSEC_PER_SEC, and the latter returns it modulo NSEC_PER_SEC, giving a way to print a nice human-readable format:
__print_fmt("time=%llu.%09u", __print_ns_to_secs(REC->nsec_val), __print_ns_without_secs(REC->nsec_val))
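As a rough user-space illustration of what that format string produces, the sketch below splits a nanosecond value into whole seconds and the leftover nanoseconds and prints them with the same "%llu.%09u" conversion. sketch_format_ns() is a hypothetical helper, not part of the patch, and it uses plain '/' and '%' because the 64-bit division restriction applies to kernel code, not to this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a nanosecond count as
 * "time=<seconds>.<nanoseconds, zero-padded to 9 digits>".
 * Returns whatever snprintf() returns.
 */
static inline int sketch_format_ns(char *buf, size_t len, uint64_t nsec)
{
	assert(buf != NULL);

	unsigned long long secs = nsec / 1000000000ULL;
	unsigned int rem = (unsigned int)(nsec % 1000000000ULL);

	return snprintf(buf, len, "time=%llu.%09u", secs, rem);
}
```

With a 64-byte buffer, a value of 5000000042 ns is formatted as "time=5.000000042"; the %09u conversion keeps the leading zeros of the fractional part, which is why the remainder is printed separately instead of as a floating-point value.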
Link: https://lkml.kernel.org/r/e503b903045496c4ccde52843e1e318b422f7a56.162437231...
Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Willaims <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 include/trace/trace_events.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 7785961d82ba..ea34810846ea 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -347,6 +347,21 @@ TRACE_MAKE_SYSTEM_STR();
 	trace_print_hex_dump_seq(p, prefix_str, prefix_type,		\
 				 rowsize, groupsize, buf, len, ascii)
 
+#undef __print_ns_to_secs
+#define __print_ns_to_secs(value)			\
+	({						\
+		u64 ____val = (u64)(value);		\
+		do_div(____val, NSEC_PER_SEC);		\
+		____val;				\
+	})
+
+#undef __print_ns_without_secs
+#define __print_ns_without_secs(value)			\
+	({						\
+		u64 ____val = (u64)(value);		\
+		(u32) do_div(____val, NSEC_PER_SEC);	\
+	})
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 static notrace enum print_line_t					\
@@ -725,6 +740,16 @@ static inline void ftrace_test_probe_##call(void)			\
 #undef __print_array
 #undef __print_hex_dump
 
+/*
+ * The below is not executed in the kernel. It is only what is
+ * displayed in the print format for userspace to parse.
+ */
+#undef __print_ns_to_secs
+#define __print_ns_to_secs(val) (val) / 1000000000UL
+
+#undef __print_ns_without_secs
+#define __print_ns_without_secs(val) (val) % 1000000000UL
+
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)