mirror of
https://github.com/torvalds/linux.git
synced 2025-08-15 14:11:42 +02:00

Merge tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Have module addresses get updated in the persistent ring buffer

   The addresses of the modules from the previous boot are saved in the
   persistent ring buffer. If the same modules are loaded and an address
   in the old buffer points to an address that was both saved in the
   persistent ring buffer and is loaded in memory, shift the address in
   the trace event to point to the address that is loaded in memory.

 - Print function names for irqs off and preempt off callsites

   When ignoring the print fmt of a trace event and just printing the
   fields directly, have the fields for preempt off and irqs off events
   still show the function name (via kallsyms) instead of just showing
   the raw address.

 - Clean ups of the histogram code

   The histogram functions saved over 800 bytes on the stack to process
   events as they come in. Instead, create per-cpu buffers that can hold
   this information and have a separate location for each context level
   (thread, softirq, IRQ and NMI).

   Also add some more comments to the code.

 - Add "common_comm" field for histograms

   Add "common_comm" that uses the current->comm as a field in an event
   histogram and acts like any of the other fields of the event.

 - Show "subops" in the enabled_functions file

   When the function graph infrastructure is used, a subsystem has a
   "subops" that it attaches its callback function to. Instead of the
   enabled_functions just showing a function calling the function that
   calls the subops functions, also show the subops functions that will
   get called for that function too.

 - Add "copy_trace_marker" option to instances

   There are cases where an instance is created for tooling to write
   into, but the old tooling has the top level instance hardcoded into
   the application. New tools want to consume the data from an instance
   and not the top level buffer. By adding a copy_trace_marker option,
   whenever the top instance trace_marker is written into, a copy of it
   is also written into the instance with this option set. This allows
   new tools to read what old tools are writing into the top buffer.

   If this option is cleared by the top instance, then what is written
   into the trace_marker is not written into the top instance. This is a
   way to redirect the trace_marker writes into another instance.

 - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()

   If a tracepoint is created by DECLARE_TRACE() instead of
   TRACE_EVENT(), then it will not be exposed via tracefs. Currently
   there's no way in the kernel to differentiate the tracepoint
   functions that are exposed via tracefs from those that are not. A
   calling convention has been followed manually to append a "_tp"
   suffix for events created by DECLARE_TRACE(). Instead of doing this
   manually, force it so that all DECLARE_TRACE() events have this
   notation.

 - Use __string() for task->comm in some sched events

   Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
   scheduler events, use __string() which makes it dynamic. Note, if
   these events are parsed by user space, they may break, and the event
   may have to be converted back to the hardcoded size.

 - Have function graph "depth" be unsigned to the user

   Internally to the kernel, the "depth" field of the function graph
   event is signed due to -1 being used for end of boundary. What
   actually gets recorded in the event itself is zero or positive.
   Reflect this to user space by showing "depth" as unsigned int and be
   consistent across all events.

 - Allow an arbitrarily long CPU string to osnoise_cpus_write()

   The filtering of which CPUs to write to can exceed 256 bytes. If a
   machine has 256 CPUs, and the filter is to filter every other CPU,
   the write would take a string larger than 256 bytes. Instead of using
   a fixed size buffer on the stack that is 256 bytes, allocate it to
   handle what is passed in.

 - Stop having ftrace check the per-cpu data "disabled" flag

   The "disabled" flag in the data structure passed to most ftrace
   functions is checked to know if tracing has been disabled or not.
   This flag was added back in 2008 before the ring buffer had its own
   way to disable tracing. The "disabled" flag is now not always set
   when needed, and the ring buffer flag should be used in all locations
   where the disabled state is needed. Since the "disabled" flag is
   redundant and incorrect, stop using it. Fix up some locations that
   use the "disabled" flag to use the ring buffer info.

 - Use a new tracer_tracing_disable/enable() instead of data->disabled
   flag

   There are a few cases that set the data->disabled flag to stop
   tracing, but this flag is not consistently used. It is also an on/off
   switch where if a function sets it and calls another function that
   sets it, the called function may incorrectly enable it.

   Use a new tracer_tracing_disable() and tracer_tracing_enable() that
   use a counter and can be nested. These use the ring buffer flags
   which are always checked, making the disabling more consistent.

 - Save the trace clock in the persistent ring buffer

   Save what clock was used for tracing in the persistent ring buffer
   and set it back to that clock after a reboot.

 - Remove unused reference to a per CPU data pointer in mmiotrace
   functions

 - Remove unused buffer_page field from trace_array_cpu structure

 - Remove more strncpy() instances

 - Other minor clean ups and fixes

* tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits)
  tracing: Fix compilation warning on arm32
  tracing: Record trace_clock and recover when reboot
  tracing/sched: Use __string() instead of fixed lengths for task->comm
  tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix
  tracing: Cleanup upper_empty() in pid_list
  tracing: Allow the top level trace_marker to write into another instances
  tracing: Add a helper function to handle the dereference arg in verifier
  tracing: Remove unnecessary "goto out" that simply returns ret is trigger code
  tracing: Fix error handling in event_trigger_parse()
  tracing: Rename event_trigger_alloc() to trigger_data_alloc()
  tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf
  tracing: Remove unused buffer_page field from trace_array_cpu structure
  tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer
  tracing: Convert the per CPU "disabled" counter to local from atomic
  tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field
  ring-buffer: Add ring_buffer_record_is_on_cpu()
  tracing: Do not use per CPU array_buffer.data->disabled for cpumask
  ftrace: Do not disabled function graph based on "disabled" field
  tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled
  tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one()
  ...
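The note above about the on/off switch versus a nestable counter can be illustrated with a small user-space sketch. Everything in it is hypothetical and for illustration only: `struct demo_buffer` and the `flag_*`, `counter_*`, `callee_*` and `caller_*` helpers are not kernel APIs, and the real tracer_tracing_disable()/tracer_tracing_enable() operate on the ring buffer rather than on a plain struct. The sketch only shows why a callee that does its own disable/enable pair re-enables tracing too early with a flag, but not with a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a trace buffer; not a kernel structure. */
struct demo_buffer {
	bool off_flag;		/* plain on/off switch */
	int disable_depth;	/* nesting counter: disabled while > 0 */
};

/* On/off switch: the last writer wins, so nesting is lost. */
static void flag_disable(struct demo_buffer *b) { b->off_flag = true; }
static void flag_enable(struct demo_buffer *b) { b->off_flag = false; }

/* Counter: every disable has to be matched by an enable. */
static void counter_disable(struct demo_buffer *b) { b->disable_depth++; }
static void counter_enable(struct demo_buffer *b) { b->disable_depth--; }

/* A callee that guards its own work with a disable/enable pair. */
static void callee_with_flag(struct demo_buffer *b)
{
	flag_disable(b);
	/* ... work that must not be traced ... */
	flag_enable(b);		/* also undoes the caller's disable */
}

static void callee_with_counter(struct demo_buffer *b)
{
	counter_disable(b);
	/* ... work that must not be traced ... */
	counter_enable(b);	/* only undoes this function's disable */
}

/* Is tracing still disabled in the caller once the callee has returned? */
bool caller_still_disabled_with_flag(void)
{
	struct demo_buffer b = { 0 };
	bool still_disabled;

	flag_disable(&b);
	callee_with_flag(&b);
	still_disabled = b.off_flag;
	flag_enable(&b);

	return still_disabled;
}

bool caller_still_disabled_with_counter(void)
{
	struct demo_buffer b = { 0 };
	bool still_disabled;

	counter_disable(&b);
	callee_with_counter(&b);
	still_disabled = b.disable_depth > 0;
	counter_enable(&b);

	return still_disabled;
}
```

With the flag, the caller's critical section is left unprotected as soon as the callee returns. With the counter, tracing stays off until the outermost enable. The sketch is single-threaded and ignores the per-CPU and ordering concerns the kernel has to handle.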
1030 lines
24 KiB
C
// SPDX-License-Identifier: GPL-2.0
/*
 * ring buffer based function tracer
 *
 * Copyright (C) 2007-2008 Steven Rostedt <srostedt@redhat.com>
 * Copyright (C) 2008 Ingo Molnar <mingo@redhat.com>
 *
 * Based on code from the latency_tracer, that is:
 *
 *  Copyright (C) 2004-2006 Ingo Molnar
 *  Copyright (C) 2004 Nadia Yvette Chambers
 */
#include <linux/ring_buffer.h>
#include <linux/debugfs.h>
#include <linux/uaccess.h>
#include <linux/ftrace.h>
#include <linux/slab.h>
#include <linux/fs.h>

#include "trace.h"

static void tracing_start_function_trace(struct trace_array *tr);
static void tracing_stop_function_trace(struct trace_array *tr);
static void
function_trace_call(unsigned long ip, unsigned long parent_ip,
		    struct ftrace_ops *op, struct ftrace_regs *fregs);
static void
function_args_trace_call(unsigned long ip, unsigned long parent_ip,
			 struct ftrace_ops *op, struct ftrace_regs *fregs);
static void
function_stack_trace_call(unsigned long ip, unsigned long parent_ip,
			  struct ftrace_ops *op, struct ftrace_regs *fregs);
static void
function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
			       struct ftrace_ops *op, struct ftrace_regs *fregs);
static void
function_stack_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
				     struct ftrace_ops *op,
				     struct ftrace_regs *fregs);
static struct tracer_flags func_flags;

/* Our option */
enum {

	TRACE_FUNC_NO_OPTS		= 0x0, /* No flags set. */
	TRACE_FUNC_OPT_STACK		= 0x1,
	TRACE_FUNC_OPT_NO_REPEATS	= 0x2,
	TRACE_FUNC_OPT_ARGS		= 0x4,

	/* Update this to next highest bit. */
	TRACE_FUNC_OPT_HIGHEST_BIT	= 0x8
};

#define TRACE_FUNC_OPT_MASK	(TRACE_FUNC_OPT_HIGHEST_BIT - 1)

int ftrace_allocate_ftrace_ops(struct trace_array *tr)
{
	struct ftrace_ops *ops;

	/* The top level array uses the "global_ops" */
	if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
		return 0;

	ops = kzalloc(sizeof(*ops), GFP_KERNEL);
	if (!ops)
		return -ENOMEM;

	/* Currently only the non stack version is supported */
	ops->func = function_trace_call;
	ops->flags = FTRACE_OPS_FL_PID;

	tr->ops = ops;
	ops->private = tr;

	return 0;
}

void ftrace_free_ftrace_ops(struct trace_array *tr)
{
	kfree(tr->ops);
	tr->ops = NULL;
}

int ftrace_create_function_files(struct trace_array *tr,
				 struct dentry *parent)
{
	int ret;
	/*
	 * The top level array uses the "global_ops", and the files are
	 * created on boot up.
	 */
	if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
		return 0;

	if (!tr->ops)
		return -EINVAL;

	ret = allocate_fgraph_ops(tr, tr->ops);
	if (ret) {
		kfree(tr->ops);
		return ret;
	}

	ftrace_create_filter_files(tr->ops, parent);

	return 0;
}

void ftrace_destroy_function_files(struct trace_array *tr)
{
	ftrace_destroy_filter_files(tr->ops);
	ftrace_free_ftrace_ops(tr);
	free_fgraph_ops(tr);
}

static ftrace_func_t select_trace_function(u32 flags_val)
{
	switch (flags_val & TRACE_FUNC_OPT_MASK) {
	case TRACE_FUNC_NO_OPTS:
		return function_trace_call;
	case TRACE_FUNC_OPT_ARGS:
		return function_args_trace_call;
	case TRACE_FUNC_OPT_STACK:
		return function_stack_trace_call;
	case TRACE_FUNC_OPT_NO_REPEATS:
		return function_no_repeats_trace_call;
	case TRACE_FUNC_OPT_STACK | TRACE_FUNC_OPT_NO_REPEATS:
		return function_stack_no_repeats_trace_call;
	default:
		return NULL;
	}
}

static bool handle_func_repeats(struct trace_array *tr, u32 flags_val)
{
	if (!tr->last_func_repeats &&
	    (flags_val & TRACE_FUNC_OPT_NO_REPEATS)) {
		tr->last_func_repeats = alloc_percpu(struct trace_func_repeats);
		if (!tr->last_func_repeats)
			return false;
	}

	return true;
}

static int function_trace_init(struct trace_array *tr)
{
	ftrace_func_t func;
	/*
	 * Instance trace_arrays get their ops allocated
	 * at instance creation. Unless it failed
	 * the allocation.
	 */
	if (!tr->ops)
		return -ENOMEM;

	func = select_trace_function(func_flags.val);
	if (!func)
		return -EINVAL;

	if (!handle_func_repeats(tr, func_flags.val))
		return -ENOMEM;

	ftrace_init_array_ops(tr, func);

	tr->array_buffer.cpu = raw_smp_processor_id();

	tracing_start_cmdline_record();
	tracing_start_function_trace(tr);
	return 0;
}

static void function_trace_reset(struct trace_array *tr)
{
	tracing_stop_function_trace(tr);
	tracing_stop_cmdline_record();
	ftrace_reset_array_ops(tr);
}

static void function_trace_start(struct trace_array *tr)
{
	tracing_reset_online_cpus(&tr->array_buffer);
}

/* fregs are guaranteed not to be NULL if HAVE_DYNAMIC_FTRACE_WITH_ARGS is set */
#if defined(CONFIG_FUNCTION_GRAPH_TRACER) && defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS)
static __always_inline unsigned long
function_get_true_parent_ip(unsigned long parent_ip, struct ftrace_regs *fregs)
{
	unsigned long true_parent_ip;
	int idx = 0;

	true_parent_ip = parent_ip;
	if (unlikely(parent_ip == (unsigned long)&return_to_handler) && fregs)
		true_parent_ip = ftrace_graph_ret_addr(current, &idx, parent_ip,
				(unsigned long *)ftrace_regs_get_stack_pointer(fregs));
	return true_parent_ip;
}
#else
static __always_inline unsigned long
function_get_true_parent_ip(unsigned long parent_ip, struct ftrace_regs *fregs)
{
	return parent_ip;
}
#endif

static void
function_trace_call(unsigned long ip, unsigned long parent_ip,
		    struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	struct trace_array *tr = op->private;
	unsigned int trace_ctx;
	int bit;

	if (unlikely(!tr->function_enabled))
		return;

	bit = ftrace_test_recursion_trylock(ip, parent_ip);
	if (bit < 0)
		return;

	parent_ip = function_get_true_parent_ip(parent_ip, fregs);

	trace_ctx = tracing_gen_ctx_dec();

	trace_function(tr, ip, parent_ip, trace_ctx, NULL);

	ftrace_test_recursion_unlock(bit);
}

static void
function_args_trace_call(unsigned long ip, unsigned long parent_ip,
			 struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	struct trace_array *tr = op->private;
	unsigned int trace_ctx;
	int bit;

	if (unlikely(!tr->function_enabled))
		return;

	bit = ftrace_test_recursion_trylock(ip, parent_ip);
	if (bit < 0)
		return;

	trace_ctx = tracing_gen_ctx();

	trace_function(tr, ip, parent_ip, trace_ctx, fregs);

	ftrace_test_recursion_unlock(bit);
}

#ifdef CONFIG_UNWINDER_ORC
/*
 * Skip 2:
 *
 *   function_stack_trace_call()
 *   ftrace_call()
 */
#define STACK_SKIP 2
#else
/*
 * Skip 3:
 *   __trace_stack()
 *   function_stack_trace_call()
 *   ftrace_call()
 */
#define STACK_SKIP 3
#endif

static void
function_stack_trace_call(unsigned long ip, unsigned long parent_ip,
			  struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	struct trace_array *tr = op->private;
	struct trace_array_cpu *data;
	unsigned long flags;
	long disabled;
	int cpu;
	unsigned int trace_ctx;
	int skip = STACK_SKIP;

	if (unlikely(!tr->function_enabled))
		return;

	/*
	 * Need to use raw, since this must be called before the
	 * recursive protection is performed.
	 */
	local_irq_save(flags);
	parent_ip = function_get_true_parent_ip(parent_ip, fregs);
	cpu = raw_smp_processor_id();
	data = per_cpu_ptr(tr->array_buffer.data, cpu);
	disabled = local_inc_return(&data->disabled);

	if (likely(disabled == 1)) {
		trace_ctx = tracing_gen_ctx_flags(flags);
		trace_function(tr, ip, parent_ip, trace_ctx, NULL);
#ifdef CONFIG_UNWINDER_FRAME_POINTER
		if (ftrace_pids_enabled(op))
			skip++;
#endif
		__trace_stack(tr, trace_ctx, skip);
	}

	local_dec(&data->disabled);
	local_irq_restore(flags);
}

static inline bool is_repeat_check(struct trace_array *tr,
				   struct trace_func_repeats *last_info,
				   unsigned long ip, unsigned long parent_ip)
{
	if (last_info->ip == ip &&
	    last_info->parent_ip == parent_ip &&
	    last_info->count < U16_MAX) {
		last_info->ts_last_call =
			ring_buffer_time_stamp(tr->array_buffer.buffer);
		last_info->count++;
		return true;
	}

	return false;
}

static inline void process_repeats(struct trace_array *tr,
				   unsigned long ip, unsigned long parent_ip,
				   struct trace_func_repeats *last_info,
				   unsigned int trace_ctx)
{
	if (last_info->count) {
		trace_last_func_repeats(tr, last_info, trace_ctx);
		last_info->count = 0;
	}

	last_info->ip = ip;
	last_info->parent_ip = parent_ip;
}

static void
function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
			       struct ftrace_ops *op,
			       struct ftrace_regs *fregs)
{
	struct trace_func_repeats *last_info;
	struct trace_array *tr = op->private;
	unsigned int trace_ctx;
	int bit;

	if (unlikely(!tr->function_enabled))
		return;

	bit = ftrace_test_recursion_trylock(ip, parent_ip);
	if (bit < 0)
		return;

	parent_ip = function_get_true_parent_ip(parent_ip, fregs);
	if (!tracer_tracing_is_on(tr))
		goto out;

	/*
	 * An interrupt may happen at any place here. But as far as I can see,
	 * the only damage that this can cause is to mess up the repetition
	 * counter without valuable data being lost.
	 * TODO: think about a solution that is better than just hoping to be
	 * lucky.
	 */
	last_info = this_cpu_ptr(tr->last_func_repeats);
	if (is_repeat_check(tr, last_info, ip, parent_ip))
		goto out;

	trace_ctx = tracing_gen_ctx_dec();
	process_repeats(tr, ip, parent_ip, last_info, trace_ctx);

	trace_function(tr, ip, parent_ip, trace_ctx, NULL);

out:
	ftrace_test_recursion_unlock(bit);
}

static void
function_stack_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
				     struct ftrace_ops *op,
				     struct ftrace_regs *fregs)
{
	struct trace_func_repeats *last_info;
	struct trace_array *tr = op->private;
	struct trace_array_cpu *data;
	unsigned long flags;
	long disabled;
	int cpu;
	unsigned int trace_ctx;

	if (unlikely(!tr->function_enabled))
		return;

	/*
	 * Need to use raw, since this must be called before the
	 * recursive protection is performed.
	 */
	local_irq_save(flags);
	parent_ip = function_get_true_parent_ip(parent_ip, fregs);
	cpu = raw_smp_processor_id();
	data = per_cpu_ptr(tr->array_buffer.data, cpu);
	disabled = local_inc_return(&data->disabled);

	if (likely(disabled == 1)) {
		last_info = per_cpu_ptr(tr->last_func_repeats, cpu);
		if (is_repeat_check(tr, last_info, ip, parent_ip))
			goto out;

		trace_ctx = tracing_gen_ctx_flags(flags);
		process_repeats(tr, ip, parent_ip, last_info, trace_ctx);

		trace_function(tr, ip, parent_ip, trace_ctx, NULL);
		__trace_stack(tr, trace_ctx, STACK_SKIP);
	}

out:
	local_dec(&data->disabled);
	local_irq_restore(flags);
}

static struct tracer_opt func_opts[] = {
#ifdef CONFIG_STACKTRACE
	{ TRACER_OPT(func_stack_trace, TRACE_FUNC_OPT_STACK) },
#endif
	{ TRACER_OPT(func-no-repeats, TRACE_FUNC_OPT_NO_REPEATS) },
#ifdef CONFIG_FUNCTION_TRACE_ARGS
	{ TRACER_OPT(func-args, TRACE_FUNC_OPT_ARGS) },
#endif
	{ } /* Always set a last empty entry */
};

static struct tracer_flags func_flags = {
	.val = TRACE_FUNC_NO_OPTS, /* By default: all flags disabled */
	.opts = func_opts
};

static void tracing_start_function_trace(struct trace_array *tr)
{
	tr->function_enabled = 0;
	register_ftrace_function(tr->ops);
	tr->function_enabled = 1;
}

static void tracing_stop_function_trace(struct trace_array *tr)
{
	tr->function_enabled = 0;
	unregister_ftrace_function(tr->ops);
}

static struct tracer function_trace;

static int
func_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
{
	ftrace_func_t func;
	u32 new_flags;

	/* Do nothing if already set. */
	if (!!set == !!(func_flags.val & bit))
		return 0;

	/* We can change this flag only when not running. */
	if (tr->current_trace != &function_trace)
		return 0;

	new_flags = (func_flags.val & ~bit) | (set ? bit : 0);
	func = select_trace_function(new_flags);
	if (!func)
		return -EINVAL;

	/* Check if there's anything to change. */
	if (tr->ops->func == func)
		return 0;

	if (!handle_func_repeats(tr, new_flags))
		return -ENOMEM;

	unregister_ftrace_function(tr->ops);
	tr->ops->func = func;
	register_ftrace_function(tr->ops);

	return 0;
}

static struct tracer function_trace __tracer_data =
{
	.name		= "function",
	.init		= function_trace_init,
	.reset		= function_trace_reset,
	.start		= function_trace_start,
	.flags		= &func_flags,
	.set_flag	= func_set_flag,
	.allow_instances = true,
#ifdef CONFIG_FTRACE_SELFTEST
	.selftest	= trace_selftest_startup_function,
#endif
};

#ifdef CONFIG_DYNAMIC_FTRACE
static void update_traceon_count(struct ftrace_probe_ops *ops,
				 unsigned long ip,
				 struct trace_array *tr, bool on,
				 void *data)
{
	struct ftrace_func_mapper *mapper = data;
	long *count;
	long old_count;

	/*
	 * Tracing gets disabled (or enabled) once per count.
	 * This function can be called at the same time on multiple CPUs.
	 * It is fine if both disable (or enable) tracing, as disabling
	 * (or enabling) the second time doesn't do anything as the
	 * state of the tracer is already disabled (or enabled).
	 * What needs to be synchronized in this case is that the count
	 * only gets decremented once, even if the tracer is disabled
	 * (or enabled) twice, as the second one is really a nop.
	 *
	 * The memory barriers guarantee that we only decrement the
	 * counter once. First the count is read to a local variable
	 * and a read barrier is used to make sure that it is loaded
	 * before checking if the tracer is in the state we want.
	 * If the tracer is not in the state we want, then the count
	 * is guaranteed to be the old count.
	 *
	 * Next the tracer is set to the state we want (disabled or enabled)
	 * then a write memory barrier is used to make sure that
	 * the new state is visible before changing the counter by
	 * one minus the old counter. This guarantees that another CPU
	 * executing this code will see the new state before seeing
	 * the new counter value, and would not do anything if the new
	 * counter is seen.
	 *
	 * Note, there is no synchronization between this and a user
	 * setting the tracing_on file. But we currently don't care
	 * about that.
	 */
	count = (long *)ftrace_func_mapper_find_ip(mapper, ip);
	old_count = *count;

	if (old_count <= 0)
		return;

	/* Make sure we see count before checking tracing state */
	smp_rmb();

	if (on == !!tracer_tracing_is_on(tr))
		return;

	if (on)
		tracer_tracing_on(tr);
	else
		tracer_tracing_off(tr);

	/* Make sure tracing state is visible before updating count */
	smp_wmb();

	*count = old_count - 1;
}

static void
ftrace_traceon_count(unsigned long ip, unsigned long parent_ip,
		     struct trace_array *tr, struct ftrace_probe_ops *ops,
		     void *data)
{
	update_traceon_count(ops, ip, tr, 1, data);
}

static void
ftrace_traceoff_count(unsigned long ip, unsigned long parent_ip,
		      struct trace_array *tr, struct ftrace_probe_ops *ops,
		      void *data)
{
	update_traceon_count(ops, ip, tr, 0, data);
}

static void
ftrace_traceon(unsigned long ip, unsigned long parent_ip,
	       struct trace_array *tr, struct ftrace_probe_ops *ops,
	       void *data)
{
	if (tracer_tracing_is_on(tr))
		return;

	tracer_tracing_on(tr);
}

static void
ftrace_traceoff(unsigned long ip, unsigned long parent_ip,
		struct trace_array *tr, struct ftrace_probe_ops *ops,
		void *data)
{
	if (!tracer_tracing_is_on(tr))
		return;

	tracer_tracing_off(tr);
}

#ifdef CONFIG_UNWINDER_ORC
/*
 * Skip 3:
 *
 *   function_trace_probe_call()
 *   ftrace_ops_assist_func()
 *   ftrace_call()
 */
#define FTRACE_STACK_SKIP 3
#else
/*
 * Skip 5:
 *
 *   __trace_stack()
 *   ftrace_stacktrace()
 *   function_trace_probe_call()
 *   ftrace_ops_assist_func()
 *   ftrace_call()
 */
#define FTRACE_STACK_SKIP 5
#endif

static __always_inline void trace_stack(struct trace_array *tr)
{
	__trace_stack(tr, tracing_gen_ctx_dec(), FTRACE_STACK_SKIP);
}

static void
ftrace_stacktrace(unsigned long ip, unsigned long parent_ip,
		  struct trace_array *tr, struct ftrace_probe_ops *ops,
		  void *data)
{
	trace_stack(tr);
}

static void
ftrace_stacktrace_count(unsigned long ip, unsigned long parent_ip,
			struct trace_array *tr, struct ftrace_probe_ops *ops,
			void *data)
{
	struct ftrace_func_mapper *mapper = data;
	long *count;
	long old_count;
	long new_count;

	if (!tracing_is_on())
		return;

	/* unlimited? */
	if (!mapper) {
		trace_stack(tr);
		return;
	}

	count = (long *)ftrace_func_mapper_find_ip(mapper, ip);

	/*
	 * Stack traces should only execute the number of times the
	 * user specified in the counter.
	 */
	do {
		old_count = *count;

		if (!old_count)
			return;

		new_count = old_count - 1;
		new_count = cmpxchg(count, old_count, new_count);
		if (new_count == old_count)
			trace_stack(tr);

		if (!tracing_is_on())
			return;

	} while (new_count != old_count);
}

static int update_count(struct ftrace_probe_ops *ops, unsigned long ip,
			void *data)
{
	struct ftrace_func_mapper *mapper = data;
	long *count = NULL;

	if (mapper)
		count = (long *)ftrace_func_mapper_find_ip(mapper, ip);

	if (count) {
		if (*count <= 0)
			return 0;
		(*count)--;
	}

	return 1;
}

static void
|
|
ftrace_dump_probe(unsigned long ip, unsigned long parent_ip,
|
|
struct trace_array *tr, struct ftrace_probe_ops *ops,
|
|
void *data)
|
|
{
|
|
if (update_count(ops, ip, data))
|
|
ftrace_dump(DUMP_ALL);
|
|
}
|
|
|
|
/* Only dump the current CPU buffer. */
|
|
static void
|
|
ftrace_cpudump_probe(unsigned long ip, unsigned long parent_ip,
|
|
struct trace_array *tr, struct ftrace_probe_ops *ops,
|
|
void *data)
|
|
{
|
|
if (update_count(ops, ip, data))
|
|
ftrace_dump(DUMP_ORIG);
|
|
}
|
|
|
|
static int
ftrace_probe_print(const char *name, struct seq_file *m,
		   unsigned long ip, struct ftrace_probe_ops *ops,
		   void *data)
{
	struct ftrace_func_mapper *mapper = data;
	long *count = NULL;

	seq_printf(m, "%ps:%s", (void *)ip, name);

	if (mapper)
		count = (long *)ftrace_func_mapper_find_ip(mapper, ip);

	if (count)
		seq_printf(m, ":count=%ld\n", *count);
	else
		seq_puts(m, ":unlimited\n");

	return 0;
}

static int
ftrace_traceon_print(struct seq_file *m, unsigned long ip,
		     struct ftrace_probe_ops *ops,
		     void *data)
{
	return ftrace_probe_print("traceon", m, ip, ops, data);
}

static int
ftrace_traceoff_print(struct seq_file *m, unsigned long ip,
		      struct ftrace_probe_ops *ops, void *data)
{
	return ftrace_probe_print("traceoff", m, ip, ops, data);
}

static int
ftrace_stacktrace_print(struct seq_file *m, unsigned long ip,
			struct ftrace_probe_ops *ops, void *data)
{
	return ftrace_probe_print("stacktrace", m, ip, ops, data);
}

static int
ftrace_dump_print(struct seq_file *m, unsigned long ip,
		  struct ftrace_probe_ops *ops, void *data)
{
	return ftrace_probe_print("dump", m, ip, ops, data);
}

static int
ftrace_cpudump_print(struct seq_file *m, unsigned long ip,
		     struct ftrace_probe_ops *ops, void *data)
{
	return ftrace_probe_print("cpudump", m, ip, ops, data);
}


static int
ftrace_count_init(struct ftrace_probe_ops *ops, struct trace_array *tr,
		  unsigned long ip, void *init_data, void **data)
{
	struct ftrace_func_mapper *mapper = *data;

	if (!mapper) {
		mapper = allocate_ftrace_func_mapper();
		if (!mapper)
			return -ENOMEM;
		*data = mapper;
	}

	return ftrace_func_mapper_add_ip(mapper, ip, init_data);
}

static void
ftrace_count_free(struct ftrace_probe_ops *ops, struct trace_array *tr,
		  unsigned long ip, void *data)
{
	struct ftrace_func_mapper *mapper = data;

	if (!ip) {
		free_ftrace_func_mapper(mapper, NULL);
		return;
	}

	ftrace_func_mapper_remove_ip(mapper, ip);
}

static struct ftrace_probe_ops traceon_count_probe_ops = {
	.func			= ftrace_traceon_count,
	.print			= ftrace_traceon_print,
	.init			= ftrace_count_init,
	.free			= ftrace_count_free,
};

static struct ftrace_probe_ops traceoff_count_probe_ops = {
	.func			= ftrace_traceoff_count,
	.print			= ftrace_traceoff_print,
	.init			= ftrace_count_init,
	.free			= ftrace_count_free,
};

static struct ftrace_probe_ops stacktrace_count_probe_ops = {
	.func			= ftrace_stacktrace_count,
	.print			= ftrace_stacktrace_print,
	.init			= ftrace_count_init,
	.free			= ftrace_count_free,
};

static struct ftrace_probe_ops dump_probe_ops = {
	.func			= ftrace_dump_probe,
	.print			= ftrace_dump_print,
	.init			= ftrace_count_init,
	.free			= ftrace_count_free,
};

static struct ftrace_probe_ops cpudump_probe_ops = {
	.func			= ftrace_cpudump_probe,
	.print			= ftrace_cpudump_print,
};

static struct ftrace_probe_ops traceon_probe_ops = {
	.func			= ftrace_traceon,
	.print			= ftrace_traceon_print,
};

static struct ftrace_probe_ops traceoff_probe_ops = {
	.func			= ftrace_traceoff,
	.print			= ftrace_traceoff_print,
};

static struct ftrace_probe_ops stacktrace_probe_ops = {
	.func			= ftrace_stacktrace,
	.print			= ftrace_stacktrace_print,
};

static int
ftrace_trace_probe_callback(struct trace_array *tr,
			    struct ftrace_probe_ops *ops,
			    struct ftrace_hash *hash, char *glob,
			    char *cmd, char *param, int enable)
{
	void *count = (void *)-1;
	char *number;
	int ret;

	/* hash funcs only work with set_ftrace_filter */
	if (!enable)
		return -EINVAL;

	/* A leading '!' unregisters the probe instead of registering it. */
	if (glob[0] == '!')
		return unregister_ftrace_function_probe_func(glob+1, tr, ops);

	if (!param)
		goto out_reg;

	number = strsep(&param, ":");

	if (!strlen(number))
		goto out_reg;

	/*
	 * We use the callback data field (which is a pointer)
	 * as our counter.
	 */
	ret = kstrtoul(number, 0, (unsigned long *)&count);
	if (ret)
		return ret;

 out_reg:
	ret = register_ftrace_function_probe(glob, tr, ops, count);

	return ret < 0 ? ret : 0;
}

static int
ftrace_trace_onoff_callback(struct trace_array *tr, struct ftrace_hash *hash,
			    char *glob, char *cmd, char *param, int enable)
{
	struct ftrace_probe_ops *ops;

	if (!tr)
		return -ENODEV;

	/* we register both traceon and traceoff to this callback */
	if (strcmp(cmd, "traceon") == 0)
		ops = param ? &traceon_count_probe_ops : &traceon_probe_ops;
	else
		ops = param ? &traceoff_count_probe_ops : &traceoff_probe_ops;

	return ftrace_trace_probe_callback(tr, ops, hash, glob, cmd,
					   param, enable);
}

static int
ftrace_stacktrace_callback(struct trace_array *tr, struct ftrace_hash *hash,
			   char *glob, char *cmd, char *param, int enable)
{
	struct ftrace_probe_ops *ops;

	if (!tr)
		return -ENODEV;

	ops = param ? &stacktrace_count_probe_ops : &stacktrace_probe_ops;

	return ftrace_trace_probe_callback(tr, ops, hash, glob, cmd,
					   param, enable);
}

static int
ftrace_dump_callback(struct trace_array *tr, struct ftrace_hash *hash,
		     char *glob, char *cmd, char *param, int enable)
{
	struct ftrace_probe_ops *ops;

	if (!tr)
		return -ENODEV;

	ops = &dump_probe_ops;

	/* Only dump once. */
	return ftrace_trace_probe_callback(tr, ops, hash, glob, cmd,
					   "1", enable);
}

static int
ftrace_cpudump_callback(struct trace_array *tr, struct ftrace_hash *hash,
			char *glob, char *cmd, char *param, int enable)
{
	struct ftrace_probe_ops *ops;

	if (!tr)
		return -ENODEV;

	ops = &cpudump_probe_ops;

	/* Only dump once. */
	return ftrace_trace_probe_callback(tr, ops, hash, glob, cmd,
					   "1", enable);
}

static struct ftrace_func_command ftrace_traceon_cmd = {
	.name			= "traceon",
	.func			= ftrace_trace_onoff_callback,
};

static struct ftrace_func_command ftrace_traceoff_cmd = {
	.name			= "traceoff",
	.func			= ftrace_trace_onoff_callback,
};

static struct ftrace_func_command ftrace_stacktrace_cmd = {
	.name			= "stacktrace",
	.func			= ftrace_stacktrace_callback,
};

static struct ftrace_func_command ftrace_dump_cmd = {
	.name			= "dump",
	.func			= ftrace_dump_callback,
};

static struct ftrace_func_command ftrace_cpudump_cmd = {
	.name			= "cpudump",
	.func			= ftrace_cpudump_callback,
};

static int __init init_func_cmd_traceon(void)
{
	int ret;

	ret = register_ftrace_command(&ftrace_traceoff_cmd);
	if (ret)
		return ret;

	ret = register_ftrace_command(&ftrace_traceon_cmd);
	if (ret)
		goto out_free_traceoff;

	ret = register_ftrace_command(&ftrace_stacktrace_cmd);
	if (ret)
		goto out_free_traceon;

	ret = register_ftrace_command(&ftrace_dump_cmd);
	if (ret)
		goto out_free_stacktrace;

	ret = register_ftrace_command(&ftrace_cpudump_cmd);
	if (ret)
		goto out_free_dump;

	return 0;

 out_free_dump:
	unregister_ftrace_command(&ftrace_dump_cmd);
 out_free_stacktrace:
	unregister_ftrace_command(&ftrace_stacktrace_cmd);
 out_free_traceon:
	unregister_ftrace_command(&ftrace_traceon_cmd);
 out_free_traceoff:
	unregister_ftrace_command(&ftrace_traceoff_cmd);

	return ret;
}
#else
static inline int init_func_cmd_traceon(void)
{
	return 0;
}
#endif /* CONFIG_DYNAMIC_FTRACE */

__init int init_function_trace(void)
{
	init_func_cmd_traceon();
	return register_tracer(&function_trace);
}