Skip to content

Commit cbad0fb

Browse files
mrutland-armctmarinas
authored andcommitted
ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS
Architectures without dynamic ftrace trampolines incur an overhead when multiple ftrace_ops are enabled with distinct filters. in these cases, each call site calls a common trampoline which uses ftrace_ops_list_func() to iterate over all enabled ftrace functions, and so incurs an overhead relative to the size of this list (including RCU protection overhead). Architectures with dynamic ftrace trampolines avoid this overhead for call sites which have a single associated ftrace_ops. In these cases, the dynamic trampoline is customized to branch directly to the relevant ftrace function, avoiding the list overhead. On some architectures it's impractical and/or undesirable to implement dynamic ftrace trampolines. For example, arm64 has limited branch ranges and cannot always directly branch from a call site to an arbitrary address (e.g. from a kernel text address to an arbitrary module address). Calls from modules to core kernel text can be indirected via PLTs (allocated at module load time) to address this, but the same is not possible from calls from core kernel text. Using an indirect branch from a call site to an arbitrary trampoline is possible, but requires several more instructions in the function prologue (or immediately before it), and/or comes with far more complex requirements for patching. Instead, this patch adds a new option, where an architecture can associate each call site with a pointer to an ftrace_ops, placed at a fixed offset from the call site. A shared trampoline can recover this pointer and call ftrace_ops::func() without needing to go via ftrace_ops_list_func(), avoiding the associated overhead. This avoids issues with branch range limitations, and avoids the need to allocate and manipulate dynamic trampolines, making it far simpler to implement and maintain, while having similar performance characteristics. Note that this allows for dynamic ftrace_ops to be invoked directly from an architecture's ftrace_caller trampoline, whereas existing code forces the use of ftrace_ops_get_list_func(), which is in part necessary to permit the ftrace_ops to be freed once unregistered *and* to avoid branch/address-generation range limitation on some architectures (e.g. where ops->func is a module address, and may be outside of the direct branch range for callsites within the main kernel image). The CALL_OPS approach avoids this problems and is safe as: * The existing synchronization in ftrace_shutdown() using ftrace_shutdown() using synchronize_rcu_tasks_rude() (and synchronize_rcu_tasks()) ensures that no tasks hold a stale reference to an ftrace_ops (e.g. in the middle of the ftrace_caller trampoline, or while invoking ftrace_ops::func), when that ftrace_ops is unregistered. Arguably this could also be relied upon for the existing scheme, permitting dynamic ftrace_ops to be invoked directly when ops->func is in range, but this will require additional logic to handle branch range limitations, and is not handled by this patch. * Each callsite's ftrace_ops pointer literal can hold any valid kernel address, and is updated atomically. As an architecture's ftrace_caller trampoline will atomically load the ops pointer then dereference ops->func, there is no risk of invoking ops->func with a mismatches ops pointer, and updates to the ops pointer do not require special care. A subsequent patch will implement architectures support for arm64. There should be no functional change as a result of this patch alone. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Steven Rostedt (Google) <[email protected]> Cc: Florent Revest <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
1 parent b7bfaa7 commit cbad0fb

File tree

3 files changed

+125
-9
lines changed

3 files changed

+125
-9
lines changed

include/linux/ftrace.h

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,7 @@ static inline void ftrace_boot_snapshot(void) { }
3939

4040
struct ftrace_ops;
4141
struct ftrace_regs;
42+
struct dyn_ftrace;
4243

4344
#ifdef CONFIG_FUNCTION_TRACER
4445
/*
@@ -57,6 +58,9 @@ void arch_ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip);
5758
void arch_ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
5859
struct ftrace_ops *op, struct ftrace_regs *fregs);
5960
#endif
61+
extern const struct ftrace_ops ftrace_nop_ops;
62+
extern const struct ftrace_ops ftrace_list_ops;
63+
struct ftrace_ops *ftrace_find_unique_ops(struct dyn_ftrace *rec);
6064
#endif /* CONFIG_FUNCTION_TRACER */
6165

6266
/* Main tracing buffer and events set up */
@@ -391,8 +395,6 @@ struct ftrace_func_entry {
391395
unsigned long direct; /* for direct lookup only */
392396
};
393397

394-
struct dyn_ftrace;
395-
396398
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
397399
extern int ftrace_direct_func_count;
398400
int register_ftrace_direct(unsigned long ip, unsigned long addr);
@@ -563,6 +565,8 @@ bool is_ftrace_trampoline(unsigned long addr);
563565
* IPMODIFY - the record allows for the IP address to be changed.
564566
* DISABLED - the record is not ready to be touched yet
565567
* DIRECT - there is a direct function to call
568+
* CALL_OPS - the record can use callsite-specific ops
569+
* CALL_OPS_EN - the function is set up to use callsite-specific ops
566570
*
567571
* When a new ftrace_ops is registered and wants a function to save
568572
* pt_regs, the rec->flags REGS is set. When the function has been
@@ -580,9 +584,11 @@ enum {
580584
FTRACE_FL_DISABLED = (1UL << 25),
581585
FTRACE_FL_DIRECT = (1UL << 24),
582586
FTRACE_FL_DIRECT_EN = (1UL << 23),
587+
FTRACE_FL_CALL_OPS = (1UL << 22),
588+
FTRACE_FL_CALL_OPS_EN = (1UL << 21),
583589
};
584590

585-
#define FTRACE_REF_MAX_SHIFT 23
591+
#define FTRACE_REF_MAX_SHIFT 21
586592
#define FTRACE_REF_MAX ((1UL << FTRACE_REF_MAX_SHIFT) - 1)
587593

588594
#define ftrace_rec_count(rec) ((rec)->flags & FTRACE_REF_MAX)
@@ -820,7 +826,8 @@ static inline int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
820826
*/
821827
extern int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr);
822828

823-
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
829+
#if defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS) || \
830+
defined(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)
824831
/**
825832
* ftrace_modify_call - convert from one addr to another (no nop)
826833
* @rec: the call site record (e.g. mcount/fentry)
@@ -833,6 +840,9 @@ extern int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr);
833840
* what we expect it to be, and then on success of the compare,
834841
* it should write to the location.
835842
*
843+
* When using call ops, this is called when the associated ops change, even
844+
* when (addr == old_addr).
845+
*
836846
* The code segment at @rec->ip should be a caller to @old_addr
837847
*
838848
* Return must be:

kernel/trace/Kconfig

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,9 @@ config HAVE_DYNAMIC_FTRACE_WITH_REGS
4242
config HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
4343
bool
4444

45+
config HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
46+
bool
47+
4548
config HAVE_DYNAMIC_FTRACE_WITH_ARGS
4649
bool
4750
help
@@ -257,6 +260,10 @@ config DYNAMIC_FTRACE_WITH_DIRECT_CALLS
257260
depends on DYNAMIC_FTRACE_WITH_REGS
258261
depends on HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
259262

263+
config DYNAMIC_FTRACE_WITH_CALL_OPS
264+
def_bool y
265+
depends on HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
266+
260267
config DYNAMIC_FTRACE_WITH_ARGS
261268
def_bool y
262269
depends on DYNAMIC_FTRACE

kernel/trace/ftrace.c

Lines changed: 104 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -125,6 +125,33 @@ struct ftrace_ops global_ops;
125125
void ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
126126
struct ftrace_ops *op, struct ftrace_regs *fregs);
127127

128+
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
129+
/*
130+
* Stub used to invoke the list ops without requiring a separate trampoline.
131+
*/
132+
const struct ftrace_ops ftrace_list_ops = {
133+
.func = ftrace_ops_list_func,
134+
.flags = FTRACE_OPS_FL_STUB,
135+
};
136+
137+
static void ftrace_ops_nop_func(unsigned long ip, unsigned long parent_ip,
138+
struct ftrace_ops *op,
139+
struct ftrace_regs *fregs)
140+
{
141+
/* do nothing */
142+
}
143+
144+
/*
145+
* Stub used when a call site is disabled. May be called transiently by threads
146+
* which have made it into ftrace_caller but haven't yet recovered the ops at
147+
* the point the call site is disabled.
148+
*/
149+
const struct ftrace_ops ftrace_nop_ops = {
150+
.func = ftrace_ops_nop_func,
151+
.flags = FTRACE_OPS_FL_STUB,
152+
};
153+
#endif
154+
128155
static inline void ftrace_ops_init(struct ftrace_ops *ops)
129156
{
130157
#ifdef CONFIG_DYNAMIC_FTRACE
@@ -1814,6 +1841,18 @@ static bool __ftrace_hash_rec_update(struct ftrace_ops *ops,
18141841
* if rec count is zero.
18151842
*/
18161843
}
1844+
1845+
/*
1846+
* If the rec has a single associated ops, and ops->func can be
1847+
* called directly, allow the call site to call via the ops.
1848+
*/
1849+
if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS) &&
1850+
ftrace_rec_count(rec) == 1 &&
1851+
ftrace_ops_get_func(ops) == ops->func)
1852+
rec->flags |= FTRACE_FL_CALL_OPS;
1853+
else
1854+
rec->flags &= ~FTRACE_FL_CALL_OPS;
1855+
18171856
count++;
18181857

18191858
/* Must match FTRACE_UPDATE_CALLS in ftrace_modify_all_code() */
@@ -2108,8 +2147,9 @@ void ftrace_bug(int failed, struct dyn_ftrace *rec)
21082147
struct ftrace_ops *ops = NULL;
21092148

21102149
pr_info("ftrace record flags: %lx\n", rec->flags);
2111-
pr_cont(" (%ld)%s", ftrace_rec_count(rec),
2112-
rec->flags & FTRACE_FL_REGS ? " R" : " ");
2150+
pr_cont(" (%ld)%s%s", ftrace_rec_count(rec),
2151+
rec->flags & FTRACE_FL_REGS ? " R" : " ",
2152+
rec->flags & FTRACE_FL_CALL_OPS ? " O" : " ");
21132153
if (rec->flags & FTRACE_FL_TRAMP_EN) {
21142154
ops = ftrace_find_tramp_ops_any(rec);
21152155
if (ops) {
@@ -2177,6 +2217,7 @@ static int ftrace_check_record(struct dyn_ftrace *rec, bool enable, bool update)
21772217
* want the direct enabled (it will be done via the
21782218
* direct helper). But if DIRECT_EN is set, and
21792219
* the count is not one, we need to clear it.
2220+
*
21802221
*/
21812222
if (ftrace_rec_count(rec) == 1) {
21822223
if (!(rec->flags & FTRACE_FL_DIRECT) !=
@@ -2185,6 +2226,19 @@ static int ftrace_check_record(struct dyn_ftrace *rec, bool enable, bool update)
21852226
} else if (rec->flags & FTRACE_FL_DIRECT_EN) {
21862227
flag |= FTRACE_FL_DIRECT;
21872228
}
2229+
2230+
/*
2231+
* Ops calls are special, as count matters.
2232+
* As with direct calls, they must only be enabled when count
2233+
* is one, otherwise they'll be handled via the list ops.
2234+
*/
2235+
if (ftrace_rec_count(rec) == 1) {
2236+
if (!(rec->flags & FTRACE_FL_CALL_OPS) !=
2237+
!(rec->flags & FTRACE_FL_CALL_OPS_EN))
2238+
flag |= FTRACE_FL_CALL_OPS;
2239+
} else if (rec->flags & FTRACE_FL_CALL_OPS_EN) {
2240+
flag |= FTRACE_FL_CALL_OPS;
2241+
}
21882242
}
21892243

21902244
/* If the state of this record hasn't changed, then do nothing */
@@ -2229,6 +2283,21 @@ static int ftrace_check_record(struct dyn_ftrace *rec, bool enable, bool update)
22292283
rec->flags &= ~FTRACE_FL_DIRECT_EN;
22302284
}
22312285
}
2286+
2287+
if (flag & FTRACE_FL_CALL_OPS) {
2288+
if (ftrace_rec_count(rec) == 1) {
2289+
if (rec->flags & FTRACE_FL_CALL_OPS)
2290+
rec->flags |= FTRACE_FL_CALL_OPS_EN;
2291+
else
2292+
rec->flags &= ~FTRACE_FL_CALL_OPS_EN;
2293+
} else {
2294+
/*
2295+
* Can only call directly if there's
2296+
* only one set of associated ops.
2297+
*/
2298+
rec->flags &= ~FTRACE_FL_CALL_OPS_EN;
2299+
}
2300+
}
22322301
}
22332302

22342303
/*
@@ -2258,7 +2327,8 @@ static int ftrace_check_record(struct dyn_ftrace *rec, bool enable, bool update)
22582327
* and REGS states. The _EN flags must be disabled though.
22592328
*/
22602329
rec->flags &= ~(FTRACE_FL_ENABLED | FTRACE_FL_TRAMP_EN |
2261-
FTRACE_FL_REGS_EN | FTRACE_FL_DIRECT_EN);
2330+
FTRACE_FL_REGS_EN | FTRACE_FL_DIRECT_EN |
2331+
FTRACE_FL_CALL_OPS_EN);
22622332
}
22632333

22642334
ftrace_bug_type = FTRACE_BUG_NOP;
@@ -2431,6 +2501,25 @@ ftrace_find_tramp_ops_new(struct dyn_ftrace *rec)
24312501
return NULL;
24322502
}
24332503

2504+
struct ftrace_ops *
2505+
ftrace_find_unique_ops(struct dyn_ftrace *rec)
2506+
{
2507+
struct ftrace_ops *op, *found = NULL;
2508+
unsigned long ip = rec->ip;
2509+
2510+
do_for_each_ftrace_op(op, ftrace_ops_list) {
2511+
2512+
if (hash_contains_ip(ip, op->func_hash)) {
2513+
if (found)
2514+
return NULL;
2515+
found = op;
2516+
}
2517+
2518+
} while_for_each_ftrace_op(op);
2519+
2520+
return found;
2521+
}
2522+
24342523
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
24352524
/* Protected by rcu_tasks for reading, and direct_mutex for writing */
24362525
static struct ftrace_hash *direct_functions = EMPTY_HASH;
@@ -3780,11 +3869,12 @@ static int t_show(struct seq_file *m, void *v)
37803869
if (iter->flags & FTRACE_ITER_ENABLED) {
37813870
struct ftrace_ops *ops;
37823871

3783-
seq_printf(m, " (%ld)%s%s%s",
3872+
seq_printf(m, " (%ld)%s%s%s%s",
37843873
ftrace_rec_count(rec),
37853874
rec->flags & FTRACE_FL_REGS ? " R" : " ",
37863875
rec->flags & FTRACE_FL_IPMODIFY ? " I" : " ",
3787-
rec->flags & FTRACE_FL_DIRECT ? " D" : " ");
3876+
rec->flags & FTRACE_FL_DIRECT ? " D" : " ",
3877+
rec->flags & FTRACE_FL_CALL_OPS ? " O" : " ");
37883878
if (rec->flags & FTRACE_FL_TRAMP_EN) {
37893879
ops = ftrace_find_tramp_ops_any(rec);
37903880
if (ops) {
@@ -3800,6 +3890,15 @@ static int t_show(struct seq_file *m, void *v)
38003890
} else {
38013891
add_trampoline_func(m, NULL, rec);
38023892
}
3893+
if (rec->flags & FTRACE_FL_CALL_OPS_EN) {
3894+
ops = ftrace_find_unique_ops(rec);
3895+
if (ops) {
3896+
seq_printf(m, "\tops: %pS (%pS)",
3897+
ops, ops->func);
3898+
} else {
3899+
seq_puts(m, "\tops: ERROR!");
3900+
}
3901+
}
38033902
if (rec->flags & FTRACE_FL_DIRECT) {
38043903
unsigned long direct;
38053904

0 commit comments

Comments
 (0)