[RFC 0/4] tracing,x86_64 - function/graph trace without mcount/-pg/framepointer

February 03rd, 2011 - 10:50 am ET by Jiri Olsa
hi,

I recently saw the direct jump probing made for kprobes
and tried to use it inside the trace framework.

The general idea is to patch the function entry with a direct
jump to the trace code, instead of using the profiling code
pregenerated by gcc.
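
To illustrate what patching a function entry with a direct jump means at
the byte level, here is a minimal user-space sketch of encoding an x86_64
"jmp rel32" (opcode 0xE9 followed by a 32-bit displacement relative to the
end of the 5-byte instruction). The helper name encode_jmp_rel32 and the
constants are hypothetical and are not taken from the patches; real code
also has to save the overwritten instructions and write the bytes safely
into live kernel text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_OPCODE 0xE9
#define JMP_REL32_SIZE   5

/*
 * Encode "jmp rel32" placed at address `from` so that it lands on `to`.
 * The displacement is counted from the end of the 5-byte instruction.
 * Returns 0 on success, -1 if the target is outside the rel32 range.
 * (Hypothetical helper for illustration only.)
 */
static int encode_jmp_rel32(uint8_t *out, uint64_t from, uint64_t to)
{
    int64_t disp = (int64_t)(to - (from + JMP_REL32_SIZE));
    uint32_t rel;
    int i;

    assert(out != NULL);

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    rel = (uint32_t)(int32_t)disp;
    out[0] = JMP_REL32_OPCODE;
    /* x86 stores the displacement in little-endian byte order */
    for (i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(rel >> (8 * i));
    return 0;
}
```

For example, a jump from 0x1000 to 0x2000 encodes as e9 fb 0f 00 00, since
the displacement is 0x2000 - 0x1005 = 0xffb.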

I started this just to see whether it would even be possible
to hook the new probing into the current trace code. It
appears it's not that bad. I was able to run the function
and function_graph tracers on x86_64.

For details on direct jump probes, please check:
http://www.linuxinsight.com/ols2007...rhead.html


I realize that hooking functions this way has some
drawbacks. From what I can see, they are roughly:
- not all functions can be patched
- need to find a way to say which functions are safe to patch
- memory consumption for detour buffers and symbol records

but there seem to be some advantages as well:
- trace code could be in a module
- no profiling code is needed
- framepointer can be disabled (framepointer is needed for
  generating profile code)


The attached implementation is mostly a hack (expect bugs);
the ftrace/kprobe integration in particular could probably be done better.
It's x86_64 only.

It can be used like this:

- a new menu config item is added (function tracer engine),
  to choose mcount or ktrace
- a new file "ktrace" is added to the tracing dir
- to add symbols to trace, run:
      echo mutex_unlock > ./ktrace
      echo mutex_lock >> ./ktrace
- to display the traced symbols:
      cat ktrace
- to enable the trace, the usual is needed:
      echo function > ./current_tracer
      echo function_graph > ./current_tracer
- to remove symbols from the trace:
      echo nop > ./current_tracer
      echo > ./ktrace
- if a function is added while the tracer is running,
  the symbol is enabled automatically.
- symbols can only be removed all at once, and only while
  no tracer is running.

I'm not sure how to tell from the kallsyms interface which functions
are safe to patch, so I have left out patching of all symbols for now.


attached patches:
1/4 - kprobe - ktrace instruction slot cache interface
      using kprobe detour buffer allocation, adding interface
      to use it from trace framework

2/4 - tracing - adding size parameter to do_ftrace_mod_code
      adding size parameter to be able to restore the saved
      instructions, which could be longer than relative call

3/4 - ktrace - function trace support
      adding ktrace support with function tracer

4/4 - ktrace - function trace support
      adding function graph support

please let me know what you think, thanks
jirka

Makefile | 2 +-
arch/x86/Kconfig | 4 +-
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/entry_64.S | 50 +++++++
arch/x86/kernel/ftrace.c | 157 +++++++++++-
arch/x86/kernel/ktrace.c | 256 ++++++++++++++++++++++++++++++++++
include/linux/ftrace.h | 36 +++++-
include/linux/kprobes.h | 8 +
kernel/kprobes.c | 33 +++++
kernel/trace/Kconfig | 28 ++++-
kernel/trace/Makefile | 1 +
kernel/trace/ftrace.c | 21 +++
kernel/trace/ktrace.c | 330 ++++++++++++++++++++++++++++++++++++++++++++
kernel/trace/trace.c | 1 +
14 files changed, 846 insertions(+), 82 deletions(-)
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Replies

#1 Steven Rostedt
February 03rd, 2011 - 11:40 am ET
On Thu, 2011-02-03 at 16:42 +0100, Jiri Olsa wrote:
> hi,
>
> I recently saw the direct jump probing made for kprobes
> and tried to use it inside the trace framework.
>
> The general idea is to patch the function entry with a direct
> jump to the trace code, instead of using the profiling code
> pregenerated by gcc.

Interesting, but ideally, it would be nice if gcc provided a better
"mcount" mechanism. One that calls mcount (or whatever new name it would
have) before it does anything with the stack.

> I started this just to see whether it would even be possible
> to hook the new probing into the current trace code. It
> appears it's not that bad. I was able to run the function
> and function_graph tracers on x86_64.
>
> For details on direct jump probes, please check:
> http://www.linuxinsight.com/ols2007...rhead.html
>
> I realize that hooking functions this way has some
> drawbacks. From what I can see, they are roughly:
> - not all functions can be patched

What's the reason for not all functions?

> - need to find a way to say which functions are safe to patch
> - memory consumption for detour buffers and symbol records
>
> but there seem to be some advantages as well:
> - trace code could be in a module

What makes this allow module code?

ftrace could do that now, but it would require a separate handler. I
would need to disable preemption before calling the module code function
handler.

> - no profiling code is needed
> - framepointer can be disabled (framepointer is needed for
>   generating profile code)

Again ideally, gcc should fix this.



#2 Frederic Weisbecker
February 03rd, 2011 - 12:40 pm ET
On Thu, Feb 03, 2011 at 11:33:25AM -0500, Steven Rostedt wrote:
> On Thu, 2011-02-03 at 16:42 +0100, Jiri Olsa wrote:
> > hi,
> >
> > I recently saw the direct jump probing made for kprobes
> > and tried to use it inside the trace framework.
> >
> > The general idea is to patch the function entry with a direct
> > jump to the trace code, instead of using the profiling code
> > pregenerated by gcc.
>
> Interesting, but ideally, it would be nice if gcc provided a better
> "mcount" mechanism. One that calls mcount (or whatever new name it would
> have) before it does anything with the stack.
>
> > I started this just to see whether it would even be possible
> > to hook the new probing into the current trace code. It
> > appears it's not that bad. I was able to run the function
> > and function_graph tracers on x86_64.
> >
> > For details on direct jump probes, please check:
> > http://www.linuxinsight.com/ols2007...rhead.html
> >
> > I realize that hooking functions this way has some
> > drawbacks. From what I can see, they are roughly:
> > - not all functions can be patched
>
> What's the reason for not all functions?



Because of the functions that kprobes itself calls: they can't be
probed, to avoid recursion. kprobes has some recursion detection
mechanism, IIRC, but until we reach that checkpoint, I think there are
some functions in the path.

Well, ftrace has the same problem. That's just due to the nature of
function tracing.

There may also be some places that are too fragile to use kprobes in.

Ah, the whole trap path for example :-(
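
The recursion problem can be illustrated with a small user-space model.
This is only a sketch under stated assumptions: the names (in_tracer,
trace_hook, helper_used_by_tracer) are hypothetical, a single global flag
stands in for what would be per-CPU state in a kernel, and it does not
reproduce the actual kprobes or ftrace recursion checks. It shows that a
guard stops re-entry only from the point where the flag is tested; any
function the hook called before that test could not itself be traced
without recursing forever.

```c
#include <assert.h>
#include <stdbool.h>

static bool in_tracer;          /* would be per-CPU in a real kernel */
static int events_recorded;
static int recursions_rejected;

static void trace_hook(void);

/* A helper the tracer relies on; imagine its entry is patched as well. */
static void helper_used_by_tracer(void)
{
    trace_hook();               /* patched entry jumps back into the tracer */
    assert(in_tracer);          /* only ever reached from inside the hook */
}

/* The code every patched function entry jumps to. */
static void trace_hook(void)
{
    if (in_tracer) {            /* the recursion "checkpoint" */
        recursions_rejected++;
        return;
    }
    in_tracer = true;
    helper_used_by_tracer();    /* re-enters trace_hook(), rejected above */
    events_recorded++;
    in_tracer = false;
}

/* An ordinary traced function. */
static void traced_function(void)
{
    trace_hook();
}
```

Calling traced_function() once records one event and rejects one nested
entry coming from the helper.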

> > - need to find a way to say which functions are safe to patch
> > - memory consumption for detour buffers and symbol records
> >
> > but there seem to be some advantages as well:
> > - trace code could be in a module
>
> What makes this allow module code?
>
> ftrace could do that now, but it would require a separate handler. I
> would need to disable preemption before calling the module code function
> handler.



Kprobes takes care of handlers from modules already.
I'm not sure we want that, it makes the tracing code more sensitive.

Look, for example I think kprobes doesn't trace the kernel fault path
because module space is allocated through vmalloc (hmm, is it still
the case?).

> > - no profiling code is needed
> > - framepointer can be disabled (framepointer is needed for
> >   generating profile code)
>
> Again ideally, gcc should fix this.



Another drawback of using kprobes is the overhead.
I can't imagine a trap triggering for every function. But then
yeah, we have the jmp optimisation. But then it needs that detour
buffer, which we can avoid with mcount.

So like Steve, I think mcount is still a better backend for function
tracing. More optimized by nature, even though it indeed needs
some fixes.
#3 Steven Rostedt
February 03rd, 2011 - 02:10 pm ET
On Thu, 2011-02-03 at 18:35 +0100, Frederic Weisbecker wrote:

> > ftrace could do that now, but it would require a separate handler. I
> > would need to disable preemption before calling the module code function
> > handler.
>
> Kprobes takes care of handlers from modules already.
> I'm not sure we want that, it makes the tracing code more sensitive.



Masami,

I'm looking at the optimized kprobes code, particularly
kprobes_optinsn_template_holder(), which looks to be the template that
is called on optimized kprobes. I don't see where preemption or
interrupts are disabled when a probe is called.

If modules can register probes, and they can be called from any arbitrary
location in the kernel, then preemption must be disabled prior to
calling the module code. Otherwise you risk crashing the system on
module unload.


    module:
    -------
    register_kprobe(probe);

    Core:
    -----
    hit break point
    call probe

    module:
    -------
    in probe function
    preempted

    module:
    -------
    unregister_kprobe(probe);
    stop_machine();
    <module unloaded>

    Core:
    module <zombie>:
    ----------------
    gets CPU again
    executes module code that's been freed
    DEATH BY ZOMBIES

Maybe I missed something. But do optimized kprobes disable
preemption or interrupts before calling the optimized probe?



#4 Masami Hiramatsu
February 04th, 2011 - 01:10 am ET
Hi,

(2011/02/04 0:42), Jiri Olsa wrote:
> hi,
>
> I recently saw the direct jump probing made for kprobes
> and tried to use it inside the trace framework.
>
> The general idea is to patch the function entry with a direct
> jump to the trace code, instead of using the profiling code
> pregenerated by gcc.
>
> I started this just to see whether it would even be possible
> to hook the new probing into the current trace code. It
> appears it's not that bad. I was able to run the function
> and function_graph tracers on x86_64.
>
> For details on direct jump probes, please check:
> http://www.linuxinsight.com/ols2007...rhead.html



Thank you for referring to it ;-)

> I realize that hooking functions this way has some
> drawbacks. From what I can see, they are roughly:
> - not all functions can be patched



Yeah, that is why "djprobe" became "optprobe". If kprobes
finds there is no space to patch, it just falls back to a
breakpoint. Since this check is done internally, kprobes
users get this benefit transparently (no need to
change the user's code).
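
The "no space to patch" check can be sketched as follows. This is a
simplified user-space model, not the kprobes implementation: choose_probe,
the probe_kind enum and the instruction-length array are hypothetical, and
the only condition modelled is that whole instructions at the probe address
must cover the 5 bytes of a jmp rel32. The real check has to consider more
than size.

```c
#include <assert.h>
#include <stddef.h>

#define JMP_SIZE 5   /* size of an x86 "jmp rel32" */

enum probe_kind { PROBE_BREAKPOINT, PROBE_JUMP };

/*
 * insn_len[] holds the lengths of the instructions that start at the
 * probe address, up to the end of the function; n is the entry count.
 *
 * If whole instructions cover at least JMP_SIZE bytes, a jump fits:
 * return PROBE_JUMP and store in *copied how many bytes would have to
 * be moved to the detour buffer. Otherwise fall back to a breakpoint
 * and store 0.
 */
static enum probe_kind choose_probe(const unsigned char *insn_len, size_t n,
                                    size_t *copied)
{
    size_t covered = 0;
    size_t i;

    assert(copied != NULL);

    for (i = 0; i < n && covered < JMP_SIZE; i++)
        covered += insn_len[i];

    if (covered < JMP_SIZE) {
        *copied = 0;
        return PROBE_BREAKPOINT;
    }
    *copied = covered;
    return PROBE_JUMP;
}
```

With instruction lengths 1, 3 and 4, the jump overwrites parts of three
instructions, so 8 bytes have to be copied; a 3-byte function can only be
probed with a breakpoint.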

> - need to find a way to say which functions are safe to patch
> - memory consumption for detour buffers and symbol records



And also, you can't patch more than two instructions without the
int3 bypass method (or a special stack checker), because a processor
may be running, or may have been interrupted, on the 2nd instruction
when stop_machine is issued.
That's the 2nd reason why djprobe is a part of kprobes.
This "int3 bypass" method does not allow you to probe NMI handlers,
since an int3 inside an NMI handler clears the additional NMI masking
by issuing IRET.
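
One common ordering for breakpoint-first patching can be modelled in user
space as below. This is a sketch under assumptions, not the sequence used
by kprobes or by the patches: patch_with_int3, the sync callback and
record_step are hypothetical, and the callback merely stands in for
whatever makes each write visible to all CPUs. The point is that the first
byte holds int3 (0xCC) for as long as the remaining bytes are inconsistent,
so a CPU arriving at the entry traps instead of executing a half-written
instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3 0xCC

/* Hypothetical stand-in for "synchronize all CPUs after this write". */
typedef void (*sync_fn)(const uint8_t *text, size_t len, void *ctx);

struct patch_log {
    int steps;          /* number of synchronization points seen */
    int trapped_steps;  /* steps at which the entry byte was int3 */
};

/* Example observer: records what each intermediate state looks like. */
static void record_step(const uint8_t *text, size_t len, void *ctx)
{
    struct patch_log *log = ctx;

    (void)len;
    log->steps++;
    if (text[0] == INT3)
        log->trapped_steps++;
}

static void patch_with_int3(uint8_t *text, const uint8_t *new_insn,
                            size_t len, sync_fn sync, void *ctx)
{
    assert(len >= 1);

    text[0] = INT3;                            /* 1: divert new arrivals */
    sync(text, len, ctx);

    memcpy(text + 1, new_insn + 1, len - 1);   /* 2: rewrite the tail */
    sync(text, len, ctx);

    text[0] = new_insn[0];                     /* 3: publish the first byte */
    sync(text, len, ctx);
}
```

In this model the breakpoint handler would be responsible for sending a
trapped CPU somewhere safe, which is the role the text above assigns to
kprobes.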

> but there seem to be some advantages as well:
> - trace code could be in a module
> - no profiling code is needed
> - framepointer can be disabled (framepointer is needed for
>   generating profile code)



Nowadays, profiling code with dynamic ftrace does not add
visible overhead, and if you need to trace without a
profiling-enabled binary, you can already use kprobe-tracer for it.
(Using kprobe-tracer via perf-probe allows you to probe not
only actual functions but also inlined function entries ;-))


Thank you,


Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory