[PATCH] jump label: Reduce the cycle count by changing the link order

August 05th, 2011 - 04:50 pm ET by Jason Baron
In the course of testing jump labels for use with the CFS bandwidth controller,
Paul Turner discovered that using jump labels reduced the branch count and the
instruction count, but did not reduce the cycle count or wall time.

I noticed that merely having jump_label.o linked into the kernel, without it
being used in any way, still caused this increase in cycle count and wall time.
I therefore moved jump_label.o in kernel/Makefile, changing the link order and
presumably moving it out of hot icache areas. This brought the cycle count and
wall time down as expected.

In addition to Paul's testing, I've tested the patch using a single
'static_branch()' in the getppid() path, and basically running tight loops of
calls to getppid(). Here are my results for the branch disabled case:

With jump labels turned on (CONFIG_JUMP_LABEL), branch disabled:

Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):

3,969,510,217 instructions # 0.864 IPC ( +-0.000% )
4,592,334,954 cycles ( +- 0.046% )
751,634,470 branches ( +- 0.000% )

1.722635797 seconds time elapsed ( +- 0.046% )

Jump labels turned off (CONFIG_JUMP_LABEL not set), branch disabled:

Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):

4,009,611,846 instructions # 0.867 IPC ( +-0.000% )
4,622,210,580 cycles ( +- 0.012% )
771,662,904 branches ( +- 0.000% )

1.734341454 seconds time elapsed ( +- 0.022% )
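
For readers who want to reproduce a similar measurement, the tight getppid() loop described above could look roughly like the sketch below. The source of /tmp/getppid is not included in this message, so the function name run_getppid_loop and the iteration count are assumptions made purely for illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the /tmp/getppid test program: issue
 * `iterations` getppid() system calls back to back and return how
 * many calls were made. Not the program used for the numbers above.
 */
unsigned long run_getppid_loop(unsigned long iterations)
{
    unsigned long calls = 0;
    volatile pid_t sink = 0;    /* keeps each result observably used */

    for (unsigned long i = 0; i < iterations; i++) {
        sink = getppid();       /* one system call per iteration */
        calls++;
    }
    (void)sink;
    return calls;
}
```

Calling run_getppid_loop() from a small main() with a large iteration count and running the resulting binary under perf stat with a repeat count of 50 would produce output in the same shape as the tables above. The absolute numbers will of course differ by machine and kernel.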

Signed-off-by: Jason Baron <jbaron@redhat.com>
Tested-by: Paul Turner <pjt@google.com>

kernel/Makefile | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/Makefile b/kernel/Makefile
index 2d64cfc..329dfcc 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -10,7 +10,7 @@ obj-y = sched.o fork.o exec_domain.o panic.o printk.o \
kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
notifier.o ksysfs.o pm_qos_params.o sched_clock.o cred.o \
- async.o range.o jump_label.o
+ async.o range.o
obj-y += groups.o

ifdef CONFIG_FUNCTION_TRACER
@@ -107,6 +107,7 @@ obj-$(CONFIG_PERF_EVENTS) += events/
obj-$(CONFIG_USER_RETURN_NOTIFIER) += user-return-notifier.o
obj-$(CONFIG_PADATA) += padata.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
+obj-$(CONFIG_JUMP_LABEL) += jump_label.o

ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
# According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
--
1.7.5.4

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Replies

#1 Peter Zijlstra
August 05th, 2011 - 06:20 pm ET
On Fri, 2011-08-05 at 16:40 -0400, Jason Baron wrote:
> In the course of testing jump labels for use with the CFS bandwidth controller,
> Paul Turner, discovered that using jump labels reduced the branch count and the
> instruction count, but did not reduce the cycle count or wall time.
>
> I noticed that having the jump_label.o included in the kernel but not used in
> any way still caused this increase in cycle count and wall time. Thus, I moved
> jump_label.o in the kernel/Makefile, thus changing the link order, and
> presumably moving it out of hot icache areas. This brought down the cycle
> count/time as expected.
>
> In addition to Paul's testing, I've tested the patch using a single
> 'static_branch()' in the getppid() path, and basically running tight loops of
> calls to getppid(). Here are my results for the branch disabled case:



Those numbers don't seem to be pre/post patch, but merely
CONFIG_JUMP_LABEL=y/n so they don't tell us what the patch does.

Anyway, should we put a comment in the Makefile telling us we should
keep jump_label.o last?

Also, pjt mentioned on IRC that mucking about with link order is
something Google is not unfamiliar with. Could we use some sort of
runtime feedback to generate linker layout maps or so? That seems like a
more scalable approach than randomly mucking about with Makefiles :-)
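
To make the runtime-feedback idea a little more concrete, here is a minimal sketch of turning per-function hit counts into an ordered list of linker-script input-section patterns, hottest first. Nothing in it comes from the thread: the struct, the function names and the sample counts are hypothetical, and it assumes the code was built with -ffunction-sections so that each function lives in its own .text.<name> section. How the hit counts would be collected is left open.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical profile record: a function name and how often it was hit. */
struct func_sample {
    const char *name;
    unsigned long hits;
};

/* Order hottest first; break ties by name so the output is stable. */
static int by_hits_desc(const void *a, const void *b)
{
    const struct func_sample *fa = a;
    const struct func_sample *fb = b;

    if (fa->hits != fb->hits)
        return fa->hits < fb->hits ? 1 : -1;
    return strcmp(fa->name, fb->name);
}

/*
 * Sort the samples in place and write one input-section pattern per
 * function, suitable for pasting into the .text output section of a
 * linker script. Returns the number of patterns written.
 */
size_t emit_text_order(struct func_sample *samples, size_t n, FILE *out)
{
    qsort(samples, n, sizeof(*samples), by_hits_desc);
    for (size_t i = 0; i < n; i++)
        fprintf(out, "*(.text.%s)\n", samples[i].name);
    return n;
}
```

Grouping the hottest functions next to each other is one plausible way to keep them from being pushed apart by cold code, which is the effect the Makefile reordering in this patch achieves by hand for a single object file.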