In a hackbench test (./hackbench 100 process 2000) on a 2-socket
Sandy Bridge server, we noticed a slowdown by a factor of 4
since commit 367456c7 was applied
(sched: Ditch per cgroup task lists for load-balancing).
Commit 5d6523e (sched: Fix load-balance wreckage) did
not fix the regression.
The profile for 3.4-rc2 shows heavy spinlock contention in the load_balance path,
which accounted for less than 0.003% of CPU before commit 367456c7.
Looking at /proc/schedstat on 3.4-rc2 over the duration of the run,
schedule() was called 13x more often on cpu0, and calls to schedule() that
left the processor idle were 530x as frequent.
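For anyone who wants to repeat the comparison, here is a rough sketch of how the two cpu0 counters could be pulled from /proc/schedstat snapshots taken before and after a run. It is not the script we used. The field positions are an assumption based on the schedstat version 15 layout (third and fourth numbers on a "cpuN" line), and the helper names are made up for illustration, so please check them against the sched-stats documentation for your kernel.

```python
# Sketch: per-CPU schedule() counters from two /proc/schedstat snapshots.
# ASSUMPTION: schedstat version 15 layout, where the 3rd and 4th numbers on a
# "cpuN" line are the schedule() call count and the number of schedule()
# calls that left the CPU idle. Verify against your kernel's documentation.

SCHED_COUNT = 3    # token index on the cpu line (token 0 is "cpuN")
SCHED_GOIDLE = 4   # token index of "schedule() left the processor idle"


def cpu_counters(schedstat_text, cpu="cpu0"):
    """Return (schedule_calls, schedule_left_idle) for one CPU."""
    for line in schedstat_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == cpu:
            return int(tokens[SCHED_COUNT]), int(tokens[SCHED_GOIDLE])
    raise ValueError(f"{cpu} not found in schedstat text")


def run_delta(before_text, after_text, cpu="cpu0"):
    """Counters accumulated between two snapshots (after minus before)."""
    before = cpu_counters(before_text, cpu)
    after = cpu_counters(after_text, cpu)
    return after[0] - before[0], after[1] - before[1]


def read_schedstat(path="/proc/schedstat"):
    """Read the current schedstat contents as one string."""
    with open(path) as f:
        return f.read()
```

Taking read_schedstat() once before and once after the hackbench run on each kernel, and dividing the run_delta() results of one kernel by the other, gives ratios of the kind quoted above.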
There was also a large increase in the try-to-wake-up remote count (sd->ttwu_wake_remote).
Increase in sd->ttwu_wake_remote for cpu0:

    domain 0     540%
    domain 1    7570%
    domain 2    4426%
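A minimal sketch of how per-domain numbers like these could be derived from two schedstat captures follows. The token index for ttwu_wake_remote is an assumption (field 34 of the 36 per-domain fields in the version 15 layout, after the domain name and cpumask), and the function names are hypothetical, so treat it as a starting point rather than the exact method behind the figures above.

```python
# Sketch: percentage increase of ttwu_wake_remote per sched domain of one CPU.
# ASSUMPTION: schedstat version 15, where a "domainN <cpumask> ..." line
# carries 36 counters and ttwu_wake_remote is the 34th of them.

TTWU_WAKE_REMOTE = 35  # token 0 = "domainN", token 1 = cpumask, field 34 = token 35


def domain_ttwu_wake_remote(schedstat_text, cpu="cpu0"):
    """Return {domain_name: ttwu_wake_remote} for the domains listed under cpu."""
    result = {}
    in_cpu = False
    for line in schedstat_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("cpu"):
            # Domain lines belong to the most recent cpu line above them.
            in_cpu = tokens[0] == cpu
        elif in_cpu and tokens[0].startswith("domain"):
            result[tokens[0]] = int(tokens[TTWU_WAKE_REMOTE])
    return result


def percent_increase(old, new):
    """Increase of new over old in percent (640 vs 100 is a 540% increase)."""
    return (new - old) * 100.0 / old
```

Note that a 540% increase means the count is 6.4 times the baseline, not 5.4 times. The counters are cumulative since boot, so each kernel's value should itself be a difference between a capture before and a capture after the run.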
Could there be unnecessary load balancing to remote CPUs?