[RFC PATCH v2 0/4] cpuidle: (POWER) cpuidle driver for pSeries

November 17th, 2011 - 06:40 am ET by Deepthi Dharwar | Report spam
This patch series ports the cpuidle framework for ppc64 platform and
implements a cpuidle back-end driver for ppc64 (pSeries) platform.
Currently idle states are managed by pseries_{dedicated,shared}_idle_sleep()
routines in arch/powerpc/platforms/pseries/setup.c. There are
two idle states (snooze and cede) that are exploited by
these routines based on simple heuristics.

Moving the idle states over to cpuidle framework can take advantage of
the advanced heuristics, tunables, and features provided by cpuidle
framework. Additional idle states like extended cede with hints would be
included and exploited using the cpuidle framework. The statistics and
tracing infrastructure provided by the cpuidle framework also helps in
enabling power management related tools and help tune the system and
applications.

This series aims to maintain compatibility and functionality to
existing pSeries idle cpu management code. There are no new functions
or idle states added as part of this series.

The previous version of this patch can be found at
https://lkml.org/lkml/2011/6/7/375

Changes from the previous version (v1):

1] Rebased to latest 3.2-rc2

2] Incorporated the changes from the feedback provided by Ben
in the previous version of this series.

3] The first three patches in this series posted in v1 are not
re-posted as they were taken from "idle cleanup - v3" posted by
Len Brown (https://lkml.org/lkml/2011/4/2/8) which have made
to the mainline in 3.1.
This cleanup was a community requirement and major dependency
before cpuidle framework can be ported over to powerpc
architecture.

This patch series includes:

[1/4] - Provides arch specific cpu_idle_wait() function required by cpuidle
subsystem.
[2/4] - pseries_idle cpuidle driver
[3/4] - Enables cpuidle for pSeries and directly calls cpuidle_idle_call()
[4/4] - Handles powersave=off kernel boot parameter and disables registration
of pseries_idle cpuidle driver.

This series has been tested on ppc64 pSeries POWER7 system with the snooze
and cede states

arch/powerpc/Kconfig | 4
arch/powerpc/include/asm/processor.h | 3
arch/powerpc/include/asm/system.h | 9 +
arch/powerpc/kernel/idle.c | 26 ++
arch/powerpc/kernel/sysfs.c | 2
arch/powerpc/platforms/Kconfig | 6
arch/powerpc/platforms/pseries/Kconfig | 9 +
arch/powerpc/platforms/pseries/Makefile | 1
arch/powerpc/platforms/pseries/processor_idle.c | 321 +++++++++++++++++++++++
arch/powerpc/platforms/pseries/pseries.h | 3
arch/powerpc/platforms/pseries/setup.c | 89
arch/powerpc/platforms/pseries/smp.c | 1
include/linux/cpuidle.h | 2
13 files changed, 387 insertions(+), 89 deletions(-)
create mode 100644 arch/powerpc/platforms/pseries/processor_idle.c



-Deepthi

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
email Follow the discussionReplies 16 repliesReplies Make a reply

Replies

#1 Benjamin Herrenschmidt
November 27th, 2011 - 05:50 pm ET | Report spam
On Thu, 2011-11-17 at 16:58 +0530, Deepthi Dharwar wrote:
This patch provides cpu_idle_wait() routine for the powerpc
platform which is required by the cpuidle subsystem. This
routine is requied to change the idle handler on SMP systems.
The equivalent routine for x86 is in arch/x86/kernel/process.c
but the powerpc implementation is different.

Signed-off-by: Deepthi Dharwar
Signed-off-by: Trinabh Gupta
Signed-off-by: Arun R Bharadwaj




No, that patch also adds this idle boot override thing (can you pick a
shorter name for boot_option_idle_override btw ?) which seems unrelated
and without any explanation as to what it's supposed to be about.

Additionally, I'm a bit worried (but maybe we already discussed that a
while back, I don't know) but cpu_idle_wait() has "wait" in the name,
which makes me think it might need to actually -wait- for all cpus to
have come out of the function.

Now your implementation doesn't provide that guarantee. It might be
fine, I don't know, but if it is, you'd better document it well in the
comments surrounding the code, because as it is, all you do is shoot an
interrupt which will cause the target CPU to eventually come out of idle
some time in the future.

Cheers,
Ben.

arch/powerpc/Kconfig | 4 ++++
arch/powerpc/include/asm/processor.h | 2 ++
arch/powerpc/include/asm/system.h | 1 +
arch/powerpc/kernel/idle.c | 26 ++++++++++++++++++++++++++
4 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b177caa..87f8604 100644
a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -87,6 +87,10 @@ config ARCH_HAS_ILOG2_U64
bool
default y if 64BIT

+config ARCH_HAS_CPU_IDLE_WAIT
+ bool
+ default y
+
config GENERIC_HWEIGHT
bool
default y
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index eb11a44..811b7e7 100644
a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -382,6 +382,8 @@ static inline unsigned long get_clean_sp(struct pt_regs *regs, int is_32)
}
#endif

+enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
+
#endif /* __KERNEL__ */
#endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_PROCESSOR_H */
diff --git a/arch/powerpc/include/asm/system.h b/arch/powerpc/include/asm/system.h
index e30a13d..ff66680 100644
a/arch/powerpc/include/asm/system.h
+++ b/arch/powerpc/include/asm/system.h
@@ -221,6 +221,7 @@ extern unsigned long klimit;
extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);

extern int powersave_nap; /* set if nap mode can be used in idle loop */
+void cpu_idle_wait(void);

/*
* Atomic exchange
diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
index 39a2baa..b478c72 100644
a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -39,9 +39,13 @@
#define cpu_should_die() 0
#endif

+unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE;
+EXPORT_SYMBOL(boot_option_idle_override);
+
static int __init powersave_off(char *arg)
{
ppc_md.power_save = NULL;
+ boot_option_idle_override = IDLE_POWERSAVE_OFF;
return 0;
}
__setup("powersave=off", powersave_off);
@@ -102,6 +106,28 @@ void cpu_idle(void)
}
}

+
+/*
+ * cpu_idle_wait - Used to ensure that all the CPUs come out of the old
+ * idle loop and start using the new idle loop.
+ * Required while changing idle handler on SMP systems.
+ * Caller must have changed idle handler to the new value before the call.
+ */
+void cpu_idle_wait(void)
+{
+ int cpu;
+ smp_mb();
+
+ /* kick all the CPUs so that they exit out of old idle routine */
+ get_online_cpus();
+ for_each_online_cpu(cpu) {
+ if (cpu != smp_processor_id())
+ smp_send_reschedule(cpu);
+ }
+ put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(cpu_idle_wait);
+
int powersave_nap;

#ifdef CONFIG_SYSCTL

_______________________________________________
Linuxppc-dev mailing list

https://lists.ozlabs.org/listinfo/linuxppc-dev




To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Similar topics