[PATCH v3 0/3] CPU hotplug, Freezer: Fix bugs in CPU hotplug call path

October 21st, 2011 - 08:30 am ET by Srivatsa S. Bhat | Report spam
When using the CPU hotplug infrastructure to offline/online CPUs, the cpu_up()
and cpu_down() functions are used, which internally call _cpu_up() and
_cpu_down() with the second argument *always* set to 0. The second argument
is "tasks_frozen", which should be correctly set to 1 when tasks have been
frozen, even when the freezing of tasks may be due to an event unrelated
to CPU hotplug, such as a suspend operation in progress, in which case the
freezer subsystem freezes all the freezable tasks.

Not giving the correct value for the 'tasks_frozen' argument can potentially
lead to a lot of issues since this will send wrong notifications via the
notifier mechanism leading to the execution of inappropriate code by the
callbacks registered for the notifiers. That is, instead of CPU_ONLINE_FROZEN
and CPU_DEAD_FROZEN notifications, it results in CPU_ONLINE and CPU_DEAD
notifications to be sent all the time, irrespective of the actual state of
the system (i.e., whether tasks are frozen or not).

This patch introduces a flag to indicate whether the tasks are frozen are not
(by any event) and modifies cpu_up() and cpu_down() functions to check the
value of this flag and accordingly call _cpu_up() and _cpu_down() respectively
by supplying the correct value as the second argument based on the state of
the system. This in turn means that the correct notifications are sent, thus
ensuring that all the registered callbacks do the right thing.

Additionally, to ensure that the tasks are not frozen or thawed by the freezer
subsystem while the registered callbacks are running, this patch adds a few
notifications in the freezer which is then hooked onto by the CPU hotplug
code, to avoid this race.

v3: * Added synchronization between CPU hotplug and the freezer subsystem
without introducing any new locks in the CPU hotplug call path.

v2: * Removed the atomic_t declaration of tasks_frozen flag and the
atomic_[set|read] functions since they were unnecessary.
* Updated the changelog to give an example scenario where things could go
wrong due to the bug in the CPU hotplug call path.

References:
v1 -> http://thread.gmane.org/gmane.linux...l/1198312/
v2 -> http://thread.gmane.org/gmane.linux...8312/focus99087

Srivatsa S. Bhat (3):
CPU hotplug, Freezer: Don't hardcode 'tasks_frozen' argument in _cpu_*()
CPU hotplug, Freezer: Synchronize CPU hotplug and freezer
PM, Freezer, Docs: Update documentation about the new freezer notifiers


Documentation/power/notifiers.txt | 8 ++++
include/linux/freezer.h | 2 +
include/linux/suspend.h | 6 ++-
kernel/cpu.c | 76 ++++++++++++++++++++++++++++++++++++-
kernel/power/process.c | 32 +++++++++++++++-
5 files changed, 120 insertions(+), 4 deletions(-)


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
email Follow the discussionReplies 3 repliesReplies Make a reply

Replies

#1 Srivatsa S. Bhat
October 21st, 2011 - 08:30 am ET | Report spam
This patch introduces a 'tasks_frozen' flag that is set whenever tasks are
frozen by the freezer and reset whenever the tasks are thawed. The value of
this flag is then checked by the CPU hotplug code in order to pass the
correct argument to the _cpu_up() and _cpu_down() functions, thereby
reflecting the correct state of the system.

Signed-off-by: Srivatsa S. Bhat


include/linux/freezer.h | 2 ++
kernel/cpu.c | 5 +++--
kernel/power/process.c | 22 +++++++++++++++++++++-
3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 1effc8b..a5283ce 100644
a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -50,6 +50,7 @@ extern int thaw_process(struct task_struct *p);
extern void refrigerator(void);
extern int freeze_processes(void);
extern void thaw_processes(void);
+extern int tasks_are_frozen(void);

static inline int try_to_freeze(void)
{
@@ -173,6 +174,7 @@ static inline int thaw_process(struct task_struct *p) { return 1; }
static inline void refrigerator(void) {}
static inline int freeze_processes(void) { BUG(); return 0; }
static inline void thaw_processes(void) {}
+static inline int tasks_are_frozen(void) { return 0; }

static inline int try_to_freeze(void) { return 0; }

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 12b7458..e889ffd 100644
a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -15,6 +15,7 @@
#include <linux/stop_machine.h>
#include <linux/mutex.h>
#include <linux/gfp.h>
+#include <linux/freezer.h>

#ifdef CONFIG_SMP
/* Serializes the updates to cpu_online_mask, cpu_present_mask */
@@ -280,7 +281,7 @@ int __ref cpu_down(unsigned int cpu)
goto out;
}

- err = _cpu_down(cpu, 0);
+ err = _cpu_down(cpu, tasks_are_frozen());

out:
cpu_maps_update_done();
@@ -373,7 +374,7 @@ int __cpuinit cpu_up(unsigned int cpu)
goto out;
}

- err = _cpu_up(cpu, 0);
+ err = _cpu_up(cpu, tasks_are_frozen());

out:
cpu_maps_update_done();
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 0cf3a27..230bf96 100644
a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -17,11 +17,28 @@
#include <linux/delay.h>
#include <linux/workqueue.h>

-/*
+/*
* Timeout for stopping processes
*/
#define TIMEOUT (20 * HZ)

+static int tasks_frozen;
+
+static void set_tasks_frozen_flag(void)
+{
+ tasks_frozen = 1;
+}
+
+static void clear_tasks_frozen_flag(void)
+{
+ tasks_frozen = 0;
+}
+
+int tasks_are_frozen(void)
+{
+ return tasks_frozen;
+}
+
static inline int freezable(struct task_struct * p)
{
if ((p == current) ||
@@ -151,6 +168,8 @@ int freeze_processes(void)
error = try_to_freeze_tasks(false);
if (error)
goto Exit;
+
+ set_tasks_frozen_flag();
printk("done.");

oom_killer_disable();
@@ -189,6 +208,7 @@ void thaw_processes(void)
thaw_workqueues();
thaw_tasks(true);
thaw_tasks(false);
+ clear_tasks_frozen_flag();
schedule();
printk("done.");
}

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Similar topics