[PATCH RFC V4 0/3] kvm: Improving directed yield in PLE handler

July 16th, 2012 - 04:30 am ET by Raghavendra K T | Report spam
Currently Pause Loop Exit (PLE) handler is doing directed yield to a
random vcpu on pl-exit. We already have filtering while choosing
the candidate to yield_to. This change adds more checks while choosing
a candidate to yield_to.

On a large vcpu guests, there is a high probability of
yielding to the same vcpu who had recently done a pause-loop exit.
Such a yield can lead to the vcpu spinning again.

The patchset keeps track of the pause loop exit and gives chance to a
vcpu which has:

(a) Not done pause loop exit at all (probably he is preempted lock-holder)

(b) vcpu skipped in last iteration because it did pause loop exit, and
probably has become eligible now (next eligible lock holder)

This concept also helps in cpu relax interception cases which use same handler.

Changes since V3:
- arch specific fix/changes (Christian)

Changes since v2:
- Move ple structure to common code (Avi)
- rename pause_loop_exited to cpu_relax_intercepted (Avi)
- add config HAVE_KVM_CPU_RELAX_INTERCEPT (Avi)
- Drop superfluous curly braces (Ingo)

Changes since v1:
- Add more documentation for structure and algorithm and Rename
plo ==> ple (Rik).
- change dy_eligible initial value to false. (otherwise very first directed
yield will not be skipped. (Nikunj)
- fixup signoff/from issue

Future enhancements:
(1) Currently we have a boolean to decide on eligibility of vcpu. It
would be nice if I get feedback on guest (>32 vcpu) whether we can
improve better with integer counter. (with counter = say f(log n )).

(2) We have not considered system load during iteration of vcpu. With
that information we can limit the scan and also decide whether schedule()
is better. [ I am able to use #kicked vcpus to decide on this But may
be there are better ideas like information from global loadavg.]

(3) We can exploit this further with PV patches since it also knows about
next eligible lock-holder.

Summary: There is a very good improvement for moderate / no over-commit scenario
for kvm based guest on PLE machine.

Results: kernbench improves by around 30%, 6% for 1x,2x respectively
ebizzy improves by around 87%, 23% for 1x,2x respectively

Note: The patches are tested on x86.

Links
V1: https://lkml.org/lkml/2012/7/9/32
V2: https://lkml.org/lkml/2012/7/10/392
V3: https://lkml.org/lkml/2012/7/12/437

Raghavendra K T (3):
config: Add config to support ple or cpu relax optimzation
kvm : Note down when cpu relax intercepted or pause loop exited
kvm : Choose a better candidate for directed yield

arch/s390/kvm/Kconfig | 1 +
arch/x86/kvm/Kconfig | 1 +
include/linux/kvm_host.h | 42 ++++++++++++++++++++++++++++++++++++++++++
virt/kvm/Kconfig | 3 +++
virt/kvm/kvm_main.c | 40 ++++++++++++++++++++++++++++++++++++++++
5 files changed, 87 insertions(+), 0 deletions(-)

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
email Follow the discussionReplies 14 repliesReplies Make a reply

Similar topics

Replies

#1 Raghavendra K T
July 16th, 2012 - 04:30 am ET | Report spam
From: Raghavendra K T

Noting pause loop exited vcpu or cpu relax intercepted helps in
filtering right candidate to yield. Wrong selection of vcpu;
i.e., a vcpu that just did a pl-exit or cpu relax intercepted may
contribute to performance degradation.

Signed-off-by: Raghavendra K T

V2 was:
Reviewed-by: Rik van Riel

include/linux/kvm_host.h | 37 +++++++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 5 +++++
2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c446435..fa353bd 100644
a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -183,6 +183,18 @@ struct kvm_vcpu {
} async_pf;
#endif

+#ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
+ /*
+ * Cpu relax intercept or pause loop exit optimization
+ * cpu_relax_intercepted: set when a vcpu does a pause loop exit
+ * or cpu relax intercepted.
+ * dy_eligible: indicates whether vcpu is eligible for directed yield.
+ */
+ struct {
+ bool cpu_relax_intercepted;
+ bool dy_eligible;
+ } ple;
+#endif
struct kvm_vcpu_arch arch;
};

@@ -890,5 +902,30 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
}
}

+#ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
+
+static inline void kvm_vcpu_set_cpu_relax_intercepted(struct kvm_vcpu *vcpu,
+ bool val)
+{
+ vcpu->ple.cpu_relax_intercepted = val;
+}
+static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
+{
+ vcpu->ple.dy_eligible = val;
+}
+
+#else /* !CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
+
+static inline void kvm_vcpu_set_cpu_relax_intercepted(struct kvm_vcpu *vcpu,
+ bool val)
+{
+}
+
+static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
+{
+}
+
+#endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
+
#endif

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7e14068..c50213f 100644
a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -236,6 +236,9 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
}
vcpu->run = page_address(page);

+ kvm_vcpu_set_cpu_relax_intercepted(vcpu, false);
+ kvm_vcpu_set_dy_eligible(vcpu, false);
+
r = kvm_arch_vcpu_init(vcpu);
if (r < 0)
goto fail_free_run;
@@ -1577,6 +1580,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
int pass;
int i;

+ kvm_vcpu_set_cpu_relax_intercepted(me, true);
/*
* We boost the priority of a VCPU that is runnable but not
* currently running, because it got preempted by something
@@ -1602,6 +1606,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
}
}
}
+ kvm_vcpu_set_cpu_relax_intercepted(me, false);
}
EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Replies Reply to this message
#2 Raghavendra K T
July 16th, 2012 - 04:30 am ET | Report spam
From: Raghavendra K T

Currently, on a large vcpu guests, there is a high probability of
yielding to the same vcpu who had recently done a pause-loop exit or
cpu relax intercepted. Such a yield can lead to the vcpu spinning
again and hence degrade the performance.

The patchset keeps track of the pause loop exit/cpu relax interception
and gives chance to a vcpu which:
(a) Has not done pause loop exit or cpu relax intercepted at all
(probably he is preempted lock-holder)
(b) Was skipped in last iteration because it did pause loop exit or
cpu relax intercepted, and probably has become eligible now
(next eligible lock holder)

Signed-off-by: Raghavendra K T

V2 was:
Reviewed-by: Rik van Riel

include/linux/kvm_host.h | 5 +++++
virt/kvm/kvm_main.c | 35 +++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fa353bd..cfb38a2 100644
a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -925,6 +925,11 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
{
}

+static inline bool kvm_vcpu_check_and_update_eligible(struct kvm_vcpu *vcpu)
+{
+ return true;
+}
+
#endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */

#endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c50213f..6a82580 100644
a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1571,6 +1571,39 @@ bool kvm_vcpu_yield_to(struct kvm_vcpu *target)
}
EXPORT_SYMBOL_GPL(kvm_vcpu_yield_to);

+#ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
+/*
+ * Helper that checks whether a VCPU is eligible for directed yield.
+ * Most eligible candidate to yield is decided by following heuristics:
+ *
+ * (a) VCPU which has not done pl-exit or cpu relax intercepted recently
+ * (preempted lock holder), indicated by @cpu_relax_intercepted.
+ * Set at the beiginning and cleared at the end of interception/PLE handler.
+ *
+ * (b) VCPU which has done pl-exit/ cpu relax intercepted but did not get
+ * chance last time (mostly it has become eligible now since we have probably
+ * yielded to lockholder in last iteration. This is done by toggling
+ * @dy_eligible each time a VCPU checked for eligibility.)
+ *
+ * Yielding to a recently pl-exited/cpu relax intercepted VCPU before yielding
+ * to preempted lock-holder could result in wrong VCPU selection and CPU
+ * burning. Giving priority for a potential lock-holder increases lock
+ * progress.
+ */
+bool kvm_vcpu_check_and_update_eligible(struct kvm_vcpu *vcpu)
+{
+ bool eligible;
+
+ eligible = !vcpu->ple.cpu_relax_intercepted ||
+ (vcpu->ple.cpu_relax_intercepted &&
+ vcpu->ple.dy_eligible);
+
+ if (vcpu->ple.cpu_relax_intercepted)
+ vcpu->ple.dy_eligible = !vcpu->ple.dy_eligible;
+
+ return eligible;
+}
+#endif
void kvm_vcpu_on_spin(struct kvm_vcpu *me)
{
struct kvm *kvm = me->kvm;
@@ -1599,6 +1632,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
continue;
if (waitqueue_active(&vcpu->wq))
continue;
+ if (!kvm_vcpu_check_and_update_eligible(vcpu))
+ continue;
if (kvm_vcpu_yield_to(vcpu)) {
kvm->last_boosted_vcpu = i;
yielded = 1;

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Replies Reply to this message
#3 Raghavendra K T
July 16th, 2012 - 04:30 am ET | Report spam
From: Raghavendra K T

Suggested-by: Avi Kivity
Signed-off-by: Raghavendra K T

arch/s390/kvm/Kconfig | 1 +
arch/x86/kvm/Kconfig | 1 +
virt/kvm/Kconfig | 3 +++
3 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index 78eb984..a6e2677 100644
a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
depends on HAVE_KVM && EXPERIMENTAL
select PREEMPT_NOTIFIERS
select ANON_INODES
+ select HAVE_KVM_CPU_RELAX_INTERCEPT
help
Support hosting paravirtualized guest machines using the SIE
virtualization capability on the mainframe. This should work
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..45c044f 100644
a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -37,6 +37,7 @@ config KVM
select TASK_DELAY_ACCT
select PERF_EVENTS
select HAVE_KVM_MSI
+ select HAVE_KVM_CPU_RELAX_INTERCEPT
help
Support hosting fully virtualized guest machines using hardware
virtualization extensions. You will need a fairly recent
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 28694f4..d01b24b 100644
a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -21,3 +21,6 @@ config KVM_ASYNC_PF

config HAVE_KVM_MSI
bool
+
+config HAVE_KVM_CPU_RELAX_INTERCEPT
+ bool

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Replies Reply to this message
#4 Avi Kivity
July 16th, 2012 - 06:10 am ET | Report spam
On 07/16/2012 11:25 AM, Raghavendra K T wrote:
From: Raghavendra K T

Currently, on a large vcpu guests, there is a high probability of
yielding to the same vcpu who had recently done a pause-loop exit or
cpu relax intercepted. Such a yield can lead to the vcpu spinning
again and hence degrade the performance.

The patchset keeps track of the pause loop exit/cpu relax interception
and gives chance to a vcpu which:
(a) Has not done pause loop exit or cpu relax intercepted at all
(probably he is preempted lock-holder)
(b) Was skipped in last iteration because it did pause loop exit or
cpu relax intercepted, and probably has become eligible now
(next eligible lock holder)



+#ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
+/*
+ * Helper that checks whether a VCPU is eligible for directed yield.
+ * Most eligible candidate to yield is decided by following heuristics:
+ *
+ * (a) VCPU which has not done pl-exit or cpu relax intercepted recently
+ * (preempted lock holder), indicated by @cpu_relax_intercepted.
+ * Set at the beiginning and cleared at the end of interception/PLE handler.
+ *
+ * (b) VCPU which has done pl-exit/ cpu relax intercepted but did not get
+ * chance last time (mostly it has become eligible now since we have probably
+ * yielded to lockholder in last iteration. This is done by toggling
+ * @dy_eligible each time a VCPU checked for eligibility.)
+ *
+ * Yielding to a recently pl-exited/cpu relax intercepted VCPU before yielding
+ * to preempted lock-holder could result in wrong VCPU selection and CPU
+ * burning. Giving priority for a potential lock-holder increases lock
+ * progress.
+ */
+bool kvm_vcpu_check_and_update_eligible(struct kvm_vcpu *vcpu)



Predicates' names should give a hint as to what true and false returns
mean. For example vcpu_eligible_for_directed_yield().

+{
+ bool eligible;
+
+ eligible = !vcpu->ple.cpu_relax_intercepted ||
+ (vcpu->ple.cpu_relax_intercepted &&
+ vcpu->ple.dy_eligible);
+
+ if (vcpu->ple.cpu_relax_intercepted)
+ vcpu->ple.dy_eligible = !vcpu->ple.dy_eligible;



Probably should assign 'true', since the previous value is essentially
random.

+
+ return eligible;
+}



You're accessing another vcpu's data structures without any locking.
This is probably okay since we're not basing any life or death decisions
on this, but a comment would be good to explain to readers that this has
been considered and is okay (and why).




error compiling committee.c: too many arguments to function


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Replies Reply to this message
#5 Avi Kivity
July 16th, 2012 - 06:10 am ET | Report spam
On 07/16/2012 11:25 AM, Raghavendra K T wrote:
From: Raghavendra K T

Noting pause loop exited vcpu or cpu relax intercepted helps in
filtering right candidate to yield. Wrong selection of vcpu;
i.e., a vcpu that just did a pl-exit or cpu relax intercepted may
contribute to performance degradation.

Signed-off-by: Raghavendra K T

V2 was:
Reviewed-by: Rik van Riel

include/linux/kvm_host.h | 37 +++++++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 5 +++++
2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c446435..fa353bd 100644
a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -183,6 +183,18 @@ struct kvm_vcpu {
} async_pf;
#endif

+#ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
+ /*
+ * Cpu relax intercept or pause loop exit optimization
+ * cpu_relax_intercepted: set when a vcpu does a pause loop exit
+ * or cpu relax intercepted.
+ * dy_eligible: indicates whether vcpu is eligible for directed yield.
+ */
+ struct {
+ bool cpu_relax_intercepted;
+ bool dy_eligible;
+ } ple;



Naming:

- ple is the x86 name, but this is common code. I'd call it spin_loop.
- cpu_relax_intercepted: can call it in_spin_loop, the original name can
me "are we intercepting cpu_relax or not" which may be useful at one
point (when we know we aren't overcommitted)
- dy_eligible: I'd like to understand the algorithm more and would want
the name to reflect it, will come to this later if I have a better idea.

+#endif



blank line.

struct kvm_vcpu_arch arch;
};





error compiling committee.c: too many arguments to function


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Replies Reply to this message
Help Create a new topicNext page Replies Make a reply
Search Make your own search