[RFC] slub: Per object NUMA support

April 15th, 2011 - 04:00 pm ET by Christoph Lameter
I am not sure if such a feature is needed/wanted/desired. It would make
the object allocation method similar to SLAB instead of relying on page
based policy application (which IMHO was the intent of the memory policy
system before Paul Jackson got that changed in SLAB).

Anyways the implementation is rather simple.





Currently slub applies NUMA policies per allocated slab page. Change
that to apply memory policies for each individual object allocated.

F.e. before this patch MPOL_INTERLEAVE would return objects from the
same slab page until a new slab page was allocated. Now an object
from a different page is taken for each allocation.

This increases the overhead of the fastpath under NUMA.

Signed-off-by: Christoph Lameter <cl@linux.com>


mm/slub.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

Index: linux-2.6/mm/slub.c
--- linux-2.6.orig/mm/slub.c 2011-04-15 12:54:42.000000000 -0500
+++ linux-2.6/mm/slub.c 2011-04-15 13:11:25.000000000 -0500
@@ -1887,6 +1887,21 @@ debug:
 	goto unlock_out;
 }

+static __always_inline int alternate_slab_node(struct kmem_cache *s,
+						gfp_t flags, int node)
+{
+#ifdef CONFIG_NUMA
+	if (unlikely(node == NUMA_NO_NODE &&
+			!(flags & __GFP_THISNODE) &&
+			!in_interrupt())) {
+		if ((s->flags & SLAB_MEM_SPREAD) && cpuset_do_slab_mem_spread())
+			node = cpuset_slab_spread_node();
+		else if (current->mempolicy)
+			node = slab_node(current->mempolicy);
+	}
+#endif
+	return node;
+}
 /*
  * Inlined fastpath so that allocation functions (kmalloc, kmem_cache_alloc)
  * have the fastpath folded into their functions. So no function call
@@ -1911,6 +1926,7 @@ static __always_inline void *slab_alloc(
 	if (slab_pre_alloc_hook(s, gfpflags))
 		return NULL;

+	node = alternate_slab_node(s, gfpflags, node);
 #ifndef CONFIG_CMPXCHG_LOCAL
 	local_irq_save(flags);
 #else
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
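
For readers who want to trace the node selection outside the kernel, here is a small userspace model of the override added by the patch. It is only a sketch under stated assumptions: struct task_model, interleave_next(), pick_node(), NO_NODE and NR_NODES are hypothetical names invented for this illustration, the interleave policy is modelled as a plain round robin over four nodes, and the SLAB_MEM_SPREAD/cpuset branch is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE  (-1)	/* stands in for NUMA_NO_NODE */
#define NR_NODES 4	/* hypothetical node count */

/* Hypothetical stand-in for the task state the patch consults. */
struct task_model {
	bool in_interrupt;	/* models in_interrupt() */
	bool has_interleave;	/* models a task with an interleave mempolicy */
	int il_next;		/* next node the modelled policy hands out */
};

/* Round-robin node choice, assumed here as the interleave behaviour. */
static int interleave_next(struct task_model *t)
{
	int node = t->il_next;

	assert(node >= 0 && node < NR_NODES);
	t->il_next = (t->il_next + 1) % NR_NODES;
	return node;
}

/*
 * Model of the override: a policy node is chosen only if the caller did
 * not name a node, did not ask for "this node only", and is not running
 * in interrupt context. Otherwise the caller's node is returned as is.
 */
static int pick_node(struct task_model *t, bool thisnode, int node)
{
	if (node == NO_NODE && !thisnode && !t->in_interrupt) {
		if (t->has_interleave)
			node = interleave_next(t);
	}
	return node;
}
```

With an interleave task, repeated pick_node(&t, false, NO_NODE) calls in this model walk through nodes 0, 1, 2, 3 and wrap around, one node per object, while a caller that passes an explicit node or the "this node only" flag is left alone.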

#1 David Rientjes
May 11th, 2011 - 05:20 pm ET
On Fri, 15 Apr 2011, Christoph Lameter wrote:

> I am not sure if such a feature is needed/wanted/desired. It would make
> the object allocation method similar to SLAB instead of relying on page
> based policy application (which IMHO was the intent of the memory policy
> system before Paul Jackson got that changed in SLAB).
>
> Anyways the implementation is rather simple.

The implementation may be simple, but it seems like this would absolutely
kill performance for MPOL_INTERLEAVE if slub is required for every
kmalloc() or kmem_cache_alloc() to take the alternate slab node's
list_lock to scan the partial list or, worse yet, allocate a new slab on
that node when there are objects available on the freelist.

That, to me, would always nullify the performance benefit of using the
mempolicy in the first place and end up making MPOL_INTERLEAVE worse than
no mempolicy. Do you have any benchmarks that suggest this has no
negative impact? I'd be very surprised.
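
The concern above can be made concrete with a toy userspace model. It is not kernel code and not a benchmark. It assumes the per-cpu slab sits on exactly one node at a time, that a request for any other node forces the slow path (list_lock plus partial list scan, or a new slab), and that the slow path then replaces the per-cpu slab. count_slow_paths() and NR_NODES are hypothetical names, and the figure of 32 objects per slab page used in the note after the code is an arbitrary example value.

```c
#include <assert.h>

#define NR_NODES 4	/* hypothetical node count */

/*
 * Count how often an allocation misses the per-cpu slab in this model.
 * objs_per_policy_step = 1 models the policy being applied per object;
 * a larger value models the policy being applied once per slab page
 * holding that many objects.
 */
static long count_slow_paths(long nr_allocs, long objs_per_policy_step)
{
	int cpu_slab_node = 0;	/* node of the current per-cpu slab */
	int want = 0;		/* node the interleave policy asks for */
	long slow = 0;

	assert(objs_per_policy_step > 0);

	for (long i = 0; i < nr_allocs; i++) {
		/* Advance the interleave target every objs_per_policy_step objects. */
		if (i > 0 && i % objs_per_policy_step == 0)
			want = (want + 1) % NR_NODES;

		if (want != cpu_slab_node) {
			slow++;			/* modelled slow path */
			cpu_slab_node = want;	/* per-cpu slab is replaced */
		}
	}
	return slow;
}
```

In this model, 1024 allocations with a per-object policy step give count_slow_paths(1024, 1) == 1023, while a policy step of 32 objects gives count_slow_paths(1024, 32) == 31. The model says nothing about how expensive each slow path is on real hardware, which is what a benchmark would have to show.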

#2 Christoph Lameter
May 12th, 2011 - 10:50 am ET
On Wed, 11 May 2011, David Rientjes wrote:

>> Anyways the implementation is rather simple.
>
> The implementation may be simple, but it seems like this would absolutely
> kill performance for MPOL_INTERLEAVE if slub is required for every
> kmalloc() or kmem_cache_alloc() to take the alternate slab node's
> list_lock to scan the partial list or, worse yet, allocate a new slab on
> that node when there are objects available on the freelist.

Right. SLAB does something similar though. We could optimize it more by
taking locks as is done there and avoid the switching of the per cpu slabs.

> That, to me, would always nullify the performance benefit of using the
> mempolicy in the first place and end up making MPOL_INTERLEAVE worse than
> no mempolicy. Do you have any benchmarks that suggest this has no
> negative impact? I'd be very surprised.

The default is no memory policy. You would only use MPOL_INTERLEAVE on
large NUMA systems that incur sufficient cross-node latency to justify the
interleave.