From: Nick Piggin <npiggin@suse.de>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>,
Andi Kleen <andi@firstfloor.org>,
Christoph Lameter <cl@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
haicheng.li@intel.com,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [patch] slab: add memory hotplug support
Date: Tue, 23 Mar 2010 08:12:02 +1100 [thread overview]
Message-ID: <20100322211202.GU17637@laptop> (raw)
In-Reply-To: <4BA7A8D6.4000706@cs.helsinki.fi>
On Mon, Mar 22, 2010 at 07:28:54PM +0200, Pekka Enberg wrote:
> Nick Piggin wrote:
> >On Mon, Mar 08, 2010 at 03:19:48PM -0800, David Rientjes wrote:
> >>On Fri, 5 Mar 2010, Nick Piggin wrote:
> >>
> >>>>+#if defined(CONFIG_NUMA) && defined(CONFIG_MEMORY_HOTPLUG)
> >>>>+/*
> >>>>+ * Drains and frees nodelists for a node on each slab cache, used for memory
> >>>>+ * hotplug. Returns -EBUSY if all objects cannot be drained on memory
> >>>>+ * hot-remove so that the node is not removed. When used because memory
> >>>>+ * hot-add is canceled, the only result is the freed kmem_list3.
> >>>>+ *
> >>>>+ * Must hold cache_chain_mutex.
> >>>>+ */
> >>>>+static int __meminit free_cache_nodelists_node(int node)
> >>>>+{
> >>>>+ struct kmem_cache *cachep;
> >>>>+ int ret = 0;
> >>>>+
> >>>>+ list_for_each_entry(cachep, &cache_chain, next) {
> >>>>+ struct array_cache *shared;
> >>>>+ struct array_cache **alien;
> >>>>+ struct kmem_list3 *l3;
> >>>>+
> >>>>+ l3 = cachep->nodelists[node];
> >>>>+ if (!l3)
> >>>>+ continue;
> >>>>+
> >>>>+ spin_lock_irq(&l3->list_lock);
> >>>>+ shared = l3->shared;
> >>>>+ if (shared) {
> >>>>+ free_block(cachep, shared->entry, shared->avail, node);
> >>>>+ l3->shared = NULL;
> >>>>+ }
> >>>>+ alien = l3->alien;
> >>>>+ l3->alien = NULL;
> >>>>+ spin_unlock_irq(&l3->list_lock);
> >>>>+
> >>>>+ if (alien) {
> >>>>+ drain_alien_cache(cachep, alien);
> >>>>+ free_alien_cache(alien);
> >>>>+ }
> >>>>+ kfree(shared);
> >>>>+
> >>>>+ drain_freelist(cachep, l3, l3->free_objects);
> >>>>+ if (!list_empty(&l3->slabs_full) ||
> >>>>+ !list_empty(&l3->slabs_partial)) {
> >>>>+ /*
> >>>>+ * Continue to iterate through each slab cache to free
> >>>>+ * as many nodelists as possible even though the
> >>>>+ * offline will be canceled.
> >>>>+ */
> >>>>+ ret = -EBUSY;
> >>>>+ continue;
> >>>>+ }
> >>>>+ kfree(l3);
> >>>>+ cachep->nodelists[node] = NULL;
> >>>What's stopping races of other CPUs trying to access l3 and array
> >>>caches while they're being freed?
> >>>
> >>numa_node_id() will not return an offlined nodeid and
> >>cache_alloc_node() already does a fallback to other onlined
> >>nodes in case a nodeid is passed to kmalloc_node() that does not
> >>have a nodelist. l3->shared and l3->alien cannot be accessed
> >>without l3->list_lock (drain, cache_alloc_refill,
> >>cache_flusharray) or cache_chain_mutex (kmem_cache_destroy,
> >>cache_reap).
> >
> >Yeah, but can't it _have_ a nodelist (ie. before it is set to NULL here)
> >while it is being accessed by another CPU and concurrently being freed
> >on this one?
> >
> >
> >>>>+ }
> >>>>+ return ret;
> >>>>+}
> >>>>+
> >>>>+/*
> >>>>+ * Onlines nid either as the result of memory hot-add or canceled hot-remove.
> >>>>+ */
> >>>>+static int __meminit slab_node_online(int nid)
> >>>>+{
> >>>>+ int ret;
> >>>>+ mutex_lock(&cache_chain_mutex);
> >>>>+ ret = init_cache_nodelists_node(nid);
> >>>>+ mutex_unlock(&cache_chain_mutex);
> >>>>+ return ret;
> >>>>+}
> >>>>+
> >>>>+/*
> >>>>+ * Offlines nid either as the result of memory hot-remove or canceled hot-add.
> >>>>+ */
> >>>>+static int __meminit slab_node_offline(int nid)
> >>>>+{
> >>>>+ int ret;
> >>>>+ mutex_lock(&cache_chain_mutex);
> >>>>+ ret = free_cache_nodelists_node(nid);
> >>>>+ mutex_unlock(&cache_chain_mutex);
> >>>>+ return ret;
> >>>>+}
> >>>>+
> >>>>+static int __meminit slab_memory_callback(struct notifier_block *self,
> >>>>+ unsigned long action, void *arg)
> >>>>+{
> >>>>+ struct memory_notify *mnb = arg;
> >>>>+ int ret = 0;
> >>>>+ int nid;
> >>>>+
> >>>>+ nid = mnb->status_change_nid;
> >>>>+ if (nid < 0)
> >>>>+ goto out;
> >>>>+
> >>>>+ switch (action) {
> >>>>+ case MEM_GOING_ONLINE:
> >>>>+ case MEM_CANCEL_OFFLINE:
> >>>>+ ret = slab_node_online(nid);
> >>>>+ break;
> >>>This would explode if CANCEL_OFFLINE fails. Call it theoretical and
> >>>put a panic() in here and I don't mind. Otherwise you get corruption
> >>>somewhere in the slab code.
> >>>
> >>MEM_CANCEL_ONLINE would only fail here if a struct kmem_list3
> >>couldn't be allocated anywhere on the system and if that happens
> >>then the node simply couldn't be allocated from (numa_node_id()
> >>would never return it as the cpu's node, so it's possible to
> >>fallback in this scenario).
> >
> >Why would it never return the CPU's node? It's CANCEL_OFFLINE that is
> >the problem.
>
> So I was thinking of pushing this towards Linus but I didn't see
> anyone respond to Nick's concerns. I'm not that familiar with all
> this hotplug stuff so can someone make also Nick happy so we can
> move forward?
I don't mind about the memory failure cases (just add a panic
there that should never really happen anyway, just to document
that a part is still missing).
I am more worried about the races. Maybe I just missed how they
are protected against.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-03-22 21:12 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-11 20:53 [PATCH] [0/4] Update slab memory hotplug series Andi Kleen
2010-02-11 20:54 ` [PATCH] [1/4] SLAB: Handle node-not-up case in fallback_alloc() v2 Andi Kleen
2010-02-11 21:41 ` David Rientjes
2010-02-11 21:55 ` Andi Kleen
2010-02-15 6:04 ` Nick Piggin
2010-02-15 10:07 ` Andi Kleen
2010-02-15 10:22 ` Nick Piggin
2010-02-11 20:54 ` [PATCH] [2/4] SLAB: Separate node initialization into separate function Andi Kleen
2010-02-11 21:44 ` David Rientjes
2010-02-11 20:54 ` [PATCH] [3/4] SLAB: Set up the l3 lists for the memory of freshly added memory v2 Andi Kleen
2010-02-11 21:45 ` David Rientjes
2010-02-15 6:06 ` Nick Piggin
2010-02-15 21:47 ` David Rientjes
2010-02-16 14:04 ` Nick Piggin
2010-02-16 20:45 ` Pekka Enberg
2010-02-11 20:54 ` [PATCH] [4/4] SLAB: Fix node add timer race in cache_reap Andi Kleen
2010-02-11 21:45 ` David Rientjes
2010-02-15 6:15 ` Nick Piggin
2010-02-15 10:32 ` Andi Kleen
2010-02-15 10:41 ` Nick Piggin
2010-02-15 10:52 ` Andi Kleen
2010-02-15 11:01 ` Nick Piggin
2010-02-15 15:30 ` Andi Kleen
2010-02-19 18:22 ` Christoph Lameter
2010-02-20 9:01 ` Andi Kleen
2010-02-22 10:53 ` Pekka Enberg
2010-02-22 14:31 ` Andi Kleen
2010-02-22 16:11 ` Pekka Enberg
2010-02-22 20:20 ` Andi Kleen
2010-02-24 15:49 ` Christoph Lameter
2010-02-25 7:26 ` Pekka Enberg
2010-02-25 8:01 ` David Rientjes
2010-02-25 18:30 ` Christoph Lameter
2010-02-25 21:45 ` David Rientjes
2010-02-25 22:31 ` Christoph Lameter
2010-02-26 10:45 ` Pekka Enberg
2010-02-26 11:43 ` Andi Kleen
2010-02-26 12:35 ` Pekka Enberg
2010-02-26 14:08 ` Andi Kleen
2010-02-26 1:09 ` KAMEZAWA Hiroyuki
2010-02-26 11:41 ` Andi Kleen
2010-02-26 15:04 ` Christoph Lameter
2010-02-26 15:05 ` Christoph Lameter
2010-02-26 15:59 ` Andi Kleen
2010-02-26 15:57 ` Andi Kleen
2010-02-26 17:24 ` Christoph Lameter
2010-02-26 17:31 ` Andi Kleen
2010-03-01 1:59 ` KAMEZAWA Hiroyuki
2010-03-01 10:27 ` David Rientjes
2010-02-27 0:01 ` David Rientjes
2010-03-01 10:24 ` [patch] slab: add memory hotplug support David Rientjes
2010-03-02 5:53 ` Pekka Enberg
2010-03-02 20:20 ` Christoph Lameter
2010-03-02 21:03 ` David Rientjes
2010-03-03 1:28 ` KAMEZAWA Hiroyuki
2010-03-03 2:39 ` David Rientjes
2010-03-03 2:51 ` KAMEZAWA Hiroyuki
2010-03-02 12:53 ` Andi Kleen
2010-03-02 15:04 ` Pekka Enberg
2010-03-03 14:34 ` Andi Kleen
2010-03-03 15:46 ` Christoph Lameter
2010-03-02 21:17 ` David Rientjes
2010-03-05 6:20 ` Nick Piggin
2010-03-05 12:47 ` Anca Emanuel
2010-03-05 13:58 ` Anca Emanuel
2010-03-05 14:11 ` Christoph Lameter
2010-03-08 3:06 ` Andi Kleen
2010-03-08 2:58 ` Andi Kleen
2010-03-08 23:19 ` David Rientjes
2010-03-09 13:46 ` Nick Piggin
2010-03-22 17:28 ` Pekka Enberg
2010-03-22 21:12 ` Nick Piggin [this message]
2010-03-28 2:13 ` David Rientjes
2010-03-28 2:40 ` [patch v2] " David Rientjes
2010-03-30 9:01 ` Pekka Enberg
2010-03-30 16:43 ` Christoph Lameter
2010-04-04 20:45 ` David Rientjes
2010-04-07 16:29 ` Pekka Enberg
2010-02-25 18:34 ` [PATCH] [4/4] SLAB: Fix node add timer race in cache_reap Christoph Lameter
2010-02-25 18:46 ` Pekka Enberg
2010-02-25 19:19 ` Christoph Lameter
2010-03-02 12:55 ` Andi Kleen
2010-02-19 18:22 ` Christoph Lameter
2010-02-22 10:57 ` Pekka Enberg
2010-02-13 10:24 ` [PATCH] [0/4] Update slab memory hotplug series Pekka Enberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100322211202.GU17637@laptop \
--to=npiggin@suse.de \
--cc=andi@firstfloor.org \
--cc=cl@linux-foundation.org \
--cc=haicheng.li@intel.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=penberg@cs.helsinki.fi \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).