From: Nick Piggin <npiggin@suse.de>
To: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
	Andi Kleen <andi@firstfloor.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haicheng.li@intel.com,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [patch] slab: add memory hotplug support
Date: Wed, 10 Mar 2010 00:46:33 +1100
Message-ID: <20100309134633.GM8653@laptop>
In-Reply-To: <alpine.DEB.2.00.1003081502400.30456@chino.kir.corp.google.com>

On Mon, Mar 08, 2010 at 03:19:48PM -0800, David Rientjes wrote:
> On Fri, 5 Mar 2010, Nick Piggin wrote:
> 
> > > +#if defined(CONFIG_NUMA) && defined(CONFIG_MEMORY_HOTPLUG)
> > > +/*
> > > + * Drains and frees nodelists for a node on each slab cache, used for memory
> > > + * hotplug.  Returns -EBUSY if all objects cannot be drained on memory
> > > + * hot-remove so that the node is not removed.  When used because memory
> > > + * hot-add is canceled, the only result is the freed kmem_list3.
> > > + *
> > > + * Must hold cache_chain_mutex.
> > > + */
> > > +static int __meminit free_cache_nodelists_node(int node)
> > > +{
> > > +	struct kmem_cache *cachep;
> > > +	int ret = 0;
> > > +
> > > +	list_for_each_entry(cachep, &cache_chain, next) {
> > > +		struct array_cache *shared;
> > > +		struct array_cache **alien;
> > > +		struct kmem_list3 *l3;
> > > +
> > > +		l3 = cachep->nodelists[node];
> > > +		if (!l3)
> > > +			continue;
> > > +
> > > +		spin_lock_irq(&l3->list_lock);
> > > +		shared = l3->shared;
> > > +		if (shared) {
> > > +			free_block(cachep, shared->entry, shared->avail, node);
> > > +			l3->shared = NULL;
> > > +		}
> > > +		alien = l3->alien;
> > > +		l3->alien = NULL;
> > > +		spin_unlock_irq(&l3->list_lock);
> > > +
> > > +		if (alien) {
> > > +			drain_alien_cache(cachep, alien);
> > > +			free_alien_cache(alien);
> > > +		}
> > > +		kfree(shared);
> > > +
> > > +		drain_freelist(cachep, l3, l3->free_objects);
> > > +		if (!list_empty(&l3->slabs_full) ||
> > > +					!list_empty(&l3->slabs_partial)) {
> > > +			/*
> > > +			 * Continue to iterate through each slab cache to free
> > > +			 * as many nodelists as possible even though the
> > > +			 * offline will be canceled.
> > > +			 */
> > > +			ret = -EBUSY;
> > > +			continue;
> > > +		}
> > > +		kfree(l3);
> > > +		cachep->nodelists[node] = NULL;
> > 
> > What's stopping races of other CPUs trying to access l3 and array
> > caches while they're being freed?
> > 
> 
> numa_node_id() will not return an offlined nodeid and cache_alloc_node() 
> already does a fallback to other onlined nodes in case a nodeid is passed 
> to kmalloc_node() that does not have a nodelist.  l3->shared and l3->alien 
> cannot be accessed without l3->list_lock (drain, cache_alloc_refill, 
> cache_flusharray) or cache_chain_mutex (kmem_cache_destroy, cache_reap).

Yeah, but can't it _have_ a nodelist (i.e. before it is set to NULL here)
while it is being accessed by another CPU and concurrently being freed
on this one?
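
A minimal userspace sketch of the interleaving being asked about. It is a
model under stated assumptions, not the slab code: struct model_l3,
struct model_cache and every function name below are hypothetical, and the
kfree() in the patch is modeled by setting a flag so that the replay stays
well-defined. It only shows that a CPU which sampled the nodelist pointer
before the offline path ran still holds it after the pointer is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_list3; it only tracks liveness. */
struct model_l3 {
	bool freed;	/* set at the point where the patch calls kfree(l3) */
};

/* Hypothetical stand-in for cachep->nodelists[]. */
struct model_cache {
	struct model_l3 *nodelists[2];
};

/* Reader, step 1: another CPU samples the pointer, as an alloc path would. */
static struct model_l3 *reader_lookup(struct model_cache *c, int node)
{
	return c->nodelists[node];
}

/* Offline path in the patch's order: free first, clear the pointer after. */
static void offline_free_then_clear(struct model_cache *c, int node)
{
	assert(c->nodelists[node] != NULL);
	c->nodelists[node]->freed = true;	/* models kfree(l3) */
	c->nodelists[node] = NULL;
}

/* Reader, step 2: true if the reader would now touch freed memory. */
static bool reader_uses_stale(const struct model_l3 *l3)
{
	return l3 != NULL && l3->freed;
}

/* Replays the interleaving: lookup, concurrent offline, then use. */
static bool replay_race(void)
{
	struct model_l3 l3 = { .freed = false };
	struct model_cache c = { .nodelists = { &l3, NULL } };
	struct model_l3 *seen = reader_lookup(&c, 0);

	offline_free_then_clear(&c, 0);
	return reader_uses_stale(seen);
}
```

In this model the NULL check in a later lookup does not help the reader that
already holds the pointer; whether the real allocation paths can reach that
state is exactly the question above.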


> > > +	}
> > > +	return ret;
> > > +}
> > > +
> > > +/*
> > > + * Onlines nid either as the result of memory hot-add or canceled hot-remove.
> > > + */
> > > +static int __meminit slab_node_online(int nid)
> > > +{
> > > +	int ret;
> > > +	mutex_lock(&cache_chain_mutex);
> > > +	ret = init_cache_nodelists_node(nid);
> > > +	mutex_unlock(&cache_chain_mutex);
> > > +	return ret;
> > > +}
> > > +
> > > +/*
> > > + * Offlines nid either as the result of memory hot-remove or canceled hot-add.
> > > + */
> > > +static int __meminit slab_node_offline(int nid)
> > > +{
> > > +	int ret;
> > > +	mutex_lock(&cache_chain_mutex);
> > > +	ret = free_cache_nodelists_node(nid);
> > > +	mutex_unlock(&cache_chain_mutex);
> > > +	return ret;
> > > +}
> > > +
> > > +static int __meminit slab_memory_callback(struct notifier_block *self,
> > > +					unsigned long action, void *arg)
> > > +{
> > > +	struct memory_notify *mnb = arg;
> > > +	int ret = 0;
> > > +	int nid;
> > > +
> > > +	nid = mnb->status_change_nid;
> > > +	if (nid < 0)
> > > +		goto out;
> > > +
> > > +	switch (action) {
> > > +	case MEM_GOING_ONLINE:
> > > +	case MEM_CANCEL_OFFLINE:
> > > +		ret = slab_node_online(nid);
> > > +		break;
> > 
> > This would explode if CANCEL_OFFLINE fails. Call it theoretical and
> > put a panic() in here and I don't mind. Otherwise you get corruption
> > somewhere in the slab code.
> > 
> 
> MEM_CANCEL_ONLINE would only fail here if a struct kmem_list3 couldn't be 
> allocated anywhere on the system and if that happens then the node simply 
> couldn't be allocated from (numa_node_id() would never return it as the 
> cpu's node, so it's possible to fallback in this scenario).

Why would it never return the CPU's node? It's CANCEL_OFFLINE that is
the problem.


> Instead of doing this all at MEM_GOING_OFFLINE, we could delay freeing of 
> the array caches and the nodelist until MEM_OFFLINE.  We're guaranteed 
> that all pages are freed at that point so there are no existing objects 
> that we need to track and then if the offline fails from a different 
> callback it would be possible to reset the l3->nodelists[node] pointers 
> since they haven't been freed yet.
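
The deferred-free idea quoted above can be sketched as a small state model.
This is an illustration under assumptions, not a proposed patch: the enum,
struct model_cache, the "parked" field and model_callback() are all
hypothetical names, and detaching the pointer at the going-offline step is
one possible reading of the proposal. The point it shows is that a cancel
can restore the saved pointer without allocating anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of the three notifier actions under discussion. */
enum model_action { GOING_OFFLINE, CANCEL_OFFLINE, OFFLINE };

/* Hypothetical stand-in for struct kmem_list3. */
struct model_l3 {
	int free_objects;
};

struct model_cache {
	struct model_l3 *nodelist;	/* models cachep->nodelists[node] */
	struct model_l3 *parked;	/* assumed: detached but not yet freed */
};

static int model_callback(struct model_cache *c, enum model_action action)
{
	assert(c != NULL);

	switch (action) {
	case GOING_OFFLINE:
		/* Detach only; nothing is freed at this point. */
		c->parked = c->nodelist;
		c->nodelist = NULL;
		break;
	case CANCEL_OFFLINE:
		/* Put the saved pointer back; no allocation is needed here. */
		if (c->parked) {
			c->nodelist = c->parked;
			c->parked = NULL;
		}
		break;
	case OFFLINE:
		/* All pages are gone by now, so the structure can be freed. */
		free(c->parked);
		c->parked = NULL;
		break;
	}
	return 0;
}
```

Because the cancel step in this model never allocates, it has no failure
path; it does not by itself answer the concurrent-access question raised
earlier in the thread.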

