linux-kernel.vger.kernel.org archive mirror
From: Pekka Enberg <penberg@cs.helsinki.fi>
To: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>,
	Andi Kleen <andi@firstfloor.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haicheng.li@intel.com,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [patch] slab: add memory hotplug support
Date: Mon, 22 Mar 2010 19:28:54 +0200
Message-ID: <4BA7A8D6.4000706@cs.helsinki.fi>
In-Reply-To: <20100309134633.GM8653@laptop>

Nick Piggin wrote:
> On Mon, Mar 08, 2010 at 03:19:48PM -0800, David Rientjes wrote:
>> On Fri, 5 Mar 2010, Nick Piggin wrote:
>>
>>>> +#if defined(CONFIG_NUMA) && defined(CONFIG_MEMORY_HOTPLUG)
>>>> +/*
>>>> + * Drains and frees nodelists for a node on each slab cache, used for memory
>>>> + * hotplug.  Returns -EBUSY if all objects cannot be drained on memory
>>>> + * hot-remove so that the node is not removed.  When used because memory
>>>> + * hot-add is canceled, the only result is the freed kmem_list3.
>>>> + *
>>>> + * Must hold cache_chain_mutex.
>>>> + */
>>>> +static int __meminit free_cache_nodelists_node(int node)
>>>> +{
>>>> +	struct kmem_cache *cachep;
>>>> +	int ret = 0;
>>>> +
>>>> +	list_for_each_entry(cachep, &cache_chain, next) {
>>>> +		struct array_cache *shared;
>>>> +		struct array_cache **alien;
>>>> +		struct kmem_list3 *l3;
>>>> +
>>>> +		l3 = cachep->nodelists[node];
>>>> +		if (!l3)
>>>> +			continue;
>>>> +
>>>> +		spin_lock_irq(&l3->list_lock);
>>>> +		shared = l3->shared;
>>>> +		if (shared) {
>>>> +			free_block(cachep, shared->entry, shared->avail, node);
>>>> +			l3->shared = NULL;
>>>> +		}
>>>> +		alien = l3->alien;
>>>> +		l3->alien = NULL;
>>>> +		spin_unlock_irq(&l3->list_lock);
>>>> +
>>>> +		if (alien) {
>>>> +			drain_alien_cache(cachep, alien);
>>>> +			free_alien_cache(alien);
>>>> +		}
>>>> +		kfree(shared);
>>>> +
>>>> +		drain_freelist(cachep, l3, l3->free_objects);
>>>> +		if (!list_empty(&l3->slabs_full) ||
>>>> +					!list_empty(&l3->slabs_partial)) {
>>>> +			/*
>>>> +			 * Continue to iterate through each slab cache to free
>>>> +			 * as many nodelists as possible even though the
>>>> +			 * offline will be canceled.
>>>> +			 */
>>>> +			ret = -EBUSY;
>>>> +			continue;
>>>> +		}
>>>> +		kfree(l3);
>>>> +		cachep->nodelists[node] = NULL;
>>> What's stopping races of other CPUs trying to access l3 and array
>>> caches while they're being freed?
>>>
>> numa_node_id() will not return an offlined nodeid and cache_alloc_node() 
>> already does a fallback to other onlined nodes in case a nodeid is passed 
>> to kmalloc_node() that does not have a nodelist.  l3->shared and l3->alien 
>> cannot be accessed without l3->list_lock (drain, cache_alloc_refill, 
>> cache_flusharray) or cache_chain_mutex (kmem_cache_destroy, cache_reap).
> 
> Yeah, but can't it _have_ a nodelist (ie. before it is set to NULL here)
> while it is being accessed by another CPU and concurrently being freed
> on this one? 
> 
> 
>>>> +	}
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Onlines nid either as the result of memory hot-add or canceled hot-remove.
>>>> + */
>>>> +static int __meminit slab_node_online(int nid)
>>>> +{
>>>> +	int ret;
>>>> +	mutex_lock(&cache_chain_mutex);
>>>> +	ret = init_cache_nodelists_node(nid);
>>>> +	mutex_unlock(&cache_chain_mutex);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Offlines nid either as the result of memory hot-remove or canceled hot-add.
>>>> + */
>>>> +static int __meminit slab_node_offline(int nid)
>>>> +{
>>>> +	int ret;
>>>> +	mutex_lock(&cache_chain_mutex);
>>>> +	ret = free_cache_nodelists_node(nid);
>>>> +	mutex_unlock(&cache_chain_mutex);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static int __meminit slab_memory_callback(struct notifier_block *self,
>>>> +					unsigned long action, void *arg)
>>>> +{
>>>> +	struct memory_notify *mnb = arg;
>>>> +	int ret = 0;
>>>> +	int nid;
>>>> +
>>>> +	nid = mnb->status_change_nid;
>>>> +	if (nid < 0)
>>>> +		goto out;
>>>> +
>>>> +	switch (action) {
>>>> +	case MEM_GOING_ONLINE:
>>>> +	case MEM_CANCEL_OFFLINE:
>>>> +		ret = slab_node_online(nid);
>>>> +		break;
>>> This would explode if CANCEL_OFFLINE fails. Call it theoretical and
>>> put a panic() in here and I don't mind. Otherwise you get corruption
>>> somewhere in the slab code.
>>>
>> MEM_CANCEL_ONLINE would only fail here if a struct kmem_list3 couldn't be 
>> allocated anywhere on the system and if that happens then the node simply 
>> couldn't be allocated from (numa_node_id() would never return it as the 
>> cpu's node, so it's possible to fallback in this scenario).
> 
> Why would it never return the CPU's node? It's CANCEL_OFFLINE that is
> the problem.

So I was thinking of pushing this towards Linus, but I didn't see anyone
respond to Nick's concerns. I'm not that familiar with all this hotplug
stuff, so can someone also make Nick happy so we can move forward?

			Pekka
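
For readers following the thread, Nick's first concern (a nodelist being
looked up on one CPU while it is freed on another) can be modelled in
userspace. The sketch below is not kernel code and is not how the patch
resolves the issue. All names in it (node_lists, table_lock, node_online,
node_offline, node_alloc_object, node_free_object) are hypothetical, and a
single mutex stands in for whatever synchronization the real allocator
would need. It only illustrates the ordering being asked for: the pointer
is unpublished under the same lock that users take to look it up, and the
memory is freed only after that.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 4

/* Hypothetical stand-in for struct kmem_list3: just an in-use counter. */
struct node_lists {
	int objects_in_use;
};

/* Hypothetical stand-in for cachep->nodelists[]. */
static struct node_lists *nodelists[MAX_NODES];

/* One lock covers lookup, use and teardown (an assumption of this model). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Publish per-node state; returns 0 or -ENOMEM. */
int node_online(int node)
{
	int ret = 0;

	pthread_mutex_lock(&table_lock);
	if (!nodelists[node]) {
		nodelists[node] = calloc(1, sizeof(*nodelists[node]));
		if (!nodelists[node])
			ret = -ENOMEM;
	}
	pthread_mutex_unlock(&table_lock);
	return ret;
}

/* Look up and use the node state without dropping the lock in between. */
int node_alloc_object(int node)
{
	int ret = -ENOENT;

	pthread_mutex_lock(&table_lock);
	if (nodelists[node]) {
		nodelists[node]->objects_in_use++;
		ret = 0;
	}
	pthread_mutex_unlock(&table_lock);
	return ret;
}

/* Return one object to the node, if the node state still exists. */
void node_free_object(int node)
{
	pthread_mutex_lock(&table_lock);
	if (nodelists[node] && nodelists[node]->objects_in_use > 0)
		nodelists[node]->objects_in_use--;
	pthread_mutex_unlock(&table_lock);
}

/* Unpublish first, free afterwards; -EBUSY if objects remain. */
int node_offline(int node)
{
	struct node_lists *l3;
	int ret = 0;

	pthread_mutex_lock(&table_lock);
	l3 = nodelists[node];
	if (l3 && l3->objects_in_use)
		ret = -EBUSY;		/* keep the state, cancel the offline */
	else
		nodelists[node] = NULL;	/* no new user can find it now */
	pthread_mutex_unlock(&table_lock);

	if (!ret)
		free(l3);		/* l3 may be NULL; free(NULL) is a no-op */
	return ret;
}
```

In this model a caller that onlines node 1, allocates one object and then
tries to offline the node gets -EBUSY; after the object is returned the
offline succeeds and later allocations on that node see -ENOENT. A real
allocator would not serialize its fast path on one global lock, so this
shows the ordering only, not a proposed fix.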

