linux-kernel.vger.kernel.org archive mirror
From: Mel Gorman <mel@csn.ul.ie>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>,
	Larry Finger <Larry.Finger@lwfinger.net>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	Johannes Berg <johannes@sipsolutions.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	npiggin@suse.de
Subject: Re: [Bug #13319] Page allocation failures with b43 and p54usb
Date: Wed, 10 Jun 2009 16:56:26 +0100
Message-ID: <20090610155626.GA7951@csn.ul.ie>
In-Reply-To: <1244531201.5024.3.camel@penberg-laptop>

On Tue, Jun 09, 2009 at 10:06:41AM +0300, Pekka Enberg wrote:
> Hi Mel,
> 
> On Mon, 2009-06-08 at 15:12 +0100, Mel Gorman wrote:
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 65ffda5..b5acf18 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -1565,6 +1565,8 @@ new_slab:
> > >  		c->page = new;
> > >  		goto load_freelist;
> > >  	}
> > > +	printk(KERN_WARNING "SLUB: unable to satisfy allocation for cache %s (size=%d, node=%d, gfp=%x)\n",
> > > +		s->name, s->size, node, gfpflags);
> > 
> > size could be almost anything here for a casual reader. You are
> > outputting the size of the object plus its metadata so the name should
> > reflect that. I think it would be better to output objsize= and the
> > object size without the metadata overhead. What do you think?
> > 
> > In addition, include how many objects there are per-slab and include the
> > order being passed to the page allocator when allocating new slabs.
> > Would that be enough to determine if fallback-to-smaller orders occurred?
> 
> So how about something like this then?
> 
> 			Pekka
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 65ffda5..a03dbe8 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1484,6 +1484,58 @@ static inline int node_match(struct kmem_cache_cpu *c, int node)
>  	return 1;
>  }
>  
> +static int count_free(struct page *page)
> +{
> +	return page->objects - page->inuse;
> +}
> +
> +static unsigned long count_partial(struct kmem_cache_node *n,
> +					int (*get_count)(struct page *))
> +{
> +	unsigned long flags;
> +	unsigned long x = 0;
> +	struct page *page;
> +
> +	spin_lock_irqsave(&n->list_lock, flags);
> +	list_for_each_entry(page, &n->partial, lru)
> +		x += get_count(page);
> +	spin_unlock_irqrestore(&n->list_lock, flags);
> +	return x;
> +}
> +
> +static noinline void
> +slab_out_of_memory(struct kmem_cache *s, gfp_t gfpflags, int nid)
> +{
> +	int node;
> +
> +	printk(KERN_WARNING
> +		"SLUB: Unable to allocate memory on node %d (gfp=%x)\n",
> +		nid, gfpflags);
> +	printk(KERN_WARNING "  cache: %s, object size: %d, buffer size: %d, "
> +		"default order: %d, min order: %d\n", s->name, s->objsize,
> +		s->size, oo_order(s->oo), oo_order(s->min));
> +

Much nicer. There is a clear division between the object size and the
size including the metadata. We also now have a good idea of what sort
of request it was: we know which cache it came from, so we can guess the
size passed to kmalloc() with reasonable accuracy.
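
As a rough illustration of that guess, here is a minimal userspace sketch,
not kernel code. It assumes the general-purpose kmalloc caches are sized in
powers of two and that a request lands in the smallest cache that fits it.
The helper name and struct are hypothetical, and the odd-sized 96- and
192-byte caches and the minimum cache size are deliberately ignored.

```c
#include <stddef.h>

/*
 * Assumption: kmalloc caches are power-of-two sized, so a request is
 * served from the smallest cache whose object size is >= the request.
 * The 96/192-byte caches and the minimum cache size are not modelled.
 */
struct size_range {
	size_t lo;	/* smallest request that would pick this cache */
	size_t hi;	/* largest request that would pick this cache */
};

/* Hypothetical helper: reported object size -> plausible request sizes. */
static struct size_range guess_kmalloc_request(size_t objsize)
{
	struct size_range r;

	r.hi = objsize;		/* an exact fit is the upper bound */
	r.lo = objsize / 2 + 1;	/* anything smaller fits the next cache down */
	return r;
}
```

Under those assumptions, a failure reported against an object size of 4096
would point at a kmalloc() request of somewhere between 2049 and 4096 bytes.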

> +	for_each_online_node(node) {
> +		struct kmem_cache_node *n = get_node(s, node);
> +		unsigned long nr_partials;
> +		unsigned long nr_slabs;
> +		unsigned long nr_objs;
> +		unsigned long nr_free;
> +
> +		if (!n)
> +			continue;
> +
> +		nr_partials = n->nr_partial;
> +		nr_slabs = atomic_long_read(&n->nr_slabs);
> +		nr_objs = atomic_long_read(&n->total_objects);
> +		nr_free = count_partial(n, count_free);
> +
> +		printk(KERN_WARNING
> +			"  node %d: partials: %ld, slabs: %ld, objs: %ld, free: %ld\n",
> +			node, nr_partials, nr_slabs, nr_objs, nr_free);
> +	}
> +}

That looks like it would generate messages that are easier to debug with
and, to my not-expert-at-SLUB eye, it looks correct. Slap a changelog on it
with an example message and go with it. It should make page allocation
failure messages that go through SLUB a lot easier to figure out.
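
To show the shape of what gets reported, here is a minimal userspace sketch
that reuses the cache-line format string and the count_free() arithmetic
from the patch. The structs, the array standing in for the partial list and
the function names other than count_free() are hypothetical stand-ins, and
there is no locking because nothing here touches real kernel state.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the kmem_cache fields the patch prints. */
struct cache_info {
	const char *name;
	int objsize;	/* object size without metadata */
	int size;	/* object size including metadata */
	int order;	/* default slab order */
	int min_order;	/* minimum slab order */
};

/* Hypothetical stand-in for the two struct page fields that are used. */
struct slab_page {
	int objects;
	int inuse;
};

/* Same arithmetic as count_free() in the patch. */
static int count_free(const struct slab_page *page)
{
	return page->objects - page->inuse;
}

/* Sum free objects over an array standing in for n->partial. */
static unsigned long count_partial_free(const struct slab_page *pages,
					size_t nr)
{
	unsigned long x = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		x += count_free(&pages[i]);
	return x;
}

/* Format the "cache:" line with the format string from the patch. */
static int format_cache_line(char *buf, size_t len,
			     const struct cache_info *c)
{
	return snprintf(buf, len,
		"  cache: %s, object size: %d, buffer size: %d, "
		"default order: %d, min order: %d\n",
		c->name, c->objsize, c->size, c->order, c->min_order);
}
```

Feeding it made-up values is enough to check that the object size and the
buffer size are clearly distinguishable in the output, which was the point
of the earlier review comment.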

Thanks

> +
>  /*
>   * Slow path. The lockless freelist is empty or we need to perform
>   * debugging duties.
> @@ -1565,6 +1617,7 @@ new_slab:
>  		c->page = new;
>  		goto load_freelist;
>  	}
> +	slab_out_of_memory(s, gfpflags, node);
>  	return NULL;
>  debug:
>  	if (!alloc_debug_processing(s, c->page, object, addr))
> @@ -3318,20 +3371,6 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
>  }
>  
>  #ifdef CONFIG_SLUB_DEBUG
> -static unsigned long count_partial(struct kmem_cache_node *n,
> -					int (*get_count)(struct page *))
> -{
> -	unsigned long flags;
> -	unsigned long x = 0;
> -	struct page *page;
> -
> -	spin_lock_irqsave(&n->list_lock, flags);
> -	list_for_each_entry(page, &n->partial, lru)
> -		x += get_count(page);
> -	spin_unlock_irqrestore(&n->list_lock, flags);
> -	return x;
> -}
> -
>  static int count_inuse(struct page *page)
>  {
>  	return page->inuse;
> @@ -3342,11 +3381,6 @@ static int count_total(struct page *page)
>  	return page->objects;
>  }
>  
> -static int count_free(struct page *page)
> -{
> -	return page->objects - page->inuse;
> -}
> -
>  static int validate_slab(struct kmem_cache *s, struct page *page,
>  						unsigned long *map)
>  {
> 
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
