From: David Ahern <david.ahern@oracle.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>
Cc: linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	sparclinux@vger.kernel.org
Subject: Re: 4.0.0-rc4: panic in free_block
Date: Fri, 20 Mar 2015 16:53:12 +0000
Message-ID: <550C5078.8040402@oracle.com>
In-Reply-To: <CA+55aFxhNphSMrNvwqj0AQRzuqRdPG11J6DaazKWMb2U+H7wKg@mail.gmail.com>

On 3/20/15 10:48 AM, Linus Torvalds wrote:
> [ Added Davem and the sparc mailing list, since it happens on sparc
> and that just makes me suspicious ]
>
> On Fri, Mar 20, 2015 at 8:07 AM, David Ahern <david.ahern@oracle.com> wrote:
>> I can easily reproduce the panic below doing a kernel build with make -j N,
>> N=128, 256, etc. This is a 1024 cpu system running 4.0.0-rc4.
>
> 3.19 is fine? Because I don't think I've seen any reports like this
> for others, and what stands out is sparc (and to a lesser degree "1024
> cpus", which obviously gets a lot less testing)

I haven't tried 3.19 yet. Just backed up to 3.18 and it shows the same 
problem. And I can reproduce the 4.0 crash in a 128 cpu ldom (VM).

>
>> The top 3 frames are consistently:
>>      free_block+0x60
>>      cache_flusharray+0xac
>>      kmem_cache_free+0xfc
>>
>> After that one path has been from __mmdrop and the others are like below,
>> from remove_vma.
>>
>> Unable to handle kernel paging request at virtual address 0006100000000000
>
> One thing you *might* check is if the problem goes away if you select
> CONFIG_SLUB instead of CONFIG_SLAB. I'd really like to just get rid of
> SLAB. The whole "we have multiple different allocators" is a mess and
> causes test coverage issues.
>
> Apart from testing with CONFIG_SLUB, if 3.19 is ok and you seem to be
> able to "easily reproduce" this, the obvious thing to do is to try to
> bisect it.

I'll try SLUB. The ldom reboots 1000 times faster than resetting the h/w,
so there is a better chance of bisecting - if I can find a known good release.

David
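
For readers unfamiliar with why a known good release matters for bisecting: bisection is a binary search between one revision known to be good and one known to be bad. The sketch below is only an illustration of that idea, not git's implementation. The function name `first_bad`, the `is_bad` callback (standing in for "build, boot, run make -j N, see if it panics") and the revision labels are all hypothetical. It also assumes the bug, once introduced, stays present in every later revision.

```python
from typing import Callable, Sequence, Tuple


def first_bad(revisions: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad revision, number of revisions tested).

    Assumes revisions[0] is known good, revisions[-1] is known bad,
    and every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(revisions[mid]):
            hi = mid  # culprit is at mid or earlier
        else:
            lo = mid  # culprit is after mid
    return revisions[hi], tests


if __name__ == "__main__":
    # Hypothetical linear history of 1000 commits; the bug lands in c613.
    history = [f"c{i}" for i in range(1000)]
    culprit, tests = first_bad(history, lambda rev: int(rev[1:]) >= 613)
    print(culprit, tests)
```

Without a good endpoint the search has nothing to narrow from, which is why 3.18 also failing is a problem here. Each test is a full reboot cycle, so the faster ldom reboot directly shortens the whole search.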



