public inbox for linux-kernel@vger.kernel.org
* kerneloops.org: 2.6.26-rc possible regression in ext3
@ 2008-06-19  5:34 Arjan van de Ven
  2008-06-19  6:01 ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19  5:34 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Linus Torvalds, linux-ext4, Andrew Morton

In the kerneloops.org stats, a new oops is rapidly climbing the charts.
The oops is a page fault in the ext3 "do_split" function, and the first
report of it was with 2.6.26-rc6-git3.

It happens with various applications; the backtraces are at:

http://www.kerneloops.org/search.php?search=do_split

but are generally of this pattern:

*do_split
ext3_add_entry
ext3_rename
vfs_rename
... <various paths into vfs_rename> ...

or

*do_split
? add_dirent_to_buf
ext3_add_entry
ext3_new_inode
ext3_add_nondir
ext3_create
vfs_create
....


did we change anything in ext3 this cycle?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* kerneloops.org: 2.6.26-rc possible regression in ext3
@ 2008-06-19  5:36 Arjan van de Ven
  2008-06-19  5:42 ` Dave Airlie
  2008-06-19 14:00 ` Eric Sandeen
  0 siblings, 2 replies; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19  5:36 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Linus Torvalds, linux-ext4, Andrew Morton

In the kerneloops.org stats, a new oops is rapidly climbing the charts.
The oops is a page fault in the ext3 "do_split" function, and the first
report of it was with 2.6.26-rc6-git3.

It happens with various applications; the backtraces are at:

http://www.kerneloops.org/search.php?search=do_split

but are generally of this pattern:

*do_split
ext3_add_entry
ext3_rename
vfs_rename
... <various paths into vfs_rename> ...

or

*do_split
? add_dirent_to_buf
ext3_add_entry
ext3_new_inode
ext3_add_nondir
ext3_create
vfs_create
....


did we change anything in ext3 this cycle?



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  5:36 kerneloops.org: 2.6.26-rc possible regression in ext3 Arjan van de Ven
@ 2008-06-19  5:42 ` Dave Airlie
  2008-06-19  5:48   ` Arjan van de Ven
                     ` (2 more replies)
  2008-06-19 14:00 ` Eric Sandeen
  1 sibling, 3 replies; 24+ messages in thread
From: Dave Airlie @ 2008-06-19  5:42 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ext4,
	Andrew Morton

On Thu, Jun 19, 2008 at 3:36 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> In the kerneloops.org stats, a new oops is rapidly climbing the charts.
> The oops is a page fault in the ext3 "do_split" function, and the first
> report of it was with 2.6.26-rc6-git3.
>
> It happens with various applications; the backtraces are at:
>
> http://www.kerneloops.org/search.php?search=do_split
>

This is a bug in rawhide in gcc miscompiling something...

https://bugzilla.redhat.com/show_bug.cgi?id=451068

Dave.


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  5:42 ` Dave Airlie
@ 2008-06-19  5:48   ` Arjan van de Ven
  2008-06-19  6:42   ` Linus Torvalds
  2008-06-19  8:11   ` Adrian Bunk
  2 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19  5:48 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ext4,
	Andrew Morton

Dave Airlie wrote:
> On Thu, Jun 19, 2008 at 3:36 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
>> In the kerneloops.org stats, a new oops is rapidly climbing the charts.
>> The oops is a page fault in the ext3 "do_split" function, and the first
>> report of it was with 2.6.26-rc6-git3.
>>
>> It happens with various applications; the backtraces are at:
>>
>> http://www.kerneloops.org/search.php?search=do_split
>>
> 
> This is a bug in rawhide in gcc miscompiling something...
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=451068
> 

thanks for letting us know so fast!
I've marked this one in the database as a fedora gcc bug


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  5:34 Arjan van de Ven
@ 2008-06-19  6:01 ` Linus Torvalds
  2008-06-19  6:09   ` Arjan van de Ven
                     ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Linus Torvalds @ 2008-06-19  6:01 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linux Kernel Mailing List, linux-ext4, Andrew Morton, Al Viro



On Wed, 18 Jun 2008, Arjan van de Ven wrote:
>
> In the kerneloops.org stats, a new oops is rapidly climbing the charts.
> The oops is a page fault in the ext3 "do_split" function, and the first
> report of it was with 2.6.26-rc6-git3.

Interesting.

> It happens with various applications; the backtraces are at:
> 
> http://www.kerneloops.org/search.php?search=do_split
> 
> but are generally of this pattern:
> 
> *do_split
> ext3_add_entry
> ext3_rename
> vfs_rename
> ... <various paths into vfs_rename> ...
> 
> or
> 
> *do_split
> ? add_dirent_to_buf
> ext3_add_entry
> ext3_new_inode
> ext3_add_nondir
> ext3_create
> vfs_create
> ....
> 
> did we change anything in ext3 this cycle?

I'm not seeing anything relevant, but I'm adding Al to the cc, since 
the r/o bind mounts did change fs/namei.c and vfs_create/mkdir in 
particular. Not that I see why that would trigger either, but the changes 
to fs/ext3/namei.c seem to be even _less_ interesting than that.

One thing I note is that all the oopses seem to be i686 - are there that 
few x86-64 fc10 users (I'd have assumed that 64-bit is starting to be the 
norm for people who live on the edge, but perhaps I'm just out of touch)? 

Or could this perhaps be an indication that it is specific to i686 in some 
way (e.g. a compiler issue)?

		Linus


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  6:01 ` Linus Torvalds
@ 2008-06-19  6:09   ` Arjan van de Ven
  2008-06-19  6:12   ` Arjan van de Ven
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19  6:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, linux-ext4, Andrew Morton, Al Viro

Linus Torvalds wrote:
> Or could this perhaps be an indication that it is specific to i686 some 
> way (eg a compiler issue?)
> 

Dave Airlie just confirmed this is a compiler bug indeed in gcc 4.3.1
and pointed at https://bugzilla.redhat.com/show_bug.cgi?id=451068



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  6:01 ` Linus Torvalds
  2008-06-19  6:09   ` Arjan van de Ven
@ 2008-06-19  6:12   ` Arjan van de Ven
  2008-06-19  6:14   ` Linus Torvalds
  2008-06-20 15:34   ` Bill Nottingham
  3 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19  6:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Andrew Morton, Al Viro

Linus Torvalds wrote:
> One thing I note is that all the oopses seem to be i686 - are there that 
> few x86-64 fc10 users (I'd have assumed that 64-bit is starting to be the 
> norm for people who live on the edge, but perhaps I'm just out of touch)? 
> 

for rawhide the 64/32 ratio seems to be 106/135
for fedora 9 the 64/32 ratio is 4946/13636

(nr of oopses for the specific architecture/releases)

so your assumption that the experimental rawhide users are more likely to use 64-bit seems to be quite correct.


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  6:01 ` Linus Torvalds
  2008-06-19  6:09   ` Arjan van de Ven
  2008-06-19  6:12   ` Arjan van de Ven
@ 2008-06-19  6:14   ` Linus Torvalds
  2008-06-19  6:40     ` Linus Torvalds
  2008-06-20 15:34   ` Bill Nottingham
  3 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2008-06-19  6:14 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linux Kernel Mailing List, linux-ext4, Andrew Morton, Al Viro



On Wed, 18 Jun 2008, Linus Torvalds wrote:
> 
> One thing I note is that all the oopses seem to be i686 - are there that 
> few x86-64 fc10 users (I'd have assumed that 64-bit is starting to be the 
> norm for people who live on the edge, but perhaps I'm just out of touch)? 
> 
> Or could this perhaps be an indication that it is specific to i686 some 
> way (eg a compiler issue?)

The oops code is odd:

  27:	8d 4c 18 fe          	lea    0xfffffffe(%eax,%ebx,1),%ecx
  2b:*	8b 19                	mov    (%ecx),%ebx     <-- trapping instruction
  2d:	83 e9 08             	sub    $0x8,%ecx
  30:	89 d8                	mov    %ebx,%eax
  32:	66 d1 e8             	shr    %ax
  35:	0f b7 c0             	movzwl %ax,%eax

and that "lea" is doing an address computation of "eax+2*ebx-2". Which 
does *not* look like an address to a 32-bit entity, but to a 16-bit one. 
Yeah, it's not conclusive, but it is suggestive.

And the 16-bit "shr+movzwl" further strengthens the case that it is 
actually working on a 16-bit entity. The trapping instruction _should_ 
possibly have been a "movzwl (%ecx),%ebx" to begin with.

But it did a 32-bit load, and in this case it looks as if the 16-bit load 
would have been correct! The value of ECX in this example was

	ECX: dc384ffe

ie it was indeed a two-byte aligned thing at the end of the page, and if 
the load had been a 16-bit load (like the data seems to be), it would 
never have oopsed! The page fault seems to be due to DEBUG_PAGEALLOC and 
the next page being unmapped because it's not allocated.
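
To make the page-boundary arithmetic concrete, here is a minimal C sketch (an illustration, not code from the kernel or from the oops report). It assumes 4096-byte pages, as on i686, and the helper name `load_crosses_page` is hypothetical. It checks whether a load of a given width starting at an address spills into the following page:

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on 32-bit x86. */
#define PAGE_SIZE 4096u

/*
 * Hypothetical helper: returns 1 if a load of `width` bytes starting at
 * `addr` touches the page after the one containing `addr`, else 0.
 */
static int load_crosses_page(uint32_t addr, uint32_t width)
{
	uint32_t offset_in_page = addr & (PAGE_SIZE - 1);

	return offset_in_page + width > PAGE_SIZE;
}
```

With the ECX value from the oops, `load_crosses_page(0xdc384ffe, 2)` is 0 and `load_crosses_page(0xdc384ffe, 4)` is 1, which is consistent with only the 32-bit load faulting when the next page is unmapped.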

I only looked closer at one particular oops (25906, in case anybody 
cares), but at least judging from that particular one I would indeed 
suspect a compiler bug.

Of course, the main reason I say that is that none of the ext3 or VFS 
changes look even _remotely_ relevant to any of this. They really don't 
look like they could possibly matter for "do_split()" unless there is 
something really odd going on.

			Linus


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  6:14   ` Linus Torvalds
@ 2008-06-19  6:40     ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2008-06-19  6:40 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linux Kernel Mailing List, linux-ext4, Andrew Morton, Al Viro



On Wed, 18 Jun 2008, Linus Torvalds wrote:
> 
> and that "lea" is doing an address computation of "eax+2*ebx-2". Which 
> does *not* look like an address to a 32-bit entity, but to a 16-bit one. 
> Yeah, it's not conclusive, but it is suggestive.

I'm wrong, that's just "eax+ebx-2". The *2 was just a brainfart on my 
part.

But I think I have pinpointed where it comes from: it's the 

	struct dx_map_entry *map;

which is a structure like this:

	struct dx_map_entry
	{
	        u32 hash;
	        u16 offs;
	        u16 size;
	};

and it does look like it's the

	if (size + map[i].size/2 > blocksize/2)

calculation, where "i" counts backwards from "count-1" to 0.

In particular, the code

  27:	8d 4c 18 fe          	lea    0xfffffffe(%eax,%ebx,1),%ecx
  2b:*	8b 19                	mov    (%ecx),%ebx     <-- trapping instruction
  2d:	83 e9 08             	sub    $0x8,%ecx
  30:	89 d8                	mov    %ebx,%eax
  32:	66 d1 e8             	shr    %ax
  35:	0f b7 c0             	movzwl %ax,%eax
  38:	8d 04 02             	lea    (%edx,%eax,1),%eax

seems to be that "size + map[i].size/2" calculation, but I have a hard 
time trying to line it up with what _my_ compiler gives me. But the nearest 
match I have is:

        movw    6(%ecx), %bx    # <variable>.size, D.21305
        subl    $8, %ecx        #, ivtmp.921
        movl    -104(%ebp), %edx        # blocksize, tmp179
        movl    %ebx, %eax      # D.21305, tmp176
        shrw    %ax     # tmp176
        movzwl  %ax, %eax       # tmp176, tmp177
        leal    (%esi,%eax), %eax       #, tmp178

which seems to be largely the same thing (except I have a "movw" to load 
the size, and %ecx is offset by one 'map' entry - so the offset is 6 (in 
the memop) instead of that "-2" (from the lea)).
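
The two offsets can be reconciled with a small C sketch (illustrative only). It assumes `u32` and `u16` correspond to `uint32_t` and `uint16_t` and that the structure has the usual layout with no padding; the helper names are hypothetical. The `size` field sits 6 bytes into an 8-byte entry, which is the same as 2 bytes before the start of the next entry, so a 4-byte load from it reads 2 bytes past the end of the entry:

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the structure quoted above, with fixed-width types. */
struct dx_map_entry {
	uint32_t hash;
	uint16_t offs;
	uint16_t size;
};

/* Offset of `size` relative to the start of its own entry (the "6"). */
static long size_offset_from_entry(void)
{
	return (long)offsetof(struct dx_map_entry, size);
}

/* Offset of `size` relative to the start of the next entry (the "-2"). */
static long size_offset_from_next_entry(void)
{
	return size_offset_from_entry() - (long)sizeof(struct dx_map_entry);
}

/* Bytes read beyond the end of the entry by a load of `width` bytes. */
static long bytes_past_entry(long width)
{
	long end = size_offset_from_entry() + width;
	long over = end - (long)sizeof(struct dx_map_entry);

	return over > 0 ? over : 0;
}
```

Here `bytes_past_entry(2)` is 0 and `bytes_past_entry(4)` is 2; for the last map entry in a page, those 2 extra bytes fall in the next page.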

I think I'll give up, but that's the closest match I can find. No 
guarantees, but it seems to support the notion of "wrong 32-bit load where 
it should have used a 16-bit one".

		Linus


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  5:42 ` Dave Airlie
  2008-06-19  5:48   ` Arjan van de Ven
@ 2008-06-19  6:42   ` Linus Torvalds
  2008-06-19  7:09     ` Arjan van de Ven
  2008-06-19  8:11   ` Adrian Bunk
  2 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2008-06-19  6:42 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Arjan van de Ven, Linux Kernel Mailing List, linux-ext4,
	Andrew Morton



On Thu, 19 Jun 2008, Dave Airlie wrote:
> 
> This is a bug in rawhide in gcc miscompiling something...
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=451068

Gaah. I should read all my email instead of wasting my time trying to 
match up the code with what I can reproduce..

		Linus


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  6:42   ` Linus Torvalds
@ 2008-06-19  7:09     ` Arjan van de Ven
  0 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19  7:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, Linux Kernel Mailing List, linux-ext4, Andrew Morton

Linus Torvalds wrote:
> 
> On Thu, 19 Jun 2008, Dave Airlie wrote:
>> This is a bug in rawhide in gcc miscompiling something...
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=451068
> 
> Gaah. I should read all my email instead of wasting my time trying to 
> match up the code with what I can reproduce..
> 

unfortunately, kerneloops.org didn't pick up the link to this bug (due to the fact
that the oops in the bug was a jpeg....)... maybe one day if I'm really bored
I'll implement OCR into it ;)

sorry about wasting your time



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  5:42 ` Dave Airlie
  2008-06-19  5:48   ` Arjan van de Ven
  2008-06-19  6:42   ` Linus Torvalds
@ 2008-06-19  8:11   ` Adrian Bunk
  2008-06-19  8:32     ` Mikael Pettersson
  2008-06-19 13:40     ` Arjan van de Ven
  2 siblings, 2 replies; 24+ messages in thread
From: Adrian Bunk @ 2008-06-19  8:11 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Arjan van de Ven, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

On Thu, Jun 19, 2008 at 03:42:34PM +1000, Dave Airlie wrote:
> On Thu, Jun 19, 2008 at 3:36 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> > In the kerneloops.org stats, a new oops is rapidly climbing the charts.
> > The oops is a page fault in the ext3 "do_split" function, and the first
> > report of it was with 2.6.26-rc6-git3.
> >
> > It happens with various applications; the backtraces are at:
> >
> > http://www.kerneloops.org/search.php?search=do_split
> 
> This is a bug in rawhide in gcc miscompiling something...
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=451068

If I understand it correctly that's a bug in upstream gcc 4.3.1
(but not in gcc 4.3.0)?

Expect a lot more of this to pop up in the future.
Should we #error for gcc 4.3.1?
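
For reference, a version-based rejection could look roughly like the sketch below. It relies only on GCC's predefined `__GNUC__`, `__GNUC_MINOR__` and `__GNUC_PATCHLEVEL__` macros; the `GCC_VERSION_CODE` name and the error text are made up for illustration, and nothing here is taken from an actual kernel patch:

```c
/* Illustrative only: encode the compiler version as a single number. */
#define GCC_VERSION_CODE \
	(__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)

/* Refuse to build with exactly gcc 4.3.1 (version code 40301). */
#if defined(__GNUC__) && GCC_VERSION_CODE == 40301
#error "gcc 4.3.1 is suspected of miscompiling ext3 do_split(); use another gcc"
#endif
```

A check like this only sees the reported version number, so it cannot tell a patched distribution build from the upstream release.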

> Dave.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  8:11   ` Adrian Bunk
@ 2008-06-19  8:32     ` Mikael Pettersson
  2008-06-19 10:49       ` Adrian Bunk
  2008-06-19 13:40     ` Arjan van de Ven
  1 sibling, 1 reply; 24+ messages in thread
From: Mikael Pettersson @ 2008-06-19  8:32 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Dave Airlie, Arjan van de Ven, Linux Kernel Mailing List,
	Linus Torvalds, linux-ext4, Andrew Morton

Adrian Bunk writes:
 > On Thu, Jun 19, 2008 at 03:42:34PM +1000, Dave Airlie wrote:
 > > On Thu, Jun 19, 2008 at 3:36 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
 > > > In the kerneloops.org stats, a new oops is rapidly climbing the charts.
 > > > The oops is a page fault in the ext3 "do_split" function, and the first
 > > > report of it was with 2.6.26-rc6-git3.
 > > >
 > > > It happens with various applications; the backtraces are at:
 > > >
 > > > http://www.kerneloops.org/search.php?search=do_split
 > > 
 > > This is a bug in rawhide in gcc miscompiling something...
 > > 
 > > https://bugzilla.redhat.com/show_bug.cgi?id=451068
 > 
 > If I understand it correctly that's a bug in upstream gcc 4.3.1
 > (but not in gcc 4.3.0)?
 > 
 > Expect a lot more of this to pop up in the future.
 > Should we #error for gcc 4.3.1?

There are other nasty bugs in gcc-4.3.0. I actually
had to completely ban 4.3.0 in a user-space project
I'm involved with (Erlang) due to gcc PR36339 (fixed
in 4.3.1).

What's the gcc bugzilla number for this new 4.3.1 bug?


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  8:32     ` Mikael Pettersson
@ 2008-06-19 10:49       ` Adrian Bunk
  0 siblings, 0 replies; 24+ messages in thread
From: Adrian Bunk @ 2008-06-19 10:49 UTC (permalink / raw)
  To: Mikael Pettersson
  Cc: Dave Airlie, Arjan van de Ven, Linux Kernel Mailing List,
	Linus Torvalds, linux-ext4, Andrew Morton

On Thu, Jun 19, 2008 at 10:32:24AM +0200, Mikael Pettersson wrote:
>...
> What's the gcc bugzilla number for this new 4.3.1 bug?

#36533

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  8:11   ` Adrian Bunk
  2008-06-19  8:32     ` Mikael Pettersson
@ 2008-06-19 13:40     ` Arjan van de Ven
  2008-06-19 15:10       ` Adrian Bunk
  1 sibling, 1 reply; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19 13:40 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Dave Airlie, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

Adrian Bunk wrote:
> 
> Expect a lot more of this to pop up in the future.
> Should we #error for gcc 4.3.1?
> 

it's better to find if the gcc guys made a testcase for this bug (they normally do) and
test based on that.


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  5:36 kerneloops.org: 2.6.26-rc possible regression in ext3 Arjan van de Ven
  2008-06-19  5:42 ` Dave Airlie
@ 2008-06-19 14:00 ` Eric Sandeen
  2008-06-19 14:07   ` Arjan van de Ven
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Sandeen @ 2008-06-19 14:00 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ext4,
	Andrew Morton

Arjan van de Ven wrote:
> In the kerneloops.org stats, a new oops is rapidly climbing the charts.
> The oops is a page fault in the ext3 "do_split" function, and the first
> report of it was with 2.6.26-rc6-git3.
> 
> It happens with various applications; the backtraces are at:
> 
> http://www.kerneloops.org/search.php?search=do_split

Arjan, I was just looking at kerneloops last night, seeing the count for
this oops climb, and was wishing there were some way to annotate an oops
signature with more info.  If I could have tagged this with the RH
bugzilla nr. it might have saved a lot of time for folks.  Is this
feasible?  Or is finding the oops text in bugzilla the only way?

Thanks,

-Eric



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 14:00 ` Eric Sandeen
@ 2008-06-19 14:07   ` Arjan van de Ven
  2008-06-19 14:17     ` Eric Sandeen
  0 siblings, 1 reply; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19 14:07 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ext4,
	Andrew Morton

Eric Sandeen wrote:
> Arjan van de Ven wrote:
>> In the kerneloops.org stats, a new oops is rapidly climbing the charts.
>> The oops is a page fault in the ext3 "do_split" function, and the first
>> report of it was with 2.6.26-rc6-git3.
>>
>> It happens with various applications; the backtraces are at:
>>
>> http://www.kerneloops.org/search.php?search=do_split
> 
> Arjan, I was just looking at kerneloops last night, seeing the count for
> this oops climb, and was wishing there were some way to annotate an oops
> signature with more info.  If I could have tagged this with the RH
> bugzilla nr. it might have saved a lot of time for folks.  Is this
> feasible?  Or is finding the oops text in bugzilla the only way?
> 

there's a way to add a description to oopses (you might have seen some of these
descriptions already); however I've not implemented an account system yet so for
now it's only me who can add these.


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 14:07   ` Arjan van de Ven
@ 2008-06-19 14:17     ` Eric Sandeen
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Sandeen @ 2008-06-19 14:17 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ext4,
	Andrew Morton

Arjan van de Ven wrote:
> Eric Sandeen wrote:
>> Arjan van de Ven wrote:
>>> In the kerneloops.org stats, a new oops is rapidly climbing the charts.
>>> The oops is a page fault in the ext3 "do_split" function, and the first
>>> report of it was with 2.6.26-rc6-git3.
>>>
>>> It happens with various applications; the backtraces are at:
>>>
>>> http://www.kerneloops.org/search.php?search=do_split
>> Arjan, I was just looking at kerneloops last night, seeing the count for
>> this oops climb, and was wishing there were some way to annotate an oops
>> signature with more info.  If I could have tagged this with the RH
>> bugzilla nr. it might have saved a lot of time for folks.  Is this
>> feasible?  Or is finding the oops text in bugzilla the only way?
>>
> 
> there's a way to add a description to oopses (you might have seen some of these
> descriptions already); however I've not implemented an account system yet so for
> now it's only me who can add these.

Ok, that was my guess.  I'll shoot you an email next time.  :)

Thanks,
-Eric


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 13:40     ` Arjan van de Ven
@ 2008-06-19 15:10       ` Adrian Bunk
  2008-06-19 15:18         ` Arjan van de Ven
  0 siblings, 1 reply; 24+ messages in thread
From: Adrian Bunk @ 2008-06-19 15:10 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Dave Airlie, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

On Thu, Jun 19, 2008 at 06:40:05AM -0700, Arjan van de Ven wrote:
> Adrian Bunk wrote:
>>
>> Expect a lot more of this to pop up in the future.
>> Should we #error for gcc 4.3.1?
>
> it's better to find if the gcc guys made a testcase for this bug (they normally do) and
> test based on that.

The gcc Bugzilla contains a testcase.

But how do you plan to integrate it into a kernel build?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 15:10       ` Adrian Bunk
@ 2008-06-19 15:18         ` Arjan van de Ven
  2008-06-19 15:25           ` Adrian Bunk
  0 siblings, 1 reply; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19 15:18 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Dave Airlie, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

Adrian Bunk wrote:
> On Thu, Jun 19, 2008 at 06:40:05AM -0700, Arjan van de Ven wrote:
>> Adrian Bunk wrote:
>>> Expect a lot more of this to pop up in the future.
>>> Should we #error for gcc 4.3.1?
>> it's better to find if the gcc guys made a testcase for this bug (they normally do) and
>> test based on that.
> 
> The gcc Bugzilla contains a testcase.
> 
> But how do you plan to integrate it into a kernel build?

we already have several of these.
Just look at scripts/gcc-x86_64-has-stack-protector.sh for an example of such a beast.


* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 15:18         ` Arjan van de Ven
@ 2008-06-19 15:25           ` Adrian Bunk
  2008-06-19 15:27             ` Arjan van de Ven
  0 siblings, 1 reply; 24+ messages in thread
From: Adrian Bunk @ 2008-06-19 15:25 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Dave Airlie, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

On Thu, Jun 19, 2008 at 08:18:39AM -0700, Arjan van de Ven wrote:
> Adrian Bunk wrote:
>> On Thu, Jun 19, 2008 at 06:40:05AM -0700, Arjan van de Ven wrote:
>>> Adrian Bunk wrote:
>>>> Expect a lot more of this to pop up in the future.
>>>> Should we #error for gcc 4.3.1?
>>> it's better to find if the gcc guys made a testcase for this bug (they normally do) and
>>> test based on that.
>>
>> The gcc Bugzilla contains a testcase.
>>
>> But how do you plan to integrate it into a kernel build?
>
> we already have several of these.
> Just look at scripts/gcc-x86_64-has-stack-protector.sh for an example of such a beast.

Checking whether gcc supports some flags is easy.

But miscompilations are a different issue.

Especially since we also want to reject broken gcc versions for cross 
compilations.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 15:25           ` Adrian Bunk
@ 2008-06-19 15:27             ` Arjan van de Ven
  2008-06-19 15:43               ` Adrian Bunk
  0 siblings, 1 reply; 24+ messages in thread
From: Arjan van de Ven @ 2008-06-19 15:27 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Dave Airlie, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

Adrian Bunk wrote:
> On Thu, Jun 19, 2008 at 08:18:39AM -0700, Arjan van de Ven wrote:
>> Adrian Bunk wrote:
>>> On Thu, Jun 19, 2008 at 06:40:05AM -0700, Arjan van de Ven wrote:
>>>> Adrian Bunk wrote:
>>>>> Expect a lot more of this to pop up in the future.
>>>>> Should we #error for gcc 4.3.1?
>>>> it's better to find if the gcc guys made a testcase for this bug (they normally do) and
>>>> test based on that.
>>> The gcc Bugzilla contains a testcase.
>>>
>>> But how do you plan to integrate it into a kernel build?
>> we already have several of these.
>> Just look at scripts/gcc-x86_64-has-stack-protector.sh for an example of such a beast.
> 
> Checking whether gcc supports some flags is easy.

have you actually looked at this script?
You didn't, since the script doesn't check if gcc supports some flag.
It checks very specifically for a code generation pattern...

Please go look at the script first before responding.



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19 15:27             ` Arjan van de Ven
@ 2008-06-19 15:43               ` Adrian Bunk
  0 siblings, 0 replies; 24+ messages in thread
From: Adrian Bunk @ 2008-06-19 15:43 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Dave Airlie, Linux Kernel Mailing List, Linus Torvalds,
	linux-ext4, Andrew Morton

On Thu, Jun 19, 2008 at 08:27:48AM -0700, Arjan van de Ven wrote:
> Adrian Bunk wrote:
>> On Thu, Jun 19, 2008 at 08:18:39AM -0700, Arjan van de Ven wrote:
>>> Adrian Bunk wrote:
>>>> On Thu, Jun 19, 2008 at 06:40:05AM -0700, Arjan van de Ven wrote:
>>>>> Adrian Bunk wrote:
>>>>>> Expect a lot more of this to pop up in the future.
>>>>>> Should we #error for gcc 4.3.1?
>>>>> it's better to find if the gcc guys made a testcase for this bug (they normally do) and
>>>>> test based on that.
>>>> The gcc Bugzilla contains a testcase.
>>>>
>>>> But how do you plan to integrate it into a kernel build?
>>> we already have several of these.
>>> Just look at scripts/gcc-x86_64-has-stack-protector.sh for an example of such a beast.
>>
>> Checking whether gcc supports some flags is easy.
>
> have you actually looked at this script?
> You didn't, since the script doesn't check if gcc supports some flag.
> It checks very specifically for a code generation pattern...
>
> Please go look at the script first before responding.

I did look, but I missed the last pipe...

Do we know for sure this bug can only trigger on 32bit x86?

Or is there anything else I'm missing in gcc-x86_64-has-stack-protector.sh 
that allows using this approach to check for wrong code generation 
caused by platform-independent gcc bugs?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: kerneloops.org: 2.6.26-rc possible regression in ext3
  2008-06-19  6:01 ` Linus Torvalds
                     ` (2 preceding siblings ...)
  2008-06-19  6:14   ` Linus Torvalds
@ 2008-06-20 15:34   ` Bill Nottingham
  3 siblings, 0 replies; 24+ messages in thread
From: Bill Nottingham @ 2008-06-20 15:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arjan van de Ven, Linux Kernel Mailing List, linux-ext4,
	Andrew Morton, Al Viro

Linus Torvalds (torvalds@linux-foundation.org) said: 
> Or could this perhaps be an indication that it is specific to i686 some 
> way (eg a compiler issue?)

Yes. https://bugzilla.redhat.com/show_bug.cgi?id=451068

Bill
