linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
       [not found] <AANLkTinbqG7sXxf82wc516snLoae1DtCWjo+VtsPx2P3@mail.gmail.com>
@ 2010-11-22 23:47 ` Andrew Morton
  2010-11-23 14:55   ` Robert Święcki
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2010-11-22 23:47 UTC (permalink / raw)
  To: Robert Święcki; +Cc: linux-kernel, linux-mm

(cc linux-mm)

On Mon, 15 Nov 2010 15:45:42 +0100
Robert __wi__cki <robert@swiecki.net> wrote:

> Hi,
> 
> I was doing some fuzzing with http://code.google.com/p/iknowthis/ and
> my system pretty quickly crashes with the BUG() below.

So it is a repeatable crash?

> - Even if it's just BUG() it renders my system unusable (I'm able to
> type a few characters on the virtual terminal at most)
> - Judging from the stacktrace it's sys_madvise(..., ..., MADV_REMOVE)
> - I'm testing with ubuntu's 2.6.35-22-server#35 but I got similar
> results with 2.6.32 some time ago

It is.

> - I'm posting this cause diving into linux mm spaghetti code might be
> not a trivial task, but if nobody can see anything obvious in a day or
> so, I'll try to debug it mysel
> - I'm unable to provide a testcase by now, nor any usable state of the
> crashing process, cause the system becomes unusable
> - It crashes both linux-kernel working on a physical machine as well
> as on the VirtualBox emulator
> - I'm usually waiting from 0.5h to 12h for this crash to appear, I
> think it could be speed up greatly by disabling any irrelevant
> syscalls in the fuzzer
> 
> [25142.286531] kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!

That's

	BUG_ON(page_mapped(page));

in __remove_from_page_cache().  That state is worth a BUG().

> [25142.290627] invalid opcode: 0000 [#2] SMP
> [25142.290627] last sysfs file: /sys/fs/ext4/sda1/lifetime_write_kbytes
> [25142.301039] CPU 1
> [25142.301039] Modules linked in: dccp_ipv6 decnet llc2 x25 pppoe
> pppox irda crc_ccitt dccp_ipv4 dccp ipx p8023 rds af_key netrom econet
> can atm l2cap bluetooth appletalk rose ax25 nouveau snd_hda_codec_idt
> ttm snd_hda_intel drm_kms_helper snd_hda_codec led_class drm psmouse
> snd_hwdep ppdev snd_pcm snd_timer i2c_algo_bit serio_raw i82975x_edac
> snd parport_pc lp soundcore edac_core snd_page_alloc parport dcdbas
> floppy ahci firewire_ohci firewire_core crc_itu_t tg3 libahci
> [25142.301039]
> [25142.301039] Pid: 22612, comm: iknowthis Tainted: G      D W
> 2.6.35-22-server #35-Ubuntu 0GH911/Precision WorkStation 390
> [25142.301039] RIP: 0010:[<ffffffff81101645>]  [<ffffffff81101645>]
> __remove_from_page_cache+0xd5/0xe0
> [25142.367617] RSP: 0000:ffff8800cb8a3cf8  EFLAGS: 00010002
> [25142.367617] RAX: 0100000000100029 RBX: ffffea0002159670 RCX: 0000000000000000
> [25142.367617] RDX: 0000000000000001 RSI: 0000000000000016 RDI: ffff880100000700
> [25142.367617] RBP: ffff8800cb8a3d08 R08: 0000000000000018 R09: 0000000000000000
> [25142.367617] R10: ffffea0002159678 R11: 0000000000000000 R12: ffff8800cb865188
> [25142.367617] R13: ffff8800cb865188 R14: ffff8800cb8a3d88 R15: 0000000000000000
> [25142.367617] FS:  0000000000000000(0000) GS:ffff880001e40000(0063)
> knlGS:00000000f75276c0
> [25142.367617] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> [25142.367617] CR2: 00000000080ba1fc CR3: 000000012091d000 CR4: 00000000000006e0
> [25142.367617] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [25142.367617] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [25142.367617] Process iknowthis (pid: 22612, threadinfo
> ffff8800cb8a2000, task ffff8800cb9516e0)
> [25142.367617] Stack:
> [25142.367617]  ffffea0002159670 ffff8800cb8651a0 ffff8800cb8a3d28
> ffffffff81101686
> [25142.367617] <0> ffffea0002159670 ffff8800cb865188 ffff8800cb8a3d48
> ffffffff8110c6d9
> [25142.367617] <0> 0000000000000001 0000000000000008 ffff8800cb8a3e38
> ffffffff8110cce0
> [25142.367617] Call Trace:
> [25142.367617]  [<ffffffff81101686>] remove_from_page_cache+0x36/0x60
> [25142.367617]  [<ffffffff8110c6d9>] truncate_inode_page+0x79/0xb0
> [25142.367617]  [<ffffffff8110cce0>] truncate_inode_pages_range+0x160/0x460
> [25142.367617]  [<ffffffff81121250>] ? unmap_mapping_range+0xb0/0x160
> [25142.367617]  [<ffffffff81121398>] vmtruncate_range+0x98/0x100
> [25142.367617]  [<ffffffff8111ae5a>] madvise_vma+0x17a/0x210
> [25142.367617]  [<ffffffff8111b055>] sys_madvise+0x165/0x290
> [25142.367617]  [<ffffffff810467c3>] ia32_sysret+0x0/0x5
> [25142.367617] Code: 00 00 e8 1f 8a 1c 00 48 89 df 57 9d 0f 1f 44 00
> 00 5b 41 5c c9 c3 be 16 00 00 00 48 89 df e8 a3 68 01 00 48 8b 03 e9
> 71 ff ff ff <0f> 0b eb fe eb 05 90 90 90 90 90 55 48 89 e5 48 83 ec 10
> 48 89
> [25142.367617] RIP  [<ffffffff81101645>] __remove_from_page_cache+0xd5/0xe0
> [25142.367617]  RSP <ffff8800cb8a3cf8>
> [25142.367617] ---[ end trace a262aa785b417723 ]---

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2010-11-22 23:47 ` kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128! Andrew Morton
@ 2010-11-23 14:55   ` Robert Święcki
  2010-11-29 23:25     ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Święcki @ 2010-11-23 14:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm

>> Hi,
>>
>> I was doing some fuzzing with http://code.google.com/p/iknowthis/ and
>> my system pretty quickly crashes with the BUG() below.
>
> So it is a repeatable crash?

Not in a sense that I can provide you with a sequence of syscalls that
led to this state. Generally it repeats after some time (<12 hours on
2 intel-core) of running the
http://code.google.com/p/iknowthis/source/browse/#svn/trunk

>> - Even if it's just BUG() it renders my system unusable (I'm able to
>> type a few characters on the virtual terminal at most)
>> - Judging from the stacktrace it's sys_madvise(..., ..., MADV_REMOVE)
>> - I'm testing with ubuntu's 2.6.35-22-server#35 but I got similar
>> results with 2.6.32 some time ago
>
> It is.
>
>> - I'm posting this cause diving into linux mm spaghetti code might be
>> not a trivial task, but if nobody can see anything obvious in a day or
>> so, I'll try to debug it mysel
>> - I'm unable to provide a testcase by now, nor any usable state of the
>> crashing process, cause the system becomes unusable
>> - It crashes both linux-kernel working on a physical machine as well
>> as on the VirtualBox emulator
>> - I'm usually waiting from 0.5h to 12h for this crash to appear, I
>> think it could be speed up greatly by disabling any irrelevant
>> syscalls in the fuzzer
>>
>> [25142.286531] kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
>
> That's
>
>        BUG_ON(page_mapped(page));
>
> in __remove_from_page_cache().  That state is worth a BUG().
>
>> [25142.290627] invalid opcode: 0000 [#2] SMP
>> [25142.290627] last sysfs file: /sys/fs/ext4/sda1/lifetime_write_kbytes
>> [25142.301039] CPU 1
>> [25142.301039] Modules linked in: dccp_ipv6 decnet llc2 x25 pppoe
>> pppox irda crc_ccitt dccp_ipv4 dccp ipx p8023 rds af_key netrom econet
>> can atm l2cap bluetooth appletalk rose ax25 nouveau snd_hda_codec_idt
>> ttm snd_hda_intel drm_kms_helper snd_hda_codec led_class drm psmouse
>> snd_hwdep ppdev snd_pcm snd_timer i2c_algo_bit serio_raw i82975x_edac
>> snd parport_pc lp soundcore edac_core snd_page_alloc parport dcdbas
>> floppy ahci firewire_ohci firewire_core crc_itu_t tg3 libahci
>> [25142.301039]
>> [25142.301039] Pid: 22612, comm: iknowthis Tainted: G      D W
>> 2.6.35-22-server #35-Ubuntu 0GH911/Precision WorkStation 390
>> [25142.301039] RIP: 0010:[<ffffffff81101645>]  [<ffffffff81101645>]
>> __remove_from_page_cache+0xd5/0xe0
>> [25142.367617] RSP: 0000:ffff8800cb8a3cf8  EFLAGS: 00010002
>> [25142.367617] RAX: 0100000000100029 RBX: ffffea0002159670 RCX: 0000000000000000
>> [25142.367617] RDX: 0000000000000001 RSI: 0000000000000016 RDI: ffff880100000700
>> [25142.367617] RBP: ffff8800cb8a3d08 R08: 0000000000000018 R09: 0000000000000000
>> [25142.367617] R10: ffffea0002159678 R11: 0000000000000000 R12: ffff8800cb865188
>> [25142.367617] R13: ffff8800cb865188 R14: ffff8800cb8a3d88 R15: 0000000000000000
>> [25142.367617] FS:  0000000000000000(0000) GS:ffff880001e40000(0063)
>> knlGS:00000000f75276c0
>> [25142.367617] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
>> [25142.367617] CR2: 00000000080ba1fc CR3: 000000012091d000 CR4: 00000000000006e0
>> [25142.367617] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [25142.367617] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [25142.367617] Process iknowthis (pid: 22612, threadinfo
>> ffff8800cb8a2000, task ffff8800cb9516e0)
>> [25142.367617] Stack:
>> [25142.367617]  ffffea0002159670 ffff8800cb8651a0 ffff8800cb8a3d28
>> ffffffff81101686
>> [25142.367617] <0> ffffea0002159670 ffff8800cb865188 ffff8800cb8a3d48
>> ffffffff8110c6d9
>> [25142.367617] <0> 0000000000000001 0000000000000008 ffff8800cb8a3e38
>> ffffffff8110cce0
>> [25142.367617] Call Trace:
>> [25142.367617]  [<ffffffff81101686>] remove_from_page_cache+0x36/0x60
>> [25142.367617]  [<ffffffff8110c6d9>] truncate_inode_page+0x79/0xb0
>> [25142.367617]  [<ffffffff8110cce0>] truncate_inode_pages_range+0x160/0x460
>> [25142.367617]  [<ffffffff81121250>] ? unmap_mapping_range+0xb0/0x160
>> [25142.367617]  [<ffffffff81121398>] vmtruncate_range+0x98/0x100
>> [25142.367617]  [<ffffffff8111ae5a>] madvise_vma+0x17a/0x210
>> [25142.367617]  [<ffffffff8111b055>] sys_madvise+0x165/0x290
>> [25142.367617]  [<ffffffff810467c3>] ia32_sysret+0x0/0x5
>> [25142.367617] Code: 00 00 e8 1f 8a 1c 00 48 89 df 57 9d 0f 1f 44 00
>> 00 5b 41 5c c9 c3 be 16 00 00 00 48 89 df e8 a3 68 01 00 48 8b 03 e9
>> 71 ff ff ff <0f> 0b eb fe eb 05 90 90 90 90 90 55 48 89 e5 48 83 ec 10
>> 48 89
>> [25142.367617] RIP  [<ffffffff81101645>] __remove_from_page_cache+0xd5/0xe0
>> [25142.367617]  RSP <ffff8800cb8a3cf8>
>> [25142.367617] ---[ end trace a262aa785b417723 ]---
>
>



-- 
Robert Święcki

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2010-11-23 14:55   ` Robert Święcki
@ 2010-11-29 23:25     ` Andrew Morton
  2010-11-30 18:22       ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2010-11-29 23:25 UTC (permalink / raw)
  To: Robert Święcki
  Cc: linux-kernel, linux-mm, Nick Piggin, Hugh Dickins

On Tue, 23 Nov 2010 15:55:31 +0100
Robert  wi cki <robert@swiecki.net> wrote:

> >> Hi,
> >>
> >> I was doing some fuzzing with http://code.google.com/p/iknowthis/ and
> >> my system pretty quickly crashes with the BUG() below.
> >
> > So it is a repeatable crash?
> 
> Not in a sense that I can provide you with a sequence of syscalls that
> led to this state. Generally it repeats after some time (<12 hours on
> 2 intel-core) of running the
> http://code.google.com/p/iknowthis/source/browse/#svn/trunk
> 
> >> - Even if it's just BUG() it renders my system unusable (I'm able to
> >> type a few characters on the virtual terminal at most)
> >> - Judging from the stacktrace it's sys_madvise(..., ..., MADV_REMOVE)
> >> - I'm testing with ubuntu's 2.6.35-22-server#35 but I got similar
> >> results with 2.6.32 some time ago
> >
> > It is.
> >
> >> - I'm posting this cause diving into linux mm spaghetti code might be
> >> not a trivial task, but if nobody can see anything obvious in a day or
> >> so, I'll try to debug it mysel
> >> - I'm unable to provide a testcase by now, nor any usable state of the
> >> crashing process, cause the system becomes unusable
> >> - It crashes both linux-kernel working on a physical machine as well
> >> as on the VirtualBox emulator
> >> - I'm usually waiting from 0.5h to 12h for this crash to appear, I
> >> think it could be speed up greatly by disabling any irrelevant
> >> syscalls in the fuzzer
> >>
> >> [25142.286531] kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
> >
> > That's
> >
> >        BUG_ON(page_mapped(page));
> >
> > in  remove_from_page_cache().  That state is worth a BUG().

At a guess I'd say that another thread came in and established a
mapping against a page in the to-be-truncated range while
vmtruncate_range() was working on it.  In fact I'd be suspecting that
the mapping was established after truncate_inode_page() ran its
page_mapped() test.

Let's take a look at vmtruncate_range():

int vmtruncate_range(struct inode *inode, loff_t offset, loff_t end)
{
	struct address_space *mapping = inode->i_mapping;

	/*
	 * If the underlying filesystem is not going to provide
	 * a way to truncate a range of blocks (punch a hole) -
	 * we should return failure right now.
	 */
	if (!inode->i_op->truncate_range)
		return -ENOSYS;

	mutex_lock(&inode->i_mutex);
	down_write(&inode->i_alloc_sem);
	unmap_mapping_range(mapping, offset, (end - offset), 1);
	truncate_inode_pages_range(mapping, offset, end);
	unmap_mapping_range(mapping, offset, (end - offset), 1);
	inode->i_op->truncate_range(inode, offset, end);
	up_write(&inode->i_alloc_sem);
	mutex_unlock(&inode->i_mutex);

	return 0;
}

Now, why does it call unmap_mapping_range() twice?

Nick's original 2007 patch d00806b183152af6d2 ("mm: fix fault vs
invalidate race for linear mappings") added the second
unmap_mapping_range() call, along with this nice comment, which
explains it all:


+       /*
+        * unmap_mapping_range is called twice, first simply for efficiency
+        * so that truncate_inode_pages does fewer single-page unmaps. However
+        * after this first call, and before truncate_inode_pages finishes,
+        * it is possible for private pages to be COWed, which remain after
+        * truncate_inode_pages finishes, hence the second unmap_mapping_range
+        * call must be made for correctness.
+	 /*

Later, some twirp deleted the damn comment.  Why'd we do that?  It
still seems to be valid.

If this _is_ still valid, and the first call to unmap_mapping_range() is
really just a best-effort performance thing which won't reliably clear
all the mappings then perhaps the BUG_ON(page_mapped(page)) assertion
in __remove_from_page_cache() is simply bogus.

We don't appear to have mmap_sem coverage around here, perhaps for
lock-ordering reasons.  I suspect we'll be struggling to plug all holes
here without that coverage.

Fortunately the comment over madvise_remove() says it's tmpfs-only, so
we can blame Hugh :)


hm, I found the lost comment.  It somehow wandered over into
truncate_pagecache(), but is still relevant at the vmtruncate_range()
site.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2010-11-29 23:25     ` Andrew Morton
@ 2010-11-30 18:22       ` Hugh Dickins
  2010-12-29 20:54         ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2010-11-30 18:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Robert Święcki, linux-kernel, linux-mm, Nick Piggin

On Mon, 29 Nov 2010, Andrew Morton wrote:
> On Tue, 23 Nov 2010 15:55:31 +0100
> Robert  wi cki <robert@swiecki.net> wrote:
> > >> [25142.286531] kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
> > >
> > > That's
> > >
> > >        BUG_ON(page_mapped(page));
> > >
> > > in  remove_from_page_cache().  That state is worth a BUG().
> 
> At a guess I'd say that another thread came in and established a
> mapping against a page in the to-be-truncated range while
> vmtruncate_range() was working on it.  In fact I'd be suspecting that
> the mapping was established after truncate_inode_page() ran its
> page_mapped() test.

It looks that way, but I don't see how it can be: the page is locked
before calling truncate_inode_page() and unlocked after it: and the
page (certainly in this tmpfs case, perhaps not for every filesystem)
cannot be faulted into an address space without holding its page lock.

Either we've made a change somewhere, and are now dropping and retaking
page lock in a way which exposes this bug?  Or truncate_inode_page()'s
unmap_mapping_range() call is somehow missing the page it's called for?

I guess the latter is the more likely: maybe the truncate_count/restart
logic isn't working properly.  I'll try to check over that again later -
but will be happy if someone else beats me to it.

> +       /*
> +        * unmap_mapping_range is called twice, first simply for efficiency
> +        * so that truncate_inode_pages does fewer single-page unmaps. However
> +        * after this first call, and before truncate_inode_pages finishes,
> +        * it is possible for private pages to be COWed, which remain after
> +        * truncate_inode_pages finishes, hence the second unmap_mapping_range
> +        * call must be made for correctness.
> +	 /*
> 
> Later, some twirp deleted the damn comment.  Why'd we do that?  It
> still seems to be valid.
> 
> If this _is_ still valid, and the first call to unmap_mapping_range() is
> really just a best-effort performance thing which won't reliably clear
> all the mappings then perhaps the BUG_ON(page_mapped(page)) assertion
> in __remove_from_page_cache() is simply bogus.

No, I believe the first call to unmap_mapping_range() is sufficient to
deal correctly with all page cache pages (and Robert's issue is certainly
with a page cache page).  The second call to unmap_mapping_range() is to
mop up private copied-on-write copies of page cache pages: being separate
pages, the page locking is not adequate to deal with them as thoroughly,
but standards still require them to be removed (SIGBUS beyond EOF).

> 
> We don't appear to have mmap_sem coverage around here, perhaps for
> lock-ordering reasons.  I suspect we'll be struggling to plug all holes
> here without that coverage.

I think any help from mmap_sem here would be deceptive: another mm,
with another mmap_sem, could equally operate on the pages of this file,
and presumably introduce the same condition.  i_mmap_lock and page lock
should already be handling it.

> 
> Fortunately the comment over madvise_remove() says it's tmpfs-only, so
> we can blame Hugh :)

Glad to be of service!  But wish I could work out what's happening.

> 
> hm, I found the lost comment.  It somehow wandered over into
> truncate_pagecache(), but is still relevant at the vmtruncate_range()
> site.

Yes, it is indeed a helpful comment.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2010-11-30 18:22       ` Hugh Dickins
@ 2010-12-29 20:54         ` Hugh Dickins
  2010-12-30 21:25           ` Robert Święcki
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2010-12-29 20:54 UTC (permalink / raw)
  To: robert; +Cc: Miklos Szeredi, Andrew Morton, Nick Piggin, linux-kernel,
	linux-mm

On Tue, 30 Nov 2010, Hugh Dickins wrote:
> On Mon, 29 Nov 2010, Andrew Morton wrote:
> > On Tue, 23 Nov 2010 15:55:31 +0100
> > Robert  wi cki <robert@swiecki.net> wrote:
> > > >> [25142.286531] kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
> > > >
> > > > That's
> > > >
> > > >        BUG_ON(page_mapped(page));
> > > >
> > > > in  remove_from_page_cache().  That state is worth a BUG().
> > 
> > At a guess I'd say that another thread came in and established a
> > mapping against a page in the to-be-truncated range while
> > vmtruncate_range() was working on it.  In fact I'd be suspecting that
> > the mapping was established after truncate_inode_page() ran its
> > page_mapped() test.
> 
> It looks that way, but I don't see how it can be: the page is locked
> before calling truncate_inode_page() and unlocked after it: and the
> page (certainly in this tmpfs case, perhaps not for every filesystem)
> cannot be faulted into an address space without holding its page lock.
> 
> Either we've made a change somewhere, and are now dropping and retaking
> page lock in a way which exposes this bug?  Or truncate_inode_page()'s
> unmap_mapping_range() call is somehow missing the page it's called for?
> 
> I guess the latter is the more likely: maybe the truncate_count/restart
> logic isn't working properly.  I'll try to check over that again later -
> but will be happy if someone else beats me to it.

I have since found an omission in the restart_addr logic: looking back
at the October 2004 history of vm_truncate_count, I see that originally
I designed it to work one way, but hurriedly added a 7/6 redesign when
vma splitting turned out to leave an ambiguity.  I should have updated
the protection in mremap move at that time, but missed it.

Robert, please try out the patch below (should apply fine to 2.6.35):
I'm hoping this will fix what the fuzzer found, but it's still quite
possible that it found something else wrong that I've not yet noticed.
The patch could probably be cleverer (if we exported the notion of
restart_addr out of mm/memory.c), but I'm more in the mood for being
safe than clever at the moment.

I didn't hear whether you'd managed to try out Miklos's patch; but
this one is a better bet to be the fix for your particular issue.

Thanks,
Hugh

--- 2.6.37-rc8/mm/mremap.c	2010-11-01 13:01:32.000000000 -0700
+++ linux/mm/mremap.c	2010-12-29 12:25:46.000000000 -0800
@@ -91,9 +91,7 @@ static void move_ptes(struct vm_area_str
 		 */
 		mapping = vma->vm_file->f_mapping;
 		spin_lock(&mapping->i_mmap_lock);
-		if (new_vma->vm_truncate_count &&
-		    new_vma->vm_truncate_count != vma->vm_truncate_count)
-			new_vma->vm_truncate_count = 0;
+		new_vma->vm_truncate_count = 0;
 	}
 
 	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2010-12-29 20:54         ` Hugh Dickins
@ 2010-12-30 21:25           ` Robert Święcki
  2011-01-07 13:02             ` Robert Święcki
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Święcki @ 2010-12-30 21:25 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Miklos Szeredi, Andrew Morton, Nick Piggin, linux-kernel,
	linux-mm

>> I guess the latter is the more likely: maybe the truncate_count/restart
>> logic isn't working properly.  I'll try to check over that again later -
>> but will be happy if someone else beats me to it.
>
> I have since found an omission in the restart_addr logic: looking back
> at the October 2004 history of vm_truncate_count, I see that originally
> I designed it to work one way, but hurriedly added a 7/6 redesign when
> vma splitting turned out to leave an ambiguity.  I should have updated
> the protection in mremap move at that time, but missed it.
>
> Robert, please try out the patch below (should apply fine to 2.6.35):

In the beginning  of Jan (3-4) at earliest I'm afraid, i.e. when I
manage to get to my console-over-rs232 setup.

> I'm hoping this will fix what the fuzzer found, but it's still quite
> possible that it found something else wrong that I've not yet noticed.
> The patch could probably be cleverer (if we exported the notion of
> restart_addr out of mm/memory.c), but I'm more in the mood for being
> safe than clever at the moment.

-- 
Robert Święcki

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2010-12-30 21:25           ` Robert Święcki
@ 2011-01-07 13:02             ` Robert Święcki
  2011-01-07 13:05               ` Robert Święcki
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Święcki @ 2011-01-07 13:02 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Miklos Szeredi, Andrew Morton, Nick Piggin, linux-kernel,
	linux-mm

 but will be happy if someone else beats me to it.
>>
>> I have since found an omission in the restart_addr logic: looking back
>> at the October 2004 history of vm_truncate_count, I see that originally
>> I designed it to work one way, but hurriedly added a 7/6 redesign when
>> vma splitting turned out to leave an ambiguity.  I should have updated
>> the protection in mremap move at that time, but missed it.
>>
>> Robert, please try out the patch below (should apply fine to 2.6.35):
>
> In the beginning  of Jan (3-4) at earliest I'm afraid, i.e. when I
> manage to get to my console-over-rs232 setup.

I cannot reproduce it even with the unpatched kernel, cause I get the
following oops (3 times out of 3 tries) relatively quickly. Still
trying.

Entering kdb (current=0xffff88006b525ac0, pid 12468) on processor 1 Oops: (null)
due to oops @ 0xffffffff810c2a1b
CPU 1 <c>
<d>Pid: 12468, comm: iknowthis Not tainted 2.6.37-rc2 #1
0GH911/Precision WorkStation 390
<d>RIP: 0010:[<ffffffff810c2a1b>]  [<ffffffff810c2a1b>] next_pidmap+0x4b/0xa0
<d>RSP: 0000:ffff88006bc6fd78  EFLAGS: 00010206
<d>RAX: 001ffffffffd2010 RBX: 001fffff829eee38 RCX: 0000000000000034
<d>RDX: 0000000000001bd7 RSI: 00000000e9009bd6 RDI: ffffffff82a1ce20
<d>RBP: ffff88006bc6fd98 R08: c000000000000000 R09: 4fb0000000000000
<d>R10: 7d80000000000000 R11: 0000000000000000 R12: ffffffff82a1d628
<d>R13: ffffffff82a1ce20 R14: ffff880123768000 R15: ffffffff811d98e0
<d>FS:  0000000000000000(0000) GS:ffff8800cfc40000(0063) knlGS:00000000f74aa6c0
<d>CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
<d>CR2: 000000000808865c CR3: 000000011b1a4000 CR4: 00000000000006e0
<d>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<d>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process iknowthis (pid: 12468, threadinfo ffff88006bc6e000, task
ffff88006b525ac0)
<0>Stack:
<c> ffff88006bc6fe78<c> 00000000e9009bd6<c> 0000000000000000<c>
ffffffff82a1ce20<c>
<c> ffff88006bc6fdc8<c> ffffffff810c2aac<c> ffff880018cfc410<c>
ffffffff82a1ce20<c>
<c> 00000000e9009bd6<c> ffffffff811d98e0<c> ffff88006bc6fe28<c>
ffffffff811f3dab<c>
<0>Call Trace:
[1]more>
Only 'q' or 'Q' are processed at more prompt, input ignored
<0> [<ffffffff810c2aac>] find_ge_pid+0x3c/0x50
<0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
<0> [<ffffffff811f3dab>] next_tgid+0x3b/0xc0
<0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
<0> [<ffffffff811f42ec>] proc_pid_readdir+0x13c/0x1d0
<0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
<0> [<ffffffff811f03fa>] proc_root_readdir+0x4a/0x60
<0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
<0> [<ffffffff811a58d0>] vfs_readdir+0xc0/0xe0
<0> [<ffffffff811d8435>] compat_sys_old_readdir+0x45/0x70
<0> [<ffffffff8108b023>] ia32_sysret+0x0/0x5
<0>Code: 89 fd 48 c1 e8 0f 48 c1 e0 04 48 8d 5c 07 08 4c 39 e3 73 54
81 e2 ff 7f 00 00 eb 0f eb 02 90 90 48 83 c3 10 31 d2 49 39 dc 76 3d
<48> 8b 7b 08 48 85 ff 74 ec 48 63 d2 be 00 80 00 00 e8 6f a4 41
Call Trace:
 [<ffffffff810c2aac>] find_ge_pid+0x3c/0x50
 [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
 [<ffffffff811f3dab>] next_tgid+0x3b/0xc0
 [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
 [<ffffffff811f42ec>] proc_pid_readdir+0x13c/0x1d0
 [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
 [<ffffffff811f03fa>] proc_root_readdir+0x4a/0x60
 [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
 [<ffffffff811a58d0>] vfs_readdir+0xc0/0xe0
[1]more>
Only 'q' or 'Q' are processed at more prompt, input ignored
 [<ffffffff811d8435>] compat_sys_old_readdir+0x45/0x70
 [<ffffffff8108b023>] ia32_sysret+0x0/0x5



-- 
Robert Święcki

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128!
  2011-01-07 13:02             ` Robert Święcki
@ 2011-01-07 13:05               ` Robert Święcki
  0 siblings, 0 replies; 8+ messages in thread
From: Robert Święcki @ 2011-01-07 13:05 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Miklos Szeredi, Andrew Morton, Nick Piggin, linux-kernel,
	linux-mm

On Fri, Jan 7, 2011 at 2:02 PM, Robert Święcki <robert@swiecki.net> wrote:
>  but will be happy if someone else beats me to it.
>>>
>>> I have since found an omission in the restart_addr logic: looking back
>>> at the October 2004 history of vm_truncate_count, I see that originally
>>> I designed it to work one way, but hurriedly added a 7/6 redesign when
>>> vma splitting turned out to leave an ambiguity.  I should have updated
>>> the protection in mremap move at that time, but missed it.
>>>
>>> Robert, please try out the patch below (should apply fine to 2.6.35):
>>
>> In the beginning  of Jan (3-4) at earliest I'm afraid, i.e. when I
>> manage to get to my console-over-rs232 setup.
>
> I cannot reproduce it even with the unpatched kernel, cause I get the
> following oops (3 times out of 3 tries) relatively quickly. Still
> trying.
>
> Entering kdb (current=0xffff88006b525ac0, pid 12468) on processor 1 Oops: (null)
> due to oops @ 0xffffffff810c2a1b
> CPU 1 <c>
> <d>Pid: 12468, comm: iknowthis Not tainted 2.6.37-rc2 #1

Hm.. wrong kernel, trying now with 2.6.35 - anyway this proc readdir
oops appears also on 2.6.35

> 0GH911/Precision WorkStation 390
> <d>RIP: 0010:[<ffffffff810c2a1b>]  [<ffffffff810c2a1b>] next_pidmap+0x4b/0xa0
> <d>RSP: 0000:ffff88006bc6fd78  EFLAGS: 00010206
> <d>RAX: 001ffffffffd2010 RBX: 001fffff829eee38 RCX: 0000000000000034
> <d>RDX: 0000000000001bd7 RSI: 00000000e9009bd6 RDI: ffffffff82a1ce20
> <d>RBP: ffff88006bc6fd98 R08: c000000000000000 R09: 4fb0000000000000
> <d>R10: 7d80000000000000 R11: 0000000000000000 R12: ffffffff82a1d628
> <d>R13: ffffffff82a1ce20 R14: ffff880123768000 R15: ffffffff811d98e0
> <d>FS:  0000000000000000(0000) GS:ffff8800cfc40000(0063) knlGS:00000000f74aa6c0
> <d>CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> <d>CR2: 000000000808865c CR3: 000000011b1a4000 CR4: 00000000000006e0
> <d>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> <d>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process iknowthis (pid: 12468, threadinfo ffff88006bc6e000, task
> ffff88006b525ac0)
> <0>Stack:
> <c> ffff88006bc6fe78<c> 00000000e9009bd6<c> 0000000000000000<c>
> ffffffff82a1ce20<c>
> <c> ffff88006bc6fdc8<c> ffffffff810c2aac<c> ffff880018cfc410<c>
> ffffffff82a1ce20<c>
> <c> 00000000e9009bd6<c> ffffffff811d98e0<c> ffff88006bc6fe28<c>
> ffffffff811f3dab<c>
> <0>Call Trace:
> [1]more>
> Only 'q' or 'Q' are processed at more prompt, input ignored
> <0> [<ffffffff810c2aac>] find_ge_pid+0x3c/0x50
> <0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
> <0> [<ffffffff811f3dab>] next_tgid+0x3b/0xc0
> <0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
> <0> [<ffffffff811f42ec>] proc_pid_readdir+0x13c/0x1d0
> <0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
> <0> [<ffffffff811f03fa>] proc_root_readdir+0x4a/0x60
> <0> [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
> <0> [<ffffffff811a58d0>] vfs_readdir+0xc0/0xe0
> <0> [<ffffffff811d8435>] compat_sys_old_readdir+0x45/0x70
> <0> [<ffffffff8108b023>] ia32_sysret+0x0/0x5
> <0>Code: 89 fd 48 c1 e8 0f 48 c1 e0 04 48 8d 5c 07 08 4c 39 e3 73 54
> 81 e2 ff 7f 00 00 eb 0f eb 02 90 90 48 83 c3 10 31 d2 49 39 dc 76 3d
> <48> 8b 7b 08 48 85 ff 74 ec 48 63 d2 be 00 80 00 00 e8 6f a4 41
> Call Trace:
>  [<ffffffff810c2aac>] find_ge_pid+0x3c/0x50
>  [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
>  [<ffffffff811f3dab>] next_tgid+0x3b/0xc0
>  [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
>  [<ffffffff811f42ec>] proc_pid_readdir+0x13c/0x1d0
>  [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
>  [<ffffffff811f03fa>] proc_root_readdir+0x4a/0x60
>  [<ffffffff811d98e0>] ? compat_fillonedir+0x0/0xe0
>  [<ffffffff811a58d0>] vfs_readdir+0xc0/0xe0
> [1]more>
> Only 'q' or 'Q' are processed at more prompt, input ignored
>  [<ffffffff811d8435>] compat_sys_old_readdir+0x45/0x70
>  [<ffffffff8108b023>] ia32_sysret+0x0/0x5
>
>
>
> --
> Robert Święcki
>



-- 
Robert Święcki

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-01-07 13:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <AANLkTinbqG7sXxf82wc516snLoae1DtCWjo+VtsPx2P3@mail.gmail.com>
2010-11-22 23:47 ` kernel BUG at /build/buildd/linux-2.6.35/mm/filemap.c:128! Andrew Morton
2010-11-23 14:55   ` Robert Święcki
2010-11-29 23:25     ` Andrew Morton
2010-11-30 18:22       ` Hugh Dickins
2010-12-29 20:54         ` Hugh Dickins
2010-12-30 21:25           ` Robert Święcki
2011-01-07 13:02             ` Robert Święcki
2011-01-07 13:05               ` Robert Święcki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).