Re: Fwd: Ext4 bug with fallocate

From: Eric Sandeen <sandeen@redhat.com>
To: Fredrik Andersson <nablaman@gmail.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: Fwd: Ext4 bug with fallocate
Date: Tue, 20 Oct 2009 21:20:18 -0500
Message-ID: <4ADE6FE2.20807@redhat.com>
In-Reply-To: <a125a870910190249l39183927i8f042a9ace39904c@mail.gmail.com>

Fredrik Andersson wrote:
> Hi, here is the data for this process:

Including all of the processes in D state (everything reported by 
sysrq-w) would probably be most helpful.

Feel free to file an ext4 bug on bugzilla.kernel.org w/ this 
information, too, so it doesn't get lost in busy schedules ...

Thanks,
-Eric
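
When a full sysrq-w dump is long, it can help to pull out only the tasks in D state before attaching it to a bug report. The following is a minimal sketch, assuming the task-header and stack-frame layout seen in the trace quoted below; other kernel versions print these lines differently, and the names `TASK_RE`, `FRAME_RE`, and `blocked_tasks` are made up for this illustration. Frames marked with `?` are skipped because the kernel flags them as unreliable stack guesses.

```python
import re

# Task header as printed in the quoted dump, e.g.
# "[5958816.744013] drdbmake      D ffff88021e4c7800     0 27019  13796"
# (assumed layout: comm, state, address, free stack, pid, parent pid)
TASK_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<comm>\S+)\s+(?P<state>[A-Z])\s+"
    r"[0-9a-f]{16}\s+\d+\s+(?P<pid>\d+)\s+\d+"
)

# Stack frame, e.g. "[...]  [<ffffffff813a9639>] schedule+0x9/0x20"
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+"
    r"(?P<guess>\?\s+)?(?P<sym>[\w.]+)\+0x"
)


def blocked_tasks(log_text):
    """Map (comm, pid) to the list of reliable frame symbols for D-state tasks."""
    tasks = {}
    current = None
    for line in log_text.splitlines():
        m = TASK_RE.match(line)
        if m:
            # A new task header ends the previous task's frame list.
            current = None
            if m.group("state") == "D":
                key = (m.group("comm"), int(m.group("pid")))
                current = tasks.setdefault(key, [])
            continue
        m = FRAME_RE.match(line)
        if m and current is not None and not m.group("guess"):
            current.append(m.group("sym"))
    return tasks


# A few lines copied from the quoted trace, with the "> " quoting removed.
SAMPLE = """\
[5958816.744013] drdbmake      D ffff88021e4c7800     0 27019  13796
[5958816.744013] Call Trace:
[5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
[5958816.744013]  [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[5958816.744013]  [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
"""

if __name__ == "__main__":
    for (comm, pid), frames in blocked_tasks(SAMPLE).items():
        print(comm, pid, " <- ".join(frames))
```

To use it on a real dump, pass the saved kernel log text to `blocked_tasks()` in place of `SAMPLE`.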

> [5958816.744013] drdbmake      D ffff88021e4c7800     0 27019  13796
> [5958816.744013]  ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0 ffff8801678b1380
> [5958816.744013]  0000000000010e80 000000000000c748 ffff8800404963c0 ffffffff81526360
> [5958816.744013]  ffff880040496730 00000000f4ce9bf0 000000025819cebe 0000000000000282
> [5958816.744013] Call Trace:
> [5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
> [5958816.744013]  [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
> [5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
> [5958816.744013]  [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
> [5958816.744013]  [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
> [5958816.744013]  [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
> [5958816.744013]  [<ffffffff81150a78>] ext4_truncate+0x198/0x680
> [5958816.744013]  [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
> [5958816.744013]  [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
> [5958816.744013]  [<ffffffff810acd25>] vmtruncate+0xa5/0x110
> [5958816.744013]  [<ffffffff810dda10>] inode_setattr+0x30/0x180
> [5958816.744013]  [<ffffffff8114d073>] ext4_setattr+0x173/0x310
> [5958816.744013]  [<ffffffff810ddc79>] notify_change+0x119/0x330
> [5958816.744013]  [<ffffffff810c6df3>] do_truncate+0x63/0x90
> [5958816.744013]  [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
> [5958816.744013]  [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
> [5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
> 
> Don't know if this has anything to do with it, but I also noticed
> that another process of mine, which is working just fine, is executing
> a suspicious-looking function called raid0_unplug. It operates on the
> same raid0/ext4 filesystem as the hung process. I include the call
> trace for it here too:
> 
> [5958816.744013] nodeserv      D ffff880167bd7ca8     0 17900  13796
> [5958816.744013]  ffff880167bd7bf8 0000000000000082 ffff88002800a588 ffff88021e5b56e0
> [5958816.744013]  0000000000010e80 000000000000c748 ffff880100664020 ffffffff81526360
> [5958816.744013]  ffff880100664390 000000008119bd17 000000026327bfa9 0000000000000002
> [5958816.744013] Call Trace:
> [5958816.744013]  [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
> [5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
> [5958816.744013]  [<ffffffff813a9687>] io_schedule+0x37/0x50
> [5958816.744013]  [<ffffffff81095e35>] sync_page+0x35/0x60
> [5958816.744013]  [<ffffffff81095e69>] sync_page_killable+0x9/0x50
> [5958816.744013]  [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
> [5958816.744013]  [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
> [5958816.744013]  [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
> [5958816.744013]  [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
> [5958816.744013]  [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
> [5958816.744013]  [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
> [5958816.744013]  [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
> [5958816.744013]  [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
> [5958816.744013]  [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
> [5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
> [5958816.744013]  [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
> [5958816.744013]  [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
> [5958816.744013]  [<ffffffff810c85b8>] vfs_read+0xc8/0x170
> [5958816.744013]  [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
> [5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
> 
> Hope this makes sense to anyone, and please let me know if there is
> more info I can provide.
> 
> /Fredrik
> 
> On Sun, Oct 18, 2009 at 5:57 PM, Eric Sandeen <sandeen@redhat.com> wrote:
>> Fredrik Andersson wrote:
>>> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
>>> this is the right place to do so.
>>>
>>> My program creates a big file (around 30 GB) with posix_fallocate (to
>>> utilize extents), fills it with data and uses ftruncate to crop it to
>>> its final size (usually somewhere between 20 and 25 GB).
>>> The problem is that in around 5% of the cases, the program locks up
>>> completely in a syscall. The process can thus not be killed even with
>>> kill -9, and a reboot is all that will do.
>> Does echo w > /proc/sysrq-trigger (this dumps tasks in uninterruptible
>> sleep; or use echo t for all tasks) show you where the stuck threads
>> are?
>>
>> -Eric
>>
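
For anyone trying to reproduce this, the workload described in the original report (preallocate with posix_fallocate, fill with data, crop with ftruncate) can be sketched roughly as below. This is only an illustration under stated assumptions: the reporter's actual program is not shown in the thread, the function name `preallocate_fill_truncate` and the fill pattern are made up here, and the sizes are scaled down from the reported 30 GB and 20 to 25 GB. Nothing in the thread confirms that a small run like this triggers the hang.

```python
import os
import tempfile


def preallocate_fill_truncate(path, alloc_bytes, final_bytes, chunk=1 << 20):
    """Preallocate alloc_bytes, write final_bytes of data, truncate to final_bytes.

    Returns the file size reported by fstat after the truncate.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Reserve the full region up front, as the report describes.
        os.posix_fallocate(fd, 0, alloc_bytes)

        # Fill the part of the file that will be kept (hypothetical pattern).
        buf = b"\xa5" * chunk
        written = 0
        while written < final_bytes:
            n = min(chunk, final_bytes - written)
            written += os.write(fd, buf[:n])

        # Crop the preallocated tail; this is the step where the reported
        # process was stuck (the quoted trace goes through ext4_truncate).
        os.ftruncate(fd, final_bytes)
        os.fsync(fd)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Scaled-down run: 64 MiB preallocated, 48 MiB kept.
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "prealloc.bin")
        print(preallocate_fill_truncate(target, 64 << 20, 48 << 20))
```

To exercise ext4 itself, point `path` at a file on the filesystem under test rather than a temporary directory, and raise the sizes toward the ones in the report.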
