From: Liu Bo <liubo2009@cn.fujitsu.com>
To: Marc MERLIN <marc@merlins.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
Date: Wed, 04 Jul 2012 13:58:31 +0800
Message-ID: <4FF3DB87.5090405@cn.fujitsu.com>
In-Reply-To: <20120702195820.GA10655@merlins.org>
On 07/03/2012 03:58 AM, Marc MERLIN wrote:
> On Fri, Jun 29, 2012 at 05:36:24AM -0700, Marc MERLIN wrote:
>> On Tue, Jun 26, 2012 at 10:20:12PM -0700, Marc MERLIN wrote:
>>> On Tue, Jun 26, 2012 at 06:38:18PM -0700, Marc MERLIN wrote:
>>>> Now, I'm also seeing these below and I have this again (86% CPU):
>>>> 6076 root 20 0 0 0 0 R 86 0.0 29:40.11 btrfs-delalloc-
>>>>
>>>> How bad is it, doctor? I think I'll be going back to 3.2.16 for now though.
>>
>> I reverted to 3.2.16 and haven't had further problems after dropping the
>> current snapshot that was corrupted in various ways.
>>
>> Now, I'm not sure when I should upgrade anymore since I haven't heard of
>> any fixes for what I saw.
>> Assuming I go forward again, is there something else I could have
>> provided to help debug?
>
> Mmmh, ok. I understand that this code comes with no guarantees, and I have
> backups, but I'm reporting a problem that led to corruption (I had multiple
> files that were corrupted in my latest snapshot and I had to drop it and
> revert to an older snapshot and then out of fear for 3.4.4, went back to
> 3.2.16).
>
Hi Marc,
Sorry for not replying to this earlier.
The dmesg log, sysrq log and stack dump info are usually very helpful.
From your report, we can see the csum errors and the hang in the log.
'no csum' is not that bad, while the hang is serious and dangerous.
So can you please capture a 'sysrq + w' log while it is hanging and paste it here?
The log may tell us who is blocking the other threads.
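A rough sketch of how such a log could be captured and summarized, in case it helps. It assumes root access and the standard /proc/sysrq-trigger interface; the function names trigger_sysrq_w and blocked_tasks are made up for illustration, and the pattern is based only on the hung-task line format quoted further down in this thread.

```python
import re
from pathlib import Path

# Matches the hung-task watchdog line as it appears in this thread, e.g.
# "INFO: task VirtualBox:6818 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def trigger_sysrq_w(trigger_path="/proc/sysrq-trigger"):
    """Write 'w' to the sysrq trigger so the kernel logs blocked tasks.

    On a real system this needs root. The path is a parameter only so the
    function can be tried against an ordinary file first.
    """
    Path(trigger_path).write_text("w")


def blocked_tasks(log_text):
    """Return (command, pid, seconds) for each hung-task report in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(log_text)
    ]
```

Calling trigger_sysrq_w() during a hang and then saving the output of dmesg should give the stack traces of the blocked tasks; blocked_tasks() only pulls the task names out of the watchdog lines, so the full log is still what should be pasted to the list.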
> I didn't see any problems with 3.2.16 (doesn't mean there weren't any, just
> that I didn't see any).
Feel free to use the latest upstream btrfs; it always contains some fixes.
thanks,
liubo
> Since my filesystem was a bit full, and that triggers problems with btrfs, I
> freed up 70GB
> gandalfthegreat:~# btrfs fi show
> Label: 'btrfs_pool1' uuid: 873d526c-e911-4234-af1b-239889cd143d
> Total devices 1 FS bytes used 163.01GB
> devid 1 size 231.02GB used 231.02GB path /dev/dm-0
>
> I rebooted with 3.4.4 and started copying data, and for now I've gotten this:
> kernel: [ 832.108558] btrfs no csum found for inode 3896855 start 0
> kernel: [ 832.108873] btrfs csum failed ino 3896855 off 0 csum 1150320628 private 0
>
> How bad is this?
>
> More generally, what was missing from my previous report (I gave all the
> sysrq I could output) that no one seemed to be able to use it?
>
> Thanks,
> Marc
>
>>> Back to 3.2.16, I'm now seeing this:
>>> [ 840.516733] INFO: task VirtualBox:6818 blocked for more than 120 seconds.
>>> [ 840.516735] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [ 840.516736] VirtualBox D ffff8801fd134080 0 6818 6758 0x00000080
>>> [ 840.516740] ffff8801fd134080 0000000000000086 0000000000000050 ffff880202e7f100
>>> [ 840.516744] 0000000000013580 ffff8801c6f0dfd8 ffff8801c6f0dfd8 ffff8801fd134080
>>> [ 840.516748] ffff8801c6f0da68 ffff8801c6f0da68 ffff88020a4e22f0 ffff88023bc13e08
>>> [ 840.516752] Call Trace:
>>> [ 840.516755] [<ffffffff810b5c67>] ? __lock_page+0x66/0x66
>>> [ 840.516758] [<ffffffff8134aea4>] ? io_schedule+0x58/0x6f
>>> [ 840.516761] [<ffffffff810b5c6d>] ? sleep_on_page+0x6/0xa
>>> [ 840.516764] [<ffffffff8134b1e5>] ? __wait_on_bit_lock+0x3c/0x85
>>> [ 840.516767] [<ffffffff810b5c62>] ? __lock_page+0x61/0x66
>>> [ 840.516770] [<ffffffff81060051>] ? autoremove_wake_function+0x2a/0x2a
>>> [ 840.516785] [<ffffffffa01838d7>] ? extent_write_cache_pages.isra.13.constprop.22+0xf6/0x278 [btrfs]
>>> [ 840.516789] [<ffffffff810ec9cb>] ? __cache_free.isra.40+0x19/0x1a7
>>> [ 840.516792] [<ffffffff8134ed52>] ? sub_preempt_count+0x83/0x94
>>> [ 840.516795] [<ffffffff8134c2dd>] ? _raw_spin_unlock+0x24/0x30
>>> [ 840.516811] [<ffffffffa0183c4b>] ? extent_writepages+0x40/0x57 [btrfs]
>>> [ 840.516826] [<ffffffffa0177f5f>] ? __btrfs_buffered_write+0x2bb/0x2dc [btrfs]
>>> [ 840.516841] [<ffffffffa016e88a>] ? uncompress_inline.isra.44+0x116/0x116 [btrfs]
>>> [ 840.516844] [<ffffffff810b6aaf>] ? __filemap_fdatawrite_range+0x4b/0x50
>>> [ 840.516847] [<ffffffff810b6ad9>] ? filemap_write_and_wait_range+0x25/0x4d
>>> [ 840.516863] [<ffffffffa01782ce>] ? btrfs_file_aio_write+0x34e/0x490 [btrfs]
>>> [ 840.516866] [<ffffffff8103e092>] ? get_parent_ip+0x9/0x1b
>>> [ 840.516882] [<ffffffffa0177f80>] ? __btrfs_buffered_write+0x2dc/0x2dc [btrfs]
>>> [ 840.516886] [<ffffffff8112f19c>] ? aio_rw_vect_retry+0x70/0x18e
>>> [ 840.516888] [<ffffffff8112f12c>] ? aio_fsync+0x22/0x22
>>> [ 840.516891] [<ffffffff8112fbc7>] ? aio_run_iocb+0x72/0x11c
>>> [ 840.516894] [<ffffffff81130d9a>] ? do_io_submit+0x6a4/0x7f9
>>> [ 840.516898] [<ffffffff813508d2>] ? system_call_fastpath+0x16/0x1b
>>> [ 1187.553635] btrfs: unlinked 8 orphans
>>> [ 3810.200064] e1000e 0000:00:19.0: BAR 0: set to [mem 0xfc000000-0xfc01ffff] (PCI address [0xfc000000-0xfc01ffff])
>>> [ 3810.200071] e1000e 0000:00:19.0: BAR 1: set to [mem 0xfc025000-0xfc025fff] (PCI address [0xfc025000-0xfc025fff])
>>> [ 3810.200076] e1000e 0000:00:19.0: BAR 2: set to [io 0x1840-0x185f] (PCI address [0x1840-0x185f])
>>> [ 3810.200093] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10b)
>>> [ 3810.200115] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
>>> [ 3810.200147] e1000e 0000:00:19.0: PME# disabled
>>> [ 3810.200224] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
>>> [ 4671.144685] iwlwifi 0000:03:00.0: Tx aggregation enabled on ra = 2c:b0:5d:3c:7d:f1 tid = 1
>>> [ 4799.384107] btrfs: unlinked 8 orphans
>>> [ 8436.512513] btrfs: unlinked 7 orphans
>>> [11350.749850] btrfs no csum found for inode 3909426 start 0
>>> [11350.750697] btrfs csum failed ino 3909426 off 0 csum 1419704114 private 0
>>> [11652.088805] btrfs no csum found for inode 3910848 start 0
>>> [11652.089524] btrfs csum failed ino 3910848 off 0 csum 3145117582 private 0
>>>
>>> My firefox and chrome profiles were corrupted, so I had to restore them from an old snapshot.
>>>
>>> I can't prove it, but it looks like my corruption happened right at the same
>>> time as I rebooted to 3.4.4.
>>>
>>> Marc
>>> --
>>> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>>> Microsoft is to operating systems ....
>>> .... what McDonalds is to gourmet cooking
>>> Home page: http://marc.merlins.org/
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>