From: Josef Bacik <jbacik@fb.com>
To: Marc MERLIN <marc@merlins.org>, <linux-btrfs@vger.kernel.org>
Subject: Re: Also seeing full deadlocks with 3.15.1
Date: Fri, 27 Jun 2014 15:36:08 -0700
Message-ID: <53ADF1D8.9060309@fb.com>
In-Reply-To: <20140627185009.GA21428@merlins.org>

On 06/27/2014 11:50 AM, Marc MERLIN wrote:
> My laptop deadlocked some more times (everything works until it needs to
> touch the filesystem, and then it's deadlocked).
> Unfortunately, I can trigger sysrq, but it doesn't get committed to disk and
> netconsole eats half of it because it goes too fast for UDP apparently
>
> Now, I just captured that on my server with serial console.
>
> 11005  1-16:11:10 wait_current_trans.isra.15     /usr/bin/zma -m 3
> 14441  1-16:07:44 wait_current_trans.isra.15     /usr/bin/zma -m 1
> 17045  1-23:53:33 wait_current_trans.isra.15     /usr/bin/zma -m 9
> 22261  2-00:40:36 wait_current_trans.isra.15     /usr/bin/zma -m 6
> 22292  2-00:40:36 wait_current_trans.isra.15     /usr/bin/zma -m 8
>
> 19911    09:29:35 wait_current_trans.isra.15     rm -f -- /mnt/dshelf2/backup/0Notmachines/mysql//mysql.daily.sql.gz.13 /mnt/dshelf2/backup/0Notmachines/mysql//mysql.daily.sql.gz.13.gz
> 22848  1-05:18:35 wait_current_trans.isra.15     rm -f -- mnt/dshelf2/backup/0Notmachines/jen//backup.tar.bz.11 mnt/dshelf2/backup/0Notmachines/jen//backup.tar.bz.11.gz
>
> Those are 2 different filesystems (one single device mapper disk, the other one is btrfs raid1), so I'm not sure which one of the 2 caused the problem, but I'm perplexed as to why one would then hang the other, unless they both hit the same bug?
>
> The sysrq-w output is here:
> http://marc.merlins.org/tmp/btrfs-hang.txt
>
> but here is one hung process:
>   zma		D 0000000000000003     0 22292	    1 0x20020084
>    ffff880074733bb0 0000000000000082 ffff8800c933f270 ffff880074733fd8
>    ffff8801853b4610 00000000000141c0 ffff8801aac60f00 ffff880036caa9e8
>    0000000000000000 ffff880036caa800 ffff8801db59f0c0 ffff880074733bc0
>   Call Trace:
>    [<ffffffff8161d3c6>] schedule+0x73/0x75
>    [<ffffffff8122a87b>] wait_current_trans.isra.15+0x98/0xf4
>    [<ffffffff810847ed>] ? finish_wait+0x65/0x65
>    [<ffffffff8122bd95>] start_transaction+0x498/0x4fc
>    [<ffffffff8122be14>] btrfs_start_transaction+0x1b/0x1d
>    [<ffffffff8123602a>] btrfs_create+0x3c/0x1ce
>    [<ffffffff81298985>] ? security_inode_permission+0x1c/0x23
>    [<ffffffff8115e93e>] ? __inode_permission+0x79/0xa4
>    [<ffffffff8115fbfc>] vfs_create+0x66/0x8c
>    [<ffffffff8116095e>] do_last+0x5af/0xa23
>    [<ffffffff81161009>] path_openat+0x237/0x4de
>    [<ffffffff81162408>] do_filp_open+0x3a/0x7f
>    [<ffffffff8161faeb>] ? _raw_spin_unlock+0x17/0x2a
>    [<ffffffff8116c3eb>] ? __alloc_fd+0xea/0xf9
>    [<ffffffff8115499d>] do_sys_open+0x70/0xff
>    [<ffffffff81194e20>] compat_SyS_open+0x1b/0x1d
>    [<ffffffff8162842c>] sysenter_dispatch+0x7/0x21
>
> As per the other thread, I'm happy to test a patch against 3.15, but not hot about switching to a likely even less stable 3.16 since it's a real server with real data.
>

A few other people have complained about this. I haven't been able to reproduce
it, but I have a patch you can try.  It keeps the box from deadlocking, but I
still need the output: look for "timed out" in the logs; that's when you need
to dump the logs and send them to me.  The patch is here:


http://ur1.ca/hlj6d
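
A minimal sketch of checking a saved kernel log for that marker. It assumes
only that the patch's message contains the literal string "timed out" (the
exact wording is not shown in this thread); the function name and the
script/log file names are hypothetical.

```python
import re
import sys
from typing import Iterable, List

# Assumption: the debugging patch logs a line containing "timed out".
# The exact message text is not given in the thread, so match loosely.
MARKER = re.compile(r"timed out", re.IGNORECASE)


def find_timeouts(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the marker, newline stripped."""
    return [line.rstrip("\n") for line in lines if MARKER.search(line)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read a previously saved log, e.g. the output of `dmesg` or a
    # serial-console capture, and print the matching lines.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        hits = find_timeouts(log)
    for hit in hits:
        print(hit)
    # A non-zero exit status signals that it is time to dump and send the logs.
    sys.exit(1 if hits else 0)
```

For example, save the log with `dmesg > dmesg.txt` and run
`python3 find_timeouts.py dmesg.txt`; any printed line means the full log
should be captured and sent along.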

Thanks,

Josef
