public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
From: Chris Friesen <chris.friesen@windriver.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Subject: Re: looking for assistance with jbd2 (and other processes) hung trying to write to disk
Date: Wed, 11 Nov 2020 10:24:35 -0600	[thread overview]
Message-ID: <654925b2-7c4a-2432-f75e-4fb9ef08a816@windriver.com> (raw)
In-Reply-To: <20201111155743.GG28132@quack2.suse.cz>

On 11/11/2020 9:57 AM, Jan Kara wrote:
> On Tue 10-11-20 09:57:39, Chris Friesen wrote:
>> Just to be sure, I'm looking for whoever has the BH_Lock bit set on the
>> buffer_head "b_state" field, right?  I don't see any ownership field the way
>> we have for mutexes.  Is there some way to find out who would have locked
>> the buffer?
> 
> Buffer lock is a bitlock so there's no owner field. If you can reproduce
> the problem at will and can use debug kernels, then it's easiest to add
> code to lock_buffer() (and fields to struct buffer_head) to track lock
> owner and then see who locked the buffer. Without this, the only way is to
> check stack traces of all UN processes and see whether some stacktrace
> looks suspicious like it could hold the buffer locked (e.g. recursing into
> memory allocation and reclaim while holding buffer locked or something like
> that)...

That's what I thought. :)   Debug kernels are doable, but unfortunately 
we can't (yet) reproduce the problem at will.  Naturally it's only shown 
up in a couple of customer sites so far and not in any test labs.

> As Ted wrote the buffer is indeed usually locked because the IO is running
> and that would be the expected situation with the jdb2 stacktrace you
> posted. So it could also be the IO got stuck somewhere in the block layer
> or NVME (frankly, AFAIR NVME was pretty rudimentary with 3.10). To see
> whether that's the case, you need to find 'bio' pointing to the buffer_head
> (through bi_private field), possibly also struct request for that bio and see
> what state they are in... Again, if you can run debug kernels, you can
> write code to simplify this search for you...

Thanks, that's helpful.

Chris

      reply	other threads:[~2020-11-11 16:24 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-09 21:11 looking for assistance with jbd2 (and other processes) hung trying to write to disk Chris Friesen
2020-11-10 11:42 ` Jan Kara
2020-11-10 15:57   ` Chris Friesen
2020-11-10 19:46     ` Theodore Y. Ts'o
2020-11-10 20:14       ` Chris Friesen
2020-11-11 15:57     ` Jan Kara
2020-11-11 16:24       ` Chris Friesen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=654925b2-7c4a-2432-f75e-4fb9ef08a816@windriver.com \
    --to=chris.friesen@windriver.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox