From: Dmitry Katsubo <dma_k@mail.ru>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Process is blocked for more than 120 seconds
Date: Wed, 11 Nov 2015 12:38:32 +0100 [thread overview]
Message-ID: <564328B8.2020509@mail.ru> (raw)
In-Reply-To: <56409EE4.6070605@gmail.com>
On 2015-11-09 14:25, Austin S Hemmelgarn wrote:
> On 2015-11-07 07:22, Dmitry Katsubo wrote:
>> Hi everyone,
>>
>> I have noticed the following in the log. The system continues to run,
>> but I am not sure for how long it will be stable. Should I start
>> worrying? Thanks in advance for the opinion.
>>
> This just means that a process was stuck in the D state (uninterruptible
> I/O sleep) for more than 120 seconds. Depending on a number of factors,
> this happening could mean:
> 1. Absolutely nothing (if you have low-powered or older hardware, for
> example, I get these regularly on a first generation Raspberry Pi if I
> don't increase the timeout significantly)
> 2. The program is doing a very large chunk of I/O (usually with the
> O_DIRECT flag, although this probably isn't the case here)
> 3. There's a bug in the blocked program (this is rarely the case when
> this type of thing happens)
> 4. There's a bug in the kernel (which is why this dumps a stack trace)
> 5. The filesystem itself is messed up somehow, and the kernel isn't
> handling it properly (technically a bug, but a more specific case of it).
> 6. You're hardware is misbehaving, failing, or experienced a transient
> error.
>
> Assuming you can rule out possibilities 1 and 6, I think that 4 is the
> most likely cause, as all of the listed programs (I'm assuming that
> 'master' is from postfix) are relatively well audited, and all of them
> hit this at the same time.
>
> For what it's worth, if you want you can do:
> echo 0 > /proc/sys/kernel/hung_task_timeout_secs
> like the message says to stop these from appearing in the future, or use
> some arbitrary number to change the timeout before these messages appear
> (I usually use at least 150 on production systems, and more often 300,
> although on something like a Raspberry Pi I often use timeouts as high
> as 1800 seconds).
Thanks for comments, Austin.
The system is "normal" PC, running Intel Core 2 Duo Mobile @1.66GHz.
"master" is indeed a postfix process.
I haven't seen anything like that when I was on 3.16 kernel, but after I
have upgraded to 4.2.3, I caught that message. I/O and CPU load are
usually low, but it could be (6) from your list, as the system is
generally very old (5+ years).
As the problem appeared only once for passed 15 days, I think it is just
a transient error. Thanks for clarifying the possible reasons.
--
With best regards,
Dmitry
next prev parent reply other threads:[~2015-11-11 12:28 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-02 11:43 Process is blocked for more than 120 seconds Dmitry Katsubo
2015-11-07 12:22 ` Dmitry Katsubo
2015-11-09 13:25 ` Austin S Hemmelgarn
2015-11-11 11:38 ` Dmitry Katsubo [this message]
2016-06-16 1:05 ` Dmitry Katsubo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=564328B8.2020509@mail.ru \
--to=dma_k@mail.ru \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).