From: Richard Weinberger <richard@nod.at>
To: Anton Ivanov <anton.ivanov@kot-begemot.co.uk>,
"user-mode-linux-devel@lists.sourceforge.net"
<user-mode-linux-devel@lists.sourceforge.net>
Subject: Re: [uml-devel] Old process in D state bug
Date: Fri, 27 Nov 2015 10:59:51 +0100
Message-ID: <56582997.50407@nod.at>
In-Reply-To: <5656D596.4060906@kot-begemot.co.uk>
Hi!
On 26.11.2015 at 10:49, Anton Ivanov wrote:
> Hi List, hi Richard,
>
> While working on the EPOLL I managed to consistently reproduce and get to the bottom of the process-in-D-state bug which you occasionally see with UML. I recall asking
> for Richard's help on this for the first time nearly 5 years ago ;-).
O_o
> It is extremely rare with the POLL based controller, timers and the stock UBD drivers. As you make things go faster (anywhere in UML) it rears its ugly head. So improving the IRQs,
> improving UBD itself, etc. - all make it easier to trigger.
>
> It looks like it is possible to end up in a state where the restart list is not empty (an earlier transaction to the disk io thread failed with EAGAIN), but with no pending IO on
> the UBD IPC thread fd. So the restart list is never re-triggered and the UBD device ends up with a non-empty queue. The process that requested the IO ends up in D state. Any other
> processes trying IO to the same disk join it. As the requests to the same UBD queue up, ultimately, UML goes belly up.
>
> Pinging the UML process with SIGIO does not help as there is no IO pending on the fd. So it is not a lost interrupt. It somehow manages to race forming the restart queue.
>
> If, however, you have more than one UBD device, IO to the other one unsticks it by re-running the restart queue from the UBD interrupt handler.
>
> Once again - this is extremely rare at present, but possible (I have seen it a few times over the last 5 years).
>
> So it needs a viable fix or a workaround. I will have to get this one out of the door first as it constantly gets in the way in debugging both the Epoll and the signals stuff.
Okay, let's collect some facts first.
Is it a guest process or a host process that is in state D?
If it is a guest process, you can use "/proc/<pid>/stack" to find out where in the UML kernel it is blocking.
5 years ago UML didn't have this feature.
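
To make the check above concrete, here is a minimal sketch, to be run inside the guest, that lists every process currently in state D and prints its kernel stack. The function names are my own and purely illustrative. It assumes the kernel exposes /proc/<pid>/stack (only present when it was built with stack trace support) and that you run it with enough privileges to read that file, typically as root.

```python
#!/usr/bin/env python3
"""List processes in uninterruptible sleep (state D) and show their kernel stacks."""
import os


def parse_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or ')' characters, so split after the last ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def d_state_pids(proc_root="/proc"):
    """Yield the PIDs under proc_root whose state is 'D'."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                if parse_state(f.read()) == "D":
                    yield int(entry)
        except (OSError, IndexError):
            # The process exited during the scan, or stat was unreadable.
            continue


def kernel_stack(pid, proc_root="/proc"):
    """Return the text of /proc/<pid>/stack, or a marker if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError as exc:
        return "<unavailable: %s>\n" % exc


if __name__ == "__main__":
    for pid in d_state_pids():
        print("PID %d is in state D:" % pid)
        print(kernel_stack(pid))
```

If the stuck process shows up here, the stack should tell us which UML kernel function it is sleeping in. If nothing shows up in the guest, run the same thing on the host against the UML processes.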
Thanks,
//richard
Thread overview: 3+ messages
2015-11-26 9:49 [uml-devel] Old process in D state bug Anton Ivanov
2015-11-27 9:59 ` Richard Weinberger [this message]
2015-11-27 11:03 ` Anton Ivanov