[uml-devel] Old process in D state bug

* [uml-devel] Old process in D state bug
@ 2015-11-26  9:49 Anton Ivanov
  2015-11-27  9:59 ` Richard Weinberger
  0 siblings, 1 reply; 3+ messages in thread
From: Anton Ivanov @ 2015-11-26  9:49 UTC (permalink / raw)
  To: user-mode-linux-devel@lists.sourceforge.net; +Cc: Richard Weinberger

Hi List, hi Richard,

While working on the epoll changes I managed to reproduce consistently, 
and get to the bottom of, the process-in-D-state bug which you 
occasionally see with UML. I recall asking for Richard's help on this 
for the first time nearly 5 years ago ;-).

It is extremely rare with the POLL-based controller, timers and the 
stock UBD drivers. As you make things go faster (anywhere in UML) it 
rears its ugly head. So improving the IRQs, improving UBD itself, etc. 
all make it easier to trigger.

It looks like it is possible to end up in a state where the restart list 
is not empty (an earlier transaction to the disk io thread failed with 
EAGAIN), but with no pending IO on the UBD IPC thread fd. So the restart 
list is never re-triggered and the UBD device ends up with a non-empty 
queue. The process that requested the IO ends up in D state. Any other 
processes trying IO to the same disk join it. As the requests to the 
same UBD queue up, ultimately, UML goes belly up.

Pinging the UML process with SIGIO does not help as there is no IO 
pending on the fd, so it is not a lost interrupt. Somehow there is a 
race in forming the restart queue.

If, however, you have more than one UBD device, IO to the other one 
unsticks it by re-running the restart queue from the UBD interrupt 
handler.

Once again - this is extremely rare at present, but possible (I have 
seen it a few times over the last 5 years).

So it needs a viable fix or a workaround. I will have to get this one 
out of the door first, as it constantly gets in the way of debugging 
both the epoll and the signals work.

A.

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

* Re: [uml-devel] Old process in D state bug
  2015-11-26  9:49 [uml-devel] Old process in D state bug Anton Ivanov
@ 2015-11-27  9:59 ` Richard Weinberger
  2015-11-27 11:03   ` Anton Ivanov
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Weinberger @ 2015-11-27  9:59 UTC (permalink / raw)
  To: Anton Ivanov, user-mode-linux-devel@lists.sourceforge.net

Hi!

On 26.11.2015 at 10:49, Anton Ivanov wrote:
> Hi List, hi Richard,
> 
> While working on the EPOLL I managed to consistently reproduce and get down to the bottom of the process in D state bug which you occasionally see with UML. I recall asking
> Richard's help on this for the first time nearly 5 years ago ;-).

O_o

> It is extremely rare with the POLL based controller, timers and the stock UBD drivers. As you make things go faster (anywhere in UML) it rears its ugly head. So improving the IRQs,
> improving UBD itself, etc - all make it easier to trigger.
> 
> It looks like it is possible to end up in a state where the restart list is not empty (an earlier transaction to the disk io thread failed with EAGAIN), but with no pending IO on
> the UBD IPC thread fd. So the restart list is never re-triggered and the UBD device ends up with a non-empty queue. The process that requested the IO ends up in D state. Any other
> processes trying IO to the same disk join it. As the requests to the same UBD queue up, ultimately, UML goes belly up.
> 
> Pinging the UML process with SIGIO does not help as there is no IO pending on the fd. So it is not a lost interrupt. It somehow manages to race forming the restart queue.
> 
> If, however, you have more than one UBD device, IO to the other one unsticks it by re-running the restart queue out of the ubd interrupt handler.
> 
> Once again - this is extremely rare at present, but possible (I have seen it a few times over the last 5 years).
> 
> So it needs a viable fix or a workaround. I will have to get this one out of the door first as it constantly gets in the way in debugging both the Epoll and the signals stuff.

Okay, let's collect some facts first.
Is a guest or a host process in state D?
If it is a guest process, you can use "/proc/<pid>/stack" to find out where in the UML kernel it is blocking.
5 years ago UML didn't have this feature.
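
To answer the first question quickly, the state letter can be read from /proc/<pid>/stat on whichever side (guest or host) you are checking. A minimal sketch follows; the helper names stat_state and pid_state are made up for illustration, and it only relies on the documented "pid (comm) state ..." layout, where "D" means uninterruptible sleep.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the one-letter state from a /proc/<pid>/stat line, or '?' if the
 * line is malformed. The layout is "pid (comm) state ...". comm may itself
 * contain spaces or ')', so look for the LAST ')' in the line. */
char stat_state(const char *line)
{
    const char *p;

    assert(line != NULL);
    p = strrchr(line, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Read /proc/<pid>/stat and return the state letter, or '?' on any error. */
char pid_state(int pid)
{
    char path[64], line[1024];
    char state = '?';
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '?';
    if (fgets(line, sizeof(line), f) != NULL)
        state = stat_state(line);
    fclose(f);
    return state;
}
```

If pid_state() returns 'D' for the stuck process inside the guest, then /proc/<pid>/stack for that same pid is the next thing to look at.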

Thanks,
//richard

* Re: [uml-devel] Old process in D state bug
  2015-11-27  9:59 ` Richard Weinberger
@ 2015-11-27 11:03   ` Anton Ivanov
  0 siblings, 0 replies; 3+ messages in thread
From: Anton Ivanov @ 2015-11-27 11:03 UTC (permalink / raw)
  To: Richard Weinberger, user-mode-linux-devel@lists.sourceforge.net

On 27/11/15 09:59, Richard Weinberger wrote:
> Hi!
>
> On 26.11.2015 at 10:49, Anton Ivanov wrote:
>> Hi List, hi Richard,
>>
>> While working on the EPOLL I managed to consistently reproduce and get down to the bottom of the process in D state bug which you occasionally see with UML. I recall asking
>> Richard's help on this for the first time nearly 5 years ago ;-).
> O_o
>
>> It is extremely rare with the POLL based controller, timers and the stock UBD drivers. As you make things go faster (anywhere in UML) it rears its ugly head. So improving the IRQs,
>> improving UBD itself, etc - all make it easier to trigger.
>>
>> It looks like it is possible to end up in a state where the restart list is not empty (an earlier transaction to the disk io thread failed with EAGAIN), but with no pending IO on
>> the UBD IPC thread fd. So the restart list is never re-triggered and the UBD device ends up with a non-empty queue. The process that requested the IO ends up in D state. Any other
>> processes trying IO to the same disk join it. As the requests to the same UBD queue up, ultimately, UML goes belly up.
>>
>> Pinging the UML process with SIGIO does not help as there is no IO pending on the fd. So it is not a lost interrupt. It somehow manages to race forming the restart queue.
>>
>> If, however, you have more than one UBD device, IO to the other one unsticks it by re-running the restart queue out of the ubd interrupt handler.
>>
>> Once again - this is extremely rare at present, but possible (I have seen it a few times over the last 5 years).
>>
>> So it needs a viable fix or a workaround. I will have to get this one out of the door first as it constantly gets in the way in debugging both the Epoll and the signals stuff.
> Okay, let's collect some facts first.
> Is a guest or a host process in state D?

Guest.

> If it is a guest process, you can use "/proc/<pid>/stack" to find out where in the UML kernel it is blocking.

Done that already using the uml_mconsole functionality :)

Based on that, it looks like block IO. The process has issued a read 
request which has gone through the guts of ext3 and has got as far as 
the block layer. It is not inside UBD; the trace shows it a couple of 
layers up. I can roll back my tree to a state where I can reproduce 
that reliably and retake the trace (sorry, I did not keep it).

> 5 years ago UML didn't have this feature.

I already did some experiments and did some re-reading of the source:

1. You can have a failed atomic allocation. This is rare, but it 
happens. That is the point of it being atomic: it does not wait.
2. If the atomic allocation fails in the UBD submit routine, the 
request is bounced back up and the device is scheduled for a restart 
(it is added to the restart list).
3. The restarts themselves happen only if you have IO, because they are 
done at the bottom of the IRQ routine. There is no other place that 
restarts the IO.

So what happens if you have no IO pending from the disk IO thread and a 
failed allocation has scheduled a device for a restart? As far as I can 
see, nothing ever reruns the restart list.
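
Steps 1-3 can be put into a toy model. This is a sketch under my own assumptions, not the real UBD code: every name in it (toy_dev, toy_ubd, toy_submit, toy_irq, toy_stalled) is made up, and a single shared in-flight counter stands in for the IO thread fd. It shows the stall, and also why IO to a second device unsticks the first one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real UBD structures. */
struct toy_dev {
    int  queued;      /* requests still sitting in the device queue */
    bool on_restart;  /* device is on the restart list */
};

struct toy_ubd {
    int inflight;     /* requests sent to the IO thread, reply not yet read */
};

/* Submit path: hand one queued request to the IO thread. alloc_ok models
 * the atomic allocation; on failure the device goes on the restart list. */
bool toy_submit(struct toy_ubd *u, struct toy_dev *d, bool alloc_ok)
{
    assert(u != NULL && d != NULL);
    if (d->queued == 0)
        return true;
    if (!alloc_ok) {
        d->on_restart = true;
        return false;
    }
    d->queued--;
    u->inflight++;
    return true;
}

/* Interrupt handler: only runs when the IO thread has a reply pending.
 * Completes one request, then reruns the restart list. Returns false when
 * nothing was pending, i.e. the handler would never have been called. */
bool toy_irq(struct toy_ubd *u, struct toy_dev *devs, size_t n)
{
    size_t i;

    if (u->inflight == 0)
        return false;
    u->inflight--;
    for (i = 0; i < n; i++) {
        if (devs[i].on_restart) {
            devs[i].on_restart = false;
            toy_submit(u, &devs[i], true);
        }
    }
    return true;
}

/* Stalled: a device waits for a restart but no reply will ever arrive. */
bool toy_stalled(const struct toy_ubd *u, const struct toy_dev *devs, size_t n)
{
    size_t i;

    if (u->inflight != 0)
        return false;
    for (i = 0; i < n; i++)
        if (devs[i].on_restart && devs[i].queued > 0)
            return true;
    return false;
}
```

With one device and a failed first submission, toy_stalled() stays true forever because toy_irq() has nothing to run on. A successful submission on a second device puts a reply in flight, and the next toy_irq() restarts the first device as a side effect.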

That, however, may not be the only way to get into this state. I added 
a few debug printks; they are extremely difficult to trigger, and it 
seems possible to reach the stall by a different route.

I have yet to figure out what triggers it in that other case.

I am also trying a couple of potential solutions:

1. Start a timer on IO submission and update it on last IO so we have an 
efficient timeout mechanism. Rerun the event loop and the restart queue 
if the timer expires. This is much closer to how a real disk operates - 
we can even recover from a failed IO thread this way.

I tried that already. It works 100%, but looks a bit expensive. I do not 
like it. I also need to do a few experiments to ensure that the effect 
is real and not a Heisenbug from triggering more timer interrupts.
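
Roughly, the timer idea looks like the sketch below. It is a simplified model, not the patch I tested: the names (wd_disk, wd_activity, wd_submit, wd_tick) are invented for illustration, time is an abstract tick counter, and "rerunning the restart queue" is reduced to clearing a flag and counting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-disk state; time is an abstract tick counter. */
struct wd_disk {
    bool restart_pending; /* device is waiting on the restart list */
    int  restarts_run;    /* how many times the watchdog reran the list */
    bool armed;           /* watchdog is running */
    long deadline;        /* tick at which the watchdog fires */
    long timeout;         /* ticks of IO silence tolerated */
};

/* Call on every submission attempt and on every completion: it pushes
 * the deadline out, so the watchdog only fires after a quiet period. */
void wd_activity(struct wd_disk *d, long now)
{
    assert(d->timeout > 0);
    d->armed = true;
    d->deadline = now + d->timeout;
}

/* Submission attempt; alloc_ok models the atomic allocation. */
void wd_submit(struct wd_disk *d, long now, bool alloc_ok)
{
    if (!alloc_ok)
        d->restart_pending = true;
    wd_activity(d, now);
}

/* Periodic check. Returns true if the watchdog fired at this tick; when
 * it fires it reruns the restart list (here: clears the flag and counts). */
bool wd_tick(struct wd_disk *d, long now)
{
    if (!d->armed || now < d->deadline)
        return false;
    d->armed = false;
    if (d->restart_pending) {
        d->restart_pending = false;
        d->restarts_run++;
    }
    return true;
}
```

The cost I do not like is visible even here: every submission and completion touches the timer, whether or not anything ever gets stuck.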

2. Replace the blocking IO in the disk IO thread with non-blocking IO, 
poll with a timeout, and send a NULL or a pointer to a static 
"keepalive" transaction whenever the timeout expires. This retriggers 
the IO interrupt and reruns both the IO and the restart queue. It is 
also much better aligned with what is needed to make the disk IO 
faster by batching transactions, as the IO thread can then read N disk 
IO requests at a time.
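
One iteration of such an IO thread loop could look like the sketch below. It uses plain POSIX poll(), read() and write() on two fds; the record type ka_req and the function ka_io_thread_step are hypothetical, id 0 is my stand-in for the "keepalive" transaction, and the actual disk IO, O_NONBLOCK handling and batching are left out.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical request record; id 0 is reserved for the keepalive. */
struct ka_req {
    int id;
};

/* One iteration of the IO thread loop. Waits up to timeout_ms for a
 * request on req_fd. If one arrives it is "processed" and echoed to
 * reply_fd; if the wait times out a keepalive is written instead, so the
 * other side always gets woken up eventually.
 * Returns 1 for a handled request, 0 for a keepalive, -1 on error. */
int ka_io_thread_step(int req_fd, int reply_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = req_fd, .events = POLLIN };
    struct ka_req req;
    int n;

    assert(timeout_ms >= 0); /* a finite timeout, or no keepalive ever fires */

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;

    if (n == 0) {
        /* Timed out: nothing to do, send the keepalive. */
        req.id = 0;
        if (write(reply_fd, &req, sizeof(req)) != (ssize_t)sizeof(req))
            return -1;
        return 0;
    }

    if (read(req_fd, &req, sizeof(req)) != (ssize_t)sizeof(req))
        return -1;
    /* The real disk IO for this request would happen here. */
    if (write(reply_fd, &req, sizeof(req)) != (ssize_t)sizeof(req))
        return -1;
    return 1;
}
```

The keepalive write is what makes the reply fd readable on the UML side even when no real IO is outstanding, which is exactly the condition under which the restart queue is otherwise never rerun.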

I will try that over the weekend and report back with traces.

A.

>
> Thanks,
> //richard
>
