* [Qemu-devel] net/tap.c: Possibly a way to stall tap input
@ 2013-08-01 17:15 Jan Kiszka
2013-08-01 17:24 ` Jan Kiszka
2013-08-02 11:46 ` Stefan Hajnoczi
0 siblings, 2 replies; 9+ messages in thread
From: Jan Kiszka @ 2013-08-01 17:15 UTC (permalink / raw)
To: qemu-devel, Stefan Hajnoczi
Hi all,
I'm tracking down a nasty stall of tap input over a custom 1.3.x QEMU
version. Under certain load, our tap backend stops reading from the char
device, and that even if we reset the guest. The frontend device
(pcnet32) is able to receive (can_receive would return > 0), but the
tap's fd is no longer registered with the iohandler list.
I was digging into the involved code and found something fishy:
net/tap.c:
static void tap_send(void *opaque)
{
...
size = qemu_send_packet_async(&s->nc, buf, size,
tap_send_completed);
if (size == 0) {
tap_read_poll(s, false);
}
So, if tap_send is registered for the mainloop polling (ie. can_receive
returned true before starting to poll) but qemu_send_packet_async
returns 0 now as qemu_can_send_packet/can_receive happens to report
false in the meantime, we will disable read polling. If also write
polling is off, the fd will be completely removed from the iohandler
list. But even if write polling remains on, I wonder what should bring
read polling back?
We only have an unhandy reproduction scenario, so I wasn't able to
confirm this theory on the target yet (and will not be before Monday,
unfortunately). But any comments on this would be very welcome.
Thanks,
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-01 17:15 [Qemu-devel] net/tap.c: Possibly a way to stall tap input Jan Kiszka
@ 2013-08-01 17:24 ` Jan Kiszka
2013-08-02 7:33 ` Stefan Hajnoczi
2013-08-02 11:46 ` Stefan Hajnoczi
1 sibling, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2013-08-01 17:24 UTC (permalink / raw)
To: qemu-devel, Stefan Hajnoczi
On 2013-08-01 19:15, Jan Kiszka wrote:
> Hi all,
>
> I'm tracking down a nasty stall of tap input over a custom 1.3.x QEMU
> version. Under certain load, our tap backend stops reading from the char
> device, and that even if we reset the guest. The frontend device
> (pcnet32) is able to receive (can_receive would return > 0), but the
^^^^^^^
Yes, the pcnet lacks qemu_flush_queued_packets, like certain other NIC
models already have. We added that to pcnet_init and pcnet_start (patch
will follow), but that didn't make a difference, likely due to what I
described below.
Jan
> tap's fd is no longer registered with the iohandler list.
>
> I was digging into the involved code and found something fishy:
>
> net/tap.c:
> static void tap_send(void *opaque)
> {
> ...
> size = qemu_send_packet_async(&s->nc, buf, size,
> tap_send_completed);
> if (size == 0) {
> tap_read_poll(s, false);
> }
>
> So, if tap_send is registered for the mainloop polling (ie. can_receive
> returned true before starting to poll) but qemu_send_packet_async
> returns 0 now as qemu_can_send_packet/can_receive happens to report
> false in the meantime, we will disable read polling. If also write
> polling is off, the fd will be completely removed from the iohandler
> list. But even if write polling remains on, I wonder what should bring
> read polling back?
>
> We only have an unhandy reproduction scenario, so I wasn't able to
> confirm this theory on the target yet (and will not be before Monday,
> unfortunately). But any comments on this would be very welcome.
>
> Thanks,
> Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-01 17:24 ` Jan Kiszka
@ 2013-08-02 7:33 ` Stefan Hajnoczi
2013-08-02 7:45 ` Jan Kiszka
0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2013-08-02 7:33 UTC (permalink / raw)
To: Jan Kiszka; +Cc: qemu-devel
On Thu, Aug 01, 2013 at 07:24:13PM +0200, Jan Kiszka wrote:
> On 2013-08-01 19:15, Jan Kiszka wrote:
> > Hi all,
> >
> > I'm tracking down a nasty stall of tap input over a custom 1.3.x QEMU
> > version. Under certain load, our tap backend stops reading from the char
> > device, and that even if we reset the guest. The frontend device
> > (pcnet32) is able to receive (can_receive would return > 0), but the
> ^^^^^^^
> Yes, the pcnet lacks qemu_flush_queued_packets, like certain other NIC
> models already have. We added that to pcnet_init and pcnet_start (patch
> will follow), but that didn't make a difference, likely due to what I
> described below.
Yes, you need to add qemu_flush_queued_packets() where the guest enables
RX or marks RX descriptors free (usually by writing to a hardware
register to update the RX ring index).
Please also confirm that you are using -netdev tap -device pcnet32
instead of the legacy -net syntax. The reason I'm asking is because I
remember the same issue with net/hub.c, which gets in the way if you use
legacy syntax (see 199ee608).
Hope this helps,
Stefan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-02 7:33 ` Stefan Hajnoczi
@ 2013-08-02 7:45 ` Jan Kiszka
0 siblings, 0 replies; 9+ messages in thread
From: Jan Kiszka @ 2013-08-02 7:45 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel
On 2013-08-02 09:33, Stefan Hajnoczi wrote:
> On Thu, Aug 01, 2013 at 07:24:13PM +0200, Jan Kiszka wrote:
>> On 2013-08-01 19:15, Jan Kiszka wrote:
>>> Hi all,
>>>
>>> I'm tracking down a nasty stall of tap input over a custom 1.3.x QEMU
>>> version. Under certain load, our tap backend stops reading from the char
>>> device, and that even if we reset the guest. The frontend device
>>> (pcnet32) is able to receive (can_receive would return > 0), but the
>> ^^^^^^^
>> Yes, the pcnet lacks qemu_flush_queued_packets, like certain other NIC
>> models already have. We added that to pcnet_init and pcnet_start (patch
>> will follow), but that didn't make a difference, likely due to what I
>> described below.
>
> Yes, you need to add qemu_flush_queued_packets() where the guest enables
> RX or marks RX descriptors free (usually by writing to a hardware
> register to update the RX ring index).
For the pcnet, this is different as can_receive checks for stop and
suspend states.
>
> Please also confirm that you are using -netdev tap -device pcnet32
> instead of the legacy -net syntax. The reason I'm asking is because I
> remember the same issue with net/hub.c, which gets in the way if you use
> legacy syntax (see 199ee608).
We are using legacy syntax. But my analysis of the issue in tap.c
appears to be independent of other networking components. Possibly,
other setup paper over it, but the conceptual issue remains.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-01 17:15 [Qemu-devel] net/tap.c: Possibly a way to stall tap input Jan Kiszka
2013-08-01 17:24 ` Jan Kiszka
@ 2013-08-02 11:46 ` Stefan Hajnoczi
2013-08-02 12:45 ` Jan Kiszka
1 sibling, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2013-08-02 11:46 UTC (permalink / raw)
To: Jan Kiszka; +Cc: qemu-devel
On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
> I was digging into the involved code and found something fishy:
>
> net/tap.c:
> static void tap_send(void *opaque)
> {
> ...
> size = qemu_send_packet_async(&s->nc, buf, size,
> tap_send_completed);
> if (size == 0) {
> tap_read_poll(s, false);
> }
>
> So, if tap_send is registered for the mainloop polling (ie. can_receive
> returned true before starting to poll) but qemu_send_packet_async
> returns 0 now as qemu_can_send_packet/can_receive happens to report
> false in the meantime, we will disable read polling. If also write
> polling is off, the fd will be completely removed from the iohandler
> list. But even if write polling remains on, I wonder what should bring
> read polling back?
This behavior seems fine to me. Once the peer (pcnet) is able to
receive again it must flush the queue, this will re-enable
tap_read_poll().
Can you explain a bit more why this would be a problem?
Stefan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-02 11:46 ` Stefan Hajnoczi
@ 2013-08-02 12:45 ` Jan Kiszka
2013-08-02 16:49 ` Jan Kiszka
2013-08-02 19:41 ` Jan Kiszka
0 siblings, 2 replies; 9+ messages in thread
From: Jan Kiszka @ 2013-08-02 12:45 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel
On 2013-08-02 13:46, Stefan Hajnoczi wrote:
> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
>> I was digging into the involved code and found something fishy:
>>
>> net/tap.c:
>> static void tap_send(void *opaque)
>> {
>> ...
>> size = qemu_send_packet_async(&s->nc, buf, size,
>> tap_send_completed);
>> if (size == 0) {
>> tap_read_poll(s, false);
>> }
>>
>> So, if tap_send is registered for the mainloop polling (ie. can_receive
>> returned true before starting to poll) but qemu_send_packet_async
>> returns 0 now as qemu_can_send_packet/can_receive happens to report
>> false in the meantime, we will disable read polling. If also write
>> polling is off, the fd will be completely removed from the iohandler
>> list. But even if write polling remains on, I wonder what should bring
>> read polling back?
>
> This behavior seems fine to me. Once the peer (pcnet) is able to
> receive again it must flush the queue, this will re-enable
> tap_read_poll().
>
> Can you explain a bit more why this would be a problem?
The problem is that I don't see at all what will call tap_read_poll(s,
1), neither in theory nor in reality.
As long as the real test case is out of reach, I tried to emulate the
faulty behaviour by letting tap_can_send always return 1. Result:
reception stalls during boot as even qemu_flush_queued_packets cannot
get it running again once tap_read_poll(s, 0) was called.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-02 12:45 ` Jan Kiszka
@ 2013-08-02 16:49 ` Jan Kiszka
2013-08-02 19:41 ` Jan Kiszka
1 sibling, 0 replies; 9+ messages in thread
From: Jan Kiszka @ 2013-08-02 16:49 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel
On 2013-08-02 14:45, Jan Kiszka wrote:
> On 2013-08-02 13:46, Stefan Hajnoczi wrote:
>> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
>>> I was digging into the involved code and found something fishy:
>>>
>>> net/tap.c:
>>> static void tap_send(void *opaque)
>>> {
>>> ...
>>> size = qemu_send_packet_async(&s->nc, buf, size,
>>> tap_send_completed);
>>> if (size == 0) {
>>> tap_read_poll(s, false);
>>> }
>>>
>>> So, if tap_send is registered for the mainloop polling (ie. can_receive
>>> returned true before starting to poll) but qemu_send_packet_async
>>> returns 0 now as qemu_can_send_packet/can_receive happens to report
>>> false in the meantime, we will disable read polling. If also write
>>> polling is off, the fd will be completely removed from the iohandler
>>> list. But even if write polling remains on, I wonder what should bring
>>> read polling back?
>>
>> This behavior seems fine to me. Once the peer (pcnet) is able to
>> receive again it must flush the queue, this will re-enable
>> tap_read_poll().
>>
>> Can you explain a bit more why this would be a problem?
>
> The problem is that I don't see at all what will call tap_read_poll(s,
> 1), neither in theory nor in reality.
>
> As long as the real test case is out of reach, I tried to emulate the
> faulty behaviour by letting tap_can_send always return 1. Result:
> reception stalls during boot as even qemu_flush_queued_packets cannot
> get it running again once tap_read_poll(s, 0) was called.
OK, this is the bug: When a NIC becomes ready to send or receive again,
the issued qemu_flush_queued_packets will only flush queued packets that
are supposed to leave the NIC, none that may have been queued at the
output of the corresponding backend. For the case of hub-based setups,
we need to propagate this flush via the hub to all attached peers. This
flush will trigger the send callback of tap, and that will re-enable
receive polling.
So this is actually a generic bug that should theoretically affect any
user space NIC, with or without a hub in the middle. I'll cook up a fix,
play with it on Monday and share the outcome.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-02 12:45 ` Jan Kiszka
2013-08-02 16:49 ` Jan Kiszka
@ 2013-08-02 19:41 ` Jan Kiszka
2013-08-05 11:38 ` Stefan Hajnoczi
1 sibling, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2013-08-02 19:41 UTC (permalink / raw)
Cc: qemu-devel, Stefan Hajnoczi
[-- Attachment #1: Type: text/plain, Size: 2140 bytes --]
On 2013-08-02 14:45, Jan Kiszka wrote:
> On 2013-08-02 13:46, Stefan Hajnoczi wrote:
>> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
>>> I was digging into the involved code and found something fishy:
>>>
>>> net/tap.c:
>>> static void tap_send(void *opaque)
>>> {
>>> ...
>>> size = qemu_send_packet_async(&s->nc, buf, size,
>>> tap_send_completed);
>>> if (size == 0) {
>>> tap_read_poll(s, false);
>>> }
>>>
>>> So, if tap_send is registered for the mainloop polling (ie. can_receive
>>> returned true before starting to poll) but qemu_send_packet_async
>>> returns 0 now as qemu_can_send_packet/can_receive happens to report
>>> false in the meantime, we will disable read polling. If also write
>>> polling is off, the fd will be completely removed from the iohandler
>>> list. But even if write polling remains on, I wonder what should bring
>>> read polling back?
>>
>> This behavior seems fine to me. Once the peer (pcnet) is able to
>> receive again it must flush the queue, this will re-enable
>> tap_read_poll().
>>
>> Can you explain a bit more why this would be a problem?
>
> The problem is that I don't see at all what will call tap_read_poll(s,
> 1), neither in theory nor in reality.
>
> As long as the real test case is out of reach, I tried to emulate the
> faulty behaviour by letting tap_can_send always return 1. Result:
> reception stalls during boot as even qemu_flush_queued_packets cannot
> get it running again once tap_read_poll(s, 0) was called.
OK, false alarm. The issue was most likely fixed by commit 199ee608
(net: fix qemu_flush_queued_packets() in presence of a hub) which is
present in 1.5.x but not 1.3.x. We initially tried to test on 1.5 but
had to role back to 1.3 due to other issues - and missed this fix.
My understanding of the networking maze was confused by the unfortunate
naming of the incoming net client queues ("send_queue") - will propose a
renaming.
This still requires a confirmation on the target, but I'm quite
optimistic now.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
2013-08-02 19:41 ` Jan Kiszka
@ 2013-08-05 11:38 ` Stefan Hajnoczi
0 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2013-08-05 11:38 UTC (permalink / raw)
To: Jan Kiszka; +Cc: qemu-devel, Stefan Hajnoczi
On Fri, Aug 02, 2013 at 09:41:24PM +0200, Jan Kiszka wrote:
> On 2013-08-02 14:45, Jan Kiszka wrote:
> > On 2013-08-02 13:46, Stefan Hajnoczi wrote:
> >> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
> >>> I was digging into the involved code and found something fishy:
> >>>
> >>> net/tap.c:
> >>> static void tap_send(void *opaque)
> >>> {
> >>> ...
> >>> size = qemu_send_packet_async(&s->nc, buf, size,
> >>> tap_send_completed);
> >>> if (size == 0) {
> >>> tap_read_poll(s, false);
> >>> }
> >>>
> >>> So, if tap_send is registered for the mainloop polling (ie. can_receive
> >>> returned true before starting to poll) but qemu_send_packet_async
> >>> returns 0 now as qemu_can_send_packet/can_receive happens to report
> >>> false in the meantime, we will disable read polling. If also write
> >>> polling is off, the fd will be completely removed from the iohandler
> >>> list. But even if write polling remains on, I wonder what should bring
> >>> read polling back?
> >>
> >> This behavior seems fine to me. Once the peer (pcnet) is able to
> >> receive again it must flush the queue, this will re-enable
> >> tap_read_poll().
> >>
> >> Can you explain a bit more why this would be a problem?
> >
> > The problem is that I don't see at all what will call tap_read_poll(s,
> > 1), neither in theory nor in reality.
> >
> > As long as the real test case is out of reach, I tried to emulate the
> > faulty behaviour by letting tap_can_send always return 1. Result:
> > reception stalls during boot as even qemu_flush_queued_packets cannot
> > get it running again once tap_read_poll(s, 0) was called.
>
> OK, false alarm. The issue was most likely fixed by commit 199ee608
> (net: fix qemu_flush_queued_packets() in presence of a hub) which is
> present in 1.5.x but not 1.3.x. We initially tried to test on 1.5 but
> had to role back to 1.3 due to other issues - and missed this fix.
>
> My understanding of the networking maze was confused by the unfortunate
> naming of the incoming net client queues ("send_queue") - will propose a
> renaming.
>
> This still requires a confirmation on the target, but I'm quite
> optimistic now.
Okay, good to hear. It makes more sense now and I agree that
"send_queue" is not a great name.
Stefan
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-08-05 11:38 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-01 17:15 [Qemu-devel] net/tap.c: Possibly a way to stall tap input Jan Kiszka
2013-08-01 17:24 ` Jan Kiszka
2013-08-02 7:33 ` Stefan Hajnoczi
2013-08-02 7:45 ` Jan Kiszka
2013-08-02 11:46 ` Stefan Hajnoczi
2013-08-02 12:45 ` Jan Kiszka
2013-08-02 16:49 ` Jan Kiszka
2013-08-02 19:41 ` Jan Kiszka
2013-08-05 11:38 ` Stefan Hajnoczi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).