From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <famz@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up
Date: Tue, 30 Jun 2015 16:32:56 +0800 [thread overview]
Message-ID: <55925438.8070907@redhat.com> (raw)
In-Reply-To: <20150629103926.GE32151@stefanha-thinkpad.redhat.com>
On 06/29/2015 06:39 PM, Stefan Hajnoczi wrote:
> On Fri, Jun 26, 2015 at 05:06:01PM +0800, Jason Wang wrote:
>>
>> On 06/25/2015 05:18 PM, Stefan Hajnoczi wrote:
>>> e1000_can_receive() checks the link up status register bit. If the bit
>>> is clear, packets will be queued and the peer may disable receive to
>>> avoid wasting CPU reading packets that cannot be delivered. The queue
>>> must be flushed once the link comes back up again.
>>>
>>> This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests
>>> and tap networking. Flushing the queue invokes the async send callback,
>>> which re-enables tap fd read.
>>>
>>> Reported-by: Jonathan Liu <net147@gmail.com>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>> hw/net/e1000.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>> index bab8e2a..5c6bcd0 100644
>>> --- a/hw/net/e1000.c
>>> +++ b/hw/net/e1000.c
>>> @@ -185,6 +185,9 @@ e1000_link_up(E1000State *s)
>>> {
>>> s->mac_reg[STATUS] |= E1000_STATUS_LU;
>>> s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
>>> +
>>> + /* E1000_STATUS_LU is tested by e1000_can_receive() */
>>> + qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>> }
>>>
>>> static bool
>> This only solves the issue partially and just for e1000. After checking
>> all can_receive() functions, another card with similar behaviour is vmxnet3.
> No, this bug is specific to e1000 (details on that below in case 2b).
>
> There are more issues though, so additional patches are welcome.
>
>> Looking at commit a90a7425cf592a3afeff3eaf32f543b83050ee5c ("tap: Drop
>> tap_can_send") again. The commit disable tap read poll when
>> qemu_net_queue_send() returns zero. Which is usually the following cases:
>>
>> 1) queue->delivering is 1
>> 2) qemu_can_send_packet() returns zero, which is:
>> 2a) vm_running is false
>> or
>> 2b) can_receive() return zero
>> 3) qemu_net_queue_deliver() returns zero, which is:
>> 3a) nc->receive_disabled is true
>> or
>> 3b) info->receive_raw() or nc->receive->receive() returns zero
>>
>> This means we should enable tap read poll when one of those conditions
>> is not existed. This patch fixes 2b) only for e1000.
>>
>> for 1, I'm not quite sure it's a real problem or how to fix.
> This is not a problem. When the delivery code returns it flushes the
> queue. This will invoke send callback functions, so tap will re-enable
> read poll.
>
>> for 2a, we may probably need a vm state change handler to flush the
>> queue, otherwise network will stall after a stop and cont.
> Yes, I'm surprised the code isn't doing this already.
>
>> for 2b, we can either fixes the card that check link status in its
>> can_receive() or just simply can qemu_flush_queued_packets() in set_link.
> No, this doesn't work because the e1000 link status bit is *not*
> equivalent to nc->link_down. The link auto-negotiation code changes the
> e1000 bit independently from nc->link_down using a timer.
>
> The net subsystem cannot know when this e1000-specific bit is supposed
> to change. Therefore there is no general fix.
>
> Also, qemu_send_packet() drops packets while nc->link_down is true.
> We're not supposed to queue so it shouldn't be necessary to flush
> packets in set_link at the net subsystem level.
But qemu_send_packet() only check sender's nc->link_down. So tap read
poll will be disabled and won't get enabled when its peers' link status
is recovered.
>
>> 3a and 3b looks ok, since this happen when there's no space for rx,
>> after guest refill, qemu will call qemu_flush_queued_packets() to flush
>> and re-enable tap read poll.
>>
>> 2a and 2b does not exits before this commit, since tap_send check
>> qemu_can_send_packet() before.
> The delivery and queuing mechanism in QEMU is overcomplicated. The
> semantics of .can_receive() and .receive() need to be clarified before
> we can do real cleanup though:
>
> .receive() returning 0 is not quite the same as .can_receive() returning
> false.
>
> When .receive() returns 0 this means the peer should wait before sending
> more. In most cases the net queue only needs to hold 1 packet. But
> there are cases like slirp where QEMU code generates packets and may not
> be able to continue without calling qemu_send_packet() on more packets
> (I'm not sure but I suspect this is the case).
>
> When .can_receive() returns false it also casues packets to be queued,
> but NIC emulation code often includes more conditions in .can_receive()
> than .receive() returning 0.
Some NICs do even more tricks. E.g when receiver is disabled, 8139's
can_receive() returns true and expect receive() to drop the packets.
>
> Cleaning this up requires auditing all the NICs. Maybe we can get rid
> of .can_receive() and just have a simplified .receive():
>
> false - queue packet and tell peer to stop sending. Flush must be
> called later to re-enable sending.
And call can_receive() in receive().
> true - packet delivered or dropped
This looks good, but do you think we can do this in 2.4? Looks a little
bit risky.
prev parent reply other threads:[~2015-06-30 8:33 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-25 9:18 [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up Stefan Hajnoczi
2015-06-25 10:55 ` Fam Zheng
2015-06-25 14:24 ` Stefan Hajnoczi
2015-06-26 9:06 ` Jason Wang
2015-06-29 3:26 ` Fam Zheng
2015-06-29 10:39 ` Stefan Hajnoczi
2015-06-29 11:03 ` Fam Zheng
2015-06-30 8:32 ` Jason Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55925438.8070907@redhat.com \
--to=jasowang@redhat.com \
--cc=famz@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).