From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <famz@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up
Date: Tue, 30 Jun 2015 16:32:56 +0800 [thread overview]
Message-ID: <55925438.8070907@redhat.com> (raw)
In-Reply-To: <20150629103926.GE32151@stefanha-thinkpad.redhat.com>
On 06/29/2015 06:39 PM, Stefan Hajnoczi wrote:
> On Fri, Jun 26, 2015 at 05:06:01PM +0800, Jason Wang wrote:
>>
>> On 06/25/2015 05:18 PM, Stefan Hajnoczi wrote:
>>> e1000_can_receive() checks the link up status register bit. If the bit
>>> is clear, packets will be queued and the peer may disable receive to
>>> avoid wasting CPU reading packets that cannot be delivered. The queue
>>> must be flushed once the link comes back up again.
>>>
>>> This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests
>>> and tap networking. Flushing the queue invokes the async send callback,
>>> which re-enables tap fd read.
>>>
>>> Reported-by: Jonathan Liu <net147@gmail.com>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>> hw/net/e1000.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>> index bab8e2a..5c6bcd0 100644
>>> --- a/hw/net/e1000.c
>>> +++ b/hw/net/e1000.c
>>> @@ -185,6 +185,9 @@ e1000_link_up(E1000State *s)
>>> {
>>> s->mac_reg[STATUS] |= E1000_STATUS_LU;
>>> s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
>>> +
>>> + /* E1000_STATUS_LU is tested by e1000_can_receive() */
>>> + qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>> }
>>>
>>> static bool
>> This only solves the issue partially and just for e1000. After checking
>> all can_receive() functions, another card with similar behaviour is vmxnet3.
> No, this bug is specific to e1000 (details on that below in case 2b).
>
> There are more issues though, so additional patches are welcome.
>
>> Looking at commit a90a7425cf592a3afeff3eaf32f543b83050ee5c ("tap: Drop
>> tap_can_send") again. The commit disable tap read poll when
>> qemu_net_queue_send() returns zero. Which is usually the following cases:
>>
>> 1) queue->delivering is 1
>> 2) qemu_can_send_packet() returns zero, which is:
>> 2a) vm_running is false
>> or
>> 2b) can_receive() return zero
>> 3) qemu_net_queue_deliver() returns zero, which is:
>> 3a) nc->receive_disabled is true
>> or
>> 3b) info->receive_raw() or nc->receive->receive() returns zero
>>
>> This means we should enable tap read poll when one of those conditions
>> is not existed. This patch fixes 2b) only for e1000.
>>
>> for 1, I'm not quite sure it's a real problem or how to fix.
> This is not a problem. When the delivery code returns it flushes the
> queue. This will invoke send callback functions, so tap will re-enable
> read poll.
>
>> for 2a, we may probably need a vm state change handler to flush the
>> queue, otherwise network will stall after a stop and cont.
> Yes, I'm surprised the code isn't doing this already.
>
>> for 2b, we can either fixes the card that check link status in its
>> can_receive() or just simply can qemu_flush_queued_packets() in set_link.
> No, this doesn't work because the e1000 link status bit is *not*
> equivalent to nc->link_down. The link auto-negotiation code changes the
> e1000 bit independently from nc->link_down using a timer.
>
> The net subsystem cannot know when this e1000-specific bit is supposed
> to change. Therefore there is no general fix.
>
> Also, qemu_send_packet() drops packets while nc->link_down is true.
> We're not supposed to queue so it shouldn't be necessary to flush
> packets in set_link at the net subsystem level.
But qemu_send_packet() only check sender's nc->link_down. So tap read
poll will be disabled and won't get enabled when its peers' link status
is recovered.
>
>> 3a and 3b looks ok, since this happen when there's no space for rx,
>> after guest refill, qemu will call qemu_flush_queued_packets() to flush
>> and re-enable tap read poll.
>>
>> 2a and 2b does not exits before this commit, since tap_send check
>> qemu_can_send_packet() before.
> The delivery and queuing mechanism in QEMU is overcomplicated. The
> semantics of .can_receive() and .receive() need to be clarified before
> we can do real cleanup though:
>
> .receive() returning 0 is not quite the same as .can_receive() returning
> false.
>
> When .receive() returns 0 this means the peer should wait before sending
> more. In most cases the net queue only needs to hold 1 packet. But
> there are cases like slirp where QEMU code generates packets and may not
> be able to continue without calling qemu_send_packet() on more packets
> (I'm not sure but I suspect this is the case).
>
> When .can_receive() returns false it also casues packets to be queued,
> but NIC emulation code often includes more conditions in .can_receive()
> than .receive() returning 0.
Some NICs do even more tricks. E.g when receiver is disabled, 8139's
can_receive() returns true and expect receive() to drop the packets.
>
> Cleaning this up requires auditing all the NICs. Maybe we can get rid
> of .can_receive() and just have a simplified .receive():
>
> false - queue packet and tell peer to stop sending. Flush must be
> called later to re-enable sending.
And call can_receive() in receive().
> true - packet delivered or dropped
This looks good, but do you think we can do this in 2.4? Looks a little
bit risky.
prev parent reply other threads:[~2015-06-30 8:33 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-25 9:18 [Qemu-devel] [PATCH for-2.4] e1000: flush packets when link comes up Stefan Hajnoczi
2015-06-25 10:55 ` Fam Zheng
2015-06-25 14:24 ` Stefan Hajnoczi
2015-06-26 9:06 ` Jason Wang
2015-06-29 3:26 ` Fam Zheng
2015-06-29 10:39 ` Stefan Hajnoczi
2015-06-29 11:03 ` Fam Zheng
2015-06-30 8:32 ` Jason Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55925438.8070907@redhat.com \
--to=jasowang@redhat.com \
--cc=famz@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.