From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive
Date: Mon, 13 Jul 2015 12:52:45 +0800 [thread overview]
Message-ID: <55A3441D.5000305@redhat.com> (raw)
In-Reply-To: <20150708105009.GA25442@stefanha-thinkpad.redhat.com>
On 07/08/2015 06:50 PM, Stefan Hajnoczi wrote:
> On Tue, Jul 07, 2015 at 04:45:41PM +0800, Jason Wang wrote:
>>
>> On 07/06/2015 11:21 PM, Stefan Hajnoczi wrote:
>>> On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote:
>>>> On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote:
>>>>> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote:
>>>>>> On 06/30/2015 11:06 AM, Fam Zheng wrote:
>>>>>>> virtio_net_receive still does the check by calling
>>>>>>> virtio_net_can_receive, if the device or driver is not ready, the packet
>>>>>>> is dropped.
>>>>>>>
>>>>>>> This is necessary because returning false from can_receive complicates
>>>>>>> things: the peer would disable sending until we explicitly flush the
>>>>>>> queue.
>>>>>>>
>>>>>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>>>>>> ---
>>>>>>> hw/net/virtio-net.c | 1 -
>>>>>>> 1 file changed, 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>>> index d728233..dbef0d0 100644
>>>>>>> --- a/hw/net/virtio-net.c
>>>>>>> +++ b/hw/net/virtio-net.c
>>>>>>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f,
>>>>>>> static NetClientInfo net_virtio_info = {
>>>>>>> .type = NET_CLIENT_OPTIONS_KIND_NIC,
>>>>>>> .size = sizeof(NICState),
>>>>>>> - .can_receive = virtio_net_can_receive,
>>>>>>> .receive = virtio_net_receive,
>>>>>>> .link_status_changed = virtio_net_set_link_status,
>>>>>>> .query_rx_filter = virtio_net_query_rxfilter,
>>>>>> A side effect of this patch is it will read and then drop packet is
>>>>>> guest driver is no ok.
>>>>> I think that the semantics of .can_receive() and .receive() return
>>>>> values are currently incorrect in many NICs. They have .can_receive()
>>>>> functions that return false for conditions where .receive() would
>>>>> discard the packet. So what happens is that packets get queued when
>>>>> they should actually be discarded.
>>>> Yes, but they are bugs more or less.
>>>>
>>>>> The purpose of the flow control (queuing) mechanism is to tell the
>>>>> sender to hold off until the receiver has more rx buffers available.
>>>>> It's a short-term thing that doesn't included link down, rx disable, or
>>>>> NIC reset states.
>>>>>
>>>>> Therefore, I think this patch will not introduce a regression. It is
>>>>> adjusting the code to stop queuing packets when they should actually be
>>>>> dropped.
>>>>>
>>>>> Thoughts?
>>>> I agree there's no functional issue. But it cause wasting of cpu cycles
>>>> (consider guest is being flooded). Sometime it maybe even dangerous. For
>>>> tap, we're probably ok since we have 756ae78b but for other backend, we
>>>> don't.
>>> If the guest uses iptables rules or other mechanisms to drop bogus
>>> packets the cost is even higher than discarding them at the QEMU layer.
>> But it was the choice of guest.
>>
>>> What's more is that if you're using link down as a DoS mitigation
>>> strategy then you might as well hot unplug the NIC.
>>>
>>> Stefan
>> I think there're two problems for virtio-net:
>>
>> 1) mitigation method when guest driver is ok. For tx, we have either
>> timer or bh, for rx and only for tap, we have 756ae78b. We probably need
>> fixes for other backends.
> I agree. It's not directly related to the change Fam is proposing
> though.
>
>> 2) when driver is not ok, the point is we should not poll the backend at
>> all (I believe this is one of the main objects of main loop). Something
>> like tap_can_send() and the commit that drops tap_can_send() all follow
>> this rule. But this patch does not, we end up with:
>>
>> - driver is not ok or no driver, qemu keep reading and dropping packets.
> It's correct from a functionality perspective. When rx is disabled or
> the link is down, packets should be dropped.
>
> From a performance perspective I agree it would be better to drop
> packets earlier in the stack.
>
> If you want tap to drop packets efficiently during link down/receive
> disabled/etc:
>
> What's the best way to disable tap rx so the host kernel drops packets
> instead of queuing them?
There's no such API currently especially for a unprivileged qemu. But
tuntap will drop the packet if it exceeds tx_queue_length. This looks
still much better than current behaviour.
>
>> - driver is ok but not enough rx buffer, qemu will disable tap read poll.
> QEMU relies on the kernel's socket receive buffer or netif receive
> queue to hold incoming packets. When QEMU enables tap read poll again
> it receives the queued packets.
>
> Queuing is necessary so USB net emulation (the rx buffer is only 1
> packet!) and maybe slirp work.
Yes.
>
> I think real NICs drop packets when the rx ring is exhausted, but we
> need to keep the queuing behavior unless someone proposes a way to
> eliminate it.
Yes, queueing seems useless if backend can do it.
So I'm ok with the patch, and we can optimize in the future.
Thanks
prev parent reply other threads:[~2015-07-13 4:52 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-30 3:06 [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive Fam Zheng
2015-06-30 8:35 ` Jason Wang
2015-07-02 12:46 ` Stefan Hajnoczi
2015-07-02 16:46 ` Michael S. Tsirkin
2015-07-03 1:12 ` Fam Zheng
2015-07-03 4:17 ` Fam Zheng
2015-07-04 18:48 ` Michael S. Tsirkin
2015-07-06 3:32 ` Jason Wang
2015-07-06 15:21 ` Stefan Hajnoczi
2015-07-06 17:09 ` Michael S. Tsirkin
2015-07-07 0:53 ` Fam Zheng
2015-07-07 8:10 ` Michael S. Tsirkin
2015-07-07 8:45 ` Jason Wang
2015-07-08 10:50 ` Stefan Hajnoczi
2015-07-13 4:52 ` Jason Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55A3441D.5000305@redhat.com \
--to=jasowang@redhat.com \
--cc=famz@redhat.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).