From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive
Date: Mon, 13 Jul 2015 12:52:45 +0800
Message-ID: <55A3441D.5000305@redhat.com>
In-Reply-To: <20150708105009.GA25442@stefanha-thinkpad.redhat.com>
On 07/08/2015 06:50 PM, Stefan Hajnoczi wrote:
> On Tue, Jul 07, 2015 at 04:45:41PM +0800, Jason Wang wrote:
>>
>> On 07/06/2015 11:21 PM, Stefan Hajnoczi wrote:
>>> On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote:
>>>> On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote:
>>>>> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote:
>>>>>> On 06/30/2015 11:06 AM, Fam Zheng wrote:
>>>>>>> virtio_net_receive still does the check by calling
>>>>>>> virtio_net_can_receive; if the device or driver is not ready, the packet
>>>>>>> is dropped.
>>>>>>>
>>>>>>> This is necessary because returning false from can_receive complicates
>>>>>>> things: the peer would disable sending until we explicitly flush the
>>>>>>> queue.
>>>>>>>
>>>>>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>>>>>> ---
>>>>>>> hw/net/virtio-net.c | 1 -
>>>>>>> 1 file changed, 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>>> index d728233..dbef0d0 100644
>>>>>>> --- a/hw/net/virtio-net.c
>>>>>>> +++ b/hw/net/virtio-net.c
>>>>>>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f,
>>>>>>> static NetClientInfo net_virtio_info = {
>>>>>>> .type = NET_CLIENT_OPTIONS_KIND_NIC,
>>>>>>> .size = sizeof(NICState),
>>>>>>> - .can_receive = virtio_net_can_receive,
>>>>>>> .receive = virtio_net_receive,
>>>>>>> .link_status_changed = virtio_net_set_link_status,
>>>>>>> .query_rx_filter = virtio_net_query_rxfilter,
>>>>>> A side effect of this patch is that it will read and then drop packets
>>>>>> if the guest driver is not ok.
>>>>> I think that the semantics of .can_receive() and .receive() return
>>>>> values are currently incorrect in many NICs. They have .can_receive()
>>>>> functions that return false for conditions where .receive() would
>>>>> discard the packet. So what happens is that packets get queued when
>>>>> they should actually be discarded.
>>>> Yes, but they are bugs more or less.
>>>>
>>>>> The purpose of the flow control (queuing) mechanism is to tell the
>>>>> sender to hold off until the receiver has more rx buffers available.
>>>>> It's a short-term thing that doesn't include link down, rx disable, or
>>>>> NIC reset states.
>>>>>
>>>>> Therefore, I think this patch will not introduce a regression. It is
>>>>> adjusting the code to stop queuing packets when they should actually be
>>>>> dropped.
>>>>>
>>>>> Thoughts?
>>>> I agree there's no functional issue. But it wastes cpu cycles
>>>> (consider a guest that is being flooded). Sometimes it may even be
>>>> dangerous. For tap, we're probably ok since we have 756ae78b, but for
>>>> other backends, we don't.
>>> If the guest uses iptables rules or other mechanisms to drop bogus
>>> packets the cost is even higher than discarding them at the QEMU layer.
>> But it was the choice of guest.
>>
>>> What's more is that if you're using link down as a DoS mitigation
>>> strategy then you might as well hot unplug the NIC.
>>>
>>> Stefan
>> I think there are two problems for virtio-net:
>>
>> 1) The mitigation method when the guest driver is ok. For tx, we have
>> either a timer or a bh; for rx, and only for tap, we have 756ae78b. We
>> probably need fixes for other backends.
> I agree. It's not directly related to the change Fam is proposing
> though.
>
>> 2) When the driver is not ok, the point is we should not poll the backend
>> at all (I believe this is one of the main objectives of the main loop).
>> Something like tap_can_send() and the commit that drops tap_can_send() all
>> follow this rule. But this patch does not; we end up with:
>>
>> - driver is not ok or no driver, qemu keeps reading and dropping packets.
> It's correct from a functionality perspective. When rx is disabled or
> the link is down, packets should be dropped.
>
> From a performance perspective I agree it would be better to drop
> packets earlier in the stack.
>
> If you want tap to drop packets efficiently during link down/receive
> disabled/etc:
>
> What's the best way to disable tap rx so the host kernel drops packets
> instead of queuing them?
There's no such API currently, especially for an unprivileged qemu. But
tuntap will drop the packet if its queue exceeds tx_queue_len. This still
looks much better than the current behaviour.
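
As a rough illustration of that tail-drop behaviour, here is a minimal
sketch in C. It is a toy model, not kernel or QEMU code: ToyQueue and the
toy_queue_* helpers are hypothetical names, and `limit` merely stands in
for the device's queue length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a bounded packet queue that tail-drops. */
typedef struct {
    size_t limit;    /* stands in for the device's queue length */
    size_t pending;  /* packets queued and not yet read by the consumer */
    size_t dropped;  /* packets discarded because the queue was full */
} ToyQueue;

static void toy_queue_init(ToyQueue *q, size_t limit)
{
    assert(q != NULL);
    q->limit = limit;
    q->pending = 0;
    q->dropped = 0;
}

/* Producer side: returns true if the packet was queued, false if dropped. */
static bool toy_queue_xmit(ToyQueue *q)
{
    if (q->pending >= q->limit) {
        q->dropped++;            /* queue full: drop instead of growing */
        return false;
    }
    q->pending++;
    return true;
}

/* Consumer side (the reader of the fd): returns true if a packet was taken. */
static bool toy_queue_read(ToyQueue *q)
{
    if (q->pending == 0) {
        return false;
    }
    q->pending--;
    return true;
}
```

If the consumer stops reading, the producer's cost stays bounded: once
`limit` packets are pending, every further packet is dropped at the queue
rather than read and discarded by the consumer.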
>
>> - driver is ok but not enough rx buffer, qemu will disable tap read poll.
> QEMU relies on the kernel's socket receive buffer or netif receive
> queue to hold incoming packets. When QEMU enables tap read poll again
> it receives the queued packets.
>
> Queuing is necessary so USB net emulation (the rx buffer is only 1
> packet!) and maybe slirp work.
Yes.
>
> I think real NICs drop packets when the rx ring is exhausted, but we
> need to keep the queuing behavior unless someone proposes a way to
> eliminate it.
Yes, queueing seems useless if the backend can do it.
So I'm ok with the patch, and we can optimize in the future.
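
To make the semantics discussed in this thread concrete, here is a minimal
sketch in C. It is a toy model under stated assumptions, not the QEMU net
layer: ToyNic, ToyPeer and the toy_* functions are hypothetical names. The
assumption is that a receive handler returns non-zero when the packet was
consumed (delivered or dropped) and 0 when the sender should hold the
packet until a later flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical receiver state. */
typedef struct {
    bool driver_ok;     /* device and driver are ready */
    size_t rx_buffers;  /* free rx buffers posted by the guest */
    size_t delivered;
    size_t dropped;
} ToyNic;

/* Hypothetical sender-side state. */
typedef struct {
    size_t queued;      /* packets held back by the peer */
    bool rx_disabled;   /* peer stopped sending until a flush */
} ToyPeer;

/* Returns size if the packet was consumed, 0 if the sender must hold it. */
static size_t toy_nic_receive(ToyNic *nic, size_t size)
{
    assert(nic != NULL);
    if (!nic->driver_ok) {
        nic->dropped++;         /* long-term condition: drop, do not queue */
        return size;
    }
    if (nic->rx_buffers == 0) {
        return 0;               /* short-term condition: flow control */
    }
    nic->rx_buffers--;
    nic->delivered++;
    return size;
}

static void toy_peer_send(ToyPeer *peer, ToyNic *nic, size_t size)
{
    if (peer->rx_disabled || toy_nic_receive(nic, size) == 0) {
        peer->rx_disabled = true;   /* stop sending until flushed */
        peer->queued++;
    }
}

/* Called once the guest has posted new rx buffers. */
static void toy_peer_flush(ToyPeer *peer, ToyNic *nic, size_t size)
{
    peer->rx_disabled = false;
    while (peer->queued > 0 && toy_nic_receive(nic, size) != 0) {
        peer->queued--;
    }
    if (peer->queued > 0) {
        peer->rx_disabled = true;   /* still out of buffers */
    }
}
```

In this model a not-ready driver never causes queueing; only a temporary
lack of rx buffers does, and the queue drains on the next flush.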
Thanks
Thread overview: 15+ messages
2015-06-30 3:06 [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive Fam Zheng
2015-06-30 8:35 ` Jason Wang
2015-07-02 12:46 ` Stefan Hajnoczi
2015-07-02 16:46 ` Michael S. Tsirkin
2015-07-03 1:12 ` Fam Zheng
2015-07-03 4:17 ` Fam Zheng
2015-07-04 18:48 ` Michael S. Tsirkin
2015-07-06 3:32 ` Jason Wang
2015-07-06 15:21 ` Stefan Hajnoczi
2015-07-06 17:09 ` Michael S. Tsirkin
2015-07-07 0:53 ` Fam Zheng
2015-07-07 8:10 ` Michael S. Tsirkin
2015-07-07 8:45 ` Jason Wang
2015-07-08 10:50 ` Stefan Hajnoczi
2015-07-13 4:52 ` Jason Wang [this message]