Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
	qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive
Date: Mon, 13 Jul 2015 12:52:45 +0800	[thread overview]
Message-ID: <55A3441D.5000305@redhat.com> (raw)
In-Reply-To: <20150708105009.GA25442@stefanha-thinkpad.redhat.com>



On 07/08/2015 06:50 PM, Stefan Hajnoczi wrote:
> On Tue, Jul 07, 2015 at 04:45:41PM +0800, Jason Wang wrote:
>>
>> On 07/06/2015 11:21 PM, Stefan Hajnoczi wrote:
>>> On Mon, Jul 06, 2015 at 11:32:25AM +0800, Jason Wang wrote:
>>>> On 07/02/2015 08:46 PM, Stefan Hajnoczi wrote:
>>>>> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote:
>>>>>> On 06/30/2015 11:06 AM, Fam Zheng wrote:
>>>>>>> virtio_net_receive still does the check by calling
>>>>>>> virtio_net_can_receive, if the device or driver is not ready, the packet
>>>>>>> is dropped.
>>>>>>>
>>>>>>> This is necessary because returning false from can_receive complicates
>>>>>>> things: the peer would disable sending until we explicitly flush the
>>>>>>> queue.
>>>>>>>
>>>>>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>>>>>> ---
>>>>>>>  hw/net/virtio-net.c | 1 -
>>>>>>>  1 file changed, 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>>> index d728233..dbef0d0 100644
>>>>>>> --- a/hw/net/virtio-net.c
>>>>>>> +++ b/hw/net/virtio-net.c
>>>>>>> @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f,
>>>>>>>  static NetClientInfo net_virtio_info = {
>>>>>>>      .type = NET_CLIENT_OPTIONS_KIND_NIC,
>>>>>>>      .size = sizeof(NICState),
>>>>>>> -    .can_receive = virtio_net_can_receive,
>>>>>>>      .receive = virtio_net_receive,
>>>>>>>      .link_status_changed = virtio_net_set_link_status,
>>>>>>>      .query_rx_filter = virtio_net_query_rxfilter,
>>>>>> A side effect of this patch is it will read and then drop packet is
>>>>>> guest driver is no ok.
>>>>> I think that the semantics of .can_receive() and .receive() return
>>>>> values are currently incorrect in many NICs.  They have .can_receive()
>>>>> functions that return false for conditions where .receive() would
>>>>> discard the packet.  So what happens is that packets get queued when
>>>>> they should actually be discarded.
>>>> Yes, but they are bugs more or less.
>>>>
>>>>> The purpose of the flow control (queuing) mechanism is to tell the
>>>>> sender to hold off until the receiver has more rx buffers available.
>>>>> It's a short-term thing that doesn't included link down, rx disable, or
>>>>> NIC reset states.
>>>>>
>>>>> Therefore, I think this patch will not introduce a regression.  It is
>>>>> adjusting the code to stop queuing packets when they should actually be
>>>>> dropped.
>>>>>
>>>>> Thoughts?
>>>> I agree there's no functional issue. But it cause wasting of cpu cycles
>>>> (consider guest is being flooded). Sometime it maybe even dangerous. For
>>>> tap, we're probably ok since we have 756ae78b but for other backend, we
>>>> don't.
>>> If the guest uses iptables rules or other mechanisms to drop bogus
>>> packets the cost is even higher than discarding them at the QEMU layer.
>> But it was the choice of guest.
>>
>>> What's more is that if you're using link down as a DoS mitigation
>>> strategy then you might as well hot unplug the NIC.
>>>
>>> Stefan
>> I think there're two problems for virtio-net:
>>
>> 1) mitigation method when guest driver is ok. For tx, we have either
>> timer or bh, for rx and only for tap, we have 756ae78b. We probably need
>> fixes for other backends.
> I agree.  It's not directly related to the change Fam is proposing
> though.
>
>> 2) when driver is not ok, the point is we should not poll the backend at
>> all (I believe this is one of the main objects of main loop). Something
>> like tap_can_send() and the commit that drops tap_can_send() all follow
>> this rule. But this patch does not, we end up with:
>>
>> - driver is not ok or no driver, qemu keep reading and dropping packets.
> It's correct from a functionality perspective.  When rx is disabled or
> the link is down, packets should be dropped.
>
> From a performance perspective I agree it would be better to drop
> packets earlier in the stack.
>
> If you want tap to drop packets efficiently during link down/receive
> disabled/etc:
>
> What's the best way to disable tap rx so the host kernel drops packets
> instead of queuing them?

There's no such API currently especially for a unprivileged qemu. But
tuntap will drop the packet if it exceeds tx_queue_length. This looks
still much better than current behaviour. 

>
>> - driver is ok but not enough rx buffer, qemu will disable tap read poll.
> QEMU relies on the kernel's socket receive buffer or netif receive
> queue to hold incoming packets.  When QEMU enables tap read poll again
> it receives the queued packets.
>
> Queuing is necessary so USB net emulation (the rx buffer is only 1
> packet!) and maybe slirp work.

Yes.

>
> I think real NICs drop packets when the rx ring is exhausted, but we
> need to keep the queuing behavior unless someone proposes a way to
> eliminate it.

Yes, queueing seems useless if backend can do it.

So I'm ok with the patch, and we can optimize in the future.

Thanks

     prev parent reply	other threads:[~2015-07-13  4:52 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-30  3:06 [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive Fam Zheng
2015-06-30  8:35 ` Jason Wang
2015-07-02 12:46   ` Stefan Hajnoczi
2015-07-02 16:46     ` Michael S. Tsirkin
2015-07-03  1:12       ` Fam Zheng
2015-07-03  4:17         ` Fam Zheng
2015-07-04 18:48           ` Michael S. Tsirkin
2015-07-06  3:32     ` Jason Wang
2015-07-06 15:21       ` Stefan Hajnoczi
2015-07-06 17:09         ` Michael S. Tsirkin
2015-07-07  0:53           ` Fam Zheng
2015-07-07  8:10             ` Michael S. Tsirkin
2015-07-07  8:45         ` Jason Wang
2015-07-08 10:50           ` Stefan Hajnoczi
2015-07-13  4:52             ` Jason Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55A3441D.5000305@redhat.com \
    --to=jasowang@redhat.com \
    --cc=famz@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).