Re: [PATCH 5/5] Stop dropping so many RX packets in tap (v3)

public inbox for kvm@vger.kernel.org

From: Avi Kivity <avi@qumranet.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: kvm-devel@lists.sourceforge.net, Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [PATCH 5/5] Stop dropping so many RX packets in tap (v3)
Date: Mon, 12 May 2008 09:59:53 +0300
Message-ID: <4827EAE9.3070205@qumranet.com>
In-Reply-To: <48275ABC.3080400@codemonkey.ws>

Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>  
>>>> How about the other way round: when the vlan consumer detects it 
>>>> can no longer receive packets, it tells that to the vlan.  When all 
>>>> vlan consumers can no longer receive, tell the producer to stop 
>>>> producing.  For the tap producer, this is simply removing its fd 
>>>> from the read poll list.  When a vlan consumer becomes ready to 
>>>> receive again, it tells the vlan, which tells the producers, which 
>>>> then install their fds back again.
>>>>       
>>> Yeah, that's a nice idea.   I'll think about it.  I don't know if 
>>> it's really worth doing as an intermediate step though.  What I'd 
>>> really like to do is have a vlan interface where consumers published 
>>> all of their receive buffers.  Then there's no need for 
>>> notifications of receive-ability.
>>>     
>>
>> That's definitely better, and is also more multiqueue nic / vringfd 
>> friendly.
>>
>> I still think interrupt-on-halfway-mark is needed much more 
>> urgently.  It deals with concurrency much better:
>>   
>
> We already sort of do this.  In try_fill_recv() in virtio-net.c, we 
> try to allocate as many skbs as we can to fill the rx queue.  We keep 
> track of how many we've been able to allocate.  Whenever we process an 
> RX packet, we check to see if the current number of RX packets is less 
> than half the maximum number of rx packets we've been able to allocate.
>

Then why are we dropping packets?  We shouldn't run out of rx 
descriptors if this works.

> In the common case of small queues, this should have exactly the 
> behavior you describe.  We don't currently suppress the RX 
> notification even though we really could.  The can_receive changes are 
> really the key to this.  Unless we are suppressing tap fd 
> select()'ing, we can always suppress the RX notification.  That's been 
> sitting on my TODO for a bit.
>

Sorry, totally confused.  How does can_receive come into this?

>> rx:
>>   host interrupts guest on halfway mark
>>   guest starts processing packets
>>   host keeps delivering
>>
>> tx:
>>   guest kicks host on halfway mark
>>   host starts processing packets
>>   second vcpu on guest keeps on queueing
>>   
>
> I'm not convinced it's at all practical to suppress notifications in 
> the front-end.  We simply don't know whether we'll get more packets 
> which means we have to do TX mitigation within the front-end.  We've 
> been there, it's not as nice as doing it in the back-end.
>
> What we really ought to do in the back-end though, is start processing 
> the TX packets as soon as we begin to do TX mitigation.  This would 
> address the ping latency issue while also increasing throughput 
> (hopefully).  One thing I've wanted to try is to register a 
> bottom-half or a polling function so that the IO thread was always 
> trying to process TX packets while the TX timer is active.

Setting the timer clearly should be the host's job.  But suppression 
needs to be coordinated.  One fairly generic mechanism is for one side 
to tell the other at which ring index it wants the notification.

There are four cases to consider:

- tx guest->host

Normally the host sets the notify index at current+1, to get immediate 
notifications.  If it starts seeing many packets, it can gradually 
increase this to current+size/2, setting a timer to limit latency.  If 
the timer fires, drop the window size.

- tx host->guest

Returning descriptors is not an urgent task if the ring is large 
enough.  Can be set to size/2 plus a longish timer.

- rx guest->host

Similar to above, set to size/2, don't even need a timer.

- rx host->guest

Similar to tx guest->host: start with a small window to get minimal 
latency; if we see many packets within a short time we can increase the 
window and set a timer.

We should aim for the timer to fire only rarely.
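
To make the notify-index idea concrete, here is a rough sketch in C.  It is 
only an illustration under stated assumptions, not code from virtio or qemu: 
all names (notify_needed, struct notify_window, window_grow, 
window_on_timeout, next_notify_idx) are hypothetical, ring indices are 
assumed to be free-running 16-bit counters, and growing the window by 
doubling and shrinking it by halving is just one possible policy for 
"gradually increase" and "drop the window size".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch: one side publishes the ring index at which it wants
 * to be notified; the other side checks that index after adding entries.
 *
 * Returns true if the producer index crossed notify_idx while advancing
 * from old_idx to new_idx.  Unsigned 16-bit arithmetic keeps the comparison
 * correct across counter wraparound.
 */
static bool notify_needed(uint16_t notify_idx, uint16_t old_idx,
                          uint16_t new_idx)
{
    return (uint16_t)(new_idx - notify_idx) < (uint16_t)(new_idx - old_idx);
}

/* Per-ring state for the side that chooses the notify index. */
struct notify_window {
    uint16_t ring_size; /* number of entries in the ring */
    uint16_t window;    /* entries allowed to accumulate before a notify */
};

static void window_init(struct notify_window *w, uint16_t ring_size)
{
    assert(ring_size >= 2);
    w->ring_size = ring_size;
    w->window = 1; /* current+1: immediate notifications */
}

/* Many packets seen in a short time: grow toward size/2 (assumed: double). */
static void window_grow(struct notify_window *w)
{
    uint16_t max = w->ring_size / 2;
    uint16_t next = (uint16_t)(w->window * 2);

    w->window = next > max ? max : next;
}

/* Latency timer fired: shrink the window (assumed: halve, floor of 1). */
static void window_on_timeout(struct notify_window *w)
{
    w->window = w->window > 1 ? w->window / 2 : 1;
}

/* Index to publish to the other side, given the current ring index. */
static uint16_t next_notify_idx(const struct notify_window *w,
                                uint16_t current)
{
    return (uint16_t)(current + w->window);
}
```

For the two "not urgent" cases (tx host->guest, rx guest->host) the window 
would simply be pinned at ring_size / 2 and never grown or shrunk; for the 
two latency-sensitive cases the consumer would call window_grow() when 
packets arrive in bursts and window_on_timeout() when the timer fires.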

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



Thread overview: 12+ messages
2008-05-07 18:09 [PATCH 1/5] Support more than 3.5GB with virtio (v3) Anthony Liguori
2008-05-07 18:09 ` [PATCH 2/5] Validate the SG list layouts in virtio Anthony Liguori
2008-05-07 18:09 ` [PATCH 3/5] Revert virtio tap hack (v3) Anthony Liguori
2008-05-07 18:09 ` [PATCH 4/5] Make virtio-net can_receive more accurate (v3) Anthony Liguori
2008-05-07 18:09 ` [PATCH 5/5] Stop dropping so many RX packets in tap (v3) Anthony Liguori
2008-05-11 14:34   ` Avi Kivity
2008-05-11 18:30     ` Anthony Liguori
2008-05-11 18:52       ` Avi Kivity
2008-05-11 20:44         ` Anthony Liguori
2008-05-12  6:59           ` Avi Kivity [this message]
2008-05-09 15:24 ` [PATCH 1/5] Support more than 3.5GB with virtio (v3) Avi Kivity
2008-05-09 18:37   ` Anthony Liguori
