From: Zoltan Kiss <zoltan.kiss@citrix.com>
To: Paul Durrant <Paul.Durrant@citrix.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
Ian Campbell <Ian.Campbell@citrix.com>,
Wei Liu <wei.liu2@citrix.com>,
Malcolm Crossley <malcolm.crossley@citrix.com>,
David Vrabel <david.vrabel@citrix.com>
Cc: Jonathan Davies <Jonathan.Davies@citrix.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Eric Dumazet <eric.dumazet@gmail.com>
Subject: Timout packets in device's TX queue
Date: Thu, 14 Nov 2013 19:27:16 +0000 [thread overview]
Message-ID: <52852414.2070105@citrix.com> (raw)
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD0172846@AMSPEX01CL01.citrite.net>
Discussing this further with Paul, we came to the conclusion that
probably the best solution would be to drop these packets in qdisc.
Netback RX path stop accepting new packets if the target guest doesn't
have enough room in the ring. Also (AFAIK) NIC drivers do the same if
they don't have more resource for TX, and this is all good for us. Our
problem is that the queue in qdisc layer (sorry if my terminology not
clear!) can still accumulate these packets indefinitely. The drastic
measure would be to reduce txqueuelen to 0 during setup for every
affected device, but that's not really nice.
Instead, we should be able to configure qdisc to timeout packets on
those queues, at least the SKBs where (skb_shinfo(skb)->tx_flags &
SKBTX_DEV_ZEROCOPY). I'm not that familiar with it to know if that's
already possible, or if not, then how good idea would it be to implement it.
I've changed the subject and included netdev and Eric, maybe someone can
shed some more light on this question.
Regards,
Zoli
On 14/11/13 09:42, Paul Durrant wrote:
>> -----Original Message-----
>> From: Zoltan Kiss
>> Sent: 13 November 2013 20:30
>> To: xen-devel@lists.xenproject.org; Ian Campbell; Wei Liu; Paul Durrant;
>> Malcolm Crossley; David Vrabel
>> Cc: Jonathan Davies
>> Subject: netback: Delayed copy alternative
>>
>> Hi,
>>
>> I'm trying to forward port delayed copy to my new grant mapping patches.
>> One important problem I've faced is that classic used
>> gnttab_copy_grant_page to replace the granted page with a local copy and
>> unmap the grant. And this function has never been upstreamed as only
>> netback used it. Unfortunately upstreaming it is not a very easy task,
>> as the kernel's grant table infrastructure doesn't track at the moment
>> whether the page is DMA mapped or not. It is required because we
>> shouldn't proceed with the copy and replace if a device already mapped
>> the page for DMA.
>> David came up with an alternative idea: we do this delayed copy because
>> we don't want the guest's page to get stucked in Dom0 indefinitely. The
>> only realistic case for that would be if the egress interface would be
>> an another guest's vif, where the guest (either due to a bug or as a
>> malicious attempt) doesn't empty its ring. I think it's a safe
>> assumption that Dom0 otherwise doesn't hold on to packets for too long.
>> Or if it does, then that's a bug we should fix instead of doing a copy
>> of the packet.
>> If we accept that only other vif's can keep the skb indefinitely, then
>> an easier solution would be to handle this problem on the RX side: the
>> RX thread can also check whether this skb hanged around for too long and
>> drop it. Actually, xenvif_start_xmit already checks if the guest
>> provided enough slots for us to do the grant copy. If I understand it
>> correctly. What do you think about such an approach?
>>
>
> Well, now that David fixed the DMA unmap tracking thing, I believe that another vif is *generally* the only place an skb can hang around for a long time. The problem is that there is an edge case... If a network driver turns off queue processing (for flow control reasons, and NB that 10G Ethernet requires the driver to do this if the PHY signals flow control and internal buffering is exhausted, 1G is allowed to be an open drain) then the skb can sit in the queue indefinitely and there's no way you can deal with this from the guest RX side of netback. You need to have a copy-aside option to handle this.
>
> Paul
>
parent reply other threads:[~2013-11-14 19:27 UTC|newest]
Thread overview: expand[flat|nested] mbox.gz Atom feed
[parent not found: <9AAE0902D5BC7E449B7C8E4E778ABCD0172846@AMSPEX01CL01.citrite.net>]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52852414.2070105@citrix.com \
--to=zoltan.kiss@citrix.com \
--cc=Ian.Campbell@citrix.com \
--cc=Jonathan.Davies@citrix.com \
--cc=Paul.Durrant@citrix.com \
--cc=david.vrabel@citrix.com \
--cc=eric.dumazet@gmail.com \
--cc=malcolm.crossley@citrix.com \
--cc=netdev@vger.kernel.org \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).