From: Zoltan Kiss <zoltan.kiss@citrix.com>
To: David Miller <davem@davemloft.net>
Cc: <wei.liu2@citrix.com>, <Ian.Campbell@citrix.com>,
<david.vrabel@citrix.com>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <xen-devel@lists.xenproject.org>,
Paul Durrant <Paul.Durrant@citrix.com>
Subject: Re: [PATCH net-next 0/2] xen-netback: Changes around carrier handling
Date: Thu, 7 Aug 2014 16:51:17 +0100
Message-ID: <53E3A075.2090400@citrix.com>
In-Reply-To: <20140806.140146.1448631909293617629.davem@davemloft.net>
On 06/08/14 22:01, David Miller wrote:
> From: Zoltan Kiss <zoltan.kiss@citrix.com>
> Date: Wed, 6 Aug 2014 20:20:55 +0100
>
>> The fundamental problem with netback is that start_xmit places the
>> packet into an internal queue, and then the thread does the actual
>> transmission from that queue, but it doesn't know whether it will
>> succeed in a finite time period.
>
> A hardware device acts the same way when the link goes down or the
> transmitter hangs. I do not see this situation, therefore, as
> fundamentally unique to xen-netback.
>
>> I just noticed a possible problem with start_xmit: if this slot
>> estimation fails, and the packet is likely to be stalled at least for
>> a while, it still places the skb into the internal queue and returns
>> NETDEV_TX_OK. Shouldn't we return NETDEV_TX_BUSY and not place the
>> packet into the internal queue? It would be requeued later, or dropped
>> by the qdisc. I think that would be more natural. But it could decrease
>> performance if these "not enough slots" situations are very frequent
>> and short-lived, and by the time the RX thread wakes up the slots
>> might already be available again.
>> My long-term idea is to move part of the thread's work into
>> start_xmit, so it can set up the grant operations, and if there aren't
>> enough slots it can return the skb to the qdisc with NETDEV_TX_BUSY for
>> requeueing. Then the thread would only do the batching of the grant copy
>> operations and the releasing of the skbs. And we could ditch a good part
>> of the code ...
>
> Yes, it would be a slight improvement if slot availability was
> detected at ->ndo_start_xmit() time.
>
> But best would be to properly stop the queue at the _end_ of
> ->ndo_start_xmit() like nearly all ethernet drivers do.
>
> And we can't do that in netback because..... your queues are too
> small.
>
> If your queues were large enough you could say "now that I've queued
> up SKB for transmit, do I still have enough slots available for a
> maxed out SKB?" and stop the queue if the answer to that question is
> no.
>
> The queue state was not meant to be a "maybe I can queue a new packet"
> indication. It's supposed to mean that you can absolutely take at
> least one more SKB of any size or configuration.
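Right, so the canonical pattern you are describing is roughly the
following (a minimal sketch only; apart from the core netdev helpers,
every name here is made up and it is not meant as actual netback code):

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Illustrative per-queue state; the names are invented for this sketch. */
struct example_queue {
        struct net_device *dev;
        unsigned int free_slots;        /* ring slots currently available */
};

#define EXAMPLE_MAX_SLOTS_PER_SKB 18    /* worst case one skb can need (made up) */

static netdev_tx_t example_start_xmit(struct sk_buff *skb,
                                      struct net_device *dev)
{
        struct example_queue *q = netdev_priv(dev);

        /* The queue is only running if a worst-case skb still fits,
         * so this enqueue cannot fail: */
        /* ... place the skb on the internal ring, decrement q->free_slots ... */

        /* Stop the queue *after* queueing, unless another worst-case
         * skb would still fit. */
        if (q->free_slots < EXAMPLE_MAX_SLOTS_PER_SKB)
                netif_stop_queue(dev);

        return NETDEV_TX_OK;
}

/* Completion / slot-release path: re-check and wake the queue. */
static void example_tx_complete(struct example_queue *q)
{
        /* ... release completed slots, increment q->free_slots ... */

        if (netif_queue_stopped(q->dev) &&
            q->free_slots >= EXAMPLE_MAX_SLOTS_PER_SKB)
                netif_wake_queue(q->dev);
}

i.e. the check happens after the skb has already been accepted, so an
awake queue is a hard guarantee that one more skb of any shape can be
taken.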
Ok, how about this (a rough C sketch follows the list):
* ndo_start_xmit tries to set up the grant copy operations, something
which is done now in the thread
* no estimation madness, just go ahead and try to do it
* if the skb can fit, kick the thread
* if it fails (not enough slots to complete the TX), then:
* call netif_tx_stop_queue on that queue (just like now)
* set up the rx_stalled timer (just like now)
* save the state of the current skb (i.e. where the grant copy op setup
was halted)
* if new slots come in, continue creating the grant copy ops for the
stalled skb, and if that succeeds, kick the thread and call
netif_tx_start_queue (just like now)
* if the timer fires, drop the stalled skb and turn the carrier off, so
the qdisc won't bother queueing packets for a stalled interface
* the thread then only does the actual grant copy hypercall and releases
the skb
* in any case, ndo_start_xmit should return NETDEV_TX_OK, just like now
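In (very rough) C, with made-up names for the helpers that don't exist
yet, I picture start_xmit roughly like this:

static netdev_tx_t proposed_start_xmit(struct sk_buff *skb,
                                       struct net_device *dev)
{
        struct xenvif *vif = netdev_priv(dev);
        u16 index = skb_get_queue_mapping(skb);
        struct xenvif_queue *queue = &vif->queues[index];

        /* Try to build the grant copy operations right away
         * (proposed_setup_grant_copy_ops() is an invented name). */
        if (proposed_setup_grant_copy_ops(queue, skb) == 0) {
                /* Fits: kick the thread, which now only issues the
                 * hypercall and releases the skb. */
                wake_up(&queue->wq);
                return NETDEV_TX_OK;
        }

        /* Not enough slots: remember where we stopped, stop this tx
         * queue and arm the stall timer (names illustrative only). */
        proposed_save_partial_state(queue, skb);
        netif_tx_stop_queue(netdev_get_tx_queue(dev, index));
        mod_timer(&queue->rx_stalled, jiffies + rx_stall_timeout);

        return NETDEV_TX_OK;            /* the skb stays with us either way */
}

/* When the frontend posts new slots: finish the stalled skb, kick the
 * thread and start the tx queue again. If the rx_stalled timer fires
 * first: drop the stalled skb and call netif_carrier_off(dev) so the
 * qdisc stops queueing for us. */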
>
> Returning TX_BUSY from ->ndo_start_xmit() is fundamentally, therefore,
> more like an error condition rather than something that should occur
> under normal circumstances. If your queue is up, you should be able
> to accept any one single packet. That's the rule.
>
> Requeueing back into the qdisc is expensive and takes a lot of locks
> awkwardly. You do not want the kernel taking this code path.
Ok, thanks for clearing that up, I thought it worked differently.