From: Jason Wang <jasowang@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: eric.dumazet@gmail.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, davem@davemloft.net
Subject: Re: [RFC PATCH net-next 5/6] virtio-net: enable tx interrupt
Date: Wed, 15 Oct 2014 18:25:25 +0800
Message-ID: <543E4B95.2040209@redhat.com>
In-Reply-To: <20141015101826.GD25776@redhat.com>
On 10/15/2014 06:18 PM, Michael S. Tsirkin wrote:
> On Wed, Oct 15, 2014 at 03:25:29PM +0800, Jason Wang wrote:
>> > Orphaning skbs in ndo_start_xmit() breaks socket accounting and packet
>> > queuing. This in fact breaks lots of things, such as pktgen and several
>> > TCP optimizations, and also prevents BQL from being implemented for
>> > virtio-net.
>> >
>> > This patch tries to solve this issue by enabling the tx interrupt. To
>> > avoid introducing extra spinlocks, a tx napi is scheduled to free
>> > those packets.
>> >
>> > More tx interrupt mitigation methods could be used on top.
>> >
>> > Cc: Rusty Russell <rusty@rustcorp.com.au>
>> > Cc: Michael S. Tsirkin <mst@redhat.com>
>> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>> > ---
>> > drivers/net/virtio_net.c | 125 +++++++++++++++++++++++++++++++---------------
>> > 1 files changed, 85 insertions(+), 40 deletions(-)
>> >
>> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> > index ccf98f9..2afc2e2 100644
>> > --- a/drivers/net/virtio_net.c
>> > +++ b/drivers/net/virtio_net.c
>> > @@ -72,6 +72,8 @@ struct send_queue {
>> >
>> > /* Name of the send queue: output.$index */
>> > char name[40];
>> > +
>> > + struct napi_struct napi;
>> > };
>> >
>> > /* Internal representation of a receive virtqueue */
>> > @@ -217,15 +219,40 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
>> > return p;
>> > }
>> >
>> > +static int free_old_xmit_skbs(struct send_queue *sq, int budget)
>> > +{
>> > + struct sk_buff *skb;
>> > + unsigned int len;
>> > + struct virtnet_info *vi = sq->vq->vdev->priv;
>> > + struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
>> > + u64 tx_bytes = 0, tx_packets = 0;
>> > +
>> > + while (tx_packets < budget &&
>> > + (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
>> > + pr_debug("Sent skb %p\n", skb);
>> > +
>> > + tx_bytes += skb->len;
>> > + tx_packets++;
>> > +
>> > + dev_kfree_skb_any(skb);
>> > + }
>> > +
>> > + u64_stats_update_begin(&stats->tx_syncp);
>> > + stats->tx_bytes += tx_bytes;
>> > + stats->tx_packets += tx_packets;
>> > + u64_stats_update_end(&stats->tx_syncp);
>> > +
>> > + return tx_packets;
>> > +}
>> > +
>> > static void skb_xmit_done(struct virtqueue *vq)
>> > {
>> > struct virtnet_info *vi = vq->vdev->priv;
>> > + struct send_queue *sq = &vi->sq[vq2txq(vq)];
>> >
>> > - /* Suppress further interrupts. */
>> > - virtqueue_disable_cb(vq);
>> > -
>> > - /* We were probably waiting for more output buffers. */
>> > - netif_wake_subqueue(vi->dev, vq2txq(vq));
>> > + if (napi_schedule_prep(&sq->napi)) {
>> > + __napi_schedule(&sq->napi);
>> > + }
>> > }
>> >
>> > static unsigned int mergeable_ctx_to_buf_truesize(unsigned long mrg_ctx)
>> > @@ -774,7 +801,39 @@ again:
>> > return received;
>> > }
>> >
>> > +static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>> > +{
>> > + struct send_queue *sq =
>> > + container_of(napi, struct send_queue, napi);
>> > + struct virtnet_info *vi = sq->vq->vdev->priv;
>> > + struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));
>> > + unsigned int r, sent = 0;
>> > +
>> > +again:
>> > + __netif_tx_lock(txq, smp_processor_id());
>> > + virtqueue_disable_cb(sq->vq);
>> > + sent += free_old_xmit_skbs(sq, budget - sent);
>> > +
>> > + if (sent < budget) {
>> > + r = virtqueue_enable_cb_prepare(sq->vq);
>> > + napi_complete(napi);
>> > + __netif_tx_unlock(txq);
>> > + if (unlikely(virtqueue_poll(sq->vq, r)) &&
> So you are enabling callback on the next packet,
> which is almost sure to cause an interrupt storm
> on the guest.
>
>
> I think it's a bad idea, this is why I used
> enable_cb_delayed in my patch.
Right, will do this. We may also need to make sure the used event never
goes backwards, since we may call virtqueue_enable_cb_avail().
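
For illustration only, here is a rough userspace model of the two points
above. It is not driver code: the toy_* names and the struct are made up
for this sketch. It assumes the event-index test from the virtio spec and
assumes the delayed variant re-arms at roughly 3/4 of the outstanding
buffers. It counts how many interrupts each re-arm policy would take while
the device completes buffers one at a time, and shows a wraparound-safe way
to only ever move the used event forward.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of driver-side ring state; field names are illustrative. */
struct toy_vq {
	uint16_t avail_idx;     /* buffers queued by the driver so far */
	uint16_t last_used_idx; /* buffers reclaimed by the driver so far */
	uint16_t used_event;    /* interrupt once the used index passes this */
};

/* Event-index test: does moving the used index old_idx -> new_idx interrupt? */
static int toy_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Immediate re-arm: interrupt on the very next completion. */
static void toy_enable_cb(struct toy_vq *vq)
{
	vq->used_event = vq->last_used_idx;
}

/* Delayed re-arm (assumed 3/4 heuristic): interrupt after most completions. */
static void toy_enable_cb_delayed(struct toy_vq *vq)
{
	uint16_t bufs = (uint16_t)((uint16_t)(vq->avail_idx - vq->last_used_idx) * 3 / 4);

	vq->used_event = (uint16_t)(vq->last_used_idx + bufs);
}

/* Only move used_event forward; safe across 16-bit index wraparound. */
static void toy_set_used_event_forward(struct toy_vq *vq, uint16_t new_event)
{
	uint16_t ahead = (uint16_t)(new_event - vq->used_event);

	if (ahead != 0 && ahead < 0x8000)
		vq->used_event = new_event;
}

/* Device completes every queued buffer one by one; count interrupts taken. */
static unsigned int toy_count_irqs(struct toy_vq *vq, int delayed)
{
	unsigned int irqs = 0;
	uint16_t used = vq->last_used_idx;

	assert(vq != 0);
	if (delayed)
		toy_enable_cb_delayed(vq);
	else
		toy_enable_cb(vq);

	while (used != vq->avail_idx) {
		uint16_t old = used++;

		if (toy_need_event(vq->used_event, used, old)) {
			irqs++;
			vq->last_used_idx = used; /* driver reclaims all used buffers */
			if (delayed)
				toy_enable_cb_delayed(vq);
			else
				toy_enable_cb(vq);
		}
	}
	return irqs;
}
```

In this model, 64 queued buffers cost one interrupt per buffer with the
immediate re-arm and only a handful with the delayed one, which is the
storm being described. The forward-only helper is just one possible way to
express the "never goes back" rule, not necessarily what the series does.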
>
>
>> > + napi_schedule_prep(napi)) {
>> > + virtqueue_disable_cb(sq->vq);
>> > + __napi_schedule(napi);
>> > + goto again;
>> > + }
>> > + } else {
>> > + __netif_tx_unlock(txq);
>> > + }
>> > +
>> > + netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
>> > + return sent;
>> > +}
>> > +
>> > #ifdef CONFIG_NET_RX_BUSY_POLL
>> > +
>> > /* must be called with local_bh_disable()d */
>> > static int virtnet_busy_poll(struct napi_struct *napi)
>> > {
>> > @@ -822,36 +881,12 @@ static int virtnet_open(struct net_device *dev)
>> > if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
>> > schedule_delayed_work(&vi->refill, 0);
>> > virtnet_napi_enable(&vi->rq[i]);
>> > + napi_enable(&vi->sq[i].napi);
>> > }
>> >
>> > return 0;
>> > }
>> >
>> > -static int free_old_xmit_skbs(struct send_queue *sq)
>> > -{
>> > - struct sk_buff *skb;
>> > - unsigned int len;
>> > - struct virtnet_info *vi = sq->vq->vdev->priv;
>> > - struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
>> > - u64 tx_bytes = 0, tx_packets = 0;
>> > -
>> > - while ((skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
>> > - pr_debug("Sent skb %p\n", skb);
>> > -
>> > - tx_bytes += skb->len;
>> > - tx_packets++;
>> > -
>> > - dev_kfree_skb_any(skb);
>> > - }
>> > -
>> > - u64_stats_update_begin(&stats->tx_syncp);
>> > - stats->tx_bytes += tx_bytes;
>> > - stats->tx_packets += tx_packets;
>> > - u64_stats_update_end(&stats->tx_syncp);
>> > -
>> > - return tx_packets;
>> > -}
>> > -
> So you end up moving it all anyway, why bother splitting out
> minor changes in previous patches?
To make review easier. But if you think this in fact complicates it,
I will fold them into a single patch.