From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] net: core: orphan frags before queuing to slow qdisc
Date: Sun, 19 Jan 2014 11:21:40 +0200
Message-ID: <20140119092140.GA2984@redhat.com>
In-Reply-To: <1389951734-13234-1-git-send-email-jasowang@redhat.com>
On Fri, Jan 17, 2014 at 05:42:14PM +0800, Jason Wang wrote:
> Many qdiscs can queue a packet for a long time, which causes an issue
> with zerocopy skbs: the frags will not be orphaned within the expected
> short time, which breaks the assumption that virtio-net will transmit the
> packet in time.
>
> So if guest packets are queued through such a qdisc and hit the limit
> on the max pending packets for virtio/vhost, all packets that go from
> the guest to other destinations will also be blocked.
>
> A case for reproducing the issue:
>
> - Boot two VMs and connect them to the same bridge kvmbr.
> - Set up tbf with a very low rate/burst on eth0, which is a port of kvmbr.
> - Let VM1 send lots of packets through eth0.
> - After a while, VM1 is unable to send any packets out since the number of
> pending packets (queued to tbf) exceeds the limit of vhost/virtio.
>
> Solve this issue by orphaning the frags before queuing the skb to a slow
> qdisc (one without TCQ_F_CAN_BYPASS).
This seems too aggressive to me.
The issue is that a packet can stay queued indefinitely.
So I think we should only do this for tbf.
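
To make the trade-off concrete, here is a rough user-space toy model, not
kernel code. Every name in it (toy_qdisc, toy_run, TOY_MAX_PENDING, the
policy enum) is hypothetical, and the pending limit of 4 is an arbitrary
stand-in for the vhost/virtio limit. It assumes that orphaning frags costs
one copy per packet and completes the guest buffer immediately, while a
stalled qdisc keeps un-orphaned buffers pinned.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not kernel API. */
#define TOY_MAX_PENDING 4   /* arbitrary stand-in for the vhost pending limit */

enum toy_policy {
    TOY_ORPHAN_NEVER,       /* current behaviour: never orphan before enqueue */
    TOY_ORPHAN_NO_BYPASS,   /* the patch: orphan unless "can bypass" is set */
    TOY_ORPHAN_TBF_ONLY     /* the alternative suggested here */
};

struct toy_qdisc {
    bool can_bypass;        /* models TCQ_F_CAN_BYPASS */
    bool is_tbf;            /* true for a tbf-like shaper */
    bool stalled;           /* true if it never dequeues during the run */
};

struct toy_result {
    int sent;               /* packets the guest managed to hand over */
    int blocked;            /* packets refused because the limit was hit */
    int copies;             /* packets whose frags had to be copied */
};

static bool toy_should_orphan(const struct toy_qdisc *q, enum toy_policy p)
{
    switch (p) {
    case TOY_ORPHAN_NO_BYPASS:
        return !q->can_bypass;
    case TOY_ORPHAN_TBF_ONLY:
        return q->is_tbf;
    default:
        return false;
    }
}

static struct toy_result toy_run(const struct toy_qdisc *q,
                                 enum toy_policy p, int npackets)
{
    struct toy_result r = { 0, 0, 0 };
    int pending = 0;        /* zerocopy buffers still pinned by the qdisc */

    for (int i = 0; i < npackets; i++) {
        if (pending >= TOY_MAX_PENDING) {
            r.blocked++;    /* guest cannot send to any destination */
            continue;
        }
        r.sent++;
        if (toy_should_orphan(q, p)) {
            r.copies++;     /* copy frags, guest buffer completes at once */
            continue;
        }
        if (q->stalled)
            pending++;      /* buffer stays pinned while the packet is queued */
    }
    return r;
}
```

In this model, a stalled tbf with TOY_ORPHAN_NEVER blocks the guest after
TOY_MAX_PENDING packets, and either orphaning policy avoids that. The two
policies differ only for a qdisc that lacks the bypass flag but does not
stall: the no-bypass policy pays a copy for every packet there, while the
tbf-only policy pays none.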
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> net/core/dev.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0ce469e..1209774 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2700,6 +2700,12 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
> contended = qdisc_is_running(q);
> if (unlikely(contended))
> spin_lock(&q->busylock);
> + if (!(q->flags & TCQ_F_CAN_BYPASS) &&
> + unlikely(skb_orphan_frags(skb, GFP_ATOMIC))) {
> + kfree_skb(skb);
> + rc = NET_XMIT_DROP;
> + goto out;
> + }
>
> spin_lock(root_lock);
> if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
> @@ -2739,6 +2745,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
> }
> }
> spin_unlock(root_lock);
> +out:
> if (unlikely(contended))
> spin_unlock(&q->busylock);
> return rc;
> --
> 1.8.3.2