From: Jason Wang <jasowang@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Steven Galgano <sgalgano@adjacentlink.com>,
	davem@davemloft.net, xemul@parallels.com,
	wuzhy@linux.vnet.ibm.com, therbert@google.com, yamato@redhat.com,
	richardcochran@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Brian Adamson <Brian.Adamson@nrl.navy.mil>,
	Joseph Giovatto <jgiovatto@adjacentlink.com>
Subject: Re: [PATCH] tuntap: add flow control to support back pressure
Date: Fri, 11 Apr 2014 10:57:55 +0800
Message-ID: <1397185075.6230.1.camel@localhost>
In-Reply-To: <20140410102931.GA12077@redhat.com>

On Thu, 2014-04-10 at 13:29 +0300, Michael S. Tsirkin wrote:
> On Wed, Apr 09, 2014 at 10:19:40PM -0400, Steven Galgano wrote:
> > Add tuntap flow control support for use by back pressure routing
> > protocols. Setting the new TUNSETIFF flag IFF_FLOW_CONTROL will signal
> > resources as unavailable when the tx queue limit is reached, by issuing
> > netif_tx_stop_all_queues() rather than discarding frames.
> > netif_tx_wake_all_queues() is issued after reading a frame from the
> > queue to signal resource availability.
> > 
> > Back pressure capability was previously supported by the legacy tun default mode. This change restores that functionality, which was last present in v3.7.
> > 
> > Reported-by: Brian Adamson <brian.adamson@nrl.navy.mil>
> > Tested-by: Joseph Giovatto <jgiovatto@adjacentlink.com>
> > Signed-off-by: Steven Galgano <sgalgano@adjacentlink.com>
> 
> I don't think it's a good idea.
> 
> This trivial flow control really created more problems than it was worth.
> 
> In particular this blocks all flows so it's trivially easy for one flow
> to block and starve all others: just send a bunch of packets to loopback
> destinations that get queued all over the place.
> 
> Luckily it was never documented so we changed the default and nothing
> seems to break, but we won't be so lucky if we add an explicit API.
> 
> 
> One way to implement this would be with the ubuf_info callback, which is
> already invoked in most places where a packet might get stuck for a long
> time.  It's still incomplete though: it keeps head of queue blocking
> from lasting literally forever, but a single bad flow can still degrade
> performance significantly.

This is the send queue for tuntap. As with any real NIC, couldn't we
solve this with a fairness qdisc?
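
To make the starvation concern and the fairness idea concrete, here is a
minimal userspace sketch. It is a toy model, not kernel code and not how any
real qdisc is implemented: one greedy flow and two light flows feed either a
single shared FIFO or per-flow queues drained round-robin. All names and
constants (fifo_delivered, rr_delivered, CAP, PERCAP, DRAIN, offered) are
hypothetical and chosen only for illustration.

```c
#define NFLOWS 3
#define TICKS  100
#define DRAIN  3   /* packets the reader consumes per tick (assumed) */
#define CAP    6   /* capacity of the shared FIFO (assumed) */
#define PERCAP 2   /* per-flow capacity in the fair model (assumed) */

/* Packets each flow tries to send per tick; flow 0 is the greedy one. */
static const int offered[NFLOWS] = { 10, 1, 1 };

/* Shared FIFO: a sender that finds the queue full is simply blocked. */
void fifo_delivered(long delivered[NFLOWS])
{
    int ring[CAP];
    int head = 0, len = 0;

    for (int f = 0; f < NFLOWS; f++)
        delivered[f] = 0;

    for (int t = 0; t < TICKS; t++) {
        /* Flows enqueue in order until the shared queue is full. */
        for (int f = 0; f < NFLOWS; f++)
            for (int p = 0; p < offered[f] && len < CAP; p++) {
                ring[(head + len) % CAP] = f;
                len++;
            }
        /* The reader drains from the head, whoever owns the packet. */
        for (int d = 0; d < DRAIN && len > 0; d++) {
            delivered[ring[head]]++;
            head = (head + 1) % CAP;
            len--;
        }
    }
}

/* Per-flow queues served round-robin, a crude stand-in for fair queueing. */
void rr_delivered(long delivered[NFLOWS])
{
    int qlen[NFLOWS] = { 0 };
    int rr = 0;

    for (int f = 0; f < NFLOWS; f++)
        delivered[f] = 0;

    for (int t = 0; t < TICKS; t++) {
        /* Each flow can only fill its own queue. */
        for (int f = 0; f < NFLOWS; f++) {
            int room = PERCAP - qlen[f];
            qlen[f] += offered[f] < room ? offered[f] : room;
        }
        /* Serve one packet per non-empty flow in turn. */
        int served = 0, idle = 0;
        while (served < DRAIN && idle < NFLOWS) {
            if (qlen[rr] > 0) {
                qlen[rr]--;
                delivered[rr]++;
                served++;
                idle = 0;
            } else {
                idle++;
            }
            rr = (rr + 1) % NFLOWS;
        }
    }
}
```

In this model the shared FIFO delivers 300 packets for flow 0 and none for
the other two, while the round-robin variant delivers 100 for each flow. The
enqueue order is deliberately worst-case, so the numbers show the shape of
the problem rather than anything measured on tun.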
> 
> Another alternative is to try and isolate the flows that we
> can handle and throttle them.
> 
> It's all fixable but we really need to fix the issues *before*
> exposing the interface to userspace.
> 
> 
> 
> > ---
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index ee328ba..268130c 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -783,8 +783,19 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	 * number of queues.
> >  	 */
> >  	if (skb_queue_len(&tfile->socket.sk->sk_receive_queue) * numqueues
> > -			  >= dev->tx_queue_len)
> > -		goto drop;
> > +			>= dev->tx_queue_len) {
> > +		if (tun->flags & TUN_FLOW_CONTROL) {
> > +			/* Resources unavailable stop transmissions */
> > +			netif_tx_stop_all_queues(dev);
> > +
> > +			/* We won't see all dropped packets individually, so
> > +			 * over run error is more appropriate.
> > +			 */
> > +			dev->stats.tx_fifo_errors++;
> > +		} else {
> > +			goto drop;
> > +		}
> > +	}
> >  
> >  	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
> >  		goto drop;
> > @@ -1362,6 +1373,9 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
> >  			continue;
> >  		}
> >  
> > +		/* Wake in case resources previously signaled unavailable */
> > +		netif_tx_wake_all_queues(tun->dev);
> > +
> >  		ret = tun_put_user(tun, tfile, skb, iv, len);
> >  		kfree_skb(skb);
> >  		break;
> > @@ -1550,6 +1564,9 @@ static int tun_flags(struct tun_struct *tun)
> >  	if (tun->flags & TUN_PERSIST)
> >  		flags |= IFF_PERSIST;
> >  
> > +	if (tun->flags & TUN_FLOW_CONTROL)
> > +		flags |= IFF_FLOW_CONTROL;
> > +
> >  	return flags;
> >  }
> >  
> > @@ -1732,6 +1749,11 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
> >  	else
> >  		tun->flags &= ~TUN_TAP_MQ;
> >  
> > +	if (ifr->ifr_flags & IFF_FLOW_CONTROL)
> > +		tun->flags |= TUN_FLOW_CONTROL;
> > +	else
> > +		tun->flags &= ~TUN_FLOW_CONTROL;
> > +
> >  	/* Make sure persistent devices do not get stuck in
> >  	 * xoff state.
> >  	 */
> > @@ -1900,7 +1922,8 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
> >  		 * This is needed because we never checked for invalid flags on
> >  		 * TUNSETIFF. */
> >  		return put_user(IFF_TUN | IFF_TAP | IFF_NO_PI | IFF_ONE_QUEUE |
> > -				IFF_VNET_HDR | IFF_MULTI_QUEUE,
> > +				IFF_VNET_HDR | IFF_MULTI_QUEUE |
> > +				IFF_FLOW_CONTROL,
> >  				(unsigned int __user*)argp);
> >  	} else if (cmd == TUNSETQUEUE)
> >  		return tun_set_queue(file, &ifr);
> > diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
> > index e9502dd..bcf2790 100644
> > --- a/include/uapi/linux/if_tun.h
> > +++ b/include/uapi/linux/if_tun.h
> > @@ -36,6 +36,7 @@
> >  #define TUN_PERSIST 	0x0100	
> >  #define TUN_VNET_HDR 	0x0200
> >  #define TUN_TAP_MQ      0x0400
> > +#define TUN_FLOW_CONTROL 0x0800
> >  
> >  /* Ioctl defines */
> >  #define TUNSETNOCSUM  _IOW('T', 200, int) 
> > @@ -70,6 +71,7 @@
> >  #define IFF_MULTI_QUEUE 0x0100
> >  #define IFF_ATTACH_QUEUE 0x0200
> >  #define IFF_DETACH_QUEUE 0x0400
> > +#define IFF_FLOW_CONTROL 0x0010
> >  /* read-only flag */
> >  #define IFF_PERSIST	0x0800
> >  #define IFF_NOFILTER	0x1000
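
For readers following the diff, the behavioural difference between the
existing drop path and the proposed stop/wake path can be sketched in plain
userspace C. This is a simplified model under stated assumptions: a single
queue (the numqueues scaling in tun_net_xmit() is ignored), and a sender that
stops calling the transmit routine once the queue is marked stopped. The
toy_* names are hypothetical and exist only in this example.

```c
#include <stdbool.h>

struct toy_tun {
    int  qlen;                 /* packets waiting for the reader */
    int  limit;                /* stands in for dev->tx_queue_len */
    bool flow_control;         /* stands in for TUN_FLOW_CONTROL */
    bool stopped;              /* set by "stop", cleared by "wake" */
    unsigned long dropped;
    unsigned long fifo_errors; /* mirrors the tx_fifo_errors counter */
};

/* Models the patched limit check; returns true if the packet was queued. */
bool toy_xmit(struct toy_tun *t)
{
    if (t->qlen >= t->limit) {
        if (!t->flow_control) {
            t->dropped++;      /* existing behaviour: goto drop */
            return false;
        }
        t->stopped = true;     /* proposed: stop instead of dropping */
        t->fifo_errors++;
        /* As the diff reads, this packet is still queued below. */
    }
    t->qlen++;
    return true;
}

/* Models the read side: take one packet and wake the sender. */
bool toy_read(struct toy_tun *t)
{
    if (t->qlen == 0)
        return false;
    t->qlen--;
    t->stopped = false;
    return true;
}

/* Sender that honours the stopped state; returns packets queued. */
int toy_send_burst(struct toy_tun *t, int n)
{
    int queued = 0;

    for (int i = 0; i < n && !t->stopped; i++)
        if (toy_xmit(t))
            queued++;
    return queued;
}
```

With limit = 2 and a burst of five packets, the model without flow control
queues two and drops three. With flow control it queues three, drops none,
and leaves the sender stopped until the next toy_read(). That stop applies to
every sender on the device, which is the blocking concern raised earlier in
the thread.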


