netdev.vger.kernel.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	hannes@stressinduktion.org, edumazet@google.com
Subject: Re: [PATCH net] tuntap: raise EPOLLOUT on device up
Date: Tue, 22 May 2018 06:45:09 +0300
Message-ID: <20180522063128-mutt-send-email-mst@kernel.org>
In-Reply-To: <358628f8-a296-ad0e-985b-307895ed5520@redhat.com>

On Tue, May 22, 2018 at 11:22:11AM +0800, Jason Wang wrote:
> 
> 
> On 2018-05-22 06:08, Michael S. Tsirkin wrote:
> > On Mon, May 21, 2018 at 11:47:42AM -0400, David Miller wrote:
> > > From: Jason Wang <jasowang@redhat.com>
> > > Date: Fri, 18 May 2018 21:00:43 +0800
> > > 
> > > > We return -EIO on device down but cannot raise EPOLLOUT after it comes
> > > > back up. This may confuse users like vhost, which expects tuntap to raise
> > > > EPOLLOUT to re-enable its TX routine after tuntap was down. This can
> > > > be easily reproduced by transmitting packets from a VM while bringing the
> > > > tap device down and up. Fix this by setting SOCKWQ_ASYNC_NOSPACE on -EIO.
> > > > 
> > > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > I'm not so sure what to do with this patch.
> > > 
> > > Like Michael says, this flag bit is only checked upon transmit, which
> > > may or may not happen after this point.  It doesn't seem to be
> > > guaranteed.
> 
> The flag is checked in tun_chr_poll() as well.
> 
> > Jason, can't we detect a link up transition and respond accordingly?
> > What do you think?
> > 
> 
> I think we've already tried to do this, in tun_net_open() we call
> write_space(). But the problem is the bit may not be set at that time.

Which bit? __dev_change_flags seems to set IFF_UP before calling
ndo_open. The issue, I think, is that tun_sock_write_space
exits if SOCKWQ_ASYNC_NOSPACE is clear.

And now I think I understand what is going on:

	When link is down, writes to the device might fail with -EIO.
	Userspace needs an indication when the status is resolved.  As a fix,
	tun_net_open attempts to wake up writers - but that is only effective if
	SOCKWQ_ASYNC_NOSPACE has been set in the past.  As a quick hack, set
	SOCKWQ_ASYNC_NOSPACE when write fails because of the link down status.
	If no writes failed, userspace does not know that interface
	was down so should not care that it's going up.


Does this describe what this line of code does?
If yes, feel free to include this info in a code comment and commit log.
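
To make the sequence in the indented paragraph above concrete, here is a minimal userspace sketch of the wake-up logic. It is a model only, not driver code: struct tun_model and the model_* helpers are hypothetical names that do not exist in drivers/net/tun.c, and clearing the bit on wake-up is an assumption about how the write_space path behaves.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in drivers/net/tun.c. */
struct tun_model {
	bool up;            /* stands in for dev->flags & IFF_UP */
	bool async_nospace; /* stands in for SOCKWQ_ASYNC_NOSPACE */
	int wakeups;        /* number of EPOLLOUT wake-ups delivered */
};

/* Models the write_space path: a no-op unless the bit was set earlier.
 * Clearing the bit here is an assumption of this sketch. */
static void model_write_space(struct tun_model *t)
{
	if (!t->async_nospace)
		return;
	t->async_nospace = false;
	t->wakeups++;
}

/* Models a write to the device: fails with -EIO while the link is down.
 * set_bit_on_eio selects whether the proposed fix is applied. */
static int model_write(struct tun_model *t, bool set_bit_on_eio)
{
	if (!t->up) {
		if (set_bit_on_eio)
			t->async_nospace = true;
		return -EIO;
	}
	return 0;
}

/* Models bringing the link up: the open path tries to wake writers. */
static void model_set_up(struct tun_model *t)
{
	t->up = true;
	model_write_space(t);
}
```

Without the fix, a writer that saw -EIO gets no wake-up when the link comes up, because the bit was never set. With the fix, the failed write arms the bit and the link-up transition delivers exactly one wake-up.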



> A second thought is to set the bit in tun_chr_poll() instead of -EIO like:
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index d45ac37..46a1573 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1423,6 +1423,13 @@ static void tun_net_init(struct net_device *dev)
>         dev->max_mtu = MAX_MTU - dev->hard_header_len;
>  }
> 
> +static bool tun_sock_writeable(struct tun_struct *tun, struct tun_file *tfile)
> +{
> +       struct sock *sk = tfile->socket.sk;
> +
> +       return (tun->dev->flags & IFF_UP) && sock_writeable(sk);
> +}
> +
>  /* Character device part */
> 
>  /* Poll */
> @@ -1445,10 +1452,9 @@ static __poll_t tun_chr_poll(struct file *file, poll_table *wait)
>         if (!ptr_ring_empty(&tfile->tx_ring))
>                 mask |= EPOLLIN | EPOLLRDNORM;
> 
> -       if (tun->dev->flags & IFF_UP &&
> -           (sock_writeable(sk) ||
> -            (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
> -             sock_writeable(sk))))
> +       if (tun_sock_writeable(tun, tfile) ||
> +           (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
> +            tun_sock_writeable(tun, tfile)))
>                 mask |= EPOLLOUT | EPOLLWRNORM;
> 
>         if (tun->dev->reg_state != NETREG_REGISTERED)
> 
> Does this make more sense?
> 
> Thanks
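
For readers following the poll-side alternative quoted above, the condition can be exercised in isolation with a small userspace sketch. Everything here is hypothetical: struct poll_model, MODEL_EPOLLOUT and the model_* helpers are stand-ins invented for illustration, and model_test_and_set() only imitates the return-old-value behaviour of test_and_set_bit() without any atomicity.

```c
#include <stdbool.h>

#define MODEL_EPOLLOUT 0x004u /* hypothetical stand-in for EPOLLOUT */

/* Userspace model only; not the kernel's data structures. */
struct poll_model {
	bool up;             /* stands in for dev->flags & IFF_UP */
	bool sock_writeable; /* stands in for sock_writeable(sk) */
	bool async_nospace;  /* stands in for SOCKWQ_ASYNC_NOSPACE */
};

/* Mirrors the proposed helper: writable only if the link is up as well. */
static bool model_writeable(const struct poll_model *p)
{
	return p->up && p->sock_writeable;
}

/* Non-atomic imitation of test_and_set_bit(): set, return the old value. */
static bool model_test_and_set(bool *bit)
{
	bool old = *bit;

	*bit = true;
	return old;
}

/* Same shape as the condition in the quoted hunk. */
static unsigned int model_poll(struct poll_model *p)
{
	unsigned int mask = 0;

	if (model_writeable(p) ||
	    (!model_test_and_set(&p->async_nospace) && model_writeable(p)))
		mask |= MODEL_EPOLLOUT;
	return mask;
}
```

In this model, polling while the link is down reports no EPOLLOUT but leaves the bit set, so a later link-up wake-up would not be dropped. Polling while writable short-circuits and never touches the bit.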
