netdev.vger.kernel.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Neil Brown <neilb@suse.de>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 4/4] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack.
Date: Sun, 13 Nov 2011 12:17:14 +0200
Message-ID: <20111113101713.GB15322@redhat.com>
In-Reply-To: <1321017627.955.254.camel@zakaz.uk.xensource.com>

On Fri, Nov 11, 2011 at 01:20:27PM +0000, Ian Campbell wrote:
> On Fri, 2011-11-11 at 12:38 +0000, Michael S. Tsirkin wrote:
> > On Wed, Nov 09, 2011 at 03:02:07PM +0000, Ian Campbell wrote:
> > > This prevents an issue where an ACK is delayed, a retransmit is queued (either
> > > at the RPC or TCP level) and the ACK arrives before the retransmission hits the
> > > wire. If this happens to an NFS WRITE RPC then the write() system call
> > > completes and the userspace process can continue, potentially modifying data
> > > referenced by the retransmission before the retransmission occurs.
> > > 
> > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > > Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> > > Cc: "David S. Miller" <davem@davemloft.net>
> > > Cc: Neil Brown <neilb@suse.de>
> > > Cc: "J. Bruce Fields" <bfields@fieldses.org>
> > > Cc: linux-nfs@vger.kernel.org
> > > Cc: netdev@vger.kernel.org
> > 
> > So this blocks the system call until all page references
> > are gone, right?
> 
> Right. The alternative is to return to userspace while the network stack
> still has a reference to the buffer which was passed in -- that's the
> exact class of problem this patch is supposed to fix.

BTW, the increased latency and the overhead of the extra wakeups might,
for some workloads, be greater than the cost of the data copy.
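
To make the blocking behaviour concrete, here is a rough userspace
sketch in Python. It is not kernel code and not taken from the patch:
SharedPage, get(), put() and demo() are hypothetical names, and the
Event stands in for the fragment destructor waking the sender. It only
illustrates that the sender cannot complete until the last reference,
including one held by a clone queued elsewhere, is dropped.

```python
import threading


class SharedPage:
    """Hypothetical stand-in for a page referenced by skb fragments."""

    def __init__(self, data: bytearray):
        self.data = data
        self._refs = 1                      # the sender's own reference
        self._lock = threading.Lock()
        self.released = threading.Event()   # set by the "destructor"

    def get(self) -> "SharedPage":
        # Taken, for example, when an skb referencing the page is cloned.
        with self._lock:
            self._refs += 1
        return self

    def put(self) -> None:
        # Dropping the last reference runs the "destructor".
        with self._lock:
            self._refs -= 1
            last = self._refs == 0
        if last:
            self.released.set()


def demo() -> tuple:
    page = SharedPage(bytearray(b"payload"))
    nic_ref = page.get()    # reference held by the transmit path
    tap_ref = page.get()    # reference held by a clone queued at a tap
    page.put()              # sender drops its own reference and waits

    nic_ref.put()           # the NIC has transmitted the data
    before = page.released.wait(0.05)   # tap still holds a reference

    tap_ref.put()           # the tap consumer finally reads the clone
    after = page.released.wait(0.05)
    return before, after
```

In this sketch the wait only succeeds once the tap side calls put(), so
the sender's latency is bounded by the slowest holder, not by the wire.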

> > But, there's no upper limit on how long the
> > page is referenced, correct?
> 
> Correct.
> 
> > Consider a bridged setup
> > with an skb queued at a tap device - this could cause one process
> > to block another one by virtue of not consuming a cloned skb?
> 
> Hmm, yes.
> 
> One approach might be to introduce the concept of an skb timeout to the
> stack as a whole and cancel (or deep copy) after that timeout occurs.
> That's going to be tricky though I suspect...

Further, an application might rely on signals such as SIGALRM;
delaying them significantly will cause trouble.

> A simpler option would be to have end points such as a tap device

Which end points would that be? Doesn't this also affect a packet socket
with matching filters? Or a userspace TCP socket that happens to
reside on the same box? It also seems that, at least with a tap device,
an skb can get queued in a qdisc, also indefinitely, right?

> which can swallow skbs for arbitrary times implement a policy in this
> regard, either to deep copy or drop after a timeout?
> 
> Ian.

Or deep copy immediately?
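
The two policies can be compared with a small userspace sketch in
Python. Everything here is hypothetical and for illustration only:
SlowEndpoint, Entry, copy_after and the release callback are made-up
names, not kernel interfaces. An end point that may hold buffers for a
long time takes a private copy once an entry has been queued for
copy_after seconds and then releases the sender's buffer; copy_after
set to 0 models "deep copy immediately".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Entry:
    data: bytearray                          # shared with the sender until copied
    release: Optional[Callable[[], None]]    # hypothetical destructor callback
    queued_at: float


class SlowEndpoint:
    """Hypothetical tap-like queue that may hold buffers indefinitely."""

    def __init__(self, copy_after: float):
        # copy_after == 0 means "deep copy immediately".
        self.copy_after = copy_after
        self.entries: List[Entry] = []

    def enqueue(self, data: bytearray, release: Callable[[], None],
                now: float) -> None:
        self.entries.append(Entry(data, release, now))
        self.expire(now)

    def expire(self, now: float) -> None:
        """Deep copy entries held past the limit, then release the original."""
        for e in self.entries:
            if e.release is not None and now - e.queued_at >= self.copy_after:
                e.data = bytearray(e.data)   # private copy of the payload
                e.release()                  # sender's buffer is free again
                e.release = None
```

With a timeout, only buffers that actually linger pay for the copy, but
the sender can still be held up for as long as copy_after. With an
immediate copy the sender is never held up by this end point, at the
cost of copying every buffer that reaches it.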

-- 
MST
