linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeremy Allison <jra-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
To: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
Cc: Jeremy Allison <jra-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
	Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
	linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linux-fsdevel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Recvfile patch used for Samba.
Date: Tue, 23 Jul 2013 14:58:58 -0700	[thread overview]
Message-ID: <20130723215858.GB16356@samba2> (raw)
In-Reply-To: <20130723071027.GJ19986@dastard>

On Tue, Jul 23, 2013 at 05:10:27PM +1000, Dave Chinner wrote:
> So, we are nesting up to 32 page locks here. That's bad. And we are
> nesting kmap() calls for all the pages individually - is that even
> safe to do?
> 
> So, what happens when we've got 16 pages in, and the filesystem has
> allocated space for those 16 blocks, and we get ENOSPC on the 17th?
> Sure, you undo the state here, but what about the 16 blocks that the
> filesystem has allocated to this file? There's no notification to
> the filesystem that they need to be truncated away because the write
> failed....
> 
> > +
> > +	/* IOV is ready, receive the date from socket now */
> > +	msg.msg_name = NULL;
> > +	msg.msg_namelen = 0;
> > +	msg.msg_iov = (struct iovec *)&iov[0];
> > +	msg.msg_iovlen = cPagesAllocated ;
> > +	msg.msg_control = NULL;
> > +	msg.msg_controllen = 0;
> > +	msg.msg_flags = MSG_KERNSPACE;
> > +	rcvtimeo = sock->sk->sk_rcvtimeo;    
> > +	sock->sk->sk_rcvtimeo = 8 * HZ;
> 
> We can hold the inode and the pages locked for 8 seconds?
> 
> I'll stop there. This is fundamentally broken. It's an attempt to do
> a multi-page write operation without any of the supporting
> structures needed to handle the failure cases properly.  The nested
> page locking has "deadlock" written all over it, and the lack of
> partial failure handling shouts "data corruption" and "stale data
> exposure" to me. The fact it can block for up to 8 seconds waiting
> for network shenanigans to be completed while holding lots of locks
> is going to cause all sorts of problems under memory pressure.
> 
> Not to mention it means that all memory allocations in the msgrcv
> path need to be done with GFP_NOFS, because GFP_KERNEL allocations
> are almost guaranteed to deadlock on the locked pages this path
> already holds....
> 
> Need I say more?

No, that's great ! :-).

Thanks for the analysis. I'd heard it wasn't
near production quality, but not being a kernel
engineer myself I wasn't able to make that assessment.

Having said that the OEMs that are using it does
find it improves write speeds by a large amount (10%
or more), so it's showing there is room for improvement
here if the correct code can be created for recvfile.

Cheers,

Jeremy.

  parent reply	other threads:[~2013-07-23 21:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-22 21:57 Recvfile patch used for Samba Jeremy Allison
2013-07-22 23:26 ` Joe Perches
2013-07-23  7:10 ` Dave Chinner
2013-07-23 13:31   ` Jeff Layton
2013-07-23 21:58   ` Jeremy Allison [this message]
2013-07-24  2:47     ` Dave Chinner
2013-07-25  8:17       ` Steven Whitehouse
2013-07-26  4:42         ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130723215858.GB16356@samba2 \
    --to=jra-eunubhrolfbytjvyw6ydsg@public.gmane.org \
    --cc=david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org \
    --cc=jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org \
    --cc=linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).