From: Jason Gunthorpe <jgg@ziepe.ca>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
Dave Chinner <david@fromorbit.com>,
Christopher Lameter <cl@linux.com>,
Doug Ledford <dledford@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
lsf-pc@lists.linux-foundation.org,
linux-rdma <linux-rdma@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
John Hubbard <jhubbard@nvidia.com>,
Jerome Glisse <jglisse@redhat.com>,
Michal Hocko <mhocko@kernel.org>
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA
Date: Mon, 11 Feb 2019 11:26:49 -0700
Message-ID: <20190211182649.GD24692@ziepe.ca>
In-Reply-To: <20190211181921.GA5526@iweiny-DESK2.sc.intel.com>
On Mon, Feb 11, 2019 at 10:19:22AM -0800, Ira Weiny wrote:
> On Mon, Feb 11, 2019 at 11:06:54AM -0700, Jason Gunthorpe wrote:
> > On Mon, Feb 11, 2019 at 09:22:58AM -0800, Dan Williams wrote:
> >
> > > I honestly don't like the idea that random subsystems can pin down
> > > file blocks as a side effect of gup on the result of mmap. Recall that
> > > it's not just RDMA that wants this guarantee. It seems safer to have
> > > the file be in an explicit block-allocation-immutable-mode so that the
> > > fallocate man page can describe this error case. Otherwise how would
> > > you describe the scenarios under which FALLOC_FL_PUNCH_HOLE fails?
> >
> > I rather liked CL's version of this - ftruncate/etc is simply racing
> > with a parallel pwrite - and it doesn't fail.
> >
> > But it also doesn't truncate/create a hole. Another thread wrote to it
> > right away and the 'hole' was essentially instantly reallocated. This
> > is an inherent, pre-existing race in the ftruncate/etc APIs.
>
> I kind of like it as well, except Christopher did not answer my question:
>
> What if user space then writes to the end of the file with a regular write?
> Does that write end up at the point they truncated to or off the end of the
> mmaped area (old length)?
IIRC it depends on how the user does the write:
 - pwrite() with a given offset will write to that offset, re-extending
   the file if needed.
 - A file opened with O_APPEND and written with write() should
   append to the new end.
 - A normal file with a normal write() should write at the FD's current
   seek pointer.
I'm not sure what happens if you write via mmap/msync.
RDMA is similar to pwrite() and mmap.
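As a rough illustration of the three write() cases above, here is a small userspace sketch in Python. It is not from the thread: the helper name, file names and byte values are made up for the example, and it only exercises ordinary file I/O on a temporary file, not RDMA or GUP. Each case writes 8 bytes, truncates to 4, then writes one more byte.

```python
import os
import tempfile


def _read_all(path):
    with open(path, "rb") as f:
        return f.read()


def demo_write_after_truncate():
    """Write 8 bytes, ftruncate() to 4, write one more byte; return final contents."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Case 1: pwrite() with an explicit offset past the new end of file.
        path = os.path.join(tmp, "pwrite")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.write(fd, b"A" * 8)
        os.ftruncate(fd, 4)
        os.pwrite(fd, b"P", 6)  # lands at offset 6, re-extending the file
        os.close(fd)
        results["pwrite"] = _read_all(path)

        # Case 2: O_APPEND, so write() always goes to the current end of file.
        path = os.path.join(tmp, "append")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        os.write(fd, b"A" * 8)
        os.ftruncate(fd, 4)
        os.write(fd, b"X")  # appended at the new end (offset 4)
        os.close(fd)
        results["append"] = _read_all(path)

        # Case 3: plain write(); ftruncate() does not move the seek pointer,
        # so the byte lands at the old offset 8 and leaves a zero-filled gap.
        path = os.path.join(tmp, "write")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.write(fd, b"A" * 8)
        os.ftruncate(fd, 4)
        os.write(fd, b"W")
        os.close(fd)
        results["write"] = _read_all(path)
    return results


if __name__ == "__main__":
    for name, data in demo_write_after_truncate().items():
        print(name, data)
```

The mmap/msync case is deliberately left out, since its behaviour is the open question here.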
> Or is it safe to consider all gup pinned pages this way?
O_DIRECT still has to work sensibly, and if you ftruncate something
that is currently being written with O_DIRECT it should behave the
same as if the CPU touched the mmap'd memory, IMHO.
The only real change here is that if there is a GUP then ftruncate/etc
races are always resolved as 'GUP user goes last' instead of randomly.
ftruncate/etc already only work as you'd expect if the operator has
excluded writes. Otherwise blocks are instantly reallocated by another
racing thread.
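To make the ordering argument concrete, a minimal sketch follows. It replays the two possible outcomes of the race sequentially rather than with real threads, and uses pwrite() purely as a stand-in for a writer holding pinned pages; the function name and sizes are assumptions for the example, and nothing here models what the kernel actually does for GUP.

```python
import os
import tempfile


def final_size(order):
    """Apply 'truncate' and 'write' in the given order; return the final file size."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = os.open(os.path.join(tmp, "data"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.pwrite(fd, b"A" * 8192, 0)  # file starts out at 8 KiB
            for step in order:
                if step == "truncate":
                    os.ftruncate(fd, 4096)  # operator shrinks the file
                elif step == "write":
                    # Stand-in for the racing writer touching the second 4 KiB.
                    os.pwrite(fd, b"B" * 4096, 4096)
                else:
                    raise ValueError(f"unknown step: {step}")
            return os.fstat(fd).st_size
        finally:
            os.close(fd)


if __name__ == "__main__":
    # Writer wins the race first, truncate goes last: the file really shrinks.
    print(final_size(["write", "truncate"]))
    # Truncate first, writer goes last: the range is reallocated right away.
    print(final_size(["truncate", "write"]))
```

The second ordering is the 'GUP user goes last' outcome: the truncate succeeds, but the file is back to its old size as soon as the writer runs.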
I'm not sure why RDMA should be so special as to earn an error code.
Jason