* [PATCH] Sync only the requested range in msync
@ 2014-03-27 23:02 Matthew Wilcox
2014-04-23 14:11 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2014-03-27 23:02 UTC (permalink / raw)
To: linux-mm, linux-fsdevel; +Cc: Matthew Wilcox, willy
[untested. posted because it keeps coming up at lsfmm/collab]
msync() currently syncs more than POSIX requires or BSD or Solaris
implement. It is supposed to be equivalent to fdatasync(), not fsync(),
and it is only supposed to sync the portion of the file that overlaps
the range passed to msync.
If the VMA is non-linear, fall back to syncing the entire file, but we
still optimise to only fdatasync() the entire file, not the full fsync().
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
---
mm/msync.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/msync.c b/mm/msync.c
index 632df45..a5c6736 100644
--- a/mm/msync.c
+++ b/mm/msync.c
@@ -58,6 +58,7 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags)
vma = find_vma(mm, start);
for (;;) {
struct file *file;
+ loff_t fstart, fend;
/* Still start < end. */
error = -ENOMEM;
@@ -77,12 +78,17 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags)
goto out_unlock;
}
file = vma->vm_file;
+ fstart = start + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
+ fend = fstart + (min(end, vma->vm_end) - start) - 1;
start = vma->vm_end;
if ((flags & MS_SYNC) && file &&
(vma->vm_flags & VM_SHARED)) {
get_file(file);
up_read(&mm->mmap_sem);
- error = vfs_fsync(file, 0);
+ if (vma->vm_flags & VM_NONLINEAR)
+ error = vfs_fsync(file, 1);
+ else
+ error = vfs_fsync_range(file, fstart, fend, 1);
fput(file);
if (error || start >= end)
goto out;
--
1.9.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Sync only the requested range in msync
2014-03-27 23:02 [PATCH] Sync only the requested range in msync Matthew Wilcox
@ 2014-04-23 14:11 ` Christoph Hellwig
2014-05-12 23:39 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2014-04-23 14:11 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-mm, linux-fsdevel, willy
On Thu, Mar 27, 2014 at 07:02:41PM -0400, Matthew Wilcox wrote:
> [untested. posted because it keeps coming up at lsfmm/collab]
>
> msync() currently syncs more than POSIX requires or BSD or Solaris
> implement. It is supposed to be equivalent to fdatasync(), not fsync(),
> and it is only supposed to sync the portion of the file that overlaps
> the range passed to msync.
>
> If the VMA is non-linear, fall back to syncing the entire file, but we
> still optimise to only fdatasync() the entire file, not the full fsync().
>
> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Sync only the requested range in msync
2014-04-23 14:11 ` Christoph Hellwig
@ 2014-05-12 23:39 ` Andrew Morton
2014-05-13 13:31 ` Jeff Moyer
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2014-05-12 23:39 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Matthew Wilcox, linux-mm, linux-fsdevel, willy
On Wed, 23 Apr 2014 07:11:15 -0700 Christoph Hellwig <hch@infradead.org> wrote:
> On Thu, Mar 27, 2014 at 07:02:41PM -0400, Matthew Wilcox wrote:
> > [untested. posted because it keeps coming up at lsfmm/collab]
> >
> > msync() currently syncs more than POSIX requires or BSD or Solaris
> > implement. It is supposed to be equivalent to fdatasync(), not fsync(),
> > and it is only supposed to sync the portion of the file that overlaps
> > the range passed to msync.
> >
> > If the VMA is non-linear, fall back to syncing the entire file, but we
> > still optimise to only fdatasync() the entire file, not the full fsync().
> >
> > Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
>
> Looks good,
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
I worry that if there are people who are relying on the current
behaviour (knowingly or not!) then this patch will put their data at
risk and nobody will ever know. Until that data gets lost, that is.
At some level of cautiousness, this is one of those things we can never
fix.
I suppose we could add an msync2() syscall with the new behaviour so
people can migrate over. That would be very cheap to do.
It's hard to know what's the right thing to do here.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Sync only the requested range in msync
2014-05-12 23:39 ` Andrew Morton
@ 2014-05-13 13:31 ` Jeff Moyer
2014-05-15 11:24 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Jeff Moyer @ 2014-05-13 13:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Christoph Hellwig, Matthew Wilcox, linux-mm, linux-fsdevel, willy
Andrew Morton <akpm@linux-foundation.org> writes:
> On Wed, 23 Apr 2014 07:11:15 -0700 Christoph Hellwig <hch@infradead.org> wrote:
>
>> On Thu, Mar 27, 2014 at 07:02:41PM -0400, Matthew Wilcox wrote:
>> > [untested. posted because it keeps coming up at lsfmm/collab]
>> >
>> > msync() currently syncs more than POSIX requires or BSD or Solaris
>> > implement. It is supposed to be equivalent to fdatasync(), not fsync(),
>> > and it is only supposed to sync the portion of the file that overlaps
>> > the range passed to msync.
>> >
>> > If the VMA is non-linear, fall back to syncing the entire file, but we
>> > still optimise to only fdatasync() the entire file, not the full fsync().
>> >
>> > Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
>>
>> Looks good,
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> I worry that if there are people who are relying on the current
> behaviour (knowingly or not!) then this patch will put their data at
> risk and nobody will ever know. Until that data gets lost, that is.
> At some level of cautiousness, this is one of those things we can never
> fix.
>
> I suppose we could add an msync2() syscall with the new behaviour so
> people can migrate over. That would be very cheap to do.
>
> It's hard to know what's the right thing to do here.
FWIW, I think we should apply the patch. Anyone using the API properly
will not get the desired result, and it could have a negative impact on
performance. The man page is very explicit on what you should expect,
here. Anyone relying on undocumented behavior gets to keep both pieces
when it breaks. That said, I do understand your viewpoint, Andrew,
especially since it's so hard to get people to sync their data at all,
much less correctly.
Acked-by: Jeff Moyer <jmoyer@redhat.com>
-Jeff
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Sync only the requested range in msync
2014-05-13 13:31 ` Jeff Moyer
@ 2014-05-15 11:24 ` Christoph Hellwig
2014-05-15 12:50 ` Chris Mason
2014-05-20 22:58 ` Andrew Morton
2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2014-05-15 11:24 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Andrew Morton, Matthew Wilcox, linux-mm, linux-fsdevel, willy
On Tue, May 13, 2014 at 09:31:01AM -0400, Jeff Moyer wrote:
> FWIW, I think we should apply the patch. Anyone using the API properly
> will not get the desired result, and it could have a negative impact on
> performance. The man page is very explicit on what you should expect,
> here. Anyone relying on undocumented behavior gets to keep both pieces
> when it breaks. That said, I do understand your viewpoint, Andrew,
> especially since it's so hard to get people to sync their data at all,
> much less correctly.
Agreed, we never made filesystems write out all data in the file system
in fsync either just because ext3 behaved that way.
And unlike that case I can't even see a good way to get msync wrong -
you call it on the mapped region, so expecting it to write out data
that isn't mapped at all seems rather odd.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Sync only the requested range in msync
2014-05-13 13:31 ` Jeff Moyer
2014-05-15 11:24 ` Christoph Hellwig
@ 2014-05-15 12:50 ` Chris Mason
2014-05-20 22:58 ` Andrew Morton
2 siblings, 0 replies; 7+ messages in thread
From: Chris Mason @ 2014-05-15 12:50 UTC (permalink / raw)
To: Jeff Moyer, Andrew Morton
Cc: Christoph Hellwig, Matthew Wilcox, linux-mm, linux-fsdevel, willy
On 05/13/2014 09:31 AM, Jeff Moyer wrote:
> Andrew Morton <akpm@linux-foundation.org> writes:
>
>> On Wed, 23 Apr 2014 07:11:15 -0700 Christoph Hellwig <hch@infradead.org> wrote:
>>
>>> On Thu, Mar 27, 2014 at 07:02:41PM -0400, Matthew Wilcox wrote:
>>>> [untested. posted because it keeps coming up at lsfmm/collab]
>>>>
>>>> msync() currently syncs more than POSIX requires or BSD or Solaris
>>>> implement. It is supposed to be equivalent to fdatasync(), not fsync(),
>>>> and it is only supposed to sync the portion of the file that overlaps
>>>> the range passed to msync.
>>>>
>>>> If the VMA is non-linear, fall back to syncing the entire file, but we
>>>> still optimise to only fdatasync() the entire file, not the full fsync().
>>>>
>>>> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
>>>
>>> Looks good,
>>>
>>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>>
>> I worry that if there are people who are relying on the current
>> behaviour (knowingly or not!) then this patch will put their data at
>> risk and nobody will ever know. Until that data gets lost, that is.
>> At some level of cautiousness, this is one of those things we can never
>> fix.
>>
>> I suppose we could add an msync2() syscall with the new behaviour so
>> people can migrate over. That would be very cheap to do.
>>
>> It's hard to know what's the right thing to do here.
>
> FWIW, I think we should apply the patch. Anyone using the API properly
> will not get the desired result, and it could have a negative impact on
> performance. The man page is very explicit on what you should expect,
> here. Anyone relying on undocumented behavior gets to keep both pieces
> when it breaks. That said, I do understand your viewpoint, Andrew,
> especially since it's so hard to get people to sync their data at all,
> much less correctly.
>
> Acked-by: Jeff Moyer <jmoyer@redhat.com>
Maybe we can talk someone with all of linux in a repository to grep for
msync calls that even specify a range at all?
Eric did this for 64 bit inodes a few years ago, and it wouldn't hurt to
have a little data on how common this is.
I think for msync the list will be much shorter and easier to audit.
-chris
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Sync only the requested range in msync
2014-05-13 13:31 ` Jeff Moyer
2014-05-15 11:24 ` Christoph Hellwig
2014-05-15 12:50 ` Chris Mason
@ 2014-05-20 22:58 ` Andrew Morton
2 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2014-05-20 22:58 UTC (permalink / raw)
To: Jeff Moyer
Cc: Christoph Hellwig, Matthew Wilcox, linux-mm, linux-fsdevel, willy
On Tue, 13 May 2014 09:31:01 -0400 Jeff Moyer <jmoyer@redhat.com> wrote:
> Andrew Morton <akpm@linux-foundation.org> writes:
>
> > On Wed, 23 Apr 2014 07:11:15 -0700 Christoph Hellwig <hch@infradead.org> wrote:
> >
> >> On Thu, Mar 27, 2014 at 07:02:41PM -0400, Matthew Wilcox wrote:
> >> > [untested. posted because it keeps coming up at lsfmm/collab]
> >> >
> >> > msync() currently syncs more than POSIX requires or BSD or Solaris
> >> > implement. It is supposed to be equivalent to fdatasync(), not fsync(),
> >> > and it is only supposed to sync the portion of the file that overlaps
> >> > the range passed to msync.
> >> >
> >> > If the VMA is non-linear, fall back to syncing the entire file, but we
> >> > still optimise to only fdatasync() the entire file, not the full fsync().
> >> >
> >> > Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
> >>
> >> Looks good,
> >>
> >> Reviewed-by: Christoph Hellwig <hch@lst.de>
> >
> > I worry that if there are people who are relying on the current
> > behaviour (knowingly or not!) then this patch will put their data at
> > risk and nobody will ever know. Until that data gets lost, that is.
> > At some level of cautiousness, this is one of those things we can never
> > fix.
> >
> > I suppose we could add an msync2() syscall with the new behaviour so
> > people can migrate over. That would be very cheap to do.
> >
> > It's hard to know what's the right thing to do here.
>
> FWIW, I think we should apply the patch. Anyone using the API properly
> will not get the desired result, and it could have a negative impact on
> performance. The man page is very explicit on what you should expect,
> here. Anyone relying on undocumented behavior gets to keep both pieces
> when it breaks. That said, I do understand your viewpoint, Andrew,
> especially since it's so hard to get people to sync their data at all,
> much less correctly.
>
> Acked-by: Jeff Moyer <jmoyer@redhat.com>
>
Sigh, OK, scary. And I don't see a sane way in which we can do
WARN_ON_ONCE("hey dummy this doesn't do what you think it does").
Matthew, I checked the arith carefully and it looks OK, but would be
more comfortable if you could find a way of testing it please? Try
this dirty trick:
if (!strcmp(current->comm, "my-test-app")) {
char buf[100];
sprintf(buf, "kernel: %llu->%llu\n", fstart, fend);
sys_write(1, buf, strlen(buf));
}
so your debug info nicely appears on my-test-app's (fflushed) stdout ;)
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-05-20 22:58 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-27 23:02 [PATCH] Sync only the requested range in msync Matthew Wilcox
2014-04-23 14:11 ` Christoph Hellwig
2014-05-12 23:39 ` Andrew Morton
2014-05-13 13:31 ` Jeff Moyer
2014-05-15 11:24 ` Christoph Hellwig
2014-05-15 12:50 ` Chris Mason
2014-05-20 22:58 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).