From: Dave Chinner <david@fromorbit.com>
To: Andres Freund <andres@anarazel.de>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: Triggering non-integrity writeback from userspace
Date: Thu, 29 Oct 2015 07:48:34 +1100
Message-ID: <20151028204834.GP8773@dastard>
In-Reply-To: <20151028092752.GF29811@alap3.anarazel.de>
Hi Andres,
On Wed, Oct 28, 2015 at 10:27:52AM +0100, Andres Freund wrote:
> On 2015-10-25 08:39:12 +1100, Dave Chinner wrote:
....
> > Data integrity operations require related file metadata (e.g. block
> > allocation transactions) to be forced to the journal/disk, and a
> > device cache flush issued to ensure the data is on stable storage.
> > SYNC_FILE_RANGE_WRITE does neither of these things, and hence while
> > the IO might be the same pattern as a data integrity operation, it
> > does not provide such guarantees.
>
> Which is desired here - the actual integrity is still going to be done
> via fsync().
OK, so you require data integrity, but....
> The idea of using SYNC_FILE_RANGE_WRITE beforehand is that
> the fsync() will only have to do very little work. The language in
> sync_file_range(2) doesn't inspire enough confidence for using it as an
> actual integrity operation :/
So really you're trying to minimise the blocking/latency of fsync()?
> > You don't want to do writeback from the syscall, right? i.e. you'd
> > like to expire the inode behind the fd, and schedule background
> > writeback to run on it immediately?
>
> Yes, that's exactly what we want. Blocking if a process has done too
> many writes is fine tho.
OK, so it's really the latency of the fsync() operation that you are
trying to avoid? I've been meaning to get back to a generic
implementation of an aio fsync operation:
http://oss.sgi.com/archives/xfs/2014-06/msg00214.html
Would that be a better approach to solving your need for a
non-blocking data integrity flush of a file?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
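
[Editor's sketch, not part of the original message.] The pattern Andres
describes in the quoted text above, using SYNC_FILE_RANGE_WRITE to start
writeback early so that a later fsync() has little left to do, might look
roughly like the following. The helper names write_and_start_writeback()
and make_durable() are hypothetical and do not come from the thread; the
sketch assumes Linux with glibc, and it treats the sync_file_range() call
purely as a hint, leaving all integrity guarantees to fsync().

```c
#define _GNU_SOURCE   /* sync_file_range() is a Linux-specific extension */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes at offset off, then ask the kernel
 * to start writeback of that range without waiting for it to complete.
 * Returns 0 if the data was written into the page cache, -1 otherwise.
 */
static int write_and_start_writeback(int fd, const void *buf, size_t len,
                                     off_t off)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL);

    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress; avoid looping forever */
        done += (size_t)n;
    }

    /*
     * Hint only: this initiates writeback of dirty pages in the range.
     * It does not force metadata to the journal or flush the device
     * cache, so it provides no data integrity guarantee. A failure of
     * the hint is deliberately ignored in this sketch.
     */
    (void)sync_file_range(fd, off, (off_t)len, SYNC_FILE_RANGE_WRITE);
    return 0;
}

/*
 * Hypothetical helper: the actual integrity point. This can still block;
 * the intent is only that less dirty data remains by the time it runs.
 */
static int make_durable(int fd)
{
    return fsync(fd);
}
```

A caller would invoke write_and_start_writeback() after each batch of
writes and call make_durable() only at the point where durability is
actually required. Whether this reduces fsync() latency in practice
depends on the filesystem and the workload.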