From: Dave Chinner <david@fromorbit.com>
To: Andres Freund <andres@anarazel.de>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: Triggering non-integrity writeback from userspace
Date: Thu, 29 Oct 2015 12:54:22 +1100
Message-ID: <20151029015422.GT8773@dastard>
In-Reply-To: <20151028232312.GL29811@alap3.anarazel.de>
On Thu, Oct 29, 2015 at 12:23:12AM +0100, Andres Freund wrote:
> Hi,
>
> On 2015-10-29 07:48:34 +1100, Dave Chinner wrote:
> > > The idea of using SYNC_FILE_RANGE_WRITE beforehand is that
> > > the fsync() will only have to do very little work. The language in
> > > sync_file_range(2) doesn't inspire enough confidence for using it as an
> > > actual integrity operation :/
> >
> > So really you're trying to minimise the blocking/latency of fsync()?
>
> The blocking/latency of the fsync doesn't actually matter at all *for
> this callsite*. It's called from a dedicated background process - if
> it's slowed down by a couple seconds it doesn't matter much.
> The problem is that if you have a couple gigabytes of dirty data being
> fsync()ed at once, latency for concurrent reads and writes often goes
> absolutely apeshit. And those concurrent reads and writes might
> actually be latency sensitive.
Right, but my point is that with an async fsync/fdatasync you don't
need this background process - you can just trickle out async
fdatasync calls instead of trickling out calls to sync_file_range().
> By calling sync_file_range() over small ranges of pages shortly after
> they've been written we make it unlikely (but still possible) that much
> data has to be flushed at fsync() time.
Right, but you still need the fsync call, whereas with an async fsync
call you don't - when you gather the completion, no further action
needs to be taken on that dirty range.
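
For reference, the two-step pattern being contrasted here (start
writeback with sync_file_range() shortly after each chunk is written,
then still issue a final fdatasync() for integrity) could be sketched
roughly as below. The function name and the chunk size parameter are
illustrative assumptions, not anything taken from the application or
the patch under discussion.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to fd in pieces of at most
 * chunk bytes. After each piece, ask the kernel to start writeback for just
 * that range (a non-integrity hint), then make everything durable with a
 * single fdatasync() at the end.
 * Returns 0 on success, -1 with errno set on failure.
 */
int write_with_trickled_writeback(int fd, const char *buf, size_t len,
                                  size_t chunk)
{
    off_t off = 0;

    if (chunk == 0) {
        errno = EINVAL;
        return -1;
    }

    while (len > 0) {
        size_t want = len < chunk ? len : chunk;
        ssize_t n = pwrite(fd, buf, want, off);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {
            errno = EIO;
            return -1;
        }

        /*
         * SYNC_FILE_RANGE_WRITE only initiates writeback of dirty pages in
         * the range. It gives no durability guarantee, so a failure here is
         * ignored and the final fdatasync() is still required.
         */
        (void)sync_file_range(fd, off, n, SYNC_FILE_RANGE_WRITE);

        buf += n;
        off += n;
        len -= (size_t)n;
    }

    /* The integrity operation; ideally little dirty data is left by now. */
    return fdatasync(fd);
}
```

The point of the hint is only to keep the amount of dirty data small by
the time fdatasync() runs; the fdatasync() itself cannot be dropped.
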
> At the moment using fdatasync() instead of fsync() is a considerable
> performance advantage... If I understand the above proposal correctly,
> it'd allow specifying ranges, is that right?
Well, the patch I sent doesn't do ranges, but a range could easily be
passed in, as the iocb has offset/len parameters that are used by
IOCB_CMD_PREAD/PWRITE. io_prep_fsync/io_fsync both memset the iocb
to zero, so if we pass in a non-zero length, we could treat it as a
ranged f(d)sync quite easily.
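
As an illustration of the interface being described, here is a minimal
sketch of submitting an fdatasync through the Linux native aio syscalls
and gathering its completion. It uses raw syscalls and the
IOCB_CMD_FDSYNC opcode from <linux/aio_abi.h>; the function name is an
assumption made for this example. Whether the submission succeeds
depends on the kernel and filesystem (kernels without an aio fsync
implementation are expected to reject it, typically with EINVAL). The
offset and length fields are left at zero, meaning the whole file; the
ranged variant suggested above is only a proposal and is not shown.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/aio_abi.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue an asynchronous fdatasync on fd via native aio
 * and wait for its completion event.
 * Returns 0 on success or a negative errno value on failure.
 */
int async_fdatasync(int fd)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long n;
    int ret;

    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        return -errno;

    /* Zeroed iocb: aio_offset and aio_nbytes stay 0 (whole-file sync). */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = (__u32)fd;
    cb.aio_lio_opcode = IOCB_CMD_FDSYNC;

    n = syscall(SYS_io_submit, ctx, 1L, list);
    if (n != 1) {
        ret = n < 0 ? -errno : -EAGAIN;
        goto out;
    }

    /*
     * A real caller would go on doing other work here and reap completions
     * later; this sketch simply blocks until the one event arrives.
     */
    do {
        n = syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL);
    } while (n < 0 && errno == EINTR);

    if (n != 1)
        ret = n < 0 ? -errno : -EIO;
    else
        ret = (int)ev.res;  /* 0 on success, negative errno from the sync */

out:
    syscall(SYS_io_destroy, ctx);
    return ret;
}
```

Once the completion has been gathered with a zero result, nothing more
needs to be done for that file; there is no follow-up fsync() as in the
sync_file_range() approach.
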
> There'll be some concern about portability around this - issuing
> sync_file_range() every now and then isn't particularly invasive. Using
> aio might end up being that, not sure.
It's still a non-portable, Linux-only solution, because it uses
the Linux native aio interface, not the glibc one...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com