* Re: [PATCH] io_uring: add support for barrier fsync
  [not found] <7c7276e4-8ffa-495a-6abf-926a58ee899e@kernel.dk>
From: Christoph Hellwig @ 2019-04-09 18:17 UTC
To: Jens Axboe
Cc: linux-fsdevel, linux-block@vger.kernel.org, linux-api, linux-kernel

On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
> It's a quite common use case to issue a bunch of writes, then an fsync
> or fdatasync when they complete. Since io_uring doesn't guarantee any
> type of ordering, the application must track issued writes and wait
> with the fsync issue until they have completed.
>
> Add an IORING_FSYNC_BARRIER flag that helps with this so the application
> doesn't have to do this manually. If this flag is set for the fsync
> request, we won't issue it until pending IO has already completed.

I think we need a much more detailed explanation of the semantics,
preferably in man page format.

Barrier at least in Linux traditionally means all previously submitted
requests have finished and no new ones are started until the
barrier request finishes, which is very heavy handed. Is that what
this is supposed to do? If not what are the exact guarantees vs
ordering and or barrier semantics?

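
[Editor's sketch] The manual bookkeeping that the quoted patch description refers to might look roughly like the following. This is only an illustration under stated assumptions: a thread pool stands in for io_uring's asynchronous submission, and `write_then_sync` and `demo` are hypothetical names, not part of any kernel or liburing API.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, wait


def write_then_sync(fd, chunks):
    """Issue all writes concurrently, wait for every one, then fsync."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # chunks is a list of (offset, data); completion order is not guaranteed.
        pending = [pool.submit(os.pwrite, fd, data, off) for off, data in chunks]
        # Application-side tracking: block until nothing is left in flight.
        done, not_done = wait(pending)
        assert not not_done
        written = sum(f.result() for f in done)
    # The sync is only issued once every tracked write has completed.
    os.fsync(fd)
    return written


def demo():
    with tempfile.TemporaryFile() as tmp:
        fd = tmp.fileno()
        total = write_then_sync(fd, [(0, b"aaaa"), (4, b"bbbb"), (8, b"cccc")])
        return total, os.pread(fd, 12, 0)
```

The proposed flag would move the "wait until nothing is in flight" step from the application into the kernel.
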
* Re: [PATCH] io_uring: add support for barrier fsync
From: Jens Axboe @ 2019-04-09 18:23 UTC
To: Christoph Hellwig
Cc: linux-fsdevel, linux-block@vger.kernel.org, linux-api, linux-kernel

On 4/9/19 12:17 PM, Christoph Hellwig wrote:
> On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
>> It's a quite common use case to issue a bunch of writes, then an fsync
>> or fdatasync when they complete. Since io_uring doesn't guarantee any
>> type of ordering, the application must track issued writes and wait
>> with the fsync issue until they have completed.
>>
>> Add an IORING_FSYNC_BARRIER flag that helps with this so the application
>> doesn't have to do this manually. If this flag is set for the fsync
>> request, we won't issue it until pending IO has already completed.
>
> I think we need a much more detailed explanation of the semantics,
> preferably in man page format.
>
> Barrier at least in Linux traditionally means all previously submitted
> requests have finished and no new ones are started until the
> barrier request finishes, which is very heavy handed. Is that what
> this is supposed to do? If not what are the exact guarantees vs
> ordering and or barrier semantics?

The patch description isn't that great, and maybe the naming isn't that
intuitive either. The way it's implemented, the fsync will NOT be issued
until previously issued IOs have completed. That means both reads and
writes, since there's no way to wait for just one. In terms of
semantics, any previously submitted writes will have completed before
this fsync is issued. The barrier fsync has no ordering wrt future
writes, no ordering is implied there. Hence:

W1, W2, W3, FSYNC_W_BARRIER, W4, W5

W1..3 will have been completed by the hardware side before we start
FSYNC_W_BARRIER. We don't wait with issuing W4..5 until after the fsync
completes, no ordering is provided there.

--
Jens Axboe

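
[Editor's sketch] The ordering rule described above can be modelled in a few lines. This is a toy model of the semantics as stated in this message, not the kernel implementation; `not_blocked`, `OPS` and the `"FSYNC"` label are hypothetical names chosen for illustration.

```python
def not_blocked(ops, completed):
    """Return the names of not-yet-completed ops that the barrier does not hold back.

    ops: list of (name, is_barrier) tuples in submission order.
    completed: set of names whose IO has already completed.
    """
    ready = []
    for i, (name, is_barrier) in enumerate(ops):
        if name in completed:
            continue
        earlier = [n for n, _ in ops[:i]]
        if is_barrier and not all(n in completed for n in earlier):
            # A barrier op waits for everything submitted before it.
            continue
        # Non-barrier ops are never held back, even behind a pending barrier.
        ready.append(name)
    return ready


OPS = [("W1", False), ("W2", False), ("W3", False),
       ("FSYNC", True), ("W4", False), ("W5", False)]
```

With nothing completed, the model reports W1..W5 as free to go and holds only the fsync; once W1..W3 are in the completed set, the fsync becomes issuable while W4 and W5 remain unordered against it.
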
* Re: [PATCH] io_uring: add support for barrier fsync
From: Chris Mason @ 2019-04-09 18:42 UTC
To: Jens Axboe
Cc: Christoph Hellwig, linux-fsdevel, linux-block@vger.kernel.org,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org

On 9 Apr 2019, at 14:23, Jens Axboe wrote:

> On 4/9/19 12:17 PM, Christoph Hellwig wrote:
>> On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
>>> It's a quite common use case to issue a bunch of writes, then an fsync
>>> or fdatasync when they complete. Since io_uring doesn't guarantee any
>>> type of ordering, the application must track issued writes and wait
>>> with the fsync issue until they have completed.
>>>
>>> Add an IORING_FSYNC_BARRIER flag that helps with this so the application
>>> doesn't have to do this manually. If this flag is set for the fsync
>>> request, we won't issue it until pending IO has already completed.
>>
>> I think we need a much more detailed explanation of the semantics,
>> preferably in man page format.
>>
>> Barrier at least in Linux traditionally means all previously submitted
>> requests have finished and no new ones are started until the
>> barrier request finishes, which is very heavy handed. Is that what
>> this is supposed to do? If not what are the exact guarantees vs
>> ordering and or barrier semantics?
>
> The patch description isn't that great, and maybe the naming isn't that
> intuitive either. The way it's implemented, the fsync will NOT be issued
> until previously issued IOs have completed. That means both reads and
> writes, since there's no way to wait for just one. In terms of
> semantics, any previously submitted writes will have completed before
> this fsync is issued. The barrier fsync has no ordering wrt future
> writes, no ordering is implied there. Hence:
>
> W1, W2, W3, FSYNC_W_BARRIER, W4, W5
>
> W1..3 will have been completed by the hardware side before we start
> FSYNC_W_BARRIER. We don't wait with issuing W4..5 until after the fsync
> completes, no ordering is provided there.

Looking at the patch, why is fsync special? Seems like you could add
this ordering bit to any write?

While you're here, do you want to add a way to FUA/cache flush?
Basically the rest of what user land would need to make their own
write-back-cache-safe implementation.

-chris

* Re: [PATCH] io_uring: add support for barrier fsync
From: Jens Axboe @ 2019-04-09 18:46 UTC
To: Chris Mason
Cc: Christoph Hellwig, linux-fsdevel, linux-block@vger.kernel.org,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org

On 4/9/19 12:42 PM, Chris Mason wrote:
> On 9 Apr 2019, at 14:23, Jens Axboe wrote:
>
>> On 4/9/19 12:17 PM, Christoph Hellwig wrote:
>>> On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
>>>> It's a quite common use case to issue a bunch of writes, then an fsync
>>>> or fdatasync when they complete. Since io_uring doesn't guarantee any
>>>> type of ordering, the application must track issued writes and wait
>>>> with the fsync issue until they have completed.
>>>>
>>>> Add an IORING_FSYNC_BARRIER flag that helps with this so the application
>>>> doesn't have to do this manually. If this flag is set for the fsync
>>>> request, we won't issue it until pending IO has already completed.
>>>
>>> I think we need a much more detailed explanation of the semantics,
>>> preferably in man page format.
>>>
>>> Barrier at least in Linux traditionally means all previously submitted
>>> requests have finished and no new ones are started until the
>>> barrier request finishes, which is very heavy handed. Is that what
>>> this is supposed to do? If not what are the exact guarantees vs
>>> ordering and or barrier semantics?
>>
>> The patch description isn't that great, and maybe the naming isn't that
>> intuitive either. The way it's implemented, the fsync will NOT be issued
>> until previously issued IOs have completed. That means both reads and
>> writes, since there's no way to wait for just one. In terms of
>> semantics, any previously submitted writes will have completed before
>> this fsync is issued. The barrier fsync has no ordering wrt future
>> writes, no ordering is implied there. Hence:
>>
>> W1, W2, W3, FSYNC_W_BARRIER, W4, W5
>>
>> W1..3 will have been completed by the hardware side before we start
>> FSYNC_W_BARRIER. We don't wait with issuing W4..5 until after the fsync
>> completes, no ordering is provided there.
>
> Looking at the patch, why is fsync special? Seems like you could add
> this ordering bit to any write?

It's really not, the exact same technique could be used on any type of
command to imply ordering. My initial idea was to have an explicit
barrier/ordering command, but I didn't think that separating it from an
actual command would be needed/useful.

> While you're here, do you want to add a way to FUA/cache flush?
> Basically the rest of what user land would need to make their own
> write-back-cache-safe implementation.

FUA would be a WRITEV/WRITE_FIXED flag, that should be trivially doable.

In terms of cache flush, that's very heavy handed (and storage
oriented). What applications would want/need to do an explicit whole
device flush?

--
Jens Axboe

* Re: [PATCH] io_uring: add support for barrier fsync
From: Chris Mason @ 2019-04-09 18:56 UTC
To: Jens Axboe
Cc: Christoph Hellwig, linux-fsdevel, linux-block@vger.kernel.org,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org

On 9 Apr 2019, at 14:46, Jens Axboe wrote:

> On 4/9/19 12:42 PM, Chris Mason wrote:
>> Looking at the patch, why is fsync special? Seems like you could add
>> this ordering bit to any write?
>
> It's really not, the exact same technique could be used on any type of
> command to imply ordering. My initial idea was to have an explicit
> barrier/ordering command, but I didn't think that separating it from an
> actual command would be needed/useful.

Might want to order discards? Commit a transaction to free some blocks,
discard those blocks?

>
>> While you're here, do you want to add a way to FUA/cache flush?
>> Basically the rest of what user land would need to make their own
>> write-back-cache-safe implementation.
>
> FUA would be a WRITEV/WRITE_FIXED flag, that should be trivially doable.
>
> In terms of cache flush, that's very heavy handed (and storage
> oriented). What applications would want/need to do an explicit whole
> device flush?

Basically if someone is writing a userland filesystem they'd want to
cache flush for the same reasons the kernel filesystems do.

-chris

* Re: [PATCH] io_uring: add support for barrier fsync
From: Dave Chinner @ 2019-04-11 11:05 UTC
To: Jens Axboe
Cc: Chris Mason, Christoph Hellwig, linux-fsdevel,
    linux-block@vger.kernel.org, linux-api@vger.kernel.org,
    linux-kernel@vger.kernel.org

On Tue, Apr 09, 2019 at 12:46:15PM -0600, Jens Axboe wrote:
> On 4/9/19 12:42 PM, Chris Mason wrote:
> > On 9 Apr 2019, at 14:23, Jens Axboe wrote:
> >
> >> On 4/9/19 12:17 PM, Christoph Hellwig wrote:
> >>> On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
> >>>> It's a quite common use case to issue a bunch of writes, then an fsync
> >>>> or fdatasync when they complete. Since io_uring doesn't guarantee any
> >>>> type of ordering, the application must track issued writes and wait
> >>>> with the fsync issue until they have completed.
> >>>>
> >>>> Add an IORING_FSYNC_BARRIER flag that helps with this so the application
> >>>> doesn't have to do this manually. If this flag is set for the fsync
> >>>> request, we won't issue it until pending IO has already completed.
> >>>
> >>> I think we need a much more detailed explanation of the semantics,
> >>> preferably in man page format.
> >>>
> >>> Barrier at least in Linux traditionally means all previously submitted
> >>> requests have finished and no new ones are started until the
> >>> barrier request finishes, which is very heavy handed. Is that what
> >>> this is supposed to do? If not what are the exact guarantees vs
> >>> ordering and or barrier semantics?
> >>
> >> The patch description isn't that great, and maybe the naming isn't that
> >> intuitive either. The way it's implemented, the fsync will NOT be issued
> >> until previously issued IOs have completed. That means both reads and
> >> writes, since there's no way to wait for just one. In terms of
> >> semantics, any previously submitted writes will have completed before
> >> this fsync is issued. The barrier fsync has no ordering wrt future
> >> writes, no ordering is implied there. Hence:
> >>
> >> W1, W2, W3, FSYNC_W_BARRIER, W4, W5
> >>
> >> W1..3 will have been completed by the hardware side before we start
> >> FSYNC_W_BARRIER. We don't wait with issuing W4..5 until after the fsync
> >> completes, no ordering is provided there.
> >
> > Looking at the patch, why is fsync special? Seems like you could add
> > this ordering bit to any write?
>
> It's really not, the exact same technique could be used on any type of
> command to imply ordering. My initial idea was to have an explicit
> barrier/ordering command, but I didn't think that separating it from an
> actual command would be needed/useful.
>
> > While you're here, do you want to add a way to FUA/cache flush?
> > Basically the rest of what user land would need to make their own
> > write-back-cache-safe implementation.
>
> FUA would be a WRITEV/WRITE_FIXED flag, that should be trivially doable.

We already have plumbing to make pwritev2 and AIO issue FUA writes
via the RWF_DSYNC flag through the fs/iomap.c direct IO path. FUA is
only valid if the file does not have dirty metadata (e.g. because of
block allocation) and that requires the filesystem block mapping to
tell the IO path if FUA can be used. Otherwise a journal flush is
also required to make the data stable and there's no point in doing
a FUA write for the data in that case...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

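
[Editor's sketch] From user space, the RWF_DSYNC path mentioned above is reached through `pwritev2`, which Python exposes as `os.pwritev` with a `flags` argument. A minimal sketch, assuming a Linux system where `os.RWF_DSYNC` is available; `dsync_write` and `demo` are hypothetical helper names. Whether the kernel satisfies the request with a FUA write or with a write plus flush is decided by the filesystem and device, as the message explains, and is not visible to the caller.

```python
import os
import tempfile


def dsync_write(fd, buffers, offset):
    """Write buffers at offset, requesting per-call data-sync semantics."""
    flag = getattr(os, "RWF_DSYNC", None)
    if flag is not None:
        # Equivalent to pwritev2(fd, iov, iovcnt, offset, RWF_DSYNC).
        return os.pwritev(fd, buffers, offset, flag)
    # Assumed fallback for platforms without RWF_DSYNC: write, then fsync.
    written = os.pwritev(fd, buffers, offset)
    os.fsync(fd)
    return written


def demo():
    with tempfile.TemporaryFile() as tmp:
        fd = tmp.fileno()
        written = dsync_write(fd, [b"hello ", b"world"], 0)
        return written, os.pread(fd, written, 0)
```

The helper only requests durability for this one call; it does not order the write against any other IO in flight.
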