linux-fsdevel.vger.kernel.org archive mirror
From: Martin Steigerwald <martin@lichtvoll.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, willy@infradead.org, clm@fb.com,
	torvalds@linux-foundation.org, david@fromorbit.com
Subject: Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED
Date: Thu, 12 Dec 2019 22:45:47 +0100
Message-ID: <2091494.0NDvsO6yje@merkaba>
In-Reply-To: <7bf74660-874e-6fd7-7a41-f908ccab694e@kernel.dk>

Jens Axboe - 12.12.19, 16:16:31 CET:
> On 12/12/19 3:44 AM, Martin Steigerwald wrote:
> > Jens Axboe - 11.12.19, 16:29:38 CET:
> >> Recently someone asked me how io_uring buffered IO compares to
> >> mmapped IO in terms of performance. So I ran some tests with
> >> buffered IO, and found the experience to be somewhat painful. The
> >> test case is pretty basic, random reads over a dataset that's 10x
> >> the size of RAM. Performance starts out fine, and then the page
> >> cache fills up and we hit a throughput cliff. CPU usage of the IO
> >> threads goes up, and we have kswapd spending 100% of a core trying
> >> to keep up. Seeing that, I was reminded of the many complaints I
> >> hear about buffered IO, and the fact that most of the folks
> >> complaining will ultimately bite the bullet and move to O_DIRECT to
> >> just get the kernel out of the way.
> >> 
> >> But I don't think it needs to be like that. Switching to O_DIRECT
> >> isn't always easily doable. The buffers have different lifetimes,
> >> size and alignment constraints, etc. On top of that, mixing
> >> buffered and O_DIRECT can be painful.
> >> 
> >> Seems to me that we have an opportunity to provide something that
> >> sits somewhere in between buffered and O_DIRECT, and this is where
> >> RWF_UNCACHED enters the picture. If this flag is set on IO, we get
> >> the following behavior:
> >> 
> >> - If the data is in cache, it remains in cache and the copy (in or
> >> out) is served to/from that.
> >> 
> >> - If the data is NOT in cache, we add it while performing the IO.
> >> When the IO is done, we remove it again.
> >> 
> >> With this, I can do 100% smooth buffered reads or writes without
> >> pushing the kernel to the state where kswapd is sweating bullets.
> >> In fact it doesn't even register.
> > 
> > A question from a user or Linux Performance trainer perspective:
> > 
> > How does this compare with posix_fadvise() with POSIX_FADV_DONTNEED,
> > which for example the nocache [1] command is using? Excerpt from
> > the manpage
> > posix_fadvise(2):
> >        POSIX_FADV_DONTNEED
> >               The specified data will not be accessed in the near
> >               future.
> > 
> >               POSIX_FADV_DONTNEED attempts to free cached pages
> >               associated with the specified region. This is useful,
> >               for example, while streaming large files. A program
> >               may periodically request the kernel to free cached
> >               data that has already been used, so that more useful
> >               cached pages are not discarded instead.
> > 
> > [1] packaged in Debian as nocache or available here:
> > https://github.com/Feh/nocache
> > 
> > In any case, it would be nice to have some option in rsync… I still
> > have not changed my backup script to call rsync via nocache.
> 
> I don't know the nocache tool, but I'm guessing it just does the
> writes (or reads) and then uses FADV_DONTNEED to drop behind those
> pages? That's fine for slower use cases, but it won't work very well
> for fast IO. The write side currently works pretty much like that
> internally, whereas the read side doesn't use the page cache at all.

Yes, it does that. And yeah, I saw you changed the read side to bypass
the cache entirely.
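
For reference, the drop-behind pattern discussed here can be sketched
roughly as follows. This is only a minimal illustration in Python, not
how nocache is actually implemented: the function names and the 1 MiB
chunk size are made up for the example. It relies on os.posix_fadvise(),
which is available on Linux, and on the documented behaviour that
POSIX_FADV_DONTNEED is advisory and leaves dirty pages alone, hence the
fdatasync() before the hint on the write side.

```python
import os

CHUNK = 1 << 20  # 1 MiB per request; an arbitrary choice for this sketch


def read_drop_behind(path, chunk=CHUNK):
    """Read a file sequentially and hint that each chunk can be dropped.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.pread(fd, chunk, total)
            if not data:
                break
            # ... the application would consume `data` here ...
            # Advisory only: the kernel may or may not free these pages.
            os.posix_fadvise(fd, total, len(data), os.POSIX_FADV_DONTNEED)
            total += len(data)
    finally:
        os.close(fd)
    return total


def write_drop_behind(path, chunks):
    """Write an iterable of bytes objects, dropping cached pages behind us.

    Returns the number of bytes written.
    """
    total = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while view:  # os.write() may write fewer bytes than asked
                written = os.write(fd, view)
                view = view[written:]
            # Dirty pages are not dropped by DONTNEED, so flush them first.
            os.fdatasync(fd)
            os.posix_fadvise(fd, total, len(chunk), os.POSIX_FADV_DONTNEED)
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Syncing after every chunk, as the write helper does, is costly; a real
tool would likely batch the flushes.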

Also, as I understand it, this is primarily for asynchronous use with
io_uring?

-- 
Martin


