From: Mike Snitzer <snitzer@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Ming Lei <ming.lei@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Don Dutile <ddutile@redhat.com>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	linux-block@vger.kernel.org, David Hildenbrand <david@redhat.com>
Subject: Re: [RFC PATCH] mm/readahead: readahead aggressively if read drops in willneed range
Date: Mon, 29 Jan 2024 17:46:52 -0500
Message-ID: <Zbgq3B8nmMuJooEl@redhat.com>
In-Reply-To: <Zbgi6wajZlEkWISO@dread.disaster.area>

On Mon, Jan 29 2024 at  5:12P -0500,
Dave Chinner <david@fromorbit.com> wrote:

> On Mon, Jan 29, 2024 at 12:19:02PM -0500, Mike Snitzer wrote:
> > While I'm sure this legacy application would love to not have to
> > change its code at all, I think we can all agree that we need to just
> > focus on how best to advise applications with mixed workloads to
> > accomplish efficient mmap+read for both sequential and random access.
> > 
> > To that end, I heard Dave clearly suggest 2 things:
> > 
> > 1) update MADV/FADV_SEQUENTIAL to set file->f_ra.ra_pages to
> >    bdi->io_pages, not bdi->ra_pages * 2
> > 
> > 2) Have the application first issue MADV_SEQUENTIAL to convey that
> >    the following MADV_WILLNEED is for a sequential file load (so it
> >    is desirable to use larger ra_pages)
> > 
> > This overrides the default of bdi->ra_pages and _should_ provide the
> > required per-file duality of control for readahead, correct?
> 
> I just discovered MADV_POPULATE_READ - see my reply to Ming
> up-thread about that. The application should use that instead of
> MADV_WILLNEED because it gives cache population guarantees that
> WILLNEED doesn't. Then we can look at optimising the performance of
> MADV_POPULATE_READ (if needed) as there is constrained scope we can
> optimise within in ways that we cannot do with WILLNEED.

Nice find! Given commit 4ca9b3859dac ("mm/madvise: introduce
MADV_POPULATE_(READ|WRITE) to prefault page tables"), I've cc'd David
Hildenbrand just so he's in the loop.

FYI, I proactively raised feedback and questions to the reporter of
this issue:
 
CONTEXT: madvise(WILLNEED) doesn't convey the nature of the access,
sequential vs random, just the range that may be accessed.
 
Q1: Is your application's sequential vs random (or smaller sequential)
access split on a per-file basis?  Or is the same file accessed both
sequentially and randomly?
 
  A1: The same files can be accessed either randomly or sequentially,
  depending on certain access patterns and optimizing logic.
 
Q2: Can the application be changed to use madvise() MADV_SEQUENTIAL
and MADV_RANDOM to indicate its access pattern?
 
  A2: No, the application is a Java application. Java does not expose
  the madvise() API directly. Our application uses the Java NIO API via
  MappedByteBuffer.load()
  (https://docs.oracle.com/javase/8/docs/api/java/nio/MappedByteBuffer.html#load--)
  that calls madvise(MADV_WILLNEED) at the low level. There is no way
  for us to switch this behavior, but we take advantage of this behavior
  to optimize large file sequential I/O with great success.
 
So it's looking like it'll be hard to help this reporter avoid
changes... but that's not upstream's problem!

Mike
