All of lore.kernel.org
 help / color / mirror / Atom feed
* + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
@ 2021-03-02 23:43 akpm
  2021-03-24 19:48 ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: akpm @ 2021-03-02 23:43 UTC (permalink / raw)
  To: axboe, jack, mm-commits, willy


The patch titled
     Subject: mm: provide filemap_range_needs_writeback() helper
has been added to the -mm tree.  Its filename is
     mm-provide-filemap_range_needs_writeback-helper.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-provide-filemap_range_needs_writeback-helper.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-provide-filemap_range_needs_writeback-helper.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jens Axboe <axboe@kernel.dk>
Subject: mm: provide filemap_range_needs_writeback() helper

For O_DIRECT reads/writes, we check if we need to issue a call to
filemap_write_and_wait_range() to issue and/or wait for writeback for any
page in the given range.  The existing mechanism just checks for a page in
the range, which is suboptimal for IOCB_NOWAIT as we'll fallback to the
slow path (and needing retry) if there's just a clean page cache page in
the range.

Provide filemap_range_needs_writeback() which tries a little harder to
check if we actually need to issue and/or wait for writeback in the range.

Link: https://lkml.kernel.org/r/20210224164455.1096727-2-axboe@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/fs.h |    2 ++
 mm/filemap.c       |   43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

--- a/include/linux/fs.h~mm-provide-filemap_range_needs_writeback-helper
+++ a/include/linux/fs.h
@@ -2739,6 +2739,8 @@ static inline int filemap_fdatawait(stru
 
 extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
 				  loff_t lend);
+extern bool filemap_range_needs_writeback(struct address_space *,
+					  loff_t lstart, loff_t lend);
 extern int filemap_write_and_wait_range(struct address_space *mapping,
 				        loff_t lstart, loff_t lend);
 extern int __filemap_fdatawrite_range(struct address_space *mapping,
--- a/mm/filemap.c~mm-provide-filemap_range_needs_writeback-helper
+++ a/mm/filemap.c
@@ -636,6 +636,49 @@ static bool mapping_needs_writeback(stru
 }
 
 /**
+ * filemap_range_needs_writeback - check if range potentially needs writeback
+ * @mapping:           address space within which to check
+ * @start_byte:        offset in bytes where the range starts
+ * @end_byte:          offset in bytes where the range ends (inclusive)
+ *
+ * Find at least one page in the range supplied, usually used to check if
+ * direct writing in this range will trigger a writeback. Used by O_DIRECT
+ * read/write with IOCB_NOWAIT, to see if the caller needs to do
+ * filemap_write_and_wait_range() before proceeding.
+ *
+ * Return: %true if the caller should do filemap_write_and_wait_range() before
+ * doing O_DIRECT to a page in this range, %false otherwise.
+ */
+bool filemap_range_needs_writeback(struct address_space *mapping,
+				   loff_t start_byte, loff_t end_byte)
+{
+	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
+	pgoff_t max = end_byte >> PAGE_SHIFT;
+	struct page *page;
+
+	if (!mapping_needs_writeback(mapping))
+		return false;
+	if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
+	    !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
+		return false;
+	if (end_byte < start_byte)
+		return false;
+
+	rcu_read_lock();
+	xas_for_each(&xas, page, max) {
+		if (xas_retry(&xas, page))
+			continue;
+		if (xa_is_value(page))
+			continue;
+		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
+			break;
+	}
+	rcu_read_unlock();
+	return page != NULL;
+}
+EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
+
+/**
  * filemap_write_and_wait_range - write out & wait on a file range
  * @mapping:	the address_space for the pages
  * @lstart:	offset in bytes where the range starts
_

Patches currently in -mm which might be from axboe@kernel.dk are

swap-fix-swapfile-read-write-offset.patch
mm-provide-filemap_range_needs_writeback-helper.patch
mm-use-filemap_range_needs_writeback-for-o_direct-reads.patch
iomap-use-filemap_range_needs_writeback-for-o_direct-reads.patch


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
  2021-03-02 23:43 + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree akpm
@ 2021-03-24 19:48 ` Jens Axboe
  2021-03-24 22:02   ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-03-24 19:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, mm-commits, Matthew Wilcox

On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
>
>
> The patch titled
>      Subject: mm: provide filemap_range_needs_writeback() helper
> has been added to the -mm tree.  Its filename is
>      mm-provide-filemap_range_needs_writeback-helper.patch

Are you still planning on flushing this out for 5.12??

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
  2021-03-24 19:48 ` Jens Axboe
@ 2021-03-24 22:02   ` Andrew Morton
  2021-03-24 22:13     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2021-03-24 22:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jan Kara, mm-commits, Matthew Wilcox

On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:

> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
> >
> >
> > The patch titled
> >      Subject: mm: provide filemap_range_needs_writeback() helper
> > has been added to the -mm tree.  Its filename is
> >      mm-provide-filemap_range_needs_writeback-helper.patch
> 
> Are you still planning on flushing this out for 5.12??

Oh.  No, I wasn't planning on that.  I saw nothing which made me think
it was that urgent.

What is the justification?  How much performance benefit are we talking
here?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
  2021-03-24 22:02   ` Andrew Morton
@ 2021-03-24 22:13     ` Jens Axboe
  2021-03-24 22:19       ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-03-24 22:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, mm-commits, Matthew Wilcox

On 3/24/21 4:02 PM, Andrew Morton wrote:
> On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
>>>
>>>
>>> The patch titled
>>>      Subject: mm: provide filemap_range_needs_writeback() helper
>>> has been added to the -mm tree.  Its filename is
>>>      mm-provide-filemap_range_needs_writeback-helper.patch
>>
>> Are you still planning on flushing this out for 5.12??
> 
> Oh.  No, I wasn't planning on that.  I saw nothing which made me think
> it was that urgent.

Ehm ok, I think that was the plan though when we originally talked about
this series. At least that was what was in my original emails on this
topic :-)

> What is the justification?  How much performance benefit are we talking
> here?

Well it's pretty huge, since if we return "nope can't do this
non-blocking" it'll mean that io_uring has to bump the operation to an
async threads. So CPU cost goes way up, and latencies does too.

Problem is, we're now at -rc4, and it was my understanding that we'd get
this in for 5.12, but obviously way sooner than this. But I kind of lost
track when it went into your tree, until I started thinking about it
earlier today. While it's not the end of the world to wait for 5.13,
though the current situation does suck a lot for certain workloads.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
  2021-03-24 22:13     ` Jens Axboe
@ 2021-03-24 22:19       ` Andrew Morton
  2021-03-24 22:25         ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2021-03-24 22:19 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jan Kara, mm-commits, Matthew Wilcox

On Wed, 24 Mar 2021 16:13:48 -0600 Jens Axboe <axboe@kernel.dk> wrote:

> On 3/24/21 4:02 PM, Andrew Morton wrote:
> > On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> > 
> >> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
> >>>
> >>>
> >>> The patch titled
> >>>      Subject: mm: provide filemap_range_needs_writeback() helper
> >>> has been added to the -mm tree.  Its filename is
> >>>      mm-provide-filemap_range_needs_writeback-helper.patch
> >>
> >> Are you still planning on flushing this out for 5.12??
> > 
> > Oh.  No, I wasn't planning on that.  I saw nothing which made me think
> > it was that urgent.
> 
> Ehm ok, I think that was the plan though when we originally talked about
> this series. At least that was what was in my original emails on this
> topic :-)
> 
> > What is the justification?  How much performance benefit are we talking
> > here?
> 
> Well it's pretty huge, since if we return "nope can't do this
> non-blocking" it'll mean that io_uring has to bump the operation to an
> async threads. So CPU cost goes way up, and latencies does too.
> 
> Problem is, we're now at -rc4, and it was my understanding that we'd get
> this in for 5.12, but obviously way sooner than this. But I kind of lost
> track when it went into your tree, until I started thinking about it
> earlier today. While it's not the end of the world to wait for 5.13,
> though the current situation does suck a lot for certain workloads.

I'd be OK with sending it to Linus now, but we really should put some
numbers behind "suck a lot" to justify it.  And a -stable backport
might be justified if the benefit is large enough, and if "certain
workloads" are common enough.

But with what have here, I and everyone else who considers these
patches is going in blind!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
  2021-03-24 22:19       ` Andrew Morton
@ 2021-03-24 22:25         ` Jens Axboe
  2021-03-25  0:54           ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-03-24 22:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, mm-commits, Matthew Wilcox

On 3/24/21 4:19 PM, Andrew Morton wrote:
> On Wed, 24 Mar 2021 16:13:48 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 3/24/21 4:02 PM, Andrew Morton wrote:
>>> On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
>>>>>
>>>>>
>>>>> The patch titled
>>>>>      Subject: mm: provide filemap_range_needs_writeback() helper
>>>>> has been added to the -mm tree.  Its filename is
>>>>>      mm-provide-filemap_range_needs_writeback-helper.patch
>>>>
>>>> Are you still planning on flushing this out for 5.12??
>>>
>>> Oh.  No, I wasn't planning on that.  I saw nothing which made me think
>>> it was that urgent.
>>
>> Ehm ok, I think that was the plan though when we originally talked about
>> this series. At least that was what was in my original emails on this
>> topic :-)
>>
>>> What is the justification?  How much performance benefit are we talking
>>> here?
>>
>> Well it's pretty huge, since if we return "nope can't do this
>> non-blocking" it'll mean that io_uring has to bump the operation to an
>> async threads. So CPU cost goes way up, and latencies does too.
>>
>> Problem is, we're now at -rc4, and it was my understanding that we'd get
>> this in for 5.12, but obviously way sooner than this. But I kind of lost
>> track when it went into your tree, until I started thinking about it
>> earlier today. While it's not the end of the world to wait for 5.13,
>> though the current situation does suck a lot for certain workloads.
> 
> I'd be OK with sending it to Linus now, but we really should put some
> numbers behind "suck a lot" to justify it.  And a -stable backport
> might be justified if the benefit is large enough, and if "certain
> workloads" are common enough.
> 
> But with what have here, I and everyone else who considers these
> patches is going in blind!

The backstory is that an internal workload complained because it was
using too much CPU, and when I took a look, we had a lot of io_uring
workers going to town. For an async buffered read like workload, I am
normally expecting _zero_ offloads to a worker thread, but this one
had tons of them. I'd drop caches and things would look good again,
but then a minute later we'd regress back to using workers. Turns out
that every minute something was reading parts of the device, which
would add page cache for that inode. I put patches like these in for
our kernel, and the problem was solved.

Obviously that's not a great story since there are no hard numbers
there, and I'd have to generate those for you! Which I surely can.
While the original premise of the patches was fixing excessive CPU
usage, the general observation is that outside of just using a lot more
CPU, it'll also cause worse latencies as offload to a thread always adds
latency.

So let me know if you want me to generate some numbers. They will be
artificial, but at least it'll be something outside of a story.


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
  2021-03-24 22:25         ` Jens Axboe
@ 2021-03-25  0:54           ` Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2021-03-25  0:54 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jan Kara, mm-commits, Matthew Wilcox

On Wed, 24 Mar 2021 16:25:52 -0600 Jens Axboe <axboe@kernel.dk> wrote:

> The backstory is that an internal workload complained because it was
> using too much CPU, and when I took a look, we had a lot of io_uring
> workers going to town. For an async buffered read like workload, I am
> normally expecting _zero_ offloads to a worker thread, but this one
> had tons of them. I'd drop caches and things would look good again,
> but then a minute later we'd regress back to using workers. Turns out
> that every minute something was reading parts of the device, which
> would add page cache for that inode. I put patches like these in for
> our kernel, and the problem was solved.

I added the above to the 0/n changelog if tha'ts OK.

> Obviously that's not a great story since there are no hard numbers
> there, and I'd have to generate those for you! Which I surely can.
> While the original premise of the patches was fixing excessive CPU
> usage, the general observation is that outside of just using a lot more
> CPU, it'll also cause worse latencies as offload to a thread always adds
> latency.
> 
> So let me know if you want me to generate some numbers. They will be
> artificial, but at least it'll be something outside of a story.

We really should back it up with numbers, even if they're handwavy.

Particularly because we need to figure out if -stable wants this.  What
are your feelings on that one?

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-03-25  0:55 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-03-02 23:43 + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree akpm
2021-03-24 19:48 ` Jens Axboe
2021-03-24 22:02   ` Andrew Morton
2021-03-24 22:13     ` Jens Axboe
2021-03-24 22:19       ` Andrew Morton
2021-03-24 22:25         ` Jens Axboe
2021-03-25  0:54           ` Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.