* + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
@ 2021-03-02 23:43 akpm
2021-03-24 19:48 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: akpm @ 2021-03-02 23:43 UTC (permalink / raw)
To: axboe, jack, mm-commits, willy
The patch titled
Subject: mm: provide filemap_range_needs_writeback() helper
has been added to the -mm tree. Its filename is
mm-provide-filemap_range_needs_writeback-helper.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-provide-filemap_range_needs_writeback-helper.patch
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-provide-filemap_range_needs_writeback-helper.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Jens Axboe <axboe@kernel.dk>
Subject: mm: provide filemap_range_needs_writeback() helper
For O_DIRECT reads/writes, we check if we need to issue a call to
filemap_write_and_wait_range() to issue and/or wait for writeback for any
page in the given range. The existing mechanism just checks for a page in
the range, which is suboptimal for IOCB_NOWAIT as we'll fallback to the
slow path (and needing retry) if there's just a clean page cache page in
the range.
Provide filemap_range_needs_writeback() which tries a little harder to
check if we actually need to issue and/or wait for writeback in the range.
Link: https://lkml.kernel.org/r/20210224164455.1096727-2-axboe@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/fs.h | 2 ++
mm/filemap.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)
--- a/include/linux/fs.h~mm-provide-filemap_range_needs_writeback-helper
+++ a/include/linux/fs.h
@@ -2739,6 +2739,8 @@ static inline int filemap_fdatawait(stru
extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
loff_t lend);
+extern bool filemap_range_needs_writeback(struct address_space *,
+ loff_t lstart, loff_t lend);
extern int filemap_write_and_wait_range(struct address_space *mapping,
loff_t lstart, loff_t lend);
extern int __filemap_fdatawrite_range(struct address_space *mapping,
--- a/mm/filemap.c~mm-provide-filemap_range_needs_writeback-helper
+++ a/mm/filemap.c
@@ -636,6 +636,49 @@ static bool mapping_needs_writeback(stru
}
/**
+ * filemap_range_needs_writeback - check if range potentially needs writeback
+ * @mapping: address space within which to check
+ * @start_byte: offset in bytes where the range starts
+ * @end_byte: offset in bytes where the range ends (inclusive)
+ *
+ * Find at least one page in the range supplied, usually used to check if
+ * direct writing in this range will trigger a writeback. Used by O_DIRECT
+ * read/write with IOCB_NOWAIT, to see if the caller needs to do
+ * filemap_write_and_wait_range() before proceeding.
+ *
+ * Return: %true if the caller should do filemap_write_and_wait_range() before
+ * doing O_DIRECT to a page in this range, %false otherwise.
+ */
+bool filemap_range_needs_writeback(struct address_space *mapping,
+ loff_t start_byte, loff_t end_byte)
+{
+ XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
+ pgoff_t max = end_byte >> PAGE_SHIFT;
+ struct page *page;
+
+ if (!mapping_needs_writeback(mapping))
+ return false;
+ if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
+ !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
+ return false;
+ if (end_byte < start_byte)
+ return false;
+
+ rcu_read_lock();
+ xas_for_each(&xas, page, max) {
+ if (xas_retry(&xas, page))
+ continue;
+ if (xa_is_value(page))
+ continue;
+ if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
+ break;
+ }
+ rcu_read_unlock();
+ return page != NULL;
+}
+EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
+
+/**
* filemap_write_and_wait_range - write out & wait on a file range
* @mapping: the address_space for the pages
* @lstart: offset in bytes where the range starts
_
Patches currently in -mm which might be from axboe@kernel.dk are
swap-fix-swapfile-read-write-offset.patch
mm-provide-filemap_range_needs_writeback-helper.patch
mm-use-filemap_range_needs_writeback-for-o_direct-reads.patch
iomap-use-filemap_range_needs_writeback-for-o_direct-reads.patch
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
2021-03-02 23:43 + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree akpm
@ 2021-03-24 19:48 ` Jens Axboe
2021-03-24 22:02 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-03-24 19:48 UTC (permalink / raw)
To: Andrew Morton; +Cc: Jan Kara, mm-commits, Matthew Wilcox
On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
>
>
> The patch titled
> Subject: mm: provide filemap_range_needs_writeback() helper
> has been added to the -mm tree. Its filename is
> mm-provide-filemap_range_needs_writeback-helper.patch
Are you still planning on flushing this out for 5.12??
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
2021-03-24 19:48 ` Jens Axboe
@ 2021-03-24 22:02 ` Andrew Morton
2021-03-24 22:13 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2021-03-24 22:02 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jan Kara, mm-commits, Matthew Wilcox
On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
> >
> >
> > The patch titled
> > Subject: mm: provide filemap_range_needs_writeback() helper
> > has been added to the -mm tree. Its filename is
> > mm-provide-filemap_range_needs_writeback-helper.patch
>
> Are you still planning on flushing this out for 5.12??
Oh. No, I wasn't planning on that. I saw nothing which made me think
it was that urgent.
What is the justification? How much performance benefit are we talking
here?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
2021-03-24 22:02 ` Andrew Morton
@ 2021-03-24 22:13 ` Jens Axboe
2021-03-24 22:19 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-03-24 22:13 UTC (permalink / raw)
To: Andrew Morton; +Cc: Jan Kara, mm-commits, Matthew Wilcox
On 3/24/21 4:02 PM, Andrew Morton wrote:
> On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
>
>> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
>>>
>>>
>>> The patch titled
>>> Subject: mm: provide filemap_range_needs_writeback() helper
>>> has been added to the -mm tree. Its filename is
>>> mm-provide-filemap_range_needs_writeback-helper.patch
>>
>> Are you still planning on flushing this out for 5.12??
>
> Oh. No, I wasn't planning on that. I saw nothing which made me think
> it was that urgent.
Ehm ok, I think that was the plan though when we originally talked about
this series. At least that was what was in my original emails on this
topic :-)
> What is the justification? How much performance benefit are we talking
> here?
Well it's pretty huge, since if we return "nope can't do this
non-blocking" it'll mean that io_uring has to bump the operation to an
async threads. So CPU cost goes way up, and latencies does too.
Problem is, we're now at -rc4, and it was my understanding that we'd get
this in for 5.12, but obviously way sooner than this. But I kind of lost
track when it went into your tree, until I started thinking about it
earlier today. While it's not the end of the world to wait for 5.13,
though the current situation does suck a lot for certain workloads.
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
2021-03-24 22:13 ` Jens Axboe
@ 2021-03-24 22:19 ` Andrew Morton
2021-03-24 22:25 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2021-03-24 22:19 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jan Kara, mm-commits, Matthew Wilcox
On Wed, 24 Mar 2021 16:13:48 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> On 3/24/21 4:02 PM, Andrew Morton wrote:
> > On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
> >>>
> >>>
> >>> The patch titled
> >>> Subject: mm: provide filemap_range_needs_writeback() helper
> >>> has been added to the -mm tree. Its filename is
> >>> mm-provide-filemap_range_needs_writeback-helper.patch
> >>
> >> Are you still planning on flushing this out for 5.12??
> >
> > Oh. No, I wasn't planning on that. I saw nothing which made me think
> > it was that urgent.
>
> Ehm ok, I think that was the plan though when we originally talked about
> this series. At least that was what was in my original emails on this
> topic :-)
>
> > What is the justification? How much performance benefit are we talking
> > here?
>
> Well it's pretty huge, since if we return "nope can't do this
> non-blocking" it'll mean that io_uring has to bump the operation to an
> async threads. So CPU cost goes way up, and latencies does too.
>
> Problem is, we're now at -rc4, and it was my understanding that we'd get
> this in for 5.12, but obviously way sooner than this. But I kind of lost
> track when it went into your tree, until I started thinking about it
> earlier today. While it's not the end of the world to wait for 5.13,
> though the current situation does suck a lot for certain workloads.
I'd be OK with sending it to Linus now, but we really should put some
numbers behind "suck a lot" to justify it. And a -stable backport
might be justified if the benefit is large enough, and if "certain
workloads" are common enough.
But with what have here, I and everyone else who considers these
patches is going in blind!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
2021-03-24 22:19 ` Andrew Morton
@ 2021-03-24 22:25 ` Jens Axboe
2021-03-25 0:54 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-03-24 22:25 UTC (permalink / raw)
To: Andrew Morton; +Cc: Jan Kara, mm-commits, Matthew Wilcox
On 3/24/21 4:19 PM, Andrew Morton wrote:
> On Wed, 24 Mar 2021 16:13:48 -0600 Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 3/24/21 4:02 PM, Andrew Morton wrote:
>>> On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> On Tue, Mar 2, 2021 at 4:43 PM <akpm@linux-foundation.org> wrote:
>>>>>
>>>>>
>>>>> The patch titled
>>>>> Subject: mm: provide filemap_range_needs_writeback() helper
>>>>> has been added to the -mm tree. Its filename is
>>>>> mm-provide-filemap_range_needs_writeback-helper.patch
>>>>
>>>> Are you still planning on flushing this out for 5.12??
>>>
>>> Oh. No, I wasn't planning on that. I saw nothing which made me think
>>> it was that urgent.
>>
>> Ehm ok, I think that was the plan though when we originally talked about
>> this series. At least that was what was in my original emails on this
>> topic :-)
>>
>>> What is the justification? How much performance benefit are we talking
>>> here?
>>
>> Well it's pretty huge, since if we return "nope can't do this
>> non-blocking" it'll mean that io_uring has to bump the operation to an
>> async threads. So CPU cost goes way up, and latencies does too.
>>
>> Problem is, we're now at -rc4, and it was my understanding that we'd get
>> this in for 5.12, but obviously way sooner than this. But I kind of lost
>> track when it went into your tree, until I started thinking about it
>> earlier today. While it's not the end of the world to wait for 5.13,
>> though the current situation does suck a lot for certain workloads.
>
> I'd be OK with sending it to Linus now, but we really should put some
> numbers behind "suck a lot" to justify it. And a -stable backport
> might be justified if the benefit is large enough, and if "certain
> workloads" are common enough.
>
> But with what have here, I and everyone else who considers these
> patches is going in blind!
The backstory is that an internal workload complained because it was
using too much CPU, and when I took a look, we had a lot of io_uring
workers going to town. For an async buffered read like workload, I am
normally expecting _zero_ offloads to a worker thread, but this one
had tons of them. I'd drop caches and things would look good again,
but then a minute later we'd regress back to using workers. Turns out
that every minute something was reading parts of the device, which
would add page cache for that inode. I put patches like these in for
our kernel, and the problem was solved.
Obviously that's not a great story since there are no hard numbers
there, and I'd have to generate those for you! Which I surely can.
While the original premise of the patches was fixing excessive CPU
usage, the general observation is that outside of just using a lot more
CPU, it'll also cause worse latencies as offload to a thread always adds
latency.
So let me know if you want me to generate some numbers. They will be
artificial, but at least it'll be something outside of a story.
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree
2021-03-24 22:25 ` Jens Axboe
@ 2021-03-25 0:54 ` Andrew Morton
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2021-03-25 0:54 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jan Kara, mm-commits, Matthew Wilcox
On Wed, 24 Mar 2021 16:25:52 -0600 Jens Axboe <axboe@kernel.dk> wrote:
> The backstory is that an internal workload complained because it was
> using too much CPU, and when I took a look, we had a lot of io_uring
> workers going to town. For an async buffered read like workload, I am
> normally expecting _zero_ offloads to a worker thread, but this one
> had tons of them. I'd drop caches and things would look good again,
> but then a minute later we'd regress back to using workers. Turns out
> that every minute something was reading parts of the device, which
> would add page cache for that inode. I put patches like these in for
> our kernel, and the problem was solved.
I added the above to the 0/n changelog if tha'ts OK.
> Obviously that's not a great story since there are no hard numbers
> there, and I'd have to generate those for you! Which I surely can.
> While the original premise of the patches was fixing excessive CPU
> usage, the general observation is that outside of just using a lot more
> CPU, it'll also cause worse latencies as offload to a thread always adds
> latency.
>
> So let me know if you want me to generate some numbers. They will be
> artificial, but at least it'll be something outside of a story.
We really should back it up with numbers, even if they're handwavy.
Particularly because we need to figure out if -stable wants this. What
are your feelings on that one?
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-03-25 0:55 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-03-02 23:43 + mm-provide-filemap_range_needs_writeback-helper.patch added to -mm tree akpm
2021-03-24 19:48 ` Jens Axboe
2021-03-24 22:02 ` Andrew Morton
2021-03-24 22:13 ` Jens Axboe
2021-03-24 22:19 ` Andrew Morton
2021-03-24 22:25 ` Jens Axboe
2021-03-25 0:54 ` Andrew Morton
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.