From: Hannes Reinecke <hare@suse.de>
To: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>,
Kundan Kumar <kundan.kumar@samsung.com>,
axboe@kernel.dk, linux-block@vger.kernel.org,
joshi.k@samsung.com, mcgrof@kernel.org, anuj20.g@samsung.com,
nj.shetty@samsung.com, c.gameti@samsung.com,
gost.dev@samsung.com
Subject: Re: [PATCH v2] block : add larger order folio size instead of pages
Date: Sun, 5 May 2024 14:10:14 +0200
Message-ID: <33717b97-8986-4d6e-aa10-47393b810ea2@suse.de>
In-Reply-To: <ZjZjBHAdUdt6FJe6@casper.infradead.org>
On 5/4/24 18:32, Matthew Wilcox wrote:
> On Sat, May 04, 2024 at 02:35:15PM +0200, Hannes Reinecke wrote:
>>> I think this is wandering into a minefield. I'm pretty sure
>>> it's considered valid to split the bio, and complete the two halves
>>> independently. Each one will put the refcounts for the pages it touches,
>>> and if we do this early putting of references, that's going to fail.
>>
>> Precisely my worries. Something I want to talk to you about at LSF;
>> refcounting of folios vs refcounting of pages.
>> When one takes a refcount on a folio we are actually taking a refcount
>> on the first page, which is okay if we stick with using the folio throughout
>> the call chain. But if we start mixing between pages and folios (as we do
>> here) we will be getting the refcount wrong.
>>
>> Do you have plans how we could improve the situation?
>> Like a warning 'Hey, you've used the folio for taking the reference, but now
>> you are releasing the references for the page'?
>
> This is a fairly common misunderstanding, but TLDR: problem solved long
> before I started this project.
>
> Individual pages don't actually have a refcount. I know it looks
> like they do, and they kind of do, but for tail pages, the refcount is
> always 0. Functions like get_page() and put_page() always operate on
> the head page (ie folio) refcount.
>
Precisely.
> Specifically, I think you're concerned about pages coming from GUP.
> Take a look at try_get_folio(). We pass in a struct page, explicitly
> get the refcount on a folio, check the page is still part of the
> folio, then return the folio. And we return the page to the caller
> because the caller needs to know the precise page at that address,
> not the folio that contains it.
>
> There are functions which don't surreptitiously call compound_head()
> behind your back. set_page_count(), for example. And page_ref_count()
> (rather than the more normal page_count()).
>
> And none of this is true if you don't use __GFP_COMP. But let's call
> that an aberration that must die.
Ah, right. So the refcount for a page is always unwound to use the
refcount of the enclosing folio.
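
To make that concrete for myself, here is a minimal userspace sketch of the
behaviour described above. It is only a model, not kernel code: the model_*
names and the struct layout are made up for illustration, and the real
struct page / struct folio are considerably more involved. It assumes a
compound (__GFP_COMP-style) allocation, where every page can find its head.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; the names echo the kernel's but are hypothetical. */
struct model_page {
	int refcount;             /* meaningful on the head page only */
	struct model_page *head;  /* a head page points to itself */
};

/* Set up nr contiguous pages as one compound page, i.e. one folio. */
static void model_folio_init(struct model_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		pages[i].refcount = 0;     /* tail pages stay at 0 */
		pages[i].head = &pages[0];
	}
	pages[0].refcount = 1;             /* the allocation's own reference */
}

static struct model_page *model_compound_head(struct model_page *page)
{
	return page->head;
}

/* Like get_page(): the reference always lands on the head page. */
static void model_get_page(struct model_page *page)
{
	model_compound_head(page)->refcount++;
}

/* Like put_page(): the reference is always dropped from the head page. */
static void model_put_page(struct model_page *page)
{
	struct model_page *head = model_compound_head(page);

	assert(head->refcount > 0);
	head->refcount--;
}

/* Like page_ref_count(): reads the raw field, no compound_head() lookup. */
static int model_page_ref_count(const struct model_page *page)
{
	return page->refcount;
}

/* Like page_count(): resolves to the head page first. */
static int model_page_count(struct model_page *page)
{
	return model_compound_head(page)->refcount;
}
```

In this model, calling model_get_page() on a tail page leaves that page's
own counter at 0 and raises the head's counter, so taking a reference via
the folio and dropping it via any page of the folio stays balanced.
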
I was actually concerned with the iov_iter functions, where we take a
reference for each page. Currently iov_iter is iterating in units of
PAGE_SIZE, so there is no easy way of converting that to folios.
But one step at a time, I guess. First get the blocksize > pagesize
patches in.
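
For the per-page iteration case, a rough sketch of the accounting, again as
a userspace model rather than kernel code. The sketch_* names and the fixed
4096-byte page size are assumptions for illustration, and the model assumes
that any split of the range falls on a page boundary. One reference is taken
per PAGE_SIZE unit, but all of them land on the single folio counter, so two
halves of a split range can each drop the references for the pages they
touch, independently of each other.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration */

/* Hypothetical stand-in for a folio: one counter shared by all its pages. */
struct sketch_folio {
	int refcount;
};

/* Number of PAGE_SIZE units touched by the byte range [offset, offset + len). */
static size_t sketch_pages_spanned(size_t offset, size_t len)
{
	if (len == 0)
		return 0;

	size_t first = offset / SKETCH_PAGE_SIZE;
	size_t last = (offset + len - 1) / SKETCH_PAGE_SIZE;

	return last - first + 1;
}

/*
 * Walk the range one page at a time and take one reference per page.
 * Every reference ends up on the folio. Returns the number taken.
 */
static size_t sketch_get_range(struct sketch_folio *folio,
			       size_t offset, size_t len)
{
	size_t n = sketch_pages_spanned(offset, len);

	for (size_t i = 0; i < n; i++)
		folio->refcount++;
	return n;
}

/* Drop one reference per page in the range, e.g. when one half completes. */
static void sketch_put_range(struct sketch_folio *folio,
			     size_t offset, size_t len)
{
	size_t n = sketch_pages_spanned(offset, len);

	for (size_t i = 0; i < n; i++) {
		assert(folio->refcount > 0);
		folio->refcount--;
	}
}
```

With a 16k folio starting at refcount 1, sketch_get_range() over the whole
folio takes four references; putting the ranges [0, 8k) and [8k, 16k)
separately drops two each and returns the counter to 1.
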
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich