From: Matthew Wilcox <willy@infradead.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH 0/5] Remove some races around folio_test_hugetlb
Date: Tue, 5 Mar 2024 20:35:54 +0000 [thread overview]
Message-ID: <ZeeCKiX_e_fd6Cko@casper.infradead.org> (raw)
In-Reply-To: <d8c9e038-21fc-4611-a0f9-fc6748098b81@redhat.com>
On Tue, Mar 05, 2024 at 10:10:08AM +0100, David Hildenbrand wrote:
> > The cost of this reliability is that we now consume the word I recently
> > freed in folio->page[1]. I think this is acceptable; we've still gained
> > a completely reliable folio_test_hugetlb() (which we didn't have before
> > I started messing around with the folio dtors). Non-hugetlb users
> > can use large_id as a pointer to something else entirely, or even as a
> > non-pointer, as long as they can guarantee it can't conflict (ie don't
> > use it as a bitfield).
>
> That probably means that we have to always set the lowest bit to use it for
> something else, or use another bit.
Yes, that would work.
> I was wondering if
>
> a) We could move that to another subpage. In hugetlb folios we have plenty
> of space for such things. I guess we'd have be able to detect the folio size
> without holding a reference, to make sure we can touch another subpage.
Yes, that was my concern. I wanted to put it in page[2] with all the
other hugetlb goop, but I got to thinking about an order-1 compound
page allocated at the end of memmap and got scared. We could make
folio_test_hugetlb() look at ->flags for the head bit, then look at
->flags_1 for the order and finally at ->hugetlb_id, but now we've looked
at three cachelines to answer a fairly frequent question. And then what
if the folio got split between looking at ->flags and ->flags_1 and we
get a bogus folio order that makes it look OK? We can't even look at
->flags, ->flags_1 and recheck ->flags because it might have got split,
freed and reallocated in the meantime.
> b) We could overload _nr_pages_mapped. We'd effectively have to steal one
> bit from _nr_pages_mapped to make this work.
>
> Maybe what works is using the existing mechanism (hugetlb flag), and then
> storing the pointer in __nr_pages_mapped.
>
> So depending on the hugetlb flag, we can interpret __nr_pages_mapped either
> as the pointer or as the old variant.
>
> Mostly only folio_large_is_mapped() would need care for now, to ignore
> _nr_pages_mapped if the hugetlb flag is set.
I don't mind that at all. We wouldn't even need to steal a bit or use the
existing flag; we could just say that -2 means this is a hugetlb folio.
As long as it ends up at the same offset as page->mapping (because that's
always NULL or a pointer possibly with a low bit set so can't ever be a
number between -4095 and -1).
IOW:
word page0 page1
0 flags flags
1 lru.next head
2 lru.prev entire_mapcount + gap
3 mapping nr_pages_mapped + gap / hugetlb_id
4 index pincount + nr_pages
5 private unused
6 mapcount+refcount mapcount+refcount(0)
7 memcg_data -
or on 32-bit
word page0 page1
0 flags flags
1 lru.next head
2 lru.prev entire_mapcount
3 mapping nr_pages_mapped / hugetlb_id
4 index pincount
5 private unused
6 mapcount mapcount
7 refcount refcount
8 memcg_data -
9+ virtual? last_cpupid? whatever
Does this fit with your plans?
next prev parent reply other threads:[~2024-03-05 20:35 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-01 21:47 [PATCH 0/5] Remove some races around folio_test_hugetlb Matthew Wilcox (Oracle)
2024-03-01 21:47 ` [PATCH 1/5] hugetlb: Make folio_test_hugetlb safer to call Matthew Wilcox (Oracle)
2024-03-05 6:43 ` Oscar Salvador
2024-03-05 8:39 ` David Hildenbrand
2024-03-01 21:47 ` [PATCH 2/5] hugetlb: Add hugetlb_pfn_folio Matthew Wilcox (Oracle)
2024-03-05 6:58 ` Oscar Salvador
2024-03-01 21:47 ` [PATCH 3/5] memory-failure: Use hugetlb_pfn_folio Matthew Wilcox (Oracle)
2024-03-01 21:47 ` [PATCH 4/5] memory-failure: Reorganise get_huge_page_for_hwpoison() Matthew Wilcox (Oracle)
2024-03-01 21:47 ` [PATCH 5/5] compaction: Use hugetlb_pfn_folio in isolate_migratepages_block Matthew Wilcox (Oracle)
2024-03-04 9:09 ` [PATCH 0/5] Remove some races around folio_test_hugetlb Miaohe Lin
2024-03-04 17:08 ` Matthew Wilcox
2024-03-06 7:58 ` Miaohe Lin
2024-03-07 21:16 ` Matthew Wilcox
2024-03-05 9:10 ` David Hildenbrand
2024-03-05 20:35 ` Matthew Wilcox [this message]
2024-03-06 15:18 ` David Hildenbrand
2024-03-07 4:31 ` Matthew Wilcox
2024-03-07 9:20 ` David Hildenbrand
2024-03-07 21:14 ` Matthew Wilcox
2024-03-07 21:38 ` David Hildenbrand
2024-03-08 4:31 ` Matthew Wilcox
2024-03-08 8:46 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZeeCKiX_e_fd6Cko@casper.infradead.org \
--to=willy@infradead.org \
--cc=david@redhat.com \
--cc=linux-mm@kvack.org \
--cc=osalvador@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.