* How should we RCU-free folios?
From: Matthew Wilcox @ 2025-05-29 15:02 UTC
To: linux-mm

When folios are allocated separately from the underlying pages they
represent, they must also be freed. See
https://kernelnewbies.org/MatthewWilcox/FolioAlloc

Since we want to do lockless lookups of folios in the page cache and
GUP, we must RCU free the folios somehow. As I see it, we have three
options:

1. Free the folio back to the slab immediately, and mark the slab as
   TYPESAFE_BY_RCU. That means that the folio may get reallocated at
   any time, but it must always remain a folio (until an RCU grace period
   has passed and then the entire slab may be reallocated to a different
   purpose). Lookups will do:

   a. Get a pointer to the folio
   b. Tryget a refcount on the folio
   c. If it succeeds, re-check the folio is still the one we want
      (If pagecache, check the xarray still points to the folio; if GUP,
      check the page still points to the folio)

2. RCU-free the folio. The folio will not be reallocated until the
   reader drops the RCU read lock. The read side still needs to tryget
   the folio refcount. However, if it succeeds, it does not need to
   re-check the pointer to the folio as the folio cannot have been
   freed. The downside is that folios will hang around in the system for
   longer before being reallocated, and this may be an unacceptable
   increase in memory usage.

3. RCU free the folio and RCU free the memory it controls. Now an
   RCU-protected lookup doesn't need to bump the refcount; if it found the
   pointer, it knows the memory cannot be freed. I think this is a
   step too far and would

I'm favouring option 1; it's what we currently do. But I wanted to
give people a chance to chime in and tell me my tradeoffs are wrong.
Or propose a fourth option.
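[Editorial sketch] The option 1 lookup (steps a to c) might look roughly
like the following in userspace C11. This is only an illustration under
stated assumptions: every name ending in _sketch is hypothetical and not
kernel API, C11 atomics stand in for the kernel's refcount helpers, and
a real lookup would run under rcu_read_lock() and retry instead of
returning NULL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a separately allocated folio. */
struct folio_sketch {
    atomic_int refcount;    /* 0 means free; the memory may be reused */
};

/* Hypothetical stand-in for a page cache slot (e.g. an xarray entry). */
struct slot_sketch {
    _Atomic(struct folio_sketch *) folio;
};

/* Take a reference only if the refcount has not already reached zero. */
static bool folio_tryget_sketch(struct folio_sketch *f)
{
    int old = atomic_load(&f->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
            return true;
    }
    return false;
}

static void folio_put_sketch(struct folio_sketch *f)
{
    int old = atomic_fetch_sub(&f->refcount, 1);

    assert(old > 0);    /* dropping a reference we never held is a bug */
}

/*
 * One attempt at an option 1 lookup. Returns the folio with an extra
 * reference held, or NULL if the slot is empty or the caller should retry.
 */
static struct folio_sketch *lookup_once_sketch(struct slot_sketch *slot)
{
    struct folio_sketch *f = atomic_load(&slot->folio);    /* step a */

    if (!f)
        return NULL;
    if (!folio_tryget_sketch(f))                           /* step b */
        return NULL;
    if (atomic_load(&slot->folio) != f) {                  /* step c */
        /* The folio was freed and reused for something else. */
        folio_put_sketch(f);
        return NULL;
    }
    return f;
}
```

Under option 2 the step c re-check could be dropped, since the folio
could not have been reallocated while the reader is inside the RCU
read-side critical section; the tryget in step b would remain.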
* Re: How should we RCU-free folios?
From: David Hildenbrand @ 2025-06-24 12:13 UTC
To: Matthew Wilcox, linux-mm

On 29.05.25 17:02, Matthew Wilcox wrote:
> When folios are allocated separately from the underlying pages they
> represent, they must also be freed. See
> https://kernelnewbies.org/MatthewWilcox/FolioAlloc
>
> Since we want to do lockless lookups of folios in the page cache and
> GUP,

And in PFN walkers as well.

> we must RCU free the folios somehow. As I see it, we have three
> options:
>
> 1. Free the folio back to the slab immediately, and mark the slab as
>    TYPESAFE_BY_RCU. That means that the folio may get reallocated at
>    any time, but it must always remain a folio (until an RCU grace period
>    has passed and then the entire slab may be reallocated to a different
>    purpose). Lookups will do:
>
>    a. Get a pointer to the folio
>    b. Tryget a refcount on the folio
>    c. If it succeeds, re-check the folio is still the one we want
>       (If pagecache, check the xarray still points to the folio; if GUP,
>       check the page still points to the folio)

Hm, that means that all PFN walkers would now also have to do a tryget
unconditionally.

Also, free hugetlb folios have a refcount of 0 right now ...

>
> 2. RCU-free the folio. The folio will not be reallocated until the
>    reader drops the RCU read lock. The read side still needs to tryget
>    the folio refcount. However, if it succeeds, it does not need to
>    re-check the pointer to the folio as the folio cannot have been
>    freed. The downside is that folios will hang around in the system for
>    longer before being reallocated, and this may be an unacceptable
>    increase in memory usage.
>
> 3. RCU free the folio and RCU free the memory it controls. Now an
>    RCU-protected lookup doesn't need to bump the refcount; if it found the
>    pointer, it knows the memory cannot be freed. I think this is a
>    step too far and would

That sounds nice, though :)

>
> I'm favouring option 1; it's what we currently do. But I wanted to
> give people a chance to chime in and tell me my tradeoffs are wrong.
> Or propose a fourth option.

I really dislike the refcount dependency.

Also ... what about memdescs without a refcount (e.g., PFN walkers and
slab?)?

-- 
Cheers,

David / dhildenb
* Re: How should we RCU-free folios?
From: Matthew Wilcox @ 2025-06-24 20:29 UTC
To: David Hildenbrand
Cc: linux-mm

On Tue, Jun 24, 2025 at 02:13:47PM +0200, David Hildenbrand wrote:
> On 29.05.25 17:02, Matthew Wilcox wrote:
> > When folios are allocated separately from the underlying pages they
> > represent, they must also be freed. See
> > https://kernelnewbies.org/MatthewWilcox/FolioAlloc
> >
> > Since we want to do lockless lookups of folios in the page cache and
> > GUP,
>
> And in PFN walkers as well.

Good point (for those not quite clear what David means here, think
migration where we're doing a physical walk and trying to decide what
to do with the memory, although that's just an example; hwpoison
detection has similar problems).

> > 1. Free the folio back to the slab immediately, and mark the slab as
> >    TYPESAFE_BY_RCU. That means that the folio may get reallocated at
> >    any time, but it must always remain a folio (until an RCU grace period
> >    has passed and then the entire slab may be reallocated to a different
> >    purpose). Lookups will do:
> >
> >    a. Get a pointer to the folio
> >    b. Tryget a refcount on the folio
> >    c. If it succeeds, re-check the folio is still the one we want
> >       (If pagecache, check the xarray still points to the folio; if GUP,
> >       check the page still points to the folio)
>
> Hm, that means that all PFN walkers would now also have to do a tryget
> unconditionally.

To a certain extent. At least for migration, there's a first pass
where we can just look at the value contained in the memdesc to decide
if this block is migratable, then in the second pass we get the refcount
and start doing migration-things to each page.

> Also, free hugetlb folios have a refcount of 0 right now ...

Right ... I think handling of hugetlb folios will probably change a bit.
A free hugetlb folio probably doesn't free the folio, but might set a
flag indicating that it's free. It'd be up to the PFN walker to, say,
grab the hugetlb_lock which would make sure this hugetlb folio wasn't
allocated while it's messing with it.

> > 2. RCU-free the folio. The folio will not be reallocated until the
> >    reader drops the RCU read lock. The read side still needs to tryget
> >    the folio refcount. However, if it succeeds, it does not need to
> >    re-check the pointer to the folio as the folio cannot have been
> >    freed. The downside is that folios will hang around in the system for
> >    longer before being reallocated, and this may be an unacceptable
> >    increase in memory usage.
> >
> > 3. RCU free the folio and RCU free the memory it controls. Now an
> >    RCU-protected lookup doesn't need to bump the refcount; if it found the
> >    pointer, it knows the memory cannot be freed. I think this is a
> >    step too far and would
>
> That sounds nice, though :)
>
> >
> > I'm favouring option 1; it's what we currently do. But I wanted to
> > give people a chance to chime in and tell me my tradeoffs are wrong.
> > Or propose a fourth option.
>
> I really dislike the refcount dependency.
>
> Also ... what about memdescs without a refcount (e.g., PFN walkers and
> slab?)?

Depending on the PFN walker, it needs to know how to handle each kind of
memdesc. Migration might choose to skip slabs and so "handle" them by
moving on to the next block. hwpoison doesn't need to handle them either
(the system is dead if we see poison in a slab).

I'm not sure how a PFN walker can protect against slab "doing something"
with the struct slab. Maybe something like slab_lock() will be needed
(yes, I know mostly slab bypasses slab_lock). But it is going to be a
per-memdesc kind of problem to solve.

Two things I did want to raise though:

First, this is an improvement. There's altogether too much code that
thinks "If I raise the refcount on the page, that will prevent the
memory from being freed". And it'll certainly prevent the page from
being returned to the page allocator, but it won't prevent the slab
allocator from reusing the memory. Other allocators (eg dma_pool)?
No idea.

Second, struct slab doesn't need to be RCU freed (unless we discover
PFN walkers are going to force us to). The slab allocator knows it is
the only user, and when it's done, it can just free it and there's no
chance anybody else is looking at it. Unless PFN walkers look at it,
which they can't today because struct slab is in mm/slab.h.
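[Editorial sketch] The two-pass migration walk described above (look at
the memdesc value first, take refcounts only in the second pass) could
be sketched roughly as follows. This is a userspace illustration, not
kernel code: the memdesc kinds, the struct layout and every name ending
in _sketch are assumptions made for the example, and the first pass is
only a hint, since nothing stops a descriptor from changing between the
passes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical memdesc kinds; the real set is not defined in this thread. */
enum memdesc_kind_sketch {
    KIND_FREE_SKETCH,
    KIND_FOLIO_SKETCH,
    KIND_SLAB_SKETCH,
};

/* Hypothetical descriptor as seen by a PFN walker. */
struct memdesc_sketch {
    enum memdesc_kind_sketch kind;
    atomic_int refcount;    /* only meaningful for folios */
};

/* Pass 1: inspect descriptor values only; no references are taken. */
static bool range_migratable_sketch(const struct memdesc_sketch *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (d[i].kind == KIND_SLAB_SKETCH)
            return false;   /* migration "handles" slab by moving on */
    }
    return true;
}

/* Tryget: succeed only if the refcount has not already reached zero. */
static bool tryget_sketch(struct memdesc_sketch *d)
{
    int old = atomic_load(&d->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&d->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Pass 2: pin every folio in the range; returns how many were pinned. */
static size_t pin_folios_sketch(struct memdesc_sketch *d, size_t n)
{
    size_t pinned = 0;

    assert(d != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (d[i].kind == KIND_FOLIO_SKETCH && tryget_sketch(&d[i]))
            pinned++;
    }
    return pinned;
}
```

A folio whose refcount is already 0 fails the tryget and is skipped in
the second pass. That is the case David raises for free hugetlb folios,
which would need separate handling such as the flag plus hugetlb_lock
approach suggested above.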