* [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Suren Baghdasaryan @ 2023-02-22 19:31 UTC
To: lsf-pc; +Cc: linux-fsdevel, linux-mm, Kent Overstreet

We would like to continue the discussion about code tagging use for
memory allocation profiling. The code tagging framework [1] and its
applications were posted as an RFC [2] and discussed at LPC 2022. It
has many applications proposed in the RFC but we would like to focus
on its application for memory profiling. It can be used as a
low-overhead solution to track memory leaks, rank memory consumers by
the amount of memory they use, identify memory allocation hot paths
and possible other use cases.
Kent Overstreet and I worked on simplifying the solution, minimizing
the overhead and implementing features requested during RFC review.

Kent Overstreet, Michal Hocko, Johannes Weiner, Matthew Wilcox, Andrew
Morton, David Hildenbrand, Vlastimil Babka, Roman Gushchin would be
good participants.

[1] https://lwn.net/Articles/906660/
[2] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/

* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Vlastimil Babka @ 2023-03-28 16:28 UTC
To: Suren Baghdasaryan, lsf-pc; +Cc: linux-fsdevel, linux-mm, Kent Overstreet

On 2/22/23 20:31, Suren Baghdasaryan wrote:
> We would like to continue the discussion about code tagging use for
> memory allocation profiling. The code tagging framework [1] and its
> applications were posted as an RFC [2] and discussed at LPC 2022. It
> has many applications proposed in the RFC but we would like to focus
> on its application for memory profiling. It can be used as a
> low-overhead solution to track memory leaks, rank memory consumers by
> the amount of memory they use, identify memory allocation hot paths
> and possible other use cases.
> Kent Overstreet and I worked on simplifying the solution, minimizing
> the overhead and implementing features requested during RFC review.

IIRC one large objection was the use of page_ext, I don't recall if you
found another solution to that?

> Kent Overstreet, Michal Hocko, Johannes Weiner, Matthew Wilcox, Andrew
> Morton, David Hildenbrand, Vlastimil Babka, Roman Gushchin would be
> good participants.
>
> [1] https://lwn.net/Articles/906660/
> [2] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/

* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Kent Overstreet @ 2023-03-28 16:55 UTC
To: Vlastimil Babka; +Cc: Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > We would like to continue the discussion about code tagging use for
> > memory allocation profiling. The code tagging framework [1] and its
> > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > has many applications proposed in the RFC but we would like to focus
> > on its application for memory profiling. It can be used as a
> > low-overhead solution to track memory leaks, rank memory consumers by
> > the amount of memory they use, identify memory allocation hot paths
> > and possible other use cases.
> > Kent Overstreet and I worked on simplifying the solution, minimizing
> > the overhead and implementing features requested during RFC review.
>
> IIRC one large objection was the use of page_ext, I don't recall if you
> found another solution to that?

No, page_ext is really the thing that makes the most sense here.

It's per-page allocation information, so the only other place it could
go is in struct page itself - but that doesn't make any sense with the
boot time option we've got now, page_ext works perfectly with that.

* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Kent Overstreet @ 2023-05-10 16:28 UTC
To: Vlastimil Babka; +Cc: Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > We would like to continue the discussion about code tagging use for
> > memory allocation profiling. The code tagging framework [1] and its
> > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > has many applications proposed in the RFC but we would like to focus
> > on its application for memory profiling. It can be used as a
> > low-overhead solution to track memory leaks, rank memory consumers by
> > the amount of memory they use, identify memory allocation hot paths
> > and possible other use cases.
> > Kent Overstreet and I worked on simplifying the solution, minimizing
> > the overhead and implementing features requested during RFC review.
>
> IIRC one large objection was the use of page_ext, I don't recall if you
> found another solution to that?

Hasn't been addressed yet, but we were just talking about moving the
codetag pointer from page_ext to page last night for memory overhead
reasons.

The disadvantage then is that the memory overhead doesn't go down if you
disable memory allocation profiling at boot time...

But perhaps the performance overhead is low enough now that this is not
something we expect to be doing as much?

Choices, choices...

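Editor's note: the trade-off in the message above can be put into numbers with a small userspace sketch. It is not kernel code. The function and enum names are hypothetical, and the model is simplified: a reference kept in page_ext is assumed to cost nothing when profiling is disabled at boot, while a reference kept in struct page is always paid for. Page size and reference size are parameters rather than facts about any particular configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; where the per-page codetag reference is stored. */
enum ref_location_sketch {
    REF_IN_PAGE_EXT,     /* allocated only when profiling is enabled at boot */
    REF_IN_STRUCT_PAGE,  /* part of struct page, always present */
};

/*
 * Bytes of memory spent on per-page codetag references.
 * Simplifying assumption: page_ext costs nothing for this feature when it
 * is disabled at boot; other page_ext users are ignored.
 */
static inline uint64_t codetag_ref_overhead_sketch(uint64_t ram_bytes,
                                                   uint64_t page_size,
                                                   uint64_t ref_size,
                                                   enum ref_location_sketch where,
                                                   bool enabled_at_boot)
{
    assert(page_size != 0);
    if (where == REF_IN_PAGE_EXT && !enabled_at_boot)
        return 0;
    return ram_bytes / page_size * ref_size;
}
```

Under these assumptions, a 1 TiB machine with 4 KiB pages and an 8-byte reference spends 2 GiB on the references whenever they live in struct page, and the same 2 GiB with page_ext only when profiling is enabled at boot.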
* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Pasha Tatashin @ 2024-01-21 23:39 UTC
To: Kent Overstreet
Cc: Vlastimil Babka, Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> > On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > > We would like to continue the discussion about code tagging use for
> > > memory allocation profiling. The code tagging framework [1] and its
> > > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > > has many applications proposed in the RFC but we would like to focus
> > > on its application for memory profiling. It can be used as a
> > > low-overhead solution to track memory leaks, rank memory consumers by
> > > the amount of memory they use, identify memory allocation hot paths
> > > and possible other use cases.
> > > Kent Overstreet and I worked on simplifying the solution, minimizing
> > > the overhead and implementing features requested during RFC review.
> >
> > IIRC one large objection was the use of page_ext, I don't recall if you
> > found another solution to that?
>
> Hasn't been addressed yet, but we were just talking about moving the
> codetag pointer from page_ext to page last night for memory overhead
> reasons.
>
> The disadvantage then is that the memory overhead doesn't go down if you
> disable memory allocation profiling at boot time...
>
> But perhaps the performance overhead is low enough now that this is not
> something we expect to be doing as much?
>
> Choices, choices...

I would like to participate in this discussion, specifically to
discuss how to make this profiling applicable at scale: in
environments where we have many machines in the fleet, the memory
and performance overheads must be much smaller than what is
currently proposed.

There are several ideas that we can discuss:

1. Filtering which files are tagged at build time.
For example, if a specific driver does not need to be tagged, it can be
filtered out during the build.

2. Reducing the memory overhead by not using a page_ext pointer, and
instead using n bits in page->flags.

The number of buckets is actually not that large, so there is no need to
keep an 8-byte pointer in page_ext; it could be an index into an array
of a specific size. There could be buckets that contain several stacks.

3. Using static branches for performance optimizations, especially for
the cases when profiling is disabled.

4. Optionally enabling profiling for only a specific allocator:
kmalloc/pgalloc/vmalloc/pcp etc.

Pasha

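Editor's note: idea 2 above (an index held in a few bits of a flags word instead of an 8-byte pointer) can be illustrated with a minimal userspace sketch. Everything here is an assumption made for illustration: the 10-bit width, the choice of the top bits, the table, and all names are hypothetical and are not taken from the kernel or from the posted patches. Index 0 is reserved to mean "no tag".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the top TAG_BITS bits of a 64-bit flags word. */
#define TAG_BITS   10u
#define TAG_SHIFT  (64u - TAG_BITS)
#define TAG_MASK   ((((uint64_t)1 << TAG_BITS) - 1) << TAG_SHIFT)

/* Hypothetical stand-in for a code tag (one per allocation call site). */
struct alloc_tag_sketch {
    const char *file;
    int line;
    uint64_t bytes;
};

/* Fixed-size table; the index stored in the flags word selects an entry. */
static struct alloc_tag_sketch tag_table[1u << TAG_BITS];

/* Store a tag index in the flags word, leaving all other bits untouched. */
static inline uint64_t flags_set_tag(uint64_t flags, uint32_t idx)
{
    assert(idx < (1u << TAG_BITS));
    return (flags & ~TAG_MASK) | ((uint64_t)idx << TAG_SHIFT);
}

/* Read the tag index back out of the flags word. */
static inline uint32_t flags_get_tag(uint64_t flags)
{
    return (uint32_t)((flags & TAG_MASK) >> TAG_SHIFT);
}

/* Map a flags word to its tag; index 0 means the page carries no tag. */
static inline struct alloc_tag_sketch *flags_to_tag(uint64_t flags)
{
    uint32_t idx = flags_get_tag(flags);
    return idx ? &tag_table[idx] : NULL;
}
```

With this layout the per-page cost is the chosen number of bits in an existing word rather than a separate 8-byte pointer, at the price of capping the number of distinct tags at 2^TAG_BITS - 1.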
* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Kent Overstreet @ 2024-01-21 23:56 UTC
To: Pasha Tatashin
Cc: Vlastimil Babka, Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> > On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> > > On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > > > We would like to continue the discussion about code tagging use for
> > > > memory allocation profiling. The code tagging framework [1] and its
> > > > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > > > has many applications proposed in the RFC but we would like to focus
> > > > on its application for memory profiling. It can be used as a
> > > > low-overhead solution to track memory leaks, rank memory consumers by
> > > > the amount of memory they use, identify memory allocation hot paths
> > > > and possible other use cases.
> > > > Kent Overstreet and I worked on simplifying the solution, minimizing
> > > > the overhead and implementing features requested during RFC review.
> > >
> > > IIRC one large objection was the use of page_ext, I don't recall if you
> > > found another solution to that?
> >
> > Hasn't been addressed yet, but we were just talking about moving the
> > codetag pointer from page_ext to page last night for memory overhead
> > reasons.
> >
> > The disadvantage then is that the memory overhead doesn't go down if you
> > disable memory allocation profiling at boot time...
> >
> > But perhaps the performance overhead is low enough now that this is not
> > something we expect to be doing as much?
> >
> > Choices, choices...
>
> I would like to participate in this discussion, specifically to
> discuss how to make this profiling applicable at the scale
> environment. Where we have many machines in the fleet, but the memory
> and performance overheads must be much smaller compared to what is
> currently proposed.
>
> There are several ideas that we can discuss:
> 1. Filtering files that are going to be tagged at the build time.
> For example, If a specific driver does not need to be tagged it can be
> filtered out during build time.

Not a bad idea - but do we have a concrete reason we want this? Our goal
has been low enough overhead to be enabled in production, and I think
we're delivering on that; perhaps we could wait and see if anyone
complains.

We've already got the runtime switch (via a static branch), so if
overhead is the concern that should cover that.

> 2. Reducing the memory overhead by not using page_ext pointer, but
> instead use n-bits in the page->flags.
>
> The number of buckets is actually not that large, there is no need to
> keep 8-byte pointer in page_ext, it could be an idx in an array of a
> specific size. There could be buckets that contain several stacks.

Just a single tag index directly maps to the pointer it replaces, we
should be able to do this.

> 3. Using static branches for performance optimizations, especially for
> the cases when profiling is disabled.

Already are :)

> 4. Optionally enable only a specific allocator profiling:
> kmalloc/pgalloc/vmalloc/pcp etc.

See above - I'd prefer to be restrained with the knobs we add.

* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Matthew Wilcox @ 2024-01-22 1:18 UTC
To: Pasha Tatashin, g
Cc: Kent Overstreet, Vlastimil Babka, Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> > Hasn't been addressed yet, but we were just talking about moving the
> > codetag pointer from page_ext to page last night for memory overhead
> > reasons.
> >
> > The disadvantage then is that the memory overhead doesn't go down if you
> > disable memory allocation profiling at boot time...
> >
> > But perhaps the performance overhead is low enough now that this is not
> > something we expect to be doing as much?
> >
> > Choices, choices...
>
> I would like to participate in this discussion, specifically to

Umm, this is a discussion proposal for last year, not this. I don't
remember if a followup discussion has been proposed for this year?

> 2. Reducing the memory overhead by not using page_ext pointer, but
> instead use n-bits in the page->flags.
>
> The number of buckets is actually not that large, there is no need to
> keep 8-byte pointer in page_ext, it could be an idx in an array of a
> specific size. There could be buckets that contain several stacks.

There are a lot of people using "n bits in page->flags" and I don't
have a good feeling for how many we really have left. MGLRU uses a
variable number of bits. There's PG_arch_2 and PG_arch_3. There's
PG_uncached. There's PG_young and PG_idle. And of course we have
NUMA node (10 bits?), section (?), zone (3 bits?) I count 28 bits
allocated with all the CONFIG enabled, then 13 for node+zone, so it
certainly seems like there's a lot free on 64-bit, but it'd be
nice to have it written out properly.

Related, what do we think is going to happen with page_ext in a memdesc
world (also what's going to happen with the kmsan goop in struct page?)

I see page_idle_ops, page_owner_ops and page_table_check_ops.
page_idle_ops only uses the 8 byte flags. page_owner_ops uses an extra
64 bytes (!). page_table_check uses an extra 8 bytes.

page_idle looks to be for folios only. page_table_check seems like
it should be folded into pgdesc. page_owner maybe gets added to every
allocation rather than every page (but that's going to be interesting
for memdescs which don't normally need an allocation).

That seems to imply that we can get rid of page_ext entirely, which will
be nice. I don't understand kmsan well enough to understand what to
do about it. If it's per-allocation, we can handle it like page_owner.
If it really is per-page, we can make it an ifdef in struct page itself.
I think it's OK to grow struct page for such a rarely used debugging
option.

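Editor's note: the bit count in the message above can be written out as a short userspace sketch. The figures 28 (flag bits with all CONFIG options enabled) and 13 (node + zone) are the estimates given in the message; the section width is left open there, so it is a parameter here. Whether the 28 already covers the variable MGLRU bits is not stated, so this is only an upper bound on what is free. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    WORD_BITS      = 64,  /* page->flags width on a 64-bit kernel */
    FLAG_BITS      = 28,  /* estimate from the message, all CONFIG enabled */
    NODE_ZONE_BITS = 13,  /* estimate from the message: node (10?) + zone (3?) */
};

/* Bits left over once flags, node+zone and section bits are accounted for. */
static inline int free_flag_bits_sketch(int section_bits)
{
    assert(section_bits >= 0);
    return WORD_BITS - FLAG_BITS - NODE_ZONE_BITS - section_bits;
}

/* Number of distinct index values that fit in the given number of bits. */
static inline uint64_t max_index_values_sketch(int bits)
{
    assert(bits >= 0 && bits < 64);
    return (uint64_t)1 << bits;
}
```

With no section bits, 64 - 28 - 13 leaves 23 bits, enough for roughly 8.4 million index values; every bit taken by section numbers or other users halves that.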
* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Suren Baghdasaryan @ 2024-01-22 3:29 UTC
To: Matthew Wilcox
Cc: Pasha Tatashin, g, Kent Overstreet, Vlastimil Babka, lsf-pc, linux-fsdevel, linux-mm

On Sun, Jan 21, 2024 at 5:18 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> > On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> > <kent.overstreet@linux.dev> wrote:
> > > Hasn't been addressed yet, but we were just talking about moving the
> > > codetag pointer from page_ext to page last night for memory overhead
> > > reasons.
> > >
> > > The disadvantage then is that the memory overhead doesn't go down if you
> > > disable memory allocation profiling at boot time...
> > >
> > > But perhaps the performance overhead is low enough now that this is not
> > > something we expect to be doing as much?
> > >
> > > Choices, choices...
> >
> > I would like to participate in this discussion, specifically to
>
> Umm, this is a discussion proposal for last year, not this. I don't
> remember if a followup discussion has been proposed for this year?

My bad. I should submit a proposal for followup discussion for this
year. Will do that this coming week.

>
> > 2. Reducing the memory overhead by not using page_ext pointer, but
> > instead use n-bits in the page->flags.
> >
> > The number of buckets is actually not that large, there is no need to
> > keep 8-byte pointer in page_ext, it could be an idx in an array of a
> > specific size. There could be buckets that contain several stacks.
>
> There are a lot of people using "n bits in page->flags" and I don't
> have a good feeling for how many we really have left. MGLRU uses a
> variable number of bits. There's PG_arch_2 and PG_arch_3. There's
> PG_uncached. There's PG_young and PG_idle. And of course we have
> NUMA node (10 bits?), section (?), zone (3 bits?) I count 28 bits
> allocated with all the CONFIG enabled, then 13 for node+zone, so it
> certainly seems like there's a lot free on 64-bit, but it'd be
> nice to have it written out properly.
>
> Related, what do we think is going to happen with page_ext in a memdesc
> world (also what's going to happen with the kmsan goop in struct page?)
>
> I see page_idle_ops, page_owner_ops and page_table_check_ops.
> page_idle_ops only uses the 8 byte flags. page_owner_ops uses an extra
> 64 bytes (!). page_table_check uses an extra 8 bytes.
>
> page_idle looks to be for folios only. page_table_check seems like
> it should be folded into pgdesc. page_owner maybe gets added to every
> allocation rather than every page (but that's going to be interesting
> for memdescs which don't normally need an allocation).
>
> That seems to imply that we can get rid of page_ext entirely, which will
> be nice. I don't understand kmsan well enough to understand what to
> do about it. If it's per-allocation, we can handle it like page_owner.
> If it really is per-page, we can make it an ifdef in struct page itself.
> I think it's OK to grow struct page for such a rarely used debugging
> option.

* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
From: Suren Baghdasaryan @ 2023-05-10 16:26 UTC
To: lsf-pc; +Cc: linux-fsdevel, linux-mm, Kent Overstreet

On Wed, Feb 22, 2023 at 11:31 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> We would like to continue the discussion about code tagging use for
> memory allocation profiling. The code tagging framework [1] and its
> applications were posted as an RFC [2] and discussed at LPC 2022. It
> has many applications proposed in the RFC but we would like to focus
> on its application for memory profiling. It can be used as a
> low-overhead solution to track memory leaks, rank memory consumers by
> the amount of memory they use, identify memory allocation hot paths
> and possible other use cases.
> Kent Overstreet and I worked on simplifying the solution, minimizing
> the overhead and implementing features requested during RFC review.
>
> Kent Overstreet, Michal Hocko, Johannes Weiner, Matthew Wilcox, Andrew
> Morton, David Hildenbrand, Vlastimil Babka, Roman Gushchin would be
> good participants.
>
> [1] https://lwn.net/Articles/906660/
> [2] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/

Sharing the slides here:
https://drive.google.com/file/d/1dBjYgk03hvaVAe7ph0Sad-zfr4Gw4irQ/view?usp=sharing

Thread overview: 9+ messages
2023-02-22 19:31 [LSF/MM/BPF TOPIC] Memory profiling using code tagging Suren Baghdasaryan
2023-03-28 16:28 ` Vlastimil Babka
2023-03-28 16:55   ` Kent Overstreet
2023-05-10 16:28   ` Kent Overstreet
2024-01-21 23:39     ` Pasha Tatashin
2024-01-21 23:56       ` Kent Overstreet
2024-01-22  1:18       ` Matthew Wilcox
2024-01-22  3:29         ` Suren Baghdasaryan
2023-05-10 16:26 ` Suren Baghdasaryan