linux-fsdevel.vger.kernel.org archive mirror
* [LSF/MM/BPF TOPIC] Memory profiling using code tagging
@ 2023-02-22 19:31 Suren Baghdasaryan
  2023-03-28 16:28 ` Vlastimil Babka
  2023-05-10 16:26 ` Suren Baghdasaryan
  0 siblings, 2 replies; 9+ messages in thread
From: Suren Baghdasaryan @ 2023-02-22 19:31 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-fsdevel, linux-mm, Kent Overstreet

We would like to continue the discussion about code tagging use for
memory allocation profiling. The code tagging framework [1] and its
applications were posted as an RFC [2] and discussed at LPC 2022. It
has many applications proposed in the RFC but we would like to focus
on its application for memory profiling. It can be used as a
low-overhead solution to track memory leaks, rank memory consumers by
the amount of memory they use, identify memory allocation hot paths
and possibly support other use cases.
Kent Overstreet and I worked on simplifying the solution, minimizing
the overhead and implementing features requested during RFC review.
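
A minimal userspace sketch of the general idea, not the code from the
RFC: each allocation call site gets its own static record, and every
allocation keeps a reference to that record so the counters can be
decremented on free. All names here (alloc_tag fields, tagged_malloc,
tagged_free, tag_of, TAGGED_MALLOC) are hypothetical, and the sketch
ignores concurrency entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-call-site record; not the kernel's actual layout. */
struct alloc_tag {
	const char *file;
	int line;
	size_t bytes;	/* bytes currently allocated from this call site */
	size_t calls;	/* live allocations from this call site */
};

/* Header placed in front of each allocation so free can find the tag.
 * The union keeps the user pointer suitably aligned. */
union alloc_hdr {
	struct {
		struct alloc_tag *tag;
		size_t size;
	} info;
	max_align_t align;
};

static void *tagged_malloc(struct alloc_tag *tag, size_t size)
{
	union alloc_hdr *hdr = malloc(sizeof(*hdr) + size);

	if (!hdr)
		return NULL;
	hdr->info.tag = tag;
	hdr->info.size = size;
	tag->bytes += size;
	tag->calls++;
	return hdr + 1;
}

/* Look up the call-site record that an allocation was charged to. */
static struct alloc_tag *tag_of(void *ptr)
{
	return ((union alloc_hdr *)ptr - 1)->info.tag;
}

static void tagged_free(void *ptr)
{
	union alloc_hdr *hdr;

	if (!ptr)
		return;
	hdr = (union alloc_hdr *)ptr - 1;
	assert(hdr->info.tag->calls > 0);
	hdr->info.tag->bytes -= hdr->info.size;
	hdr->info.tag->calls--;
	free(hdr);
}

/* One static tag per expansion, i.e. per call site in the source. */
#define TAGGED_MALLOC(ptr, size)					\
	do {								\
		static struct alloc_tag _tag = { __FILE__, __LINE__, 0, 0 }; \
		(ptr) = tagged_malloc(&_tag, (size));			\
	} while (0)
```

Ranking consumers then amounts to walking all tags and sorting by
bytes; a call site whose counters only ever grow is a leak candidate.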

Kent Overstreet, Michal Hocko, Johannes Weiner, Matthew Wilcox, Andrew
Morton, David Hildenbrand, Vlastimil Babka, Roman Gushchin would be
good participants.

[1] https://lwn.net/Articles/906660/
[2] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2023-02-22 19:31 [LSF/MM/BPF TOPIC] Memory profiling using code tagging Suren Baghdasaryan
@ 2023-03-28 16:28 ` Vlastimil Babka
  2023-03-28 16:55   ` Kent Overstreet
  2023-05-10 16:28   ` Kent Overstreet
  2023-05-10 16:26 ` Suren Baghdasaryan
  1 sibling, 2 replies; 9+ messages in thread
From: Vlastimil Babka @ 2023-03-28 16:28 UTC (permalink / raw)
  To: Suren Baghdasaryan, lsf-pc; +Cc: linux-fsdevel, linux-mm, Kent Overstreet

On 2/22/23 20:31, Suren Baghdasaryan wrote:
> We would like to continue the discussion about code tagging use for
> memory allocation profiling. The code tagging framework [1] and its
> applications were posted as an RFC [2] and discussed at LPC 2022. It
> has many applications proposed in the RFC but we would like to focus
> on its application for memory profiling. It can be used as a
> low-overhead solution to track memory leaks, rank memory consumers by
> the amount of memory they use, identify memory allocation hot paths
> and possible other use cases.
> Kent Overstreet and I worked on simplifying the solution, minimizing
> the overhead and implementing features requested during RFC review.

IIRC one large objection was the use of page_ext. Did you find another
solution to that?

> Kent Overstreet, Michal Hocko, Johannes Weiner, Matthew Wilcox, Andrew
> Morton, David Hildenbrand, Vlastimil Babka, Roman Gushchin would be
> good participants.
> 
> [1] https://lwn.net/Articles/906660/
> [2] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/
> 



* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2023-03-28 16:28 ` Vlastimil Babka
@ 2023-03-28 16:55   ` Kent Overstreet
  2023-05-10 16:28   ` Kent Overstreet
  1 sibling, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-03-28 16:55 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > We would like to continue the discussion about code tagging use for
> > memory allocation profiling. The code tagging framework [1] and its
> > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > has many applications proposed in the RFC but we would like to focus
> > on its application for memory profiling. It can be used as a
> > low-overhead solution to track memory leaks, rank memory consumers by
> > the amount of memory they use, identify memory allocation hot paths
> > and possible other use cases.
> > Kent Overstreet and I worked on simplifying the solution, minimizing
> > the overhead and implementing features requested during RFC review.
> 
> IIRC one large objection was the use of page_ext, I don't recall if you
> found another solution to that?

No, page_ext is really the thing that makes the most sense here. It's
per-page allocation information, so the only other place it could go is
in struct page itself. But that doesn't make sense with the boot-time
option we've got now, whereas page_ext works perfectly with that.
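
To illustrate the trade-off, here is a rough userspace sketch, assuming
a side table with one slot per page that is only allocated when
profiling is switched on at boot. The names (page_side_table,
side_table_init, side_table_bytes) are made up for this example and are
not the page_ext API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct tag;	/* opaque stand-in for a code tag */

/* Hypothetical side table: one slot per page, kept outside struct page. */
struct page_side_table {
	struct tag **slots;	/* NULL when profiling is disabled */
	size_t nr_pages;
};

/* Allocate the per-page slots only if profiling was requested. */
static bool side_table_init(struct page_side_table *t, size_t nr_pages,
			    bool profiling_enabled)
{
	assert(t != NULL);
	t->nr_pages = nr_pages;
	t->slots = NULL;
	if (!profiling_enabled)
		return true;	/* nothing allocated, no memory spent */
	t->slots = calloc(nr_pages, sizeof(*t->slots));
	return t->slots != NULL;
}

/* Memory spent on per-page tag references. */
static size_t side_table_bytes(const struct page_side_table *t)
{
	return t->slots ? t->nr_pages * sizeof(*t->slots) : 0;
}
```

A field embedded in the per-page structure itself would cost its size
for every page regardless of the boot option.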


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2023-02-22 19:31 [LSF/MM/BPF TOPIC] Memory profiling using code tagging Suren Baghdasaryan
  2023-03-28 16:28 ` Vlastimil Babka
@ 2023-05-10 16:26 ` Suren Baghdasaryan
  1 sibling, 0 replies; 9+ messages in thread
From: Suren Baghdasaryan @ 2023-05-10 16:26 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-fsdevel, linux-mm, Kent Overstreet

On Wed, Feb 22, 2023 at 11:31 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> We would like to continue the discussion about code tagging use for
> memory allocation profiling. The code tagging framework [1] and its
> applications were posted as an RFC [2] and discussed at LPC 2022. It
> has many applications proposed in the RFC but we would like to focus
> on its application for memory profiling. It can be used as a
> low-overhead solution to track memory leaks, rank memory consumers by
> the amount of memory they use, identify memory allocation hot paths
> and possible other use cases.
> Kent Overstreet and I worked on simplifying the solution, minimizing
> the overhead and implementing features requested during RFC review.
>
> Kent Overstreet, Michal Hocko, Johannes Weiner, Matthew Wilcox, Andrew
> Morton, David Hildenbrand, Vlastimil Babka, Roman Gushchin would be
> good participants.
>
> [1] https://lwn.net/Articles/906660/
> [2] https://lore.kernel.org/all/20220830214919.53220-1-surenb@google.com/

Sharing the slides here:
https://drive.google.com/file/d/1dBjYgk03hvaVAe7ph0Sad-zfr4Gw4irQ/view?usp=sharing


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2023-03-28 16:28 ` Vlastimil Babka
  2023-03-28 16:55   ` Kent Overstreet
@ 2023-05-10 16:28   ` Kent Overstreet
  2024-01-21 23:39     ` Pasha Tatashin
  1 sibling, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2023-05-10 16:28 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Suren Baghdasaryan, lsf-pc, linux-fsdevel, linux-mm

On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > We would like to continue the discussion about code tagging use for
> > memory allocation profiling. The code tagging framework [1] and its
> > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > has many applications proposed in the RFC but we would like to focus
> > on its application for memory profiling. It can be used as a
> > low-overhead solution to track memory leaks, rank memory consumers by
> > the amount of memory they use, identify memory allocation hot paths
> > and possible other use cases.
> > Kent Overstreet and I worked on simplifying the solution, minimizing
> > the overhead and implementing features requested during RFC review.
> 
> IIRC one large objection was the use of page_ext, I don't recall if you
> found another solution to that?

Hasn't been addressed yet, but we were just talking about moving the
codetag pointer from page_ext to page last night for memory overhead
reasons.

The disadvantage then is that the memory overhead doesn't go down if you
disable memory allocation profiling at boot time...

But perhaps the performance overhead is low enough now that this is not
something we expect to be doing as much?

Choices, choices...


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2023-05-10 16:28   ` Kent Overstreet
@ 2024-01-21 23:39     ` Pasha Tatashin
  2024-01-21 23:56       ` Kent Overstreet
  2024-01-22  1:18       ` Matthew Wilcox
  0 siblings, 2 replies; 9+ messages in thread
From: Pasha Tatashin @ 2024-01-21 23:39 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Vlastimil Babka, Suren Baghdasaryan, lsf-pc, linux-fsdevel,
	linux-mm

On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> > On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > > We would like to continue the discussion about code tagging use for
> > > memory allocation profiling. The code tagging framework [1] and its
> > > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > > has many applications proposed in the RFC but we would like to focus
> > > on its application for memory profiling. It can be used as a
> > > low-overhead solution to track memory leaks, rank memory consumers by
> > > the amount of memory they use, identify memory allocation hot paths
> > > and possible other use cases.
> > > Kent Overstreet and I worked on simplifying the solution, minimizing
> > > the overhead and implementing features requested during RFC review.
> >
> > IIRC one large objection was the use of page_ext, I don't recall if you
> > found another solution to that?
>
> Hasn't been addressed yet, but we were just talking about moving the
> codetag pointer from page_ext to page last night for memory overhead
> reasons.
>
> The disadvantage then is that the memory overhead doesn't go down if you
> disable memory allocation profiling at boot time...
>
> But perhaps the performance overhead is low enough now that this is not
> something we expect to be doing as much?
>
> Choices, choices...

I would like to participate in this discussion, specifically to
discuss how to make this profiling applicable in at-scale
environments, where we have many machines in the fleet and the memory
and performance overheads must be much smaller than what is
currently proposed.

There are several ideas that we can discuss:
1. Filtering the files that are going to be tagged at build time.
For example, if a specific driver does not need to be tagged, it can be
filtered out during the build.

2. Reducing the memory overhead by not using a page_ext pointer, and
instead using n bits in page->flags.

The number of buckets is actually not that large, so there is no need to
keep an 8-byte pointer in page_ext; it could be an index into an array
of a specific size. There could be buckets that contain several stacks.

3. Using static branches for performance optimizations, especially for
the cases when profiling is disabled.

4. Optionally enabling profiling for only specific allocators:
kmalloc/pgalloc/vmalloc/pcp etc.

Pasha


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2024-01-21 23:39     ` Pasha Tatashin
@ 2024-01-21 23:56       ` Kent Overstreet
  2024-01-22  1:18       ` Matthew Wilcox
  1 sibling, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2024-01-21 23:56 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: Vlastimil Babka, Suren Baghdasaryan, lsf-pc, linux-fsdevel,
	linux-mm

On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> > On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> > > On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > > > We would like to continue the discussion about code tagging use for
> > > > memory allocation profiling. The code tagging framework [1] and its
> > > > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > > > has many applications proposed in the RFC but we would like to focus
> > > > on its application for memory profiling. It can be used as a
> > > > low-overhead solution to track memory leaks, rank memory consumers by
> > > > the amount of memory they use, identify memory allocation hot paths
> > > > and possible other use cases.
> > > > Kent Overstreet and I worked on simplifying the solution, minimizing
> > > > the overhead and implementing features requested during RFC review.
> > >
> > > IIRC one large objection was the use of page_ext, I don't recall if you
> > > found another solution to that?
> >
> > Hasn't been addressed yet, but we were just talking about moving the
> > codetag pointer from page_ext to page last night for memory overhead
> > reasons.
> >
> > The disadvantage then is that the memory overhead doesn't go down if you
> > disable memory allocation profiling at boot time...
> >
> > But perhaps the performance overhead is low enough now that this is not
> > something we expect to be doing as much?
> >
> > Choices, choices...
> 
> I would like to participate in this discussion, specifically to
> discuss how to make this profiling applicable at the scale
> environment. Where we have many machines in the fleet, but the memory
> and performance overheads must be much smaller compared to what is
> currently proposed.
> 
> There are several ideas that we can discuss:
> 1. Filtering files that are going to be tagged at the build time.
> For example, If a specific driver does not need to be tagged it can be
> filtered out during build time.

Not a bad idea - but do we have a concrete reason we want this? Our goal
has been low enough overhead to be enabled in production, and I think
we're delivering on that; perhaps we could wait and see if anyone
complains.

We've already got the runtime switch (via a static branch), so if
overhead is the concern that should cover that.
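
As a rough userspace analogue of such a runtime switch: kernel static
keys (static_branch_unlikely() and friends) patch the branch in the
instruction stream, which plain C cannot reproduce, so this sketch
substitutes an ordinary boolean. The names mem_profiling_enabled,
mem_profiling_set and account_alloc are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a static key; a plain flag, not code patching. */
static bool mem_profiling_enabled;
static size_t accounted_bytes;

static void mem_profiling_set(bool on)
{
	mem_profiling_enabled = on;
}

/* Accounting hook: a single test and return when profiling is off. */
static void account_alloc(size_t size)
{
	if (!mem_profiling_enabled)
		return;
	assert(accounted_bytes <= SIZE_MAX - size);
	accounted_bytes += size;
}
```

With a real static key the disabled case costs roughly a no-op at each
hook rather than a load and compare.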

> 2. Reducing the memory overhead by not using page_ext pointer, but
> instead use n-bits in the page->flags.
>
> The number of buckets is actually not that large, there is no need to
> keep 8-byte pointer in page_ext, it could be an idx in an array of a
> specific size. There could be buckets that contain several stacks.

A single tag index maps directly to the pointer it replaces, so we
should be able to do this.
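
A sketch of what that mapping could look like, assuming a 64-bit flags
word and a 16-bit index stored in its top bits. The width, the bit
position and the names (tag_table, flags_set_tag, flags_get_tag) are
assumptions made for the example; the real page->flags layout is
different and config-dependent.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS	16u			/* assumed index width */
#define TAG_SHIFT	(64u - TAG_BITS)	/* assumed position: top bits */
#define TAG_MASK	(((UINT64_C(1) << TAG_BITS) - 1) << TAG_SHIFT)

struct tag {
	const char *name;
};

/* All tags live in one array, so an index identifies a tag uniquely. */
static struct tag tag_table[1u << TAG_BITS];

/* Store the tag's array index in the flags word, keeping other bits. */
static uint64_t flags_set_tag(uint64_t flags, const struct tag *t)
{
	uint64_t idx = (uint64_t)(t - tag_table);

	assert(idx < (UINT64_C(1) << TAG_BITS));
	return (flags & ~TAG_MASK) | (idx << TAG_SHIFT);
}

/* Recover the pointer that the index replaces. */
static struct tag *flags_get_tag(uint64_t flags)
{
	return &tag_table[(flags & TAG_MASK) >> TAG_SHIFT];
}
```

One index value (for example 0) would presumably need to be reserved
to mean "no tag".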

> 3. Using static branches for performance optimizations, especially for
> the cases when profiling is disabled.

Already are :)

> 4. Optionally enable only a specific allocator profiling:
> kmalloc/pgalloc/vmalloc/pcp etc.

See above - I'd prefer to be restrained with the knobs we add.


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2024-01-21 23:39     ` Pasha Tatashin
  2024-01-21 23:56       ` Kent Overstreet
@ 2024-01-22  1:18       ` Matthew Wilcox
  2024-01-22  3:29         ` Suren Baghdasaryan
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2024-01-22  1:18 UTC (permalink / raw)
  To: Pasha Tatashin, g
  Cc: Kent Overstreet, Vlastimil Babka, Suren Baghdasaryan, lsf-pc,
	linux-fsdevel, linux-mm

On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> > Hasn't been addressed yet, but we were just talking about moving the
> > codetag pointer from page_ext to page last night for memory overhead
> > reasons.
> >
> > The disadvantage then is that the memory overhead doesn't go down if you
> > disable memory allocation profiling at boot time...
> >
> > But perhaps the performance overhead is low enough now that this is not
> > something we expect to be doing as much?
> >
> > Choices, choices...
> 
> I would like to participate in this discussion, specifically to

Umm, this is a discussion proposal for last year, not this year.  I
don't remember if a followup discussion has been proposed for this year?

> 2. Reducing the memory overhead by not using page_ext pointer, but
> instead use n-bits in the page->flags.
> 
> The number of buckets is actually not that large, there is no need to
> keep 8-byte pointer in page_ext, it could be an idx in an array of a
> specific size. There could be buckets that contain several stacks.

There are a lot of people using "n bits in page->flags" and I don't
have a good feeling for how many we really have left.  MGLRU uses a
variable number of bits.  There's PG_arch_2 and PG_arch_3.  There's
PG_uncached.  There's PG_young and PG_idle.  And of course we have
NUMA node (10 bits?), section (?), zone (3 bits?).  I count 28 bits
allocated with all the CONFIG enabled, then 13 for node+zone, so it
certainly seems like there's a lot free on 64-bit, but it'd be
nice to have it written out properly.
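
Writing out just the rough figures above as a sketch: 28 flag bits
plus 13 for node+zone leaves 23 of 64, before any section bits are
subtracted. The section width is left as a parameter because it is
unknown here, and whether the MGLRU bits are inside the 28 is not
settled by this count. The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

enum {
	WORD_BITS	= 64,
	FLAG_BITS	= 28,	/* counted with all the CONFIG enabled */
	NODE_ZONE_BITS	= 13,	/* 10 (node) + 3 (zone), both uncertain */
};

/* Bits left in a 64-bit flags word for a given section width. */
static int free_flag_bits(int section_bits)
{
	assert(section_bits >= 0);
	return WORD_BITS - FLAG_BITS - NODE_ZONE_BITS - section_bits;
}

/* Number of distinct tag indices that fit in the given number of bits. */
static uint64_t max_tag_indices(int bits)
{
	return (bits > 0 && bits < 64) ? UINT64_C(1) << bits : 0;
}
```

With no section bits this gives 23 free bits, i.e. room for 8388608
distinct indices.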

Related, what do we think is going to happen with page_ext in a memdesc
world (also what's going to happen with the kmsan goop in struct page?)

I see page_idle_ops, page_owner_ops and page_table_check_ops.
page_idle_ops only uses the 8 byte flags.  page_owner_ops uses an extra
64 bytes (!).  page_table_check uses an extra 8 bytes.

page_idle looks to be for folios only.  page_table_check seems like
it should be folded into pgdesc.  page_owner maybe gets added to every
allocation rather than every page (but that's going to be interesting
for memdescs which don't normally need an allocation).

That seems to imply that we can get rid of page_ext entirely, which will
be nice.  I don't understand kmsan well enough to understand what to
do about it.  If it's per-allocation, we can handle it like page_owner.
If it really is per-page, we can make it an ifdef in struct page itself.
I think it's OK to grow struct page for such a rarely used debugging
option.


* Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
  2024-01-22  1:18       ` Matthew Wilcox
@ 2024-01-22  3:29         ` Suren Baghdasaryan
  0 siblings, 0 replies; 9+ messages in thread
From: Suren Baghdasaryan @ 2024-01-22  3:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Pasha Tatashin, g, Kent Overstreet, Vlastimil Babka, lsf-pc,
	linux-fsdevel, linux-mm

On Sun, Jan 21, 2024 at 5:18 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> > On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> > <kent.overstreet@linux.dev> wrote:
> > > Hasn't been addressed yet, but we were just talking about moving the
> > > codetag pointer from page_ext to page last night for memory overhead
> > > reasons.
> > >
> > > The disadvantage then is that the memory overhead doesn't go down if you
> > > disable memory allocation profiling at boot time...
> > >
> > > But perhaps the performance overhead is low enough now that this is not
> > > something we expect to be doing as much?
> > >
> > > Choices, choices...
> >
> > I would like to participate in this discussion, specifically to
>
> Umm, this is a discussion proposal for last year, not this.  I don't
> remember if a followup discussion has been proposed for this year?

My bad. I should submit a proposal for a followup discussion for this
year. Will do that this coming week.

>
> > 2. Reducing the memory overhead by not using page_ext pointer, but
> > instead use n-bits in the page->flags.
> >
> > The number of buckets is actually not that large, there is no need to
> > keep 8-byte pointer in page_ext, it could be an idx in an array of a
> > specific size. There could be buckets that contain several stacks.
>
> There are a lot of people using "n bits in page->flags" and I don't
> have a good feeling for how many we really have left.  MGLRU uses a
> variable number of bits.  There's PG_arch_2 and PG_arch_3.  There's
> PG_uncached.  There's PG_young and PG_idle.  And of course we have
> NUMA node (10 bits?), section (?), zone (3 bits?)  I count 28 bits
> allocated with all the CONFIG enabled, then 13 for node+zone, so it
> certainly seems like there's a lot free on 64-bit, but it'd be
> nice to have it written out properly.
>
> Related, what do we think is going to happen with page_ext in a memdesc
> world (also what's going to happen with the kmsan goop in struct page?)
>
> I see page_idle_ops, page_owner_ops and page_table_check_ops.
> page_idle_ops only uses the 8 byte flags.  page_owner_ops uses an extra
> 64 bytes (!).  page_table_check uses an extra 8 bytes.
>
> page_idle looks to be for folios only.  page_table_check seems like
> it should be folded into pgdesc.  page_owner maybe gets added to every
> allocation rather than every page (but that's going to be interesting
> for memdescs which don't normally need an allocation).
>
> That seems to imply that we can get rid of page_ext entirely, which will
> be nice.  I don't understand kmsan well enough to understand what to
> do about it.  If it's per-allocation, we can handle it like page_owner.
> If it really is per-page, we can make it an ifdef in struct page itself.
> I think it's OK to grow struct page for such a rarely used debugging
> option.
