Re: [PATCH v7 3/4] mm,page_owner: Display all stacks and their count

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oscar Salvador <osalvador@suse.de>
To: Marco Elver <elver@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Alexander Potapenko <glider@google.com>
Subject: Re: [PATCH v7 3/4] mm,page_owner: Display all stacks and their count
Date: Fri, 9 Feb 2024 22:52:48 +0100	[thread overview]
Message-ID: <ZcaesCP4mY-94ciJ@localhost.localdomain> (raw)
In-Reply-To: <CANpmjNNcPr=pPco_HN9nXBabubtfo02SAH=taZGNCvYDq42YUQ@mail.gmail.com>

On Fri, Feb 09, 2024 at 09:00:40AM +0100, Marco Elver wrote:
> > +/**
> > + * stack_depot_get_next_stack - Returns all stacks, one at a time
> 
> "Returns all stack_records" to be clear that this is returning the struct.

Fixed.


> 
> > + *
> > + * @table:     Current table we are checking
> > + * @bucket:    Current bucket we are checking
> > + * @last_found:        Last stack that was found
> > + *
> > + * This function finds first a non-empty bucket and returns the first stack
> > + * stored in it. On consequent calls, it walks the bucket to see whether
> > + * it contains more stacks.
> > + * Once we have walked all the stacks in a bucket, we check
> > + * the next one, and we repeat the same steps until we have checked all of them
> 
> I think for this function it's important to say that no entry returned
> from this function can be evicted.
> 
> I.e. the easiest way to ensure this is that the caller makes sure the
> entries returned are never passed to stack_depot_put() - which is
> certainly the case for your usecase because you do not use
> stack_depot_put().
> 
> > + * Return: A pointer a to stack_record struct, or NULL when we have walked all
> > + * buckets.
> > + */
> > +struct stack_record *stack_depot_get_next_stack(unsigned long *table,
> 
> To keep consistent, I'd also call this
> __stack_depot_get_next_stack_record(), so that we're clear this is
> more of an internal function not for general usage.
> 
> > +                                               struct list_head **bucket,
> > +                                               struct stack_record **last_found);
> > +
> >  /**
> >   * stack_depot_fetch - Fetch a stack trace from stack depot
> >   *
> > diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> > index 197c355601f9..107bd0174cd6 100644
> > --- a/lib/stackdepot.c
> > +++ b/lib/stackdepot.c
> > @@ -782,6 +782,52 @@ unsigned int stack_depot_get_extra_bits(depot_stack_handle_t handle)
> >  }
> >  EXPORT_SYMBOL(stack_depot_get_extra_bits);
> >
> > +struct stack_record *stack_depot_get_next_stack(unsigned long *table,
> > +                                               struct list_head **curr_bucket,
> > +                                               struct stack_record **last_found)
> > +{
> > +       struct list_head *bucket = *curr_bucket;
> > +       unsigned long nr_table = *table;
> > +       struct stack_record *found = NULL;
> > +       unsigned long stack_table_entries = stack_hash_mask + 1;
> > +
> > +       rcu_read_lock_sched_notrace();
> 
> We are returning pointers to stack_records out of the RCU-read
> critical section, which are then later used to continue the iteration.
> list_for_each_entry_continue_rcu() says this is fine if "... you held
> some sort of non-RCU reference (such as a reference count) ...".
> Updating the function's documentation to say none of these entries can
> be evicted via a stack_depot_put() is required.

Thinking about it some more, I think I made a mistake:

I am walking all buckets, and within those buckets there are not only
page_owner stack_records, which means that I could return a stack_record
from e.g: KASAN (which I think can evict stack_records) and then
everything goes off the rails.
Which means I cannot walk the buckets like that.

Actually, I think that having something like the following

 struct list_stack_records {
      struct stack_record *stack;
      struct list_stack_records *next;
 }

in page_owner would make sense.
Then the only thing I would have to do is to add a new record on every
new stack_record, and then I could just walk the list like a linked
list.

Which means that the function stack_depot_get_next_stack() could be
killed because everything would happen in page_owner code.

e.g:

 static void inc_stack_record_count(depot_stack_handle_t handle)
 {
         struct stack_record *stack = __stack_depot_get_stack_record(handle);
 
         if (stack) {
                 /*
                  * New stack_record's that do not use STACK_DEPOT_FLAG_GET start
                  * with REFCOUNT_SATURATED to catch spurious increments of their
                  * refcount.
                  * Since we do not use STACK_DEPOT_FLAG_{GET,PUT} API, let us
                  * set a refcount of 1 ourselves.
                  */
                 if (refcount_read(&stack->count) == REFCOUNT_SATURATED) {
                         refcount_set(&stack->count, 1);
			 add_new_stack_record_into_the_list(stack)
		 }
                 refcount_inc(&stack->count);
         }
 }

and then just walk the list_stack_records list whenever we want to
show the stacktraces and their counting.

I think that overall this approach is cleaner and safer.

-- 
Oscar Salvador
SUSE Labs

next prev parent reply	other threads:[~2024-02-09 21:51 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-08 23:45 [PATCH v7 0/4] page_owner: print stacks and their outstanding allocations Oscar Salvador
2024-02-08 23:45 ` [PATCH v7 1/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
2024-02-09  7:45   ` Marco Elver
2024-02-09 21:33     ` Oscar Salvador
2024-02-09 17:39   ` kernel test robot
2024-02-10  9:59   ` kernel test robot
2024-02-08 23:45 ` [PATCH v7 2/4] mm,page_owner: Implement the tracking of the stacks count Oscar Salvador
2024-02-09  7:37   ` Marco Elver
2024-02-09  7:45   ` Marco Elver
2024-02-09 21:39     ` Oscar Salvador
2024-02-09 21:42       ` Marco Elver
2024-02-09 21:44         ` Marco Elver
2024-02-11 20:42           ` Oscar Salvador
2024-02-08 23:45 ` [PATCH v7 3/4] mm,page_owner: Display all stacks and their count Oscar Salvador
2024-02-09  8:00   ` Marco Elver
2024-02-09 21:52     ` Oscar Salvador [this message]
2024-02-09 23:14       ` Oscar Salvador
2024-02-10  7:52         ` Marco Elver
2024-02-11 20:39           ` Oscar Salvador
2024-02-12 10:47             ` Vlastimil Babka
2024-02-09 23:14   ` kernel test robot
2024-02-08 23:45 ` [PATCH v7 4/4] mm,page_owner: Filter out stacks by a threshold Oscar Salvador
2024-02-09  0:28 ` [PATCH v7 0/4] page_owner: print stacks and their outstanding allocations Andrew Morton
2024-02-09 21:31   ` Oscar Salvador
2024-02-09  8:03 ` Marco Elver
2024-02-09 21:32   ` Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZcaesCP4mY-94ciJ@localhost.localdomain \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.