From: Joel Fernandes <joel@joelfernandes.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Tim Murray <timmurray@google.com>,
Carmen Jackson <carmenjackson@google.com>,
mayankgupta@google.com, Daniel Colascione <dancol@google.com>,
Steven Rostedt <rostedt@goodmis.org>,
Minchan Kim <minchan@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
kernel-team <kernel-team@android.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Dan Williams <dan.j.williams@intel.com>,
Jerome Glisse <jglisse@redhat.com>, linux-mm <linux-mm@kvack.org>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@suse.cz>,
Ralph Campbell <rcampbell@nvidia.com>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
Date: Wed, 4 Sep 2019 01:02:40 -0400
Message-ID: <20190904050240.GD144846@google.com>
In-Reply-To: <CAJuCfpEXpYq2i3zNbJ3w+R+QXTuMyzwL6S9UpiGEDvTioKORhQ@mail.gmail.com>
On Tue, Sep 03, 2019 at 09:44:51PM -0700, Suren Baghdasaryan wrote:
> On Tue, Sep 3, 2019 at 1:09 PM Joel Fernandes (Google)
> <joel@joelfernandes.org> wrote:
> >
> > Useful to track how RSS is changing per TGID to detect spikes in RSS and
> > memory hogs. Several Android teams have been using this patch in various
> > kernel trees for half a year now. Many reported to me it is really
> > useful so I'm posting it upstream.
> >
> > Initial patch developed by Tim Murray. Changes I made from original patch:
> > o Prevent any additional space consumed by mm_struct.
> > o Keep overhead low by checking if tracing is enabled.
> > o Add some noise reduction and lower overhead by emitting only on
> > threshold changes.
> >
> > Co-developed-by: Tim Murray <timmurray@google.com>
> > Signed-off-by: Tim Murray <timmurray@google.com>
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> >
> > ---
> >
> > v1->v2: Added more commit message.
> >
> > Cc: carmenjackson@google.com
> > Cc: mayankgupta@google.com
> > Cc: dancol@google.com
> > Cc: rostedt@goodmis.org
> > Cc: minchan@kernel.org
> > Cc: akpm@linux-foundation.org
> > Cc: kernel-team@android.com
> >
> > include/linux/mm.h | 14 +++++++++++---
> > include/trace/events/kmem.h | 21 +++++++++++++++++++++
> > mm/memory.c | 20 ++++++++++++++++++++
> > 3 files changed, 52 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0334ca97c584..823aaf759bdb 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1671,19 +1671,27 @@ static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
> > return (unsigned long)val;
> > }
> >
> > +void mm_trace_rss_stat(int member, long count, long value);
> > +
> > static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
> > {
> > - atomic_long_add(value, &mm->rss_stat.count[member]);
> > + long count = atomic_long_add_return(value, &mm->rss_stat.count[member]);
> > +
> > + mm_trace_rss_stat(member, count, value);
> > }
> >
> > static inline void inc_mm_counter(struct mm_struct *mm, int member)
> > {
> > - atomic_long_inc(&mm->rss_stat.count[member]);
> > + long count = atomic_long_inc_return(&mm->rss_stat.count[member]);
> > +
> > + mm_trace_rss_stat(member, count, 1);
> > }
> >
> > static inline void dec_mm_counter(struct mm_struct *mm, int member)
> > {
> > - atomic_long_dec(&mm->rss_stat.count[member]);
> > + long count = atomic_long_dec_return(&mm->rss_stat.count[member]);
> > +
> > + mm_trace_rss_stat(member, count, -1);
> > }
> >
> > /* Optimized variant when page is already known not to be PageAnon */
> > diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> > index eb57e3037deb..8b88e04fafbf 100644
> > --- a/include/trace/events/kmem.h
> > +++ b/include/trace/events/kmem.h
> > @@ -315,6 +315,27 @@ TRACE_EVENT(mm_page_alloc_extfrag,
> > __entry->change_ownership)
> > );
> >
> > +TRACE_EVENT(rss_stat,
> > +
> > + TP_PROTO(int member,
> > + long count),
> > +
> > + TP_ARGS(member, count),
> > +
> > + TP_STRUCT__entry(
> > + __field(int, member)
> > + __field(long, size)
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->member = member;
> > + __entry->size = (count << PAGE_SHIFT);
> > + ),
> > +
> > + TP_printk("member=%d size=%ldB",
> > + __entry->member,
> > + __entry->size)
> > + );
> > #endif /* _TRACE_KMEM_H */
> >
> > /* This part must be outside protection */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e2bb51b6242e..9d81322c24a3 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -72,6 +72,8 @@
> > #include <linux/oom.h>
> > #include <linux/numa.h>
> >
> > +#include <trace/events/kmem.h>
> > +
> > #include <asm/io.h>
> > #include <asm/mmu_context.h>
> > #include <asm/pgalloc.h>
> > @@ -140,6 +142,24 @@ static int __init init_zero_pfn(void)
> > }
> > core_initcall(init_zero_pfn);
> >
> > +/*
> > + * This threshold is the boundary in the value space, that the counter has to
> > + * advance before we trace it. Should be a power of 2. It is to reduce unwanted
> > + * trace overhead. The counter is in units of number of pages.
> > + */
> > +#define TRACE_MM_COUNTER_THRESHOLD 128
>
> IIUC the counter has to change by 128 pages (512kB assuming 4kB pages)
> before the change gets traced. Would it make sense to make this step
> size configurable? For a system with limited memory size change of
> 512kB might be considerable while on systems with plenty of memory
> that might be negligible. Not even mentioning possible difference in
> page sizes. Maybe something like
> /sys/kernel/debug/tracing/rss_step_order with
> TRACE_MM_COUNTER_THRESHOLD=(1<<rss_step_order)?

To be honest, I would rather not complicate this further. It is already a
bit complex, and I am not sure what we would gain by making it as
configurable as you suggest. The threshold is only a small improvement to
cut trace noise and overhead; it is not meant to be optimal. If this
granularity turns out to be a problem in your tracing, we can revisit it
then.

thanks,

 - Joel

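For reference, the arithmetic behind the 512kB figure, and what a step of
(1 << rss_step_order) pages would cover with other page sizes, can be
sketched in userspace as below. This is only an illustration:
rss_step_bytes() and its parameters are hypothetical names, not part of the
patch, and page_shift stands in for the kernel's PAGE_SHIFT.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper (not in the patch): number of bytes covered by one
 * tracing step of (1 << step_order) pages, given the page size as a shift.
 */
static unsigned long rss_step_bytes(unsigned int step_order,
				    unsigned int page_shift)
{
	/* Guard against shifting past the width of unsigned long. */
	assert(step_order + page_shift < sizeof(unsigned long) * CHAR_BIT);

	return 1UL << (step_order + page_shift);
}
```

With step_order = 7 (128 pages) and page_shift = 12 (4kB pages) this gives
512kB, matching the number quoted above; with 64kB pages the same 128-page
step would cover 8MB.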
> > +void mm_trace_rss_stat(int member, long count, long value)
> > +{
> > + long thresh_mask = ~(TRACE_MM_COUNTER_THRESHOLD - 1);
> > +
> > + if (!trace_rss_stat_enabled())
> > + return;
> > +
> > + /* Threshold roll-over, trace it */
> > + if ((count & thresh_mask) != ((count - value) & thresh_mask))
> > + trace_rss_stat(member, count);
> > +}
> >
> > #if defined(SPLIT_RSS_COUNTING)
> >
> > --
> > 2.23.0.187.g17f5b7556c-goog
> >
> > --
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> >
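
As an illustration of the roll-over check in mm_trace_rss_stat() above, here
is a minimal userspace sketch of the same mask comparison. The helper name
rss_crossed_threshold() is an assumption made for this example; only the
threshold value and the comparison itself are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as in the patch; the counter is in units of pages. */
#define TRACE_MM_COUNTER_THRESHOLD 128

/* The mask trick below only works for a power-of-2 threshold. */
static_assert((TRACE_MM_COUNTER_THRESHOLD &
	       (TRACE_MM_COUNTER_THRESHOLD - 1)) == 0,
	      "threshold must be a power of 2");

/*
 * Hypothetical helper (name not from the patch): returns true when applying
 * 'value' moved the counter from one 128-page bucket into another.
 * 'count' is the counter after the update, so (count - value) is the
 * counter before it.
 */
static bool rss_crossed_threshold(long count, long value)
{
	long thresh_mask = ~(TRACE_MM_COUNTER_THRESHOLD - 1);

	return (count & thresh_mask) != ((count - value) & thresh_mask);
}
```

Going from 127 to 128 pages crosses a boundary and would be traced, as would
dropping from 128 back to 127; going from 128 to 129 stays in the same bucket
and would not. A single large update such as 100 to 300 pages is reported
once, even though it passes two boundaries.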