Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold

From: Joel Fernandes <joel@joelfernandes.org>
To: Daniel Colascione <dancol@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Tim Murray <timmurray@google.com>,
	Carmen Jackson <carmenjackson@google.com>,
	Mayank Gupta <mayankgupta@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	kernel-team <kernel-team@android.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Jerome Glisse <jglisse@redhat.com>, linux-mm <linux-mm@kvack.org>,
	Matthew Wilcox <willy@infradead.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Tom Zanussi <zanussi@kernel.org>
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
Date: Thu, 5 Sep 2019 23:01:42 -0400
Message-ID: <20190906030142.GA29926@google.com>
In-Reply-To: <CAKOZuevJyfZRFz3M5myLy+XpS=mAxYCf+oQ2csxCHh7VO-OrKw@mail.gmail.com>

On Thu, Sep 05, 2019 at 06:15:43PM -0700, Daniel Colascione wrote:
[snip]
> > > > > > > The bigger improvement with the threshold is that the number of trace
> > > > > > > records is almost halved: it went from 4.6K to 2.6K.
> > > > > >
> > > > > > Steven, would it be feasible to add a generic tracepoint throttling?
> > > > >
> > > > > I might misunderstand this but is the issue here actually throttling
> > > > > of the sheer number of trace records or tracing large enough changes
> > > > > > to RSS that a user might care about? Small changes happen all the time
> > > > > but we are likely not interested in those. Surely we could postprocess
> > > > > the traces to extract changes large enough to be interesting but why
> > > > > capture uninteresting information in the first place? IOW the
> > > > > throttling here should be based not on the time between traces but on
> > > > > the amount of change of the traced signal. Maybe a generic facility
> > > > > like that would be a good idea?
> > > >
> > > > You mean like add a trigger (or filter) that only traces if a field has
> > > > changed since the last time the trace was hit? Hmm, I think we could
> > > > possibly do that. Perhaps even now with histogram triggers?
> > >
> > > I was thinking along the same lines. The histogram subsystem seems
> > > like a very good fit here. Histogram triggers already let users talk
> > > about specific fields of trace events, aggregate them in configurable
> > > ways, and (importantly, IMHO) create synthetic new trace events that
> > > the kernel emits under configurable conditions.
> >
> > Hmm, I think this tracing feature would be a good idea. But in order not to
> > gate this patch, can we agree on keeping a temporary threshold for this
> > patch? Once such an idea is implemented in the trace subsystem, we can
> > remove the temporary filter.
> >
> > As Tim said, we don't want our traces flooded and this is a very useful
> > tracepoint as proven in our internal usage at Android. The threshold filter
> > is just a few lines of code.
> 
> I'm not sure the threshold filtering code you've added does the right
> thing: we don't keep state, so if a counter constantly flips between
> one "side" of the TRACE_MM_COUNTER_THRESHOLD and the other, we'll emit
> ftrace events at high frequency. More generally, this filtering
> couples the rate of counter logging to the *value* of the counter ---
> that is, we log ftrace events at different times depending on how much
> memory we happen to have used --- and that's not ideal from a
> predictability POV.
> 
> All things being equal, I'd prefer that we get things upstream as fast
> as possible. But in this case, I'd rather wait for a general-purpose
> filtering facility (whether that facility is based on histogram, eBPF,
> or something else) rather than hardcode one particular fixed filtering
> strategy (which might be suboptimal) for one particular kind of event.
> Is there some special urgency here?
> 
> How about we instead add non-filtered tracepoints for the mm counters?
> These tracepoints will still be free when turned off.
> 
> Having added the basic tracepoints, we can discuss separately how to
> do the rate limiting. Maybe instead of providing direct support for
> the algorithm that I described above, we can just use a BPF program as
> a yes/no predicate for whether to log to ftrace. That'd get us to the
> same place as this patch, but more flexibly, right?
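
To make the state-keeping concern above concrete, here is a rough userspace
sketch. It is not kernel code and not taken from the patch. It assumes the
threshold check fires whenever an update carries the counter across a multiple
of the threshold; THRESHOLD, crosses_boundary(), struct delta_filter and
delta_should_log() are hypothetical names chosen for illustration. The second
helper shows one possible stateful alternative that fires on the amount of
change since the last logged value.

```c
#include <stdbool.h>

/* Hypothetical value; must be a power of two for the mask trick below. */
#define THRESHOLD 128L

/*
 * Stateless check (assumed shape of the filter under discussion): fire
 * when an update moves the counter across a multiple of THRESHOLD.
 * Counters are assumed to be non-negative.
 */
static bool crosses_boundary(long old_count, long new_count)
{
	long mask = ~(THRESHOLD - 1);

	return (old_count & mask) != (new_count & mask);
}

/* Stateful alternative: remember the last value that was logged. */
struct delta_filter {
	long last_logged;
};

/* Fire only once the counter has drifted THRESHOLD away from last_logged. */
static bool delta_should_log(struct delta_filter *f, long new_count)
{
	long diff = new_count - f->last_logged;

	if (diff < 0)
		diff = -diff;
	if (diff < THRESHOLD)
		return false;

	f->last_logged = new_count;
	return true;
}
```

With these definitions, a counter bouncing between 127 and 128 makes
crosses_boundary() fire on every update, while delta_should_log() fires once
at 128 and then stays quiet until the counter reaches 0 or 256. The price is
one extra long of state per counter.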

I chatted with Daniel offline and we agreed to remove the threshold, which is
also what Michal wants.

So I'll resubmit this patch with the threshold removed, and we'll look into
doing the filtering through more generic mechanisms such as BPF.
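
As a rough illustration of the "yes/no predicate" idea, here is a plain C
sketch in which a function pointer stands in for an attached BPF program.
Every name below (struct rss_event, log_predicate, struct tracepoint_hook,
emit_rss_event(), at_least()) is hypothetical. None of it is an existing
kernel or BPF API, and a real implementation may look quite different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical payload of an unfiltered mm counter event. */
struct rss_event {
	int member;
	long count;
};

/* Filtering policy: return true to log the event, false to drop it. */
typedef bool (*log_predicate)(const struct rss_event *ev, void *ctx);

struct tracepoint_hook {
	log_predicate pred;	/* NULL means "log everything" */
	void *ctx;		/* opaque state owned by the policy */
	unsigned long emitted;	/* stand-in for records written to the trace */
};

/* The event site stays policy-free and only consults the predicate. */
static void emit_rss_event(struct tracepoint_hook *h,
			   const struct rss_event *ev)
{
	if (h->pred && !h->pred(ev, h->ctx))
		return;
	h->emitted++;
}

/* Example policy: log only counters at or above a caller-chosen minimum. */
static bool at_least(const struct rss_event *ev, void *ctx)
{
	return ev->count >= *(const long *)ctx;
}
```

The point of this shape is that the event site itself carries no threshold:
swapping at_least() for a delta-based or rate-based policy would not touch
emit_rss_event().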

thanks all!

 - Joel