public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: rostedt@goodmis.org, daniel.wagner@bmw-carit.de,
	masami.hiramatsu.pt@hitachi.com, josh@joshtriplett.org,
	andi@firstfloor.org, mathieu.desnoyers@efficios.com,
	peterz@infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v11 08/28] tracing: Add lock-free tracing_map
Date: Mon, 2 Nov 2015 16:08:05 +0900	[thread overview]
Message-ID: <20151102070805.GB17941@sejong> (raw)
In-Reply-To: <1446143743.4661.58.camel@tzanussi-mobl.amr.corp.intel.com>

Hi Tom,

On Thu, Oct 29, 2015 at 01:35:43PM -0500, Tom Zanussi wrote:
> Hi Namhyung,
> 
> On Thu, 2015-10-29 at 17:31 +0900, Namhyung Kim wrote:
> > Hi Tom,
> > 
> > On Thu, Oct 22, 2015 at 01:14:12PM -0500, Tom Zanussi wrote:
> > > Add tracing_map, a special-purpose lock-free map for tracing.
> > > 
> > > tracing_map is designed to aggregate or 'sum' one or more values
> > > associated with a specific object of type tracing_map_elt, which
> > > is associated by the map to a given key.
> > > 
> > > It provides various hooks allowing per-tracer customization and is
> > > separated out into a separate file in order to allow it to be shared
> > > between multiple tracers, but isn't meant to be generally used outside
> > > of that context.
> > > 
> > > The tracing_map implementation was inspired by lock-free map
> > > algorithms originated by Dr. Cliff Click:
> > > 
> > >  http://www.azulsystems.com/blog/cliff/2007-03-26-non-blocking-hashtable
> > >  http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf
> > > 
> > > Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
> > > Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > ---
> > > +/**
> > > + * tracing_map_insert - Insert key and/or retrieve val from a tracing_map
> > > + * @map: The tracing_map to insert into
> > > + * @key: The key to insert
> > > + *
> > > + * Inserts a key into a tracing_map and creates and returns a new
> > > + * tracing_map_elt for it, or if the key has already been inserted by
> > > + * a previous call, returns the tracing_map_elt already associated
> > > + * with it.  When the map was created, the number of elements to be
> > > + * allocated for the map was specified (internally maintained as
> > > + * 'max_elts' in struct tracing_map), and that number of
> > > + * tracing_map_elts was created by tracing_map_init().  This is the
> > > + * pre-allocated pool of tracing_map_elts that tracing_map_insert()
> > > + * will allocate from when adding new keys.  Once that pool is
> > > + * exhausted, tracing_map_insert() is useless and will return NULL to
> > > + * signal that state.
> > > + *
> > > + * This is a lock-free tracing map insertion function implementing a
> > > + * modified form of Cliff Click's basic insertion algorithm.  It
> > > + * requires the table size be a power of two.  To prevent any
> > > + * possibility of an infinite loop we always make the internal table
> > > + * size double the size of the requested table size (max_elts * 2).
> > > + * Likewise, we never reuse a slot or resize or delete elements - when
> > > + * we've reached max_elts entries, we simply return NULL once we've
> > > + * run out of entries.  Readers can at any point in time traverse the
> > > + * tracing map and safely access the key/val pairs.
> > > + *
> > > + * Return: the tracing_map_elt pointer val associated with the key.
> > > + * If this was a newly inserted key, the val will be a newly allocated
> > > + * and associated tracing_map_elt pointer val.  If the key wasn't
> > > + * found and the pool of tracing_map_elts has been exhausted, NULL is
> > > + * returned and no further insertions will succeed.
> > > + */
> > > +struct tracing_map_elt *tracing_map_insert(struct tracing_map *map, void *key)
> > > +{
> > > +	u32 idx, key_hash, test_key;
> > > +	struct tracing_map_entry *entry;
> > > +
> > > +	key_hash = jhash(key, map->key_size, 0);
> > > +	if (key_hash == 0)
> > > +		key_hash = 1;
> > > +	idx = key_hash >> (32 - (map->map_bits + 1));
> > > +
> > > +	while (1) {
> > > +		idx &= (map->map_size - 1);
> > > +		entry = TRACING_MAP_ENTRY(map->map, idx);
> > > +		test_key = entry->key;
> > > +
> > > +		if (test_key && test_key == key_hash && entry->val &&
> > > +		    keys_match(key, entry->val->key, map->key_size))
> > > +			return entry->val;
> > > +
> > > +		if (!test_key && !cmpxchg(&entry->key, 0, key_hash)) {
> > > +			struct tracing_map_elt *elt;
> > > +
> > > +			elt = get_free_elt(map);
> > > +			if (!elt)
> > > +				break;
> > > +			memcpy(elt->key, key, map->key_size);
> > > +			entry->val = elt;
> > > +
> > > +			return entry->val;
> > > +		}
> > > +		idx++;
> > > +	}
> > > +
> > > +	return NULL;
> > > +}
> > 
> > IIUC this always insert new entry if no matching key found.  And if
> > the map is full, it only fails after walking through the entries to
> > find an empty one, mark the entry with the key and call to
> > get_free_elt() returns NULL.  As more key added, it worsenes the
> > problem since more entries will be marked with no value IMHO.
> > 
> > I can see you checked hist_data->drops in the next patch to work
> > around this problem.  But IMHO it's suboptimal since it cannot update
> > the existing entries too.  I think it'd be better having lookup-only
> > version of this function and use it after it sees drops.  The lookup
> > function can bail out from the loop if the insert doesn't mark empty
> > entry anymore IMHO.
> > 
> > Thoughts?
> > 
> 
> The assumption has always been that once you have drops (i.e.
> tracing_map_insert() returns NULL), the data can't really be trusted any
> more and tracing should just stop (and presumably be restarted with a
> bigger table).  It doesn't mean that the data is completely useless,
> just that it no longer can be assumed to have captured all the events
> over the tracing run.  Having a lookup-only version for the purpose of
> updating only existing entries sort of illustrates the problem even
> better - in that case only the events that already have entries in the
> table will be included while the events that don't yet have entries will
> be ignored, skewing the value of the data.

I thought it'd be better if users can see which one is the real drop
or not.  IOW if drop count is much smaller than the normal event
count, [s]he might want to ignore the occasional drops.  Otherwise,
[s]he should restart with a bigger table.  This requires accurate
counts of events and drops though.


> 
> On the other hand, if users do end up calling this even after it's
> returned NULL, we should make sure it doesn't result in an infinite
> loop, and cap the number of iterations through the loop.  tracing_map
> wasn't really meant to be generally reusable - it was separated out so
> it could be shared between two tracers - but it wouldn't hurt to add
> that check just in case...

Right.

Thanks,
Namhyung

  reply	other threads:[~2015-11-02  7:08 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-22 18:14 [PATCH 00/28] tracing: 'hist' triggers Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 01/28] tracing: Update cond flag when enabling or disabling a trigger Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 02/28] tracing: Make ftrace_event_field checking functions available Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 03/28] tracing: Make event trigger " Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 04/28] tracing: Add event record param to trigger_ops.func() Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 05/28] tracing: Add get_syscall_name() Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 06/28] tracing: Add a per-event-trigger 'paused' field Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 07/28] tracing: Add needs_rec flag to event triggers Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 08/28] tracing: Add lock-free tracing_map Tom Zanussi
2015-10-29  8:31   ` Namhyung Kim
2015-10-29 18:35     ` Tom Zanussi
2015-11-02  7:08       ` Namhyung Kim [this message]
2015-11-04  1:47         ` Tom Zanussi
2015-11-04  2:26           ` Namhyung Kim
2015-11-04  2:56             ` Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 09/28] tracing: Add 'hist' event trigger command Tom Zanussi
2015-10-29  9:11   ` Namhyung Kim
2015-10-29 18:37     ` Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 10/28] tracing: Add hist trigger support for multiple values ('vals=' param) Tom Zanussi
2015-11-02  7:42   ` Namhyung Kim
2015-10-22 18:14 ` [PATCH v11 11/28] tracing: Add hist trigger support for compound keys Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 12/28] tracing: Add hist trigger support for user-defined sorting ('sort=' param) Tom Zanussi
2015-11-02  8:03   ` Namhyung Kim
2015-10-22 18:14 ` [PATCH v11 13/28] tracing: Add hist trigger support for pausing and continuing a trace Tom Zanussi
2015-11-03  8:38   ` Namhyung Kim
2015-11-04  1:53     ` Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 14/28] tracing: Add hist trigger support for clearing " Tom Zanussi
2015-11-03  8:43   ` Namhyung Kim
2015-10-22 18:14 ` [PATCH v11 15/28] tracing: Add hist trigger 'hex' modifier for displaying numeric fields Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 16/28] tracing: Add hist trigger 'sym' and 'sym-offset' modifiers Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 17/28] tracing: Add hist trigger 'execname' modifier Tom Zanussi
2015-11-02 14:10   ` Namhyung Kim
2015-11-04  2:37     ` Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 18/28] tracing: Add hist trigger 'syscall' modifier Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 19/28] tracing: Add hist trigger support for stacktraces as keys Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 20/28] tracing: Support string type key properly Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 21/28] tracing: Remove restriction on string position in hist trigger keys Tom Zanussi
2015-11-02 14:40   ` Namhyung Kim
2015-10-22 18:14 ` [PATCH v11 22/28] tracing: Add enable_hist/disable_hist triggers Tom Zanussi
2015-11-03  8:55   ` Namhyung Kim
2015-11-04  1:58     ` Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 23/28] tracing: Add 'hist' trigger Documentation Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 24/28] tracing: Add support for multiple hist triggers per event Tom Zanussi
2015-11-03  0:34   ` Namhyung Kim
2015-11-04  1:52     ` Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 25/28] tracing: Add support for named triggers Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 26/28] tracing: Add support for named hist triggers Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 27/28] kselftests/ftrace : Add event trigger testcases Tom Zanussi
2015-10-22 18:14 ` [PATCH v11 28/28] kselftests/ftrace: Add hist " Tom Zanussi
2015-10-22 18:21 ` [PATCH 00/28] tracing: 'hist' triggers Steven Rostedt
2015-10-22 18:31   ` Tom Zanussi
2015-11-05 14:43 ` Namhyung Kim
2015-11-05 14:59   ` 平松雅巳 / HIRAMATU,MASAMI
2015-11-05 22:13   ` Tom Zanussi
2015-11-05 23:18     ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151102070805.GB17941@sejong \
    --to=namhyung@kernel.org \
    --cc=andi@firstfloor.org \
    --cc=daniel.wagner@bmw-carit.de \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tom.zanussi@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox