All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
	Jens Axboe <axboe@kernel.dk>,
	Philipp Reisner <philipp.reisner@linbit.com>,
	Bryan Tan <bryantan@vmware.com>,
	Eric Dumazet <edumazet@google.com>,
	Bob Pearson <rpearsonhpe@gmail.com>,
	Ariel Levkovich <lariel@nvidia.com>,
	Theodore Ts'o <tytso@mit.edu>, Julian Anastasov <ja@ssi.bg>
Subject: Re: [PATCH 04/13] tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
Date: Fri, 17 Mar 2023 10:05:22 +0100	[thread overview]
Message-ID: <ZBQtUr6MLhDYqPl5@pc636> (raw)
In-Reply-To: <ZBMwJYFYpfLsuW5F@pc636>

On Thu, Mar 16, 2023 at 04:05:09PM +0100, Uladzislau Rezki wrote:
> On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote:
> > On Thu, 16 Mar 2023 09:16:37 +0100
> > Uladzislau Rezki <urezki@gmail.com> wrote:
> > 
> > > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> > > > index ef8ed3b65d05..e6037752dcf0 100644
> > > > --- a/kernel/trace/trace_probe.h
> > > > +++ b/kernel/trace/trace_probe.h
> > > > @@ -256,6 +256,7 @@ struct trace_probe {
> > > >  struct event_file_link {
> > > >  	struct trace_event_file		*file;
> > > >  	struct list_head		list;
> > > > +	struct rcu_head			rcu;
> > > >  };
> > > >  
> > > >  static inline bool trace_probe_test_flag(struct trace_probe *tp,
> > > >  
> > > struct foo_a {
> > >   int a;
> > >   int b;
> > > };
> > 
> > Most machines today are 64 bits, even low end machines.
> > 
> >  struct foo_a {
> > 	long long a;
> > 	long long b;
> >  };
> > 
> > is more accurate. That's 16 bytes.
> > 
> > Although it is more likely off because list_head is a double pointer. But
> > let's just go with this, as the amount really doesn't matter here.
> > 
> > > 
> > > your obj size is 8 byte
> > > 
> > > struct foo_b {
> > >   struct rcu_head rcu;
> > 
> > Isn't rcu_head defined as;
> > 
> > struct callback_head {
> >         struct callback_head *next;
> >         void (*func)(struct callback_head *head);
> > } __attribute__((aligned(sizeof(void *))));
> > #define rcu_head callback_head
> > 
> > Which makes it 8 not 16 on 32 bit as well?
> > 
> > >   int a;
> > >   int b;
> > > };
> > 
> > So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit.
> > 
> > > 
> > > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object
> > > will be 32 bytes since there is no slab for 24 bytes:
> > > 
> > > <snip>
> > >   kmalloc-32         19840  19840     32  128    1 : tunables    0    0    0 : slabdata    155    155      0
> > >   kmalloc-16         28857  28928     16  256    1 : tunables    0    0    0 : slabdata    113    113      0
> > >   kmalloc-8          37376  37376      8  512    1 : tunables    0    0    0 : slabdata     73     73      0
> > > <snip>
> > > 
> > > if we allocate 512 objects of foo_a it would be 4096 bytes
> > > in case of foo_b it is 24 * 512 = 12228 bytes.
> > 
> > This is for probe events. We usually allocate 1, maybe 2. Oh, some may even
> > allocate 100 to be crazy. But each probe event is in reality much larger
> > (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes
> > extra is still lost in the noise.
> > 
> > > 
> > > single argument will give you 4096 + 512 * 8 = 8192 bytes
> > > int terms of memory consumtion.
> > 
> > If someone allocate 512 instances, that would be closer to a meg in size
> > without this change. 8k is probably less than 1%
> > 
> In percentage. My case. (12228 - 8192) * 100 / 12228 = ~33% difference.
> 
> > > 
> > > And double argument will not give you better performance comparing
> > > with a single argument.
> > 
> > It will, because it will no longer have to allocate anything if need be.
> > Note, when it doesn't allocate the system is probably mostly idle and we
> > don't care about performance, but when it needs allocation, that's likely a
> > time when performance is a bit more important.
> > 
> The problem further is about pointer chasing, like comparing arrays and
> lists. It will take longer time to offload all pointers.
> 
Since i have a data, IMHO it is better to share than not:

--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=3 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"

# double-argument 10 run
Total time taken by all kfree'ers: 4387872408 ns, loops: 10000, batches: 958,  memory footprint: 40MB
Total time taken by all kfree'ers: 4415232304 ns, loops: 10000, batches: 982,  memory footprint: 39MB
Total time taken by all kfree'ers: 4270303081 ns, loops: 10000, batches: 955,  memory footprint: 42MB
Total time taken by all kfree'ers: 4364984147 ns, loops: 10000, batches: 953,  memory footprint: 40MB
Total time taken by all kfree'ers: 4225994506 ns, loops: 10000, batches: 942,  memory footprint: 40MB
Total time taken by all kfree'ers: 4601087346 ns, loops: 10000, batches: 1033, memory footprint: 40MB
Total time taken by all kfree'ers: 4853397855 ns, loops: 10000, batches: 1109, memory footprint: 38MB
Total time taken by all kfree'ers: 4627914204 ns, loops: 10000, batches: 1037, memory footprint: 39MB
Total time taken by all kfree'ers: 4274587317 ns, loops: 10000, batches: 938,  memory footprint: 33MB
Total time taken by all kfree'ers: 4642151205 ns, loops: 10000, batches: 1068, memory footprint: 39MB

# single-argument 10 run
Total time taken by all kfree'ers: 3661190052 ns, loops: 10000, batches: 831, memory footprint: 29MB
Total time taken by all kfree'ers: 3616277061 ns, loops: 10000, batches: 780, memory footprint: 27MB
Total time taken by all kfree'ers: 3704584439 ns, loops: 10000, batches: 810, memory footprint: 27MB
Total time taken by all kfree'ers: 3631291959 ns, loops: 10000, batches: 812, memory footprint: 28MB
Total time taken by all kfree'ers: 3610490769 ns, loops: 10000, batches: 795, memory footprint: 27MB
Total time taken by all kfree'ers: 3595446243 ns, loops: 10000, batches: 825, memory footprint: 28MB
Total time taken by all kfree'ers: 3686252889 ns, loops: 10000, batches: 784, memory footprint: 27MB
Total time taken by all kfree'ers: 3821475275 ns, loops: 10000, batches: 869, memory footprint: 27MB
Total time taken by all kfree'ers: 3740407185 ns, loops: 10000, batches: 813, memory footprint: 28MB
Total time taken by all kfree'ers: 3646684795 ns, loops: 10000, batches: 780, memory footprint: 24MB

And yes, there are side effects. For example a low memory condition.

--
Uladzislau Rezki

  reply	other threads:[~2023-03-17  9:06 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-01 15:08 [PATCH 00/13] Rename k[v]free_rcu() single argument to k[v]free_rcu_mightsleep() Uladzislau Rezki (Sony)
2023-02-01 15:08 ` [PATCH 01/13] rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep() Uladzislau Rezki (Sony)
2023-02-02  7:54   ` Zhuo, Qiuxu
2023-02-02 15:07     ` Paul E. McKenney
2023-02-01 15:08 ` [PATCH 02/13] drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep() Uladzislau Rezki (Sony)
2023-03-09 13:39   ` Uladzislau Rezki
2023-02-01 15:08 ` [PATCH 03/13] misc: vmw_vmci: " Uladzislau Rezki (Sony)
2023-03-09 13:41   ` Uladzislau Rezki
2023-03-09 14:36   ` Vishnu Dasa
2023-02-01 15:08 ` [PATCH 04/13] tracing: " Uladzislau Rezki (Sony)
2023-03-09 13:45   ` Uladzislau Rezki
2023-03-15 22:36     ` Steven Rostedt
2023-03-15 23:19       ` Jens Axboe
2023-03-16  0:37         ` Paul E. McKenney
2023-03-16  2:23           ` Steven Rostedt
2023-03-16  3:44             ` Paul E. McKenney
2023-03-16  4:16               ` Joel Fernandes
2023-03-16 12:14                 ` Steven Rostedt
2023-03-16 14:56                   ` Paul E. McKenney
2023-03-16  8:16       ` Uladzislau Rezki
2023-03-16 13:56         ` Steven Rostedt
2023-03-16 15:05           ` Uladzislau Rezki
2023-03-17  9:05             ` Uladzislau Rezki [this message]
2023-03-16 15:12           ` Paul E. McKenney
2023-03-16 17:54       ` Paul E. McKenney
2023-03-16 17:57       ` Uladzislau Rezki
2023-03-16 18:01         ` Joel Fernandes
2023-03-18 16:11           ` Steven Rostedt
2023-03-22 23:10             ` Joel Fernandes
2023-02-01 15:08 ` [PATCH 05/13] lib/test_vmalloc.c: " Uladzislau Rezki (Sony)
2023-02-01 15:08 ` [PATCH 06/13] net/sysctl: " Uladzislau Rezki (Sony)
2023-03-09 13:48   ` Uladzislau Rezki
2023-03-09 13:49   ` Uladzislau Rezki
2023-02-01 15:08 ` [PATCH 07/13] RDMA/rxe: Rename kfree_rcu() to kfree_rcu_mightsleep() Uladzislau Rezki (Sony)
2023-03-09 13:48   ` Uladzislau Rezki
2023-03-09 14:13     ` Uladzislau Rezki
2023-03-10  0:55       ` Joel Fernandes
2023-03-13 19:43         ` Bob Pearson
2023-03-15 11:50           ` Joel Fernandes
2023-03-15 18:07             ` Bob Pearson
2023-03-14  6:31       ` Zhu Yanjun
2023-02-01 15:08 ` [PATCH 08/13] net/mlx5: " Uladzislau Rezki (Sony)
2023-03-09 13:47   ` Uladzislau Rezki
2023-02-01 15:08 ` [PATCH 09/13] ext4/super: " Uladzislau Rezki (Sony)
2023-03-09 13:43   ` Uladzislau Rezki
2023-02-01 19:12 ` [PATCH 00/13] Rename k[v]free_rcu() single argument to k[v]free_rcu_mightsleep() Paul E. McKenney
2023-02-02 15:54   ` Uladzislau Rezki
2023-02-02 16:35     ` Paul E. McKenney
2023-02-23 12:45 ` Frederic Weisbecker
2023-02-23 14:29   ` Zhuo, Qiuxu
2023-02-23 15:54     ` Paul E. McKenney
2023-02-23 16:21       ` Julian Anastasov
2023-02-23 17:14         ` Paul E. McKenney
2023-02-23 17:36           ` Pablo Neira Ayuso
2023-02-23 18:21             ` Paul E. McKenney
2023-02-23 14:57 ` Jens Axboe
2023-02-23 18:31   ` Paul E. McKenney
2023-02-23 19:36     ` Jens Axboe
2023-02-23 19:47       ` Paul E. McKenney
2023-02-23 19:57         ` Jens Axboe
2023-03-15 19:14 ` Steven Rostedt
2023-03-15 19:16   ` Jens Axboe
2023-03-15 19:25     ` Uladzislau Rezki
2023-03-15 19:34       ` Steven Rostedt
2023-03-15 19:57         ` Joel Fernandes
2023-03-15 20:28           ` Steven Rostedt
2023-03-15 21:07             ` Uladzislau Rezki
2023-03-15 21:14               ` Uladzislau Rezki
2023-03-15 22:08             ` Joel Fernandes
2023-03-15 22:26               ` Steven Rostedt
2023-03-16  2:13                 ` Joel Fernandes
2023-03-16  2:50                   ` Steven Rostedt
2023-03-16  5:01                     ` Joel Fernandes
2023-03-16  1:25               ` Theodore Ts'o
2023-03-16  2:15                 ` Steven Rostedt
2023-03-16  2:52                 ` Paul E. McKenney
2023-03-16  0:42         ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBQtUr6MLhDYqPl5@pc636 \
    --to=urezki@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=bryantan@vmware.com \
    --cc=edumazet@google.com \
    --cc=ja@ssi.bg \
    --cc=lariel@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=philipp.reisner@linbit.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.