From: Olivier Dion via lttng-dev <lttng-dev@lists.lttng.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, paulmck@kernel.org
Cc: lttng-dev@lists.lttng.org
Subject: Re: [Userspace RCU] - rcu_dereference() memory ordering
Date: Thu, 21 Nov 2024 13:13:29 -0500
Message-ID: <87serkcv5i.fsf@laura>
In-Reply-To: <47507865-d3a7-494f-91a0-7a3ff2a6f8db@efficios.com>

On Thu, 21 Nov 2024, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> On 2024-10-21 19:35, Paul E. McKenney wrote:
>> On Mon, Oct 21, 2024 at 03:53:04PM -0400, Olivier Dion wrote:
[...] 
>> How much of the added "Volatile access" overhead is due to the volatile
>> load and how much to the cmm_ptr_eq?  Many use cases do not need to
>> compare pointers, except maybe against NULL.  Or against a sentinel.
>> In both cases, an equality comparison means no dereferencing, so no
>> problems.
>
> Olivier will prepare benchmarks without the cmm_ptr_eq() so we can isolate
> the overhead contribution of volatile vs atomic builtins more
> specifically.

Here is the micro-benchmark without the pointer comparison.  A tight loop
of rcu_dereference() was run 1 000 000 000 times:

Hardware:

  ARM Cortex-A57

Overview:

 | Implementation | Instructions   | Cycles         | Branch misses | Task clock (ms) | Insn/cycle |
 |----------------+----------------+----------------+---------------+-----------------+------------|
 | Volatile (V)   | 10 006 366 281 | 6 011 214 706  | 21 168        | 3 159.60        |       1.66 |
 | Atomic (A)     | 10 020 098 136 | 21 081 007 289 | 46 091        | 11 039.38       |       0.48 |
 |----------------+----------------+----------------+---------------+-----------------+------------|
 | Δ (A / V - 1)  | 0.14 %         | 250.69 %       | 117.74 %      | 249.39 %        |   -71.08 % |
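
For anyone who wants to recompute the Δ row from the raw counters, here
is a minimal sketch of the A / V - 1 formula.  The helper name
rel_delta_pct() is hypothetical and is not part of the benchmark itself.

```c
#include <assert.h>

/*
 * Relative change of the atomic counter (a) against the volatile
 * counter (v), expressed in percent: (a / v - 1) * 100.
 * Hypothetical helper, for illustration only.
 */
static double rel_delta_pct(double v, double a)
{
	assert(v != 0.0);	/* the volatile run is the baseline */
	return (a / v - 1.0) * 100.0;
}
```

Feeding it the cycle counts, rel_delta_pct(6011214706.0, 21081007289.0),
should land on the 250.69 % shown in the table, up to rounding.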

Volatile:

         0000000000000860 <func>:
          860:   90000100        adrp    x0, 20000 <__libc_start_main@GLIBC_2.34>
          864:   91012001        add     x1, x0, #0x48
          868:   f9402400        ldr     x0, [x0, #72]   ;; rcu_dereference()
          86c:   f9400000        ldr     x0, [x0]
          870:   f9000420        str     x0, [x1, #8]
          874:   d65f03c0        ret

          3,159.60 msec task-clock                       #    0.999 CPUs utilized
                 3      context-switches                 #    0.949 /sec
                 0      cpu-migrations                   #    0.000 /sec
                42      page-faults                      #   13.293 /sec
     6,011,214,706      cycles                           #    1.903 GHz
    10,006,366,281      instructions                     #    1.66  insn per cycle
   <not supported>      branches
            21,168      branch-misses

       3.161819264 seconds time elapsed

       3.161902000 seconds user
       0.000000000 seconds sys

Atomic:

        0000000000000860 <func>:
         860:   90000100        adrp    x0, 20000 <__libc_start_main@GLIBC_2.34>
         864:   91012000        add     x0, x0, #0x48
         868:   c8dffc01        ldar    x1, [x0]        ;; rcu_dereference()
         86c:   f9400021        ldr     x1, [x1]
         870:   f9000401        str     x1, [x0, #8]
         874:   d65f03c0        ret

         11,039.38 msec task-clock                       #    1.000 CPUs utilized
                20      context-switches                 #    1.812 /sec
                 0      cpu-migrations                   #    0.000 /sec
                43      page-faults                      #    3.895 /sec
    21,081,007,289      cycles                           #    1.910 GHz
    10,020,098,136      instructions                     #    0.48  insn per cycle
   <not supported>      branches
            46,091      branch-misses

      11.042103521 seconds time elapsed

      11.041847000 seconds user
       0.000000000 seconds sys
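
For reference, the two variants can be sketched in C roughly as follows.
This is a reconstruction from the listings above, not the actual
benchmark source: the names (struct node, shared, sink, func_volatile,
func_atomic, bench) are hypothetical, and the memory order passed to the
atomic builtin is an assumption.  GCC currently implements
__ATOMIC_CONSUME as __ATOMIC_ACQUIRE, which would be consistent with the
ldar seen in the atomic listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload; the listings only show a 64-bit load through the pointer. */
struct node {
	long value;
};

static struct node target = { .value = 42 };
static struct node *shared = &target;	/* pointer being dereferenced */
static long sink;			/* destination of the final store */

/* Variant V: volatile load of the pointer, then a plain dependent load. */
__attribute__((noinline)) static void func_volatile(void)
{
	struct node *n = *(struct node *volatile *)&shared;

	sink = n->value;
}

/*
 * Variant A: load through the atomic builtin.  The CONSUME order is an
 * assumption; GCC promotes it to ACQUIRE.
 */
__attribute__((noinline)) static void func_atomic(void)
{
	struct node *n = __atomic_load_n(&shared, __ATOMIC_CONSUME);

	sink = n->value;
}

/* Tight loop calling one variant the requested number of times. */
static void bench(void (*fn)(void), unsigned long iterations)
{
	assert(shared != NULL);
	for (unsigned long i = 0; i < iterations; i++)
		fn();
}
```

Calling bench(func_volatile, 1000000000UL) or
bench(func_atomic, 1000000000UL) under perf stat would give a setup
comparable in shape to the runs above, although the exact instruction
counts depend on the compiler and flags.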

[...]
-- 
Olivier Dion
EfficiOS Inc.
https://www.efficios.com
