From: Olivier Dion via lttng-dev <lttng-dev@lists.lttng.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, paulmck@kernel.org
Cc: lttng-dev@lists.lttng.org
Subject: Re: [Userspace RCU] - rcu_dereference() memory ordering
Date: Thu, 21 Nov 2024 13:13:29 -0500
Message-ID: <87serkcv5i.fsf@laura>
In-Reply-To: <47507865-d3a7-494f-91a0-7a3ff2a6f8db@efficios.com>

On Thu, 21 Nov 2024, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> On 2024-10-21 19:35, Paul E. McKenney wrote:
>> On Mon, Oct 21, 2024 at 03:53:04PM -0400, Olivier Dion wrote:
[...] 
>> How much of the added "Volatile access" overhead is due to the volatile
>> load and how much to the cmm_ptr_eq?  Many use cases do not need to
>> compare pointers, except maybe against NULL.  Or against a sentinel.
>> In both cases, an equality comparison means no dereferencing, so no
>> problems.
>
> Olivier will prepare benchmarks without the cmm_ptr_eq() so we can isolate
> the overhead contribution of volatile vs atomic builtins more
> specifically.

Here is the micro-benchmark without pointer comparison.  A tight loop of
rcu_dereference() was run 1 000 000 000 times:

Hardware:

  ARM Cortex-A57

Overview:

 | Implementation | Instructions   | Cycles         | Branch misses | Task clock (ms) | Insn/cycle |
 |----------------+----------------+----------------+---------------+-----------------+------------|
 | Volatile (V)   | 10 006 366 281 | 6 011 214 706  | 21 168        | 3 159.60        |       1.66 |
 | Atomic (A)     | 10 020 098 136 | 21 081 007 289 | 46 091        | 11 039.38       |       0.48 |
 |----------------+----------------+----------------+---------------+-----------------+------------|
 | Δ (A / V - 1)  | 0.14 %         | 250.69 %       | 117.74 %      | 249.39 %        |   -71.08 % |
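
For reference, the Δ row is just A / V - 1 for each column, expressed in
percent.  A minimal sketch that recomputes it from the raw counters in
the table (the helper name delta_pct() and the DELTA_MAIN guard are made
up for illustration):

```c
#include <stdio.h>

/* Relative change of the atomic counter over the volatile one, in percent. */
double delta_pct(double atomic, double vol)
{
	return (atomic / vol - 1.0) * 100.0;
}

/* Build with -DDELTA_MAIN to get a standalone program. */
#ifdef DELTA_MAIN
int main(void)
{
	/* Raw values copied from the overview table. */
	printf("instructions:  %.2f %%\n", delta_pct(10020098136.0, 10006366281.0));
	printf("cycles:        %.2f %%\n", delta_pct(21081007289.0, 6011214706.0));
	printf("branch misses: %.2f %%\n", delta_pct(46091.0, 21168.0));
	printf("task clock:    %.2f %%\n", delta_pct(11039.38, 3159.60));
	return 0;
}
#endif
```

Dividing the cycle counts by the 1 000 000 000 iterations gives roughly
6 cycles per iteration for the volatile variant and roughly 21 for the
atomic one, for about 10 instructions per iteration in both cases.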

Volatile:

         0000000000000860 <func>:
          860:   90000100        adrp    x0, 20000 <__libc_start_main@GLIBC_2.34>
          864:   91012001        add     x1, x0, #0x48
          868:   f9402400        ldr     x0, [x0, #72]   ;; rcu_dereference()
          86c:   f9400000        ldr     x0, [x0]
          870:   f9000420        str     x0, [x1, #8]
          874:   d65f03c0        ret

          3,159.60 msec task-clock                       #    0.999 CPUs utilized
                 3      context-switches                 #    0.949 /sec
                 0      cpu-migrations                   #    0.000 /sec
                42      page-faults                      #   13.293 /sec
     6,011,214,706      cycles                           #    1.903 GHz
    10,006,366,281      instructions                     #    1.66  insn per cycle
   <not supported>      branches
            21,168      branch-misses

       3.161819264 seconds time elapsed

       3.161902000 seconds user
       0.000000000 seconds sys

Atomic:

        0000000000000860 <func>:
         860:   90000100        adrp    x0, 20000 <__libc_start_main@GLIBC_2.34>
         864:   91012000        add     x0, x0, #0x48
         868:   c8dffc01        ldar    x1, [x0]        ;; rcu_dereference()
         86c:   f9400021        ldr     x1, [x1]
         870:   f9000401        str     x1, [x0, #8]
         874:   d65f03c0        ret

         11,039.38 msec task-clock                       #    1.000 CPUs utilized
                20      context-switches                 #    1.812 /sec
                 0      cpu-migrations                   #    0.000 /sec
                43      page-faults                      #    3.895 /sec
    21,081,007,289      cycles                           #    1.910 GHz
    10,020,098,136      instructions                     #    0.48  insn per cycle
   <not supported>      branches
            46,091      branch-misses

      11.042103521 seconds time elapsed

      11.041847000 seconds user
       0.000000000 seconds sys
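
The two variants of func() above can be approximated with the sketch
below.  This is not the actual benchmark source: the variable and
function names, the BENCH_MAIN guard and the use of __ATOMIC_CONSUME
are assumptions made for illustration.  As far as I know GCC promotes
consume loads to acquire, which would match the ldar in the atomic
disassembly, but check the generated code on your own toolchain.

```c
#include <stdio.h>

static long value = 42;
static long *gp = &value;	/* pointer being "rcu_dereference()'d" (hypothetical) */
static long result;		/* sink, so the dereference is not optimized away */

/* Volatile flavour: a plain load the compiler must not elide or duplicate. */
static inline long *load_volatile(long **p)
{
	return *(long * volatile *)p;
}

/* Atomic flavour: GCC/Clang builtin with consume ordering (assumed). */
static inline long *load_atomic(long **p)
{
	return __atomic_load_n(p, __ATOMIC_CONSUME);
}

__attribute__((noinline)) void func_volatile(void)
{
	result = *load_volatile(&gp);
}

__attribute__((noinline)) void func_atomic(void)
{
	result = *load_atomic(&gp);
}

/* Tight loops around each variant; return the last value read. */
long run_volatile(unsigned long iterations)
{
	for (unsigned long i = 0; i < iterations; i++)
		func_volatile();
	return result;
}

long run_atomic(unsigned long iterations)
{
	for (unsigned long i = 0; i < iterations; i++)
		func_atomic();
	return result;
}

/* Build with -DBENCH_MAIN (and -DUSE_ATOMIC for the atomic variant). */
#ifdef BENCH_MAIN
int main(void)
{
#ifdef USE_ATOMIC
	printf("%ld\n", run_atomic(1000000000UL));
#else
	printf("%ld\n", run_volatile(1000000000UL));
#endif
	return 0;
}
#endif
```

Each build can then be run under perf stat to collect the same counters
as above; comparing the objdump output of func_volatile() and
func_atomic() should show whether the compiler emits ldr or ldar for
the load.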

[...]
-- 
Olivier Dion
EfficiOS Inc.
https://www.efficios.com
