All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	Andrea Righi <andrea.righi@canonical.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Joel Fernandes <joelaf@google.com>,
	rcu@vger.kernel.org,
	"Cc: Frederic Weisbecker" <fweisbec@gmail.com>
Subject: Re: Observation on NOHZ_FULL
Date: Mon, 12 Feb 2024 10:05:50 +0100	[thread overview]
Message-ID: <ZcnfbhteSsngapOP@pc636> (raw)
In-Reply-To: <29e6c22d-0487-42a5-b840-e3c7ea7c6270@joelfernandes.org>

On Sun, Feb 11, 2024 at 07:22:55PM -0500, Joel Fernandes wrote:
> 
> 
> On 2/8/2024 9:51 AM, Uladzislau Rezki wrote:
> > On Thu, Feb 08, 2024 at 01:53:58PM +0100, Uladzislau Rezki wrote:
> >> On Thu, Feb 08, 2024 at 07:55:37AM +0100, Andrea Righi wrote:
> >>> On Thu, Feb 08, 2024 at 12:54:58AM -0500, Joel Fernandes wrote:
> >>> ...
> >>>>>> Slightly related, but one of the things we are wondering also is how
> >>>>>> much of the overhead for your nohz-full and lazy-RCU test (on top of
> >>>>>> baseline - that is just CONFIG_HZ=1000 without nohz-full or nocbs) is
> >>>>>> because of just using NOCB. Uladsizlau mentioned he might run a test
> >>>>>> for comparing along those lines as well.
> >>>>>
> >>>>> Just to clarify, "lazy rcu on" results are just with rcu_nocb=all and
> >>>>> lazy RCUs enabled (and HZ=1000), so without nohz_full.
> >>>>>
> >>>>> If I enable only nohz_full=all (without rcu_nocb) I see something like
> >>>>> this:
> >>>>
> >>>> Ok. I did want to mention nohz_full implies rcu_nocb on the same CPUs as well.
> >>>>
> >>>> Its also mentioned in the boot param docs on the last line of the description:
> >>>>
> >>>>         nohz_full=      [KNL,BOOT,SMP,ISOL]
> >>>>                         The argument is a cpu list, as described above.
> >>>>                         In kernels built with CONFIG_NO_HZ_FULL=y, set
> >>>>                         the specified list of CPUs whose tick will be stopped
> >>>>                         whenever possible. The boot CPU will be forced outside
> >>>>                         the range to maintain the timekeeping.  Any CPUs
> >>>>                         in this list will have their RCU callbacks offloaded,
> >>>>                         just as if they had also been called out in the
> >>>>                         rcu_nocbs= boot parameter.
> >>>
> >>> Ah I didn't realize that, it definitely makes sense, thanks for
> >>> clarifying it.
> >>>
> >>> Then basically in the results that I posted the difference is
> >>> "nohz_full=all+rcu_nocb=all" vs "rcu_nocb=all+lazy_RCU=on".
> >>>
> >> So, you say that a hrtimer_interrupt() handler takes more time in case
> >> of lazy + nocb + rcu_nocb=all and for nohz_full + rcu_nocb=all it faster?
> >> Could you please clarify this? I will try to measure from my side!
> >>
> >> I have done some basic research about hrtimer_interrupt() latency on my
> >> HW with latest Linux kernel. I have compared below cases:
> >>
> >> case a: 1000HZ + lazy + nocb_all_cpus
> >> case b: 1000HZ + nocb_all_cpus
> >>
> >> I used "ftrace" to measure time(in microseconds). Steps:
> >>
> >> echo 0 > tracing_on
> >> echo function_graph > current_tracer
> >> echo funcgraph-proc > trace_options
> >> echo funcgraph-abstime > trace_options
> >> echo hrtimer_interrupt > set_ftrace_filter
> >>
> >> fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based --runtime=50&
> >>
> >> echo 1 > tracing_on; sleep 10; echo 0 > tracing_on
> >>
> >> data is based on 10 seconds collection:
> >>
> >> <case a>
> >>      6  2102 ############################################################
> >>      8  2079 ############################################################
> >>     10  1464 ##########################################
> >>      7   897 ##########################
> 
> So first column is microseconds and second one is count?
> 
> >>      9   625 ##################
> >>     12   490 ##############
> >>     13   479 ##############
> >>     11   289 #########
> >>      5   249 ########
> >>     14   124 ####
> >>     15    72 ###
> >>     16    41 ##
> >>     17    24 #
> >>      4    22 #
> >>     18    12 #
> >>     22     2 #
> >>     19     1 #
> >> <case a>
> >>
> >> <case b>
> >>      9  1658 ############################################################
> >>     13  1308 ################################################
> >>     12  1224 #############################################
> 
> Assuming that, it does seem the "best" case is off by 3 microseconds (9 vs 6),
> still would not warrant being regarded a bug and possibly just in the noise.
> 
> >>     10   972 ####################################
> >>      8   703 ##########################
> >>     14   595 ######################
> >>     15   571 #####################
> >>     11   525 ###################
> >>     17   350 #############
> >>     16   235 #########
> >>      7   214 ########
> >>      4    73 ###
> >>      5    68 ###
> >>      6    54 ##
> >>     20     9 #
> >>     18     9 #
> >>     19     6 #
> >>     33     1 #
> >>      3     1 #
> >>     28     1 #
> >>     27     1 #
> >>     25     1 #
> >>     22     1 #
> >>     21     1 #
> >> <case b>
> >>
> >> I do not see the difference, there is a nose of 1/2/3 microseconds diff.
> >>
> > Let me further have a look at what we use for lazy in terms on hrtimer though.
> 
> Thanks for tracing it. Yeah it would be nice to count how many counts of
> do_nocb_deferred_wakeup() does the fio test trigger. If it is few, then maybe
> the problem with hrtimer_interrupt() is something else.
> 

urezki@pc638:~$ fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based --runtime=1000 > /dev/null 2>&1&
[1] 697
urezki@pc638:~$ sudo cat /sys/kernel/debug/tracing/trace_pipe  | grep Timer
          <idle>-0       [005] ..s1.   175.762689: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   186.002386: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.   186.002391: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.   228.497061: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [001] ..s1.   242.831956: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-705     [004] ..s..   272.009997: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   282.253447: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   347.786758: rcu_nocb_wake: rcu_preempt 4 Timer
            cron-714     [004] ..s1.   362.429287: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] .Ns1.   377.993814: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   393.353371: rcu_nocb_wake: rcu_preempt 0 Timer
    kworker/6:1H-172     [006] ..s1.   410.760933: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   427.148456: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   438.220173: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] .Ns1.   451.719831: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   472.199389: rcu_nocb_wake: rcu_preempt 4 Timer
           <...>-717     [000] ..s..   482.434074: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   484.969012: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   495.750151: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-702     [000] ..s1.   516.439595: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   578.212713: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   595.583863: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   597.634693: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.   617.598664: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   655.996890: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   660.606691: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   680.059867: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   688.227533: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   698.491126: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-703     [000] ..s1.   705.925823: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   721.018256: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   732.793797: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   747.129268: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   766.075583: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-701     [004] ..s2.   767.047547: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   782.456032: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   797.303567: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/6:1H-172     [006] ..s1.   813.686996: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.   828.534430: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-700     [004] ..s2.   850.536698: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.   860.789367: rcu_nocb_wake: rcu_preempt 4 Timer
     ksoftirqd/6-63      [006] ..s..   888.948515: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-699     [000] ..s..   905.854912: rcu_nocb_wake: rcu_preempt 0 Timer
    kworker/6:1H-172     [006] ..s1.   911.987781: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.   924.275337: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.   938.208899: rcu_nocb_wake: rcu_preempt 4 Timer
    kworker/0:1H-100     [000] ..s1.   938.236053: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.   948.338615: rcu_nocb_wake: rcu_preempt 4 Timer
             fio-702     [000] ..s2.   958.216274: rcu_nocb_wake: rcu_preempt 0 Timer
           <...>-728     [004] ..s..   962.432149: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.   978.545645: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [003] ..s1.  1002.608935: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1002.614904: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1012.848592: rcu_nocb_wake: rcu_preempt 0 Timer
    kworker/6:1H-172     [006] ..s1.  1021.552408: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [002] ..s1.  1042.543682: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [001] ..s1.  1068.654892: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [002] ..s1.  1080.430541: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1098.349992: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [002] ..s1.  1115.757464: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1170.539822: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1178.191592: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1188.459281: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1196.139077: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1216.762430: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1238.374788: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.  1241.194246: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [003] ..s1.  1251.433991: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1261.673634: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1261.673637: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1271.913347: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1308.263711: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1308.263738: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1308.270707: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1326.183705: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1326.183709: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.  1326.193714: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1336.423403: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [003] ..s1.  1336.423404: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1368.678480: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1398.373573: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1428.580662: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1428.580666: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1468.515358: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [000] ..s1.  1488.482412: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [000] ..s1.  1548.385082: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1548.385083: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [003] ..s1.  1572.448367: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1572.448367: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [004] ..s1.  1572.453972: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [005] ..s1.  1582.688060: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1582.688062: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [004] ..s1.  1592.927895: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [007] ..s1.  1608.287294: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [001] ..s1.  1608.799312: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [002] ..s1.  1621.086909: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [005] ..s1.  1631.326609: rcu_nocb_wake: rcu_preempt 4 Timer
          <idle>-0       [003] .Ns1.  1653.854052: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [001] ..s1.  1668.189531: rcu_nocb_wake: rcu_preempt 0 Timer
          <idle>-0       [007] ..s1.  1678.429219: rcu_nocb_wake: rcu_preempt 4 Timer
<snip>

1000 seconds runtime of "fio" test triggers the do_nocb_deferred_wakeup_timer()
around 100 times. Most of them are handled from the soft-irq context. I have
looked at the perf figures vs hrtimer_interrupt(). I do not see any obvious
performance issues with it.

8xCPUs KVM system;
1000HZ timer;
lazy + all offloading.

--
Uladzislau Rezki

  reply	other threads:[~2024-02-12  9:05 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-29 17:47 Observation on NOHZ_FULL Joel Fernandes
2024-01-29 20:41 ` Paul E. McKenney
2024-01-29 22:16   ` Joel Fernandes
2024-01-30  6:58     ` Andrea Righi
2024-01-30 10:17       ` Paul E. McKenney
2024-01-30 11:06         ` Uladzislau Rezki
2024-01-30 11:27           ` Andrea Righi
2024-01-30 11:41             ` Uladzislau Rezki
2024-01-30 13:39             ` Paul E. McKenney
2024-02-06 17:51               ` Andrea Righi
2024-02-06 18:44                 ` Paul E. McKenney
2024-02-07 13:04                   ` Uladzislau Rezki
2024-02-07 15:12                     ` Andrea Righi
2024-02-07 15:49                       ` Joel Fernandes
2024-02-15  7:51                   ` Andrea Righi
2024-02-15 13:15                     ` Uladzislau Rezki
2024-02-15 14:02                       ` Joel Fernandes
2024-02-07 15:48                 ` Joel Fernandes
2024-02-07 16:31                   ` Andrea Righi
2024-02-07 16:52                     ` Joel Fernandes
2024-02-07 17:05                       ` Andrea Righi
2024-02-08  5:54                         ` Joel Fernandes
2024-02-08  6:55                           ` Andrea Righi
2024-02-08 12:53                             ` Uladzislau Rezki
2024-02-08 14:51                               ` Uladzislau Rezki
2024-02-12  0:22                                 ` Joel Fernandes
2024-02-12  9:05                                   ` Uladzislau Rezki [this message]
2024-02-12  9:44                                   ` Uladzislau Rezki
2024-01-29 20:48 ` Uladzislau Rezki
2024-01-29 22:20   ` Joel Fernandes
2024-01-29 22:43     ` Frederic Weisbecker
2024-01-29 22:53       ` Joel Fernandes
2024-01-29 23:11         ` Frederic Weisbecker
2024-01-29 23:36           ` Joel Fernandes
2024-01-30  0:40       ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZcnfbhteSsngapOP@pc636 \
    --to=urezki@gmail.com \
    --cc=andrea.righi@canonical.com \
    --cc=fweisbec@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=joelaf@google.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.