public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Lee Revell <rlrevell@joe-job.com>
To: dipankar@in.ibm.com
Cc: paulmck@us.ibm.com, Ingo Molnar <mingo@elte.hu>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@osdl.org>
Subject: Re: RCU latency regression in 2.6.16-rc1
Date: Sat, 28 Jan 2006 13:00:02 -0500	[thread overview]
Message-ID: <1138471203.2799.13.camel@mindpipe> (raw)
In-Reply-To: <20060128170302.GB5633@in.ibm.com>

On Sat, 2006-01-28 at 22:33 +0530, Dipankar Sarma wrote:
> On Fri, Jan 27, 2006 at 01:55:22PM -0500, Lee Revell wrote:
> > On Thu, 2006-01-26 at 11:18 -0800, Paul E. McKenney wrote:
> > > >     Xorg-2154  0d.s.  213us : call_rcu_bh (rt_run_flush)
> > > >     Xorg-2154  0d.s.  215us : local_bh_enable (rt_run_flush)
> > > >     Xorg-2154  0d.s.  222us : local_bh_enable (rt_run_flush)
> > > >     Xorg-2154  0d.s.  223us : call_rcu_bh (rt_run_flush)
> > > > 
> > > > [ zillions of these deleted ]
> > > > 
> > > >     Xorg-2154  0d.s. 7335us : local_bh_enable (rt_run_flush)
> > > 
> > > Dipankar's latest patch should hopefully address this problem.
> > > 
> > > Could you please give it a spin when you get a chance? 
> > 
> > Nope, no improvement at all, furthermore, the machine locked up once
> > under heavy disk activity.
> > 
> > I just got an 8ms+ latency from rt_run_flush that looks basically
> > identical to the above.  It's still flushing routes in huge batches:
> 
> I am not supprised that the earlier patch doesn't help your
> test. Once you reach the high watermark, the "desperation mode"
> latency can be fairly bad since the RCU batch size is pretty
> big.
> 
> How about trying out the patch included below ? It doesn't reduce
> amount of work done from softirq context, but decreases the
> *number of RCUs* generated during rt_run_flush() by using
> one RCU per hash chain. Can you check if this makes any
> difference ?
> 
> Thanks
> Dipankar
> 
> 
> Reduce the number of RCU callbacks by flushing one hash chain
> at a time. This is intended to reduce RCU overhead during
> frequent flushing.
> 
> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>

OK, now we are making progress.  I've been running my Gnutella client
for half an hour and the worst latency is only ~1ms, in what looks like
a closely related code path, but due to holding a spinlock
(rt_hash_lock_addr() in rt_check_expire), rather than merely being in
softirq context like the previous case.  Whether 1ms is too long to be
holding a spinlock can be addressed later; this is a significant
improvement.

preemption latency trace v1.1.5 on 2.6.16-rc1
--------------------------------------------------------------------
 latency: 1036 us, #1001/1001, CPU#0 | (M:rt VP:0, KP:0, SP:0 HP:0)
    -----------------
    | task: Xorg-2221 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------

                 _------=> CPU#            
                / _-----=> irqs-off        
               | / _----=> need-resched    
               || / _---=> hardirq/softirq 
               ||| / _--=> preempt-depth   
               |||| /                      
               |||||     delay             
   cmd     pid ||||| time  |   caller      
      \   /    |||||   \   |   /           
  <idle>-0     0d.s4    1us : __trace_start_sched_wakeup (try_to_wake_up)
  <idle>-0     0d.s4    1us : __trace_start_sched_wakeup <<...>-2221> (73 0)
  <idle>-0     0d.s2    2us : wake_up_state (signal_wake_up)
  <idle>-0     0d.s.    3us : hrtimer_forward (it_real_fn)
  <idle>-0     0d.s.    4us : ktime_get (hrtimer_forward)
  <idle>-0     0d.s.    5us : ktime_get_ts (ktime_get)
  <idle>-0     0d.s.    5us : getnstimeofday (ktime_get_ts)
  <idle>-0     0d.s.    6us : do_gettimeofday (getnstimeofday)
  <idle>-0     0d.s.    6us : get_offset_tsc (do_gettimeofday)
  <idle>-0     0d.s.    7us : set_normalized_timespec (ktime_get_ts)
  <idle>-0     0d.s1    8us : enqueue_hrtimer (hrtimer_run_queues)
  <idle>-0     0d.s1   10us+: rb_insert_color (enqueue_hrtimer)
  <idle>-0     0d.s.   12us+: rt_check_expire (run_timer_softirq)
  <idle>-0     0d.s1   14us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   16us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   17us : call_rcu_bh (rt_check_expire)
  <idle>-0     0d.s1   18us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   20us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   21us : call_rcu_bh (rt_check_expire)
  <idle>-0     0d.s1   22us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   23us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   24us : call_rcu_bh (rt_check_expire)
  <idle>-0     0d.s1   25us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   26us : call_rcu_bh (rt_check_expire)
  <idle>-0     0d.s1   27us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1   27us : call_rcu_bh (rt_check_expire)

[ etc ]

  <idle>-0     0d.s1  995us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1  996us : call_rcu_bh (rt_check_expire)
  <idle>-0     0d.s1  997us : rt_may_expire (rt_check_expire)
  <idle>-0     0d.s1  998us : call_rcu_bh (rt_check_expire)
  <idle>-0     0d.s.  999us : mod_timer (rt_check_expire)
  <idle>-0     0d.s. 1000us : __mod_timer (mod_timer)
  <idle>-0     0d.s. 1001us : lock_timer_base (__mod_timer)
  <idle>-0     0d.s1 1001us : internal_add_timer (__mod_timer)
  <idle>-0     0d.s. 1003us : run_timer_softirq (__do_softirq)
  <idle>-0     0d.s. 1004us : hrtimer_run_queues (run_timer_softirq)
  <idle>-0     0d.s. 1005us : ktime_get_real (hrtimer_run_queues)
  <idle>-0     0d.s. 1005us : getnstimeofday (ktime_get_real)
  <idle>-0     0d.s. 1006us : do_gettimeofday (getnstimeofday)
  <idle>-0     0d.s. 1007us : get_offset_tsc (do_gettimeofday)
  <idle>-0     0d.s. 1008us : ktime_get (hrtimer_run_queues)
  <idle>-0     0d.s. 1008us : ktime_get_ts (ktime_get)
  <idle>-0     0d.s. 1009us : getnstimeofday (ktime_get_ts)
  <idle>-0     0d.s. 1010us : do_gettimeofday (getnstimeofday)
  <idle>-0     0d.s. 1010us : get_offset_tsc (do_gettimeofday)
  <idle>-0     0d.s. 1011us : set_normalized_timespec (ktime_get_ts)
  <idle>-0     0d.s. 1013us : tasklet_action (__do_softirq)
  <idle>-0     0d.s. 1013us : rcu_process_callbacks (tasklet_action)
  <idle>-0     0d.s. 1014us : __rcu_process_callbacks (rcu_process_callbacks)
  <idle>-0     0d.s. 1015us : __rcu_process_callbacks (rcu_process_callbacks)
  <idle>-0     0d.s1 1016us+: rcu_start_batch (__rcu_process_callbacks)
  <idle>-0     0dn.1 1018us < (2097760)
  <idle>-0     0dn.. 1019us : schedule (cpu_idle)
  <idle>-0     0dn.. 1020us : stop_trace (schedule)
  <idle>-0     0dn.. 1020us : profile_hit (schedule)
  <idle>-0     0dn.1 1021us+: sched_clock (schedule)
  <idle>-0     0dn.2 1023us : recalc_task_prio (schedule)
  <idle>-0     0dn.2 1024us : effective_prio (recalc_task_prio)
  <idle>-0     0dn.2 1025us : requeue_task (schedule)
   <...>-2221  0d..2 1029us+: __switch_to (schedule)
   <...>-2221  0d..2 1031us : schedule <<idle>-0> (8c 73)
   <...>-2221  0d..1 1032us : trace_stop_sched_switched (schedule)
   <...>-2221  0d..2 1033us+: trace_stop_sched_switched <<...>-2221> (73 0)
   <...>-2221  0d..2 1035us : schedule (schedule)



  reply	other threads:[~2006-01-28 18:00 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-01-24  7:52 RCU latency regression in 2.6.16-rc1 Lee Revell
2006-01-24  7:56 ` Ingo Molnar
2006-01-24  7:58   ` Lee Revell
2006-01-24  8:01     ` Ingo Molnar
2006-01-24  8:03       ` Lee Revell
2006-01-24  8:11         ` Ingo Molnar
2006-01-24  8:07       ` Lee Revell
2006-01-24  8:13         ` Ingo Molnar
2006-01-24  8:15           ` Lee Revell
2006-01-24  9:17             ` Paul E. McKenney
2006-01-24  9:23               ` Ingo Molnar
2006-01-24  9:44                 ` Lee Revell
2006-01-24 16:28                   ` Dipankar Sarma
2006-01-24 21:38                     ` Dipankar Sarma
2006-01-25 21:28                       ` Lee Revell
2006-01-25 22:56                         ` Ingo Molnar
2006-01-25 23:13                           ` Lee Revell
2006-01-26 19:18                         ` Paul E. McKenney
2006-01-27 18:55                           ` Lee Revell
2006-01-28 17:03                             ` Dipankar Sarma
2006-01-28 18:00                               ` Lee Revell [this message]
2006-01-28 18:51                                 ` Lee Revell
2006-01-28 19:34                                   ` Dipankar Sarma
2006-01-28 19:46                                     ` Lee Revell
2006-01-28 19:52                                     ` Eric Dumazet
2006-01-29  7:38                                       ` Lee Revell
2006-01-29  7:51                                         ` Ingo Molnar
2006-01-29  8:21                                           ` Lee Revell
2006-01-30  4:36                                       ` Paul E. McKenney
2006-01-30  4:55                                         ` Eric Dumazet
2006-01-30  5:11                                           ` Paul E. McKenney
2006-01-30  5:52                                             ` David S. Miller
2006-01-30 10:00                                               ` Paul E. McKenney
2006-02-12  0:45                                             ` Lee Revell
2006-01-24 16:57 ` Dipankar Sarma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1138471203.2799.13.camel@mindpipe \
    --to=rlrevell@joe-job.com \
    --cc=dipankar@in.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@us.ibm.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox