From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751449AbcFXPOe (ORCPT <rfc822;w@1wt.eu>);
	Fri, 24 Jun 2016 11:14:34 -0400
Received: from mail.windriver.com ([147.11.1.11]:56696 "EHLO
	mail.windriver.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751121AbcFXPOc (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Fri, 24 Jun 2016 11:14:32 -0400
Message-ID: <576D4E56.3040504@windriver.com>
Date: Fri, 24 Jun 2016 09:14:30 -0600
From: Chris Friesen <chris.friesen@windriver.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: lkml <linux-kernel@vger.kernel.org>
Subject: why would we be spending a whole msec running just the cascade()
 function?
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.25.39.2]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

I'm trying to get an idea why we would be spending a whole millisecond running 
the cascade() routine.

This is on a CentOS 7 kernel (config modified as per below) so feel free to send 
me off to the other support fora if you like.

Running cyclictest gave the following:

cyclicte-29932  14d..2 282141264us : pick_next_task_idle <-__schedule
   <idle>-0      14d..1 282149191us : do_softirq <-irq_exit
   <idle>-0      14d..1 282149192us : __do_softirq <-call_softirq
   <idle>-0      14d..1 282149192us : msecs_to_jiffies <-__do_softirq
   <idle>-0      14..s1 282149192us : run_timer_softirq <-__do_softirq
   <idle>-0      14..s1 282149192us : _raw_spin_lock_irq <-run_timer_softirq
   <idle>-0      14d.s2 282149193us : cascade <-run_timer_softirq
(1167 similar lines removed)
   <idle>-0      14d.s2 282150286us : cascade <-run_timer_softirq
   <idle>-0      14d.s2 282150286us : __internal_add_timer <-cascade
   <idle>-0      14d.s2 282150287us : cascade <-run_timer_softirq
(20 similar lines removed)
   <idle>-0      14d.s2 282150305us : cascade <-run_timer_softirq
   <idle>-0      14d.s2 282150305us : __internal_add_timer <-cascade
   <idle>-0      14d.s2 282150306us : _raw_spin_unlock_irq <-run_timer_softirq

cyclicte-29932  14...1 282150347us : tracing_mark_write: hit latency threshold 
(1100 > 1000)


I vaguely understand what cascade() is doing, but it doesn't make any sense to 
me that it would do it for over a millisecond.   Is it coincidence that 
cascade() is called roughly every microsecond for just over a millisecond?

Overall the latency is actually pretty reasonable except for these sporadic 
glitches:

/home/wrsroot/cyclictest -S -m -p 80 -l 1000000
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 4.08 4.07 4.08 5/371 6845

T: 0 ( 5060) P:80 I:1000 C:1000000 Min:      1 Act:    2 Avg:    1 Max:      58
T: 1 ( 5061) P:80 I:1500 C: 666670 Min:      1 Act:    1 Avg:    1 Max:       3
T: 2 ( 5062) P:80 I:2000 C: 500002 Min:      2 Act:    2 Avg:    2 Max:       4
T: 3 ( 5063) P:80 I:2500 C: 400001 Min:      1 Act:    2 Avg:    1 Max:      10
T: 4 ( 5064) P:80 I:3000 C: 333333 Min:      1 Act:    2 Avg:    1 Max:       8
T: 5 ( 5065) P:80 I:3500 C: 285711 Min:      1 Act:    2 Avg:    2 Max:    1011
T: 6 ( 5066) P:80 I:4000 C: 249996 Min:      1 Act:    2 Avg:    2 Max:    1015
T: 7 ( 5067) P:80 I:4500 C: 222219 Min:      1 Act:    2 Avg:    2 Max:      17
T: 8 ( 5068) P:80 I:5000 C: 199996 Min:      1 Act:    1 Avg:    1 Max:       3
T: 9 ( 5069) P:80 I:5500 C: 181814 Min:      2 Act:    2 Avg:    2 Max:       5
T:10 ( 5070) P:80 I:6000 C: 166662 Min:      1 Act:    2 Avg:    2 Max:       5
T:11 ( 5071) P:80 I:6500 C: 153841 Min:      1 Act:    2 Avg:    2 Max:       8
T:12 ( 5072) P:80 I:7000 C: 142853 Min:      1 Act:    2 Avg:    2 Max:    1008
T:13 ( 5073) P:80 I:7500 C: 133329 Min:      2 Act:    3 Avg:    2 Max:       5
T:14 ( 5074) P:80 I:8000 C: 124996 Min:      1 Act:    2 Avg:    2 Max:       7
T:15 ( 5075) P:80 I:8500 C: 117643 Min:      1 Act:    2 Avg:    2 Max:       6


Kernel config has been modified as follows:

CONFIG_RCU_BOOST=y
CONFIG_DEBUG_PREEMPT=n
CONFIG_PROVE_RCU_DELAY=n
CONFIG_RCU_CPU_STALL_VERBOSE=n
CONFIG_PREEMPT_TRACER=n
CONFIG_RCU_BOOST_PRIO=1
CONFIG_RCU_BOOST_DELAY=500

Chris