* task switch from net-rx to idle when there is napi processing to be done
@ 2012-01-25 2:39 Venkat Subbiah
2012-01-25 8:55 ` Uwe Kleine-König
0 siblings, 1 reply; 7+ messages in thread
From: Venkat Subbiah @ 2012-01-25 2:39 UTC
To: RT
In the process of debugging a napi ethernet driver performance issue,
I am noticing the following:
1. While the driver is in the middle of a napi packet processing loop,
there is a task switch from sirq-net-rx to idle even though there is
pending napi processing to be done.
2. This task switch seems to happen every second:
venkat@vs-lnx:~/nfss/trace$ grep "sched_switch: task sirq-net-rx" trace | grep swapper
sirq-net-rx/0-7 [000] 3800.664615: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
sirq-net-rx/0-7 [000] 3801.663616: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
sirq-net-rx/0-7 [000] 3802.664615: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
sirq-net-rx/0-7 [000] 3803.664615: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
sirq-net-rx/0-7 [000] 3804.664614: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
sirq-net-rx/0-7 [000] 3805.664619: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
3. A log of one of these task switches is as follows:
sirq-net-rx/0-7 [000] 3800.664567: cvm_oct_napi_poll_38: napi_poll_cnt=2480984 backlog=2 rx_count=18 drop_cnt=0
sirq-net-rx/0-7 [000] 3800.664569: cvm_oct_napi_poll_38 <-net_rx_action
sirq-net-rx/0-7 [000] 3800.664608: preempt_schedule_irq <-need_resched
sirq-net-rx/0-7 [000] 3800.664610: __schedule <-preempt_schedule_irq
sirq-net-rx/0-7 [000] 3800.664615: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]
<idle>-0 [000] 3800.714604: __schedule <-cpu_idle
<idle>-0 [000] 3800.714608: sched_switch: task swapper:0 [120] (R) ==> sirq-net-rx/0:7 [50]
sirq-net-rx/0-7 [000] 3800.714611: __schedule <-preempt_schedule_irq
sirq-net-rx/0-7 [000] 3800.714691: cvm_oct_napi_poll_38: napi_poll_cnt=2480985 backlog=1 rx_count=32 drop_cnt=0
4. The logs in 3 show that the driver was in the napi polling
method cvm_oct_napi_poll_38 when the scheduler was invoked.
This was probably due to a hard irq happening. The main
problem I see is the scheduler switching tasks from
sirq-net-rx to idle even though there is napi processing to be done.
I would appreciate any hints to debug this further.
Notes
------
* This is with PREEMPT_RT turned on, on a 2.6.32 kernel
which has a backport of the 2.6.33.9-rt31 patch.
Thanks a lot for reading this far and any thoughts you may have!
-Venkat
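
The two observations in the message above (a switch to idle about once per second, and roughly 0.05 seconds spent in idle) can be read off the trace mechanically. The following is a minimal sketch, assuming each sched_switch event sits on a single line in the format shown in the message; the helper names (parse_switches, gaps_to_idle, idle_durations) are made up for illustration and are not part of any tracing tool.

```python
import re

# Timestamp plus the outgoing and incoming task of a sched_switch line, e.g.
# "... [000] 3800.664615: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]"
SWITCH = re.compile(
    r"\]\s+(?P<ts>\d+\.\d+):\s+sched_switch:\s+task\s+(?P<prev>\S+)\s.*==>\s+(?P<next>\S+)"
)


def parse_switches(lines):
    """Return (timestamp, previous task, next task) for every sched_switch line."""
    events = []
    for line in lines:
        match = SWITCH.search(line)
        if match:
            events.append(
                (float(match.group("ts")), match.group("prev"), match.group("next"))
            )
    return events


def gaps_to_idle(events):
    """Seconds between successive switches from sirq-net-rx to swapper."""
    times = [
        ts
        for ts, prev, nxt in events
        if prev.startswith("sirq-net-rx") and nxt.startswith("swapper")
    ]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def idle_durations(events):
    """Seconds spent in swapper before sirq-net-rx is switched back in."""
    durations = []
    left_at = None
    for ts, prev, nxt in events:
        if prev.startswith("sirq-net-rx") and nxt.startswith("swapper"):
            left_at = ts
        elif prev.startswith("swapper") and nxt.startswith("sirq-net-rx"):
            if left_at is not None:
                durations.append(ts - left_at)
                left_at = None
    return durations


# Three events copied from the trace in the message above.
sample = [
    "sirq-net-rx/0-7 [000] 3800.664615: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]",
    "<idle>-0 [000] 3800.714608: sched_switch: task swapper:0 [120] (R) ==> sirq-net-rx/0:7 [50]",
    "sirq-net-rx/0-7 [000] 3801.663616: sched_switch: task sirq-net-rx/0:7 [50] (R) ==> swapper:0 [120]",
]

events = parse_switches(sample)
print("gaps between switches to idle:", gaps_to_idle(events))
print("time spent in idle:", idle_durations(events))
```

On these three lines the gap comes out at just under one second and the idle stretch at just under 0.05 seconds, which are the two figures the rest of the thread turns on.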
* Re: task switch from net-rx to idle when there is napi processing to be done
From: Uwe Kleine-König @ 2012-01-25 8:55 UTC
To: Venkat Subbiah; +Cc: RT

Hello,

On Tue, Jan 24, 2012 at 06:39:34PM -0800, Venkat Subbiah wrote:
> In the process of debugging a napi ethernet driver performance
> issue, what I am noticing is
>
> 1. While the driver is in the middle of a napi packet processing
> loop, there is a task switch from
> sirq-net-rx to idle even though there is pending napi processing to be done.

I didn't check your logs below, but maybe this is related to the default
settings in /proc/sys/kernel/sched_rt_period_us and
/proc/sys/kernel/sched_rt_runtime_us? That is, 0.05s per second is
reserved for non-RT tasks such that a run-away realtime process
will not lock up the machine.

To verify that, try

	echo -1 > /proc/sys/kernel/sched_rt_runtime_us

Best regards
Uwe

--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
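
The 0.05s figure follows from the stock values of the two sysctls: a period of 1000000 us and a runtime of 950000 us. A small sketch of that arithmetic, assuming those defaults; the function name reserved_per_period_us is hypothetical and only illustrates the calculation, it is not a kernel or library interface.

```python
# Stock values of /proc/sys/kernel/sched_rt_period_us and
# /proc/sys/kernel/sched_rt_runtime_us (assumed here, not read from a system).
DEFAULT_PERIOD_US = 1_000_000
DEFAULT_RUNTIME_US = 950_000


def reserved_per_period_us(period_us, runtime_us):
    """Microseconds per period that realtime tasks are not allowed to use.

    A negative runtime (the "echo -1" case) disables the limit, so nothing
    is reserved.
    """
    if runtime_us < 0:
        return 0
    return max(period_us - runtime_us, 0)


reserved = reserved_per_period_us(DEFAULT_PERIOD_US, DEFAULT_RUNTIME_US)
print(f"reserved for non-RT tasks: {reserved / 1_000_000:.2f} s per "
      f"{DEFAULT_PERIOD_US / 1_000_000:.0f} s period")
print("with runtime -1:", reserved_per_period_us(DEFAULT_PERIOD_US, -1), "us")
```

With the defaults, 50000 us out of every 1000000 us are withheld from realtime tasks, which would match a thread at realtime priority being switched out for about 0.05 seconds once per second.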
* Re: task switch from net-rx to idle when there is napi processing to be done
From: Mike Galbraith @ 2012-01-25 9:18 UTC
To: Uwe Kleine-König; +Cc: Venkat Subbiah, RT

On Wed, 2012-01-25 at 09:55 +0100, Uwe Kleine-König wrote:
> Hello,
>
> On Tue, Jan 24, 2012 at 06:39:34PM -0800, Venkat Subbiah wrote:
> > In the process of debugging a napi ethernet driver performance
> > issue, what I am noticing is
> >
> > 1. While the driver is in the middle of a napi packet processing
> > loop, there is a task switch from
> > sirq-net-rx to idle even though there is pending napi processing to be done.
> I didn't check your logs below, but maybe this is related to the default
> settings in /proc/sys/kernel/sched_rt_period_us and
> /proc/sys/kernel/sched_rt_runtime_us? That is, 0.05s per second is
> reserved for non-RT tasks such that a run-away realtime process
> will not lock up the machine.
>
> To verify that, try
>
> echo -1 > /proc/sys/kernel/sched_rt_runtime_us

Hm, makes sense if this is a UP box. SMP would just go borrow a cup of
runtime from a neighbor.

-Mike
* Re: task switch from net-rx to idle when there is napi processing to be done
From: Venkat Subbiah @ 2012-01-25 11:40 UTC
To: Mike Galbraith; +Cc: Uwe Kleine-König, Subbiah, Venkat, RT

On 01/25/2012 01:18 AM, Mike Galbraith wrote:
> On Wed, 2012-01-25 at 09:55 +0100, Uwe Kleine-König wrote:
>> Hello,
>>
>> On Tue, Jan 24, 2012 at 06:39:34PM -0800, Venkat Subbiah wrote:
>>> In the process of debugging a napi ethernet driver performance
>>> issue, what I am noticing is
>>>
>>> 1. While the driver is in the middle of a napi packet processing
>>> loop, there is a task switch from
>>> sirq-net-rx to idle even though there is pending napi processing to be done.
>> I didn't check your logs below, but maybe this is related to the default
>> settings in /proc/sys/kernel/sched_rt_period_us and
>> /proc/sys/kernel/sched_rt_runtime_us? That is, 0.05s per second is
>> reserved for non-RT tasks such that a run-away realtime process
>> will not lock up the machine.
>>
>> To verify that, try
>>
>> echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> Hm, makes sense if this is a UP box. SMP would just go borrow a cup of
> runtime from a neighbor.

Yes, this experiment was run with only one core.

> -Mike
* Re: task switch from net-rx to idle when there is napi processing to be done
From: Venkat Subbiah @ 2012-01-25 11:30 UTC
To: Uwe Kleine-König; +Cc: Subbiah, Venkat, RT

On 01/25/2012 12:55 AM, Uwe Kleine-König wrote:
> Hello,
>
> On Tue, Jan 24, 2012 at 06:39:34PM -0800, Venkat Subbiah wrote:
>> In the process of debugging a napi ethernet driver performance
>> issue, what I am noticing is
>>
>> 1. While the driver is in the middle of a napi packet processing
>> loop, there is a task switch from
>> sirq-net-rx to idle even though there is pending napi processing to be done.
> I didn't check your logs below, but maybe this is related to the default
> settings in /proc/sys/kernel/sched_rt_period_us and
> /proc/sys/kernel/sched_rt_runtime_us? That is, 0.05s per second is
> reserved for non-RT tasks such that a run-away realtime process
> will not lock up the machine.
>
> To verify that, try
>
> echo -1 > /proc/sys/kernel/sched_rt_runtime_us

Thanks for your response. That was it. Setting this to -1 gives the
expected behavior.

Then I tried playing with these settings and set
/proc/sys/kernel/sched_rt_runtime_us to 95000
/proc/sys/kernel/sched_rt_period_us to 100000
Even with these values, the switch from sirq-net-rx to idle happens every
second and stays in idle for 0.05 seconds. Are there any restrictions on
what these can be set to? I guess these settings may not be reasonable.
I did verify by doing a cat of these files and read back the expected values.

Then I tried
/proc/sys/kernel/sched_rt_period_us to 1000000
/proc/sys/kernel/sched_rt_runtime_us to 980000
Even here the idle is for 0.05 seconds.

> Best regards
> Uwe
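
For comparison, here is what the two tried settings would be expected to reserve if the period/runtime pair were the only thing in play. This is a sketch of the arithmetic only, under that assumption; expected_idle_s is a hypothetical helper, and the thread does not establish why the observed idle time stayed at 0.05 seconds.

```python
OBSERVED_IDLE_S = 0.05  # idle time reported in the message for both settings


def expected_idle_s(period_us, runtime_us):
    """Seconds per period left over for non-RT tasks, if the pair applies as written."""
    if runtime_us < 0:  # -1 means no limit
        return 0.0
    return max(period_us - runtime_us, 0) / 1_000_000


# (period_us, runtime_us) pairs taken from the message above.
tried = {
    "period 100000 / runtime 95000": (100_000, 95_000),
    "period 1000000 / runtime 980000": (1_000_000, 980_000),
}

for label, (period_us, runtime_us) in tried.items():
    idle = expected_idle_s(period_us, runtime_us)
    print(f"{label}: expected {idle:.3f} s per period, "
          f"observed {OBSERVED_IDLE_S:.2f} s")
```

Neither expected value (0.005 s per 100 ms period, 0.02 s per 1 s period) equals the observed 0.05 s, while the stock pair of 1000000 and 950000 would give exactly that figure.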
* Re: task switch from net-rx to idle when there is napi processing to be done
From: Steven Rostedt @ 2012-01-26 15:29 UTC
To: Venkat Subbiah; +Cc: Uwe Kleine-König, Subbiah, Venkat, RT

On Wed, 2012-01-25 at 03:30 -0800, Venkat Subbiah wrote:
> On 01/25/2012 12:55 AM, Uwe Kleine-König wrote:
> > Hello,
> >
> > On Tue, Jan 24, 2012 at 06:39:34PM -0800, Venkat Subbiah wrote:
> >> In the process of debugging a napi ethernet driver performance
> >> issue, what I am noticing is
> >>
> >> 1. While the driver is in the middle of a napi packet processing
> >> loop, there is a task switch from
> >> sirq-net-rx to idle even though there is pending napi processing to be done.
> > I didn't check your logs below, but maybe this is related to the default
> > settings in /proc/sys/kernel/sched_rt_period_us and
> > /proc/sys/kernel/sched_rt_runtime_us? That is, 0.05s per second is
> > reserved for non-RT tasks such that a run-away realtime process
> > will not lock up the machine.
> >
> > To verify that, try
> >
> > echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> Thanks for your response. That was it. Setting this to -1 does the expected.

Note, you should also have seen a warning in the logs when an RT task is
throttled. Did you see such a thing?

-- Steve
* Re: task switch from net-rx to idle when there is napi processing to be done
From: Venkat Subbiah @ 2012-01-26 20:35 UTC
To: Steven Rostedt; +Cc: Subbiah, Venkat, Uwe Kleine-König, RT

On 01/26/2012 07:29 AM, Steven Rostedt wrote:
> On Wed, 2012-01-25 at 03:30 -0800, Venkat Subbiah wrote:
>> On 01/25/2012 12:55 AM, Uwe Kleine-König wrote:
>>> Hello,
>>>
>>> On Tue, Jan 24, 2012 at 06:39:34PM -0800, Venkat Subbiah wrote:
>>>> In the process of debugging a napi ethernet driver performance
>>>> issue, what I am noticing is
>>>>
>>>> 1. While the driver is in the middle of a napi packet processing
>>>> loop, there is a task switch from
>>>> sirq-net-rx to idle even though there is pending napi processing to be done.
>>> I didn't check your logs below, but maybe this is related to the default
>>> settings in /proc/sys/kernel/sched_rt_period_us and
>>> /proc/sys/kernel/sched_rt_runtime_us? That is, 0.05s per second is
>>> reserved for non-RT tasks such that a run-away realtime process
>>> will not lock up the machine.
>>>
>>> To verify that, try
>>>
>>> echo -1 > /proc/sys/kernel/sched_rt_runtime_us
>> Thanks for your response. That was it. Setting this to -1 does the expected.
> Note, you should also have seen a warning in the logs when an RT task is
> throttled. Did you see such a thing?

I didn't get to the console, but it is in the kernel logs.

> -- Steve