* [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load
@ 2013-07-22 15:23 Bart Van Assche
[not found] ` <51ED4E60.30203-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2013-07-22 15:23 UTC (permalink / raw)
To: linux-rdma, Or Gerlitz, David Miller
Avoid that kernel threads running mlx4_cmd_wait() contribute to the
system load by setting the task state to TASK_INTERRUPTIBLE instead
of TASK_UNINTERRUPTIBLE while waiting. This patch reduces the load
average from about 0.5 to about 0.0 on an idle system with one mlx4
HCA and no IB cables connected.
Note: I'm posting this patch as an RFC since it involves a behavior
change (a signal sent to a worker thread that is waiting for a
command to finish causes the command to fail) and since I'm not sure
this behavior change is acceptable.
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
---
drivers/net/ethernet/mellanox/mlx4/cmd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 299d018..fb7f7fa 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -559,8 +559,8 @@ static int mlx4_cmd_wait(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
 	mlx4_cmd_post(dev, in_param, out_param ? *out_param : 0,
 		      in_modifier, op_modifier, op, context->token, 1);
 
-	if (!wait_for_completion_timeout(&context->done,
-					 msecs_to_jiffies(timeout))) {
+	if (wait_for_completion_interruptible_timeout(&context->done,
+					msecs_to_jiffies(timeout)) <= 0) {
 		mlx4_warn(dev, "command 0x%x timed out (go bit not cleared)\n",
 			  op);
 		err = -EBUSY;
--
1.7.10.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
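
Editorial sketch of the behavior change described in the note above: wait_for_completion_interruptible_timeout() returns a positive value (remaining jiffies) when the completion fired, 0 on timeout, and -ERESTARTSYS when a signal interrupted the wait. The `<= 0` test in the patch sends both the timeout and the signal case down the same path, so an interrupted wait is logged as "timed out" and reported as -EBUSY. The userspace C model below only illustrates how the three cases could be told apart. The names classify_wait() and wait_result_to_err() are hypothetical and not part of mlx4, and the choice of -EINTR for the interrupted case is an assumption made for illustration, not something proposed in this thread.

```c
#include <errno.h>

/*
 * ERESTARTSYS is kernel-internal and is not exported by the userspace
 * <errno.h>; 512 is the value used in the kernel headers.
 */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical names, used only in this sketch. */
enum wait_outcome {
	WAIT_COMPLETED,		/* completion was signalled in time */
	WAIT_TIMED_OUT,		/* timeout expired first */
	WAIT_INTERRUPTED	/* a signal woke the sleeping task */
};

/* Classify a return value of wait_for_completion_interruptible_timeout(). */
static enum wait_outcome classify_wait(long ret)
{
	if (ret > 0)
		return WAIT_COMPLETED;
	if (ret == 0)
		return WAIT_TIMED_OUT;
	return WAIT_INTERRUPTED;	/* ret < 0, i.e. -ERESTARTSYS */
}

/*
 * Map the wait result to an error code. -EBUSY for a timeout matches the
 * existing driver code; -EINTR for a signal is an assumption of this sketch.
 */
static int wait_result_to_err(long ret)
{
	switch (classify_wait(ret)) {
	case WAIT_COMPLETED:
		return 0;
	case WAIT_TIMED_OUT:
		return -EBUSY;
	default:
		return -EINTR;
	}
}
```

With such a split, only the WAIT_TIMED_OUT branch would print the "go bit not cleared" warning. Whether failing the command on a signal is acceptable at all remains the open question of the RFC.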
* Re: [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load
[not found] ` <51ED4E60.30203-HInyCGIudOg@public.gmane.org>
@ 2013-07-24 15:17 ` Or Gerlitz
[not found] ` <51EFF018.7050409-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Or Gerlitz @ 2013-07-24 15:17 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-rdma, David Miller

On 22/07/2013 18:23, Bart Van Assche wrote:
> Avoid that kernel threads running mlx4_cmd_wait() contribute to the
> system load by setting the task state to TASK_INTERRUPTIBLE instead
> of TASK_UNINTERRUPTIBLE while waiting. This patch reduces the load
> average from about 0.5 to about 0.0 on an idle system with one mlx4
> HCA and no IB cables connected.

Before diving into the implications of the patch, let's discuss the
phenomenon you see...
So the load average on this idle system is 0.5 or 0.05?

I don't see 0.5 or the like on my systems that are installed with HCAs
and are idle. Could it be that some IB management entity is repeatedly
sending MADs to this system?

Or.
* Re: [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load
[not found] ` <51EFF018.7050409-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-07-24 16:48 ` Bart Van Assche
[not found] ` <51F0055F.7000703-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2013-07-24 16:48 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma, David Miller

On 07/24/13 17:17, Or Gerlitz wrote:
> On 22/07/2013 18:23, Bart Van Assche wrote:
>> Avoid that kernel threads running mlx4_cmd_wait() contribute to the
>> system load by setting the task state to TASK_INTERRUPTIBLE instead
>> of TASK_UNINTERRUPTIBLE while waiting. This patch reduces the load
>> average from about 0.5 to about 0.0 on an idle system with one mlx4
>> HCA and no IB cables connected.
>
> Before diving into the implications of the patch, let's discuss the
> phenomenon you see...
> So the load average on this idle system is 0.5 or 0.05?
>
> I don't see 0.5 or the like on my systems that are installed with HCAs
> and are idle. Could it be that some IB management entity is repeatedly
> sending MADs to this system?

Hello Or,

I saw a load of 0.5 with several different upstream kernels (3.6..3.10
at least). The only IB-related process that was running on the system
was opensmd. This is definitely reproducible. It was only a month
after I had noticed this phenomenon that I started searching for the
root cause.

Bart.
* Re: [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load
[not found] ` <51F0055F.7000703-HInyCGIudOg@public.gmane.org>
@ 2013-07-24 17:06 ` Or Gerlitz
[not found] ` <51F0097C.4010806-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Or Gerlitz @ 2013-07-24 17:06 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-rdma, David Miller

On 24/07/2013 19:48, Bart Van Assche wrote:
> I saw a load of 0.5 with several different upstream kernels (3.6..3.10
> at least). The only IB-related process that was running on the system
> was opensmd. This is definitely reproducible. It was only a month
> after I had noticed this phenomenon that I started searching for the
> root cause.

Do you see it also on systems that don't run opensm?
* Re: [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load
[not found] ` <51F0097C.4010806-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-07-24 17:13 ` Bart Van Assche
2013-07-24 18:34 ` Bart Van Assche
1 sibling, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2013-07-24 17:13 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma, David Miller

On 07/24/13 19:06, Or Gerlitz wrote:
> On 24/07/2013 19:48, Bart Van Assche wrote:
>> I saw a load of 0.5 with several different upstream kernels (3.6..3.10
>> at least). The only IB-related process that was running on the system
>> was opensmd. This is definitely reproducible. It was only a month
>> after I had noticed this phenomenon that I started searching for the
>> root cause.
> Do you see it also on systems that don't run opensm?

That's a test I have not yet run. So sorry, I don't know whether this
also happens without opensm.

Bart.
* Re: [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load
[not found] ` <51F0097C.4010806-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-07-24 17:13 ` Bart Van Assche
@ 2013-07-24 18:34 ` Bart Van Assche
1 sibling, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2013-07-24 18:34 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma, David Miller

On 07/24/13 19:06, Or Gerlitz wrote:
> On 24/07/2013 19:48, Bart Van Assche wrote:
>> I saw a load of 0.5 with several different upstream kernels (3.6..3.10
>> at least). The only IB-related process that was running on the system
>> was opensmd. This is definitely reproducible. It was only a month
>> after I had noticed this phenomenon that I started searching for the
>> root cause.
> Do you see it also on systems that don't run opensm?

Yes. This happens both on systems running opensm and on systems not
running opensm. A call trace from a system on which CPU load was higher
than expected is as follows:

# echo w >/proc/sysrq-trigger; dmesg -c
SysRq : Show Blocked State
  task                        PC stack   pid father
kworker/u:7     D ffff88011fa125c0     0   181      2 0x00000000
 ffff880114d63b48 0000000000000046 ffff8801158d1c40 ffff880114d63fd8
 ffff880114d63fd8 ffff880114d63fd8 ffffffff81613440 ffff8801158d1c40
 ffffffff817a7180 ffff880114d63b80 ffffffff817a7180 000000010000ebf9
Call Trace:
 [<ffffffff813ea259>] schedule+0x29/0x70
 [<ffffffff813e88aa>] schedule_timeout+0x10a/0x1e0
 [<ffffffff813e98e2>] wait_for_common+0xd2/0x180
 [<ffffffff813e9a63>] wait_for_completion_timeout+0x13/0x20
 [<ffffffffa03bfb99>] __mlx4_cmd+0x259/0x5e0 [mlx4_core]
 [<ffffffffa03d60e4>] mlx4_SENSE_PORT+0x54/0xc0 [mlx4_core]
 [<ffffffffa03d620f>] mlx4_do_sense_ports+0xbf/0xd0 [mlx4_core]
 [<ffffffffa03d6262>] mlx4_sense_port+0x42/0xc0 [mlx4_core]
 [<ffffffff81055f9c>] process_one_work+0x16c/0x4b0
 [<ffffffff8105825d>] worker_thread+0x15d/0x450
 [<ffffffff8105d5b0>] kthread+0xc0/0xd0
 [<ffffffff813f34dc>] ret_from_fork+0x7c/0xb0

Bart.
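
Editorial sketch of why a worker blocked like the one in the trace above raises the load average: the Linux load average counts tasks in uninterruptible sleep (state D) in addition to runnable ones, and folds the sampled count into a fixed-point exponential moving average. The userspace C model below is patterned on the kernel's 1-minute calculation (FSHIFT, FIXED_1 and EXP_1 are the kernel's constants). The function simulate_load() and its `blocked_every` parameter are hypothetical, and the assumption that the worker is caught in state D on every second sample is purely illustrative. The thread does not say how often the port-sense command is actually found blocked.

```c
/* Fixed-point constants as used by the kernel's load average code. */
#define FSHIFT	11			/* bits of fractional precision */
#define FIXED_1	(1UL << FSHIFT)		/* 1.0 in fixed point */
#define EXP_1	1884UL			/* 1/exp(5s/1min) in fixed point */

/* One averaging step; 'active' is the task count scaled by FIXED_1. */
static unsigned long calc_load(unsigned long load, unsigned long exp,
			       unsigned long active)
{
	load *= exp;
	load += active * (FIXED_1 - exp);
	load += 1UL << (FSHIFT - 1);	/* round to nearest */
	return load >> FSHIFT;
}

/*
 * Hypothetical model: one task is found in state D on every
 * 'blocked_every'-th sample (0 means never), all else is idle.
 * Returns the 1-minute load in fixed point after 'samples' samples.
 */
static unsigned long simulate_load(unsigned int blocked_every,
				   unsigned int samples)
{
	unsigned long load = 0;
	unsigned int i;

	for (i = 0; i < samples; i++) {
		unsigned long active = 0;

		if (blocked_every && i % blocked_every == 0)
			active = 1 * FIXED_1;	/* one uninterruptible task */
		load = calc_load(load, EXP_1, active);
	}
	return load;
}
```

Dividing the result by FIXED_1 gives the familiar decimal value. Under the stated assumption, simulate_load(2, 200) settles near half of FIXED_1, which is the order of magnitude reported in this thread, while a task that sleeps in TASK_INTERRUPTIBLE would not be counted at all.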
Thread overview: 6+ messages
2013-07-22 15:23 [PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load Bart Van Assche
[not found] ` <51ED4E60.30203-HInyCGIudOg@public.gmane.org>
2013-07-24 15:17 ` Or Gerlitz
[not found] ` <51EFF018.7050409-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-07-24 16:48 ` Bart Van Assche
[not found] ` <51F0055F.7000703-HInyCGIudOg@public.gmane.org>
2013-07-24 17:06 ` Or Gerlitz
[not found] ` <51F0097C.4010806-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-07-24 17:13 ` Bart Van Assche
2013-07-24 18:34 ` Bart Van Assche