From: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Jason Gunthorpe <jgg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Bart Van Assche <Bart.VanAssche-Sjgp3cTcYWE@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
<ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: Kernel v4.16 / v4.17 SRP and SRPT patches
Date: Wed, 10 Jan 2018 14:30:39 -0500
Message-ID: <1515612639.10153.3.camel@redhat.com>
In-Reply-To: <20180110191510.GK4518-uk2M96/98Pc@public.gmane.org>
On Wed, 2018-01-10 at 12:15 -0700, Jason Gunthorpe wrote:
> On Wed, Jan 10, 2018 at 01:59:10PM -0500, Laurence Oberman wrote:
>
> > Yep, this seems specific to the mlx5 and IB.
> > The problem though is Linus's tree 4.15-rc-7 already has enough of the
> > part of the RDMA updates to see issues.
>
> Every time you post a backtrace it is different.. The only commonality
> seems to be that the CQ completion core appears to be processing
> garbage, accompanied by these sorts of sketch kernel messages from
> mlx5:
>
> > [ 1360.511682] mlx5_core 0000:08:00.1: Shutdown was called
> > [ 1360.550531] mlx5_core 0000:08:00.1:
> > mlx5_enter_error_state:121:(pid
> > [ 938.938946] mlx5_core 0000:08:00.1: Shutdown was called
> > [ 938.968423] mlx5_core 0000:08:00.1:
> > mlx5_cmd_force_teardown_hca:245:(pid 14752): teardown with force
> > mode failed
> > [ 938.978359] mlx5_core 0000:08:00.1:
> > mlx5_cmd_comp_handler:1445:(pid 13186): Command completion arrived
> > after timeout (entry idx = 0).
> > [ 942.209464] mlx5_1:wait_for_async_commands:735:(pid 14752): done
> > with all pending requests
>
> My other guess is a mlx5 issue where it is returning CQ wrids it
> should not return?
>
> Leon?
>
> I don't see anything changing in this area in rdma.git for-rc, so I
> can't give you a guess on a patch, sorry.
>
> Do you think this test ever worked for you? You said bisect, so I
> assume so?
>
> Jason
Hi Jason
Just to be clear, I have posted two types of stack traces: one where the
kernel panics, and the one above where it does not.
This is not any special type of test. I booted the kernel, mapped the
SRP devices from the target server and proceeded to shut down the client
with shutdown -r now.
This is part of the holistic testing I always do against new patches in
Bart's tree.
I start with reboots, then rmmods etc. before I go on to perform I/O
against the LUNs from the target.
The panic was the first issue I came across after building a kernel
with Bart's tree.
I have not even started testing anything else yet.
The trace above was provided because Bart asked me to test two kernels,
1. Linus's tree 4.15-rc7
2. The RDMA tree.
Bart's tree panics the same way as the RDMA tree I cloned.
I will look at prior release candidates in Linus's tree and see where
this may have crept in. I am of course puzzled why I am the only one to
see it; other folks must have MLX5 (CX4) like I do.
It would be good to know what test was last performed on the current RDMA
tree by Leon and team.
Regards
Laurence