From: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Jason Gunthorpe <jgg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Bart Van Assche <Bart.VanAssche-Sjgp3cTcYWE@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
<ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: Kernel v4.16 / v4.17 SRP and SRPT patches
Date: Wed, 10 Jan 2018 14:30:39 -0500
Message-ID: <1515612639.10153.3.camel@redhat.com>
In-Reply-To: <20180110191510.GK4518-uk2M96/98Pc@public.gmane.org>
On Wed, 2018-01-10 at 12:15 -0700, Jason Gunthorpe wrote:
> On Wed, Jan 10, 2018 at 01:59:10PM -0500, Laurence Oberman wrote:
>
> > Yep, this seems specific to the mlx5 and IB.
> > The problem though is Linus's tree 4.15-rc-7 already has enough of
> > the
> > part of the RDMA updates to see issues.
>
> Every time you post a backtrace it is different.. The only
> commonality
> seems to be that the CQ completion core appears to be processing
> garbage, accompanied by these sorts of sketch kernel messages from
> mlx5:
>
> > [ 1360.511682] mlx5_core 0000:08:00.1: Shutdown was called
> > [ 1360.550531] mlx5_core 0000:08:00.1:
> > mlx5_enter_error_state:121:(pid
> > [ 938.938946] mlx5_core 0000:08:00.1: Shutdown was called
> > [ 938.968423] mlx5_core 0000:08:00.1:
> > mlx5_cmd_force_teardown_hca:245:(pid 14752): teardown with force
> > mode failed
> > [ 938.978359] mlx5_core 0000:08:00.1:
> > mlx5_cmd_comp_handler:1445:(pid 13186): Command completion arrived
> > after timeout (entry idx = 0).
> > [ 942.209464] mlx5_1:wait_for_async_commands:735:(pid 14752): done
> > with all pending requests
>
> My other guess is a mlx5 issue where it is returning CQ wrids it
> should not return?
>
> Leon?
>
> I don't see anything changing in this area in rdma.git for-rc, so I
> can't give you a guess on a patch, sorry.
>
> Do you think this test ever worked for you? You said bisect, so I
> assume so?
>
> Jason
Hi Jason
Just to be clear, I have posted two types of stack traces: one where I
panic, and the other, quoted above, where I am not panicking.
This is not any special type of test. I booted the kernel, mapped the
SRP devices from the target server and proceeded to shut down the client
with shutdown -r now.
This is part of my holistic test I always do against new patches in
Bart's tree.
I start with reboots, then rmmod's etc. before I go on to perform I/O
against the LUNS from the target.
The panic was the first issue I came across after building a kernel
with Bart's tree.
I have not even started testing anything else yet.
The trace above was provided because Bart asked me to test two kernels,
1. Linus's tree 4.15-rc7
2. The RDMA tree.
Bart's tree panics the same as the RDMA tree I cloned.
I will look at prior release candidates in Linus's tree and see where
this may have crept in. I am of course puzzled why I am the only one to
see it; other folks must have MLX5 (CX4) like I do.
Would be good to know what test was last performed on the current RDMA
tree by Leon and team.
Regards
Laurence