From: Dongsu Park <dongsu.park@profitbricks.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
linux-scsi <linux-scsi@vger.kernel.org>,
David Dillow <dillowda@ornl.gov>
Subject: Re: [PATCH 00/20, v4] Make ib_srp better suited for H.A. purposes
Date: Tue, 28 Aug 2012 14:25:28 +0200
Message-ID: <20120828122528.GB28144@gmail.com>
In-Reply-To: <503C97AC.9060703@acm.org>
Hi Bart,
On 28.08.2012 10:04, Bart Van Assche wrote:
> On 08/27/12 18:37, Dongsu Park wrote:
> > while testing ib_srp based on your srp-ha,
> > we sometimes hit kernel crashes with the call trace below.
> >
> > How to reproduce:
> >
> > 0. Kernel 3.2.15 with SCST v4193 on the target,
> > Kernel 3.2.8 with ib_srp-ha on the initiator.
> > 1. Configure 500+ vdisks on target, and get initiator connected.
> > 2. Exchange data intensively, which works well.
> > 3. (On initiator) delete SRP remote port occasionally, e.g.
> > # echo "1" > /sys/class/srp_remote_ports/port-6\:1/delete
> > And configure again the SRP target.
> > 4. (On target) disable Infiniband interface, and enable it again.
> > 5. Repeat 3 and 4.
> >
> > Then the initiator's kernel suddenly crashes. (but not always)
> >
> > Do you have any idea why?
>
> Hello Dongsu,
>
> That's unfortunate. I've just finished running the above test 1000 times
> on my test setup. The test ran perfectly - login succeeded every time,
> the test finished in the expected time, no kernel crash did occur and no
> memory was leaked. I've been running my test with kernel 3.6-rc3 instead
> of kernel 3.2.8 though. Can you repeat your test with kernel 3.6-rc3 on
> the initiator system instead of kernel 3.2.8 ? The 3.6-rc3 kernel
> contains multiple patches that improve robustness with regard to SCSI
> device removal.
Ok, when I get a chance to set up a new test system with kernel 3.6-rc3,
I'll do a new test and let you know.
By the way, as far as I could observe today, the crash occurs only if
rport_dev_loss_timedout() is called. That is, without a device loss,
a simple rport_delete does not cause any crash.
Could that be because the arguments to pr_err() access invalid
addresses?
drivers/scsi/scsi_transport_srp.c:275
	pr_err("SRP transport: dev_loss_tmo (%ds) expired - removing %s.\n",
	       rport->dev_loss_tmo, dev_name(&rport->dev));
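To make the suspicion concrete, here is a small userspace sketch, not the
actual srp-ha code. It assumes (unverified) that a delete could release the
rport while the dev_loss timer is still pending, in which case the handler
would read rport->dev_loss_tmo and the device name from freed memory. All
fake_* names and demo_delete_before_timeout() are made up for illustration,
and the plain integer counter only stands in for whatever reference counting
the kernel would use. The sketch shows the safe ordering, where the pending
timer owns its own reference:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct srp_rport; NOT the kernel structure. */
struct fake_rport {
	int refcount;      /* plain counter, illustration only */
	int dev_loss_tmo;  /* timeout in seconds */
	char name[32];     /* stands in for dev_name(&rport->dev) */
	int *freed_flag;   /* set to 1 when the object is released */
};

static struct fake_rport *fake_rport_alloc(const char *name, int tmo,
					   int *freed_flag)
{
	struct fake_rport *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->refcount = 1;   /* reference owned by the creator */
	r->dev_loss_tmo = tmo;
	r->freed_flag = freed_flag;
	snprintf(r->name, sizeof(r->name), "%s", name);
	return r;
}

static void fake_rport_get(struct fake_rport *r)
{
	r->refcount++;
}

static void fake_rport_put(struct fake_rport *r)
{
	assert(r->refcount > 0);
	if (--r->refcount == 0) {
		*r->freed_flag = 1;
		free(r);
	}
}

/* Stand-in for the timeout handler: reads the same two fields as pr_err(). */
static void fake_dev_loss_timedout(struct fake_rport *r, char *buf, size_t len)
{
	snprintf(buf, len, "dev_loss_tmo (%ds) expired - removing %s.",
		 r->dev_loss_tmo, r->name);
	fake_rport_put(r); /* drop the reference owned by the timer */
}

/*
 * Scenario: the delete arrives while the timer is still pending.
 * Returns 0 if the object stayed valid until the handler ran and was
 * released afterwards, -1 otherwise.
 */
int demo_delete_before_timeout(char *buf, size_t len)
{
	int freed = 0;
	struct fake_rport *r = fake_rport_alloc("port-6:1", 60, &freed);

	if (!r)
		return -1;
	fake_rport_get(r);  /* arming the timer takes its own reference */
	fake_rport_put(r);  /* "delete": the creator drops its reference */
	if (freed)
		return -1;  /* would mean the handler sees freed memory */
	fake_dev_loss_timedout(r, buf, len);
	return freed ? 0 : -1;
}
```

Calling demo_delete_before_timeout() with a 128-byte buffer from a main()
returns 0 here. If the fake_rport_get() call were dropped, the "delete" would
free the object before the handler runs, which is the kind of access I have
in mind for the pr_err() arguments.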
Cheers,
Dongsu