From: Dongsu Park <dongsu.park-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
To: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Sebastian Riemer
<sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>,
Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Subject: Re: [PATCH 11/20] ib_srp: Make srp_disconnect_target() wait for IB completions
Date: Fri, 24 Aug 2012 12:42:37 +0200
Message-ID: <20120824104237.GB4227@gmail.com>
In-Reply-To: <50365DC3.1050807-HInyCGIudOg@public.gmane.org>
Hi Bart,
I'll try to explain, as Sebastian is on vacation.
On 23.08.2012 16:43, Bart Van Assche wrote:
> On 08/23/12 15:59, Sebastian Riemer wrote:
> > we've triggered the WARN_ON() in srp_wait_last_send_wqe() by connecting
> > to a disabled SCST SRP target.
> >
> > I would remove that one.
> >
> > [ ... ]
> >
> >> +	while (!target->last_send_wqe && time_before(jiffies, deadline)) {
> >> +		srp_send_completion(target->send_cq, target);
> >> +		msleep(20);
> >> +	}
> >> +
> >> +	WARN_ON(!target->last_send_wqe);
> >
> > <-- here it is - remove it
>
> But why was that WARN_ON() statement hit ? srp_wait_last_send_wqe() is
> invoked after the QP has been transitioned into the error state. It is
> the responsibility of the HCA to generate an error completion for any
> work queued on a QP that is in the error state. If that WARN_ON()
> statement has been hit that means that it took more than the RC timeout
> before the HCA finished processing earlier queued work and generated an
> error completion. That's not really something I had expected.
That usually occurs when multiple targets are released at the same time.
A typical case is unloading the ib_srp.ko kernel module immediately,
which tears down every InfiniBand connection.
It does not occur every time, though, which makes it hard for us to test.
Example of kernel trace:
WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:529 srp_disconnect_target+0x317/0x320 [ib_srp]()
Hardware name: H8DGU
Modules linked in:
rdma_ucm rdma_cm iw_cm ib_addr ib_ipoib ib_uverbs ib_umad ib_srp
scsi_transport_srp scsi_tgt ib_cm ib_sa loop ib_mthca psmouse ib_mad
amd64_edac_mod edac_core i2c_piix4 evdev serio_raw edac_mce_amd
ib_core tpm_tis tpm tpm_bios processor button thermal_sys sg
hid_cherry sd_mod crc_t10dif usb_storage ahci libahci libata scsi_mod
[last unloaded: scsi_wait_scan]
Pid: 101, comm: kworker/1:1 Tainted: G W 3.2.8-pserver #1
Call Trace:
[<ffffffff81048dbb>] ? warn_slowpath_common+0x7b/0xc0
[<ffffffffa00a9b07>] ? srp_disconnect_target+0x317/0x320 [ib_srp]
[<ffffffff8106a640>] ? wake_up_bit+0x40/0x40
[<ffffffffa00ab0bf>] ? srp_remove_work+0x13f/0x1c0 [ib_srp]
[<ffffffffa00aaf80>] ? srp_free_req_data+0xd0/0xd0 [ib_srp]
[<ffffffff81063383>] ? process_one_work+0x113/0x470
[<ffffffff81065c73>] ? worker_thread+0x163/0x3e0
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff8106a126>] ? kthread+0x96/0xa0
[<ffffffff8165f674>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8106a090>] ? kthread_worker_fn+0x180/0x180
[<ffffffff8165f670>] ? gs_change+0x13/0x13
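To illustrate why the warning can fire, here is a minimal user-space
sketch of a bounded wait loop shaped like the one in the quoted patch.
It is not the ib_srp code: struct fake_target, poll_send_cq(), now_ms()
and the polls_until_done field are hypothetical stand-ins, and the
simulated latency only models "the HCA delivers the last completion
late". If the completion arrives after the deadline, the loop exits with
the flag still clear, which is the condition the WARN_ON() checks.

```c
/* User-space sketch of a bounded wait loop; NOT the ib_srp driver code. */
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the relevant part of the SRP target state. */
struct fake_target {
	bool last_send_wqe;    /* set once the final send completion is seen */
	int polls_until_done;  /* simulated completion latency, in polls */
};

/* Hypothetical stand-in for srp_send_completion(): reap one completion. */
static void poll_send_cq(struct fake_target *t)
{
	if (t->polls_until_done > 0 && --t->polls_until_done == 0)
		t->last_send_wqe = true;
}

/* Monotonic clock in milliseconds, playing the role of jiffies. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until the last send completion shows up or the deadline passes.
 * Returns true if the completion arrived in time, false on timeout.
 */
static bool wait_last_send_wqe(struct fake_target *t, long timeout_ms,
			       long poll_ms)
{
	long long deadline = now_ms() + timeout_ms;
	struct timespec nap = {
		.tv_sec = poll_ms / 1000,
		.tv_nsec = (poll_ms % 1000) * 1000000L,
	};

	while (!t->last_send_wqe && now_ms() < deadline) {
		poll_send_cq(t);
		nanosleep(&nap, NULL);
	}
	return t->last_send_wqe;
}
```

Calling wait_last_send_wqe() on a target that needs 2 polls with a
200 ms timeout returns true, while a target that needs 1000 polls at a
5 ms interval with a 50 ms timeout returns false. The second case
corresponds to the situation where the warning is printed.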
Regards,
Dongsu