From: "Jörn Engel" <joern@logfs.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Roland Dreier <roland@purestorage.com>,
target-devel@vger.kernel.org, Arun Easi <arun.easi@qlogic.com>,
linux-scsi <linux-scsi@vger.kernel.org>,
Steve Hodgson <steve@purestorage.com>
Subject: Re: [PATCH] qla_target: Check refcount in find_sess_by_*
Date: Thu, 17 May 2012 21:15:28 -0400 [thread overview]
Message-ID: <20120518011528.GG18067@logfs.org> (raw)
In-Reply-To: <1337306556.32217.365.camel@haakon2.linux-iscsi.org>
On Thu, 17 May 2012 19:02:36 -0700, Nicholas A. Bellinger wrote:
> >
> > +static void tcm_qla2xxx_release_session(struct kref *kref)
> > +{
> > + struct se_session *se_sess = container_of(kref,
> > + struct se_session, sess_kref);
> > +
> > + tcm_qla2xxx_close_session(se_sess);
> > +}
> > +
>
> Deadlock here.. tcm_qla2xxx_close_session() takes the hardware_lock
> before calling qlt_unreg_sess()..
Good catch.
> > +static void tcm_qla2xxx_put_session(struct se_session *se_sess)
> > +{
> > + struct qla_tgt_sess *sess = se_sess->fabric_sess_ptr;
> > + struct qla_hw_data *ha = sess->vha->hw;
> > + unsigned long flags;
> > +
> > + lock_hardware(ha, &flags);
> > + kref_put(&se_sess->sess_kref, tcm_qla2xxx_release_session);
> > + unlock_hardware(ha, &flags);
> > +}
>
> I'm not sure what wrapper lock_hardware + unlock_hardware looks like..?
> What special does it do beyond just obtaining ha->hardware_lock..?
I have replaced all instances of taking/releasing the hardware_lock in
our tree with these wrappers and also added assert_hardware_locked().
We have a rare bug where qla_tgt_reset() holds the hardware_lock and
qla_tgt_issue_task_mgmt() does not, as proven by the assertions. By
now I have added all sorts of instrumentations in a desperate attempt
to understand the bug. Main drawback is that porting patches between
our tree and yours is increasingly painful, as you have noticed.
> <SNIP>
>
> No need to rename iscsi-target stuff now, as it does not need
> ->put_session() or change it's current operation using
> target_put_session() for existing users..
Actually... I renamed the function to generic_target_put_session() so
that transports without special needs can use that as their
.put_session pointer. In principle the rename isn't needed, but the
generic versions of similar callbacks tend to be named generic_foo()
elsewhere and I like to follow common patterns.
But more importantly, I completely forgot to add the function pointers
to all the other transports. That needs fixing...
> > diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
> > index 70c3ffb..dfc1b1b 100644
> > --- a/drivers/target/target_core_tpg.c
> > +++ b/drivers/target/target_core_tpg.c
> > @@ -510,10 +510,10 @@ int core_tpg_del_initiator_node_acl(
> > list_del(&sess->sess_acl_list);
> >
> > rc = tpg->se_tpg_tfo->shutdown_session(sess);
> > - target_put_session(sess);
> > + tpg->se_tpg_tfo->put_session(sess);
> > if (!rc)
> > continue;
> > - target_put_session(sess);
> > + tpg->se_tpg_tfo->put_session(sess);
> > }
> > target_put_nacl(acl);
> > /*
>
> Note this breaks all the other fabric cases that don't provide a
> TFO->put_session() callback and will currently require a
> target_put_session() call.
>
> This call to TFO->put_sesssion() has been moved into
> target_put_session() the -v2 patch below..
...as you also noticed.
While your version isn't completely absurd, I learned to hate this
pattern the hard way. In filesystem code, there are several dozen
function pointers that default to buffer_head code. If the default
was to dereference a NULL pointer and get a clean backtrace,
developers of new filesystems would notice soon enough, add their own
function pointer and move on.
But since the default is to assume buffer_heads, you now use a
filesystem-specific structure with buffer_head code and the result,
often enough, is a fairly subtle corruption that can take several days
to track down. The only thing that kept me back from removing those
defaults was fear of missing a fix for one of the fourty-odd
filesystems linux currently supports. We are fucked because we are so
deeply fucked that noone dares to unfuck the situation.
So if you try to follow the buffer_head pattern, prepare to dodge some
rotten fruit the next few times we meet in person. I really _really_
hate that. I has cost me several weeks of my life that I could have
spent being more productive and feeling less pain.
<end mad rant>
> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 24b4e45..5aa8372 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -845,9 +845,28 @@ static void tcm_qla2xxx_clear_nacl_from_fcport_map(struct qla_tgt_sess *sess)
> nacl->nport_id);
> }
>
> +static void tcm_qla2xxx_release_session(struct kref *kref)
> +{
> + struct se_session *se_sess = container_of(kref,
> + struct se_session, sess_kref);
> +
> + qlt_unreg_sess(se_sess->fabric_sess_ptr);
> +}
> +
> +static void tcm_qla2xxx_put_session(struct se_session *se_sess)
> +{
> + struct qla_tgt_sess *sess = se_sess->fabric_sess_ptr;
> + struct qla_hw_data *ha = sess->vha->hw;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ha->hardware_lock, flags);
> + kref_put(&se_sess->sess_kref, tcm_qla2xxx_release_session);
> + spin_unlock_irqrestore(&ha->hardware_lock, flags);
> +}
> +
> static void tcm_qla2xxx_put_sess(struct qla_tgt_sess *sess)
> {
> - target_put_session(sess->se_sess);
> + tcm_qla2xxx_put_session(sess->se_sess);
> }
>
> static void tcm_qla2xxx_shutdown_sess(struct qla_tgt_sess *sess)
> @@ -1743,6 +1762,7 @@ static struct target_core_fabric_ops tcm_qla2xxx_ops = {
> .new_cmd_map = NULL,
> .check_stop_free = tcm_qla2xxx_check_stop_free,
> .release_cmd = tcm_qla2xxx_release_cmd,
> + .put_session = tcm_qla2xxx_put_session,
> .shutdown_session = tcm_qla2xxx_shutdown_session,
> .close_session = tcm_qla2xxx_close_session,
> .sess_get_index = tcm_qla2xxx_sess_get_index,
> @@ -1791,6 +1811,7 @@ static struct target_core_fabric_ops tcm_qla2xxx_npiv_ops = {
> .tpg_release_fabric_acl = tcm_qla2xxx_release_fabric_acl,
> .tpg_get_inst_index = tcm_qla2xxx_tpg_get_inst_index,
> .release_cmd = tcm_qla2xxx_release_cmd,
> + .put_session = tcm_qla2xxx_put_session,
> .shutdown_session = tcm_qla2xxx_shutdown_session,
> .close_session = tcm_qla2xxx_close_session,
> .sess_get_index = tcm_qla2xxx_sess_get_index,
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index e797986..6b7dc67 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -315,7 +315,7 @@ void transport_register_session(
> }
> EXPORT_SYMBOL(transport_register_session);
>
> -static void target_release_session(struct kref *kref)
> +void target_release_session(struct kref *kref)
> {
> struct se_session *se_sess = container_of(kref,
> struct se_session, sess_kref);
> @@ -332,6 +332,12 @@ EXPORT_SYMBOL(target_get_session);
>
> void target_put_session(struct se_session *se_sess)
> {
> + struct se_portal_group *tpg = se_sess->se_tpg;
> +
> + if (tpg->se_tpg_tfo->put_session != NULL) {
> + tpg->se_tpg_tfo->put_session(se_sess);
> + return;
> + }
> kref_put(&se_sess->sess_kref, target_release_session);
> }
> EXPORT_SYMBOL(target_put_session);
> diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
> index 4e3f7a4..2c2af2b 100644
> --- a/include/target/target_core_fabric.h
> +++ b/include/target/target_core_fabric.h
> @@ -47,6 +47,7 @@ struct target_core_fabric_ops {
> */
> int (*check_stop_free)(struct se_cmd *);
> void (*release_cmd)(struct se_cmd *);
> + void (*put_session)(struct se_session *);
> /*
> * Called with spin_lock_bh(struct se_portal_group->session_lock held.
> */
Apart from my buffer_head rant, I like it. Thanks!
Jörn
--
Science is the belief in the ignorance of experts.
-- Richard Feynman
next prev parent reply other threads:[~2012-05-18 1:15 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20120511143341.GA18753@logfs.org>
[not found] ` <20120517162600.GA18067@logfs.org>
2012-05-17 19:46 ` [PATCH] qla_target: Check refcount in find_sess_by_* Nicholas A. Bellinger
2012-05-17 20:49 ` Roland Dreier
2012-05-17 19:28 ` Jörn Engel
2012-05-17 21:05 ` Jörn Engel
2012-05-17 21:10 ` Jörn Engel
2012-05-18 2:02 ` Nicholas A. Bellinger
2012-05-18 1:15 ` Jörn Engel [this message]
2012-05-18 6:09 ` Nicholas A. Bellinger
2012-05-18 22:49 ` Jörn Engel
2012-05-19 1:26 ` Nicholas A. Bellinger
2012-05-21 22:18 ` Nicholas A. Bellinger
2012-05-17 21:24 ` Nicholas A. Bellinger
2012-05-17 19:56 ` Jörn Engel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120518011528.GG18067@logfs.org \
--to=joern@logfs.org \
--cc=arun.easi@qlogic.com \
--cc=linux-scsi@vger.kernel.org \
--cc=nab@linux-iscsi.org \
--cc=roland@purestorage.com \
--cc=steve@purestorage.com \
--cc=target-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.