From: Shyam Iyer <shyam_iyer@dell.com>
To: sekharan@us.ibm.com
Cc: "Moger, Babu" <Babu.Moger@lsi.com>,
device-mapper development <dm-devel@redhat.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"Qi, Yanling" <Yanling.Qi@lsi.com>,
"Chauhan, Vijay" <Vijay.Chauhan@lsi.com>,
"Dachepalli, Sudhir" <Sudhir.Dachepalli@lsi.com>,
"Stankey, Robert" <Robert.Stankey@lsi.com>
Subject: Re: [PATCH 0/6] scsi_dh : Couple of fixes for scsi device handlers
Date: Fri, 30 Jul 2010 20:22:12 -0400
Message-ID: <4C536CB4.6090108@dell.com>
In-Reply-To: <1280534189.17620.284.camel@chandra-lucid.beaverton.ibm.com>
On 07/30/2010 07:56 PM, Chandra Seetharaman wrote:
> On Fri, 2010-07-30 at 12:12 -0600, Moger, Babu wrote:
>
>
>>>> Yes, we can do that. The problem is that I am hitting the BUG_ON
>>>> in get_rdac_data(), which is at the beginning of rdac_activate().
>>>>
>>> I do not understand the problem.
>>>
>>> If the BUG_ON in get_rdac_data() is being triggered, that means
>>> sdev->scsi_dh_data is NULL. If that is the case, how can you access
>>> sdev->scsi_dh_data->kref in scsi_dh_activate (in patch 2/6)? Wouldn't
>>> it trip an oops?
>>>
>> The test case deletes both the active and passive paths almost together during
>> multipath testing. It looks like DM picked up the active path failure
>> first, then failed the active path and started scheduling activate_path to
>> fail over to the passive path. The passive path is also about to go down.
>> While control was in scsi_dh_activate, scsi_dh_data was apparently still
>> valid, because I did not see a panic there. But scsi_dh_data had become NULL
>> by the time control reached rdac_activate. That is when I hit the BUG_ON.
>>
>> kernel BUG at /usr/src/packages/BUILD/lsi-scsi_dh_rdac-01.00/obj/default/scsi_dh_rdac.c:232!
>> RIP: 0010: rdac_activate+0x257/0x387 [scsi_dh_rdac]
>>
>> My understanding is that someone triggered scsi_dh->detach for the passive path during
>> this small window. The only way I could make the problem go away was to hold a
>> reference count across these calls. Did I miss anything here? See the code snippet below.
>>
> So, basically, the new patch just reduces the race window. IOW, if it
> just spins for a few seconds (or is preempted) just before calling
> kref_get() in the new patch, it will generate an oops, as scsi_dh_data
> would be NULL.
>
> Looks like you have to create a lock to protect scsi_dh_data and then
> call kref_get() under the protection of that lock.
>
>
Seconded. It sounds like every kref_get()/kref_put() needs lock protection, not just this scenario.
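
Roughly what I have in mind, as an untested userspace sketch and not the actual scsi_dh code. Assumptions: a pthread mutex stands in for whatever kernel lock would be used, a plain counter stands in for struct kref, and every name below (struct device, struct dh_data, dh_attach, dh_data_get, dh_data_put, dh_detach) is hypothetical. The point is only that the pointer lookup and the reference grab happen under the same lock that detach takes before clearing the pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device handler data (scsi_dh_data). */
struct dh_data {
    int refcount;              /* protected by device.lock in this sketch */
};

/* Hypothetical stand-in for the device that owns the handler data. */
struct device {
    pthread_mutex_t lock;      /* protects the dh_data pointer and refcount */
    struct dh_data *dh_data;   /* NULL once the handler is detached */
};

/* Attach: publish handler data holding one initial reference. */
static int dh_attach(struct device *dev)
{
    struct dh_data *d = malloc(sizeof(*d));

    if (!d)
        return -1;
    d->refcount = 1;
    pthread_mutex_init(&dev->lock, NULL);
    dev->dh_data = d;
    return 0;
}

/* Look up the handler data and take a reference in one locked step. */
static struct dh_data *dh_data_get(struct device *dev)
{
    struct dh_data *d;

    pthread_mutex_lock(&dev->lock);
    d = dev->dh_data;
    if (d)
        d->refcount++;
    pthread_mutex_unlock(&dev->lock);
    return d;                  /* NULL means detach already ran */
}

/* Drop a reference; free on the last put. Returns 1 if the data was freed. */
static int dh_data_put(struct device *dev, struct dh_data *d)
{
    int last;

    pthread_mutex_lock(&dev->lock);
    assert(d->refcount > 0);
    last = (--d->refcount == 0);
    pthread_mutex_unlock(&dev->lock);
    if (last)
        free(d);
    return last;
}

/* Detach: unpublish the pointer under the lock, then drop the initial ref. */
static int dh_detach(struct device *dev)
{
    struct dh_data *d;

    pthread_mutex_lock(&dev->lock);
    d = dev->dh_data;
    dev->dh_data = NULL;
    pthread_mutex_unlock(&dev->lock);
    return d ? dh_data_put(dev, d) : 0;
}
```

With something along these lines, the activate path would bail out when dh_data_get() returns NULL instead of reaching the BUG_ON, and a detach that races with an in-flight activate only frees the data once the activate side has done its put.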