From: Wenchao Hao <haowenchao22@gmail.com>
To: "Peter Wang (王信友)" <peter.wang@mediatek.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"bvanassche@acm.org" <bvanassche@acm.org>,
"avri.altman@wdc.com" <avri.altman@wdc.com>,
"quic_nguyenb@quicinc.com" <quic_nguyenb@quicinc.com>,
"alim.akhtar@samsung.com" <alim.akhtar@samsung.com>,
"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
"jejb@linux.ibm.com" <jejb@linux.ibm.com>
Cc: "linux-mediatek@lists.infradead.org"
<linux-mediatek@lists.infradead.org>,
"Jiajie Hao (郝加节)" <jiajie.hao@mediatek.com>,
"CC Chou (周志杰)" <cc.chou@mediatek.com>,
"Eddie Huang (黃智傑)" <eddie.huang@mediatek.com>,
"Alice Chao (趙珮均)" <Alice.Chao@mediatek.com>,
wsd_upstream <wsd_upstream@mediatek.com>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"Lin Gui (桂林)" <Lin.Gui@mediatek.com>,
"Chun-Hung Wu (巫駿宏)" <Chun-hung.Wu@mediatek.com>,
"Tun-yu Yu (游敦聿)" <Tun-yu.Yu@mediatek.com>,
"chu.stanley@gmail.com" <chu.stanley@gmail.com>,
"Chaotian Jing (井朝天)" <Chaotian.Jing@mediatek.com>,
"Powen Kao (高伯文)" <Powen.Kao@mediatek.com>,
"Naomi Chu (朱詠田)" <Naomi.Chu@mediatek.com>,
"Qilin Tan (谭麒麟)" <Qilin.Tan@mediatek.com>
Subject: Re: [PATCH v2] ufs: core: fix ufshcd_abort_all racing issue
Date: Thu, 27 Jun 2024 15:59:00 +0800 [thread overview]
Message-ID: <58505ca5-5822-47f5-a77d-a517eda0c508@gmail.com> (raw)
In-Reply-To: <0e1e0c0a4303f53a50a95aa0672311015ddeaee2.camel@mediatek.com>
On 2024/6/26 11:56, Peter Wang (王信友) wrote:
> On Tue, 2024-06-25 at 09:42 -0700, Bart Van Assche wrote:
>>
>>
>> Please include a full root cause analysis when reposting fixes for
>> the
>> reported crashes. It is not clear to me how it is possible that an
>> invalid pointer is passed to blk_mq_unique_tag() (0x194). As I
>> mentioned
>> in my previous email, freeing a request does not modify the request
>> pointer and does not modify the SCSI command pointer either. As one
>> can
>> derive from the blk_mq_alloc_rqs() call stack, memory for struct
>> request
>> and struct scsi_cmnd is allocated at request queue allocation time
>> and
>> is not freed until the request queue is freed. Hence, for a given
>> tag,
>> neither the request pointer nor the SCSI command pointer changes as
>> long
>> as a request queue exists. Hence my request for an explanation how it
>> is
>> possible that an invalid pointer was passed to blk_mq_unique_tag().
>>
>> Thanks,
>>
>> Bart.
>>
>
> Hi Bart,
>
> Sorry I have not explain root-cause clearly.
> I will add more clear root-cause analyze next version.
>
> And it is not an invalid pointer is passed to blk_mq_unique_tag(),
> I means blk_mq_unique_tag function try access null pointer.
> It is differnt and cause misunderstanding.
>
> The null pinter blk_mq_unique_tag try access is:
> rq->mq_hctx(NULL)->queue_num.
>
Hi Peter,
What is queue_num's offset of blk_mq_hw_ctx in your machine?
gdb vmlinux
(gdb) print /x (int)&((struct blk_mq_hw_ctx *)0)->queue_num
$5 = 0x164
I read your descriptions and wondered a same race flow as you described
following. But I found the offset mismatch, if the racing flow is correct,
then the address accessed in blk_mq_unique_tag() should be 0x164, not 0x194.
Maybe the offset is different between our machine?
What's more, if the racing flow is correct, I did not get how your changes
can address this racing flow.
> The racing flow is:
>
> Thread A
> ufshcd_err_handler step 1
> ufshcd_cmd_inflight(true) step 3
> ufshcd_mcq_req_to_hwq
> blk_mq_unique_tag
> rq->mq_hctx->queue_num step 5
>
> Thread B
> ufs_mtk_mcq_intr(cq complete ISR) step 2
> scsi_done
> ...
> __blk_mq_free_request
> rq->mq_hctx = NULL; step 4
>
> Thanks.
> Peter
>
>
>
>
next prev parent reply other threads:[~2024-06-27 7:59 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-24 12:11 [PATCH v2] ufs: core: fix ufshcd_abort_all racing issue peter.wang
2024-06-24 18:01 ` Bart Van Assche
2024-06-25 8:29 ` Peter Wang (王信友)
2024-06-25 16:42 ` Bart Van Assche
2024-06-26 3:56 ` Peter Wang (王信友)
2024-06-26 17:13 ` Bart Van Assche
2024-06-27 9:19 ` Wenchao Hao
2024-06-27 10:59 ` Peter Wang (王信友)
2024-06-27 20:13 ` Bart Van Assche
2024-06-28 3:13 ` Peter Wang (王信友)
2024-06-27 7:59 ` Wenchao Hao [this message]
2024-06-27 10:58 ` Peter Wang (王信友)
2024-06-28 1:44 ` Wenchao Hao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=58505ca5-5822-47f5-a77d-a517eda0c508@gmail.com \
--to=haowenchao22@gmail.com \
--cc=Alice.Chao@mediatek.com \
--cc=Chaotian.Jing@mediatek.com \
--cc=Chun-hung.Wu@mediatek.com \
--cc=Lin.Gui@mediatek.com \
--cc=Naomi.Chu@mediatek.com \
--cc=Powen.Kao@mediatek.com \
--cc=Qilin.Tan@mediatek.com \
--cc=Tun-yu.Yu@mediatek.com \
--cc=alim.akhtar@samsung.com \
--cc=avri.altman@wdc.com \
--cc=bvanassche@acm.org \
--cc=cc.chou@mediatek.com \
--cc=chu.stanley@gmail.com \
--cc=eddie.huang@mediatek.com \
--cc=jejb@linux.ibm.com \
--cc=jiajie.hao@mediatek.com \
--cc=linux-mediatek@lists.infradead.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=peter.wang@mediatek.com \
--cc=quic_nguyenb@quicinc.com \
--cc=stable@vger.kernel.org \
--cc=wsd_upstream@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.