From: Martin Wilck <mwilck@suse.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org,
Daniel Wagner <daniel.wagner@suse.com>,
kbusch@kernel.org, sagi@grimberg.me
Subject: Re: [PATCH] nvme-multipath: fix double initialization of ANA state
Date: Fri, 14 May 2021 23:08:25 +0200
Message-ID: <23286bbda8b62afcb8bc0ce0d4deb655197cf8a7.camel@suse.com>
In-Reply-To: <bc0a9cbf4567ee3ec96615fec56cd4899d5d496e.camel@suse.com>

Hello Christoph,

On Wed, 2021-05-12 at 16:53 +0200, Martin Wilck wrote:
> On Thu, 2021-05-06 at 15:48 +0200, Christoph Hellwig wrote:
> > nvme_init_identify and thus nvme_mpath_init can be called multiple
> > times and thus must not overwrite potentially initialized or in-use
> > fields. Split out a helper for the basic initialization when the
> > controller is initialized and make sure the init_identify path does
> > not blindly change in-use data structures.
> >
> > Fixes: 0d0b660f214d ("nvme: add ANA support")
> > Reported-by: Martin Wilck <mwilck@suse.com>
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Thank you. I'll prepare another test kernel for our partner.
Our partner reported a crash during NVMe controller initialization with
the kernel I built with this patch applied. I'm still looking at the
dump, and it's not impossible that I made a mistake backporting your
patch. But I thought I should inform you anyway.

[ 1010.869437] nvme-fabrics ctl: Failed to read smart log (error -5)
[ 1010.869444] nvme nvme0: queue_size 128 > ctrl sqsize 32, clamping down
[ 1010.879383] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.1.14:4420
[ 1010.929700] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[ 1011.041659] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 1011.041665] #PF: supervisor write access in kernel mode
[ 1011.041666] #PF: error_code(0x0002) - not-present page
[ 1011.041668] PGD 0 P4D 0
[ 1011.041672] Oops: 0002 [#1] SMP PTI
[ 1011.041675] CPU: 13 PID: 0 Comm: swapper/13 Kdump: loaded Tainted: G X 5.3.18-6.g7ea043c-default #1 SLE15-SP2 (unreleased)
[ 1011.041678] Hardware name: FUJITSU PRIMERGY RX2530 M2/D3279-B1, BIOS V5.0.0.11 R1.20.0 for D3279-B1x 06/15/2018
[ 1011.041689] RIP: 0010:bio_copy_kern_endio_read+0xc6/0x130
[ 1011.041691] Code: c0 75 87 8b 4e 0c 44 89 df 89 ca 81 e1 ff 0f 00 00 c1 ea 0c 29 cf 48 c1 e2 06 89 f9 48 03 16 e9 6f ff ff ff 48 8b 3e 4c 89 c5 <49> 89 38 4a 8b 7c 0e f8 4b 89 7c 08 f8 49 8d 78 08 4d 01 c8 48 83
[ 1011.041695] RSP: 0018:ffffab41804c8ee8 EFLAGS: 00010212
[ 1011.041697] RAX: 0000000000000000 RBX: ffff9ff1b73e1500 RCX: 0000000000001000
[ 1011.041699] RDX: fffff2b810ce1240 RSI: ffff9ff1b3849000 RDI: 0000000000000000
[ 1011.041701] RBP: 0000000000000010 R08: 0000000000000010 R09: 0000000000001000
[ 1011.041702] R10: 0000000000000001 R11: 0000000000001000 R12: ffff9ff168ace140
[ 1011.041703] R13: 0000000000004810 R14: 0000000000000000 R15: 0000000000000000
[ 1011.041705] FS: 0000000000000000(0000) GS:ffff9ff1ff2c0000(0000) knlGS:0000000000000000
[ 1011.041707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1011.041708] CR2: 0000000000000010 CR3: 00000001de60a006 CR4: 00000000003606e0
[ 1011.041710] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1011.041711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1011.041713] Call Trace:
[ 1011.041716] <IRQ>
[ 1011.041722] blk_update_request+0x8a/0x3a0
[ 1011.041726] blk_mq_end_request+0x1a/0x130
[ 1011.041729] blk_done_softirq+0x8f/0xc0
[ 1011.041736] __do_softirq+0xe3/0x2dc
[ 1011.041744] irq_exit+0xd5/0xe0
[ 1011.041747] call_function_single_interrupt+0xf/0x20
[ 1011.041749] </IRQ>

bio_copy_kern_endio_read() means that this was a command sent via
__nvme_submit_sync_cmd(). I don't know yet which one.

Regards,
Martin
Thread overview: 10+ messages
2021-05-06 13:48 [PATCH] nvme-multipath: fix double initialization of ANA state Christoph Hellwig
2021-05-06 14:51 ` Keith Busch
2021-05-07 16:54 ` Hannes Reinecke
2021-05-07 17:16 ` Sagi Grimberg
2021-05-12 14:53 ` Martin Wilck
2021-05-14 21:08 ` Martin Wilck [this message]
2021-05-14 23:05 ` Sagi Grimberg
2021-05-17 12:48 ` Christoph Hellwig
2021-05-17 16:47 ` Martin Wilck
2021-05-17 16:49 ` Martin Wilck