public inbox for linux-nvme@lists.infradead.org
From: Martin Wilck <mwilck@suse.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org,
	Daniel Wagner <daniel.wagner@suse.com>,
	 kbusch@kernel.org, sagi@grimberg.me
Subject: Re: [PATCH] nvme-multipath: fix double initialization of ANA state
Date: Fri, 14 May 2021 23:08:25 +0200
Message-ID: <23286bbda8b62afcb8bc0ce0d4deb655197cf8a7.camel@suse.com>
In-Reply-To: <bc0a9cbf4567ee3ec96615fec56cd4899d5d496e.camel@suse.com>

Hello Christoph,

On Wed, 2021-05-12 at 16:53 +0200, Martin Wilck wrote:
> On Thu, 2021-05-06 at 15:48 +0200, Christoph Hellwig wrote:
> > nvme_init_identify and thus nvme_mpath_init can be called multiple
> > times and thus must not overwrite potentially initialized or in-use
> > fields.  Split out a helper for the basic initialization when the
> > controller is initialized and make sure the init_identify path does
> > not blindly change in-use data structures.
> > 
> > Fixes: 0d0b660f214d ("nvme: add ANA support")
> > Reported-by: Martin Wilck <mwilck@suse.com>
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Thank you. I'll prepare another test kernel for our partner.

Our partner reported a crash during NVMe controller initialization with
the kernel I built with this patch applied. I'm still looking at the
dump, and it's not impossible that I made a mistake backporting your
patch. But I thought I should inform you anyway.

[ 1010.869437] nvme-fabrics ctl: Failed to read smart log (error -5)
[ 1010.869444] nvme nvme0: queue_size 128 > ctrl sqsize 32, clamping down
[ 1010.879383] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.1.14:4420
[ 1010.929700] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[ 1011.041659] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 1011.041665] #PF: supervisor write access in kernel mode
[ 1011.041666] #PF: error_code(0x0002) - not-present page
[ 1011.041668] PGD 0 P4D 0 
[ 1011.041672] Oops: 0002 [#1] SMP PTI
[ 1011.041675] CPU: 13 PID: 0 Comm: swapper/13 Kdump: loaded Tainted: G               X    5.3.18-6.g7ea043c-default #1 SLE15-SP2 (unreleased)
[ 1011.041678] Hardware name: FUJITSU PRIMERGY RX2530 M2/D3279-B1, BIOS V5.0.0.11 R1.20.0 for D3279-B1x                    06/15/2018
[ 1011.041689] RIP: 0010:bio_copy_kern_endio_read+0xc6/0x130
[ 1011.041691] Code: c0 75 87 8b 4e 0c 44 89 df 89 ca 81 e1 ff 0f 00 00 c1 ea 0c 29 cf 48 c1 e2 06 89 f9 48 03 16 e9 6f ff ff ff 48 8b 3e 4c 89 c5 <49> 89 38 4a 8b 7c 0e f8 4b 89 7c 08 f8 49 8d 78 08 4d 01 c8 48 83
[ 1011.041695] RSP: 0018:ffffab41804c8ee8 EFLAGS: 00010212
[ 1011.041697] RAX: 0000000000000000 RBX: ffff9ff1b73e1500 RCX: 0000000000001000
[ 1011.041699] RDX: fffff2b810ce1240 RSI: ffff9ff1b3849000 RDI: 0000000000000000
[ 1011.041701] RBP: 0000000000000010 R08: 0000000000000010 R09: 0000000000001000
[ 1011.041702] R10: 0000000000000001 R11: 0000000000001000 R12: ffff9ff168ace140
[ 1011.041703] R13: 0000000000004810 R14: 0000000000000000 R15: 0000000000000000
[ 1011.041705] FS:  0000000000000000(0000) GS:ffff9ff1ff2c0000(0000) knlGS:0000000000000000
[ 1011.041707] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1011.041708] CR2: 0000000000000010 CR3: 00000001de60a006 CR4: 00000000003606e0
[ 1011.041710] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1011.041711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1011.041713] Call Trace:
[ 1011.041716]  <IRQ>
[ 1011.041722]  blk_update_request+0x8a/0x3a0
[ 1011.041726]  blk_mq_end_request+0x1a/0x130
[ 1011.041729]  blk_done_softirq+0x8f/0xc0
[ 1011.041736]  __do_softirq+0xe3/0x2dc
[ 1011.041744]  irq_exit+0xd5/0xe0
[ 1011.041747]  call_function_single_interrupt+0xf/0x20
[ 1011.041749]  </IRQ>

bio_copy_kern_endio_read() means that this was a command sent via 
__nvme_submit_sync_cmd(). I don't know yet which one.
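
For reference, the "#PF" lines in the oops can be cross-checked against the
raw error code. The following is a minimal sketch, assuming the standard x86
page-fault error-code bit layout; the helper name decode_pf_error_code and
the constant names are hypothetical and are not taken from the kernel source.

```python
# Bit layout of the x86 page-fault error code (assumed from the
# architecture definition; constant names here are illustrative only).
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: supervisor (kernel) mode, 1: user mode
PF_RSVD = 1 << 3   # 1: reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # 1: fault was caused by an instruction fetch


def decode_pf_error_code(code):
    """Hypothetical helper: describe a page-fault error code as a dict."""
    if code & PF_INSTR:
        access = "instruction fetch"
    elif code & PF_WRITE:
        access = "write"
    else:
        access = "read"
    return {
        "cause": "protection violation" if code & PF_PROT else "not-present page",
        "access": access,
        "mode": "user" if code & PF_USER else "supervisor",
        "reserved_bit": bool(code & PF_RSVD),
    }


if __name__ == "__main__":
    # error_code(0x0002) is the value reported in the oops above.
    for key, value in decode_pf_error_code(0x0002).items():
        print(f"{key}: {value}")
```

For 0x0002 this yields a supervisor-mode write to a not-present page, which
agrees with the two "#PF" lines in the log.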

Regards,
Martin





