From: Ondrej Zary <linux@zary.sk>
To: Bart Van Assche <bvanassche@acm.org>
Cc: qla2xxx-upstream@qlogic.com, linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org,
Michael Hernandez <michael.hernandez@cavium.com>,
Sawan Chandak <sawan.chandak@cavium.com>,
Himanshu Madhani <himanshu.madhani@cavium.com>,
GR-QLogic-Storage-Upstream@marvell.com,
Nilesh Javali <njavali@marvell.com>
Subject: Re: NULL pointer dereference in qla24xx_abort_command, kernel 4.19.98 (Debian)
Date: Thu, 19 Mar 2020 19:01:46 +0100
Message-ID: <202003191901.46307.linux@zary.sk>
In-Reply-To: <202003022326.08698.linux@zary.sk>
On Monday 02 March 2020 23:26:08 Ondrej Zary wrote:
> On Thursday 27 February 2020 18:09:07 Ondrej Zary wrote:
> >
> > On Tuesday 25 February 2020 04:41:48 Bart Van Assche wrote:
> > > On 2020-02-24 00:20, Ondrej Zary wrote:
> > > > Looks like it's in some inlined function.
> > > >
> > > > /usr/src/linux-source-4.19# gdb /lib/modules/4.19.0-8-amd64/kernel/drivers/scsi/qla2xxx/qla2xxx.ko
> > > > GNU gdb (Debian 8.2.1-2+b3) 8.2.1
> > > > ...
> > > > Reading symbols from /lib/modules/4.19.0-8-amd64/kernel/drivers/scsi/qla2xxx/qla2xxx.ko...Reading symbols
> > > > from /usr/lib/debug//lib/modules/4.19.0-8-amd64/kernel/drivers/scsi/qla2xxx/qla2xxx.ko...done.
> > > > done.
> > > >
> > > > (gdb) list *(qla24xx_async_abort_cmd+0x1b)
> > > > 0xf88b is in qla24xx_async_abort_cmd (./arch/x86/include/asm/atomic.h:97).
> > > > 92 *
> > > > 93 * Atomically increments @v by 1.
> > > > 94 */
> > > > 95 static __always_inline void arch_atomic_inc(atomic_t *v)
> > > > 96 {
> > > > 97 asm volatile(LOCK_PREFIX "incl %0"
> > > > 98 : "+m" (v->counter) :: "memory");
> > > > 99 }
> > > > 100 #define arch_atomic_inc arch_atomic_inc
> > > >
> > > > [ ... ]
> > > >
> > > > (gdb) disassemble qla24xx_async_abort_cmd
> > > > Dump of assembler code for function qla24xx_async_abort_cmd:
> > > > 0x000000000000f870 <+0>: callq 0xf875 <qla24xx_async_abort_cmd+5>
> > > > 0x000000000000f875 <+5>: push %r15
> > > > 0x000000000000f877 <+7>: push %r14
> > > > 0x000000000000f879 <+9>: push %r13
> > > > 0x000000000000f87b <+11>: push %r12
> > > > 0x000000000000f87d <+13>: push %rbp
> > > > 0x000000000000f87e <+14>: push %rbx
> > > > 0x000000000000f87f <+15>: mov 0x28(%rdi),%r13
> > > > 0x000000000000f883 <+19>: mov 0x20(%rdi),%r15
> > > > 0x000000000000f887 <+23>: mov 0x48(%rdi),%r14
> > > > 0x000000000000f88b <+27>: lock incl 0x4(%r14)
> > > > 0x000000000000f890 <+32>: mfence
> > >
> > > Thanks, this is very helpful. I think the above means that the crash is
> > > triggered by the following code:
> > >
> > > sp = qla2xxx_get_qpair_sp(cmd_sp->qpair, cmd_sp->fcport,
> > > GFP_KERNEL);
> > >
> > > From the start of qla2xxx_get_qpair_sp():
> > >
> > > QLA_QPAIR_MARK_BUSY(qpair, bail);
> > >
> > > From qla_def.h:
> > >
> > > #define QLA_QPAIR_MARK_BUSY(__qpair, __bail) do { \
> > > atomic_inc(&__qpair->ref_count); \
> > > mb(); \
> > > if (__qpair->delete_in_progress) { \
> > > atomic_dec(&__qpair->ref_count); \
> > > __bail = 1; \
> > > } else { \
> > > __bail = 0; \
> > > } \
> > > } while (0)
> > >
> > > One of the changes between kernel version v4.9.210 and v4.19.98 is the
> > > following: "qla2xxx: Add multiple queue pair functionality". I think the
> > > above information means that the cmd_sp->qpair pointer is NULL. I will
> > > let QLogic recommend a solution.
> >
> > Thank you very much for the analysis.
> > Unfortunately, QLogic does not seem to care...
>
> Let's try to CC the people at Cavium that signed-off the commit.

No reply.

The qla2xxx-upstream@qlogic.com address is dead:

  Generating server: DC5-EXCH01.marvell.com
  qla2xxx-upstream@qlogic.com
  Remote Server returned '550 5.1.1 RESOLVER.ADR.RecipNotFound; not found'

I have added some more CC addresses.

Yesterday it crashed again at the same place:

[2076301.849762] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[2076301.850021] PGD 0 P4D 0
[2076301.850109] Oops: 0002 [#1] SMP PTI
[2076301.850219] CPU: 4 PID: 18992 Comm: kworker/u16:1 Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
[2076301.850478] Hardware name: Dell Inc. PowerEdge 2950/0JR815, BIOS 2.7.0 10/30/2010
[2076301.850720] Workqueue: scsi_tmf_4 scmd_eh_abort_handler [scsi_mod]
[2076301.850936] RIP: 0010:qla24xx_async_abort_cmd+0x1b/0x250 [qla2xxx]
[2076301.851130] Code: e9 19 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 41 57 41 56 41 55 41 54 55 53 4c 8b 6f 28 4c 8b 7f 20 4c 8b 77 48 <f0> 41 ff 46 04 0f ae f0 41 f6 46 24 04 74 17 f0 41 ff 4e 04 bd 02
[2076301.851663] RSP: 0018:ffffa10f8bbe7da8 EFLAGS: 00010293
[2076301.851820] RAX: 0000000000000800 RBX: ffff8ab8ddd197a8 RCX: 0000000000000070
[2076301.852036] RDX: ffff8ab8de4a8388 RSI: 0000000000000001 RDI: ffff8ab8799b8c40
[2076301.852253] RBP: ffff8ab8dc96c480 R08: ffffffffc03b7860 R09: 0000000000000000
[2076301.852469] R10: 8080808080808080 R11: 0000000000000010 R12: ffff8ab8dea00000
[2076301.852686] R13: ffff8ab8ddd197a8 R14: 0000000000000000 R15: ffff8ab8dd632000
[2076301.852902] FS: 0000000000000000(0000) GS:ffff8ab8e7b00000(0000) knlGS:0000000000000000
[2076301.853142] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2076301.853320] CR2: 0000000000000004 CR3: 00000002203dc000 CR4: 00000000000006e0
[2076301.853536] Call Trace:
[2076301.853632] qla24xx_abort_command+0x218/0x2d0 [qla2xxx]
[2076301.853799] ? __switch_to_asm+0x41/0x70
[2076301.853924] ? __switch_to_asm+0x35/0x70
[2076301.854056] qla2xxx_eh_abort+0x117/0x310 [qla2xxx]
[2076301.854209] scmd_eh_abort_handler+0x85/0x220 [scsi_mod]
[2076301.854375] process_one_work+0x1a7/0x3a0
[2076301.854506] worker_thread+0x30/0x390
[2076301.854628] ? create_worker+0x1a0/0x1a0
[2076301.854753] kthread+0x112/0x130
[2076301.854859] ? kthread_bind+0x30/0x30
[2076301.854980] ret_from_fork+0x35/0x40
[2076301.855095] Modules linked in: loop ipmi_ssif radeon coretemp ttm drm_kms_helper drm kvm i2c_algo_bit i5000_edac iTCO_wdt sg iTCO_vendor_support irqbypass evdev i5k_amb serio_raw joydev ipmi_si rng_core pcc_cpufreq dcdbas pcspkr ipmi_devintf acpi_cpufreq ipmi_msghandler button ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb crypto_simd cryptd glue_helper aes_x86_64 dm_service_time dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua uas usb_storage hid_generic usbhid hid sr_mod cdrom ses enclosure sd_mod scsi_transport_sas ata_generic qla2xxx ata_piix nvme_fc ehci_pci nvme_fabrics libata uhci_hcd psmouse ehci_hcd nvme_core megaraid_sas usbcore scsi_transport_fc lpc_ich mfd_core scsi_mod usb_common bnx2
[2076301.856887] CR2: 0000000000000004
[2076301.856999] ---[ end trace e9083db8fb76e126 ]---
[2076301.857151] RIP: 0010:qla24xx_async_abort_cmd+0x1b/0x250 [qla2xxx]
[2076301.857345] Code: e9 19 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 41 57 41 56 41 55 41 54 55 53 4c 8b 6f 28 4c 8b 7f 20 4c 8b 77 48 <f0> 41 ff 46 04 0f ae f0 41 f6 46 24 04 74 17 f0 41 ff 4e 04 bd 02
[2076301.857878] RSP: 0018:ffffa10f8bbe7da8 EFLAGS: 00010293
[2076301.858035] RAX: 0000000000000800 RBX: ffff8ab8ddd197a8 RCX: 0000000000000070
[2076301.858251] RDX: ffff8ab8de4a8388 RSI: 0000000000000001 RDI: ffff8ab8799b8c40
[2076301.858467] RBP: ffff8ab8dc96c480 R08: ffffffffc03b7860 R09: 0000000000000000
[2076301.869384] R10: 8080808080808080 R11: 0000000000000010 R12: ffff8ab8dea00000
[2076301.880412] R13: ffff8ab8ddd197a8 R14: 0000000000000000 R15: ffff8ab8dd632000
[2076301.891483] FS: 0000000000000000(0000) GS:ffff8ab8e7b00000(0000) knlGS:0000000000000000
[2076301.902490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2076301.913344] CR2: 0000000000000004 CR3: 00000002203dc000 CR4: 00000000000006e0
[2077225.259348] mysqld[2155]: segfault at 0 ip 000056409366ad93 sp 00007fa049514450 error 6 in mysqld[564092eb2000+805000]
[2077225.270564] Code: c7 45 00 00 00 00 00 8b 7d cc 4c 89 e2 4c 89 f6 e8 62 81 84 ff 49 89 c7 49 39 c4 0f 84 f6 00 00 00 e8 e1 1c 00 00 41 8b 4d 00 <89> 08 85 c9 74 37 49 83 ff ff 0f 84 9d 00 00 00 f6 c3 06 75 28 4d
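
As a cross-check of the analysis quoted above, the numbers in this oops line up with a NULL qpair: the faulting instruction is "lock incl 0x4(%r14)", R14 is 0, and CR2 is 4. The error code 0002 means a kernel-mode write to a page that is not present. The sketch below only reproduces that arithmetic. struct qpair_sketch, fault_address() and decode_pf_error() are made-up names for illustration, and the struct is not the real struct qla_qpair; the only thing taken from the disassembly is that ref_count sits at offset 4.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the start of struct qla_qpair. Only the
 * offset of ref_count (4) is taken from the disassembly; the field at
 * offset 0 is a placeholder and is NOT the driver's real layout.
 */
struct qpair_sketch {
	uint32_t first_field;	/* placeholder for whatever lives at offset 0 */
	int32_t ref_count;	/* stands in for atomic_t ref_count */
};

/* Address touched by "lock incl 0x4(%r14)" for a given value of %r14. */
static uintptr_t fault_address(uintptr_t r14)
{
	return r14 + offsetof(struct qpair_sketch, ref_count);
}

/* The three low bits of the x86 page-fault error code. */
struct pf_bits {
	int present;	/* bit 0: 1 = protection violation, 0 = page not present */
	int write;	/* bit 1: 1 = write access, 0 = read access */
	int user;	/* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

static struct pf_bits decode_pf_error(unsigned long code)
{
	struct pf_bits b;

	b.present = (int)(code & 1UL);
	b.write = (int)((code >> 1) & 1UL);
	b.user = (int)((code >> 2) & 1UL);
	return b;
}
```

With R14 = 0, fault_address(0) gives 4, which is the CR2 value in the oops. decode_pf_error(0x2) gives not-present, write, kernel mode for the qla2xxx crash, while the later mysqld segfault ("error 6") decodes to a user-mode write to a page that is not present.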
--
Ondrej Zary