public inbox for linux-rdma@vger.kernel.org
From: Bart Van Assche <Bart.VanAssche-Sjgp3cTcYWE@public.gmane.org>
To: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
	<loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: Kernel v4.16 / v4.17 SRP and SRPT patches
Date: Tue, 9 Jan 2018 20:51:20 +0000
Message-ID: <1515531079.2721.26.camel@wdc.com>
In-Reply-To: <1515529869.3919.4.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Tue, 2018-01-09 at 15:31 -0500, Laurence Oberman wrote:
> On Tue, 2018-01-09 at 15:15 -0500, Laurence Oberman wrote:
> > [  220.843344] ------------[ cut here ]------------
> > [  220.869309] list_add corruption. prev->next should be next
> > (000000002a07d255), but was           (null).
> > (prev=000000000edf5e8c).
> > [  220.935392] WARNING: CPU: 1 PID: 694 at lib/list_debug.c:28
> > __list_add_valid+0x6a/0x70
> > [  220.979462] Modules linked in: xt_CHECKSUM iptable_mangle
> > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> > nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables
> > ip6table_filter ip6_tables iptable_filter rpcrdma ib_isert
> > iscsi_target_mod target_core_mod ib_iser libiscsi
> > scsi_transport_iscsi
> > ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
> > rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp
> > kvm_intel
> > kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_ssif
> > ghash_clmulni_intel pcbc aesni_intel joydev ipmi_si crypto_simd
> > dm_service_time iTCO_wdt hpwdt iTCO_vendor_support glue_helper cryptd
> > ipmi_devintf sg gpio_ich pcspkr hpilo ipmi_msghandler lpc_ich
> > acpi_power_meter i7core_edac shpchp
> > [  221.385270]  pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace
> > sunrpc
> > dm_multipath ip_tables xfs libcrc32c radeon i2c_algo_bit
> > drm_kms_helper
> > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx5_core mlxfw
> > sd_mod drm ptp hpsa pps_core crc32c_intel i2c_core serio_raw bnx2
> > devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
> > [  221.554496] CPU: 1 PID: 694 Comm: kworker/1:1H Tainted:
> > G          I      4.15.0-rc7+ #1
> > [  221.606907] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [  221.642980] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> > [  221.674616] RIP: 0010:__list_add_valid+0x6a/0x70
> > [  221.700561] RSP: 0018:ffffb2bdc75c7cf0 EFLAGS: 00010086
> > [  221.730608] RAX: 0000000000000000 RBX: ffff94342d610880 RCX:
> > ffffffff8ba62928
> > [  221.771490] RDX: 0000000000000001 RSI: 0000000000000082 RDI:
> > 0000000000000046
> > [  221.812721] RBP: ffff94342d6108b8 R08: 0000000000000000 R09:
> > 0000000000000722
> > [  221.853073] R10: 0000000000000000 R11: ffffb2bdc75c7a58 R12:
> > 0000000000000200
> > [  221.894156] R13: 0000000000000246 R14: ffff943fb7fd5000 R15:
> > ffff943fb7fd5000
> > [  221.935233] FS:  0000000000000000(0000) GS:ffff944033200000(0000)
> > knlGS:0000000000000000
> > [  221.980521] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  222.013062] CR2: 00007f1bdc0ee910 CR3: 00000017e7e0a002 CR4:
> > 00000000000206e0
> > [  222.052302] Call Trace:
> > [  222.065971]  ib_mad_post_receive_mads+0x177/0x310 [ib_core]
> > [  222.097349]  ib_mad_recv_done+0x471/0x9c0 [ib_core]
> > [  222.124387]  __ib_process_cq+0x55/0xa0 [ib_core]
> > [  222.150827]  ib_cq_poll_work+0x1b/0x60 [ib_core]
> > [  222.177751]  process_one_work+0x141/0x340
> > [  222.200383]  worker_thread+0x47/0x3e0
> > [  222.220641]  kthread+0xf5/0x130
> > [  222.238951]  ? rescuer_thread+0x380/0x380
> > [  222.262034]  ? kthread_associate_blkcg+0x90/0x90
> > [  222.288514]  ? do_group_exit+0x39/0xa0
> > [  222.309492]  ret_from_fork+0x1f/0x30
> > [  222.330073] Code: fe 31 c0 48 c7 c7 98 36 89 8b e8 02 9c cf ff 0f
> > ff
> > 31 c0 c3 48 89 d1 48 c7 c7 48 36 89 8b 48 89 f2 48 89 c6 31 c0 e8 e6
> > 9b
> > cf ff <0f> ff 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48
> > 8b 
> > [  222.438058] ---[ end trace 5d41544bf17ab73b ]---
> > [  222.465993] BUG: unable to handle kernel NULL pointer dereference
> > at
> > 0000000000000028
> > [  222.510316] IP: ib_mad_post_receive_mads+0x3c/0x310 [ib_core]
> > [  222.543188] PGD 0 P4D 0 
> > [  222.557625] Oops: 0000 [#1] SMP PTI
> > [  222.576674] Modules linked in: xt_CHECKSUM iptable_mangle
> > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> > nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables
> > ip6table_filter ip6_tables iptable_filter rpcrdma ib_isert
> > iscsi_target_mod target_core_mod ib_iser libiscsi
> > scsi_transport_iscsi
> > ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
> > rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp
> > kvm_intel
> > kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_ssif
> > ghash_clmulni_intel pcbc aesni_intel joydev ipmi_si crypto_simd
> > dm_service_time iTCO_wdt hpwdt iTCO_vendor_support glue_helper cryptd
> > ipmi_devintf sg gpio_ich pcspkr hpilo ipmi_msghandler lpc_ich
> > acpi_power_meter i7core_edac shpchp
> > [  222.981443]  pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace
> > sunrpc
> > dm_multipath ip_tables xfs libcrc32c radeon i2c_algo_bit
> > drm_kms_helper
> > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx5_core mlxfw
> > sd_mod drm ptp hpsa pps_core crc32c_intel i2c_core serio_raw bnx2
> > devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
> > [  223.152359] CPU: 1 PID: 694 Comm: kworker/1:1H Tainted: G        W
> > I      4.15.0-rc7+ #1
> > [  223.198577] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [  223.235101] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> > [  223.266750] RIP: 0010:ib_mad_post_receive_mads+0x3c/0x310
> > [ib_core]
> > [  223.303012] RSP: 0018:ffffb2bdc75c7cf8 EFLAGS: 00010286
> > [  223.333022] RAX: 0000000000000000 RBX: ffff94342d610908 RCX:
> > ffff94342d610948
> > [  223.373307] RDX: 0000000000000001 RSI: ffff94342d6108c0 RDI:
> > ffff94342d610908
> > [  223.414451] RBP: ffff94342d610940 R08: ffff94342a8e64c0 R09:
> > ffff94342a8e64e8
> > [  223.454789] R10: ffff94342a8e64e8 R11: ffff94342d6109a8 R12:
> > ffff944029c2e048
> > [  223.496554] R13: 0000000000000000 R14: ffff94342a8e64c0 R15:
> > ffff94342d6108c0
> > [  223.537489] FS:  0000000000000000(0000) GS:ffff944033200000(0000)
> > knlGS:0000000000000000
> > [  223.583538] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  223.616545] CR2: 0000000000000028 CR3: 00000017e7e0a002 CR4:
> > 00000000000206e0
> > [  223.657337] Call Trace:
> > [  223.671022]  ? find_mad_agent+0x77/0x1b0 [ib_core]
> > [  223.698581]  ? __kmalloc+0x1be/0x1f0
> > [  223.719074]  ib_mad_recv_done+0x471/0x9c0 [ib_core]
> > [  223.747190]  __ib_process_cq+0x55/0xa0 [ib_core]
> > [  223.774140]  ib_cq_poll_work+0x1b/0x60 [ib_core]
> > [  223.800719]  process_one_work+0x141/0x340
> > [  223.824120]  worker_thread+0x47/0x3e0
> > [  223.845133]  kthread+0xf5/0x130
> > [  223.863116]  ? rescuer_thread+0x380/0x380
> > [  223.886173]  ? kthread_associate_blkcg+0x90/0x90
> > [  223.912207]  ? do_group_exit+0x39/0xa0
> > [  223.933198]  ret_from_fork+0x1f/0x30
> > [  223.953218] Code: 55 41 54 55 48 8d 6f 38 53 48 89 fb 48 83 ec 50
> > 65
> > 48 8b 04 25 28 00 00 00 48 89 44 24 48 31 c0 48 8b 07 48 85 f6 48 89
> > 4c
> > 24 08 <48> 8b 50 28 8b 12 48 c7 44 24 28 00 00 00 00 c7 44 24 40 01
> > 00 
> > [  224.059985] RIP: ib_mad_post_receive_mads+0x3c/0x310 [ib_core]
> > RSP:
> > ffffb2bdc75c7cf8
> > [  224.103994] CR2: 0000000000000028
> 
> Just wanted to add that the panic is consistent, rebooted into only a
> single path to my SRP LUNS and on reboot had the same panic.

Hello Laurence,

Can you repeat your test with the following two kernels:
* v4.15-rc7 (Linus' latest).
* The for-next branch of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git.

I'm asking this because the crash occurred in a code path that is not modified by
any of my patches.

Thanks,

Bart.

