public inbox for linux-rdma@vger.kernel.org
From: Adam Mazur <adam.mazur-yCD69WgB1YhWk0Htik3J/w@public.gmane.org>
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: CRASH 3.18-rc2, 3.17.1, isert_connect_request
Date: Mon, 03 Nov 2014 11:28:24 +0100
Message-ID: <545758C8.4050300@tiktalik.com>

Can someone help us with these crashes? We cannot reproduce them on
demand, but a crash appears after anywhere from 30 minutes to a few
hours. We've seen it on kernels 3.17.1 and 3.18-rc2.

On 3.18-rc2 it leaves the following tracebacks:


  BUG: unable to handle kernel NULL pointer dereference at 0000000000000720
  IP: [<ffffffffc05dc7fd>] isert_connect_request.isra.48+0x2fd/0x7d0 
[ib_isert]
  PGD 0
  Oops: 0000 [#1] SMP
  Modules linked in: target_core_user uio target_core_pscsi
target_core_file target_core_iblock dm_thin_pool(OE) dm_persistent_data
dm_bio_prison dm_bufio libcrc32c gpio_ich intel_powerclamp coretemp kvm
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel dcdbas ast aesni_intel
ttm drm_kms_helper aes_x86_64 lrw gf128mul glue_helper ablk_helper
cryptd drm syscopyarea sysfillrect sysimgblt joydev serio_raw
i7core_edac ib_mthca ib_isert lpc_ich edac_core iscsi_target_mod ipmi_si
8250_fintek mac_hid ib_iser ipmi_msghandler libiscsi
scsi_transport_iscsi rdma_ucm ib_uverbs rdma_cm iw_cm ib_ipoib ib_srpt
ib_cm ib_sa target_core_mod configfs ib_umad ib_mad ib_core ib_addr lp
parport bcache raid10 raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor hid_generic usbhid hid raid6_pq igb raid1 ahci
i2c_algo_bit raid0 dca libahci ptp megaraid_sas pps_core multipath linear
  CPU: 3 PID: 23400 Comm: kworker/3:2 Tainted: G           OE 
3.18.0-031800rc2-generic #201410281737
  Hardware name: Dell                   FS12-TY               /      , 
BIOS C99Q3B23 08/16/2012
  Workqueue: ib_cm cm_work_handler [ib_cm]
  task: ffff8803ca928000 ti: ffff8803ca8b8000 task.ti: ffff8803ca8b8000
  RIP: 0010:[<ffffffffc05dc7fd>]  [<ffffffffc05dc7fd>] 
isert_connect_request.isra.48+0x2fd/0x7d0 [ib_isert]
  RSP: 0018:ffff8803ca8bbbf8  EFLAGS: 00010283
  RAX: 0000000000000000 RBX: ffff8803e53b0800 RCX: 0000000000009484
  RDX: ffff880424b08000 RSI: ffff8803e8638d80 RDI: ffff88042ec03d00
  RBP: ffff8803ca8bbc48 R08: 00000000000173e0 R09: ffffea000fa18e00
  R10: ffffffffc060ab31 R11: 0000000000000000 R12: ffff880424b08000
  R13: ffff88041a2a7400 R14: ffff88041215f800 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff88042f260000(0000) 
knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000720 CR3: 0000000001c16000 CR4: 00000000000007e0
  Stack:
   ffff8803e53b0c58 ffff8803ca8bbc9a ffff880412a2a680 ffff8800b7a16000
   c9750c0000ad0500 ffff88041a2a7400 ffff8803ca8bbc88 ffff880411ca3800
   0000000000000000 ffff88042750e400 ffff8803ca8bbc68 ffffffffc05dcded
  Call Trace:
   [<ffffffffc05dcded>] isert_cma_handler+0x11d/0x170 [ib_isert]
   [<ffffffffc0512cd6>] cma_req_handler+0x196/0x430 [rdma_cm]
   [<ffffffffc04bdff0>] cm_process_work+0x30/0x140 [ib_cm]
   [<ffffffffc04bfe84>] cm_req_handler+0x274/0x3a0 [ib_cm]
   [<ffffffffc04c02f5>] cm_work_handler+0xb5/0x1d4 [ib_cm]
   [<ffffffff8108d4be>] process_one_work+0x14e/0x460
   [<ffffffff8108de3b>] worker_thread+0x11b/0x3f0
   [<ffffffff8108dd20>] ? create_worker+0x1e0/0x1e0
   [<ffffffff810939b9>] kthread+0xc9/0xe0
   [<ffffffff810938f0>] ? flush_kthread_worker+0x90/0x90
   [<ffffffff817b227c>] ret_from_fork+0x7c/0xb0
   [<ffffffff810938f0>] ? flush_kthread_worker+0x90/0x90
  Code: be 01 00 00 00 48 89 c7 e8 c1 af e4 ff 48 3d 00 f0 ff ff 48 89 
83 90 05 00 00 0f 87 80 04 00 00 49 8b 86 78 01 00 00 48 8b 40 08 <0f> 
b6 90 20 07 00 00 84 d2 74 0e 48 8b 45 c8 80 78 04 00 0f 84
  RIP  [<ffffffffc05dc7fd>] isert_connect_request.isra.48+0x2fd/0x7d0 
[ib_isert]
   RSP <ffff8803ca8bbbf8>
  CR2: 0000000000000720
  ---[ end trace b8718ad554264a63 ]---

followed by:

  BUG: unable to handle kernel paging request at ffffffffffffffd8
  IP: [<ffffffff81093d50>] kthread_data+0x10/0x20
  PGD 1c19067 PUD 1c1b067 PMD 0
  Oops: 0000 [#2] SMP
  Modules linked in: target_core_user uio target_core_pscsi 
target_core_file target_core_iblock dm_thin_pool(OE) dm_persistent_data 
dm_bio_prison dm_bufio libcrc32c gpio_ich intel_powerclamp coretemp kvm 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel dcdbas ast aesni_intel 
ttm drm_kms_helper aes_x86_64 lrw gf128mul glue_helper ablk_helper 
cryptd drm syscopyarea sysfillrect sysimgblt joydev serio_raw 
i7core_edac ib_mthca ib_isert lpc_ich edac_core iscsi_target_mod ipmi_si 
8250_fintek mac_hid ib_iser ipmi_msghandler libiscsi 
scsi_transport_iscsi rdma_ucm ib_uverbs rdma_cm iw_cm ib_ipoib ib_srpt 
ib_cm ib_sa target_core_mod configfs ib_umad ib_mad ib_core ib_addr lp 
parport bcache raid10 raid456 async_raid6_recov async_memcpy async_pq 
async_xor async_tx xor hid_generic usbhid hid raid6_pq igb raid1 ahci 
i2c_algo_bit raid0 dca libahci ptp megaraid_sas pps_core multipath linear
  CPU: 3 PID: 23400 Comm: kworker/3:2 Tainted: G      D    OE 
3.18.0-031800rc2-generic #201410281737
  Hardware name: Dell                   FS12-TY               /      , 
BIOS C99Q3B23 08/16/2012
  task: ffff8803ca928000 ti: ffff8803ca8b8000 task.ti: ffff8803ca8b8000
  RIP: 0010:[<ffffffff81093d50>]  [<ffffffff81093d50>] 
kthread_data+0x10/0x20
  RSP: 0018:ffff8803ca8bb808  EFLAGS: 00010096
  RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffff81ec8e40
  RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff8803ca928000
  RBP: ffff8803ca8bb808 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 000000000000e5b0 R12: 0000000000000003
  R13: ffff8803ca928538 R14: 0000000000000001 R15: 0000000000000046
  FS:  0000000000000000(0000) GS:ffff88042f260000(0000) 
knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000001c16000 CR4: 00000000000007e0
  Stack:
   ffff8803ca8bb828 ffffffff8108ed85 ffff8803ca8bb828 ffff88042f274600
   ffff8803ca8bb8a8 ffffffff817ade93 ffff8803ca8bb848 ffff8804250612d8
   ffff8803ca8bbfd8 0000000000014600 ffff8803ca8bb888 0000000000014600
  Call Trace:
   [<ffffffff8108ed85>] wq_worker_sleeping+0x15/0xb0
   [<ffffffff817ade93>] __schedule+0x5f3/0x780
   [<ffffffff817ae0f9>] schedule+0x29/0x70
   [<ffffffff81077915>] do_exit+0x2a5/0x470
   [<ffffffff81017dc8>] oops_end+0xb8/0x160
   [<ffffffff81796707>] no_context+0x1b5/0x1c4
   [<ffffffff817968e9>] __bad_area_nosemaphore+0x1d3/0x1f2
   [<ffffffff8179691b>] bad_area_nosemaphore+0x13/0x15
   [<ffffffff81062372>] __do_page_fault+0x3b2/0x550
   [<ffffffffc060a3a9>] ? mthca_cmd_wait+0x149/0x1e0 [ib_mthca]
   [<ffffffff8106269e>] do_page_fault+0x3e/0x80
   [<ffffffff817b4388>] page_fault+0x28/0x30



Traceback from kernel 3.17.1 (hopefully this helps too):

  BUG: unable to handle kernel paging request at 0000100000000718
  IP: [<ffffffffc064c7fd>] isert_connect_request.isra.47+0x2fd/0x7d0 
[ib_isert]
  PGD 0
  Oops: 0000 [#1] SMP
  Modules linked in: target_core_pscsi target_core_file
target_core_iblock dm_thin_pool(OE) dm_persistent_data dm_bio_prison
dm_bufio libcrc32c intel_powerclamp coretemp ast gpio_ich ttm kvm
crct10dif_pclmul crc32_pclmul dcdbas ghash_clmulni_intel drm_kms_helper
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd drm
serio_raw syscopyarea sysfillrect sysimgblt joydev lpc_ich ib_mthca
ib_isert i7core_edac iscsi_target_mod edac_core ipmi_si ipmi_msghandler
ib_iser mac_hid libiscsi scsi_transport_iscsi rdma_ucm ib_uverbs rdma_cm
iw_cm ib_ipoib ib_srpt ib_cm ib_sa target_core_mod configfs ib_umad
ib_mad ib_core ib_addr lp parport bcache ses enclosure raid10 raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor
hid_generic usbhid hid raid6_pq igb ahci libahci raid1 i2c_algo_bit dca
raid0 ptp pps_core megaraid_sas multipath linear
  CPU: 2 PID: 18880 Comm: kworker/2:0 Tainted: G           OE 
3.17.1-031701-generic #201410150735
  Hardware name: Dell                   FS12-TY               /      , 
BIOS C99Q3B23 08/16/2012
  Workqueue: ib_cm cm_work_handler [ib_cm]
  task: ffff8803ea031e00 ti: ffff880378d84000 task.ti: ffff880378d84000
  RIP: 0010:[<ffffffffc064c7fd>]  [<ffffffffc064c7fd>] 
isert_connect_request.isra.47+0x2fd/0x7d0 [ib_isert]
  RSP: 0018:ffff880378d87bf8  EFLAGS: 00010287
  RAX: 0000100000000000 RBX: ffff880362f81000 RCX: 000000000009bda8
  RDX: ffff880426361000 RSI: ffff88035e872d30 RDI: ffff88042ec03d00
  RBP: ffff880378d87c48 R08: 0000000000017320 R09: ffffea000d7a1c80
  R10: ffffffffc065fb31 R11: 0000000000000000 R12: ffff880426361000
  R13: ffff880357e05000 R14: ffff880426b6f400 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff88042f240000(0000) 
knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000100000000718 CR3: 0000000001c16000 CR4: 00000000000007e0
  Stack:
   ffff880362f81458 ffff880378d87c9a ffff8804100e6d80 ffff88040db00800
   c9750c0000ad0500 ffff880357e05000 ffff880378d87c88 ffff88040f41cc00
   0000000000000000 ffff8803abc70800 ffff880378d87c68 ffffffffc064cded
  Call Trace:
   [<ffffffffc064cded>] isert_cma_handler+0x11d/0x170 [ib_isert]
   [<ffffffffc056dcc6>] cma_req_handler+0x196/0x430 [rdma_cm]
   [<ffffffffc051eff0>] cm_process_work+0x30/0x140 [ib_cm]
   [<ffffffffc0520e84>] cm_req_handler+0x274/0x3a0 [ib_cm]
   [<ffffffffc05212f5>] cm_work_handler+0xb5/0x1d4 [ib_cm]
   [<ffffffff8108ce2e>] process_one_work+0x14e/0x460
   [<ffffffff8108d7ab>] worker_thread+0x11b/0x3f0
   [<ffffffff8108d690>] ? create_worker+0x1e0/0x1e0
   [<ffffffff810932b9>] kthread+0xc9/0xe0
   [<ffffffff810931f0>] ? flush_kthread_worker+0x90/0x90
   [<ffffffff817a46fc>] ret_from_fork+0x7c/0xb0
   [<ffffffff810931f0>] ? flush_kthread_worker+0x90/0x90
  Code: be 01 00 00 00 48 89 c7 e8 c1 ff d7 ff 48 3d 00 f0 ff ff 48 89 
83 90 05 00 00 0f 87 80 04 00 00 49 8b 86 78 01 00 00 48 8b 40 08 <0f> 
b6 90 18 07 00 00 84 d2 74 0e 48 8b 45 c8 80 78 04 00 0f 84
  RIP  [<ffffffffc064c7fd>] isert_connect_request.isra.47+0x2fd/0x7d0 
[ib_isert]
   RSP <ffff880378d87bf8>
  CR2: 0000100000000718
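
As a rough cross-check of the two isert_connect_request traces (a
sketch, not a diagnosis): the byte marked <0f> in each "Code:" line
starts the faulting instruction. Assuming the standard x86-64 encoding,
0f b6 90 followed by a 32-bit displacement is
movzbl disp32(%rax),%edx, so the faulting address should be RAX plus
that displacement. The short Python script below checks this against
the reported CR2 values; the function name is made up for illustration.

```python
import struct

def faulting_address(code_bytes: str, rax: int) -> int:
    """Decode `0f b6 90 disp32` (movzbl disp32(%rax),%edx); return rax + disp32."""
    raw = bytes.fromhex(code_bytes)
    # Opcode 0f b6 = movzx r32, r/m8; ModRM 0x90 = mod 10 (disp32), reg edx, rm rax.
    if raw[:3] != bytes.fromhex("0fb690"):
        raise ValueError("unexpected instruction encoding")
    # The displacement is a little-endian signed 32-bit value after the ModRM byte.
    (disp,) = struct.unpack_from("<i", raw, 3)
    return (rax + disp) & 0xFFFFFFFFFFFFFFFF

# 3.18-rc2: RAX = 0000000000000000, Code: ... <0f> b6 90 20 07 00 00
addr_318 = faulting_address("0f b6 90 20 07 00 00", 0x0)
# 3.17.1:   RAX = 0000100000000000, Code: ... <0f> b6 90 18 07 00 00
addr_317 = faulting_address("0f b6 90 18 07 00 00", 0x0000100000000000)

print(hex(addr_318))  # 0x720
print(hex(addr_317))  # 0x100000000718
```

Both results match the CR2 values in the traces. The two preceding
instructions in the Code: bytes (49 8b 86 78 01 00 00 and 48 8b 40 08)
decode as mov 0x178(%r14),%rax and mov 0x8(%rax),%rax, so RAX appears
to hold a pointer that was NULL on 3.18-rc2 and garbage on 3.17.1 when
it was dereferenced.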


Best regards,
Adam Mazur

Thread overview: 5+ messages
2014-11-03 10:28 Adam Mazur [this message]
2014-11-03 11:27 ` CRASH 3.18-rc2, 3.17.1, isert_connect_request Sagi Grimberg
     [not found]   ` <54576696.4000203-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-11-03 11:50     ` Adam Mazur
2014-11-04  8:50       ` Adam Mazur
     [not found]         ` <54589351.1080007-yCD69WgB1YhWk0Htik3J/w@public.gmane.org>
2014-11-04 16:44           ` Sagi Grimberg
