From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Steve Wise
	<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>,
	Matan Barak
	<matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: ib_uverbs: list corruption destroying a cq
Date: Wed, 26 Jul 2017 10:43:05 -0600
Message-ID: <20170726164305.GE20499@obsidianresearch.com>
In-Reply-To: <00d301d30627$26b58d80$7420a880$@opengridcomputing.com>

On Wed, Jul 26, 2017 at 10:52:14AM -0500, Steve Wise wrote:
> Hey all,
> 
> The test group hit this during a heavy rdma stress test that sets up a few
> thousand connections, runs some IO, then tears down the connections.  It
> repeatedly does this.  After around 4 hours, they see the warning below.  Looks
> like the list pointers were from freed memory (poisoned)?  This is with
> linux-4.13-rc2.
> 
> Has anyone else seen this?  I didn't find anything looking in recent posts...

This was probably introduced by Matan's recent work in this area..

Guessing it is some kind of race..
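
For anyone reading the trace: dead000000000100 matches LIST_POISON1 on
x86_64, the value list_del() writes into the next pointer of an entry it
has just unlinked. So the warning suggests the CQ entry still pointed at a
neighbour that had already been removed from the list. Below is a minimal
userspace sketch of that check. It is a model, not the kernel source:
list_del_checked() and stale_prev_demo() are hypothetical names, the poison
constants assume x86_64 and the 4.13-era definitions, and the stale pointer
is staged single-threaded, so it says nothing about where the real race is.

```c
/* Userspace model of kernel list poisoning; NOT the kernel implementation. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Assumed values: x86_64, where the poison base is 0xdead000000000000. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000200ULL)

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    assert(node != head);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Modeled on the checks in lib/list_debug.c; returns false on corruption. */
static bool list_del_entry_valid(const struct list_head *entry)
{
    const struct list_head *prev = entry->prev;
    const struct list_head *next = entry->next;

    if (next == LIST_POISON1 || prev == LIST_POISON2)
        return false;   /* the entry itself was already deleted */
    if (prev->next != entry || next->prev != entry)
        return false;   /* a neighbour no longer points back at us */
    return true;
}

/* Hypothetical helper: unlink and poison, refusing if the checks fail. */
static bool list_del_checked(struct list_head *entry)
{
    if (!list_del_entry_valid(entry))
        return false;
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
    return true;
}

/*
 * Stages the end state from the warning: b.prev still points at a, although
 * a has already been unlinked and poisoned. Reports whether deleting b
 * passed the checks and returns the value found in b.prev->next.
 */
static const struct list_head *stale_prev_demo(bool *del_ok)
{
    struct list_head head, a, b;

    list_init(&head);
    list_add_tail(&a, &head);
    list_add_tail(&b, &head);

    list_del_checked(&a);   /* normal unlink; b.prev becomes &head */
    b.prev = &a;            /* emulate a lost update: stale prev pointer */

    *del_ok = list_del_checked(&b);
    return b.prev->next;    /* a.next, poisoned by the earlier delete */
}
```

Calling stale_prev_demo() from a test program shows the delete being
refused with prev->next equal to the poison value, the same shape as the
"prev->next should be ..., but was dead000000000100" line in the report.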

Jason

> list_del corruption. prev->next should be ffff9514cf64be90, but was
> dead000000000100
> WARNING: CPU: 3 PID: 27966 at lib/list_debug.c:53
> __list_del_entry_valid+0x83/0xa0
> Modules linked in: rdma_ucm iw_cxgb4 cxgb4 nfsv3 nfs_acl nfs fscache lockd grace
> rpcrdma sunrpc rdma_cm ib_cm iw_cm ib_uverbs ebtable_nat ebtables ipt_REJECT
> nf_reject_ipv4 xt_CHECKSUM bridge autofs4 target_core_iblock target_core_file
> target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc
> 8021q garp scsi_transport_fc stp llc dm_mirror dm_region_hash dm_log vhost_net
> vhost tap tun kvm_intel kvm irqbypass uinput ppdev floppy parport_pc parport
> iTCO_wdt iTCO_vendor_support pcspkr serio_raw sg i2c_i801 lpc_ich mfd_core igb
> dca shpchp i5400_edac i5k_amb dm_mod(E) dax(E) ext4(E) jbd2(E) mbcache(E)
> sd_mod(E) pata_acpi(E) ata_generic(E) ata_piix(E) ib_core(E) libcxgb(E) ipv6(E)
> crc_ccitt(E) ptp(E) pps_core(E) radeon(E) ttm(E) drm_kms_helper(E) drm(E)
> fb_sys_fops(E) sysimgblt(E)
>  sysfillrect(E) syscopyarea(E) i2c_algo_bit(E) i2c_core(E) [last unloaded:
> cxgb4]
> CPU: 3 PID: 27966 Comm: mbw Tainted: G            E   4.13.0-rc2 #1
> Hardware name: Supermicro X7DWU/X7DWU, BIOS 1.2c 11/19/2010
> task: ffff951450fb6780 task.stack: ffffa81588144000
> RIP: 0010:__list_del_entry_valid+0x83/0xa0
> RSP: 0000:ffffa81588147b38 EFLAGS: 00010092
> RAX: 0000000000000054 RBX: ffff9514731e4240 RCX: 0000000000000000
> RDX: ffff9514efd94880 RSI: ffff9514efd8cb68 RDI: ffff9514efd8cb68
> RBP: ffffa81588147b38 R08: 0000000000000004 R09: 0000000000000000
> R10: 0000000000000074 R11: 000000000000000f R12: ffff9514a230b000
> R13: ffff9514cf64be80 R14: ffff9514d19bab38 R15: ffff9514d19bab58
> FS:  000014e8e054d720(0000) GS:ffff9514efd80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000006df4b0 CR3: 000000052dcb9000 CR4: 00000000000406e0
> Call Trace:
>  ib_uverbs_release_ucq+0x64/0x160 [ib_uverbs]
>  uverbs_free_cq+0x51/0x80 [ib_uverbs]
>  remove_commit_idr_uobject+0x22/0x50 [ib_uverbs]
>  ? uverbs_uobject_free+0x32/0x40 [ib_uverbs]
>  uverbs_cleanup_ucontext+0xe6/0x1a0 [ib_uverbs]
>  ib_uverbs_cleanup_ucontext+0x23/0x40 [ib_uverbs]
>  ib_uverbs_close+0x3c/0x120 [ib_uverbs]
>  __fput+0xc8/0x240
>  ____fput+0xe/0x10
>  task_work_run+0x68/0xa0
>  ? free_fs_struct+0x32/0x40
>  do_exit+0x16a/0x470
>  ? __getnstimeofday64+0x4d/0xf0
>  ? getnstimeofday64+0xe/0x20
>  ? __audit_syscall_entry+0xaa/0x100
>  do_group_exit+0x4e/0xc0
>  SyS_exit_group+0x17/0x20
>  do_syscall_64+0x55/0xd0
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x3fe06acf38
> RSP: 002b:00007ffc10a6efd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> RAX: ffffffffffffffda RBX: 0000003fe098a838 RCX: 0000003fe06acf38
> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff98
> R10: 0000003fe0991828 R11: 0000000000000246 R12: 0000003fe098a838
> R13: 00007ffc10a6f0d0 R14: 0000000000000000 R15: 0000000000000000
> Code: c0 c9 c3 48 89 fe 31 c0 48 c7 c7 78 17 a2 93 e8 78 a2 d9 ff 0f ff 31 c0 c9
> c3 48 89 fe 31 c0 48 c7 c7 38 17 a2 93 e8 61 a2 d9 ff <0f> ff 31 c0 c9 c3 48 89
> fe 31  c0 48 c7 c7 00 17 a2 93 e8 4a a2

Thread overview: 5+ messages
2017-07-26 15:52 ib_uverbs: list corruption destroying a cq Steve Wise
2017-07-26 16:27 ` Matan Barak
2017-07-26 16:34     ` Steve Wise
2017-07-26 16:43 ` Jason Gunthorpe [this message]
2017-07-26 18:12 ` Robert LeBlanc
