From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Yan Burman <yanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: NFS over RDMA crashing
Date: Wed, 6 Feb 2013 17:24:35 -0500
Message-ID: <20130206222435.GL16417@fieldses.org>
In-Reply-To: <51127B3F.2090200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> When killing a mount command that got stuck:
> -------------------------------------------
> 
> BUG: unable to handle kernel paging request at ffff880324dc7ff8
> IP: [<ffffffffa05f3dfb>] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 8000000324dc7161
> Oops: 0003 [#1] PREEMPT SMP
> Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm iw_cm
> ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
> iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables x_tables
> nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
> target_core_file target_core_pscsi target_core_mod configfs 8021q
> bridge stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net
> macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support kvm_intel
> kvm crc32c_intel microcode pcspkr joydev i2c_i801 lpc_ich mfd_core
> ehci_pci ehci_hcd sg ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad ib_core
> mlx4_en mlx4_core igb hwmon dca ptp pps_core button dm_mod ext3 jbd
> sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod
> CPU 6
> Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro
> X8DTH-i/6/iF/6F/X8DTH
> RIP: 0010:[<ffffffffa05f3dfb>]  [<ffffffffa05f3dfb>]
> rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> RSP: 0018:ffff880324c3dbf8  EFLAGS: 00010297
> RAX: ffff880324dc8000 RBX: 0000000000000001 RCX: ffff880324dd8428
> RDX: ffff880324dc7ff8 RSI: ffff880324dd8428 RDI: ffffffff81149618
> RBP: ffff880324c3dd78 R08: 000060f9c0000860 R09: 0000000000000001
> R10: ffff880324dd8000 R11: 0000000000000001 R12: ffff8806299dcb10
> R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000010
> FS:  0000000000000000(0000) GS:ffff88063fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: ffff880324dc7ff8 CR3: 0000000001a0b000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process nfsd (pid: 4744, threadinfo ffff880324c3c000, task ffff880330550000)
> Stack:
>  ffff880324c3dc78 ffff880324c3dcd8 0000000000000282 ffff880631cec000
>  ffff880324dd8000 ffff88062ed33040 0000000124c3dc48 ffff880324dd8000
>  ffff88062ed33058 ffff880630ce2b90 ffff8806299e8000 0000000000000003
> Call Trace:
>  [<ffffffffa05f466e>] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
>  [<ffffffff81086540>] ? try_to_wake_up+0x2f0/0x2f0
>  [<ffffffffa045963f>] svc_recv+0x3ef/0x4b0 [sunrpc]
>  [<ffffffffa0571db0>] ? nfsd_svc+0x740/0x740 [nfsd]
>  [<ffffffffa0571e5d>] nfsd+0xad/0x130 [nfsd]
>  [<ffffffffa0571db0>] ? nfsd_svc+0x740/0x740 [nfsd]
>  [<ffffffff81071df6>] kthread+0xd6/0xe0
>  [<ffffffff81071d20>] ? __init_kthread_worker+0x70/0x70
>  [<ffffffff814b462c>] ret_from_fork+0x7c/0xb0
>  [<ffffffff81071d20>] ? __init_kthread_worker+0x70/0x70
> Code: 63 c2 49 8d 8c c2 18 02 00 00 48 39 ce 77 e1 49 8b 82 40 0a 00
> 00 48 39 c6 0f 84 92 f7 ff ff 90 48 8d 50 f8 49 89 92 40 0a 00 00
> <48> c7 40 f8 00 00 00 00 49 8b 82 40 0a 00 00 49 3b 82 30 0a 00
> RIP  [<ffffffffa05f3dfb>] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
>  RSP <ffff880324c3dbf8>
> CR2: ffff880324dc7ff8
> ---[ end trace 06d0384754e9609a ]---
> 
> 
> It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> is responsible for the crash (it seems to be crashing in
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527).
> It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> CONFIG_DEBUG_RODATA enabled. I have not tried disabling them yet.
> 
> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> was no longer getting the server crashes,
> so the rest of my tests were done using that point (it is somewhere
> in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault. I'll work on a patch, but the
rdma code's use of the ->rq_pages array is pretty confusing.

--b.
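
For readers following the thread: the commit named above replaced a
count of response pages (rq_resused) with a pointer (rq_next_page).
The sketch below is a user-space illustration only, not the svcrdma
code and not the eventual fix. It assumes a hypothetical struct req
with fields named after the ones in the subject line, and shows why a
cleanup loop that walks a pointer down to a base with "!=" depends on
the pointer never starting below that base. The bounds check in
release_ne() exists only to keep the demo well defined; nothing in
the thread says the kernel loop has one.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 16

/* Hypothetical stand-in for the per-request page bookkeeping. */
struct req {
	void *pages[NPAGES];
	void **respages;	/* base of the response pages */
	void **next_page;	/* one past the last response page in use */
};

/*
 * Clear slots from next_page back down to respages using "!=".
 * Returns the number of slots cleared, or -1 if the walk reached the
 * front of the array without ever meeting respages, which is the
 * point where an unguarded loop would keep writing below the array.
 */
static int release_ne(struct req *r)
{
	int cleared = 0;

	assert(r->respages >= r->pages && r->respages <= r->pages + NPAGES);
	while (r->next_page != r->respages) {
		if (r->next_page == r->pages)
			return -1;	/* demo-only guard */
		*(--r->next_page) = NULL;
		cleared++;
	}
	return cleared;
}

/* Same walk with ">": a pointer that starts below the base is a no-op. */
static int release_gt(struct req *r)
{
	int cleared = 0;

	while (r->next_page > r->respages) {
		*(--r->next_page) = NULL;
		cleared++;
	}
	return cleared;
}
```

With respages at &pages[4] and next_page at &pages[7], release_ne()
clears three slots and stops. With next_page at &pages[2] it never
meets respages and returns -1, while release_gt() clears nothing.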