linux-nfs.vger.kernel.org archive mirror
From: Anna Schumaker <schumaker.anna@gmail.com>
To: Senn Klemens <klemens.senn@ims.co.at>, linux-nfs@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Subject: Re: Soft lockup in unloading kernel modules
Date: Thu, 08 May 2014 11:59:26 -0400
Message-ID: <536BA9DE.2060702@gmail.com>
In-Reply-To: <lkg4ae$hj7$1@ger.gmane.org>

I haven't applied Chuck's recent (v3) patches to that kernel yet (I've been waiting to see if people have comments).  I'll try to push something out today.

On 05/08/2014 10:28 AM, Senn Klemens wrote:
> Hi,
>
> I am getting a soft lockup on the NFS server when it reboots while at
> least one client mount is established. I am using OpenSUSE 12.3 with the
> nfs-rdma kernel from Anna Schumaker
> (git://git.linux-nfs.org/projects/anna/nfs-rdma.git).
>
> The export on the server side is done with
> /data	*(fsid=0,crossmnt,rw,mp,no_root_squash,sync,no_subtree_check,insecure)
>
> The following command is used to mount the NFSv4 share:
> mount -t nfs -o port=20049,rdma,vers=4.0,timeo=900 172.16.100.19:/ /mnt
>
> The HCA is a Mellanox MT4099 on the server and the client.
>
> The soft lockup can be reproduced with the following steps:
>   o server: Start the nfs server
>   o client: Mount the share
>   o client: Do a "ls" in the mounted directory
>   o server: Stop the nfs server
>   o server: Unload the nfs and mlx4 modules or reboot the server (I used
> the openibd init script from the Mellanox driver without having the
> Mellanox stack installed)
>
> Most of the time the server reports a soft lockup:
>   BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:6146]
>
> Sometimes I get the following kernel panic instead:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000003
> IP: [<ffffffff815a5c35>] _raw_spin_lock_bh+0x15/0x40
> PGD 82a820067 PUD 857832067 PMD 0
> Oops: 0002 [#1] SMP
> Modules linked in: nfsd nfs_acl auth_rpcgss oid_registry nfnetlink_log
> nfnetlink bluetooth rfkill nfsv4 svcrdma dm_mod cpuid nfs fscache lockd
> sunrpc af_packet 8021q garp stp llc rdma_ucm ib_ucm rdma_cm iw_cm
> ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib(-) ib_sa ib_mad ib_core
> ib_addr sr_mod cdrom usb_storage joydev mlx4_core usbhid
> x86_pkg_temp_thermal coretemp kvm_intel kvm ghash_clmulni_intel
> aesni_intel ablk_helper cryptd iTCO_wdt lrw igb gf128mul
> iTCO_vendor_support ehci_pci glue_helper pcspkr i2c_algo_bit isci
> ehci_hcd aes_x86_64 ptp libsas ioatdma lpc_ich microcode sb_edac sg
> pps_core usbcore ipmi_si tpm_tis edac_core scsi_transport_sas i2c_i801
> mfd_core dca usb_common tpm ipmi_msghandler wmi acpi_cpufreq button edd
> autofs4 xfs libcrc32c crc32c_intel processor thermal_sys scsi_dh_rdac
> scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh [last unloaded: oid_registry]
> CPU: 0 PID: 6603 Comm: modprobe Not tainted 3.15.0-rc2-anna-nfs-rdma+ #3
> Hardware name: Supermicro B9DRG-E/B9DRG-E, BIOS 3.0 09/04/2013
> task: ffff88105b8c6050 ti: ffff88105d814000 task.ti: ffff88105d814000
> RIP: 0010:[<ffffffff815a5c35>]  [<ffffffff815a5c35>]
> _raw_spin_lock_bh+0x15/0x40
> RSP: 0018:ffff88105d815d18  EFLAGS: 00010286
> RAX: 0000000000010000 RBX: ffffffffffffffff RCX: 0000000000000000
> RDX: 000000000000000b RSI: 0000000000000000 RDI: 0000000000000003
> RBP: ffff88105d815d18 R08: ffff88087c611f38 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88087c3c9800
> R13: ffff88107b82ab00 R14: 0000000000000003 R15: 0000000000000007
> FS:  00007fef64612700(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000003 CR3: 000000087c2c7000 CR4: 00000000000407f0
> Stack:
>  ffff88105d815d58 ffffffffa05199f0 ffff88105d815d88 ffff88087c3c9800
>  ffff88087c3c9400 ffff88107b82ab00 ffff88087c3c9660 ffff88087c3c95c8
>  ffff88105d815d78 ffffffffa0421ce9 ffff88087c3c9400 ffff88107b82aac0
> Call Trace:
>  [<ffffffffa05199f0>] svc_xprt_enqueue+0x50/0x220 [sunrpc]
>  [<ffffffffa0421ce9>] rdma_cma_handler+0x69/0x180 [svcrdma]
>  [<ffffffffa039d086>] cma_remove_one+0x1f6/0x220 [rdma_cm]
>  [<ffffffffa01dca86>] ib_unregister_device+0x46/0x120 [ib_core]
>  [<ffffffffa032ddc9>] mlx4_ib_remove+0x29/0x260 [mlx4_ib]
>  [<ffffffffa02fb9d0>] mlx4_remove_device+0xa0/0xc0 [mlx4_core]
>  [<ffffffffa02fba2b>] mlx4_unregister_interface+0x3b/0xa0 [mlx4_core]
>  [<ffffffffa033f4cc>] mlx4_ib_cleanup+0x10/0x23 [mlx4_ib]
>  [<ffffffff810bd6b2>] SyS_delete_module+0x152/0x220
>  [<ffffffff811496e4>] ? vm_munmap+0x54/0x70
>  [<ffffffff815adca6>] system_call_fastpath+0x1a/0x1f
> Code: 5d c3 0f b7 17 66 39 ca 74 f6 f3 90 0f b7 17 66 39 d1 75 f6 5d c3
> 55 65 81 04 25 20 b9 00 00 00 02 00 00 48 89 e5 b8 00 00 01 00 <f0> 0f
> c1 07 89 c2 c1 ea 10 66 39 c2 75 04 5d c3 f3 90 0f b7 07
> RIP  [<ffffffff815a5c35>] _raw_spin_lock_bh+0x15/0x40
>  RSP <ffff88105d815d18>
> CR2: 0000000000000003
> ---[ end trace 18e02ff413ac4b9b ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range:
> 0xffffffff80000000-0xffffffff9fffffff)
> ---[ end Kernel panic - not syncing: Fatal exception in interrupt
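
For anyone reading the oops above offline, here is a minimal, unofficial sketch in Python of how the "Oops:" error code and the "Call Trace:" frames can be pulled apart. It is not part of any kernel tooling; the helper names decode_fault_code and parse_frames are hypothetical, and the bit meanings assume the standard x86 page-fault error code layout.

```python
import re

# x86 page-fault error code bits, as printed in hex after "Oops:".
# Each entry is (meaning if the bit is set, meaning if it is clear).
FAULT_BITS = {
    0: ("protection violation", "page not present"),
    1: ("write access", "read access"),
    2: ("user mode", "kernel mode"),
    3: ("reserved bit set", None),
    4: ("instruction fetch", None),
}

# Matches frames such as
#   [<ffffffffa05199f0>] svc_xprt_enqueue+0x50/0x220 [sunrpc]
# A "?" before the symbol marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def decode_fault_code(code):
    """Return a readable description for each relevant error-code bit."""
    out = []
    for bit, (if_set, if_clear) in FAULT_BITS.items():
        text = if_set if code & (1 << bit) else if_clear
        if text:
            out.append(text)
    return out


def parse_frames(text):
    """Return one dict per stack frame found in the given log text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "symbol": m.group("symbol"),
                "offset": int(m.group("offset"), 16),
                # None means the symbol is in the core kernel image.
                "module": m.group("module"),
                "reliable": m.group("unsure") is None,
            })
    return frames


if __name__ == "__main__":
    print(decode_fault_code(0x0002))
    sample = (
        " [<ffffffffa05199f0>] svc_xprt_enqueue+0x50/0x220 [sunrpc]\n"
        " [<ffffffffa0421ce9>] rdma_cma_handler+0x69/0x180 [svcrdma]\n"
    )
    for frame in parse_frames(sample):
        print(frame)
```

Under that assumption, "Oops: 0002" decodes to a kernel-mode write to a page that is not present, which is consistent with CR2 and RDI both holding 0x3 in the register dump. The first two frames of the trace show svc_xprt_enqueue in sunrpc being called from rdma_cma_handler in svcrdma.
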
>
> Kind regards,
> Klemens

