All of lore.kernel.org
 help / color / mirror / Atom feed
From: Caspar Zhang <caspar@linux.alibaba.com>
To: Yihao Wu <wuyihao@linux.alibaba.com>
Cc: Greg KH <greg@kroah.com>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	caspar@linux.alibaba.com, linux-nfs@vger.kernel.org
Subject: Re: [backport request][stable] SUNRPC: Clean up initialisation of the struct rpc_rqst
Date: Mon, 1 Jul 2019 16:25:47 +0800	[thread overview]
Message-ID: <20190701082547.GA89243@linux.alibaba.com> (raw)
In-Reply-To: <1b4585b9-401a-9022-6bc9-5ecbe253799d@linux.alibaba.com>

On Mon, Jul 01, 2019 at 01:27:26AM +0800, Yihao Wu wrote:
> Hi,
>
> I'm using kernel v4.19.y and find that v4.19.y panic when mounting NFSv4, which can be simply reproduced as follows:
>
>
> while :; do
> mkfs.ext4 -F /dev/vdb
> mount /dev/vdb /tmp/mymnt_server
> exportfs -o insecure,rw,sync,no_root_squash,fsid=1 127.0.0.1:/tmp/mymnt_server
> mount -t nfs4 -o minorversion=0 127.0.0.1:/tmp/mymnt_server /tmp/mymnt_client
> umount -f -l /tmp/mymnt_client
> sleep 0.5
> exportfs -r
> sleep 0.2
> umount -f -l /tmp/mymnt_server
> done
>
>
> # kernel 4.19.y
> After a while, the kernel panic.

This is a Regression, introduced by the following commit:

    NFS4: Fix v4.0 client state corruption when mount

in v4.19.46[1].

The interesting part is, looks like commit 9dc6edcf676("SUNRPC: Clean up
initialisation of the struct rpc_rqst")[2] is not targeted to fix this
regression, but a general clean-up fix, maybe that explains somehow we
missed this fix in stable tree ;-)

Since this is a Regression in stable tree, I strongly suggest we put the
fix commit into 4.19. Greg, Trond, any comments?

Thanks,
Caspar


[1] https://lore.kernel.org/patchwork/patch/1086274/
[2] https://github.com/torvalds/linux/commit/9dc6edcf676fe188430e8b119f91280bbf285163


>
> [ 6049.676833] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: (null)
> [ 6049.747488] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: (null)
> [ 6050.248646] NFSD: starting 90-second grace period (net f0000098)
> [ 6051.348206] BUG: unable to handle kernel NULL pointer dereference at 00000000000006d8
> [ 6051.349582] PGD 0 P4D 0
> [ 6051.350279] Oops: 0000 [#1] SMP KASAN PTI
> [ 6051.351102] CPU: 7 PID: 8168 Comm: kworker/u16:3 Kdump: loaded Not tainted 4.19.48-0.267.git.c08655b39.al7.x86_64.debug #1
> [ 6051.352534] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
> [ 6051.354746] Workqueue: rpciod rpc_async_schedule [sunrpc]
> [ 6051.355844] RIP: 0010:__lock_acquire+0x2f0/0x820
> [ 6051.356892] Code: 12 05 00 00 45 31 f6 85 c9 48 8b 85 00 09 00 00 0f 85 b8 00 00 00 48 85 c0 0f 85 da 01 00 00 41 be 01 00 00 00 e9 a4 00 00 00 <49> 81 3c 24 c0 05 e1 b7 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 87
> [ 6051.359845] RSP: 0018:ffff8883e6a4fce0 EFLAGS: 00010002
> [ 6051.361045] RAX: 0000000000000001 RBX: 0000000000000246 RCX: 0000000000000000
> [ 6051.362426] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000006d8
> [ 6051.363811] RBP: ffff8883ca280000 R08: 0000000000000001 R09: 0000000000000000
> [ 6051.365213] R10: ffffed107dd7c0f6 R11: ffffed107dd7c0f6 R12: 00000000000006d8
> [ 6051.366617] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 6051.368036] FS:  0000000000000000(0000) GS:ffff8883eea00000(0000) knlGS:0000000000000000
> [ 6051.369609] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 6051.370959] CR2: 00000000000006d8 CR3: 00000001ff216001 CR4: 00000000003606a0
> [ 6051.372446] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 6051.373946] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 6051.375439] Call Trace:
> [ 6051.376583]  ? kvm_sched_clock_read+0x14/0x30
> [ 6051.377891]  lock_acquire+0xc8/0x1e0
> [ 6051.379162]  ? xprt_release+0x5b/0x360 [sunrpc]
> [ 6051.380531]  _raw_spin_lock+0x39/0x70
> [ 6051.381827]  ? xprt_release+0x5b/0x360 [sunrpc]
> [ 6051.383199]  xprt_release+0x5b/0x360 [sunrpc]
> [ 6051.384565]  rpc_release_resources_task+0xe/0x40 [sunrpc]
> [ 6051.386029]  __rpc_execute+0x1fe/0x520 [sunrpc]
> [ 6051.387429]  process_one_work+0x22f/0x600
> [ 6051.388788]  ? process_one_work+0x19a/0x600
> [ 6051.390172]  worker_thread+0x179/0x3f0
> [ 6051.391514]  kthread+0x114/0x150
> [ 6051.392829]  ? process_one_work+0x600/0x600
> [ 6051.394234]  ? kthread_park+0x80/0x80
> [ 6051.395602]  ret_from_fork+0x3a/0x50
> [ 6051.396968] Modules linked in: nfsv3 nfsd rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace hid_generic usbhid button sunrpc uhci_hcd ehci_pci ehci_hcd
>
>
> # kernel 5.0.16
> Kernel doesn't panic
>
>
> rpc_task is initialized in call_reserve, but rpc_task->tk_xprt is not initialized until call_reserveresult.
> When xprt_release is called between calling call_reserve and call_reserveresult, tk_xprt is still NULL.
> So spin_lock(&xprt->recv_lock) in xprt_release causes kernel panic.
>
> I notice that v4.20 has already fixed this issue with commit 9dc6edcf676fe188430e8b119f91280bbf285163
>
> But this patch do not cc stable@vger.kernel.org (why? forgotten?). And will v4.19.y consider to backport this patch?
>
>
> Thanks,
> Yihao Wu

--
        Thanks,
        Caspar

  reply	other threads:[~2019-07-01  8:25 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-30 17:27 [backport request][stable] SUNRPC: Clean up initialisation of the struct rpc_rqst Yihao Wu
2019-07-01  8:25 ` Caspar Zhang [this message]
2019-07-01 15:37   ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190701082547.GA89243@linux.alibaba.com \
    --to=caspar@linux.alibaba.com \
    --cc=greg@kroah.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@hammerspace.com \
    --cc=wuyihao@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.