From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Jeff Layton <jlayton@redhat.com>
Cc: Daniel J Blueman <daniel.blueman@gmail.com>,
	linux-nfs@vger.kernel.org, nfsv4@linux-nfs.org,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [2.6.26-rc4] mount.nfsv4/memory poisoning issues...
Date: Tue, 10 Jun 2008 14:54:48 -0400
Message-ID: <1213124088.20459.16.camel@localhost>
In-Reply-To: <20080604203504.62730951@tleilax.poochiereds.net>

On Wed, 2008-06-04 at 20:35 -0400, Jeff Layton wrote:
> On Thu, 5 Jun 2008 00:33:54 +0100
> "Daniel J Blueman" <daniel.blueman@gmail.com> wrote:
> 
> > Having experienced 'mount.nfs4: internal error' when mounting nfsv4 in
> > the past, I have a minimal test-case I sometimes run:
> > 
> > $ while :; do mount -t nfs4 filer:/store /store; umount /store; done
> > 
> > After ~100 iterations, I saw the 'mount.nfs4: internal error',
> > followed by symptoms of memory corruption [1], a locking issue with
> > the reporting [2] and another (related?) memory-corruption issue
> > (off-by-1?) [3]. A little analysis shows memory being overwritten by
> > (likely) a poison value, which gets complicated if it's not
> > use-after-free...
> > 
> > Anyone dare confirm this issue? NFSv4 server is x86-64 Ubuntu 8.04
> > 2.6.24-18, client U8.04 2.6.26-rc4; batteries included [4].
> > 
> > I'm happy to decode addresses, test patches etc.
> > 
> > Daniel
> > 
> 
> Looks like it fell down while trying to take down the kthread during a
> failed mount attempt. I have to wonder if I might have introduced a
> race when I changed nfs4 callback thread to kthread API. I think we may
> need the BKL around the last 2 statements in the main callback thread
> function. If you can easily reproduce this, could you test the
> following patch and let me know if it helps?
> 
> Note that this patch is entirely untested, so test it someplace
> non-critical ;-).
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> 
> 
> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> index c1e7c83..a3e83f9 100644
> --- a/fs/nfs/callback.c
> +++ b/fs/nfs/callback.c
> @@ -90,9 +90,9 @@ nfs_callback_svc(void *vrqstp)
>  		preverr = err;
>  		svc_process(rqstp);
>  	}
> -	unlock_kernel();
>  	nfs_callback_info.task = NULL;
>  	svc_exit_thread(rqstp);
> +	unlock_kernel();
>  	return 0;
>  }

We certainly need to protect nfs_callback_info.task (and I believe I
explained this earlier), but why do we need to protect svc_exit_thread?

Also, looking at the general use of the BKL in that code, I thought we
agreed that there was no need to hold the BKL while taking the
nfs_callback_mutex?

Trond
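
The ordering asked about above can be tried out in userspace. What follows is a minimal sketch, not the kernel code and not a proposed patch: a pthread mutex stands in for the BKL, and every name in it (big_lock, cb_info, exit_thread_stub, callback_thread, run_callback_sketch) is hypothetical. It assumes that only the shared task pointer needs the lock and that the teardown step touches per-thread state only, which is exactly the point the question leaves open.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the BKL */

struct callback_info {
	void *task;	/* shared with whoever starts and stops the thread */
};
static struct callback_info cb_info;

static int teardown_calls;	/* how often the teardown stand-in ran */

/* Stand-in for svc_exit_thread(); assumed to touch per-thread state only. */
void exit_thread_stub(int *rqst)
{
	*rqst = 0;
	teardown_calls++;
}

/* Thread body; the tail shows the ordering under discussion. */
void *callback_thread(void *arg)
{
	int *rqst = arg;

	pthread_mutex_lock(&big_lock);
	/* ... work that needs the lock would happen here ... */
	cb_info.task = NULL;		/* shared pointer cleared under the lock */
	pthread_mutex_unlock(&big_lock);

	exit_thread_stub(rqst);		/* per-thread teardown, lock not held */
	return NULL;
}

/* Publish the task pointer, run the thread to completion, return 0 on success. */
int run_callback_sketch(void)
{
	pthread_t tid;
	int rqst = 1;

	pthread_mutex_lock(&big_lock);
	cb_info.task = &rqst;
	pthread_mutex_unlock(&big_lock);

	if (pthread_create(&tid, NULL, callback_thread, &rqst) != 0)
		return -1;
	if (pthread_join(tid, NULL) != 0)
		return -1;

	assert(cb_info.task == NULL);	/* the thread cleared it before exiting */
	return rqst;			/* 0 once the teardown stand-in has run */
}
```

Calling run_callback_sketch() once from a test program (built with -pthread) should return 0 and leave teardown_calls at 1. Whether the real svc_exit_thread() is safe outside the lock depends on what it touches, which this sketch does not model.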

