linux-fsdevel.vger.kernel.org archive mirror
From: Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	miklos-sUDqSbJrdHQHWmgEVkV9KA@public.gmane.org,
	viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
	hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
	michael.brantley-Iq/kdjr4a97QT0dZR+AlfA@public.gmane.org,
	sven.breuner-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org,
	chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	pstaubach-83r9SdEf25FBDgjK7y7TUQ@public.gmane.org,
	malahal-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org,
	bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org,
	trond.myklebust-41N18TsMXrtuMpJDpNschA@public.gmane.org,
	rees-63aXycvo3TyHXe+LvDLADg@public.gmane.org
Subject: Re: [PATCH RFC v3] vfs: make fstatat retry once on ESTALE errors from getattr call
Date: Sun, 22 Apr 2012 09:46:32 +0530
Message-ID: <4F938620.2080301@redhat.com>
In-Reply-To: <20120420104055.511e15bc-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>

On 04/20/2012 08:10 PM, Jeff Layton wrote:
> On Wed, 18 Apr 2012 07:52:07 -0400
> Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>
>> ESTALE errors are a source of pain for many users of NFS. Usually they
>> occur when a file is removed from the server after a successful lookup
>> against it.
>>
>> Luckily, the remedy in these cases is usually simple. We should just
>> redo the lookup, forcing revalidations all the way in and then retry the
>> call. We of course cannot do this for syscalls that do not involve a
>> path, but for path-based syscalls we can and should attempt to recover
>> from an ESTALE.
>>
>> This patch implements this by having the VFS reattempt the lookup (with
>> LOOKUP_REVAL set) and call exactly once when it would ordinarily return
>> ESTALE. This should catch the bulk of these cases under normal usage,
>> without unduly inconveniencing other filesystems that return ESTALE on
>> path-based syscalls.
>>
>> Note that it's possible to hit this race more than once, but a single
>> retry should catch the bulk of these cases under normal circumstances.
>>
>> This patch is just an example. We'll alter most path-based syscalls in a
>> similar fashion to fix this correctly. At this point, I'm just trying to
>> ensure that the retry semantics are acceptable before I begin that work.
>>
>> Does anyone have strong objections to this patch? I'm aware that the
>> retry mechanism is not as robust as many (e.g. Peter) would like, but it
>> should at least improve the current situation.
>>
>> If no one has a strong objection, then I'll start going through and
>> adding similar code to the other syscalls. And hopefully we can
>> get at least some of them in for 3.5.
>>
>> Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>> ---
>>   fs/stat.c |    9 ++++++++-
>>   1 files changed, 8 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/stat.c b/fs/stat.c
>> index c733dc5..0ee9cb4 100644
>> --- a/fs/stat.c
>> +++ b/fs/stat.c
>> @@ -73,7 +73,8 @@ int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat,
>>   {
>>   	struct path path;
>>   	int error = -EINVAL;
>> -	int lookup_flags = 0;
>> +	bool retried = false;
>> +	unsigned int lookup_flags = 0;
>>
>>   	if ((flag & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT |
>>   		      AT_EMPTY_PATH)) != 0)
>> @@ -84,12 +85,18 @@ int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat,
>>   	if (flag & AT_EMPTY_PATH)
>>   		lookup_flags |= LOOKUP_EMPTY;
>>
>> +retry:
>>   	error = user_path_at(dfd, filename, lookup_flags, &path);
>>   	if (error)
>>   		goto out;
>>
>>   	error = vfs_getattr(path.mnt, path.dentry, stat);
>>   	path_put(&path);
>> +	if (error == -ESTALE && !retried) {
>> +		retried = true;
>> +		lookup_flags |= LOOKUP_REVAL;
>> +		goto retry;
>> +	}
>>   out:
>>   	return error;
>>   }
> Apologies for replying to myself here. Just to beat on the deceased
> equine a little longer, I should note that the above approach does
> *not* fix Peter's reproducer in his original email. It fails rather
> quickly when run simultaneously on the client and server.
>
> At least one of the tests in it creates and removes a hierarchy of
> directories in a loop. During that, the lookup from the client can
> easily fail more than once with ESTALE. Since we give up after a single
> retry, that causes the call to return ESTALE.
>
> While testing this approach with mkdir and fstatat, I modified the
> patch to retry multiple times, to also retry when the lookup throws
> ESTALE, and to emit a printk when the number of retries was > 1, showing
> the number of retries that the call did and the eventual error code.
>
> With Peter's (admittedly synthetic) test, we can get an answer of sorts
> to Trond's question from earlier in the thread as to how many retries
> is "enough":
>
> [   45.023665] sys_mkdirat: retries=33 error=-2
> [   47.889300] sys_mkdirat: retries=51 error=-2
> [   49.172746] sys_mkdirat: retries=27 error=-2
> [   52.325723] sys_mkdirat: retries=10 error=-2
> [   58.082576] sys_mkdirat: retries=33 error=-2
> [   59.810461] sys_mkdirat: retries=5 error=-2
> [   63.387914] sys_mkdirat: retries=14 error=-2
> [   63.630785] sys_mkdirat: retries=4 error=-2
> [   68.268903] sys_mkdirat: retries=6 error=-2
> [   71.124173] sys_mkdirat: retries=99 error=-2
> [   75.657649] sys_mkdirat: retries=123 error=-2
> [   76.903814] sys_mkdirat: retries=26 error=-2
> [   82.009463] sys_mkdirat: retries=30 error=-2
> [   84.807731] sys_mkdirat: retries=67 error=-2
> [   89.825529] sys_mkdirat: retries=166 error=-2
> [   91.599104] sys_mkdirat: retries=8 error=-2
> [   95.621855] sys_mkdirat: retries=44 error=-2
> [   98.164588] sys_mkdirat: retries=61 error=-2
> [   99.783347] sys_mkdirat: retries=11 error=-2
> [  105.593980] sys_mkdirat: retries=104 error=-2
> [  110.348861] sys_mkdirat: retries=8 error=-2
> [  112.087966] sys_mkdirat: retries=46 error=-2
> [  117.613316] sys_mkdirat: retries=90 error=-2
> [  120.117550] sys_mkdirat: retries=2 error=-2
> [  122.624330] sys_mkdirat: retries=15 error=-2
>
> So, now I'm having buyer's remorse of sorts about proposing to just
> retry once as that may not be strong enough to fix some of the cases
> we're interested in.
>
> I guess the questions at this point are:
>
> 1) How representative is Peter's mkdir_test() of a real-world workload?
>
> 2) If we assume that it is fairly representative of one, how can we
> achieve retrying indefinitely with NFS, or at least some large finite
> number of times?
>
> I have my doubts as to whether it would really be as big a problem for
> other filesystems as Miklos and others have asserted, but I'll take
> their word for it at the moment. What's the best way to contain this
> behavior to just those filesystems that want to retry indefinitely when
> they get an ESTALE? Would we need to go with an entirely new
> ESTALERETRY after all?
>
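
One rough way to read the numbers quoted above is to ask how many of
those calls would still have given up under a given retry cap. The
following is a userspace sketch only, not kernel code; the retry counts
are copied from the printk lines above, and the function names
(worst_case, calls_over_cap) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Retry counts copied from the sys_mkdirat printk lines quoted above. */
static const unsigned int retries[] = {
	33, 51, 27, 10, 33, 5, 14, 4, 6, 99, 123, 26, 30,
	67, 166, 8, 44, 61, 11, 104, 8, 46, 90, 2, 15,
};
#define NR_SAMPLES (sizeof(retries) / sizeof(retries[0]))

/* Largest retry count seen in the sample (hypothetical helper). */
unsigned int worst_case(void)
{
	unsigned int max = 0;
	size_t i;

	assert(NR_SAMPLES > 0);
	for (i = 0; i < NR_SAMPLES; i++)
		if (retries[i] > max)
			max = retries[i];
	return max;
}

/* Number of calls that needed more than 'cap' retries (hypothetical helper). */
size_t calls_over_cap(unsigned int cap)
{
	size_t i, n = 0;

	for (i = 0; i < NR_SAMPLES; i++)
		if (retries[i] > cap)
			n++;
	return n;
}
```

By that tally, all 25 logged calls needed more than one retry, 18 of
them needed more than 10, 3 needed more than 100, and the worst needed
166. That is one synthetic test on one setup, so it says little about
where a sensible cap would sit for real workloads.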

Maybe we should have a default of a single retry and a tunable to allow
clients to crank it up?
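
To make the idea concrete, here is a minimal userspace sketch of a
bounded retry loop driven by a tunable. Everything named here is an
assumption for illustration: estale_max_retries is a hypothetical knob
(it does not exist in the kernel), and path_op_t stands in for "redo
the lookup and reissue the call". The real patch would also set
LOOKUP_REVAL on the retry, which this sketch leaves out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical tunable: retries allowed after the first ESTALE. */
unsigned int estale_max_retries = 1;

/* Stand-in for "lookup + call"; returns 0 or a negative errno. */
typedef int (*path_op_t)(void *ctx);

/* Run op, retrying on -ESTALE at most estale_max_retries more times. */
int call_with_estale_retry(path_op_t op, void *ctx)
{
	unsigned int tries = 0;
	int error;

	assert(op != NULL);
	do {
		error = op(ctx);
	} while (error == -ESTALE && tries++ < estale_max_retries);
	return error;
}

/* Test double: fails with -ESTALE *ctx times, then returns -ENOENT. */
int stale_n_times(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ESTALE;
	}
	return -ENOENT;
}
```

With the default of 1, an operation that goes stale once recovers and
one that goes stale twice still returns -ESTALE, which matches the
single-retry behaviour of the v3 patch. Raising the knob lets the same
loop ride out longer races without changing the default for everyone
else.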

Ric
