Linux NFS development
 help / color / mirror / Atom feed
From: Peter Staubach <staubach@redhat.com>
To: chucklever@gmail.com
Cc: NFS list <linux-nfs@vger.kernel.org>,
	Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: Re: [PATCH V2] make "noac" and "actimeo=0" work correctly
Date: Thu, 10 Jul 2008 13:41:02 -0400	[thread overview]
Message-ID: <487649AE.1090909@redhat.com> (raw)
In-Reply-To: <76bd70e30807100858g58fbf454uc9331035a2bbf264-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Chuck Lever wrote:
> Hi Peter-
>
>   

Hi, Chuck.

> On Tue, Jul 8, 2008 at 12:08 PM, Peter Staubach <staubach@redhat.com> wrote:
>   
>> Hi.
>>
>> I've been looking at a bugzilla which describes a problem where
>> a customer was advised to use either the "noac" or "actimeo=0"
>> mount options to solve a consistency problem that they were
>> seeing in the file attributes.  It turned out that this solution
>> did not work reliably for them because sometimes, the local
>> attribute cache was believed to be valid and not timed out.
>> (With an attribute cache timeout of 0, the cache should always
>> appear to be timed out.)
>>
>> In looking at this situation, it appears to me that the problem
>> is that the attribute cache timeout code has an off-by-one
>> error in it.  It is assuming that the cache is valid in the
>> region, [read_cache_jiffies, read_cache_jiffies + attrtimeo].  The
>> cache should be considered valid only in the region,
>> [read_cache_jiffies, read_cache_jiffies + attrtimeo).  With this
>> change, the options, "noac" and "actimeo=0", work as originally
>> expected.
>>
>> While I was there, I addressed a problem with the jiffies
>> overflowing on 32 bit systems.  When overflow occurs, the
>> value of read_cache_jiffies + attrtimeo can be less then the
>> value of read_cache_jiffies.  This would cause an unnecessary
>> GETATTR over the wire.
>>
>> Thoughts and/or comments?  This is an updated patch which includes
>> the previous support which was added to correct the noac/actimeo=0
>> handling.
>>     
>
> A couple of random thoughts below.
>
>   

Some thoughts in response --

>> Signed-off-by: Peter Staubach <staubach@redhat.com>
>>
>>
>> --- linux-2.6.25.i686/fs/nfs/dir.c.org
>> +++ linux-2.6.25.i686/fs/nfs/dir.c
>> @@ -1808,7 +1808,7 @@ static int nfs_access_get_cached(struct
>>        cache = nfs_access_search_rbtree(inode, cred);
>>        if (cache == NULL)
>>                goto out;
>> -       if (!time_in_range(jiffies, cache->jiffies, cache->jiffies +
>> nfsi->attrtimeo))
>> +       if (!nfs_time_in_range_open(jiffies, cache->jiffies, cache->jiffies
>> + nfsi->attrtimeo))
>>                goto out_stale;
>>        res->jiffies = cache->jiffies;
>>        res->cred = cache->cred;
>> --- linux-2.6.25.i686/fs/nfs/inode.c.org
>> +++ linux-2.6.25.i686/fs/nfs/inode.c
>> @@ -706,14 +706,7 @@ int nfs_attribute_timeout(struct inode *
>>
>>        if (nfs_have_delegation(inode, FMODE_READ))
>>                return 0;
>> -       /*
>> -        * Special case: if the attribute timeout is set to 0, then always
>> -        *               treat the cache as having expired (unless holding
>> -        *               a delegation).
>> -        */
>> -       if (nfsi->attrtimeo == 0)
>> -               return 1;
>> -       return !time_in_range(jiffies, nfsi->read_cache_jiffies,
>> nfsi->read_cache_jiffies + nfsi->attrtimeo);
>> +       return !nfs_time_in_range_open(jiffies, nfsi->read_cache_jiffies,
>> nfsi->read_cache_jiffies + nfsi->attrtimeo);
>>  }
>>
>>  /**
>> @@ -1098,7 +1091,7 @@ static int nfs_update_inode(struct inode
>>                nfsi->attrtimeo_timestamp = now;
>>                nfsi->last_updated = now;
>>        } else {
>> -               if (!time_in_range(now, nfsi->attrtimeo_timestamp,
>> nfsi->attrtimeo_timestamp + nfsi->attrtimeo)) {
>> +               if (!nfs_time_in_range_open(now, nfsi->attrtimeo_timestamp,
>> nfsi->attrtimeo_timestamp + nfsi->attrtimeo)) {
>>                        if ((nfsi->attrtimeo <<= 1) >
>> NFS_MAXATTRTIMEO(inode))
>>                                nfsi->attrtimeo = NFS_MAXATTRTIMEO(inode);
>>                        nfsi->attrtimeo_timestamp = now;
>> --- linux-2.6.25.i686/include/linux/nfs_fs.h.org
>> +++ linux-2.6.25.i686/include/linux/nfs_fs.h
>> @@ -121,7 +121,7 @@ struct nfs_inode {
>>         *
>>         * We need to revalidate the cached attrs for this inode if
>>         *
>> -        *      jiffies - read_cache_jiffies > attrtimeo
>> +        *      jiffies - read_cache_jiffies >= attrtimeo
>>         */
>>        unsigned long           read_cache_jiffies;
>>        unsigned long           attrtimeo;
>> @@ -244,6 +244,22 @@ static inline unsigned NFS_MAXATTRTIMEO(
>>        return S_ISDIR(inode->i_mode) ? nfss->acdirmax : nfss->acregmax;
>>  }
>>
>> +static inline int nfs_time_in_range_open(unsigned long a,
>> +                               unsigned long b, unsigned long c)
>>     
>
> All of nfs_time_in_range_open's callers use a sum of 'b' and
> 'nfsi->attrtimeo' for 'c'.  Would it be cleaner to pass in
> nfsi->attrtimeo' rather than 'b + nfsi->attrtimeo' and do the sum
> here?  It might make sense to explicitly check nfsi->attrtimeo for
> zero here to document the special behavior of "actimeo=0".
>
>   

The behavior of "actimeo=0" isn't any more special than "actimeo=1".
It simply indicates that the attribute timeout is 0 jiffies long.

I thought about reducing the arguments, but it didn't seem to yield
anything any clearer to me.

> Alternately, checking explicitly if b and c are equal might accomplish
> the same without changing the synopsis.
>
> Also, all of nfs_time_in_range_open's callers negate the return value.
>  Would the code in the callers be cleaner if that negation was moved
> into nfs_time_in_range_open?  You might rename
> nfs_time_in_range_open() as nfs_cache_has_expired(), for example, to
> make the 'if' statements in the callers make sense in English.
>
> My feeling is that if you have to sit and stare at this for 5 minutes
> to understand precisely how it works, it has already become too
> obfuscated.  In addition to fixing any bugs, I wonder if we can make
> it easier to understand and maintain as well.
>
>   
>> +{
>> +       /*
>> +        * If c is less then b, then the jiffies have wrapped.
>> +        * If so, then check to see if a is between b and the
>> +        * max jiffies value or between 0 and the value of c.
>> +        * This is the range between b and c.
>>     
>
> include/linux/jiffies.h claims it handles jiffy wrapping correctly.
> Why isn't time_in_range() sufficient if 'c' has wrapped?  If it isn't,
> should you fix time_in_range() too?
>
>   

Clearly, time_in_range() is not sufficient if the 'c' has
wrapped.  It only tests to see if a >=b and a <= c.  If 'c'
is less than 'b', then time_in_range() will return false.

I am reluctant to fix time_in_range() because I don't know
that it is broken.  It appears to me that it works for other
uses, because otherwise, someone would have "fixed" it.

> You could then simplify this to "return b != c && time_in_range(a, b,
> c);" or something like that.  Or if you negate the return value here:
>
> static inline nfs_attributes_have_expired(unsigned long current,
>                                                          unsigned long
> start, unsigned long end)
> {
>         return (start == end) || !time_in_range(current, start, end);
> }
>
> My 0.02USD.
>
>   

The change, which makes attrtimeo=0 work for free, is to figure out
that if the attrtimeo is N, then the attribute cache is valid from
time, T, to T + N - 1, not T + N.  Thus, the current attribute
cache implementation is off by one because the attribute cache
should expire at time, T + N.  The time_in_range() macro was handy
and looked right, but wasn't quite right for the desired semantics.

Adding tests to check to see if b and c are equal is tuning for
the wrong case, I think.  I believe that the majority of file
systems are not mounted with "noac" or "actimeo=0", so the extra
test would just be overhead for the common case.

       ps

>> +        *
>> +        * Otherwise, just check to see whether a is in [b, c).
>> +        */
>> +       if (c < b)
>> +               return time_after_eq(a, b) || time_before(a, c);
>> +       return time_after_eq(a, b) && time_before(a, c);
>> +}
>> +
>>  static inline int NFS_STALE(const struct inode *inode)
>>  {
>>        return test_bit(NFS_INO_STALE, &NFS_I(inode)->flags);
>>     
>
>   


  parent reply	other threads:[~2008-07-10 17:42 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-10 20:52 [PATCH] make "noac" and "actimeo=0" work correctly Peter Staubach
2008-07-08 16:08 ` [PATCH V2] " Peter Staubach
2008-07-10 15:58   ` Chuck Lever
     [not found]     ` <76bd70e30807100858g58fbf454uc9331035a2bbf264-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-10 17:41       ` Peter Staubach [this message]
2008-07-10 18:55         ` Chuck Lever
     [not found]           ` <76bd70e30807101155l226c1cceh24ca17157cb454bf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-10 19:23             ` Peter Staubach
2008-07-10 19:31               ` Chuck Lever
2008-07-10 19:29         ` Trond Myklebust
2008-07-11 20:14           ` Peter Staubach
2008-07-11 20:19             ` Trond Myklebust
2008-07-11 20:24               ` Peter Staubach
2008-12-05 21:37                 ` [PATCH V3] optimize attribute timeouts for "noac" and "actimeo=0" Peter Staubach

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=487649AE.1090909@redhat.com \
    --to=staubach@redhat.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=chucklever@gmail.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox