From: Peter Staubach <staubach@redhat.com>
To: chucklever@gmail.com
Cc: NFS list <linux-nfs@vger.kernel.org>,
Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: Re: [PATCH V2] make "noac" and "actimeo=0" work correctly
Date: Thu, 10 Jul 2008 13:41:02 -0400 [thread overview]
Message-ID: <487649AE.1090909@redhat.com> (raw)
In-Reply-To: <76bd70e30807100858g58fbf454uc9331035a2bbf264-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Chuck Lever wrote:
> Hi Peter-
>
>
Hi, Chuck.
> On Tue, Jul 8, 2008 at 12:08 PM, Peter Staubach <staubach@redhat.com> wrote:
>
>> Hi.
>>
>> I've been looking at a bugzilla which describes a problem where
>> a customer was advised to use either the "noac" or "actimeo=0"
>> mount options to solve a consistency problem that they were
>> seeing in the file attributes. It turned out that this solution
>> did not work reliably for them because sometimes, the local
>> attribute cache was believed to be valid and not timed out.
>> (With an attribute cache timeout of 0, the cache should always
>> appear to be timed out.)
>>
>> In looking at this situation, it appears to me that the problem
>> is that the attribute cache timeout code has an off-by-one
>> error in it. It is assuming that the cache is valid in the
>> region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The
>> cache should be considered valid only in the region,
>> [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this
>> change, the options, "noac" and "actimeo=0", work as originally
>> expected.
>>
>> While I was there, I addressed a problem with the jiffies
>> overflowing on 32 bit systems. When overflow occurs, the
>> value of read_cache_jiffies + attrtimeo can be less then the
>> value of read_cache_jiffies. This would cause an unnecessary
>> GETATTR over the wire.
>>
>> Thoughts and/or comments? This is an updated patch which includes
>> the previous support which was added to correct the noac/actimeo=0
>> handling.
>>
>
> A couple of random thoughts below.
>
>
Some thoughts in response --
>> Signed-off-by: Peter Staubach <staubach@redhat.com>
>>
>>
>> --- linux-2.6.25.i686/fs/nfs/dir.c.org
>> +++ linux-2.6.25.i686/fs/nfs/dir.c
>> @@ -1808,7 +1808,7 @@ static int nfs_access_get_cached(struct
>> cache = nfs_access_search_rbtree(inode, cred);
>> if (cache == NULL)
>> goto out;
>> - if (!time_in_range(jiffies, cache->jiffies, cache->jiffies +
>> nfsi->attrtimeo))
>> + if (!nfs_time_in_range_open(jiffies, cache->jiffies, cache->jiffies
>> + nfsi->attrtimeo))
>> goto out_stale;
>> res->jiffies = cache->jiffies;
>> res->cred = cache->cred;
>> --- linux-2.6.25.i686/fs/nfs/inode.c.org
>> +++ linux-2.6.25.i686/fs/nfs/inode.c
>> @@ -706,14 +706,7 @@ int nfs_attribute_timeout(struct inode *
>>
>> if (nfs_have_delegation(inode, FMODE_READ))
>> return 0;
>> - /*
>> - * Special case: if the attribute timeout is set to 0, then always
>> - * treat the cache as having expired (unless holding
>> - * a delegation).
>> - */
>> - if (nfsi->attrtimeo == 0)
>> - return 1;
>> - return !time_in_range(jiffies, nfsi->read_cache_jiffies,
>> nfsi->read_cache_jiffies + nfsi->attrtimeo);
>> + return !nfs_time_in_range_open(jiffies, nfsi->read_cache_jiffies,
>> nfsi->read_cache_jiffies + nfsi->attrtimeo);
>> }
>>
>> /**
>> @@ -1098,7 +1091,7 @@ static int nfs_update_inode(struct inode
>> nfsi->attrtimeo_timestamp = now;
>> nfsi->last_updated = now;
>> } else {
>> - if (!time_in_range(now, nfsi->attrtimeo_timestamp,
>> nfsi->attrtimeo_timestamp + nfsi->attrtimeo)) {
>> + if (!nfs_time_in_range_open(now, nfsi->attrtimeo_timestamp,
>> nfsi->attrtimeo_timestamp + nfsi->attrtimeo)) {
>> if ((nfsi->attrtimeo <<= 1) >
>> NFS_MAXATTRTIMEO(inode))
>> nfsi->attrtimeo = NFS_MAXATTRTIMEO(inode);
>> nfsi->attrtimeo_timestamp = now;
>> --- linux-2.6.25.i686/include/linux/nfs_fs.h.org
>> +++ linux-2.6.25.i686/include/linux/nfs_fs.h
>> @@ -121,7 +121,7 @@ struct nfs_inode {
>> *
>> * We need to revalidate the cached attrs for this inode if
>> *
>> - * jiffies - read_cache_jiffies > attrtimeo
>> + * jiffies - read_cache_jiffies >= attrtimeo
>> */
>> unsigned long read_cache_jiffies;
>> unsigned long attrtimeo;
>> @@ -244,6 +244,22 @@ static inline unsigned NFS_MAXATTRTIMEO(
>> return S_ISDIR(inode->i_mode) ? nfss->acdirmax : nfss->acregmax;
>> }
>>
>> +static inline int nfs_time_in_range_open(unsigned long a,
>> + unsigned long b, unsigned long c)
>>
>
> All of nfs_time_in_range_open's callers use a sum of 'b' and
> 'nfsi->attrtimeo' for 'c'. Would it be cleaner to pass in
> nfsi->attrtimeo' rather than 'b + nfsi->attrtimeo' and do the sum
> here? It might make sense to explicitly check nfsi->attrtimeo for
> zero here to document the special behavior of "actimeo=0".
>
>
The behavior of "actimeo=0" isn't any more special than "actimeo=1".
It simply indicates that the attribute timeout is 0 jiffies long.
I thought about reducing the arguments, but it didn't seem to yield
anything any clearer to me.
> Alternately, checking explicitly if b and c are equal might accomplish
> the same without changing the synopsis.
>
> Also, all of nfs_time_in_range_open's callers negate the return value.
> Would the code in the callers be cleaner if that negation was moved
> into nfs_time_in_range_open? You might rename
> nfs_time_in_range_open() as nfs_cache_has_expired(), for example, to
> make the 'if' statements in the callers make sense in English.
>
> My feeling is that if you have to sit and stare at this for 5 minutes
> to understand precisely how it works, it has already become too
> obfuscated. In addition to fixing any bugs, I wonder if we can make
> it easier to understand and maintain as well.
>
>
>> +{
>> + /*
>> + * If c is less then b, then the jiffies have wrapped.
>> + * If so, then check to see if a is between b and the
>> + * max jiffies value or between 0 and the value of c.
>> + * This is the range between b and c.
>>
>
> include/linux/jiffies.h claims it handles jiffy wrapping correctly.
> Why isn't time_in_range() sufficient if 'c' has wrapped? If it isn't,
> should you fix time_in_range() too?
>
>
Clearly, time_in_range() is not sufficient if the 'c' has
wrapped. It only tests to see if a >=b and a <= c. If 'c'
is less than 'b', then time_in_range() will return false.
I am reluctant to fix time_in_range() because I don't know
that it is broken. It appears to me that it works for other
uses, because otherwise, someone would have "fixed" it.
> You could then simplify this to "return b != c && time_in_range(a, b,
> c);" or something like that. Or if you negate the return value here:
>
> static inline nfs_attributes_have_expired(unsigned long current,
> unsigned long
> start, unsigned long end)
> {
> return (start == end) || !time_in_range(current, start, end);
> }
>
> My 0.02USD.
>
>
The change, which makes attrtimeo=0 work for free, is to figure out
that if the attrtimeo is N, then the attribute cache is valid from
time, T, to T + N - 1, not T + N. Thus, the current attribute
cache implementation is off by one because the attribute cache
should expire at time, T + N. The time_in_range() macro was handy
and looked right, but wasn't quite right for the desired semantics.
Adding tests to check to see if b and c are equal is tuning for
the wrong case, I think. I believe that the majority of file
systems are not mounted with "noac" or "actimeo=0", so the extra
test would just be overhead for the common case.
ps
>> + *
>> + * Otherwise, just check to see whether a is in [b, c).
>> + */
>> + if (c < b)
>> + return time_after_eq(a, b) || time_before(a, c);
>> + return time_after_eq(a, b) && time_before(a, c);
>> +}
>> +
>> static inline int NFS_STALE(const struct inode *inode)
>> {
>> return test_bit(NFS_INO_STALE, &NFS_I(inode)->flags);
>>
>
>
next prev parent reply other threads:[~2008-07-10 17:42 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-10 20:52 [PATCH] make "noac" and "actimeo=0" work correctly Peter Staubach
2008-07-08 16:08 ` [PATCH V2] " Peter Staubach
2008-07-10 15:58 ` Chuck Lever
[not found] ` <76bd70e30807100858g58fbf454uc9331035a2bbf264-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-10 17:41 ` Peter Staubach [this message]
2008-07-10 18:55 ` Chuck Lever
[not found] ` <76bd70e30807101155l226c1cceh24ca17157cb454bf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-10 19:23 ` Peter Staubach
2008-07-10 19:31 ` Chuck Lever
2008-07-10 19:29 ` Trond Myklebust
2008-07-11 20:14 ` Peter Staubach
2008-07-11 20:19 ` Trond Myklebust
2008-07-11 20:24 ` Peter Staubach
2008-12-05 21:37 ` [PATCH V3] optimize attribute timeouts for "noac" and "actimeo=0" Peter Staubach
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=487649AE.1090909@redhat.com \
--to=staubach@redhat.com \
--cc=Trond.Myklebust@netapp.com \
--cc=chucklever@gmail.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox