From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Fabio Olive Leite <fleite@redhat.com>
Cc: linux-nfs <nfs@lists.sourceforge.net>
Subject: Re: [PATCH] Attribute timeout handling and wrapping u32 jiffies
Date: Wed, 25 Jul 2007 10:00:13 -0400 [thread overview]
Message-ID: <1185372013.6585.73.camel@localhost> (raw)
In-Reply-To: <20070725030809.GB27619@sleipnir.redhat.com>
On Wed, 2007-07-25 at 00:08 -0300, Fabio Olive Leite wrote:
> Hi all,
>
> I would like to discuss the idea that the current checks for attribute
> timeout using time_after are inadequate for 32bit architectures, since
> time_after works correctly only when the two timestamps being compared
> are within 2^31 jiffies of each other. The signed overflow caused by
> comparing values more than 2^31 jiffies apart will flip the result,
> causing incorrect assumptions of validity.
>
> 2^31 jiffies is a fairly large period of time (~25 days) when compared
> to the lifetime of most kernel data structures, but for long lived NFS
> mounts that can sit idle for months (think that for some reason autofs
> cannot be used), it is easy to compare inode attribute timestamps with
> very disparate or even bogus values (as in when jiffies have wrapped
> many times, where the comparison doesn't even make sense).
>
> Currently the code tests for attribute timeout by simply adding the
> desired amount of jiffies to the stored timestamp and comparing that
> with the current timestamp of obtained attribute data with time_after.
> This is incorrect, as it returns true for the desired timeout period
> and another full 2^31 range of jiffies.
>
> I would like to propose a different way for testing for timeouts: we
> first make sure the timestamps are comparable, meaning the "current"
> timestamp is checked for being after the "stored" timestamp (if it is
> before we assume it has wrapped), and after that we check if the
> difference between them is greater than the desired timeout/validity
> period.
>
> This test does not cast the values to signed, and instead of comparing
> two possibly unrelated points in time, it actually tests if the
> timestamp wrapped, and if the period of time between the two
> timestamps is greater than the period represented by the timeout
> value. It seems to make more sense this way.
>
> In testing with artificial jumps (several small jumps, not one big
> crank) of the jiffies I was able to reproduce a problem found in a
> server with very long lived NFS mounts, where attributes would not be
> refreshed even after touching files and directories in the server:
>
> Initial uptime:
> 03:42:01 up 6 min, 0 users, load average: 0.01, 0.12, 0.07
>
> NFS volume is mounted and time is advanced:
> 03:38:09 up 25 days, 2 min, 0 users, load average: 1.22, 1.05, 1.08
>
> # ls -l /local/A/foo/bar /nfs/A/foo/bar
> -rw-r--r-- 1 root root 0 Dec 17 03:38 /local/A/foo/bar
> -rw-r--r-- 1 root root 0 Nov 22 00:36 /nfs/A/foo/bar
>
> # touch /local/A/foo/bar
>
> # ls -l /local/A/foo/bar /nfs/A/foo/bar
> -rw-r--r-- 1 root root 0 Dec 17 03:47 /local/A/foo/bar
> -rw-r--r-- 1 root root 0 Nov 22 00:36 /nfs/A/foo/bar
>
> We can see the local mtime is updated, but the NFS mount still shows
> the old value. The patch below makes it work:
>
> Initial setup...
> 07:11:02 up 25 days, 1 min, 0 users, load average: 0.15, 0.03, 0.04
>
> # ls -l /local/A/foo/bar /nfs/A/foo/bar
> -rw-r--r-- 1 root root 0 Jan 11 07:11 /local/A/foo/bar
> -rw-r--r-- 1 root root 0 Jan 11 07:11 /nfs/A/foo/bar
>
> # touch /local/A/foo/bar
>
> # ls -l /local/A/foo/bar /nfs/A/foo/bar
> -rw-r--r-- 1 root root 0 Jan 11 07:14 /local/A/foo/bar
> -rw-r--r-- 1 root root 0 Jan 11 07:14 /nfs/A/foo/bar
>
> I'd love to hear comments and criticism. Even if this is not the
> correct way to fix it, I'm fairly sure we can agree that there's the
> need for a fix and a proper fix can be found.
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index ea97408..8e85710 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1852,7 +1852,7 @@ int nfs_access_get_cached(struct inode *inode, struct rpc_cred *cred, struct nfs
> cache = nfs_access_search_rbtree(inode, cred);
> if (cache == NULL)
> goto out;
> - if (time_after(jiffies, cache->jiffies + NFS_ATTRTIMEO(inode)))
> + if (timeout_or_wrap(jiffies, cache->jiffies, NFS_ATTRTIMEO(inode)))
> goto out_stale;
> res->jiffies = cache->jiffies;
> res->cred = cache->cred;
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index bca6cdc..f611f95 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -662,7 +662,7 @@ int nfs_attribute_timeout(struct inode *inode)
>
> if (nfs_have_delegation(inode, FMODE_READ))
> return 0;
> - return time_after(jiffies, nfsi->read_cache_jiffies+nfsi->attrtimeo);
> + return timeout_or_wrap(jiffies, nfsi->read_cache_jiffies, nfsi->attrtimeo);
> }
>
> /**
> @@ -1061,7 +1061,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> nfs_inc_stats(inode, NFSIOS_ATTRINVALIDATE);
> nfsi->attrtimeo = NFS_MINATTRTIMEO(inode);
> nfsi->attrtimeo_timestamp = now;
> - } else if (time_after(now, nfsi->attrtimeo_timestamp+nfsi->attrtimeo)) {
> + } else if (timeout_or_wrap(now, nfsi->attrtimeo_timestamp, nfsi->attrtimeo)) {
> if ((nfsi->attrtimeo <<= 1) > NFS_MAXATTRTIMEO(inode))
> nfsi->attrtimeo = NFS_MAXATTRTIMEO(inode);
> nfsi->attrtimeo_timestamp = now;
> diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
> index c080f61..ffda7cb 100644
> --- a/include/linux/jiffies.h
> +++ b/include/linux/jiffies.h
> @@ -115,6 +115,11 @@ static inline u64 get_jiffies_64(void)
> ((long)(a) - (long)(b) >= 0))
> #define time_before_eq(a,b) time_after_eq(b,a)
>
> +#define timeout_or_wrap(a,b,c) \
> + (typecheck(unsigned long, a) && \
> + typecheck(unsigned long, b) && \
> + ((a) < (b) || (a) - (b) > (c)))
> +
Ugly name. Besides, the above macro causes problems at the wraparound
boundary when (b) may indeed be < (a), but we're still in the allowed
range. There is also no guarantee that (c) is unsigned. How about
#define time_in_range(a,b,c) \
(time_after_eq(a,b) && \
time_before_eq(a,c))
instead?
Trond
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
next prev parent reply other threads:[~2007-07-25 14:00 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-25 3:08 [PATCH] Attribute timeout handling and wrapping u32 jiffies Fabio Olive Leite
2007-07-25 14:00 ` Trond Myklebust [this message]
2007-07-25 14:28 ` Fabio Olive Leite
2007-07-25 14:32 ` Trond Myklebust
2007-07-25 14:54 ` Fabio Olive Leite
2007-07-27 1:59 ` Fabio Olive Leite
2007-07-31 16:25 ` Fabio Olive Leite
2007-07-31 19:47 ` Trond Myklebust
2007-08-22 18:25 ` Fabio Olive Leite
2007-08-22 18:47 ` J. Bruce Fields
2007-08-27 13:23 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1185372013.6585.73.camel@localhost \
--to=trond.myklebust@fys.uio.no \
--cc=fleite@redhat.com \
--cc=nfs@lists.sourceforge.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.