From: Daniel McNeil <daniel@osdl.org>
To: Andrea Arcangeli <andrea@suse.de>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.4.19rc2aa1 i_size atomic access
Date: 29 Jul 2002 17:34:16 -0700 [thread overview]
Message-ID: <1027989256.578.30.camel@IBM-C> (raw)
In-Reply-To: <20020723170807.GW1116@dualathlon.random>
Andrea,
Sorry I haven't responded, but I was on vacation all last week and
was not near a computer.
I like your code change. Incrementing the v2 before the v1 in the
i_size_write() is much better. My code was definitely uglier -- but
it was correct since the version1 and version2 where sampled before
i_size was read and version1 and version2 where checked again after.
It was excessive, but correct.
On your patch, shouldn't non-smp preempt still use the 64-bit stuff?
The comment says it should, but the #ifdef's are not checking for
PREEMPT or did I miss something?
I would still be curious about the performance difference between the
version approach and the cmpxchg8 approach. With SMP I'm a bit worried
about the cacheline bouncing around and the memory bandwith wasted.
Any ideas on what kind of test would be appropriate?
I've got access to 2-proc to 8-proc systems I could run some tests on,
just not sure what test would be useful. The fstat() test isn't
realistic.
Increasing the versions to 32-bit is ok with -- I was just trying to
not waste too much space.
Daniel
On Tue, 2002-07-23 at 10:08, Andrea Arcangeli wrote:
> On Fri, Jul 19, 2002 at 03:56:36PM -0700, Daniel McNeil wrote:
> > + short v1;
> > + short v2;
> > + loff_t i_size;
> > +
> > + /*
> > + * retry if i_size was possibly modified while sampling.
> > + */
> > + do {
> > + v1 = inode->i_attr_version1;
> > + v2 = inode->i_attr_version2;
> > + rmb();
> > + i_size = inode->i_size;
> > + rmb();
> > + } while (v1 != v2 ||
> > + v1 != inode->i_attr_version1 ||
> > + v1 != inode->i_attr_version2);
> > + return i_size;
> [..]
> > #elif BITS_PER_LONG==64
> > return inode->i_size;
> > #endif
> > @@ -548,8 +566,12 @@
> > static inline void i_size_write(struct inode * inode, loff_t i_size)
> > {
> > #if BITS_PER_LONG==32
> > - set_64bit((unsigned long long *) &inode->i_size,
> > - (unsigned long long) i_size);
> > + inode->i_attr_version1++; /* changing i_size */
> > + wmb();
> > + inode->i_size = i_size;
> > + wmb();
> > + inode->i_attr_version2++; /* done with change */
> > + wmb();
> > #elif BITS_PER_LONG==64
> > inode->i_size = i_size;
> > #endif
>
> btw, looking at the implementation the read side was buggy. First it's
> pointless to read both version1 and version2 before reading the isize,
> secondly if you increase version1 first (in the writer), the reader has
> to read version2 before i_size and version1 after i_size. While you're
> doing the opposite, you compare v1 (version1) read before i_size with
> version2 after i_size.
>
> So while merging it I rewrote it this way (I also change the type of the
> sequence number to int, 2^16 can overflow quite fast in a multighz cpu,
> mostly for paranoid of course, and to go safe with the atomic granularty
> provided by archs, int certainly has to be atomic-granular).
>
> loff_t i_size;
> int v1, v2;
>
> /* Retry if i_size was possibly modified while sampling. */
> do {
> v1 = inode->i_size_version1;
> rmb();
> i_size = inode->i_size;
> rmb();
> v2 = inode->i_size_version2;
> } while (v1 != v2);
>
> return i_size;
>
>
> and the new writer side:
>
> inode->i_size_version2++;
> wmb();
> inode->i_size = i_size;
> wmb();
> inode->i_size_version1++;
> wmb(); /* make it visible ASAP */
>
> Andrea
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
next prev parent reply other threads:[~2002-07-30 0:31 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1026949132.20314.0.camel@joe2.pdx.osdl.net>
[not found] ` <1026951041.2412.38.camel@IBM-C>
[not found] ` <20020718103511.GG994@dualathlon.random>
2002-07-19 0:09 ` 2.4.19rc2aa1 i_size atomic access Daniel McNeil
2002-07-19 9:23 ` Andrea Arcangeli
2002-07-19 22:56 ` Daniel McNeil
2002-07-23 16:56 ` Andrea Arcangeli
2002-07-23 17:08 ` Andrea Arcangeli
2002-07-23 17:47 ` Andrea Arcangeli
2002-07-23 18:15 ` Maciej W. Rozycki
2002-07-23 19:20 ` Andrea Arcangeli
2002-07-24 14:19 ` Maciej W. Rozycki
2002-07-24 14:26 ` Andrea Arcangeli
2002-07-29 18:37 ` Bob Miller
2002-07-29 18:47 ` Andrea Arcangeli
2002-07-30 0:34 ` Daniel McNeil [this message]
2002-07-30 1:12 ` Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1027989256.578.30.camel@IBM-C \
--to=daniel@osdl.org \
--cc=andrea@suse.de \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox