From: Andrew Morton <akpm@digeo.com>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: green@namesys.com, linux-kernel@vger.kernel.org, hch@lst.de,
jack@suse.cz, mason@suse.com, shemminger@osdl.org
Subject: Re: ext2 FS corruption with 2.5.59.
Date: Sat, 25 Jan 2003 21:10:03 -0800
Message-ID: <20030125211003.082cb92c.akpm@digeo.com>
In-Reply-To: <20030126041426.GB780@holomorphy.com>
William Lee Irwin III <wli@holomorphy.com> wrote:
>
> William Lee Irwin III <wli@holomorphy.com> wrote:
> >> Ticket locks need atomic fetch and increment. These don't look right.
>
> On Sat, Jan 25, 2003 at 07:46:48PM -0800, Andrew Morton wrote:
> > Well look at the reader side:
> > loff_t i_size_read(struct inode *inode)
> > {
> >         unsigned seq;
> >         loff_t ret;
> >
> >         do {
> >                 seq = fr_write_begin(&inode->i_frlock);
> >                 ret = inode->i_size;
> >         } while (seq != fr_write_end(&inode->i_frlock));
> >         return ret;
> > }
argh. That should have been:
> >                 seq = fr_read_begin(&inode->i_frlock);
> >                 ret = inode->i_size;
> >         } while (seq != fr_read_end(&inode->i_frlock));
> >         return ret;
> > }
of course.
> This doesn't look particularly reassuring either. We have:
>
> (1) increment ->pre_sequence
> (2) wmb()
> (3) get inode->i_size
> (4) wmb()
> (5) increment ->post_sequence
> (6) wmb()
>
> Supposing the overall scheme is sound, one of the wmb()'s is unnecessary;
Could be.
> I'd have to go through some kind of state transition fiasco to be sure
> this actually recovers from the races where two readers fetch the same
> value of ->pre_sequence or ->post_sequence and store the same
> incremented value to convince myself this is right.
Readers do not modify the lock - they simply observe.
The fr_write_begin/fr_write_end pair assumes that there is only a single
writer possible. In the case of i_size, that exclusion is provided by i_sem.
i_size is always modified under i_sem.
> I'll assume you've
> either done so yourself or are relying on someone else's verification.
More the latter ;)
> Restarting the read like this is highly unusual; if retrying the
> critical section is in fact the basis of this locking algorithm then
> it's not a true ticket lock.
Retrying the read is the basis of the locking algorithm.
The frlock stuff needs more work for non-SMP bloat avoidance, but it's simple
and seems sensible.
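
To make the retry-read scheme described above concrete, here is a minimal
userspace sketch in C11. It is not the kernel's frlock code: every name
ending in _sketch is hypothetical, the two counters merely mirror the
->pre_sequence and ->post_sequence fields mentioned in the thread, and
sequentially consistent atomics stand in for the kernel's memory barriers.
It assumes a single writer at a time, as i_sem provides for i_size.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the frlock; assumes one writer at a time. */
struct frlock_sketch {
	atomic_uint pre_sequence;   /* bumped before the writer touches data */
	atomic_uint post_sequence;  /* bumped after the writer is finished */
};

struct inode_sketch {
	struct frlock_sketch i_frlock;
	/* Atomic only so the userspace sketch has no C11 data race. */
	_Atomic long long i_size;
};

static void fr_write_begin_sketch(struct frlock_sketch *l)
{
	/* Single-writer assumption: no other write may be in flight. */
	assert(atomic_load(&l->pre_sequence) == atomic_load(&l->post_sequence));
	/* Plain load + store, not fetch-and-add: safe only with one writer. */
	atomic_store(&l->pre_sequence, atomic_load(&l->pre_sequence) + 1);
}

static void fr_write_end_sketch(struct frlock_sketch *l)
{
	atomic_store(&l->post_sequence, atomic_load(&l->post_sequence) + 1);
}

/* Readers only observe the counters; they never store to the lock. */
static unsigned fr_read_begin_sketch(struct frlock_sketch *l)
{
	return atomic_load(&l->post_sequence);
}

static unsigned fr_read_end_sketch(struct frlock_sketch *l)
{
	return atomic_load(&l->pre_sequence);
}

static void i_size_write_sketch(struct inode_sketch *inode, long long size)
{
	fr_write_begin_sketch(&inode->i_frlock);
	atomic_store(&inode->i_size, size);
	fr_write_end_sketch(&inode->i_frlock);
}

static long long i_size_read_sketch(struct inode_sketch *inode)
{
	unsigned seq;
	long long ret;

	/* Retry until no write started or finished around the read. */
	do {
		seq = fr_read_begin_sketch(&inode->i_frlock);
		ret = atomic_load(&inode->i_size);
	} while (seq != fr_read_end_sketch(&inode->i_frlock));
	return ret;
}
```

The reader samples post_sequence first and pre_sequence last. If the two
values match, no write began after the last completed one, so the value
read in between is consistent. If they differ, a write overlapped the read
and the loop simply tries again.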