public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Lachlan McIlroy <lachlan@sgi.com>
To: David Chinner <dgc@sgi.com>
Cc: Mark Goodwin <markgw@sgi.com>, Timothy Shimmin <tes@sgi.com>,
	xfs-dev <xfs-dev@sgi.com>, xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH] log replay should not overwrite newer ondisk inodes
Date: Mon, 10 Sep 2007 14:43:57 +1000	[thread overview]
Message-ID: <46E4CB8D.6010301@sgi.com> (raw)
In-Reply-To: <20070907140541.GA734179@sgi.com>

David Chinner wrote:
> On Fri, Sep 07, 2007 at 12:03:00PM +1000, Lachlan McIlroy wrote:
>> David Chinner wrote:
>>> An unlinked inode is only detectable by the mode parameter being zero.
>>> The rest of the inode will look valid.
>>>
>>> To detect the difference between a newly allocated inode *chunk*
>>> that has been written to and a stale inode chunk that we have
>>> just allocated and not written to yet, you need to walk every inode
>>> in the chunk and determine if the mode parameter is zero in every
>>> inode.
>>>
>>> If the mode is zero for all inodes and there are generation numbers
>>> that are not zero, then you've detected a stale buffer and you should
>>> replay the inode cluster buffer initialisation.
>>>
>> Thanks for this info Dave.  I looked into it and came up with a solution
>> that looks at the ondisk inode buffer and determines if it has been
>> written to since being logged.  It iterates through all the inodes and
>> checks each one with:
>>
>> - if the magic number is wrong the buffer is stale
> 
> *nod*
> 
>> - if the mode is non-zero then the buffer is newer than the log
> 
> *nod*
> 
>> - if the mode is zero and the generation count is non-zero then the
>>   buffer is stale
> 
> On second thoughts, this might be more complex - the buffer is stale
> only if all inodes in the *chunk* have this pattern. We may have multiple
> buffers to a chunk. e.g. allocate 55 inodes, they span two 8k cluster
> buffers. Both meet the second criteria. Now remove the first 32 inodes,
> and we have one buffer with zero allocated inodes but non-zero generation
> numbers (i.e. thrid state) and one that meets the second criteria.
> 
> However, both buffers are more recent than the buffer being replayed.
> We could lose generation count changes if we overwrite the buffer with
> no inodes in it....
> 
> So I think the stale buffer criteria must be that all the inodes in the entire
> inode chunk (i.e. across all buffers) must have a zero mode and at least one
> of the inodes has a non-zero generation count. Does that make sense?

Yeah, that makes sense.  Is inode cluster I/O done in size of chunks then?
So we are trying to determine if a chunk has been written since any of the
buffers were logged.  Is it possible that only some of the buffers are
logged before the chunk is written to disk and the rest of the buffers are
logged after?

> 
>> If the end result is a stale buffer then the buffer is replayed otherwise
>> it is skipped.  I added a new flag that gets logged with a new inode
>> cluster so that we can identify a buffer of inodes from something else.
>> This fix is passing all the tests we have.  Is this a better approach
>> than the last fix?
> 
> Definitely, but I think our "stale buffer" detection needs more work.
> 
> Cheers,
> 
> Dave.

  reply	other threads:[~2007-09-10  4:41 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-30  2:12 [PATCH] log replay should not overwrite newer ondisk inodes Lachlan McIlroy
2007-08-30  4:31 ` Timothy Shimmin
2007-08-30  4:50   ` Lachlan McIlroy
2007-08-30  8:29     ` Timothy Shimmin
2007-08-30  8:51       ` Timothy Shimmin
2007-08-31  2:22         ` Lachlan McIlroy
2007-08-31  4:01           ` Mark Goodwin
2007-08-31 15:48             ` David Chinner
2007-09-02 22:50               ` Vlad Apostolov
2007-09-03  8:49                 ` David Chinner
2007-09-07  2:03               ` Lachlan McIlroy
2007-09-07 14:05                 ` David Chinner
2007-09-10  4:43                   ` Lachlan McIlroy [this message]
2007-08-31  2:14       ` Lachlan McIlroy
2007-08-30 14:02   ` David Chinner
2007-09-04 23:05 ` Shailendra Tripathi
2007-09-04 23:49   ` David Chinner
2007-09-04 23:51     ` David Chinner
2007-09-05  1:19   ` Timothy Shimmin
2007-09-05  1:40     ` Lachlan McIlroy
2007-09-05  6:54       ` David Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46E4CB8D.6010301@sgi.com \
    --to=lachlan@sgi.com \
    --cc=dgc@sgi.com \
    --cc=markgw@sgi.com \
    --cc=tes@sgi.com \
    --cc=xfs-dev@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox