From: "Jörn Engel" <joern@logfs.org>
To: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] nilfs2: continuous snapshotting file system
Date: Wed, 27 Aug 2008 20:13:38 +0200
Message-ID: <20080827181338.GC1371@logfs.org>
In-Reply-To: <200808261654.AA00216@capsicum.lab.ntt.co.jp>

On Wed, 27 August 2008 01:54:30 +0900, Ryusuke Konishi wrote:
> 
> Yeah, it was very tough battle :)
> Read is OK.  But write was hard.  I looked at the vfs code over again and
> again.
> We've implemented NILFS without bringing specific changes into vfs.
> However, if we can find common basis for LFSes, I'm grad to cooperate 
> with you.
> Though I don't know whether exporting inode_lock is the case or not ;)

Well, I was looking more for something like a list of problems and
solutions.  Partially because I am plain curious and partially because I
know those are the problem areas of any log-structured filesystem and
they deserve special attention in a review.

In logfs, garbage collection may read (and write) any inode and any
block from any file.  And since garbage collection may be called from
writepage() and write_inode(), the fun included:

P: iget() on the inode being currently written back and locked.
S: Split I_LOCK into I_LOCK and I_SYNC.  Has been merged upstream.

P: iget() on an inode in I_FREEING or I_WILL_FREE state.
S: Add inodes to a list in drop_inode() and remove them again in
   destroy_inode().  iget() in GC context is wrapped in a method that
   checks said list first and returns an inode from the list when
   applicable.  This used to hold inode_lock to prevent races, but a
   logfs-local lock is actually sufficient.

If either of the two problems above is solved by calling
ilookup5_nowait() I bet you a fiver that a race with data corruption is
lurking somewhere in the area.

P: find_get_page() or some variant on a page handed to
   logfs_writepage().
S: Use the one available page flag, PG_owner_priv_1, to mark pages that
   are waiting for the single-threaded logfs write path.  If any page GC
   needs is locked, check for PG_owner_priv_1 and if it is set, just use
   the page anyway.  Whoever has set the flag cannot clear it until GC
   has finished.
   If the flag is not set, the page might still be somewhere in the
   logfs write path - before the flag gets set.  So simply do the check
   in a loop, call schedule() each time, knock on wood and keep your
   fingers crossed that the page will either become unlocked or get
   PG_owner_priv_1 set sometime soon.  I'm not proud of this solution but
   know no better one.
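
The polling loop above could look roughly like this in a userspace
model.  Again a sketch under assumptions, not the logfs implementation:
struct model_page, the MODEL_PG_* bits and gc_grab_page() are made up
for illustration, C11 atomics replace the kernel page flag operations,
and sched_yield() stands in for schedule().

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PG_LOCKED       (1u << 0)	/* models PG_locked */
#define MODEL_PG_OWNER_PRIV_1 (1u << 1)	/* models PG_owner_priv_1 */

/* Hypothetical stand-in for struct page. */
struct model_page {
	atomic_uint flags;
};

/*
 * Get a page for GC.  Returns true if GC took the page lock itself and
 * must release it later, false if GC is borrowing a locked page that
 * the write path has parked with the owner flag set.
 */
bool gc_grab_page(struct model_page *page)
{
	assert(page != NULL);
	for (;;) {
		unsigned int old = atomic_load(&page->flags);

		if (old & MODEL_PG_OWNER_PRIV_1) {
			/* Parked for the write path: use it anyway. */
			return false;
		}
		if (!(old & MODEL_PG_LOCKED)) {
			/* Unlocked: try to take the lock ourselves. */
			if (atomic_compare_exchange_weak(&page->flags, &old,
							 old | MODEL_PG_LOCKED))
				return true;
			continue;	/* lost a race, look again */
		}
		/* Locked but not yet flagged: yield and retry. */
		sched_yield();
	}
}
```

As in the description, nothing bounds the loop.  It relies on the lock
holder either unlocking the page or setting the flag eventually.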

So something like the above for nilfs would be useful.  And maybe, just
to be on the safe side, try the following testcase overnight:
- Create tiny filesystem (32M or so).
- Fill filesystem 100% with a single file.
- Rewrite random parts of the file in an endless loop.

Or even better, combine this testcase with some automated system crashes
and do an fsck every time the system comes back up. ;)
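
For the rewrite step of the testcase, a helper along these lines might
do.  It is only a sketch: rewrite_random_blocks(), the 4096-byte block
size and the simple LCG are my assumptions, and it performs a bounded
number of rewrites so that the caller decides how long to loop.
Creating the filesystem and filling it with a single file are not shown.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define REWRITE_BLOCK 4096	/* assumed block size */

/*
 * Overwrite `count` randomly chosen blocks of an existing file in
 * place.  The file size never changes.  Returns 0 on success and -1 on
 * any error (including ENOSPC from the filesystem under test).
 */
int rewrite_random_blocks(const char *path, unsigned long count,
			  uint32_t seed)
{
	unsigned char buf[REWRITE_BLOCK];
	struct stat st;
	uint32_t state = seed;
	uint64_t nblocks;
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size < REWRITE_BLOCK) {
		close(fd);
		return -1;
	}
	nblocks = (uint64_t)st.st_size / REWRITE_BLOCK;

	for (unsigned long i = 0; i < count; i++) {
		off_t block;

		/* Linear congruential step; good enough to scatter writes. */
		state = state * 1664525u + 1013904223u;
		block = (off_t)((uint64_t)(state >> 8) % nblocks);
		memset(buf, (int)(state >> 24), sizeof(buf));
		if (pwrite(fd, buf, sizeof(buf), block * REWRITE_BLOCK) !=
		    (ssize_t)sizeof(buf)) {
			close(fd);
			return -1;
		}
	}
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

Calling this in a loop with a changing seed gives the endless variant.
With the filesystem 100% full, every rewrite forces the log-structured
filesystem to reclaim space before it can accept the new data.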

Jörn

-- 
Money doesn't make you happy.
Happiness doesn't fill your stomach.
