* integrity question
From: Toby Dickenson @ 2002-05-25 19:43 UTC
To: reiserfs-list
I am developing a storage layer for the ZODB object database, which is
designed to play to reiserfs strengths.
http://sourceforge.net/projects/dirstorage
I have a question about how much fsyncing is necessary to avoid losing files
on power loss, when moving them between directories.
Consider this sequence
1. write to A/B/somefile and fsync it
2. mkdir A/C
3. rename A/B/somefile to A/C/somefile
4. rmdir A/B
5. power loss
I would like to guarantee that, after journal replay, 'somefile' is in either
of those two directories (or both). Am I correct to think that I don't need
any other syncs in there?
(A second question; is there any documentation that I could have used to
answer the first question myself?)
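
A minimal sketch of steps 1 to 4 above in Python, to make the sequence concrete. The function name `move_between_dirs` and the temporary root directory are illustrative assumptions, not part of dirstorage; the only sync issued is the fsync in step 1, exactly as in the question.

```python
import os
import tempfile


def move_between_dirs(root, payload):
    """Run steps 1-4 under `root` and return the final path of 'somefile'.

    Hypothetical helper for illustration only.
    """
    old_dir = os.path.join(root, "A", "B")
    new_dir = os.path.join(root, "A", "C")
    os.makedirs(old_dir)

    # Step 1: write the file, then fsync it so its data blocks are on disk.
    old_path = os.path.join(old_dir, "somefile")
    with open(old_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())

    # Step 2: create the destination directory.
    os.mkdir(new_dir)

    # Step 3: rename the file into the destination directory.
    new_path = os.path.join(new_dir, "somefile")
    os.rename(old_path, new_path)

    # Step 4: remove the now-empty source directory.
    os.rmdir(old_dir)
    return new_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(move_between_dirs(root, b"object data"))
```

Step 5 (power loss) cannot be reproduced from user space, so the sketch only shows which calls are made and in what order.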
* Re: integrity question
From: Jean-Francois Landry @ 2002-05-26 5:25 UTC
To: Toby Dickenson; +Cc: reiserfs-list
On Sat, May 25, 2002 at 08:43:29PM +0100, Toby Dickenson wrote:
> I am developing a storage layer for the ZODB object database, which is
> designed to play to reiserfs strengths.
>
> http://sourceforge.net/projects/dirstorage
>
> I have a question about how much fsyncing is necessary to avoid losing files
> on power loss, when moving them between directories.
>
> Consider this sequence
>
> 1. write to A/B/somefile and fsync it
> 2. mkdir A/C
> 3. rename A/B/somefile to A/C/somefile
> 4. rmdir A/B
> 5. power loss
>
> I would like to guarantee that, after journal replay, 'somefile' is in either
> of those two directories (or both). Am I correct to think that I don't need
> any other syncs in there?
You are correct, renames are atomic on journalling filesystems.
So, no problems with half-written directory entries, if you lose power
at the wrong time the journal replay procedure will throw the
transaction out and you end up as if you never issued a rename at all.
> (A second question; is there any documentation that I could have used to
> answer the first question myself?)
I can't point to specific documentation, but this behavior is part of
the main ideas behind journalling filesystems. You would get
basically the same thing with ext3, XFS, JFS, Solaris UFS with the
logging option, etc.
Of course, you could always scan some mailing list archives ;)
reiserfs-list and linux-XFS are quite active and contain very valuable
insights.
Jean-Francois Landry
--
Any sufficiently complicated C or Fortran program contains an ad hoc
informally-specified bug-ridden slow implementation of half of Common Lisp.
Greenspun's Tenth Rule
--
* Re: integrity question
From: Toby Dickenson @ 2002-05-27 9:35 UTC
To: Jean-Francois Landry; +Cc: reiserfs-list
On Sunday 26 May 2002 6:25 am, Jean-Francois Landry wrote:
thanks for your time,
>> 1. write to A/B/somefile and fsync it
>> 2. mkdir A/C
>> 3. rename A/B/somefile to A/C/somefile
>> 4. rmdir A/B
>> 5. power loss
>>
>> I would like to guarantee that, after journal replay, 'somefile' is in
>> either of those two directories (or both). Am I correct to think that I
>> don't need any other syncs in there?
>
>You are correct, renames are atomic on journalling filesystems.
>So, no problems with half-written directory entries, if you lose power
>at the wrong time the journal replay procedure will throw the
>transaction out and you end up as if you never issued a rename at all.
Atomic rename isn't quite enough in the scenario I described. The filesystem
also needs to take care that step 3 is not committed before step 2, and
that step 4 is not committed before step 3.
>I can't point to specific documentation, but this behavior is part of
>the main ideas behind journalling filesystems.
I would be very happy to find that I am worrying over a problem that does not
exist.
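
For comparison, here is a sketch of what the extra syncs would look like if an application did not want to rely on the filesystem committing steps 2 to 4 in order. It assumes a POSIX system where a directory can be opened read-only and passed to `fsync`; `fsync_dir` and `careful_move` are hypothetical names. Whether reiserfs needs any of this is exactly the question being asked.

```python
import os


def fsync_dir(path):
    """Ask the OS to flush a directory's entries to disk (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def careful_move(root):
    """Steps 2-4 with a directory fsync after each metadata change.

    Assumes root/A/B/somefile already exists and has been fsynced (step 1).
    """
    a = os.path.join(root, "A")
    b = os.path.join(a, "B")
    c = os.path.join(a, "C")

    # Step 2, then sync A so the entry for A/C is flushed
    # before anything is renamed into it.
    os.mkdir(c)
    fsync_dir(a)

    # Step 3, then sync A/C so the new entry is flushed
    # before the old directory is removed.
    new_path = os.path.join(c, "somefile")
    os.rename(os.path.join(b, "somefile"), new_path)
    fsync_dir(c)

    # Step 4, then sync A so the removal of A/B is flushed too.
    os.rmdir(b)
    fsync_dir(a)
    return new_path
```

Each extra fsync costs a wait on the disk, which is why it matters whether journal ordering already gives the same guarantee.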
* Re: integrity question
From: Jean-Francois Landry @ 2002-05-28 2:26 UTC
To: Toby Dickenson; +Cc: Jean-Francois Landry, reiserfs-list
On Mon, May 27, 2002 at 10:35:24AM +0100, Toby Dickenson wrote:
> On Sunday 26 May 2002 6:25 am, Jean-Francois Landry wrote:
>
> thanks for your time,
>
> >> 1. write to A/B/somefile and fsync it
> >> 2. mkdir A/C
> >> 3. rename A/B/somefile to A/C/somefile
> >> 4. rmdir A/B
> >> 5. power loss
> >>
> >> I would like to guarantee that, after journal replay, 'somefile' is in
> >> either of those two directories (or both). Am I correct to think that I
> >> don't need any other syncs in there?
> >
> >You are correct, renames are atomic on journalling filesystems.
> >So, no problems with half-written directory entries, if you lose power
> >at the wrong time the journal replay procedure will throw the
> >transaction out and you end up as if you never issued a rename at all.
>
> Atomic rename isn't quite enough in the scenario I described. The filesystem
> also needs to take care that step 3 is not committed before step 2, and
> that step 4 is not committed before step 3.
AFAIK, all transactions are written to the journal in the order they
were issued, so this problem won't arise.
>
> I would be very happy to find that I am worrying over a problem that does not
> exist.
>
I am not aware of a journalling filesystem that does not guarantee the
two points you mentioned. You can relax now :)
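
The ordering argument can be checked on a toy model. The sketch below is not how any real journal works: it treats the filesystem as a set of paths and the journal as an ordered list of three metadata operations, and it assumes (for illustration) that a rename replayed without its target directory, or an rmdir replayed over a file, simply loses the file. All names are hypothetical.

```python
# State right after step 1: the file exists in A/B and has been fsynced.
INITIAL = frozenset({"A", "A/B", "A/B/somefile"})


def mkdir_c(state):
    """Step 2: create A/C."""
    return state | {"A/C"}


def rename_file(state):
    """Step 3: move somefile from A/B to A/C."""
    if "A/B/somefile" not in state:
        return state
    state = state - {"A/B/somefile"}
    if "A/C" in state:
        return state | {"A/C/somefile"}
    return state  # toy assumption: no target directory, entry is lost


def rmdir_b(state):
    """Step 4: remove A/B (toy assumption: anything still inside is lost)."""
    return frozenset(
        p for p in state if p != "A/B" and not p.startswith("A/B/")
    )


JOURNAL = [mkdir_c, rename_file, rmdir_b]


def replay(ops):
    """Apply the given operations, in order, to the initial state."""
    state = INITIAL
    for op in ops:
        state = op(state)
    return state


def file_survives(state):
    return "A/B/somefile" in state or "A/C/somefile" in state


# In-order commit: a crash leaves some prefix of the journal applied.
prefix_ok = [file_survives(replay(JOURNAL[:k])) for k in range(len(JOURNAL) + 1)]

# Out-of-order commit: a later step reaches disk without the earlier ones.
rmdir_alone_ok = file_survives(replay([rmdir_b]))
rename_alone_ok = file_survives(replay([rename_file]))
```

In this model every prefix of the journal keeps 'somefile' reachable, while committing step 4 without step 3, or step 3 without step 2, does not. That is the distinction between atomic rename and ordered commit.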
Jean-Francois Landry
--
'Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.'
--
* Re: integrity question
From: Chris Mason @ 2002-05-28 11:05 UTC
To: Jean-Francois Landry; +Cc: Toby Dickenson, reiserfs-list
On Mon, 2002-05-27 at 22:26, Jean-Francois Landry wrote:
> On Mon, May 27, 2002 at 10:35:24AM +0100, Toby Dickenson wrote:
> > On Sunday 26 May 2002 6:25 am, Jean-Francois Landry wrote:
> >
> > thanks for your time,
> >
> > >> 1. write to A/B/somefile and fsync it
> > >> 2. mkdir A/C
> > >> 3. rename A/B/somefile to A/C/somefile
> > >> 4. rmdir A/B
> > >> 5. power loss
> > >>
> > >> I would like to guarantee that, after journal replay, 'somefile' is in
> > >> either of those two directories (or both). Am I correct to think that I
> > >> don't need any other syncs in there?
> > >
> > >You are correct, renames are atomic on journalling filesystems.
> > >So, no problems with half-written directory entries, if you lose power
> > >at the wrong time the journal replay procedure will throw the
> > >transaction out and you end up as if you never issued a rename at all.
> >
> > Atomic rename isn't quite enough in the scenario I described. The filesystem
> > also needs to take care that step 3 is not committed before step 2, and
> > that step 4 is not committed before step 3.
>
> AFAIK, all transactions are written to the journal in the order they
> were issued, so this problem won't arise.
Correct. Only data blocks can be written out of order, but the fsync
protects you from that.
-chris
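
Given those guarantees, the check an application might run after a crash and journal replay is small. A hypothetical sketch in Python (the function name `locate_after_replay` and its error handling are assumptions for illustration, not dirstorage code):

```python
import os


def locate_after_replay(root, name="somefile"):
    """Return every place `name` is found under A/B or A/C.

    The guarantee asked for in the thread is that this list is never
    empty after replay; it may hold one path or both.
    """
    candidates = [os.path.join(root, "A", d, name) for d in ("B", "C")]
    found = [p for p in candidates if os.path.isfile(p)]
    if not found:
        raise RuntimeError(f"{name} is missing from both A/B and A/C")
    return found
```

If both paths are returned, the application can finish the move itself, for example by removing the A/B copy.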