linux-fsdevel.vger.kernel.org archive mirror
* how do versioning filesystems take snapshot of opened files?
@ 2007-07-03  5:28 Xin Zhao
       [not found] ` <778391c50707022236v5157a933qea1994cf4cbce879@mail.gmail.com>
  2007-07-03 13:09 ` Chris Mason
  0 siblings, 2 replies; 13+ messages in thread
From: Xin Zhao @ 2007-07-03  5:28 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel

Hi,


If a file is already open when the snapshot command is issued, the file
itself could already be in an inconsistent state. Until the file is
closed, part of it may contain old data while the rest contains new
data.
How does a versioning filesystem guarantee that the file snapshot is
in a consistent state in this case?

I googled it but didn't find any answer. Can someone explain it a little bit?

Thanks

* Re: how do versioning filesystems take snapshot of opened files?
       [not found] ` <778391c50707022236v5157a933qea1994cf4cbce879@mail.gmail.com>
@ 2007-07-03  5:44   ` Xin Zhao
  0 siblings, 0 replies; 13+ messages in thread
From: Xin Zhao @ 2007-07-03  5:44 UTC (permalink / raw)
  To: Sunil Joshi; +Cc: linux-fsdevel, linux-kernel

I do know how copy-on-write works. But it seems insufficient to
address the problem I asked about.

One possible way is to modify the filesystem so that when a file is
opened for writing, we always keep a copy of the file's metadata and a
CoW bitmap. For subsequent writes, we first copy the old data
somewhere else in memory as a backup, mark the CoW bitmap, and then
write to the file. The temporary data is discarded when the file is
closed. By checking the CoW bitmap, the backup application can
determine which data it should copy to the backup media.

This solution can address the problem. But it requires copy-on-write
on every write, whether or not a backup is in progress. I suspect that
it will hurt system performance (at least write performance) quite a
bit and consume more memory.
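
To make the scheme concrete, here is a rough in-memory sketch in
Python. It is only an illustration under stated assumptions: the class
name CowFile, its methods, and the tiny block size are all hypothetical,
nothing here is real filesystem code, and it ignores writes that extend
the file.

```python
class CowFile:
    """Toy in-memory file that keeps pre-open block contents (hypothetical)."""

    BLOCK = 4  # tiny block size so the example is easy to follow

    def __init__(self, data: bytes):
        # Split the initial contents into fixed-size blocks.
        self.blocks = [bytearray(data[i:i + self.BLOCK])
                       for i in range(0, len(data), self.BLOCK)]
        self.cow_bitmap = None   # None means "not open for writing"
        self.saved = {}          # block index -> contents before the open

    def open_for_write(self):
        # Start tracking: one flag per block, all clear.
        self.cow_bitmap = [False] * len(self.blocks)
        self.saved = {}

    def write_block(self, index: int, data: bytes):
        # Copy the old block aside the first time it is overwritten.
        if self.cow_bitmap is not None and not self.cow_bitmap[index]:
            self.saved[index] = bytes(self.blocks[index])
            self.cow_bitmap[index] = True
        self.blocks[index] = bytearray(data)

    def close(self):
        # The temporary copies are discarded when the file is closed.
        self.cow_bitmap = None
        self.saved = {}

    def current_view(self) -> bytes:
        # What an ordinary reader sees right now.
        return b"".join(bytes(b) for b in self.blocks)

    def snapshot_view(self) -> bytes:
        # What a backup would copy: the saved block wherever the bitmap is set.
        out = bytearray()
        for i, block in enumerate(self.blocks):
            if self.cow_bitmap is not None and self.cow_bitmap[i]:
                out += self.saved[i]
            else:
                out += block
        return bytes(out)
```

The cost mentioned above shows up in write_block: the first write to
each block pays for an extra copy even if no backup ever runs.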

Any further thoughts?

Thanks!


On 7/3/07, Sunil Joshi <mail2joshi@gmail.com> wrote:
> It depends on how the snapshot is being taken. Usually it is copy-on-write.
> Google for copy-on-write and you will find the answer.

* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03  5:28 how do versioning filesystems take snapshot of opened files? Xin Zhao
       [not found] ` <778391c50707022236v5157a933qea1994cf4cbce879@mail.gmail.com>
@ 2007-07-03 13:09 ` Chris Mason
  2007-07-03 16:12   ` Xin Zhao
                     ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Chris Mason @ 2007-07-03 13:09 UTC (permalink / raw)
  To: Xin Zhao; +Cc: linux-fsdevel, linux-kernel

On Tue, 3 Jul 2007 01:28:57 -0400
"Xin Zhao" <uszhaoxin@gmail.com> wrote:

> Hi,
> 
> 
> If a file is already open when the snapshot command is issued, the file
> itself could already be in an inconsistent state. Until the file is
> closed, part of it may contain old data while the rest contains new
> data.
> How does a versioning filesystem guarantee that the file snapshot is
> in a consistent state in this case?
> 
> I googled it but didn't find any answer. Can someone explain it a
> little bit?

It's the same answer as for most filesystem-related questions...it
depends ;)  Consistent state means many different things.  It may mean
that the metadata accurately reflects the space on disk allocated to
the file and that all data for the file is properly on disk (i.e. after
an fsync).
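
As a minimal sketch of that single-file notion of "consistent", the
Python below writes a file and calls fsync before returning. The helper
name durable_write is made up for illustration. It makes no claim about
related files, and a newly created file may also need an fsync on its
parent directory, which is omitted here.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the kernel to flush it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())     # ask the kernel to push its buffers to disk
```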

But, even this is less than useful because very few files on the
filesystem stand alone.  Applications spread their state across a
number of files and so consistent means something different to
every application.

Getting a snapshot that is useful with respect to application data
requires help from the application.  The app needs to be shut down or
paused prior to the snapshot and then started up again after the
snapshot is taken.

-chris



* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 13:09 ` Chris Mason
@ 2007-07-03 16:12   ` Xin Zhao
  2007-07-03 16:35     ` Bryan Henderson
  2007-07-03 16:24   ` Bryan Henderson
  2007-07-03 22:02   ` Neil Brown
  2 siblings, 1 reply; 13+ messages in thread
From: Xin Zhao @ 2007-07-03 16:12 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-fsdevel, linux-kernel

Thanks for your reply.

Sounds like one has to stop or pause the applications to get a
consistent snapshot?  But if you look around, you may find that many
systems claim that they can take a snapshot without shutting down the
application. In fact, I think it is impractical in a commercial
environment to require that an app be shut down before taking a
snapshot.

Pausing apps is possible from the filesystem's perspective. A simple
solution is for the filesystem to stop writing any data to disk from
the point at which the snapshot command is received. But as we
discussed earlier, this is not sufficient to prevent a file from
containing partly old data and partly new data.

That's why I am confused about how these systems can provide a
consistent snapshot capability without sacrificing much system
performance.




* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 13:09 ` Chris Mason
  2007-07-03 16:12   ` Xin Zhao
@ 2007-07-03 16:24   ` Bryan Henderson
  2007-07-03 16:31     ` Xin Zhao
  2007-07-03 22:02   ` Neil Brown
  2 siblings, 1 reply; 13+ messages in thread
From: Bryan Henderson @ 2007-07-03 16:24 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-fsdevel, Xin Zhao

> Consistent state means many different things. 

And, significantly, open/close has nothing to do with any of them 
(assuming we're talking about the system calls).  open/close does not 
identify a transaction; a program may open and close a file multiple times 
in the course of making a "single" update.  Also, data and metadata updates 
remain buffered at the kernel level after a close.  And don't forget that 
a single update may span multiple files.
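
A small sketch of that point, in Python, with made-up file names and a
plain directory copy standing in for the snapshot: one logical update
(moving 30 from "a" to "b") is made with two complete open/close
cycles, and the copy is taken in between, while no file is open at all.

```python
import os
import shutil
import tempfile


def read_balance(path):
    with open(path) as f:
        return int(f.read())


def write_balance(path, value):
    # Each call is a complete open/write/close cycle.
    with open(path, "w") as f:
        f.write(str(value))


def transfer(src, dst, amount, between_steps=None):
    """Move amount from src to dst as two separate open/close cycles."""
    write_balance(src, read_balance(src) - amount)
    if between_steps is not None:
        between_steps()          # e.g. a snapshot landing mid-update
    write_balance(dst, read_balance(dst) + amount)


def demo():
    root = tempfile.mkdtemp()
    try:
        live = os.path.join(root, "live")
        snap = os.path.join(root, "snap")
        os.mkdir(live)
        a = os.path.join(live, "a")
        b = os.path.join(live, "b")
        write_balance(a, 100)
        write_balance(b, 0)

        # The "snapshot" is taken while every file is closed.
        transfer(a, b, 30, between_steps=lambda: shutil.copytree(live, snap))

        live_total = read_balance(a) + read_balance(b)
        snap_total = (read_balance(os.path.join(snap, "a"))
                      + read_balance(os.path.join(snap, "b")))
        return live_total, snap_total
    finally:
        shutil.rmtree(root)
```

Here demo() returns (100, 70): the live files add up, but the copy
caught the debit without the credit even though nothing was open.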

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems


* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 16:24   ` Bryan Henderson
@ 2007-07-03 16:31     ` Xin Zhao
  2007-07-03 17:04       ` Chris Mason
  0 siblings, 1 reply; 13+ messages in thread
From: Xin Zhao @ 2007-07-03 16:31 UTC (permalink / raw)
  To: Bryan Henderson; +Cc: Chris Mason, linux-fsdevel

That's a good point!

But then it sounds hopeless to take a truly consistent snapshot from the
app's perspective unless you shut down the computer. Right?

Thanks.



* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 16:12   ` Xin Zhao
@ 2007-07-03 16:35     ` Bryan Henderson
  0 siblings, 0 replies; 13+ messages in thread
From: Bryan Henderson @ 2007-07-03 16:35 UTC (permalink / raw)
  To: Xin Zhao; +Cc: Chris Mason, linux-fsdevel

>But if you look around, you may find that many
>systems claim that they can take a snapshot without shutting down the
>application.

The claim is true, because you can just pause the application and not shut 
it down.  While this means you can't simply add snapshot capability and 
solve your copy consistency problem (you need new applications too), this 
is a huge advance over what there was before.  Without snapshots, you do 
have to shut down the application.  Often for hours, and during that time 
any service request to the application fails.  With snapshots, you simply 
pause the application for a few seconds.  During that time it delays 
processing of service requests, but every request ultimately goes through, 
with the requester probably not noticing any difference.

If a system claims that the snapshot function in the filesystem alone gets 
you consistent backups, it's wrong.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems


* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 16:31     ` Xin Zhao
@ 2007-07-03 17:04       ` Chris Mason
  2007-07-03 17:15         ` Xin Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Mason @ 2007-07-03 17:04 UTC (permalink / raw)
  To: Xin Zhao; +Cc: Bryan Henderson, linux-fsdevel

On Tue, 3 Jul 2007 12:31:49 -0400
"Xin Zhao" <uszhaoxin@gmail.com> wrote:

> That's a good point!
> 
> But then it sounds hopeless to take a truly consistent snapshot from the
> app's perspective unless you shut down the computer. Right?

Many different applications support some form of pausing in order
to facilitate live backups.  You just have to keep it all in mind when
designing the total backup solution.

-chris

* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 17:04       ` Chris Mason
@ 2007-07-03 17:15         ` Xin Zhao
  2007-07-03 17:38           ` Chris Mason
  0 siblings, 1 reply; 13+ messages in thread
From: Xin Zhao @ 2007-07-03 17:15 UTC (permalink / raw)
  To: Chris Mason; +Cc: Bryan Henderson, linux-fsdevel

OK. From the discussion above, can we conclude that, from the
application's perspective, it is very hard, if not impossible, to take a
transactionally consistent snapshot without help from the applications?

Chris, you mentioned that "Many different applications support some
form of pausing in order to facilitate live backups." Can you provide
some examples? I mean popular apps.

Finally, let's back up a little bit. Say we don't care about
transaction-level consistency (a transaction may open and close a file
many times), but we do want open/close consistency in snapshots. That
is, a file in a snapshot must be in a single version, but it can be in
the middle of a transaction. Can we do that? Pausing apps by itself
does not solve this problem, because a file could already be open and
in the middle of a write. As I mentioned earlier, some systems can
back up the old data every time new data is written, but I suspect that
this would hurt system performance quite a bit. Any ideas about
that?

Thanks.




* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 17:15         ` Xin Zhao
@ 2007-07-03 17:38           ` Chris Mason
  2007-07-03 21:06             ` Bryan Henderson
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Mason @ 2007-07-03 17:38 UTC (permalink / raw)
  To: Xin Zhao; +Cc: Bryan Henderson, linux-fsdevel

On Tue, 3 Jul 2007 13:15:06 -0400
"Xin Zhao" <uszhaoxin@gmail.com> wrote:

> OK. From the discussion above, can we conclude that, from the
> application's perspective, it is very hard, if not impossible, to take a
> transactionally consistent snapshot without help from the applications?

You definitely need help from the applications.  They define what a
transaction is.

> 
> Chris, you mentioned that "Many different applications support some
> form of pausing in order to facilitate live backups." Can you provide
> some examples? I mean popular apps.

Oracle, db2, mysql, ldap, postgres, sleepycat databases...just search
for online backup and most programs that involve something
transactional have a way to do it.
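
As one concrete illustration of what an online backup interface can
look like, here is a sketch using SQLite through Python's standard
sqlite3 module. SQLite is not one of the programs named above; it is
used only because it ships with Python. The table name and the demo
function are made up for the example.

```python
import sqlite3


def online_backup_demo():
    """Back up a live database, then keep writing to the source."""
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    src.executemany("INSERT INTO t (v) VALUES (?)", [("a",), ("b",), ("c",)])
    src.commit()

    dst = sqlite3.connect(":memory:")
    src.backup(dst)              # the database engine copies its own state

    # The source stays usable; later writes do not appear in the copy.
    src.execute("INSERT INTO t (v) VALUES ('d')")
    src.commit()

    src_rows = src.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    dst_rows = dst.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    src.close()
    dst.close()
    return src_rows, dst_rows
```

The point of the example is only that the application, not the
filesystem, decides what a consistent copy is.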

> 
> Finally, let's back up a little bit. Say we don't care about
> transaction-level consistency (a transaction may open and close a file
> many times), but we do want open/close consistency in snapshots. That
> is, a file in a snapshot must be in a single version, but it can be in
> the middle of a transaction. Can we do that? Pausing apps by itself
> does not solve this problem, because a file could already be open and
> in the middle of a write. As I mentioned earlier, some systems can
> back up the old data every time new data is written, but I suspect that
> this would hurt system performance quite a bit. Any ideas about
> that?
> 

This depends on the transaction engine in your filesystem.  None of the
existing linux filesystems have a way to start a transaction when the
file opens and finish it when the file closes, or a way to roll back
individual operations that have happened inside a given transaction.

It certainly could be done, but it would also introduce a great deal of
complexity to the FS.

-chris

* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 17:38           ` Chris Mason
@ 2007-07-03 21:06             ` Bryan Henderson
  2007-07-03 21:17               ` Xin Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Bryan Henderson @ 2007-07-03 21:06 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-fsdevel, Xin Zhao

>>we want open/close consistency in snapshots.
>
>This depends on the transaction engine in your filesystem.  None of the
>existing linux filesystems have a way to start a transaction when the
>file opens and finish it when the file closes, or a way to roll back
>individual operations that have happened inside a given transaction.
>
>It certainly could be done, but it would also introduce a great deal of
>complexity to the FS.

And I would be opposed as a matter of architecture to making open/close 
transactional.  People often read more into open/close than is there, but 
open is just about gaining access and close is just about releasing 
resources.  It isn't appropriate for close to _mean_ anything.

There are filesystems that have transactions.  They use separate start 
transaction / end transaction system calls (not POSIX).

>> Pausing apps by itself
>> does not solve this problem, because a file could already be open
>> and in the middle of a write.

Just to be clear: we're saying "pause," but we mean "quiesce."  I.e., tell 
the application to reach a point where it's not in the middle of anything 
and then tell you it's there.  Indeed, whether you use open/close or some 
other kind of transaction, just pausing the application doesn't help.  If 
you were to implement open/close transactions, the filesystem driver would 
just wait for the application to close and in the meantime block all new 
opens.
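
A rough sketch of such a quiesce handshake, in Python with threads
standing in for the application and the snapshot tool. Everything here
is an assumption made for illustration: the App class, its method
names, and the use of a single lock to mean "a transaction is in
flight". The dict copy stands in for the actual filesystem snapshot.

```python
import threading


class App:
    """Toy application whose transactions touch two pieces of state."""

    def __init__(self):
        self.state = {"a": 100, "b": 0}
        self._txn_lock = threading.Lock()   # held while a transaction runs

    def transfer(self, amount):
        # One application-level transaction: both updates happen together.
        with self._txn_lock:
            self.state["a"] -= amount
            self.state["b"] += amount

    def quiesce(self, timeout=-1):
        """Wait until no transaction is in flight and hold off new ones.

        Returns False if the application did not get there in time.
        """
        return self._txn_lock.acquire(timeout=timeout)

    def resume(self):
        self._txn_lock.release()


def snapshot(app, timeout=-1):
    if not app.quiesce(timeout):
        raise TimeoutError("application did not quiesce")
    try:
        return dict(app.state)   # stands in for the filesystem snapshot
    finally:
        app.resume()
```

For example, with one thread calling app.transfer(1) in a loop and
another calling snapshot(app), every snapshot taken this way has
a + b == 100, because the copy only ever happens between transactions.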

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems



* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 21:06             ` Bryan Henderson
@ 2007-07-03 21:17               ` Xin Zhao
  0 siblings, 0 replies; 13+ messages in thread
From: Xin Zhao @ 2007-07-03 21:17 UTC (permalink / raw)
  To: Bryan Henderson; +Cc: Chris Mason, linux-fsdevel

On 7/3/07, Bryan Henderson <hbryan@us.ibm.com> wrote:
> >>we want open/close consistency in snapshots.
> >
> >This depends on the transaction engine in your filesystem.  None of the
> >existing linux filesystems have a way to start a transaction when the
> >file opens and finish it when the file closes, or a way to roll back
> >individual operations that have happened inside a given transaction.
> >
> >It certainly could be done, but it would also introduce a great deal of
> >complexity to the FS.
>
> And I would be opposed as a matter of architecture to making open/close
> transactional.  People often read more into open/close than is there, but
> open is just about gaining access and close is just about releasing
> resources.  It isn't appropriate for close to _mean_ anything.
>
> There are filesystems that have transactions.  They use separate start
> transaction / end transaction system calls (not POSIX).
>
> >> Pausing apps by itself
> >> does not solve this problem, because a file could already be open
> >> and in the middle of a write.
>
> Just to be clear: we're saying "pause," but we mean "quiesce."  I.e., tell
> the application to reach a point where it's not in the middle of anything
> and then tell you it's there.  Indeed, whether you use open/close or some
> other kind of transaction, just pausing the application doesn't help.  If
> you were to implement open/close transactions, the filesystem driver would
> just wait for the application to close and in the meantime block all new
> opens.

If we want to support open/close consistency, maybe we don't really
need help from the application. For example, the filesystem could be
implemented this way: when a file is opened for writing, we copy the
metadata and create a CoW bitmap to keep track of what has changed.
Before writing any new data to the file, we copy the old data and then
write the new data. Then, when we take a snapshot and encounter the
open file, we can save the old data instead of the new data, since
the old data is in a consistent state. Of course, newly opened files
should also be handled this way.

The filesystem driver cannot wait for the application to close, I
think. If the application is snapshot-aware, the wait time could be
tolerable. But if the application does not provide a way to process
the quiesce request, the wait could be infinite.

What do you think?



* Re: how do versioning filesystems take snapshot of opened files?
  2007-07-03 13:09 ` Chris Mason
  2007-07-03 16:12   ` Xin Zhao
  2007-07-03 16:24   ` Bryan Henderson
@ 2007-07-03 22:02   ` Neil Brown
  2 siblings, 0 replies; 13+ messages in thread
From: Neil Brown @ 2007-07-03 22:02 UTC (permalink / raw)
  To: Chris Mason; +Cc: Xin Zhao, linux-fsdevel, linux-kernel

On Tuesday July 3, chris.mason@oracle.com wrote:
> 
> Getting a snapshot that is useful with respect to application data
> requires help from the application.

Certainly.

>                                      The app needs to be shut down or
> paused prior to the snapshot and then started up again after the
> snapshot is taken.

Alternately, the app needs to be able to cope with unexpected system
shutdown (aka crash) and the same ability will allow it to cope with
an atomic snapshot.  It may be able to recover more efficiently from
an expected shutdown, so being able to tell the app about an impending
snapshot is probably a good idea, but it should be advisory only.
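
One common way an application gets that crash tolerance for a single
file is to write a temporary file, fsync it, and rename it over the
old one. The Python below is a sketch of that general technique, not
anything specific from this thread; the helper name atomic_replace is
made up, and it leaves out the fsync of the parent directory that a
careful implementation would add.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace path so a crash or snapshot sees the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # new contents are on disk before the rename
        os.replace(tmp, path)      # atomic rename on POSIX filesystems
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

An atomic snapshot taken at any moment then contains either the
complete old file or the complete new one, never a mix.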

NeilBrown
