linux-btrfs.vger.kernel.org archive mirror
* hard links across snapshots/subvolumes are actually a bad idea.
@ 2010-11-24 22:07 David Nicol
  2010-11-24 22:47 ` Erik Logtenberg
  2010-11-24 23:29 ` Gordan Bobic
  0 siblings, 2 replies; 9+ messages in thread
From: David Nicol @ 2010-11-24 22:07 UTC (permalink / raw)
  To: BTRFS MAILING LIST

I've been thinking about this for a while, from a perspective of how
to make it work by allocating i-node numbers from a global pool, but
yesterday I realized that offering the feature would be a bad idea
because it violates the semantics of file systems.

I will be happy to expand on that point if anyone disagrees with it.

dln

-- 
"It is merely a matter of persistence." -- Albert Camus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-24 22:07 hard links across snapshots/subvolumes are actually a bad idea David Nicol
@ 2010-11-24 22:47 ` Erik Logtenberg
  2010-11-24 23:29 ` Gordan Bobic
  1 sibling, 0 replies; 9+ messages in thread
From: Erik Logtenberg @ 2010-11-24 22:47 UTC (permalink / raw)
  Cc: BTRFS MAILING LIST

Hi David,

It's not that I disagree per se, but I'd appreciate it if you would explain
this statement a little. Personally, I was hoping for cross-subvolume
hardlinks, so I'd like to know why this hope was apparently a bad idea.

Kind regards,

Erik.


On 11/24/2010 11:07 PM, David Nicol wrote:
> I've been thinking about this for a while, from a perspective of how
> to make it work by allocating i-node numbers from a global pool, but
> yesterday I realized that offering the feature would be a bad idea
> because it violates the semantics of file systems.
> 
> I will be happy to expand on that point if anyone disagrees with it.
> 
> dln
> 



* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-24 22:07 hard links across snapshots/subvolumes are actually a bad idea David Nicol
  2010-11-24 22:47 ` Erik Logtenberg
@ 2010-11-24 23:29 ` Gordan Bobic
  2010-11-24 23:57   ` cwillu
  1 sibling, 1 reply; 9+ messages in thread
From: Gordan Bobic @ 2010-11-24 23:29 UTC (permalink / raw)
  To: BTRFS MAILING LIST

On 11/24/2010 10:07 PM, David Nicol wrote:
> I've been thinking about this for a while, from a perspective of how
> to make it work by allocating i-node numbers from a global pool, but
> yesterday I realized that offering the feature would be a bad idea
> because it violates the semantics of file systems.
>
> I will be happy to expand on that point if anyone disagrees with it.

One thing I would like to see is copy-on-write hard-links. The
hard-links that span snapshots should be possible, but they should be
copy-on-write, i.e. as soon as a hard-linked file that spans snapshots is
written, the snapshot that wrote it should have its own forked copy
henceforth.

This would be immensely useful for things like virtualization and memory 
saving. Not only would it save memory in the caches, but if multiple 
instances of the same OS are used in LXC containers cloned using 
snapshots, then two DLLs with the same inode number would mmap into the 
same memory segment. That means that no matter how many VMs you run, if 
they have the same DLLs, the memory requirement for all those would be 
the same as if there was only one VM running.

Linux Vserver does exactly this (it patches several of the FS drivers to
add this feature). This is pretty much the only reason why I use it
instead of the already merged LXC.

Gordan


* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-24 23:29 ` Gordan Bobic
@ 2010-11-24 23:57   ` cwillu
  2010-11-25  0:18     ` Gordan Bobic
  0 siblings, 1 reply; 9+ messages in thread
From: cwillu @ 2010-11-24 23:57 UTC (permalink / raw)
  Cc: BTRFS MAILING LIST

On Wed, Nov 24, 2010 at 5:29 PM, Gordan Bobic <gordan@bobich.net> wrote:
> On 11/24/2010 10:07 PM, David Nicol wrote:
>>
>> I've been thinking about this for a while, from a perspective of how
>> to make it work by allocating i-node numbers from a global pool, but
>> yesterday I realized that offering the feature would be a bad idea
>> because it violates the semantics of file systems.
>>
>> I will be happy to expand on that point if anyone disagrees with it.
>
> One thing I would like to see is copy-on-write hard-links. The hard-links
> that span snapshots should be possible, but they should be copy-on-write,
> i.e. as soon as a hard-linked file that spans snapshots is written, the
> snapshot that wrote it should have its own forked copy henceforth.

There are sym-links, hard-links, and ref-links.  Cross device symlinks
are trivial.  Cross device hardlinks are evil.  Cross device ref-links
are just plain smart (and are at least partially implemented in
btrfs;  does bcp work across subvolumes?).  :)


* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-24 23:57   ` cwillu
@ 2010-11-25  0:18     ` Gordan Bobic
  2010-11-25  0:31       ` cwillu
  0 siblings, 1 reply; 9+ messages in thread
From: Gordan Bobic @ 2010-11-25  0:18 UTC (permalink / raw)
  To: BTRFS MAILING LIST

On 11/24/2010 11:57 PM, cwillu wrote:
> On Wed, Nov 24, 2010 at 5:29 PM, Gordan Bobic<gordan@bobich.net>  wrote:
>> On 11/24/2010 10:07 PM, David Nicol wrote:
>>>
>>> I've been thinking about this for a while, from a perspective of how
>>> to make it work by allocating i-node numbers from a global pool, but
>>> yesterday I realized that offering the feature would be a bad idea
>>> because it violates the semantics of file systems.
>>>
>>> I will be happy to expand on that point if anyone disagrees with it.
>>
>> One thing I would like to see is copy-on-write hard-links. The hard-links
>> that span snapshots should be possible, but they should be copy-on-write,
>> i.e. as soon as a hard-linked file that spans snapshots is written, the
>> snapshot that wrote it should have its own forked copy henceforth.
>
> There are sym-links, hard-links, and ref-links.  Cross device symlinks
> are trivial.  Cross device hardlinks are evil.  Cross device ref-links
> are just plain smart (and are at least partially implemented in
> btrfs;  does bcp work across subvolumes?).  :)

Last time I asked a similar question, there was no equivalent thing to 
COW hard-links, across snapshots or otherwise. Hard-links spanning 
physical devices don't make sense. Hard-links spanning snapshots, 
however, do. In fact, I would intuitively expect that a snapshot 
contains only COW hard-links which would get COW-ed from both the head 
and the snapshot.

Gordan


* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-25  0:18     ` Gordan Bobic
@ 2010-11-25  0:31       ` cwillu
  2010-11-25  8:00         ` Gordan Bobic
  0 siblings, 1 reply; 9+ messages in thread
From: cwillu @ 2010-11-25  0:31 UTC (permalink / raw)
  To: Gordan Bobic; +Cc: BTRFS MAILING LIST

>>> One thing I would like to see is copy-on-write hard-links. The hard-links
>>> that span snapshots should be possible, but they should be copy-on-write,
>>> i.e. as soon as a hard-linked file that spans snapshots is written, the
>>> snapshot that wrote it should have its own forked copy henceforth.
>>
>> There are sym-links, hard-links, and ref-links.  Cross device symlinks
>> are trivial.  Cross device hardlinks are evil.  Cross device ref-links
>> are just plain smart (and are at least partially implemented in
>> btrfs;  does bcp work across subvolumes?).  :)
>
> Last time I asked a similar question, there was no equivalent thing to COW
> hard-links, across snapshots or otherwise. Hard-links spanning physical
> devices don't make sense. Hard-links spanning snapshots, however, do. In
> fact, I would intuitively expect that a snapshot contains only COW
> hard-links which would get COW-ed from both the head and the snapshot.

"COW hardlinks" are ref-links (as far as I'm concerned).  I said
partially implemented, because that's exactly what a snapshot is.  I'm
just not certain whether bcp works across subvolumes or not.  An
actual hardlink (i.e., all writes appear in all hardlinks) across any
file-system-like-structure (including subvolumes and snapshots) is
insane, for the reasons that I'm sure David offered to explain.
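
A minimal sketch of the distinction drawn above, assuming an ordinary
POSIX file system. A plain copy stands in for a ref-link here: a real
ref-link would share data blocks until one side is written, and the copy
only models the behaviour visible to applications. The function name is
hypothetical.

```python
import os
import shutil
import tempfile


def demo_write_visibility():
    """Return (text read via the hard link, text read via the copy)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        hard = os.path.join(d, "hardlink")
        copy = os.path.join(d, "copy")

        with open(original, "w") as f:
            f.write("v1")

        os.link(original, hard)          # second name for the same inode
        shutil.copyfile(original, copy)  # stand-in for a ref-link: its own inode

        # Rewrite the file in place through its original name.
        with open(original, "w") as f:
            f.write("v2")

        with open(hard) as f:
            via_hard = f.read()
        with open(copy) as f:
            via_copy = f.read()
        return via_hard, via_copy
```

The write shows up through the hard link but not through the copy,
which is the "all writes appear in all hardlinks" property that would
have to hold across a subvolume boundary.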

* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-25  0:31       ` cwillu
@ 2010-11-25  8:00         ` Gordan Bobic
  2010-11-25  9:36           ` David Nicol
  0 siblings, 1 reply; 9+ messages in thread
From: Gordan Bobic @ 2010-11-25  8:00 UTC (permalink / raw)
  To: BTRFS MAILING LIST

cwillu wrote:
>>>> One thing I would like to see is copy-on-write hard-links. The hard-links
>>>> that span snapshots should be possible, but they should be copy-on-write,
>>>> i.e. as soon as a hard-linked file that spans snapshots is written, the
>>>> snapshot that wrote it should have its own forked copy henceforth.
>>> There are sym-links, hard-links, and ref-links.  Cross device symlinks
>>> are trivial.  Cross device hardlinks are evil.  Cross device ref-links
>>> are just plain smart (and are at least partially implemented in
>>> btrfs;  does bcp work across subvolumes?).  :)
>> Last time I asked a similar question, there was no equivalent thing to COW
>> hard-links, across snapshots or otherwise. Hard-links spanning physical
>> devices don't make sense. Hard-links spanning snapshots, however, do. In
>> fact, I would intuitively expect that a snapshot contains only COW
>> hard-links which would get COW-ed from both the head and the snapshot.
> 
> "COW hardlinks" are ref-links (as far as I'm concerned).  I said
> partially implemented, because that's exactly what a snapshot is.  I'm
> just not certain whether bcp works across subvolumes or not.  An
> actual hardlink (i.e., all writes appear in all hardlinks) across any
> file-system-like-structure (including subvolumes and snapshots) is
> insane, for the reasons that I'm sure David offered to explain.

My understanding is that inode numbers on the "same" files are different
between snapshots. If that is the case then it is not good enough for
the use-case I was talking about. Hard-links share inode numbers. Do
ref-links?
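
To make the inode question concrete, here is a small sketch, assuming a
POSIX file system, that compares the (st_dev, st_ino) pair applications
normally use to decide whether two names are the same file. A plain copy
again stands in for a ref-link, on the assumption that a ref-linked file
is a separate inode; the thread does not confirm that for btrfs. The
function names are hypothetical.

```python
import os
import shutil
import tempfile


def same_file(a, b):
    """True if both paths name the same inode on the same device."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def identity_demo():
    """Return (hard link is same file, copy is same file, link count)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        hard = os.path.join(d, "hardlink")
        copy = os.path.join(d, "copy")

        with open(original, "w") as f:
            f.write("data")

        os.link(original, hard)          # shares the inode
        shutil.copyfile(original, copy)  # allocates a new inode

        return (
            same_file(original, hard),
            same_file(original, copy),
            os.stat(original).st_nlink,
        )
```

Note that the comparison uses the device number as well as the inode
number, so equal inode numbers alone do not make two names "the same
file" to tools that check both.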

Gordan


* Re: hard links across snapshots/subvolumes are actually a bad idea.
  2010-11-25  8:00         ` Gordan Bobic
@ 2010-11-25  9:36           ` David Nicol
  0 siblings, 0 replies; 9+ messages in thread
From: David Nicol @ 2010-11-25  9:36 UTC (permalink / raw)
  Cc: BTRFS MAILING LIST

>> "COW hardlinks" are ref-links (as far as I'm concerned).  I said
>> partially implemented, because that's exactly what a snapshot is.  I'm
>> just not certain whether bcp works across subvolumes or not.  An
>> actual hardlink (i.e., all writes appear in all hardlinks) across any
>> file-system-like-structure (including subvolumes and snapshots) is
>> insane, for the reasons that I'm sure David offered to explain.

Yes. Which side owns the definitive copy? The file system boundary
should stay respected; a symlink works fine and clearly indicates
where the file is and is not. Applications needing hardlinks can use
mirrored directories. I had gotten the idea that the facility to
create them had been removed for engineering rather than semantic
reasons. Regardless of the history, it makes sense going forward to
respond to the question the next time it comes up with a semantic
answer rather than "we used to have them, but we took them out because
they cause instability due to i-node collisions." Such a semantic
answer would look like "Hard links across file-system-like structures
violate the separation of the volumes: if you really need that, use a
directory instead of a subvolume. (Suggest a metadata-only copy tool.)
Also, the inode numbers in a snapshot are the same as the inode
numbers in the volume snapshotted, which is a good thing, and they
stay the same even after the files diverge in content." Of course,
this would be edited for precise accuracy.
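
A hedged sketch of the "use a symlink across the boundary" advice,
assuming the kernel reports EXDEV when a hard link would cross a
file-system-like boundary, as it does for separately mounted file
systems. The helper name is hypothetical.

```python
import errno
import os


def link_or_symlink(target, name):
    """Hard-link name to target; fall back to a symlink across a boundary."""
    try:
        os.link(target, name)
        return "hardlink"
    except OSError as e:
        # Only the cross-device case is handled; anything else is a real error.
        if e.errno != errno.EXDEV:
            raise
    # An absolute path keeps the symlink valid wherever name lives.
    os.symlink(os.path.abspath(target), name)
    return "symlink"
```

Within one directory tree this returns "hardlink"; the symlink branch
is taken only when the two paths sit on opposite sides of a boundary,
where the link makes it explicit which side holds the file.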


> My understanding is that inode numbers on the "same" files are different
> between snapshots. If that is the case then it is not good enough for the
> use-case I was talking about. Hard-links share inode numbers. Do ref-links?

The idea of a btrfs-specific "copy" that simply copies the metadata,
reusing the data blocks -- that's cool -- I'm guessing that's what
"bcp" is. Semantics-wise it should be identical to a
read-and-write-everything copy, and faster than the equivalent on a
deduplicating FS because the reading can be skipped as well as
the writing. I will look up what "bcp" is and does before the next
time I write to this list.
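
A rough sketch of such a metadata-only copy, assuming the mechanism is
the Linux clone ioctl (FICLONE on current kernels); whether "bcp" works
this way is not confirmed in this thread. The helper name is
hypothetical, and the code falls back to an ordinary copy wherever the
file system refuses to clone.

```python
import fcntl
import shutil

# Linux clone ioctl request number, _IOW(0x94, 9, int). Assumed to be the
# mechanism behind a metadata-only copy; not confirmed by the thread.
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Try to share src's data blocks with dst; fall back to a full copy."""
    try:
        with open(src, "rb") as s, open(dst, "wb") as d:
            # Ask the file system to make dst reference src's extents.
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
        return "cloned"
    except OSError:
        # Cloning unsupported here, or src and dst are not cloneable together.
        shutil.copyfile(src, dst)
        return "copied"
```

Either way the caller ends up with an independent file: writing to dst
afterwards leaves src untouched, which matches the expectation that the
result is semantically identical to a read-and-write-everything copy.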

* Re: hard links across snapshots/subvolumes are actually a bad idea.
@ 2010-11-25 13:48 Tomasz Chmielewski
  0 siblings, 0 replies; 9+ messages in thread
From: Tomasz Chmielewski @ 2010-11-25 13:48 UTC (permalink / raw)
  To: linux-btrfs

> I'm just not certain whether bcp works across subvolumes or not.

It doesn't seem to work across subvolumes.


-- 
Tomasz Chmielewski
http://wpkg.org

