* hard links
@ 2012-04-04 16:38 Arnd Hannemann
2012-04-04 19:33 ` Shyam Prasad N
0 siblings, 1 reply; 9+ messages in thread
From: Arnd Hannemann @ 2012-04-04 16:38 UTC (permalink / raw)
To: linux-btrfs
Hi,
Today I experimented with hard links on btrfs and, in doing so, used all available inode space of a file.
Interestingly, once this happens, even renaming such a file to a name of _equal length_
fails:
arnd@kallisto:/mnt/btrfs/tmp$ mv a b
mv: cannot move `a' to `b': Too many links
Is this expected behavior?
There should be no reason to let this particular case fail?
Best regards,
Arnd
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: hard links
2012-04-04 16:38 hard links Arnd Hannemann
@ 2012-04-04 19:33 ` Shyam Prasad N
2012-04-04 19:39 ` Arnd Hannemann
0 siblings, 1 reply; 9+ messages in thread
From: Shyam Prasad N @ 2012-04-04 19:33 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: linux-btrfs
On 04/04/2012 10:08 PM, Arnd Hannemann wrote:
> Hi,
>
> Today I experimented with hard links on btrfs and, in doing so, used all available inode space of a file.
> Interestingly, once this happens, even renaming such a file to a name of _equal length_
> fails:
>
> arnd@kallisto:/mnt/btrfs/tmp$ mv a b
> mv: cannot move `a' to `b': Too many links
>
> Is this expected behavior?
> There should be no reason to let this particular case fail?
>
> Best regards,
> Arnd
Hi Arnd,
What do you mean by 'used all available inode space'? What did you do
exactly?
Thanks,
Shyam
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: hard links
2012-04-04 19:33 ` Shyam Prasad N
@ 2012-04-04 19:39 ` Arnd Hannemann
2012-04-04 19:53 ` Hugo Mills
0 siblings, 1 reply; 9+ messages in thread
From: Arnd Hannemann @ 2012-04-04 19:39 UTC (permalink / raw)
To: nspmangalore; +Cc: linux-btrfs
Hi Shyam,
On 04.04.2012 21:33, Shyam Prasad N wrote:
> On 04/04/2012 10:08 PM, Arnd Hannemann wrote:
>> Hi,
>>
>> Today I experimented with hard links on btrfs and, in doing so, used all available inode space of a file.
>> Interestingly, once this happens, even renaming such a file to a name of _equal length_
>> fails:
>>
>> arnd@kallisto:/mnt/btrfs/tmp$ mv a b
>> mv: cannot move `a' to `b': Too many links
>>
>> Is this expected behavior?
>> There should be no reason to let this particular case fail?
>>
> What do you mean by 'used all available inode space'? What did you do exactly?
I created hard links to a file in the same directory until no additional one
could be created.
E.g.:
touch a
for i in {1..1000}; do ln a $i; done;
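For reference, the steps above can be wrapped into a small script. This is only a sketch: TESTDIR and MAX_LINKS are hypothetical names introduced here, and the script has to run on the filesystem under test to show the failure. On a filesystem without this limit, all links and the rename simply succeed.

```shell
#!/usr/bin/env bash
# Sketch: create hard links to one file in one directory until ln fails,
# then attempt the same-length rename from the report.
# Assumption: TESTDIR points at a directory on the filesystem under test
# (the report used /mnt/btrfs/tmp); it defaults to a fresh temp directory.
TESTDIR=${TESTDIR:-$(mktemp -d)}
MAX_LINKS=${MAX_LINKS:-1000}
cd "$TESTDIR" || exit 1

touch a
created=0
for i in $(seq 1 "$MAX_LINKS"); do
    # Stop at the first failure ("Too many links" on an affected btrfs).
    ln a "$i" 2>/dev/null || break
    created=$((created + 1))
done
echo "created $created extra links"

# The rename from the report: old and new names have the same length.
if mv a b 2>/dev/null; then
    rename_status=ok
else
    rename_status=failed
fi
echo "rename a -> b: $rename_status"
```

Setting TESTDIR=/mnt/btrfs/tmp before running would match the path used in the report.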
Best regards
Arnd
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: hard links
2012-04-04 19:39 ` Arnd Hannemann
@ 2012-04-04 19:53 ` Hugo Mills
2012-04-04 20:12 ` Arnd Hannemann
0 siblings, 1 reply; 9+ messages in thread
From: Hugo Mills @ 2012-04-04 19:53 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: nspmangalore, linux-btrfs
On Wed, Apr 04, 2012 at 09:39:39PM +0200, Arnd Hannemann wrote:
> On 04.04.2012 21:33, Shyam Prasad N wrote:
> > On 04/04/2012 10:08 PM, Arnd Hannemann wrote:
> >> Hi,
> >>
> >> Today I experimented with hard links on btrfs and, in doing so, used all available inode space of a file.
> >> Interestingly, once this happens, even renaming such a file to a name of _equal length_
> >> fails:
> >>
> >> arnd@kallisto:/mnt/btrfs/tmp$ mv a b
> >> mv: cannot move `a' to `b': Too many links
> >>
> >> Is this expected behavior?
> >> There should be no reason to let this particular case fail?
> >>
>
> > What do you mean by 'used all available inode space'? What did you do exactly?
>
There's no inode limit specifically. The limit is on the number of
hardlinks to the same file stored in the same directory (and it's very
small). This is a known limitation of btrfs. Someone's working on a
fix (can't remember who, off-hand), but it's not been published yet.
> I created hard links to a file in the same directory until no additional one
> could be created.
>
> E.g.:
> touch a
> for i in {1..1000}; do ln a $i; done;
This is a little odd, but I don't really know what the internals of
mv try to do...
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Nothing right in my left brain. Nothing left in ---
my right brain.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: hard links
2012-04-04 19:53 ` Hugo Mills
@ 2012-04-04 20:12 ` Arnd Hannemann
2012-04-04 20:57 ` Zach Brown
0 siblings, 1 reply; 9+ messages in thread
From: Arnd Hannemann @ 2012-04-04 20:12 UTC (permalink / raw)
To: Hugo Mills, nspmangalore, linux-btrfs
Hi Hugo,
On 04.04.2012 21:53, Hugo Mills wrote:
> On Wed, Apr 04, 2012 at 09:39:39PM +0200, Arnd Hannemann wrote:
>> On 04.04.2012 21:33, Shyam Prasad N wrote:
>>> On 04/04/2012 10:08 PM, Arnd Hannemann wrote:
>>>> Hi,
>>>>
>>>> Today I experimented with hard links on btrfs and, in doing so, used all available inode space of a file.
>>>> Interestingly, once this happens, even renaming such a file to a name of _equal length_
>>>> fails:
>>>>
>>>> arnd@kallisto:/mnt/btrfs/tmp$ mv a b
>>>> mv: cannot move `a' to `b': Too many links
>>>>
>>>> Is this expected behavior?
>>>> There should be no reason to let this particular case fail?
>>>>
>>
>>> What do you mean by 'used all available inode space'? What did you do exactly?
>>
>
> There's no inode limit specifically. The limit is on the number of
> hardlinks to the same file stored in the same directory (and it's very
> small). This is a known limitation of btrfs. Someone's working on a
> fix (can't remember who, off-hand), but it's not been published yet.
Sorry, maybe I was unclear. I didn't mean to say that I used up all inodes.
I meant that I filled up the space of one particular inode.
My understanding is that the number of hardlinks to the same file stored in
the same directory is limited because the names of the hardlinks are stored
within the same inode. As such, the number of hardlinks is naturally limited
by the size of the inode (and depends on the length of the filenames). Correct?
It's not a big deal, but with my original posting I just tried to point
out that btrfs fails an operation while the above constraint is not violated.
The size needed to store the filename "a" is exactly the same as the size needed
to store the filename "b".
Therefore, I would assume the operation mv "a" "b" to just work.
I hope this clarifies my point.
Best regards
Arnd
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: hard links
2012-04-04 20:12 ` Arnd Hannemann
@ 2012-04-04 20:57 ` Zach Brown
0 siblings, 0 replies; 9+ messages in thread
From: Zach Brown @ 2012-04-04 20:57 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Hugo Mills, nspmangalore, linux-btrfs
> My understanding is that the number of hardlinks to the same file stored in
> the same directory is limited because the names of the hardlinks are stored
> within the same inode. As such, the number of hardlinks is naturally limited
> by the size of the inode (and depends on the length of the filenames). Correct?
Correct enough :). Yes, there is a limited amount of space associated
with an inode that tracks hard links and the name of the link consumes
that space.
The details are a little different. The backrefs to a given inode are
stored in one item in the tree whose size is limited by the block size
of the leaves in the tree.
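A rough back-of-the-envelope estimate of what that limit means in practice can be sketched as follows. Every constant here is an assumption introduced for illustration (a 4096-byte leaf, a 101-byte leaf header, a 25-byte item header, and a 10-byte header per backref); none of these figures is stated in this thread, and the real limit depends on the filesystem's actual leaf size and name lengths.

```shell
# Rough estimate only; all constants are assumptions, not figures from this thread.
leaf_size=4096      # assumed leaf (block) size in bytes
leaf_header=101     # assumed size of the leaf header
item_header=25      # assumed size of one item header
ref_header=10       # assumed per-backref header (index + name length)
name_len=1          # length of each link name in bytes, e.g. "a"

# Space left for a single item, then how many backrefs of this size fit in it.
item_max=$((leaf_size - leaf_header - item_header))
max_links=$((item_max / (ref_header + name_len)))
echo "max item payload: $item_max bytes"
echo "approx. links with ${name_len}-byte names: $max_links"
```

Under these assumptions, longer names lower the count, since each backref costs the fixed header plus the name itself.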
> It's not a big deal, but with my original posting I just tried to
> point out that btrfs fails an operation while the above constraint is
> not violated. The size needed to store the filename "a" is exactly
> the same as the size needed to store the filename "b". Therefore, I
> would assume the operation mv "a" "b" to just work.
Huh, indeed.
I'd hope that the current behaviour could be fixed. In flipping through
the code it certainly looks like it adds the new backref for the new
name before it unlinks the old name and removes the old backref.
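The effect of that ordering can be illustrated with a toy model. This is not btrfs code: CAPACITY, OVERHEAD, add_ref and remove_ref are hypothetical names, and the model only tracks how many bytes of one fixed-size item are in use.

```shell
# Toy model, not btrfs code: one fixed-size item holds all backrefs.
# CAPACITY and the 10-byte per-entry OVERHEAD are hypothetical values.
CAPACITY=44
OVERHEAD=10
used=0

add_ref() {      # $1 = name; fails when the item would overflow
    local need=$((OVERHEAD + ${#1}))
    [ $((used + need)) -le "$CAPACITY" ] || return 1
    used=$((used + need))
}
remove_ref() {   # $1 = name; frees the space that name occupied
    used=$((used - OVERHEAD - ${#1}))
}

# Fill the item completely: four 1-byte names at 11 bytes each = 44 bytes.
for n in a 1 2 3; do add_ref "$n"; done

# Order described above: add the new name "b" first, then drop "a".
if add_ref b; then remove_ref a; add_first=ok; else add_first=EMLINK; fi

# Alternative order: drop "a" first, then add "b" into the freed space.
remove_ref a
if add_ref b; then remove_first=ok; else remove_first=EMLINK; fi

echo "add-then-remove: $add_first, remove-then-add: $remove_first"
```

In this model the item is full, so adding "b" before removing "a" overflows even though the final state would fit. A real filesystem would also have to keep the rename atomic if the order were changed, which the toy model ignores.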
- z
^ permalink raw reply [flat|nested] 9+ messages in thread
* Slides from Ceph talk at linux.conf.au?
@ 2010-02-02 21:05 Craig Dunwoody
2010-02-02 21:43 ` Sage Weil
0 siblings, 1 reply; 9+ messages in thread
From: Craig Dunwoody @ 2010-02-02 21:05 UTC (permalink / raw)
To: ceph-devel; +Cc: cdunwoody
Hello Sage,
Could you make available the slides from your recent Ceph talk at
linux.conf.au? I'd be very interested to see them, and I expect that
others here would as well.
Craig Dunwoody
GraphStream Incorporated
------------------------------------------------------------------------------
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Slides from Ceph talk at linux.conf.au?
2010-02-02 21:05 Slides from Ceph talk at linux.conf.au? Craig Dunwoody
@ 2010-02-02 21:43 ` Sage Weil
2010-02-05 0:17 ` Hard links Chris Dunlop
0 siblings, 1 reply; 9+ messages in thread
From: Sage Weil @ 2010-02-02 21:43 UTC (permalink / raw)
To: Craig Dunwoody; +Cc: ceph-devel
Hi Craig,
On Tue, 2 Feb 2010, Craig Dunwoody wrote:
> Hello Sage,
>
> Could you make available the slides from your recent Ceph talk at
> linux.conf.au? I'd be very interested to see them, and I expect that
> others here would as well.
I've posted the openoffice presentation at
http://ceph.newdream.net/presentations/
There's also a PDF version, although some slides may look weird due to the
animated figures.
I think LCA should have the slides (and video) up soon for all the talks
as well.
Enjoy!
sage
^ permalink raw reply [flat|nested] 9+ messages in thread
* Hard links
2010-02-02 21:43 ` Sage Weil
@ 2010-02-05 0:17 ` Chris Dunlop
2010-02-05 20:34 ` Sage Weil
0 siblings, 1 reply; 9+ messages in thread
From: Chris Dunlop @ 2010-02-05 0:17 UTC (permalink / raw)
To: ceph-devel
G'day Sage,
Sage Weil <sage <at> newdream.net> writes:
> I've posted the openoffice presentation at
>
> http://ceph.newdream.net/presentations/
The last slide (39) mentions "Hard links are rare!".
This isn't necessarily true in a backup system where each
snapshot hard links to the previous snapshot for files that
haven't changed, e.g. an 'rsnapshot' installation.
For some hard numbers, one of our server backups has 72316 files
in yesterday's set, with only 194 not hard linked, and 62141
have 76 hard links (there are currently 76 days of backups for
this server). This is one of 66 servers being backed up to this
one 4.5 TB storage pool.
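To put those figures in proportion, the arithmetic can be worked through as below. It uses only the numbers quoted above for this one server; the variable names are introduced here for illustration.

```shell
# Worked arithmetic on the figures quoted above (one server's backup set).
total_files=72316
not_linked=194
files_with_76=62141
days=76

linked=$((total_files - not_linked))
# Share of hard-linked files in basis points (1/100 of a percent), integer math.
bp=$((linked * 10000 / total_files))
# Directory entries (names) that point at the files with 76 links each.
names=$((files_with_76 * days))

printf 'hard-linked files: %d of %d (%d.%02d%%)\n' \
    "$linked" "$total_files" $((bp / 100)) $((bp % 100))
printf 'names referring to 76-link files: %d\n' "$names"
```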
Does the "hard links are rare" assertion imply that ceph may
have some issues (e.g. hard limits or performance) handling very
large numbers of hard links?
E.g. I see that hard links are mentioned in your Dec 2007
dissertation on ceph, along with the use of an anchor table
which is "managed by a single MDS". Might this be an issue
(e.g. a 'hot spot') for situations with a large number of hard
links such as that described above?
Cheers,
Chris
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hard links
2010-02-05 0:17 ` Hard links Chris Dunlop
@ 2010-02-05 20:34 ` Sage Weil
2010-02-07 1:09 ` Chris Dunlop
0 siblings, 1 reply; 9+ messages in thread
From: Sage Weil @ 2010-02-05 20:34 UTC (permalink / raw)
To: Chris Dunlop; +Cc: ceph-devel
On Fri, 5 Feb 2010, Chris Dunlop wrote:
> G'day Sage,
>
> Sage Weil <sage <at> newdream.net> writes:
> > I've posted the openoffice presentation at
> >
> > http://ceph.newdream.net/presentations/
>
> The last slide (39) mentions "Hard links are rare!".
>
> This isn't necessarily true in a backup system where each
> snapshot hard links to the previous snapshot for files that
> haven't changed, e.g. an 'rsnapshot' installation.
>
> For some hard numbers, one of our server backups has 72316 files
> in yesterday's set, with only 194 not hard linked, and 62141
> have 76 hard links (there are currently 76 days of backups for
> this server). This is one of 66 servers being backed up to this
> one 4.5 TB storage pool.
>
> Does the "hard links are rare" assertion imply that ceph may
> have some issues (e.g. hard limits or performance) handling very
> large numbers of hard links?
>
> E.g. I see that hard links are mentioned in your Dec 2007
> dissertation on ceph, along with the use of an anchor table
> which is "managed by a single MDS". Might this be an issue
> (e.g. a 'hot spot') for situations with a large number of hard
> links such as that described above?
Yes and no. The performance impact of hard links is low for the
common backup scenario, but the anchor table scaling has not been
addressed (it's still a single MDS).
What slide 39 doesn't include is a description of the figure. One of the
most common use scenarios of hard links is what I called 'parallel' links,
where many files in one directory are all hard linked to parallel files in
a different directory, which is exactly what you see with cp -al or
rsnapshot. In that case, the cost of doing a lookup in the anchor
table is amortized over the whole directory.
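A small local demonstration of what such 'parallel' links look like on disk is sketched below. It assumes GNU cp and GNU stat, uses hypothetical directory names (day1, day2), and shows only the link pattern on an ordinary local filesystem; it says nothing about how Ceph stores or looks up these links.

```shell
# Sketch: build a small tree, then make a "parallel" hard-linked copy of it.
# Assumes GNU cp (for -a and -l) and GNU stat; directory names are hypothetical.
work=$(mktemp -d)
mkdir "$work/day1"
for f in one two three; do echo "$f" > "$work/day1/$f"; done

# Every file in day2 becomes a hard link to the same-named file in day1.
cp -al "$work/day1" "$work/day2"

# Link count and inode number for each pair of names.
stat -c '%h links, inode %i: %n' "$work"/day1/* "$work"/day2/*
links_one=$(stat -c %h "$work/day2/one")
```

Each file then shows a link count of 2, and the same inode number appears once under each directory.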
The anchor table is still maintained by a single MDS, though, and it's all
in RAM at once, so it will be a scaling problem if the fs has a lot of
hard links. That just needs some design attention at some point.
sage
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hard links
2010-02-05 20:34 ` Sage Weil
@ 2010-02-07 1:09 ` Chris Dunlop
0 siblings, 0 replies; 9+ messages in thread
From: Chris Dunlop @ 2010-02-07 1:09 UTC (permalink / raw)
To: ceph-devel
Sage Weil <sage <at> newdream.net> writes:
> On Fri, 5 Feb 2010, Chris Dunlop wrote:
>> Does the "hard links are rare" assertion imply that ceph may
>> have some issues (e.g. hard limits or performance) handling very
>> large numbers of hard links?
>>
>> E.g. I see that hard links are mentioned in your Dec 2007
>> dissertation on ceph, along with the use of an anchor table
>> which is "managed by a single MDS". Might this be an issue
>> (e.g. a 'hot spot') for situations with a large number of hard
>> links such as that described above?
>
> Yes and no. The performance impact of hard links is low for the
> common backup scenario, but the anchor table scaling has not been
> addressed (it's still a single MDS).
>
> What slide 39 doesn't include is a description of the figure. One of the
> most common use scenarios of hard links is what I called 'parallel' links,
> where many files in one directory are all hard linked to parallel files in
> a different directory, which is exactly what you see with cp -al or
> rsnapshot. In that case, the cost of doing a lookup in the anchor
> table is amortized over the whole directory.
>
> The anchor table is still maintained by a single MDS, though, and it's all
> in RAM at once, so it will be a scaling problem if the fs has a lot of
> hard links. That just needs some design attention at some point.
"Just a small matter of designing and coding..." :-)
With the current design of the anchor table maintained by a
single MDS, does this have an impact on the resiliency of the
system? Specifically, what happens if that one MDS becomes
unavailable?
How much memory does the anchor table require? E.g. for a
million, or 100 million hard links?
Chris
^ permalink raw reply [flat|nested] 9+ messages in thread