linux-btrfs.vger.kernel.org archive mirror
From: Sami Liedes <sliedes@cc.hut.fi>
To: linux-btrfs@vger.kernel.org
Subject: Re: Number of hard links limit
Date: Fri, 6 Aug 2010 14:30:39 +0300
Message-ID: <20100806113039.GA18858@lh.kyla.fi>
In-Reply-To: <0362zshcyx.fsf@msgid.viggen.net>

On Tue, Aug 03, 2010 at 12:22:14AM +0200, Oystein Viggen wrote:
> IIRC, the limit on hard links is per directory.  That is, if you put
> each hard link into its own directory, there's basically no limit to the
> amount of hard links you can make to one file.

Yes, that's always pointed out in these threads. Still, it seems to
break real use cases beyond BackupPC as well (someone mentioned
installing some Gentoo package).
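
A minimal sketch for probing the per-directory behaviour described in
the quote above. It assumes the filesystem reports EMLINK from link(2)
when it refuses another hard link; the function name, the link naming
scheme, and the max_links cap are made up for illustration.

```python
import errno
import os


def count_links_until_limit(target, directory, max_links):
    """Create up to max_links hard links to target inside directory.

    Returns the number of links created before the filesystem refused
    with EMLINK, or max_links if no limit was hit within the cap.
    """
    os.makedirs(directory, exist_ok=True)
    for i in range(max_links):
        link_path = os.path.join(directory, f"link{i:06d}")
        try:
            os.link(target, link_path)
        except OSError as exc:
            if exc.errno == errno.EMLINK:
                return i  # the filesystem refused link number i
            raise  # any other error is unexpected here
    return max_links
```

Running it once with all links in a single directory and once with one
link per directory should show whether the limit really is per
directory on a given filesystem.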

> Thus, many generations of backup with BackupPC shouldn't trigger the
> problem, as each generation is stored in its own directory tree.  The
> problem appears when your source data has many identical files in the
> same directory, since these would be deduplicated as hard links to the
> same file in the backup pool.

I think you are right. I looked at the errors I reported in

  https://bugzilla.kernel.org/show_bug.cgi?id=15762

and figured out what happens there. In dbus's source tree, under
doc/api/man/man3dbus, there are 119 files with the content ".so
man3dbus/DBusProtocol.3dbus". I think these are redirects to a single
man page.
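
To find other directories likely to hit the same problem, one could
look for directories that hold many files with identical content,
since a deduplicating backup would turn those into hard links to one
pool file. A rough sketch follows; the function names and the
threshold parameter are my own, and it reads every file fully into
memory, which is only reasonable for small files like these.

```python
import hashlib
import os
from collections import Counter


def largest_duplicate_group(directory):
    """Return (count, digest) for the most common file content in one directory."""
    counts = Counter()
    with os.scandir(directory) as entries:
        for entry in entries:
            # Only regular files; symlinks are skipped.
            if entry.is_file(follow_symlinks=False):
                with open(entry.path, "rb") as fh:
                    counts[hashlib.sha256(fh.read()).hexdigest()] += 1
    if not counts:
        return 0, None
    digest, count = counts.most_common(1)[0]
    return count, digest


def directories_over_threshold(root, threshold):
    """Yield (directory, count) where one content occurs more than threshold times."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        count, _digest = largest_duplicate_group(dirpath)
        if count > threshold:
            yield dirpath, count
```

On the dbus tree, a call like directories_over_threshold("dbus", 100)
should report doc/api/man/man3dbus if the count above is right.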

Interestingly (and only somewhat off-topic), because BackupPC
deduplicates identical files, it can also be used to explore the
merits of deduplication in filesystems. I looked at the files I have
the most copies of. Many of them seem perhaps a bit silly.

The most common file has a single line with the text "# dummy". I have
54991 copies of that file. They are some kind of dependency files
(perhaps generated by libtool or something similar?), always in a
directory named .deps and with a file extension of .Po or .Plo.

There are also tens of thousands of copies of a file with a single
line with the text "END". Subversion VCS seems to have two of these
for every file under version control. In fact a large portion of the
most common files seem to be owned by Subversion.
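
For anyone who wants to repeat this on their own pool, here is a rough
sketch of how the most-linked files can be listed. It assumes that
each deduplicated copy is a hard link to a pool file, so that st_nlink
approximates the number of copies; the function name and the pool_root
and top parameters are made up for illustration.

```python
import heapq
import os
import stat


def most_linked_files(pool_root, top=10):
    """Return up to `top` (link_count, path) pairs, highest link count first."""
    seen = set()      # (device, inode) pairs already counted
    results = []
    for dirpath, _dirnames, filenames in os.walk(pool_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another name for a file we already recorded
            seen.add(key)
            results.append((st.st_nlink, path))
    return heapq.nlargest(top, results)
```

Printing the first line of each reported path is then enough to spot
files like "# dummy" or "END".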

	Sami

Thread overview: 11+ messages
2010-08-02 11:40 Number of hard links limit Sami Liedes
2010-08-02 13:05 ` Xavier Nicollet
2010-08-02 18:31   ` Anthony Roberts
2010-08-02 19:56     ` Michael Niederle
2010-08-02 20:43       ` Roberto Ragusa
2010-08-02 22:22         ` Oystein Viggen
2010-08-06 11:30           ` Sami Liedes [this message]
2010-08-06 13:46             ` Chris Mason
2010-08-08 12:32               ` Roberto Ragusa
2011-10-12 17:34               ` Martin Steigerwald
2011-11-08 21:23                 ` Sami Liedes