From: Jamie Lokier <jamie@shareable.org>
To: Pavel Machek <pavel@ucw.cz>
Cc: "Jörn Engel" <joern@wohnheim.fh-wedel.de>,
mj@ucw.cz, jack@ucw.cz,
"Patrick J. LoPresti" <patl@users.sourceforge.net>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cowlinks v2
Date: Fri, 2 Apr 2004 21:09:21 +0100
Message-ID: <20040402200921.GC653@mail.shareable.org>
In-Reply-To: <20040402182357.GB410@elf.ucw.cz>
Pavel Machek wrote:
> Okay, now I have to start talking about implementation. Assume ext2 as
> a base. There's a new object "cowid" which contains, well, id for
> get_data_id() and usage count. Each inode either has pointer to
> "cowid" object, or it is plain old regular file.
Pavel has it exactly right.
A simple way to store COWID objects in the filesystem itself is as
another ordinary inode. The attributes of that inode (mtime, mode
etc.) aren't important (except to fsck), only the size and data
pointers are important. The files which point to a COWID need a flag
to indicate that, too.
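
The layout described above can be sketched in C. This is only an illustration under stated assumptions: the struct, its field names, the flag value, and the helper functions are all hypothetical and do not correspond to ext2's on-disk format or to any posted patch. It models a file inode that either stands alone or carries a flag plus a pointer to a COWID inode holding the usage count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit: this inode's data lives in a COWID inode. */
#define SKETCH_COWLINK_FL 0x1u

/* Simplified in-memory inode; field names are illustrative, not ext2's. */
struct sketch_inode {
    uint32_t flags;       /* SKETCH_COWLINK_FL if the data is shared */
    uint64_t size;        /* length visible through this inode */
    uint32_t cowid_ino;   /* inode number of the COWID, valid if flag set */
    uint32_t usage_count; /* only meaningful on the COWID inode itself */
};

static bool is_cowlink(const struct sketch_inode *ino)
{
    return (ino->flags & SKETCH_COWLINK_FL) != 0;
}

/* Point a file at a COWID and take a reference on it. */
static void cowlink_attach(struct sketch_inode *file,
                           struct sketch_inode *cowid, uint32_t cowid_ino)
{
    file->flags |= SKETCH_COWLINK_FL;
    file->cowid_ino = cowid_ino;
    file->size = cowid->size;
    cowid->usage_count++;
}

/* Drop the reference, e.g. before the first write to the file.
 * A real implementation would copy the data blocks first. */
static void cowlink_detach(struct sketch_inode *file,
                           struct sketch_inode *cowid)
{
    if (!is_cowlink(file))
        return;
    file->flags &= ~SKETCH_COWLINK_FL;
    file->cowid_ino = 0;
    cowid->usage_count--;
}

/* In this sketch the data id is the COWID inode number; 0 means the
 * file is a plain regular file with no cowlink peers. */
static uint32_t sketch_get_data_id(const struct sketch_inode *ino)
{
    return is_cowlink(ino) ? ino->cowid_ino : 0;
}
```

Two files attached to the same COWID report the same id from sketch_get_data_id(), and a plain file reports 0.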
Someone said that a problem with using sendfile() to create cowlinks
is that sendfile() takes a length parameter. It's a size_t, which on
32-bit systems isn't large enough to copy large files in one call.
Actually that isn't a problem. The COWID inodes contain the real
length of the shared data. It would be possible for the individual
cowlink inodes to have a smaller length, indicating that they don't
have all the data. Then a cp implementation which calls sendfile()
repeatedly could copy large files and still share the data. Provided
the offset and length are compatible with sharing, the first
sendfile() would create the COWID (if it doesn't exist already), and
subsequent ones would simply enlarge the individual cowlink's length
attribute. It's not the cleanest interface; I only mention it because
sendfile() exists already and could be used.
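
A minimal sketch of the cp-style loop described above, assuming a Linux kernel where sendfile() accepts a regular file as the destination. The function name copy_with_sendfile and the chunk size are assumptions made for illustration. On current kernels this performs an ordinary data copy; creating or extending a cowlink as a side effect is the proposal under discussion, not existing behaviour.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Bytes requested per call; 1 GiB fits in a 32-bit size_t. */
#define CHUNK ((size_t)1 << 30)

/* Copy all of in_fd to out_fd using repeated sendfile() calls.
 * Returns 0 on success, -1 on error with errno set. */
static int copy_with_sendfile(int in_fd, int out_fd)
{
    struct stat st;
    off_t offset = 0;

    if (fstat(in_fd, &st) < 0)
        return -1;

    while (offset < st.st_size) {
        off_t left = st.st_size - offset;
        size_t want = left > (off_t)CHUNK ? CHUNK : (size_t)left;

        /* sendfile() advances 'offset' by the number of bytes copied. */
        ssize_t n = sendfile(out_fd, in_fd, &offset, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break; /* source shrank underneath us */
    }
    return 0;
}
```

Each call covers a contiguous range starting where the previous one ended, which is the access pattern the proposal would need in order to keep sharing the data.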
get_data_id() is one way to detect equivalent files. Another would be
a function files_equal(fd1, fd2) which returns a boolean.
get_data_id() has the advantage that it can report immediately whether
a file has _any_ cowlink peers, which is important for programs that
scan trees. Perhaps getxattr() would be a reasonable interface, using a
named attribute "data-id".
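
For illustration, here is how a files_equal()-style check might be built on the xattr idea. This is a sketch under assumptions: no filesystem is known to export such an attribute, the "user." namespace prefix and the 64-byte id limit are guesses, and the helper names are hypothetical. Only fgetxattr() itself is an existing call.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical attribute name; the "user." prefix is an assumption. */
#define DATA_ID_ATTR "user.data-id"
#define DATA_ID_MAX  64

/* Compare two ids as returned by fgetxattr().
 * Returns 1 if equal, 0 if different, -1 if either lookup failed. */
static int data_ids_match(const void *id1, ssize_t n1,
                          const void *id2, ssize_t n2)
{
    if (n1 < 0 || n2 < 0)
        return -1;
    return n1 == n2 && memcmp(id1, id2, (size_t)n1) == 0;
}

/* Returns 1 if both files report the same data id, 0 if the ids
 * differ, and -1 if the answer is unknown (attribute missing or
 * xattrs unsupported). */
static int files_equal_by_data_id(int fd1, int fd2)
{
    unsigned char id1[DATA_ID_MAX], id2[DATA_ID_MAX];
    ssize_t n1 = fgetxattr(fd1, DATA_ID_ATTR, id1, sizeof id1);
    ssize_t n2 = fgetxattr(fd2, DATA_ID_ATTR, id2, sizeof id2);

    return data_ids_match(id1, n1, id2, n2);
}
```

A tree scanner could also treat a failed lookup on a single file as "no cowlink peers" and skip it without comparing against anything else.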
-- Jamie