public inbox for linux-kernel@vger.kernel.org
From: ebiederm@xmission.com (Eric W. Biederman)
To: Tony Wallace <tony@tony.gen.nz>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>,
	Richard Weinberger <richard.weinberger@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: Suggested new user link command
Date: Sat, 05 May 2018 00:10:10 -0500
Message-ID: <87zi1e7knh.fsf@xmission.com>
In-Reply-To: <fa04b946-c17f-4938-465e-345c76611c89@tony.gen.nz> (Tony Wallace's message of "Wed, 2 May 2018 06:41:59 +1200")

Tony Wallace <tony@tony.gen.nz> writes:

> On 02/05/18 01:35, Bernd Petrovitsch wrote:
>> Hi all!
>>
>> Top-quoting is evil BTW.
>>
>> On Wed, 2018-05-02 at 00:17 +1200, Tony Wallace wrote:
>>> Two issues here:
>>> 1) Use case (which I have)
>>> 2) Permissions
>>>
>>> 1) Use case
>>>
>>> I am trying to build a backup system.  To avoid duplication of files
>>> over multiple backups I take an MD5 checksum of file contents.  Files
>>> with the same sum are hardlinked together.  Files are linked into a
>>> standard directory structure, with a new link for each backup that the
>>> file is part of.  When all backups pointing to a file are deleted the
>>> reference count drops to zero and the file is deleted.  We can keep a
>>> database of checksums and their related inode numbers for linking
>>> purposes.  So why
>> a) You can store one of the filenames instead of the inode number.
>> b) You can keep an extra directory with a hardlink named as the inode 
>>    number (and delete the entries there if the link count drops to 1).
>>
>>> not have some reference copy to link against?  It would take no extra
>>> space.  Well, it doesn't, but it keeps at least one copy of the file on
>> You have (disk) space problems on a backup system?
>> I don't think so, Tim;-)
>>
>>> disk forever and the reference count never drops to zero.  Using one of
>>> the backup copies to link to (as stored as the reference copy in the
>>> database) will not work as it could be deleted at any time.
>>>
>>> I have seen on stack overflow others wanting to do this also.
>> "Do. Or do not. There is no try." - Yoda
>> SCNR .....
>>
>>> 2) Permissions
>>>
>>> To maintain security there are two requirements: 
>>> 2.1) The effective user must have rights to the inode, that is they must
>>> either own it or be root
>>> 2.2) The effective user must have file creation rights to the
>>> directory where it is being linked
>> Obviously (and useful). And on a backup system, there is no problem
>> about that (because the backup software probably runs as root anyways
>> because otherwise 2.1) below will limit the deduplication severely).
>>
>> But for a (to be mainlined/accepted) new syscall, one should think
>> about all situations/use cases and not just one.
>>
>> In addition to the two items above, one also needs x-permissions on
>> *all* directories from / to one existing hardlink in the traditional
>> case and such a syscall bypasses that.
>> Think about it: Everyone can write a program to try to link all inodes
>> from 0 to ~0 to a directory entry and get all files with restrictions
>> 2.1) and 2.2) from below.
>> ATM it is enough to `chmod o= ~` to keep all others from all files in
>> my $HOME. Afterwards it's no longer that easy.
>>
>>> If you say no, that is fine, but I do think this idea has merit and can
>>> be done without compromising the system.
>> I'm no one to say no (or yes;-) here to anything;-) I'm just thinking
>> about the implications.
>>
>> And you can always implement a patch and if it's ignored/not accepted,
>> you can use it locally anyways - no one can stop that:-)
>>
>> One more - more constructive - thing: Perhaps it is more
>> acceptable/useful if there is a mount option which must be activated on
>> the backup filesystems and that is not activated anywhere else.
>>
>> MfG,
>> 	Bernd
>
> I want to thank everyone for their time. I have taken note of your
> comments.  I believe that there is the need for a companion command
> istat that obtains the stat data from an inode.  Istat may be useful in
> constructing ilink.  For my proposed use case complexity is minimised,
> and effectiveness is maximised by making both istat and ilink root only
> system calls, and then doing my backup as root.  I do not know how a
> mount option would work, and for my own use it is again probably
> unnecessary complexity, but accept it may be necessary if released more
> generally.
>
> I will be dropping the matter now, at least until I have some code to
> show, but if anyone has any more thoughts feel free to drop me an
> email. 

Actually the functionality you are looking for has in some sense already
been implemented, and in a way that does not assume a strictly POSIX
filesystem.

The system calls are:
name_to_handle_at
open_by_handle_at
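
For illustration, here is a minimal sketch of how the two calls fit
together, assuming glibc on Linux.  The helper names get_handle() and
open_handle() are hypothetical and exist only for this example.  The
handle is an opaque blob that can be stored (for instance next to a
checksum) and later turned back into a file descriptor.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper: ask the kernel for an opaque handle that names
 * `path`.  Returns a malloc'd handle (caller frees) or NULL with errno
 * set.  *mount_id receives the ID of the mount containing the file.
 */
struct file_handle *get_handle(const char *path, int *mount_id)
{
    assert(path != NULL && mount_id != NULL);

    /* Reserve room for the largest handle the kernel will produce. */
    struct file_handle *fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
    if (fh == NULL)
        return NULL;
    fh->handle_bytes = MAX_HANDLE_SZ;

    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) == -1) {
        int saved = errno;      /* free() may clobber errno */
        free(fh);
        errno = saved;
        return NULL;
    }
    return fh;
}

/*
 * Hypothetical helper: reopen the file a handle refers to.  mount_fd is
 * any open descriptor on the same filesystem.  The caller needs
 * CAP_DAC_READ_SEARCH; a handle whose file is gone fails with ESTALE.
 */
int open_handle(int mount_fd, struct file_handle *fh)
{
    return open_by_handle_at(mount_fd, fh, O_RDONLY);
}
```

A backup tool running as root could call get_handle() once per unique
file and store the bytes in fh->f_handle together with handle_type,
then call open_handle() later to get a descriptor without knowing any
of the file's current names.  Not every filesystem supports file
handles, so the EOPNOTSUPP case needs handling.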

Good luck,
Eric
