From mboxrd@z Thu Jan  1 00:00:00 1970
From: Peter Teoh <htmldeveloper@gmail.com>
Subject: Re: "Write once only but read many" filesystem
Date: Mon, 24 Mar 2008 14:45:28 +0800
Message-ID: <47E74E08.4000100@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1;
	format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
To: linux-fsdevel@vger.kernel.org
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from hs-out-0708.google.com ([64.233.178.240]:12564 "EHLO
	hs-out-0708.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753651AbYCXGeB (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Mon, 24 Mar 2008 02:34:01 -0400
Received: by hs-out-0708.google.com with SMTP id 4so2092977hsl.5
        for <linux-fsdevel@vger.kernel.org>; Sun, 23 Mar 2008 23:34:00 -0700 (PDT)
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

  Scott Lovenberg wrote:
> Jörn Engel wrote:
>> On Sat, 22 March 2008 23:55:53 +0800, Peter Teoh wrote:
>>>>   Or do you want individual files/directories to be immutable -
>>>> chattr?
>>> chattr is not good enough, as root can still modify it.   So if
>>> current feature is not there, then some small development may be
>>> needed.
>>>
>>>>  And in either case, what problem do you want to solve with a
>>>> read-only filesystem?
>>> Simple:   i want to record down everything that a user does, or a
>>> database does, or any applications running - just record down its
>>> state permanently securely into the filesystem, knowing that for sure,
>>> there is not way to modify the data, short of recreating the
>>> filesystem again.    Sound logical?   Or is there any loophole in this
>>> concept?
>>
>> The loophole is called root.  In a normal setup, root can do anything,
>> including writing directly to the device your filesystem resides in,
>> writing to kernel memory, etc.
>>
>> It may be rather inconvenient to change a filesystem by writing to the
>> block device, but far from impossible.  If you want to make such changes
>> impossible, you are facing an uphill battle that I personally don't care
>> about.  And if inconvenience is good enough, wouldn't chattr be
>> sufficiently inconvenient?
>>
>> Jörn
>>
>
> How about mounting an isofs via loopback?  This has the added benefit
> of being ready to be exported to disc.  You can make it with mkisofs
> on a directory structure and mount it to the tree with a normal
> mount(1).  If it asks for fs type on mount, I think its 'iso9660'.
>
>
Thanks for the idea.   Based on it, I will start looking at the
implementation of isofs and how it is made read-only.   My ultimate aim
is to make the content read-only once it has been written and the file
closed.   Not sure if it can be done, but I envisage that a lot of audit
journalling is of this nature.   Of course, it is always possible to
"dd" the filesystem to modify the content, but if a protection mechanism
is designed in (like an incremental checksum: the current checksum is
based on the previous checksum, and is generated and stored together
with the file after every write of data), we can still verify its
integrity.   The aim is to make it read-only: anyone can read it, but
nobody can modify it.   As a start I will try patching ext2, hoping
that it is much simpler than ext3/ext4.
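
To make the "read-only once written and closed" behaviour concrete,
here is a rough userspace approximation in Python.   It is only a
sketch of the policy, not the kernel-level enforcement I am after: the
helper name write_once() is made up for illustration, and the file
owner or root can still chmod the file back, which is the same weakness
as chattr.

```python
import os


def write_once(path: str, data: bytes) -> None:
    """Create `path`, write `data`, and drop all write bits after close.

    O_EXCL makes the create fail if the file already exists, so this
    helper never reopens an existing record for writing.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before sealing
    # The file is closed here; from now on it is read-only for everyone.
    os.chmod(path, 0o444)
```

A second write_once() call on the same path raises FileExistsError
instead of overwriting the record.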

Any change to the content (via dd) invalidates the incremental checksum
that immediately follows it.   Similarly, if you patch the current
checksum (which is a hash of the previous data and the previous
checksum), the result will still not be valid, because every later
checksum also depends on the history of earlier checksums.   So every
time you change the content via dd, you will need to recompute the next
checksum, which is calculated from this data and the current checksum,
and which in turn provides the seed for the checksum after it, and so
on.   You will have to modify a lot of data unless you are near the tip
of the latest modification.   The smaller the chunk of data per
checksum, the more difficult it is to keep up with the rate of
modification.   The tradeoff is more work for the CPU.
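
A small Python sketch of the incremental checksum may make this
clearer.   The choice of SHA-256, the all-zero starting seed and the
function names are my assumptions for illustration only; nothing here
is tied to ext2.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed fixed seed for the very first checksum


def chain(chunks, seed=GENESIS):
    """Return one checksum per chunk; each one covers the previous checksum."""
    sums, prev = [], seed
    for chunk in chunks:
        prev = hashlib.sha256(prev + chunk).digest()
        sums.append(prev)
    return sums


def links_to_recompute(old_sums, new_sums):
    """Count the stored checksums that no longer match after a change."""
    return sum(a != b for a, b in zip(old_sums, new_sums))


chunks = [b"rec0", b"rec1", b"rec2", b"rec3", b"rec4"]
original = chain(chunks)

tampered = list(chunks)
tampered[1] = b"REC1"  # simulate a dd-style edit of the second chunk
changed = links_to_recompute(original, chain(tampered))
print(changed)  # 4: the edited chunk's checksum and every later one
```

Editing the last chunk instead changes only one checksum, which is the
"near the tip" case.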

The userspace tool will then always validate the checksum against the
data being read; if the data has been modified, the checksum will not
be valid.   Since it is a hash function, it is not feasible to come up
with modified data that, together with the previous checksum, still
gives the current checksum.
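
The validation step in the userspace tool could look roughly like the
following Python sketch.   As before, SHA-256, the SEED value and the
name verify_chain() are assumptions made for the example.

```python
import hashlib
import hmac

SEED = b"\x00" * 32  # assumed starting value for the first checksum


def verify_chain(chunks, stored_sums, seed=SEED):
    """Return the index of the first chunk that fails validation, or None."""
    if len(chunks) != len(stored_sums):
        # A missing or extra entry is reported at the first unmatched index.
        return min(len(chunks), len(stored_sums))
    prev = seed
    for i, (chunk, stored) in enumerate(zip(chunks, stored_sums)):
        expected = hashlib.sha256(prev + chunk).digest()
        if not hmac.compare_digest(expected, stored):
            return i
        prev = expected
    return None
```

Note that a plain hash has no secret in it, so whoever can rewrite both
the data and all of the later stored checksums can produce a chain that
validates again.   The check only helps as long as the stored checksums
(or at least the latest one) are out of the intruder's reach.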

Of course, if the intruder is root, then all these complex calculations
are as good as useless, so our assumption is that the machine is not
compromised yet.

Then of course "chattr" can be used as well, as long as the machine is
not compromised.   True, but with chattr the file can still be modified
any number of times, which is exactly what write-once-read-many is
meant to prevent: it should provide the assurance that the data has not
been overwritten a second time (technically possible, but very, very
difficult).

An equivalent requirement for such a filesystem would be: a filesystem
in which every change made is recorded in a log together with its date.
The overhead of this feature is high, so a lightweight version would
record the date/history only when the data is first written.

On an existing ext2 filesystem, once the "worm" kernel module is
loaded, the feature takes effect immediately: existing content becomes
read-only and cannot be modified, while new content can still be
written.   Old content may or may not have a checksum protecting it; if
a checksum exists, it comes from the last time "worm" was running and
generating checksums.

So you have the best of both worlds: ext2/3/4 with or without worm.   If
worm is enabled, it may also be configured at the directory level, so
that some directories can be dedicated solely to worm journalling.

What do you guys think - any conceptual errors?


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html