* The situation at hand and in the future
@ 2004-05-27 20:01 mjt
  2004-05-27 21:05 ` Valdis.Kletnieks
  0 siblings, 1 reply; 41+ messages in thread
From: mjt @ 2004-05-27 20:01 UTC (permalink / raw)
  To: reiserfs-list

Hi

I reinstated the reiser part of my page:
http://mjt.nysv.org/reiser/

The broken package is here:
http://mjt.nysv.org/reiser/bk-2004-05-25/

With the screenshots (courtesy of my Siemens SX1, which decided it won't
crash while taking pictures) and the config on the top level.

It's uploading still and will be doing so for quite a while, but if you
see a tarball there, it's completely uploaded, because I'll have made
the tar :) Or then I'll go to sleep and I will make a tar when needed.
Maybe only vmlinux is enough for debugging.

I'm also doing a new, clean, bk clone/get.

That is not what I figured to mail about though.

I'm afraid I'll reopen a can of worms here. But hopefully compensate
with something that may arouse discussion of practical issues.

I'm sure we all know about Hans' attitude toward xattrs. As I said before,
I don't care which API is in or out as long as it works for me, and I
hope the better one wins.

Now, considering the state of the union, maybe it might be a wise strategic
move to back off a bit from that. The pseudo file API is really near
completion, right? All I don't see are the compression and crypto plugins.

The following paragraphs may be a bit flawed, depending on how much a
difference there should be between the pseudo files API and the syscall
API. Looking at the web page, the ACLs seem to be the biggest issue.

So, the Reiser4 API is good for providing passphrases for encrypted files
and directories, right? Because using echo to unlock them from the file
system is hazardous?

Why not add user-space programs which basically do what echo does, but
with the terminal echo turned off? They would probably size at around
0k and would use the pseudo file API.
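A minimal sketch of such a helper, assuming a hypothetical pseudo file such as plugin/cryptokey (the path and its write semantics are assumptions, not a confirmed Reiser4 interface). It reads the passphrase with terminal echo turned off and writes it to the given file:

```python
import getpass


def write_passphrase(pseudo_file, passphrase):
    """Write the passphrase bytes to the (hypothetical) pseudo file."""
    # Binary mode, so nothing is translated on the way to the file system.
    with open(pseudo_file, "wb") as f:
        f.write(passphrase.encode("utf-8"))


def prompt_and_unlock(pseudo_file):
    """Prompt with terminal echo turned off, then hand the phrase over."""
    passphrase = getpass.getpass("Passphrase: ")
    write_passphrase(pseudo_file, passphrase)
```

Calling prompt_and_unlock("plugin/cryptokey") would then replace the echo command in the use case, without the passphrase showing up on screen or in the shell history.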

Maybe this would be a use case:
1. Joe wants to encrypt the files he l33ch3d from kazaa at work.
2. Joe chdirs to the pseudo directory under ~/Work/Warez/
3. Joe says echo 3DES\0 > plugin/crypto
4. Now the directory knows it must be 3DES-encrypted, but it needs
   a passphrase.
5. Joe says echo TOPSECRETPASSWORD\0 > plugin/cryptokey
6. The file system works its magic

Now every file Joe downloads under ~/Work/Warez/ is automatically
encrypted with the provided phrase.

Joe then decides that his work documents might be good to encrypt as
well.

7. chdir again
8. echo BLOWFISH\0 > plugin/crypto
9. echo THEOTHERPASSWORD\0 > plugin/cryptokey
10. The encryption kicks in and encrypts the directory (contents?)

Does that, btw, encrypt the directory object or it and its children?
With fibration the new children are fibrated as per the parent directory's
settings, right? So this should be the same with every plugin?

~/Work/Warez/ should still be left untouched, no matter what happens,
because it has a different algorithm.

Files which contain passphrases must be write-only and hopefully even
have the passphrase expire on some idle time.

Now Joe needs to access the files as well.
This is where it gets trickier...

No matter what happens, the software accessing encrypted data must support
some api.

11a. echo TOPSECRETPASSWORD\0 > plugin/cryptounlock
     This would unlock the directory, naturally the cryptoplugin could not
     be changed to none without the data being unencrypted to boot.

11b. echo PID-TOPSECRETPASSWORD\0 > plugin/cryptounlock
     This would grant access to one certain PID.
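How the two unlock forms might be told apart can be sketched as follows. The "PID-passphrase" format comes from 11b above; the function name and the rule "a numeric prefix before the first dash is a PID" are assumptions made only for illustration:

```python
def parse_unlock_request(data):
    """Split b'PID-PASSPHRASE' into (pid, passphrase).

    A request without a numeric PID prefix applies to everyone: pid is None.
    """
    # Tolerate a trailing NUL or newline, as echo would produce.
    text = data.rstrip(b"\0\n").decode("utf-8")
    prefix, sep, rest = text.partition("-")
    if sep and prefix.isdecimal():
        return int(prefix), rest
    return None, text
```

Note the ambiguity: a passphrase like "123-abc" would be read as PID 123, which is one argument for separate files per setting rather than one combined string.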
     
What happens when Joe is a SuSE or Mandrake user, who just wants to
clickety-click his way through these procedures?

Solution 1: Have a graphical pseudo file manager as a separate project,
            maybe just start it and let someone else finalize and admin it.
Solution 2: Contribute, or have someone else contribute, code to popular
            graphical file managers.

The question arises, is it possible for a pseudo file to know which pid
is accessing it?

What happens if you want to use Apache or some other program that has
close to n+1 processes, maybe specify the PPID?

Also, it might be clearer if the plugin file crypto were a plugin directory
crypto, with files in it like "algorithm", "passphrase", "unlock" or
something?

The same use case would apply with compression algorithms, I suppose?

This did not really get us very far from the can of worms, however.

If all this is doable like this, or done better than my suggestions, from 
the base of meta files, why bother with the Reiser4 syscalls API now?

Seems to me like a waste of resources, financial and human, and time.
I never tried them, but xattrs were in Reiser4, at least to some extent,
so would it be possible to re-integrate them and start working on the
syscalls later on?

I think, if possible with little work, xattrs could be pushed quickly in.
Maybe left there as "the crappy alternative"
Also, I think Chris Mason said the ReiserFS xattrs were implemented as
files? Or have I misunderstood/forgotten? Maybe Reiser4 xattrs could
be implemented as pseudo files? And maybe I should see for myself
some day...

Stuff like sequencing I/Os and file access without descriptors
can't be done with xattrs? That's how I interpret this. So the syscalls
must come.

Will the syscalls, or I wish much more the pseudo files, enable stuff
like retroactive inheritance of parent plugin settings?

All I'm aiming at here is to minimize work and provide some ideas of
how to get quick community acceptance, so if the xattrs advocation
seems like worthless provocation or annoying rambling, it was not my
intent and if I just made an innocent fool of myself, I can live with
that :)

Also, some hints at what the future will bring would be good. Like,
what's the next Big Thing? If the mmap bug and NFS are fixed, the
next logical things to attack would be:
* Re-sustain stability and performance (The loss which I can not estimate
  since the bugger crashes on mount :)
* The pseudo files API, more closely crypto and compression.
* Copy-on-capture
* Reiser4 syscalls

Apart from the social aspect of getting it out to the public, for which
only the first, or first two, are required.

What is Hans' take on those attacks?

Thanks and I hope at least some reaction comes from this mail :)

-- 
mjt


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-27 20:01 The situation at hand and in the future mjt
@ 2004-05-27 21:05 ` Valdis.Kletnieks
  2004-05-27 22:09   ` David Masover
  0 siblings, 1 reply; 41+ messages in thread
From: Valdis.Kletnieks @ 2004-05-27 21:05 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: reiserfs-list

On Thu, 27 May 2004 23:01:27 +0300, mjt@nysv.org (Markus Törnqvist) said:

> So, the Reiser4 API is good for providing passphrases for encrypted files
> and directories, right? Because using echo to unlock them from the file
> system is hazardous?
> 
> Why not add user-space programs which basically do what echo does, but
> with the terminal echo turned off? They would probably size at around
> 0k and would use the pseudo file API.
> 
> Maybe this would be a use case:
> 1. Joe wants to encrypt the files he l33ch3d from kazaa at work.
> 2. Joe chdirs to the pseudo directory under ~/Work/Warez/
> 3. Joe says echo 3DES\0 > plugin/crypto
> 4. Now the directory knows it must be 3DES-encrypted, but it needs
>    a passphrase.
> 5. Joe says echo TOPSECRETPASSWORD\0 > plugin/cryptokey

Remember that English has an entropy of only 2.5 bits/byte or so - as a result,
you *REALLY*, *REALLY* want to use a program that reads a much longer
passphrase and computes a hash to use as the actual key.

See Jari Ruusu's patch to util-linux 2.12 in the loop-AES package
(http://loop-aes.sourceforge.net) for an example of what I mean - it includes
patches to "mount" and "losetup" to do all the passphrase/key handling.
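The arithmetic and the "hash a long passphrase into a key" step can be sketched like this. The 2.5 bits/byte figure is the one quoted above; PBKDF2-HMAC-SHA256 is a stand-in chosen for illustration and is not claimed to be what loop-AES does:

```python
import hashlib
import math


def min_passphrase_chars(key_bits, bits_per_char=2.5):
    """Characters of English text needed to carry key_bits of entropy."""
    return math.ceil(key_bits / bits_per_char)


def derive_key(passphrase, salt, key_bytes=32, iterations=100_000):
    """Stretch a passphrase into a fixed-size key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=key_bytes
    )
```

At 2.5 bits per character, min_passphrase_chars(128) comes out at 52 characters, which is why a short password written straight into a key file falls well short of the key size.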

* Re: The situation at hand and in the future
  2004-05-27 21:05 ` Valdis.Kletnieks
@ 2004-05-27 22:09   ` David Masover
  2004-05-28  6:33     ` mjt
  0 siblings, 1 reply; 41+ messages in thread
From: David Masover @ 2004-05-27 22:09 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Markus Törnqvist, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Valdis.Kletnieks@vt.edu wrote:
| On Thu, 27 May 2004 23:01:27 +0300, mjt@nysv.org (Markus Törnqvist) said:
|
|
|>So, the Reiser4 API is good for providing passphrases for encrypted files
|>and directories, right? Because using echo to unlock them from the file
|>system is hazardous?
|>
|>Why not add user-space programs which basically do what echo does, but
|>with the terminal echo turned off? They would probably size at around
|>0k and would use the pseudo file API.
|>
|>Maybe this would be a use case:
|>1. Joe wants to encrypt the files he l33ch3d from kazaa at work.
|>2. Joe chdirs to the pseudo directory under ~/Work/Warez/
|>3. Joe says echo 3DES\0 > plugin/crypto
|>4. Now the directory knows it must be 3DES-encrypted, but it needs
|>   a passphrase.
|>5. Joe says echo TOPSECRETPASSWORD\0 > plugin/cryptokey
|
|
| Remember that English has an entropy of only 2.5 bits/byte or so - as a
| result, you *REALLY*, *REALLY* want to use a program that reads a much
| longer passphrase and computes a hash to use as the actual key.

Could this be a plugin/pseudo?  Maybe I'm being obsessive, but here's
what I want:

$ cd ~/.secret/..metas/crypto
$ ln -s ciphers/blowfish cipher
$ echo 256 > bitsize
# deny access rather than block if dir is accessed before passphrase is entered
$ echo 0 > block
$ echo sha1 > hash
$ echo think you a h4x0r?  break this > passphrase
$ cat passphrase
cat: passphrase: Permission denied
$ cat key
e786498ac12eb276badf570ddfb4eb50de07f2b2
$ echo think you a h4x0r?  break this | sha1sum | cut -f 1 -d ' ' > key
$ cat key
e786498ac12eb276badf570ddfb4eb50de07f2b2
$ echo 5m > timeout		# a setting of 0 is no timeout
$ sleep 5m			# on an otherwise inactive system
$ ls ~/.secret/
ls: ~/.secret/: Permission denied
$ cat key
cat: key: Permission denied
$ echo lame attempt > passphrase	# can we die now? probably not
$ ls ~/.secret/
ls: ~/.secret/: Permission denied      # fails because of bad passphrase
$ echo think you a h4x0r?  break this > passphrase
$ ls ~/.secret/
bios  hacking  pgp  more stuff
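The timeout file in the session above takes values like "5m", with 0 meaning no timeout. A minimal sketch of how such a value could be parsed; the accepted unit suffixes (s, m, h) and the function name are assumptions, only "5m" and "0" appear in the session:

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_timeout(text):
    """Return the idle timeout in seconds, or None when it is disabled (0)."""
    match = re.fullmatch(r"(\d+)([smh]?)", text.strip())
    if match is None:
        raise ValueError(f"bad timeout: {text!r}")
    # A bare number is taken as seconds.
    seconds = int(match.group(1)) * _UNITS[match.group(2) or "s"]
    return seconds or None
```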


I do want it to verify for itself somehow whether my passphrase/key
worked.  This is important because if we just do the cryptoloop thing,
either the directory is corrupt or (worse) the files inside are corrupt.
Worse, someone might create a new file with the different key...

I don't even want to _think_ about what this would do to Mozilla if I
applied it to my home.

If it's possible, I'd like a bad passphrase to make 'echo' get an "access
denied" error (or maybe block for one second first), but I don't see how
that's possible -- how do we know when the write is done and the
passphrase is entered until the file is closed?

I'd like all settings but passphrase and key to be persistent.

I've heard that there will be an api to allow multiple files to be
accessed with a single filehandle -- also, there's sort of vague concept
of other possibilities -- SQL access?  If so, any app wanting
performance (setting individual crypto settings for thousands of dirs)
would not do the steps I outlined from a shell script.  If a pseudo file
which contains such a tiny bit of information as a passphrase or a
boolean value is written to individually, we can assume it's either a
human or a shell script, and in either case, I think we can waste the
time to check for things like newlines and a lack of \0 at the end of a
line -- in general, we can do some simple parsing.  Even having
"true/false" instead of "1/0" might be nice.
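The kind of simple parsing meant here could look like the sketch below. It accepts a value with or without a trailing newline or \0 and takes "true/false" as well as "1/0"; the function names are made up for illustration:

```python
def clean_value(raw):
    """Decode a written value and drop trailing newline or NUL terminators."""
    return raw.decode("utf-8").rstrip("\n\0")


def parse_bool(raw):
    """Accept 1/0 and true/false, in any letter case."""
    text = clean_value(raw).strip().lower()
    if text in ("1", "true"):
        return True
    if text in ("0", "false"):
        return False
    raise ValueError(f"not a boolean: {text!r}")
```

For a passphrase only clean_value would apply, since inner and leading spaces are part of the secret.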

However, I'm not all _that_ picky.  I'd really be happy with any crypto
plugin at all, as cryptoloop causes problems.  I see
'metas/plugin/crypto', but it seems to have some information only, and
no real way to turn it on.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLZnDHgHNmZLgCUhAQJYuQ/+J5A+TwndTEcCkdeG0SZ/w3mPgGAx24UH
Oy23H6d+pkasRtM6f9mHlTVcp7ow19rzmRcCeBmcRaIXHEzpKYRIcFOmBqE2+2yY
+T1dLaMcwDP7S3kYOdhduf1eoZlp7zFd0ahcTpGhY0wywKowS4f3UxQnWsGXaaAU
E4z64rKEtXXeTGe53RA3rTTZDS18/FgqCWF4TrK/AE1jAAxrSvHOxUpWxL4J1e4M
KGUeNszjICb7DX1WVVzARE3kqlaryBH8PGGCywkcP9FvTYJ+mldJ+d6ZGoAooyLA
OXTMvg40n8qASlvzeMvs9LffufKp/46YYRhlyWKxMBo0xwqpRPFFtaYzwzzh9auo
ERG3D4J+pYJdDvWuK7V+w+7Y2ljTSxApuwUxlPAhw9IXtXm1GHLnT+W7SdKsxmAu
KpBnqiLRayGx7va1rsNMEPjf6s0GbldBdEpVWNLZ2PdwtTw/8PFY4RlXrUA/f2db
8tugn3dlp7/jj5wl1F1HL7ByFOPc7/DUQurLG2dAsBm+XQlTO3T2MNH6v+ZNmOkQ
7psTkSZmJrha5FQprIGqnnkIM+XR0o6YHmEMayxu1Q70e5DTzpTB1WVSTNxu0xjr
BiVoFMa22Zovr9XKmQFz8eBKVgfO5CEaw1iX0xiL8+siRIfnhQhPHybsbdY9SXVn
p7QG2CIv7c0=
=Tmom
-----END PGP SIGNATURE-----

* Re: The situation at hand and in the future
  2004-05-27 22:09   ` David Masover
@ 2004-05-28  6:33     ` mjt
  2004-05-28 19:53       ` Valdis.Kletnieks
  2004-05-29 14:22       ` David Masover
  0 siblings, 2 replies; 41+ messages in thread
From: mjt @ 2004-05-28  6:33 UTC (permalink / raw)
  To: David Masover; +Cc: Valdis.Kletnieks, reiserfs-list

On Thu, May 27, 2004 at 05:09:17PM -0500, David Masover wrote:
>Could this be a plugin/pseudo?  Maybe I'm being obsessive, but here's
>what I want:

Much more detailed a use case than my initial example, very good
and very much to the point.

>$ cd ~/.secret/..metas/crypto

This step would make life easier.

>$ ln -s ciphers/blowfish cipher

Isn't the idea with Reiser4's crypto system that it uses cryptoalgos
compiled into the kernel?

Of course these could be represented under ciphers. Actually, I'd
be surprised if they're not under /proc/ or /sys/

The idea is still good and remains.

>$ echo 256 > bitsize
># deny access rather than block if dir is accessed before passphrase is entered
>$ echo 0 > block

What is the difference between blocking and denying access?

[snip]

>If it's possible, I'd like a bad passphrase to make 'echo' get an "access
>denied" error (or maybe block for one second first), but I don't see how
>that's possible -- how do we know when the write is done and the
>passphrase is entered until the file is closed?

Adding a passphrase retroactively? The write is done, the file
is closed, and only then is the file encrypted.

>I'd like all settings but passphrase and key to be persistent.

Persistent over boots? I'd like the passphrase and key to survive
a boot...

>I've heard that there will be an api to allow multiple files to be
>accessed with a single filehandle -- also, there's sort of vague concept
>of other possibilities -- SQL access?  If so, any app wanting

That's the Reiser4 syscalls api, I gather.

I hope Namesys doesn't put too much effort into that yet.
I don't mean to be pretentious, or teaching my grandmother to suck
eggs (is that really the English-language saying?) but there should
be more pressing matters.

>performance (setting individual crypto settings for thousands of dirs)
>would not do the steps I outlined from a shell script.  If a pseudo file

Naturally. But if the crypto settings propagate automatically to new
children, it may not be such a performance loss.

>which contains such a tiny bit of information as a passphrase or a
>boolean value is written to individually, we can assume it's either a
>human or a shell script, and in either case, I think we can waste the
>time to check for things like newlines and a lack of \0 at the end of a
>line -- in general, we can do some simple parsing.  Even having
>"true/false" instead of "1/0" might be nice.

I have lived in the belief that \0 is there to make things faster.
I'm afraid I have to disagree here, but that's trivial semantics.

Besides, shell scripts take about the same time to write, I mean,
it takes the same time for the human to write the script, has it
the strict \0 1/0 syntax or not. Maybe even one script or user-space
reiser4prog would suffice for general cryptohandling.

r4_encrypt ~/.secret/ -c sha1 -s 256 # and so on and so forth.

>However, I'm not all _that_ picky.  I'd really be happy with any crypto
>plugin at all, as cryptoloop causes problems.  I see

Ever since having read about Reiser4's implementation, cryptoloop has
seemed like a terrible kludge, so I'm really looking forward to this
better solution.

GIMME GIMME GIMME! ;)

>'metas/plugin/crypto', but it seems to have some information only, and
>no real way to turn it on.

Does the code exist?

Well, I'd like to hear Namesys' opinion on these matters :)

-- 
mjt


* Re: The situation at hand and in the future
  2004-05-28  6:33     ` mjt
@ 2004-05-28 19:53       ` Valdis.Kletnieks
  2004-05-29 12:48         ` mjt
  2004-05-29 14:22       ` David Masover
  1 sibling, 1 reply; 41+ messages in thread
From: Valdis.Kletnieks @ 2004-05-28 19:53 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: David Masover, reiserfs-list

On Fri, 28 May 2004 09:33:24 +0300, Markus Törnqvist said:

> Persistent over boots? I'd like the passphrase and key to survive
> a boot...

No you don't.

If the passphrase and key are persistent, then an attacker can get your data.

Think about it - the only reason an attacker doesn't have access to your
data is because they don't have the passphrase/key.  If you leave them around,
you've given away the keys to the kingdom.

* Re: The situation at hand and in the future
  2004-05-28 19:53       ` Valdis.Kletnieks
@ 2004-05-29 12:48         ` mjt
  0 siblings, 0 replies; 41+ messages in thread
From: mjt @ 2004-05-29 12:48 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: David Masover, reiserfs-list

On Fri, May 28, 2004 at 03:53:43PM -0400, Valdis.Kletnieks@vt.edu wrote:

>> Persistent over boots? I'd like the passphrase and key to survive
>> a boot...
>No you don't.
>If the passphrase and key are persistent, then an attacker can get your data.

I think we are talking about the same/different thing :)

Of course I don't want the files to become unencrypted after a boot.
I meant that I want the files to be locked _with the same key_ after
a boot. Because the unlocking/locking key must be always the same.

-- 
mjt


* Re: The situation at hand and in the future
  2004-05-28  6:33     ` mjt
  2004-05-28 19:53       ` Valdis.Kletnieks
@ 2004-05-29 14:22       ` David Masover
  2004-05-29 15:49         ` mjt
  2004-05-29 20:04         ` Hubert Chan
  1 sibling, 2 replies; 41+ messages in thread
From: David Masover @ 2004-05-29 14:22 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: Valdis.Kletnieks, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Markus Törnqvist wrote:
| Isn't the idea with Reiser4's crypto system that it uses cryptoalgos
| compiled into the kernel?

Sure, although it might be nice to allow userland crypto programs.  But
there's not all that many crypto algorithms, and most of them are in the
kernel -- so the only really nice thing that comes from that is the
ability to update them without rebooting (in case of some insecurity).
Not worth wasting too much time on right now.

| Of course these could be represented under ciphers. Actually, I'd
| be surprised if they're not under /proc/ or /sys/

The symlink is just my idea of a good way to represent references.  The
alternative here would be something like:

$ echo blowfish > cipher

This is analogous to providing a text entry box when you really want a
drop-down menu.  My suggestion for a filesystem-based "drop-down menu"
would be to have a directory full of files (which could easily be
completely empty).  You create a symlink to the one you want.
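A minimal sketch of that "drop-down menu", run against an ordinary directory rather than a real ..metas tree. The function names are hypothetical; the layout (a directory of empty files plus one symlink) follows the description above:

```python
import os


def list_choices(ciphers_dir):
    """The menu entries are simply the file names in the directory."""
    return sorted(os.listdir(ciphers_dir))


def select(ciphers_dir, link_path, name):
    """Point the symlink at one entry; unknown names are rejected."""
    if name not in os.listdir(ciphers_dir):
        raise ValueError(f"no such choice: {name!r}")
    # Replace any previous selection.
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(os.path.join(ciphers_dir, name), link_path)


def current(link_path):
    """Read the selection back from the symlink target."""
    return os.path.basename(os.readlink(link_path))
```

Unlike a free-text file, the selection cannot name something that does not exist unless the link is created by other means, so whether dangling links are refused would still be up to the file system.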

|>$ echo 256 > bitsize
|># deny access rather than block if dir is accessed before passphrase is entered
|>$ echo 0 > block
|
|
| What is the difference between blocking and denying access?

Blocking as in waiting for input.  Think like a program reading from
stdin.  It may think it's reading from a file that's already got all the
data it needs, but if stdin points to a terminal, it will stop and wait
for each line of text you enter.  Here, a program trying to access
~/.secret will sleep until someone does

$ echo (passphrase) > ~/.secret/..metas/crypto/passphrase

This might be nice while booting up, so that the user can enter the
passphrase whenever they feel like it without having to watch programs
die with "access denied" errors, and having to restart those programs.
But an init script run early enough in boot should take care of that
problem, and it'd be a lot more annoying if the user did something like

$ ls ~/.secret

and it sleeps for awhile, then the user goes "Doh!  Forgot to enter my
passphrase" and has to kill 'ls' and enter it.

What would be better is to simply deny any request to ~/.secret which is
not directly tied to accessing ~/.secret/..metas.
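The difference between the two modes can be modelled in user space as a toy, assuming nothing about how Reiser4 would implement it. Here "block" means the caller sleeps until a passphrase arrives, "deny" means it fails at once:

```python
import threading


class LockedDir:
    """Toy model of a directory that stays closed until a passphrase arrives."""

    def __init__(self, block):
        self.block = block
        self._unlocked = threading.Event()

    def unlock(self):
        """Stands in for a correct passphrase being written."""
        self._unlocked.set()

    def access(self, timeout=None):
        if self._unlocked.is_set():
            return "contents"
        if not self.block:
            # Deny mode: fail immediately, like "Permission denied".
            raise PermissionError("passphrase not entered")
        # Block mode: sleep until unlock() is called or the timeout passes.
        if not self._unlocked.wait(timeout):
            raise TimeoutError("still waiting for passphrase")
        return "contents"
```

With block=True and no timeout, access() behaves like the sleeping 'ls' described above: it only returns once someone else calls unlock().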

|>If it's possible, I'd like a bad passphrase to make 'echo' get an "access
|>denied" error (or maybe block for one second first), but I don't see how
|>that's possible -- how do we know when the write is done and the
|>passphrase is entered until the file is closed?
|
|
| Adding a passphrase retroactively? The write is done, the file
| is closed, and only then is the file encrypted.

Yeah, that's fine, but then "echo" doesn't die.  This is fine for shell
scripts -- they can always detect that it didn't work out and try again.
But I would like both shell scripts and manual user munging to be well
supported.

|>I'd like all settings but passphrase and key to be persistent.
|
|
| Persistent over boots? I'd like the passphrase and key to survive
| a boot...

Reading ahead in my mail, I see this has already been answered.  Note
that cryptoloop does exactly what you're describing, only it allows an
incorrect passphrase to be entered, because it can't tell the difference
between correct or incorrect -- only you can, because incorrect will
yield gibberish.  We would want something to persist that allows a
passphrase to be checked.
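One way to persist "something that allows a passphrase to be checked" without persisting the key is a salted, stretched check value. This is a sketch of that general technique, not of anything known to be in Reiser4; all names are hypothetical:

```python
import hashlib
import hmac
import os


def _stretch(passphrase, salt, iterations):
    """Derive the key from the passphrase (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations
    )


def make_verifier(passphrase, iterations=100_000):
    """Return (salt, iterations, check) to persist; the key is not stored."""
    salt = os.urandom(16)
    key = _stretch(passphrase, salt, iterations)
    # Store a hash of the key, not the key: the check value cannot decrypt.
    check = hashlib.sha256(b"verify" + key).digest()
    return salt, iterations, check


def passphrase_ok(passphrase, salt, iterations, check):
    """Recompute the check value and compare in constant time."""
    key = _stretch(passphrase, salt, iterations)
    candidate = hashlib.sha256(b"verify" + key).digest()
    return hmac.compare_digest(candidate, check)
```

Anyone who can read the stored values can still guess passphrases offline, so the salt and iteration count only slow that down; a weak passphrase stays weak.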

|>performance (setting individual crypto settings for thousands of dirs)
|>would not do the steps I outlined from a shell script.  If a pseudo file
|
|
| Naturally. But if the crypto settings propagate automatically to new
| children, it may not be such a performance loss.

True, but it should be possible for people to design a system which
requires thousands of *individual* crypto settings (individual
passphrases and so on, so it can't all propagate to new children) -- not
because I can think why they'd do that, but because someone will
inevitably come whining to Namesys if they can't.

Not that this is a priority.

| I have lived in the belief that \0 is there to make things faster.
| I'm afraid I have to disagree here, but that's trivial semantics.

Meaning it shouldn't be hard to change.  And if we're writing shell
scripts, we don't want performance.  And if we want performance, I'll bet
there's a lot more lost on having so many individual files than on
having a little bit of parsing.

| Besides, shell scripts take about the same time to write, I mean,
| it takes the same time for the human to write the script, has it

Maybe I don't want to write a script?  And manually editing the files
should be at least as easy as it is with /proc/sys and /sys files.
Those do not require a \0, chop off a trailing \n at the end of input,
and add one at the end of output.

| the strict \0 1/0 syntax or not. Maybe even one script or user-space
| reiser4prog would suffice for general cryptohandling.
|
| r4_encrypt ~/.secret/ -c sha1 -s 256 # and so on and so forth.

I can imagine something like this, but you can't count on having
something like that for every plugin in every situation.  Unless there's
a generic shell script, which seems like a waste.

| Ever since having read about Reiser4's implementation, cryptoloop has
| seemed like a terrible kludge, so I'm really looking forward to this
| better solution.

dm_crypt is a better solution than cryptoloop, but this is better still.

| GIMME GIMME GIMME! ;)

Amen.  Any code to start from?  Because I might even want to hack a
plugin together this summer.  Assuming it doesn't exist yet...


The original mail which this is in reply to a reply to was an attempt to
start being more concise, especially on mailing lists.  Oh well, I tried :P
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLicnHgHNmZLgCUhAQIcoQ/+Pk08sL8479I5ZGULyv2ZVDbszU7DGczr
T5pTK9tkh2XRWMGhYcjByVHmdQzZncspLkyA+ULRShP4F15rQcJESSOgVGbWqbb2
nwubUierQhtuM7tMUGSCFxYbNnWdY7BCNenS/qCnSDwr2Ygf8XWpBWG7mkd+Z0+I
b5N8IqOK7dPg+YbxyWHJXirYHPoBqDCnFfSt1HC10eR+DDt505oR9GqUkNVo9ls0
6o9sfKj5bIEnsdHLb/CLGiloSpu9CISinCJS6xEA2twGGWyAogrLxpuYgjCqqel0
YqVc6z3R+PdBegq9PJt6rcTy098e+GimHPQfFz+ijfj/M+6+iquhDWL0DK46Nfde
7Xb8EYBXKN8VO1QpgCyDSrdT/ilSRnSHDbixo7MZGyScc0zjTjL/t96BMpwA4cb3
4upKhefEFlPm8ksP7GPXr3r//AbiFgykayFNfwE0gt34LIVCguVwkp/2nZELLsA7
HXnhLgSNJyrDCcviYbPdlMcNSYoJrXv2GDyNsX234XIhuzVrbLUOM4TdpKi6A3HU
pmwEEoUPduDG9S+h+6eiccX/LhAUo94mAfPZW5e+lxoi/+82W58lBcJaE0IQdU5h
mLVrnXiRJange1x5gE3D1a3uwDQrOi/Z9CJPL5mqHwVlVw/Gx3T6n7f/KlNf7SK5
oHylvIqYxJQ=
=AkCE
-----END PGP SIGNATURE-----

* Re: The situation at hand and in the future
  2004-05-29 14:22       ` David Masover
@ 2004-05-29 15:49         ` mjt
  2004-05-29 23:16           ` David Masover
  2004-06-02  2:45           ` Hans Reiser
  2004-05-29 20:04         ` Hubert Chan
  1 sibling, 2 replies; 41+ messages in thread
From: mjt @ 2004-05-29 15:49 UTC (permalink / raw)
  To: David Masover; +Cc: Valdis.Kletnieks, reiserfs-list

On Sat, May 29, 2004 at 09:22:20AM -0500, David Masover wrote:
>Sure, although it might be nice to allow userland crypto programs.  But

Correct me if I'm wrong, but that would mean interfacing the kernel
with the user in ways that may break more easily than just using
in-kernel algos?

>Not worth wasting too much time on right now.

Yeah.

>The symlink is just my idea of a good way to represent references.  The
>alternative here would be something like:
>$ echo blowfish > cipher

Yes. This springs to mind the pretty axiomatic notion that in order
to change a cipher from non-none, it must be unlocked :)

>This is analogous to providing a text entry box when you really want a
>drop-down menu.  My suggestion for a filesystem-based "drop-down menu"
>would be to have a directory full of files (which could easily be
>completely empty).  You create a symlink to the one you want.

Symlinks have the good point that this doesn't happen:
$ echo foobar > cipher

Of course that would be handled, but having you chastised by the kernel
may be bloating it. Is that a bigger bloat than handling symlinks out
of the pseudo file system? Or maybe silent failures are good enough..

>Blocking as in waiting for input.  Think like a program reading from
>stdin.  It may think it's reading from a file that's already got all the
>data it needs, but if stdin points to a terminal, it will stop and wait
>for each line of text you enter.  Here, a program trying to access
>~/.secret will sleep until someone does
>
>$ echo (passphrase) > ~/.secret/..metas/crypto/passphrase
>
>This might be nice while booting up, so that the user can enter the
>passphrase whenever they feel like it without having to watch programs
>die with "access denied" errors, and having to restart those programs.
>But an init script run early enough in boot should take care of that
>problem, and it'd be a lot more annoying if the user did something like
>
>$ ls ~/.secret
>
>and it sleeps for awhile, then the user goes "Doh!  Forgot to enter my
>passphrase" and has to kill 'ls' and enter it.

No matter what, I think the user will echo the passphrase and have
the program wake up from sleep.

But this is a nice feature, which I hope again will be implemented.

>What would be better is to simply deny any request to ~/.secret which is
>not directly tied to accessing ~/.secret/..metas.

This reminds me of the point of encrypting a directory.

I really want to know if it encrypts the directory object or all content.

>|>If it's possible, I'd like a bad passphrase to make 'echo' get an "access
>|>denied" error (or maybe block for one second first), but I don't see how
>|>that's possible -- how do we know when the write is done and the
>|>passphrase is entered until the file is closed?
>| Adding a passphrase retroactively? The write is done, the file
>| is closed, and only then is the file encrypted.
>Yeah, that's fine, but then "echo" doesn't die.  This is fine for shell
>scripts -- they can always detect that it didn't work out and try again.
>But I would like both shell scripts and manual user munging to be well
>supported.

I'm afraid I don't see this situation quite clearly.

You mean the file creation starts and echo for encrypting is issued
before the write is done?

I'm not sure about the syntax but I hope it's clear ;)

$ cd .secret
$ dd if=/dev/urandom of=KingArthur.0d.DivX.avi &
[1] 2084
$ (sleep 20; kill %1) &
[2] 2085
$ echo passphrase > crypto/passphrase

A situation like this would be awkward...

>Reading ahead in my mail, I see this has already been answered.  Note
>that cryptoloop does exactly what you're describing, only it allows an
>incorrect passphrase to be entered, because it can't tell the difference
>between correct or incorrect -- only you can, because incorrect will
>yield gibberish.  We would want something to persist that allows a
>passphrase to be checked.

Absolutely.
Having no cryptoloop experience, I'd probably have filed a bug report
for this. It's broken behavior, feature or not :)

HOWEVER!
Anyone with sufficient knowledge may dig out the persistent metadata
information from the fs, say, using dd.

How would this be handled? Are MD5 hashes strong enough?

One other option would be for the kernel to generate a super-passphrase
so it stores cryptoinformation using this passphrase, which is totally
inaccessible to users, but it may cause breakage on unclean shutdowns.

Wouldn't protect against hardware theft either. Naah, bad idea :)

>True, but it should be possible for people to design a system which
>requires thousands of *individual* crypto settings (individual
>passphrases and so on, so it can't all propagate to new children) -- not
>because I can think why they'd do that, but because someone will
>inevitably come whining to Namesys if they can't.

I would solve this by having the settings propagate to all unencrypted
children and let the children be re-encrypted or whatever that changes
the cryptosettings.
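The propagation rule can be sketched on a plain in-memory tree. The dict layout and the name propagate are made up for illustration; the rule itself is the one described above, where a node without its own crypto settings takes those of the nearest ancestor that has some:

```python
def propagate(node, inherited=None):
    """Give every node without its own crypto settings the nearest ancestor's."""
    own = node.get("crypto")
    effective = own if own is not None else inherited
    node["effective"] = effective
    # Children see this node's effective settings as their inheritance.
    for child in node.get("children", []):
        propagate(child, effective)
    return node
```

A child with its own settings (say ~/Work/Warez/ with 3DES under a Blowfish parent) keeps them, and its own children inherit from it rather than from the grandparent.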

If that's not good enough, deal with it later.

Sure it iterates a lot over the tree in vain, but maybe the Reiser4
syscall API would allow these to be better bundled together.

Requiring some cool reiser4progs addition which uses this API.
And then someone would write a Python module so someone with skills
as lacking as mine can write a sexier application ;)

>Meaning it shouldn't be hard to change.  And if we're writing shell
>scripts, we don't want performance.  And if we want performance, I'll bet
>there's a lot more lost on having so many individual files than on
>having a little bit of parsing.

Mmyeah, perhaps you're right.

Maybe it should be homogenized to \n and let it slide.

>| r4_encrypt ~/.secret/ -c sha1 -s 256 # and so on and so forth.
>I can imagine something like this, but you can't count on having
>something like that for every plugin in every situation.  Unless there's
>a generic shell script, which seems like a waste.

I was thinking more along the lines of a c program (which is basically
a shell script ported to c, think djb's makemaildir or whatever it was
called) which would do lookups of available cryptoalgos and stuff.

A glorified shell script. A generic one. Even if it were a waste.
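
A rough sketch of such a generic helper follows. It is written in Python
rather than C for brevity. The `ciphers` and `cipher` file names are
assumptions borrowed from the /sys-style example elsewhere in this
thread, not the real Reiser4 pseudo file layout, and a temporary
directory stands in for the pseudo files.

```python
import tempfile
from pathlib import Path


def available_ciphers(meta_dir: Path) -> list[str]:
    """Read the hypothetical 'ciphers' pseudo file, one name per line."""
    lines = (meta_dir / "ciphers").read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def select_cipher(meta_dir: Path, name: str) -> None:
    """Write the chosen name to the hypothetical 'cipher' pseudo file."""
    if name not in available_ciphers(meta_dir):
        # Fail loudly in user space instead of relying on the kernel.
        raise ValueError(f"unknown cipher: {name!r}")
    (meta_dir / "cipher").write_text(name + "\n")


# Demo: a temporary directory plays the role of the pseudo file directory.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "ciphers").write_text("blowfish\n3des\n")
select_cipher(demo_dir, "blowfish")
```

The lookup happens before the write, so a typo is rejected by the tool
itself and never reaches the file system.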

>| Ever since having read about Reiser4's implementation, cryptoloop has
>| seemed like a terrible kludge, so I'm really looking forward to this
>| better solution.
>dm_crypt is a better solution than cryptoloop, but this is better still.

I think I must look into dm_crypt, but I don't think I'll be encrypting
any of my files until the Reiser4 mechanism is complete.

I like easy solutions and having it on the fs is just too damned simple :)

>Amen.  Any code to start from?  Because I might even want to hack a
>plugin together this summer.  Assuming it doesn't exist yet...

There is some code that I tried to read through.
reiser4/crypt.c
reiser4/plugin/cryptcompress.h
reiser4/plugin/cryptcompress.c

So it seems it will compress as well as encrypt. But I'm not sure about
what all that code does. Should give you a start though.

Unfortunately the error codes in the code are named Edward, who doesn't
appear in the README...

>The original mail which this is in reply to a reply to was an attempt to
>start being more concise, especially on mailing lists.  Oh well, I tried :P

Consistency is a bit like perfection.
A good goal but the distance to it shortens only in halves.
Close enough is good enough.

I'm doing yet another bk -r get now, as the last one didn't compile.
I don't think much has developed during the weekend, but if there's
a 24 hour latency there's also hope :)

(Rumor has it the nightly snapshot system will come next week ;)

-- 
mjt


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 14:22       ` David Masover
  2004-05-29 15:49         ` mjt
@ 2004-05-29 20:04         ` Hubert Chan
  2004-05-29 23:19           ` David Masover
  1 sibling, 1 reply; 41+ messages in thread
From: Hubert Chan @ 2004-05-29 20:04 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "David" == David Masover <ninja@slaphack.com> writes:

[...]

David> Reading ahead in my mail, I see this has already been answered.
David> Note that cryptoloop does exactly what you're describing, only it
David> allows an incorrect passphrase to be entered, because it can't
David> tell the difference between correct or incorrect -- only you can,
David> because incorrect will yield gibberish.  We would want something
David> to persist that allows a passphrase to be checked.

Note that allowing a passphrase to be checked may decrease security
(slightly).  If an attacker has a way to check if the passphrase is
correct, it allows him/her to bruteforce the passphrase.  Otherwise,
when the attacker enters a passphrase and reads gibberish, he/she
doesn't know if that really is the data that's encrypted, or if he/she
entered the wrong passphrase.

Of course, in practice, it won't be too bad, because known file formats
are fairly easily recognizable.  But one could obtain "gibberish" to
encrypt by encrypting multiple times.  (So the attacker would need to
also know the number of encryption layers before he/she would be able
to bruteforce.)

[...]

David> | Ever since having read about Reiser4's implementation,
David> | cryptoloop has seemed like a terrible kludge, so I'm really
David> | looking forward to this better solution.

David> dm_crypt is a better solution than cryptoloop, but this is better
David> still.

dm_crypt is basically the same idea as cryptoloop, but implemented using
Device Mapper instead of loopback.  It's an implementation improvement,
which allows it to be more flexible, but is basically the same model of
use.

Of course, Reiser4 crypto won't make dm_crypt obsolete.  e.g. Reiser4
crypto won't be able to do swapfile encryption (which everyone who has
encrypted files should be doing).  For standard file encryption,
Reiser4 crypto is probably the way to go.  But dm_crypt/cryptoloop
still has its uses.

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 15:49         ` mjt
@ 2004-05-29 23:16           ` David Masover
  2004-05-30  0:41             ` Hubert Chan
                               ` (2 more replies)
  2004-06-02  2:45           ` Hans Reiser
  1 sibling, 3 replies; 41+ messages in thread
From: David Masover @ 2004-05-29 23:16 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: Valdis.Kletnieks, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Markus Törnqvist wrote:

| Symlinks have the good point that this doesn't happen:
| $ echo foobar > cipher

Oh yes, it does.
$ ln -s /tmp/foobar cipher

Of course, the nice thing here is that at least 'ln' refuses to create a
broken symlink, not sure about more low-level things.  But it does make
it easier for the user to manually set up a cipher.  Traditional /sys style:

$ cat ciphers
blowfish
3des
...
$ echo blowfish > cipher

My way:

$ ln -s ciphers/<tab><tab>
blowfish
3des
...

Now all the user has to do is a 'b<tab><enter>' to use blowfish.

| Of course that would be handled, but having you chastised by the kernel
| may be bloating it. Is that a bigger bloat than handling symlinks out
| of the pseudo file system? Or maybe silent failures are good enough..

I was actually thinking that it would be nice for some of this to be
moved to userland anyway.  I mean, what's crucial to be in the kernel
(for speed reasons) is the filesystem and probably things like the
crypto api, individual ciphers, and whatever else is involved in the
actual operation.  The setup doesn't need to be in the kernel at all,
and in fact, I think it'd be nice to have a meta-plugin which exports
the plugin interface to userland, to make this sort of thing easier.

|>What would be better is to simply deny any request to ~/.secret which is
|>not directly tied to accessing ~/.secret/..metas.
|
|
| This reminds me of the point of encrypting a directory.
|
| I really want to know if it encrypts the directory object or all content.

Encrypting a directory should encrypt the directory listing, at least.
It would theoretically be possible to scan the disk for all files which
use that plugin (or is the plugin id stored in the directory?) so that
you know which files are encrypted, but then you don't know filenames or
the directory hierarchy.

If that's true, though, it means that in order to be sane, we have to
verify the passphrase.

|>|>If it's possible, I'd like a bad passphrase to make 'echo' get an "access
|>|>denied" error (or maybe block for one second first), but I don't see how
|>|>that's possible -- how do we know when the write is done and the
|>|>passphrase is entered until the file is closed?
|>| Adding a passphrase retroactively? The write is done, the file
|>| is closed, and only then is the file encrypted.
|>Yeah, that's fine, but then "echo" doesn't die.  This is fine for shell
|>scripts -- they can always detect that it didn't work out and try again.
|>But I would like both shell scripts and manual user munging to be well
|>supported.

I was talking about the metafile editing.  If I do 'echo foo >
passphrase', and it's a bad passphrase, I think the 'echo' shouldn't be
successful.  Obviously not a priority, as either script or user can
check for a directory listing to make sure the passphrase was correct.

But your point is well taken:

| I'm afraid I don't see this situation quite clearly.
|
| You mean the file creation starts and echo for encrypting is issued
| before the write is done?
|
| I'm not sure about the syntax but I hope it's clear ;)
|
| $ cd .secret
| $ dd if=/dev/urandom of=KingArthur.0d.DivX.avi &
| [1] 2084
| $ (sleep 20; kill %1) &
| [2] 2085
| $ echo passphrase > crypto/passphrase
|
| A situation like this would be awkward...

Awkward to code, but not particularly awkward to figure out.  The 'echo'
command should block (sleep) until the directory has been encrypted,
assuming it was not encrypted to begin with.  Whatever 'dd' does over
the course of the encryption should be reflected in the eventual file.

As for coding it, we probably want a snapshot of the directory to
encrypt.  Atoms within the directory are not allowed to complete until
the encryption process is done.  Any changes to the directory or its
files should either block (the easy answer) or be stored as diffs to the
original snapshot (the right answer).

The easy answer works, but causes the problem of:  what if the user
starts this encryption of a fairly large directory (such as /), and then
needs to run something?  The encryption process is in the kernel, so we
don't crash, we just wait around for a very long time.

The right answer requires proper snapshotting.  I think how this works
is to make a copy of the file such that individual blocks are
copy-on-write, allowing us to take a snapshot of a huge file, change a
tiny piece of it, and use no more space than the chunk changed.  After a
little reflection, it would make sense to have all copies work that way.
It seems easier to make the whole file copy-on-write, but then we can't
handle huge files efficiently.
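
The block-level copy-on-write idea can be sketched as a toy model. This
is not Reiser4 code: the `CowFile` class and the 4096-byte block size
are made up for illustration, and "blocks" are just Python bytes
objects shared between a file and its snapshot.

```python
class CowFile:
    """Toy file made of blocks; snapshots share blocks until written."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # list of immutable bytes objects

    def snapshot(self):
        # Copies only the list of references, not the block contents.
        return CowFile(self.blocks)

    def write_block(self, index, data):
        # Replaces one reference; the snapshot keeps the old block.
        self.blocks[index] = bytes(data)


original = CowFile([b"A" * 4096 for _ in range(1024)])
snap = original.snapshot()
original.write_block(7, b"B" * 4096)

# Count how many blocks the live file and the snapshot still share.
shared = sum(a is b for a, b in zip(original.blocks, snap.blocks))
```

After one write, only that one block differs; the remaining blocks are
still the same objects, which is the "no more space than the chunk
changed" property described above.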

A (MUCH) simpler approach would be to make each file dirty and then set
it to be encrypted, as an atom.  This doesn't work for huge files,
though, because then we have to read the entire file into RAM before we
encrypt it.

If the crypto is implemented with snapshotting, it works like this:  You
start writing to foo.avi and it works as usual.  You start encrypting
stuff, and as a single atom, a snapshot is taken of foo.avi and it is
set to be encrypted.  The 'setting to be encrypted' basically forces all
future writes and reads to this file to be encrypted/decrypted where
they touch disk.  This is ok, because the blocks that are not encrypted
yet don't really belong to the new, encrypted file, and thus don't go
through the decryption plugin.

Now all we have to do is start reading in the non-encrypted blocks, and
writing them back on top of themselves as we go.  Ideally, we somehow
know which ones are not encrypted yet, and thus we don't end up
duplicating the work of new writes from dd (which are encrypted).
Practically, it may be implemented such that all blocks are read and
written to by the process -- which is ok, because each read/write pair
is an atom.  But on a big enough file with enough activity during the
encryption process, certain blocks would be encrypted twice, unless we
could check before encrypting a block whether it's already encrypted.
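
One way to picture the "check before encrypting" variant is a per-block
flag. The sketch below is a toy model under stated assumptions: the
`ConvertingFile` class is hypothetical, and a single-byte XOR stands in
for a real cipher (it is its own inverse, so encrypting a block twice
would silently undo the encryption, which is exactly the hazard).

```python
KEY = 0x5A  # toy single-byte XOR key, for illustration only


def toy_encrypt(block: bytes) -> bytes:
    """XOR every byte with KEY; applying it twice restores the input."""
    return bytes(b ^ KEY for b in block)


class ConvertingFile:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.encrypted = [False] * len(self.blocks)  # per-block flag

    def write_block(self, index, plaintext):
        # New writes go through the cipher and are flagged immediately.
        self.blocks[index] = toy_encrypt(plaintext)
        self.encrypted[index] = True

    def convert_step(self):
        """Encrypt one not-yet-encrypted block; False when none remain."""
        for i, done in enumerate(self.encrypted):
            if not done:
                self.blocks[i] = toy_encrypt(self.blocks[i])
                self.encrypted[i] = True
                return True
        return False

    def read_block(self, index):
        data = self.blocks[index]
        # Only flagged blocks pass through the (self-inverse) cipher.
        return toy_encrypt(data) if self.encrypted[index] else data


f = ConvertingFile([b"old0", b"old1", b"old2"])
f.convert_step()             # background pass encrypts block 0
f.write_block(1, b"new1")    # a concurrent writer touches block 1
while f.convert_step():      # the pass finishes, skipping block 1
    pass
```

Whether Reiser4 blocks can carry such a flag is the open question in
the text; the sketch only shows that the bookkeeping itself is small.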

I still haven't read the reiser4 code, and so I don't know if this is
practical -- can blocks be that intelligent?  Maybe we need to implement
these "blocks" as subfiles, so that they can each have their own
plugin-id?  Maybe I've got entirely the wrong approach?

|>Reading ahead in my mail, I see this has already been answered.  Note
|>that cryptoloop does exactly what you're describing, only it allows an
|>incorrect passphrase to be entered, because it can't tell the difference
|>between correct or incorrect -- only you can, because incorrect will
|>yield gibberish.  We would want something to persist that allows a
|>passphrase to be checked.
|
|
| Absolutely.
| Having no cryptoloop experience, I'd probably have filed a bug report
| for this. It's broken behavior, feature or not :)

The "feature" here is increased security.  Hubert wrote about this, but
I don't find it's such a big deal.  However, since some people obviously
care enough, this should only be the default option, not the only one.

| HOWEVER!
| Anyone with sufficient knowledge may dig out the persistent metadata
| information from the fs, say, using dd.
|
| How would this be handled? Are MD5 hashes strong enough?

Don't know about md5, but how hard is it to brute-force the file itself?
How about some strong magic at the beginning of the file (perhaps a
checksum of the filename?) which can be used to verify (within reason)
that the passphrase worked?  How vulnerable are modern ciphers to
known-plaintext attacks?

| One other option would be for the kernel to generate a super-passphrase
| so it stores cryptoinformation using this passphrase, which is totally
| inaccessible to users, but it may cause breakage on unclean shutdowns.

How easy is it to hide something from the user, anyway?  I mean, if
loadable modules are supported, it all goes out the window -- simply
write a driver that reads all kernel memory by directly asking the
hardware for it.  That's assuming that you can't simply scan through all
the memory yourself.

|>True, but it should be possible for people to design a system which
|>requires thousands of *individual* crypto settings (individual
|>passphrases and so on, so it can't all propagate to new children) -- not
|>because I can think why they'd do that, but because someone will
|>inevitably come whining to Namesys if they can't.
|
|
| I would solve this by having the settings propagate to all unencrypted
| children and let the children be re-encrypted or whatever that changes
| the cryptosettings.

That's fine, but that's still thousands of passphrases to change.  But
yes, deal with it later.

|>dm_crypt is a better solution than cryptoloop, but this is better still.
|
|
| I think I must look into dm_crypt, but I don't think I'll be encrypting
| any of my files until the Reiser4 mechanism is complete.

dm_crypt is only slightly better than cryptoloop -- to the user, it's
basically the same, only not as broken.

| There is some code that I tried to read through.
| reiser4/crypt.c
| reiser4/plugin/cryptcompress.h
| reiser4/plugin/cryptcompress.c
|
| So it seems it will compress as well as encrypt. But I'm not sure about
| what all that code does. Should give you a start though.

Seems right, only compression doesn't fit my (attempt at) design of a
crypto plugin above.  Or it does, but it'd have to be adjusted, because
compressing something changes its size, while encrypting it doesn't.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLkZ3ngHNmZLgCUhAQKk7Q//aOXz30G5ZE2RZi8We9jPvYcmFci3d60a
ml/EV+sNnQNOiVe8p65+XuXD3SdMwJYzQKjAP02mPLeI2zHHbb4XnzeZOKq+eGbo
N3NaZRmbfaXVlcyfuA9QOyVu0QIHEx95XZXuZ983ZdAETuK8FcQFDHf3hTaGgw9g
qkyLpttHsdvdYC9Kbil7iFYOUll7q4EzYWwUo7HwIjAtkREgAMx6mMnNqtQlGOfc
xZl65PDU80kGquXtZ3DoaYS4rWzRO+XsTxEi3bPu1rq/+vWFbd5MZTCUrphbBiJ7
BeRgdzoncTreKJMpq1uqlg2iyb1097WTQ8zAmzWP3UhqDwsFiSqt4ZmA+GZuBFre
r70EN5UqYv6TI3yoNaDz25RXoWRToki/CnrUrSWKgyfQj3z2Qs1KUgC1FFQsGyUq
CkhyiEBbRAcsOFUBt1Z+jIOg3tJ5OyOsKisw1u2ycE0cg/XMTDTAB+wbHPhfknRa
K2Qe6bi9WXoa738wbdOcbWuL9ZNvudstO9kY7jcKym+2EX8640yTrDnJY1hy4rEg
ncEY565yyEHjl+F0Bz1Qq/9A9VRITogEpVHwXYbm0KCKDLYbMKEB/WEcikL31R4c
2E9ImUsaWbbCr6YNBjBscRejjUUJN/AHj0pat9dYPLRsAui9sVZswndqDjIxB3QT
wUCt28tFh74=
=gVEx
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 20:04         ` Hubert Chan
@ 2004-05-29 23:19           ` David Masover
  2004-05-31 18:27             ` Valdis.Kletnieks
  0 siblings, 1 reply; 41+ messages in thread
From: David Masover @ 2004-05-29 23:19 UTC (permalink / raw)
  To: Hubert Chan; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Hubert Chan wrote:

| Note that allowing a passphrase to be checked may decrease security

Files are not encrypted by default, so encrypted files should allow a
passphrase to be checked by default.  In balancing security and
convenience, make convenience the default.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLkaZXgHNmZLgCUhAQK3vhAAg2xE4ohKaXqV63EOBRkszdqYx34w04+Q
ulpL7/uHridbT6OUNszkkMh9sls8cBQilBcwVMIP/uGeo/ehnthxDQ6NUmJhxKTq
qMIQLU6Ak42VsYZDau87RFHij53uqilZQcfl4Scvv3DcM0oQhJYJur/CtQQiLTk/
uYS+q/d31sLFYxnKqTGOvBq1M3fOOeSqAhQw2Djv6jPaSMhFTbQW5l8jbXp319l6
bzPucjGzrDJZSqu93gMzonQF2zSnHHTQxjctxUDaQ9WpAVCIoNUFqCeM1r5khet9
WIT16+wJTRgWeusWpKJpe9N9WYECuX6u6UKYCfpmMXGh6ae+Z8jOCLENkstI9DZL
Vbbo1ugsuPjM9q/J7X/FIQ9vLKSiFx2krdHYfQD+2TR09hZAdSRwAYvgpaHDKZqg
MgHuD4hK+mpxVorpNDZNc/Z2noX+f3ylDHl+GnFH2WBaksSbmbAjN3/IFBtG1otF
rTA5fQS77h4PYhIUXTbCWcYC3EYyUqsReI3VmUyVLmZhvrxwXy2r5Hy0LBof7aST
iiIzqetXk2xvXhzaADh6uaPaDFUDfJD3R3dPCzCMNirdq+ALuf3DY+4Tq43hxd5R
NohKrFgxQuo2FPE2I5zGw6IAmjvLZpzYB3+E246TCdBwHt3WHqmHGW9Vk7jy8bbH
zwDe/j5xGo0=
=AgBH
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 23:16           ` David Masover
@ 2004-05-30  0:41             ` Hubert Chan
  2004-05-30 12:29               ` mjt
  2004-05-30 12:27             ` mjt
  2004-05-31 18:31             ` Valdis.Kletnieks
  2 siblings, 1 reply; 41+ messages in thread
From: Hubert Chan @ 2004-05-30  0:41 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "David" == David Masover <ninja@slaphack.com> writes:

[...]

David> Don't know about md5, but how hard is it to brute-force the file
David> itself?  How about some strong magic at the beginning of the
David> file (perhaps a checksum of the filename?) which can be used to
David> verify (within reason) that the passphrase worked?  How
David> vulnerable are modern ciphers to known-plaintext attacks?

Modern ciphers should be fairly resistant to known-plaintext attacks, I
think.  When you put a filesystem on a loopback, you've essentially got
a known-plaintext, because filesystems typically start with a magic
number.  (Assuming the attacker knows what filesystem you're using.)
It's best to avoid known plaintexts if possible, of course, if you're
worried about security.

One thing that can be done is to take just the first couple of bytes
from a hash to be used as your check.  That will catch the user's
common entry errors, and won't reduce the keyspace by that much
(hopefully).
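
A minimal sketch of such a truncated check, assuming SHA-1 and a
two-byte check value (both choices are illustrative, and the function
names are made up). A real design would probably also want a salt and
a deliberately slow key-derivation step, which are left out here.

```python
import hashlib

CHECK_BYTES = 2  # keep only 16 bits of the digest


def make_check(passphrase: str) -> bytes:
    """Return a short check value to store next to the encrypted data."""
    digest = hashlib.sha1(passphrase.encode("utf-8")).digest()
    return digest[:CHECK_BYTES]


def looks_right(passphrase: str, stored_check: bytes) -> bool:
    """True if the passphrase matches the stored check.

    False positives are possible by design: with 16 bits, roughly one
    wrong passphrase in 65536 also passes.
    """
    return make_check(passphrase) == stored_check


stored = make_check("correct horse")
```

A typo is almost always caught, while an attacker who steals the check
value can only discard most guesses, not confirm a single one.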

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 23:16           ` David Masover
  2004-05-30  0:41             ` Hubert Chan
@ 2004-05-30 12:27             ` mjt
  2004-05-30 17:09               ` Hubert Chan
                                 ` (2 more replies)
  2004-05-31 18:31             ` Valdis.Kletnieks
  2 siblings, 3 replies; 41+ messages in thread
From: mjt @ 2004-05-30 12:27 UTC (permalink / raw)
  To: David Masover; +Cc: Valdis.Kletnieks, reiserfs-list

On Sat, May 29, 2004 at 06:16:47PM -0500, David Masover wrote:

>Now all the user has to do is a 'b<tab><enter>' to use blowfish.

This is good. And if the user uses sh or ash which seem to lack
tab expansion, they inherently pay attention to typing.

Besides, only Linux is supported for now and it tends to provide
better shells by default.

>I was actually thinking that it would be nice for some of this to be
>moved to userland anyway.  I mean, what's crucial to be in the kernel
>(for speed reasons) is the filesystem and probably things like the
>crypto api, individual ciphers, and whatever else is involved in the
>actual operation.  The setup doesn't need to be in the kernel at all,
>and in fact, I think it'd be nice to have a meta-plugin which exports
>the plugin interface to userland, to make this sort of thing easier.

The pseudo file system is kernel-space.
Having something completely user-space creates yet another knob
to worry about?

What do you propose this will look like?

>Encrypting a directory should encrypt the directory listing, at least.

Yes, the directory listing is in the pseudo file system, so this
should be relatively easy to do.

Does readdir() read the same thing as
cat ..metas/something/dirlist?

>It would theoretically be possible to scan the disk for all files which
>use that plugin (or is the plugin id stored in the directory?) so that
>you know which files are encrypted, but then you don't know filenames or
>the directory hierarchy.
>
>If that's true, though, it means that in order to be sane, we have to
>verify the passphrase.

Should be doable.

>| $ cd .secret
>| $ dd if=/dev/urandom of=KingArthur.0d.DivX.avi &
>| [1] 2084
>| $ (sleep 20; kill %1) &
>| [2] 2085
>| $ echo passphrase > crypto/passphrase
>| A situation like this would be awkward...
>Awkward to code, but not particularly awkward to figure out.  The 'echo'
>command should block (sleep) until the directory has been encrypted,
>assuming it was not encrypted to begin with.  Whatever 'dd' does over
>the course of the encryption should be reflected in the eventual file.

We agree.

>As for coding it, we probably want a snapshot of the directory to
>encrypt.  Atoms within the directory are not allowed to complete until
>the encryption process is done.  Any changes to the directory or its
>files should either block (the easy answer) or be stored as diffs to the
>original snapshot (the right answer).

Timothy Webster asked about this and Vladimir Savliev replied No.

I think these snapshots should be implemented here.

>The easy answer works, but causes the problem of:  what if the user
>starts this encryption of a fairly large directory (such as /), and then
>needs to run something?  The encryption process is in the kernel, so we
>don't crash, we just wait around for a very long time.
>
>The right answer requires proper snapshotting.  I think how this works
>is to make a copy of the file such that individual blocks are
>copy-on-write, allowing us to take a snapshot of a huge file, change a
>tiny piece of it, and use no more space than the chunk changed.  After a
>little reflection, it would make sense to have all copies work that way.
>It seems easier to make the whole file copy-on-write, but then we can't
>handle huge files efficiently.
>A (MUCH) simpler approach would be to make each file dirty and then set
>it to be encrypted, as an atom.  This doesn't work for huge files,
>though, because then we have to read the entire file into RAM before we
>encrypt it.

Could there be a compromise?
Like with the policy=smart mount option.

Take the easy way where it's more efficient, the right way when it's
more efficient.

It should be possible to evaluate the amount of free RAM and swap
and the size of the directory and then automagically decide how to
do it.

Maybe the following mount options:
cryptopolicy=smart
cryptopolicy=wait
cryptopolicy=dirty
cryptopolicy=snapshot
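
What cryptopolicy=smart might decide can be sketched as a small
function. Everything here is hypothetical: the option does not exist,
the function name is made up, and the "half of free RAM" threshold is
an arbitrary assumption chosen only to make the idea concrete.

```python
MIB = 1024 * 1024


def choose_cryptopolicy(dir_bytes, free_ram_bytes, snapshots_supported=True):
    """Pick a strategy name for a hypothetical cryptopolicy=smart option."""
    # Assumed threshold: the in-RAM "dirty" path is used only when the
    # whole directory fits in half of the currently free RAM.
    if dir_bytes <= free_ram_bytes // 2:
        return "dirty"
    # Too big for RAM: prefer copy-on-write snapshots if available.
    if snapshots_supported:
        return "snapshot"
    # Last resort: block other writers until encryption finishes.
    return "wait"


small = choose_cryptopolicy(10 * MIB, 1024 * MIB)
large = choose_cryptopolicy(10240 * MIB, 1024 * MIB)
```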

I just wonder where else these snapshots could be used...
Then again, it's basically copy-on-capture. So even if CoC doesn't
present much in the way of speed-up, it may present much in the
way of other functionality.

>Now all we have to do is start reading in the non-encrypted blocks, and
>writing them back on top of themselves as we go.  Ideally, we somehow
>know which ones are not encrypted yet, and thus we don't end up
>duplicating the work of new writes from dd (which are encrypted).

This is problematic, yes?

If the journal is updated atomically on writes of complete files,
we would either have completely encrypted or completely unencrypted
files?

>Practically, it may be implemented such that all blocks are read and
>written to by the process -- which is ok, because each read/write pair
>is an atom.  But on a big enough file with enough activity during the
>encryption process, certain blocks would be encrypted twice, unless we
>could check before encrypting a block whether it's already encrypted.

But if this could be in the journal, it would make life easier.

>I still haven't read the reiser4 code, and so I don't know if this is
>practical -- can blocks be that intelligent?  Maybe we need to implement
>these "blocks" as subfiles, so that they can each have their own
>plugin-id?  Maybe I've got entirely the wrong approach?

I'm afraid I couldn't make this out of the code.
But see below for more.

>| Absolutely.
>| Having no cryptoloop experience, I'd probably have filed a bug report
>| for this. It's broken behavior, feature or not :)
>The "feature" here is increased security.  Hubert wrote about this, but
>I don't find it's such a big deal.  However, since some people obviously
>care enough, this should only be the default option, not the only one.

But if the user is asked to retrieve encrypted data for another user
from an encrypted fs, say some shared environment, this may happen:
The user enters a faulty passphrase, gets gibberish data, he believes
it's correct because it's something funky like binary measurement data
which he doesn't understand, delivers it to the requesting user who
can't do anything with it.

Likely? Hell, no.

>Don't know about md5, but how hard is it to brute-force the file itself?
>How about some strong magic at the beginning of the file (perhaps a
>checksum of the filename?) which can be used to verify (within reason)
>that the passphrase worked?  How vulnerable are modern ciphers to
>known-plaintext attacks?

I can't remember how MD5 works. It's a 32-byte 7-bit hash of data?
It has also something to do with the length of the data, if you know
how long the data is, it's easier to do a successful birthday attack.

I've been thinking about doing a simulator for this in Python, but
when I don't even get enough sleep to make writing, or thinking in 
English easy, it will have to wait :)
I'd be interested in how to birthday an MD5 with the wrong assumption
about data length. Hopefully it is impossible.

Someone more security-oriented may want to give info here :)

What you propose is probably good for performance too, just see
if (decrypt(name) == decrypt(gethashedname(name))) or something..

It would still have to be a set-length hash so that we can reserve a
static amount of disk space for the headerish data.

>How easy is it to hide something from the user, anyway?  I mean, if
>loadable modules are supported, it all goes out the window -- simply
>write a driver that reads all kernel memory by directly asking the
>hardware for it.  That's assuming that you can't simply scan through all
>the memory yourself.

But this would require root privileges. If you can't trust your root,
who can you trust?

Isn't some memory always accessible only by root? Or at least owned
by the user, you, so no-one else can access it? If a bug that circumvents
this gets into the kernel, certain sysadmins will start farting blood.

But that's assuming I remember the initial conditions correctly.

>| So it seems it will compress as well as encrypt. But I'm not sure about
>| what all that code does. Should give you a start though.
>Seems right, only compression doesn't fit my (attempt at) design of a
>crypto plugin above.  Or it does, but it'd have to be adjusted, because
>compressing something changes its size, while encrypting it doesn't.

Nor mine.

I would tend to bite the bullet and split compression and encryption
into separate plugins.

Are all encryptions set-length? (Is that an official security term?)

Maybe there is some way of doing the compression after and before
encryption and decryption, yes.

-- 
mjt


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30  0:41             ` Hubert Chan
@ 2004-05-30 12:29               ` mjt
  2004-05-30 16:54                 ` Hubert Chan
  0 siblings, 1 reply; 41+ messages in thread
From: mjt @ 2004-05-30 12:29 UTC (permalink / raw)
  To: Hubert Chan; +Cc: reiserfs-list

On Sat, May 29, 2004 at 08:41:48PM -0400, Hubert Chan wrote:

>Modern ciphers should be fairly resistant to known-plaintext attacks, I

This is a lame question, but what does a known-plaintext attack
imply?

The cracker wants to know the plaintext passphrase instead of some
random string whose hash matches the original passphrase's hash?

>think.  When you put a filesystem on a loopback, you've essentially got
>a known-plaintext, because filesystems typically start with a magic
>number.  (Assuming the attacker knows what filesystem you're using.)

The above hypothesis may not fit in here..

Thanks!

-- 
mjt


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30 12:29               ` mjt
@ 2004-05-30 16:54                 ` Hubert Chan
  0 siblings, 0 replies; 41+ messages in thread
From: Hubert Chan @ 2004-05-30 16:54 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "Markus" == Markus Törnqvist <mjt@nysv.org> writes:

Markus> On Sat, May 29, 2004 at 08:41:48PM -0400, Hubert Chan wrote:

Markus> This is a lame question, but what does a known-plaintext attack
Markus> imply?

Markus> The cracker wants to know the plaintext passphrase instead of
Markus> some random string whose hash matches the original passphrase's
Markus> hash?

Known plaintext mostly applies to encryption -- I don't know about any
version that applies to hashes.  In a known-plaintext attack, the
attacker has a plaintext and the matching encrypted data, and tries to
retrieve the key from that.

e.g. if the attacker knows the first several bytes of your file, they
may be able to retrieve the key and decrypt the rest of the file.
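
A toy illustration of that scenario, using repeating-key XOR. This is
deliberately a weak cipher chosen because it breaks instantly; modern
ciphers are designed to resist exactly this. The key, the file content
and the assumption that the attacker knows the key length are all made
up for the example.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy repeating-key XOR; encrypting and decrypting are the same."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


secret_key = b"k3y!"
plaintext = b"RIFF followed by data the attacker has never seen"
ciphertext = xor_cipher(plaintext, secret_key)

# The attacker knows the file starts with a 4-byte magic number and
# (by assumption) that the key is 4 bytes long.
known_prefix = b"RIFF"
recovered_key = bytes(c ^ p for c, p in zip(ciphertext, known_prefix))
recovered_plaintext = xor_cipher(ciphertext, recovered_key)
```

Four known bytes are enough to recover the whole key here, after which
the rest of the file decrypts normally.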

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30 12:27             ` mjt
@ 2004-05-30 17:09               ` Hubert Chan
  2004-05-31  0:07                 ` The Amazing Dragon
  2004-05-30 17:13               ` Hubert Chan
  2004-05-31  0:45               ` David Masover
  2 siblings, 1 reply; 41+ messages in thread
From: Hubert Chan @ 2004-05-30 17:09 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "Markus" == Markus Törnqvist <mjt@nysv.org> writes:

[...]

Markus> I can't remember how MD5 works. It's a 32-byte 7-bit hash of
Markus> data?  It has also something to do with the length of the data,
Markus> if you know how long the data is, it's easier to do a successful
Markus> birthday attack.

Huh?  A birthday attack isn't related to retrieving the original text
from the hash.  A birthday attack just shows how you can find two
strings that hash to the same value.
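
To make that concrete, here is a small sketch that finds two different
inputs with the same hash. Full MD5 is far out of reach for this naive
search, so the hash is truncated to two bytes purely for illustration;
the function names are made up.

```python
import hashlib


def short_hash(data: bytes, nbytes: int = 2) -> bytes:
    """First nbytes of MD5, deliberately tiny so collisions come quickly."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Return two different inputs with the same truncated hash."""
    seen = {}  # truncated hash -> first input that produced it
    counter = 0
    while True:
        msg = str(counter).encode()
        h = short_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
        counter += 1


first, second = find_collision()
```

With an n-bit hash a collision is expected after roughly 2^(n/2)
attempts, so a 16-bit hash falls after a few hundred tries. Neither
input was chosen in advance, which is why this does not help recover a
particular passphrase.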

If you know how long the data is, it (greatly) reduces the space that
you need to search.  I don't know of any other vulnerabilities related
to known length.

BTW, MD5 has some known vulnerabilities related to hash collisions.  I
don't know the details -- I've only heard it mentioned briefly -- but
extremely paranoid people will probably want to use SHA1 instead.

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30 12:27             ` mjt
  2004-05-30 17:09               ` Hubert Chan
@ 2004-05-30 17:13               ` Hubert Chan
  2004-05-30 18:06                 ` mjt
  2004-05-31  0:45               ` David Masover
  2 siblings, 1 reply; 41+ messages in thread
From: Hubert Chan @ 2004-05-30 17:13 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "Markus" == Markus Törnqvist <mjt@nysv.org> writes:

[...]

Markus> Maybe there is some way of doing the compression after and
Markus> before encryption and decryption, yes.

Note that if you want to do encryption and compression, you must do
compression first, and then encryption.  When you encrypt, you end up
with what looks like random data, which typically has very bad
compression characteristics (i.e. your file size is more likely to grow
than to shrink).
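
A quick way to see this is the following sketch in Python.  It assumes
that the output of a good cipher is indistinguishable from random bytes,
so os.urandom is used as a stand-in for real ciphertext (no actual
encryption is performed here).

```python
import os
import zlib

# Highly redundant plaintext, and random bytes standing in for ciphertext.
plaintext = b"the quick brown fox jumps over the lazy dog\n" * 1000
ciphertext_like = os.urandom(len(plaintext))

compressed_plain = zlib.compress(plaintext, 9)
compressed_cipher = zlib.compress(ciphertext_like, 9)

print("plaintext:      ", len(plaintext), "->", len(compressed_plain))
print("ciphertext-like:", len(ciphertext_like), "->", len(compressed_cipher))
```

The redundant text shrinks dramatically, while the random-looking input
comes out slightly larger than it went in because of the container
overhead.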

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30 17:13               ` Hubert Chan
@ 2004-05-30 18:06                 ` mjt
  0 siblings, 0 replies; 41+ messages in thread
From: mjt @ 2004-05-30 18:06 UTC (permalink / raw)
  To: Hubert Chan; +Cc: reiserfs-list

On Sun, May 30, 2004 at 01:13:50PM -0400, Hubert Chan wrote:

>Note that if you want to do encryption and compression, you must do
>compression first, and then encryption.  When you encrypt, you end up
>with what looks like random data, which typically has very bad
>compression characteristics (i.e. your file size is more likely to grow
>than to shrink).

But if compression is tied to size, this would change matters?

Otherwise yes..

Maybe compression and encryption should be different things?

-- 
mjt


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30 17:09               ` Hubert Chan
@ 2004-05-31  0:07                 ` The Amazing Dragon
  0 siblings, 0 replies; 41+ messages in thread
From: The Amazing Dragon @ 2004-05-31  0:07 UTC (permalink / raw)
  To: Hubert Chan; +Cc: reiserfs-list

> From: Hubert Chan <hubert@uhoreg.ca>
> >>>>> "Markus" == Markus Törnqvist <mjt@nysv.org> writes:
> Markus> I can't remember how MD5 works. It's a 32-byte 7-bit hash of
> Markus> data?  It has also something to do with the length of the data,
> Markus> if you know how long the data is, it's easier to do a successful
> Markus> birthday attack.
> 
> Huh?  A birthday attack isn't related to retrieving the original text
> from the hash.  A birthday attack just shows how you can find two
> strings that hash to the same value.
> 
> If you know how long the data is, it (greatly) reduces the space that
> you need to search.  I don't know of any other vulnerabilities related
> to known length.
> 
> BTW, MD5 has some known vulnerabilities related to hash collisions.  I
> don't know the details -- I've only heard it mentioned briefly -- but
> extremely paranoid people will probably want to use SHA1 instead.

MD5 is 128 bits, so 16 bytes, not 32 bytes. Length is irrelevant. You're
just as likely to find a 16 byte string that matches a particular MD5
hash value, as a 2GB string that matches a particular MD5 hash value. MD5
is strictly based on the value of the data, not its length. At 128 bits,
though, finding a match is very difficult without breaking MD5. There was
some research that produced collisions that reached pretty deep into MD5,
but no complete break has ever been published. The researcher who did the
work was absorbed into a German intelligence agency though...

SHA1 aka SHA160 isn't broken, nor are there any serious attacks on it.
SHA1 though has insufficient length to truly protect against the
birthday attack. You need around 2^81 strings to perform an attack on
SHA1, too large for currently known installations, but if you want to
defend for 20 years (the standard length of time used) then such an
installation will likely be feasible. Thankfully SHA256/384/512 are out.
SHA512 is currently overkill, but for the truly paranoid nothing less
will do.
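
The digest sizes are easy to check.  A small sketch in Python, using only
hashlib; the "birthday" column is the generic 2^(n/2) estimate for an
n-bit digest, not a statement about any specific attack.

```python
import hashlib

sizes = {}
for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    nbytes = hashlib.new(name).digest_size
    sizes[name] = nbytes
    print(f"{name}: {nbytes} bytes = {nbytes * 8} bits, "
          f"birthday ~2^{nbytes * 4}")

# The digest length does not depend on the input length.
short = hashlib.md5(b"x").digest()
long_ = hashlib.md5(b"x" * 1_000_000).digest()
print(len(short), len(long_), len(hashlib.md5(b"x").hexdigest()))
```

An MD5 digest is 16 bytes whatever the input size, but its usual
hexadecimal form is 32 characters, which is possibly where a figure of
32 comes from.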

> From: Hubert Chan <hubert@uhoreg.ca>
> >>>>> "Markus" == Markus Törnqvist <mjt@nysv.org> writes:
> Markus> Maybe there is some way of doing the compression after and
> Markus> before encryption and decryption, yes.
> 
> Note that if you want to do encryption and compression, you must do
> compression first, and then encryption.  When you encrypt, you end up
> with what looks like random data, which typically has very bad
> compression characteristics (i.e. your file size is more likely to grow
> than to shrink).

Yup, there are additional reasons as well. Compression greatly increases
the Unicity distance (in fact ideal compression will make the Unicity
distance infinite), making it much harder to distinguish an incorrect key
from the correct key. Further, by compressing first you reduce the amount
of data you need to encrypt, resulting in a large speed increase.


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \   (    |         EHeM@cs.pdx.edu      PGP 8881EF59         |    )   /
  \_  \   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
    \___\_|_/82 04 A1 3C C7 B1 37 2A*E3 6E 84 DA 97 4C 40 E6\_|_/___/



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-30 12:27             ` mjt
  2004-05-30 17:09               ` Hubert Chan
  2004-05-30 17:13               ` Hubert Chan
@ 2004-05-31  0:45               ` David Masover
  2004-05-31  8:38                 ` mjt
  2004-06-01 13:25                 ` Edward Shushkin
  2 siblings, 2 replies; 41+ messages in thread
From: David Masover @ 2004-05-31  0:45 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: Valdis.Kletnieks, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Markus Törnqvist wrote:
|>actual operation.  The setup doesn't need to be in the kernel at all,
|>and in fact, I think it'd be nice to have a meta-plugin which exports
|>the plugin interface to userland, to make this sort of thing easier.
|
|
| The pseudo file system is kernel-space.
| Having something completely user-space creates yet another knob
| to worry about?
|
| What do you propose this will look like?

Well, what does writing a plugin look like?  I'm thinking making it
possible to write a userspace plugin in pretty much the same syntax as
you'd use for the kernel-based one, only you have a lot more toys in
userland (say, libpcre).  Then the next step would be perl/python/etc
bindings.

| Could there be a compromise?
| Like with the policy=smart mount option.
|
| Take the easy way where it's more efficient, the right way when it's
| more efficient.

Maybe.  But the only inefficient thing I see happening to the ultimate
"right way" of doing this is that in order to make sure that we don't
encrypt more than we have to, we have to keep track of whether each
individual block has been encrypted or not.  I don't know if that's
easily possible, though.  The place where that is inefficient is when
encrypting a huge file that doesn't change much, but I can't imagine it
being so inefficient that you'd prefer to encrypt some blocks twice.
Say this is your file:

12345678

You start to encrypt it, but when you're here:

1234|5678

someone changes something:

1234|5600

Ideally, you want to just keep going from here, but what if they changed
something towards the beginning of the file, instead?  Obviously then
the change must be encrypted, even though the original was also
encrypted.  But towards the end, we should be able to forget the '78'
and just encrypt the '00'.

| If the journal is updated atomically on writes of complete files,
| we would either have completely encrypted or completely unencrypted
| files?

This works, but is annoying, because then if we are encrypting a large
file, we can't use it until we're done encrypting.  Either that, or our
changes can't be committed until the encryption is done, which has the
problem of what I'm describing above -- first we encrypt the '78', then
we encrypt the '00'.  All while possibly wasting disk space -- suppose
it's a 9 gig file on a 10 gig partition?

| But this would require root privileges. If you can't trust your root,
| who can you trust?
|
| Isn't some memory always accessible by root solo? Or at least owned
| by the user, you, so no-one else can access it? If a bug that circumvents
| this gets into the kernel, certain sysadmins will start farting blood.
|
| But that's assuming I remember the initial conditions correctly.

Yeah, you're right, I was overestimating your level of paranoia.
Because sometimes people actually do lock things down so hard that there
are places you can't get to.  But in doing so, they cripple root so
badly that I'd never want to 0wn those systems (in either sense of the
word).

| I would tend to bite the bullet and split compression and encryption
| into separate plugins.

As long as they share a lot of code.  The initialization would have to
be different for each one, but during operation, I believe they could
literally be the exact same code, only one of them uses zlib and another
uses blowfish (or something).  I mean, reiser makes no assumptions about
eventual size until a flush to disk anyway, so what difference does
compression make?

| Maybe there is some way of doing the compression after and before
| encryption and decryption, yes.

Compress before encryption, decompress after decryption.  But here's a
question -- can we as users choose what order plugins of the same layer
are run?  You'd have to be an idiot to want to encrypt before you
compress -- an idiot, or someone who's thought of something we haven't.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLqAK3gHNmZLgCUhAQKeJg/+IMg1ezLOFMa/sjg/rNkOVLNglvZzNowy
dkZ2xu48AbjiIIl1ipll6pTz59RYgVk6hk9OMQE6xcABorFRVEAOsJ9+CwyPPWZ8
37GfhxqDlprmjuOpIysl+fIKzaFwJolu5QAi4aRkUlXMjsh/Tg8LLB94Kot3Unwu
qyBuCppm6HlLtO+1KJZYEN3qTbMt395BdqFd7SxX2rcpu+npkEanDi9LcQuE8AAw
Jd6SM5Q2j2iiXPSwGKvp9Jm0WGrvfU09stAeQYK7LRB7iAw0b/S/jAH4q1PoCP6U
G6Ujq0RhliwkQSV6X2w3KUc5cQLVaOtXRzKLG+fomvgFvTi9YWzYuyH3ZzIis1zX
CNh2ZsUs4xpHlHnNKg/s/KjU1zHhD3qTYPa1xqsaFGl2woOqoxbMR/PuEMUfj4y0
tA7OSjQ3gwL4bGBv3+9ULdqhTkbnnXj2bRRF+iRGmib/pstb32SM6DeD+KIBsUDE
1NK+AVksJlqhJIt+1M0iVpLr8Lap+ltxUapWtkAgV0syDGF3+yPaKxrABdvWFm0C
b5NbqtUzXlklNi5dcTiX2qWR8ucMLwjtS1UexkEMkuYOAPr+j/dvtLTtTs4Q0PoZ
grs16zmcJY5YW4MaJ8diZX6tQVxzK0NtNroq6niyQlsWI41Aa/2ro2vvF6mKgADg
d9vUzjwdNzE=
=CIEZ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31  0:45               ` David Masover
@ 2004-05-31  8:38                 ` mjt
  2004-05-31 15:12                   ` David Masover
  2004-05-31 15:16                   ` Hubert Chan
  2004-06-01 13:25                 ` Edward Shushkin
  1 sibling, 2 replies; 41+ messages in thread
From: mjt @ 2004-05-31  8:38 UTC (permalink / raw)
  To: David Masover; +Cc: Valdis.Kletnieks, reiserfs-list

On Sun, May 30, 2004 at 07:45:32PM -0500, David Masover wrote:

>Well, what does writing a plugin look like?  I'm thinking making it
>possible to write a userspace plugin in pretty much the same syntax as
>you'd use for the kernel-based one, only you have a lot more toys in
>userland (say, libpcre).  Then the next step would be perl/python/etc
>bindings.

This would be a good idea. I would also like to see some python
module that uses ..metas/ as the interface.

>This works, but is annoying, because then if we are encrypting a large
>file, we can't use it until we're done encrypting.  Either that, or our
>changes can't be committed until the encryption is done, which has the
>problem of what I'm describing above -- first we encrypt the '78', then
>we encrypt the '00'.  All while possibly wasting disk space -- suppose
>it's a 9 gig file on a 10 gig partition?

Let's see what the Namesys guys have to say about this, but somehow
this must be reasonably solved.

Maybe they're sitting on some spec and just haven't had the time
to tell us.

>| I would tend to bite the bullet and split compression and encryption
>| into separate plugins.
>
>As long as they share a lot of code.  The initialization would have to
>be different for each one, but during operation, I believe they could
>literally be the exact same code, only one of them uses zlib and another
>uses blowfish (or something).  I mean, reiser makes no assumptions about
>eventual size until a flush to disk anyway, so what difference does
>compression make?

This was another case of me not thinking things through, that happens.

Have you, by any chance, had time to look at the code?
I (probably the entire list, too ;) would be interested in hearing
about the inner workings of the code, and it seems the main devs don't
have time to explain.

Maybe I should take a vacation and learn C really well and read all
the Reiser4 code :)

>| Maybe there is some way of doing the compression after and before
>| encryption and decryption, yes.
>Compress before encryption, decompress after decryption.  But here's a

This was addressed before, but I was under the impression that the other
way around was easier/better. Oh well, you live, you learn.

But it seems possible that you can't do both with the same plugin.
If it's echo zlib or echo blowfish, unless there's echo zlib+blowfish,
which would actually be much cooler with symlinks.

I heard once that using many encryptions may reduce the level of security.
I don't know if there's any real truth behind this. But if there is,
maybe it should be impossible to support two cryptos, only a crypto
and a compression.

>question -- can we as users choose what order plugins of the same layer
>are run?  You'd have to be an idiot to want to encrypt before you
>compress -- an idiot, or someone who's thought of something we haven't.

I'd probably vote for myself being an idiot, but I prefer the terms
inexperienced and ignorant ;)

But people have a tendency to think of stuff that other people haven't
so nothing should be discarded straight away.

-- 
mjt


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31  8:38                 ` mjt
@ 2004-05-31 15:12                   ` David Masover
  2004-05-31 17:20                     ` Hubert Chan
  2004-05-31 15:16                   ` Hubert Chan
  1 sibling, 1 reply; 41+ messages in thread
From: David Masover @ 2004-05-31 15:12 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: Valdis.Kletnieks, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Markus Törnqvist wrote:

| Let's see what the Namesys guys have to say about this, but somehow
| this must be reasonably solved.
|
| Maybe they're sitting on some spec and just haven't had the time
| to tell us.

Hope springs eternal.  Namesys guys?

| Have you, by any chance, had time to look at the code?

| Maybe I should take a vacation and learn C really well and read all
| the Reiser4 code :)

I know C tolerably well, I just need to spend the time learning
reiser4's code.  This is currently impossible.  I'm in high school and
on the verge of failing math (due to laziness, not lack of ability).
However, summer vacation is in a little over a week.

| But it seems possible that you can't do both with the same plugin.
| If it's echo zlib or echo blowfish, unless there's echo zlib+blowfish,
| which would actually be much cooler with symlinks.

Easy:

$ ln -s cryptos/zlib crypto/1
$ ln -s cryptos/blowfish crypto/2

Not as graceful when only one is wanted, though.

| I heard once that using many encryptions may reduce the level of security.
| I don't know if there's any real truth behind this. But if there is,
| maybe it should be impossible to support two cryptos, only a crypto
| and a compression.

I'd rather allow people to be as stupid as they choose, but this would
make the above syntax a bit less ugly:

$ cd ..metas/crypto
$ ln -s ciphers/blowfish cipher
$ echo 256 > bits
$ cd ../compress
$ ln -s codecs/zlib codec
$ echo 9 > level

This allows us to allow things like bzip2 instead.  Default should
probably be zlib, at whatever its default compression level is, on files
which are encrypted.  (Non-encrypted files have no default compression.)

|>question -- can we as users choose what order plugins of the same layer
|>are run?  You'd have to be an idiot to want to encrypt before you
|>compress -- an idiot, or someone who's thought of something we haven't.
|
|
| I'd probably vote for myself being an idiot, but I prefer the terms
| inexperienced and ignorant ;)
|
| But people have a tendency to think of stuff that other people haven't
| so nothing should be discarded straight away.

So maybe we want some sort of way of stacking these?  I'm not sure how
to do this gracefully and still use the above syntax -- I've seen it for
less than a weekend and I'm already attached to it :D

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLtLeHgHNmZLgCUhAQKrHw//Wnw7IBuPdPiIHpDKvMiMqqiroSnxkZ+q
oiY4zFg+0Gxtzt2+gVww0MQFH+4jjs7EgmrtJ+Bl3XbAEJDTzYADUdVVB8KlCQDa
TCyj3+xNkRLY6+D9iHUTaITnQVQYge4WNQ6QURhmmA/azrc9RvOofrCALl3oiqdG
IozXlw+fhSaSocHsrIsVf7bUB5UjQ5mRyW/y+sGYTcwxq/bw2OdVUzsHwQZ1g9Cy
9J+xqrDDmEF5uoGwXZvES7NXXxdE7N+XAhJAStEBz23PoN3JYM591WuLI18espfr
omvMmVMYFumn49e4dVIq7w0gHBseCSrjhsRTOMgnv+3hlafrjDXQswt247aAtqbY
x9J6sgqpI6tJYkbiT8AHXQHZkF0nzpyfjH2g1L+D6xRkfJsxdi6lNVUQkwwD9tNA
WltES/jbII6/Aie6VXgky3uDJNZT87oVv0mfzre4QzAWUq1XvWN1omIztODnYDpz
5Q4kPOdUT9AgkLI+dbi0l58VjSTAPhGDLIeJoZpqCLJeTP+YVZAwks96o9AqsNl2
SKAR2Ol11Xm2MlHeXeUD81Y4S0dL/wlwz2SaNgex8l2vjaMdm/RCwkyq+9FnMvR6
n5/YX2FF7/raM+T//Uq6NqDE7a6gu+AV4/FO13WNcu5DdNO1SVbH4IoTUqe1Iyxv
oKIoaUdEWY0=
=MuFR
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31  8:38                 ` mjt
  2004-05-31 15:12                   ` David Masover
@ 2004-05-31 15:16                   ` Hubert Chan
  1 sibling, 0 replies; 41+ messages in thread
From: Hubert Chan @ 2004-05-31 15:16 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "Markus" == Markus Törnqvist <mjt@nysv.org> writes:

[...]

Markus> I heard once that using many encryptions may reduce the level of
Markus> security.  I don't know if there's any real truth behind
Markus> this. But if there is, maybe it should be impossible to support
Markus> two cryptos, only a crypto and a compression.

I would have a hard time believing that double encryption would weaken
anything (unless you're using ROT13).  Double encryption might not make
anything stronger, or might not strengthen more than you would expect
(e.g. double DES is not "twice" as strong as single DES), though.  3-DES
is basically just DES three times.

I would also imagine that encrypting twice, each time with different
ciphers, has a very good chance of being stronger than just a single
encryption.
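
The ROT13 exception is easy to demonstrate: applying it twice gives back
the plaintext, so "double encryption" there is no encryption at all.  A
minimal sketch in Python using the standard rot_13 codec:

```python
import codecs

msg = "Attack at dawn"
once = codecs.encode(msg, "rot_13")    # "encrypted" once
twice = codecs.encode(once, "rot_13")  # "encrypted" a second time

print(once)   # Nggnpx ng qnja
print(twice)  # Attack at dawn
```

This only shows the degenerate case where the second pass undoes the
first; it says nothing about stacking two real ciphers with independent
keys.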

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31 15:12                   ` David Masover
@ 2004-05-31 17:20                     ` Hubert Chan
  2004-05-31 21:14                       ` David Masover
  0 siblings, 1 reply; 41+ messages in thread
From: Hubert Chan @ 2004-05-31 17:20 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "David" == David Masover <ninja@slaphack.com> writes:

David> Markus Törnqvist wrote:

David> | Maybe I should take a vacation and learn C really well and read
David> | all the Reiser4 code :)

David> I know C tolerably well, I just need to spend the time learning
David> reiser4's code.

I know C, but Reiser4 is very big and contains many files.  (I'm also
not a filesystem programmer, and also don't have much time, but that's
beside the point ;-).)  I think that what would be most helpful is a map
of all the source files.  Hans says that Namesys code is very well
commented, which a very casual glance over the code seems to confirm
(except that lines seem to be wrapped at 90 characters instead of 80!).
But without a high-level overview of all the files, and how they all fit
together, it's still hard to read.

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 23:19           ` David Masover
@ 2004-05-31 18:27             ` Valdis.Kletnieks
  2004-05-31 21:23               ` David Masover
  0 siblings, 1 reply; 41+ messages in thread
From: Valdis.Kletnieks @ 2004-05-31 18:27 UTC (permalink / raw)
  To: David Masover; +Cc: Hubert Chan, reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 458 bytes --]

On Sat, 29 May 2004 18:19:01 CDT, David Masover said:

> Files are not encrypted by default, so encrypted files should allow a
> passphrase to be checked by default.  In balancing security and
> convenience, make convenience the default.

Note that if 98% of the files on a file system are plaintext, then
somebody can infer quite a bit merely by knowing what 2% are
in fact encrypted.  If they're all encrypted, an attacker can't
leverage that knowledge.



[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-29 23:16           ` David Masover
  2004-05-30  0:41             ` Hubert Chan
  2004-05-30 12:27             ` mjt
@ 2004-05-31 18:31             ` Valdis.Kletnieks
  2004-05-31 21:15               ` David Masover
  2 siblings, 1 reply; 41+ messages in thread
From: Valdis.Kletnieks @ 2004-05-31 18:31 UTC (permalink / raw)
  To: David Masover; +Cc: Markus Törnqvist, reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 629 bytes --]

On Sat, 29 May 2004 18:16:47 CDT, David Masover said:

> $ ln -s /tmp/foobar cipher
> 
> Of course, the nice thing here is that at least 'ln' refuses to create a
> broken symlink, not sure about more low-level things.

Odd.  At least my 'ln' is perfectly happy creating a dangling symlink:

[~]2 ls /tmp/foo
ls: /tmp/foo: No such file or directory
[~]2 ln -s /tmp/foo frobozz
[~]2 ls -l frobozz
lrwxrwxrwx  1 valdis valdis 8 May 31 14:28 frobozz -> /tmp/foo
[~]2 ls -lL frobozz
ls: frobozz: No such file or directory
[~]2 rpm -qf /bin/ln
coreutils-5.2.1-10

What did you mean by "broken" in that context?
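
The same behaviour can be checked from a script.  A minimal sketch in
Python, assuming a POSIX system; the names foo and frobozz simply mirror
the transcript above, and everything happens inside a temporary
directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "foo")      # never created
    link = os.path.join(d, "frobozz")
    os.symlink(target, link)             # succeeds: the target is not checked
    is_link = os.path.islink(link)       # the link itself exists
    resolves = os.path.exists(link)      # following it finds nothing
    stored = os.readlink(link)           # the path recorded in the link

print(is_link, resolves, stored == target)  # True False True
```

So the underlying symlink call is just as happy as 'ln -s' to record a
path that does not exist.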



[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31 17:20                     ` Hubert Chan
@ 2004-05-31 21:14                       ` David Masover
  0 siblings, 0 replies; 41+ messages in thread
From: David Masover @ 2004-05-31 21:14 UTC (permalink / raw)
  To: Hubert Chan; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Hubert Chan wrote:
[ ... ]
| But without a high-level overview of all the files, and how they all fit
| together, it's still hard to read.

True.  Actually, one of the most desirable traits of a hacker is the
ability to take any reference and learn from it as if it were a
tutorial.  I know people who can hear me talk about things like
filesystems for a full 5 minutes, and at the end of it come back to me
and say they understood everything, they just need a few definitions
(like what a filesystem is).

But, to make it more accessible to other hackers, someone should
eventually do such a thing.  I probably will as I go -- if I read the
reiser4 code this summer, I will post a 'reiser4 cheat sheet' or something.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLugMHgHNmZLgCUhAQK2yQ//TNnS8k0CMG+redCoGlBstaPlJ53vcL90
9TkrNryd/WqfmdwJo9O79Mqtpc8je23DVeo+4Ti3aKHXZp+R0Fs8FSb/ij/rs2jD
vVfE6bfbSFHne0FCPeucnWad910kHUWcaopgm8nQOu049JbfGFr00XC1kwVwygw+
66RoYwX0Ql0YZbueZhElXY66MOS/EM+X/V6sWvVWLGWWZHgUGFyGXjY3ixudeLMG
cuy7gAkRLyt7VhfI8CT9RD0340RT1XSOmuFClm6gIZ7Oq097U/D5/ArugtvKe7TI
swdcjl+WmiLO9VraW/1x5dXXghZFAxHEUQFtoy6Yjx3Pv7+B5NSx97IP0AdDzSXT
NYErwJAOfHjnaQbUzsTEp7FrpJo8N/QjznfRzhbNqxaD6zF+OgQ0WE7rg7j5hFVr
e/cjQaLGyVWDTzTzHqmWRezPtTQp1pe53jcEA3XiNMk9/56BNyRxHNV1cEw/+jSg
tcbRodCBLrJ7FaNDTuY0/sb4jQ6OOPcz8nTAT51QA76bwI0ruE4Fnp0R8bkYWGbr
D/rpczs7q8qOU7864usAKbyraPtTkcyNc32qf2U8/sHSgxmHEAJwxM3MkZ9wH2ND
D0EdIqUrHT6YPK3r5EwlsO8oKtGQSlJTK0X+wbjniTA/I84/wPj5tA9NAmJp0Aik
EhT5z3D3mcc=
=u0RU
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31 18:31             ` Valdis.Kletnieks
@ 2004-05-31 21:15               ` David Masover
  0 siblings, 0 replies; 41+ messages in thread
From: David Masover @ 2004-05-31 21:15 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Markus_Törnqvist, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Valdis.Kletnieks@vt.edu wrote:
| On Sat, 29 May 2004 18:16:47 CDT, David Masover said:
|
|
|>$ ln -s /tmp/foobar cipher
|>
|>Of course, the nice thing here is that at least 'ln' refuses to create a
|>broken symlink, not sure about more low-level things.
[...]
| What did you mean by "broken" in that context?

Exactly what you meant.  Not sure where I got that idea, though I didn't
actually test it.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLugjHgHNmZLgCUhAQINOw//b479rqnxQQHWvEkDHyOc9iE1Hq5+65c+
A2TpcTzGyxUIrHr8FbqSLC0HIysLnu8+hL1JE6XG5awn1/2N6twkOhu2ff4cFevM
84V1qI7luXwbHlhuge+RSIPCezkj+XoO35HCUbmdW70YPgpMezbEAZoYcXyMBuK7
824hQ82nMibPIdC+6UmL114zaDVBnOs/VH+9yhhRq0fTFQxE6LL9eSbr57Ef0iDh
wVcUs5UDwlh3+DuuFsG3YuuPlzOSYhGJynIQNARUllvmlhC7QRnAPpx3urly2jzH
qqx4G6TgyZX/HVijZiyU+hnHmcC4WLpnXRRFDl3YsFsigENNc+jg+6ddvZLcxggw
8h5rMNmCcUgKYYifDSaP5P9afALgMS2FL7mNT5U0wfkxFyZM+lvOqB1KBuhhyZgq
QCkpMRxvpyNwwDSj5YngB/ZAzM3zZFto8PkXwyrgnC4+0nd2QNTyYe+Ot1FUdBeZ
yOCVlQxEdGPaDY4JlFgws3+Z/gTaPEXKWCYI099ULJ05N+yKd9zoViwMYB9ID9+Z
gCo+0sOsTwR/johGrBMoqQYYGhXkWzw2WuKYp2SpImzbmdJ3Vesdcy7zfE5JfvKD
wapPly/25foVCKhuP6ODNG7tXT9Q/NIvPMEAlQa5cPMA+iiOwyZQwr3KiaBgxXcT
y8bgfuqfgAY=
=7HMS
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31 18:27             ` Valdis.Kletnieks
@ 2004-05-31 21:23               ` David Masover
  2004-06-01  2:09                 ` Hubert Chan
  0 siblings, 1 reply; 41+ messages in thread
From: David Masover @ 2004-05-31 21:23 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Hubert Chan, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Valdis.Kletnieks@vt.edu wrote:
| On Sat, 29 May 2004 18:19:01 CDT, David Masover said:
|
|
|>Files are not encrypted by default, so encrypted files should allow a
|>passphrase to be checked by default.  In balancing security and
|>convenience, make convenience the default.
|
|
| Note that if 98% of the files on a file system are plaintext, then
| somebody can infer quite a bit merely by knowing what 2% are
| in fact encrypted.  If they're all encrypted, an attacker can't
| leverage that knowledge.

Fine, so some people will encrypt the entire filesystem, except perhaps
the scripts necessary for entering the passphrase.  But then you're
vulnerable because people can figure out exactly what size all of your
files are, or at least what size they are when compressed...

For most of us, well, I don't give a flying ... er, purple people eater
... whether people know that my pgp key is encrypted, or my ssh keys, or
a little folder called "secret".

Also, I'm not sure how relevant it is today, but I know the Germans
limited the length of an Enigma message, because a longer message means
more redundancy.

But back to the point.  No encryption by default.  My grandmother would
wonder why her computer is so slow, and if you told her it's so that
the FBI can't read anything when they seize her computer, she wouldn't
be happy.  Except for the fact that my grandmother is a very patient
woman, so she might not care that it's slow.  But my brother would, and
he plays games, so...

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQLuiPXgHNmZLgCUhAQLeWQ/8DCs+jdS2OeKwbeDOqk8t6SnfOY9SMVpx
uqHOgOMYupAL1gkKnXLLVz3JKK23OK/QXmbVRj54g6oX4vNDsAFonbQ5O6DHlDTP
3uSBLFbYFKROIhIW4+DaSlsDBPPUdP0HSNQ+HvBVmhFQluNOcJHvuO+1QO2UBM54
u53PjlEpgLRTTzYl9IjCAOHO/6Do62DOgjKa2l3b1pUob/BMXJd42lpaks5h0NS3
PZPktA/EYEiTwCmPMIGAZW6skxu4P5CFx+YveTbn43bPQt1yVH+KTBqhphoMScwB
sMpVR5FACBSHsS/xM3sZYMQUoV2jpRJZXL2ACdq7R7HiCmvykrtqW0xVPcuPnXMH
7DUqHGaFt2j0QcdY2+pQ55KdpuiwR1Bhlf5q78QYOkbY0F5eD+wmBUvVZEJYrjfy
oVLYihPs6m08qq5Q2yvkaqLPwQIzIynQzUdYC+BoV+scJkM8OMuzjc87vGNgHRs5
tQa9uEWZn5kYRR6iaOaixhCrEeJMQCWxz0ppq1G5VxbX/Pf0SaCXPIt3HmjjajMn
Vpk605Slln6b14hJ4pJfQogYV/CNcO+aKkCX5Wz42/qvlnx7XykQ1jSh18A5LE1c
m8j85Glt8FbfwgalMLSVV6UV2G61IeJfU/6VblTkRDodW8UrFs574/kqg76V7hRC
ALWqFagD0Cw=
=wOXX
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31 21:23               ` David Masover
@ 2004-06-01  2:09                 ` Hubert Chan
  2004-06-05  4:50                   ` David Masover
  0 siblings, 1 reply; 41+ messages in thread
From: Hubert Chan @ 2004-06-01  2:09 UTC (permalink / raw)
  To: reiserfs-list

>>>>> "David" == David Masover <ninja@slaphack.com> writes:

[...]

David> But back to the point.  No encryption by default.  My grandmother
David> would wonder why her computer is so slow, and if you told her
David> it's so that the FBI can't read anything when they seize her
David> computer, she wouldn't be happy.  Except for the fact that my
David> grandmother is a very patient woman, so she might not care that
David> it's slow.  But my brother would, and he plays games, so...

Well, filesystem performance probably won't affect games that much.
Unless your game is very filesystem-bound, but then it'll be slow
anyways (and the programmer should be given a swift smack to the head
for doing something that stupid).

Filesystem performance is more likely to be an issue for things such as
compiling programs, running a webserver, etc.

But I agree.  No encryption by default.  Especially since encryption is
illegal in some countries.  And because encryption would require the
user to provide a passphrase.

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: The situation at hand and in the future
  2004-05-31  0:45               ` David Masover
  2004-05-31  8:38                 ` mjt
@ 2004-06-01 13:25                 ` Edward Shushkin
  2004-06-02  8:05                   ` mjt
  1 sibling, 1 reply; 41+ messages in thread
From: Edward Shushkin @ 2004-06-01 13:25 UTC (permalink / raw)
  To: David Masover; +Cc: Markus Törnqvist, Valdis.Kletnieks, reiserfs-list

David Masover wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
>
> Markus Törnqvist wrote:
> |>actual operation.  The setup doesn't need to be in the kernel at all,
> |>and in fact, I think it'd be nice to have a meta-plugin which exports
> |>the plugin interface to userland, to make this sort of thing easier.
> |
> |
> | The pseudo file system is kernel-space.
> | Having something completely user-space creates yet another knob
> | to worry about?
> |
> | What do you propose this will look like?
>
> Well, what does writing a plugin look like?  I'm thinking making it
> possible to write a userspace plugin in pretty much the same syntax as
> you'd use for the kernel-based one, only you have a lot more toys in
> userland (say, libpcre).  Then the next step would be perl/python/etc
> bindings.
>
> | Could there be a compromise?
> | Like with the policy=smart mount option.
> |
> | Take the easy way where it's more efficient, the right way when it's
> | more efficient.
>
> Maybe.  But the only inefficient thing I see happening to the ultimate
> "right way" of doing this is that in order to make sure that we don't
> encrypt more than we have to, we have to keep track of whether each
> individual block has been encrypted or not.  I don't know if that's
> easily possible, though.  The place where that is inefficient is when
> encrypting a huge file that doesn't change much, but I can't imagine it
> being so inefficient that you'd prefer to encrypt some blocks twice.
> Say this is your file:
>
> 12345678
>
> You start to encrypt it, but when you're here:
>
> 1234|5678
>
> someone changes something:
>
> 1234|5600
>
> Ideally, you want to just keep going from here, but what if they changed
> something towards the beginning of the file, instead?  Obviously then
> the change must be encrypted, even though the original was also
> encrypted.  But towards the end, we should be able to forget the '78'
> and just encrypt the '00'.
>
> | If the journal is updated atomically on writes of complete files,
> | we would either have completely encrypted or completely unencrypted
> | files?
>
Hello.
We use a clustering approach for all crypto transforms. Each file to be
compressed and/or encrypted is treated as a set of clusters, and each
cluster is transformed atomically and independently. Moreover, for any
block (page) that contains transformed (plain) text, the transaction must
include all the other blocks (pages) of the same cluster. So there are no
such issues.
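
As a rough illustration of the cluster idea (a sketch only, not the
reiser4 code): the cluster size, the function names and the use of zlib as
the transform are all assumptions made for this example. The point it
tries to show is that a write re-transforms only the clusters it touches,
and each of those is replaced as a whole.

```python
import zlib

CLUSTER_SIZE = 64 * 1024  # hypothetical cluster size, chosen only for illustration


def split_into_clusters(data, size=CLUSTER_SIZE):
    """Cut a file's bytes into fixed-size clusters (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def transform_cluster(cluster):
    """Stand-in transform: compress one cluster independently of the others."""
    return zlib.compress(cluster)


def inverse_cluster(blob):
    """Undo transform_cluster for a single cluster."""
    return zlib.decompress(blob)


def write_at(plain, stored, offset, new_bytes, size=CLUSTER_SIZE):
    """Overwrite bytes inside the existing file.

    Only the clusters that the write touches are re-transformed, and each
    one is replaced as a whole. Returns the sorted indices of the clusters
    that were re-transformed.
    """
    touched = set()
    for k, value in enumerate(new_bytes):
        idx, off = divmod(offset + k, size)
        buf = bytearray(plain[idx])
        buf[off] = value
        plain[idx] = bytes(buf)
        touched.add(idx)
    for idx in touched:
        stored[idx] = transform_cluster(plain[idx])
    return sorted(touched)
```

With a 4-cluster file, a write that straddles the boundary between the
first two clusters re-transforms exactly those two and leaves the rest of
the stored data alone.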

> This works, but is annoying, because then if we are encrypting a large
> file, we can't use it until we're done encrypting.  Either that, or our
> changes can't be committed until the encryption is done, which has the
> problem of what I'm describing above -- first we encrypt the '78', then
> we encrypt the '00'.  All while possibly wasting disk space -- suppose
> it's a 9 gig file on a 10 gig partition?
>
> | But this would require root privileges. If you can't trust your root,
> | who can you trust?
> |
> | Isn't some memory always accessible by root solo? Or at least owned
> | by the user, you, so no-one else can access it? If a bug that
> | circumvents this gets into the kernel, certain sysadmins will start
> | farting blood.
> |
> | But that's assuming I remember the initial conditions correctly.
>
> Yeah, you're right, I was overestimating your level of paranoia.
> Because sometimes people actually do lock things down so hard that there
> are places you can't get to.  But in doing so, they cripple root so
> badly that I'd never want to 0wn those systems (in either sense of the
> word).
>
> | I would tend to bite the bullet and split compression and encryption
> | into separate plugins.
>
> As long as they share a lot of code.  The initialization would have to
> be different for each one, but during operation, I believe they could
> literally be the exact same code, only one of them uses zlib and another
> uses blowfish (or something).

zlib_deflate is CPU-hungry because it is general-purpose, and whether to use
it for compressing clusters (which should be small) is a big open question.

> I mean, reiser makes no assumptions about
> eventual size until a flush to disk anyway, so what difference does
> compression make?
>
> | Maybe there is some way of doing the compression after and before
> | encryption and decryption, yes.
>
> Compress before encryption, decompress after decryption.  But here's a
> question -- can we as users choose what order plugins of the same layer
> are run?  You'd have to be an idiot to want to encrypt before you
> compress -- an idiot, or someone who's thought of something we haven't.

Yes, compression after encryption is a horror, which is why the order of
transforms is built in and cannot be changed by users. The user can only
assign plugins. For each kind of transform there is a special "none" plugin,
which means the absence of that transform.
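
To see why the order matters, here is a small sketch. The cipher is a toy
XOR keystream built from SHA-256, assumed purely for illustration (it is
neither the reiser4 crypto plugin nor something to use for real data); zlib
stands in for the compression transform.

```python
import hashlib
import zlib


def keystream(key, length):
    """Toy keystream: SHA-256 over key plus a counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key, data):
    """XOR with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def compress_then_encrypt(key, data):
    """The fixed order described above: compress first, then encrypt."""
    return toy_encrypt(key, zlib.compress(data))


def encrypt_then_compress(key, data):
    """The 'horror' order: the ciphertext looks random, so zlib gains nothing."""
    return zlib.compress(toy_encrypt(key, data))
```

On highly repetitive input the first order shrinks the data to a small
fraction of its size, while the second leaves it about as large as the
original, because good ciphertext has no redundancy left to compress.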

Edward.



* Re: The situation at hand and in the future
  2004-05-29 15:49         ` mjt
  2004-05-29 23:16           ` David Masover
@ 2004-06-02  2:45           ` Hans Reiser
  1 sibling, 0 replies; 41+ messages in thread
From: Hans Reiser @ 2004-06-02  2:45 UTC (permalink / raw)
  To: Markus Törnqvist, Edward Shishkin
  Cc: David Masover, Valdis.Kletnieks, reiserfs-list

Markus Törnqvist wrote:

>
>Unfortunately the error codes in the code are named Edward, who doesn't
>appear in the README...
>  
>

Edward, add this credit for yourself to both the readme and the 
mkreiserfs credits when it starts to work:

Edward Shushkin wrote the reiser4 compression and crypto plugins



* Re: The situation at hand and in the future
  2004-06-01 13:25                 ` Edward Shushkin
@ 2004-06-02  8:05                   ` mjt
  2004-06-02 12:51                     ` Edward Shushkin
  0 siblings, 1 reply; 41+ messages in thread
From: mjt @ 2004-06-02  8:05 UTC (permalink / raw)
  To: Edward Shushkin; +Cc: David Masover, Valdis.Kletnieks, reiserfs-list

On Tue, Jun 01, 2004 at 05:25:03PM +0400, Edward Shushkin wrote:
>is builtin and can not be changed by users. The user can only assign 
>plugins. For each
>kind of transform there is special "none" plugin which means absence of 
>this transform.

Is there any chance of this looking more like a crypto/ directory
than a crypto file?

I really liked David Masover's ideas and I'd like to hear if they
are somehow infeasible. To me they make sense as practical.

Thanks!

-- 
mjt



* Re: The situation at hand and in the future
  2004-06-02  8:05                   ` mjt
@ 2004-06-02 12:51                     ` Edward Shushkin
  2004-06-02 15:15                       ` mjt
  0 siblings, 1 reply; 41+ messages in thread
From: Edward Shushkin @ 2004-06-02 12:51 UTC (permalink / raw)
  To: Markus Törnqvist; +Cc: David Masover, Valdis.Kletnieks, reiserfs-list

Markus Törnqvist wrote:

>On Tue, Jun 01, 2004 at 05:25:03PM +0400, Edward Shushkin wrote:
>  
>
>>is builtin and can not be changed by users. The user can only assign 
>>plugins. For each
>>kind of transform there is special "none" plugin which means absence of 
>>this transform.
>>    
>>
>
>Is there any chance of this looking more like a crypto/ directory
>than a crypto file?
>  
>
'Looking more' sounds incorrect here (does a directory look more than a
file?). You might want a special (directory) object plugin. All ideas should
be implemented via its methods.

>I really liked David Masover's ideas and I'd like to hear if they
>are somehow infeasible.
>
I am a bit confused: usually I am the one who persuades everyone that
something is feasible ;)

Edward.

>To me they make sense as practical.
>
>Thanks!
>
>  
>


* Re: The situation at hand and in the future
  2004-06-02 12:51                     ` Edward Shushkin
@ 2004-06-02 15:15                       ` mjt
  0 siblings, 0 replies; 41+ messages in thread
From: mjt @ 2004-06-02 15:15 UTC (permalink / raw)
  To: Edward Shushkin; +Cc: David Masover, Valdis.Kletnieks, reiserfs-list

On Wed, Jun 02, 2004 at 04:51:43PM +0400, Edward Shushkin wrote:
>Markus Törnqvist wrote:
>>Is there any chance of this looking more like a crypto/ directory
>>than a crypto file?
>'Looking more' sounds incorrect here (does  directory  look more than
>file? )
>You might want special (directory) object plugin. All ideas should be 
>implemented via its methods..

I meant what it looks like to the end user.
crypto/ directory where we have files like seed and whatnot.

Even if the difference between these two is somewhat diffuse
in Reiser4 ;)

>>I really liked David Masover's ideas and I'd like to hear if they
>>are somehow infeasible.
>I am a bit confused: usually I inspire everyone that something is 
>feasible ;)

Well, I think they are feasible ;)

-- 
mjt



* Re: The situation at hand and in the future
  2004-06-01  2:09                 ` Hubert Chan
@ 2004-06-05  4:50                   ` David Masover
  2004-06-05  7:30                     ` Valdis.Kletnieks
  0 siblings, 1 reply; 41+ messages in thread
From: David Masover @ 2004-06-05  4:50 UTC (permalink / raw)
  To: Hubert Chan; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Hubert Chan wrote:
|>>>>>"David" == David Masover <ninja@slaphack.com> writes:
|
|
| [...]
|
| David> But back to the point.  No encryption by default.  My grandmother
| David> would wonder why her computer is so slow, and if you told her
| David> it's so that the FBI can't read anything when they seize her
| David> computer, she wouldn't be happy.  Except for the fact that my
| David> grandmother is a very patient woman, so she might not care that
| David> it's slow.  But my brother would, and he plays games, so...
|
| Well, filesystem performance probably won't affect games that much.
| Unless your game is very filesystem-bound, but then it'll be slow
| anyways (and the programmer should be given a swift smack to the head
| for doing something that stupid).

Suppose we've got a game which dynamically loads a level, such that the
player only has to watch the thing load when it boots up.  Now, if our
filesystem is too slow, the horizon in the game shrinks.

Or suppose we simply don't want people to have to wait 2-3 times as long
for a level to load, or for the game to load.  Games tend to be huge
apps these days (or at least have a huge amount of content to load at
once).  Every few minutes (or 5-10 mins) in ut2004, the server changes
levels.  The levels currently take 15-30 seconds to load, most of it
pure disk activity.

Also, what do you mean by "filesystem-bound"?  Most games I've seen make
their own databases -- there's things all over with names like
"textures.mycoolgame".  I wouldn't, I'd probably have it use whatever
the reiser interface to multiple files is.  The last thing I want to
think about when writing a game engine (a huge beast on its own which I
will attempt one of these days) is data storage.

Or suppose we're talking about databases.

| Filesystem performance is more likely to be an issue for things such as
| compiling programs, running a webserver, etc.

Actually, no.  Not compiling programs, anyway.  Most of the time there
is spent doing the actual compiling, which moves surprisingly small
amounts of data at a time, although it does help to have things
buffered/cached.  And running a webserver may want encryption anyway,
depending on the kind of server it is, but mine doesn't have enough
bandwidth to want fast performance.

| But I agree.  No encryption by default.  Especially since encryption is

And I agree that excessive encryption isn't all that bad.

Enough said from the both of us.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQMFRD3gHNmZLgCUhAQJyJQ/7BrH3bHwyMVH4DlZnXf4Ntbx6ftLKVQZt
t71dyNcEHn8ksRGA711tEIgmf1wrMiPP3lm6sgOmJuHlrfi06jJi0+YsmINIYX/T
im2W5Zht6h6ie0Saa5w1QUh96vok6HF11FpqgHyctOz11yVa7zPKLWD1tXYQz79I
wAvid217Mb9FzpBu/qkptJt2NpRYshSZYdTeKx08iow7kwtDXddlmEWZ6fDLyTQJ
bunqfzUlOOgYJv4jdi715t660YsrUlswS/36q51TR1lAMYb1BiZZippV5hZpJ0wr
ECcZKnVdXVKadkAIjmnp9dconSNSmqguKNA2zpJaeCekBLsOyJkEh3EqZJcqjM4J
fPeUqKKblGmQ+oT1VRx9RbSdO40wZ9Yaxndo//jxARe0yEnRV3Rw/jjh0EfxWKH0
3XxWeDOWgGOl+92irxRHWQNszZEjPh46Y7IAIb0pxLAvTtmhqk7k5TS7yVskzncA
tzSgpk7ZH9RY2WR6v98zZjOY8bVadbbWr+2TALxDMrUJ0mhRZMQdakiMpHOLHIpr
MTTV1WO+eKlZlABmZzhF33tE0v9bec01rcM2glqj72FfNHWGAjTjvwlZeGYW/Rpj
f0eHykBou7yjSNgS/txdk0dRRl2ZWSaRL3xYtk6T80HMCgeMkx3YPSbAXszv0GY2
T01oXQxEpbc=
=rKKZ
-----END PGP SIGNATURE-----


* Re: The situation at hand and in the future
  2004-06-05  4:50                   ` David Masover
@ 2004-06-05  7:30                     ` Valdis.Kletnieks
  2004-06-05 10:07                       ` Christian Iversen
  2004-06-09 22:01                       ` David Masover
  0 siblings, 2 replies; 41+ messages in thread
From: Valdis.Kletnieks @ 2004-06-05  7:30 UTC (permalink / raw)
  To: David Masover; +Cc: Hubert Chan, reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 1667 bytes --]

On Fri, 04 Jun 2004 23:50:23 CDT, David Masover said:

> Or suppose we simply don't want people to have to wait 2-3 times as long
> for a level to load, or for the game to load.  Games tend to be huge
> apps these days (or at least have a huge amount of content to load at
> once).  Every few minutes (or 5-10 mins) in ut2004, the server changes
> levels.  The levels currently take 15-30 seconds to load, most of it
> pure disk activity.

Proper programming would probably render encryption almost-zero overhead.
Except for really high-end disk subsystems, you're not going to get an *effective*
(i.e. after seeks and rotational delay and all that) throughput of much over
40 megabytes/second - and encrypting/decrypting at 40 megabytes/second
on modern hardware should be a no-brainer if you are sensible about key
scheduling and the like.

The only remaining point is to make sure you overlap the CPU and I/O operations.
If you wait for it to read in all 14 megabytes and then decrypt, it will be slow.
If you have enough sense to decrypt the first 64K while the next 64K is reading,
and so on, it should be nearly unnoticeable.
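
A minimal sketch of that overlap, under stated assumptions: a reader
thread fetches the next chunk while the main thread decrypts the previous
one. The names, the 64K chunk size and the single-byte XOR "cipher" are
placeholders for illustration, not a real decryption routine.

```python
import io
import queue
import threading

CHUNK = 64 * 1024  # chunk size taken from the 64K figure above

# Lookup table for the placeholder cipher: XOR every byte with 0x5A.
_XOR_TABLE = bytes(b ^ 0x5A for b in range(256))


def toy_decrypt(chunk):
    """Placeholder for a real cipher; XOR is its own inverse."""
    return chunk.translate(_XOR_TABLE)


def pipelined_decrypt(stream, chunk_size=CHUNK, depth=2):
    """Decrypt a stream while a background thread reads ahead.

    The bounded queue lets the reader stay at most `depth` chunks ahead,
    so chunk N+1 is being read while chunk N is being decrypted.
    """
    chunks = queue.Queue(maxsize=depth)

    def reader():
        while True:
            block = stream.read(chunk_size)
            chunks.put(block)  # an empty block marks end of stream
            if not block:
                return

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    out = []
    while True:
        block = chunks.get()
        if not block:
            break
        out.append(toy_decrypt(block))
    worker.join()
    return b"".join(out)
```

With a real file the read blocks on the disk, which is where the CPU time
for decrypting the previous chunk comes from; the in-memory stream used
for testing only demonstrates the control flow.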

> Also, what do you mean by "filesystem-bound"?  Most games I've seen make
> their own databases -- there's things all over with names like
> "textures.mycoolgame".  I wouldn't, I'd probably have it use whatever
> the reiser interface to multiple files is.  The last thing I want to
> think about when writing a game engine (a huge beast on its own which I
> will attempt one of these days) is data storage.

Just remember - if it doesn't all fit in memory at once, the *first* thing you
should be worrying about is data storage ;)



[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]


* Re: The situation at hand and in the future
  2004-06-05  7:30                     ` Valdis.Kletnieks
@ 2004-06-05 10:07                       ` Christian Iversen
  2004-06-07 17:35                         ` Valdis.Kletnieks
  2004-06-09 22:01                       ` David Masover
  1 sibling, 1 reply; 41+ messages in thread
From: Christian Iversen @ 2004-06-05 10:07 UTC (permalink / raw)
  To: reiserfs-list

On Saturday 05 June 2004 09:30, Valdis.Kletnieks@vt.edu wrote:
> On Fri, 04 Jun 2004 23:50:23 CDT, David Masover said:
> > Or suppose we simply don't want people to have to wait 2-3 times as long
> > for a level to load, or for the game to load.  Games tend to be huge
> > apps these days (or at least have a huge amount of content to load at
> > once).  Every few minutes (or 5-10 mins) in ut2004, the server changes
> > levels.  The levels currently take 15-30 seconds to load, most of it
> > pure disk activity.
>
> Proper programming would probably render encryption almost-zero overhead.
> Except for really high-end disk subsystems, you're not going to get an
> *effective* (i.e. after seeks and rotational delay and all that) throughput
> of much over 40 megabytes/second - and encrypting/decrypting at 40
> megabytes/second on modern hardware should be a no-brainer if you are
> sensible about key scheduling and the like.
>
> The only remaining point is to make sure you overlap the CPU and I/O
> operations. If you wait for it to read in all 14 megabytes and then
> decrypt, it will be slow. If you have enough sense to decrypt the first 64K
> while the next 64K is reading, and so on, it should be nearly unnoticeable.

This will fix the problem of throughput, but won't it add significant
latency?

-- 
Regards,
Christian Iversen


* Re: The situation at hand and in the future
  2004-06-05 10:07                       ` Christian Iversen
@ 2004-06-07 17:35                         ` Valdis.Kletnieks
  0 siblings, 0 replies; 41+ messages in thread
From: Valdis.Kletnieks @ 2004-06-07 17:35 UTC (permalink / raw)
  To: Christian Iversen; +Cc: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 769 bytes --]

On Sat, 05 Jun 2004 12:07:43 +0200, Christian Iversen <chrivers@iversen-net.dk>  said:

> This will fix the problem of throughput, but wont it give a significant 
> latency?

Probably not - remember that the trip out to disk costs you anywhere from 6ms
to 10ms or so (5ms per I/O works out to 200 IO/sec), and the crypto shouldn't
add more than a small fraction of a millisecond.

Ballpark math - if you can do the crypto at 20 Mbytes/sec, which is slow for a
modern cipher and CPU, you add about 0.5ms for the last 4K block (assuming you
pipelined the preceding blocks and overlapped the crypto and I/O).  If your
system is running so close to the edge that the difference between 6ms and
6.5ms sinks it, you have BIGGER problems over in the capacity-management
area.....
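
The ballpark can be checked with a couple of lines. The helper names are
made up for this example, and 20 Mbytes/sec is read as 20,000,000 bytes
per second. With those assumptions the last 4K block costs roughly 0.2ms
of crypto time, a fraction of a millisecond and no worse than the figure
quoted above.

```python
def crypto_latency_ms(block_bytes, throughput_bytes_per_s):
    """Time to run one block through the cipher, in milliseconds."""
    return block_bytes / throughput_bytes_per_s * 1000.0


def ios_per_second(latency_ms):
    """How many back-to-back I/Os fit in one second at a given latency."""
    return 1000.0 / latency_ms


# Last 4K block at 20 Mbytes/sec, added on top of a 6ms trip to disk.
extra_ms = crypto_latency_ms(4096, 20e6)
total_ms = 6.0 + extra_ms
```

Here extra_ms is about 0.2 and total_ms about 6.2, and ios_per_second(5.0)
gives the 200 IO/sec mentioned above.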


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]


* Re: The situation at hand and in the future
  2004-06-05  7:30                     ` Valdis.Kletnieks
  2004-06-05 10:07                       ` Christian Iversen
@ 2004-06-09 22:01                       ` David Masover
  2004-06-10  8:23                         ` mjt
  1 sibling, 1 reply; 41+ messages in thread
From: David Masover @ 2004-06-09 22:01 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Hubert Chan, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Valdis.Kletnieks@vt.edu wrote:
[...]
| Proper programming would probably render encryption almost-zero overhead.
| Except for really high-end disk subsystems, you're not going to get an
| *effective* (i.e. after seeks and rotational delay and all that)
| throughput of much over 40 megabytes/second - and encrypting/decrypting
| at 40 megabytes/second

Yes, I know this.  Consider:  On Windows 9x (and 3.1, of course), when
you read from a floppy, it's unbelievably slow.  True, this doesn't
matter much when you consider how slow the physical medium is.  But
while it's actually reading it, you cannot do anything else.

So, multitasking matters.  Suppose I've got this game (because games are
fun) with two threads -- the player-kills-all thread, and the
load-rest-of-level thread.  So the player isn't going to move for awhile
-- he's getting shot at too much -- but the game wants to load a little
more detail from disk to make the sky and/or landscape look prettier.
Without encryption, big deal -- the level-loading thread asks for the
data from the disk, then goes to sleep until it arrives, thus freeing
resources for the immediate gameplay.  With encryption, if we are trying
to load things from disk, we are also going to have to spend some CPU
decrypting them which we wouldn't otherwise.  Not much latency -- unless
you need CPU to make sure the player doesn't lag.  Or you put the thread
on a low priority, which works -- unless you need as close to 100% CPU
as possible for higher-priority threads, which drives said latency up
insanely.

Of course, most of the time people can just buy a new CPU.

Should we take this off-list?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQMeIrXgHNmZLgCUhAQJLkhAAj97ah2lvz97mmtVFvydmZ+nJPS1f/OBP
gifHHcE8ZGe/PN2Iy+1PQVz2P1GkDmk2dW8/HIid+aHYv0UZsISw6Y6rQ2Kyxr9y
OULkebG+qgV5zxwKFUMKqkTMaxFXJBQinxvf1mdEbOMmTfxW1HzKB7FYVH3muAWO
i4knab6t720m+rWQ0q+VBKKSH52CVHCno1ncZnaRiTmZlicJQ3GSitglq/5NHsk6
3PKPXM68TMWl7ms9k6Htseba3jYcW4l9kHNjHICjAbQ4hK/KYtkNhoBeTIoVcpbW
BkGa9ThKF23/3CTYNycXZ+BUS6C7RdH9iXku5WjlbrT6dlEZDDenleiJz5HEBo94
MB74xlGLpz2O/b/A/i0iie5QJyoyOzexqh+AMoJlN9n6VUS1o5sUquoDWPhL9ZvH
1BUJnm/tk9qcOuzmkcxux0sWVGambqJah9gU8F+a0AQTiT71z4UnEb78YmWfQFg7
hXsWgXG2l/aF8XKqQY4w+lZT2WsX7kcSZ417VzeoiMVx1yHpYQk2qPDC59n8EcNh
q1EmNaYJD0DEtAbexH5jZhlofB9mnpJDpRHPvixKDxvD5h7WLrcwyfJ7+SHZnykw
xUDXRumoWmF2La/Dh75SwQc6YJeyXVmOICB6uRuX7mC187wjfQfy/KC2Z6dRjNow
ysMN5rbtSLY=
=Ayvm
-----END PGP SIGNATURE-----


* Re: The situation at hand and in the future
  2004-06-09 22:01                       ` David Masover
@ 2004-06-10  8:23                         ` mjt
  0 siblings, 0 replies; 41+ messages in thread
From: mjt @ 2004-06-10  8:23 UTC (permalink / raw)
  To: David Masover; +Cc: Valdis.Kletnieks, Hubert Chan, reiserfs-list

On Wed, Jun 09, 2004 at 05:01:17PM -0500, David Masover wrote:

>Of course, most of the time people can just buy a new CPU.

Or don't encrypt the game data.

>Should we take this off-list?

Please don't, I'm not sure if I'm the only one who finds this
discussion too interesting not to be readable by everyone.

It also relates to Reiser4.

-- 
mjt



end of thread, other threads:[~2004-06-10  8:23 UTC | newest]

Thread overview: 41+ messages
2004-05-27 20:01 The situation at hand and in the future mjt
2004-05-27 21:05 ` Valdis.Kletnieks
2004-05-27 22:09   ` David Masover
2004-05-28  6:33     ` mjt
2004-05-28 19:53       ` Valdis.Kletnieks
2004-05-29 12:48         ` mjt
2004-05-29 14:22       ` David Masover
2004-05-29 15:49         ` mjt
2004-05-29 23:16           ` David Masover
2004-05-30  0:41             ` Hubert Chan
2004-05-30 12:29               ` mjt
2004-05-30 16:54                 ` Hubert Chan
2004-05-30 12:27             ` mjt
2004-05-30 17:09               ` Hubert Chan
2004-05-31  0:07                 ` The Amazing Dragon
2004-05-30 17:13               ` Hubert Chan
2004-05-30 18:06                 ` mjt
2004-05-31  0:45               ` David Masover
2004-05-31  8:38                 ` mjt
2004-05-31 15:12                   ` David Masover
2004-05-31 17:20                     ` Hubert Chan
2004-05-31 21:14                       ` David Masover
2004-05-31 15:16                   ` Hubert Chan
2004-06-01 13:25                 ` Edward Shushkin
2004-06-02  8:05                   ` mjt
2004-06-02 12:51                     ` Edward Shushkin
2004-06-02 15:15                       ` mjt
2004-05-31 18:31             ` Valdis.Kletnieks
2004-05-31 21:15               ` David Masover
2004-06-02  2:45           ` Hans Reiser
2004-05-29 20:04         ` Hubert Chan
2004-05-29 23:19           ` David Masover
2004-05-31 18:27             ` Valdis.Kletnieks
2004-05-31 21:23               ` David Masover
2004-06-01  2:09                 ` Hubert Chan
2004-06-05  4:50                   ` David Masover
2004-06-05  7:30                     ` Valdis.Kletnieks
2004-06-05 10:07                       ` Christian Iversen
2004-06-07 17:35                         ` Valdis.Kletnieks
2004-06-09 22:01                       ` David Masover
2004-06-10  8:23                         ` mjt
