From: Paulo Marques <pmarques@grupopie.com>
To: Guillaume@Lacote.name
Cc: "Jörn Engel" <joern@wohnheim.fh-wedel.de>,
linux-kernel@vger.kernel.org, Linux@glacote.com
Subject: Re: Using compression before encryption in device-mapper
Date: Wed, 14 Apr 2004 16:23:02 +0100
Message-ID: <407D5756.6030604@grupopie.com>
In-Reply-To: 200404141602.43695.Guillaume@Lacote.name
Guillaume Lacôte wrote:
>...
> Actually (see my reply to Timothy Miller) I really want to do "compression"
> even if it does not reduce space: it is a matter of increasing the per-bit
> entropy rather than saving space (see http://jsam.sourceforge.net). Moreover
> I do not want to use sophisticated algorithms (in order to be able to compute
> plain-text random distributions that ensure that the compressed distributions
> will be uniform, which is very difficult with e.g. zlib; in particular
> having any kind of "meta-data", "signatures" or "dictionary" is a no-go for
> me). See details at the end of this post.
Point taken
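As a rough illustration of what "growing the per-bit entropy" means, the
sketch below measures the empirical byte entropy of some data before and
after compression. Everything in it is an assumption made for illustration:
the sample input is synthetic, bits_per_byte() is a hypothetical helper, and
zlib is used only as a convenient stand-in. It is not the scheme proposed in
the quoted text, and zlib output carries a 2-byte header and an Adler-32
trailer, which is exactly the kind of meta-data the quoted text rules out.

```python
import math
import random
import zlib
from collections import Counter


def bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of the byte histogram, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical sample input: pseudo-random words from a small vocabulary,
# seeded so that the run is reproducible.
rng = random.Random(0)
words = ["block", "device", "kernel", "mapper",
         "entropy", "huffman", "cipher", "sector"]
plain = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

# zlib is only a stand-in compressor here (see the caveats above).
packed = zlib.compress(plain)

plain_entropy = bits_per_byte(plain)
packed_entropy = bits_per_byte(packed)

print(f"plain:  {len(plain)} bytes, {plain_entropy:.2f} bits/byte")
print(f"packed: {len(packed)} bytes, {packed_entropy:.2f} bits/byte")
```

A byte histogram is a crude measure: a high value does not show that the
output is uniformly distributed in the sense the quoted text requires, only
that single-byte frequencies are flatter than in the input.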
> ...
>
>>A while ago I started working on a proof of concept kind of thing, that was
>>a network block device server that compressed the data sent to it.
>>
> Would it be possible for you to point me to the relevant material?
>
I just need to tidy it up a little :)
Maybe I can publish it tomorrow or something like that.
>....
>>2 - The compression layer should report a large block size upwards, and use
>>a small block size downwards, so that compression is as efficient as
>>possible. Good results are obtained with a 32kB / 512 byte ratio. This can
>>cause extra read-modify-write cycles upwards.
>>
> I failed to understand; could you provide me with more details please?
>
If we compress on a per-block basis, the bigger the block, the higher the
compression ratio we'll be able to achieve (using zlib, for instance). However,
the compressed data still has to be written to the actual block device in
whole blocks.
For instance, if we compress a 32kB block and it only needs 8980 bytes to be
stored, we need 18 512-byte blocks to store it (18 * 512 = 9216 bytes, so 236
bytes are padding). On average, we will lose half of the lower-level block
size for every "upper level" block of data. With a 32kB/512-byte ratio, we
would lose on average 256 bytes per 32kB of data, or about 0.8% (which is
more than acceptable).
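The arithmetic above can be sketched in a few lines of Python. The function
names are made up for illustration and are not taken from the server code,
and the "half a block on average" figure assumes that compressed lengths
fall uniformly within the last 512-byte block.

```python
SECTOR = 512        # lower-level block size, in bytes
UPPER = 32 * 1024   # upper-level block size, in bytes


def sectors_needed(compressed_len: int, sector: int = SECTOR) -> int:
    """Whole lower-level blocks needed to hold compressed_len bytes."""
    return (compressed_len + sector - 1) // sector


def padding_lost(compressed_len: int, sector: int = SECTOR) -> int:
    """Bytes wasted in the last, partially filled lower-level block."""
    return sectors_needed(compressed_len, sector) * sector - compressed_len


def expected_loss_fraction(upper: int = UPPER, sector: int = SECTOR) -> float:
    """Average padding per upper-level block, as a fraction of that block.

    Assumption: the compressed length is uniformly distributed within the
    last lower-level block, so half a block is wasted on average.
    """
    return (sector / 2) / upper


print(sectors_needed(8980))                    # 18
print(padding_lost(8980))                      # 236
print(f"{expected_loss_fraction():.2%}")       # 0.78%
```

Raising the upper-level block size or lowering the lower-level one shrinks
the fraction further, at the cost of the extra read-modify-write cycles
mentioned in point 2.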
>>...
> As I said earlier, my point is definitely not to gain space, but to grow the
> "per-bit entropy". I really want to encode my data even if this grows its
> length, as is done in http://jsam.sourceforge.net . My final goal is the
> following: for each plain block first draw a chunk of random bytes, and then
> compress both the random bytes followed by the plain data with a dynamic
> huffman encoding. The random bytes are _not_ drawn uniformly, but rather so
> that the distribution on huffman trees (and thus on encodings) is uniform.
> This ensures (?) that an attacker really has no other solution to decipher
> the data than brute-force: each and every key is possible, and more
> precisely, each and every key is equi-probable.
Ok, we are definitely fighting different wars here.
Anyway, I'll try to gather what I did with the network block device server and
place it somewhere where you can look at it. It will probably help you do some
tests, too. Because it is a block device in "user space", it is much simpler to
develop and test different approaches, and gather some results, before trying
things inside the kernel.
If you want to start now, just go to:
http://nbd.sourceforge.net/
and download the source for the network block device server. My server is
probably more complex than the original, because of all the metadata handling.
I hope this helps,
--
Paulo Marques - www.grupopie.com
"In a world without walls and fences who needs windows and gates?"