From: Karl MacMillan <kmacmillan@mentalrootkit.com>
To: James Antill <jantill@redhat.com>
Cc: SELinux Mail List <selinux@tycho.nsa.gov>
Subject: Re: [RFC] Support for bzip compressed modules
Date: Tue, 09 Jan 2007 10:51:55 -0500
Message-ID: <45A3BA1B.9000908@mentalrootkit.com>
In-Reply-To: <1168327098.22423.93.camel@code.and.org>
James Antill wrote:
> On Mon, 2007-01-08 at 15:34 -0500, Karl MacMillan wrote:
>
>> The patch implements this support by changing sepol_policy_file_t to
>> support decompressing files or memory areas into a private memory copy.
>> This support is optional - dlopen is used so that a hard dependency to
>> libbz2 is not introduced. I took the approach of decompressing the
>> entire file or memory area because:
>
> Why don't we want to depend on libbz, if we are building with bz2
> support?
>
This allows the same binary to be installed with or without libbz2.
Since libsepol gets pulled into so many installs, it seems preferable to
not add additional dependencies.
On the other hand, I don't have a strong preference about that part of
the patch. If libbz2 is deemed sufficiently available (or even libz I
guess) we can just directly link. Perhaps with compile time options.
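
A minimal sketch of the optional-dependency idea, not the code from the patch: the library is opened at run time with dlopen() and the one-shot decompression entry point is looked up with dlsym(), so the binary still loads when libbz2 is absent. The soname "libbz2.so.1" and the names bz2_try_load, bz2_handle and bz2_decompress are assumptions made for illustration; the soname differs between distributions.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Signature of BZ2_bzBuffToBuffDecompress() from <bzlib.h>, restated here
 * so that the header is not needed at build time. */
typedef int (*bz2_decompress_fn)(char *dest, unsigned int *dest_len,
                                 char *source, unsigned int source_len,
                                 int small, int verbosity);

static void *bz2_handle;                 /* hypothetical module state */
static bz2_decompress_fn bz2_decompress; /* NULL until loaded */

/* Try to load libbz2 at run time.
 * Returns 0 if the decompression function is available, -1 otherwise. */
int bz2_try_load(void)
{
    if (bz2_decompress != NULL)
        return 0;

    /* Assumed soname; some systems ship libbz2.so.1.0 instead. */
    bz2_handle = dlopen("libbz2.so.1", RTLD_NOW);
    if (bz2_handle == NULL)
        return -1;

    bz2_decompress = (bz2_decompress_fn)dlsym(bz2_handle,
                                              "BZ2_bzBuffToBuffDecompress");
    if (bz2_decompress == NULL) {
        dlclose(bz2_handle);
        bz2_handle = NULL;
        return -1;
    }
    return 0;
}
```

A caller would treat a -1 return as "no compression support on this system" and fall back to reading the policy uncompressed. Older glibc versions need -ldl at link time.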
>> * It is very simple
>> * The current code depends on the ability to seek within policy files -
>> this is not really possible within compressed streams using the bzip2
>> library.
>>
>> The downsides are:
>>
>> * Increased memory usage
>> * No transparent support for compressed writing with an fd based policy
>> file.
>>
>> I didn't want to add additional set functions - I would have preferred
>> to allow sepol_policy_file_set_[mem,fd] to transparently open compressed
>> streams with functions to set other behaviors as options stored in
>> sepol_policy_file_t structs. This was not possible because the current
>> set functions do not return errors.
>
> Do we really care about the memory usage, my instinct would be to drop
> the FILE specific code and just dump everything into memory and then
> call the mem_set function and thus have only one decompression loop
> (adding the fd version is simple then too).
> Calling fstat(fileno(fp)) to read the policy in is probably easier than
> a loop.
>
Not certain what you are getting at - both code paths result in an
uncompressed copy of the compressed data in memory. The only difference
is whether we are decompressing from an fd or from another memory buffer.
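
For reference, the fstat() suggestion quoted above could look roughly like the following sketch. It reads a regular file into one malloc'd buffer in a single fread() call, after which a memory-based decompression path can be reused. The name read_whole_file is hypothetical, and the stream is assumed to be positioned at the start of a regular file.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for fileno() and fstat() */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Read the whole of fp into a newly allocated buffer.
 * On success returns the buffer (caller frees) and stores its length in
 * *len_out; on failure returns NULL. */
char *read_whole_file(FILE *fp, size_t *len_out)
{
    struct stat st;
    size_t size;
    char *buf;

    if (fp == NULL || len_out == NULL)
        return NULL;
    /* The size is only meaningful for regular files, not pipes. */
    if (fstat(fileno(fp), &st) < 0 || !S_ISREG(st.st_mode))
        return NULL;

    size = (size_t)st.st_size;
    buf = malloc(size > 0 ? size : 1); /* avoid a zero-byte malloc */
    if (buf == NULL)
        return NULL;

    if (fread(buf, 1, size, fp) != size) {
        free(buf);
        return NULL;
    }
    *len_out = size;
    return buf;
}
```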
>> Comments appreciated. Some very crude benchmarking below (note that I am
>> using a patched semodule to allow the globbing syntax - patch for that
>> to follow). The summary is that there is substantial space savings at
>> the expense of some increase in time to complete common actions. An
>> acceptable trade-off in my opinion.
>>
>> Anyone have suggestions for something as simple as time but for max
>> memory usage?
>
> There's memusage in glibc-utils.
>
Thanks
> ---- code ----
>
> The bz2 code looks fine, although the += BUFSIZE in one loop and *= 2
> in the other is weird, and there's a couple of minor nits in the
> interface:
>
In the loop with *= 2 we are just guessing the buffer size, so I want to
grow the buffer much more quickly if the decompression fails. We start by
assuming a 2:1 compression ratio, then try 4:1, etc. If we were only adding
BUFSIZ, we might loop for a long time growing the buffer.
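
To put rough numbers on that, here is a small sketch that counts how many decompression attempts each growth strategy needs before the output buffer is large enough. It is an illustration only, not code from the patch: the function names are hypothetical, both strategies are assumed to start from a guess of twice the compressed length, and size_t overflow is ignored. With libbz2's one-shot API, each failed attempt would be a call that reports the output buffer as full.

```c
#include <stddef.h>

/* Attempts needed when the output guess starts at 2 * compressed_len
 * and doubles after every failed attempt (2:1, 4:1, 8:1, ...). */
size_t attempts_doubling(size_t compressed_len, size_t actual_len)
{
    size_t guess = compressed_len * 2;
    size_t attempts = 1;

    if (guess == 0)
        guess = 1; /* doubling zero would never grow */
    while (guess < actual_len) {
        guess *= 2;
        attempts++;
    }
    return attempts;
}

/* Attempts needed when the same initial guess grows by a fixed step
 * (for example BUFSIZ) after every failed attempt. */
size_t attempts_linear(size_t compressed_len, size_t actual_len, size_t step)
{
    size_t guess = compressed_len * 2;
    size_t attempts = 1;

    if (step == 0)
        step = 1; /* a zero step would loop forever */
    while (guess < actual_len) {
        guess += step;
        attempts++;
    }
    return attempts;
}
```

For a made-up 1000-byte input that expands to 16000 bytes, doubling needs 4 attempts while a 1024-byte step needs 15, and the gap widens as the real compression ratio increases.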
> . check is always true in callers, and I'm not sure why you'd have it
> zero.
>
The magic number checking seems fragile - I'm assuming it might be
necessary to force the stream to be treated as compressed at some point.
Since we are maintaining ABI for this library (and these functions), it
seems better to be safe.
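
A sketch of what such a check might look like, assuming that a zero check flag means "skip the magic test and treat the data as compressed" (that reading of the flag is an assumption, as are the function names). A bzip2 stream starts with the bytes 'B', 'Z', 'h' followed by a block-size digit from '1' to '9'.

```c
#include <stddef.h>

/* Returns 1 if the buffer starts with a bzip2 stream header, 0 otherwise. */
int looks_like_bz2(const unsigned char *buf, size_t len)
{
    return len >= 4 &&
           buf[0] == 'B' && buf[1] == 'Z' && buf[2] == 'h' &&
           buf[3] >= '1' && buf[3] <= '9';
}

/* Hypothetical wrapper: with check == 0 the caller forces the data to be
 * treated as compressed; otherwise the magic number decides. */
int should_decompress(const unsigned char *buf, size_t len, int check)
{
    if (!check)
        return 1;
    return looks_like_bz2(buf, len);
}
```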
> . All code paths have:
>
> if (set_foo_bz2() == FAILED)
> set_foo();
>
> ...which tells me set_foo_bz2() should do that ... in fact it seems sane
> to just change set_foo() to check for bz2ness and do the right thing,
> without having to alter the callers.
>
Note my comments with the original patch - this isn't possible because
set_foo() has a void return and we want to maintain binary compatibility.
> . A personal minor nit is that free(NULL) works fine, so don't work
> around it (this idiom seems to be used in sepol).
>
I don't see that - maybe you mean:
+	if (pf && pf->orig_data) {
+		free(pf->orig_data);
+		pf->orig_data = NULL;
+	}
This is to allow this function to be called with a null pf - so I have
to check before looking in the struct.
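
Both points can be satisfied at once. The sketch below keeps the guard on pf, which is needed before the struct is dereferenced, and drops the test on the member, since free(NULL) is defined to do nothing. The struct and function names are stand-ins for illustration, not the real sepol definitions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the relevant part of the policy file struct. */
struct policy_file {
    void *orig_data;
};

/* Release the private copy of the data. Safe to call with a NULL pf and
 * safe to call more than once on the same struct. */
void policy_file_free_data(struct policy_file *pf)
{
    if (pf == NULL)
        return;              /* must not dereference a NULL struct */
    free(pf->orig_data);     /* free(NULL) is a no-op */
    pf->orig_data = NULL;    /* makes a second call harmless */
}
```

Resetting the pointer also means that calling the function twice in a row, as noted in the next comment, does not double free.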
> . sepol_policy_file_free_data() is also called multiple times at the end
> of the set_foo_bz2() functions (once inside set_foo() and then
> explicitly immediately after).
>
Good catch - thanks.
>
> I assume the only reason you went with bzip2 over gzip is the "have to
> init yourself in the set_mem case"?
No - just better compression.
[kmacmill@localhost ~]$ ls -l base.pp.*
-rw-r--r-- 1 kmacmill kmacmill 86379 Jan 9 10:50 base.pp.bz2
-rw-r--r-- 1 kmacmill kmacmill 167382 Jan 9 10:50 base.pp.gz
> I've done that before[1], so I can
> help you get that bit done if you want ... this will drop
> CPU/memory/dependency requirements (although expecting all Linux to have
> libbz now isn't a big deal, IMO).
>
Ok - I'd be happy to support both if you want to send a patch.
Thanks - Karl