* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
[not found] ` <20101019073901.GB11284@dastard>
@ 2010-10-19 16:24 ` Eric Paris
2010-10-19 16:29 ` Christoph Hellwig
0 siblings, 1 reply; 21+ messages in thread
From: Eric Paris @ 2010-10-19 16:24 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, linux-kernel, linux-security-module,
linux-fsdevel, zohar, warthog9, jmorris, kyle, hpa, akpm,
torvalds, mingo, viro
On Tue, 2010-10-19 at 18:39 +1100, Dave Chinner wrote:
> On Mon, Oct 18, 2010 at 10:14:03PM -0400, Eric Paris wrote:
> Eric, just to put that in context - changing the size of an inode
> needs to be conidered carefully because we cache so many of them. We
> often jump through hoops just to reduce it by 4 or 8 bytes. You are
> proposing to increase it by 24 bytes (roughly 5%) and as such that
> _should_ be considered a big deal, especially for something that is
> currently rarely used.
In my mind it's framed a little differently, my patch series is reducing
it from ~900 bytes to 24 bytes. Even though that memory might not have
been inside struct inode if there is always a 1-1 mapping it might as
well be.... I'm going from seriously broken to a hell of a lot better.
I believe that when I resend this series I'll drop 8 more of those bytes
(open count as I think we can do without that these days).
> Personally I that adding a pointer into the struct inode is as much
> as I'd want to compromise to. Those that want to use IMA or have the
> possibility of turning it on dynamicaly can accept the additional
> overhead of another memory allocation during inode allocation as the
> cost of using this functionality. That's the way the security
> subsystem works, so I don't see any problems with doing this for IMA
> and it turns the overhead problem into one that only affects those
> that have it both configured and enabled. That seems like a
> reasonable compromise to me....
The problem is that this would actually waste another 8 bytes (the size
of the pointer in struct inode) since IMA is still going to need to
allocate a structure for every inode to hold the 16-24 bytes of
counters. That 16-24 might not be in struct inode, but like I said, if
there is a 1-1 mapping between the two there is no difference.
I said that if there was a consensus that this overhead was still too
large (and it seems that may be the case) I would put looking at using a
userspace freezer to attempt to collect the information dynamically on
my todo list. I'll gladly do that but we have a space/time tradeoff I'd
rather have a consensus on before I start.
If I go the pointer in struct inode route, I don't need to serialize
entry and removal from core of every inode if IMA is enabled (while I
add and remove it from the IMA lookup tree.) If I don't add any fields
to struct inode I'll need to serialize while I add them to the IMA
lookup tree, but at the savings of a void * in struct inode.
My guess is that most people will say forcing users to serialize and
saving 8bytes per inode is the better choice, but I know there is
scalability work going on and I want to make sure everyone agrees that
is the right choice before we spend a lot of time on anything like
this...
-Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 16:24 ` [PATCH 1/3] IMA: move read/write counters into struct inode Eric Paris
@ 2010-10-19 16:29 ` Christoph Hellwig
0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2010-10-19 16:29 UTC (permalink / raw)
To: Eric Paris
Cc: Dave Chinner, Christoph Hellwig, linux-kernel,
linux-security-module, linux-fsdevel, zohar, warthog9, jmorris,
kyle, hpa, akpm, torvalds, mingo, viro
Eric,
I think you and just about everyone here are on a different page, and
before we have the basic disagreement settled I'm not sure we can
make much progress.
Can you please explain why a feature like IMA that no sane user would
ever want should cause _any_ overhead for users that just have it
compiled in because their distro or defconfig did without actually
using.
What exactly is the problem to require the few people that want it to
use a kernel command line option and/or an _DEFAULT_ON config option?
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
[not found] ` <AANLkTimGdjPyLXuknDNa7WNthDT9+2FdOuPdxwjRiMHD@mail.gmail.com>
@ 2010-10-19 16:36 ` Eric Paris
2010-10-19 16:55 ` Al Viro
0 siblings, 1 reply; 21+ messages in thread
From: Eric Paris @ 2010-10-19 16:36 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-kernel, linux-security-module, linux-fsdevel, hch, zohar,
warthog9, david, jmorris, kyle, hpa, akpm, mingo, viro
On Tue, 2010-10-19 at 08:52 -0700, Linus Torvalds wrote:
> On Mon, Oct 18, 2010 at 6:16 PM, Eric Paris <eparis@redhat.com> wrote:
> >
> > IMA currently alocated an inode integrity structure for every inode in
> > core. This stucture is about 120 bytes long. Most files however
> > (especially on a system which doesn't make use of IMA) will never need any
> > of this space. The problem is that if IMA is enabled we need to know
> > information about the number of readers and the number of writers for every
> > inode on the box. At the moment we collect that information in the per
> > inode iint structure and waste the rest of the space. This patch moves those
> > counters into the struct inode so we can eventually stop allocating an IMA
> > integrity structure except when absolutely needed.
>
> Hmm. I don't think this is really acceptable as-is.
>
> First off (and most trivially) - the fields are misnamed. Just calling
> them "{open,read,write}count" was fine when it was part of an ima
> structure, but for all the historical reasons, inode fields are called
> 'i_xyzzy'.
Will fix.
> Secondly, we already maintain a write count (called "i_writecount").
> Why is the IMA writecount different, and should it be?
I ask Al about reusing this field long ago and he indicated it had a
very different meaning. I can't remember what he indicated it meant off
the top of my head but I'll take a look at it again. Lines like this
leave me leary:
drivers/md/md.c::deny_bitmap_write_access()
atomic_set(&inode->i_writecount, -1);
> Thirdly, why is it an "unsigned long"? Are the IMA numbers cumulative
> or something? How could you ever overflow a 32-bit counter if not?
Not cumulative. 32bits would probably be fine.
> Finally, why does IMA even care about the read-counts vs open-counts?
> Why not just open-counts, and consider any non-write to be an open?
What IMA needs to function is the current readers and current writers.
The open count was originally very useful when a number of places inside
the kernel were allocating struct files themselves rather than letting
the VFS do the lifting and we could end up with more struct files to a
given inode than IMA realized. Back then IMA started trying to do
one-off hooks to each filesystem doing this to fix the counters and
measure appropriately but we eventually decided it was best to move all
struct file creation into the vfs so it couldn't get out of whack. I
believe at this point we could drop the opencount....
> In short, I think this patch would be _much_ more acceptable if it
> added just a _single_ 32-bit "i_opencount". And even then I'd ask
> "what's the difference between i_opencount and our already existing
> i_count?
i_count, I believe, is much different. i_count is counting the number
of dentries in core referencing the inode, even if none of them are
being used in any struct file or if one dentry is being referenced in
1000 struct files. The IMA counters are from a higher level, they
counts the number of struct files referencing this inode.
I'll resend, shrinking from unsigned long to unsigned int and dropping
opencount from struct inode. Should get us from using ~900 bytes per
inode to using about 8 bytes per inode.
And like I said, if that still seems like too much overhead to most
people (and it seems that's the case) I'll look at how to get down to 0,
but it isn't going to be a fast obvious change...
-Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 16:36 ` Eric Paris
@ 2010-10-19 16:55 ` Al Viro
2010-10-19 17:03 ` Linus Torvalds
0 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2010-10-19 16:55 UTC (permalink / raw)
To: Eric Paris
Cc: Linus Torvalds, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm, mingo
On Tue, Oct 19, 2010 at 12:36:55PM -0400, Eric Paris wrote:
> I ask Al about reusing this field long ago and he indicated it had a
> very different meaning. I can't remember what he indicated it meant off
> the top of my head but I'll take a look at it again.
> i_count, I believe, is much different. i_count is counting the number
> of dentries in core referencing the inode, even if none of them are
> being used in any struct file or if one dentry is being referenced in
> 1000 struct files. The IMA counters are from a higher level, they
> counts the number of struct files referencing this inode.
a) i_writecount is about VM_DENYWRITE, basically. Reusing it for ima could
get unpleasant; when it's positive, we are fine, but it can get negative as
well. IMA will have interesting time dealing with that.
b) i_count is simply a refcount for struct inode. Not exactly the number
of dentries, but that's the main contributor. Basically, that's "how many
pointers outside of inode hash chains point that that struct inode at the
moment".
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 16:55 ` Al Viro
@ 2010-10-19 17:03 ` Linus Torvalds
2010-10-19 17:28 ` Al Viro
2010-10-19 22:49 ` Eric Paris
0 siblings, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2010-10-19 17:03 UTC (permalink / raw)
To: Al Viro
Cc: Eric Paris, linux-kernel, linux-security-module, linux-fsdevel,
hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo
On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> a) i_writecount is about VM_DENYWRITE, basically. Reusing it for ima could
> get unpleasant; when it's positive, we are fine, but it can get negative as
> well. IMA will have interesting time dealing with that.
>
> b) i_count is simply a refcount for struct inode. Not exactly the number
> of dentries, but that's the main contributor. Basically, that's "how many
> pointers outside of inode hash chains point that that struct inode at the
> moment".
My question was deeper. More along the lines of "why would IMA care?"
How/why could IMA ever care about the pointless and trivial
differences between its current private open/read/write counts and the
counts that we already maintain?
Yes, yes, I realize that they have technical differences in what they
count. That's not the question. The question is "Why would IMA care?"
Linus
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 17:03 ` Linus Torvalds
@ 2010-10-19 17:28 ` Al Viro
2010-10-19 18:16 ` Mimi Zohar
2010-10-19 19:11 ` Matthew Wilcox
2010-10-19 22:49 ` Eric Paris
1 sibling, 2 replies; 21+ messages in thread
From: Al Viro @ 2010-10-19 17:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Eric Paris, linux-kernel, linux-security-module, linux-fsdevel,
hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo
On Tue, Oct 19, 2010 at 10:03:48AM -0700, Linus Torvalds wrote:
> On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > a) i_writecount is about VM_DENYWRITE, basically. ?Reusing it for ima could
> > get unpleasant; when it's positive, we are fine, but it can get negative as
> > well. ?IMA will have interesting time dealing with that.
> >
> > b) i_count is simply a refcount for struct inode. ?Not exactly the number
> > of dentries, but that's the main contributor. ?Basically, that's "how many
> > pointers outside of inode hash chains point that that struct inode at the
> > moment".
>
> My question was deeper. More along the lines of "why would IMA care?"
>
> How/why could IMA ever care about the pointless and trivial
> differences between its current private open/read/write counts and the
> counts that we already maintain?
>
> Yes, yes, I realize that they have technical differences in what they
> count. That's not the question. The question is "Why would IMA care?"
I'd rather not say what I think about IMA sanity (and usefulness); as for what
it tries to do... They want to whine if you open a file that is already opened
for write and they want to whine if you open a file for write when it's already
opened for read. Unless they decide to leave the file alone, that is.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 17:28 ` Al Viro
@ 2010-10-19 18:16 ` Mimi Zohar
2010-10-20 13:10 ` John Stoffel
2010-10-19 19:11 ` Matthew Wilcox
1 sibling, 1 reply; 21+ messages in thread
From: Mimi Zohar @ 2010-10-19 18:16 UTC (permalink / raw)
To: Al Viro
Cc: Linus Torvalds, Eric Paris, linux-kernel, linux-security-module,
linux-fsdevel, hch, Mimi Zohar, warthog9, david, jmorris, kyle,
hpa, akpm, mingo
On Tue, 2010-10-19 at 18:28 +0100, Al Viro wrote:
> On Tue, Oct 19, 2010 at 10:03:48AM -0700, Linus Torvalds wrote:
> > On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > a) i_writecount is about VM_DENYWRITE, basically. ?Reusing it for ima could
> > > get unpleasant; when it's positive, we are fine, but it can get negative as
> > > well. ?IMA will have interesting time dealing with that.
> > >
> > > b) i_count is simply a refcount for struct inode. ?Not exactly the number
> > > of dentries, but that's the main contributor. ?Basically, that's "how many
> > > pointers outside of inode hash chains point that that struct inode at the
> > > moment".
> >
> > My question was deeper. More along the lines of "why would IMA care?"
> >
> > How/why could IMA ever care about the pointless and trivial
> > differences between its current private open/read/write counts and the
> > counts that we already maintain?
> >
> > Yes, yes, I realize that they have technical differences in what they
> > count. That's not the question. The question is "Why would IMA care?"
The filesystem prevents files being executed from being opened for
write. The same guarantees that the file won't change, obviously,
doesn't exist for files being opened for read. Thus measuring a file
opened for read that has already been open for write, has no meaning.
Unfortunately, since the inode counters don't provide this information,
IMA maintains a separate set of counters.
> I'd rather not say what I think about IMA sanity (and usefulness); as for what
> it tries to do... They want to whine if you open a file that is already opened
> for write and they want to whine if you open a file for write when it's already
> opened for read. Unless they decide to leave the file alone, that is.
You left out one minor detail, invalidate the PCR as well.
Mimi
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 17:28 ` Al Viro
2010-10-19 18:16 ` Mimi Zohar
@ 2010-10-19 19:11 ` Matthew Wilcox
2010-10-20 3:15 ` Al Viro
1 sibling, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2010-10-19 19:11 UTC (permalink / raw)
To: Al Viro
Cc: Linus Torvalds, Eric Paris, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm, mingo
On Tue, Oct 19, 2010 at 06:28:05PM +0100, Al Viro wrote:
> I'd rather not say what I think about IMA sanity (and usefulness); as for what
> it tries to do... They want to whine if you open a file that is already opened
> for write and they want to whine if you open a file for write when it's already
> opened for read. Unless they decide to leave the file alone, that is.
Hm. Sounds like the same question that the file leases code needs
answered. The important difference is that the leases code can just
refuse to set a lease on inodes with multiple dentries.
While my mind's on it ... Al, is this code even close to correct?
if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
goto out;
if ((arg == F_WRLCK)
&& ((atomic_read(&dentry->d_count) > 1)
|| (atomic_read(&inode->i_count) > 1)))
goto out;
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 17:03 ` Linus Torvalds
2010-10-19 17:28 ` Al Viro
@ 2010-10-19 22:49 ` Eric Paris
2010-10-20 14:38 ` Ingo Molnar
1 sibling, 1 reply; 21+ messages in thread
From: Eric Paris @ 2010-10-19 22:49 UTC (permalink / raw)
To: Linus Torvalds
Cc: Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch,
zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo
Executive summary of the day's work:
Yesterday morning: 944 bytes per inode in core
Yesterday night: 24 bytes per inode in core
Tonight: 4 bytes per inode in core.
That's a x236 time reduction in memory usage. No I didn't even start
looking at a freezer. Which could bring that 4 down to 0, but would add
a scalability penalty on all inodes when IMA was enabled.
The memory associated with inodes that IMA actually cares about has gone
from 312 to 320 bytes.
I'm going to follow up with my patch series again but they aren't really
ready to be applied. The IBM people who wrote IMA are reviewing them.
I have some questions if my RCU+RBTREE usage is valid/correct. I'd
really like Al to take a close look at the last patch in the series to
make sure my use of i_writecount actually does what I want it to do...
-Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 19:11 ` Matthew Wilcox
@ 2010-10-20 3:15 ` Al Viro
2010-10-20 17:38 ` J. Bruce Fields
0 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2010-10-20 3:15 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Eric Paris, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm, mingo
On Tue, Oct 19, 2010 at 01:11:33PM -0600, Matthew Wilcox wrote:
> Hm. Sounds like the same question that the file leases code needs
> answered. The important difference is that the leases code can just
> refuse to set a lease on inodes with multiple dentries.
>
> While my mind's on it ... Al, is this code even close to correct?
>
> if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
> goto out;
> if ((arg == F_WRLCK)
> && ((atomic_read(&dentry->d_count) > 1)
> || (atomic_read(&inode->i_count) > 1)))
> goto out;
No. This is complete junk; note that e.g. ls -lR will disrupt it, since
lstat(2) will bump dentry refcount. The first part is more or less OK;
the second makes no sense.
What is it trying to do? Note that the first part also doesn't make a lot
of sense, since you could be acquiring a write reference *right* *now*,
just as that check passes. And you could finish getting it before you get
to do anything else in generic_setlease().
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 18:16 ` Mimi Zohar
@ 2010-10-20 13:10 ` John Stoffel
2010-10-20 13:36 ` Al Viro
0 siblings, 1 reply; 21+ messages in thread
From: John Stoffel @ 2010-10-20 13:10 UTC (permalink / raw)
To: Mimi Zohar
Cc: Al Viro, Linus Torvalds, Eric Paris, linux-kernel,
linux-security-module, linux-fsdevel, hch, Mimi Zohar, warthog9,
david, jmorris, kyle, hpa, akpm, mingo
>>>>> "Mimi" == Mimi Zohar <zohar@linux.vnet.ibm.com> writes:
Mimi> On Tue, 2010-10-19 at 18:28 +0100, Al Viro wrote:
>> On Tue, Oct 19, 2010 at 10:03:48AM -0700, Linus Torvalds wrote:
>> > On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > >
>> > > a) i_writecount is about VM_DENYWRITE, basically. ?Reusing it for ima could
>> > > get unpleasant; when it's positive, we are fine, but it can get negative as
>> > > well. ?IMA will have interesting time dealing with that.
>> > >
>> > > b) i_count is simply a refcount for struct inode. ?Not exactly the number
>> > > of dentries, but that's the main contributor. ?Basically, that's "how many
>> > > pointers outside of inode hash chains point that that struct inode at the
>> > > moment".
>> >
>> > My question was deeper. More along the lines of "why would IMA care?"
>> >
>> > How/why could IMA ever care about the pointless and trivial
>> > differences between its current private open/read/write counts and the
>> > counts that we already maintain?
>> >
>> > Yes, yes, I realize that they have technical differences in what they
>> > count. That's not the question. The question is "Why would IMA care?"
Mimi> The filesystem prevents files being executed from being opened
Mimi> for write. The same guarantees that the file won't change,
Mimi> obviously, doesn't exist for files being opened for read. Thus
Mimi> measuring a file opened for read that has already been open for
Mimi> write, has no meaning. Unfortunately, since the inode counters
Mimi> don't provide this information, IMA maintains a separate set of
Mimi> counters.
Does this mean I can't replace /bin/sh on a running system using IMA
at all, even if just one process has it opened and is running? So how
the hell am I supposed to do live upgrades on a system?
Currently, /bin/sh gets replaced with the newer, better (for some
value of better :-) version, while currently running users aren't
impacted at all. New users pick up the new binary.
Gah! The only way to upgrade such a system would look be via a
reboot. Not very nice at all... or can the root user disable IMA,
upgrade a binary, then re-start IMA on a system?
So how does this improve security if root is compromised?
John
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 13:10 ` John Stoffel
@ 2010-10-20 13:36 ` Al Viro
2010-10-20 14:09 ` John Stoffel
0 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2010-10-20 13:36 UTC (permalink / raw)
To: John Stoffel
Cc: Mimi Zohar, Linus Torvalds, Eric Paris, linux-kernel,
linux-security-module, linux-fsdevel, hch, Mimi Zohar, warthog9,
david, jmorris, kyle, hpa, akpm, mingo
On Wed, Oct 20, 2010 at 09:10:27AM -0400, John Stoffel wrote:
> Does this mean I can't replace /bin/sh on a running system using IMA
> at all, even if just one process has it opened and is running? So how
> the hell am I supposed to do live upgrades on a system?
rename(2). Prohibition against write(2) to binary being executed is
*old*. Try to do the following:
cp /bin/sh /tmp
/tmp/sh &
cat /bin/ls >/tmp/sh
and you'll get
sh: /tmp/sh: Text file busy
That has nothing to do with IMA...
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 13:36 ` Al Viro
@ 2010-10-20 14:09 ` John Stoffel
0 siblings, 0 replies; 21+ messages in thread
From: John Stoffel @ 2010-10-20 14:09 UTC (permalink / raw)
To: Al Viro
Cc: John Stoffel, Mimi Zohar, Linus Torvalds, Eric Paris,
linux-kernel, linux-security-module, linux-fsdevel, hch,
Mimi Zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo
>>>>> "Al" == Al Viro <viro@ZenIV.linux.org.uk> writes:
Al> On Wed, Oct 20, 2010 at 09:10:27AM -0400, John Stoffel wrote:
>> Does this mean I can't replace /bin/sh on a running system using IMA
>> at all, even if just one process has it opened and is running? So how
>> the hell am I supposed to do live upgrades on a system?
Al> rename(2). Prohibition against write(2) to binary being executed is
Al> *old*. Try to do the following:
Al> cp /bin/sh /tmp
Al> /tmp/sh &
Al> cat /bin/ls >/tmp/sh
Al> and you'll get
Al> sh: /tmp/sh: Text file busy
Al> That has nothing to do with IMA...
Duh, makes sense. Thanks for educating me. I still IMA is a waste of
system resources...
John
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-19 22:49 ` Eric Paris
@ 2010-10-20 14:38 ` Ingo Molnar
2010-10-20 14:46 ` Eric Paris
0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2010-10-20 14:38 UTC (permalink / raw)
To: Eric Paris
Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm
* Eric Paris <eparis@redhat.com> wrote:
> Executive summary of the day's work:
> Yesterday morning: 944 bytes per inode in core
> Yesterday night: 24 bytes per inode in core
> Tonight: 4 bytes per inode in core.
>
> That's a x236 time reduction in memory usage. No I didn't even start looking at a
> freezer. Which could bring that 4 down to 0, but would add a scalability penalty
> on all inodes when IMA was enabled.
Why not use inode->i_security intelligently? That already exists so that way it's 0
bytes.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 14:38 ` Ingo Molnar
@ 2010-10-20 14:46 ` Eric Paris
2010-10-20 15:15 ` Ingo Molnar
0 siblings, 1 reply; 21+ messages in thread
From: Eric Paris @ 2010-10-20 14:46 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm
On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote:
> * Eric Paris <eparis@redhat.com> wrote:
>
> > Executive summary of the day's work:
> > Yesterday morning: 944 bytes per inode in core
> > Yesterday night: 24 bytes per inode in core
> > Tonight: 4 bytes per inode in core.
> >
> > That's a x236 time reduction in memory usage. No I didn't even start looking at a
> > freezer. Which could bring that 4 down to 0, but would add a scalability penalty
> > on all inodes when IMA was enabled.
>
> Why not use inode->i_security intelligently? That already exists so that way it's 0
> bytes.
>
> Thanks,
It still wouldn't be 0 bytes since there would be a 1-1 mapping from
inode to i_security structs. It would just be hiding the memory
somewhere else so it's harder to spot. And that's exactly how we got
into this situation, instead of just doing thing in the inode we hid it
away in a radix tree node and ima_iint_cache. In any case it would
require a secondary structure
struct generic_lsm_inode_structure
{
IMA Fields
void *lsm_inode_structure;
}
Which means yet another pointer to the real per LSM inode struct. So it
would actually likely cost memory....
The real reason I don't pursue this route is because of the litany of
different ways this pointer is used in different LSMs (or not used at
all.) And we all know that LSM authors aren't known for seeing the
world the same way as each other. As a maintainer of one of those LSMs
even I'm scared to try pushing that forward....
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 14:46 ` Eric Paris
@ 2010-10-20 15:15 ` Ingo Molnar
2010-10-20 15:25 ` Eric Paris
2010-10-21 16:15 ` Casey Schaufler
0 siblings, 2 replies; 21+ messages in thread
From: Ingo Molnar @ 2010-10-20 15:15 UTC (permalink / raw)
To: Eric Paris
Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm
* Eric Paris <eparis@redhat.com> wrote:
> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote:
> > * Eric Paris <eparis@redhat.com> wrote:
> >
> > > Executive summary of the day's work:
> > > Yesterday morning: 944 bytes per inode in core
> > > Yesterday night: 24 bytes per inode in core
> > > Tonight: 4 bytes per inode in core.
> > >
> > > That's a x236 time reduction in memory usage. No I didn't even start looking
> > > at a freezer. Which could bring that 4 down to 0, but would add a scalability
> > > penalty on all inodes when IMA was enabled.
> >
> > Why not use inode->i_security intelligently? That already exists so that way
> > it's 0 bytes.
> >
> > Thanks,
>
> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to
> i_security structs. [...]
Only for IMA-affected files, right?
My point is to keep it 0 overhead for the _non IMA common case_.
> The real reason I don't pursue this route is because of the litany of different
> ways this pointer is used in different LSMs (or not used at all.) And we all know
> that LSM authors aren't known for seeing the world the same way as each other. As
> a maintainer of one of those LSMs even I'm scared to try pushing that forward....
Ugh. That's a perfect reason to do it exactly like i suggested.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 15:15 ` Ingo Molnar
@ 2010-10-20 15:25 ` Eric Paris
2010-10-21 16:15 ` Casey Schaufler
1 sibling, 0 replies; 21+ messages in thread
From: Eric Paris @ 2010-10-20 15:25 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module,
linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa,
akpm
On Wed, 2010-10-20 at 17:15 +0200, Ingo Molnar wrote:
> * Eric Paris <eparis@redhat.com> wrote:
>
> > On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote:
> > > * Eric Paris <eparis@redhat.com> wrote:
> > >
> > > > Executive summary of the day's work:
> > > > Yesterday morning: 944 bytes per inode in core
> > > > Yesterday night: 24 bytes per inode in core
> > > > Tonight: 4 bytes per inode in core.
> > > >
> > > > That's a x236 time reduction in memory usage. No I didn't even start looking
> > > > at a freezer. Which could bring that 4 down to 0, but would add a scalability
> > > > penalty on all inodes when IMA was enabled.
> > >
> > > Why not use inode->i_security intelligently? That already exists so that way
> > > it's 0 bytes.
> > >
> > > Thanks,
> >
> > It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to
> > i_security structs. [...]
>
> Only for IMA-affected files, right?
No, we need to keep the open read counter even for non-IMA-affected
files in case we later determine that it is IMA-affected. That's the 4
bytes I have today, which I said could be eliminated with a freezer that
calculated it when IMA was enabled, but isn't something I'm looking at
right now....
-Eric
-Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 3:15 ` Al Viro
@ 2010-10-20 17:38 ` J. Bruce Fields
0 siblings, 0 replies; 21+ messages in thread
From: J. Bruce Fields @ 2010-10-20 17:38 UTC (permalink / raw)
To: Al Viro
Cc: Matthew Wilcox, Linus Torvalds, Eric Paris, linux-kernel,
linux-security-module, linux-fsdevel, hch, zohar, warthog9, david,
jmorris, kyle, hpa, akpm, mingo
On Wed, Oct 20, 2010 at 04:15:04AM +0100, Al Viro wrote:
> On Tue, Oct 19, 2010 at 01:11:33PM -0600, Matthew Wilcox wrote:
> > Hm. Sounds like the same question that the file leases code needs
> > answered. The important difference is that the leases code can just
> > refuse to set a lease on inodes with multiple dentries.
> >
> > While my mind's on it ... Al, is this code even close to correct?
> >
> > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
> > goto out;
> > if ((arg == F_WRLCK)
> > && ((atomic_read(&dentry->d_count) > 1)
> > || (atomic_read(&inode->i_count) > 1)))
> > goto out;
>
> No. This is complete junk; note that e.g. ls -lR will disrupt it, since
> lstat(2) will bump dentry refcount. The first part is more or less OK;
> the second makes no sense.
It may be nobody cares about false positives--if leases are something
provided as a caching optimization, then it's OK to deny them more often
than strictly necessary.
(In fact, if you're using the write lease to allow somebody else to
cache writes, then you want the stat to break the lease so it can get
can trigger a cache flush and get an updated mtime.)
Not that I'll claim those checks are correct.
> What is it trying to do? Note that the first part also doesn't make a lot
> of sense, since you could be acquiring a write reference *right* *now*,
> just as that check passes. And you could finish getting it before you get
> to do anything else in generic_setlease().
Yeah, that's a race.
The lease code is broken. On my list of problems:
- Leases are broken only by conflicting opens; but both nfsv4
delegations and (I'm told) Windows op locks actually require
that read leases be broken on any operation that changes file
metadata--including unlink, link, rename, chmod, and chown.
- The internal kernel api used for lease-breaking is inherently
racy as long as it consists of a single break_lease() call.
(We probably need deny_leases() and undeny_leases(), or
something.)
I'd be happy just to have read leases that worked correctly; write
leases (which, at least for the NFSv4 server's purposes, also need to be
broken on *access* to metadata) are worse.
I have patches that attempted to replace the current mechanism for
F_RDLCK leases with an i_leasecount modeled on i_writecount. (So
positive counts the number of leases held, negative counts operations
temporarily disabling leases.) They were never completely correct.
I also wasn't sure what sort of requirements I should meet to avoid
noticeable scalability problems (how much additional space in the inode
could I get away with? What about additional locks, or just writes to
inode fields, on read opens?).
--b.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-20 15:15 ` Ingo Molnar
2010-10-20 15:25 ` Eric Paris
@ 2010-10-21 16:15 ` Casey Schaufler
2010-10-22 8:48 ` Ingo Molnar
1 sibling, 1 reply; 21+ messages in thread
From: Casey Schaufler @ 2010-10-21 16:15 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric Paris, Linus Torvalds, Al Viro, linux-kernel,
linux-security-module, linux-fsdevel, hch, zohar, warthog9, david,
jmorris, kyle, hpa, akpm
On 10/20/2010 8:15 AM, Ingo Molnar wrote:
> * Eric Paris <eparis@redhat.com> wrote:
>
>> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote:
>>> * Eric Paris <eparis@redhat.com> wrote:
>>>
>>>> Executive summary of the day's work:
>>>> Yesterday morning: 944 bytes per inode in core
>>>> Yesterday night: 24 bytes per inode in core
>>>> Tonight: 4 bytes per inode in core.
>>>>
>>>> That's a x236 time reduction in memory usage. No I didn't even start looking
>>>> at a freezer. Which could bring that 4 down to 0, but would add a scalability
>>>> penalty on all inodes when IMA was enabled.
>>> Why not use inode->i_security intelligently? That already exists so that way
>>> it's 0 bytes.
>>>
>>> Thanks,
>> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to
>> i_security structs. [...]
> Only for IMA-affected files, right?
>
> My point is to keep it 0 overhead for the _non IMA common case_.
>
>> The real reason I don't pursue this route is because of the litany of different
>> ways this pointer is used in different LSMs (or not used at all.) And we all know
>> that LSM authors aren't known for seeing the world the same way as each other. As
>> a maintainer of one of those LSMs even I'm scared to try pushing that forward....
> Ugh. That's a perfect reason to do it exactly like i suggested.
If you would like to make a proposal on LSM stacking other than
the traditional "rip the LSM out" I am sure that everyone in the
IMA, SELinux, TOMOYO, AppArmor and Smack communities would be
happy to have a look. Short of having a viable mechanism for
multiple LSMs to coexist IMA needs its separate space. Yes, people
do use both IMA and LSMs on the same machine at the same time.
> Thanks,
>
> Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-21 16:15 ` Casey Schaufler
@ 2010-10-22 8:48 ` Ingo Molnar
2010-10-22 17:50 ` Casey Schaufler
0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2010-10-22 8:48 UTC (permalink / raw)
To: Casey Schaufler
Cc: Eric Paris, Linus Torvalds, Al Viro, linux-kernel,
linux-security-module, linux-fsdevel, hch, zohar, warthog9, david,
jmorris, kyle, hpa, akpm
* Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 10/20/2010 8:15 AM, Ingo Molnar wrote:
> > * Eric Paris <eparis@redhat.com> wrote:
> >
> >> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote:
> >>> * Eric Paris <eparis@redhat.com> wrote:
> >>>
> >>>> Executive summary of the day's work:
> >>>> Yesterday morning: 944 bytes per inode in core
> >>>> Yesterday night: 24 bytes per inode in core
> >>>> Tonight: 4 bytes per inode in core.
> >>>>
> >>>> That's a x236 time reduction in memory usage. No I didn't even start looking
> >>>> at a freezer. Which could bring that 4 down to 0, but would add a scalability
> >>>> penalty on all inodes when IMA was enabled.
> >>> Why not use inode->i_security intelligently? That already exists so that way
> >>> it's 0 bytes.
> >>>
> >>> Thanks,
> >> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to
> >> i_security structs. [...]
> > Only for IMA-affected files, right?
> >
> > My point is to keep it 0 overhead for the _non IMA common case_.
> >
> >> The real reason I don't pursue this route is because of the litany of different
> >> ways this pointer is used in different LSMs (or not used at all.) And we all know
> >> that LSM authors aren't known for seeing the world the same way as each other. As
> >> a maintainer of one of those LSMs even I'm scared to try pushing that forward....
> > Ugh. That's a perfect reason to do it exactly like i suggested.
>
> If you would like to make a proposal on LSM stacking other than the traditional
> "rip the LSM out" I am sure that everyone in the IMA, SELinux, TOMOYO, AppArmor
> and Smack communities would be happy to have a look. Short of having a viable
> mechanism for multiple LSMs to coexist IMA needs its separate space. Yes, people
> do use both IMA and LSMs on the same machine at the same time.
Yes, that's the essence of what i suggested: if various security concepts can be
present at once then inode->security should not be a stupid pointer to a single,
exclusive data structure (because that hardwires a "only a single security subsystem
active" assumption), but should be a pointer to a linked list of security structures
- as many as there are security subsystems interested in that inode.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode
2010-10-22 8:48 ` Ingo Molnar
@ 2010-10-22 17:50 ` Casey Schaufler
0 siblings, 0 replies; 21+ messages in thread
From: Casey Schaufler @ 2010-10-22 17:50 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric Paris, Linus Torvalds, Al Viro, linux-kernel,
linux-security-module, linux-fsdevel, hch, zohar, warthog9, david,
jmorris, kyle, hpa, akpm
On 10/22/2010 1:48 AM, Ingo Molnar wrote:
> * Casey Schaufler <casey@schaufler-ca.com> wrote:
>
>> On 10/20/2010 8:15 AM, Ingo Molnar wrote:
>>> * Eric Paris <eparis@redhat.com> wrote:
>>>
>>>> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote:
>>>>> * Eric Paris <eparis@redhat.com> wrote:
>>>>>
>>>>>> Executive summary of the day's work:
>>>>>> Yesterday morning: 944 bytes per inode in core
>>>>>> Yesterday night: 24 bytes per inode in core
>>>>>> Tonight: 4 bytes per inode in core.
>>>>>>
>>>>>> That's a x236 time reduction in memory usage. No I didn't even start looking
>>>>>> at a freezer. Which could bring that 4 down to 0, but would add a scalability
>>>>>> penalty on all inodes when IMA was enabled.
>>>>> Why not use inode->i_security intelligently? That already exists so that way
>>>>> it's 0 bytes.
>>>>>
>>>>> Thanks,
>>>> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to
>>>> i_security structs. [...]
>>> Only for IMA-affected files, right?
>>>
>>> My point is to keep it 0 overhead for the _non IMA common case_.
>>>
>>>> The real reason I don't pursue this route is because of the litany of different
>>>> ways this pointer is used in different LSMs (or not used at all.) And we all know
>>>> that LSM authors aren't known for seeing the world the same way as each other. As
>>>> a maintainer of one of those LSMs even I'm scared to try pushing that forward....
>>> Ugh. That's a perfect reason to do it exactly like i suggested.
>> If you would like to make a proposal on LSM stacking other than the traditional
>> "rip the LSM out" I am sure that everyone in the IMA, SELinux, TOMOYO, AppArmor
>> and Smack communities would be happy to have a look. Short of having a viable
>> mechanism for multiple LSMs to coexist IMA needs its separate space. Yes, people
>> do use both IMA and LSMs on the same machine at the same time.
> Yes, that's the essence of what i suggested: if various security concepts can be
> present at once then inode->security should not be a stupid pointer to a single,
> exclusive data structure (because that hardwires a "only a single security subsystem
> active" assumption), but should be a pointer to a linked list of security structures
> - as many as there are security subsystems interested in that inode.
Glad to see support for an LSM module stacker (modulator, combiner, ...)
growing. All we really need to do is get someone a case of the beverage
of their choice and turn them loose on the problem. I think that the few
anti-stacking holdouts (I was one, but converted a couple years ago) can
be swayed by a reasonable implementation. It won't be easy, there are
plenty of problems that need to be solved, but anyone who wants easy
should stick to developing web portals and stay out of the kernel.
>
> Thanks,
>
> Ingo
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2010-10-22 17:50 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20101019011650.25346.99614.stgit@paris.rdu.redhat.com>
[not found] ` <20101019013037.GA31393@infradead.org>
[not found] ` <1287454443.2530.124.camel@localhost.localdomain>
[not found] ` <20101019073901.GB11284@dastard>
2010-10-19 16:24 ` [PATCH 1/3] IMA: move read/write counters into struct inode Eric Paris
2010-10-19 16:29 ` Christoph Hellwig
[not found] ` <AANLkTimGdjPyLXuknDNa7WNthDT9+2FdOuPdxwjRiMHD@mail.gmail.com>
2010-10-19 16:36 ` Eric Paris
2010-10-19 16:55 ` Al Viro
2010-10-19 17:03 ` Linus Torvalds
2010-10-19 17:28 ` Al Viro
2010-10-19 18:16 ` Mimi Zohar
2010-10-20 13:10 ` John Stoffel
2010-10-20 13:36 ` Al Viro
2010-10-20 14:09 ` John Stoffel
2010-10-19 19:11 ` Matthew Wilcox
2010-10-20 3:15 ` Al Viro
2010-10-20 17:38 ` J. Bruce Fields
2010-10-19 22:49 ` Eric Paris
2010-10-20 14:38 ` Ingo Molnar
2010-10-20 14:46 ` Eric Paris
2010-10-20 15:15 ` Ingo Molnar
2010-10-20 15:25 ` Eric Paris
2010-10-21 16:15 ` Casey Schaufler
2010-10-22 8:48 ` Ingo Molnar
2010-10-22 17:50 ` Casey Schaufler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox