* Re: [PATCH 1/3] IMA: move read/write counters into struct inode [not found] ` <20101019073901.GB11284@dastard> @ 2010-10-19 16:24 ` Eric Paris 2010-10-19 16:29 ` Christoph Hellwig 0 siblings, 1 reply; 21+ messages in thread From: Eric Paris @ 2010-10-19 16:24 UTC (permalink / raw) To: Dave Chinner Cc: Christoph Hellwig, linux-kernel, linux-security-module, linux-fsdevel, zohar, warthog9, jmorris, kyle, hpa, akpm, torvalds, mingo, viro On Tue, 2010-10-19 at 18:39 +1100, Dave Chinner wrote: > On Mon, Oct 18, 2010 at 10:14:03PM -0400, Eric Paris wrote: > Eric, just to put that in context - changing the size of an inode > needs to be conidered carefully because we cache so many of them. We > often jump through hoops just to reduce it by 4 or 8 bytes. You are > proposing to increase it by 24 bytes (roughly 5%) and as such that > _should_ be considered a big deal, especially for something that is > currently rarely used. In my mind it's framed a little differently, my patch series is reducing it from ~900 bytes to 24 bytes. Even though that memory might not have been inside struct inode if there is always a 1-1 mapping it might as well be.... I'm going from seriously broken to a hell of a lot better. I believe that when I resend this series I'll drop 8 more of those bytes (open count as I think we can do without that these days). > Personally I that adding a pointer into the struct inode is as much > as I'd want to compromise to. Those that want to use IMA or have the > possibility of turning it on dynamicaly can accept the additional > overhead of another memory allocation during inode allocation as the > cost of using this functionality. That's the way the security > subsystem works, so I don't see any problems with doing this for IMA > and it turns the overhead problem into one that only affects those > that have it both configured and enabled. That seems like a > reasonable compromise to me.... The problem is that this would actually waste another 8 bytes (the size of the pointer in struct inode) since IMA is still going to need to allocate a structure for every inode to hold the 16-24 bytes of counters. That 16-24 might not be in struct inode, but like I said, if there is a 1-1 mapping between the two there is no difference. I said that if there was a consensus that this overhead was still too large (and it seems that may be the case) I would put looking at using a userspace freezer to attempt to collect the information dynamically on my todo list. I'll gladly do that but we have a space/time tradeoff I'd rather have a consensus on before I start. If I go the pointer in struct inode route, I don't need to serialize entry and removal from core of every inode if IMA is enabled (while I add and remove it from the IMA lookup tree.) If I don't add any fields to struct inode I'll need to serialize while I add them to the IMA lookup tree, but at the savings of a void * in struct inode. My guess is that most people will say forcing users to serialize and saving 8bytes per inode is the better choice, but I know there is scalability work going on and I want to make sure everyone agrees that is the right choice before we spend a lot of time on anything like this... -Eric ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 16:24 ` [PATCH 1/3] IMA: move read/write counters into struct inode Eric Paris @ 2010-10-19 16:29 ` Christoph Hellwig 0 siblings, 0 replies; 21+ messages in thread From: Christoph Hellwig @ 2010-10-19 16:29 UTC (permalink / raw) To: Eric Paris Cc: Dave Chinner, Christoph Hellwig, linux-kernel, linux-security-module, linux-fsdevel, zohar, warthog9, jmorris, kyle, hpa, akpm, torvalds, mingo, viro Eric, I think you and just about everyone here are on a different page, and before we have the basic disagreement settled I'm not sure we can make much progress. Can you please explain why a feature like IMA that no sane user would ever want should cause _any_ overhead for users that just have it compiled in because their distro or defconfig did without actually using. What exactly is the problem to require the few people that want it to use a kernel command line option and/or an _DEFAULT_ON config option? ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <AANLkTimGdjPyLXuknDNa7WNthDT9+2FdOuPdxwjRiMHD@mail.gmail.com>]
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode [not found] ` <AANLkTimGdjPyLXuknDNa7WNthDT9+2FdOuPdxwjRiMHD@mail.gmail.com> @ 2010-10-19 16:36 ` Eric Paris 2010-10-19 16:55 ` Al Viro 0 siblings, 1 reply; 21+ messages in thread From: Eric Paris @ 2010-10-19 16:36 UTC (permalink / raw) To: Linus Torvalds Cc: linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo, viro On Tue, 2010-10-19 at 08:52 -0700, Linus Torvalds wrote: > On Mon, Oct 18, 2010 at 6:16 PM, Eric Paris <eparis@redhat.com> wrote: > > > > IMA currently alocated an inode integrity structure for every inode in > > core. This stucture is about 120 bytes long. Most files however > > (especially on a system which doesn't make use of IMA) will never need any > > of this space. The problem is that if IMA is enabled we need to know > > information about the number of readers and the number of writers for every > > inode on the box. At the moment we collect that information in the per > > inode iint structure and waste the rest of the space. This patch moves those > > counters into the struct inode so we can eventually stop allocating an IMA > > integrity structure except when absolutely needed. > > Hmm. I don't think this is really acceptable as-is. > > First off (and most trivially) - the fields are misnamed. Just calling > them "{open,read,write}count" was fine when it was part of an ima > structure, but for all the historical reasons, inode fields are called > 'i_xyzzy'. Will fix. > Secondly, we already maintain a write count (called "i_writecount"). > Why is the IMA writecount different, and should it be? I ask Al about reusing this field long ago and he indicated it had a very different meaning. I can't remember what he indicated it meant off the top of my head but I'll take a look at it again. Lines like this leave me leary: drivers/md/md.c::deny_bitmap_write_access() atomic_set(&inode->i_writecount, -1); > Thirdly, why is it an "unsigned long"? Are the IMA numbers cumulative > or something? How could you ever overflow a 32-bit counter if not? Not cumulative. 32bits would probably be fine. > Finally, why does IMA even care about the read-counts vs open-counts? > Why not just open-counts, and consider any non-write to be an open? What IMA needs to function is the current readers and current writers. The open count was originally very useful when a number of places inside the kernel were allocating struct files themselves rather than letting the VFS do the lifting and we could end up with more struct files to a given inode than IMA realized. Back then IMA started trying to do one-off hooks to each filesystem doing this to fix the counters and measure appropriately but we eventually decided it was best to move all struct file creation into the vfs so it couldn't get out of whack. I believe at this point we could drop the opencount.... > In short, I think this patch would be _much_ more acceptable if it > added just a _single_ 32-bit "i_opencount". And even then I'd ask > "what's the difference between i_opencount and our already existing > i_count? i_count, I believe, is much different. i_count is counting the number of dentries in core referencing the inode, even if none of them are being used in any struct file or if one dentry is being referenced in 1000 struct files. The IMA counters are from a higher level, they counts the number of struct files referencing this inode. I'll resend, shrinking from unsigned long to unsigned int and dropping opencount from struct inode. Should get us from using ~900 bytes per inode to using about 8 bytes per inode. And like I said, if that still seems like too much overhead to most people (and it seems that's the case) I'll look at how to get down to 0, but it isn't going to be a fast obvious change... -Eric ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 16:36 ` Eric Paris @ 2010-10-19 16:55 ` Al Viro 2010-10-19 17:03 ` Linus Torvalds 0 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2010-10-19 16:55 UTC (permalink / raw) To: Eric Paris Cc: Linus Torvalds, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Tue, Oct 19, 2010 at 12:36:55PM -0400, Eric Paris wrote: > I ask Al about reusing this field long ago and he indicated it had a > very different meaning. I can't remember what he indicated it meant off > the top of my head but I'll take a look at it again. > i_count, I believe, is much different. i_count is counting the number > of dentries in core referencing the inode, even if none of them are > being used in any struct file or if one dentry is being referenced in > 1000 struct files. The IMA counters are from a higher level, they > counts the number of struct files referencing this inode. a) i_writecount is about VM_DENYWRITE, basically. Reusing it for ima could get unpleasant; when it's positive, we are fine, but it can get negative as well. IMA will have interesting time dealing with that. b) i_count is simply a refcount for struct inode. Not exactly the number of dentries, but that's the main contributor. Basically, that's "how many pointers outside of inode hash chains point that that struct inode at the moment". ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 16:55 ` Al Viro @ 2010-10-19 17:03 ` Linus Torvalds 2010-10-19 17:28 ` Al Viro 2010-10-19 22:49 ` Eric Paris 0 siblings, 2 replies; 21+ messages in thread From: Linus Torvalds @ 2010-10-19 17:03 UTC (permalink / raw) To: Al Viro Cc: Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > a) i_writecount is about VM_DENYWRITE, basically. Reusing it for ima could > get unpleasant; when it's positive, we are fine, but it can get negative as > well. IMA will have interesting time dealing with that. > > b) i_count is simply a refcount for struct inode. Not exactly the number > of dentries, but that's the main contributor. Basically, that's "how many > pointers outside of inode hash chains point that that struct inode at the > moment". My question was deeper. More along the lines of "why would IMA care?" How/why could IMA ever care about the pointless and trivial differences between its current private open/read/write counts and the counts that we already maintain? Yes, yes, I realize that they have technical differences in what they count. That's not the question. The question is "Why would IMA care?" Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 17:03 ` Linus Torvalds @ 2010-10-19 17:28 ` Al Viro 2010-10-19 18:16 ` Mimi Zohar 2010-10-19 19:11 ` Matthew Wilcox 2010-10-19 22:49 ` Eric Paris 1 sibling, 2 replies; 21+ messages in thread From: Al Viro @ 2010-10-19 17:28 UTC (permalink / raw) To: Linus Torvalds Cc: Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Tue, Oct 19, 2010 at 10:03:48AM -0700, Linus Torvalds wrote: > On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > a) i_writecount is about VM_DENYWRITE, basically. ?Reusing it for ima could > > get unpleasant; when it's positive, we are fine, but it can get negative as > > well. ?IMA will have interesting time dealing with that. > > > > b) i_count is simply a refcount for struct inode. ?Not exactly the number > > of dentries, but that's the main contributor. ?Basically, that's "how many > > pointers outside of inode hash chains point that that struct inode at the > > moment". > > My question was deeper. More along the lines of "why would IMA care?" > > How/why could IMA ever care about the pointless and trivial > differences between its current private open/read/write counts and the > counts that we already maintain? > > Yes, yes, I realize that they have technical differences in what they > count. That's not the question. The question is "Why would IMA care?" I'd rather not say what I think about IMA sanity (and usefulness); as for what it tries to do... They want to whine if you open a file that is already opened for write and they want to whine if you open a file for write when it's already opened for read. Unless they decide to leave the file alone, that is. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 17:28 ` Al Viro @ 2010-10-19 18:16 ` Mimi Zohar 2010-10-20 13:10 ` John Stoffel 2010-10-19 19:11 ` Matthew Wilcox 1 sibling, 1 reply; 21+ messages in thread From: Mimi Zohar @ 2010-10-19 18:16 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, Mimi Zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Tue, 2010-10-19 at 18:28 +0100, Al Viro wrote: > On Tue, Oct 19, 2010 at 10:03:48AM -0700, Linus Torvalds wrote: > > On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > > > a) i_writecount is about VM_DENYWRITE, basically. ?Reusing it for ima could > > > get unpleasant; when it's positive, we are fine, but it can get negative as > > > well. ?IMA will have interesting time dealing with that. > > > > > > b) i_count is simply a refcount for struct inode. ?Not exactly the number > > > of dentries, but that's the main contributor. ?Basically, that's "how many > > > pointers outside of inode hash chains point that that struct inode at the > > > moment". > > > > My question was deeper. More along the lines of "why would IMA care?" > > > > How/why could IMA ever care about the pointless and trivial > > differences between its current private open/read/write counts and the > > counts that we already maintain? > > > > Yes, yes, I realize that they have technical differences in what they > > count. That's not the question. The question is "Why would IMA care?" The filesystem prevents files being executed from being opened for write. The same guarantees that the file won't change, obviously, doesn't exist for files being opened for read. Thus measuring a file opened for read that has already been open for write, has no meaning. Unfortunately, since the inode counters don't provide this information, IMA maintains a separate set of counters. > I'd rather not say what I think about IMA sanity (and usefulness); as for what > it tries to do... They want to whine if you open a file that is already opened > for write and they want to whine if you open a file for write when it's already > opened for read. Unless they decide to leave the file alone, that is. You left out one minor detail, invalidate the PCR as well. Mimi ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 18:16 ` Mimi Zohar @ 2010-10-20 13:10 ` John Stoffel 2010-10-20 13:36 ` Al Viro 0 siblings, 1 reply; 21+ messages in thread From: John Stoffel @ 2010-10-20 13:10 UTC (permalink / raw) To: Mimi Zohar Cc: Al Viro, Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, Mimi Zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo >>>>> "Mimi" == Mimi Zohar <zohar@linux.vnet.ibm.com> writes: Mimi> On Tue, 2010-10-19 at 18:28 +0100, Al Viro wrote: >> On Tue, Oct 19, 2010 at 10:03:48AM -0700, Linus Torvalds wrote: >> > On Tue, Oct 19, 2010 at 9:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote: >> > > >> > > a) i_writecount is about VM_DENYWRITE, basically. ?Reusing it for ima could >> > > get unpleasant; when it's positive, we are fine, but it can get negative as >> > > well. ?IMA will have interesting time dealing with that. >> > > >> > > b) i_count is simply a refcount for struct inode. ?Not exactly the number >> > > of dentries, but that's the main contributor. ?Basically, that's "how many >> > > pointers outside of inode hash chains point that that struct inode at the >> > > moment". >> > >> > My question was deeper. More along the lines of "why would IMA care?" >> > >> > How/why could IMA ever care about the pointless and trivial >> > differences between its current private open/read/write counts and the >> > counts that we already maintain? >> > >> > Yes, yes, I realize that they have technical differences in what they >> > count. That's not the question. The question is "Why would IMA care?" Mimi> The filesystem prevents files being executed from being opened Mimi> for write. The same guarantees that the file won't change, Mimi> obviously, doesn't exist for files being opened for read. Thus Mimi> measuring a file opened for read that has already been open for Mimi> write, has no meaning. Unfortunately, since the inode counters Mimi> don't provide this information, IMA maintains a separate set of Mimi> counters. Does this mean I can't replace /bin/sh on a running system using IMA at all, even if just one process has it opened and is running? So how the hell am I supposed to do live upgrades on a system? Currently, /bin/sh gets replaced with the newer, better (for some value of better :-) version, while currently running users aren't impacted at all. New users pick up the new binary. Gah! The only way to upgrade such a system would look be via a reboot. Not very nice at all... or can the root user disable IMA, upgrade a binary, then re-start IMA on a system? So how does this improve security if root is compromised? John ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 13:10 ` John Stoffel @ 2010-10-20 13:36 ` Al Viro 2010-10-20 14:09 ` John Stoffel 0 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2010-10-20 13:36 UTC (permalink / raw) To: John Stoffel Cc: Mimi Zohar, Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, Mimi Zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Wed, Oct 20, 2010 at 09:10:27AM -0400, John Stoffel wrote: > Does this mean I can't replace /bin/sh on a running system using IMA > at all, even if just one process has it opened and is running? So how > the hell am I supposed to do live upgrades on a system? rename(2). Prohibition against write(2) to binary being executed is *old*. Try to do the following: cp /bin/sh /tmp /tmp/sh & cat /bin/ls >/tmp/sh and you'll get sh: /tmp/sh: Text file busy That has nothing to do with IMA... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 13:36 ` Al Viro @ 2010-10-20 14:09 ` John Stoffel 0 siblings, 0 replies; 21+ messages in thread From: John Stoffel @ 2010-10-20 14:09 UTC (permalink / raw) To: Al Viro Cc: John Stoffel, Mimi Zohar, Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, Mimi Zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo >>>>> "Al" == Al Viro <viro@ZenIV.linux.org.uk> writes: Al> On Wed, Oct 20, 2010 at 09:10:27AM -0400, John Stoffel wrote: >> Does this mean I can't replace /bin/sh on a running system using IMA >> at all, even if just one process has it opened and is running? So how >> the hell am I supposed to do live upgrades on a system? Al> rename(2). Prohibition against write(2) to binary being executed is Al> *old*. Try to do the following: Al> cp /bin/sh /tmp Al> /tmp/sh & Al> cat /bin/ls >/tmp/sh Al> and you'll get Al> sh: /tmp/sh: Text file busy Al> That has nothing to do with IMA... Duh, makes sense. Thanks for educating me. I still IMA is a waste of system resources... John ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 17:28 ` Al Viro 2010-10-19 18:16 ` Mimi Zohar @ 2010-10-19 19:11 ` Matthew Wilcox 2010-10-20 3:15 ` Al Viro 1 sibling, 1 reply; 21+ messages in thread From: Matthew Wilcox @ 2010-10-19 19:11 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Tue, Oct 19, 2010 at 06:28:05PM +0100, Al Viro wrote: > I'd rather not say what I think about IMA sanity (and usefulness); as for what > it tries to do... They want to whine if you open a file that is already opened > for write and they want to whine if you open a file for write when it's already > opened for read. Unless they decide to leave the file alone, that is. Hm. Sounds like the same question that the file leases code needs answered. The important difference is that the leases code can just refuse to set a lease on inodes with multiple dentries. While my mind's on it ... Al, is this code even close to correct? if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) goto out; if ((arg == F_WRLCK) && ((atomic_read(&dentry->d_count) > 1) || (atomic_read(&inode->i_count) > 1))) goto out; -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 19:11 ` Matthew Wilcox @ 2010-10-20 3:15 ` Al Viro 2010-10-20 17:38 ` J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2010-10-20 3:15 UTC (permalink / raw) To: Matthew Wilcox Cc: Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Tue, Oct 19, 2010 at 01:11:33PM -0600, Matthew Wilcox wrote: > Hm. Sounds like the same question that the file leases code needs > answered. The important difference is that the leases code can just > refuse to set a lease on inodes with multiple dentries. > > While my mind's on it ... Al, is this code even close to correct? > > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) > goto out; > if ((arg == F_WRLCK) > && ((atomic_read(&dentry->d_count) > 1) > || (atomic_read(&inode->i_count) > 1))) > goto out; No. This is complete junk; note that e.g. ls -lR will disrupt it, since lstat(2) will bump dentry refcount. The first part is more or less OK; the second makes no sense. What is it trying to do? Note that the first part also doesn't make a lot of sense, since you could be acquiring a write reference *right* *now*, just as that check passes. And you could finish getting it before you get to do anything else in generic_setlease(). ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 3:15 ` Al Viro @ 2010-10-20 17:38 ` J. Bruce Fields 0 siblings, 0 replies; 21+ messages in thread From: J. Bruce Fields @ 2010-10-20 17:38 UTC (permalink / raw) To: Al Viro Cc: Matthew Wilcox, Linus Torvalds, Eric Paris, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo On Wed, Oct 20, 2010 at 04:15:04AM +0100, Al Viro wrote: > On Tue, Oct 19, 2010 at 01:11:33PM -0600, Matthew Wilcox wrote: > > Hm. Sounds like the same question that the file leases code needs > > answered. The important difference is that the leases code can just > > refuse to set a lease on inodes with multiple dentries. > > > > While my mind's on it ... Al, is this code even close to correct? > > > > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) > > goto out; > > if ((arg == F_WRLCK) > > && ((atomic_read(&dentry->d_count) > 1) > > || (atomic_read(&inode->i_count) > 1))) > > goto out; > > No. This is complete junk; note that e.g. ls -lR will disrupt it, since > lstat(2) will bump dentry refcount. The first part is more or less OK; > the second makes no sense. It may be nobody cares about false positives--if leases are something provided as a caching optimization, then it's OK to deny them more often than strictly necessary. (In fact, if you're using the write lease to allow somebody else to cache writes, then you want the stat to break the lease so it can get can trigger a cache flush and get an updated mtime.) Not that I'll claim those checks are correct. > What is it trying to do? Note that the first part also doesn't make a lot > of sense, since you could be acquiring a write reference *right* *now*, > just as that check passes. And you could finish getting it before you get > to do anything else in generic_setlease(). Yeah, that's a race. The lease code is broken. On my list of problems: - Leases are broken only by conflicting opens; but both nfsv4 delegations and (I'm told) Windows op locks actually require that read leases be broken on any operation that changes file metadata--including unlink, link, rename, chmod, and chown. - The internal kernel api used for lease-breaking is inherently racy as long as it consists of a single break_lease() call. (We probably need deny_leases() and undeny_leases(), or something.) I'd be happy just to have read leases that worked correctly; write leases (which, at least for the NFSv4 server's purposes, also need to be broken on *access* to metadata) are worse. I have patches that attempted to replace the current mechanism for F_RDLCK leases with an i_leasecount modeled on i_writecount. (So positive counts the number of leases held, negative counts operations temporarily disabling leases.) They were never completely correct. I also wasn't sure what sort of requirements I should meet to avoid noticeable scalability problems (how much additional space in the inode could I get away with? What about additional locks, or just writes to inode fields, on read opens?). --b. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 17:03 ` Linus Torvalds 2010-10-19 17:28 ` Al Viro @ 2010-10-19 22:49 ` Eric Paris 2010-10-20 14:38 ` Ingo Molnar 1 sibling, 1 reply; 21+ messages in thread From: Eric Paris @ 2010-10-19 22:49 UTC (permalink / raw) To: Linus Torvalds Cc: Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm, mingo Executive summary of the day's work: Yesterday morning: 944 bytes per inode in core Yesterday night: 24 bytes per inode in core Tonight: 4 bytes per inode in core. That's a x236 time reduction in memory usage. No I didn't even start looking at a freezer. Which could bring that 4 down to 0, but would add a scalability penalty on all inodes when IMA was enabled. The memory associated with inodes that IMA actually cares about has gone from 312 to 320 bytes. I'm going to follow up with my patch series again but they aren't really ready to be applied. The IBM people who wrote IMA are reviewing them. I have some questions if my RCU+RBTREE usage is valid/correct. I'd really like Al to take a close look at the last patch in the series to make sure my use of i_writecount actually does what I want it to do... -Eric ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-19 22:49 ` Eric Paris @ 2010-10-20 14:38 ` Ingo Molnar 2010-10-20 14:46 ` Eric Paris 0 siblings, 1 reply; 21+ messages in thread From: Ingo Molnar @ 2010-10-20 14:38 UTC (permalink / raw) To: Eric Paris Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm * Eric Paris <eparis@redhat.com> wrote: > Executive summary of the day's work: > Yesterday morning: 944 bytes per inode in core > Yesterday night: 24 bytes per inode in core > Tonight: 4 bytes per inode in core. > > That's a x236 time reduction in memory usage. No I didn't even start looking at a > freezer. Which could bring that 4 down to 0, but would add a scalability penalty > on all inodes when IMA was enabled. Why not use inode->i_security intelligently? That already exists so that way it's 0 bytes. Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 14:38 ` Ingo Molnar @ 2010-10-20 14:46 ` Eric Paris 2010-10-20 15:15 ` Ingo Molnar 0 siblings, 1 reply; 21+ messages in thread From: Eric Paris @ 2010-10-20 14:46 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote: > * Eric Paris <eparis@redhat.com> wrote: > > > Executive summary of the day's work: > > Yesterday morning: 944 bytes per inode in core > > Yesterday night: 24 bytes per inode in core > > Tonight: 4 bytes per inode in core. > > > > That's a x236 time reduction in memory usage. No I didn't even start looking at a > > freezer. Which could bring that 4 down to 0, but would add a scalability penalty > > on all inodes when IMA was enabled. > > Why not use inode->i_security intelligently? That already exists so that way it's 0 > bytes. > > Thanks, It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to i_security structs. It would just be hiding the memory somewhere else so it's harder to spot. And that's exactly how we got into this situation, instead of just doing thing in the inode we hid it away in a radix tree node and ima_iint_cache. In any case it would require a secondary structure struct generic_lsm_inode_structure { IMA Fields void *lsm_inode_structure; } Which means yet another pointer to the real per LSM inode struct. So it would actually likely cost memory.... The real reason I don't pursue this route is because of the litany of different ways this pointer is used in different LSMs (or not used at all.) And we all know that LSM authors aren't known for seeing the world the same way as each other. As a maintainer of one of those LSMs even I'm scared to try pushing that forward.... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 14:46 ` Eric Paris @ 2010-10-20 15:15 ` Ingo Molnar 2010-10-20 15:25 ` Eric Paris 2010-10-21 16:15 ` Casey Schaufler 0 siblings, 2 replies; 21+ messages in thread From: Ingo Molnar @ 2010-10-20 15:15 UTC (permalink / raw) To: Eric Paris Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm * Eric Paris <eparis@redhat.com> wrote: > On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote: > > * Eric Paris <eparis@redhat.com> wrote: > > > > > Executive summary of the day's work: > > > Yesterday morning: 944 bytes per inode in core > > > Yesterday night: 24 bytes per inode in core > > > Tonight: 4 bytes per inode in core. > > > > > > That's a x236 time reduction in memory usage. No I didn't even start looking > > > at a freezer. Which could bring that 4 down to 0, but would add a scalability > > > penalty on all inodes when IMA was enabled. > > > > Why not use inode->i_security intelligently? That already exists so that way > > it's 0 bytes. > > > > Thanks, > > It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to > i_security structs. [...] Only for IMA-affected files, right? My point is to keep it 0 overhead for the _non IMA common case_. > The real reason I don't pursue this route is because of the litany of different > ways this pointer is used in different LSMs (or not used at all.) And we all know > that LSM authors aren't known for seeing the world the same way as each other. As > a maintainer of one of those LSMs even I'm scared to try pushing that forward.... Ugh. That's a perfect reason to do it exactly like i suggested. Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 15:15 ` Ingo Molnar @ 2010-10-20 15:25 ` Eric Paris 2010-10-21 16:15 ` Casey Schaufler 1 sibling, 0 replies; 21+ messages in thread From: Eric Paris @ 2010-10-20 15:25 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm On Wed, 2010-10-20 at 17:15 +0200, Ingo Molnar wrote: > * Eric Paris <eparis@redhat.com> wrote: > > > On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote: > > > * Eric Paris <eparis@redhat.com> wrote: > > > > > > > Executive summary of the day's work: > > > > Yesterday morning: 944 bytes per inode in core > > > > Yesterday night: 24 bytes per inode in core > > > > Tonight: 4 bytes per inode in core. > > > > > > > > That's a x236 time reduction in memory usage. No I didn't even start looking > > > > at a freezer. Which could bring that 4 down to 0, but would add a scalability > > > > penalty on all inodes when IMA was enabled. > > > > > > Why not use inode->i_security intelligently? That already exists so that way > > > it's 0 bytes. > > > > > > Thanks, > > > > It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to > > i_security structs. [...] > > Only for IMA-affected files, right? No, we need to keep the open read counter even for non-IMA-affected files in case we later determine that it is IMA-affected. That's the 4 bytes I have today, which I said could be eliminated with a freezer that calculated it when IMA was enabled, but isn't something I'm looking at right now.... -Eric -Eric ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-20 15:15 ` Ingo Molnar 2010-10-20 15:25 ` Eric Paris @ 2010-10-21 16:15 ` Casey Schaufler 2010-10-22 8:48 ` Ingo Molnar 1 sibling, 1 reply; 21+ messages in thread From: Casey Schaufler @ 2010-10-21 16:15 UTC (permalink / raw) To: Ingo Molnar Cc: Eric Paris, Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm On 10/20/2010 8:15 AM, Ingo Molnar wrote: > * Eric Paris <eparis@redhat.com> wrote: > >> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote: >>> * Eric Paris <eparis@redhat.com> wrote: >>> >>>> Executive summary of the day's work: >>>> Yesterday morning: 944 bytes per inode in core >>>> Yesterday night: 24 bytes per inode in core >>>> Tonight: 4 bytes per inode in core. >>>> >>>> That's a x236 time reduction in memory usage. No I didn't even start looking >>>> at a freezer. Which could bring that 4 down to 0, but would add a scalability >>>> penalty on all inodes when IMA was enabled. >>> Why not use inode->i_security intelligently? That already exists so that way >>> it's 0 bytes. >>> >>> Thanks, >> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to >> i_security structs. [...] > Only for IMA-affected files, right? > > My point is to keep it 0 overhead for the _non IMA common case_. > >> The real reason I don't pursue this route is because of the litany of different >> ways this pointer is used in different LSMs (or not used at all.) And we all know >> that LSM authors aren't known for seeing the world the same way as each other. As >> a maintainer of one of those LSMs even I'm scared to try pushing that forward.... > Ugh. That's a perfect reason to do it exactly like i suggested. If you would like to make a proposal on LSM stacking other than the traditional "rip the LSM out" I am sure that everyone in the IMA, SELinux, TOMOYO, AppArmor and Smack communities would be happy to have a look. Short of having a viable mechanism for multiple LSMs to coexist IMA needs its separate space. Yes, people do use both IMA and LSMs on the same machine at the same time. > Thanks, > > Ingo > -- > To unsubscribe from this list: send the line "unsubscribe linux-security-module" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-21 16:15 ` Casey Schaufler @ 2010-10-22 8:48 ` Ingo Molnar 2010-10-22 17:50 ` Casey Schaufler 0 siblings, 1 reply; 21+ messages in thread From: Ingo Molnar @ 2010-10-22 8:48 UTC (permalink / raw) To: Casey Schaufler Cc: Eric Paris, Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm * Casey Schaufler <casey@schaufler-ca.com> wrote: > On 10/20/2010 8:15 AM, Ingo Molnar wrote: > > * Eric Paris <eparis@redhat.com> wrote: > > > >> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote: > >>> * Eric Paris <eparis@redhat.com> wrote: > >>> > >>>> Executive summary of the day's work: > >>>> Yesterday morning: 944 bytes per inode in core > >>>> Yesterday night: 24 bytes per inode in core > >>>> Tonight: 4 bytes per inode in core. > >>>> > >>>> That's a x236 time reduction in memory usage. No I didn't even start looking > >>>> at a freezer. Which could bring that 4 down to 0, but would add a scalability > >>>> penalty on all inodes when IMA was enabled. > >>> Why not use inode->i_security intelligently? That already exists so that way > >>> it's 0 bytes. > >>> > >>> Thanks, > >> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to > >> i_security structs. [...] > > Only for IMA-affected files, right? > > > > My point is to keep it 0 overhead for the _non IMA common case_. > > > >> The real reason I don't pursue this route is because of the litany of different > >> ways this pointer is used in different LSMs (or not used at all.) And we all know > >> that LSM authors aren't known for seeing the world the same way as each other. As > >> a maintainer of one of those LSMs even I'm scared to try pushing that forward.... > > Ugh. That's a perfect reason to do it exactly like i suggested. > > If you would like to make a proposal on LSM stacking other than the traditional > "rip the LSM out" I am sure that everyone in the IMA, SELinux, TOMOYO, AppArmor > and Smack communities would be happy to have a look. Short of having a viable > mechanism for multiple LSMs to coexist IMA needs its separate space. Yes, people > do use both IMA and LSMs on the same machine at the same time. Yes, that's the essence of what i suggested: if various security concepts can be present at once then inode->security should not be a stupid pointer to a single, exclusive data structure (because that hardwires a "only a single security subsystem active" assumption), but should be a pointer to a linked list of security structures - as many as there are security subsystems interested in that inode. Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] IMA: move read/write counters into struct inode 2010-10-22 8:48 ` Ingo Molnar @ 2010-10-22 17:50 ` Casey Schaufler 0 siblings, 0 replies; 21+ messages in thread From: Casey Schaufler @ 2010-10-22 17:50 UTC (permalink / raw) To: Ingo Molnar Cc: Eric Paris, Linus Torvalds, Al Viro, linux-kernel, linux-security-module, linux-fsdevel, hch, zohar, warthog9, david, jmorris, kyle, hpa, akpm On 10/22/2010 1:48 AM, Ingo Molnar wrote: > * Casey Schaufler <casey@schaufler-ca.com> wrote: > >> On 10/20/2010 8:15 AM, Ingo Molnar wrote: >>> * Eric Paris <eparis@redhat.com> wrote: >>> >>>> On Wed, 2010-10-20 at 16:38 +0200, Ingo Molnar wrote: >>>>> * Eric Paris <eparis@redhat.com> wrote: >>>>> >>>>>> Executive summary of the day's work: >>>>>> Yesterday morning: 944 bytes per inode in core >>>>>> Yesterday night: 24 bytes per inode in core >>>>>> Tonight: 4 bytes per inode in core. >>>>>> >>>>>> That's a x236 time reduction in memory usage. No I didn't even start looking >>>>>> at a freezer. Which could bring that 4 down to 0, but would add a scalability >>>>>> penalty on all inodes when IMA was enabled. >>>>> Why not use inode->i_security intelligently? That already exists so that way >>>>> it's 0 bytes. >>>>> >>>>> Thanks, >>>> It still wouldn't be 0 bytes since there would be a 1-1 mapping from inode to >>>> i_security structs. [...] >>> Only for IMA-affected files, right? >>> >>> My point is to keep it 0 overhead for the _non IMA common case_. >>> >>>> The real reason I don't pursue this route is because of the litany of different >>>> ways this pointer is used in different LSMs (or not used at all.) And we all know >>>> that LSM authors aren't known for seeing the world the same way as each other. As >>>> a maintainer of one of those LSMs even I'm scared to try pushing that forward.... >>> Ugh. That's a perfect reason to do it exactly like i suggested. >> If you would like to make a proposal on LSM stacking other than the traditional >> "rip the LSM out" I am sure that everyone in the IMA, SELinux, TOMOYO, AppArmor >> and Smack communities would be happy to have a look. Short of having a viable >> mechanism for multiple LSMs to coexist IMA needs its separate space. Yes, people >> do use both IMA and LSMs on the same machine at the same time. > Yes, that's the essence of what i suggested: if various security concepts can be > present at once then inode->security should not be a stupid pointer to a single, > exclusive data structure (because that hardwires a "only a single security subsystem > active" assumption), but should be a pointer to a linked list of security structures > - as many as there are security subsystems interested in that inode. Glad to see support for an LSM module stacker (modulator, combiner, ...) growing. All we really need to do is get someone a case of the beverage of their choice and turn them loose on the problem. I think that the few anti-stacking holdouts (I was one, but converted a couple years ago) can be swayed by a reasonable implementation. It won't be easy, there are plenty of problems that need to be solved, but anyone who wants easy should stick to developing web portals and stay out of the kernel. > > Thanks, > > Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2010-10-22 17:50 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20101019011650.25346.99614.stgit@paris.rdu.redhat.com>
[not found] ` <20101019013037.GA31393@infradead.org>
[not found] ` <1287454443.2530.124.camel@localhost.localdomain>
[not found] ` <20101019073901.GB11284@dastard>
2010-10-19 16:24 ` [PATCH 1/3] IMA: move read/write counters into struct inode Eric Paris
2010-10-19 16:29 ` Christoph Hellwig
[not found] ` <AANLkTimGdjPyLXuknDNa7WNthDT9+2FdOuPdxwjRiMHD@mail.gmail.com>
2010-10-19 16:36 ` Eric Paris
2010-10-19 16:55 ` Al Viro
2010-10-19 17:03 ` Linus Torvalds
2010-10-19 17:28 ` Al Viro
2010-10-19 18:16 ` Mimi Zohar
2010-10-20 13:10 ` John Stoffel
2010-10-20 13:36 ` Al Viro
2010-10-20 14:09 ` John Stoffel
2010-10-19 19:11 ` Matthew Wilcox
2010-10-20 3:15 ` Al Viro
2010-10-20 17:38 ` J. Bruce Fields
2010-10-19 22:49 ` Eric Paris
2010-10-20 14:38 ` Ingo Molnar
2010-10-20 14:46 ` Eric Paris
2010-10-20 15:15 ` Ingo Molnar
2010-10-20 15:25 ` Eric Paris
2010-10-21 16:15 ` Casey Schaufler
2010-10-22 8:48 ` Ingo Molnar
2010-10-22 17:50 ` Casey Schaufler
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox