* [PATCH] audit: grab a reference to context->pwd when it's cached @ 2012-08-16 1:13 Peter Moody 2012-08-16 1:26 ` Alexander Viro 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-08-16 1:13 UTC (permalink / raw) To: linux-audit On certain systems, in certain pathalogical cases, current's cwd can be deleted while we're still processing a syscall. This should prevent the system from evicting the inode while we're still referencing it. This seems to fix the bug I reported here: https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html Signed-off-by: Peter Moody <pmoody@google.com> --- kernel/auditsc.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/kernel/auditsc.c b/kernel/auditsc.c index 4b96415..e86b8b9 100644 --- a/kernel/auditsc.c +++ b/kernel/auditsc.c @@ -2064,6 +2064,7 @@ void __audit_getname(const char *name) if (!context->pwd.dentry) get_fs_pwd(current->fs, &context->pwd); + path_get(&context->pwd); } /* audit_putname - intercept a putname request @@ -2091,6 +2092,7 @@ void audit_putname(const char *name) n->name, n->name ?: "(null)"); } #endif + path_put(&context->pwd); __putname(name); } #if AUDIT_DEBUG -- 1.7.7.3 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-08-16 1:13 [PATCH] audit: grab a reference to context->pwd when it's cached Peter Moody @ 2012-08-16 1:26 ` Alexander Viro 2012-08-16 1:41 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Alexander Viro @ 2012-08-16 1:26 UTC (permalink / raw) To: Peter Moody; +Cc: linux-audit On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote: > On certain systems, in certain pathalogical cases, current's cwd can > be deleted while we're still processing a syscall. This should prevent > the system from evicting the inode while we're still referencing it. > > This seems to fix the bug I reported here: > https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html Sigh... The bug is real, but I really don't like the fix. Among other things, it's going to cause cacheline bouncing from hell and not everyone runs with audit sensibly disabled... Let me think for a while and see if I can come up with something less unpleasant, OK? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-08-16 1:26 ` Alexander Viro @ 2012-08-16 1:41 ` Peter Moody 2012-08-20 17:19 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-08-16 1:41 UTC (permalink / raw) To: Alexander Viro; +Cc: linux-audit On Wed, Aug 15, 2012 at 6:26 PM, Alexander Viro <aviro@redhat.com> wrote: > On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote: >> On certain systems, in certain pathalogical cases, current's cwd can >> be deleted while we're still processing a syscall. This should prevent >> the system from evicting the inode while we're still referencing it. >> >> This seems to fix the bug I reported here: >> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html > > Sigh... The bug is real, but I really don't like the fix. Among other > things, it's going to cause cacheline bouncing from hell and not everyone > runs with audit sensibly disabled... Let me think for a while and see > if I can come up with something less unpleasant, OK? Works for me. I have a set of machines that I can very easily test a fix on. Cheers, peter -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-08-16 1:41 ` Peter Moody @ 2012-08-20 17:19 ` Peter Moody 2012-09-25 17:03 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-08-20 17:19 UTC (permalink / raw) To: Alexander Viro; +Cc: linux-audit Hi Al, Any word on a less unpleasant fix? Also, do you know if/how I could measure the impact of the cacheline bouncing? I haven't been able to notice anything anecdotally on the machine running this patch, but I'm not pushing it incredibly hard. Cheers, peter On Wed, Aug 15, 2012 at 6:41 PM, Peter Moody <pmoody@google.com> wrote: > On Wed, Aug 15, 2012 at 6:26 PM, Alexander Viro <aviro@redhat.com> wrote: >> On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote: >>> On certain systems, in certain pathalogical cases, current's cwd can >>> be deleted while we're still processing a syscall. This should prevent >>> the system from evicting the inode while we're still referencing it. >>> >>> This seems to fix the bug I reported here: >>> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html >> >> Sigh... The bug is real, but I really don't like the fix. Among other >> things, it's going to cause cacheline bouncing from hell and not everyone >> runs with audit sensibly disabled... Let me think for a while and see >> if I can come up with something less unpleasant, OK? > > Works for me. I have a set of machines that I can very easily test a fix on. > > Cheers, > peter > > -- > Peter Moody Google 1.650.253.7306 > Security Engineer pgp:0xC3410038 -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-08-20 17:19 ` Peter Moody @ 2012-09-25 17:03 ` Peter Moody 2012-09-26 13:50 ` Alexander Viro 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-09-25 17:03 UTC (permalink / raw) To: Alexander Viro; +Cc: linux-audit Hey folks, following up on old patches, are there any comments on this? Did you get around to finding a better way to fix this bug, Al? Cheers, peter On Mon, Aug 20, 2012 at 10:19 AM, Peter Moody <pmoody@google.com> wrote: > Hi Al, > > Any word on a less unpleasant fix? Also, do you know if/how I could > measure the impact of the cacheline bouncing? I haven't been able to > notice anything anecdotally on the machine running this patch, but I'm > not pushing it incredibly hard. > > Cheers, > peter > > On Wed, Aug 15, 2012 at 6:41 PM, Peter Moody <pmoody@google.com> wrote: >> On Wed, Aug 15, 2012 at 6:26 PM, Alexander Viro <aviro@redhat.com> wrote: >>> On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote: >>>> On certain systems, in certain pathalogical cases, current's cwd can >>>> be deleted while we're still processing a syscall. This should prevent >>>> the system from evicting the inode while we're still referencing it. >>>> >>>> This seems to fix the bug I reported here: >>>> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html >>> >>> Sigh... The bug is real, but I really don't like the fix. Among other >>> things, it's going to cause cacheline bouncing from hell and not everyone >>> runs with audit sensibly disabled... Let me think for a while and see >>> if I can come up with something less unpleasant, OK? >> >> Works for me. I have a set of machines that I can very easily test a fix on. >> >> Cheers, >> peter >> >> -- >> Peter Moody Google 1.650.253.7306 >> Security Engineer pgp:0xC3410038 > > > > -- > Peter Moody Google 1.650.253.7306 > Security Engineer pgp:0xC3410038 -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-09-25 17:03 ` Peter Moody @ 2012-09-26 13:50 ` Alexander Viro 2012-09-26 18:20 ` Peter Moody 2012-10-04 18:48 ` Peter Moody 0 siblings, 2 replies; 26+ messages in thread From: Alexander Viro @ 2012-09-26 13:50 UTC (permalink / raw) To: Peter Moody; +Cc: linux-audit On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: > Hey folks, > > following up on old patches, are there any comments on this? Did you > get around to finding a better way to fix this bug, Al? Alas, I've found none ;-/ Looks like we'll have to go with this one, at least until somebody comes up with better solution. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-09-26 13:50 ` Alexander Viro @ 2012-09-26 18:20 ` Peter Moody 2012-10-04 18:48 ` Peter Moody 1 sibling, 0 replies; 26+ messages in thread From: Peter Moody @ 2012-09-26 18:20 UTC (permalink / raw) To: Alexander Viro; +Cc: linux-audit On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >> Hey folks, >> >> following up on old patches, are there any comments on this? Did you >> get around to finding a better way to fix this bug, Al? > > Alas, I've found none ;-/ Looks like we'll have to go with this one, > at least until somebody comes up with better solution. Did you want me to re-send this or is the patch from Aug 15 ack-able/apply-able? Cheers, peter -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-09-26 13:50 ` Alexander Viro 2012-09-26 18:20 ` Peter Moody @ 2012-10-04 18:48 ` Peter Moody 2012-10-05 11:02 ` Jeff Layton 2012-10-05 12:55 ` Jeff Layton 1 sibling, 2 replies; 26+ messages in thread From: Peter Moody @ 2012-10-04 18:48 UTC (permalink / raw) To: Alexander Viro; +Cc: linux-audit, jlayton On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >> Hey folks, >> >> following up on old patches, are there any comments on this? Did you >> get around to finding a better way to fix this bug, Al? > > Alas, I've found none ;-/ Looks like we'll have to go with this one, > at least until somebody comes up with better solution. Not surprisingly, this patch doesn't actually fix the issue (or at least doesn't do it correctly). I hadn't noticed that get_fs_pwd() actually calls path_get() on &context->pwd so the additional path_get() is useless and the reference doesn't ever actually get freed if audit_putname is called while we're in a syscall. Al, Eric, Jeff; do any of you guys have an understanding of what the initial bug actually is since this clearly doesn't fix it? Cheers, peter -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-04 18:48 ` Peter Moody @ 2012-10-05 11:02 ` Jeff Layton 2012-10-05 12:55 ` Jeff Layton 1 sibling, 0 replies; 26+ messages in thread From: Jeff Layton @ 2012-10-05 11:02 UTC (permalink / raw) To: Peter Moody; +Cc: linux-audit On Thu, 4 Oct 2012 11:48:23 -0700 Peter Moody <pmoody@google.com> wrote: > On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: > >> Hey folks, > >> > >> following up on old patches, are there any comments on this? Did you > >> get around to finding a better way to fix this bug, Al? > > > > Alas, I've found none ;-/ Looks like we'll have to go with this one, > > at least until somebody comes up with better solution. > > Not surprisingly, this patch doesn't actually fix the issue (or at > least doesn't do it correctly). > > I hadn't noticed that get_fs_pwd() actually calls path_get() on > &context->pwd so the additional path_get() is useless and the > reference doesn't ever actually get freed if audit_putname is called > while we're in a syscall. > > Al, Eric, Jeff; do any of you guys have an understanding of what the > initial bug actually is since this clearly doesn't fix it? > > Cheers, > peter > Hrm...so yeah. The chdir call should have pinned the dentry. I think you'll need to track down where the oops actually occurred. One way to determine that is to do something like this, but you'll need to replace the kmod in the instructions with your vmlinux: https://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Oopses -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-04 18:48 ` Peter Moody 2012-10-05 11:02 ` Jeff Layton @ 2012-10-05 12:55 ` Jeff Layton 2012-10-05 13:57 ` Peter Moody 1 sibling, 1 reply; 26+ messages in thread From: Jeff Layton @ 2012-10-05 12:55 UTC (permalink / raw) To: Peter Moody; +Cc: linux-audit On Thu, 4 Oct 2012 11:48:23 -0700 Peter Moody <pmoody@google.com> wrote: > On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: > >> Hey folks, > >> > >> following up on old patches, are there any comments on this? Did you > >> get around to finding a better way to fix this bug, Al? > > > > Alas, I've found none ;-/ Looks like we'll have to go with this one, > > at least until somebody comes up with better solution. > > Not surprisingly, this patch doesn't actually fix the issue (or at > least doesn't do it correctly). > > I hadn't noticed that get_fs_pwd() actually calls path_get() on > &context->pwd so the additional path_get() is useless and the > reference doesn't ever actually get freed if audit_putname is called > while we're in a syscall. > > Al, Eric, Jeff; do any of you guys have an understanding of what the > initial bug actually is since this clearly doesn't fix it? > > Cheers, > peter > BTW, I ran this test on one of my KVM guests and it ran just fine. That one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is ext4 though, not ext3. So perhaps that's a factor? The oops message you posted at least looks like something down in the bowels of ext3 or fs/buffer.c. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 12:55 ` Jeff Layton @ 2012-10-05 13:57 ` Peter Moody 2012-10-05 14:10 ` Jeff Layton 2012-10-05 14:26 ` Jeff Layton 0 siblings, 2 replies; 26+ messages in thread From: Peter Moody @ 2012-10-05 13:57 UTC (permalink / raw) To: Jeff Layton; +Cc: linux-audit On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: > On Thu, 4 Oct 2012 11:48:23 -0700 > Peter Moody <pmoody@google.com> wrote: > >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >> >> Hey folks, >> >> >> >> following up on old patches, are there any comments on this? Did you >> >> get around to finding a better way to fix this bug, Al? >> > >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, >> > at least until somebody comes up with better solution. >> >> Not surprisingly, this patch doesn't actually fix the issue (or at >> least doesn't do it correctly). >> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on >> &context->pwd so the additional path_get() is useless and the >> reference doesn't ever actually get freed if audit_putname is called >> while we're in a syscall. >> >> Al, Eric, Jeff; do any of you guys have an understanding of what the >> initial bug actually is since this clearly doesn't fix it? >> >> Cheers, >> peter >> > > BTW, I ran this test on one of my KVM guests and it ran just fine. That > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is > ext4 though, not ext3. So perhaps that's a factor? > > The oops message you posted at least looks like something down in the > bowels of ext3 or fs/buffer.c. Yeah, the only place this actually happens for me on these giant xen instances we have (6 cores, 32G ram) and it happens on both ext3 and ext4 filesystems and it happens with 100% reliability. The actual oops is from: static inline void check_irqs_on(void) { #ifdef irqs_disabled BUG_ON(irqs_disabled()); #endif } with the code path looking like: __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() > -- > Jeff Layton <jlayton@redhat.com> -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 13:57 ` Peter Moody @ 2012-10-05 14:10 ` Jeff Layton 2012-10-05 14:26 ` Jeff Layton 1 sibling, 0 replies; 26+ messages in thread From: Jeff Layton @ 2012-10-05 14:10 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit On Fri, 5 Oct 2012 06:57:59 -0700 Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: > > On Thu, 4 Oct 2012 11:48:23 -0700 > > Peter Moody <pmoody@google.com> wrote: > > > >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: > >> >> Hey folks, > >> >> > >> >> following up on old patches, are there any comments on this? Did you > >> >> get around to finding a better way to fix this bug, Al? > >> > > >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, > >> > at least until somebody comes up with better solution. > >> > >> Not surprisingly, this patch doesn't actually fix the issue (or at > >> least doesn't do it correctly). > >> > >> I hadn't noticed that get_fs_pwd() actually calls path_get() on > >> &context->pwd so the additional path_get() is useless and the > >> reference doesn't ever actually get freed if audit_putname is called > >> while we're in a syscall. > >> > >> Al, Eric, Jeff; do any of you guys have an understanding of what the > >> initial bug actually is since this clearly doesn't fix it? > >> > >> Cheers, > >> peter > >> > > > > BTW, I ran this test on one of my KVM guests and it ran just fine. That > > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is > > ext4 though, not ext3. So perhaps that's a factor? > > > > The oops message you posted at least looks like something down in the > > bowels of ext3 or fs/buffer.c. > > Yeah, the only place this actually happens for me on these giant xen > instances we have (6 cores, 32G ram) and it happens on both ext3 and > ext4 filesystems and it happens with 100% reliability. > > The actual oops is from: > > static inline void check_irqs_on(void) > { > #ifdef irqs_disabled > BUG_ON(irqs_disabled()); > #endif > } > > with the code path looking like: > > __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() > Weird -- I wonder what's disabling IRQs there? Might be best to re-report this to linux-fsdevel and lkml so you can reach a broader audience... -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 13:57 ` Peter Moody 2012-10-05 14:10 ` Jeff Layton @ 2012-10-05 14:26 ` Jeff Layton 2012-10-05 15:16 ` Peter Moody 1 sibling, 1 reply; 26+ messages in thread From: Jeff Layton @ 2012-10-05 14:26 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit, sandeen On Fri, 5 Oct 2012 06:57:59 -0700 Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: > > On Thu, 4 Oct 2012 11:48:23 -0700 > > Peter Moody <pmoody@google.com> wrote: > > > >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: > >> >> Hey folks, > >> >> > >> >> following up on old patches, are there any comments on this? Did you > >> >> get around to finding a better way to fix this bug, Al? > >> > > >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, > >> > at least until somebody comes up with better solution. > >> > >> Not surprisingly, this patch doesn't actually fix the issue (or at > >> least doesn't do it correctly). > >> > >> I hadn't noticed that get_fs_pwd() actually calls path_get() on > >> &context->pwd so the additional path_get() is useless and the > >> reference doesn't ever actually get freed if audit_putname is called > >> while we're in a syscall. > >> > >> Al, Eric, Jeff; do any of you guys have an understanding of what the > >> initial bug actually is since this clearly doesn't fix it? > >> > >> Cheers, > >> peter > >> > > > > BTW, I ran this test on one of my KVM guests and it ran just fine. That > > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is > > ext4 though, not ext3. So perhaps that's a factor? > > > > The oops message you posted at least looks like something down in the > > bowels of ext3 or fs/buffer.c. > > Yeah, the only place this actually happens for me on these giant xen > instances we have (6 cores, 32G ram) and it happens on both ext3 and > ext4 filesystems and it happens with 100% reliability. > > The actual oops is from: > > static inline void check_irqs_on(void) > { > #ifdef irqs_disabled > BUG_ON(irqs_disabled()); > #endif > } > > with the code path looking like: > > __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() > Do you have a backtrace from a more recent kernel? I wonder if something in the syscall exit codepath is disabling IRQs here? -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 14:26 ` Jeff Layton @ 2012-10-05 15:16 ` Peter Moody 2012-10-05 15:18 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-10-05 15:16 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: > On Fri, 5 Oct 2012 06:57:59 -0700 > Peter Moody <pmoody@google.com> wrote: > >> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: >> > On Thu, 4 Oct 2012 11:48:23 -0700 >> > Peter Moody <pmoody@google.com> wrote: >> > >> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >> >> >> Hey folks, >> >> >> >> >> >> following up on old patches, are there any comments on this? Did you >> >> >> get around to finding a better way to fix this bug, Al? >> >> > >> >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, >> >> > at least until somebody comes up with better solution. >> >> >> >> Not surprisingly, this patch doesn't actually fix the issue (or at >> >> least doesn't do it correctly). >> >> >> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on >> >> &context->pwd so the additional path_get() is useless and the >> >> reference doesn't ever actually get freed if audit_putname is called >> >> while we're in a syscall. >> >> >> >> Al, Eric, Jeff; do any of you guys have an understanding of what the >> >> initial bug actually is since this clearly doesn't fix it? >> >> >> >> Cheers, >> >> peter >> >> >> > >> > BTW, I ran this test on one of my KVM guests and it ran just fine. That >> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is >> > ext4 though, not ext3. So perhaps that's a factor? >> > >> > The oops message you posted at least looks like something down in the >> > bowels of ext3 or fs/buffer.c. >> >> Yeah, the only place this actually happens for me on these giant xen >> instances we have (6 cores, 32G ram) and it happens on both ext3 and >> ext4 filesystems and it happens with 100% reliability. >> >> The actual oops is from: >> >> static inline void check_irqs_on(void) >> { >> #ifdef irqs_disabled >> BUG_ON(irqs_disabled()); >> #endif >> } >> >> with the code path looking like: >> >> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() >> > > Do you have a backtrace from a more recent kernel? I wonder if > something in the syscall exit codepath is disabling IRQs here? is 3.6.0-rc1 recent enough or do you want something newer? > -- > Jeff Layton <jlayton@redhat.com> -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 15:16 ` Peter Moody @ 2012-10-05 15:18 ` Peter Moody 2012-10-05 15:57 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-10-05 15:18 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: >> On Fri, 5 Oct 2012 06:57:59 -0700 >> Peter Moody <pmoody@google.com> wrote: >> >>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: >>> > On Thu, 4 Oct 2012 11:48:23 -0700 >>> > Peter Moody <pmoody@google.com> wrote: >>> > >>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >>> >> >> Hey folks, >>> >> >> >>> >> >> following up on old patches, are there any comments on this? Did you >>> >> >> get around to finding a better way to fix this bug, Al? >>> >> > >>> >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, >>> >> > at least until somebody comes up with better solution. >>> >> >>> >> Not surprisingly, this patch doesn't actually fix the issue (or at >>> >> least doesn't do it correctly). >>> >> >>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on >>> >> &context->pwd so the additional path_get() is useless and the >>> >> reference doesn't ever actually get freed if audit_putname is called >>> >> while we're in a syscall. >>> >> >>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the >>> >> initial bug actually is since this clearly doesn't fix it? >>> >> >>> >> Cheers, >>> >> peter >>> >> >>> > >>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That >>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is >>> > ext4 though, not ext3. So perhaps that's a factor? >>> > >>> > The oops message you posted at least looks like something down in the >>> > bowels of ext3 or fs/buffer.c. >>> >>> Yeah, the only place this actually happens for me on these giant xen >>> instances we have (6 cores, 32G ram) and it happens on both ext3 and >>> ext4 filesystems and it happens with 100% reliability. >>> >>> The actual oops is from: >>> >>> static inline void check_irqs_on(void) >>> { >>> #ifdef irqs_disabled >>> BUG_ON(irqs_disabled()); >>> #endif >>> } >>> >>> with the code path looking like: >>> >>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() >>> >> >> Do you have a backtrace from a more recent kernel? I wonder if >> something in the syscall exit codepath is disabling IRQs here? > > is 3.6.0-rc1 recent enough or do you want something newer? nevermind, that doesn't boot. One sec. >> -- >> Jeff Layton <jlayton@redhat.com> > > > > -- > Peter Moody Google 1.650.253.7306 > Security Engineer pgp:0xC3410038 -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 15:18 ` Peter Moody @ 2012-10-05 15:57 ` Peter Moody 2012-10-05 16:15 ` Jeff Layton 2012-10-05 20:23 ` Eric Sandeen 0 siblings, 2 replies; 26+ messages in thread From: Peter Moody @ 2012-10-05 15:57 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote: >> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: >>> On Fri, 5 Oct 2012 06:57:59 -0700 >>> Peter Moody <pmoody@google.com> wrote: >>> >>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>> > On Thu, 4 Oct 2012 11:48:23 -0700 >>>> > Peter Moody <pmoody@google.com> wrote: >>>> > >>>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >>>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >>>> >> >> Hey folks, >>>> >> >> >>>> >> >> following up on old patches, are there any comments on this? Did you >>>> >> >> get around to finding a better way to fix this bug, Al? >>>> >> > >>>> >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, >>>> >> > at least until somebody comes up with better solution. >>>> >> >>>> >> Not surprisingly, this patch doesn't actually fix the issue (or at >>>> >> least doesn't do it correctly). >>>> >> >>>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on >>>> >> &context->pwd so the additional path_get() is useless and the >>>> >> reference doesn't ever actually get freed if audit_putname is called >>>> >> while we're in a syscall. >>>> >> >>>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the >>>> >> initial bug actually is since this clearly doesn't fix it? >>>> >> >>>> >> Cheers, >>>> >> peter >>>> >> >>>> > >>>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That >>>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is >>>> > ext4 though, not ext3. So perhaps that's a factor? >>>> > >>>> > The oops message you posted at least looks like something down in the >>>> > bowels of ext3 or fs/buffer.c. >>>> >>>> Yeah, the only place this actually happens for me on these giant xen >>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and >>>> ext4 filesystems and it happens with 100% reliability. >>>> >>>> The actual oops is from: >>>> >>>> static inline void check_irqs_on(void) >>>> { >>>> #ifdef irqs_disabled >>>> BUG_ON(irqs_disabled()); >>>> #endif >>>> } >>>> >>>> with the code path looking like: >>>> >>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() >>>> >>> >>> Do you have a backtrace from a more recent kernel? I wonder if >>> something in the syscall exit codepath is disabling IRQs here? >> >> is 3.6.0-rc1 recent enough or do you want something newer? > > nevermind, that doesn't boot. One sec. here's 3.5.0 ------------[ cut here ]------------ kernel BUG at fs/buffer.c:1220! invalid opcode: 0000 [#1] SMP CPU 0 Pid: 3683, comm: a.out Not tainted 3.5.0 #3 RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 CS: e033 DS: 002b ES: 002b CR0: 000000008005003b CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) Stack: ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 Call Trace: [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 [<ffffffff816af231>] ? down_read+0x11/0x30 [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 [<ffffffff81153b9a>] evict+0xaa/0x1b0 [<ffffffff81154843>] iput+0x103/0x210 [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 [<ffffffff811505ec>] dput+0x12c/0x250 [<ffffffff81146275>] path_put+0x15/0x30 [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 [<ffffffff816b30be>] sysexit_audit+0x29/0x5b Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 0c 80 43 5c 01 48 89 df e8 d5 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 41 54 53 RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 RSP <ffff8807b156dc28> ---[ end trace 8d09f8cfbb601c14 ]--- >>> -- >>> Jeff Layton <jlayton@redhat.com> >> >> >> >> -- >> Peter Moody Google 1.650.253.7306 >> Security Engineer pgp:0xC3410038 > > > > -- > Peter Moody Google 1.650.253.7306 > Security Engineer pgp:0xC3410038 -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 15:57 ` Peter Moody @ 2012-10-05 16:15 ` Jeff Layton 2012-10-05 17:04 ` Peter Moody 2012-10-05 20:23 ` Eric Sandeen 1 sibling, 1 reply; 26+ messages in thread From: Jeff Layton @ 2012-10-05 16:15 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit, sandeen On Fri, 5 Oct 2012 08:57:13 -0700 Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote: > > On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote: > >> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: > >>> On Fri, 5 Oct 2012 06:57:59 -0700 > >>> Peter Moody <pmoody@google.com> wrote: > >>> > >>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: > >>>> > On Thu, 4 Oct 2012 11:48:23 -0700 > >>>> > Peter Moody <pmoody@google.com> wrote: > >>>> > > >>>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: > >>>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: > >>>> >> >> Hey folks, > >>>> >> >> > >>>> >> >> following up on old patches, are there any comments on this? Did you > >>>> >> >> get around to finding a better way to fix this bug, Al? > >>>> >> > > >>>> >> > Alas, I've found none ;-/ Looks like we'll have to go with this one, > >>>> >> > at least until somebody comes up with better solution. > >>>> >> > >>>> >> Not surprisingly, this patch doesn't actually fix the issue (or at > >>>> >> least doesn't do it correctly). > >>>> >> > >>>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on > >>>> >> &context->pwd so the additional path_get() is useless and the > >>>> >> reference doesn't ever actually get freed if audit_putname is called > >>>> >> while we're in a syscall. > >>>> >> > >>>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the > >>>> >> initial bug actually is since this clearly doesn't fix it? > >>>> >> > >>>> >> Cheers, > >>>> >> peter > >>>> >> > >>>> > > >>>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That > >>>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is > >>>> > ext4 though, not ext3. So perhaps that's a factor? > >>>> > > >>>> > The oops message you posted at least looks like something down in the > >>>> > bowels of ext3 or fs/buffer.c. > >>>> > >>>> Yeah, the only place this actually happens for me on these giant xen > >>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and > >>>> ext4 filesystems and it happens with 100% reliability. > >>>> > >>>> The actual oops is from: > >>>> > >>>> static inline void check_irqs_on(void) > >>>> { > >>>> #ifdef irqs_disabled > >>>> BUG_ON(irqs_disabled()); > >>>> #endif > >>>> } > >>>> > >>>> with the code path looking like: > >>>> > >>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() > >>>> > >>> > >>> Do you have a backtrace from a more recent kernel? I wonder if > >>> something in the syscall exit codepath is disabling IRQs here? > >> > >> is 3.6.0-rc1 recent enough or do you want something newer? > > > > nevermind, that doesn't boot. One sec. > > > here's 3.5.0 > > ------------[ cut here ]------------ > kernel BUG at fs/buffer.c:1220! > invalid opcode: 0000 [#1] SMP > CPU 0 > Pid: 3683, comm: a.out Not tainted 3.5.0 #3 > RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] > check_irqs_on.part.8+0x4/0x6 > RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 > RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 > RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 > RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 > R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 > R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 > FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 > CS: e033 DS: 002b ES: 002b CR0: 000000008005003b > CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) > Stack: > ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 > ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 > ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 > Call Trace: > [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 > [<ffffffff816af231>] ? down_read+0x11/0x30 > [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 > [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 > [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 > [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 > [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 > [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 > [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 > [<ffffffff81153b9a>] evict+0xaa/0x1b0 > [<ffffffff81154843>] iput+0x103/0x210 > [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 > [<ffffffff811505ec>] dput+0x12c/0x250 > [<ffffffff81146275>] path_put+0x15/0x30 > [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 > [<ffffffff816b30be>] sysexit_audit+0x29/0x5b > Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 > 0c 80 43 5c 01 48 89 df e8 d5 > 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 > e5 0f 0b 55 48 89 e5 41 54 53 > RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 > RSP <ffff8807b156dc28> > ---[ end trace 8d09f8cfbb601c14 ]--- > > I don't see a function called sysexit_audit in 3.5. I assume that's created via some sort of macro goop? Could this be a xen specific problem? Might be interesting to add a check for irqs being disabled early in __audit_syscall_exit and see whether its doing that universally in its syscall exit routine? -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 16:15 ` Jeff Layton @ 2012-10-05 17:04 ` Peter Moody 2012-10-05 19:41 ` Peter Moody 2012-10-08 14:10 ` Jeff Layton 0 siblings, 2 replies; 26+ messages in thread From: Peter Moody @ 2012-10-05 17:04 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote: >> >> ------------[ cut here ]------------ >> kernel BUG at fs/buffer.c:1220! >> invalid opcode: 0000 [#1] SMP >> CPU 0 >> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 >> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] >> check_irqs_on.part.8+0x4/0x6 >> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 >> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 >> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 >> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 >> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 >> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 >> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 >> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b >> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) >> Stack: >> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 >> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 >> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 >> Call Trace: >> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 >> [<ffffffff816af231>] ? down_read+0x11/0x30 >> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 >> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 >> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 >> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 >> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 >> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 >> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 >> [<ffffffff81153b9a>] evict+0xaa/0x1b0 >> [<ffffffff81154843>] iput+0x103/0x210 >> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 >> [<ffffffff811505ec>] dput+0x12c/0x250 >> [<ffffffff81146275>] path_put+0x15/0x30 >> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 >> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b >> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 >> 0c 80 43 5c 01 48 89 df e8 d5 >> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 >> e5 0f 0b 55 48 89 e5 41 54 53 >> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 >> RSP <ffff8807b156dc28> >> ---[ end trace 8d09f8cfbb601c14 ]--- >> >> > > I don't see a function called sysexit_audit in 3.5. I assume that's > created via some sort of macro goop? It looks like it's defined in arch/x86/kernel/entry_32.S my asm is non-existent, but it looks like it's calling __audit_syscall_exit and the disabling interrupts. > Could this be a xen specific problem? It could be. I asked the xen folks about this and they felt that it was unlikely that the bug itself was their (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html, specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html) xen could definitely be doing something wonky with irqs though. > Might be interesting to add a > check for irqs being disabled early in __audit_syscall_exit and see > whether its doing that universally in its syscall exit routine? I'll try this, thanks. > -- > Jeff Layton <jlayton@redhat.com> -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 17:04 ` Peter Moody @ 2012-10-05 19:41 ` Peter Moody 2012-10-08 12:23 ` Jeff Layton 2012-10-08 14:10 ` Jeff Layton 1 sibling, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-10-05 19:41 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote: > >>> >>> ------------[ cut here ]------------ >>> kernel BUG at fs/buffer.c:1220! >>> invalid opcode: 0000 [#1] SMP >>> CPU 0 >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 >>> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] >>> check_irqs_on.part.8+0x4/0x6 >>> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 >>> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 >>> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) >>> Stack: >>> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 >>> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 >>> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 >>> Call Trace: >>> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 >>> [<ffffffff816af231>] ? down_read+0x11/0x30 >>> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 >>> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 >>> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 >>> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 >>> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 >>> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 >>> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 >>> [<ffffffff81153b9a>] evict+0xaa/0x1b0 >>> [<ffffffff81154843>] iput+0x103/0x210 >>> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 >>> [<ffffffff811505ec>] dput+0x12c/0x250 >>> [<ffffffff81146275>] path_put+0x15/0x30 >>> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 >>> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 >>> 0c 80 43 5c 01 48 89 df e8 d5 >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 >>> e5 0f 0b 55 48 89 e5 41 54 53 >>> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 >>> RSP <ffff8807b156dc28> >>> ---[ end trace 8d09f8cfbb601c14 ]--- >>> >>> >> >> I don't see a function called sysexit_audit in 3.5. I assume that's >> created via some sort of macro goop? > > It looks like it's defined in arch/x86/kernel/entry_32.S > > my asm is non-existent, but it looks like it's calling > __audit_syscall_exit and the disabling interrupts. > >> Could this be a xen specific problem? > > It could be. I asked the xen folks about this and they felt that it > was unlikely that the bug itself was their > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html, > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html) > > xen could definitely be doing something wonky with irqs though. > >> Might be interesting to add a >> check for irqs being disabled early in __audit_syscall_exit and see >> whether its doing that universally in its syscall exit routine? > > I'll try this, thanks. So, I've added WARN_ON(irqs_disabled()) in various spots in the audit_syscall_exit path all the way up to the path_put in audit_free_names() that ends up triggering this and prior to Oops'ing, I don't see anything. Post oops, I get lots of messages like: ------------[ cut here ]------------ WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 Call Trace: [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 [<ffffffff81693483>] audit_free_names+0xb0/0xdb [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f ---[ end trace ec7a15dbe40eaf1d ]--- ------------[ cut here ]------------ WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 Call Trace: [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 [<ffffffff81693483>] audit_free_names+0xb0/0xdb [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f ---[ end trace ec7a15dbe40eaf1e ]--- where that python command is a python par file that may be doing some funny unlinking as well. I haven't tracked it down. thoughts? does this not look like a bug in audit; maybe something somewhere else is disabling interrupts? >> -- >> Jeff Layton <jlayton@redhat.com> > > > > -- > Peter Moody Google 1.650.253.7306 > Security Engineer pgp:0xC3410038 -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 19:41 ` Peter Moody @ 2012-10-08 12:23 ` Jeff Layton 2012-10-08 15:22 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Jeff Layton @ 2012-10-08 12:23 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit, sandeen On Fri, 5 Oct 2012 12:41:46 -0700 Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote: > > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote: > > > >>> > >>> ------------[ cut here ]------------ > >>> kernel BUG at fs/buffer.c:1220! > >>> invalid opcode: 0000 [#1] SMP > >>> CPU 0 > >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 > >>> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] > >>> check_irqs_on.part.8+0x4/0x6 > >>> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 > >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 > >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 > >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 > >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 > >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 > >>> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 > >>> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b > >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 > >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) > >>> Stack: > >>> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 > >>> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 > >>> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 > >>> Call Trace: > >>> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 > >>> [<ffffffff816af231>] ? down_read+0x11/0x30 > >>> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 > >>> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 > >>> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 > >>> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 > >>> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 > >>> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 > >>> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 > >>> [<ffffffff81153b9a>] evict+0xaa/0x1b0 > >>> [<ffffffff81154843>] iput+0x103/0x210 > >>> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 > >>> [<ffffffff811505ec>] dput+0x12c/0x250 > >>> [<ffffffff81146275>] path_put+0x15/0x30 > >>> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 > >>> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b > >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 > >>> 0c 80 43 5c 01 48 89 df e8 d5 > >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 > >>> e5 0f 0b 55 48 89 e5 41 54 53 > >>> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 > >>> RSP <ffff8807b156dc28> > >>> ---[ end trace 8d09f8cfbb601c14 ]--- > >>> > >>> > >> > >> I don't see a function called sysexit_audit in 3.5. I assume that's > >> created via some sort of macro goop? > > > > It looks like it's defined in arch/x86/kernel/entry_32.S > > > > my asm is non-existent, but it looks like it's calling > > __audit_syscall_exit and the disabling interrupts. > > > >> Could this be a xen specific problem? > > > > It could be. I asked the xen folks about this and they felt that it > > was unlikely that the bug itself was their > > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html, > > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html) > > > > xen could definitely be doing something wonky with irqs though. > > > >> Might be interesting to add a > >> check for irqs being disabled early in __audit_syscall_exit and see > >> whether its doing that universally in its syscall exit routine? > > > > I'll try this, thanks. > > So, I've added WARN_ON(irqs_disabled()) in various spots in the > audit_syscall_exit path all the way up to the path_put in > audit_free_names() that ends up triggering this and prior to Oops'ing, > I don't see anything. > > Post oops, I get lots of messages like: > > ------------[ cut here ]------------ > WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() > Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 > Call Trace: > [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb > [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 > [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 > [<ffffffff81693483>] audit_free_names+0xb0/0xdb > [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 > [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f > ---[ end trace ec7a15dbe40eaf1d ]--- > ------------[ cut here ]------------ > WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() > Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 > Call Trace: > [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb > [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 > [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 > [<ffffffff81693483>] audit_free_names+0xb0/0xdb > [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 > [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f > ---[ end trace ec7a15dbe40eaf1e ]--- > > where that python command is a python par file that may be doing some > funny unlinking as well. I haven't tracked it down. > > thoughts? does this not look like a bug in audit; maybe something > somewhere else is disabling interrupts? > I'm not sure whether it's a bug in audit or not. Tracking down what's disabling interrupts here is probably the key though... So I'm guessing the WARN_ON above was in audit_free_names. Was there a similar WARN_ON in audit_syscall_exit that did not fire? -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-08 12:23 ` Jeff Layton @ 2012-10-08 15:22 ` Peter Moody 2012-10-08 16:45 ` Peter Moody 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-10-08 15:22 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Mon, Oct 8, 2012 at 5:23 AM, Jeff Layton <jlayton@redhat.com> wrote: > On Fri, 5 Oct 2012 12:41:46 -0700 > Peter Moody <pmoody@google.com> wrote: > >> On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote: >> > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote: >> > >> >>> >> >>> ------------[ cut here ]------------ >> >>> kernel BUG at fs/buffer.c:1220! >> >>> invalid opcode: 0000 [#1] SMP >> >>> CPU 0 >> >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 >> >>> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] >> >>> check_irqs_on.part.8+0x4/0x6 >> >>> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 >> >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 >> >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 >> >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 >> >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 >> >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 >> >>> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 >> >>> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b >> >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 >> >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) >> >>> Stack: >> >>> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 >> >>> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 >> >>> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 >> >>> Call Trace: >> >>> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 >> >>> [<ffffffff816af231>] ? down_read+0x11/0x30 >> >>> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 >> >>> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 >> >>> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 >> >>> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 >> >>> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 >> >>> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 >> >>> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 >> >>> [<ffffffff81153b9a>] evict+0xaa/0x1b0 >> >>> [<ffffffff81154843>] iput+0x103/0x210 >> >>> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 >> >>> [<ffffffff811505ec>] dput+0x12c/0x250 >> >>> [<ffffffff81146275>] path_put+0x15/0x30 >> >>> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 >> >>> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b >> >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 >> >>> 0c 80 43 5c 01 48 89 df e8 d5 >> >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 >> >>> e5 0f 0b 55 48 89 e5 41 54 53 >> >>> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 >> >>> RSP <ffff8807b156dc28> >> >>> ---[ end trace 8d09f8cfbb601c14 ]--- >> >>> >> >>> >> >> >> >> I don't see a function called sysexit_audit in 3.5. I assume that's >> >> created via some sort of macro goop? >> > >> > It looks like it's defined in arch/x86/kernel/entry_32.S >> > >> > my asm is non-existent, but it looks like it's calling >> > __audit_syscall_exit and the disabling interrupts. >> > >> >> Could this be a xen specific problem? >> > >> > It could be. I asked the xen folks about this and they felt that it >> > was unlikely that the bug itself was their >> > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html, >> > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html) >> > >> > xen could definitely be doing something wonky with irqs though. >> > >> >> Might be interesting to add a >> >> check for irqs being disabled early in __audit_syscall_exit and see >> >> whether its doing that universally in its syscall exit routine? >> > >> > I'll try this, thanks. >> >> So, I've added WARN_ON(irqs_disabled()) in various spots in the >> audit_syscall_exit path all the way up to the path_put in >> audit_free_names() that ends up triggering this and prior to Oops'ing, >> I don't see anything. >> >> Post oops, I get lots of messages like: >> >> ------------[ cut here ]------------ >> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() >> Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 >> Call Trace: >> [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb >> [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 >> [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 >> [<ffffffff81693483>] audit_free_names+0xb0/0xdb >> [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 >> [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f >> ---[ end trace ec7a15dbe40eaf1d ]--- >> ------------[ cut here ]------------ >> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() >> Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 >> Call Trace: >> [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb >> [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 >> [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 >> [<ffffffff81693483>] audit_free_names+0xb0/0xdb >> [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 >> [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f >> ---[ end trace ec7a15dbe40eaf1e ]--- >> >> where that python command is a python par file that may be doing some >> funny unlinking as well. I haven't tracked it down. >> >> thoughts? does this not look like a bug in audit; maybe something >> somewhere else is disabling interrupts? >> > > I'm not sure whether it's a bug in audit or not. Tracking down what's > disabling interrupts here is probably the key though... Yeah. I'm trying to get a reproducer to work on a different version of Xen. > So I'm guessing the WARN_ON above was in audit_free_names. Was there a > similar WARN_ON in audit_syscall_exit that did not fire? Correct. I booted a few kernels with the WARN_ON successively further down the stack in audit land and it only started firing after BUG_ON() in fs/buffer.c Thanks for looking into this. Cheers, peter -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-08 15:22 ` Peter Moody @ 2012-10-08 16:45 ` Peter Moody 0 siblings, 0 replies; 26+ messages in thread From: Peter Moody @ 2012-10-08 16:45 UTC (permalink / raw) To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen On Mon, Oct 8, 2012 at 8:22 AM, Peter Moody <pmoody@google.com> wrote: > On Mon, Oct 8, 2012 at 5:23 AM, Jeff Layton <jlayton@redhat.com> wrote: >> On Fri, 5 Oct 2012 12:41:46 -0700 >> Peter Moody <pmoody@google.com> wrote: >> >>> On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote: >>> > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote: >>> > >>> >>> >>> >>> ------------[ cut here ]------------ >>> >>> kernel BUG at fs/buffer.c:1220! >>> >>> invalid opcode: 0000 [#1] SMP >>> >>> CPU 0 >>> >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 >>> >>> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] >>> >>> check_irqs_on.part.8+0x4/0x6 >>> >>> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 >>> >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 >>> >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 >>> >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 >>> >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 >>> >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 >>> >>> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 >>> >>> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b >>> >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 >>> >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >>> >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) >>> >>> Stack: >>> >>> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 >>> >>> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 >>> >>> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 >>> >>> Call Trace: >>> >>> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 >>> >>> [<ffffffff816af231>] ? down_read+0x11/0x30 >>> >>> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 >>> >>> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 >>> >>> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 >>> >>> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 >>> >>> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 >>> >>> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 >>> >>> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 >>> >>> [<ffffffff81153b9a>] evict+0xaa/0x1b0 >>> >>> [<ffffffff81154843>] iput+0x103/0x210 >>> >>> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 >>> >>> [<ffffffff811505ec>] dput+0x12c/0x250 >>> >>> [<ffffffff81146275>] path_put+0x15/0x30 >>> >>> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 >>> >>> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b >>> >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 >>> >>> 0c 80 43 5c 01 48 89 df e8 d5 >>> >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 >>> >>> e5 0f 0b 55 48 89 e5 41 54 53 >>> >>> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 >>> >>> RSP <ffff8807b156dc28> >>> >>> ---[ end trace 8d09f8cfbb601c14 ]--- >>> >>> >>> >>> >>> >> >>> >> I don't see a function called sysexit_audit in 3.5. I assume that's >>> >> created via some sort of macro goop? >>> > >>> > It looks like it's defined in arch/x86/kernel/entry_32.S >>> > >>> > my asm is non-existent, but it looks like it's calling >>> > __audit_syscall_exit and the disabling interrupts. >>> > >>> >> Could this be a xen specific problem? >>> > >>> > It could be. I asked the xen folks about this and they felt that it >>> > was unlikely that the bug itself was their >>> > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html, >>> > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html) >>> > >>> > xen could definitely be doing something wonky with irqs though. >>> > >>> >> Might be interesting to add a >>> >> check for irqs being disabled early in __audit_syscall_exit and see >>> >> whether its doing that universally in its syscall exit routine? >>> > >>> > I'll try this, thanks. >>> >>> So, I've added WARN_ON(irqs_disabled()) in various spots in the >>> audit_syscall_exit path all the way up to the path_put in >>> audit_free_names() that ends up triggering this and prior to Oops'ing, >>> I don't see anything. >>> >>> Post oops, I get lots of messages like: >>> >>> ------------[ cut here ]------------ >>> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() >>> Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 >>> Call Trace: >>> [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb >>> [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 >>> [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 >>> [<ffffffff81693483>] audit_free_names+0xb0/0xdb >>> [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 >>> [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f >>> ---[ end trace ec7a15dbe40eaf1d ]--- >>> ------------[ cut here ]------------ >>> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb() >>> Pid: 3981, comm: python2.6 Tainted: G D W 3.2.5-at17-ganetixenu #31 >>> Call Trace: >>> [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb >>> [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0 >>> [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20 >>> [<ffffffff81693483>] audit_free_names+0xb0/0xdb >>> [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 >>> [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f >>> ---[ end trace ec7a15dbe40eaf1e ]--- >>> >>> where that python command is a python par file that may be doing some >>> funny unlinking as well. I haven't tracked it down. >>> >>> thoughts? does this not look like a bug in audit; maybe something >>> somewhere else is disabling interrupts? >>> >> >> I'm not sure whether it's a bug in audit or not. Tracking down what's >> disabling interrupts here is probably the key though... > > Yeah. I'm trying to get a reproducer to work on a different version of Xen. > >> So I'm guessing the WARN_ON above was in audit_free_names. Was there a >> similar WARN_ON in audit_syscall_exit that did not fire? > > Correct. I booted a few kernels with the WARN_ON successively further > down the stack in audit land and it only started firing after BUG_ON() > in fs/buffer.c > > Thanks for looking into this. So one more random data point: I can't repro this with ext2, but ext3 and ext4 both crash immediately. off to bug fs-devel.. > Cheers, > peter > -- > Peter Moody Google 1.650.253.7306 > Security Engineer pgp:0xC3410038 -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 17:04 ` Peter Moody 2012-10-05 19:41 ` Peter Moody @ 2012-10-08 14:10 ` Jeff Layton 1 sibling, 0 replies; 26+ messages in thread From: Jeff Layton @ 2012-10-08 14:10 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit, sandeen On Fri, 5 Oct 2012 10:04:42 -0700 Peter Moody <pmoody@google.com> wrote: > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote: > > >> > >> ------------[ cut here ]------------ > >> kernel BUG at fs/buffer.c:1220! > >> invalid opcode: 0000 [#1] SMP > >> CPU 0 > >> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 > >> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] > >> check_irqs_on.part.8+0x4/0x6 > >> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 > >> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 > >> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 > >> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 > >> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 > >> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 > >> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 > >> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b > >> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 > >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > >> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) > >> Stack: > >> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 > >> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 > >> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 > >> Call Trace: > >> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 > >> [<ffffffff816af231>] ? down_read+0x11/0x30 > >> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 > >> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 > >> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 > >> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 > >> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 > >> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 > >> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 > >> [<ffffffff81153b9a>] evict+0xaa/0x1b0 > >> [<ffffffff81154843>] iput+0x103/0x210 > >> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 > >> [<ffffffff811505ec>] dput+0x12c/0x250 > >> [<ffffffff81146275>] path_put+0x15/0x30 > >> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 > >> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b > >> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 > >> 0c 80 43 5c 01 48 89 df e8 d5 > >> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 > >> e5 0f 0b 55 48 89 e5 41 54 53 > >> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 > >> RSP <ffff8807b156dc28> > >> ---[ end trace 8d09f8cfbb601c14 ]--- > >> > >> > > > > I don't see a function called sysexit_audit in 3.5. I assume that's > > created via some sort of macro goop? > > It looks like it's defined in arch/x86/kernel/entry_32.S > > my asm is non-existent, but it looks like it's calling > __audit_syscall_exit and the disabling interrupts. > BTW, this is not really my area of expertise either, but I think that's the asm goop for "native" 32-bit x86. The asm for 32-bit binaries running on a 64-bit kernel are in arch/x86/ia32/ia32entry.S. >From what I can tell though, it looks like it should be calling __audit_syscall_exit with interrupts enabled: --------------[snip]----------------- .macro auditsys_exit exit testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT),TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) jnz ia32_ret_from_sys_call TRACE_IRQS_ON sti movl %eax,%esi /* second arg, syscall return value */ cmpl $-MAX_ERRNO,%eax /* is it an error ? */ jbe 1f movslq %eax, %rsi /* if error sign extend to 64 bits */ 1: setbe %al /* 1 if error, 0 if not */ movzbl %al,%edi /* zero-extend that into %edi */ call __audit_syscall_exit movq RAX-ARGOFFSET(%rsp),%rax /* reload syscall return value */ movl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT),%edi cli TRACE_IRQS_OFF testl %edi,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) jz \exit CLEAR_RREGS -ARGOFFSET jmp int_with_check .endm --------------[snip]----------------- AIUI, "sti" is what enables interrupts, and I don't see a call to "cli" (which disables interrupts) until after __audit_syscall_exit. Either something is reordering those instructions (which seems very unlikely), or we're not looking at the right syscall code? -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 15:57 ` Peter Moody 2012-10-05 16:15 ` Jeff Layton @ 2012-10-05 20:23 ` Eric Sandeen 2012-10-07 3:22 ` Peter Moody 1 sibling, 1 reply; 26+ messages in thread From: Eric Sandeen @ 2012-10-05 20:23 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit, Jeff Layton On 10/5/12 10:57 AM, Peter Moody wrote: > On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote: >> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote: >>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>> On Fri, 5 Oct 2012 06:57:59 -0700 >>>> Peter Moody <pmoody@google.com> wrote: >>>> >>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>>>> On Thu, 4 Oct 2012 11:48:23 -0700 >>>>>> Peter Moody <pmoody@google.com> wrote: >>>>>> >>>>>>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >>>>>>>> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >>>>>>>>> Hey folks, >>>>>>>>> >>>>>>>>> following up on old patches, are there any comments on this? Did you >>>>>>>>> get around to finding a better way to fix this bug, Al? >>>>>>>> >>>>>>>> Alas, I've found none ;-/ Looks like we'll have to go with this one, >>>>>>>> at least until somebody comes up with better solution. >>>>>>> >>>>>>> Not surprisingly, this patch doesn't actually fix the issue (or at >>>>>>> least doesn't do it correctly). >>>>>>> >>>>>>> I hadn't noticed that get_fs_pwd() actually calls path_get() on >>>>>>> &context->pwd so the additional path_get() is useless and the >>>>>>> reference doesn't ever actually get freed if audit_putname is called >>>>>>> while we're in a syscall. >>>>>>> >>>>>>> Al, Eric, Jeff; do any of you guys have an understanding of what the >>>>>>> initial bug actually is since this clearly doesn't fix it? >>>>>>> >>>>>>> Cheers, >>>>>>> peter >>>>>>> >>>>>> >>>>>> BTW, I ran this test on one of my KVM guests and it ran just fine. That >>>>>> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is >>>>>> ext4 though, not ext3. So perhaps that's a factor? >>>>>> >>>>>> The oops message you posted at least looks like something down in the >>>>>> bowels of ext3 or fs/buffer.c. >>>>> >>>>> Yeah, the only place this actually happens for me on these giant xen >>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and >>>>> ext4 filesystems and it happens with 100% reliability. >>>>> >>>>> The actual oops is from: >>>>> >>>>> static inline void check_irqs_on(void) >>>>> { >>>>> #ifdef irqs_disabled >>>>> BUG_ON(irqs_disabled()); >>>>> #endif >>>>> } >>>>> >>>>> with the code path looking like: >>>>> >>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() >>>>> >>>> >>>> Do you have a backtrace from a more recent kernel? I wonder if >>>> something in the syscall exit codepath is disabling IRQs here? >>> >>> is 3.6.0-rc1 recent enough or do you want something newer? >> >> nevermind, that doesn't boot. One sec. > > > here's 3.5.0 > > ------------[ cut here ]------------ > kernel BUG at fs/buffer.c:1220! > invalid opcode: 0000 [#1] SMP > CPU 0 > Pid: 3683, comm: a.out Not tainted 3.5.0 #3 > RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] > check_irqs_on.part.8+0x4/0x6 > RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 > RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 > RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 > RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 > R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 > R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 > FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 > CS: e033 DS: 002b ES: 002b CR0: 000000008005003b > CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) One thing that crossed my mind is that back in the 4k stacks days I think sometimes a blown stack could corrupt threadinfo and we'd get spurious warnings like this. Is there any chance something like that has happened? (any stack depth messages, etc?) Maybe a crashdump on the BUG() for further poking around would be in order. -Eric > Stack: > ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 > ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 > ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 > Call Trace: > [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 > [<ffffffff816af231>] ? down_read+0x11/0x30 > [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 > [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 > [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 > [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 > [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 > [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 > [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 > [<ffffffff81153b9a>] evict+0xaa/0x1b0 > [<ffffffff81154843>] iput+0x103/0x210 > [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 > [<ffffffff811505ec>] dput+0x12c/0x250 > [<ffffffff81146275>] path_put+0x15/0x30 > [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 > [<ffffffff816b30be>] sysexit_audit+0x29/0x5b > Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 > 0c 80 43 5c 01 48 89 df e8 d5 > 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 > e5 0f 0b 55 48 89 e5 41 54 53 > RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 > RSP <ffff8807b156dc28> > ---[ end trace 8d09f8cfbb601c14 ]--- > > >>>> -- >>>> Jeff Layton <jlayton@redhat.com> >>> >>> >>> >>> -- >>> Peter Moody Google 1.650.253.7306 >>> Security Engineer pgp:0xC3410038 >> >> >> >> -- >> Peter Moody Google 1.650.253.7306 >> Security Engineer pgp:0xC3410038 > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-05 20:23 ` Eric Sandeen @ 2012-10-07 3:22 ` Peter Moody 2012-10-08 18:48 ` Eric Sandeen 0 siblings, 1 reply; 26+ messages in thread From: Peter Moody @ 2012-10-07 3:22 UTC (permalink / raw) To: Eric Sandeen; +Cc: lczerner, linux-audit, Jeff Layton On Fri, Oct 5, 2012 at 1:23 PM, Eric Sandeen <sandeen@redhat.com> wrote: > On 10/5/12 10:57 AM, Peter Moody wrote: >> On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote: >>> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote: >>>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>>> On Fri, 5 Oct 2012 06:57:59 -0700 >>>>> Peter Moody <pmoody@google.com> wrote: >>>>> >>>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>>>>> On Thu, 4 Oct 2012 11:48:23 -0700 >>>>>>> Peter Moody <pmoody@google.com> wrote: >>>>>>> >>>>>>>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >>>>>>>>> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >>>>>>>>>> Hey folks, >>>>>>>>>> >>>>>>>>>> following up on old patches, are there any comments on this? Did you >>>>>>>>>> get around to finding a better way to fix this bug, Al? >>>>>>>>> >>>>>>>>> Alas, I've found none ;-/ Looks like we'll have to go with this one, >>>>>>>>> at least until somebody comes up with better solution. >>>>>>>> >>>>>>>> Not surprisingly, this patch doesn't actually fix the issue (or at >>>>>>>> least doesn't do it correctly). >>>>>>>> >>>>>>>> I hadn't noticed that get_fs_pwd() actually calls path_get() on >>>>>>>> &context->pwd so the additional path_get() is useless and the >>>>>>>> reference doesn't ever actually get freed if audit_putname is called >>>>>>>> while we're in a syscall. >>>>>>>> >>>>>>>> Al, Eric, Jeff; do any of you guys have an understanding of what the >>>>>>>> initial bug actually is since this clearly doesn't fix it? >>>>>>>> >>>>>>>> Cheers, >>>>>>>> peter >>>>>>>> >>>>>>> >>>>>>> BTW, I ran this test on one of my KVM guests and it ran just fine. That >>>>>>> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is >>>>>>> ext4 though, not ext3. So perhaps that's a factor? >>>>>>> >>>>>>> The oops message you posted at least looks like something down in the >>>>>>> bowels of ext3 or fs/buffer.c. >>>>>> >>>>>> Yeah, the only place this actually happens for me on these giant xen >>>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and >>>>>> ext4 filesystems and it happens with 100% reliability. >>>>>> >>>>>> The actual oops is from: >>>>>> >>>>>> static inline void check_irqs_on(void) >>>>>> { >>>>>> #ifdef irqs_disabled >>>>>> BUG_ON(irqs_disabled()); >>>>>> #endif >>>>>> } >>>>>> >>>>>> with the code path looking like: >>>>>> >>>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() >>>>>> >>>>> >>>>> Do you have a backtrace from a more recent kernel? I wonder if >>>>> something in the syscall exit codepath is disabling IRQs here? >>>> >>>> is 3.6.0-rc1 recent enough or do you want something newer? >>> >>> nevermind, that doesn't boot. One sec. >> >> >> here's 3.5.0 >> >> ------------[ cut here ]------------ >> kernel BUG at fs/buffer.c:1220! >> invalid opcode: 0000 [#1] SMP >> CPU 0 >> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 >> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] >> check_irqs_on.part.8+0x4/0x6 >> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 >> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 >> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 >> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 >> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 >> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 >> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 >> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b >> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) > > One thing that crossed my mind is that back in the 4k stacks days I > think sometimes a blown stack could corrupt threadinfo and we'd get > spurious warnings like this. > > Is there any chance something like that has happened? (any stack > depth messages, etc?) I do'nt see any stack depth messages in there, no. > Maybe a crashdump on the BUG() for further poking around would > be in order. when I add a painc() there, this is what I get. I'm not seeing anything about stack depth, Is there something obvious I'm missing? Kernel panic - not syncing: irqs disabled Pid: 23011, comm: a.out Tainted: G W 3.2.5-at18-ganetixenu #2 Call Trace: [<ffffffff81692b16>] panic+0x8c/0x1a3 [<ffffffff8100926d>] ? xen_force_evtchn_callback+0xd/0x10 [<ffffffff816960d6>] check_irqs_on.part.13+0x25/0x25 [<ffffffff81161dc9>] __find_get_block+0x1f9/0x200 [<ffffffff813c7a2f>] ? number.isra.2+0x31f/0x350 [<ffffffff811c9d85>] ext3_clear_blocks+0x75/0x140 [<ffffffff811c9f5c>] ext3_free_data+0x10c/0x150 [<ffffffff811da961>] ? ext3_journal_start_sb+0x31/0x60 [<ffffffff811ca635>] ext3_truncate+0x4a5/0x600 [<ffffffff81232f38>] ? journal_start+0xb8/0x100 [<ffffffff811ccf48>] ext3_evict_inode+0x228/0x2a0 [<ffffffff8114bfb1>] evict+0xa1/0x1a0 [<ffffffff8114cc61>] iput+0x101/0x210 [<ffffffff81148040>] d_kill+0xf0/0x130 [<ffffffff81148bd2>] dput+0xd2/0x1b0 [<ffffffff8113eb85>] path_put+0x15/0x30 [<ffffffff8169348f>] audit_free_names+0xbc/0xdb [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 [<ffffffff8169fdaa>] sysexit_audit+0x21/0x5f > -Eric > >> Stack: >> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 >> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 >> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 >> Call Trace: >> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 >> [<ffffffff816af231>] ? down_read+0x11/0x30 >> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 >> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 >> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 >> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 >> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 >> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 >> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 >> [<ffffffff81153b9a>] evict+0xaa/0x1b0 >> [<ffffffff81154843>] iput+0x103/0x210 >> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 >> [<ffffffff811505ec>] dput+0x12c/0x250 >> [<ffffffff81146275>] path_put+0x15/0x30 >> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 >> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b >> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 >> 0c 80 43 5c 01 48 89 df e8 d5 >> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 >> e5 0f 0b 55 48 89 e5 41 54 53 >> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 >> RSP <ffff8807b156dc28> >> ---[ end trace 8d09f8cfbb601c14 ]--- >> >> >>>>> -- >>>>> Jeff Layton <jlayton@redhat.com> >>>> >>>> >>>> >>>> -- >>>> Peter Moody Google 1.650.253.7306 >>>> Security Engineer pgp:0xC3410038 >>> >>> >>> >>> -- >>> Peter Moody Google 1.650.253.7306 >>> Security Engineer pgp:0xC3410038 >> >> >> > -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] audit: grab a reference to context->pwd when it's cached 2012-10-07 3:22 ` Peter Moody @ 2012-10-08 18:48 ` Eric Sandeen 0 siblings, 0 replies; 26+ messages in thread From: Eric Sandeen @ 2012-10-08 18:48 UTC (permalink / raw) To: Peter Moody; +Cc: lczerner, linux-audit, Jeff Layton On 10/6/12 10:22 PM, Peter Moody wrote: > On Fri, Oct 5, 2012 at 1:23 PM, Eric Sandeen <sandeen@redhat.com> wrote: >> On 10/5/12 10:57 AM, Peter Moody wrote: >>> On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote: >>>> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote: >>>>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>>>> On Fri, 5 Oct 2012 06:57:59 -0700 >>>>>> Peter Moody <pmoody@google.com> wrote: >>>>>> >>>>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote: >>>>>>>> On Thu, 4 Oct 2012 11:48:23 -0700 >>>>>>>> Peter Moody <pmoody@google.com> wrote: >>>>>>>> >>>>>>>>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote: >>>>>>>>>> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote: >>>>>>>>>>> Hey folks, >>>>>>>>>>> >>>>>>>>>>> following up on old patches, are there any comments on this? Did you >>>>>>>>>>> get around to finding a better way to fix this bug, Al? >>>>>>>>>> >>>>>>>>>> Alas, I've found none ;-/ Looks like we'll have to go with this one, >>>>>>>>>> at least until somebody comes up with better solution. >>>>>>>>> >>>>>>>>> Not surprisingly, this patch doesn't actually fix the issue (or at >>>>>>>>> least doesn't do it correctly). >>>>>>>>> >>>>>>>>> I hadn't noticed that get_fs_pwd() actually calls path_get() on >>>>>>>>> &context->pwd so the additional path_get() is useless and the >>>>>>>>> reference doesn't ever actually get freed if audit_putname is called >>>>>>>>> while we're in a syscall. >>>>>>>>> >>>>>>>>> Al, Eric, Jeff; do any of you guys have an understanding of what the >>>>>>>>> initial bug actually is since this clearly doesn't fix it? >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> peter >>>>>>>>> >>>>>>>> >>>>>>>> BTW, I ran this test on one of my KVM guests and it ran just fine. That >>>>>>>> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is >>>>>>>> ext4 though, not ext3. So perhaps that's a factor? >>>>>>>> >>>>>>>> The oops message you posted at least looks like something down in the >>>>>>>> bowels of ext3 or fs/buffer.c. >>>>>>> >>>>>>> Yeah, the only place this actually happens for me on these giant xen >>>>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and >>>>>>> ext4 filesystems and it happens with 100% reliability. >>>>>>> >>>>>>> The actual oops is from: >>>>>>> >>>>>>> static inline void check_irqs_on(void) >>>>>>> { >>>>>>> #ifdef irqs_disabled >>>>>>> BUG_ON(irqs_disabled()); >>>>>>> #endif >>>>>>> } >>>>>>> >>>>>>> with the code path looking like: >>>>>>> >>>>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG() >>>>>>> >>>>>> >>>>>> Do you have a backtrace from a more recent kernel? I wonder if >>>>>> something in the syscall exit codepath is disabling IRQs here? >>>>> >>>>> is 3.6.0-rc1 recent enough or do you want something newer? >>>> >>>> nevermind, that doesn't boot. One sec. >>> >>> >>> here's 3.5.0 >>> >>> ------------[ cut here ]------------ >>> kernel BUG at fs/buffer.c:1220! >>> invalid opcode: 0000 [#1] SMP >>> CPU 0 >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3 >>> RIP: e030:[<ffffffff816a99f4>] [<ffffffff816a99f4>] >>> check_irqs_on.part.8+0x4/0x6 >>> RSP: e02b:ffff8807b156dc28 EFLAGS: 00010046 >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396 >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000 >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50 >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54 >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396 >>> FS: 00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000 >>> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000) >> >> One thing that crossed my mind is that back in the 4k stacks days I >> think sometimes a blown stack could corrupt threadinfo and we'd get >> spurious warnings like this. >> >> Is there any chance something like that has happened? (any stack >> depth messages, etc?) > > I do'nt see any stack depth messages in there, no. > >> Maybe a crashdump on the BUG() for further poking around would >> be in order. > > when I add a painc() there, this is what I get. I'm not seeing > anything about stack depth, Is there something obvious I'm missing? Not missing anything really; it's a long shot I suppose anyway. It's possible that there was some deep stack excursion that corrupted threadinfo, and then we went down the path below and the first bad news we received was when we tested whether we were in an interrupt. Hm, your other email said that you only started getting the in_interrupt() warnings AFTER the BUG_ON()? That makes me think I might be right after all: > I booted a few kernels with the WARN_ON successively further > down the stack in audit land and it only started firing after BUG_ON() > in fs/buffer.c On the other hand, there is a "canary" at the top of the stack which would usually get overwritten, and a BUG_ON() or panic() should print that fact, something like "Thread overran stack, or stack corrupted." What if you build with config STACK_TRACER on, and: # mount -t debugfs none /sys/kernel/debug # echo 1 > /proc/sys/kernel/stack_tracer_enabled # <run your test w/o the panic, get the BUG_ON> # cat /sys/kernel/debug/tracing/stack_trace If CONFIG_DEBUG_STACK_USAGE isn't on, that might help too. I could be way off here but might be worth a test. -Eric > Kernel panic - not syncing: irqs disabled > Pid: 23011, comm: a.out Tainted: G W 3.2.5-at18-ganetixenu #2 > Call Trace: > [<ffffffff81692b16>] panic+0x8c/0x1a3 > [<ffffffff8100926d>] ? xen_force_evtchn_callback+0xd/0x10 > [<ffffffff816960d6>] check_irqs_on.part.13+0x25/0x25 > [<ffffffff81161dc9>] __find_get_block+0x1f9/0x200 > [<ffffffff813c7a2f>] ? number.isra.2+0x31f/0x350 > [<ffffffff811c9d85>] ext3_clear_blocks+0x75/0x140 > [<ffffffff811c9f5c>] ext3_free_data+0x10c/0x150 > [<ffffffff811da961>] ? ext3_journal_start_sb+0x31/0x60 > [<ffffffff811ca635>] ext3_truncate+0x4a5/0x600 > [<ffffffff81232f38>] ? journal_start+0xb8/0x100 > [<ffffffff811ccf48>] ext3_evict_inode+0x228/0x2a0 > [<ffffffff8114bfb1>] evict+0xa1/0x1a0 > [<ffffffff8114cc61>] iput+0x101/0x210 > [<ffffffff81148040>] d_kill+0xf0/0x130 > [<ffffffff81148bd2>] dput+0xd2/0x1b0 > [<ffffffff8113eb85>] path_put+0x15/0x30 > [<ffffffff8169348f>] audit_free_names+0xbc/0xdb > [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0 > [<ffffffff8169fdaa>] sysexit_audit+0x21/0x5f > >> -Eric >> >>> Stack: >>> ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30 >>> ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231 >>> ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54 >>> Call Trace: >>> [<ffffffff8116a099>] __find_get_block+0x1f9/0x200 >>> [<ffffffff816af231>] ? down_read+0x11/0x30 >>> [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140 >>> [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150 >>> [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60 >>> [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600 >>> [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100 >>> [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0 >>> [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0 >>> [<ffffffff81153b9a>] evict+0xaa/0x1b0 >>> [<ffffffff81154843>] iput+0x103/0x210 >>> [<ffffffff8114fc88>] dentry_iput+0x88/0xd0 >>> [<ffffffff811505ec>] dput+0x12c/0x250 >>> [<ffffffff81146275>] path_put+0x15/0x30 >>> [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460 >>> [<ffffffff816b30be>] sysexit_audit+0x29/0x5b >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74 >>> 0c 80 43 5c 01 48 89 df e8 d5 >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 >>> e5 0f 0b 55 48 89 e5 41 54 53 >>> RIP [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6 >>> RSP <ffff8807b156dc28> >>> ---[ end trace 8d09f8cfbb601c14 ]--- >>> >>> ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2012-10-08 18:48 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-08-16 1:13 [PATCH] audit: grab a reference to context->pwd when it's cached Peter Moody 2012-08-16 1:26 ` Alexander Viro 2012-08-16 1:41 ` Peter Moody 2012-08-20 17:19 ` Peter Moody 2012-09-25 17:03 ` Peter Moody 2012-09-26 13:50 ` Alexander Viro 2012-09-26 18:20 ` Peter Moody 2012-10-04 18:48 ` Peter Moody 2012-10-05 11:02 ` Jeff Layton 2012-10-05 12:55 ` Jeff Layton 2012-10-05 13:57 ` Peter Moody 2012-10-05 14:10 ` Jeff Layton 2012-10-05 14:26 ` Jeff Layton 2012-10-05 15:16 ` Peter Moody 2012-10-05 15:18 ` Peter Moody 2012-10-05 15:57 ` Peter Moody 2012-10-05 16:15 ` Jeff Layton 2012-10-05 17:04 ` Peter Moody 2012-10-05 19:41 ` Peter Moody 2012-10-08 12:23 ` Jeff Layton 2012-10-08 15:22 ` Peter Moody 2012-10-08 16:45 ` Peter Moody 2012-10-08 14:10 ` Jeff Layton 2012-10-05 20:23 ` Eric Sandeen 2012-10-07 3:22 ` Peter Moody 2012-10-08 18:48 ` Eric Sandeen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox