* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 [not found] <2966.1418778594@turing-police.cc.vt.edu> @ 2014-12-17 1:21 ` Eric Paris 2014-12-17 22:44 ` Richard Guy Briggs [not found] ` <5444.1418921116@turing-police.cc.vt.edu> 1 sibling, 1 reply; 10+ messages in thread From: Eric Paris @ 2014-12-17 1:21 UTC (permalink / raw) To: Valdis Kletnieks, rgb; +Cc: Paul Moore, linux-kernel, selinux, linux-audit I haven't looked into it, but I'd place my first bet on the audit multicast code... Richard? On Tue, 2014-12-16 at 20:09 -0500, Valdis Kletnieks wrote: > Not sure who's to blame here, but I'm tending towards selinux based on > who was holding the locks... > > Spotted these two while booting single-user on 20141216. 20141208 > doesn't throw these, so it's something in the last week or so.. > > Tossed it twice - once for /sbin/sulogin, and then a second time for /bin/bash. > > [ 34.061285] BUG: sleeping function called from invalid context at mm/slab.c:2849 > [ 34.062863] in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin > [ 34.064416] 2 locks held by sulogin/885: > [ 34.064418] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b > [ 34.064428] #1: (tty_files_lock){+.+.+.}, at: [<ffffffff9123e787>] selinux_bprm_committing_creds+0x55/0x22b > [ 34.064438] CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-20141216 #30 > [ 34.064440] Hardware name: Dell Inc. 
Latitude E6530/07Y85M, BIOS A15 06/20/2014 > [ 34.064442] ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375 > [ 34.064447] ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006 > [ 34.064452] 0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38 > [ 34.064457] Call Trace: > [ 34.064463] [<ffffffff916ba529>] dump_stack+0x50/0xa8 > [ 34.064467] [<ffffffff91063185>] ___might_sleep+0x1b6/0x1be > [ 34.064472] [<ffffffff910632a6>] __might_sleep+0x119/0x128 > [ 34.064477] [<ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f > [ 34.064480] [<ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9 > [ 34.064484] [<ffffffff914e148d>] __alloc_skb+0x42/0x1a3 > [ 34.064488] [<ffffffff914e2b62>] skb_copy+0x3e/0xa3 > [ 34.064492] [<ffffffff910c263e>] audit_log_end+0x83/0x100 > [ 34.064496] [<ffffffff9123b8d3>] ? avc_audit_pre_callback+0x103/0x103 > [ 34.064500] [<ffffffff91252a73>] common_lsm_audit+0x441/0x450 > [ 34.064503] [<ffffffff9123c163>] slow_avc_audit+0x63/0x67 > [ 34.064506] [<ffffffff9123c42c>] avc_has_perm+0xca/0xe3 > [ 34.064510] [<ffffffff9123dc2d>] inode_has_perm+0x5a/0x65 > [ 34.064514] [<ffffffff9123e7ca>] selinux_bprm_committing_creds+0x98/0x22b > [ 34.064519] [<ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10 > [ 34.064522] [<ffffffff911515e6>] install_exec_creds+0xe/0x79 > [ 34.064527] [<ffffffff911974cf>] load_elf_binary+0xe36/0x10d7 > [ 34.064542] [<ffffffff9115198e>] search_binary_handler+0x81/0x18c > [ 34.064545] [<ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7 > [ 34.064548] [<ffffffff91153669>] do_execve+0x1f/0x21 > [ 34.064552] [<ffffffff91153967>] SyS_execve+0x25/0x29 > [ 34.064557] [<ffffffff916c61a9>] stub_execve+0x69/0xa0 > > [ 48.826654] BUG: sleeping function called from invalid context at mm/slab.c:2849 > [ 48.829282] in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: bash > [ 48.829284] 2 locks held by bash/885: > [ 48.829297] #0: (&sig->cred_guard_mutex){+.+.+.}, at: 
[<ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b > [ 48.829307] #1: (&(&newf->file_lock)->rlock){+.+.+.}, at: [<ffffffff91166b8b>] iterate_fd+0x34/0x11c > [ 48.829310] CPU: 3 PID: 885 Comm: bash Not tainted 3.18.0-next-20141216 #30 > [ 48.829311] Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014 > [ 48.829317] ffff880223744f10 ffff88022410f928 ffffffff916ba529 0000000000000375 > [ 48.829321] ffff880223744f10 ffff88022410f958 ffffffff91063185 0000000000000002 > [ 48.829325] 0000000000000000 0000000000000000 0000000000000000 ffff88022410f9a8 > [ 48.829327] Call Trace: > [ 48.829333] [<ffffffff916ba529>] dump_stack+0x50/0xa8 > [ 48.829338] [<ffffffff91063185>] ___might_sleep+0x1b6/0x1be > [ 48.829341] [<ffffffff910632a6>] __might_sleep+0x119/0x128 > [ 48.829347] [<ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f > [ 48.829350] [<ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9 > [ 48.829356] [<ffffffff914e148d>] __alloc_skb+0x42/0x1a3 > [ 48.829360] [<ffffffff914e2b62>] skb_copy+0x3e/0xa3 > [ 48.829367] [<ffffffff910c263e>] audit_log_end+0x83/0x100 > [ 48.829372] [<ffffffff9123b8d3>] ? avc_audit_pre_callback+0x103/0x103 > [ 48.829377] [<ffffffff91252a73>] common_lsm_audit+0x441/0x450 > [ 48.829381] [<ffffffff9123c163>] slow_avc_audit+0x63/0x67 > [ 48.829386] [<ffffffff9123c42c>] avc_has_perm+0xca/0xe3 > [ 48.829391] [<ffffffff9123e255>] ? 
selinux_file_permission+0x9b/0x9b > [ 48.829395] [<ffffffff9123e0b9>] file_has_perm+0x6d/0x7c > [ 48.829400] [<ffffffff9123e283>] match_file+0x2e/0x3b > [ 48.829404] [<ffffffff91166c4b>] iterate_fd+0xf4/0x11c > [ 48.829409] [<ffffffff9123e802>] selinux_bprm_committing_creds+0xd0/0x22b > [ 48.829415] [<ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10 > [ 48.829419] [<ffffffff911515e6>] install_exec_creds+0xe/0x79 > [ 48.829426] [<ffffffff911974cf>] load_elf_binary+0xe36/0x10d7 > [ 48.829431] [<ffffffff9115198e>] search_binary_handler+0x81/0x18c > [ 48.829435] [<ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7 > [ 48.829462] [<ffffffff91153669>] do_execve+0x1f/0x21 > [ 48.829466] [<ffffffff91153967>] SyS_execve+0x25/0x29 > [ 48.829472] [<ffffffff916c61a9>] stub_execve+0x69/0xa0 > ^ permalink raw reply [flat|nested] 10+ messages in thread
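A note on reading the two reports above: in each one the first lock held is a mutex (cred_guard_mutex), while the second (tty_files_lock in the first report, file_lock's rlock in the second) appears to be a spinlock. On a preemptible kernel, holding a spinlock is what makes in_atomic() report 1, and a sleeping allocation attempted in that state is what the might-sleep check complains about. The following is a minimal userspace sketch of that relationship, assuming that reading of the trace is right. Every name in it is hypothetical and chosen for illustration; none of these are kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static int model_preempt_count;  /* raised while a "spinlock" is held */
static int model_bug_reports;    /* how many times the check would fire */

/* Taking a spinlock makes the context atomic until it is released. */
static void model_spin_lock(void)
{
    model_preempt_count++;
}

static void model_spin_unlock(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/* A mutex may itself sleep, so holding one does not make the context atomic. */
static void model_mutex_lock(void)   { }
static void model_mutex_unlock(void) { }

static bool model_in_atomic(void)
{
    return model_preempt_count != 0;
}

/* Stand-in for the check performed before an allocation that may sleep. */
static void model_might_sleep(void)
{
    if (model_in_atomic())
        model_bug_reports++;  /* the kernel would print the BUG splat here */
}
```

In this model, calling model_might_sleep() with only the mutex held records nothing, while calling it between model_spin_lock() and model_spin_unlock() records one report. That mirrors why the mutex common to both reports is unlikely to be the lock that matters.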
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-17 1:21 ` linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 Eric Paris @ 2014-12-17 22:44 ` Richard Guy Briggs 0 siblings, 0 replies; 10+ messages in thread From: Richard Guy Briggs @ 2014-12-17 22:44 UTC (permalink / raw) To: Eric Paris Cc: Valdis Kletnieks, Paul Moore, linux-kernel, selinux, linux-audit On 14/12/16, Eric Paris wrote: > I haven't looked into it, but I'd place my first bet on the audit > multicast code... Any particular reason for the multicast code (other than the obvious skb_copy added)? That stuff went upstream 8 months ago rather than in this linux-next window of 20141208 to 20141216. There are people using it (as evidenced by a bug report, and a patch to fix incorrect size reporting has gone upstream). So I doubt it would be that unless something new was interacting with it. I'd more suspect 9eab339b197a6903043d272295dcb716ff739b21 [ audit: get comm using lock to avoid race in string printing ] in which the call to get_task_comm() might more safely be replaced with memcpy() as in https://lkml.org/lkml/2014/11/16/184 > Richard? > > On Tue, 2014-12-16 at 20:09 -0500, Valdis Kletnieks wrote: > > Not sure who's to blame here, but I'm tending towards selinux based on > > who was holding the locks... > > > > Spotted these two while booting single-user on 20141216. 20141208 > > doesn't throw these, so it's something in the last week or so.. > > > > Tossed it twice - once for /sbin/sulogin, and then a second time for /bin/bash. 
> > > > [ 34.061285] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > [ 34.062863] in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin > > [ 34.064416] 2 locks held by sulogin/885: > > [ 34.064418] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b > > [ 34.064428] #1: (tty_files_lock){+.+.+.}, at: [<ffffffff9123e787>] selinux_bprm_committing_creds+0x55/0x22b > > [ 34.064438] CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-20141216 #30 > > [ 34.064440] Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014 > > [ 34.064442] ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375 > > [ 34.064447] ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006 > > [ 34.064452] 0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38 > > [ 34.064457] Call Trace: > > [ 34.064463] [<ffffffff916ba529>] dump_stack+0x50/0xa8 > > [ 34.064467] [<ffffffff91063185>] ___might_sleep+0x1b6/0x1be > > [ 34.064472] [<ffffffff910632a6>] __might_sleep+0x119/0x128 > > [ 34.064477] [<ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f > > [ 34.064480] [<ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9 > > [ 34.064484] [<ffffffff914e148d>] __alloc_skb+0x42/0x1a3 > > [ 34.064488] [<ffffffff914e2b62>] skb_copy+0x3e/0xa3 > > [ 34.064492] [<ffffffff910c263e>] audit_log_end+0x83/0x100 > > [ 34.064496] [<ffffffff9123b8d3>] ? 
avc_audit_pre_callback+0x103/0x103 > > [ 34.064500] [<ffffffff91252a73>] common_lsm_audit+0x441/0x450 > > [ 34.064503] [<ffffffff9123c163>] slow_avc_audit+0x63/0x67 > > [ 34.064506] [<ffffffff9123c42c>] avc_has_perm+0xca/0xe3 > > [ 34.064510] [<ffffffff9123dc2d>] inode_has_perm+0x5a/0x65 > > [ 34.064514] [<ffffffff9123e7ca>] selinux_bprm_committing_creds+0x98/0x22b > > [ 34.064519] [<ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10 > > [ 34.064522] [<ffffffff911515e6>] install_exec_creds+0xe/0x79 > > [ 34.064527] [<ffffffff911974cf>] load_elf_binary+0xe36/0x10d7 > > [ 34.064542] [<ffffffff9115198e>] search_binary_handler+0x81/0x18c > > [ 34.064545] [<ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7 > > [ 34.064548] [<ffffffff91153669>] do_execve+0x1f/0x21 > > [ 34.064552] [<ffffffff91153967>] SyS_execve+0x25/0x29 > > [ 34.064557] [<ffffffff916c61a9>] stub_execve+0x69/0xa0 > > > > [ 48.826654] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > [ 48.829282] in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: bash > > [ 48.829284] 2 locks held by bash/885: > > [ 48.829297] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b > > [ 48.829307] #1: (&(&newf->file_lock)->rlock){+.+.+.}, at: [<ffffffff91166b8b>] iterate_fd+0x34/0x11c > > [ 48.829310] CPU: 3 PID: 885 Comm: bash Not tainted 3.18.0-next-20141216 #30 > > [ 48.829311] Hardware name: Dell Inc. 
Latitude E6530/07Y85M, BIOS A15 06/20/2014 > > [ 48.829317] ffff880223744f10 ffff88022410f928 ffffffff916ba529 0000000000000375 > > [ 48.829321] ffff880223744f10 ffff88022410f958 ffffffff91063185 0000000000000002 > > [ 48.829325] 0000000000000000 0000000000000000 0000000000000000 ffff88022410f9a8 > > [ 48.829327] Call Trace: > > [ 48.829333] [<ffffffff916ba529>] dump_stack+0x50/0xa8 > > [ 48.829338] [<ffffffff91063185>] ___might_sleep+0x1b6/0x1be > > [ 48.829341] [<ffffffff910632a6>] __might_sleep+0x119/0x128 > > [ 48.829347] [<ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f > > [ 48.829350] [<ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9 > > [ 48.829356] [<ffffffff914e148d>] __alloc_skb+0x42/0x1a3 > > [ 48.829360] [<ffffffff914e2b62>] skb_copy+0x3e/0xa3 > > [ 48.829367] [<ffffffff910c263e>] audit_log_end+0x83/0x100 > > [ 48.829372] [<ffffffff9123b8d3>] ? avc_audit_pre_callback+0x103/0x103 > > [ 48.829377] [<ffffffff91252a73>] common_lsm_audit+0x441/0x450 > > [ 48.829381] [<ffffffff9123c163>] slow_avc_audit+0x63/0x67 > > [ 48.829386] [<ffffffff9123c42c>] avc_has_perm+0xca/0xe3 > > [ 48.829391] [<ffffffff9123e255>] ? 
selinux_file_permission+0x9b/0x9b > > [ 48.829395] [<ffffffff9123e0b9>] file_has_perm+0x6d/0x7c > > [ 48.829400] [<ffffffff9123e283>] match_file+0x2e/0x3b > > [ 48.829404] [<ffffffff91166c4b>] iterate_fd+0xf4/0x11c > > [ 48.829409] [<ffffffff9123e802>] selinux_bprm_committing_creds+0xd0/0x22b > > [ 48.829415] [<ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10 > > [ 48.829419] [<ffffffff911515e6>] install_exec_creds+0xe/0x79 > > [ 48.829426] [<ffffffff911974cf>] load_elf_binary+0xe36/0x10d7 > > [ 48.829431] [<ffffffff9115198e>] search_binary_handler+0x81/0x18c > > [ 48.829435] [<ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7 > > [ 48.829462] [<ffffffff91153669>] do_execve+0x1f/0x21 > > [ 48.829466] [<ffffffff91153967>] SyS_execve+0x25/0x29 > > [ 48.829472] [<ffffffff916c61a9>] stub_execve+0x69/0xa0 > > > > - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <5444.1418921116@turing-police.cc.vt.edu>]
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 [not found] ` <5444.1418921116@turing-police.cc.vt.edu> @ 2014-12-18 16:50 ` Eric Paris 2014-12-18 17:46 ` Richard Guy Briggs 2014-12-18 19:21 ` Valdis.Kletnieks 0 siblings, 2 replies; 10+ messages in thread From: Eric Paris @ 2014-12-18 16:50 UTC (permalink / raw) To: Valdis.Kletnieks; +Cc: Paul Moore, linux-kernel, selinux, linux-audit, rgb On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > Spotted these two while booting single-user on 20141216. 20141208 > > doesn't throw these, so it's something in the last week or so.. > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > just fine for several days, but it went around the bend, apparently due > to a userspace or initrd change. $5 says you updated systemd? Richard? > egrep 'BUG|Linux vers' from my syslog: > > Dec 9 12:19:53 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 9 21:19:53 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 10 12:39:45 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 10 20:56:28 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 11 10:46:49 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 
4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 11 23:53:10 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 12 11:13:19 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 12 19:26:24 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 12 19:33:32 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 12 19:42:30 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 12 20:00:39 turing-police kernel: [ 1109.635328] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 20:00:43 turing-police kernel: [ 1113.680912] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 20:33:15 turing-police kernel: [ 3062.345461] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 20:37:48 turing-police kernel: [ 3335.788891] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 20:41:57 turing-police kernel: [ 3584.265255] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 20:42:47 turing-police kernel: [ 3633.863552] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 20:51:33 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 
(source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > Dec 12 21:51:04 turing-police kernel: [ 3587.132867] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 22:20:01 turing-police kernel: [ 5322.313024] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 12 23:06:00 turing-police kernel: [ 8077.463289] BUG: sleeping function called from invalid context at mm/slab.c:2849 > Dec 13 00:00:05 turing-police kernel: [11318.405826] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > I need to figure out what changed around 7:30PM on the 12th. > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 16:50 ` Eric Paris @ 2014-12-18 17:46 ` Richard Guy Briggs 2014-12-18 17:50 ` Eric Paris 2014-12-18 19:21 ` Valdis.Kletnieks 1 sibling, 1 reply; 10+ messages in thread From: Richard Guy Briggs @ 2014-12-18 17:46 UTC (permalink / raw) To: Eric Paris Cc: Valdis.Kletnieks, Paul Moore, linux-kernel, selinux, linux-audit On 14/12/18, Eric Paris wrote: > On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > Spotted these two while booting single-user on 20141216. 20141208 > > > doesn't throw these, so it's something in the last week or so.. > > > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > > just fine for several days, but it went around the bend, apparently due > > to a userspace or initrd change. > > $5 says you updated systemd? > > Richard? Ok, so if you are correct, then either we justify dropping the lock (I assume the one common to both BUG reports [sig->cred_guard_mutex]), or we make yet another queue we were hoping to avoid... It would also be good to narrow it down to a rule that triggers this. > > egrep 'BUG|Linux vers' from my syslog: > > > > Dec 9 12:19:53 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 ... > > Dec 12 19:42:30 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > Dec 12 20:00:39 turing-police kernel: [ 1109.635328] BUG: sleeping function called from invalid context at mm/slab.c:2849 ... 
> > Dec 12 20:42:47 turing-police kernel: [ 3633.863552] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > Dec 12 20:51:33 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > Dec 12 21:51:04 turing-police kernel: [ 3587.132867] BUG: sleeping function called from invalid context at mm/slab.c:2849 ... > > I need to figure out what changed around 7:30PM on the 12th. - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 17:46 ` Richard Guy Briggs @ 2014-12-18 17:50 ` Eric Paris 2014-12-18 18:44 ` Richard Guy Briggs 0 siblings, 1 reply; 10+ messages in thread From: Eric Paris @ 2014-12-18 17:50 UTC (permalink / raw) To: Richard Guy Briggs; +Cc: linux-audit, selinux, linux-kernel On Thu, 2014-12-18 at 12:46 -0500, Richard Guy Briggs wrote: > On 14/12/18, Eric Paris wrote: > > On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > > > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > > Spotted these two while booting single-user on 20141216. 20141208 > > > > doesn't throw these, so it's something in the last week or so.. > > > > > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > > > just fine for several days, but it went around the bend, apparently due > > > to a userspace or initrd change. > > > > $5 says you updated systemd? > > > > Richard? > > Ok, so if you are correct, then either we justify dropping the lock (I > assume the one common to both BUG reports [sig->cred_guard_mutex]), > or we make yet another queue we were hoping to avoid... > > It would also be good to narrow it down to a rule that triggers this. I thought the first message was enough to find the problem, but: static void kauditd_send_multicast_skb(struct sk_buff *skb) { ... nlmsg_multicast(sock, copy, 0, AUDIT_NLGRP_READLOG, GFP_KERNEL); ... } Since kauditd_send_multicast_skb() gets called in audit_log_end(), which can come from any context (including an atomic context where sleeping is not allowed), you can't use GFP_KERNEL. The audit_buffer knows what context it should use, so pass that down and use that. 
-Eric > > > > egrep 'BUG|Linux vers' from my syslog: > > > > > > Dec 9 12:19:53 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > ... > > > Dec 12 19:42:30 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > > Dec 12 20:00:39 turing-police kernel: [ 1109.635328] BUG: sleeping function called from invalid context at mm/slab.c:2849 > ... > > > Dec 12 20:42:47 turing-police kernel: [ 3633.863552] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > > Dec 12 20:51:33 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > > Dec 12 21:51:04 turing-police kernel: [ 3587.132867] BUG: sleeping function called from invalid context at mm/slab.c:2849 > ... > > > I need to figure out what changed around 7:30PM on the 12th. > > - RGB > > -- > Richard Guy Briggs <rbriggs@redhat.com> > Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat > Remote, Ottawa, Canada > Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 17:50 ` Eric Paris @ 2014-12-18 18:44 ` Richard Guy Briggs 2014-12-18 19:04 ` Eric Paris 0 siblings, 1 reply; 10+ messages in thread From: Richard Guy Briggs @ 2014-12-18 18:44 UTC (permalink / raw) To: Eric Paris; +Cc: linux-audit, selinux, linux-kernel On 14/12/18, Eric Paris wrote: > On Thu, 2014-12-18 at 12:46 -0500, Richard Guy Briggs wrote: > > On 14/12/18, Eric Paris wrote: > > > On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > > > > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > > > Spotted these two while booting single-user on 20141216. 20141208 > > > > > doesn't throw these, so it's something in the last week or so.. > > > > > > > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > > > > just fine for several days, but it went around the bend, apparently due > > > > to a userspace or initrd change. > > > > > > $5 says you updated systemd? > > > > > > Richard? > > > > Ok, so if you are correct, then either we justify dropping the lock (I > > assume the one common to both BUG reports [sig->cred_guard_mutex]), > > or we make yet another queue we were hoping to avoid... > > > > It would also be good to narrow it down to a rule that triggers this. > > I thought the first message was enough to find the problem, but: > > static void kauditd_send_multicast_skb(struct sk_buff *skb) > { > ... > nlmsg_multicast(sock, copy, 0, AUDIT_NLGRP_READLOG, GFP_KERNEL); > ... > } > > Since kauditd_send_multicast_skb() gets called in audit_log_end(), which > can come from any context (including an atomic context where sleeping > is not allowed), you can't use GFP_KERNEL. The audit_buffer knows what > context it should use, so pass that down and use that. Ok, that looks more obvious now... 
We just need to change the internal interface to kauditd_send_multicast_skb() to accept an audit_buffer instead of just the skb and use the gfp_mask value from there instead of using our own... Thanks, Eric. > -Eric > > > > > egrep 'BUG|Linux vers' from my syslog: > > > > > > > > Dec 9 12:19:53 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > ... > > > > Dec 12 19:42:30 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > > > Dec 12 20:00:39 turing-police kernel: [ 1109.635328] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > ... > > > > Dec 12 20:42:47 turing-police kernel: [ 3633.863552] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > > > Dec 12 20:51:33 turing-police kernel: [ 0.000000] Linux version 3.18.0-next-20141208 (source@turing-police.cc.vt.edu) (gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC) ) #27 SMP PREEMPT Mon Dec 8 22:20:07 EST 2014 > > > > Dec 12 21:51:04 turing-police kernel: [ 3587.132867] BUG: sleeping function called from invalid context at mm/slab.c:2849 > > ... > > > > I need to figure out what changed around 7:30PM on the 12th. > > > > - RGB - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 18:44 ` Richard Guy Briggs @ 2014-12-18 19:04 ` Eric Paris 2014-12-18 20:03 ` Paul Moore 0 siblings, 1 reply; 10+ messages in thread From: Eric Paris @ 2014-12-18 19:04 UTC (permalink / raw) To: Richard Guy Briggs; +Cc: linux-audit, selinux, linux-kernel On Thu, 2014-12-18 at 13:44 -0500, Richard Guy Briggs wrote: > On 14/12/18, Eric Paris wrote: > > On Thu, 2014-12-18 at 12:46 -0500, Richard Guy Briggs wrote: > > > On 14/12/18, Eric Paris wrote: > > > > On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > > > > > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > > > > Spotted these two while booting single-user on 20141216. 20141208 > > > > > > doesn't throw these, so it's something in the last week or so.. > > > > > > > > > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > > > > > just fine for several days, but it went around the bend, apparently due > > > > > to a userspace or initrd change. > > > > > > > > $5 says you updated systemd? > > > > > > > > Richard? > > > > > > Ok, so if you are correct, then either we justify dropping the lock (I > > > assume the one common to both BUG reports [sig->cred_guard_mutex]), > > > or we make yet another queue we were hoping to avoid... > > > > > > It would also be good to narrow it down to a rule that triggers this. > > > > I thought the first message was enough to find the problem, but: > > > > static void kauditd_send_multicast_skb(struct sk_buff *skb) > > { > > ... > > nlmsg_multicast(sock, copy, 0, AUDIT_NLGRP_READLOG, GFP_KERNEL); > > ... > > } > > > > Since kauditd_send_multicast_skb() gets called in audit_log_end(), which > > can come from any context (including an atomic context where sleeping > > is not allowed), you can't use GFP_KERNEL. The audit_buffer knows what > > context it should use, so pass that down and use that. > > Ok, that looks more obvious now... 
We just need to change the internal > interface to kauditd_send_multicast_skb() to accept an audit_buffer > instead of just the skb and use the gfp_mask value from there instead of > using our own... > > Thanks, Eric. I'd suggest just sending the GFP type, not the whole audit_buffer, but that's up to you. -Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 19:04 ` Eric Paris @ 2014-12-18 20:03 ` Paul Moore 0 siblings, 0 replies; 10+ messages in thread From: Paul Moore @ 2014-12-18 20:03 UTC (permalink / raw) To: Richard Guy Briggs; +Cc: linux-audit, Eric Paris, linux-kernel, selinux On Thursday, December 18, 2014 02:04:46 PM Eric Paris wrote: > I'd suggest just sending the GFP type, not the whole audit_buffer, but > that's up to you. That would be my preference too, especially since we will want to send this to stable and smaller is generally better there. -- paul moore security and virtualization @ redhat ^ permalink raw reply [flat|nested] 10+ messages in thread
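The shape of the change discussed in this sub-thread (hand the caller's allocation mask down to the multicast helper instead of hardcoding GFP_KERNEL) can be sketched in plain userspace C. This is an illustrative model only, not the patch that was eventually posted: the model_ names, the two-value flag enum, and the counter are all assumptions made up for the example, and malloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's gfp flags; not the real values. */
typedef enum { MODEL_GFP_ATOMIC, MODEL_GFP_KERNEL } model_gfp_t;

static bool model_atomic_ctx;  /* true while the caller must not sleep */
static int  model_sleep_bugs;  /* sleeping allocations made in atomic context */

/* Models an allocator that may sleep only when given MODEL_GFP_KERNEL. */
static void *model_alloc(size_t size, model_gfp_t gfp)
{
    assert(size > 0);
    if (gfp == MODEL_GFP_KERNEL && model_atomic_ctx)
        model_sleep_bugs++;  /* the kernel would print the BUG splat here */
    return malloc(size);
}

/* The buffer remembers which mask its creator said was safe to use. */
struct model_audit_buffer {
    model_gfp_t gfp_mask;
};

/* Problem shape: the helper picks its own mask and ignores the caller's context. */
static void model_send_multicast_hardcoded(void)
{
    free(model_alloc(64, MODEL_GFP_KERNEL));
}

/* Suggested shape: the helper takes only the mask, not the whole buffer. */
static void model_send_multicast(model_gfp_t gfp_mask)
{
    free(model_alloc(64, gfp_mask));
}

/* Stand-in for the logging path that ends up calling the helper. */
static void model_log_end(const struct model_audit_buffer *ab, bool pass_mask)
{
    if (pass_mask)
        model_send_multicast(ab->gfp_mask);
    else
        model_send_multicast_hardcoded();
}
```

With model_atomic_ctx set and a buffer created with MODEL_GFP_ATOMIC, the hardcoded variant bumps model_sleep_bugs while the mask-passing variant does not. Passing just the mask keeps the helper's signature small, which is the property preferred above for a stable backport.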
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 16:50 ` Eric Paris 2014-12-18 17:46 ` Richard Guy Briggs @ 2014-12-18 19:21 ` Valdis.Kletnieks 2014-12-18 19:24 ` Richard Guy Briggs 1 sibling, 1 reply; 10+ messages in thread From: Valdis.Kletnieks @ 2014-12-18 19:21 UTC (permalink / raw) To: Eric Paris; +Cc: Paul Moore, linux-kernel, selinux, linux-audit, rgb [-- Attachment #1: Type: text/plain, Size: 1017 bytes --] On Thu, 18 Dec 2014 11:50:20 -0500, Eric Paris said: > On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > > > Spotted these two while booting single-user on 20141216. 20141208 > > > doesn't throw these, so it's something in the last week or so.. > > > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > > just fine for several days, but it went around the bend, apparently due > > to a userspace or initrd change. > > $5 says you updated systemd? Actually, yeah. yum.log says: Dec 12 14:08:09 Updated: systemd-218-1.fc22.x86_64 and things started going downhill that evening. Damned if I know what 218 was doing to cause the issues. (Other RPM update that looked vaguely related was for selinux-policy, but again, damned if I know what a policy file could contain that would change the behavior...) Is it worth backing it to -217, or do we have a handle on the issue and systemd is just the messenger? [-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: linux-next 20141216 BUG: sleeping function called from invalid context at mm/slab.c:2849 2014-12-18 19:21 ` Valdis.Kletnieks @ 2014-12-18 19:24 ` Richard Guy Briggs 0 siblings, 0 replies; 10+ messages in thread From: Richard Guy Briggs @ 2014-12-18 19:24 UTC (permalink / raw) To: Valdis.Kletnieks; +Cc: Eric Paris, linux-audit, selinux, linux-kernel On 14/12/18, Valdis.Kletnieks@vt.edu wrote: > On Thu, 18 Dec 2014 11:50:20 -0500, Eric Paris said: > > On Thu, 2014-12-18 at 11:45 -0500, Valdis.Kletnieks@vt.edu wrote: > > > On Tue, 16 Dec 2014 20:09:54 -0500, Valdis Kletnieks said: > > > > Spotted these two while booting single-user on 20141216. 20141208 > > > > doesn't throw these, so it's something in the last week or so.. > > > > > > Gaah! Turns out that 20141208 *is* susceptible - it had been booting > > > just fine for several days, but it went around the bend, apparently due > > > to a userspace or initrd change. > > > > $5 says you updated systemd? > > Actually, yeah. yum.log says: > > Dec 12 14:08:09 Updated: systemd-218-1.fc22.x86_64 > > and things started going downhill that evening. Damned if I know what 218 > was doing to cause the issues. (Other RPM update that looked vaguely > related was for selinux-policy, but again, damned if I know what a policy > file could contain that would change the behavior...) > > Is it worth backing it to -217, or do we have a handle on the issue and > systemd is just the messenger? I've got a potential fix since the problem looks pretty obvious now. This is stable branch material... - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 10+ messages in thread