public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
@ 2006-12-24 21:21 Adam J. Richter
  2006-12-25  0:15 ` Parag Warudkar
  2006-12-25  0:25 ` Andrew Morton
  0 siblings, 2 replies; 8+ messages in thread
From: Adam J. Richter @ 2006-12-24 21:21 UTC (permalink / raw)
  To: linux-kernel

	Under 2.6.20-rc1 and 2.6.20-rc2, I get the following complaint
for several network programs running on my system:

[  156.381868] BUG: sleeping function called from invalid context at net/core/sock.c:1523
[  156.381876] in_atomic():1, irqs_disabled():0
[  156.381881] no locks held by kio_http/9693.
[  156.381886]  [<c01057a2>] show_trace_log_lvl+0x1a/0x2f
[  156.381900]  [<c0105dab>] show_trace+0x12/0x14
[  156.381908]  [<c0105e48>] dump_stack+0x16/0x18
[  156.381917]  [<c011e30f>] __might_sleep+0xe5/0xeb
[  156.381926]  [<c025942a>] lock_sock_nested+0x1d/0xc4
[  156.381937]  [<c01cc570>] selinux_netlbl_inode_permission+0x5a/0x8e
[  156.381946]  [<c01c2505>] selinux_file_permission+0x96/0x9b
[  156.381954]  [<c0175a0a>] vfs_write+0x8d/0x167
[  156.381962]  [<c017605a>] sys_write+0x3f/0x63
[  156.381971]  [<c01040c0>] syscall_call+0x7/0xb
[  156.381980]  =======================

	I have 35 of these messages is my console log at the moment.
The only difference that I've noticed between the messages is
that they are for variety of processes: most for tor, xntpd, sendmail,
procmail.  The processes get to this point by sys_write, sys_send, or
sys_sendto (procmail was doing a sys_sendto, so it was also doing something
related to networking, even though it is not a program one normally
would think of as doing any networking system calls).

	My system seems to work OK even with these warning messages.
I can debug it futher.  I just figure I should report it now, because
I may have done everyone a disservice by putting off reporting it in
rc1 in the hopes of finding time to debug it.

Adam Richter

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
  2006-12-24 21:21 Adam J. Richter
@ 2006-12-25  0:15 ` Parag Warudkar
  2006-12-25  0:25 ` Andrew Morton
  1 sibling, 0 replies; 8+ messages in thread
From: Parag Warudkar @ 2006-12-25  0:15 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: jmorris, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1331 bytes --]

On Mon, 25 Dec 2006, Adam J. Richter wrote:

> 	Under 2.6.20-rc1 and 2.6.20-rc2, I get the following complaint
> for several network programs running on my system:
>
> [  156.381868] BUG: sleeping function called from invalid context at net/core/sock.c:1523
> [  156.381876] in_atomic():1, irqs_disabled():0
> [  156.381881] no locks held by kio_http/9693.
> [  156.381886]  [<c01057a2>] show_trace_log_lvl+0x1a/0x2f
> [  156.381900]  [<c0105dab>] show_trace+0x12/0x14
> [  156.381908]  [<c0105e48>] dump_stack+0x16/0x18
> [  156.381917]  [<c011e30f>] __might_sleep+0xe5/0xeb
> [  156.381926]  [<c025942a>] lock_sock_nested+0x1d/0xc4
> [  156.381937]  [<c01cc570>] selinux_netlbl_inode_permission+0x5a/0x8e
> [  156.381946]  [<c01c2505>] selinux_file_permission+0x96/0x9b
> [  156.381954]  [<c0175a0a>] vfs_write+0x8d/0x167
> [  156.381962]  [<c017605a>] sys_write+0x3f/0x63
> [  156.381971]  [<c01040c0>] syscall_call+0x7/0xb
> [  156.381980]  =======================
>

lock_sock_nested can sleep, its BH counterpart doesn't.
selinux_netlbl_inode_permission() probably needs to use the BH counterpart 
unconditionally. But I am not sure if that function is always called from an atomic context. Assuming it is, the 
attached patch should fix this.

Compile tested.

Signed-off-by: Parag Warudkar <paragw@paragw.zapto.org>

Parag

[-- Attachment #2: selinux - use proper locking interfaces --]
[-- Type: TEXT/PLAIN, Size: 443 bytes --]

--- linux-2.6/security/selinux/ss/services.c.orig	2006-12-24 18:52:42.000000000 -0500
+++ linux-2.6/security/selinux/ss/services.c	2006-12-24 19:00:22.000000000 -0500
@@ -2660,9 +2660,9 @@
 		rcu_read_unlock();
 		return 0;
 	}
-	lock_sock(sock->sk);
+	bh_lock_sock_nested(sock->sk);
 	rc = selinux_netlbl_socket_setsid(sock, sksec->sid);
-	release_sock(sock->sk);
+	bh_unlock_sock(sock->sk);
 	rcu_read_unlock();
 
 	return rc;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
  2006-12-24 21:21 Adam J. Richter
  2006-12-25  0:15 ` Parag Warudkar
@ 2006-12-25  0:25 ` Andrew Morton
  2007-01-02  7:58   ` Adam J. Richter
  2007-01-02 21:14   ` Paul Moore
  1 sibling, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2006-12-25  0:25 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: linux-kernel, netdev, Paul Moore

On Mon, 25 Dec 2006 05:21:24 +0800
"Adam J. Richter" <adam@yggdrasil.com> wrote:

> 	Under 2.6.20-rc1 and 2.6.20-rc2, I get the following complaint
> for several network programs running on my system:
> 
> [  156.381868] BUG: sleeping function called from invalid context at net/core/sock.c:1523
> [  156.381876] in_atomic():1, irqs_disabled():0
> [  156.381881] no locks held by kio_http/9693.
> [  156.381886]  [<c01057a2>] show_trace_log_lvl+0x1a/0x2f
> [  156.381900]  [<c0105dab>] show_trace+0x12/0x14
> [  156.381908]  [<c0105e48>] dump_stack+0x16/0x18
> [  156.381917]  [<c011e30f>] __might_sleep+0xe5/0xeb
> [  156.381926]  [<c025942a>] lock_sock_nested+0x1d/0xc4
> [  156.381937]  [<c01cc570>] selinux_netlbl_inode_permission+0x5a/0x8e
> [  156.381946]  [<c01c2505>] selinux_file_permission+0x96/0x9b
> [  156.381954]  [<c0175a0a>] vfs_write+0x8d/0x167
> [  156.381962]  [<c017605a>] sys_write+0x3f/0x63
> [  156.381971]  [<c01040c0>] syscall_call+0x7/0xb
> [  156.381980]  =======================
> 

There's a glaring bug in selinux_netlbl_inode_permission() - taking
lock_sock() inside rcu_read_lock().

I would again draw attention to Documentation/SubmitChecklist.  In
particular please always always always enable all kernel debugging options
when developing and testing new kernel code.  And everything else in that
file, too.

<guesses that this was tested on ia64>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid  context in 2.6.20-rc[12]
@ 2006-12-26  5:30 Paul Moore
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Moore @ 2006-12-26  5:30 UTC (permalink / raw)
  To: akpm; +Cc: adam, linux-kernel, netdev

-----Original Message-----
From: Andrew Morton <akpm@osdl.org>
Date: Sunday, Dec 24, 2006 7:25 pm
Subject: Re: selinux networking: sleeping functin called from invalid  context in 2.6.20-rc[12]

On Mon, 25 Dec 2006 05:21:24 +0800
>"Adam J. Richter" <adam@yggdrasil.com> wrote:
>
>> 	Under 2.6.20-rc1 and 2.6.20-rc2, I get the following complaint
> for several network programs running on my system:
> 
> [  156.381868] BUG: sleeping function called from invalid context at net/core/sock.c:1523
> [  156.381876] in_atomic():1, irqs_disabled():0
> [  156.381881] no locks held by kio_http/9693.
> [  156.381886]  [<c01057a2>] show_trace_log_lvl+0x1a/0x2f
> [  156.381900]  [<c0105dab>] show_trace+0x12/0x14
> [  156.381908]  [<c0105e48>] dump_stack+0x16/0x18
> [  156.381917]  [<c011e30f>] __might_sleep+0xe5/0xeb
> [  156.381926]  [<c025942a>] lock_sock_nested+0x1d/0xc4
> [  156.381937]  [<c01cc570>] selinux_netlbl_inode_permission+0x5a/0x8e
> [  156.381946]  [<c01c2505>] selinux_file_permission+0x96/0x9b
> [  156.381954]  [<c0175a0a>] vfs_write+0x8d/0x167
> [  156.381962]  [<c017605a>] sys_write+0x3f/0x63
> [  156.381971]  [<c01040c0>] syscall_call+0x7/0xb
> [  156.381980]  =======================
> 
>
>There's a glaring bug in selinux_netlbl_inode_permission() - taking lock_sock() inside rcu_read_lock().
>
>I would again draw attention to Documentation/SubmitChecklist.  In
>particular please always always always enable all kernel debugging options when developing and testing new kernel code.  And everything else in that file, too.
>

I apologize for the mistake - I'm still trying to get a firm grasp on some of the finer points of Linux kernel development and I obviously missed something here.  Unfortunately, due to the holiday I won't be able to write/test/submit a patch until after the first of the year but I promise to do so as soon as I am able.

. paul moore
. linux security @ hp


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
  2006-12-25  0:25 ` Andrew Morton
@ 2007-01-02  7:58   ` Adam J. Richter
  2007-01-02 21:25     ` Paul Moore
  2007-01-02 21:14   ` Paul Moore
  1 sibling, 1 reply; 8+ messages in thread
From: Adam J. Richter @ 2007-01-02  7:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev, Paul Moore

[-- Attachment #1: Type: text/plain, Size: 2131 bytes --]

On Sun, Dec 24, 2006 at 04:25:11PM -0800, Andrew Morton wrote:
> On Mon, 25 Dec 2006 05:21:24 +0800
> "Adam J. Richter" <adam@yggdrasil.com> wrote:
> 
>> 	Under 2.6.20-rc1 and 2.6.20-rc2, I get the following complaint
>> for several network programs running on my system:
>> 
>> [  156.381868] BUG: sleeping function called from invalid context at net/core/sock.c:1523
[...]
> There's a glaring bug in selinux_netlbl_inode_permission() - taking
> lock_sock() inside rcu_read_lock().
> 
> I would again draw attention to Documentation/SubmitChecklist.  In
> particular please always always always enable all kernel debugging options
> when developing and testing new kernel code.  And everything else in that
> file, too.
> 
> <guesses that this was tested on ia64>

	I have not yet performed the 21 steps of
linux-2.6.20-rc3/Documentation/SubmitChecklist, which I think is a
great objectives list for future automation or some kind of community
web site.  I hope to find time to make progress through that
checklist, but, in the meantime, I think the world may nevertheless be
infinitesmally better off if I post the patch that I'm currently
using that seems to fix the problem, seeing as how rc3 has passed
with no fix incorporated.

	I think the intent of the offending code was to avoid doing
a lock_sock() in a presumably common case where there was no need to
take the lock.  So, I have kept the presumably fast test to exit
early.

	When it turns out to be necessary to take lock_sock(), RCU is
unlocked, then lock_sock is taken, the RCU is locked again, and
the test is repeated.

	If I am wrong about lock_sock being expensive, I can
delete the lines that do the early return.

	By the way, in a change not included in this patch,
I also tried consolidating the RCU locking in this file into a macro
IF_NLBL_REQUIRE(sksec, action), where "action" is the code
fragment to be executed with rcu_read_lock() held, although this
required splitting a couple of functions in half.

	Anyhow, here is my current patch as MIME attachment.
Comments and labor in getting it through SubmitChecklist would
both be welcome.

Adam Richter

[-- Attachment #2: selinux.diff --]
[-- Type: text/plain, Size: 621 bytes --]

--- linux-2.6.20-rc3/security/selinux/ss/services.c	2007-01-02 01:47:40.000000000 +0800
+++ linux/security/selinux/ss/services.c	2007-01-02 15:36:30.000000000 +0800
@@ -2658,14 +2658,22 @@
 	rcu_read_lock();
 	if (sksec->nlbl_state != NLBL_REQUIRE) {
 		rcu_read_unlock();
 		return 0;
 	}
+	rcu_read_unlock();
+
+
+	rc = 0;
 	lock_sock(sock->sk);
-	rc = selinux_netlbl_socket_setsid(sock, sksec->sid);
-	release_sock(sock->sk);
+	rcu_read_lock();
+
+	if (sksec->nlbl_state == NLBL_REQUIRE)
+		rc = selinux_netlbl_socket_setsid(sock, sksec->sid);
+
 	rcu_read_unlock();
+	release_sock(sock->sk);
 
 	return rc;
 }
 
 /**

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
  2006-12-25  0:25 ` Andrew Morton
  2007-01-02  7:58   ` Adam J. Richter
@ 2007-01-02 21:14   ` Paul Moore
  1 sibling, 0 replies; 8+ messages in thread
From: Paul Moore @ 2007-01-02 21:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Adam J. Richter, linux-kernel, netdev, Ingo Molnar

On Sunday, December 24 2006 7:25 pm, Andrew Morton wrote:
> On Mon, 25 Dec 2006 05:21:24 +0800
>
> "Adam J. Richter" <adam@yggdrasil.com> wrote:
> > 	Under 2.6.20-rc1 and 2.6.20-rc2, I get the following complaint
> > for several network programs running on my system:
> >
> > [  156.381868] BUG: sleeping function called from invalid context at
> > net/core/sock.c:1523 [  156.381876] in_atomic():1, irqs_disabled():0
> > [  156.381881] no locks held by kio_http/9693.
> > [  156.381886]  [<c01057a2>] show_trace_log_lvl+0x1a/0x2f
> > [  156.381900]  [<c0105dab>] show_trace+0x12/0x14
> > [  156.381908]  [<c0105e48>] dump_stack+0x16/0x18
> > [  156.381917]  [<c011e30f>] __might_sleep+0xe5/0xeb
> > [  156.381926]  [<c025942a>] lock_sock_nested+0x1d/0xc4
> > [  156.381937]  [<c01cc570>] selinux_netlbl_inode_permission+0x5a/0x8e
> > [  156.381946]  [<c01c2505>] selinux_file_permission+0x96/0x9b
> > [  156.381954]  [<c0175a0a>] vfs_write+0x8d/0x167
> > [  156.381962]  [<c017605a>] sys_write+0x3f/0x63
> > [  156.381971]  [<c01040c0>] syscall_call+0x7/0xb
> > [  156.381980]  =======================
>
> There's a glaring bug in selinux_netlbl_inode_permission() - taking
> lock_sock() inside rcu_read_lock().

Sorry for the delay, I'm finally back at a machine where I can look at the 
code.

I've been thinking about Parag Warudkar's and Ingo Molnar's patches as well as 
what the selinux_netlbl_inode_permission() function actually needs to do; I 
think the best answer isn't so much to change the socket locking calls, but 
to restructure the function a bit.

Currently the function does the following (in order):

 1. do some quick sanity checks (is the inode a socket, etc)
 2. rcu_read_lock()
 3. check the nlbl_state is set to NLBL_REQUIRE (otherwise return)
 4. lock_sock()
 5. netlabel magic
 6. release_sock()
 7. rcu_read_unlock()

I propose changing it to the following (in order):

  1. do some quick sanity checks (is the inode a socket, etc)
  2. rcu_read_lock()
  3. check the nlbl_state is set to NLBL_REQUIRE (otherwise return)
  4. rcu_read_unlock()
  5. lock_sock()
  6. rcu_read_lock()
  7. verify that nlbl_state is still set to NLBL_REQUIRE (otherwise return)
  8. netlabel magic
  9. rcu_read_unlock()
 10. release_sock()

This way we no longer need to worry about any special socket locking.  I 
realize this adds a bit of duplicated work but it is my understanding that 
RCU lock/unlock operations are *very* fast so the extra RCU lock operations 
shouldn't be too bad and the extra nlbl_state check should be of minimal 
cost.

However, I'm not the expert here, just a guy learning as he goes so any 
comments/feedback on the above proposal are welcome.  If it turns out this 
approach has some merit I'll put together a patch and send it out.

Once again, sorry for the regression.

-- 
paul moore
linux security @ hp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
  2007-01-02  7:58   ` Adam J. Richter
@ 2007-01-02 21:25     ` Paul Moore
  2007-01-02 23:37       ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Moore @ 2007-01-02 21:25 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: Andrew Morton, linux-kernel, netdev

On Tuesday, January 2 2007 2:58 am, Adam J. Richter wrote:
> 	I have not yet performed the 21 steps of
> linux-2.6.20-rc3/Documentation/SubmitChecklist, which I think is a
> great objectives list for future automation or some kind of community
> web site.  I hope to find time to make progress through that
> checklist, but, in the meantime, I think the world may nevertheless be
> infinitesmally better off if I post the patch that I'm currently
> using that seems to fix the problem, seeing as how rc3 has passed
> with no fix incorporated.
>
> 	I think the intent of the offending code was to avoid doing
> a lock_sock() in a presumably common case where there was no need to
> take the lock.  So, I have kept the presumably fast test to exit
> early.
>
> 	When it turns out to be necessary to take lock_sock(), RCU is
> unlocked, then lock_sock is taken, the RCU is locked again, and
> the test is repeated.

Hi Adam,

I'm sorry I just saw this mail (mail not sent directly to me get shuffled off 
to a folder).  I agree with your patch, I think dropping and then re-taking 
the RCU lock is the best way to go, although I'm curious to see what others 
have to say.

The only real comment I have with the patch is that there is some extra 
whitespace which could probably be removed, but that is more of a style nit 
than anything substantial.

> 	By the way, in a change not included in this patch,
> I also tried consolidating the RCU locking in this file into a macro
> IF_NLBL_REQUIRE(sksec, action), where "action" is the code
> fragment to be executed with rcu_read_lock() held, although this
> required splitting a couple of functions in half.

>From your description above I'm not sure I like that approach so much, 
however, I could be misunderstanding something.  Do you have a small example 
you could send?

-- 
paul moore
linux security @ hp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12]
  2007-01-02 21:25     ` Paul Moore
@ 2007-01-02 23:37       ` David Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2007-01-02 23:37 UTC (permalink / raw)
  To: paul.moore; +Cc: adam, akpm, linux-kernel, netdev

From: Paul Moore <paul.moore@hp.com>
Date: Tue, 2 Jan 2007 16:25:24 -0500

> I'm sorry I just saw this mail (mail not sent directly to me get
> shuffled off to a folder).  I agree with your patch, I think
> dropping and then re-taking the RCU lock is the best way to go,
> although I'm curious to see what others have to say.

I think this is fine too.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-01-02 23:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-26  5:30 selinux networking: sleeping functin called from invalid context in 2.6.20-rc[12] Paul Moore
  -- strict thread matches above, loose matches on Subject: below --
2006-12-24 21:21 Adam J. Richter
2006-12-25  0:15 ` Parag Warudkar
2006-12-25  0:25 ` Andrew Morton
2007-01-02  7:58   ` Adam J. Richter
2007-01-02 21:25     ` Paul Moore
2007-01-02 23:37       ` David Miller
2007-01-02 21:14   ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox