netdev.vger.kernel.org archive mirror
From: Patrick McHardy <kaber@trash.net>
To: Jarek Poplawski <jarkao2@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>,
	netdev@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Thomas Graf <tgraf@suug.ch>
Subject: Re: netlink circular locking dependency
Date: Mon, 16 Jun 2008 23:48:01 +0200
Message-ID: <4856DF91.30606@trash.net>
In-Reply-To: <20080616213417.GA14988@ami.dom.local>

[-- Attachment #1: Type: text/plain, Size: 1713 bytes --]

Jarek Poplawski wrote:
> Marcel Holtmann wrote, On 06/14/2008 02:35 PM:
> ...
>   
>> =======================================================
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.26-rc2 #5
>> -------------------------------------------------------
>> hcid/4136 is trying to acquire lock:
>>  (genl_mutex){--..}, at: [<c0000000002ace4c>] .ctrl_dumpfamily+0x74/0x174
>>
>> but task is already holding lock:
>>  (nlk->cb_mutex){--..}, at: [<c0000000002a766c>] .netlink_dump+0x58/0x27c
>>
>> which lock already depends on the new lock.
>>     
> ...
>
> Hi,
>
> IMHO it looks like a real lockup threat. Probably it needs something
> better, but for now here is my simplistic patch proposal for testing.
>   
So we have:

genl_rcv()            : take genl_mutex
genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
netlink_dump_start(),
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                        second time

netlink_rcv()         : call netlink_dump()
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : take genl_mutex

The second path takes the two mutexes in the opposite order from the
first, which is a real bug.

It seems the best fix is to use genl_mutex as the netlink cb_mutex,
drop genl_mutex before calling netlink_dump_start() and not take it
in ctrl_dumpfamily(), relying completely on af_netlink.c for dump
locking. Unfortunately this creates a race, since the ops passed to
netlink_dump_start() are also protected by the mutex, so this patch
is just for testing whether it fixes the warning.

On second thought - that race seems to be present already since
the ops can be unregistered and the module unloaded while a dump
is in progress.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1378 bytes --]

diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index f5aa23c..3e1191c 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -444,8 +444,11 @@ static int genl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		if (ops->dumpit == NULL)
 			return -EOPNOTSUPP;
 
-		return netlink_dump_start(genl_sock, skb, nlh,
-					  ops->dumpit, ops->done);
+		genl_unlock();
+		err = netlink_dump_start(genl_sock, skb, nlh,
+					 ops->dumpit, ops->done);
+		genl_lock();
+		return err;
 	}
 
 	if (ops->doit == NULL)
@@ -603,9 +606,6 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
 	int chains_to_skip = cb->args[0];
 	int fams_to_skip = cb->args[1];
 
-	if (chains_to_skip != 0)
-		genl_lock();
-
 	for (i = 0; i < GENL_FAM_TAB_SIZE; i++) {
 		if (i < chains_to_skip)
 			continue;
@@ -623,9 +623,6 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
 	}
 
 errout:
-	if (chains_to_skip != 0)
-		genl_unlock();
-
 	cb->args[0] = i;
 	cb->args[1] = n;
 
@@ -770,7 +767,7 @@ static int __init genl_init(void)
 
 	/* we'll bump the group number right afterwards */
 	genl_sock = netlink_kernel_create(&init_net, NETLINK_GENERIC, 0,
-					  genl_rcv, NULL, THIS_MODULE);
+					  genl_rcv, &genl_mutex, THIS_MODULE);
 	if (genl_sock == NULL)
 		panic("GENL: Cannot initialize generic netlink\n");
 

