* caif BUG() with network namespaces
@ 2011-10-21 20:51 Woodhouse, David
2011-10-21 21:55 ` Sjur Brændeland
0 siblings, 1 reply; 8+ messages in thread
From: Woodhouse, David @ 2011-10-21 20:51 UTC (permalink / raw)
To: Sjur Braendeland; +Cc: netdev@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]
When Chrome initialises its sandbox, we get a BUG:
[ 63.674528] ------------[ cut here ]------------
[ 63.674540] kernel BUG at net/caif/caif_dev.c:66!
[ 63.674547] invalid opcode: 0000 [#1] PREEMPT SMP
[ 63.674556] Modules linked in: iwlagn serio_raw [last unloaded: battery]
[ 63.674568]
[ 63.674575] Pid: 801, comm: chrome-sandbox Not tainted 3.0.0-4.1-adaptation-pc #1 Intel Corporation Cedartrail platform/To be filled by O.E.M.
[ 63.674589] EIP: 0060:-[<c0baaf8c>] EFLAGS: 00210246 CPU: 1
[ 63.674602] EIP is at caif_device_list+0x4c/0x50
[ 63.674608] EAX: 00000000 EBX: 00000000 ECX: e1482800 EDX: 00000010
[ 63.674614] ESI: e14133c0 EDI: 00000010 EBP: e141dda0 ESP: e141dd98
[ 63.674620] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 63.674627] Process chrome-sandbox (pid: 801, ti=e141c000 task=e52a8ff0 task.ti=e141c000)
[ 63.674632] Stack:
[ 63.674636] e1482800 00000000 e141ddd8 c0bab291 c04da339 e141ddd8 c0ad64ff 00000011
[ 63.674655] e14133c0 e141ddd8 c0b335cd e1482800 00000010 c0f612bc 00000000 fffffff3
[ 63.674672] e141de08 c046e6e7 e141dde8 c0bcda38 e141de20 e1482800 00000010 c0f58ac0
[ 63.674690] Call Trace:
[ 63.674700] [<c0bab291>] caif_device_notify+0x21/0x2d0
[ 63.674710] [<c04da339>] ? pcpu_alloc_area+0x109/0x250
[ 63.674720] [<c0ad64ff>] ? inetdev_event+0x1f/0x3a0
[ 63.674728] [<c0b335cd>] ? packet_notifier+0x8d/0x180
[ 63.674738] [<c046e6e7>] notifier_call_chain+0x47/0x90
[ 63.674747] [<c0bcda38>] ? mutex_unlock+0x8/0x10
[ 63.674756] [<c046e77b>] raw_notifier_call_chain+0x1b/0x20
[ 63.674766] [<c0a7d938>] call_netdevice_notifiers+0x28/0x60
[ 63.674773] [<c04dac6a>] ? __alloc_percpu+0xa/0x10
[ 63.674782] [<c0a80c6d>] register_netdevice+0xed/0x210
[ 63.674790] [<c0bcdd0e>] ? mutex_lock+0x1e/0x30
[ 63.674797] [<c0a80da2>] register_netdev+0x12/0x20
[ 63.674807] [<c0868ec3>] loopback_net_init+0x43/0x90
[ 63.674815] [<c0a7864f>] ops_init+0x2f/0x80
[ 63.674822] [<c0a7877f>] setup_net+0x4f/0xe0
[ 63.674830] [<c0a78ccc>] copy_net_ns+0x6c/0xe0
[ 63.674838] [<c046de31>] create_new_namespaces+0xc1/0x150
[ 63.674847] [<c046df82>] copy_namespaces+0x72/0xb0
[ 63.674856] [<c0448efe>] copy_process+0x60e/0xc70
[ 63.674864] [<c04df3c1>] ? handle_mm_fault+0x141/0x250
[ 63.674872] [<c04495ef>] do_fork+0x5f/0x300
Is this already known/fixed?
https://bugs.meego.com/show_bug.cgi?id=23540
--
Sent with MeeGo's ActiveSync support.
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 4370 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: caif BUG() with network namespaces
2011-10-21 20:51 caif BUG() with network namespaces Woodhouse, David
@ 2011-10-21 21:55 ` Sjur Brændeland
2011-10-23 21:21 ` [PATCH] Fix " Woodhouse, David
2011-10-23 21:24 ` [non-quoted-printable PATCH] " Woodhouse, David
0 siblings, 2 replies; 8+ messages in thread
From: Sjur Brændeland @ 2011-10-21 21:55 UTC (permalink / raw)
To: Woodhouse, David; +Cc: netdev@vger.kernel.org
Hi David,
> When Chrome initialises its sandbox, we get a BUG:
>
> [ 63.674528] ------------[ cut here ]------------
> [ 63.674540] kernel BUG at net/caif/caif_dev.c:66!
...
> Is this already known/fixed?
>
> https://bugs.meego.com/show_bug.cgi?id=23540
No, I'm afraid I haven't seen this one before. But I'm afraid we
haven't tested much with net-namespaces enabled either.
Regards,
Sjur
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Fix caif BUG() with network namespaces
2011-10-21 21:55 ` Sjur Brændeland
@ 2011-10-23 21:21 ` Woodhouse, David
2011-10-23 21:24 ` [non-quoted-printable PATCH] " Woodhouse, David
1 sibling, 0 replies; 8+ messages in thread
From: Woodhouse, David @ 2011-10-23 21:21 UTC (permalink / raw)
To: Sjur Brændeland, davem@redhat.com; +Cc: netdev@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2404 bytes --]
The caif code will register its own pernet_operations, and then register
a netdevice_notifier. Each time the netdevice_notifier is triggered,
it'll do some stuff... including a lookup of its own pernet stuff with
net_generic().
If the net_generic() call ever returns NULL, the caif code will BUG().
That doesn't seem *so* unreasonable, I suppose — it does seem like it
should never happen.
However, it *does* happen. When we clone a network namespace,
setup_net() runs through all the pernet_operations one at a time. It
gets to loopback before it gets to caif. And loopback_net_init()
registers a netdevice... while caif hasn't been initialised. So the caif
netdevice notifier triggers, and immediately goes BUG().
I'm not entirely sure how best to fix this in the general case. Perhaps
the netdevice_notifier registration should be pernet too, rather than
global? Or perhaps we should suppress the notifier calls during
setup_net() and flush them at the end after everything has been
initialised?
But really, I'm inclined to just take the simple approach. Make
caif_device_notify() *not* go looking for its pernet data structures if
the device it's being notified about isn't a caif device in the first
place. This simple patch is sufficient to avoid the problem, and is
probably good enough.
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 7f9ac07..47fc8f3 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -212,8 +212,7 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
enum cfcnfg_phy_preference pref;
enum cfcnfg_phy_type phy_type;
struct cfcnfg *cfg;
- struct caif_device_entry_list *caifdevs =
- caif_device_list(dev_net(dev));
+ struct caif_device_entry_list *caifdevs;
if (dev->type != ARPHRD_CAIF)
return 0;
@@ -222,6 +221,8 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
if (cfg == NULL)
return 0;
+ caifdevs = caif_device_list(dev_net(dev));
+
switch (what) {
case NETDEV_REGISTER:
caifd = caif_device_alloc(dev);
Sent with MeeGo's ActiveSync support.
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 4370 bytes --]
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [non-quoted-printable PATCH] Fix caif BUG() with network namespaces
2011-10-21 21:55 ` Sjur Brændeland
2011-10-23 21:21 ` [PATCH] Fix " Woodhouse, David
@ 2011-10-23 21:24 ` Woodhouse, David
2011-10-24 15:51 ` Sjur Brændeland
1 sibling, 1 reply; 8+ messages in thread
From: Woodhouse, David @ 2011-10-23 21:24 UTC (permalink / raw)
To: Sjur Brændeland, davem@redhat.com; +Cc: netdev@vger.kernel.org
The caif code will register its own pernet_operations, and then register
a netdevice_notifier. Each time the netdevice_notifier is triggered,
it'll do some stuff... including a lookup of its own pernet stuff with
net_generic().
If the net_generic() call ever returns NULL, the caif code will BUG().
That doesn't seem *so* unreasonable, I suppose — it does seem like it
should never happen.
However, it *does* happen. When we clone a network namespace,
setup_net() runs through all the pernet_operations one at a time. It
gets to loopback before it gets to caif. And loopback_net_init()
registers a netdevice... while caif hasn't been initialised. So the caif
netdevice notifier triggers, and immediately goes BUG().
I'm not entirely sure how best to fix this in the general case. Perhaps
the netdevice_notifier registration should be pernet too, rather than
global? Or perhaps we should suppress the notifier calls during
setup_net() and flush them at the end after everything has been
initialised?
But really, I'm inclined to just take the simple approach. Make
caif_device_notify() *not* go looking for its pernet data structures if
the device it's being notified about isn't a caif device in the first
place. This simple patch is sufficient to avoid the problem, and is
probably good enough.
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
Sorry, forgot to disable signature on previous patch so it got sent as
QP. This should be better.
diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 7f9ac07..47fc8f3 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -212,8 +212,7 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
enum cfcnfg_phy_preference pref;
enum cfcnfg_phy_type phy_type;
struct cfcnfg *cfg;
- struct caif_device_entry_list *caifdevs =
- caif_device_list(dev_net(dev));
+ struct caif_device_entry_list *caifdevs;
if (dev->type != ARPHRD_CAIF)
return 0;
@@ -222,6 +221,8 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
if (cfg == NULL)
return 0;
+ caifdevs = caif_device_list(dev_net(dev));
+
switch (what) {
case NETDEV_REGISTER:
caifd = caif_device_alloc(dev);
Sent with MeeGo's ActiveSync support.
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [non-quoted-printable PATCH] Fix caif BUG() with network namespaces
2011-10-23 21:24 ` [non-quoted-printable PATCH] " Woodhouse, David
@ 2011-10-24 15:51 ` Sjur Brændeland
2011-10-24 22:28 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Sjur Brændeland @ 2011-10-24 15:51 UTC (permalink / raw)
To: Woodhouse, David; +Cc: davem@redhat.com, netdev@vger.kernel.org
Hi David,
> The caif code will register its own pernet_operations, and then register
> a netdevice_notifier. Each time the netdevice_notifier is triggered,
> it'll do some stuff... including a lookup of its own pernet stuff with
> net_generic().
>
> If the net_generic() call ever returns NULL, the caif code will BUG().
> That doesn't seem *so* unreasonable, I suppose — it does seem like it
> should never happen.
>
> However, it *does* happen. When we clone a network namespace,
> setup_net() runs through all the pernet_operations one at a time. It
> gets to loopback before it gets to caif. And loopback_net_init()
> registers a netdevice... while caif hasn't been initialised. So the caif
> netdevice notifier triggers, and immediately goes BUG().
>
> I'm not entirely sure how best to fix this in the general case. Perhaps
> the netdevice_notifier registration should be pernet too, rather than
> global? Or perhaps we should suppress the notifier calls during
> setup_net() and flush them at the end after everything has been
> initialised?
>
> But really, I'm inclined to just take the simple approach. Make
> caif_device_notify() *not* go looking for its pernet data structures if
> the device it's being notified about isn't a caif device in the first
> place. This simple patch is sufficient to avoid the problem, and is
> probably good enough.
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Thank you for analyzing and fixing this David, this looks good to me.
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [non-quoted-printable PATCH] Fix caif BUG() with network namespaces
2011-10-24 15:51 ` Sjur Brændeland
@ 2011-10-24 22:28 ` David Miller
2011-10-25 7:25 ` [PATCH] " David Woodhouse
0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2011-10-24 22:28 UTC (permalink / raw)
To: sjurbren; +Cc: david.woodhouse, netdev
From: Sjur Brændeland <sjurbren@gmail.com>
Date: Mon, 24 Oct 2011 17:51:54 +0200
> Hi David,
>
>> The caif code will register its own pernet_operations, and then register
>> a netdevice_notifier. Each time the netdevice_notifier is triggered,
>> it'll do some stuff... including a lookup of its own pernet stuff with
>> net_generic().
>>
>> If the net_generic() call ever returns NULL, the caif code will BUG().
>> That doesn't seem *so* unreasonable, I suppose — it does seem like it
>> should never happen.
>>
>> However, it *does* happen. When we clone a network namespace,
>> setup_net() runs through all the pernet_operations one at a time. It
>> gets to loopback before it gets to caif. And loopback_net_init()
>> registers a netdevice... while caif hasn't been initialised. So the caif
>> netdevice notifier triggers, and immediately goes BUG().
>>
>> I'm not entirely sure how best to fix this in the general case. Perhaps
>> the netdevice_notifier registration should be pernet too, rather than
>> global? Or perhaps we should suppress the notifier calls during
>> setup_net() and flush them at the end after everything has been
>> initialised?
>>
>> But really, I'm inclined to just take the simple approach. Make
>> caif_device_notify() *not* go looking for its pernet data structures if
>> the device it's being notified about isn't a caif device in the first
>> place. This simple patch is sufficient to avoid the problem, and is
>> probably good enough.
>> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
>
> Thank you for analyzing and fixing this David, this looks good to me.
> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
David W., the copy that ended up in patchwork was using DOS newlines:
http://patchwork.ozlabs.org/patch/121250/
Click on Download "mbox" to see what I mean.
Can you get your email configuration straightened out before
resubmitting this patch a third time?
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Fix caif BUG() with network namespaces
2011-10-24 22:28 ` David Miller
@ 2011-10-25 7:25 ` David Woodhouse
2011-10-25 23:22 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: David Woodhouse @ 2011-10-25 7:25 UTC (permalink / raw)
To: David Miller; +Cc: sjurbren, netdev
The caif code will register its own pernet_operations, and then register
a netdevice_notifier. Each time the netdevice_notifier is triggered,
it'll do some stuff... including a lookup of its own pernet stuff with
net_generic().
If the net_generic() call ever returns NULL, the caif code will BUG().
That doesn't seem *so* unreasonable, I suppose — it does seem like it
should never happen.
However, it *does* happen. When we clone a network namespace,
setup_net() runs through all the pernet_operations one at a time. It
gets to loopback before it gets to caif. And loopback_net_init()
registers a netdevice... while caif hasn't been initialised. So the caif
netdevice notifier triggers, and immediately goes BUG().
We could imagine a complex and overengineered solution to this generic
class of problems, but this patch takes the simple approach. It just
makes caif_device_notify() *not* go looking for its pernet data
structures if the device it's being notified about isn't a caif device
in the first place.
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
---
On Mon, 2011-10-24 at 18:28 -0400, David Miller wrote:
> David W., the copy that ended up in patchwork was using DOS newlines:
> http://patchwork.ozlabs.org/patch/121250/
> Click on Download "mbox" to see what I mean.
> Can you get your email configuration straightened out before
> resubmitting this patch a third time?
Sorry about that. It's not a configuration thing; just me being stupid
and using the wrong account to send. I *know* Exchange is not an email
server, and that signing emails causes QP encoding. This one should be
fine.
Having said that... in fact, email is *supposed* to have DOS newlines.
The SMTP protocol requires it. When receiving on *nix boxes, we tend to
convert CRLF to LF.
I played a little with applying the emails I sent with 'git am', and it
actually copes fine with the original Quoted-Printable email, as it
should. It *almost* copes with the base64 email, but it forgets to do
the CRLF to LF conversion. I'll look at fixing it (in the spirit of 'be
liberal in what you accept)... but of course we should also be
conservative in what we *send*, so here is (hopefully) an unencoded
version. Apologies again for the inconvenience.
/me double-checks that the signature is turned off, and the sending
account is not the Exchange one... <send>
net/caif/caif_dev.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 7f9ac07..47fc8f3 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -212,8 +212,7 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
enum cfcnfg_phy_preference pref;
enum cfcnfg_phy_type phy_type;
struct cfcnfg *cfg;
- struct caif_device_entry_list *caifdevs =
- caif_device_list(dev_net(dev));
+ struct caif_device_entry_list *caifdevs;
if (dev->type != ARPHRD_CAIF)
return 0;
@@ -222,6 +221,8 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
if (cfg == NULL)
return 0;
+ caifdevs = caif_device_list(dev_net(dev));
+
switch (what) {
case NETDEV_REGISTER:
caifd = caif_device_alloc(dev);
--
1.7.6.4
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Fix caif BUG() with network namespaces
2011-10-25 7:25 ` [PATCH] " David Woodhouse
@ 2011-10-25 23:22 ` David Miller
0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2011-10-25 23:22 UTC (permalink / raw)
To: dwmw2; +Cc: sjurbren, netdev
From: David Woodhouse <dwmw2@infradead.org>
Date: Tue, 25 Oct 2011 09:25:21 +0200
> The caif code will register its own pernet_operations, and then register
> a netdevice_notifier. Each time the netdevice_notifier is triggered,
> it'll do some stuff... including a lookup of its own pernet stuff with
> net_generic().
>
> If the net_generic() call ever returns NULL, the caif code will BUG().
> That doesn't seem *so* unreasonable, I suppose — it does seem like it
> should never happen.
>
> However, it *does* happen. When we clone a network namespace,
> setup_net() runs through all the pernet_operations one at a time. It
> gets to loopback before it gets to caif. And loopback_net_init()
> registers a netdevice... while caif hasn't been initialised. So the caif
> netdevice notifier triggers, and immediately goes BUG().
>
> We could imagine a complex and overengineered solution to this generic
> class of problems, but this patch takes the simple approach. It just
> makes caif_device_notify() *not* go looking for its pernet data
> structures if the device it's being notified about isn't a caif device
> in the first place.
>
> Cc: stable@kernel.org
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Applied, thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-10-25 23:24 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-21 20:51 caif BUG() with network namespaces Woodhouse, David
2011-10-21 21:55 ` Sjur Brændeland
2011-10-23 21:21 ` [PATCH] Fix " Woodhouse, David
2011-10-23 21:24 ` [non-quoted-printable PATCH] " Woodhouse, David
2011-10-24 15:51 ` Sjur Brændeland
2011-10-24 22:28 ` David Miller
2011-10-25 7:25 ` [PATCH] " David Woodhouse
2011-10-25 23:22 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).