* net, batman: lockdep circular dependency warning
@ 2012-11-12 20:37 Sasha Levin
2012-12-04 14:51 ` Sven Eckelmann
0 siblings, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2012-11-12 20:37 UTC (permalink / raw)
To: Marek Lindner, Simon Wunderlich, Antonio Quartulli,
David S. Miller
Cc: b.a.t.m.a.n, netdev, linux-kernel@vger.kernel.org, Dave Jones
Hi all,
While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next
kernel, I've stumbled on the following:
[ 1002.969392] ======================================================
[ 1002.971639] [ INFO: possible circular locking dependency detected ]
[ 1002.975805] 3.7.0-rc5-next-20121112-sasha-00018-g2f4ce0e #127 Tainted: G W
[ 1002.983691] -------------------------------------------------------
[ 1002.983691] trinity-child18/8149 is trying to acquire lock:
[ 1002.983691] (s_active#313){++++.+}, at: [<ffffffff812f9941>] sysfs_addrm_finish+0x31/0x60
[ 1002.983691]
[ 1002.983691] but task is already holding lock:
[ 1002.983691] (rtnl_mutex){+.+.+.}, at: [<ffffffff834fcc62>] rtnl_lock+0x12/0x20
[ 1002.983691]
[ 1002.983691] which lock already depends on the new lock.
[ 1002.983691]
[ 1002.983691]
[ 1002.983691] the existing dependency chain (in reverse order) is:
[ 1002.983691]
-> #1 (rtnl_mutex){+.+.+.}:
[ 1002.983691] [<ffffffff81180d0a>] check_prevs_add+0xba/0x1a0
[ 1002.983691] [<ffffffff81181490>] validate_chain.isra.23+0x6a0/0x7b0
[ 1002.983691] [<ffffffff81183e3b>] __lock_acquire+0x9db/0xa90
[ 1002.983691] [<ffffffff8118635a>] lock_acquire+0x1ca/0x270
[ 1002.983691] [<ffffffff83bf647a>] __mutex_lock_common+0x5a/0x550
[ 1002.983691] [<ffffffff83bf69af>] mutex_lock_nested+0x3f/0x50
[ 1002.983691] [<ffffffff834fcc62>] rtnl_lock+0x12/0x20
[ 1002.983691] [<ffffffff834eeb3c>] netdev_run_todo+0x7c/0x180
[ 1002.983691] [<ffffffff834fcfa9>] rtnl_unlock+0x9/0x10
[ 1002.983691] [<ffffffff839d7da1>] batadv_store_mesh_iface+0x121/0x160
[ 1002.983691] [<ffffffff819d303f>] kobj_attr_store+0xf/0x30
[ 1002.983691] [<ffffffff812f7d41>] sysfs_write_file+0x101/0x170
[ 1002.983691] [<ffffffff8127ac58>] vfs_write+0xb8/0x180
[ 1002.983691] [<ffffffff8127af57>] sys_pwrite64+0x67/0x90
[ 1002.983691] [<ffffffff83bfafd8>] tracesys+0xe1/0xe6
[ 1002.983691]
-> #0 (s_active#313){++++.+}:
[ 1002.983691] [<ffffffff81180725>] check_prev_add+0x115/0x640
[ 1002.983691] [<ffffffff81180d0a>] check_prevs_add+0xba/0x1a0
[ 1002.983691] [<ffffffff81181490>] validate_chain.isra.23+0x6a0/0x7b0
[ 1002.983691] [<ffffffff81183e3b>] __lock_acquire+0x9db/0xa90
[ 1002.983691] [<ffffffff8118635a>] lock_acquire+0x1ca/0x270
[ 1002.983691] [<ffffffff812f8eaa>] sysfs_deactivate+0x11a/0x190
[ 1002.983691] [<ffffffff812f9941>] sysfs_addrm_finish+0x31/0x60
[ 1002.983691] [<ffffffff812f9a0a>] __sysfs_remove_dir+0x9a/0xd0
[ 1002.983691] [<ffffffff812f9f2f>] sysfs_remove_dir+0x3f/0x50
[ 1002.983691] [<ffffffff819d3806>] kobject_del+0x16/0x40
[ 1002.983691] [<ffffffff819d3930>] kobject_cleanup+0x100/0x190
[ 1002.983691] [<ffffffff819d39cd>] kobject_release+0xd/0x10
[ 1002.983691] [<ffffffff819d33cc>] kobject_put+0x4c/0x60
[ 1002.983691] [<ffffffff839d8134>] batadv_sysfs_del_hardif+0x14/0x30
[ 1002.983691] [<ffffffff839cdacd>] batadv_hardif_remove_interface+0x5d/0x90
[ 1002.983691] [<ffffffff839cdba1>] batadv_hard_if_event+0xa1/0x2f0
[ 1002.983691] [<ffffffff8114271e>] notifier_call_chain+0xee/0x130
[ 1002.983691] [<ffffffff81142d31>] raw_notifier_call_chain+0x11/0x20
[ 1002.983691] [<ffffffff834e8dc2>] call_netdevice_notifiers+0x52/0x60
[ 1002.983691] [<ffffffff834ef4cd>] rollback_registered_many+0x14d/0x210
[ 1002.983691] [<ffffffff834ef5bc>] rollback_registered+0x2c/0x40
[ 1002.983691] [<ffffffff834ef650>] unregister_netdevice_queue+0x70/0xa0
[ 1002.983691] [<ffffffff834ef7bb>] unregister_netdev+0x1b/0x30
[ 1002.983691] [<ffffffff82a51fcc>] usbnet_disconnect+0x8c/0xf0
[ 1002.983691] [<ffffffff82b0aca7>] usb_unbind_interface+0x67/0x160
[ 1002.983691] [<ffffffff81e580a1>] __device_release_driver+0x81/0xe0
[ 1002.983691] [<ffffffff81e581f9>] device_release_driver+0x29/0x40
[ 1002.983691] [<ffffffff81e56c98>] bus_remove_device+0x138/0x150
[ 1002.983691] [<ffffffff81e54e8d>] device_del+0x13d/0x1a0
[ 1002.983691] [<ffffffff82b08459>] usb_disable_device+0xd9/0x270
[ 1002.983691] [<ffffffff82b08f38>] usb_set_configuration+0x268/0x7b0
[ 1002.983691] [<ffffffff82b0e5e3>] usb_remove_store+0x43/0x80
[ 1002.983691] [<ffffffff81e536c3>] dev_attr_store+0x13/0x30
[ 1002.983691] [<ffffffff812f7d41>] sysfs_write_file+0x101/0x170
[ 1002.983691] [<ffffffff8127ac58>] vfs_write+0xb8/0x180
[ 1002.983691] [<ffffffff8127ae10>] sys_write+0x50/0xa0
[ 1002.983691] [<ffffffff83bfafd8>] tracesys+0xe1/0xe6
[ 1002.983691]
[ 1002.983691] other info that might help us debug this:
[ 1002.983691]
[ 1002.983691] Possible unsafe locking scenario:
[ 1002.983691]
[ 1002.983691] CPU0 CPU1
[ 1002.983691] ---- ----
[ 1002.983691] lock(rtnl_mutex);
[ 1002.983691] lock(s_active#313);
[ 1002.983691] lock(rtnl_mutex);
[ 1002.983691] lock(s_active#313);
[ 1002.983691]
[ 1002.983691] *** DEADLOCK ***
[ 1002.983691]
[ 1002.983691] 4 locks held by trinity-child18/8149:
[ 1002.983691] #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff812f7c7f>] sysfs_write_file+0x3f/0x170
[ 1002.983691] #1: (&__lockdep_no_validate__){......}, at: [<ffffffff82b0e5c8>] usb_remove_store+0x28/0x80
[ 1002.983691] #2: (&__lockdep_no_validate__){......}, at: [<ffffffff81e581f1>] device_release_driver+0x21/0x40
[ 1002.983691] #3: (rtnl_mutex){+.+.+.}, at: [<ffffffff834fcc62>] rtnl_lock+0x12/0x20
[ 1002.983691]
[ 1002.983691] stack backtrace:
[ 1002.983691] Pid: 8149, comm: trinity-child18 Tainted: G W 3.7.0-rc5-next-20121112-sasha-00018-g2f4ce0e #127
[ 1002.983691] Call Trace:
[ 1002.983691] [<ffffffff83b322b9>] print_circular_bug+0xd3/0xe4
[ 1002.983691] [<ffffffff81180725>] check_prev_add+0x115/0x640
[ 1002.983691] [<ffffffff81180d0a>] check_prevs_add+0xba/0x1a0
[ 1002.983691] [<ffffffff8117e134>] ? graph_unlock+0xa4/0xb0
[ 1002.983691] [<ffffffff81181490>] validate_chain.isra.23+0x6a0/0x7b0
[ 1002.983691] [<ffffffff81183e3b>] __lock_acquire+0x9db/0xa90
[ 1002.983691] [<ffffffff81181ce9>] ? mark_held_locks+0xf9/0x130
[ 1002.983691] [<ffffffff811844af>] ? lockdep_init_map+0xcf/0x5e0
[ 1002.983691] [<ffffffff8118635a>] lock_acquire+0x1ca/0x270
[ 1002.983691] [<ffffffff812f9941>] ? sysfs_addrm_finish+0x31/0x60
[ 1002.983691] [<ffffffff812f8eaa>] sysfs_deactivate+0x11a/0x190
[ 1002.983691] [<ffffffff812f9941>] ? sysfs_addrm_finish+0x31/0x60
[ 1002.983691] [<ffffffff812f9941>] sysfs_addrm_finish+0x31/0x60
[ 1002.983691] [<ffffffff812f9a0a>] __sysfs_remove_dir+0x9a/0xd0
[ 1002.983691] [<ffffffff812f9f2f>] sysfs_remove_dir+0x3f/0x50
[ 1002.983691] [<ffffffff819d3806>] kobject_del+0x16/0x40
[ 1002.983691] [<ffffffff819d3930>] kobject_cleanup+0x100/0x190
[ 1002.983691] [<ffffffff819d39cd>] kobject_release+0xd/0x10
[ 1002.983691] [<ffffffff819d33cc>] kobject_put+0x4c/0x60
[ 1002.983691] [<ffffffff839d8134>] batadv_sysfs_del_hardif+0x14/0x30
[ 1002.983691] [<ffffffff839cd040>] ? batadv_primary_if_update_addr+0x280/0x280
[ 1002.983691] [<ffffffff839cdacd>] batadv_hardif_remove_interface+0x5d/0x90
[ 1002.983691] [<ffffffff839cdba1>] batadv_hard_if_event+0xa1/0x2f0
[ 1002.983691] [<ffffffff8114271e>] notifier_call_chain+0xee/0x130
[ 1002.983691] [<ffffffff81142d31>] raw_notifier_call_chain+0x11/0x20
[ 1002.983691] [<ffffffff834e8dc2>] call_netdevice_notifiers+0x52/0x60
[ 1002.983691] [<ffffffff834ef4cd>] rollback_registered_many+0x14d/0x210
[ 1002.983691] [<ffffffff834ef5bc>] rollback_registered+0x2c/0x40
[ 1002.983691] [<ffffffff834ef650>] unregister_netdevice_queue+0x70/0xa0
[ 1002.983691] [<ffffffff834ef7bb>] unregister_netdev+0x1b/0x30
[ 1002.983691] [<ffffffff82a51fcc>] usbnet_disconnect+0x8c/0xf0
[ 1002.983691] [<ffffffff82b0aca7>] usb_unbind_interface+0x67/0x160
[ 1002.983691] [<ffffffff81e580a1>] __device_release_driver+0x81/0xe0
[ 1002.983691] [<ffffffff81e581f9>] device_release_driver+0x29/0x40
[ 1002.983691] [<ffffffff81e56c98>] bus_remove_device+0x138/0x150
[ 1002.983691] [<ffffffff81e54e8d>] device_del+0x13d/0x1a0
[ 1002.983691] [<ffffffff82b08459>] usb_disable_device+0xd9/0x270
[ 1002.983691] [<ffffffff82b08f38>] usb_set_configuration+0x268/0x7b0
[ 1002.983691] [<ffffffff82b0e5c8>] ? usb_remove_store+0x28/0x80
[ 1002.983691] [<ffffffff82b0e5e3>] usb_remove_store+0x43/0x80
[ 1002.983691] [<ffffffff81e536c3>] dev_attr_store+0x13/0x30
[ 1002.983691] [<ffffffff812f7d41>] sysfs_write_file+0x101/0x170
[ 1002.983691] [<ffffffff8127ac58>] vfs_write+0xb8/0x180
[ 1002.983691] [<ffffffff8127ae10>] sys_write+0x50/0xa0
[ 1002.983691] [<ffffffff83bfafd8>] tracesys+0xe1/0xe6
Thanks,
Sasha
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: net, batman: lockdep circular dependency warning
2012-11-12 20:37 net, batman: lockdep circular dependency warning Sasha Levin
@ 2012-12-04 14:51 ` Sven Eckelmann
2012-12-05 15:33 ` [B.A.T.M.A.N.] " Simon Wunderlich
0 siblings, 1 reply; 4+ messages in thread
From: Sven Eckelmann @ 2012-12-04 14:51 UTC (permalink / raw)
To: b.a.t.m.a.n, netdev; +Cc: Sasha Levin
[-- Attachment #1: Type: text/plain, Size: 4997 bytes --]
Hi,
thanks for your report. It seems nobody else wanted to give an answer... so I
try to give a small overview.
On Monday 12 November 2012 15:37:47 Sasha Levin wrote:
> Hi all,
>
> While fuzzing with trinity inside a KVM tools (lkvm) guest running latest
> -next kernel, I've stumbled on the following:
>
> [ 1002.969392] ======================================================
> [ 1002.971639] [ INFO: possible circular locking dependency detected ]
> [ 1002.975805] 3.7.0-rc5-next-20121112-sasha-00018-g2f4ce0e #127 Tainted: G
> W [ 1002.983691]
> ------------------------------------------------------- [ 1002.983691]
> trinity-child18/8149 is trying to acquire lock:
> [ 1002.983691] (s_active#313){++++.+}, at: [<ffffffff812f9941>]
> sysfs_addrm_finish+0x31/0x60 [ 1002.983691]
> [ 1002.983691] but task is already holding lock:
> [ 1002.983691] (rtnl_mutex){+.+.+.}, at: [<ffffffff834fcc62>]
> rtnl_lock+0x12/0x20 [ 1002.983691]
> [ 1002.983691] which lock already depends on the new lock.
It is known that batman-adv has a problem with the attaching/detaching of
interfaces over this sysfs. The cause of this problem is related to the fact
that batman-adv not only creates its own net_devices, but also unregisters
net_devices. This unregister will add a new element in the net_todo_list. This
will cause a rtnl_lock when it calls netdev_wait_allrefs (there are some
condition, but we just ignore them for now). So the whole exercise of using
rtnl_trylock was useless.
This extra rtnl_lock can cause a deadlock as you found out because it is
activated through a sysfs file and therefore the s_active mutex is locked (we
have the dependency s_active -> rtnl_mutex, but other users have rtnl_mutex ->
s_active).
So, what to do? There are different possibilities. We have to keep in mind
that there is a patchset (not yet accepted by the batman-adv maintainers)
which allows to use `ip link` or compatible tools to create/destroy batman-adv
devices and attach/detach other devices.
1. Remove the sysfs interface to attach/detach net_devices (which
destroys/creates batman-adv devices)
This is not really backward compatible and therefore not really acceptable.
Marek Lindner and Simon Wunderlich are also against forcing users to
require special tools to add/configure batman-adv devices (even batctl, ip
and so on).
2. Ignore the possible deadlock
(sry, fill in your own comment...)
3. Add workarounds in the core net code
Simon Wunderlich already tried it... I personally think it is not the right
way because it more likely to introduce more bugs by hiding a batman-adv
bug. And these bugs are a lot harder to find... trust me
For example the usage of __rtnl_unlock will let this bug to appear in
other places which use rtnl_trylock. This is caused by the fact that the
todo item isn't processed by __rtnl_unlock (this is the whole idea by
calling it) and therefore the todo work stays in net_todo_list. Another
user of rtnl_trylock will now call rtnl_unlock and don't expect an entry in
net_todo_list because he never unregistered a device. So he now has the
problem of batman-adv (what an unsocial läderlappen).
And moving everybody using rtnl_trylock to __rtnl_unlock has still the
problem that batman-adv don't immediatelly work on its todo and so
maybe causes other side effects because... the notifications weren't
sent and therefore the refcount of the unregistered device didn't went
to zero.
(I'll leave other side effects as homework for the reader)
4. Don't automatically remove batman-adv devices
The current approach is to automatically unregister batman-adv devices
when they don't have attached slave-devices (hardif called by batman-adv).
Removing this will slightly change the behaviour, but the interface can
still be removed using `ip link del dev bat0` or a similar tool.
5. Add a workaround solution and promote the use of the standard interface
So, the basic problem is the s_active mutex locked by the sysfs interface.
An idea is to postpone the part which needs the rtnl_mutex to a later time.
This has obviously the problem that we cannot return an error code to the
caller when the device creation failed in the postponed part. This problem
can reduced slightly be moving only the unregister part, but now I'll leave
this out for simplicity of the description.
A possible implementation would create a work_struct and add it to
batadv_event_workqueue. This work item has to contain all information given
by the user (which hardif, name of the batman-adv device).
Simon Wunderlich already disliked this workaround, but Antonio Quartulli
tried to encourage an RFC implementation. I've prefered a textual
description rather than a patch missing explanations of the other
alternatives.
Kind regards,
Sven
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [B.A.T.M.A.N.] net, batman: lockdep circular dependency warning
2012-12-04 14:51 ` Sven Eckelmann
@ 2012-12-05 15:33 ` Simon Wunderlich
2012-12-05 21:03 ` Antonio Quartulli
0 siblings, 1 reply; 4+ messages in thread
From: Simon Wunderlich @ 2012-12-05 15:33 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking
Cc: netdev, Sasha Levin
[-- Attachment #1: Type: text/plain, Size: 6742 bytes --]
Hey Sven,
thanks for showing these approaches! Comments inline ...
On Tue, Dec 04, 2012 at 03:51:55PM +0100, Sven Eckelmann wrote:
> Hi,
>
> thanks for your report. It seems nobody else wanted to give an answer... so I
> try to give a small overview.
>
> On Monday 12 November 2012 15:37:47 Sasha Levin wrote:
> > Hi all,
> >
> > While fuzzing with trinity inside a KVM tools (lkvm) guest running latest
> > -next kernel, I've stumbled on the following:
> >
> > [ 1002.969392] ======================================================
> > [ 1002.971639] [ INFO: possible circular locking dependency detected ]
> > [ 1002.975805] 3.7.0-rc5-next-20121112-sasha-00018-g2f4ce0e #127 Tainted: G
> > W [ 1002.983691]
> > ------------------------------------------------------- [ 1002.983691]
> > trinity-child18/8149 is trying to acquire lock:
> > [ 1002.983691] (s_active#313){++++.+}, at: [<ffffffff812f9941>]
> > sysfs_addrm_finish+0x31/0x60 [ 1002.983691]
> > [ 1002.983691] but task is already holding lock:
> > [ 1002.983691] (rtnl_mutex){+.+.+.}, at: [<ffffffff834fcc62>]
> > rtnl_lock+0x12/0x20 [ 1002.983691]
> > [ 1002.983691] which lock already depends on the new lock.
>
> It is known that batman-adv has a problem with the attaching/detaching of
> interfaces over this sysfs. The cause of this problem is related to the fact
> that batman-adv not only creates its own net_devices, but also unregisters
> net_devices. This unregister will add a new element in the net_todo_list. This
> will cause a rtnl_lock when it calls netdev_wait_allrefs (there are some
> condition, but we just ignore them for now). So the whole exercise of using
> rtnl_trylock was useless.
>
> This extra rtnl_lock can cause a deadlock as you found out because it is
> activated through a sysfs file and therefore the s_active mutex is locked (we
> have the dependency s_active -> rtnl_mutex, but other users have rtnl_mutex ->
> s_active).
>
> So, what to do? There are different possibilities. We have to keep in mind
> that there is a patchset (not yet accepted by the batman-adv maintainers)
> which allows to use `ip link` or compatible tools to create/destroy batman-adv
> devices and attach/detach other devices.
>
> 1. Remove the sysfs interface to attach/detach net_devices (which
> destroys/creates batman-adv devices)
>
> This is not really backward compatible and therefore not really acceptable.
> Marek Lindner and Simon Wunderlich are also against forcing users to
> require special tools to add/configure batman-adv devices (even batctl, ip
> and so on).
>
Yeah, at least I think we should keep what we have for now and fix it before
moving to the next interface. It has its merits I would like to keep, having
text output is one of them. :)
> 2. Ignore the possible deadlock
>
> (sry, fill in your own comment...)
>
That probably won't help. :)
> 3. Add workarounds in the core net code
>
> Simon Wunderlich already tried it... I personally think it is not the right
> way because it more likely to introduce more bugs by hiding a batman-adv
> bug. And these bugs are a lot harder to find... trust me
>
> For example the usage of __rtnl_unlock will let this bug to appear in
> other places which use rtnl_trylock. This is caused by the fact that the
> todo item isn't processed by __rtnl_unlock (this is the whole idea by
> calling it) and therefore the todo work stays in net_todo_list. Another
> user of rtnl_trylock will now call rtnl_unlock and don't expect an entry in
> net_todo_list because he never unregistered a device. So he now has the
> problem of batman-adv (what an unsocial läderlappen).
>
> And moving everybody using rtnl_trylock to __rtnl_unlock has still the
> problem that batman-adv don't immediatelly work on its todo and so
> maybe causes other side effects because... the notifications weren't
> sent and therefore the refcount of the unregistered device didn't went
> to zero.
>
> (I'll leave other side effects as homework for the reader)
>
You are right, it can probably not solved as easily as I thought before. Also,
it seems the bridge code is not concerned as I thought at first. Although
I still don't like the rtnl_unlock() concept in general, but I can't provide
an alternative here so I should't moan about that. :)
> 4. Don't automatically remove batman-adv devices
>
> The current approach is to automatically unregister batman-adv devices
> when they don't have attached slave-devices (hardif called by batman-adv).
> Removing this will slightly change the behaviour, but the interface can
> still be removed using `ip link del dev bat0` or a similar tool.
>
That would be possible, but we must at least make sure that the initialization
is done for all internal tables (tt, bla, ...), counters, seqnos, etc when the
first device is added. Otherwise old users might assume that the device is
resetted correctly when removing all hard interfaces of one soft interface
and add it again under the same soft interface name.
> 5. Add a workaround solution and promote the use of the standard interface
>
> So, the basic problem is the s_active mutex locked by the sysfs interface.
> An idea is to postpone the part which needs the rtnl_mutex to a later time.
> This has obviously the problem that we cannot return an error code to the
> caller when the device creation failed in the postponed part. This problem
> can reduced slightly be moving only the unregister part, but now I'll leave
> this out for simplicity of the description.
We probably won't need the return code anyway - usually it should never fail,
and if it does we don't handle it now too.
>
> A possible implementation would create a work_struct and add it to
> batadv_event_workqueue. This work item has to contain all information given
> by the user (which hardif, name of the batman-adv device).
Sounds good.
>
> Simon Wunderlich already disliked this workaround, but Antonio Quartulli
> tried to encourage an RFC implementation. I've prefered a textual
> description rather than a patch missing explanations of the other
> alternatives.
Well, actually that doesn't sound so bad - I currently don't have an overview
of how "big" this change would be - this one was one concern, the return code was
another but it appears that this isn't a problem. If we don't add too much bloat
this one would probably the best alternative. At least as long as rtnl_unlock()
behaves like this. :)
What do others think?
Cheers,
Simon
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [B.A.T.M.A.N.] net, batman: lockdep circular dependency warning
2012-12-05 15:33 ` [B.A.T.M.A.N.] " Simon Wunderlich
@ 2012-12-05 21:03 ` Antonio Quartulli
0 siblings, 0 replies; 4+ messages in thread
From: Antonio Quartulli @ 2012-12-05 21:03 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking
Cc: netdev, Sasha Levin
[-- Attachment #1: Type: text/plain, Size: 2835 bytes --]
Hi all,
On Wed, Dec 05, 2012 at 04:33:08PM +0100, Simon Wunderlich wrote:
> Hey Sven,
> >
> > 1. Remove the sysfs interface to attach/detach net_devices (which
> > destroys/creates batman-adv devices)
> >
> > This is not really backward compatible and therefore not really acceptable.
> > Marek Lindner and Simon Wunderlich are also against forcing users to
> > require special tools to add/configure batman-adv devices (even batctl, ip
> > and so on).
> >
>
> Yeah, at least I think we should keep what we have for now and fix it before
> moving to the next interface. It has its merits I would like to keep, having
> text output is one of them. :)
I agree on this. Not because of the nice text output, but rather because it is
better to first fix this deadlock in the current interface (which might mean
fixing old stable releases) and when we include the new feature.
[...]
> > 5. Add a workaround solution and promote the use of the standard interface
> >
> > So, the basic problem is the s_active mutex locked by the sysfs interface.
> > An idea is to postpone the part which needs the rtnl_mutex to a later time.
> > This has obviously the problem that we cannot return an error code to the
> > caller when the device creation failed in the postponed part. This problem
> > can reduced slightly be moving only the unregister part, but now I'll leave
> > this out for simplicity of the description.
>
> We probably won't need the return code anyway - usually it should never fail,
> and if it does we don't handle it now too.
>
> >
> > A possible implementation would create a work_struct and add it to
> > batadv_event_workqueue. This work item has to contain all information given
> > by the user (which hardif, name of the batman-adv device).
>
> Sounds good.
>
> >
> > Simon Wunderlich already disliked this workaround, but Antonio Quartulli
> > tried to encourage an RFC implementation. I've prefered a textual
> > description rather than a patch missing explanations of the other
> > alternatives.
>
> Well, actually that doesn't sound so bad - I currently don't have an overview
> of how "big" this change would be - this one was one concern, the return code was
> another but it appears that this isn't a problem. If we don't add too much bloat
> this one would probably the best alternative. At least as long as rtnl_unlock()
> behaves like this. :)
>
> What do others think?
>
I like this approach too.
It looks clean and it doesn't affect the rest of the net code.
I think we should put some effort in this and try to come up with a patch soon.
Thank you for your comments.
Cheers,
--
Antonio Quartulli
..each of us alone is worth nothing..
Ernesto "Che" Guevara
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2012-12-05 21:04 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-12 20:37 net, batman: lockdep circular dependency warning Sasha Levin
2012-12-04 14:51 ` Sven Eckelmann
2012-12-05 15:33 ` [B.A.T.M.A.N.] " Simon Wunderlich
2012-12-05 21:03 ` Antonio Quartulli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).