* [PATCH] syscall: Implement a convinience function restart_syscall
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
@ 2009-05-14 2:55 ` Eric W. Biederman
2009-05-14 2:57 ` [PATCH 2/7] net-sysfs: Use rtnl_trylock in sysfs methods Eric W. Biederman
` (7 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 2:55 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
Currently when we have a signal pending we have the functionality
to restart that the current system call. There are other cases
such as nasty lock ordering issues where it makes sense to have
a simple fix that uses try lock and restarts the system call.
Buying time to figure out how to rework the locking strategy.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
include/linux/sched.h | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b4c38bc..d853f6b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2178,6 +2178,12 @@ static inline int test_tsk_need_resched(struct task_struct *tsk)
return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED));
}
+static inline int restart_syscall(void)
+{
+ set_tsk_thread_flag(current, TIF_SIGPENDING);
+ return -ERESTARTNOINTR;
+}
+
static inline int signal_pending(struct task_struct *p)
{
return unlikely(test_tsk_thread_flag(p,TIF_SIGPENDING));
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH 2/7] net-sysfs: Use rtnl_trylock in sysfs methods.
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
2009-05-14 2:55 ` [PATCH] syscall: Implement a convinience function restart_syscall Eric W. Biederman
@ 2009-05-14 2:57 ` Eric W. Biederman
2009-05-14 2:58 ` [PATCH 3/7] net: FIX ipv6_forward sysctl restart Eric W. Biederman
2009-05-14 2:59 ` [PATCH 4/7] net: Fix devinet_sysctl_forward Eric W. Biederman
` (6 subsequent siblings)
8 siblings, 1 reply; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 2:57 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
The earlier patch to fix the deadlock between a network device going
away and writing to sysfs attributes was incomplete.
- It did not set signal_pending so we would leak ERSTARTSYS to user space.
- It used ERESTARTSYS which only restarts if sigaction configures it to.
- It did not cover store and show for ifalias.
So fix all of these up and use the new helper restart_syscall so we get
the details correct on what it takes.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
net/core/net-sysfs.c | 8 +++++---
1 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 2da59a0..b9641e8 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -78,7 +78,7 @@ static ssize_t netdev_store(struct device *dev, struct device_attribute *attr,
goto err;
if (!rtnl_trylock())
- return -ERESTARTSYS;
+ return restart_syscall();
if (dev_isalive(net)) {
if ((ret = (*set)(net, new)) == 0)
@@ -225,7 +225,8 @@ static ssize_t store_ifalias(struct device *dev, struct device_attribute *attr,
if (len > 0 && buf[len - 1] == '\n')
--count;
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
ret = dev_set_alias(netdev, buf, count);
rtnl_unlock();
@@ -238,7 +239,8 @@ static ssize_t show_ifalias(struct device *dev,
const struct net_device *netdev = to_net_dev(dev);
ssize_t ret = 0;
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
if (netdev->ifalias)
ret = sprintf(buf, "%s\n", netdev->ifalias);
rtnl_unlock();
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH 3/7] net: FIX ipv6_forward sysctl restart
2009-05-14 2:57 ` [PATCH 2/7] net-sysfs: Use rtnl_trylock in sysfs methods Eric W. Biederman
@ 2009-05-14 2:58 ` Eric W. Biederman
0 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 2:58 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
Just returning -ERESTARTSYS without a signal pending is not
good that will just leak it to userspace. We need return
-ERESTARTNOINTR so we always restart and set signal pending
so that we fall of the fast path of syscall return and setup
the system call restart.
So use restart_syscall() which does all of this for us.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
net/ipv6/addrconf.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index a8218bc..9eb61c7 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -503,7 +503,7 @@ static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int old)
return 0;
if (!rtnl_trylock())
- return -ERESTARTSYS;
+ return restart_syscall();
if (p == &net->ipv6.devconf_all->forwarding) {
__s32 newf = net->ipv6.devconf_all->forwarding;
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/7] net: Fix devinet_sysctl_forward
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
2009-05-14 2:55 ` [PATCH] syscall: Implement a convinience function restart_syscall Eric W. Biederman
2009-05-14 2:57 ` [PATCH 2/7] net-sysfs: Use rtnl_trylock in sysfs methods Eric W. Biederman
@ 2009-05-14 2:59 ` Eric W. Biederman
2009-05-14 3:00 ` [PATCH 5/7] net: Fix bridgeing sysfs handling of rtnl_lock Eric W. Biederman
` (5 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 2:59 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
sysctls are unregistered with the rntl_lock held making
it unsafe to unconditionally grab the the rtnl_lock. Instead
we need to call rtnl_trylock and restart the system call
if we can not grab it. Otherwise we could deadlock at unregistration
time.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
net/ipv4/devinet.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 126bb91..3863c3a 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1347,7 +1347,8 @@ static int devinet_sysctl_forward(ctl_table *ctl, int write,
struct net *net = ctl->extra2;
if (valp != &IPV4_DEVCONF_DFLT(net, FORWARDING)) {
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
if (valp == &IPV4_DEVCONF_ALL(net, FORWARDING)) {
inet_forward_change(net);
} else if (*valp) {
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH 5/7] net: Fix bridgeing sysfs handling of rtnl_lock
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
` (2 preceding siblings ...)
2009-05-14 2:59 ` [PATCH 4/7] net: Fix devinet_sysctl_forward Eric W. Biederman
@ 2009-05-14 3:00 ` Eric W. Biederman
2009-05-14 3:01 ` [PATCH 6/7] net: Fix ipoib rtnl_lock sysfs deadlock Eric W. Biederman
` (4 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 3:00 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
Holding rtnl_lock when we are unregistering the sysfs files can
deadlock if we unconditionally take rtnl_lock in a sysfs file. So fix
it with the now familiar patter of: rtnl_trylock and syscall_restart()
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
net/bridge/br_sysfs_br.c | 3 ++-
net/bridge/br_sysfs_if.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 603d892..ee4820a 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -172,7 +172,8 @@ static ssize_t store_stp_state(struct device *d,
if (endp == buf)
return -EINVAL;
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
br_stp_set_enabled(br, val);
rtnl_unlock();
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 02b2d50..4a3cdf8 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -189,7 +189,8 @@ static ssize_t brport_store(struct kobject * kobj,
val = simple_strtoul(buf, &endp, 0);
if (endp != buf) {
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
if (p->dev && p->br && brport_attr->store) {
spin_lock_bh(&p->br->lock);
ret = brport_attr->store(p, val);
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH 6/7] net: Fix ipoib rtnl_lock sysfs deadlock.
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
` (3 preceding siblings ...)
2009-05-14 3:00 ` [PATCH 5/7] net: Fix bridgeing sysfs handling of rtnl_lock Eric W. Biederman
@ 2009-05-14 3:01 ` Eric W. Biederman
2009-05-14 3:02 ` [PATCH 7/7] net: FIX bonding sysfs rtnl_lock deadlock Eric W. Biederman
` (3 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 3:01 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
Network device sysfs files that grab the rtnl_lock unconditionally
will deadlock if accessed when the network device is being
unregistered. So use trylock and syscall_restart to avoid this
deadlock.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 6 ++++--
drivers/infiniband/ulp/ipoib/ipoib_vlan.c | 6 ++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 47d588b..4248c31 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1455,13 +1455,15 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr,
struct net_device *dev = to_net_dev(d);
struct ipoib_dev_priv *priv = netdev_priv(dev);
+ if (!rtnl_trylock())
+ return restart_syscall();
+
/* flush paths if we switch modes so that connections are restarted */
if (IPOIB_CM_SUPPORTED(dev->dev_addr) && !strcmp(buf, "connected\n")) {
set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags);
ipoib_warn(priv, "enabling connected mode "
"will cause multicast packet drops\n");
- rtnl_lock();
dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO);
rtnl_unlock();
priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM;
@@ -1473,7 +1475,6 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr,
if (!strcmp(buf, "datagram\n")) {
clear_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags);
- rtnl_lock();
if (test_bit(IPOIB_FLAG_CSUM, &priv->flags)) {
dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
if (priv->hca_caps & IB_DEVICE_UD_TSO)
@@ -1485,6 +1486,7 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr,
return count;
}
+ rtnl_unlock();
return -EINVAL;
}
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
index 4c57f32..e3bf00d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -61,7 +61,8 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
ppriv = netdev_priv(pdev);
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
mutex_lock(&ppriv->vlan_mutex);
/*
@@ -167,7 +168,8 @@ int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey)
ppriv = netdev_priv(pdev);
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
mutex_lock(&ppriv->vlan_mutex);
list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) {
if (priv->pkey == pkey) {
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH 7/7] net: FIX bonding sysfs rtnl_lock deadlock
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
` (4 preceding siblings ...)
2009-05-14 3:01 ` [PATCH 6/7] net: Fix ipoib rtnl_lock sysfs deadlock Eric W. Biederman
@ 2009-05-14 3:02 ` Eric W. Biederman
2009-05-14 3:40 ` [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes David Miller
` (2 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 3:02 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
Sysfs files for a network device can not unconditionally take the
rtnl_lock as the bonding sysfs files do. If someone accesses those
sysfs files while the network device is being unregistered with the
rtnl_lock held we will deadlock.
So use trylock and restart_syscall to avoid this problem.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
drivers/net/bonding/bond_sysfs.c | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index d287315..3a1b7b0 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -251,7 +251,8 @@ static ssize_t bonding_store_slaves(struct device *d,
/* Note: We can't hold bond->lock here, as bond_create grabs it. */
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
down_write(&(bonding_rwsem));
sscanf(buffer, "%16s", command); /* IFNAMSIZ*/
@@ -1171,7 +1172,8 @@ static ssize_t bonding_store_primary(struct device *d,
struct slave *slave;
struct bonding *bond = to_bond(d);
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
read_lock(&bond->lock);
write_lock_bh(&bond->curr_slave_lock);
@@ -1288,7 +1290,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
struct slave *new_active = NULL;
struct bonding *bond = to_bond(d);
- rtnl_lock();
+ if (!rtnl_trylock())
+ return restart_syscall();
read_lock(&bond->lock);
write_lock_bh(&bond->curr_slave_lock);
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
` (5 preceding siblings ...)
2009-05-14 3:02 ` [PATCH 7/7] net: FIX bonding sysfs rtnl_lock deadlock Eric W. Biederman
@ 2009-05-14 3:40 ` David Miller
2009-05-14 6:44 ` [PATCH 1/7] syscall: Implement a convinience function restart_syscall Eric W. Biederman
2009-05-19 5:16 ` [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes David Miller
8 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2009-05-14 3:40 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, herbert, shemminger, greearb, kaber
I did not receive a copy of patch 1/7 of this series, where is it?
It didn't show up in patchwork either, which means it didn't make
it to the mailing lists either.
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH 1/7] syscall: Implement a convinience function restart_syscall
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
` (6 preceding siblings ...)
2009-05-14 3:40 ` [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes David Miller
@ 2009-05-14 6:44 ` Eric W. Biederman
2009-05-19 5:16 ` [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes David Miller
8 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-14 6:44 UTC (permalink / raw)
To: David Miller
Cc: netdev, Herbert Xu, Stephen Hemminger, Ben Greear,
Patrick McHardy
Currently when we have a signal pending we have the functionality
to restart that the current system call. There are other cases
such as nasty lock ordering issues where it makes sense to have
a simple fix that uses try lock and restarts the system call.
Buying time to figure out how to rework the locking strategy.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
include/linux/sched.h | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b4c38bc..d853f6b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2178,6 +2178,12 @@ static inline int test_tsk_need_resched(struct task_struct *tsk)
return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED));
}
+static inline int restart_syscall(void)
+{
+ set_tsk_thread_flag(current, TIF_SIGPENDING);
+ return -ERESTARTNOINTR;
+}
+
static inline int signal_pending(struct task_struct *p)
{
return unlikely(test_tsk_thread_flag(p,TIF_SIGPENDING));
--
1.6.0.6
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes
2009-05-14 2:53 [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes Eric W. Biederman
` (7 preceding siblings ...)
2009-05-14 6:44 ` [PATCH 1/7] syscall: Implement a convinience function restart_syscall Eric W. Biederman
@ 2009-05-19 5:16 ` David Miller
2009-05-19 8:07 ` Eric W. Biederman
8 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2009-05-19 5:16 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, herbert, shemminger, greearb, kaber
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Wed, 13 May 2009 19:53:40 -0700
> A while ago it was noticed that unregistering network devices could cause
> deadlocks if virtual files that take the rtnl_lock were accessed at the
> wrong time.
>
> After looking at the different possibilities the only way to solve it
> cleanly without some kind of busy loop appears to be reducing the scope
> of the rtnl lock.
>
> I have not tackled the hard fix yet but I have tested our current work
> around and it does not succeed in restarting the system call and
> instead leaks -ERESTARNOSYS to userspace, because we do not have a
> signal pending.
>
> Further the current work around misses several interesting places
> in the network stack where the deadlock can occur.
>
> I have addressed the problems by making a common helper function
> and patching all of the places I could find that had this problem.
I've added these changes to net-next-2.6, thanks.
I really can't justify sticking this into net-2.6, especially
this late in the series, sorry.
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes
2009-05-19 5:16 ` [PATCH 0/7] IPv4/IPv6 unregistration deadlock fixes David Miller
@ 2009-05-19 8:07 ` Eric W. Biederman
0 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2009-05-19 8:07 UTC (permalink / raw)
To: David Miller; +Cc: netdev, herbert, shemminger, greearb, kaber
David Miller <davem@davemloft.net> writes:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Wed, 13 May 2009 19:53:40 -0700
>
>> A while ago it was noticed that unregistering network devices could cause
>> deadlocks if virtual files that take the rtnl_lock were accessed at the
>> wrong time.
>>
>> After looking at the different possibilities the only way to solve it
>> cleanly without some kind of busy loop appears to be reducing the scope
>> of the rtnl lock.
>>
>> I have not tackled the hard fix yet but I have tested our current work
>> around and it does not succeed in restarting the system call and
>> instead leaks -ERESTARNOSYS to userspace, because we do not have a
>> signal pending.
>>
>> Further the current work around misses several interesting places
>> in the network stack where the deadlock can occur.
>>
>> I have addressed the problems by making a common helper function
>> and patching all of the places I could find that had this problem.
>
> I've added these changes to net-next-2.6, thanks.
>
> I really can't justify sticking this into net-2.6, especially
> this late in the series, sorry.
No problem. The problem is ancient. I just figured it was time to
start solving it.
Eric
^ permalink raw reply [flat|nested] 12+ messages in thread