netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: make dev_set_mtu() honor notification return code
@ 2014-01-10 12:48 Veaceslav Falico
  2014-01-10 15:36 ` Alexander Duyck
  2014-01-13 23:18 ` David Miller
  0 siblings, 2 replies; 6+ messages in thread
From: Veaceslav Falico @ 2014-01-10 12:48 UTC (permalink / raw)
  To: netdev
  Cc: Veaceslav Falico, Jiri Pirko, David S. Miller, Eric Dumazet,
	Alexander Duyck, Nicolas Dichtel

Currently, after changing the MTU for a device, dev_set_mtu() calls
NETDEV_CHANGEMTU notification, however doesn't verify it's return code -
which can be NOTIFY_BAD - i.e. some of the net notifier blocks refused this
change, and continues nevertheless.

To fix this, verify the return code, and if it's an error - then revert the
MTU to the original one, and pass the error code.

CC: Jiri Pirko <jiri@resnulli.us>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 net/core/dev.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index ce01847..1c570ff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5287,6 +5287,17 @@ int dev_change_flags(struct net_device *dev, unsigned int flags)
 }
 EXPORT_SYMBOL(dev_change_flags);
 
+static int __dev_set_mtu(struct net_device *dev, int new_mtu)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	if (ops->ndo_change_mtu)
+		return ops->ndo_change_mtu(dev, new_mtu);
+
+	dev->mtu = new_mtu;
+	return 0;
+}
+
 /**
  *	dev_set_mtu - Change maximum transfer unit
  *	@dev: device
@@ -5296,8 +5307,7 @@ EXPORT_SYMBOL(dev_change_flags);
  */
 int dev_set_mtu(struct net_device *dev, int new_mtu)
 {
-	const struct net_device_ops *ops = dev->netdev_ops;
-	int err;
+	int err, orig_mtu;
 
 	if (new_mtu == dev->mtu)
 		return 0;
@@ -5309,14 +5319,15 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
 	if (!netif_device_present(dev))
 		return -ENODEV;
 
-	err = 0;
-	if (ops->ndo_change_mtu)
-		err = ops->ndo_change_mtu(dev, new_mtu);
-	else
-		dev->mtu = new_mtu;
+	orig_mtu = dev->mtu;
+	err = __dev_set_mtu(dev, new_mtu);
 
-	if (!err)
-		call_netdevice_notifiers(NETDEV_CHANGEMTU, dev);
+	if (!err) {
+		err = call_netdevice_notifiers(NETDEV_CHANGEMTU, dev);
+		err = notifier_to_errno(err);
+		if (err)
+			__dev_set_mtu(dev, orig_mtu);
+	}
 	return err;
 }
 EXPORT_SYMBOL(dev_set_mtu);
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] net: make dev_set_mtu() honor notification return code
  2014-01-10 12:48 [PATCH net-next] net: make dev_set_mtu() honor notification return code Veaceslav Falico
@ 2014-01-10 15:36 ` Alexander Duyck
  2014-01-10 15:47   ` Veaceslav Falico
  2014-01-13 23:18 ` David Miller
  1 sibling, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2014-01-10 15:36 UTC (permalink / raw)
  To: Veaceslav Falico, netdev
  Cc: Jiri Pirko, David S. Miller, Eric Dumazet, Nicolas Dichtel

On 01/10/2014 04:48 AM, Veaceslav Falico wrote:
> Currently, after changing the MTU for a device, dev_set_mtu() calls
> NETDEV_CHANGEMTU notification, however doesn't verify it's return code -
> which can be NOTIFY_BAD - i.e. some of the net notifier blocks refused this
> change, and continues nevertheless.
>
> To fix this, verify the return code, and if it's an error - then revert the
> MTU to the original one, and pass the error code.
>
> CC: Jiri Pirko <jiri@resnulli.us>
> CC: "David S. Miller" <davem@davemloft.net>
> CC: Eric Dumazet <edumazet@google.com>
> CC: Alexander Duyck <alexander.h.duyck@intel.com>
> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
> ---
>  net/core/dev.c | 29 ++++++++++++++++++++---------
>  1 file changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ce01847..1c570ff 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5287,6 +5287,17 @@ int dev_change_flags(struct net_device *dev, unsigned int flags)
>  }
>  EXPORT_SYMBOL(dev_change_flags);
>  
> +static int __dev_set_mtu(struct net_device *dev, int new_mtu)
> +{
> +	const struct net_device_ops *ops = dev->netdev_ops;
> +
> +	if (ops->ndo_change_mtu)
> +		return ops->ndo_change_mtu(dev, new_mtu);
> +
> +	dev->mtu = new_mtu;
> +	return 0;
> +}
> +
>  /**
>   *	dev_set_mtu - Change maximum transfer unit
>   *	@dev: device
> @@ -5296,8 +5307,7 @@ EXPORT_SYMBOL(dev_change_flags);
>   */
>  int dev_set_mtu(struct net_device *dev, int new_mtu)
>  {
> -	const struct net_device_ops *ops = dev->netdev_ops;
> -	int err;
> +	int err, orig_mtu;
>  
>  	if (new_mtu == dev->mtu)
>  		return 0;
> @@ -5309,14 +5319,15 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
>  	if (!netif_device_present(dev))
>  		return -ENODEV;
>  
> -	err = 0;
> -	if (ops->ndo_change_mtu)
> -		err = ops->ndo_change_mtu(dev, new_mtu);
> -	else
> -		dev->mtu = new_mtu;
> +	orig_mtu = dev->mtu;
> +	err = __dev_set_mtu(dev, new_mtu);
>  
> -	if (!err)
> -		call_netdevice_notifiers(NETDEV_CHANGEMTU, dev);
> +	if (!err) {
> +		err = call_netdevice_notifiers(NETDEV_CHANGEMTU, dev);
> +		err = notifier_to_errno(err);
> +		if (err)
> +			__dev_set_mtu(dev, orig_mtu);
> +	}
>  	return err;
>  }
>  EXPORT_SYMBOL(dev_set_mtu);

So what about the netdevices that succeeded in changing the MTU based on
the notifiers?  It seems like you still  have an inconsistent state
after the failure unless you issue a second call with a notification
that you reverted to the old MTU.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] net: make dev_set_mtu() honor notification return code
  2014-01-10 15:36 ` Alexander Duyck
@ 2014-01-10 15:47   ` Veaceslav Falico
  0 siblings, 0 replies; 6+ messages in thread
From: Veaceslav Falico @ 2014-01-10 15:47 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, Jiri Pirko, David S. Miller, Eric Dumazet,
	Nicolas Dichtel

On Fri, Jan 10, 2014 at 07:36:45AM -0800, Alexander Duyck wrote:
>On 01/10/2014 04:48 AM, Veaceslav Falico wrote:
...snip...
>> -	err = 0;
>> -	if (ops->ndo_change_mtu)
>> -		err = ops->ndo_change_mtu(dev, new_mtu);
>> -	else
>> -		dev->mtu = new_mtu;
>> +	orig_mtu = dev->mtu;
>> +	err = __dev_set_mtu(dev, new_mtu);
>>
>> -	if (!err)
>> -		call_netdevice_notifiers(NETDEV_CHANGEMTU, dev);
>> +	if (!err) {
>> +		err = call_netdevice_notifiers(NETDEV_CHANGEMTU, dev);
>> +		err = notifier_to_errno(err);
>> +		if (err)
>> +			__dev_set_mtu(dev, orig_mtu);
>> +	}
>>  	return err;
>>  }
>>  EXPORT_SYMBOL(dev_set_mtu);
>
>So what about the netdevices that succeeded in changing the MTU based on
>the notifiers?  It seems like you still  have an inconsistent state
>after the failure unless you issue a second call with a notification
>that you reverted to the old MTU.

Good point, thank you. I'll add a second call_netdevice_notifiers() after
setting the original MTU and resend.

Thank you!

>
>Thanks,
>
>Alex

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] net: make dev_set_mtu() honor notification return code
  2014-01-10 12:48 [PATCH net-next] net: make dev_set_mtu() honor notification return code Veaceslav Falico
  2014-01-10 15:36 ` Alexander Duyck
@ 2014-01-13 23:18 ` David Miller
  2014-01-14 12:13   ` Veaceslav Falico
  1 sibling, 1 reply; 6+ messages in thread
From: David Miller @ 2014-01-13 23:18 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, jiri, edumazet, alexander.h.duyck, nicolas.dichtel

From: Veaceslav Falico <vfalico@redhat.com>
Date: Fri, 10 Jan 2014 13:48:17 +0100

> Currently, after changing the MTU for a device, dev_set_mtu() calls
> NETDEV_CHANGEMTU notification, however doesn't verify it's return code -
> which can be NOTIFY_BAD - i.e. some of the net notifier blocks refused this
> change, and continues nevertheless.
> 
> To fix this, verify the return code, and if it's an error - then revert the
> MTU to the original one, and pass the error code.
> 
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

This is really a precariously designed code path.

If one of the notifiers says NOTIFY_BAD, well we've already changed
dev->mtu, therefore what if a packet got sent during this time?

Whoever the NOTIFY_BAD signaller is, obviously cannot handle an MTU
setting which we've already set in the netdev.  So allowing it's a
terribly idea to allow visibility of the new MTU value until we can be
sure everyone can handle it.

The problem is that we really need a transaction of some sort to fix
this properly.  First, we'd need to ask all the notifiers if they
can handle the new MTU, then we somehow atomically set netdev->mtu
and have the notifiers commit their own state changes.

Then we'll have the stick issue of what to do if a notifier is
unregistered between the check and the commit. :-)

Your patch is an improvement so I will apply it, this stuff really
is full of holes still.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] net: make dev_set_mtu() honor notification return code
  2014-01-13 23:18 ` David Miller
@ 2014-01-14 12:13   ` Veaceslav Falico
  2014-01-15 21:48     ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Veaceslav Falico @ 2014-01-14 12:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jiri, edumazet, alexander.h.duyck, nicolas.dichtel

On Mon, Jan 13, 2014 at 03:18:55PM -0800, David Miller wrote:
>From: Veaceslav Falico <vfalico@redhat.com>
>Date: Fri, 10 Jan 2014 13:48:17 +0100
>
>> Currently, after changing the MTU for a device, dev_set_mtu() calls
>> NETDEV_CHANGEMTU notification, however doesn't verify it's return code -
>> which can be NOTIFY_BAD - i.e. some of the net notifier blocks refused this
>> change, and continues nevertheless.
>>
>> To fix this, verify the return code, and if it's an error - then revert the
>> MTU to the original one, and pass the error code.
>>
>> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>
>This is really a precariously designed code path.
>
>If one of the notifiers says NOTIFY_BAD, well we've already changed
>dev->mtu, therefore what if a packet got sent during this time?
>
>Whoever the NOTIFY_BAD signaller is, obviously cannot handle an MTU
>setting which we've already set in the netdev.  So allowing it's a
>terribly idea to allow visibility of the new MTU value until we can be
>sure everyone can handle it.
>
>The problem is that we really need a transaction of some sort to fix
>this properly.  First, we'd need to ask all the notifiers if they
>can handle the new MTU, then we somehow atomically set netdev->mtu
>and have the notifiers commit their own state changes.

Yeah, but I can't think of a method to atomically set it for both netdev
and its notifiers... As in, for example, bridge (but not only) takes the
lowest MTU of its ports.

>
>Then we'll have the stick issue of what to do if a notifier is
>unregistered between the check and the commit. :-)

Maybe you've meant 'registered between ...' ? :-) Anyway, I guess
dev_set_mtu() should always be called under RTNL, and this way we won't
have these problems. Though I might be wrong, everyone seems playing with
MTU the way they want.

>
>Your patch is an improvement so I will apply it, this stuff really
>is full of holes still.

One (least intrusive) approach would be to add NETDEV_PRECHANGEMTU, which
would be used to verify if the notifiers all agree with changing, and leave
the NETDEV_CHANGEMTU fail only when something really bad happened. That's
your idea, basically.

As, currently, only team can signal NOTIFY_BAD on mtu change, it's really
easy to implement. What do you think?

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 736050d..4e50e04 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -2850,7 +2850,7 @@ static int team_device_event(struct notifier_block *unused,
  	case NETDEV_FEAT_CHANGE:
  		team_compute_features(port->team);
  		break;
-	case NETDEV_CHANGEMTU:
+	case NETDEV_PRECHANGEMTU:
  		/* Forbid to change mtu of underlaying device */
  		return NOTIFY_BAD;
  	case NETDEV_PRE_TYPE_CHANGE:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a2a70cc..7e023c4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1731,6 +1731,7 @@ struct pcpu_sw_netstats {
  #define NETDEV_JOIN		0x0014
  #define NETDEV_CHANGEUPPER	0x0015
  #define NETDEV_RESEND_IGMP	0x0016
+#define NETDEV_PRECHANGEMTU	0x0017
  
  int register_netdevice_notifier(struct notifier_block *nb);
  int unregister_netdevice_notifier(struct notifier_block *nb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 87312dc..096d4dd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5332,6 +5332,10 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
  	if (!netif_device_present(dev))
  		return -ENODEV;
  
+	err = call_netdevice_notifiers(NETDEV_PRECHANGEMTU, dev);
+	if (!err)
+		return notifier_to_errno(err);
+
  	orig_mtu = dev->mtu;
  	err = __dev_set_mtu(dev, new_mtu);
  

>
>Thanks.
>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] net: make dev_set_mtu() honor notification return code
  2014-01-14 12:13   ` Veaceslav Falico
@ 2014-01-15 21:48     ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2014-01-15 21:48 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, jiri, edumazet, alexander.h.duyck, nicolas.dichtel

From: Veaceslav Falico <vfalico@redhat.com>
Date: Tue, 14 Jan 2014 13:13:54 +0100

> As, currently, only team can signal NOTIFY_BAD on mtu change, it's
> really easy to implement. What do you think?

Looks great, and I agree that RTNL should solve all the other issues.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-01-15 21:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-10 12:48 [PATCH net-next] net: make dev_set_mtu() honor notification return code Veaceslav Falico
2014-01-10 15:36 ` Alexander Duyck
2014-01-10 15:47   ` Veaceslav Falico
2014-01-13 23:18 ` David Miller
2014-01-14 12:13   ` Veaceslav Falico
2014-01-15 21:48     ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).