* [PATCH net-next V2] net: sched: fallback to qdisc noqueue if default qdisc setup fail
From: Jesper Dangaard Brouer @ 2020-04-30 11:42 UTC
To: netdev
Cc: Jesper Dangaard Brouer, Jakub Kicinski, Stephen Hemminger,
David Ahern
Currently if the default qdisc setup/init fails, the device ends up with
qdisc "noop", which causes all TX packets to get dropped.

With the introduction of sysctl net/core/default_qdisc it is possible
to change the default qdisc to a more advanced one, which opens up the
possibility that Qdisc_ops->init() can fail.

This patch detects these kinds of failures and falls back to
qdisc "noqueue", which is so simple that its init call will not fail.
This allows the interface to continue functioning.

V2:
As this also captures memory failures, which are transient, the
device is not kept in IFF_NO_QUEUE state. This allows the net_device
to retry the default qdisc assignment.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
net/sched/sch_generic.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 2efd5b61acef..ad24fa1a51e6 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1037,10 +1037,9 @@ static void attach_one_default_qdisc(struct net_device *dev,
 		ops = &pfifo_fast_ops;

 	qdisc = qdisc_create_dflt(dev_queue, ops, TC_H_ROOT, NULL);
-	if (!qdisc) {
-		netdev_info(dev, "activation failed\n");
+	if (!qdisc)
 		return;
-	}
+
 	if (!netif_is_multiqueue(dev))
 		qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 	dev_queue->qdisc_sleeping = qdisc;
@@ -1065,6 +1064,18 @@ static void attach_default_qdiscs(struct net_device *dev)
 			qdisc->ops->attach(qdisc);
 		}
 	}
+
+	/* Detect default qdisc setup/init failed and fallback to "noqueue" */
+	if (dev->qdisc == &noop_qdisc) {
+		netdev_warn(dev, "default qdisc (%s) fail, fallback to %s\n",
+			    default_qdisc_ops->id, noqueue_qdisc_ops.id);
+		dev->priv_flags |= IFF_NO_QUEUE;
+		netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL);
+		dev->qdisc = txq->qdisc_sleeping;
+		qdisc_refcount_inc(dev->qdisc);
+		dev->priv_flags ^= IFF_NO_QUEUE;
+	}
+
 #ifdef CONFIG_NET_SCHED
 	if (dev->qdisc != &noop_qdisc)
 		qdisc_hash_add(dev->qdisc, false);
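
The control flow added by the second hunk can be sketched in userspace. This is only a toy model under stated assumptions, not kernel code: `toy_dev`, `toy_ops`, `attach_one()` and `attach_default()` are hypothetical stand-ins for the net_device, Qdisc_ops and the two attach functions, a NULL `qdisc` pointer stands in for `&noop_qdisc`, and the failing `init` callback merely simulates something like an allocation failure in Qdisc_ops->init().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel types; none of this is the real API. */
struct toy_ops {
	const char *id;
	int (*init)(void);	/* returns 0 on success, negative on failure */
};

struct toy_dev {
	const char *name;
	bool no_queue;			/* models IFF_NO_QUEUE in priv_flags */
	const struct toy_ops *qdisc;	/* NULL models &noop_qdisc */
};

static int init_ok(void)   { return 0; }
static int init_fail(void) { return -12; /* models -ENOMEM */ }

static const struct toy_ops noqueue_ops = { "noqueue", init_ok };

/* Models attach_one_default_qdisc(): pick ops, silently give up on failure. */
static void attach_one(struct toy_dev *dev, const struct toy_ops *dflt)
{
	const struct toy_ops *ops = dev->no_queue ? &noqueue_ops : dflt;

	if (ops->init() != 0)
		return;		/* device stays in the "noop" state */
	dev->qdisc = ops;
}

/* Models the tail of attach_default_qdiscs() with the fallback added. */
static void attach_default(struct toy_dev *dev, const struct toy_ops *dflt)
{
	attach_one(dev, dflt);

	if (dev->qdisc == NULL) {
		fprintf(stderr, "%s: default qdisc (%s) fail, fallback to %s\n",
			dev->name, dflt->id, noqueue_ops.id);
		dev->no_queue = true;	/* force the noqueue choice ... */
		attach_one(dev, dflt);
		dev->no_queue = false;	/* ... but do not keep the flag set */
	}
}
```

Calling `attach_default()` with ops whose `init` fails leaves the toy device on "noqueue" with the flag cleared again, which mirrors the V2 note that the device is not kept in IFF_NO_QUEUE state.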
* Re: [PATCH net-next V2] net: sched: fallback to qdisc noqueue if default qdisc setup fail
From: Jakub Kicinski @ 2020-04-30 19:45 UTC
To: Jesper Dangaard Brouer; +Cc: netdev, Stephen Hemminger, David Ahern
On Thu, 30 Apr 2020 13:42:22 +0200 Jesper Dangaard Brouer wrote:
> Currently if the default qdisc setup/init fails, the device ends up with
> qdisc "noop", which causes all TX packets to get dropped.
>
> With the introduction of sysctl net/core/default_qdisc it is possible
> to change the default qdisc to a more advanced one, which opens up the
> possibility that Qdisc_ops->init() can fail.
>
> This patch detects these kinds of failures and falls back to
> qdisc "noqueue", which is so simple that its init call will not fail.
> This allows the interface to continue functioning.
>
> V2:
> As this also captures memory failures, which are transient, the
> device is not kept in IFF_NO_QUEUE state. This allows the net_device
> to retry the default qdisc assignment.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
I have mixed feelings about this one; I wonder if I'm the only one.
A failure to allocate the default qdisc seems pretty critical, and
the log message may be missed, especially in the boot-time noise.

I think a WARN_ON() is in order here. I'd personally just replace the
netdev_info() with a WARN_ON(), without the fallback.
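
The alternative suggested here (warn loudly, keep no fallback) might look roughly like the userspace sketch below. It relies on the fact that the kernel's WARN_ON() evaluates to its condition, so it can sit directly inside an `if`. Everything prefixed `toy_`/`TOY_` is a hypothetical stand-in: the macro only prints one line to stderr, whereas the real WARN_ON() also dumps a stack trace, and `toy_create_dflt()` merely simulates a failing qdisc_create_dflt().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static int toy_warn_count;	/* lets callers observe how often we warned */

/* Userspace stand-in for WARN_ON(): report loudly, return the condition. */
static bool toy_warn_on(bool cond, const char *expr, const char *file, int line)
{
	if (cond) {
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s:%d: %s\n", file, line, expr);
	}
	return cond;
}
#define TOY_WARN_ON(cond) toy_warn_on(!!(cond), #cond, __FILE__, __LINE__)

struct toy_qdisc { const char *id; };

/* Hypothetical creator; fails on request to model an init/alloc failure. */
static struct toy_qdisc *toy_create_dflt(struct toy_qdisc *storage, bool fail)
{
	return fail ? NULL : storage;
}

/* Returns the attached qdisc, or NULL when the device is left on "noop". */
static struct toy_qdisc *toy_attach_one(struct toy_qdisc *storage, bool fail)
{
	struct toy_qdisc *qdisc = toy_create_dflt(storage, fail);

	if (TOY_WARN_ON(!qdisc))
		return NULL;	/* no fallback: TX stays dropped, but loudly */
	return qdisc;
}
```

The trade-off is visible in the return value: on failure the caller still ends up without a working qdisc, but the failure is much harder to overlook than a single info-level log line.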
* Re: [PATCH net-next V2] net: sched: fallback to qdisc noqueue if default qdisc setup fail
From: Jesper Dangaard Brouer @ 2020-05-01 11:56 UTC
To: Jakub Kicinski; +Cc: netdev, Stephen Hemminger, David Ahern, brouer
On Thu, 30 Apr 2020 12:45:49 -0700
Jakub Kicinski <kuba@kernel.org> wrote:
> On Thu, 30 Apr 2020 13:42:22 +0200 Jesper Dangaard Brouer wrote:
> > Currently if the default qdisc setup/init fails, the device ends up with
> > qdisc "noop", which causes all TX packets to get dropped.
> >
> > With the introduction of sysctl net/core/default_qdisc it is possible
> > to change the default qdisc to a more advanced one, which opens up the
> > possibility that Qdisc_ops->init() can fail.
> >
> > This patch detects these kinds of failures and falls back to
> > qdisc "noqueue", which is so simple that its init call will not fail.
> > This allows the interface to continue functioning.
> >
> > V2:
> > As this also captures memory failures, which are transient, the
> > device is not kept in IFF_NO_QUEUE state. This allows the net_device
> > to retry the default qdisc assignment.
> >
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> I have mixed feelings about this one; I wonder if I'm the only one.
> A failure to allocate the default qdisc seems pretty critical, and
> the log message may be missed, especially in the boot-time noise.
>
> I think a WARN_ON() is in order here. I'd personally just replace the
> netdev_info() with a WARN_ON(), without the fallback.
It is good that we agree that failing to set up the default qdisc is
pretty critical. I guess we disagree on whether we should (1) keep the
network functioning in a degraded state, or (2) drop all packets on the
net_device so that people notice.

This change proposes (1), keeping the box functioning. For me it was a
pretty bad experience that, when I pushed a new kernel over the network
to my embedded box, I lost all network connectivity. I
fortunately had serial console access (as this was not an OpenWRT box
but a full devel board) so I could debug, but I could no longer upgrade
the kernel. I clearly noticed, as the box was not operational, but I
guess most people would just give up at this point. (Imagine a small
OpenWRT box whose config sets default_qdisc to fq_codel, which bricks the
box because it cannot allocate memory.)

I hope that people will notice this degraded state when they start to
transfer data to the device, because running 'noqueue' on a physical
device will result in net_crit_ratelimited() messages like the ones below:

[86971.609318] Virtual device eth0 asks to queue packet!
[86971.622183] Virtual device eth0 asks to queue packet!
[86971.627510] Virtual device eth0 asks to queue packet!
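
A rough model of why a physical device on 'noqueue' produces these messages: without a queueing qdisc a packet is either handed straight to the driver or dropped, so a full (stopped) TX ring leaves nowhere to park it. The sketch below is a simplified assumption about that behaviour, not the kernel's transmit path; `toy_nic`, `toy_xmit_noqueue()` and `toy_tx_complete()` are hypothetical names, and unlike the kernel the model does not rate-limit its message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a physical NIC: a TX ring with a fixed slot count. */
struct toy_nic {
	const char *name;
	int ring_free;	/* free TX descriptors; 0 models a stopped TX queue */
	int sent;
	int dropped;
};

/*
 * Models transmitting without a queueing qdisc: the packet either goes
 * straight to the driver or is dropped, because nothing can hold it.
 */
static bool toy_xmit_noqueue(struct toy_nic *nic)
{
	if (nic->ring_free > 0) {
		nic->ring_free--;
		nic->sent++;
		return true;
	}
	/* The kernel rate-limits this message; the model prints every time. */
	fprintf(stderr, "Virtual device %s asks to queue packet!\n", nic->name);
	nic->dropped++;
	return false;
}

/* Models TX completion handing n descriptors back to the ring. */
static void toy_tx_complete(struct toy_nic *nic, int n)
{
	nic->ring_free += n;
}
```

In this model a burst larger than the free ring space loses the excess packets, which is the degraded but still reachable state the fallback aims for.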
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH net-next V2] net: sched: fallback to qdisc noqueue if default qdisc setup fail
From: Jakub Kicinski @ 2020-05-01 19:01 UTC
To: Jesper Dangaard Brouer; +Cc: netdev, Stephen Hemminger, David Ahern
On Fri, 1 May 2020 13:56:02 +0200 Jesper Dangaard Brouer wrote:
> On Thu, 30 Apr 2020 12:45:49 -0700
> Jakub Kicinski <kuba@kernel.org> wrote:
>
> > On Thu, 30 Apr 2020 13:42:22 +0200 Jesper Dangaard Brouer wrote:
> > > Currently if the default qdisc setup/init fails, the device ends up with
> > > qdisc "noop", which causes all TX packets to get dropped.
> > >
> > > With the introduction of sysctl net/core/default_qdisc it is possible
> > > to change the default qdisc to a more advanced one, which opens up the
> > > possibility that Qdisc_ops->init() can fail.
> > >
> > > This patch detects these kinds of failures and falls back to
> > > qdisc "noqueue", which is so simple that its init call will not fail.
> > > This allows the interface to continue functioning.
> > >
> > > V2:
> > > As this also captures memory failures, which are transient, the
> > > device is not kept in IFF_NO_QUEUE state. This allows the net_device
> > > to retry the default qdisc assignment.
> > >
> > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> >
> > I have mixed feelings about this one; I wonder if I'm the only one.
> > A failure to allocate the default qdisc seems pretty critical, and
> > the log message may be missed, especially in the boot-time noise.
> >
> > I think a WARN_ON() is in order here. I'd personally just replace the
> > netdev_info() with a WARN_ON(), without the fallback.
>
> It is good that we agree that failing to set up the default qdisc is
> pretty critical. I guess we disagree on whether we should (1) keep the
> network functioning in a degraded state, or (2) drop all packets on the
> net_device so that people notice.
>
> This change proposes (1), keeping the box functioning. For me it was a
> pretty bad experience that, when I pushed a new kernel over the network
> to my embedded box, I lost all network connectivity. I
> fortunately had serial console access (as this was not an OpenWRT box
> but a full devel board) so I could debug, but I could no longer upgrade
> the kernel. I clearly noticed, as the box was not operational, but I
> guess most people would just give up at this point. (Imagine a small
> OpenWRT box whose config sets default_qdisc to fq_codel, which bricks the
> box because it cannot allocate memory.)
>
> I hope that people will notice this degraded state when they start to
> transfer data to the device, because running 'noqueue' on a physical
> device will result in net_crit_ratelimited() messages like the ones below:
>
> [86971.609318] Virtual device eth0 asks to queue packet!
> [86971.622183] Virtual device eth0 asks to queue packet!
> [86971.627510] Virtual device eth0 asks to queue packet!
Both ways have advantages, I guess. I don't feel strongly,
but I do think that WARN_ON() is in order here.
* Re: [PATCH net-next V2] net: sched: fallback to qdisc noqueue if default qdisc setup fail
From: David Miller @ 2020-05-04 18:51 UTC
To: brouer; +Cc: netdev, kuba, stephen, dsahern
From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Thu, 30 Apr 2020 13:42:22 +0200
> Currently if the default qdisc setup/init fails, the device ends up with
> qdisc "noop", which causes all TX packets to get dropped.
>
> With the introduction of sysctl net/core/default_qdisc it is possible
> to change the default qdisc to a more advanced one, which opens up the
> possibility that Qdisc_ops->init() can fail.
>
> This patch detects these kinds of failures and falls back to
> qdisc "noqueue", which is so simple that its init call will not fail.
> This allows the interface to continue functioning.
>
> V2:
> As this also captures memory failures, which are transient, the
> device is not kept in IFF_NO_QUEUE state. This allows the net_device
> to retry the default qdisc assignment.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Applied, thanks.