* [PATCH] sch_teql: should not dereference skb after ndo_start_xmit @ 2009-05-18 10:31 Eric Dumazet 2009-05-18 22:12 ` David Miller 0 siblings, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2009-05-18 10:31 UTC (permalink / raw) To: David S. Miller; +Cc: Linux Netdev List, Jussi Kivilinna David I found following by code inspection, not a crash analysis, but I believe it is a 2.6.30 candidate, and stable (2.6.27 and up) as well. Thank you [PATCH] sch_teql: should not dereference skb after ndo_start_xmit() It is illegal to dereference a skb after a successful ndo_start_xmit() call. We must store skb length in a local variable instead. Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb (net_sched: Add accessor function for packet length for qdiscs) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> --- diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c index ec697ce..3b64182 100644 --- a/net/sched/sch_teql.c +++ b/net/sched/sch_teql.c @@ -303,6 +303,8 @@ restart: switch (teql_resolve(skb, skb_res, slave)) { case 0: if (__netif_tx_trylock(slave_txq)) { + unsigned int length = qdisc_pkt_len(skb); + if (!netif_tx_queue_stopped(slave_txq) && !netif_tx_queue_frozen(slave_txq) && slave_ops->ndo_start_xmit(skb, slave) == 0) { @@ -310,8 +312,7 @@ restart: master->slaves = NEXT_SLAVE(q); netif_wake_queue(dev); master->stats.tx_packets++; - master->stats.tx_bytes += - qdisc_pkt_len(skb); + master->stats.tx_bytes += length; return 0; } __netif_tx_unlock(slave_txq); ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] sch_teql: should not dereference skb after ndo_start_xmit 2009-05-18 10:31 [PATCH] sch_teql: should not dereference skb after ndo_start_xmit Eric Dumazet @ 2009-05-18 22:12 ` David Miller 2009-05-19 1:34 ` Eric Dumazet 0 siblings, 1 reply; 7+ messages in thread From: David Miller @ 2009-05-18 22:12 UTC (permalink / raw) To: dada1; +Cc: netdev, jussi.kivilinna From: Eric Dumazet <dada1@cosmosbay.com> Date: Mon, 18 May 2009 12:31:57 +0200 > [PATCH] sch_teql: should not dereference skb after ndo_start_xmit() > > It is illegal to dereference a skb after a successful ndo_start_xmit() > call. We must store skb length in a local variable instead. > > Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb > (net_sched: Add accessor function for packet length for qdiscs) > > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Applied and queued up for -stable, thanks! ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sch_teql: should not dereference skb after ndo_start_xmit 2009-05-18 22:12 ` David Miller @ 2009-05-19 1:34 ` Eric Dumazet 2009-05-19 2:34 ` David Miller 2009-08-03 1:59 ` [PATCH] sch_teql: should not dereference skb after ndo_start_xmit David Miller 0 siblings, 2 replies; 7+ messages in thread From: Eric Dumazet @ 2009-05-19 1:34 UTC (permalink / raw) To: David Miller; +Cc: netdev, jussi.kivilinna David Miller a écrit : > From: Eric Dumazet <dada1@cosmosbay.com> > Date: Mon, 18 May 2009 12:31:57 +0200 > >> [PATCH] sch_teql: should not dereference skb after ndo_start_xmit() >> >> It is illegal to dereference a skb after a successful ndo_start_xmit() >> call. We must store skb length in a local variable instead. >> >> Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb >> (net_sched: Add accessor function for packet length for qdiscs) >> >> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> > > Applied and queued up for -stable, thanks! Looking again at teql_master_xmit(), I wonder if there is another problem in it. int subq = skb_get_queue_mapping(skb); But as a matter of fact, following code assumes subq is 0 struct netdev_queue *slave_txq = netdev_get_tx_queue(slave, 0); ... if (__netif_subqueue_stopped(slave, subq) || ... Either we should set subq to 0, or call dev_pick_tx() to better take into account multi queue slaves ? (As teqlN has one queue, I assume original dev_queue_xmit() will set skb queue_mapping to 0 before entering teql_master_xmit(), but maybe other paths could call teql_master_xmit() not through dev_queue_xmit() ? Oh well, time for sleeping a bit... ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sch_teql: should not dereference skb after ndo_start_xmit 2009-05-19 1:34 ` Eric Dumazet @ 2009-05-19 2:34 ` David Miller 2009-05-19 5:43 ` [PATCH] sch_teql: Use net_device internal stats Eric Dumazet 2009-08-03 1:59 ` [PATCH] sch_teql: should not dereference skb after ndo_start_xmit David Miller 1 sibling, 1 reply; 7+ messages in thread From: David Miller @ 2009-05-19 2:34 UTC (permalink / raw) To: dada1; +Cc: netdev, jussi.kivilinna From: Eric Dumazet <dada1@cosmosbay.com> Date: Tue, 19 May 2009 03:34:03 +0200 > Looking again at teql_master_xmit(), I wonder if there is another > problem in it. > > int subq = skb_get_queue_mapping(skb); > > But as a matter of fact, following code assumes subq is 0 > > struct netdev_queue *slave_txq = netdev_get_tx_queue(slave, 0); > ... > if (__netif_subqueue_stopped(slave, subq) || > ... > > Either we should set subq to 0, or call dev_pick_tx() to better > take into account multi queue slaves ? > > (As teqlN has one queue, I assume original dev_queue_xmit() will > set skb queue_mapping to 0 before entering teql_master_xmit(), but > maybe other paths could call teql_master_xmit() not through dev_queue_xmit() ? > > Oh well, time for sleeping a bit... I was admittedly very hasty with the multiqueue conversion here. I'll take a closer look at this. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] sch_teql: Use net_device internal stats 2009-05-19 2:34 ` David Miller @ 2009-05-19 5:43 ` Eric Dumazet 2009-05-19 22:16 ` David Miller 0 siblings, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2009-05-19 5:43 UTC (permalink / raw) To: David Miller; +Cc: netdev David Miller a écrit : > From: Eric Dumazet <dada1@cosmosbay.com> > Date: Tue, 19 May 2009 03:34:03 +0200 > >> Looking again at teql_master_xmit(), I wonder if there is another >> problem in it. >> >> int subq = skb_get_queue_mapping(skb); >> >> But as a matter of fact, following code assumes subq is 0 >> >> struct netdev_queue *slave_txq = netdev_get_tx_queue(slave, 0); >> ... >> if (__netif_subqueue_stopped(slave, subq) || >> ... >> >> Either we should set subq to 0, or call dev_pick_tx() to better >> take into account multi queue slaves ? >> >> (As teqlN has one queue, I assume original dev_queue_xmit() will >> set skb queue_mapping to 0 before entering teql_master_xmit(), but >> maybe other paths could call teql_master_xmit() not through dev_queue_xmit() ? >> >> Oh well, time for sleeping a bit... > > I was admittedly very hasty with the multiqueue conversion here. > I'll take a closer look at this. Thanks David In the meantime, I did the master->stats removal for teqlN devices, for net-next-2.6, if you dont mind :) [PATCH] sch_teql: Use net_device internal stats We can slightly reduce size of teqlN structure, not duplicating stats structure in teql_master but using stats field from net_device.stats for tx_errors and from netdev_queue for tx_bytes/tx_packets/tx_dropped values. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> --- net/sched/sch_teql.c | 17 +++++------------ 1 files changed, 5 insertions(+), 12 deletions(-) diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c index 3b64182..428a5ef 100644 --- a/net/sched/sch_teql.c +++ b/net/sched/sch_teql.c @@ -58,7 +58,6 @@ struct teql_master struct net_device *dev; struct Qdisc *slaves; struct list_head master_list; - struct net_device_stats stats; }; struct teql_sched_data @@ -272,6 +271,7 @@ static inline int teql_resolve(struct sk_buff *skb, static int teql_master_xmit(struct sk_buff *skb, struct net_device *dev) { struct teql_master *master = netdev_priv(dev); + struct netdev_queue *txq = netdev_get_tx_queue(dev, 0); struct Qdisc *start, *q; int busy; int nores; @@ -311,8 +311,8 @@ restart: __netif_tx_unlock(slave_txq); master->slaves = NEXT_SLAVE(q); netif_wake_queue(dev); - master->stats.tx_packets++; - master->stats.tx_bytes += length; + txq->tx_packets++; + txq->tx_bytes += length; return 0; } __netif_tx_unlock(slave_txq); @@ -339,10 +339,10 @@ restart: netif_stop_queue(dev); return 1; } - master->stats.tx_errors++; + dev->stats.tx_errors++; drop: - master->stats.tx_dropped++; + txq->tx_dropped++; dev_kfree_skb(skb); return 0; } @@ -395,12 +395,6 @@ static int teql_master_close(struct net_device *dev) return 0; } -static struct net_device_stats *teql_master_stats(struct net_device *dev) -{ - struct teql_master *m = netdev_priv(dev); - return &m->stats; -} - static int teql_master_mtu(struct net_device *dev, int new_mtu) { struct teql_master *m = netdev_priv(dev); @@ -425,7 +419,6 @@ static const struct net_device_ops teql_netdev_ops = { .ndo_open = teql_master_open, .ndo_stop = teql_master_close, .ndo_start_xmit = teql_master_xmit, - .ndo_get_stats = teql_master_stats, .ndo_change_mtu = teql_master_mtu, }; ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] sch_teql: Use net_device internal stats 2009-05-19 5:43 ` [PATCH] sch_teql: Use net_device internal stats Eric Dumazet @ 2009-05-19 22:16 ` David Miller 0 siblings, 0 replies; 7+ messages in thread From: David Miller @ 2009-05-19 22:16 UTC (permalink / raw) To: dada1; +Cc: netdev From: Eric Dumazet <dada1@cosmosbay.com> Date: Tue, 19 May 2009 07:43:09 +0200 > [PATCH] sch_teql: Use net_device internal stats > > We can slightly reduce size of teqlN structure, not duplicating stats structure > in teql_master but using stats field from net_device.stats for tx_errors and from > netdev_queue for tx_bytes/tx_packets/tx_dropped values. > > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Applied, thanks! ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sch_teql: should not dereference skb after ndo_start_xmit 2009-05-19 1:34 ` Eric Dumazet 2009-05-19 2:34 ` David Miller @ 2009-08-03 1:59 ` David Miller 1 sibling, 0 replies; 7+ messages in thread From: David Miller @ 2009-08-03 1:59 UTC (permalink / raw) To: dada1; +Cc: netdev, jussi.kivilinna From: Eric Dumazet <dada1@cosmosbay.com> Date: Tue, 19 May 2009 03:34:03 +0200 > Looking again at teql_master_xmit(), I wonder if there is another > problem in it. > > int subq = skb_get_queue_mapping(skb); > > But as a matter of fact, following code assumes subq is 0 I looked into this again, and damn this is tough to deal with. The code works as-is, since teql devices have only 1 queue we can assume queue 0 and we only end up using one of the slave devices queues too. I tried to export dev_pick_tx() (renaming it to netdev_pick_tx() to avoid global namespace pollution) but then I realized that we can't just whack the subq here. If this gets punted back to the caller and we don't actually send out the packet, we can't leave the new skb->queue_index in there as teql's multiqueue parameters are what will be checked and used against this SKB again. I suppose we could restore the old queue index value when we exhaust the slaves and can't transmit, but is getting messy for sure. Since the code works properly, and this is merely a performance issue, I'm deferring this again. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-08-03 1:59 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-05-18 10:31 [PATCH] sch_teql: should not dereference skb after ndo_start_xmit Eric Dumazet 2009-05-18 22:12 ` David Miller 2009-05-19 1:34 ` Eric Dumazet 2009-05-19 2:34 ` David Miller 2009-05-19 5:43 ` [PATCH] sch_teql: Use net_device internal stats Eric Dumazet 2009-05-19 22:16 ` David Miller 2009-08-03 1:59 ` [PATCH] sch_teql: should not dereference skb after ndo_start_xmit David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).