netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] pkt_sched: Fix gen_estimator locks
@ 2008-08-25 23:08 Jarek Poplawski
  2008-08-26 12:57 ` Herbert Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2008-08-25 23:08 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

pkt_sched: Fix gen_estimator locks

While passing a qdisc root lock to gen_new_estimator() and
gen_replace_estimator() dev could be deactivated or even before
grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create),
so using qdisc_root_lock() is not enough. This patch adds
qdisc_root_sleeping_lock() for this, plus additional checks, where
necessary.


Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

---

 include/net/sch_generic.h |    8 ++++++++
 net/sched/sch_api.c       |   14 +++++++++++---
 net/sched/sch_cbq.c       |    4 ++--
 net/sched/sch_hfsc.c      |    4 ++--
 net/sched/sch_htb.c       |    4 ++--
 5 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index b1d2cfe..ef8a7e2 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -217,6 +217,14 @@ static inline spinlock_t *qdisc_root_lock(struct Qdisc *qdisc)
 	return qdisc_lock(root);
 }
 
+static inline spinlock_t *qdisc_root_sleeping_lock(struct Qdisc *qdisc)
+{
+	struct Qdisc *root = qdisc_root_sleeping(qdisc);
+
+	ASSERT_RTNL();
+	return qdisc_lock(root);
+}
+
 static inline struct net_device *qdisc_dev(struct Qdisc *qdisc)
 {
 	return qdisc->dev_queue->dev;
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index e7fb9e0..4f9397f 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -830,9 +830,16 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
 			sch->stab = stab;
 		}
 		if (tca[TCA_RATE]) {
+			spinlock_t *root_lock;
+
+			if ((sch->parent != TC_H_ROOT) &&
+			    !(sch->flags & TCQ_F_INGRESS))
+				root_lock = qdisc_root_sleeping_lock(sch);
+			else
+				root_lock = qdisc_lock(sch);
+
 			err = gen_new_estimator(&sch->bstats, &sch->rate_est,
-						qdisc_root_lock(sch),
-						tca[TCA_RATE]);
+						root_lock, tca[TCA_RATE]);
 			if (err) {
 				/*
 				 * Any broken qdiscs that would require
@@ -884,7 +891,8 @@ static int qdisc_change(struct Qdisc *sch, struct nlattr **tca)
 
 	if (tca[TCA_RATE])
 		gen_replace_estimator(&sch->bstats, &sch->rate_est,
-				      qdisc_root_lock(sch), tca[TCA_RATE]);
+				      qdisc_root_sleeping_lock(sch),
+				      tca[TCA_RATE]);
 	return 0;
 }
 
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 8fa90d6..9b720ad 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1839,7 +1839,7 @@ cbq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, struct nlattr **t
 
 		if (tca[TCA_RATE])
 			gen_replace_estimator(&cl->bstats, &cl->rate_est,
-					      qdisc_root_lock(sch),
+					      qdisc_root_sleeping_lock(sch),
 					      tca[TCA_RATE]);
 		return 0;
 	}
@@ -1930,7 +1930,7 @@ cbq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, struct nlattr **t
 
 	if (tca[TCA_RATE])
 		gen_new_estimator(&cl->bstats, &cl->rate_est,
-				  qdisc_root_lock(sch), tca[TCA_RATE]);
+				  qdisc_root_sleeping_lock(sch), tca[TCA_RATE]);
 
 	*arg = (unsigned long)cl;
 	return 0;
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index c2b8d9c..c1e77da 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1045,7 +1045,7 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 
 		if (tca[TCA_RATE])
 			gen_replace_estimator(&cl->bstats, &cl->rate_est,
-					      qdisc_root_lock(sch),
+					      qdisc_root_sleeping_lock(sch),
 					      tca[TCA_RATE]);
 		return 0;
 	}
@@ -1104,7 +1104,7 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 
 	if (tca[TCA_RATE])
 		gen_new_estimator(&cl->bstats, &cl->rate_est,
-				  qdisc_root_lock(sch), tca[TCA_RATE]);
+				  qdisc_root_sleeping_lock(sch), tca[TCA_RATE]);
 	*arg = (unsigned long)cl;
 	return 0;
 }
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 0df0df2..97d4761 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1372,7 +1372,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 			goto failure;
 
 		gen_new_estimator(&cl->bstats, &cl->rate_est,
-				  qdisc_root_lock(sch),
+				  qdisc_root_sleeping_lock(sch),
 				  tca[TCA_RATE] ? : &est.nla);
 		cl->refcnt = 1;
 		cl->children = 0;
@@ -1427,7 +1427,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 	} else {
 		if (tca[TCA_RATE])
 			gen_replace_estimator(&cl->bstats, &cl->rate_est,
-					      qdisc_root_lock(sch),
+					      qdisc_root_sleeping_lock(sch),
 					      tca[TCA_RATE]);
 		sch_tree_lock(sch);
 	}

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-25 23:08 [PATCH 1/2] pkt_sched: Fix gen_estimator locks Jarek Poplawski
@ 2008-08-26 12:57 ` Herbert Xu
  2008-08-27  9:26   ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2008-08-26 12:57 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: davem, netdev

Jarek Poplawski <jarkao2@gmail.com> wrote:
> pkt_sched: Fix gen_estimator locks
> 
> While passing a qdisc root lock to gen_new_estimator() and
> gen_replace_estimator() dev could be deactivated or even before
> grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create),
> so using qdisc_root_lock() is not enough. This patch adds
> qdisc_root_sleeping_lock() for this, plus additional checks, where
> necessary.
> 
> 
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Looks good.

> @@ -830,9 +830,16 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
>                        sch->stab = stab;
>                }
>                if (tca[TCA_RATE]) {
> +                       spinlock_t *root_lock;
> +
> +                       if ((sch->parent != TC_H_ROOT) &&
> +                           !(sch->flags & TCQ_F_INGRESS))
> +                               root_lock = qdisc_root_sleeping_lock(sch);
> +                       else
> +                               root_lock = qdisc_lock(sch);

Another reason why we should just have pointers to the root.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-26 12:57 ` Herbert Xu
@ 2008-08-27  9:26   ` David Miller
  2008-08-27  9:55     ` Herbert Xu
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2008-08-27  9:26 UTC (permalink / raw)
  To: herbert; +Cc: jarkao2, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue, 26 Aug 2008 22:57:38 +1000

> Jarek Poplawski <jarkao2@gmail.com> wrote:
> > pkt_sched: Fix gen_estimator locks
> > 
> > While passing a qdisc root lock to gen_new_estimator() and
> > gen_replace_estimator() dev could be deactivated or even before
> > grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create),
> > so using qdisc_root_lock() is not enough. This patch adds
> > qdisc_root_sleeping_lock() for this, plus additional checks, where
> > necessary.
> > 
> > 
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> 
> Looks good.

Applied.

> > @@ -830,9 +830,16 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
> >                        sch->stab = stab;
> >                }
> >                if (tca[TCA_RATE]) {
> > +                       spinlock_t *root_lock;
> > +
> > +                       if ((sch->parent != TC_H_ROOT) &&
> > +                           !(sch->flags & TCQ_F_INGRESS))
> > +                               root_lock = qdisc_root_sleeping_lock(sch);
> > +                       else
> > +                               root_lock = qdisc_lock(sch);
> 
> Another reason why we should just have pointers to the root.

Yes root pointers would be useful, but they wouldn't kill these tests
:-)

Here in qdisc_create() we might be creating the new root, and we'd
still thus need a conditional for that case.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27  9:26   ` David Miller
@ 2008-08-27  9:55     ` Herbert Xu
  2008-08-27  9:57       ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2008-08-27  9:55 UTC (permalink / raw)
  To: David Miller; +Cc: jarkao2, netdev

On Wed, Aug 27, 2008 at 02:26:33AM -0700, David Miller wrote:
>
> > > @@ -830,9 +830,16 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
> > >                        sch->stab = stab;
> > >                }
> > >                if (tca[TCA_RATE]) {
> > > +                       spinlock_t *root_lock;
> > > +
> > > +                       if ((sch->parent != TC_H_ROOT) &&
> > > +                           !(sch->flags & TCQ_F_INGRESS))
> > > +                               root_lock = qdisc_root_sleeping_lock(sch);
> > > +                       else
> > > +                               root_lock = qdisc_lock(sch);
> > 
> > Another reason why we should just have pointers to the root.
> 
> Yes root pointers would be useful, but they wouldn't kill these tests
> :-)
> 
> Here in qdisc_create() we might be creating the new root, and we'd
> still thus need a conditional for that case.

What I mean is a pointer like sch->root_qdisc, and we would provide
that to qdisc_create as a parameter in place of dev_queue (and
NULL for root qdiscs themselves).  So as long as we assign it at
the top of this function it would be available here too.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27  9:55     ` Herbert Xu
@ 2008-08-27  9:57       ` David Miller
  2008-08-27 10:44         ` Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2008-08-27  9:57 UTC (permalink / raw)
  To: herbert; +Cc: jarkao2, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 27 Aug 2008 19:55:49 +1000

> What I mean is a pointer like sch->root_qdisc, and we would provide
> that to qdisc_create as a parameter in place of dev_queue (and
> NULL for root qdiscs themselves).  So as long as we assign it at
> the top of this function it would be available here too.

Ok, that makes sense.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27  9:57       ` David Miller
@ 2008-08-27 10:44         ` Jarek Poplawski
  2008-08-27 10:47           ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2008-08-27 10:44 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, netdev

On Wed, Aug 27, 2008 at 02:57:47AM -0700, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Wed, 27 Aug 2008 19:55:49 +1000
> 
> > What I mean is a pointer like sch->root_qdisc, and we would provide
> > that to qdisc_create as a parameter in place of dev_queue (and
> > NULL for root qdiscs themselves).  So as long as we assign it at
> > the top of this function it would be available here too.
> 
> Ok, that makes sense.

Yes, it should be simpler. (We can probably consider a pointer to
itself instead of NULL for root qdiscs, to skip testing for NULL e.g.
while getting a lock.) On the other hand, we lose with this the
possibility to easily determine which dev_queue is "the owner" of the
qdisc, or if some dev_queue contains a clone only.

Jarek P.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27 10:44         ` Jarek Poplawski
@ 2008-08-27 10:47           ` David Miller
  2008-08-27 11:09             ` Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2008-08-27 10:47 UTC (permalink / raw)
  To: jarkao2; +Cc: herbert, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Wed, 27 Aug 2008 10:44:58 +0000

> Yes, it should be simpler. (We can probably consider a pointer to
> itself instead of NULL for root qdiscs, to skip testing for NULL e.g.
> while getting a lock.) On the other hand, we lose with this the
> possibility to easily determine which dev_queue is "the owner" of the
> qdisc, or if some dev_queue contains a clone only.

For root qdiscs we can add a TCQ_F_SHARED flag for this purpose.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27 10:47           ` David Miller
@ 2008-08-27 11:09             ` Jarek Poplawski
  2008-08-27 11:15               ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2008-08-27 11:09 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, netdev

On Wed, Aug 27, 2008 at 03:47:23AM -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Wed, 27 Aug 2008 10:44:58 +0000
> 
> > Yes, it should be simpler. (We can probably consider a pointer to
> > itself instead of NULL for root qdiscs, to skip testing for NULL e.g.
> > while getting a lock.) On the other hand, we lose with this the
> > possibility to easily determine which dev_queue is "the owner" of the
> > qdisc, or if some dev_queue contains a clone only.
> 
> For root qdiscs we can add a TCQ_F_SHARED flag for this purpose.

Yes, but we can't do something like this:

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index e7fb9e0..a7f508c 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -235,6 +235,9 @@ struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
 		struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
 		struct Qdisc *txq_root = txq->qdisc_sleeping;
 
+		/* skip the clones */
+		if (txq_root->dev_queue != txq)
+			continue;
 		q = qdisc_match_from_root(txq_root, handle);
 		if (q)
 			goto unlock;

Jarek P.

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27 11:09             ` Jarek Poplawski
@ 2008-08-27 11:15               ` David Miller
  2008-08-27 11:22                 ` Jarek Poplawski
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2008-08-27 11:15 UTC (permalink / raw)
  To: jarkao2; +Cc: herbert, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Wed, 27 Aug 2008 11:09:39 +0000

> On Wed, Aug 27, 2008 at 03:47:23AM -0700, David Miller wrote:
> > From: Jarek Poplawski <jarkao2@gmail.com>
> > Date: Wed, 27 Aug 2008 10:44:58 +0000
> > 
> > > Yes, it should be simpler. (We can probably consider a pointer to
> > > itself instead of NULL for root qdiscs, to skip testing for NULL e.g.
> > > while getting a lock.) On the other hand, we lose with this the
> > > possibility to easily determine which dev_queue is "the owner" of the
> > > qdisc, or if some dev_queue contains a clone only.
> > 
> > For root qdiscs we can add a TCQ_F_SHARED flag for this purpose.
> 
> Yes, but we can't do something like this:

Good point.

We could go back to using a device scope list.  The only piece to
solve at that point is how to differentiate the different entries in
the non-shared multiq case.

What it comes down to is semantics, and how we might want to handle
multiq non-shared cases in future uses.

Maybe some day we'll allow real complicated configurations, such as
mixes of shared and non-shared qdiscs on the TX queues.

So we'd need to think about how that would look, implementation wise.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pkt_sched: Fix gen_estimator locks
  2008-08-27 11:15               ` David Miller
@ 2008-08-27 11:22                 ` Jarek Poplawski
  0 siblings, 0 replies; 10+ messages in thread
From: Jarek Poplawski @ 2008-08-27 11:22 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, netdev

On Wed, Aug 27, 2008 at 04:15:11AM -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Wed, 27 Aug 2008 11:09:39 +0000
> 
> > On Wed, Aug 27, 2008 at 03:47:23AM -0700, David Miller wrote:
> > > From: Jarek Poplawski <jarkao2@gmail.com>
> > > Date: Wed, 27 Aug 2008 10:44:58 +0000
> > > 
> > > > Yes, it should be simpler. (We can probably consider a pointer to
> > > > itself instead of NULL for root qdiscs, to skip testing for NULL e.g.
> > > > while getting a lock.) On the other hand, we lose with this the
> > > > possibility to easily determine which dev_queue is "the owner" of the
> > > > qdisc, or if some dev_queue contains a clone only.
> > > 
> > > For root qdiscs we can add a TCQ_F_SHARED flag for this purpose.
> > 
> > Yes, but we can't do something like this:
> 
> Good point.
> 
> We could go back to using a device scope list.  The only piece to
> solve at that point is how to differentiate the different entries in
> the non-shared multiq case.
> 
> What it comes down to is semantics, and how we might want to handle
> multiq non-shared cases in future uses.
> 
> Maybe some day we'll allow real complicated configurations, such as
> mixes of shared and non-shared qdiscs on the TX queues.
> 
> So we'd need to think about how that would look, implementation wise.

On the other hand, we could save this functionality with another flag
e.g.:__QDISC_STATE_CLONED for all dev_queues, except "the owner"?

Jarek P.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-08-27 11:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-25 23:08 [PATCH 1/2] pkt_sched: Fix gen_estimator locks Jarek Poplawski
2008-08-26 12:57 ` Herbert Xu
2008-08-27  9:26   ` David Miller
2008-08-27  9:55     ` Herbert Xu
2008-08-27  9:57       ` David Miller
2008-08-27 10:44         ` Jarek Poplawski
2008-08-27 10:47           ` David Miller
2008-08-27 11:09             ` Jarek Poplawski
2008-08-27 11:15               ` David Miller
2008-08-27 11:22                 ` Jarek Poplawski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).