* [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-17 12:17 UTC
To: netdev-u79uwXL29TY76Z2rM5mHXA
Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, johannes-cdvu00un1VgdHxzADdlk8Q,
linux-wireless-u79uwXL29TY76Z2rM5mHXA

This allows less strict control of access to the qdisc attached to a
netdev_queue. It even becomes legal to enqueue into a qdisc that is
in the process of being destroyed; the RCU handler will toss out
those packets.
We will need this to handle sharing of a qdisc amongst multiple
TX queues. In such a setup the lock has to be shared, so it will
live inside the qdisc itself, at which point the netdev_queue
lock can no longer be used to hard-synchronize access to the
->qdisc pointer.
One operation we have to keep inside of qdisc_destroy() is the list
deletion. It is the only piece of state visible after the RCU grace
period, so we have to undo it early and under the appropriate
locking.
The operations in the RCU handler do not need any locking because
the qdisc tree is no longer visible to anything at that point.

Signed-off-by: David S. Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
---
net/sched/sch_generic.c | 20 +++++++++++---------
1 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 7e078c5..082db8a 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -545,6 +545,17 @@ EXPORT_SYMBOL(qdisc_reset);
static void __qdisc_destroy(struct rcu_head *head)
{
struct Qdisc *qdisc = container_of(head, struct Qdisc, q_rcu);
+ const struct Qdisc_ops *ops = qdisc->ops;
+
+ gen_kill_estimator(&qdisc->bstats, &qdisc->rate_est);
+ if (ops->reset)
+ ops->reset(qdisc);
+ if (ops->destroy)
+ ops->destroy(qdisc);
+
+ module_put(ops->owner);
+ dev_put(qdisc_dev(qdisc));
+
kfree((char *) qdisc - qdisc->padded);
}
@@ -552,21 +563,12 @@ static void __qdisc_destroy(struct rcu_head *head)
void qdisc_destroy(struct Qdisc *qdisc)
{
- const struct Qdisc_ops *ops = qdisc->ops;
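
The archived diff is cut off at this point. Going only by the
changelog above (just the list unlink stays outside the grace
period), the reworked function plausibly reduces to the following
sketch; treat it as a hedged reconstruction for illustration, not
the verbatim patch:

	/* Hedged reconstruction from the changelog, not the verbatim
	 * patch: qdisc_destroy() now only drops the reference and
	 * unlinks the qdisc from the list (the one piece of state
	 * still visible to others, undone under RTNL), then defers
	 * the rest to the RCU callback above. */
	void qdisc_destroy(struct Qdisc *qdisc)
	{
		if (qdisc->flags & TCQ_F_BUILTIN ||
		    !atomic_dec_and_test(&qdisc->refcnt))
			return;

		list_del(&qdisc->list);

		call_rcu(&qdisc->q_rcu, __qdisc_destroy);
	}
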
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Patrick McHardy @ 2008-07-17 13:03 UTC
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

David Miller wrote:
> This allows less strict control of access to the qdisc attached to a
> netdev_queue. It even becomes legal to enqueue into a qdisc that is
> in the process of being destroyed; the RCU handler will toss out
> those packets.
>
> We will need this to handle sharing of a qdisc amongst multiple
> TX queues. In such a setup the lock has to be shared, so it will
> live inside the qdisc itself, at which point the netdev_queue
> lock can no longer be used to hard-synchronize access to the
> ->qdisc pointer.
>
> One operation we have to keep inside of qdisc_destroy() is the list
> deletion. It is the only piece of state visible after the RCU grace
> period, so we have to undo it early and under the appropriate
> locking.
>
> The operations in the RCU handler do not need any locking because
> the qdisc tree is no longer visible to anything at that point.

Still working my way through the patches, but this one caught my
eye (we had this before and it caused quite a few problems).

One of the problems is that only the uppermost qdisc is destroyed
immediately; child qdiscs are still visible on qdisc_list and are
removed without any locking from the RCU callback. There are also
visibility issues for classifiers and actions deeper down in the
hierarchy.

The previous way to work around this was quite ugly: qdisc_destroy()
walked the entire hierarchy to unlink inner classes immediately from
the qdisc_list (commit 85670cc1f changed it to what we do now). That
fixed visibility issues for everything visible only through qdiscs
(child qdiscs and classifiers). Actions are also visible globally,
so this might still be a problem; not sure though, since they don't
refer to their parent (I haven't thought about it much yet).

Another problem we had earlier with this was that qdiscs previously
assumed changes (destruction) would only happen in process context
and thus didn't disable BHs when taking a read_lock for walking the
hierarchy (deadlocking with a write_lock in BH context). This seems
to be handled correctly in your tree by always disabling BHs.

The remaining problem is data that was previously only used and
modified under the RTNL (u32_list is one example). Modifications
during destruction now need protection against concurrent use in
process context.

I still need to get a better understanding of how things work now,
so I won't suggest a fix until then :)

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-17 13:12 UTC
To: kaber-dcUjhNyLwpNeoWH0uzbU5w
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
Date: Thu, 17 Jul 2008 15:03:35 +0200

> Still working my way through the patches, but this one caught my
> eye (we had this before and it caused quite a few problems).

Indeed, it's the most delicate change.

Thanks for the info about all the tricky bits in this area.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Patrick McHardy @ 2008-07-17 13:48 UTC
To: David Miller
Cc: netdev, johannes, linux-wireless

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Thu, 17 Jul 2008 15:03:35 +0200
>
>> Still working my way through the patches, but this one caught my
>> eye (we had this before and it caused quite a few problems).
>
> Indeed, it's the most delicate change.
>
> Thanks for the info about all the tricky bits in this area.

One thought that occurred to me: we could avoid all the visibility
issues wrt. dev->qdisc_list by simply getting rid of it :)

If we move the qdisc list from the device to the root Qdisc itself,
it would become invisible automatically as soon as we assign a new
root qdisc to the netdev_queue. Iteration would become slightly more
complicated since we'd have to iterate over all netdev_queues, but I
think it should avoid most of the problems I mentioned (besides the
u32_list thing).

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-17 22:36 UTC
To: kaber-dcUjhNyLwpNeoWH0uzbU5w
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
Date: Thu, 17 Jul 2008 15:48:22 +0200

> One thought that occurred to me: we could avoid all the visibility
> issues wrt. dev->qdisc_list by simply getting rid of it :)
>
> If we move the qdisc list from the device to the root Qdisc itself,
> it would become invisible automatically as soon as we assign a new
> root qdisc to the netdev_queue. Iteration would become slightly more
> complicated since we'd have to iterate over all netdev_queues, but I
> think it should avoid most of the problems I mentioned (besides the
> u32_list thing).

What might make sense is to have a special Qdisc_root structure which
is simply:

	struct Qdisc_root {
		struct Qdisc		qd;
		struct list_head	qdisc_list;
	};

Everything about tree level synchronization would be type explicit.

Yes, as you say, the qdisc iteration would get slightly ugly, but
that doesn't seem to be a huge deal. And it seems a clean solution
to the child qdisc visibility problem.

About u32_list, that thing definitely needs some spinlock. The
consultation of that list, and the refcount mods, only occur during
config operations, so it's not like we have to grab this lock in the
data paths.

If we really want to sweep this problem under the rug, there is
another way: have the qdisc_destroy() RCU handler kick off a
workqueue, and grab the RTNL semaphore there during the final
destruction calls. :-)

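To make the workqueue idea above concrete, here is a minimal sketch
of deferring the final teardown from the RCU callback to process
context, where the RTNL can be taken. The destroy_work member and
the function names are illustrative assumptions, not code from any
posted patch:

	/* Sketch only: assumes a struct work_struct destroy_work is
	 * added to struct Qdisc.  RCU callbacks run in softirq
	 * context and cannot take the RTNL, so the final destruction
	 * calls are punted to a workqueue. */
	static void qdisc_destroy_work_fn(struct work_struct *work)
	{
		struct Qdisc *qdisc =
			container_of(work, struct Qdisc, destroy_work);

		rtnl_lock();
		/* final ->reset()/->destroy() calls, u32_list
		 * updates, etc., now safely under the RTNL */
		rtnl_unlock();

		kfree((char *)qdisc - qdisc->padded);
	}

	static void __qdisc_destroy(struct rcu_head *head)
	{
		struct Qdisc *qdisc =
			container_of(head, struct Qdisc, q_rcu);

		INIT_WORK(&qdisc->destroy_work, qdisc_destroy_work_fn);
		schedule_work(&qdisc->destroy_work);
	}
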
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Patrick McHardy @ 2008-07-17 23:58 UTC
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

David Miller wrote:
> What might make sense is to have a special Qdisc_root structure which
> is simply:
>
>	struct Qdisc_root {
>		struct Qdisc		qd;
>		struct list_head	qdisc_list;
>	};
>
> Everything about tree level synchronization would be type explicit.

Device level grafting is also explicit, so that looks like a clean
way.

> Yes, as you say, the qdisc iteration would get slightly ugly, but
> that doesn't seem to be a huge deal. And it seems a clean solution
> to the child qdisc visibility problem.
>
> About u32_list, that thing definitely needs some spinlock. The
> consultation of that list, and the refcount mods, only occur during
> config operations, so it's not like we have to grab this lock in the
> data paths.
>
> If we really want to sweep this problem under the rug, there is
> another way: have the qdisc_destroy() RCU handler kick off a
> workqueue, and grab the RTNL semaphore there during the final
> destruction calls. :-)

That would be the safe way. The RCU destruction used to cause us
bugs for at least two years, but I actually believe the Qdisc_root
thing will work :)

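For a sense of what the "slightly ugly" iteration would look like
with the list moved into Qdisc_root, here is a hedged sketch;
qdisc_root_of() and the list linkage are assumptions layered on top
of the struct proposed above, not existing API:

	/* Sketch only: walk every qdisc on a device when qdisc_list
	 * lives in the root qdisc of each TX queue rather than in
	 * the netdevice.  qdisc_root_of() is a hypothetical accessor
	 * for the embedding Qdisc_root. */
	static void qdisc_walk_dev(struct net_device *dev,
				   void (*fn)(struct Qdisc *))
	{
		unsigned int i;

		for (i = 0; i < dev->num_tx_queues; i++) {
			struct netdev_queue *txq =
				netdev_get_tx_queue(dev, i);
			struct Qdisc_root *root =
				qdisc_root_of(txq->qdisc);
			struct Qdisc *q;

			list_for_each_entry(q, &root->qdisc_list, list)
				fn(q);
		}
	}
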
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-17 13:35 UTC
To: Patrick McHardy
Cc: David Miller, netdev-u79uwXL29TY76Z2rM5mHXA,
    johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Thu, 2008-17-07 at 15:03 +0200, Patrick McHardy wrote:

> Actions are also visible globally, so this might still be a problem;
> not sure though, since they don't refer to their parent (I haven't
> thought about it much yet).

Actions are fine because they are intended to be globally shared
(i.e., a classifier on ethx with qdiscA:Y (in/egress) can share an
action with a classifier on ethy with qdiscB:Z (eg/ingress)).

Like you, I need to digest the patches to understand the impact on
the rest, but one thing I did notice was the last patch (the
replacement of pfifo_fast): prioritization based on TOS/DSCP
(setsockopt) would no longer work, and some user space code may
suffer (routing daemons, likely). One suggestion to fix it is to
load the pfifo qdisc (which does what fifo_fast is attempting) for
drivers that are h/ware multiq capable.

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Patrick McHardy @ 2008-07-17 14:02 UTC
To: hadi
Cc: David Miller, netdev, johannes, linux-wireless

jamal wrote:
> On Thu, 2008-17-07 at 15:03 +0200, Patrick McHardy wrote:
>
>> Actions are also visible globally, so this might still be a problem;
>> not sure though, since they don't refer to their parent (I haven't
>> thought about it much yet).
>
> Actions are fine because they are intended to be globally shared
> (i.e., a classifier on ethx with qdiscA:Y (in/egress) can share an
> action with a classifier on ethy with qdiscB:Z (eg/ingress)).

Yes, in that case it's not a problem. The case where it behaves
differently than now is when only a single reference exists.

> Like you, I need to digest the patches to understand the impact on
> the rest, but one thing I did notice was the last patch (the
> replacement of pfifo_fast): prioritization based on TOS/DSCP
> (setsockopt) would no longer work, and some user space code may
> suffer (routing daemons, likely). One suggestion to fix it is to
> load the pfifo qdisc (which does what fifo_fast is attempting) for
> drivers that are h/ware multiq capable.

That would perform prioritization within each qdisc; the individual
qdiscs would still be transmitted using separate HW queues though.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-17 22:24 UTC
To: kaber-dcUjhNyLwpNeoWH0uzbU5w
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, netdev-u79uwXL29TY76Z2rM5mHXA,
    johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
Date: Thu, 17 Jul 2008 16:02:20 +0200

> jamal wrote:
>> One suggestion to fix it is to load the pfifo qdisc (which does
>> what fifo_fast is attempting) for drivers that are h/ware multiq
>> capable.
>
> That would perform prioritization within each qdisc; the individual
> qdiscs would still be transmitted using separate HW queues though.

I think from certain perspectives it frankly doesn't matter.

It's not like the skb->priority field lets the SKB bypass the packets
already in the TX ring of the chip with a lower priority.

It is true that, once the TX ring is full, skb->priority thus begins
to have an influence on which packets are moved from the qdisc to the
TX ring of the device.

However, I wonder if we're so sure that we want to give normal users
that kind of power. Let's say for example that you set the highest
priority possible in the TOS socket option, and you do this for a ton
of UDP sockets, and you just blast packets out as fast as possible.
This backlogs the device TX ring, and if done effectively enough it
could keep other sockets blocked out of the device completely.

Are we really, really sure it's OK to let users do this? :)

To me, as a default, TOS and DSCP really mean just on-wire priority.

If we absolutely want to, we can keep the old pfifo_fast around and
use it (shared on multiq) if a certain sysctl knob is set.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Patrick McHardy @ 2008-07-17 23:48 UTC
To: David Miller
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, netdev-u79uwXL29TY76Z2rM5mHXA,
    johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA

David Miller wrote:
> However, I wonder if we're so sure that we want to give normal users
> that kind of power. [...]
>
> Are we really, really sure it's OK to let users do this? :)
>
> To me, as a default, TOS and DSCP really mean just on-wire priority.
>
> If we absolutely want to, we can keep the old pfifo_fast around and
> use it (shared on multiq) if a certain sysctl knob is set.

No, I fully agree that this is too much detail :) It's highly
unlikely that this default behaviour is important on a per-packet
level :) I just meant to point out that using a pfifo is not going
to give the same behaviour as before.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-18 13:10 UTC
To: Patrick McHardy
Cc: David Miller, netdev, johannes, linux-wireless

On Fri, 2008-18-07 at 01:48 +0200, Patrick McHardy wrote:
> David Miller wrote:
>> I think from certain perspectives it frankly doesn't matter.
>>
>> It's not like the skb->priority field lets the SKB bypass the packets
>> already in the TX ring of the chip with a lower priority.
>>
>> It is true that, once the TX ring is full, skb->priority thus begins
>> to have an influence on which packets are moved from the qdisc to the
>> TX ring of the device.

Indeed, QoS is irrelevant unless there is congestion. The question is
whether the packets sitting on the fifo qdisc are being sorted fairly
when congestion kicks in. Remember, there is still a single wire even
on multiple rings ;-> If Woz (really) showed up at 9am and the
Broussards at 3am[1] on that single (congestion-buffering) FIFO
waiting for the shop/wire to open up, then Woz should jump the queue
(if he deserves it) when the shop opens at 10am.

If queues are building up, then by definition you have congestion
somewhere; in other words, some resource (wire bandwidth,
code-efficiency/cpu, bus, the remote end being slow, etc.) is not
keeping up.

I am sorry, I haven't read the patches sufficiently to answer that
question, but I suspect that stashing the packets into different
hardware queues already solves this, since the hardware does whatever
scheduling it needs to on the rings.

>> However, I wonder if we're so sure that we want to give normal users
>> that kind of power. Let's say for example that you set the highest
>> priority possible in the TOS socket option, and you do this for a ton
>> of UDP sockets, and you just blast packets out as fast as possible.
>> This backlogs the device TX ring, and if done effectively enough it
>> could keep other sockets blocked out of the device completely.
>>
>> Are we really, really sure it's OK to let users do this? :)

We do today. If it is a concern, one could make the setsockopts
preferential (for example via SELinux or setting caps in the kernel,
etc.).

>> To me, as a default, TOS and DSCP really mean just on-wire priority.

Agreed, with the caveat above on congestion, i.e., it is still a
single wire even with multiple rings.

>> If we absolutely want to, we can keep the old pfifo_fast around and
>> use it (shared on multiq) if a certain sysctl knob is set.
>
> No, I fully agree that this is too much detail :) It's highly
> unlikely that this default behaviour is important on a per-packet
> level :) I just meant to point out that using a pfifo is not going
> to give the same behaviour as before.

IMO, if non-multiq drivers continue to work as before with the prios,
then nice. Multiq could be tuned over a period of time.

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-18 13:27 UTC
To: Patrick McHardy
Cc: David Miller, netdev, johannes, linux-wireless

On Fri, 2008-18-07 at 09:10 -0400, jamal wrote:

> Remember, there is still a single wire even on multiple rings ;->
> If Woz (really) showed up at 9am and the Broussards at 3am[1] on that
> single (congestion-buffering) FIFO waiting for the shop/wire to open
> up, then Woz should jump the queue (if he deserves it) when the shop
> opens at 10am.

Sorry, here is the URL of Woz allegedly jumping the queue:
http://news.cnet.com/8301-1023_3-9989823-93.html?hhTest=1%C3%A2%C2%88%C2%82=rss&subj=news&tag=2547-1_3-0-20

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-18 21:05 UTC
To: hadi-fAAogVwAN2Kw5LPnMra/2Q
Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA,
    johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: jamal <hadi-fAAogVwAN2Kw5LPnMra/2Q@public.gmane.org>
Date: Fri, 18 Jul 2008 09:27:21 -0400

> Sorry, here is the URL of Woz allegedly jumping the queue:
> http://news.cnet.com/8301-1023_3-9989823-93.html?hhTest=1%C3%A2%C2%88%C2%82=rss&subj=news&tag=2547-1_3-0-20

The fundamental issue is what we believe qdiscs schedule: do they
schedule a device, or do they schedule what their namesake implies,
"queues"?

Logically, once we have multiple queues, we schedule queues.

Therefore what probably makes sense is that for mostly stateless
priority queueing such as pfifo_fast, doing prioritization at the
queue level is OK.

But where non-trivial classification occurs, we have to either:

1) Make the queue selection match what the classifiers would do,
   exactly.

OR

2) Point all the queues at a single device-global qdisc.

What we have now implements #2. Later on we can try to do something
as sophisticated as #1.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-20 15:16 UTC
To: David Miller
Cc: kaber, netdev, johannes, linux-wireless

On Fri, 2008-18-07 at 14:05 -0700, David Miller wrote:

> The fundamental issue is what we believe qdiscs schedule: do they
> schedule a device, or do they schedule what their namesake implies,
> "queues"?

In the simple case of a single hardware queue/ring, the mapping
between a hardware queue and the "physical wire" is one-to-one, so in
that case one could argue the root qdiscs are scheduling a device.

> Logically, once we have multiple queues, we schedule queues.
> Therefore what probably makes sense is that for mostly stateless
> priority queueing such as pfifo_fast, doing prioritization at the
> queue level is OK.

IMO, in the case of multiple hardware queues per physical wire, where
such a netdevice already has a built-in hardware scheduler (they all
seem to have this feature), then if we can feed the hardware queues
directly, there's no need for any intermediate buffer(s). In such a
case, to compare with the qdisc arch, it's like the root qdisc is in
hardware. The only need for intermediate s/ware queues is for cases
of congestion. If you had a single software queue for each hardware
queue, you would still have the obligation of correctness to make
sure higher-prio hardware rings get fed with packets first (depending
on the hardware's scheduling capability).

> But where non-trivial classification occurs, we have to either:
>
> 1) Make the queue selection match what the classifiers would do,
>    exactly.
>
> OR
>
> 2) Point all the queues at a single device-global qdisc.
>
> What we have now implements #2. Later on we can try to do something
> as sophisticated as #1.

Sure; I think you could achieve the goals by using the single queue
with a software pfifo_fast which maps skb->prio to hardware queues.
Such a pfifo_fast may even sit in the driver. This queue will always
be empty unless you have congestion. The other thing is to make sure
there is an upper bound to the size of this queue; otherwise a remote
bug could cause it to grow infinitely and consume all memory.

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-20 17:25 UTC
To: hadi
Cc: kaber, netdev, johannes, linux-wireless

From: jamal <hadi@cyberus.ca>
Date: Sun, 20 Jul 2008 11:16:03 -0400

> IMO, in the case of multiple hardware queues per physical wire, where
> such a netdevice already has a built-in hardware scheduler (they all
> seem to have this feature), then if we can feed the hardware queues
> directly, there's no need for any intermediate buffer(s). In such a
> case, to compare with the qdisc arch, it's like the root qdisc is in
> hardware.

They tend to implement round-robin or some similar fairness algorithm
amongst the queues, with zero concern about packet priorities.

It really is just like a bunch of queues to the physical layer,
fairly shared.

These things are built for parallelization, not prioritization.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Tomas Winkler @ 2008-07-20 17:34 UTC
To: David Miller
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Sun, Jul 20, 2008 at 8:25 PM, David Miller wrote:

> They tend to implement round-robin or some similar fairness algorithm
> amongst the queues, with zero concern about packet priorities.
>
> It really is just like a bunch of queues to the physical layer,
> fairly shared.
>
> These things are built for parallelization, not prioritization.

Except wireless, where the HW has a prioritizing scheduler per
physical non-wire.

Tomas

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-20 17:35 UTC
To: tomasw-Re5JQEeQqe8AvxtiuMwx3w
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: "Tomas Winkler" <tomasw-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Sun, 20 Jul 2008 20:34:18 +0300

> Except wireless, where the HW has a prioritizing scheduler per
> physical non-wire.

I know that. We're talking about multiqueue ethernet NICs.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-20 22:32 UTC
To: David Miller
Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA,
    johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Sun, 2008-20-07 at 10:25 -0700, David Miller wrote:

> They tend to implement round-robin or some similar fairness algorithm
> amongst the queues, with zero concern about packet priorities.

pfifo_fast would be a bad choice in that case, but even a pfifo cannot
guarantee proper RR, because it would present packets in FIFO order
(for example, the first 10 could go to hardware queue1 and the next to
hardware queue2).

My view: I think you need a software queue per hardware queue. Maybe
these queues could even reside in the driver; that way you take care
of congestion, and it doesn't matter whether the hardware is RR or
strict prio (and you don't need the pfifo or pfifo_fast anymore).

The use case would be something along these lines: a packet comes in,
you classify it and find it is for queue1, grab the per-hardware-queue1
lock, find that hardware queue1 is overloaded, and stash the packet in
s/ware queue1 instead. If hardware queue1 wasn't congested, the packet
would go directly onto it. When hardware queue1 becomes available and
is netif-woken, you pick first from s/ware queue1 (and batching could
apply cleanly here) and send those packets to the hardware queue.

> It really is just like a bunch of queues to the physical layer,
> fairly shared.

I am surprised prioritization is not an issue. [My understanding of
the intel/cisco datacentre cabal is that they serve virtual machines
using virtual wires; I would think in such scenarios you'd have some
customers who pay more than others.]

> These things are built for parallelization, not prioritization.

Total parallelization happens in the ideal case. If X CPUs classify
packets going to X different hardware queues, each CPU grabs only the
locks for its hardware queue. In virtualization, where only one
customer's traffic is going to a specific hardware queue, things would
work well. A non-virtualization scenario may result in collisions, in
which two or more CPUs contend for the same hardware queue (either
transmitting or netif-waking, etc.).

cheers,
jamal

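A rough sketch of the scheme jamal describes above, with every name
hypothetical: one bounded software backlog per hardware ring, used
only while that ring is congested and drained in FIFO order when the
ring is woken. hw_ring_xmit() and SWQ_MAX_BACKLOG are illustrative,
not real API:

	/* Hypothetical sketch of per-hardware-queue buffering. */
	struct swq {
		spinlock_t		lock;
		struct sk_buff_head	backlog;
	};

	static int swq_xmit(struct net_device *dev, struct swq *sq,
			    struct netdev_queue *txq, struct sk_buff *skb)
	{
		int ret;

		spin_lock(&sq->lock);
		if (netif_tx_queue_stopped(txq) ||
		    skb_queue_len(&sq->backlog)) {
			/* Ring congested (or still draining): buffer
			 * in FIFO order, and bound the backlog so a
			 * wedged ring cannot eat all memory, as noted
			 * above. */
			if (skb_queue_len(&sq->backlog) < SWQ_MAX_BACKLOG) {
				__skb_queue_tail(&sq->backlog, skb);
				ret = NET_XMIT_CN;
			} else {
				kfree_skb(skb);
				ret = NET_XMIT_DROP;
			}
		} else {
			/* Uncongested: feed the hardware directly. */
			ret = hw_ring_xmit(dev, txq, skb);
		}
		spin_unlock(&sq->lock);
		return ret;
	}
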
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-20 23:59 UTC
To: hadi
Cc: kaber, netdev, johannes, linux-wireless

From: jamal <hadi@cyberus.ca>
Date: Sun, 20 Jul 2008 18:32:50 -0400

> pfifo_fast would be a bad choice in that case, but even a pfifo cannot
> guarantee proper RR, because it would present packets in FIFO order
> (for example, the first 10 could go to hardware queue1 and the next to
> hardware queue2).

Jamal, I have to wonder why you are so hung up on matching our
software qdiscs with whatever fairness algorithm the hardware happens
to implement internally for the TX queues.

Every time this topic comes up, you insist on them having to match,
and I have no idea why. It's largely irrelevant, and the TX queue can
be viewed purely as a buffer between us and the device, nearly a
black hole we stick packets into.

>> These things are built for parallelization, not prioritization.
>
> Total parallelization happens in the ideal case. If X CPUs classify
> packets going to X different hardware queues, each CPU grabs only the
> locks for its hardware queue.

The problem is that the bottleneck is the qdisc itself, since all
those cpus synchronize on its lock. We can't have a shared qdisc for
the device and get full parallelization.

That's why we're having one qdisc per TX queue, so that they all
don't bunch up on the qdisc lock.

Otherwise, there is zero point in all of these TX multiqueue features
in the hardware if we can't parallelize things fully.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-21 2:20 UTC
To: David Miller
Cc: kaber, netdev, johannes, linux-wireless

On Sun, 2008-20-07 at 16:59 -0700, David Miller wrote:

> Every time this topic comes up, you insist on them having to match,
> and I have no idea why.

I don't insist on them matching, just on correctness, i.e., if you
say you have RR, then the scheduling needs to meet those
requirements, not an estimate; that's all.

> The problem is that the bottleneck is the qdisc itself, since all
> those cpus synchronize on its lock. We can't have a shared qdisc for
> the device and get full parallelization.
>
> That's why we're having one qdisc per TX queue, so that they all
> don't bunch up on the qdisc lock.

That last sentence I have no issues with; it is what I thought wasn't
happening ;-> I misunderstood it to be a single fifo shared by all
hardware tx queues from the beginning (otherwise I wouldn't be
posting).

We are in sync, I think: a single pfifo per TX queue is the way to
go. I was suggesting it go in the driver, but this is cleaner. In the
future, one could actually replace the pfifo with another qdisc,
since the single virtual wire becomes equivalent to a single virtual
netdevice.

> Otherwise, there is zero point in all of these TX multiqueue features
> in the hardware if we can't parallelize things fully.

Parallelization is achievable in the ideal case.

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-21 11:20 UTC
To: David Miller
Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA,
    johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Sun, 2008-20-07 at 22:20 -0400, jamal wrote:

> We are in sync, I think: a single pfifo per TX queue is the way to
> go. I was suggesting it go in the driver, but this is cleaner.

Actually, can I modify that thought and go back to my initial
contention, now that things are making more sense? ;->

A single s/ware queue per hardware transmit queue is good, but that
queue being pfifo_fast would be a lot better. It would, at a minimum,
keep things as they were for non-multiq, and it is a sane choice for
any virtual wire.

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-21 16:45 UTC
To: hadi
Cc: kaber, netdev, johannes, linux-wireless

From: jamal <hadi@cyberus.ca>
Date: Mon, 21 Jul 2008 07:20:01 -0400

> A single s/ware queue per hardware transmit queue is good, but that
> queue being pfifo_fast would be a lot better. It would, at a minimum,
> keep things as they were for non-multiq, and it is a sane choice for
> any virtual wire.

Ok, I'll revert the change that turned pfifo_fast into plain
fifo_fast.

Thanks.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Herbert Xu @ 2008-07-21 11:58 UTC
To: hadi-fAAogVwAN2Kw5LPnMra/2Q
Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

jamal <hadi-fAAogVwAN2Kw5LPnMra/2Q@public.gmane.org> wrote:
>
>> Otherwise, there is zero point in all of these TX multiqueue features
>> in the hardware if we can't parallelize things fully.
>
> Parallelization is achievable in the ideal case.

I think I get you now. You're suggesting that we essentially do what
Dave has right now in the non-contending case, i.e., bypass the qdisc
so we get fully parallel processing until one of the hardware queues
seizes up.

At that point you'd stop all queues and make every packet go through
the software qdisc to ensure ordering. This continues until all
queues have vacancies again.

If this is what you're suggesting, then I think that will offer
pretty much the same behaviour as what we've got, while still
offering at least some (perhaps even most, but that is debatable) of
the benefits of multi-queue.

At this point I don't think this is something that we need right now,
but it would be good to make sure that the architecture allows such a
thing to be implemented in the future.

Cheers,
Herbert

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-21 13:08 UTC
To: Herbert Xu
Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Mon, 2008-21-07 at 19:58 +0800, Herbert Xu wrote:

> I think I get you now. You're suggesting that we essentially do what
> Dave has right now in the non-contending case, i.e., bypass the qdisc
> so we get fully parallel processing until one of the hardware queues
> seizes up.

Yes. That way there is no need for intermediate queueing. As it is
now, packets first get queued to the qdisc, and then we dequeue and
send to the driver even when the driver would be happy to take them
directly. That approach is fine if you want to support
non-work-conserving schedulers on single-hwqueue hardware.

> At that point you'd stop all queues and make every packet go through
> the software qdisc to ensure ordering. This continues until all
> queues have vacancies again.

I always visualize these as a single netdevice per hardware tx queue.
If I understood correctly, ordering is taken care of already in the
current patches because the stateless filter selects a hardware
queue. Dave has those queues sitting at the qdisc level (as pfifo),
which seems better in retrospect (than what I was thinking, that they
should sit in the driver) because one could decide they want to shape
packets in the future on a per-virtual-customer-sharing-a-virtual-wire
basis and attach an HTB instead.

The one thing I am still unsure of: I think it would be cleaner to
just stop a single queue (instead of all) when one hardware queue
fills up, i.e., if there is no congestion on the other hardware
queues, packets should continue to be fed to their hardware queues
and not be buffered at the qdisc level.

> If this is what you're suggesting, then I think that will offer
> pretty much the same behaviour as what we've got, while still
> offering at least some (perhaps even most, but that is debatable) of
> the benefits of multi-queue.
>
> At this point I don't think this is something that we need right now,
> but it would be good to make sure that the architecture allows such a
> thing to be implemented in the future.

I think it is a pretty good first start (I am a lot more optimistic,
to be honest). Parallelization would work if you can get X CPUs to
send to X hardware queues concurrently. That is feasible in a static
host setup like a virtualization environment, where you can tie a VM
to a CPU; it is not very feasible in routing, where you are driven to
a random hardware tx queue by arriving packets.

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Herbert Xu @ 2008-07-21 13:19 UTC
To: jamal
Cc: davem, kaber, netdev, johannes, linux-wireless

On Mon, Jul 21, 2008 at 09:08:44AM -0400, jamal wrote:
>
> The one thing I am still unsure of: I think it would be cleaner to
> just stop a single queue (instead of all) when one hardware queue
> fills up, i.e., if there is no congestion on the other hardware
> queues, packets should continue to be fed to their hardware queues
> and not be buffered at the qdisc level.

Right, you don't necessarily have to stop all queues, but you do need
to direct all packets into the qdisc.

> Parallelization would work if you can get X CPUs to send to X
> hardware queues concurrently. That is feasible in a static host
> setup like a virtualization environment, where you can tie a VM to a
> CPU; it is not very feasible in routing, where you are driven to a
> random hardware tx queue by arriving packets.

It should work just fine for routing, assuming your card does
multi-queue RX.

Cheers,
Herbert

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: jamal @ 2008-07-21 13:56 UTC
To: Herbert Xu
Cc: davem, kaber, netdev, johannes, linux-wireless

On Mon, 2008-21-07 at 21:19 +0800, Herbert Xu wrote:

> It should work just fine for routing, assuming your card does
> multi-queue RX.

Ok. You may be able to pull it off if you have the exact same hashing
on hardware rx as you do on the transmit stateless filtering. I can
see a traffic pattern that could be cooked to give some good numbers
in such a scenario ;->

cheers,
jamal

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Herbert Xu @ 2008-07-21 13:58 UTC
To: jamal
Cc: davem, kaber, netdev, johannes, linux-wireless

On Mon, Jul 21, 2008 at 09:56:25AM -0400, jamal wrote:
>
> Ok. You may be able to pull it off if you have the exact same hashing
> on hardware rx as you do on the transmit stateless filtering. I can
> see a traffic pattern that could be cooked to give some good numbers
> in such a scenario ;->

Actually you've hit it on the head: as an alternative to TX hashing
on the packet content, we need to allow TX queue selection based on
the current CPU ID.

Cheers,
Herbert

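As a concrete illustration of that idea, a hedged sketch of a
CPU-keyed queue selection hook follows; it is not from any driver or
posted patch, and as the replies below point out, it is only safe
when RX steering and process pinning already keep each flow on one
CPU:

	/* Illustrative sketch of CPU-based TX queue selection; not
	 * from any actual driver.  Safe only when each flow is
	 * already pinned to a single CPU end to end, otherwise it
	 * reorders packets within a flow. */
	static u16 cpu_tx_select_queue(struct net_device *dev,
				       struct sk_buff *skb)
	{
		return (u16)(smp_processor_id() %
			     dev->real_num_tx_queues);
	}
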
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-21 15:09 UTC
To: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Date: Mon, 21 Jul 2008 21:58:46 +0800

> Actually you've hit it on the head: as an alternative to TX hashing
> on the packet content, we need to allow TX queue selection based on
> the current CPU ID.

This we should avoid; it would allow reordering within a flow.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Herbert Xu @ 2008-07-21 15:22 UTC
To: David Miller
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Mon, Jul 21, 2008 at 08:09:01AM -0700, David Miller wrote:
>
>> Actually you've hit it on the head: as an alternative to TX hashing
>> on the packet content, we need to allow TX queue selection based on
>> the current CPU ID.
>
> This we should avoid; it would allow reordering within a flow.

Not if the RX hashing is flow-based...

Cheers,
Herbert

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: David Miller @ 2008-07-21 15:26 UTC
To: herbert
Cc: hadi, kaber, netdev, johannes, linux-wireless

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 21 Jul 2008 23:22:20 +0800

> On Mon, Jul 21, 2008 at 08:09:01AM -0700, David Miller wrote:
>>
>> This we should avoid; it would allow reordering within a flow.
>
> Not if the RX hashing is flow-based...

Can you control the process scheduler and where it decides to run the
process calling sendmsg() too?

That's the problem. It is totally unwise to do CPU-based TX hashing.

* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.

From: Herbert Xu @ 2008-07-21 16:16 UTC
To: David Miller
Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
    netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
    linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Mon, Jul 21, 2008 at 08:26:27AM -0700, David Miller wrote:
>
> It is totally unwise to do CPU-based TX hashing.

Right, I'm not suggesting having this as a default. However, if you
have a finely tuned system (e.g., a router) where you've pinned all
your RX queues to specific CPUs, and your local apps as well, then it
would make sense to provide this as an alternative.

If this alternative doesn't exist, then unless the RX hash happens to
match the TX hash, for routing at least the packets are going to jump
all over the place, which isn't nice.

Cheers,
Herbert

[parent not found: <20080721161616.GA12938-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  [not found] ` <20080721161616.GA12938-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
@ 2008-07-21 16:25 ` David Miller
  2008-07-21 16:43 ` Herbert Xu
  0 siblings, 1 reply; 58+ messages in thread
From: David Miller @ 2008-07-21 16:25 UTC (permalink / raw)
  To: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q
  Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
	netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

From: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Date: Tue, 22 Jul 2008 00:16:16 +0800

> On Mon, Jul 21, 2008 at 08:26:27AM -0700, David Miller wrote:
> >
> > It is totally unwise to do CPU based TX hashing.
>
> Right, I'm not suggesting having this as a default.  However,
> if you have a finely tuned system (e.g., a router) where you've
> pinned all your RX queues to specific CPUs and your local apps
> as well, then it would make sense to provide this as an alternative.
>
> If this alternative doesn't exist, then unless the RX hash happens
> to match the TX hash, for routing at least the packets are going
> to jump all over the place, which isn't nice.

Where are these places they are going to "jump all over"? :-)

You can view the RX and TX queues as roughly independent namespaces.

If the TX hash is good enough (current one certainly isn't and I will
work on fixing that), it is likely to spread the accesses enough that
there won't be many collisions to matter.

If you really think about it, this is even pointless.  Packets that
hit RX queue A will always hit TX queue B, and so on and so forth.
Therefore there will be no new level of separation afforded by using
some kind of strict RX to TX mapping.

We could provide the option, but it is so dangerous and I also see no
real tangible benefit from it.

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  2008-07-21 16:25 ` David Miller
@ 2008-07-21 16:43 ` Herbert Xu
  2008-07-21 16:51 ` David Miller
  0 siblings, 1 reply; 58+ messages in thread
From: Herbert Xu @ 2008-07-21 16:43 UTC (permalink / raw)
  To: David Miller; +Cc: hadi, kaber, netdev, johannes, linux-wireless

On Mon, Jul 21, 2008 at 09:25:56AM -0700, David Miller wrote:
>
> Where are these places they are going to "jump all over"? :-)

Well consider the case where you have 4 queues, but a large number
of flows per second (>= 1000).  No matter how good your hash is,
there is just no way of squeezing 1000 flows into 4 queues without
getting loads of collisions :)

So let's assume that these flows have been distributed uniformly
by both the RX hash and the TX hash such that each queue is handling
~250 flows.  If the TX hash does not match the result produced by
the RX hash, you're going to get a hell of a lot of contention once
you get into the NIC driver on the TX side.  This is because for
NICs like the ones from Intel you have to protect the TX queue
accesses so that only one CPU touches a given queue at any point
in time.

The end result is either the driver being bogged down by lock or
TX queue contention, or the mid-layer will have to redistribute
skb's to the right CPUs in which case the synchronisation cost is
simply moved over there.

> If the TX hash is good enough (current one certainly isn't and I will
> work on fixing that), it is likely to spread the accesses enough that
> there won't be many collisions to matter.

I agree that what you've got here makes total sense for a host.
But I think there is room for something different for routers.

> We could provide the option, but it is so dangerous and I also see no
> real tangible benefit from it.

The benefit as far as I can see is that this would allow a packet's
entire journey through Linux to stay on exactly one CPU.  There will
be zero memory written by multiple CPUs as far as that packet is
concerned.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 58+ messages in thread
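To put rough numbers on the contention Herbert describes, here is a minimal
stand-alone sketch (user-space C, not kernel code; the queue and flow counts
are illustrative only) of what happens when the RX and TX hashes are
independent:

#include <stdio.h>
#include <stdlib.h>

#define NQUEUES	4
#define NFLOWS	1000

int main(void)
{
	int i, mismatches = 0;

	srandom(1);
	for (i = 0; i < NFLOWS; i++) {
		/* stand-ins for the NIC's RSS hash and simple_tx_hash() */
		int rxq = random() % NQUEUES;
		int txq = random() % NQUEUES;

		if (rxq != txq)
			mismatches++;
	}
	printf("%d of %d flows (%.0f%%) transmit from a CPU that does not own their TX queue\n",
	       mismatches, NFLOWS, 100.0 * mismatches / NFLOWS);
	return 0;
}

With independent uniform hashes this prints roughly 75% for 4 queues, i.e.
most flows pay the cross-CPU synchronisation cost on every transmit.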
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-07-21 16:43 ` Herbert Xu @ 2008-07-21 16:51 ` David Miller [not found] ` <20080721.095124.89249903.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: David Miller @ 2008-07-21 16:51 UTC (permalink / raw) To: herbert; +Cc: hadi, kaber, netdev, johannes, linux-wireless From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 22 Jul 2008 00:43:06 +0800 > So let's assume that these flows have been distributed uniformly > by both the RX hash and the TX hash such that each queue is handling > ~250 flows. If the TX hash does not match the result produced by > the RX hash, you're going to get a hell lot of contention once you > get into the NIC driver on the TX side. How so? If the TX hash is well distributed, which it should be, it is at least going to approximate the distribution provided by the RX hash. > The benefit as far as I can see is that this would allow a packet's > entire journey through Linux to stay on exactly one CPU. There will > be zero memory written by multiple CPUs as far as that packet is > concerned. So will the existing one, enough of the time for it to matter, and yes even on a router or firewall. At least this is my belief :-) ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <20080721.095124.89249903.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  [not found] ` <20080721.095124.89249903.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2008-07-21 17:02 ` Herbert Xu
  [not found] ` <20080721170233.GA13417-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  0 siblings, 1 reply; 58+ messages in thread
From: Herbert Xu @ 2008-07-21 17:02 UTC (permalink / raw)
  To: David Miller
  Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
	netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Mon, Jul 21, 2008 at 09:51:24AM -0700, David Miller wrote:
>
> How so?  If the TX hash is well distributed, which it should be,
> it is at least going to approximate the distribution provided by
> the RX hash.

This is a matter of probabilities :) In general, if the TX hash
and the RX hash are completely unrelated, then the end result of
the hash distribution should be independent of each other (independent
in the probabilistic sense).  That is, for the flows which have
been RX hashed into one queue, they should be hashed on average
across all queues by the TX hash.  Conversely, those that have
been hashed into one TX queue would be distributed across all
RX queues.

Now if you've bound each RX queue to a specific CPU, then this
means for a given TX queue, its packets are going to come from
all the CPUs (well, all those involved in network reception anyway).
As each TX queue is designed to be accessed by only one CPU at a
time, somebody somewhere has to pay for all this synchronisation :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 58+ messages in thread
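In expectation (assuming both hashes are uniform and independent over n
queues): for any given flow, P(txq == the queue owned by its RX CPU) = 1/n,
so a fraction (n-1)/n of flows cross CPUs at transmit time.  With n = 4 that
is about 75%, and each TX queue is fed by all n RX CPUs on average, which is
exactly the synchronisation cost described above.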
[parent not found: <20080721170233.GA13417-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080721170233.GA13417-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> @ 2008-07-21 17:08 ` David Miller [not found] ` <20080721.100821.38432201.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: David Miller @ 2008-07-21 17:08 UTC (permalink / raw) To: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA From: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Date: Tue, 22 Jul 2008 01:02:33 +0800 > On Mon, Jul 21, 2008 at 09:51:24AM -0700, David Miller wrote: > > > > How so? If the TX hash is well distributed, which it should be, > > it is at least going to approximate the distribution provided by > > the RX hash. > > This is a matter of probabilities :) In general, if the TX hash > and the RX hash are completely unrelated, then the end result of > the hash distribution should be independent of each other (independent > in the probablistic sense). That is, for the flows which have > been RX hashed into one queue, they should be hashed on average > across all queues by the TX hash. Conversely, those that have > been hashed into one TX queue would be distributed across all > RX queues. Theoretically perhaps you are right. Can I at least get some commitment that someone will test that this really is necessary before we add the CPU ID hash option? -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <20080721.100821.38432201.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080721.100821.38432201.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> @ 2008-07-21 17:11 ` Herbert Xu 2008-08-22 6:56 ` Herbert Xu 0 siblings, 1 reply; 58+ messages in thread From: Herbert Xu @ 2008-07-21 17:11 UTC (permalink / raw) To: David Miller Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Mon, Jul 21, 2008 at 10:08:21AM -0700, David Miller wrote: > > Can I at least get some commitment that someone will test > that this really is necessary before we add the CPU ID > hash option? Sure, I'll be testing some related things on this front so I'll try to produce some results that compare these two cases. For the purposes of demonstrating this I'll choose the worst-case scenario, that is, I'll ensure that the TX hash result exactly distributes each RX hash preimage across all possible TX hash values. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  2008-07-21 17:11 ` Herbert Xu
@ 2008-08-22  6:56 ` Herbert Xu
  2008-08-22  7:16 ` David Miller
  [not found] ` <20080822065655.GA18471-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  0 siblings, 2 replies; 58+ messages in thread
From: Herbert Xu @ 2008-08-22 6:56 UTC (permalink / raw)
  To: David Miller; +Cc: hadi, kaber, netdev, johannes, linux-wireless

On Tue, Jul 22, 2008 at 01:11:41AM +0800, Herbert Xu wrote:
> On Mon, Jul 21, 2008 at 10:08:21AM -0700, David Miller wrote:
> >
> > Can I at least get some commitment that someone will test
> > that this really is necessary before we add the CPU ID
> > hash option?
>
> Sure, I'll be testing some related things on this front so I'll
> try to produce some results that compare these two cases.

I haven't had a chance to do the test yet but I've just had an
idea of how we can get the best of both worlds.

The problem with always directing traffic based on the CPU alone
is that processes move around and we don't want to introduce packet
reordering because of that.

The problem with hashing based on packet headers alone is that
it doesn't take CPU affinity into account at all so we may end
up with a situation where one thread out of a thread pool (e.g.,
a web server) has n sockets which are hashed to n different
queues.

So here's the idea: we determine the tx queue for a flow based
on the CPU on which we saw its first packet.  Once we have decided
on a queue we store that in a dst object (see below).  This
ensures that all subsequent packets of that flow end up in
the same queue so there is no reordering.  It also avoids the
problem where traffic generated by one CPU gets scattered across
queues.

Of course to make this work we need to restart the flow cache
project so that we have somewhere to store this txq assignment.

The good thing is that a flow cache would be of benefit for IPsec
users too and I hear that there is some interest in doing that
in the immediate future.  So perhaps we can combine efforts and
use it for txq assignment as well.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 58+ messages in thread
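A minimal sketch of that idea (all names here are invented for illustration;
neither the flow entry nor the helper exists in the tree at this point):

/* Hypothetical sketch: bind a flow to the TX queue of the CPU that
 * submitted its first packet, then reuse that queue for the flow's
 * lifetime so no packet of the flow can be reordered, while traffic
 * generated by one CPU is not scattered across queues. */
struct flow_txq_entry {
	unsigned short	txq;
	int		txq_valid;
};

static unsigned short flow_select_txq(struct flow_txq_entry *fe,
				      unsigned int this_cpu,
				      unsigned int nqueues)
{
	if (!fe->txq_valid) {
		/* first packet of the flow: follow the submitting CPU */
		fe->txq = this_cpu % nqueues;
		fe->txq_valid = 1;
	}
	return fe->txq;
}

The entry would live in whatever per-flow object is available (a flow cache
node, or failing that a dst, as discussed below).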
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-08-22 6:56 ` Herbert Xu @ 2008-08-22 7:16 ` David Miller [not found] ` <20080822.001620.141235108.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> [not found] ` <20080822065655.GA18471-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> 1 sibling, 1 reply; 58+ messages in thread From: David Miller @ 2008-08-22 7:16 UTC (permalink / raw) To: herbert; +Cc: hadi, kaber, netdev, johannes, linux-wireless From: Herbert Xu <herbert@gondor.apana.org.au> Date: Fri, 22 Aug 2008 16:56:55 +1000 ... Storing initial TXQ assignment in dst entry idea ... > Of course to make this work we need to restart the flow cache > project so that we have somewhere to store this txq assignment. > > The good thing is that a flow cache would be of benefit for IPsec > users too and I hear that there is some interest in doing that > in the immediate future. So perhaps we can combine efforts and > use it for txq assignment as well. IPSEC already has a flow cache doesn't it? :) I suppose you're talking about the grand unified flow cache that never gets implemented right? I classify that in the same category as net channels at the moment, theoretically very interesting but no practical implementation in sight. ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <20080822.001620.141235108.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080822.001620.141235108.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> @ 2008-08-22 7:41 ` Herbert Xu [not found] ` <20080822074115.GB25615-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: Herbert Xu @ 2008-08-22 7:41 UTC (permalink / raw) To: David Miller Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Fri, Aug 22, 2008 at 12:16:20AM -0700, David Miller wrote: > > I suppose you're talking about the grand unified flow cache > that never gets implemented right? I classify that in the > same category as net channels at the moment, theoretically > very interesting but no practical implementation in sight. In any case, we could implement what I suggested even without a flow cache, by simply storing the info in the dst. This means that all flows on the same dst will end up in the same queue, which is not as good as what a flow cache could give, but it's useful enough to be an option. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <20080822074115.GB25615-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080822074115.GB25615-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> @ 2008-08-22 10:42 ` David Miller [not found] ` <20080822.034217.135152551.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> 0 siblings, 1 reply; 58+ messages in thread From: David Miller @ 2008-08-22 10:42 UTC (permalink / raw) To: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA From: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Date: Fri, 22 Aug 2008 17:41:15 +1000 > On Fri, Aug 22, 2008 at 12:16:20AM -0700, David Miller wrote: > > > > I suppose you're talking about the grand unified flow cache > > that never gets implemented right? I classify that in the > > same category as net channels at the moment, theoretically > > very interesting but no practical implementation in sight. > > In any case, we could implement what I suggested even without > a flow cache, by simply storing the info in the dst. This means > that all flows on the same dst will end up in the same queue, which > is not as good as what a flow cache could give, but it's useful > enough to be an option. I had another idea in the back of my head that I wanted to mention. The SKB has a hash, and sources set the hash. It doesn't matter where in the stack it comes from. At transmit time, the select queue logic in dev_queue_xmit() will use the SKB hash if one has been set already. Otherwise it will do the hashing it does currently. So in the simplest case for forwarding, the RX side puts the RSS hash or whatever into this SKB hash location. Then, taking things to the next level, protocols set hashes for locally created packets. And this leads to being able to delete the simple_tx_hash() code entirely and also we won't have to add support for every protocol on the planet to that function. :) TCP sockets maybe could even do something clever, like, using the incoming SKB hash of the SYN and SYN+ACK packets for all subsequent SKBs sent by that socket. Another nice side effect, we have the choice of allowing IPSEC encapsulation of locally generated frames to not change the TX queue. This would have to be controlled by a sysctl or similar because I can see both ways being useful in different circumstances. You get the idea. -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
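As a self-contained sketch of what that selection could look like (the field
and helper names are made up; nothing here is actual tree code):

/* Hypothetical: reuse a hash recorded earlier (the RX side's RSS hash
 * for forwarded packets, or one chosen by the protocol for locally
 * generated ones); only hash the headers as a last resort. */
struct sketch_skb {
	unsigned int recorded_hash;	/* 0 = none recorded */
};

static unsigned short select_tx_queue(const struct sketch_skb *skb,
				      unsigned int num_tx_queues,
				      unsigned int header_hash)
{
	unsigned int hash = skb->recorded_hash ? skb->recorded_hash
					       : header_hash; /* simple_tx_hash() stand-in */

	/* scale the 32-bit hash onto [0, num_tx_queues) */
	return (unsigned short)(((unsigned long long)hash * num_tx_queues) >> 32);
}

The point of the design is that queue selection stops caring where the hash
came from, so per-protocol knowledge moves out of the transmit path entirely.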
[parent not found: <20080822.034217.135152551.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080822.034217.135152551.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> @ 2008-08-22 10:47 ` Herbert Xu 2008-08-22 13:52 ` jamal 1 sibling, 0 replies; 58+ messages in thread From: Herbert Xu @ 2008-08-22 10:47 UTC (permalink / raw) To: David Miller Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Fri, Aug 22, 2008 at 03:42:17AM -0700, David Miller wrote: > > I had another idea in the back of my head that I wanted to mention. > > The SKB has a hash, and sources set the hash. It doesn't matter > where in the stack it comes from. Yes this sounds great! > So in the simplest case for forwarding, the RX side puts the RSS > hash or whatever into this SKB hash location. This works on all the routers. > Then, taking things to the next level, protocols set hashes for > locally created packets. And this leads to being able to delete > the simple_tx_hash() code entirely and also we won't have to add > support for every protocol on the planet to that function. :) And this takes care of the hosts. Best of all it's per flow and we don't even need to add any new infrastructure :) Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  [not found] ` <20080822.034217.135152551.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2008-08-22 10:47 ` Herbert Xu
@ 2008-08-22 13:52 ` jamal
  1 sibling, 0 replies; 58+ messages in thread
From: jamal @ 2008-08-22 13:52 UTC (permalink / raw)
  To: David Miller
  Cc: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q, kaber-dcUjhNyLwpNeoWH0uzbU5w,
	netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Fri, 2008-22-08 at 03:42 -0700, David Miller wrote:

> I had another idea in the back of my head that I wanted to mention.
>
> The SKB has a hash, and sources set the hash.  It doesn't matter
> where in the stack it comes from.
>
> At transmit time, the select queue logic in dev_queue_xmit() will use
> the SKB hash if one has been set already.  Otherwise it will do the
> hashing it does currently.
>
> So in the simplest case for forwarding, the RX side puts the RSS
> hash or whatever into this SKB hash location.

Makes sense. In the case Herbert described, I am assuming that the RSS
side could be configured to select a processor statically (MSI or
otherwise). And that the hash value will be that of the processor id,
ergo the tx queue selected.

> Then, taking things to the next level, protocols set hashes for
> locally created packets.  And this leads to being able to delete
> the simple_tx_hash() code entirely and also we won't have to add
> support for every protocol on the planet to that function. :)

Well, conntrack could be useful (as long as you dont make it mandatory)
since it already keeps track of flows. When unresolved, one could add it
to some default queue.

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread
[parent not found: <20080822065655.GA18471-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  [not found] ` <20080822065655.GA18471-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
@ 2008-08-22 13:43 ` jamal
  0 siblings, 0 replies; 58+ messages in thread
From: jamal @ 2008-08-22 13:43 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, kaber-dcUjhNyLwpNeoWH0uzbU5w,
	netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Fri, 2008-22-08 at 16:56 +1000, Herbert Xu wrote:
> On Tue, Jul 22, 2008 at 01:11:41AM +0800, Herbert Xu wrote:
> > On Mon, Jul 21, 2008 at 10:08:21AM -0700, David Miller wrote:

> I haven't had a chance to do the test yet but I've just had an
> idea of how we can get the best of both worlds.
>
> The problem with always directing traffic based on the CPU alone
> is that processes move around and we don't want to introduce packet
> reordering because of that.

Assuming multi-rx queues with configurable MSI or otherwise to map to a
receive processor, then in the case of routing/bridging or
otherfavoriteformofforwarding: If you tie static filters to a specific
cpu that will always work. So no reordering there. Local traffic i can
see migration/reordering happening.

> The problem with hashing based on packet headers alone is that
> it doesn't take CPU affinity into account at all so we may end
> up with a situation where one thread out of a thread pool (e.g.,
> a web server) has n sockets which are hashed to n different
> queues.

Indeed. In the forwarding case, the problem is not reordering, rather
that all flows will always end up on the same cpu. So you may end up
just overloading one cpu while the other 1023 stay idle. My earlier
statement was you could cook traffic scenarios where all 1024 are being
fully utilized (the operative term is "cook");->

> So here's the idea: we determine the tx queue for a flow based
> on the CPU on which we saw its first packet.  Once we have decided
> on a queue we store that in a dst object (see below).  This
> ensures that all subsequent packets of that flow end up in
> the same queue so there is no reordering.  It also avoids the
> problem where traffic generated by one CPU gets scattered across
> queues.

Wont work with static multi-rx nic; iirc, changing those filters is
_expensive_. So you want it to stay static.

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <E1KKu2D-0002kd-00-XQvu0L+U/CjiRBuR/1fSEKKkPtS2pBon@public.gmane.org> 2008-07-21 13:08 ` jamal @ 2008-07-21 17:35 ` David Miller 1 sibling, 0 replies; 58+ messages in thread From: David Miller @ 2008-07-21 17:35 UTC (permalink / raw) To: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q Cc: hadi-fAAogVwAN2Kw5LPnMra/2Q, kaber-dcUjhNyLwpNeoWH0uzbU5w, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA From: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Date: Mon, 21 Jul 2008 19:58:33 +0800 > I think I get you now. You're suggesting that we essentially > do what Dave has right now in the non-contending case, i.e., > bypassing the qdisc so we get fully parallel processing until > one of the hardware queues seizes up. > > At that point you'd stop all queues and make every packet go > through the software qdisc to ensure ordering. This continues > until all queues have vacancies again. > > If this is what you're suggesting, then I think that will offer > pretty much the same behaviour as what we've got, while still > offering at least some (perhaps even most, but that is debatable) > of the benefits of multi-queue. > > At this point I don't think this is something that we need right > now, but it would be good to make sure that the architecture > allows such a thing to be implemented in future. Doing something like the noqueue_qdisc (bypassing the qdisc entirely) is very attractive because it eliminates the qdisc lock. We're only left with the TX lock. This whole idea of doing an optimized send to the device when the TX queue has space is indeed tempting. But we can't do it for qdiscs that measure rates, enforce limits, etc. And if the device kicks back at us with an error, we have to perform the normal ->enqueue() path. -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
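Roughly, the tempting fast path reads like this (a kernel-flavoured sketch
under the caveats Dave lists, not actual tree code):

/* Sketch only: try the driver directly while the qdisc is empty and
 * the hardware queue has room, skipping the qdisc lock; fall back to
 * the normal ->enqueue() path on any trouble.  Not usable for qdiscs
 * that measure rates or enforce limits, since those must see every
 * packet. */
static int xmit_or_enqueue(struct sk_buff *skb, struct net_device *dev,
			   struct netdev_queue *txq, struct Qdisc *q)
{
	if (!qdisc_qlen(q) && !netif_tx_queue_stopped(txq)) {
		int rc;

		__netif_tx_lock(txq, smp_processor_id());
		rc = dev->hard_start_xmit(skb, dev);
		__netif_tx_unlock(txq);
		if (rc == NETDEV_TX_OK)
			return rc;
		/* device kicked back: fall through to the qdisc */
	}
	return q->enqueue(skb, q);
}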
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-07-18 13:10 ` jamal 2008-07-18 13:27 ` jamal @ 2008-07-18 17:10 ` Roland Dreier [not found] ` <adaljzz6ruq.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> 1 sibling, 1 reply; 58+ messages in thread From: Roland Dreier @ 2008-07-18 17:10 UTC (permalink / raw) To: hadi-fAAogVwAN2Kw5LPnMra/2Q Cc: Patrick McHardy, David Miller, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA > The question is whether the packets sitting on the fifo qdisc are being > sorted fairly when congestion kicks in. Remember there is still a single > wire still even on multiple rings;-> This is definitely true, but it is good to keep in mind that in the near future we will start to see things look a little like multiple "virtual wires." This is because of new ethernet standards like per-priority pause, which makes it possible that one hardware ring on a NIC can transmit while another ring is paused (possibly because of congestion far off in the network). - R. -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <adaljzz6ruq.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <adaljzz6ruq.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> @ 2008-07-20 14:58 ` jamal 0 siblings, 0 replies; 58+ messages in thread From: jamal @ 2008-07-20 14:58 UTC (permalink / raw) To: Roland Dreier Cc: Patrick McHardy, David Miller, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Fri, 2008-18-07 at 10:10 -0700, Roland Dreier wrote: > This is definitely true, but it is good to keep in mind that in the near > future we will start to see things look a little like multiple "virtual > wires." This is because of new ethernet standards like per-priority > pause, which makes it possible that one hardware ring on a NIC can > transmit while another ring is paused (possibly because of congestion > far off in the network). Thats essentially what i am arguing for. [I think some, not all, of the wireless qos schemes also have similar scheduling]. My understanding of these wired "datacentre/virtualization" schemes you describe is they are strict prio based. When the low prio "virtual wire" is contending for the "physical wire" with a higher prio "virtual wire", the high prio always wins. We just need to make sure this behavior is also maintained whatever buffering scheme is used within or above the driver(qdisc level). cheers, jamal -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-07-17 22:24 ` David Miller [not found] ` <20080717.152447.89672084.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> @ 2008-07-20 14:32 ` Herbert Xu 2008-07-20 17:20 ` David Miller 1 sibling, 1 reply; 58+ messages in thread From: Herbert Xu @ 2008-07-20 14:32 UTC (permalink / raw) To: David Miller; +Cc: kaber, hadi, netdev, johannes, linux-wireless David Miller <davem@davemloft.net> wrote: > > Are we really really sure it's OK to let users do this? :) Well security questions aside, I have seen this used on routers to ensure certain traffic (e.g., VOIP) get preferential treatment when it comes to transmission. Of course some of these routers setup custom qdiscs so they wouldn't be affected, but there may be ones out there relying on the default pfifo. Now I understand that having a pfifo is an anathema to multi-queue TX by definition. However, even if we couldn't preserve the full semantics by default, if we could at least preserve the ordering within each queue it would still be a plus. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-07-20 14:32 ` Herbert Xu @ 2008-07-20 17:20 ` David Miller 0 siblings, 0 replies; 58+ messages in thread From: David Miller @ 2008-07-20 17:20 UTC (permalink / raw) To: herbert; +Cc: kaber, hadi, netdev, johannes, linux-wireless From: Herbert Xu <herbert@gondor.apana.org.au> Date: Sun, 20 Jul 2008 22:32:35 +0800 > Now I understand that having a pfifo is an anathema to multi-queue TX > by definition. However, even if we couldn't preserve the full > semantics by default, if we could at least preserve the ordering > within each queue it would still be a plus. That's how I basically feel right now as well. So we can revert that change that killed pfifo_fast if we want. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-07-17 13:35 ` jamal 2008-07-17 14:02 ` Patrick McHardy @ 2008-07-20 14:20 ` Herbert Xu [not found] ` <E1KKZmE-0000ty-00-XQvu0L+U/CjiRBuR/1fSEKKkPtS2pBon@public.gmane.org> 1 sibling, 1 reply; 58+ messages in thread From: Herbert Xu @ 2008-07-20 14:20 UTC (permalink / raw) To: hadi-fAAogVwAN2Kw5LPnMra/2Q Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA jamal <hadi-fAAogVwAN2Kw5LPnMra/2Q@public.gmane.org> wrote: > > Actions are fine because they are intended to be globaly shared. > [i.e A classifier on ethx with qdiscA:Y (in/egress) can share an action > with classifer on ethy with qdiscB:Z (eg/ingress)]. Not all actions :) That nat action for example wasn't intended to be shared at all. In fact I still need to submit a patch to make it skip the shared hash as otherwise it simply won't scale as the number of nat actions increases (e.g., to 256K). Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <E1KKZmE-0000ty-00-XQvu0L+U/CjiRBuR/1fSEKKkPtS2pBon@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  [not found] ` <E1KKZmE-0000ty-00-XQvu0L+U/CjiRBuR/1fSEKKkPtS2pBon@public.gmane.org>
@ 2008-07-20 15:35 ` jamal
  2008-07-21  0:11 ` Herbert Xu
  0 siblings, 1 reply; 58+ messages in thread
From: jamal @ 2008-07-20 15:35 UTC (permalink / raw)
  To: Herbert Xu
  Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Sun, 2008-20-07 at 22:20 +0800, Herbert Xu wrote:

> Not all actions :) That nat action for example wasn't intended to
> be shared at all.  In fact I still need to submit a patch to make
> it skip the shared hash as otherwise it simply won't scale as the
> number of nat actions increases (e.g., to 256K).

True, sharing in the case of nat will cause scaling challenges because
there is per-action locking. So you dont want to share in that case.

Let me clarify the global "sharedness" of actions, because i dont think
there is an issue:
All actions (on a per-type hash table basis) have an index.
You create filter rule X and specify action nat.
You may specify the index of the action when you create the filter X.
If you then create another filter rule Y, also using the same action
index, then that nat action is shared between rule X and rule Y[1].
If you dont specify the index a new nat action is created.

So in essence, if you create 256K rules each with an action and as long
as you dont specify the action index, you should be fine because none
will be shared. The only scaling thing i can think of is to try and
make the nat action hash table large to reduce the init lookup. Other
than that, once the action is bound to a filter, lookup cost is zero.

cheers,
jamal

[1] This is useful for two reasons:
a) memory saving purposes: If you dont care that much about performance
or on a uniprocessor machine, one action would just be sufficient for
many rules.
b) accounting purposes; as you know qdiscs/filters/actions are
per-device. Over the years, a need has arisen from some users to have a
"per system" accounting (refer to the IMQ/IFB approach). Eg, if i wanted
the policer action to account for ingress eth0 and egress eth1 for a
user, i couldnt do it without some acrobatics.

^ permalink raw reply	[flat|nested] 58+ messages in thread
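For instance (illustrative device names, addresses and rates; this assumes an
ingress qdisc already exists on eth0 and a classful root 1: on eth1), creating
a policer with an explicit index and then binding that same index from a
second filter shares one action instance, so both filters account against a
single policer:

tc filter add dev eth0 parent ffff: protocol ip u32 \
	match ip src 10.0.0.1/32 \
	action police rate 1mbit burst 10k drop index 10

tc filter add dev eth1 parent 1: protocol ip u32 \
	match ip dst 10.0.0.1/32 \
	action police index 10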
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  2008-07-20 15:35 ` jamal
@ 2008-07-21  0:11 ` Herbert Xu
  [not found] ` <20080721001119.GA6515-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  0 siblings, 1 reply; 58+ messages in thread
From: Herbert Xu @ 2008-07-21 0:11 UTC (permalink / raw)
  To: jamal; +Cc: kaber, davem, netdev, johannes, linux-wireless

On Sun, Jul 20, 2008 at 11:35:19AM -0400, jamal wrote:
>
> All actions (on a per-type hash table basis) have an index.
> You create filter rule X and specify action nat.
> You may specify the index of the action when you create the filter X.
> If you then create another filter rule Y, also using the same action
> index, then that nat action is shared between rule X and rule Y[1].

This is exactly what I want to get rid of because otherwise even
if no index was specified we'll still do a hash insertion which
simply falls apart with a small hash table.  Using a large hash
table on the other hand is bad for people who only have a few rules.

> [1] This is useful for two reasons:
> a) memory saving purposes: If you dont care that much about performance
> or on a uniprocessor machine, one action would just be sufficient for
> many rules.
> b) accounting purposes; as you know qdiscs/filters/actions are
> per-device. Over the years, a need has arisen from some users to have a
> "per system" accounting (refer to the IMQ/IFB approach). Eg, if i wanted
> the policer action to account for ingress eth0 and egress eth1 for a
> user, i couldnt do it without some acrobatics.

We could do a dynamic table but so far I'm not convinced that
it's worth anybody's effort to implement :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 58+ messages in thread
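Concretely, with the 16-bucket table implied by the NAT_TAB_MASK of 15 quoted
further down: 256K actions hashed into 16 chains is roughly 16384 entries per
chain, so each insertion's duplicate-index walk traverses a list thousands of
entries long, while sizing the table for 256K up front wastes memory on the
common few-rule setups.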
[parent not found: <20080721001119.GA6515-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080721001119.GA6515-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> @ 2008-07-21 2:33 ` jamal 2008-07-21 3:17 ` Herbert Xu 0 siblings, 1 reply; 58+ messages in thread From: jamal @ 2008-07-21 2:33 UTC (permalink / raw) To: Herbert Xu Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Mon, 2008-21-07 at 08:11 +0800, Herbert Xu wrote: > This is exactly what I want to get rid of because otherwise even > if no index was specified we'll still do a hash insertion which > simply falls apart with a small hash table. Using a large hash > table on the other is bad for people who only have a few rules. True. But note: this is only during rule creation - once you create the rule (user space to kernel path), then no more hash table reference. Fast path has already a filter with actions attached, and is a mere pointer dereference. > We could do a dynamic table but so far I'm not convinced that > it's worth anybody's effort to implement :) If user<->kernel performance insertion/deletion is important, it is worth it. For example: Dave implemented dynamic hash tables on xfrm (voip setup time with ipsec is a metric used in the industry in that case) . The only operational problem i had with xfrm was lack of an upper bound of how large a table can grow; i would rather user space be told ENOMEM than continuing to grow in some cases (I actually implemented a patch which put a stop after a certain number of sad/spd - but i dont expect hugs if i was to post it;->). cheers, jamal -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. 2008-07-21 2:33 ` jamal @ 2008-07-21 3:17 ` Herbert Xu 2008-07-21 11:14 ` jamal 0 siblings, 1 reply; 58+ messages in thread From: Herbert Xu @ 2008-07-21 3:17 UTC (permalink / raw) To: jamal Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Sun, Jul 20, 2008 at 10:33:57PM -0400, jamal wrote: > > True. > But note: this is only during rule creation - once you create the > rule (user space to kernel path), then no more hash table reference. > Fast path has already a filter with actions attached, and is a mere > pointer dereference. Unfortunately the scenario that I wrote this for requires frequent addition/removal. > > We could do a dynamic table but so far I'm not convinced that > > it's worth anybody's effort to implement :) > > If user<->kernel performance insertion/deletion is important, it is > worth it. Only if you also want to share it :) In the end I patched it to not share it which is much easier. > For example: > Dave implemented dynamic hash tables on xfrm (voip setup time with ipsec > is a metric used in the industry in that case) . The only operational > problem i had with xfrm was lack of an upper bound of how large a table > can grow; i would rather user space be told ENOMEM than continuing to > grow in some cases (I actually implemented a patch which put a stop > after a certain number of sad/spd - but i dont expect hugs if i was to > post it;->). Of course if you're volunteering to write the dynamic hash table for actions then I'd happily switch back to sharing :) Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  2008-07-21  3:17 ` Herbert Xu
@ 2008-07-21 11:14 ` jamal
  2008-07-21 11:36 ` Herbert Xu
  0 siblings, 1 reply; 58+ messages in thread
From: jamal @ 2008-07-21 11:14 UTC (permalink / raw)
  To: Herbert Xu; +Cc: kaber, davem, netdev, johannes, linux-wireless

On Mon, 2008-21-07 at 11:17 +0800, Herbert Xu wrote:

> Unfortunately the scenario that I wrote this for requires frequent
> addition/removal.

Aha - makes absolute sense then;->

> Only if you also want to share it :) In the end I patched it to
> not share it which is much easier.

I am trying to visualize: if you dont share, you must have 256K copies
then? Assuming also you have a fast lookup since that was design intent.

> Of course if you're volunteering to write the dynamic hash table
> for actions then I'd happily switch back to sharing :)

It is a unique need like you said earlier (and would require a
medium-size surgery). How about this: if a second user shows up with
such a need I could do it.

If you knew you had 256K entries, then you could make NAT_TAB_MASK to be
(256K-1) and you are guaranteed to get O(1) lookup if you dont specify
indices. I know youve patched it already -havent quite understood how
and your current solution may be better- but one other way is to have a
Kconfig option which lets the user set the size of the nat hash table
at kernel compile time. So then a change of the sort:

#ifdef CONFIG_HASH_SIZE
#define NAT_TAB_MASK CONFIG_HASH_SIZE
#else
#define NAT_TAB_MASK 15
#endif

What do you think?

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  2008-07-21 11:14 ` jamal
@ 2008-07-21 11:36 ` Herbert Xu
  [not found] ` <20080721113600.GA10322-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  0 siblings, 1 reply; 58+ messages in thread
From: Herbert Xu @ 2008-07-21 11:36 UTC (permalink / raw)
  To: jamal; +Cc: kaber, davem, netdev, johannes, linux-wireless

On Mon, Jul 21, 2008 at 07:14:39AM -0400, jamal wrote:
>
> > Only if you also want to share it :) In the end I patched it to
> > not share it which is much easier.
>
> I am trying to visualize: if you dont share, you must have 256K copies
> then? Assuming also you have a fast lookup since that was design intent.

They can't be shared anyway because each of those 256K rules NATs
to a different IP address.

> #ifdef CONFIG_HASH_SIZE
> #define NAT_TAB_MASK CONFIG_HASH_SIZE
> #else
> #define NAT_TAB_MASK 15
> #endif
>
> What do you think?

Sorry, I think I'll have to poke my eyes out :) But yeah if we ever
get a generic dynamic hash table implementation then I'd be happy
for act_nat to use that.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 58+ messages in thread
[parent not found: <20080721113600.GA10322-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>]
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU. [not found] ` <20080721113600.GA10322-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> @ 2008-07-21 11:39 ` jamal 0 siblings, 0 replies; 58+ messages in thread From: jamal @ 2008-07-21 11:39 UTC (permalink / raw) To: Herbert Xu Cc: kaber-dcUjhNyLwpNeoWH0uzbU5w, davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA, johannes-cdvu00un1VgdHxzADdlk8Q, linux-wireless-u79uwXL29TY76Z2rM5mHXA On Mon, 2008-21-07 at 19:36 +0800, Herbert Xu wrote: > Sorry, I think I'll have to poke my eyes out :) But i still need your eyes to poke at my code Herbert;-> > But yeah if we ever get a generic dynamic hash table implmenetation > then I'd be happy for act_nat to use that. Ok, i have added it to my todo. cheers, jamal -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU.
  2008-07-17 13:03 ` Patrick McHardy
  [not found] ` <487F4327.1000107-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
@ 2008-07-19  3:59 ` David Miller
  1 sibling, 0 replies; 58+ messages in thread
From: David Miller @ 2008-07-19 3:59 UTC (permalink / raw)
  To: kaber; +Cc: netdev, johannes, linux-wireless

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 17 Jul 2008 15:03:35 +0200

> The remaining problem is data that was previously only used and
> modified under the RTNL (u32_list is one example). Modifications
> during destruction now need protection against concurrent use in
> process context.

I took a closer look at u32_list.  Its members seem to be keyed
by qdisc :-)

All this thing is doing is making up for the lack of a classifier
private pointer in the Qdisc.

It's tough because classifiers of different types can be queued up
to a qdisc, so it's not like we can add one traffic classifier
private pointer to struct Qdisc and be done with it.

However, I could not find anything other than u32 that seems to need
some global state that works like this.

So what I've done is checked in the following change for now to make
the u32 delete operation safe entirely within a qdisc tree's namespace.

pkt_sched: Get rid of u32_list.

The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.

Just add an explicit node pointer for u32 to struct Qdisc and
do away with this global list.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 0a158ff..8a44386 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -56,6 +56,8 @@ struct Qdisc
 	int			(*reshape_fail)(struct sk_buff *skb,
 					struct Qdisc *q);
 
+	void			*u32_node;
+
 	/* This field is deprecated, but it is still used by CBQ
 	 * and it will live until better solution will be invented.
 	 */
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 4d75544..527db25 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -75,7 +75,6 @@ struct tc_u_hnode
 
 struct tc_u_common
 {
-	struct tc_u_common	*next;
 	struct tc_u_hnode	*hlist;
 	struct Qdisc		*q;
 	int			refcnt;
@@ -87,8 +86,6 @@ static const struct tcf_ext_map u32_ext_map = {
 	.police = TCA_U32_POLICE
 };
 
-static struct tc_u_common *u32_list;
-
 static __inline__ unsigned u32_hash_fold(__be32 key, struct tc_u32_sel *sel, u8 fshift)
 {
 	unsigned h = ntohl(key & sel->hmask)>>fshift;
@@ -287,9 +284,7 @@ static int u32_init(struct tcf_proto *tp)
 	struct tc_u_hnode *root_ht;
 	struct tc_u_common *tp_c;
 
-	for (tp_c = u32_list; tp_c; tp_c = tp_c->next)
-		if (tp_c->q == tp->q)
-			break;
+	tp_c = tp->q->u32_node;
 
 	root_ht = kzalloc(sizeof(*root_ht), GFP_KERNEL);
 	if (root_ht == NULL)
@@ -307,8 +302,7 @@
 			return -ENOBUFS;
 		}
 		tp_c->q = tp->q;
-		tp_c->next = u32_list;
-		u32_list = tp_c;
+		tp->q->u32_node = tp_c;
 	}
 
 	tp_c->refcnt++;
@@ -402,14 +396,8 @@ static void u32_destroy(struct tcf_proto *tp)
 
 	if (--tp_c->refcnt == 0) {
 		struct tc_u_hnode *ht;
-		struct tc_u_common **tp_cp;
 
-		for (tp_cp = &u32_list; *tp_cp; tp_cp = &(*tp_cp)->next) {
-			if (*tp_cp == tp_c) {
-				*tp_cp = tp_c->next;
-				break;
-			}
-		}
+		tp->q->u32_node = NULL;
 
 		for (ht = tp_c->hlist; ht; ht = ht->next) {
 			ht->refcnt--;

^ permalink raw reply related	[flat|nested] 58+ messages in thread
end of thread
Thread overview: 58+ messages
2008-07-17 12:17 [PATCH 20/31]: pkt_sched: Perform bulk of qdisc destruction in RCU David Miller
[not found] ` <20080717.051726.226040470.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-17 13:03 ` Patrick McHardy
[not found] ` <487F4327.1000107-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
2008-07-17 13:12 ` David Miller
2008-07-17 13:48 ` Patrick McHardy
[not found] ` <487F4DA6.6010009-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
2008-07-17 22:36 ` David Miller
[not found] ` <20080717.153609.142867331.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-17 23:58 ` Patrick McHardy
2008-07-17 13:35 ` jamal
2008-07-17 14:02 ` Patrick McHardy
[not found] ` <487F50EC.80008-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
2008-07-17 22:24 ` David Miller
[not found] ` <20080717.152447.89672084.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-17 23:48 ` Patrick McHardy
2008-07-18 13:10 ` jamal
2008-07-18 13:27 ` jamal
2008-07-18 21:05 ` David Miller
2008-07-20 15:16 ` jamal
2008-07-20 17:25 ` David Miller
[not found] ` <20080720.102534.246150854.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-20 17:34 ` Tomas Winkler
[not found] ` <1ba2fa240807201034u1fbe3343tdcc858f7772d2b39-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-20 17:35 ` David Miller
2008-07-20 22:32 ` jamal
2008-07-20 23:59 ` David Miller
2008-07-21 2:20 ` jamal
2008-07-21 11:20 ` jamal
2008-07-21 16:45 ` David Miller
2008-07-21 11:58 ` Herbert Xu
[not found] ` <E1KKu2D-0002kd-00-XQvu0L+U/CjiRBuR/1fSEKKkPtS2pBon@public.gmane.org>
2008-07-21 13:08 ` jamal
2008-07-21 13:19 ` Herbert Xu
2008-07-21 13:56 ` jamal
2008-07-21 13:58 ` Herbert Xu
[not found] ` <20080721135846.GA11543-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-07-21 15:09 ` David Miller
[not found] ` <20080721.080901.214889979.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-21 15:22 ` Herbert Xu
2008-07-21 15:26 ` David Miller
[not found] ` <20080721.082627.128593651.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-21 16:16 ` Herbert Xu
[not found] ` <20080721161616.GA12938-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-07-21 16:25 ` David Miller
2008-07-21 16:43 ` Herbert Xu
2008-07-21 16:51 ` David Miller
[not found] ` <20080721.095124.89249903.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-21 17:02 ` Herbert Xu
[not found] ` <20080721170233.GA13417-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-07-21 17:08 ` David Miller
[not found] ` <20080721.100821.38432201.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-07-21 17:11 ` Herbert Xu
2008-08-22 6:56 ` Herbert Xu
2008-08-22 7:16 ` David Miller
[not found] ` <20080822.001620.141235108.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-08-22 7:41 ` Herbert Xu
[not found] ` <20080822074115.GB25615-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-08-22 10:42 ` David Miller
[not found] ` <20080822.034217.135152551.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2008-08-22 10:47 ` Herbert Xu
2008-08-22 13:52 ` jamal
[not found] ` <20080822065655.GA18471-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-08-22 13:43 ` jamal
2008-07-21 17:35 ` David Miller
2008-07-18 17:10 ` Roland Dreier
[not found] ` <adaljzz6ruq.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2008-07-20 14:58 ` jamal
2008-07-20 14:32 ` Herbert Xu
2008-07-20 17:20 ` David Miller
2008-07-20 14:20 ` Herbert Xu
[not found] ` <E1KKZmE-0000ty-00-XQvu0L+U/CjiRBuR/1fSEKKkPtS2pBon@public.gmane.org>
2008-07-20 15:35 ` jamal
2008-07-21 0:11 ` Herbert Xu
[not found] ` <20080721001119.GA6515-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-07-21 2:33 ` jamal
2008-07-21 3:17 ` Herbert Xu
2008-07-21 11:14 ` jamal
2008-07-21 11:36 ` Herbert Xu
[not found] ` <20080721113600.GA10322-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2008-07-21 11:39 ` jamal
2008-07-19 3:59 ` David Miller