From: Patrick McHardy <kaber@trash.net>
To: hadi@cyberus.ca
Cc: Russell Stuart <russell-tcatm@stuart.id.au>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
Stephen Hemminger <shemminger@osdl.org>,
netdev@vger.kernel.org, Jesper Dangaard Brouer <hawk@diku.dk>
Subject: Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)
Date: Tue, 20 Jun 2006 18:51:13 +0200
Message-ID: <44982781.8030301@trash.net>
In-Reply-To: <1150817922.5270.125.camel@jzny2>
jamal wrote:
> On Tue, 2006-20-06 at 16:45 +0200, Patrick McHardy wrote:
>
>>Actually in the PPPoE case Linux doesn't know about ethernet
>>headers either, since shaping is usually done on the PPP device.
>>But that doesn't really matter since the ethernet link is not
>>the bottleneck - although it does add some delay for packetization.
>
>
> good point. But one could argue that is within linux (local) as opposed
> to something downstream at the ISP i.e. i have knowledge of it and i
> could do clever things. The other is: I have to know that the ISP is
> using pigeons as the link layer downstream and compensate for it.
>
> The issue really is whether Linux should be interested in the
> throughput it is told about or the goodput (also known as effective
> throughput) the service provider offers. Two different issues by
> definition.
In the case of PPPoE, non-work-conserving qdiscs are already used
to manage a non-local link based on knowledge of its bandwidth,
whereas a local link would be best managed in work-conserving
mode. And I think for better accuracy it is necessary to manage
effective throughput, especially if you're interested in
guaranteed delays.
>>>Yes, Linux can't tell if your service provider is lying to you.
>>
>>I wouldn't call it lying as long as they don't say "1.5mbps IP
>>layer throughput".
>
>
> It is a scam for sure.
> By definition of what throughput is - you are telling the truth; just
> not the whole truth. Most users think in terms of goodput and not
> throughput.
> i.e. you are not telling the whole truth by not saying "it is 1.5Mbps ATM
> throughput". Typically not an issue until somebody finds that by leaving
> out "ATM" you meant throughput and not goodput.
I think that point can be used to argue in favour of Linux being
able to manage effective throughput :)
>>Ethernet doesn't provide 100mbit IP layer
>>throughput either, and with minimum-sized IP packets it's actually
>>well below that.
>
>
> OTOH, nobody has ethernet MTUs of 64 bytes.
Sure, but I might not want my HFSC class with a guaranteed delay of 140us
to be disturbed by someone sending small packets that need more time
on the wire than HFSC thinks.
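To put a number on that, here is a minimal sketch of the arithmetic. The helper names are hypothetical, and the 100 Mbit/s rate and the 34-byte length (a 14-byte Ethernet header plus a bare 20-byte IP header) are assumptions chosen for illustration. Preamble and inter-frame gap are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds needed to send `bytes` at `bits_per_sec`. */
static uint64_t xmit_ns(unsigned int bytes, uint64_t bits_per_sec)
{
	assert(bits_per_sec > 0);
	return (uint64_t)bytes * 8u * 1000000000ull / bits_per_sec;
}

/* How much longer a packet occupies the link than the scheduler assumed,
 * given the length it accounted (seen_len) and the real wire length. */
static uint64_t underestimate_ns(unsigned int seen_len, unsigned int wire_len,
				 uint64_t bits_per_sec)
{
	return xmit_ns(wire_len, bits_per_sec) - xmit_ns(seen_len, bits_per_sec);
}
```

At 100 Mbit/s, xmit_ns(34, ...) gives 2720 ns, while the same frame padded to 60 bytes plus a 4-byte FCS is 64 bytes and takes 5120 ns. The 2400 ns error per packet adds up to the full 140us after roughly 58 back-to-back small packets.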
> To be academic and pedantic: The schedulers should be focusing on
> throughput and not goodput.
> Look at it from another angle related to the nature of the link layer
> used:
> If i buy a 1.5 Mbps 802.11JHS (such a link layer technology doesn't
> exist, but assume for the sake of argument it does) from a wireless
> service provider, ethernet headers etc - but in this case the link is so
> bad (because of the link layer technology) i have to retransmit so much
> that 0.5 Mbps is wasted on retransmits, the question becomes:
> 1)Do i fix the scheduler to compensate for this link layer retransmit?
> or
> 2)Do i find some other creative way to tell the scheduler that
> without making any changes to it that my ftp (despite the retransmits)
> should only chew 100Kbps.?
>
> I am saying that #2 is the choice to go with hence my assertion earlier,
> it should be fine to tell the scheduler all it has is 1Mbps and nobody
> gets hurt. #1 if i could do it with minimal intrusion and still get to
> use it when i have 802.11g.
>
> Not sure i made sense.
HFSC is actually capable of handling this quite well. If you use it
in work-conserving mode (and the card doesn't do (much) internal
queueing) it will get clocked by successful transmissions. Using
link-sharing classes you can define proportions for use of available
bandwidth, possibly with upper limits. No hacks required :)
Anyway, this again goes more in the direction of handling link speed
changes.
>>A non-intrusive way is preferred of course, but I can't really see
>>one if you want more than just a special-case solution that only
>>covers qdiscs using rate-tables and even ignores inner qdiscs.
>>HFSC and SFQ for example both need to calculate the wire length
>>at runtime.
>>
>
> Agreed. That would be equivalent to #1 above.
>
>
>>Handling all qdiscs would mean adding a pointer to a mapping table
>>to struct net_device and using something like "skb_wire_len(skb, dev)"
>>instead of skb->len in the queueing layer.
>
>
> That does seem sensible and simpler. I would suspect then that you will
> do this one time with something like
> ip dev add compensate_header 100 bytes
Something like that, but it's a bit more complicated.
For ATM we need some mapping:
  [0-48] -> 53
  [49-96] -> 106
  ...
For Ethernet we need:
  [0-60] -> 64
  [60-n] -> n + 4
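Computed directly rather than looked up, the two mappings above could be sketched as follows. The function names are hypothetical, and the sketch assumes the input length already includes any encapsulation overhead (for ATM, e.g. the AAL5 trailer).

```c
#include <assert.h>

enum { ATM_CELL_PAYLOAD = 48, ATM_CELL_SIZE = 53 };
enum { ETH_MIN_LEN = 60, ETH_FCS_LEN = 4 };

/* Round up to whole cells: every started 48 bytes cost one 53-byte cell. */
static unsigned int atm_wire_len(unsigned int len)
{
	unsigned int cells = (len + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

	if (cells == 0)
		cells = 1;	/* the [0-48] -> 53 row also covers length 0 */
	return cells * ATM_CELL_SIZE;
}

/* Pad short frames to 60 bytes, then add the 4-byte FCS. */
static unsigned int eth_wire_len(unsigned int len)
{
	return (len < ETH_MIN_LEN ? ETH_MIN_LEN : len) + ETH_FCS_LEN;
}

/* Checks the rows listed in the mapping above. */
static void mapping_selfcheck(void)
{
	assert(atm_wire_len(48) == 53);
	assert(atm_wire_len(49) == 106);
	assert(eth_wire_len(60) == 64);
}
```

The ATM case needs a division per packet, which is what a precomputed table is meant to avoid.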
We could do something like this (feel free to imagine nicer names):
ATM:
table = {
	.step = 48,	/* payload bytes per 53-byte cell */
	.map = {
		[0..48] = 53,
		[49..96] = 106,
		...
	}
};
Requiring a table of size 32 for typical MTUs.
Ethernet:
table = {
	.step = 60,
	.map = {
		[0..60] = 60,
		[...] = 0,
	},
	.fixed_overhead = 4,
};
static inline unsigned int
skb_wire_len(struct sk_buff *skb, struct net_device *dev)
{
	unsigned int idx, len;

	/* No table configured: charge the packet at its plain length. */
	if (dev->lengthtable == NULL)
		return skb->len;

	idx = skb->len / dev->lengthtable->step;
	len = dev->lengthtable->map[idx];

	/* A zero entry means "no mapping": fall back to skb->len. */
	return dev->lengthtable->fixed_overhead + (len ? len : skb->len);
}
Unfortunately I can't think of a way to handle the ATM case without
a division ... or iteration.
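For experimenting outside the kernel, the lookup above can be modelled in userspace roughly like this. It is a sketch under assumptions: struct length_table, its map_len field and the function names are hypothetical, and slots beyond the end of the map are treated as 0, mirroring the "[...] = 0" entry. The final expression is parenthesised so that the fixed overhead is added to whichever length is chosen. The second function shows the iteration alternative for ATM, trading the division for a loop of at most 32 rounds at typical MTUs.

```c
#include <assert.h>
#include <stddef.h>

struct length_table {
	unsigned int step;		/* bytes covered by one map slot */
	unsigned int fixed_overhead;	/* added to every packet */
	size_t map_len;			/* number of entries in map */
	const unsigned int *map;	/* 0 means "use the packet length" */
};

static unsigned int wire_len(unsigned int len, const struct length_table *t)
{
	unsigned int idx, mapped;

	if (t == NULL)
		return len;
	assert(t->step > 0);
	idx = len / t->step;
	mapped = idx < t->map_len ? t->map[idx] : 0;
	return t->fixed_overhead + (mapped ? mapped : len);
}

/* The Ethernet table from above: pad to 60 bytes, add a 4-byte FCS. */
static const unsigned int eth_map[] = { 60 };
static const struct length_table eth_table = {
	.step = 60,
	.fixed_overhead = 4,
	.map_len = 1,
	.map = eth_map,
};

/* ATM without a division: one 53-byte cell per started 48 bytes. */
static unsigned int atm_wire_len_iter(unsigned int len)
{
	unsigned int wire = 53;

	while (len > 48) {
		len -= 48;
		wire += 53;
	}
	return wire;
}
```

With eth_table, a 40-byte packet is charged 64 bytes and a 1500-byte packet 1504 bytes. atm_wire_len_iter(1500) walks 32 cells and returns 1696.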
>>That of course doesn't
>>mean that we can't still provide pre-adjusted ratetables for qdiscs
>>that use them.
>>
>
>
> But what would the point be then if you can compensate as you did above?
It doesn't need runtime divisions :)
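To illustrate why: a pre-adjusted rate table does the ATM rounding once, when the table is built, and the per-packet lookup is only a shift and an array index. The sketch below is loosely modelled on how tc rate tables are indexed by len >> cell_log. The function names, the slot layout and the use of microseconds instead of scheduler ticks are assumptions for illustration, not the actual tc or kernel code.

```c
#include <assert.h>
#include <stdint.h>

enum { RTAB_SLOTS = 256, ATM_CELL_PAYLOAD = 48, ATM_CELL_SIZE = 53 };

/* Bytes on the wire once `len` is split into 53-byte cells. */
static unsigned int atm_align(unsigned int len)
{
	unsigned int cells = (len + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

	return (cells ? cells : 1) * ATM_CELL_SIZE;
}

/* Built once in userspace: rtab[i] holds the transmit time in
 * microseconds of the largest packet length that falls into slot i. */
static void build_atm_rtab(uint32_t rtab[RTAB_SLOTS], unsigned int cell_log,
			   uint64_t bytes_per_sec)
{
	unsigned int i;

	assert(bytes_per_sec > 0);
	for (i = 0; i < RTAB_SLOTS; i++) {
		unsigned int largest = ((i + 1) << cell_log) - 1;

		rtab[i] = (uint32_t)((uint64_t)atm_align(largest) *
				     1000000ull / bytes_per_sec);
	}
}

/* Per-packet path: a shift and an array index, no division. */
static uint32_t rtab_lookup(const uint32_t rtab[RTAB_SLOTS],
			    unsigned int cell_log, unsigned int len)
{
	unsigned int slot = len >> cell_log;

	if (slot >= RTAB_SLOTS)
		slot = RTAB_SLOTS - 1;
	return rtab[slot];
}
```

The price is granularity: every length in a slot is charged as the largest length in that slot, so with cell_log = 3 a 48-byte packet shares a slot with 55-byte packets and is charged two cells.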