From: Artur Skawina <art.08.09@gmail.com>
To: Christian Lamparter <chunkeey@web.de>
Cc: Artur Skawina <art.08.09@gmail.com>, linux-wireless@vger.kernel.org
Subject: Re: [RFC][RFT][PATCH] p54usb: rx refill revamp
Date: Thu, 22 Jan 2009 00:22:16 +0100 [thread overview]
Message-ID: <4977AE28.2080109@gmail.com> (raw)
In-Reply-To: <200901212156.27152.chunkeey@web.de>
Christian Lamparter wrote:
> On Wednesday 21 January 2009 20:32:43 Artur Skawina wrote:
>> Christian Lamparter wrote:
>>>> This patch makes the usb rx path alloc-less (except for the actual urb
>>>> submission call) which is good, but i wonder if we should try a GFP_NOWAIT
>>>> allocation, and only fallback if that one fails.
>>> Not necessary, we waste quite a lot memory by filling the rx ring with 32 useable packets.
>>> So there should be no shortage (anymore).
>> Not allocating-on-receive at all worries me a bit. Will test under load. (i already
>> had instrumented the cb, but the crashes prevented any useful testing).
>
> no problem... I'll wait for your data before removing the RFC/RFT tags
That part is probably fine, and i'm just being paranoid. Ignore me.
>>> The net2280 tx path does at least three allocs, one tiny never-changing buffer
>>>> and two urbs, i'd like to get rid of all of them.
>>> why? AFAIK kernel memory alloc already provides a good amount of (small) buffer caches,
>>> so why should stockpile them only for ourself?
>>>
>>> You know, 802.11b/g isn't exactly fast by any standards - heck even a 15 year old ethernet NIC
>>> is easily 5-6 times faster. So, "optimizations" are a bit useless when we have these bottlenecks.
>> no, i don't expect it do much difference performance-wise; i don't want it to
>> fail under memory pressure. preallocating ~three small buffers isn't that bad ;)
>
> well, the memory pressure is not sooo bad in a (prioritized) kernel thread.
> After all, the kernel reserves extra space for the kernel only and the OOM killer will become active as well...
> So unless you got a machine with 8mb (afaik that's the lowest limit linux boots now a days and is
> still useable!?) and a no/slowwwww swap, you'll have a really hard time to get any shortage of rx urbs at all.
TX, not RX.
I'll try stealing^Hborrowing the urbs from the refill queue (fixing up
the rx code to allow for this first, of course).
>>> In fact, if you have more than one GHz in your box, you should let your CPU do the
>>> encryption/decryption instead of the 30Mhz ARM CPU....
>>> this will give you a better latency for next to nothing.
>> BTW i tested both w/ hw encryption and w/o and both worked; saw no difference
>> in throughput, but didn't benchmark yet.
>> And no, i don't have >1GHz, the target system has probably 1/4 of that available
>> when it's idle, and much less when it's under load. Also i'd like to be able to
>> connect the device to a small fanless brick and have it do it's work (if i can find
>> a usable 2.6-based one, that is).
>
> well, the latency is usually about 0.1 - 0.2 msec better.
> However you'll get a big improvement if you change the MTU...
> As a ethernet device, the default is at 1500 octets, however the limit for WLAN is somewhere at 2274.
Good to know. As most packets will go over a ~1500 link upstream anyway
i'd rather not pay the pmtu discovery cost ;)
>>>> The constant buffer is easy - we can just kmalloc a cacheline-sized chunk on init, and (re)use that.
>>> only a single constant buffer? are you sure that's a good idea, on dual cores?
>>> (Or is this a misunderstanding and you plan to have up to 32/64 constant buffers?)
>> why not? the content never changes, and will only be read by the usb host controller;
>> the cpu shouldn't even need to see it after the initial setup.
> Ok, I guess we're talking about different things here.
> Please, show me a patch, before it gets too confusing ;-)
ok, I was just saying that that all this:
> reg = kmalloc(sizeof(*reg), GFP_ATOMIC);
> if (!reg) {
> printk(KERN_INFO "tx_net2280 kmalloc(reg), txqlen = %d\n",
> skb_queue_len(&priv->common.tx_queue) );
> return;
> }
> [...]
> reg->port = cpu_to_le16(NET2280_DEV_U32);
> reg->addr = cpu_to_le32(P54U_DEV_BASE);
> reg->val = cpu_to_le32(ISL38XX_DEV_INT_DATA);
does not need to happen for every single tx-ed frame.
>>>>> + if (usb_submit_urb(entry, GFP_ATOMIC)) {
>>>> GFP_KERNEL? [would need dropping rx_queue.lock earlier and retaking in the
>>>> (hopefully rare) error path]
>>> why not... I don't remember the real reason why I did this complicated lock, probably
>> You were already doing this for the skb allocation anyway ;)
> do you mean the old "init_urbs"?
I meant that you were already dropping a spinlock around one allocation, so it
seemed odd to not to do that for the other call.
> Well the bits I've still in mind about the "complicated lock". Was something about
> a theroeticall race between p54u_rx_cb, the workqueue and free_urbs.
>
> but of course, I've never seen a oops because of it.
>>> A updated patch is attached (as file)
>> Will test.
>> Are the free_urb/get_urb calls necessary? IOW why drop the reference
>> when preparing the urb, only to grab it again in the completion?
>
> Oh, I'm not arguing with Alan Stern about it:.
> http://lkml.org/lkml/2008/12/6/166
0) urb is allocated, refcount==1 # and placed on the refill list; p54u_init_urbs()
1) p54u_rx_refill()
2) urb is anchored refcount==2 # p54u_rx_refill
3) submit_urb() refcount==3 # in drivers/usb/core/hcd.c:usb_hcd_submit_urb
4) free_urb() refcount==2
5) ... usb transfer happens ... refcount==2
6) urb is unanchored refcount==1 # in drivers/usb/core/hcd.c:usb_hcd_giveback_urb()
7) p54u_rx_cb() # completion is called
8) usb_get_urb(urb) refcount==2 # unconditionally called in p54u_rx_cb()
9) p54u_rx_cb() returns
10) usb_put_urb() refcount==1 # in drivers/usb/core/hcd.c:usb_hcd_giveback_urb()
11) urb sits on the refill list
12) goto 1.
IOW step 4 causes the refcount to temporarily drop to 1, but as you
never want the urb freed in the completion, step 8 grabs another reference,
and the refcount can never become 0.
(for the skb-reuse case, you anchor and resubmit in step 7 (rc==3),
then return from completion (rc==2) and restart at step 5.)
Unless i missed something (i'm only looking at the patch).
So if you omit steps 4 and 8 nothing changes, except that the refcount
increases by one, in steps 5..7.
The urbs are freed by p54u_free_rx_refill_list(), which drops the last ref;
(it would need to get them all though, IOW would have to call
usb_kill_anchored_urbs() _before_ p54u_free_rx_refill_list(). )
Oh, and I just realized urbs are getting lost in the 'unlikely(urb->status)'
case, both before your patch, and after.
A simple fix, with your new code, would be to place them on the refill list,
from where they will be eventually resubmitted.
artur
next prev parent reply other threads:[~2009-01-21 23:22 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-21 13:50 [RFC][RFT][PATCH] p54usb: rx refill revamp Christian Lamparter
2009-01-21 16:04 ` Artur Skawina
2009-01-21 18:24 ` Christian Lamparter
2009-01-21 19:32 ` Artur Skawina
2009-01-21 20:56 ` Christian Lamparter
2009-01-21 23:22 ` Artur Skawina [this message]
2009-01-22 15:00 ` Christian Lamparter
2009-01-22 15:43 ` Artur Skawina
2009-01-22 21:39 ` Christian Lamparter
2009-01-22 21:45 ` Artur Skawina
2009-01-22 22:12 ` Christian Lamparter
2009-01-22 5:40 ` Artur Skawina
2009-01-22 15:09 ` Christian Lamparter
2009-01-22 15:52 ` Artur Skawina
2009-01-22 16:01 ` Christian Lamparter
2009-01-22 19:19 ` Artur Skawina
2009-01-22 21:02 ` Christian Lamparter
2009-01-22 22:05 ` Artur Skawina
2009-01-22 22:39 ` Christian Lamparter
2009-01-22 22:51 ` Artur Skawina
2009-01-23 1:11 ` Artur Skawina
2009-01-21 20:06 ` Larry Finger
2009-01-21 20:51 ` Christian Lamparter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4977AE28.2080109@gmail.com \
--to=art.08.09@gmail.com \
--cc=chunkeey@web.de \
--cc=linux-wireless@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.