From: "Benoît Thébaudeau" <benoit.thebaudeau@advansee.com>
To: u-boot@lists.denx.de
Subject: [U-Boot] [PATCH 2/5] ehci-hcd: Boost transfer speed
Date: Fri, 27 Jul 2012 15:59:36 +0200 (CEST) [thread overview]
Message-ID: <806388763.701359.1343397576841.JavaMail.root@advansee.com> (raw)
In-Reply-To: <201207271454.07888.marex@denx.de>
Dear Marek Vasut,
On Fri, Jul 27, 2012 at 02:54:07 PM, Marek Vasut wrote:
> > This patch takes advantage of the hardware EHCI qTD queuing
> > mechanism to
> > avoid software overhead and to make transfers as fast as possible.
> >
> > The only drawback is a call to memalign. However, this is fast
> > compared to
> > the transfer timings, and the heap size to allocate is small, e.g.
> > a
> > little bit more than 100 kB for a transfer length of 65535 packets
> > of 512
> > bytes.
> >
> > Tested on i.MX25 and i.MX35. In my test conditions, the speedup was
> > about
> > 15x using page-aligned buffers, which is really appreciable when
> > accessing
> > large files.
> >
> > Signed-off-by: Beno?t Th?baudeau <benoit.thebaudeau@advansee.com>
> > Cc: Marek Vasut <marex@denx.de>
> > Cc: Ilya Yanok <ilya.yanok@cogentembedded.com>
> > Cc: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
> > ---
> > .../drivers/usb/host/ehci-hcd.c | 94
> > ++++++++++++++------ 1 file changed, 65 insertions(+), 29
> > deletions(-)
> >
> > diff --git u-boot-usb-1b4bd0e.orig/drivers/usb/host/ehci-hcd.c
> > u-boot-usb-1b4bd0e/drivers/usb/host/ehci-hcd.c index
> > 5b3b906..b5645fa
> > 100644
> > --- u-boot-usb-1b4bd0e.orig/drivers/usb/host/ehci-hcd.c
> > +++ u-boot-usb-1b4bd0e/drivers/usb/host/ehci-hcd.c
> > @@ -208,7 +208,8 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned long
> > pipe, void *buffer, int length, struct devrequest *req)
> > {
> > ALLOC_ALIGN_BUFFER(struct QH, qh, 1, USB_DMA_MINALIGN);
> > - ALLOC_ALIGN_BUFFER(struct qTD, qtd, 3, USB_DMA_MINALIGN);
> > + struct qTD *qtd;
> > + int qtd_count = 0;
> > int qtd_counter = 0;
> >
> > volatile struct qTD *vtd;
> > @@ -229,8 +230,25 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned
> > long pipe, void *buffer, le16_to_cpu(req->value),
> > le16_to_cpu(req->value),
> > le16_to_cpu(req->index));
> >
> > + if (req != NULL) /* SETUP + ACK */
> > + qtd_count += 1 + 1;
> > + if (length > 0 || req == NULL) { /* buffer */
> > + if ((uint32_t)buffer & 4095) /* page-unaligned */
> > + qtd_count += (((uint32_t)buffer & 4095) + length +
> > + (QT_BUFFER_CNT - 1) * 4096 - 1) /
> > + ((QT_BUFFER_CNT - 1) * 4096);
>
> Ok, maybe you can please elaborate on this crazy calculation in here?
> Or somehow
> clarify it? Also, won't the macros in include/common.h help in a way?
> (like
> ROUND() etc).
I have already posted a v2 for this specific patch (only 2/5) that does exactly
that. Please review only the latest versions of patches.
> I don't really graps what you're trying to calculate here, so maybe
> even a
> comment would help.
I'll do that.
> > + else /* page-aligned */
> > + qtd_count += (length + QT_BUFFER_CNT * 4096 - 1) /
> > + (QT_BUFFER_CNT * 4096);
>
> Same here, also please avoid using those 4096 and such constants ...
> maybe
> #define them in ehci.h ?
Yes. It was already everywhere, so I went on the same way. Do you think using
PAGE_SIZE from <linux/compat.h> would be fine since these 4096 are nothing more
than page sizes?
> > + }
> > + qtd = memalign(USB_DMA_MINALIGN, qtd_count * sizeof(struct qTD));
>
> So your code can alloc more than 5 qTDs ? How does it chain them
> then? Into more
> QHs ?
It's done in exactly the same way as for the original 3 qTDs, only with more
qTDs, but still with 5 qt_buffers per qTD.
> > + if (qtd == NULL) {
> > + printf("unable to allocate TDs\n");
> > + return -1;
> > + }
> > +
> > memset(qh, 0, sizeof(struct QH));
> > - memset(qtd, 0, 3 * sizeof(*qtd));
> > + memset(qtd, 0, qtd_count * sizeof(*qtd));
> >
> > toggle = usb_gettoggle(dev, usb_pipeendpoint(pipe),
> > usb_pipeout(pipe));
> >
> > @@ -291,31 +309,46 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned
> > long pipe, void *buffer, }
> >
> > if (length > 0 || req == NULL) {
> > - /*
> > - * Setup request qTD (3.5 in ehci-r10.pdf)
> > - *
> > - * qt_next ................ 03-00 H
> > - * qt_altnext ............. 07-04 H
> > - * qt_token ............... 0B-08 H
> > - *
> > - * [ buffer, buffer_hi ] loaded with "buffer".
> > - */
> > - qtd[qtd_counter].qt_next = cpu_to_hc32(QT_NEXT_TERMINATE);
> > - qtd[qtd_counter].qt_altnext = cpu_to_hc32(QT_NEXT_TERMINATE);
> > - token = (toggle << 31) |
> > - (length << 16) |
> > - ((req == NULL ? 1 : 0) << 15) |
> > - (0 << 12) |
> > - (3 << 10) |
> > - ((usb_pipein(pipe) ? 1 : 0) << 8) | (0x80 << 0);
> > - qtd[qtd_counter].qt_token = cpu_to_hc32(token);
> > - if (ehci_td_buffer(&qtd[qtd_counter], buffer, length) != 0) {
> > - printf("unable construct DATA td\n");
> > - goto fail;
> > - }
> > - /* Update previous qTD! */
> > - *tdp = cpu_to_hc32((uint32_t)&qtd[qtd_counter]);
> > - tdp = &qtd[qtd_counter++].qt_next;
> > + uint8_t *buf_ptr = buffer;
> > + int left_length = length;
> > +
> > + do {
> > + int xfr_bytes = min(left_length,
> > + (QT_BUFFER_CNT * 4096 -
> > + ((uint32_t)buf_ptr & 4095)) &
> > + ~4095);
>
> Magic formula yet again ... comment would again be welcome please.
OK.
> > + /*
> > + * Setup request qTD (3.5 in ehci-r10.pdf)
> > + *
> > + * qt_next ................ 03-00 H
> > + * qt_altnext ............. 07-04 H
> > + * qt_token ............... 0B-08 H
> > + *
> > + * [ buffer, buffer_hi ] loaded with "buffer".
> > + */
> > + qtd[qtd_counter].qt_next =
> > + cpu_to_hc32(QT_NEXT_TERMINATE);
> > + qtd[qtd_counter].qt_altnext =
> > + cpu_to_hc32(QT_NEXT_TERMINATE);
> > + token = (toggle << 31) |
> > + (xfr_bytes << 16) |
> > + ((req == NULL ? 1 : 0) << 15) |
> > + (0 << 12) |
> > + (3 << 10) |
> > + ((usb_pipein(pipe) ? 1 : 0) << 8) | (0x80 << 0);
>
> If you could fix all this magic afterwards (not in these patches),
> that'd be
> great.
Do you only mean #defining all those values?
> > + qtd[qtd_counter].qt_token = cpu_to_hc32(token);
> > + if (ehci_td_buffer(&qtd[qtd_counter], buf_ptr,
> > + xfr_bytes) != 0) {
> > + printf("unable construct DATA td\n");
> > + goto fail;
> > + }
> > + /* Update previous qTD! */
> > + *tdp = cpu_to_hc32((uint32_t)&qtd[qtd_counter]);
> > + tdp = &qtd[qtd_counter++].qt_next;
> > + buf_ptr += xfr_bytes;
> > + left_length -= xfr_bytes;
> > + } while (left_length > 0);
> > }
> >
> > if (req != NULL) {
> > @@ -346,7 +379,8 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned long
> > pipe, void *buffer, flush_dcache_range((uint32_t)qh_list,
> > ALIGN_END_ADDR(struct QH, qh_list, 1));
> > flush_dcache_range((uint32_t)qh, ALIGN_END_ADDR(struct QH, qh,
> > 1));
> > - flush_dcache_range((uint32_t)qtd, ALIGN_END_ADDR(struct qTD, qtd,
> > 3));
> > + flush_dcache_range((uint32_t)qtd,
> > + ALIGN_END_ADDR(struct qTD, qtd, qtd_count));
> >
> > /* Set async. queue head pointer. */
> > ehci_writel(&hcor->or_asynclistaddr, (uint32_t)qh_list);
> > @@ -377,7 +411,7 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned long
> > pipe, void *buffer, invalidate_dcache_range((uint32_t)qh,
> > ALIGN_END_ADDR(struct QH, qh, 1));
> > invalidate_dcache_range((uint32_t)qtd,
> > - ALIGN_END_ADDR(struct qTD, qtd, 3));
> > + ALIGN_END_ADDR(struct qTD, qtd, qtd_count));
> >
> > token = hc32_to_cpu(vtd->qt_token);
> > if (!(token & 0x80))
> > @@ -450,9 +484,11 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned
> > long pipe, void *buffer, ehci_readl(&hcor->or_portsc[1]));
> > }
> >
> > + free(qtd);
> > return (dev->status != USB_ST_NOT_PROC) ? 0 : -1;
> >
> > fail:
> > + free(qtd);
> > return -1;
> > }
>
Best regards,
Beno?t Th?baudeau
next prev parent reply other threads:[~2012-07-27 13:59 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-19 20:17 [U-Boot] [PATCH 2/5] ehci-hcd: Boost transfer speed Benoît Thébaudeau
2012-07-20 11:26 ` [U-Boot] [PATCH v2 " Benoît Thébaudeau
2012-07-20 11:37 ` Stefan Herbrechtsmeier
2012-07-20 13:17 ` Benoît Thébaudeau
2012-07-20 13:44 ` Marek Vasut
2012-07-20 13:56 ` Benoît Thébaudeau
2012-07-20 14:51 ` Stefan Herbrechtsmeier
2012-07-20 15:03 ` Benoît Thébaudeau
2012-07-20 15:15 ` Stefan Herbrechtsmeier
2012-07-20 15:35 ` Benoît Thébaudeau
2012-07-23 13:35 ` Stefan Herbrechtsmeier
2012-07-23 17:15 ` Benoît Thébaudeau
2012-07-24 13:02 ` Stefan Herbrechtsmeier
2012-07-29 0:48 ` Benoît Thébaudeau
2012-07-30 22:38 ` Marek Vasut
2012-07-31 1:06 ` Benoît Thébaudeau
2012-07-31 19:52 ` Stefan Herbrechtsmeier
2012-08-01 2:41 ` Marek Vasut
2012-08-03 23:02 ` Benoît Thébaudeau
2012-08-04 7:45 ` Marek Vasut
2012-08-08 23:14 ` Benoît Thébaudeau
2012-08-08 23:14 ` Marek Vasut
2012-07-31 20:01 ` Stefan Herbrechtsmeier
2012-07-27 14:07 ` Marek Vasut
2012-07-27 14:16 ` Benoît Thébaudeau
2012-07-27 14:30 ` Marek Vasut
2012-08-09 21:51 ` [U-Boot] [PATCH v3 3/8] " Benoît Thébaudeau
2012-08-09 22:32 ` Marek Vasut
2012-07-27 12:54 ` [U-Boot] [PATCH 2/5] " Marek Vasut
2012-07-27 13:59 ` Benoît Thébaudeau [this message]
2012-07-27 14:01 ` Marek Vasut
2012-07-27 14:13 ` Benoît Thébaudeau
2012-07-27 14:31 ` Marek Vasut
2012-07-29 0:58 ` Benoît Thébaudeau
2012-07-29 1:40 ` Marek Vasut
2012-07-29 14:14 ` Benoît Thébaudeau
2012-07-29 18:08 ` Marek Vasut
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=806388763.701359.1343397576841.JavaMail.root@advansee.com \
--to=benoit.thebaudeau@advansee.com \
--cc=u-boot@lists.denx.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.