* [PATCH v2 0/4] USB: HCD/EHCI: giveback of URB in tasklet context
@ 2013-06-24 9:42 Ming Lei
2013-06-24 9:42 ` [PATCH v2 1/4] USB: HCD: support " Ming Lei
` (3 more replies)
0 siblings, 4 replies; 21+ messages in thread
From: Ming Lei @ 2013-06-24 9:42 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
The patchset supports to run giveback of URB in tasklet context, so that
DMA unmapping/mapping on transfer buffer and compelte() callback can be
run with interrupt enabled, then time of HCD interrupt handler(IRQs
disabled time) can be saved much. Also this approach may simplify HCD
since HCD lock needn't be released any more before calling
usb_hcd_giveback_urb().
The patchset enables the mechanism on EHCI HCD now.
In the commit log of patch 4/4, detailed test result on three machines
(ARM A9/A15 dual core, X86) are provided about transfer performance and
ehci irq handling time. From the result, no transfer performance loss
is found and ehci irq handling time drops much with the patchset.
V2:
- 1/4: always run URB complete() of root-hub in tasklet
- 1/4: store urb status in urb->unlinked
- 1/4: don't allocate 'struct giveback_urb_bh' dynamically
- 1/4: other minor changes
- 2/4: descript changes simply
- 3/4: don't use QH_STATE_UNLINK_WAIT to implement intr qh
unlink wait
- 3/4: cancel unlink wait change
- 4/4: merge HCD private lock changes
- rebase on 3.10-rc7-next20130624
V1:
- change percput tasklet into tasklet in HCD to avoid out of order
of URB->complete() for same endpoint
- disable local IRQs when calling complete() from tasklet to
avoid deadlock which is caused by holding lock via spin_lock
and the same lock might be acquired in hard irq context
- document coming change about calling complete() with irq enabled
so that we can start to clean up USB drivers which call spin_lock()
in complete()
Documentation/usb/URB.txt | 21 +++--
drivers/usb/core/hcd.c | 169 ++++++++++++++++++++++++++++++-------
drivers/usb/host/ehci-fsl.c | 2 +-
drivers/usb/host/ehci-grlib.c | 2 +-
drivers/usb/host/ehci-hcd.c | 3 +-
drivers/usb/host/ehci-hub.c | 1 +
drivers/usb/host/ehci-mem.c | 1 +
drivers/usb/host/ehci-mv.c | 2 +-
drivers/usb/host/ehci-octeon.c | 2 +-
drivers/usb/host/ehci-pmcmsp.c | 2 +-
drivers/usb/host/ehci-ppc-of.c | 2 +-
drivers/usb/host/ehci-ps3.c | 2 +-
drivers/usb/host/ehci-q.c | 5 --
drivers/usb/host/ehci-sched.c | 62 +++++++++++++-
drivers/usb/host/ehci-sead3.c | 2 +-
drivers/usb/host/ehci-sh.c | 2 +-
drivers/usb/host/ehci-tilegx.c | 2 +-
drivers/usb/host/ehci-timer.c | 45 +++++++++-
drivers/usb/host/ehci-w90x900.c | 2 +-
drivers/usb/host/ehci-xilinx-of.c | 2 +-
drivers/usb/host/ehci.h | 3 +
include/linux/usb/hcd.h | 17 ++++
22 files changed, 287 insertions(+), 64 deletions(-)
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/4] USB: HCD: support giveback of URB in tasklet context
2013-06-24 9:42 [PATCH v2 0/4] USB: HCD/EHCI: giveback of URB in tasklet context Ming Lei
@ 2013-06-24 9:42 ` Ming Lei
2013-06-24 11:49 ` Ming Lei
2013-06-24 15:17 ` Alan Stern
2013-06-24 9:42 ` [PATCH v2 2/4] USB: URB documentation: claim complete() will be run with IRQs enabled Ming Lei
` (2 subsequent siblings)
3 siblings, 2 replies; 21+ messages in thread
From: Ming Lei @ 2013-06-24 9:42 UTC (permalink / raw)
To: linux-arm-kernel
This patch implements the mechanism of giveback of URB in
tasklet context, so that hardware interrupt handling time for
usb host controller can be saved much, and HCD interrupt handling
can be simplified.
Motivations:
1), on some arch(such as ARM), DMA mapping/unmapping is a bit
time-consuming, for example: when accessing usb mass storage
via EHCI on pandaboard, the common length of transfer buffer is 120KB,
the time consumed on DMA unmapping may reach hundreds of microseconds;
even on A15 based box, the time is still about scores of microseconds
2), on some arch, reading DMA coherent memoery is very time-consuming,
the most common example is usb video class driver[1]
3), driver's complete() callback may do much things which is driver
specific, so the time is consumed unnecessarily in hardware irq context.
4), running driver's complete() callback in hardware irq context causes
that host controller driver has to release its lock in interrupt handler,
so reacquiring the lock after return may busy wait a while and increase
interrupt handling time. More seriously, releasing the HCD lock makes
HCD becoming quite complicated to deal with introduced races.
So the patch proposes to run giveback of URB in tasklet context, then
time consumed in HCD irq handling doesn't depend on drivers' complete and
DMA mapping/unmapping any more, also we can simplify HCD since the HCD
lock isn't needed to be released during irq handling.
The patch should be reasonable and doable:
1), for drivers, they don't care if the complete() is called in hard irq
context or softirq context
2), the biggest change is the situation in which usb_submit_urb() is called
in complete() callback, so the introduced tasklet schedule delay might be a
con, but it shouldn't be a big deal:
- control/bulk asynchronous transfer isn't sensitive to schedule
delay
- the patch schedules giveback of periodic URBs using
tasklet_hi_schedule, so the introduced delay should be very
small
- for ISOC transfer, generally, drivers submit several URBs
concurrently to avoid interrupt delay, so it is OK with the
little schedule delay.
- for interrupt transfer, generally, drivers only submit one URB
at the same time, but interrupt transfer is often used in event
report, polling, ... situations, and a little delay should be OK.
Considered that HCDs may optimize on submitting URB in complete(), the
patch may cause the optimization not working, so introduces one flag to mark
if the HCD supports to run giveback URB in tasklet context. When all HCDs
are ready, the flag can be removed.
[1], http://marc.info/?t=136438111600010&r=1&w=2
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
drivers/usb/core/hcd.c | 169 ++++++++++++++++++++++++++++++++++++++---------
include/linux/usb/hcd.h | 17 +++++
2 files changed, 156 insertions(+), 30 deletions(-)
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 014dc99..1fbd193 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -694,15 +694,7 @@ error:
/* any errors get returned through the urb completion */
spin_lock_irq(&hcd_root_hub_lock);
usb_hcd_unlink_urb_from_ep(hcd, urb);
-
- /* This peculiar use of spinlocks echoes what real HC drivers do.
- * Avoiding calls to local_irq_disable/enable makes the code
- * RT-friendly.
- */
- spin_unlock(&hcd_root_hub_lock);
usb_hcd_giveback_urb(hcd, urb, status);
- spin_lock(&hcd_root_hub_lock);
-
spin_unlock_irq(&hcd_root_hub_lock);
return 0;
}
@@ -742,9 +734,7 @@ void usb_hcd_poll_rh_status(struct usb_hcd *hcd)
memcpy(urb->transfer_buffer, buffer, length);
usb_hcd_unlink_urb_from_ep(hcd, urb);
- spin_unlock(&hcd_root_hub_lock);
usb_hcd_giveback_urb(hcd, urb, 0);
- spin_lock(&hcd_root_hub_lock);
} else {
length = 0;
set_bit(HCD_FLAG_POLL_PENDING, &hcd->flags);
@@ -834,10 +824,7 @@ static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
if (urb == hcd->status_urb) {
hcd->status_urb = NULL;
usb_hcd_unlink_urb_from_ep(hcd, urb);
-
- spin_unlock(&hcd_root_hub_lock);
usb_hcd_giveback_urb(hcd, urb, status);
- spin_lock(&hcd_root_hub_lock);
}
}
done:
@@ -1648,6 +1635,73 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
/*-------------------------------------------------------------------------*/
+static void __usb_hcd_giveback_urb(struct urb *urb)
+{
+ struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
+ int status = urb->unlinked;
+ unsigned long flags;
+
+ urb->hcpriv = NULL;
+ if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
+ urb->actual_length < urb->transfer_buffer_length &&
+ !status))
+ status = -EREMOTEIO;
+
+ unmap_urb_for_dma(hcd, urb);
+ usbmon_urb_complete(&hcd->self, urb, status);
+ usb_unanchor_urb(urb);
+
+ /* pass ownership to the completion handler */
+ urb->status = status;
+
+ /*
+ * We disable local IRQs here avoid possible deadlock because
+ * drivers may call spin_lock() to hold lock which might be
+ * acquired in one hard interrupt handler.
+ *
+ * The local_irq_save()/local_irq_restore() around complete()
+ * will be removed if current USB drivers have been cleaned up
+ * and no one may trigger the above deadlock situation when
+ * running complete() in tasklet.
+ */
+ local_irq_save(flags);
+ urb->complete(urb);
+ local_irq_restore(flags);
+
+ atomic_dec(&urb->use_count);
+ if (unlikely(atomic_read(&urb->reject)))
+ wake_up(&usb_kill_urb_queue);
+ usb_put_urb(urb);
+}
+
+static void usb_giveback_urb_bh(unsigned long param)
+{
+ struct giveback_urb_bh *bh = (struct giveback_urb_bh *)param;
+ unsigned long flags;
+ struct list_head local_list;
+
+ spin_lock_irqsave(&bh->lock, flags);
+ bh->running = 1;
+ restart:
+ list_replace_init(&bh->head, &local_list);
+ spin_unlock_irqrestore(&bh->lock, flags);
+
+ while (!list_empty(&local_list)) {
+ struct urb *urb;
+
+ urb = list_entry(local_list.next, struct urb, urb_list);
+ list_del_init(&urb->urb_list);
+ __usb_hcd_giveback_urb(urb);
+ }
+
+ /* check if there are new URBs to giveback */
+ spin_lock_irqsave(&bh->lock, flags);
+ if (!list_empty(&bh->head))
+ goto restart;
+ bh->running = 0;
+ spin_unlock_irqrestore(&bh->lock, flags);
+}
+
/**
* usb_hcd_giveback_urb - return URB from HCD to device driver
* @hcd: host controller returning the URB
@@ -1667,25 +1721,40 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
*/
void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
{
- urb->hcpriv = NULL;
- if (unlikely(urb->unlinked))
- status = urb->unlinked;
- else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
- urb->actual_length < urb->transfer_buffer_length &&
- !status))
- status = -EREMOTEIO;
+ struct giveback_urb_bh *bh;
+ bool sched, high_prio_bh;
- unmap_urb_for_dma(hcd, urb);
- usbmon_urb_complete(&hcd->self, urb, status);
- usb_unanchor_urb(urb);
+ /* pass status to tasklet via unlinked */
+ if (likely(!urb->unlinked))
+ urb->unlinked = status;
- /* pass ownership to the completion handler */
- urb->status = status;
- urb->complete (urb);
- atomic_dec (&urb->use_count);
- if (unlikely(atomic_read(&urb->reject)))
- wake_up (&usb_kill_urb_queue);
- usb_put_urb (urb);
+ if (!hcd_giveback_urb_in_bh(hcd) && !is_root_hub(urb->dev)) {
+ __usb_hcd_giveback_urb(urb);
+ return;
+ }
+
+ if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
+ bh = &hcd->high_prio_bh;
+ high_prio_bh = 1;
+ } else {
+ bh = &hcd->low_prio_bh;
+ high_prio_bh = 0;
+ }
+
+ spin_lock(&bh->lock);
+ list_add_tail(&urb->urb_list, &bh->head);
+ if (bh->running)
+ sched = 0;
+ else
+ sched = 1;
+ spin_unlock(&bh->lock);
+
+ if (!sched)
+ ;
+ else if (high_prio_bh)
+ tasklet_hi_schedule(&bh->bh);
+ else
+ tasklet_schedule(&bh->bh);
}
EXPORT_SYMBOL_GPL(usb_hcd_giveback_urb);
@@ -2307,6 +2376,42 @@ EXPORT_SYMBOL_GPL (usb_hc_died);
/*-------------------------------------------------------------------------*/
+static void __init_giveback_urb_bh(struct giveback_urb_bh *bh)
+{
+
+ spin_lock_init(&bh->lock);
+ INIT_LIST_HEAD(&bh->head);
+ tasklet_init(&bh->bh, usb_giveback_urb_bh, (unsigned long)bh);
+}
+
+static void init_giveback_urb_bh(struct usb_hcd *hcd)
+{
+ if (!hcd_giveback_urb_in_bh(hcd))
+ return;
+
+ __init_giveback_urb_bh(&hcd->high_prio_bh);
+ __init_giveback_urb_bh(&hcd->low_prio_bh);
+}
+
+static void exit_giveback_urb_bh(struct usb_hcd *hcd)
+{
+ if (!hcd_giveback_urb_in_bh(hcd))
+ return;
+
+ /*
+ * tasklet_kill() isn't needed here because:
+ * - driver's disconnec() called from usb_disconnect() should
+ * make sure its URBs are completed during the disconnect()
+ * callback
+ *
+ * - it is too late to run complete() here since driver may have
+ * been removed already now
+ *
+ * Using tasklet to run URB's complete() doesn't change this
+ * behavior of usbcore.
+ */
+}
+
/**
* usb_create_shared_hcd - create and initialize an HCD structure
* @driver: HC driver that will use this hcd
@@ -2573,6 +2678,8 @@ int usb_add_hcd(struct usb_hcd *hcd,
&& device_can_wakeup(&hcd->self.root_hub->dev))
dev_dbg(hcd->self.controller, "supports USB remote wakeup\n");
+ init_giveback_urb_bh(hcd);
+
/* enable irqs just before we start the controller,
* if the BIOS provides legacy PCI irqs.
*/
@@ -2681,6 +2788,8 @@ void usb_remove_hcd(struct usb_hcd *hcd)
usb_disconnect(&rhdev); /* Sets rhdev to NULL */
mutex_unlock(&usb_bus_list_lock);
+ exit_giveback_urb_bh(hcd);
+
/* Prevent any more root-hub status calls from the timer.
* The HCD might still restart the timer (if a port status change
* interrupt occurs), but usb_hcd_poll_rh_status() won't invoke
diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
index 1e88377..a9c7d44 100644
--- a/include/linux/usb/hcd.h
+++ b/include/linux/usb/hcd.h
@@ -22,6 +22,7 @@
#ifdef __KERNEL__
#include <linux/rwsem.h>
+#include <linux/interrupt.h>
#define MAX_TOPO_LEVEL 6
@@ -67,6 +68,13 @@
/*-------------------------------------------------------------------------*/
+struct giveback_urb_bh {
+ bool running;
+ spinlock_t lock;
+ struct list_head head;
+ struct tasklet_struct bh;
+};
+
struct usb_hcd {
/*
@@ -139,6 +147,9 @@ struct usb_hcd {
resource_size_t rsrc_len; /* memory/io resource length */
unsigned power_budget; /* in mA, 0 = no limit */
+ struct giveback_urb_bh high_prio_bh;
+ struct giveback_urb_bh low_prio_bh;
+
/* bandwidth_mutex should be taken before adding or removing
* any new bus bandwidth constraints:
* 1. Before adding a configuration for a new device.
@@ -221,6 +232,7 @@ struct hc_driver {
#define HCD_USB25 0x0030 /* Wireless USB 1.0 (USB 2.5)*/
#define HCD_USB3 0x0040 /* USB 3.0 */
#define HCD_MASK 0x0070
+#define HCD_BH 0x0100 /* URB complete in BH context */
/* called to init HCD and root hub */
int (*reset) (struct usb_hcd *hcd);
@@ -361,6 +373,11 @@ struct hc_driver {
int (*find_raw_port_number)(struct usb_hcd *, int);
};
+static inline int hcd_giveback_urb_in_bh(struct usb_hcd *hcd)
+{
+ return hcd->driver->flags & HCD_BH;
+}
+
extern int usb_hcd_link_urb_to_ep(struct usb_hcd *hcd, struct urb *urb);
extern int usb_hcd_check_unlink_urb(struct usb_hcd *hcd, struct urb *urb,
int status);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 2/4] USB: URB documentation: claim complete() will be run with IRQs enabled
2013-06-24 9:42 [PATCH v2 0/4] USB: HCD/EHCI: giveback of URB in tasklet context Ming Lei
2013-06-24 9:42 ` [PATCH v2 1/4] USB: HCD: support " Ming Lei
@ 2013-06-24 9:42 ` Ming Lei
2013-06-24 15:18 ` Alan Stern
2013-06-24 9:42 ` [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink Ming Lei
2013-06-24 9:42 ` [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context Ming Lei
3 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2013-06-24 9:42 UTC (permalink / raw)
To: linux-arm-kernel
There is no good reason to run complete() in hard interrupt
disabled context.
After switch to run complete() in tasklet, we will enable local IRQs
when calling complete() since we can do it at that time.
Even though we still disable IRQs now when calling complete()
in tasklet, the URB documentation is updated to claim complete()
will be run in tasklet context and local IRQs will be enabled, so
that USB drivers can know the change and avoid one deadlock caused
by: assume IRQs disabled in complete() and call spin_lock() to
hold lock which might be acquired in interrupt context.
Current spin_lock() usages in drivers' complete() will be cleaned
up at the same time, and once the cleanup is finished, local IRQs
will be enabled when calling complete() in tasklet.
Also fix description about type of usb_complete_t, and remove the
advice of running completion handler in tasklet for decreasing
system latency.
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
Documentation/usb/URB.txt | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/Documentation/usb/URB.txt b/Documentation/usb/URB.txt
index 00d2c64..50da0d4 100644
--- a/Documentation/usb/URB.txt
+++ b/Documentation/usb/URB.txt
@@ -195,13 +195,12 @@ by the completion handler.
The handler is of the following type:
- typedef void (*usb_complete_t)(struct urb *, struct pt_regs *)
+ typedef void (*usb_complete_t)(struct urb *)
-I.e., it gets the URB that caused the completion call, plus the
-register values at the time of the corresponding interrupt (if any).
-In the completion handler, you should have a look at urb->status to
-detect any USB errors. Since the context parameter is included in the URB,
-you can pass information to the completion handler.
+I.e., it gets the URB that caused the completion call. In the completion
+handler, you should have a look at urb->status to detect any USB errors.
+Since the context parameter is included in the URB, you can pass
+information to the completion handler.
Note that even when an error (or unlink) is reported, data may have been
transferred. That's because USB transfers are packetized; it might take
@@ -210,12 +209,12 @@ have transferred successfully before the completion was called.
NOTE: ***** WARNING *****
-NEVER SLEEP IN A COMPLETION HANDLER. These are normally called
-during hardware interrupt processing. If you can, defer substantial
-work to a tasklet (bottom half) to keep system latencies low. You'll
-probably need to use spinlocks to protect data structures you manipulate
-in completion handlers.
+NEVER SLEEP IN A COMPLETION HANDLER. These are often called in atomic
+context.
+In the current kernel, completion handlers run with local interrupts
+disabled, but in the future this will be changed, so don't assume that
+local IRQs are always disabled inside completion handlers.
1.8. How to do isochronous (ISO) transfers?
--
1.7.9.5
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 9:42 [PATCH v2 0/4] USB: HCD/EHCI: giveback of URB in tasklet context Ming Lei
2013-06-24 9:42 ` [PATCH v2 1/4] USB: HCD: support " Ming Lei
2013-06-24 9:42 ` [PATCH v2 2/4] USB: URB documentation: claim complete() will be run with IRQs enabled Ming Lei
@ 2013-06-24 9:42 ` Ming Lei
2013-06-24 10:28 ` Oliver Neukum
2013-06-24 18:53 ` Alan Stern
2013-06-24 9:42 ` [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context Ming Lei
3 siblings, 2 replies; 21+ messages in thread
From: Ming Lei @ 2013-06-24 9:42 UTC (permalink / raw)
To: linux-arm-kernel
Given interrupt URB will be resubmitted from tasklet context which
is scheduled by ehci hardware interrupt handler, and commonly only
one interrupt URB is scheduled on qh, so the qh may be unlinked
immediately once qh_completions() returns from ehci_irq(), then
the intr URB to be resubmitted in complete() can only be scheduled
and linked to hardware until the qh unlink is completed.
This patch improves this above situation, and the qh will wait for 5
milliseconds before being unlinked from hardware, if one URB is submitted
during the period, the qh is move out of unlink wait list and the
interrupt transfer can be scheduled immediately, not like before: the
transfer is only linked to hardware until previous unlink is completed.
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
drivers/usb/host/ehci-hcd.c | 1 +
drivers/usb/host/ehci-hub.c | 1 +
drivers/usb/host/ehci-mem.c | 1 +
drivers/usb/host/ehci-sched.c | 62 ++++++++++++++++++++++++++++++++++++++---
drivers/usb/host/ehci-timer.c | 48 ++++++++++++++++++++++++++++++-
drivers/usb/host/ehci.h | 3 ++
6 files changed, 111 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 7abf1ce..35f1372 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -487,6 +487,7 @@ static int ehci_init(struct usb_hcd *hcd)
ehci->periodic_size = DEFAULT_I_TDPS;
INIT_LIST_HEAD(&ehci->async_unlink);
INIT_LIST_HEAD(&ehci->async_idle);
+ INIT_LIST_HEAD(&ehci->intr_unlink_wait);
INIT_LIST_HEAD(&ehci->intr_unlink);
INIT_LIST_HEAD(&ehci->intr_qh_list);
INIT_LIST_HEAD(&ehci->cached_itd_list);
diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
index 2b70277..6037f84 100644
--- a/drivers/usb/host/ehci-hub.c
+++ b/drivers/usb/host/ehci-hub.c
@@ -345,6 +345,7 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)
end_unlink_async(ehci);
unlink_empty_async_suspended(ehci);
+ ehci_handle_start_intr_unlinks(ehci);
ehci_handle_intr_unlinks(ehci);
end_free_itds(ehci);
diff --git a/drivers/usb/host/ehci-mem.c b/drivers/usb/host/ehci-mem.c
index ef2c3a1..52a7773 100644
--- a/drivers/usb/host/ehci-mem.c
+++ b/drivers/usb/host/ehci-mem.c
@@ -93,6 +93,7 @@ static struct ehci_qh *ehci_qh_alloc (struct ehci_hcd *ehci, gfp_t flags)
qh->qh_dma = dma;
// INIT_LIST_HEAD (&qh->qh_list);
INIT_LIST_HEAD (&qh->qtd_list);
+ INIT_LIST_HEAD(&qh->unlink_node);
/* dummy td enables safe urb queuing */
qh->dummy = ehci_qtd_alloc (ehci, flags);
diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
index f80d033..5bf67e2 100644
--- a/drivers/usb/host/ehci-sched.c
+++ b/drivers/usb/host/ehci-sched.c
@@ -601,12 +601,24 @@ static void qh_unlink_periodic(struct ehci_hcd *ehci, struct ehci_qh *qh)
list_del(&qh->intr_node);
}
-static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
+/* must be called with holding ehci->lock */
+static void cancel_unlink_wait_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
{
- /* If the QH isn't linked then there's nothing we can do. */
- if (qh->qh_state != QH_STATE_LINKED)
+ if (qh->qh_state != QH_STATE_LINKED || list_empty(&qh->unlink_node))
return;
+ list_del_init(&qh->unlink_node);
+
+ /* avoid unnecessary CPU wakeup */
+ if (list_empty(&ehci->intr_unlink_wait))
+ ehci_disable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR);
+}
+
+static void start_do_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
+{
+ /* if the qh is waitting for unlink, cancel it now */
+ cancel_unlink_wait_intr(ehci, qh);
+
qh_unlink_periodic (ehci, qh);
/* Make sure the unlinks are visible before starting the timer */
@@ -632,6 +644,45 @@ static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
}
}
+/*
+ * It is common only one intr URB is scheduled on one qh, and
+ * given complete() is run in tasklet context, introduce a bit
+ * delay to avoid unlink qh too early.
+ */
+static void __start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh,
+ bool wait)
+{
+ /* If the QH isn't linked then there's nothing we can do. */
+ if (qh->qh_state != QH_STATE_LINKED)
+ return;
+
+ if (!wait)
+ return start_do_unlink_intr(ehci, qh);
+
+ qh->unlink_cycle = ehci->intr_unlink_wait_cycle;
+
+ /* New entries go at the end of the intr_unlink_wait list */
+ list_add_tail(&qh->unlink_node, &ehci->intr_unlink_wait);
+
+ if (ehci->rh_state < EHCI_RH_RUNNING)
+ ehci_handle_start_intr_unlinks(ehci);
+ else if (ehci->intr_unlink_wait.next == &qh->unlink_node) {
+ ehci_enable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR, true);
+ ++ehci->intr_unlink_wait_cycle;
+ }
+}
+
+static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
+{
+ __start_unlink_intr(ehci, qh, false);
+}
+
+static void start_unlink_intr_wait(struct ehci_hcd *ehci,
+ struct ehci_qh *qh)
+{
+ __start_unlink_intr(ehci, qh, true);
+}
+
static void end_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
{
struct ehci_qh_hw *hw = qh->hw;
@@ -881,6 +932,9 @@ static int intr_submit (
goto done;
}
+ /* put back qh from unlink wait list */
+ cancel_unlink_wait_intr(ehci, qh);
+
/* then queue the urb's tds to the qh */
qh = qh_append_tds(ehci, urb, qtd_list, epnum, &urb->ep->hcpriv);
BUG_ON (qh == NULL);
@@ -926,7 +980,7 @@ static void scan_intr(struct ehci_hcd *ehci)
temp = qh_completions(ehci, qh);
if (unlikely(temp || (list_empty(&qh->qtd_list) &&
qh->qh_state == QH_STATE_LINKED)))
- start_unlink_intr(ehci, qh);
+ start_unlink_intr_wait(ehci, qh);
}
}
}
diff --git a/drivers/usb/host/ehci-timer.c b/drivers/usb/host/ehci-timer.c
index 11e5b32..9fef5d4 100644
--- a/drivers/usb/host/ehci-timer.c
+++ b/drivers/usb/host/ehci-timer.c
@@ -72,6 +72,7 @@ static unsigned event_delays_ns[] = {
1 * NSEC_PER_MSEC, /* EHCI_HRTIMER_POLL_DEAD */
1125 * NSEC_PER_USEC, /* EHCI_HRTIMER_UNLINK_INTR */
2 * NSEC_PER_MSEC, /* EHCI_HRTIMER_FREE_ITDS */
+ 5 * NSEC_PER_MSEC, /* EHCI_HRTIMER_START_UNLINK_INTR */
6 * NSEC_PER_MSEC, /* EHCI_HRTIMER_ASYNC_UNLINKS */
10 * NSEC_PER_MSEC, /* EHCI_HRTIMER_IAA_WATCHDOG */
10 * NSEC_PER_MSEC, /* EHCI_HRTIMER_DISABLE_PERIODIC */
@@ -98,6 +99,19 @@ static void ehci_enable_event(struct ehci_hcd *ehci, unsigned event,
}
}
+/* Warning: don't call this function from hrtimer handler context */
+static void ehci_disable_event(struct ehci_hcd *ehci, unsigned event)
+{
+ if (!hcd_giveback_urb_in_bh(ehci_to_hcd(ehci)))
+ return;
+
+ ehci->enabled_hrtimer_events &= ~(1 << event);
+ if (!ehci->enabled_hrtimer_events) {
+ ehci->next_hrtimer_event = EHCI_HRTIMER_NO_EVENT;
+ hrtimer_cancel(&ehci->hrtimer);
+ }
+}
+
/* Poll the STS_ASS status bit; see when it agrees with CMD_ASE */
static void ehci_poll_ASS(struct ehci_hcd *ehci)
@@ -215,6 +229,37 @@ static void ehci_handle_controller_death(struct ehci_hcd *ehci)
/* Not in process context, so don't try to reset the controller */
}
+/* start to unlink interrupt QHs */
+static void ehci_handle_start_intr_unlinks(struct ehci_hcd *ehci)
+{
+ bool stopped = (ehci->rh_state < EHCI_RH_RUNNING);
+
+ /*
+ * Process all the QHs on the intr_unlink list that were added
+ * before the current unlink cycle began. The list is in
+ * temporal order, so stop when we reach the first entry in the
+ * current cycle. But if the root hub isn't running then
+ * process all the QHs on the list.
+ */
+ while (!list_empty(&ehci->intr_unlink_wait)) {
+ struct ehci_qh *qh;
+
+ qh = list_first_entry(&ehci->intr_unlink_wait,
+ struct ehci_qh, unlink_node);
+ if (!stopped &&
+ (qh->unlink_cycle ==
+ ehci->intr_unlink_wait_cycle))
+ break;
+ list_del_init(&qh->unlink_node);
+ start_unlink_intr(ehci, qh);
+ }
+
+ /* Handle remaining entries later */
+ if (!list_empty(&ehci->intr_unlink_wait)) {
+ ehci_enable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR, true);
+ ++ehci->intr_unlink_wait_cycle;
+ }
+}
/* Handle unlinked interrupt QHs once they are gone from the hardware */
static void ehci_handle_intr_unlinks(struct ehci_hcd *ehci)
@@ -236,7 +281,7 @@ static void ehci_handle_intr_unlinks(struct ehci_hcd *ehci)
unlink_node);
if (!stopped && qh->unlink_cycle == ehci->intr_unlink_cycle)
break;
- list_del(&qh->unlink_node);
+ list_del_init(&qh->unlink_node);
end_unlink_intr(ehci, qh);
}
@@ -363,6 +408,7 @@ static void (*event_handlers[])(struct ehci_hcd *) = {
ehci_handle_controller_death, /* EHCI_HRTIMER_POLL_DEAD */
ehci_handle_intr_unlinks, /* EHCI_HRTIMER_UNLINK_INTR */
end_free_itds, /* EHCI_HRTIMER_FREE_ITDS */
+ ehci_handle_start_intr_unlinks, /* EHCI_HRTIMER_START_UNLINK_INTR */
unlink_empty_async, /* EHCI_HRTIMER_ASYNC_UNLINKS */
ehci_iaa_watchdog, /* EHCI_HRTIMER_IAA_WATCHDOG */
ehci_disable_PSE, /* EHCI_HRTIMER_DISABLE_PERIODIC */
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index 64f9a08..28dc8d2 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -88,6 +88,7 @@ enum ehci_hrtimer_event {
EHCI_HRTIMER_POLL_DEAD, /* Wait for dead controller to stop */
EHCI_HRTIMER_UNLINK_INTR, /* Wait for interrupt QH unlink */
EHCI_HRTIMER_FREE_ITDS, /* Wait for unused iTDs and siTDs */
+ EHCI_HRTIMER_START_UNLINK_INTR, /* wait for new intr schedule */
EHCI_HRTIMER_ASYNC_UNLINKS, /* Unlink empty async QHs */
EHCI_HRTIMER_IAA_WATCHDOG, /* Handle lost IAA interrupts */
EHCI_HRTIMER_DISABLE_PERIODIC, /* Wait to disable periodic sched */
@@ -143,6 +144,8 @@ struct ehci_hcd { /* one per controller */
unsigned i_thresh; /* uframes HC might cache */
union ehci_shadow *pshadow; /* mirror hw periodic table */
+ struct list_head intr_unlink_wait;
+ unsigned intr_unlink_wait_cycle;
struct list_head intr_unlink;
unsigned intr_unlink_cycle;
unsigned now_frame; /* frame from HC hardware */
--
1.7.9.5
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context
2013-06-24 9:42 [PATCH v2 0/4] USB: HCD/EHCI: giveback of URB in tasklet context Ming Lei
` (2 preceding siblings ...)
2013-06-24 9:42 ` [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink Ming Lei
@ 2013-06-24 9:42 ` Ming Lei
2013-06-24 10:24 ` Oliver Neukum
3 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2013-06-24 9:42 UTC (permalink / raw)
To: linux-arm-kernel
All 4 transfer types can work well on EHCI HCD after switching to run
URB giveback in tasklet context, so mark all HCD drivers to support
it.
At the same time, don't release ehci->lock during URB giveback,
and remove the check on HCD_BH in ehci_disable_event().
>From below test results on 3 machines(2 ARM and one x86), time
consumed by EHCI interrupt handler droped much without performance
loss.
1 test description
1.1 mass storage performance test:
- run below command 10 times and compute the average performance
dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1
- two usb mass storage device:
A: sandisk extreme USB 3.0 16G(used in test case 1 & case 2)
B: kingston DataTraveler G2 4GB(only used in test case 2)
1.2 uvc function test:
- run one simple capture program in the below link
http://kernel.ubuntu.com/~ming/up/capture.c
- capture format 640*480 and results in High Bandwidth mode on the
uvc device: Z-Star 0x0ac8/0x3450
- on T410(x86) laptop, also use guvcview to watch video capture/playback
1.3 about test2 and test4
- both two devices involved are tested concurrently by above test items
1.4 how to compute irq time(the time consumed by ehci_irq)
- use trace points of irq:irq_handler_entry and irq:irq_handler_exit
1.5 kernel
3.10.0-rc3-next-20130528
1.6 test machines
Pandaboard A1: ARM CortexA9 dural core
Arndale board: ARM CortexA15 dural core
T410: i5 CPU 2.67GHz quad core
2 test result
2.1 test case1: single mass storage device performance test
--------------------------------------------------------------------
upstream | patched
perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1: 25.280(avg:145,max:772) | 25.540(avg:14, max:75)
Arndale board: 29.700(avg:33, max:129) | 29.700(avg:10, max:50)
T410: 34.430(avg:17, max:154*)| 34.660(avg:12, max:155)
---------------------------------------------------------------------
2.2 test case2: two mass storage devices' performance test
--------------------------------------------------------------------
upstream | patched
perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1: 15.840/15.580(avg:158,max:1216) | 16.500/16.160(avg:15,max:139)
Arndale board: 17.370/16.220(avg:33 max:234) | 17.480/16.200(avg:11, max:91)
T410: 21.180/19.820(avg:18 max:160) | 21.220/19.880(avg:11, max:149)
---------------------------------------------------------------------
2.3 test case3: one uvc streaming test
- uvc device works well(on x86, luvcview can be used too and has
same result with uvc capture)
--------------------------------------------------------------------
upstream | patched
irq time(us) | irq time(us)
--------------------------------------------------------------------
Pandaboard A1: (avg:445, max:873) | (avg:33, max:44)
Arndale board: (avg:316, max:630) | (avg:20, max:27)
T410: (avg:39, max:107) | (avg:10, max:65)
---------------------------------------------------------------------
2.4 test case4: one uvc streaming plus one mass storage device test
--------------------------------------------------------------------
upstream | patched
perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1: 20.340(avg:259,max:1704)| 20.390(avg:24, max:101)
Arndale board: 23.460(avg:124,max:726) | 23.370(avg:15, max:52)
T410: 28.520(avg:27, max:169) | 28.630(avg:13, max:160)
---------------------------------------------------------------------
* On T410, sometimes read ehci status register in ehci_irq takes more
than 100us, and the problem has been reported on the link:
http://marc.info/?t=137065867300001&r=1&w=2
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
drivers/usb/host/ehci-fsl.c | 2 +-
drivers/usb/host/ehci-grlib.c | 2 +-
drivers/usb/host/ehci-hcd.c | 2 +-
drivers/usb/host/ehci-mv.c | 2 +-
drivers/usb/host/ehci-octeon.c | 2 +-
drivers/usb/host/ehci-pmcmsp.c | 2 +-
drivers/usb/host/ehci-ppc-of.c | 2 +-
drivers/usb/host/ehci-ps3.c | 2 +-
drivers/usb/host/ehci-q.c | 5 -----
drivers/usb/host/ehci-sead3.c | 2 +-
drivers/usb/host/ehci-sh.c | 2 +-
drivers/usb/host/ehci-tilegx.c | 2 +-
drivers/usb/host/ehci-timer.c | 3 ---
drivers/usb/host/ehci-w90x900.c | 2 +-
drivers/usb/host/ehci-xilinx-of.c | 2 +-
15 files changed, 13 insertions(+), 21 deletions(-)
diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
index bd831ec..330274a 100644
--- a/drivers/usb/host/ehci-fsl.c
+++ b/drivers/usb/host/ehci-fsl.c
@@ -669,7 +669,7 @@ static const struct hc_driver ehci_fsl_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_USB2 | HCD_MEMORY,
+ .flags = HCD_USB2 | HCD_MEMORY | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-grlib.c b/drivers/usb/host/ehci-grlib.c
index a77bd8d..2905004 100644
--- a/drivers/usb/host/ehci-grlib.c
+++ b/drivers/usb/host/ehci-grlib.c
@@ -43,7 +43,7 @@ static const struct hc_driver ehci_grlib_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 35f1372..fe27038 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -1170,7 +1170,7 @@ static const struct hc_driver ehci_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-mv.c b/drivers/usb/host/ehci-mv.c
index 915c2db..ce18a36 100644
--- a/drivers/usb/host/ehci-mv.c
+++ b/drivers/usb/host/ehci-mv.c
@@ -96,7 +96,7 @@ static const struct hc_driver mv_ehci_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-octeon.c b/drivers/usb/host/ehci-octeon.c
index 45cc001..ab0397e 100644
--- a/drivers/usb/host/ehci-octeon.c
+++ b/drivers/usb/host/ehci-octeon.c
@@ -51,7 +51,7 @@ static const struct hc_driver ehci_octeon_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-pmcmsp.c b/drivers/usb/host/ehci-pmcmsp.c
index 601e208..893b707 100644
--- a/drivers/usb/host/ehci-pmcmsp.c
+++ b/drivers/usb/host/ehci-pmcmsp.c
@@ -286,7 +286,7 @@ static const struct hc_driver ehci_msp_hc_driver = {
#else
.irq = ehci_irq,
#endif
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-ppc-of.c b/drivers/usb/host/ehci-ppc-of.c
index 86da09c..014d37b 100644
--- a/drivers/usb/host/ehci-ppc-of.c
+++ b/drivers/usb/host/ehci-ppc-of.c
@@ -28,7 +28,7 @@ static const struct hc_driver ehci_ppc_of_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-ps3.c b/drivers/usb/host/ehci-ps3.c
index fd98377..8188542 100644
--- a/drivers/usb/host/ehci-ps3.c
+++ b/drivers/usb/host/ehci-ps3.c
@@ -71,7 +71,7 @@ static const struct hc_driver ps3_ehci_hc_driver = {
.product_desc = "PS3 EHCI Host Controller",
.hcd_priv_size = sizeof(struct ehci_hcd),
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
.reset = ps3_ehci_hc_reset,
.start = ehci_run,
.stop = ehci_stop,
diff --git a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c
index d34b399..b637a65 100644
--- a/drivers/usb/host/ehci-q.c
+++ b/drivers/usb/host/ehci-q.c
@@ -254,8 +254,6 @@ static int qtd_copy_status (
static void
ehci_urb_done(struct ehci_hcd *ehci, struct urb *urb, int status)
-__releases(ehci->lock)
-__acquires(ehci->lock)
{
if (usb_pipetype(urb->pipe) == PIPE_INTERRUPT) {
/* ... update hc-wide periodic stats */
@@ -281,11 +279,8 @@ __acquires(ehci->lock)
urb->actual_length, urb->transfer_buffer_length);
#endif
- /* complete() can reenter this HCD */
usb_hcd_unlink_urb_from_ep(ehci_to_hcd(ehci), urb);
- spin_unlock (&ehci->lock);
usb_hcd_giveback_urb(ehci_to_hcd(ehci), urb, status);
- spin_lock (&ehci->lock);
}
static int qh_schedule (struct ehci_hcd *ehci, struct ehci_qh *qh);
diff --git a/drivers/usb/host/ehci-sead3.c b/drivers/usb/host/ehci-sead3.c
index b2de52d..8a73449 100644
--- a/drivers/usb/host/ehci-sead3.c
+++ b/drivers/usb/host/ehci-sead3.c
@@ -55,7 +55,7 @@ const struct hc_driver ehci_sead3_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-sh.c b/drivers/usb/host/ehci-sh.c
index c4c0ee9..1691f8e 100644
--- a/drivers/usb/host/ehci-sh.c
+++ b/drivers/usb/host/ehci-sh.c
@@ -36,7 +36,7 @@ static const struct hc_driver ehci_sh_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_USB2 | HCD_MEMORY,
+ .flags = HCD_USB2 | HCD_MEMORY | HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-tilegx.c b/drivers/usb/host/ehci-tilegx.c
index d72b292..204d3b6 100644
--- a/drivers/usb/host/ehci-tilegx.c
+++ b/drivers/usb/host/ehci-tilegx.c
@@ -61,7 +61,7 @@ static const struct hc_driver ehci_tilegx_hc_driver = {
* Generic hardware linkage.
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* Basic lifecycle operations.
diff --git a/drivers/usb/host/ehci-timer.c b/drivers/usb/host/ehci-timer.c
index 9fef5d4..9970746 100644
--- a/drivers/usb/host/ehci-timer.c
+++ b/drivers/usb/host/ehci-timer.c
@@ -102,9 +102,6 @@ static void ehci_enable_event(struct ehci_hcd *ehci, unsigned event,
/* Warning: don't call this function from hrtimer handler context */
static void ehci_disable_event(struct ehci_hcd *ehci, unsigned event)
{
- if (!hcd_giveback_urb_in_bh(ehci_to_hcd(ehci)))
- return;
-
ehci->enabled_hrtimer_events &= ~(1 << event);
if (!ehci->enabled_hrtimer_events) {
ehci->next_hrtimer_event = EHCI_HRTIMER_NO_EVENT;
diff --git a/drivers/usb/host/ehci-w90x900.c b/drivers/usb/host/ehci-w90x900.c
index 59e0e24..1c370df 100644
--- a/drivers/usb/host/ehci-w90x900.c
+++ b/drivers/usb/host/ehci-w90x900.c
@@ -108,7 +108,7 @@ static const struct hc_driver ehci_w90x900_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_USB2|HCD_MEMORY,
+ .flags = HCD_USB2|HCD_MEMORY|HCD_BH,
/*
* basic lifecycle operations
diff --git a/drivers/usb/host/ehci-xilinx-of.c b/drivers/usb/host/ehci-xilinx-of.c
index 35c7f90..c6591ea 100644
--- a/drivers/usb/host/ehci-xilinx-of.c
+++ b/drivers/usb/host/ehci-xilinx-of.c
@@ -79,7 +79,7 @@ static const struct hc_driver ehci_xilinx_of_hc_driver = {
* generic hardware linkage
*/
.irq = ehci_irq,
- .flags = HCD_MEMORY | HCD_USB2,
+ .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
/*
* basic lifecycle operations
--
1.7.9.5
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context
2013-06-24 9:42 ` [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context Ming Lei
@ 2013-06-24 10:24 ` Oliver Neukum
2013-06-24 12:58 ` Ming Lei
0 siblings, 1 reply; 21+ messages in thread
From: Oliver Neukum @ 2013-06-24 10:24 UTC (permalink / raw)
To: linux-arm-kernel
On Monday 24 June 2013 17:42:05 Ming Lei wrote:
> All 4 transfer types can work well on EHCI HCD after switching to run
> URB giveback in tasklet context, so mark all HCD drivers to support
> it.
>
> At the same time, don't release ehci->lock during URB giveback,
> and remove the check on HCD_BH in ehci_disable_event().
>
> From below test results on 3 machines(2 ARM and one x86), time
> consumed by EHCI interrupt handler droped much without performance
> loss.
>
> 1 test description
> 1.1 mass storage performance test:
> - run below command 10 times and compute the average performance
>
> dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1
It would be nice to get worst case numbers. How bad does it get
if you reduce the sg size in usb-storage from 120K to 4K?
Regards
Oliver
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 9:42 ` [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink Ming Lei
@ 2013-06-24 10:28 ` Oliver Neukum
2013-06-24 11:16 ` Ming Lei
2013-06-24 18:53 ` Alan Stern
1 sibling, 1 reply; 21+ messages in thread
From: Oliver Neukum @ 2013-06-24 10:28 UTC (permalink / raw)
To: linux-arm-kernel
On Monday 24 June 2013 17:42:04 Ming Lei wrote:
> This patch improves this above situation, and the qh will wait for 5
> milliseconds before being unlinked from hardware, if one URB is submitted
> during the period, the qh is move out of unlink wait list and the
> interrupt transfer can be scheduled immediately, not like before: the
> transfer is only linked to hardware until previous unlink is completed.
It seems to me that this logic should be used only if the URB completed
without error.
Regards
Oliver
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 10:28 ` Oliver Neukum
@ 2013-06-24 11:16 ` Ming Lei
2013-06-24 12:08 ` Oliver Neukum
0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2013-06-24 11:16 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 24, 2013 at 6:28 PM, Oliver Neukum <oliver@neukum.org> wrote:
> On Monday 24 June 2013 17:42:04 Ming Lei wrote:
>> This patch improves this above situation, and the qh will wait for 5
>> milliseconds before being unlinked from hardware, if one URB is submitted
>> during the period, the qh is move out of unlink wait list and the
>> interrupt transfer can be scheduled immediately, not like before: the
>> transfer is only linked to hardware until previous unlink is completed.
>
> It seems to me that this logic should be used only if the URB completed
> without error.
The current completed URB with error doesn't mean the later submitted
URB will complete with error, so I don't think the logic is related with URB
completion error.
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/4] USB: HCD: support giveback of URB in tasklet context
2013-06-24 9:42 ` [PATCH v2 1/4] USB: HCD: support " Ming Lei
@ 2013-06-24 11:49 ` Ming Lei
2013-06-24 15:17 ` Alan Stern
1 sibling, 0 replies; 21+ messages in thread
From: Ming Lei @ 2013-06-24 11:49 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 24, 2013 at 5:42 PM, Ming Lei <ming.lei@canonical.com> wrote:
> +
> +static void init_giveback_urb_bh(struct usb_hcd *hcd)
> +{
> + if (!hcd_giveback_urb_in_bh(hcd))
> + return;
Sorry, the above check isn't needed, and the below check isn't
needed too. I will fix it in V3.
> +
> + __init_giveback_urb_bh(&hcd->high_prio_bh);
> + __init_giveback_urb_bh(&hcd->low_prio_bh);
> +}
> +
> +static void exit_giveback_urb_bh(struct usb_hcd *hcd)
> +{
> + if (!hcd_giveback_urb_in_bh(hcd))
> + return;
> +
> + /*
> + * tasklet_kill() isn't needed here because:
> + * - driver's disconnec() called from usb_disconnect() should
> + * make sure its URBs are completed during the disconnect()
> + * callback
> + *
> + * - it is too late to run complete() here since driver may have
> + * been removed already now
> + *
> + * Using tasklet to run URB's complete() doesn't change this
> + * behavior of usbcore.
> + */
> +}
> +
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 11:16 ` Ming Lei
@ 2013-06-24 12:08 ` Oliver Neukum
2013-06-24 12:17 ` Ming Lei
0 siblings, 1 reply; 21+ messages in thread
From: Oliver Neukum @ 2013-06-24 12:08 UTC (permalink / raw)
To: linux-arm-kernel
On Monday 24 June 2013 19:16:43 Ming Lei wrote:
> On Mon, Jun 24, 2013 at 6:28 PM, Oliver Neukum <oliver@neukum.org> wrote:
> > On Monday 24 June 2013 17:42:04 Ming Lei wrote:
> >> This patch improves this above situation, and the qh will wait for 5
> >> milliseconds before being unlinked from hardware, if one URB is submitted
> >> during the period, the qh is move out of unlink wait list and the
> >> interrupt transfer can be scheduled immediately, not like before: the
> >> transfer is only linked to hardware until previous unlink is completed.
> >
> > It seems to me that this logic should be used only if the URB completed
> > without error.
>
> The current completed URB with error doesn't mean the later submitted
> URB will complete with error, so I don't think the logic is related with URB
> completion error.
But
a) it is likelier
b) the driver will start error handling
And what about the bandwidth? At least if you unlink the URB, its bandwidth
should be given back immediately.
Regards
Oliver
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 12:08 ` Oliver Neukum
@ 2013-06-24 12:17 ` Ming Lei
0 siblings, 0 replies; 21+ messages in thread
From: Ming Lei @ 2013-06-24 12:17 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 24, 2013 at 8:08 PM, Oliver Neukum <oliver@neukum.org> wrote:
> On Monday 24 June 2013 19:16:43 Ming Lei wrote:
>> On Mon, Jun 24, 2013 at 6:28 PM, Oliver Neukum <oliver@neukum.org> wrote:
>> > On Monday 24 June 2013 17:42:04 Ming Lei wrote:
>> >> This patch improves this above situation, and the qh will wait for 5
>> >> milliseconds before being unlinked from hardware, if one URB is submitted
>> >> during the period, the qh is move out of unlink wait list and the
>> >> interrupt transfer can be scheduled immediately, not like before: the
>> >> transfer is only linked to hardware until previous unlink is completed.
>> >
>> > It seems to me that this logic should be used only if the URB completed
>> > without error.
>>
>> The current completed URB with error doesn't mean the later submitted
>> URB will complete with error, so I don't think the logic is related with URB
>> completion error.
>
> But
>
> a) it is likelier
> b) the driver will start error handling
>
> And what about the bandwidth? At least if you unlink the URB, its bandwidth
> should be given back immediately.
Yes, the patch doesn't affect unlinking intr URB because the
behavior of start_unlink_intr() isn't changed basically.
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context
2013-06-24 10:24 ` Oliver Neukum
@ 2013-06-24 12:58 ` Ming Lei
2013-06-24 13:06 ` Oliver Neukum
0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2013-06-24 12:58 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 24, 2013 at 6:24 PM, Oliver Neukum <oliver@neukum.org> wrote:
> On Monday 24 June 2013 17:42:05 Ming Lei wrote:
>> All 4 transfer types can work well on EHCI HCD after switching to run
>> URB giveback in tasklet context, so mark all HCD drivers to support
>> it.
>>
>> At the same time, don't release ehci->lock during URB giveback,
>> and remove the check on HCD_BH in ehci_disable_event().
>>
>> From below test results on 3 machines(2 ARM and one x86), time
>> consumed by EHCI interrupt handler droped much without performance
>> loss.
>>
>> 1 test description
>> 1.1 mass storage performance test:
>> - run below command 10 times and compute the average performance
>>
>> dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1
>
> It would be nice to get worst case numbers. How bad does it get
> if you reduce the sg size in usb-storage from 120K to 4K?
A quick test on one arm A15 box shows that the average speed over
10 times 'dd' becomes 8.0MB/sec from 8.160MB/sec when 'bs'
parameter of 'dd' changes to 4K, so there is ~1.9% performance
loss with the patch under the worst case.
Same test on my T410(x86), the speed difference is only 40K.
I will collect the worst case numbers and include it in the commit
log of V3.
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context
2013-06-24 12:58 ` Ming Lei
@ 2013-06-24 13:06 ` Oliver Neukum
2013-06-24 13:14 ` Ming Lei
0 siblings, 1 reply; 21+ messages in thread
From: Oliver Neukum @ 2013-06-24 13:06 UTC (permalink / raw)
To: linux-arm-kernel
On Monday 24 June 2013 20:58:26 Ming Lei wrote:
> On Mon, Jun 24, 2013 at 6:24 PM, Oliver Neukum <oliver@neukum.org> wrote:
> > On Monday 24 June 2013 17:42:05 Ming Lei wrote:
> >> All 4 transfer types can work well on EHCI HCD after switching to run
> >> URB giveback in tasklet context, so mark all HCD drivers to support
> >> it.
> >>
> >> At the same time, don't release ehci->lock during URB giveback,
> >> and remove the check on HCD_BH in ehci_disable_event().
> >>
> >> From below test results on 3 machines(2 ARM and one x86), time
> >> consumed by EHCI interrupt handler droped much without performance
> >> loss.
> >>
> >> 1 test description
> >> 1.1 mass storage performance test:
> >> - run below command 10 times and compute the average performance
> >>
> >> dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1
> >
> > It would be nice to get worst case numbers. How bad does it get
> > if you reduce the sg size in usb-storage from 120K to 4K?
>
> A quick test on one arm A15 box shows that the average speed over
> 10 times 'dd' becomes 8.0MB/sec from 8.160MB/sec when 'bs'
> parameter of 'dd' changes to 4K, so there is ~1.9% performance
> loss with the patch under the worst case.
>
> Same test on my T410(x86), the speed difference is only 40K.
>
> I will collect the worst case numbers and include it in the commit
> log of V3.
Sorry,
I was referring to scsiglue.c
struct scsi_host_template usb_stor_host_template = {
/* basic userland interface stuff */
.name = "usb-storage",
.proc_name = "usb-storage",
.proc_info = proc_info,
.info = host_info,
/* command interface -- queued only */
.queuecommand = queuecommand,
/* error and abort handlers */
.eh_abort_handler = command_abort,
.eh_device_reset_handler = device_reset,
.eh_bus_reset_handler = bus_reset,
/* queue commands only, only one command per LUN */
.can_queue = 1,
.cmd_per_lun = 1,
/* unknown initiator id */
.this_id = -1,
.slave_alloc = slave_alloc,
.slave_configure = slave_configure,
/* lots of sg segments can be handled */
.sg_tablesize = SCSI_MAX_SG_CHAIN_SEGMENTS,
/* limit the total size of a transfer to 120 KB */
.max_sectors = 240,
If you go to 8 sectors here, you should get the absolute worst case.
Regards
Oliver
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context
2013-06-24 13:06 ` Oliver Neukum
@ 2013-06-24 13:14 ` Ming Lei
0 siblings, 0 replies; 21+ messages in thread
From: Ming Lei @ 2013-06-24 13:14 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 24, 2013 at 9:06 PM, Oliver Neukum <oliver@neukum.org> wrote:
> On Monday 24 June 2013 20:58:26 Ming Lei wrote:
>> On Mon, Jun 24, 2013 at 6:24 PM, Oliver Neukum <oliver@neukum.org> wrote:
>> > On Monday 24 June 2013 17:42:05 Ming Lei wrote:
>> >> All 4 transfer types can work well on EHCI HCD after switching to run
>> >> URB giveback in tasklet context, so mark all HCD drivers to support
>> >> it.
>> >>
>> >> At the same time, don't release ehci->lock during URB giveback,
>> >> and remove the check on HCD_BH in ehci_disable_event().
>> >>
>> >> From below test results on 3 machines(2 ARM and one x86), time
>> >> consumed by EHCI interrupt handler droped much without performance
>> >> loss.
>> >>
>> >> 1 test description
>> >> 1.1 mass storage performance test:
>> >> - run below command 10 times and compute the average performance
>> >>
>> >> dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1
>> >
>> > It would be nice to get worst case numbers. How bad does it get
>> > if you reduce the sg size in usb-storage from 120K to 4K?
>>
>> A quick test on one arm A15 box shows that the average speed over
>> 10 times 'dd' becomes 8.0MB/sec from 8.160MB/sec when 'bs'
>> parameter of 'dd' changes to 4K, so there is ~1.9% performance
>> loss with the patch under the worst case.
>>
>> Same test on my T410(x86), the speed difference is only 40K.
>>
>> I will collect the worst case numbers and include it in the commit
>> log of V3.
>
> Sorry,
>
> I was referring to scsiglue.c
>
> struct scsi_host_template usb_stor_host_template = {
> /* basic userland interface stuff */
> .name = "usb-storage",
> .proc_name = "usb-storage",
> .proc_info = proc_info,
> .info = host_info,
>
> /* command interface -- queued only */
> .queuecommand = queuecommand,
>
> /* error and abort handlers */
> .eh_abort_handler = command_abort,
> .eh_device_reset_handler = device_reset,
> .eh_bus_reset_handler = bus_reset,
>
> /* queue commands only, only one command per LUN */
> .can_queue = 1,
> .cmd_per_lun = 1,
>
> /* unknown initiator id */
> .this_id = -1,
>
> .slave_alloc = slave_alloc,
> .slave_configure = slave_configure,
>
> /* lots of sg segments can be handled */
> .sg_tablesize = SCSI_MAX_SG_CHAIN_SEGMENTS,
>
> /* limit the total size of a transfer to 120 KB */
> .max_sectors = 240,
>
> If you go to 8 sectors here, you should get the absolute worst case.
If you check trace of usbmon, 'dd if=/dev/sda iflag=direct bs=4k count=xxx'
generates the 4k data stage per transfer(cbw/data/csw).
So there is no difference between them.
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/4] USB: HCD: support giveback of URB in tasklet context
2013-06-24 9:42 ` [PATCH v2 1/4] USB: HCD: support " Ming Lei
2013-06-24 11:49 ` Ming Lei
@ 2013-06-24 15:17 ` Alan Stern
2013-06-25 11:30 ` Ming Lei
1 sibling, 1 reply; 21+ messages in thread
From: Alan Stern @ 2013-06-24 15:17 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 24 Jun 2013, Ming Lei wrote:
> This patch implements the mechanism of giveback of URB in
> tasklet context, so that hardware interrupt handling time for
> usb host controller can be saved much, and HCD interrupt handling
> can be simplified.
Changes from v1 to v2?
> +static void usb_giveback_urb_bh(unsigned long param)
> +{
> + struct giveback_urb_bh *bh = (struct giveback_urb_bh *)param;
> + unsigned long flags;
> + struct list_head local_list;
> +
> + spin_lock_irqsave(&bh->lock, flags);
> + bh->running = 1;
> + restart:
> + list_replace_init(&bh->head, &local_list);
> + spin_unlock_irqrestore(&bh->lock, flags);
Tasklet routines are always called with interrupts enabled, right?
Therefore you don't need to use the "flags" argument here or below.
> @@ -1667,25 +1721,40 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
> */
> void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
> {
> - urb->hcpriv = NULL;
> - if (unlikely(urb->unlinked))
> - status = urb->unlinked;
> - else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
> - urb->actual_length < urb->transfer_buffer_length &&
> - !status))
> - status = -EREMOTEIO;
> + struct giveback_urb_bh *bh;
> + bool sched, high_prio_bh;
>
> - unmap_urb_for_dma(hcd, urb);
> - usbmon_urb_complete(&hcd->self, urb, status);
> - usb_unanchor_urb(urb);
> + /* pass status to tasklet via unlinked */
> + if (likely(!urb->unlinked))
> + urb->unlinked = status;
>
> - /* pass ownership to the completion handler */
> - urb->status = status;
> - urb->complete (urb);
> - atomic_dec (&urb->use_count);
> - if (unlikely(atomic_read(&urb->reject)))
> - wake_up (&usb_kill_urb_queue);
> - usb_put_urb (urb);
> + if (!hcd_giveback_urb_in_bh(hcd) && !is_root_hub(urb->dev)) {
> + __usb_hcd_giveback_urb(urb);
> + return;
> + }
> +
> + if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
> + bh = &hcd->high_prio_bh;
> + high_prio_bh = 1;
> + } else {
> + bh = &hcd->low_prio_bh;
> + high_prio_bh = 0;
> + }
Bool values should be assigned "true" or "false", not "1" or "0".
> +
> + spin_lock(&bh->lock);
> + list_add_tail(&urb->urb_list, &bh->head);
> + if (bh->running)
> + sched = 0;
> + else
> + sched = 1;
> + spin_unlock(&bh->lock);
How about calling this variable "running" instead of "sched"? Then you
could just say:
running = bh->running;
with no "if" statement.
> +
> + if (!sched)
> + ;
> + else if (high_prio_bh)
> + tasklet_hi_schedule(&bh->bh);
> + else
> + tasklet_schedule(&bh->bh);
> }
> EXPORT_SYMBOL_GPL(usb_hcd_giveback_urb);
>
> @@ -2307,6 +2376,42 @@ EXPORT_SYMBOL_GPL (usb_hc_died);
>
> /*-------------------------------------------------------------------------*/
>
> +static void __init_giveback_urb_bh(struct giveback_urb_bh *bh)
> +{
> +
> + spin_lock_init(&bh->lock);
> + INIT_LIST_HEAD(&bh->head);
> + tasklet_init(&bh->bh, usb_giveback_urb_bh, (unsigned long)bh);
> +}
> +
> +static void init_giveback_urb_bh(struct usb_hcd *hcd)
> +{
> + if (!hcd_giveback_urb_in_bh(hcd))
> + return;
As you mentioned, there's not much point in skipping this code when
it's not needed. In fact, you could just put the two lines below
directly into usb_create_shared_hcd(), and get rid of the "__" from the
beginning of the name.
> +
> + __init_giveback_urb_bh(&hcd->high_prio_bh);
> + __init_giveback_urb_bh(&hcd->low_prio_bh);
> +}
> +
> +static void exit_giveback_urb_bh(struct usb_hcd *hcd)
> +{
> + if (!hcd_giveback_urb_in_bh(hcd))
> + return;
> +
> + /*
> + * tasklet_kill() isn't needed here because:
> + * - driver's disconnec() called from usb_disconnect() should
Misspelled "disconnect".
> + * make sure its URBs are completed during the disconnect()
> + * callback
> + *
> + * - it is too late to run complete() here since driver may have
> + * been removed already now
> + *
> + * Using tasklet to run URB's complete() doesn't change this
> + * behavior of usbcore.
> + */
> +}
Why have a separate subroutine for doing nothing? Simply put this
comment directly into usb_remove_hcd(). (And you can remove the last
two lines of the comment; they don't make sense in this context.)
Alan Stern
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 2/4] USB: URB documentation: claim complete() will be run with IRQs enabled
2013-06-24 9:42 ` [PATCH v2 2/4] USB: URB documentation: claim complete() will be run with IRQs enabled Ming Lei
@ 2013-06-24 15:18 ` Alan Stern
0 siblings, 0 replies; 21+ messages in thread
From: Alan Stern @ 2013-06-24 15:18 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 24 Jun 2013, Ming Lei wrote:
> There is no good reason to run complete() in hard interrupt
> disabled context.
>
> After switch to run complete() in tasklet, we will enable local IRQs
> when calling complete() since we can do it at that time.
>
> Even though we still disable IRQs now when calling complete()
> in tasklet, the URB documentation is updated to claim complete()
> will be run in tasklet context and local IRQs will be enabled, so
> that USB drivers can know the change and avoid one deadlock caused
> by: assume IRQs disabled in complete() and call spin_lock() to
> hold lock which might be acquired in interrupt context.
>
> Current spin_lock() usages in drivers' complete() will be cleaned
> up at the same time, and once the cleanup is finished, local IRQs
> will be enabled when calling complete() in tasklet.
>
> Also fix description about type of usb_complete_t, and remove the
> advice of running completion handler in tasklet for decreasing
> system latency.
>
> Cc: Oliver Neukum <oliver@neukum.org>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> ---
> Documentation/usb/URB.txt | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/usb/URB.txt b/Documentation/usb/URB.txt
> index 00d2c64..50da0d4 100644
> --- a/Documentation/usb/URB.txt
> +++ b/Documentation/usb/URB.txt
> @@ -195,13 +195,12 @@ by the completion handler.
>
> The handler is of the following type:
>
> - typedef void (*usb_complete_t)(struct urb *, struct pt_regs *)
> + typedef void (*usb_complete_t)(struct urb *)
>
> -I.e., it gets the URB that caused the completion call, plus the
> -register values at the time of the corresponding interrupt (if any).
> -In the completion handler, you should have a look at urb->status to
> -detect any USB errors. Since the context parameter is included in the URB,
> -you can pass information to the completion handler.
> +I.e., it gets the URB that caused the completion call. In the completion
> +handler, you should have a look at urb->status to detect any USB errors.
> +Since the context parameter is included in the URB, you can pass
> +information to the completion handler.
>
> Note that even when an error (or unlink) is reported, data may have been
> transferred. That's because USB transfers are packetized; it might take
> @@ -210,12 +209,12 @@ have transferred successfully before the completion was called.
>
>
> NOTE: ***** WARNING *****
> -NEVER SLEEP IN A COMPLETION HANDLER. These are normally called
> -during hardware interrupt processing. If you can, defer substantial
> -work to a tasklet (bottom half) to keep system latencies low. You'll
> -probably need to use spinlocks to protect data structures you manipulate
> -in completion handlers.
> +NEVER SLEEP IN A COMPLETION HANDLER. These are often called in atomic
> +context.
>
> +In the current kernel, completion handlers run with local interrupts
> +disabled, but in the future this will be changed, so don't assume that
> +local IRQs are always disabled inside completion handlers.
>
> 1.8. How to do isochronous (ISO) transfers?
Acked-by: Alan Stern <stern@rowland.harvard.edu>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 9:42 ` [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink Ming Lei
2013-06-24 10:28 ` Oliver Neukum
@ 2013-06-24 18:53 ` Alan Stern
2013-06-25 11:56 ` Ming Lei
1 sibling, 1 reply; 21+ messages in thread
From: Alan Stern @ 2013-06-24 18:53 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 24 Jun 2013, Ming Lei wrote:
> Given interrupt URB will be resubmitted from tasklet context which
> is scheduled by ehci hardware interrupt handler, and commonly only
> one interrupt URB is scheduled on qh, so the qh may be unlinked
> immediately once qh_completions() returns from ehci_irq(), then
> the intr URB to be resubmitted in complete() can only be scheduled
> and linked to hardware until the qh unlink is completed.
>
> This patch improves this above situation, and the qh will wait for 5
> milliseconds before being unlinked from hardware, if one URB is submitted
> during the period, the qh is move out of unlink wait list and the
> interrupt transfer can be scheduled immediately, not like before: the
> transfer is only linked to hardware until previous unlink is completed.
This description is very hard to understand. I suggest rewriting it,
something like this:
ehci-hcd currently unlinks an interrupt QH when it becomes empty, that
is, after its last URB completes. This works well because in almost
all cases, the completion handler for an interrupt URB resubmits the
URB; therefore the QH doesn't become empty and doesn't get unlinked.
When we start using tasklets for URB completion, this scheme won't work
as well. The resubmission won't occur until the tasklet runs, which
will be some time after the completion is queued with the tasklet.
During that delay, the QH will be empty and so will be unlinked
unnecessarily.
To prevent this problem, this patch adds a 5-ms time delay before empty
interrupt QHs are unlinked. Most often, during that time the interrupt
URB will be resubmitted and thus we can avoid unlinking the QH.
> diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
> index f80d033..5bf67e2 100644
> --- a/drivers/usb/host/ehci-sched.c
> +++ b/drivers/usb/host/ehci-sched.c
> @@ -601,12 +601,24 @@ static void qh_unlink_periodic(struct ehci_hcd *ehci, struct ehci_qh *qh)
> list_del(&qh->intr_node);
> }
>
> -static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
> +/* must be called with holding ehci->lock */
The comment should be:
/* caller must hold ehci->lock */
But in fact you can leave it out. Almost all the code in this file
runs with the lock held.
> +static void cancel_unlink_wait_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
> {
> - /* If the QH isn't linked then there's nothing we can do. */
> - if (qh->qh_state != QH_STATE_LINKED)
> + if (qh->qh_state != QH_STATE_LINKED || list_empty(&qh->unlink_node))
> return;
>
> + list_del_init(&qh->unlink_node);
> +
> + /* avoid unnecessary CPU wakeup */
> + if (list_empty(&ehci->intr_unlink_wait))
> + ehci_disable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR);
If you don't mind, can we leave out ehci_disable_event() for now and
add it after the rest of this series is merged? It will keeps things a
little simpler, and then we'll be able to use ehci_disable_event() for
the IAA watchdog timer event as well as using it here.
> +}
> +
> +static void start_do_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
> +{
> + /* if the qh is waitting for unlink, cancel it now */
s/waitt/wait/
> + cancel_unlink_wait_intr(ehci, qh);
> +
> qh_unlink_periodic (ehci, qh);
>
> /* Make sure the unlinks are visible before starting the timer */
> @@ -632,6 +644,45 @@ static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
> }
> }
>
> +/*
> + * It is common only one intr URB is scheduled on one qh, and
> + * given complete() is run in tasklet context, introduce a bit
> + * delay to avoid unlink qh too early.
> + */
> +static void __start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh,
> + bool wait)
> +{
> + /* If the QH isn't linked then there's nothing we can do. */
> + if (qh->qh_state != QH_STATE_LINKED)
> + return;
> +
> + if (!wait)
> + return start_do_unlink_intr(ehci, qh);
> +
> + qh->unlink_cycle = ehci->intr_unlink_wait_cycle;
> +
> + /* New entries go at the end of the intr_unlink_wait list */
> + list_add_tail(&qh->unlink_node, &ehci->intr_unlink_wait);
> +
> + if (ehci->rh_state < EHCI_RH_RUNNING)
> + ehci_handle_start_intr_unlinks(ehci);
> + else if (ehci->intr_unlink_wait.next == &qh->unlink_node) {
> + ehci_enable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR, true);
> + ++ehci->intr_unlink_wait_cycle;
> + }
> +}
> +
> +static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
> +{
> + __start_unlink_intr(ehci, qh, false);
> +}
> +
> +static void start_unlink_intr_wait(struct ehci_hcd *ehci,
> + struct ehci_qh *qh)
> +{
> + __start_unlink_intr(ehci, qh, true);
> +}
This is not what I had in mind.
Instead, copy the qh->qh_state test into the beginning of
start_unlink_intr() and move the rest of start_do_unlink_intr() there.
Then you can rename __start_unlink_intr() to start_unlink_intr_wait()
and get rid of the "wait" parameter.
> @@ -881,6 +932,9 @@ static int intr_submit (
> goto done;
> }
>
> + /* put back qh from unlink wait list */
> + cancel_unlink_wait_intr(ehci, qh);
It might be better to change this line into an "else" clause for the
"if (qh->qh_state == QH_STATE_IDLE)" statement after the BUG_ON below.
> +
> /* then queue the urb's tds to the qh */
> qh = qh_append_tds(ehci, urb, qtd_list, epnum, &urb->ep->hcpriv);
> BUG_ON (qh == NULL);
> +static void ehci_handle_start_intr_unlinks(struct ehci_hcd *ehci)
> +{
> + bool stopped = (ehci->rh_state < EHCI_RH_RUNNING);
> +
> + /*
> + * Process all the QHs on the intr_unlink list that were added
> + * before the current unlink cycle began. The list is in
> + * temporal order, so stop when we reach the first entry in the
> + * current cycle. But if the root hub isn't running then
> + * process all the QHs on the list.
> + */
> + while (!list_empty(&ehci->intr_unlink_wait)) {
> + struct ehci_qh *qh;
> +
> + qh = list_first_entry(&ehci->intr_unlink_wait,
> + struct ehci_qh, unlink_node);
As I mentioned before, this line should be indented by two tab stops
beyond the preceding line.
> + if (!stopped &&
> + (qh->unlink_cycle ==
This doesn't need to be on a separate line at all.
> + ehci->intr_unlink_wait_cycle))
And this should also be indented by two tab stops beyond the preceding
line.
> --- a/drivers/usb/host/ehci.h
> +++ b/drivers/usb/host/ehci.h
> @@ -88,6 +88,7 @@ enum ehci_hrtimer_event {
> EHCI_HRTIMER_POLL_DEAD, /* Wait for dead controller to stop */
> EHCI_HRTIMER_UNLINK_INTR, /* Wait for interrupt QH unlink */
> EHCI_HRTIMER_FREE_ITDS, /* Wait for unused iTDs and siTDs */
> + EHCI_HRTIMER_START_UNLINK_INTR, /* wait for new intr schedule */
The comment should be: /* Unlink empty interrupt QHs */
> EHCI_HRTIMER_ASYNC_UNLINKS, /* Unlink empty async QHs */
> EHCI_HRTIMER_IAA_WATCHDOG, /* Handle lost IAA interrupts */
> EHCI_HRTIMER_DISABLE_PERIODIC, /* Wait to disable periodic sched */
> @@ -143,6 +144,8 @@ struct ehci_hcd { /* one per controller */
> unsigned i_thresh; /* uframes HC might cache */
>
> union ehci_shadow *pshadow; /* mirror hw periodic table */
> + struct list_head intr_unlink_wait;
> + unsigned intr_unlink_wait_cycle;
> struct list_head intr_unlink;
> unsigned intr_unlink_cycle;
> unsigned now_frame; /* frame from HC hardware */
To avoid an alignment gap on 64-bit systems, put intr_unlink_wait_cycle
after intr_unlink.
Alan Stern
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/4] USB: HCD: support giveback of URB in tasklet context
2013-06-24 15:17 ` Alan Stern
@ 2013-06-25 11:30 ` Ming Lei
2013-06-25 14:48 ` Alan Stern
0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2013-06-25 11:30 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 24, 2013 at 11:17 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Mon, 24 Jun 2013, Ming Lei wrote:
>
>> This patch implements the mechanism of giveback of URB in
>> tasklet context, so that hardware interrupt handling time for
>> usb host controller can be saved much, and HCD interrupt handling
>> can be simplified.
>
> Changes from v1 to v2?
The change log is put in 0/4.
>
>
>> +static void usb_giveback_urb_bh(unsigned long param)
>> +{
>> + struct giveback_urb_bh *bh = (struct giveback_urb_bh *)param;
>> + unsigned long flags;
>> + struct list_head local_list;
>> +
>> + spin_lock_irqsave(&bh->lock, flags);
>> + bh->running = 1;
>> + restart:
>> + list_replace_init(&bh->head, &local_list);
>> + spin_unlock_irqrestore(&bh->lock, flags);
>
> Tasklet routines are always called with interrupts enabled, right?
> Therefore you don't need to use the "flags" argument here or below.
Good catch.
>
>> @@ -1667,25 +1721,40 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
>> */
>> void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
>> {
>> - urb->hcpriv = NULL;
>> - if (unlikely(urb->unlinked))
>> - status = urb->unlinked;
>> - else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
>> - urb->actual_length < urb->transfer_buffer_length &&
>> - !status))
>> - status = -EREMOTEIO;
>> + struct giveback_urb_bh *bh;
>> + bool sched, high_prio_bh;
>>
>> - unmap_urb_for_dma(hcd, urb);
>> - usbmon_urb_complete(&hcd->self, urb, status);
>> - usb_unanchor_urb(urb);
>> + /* pass status to tasklet via unlinked */
>> + if (likely(!urb->unlinked))
>> + urb->unlinked = status;
>>
>> - /* pass ownership to the completion handler */
>> - urb->status = status;
>> - urb->complete (urb);
>> - atomic_dec (&urb->use_count);
>> - if (unlikely(atomic_read(&urb->reject)))
>> - wake_up (&usb_kill_urb_queue);
>> - usb_put_urb (urb);
>> + if (!hcd_giveback_urb_in_bh(hcd) && !is_root_hub(urb->dev)) {
>> + __usb_hcd_giveback_urb(urb);
>> + return;
>> + }
>> +
>> + if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
>> + bh = &hcd->high_prio_bh;
>> + high_prio_bh = 1;
>> + } else {
>> + bh = &hcd->low_prio_bh;
>> + high_prio_bh = 0;
>> + }
>
> Bool values should be assigned "true" or "false", not "1" or "0".
Right.
>
>> +
>> + spin_lock(&bh->lock);
>> + list_add_tail(&urb->urb_list, &bh->head);
>> + if (bh->running)
>> + sched = 0;
>> + else
>> + sched = 1;
>> + spin_unlock(&bh->lock);
>
> How about calling this variable "running" instead of "sched"? Then you
> could just say:
>
> running = bh->running;
>
> with no "if" statement.
OK, even we can do this below without name change:
sched = !bh->running;
>
>> +
>> + if (!sched)
>> + ;
>> + else if (high_prio_bh)
>> + tasklet_hi_schedule(&bh->bh);
>> + else
>> + tasklet_schedule(&bh->bh);
>> }
>> EXPORT_SYMBOL_GPL(usb_hcd_giveback_urb);
>>
>> @@ -2307,6 +2376,42 @@ EXPORT_SYMBOL_GPL (usb_hc_died);
>>
>> /*-------------------------------------------------------------------------*/
>>
>> +static void __init_giveback_urb_bh(struct giveback_urb_bh *bh)
>> +{
>> +
>> + spin_lock_init(&bh->lock);
>> + INIT_LIST_HEAD(&bh->head);
>> + tasklet_init(&bh->bh, usb_giveback_urb_bh, (unsigned long)bh);
>> +}
>> +
>> +static void init_giveback_urb_bh(struct usb_hcd *hcd)
>> +{
>> + if (!hcd_giveback_urb_in_bh(hcd))
>> + return;
>
> As you mentioned, there's not much point in skipping this code when
> it's not needed. In fact, you could just put the two lines below
> directly into usb_create_shared_hcd(), and get rid of the "__" from the
> beginning of the name.
OK.
>
>> +
>> + __init_giveback_urb_bh(&hcd->high_prio_bh);
>> + __init_giveback_urb_bh(&hcd->low_prio_bh);
>> +}
>> +
>> +static void exit_giveback_urb_bh(struct usb_hcd *hcd)
>> +{
>> + if (!hcd_giveback_urb_in_bh(hcd))
>> + return;
>> +
>> + /*
>> + * tasklet_kill() isn't needed here because:
>> + * - driver's disconnec() called from usb_disconnect() should
>
> Misspelled "disconnect".
>
>> + * make sure its URBs are completed during the disconnect()
>> + * callback
>> + *
>> + * - it is too late to run complete() here since driver may have
>> + * been removed already now
>> + *
>> + * Using tasklet to run URB's complete() doesn't change this
>> + * behavior of usbcore.
>> + */
>> +}
>
> Why have a separate subroutine for doing nothing? Simply put this
> comment directly into usb_remove_hcd(). (And you can remove the last
> two lines of the comment; they don't make sense in this context.)
Looks it is a bit clean to put the comment in the function, but it is
OK to add them in usb_remove_hcd() too.
Thanks again for your review.
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-24 18:53 ` Alan Stern
@ 2013-06-25 11:56 ` Ming Lei
2013-06-25 14:54 ` Alan Stern
0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2013-06-25 11:56 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Jun 25, 2013 at 2:53 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Mon, 24 Jun 2013, Ming Lei wrote:
>
>> Given interrupt URB will be resubmitted from tasklet context which
>> is scheduled by ehci hardware interrupt handler, and commonly only
>> one interrupt URB is scheduled on qh, so the qh may be unlinked
>> immediately once qh_completions() returns from ehci_irq(), then
>> the intr URB to be resubmitted in complete() can only be scheduled
>> and linked to hardware until the qh unlink is completed.
>>
>> This patch improves this above situation, and the qh will wait for 5
>> milliseconds before being unlinked from hardware, if one URB is submitted
>> during the period, the qh is move out of unlink wait list and the
>> interrupt transfer can be scheduled immediately, not like before: the
>> transfer is only linked to hardware until previous unlink is completed.
>
> This description is very hard to understand. I suggest rewriting it,
> something like this:
>
> ehci-hcd currently unlinks an interrupt QH when it becomes empty, that
> is, after its last URB completes. This works well because in almost
> all cases, the completion handler for an interrupt URB resubmits the
> URB; therefore the QH doesn't become empty and doesn't get unlinked.
>
> When we start using tasklets for URB completion, this scheme won't work
> as well. The resubmission won't occur until the tasklet runs, which
> will be some time after the completion is queued with the tasklet.
> During that delay, the QH will be empty and so will be unlinked
> unnecessarily.
>
> To prevent this problem, this patch adds a 5-ms time delay before empty
> interrupt QHs are unlinked. Most often, during that time the interrupt
> URB will be resubmitted and thus we can avoid unlinking the QH.
Excellent description about the change, and I will add it in V3,
also care to add your signed-off in the patch for the change log part?
>
>> diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
>> index f80d033..5bf67e2 100644
>> --- a/drivers/usb/host/ehci-sched.c
>> +++ b/drivers/usb/host/ehci-sched.c
>> @@ -601,12 +601,24 @@ static void qh_unlink_periodic(struct ehci_hcd *ehci, struct ehci_qh *qh)
>> list_del(&qh->intr_node);
>> }
>>
>> -static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
>> +/* must be called with holding ehci->lock */
>
> The comment should be:
>
> /* caller must hold ehci->lock */
>
> But in fact you can leave it out. Almost all the code in this file
> runs with the lock held.
OK.
>
>> +static void cancel_unlink_wait_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
>> {
>> - /* If the QH isn't linked then there's nothing we can do. */
>> - if (qh->qh_state != QH_STATE_LINKED)
>> + if (qh->qh_state != QH_STATE_LINKED || list_empty(&qh->unlink_node))
>> return;
>>
>> + list_del_init(&qh->unlink_node);
>> +
>> + /* avoid unnecessary CPU wakeup */
>> + if (list_empty(&ehci->intr_unlink_wait))
>> + ehci_disable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR);
>
> If you don't mind, can we leave out ehci_disable_event() for now and
> add it after the rest of this series is merged? It will keeps things a
> little simpler, and then we'll be able to use ehci_disable_event() for
> the IAA watchdog timer event as well as using it here.
I am fine, so this patch will become simpler and has less change.
>
>> +}
>> +
>> +static void start_do_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
>> +{
>> + /* if the qh is waitting for unlink, cancel it now */
>
> s/waitt/wait/
>
>> + cancel_unlink_wait_intr(ehci, qh);
>> +
>> qh_unlink_periodic (ehci, qh);
>>
>> /* Make sure the unlinks are visible before starting the timer */
>> @@ -632,6 +644,45 @@ static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
>> }
>> }
>>
>> +/*
>> + * It is common only one intr URB is scheduled on one qh, and
>> + * given complete() is run in tasklet context, introduce a bit
>> + * delay to avoid unlink qh too early.
>> + */
>> +static void __start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh,
>> + bool wait)
>> +{
>> + /* If the QH isn't linked then there's nothing we can do. */
>> + if (qh->qh_state != QH_STATE_LINKED)
>> + return;
>> +
>> + if (!wait)
>> + return start_do_unlink_intr(ehci, qh);
>> +
>> + qh->unlink_cycle = ehci->intr_unlink_wait_cycle;
>> +
>> + /* New entries go at the end of the intr_unlink_wait list */
>> + list_add_tail(&qh->unlink_node, &ehci->intr_unlink_wait);
>> +
>> + if (ehci->rh_state < EHCI_RH_RUNNING)
>> + ehci_handle_start_intr_unlinks(ehci);
>> + else if (ehci->intr_unlink_wait.next == &qh->unlink_node) {
>> + ehci_enable_event(ehci, EHCI_HRTIMER_START_UNLINK_INTR, true);
>> + ++ehci->intr_unlink_wait_cycle;
>> + }
>> +}
>> +
>> +static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
>> +{
>> + __start_unlink_intr(ehci, qh, false);
>> +}
>> +
>> +static void start_unlink_intr_wait(struct ehci_hcd *ehci,
>> + struct ehci_qh *qh)
>> +{
>> + __start_unlink_intr(ehci, qh, true);
>> +}
>
> This is not what I had in mind.
>
> Instead, copy the qh->qh_state test into the beginning of
> start_unlink_intr() and move the rest of start_do_unlink_intr() there.
> Then you can rename __start_unlink_intr() to start_unlink_intr_wait()
> and get rid of the "wait" parameter.
The current code can do the check centrally, but it isn't big deal, and I
will follow your suggestion.
>
>> @@ -881,6 +932,9 @@ static int intr_submit (
>> goto done;
>> }
>>
>> + /* put back qh from unlink wait list */
>> + cancel_unlink_wait_intr(ehci, qh);
>
> It might be better to change this line into an "else" clause for the
> "if (qh->qh_state == QH_STATE_IDLE)" statement after the BUG_ON below.
OK.
>
>> +
>> /* then queue the urb's tds to the qh */
>> qh = qh_append_tds(ehci, urb, qtd_list, epnum, &urb->ep->hcpriv);
>> BUG_ON (qh == NULL);
>
>> +static void ehci_handle_start_intr_unlinks(struct ehci_hcd *ehci)
>> +{
>> + bool stopped = (ehci->rh_state < EHCI_RH_RUNNING);
>> +
>> + /*
>> + * Process all the QHs on the intr_unlink list that were added
>> + * before the current unlink cycle began. The list is in
>> + * temporal order, so stop when we reach the first entry in the
>> + * current cycle. But if the root hub isn't running then
>> + * process all the QHs on the list.
>> + */
>> + while (!list_empty(&ehci->intr_unlink_wait)) {
>> + struct ehci_qh *qh;
>> +
>> + qh = list_first_entry(&ehci->intr_unlink_wait,
>> + struct ehci_qh, unlink_node);
>
> As I mentioned before, this line should be indented by two tab stops
> beyond the preceding line.
Looks I forget this one.
>
>> + if (!stopped &&
>> + (qh->unlink_cycle ==
>
> This doesn't need to be on a separate line at all.
>
>> + ehci->intr_unlink_wait_cycle))
>
> And this should also be indented by two tab stops beyond the preceding
> line.
OK.
BTW, indenting two tab might cause more lines, and looks many subsystems
don't do that. But, anyway, I am fine with two tab, :-)
>
>> --- a/drivers/usb/host/ehci.h
>> +++ b/drivers/usb/host/ehci.h
>> @@ -88,6 +88,7 @@ enum ehci_hrtimer_event {
>> EHCI_HRTIMER_POLL_DEAD, /* Wait for dead controller to stop */
>> EHCI_HRTIMER_UNLINK_INTR, /* Wait for interrupt QH unlink */
>> EHCI_HRTIMER_FREE_ITDS, /* Wait for unused iTDs and siTDs */
>> + EHCI_HRTIMER_START_UNLINK_INTR, /* wait for new intr schedule */
>
> The comment should be: /* Unlink empty interrupt QHs */
OK
>
>> EHCI_HRTIMER_ASYNC_UNLINKS, /* Unlink empty async QHs */
>> EHCI_HRTIMER_IAA_WATCHDOG, /* Handle lost IAA interrupts */
>> EHCI_HRTIMER_DISABLE_PERIODIC, /* Wait to disable periodic sched */
>> @@ -143,6 +144,8 @@ struct ehci_hcd { /* one per controller */
>> unsigned i_thresh; /* uframes HC might cache */
>>
>> union ehci_shadow *pshadow; /* mirror hw periodic table */
>> + struct list_head intr_unlink_wait;
>> + unsigned intr_unlink_wait_cycle;
>> struct list_head intr_unlink;
>> unsigned intr_unlink_cycle;
>> unsigned now_frame; /* frame from HC hardware */
>
> To avoid an alignment gap on 64-bit systems, put intr_unlink_wait_cycle
> after intr_unlink.
Good catch.
Thanks,
--
Ming Lei
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/4] USB: HCD: support giveback of URB in tasklet context
2013-06-25 11:30 ` Ming Lei
@ 2013-06-25 14:48 ` Alan Stern
0 siblings, 0 replies; 21+ messages in thread
From: Alan Stern @ 2013-06-25 14:48 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 25 Jun 2013, Ming Lei wrote:
> >> +
> >> + spin_lock(&bh->lock);
> >> + list_add_tail(&urb->urb_list, &bh->head);
> >> + if (bh->running)
> >> + sched = 0;
> >> + else
> >> + sched = 1;
> >> + spin_unlock(&bh->lock);
> >
> > How about calling this variable "running" instead of "sched"? Then you
> > could just say:
> >
> > running = bh->running;
> >
> > with no "if" statement.
>
> OK, even we can do this below without name change:
>
> sched = !bh->running;
>
> >
> >> +
> >> + if (!sched)
> >> + ;
> >> + else if (high_prio_bh)
> >> + tasklet_hi_schedule(&bh->bh);
> >> + else
> >> + tasklet_schedule(&bh->bh);
The advantage of "running" instead of "sched" is that it avoids a
double negative:
sched = !bh->running;
...
if (!sched) ...
as opposed to
running = bh->running;
...
if (running) ...
Alan Stern
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink
2013-06-25 11:56 ` Ming Lei
@ 2013-06-25 14:54 ` Alan Stern
0 siblings, 0 replies; 21+ messages in thread
From: Alan Stern @ 2013-06-25 14:54 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 25 Jun 2013, Ming Lei wrote:
> On Tue, Jun 25, 2013 at 2:53 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Mon, 24 Jun 2013, Ming Lei wrote:
> >
> >> Given interrupt URB will be resubmitted from tasklet context which
> >> is scheduled by ehci hardware interrupt handler, and commonly only
> >> one interrupt URB is scheduled on qh, so the qh may be unlinked
> >> immediately once qh_completions() returns from ehci_irq(), then
> >> the intr URB to be resubmitted in complete() can only be scheduled
> >> and linked to hardware until the qh unlink is completed.
> >>
> >> This patch improves this above situation, and the qh will wait for 5
> >> milliseconds before being unlinked from hardware, if one URB is submitted
> >> during the period, the qh is move out of unlink wait list and the
> >> interrupt transfer can be scheduled immediately, not like before: the
> >> transfer is only linked to hardware until previous unlink is completed.
> >
> > This description is very hard to understand. I suggest rewriting it,
> > something like this:
> >
> > ehci-hcd currently unlinks an interrupt QH when it becomes empty, that
> > is, after its last URB completes. This works well because in almost
> > all cases, the completion handler for an interrupt URB resubmits the
> > URB; therefore the QH doesn't become empty and doesn't get unlinked.
> >
> > When we start using tasklets for URB completion, this scheme won't work
> > as well. The resubmission won't occur until the tasklet runs, which
> > will be some time after the completion is queued with the tasklet.
> > During that delay, the QH will be empty and so will be unlinked
> > unnecessarily.
> >
> > To prevent this problem, this patch adds a 5-ms time delay before empty
> > interrupt QHs are unlinked. Most often, during that time the interrupt
> > URB will be resubmitted and thus we can avoid unlinking the QH.
>
> Excellent description about the change, and I will add it in V3,
> also care to add your signed-off in the patch for the change log part?
You have my permission to add it.
> >> +static void start_unlink_intr(struct ehci_hcd *ehci, struct ehci_qh *qh)
> >> +{
> >> + __start_unlink_intr(ehci, qh, false);
> >> +}
> >> +
> >> +static void start_unlink_intr_wait(struct ehci_hcd *ehci,
> >> + struct ehci_qh *qh)
> >> +{
> >> + __start_unlink_intr(ehci, qh, true);
> >> +}
> >
> > This is not what I had in mind.
> >
> > Instead, copy the qh->qh_state test into the beginning of
> > start_unlink_intr() and move the rest of start_do_unlink_intr() there.
> > Then you can rename __start_unlink_intr() to start_unlink_intr_wait()
> > and get rid of the "wait" parameter.
>
> The current code can do the check centrally, but it isn't big deal, and I
> will follow your suggestion.
Copying the check adds two lines of code (plus a comment and a blank
line). Your two extra function headers, extra parameter, and extra
test add more than that.
> OK.
>
> BTW, indenting two tab might cause more lines, and looks many subsystems
> don't do that. But, anyway, I am fine with two tab, :-)
I admit, it isn't a perfect system and sometimes I don't use it. But
it does have the advantage that changes to the first line don't require
the continuation line to be changed as well, just to keep the desired
alignment. Furthermore, aligning the continuation with the first
parameter of a function call can often require _more_ than two extra
tab stops of indentation.
Alan Stern
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2013-06-25 14:54 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-24 9:42 [PATCH v2 0/4] USB: HCD/EHCI: giveback of URB in tasklet context Ming Lei
2013-06-24 9:42 ` [PATCH v2 1/4] USB: HCD: support " Ming Lei
2013-06-24 11:49 ` Ming Lei
2013-06-24 15:17 ` Alan Stern
2013-06-25 11:30 ` Ming Lei
2013-06-25 14:48 ` Alan Stern
2013-06-24 9:42 ` [PATCH v2 2/4] USB: URB documentation: claim complete() will be run with IRQs enabled Ming Lei
2013-06-24 15:18 ` Alan Stern
2013-06-24 9:42 ` [PATCH v2 3/4] USB: EHCI: improve interrupt qh unlink Ming Lei
2013-06-24 10:28 ` Oliver Neukum
2013-06-24 11:16 ` Ming Lei
2013-06-24 12:08 ` Oliver Neukum
2013-06-24 12:17 ` Ming Lei
2013-06-24 18:53 ` Alan Stern
2013-06-25 11:56 ` Ming Lei
2013-06-25 14:54 ` Alan Stern
2013-06-24 9:42 ` [PATCH v2 4/4] USB: EHCI: support running URB giveback in tasklet context Ming Lei
2013-06-24 10:24 ` Oliver Neukum
2013-06-24 12:58 ` Ming Lei
2013-06-24 13:06 ` Oliver Neukum
2013-06-24 13:14 ` Ming Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).