From: Mark Salter <msalter@redhat.com>
To: Rob Herring <robherring2@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>,
Nicolas Pitre <nico@fluxnic.net>,
Russell King <linux@arm.linux.org.uk>, Greg KH <greg@kroah.com>,
Chen Peter-B29397 <B29397@freescale.com>,
"ming.lei@canonical.com" <ming.lei@canonical.com>,
"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
"stern@rowland.harvard.edu" <stern@rowland.harvard.edu>,
"linux-omap@vger.kernel.org" <linux-omap@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] usb: ehci: make HC see up-to-date qh/qtd descriptor ASAP
Date: Wed, 31 Aug 2011 14:35:16 -0400
Message-ID: <1314815719.2344.95.camel@deneb.redhat.com>
In-Reply-To: <4E5E7B35.9080008@gmail.com>
On Wed, 2011-08-31 at 13:19 -0500, Rob Herring wrote:
> On 08/31/2011 12:51 PM, Will Deacon wrote:
> > On Wed, Aug 31, 2011 at 06:46:50PM +0100, Nicolas Pitre wrote:
> >> On Wed, 31 Aug 2011, Will Deacon wrote:
> >>
> >>> On Wed, Aug 31, 2011 at 02:43:33PM +0100, Mark Salter wrote:
> >>>> On Wed, 2011-08-31 at 09:49 +0100, Will Deacon wrote:
> >>>>> On Wed, Aug 31, 2011 at 01:23:47AM +0100, Chen Peter-B29397 wrote:
> >>>>>> One question: why did this write buffer issue not happen on a UP ARMv7 platform, whose DMA buffer
> >>>>>> is also uncached, but bufferable?
> >>>>>
> >>>>> Which CPU was on this platform?
> >>>>
> >>>> Using a 3.1.0-rc4+ kernel on a Pandaboard, and running 'hdparm -t' on a
> >>>> usb disk drive, I see ~5.8MB/s read speed. Same kernel, but passing
> >>>> nosmp on the commandline, I see 20.3MB/s.
> >>>>
> >>>> Can someone explain why nosmp would make such a difference?
> >>>
> >>> Oh gawd, that's horrible. I have a feeling it's probably a separate issue
> >>> though, caused by:
> >>>
> >>> omap_modify_auxcoreboot0(0x200, 0xfffffdff);
> >>>
> >>> in boot_secondary for OMAP. Unfortunately I have no idea what that line is
> >>> doing because it ends up talking to the secure monitor.
> >>
> >> Well, this issue is apparently affecting other ARMv7 implementations
> >> too. In which case this code in arch/arm/mm/mmu.c could be responsible:
> >>
> >> 	if (is_smp()) {
> >> 		/*
> >> 		 * Mark memory with the "shared" attribute
> >> 		 * for SMP systems
> >> 		 */
> >> 		user_pgprot |= L_PTE_SHARED;
> >> 		kern_pgprot |= L_PTE_SHARED;
> >> 		vecs_pgprot |= L_PTE_SHARED;
> >> 		mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_S;
> >> 		mem_types[MT_DEVICE_WC].prot_pte |= L_PTE_SHARED;
> >> 		mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_S;
> >> 		mem_types[MT_DEVICE_CACHED].prot_pte |= L_PTE_SHARED;
> >> 		mem_types[MT_MEMORY].prot_sect |= PMD_SECT_S;
> >> 		mem_types[MT_MEMORY].prot_pte |= L_PTE_SHARED;
> >> 		mem_types[MT_MEMORY_NONCACHED].prot_sect |= PMD_SECT_S;
> >> 		mem_types[MT_MEMORY_NONCACHED].prot_pte |= L_PTE_SHARED;
> >> 	}
> >>
> >> However I don't see the nosmp kernel argument having any effect on the
> >> result from is_smp().
> >
> > Yes, the first thing that sprung to mind was the shared attribute, but like
> > you say, that doesn't seem to be affected by the nosmp command line
> > argument.
> >
> > Another thing that Marc and I tried on OMAP4 was not bringing up the secondary
> > CPU during boot (by commenting out most of smp_init). In this case, I/O
> > performance was good until we tried to online the secondary CPU. The online
> > failed but after that the I/O performance was certainly degraded.
> >
>
> Was the SCU enabled at that point? One diff between nosmp boot and
> offlining the 2nd core would be that the SCU remains enabled in the
> latter case. I think the SCU does not get enabled for nosmp.
>
> Do we really know which write buffer the data is sitting in? Some
> experiments to only flush the L1 write buffer would be interesting.
> Perhaps something executed on the 2nd core has a mb which doesn't help
> for SMP because the other core's L1 write buffer is not flushed, but it
> helps for nosmp because everything runs on 1 core and any occurrence of
> a mb will flush all data out. I wouldn't expect the behavior to be so
> consistent though. Could it be something is not visible to the other
> core rather than not visible to the EHCI controller?

One experiment I did a few days ago was to pin processes and interrupts
to core#0 (except IPI and local timer). This didn't make any noticeable
difference.

My current understanding is that the writes are getting hung up in a
cache and not a write buffer. I am seeing delays of 10-15ms between
queuing the urb and getting an interrupt for urb completion. That
drops to a few hundred microseconds with the explicit flushing added
to the ehci driver. I don't see how any write buffer could hold data
that long without draining out on its own. What I see seems to suggest
that the memory is only coherent among the cores and not coherent for
CPU writes/device reads. Adding just a dsb() for the ehci flush does
not help. An outer_sync() is also necessary.

--Mark
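
The ordering described above (descriptor write, then dsb(), then
outer_sync()) can be sketched in user-space C. This is only an
illustration under stated assumptions, not the patch under discussion:
dsb() and outer_sync() here are stand-ins that record call order rather
than the real ARM kernel primitives, and struct fake_qtd,
fake_ehci_flush() and fake_queue_qtd() are hypothetical names invented
for the example.

```c
/*
 * User-space model of the flush ordering described in the message.
 * Everything here is a stand-in: in the kernel, dsb() is a data
 * synchronization barrier and outer_sync() drains the outer cache
 * controller. These versions only log the order they were called in.
 */
enum step { STEP_WRITE_DESC, STEP_DSB, STEP_OUTER_SYNC };

#define MAX_STEPS 8
static enum step trace[MAX_STEPS];
static int trace_len;

static void record(enum step s)
{
	if (trace_len < MAX_STEPS)
		trace[trace_len++] = s;
}

/* Stand-in for the kernel's dsb(): logs the call, performs no barrier. */
static void dsb(void)
{
	record(STEP_DSB);
}

/* Stand-in for the kernel's outer_sync(): logs the call only. */
static void outer_sync(void)
{
	record(STEP_OUTER_SYNC);
}

/* Hypothetical descriptor; a real EHCI qTD has more fields. */
struct fake_qtd {
	unsigned int token;
};

/* Hypothetical flush helper: dsb() alone was reported as insufficient. */
static void fake_ehci_flush(void)
{
	dsb();
	outer_sync();
}

/* Hypothetical queue path: update the descriptor, then push it out. */
static void fake_queue_qtd(struct fake_qtd *qtd, unsigned int token)
{
	qtd->token = token;
	record(STEP_WRITE_DESC);
	fake_ehci_flush();
}
```

The point of the sketch is only the ordering: the flush comes after the
descriptor update, and both barrier steps are issued, matching the
observation that dsb() by itself did not shorten the delay.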